Music Style Transfer Using Constant-Q Transform Spectrograms
NTNU Norwegian University of Science and Technology, Norway.
RISE Research Institutes of Sweden, Digitala system, Datavetenskap. NTNU Norwegian University of Science and Technology, Norway. ORCID iD: 0000-0002-5252-707x
2022 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 13221 LNCS, Springer Science and Business Media Deutschland GmbH, 2022, p. 195-211. Conference paper, Published paper (Refereed)
Abstract [en]

Previous work on music generation and transformation has commonly targeted single-instrument or single-melody music. Here, in contrast, five music genres are used with the goal of achieving selective remixing by applying domain transfer methods to spectrogram images of music. A pipeline architecture comprising two independent generative adversarial network models was created. The first applies features from one of the genres to constant-Q transform spectrogram images to perform style transfer. The second network turns a spectrogram into a real-valued tensor representation, which is approximately reconstructed back into audio. The system was evaluated experimentally and through a survey. Due to the increased complexity involved in processing high-sample-rate music with homophonic or polyphonic audio textures, the system's audio output was considered to be of low quality, but the style transfer produced noticeable selective remixing on most of the music tracks evaluated. © 2022, The Author(s).

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2022. p. 195-211
Keywords [en]
Audio acoustics, Generative adversarial networks, Music, Quality control, Textures, Audio textures, Domain transfers, Music genre, Network models, Pipeline architecture, Real values, Sample rate, Spectrograms, Tensor representation, Transfer method, Spectrographs
National subject category
Interaction Technologies
Identifiers
URN: urn:nbn:se:ri:diva-59247; DOI: 10.1007/978-3-031-03789-4_13; Scopus ID: 2-s2.0-85128973918; ISBN: 9783031037887 (print); OAI: oai:DiVA.org:ri-59247; DiVA, id: diva2:1668472
Conference
11th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2022, held as Part of EvoStar 2022, Madrid, 20 April 2022 through 22 April 2022
Available from: 2022-06-13. Created: 2022-06-13. Last updated: 2022-06-13. Bibliographically reviewed.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text, Scopus

Person

Gambäck, Björn
