Music Style Transfer Using Constant-Q Transform Spectrograms
NTNU Norwegian University of Science and Technology, Norway.
RISE Research Institutes of Sweden, Digital Systems, Data Science. NTNU Norwegian University of Science and Technology, Norway. ORCID iD: 0000-0002-5252-707x
2022 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 13221 LNCS, Springer Science and Business Media Deutschland GmbH, 2022, p. 195-211. Conference paper, Published paper (Refereed)
Abstract [en]

Previous work on music generation and transformation has commonly targeted single-instrument or single-melody music. Here, in contrast, five music genres are used with the goal of achieving selective remixing by using domain transfer methods on spectrogram images of music. A pipeline architecture comprising two independent generative adversarial network models was created. The first applies features from one of the genres to constant-Q transform spectrogram images to perform style transfer. The second network turns a spectrogram into a real-value tensor representation which is approximately reconstructed back into audio. The system was evaluated experimentally and through a survey. Due to the increased complexity involved in processing high-sample-rate music with homophonic or polyphonic audio textures, the system’s audio output was considered to be low quality, but the style transfer produced noticeable selective remixing on most of the music tracks evaluated. © 2022, The Author(s).

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2022. p. 195-211
Keywords [en]
Audio acoustics, Generative adversarial networks, Music, Quality control, Textures, Audio textures, Domain transfers, Music genre, Network models, Pipeline architecture, Real values, Sample rate, Spectrograms, Tensor representation, Transfer method, Spectrographs
National Category
Interaction Technologies
Identifiers
URN: urn:nbn:se:ri:diva-59247
DOI: 10.1007/978-3-031-03789-4_13
Scopus ID: 2-s2.0-85128973918
ISBN: 9783031037887 (print)
OAI: oai:DiVA.org:ri-59247
DiVA, id: diva2:1668472
Conference
11th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2022, held as Part of EvoStar 2022, Madrid, 20 April 2022 through 22 April 2022
Available from: 2022-06-13 Created: 2022-06-13 Last updated: 2022-06-13. Bibliographically approved

Authority records

Gambäck, Björn
