Publications (10 of 46)
McAllister, T. & Gambäck, B. (2022). Music Style Transfer Using Constant-Q Transform Spectrograms. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 13221 LNCS. Paper presented at 11th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2022, held as Part of EvoStar 2022, Madrid, 20 April 2022 through 22 April 2022 (pp. 195-211). Springer Science and Business Media Deutschland GmbH
2022 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 13221 LNCS, Springer Science and Business Media Deutschland GmbH, 2022, p. 195-211. Conference paper, Published paper (Refereed)
Abstract [en]

Previous work on music generation and transformation has commonly targeted single instrument or single melody music. Here, in contrast, five music genres are used with the goal of achieving selective remixing by using domain transfer methods on spectrogram images of music. A pipeline architecture comprising two independent generative adversarial network models was created. The first applies features from one of the genres to constant-Q transform spectrogram images to perform style transfer. The second network turns a spectrogram into a real-value tensor representation which is approximately reconstructed back into audio. The system was evaluated experimentally and through a survey. Due to the increased complexity involved in processing high sample rate music with homophonic or polyphonic audio textures, the system’s audio output was considered to be low quality, but the style transfer produced noticeable selective remixing on most of the music tracks evaluated. © 2022, The Author(s).

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2022
Keywords
Audio acoustics, Generative adversarial networks, Music, Quality control, Textures, Audio textures, Domain transfers, Music genre, Network models, Pipeline architecture, Real values, Sample rate, Spectrograms, Tensor representation, Transfer method, Spectrographs
National Category
Interaction Technologies
Identifiers
urn:nbn:se:ri:diva-59247 (URN); 10.1007/978-3-031-03789-4_13 (DOI); 2-s2.0-85128973918 (Scopus ID); 9783031037887 (ISBN)
Conference
11th International Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2022, held as Part of EvoStar 2022, Madrid, 20 April 2022 through 22 April 2022
Available from: 2022-06-13. Created: 2022-06-13. Last updated: 2022-06-13. Bibliographically approved.
Ekern, E. & Gambäck, B. (2021). Interactive, Efficient and Creative Image Generation Using Compositional Pattern-Producing Networks. In: Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021. Lecture Notes in Computer Science, vol 12693. Paper presented at Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2021 (pp. 131-146). Springer Science and Business Media Deutschland GmbH
2021 (English). In: Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021. Lecture Notes in Computer Science, vol 12693, Springer Science and Business Media Deutschland GmbH, 2021, p. 131-146. Conference paper, Published paper (Refereed)
Abstract [en]

In contrast to most recent models that generate an entire image at once, the paper introduces a new architecture for generating images one pixel at a time using a Compositional Pattern-Producing Network (CPPN) as the generator part in a Generative Adversarial Network (GAN), allowing for effective generation of visually interesting images with artistic value, at arbitrary resolutions independent of the dimensions of the training data. The architecture, as well as accompanying (hyper-) parameters, for training CPPNs using recent GAN stabilisation techniques is shown to generalise well across many standard datasets. Rather than relying on just a latent noise vector (entangling various features with each other), mutual information maximisation is utilised to get disentangled representations, removing the requirement to use labelled data and giving the user control over the generated images. A web application for interacting with pre-trained models was also created, unique in the offered level of interactivity with an image-generating GAN.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2021
Keywords
Compositional pattern-producing networks, Generative adversarial networks, Image generation, Data visualization, Network architecture, Adversarial networks, Artistic value, Image generations, Mutual informations, Noise vectors, Training data, WEB application, Artificial intelligence
National Category
Medical Image Processing
Identifiers
urn:nbn:se:ri:diva-53516 (URN); 10.1007/978-3-030-72914-1_9 (DOI); 2-s2.0-85107436289 (Scopus ID); 9783030729134 (ISBN)
Conference
Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021.
Available from: 2021-06-17. Created: 2021-06-17. Last updated: 2021-06-17. Bibliographically approved.
Jamatia, A., Swamy, S., Gambäck, B., Das, A. & Debbarma, S. (2020). Deep Learning Based Sentiment Analysis in a Code-Mixed English-Hindi and English-Bengali Social Media Corpus. International journal on artificial intelligence tools, 29(5), Article ID 2050014.
2020 (English). In: International journal on artificial intelligence tools, ISSN 0218-2130, Vol. 29, no 5, article id 2050014. Article in journal (Refereed), Published
Abstract [en]

Sentiment analysis is a circumstantial analysis of text, identifying the social sentiment to better understand the source material. The article addresses sentiment analysis of an English-Hindi and English-Bengali code-mixed textual corpus collected from social media. Code-mixing is an amalgamation of multiple languages, which was previously associated mainly with spoken language. However, social media users also deploy it to communicate in ways that tend to be somewhat casual. The coarse nature of social media text poses challenges for many language processing applications. Here, the focus is on the low predictive power of traditional machine learners when compared to Deep Learning counterparts, including the contextual language representation model BERT (Bidirectional Encoder Representations from Transformers), on the task of extracting user sentiment from code-mixed texts. Three deep learners (a BiLSTM CNN, a Double BiLSTM and an Attention-based model) attained accuracy 20-60% greater than traditional approaches on code-mixed data, and were for comparison also tested on monolingual English data.

Place, publisher, year, edition, pages
World Scientific, 2020
Keywords
Code-switching, convolutional neural networks, recurrent neural networks, Codes (symbols), Learning systems, Metals, Sentiment analysis, Social networking (online), Language processing, Machine learners, Multiple languages, Representation model, Social media, Source material, Spoken languages, Traditional approaches, Deep learning
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-50436 (URN); 10.1142/S0218213020500141 (DOI); 2-s2.0-85092430312 (Scopus ID)
Available from: 2020-11-09. Created: 2020-11-09. Last updated: 2020-12-01. Bibliographically approved.
Swamy, S. D., Jamatia, A. & Gambäck, B. (2019). Studying generalisability across abusive language detection datasets. In: CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference. Paper presented at 23rd Conference on Computational Natural Language Learning, CoNLL 2019, 3 November 2019 through 4 November 2019 (pp. 940-950). Association for Computational Linguistics
2019 (English). In: CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference, Association for Computational Linguistics, 2019, p. 940-950. Conference paper, Published paper (Refereed)
Abstract [en]

Work on Abusive Language Detection has tackled a wide range of subtasks and domains. As a result of this, there exists a great deal of redundancy and non-generalisability between datasets. Through experiments on cross-dataset training and testing, the paper reveals that the preconceived notion of including more non-abusive samples in a dataset (to emulate reality) may have a detrimental effect on the generalisability of a model trained on that data. Hence a hierarchical annotation model is utilised here to reveal redundancies in existing datasets and to help reduce redundancy in future efforts.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2019
Keywords
Statistical tests, Language detection, Subtasks, Training and testing, Redundancy
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-45025 (URN); 2-s2.0-85084333327 (Scopus ID); 9781950737727 (ISBN)
Conference
23rd Conference on Computational Natural Language Learning, CoNLL 2019, 3 November 2019 through 4 November 2019
Available from: 2020-05-25. Created: 2020-05-25. Last updated: 2020-05-28. Bibliographically approved.
Kumar, U., Reganti, A. N., Maheshwari, T., Chakroborty, T., Gambäck, B. & Das, A. (2018). Inducing Personalities and Values from Language Use in Social Network Communities. Information Systems Frontiers, 20(6), 1219-1240
2018 (English). In: Information Systems Frontiers, ISSN 1387-3326, E-ISSN 1572-9419, Vol. 20, no 6, p. 1219-1240. Article in journal (Refereed), Published
Abstract [en]

A community in social networks is generally assumed to be composed of a group of individuals with similar characteristics. Although there has been a plethora of work on understanding network topologies (edge density, clustering coefficient, etc.) within an online community, the psycho-sociological compositions of social network communities have hardly been studied. The present paper aims to analyse the communities as compositions of induced psycholinguistic and sociolinguistic variables (Personalities, Values and Ethics) across individuals in social media networks. The motivation behind this analysis is to understand the behavioural characteristics at the individual as well as the societal level in social networks. To this end, three studies were carried out on six different datasets: three Twitter corpora, two Facebook corpora, and an Essay corpus, annotated with Values and Ethics of the users. The first comprised experiments on creating automatic models to determine the Personality and Values of individuals by analysing their language usage and social media behaviour. The second comprised experiments on understanding the characteristics or blend of characteristics of individuals within an online community. The third generated a map of values and ethics for India, a multi-lingual and multi-cultural country. Striking similarities to general intuitive perception could be observed, i.e., the results obtained in the study resemble our general perception about the cities/towns of India.

Place, publisher, year, edition, pages
Springer New York LLC, 2018
Keywords
Community, Ethics, Personality, Social network, Values, Linguistics, Online systems, Philosophical aspects, Clustering coefficient, Network communities, On-line communities, Social media networks, Social networking (online)
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-37955 (URN); 10.1007/s10796-017-9793-8 (DOI); 2-s2.0-85029010113 (Scopus ID)
Available from: 2019-04-23. Created: 2019-04-23. Last updated: 2019-05-03. Bibliographically approved.
Maheshwari, T., Reganti, A. N., Gupta, S., Jamatia, A., Kumar, U., Gambäck, B. & Das, A. (2017). A societal sentiment analysis: Predicting the values and ethics of individuals by analysing social media content. In: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference. Paper presented at 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017, 3 April 2017 through 7 April 2017 (pp. 731-741).
2017 (English). In: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference, 2017, p. 731-741. Conference paper, Published paper (Refereed)
Abstract [en]

To find out how users' social media behaviour and language are related to their ethical practices, the paper investigates applying Schwartz' psycholinguistic model of societal sentiment to social media text. The analysis is based on corpora collected from user essays as well as social media (Facebook and Twitter). Several experiments were carried out on the corpora to classify the ethical values of users, incorporating Linguistic Inquiry and Word Count analysis, n-grams, topic models, psycholinguistic lexica, speech-acts, and non-linguistic information, while applying a range of machine learners (Support Vector Machines, Logistic Regression, and Random Forests) to identify the best linguistic and non-linguistic features for automatic classification of values and ethics.

Keywords
Classification (of information), Computational linguistics, Decision trees, Linguistics, Philosophical aspects, Automatic classification, Ethical practices, Ethical values, Linguistic features, Logistic regressions, Machine learners, Psycholinguistic models, Sentiment analysis, Social networking (online)
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-31136 (URN); 10.18653/v1/e17-1069 (DOI); 2-s2.0-85021625321 (Scopus ID); 9781510838604 (ISBN)
Conference
15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017, 3 April 2017 through 7 April 2017
Available from: 2017-08-28. Created: 2017-08-28. Last updated: 2019-08-14. Bibliographically approved.
Gambäck, B., Olsson, F. & Täckström, O. (2011). Active Learning for Dialogue Act Classification (9 ed.). Paper presented at INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association.
2011 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Active learning techniques were employed for classification of dialogue acts over two dialogue corpora, the English human-human Switchboard corpus and the Spanish human-machine Dihana corpus. It is shown clearly that active learning improves on a baseline obtained through a passive learning approach to tagging the same data sets. An error reduction of 7% was obtained on Switchboard, while a reduction by a factor of 5 in the amount of labeled data needed for classification was achieved on Dihana. The passive Support Vector Machine learner used as baseline in itself significantly improves the state of the art in dialogue act classification on both corpora. On Switchboard it gives a 31% error reduction compared to the previously best reported result.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23877 (URN)
Conference
INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association
Projects
COMPANIONS
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2020-12-01. Bibliographically approved.
Wilks, Y., Gambäck, B. & Danieli, M. (Eds.). (2010). Workshop on Companionable Dialogue Systems (6 ed.). Uppsala, Sweden: ACL
2010 (English). Collection (editor) (Refereed)
Place, publisher, year, edition, pages
Uppsala, Sweden: ACL, 2010, p. 59. Edition: 6
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23896 (URN); 978-1-932432-81-7 (ISBN)
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2020-12-01. Bibliographically approved.
Ståhl, O., Gambäck, B., Turunen, M. & Hakulinen, J. (2009). A Mobile Health and Fitness Companion Demonstrator (11 ed.). In: Proceedings of the Demonstrations Session at EACL 2009. Paper presented at 12th Conference of the European Chapter of the Association for Computational Linguistics, ACL (pp. 65-68).
2009 (English). In: Proceedings of the Demonstrations Session at EACL 2009, 2009, 11 ed., p. 65-68. Conference paper, Published paper (Refereed)
Abstract [en]

Multimodal conversational spoken dialogues using physical and virtual agents provide a potential interface to motivate and support users in the domain of health and fitness. The paper presents a multimodal conversational Companion system focused on health and fitness, which has both a stationary and a mobile component.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23676 (URN)
Conference
12th Conference of the European Chapter of the Association for Computational Linguistics, ACL
Projects
COMPANIONS
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2023-12-04. Bibliographically approved.
Gambäck, B., Olsson, F., Argaw, A. A. & Asker, L. (2009). Methods for Amharic part-of-speech tagging (1 ed.). In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Paper presented at First Workshop on Language Technologies for African Languages, March 2009, Athens, Greece.
2009 (English). In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, 2009, 1 ed., p. 8. Conference paper, Published paper (Refereed)
Abstract [en]

The paper describes a set of experiments involving the application of three state-of-the-art part-of-speech taggers to Ethiopian Amharic, using three different tagsets. The taggers showed worse performance than previously reported results for English, in particular having problems with unknown words. The best results were obtained using a Maximum Entropy approach, while HMM-based and SVM-based taggers got comparable results.

Keywords
part-of-speech tagging, amharic, machine learning
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23519 (URN)
Conference
First Workshop on Language Technologies for African Languages, March 2009, Athens, Greece
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2020-12-01. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-5252-707x
