Deep Learning Based Sentiment Analysis in a Code-Mixed English-Hindi and English-Bengali Social Media Corpus
2020 (English). In: International journal on artificial intelligence tools, ISSN 0218-2130, Vol. 29, no 5, article id 2050014. Article in journal (Refereed). Published
Abstract [en]
Sentiment analysis is a circumstantial analysis of text, identifying the social sentiment to better understand the source material. The article addresses sentiment analysis of an English-Hindi and English-Bengali code-mixed textual corpus collected from social media. Code-mixing is an amalgamation of multiple languages, which was previously associated mainly with spoken language. However, social media users also deploy it to communicate in ways that tend to be somewhat casual. The coarse nature of social media text poses challenges for many language processing applications. Here, the focus is on the weak predictive performance of traditional machine learners compared to their deep learning counterparts, including the contextual language representation model BERT (Bidirectional Encoder Representations from Transformers), on the task of extracting user sentiment from code-mixed texts. Three deep learners (a BiLSTM CNN, a Double BiLSTM and an Attention-based model) attained accuracy 20-60% greater than traditional approaches on code-mixed data, and were also tested on monolingual English data for comparison.
Place, publisher, year, edition, pages
World Scientific, 2020. Vol. 29, no 5, article id 2050014
Keywords [en]
Code-switching, convolutional neural networks, recurrent neural networks, Codes (symbols), Learning systems, Metals, Sentiment analysis, Social networking (online), Language processing, Machine learners, Multiple languages, Representation model, Social media, Source material, Spoken languages, Traditional approaches, Deep learning
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-50436
DOI: 10.1142/S0218213020500141
Scopus ID: 2-s2.0-85092430312
OAI: oai:DiVA.org:ri-50436
DiVA, id: diva2:1499378
2020-11-09, 2020-11-09, 2020-12-01. Bibliographically approved