While sentiment analysis has received significant attention in the last years, problems still exist when tools need to be applied to microblogging content. This because, typically, the text to be analysed consists of very short messages lacking in structure and semantic context. At the same time, the amount of text produced by online platforms is enormous. So, one needs simple, fast and effective methods in order to be able to efficiently study sentiment in these data. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.
|Data di pubblicazione:||2017|
|Titolo:||Sentiment spreading: An epidemic model for lexicon-based sentiment analysis on twitter|
|Rivista:||LECTURE NOTES IN COMPUTER SCIENCE|
|Titolo del libro:||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.1007/978-3-319-70169-1_9|
|Appare nelle tipologie:||4.1 Articolo in Atti di convegno|