Combining Overall and Target Oriented Sentiment Analysis over Portuguese Text from Social Media
This document describes an approach to sentiment analysis of Portuguese social media content. A single system performs polarity classification in two modes: the overall sentiment of a message, and target-oriented sentiment. Both modes train a Maximum Entropy classifier. The overall model uses bag-of-words (BoW) features, together with features derived from POS tagging and from sentiment lexicons. Target-oriented analysis begins with named entity recognition, followed by classification of the sentiment polarity expressed toward each recognized entity. This second classifier uses features tied to the textual zone around the entity mention, including negation detection, and the syntactic function of the segment in which the target occurs. In our experiments, the system achieved an accuracy of 75% for target-oriented polarity classification and 97% for overall polarity.
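To make the two modes concrete, the following is a minimal sketch, not the system's implementation. It assumes whitespace tokenization, a toy lexicon and negator list (`POSITIVE`, `NEGATIVE`, `NEGATORS`), a fixed token window as the "textual zone", and a target that is already known and is a single token. The function names are hypothetical. POS tagging, named entity recognition, and syntactic function features are omitted. The Maximum Entropy model is written out as a plain softmax regression trained by stochastic gradient ascent, using only the Python standard library.

```python
import math
from collections import defaultdict

# Toy resources, assumed for illustration only (not the paper's lexicons).
POSITIVE = {"bom", "excelente", "adoro"}
NEGATIVE = {"mau", "péssimo", "odeio"}
NEGATORS = {"não", "nunca", "nem"}


def lexicon_counts(tokens):
    """Count lexicon hits among the given tokens."""
    return {
        "lex_pos": float(sum(t in POSITIVE for t in tokens)),
        "lex_neg": float(sum(t in NEGATIVE for t in tokens)),
    }


def overall_features(text):
    """Overall mode: binary BoW features plus lexicon counts for the message."""
    tokens = text.lower().split()
    feats = {"bow=" + t: 1.0 for t in tokens}
    feats.update(lexicon_counts(tokens))
    return feats


def target_features(text, target, window=3):
    """Target mode: features from a token window around the target mention.

    Raises ValueError if the (single-token) target is not in the text.
    """
    tokens = text.lower().split()
    idx = tokens.index(target.lower())
    zone = tokens[max(0, idx - window): idx + window + 1]
    feats = {"zone=" + t: 1.0 for t in zone if t != target.lower()}
    feats.update(lexicon_counts(zone))
    # Negation detection restricted to the target zone.
    feats["negated"] = 1.0 if any(t in NEGATORS for t in zone) else 0.0
    return feats


def predict_proba(weights, feats):
    """Softmax over per-class linear scores (the MaxEnt model form)."""
    scores = {c: sum(w.get(n, 0.0) * v for n, v in feats.items())
              for c, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(examples, labels, epochs=200, lr=0.5):
    """Fit class weights by stochastic gradient ascent on the log-likelihood."""
    weights = {c: defaultdict(float) for c in sorted(set(labels))}
    for _ in range(epochs):
        for feats, gold in zip(examples, labels):
            probs = predict_proba(weights, feats)
            for c in weights:
                err = (1.0 if c == gold else 0.0) - probs[c]
                for name, value in feats.items():
                    weights[c][name] += lr * err * value
    return weights


def classify(weights, feats):
    """Return the most probable class label."""
    probs = predict_proba(weights, feats)
    return max(probs, key=probs.get)


# Tiny made-up training set for the overall mode.
train = [
    ("adoro este filme", "positive"),
    ("filme excelente", "positive"),
    ("odeio este serviço", "negative"),
    ("serviço péssimo", "negative"),
]
model = train_maxent([overall_features(t) for t, _ in train],
                     [y for _, y in train])
print(classify(model, overall_features("filme excelente")))
print(target_features("não gosto do banco mas adoro o clube", "banco"))
```

The same trainer can serve both modes, since only the feature function differs: `overall_features` looks at the whole message, while `target_features` sees only the zone around one entity mention, so a negator near "banco" does not affect the features computed for "clube".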