---
abstract: "This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., \"subtle nuances\") and a negative semantic orientation when it has bad associations (e.g., \"very cavalier\"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word \"excellent\" minus the mutual information between the given phrase and the word \"poor\". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews."
altloc:
- http://extractor.iit.nrc.ca/reports/acl02.pdf
chapter: ~
commentary: ~
commref: ~
confdates: July 8-10
conference: 40th Annual Meeting of the Association for Computational Linguistics (ACL'02)
confloc: 'Philadelphia, Pennsylvania'
contact_email: ~
creators_id: []
creators_name:
- family: Turney
  given: Peter D.
  honourific: ''
  lineage: ''
date: 2002
date_type: published
datestamp: 2002-07-15
department: ~
dir: disk0/00/00/23/21
edit_lock_since: ~
edit_lock_until: ~
edit_lock_user: ~
editors_id: []
editors_name: []
eprint_status: archive
eprintid: 2321
fileinfo: /style/images/fileicons/application_postscript.png;/2321/1/turney%2Dacl02%2Dfinal.ps|/style/images/fileicons/application_pdf.png;/2321/5/turney%2Dacl02%2Dfinal.pdf
full_text_status: public
importid: ~
institution: ~
isbn: ~
ispublished: pub
issn: ~
item_issues_comment: []
item_issues_count: 0
item_issues_description: []
item_issues_id: []
item_issues_reported_by: []
item_issues_resolved_by: []
item_issues_status: []
item_issues_timestamp: []
item_issues_type: []
keywords: ~
lastmod: 2011-03-11 08:54:57
latitude: ~
longitude: ~
metadata_visibility: show
note: ~
number: ~
pagerange: 417-424
pubdom: FALSE
publication: ~
publisher: ~
refereed: TRUE
referencetext: |
  Agresti, A. 1996. An introduction to categorical data analysis. New York: Wiley.
  Brill, E. 1994. Some advances in transformation-based part of speech tagging. Proceedings of the Twelfth National Conference on Artificial Intelligence (pp. 722-727). Menlo Park, CA: AAAI Press.
  Church, K.W., & Hanks, P. 1989. Word association norms, mutual information and lexicography. Proceedings of the 27th Annual Conference of the ACL (pp. 76-83). New Brunswick, NJ: ACL.
  Frank, E., & Hall, M. 2001. A simple approach to ordinal classification. Proceedings of the Twelfth European Conference on Machine Learning (pp. 145-156). Berlin: Springer-Verlag.
  Hatzivassiloglou, V., & McKeown, K.R. 1997. Predicting the semantic orientation of adjectives. Proceedings of the 35th Annual Meeting of the ACL and the 8th Conference of the European Chapter of the ACL (pp. 174-181). New Brunswick, NJ: ACL.
  Hatzivassiloglou, V., & Wiebe, J.M. 2000. Effects of adjective orientation and gradability on sentence subjectivity. Proceedings of 18th International Conference on Computational Linguistics. New Brunswick, NJ: ACL.
  Hearst, M.A. 1992. Direction-based text interpretation as an information access refinement. In P. Jacobs (Ed.), Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Mahwah, NJ: Lawrence Erlbaum Associates.
  Landauer, T.K., & Dumais, S.T. 1997. A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
  Santorini, B. 1995. Part-of-Speech Tagging Guidelines for the Penn Treebank Project (3rd revision, 2nd printing). Technical Report, Department of Computer and Information Science, University of Pennsylvania.
  Spertus, E. 1997. Smokey: Automatic recognition of hostile messages. Proceedings of the Conference on Innovative Applications of Artificial Intelligence (pp. 1058-1065). Menlo Park, CA: AAAI Press.
  Tong, R.M. 2001. An operational system for detecting and tracking opinions in on-line discussions. Working Notes of the ACM SIGIR 2001 Workshop on Operational Text Classification (pp. 1-6). New York, NY: ACM.
  Turney, P.D. 2001. Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning (pp. 491-502). Berlin: Springer-Verlag.
  Wiebe, J.M. 2000. Learning subjective adjectives from corpora. Proceedings of the 17th National Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press.
  Wiebe, J.M., Bruce, R., Bell, M., Martin, M., & Wilson, T. 2001. A corpus study of evaluative and speculative language. Proceedings of the Second ACL SIG on Dialogue Workshop on Discourse and Dialogue. Aalborg, Denmark.
relation_type: []
relation_uri: []
reportno: ~
rev_number: 14
series: ~
source: ~
status_changed: 2007-09-12 16:44:12
subjects:
- comp-sci-art-intel
- comp-sci-lang
- comp-sci-mach-learn
- comp-sci-stat-model
succeeds: ~
suggestions: ~
sword_depositor: ~
sword_slug: ~
thesistype: ~
title: 'Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews'
type: confpaper
userid: 2175
volume: ~