Subjectivity — Tapping All the Valuable Insights beyond Sentiment for Nextgen Information Extraction

The information in text can generally be divided into two categories: objective information and subjective information. Objective information covers the facts about something or someone, while subjective information concerns someone's personal experiences. For example, the fact that it is raining is objective, while how one feels about the rain is subjective; that you have not had breakfast is objective, while your feeling of hunger is subjective; that you watched “You Can Count on Me” last night is objective, while the warmth the movie made you feel is subjective.

Subjective information about what people think and how people feel is useful for all parties including individuals, businesses, and government agencies during their decision-making processes. The traditional way of collecting subjective information takes the form of surveys, questionnaires, polls, focus groups, interviews, etc. For example, individuals ask their friends which cell phone carriers they recommend and whether the coverage is good in their area; retailers conduct a focus group to have in-depth discussions with their target customers about how they feel regarding shopping in the stores; governments solicit public opinions on particular policy issues via surveys.

The web and social media have changed the way we communicate, and they provide new and potentially powerful avenues for gleaning useful subjective information from user-generated content such as blogs, forum posts, reviews, chats, and microblogs. However, much of this information is buried in ever-growing volumes of user-generated data, which makes it very difficult (if not impossible) to manually capture and process the information needed for various purposes. To address this information overload, it is essential to develop techniques that automatically discover and derive high-quality (i.e., accurate and relevant to the context or application) subjective information from user-generated content.

Current subjectivity and sentiment analysis efforts have focused on classifying text polarity, specifically, whether the opinion expressed about a specific topic in a given text (e.g., document, sentence, word/phrase) is positive, negative, or neutral. This narrow definition treats subjective information and sentiment as one and the same, while other types of subjective information (e.g., emotion, intent, preference, expectation) are either not taken into account or are handled similarly without sufficient differentiation. This limitation may prevent the exploitation of subjective information from reaching its full potential.

At Kno.e.sis, we extend the definition of subjective information and develop a unified framework that captures the key components of diverse types of subjective information. We define a subjective experience as a quadruple (h, s, e, c), where h is the individual who holds the experience; s is the stimulus (or target) that elicits the experience, e.g., an entity or an event; e is the set of expressions used to describe the subjective experience, e.g., sentiment words/phrases or opinion claims; and c is a classification or assessment that characterizes or measures the subjectivity. Accordingly, the problem of identifying different types of subjective information can be formulated as a data mining task that aims to automatically derive the four components of the quadruple from text, as illustrated in Table 1.


Table 1. Components of sentiment, opinion, emotion, intent, preference and expectation.

| Subjective Experience | Holder h | Stimulus s | Expression e | Classification c |
| --- | --- | --- | --- | --- |
| Sentiment | an individual who holds the sentiment | an object | sentiment words and phrases | positive, negative, neutral |
| Opinion | an individual who holds the opinion | an object | opinion claims (may or may not contain sentiment words) | positive, negative, neutral |
| Emotion | an individual who holds the emotion | an event or situation | emotion words and phrases, description of events/situations | anger, disgust, fear, happiness, sadness, surprise, etc. |
| Intent | an individual who holds the intent | an action | expressions of desires and beliefs | depending on specific tasks |
| Preference | an individual who holds the preference | a set of alternatives | the expressions of liking, disliking or preferring an alternative | depending on specific tasks |
| Expectation | an individual who holds the expectation | an object | expressions of beliefs about someone or how something will be | depending on specific tasks |
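
To make the quadruple concrete, here is a minimal Python sketch of how one (h, s, e, c) record might be represented. The class name `SubjectiveExperience`, its field names, the extra `kind` field, and the sample values are illustrative assumptions, not part of the framework as published.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SubjectiveExperience:
    """One subjective experience, following the (h, s, e, c) quadruple.

    All names here are hypothetical; the post only defines the four components.
    """
    kind: str                    # type of experience, e.g. "sentiment" or "intent"
    holder: str                  # h: who holds the experience
    stimulus: str                # s: the entity, event, or action that elicits it
    expressions: FrozenSet[str]  # e: words/phrases that describe the experience
    classification: str          # c: category or assessment

    def as_quadruple(self) -> Tuple[str, str, FrozenSet[str], str]:
        """Return the record in the (h, s, e, c) order used in the text."""
        return (self.holder, self.stimulus, self.expressions, self.classification)


# A made-up sentiment record about a cell phone carrier.
example = SubjectiveExperience(
    kind="sentiment",
    holder="the author",
    stimulus="cell phone carrier",
    expressions=frozenset({"great coverage"}),
    classification="positive",
)
```

Keeping `kind` separate from the quadruple lets the same record type hold any row of Table 1 while the four components stay exactly as defined.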

Consider the following example:

“Action and science fiction movies are usually my favorite, but I don't like the new Jurassic World. Mad Max: Fury Road is the best I've seen so far this year. It's a magnificent visual spectacle and the acting is stellar too. I cried, laughed and smiled watching Inside Out. It was so touching. Would like to watch the new Spy movie this weekend. I hope it’s good!”

Traditional sentiment analysis would find positive opinions about action and science fiction movies, “Mad Max: Fury Road,” “Inside Out” and “Spy,” and a negative opinion about “Jurassic World.” However, if we consider the different types of subjective information and handle each type according to the proposed framework, we can derive much richer information from the text, as illustrated in Table 2.

Table 2. Information that can be extracted from the example text.

| Subjective Experience | Holder h | Stimulus s | Expression e | Classification c |
| --- | --- | --- | --- | --- |
| Preference | the author | movie genres | “favorite” | prefer action and science fiction movies over other types of movies |
| Sentiment | the author | movie Jurassic World | “don’t like” | negative |
| Opinion | the author | movie Mad Max: Fury Road, visual effect, performances | “best”, “magnificent”, “spectacle”, “stellar” | positive |
| Emotion | the author | movie Inside Out | “cried”, “laughed”, “smiled”, “touching” | sadness, joy, touching |
| Intent | the author | movie Spy | “would like to” | transactional |
| Expectation | the author | movie Spy | “hope” | optimistic |

Figure 1 depicts the process of subjective information extraction. First, a number of preprocessing steps are needed to handle the raw textual data before information extraction can take place. Common preprocessing steps include sentence splitting, word tokenization, syntactic parsing or POS tagging, and stop-word removal. Afterwards, an optional step is to detect the subjective content in the input text, for example by classifying sentences as subjective or objective. The subjective content can be further classified into different types, e.g., sentiment, emotion, intent and expectation. Language resources such as WordNet, Urban Dictionary, and subjectivity lexicons (e.g., MPQA, SentiWordNet) can be used for the subjectivity classification task.

Figure 1 An overview of subjective information extraction.
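
The first stages of this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list and the `SUBJECTIVE_CUES` set are tiny hand-made stand-ins for real resources such as MPQA or SentiWordNet, the function names are hypothetical, and the regex-based sentence splitter is far cruder than a proper tokenizer or parser.

```python
import re

# Toy resources (assumptions): real systems would load full stop-word lists
# and subjectivity lexicons instead of these hand-picked sets.
STOP_WORDS = {"the", "a", "an", "is", "it", "was", "i", "to", "and", "so"}
SUBJECTIVE_CUES = {"like", "love", "hate", "hope", "best", "touching", "favorite"}


def split_sentences(text):
    """Split on whitespace that follows sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens (with apostrophes)."""
    return re.findall(r"[a-z']+", sentence.lower())


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def is_subjective(tokens):
    """Flag a sentence as subjective if it contains any cue word."""
    return any(t in SUBJECTIVE_CUES for t in tokens)


def preprocess(text):
    """Return (sentence, content tokens, subjective?) for each sentence."""
    results = []
    for sentence in split_sentences(text):
        tokens = remove_stop_words(tokenize(sentence))
        results.append((sentence, tokens, is_subjective(tokens)))
    return results


for sentence, tokens, subjective in preprocess("It is raining. I hope it stops soon!"):
    print(sentence, tokens, "subjective" if subjective else "objective")
```

With these toy lists, the first sentence is labeled objective and the second subjective, mirroring the rain example from the introduction. A lexicon lookup like this ignores context, which is why the subjectivity detection step is usually handled by a trained classifier in practice.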

The next step is to extract the four components of subjective experiences: the holder, the stimulus or target, the set of expressions, and the classification category or assessment score. Depending on the type of subjective information, specific techniques need to be developed and applied. For example, the target of a sentiment is usually an entity, so entity recognition is used to extract sentiment targets, whereas the target of an intent can be an action, e.g., “to buy a new cell-phone”, so we need techniques that extract actions from text. In addition, for the same type of subjective information, different classification/assessment schemes and techniques may need to be developed according to the purpose of the application. For example, many sentiment analysis and opinion mining systems classify the polarity of a text (e.g., a movie review, a tweet) as positive, negative or neutral [1-3], or rate it on a 1-5 star rating scale [4,5]. Some emotion identification systems focus on classifying emotions into six basic categories: anger, disgust, fear, happiness, sadness, and surprise [6], while other systems define their own set of emotions based on the application purpose, e.g., understanding emotions in suicide notes [7], identifying emotions that people express using curse words [8], or classifying emotional responses to TV shows and movies [9]. Existing work on detecting users' query intent classifies search queries into three categories: navigational, informational, or transactional [10,11]. Studies on identifying purchase intent (PI) for online advertising classify users' posts as PI or Non-PI [12], or as information seeking or transactional [13].
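
As a rough illustration of how component extraction differs by type, the Python sketch below classifies sentiment polarity with a toy lexicon and pulls the action out of an intent expression with a regular expression. Everything in it is an assumption made for illustration: the word lists, the single-token negation rule, the intent pattern, and the function names are hypothetical and are not the methods used in the cited work.

```python
import re

# Toy polarity lexicon and negators (assumptions, not a published resource).
POSITIVE = {"like", "best", "magnificent", "stellar", "good"}
NEGATIVE = {"bad", "boring", "awful"}
NEGATORS = {"not", "don't", "never", "no"}


def classify_polarity(sentence):
    """Classification c for sentiment: positive, negative, or neutral."""
    # Normalize curly apostrophes so "don’t" matches the negator list.
    tokens = re.findall(r"[a-z']+", sentence.lower().replace("’", "'"))
    score = 0
    for i, tok in enumerate(tokens):
        value = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        # Flip the polarity when the previous token is a negator.
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical pattern: an intent cue followed by the action it targets.
INTENT_PATTERN = re.compile(
    r"\b(?:would like|want|plan|going) to (\w+(?: \w+)*)", re.IGNORECASE
)


def extract_intent_action(sentence):
    """Stimulus s for intent: the action phrase after the cue, or None."""
    match = INTENT_PATTERN.search(sentence)
    return match.group(1) if match else None


print(classify_polarity("I don't like the new Jurassic World."))
print(classify_polarity("Mad Max: Fury Road is the best I've seen so far this year."))
print(extract_intent_action("Would like to watch the new Spy movie this weekend."))
```

On the example text, this labels the Jurassic World sentence negative and the Mad Max sentence positive, and it returns "watch the new Spy movie this weekend" as the intended action. Hand-written rules like these break quickly on real user-generated content, which is why the approaches cited above rely on learned models and larger lexical resources.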

Finally, the extracted subjective information can be used for a wide variety of applications, including but not limited to business analytics, Customer Relationship Management (CRM), marketing, predicting the financial performance of a company, targeted advertising, recommendation (based on users' interests and preferences), monitoring social phenomena (e.g., social tension, subjective well-being), and predicting election results.

At Kno.e.sis, we have developed automatic methods to extract components of different subjective experiences. We have proposed an optimization-based approach that extracts a diverse set of sentiment-bearing expressions, including formal and slang words/phrases, for a given target from an unlabeled corpus [2]. We have developed a clustering approach that identifies opinion targets (product features and aspects) from plain product reviews [14]. This approach identifies features and clusters them into aspects simultaneously. Furthermore, it extracts both explicit and implicit features and does not require seed terms. We have also explored the classification and assessment of different types of subjective information. In particular, we have explored supervised methods for emotion classification [6-9]. We have proposed methods that group opinion holders based on their political preference and their participation in discussions about election candidates on Twitter, and that assess their sentiments towards the candidates to predict election results [15]. To understand the effect of religiosity on happiness, we analyzed the tweets and networks of more than 250k U.S. Twitter users who self-declared their religious beliefs, and examined the pleasant/unpleasant emotional expressions in their tweets to estimate their subjective well-being [16,17].



References:

[1] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. "Thumbs up?: sentiment classification using machine learning techniques." EMNLP. 2002.
[2] Lu Chen, Wenbo Wang, Meenakshi Nagarajan, Shaojun Wang, Amit Sheth. Extracting Diverse Sentiment Expressions with Target-dependent Polarity from Twitter. ICWSM. 2012.
[3] Cícero Nogueira dos Santos, and Maira Gatti. "Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts." COLING. 2014.
[4] Ganu, Gayatree, Noemie Elhadad, and Amélie Marian. "Beyond the Stars: Improving Rating Predictions using Review Text Content." WebDB. Vol. 9. 2009.
[5] Sharma, Raksha, et al. "Adjective Intensity and Sentiment Analysis." EMNLP. 2015.
[6] Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit Sheth. Harnessing Twitter "Big Data" for Automatic Emotion Identification. SocialCom. 2012.
[7] Wenbo Wang, Lu Chen, Ming Tan, Shaojun Wang, Amit Sheth. Discovering Fine-grained Sentiment in Suicide Notes. Biomedical Informatics Insights (BII). 2012.
[8] Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit Sheth. Cursing in English on Twitter. CSCW. 2014.
[10] Jansen, Bernard J., Danielle L. Booth, and Amanda Spink. "Determining the informational, navigational, and transactional intent of Web queries." Information Processing & Management 44.3: 1251-1266. 2008.
[11] Hu, Jian, et al. "Understanding user's query intent with wikipedia." Proceedings of the 18th international conference on World wide web. ACM, 2009.
[12] Gupta, Vineet, et al. "Identifying Purchase Intent from Social Posts." ICWSM. 2014.
[13] Nagarajan, Meenakshi, et al. "Monetizing user activity on social networks-challenges and experiences." Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology-Volume 01. IEEE Computer Society, 2009.
[14] Lu Chen, Justin Martineau, Doreen Cheng and Amit Sheth. Clustering for Simultaneous Extraction of Aspects and Features from Reviews. NAACL. 2016.
[15] Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries. Proceedings of the 4th International Conference on Social Informatics (SocInfo). 2012.
[16] Lu Chen, Ingmar Weber and Adam Okulicz-Kozaryn. U.S. Religious Landscape on Twitter. Proceedings of the 6th International Conference on Social Informatics (SocInfo), 2014.
[17] Lu Chen. “Mining and Analyzing Subjective Experiences in User Generated Content.” Ph.D. Dissertation. Department of Computer Science & Engineering. [Dayton]: Wright State University; 2016. p. 161.

 
