Sentiment Analysis

From New Media Business Blog


Overview

Jonathan Taplin on social sentiment

Sentiment analysis involves analyzing human-generated information in order to determine what sentiment, if any, is being expressed within it. Seth Grimes, one of the leading experts on the subject, defines it as:

“a set of methods, typically (but not always) implemented in computer software, that detect, measure, report, and exploit attitudes, opinions, and emotions in online, social, and enterprise information sources” [1]

Essentially, any techniques that attempt to understand what is being articulated through some means of communication can be considered sentiment analysis. Textual information is one of the most common inputs but many other information formats, including audio, video and images, can potentially be used.[1]

As a branch of data analytics, the use of sentiment analysis in business owes its origins to the emergence of business intelligence (BI). Business-adapted data analytics began to appear in the 1980s, with the rise of BI for corporate use.[2] Because of this relationship to BI, users tend to treat sentiment analysis as just another data analysis tool. This is a pitfall, because sentiment analysis creates the most value when it is used to analyze the relationship between measurable social interactions and business outcomes.[2] That cannot be done while social media analysis tools continue to “treat social media in a siloed fashion.”[2] Businesses appear to be only beginning to scratch the surface of the potential uses of these technologies.

The primary goal of performing sentiment analysis is to create value.[2] Value can be generated only by acting on the understandings and insights that sentiment analysis reveals. Increasingly, businesses are using these analytical tools to better understand their customers’ attitudes towards their company, their products and their brands. As Seth Grimes writes:

“Sentiment analysis lets marketers (and market researchers, customer service and support staff, product managers, etc.) get at root causes, at explanations of behaviors that are captured in transaction and tracking records. Sentiment analysis means better targeted marketing, faster detection of opportunities and threats, brand-reputation protection, and the ultimate aim, profit.” [1]

Sentiment analysis has many other potential applications as will be discussed subsequently.

Natural Language Processing (NLP)

In order to interpret social information, sentiment analysis relies on natural language processing. NLP employs a number of different techniques and statistical methodologies [3] to analyze social information and extract meaning from it. Before they can identify sentiment correctly, however, NLP platforms have to be trained to interpret it.[3] This can be done through machine learning, where the system learns from examples that have been interpreted by humans first.[3]
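The training step described above can be sketched with a small Naive Bayes classifier. This is only an illustration under stated assumptions: the four labelled examples are made up, the function names `train` and `classify` are hypothetical, and Naive Bayes is just one of many machine learning methods an NLP platform might use.

```python
import math
from collections import Counter, defaultdict

def train(labelled):
    """labelled: list of (text, label) pairs that humans have already interpreted."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of examples
    for text, label in labelled:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest Naive Bayes score for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior plus log likelihood of each word, with add-one smoothing
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical human-labelled training examples
examples = [
    ("love this phone great battery", "positive"),
    ("great service very happy", "positive"),
    ("terrible support never again", "negative"),
    ("battery died awful phone", "negative"),
]
model = train(examples)
print(classify("great phone", model))
print(classify("awful support", model))
```

With so few examples the model is only a toy; real systems learn from far larger sets of human-interpreted text.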

Basic sentiment scale: positive, neutral, negative

Early on, sentiment analysis was able to interpret polarity: whether social data had positive, negative or neutral connotations. For example, a tweet complaining about a product would be interpreted as negative.

Being able to determine whether negative or positive things are being said about your company can be very useful, especially if a large proportion of the messages about the company or its products are negative. In this way, sentiment analysis can alert companies to significant changes in the public’s attitudes towards their image and their brands. They can then use this feedback to mitigate negative market outcomes or exploit positive ones.
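A minimal sketch of polarity scoring, assuming a tiny hand-made word list rather than a trained model: the cue words, the sample messages and the function names `polarity` and `negative_share` are all hypothetical. It labels each message and reports what proportion of them is negative.

```python
# Hypothetical cue-word lists; a real lexicon would be far larger
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "awful", "broken"}

def polarity(text):
    """Label a message as positive, negative or neutral by counting cue words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def negative_share(messages):
    """Fraction of messages labelled negative (0.0 for an empty list)."""
    if not messages:
        return 0.0
    return sum(polarity(m) == "negative" for m in messages) / len(messages)

# Made-up sample messages
messages = [
    "I hate this phone, the screen is broken!",
    "Great service, love it",
    "The store opens at nine.",
]
for m in messages:
    print(polarity(m), "|", m)
print(negative_share(messages))
```

A company could watch the negative share over time and investigate when it rises sharply.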

Recently, developments in sentiment analysis have taken polarity further, culminating in what are being called beyond-polarity solutions. Beyond-polarity provides a broader range of emotional categories like anger, happiness, frustration and satisfaction, offering greater business insight than polarity does.[1]

While NLP is the foundation of sentiment analysis, it becomes more powerful when combined with other information and techniques. For example, a ‘like’ on Facebook or a five-star rating for a product can reinforce a positive message. It can also be useful to look at the reputation and reach of content authors. When a person with twenty thousand Twitter followers, or a blog with two hundred thousand monthly visitors, posts something negative about your company, it can affect the purchasing decisions of many people. Conversely, when a person or entity who posts almost exclusively negative content says something negative, it is probably not as meaningful as when a more positive actor does so. These and other considerations will be explored more fully in later sections.
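One way to picture this combination is a weighted impact score. The formula below is purely an assumption made for illustration (log-scaled reach, and a discount for authors who are negative most of the time); the function name `weighted_impact` and its parameters are hypothetical and do not come from any particular tool.

```python
import math

def weighted_impact(polarity_score, followers, author_negative_rate):
    """Scale a polarity score in [-1, 1] by author reach and posting history.

    author_negative_rate is the share (0..1) of the author's past posts
    that were negative. Illustrative weighting only.
    """
    reach = math.log10(followers + 1)           # diminishing returns on audience size
    if polarity_score < 0:
        surprise = 1.0 - author_negative_rate   # habitual complainers count for less
    else:
        surprise = 1.0
    return polarity_score * reach * surprise

# The same negative post from three hypothetical authors
print(weighted_impact(-1.0, 20000, 0.10))   # wide reach, rarely negative
print(weighted_impact(-1.0, 20000, 0.95))   # wide reach, almost always negative
print(weighted_impact(-1.0, 50, 0.10))      # small audience
```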

Nuances of Sentiment Analysis

Sentiment analysis is still in its early days, and there are limits to how effectively it can deliver results. The three main nuances that users should be aware of are language idiosyncrasies, sarcasm, and optimistic accuracy.

Language Idiosyncrasies

Each language has its own set of nuances or idiosyncrasies, which becomes an issue when sentiment analysis is applied comparatively across multiple languages through translation. The meaning of a statement is often lost in a literal translation. For example, if a word in Language A does not exist in Language B, context is required to translate the statement meaningfully. To this day, computers and algorithms are still not comprehensive enough to extract the true sentiment from such statements. To solve this problem thoroughly, all the rules of each language would have to be programmed into the application. Even for just one language, it could take several lifetimes to encode every rule and every exception. The same process would have to be repeated for all languages and dialects in order to have a human-like application that can harvest sentiment independently.

Sarcasm

Another unavoidable nuance in sentiment analysis is sarcasm, which is used in everyday speech to say one thing and mean the opposite.[4] If sarcasm is already difficult for people to detect in daily conversation, it is harder still for computers and algorithms to distinguish sarcastic from non-sarcastic statements. Algorithmically it may be possible, but it depends heavily on acquiring context, even for statements where the sarcasm is very obvious.

Optimistic Accuracy

Providers of sentiment analysis applications will often promise the moon and the stars with their optimistic accuracy rates (often 70% and above). It is wise to treat these figures with caution and to note that even humans will disagree on the sentiment of a statement 30% of the time.[5]
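To put a vendor's accuracy figure in context, it can help to measure how often two human readers agree on the same statements. The sketch below uses made-up labels from two hypothetical annotators; `agreement_rate` is an illustrative helper, not part of any product.

```python
def agreement_rate(labels_a, labels_b):
    """Share of items on which two annotators gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("no labels to compare")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

# Made-up labels for ten statements
annotator_1 = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg", "pos", "neu"]
annotator_2 = ["pos", "neg", "pos", "pos", "neg", "neu", "neu", "neg", "neg", "neu"]

print(agreement_rate(annotator_1, annotator_2))  # 0.7
```

If humans agree with each other only about 70% of the time, a tool's accuracy measured against a single human's labels should be read with that ceiling in mind.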

Business Use

Companies are starting to use the large amount of data that can be found through social media. With web crawlers that gather data from social media websites, companies can create business value and increase customer value by seeing how customers perceive their products, their company and their competitors. By combining human analysts with computer software, companies are able to mine many social media sources such as Twitter, Facebook, Yelp, various blogs and forums, as well as online newspaper and magazine articles. Through all these outlets, a company can scan for reviews of its products or services, customer complaints, public opinions or statements about the company, “likes” and “retweets” of status updates, and tweets received about the company or its products. The main goals of data mining within the social media realm are to manage brand reputation, identify and respond to customer complaints, measure attitudes towards marketing efforts, and gauge overall sentiment towards a product or a brand.

Marketing

Traditional marketing research is conducted through focus groups and surveys, which involve recruiting volunteers. However, most people are unlikely to be willing to participate in these activities, and those who do may not be representative of the company’s target markets. By mining data from social media outlets, companies are able to see the true opinions of people who represent their target market.[6]

Finance

Studies have shown a correlation between stock market movements and the way executives deliver news about their company. For example, Cisco

“uses the sentiment engine to determine which executives have the highest correlation to positively moving the stock market when they deliver positive news. They found that certain executives had a positive influence on the markets, while others actually had a negative influence because of the tone of their delivery” [7]

Other studies have also shown that companies which have positive perceptions from their consumers through social media tend to have higher stock prices. [8] [9]

The use of social media in relation to the stock market is becoming more common. For example, because social media is a real-time data source, it has been used to predict the direction of the market and the forces that shape it. On sites such as StockTwits, real-time conversations are carried out about trends in the market. [10]

Current Application

What is 900feet?

Currently, businesses that use sentiment analysis rely on a combination of human analysts and analysis software. Each complements the other: the biggest weakness of humans is the small quantity of data they can mine in a given period of time, and the biggest weakness of analysis software is its inability to detect emotion, facial expression and body language, since it mines text only.

There are many software applications emerging for sentiment analysis, some available for free online and others available as a paid application or service. Some of the free ones focus on mining a single social media outlet, while others focus on managing and mining multiple outlets. Some examples can be found below:

  • Sentiment140: A website which analyzes tweets only. Users input a brand name or product, and each tweet that contains the keyword entered is categorized as positive, negative or neutral. One downside is that the categorization may not be accurate (e.g. identifying a tweet as neutral when it is more positive or negative). Another downside is that the algorithm matches words only. For example, searching for “Apple” will bring up tweets about the company as well as the fruit. [1]
  • Brandkarma: A site made especially for people to post their opinions about companies. People can rate any company on the 3 P’s: people, product, and planet. People are encouraged to have their friends and followers support their “post” so that companies will pay more attention to their opinion. [2]
  • 900feet: This website uses the dashboard concept for managing and analyzing each of a business's social media accounts. For example, a social media representative can create one item or update that will be posted on all social media accounts. 900feet also allows businesses to analyze what their customers think about their company as well as about their competitors. Businesses are also able to view any deals or offers that their neighbouring competitors may be offering at any given time.[3]

Implications

While there are many free resources that companies can use to manage their brand image and analyze consumer perceptions of their company, no single method is complete. For example, it is difficult to determine which text is actually positive, negative or neutral. Furthermore, analysis software does not pick up the consumer’s body language, tone of voice or facial expressions, simply because it relies on text mining alone. In addition, text mining software cannot read emoticons (e.g. a heart typed as "<" followed by the number "3") or emotion abbreviations (such as LOL, "laugh out loud"), so these are often disregarded during the analysis process.[4]
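One possible way around the emoticon limitation is to translate emoticons and abbreviations into ordinary words before the text is mined. The mapping tables and the `normalize` function below are hypothetical and deliberately tiny; they sketch a preprocessing step rather than describe how any of the tools above work.

```python
# Hypothetical lookup tables; real ones would be much larger
EMOTICONS = {"<3": "love", ":)": "happy", ":(": "sad"}
ABBREVIATIONS = {"lol": "laughing"}

def normalize(text):
    """Replace emoticons and emotion abbreviations with plain words."""
    tokens = []
    for token in text.split():
        if token in EMOTICONS:
            tokens.append(EMOTICONS[token])
            continue
        word = token.strip(".,!?").lower()
        if word:  # skip tokens that were punctuation only
            tokens.append(ABBREVIATIONS.get(word, word))
    return " ".join(tokens)

print(normalize("LOL I <3 this phone!"))  # laughing i love this phone
```

After this step, a text-only analyzer would see "love" and "laughing" instead of symbols it would otherwise discard.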

Public Use

The growth of the internet and of user-generated content has led to a massive number of product, service and merchant reviews being available online. This benefits us as users because we have access to a great deal of information on which to base our purchasing decisions, but in practice absorbing that much information can be a daunting task. With sentiment analysis, systems can be created that automatically exploit the resources available online and present the information in a form that consumers can absorb easily and use for their own purposes. Opinion summarization, recommender systems and expert finding systems are just a few of the consumer applications in which sentiment analysis has gained interest.

Opinion Summarization

Social media has enabled people to express their opinions and to hear the opinions of others. [5] The rapid growth in its use has generated a large body of opinionated data online, found in reviews, forums, microblogs, blogs and social networks. Merchants who sell their products online often encourage their customers to leave comments, product reviews and suggestions as a strategy to market their products. Opinion summarization was developed as a tool to generate a summary of the key aspects of a product or service from the reviews provided for that particular entity.[6] Sentiment analysis has played a key role in generating these summaries and presenting them in a format that can be easily understood and effectively used in decision making.

Opinion Summarization Example

Google Product Search is an example of a service that uses opinion summarization. Figure 1 below shows a summary product review for a particular ASUS monitor. This summary is based on 1,339 reviews gathered from several online sources. As shown in the right-hand frame, the summary draws on multiple sources such as B&H, Best Buy and Buy.com. The summary itself, found at the top of the page, highlights the key aspects that represent what most consumers think of the product. Aside from creating a product review summary, Google Product Search also aggregates all the individual reviews in one location, so that consumers can conveniently browse through them if they wish.

Figure 1 Google Product Search result on ASUS monitor

Recommender System

A new level of individual personalization is shifting businesses away from simply building a product or providing a service. Companies are investing more in learning about their consumers in order to create value that meets their needs. [7] Customizing the consumer’s online shopping experience is a common strategy among e-commerce sites, and a recommender system is what allows a site to add this value to its customers’ experience.

From a consumer’s point of view, recommender systems have been a valuable tool for sifting through information based on our own purchasing behavior and that of other consumers. Research from Wharton even suggested that tailored suggestions about the type of music consumers should listen to can help build connections and widen exposure for individuals. [8] The system also helps increase sales volume for e-retailers because of the personalized options it provides to consumers. As an example, 35% of Amazon’s sales come from product suggestions, while 60% of Netflix’s revenue comes from rental suggestions. [8]

Studies have shown that sentiment analysis can help improve recommender system performance.[9] Combining a sentiment classification approach with the common rating model used by recommender systems increases the chance of helping consumers find the product or service that is most valuable to them, and in turn helps online retailers position themselves strongly in the market.[7]
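As a rough sketch of how the two signals might be combined, the example below blends a star rating with a sentiment score taken from the review text and ranks products by the average. The equal weighting, the product names and the scores are assumptions made for illustration; the cited studies may combine the signals quite differently.

```python
def blended_score(star_rating, review_sentiment, weight=0.5):
    """Blend a 1-5 star rating with a text sentiment score in [-1, 1].

    Both inputs are rescaled to 0..1; weight is the share given to sentiment.
    """
    rating_part = (star_rating - 1) / 4
    sentiment_part = (review_sentiment + 1) / 2
    return (1 - weight) * rating_part + weight * sentiment_part

def rank(products):
    """Return product names ordered by mean blended score, best first."""
    means = {
        name: sum(blended_score(stars, sent) for stars, sent in reviews) / len(reviews)
        for name, reviews in products.items()
    }
    return sorted(means, key=means.get, reverse=True)

# Hypothetical reviews as (stars, sentiment of the review text)
products = {
    "monitor_a": [(5, -0.6), (4, -0.2)],   # high stars, but critical wording
    "monitor_b": [(4, 0.8), (4, 0.6)],     # slightly fewer stars, warm wording
}
print(rank(products))
```

Here the product with slightly lower star ratings ranks first because the wording of its reviews is much more positive, which is the kind of signal a rating-only model would miss.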

Recommender System Examples

The following e-commerce sites belong to major companies that have already taken advantage of recommender systems to enhance their consumers' experience and increase their revenue streams.

Amazon.com has structured its site to recommend additional items based on what other shoppers have bought online. It has the capability to recommend during and after online transactions.[7]

Pandora Radio has structured its site to play music with characteristics similar to those of the musician or song that the user initially entered. [7]

Netflix has structured its site to recommend movies that users might like to watch, based on their viewing habits, the characteristics of the films they have watched and/or their previous ratings.[7]

Expert Finding System

An expert finding system is another way of organizing the information available online more effectively and efficiently. Using sentiment analysis in the design and development of an expert finding system can improve its ability to find the right expert on a given topic. [6] As Don Tapscott emphasized in his TED talk “Four Principles for the Open World”, sharing expertise and knowledge within and among organizations and individuals should be embraced as we move forward in a continuously changing world. With this expectation, expert finding systems should become more useful as people have to learn constantly and use information more effectively. [10] Examples of firms using expert finding systems are PR Newswire, ProfNet, Pivot and ExpertWitness.com.

Future Implications of Sentiment Analysis

Don Tapscott: Four Principles for the Open World

Openness of the Web

‘Openness’ refers to the degree to which information on the internet remains accessible to the public. Unlike information on closed websites, information on open websites is not protected by privacy measures or hidden behind password walls. This has implications for sentiment analysis because social information on closed platforms cannot be easily analyzed by algorithms. Third parties cannot perform sentiment analysis when passwords are required for access or when privacy measures prevent public observation of information. Taking Facebook as an example, the company has put measures in place that prevent web crawlers from accessing its website, something Google co-founder Sergey Brin says is a threat to the open internet.[1]

Information on closed platforms is privatized, in a sense, because it becomes proprietary to the owners and operators of the platform itself. Data and information, then, are only accessible to third parties when the platforms provide them, and usually only after they have been paid for. Again, Facebook is an example of a company that is doing just this. Closed platforms not only reduce the quantity of information that is available for sentiment analysis, they also make it more difficult to analyze social information across different platforms. This is a shame, because sentiment analysis becomes that much more powerful when it can be used to compare reactions and attitudes among different online groups and communities.

Currently there is a push towards keeping online information public and open[2], so the future looks bright for sentiment analysis. However, if this changes and trends emerge which cause more and more platforms to become closed, then the ability to perform sentiment analysis could become compromised.


Geospatial Sentiment Analysis

Mobile Applications

As with closed platforms, mobile applications can put up roadblocks for sentiment analysis. Given the rapid increase in smartphone use around the world, this has significant implications for the field.

Mobile applications are like closed platforms in that their content is generally not publicly accessible and cannot be compared across different applications.[1] App developers and app owners are the only parties that have potential access to the information. Once again, information becomes proprietary and is probably only available to third parties who can purchase it.

It is unclear whether mobile applications will continue to be run locally on phones or start to become more web-based. If Google Inc. has its way, mobile applications would probably become HTML5 based, which would help enable sentiment analysis to be performed on mobile information. Unfortunately, Apple Inc. established most of the standards in the contemporary mobile space, so local applications will remain for the time being.

Mobile phone applications present an interesting trade-off for sentiment analysis. While they do tend to restrict the availability of social information, mobile applications can also provide invaluable location data. Through geolocation, businesses can interact with consumers on a location-by-location basis. Business outcomes, reactions and trends can be identified across specific locales or broader regions, allowing business offerings to be tailored to suit consumers at various geographic levels. As smartphones continue to be adopted by the public, more and more social information will be tagged with location data.
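When posts carry location tags, sentiment can be averaged per locale. The sketch below assumes each post has already been given a sentiment score between -1 and 1; the location names, scores and the function name `sentiment_by_location` are made up for illustration.

```python
from collections import defaultdict

def sentiment_by_location(posts):
    """Average sentiment score per location from (location, score) pairs."""
    totals = defaultdict(lambda: [0.0, 0])   # location -> [sum of scores, count]
    for location, score in posts:
        totals[location][0] += score
        totals[location][1] += 1
    return {loc: total / count for loc, (total, count) in totals.items()}

# Hypothetical geotagged posts with precomputed sentiment scores
posts = [
    ("Downtown", 0.8),
    ("Downtown", 0.4),
    ("Airport", -0.5),
]
for location, mean in sentiment_by_location(posts).items():
    print(location, round(mean, 2))
```

The same grouping could be applied at broader levels (city, region) by changing the location key attached to each post.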


Data Quality

Sentiment analysis is often marketed as an essential component in understanding consumers’ perceptions of a product or service. Depending on the user, it can definitely be valuable. However, sentiment analysis can suffer from fake reviews and biases, which skew its results and give rise to the dark side of social media marketing.[2]

The following are the most common issues affecting the quality of data used in sentiment analysis:

  • Fake reviews vs. real opinions
It can be a challenge to distinguish fake reviews from real opinions, especially when the fake reviews are as convincing as real ones. With the prevalence of sharing information online, ensuring the integrity of the source of information can be a daunting task for review hosting sites. [3]
  • Negative biases
It is also important to note that some users tend to express negative sentiments all the time (i.e. a person who hates everything equally). Such opinions can skew the validity of information about a product or service, making it a misleading guide for consumers or companies. [3]
  • Sources containing the same kind of opinion (biased)
When a site contains only one kind of opinion, it can give a distorted perception of a product or service and mislead consumers in their purchasing decisions. It is therefore important to watch out for sites that provide positive reviews about a product when they are not warranted.[3]

Bing Liu, data mining expert who worked on text analytics & online rating fraud detection


According to Bing Liu, a data mining expert who has worked on text analytics and online rating fraud detection, fake review identification is one of the major issues gaining attention in sentiment analysis.[2] Fake reviews can be generated easily and cheaply through social media. The growing use of product reviews and opinions among individuals and businesses is something to look out for. Positive opinions can translate into a good brand reputation for companies and individuals, and can also mean steady profit and a strong driver for business. Therefore, there is a strong incentive for individuals and businesses to promote their own products and services while discrediting those of others. [4]

Liu expects that fake or deceptive reviews will remain an issue for a long time.[2] The good news is that almost all review hosting sites are aware of this and are actively dealing with it. There is interesting ongoing research in this area on building detection models using different opinion mining, data mining and machine learning algorithms.[2] There are also models that use linguistic features from the actual review content in combination with metadata features such as the reviewer’s user ID, host IP address, time the review was posted, user profile star rating, etc. As detection models evolve, it may become more difficult for imposters to post fake reviews. [2] This could also help ensure that the information used for sentiment analysis is trustworthy.


Marketing Research, Brand Reputation Management, Customer Support, etc:

Velcom: Acting on Social Analytics - Finding the Alpha Influencer

Currently, businesses are mostly doing simple text mining and analysis, but with future technological advancements, companies will be able to take the opinions and emotions of their consumers and create action plans for the future. Furthermore, companies can look at where these opinions originate so they can identify the specific people who post negatively about the company, as some may just be negative about everything. In that case, their posts can be disregarded.[1] If a negative post comes from an individual who is truly unhappy with a company's product or services, the company could personally reach out and begin to build a relationship with that customer, tracking the progress of the relationship to improve the company’s brand image and customer service. In addition, companies can post publicly in response to blogs that complain about problems with a product, so that others who are having the same problem can be made aware of the various solutions.[1] It is important to note that companies should prioritize the tweets, posts, etc. and respond to the negative ones first.
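The prioritization described above can be sketched as a simple response queue. The post records, the `habitually_negative` flag and the function name `response_queue` are hypothetical; how a company would actually decide that an author is negative about everything is left open here.

```python
def response_queue(posts):
    """Order posts for follow-up: most negative first, habitual complainers dropped."""
    actionable = [p for p in posts if not p["habitually_negative"]]
    return sorted(actionable, key=lambda p: p["score"])

# Hypothetical posts with a sentiment score in [-1, 1]
posts = [
    {"id": "a", "score": 0.7, "habitually_negative": False},
    {"id": "b", "score": -0.9, "habitually_negative": False},
    {"id": "c", "score": -0.8, "habitually_negative": True},
    {"id": "d", "score": -0.3, "habitually_negative": False},
]
print([p["id"] for p in response_queue(posts)])  # ['b', 'd', 'a']
```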

On the other hand, companies can join conversations that reflect a positive image of their company so that the wants and needs of customers can be heard, which eases customer relationship management. Joining conversations through social media can allow a company to identify purchase intent and provides an opportunity to influence customers towards purchasing certain products or accessories.[2] Identifying "opinion leaders" and ensuring that they have a positive perception of the company is a strategy that can extend a company's reach, as these "opinion leaders" have a wide span of influence throughout the social media world.

Future advancements in analytical software could mean that search options will be more precise in terms of separating tweets between products or items with the same name (e.g. apple the fruit and Apple the company) as well as the difference between intensities in emotion (e.g. dislike for an application, vs. hating an entire brand line).[1] The technology used for sentiment analysis can advance in ways that allow for the recording of body language and tone of voice.

The growing application of sentiment analysis could also mean its increased use throughout a company's different departments. Current uses are typically found in the marketing and financial areas, but future advancements could result in its integration and use in other departments such as accounting, HR, etc.

Summary

Sentiment analysis is not exactly a new phenomenon but is growing dramatically in importance and application. It is also an integral part of Web 2.0 as it provides business value even in its current iteration. Below is a summary of the key points regarding the future of Sentiment Analysis:

  • Language processing is the current roadblock for more sophisticated sentiment analysis. In its current iteration, it is difficult to automate sentiment analysis and leave it to run on its own. Human interaction with the data and the preliminary results is key to providing business value.
  • Context is the new paradigm both for Web 2.0 and beyond. Without context, it is almost impossible to decipher sentiment. This context is also the driving force behind people’s needs and wants.
  • Despite the early days in sentiment analysis, there are many business and consumer applications that take advantage of this concept. Where the software capability is lacking, human interaction steps in to fill in the gaps.


Possible future of Sentiment Analysis: Person of Interest Trailer

Conclusion

It is our group’s opinion that sentiment analysis is an absolute necessity for Web 3.0, marketing, e-commerce and m-commerce. Without it, the business and consumer value derived from social information will be limited by human ability. Lastly, as with anything else of great social value, there is a huge risk that people will take advantage of sentiment analysis technology and use it for ill-intentioned purposes.

References
