Sentiment Analysis on Amazon's Customer Reviews
Sentiment Analysis, also referred to as opinion mining, is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text. This is a popular way for organizations to determine and categorize opinions about a product, service or idea. It involves the use of data mining, machine learning and artificial intelligence to mine text for sentiment and subjective information.
Types of Sentiment Analysis
- Fine-grained sentiment analysis provides a more precise level of polarity by breaking it down into further categories, usually very positive to very negative. This can be considered the opinion equivalent of ratings on a 5-star scale.
- Emotion detection identifies specific emotions rather than positivity and negativity. Examples could include happiness, frustration, shock, anger and sadness.
- Intent-based analysis recognizes actions behind a text in addition to opinion. For example, an online comment expressing frustration about changing a battery could prompt customer service to reach out to resolve that specific issue.
- Aspect-based analysis identifies the specific component being positively or negatively mentioned. For example, a customer might leave a review on a product saying the battery life was too short. The system then reports that the negative sentiment is not about the product as a whole, but about the battery life.
Application of Sentiment Analysis
Sentiment analysis tools can be used by organizations for a variety of applications, including:
- Identifying brand awareness, reputation and popularity at a specific moment or over time.
- Tracking consumer reception of new products or features.
- Evaluating the success of marketing campaigns.
- Pinpointing the target audience or demographics.
- Collecting customer feedback from social media, websites or online forms.
- Conducting market research.
- Categorizing customer service requests.
Challenges with Sentiment Analysis
Challenges associated with sentiment analysis typically revolve around inaccuracies in training models. Objective comments, or comments with a neutral sentiment, tend to pose a problem for systems and are often misidentified.
Sentiment can be challenging to identify when systems cannot understand the context or tone. Answers to poll or survey questions like "nothing" or "everything" are hard to categorize when the context is not given, as they could be labeled as positive or negative depending on the question. Similarly, irony and sarcasm often cannot be explicitly trained for and lead to falsely labeled sentiments.
Computer programs also have trouble when encountering emojis and irrelevant information. Special attention needs to be given to training models with emojis and neutral data so as to not improperly flag texts.
Finally, people can be contradictory in their statements. Most reviews will have both positive and negative comments, which is somewhat manageable by analyzing sentences one at a time. However, the more informal the medium, the more likely people are to combine different opinions in the same sentences and the more difficult it will be for a computer to parse.
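The sentence-by-sentence idea above can be sketched in a few lines of Python. This is only a rough illustration: the two word lists and the helper names `split_sentences` and `score_sentence` are assumptions made up for the example, not a real sentiment lexicon or the method used in this post.

```python
import re

# Toy word lists, assumed purely for illustration.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "poor", "terrible", "short"}

def split_sentences(text):
    """Split on '.', '!' or '?' followed by whitespace (a rough heuristic)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def score_sentence(sentence):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

review = "The screen is great. The battery life is too short."
for sentence in split_sentences(review):
    print(score_sentence(sentence), sentence)
```

Scoring each sentence separately keeps the praise for the screen from cancelling out the complaint about the battery. A sentence that mixes both opinions would still be scored as one unit, which is exactly the difficulty described above.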
Steps of Sentiment Analysis
The steps of sentiment analysis are intricate, comprising many different machine learning and NLP tasks and subtasks.
Step 1: Data Collection
This is one of the most important steps in the sentiment analysis process. Everything from here on will be dependent on the quality of the data that has been gathered and how it has been annotated or labelled.
API Data - Data can be uploaded through live APIs for social media. A news API can help us glean information from all kinds of news publishers, while a Facebook API can allow us to take all the publicly available data we need from its platform. We can also use open source repositories like Kaggle, Amazon Reviews, or Yelp.
Manual - If we already have data from a CRM tool, we can manually upload it to the sentiment analysis API as a .csv file.
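Before uploading a .csv export, it can help to load it and check that the rows parse as expected. Below is a minimal sketch using Python's standard `csv` module. The column names (`review_id`, `rating`, `text`) and the in-memory sample are assumptions for illustration; a real CRM export will have its own layout.

```python
import csv
import io

# Stand-in for a CRM export file; the column names are assumed.
raw_csv = io.StringIO(
    "review_id,rating,text\n"
    "1,5,Works perfectly\n"
    "2,1,Stopped working after a week\n"
)

def load_reviews(handle):
    """Read rows into dicts and convert the rating to an integer."""
    rows = []
    for row in csv.DictReader(handle):
        row["rating"] = int(row["rating"])
        rows.append(row)
    return rows

reviews = load_reviews(raw_csv)
print(len(reviews), reviews[0]["text"])
```

With a real file, `open("reviews.csv", newline="", encoding="utf-8")` would take the place of the `io.StringIO` object (the file name here is hypothetical).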
Step 2: Data Processing
The processing of the data will depend on the kind of information it has - text, image, video or audio. Repustate IQ's sentiment analysis steps also handle video content with the same ease as text. The sub-tasks are listed below.
Audio Transcription - The audio from the video data is transcribed through speech to text software to ensure that any video or audio file in the data is not overlooked.
Caption Overlay - If there are any captions appearing in the video, they are extracted by Repustate IQ and analyzed for any appearing entities, aspects or topics that we have identified as important.
Image Overlay - Similarly, the platform recognizes and captures text from any images in the video or text data through OCR (Optical Character Recognition).
Logo Recognition - Repustate IQ's intelligent data scanner immediately recognizes any logos that appear in the video background. This includes logos that appear on the presenter's clothes or on an item such as a pen or a mug on the desk. It even picks up logos from background posters, so that not even the smallest detail goes unnoticed when the platform conducts sentiment analysis of the brand.
Text Extraction - All the text is similarly recognized and extracted in the sentiment analysis process. This includes emojis and hashtags as well, which are a vital part of social media sentiment analysis. Unlike other sentiment analysis platforms, Repustate IQ ensures that emojis are never left out of the data processing because that could lead to false positives or negatives.
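To make the point about emojis and hashtags concrete, here is a small sketch that pulls them out of a post with regular expressions. It is not how Repustate IQ works internally. The emoji character range is an assumption and deliberately incomplete, and `extract_tokens` is a hypothetical helper name.

```python
import re

HASHTAG = re.compile(r"#\w+")
# Covers common emoji blocks only; an assumed, incomplete range.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def extract_tokens(text):
    """Return (hashtags, emojis, remaining plain text)."""
    hashtags = HASHTAG.findall(text)
    emojis = EMOJI.findall(text)
    plain = EMOJI.sub("", HASHTAG.sub("", text))
    return hashtags, emojis, " ".join(plain.split())

# \U0001F60D is the "smiling face with heart-eyes" emoji.
post = "Love this phone \U0001F60D #happy #newphone"
print(extract_tokens(post))
```

Keeping the emojis and hashtags as separate lists, rather than discarding them, lets a later step treat them as sentiment signals in their own right.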
Step 3: Data Analysis
There are many subtasks that need to be done for this stage of the sentiment analysis process.
Training the Model - A dedicated sentiment analysis dataset, which will be used to train the model, needs to be pre-processed and manually labelled. The model learns from this labelled data by comparing its predictions against the correct labels and adjusting for the ones it gets wrong. This will help improve the custom model that is created for a brand.
Multilingual Data - In sentiment analysis steps that include multilingual data processing, Repustate IQ has the dataset for each language individually annotated and trained. The platform does not rely on translation at all, since information can get distorted or nuances lost because of the vast differences between languages such as Spanish and Korean. Repustate cites this as the reason for its high accuracy compared with other platforms.
Custom Tags - In this part of the process, custom tags for aspects and themes, such as brand mentions and product names, are created for the data. Once the model has been trained, it will automatically segregate text based on these custom tags.
Topic Classification - The topic classifier attaches a theme to a text. For example, this text "The dresses were awesome, and I found some really good scarves as well." will be tagged as the topic "clothes".
Sentiment Analysis - In this stage the platform isolates each aspect and theme and analyzes it for sentiment. Sentiment scores are given in the range of -1 to +1, and a neutral statement is scored as zero. Assigning polarity this way matters because, although the platform assigns separate scores to aspects such as convenience, speed, cleanliness, functionality, drinks and ambience, it is the aggregate score that indicates the audience's sentiment towards the brand. So, if 3 of the 7 aspects receive a poor rating and 4 of them receive good ones, the sentiment analyzer will report the average of those scores as the overall sentiment of the brand.
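The averaging described above can be illustrated with a short sketch. All the numbers below are made up, and "service" is a hypothetical seventh aspect added so that the example has 4 positive and 3 negative scores. A real platform may weight aspects differently instead of taking a plain mean.

```python
from statistics import mean

# Hypothetical per-aspect scores in the range -1 to +1.
aspect_scores = {
    "convenience": 0.6,
    "speed": 0.4,
    "cleanliness": 0.8,
    "functionality": 0.2,
    "drinks": -0.5,
    "ambience": -0.3,
    "service": -0.7,  # made-up aspect, not from the list above
}

def overall_sentiment(scores):
    """Average the per-aspect scores into one brand-level score."""
    return mean(scores.values())

def weak_aspects(scores):
    """List the aspects that scored below zero, lowest first."""
    return sorted((a for a, s in scores.items() if s < 0), key=scores.get)

print(round(overall_sentiment(aspect_scores), 3))
print(weak_aspects(aspect_scores))
```

Note how a slightly positive average can hide three clearly negative aspects, which is why the per-aspect breakdown is worth keeping alongside the aggregate.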
Step 4: Data Visualization
Once all the steps in the sentiment analysis process have been covered, the insights are quickly turned into actionable reports in the form of graphs and charts. These reports can then be shared within teams as well. These visual reports are really important because it is through them that we see granular, aspect-based results. For example, when we get an average score for our brand, we can filter the results in the sentiment analysis dashboard to see which aspects scored high and which scored low. This shows us which areas need attention more than others. Thus, at this stage, we get actionable insights that we can use to decide the right course of action for our growth plans.
F1 Score
In the fields of computational linguistics and probability, an n-gram is a contiguous sequence of n items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application. N-grams are typically collected from a text or speech corpus.
Since a 1-gram is sometimes insufficient to understand the significance of certain words in a text, it is natural to consider blocks of words, that is, n-grams.
The simplest version of the n-gram model, for n>1, is the bigram model, which looks at pairs of consecutive words.
[Figure: n-gram]
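As a small illustration of the definition above, word-level n-grams can be extracted with a sliding window. The function name `ngrams` and the whitespace tokenization are choices made for this sketch.

```python
def ngrams(tokens, n):
    """Return the contiguous n-token sequences in `tokens` as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the battery life is too short".split()
print(ngrams(tokens, 1))  # 1-grams (single words)
print(ngrams(tokens, 2))  # bigrams (pairs of consecutive words)
```

A bigram such as ("too", "short") carries a meaning that neither "too" nor "short" conveys alone, which is the motivation for going beyond 1-grams.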
Bag of Words Approach
The bag-of-words model is a simplifying representation used in natural language processing and information retrieval. In this model, a text is represented as the bag of its words, disregarding grammar and even word order but keeping multiplicity. The bag-of-words model has also been used for computer vision.
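A minimal sketch of this representation, using only the standard library: each document becomes a vector of word counts over a shared vocabulary. The two sample documents are made up, and splitting on whitespace is a simplification.

```python
from collections import Counter

def bag_of_words(text):
    """Count each lowercased word, ignoring grammar and word order."""
    return Counter(text.lower().split())

docs = ["good product good price", "bad product"]

# Shared vocabulary across all documents, in a fixed (sorted) order.
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})

# One count vector per document, aligned with the vocabulary.
vectors = [[bag_of_words(doc)[word] for word in vocabulary] for doc in docs]

print(vocabulary)
print(vectors)
```

The word "good" appears twice in the first document, so its count is 2: multiplicity is kept even though word order is discarded.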
[Figure: We got a 95% F1 score after applying the Bag of Words approach]
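For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a binary task from scratch. The labels are made up for illustration and are not the predictions behind the figure quoted above, and whether that figure was averaged across classes is not stated here.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the `positive` class: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(f1_score(y_true, y_pred))
```

Unlike plain accuracy, F1 stays low when a model ignores the minority class, which makes it a useful metric for review datasets where positive reviews usually dominate.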
TF-IDF Approach
TF-IDF stands for "Term Frequency - Inverse Document Frequency". This is a technique to quantify words in a set of documents. We generally compute a score for each word to signify its importance in the document and corpus. This method is a widely used technique in Information Retrieval and Text Mining.
Term Frequency (TF)
This counts how often a term t occurs in a document d.
Document Frequency (DF)
This measures how widespread a term is across the whole corpus. It is very similar to Term Frequency; the only difference is that TF is a frequency count for a term t in a single document d, whereas Document Frequency counts the documents, out of the set of N documents, that contain term t. In other words, DF is the number of documents in which the word is present.
df(t) = number of documents, out of N, that contain t
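The definition of df(t) can be sketched directly: each document contributes at most 1 to a term's count, however many times the term appears in it. The sample documents and whitespace tokenization are assumptions for the example.

```python
from collections import Counter

def document_frequency(docs):
    """Map each term to the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # set() ensures a term is counted once per document.
        df.update(set(doc.lower().split()))
    return df

docs = ["good product good price", "bad product", "good value"]
df = document_frequency(docs)
print(df["good"], df["product"], df["bad"])
```

Here "good" occurs three times in total but in only two documents, so its document frequency is 2.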
Inverse Document Frequency (IDF)
[Figure: IDF Formula]
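One common way to write IDF, using the N and df(t) defined above, is sketched below. The exact variant is an assumption: implementations differ on the logarithm base and on whether 1 is added to the denominator to avoid division by zero for unseen terms.

```latex
\mathrm{idf}(t) = \log\frac{N}{\mathrm{df}(t)}
\qquad\text{or, smoothed,}\qquad
\mathrm{idf}(t) = \log\frac{N}{\mathrm{df}(t) + 1}
```

In both forms, a term that appears in almost every document gets an IDF close to zero, while a rare term gets a large one.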
Term Frequency - Inverse Document Frequency (TF-IDF)
[Figure: We got a 94% F1 score after applying the TF-IDF approach]
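Putting the pieces together, a TF-IDF weight is the term frequency multiplied by the inverse document frequency. The sketch below is a from-scratch illustration under stated assumptions: TF is the count divided by document length, IDF is the unsmoothed log(N / df(t)), and tokens are split on whitespace. Library implementations often use smoothed or normalized variants, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # tf = count / document length; idf = log(N / df), unsmoothed.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = ["good product good price", "bad product", "good value"]
for doc, doc_weights in zip(docs, tf_idf(docs)):
    print(doc, {t: round(w, 3) for t, w in doc_weights.items()})
```

In the first document, "price" ends up with a higher weight than "good" even though "good" occurs twice, because "price" appears in only one document and is therefore more distinctive.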
Source Code
Finalization
We cleaned up and improved an Amazon reviews dataset and built some classification models on these improvements to predict sentiments.
We saw that both the bag-of-words and TF-IDF approaches gave interpretable features.
By increasing the set of n-grams we used from 1-grams up to 4-grams, we were able to get our logistic regression model's accuracy up to 95%.
Although there are different types of pre-processing involved in the textual data, not everything has to be applied in each case.
Every NLP classification task is different, but the process to be followed is similar to what we did in this case: wrangle the data -> create features from text -> train ML models.
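The wrangle -> features -> train flow can be sketched end to end in plain Python. This is a toy stand-in under stated assumptions, not the code behind the results in this post: the four training sentences are made up, features are 1-gram and 2-gram counts, and the model is a bare-bones logistic regression fitted with stochastic gradient descent. In practice a machine learning library would normally handle the last two steps.

```python
import math
from collections import Counter

def featurize(text, max_n=2):
    """Wrangle + features: lowercase, split, count 1-grams up to max_n-grams."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

def predict_proba(weights, bias, feats):
    """Probability that the text is positive under the logistic model."""
    z = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr=0.5):
    """Train: fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in samples:
            feats = featurize(text)
            error = predict_proba(weights, bias, feats) - label
            bias -= lr * error
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) - lr * error * v
    return weights, bias

# Made-up training examples (1 = positive, 0 = negative).
train_set = [
    ("great product works well", 1),
    ("love it great value", 1),
    ("terrible quality broke quickly", 0),
    ("bad product waste of money", 0),
]
weights, bias = train(train_set)
print(predict_proba(weights, bias, featurize("great value")) > 0.5)
```

Because each learned weight is attached to a readable n-gram, inspecting the largest positive and negative entries in `weights` shows which phrases drive the predictions.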






