Three case studies from different industries prove that sentiment analysis works
So far I have only seen anecdotal evidence that sentiment analysis works. The reason is that the only way to benchmark sentiment is human analysis, and we humans are notoriously expensive and subjective.
If not human tabulation, then what hard measure or key performance indicator (KPI) should be used to prove whether sentiment analysis works? The only one I could come up with was Net Promoter Score (NPS).
HYPOTHESIS: In Net Promoter Score surveys, the comment field sentiment should correlate with the NPS score.
HOW ETUMA CALCULATES SENTIMENT
Just to be sure that we are on the same page: we calculate the whole comment sentiment in the following way:
- Detect the industry-specific category (topic) by pooling keywords and phrases
- Detect the sentiment per topic
- Calculate the average sentiment of all topic mentions
If all the topic sentiments are negative, the whole comment sentiment is -1; if all are positive, it is +1. Because customers comment on many issues, most comments span multiple categories, so the whole comment sentiment is calculated as the average over all topic mentions.
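Etuma's actual pipeline isn't shown here; as a minimal sketch under the description above, assume topic detection has already produced a list of (topic, sentiment) mentions, each sentiment being -1, 0, or +1, and the whole comment sentiment is simply their average:

```python
# Hypothetical sketch of the averaging step. Topic detection itself
# (keyword/phrase matching) is out of scope; we assume it has already run.

def whole_comment_sentiment(topic_mentions):
    """Average the per-topic sentiments of one comment.

    topic_mentions: list of (topic, sentiment) tuples, sentiment in {-1, 0, +1}.
    Returns a float in [-1.0, +1.0], or None if no topics were detected.
    """
    if not topic_mentions:
        return None
    return sum(sentiment for _, sentiment in topic_mentions) / len(topic_mentions)

# A comment praising billing but complaining about coverage and support:
mentions = [("billing", +1), ("coverage", -1), ("support", -1)]
print(whole_comment_sentiment(mentions))  # -0.3333333333333333
```

As described above, a comment whose mentions are all negative averages to -1 and one whose mentions are all positive averages to +1; mixed comments land in between.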
I took three NPS data sets and created the visualizations in Tableau. The graphs demonstrate the correlation between the NPS score and the whole comment sentiment.
In each graph, the first column shows the NPS rating and the second column the average whole comment sentiment across all open-text fields related to that rating.
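The aggregation behind those graphs can be sketched as: group responses by NPS score and average the whole comment sentiments within each group. The response data below is invented purely for illustration:

```python
# Hypothetical per-rating aggregation: (nps_score, whole_comment_sentiment) pairs,
# grouped by score, then averaged within each group. Sample data is made up.
from collections import defaultdict

responses = [
    (9, 0.8), (10, 0.5), (9, 0.2),    # promoters
    (7, 0.1), (8, -0.1),              # passives
    (2, -0.6), (0, -0.9), (4, -0.5),  # detractors
]

by_score = defaultdict(list)
for score, sentiment in responses:
    by_score[score].append(sentiment)

avg_sentiment = {score: sum(vals) / len(vals)
                 for score, vals in sorted(by_score.items())}

for score, avg in avg_sentiment.items():
    print(f"NPS {score:2d}: average sentiment {avg:+.2f}")
```

With real survey data, plotting `avg_sentiment` by rating reproduces the two-column view described above.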
The first data set is from the telecom industry, with a sample of over 15,000 responses. This specific NPS survey did not follow the standard NPS format: it contained five open-ended questions, which might explain the somewhat smaller spread (-0.31 to 0.32). Everything else correlates quite nicely except the scores from 1 to 4, where there is no meaningful difference. For example, the sentiment for NPS score 1 is -0.24 and for score 4 it is -0.23.
The second survey had about 4,000 responses. What is interesting here is that only the Promoter scores are on the positive side, and that there is a substantial jump from score 8 to score 9. Again, there is no substantial difference between the Detractor scores. This supports the theory behind NPS scoring: making the Detractors cover a wide spectrum of scores (0-6) makes a lot of sense.
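For reference, the standard NPS grouping the text relies on is: Detractors 0-6, Passives 7-8, Promoters 9-10, with the score itself being the percentage of Promoters minus the percentage of Detractors. A minimal sketch:

```python
# Standard NPS grouping and score calculation (not Etuma-specific code).

def nps_group(score):
    """Map a 0-10 rating to its standard NPS group."""
    if score <= 6:
        return "detractor"
    if score <= 8:
        return "passive"
    return "promoter"

def net_promoter_score(scores):
    """NPS = % promoters - % detractors, over all responses."""
    groups = [nps_group(s) for s in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100 * (promoters - detractors) / len(scores)

# Two promoters and two detractors out of five responses cancel out:
print(net_promoter_score([10, 9, 8, 6, 3]))  # 0.0
```

The wide 0-6 Detractor band is exactly why the flat sentiment across Detractor ratings in the graphs is unsurprising.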
I was not able to get a retail data set with the actual NPS scores in it; this one was already post-processed into NPS groups. The survey had about 44,000 responses. Again, it shows that the NPS rating strongly correlates with the average sentiment of customer comments.
These were just three examples from three industries. We have dozens of NPS data sets, and they all confirm the hypothesis: there is a strong correlation between the NPS score and the whole comment sentiment - SENTIMENT ANALYSIS WORKS!