From data to insights

Steps on how to turn customer survey data into a powerful loyalty management asset

You will learn:

  • How to organize the CX data
  • How to set up the analytics process
  • Example graphs

View the entire guide below.
You can browse and jump to different guide sections using the table of contents. 


1. Chapter 1: Starting Point

2. Chapter 2: Create a CX Dataset

3. Chapter 3: How to Set Up a CX Analytics Process

4. Chapter 4: How to Extract Insights

5. Chapter 5: How to Design a Customer Insight Dashboard




Chapter 1: Starting Point

Customer experience analysis results are like any other data: actionable insights don’t jump out and announce their existence and importance. In fact, CX analytics is even harder because most of the feedback is unstructured text. CX analysis results are raw material that you have to enrich and refine. That's why you need to implement a CX analytics process.

Even if every single feedback dataset has actionable insights, it doesn't mean that extracting insights is easy. Developing the right competencies in order to turn the analysis results into actionable insights seems to be a challenge.
Most companies, even large enterprises, don't have enough time and resources to develop niche horizontal competencies. Customer and employee experience analytics is one of these niche horizontal competencies. The learning curve is quite steep. It takes months of full-time effort, background knowledge in statistics, and experience in visualization and analytics tools to be able to extract insights effectively. And even when one person has become good at it, they often either move to a new position or change employers.

This document introduces you to the customer experience analytics and insight distribution processes and gives you tips and ideas about what kind of approaches and graphs work. Follow the process and guidelines in this document and you will be extracting actionable insights in weeks if not days.

Analytics must always have a purpose

It is good to keep in mind why you need to analyze customer and employee feedback:

  • to find out the loyalty drivers and eaters,
  • mine product and service ideas,
  • detect problem clusters before they escalate,
  • mine positive comments to be used in marketing, and
  • identify customers that are about to churn.


Preparations - Good data is paramount

Before you start extracting actionable insights from customer feedback, you need to set up an enterprise insight process that:

  1. Gathers feedback: you actively solicit feedback through many channels and touchpoints, crawl social media, and analyze contact and support center feedback; and
  2. Categorizes feedback: you've implemented a feedback analysis system that detects the sentiment and categorizes all open-ended customer comments according to a categorization scheme or structure that makes sense for the customer. We call this structure a Codeframe or Topic Codeframe.

cx text analysis example.png

For example, Etuma detects the meaning (what customers are talking about: Topics) and how they feel about that aspect of the product or service (Sentiment) from open-ended customer comments. By detecting Topic and Sentiment, Etuma turns all customer comments into statistical information that can be used to detect patterns in unstructured text. This gives you an excellent starting point: all comments are categorized by topic and sentiment.

Introduction to CX text analysis dimensions

In CX text analysis you have to think a bit differently than when analyzing structured data. It is important to understand how CX text analysis results data fields behave. 

Topic volume

Knowing how much customers talk about a specific Topic (share-of-talk) gives you an idea of how important different aspects of your operations are to your customers. But it is hard to take concrete action based on this data alone. It only tells you at a very high level what drives customer loyalty. I wouldn’t focus too much on the overall volume as long as it is sufficient. Once you know the topic-specific volume over time, you can start seeing the relative interest in issues, detect whether some areas are becoming more critical to your customers, and so on.

Text analysis results don’t have an absolute benchmark. Topic (volume) is only interesting when you:

  • Compare a topic to itself over time (e.g. people are talking more about PRICING this month);
  • Use a background variable as a filter to compare what topics are talked about (e.g. what women are talking about compared to men);
  • Use a topic to filter the main operational dimension (e.g. compare points of sale: how much customers are talking about TIDINESS and what the sentiment is for each store’s TIDINESS. This enables you to find the stores that have a problem with TIDINESS); and
  • Compare the volume of one topic to the overall topic volume identifying your key loyalty drivers.

Topic Sentiment

With Topic level sentiment analysis you can identify how customers FEEL about a product, process or other business aspect. The example below explains how Etuma sentiment analysis works.

"The installer was competent and cleaned the space thoroughly but your invoices are still difficult to understand". In that feedback message the sentence “The installer was competent and cleaned the space thoroughly” has positive topics (sentiment +1) regarding PERSONNEL, EXPERTISE and CLEANLINESS, and the sentence “but your invoices are still difficult to understand” has negative topics (sentiment -1) regarding INVOICING and CLARITY. Once you receive more than one comment e.g. about INVOICING you can start tracking the volume and sentiment trends. This analysis process turns unstructured text into statistical information.

Tools and demo dataset used in this document

I used our customer experience text analysis service, Etuma, for categorizing all open-ended customer comments by topic and sentiment. You can get similar analysis results by using ClaraBridge or Ascribe. Currently I don’t know of any other tools that do customer experience (CX) text analysis out of the box. All other solutions and services must be developed, tuned and maintained for each customer separately. This is slow and expensive.

What dashboard or visualization platform should you use? The answer is: it doesn’t matter. I used Tableau to create these graphs and dashboards (visualizations) but you can use any advanced visualization platform like Qlik, Microsoft Power BI, Bime etc. The results are pretty much the same. Sometimes getting there might take a couple of extra steps depending on the platform.

I designed the Topic Groups in Google Sheets. Etuma staff implemented them into the CX text analysis service.

The demo dataset is a Transactional Net Promoter Score (TNPS) survey for a grocery store chain with six stores.


Chapter 2: Create a CX Dataset

We recommend that you create one customer experience dataset or warehouse in which you store all customer feedback: survey response data, contact center complaints, social media comments, forum comments, spontaneous feedback, etc. And remember to enrich this database with customers' demographic and purchase behavior data.

Structure the feedback data according to the visualization or analysis system requirements

It took a bit of database tinkering to make the open-ended feedback analysis results suitable for Tableau. Tableau likes data in which each data record (in our case customer's survey response) is on one row. The challenge with open-ended customer feedback comes from the fact that almost all customers talk about multiple Topics.

You could, of course, create a separate table for each customer response, but this would create a large number of small tables. To avoid this, our CTO came up with an elegant and simple solution: each Topic (mention in a customer response) is a separate row in the database. This does create a minor inconvenience: the number of rows in the data table doesn't match the number of responses. There is an easy remedy for this. Just create a new calculated 'measure': countd([Signal Id])
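To make the row-per-mention structure concrete, here is a minimal Python sketch. The rows and the lower-case field names are made up for illustration; only the idea mirrors the text: counting rows gives the number of Topic mentions, while counting distinct Signal Ids (what countd([Signal Id]) does in Tableau) gives the number of responses.

```python
# Hypothetical data: one row per Topic mention, several rows per response.
rows = [
    {"signal_id": 1, "topic_en": "PERSONNEL", "topic_sentiment": 1},
    {"signal_id": 1, "topic_en": "INVOICING", "topic_sentiment": -1},
    {"signal_id": 2, "topic_en": "PRICING", "topic_sentiment": -1},
    {"signal_id": 3, "topic_en": "CHECKOUT", "topic_sentiment": -1},
    {"signal_id": 3, "topic_en": "SERVICE ATTITUDE", "topic_sentiment": 1},
]

# Number of rows = number of Topic mentions.
topic_mentions = len(rows)

# Number of distinct Signal Ids = number of responses.
responses = len({row["signal_id"] for row in rows})

print(topic_mentions, responses)
```

With this toy data there are more Topic mentions than responses, which is exactly why the distinct count is needed.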

This is what the analysis results database looks like (in Tableau).


Here are explanations for the data fields (columns):

  • Topic sentiment: Lists the detected Topic level sentiment.
  • Topic En: Topic name in English, Topic Fi: Topic name in Finnish, Topic Sv: Topic name in Swedish. Naming the topics consistently across all languages makes the CX analysis results comparable between different languages.
  • Topic Context (sentence): Topic specific sentence extracted from the whole comment. This makes root-cause analysis easy and fast: you only need to read contextually relevant sentences.
  • Signal Text: The whole comment text that includes one or more Topics and Topic Context sentences.
  • Signal Ambiance: Whole comment sentiment. It is the average of all Topic sentiments in one Signal. The score is between -1 (all topics negative) and 1 (all topics positive).
  • Date: date on which the survey response was received.
  • Signal Id: Etuma gives each analyzed response a unique identifier. 
  • BGV Viewpoint: some people call this a Topic Group. It is a set of topics that belong to the same organizational unit or area of responsibility.
  • BGV Age, Gender etc.: the rest of the database is background variables (aka metadata) on the respondents' demographics, purchase behavior, market segmentation etc.
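The Signal Ambiance field described above is the average of all Topic sentiments in one Signal. A minimal sketch of that calculation, assuming hypothetical (signal id, topic sentiment) pairs rather than real Etuma output:

```python
from collections import defaultdict

# Hypothetical (signal_id, topic_sentiment) pairs, one per Topic mention.
mentions = [(1, 1), (1, 1), (1, -1), (2, -1), (3, 1), (3, -1)]

# Collect the Topic sentiments of each Signal.
by_signal = defaultdict(list)
for signal_id, sentiment in mentions:
    by_signal[signal_id].append(sentiment)

# Signal Ambiance: average Topic sentiment per Signal, between -1 and 1.
signal_ambiance = {
    signal_id: sum(values) / len(values)
    for signal_id, values in by_signal.items()
}

print(signal_ambiance)
```

A Signal with two positive Topics and one negative Topic ends up mildly positive, and a Signal with one of each ends up at zero.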

Once the database is correctly structured, the report creation is simple. Topic and Sentiment are Tableau Dimensions. For that I had to figure out how to get both pieces of information into the same box (Square in Tableau language). I did this by using the two measures Topic Sentiment and number of records (in this case not the number of responses, but Topic mentions).

This chart demonstrates the problem areas in the grocery store operations: If the Topic count is high and the color red (negative sentiment) then you should dig deeper to find out what is wrong and where (root cause) and inform the appropriate manager about this issue.




Chapter 3: How to Set Up a CX Analytics Process


1. Insight extraction principles

When it comes to extracting insights from unstructured text there is the hard way and then there is the organized way. Follow these principles and you will have an easy time turning customers' text comments into structured information.

Make information as rich as possible

If you know the respondent identity, the data MUST be enriched with demographic data, purchase behavior data and structured survey answers (e.g. NPS). You can do this by linking the response id to a customer record, which includes all these background variables (metadata on customer feedback). You should create a separate CX dataset for anonymous feedback (e.g. social media).
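The enrichment step can be sketched in a few lines of Python. Everything here is hypothetical (the customer ids, field names and comments are invented for illustration): identified responses get the customer's background variables attached, and anonymous responses are routed to a separate dataset.

```python
# Hypothetical customer records keyed by customer id.
customers = {
    "C1": {"age_group": "25-34", "gender": "Female", "nps": "Promoters"},
    "C2": {"age_group": "45-54", "gender": "Male", "nps": "Detractors"},
}

# Hypothetical responses; customer_id is None for anonymous feedback.
responses = [
    {"signal_id": 1, "customer_id": "C1", "text": "Friendly staff."},
    {"signal_id": 2, "customer_id": "C2", "text": "Prices keep going up."},
    {"signal_id": 3, "customer_id": None, "text": "Long queues on Saturdays."},
]

identified, anonymous = [], []
for response in responses:
    record = customers.get(response["customer_id"])
    if record is None:
        # No customer record: keep in a separate dataset.
        anonymous.append(response)
    else:
        # Prefix background variables so they are easy to spot as metadata.
        bgv = {f"bgv_{name}": value for name, value in record.items()}
        identified.append({**response, **bgv})

print(len(identified), len(anonymous))
```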

Always include your primary operational dimension as a background variable

The most important thing is to measure how your primary operational dimensions (point of sale, customer segment, product (group), business unit, geographical area) are performing. If you cannot enrich the CX dataset then at least gather this piece of information as part of the survey process.

Analyze the whole dataset

Don’t focus just on (time) trend changes: also identify the loyalty drivers and eaters from the whole dataset. Because traditional data analytics is so heavily focused on time series or patterns in time series, most people seem to do the same with CX text analysis. Whereas in financial reporting it might not make sense to look at revenue over many years, in CX text analysis totals can be quite interesting.

Always use averages in Sentiment analysis

Whereas in financial reporting the results are absolute, in text analysis the results are relative: the zero point means nothing. In topic sentiment analysis the average topic sentiment is between -1 (all topic mentions were negative) and 1 (all topic mentions were positive), so the zero (sentiment) point is often unusable in feedback analytics. What is more interesting is the average sentiment of all topics. You should always compare topics or benchmark business units to the average sentiment of the whole dataset (or period). In the example dataset it was +26 on the -100 to 100 scale.

Benchmark business units or countries to the total averages

Because there are no absolute measures, you need to create a way to see how your different products, business units or geographical areas are performing. The best way to do this is to compare one unit to the average of all other units.
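As a rough illustration of this benchmarking idea, the sketch below compares each store's average Topic sentiment with the average of the whole dataset. The store names and sentiment values are made up, and using the whole-dataset average as the benchmark is one of several reasonable choices.

```python
from collections import defaultdict

# Hypothetical (store, topic_sentiment) pairs, one per Topic mention.
mentions = [
    ("Store A", 1), ("Store A", 1), ("Store A", -1),
    ("Store B", -1), ("Store B", -1), ("Store B", 1),
    ("Store C", 1), ("Store C", 1),
]

# Benchmark: average sentiment of all Topic mentions in the dataset.
overall_avg = sum(sentiment for _, sentiment in mentions) / len(mentions)

by_store = defaultdict(list)
for store, sentiment in mentions:
    by_store[store].append(sentiment)

# Positive gap = better than the dataset average, negative = worse.
gap = {
    store: sum(values) / len(values) - overall_avg
    for store, values in by_store.items()
}

for store, value in sorted(gap.items(), key=lambda item: item[1]):
    print(f"{store}: {value:+.2f}")
```

Sorting by the gap puts the units that most need attention at the top of the list.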

Always use % or stacked charts in time-trend topic analysis

Topic volume is a volatile measure. Some of the volatility is real (people are mentioning the topic more or less often in their comments) and some of it is unintended (the overall feedback volume varies due to feedback process or seasonal reasons). You need to figure out how to minimize the effect of unintended volatility in your feedback analysis process. It is impossible to receive the same number of feedbacks every day, week or month. What is important is to figure out the relative share of the topics customers or employees are talking about. Stacked charts or share-of-talk percentage makes this easy to do.
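The effect of share-of-talk is easy to see with toy numbers. In the hypothetical data below the feedback volume doubles from one month to the next, so raw PRICING mentions double as well, yet the share-of-talk for PRICING does not move. The months, topics and counts are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (month, topic) pairs, one per Topic mention.
mentions = (
    [("2024-01", "PRICING")] * 2
    + [("2024-01", "CHECKOUT")]
    + [("2024-01", "TIDINESS")]
    + [("2024-02", "PRICING")] * 4
    + [("2024-02", "CHECKOUT")] * 3
    + [("2024-02", "TIDINESS")]
)

# Count Topic mentions per month.
counts = defaultdict(Counter)
for month, topic in mentions:
    counts[month][topic] += 1

# Share-of-talk: each Topic's percentage of that month's total mentions.
share_of_talk = {
    month: {
        topic: 100 * n / sum(topic_counts.values())
        for topic, n in topic_counts.items()
    }
    for month, topic_counts in counts.items()
}

for month, shares in share_of_talk.items():
    print(month, shares)
```

Here the real change is in CHECKOUT, whose share grows, which the raw volume alone would have hidden behind the overall increase.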

It is easier to compare two periods

Because there are no absolutes in CX text analysis results, it makes sense to compare the two most recent periods. What you want to find out, after all, is what is happening NOW. You need to detect issue clusters before they become problems.

Create a way to contextually drill into the relevant customer comments

The pattern detection is done with Topic and sentiment, or their relationship to background variables, but the actual root cause is often 'hiding' in the customer comments themselves. Use Topic level sentiment and background variables as filters to produce a list of the contextually relevant comments.


2. Create new calculations

I already talked about the need to create a formula for counting the number of responses. You also need to create the additional formulas below. These examples were created in Tableau, so some of the syntax may look unfamiliar if you use another tool.

  • COMMENT VOLUME = countd([Signal Id]) | Because there are more rows than responses, you need to calculate the response volume from the Signal Id.
  • AVG TOPIC SENTIMENT = avg([Topic Sentiment]) | You always need to use average sentiment. Total sentiment doesn't mean anything.
  • # of TOPIC MENTIONS = 1 | Each row is one Topic mention, so summing this constant calculates how many Topics were mentioned in the comments.
  • TOPIC SENTIMENT SCORE = 100*[AVG TOPIC SENTIMENT] | The Etuma system uses -1 to 1 sentiment scoring and I prefer the NPS style -100 to 100 scale.
  • BENCHMARK TOPIC = SUM([# of TOPIC MENTIONS]) / TOTAL(SUM([# of TOPIC MENTIONS])) | Because topic volume is volatile and you need to use it in many reports, it makes sense to create a formula for a topic's share of all mentions.
  • # OF TOPICS = countd([Topic En]) | This tells you the number of Topics in the whole dataset. It is also useful in figuring out long responses.
  • NPS SCORE = CASE [Bgv Nps] WHEN "Promoters" THEN 100 WHEN "Passives" THEN 0 WHEN "Detractors" THEN -100 END | I know that all of you already have a formula for calculating the NPS score but just in case...
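For readers who don't use Tableau, the same calculations can be sketched in plain Python. The rows are hypothetical and the variable names are only loosely modeled on the formulas above. One assumption to flag: the NPS score here is averaged once per response (per Signal Id) rather than once per row, so that responses mentioning many Topics don't count more than others.

```python
from collections import Counter

NPS_POINTS = {"Promoters": 100, "Passives": 0, "Detractors": -100}

# Hypothetical rows: (signal_id, topic_en, topic_sentiment, bgv_nps).
rows = [
    (1, "PERSONNEL", 1, "Promoters"),
    (1, "CLEANLINESS", 1, "Promoters"),
    (1, "INVOICING", -1, "Promoters"),
    (2, "PRICING", -1, "Detractors"),
    (3, "CHECKOUT", -1, "Passives"),
    (3, "PERSONNEL", 1, "Passives"),
    (4, "PERSONNEL", 1, "Promoters"),
    (4, "PRICING", 1, "Promoters"),
]

# COMMENT VOLUME: distinct responses.
comment_volume = len({signal_id for signal_id, _, _, _ in rows})

# Number of Topic mentions: one per row.
topic_mentions = len(rows)

# AVG TOPIC SENTIMENT and TOPIC SENTIMENT SCORE (-100 to 100 scale).
avg_topic_sentiment = sum(s for _, _, s, _ in rows) / topic_mentions
topic_sentiment_score = 100 * avg_topic_sentiment

# BENCHMARK TOPIC: each Topic's share of all Topic mentions.
benchmark_topic = {
    topic: n / topic_mentions
    for topic, n in Counter(topic for _, topic, _, _ in rows).items()
}

# Number of distinct Topics in the dataset.
number_of_topics = len({topic for _, topic, _, _ in rows})

# NPS SCORE: one value per response, then averaged (assumption, see above).
nps_by_signal = {signal_id: NPS_POINTS[group] for signal_id, _, _, group in rows}
nps_score = sum(nps_by_signal.values()) / len(nps_by_signal)

print(comment_volume, topic_mentions, topic_sentiment_score, nps_score)
```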



3. Review the dataset

Acme Retail TNPS.png

As mentioned earlier: good data is paramount. The purpose of this step is to make sure that all your data has transferred to the analytics platform and that the basic calculations are done right.

You usually already know the average NPS score before this step. Make sure that the figure here is the same as in your survey or CEM tool. One possible mistake is to transfer only the responses that have comments in them. In this case the NPS score and volume would not match.

You also need to transfer those responses in which customers didn't leave any comments. This is also a good first check on the Topic sentiment score. In a high-volume TNPS process, it shouldn't be very far from the average NPS score.


4. Verify sentiment analysis quality

Sentiment Test NPS.png

Before moving forward you need to ensure that the sentiment analysis quality is at an acceptable level. NPS gives you an excellent benchmark to test this.

The x-axis lists the NPS score and the y-axis shows the AVG TOPIC SENTIMENT SCORE. Color indicates positive or negative sentiment.

In this case there is a strong correlation between average sentiment and NPS score. This is yet more evidence that sentiment analysis works.
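If you want a number rather than a visual impression, one option is to compute the Pearson correlation between the NPS score and the average topic sentiment score for each NPS value. This is a sketch of that check, not part of the Etuma service, and the five data points below are invented purely to show the calculation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical chart data: NPS answer and the average topic sentiment
# score (-100 to 100) of the responses that gave that answer.
nps_scores = [0, 3, 6, 8, 10]
avg_sentiment_scores = [-100, -50, 0, 50, 100]

r = pearson(nps_scores, avg_sentiment_scores)
print(round(r, 3))
```

A value close to 1 supports the visual check; a value near zero would be a reason to review the sentiment analysis before going further.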





5. Check topic categorization accuracy


check topics.png

Etuma has analyzed millions of comments, but there might still be errors in the way the comments are analyzed.

Etuma has a tool in which you can easily verify that the keywords are correctly mapped into the topics.





6. Create topic groups


In order to capture 'the whole world' there are typically about 500 topics in any dataset. That is too many for efficient reporting. Topic groups solve this problem.
  • There should be no more than nine ’hard’ and one ’soft’ topic group.
  • Topic groups are created in a separate project between the CX text analysis vendor and the customer.

Topic groups should reflect the way your company is organized and along the lines of your touchpoints. I group all the 'soft' Topics into a Topic Group called Experience. It isn't visible because I use it as another dimension in CX text analytics process.

I use a simple Excel table to create the Topic Groups. I put the topics on the left and the topic groups along the top and then just start matching the topics to the Topic Groups. This should be an iterative process: as you gain a deeper understanding of the Topics, you end up tuning the Topic Groups.


7. Dashboard design principles

acme_retail_dashboard.png

Create a hierarchical structure: Background variables – Topic Groups – Topics – Topic Sentences.

Make sure to include those background variables that are relevant to the viewer’s role.

Always include either topic specific sentences or whole comments.






Chapter 4: How to Extract Insights


How to create a topic cloud

Word clouds seem to be popular. They do give you an overall view of what your customers are talking about and how they feel about it, but their informational value is limited: they don’t show a change in a Topic or the root cause of a change, and customers simply use too many different words to express their concerns, wishes and ideas. Because CX text analysis turns all comments into Topics, a Topic Cloud is a much more informative way to share information.

This example has all the 554 Topics of a grocery retailer's Transactional Net Promoter Score survey process. Size shows the relative number of Topic mentions and color the average sentiment of that topic.

Here are instructions on how to create a word or topic cloud with Tableau.

Topic Cloud ALL.png

  1. Drag Topics_En (list of all topics) to the Text control of the Marks shelf.
  2. Drag Number of Records to the Size control of the Marks shelf (should be set to Sum).
  3. Change the Marks shelf type from Automatic to Text.
  4. Drag Topic_Sentiment to the Color control of the Marks shelf.
  5. Set the Topic_Sentiment calculation to AVG (Average).
  6. Set the zero point for the color to the whole dataset average sentiment.
  7. Drag Topics_En to the Filters shelf. Choose Top and the number of Topics you want to appear in the word cloud.


You can also compare topic clouds using background variables: in this case NPS categories.

Up to now, text analysis results have mostly been presented in the form of word clouds, but there are many other, often more powerful, ways to visualize the analysis results.



NPS topic cloud

As said in the previous point, Topic clouds as a whole have relatively little informational value. But when you combine them with background variables (in this case NPS group) the results can be more informative. In this case CHECKOUTS seem to be a much bigger cause of dissatisfaction for DETRACTORS than for PROMOTERS.

Topic Cloud NPS.png


Topic Sentiment Heatmap

I have tested many different visualizations for detecting patterns from open-ended comments during the past six years. What I have learned is that a simple topic-sentiment heatmap works the best.


Just a quick reminder, once again, what CX text analysis does.

“The checkout line was really long but the person at the checkout was helpful and friendly.” -> CHECKOUT -1, SERVICE ATTITUDE +1.

In this example the average sentiment zero point is 0.221. Everything to the left of this point is an area of improvement. The overall topic volume (relative topic importance) is on the y-axis. Topics in the upper left corner require urgent action.
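The "upper left corner" logic can also be expressed as a small filter. The sketch below is an illustration with made-up Topic data: the zero point is the average sentiment of all Topic mentions, and the median Topic volume is used as the cut-off for "high volume". That cut-off is an assumption, since the heatmap itself leaves the judgment to the viewer.

```python
from statistics import median

# Hypothetical Topic mentions: topic -> list of Topic sentiments.
topic_sentiments = {
    "CHECKOUT": [-1, -1, -1, 1, -1, -1],
    "PRICING": [-1, 1, 1],
    "SERVICE ATTITUDE": [1] * 8,
    "PARKING": [-1],
}

# Zero point: average sentiment of all Topic mentions in the dataset.
all_values = [s for values in topic_sentiments.values() for s in values]
zero_point = sum(all_values) / len(all_values)

# Assumed cut-off for "high volume": the median Topic volume.
volume_cutoff = median([len(values) for values in topic_sentiments.values()])

# Upper left corner: below-average sentiment and above-cut-off volume.
urgent = [
    topic
    for topic, values in topic_sentiments.items()
    if sum(values) / len(values) < zero_point and len(values) > volume_cutoff
]

print(round(zero_point, 3), urgent)
```

PARKING is negative too, but with a single mention it stays out of the urgent list, which matches how the heatmap is read.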

This approach has a big limitation though: it leaves out the time trend. 

Change between two periods

Because text analysis doesn’t have absolute benchmarks, it makes sense to compare two periods. In the same chart as above, arrows indicate the trend change between periods (usually months or weeks).


Hot topic monitoring

AcmeRetail Hottopics Monitoring.png

Hot topics are the most important customer experience attributes (or at least the most talked about attributes). They need to be tracked in more detail than other topics. In practice this means that you need to develop a dedicated topic-sentiment monitoring visualization to track the hot topics. Typically there are up to 25 Hot Topics.

Hot-topics are listed on the left. The volume of the hot-topic is on the Y-axis on the right and the bars demonstrate the monthly volume. The line and the Y-axis on the left show the average sentiment.

There is just one rule here: when the sentiment is going down and volume up, you have a problem.
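That rule is simple enough to automate as an alert. A minimal sketch, assuming you already have the monthly volume and average sentiment per hot topic (the figures below are hypothetical):

```python
# Hypothetical monthly figures per hot topic: (mention volume, average sentiment).
hot_topics = {
    "CHECKOUT": {"previous": (40, 0.10), "current": (65, -0.20)},
    "PRICING": {"previous": (30, -0.10), "current": (25, -0.15)},
    "FRESHNESS": {"previous": (20, 0.40), "current": (30, 0.55)},
}

def needs_attention(previous, current):
    """True when volume went up and average sentiment went down."""
    prev_volume, prev_sentiment = previous
    curr_volume, curr_sentiment = current
    return curr_volume > prev_volume and curr_sentiment < prev_sentiment

alerts = [
    topic
    for topic, periods in hot_topics.items()
    if needs_attention(periods["previous"], periods["current"])
]

print(alerts)
```

In practice you would probably compare share-of-talk rather than raw volume, for the volatility reasons discussed in Chapter 3.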














Identify loyalty drivers and loyalty eaters 

You can do this at the topic level, but there are usually too many topics for that to be practical, so I have used topic groups here.

Identify what topic groups (aspects of your operations) drive and eat customer loyalty

Identify ’share-of-talk’ – how much customers are talking about a topic group

  • WEIGHT is the percentage of share of talk. 
  • Color tells the sentiment (the center point at which the color turns from green to red is 32).
  • Avg NPS is the average NPS score of all comments in which customers talked about that Topic Group.
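The three measures in this list can be derived from the row-per-mention dataset as follows. This is a sketch with invented Topic Groups and values; the NPS score per response is assumed to already be coded as 100, 0 or -100, and each response counts only once toward a Topic Group's average NPS even if it mentions the group several times.

```python
from collections import defaultdict

# Hypothetical rows: (signal_id, topic_group, topic_sentiment, nps_score).
rows = [
    (1, "Checkout", -1, -100),
    (1, "Personnel", 1, -100),
    (2, "Personnel", 1, 100),
    (2, "Personnel", 1, 100),
    (3, "Checkout", -1, 0),
    (4, "Assortment", 1, 100),
    (4, "Checkout", 1, 100),
    (5, "Personnel", -1, 100),
]

sentiments = defaultdict(list)
nps_by_signal = defaultdict(dict)
for signal_id, group, sentiment, nps in rows:
    sentiments[group].append(sentiment)
    # Keyed by signal id so a response is counted once per Topic Group.
    nps_by_signal[group][signal_id] = nps

total_mentions = len(rows)
summary = {}
for group, values in sentiments.items():
    nps_values = list(nps_by_signal[group].values())
    summary[group] = {
        "weight": 100 * len(values) / total_mentions,            # share-of-talk %
        "sentiment_score": 100 * sum(values) / len(values),     # -100 to 100
        "avg_nps": sum(nps_values) / len(nps_values),
    }

for group, measures in summary.items():
    print(group, measures)
```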



Time trend analysis
time trend.png

Because Topic volume often has unintended volatility, you need to use either stacked charts or share-of-talk percentage.

In the graph on the right, the percentages show each topic's share-of-talk of that month's total talk. Color shows the average sentiment.















Compare point-of-sales performance (or other locations)

Use location to visualize text analysis results and compare points of sale or countries.

In this chart the pie chart size reflects the overall feedback volume, wedges the most common Topics and color the average Topic sentiment.








Use Topic, sentiment and background variable to drill-down to the root cause

Dashboard_negative_comments.png

You detect from your dashboard that something is wrong but you have no idea what is causing the problem. That’s why feedback analytics is divided into two distinct phases:

  1. Detecting that something is wrong (or right); and
  2. Finding the reason for that problem (or opportunity).

The process of finding this reason is called root-cause analysis.

Let’s say that you notice from the dashboard that the STORE LAYOUT topic volume starts increasing and the average sentiment is getting more negative. In this case you know that something is wrong but you have no idea what. Once you drill down to the actual customer comments, you notice that the problem is with stacking carts that employees leave unattended on the aisles.

Drill down to the root cause by using a Topic and NPS detractors as filters. Then you can read the contextually relevant comments. Often reading five to ten comments gives you a good idea about what is causing the problem.
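Outside a dashboard, the same drill-down is just a filter on the row-per-mention dataset. A minimal sketch with hypothetical rows (the context sentences are invented for illustration):

```python
# Hypothetical Topic-mention rows with their Topic context sentences.
rows = [
    {"topic": "STORE LAYOUT", "sentiment": -1, "nps": "Detractors",
     "context": "Carts were blocking the aisle again."},
    {"topic": "STORE LAYOUT", "sentiment": 1, "nps": "Promoters",
     "context": "The new layout is easy to navigate."},
    {"topic": "CHECKOUT", "sentiment": -1, "nps": "Detractors",
     "context": "The checkout line was really long."},
    {"topic": "STORE LAYOUT", "sentiment": -1, "nps": "Detractors",
     "context": "I could not get past the stacking carts."},
]

def drill_down(rows, topic, nps_group, limit=10):
    """Return up to `limit` negative context sentences for one Topic and NPS group."""
    matches = [
        row["context"]
        for row in rows
        if row["topic"] == topic
        and row["sentiment"] < 0
        and row["nps"] == nps_group
    ]
    return matches[:limit]

relevant = drill_down(rows, "STORE LAYOUT", "Detractors")
for sentence in relevant:
    print(sentence)
```

The default limit of ten reflects the observation above that five to ten comments are often enough to see the cause.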


Use two heatmaps to drill-down to the root cause

Using two Topic-sentiment heatmaps will enable you to find out how different stores (products, business units, countries, market segments…) are performing, what kind of problems different stores have and why some stores are performing better than average (identify best practices).




Use two heatmaps to benchmark support or sales agent performance

We’ve met with dozens of contact center managers during the past six years. They all complain about the same issue: It is difficult to detect systemic issues and benchmark qualitative agent performance.

agent_benchmark.png

Once you automatically and consistently categorize all support and sales transaction surveys (preferably in CES rather than NPS format), incoming emails, social media complaints and web forms, you can start extracting agent performance from the CX text analysis results.

The upper heatmap lists the support agents and the lower one the Topics. You can filter in either direction. Choose an agent or a cluster of agents with similar behavior and find out what kind of issues they are struggling with. Or choose a Topic and see which agents are performing well (or not so well) when it comes to that Topic. If you have a CES (or NPS) score available in your post-transaction surveys, you can use that instead of the sentiment.

Detecting systemic issues (i.e., finding clusters of problems) enables companies to take corrective action fast and reduce the number of claims they need to process related to that issue.

Being able to benchmark agents and identify best practices lets you continuously improve contact center performance, which reduces the number of incoming complaints and improves agent behavior. And this, of course, leads in the long term to happier and more loyal customers.


Use positive sentiment to identify best practices

Dashboard_positive_comments.png

Being able to use the positive sentiment or NPS score to drill down to the customer comments is a powerful way to identify best practices. The analysis results become even more relevant and valuable when you filter by Topic level sentiment or NPS score.

You can also use these positive comments to motivate employees by sharing them via intranet.







Identify customers who have written many verbatims

UKBanks_many_comments.png

Identifying customers (or employees) who write many open-ended comments is important. There are multiple reasons for this:

  1. Customers of this type can distort the analysis results (the one who yells the loudest is heard the best); and
  2. They are often the people whose voice spills over to the social media. If you don’t address their concerns, they will ‘talk’ about you somewhere else.

This graph lists the people who tweet about the UK banks the most.

It is important to identify those customers who ‘talk’ with or about you the most. They can be your most loyal customers, but more often they are not. They are people who like to spread their unhappiness to a wide base of friends and acquaintances. Social media makes this sharing easy. You need to identify them, communicate with them, and remedy their concerns as fast and as well as you can.


Identify customers who have written long verbatims

Customers who write long open-ended comments are usually unhappy. But they can also be your most loyal customers giving you insightful feedback. That’s why you need to identify customers who write long open-ended comments.

long_verbatims.png

The dots in this heatmap are B2B customers. The x-axis reflects the average sentiment of all the topics the customer is talking about. The y-axis shows how many topics the customer is talking about in their comment. This chart includes customers who have talked about more than 10 Topics in their comment.

Using this method, it was easy to demonstrate that the cluster of unhappy customers in the top left corner of this heatmap was more negative about the quality of this company’s customer support. By identifying these customers and drilling down further into their concerns, this company was able to focus its customer support center improvement efforts.

This visualization and process will bring additional focus to your detractor call-back process: if you are calling back unhappy customers (e.g. detractors) but need to limit the volume even further, this simple method of identifying long responses makes that possible.
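Identifying long responses boils down to counting distinct Topics per customer and pairing that count with the average sentiment. A minimal sketch under stated assumptions: the customers and rows are hypothetical, and the threshold is lowered from the "more than 10 Topics" used in the chart so that the toy data stays small.

```python
from collections import defaultdict

# Hypothetical (customer, topic, topic_sentiment) rows, one per Topic mention.
rows = [
    ("Customer A", "SUPPORT", -1), ("Customer A", "INVOICING", -1),
    ("Customer A", "DELIVERY", -1), ("Customer A", "PRICING", 1),
    ("Customer B", "PRICING", 1),
    ("Customer C", "SUPPORT", 1), ("Customer C", "DELIVERY", 1),
    ("Customer C", "QUALITY", 1),
]

def long_responses(rows, min_topics):
    """Map customer -> (distinct Topic count, average sentiment) for long responses."""
    topics, sentiments = defaultdict(set), defaultdict(list)
    for customer, topic, sentiment in rows:
        topics[customer].add(topic)
        sentiments[customer].append(sentiment)
    return {
        customer: (len(topics[customer]),
                   sum(sentiments[customer]) / len(sentiments[customer]))
        for customer in topics
        if len(topics[customer]) > min_topics
    }

# Customers who talked about more than two Topics in this toy dataset.
result = long_responses(rows, min_topics=2)
print(result)
```

Sorting the result by average sentiment gives a call-back list with the unhappiest long responders first.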


Chapter 5: How to Design a Customer Insight Dashboard

One of the most important things in customer feedback analysis is to figure out the optimum way to filter and drill down to the actual comments. That is why the actual customer comment or Topic context sentences are on the left of all of my dashboards. Here the comments are not the whole customer comments. What is displayed here is the sentence from which the feedback analysis service detected the Topic mention (Topic context sentence).

The main challenge in Tableau dashboard design doesn't come from the text analysis results but from how to display the background variables (metadata on the text comment). In this case they are GENDER and AGE GROUP. There were many more background variables in this dataset but I decided to ignore them for simplicity's sake: there is only so much real estate on one screen.

One dashboard doesn't fulfill all requirements. Depending on the CX stakeholder's role, you might need to create additional views, e.g. for benchmarking points of sale. What works well for that purpose, especially if there are a lot of stores, is two heatmaps on top of each other.



How to turn red into green?

Dashboard_negative_comments.png

According to the dashboard on the right it is clear that Birmingham customers are most unhappy with the topic STORE LAYOUT. The next step is to figure out the root cause of this problem.

That is easy to do by limiting the actual customer comments to the Birmingham store, negative sentiment, NPS detractors and STORE LAYOUT, and then reading through the comments on the right.

Usually reading the first five to ten comments will give you a pretty good idea of the root cause. In this case it was staff members leaving stacking carts unattended in the aisles during business hours, blocking customers' access.


