09/13/2019
By
MJV Team

8 metrics for you to evaluate the success of your chatbot

Now that you’ve developed your chatbot, it’s time to check out the main KPIs that you should be aware of, in order to improve and evaluate its impact!

If you’ve followed our chatbots series up until now, you should already have a good idea of how to develop a bot for your company’s needs. For your strategy to be successful, you still need to define which metrics to follow. After all, the improvement process should be continuous and focused on the real needs of the customer.

Continuing from our previous posts, let’s talk a little about the main indicators that should be used to test the performance of your conversational robot. With these indicators, you can plan the training that helps your chatbot become a valuable member of your team.

  • Net Promoter Score (NPS)

NPS is an excellent thermometer for testing your chatbot’s strategy. To track the metric, ask users at the end of a session to rate, on a scale of 0 to 10, how likely they are to recommend the chatbot.

With the results, you can divide customers into three groups:

  • Promoters: those who would use the bot again and recommend its services to other people. They give a score of 9 or 10;
  • Neutrals: as the name suggests, they had a neutral experience with the chatbot. They give a score of 7 or 8;
  • Detractors: those who did not like the experience and would probably not recommend your bot. They give a score between 0 and 6.
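To make the grouping concrete, here is a minimal Python sketch. It sorts scores into the three groups above and then applies the standard NPS formula (percentage of promoters minus percentage of detractors), which this post does not spell out. The function names and the sample scores are made up for illustration.

```python
def nps_groups(scores):
    """Split 0-10 survey scores into the three groups described above."""
    groups = {"promoters": 0, "neutrals": 0, "detractors": 0}
    for score in scores:
        if not 0 <= score <= 10:
            raise ValueError(f"score out of range: {score}")
        if score >= 9:
            groups["promoters"] += 1
        elif score >= 7:
            groups["neutrals"] += 1
        else:
            groups["detractors"] += 1
    return groups


def net_promoter_score(scores):
    """Standard NPS: percentage of promoters minus percentage of detractors."""
    if not scores:
        raise ValueError("no survey responses to score")
    groups = nps_groups(scores)
    return 100 * (groups["promoters"] - groups["detractors"]) / len(scores)


# Hypothetical end-of-session answers, for illustration only.
sample_scores = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
print(nps_groups(sample_scores))
print(net_promoter_score(sample_scores))  # 5 promoters, 3 detractors -> 20.0
```

The result ranges from -100 (all detractors) to +100 (all promoters), so it is best tracked over time rather than read as a one-off number.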

Many companies convert the NPS scale to a 5-star system. To do this, simply fit the scores we saw above into a reduced spectrum: for example, a rating of 9 or 10 would be converted into five stars.

  • Chatbot Rates (CR)

CR is another interesting metric. Basically, at the end of the service or after each response, you can ask the user for a positive or negative evaluation of the experience. This is a good opportunity to make liberal use of emojis: you could, for example, create “thumbs up” and “thumbs down” buttons.

Example:

Was this answer helpful? 👍 or 👎

With the results, you can rethink the training of the bot or, if necessary, revise the AI entities and intents that have been fed into the model (if any).

Do not forget to consider the users who did not respond to the questionnaire – the same is true for NPS. Understanding why they have not interacted with the evaluation system can also bring interesting insights and help develop improvements.
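As a rough illustration, the sketch below summarizes thumbs-up and thumbs-down feedback and also reports how many users answered at all, since the silent users matter too. The data layout ("up", "down", or None for an ignored prompt) and the function name are assumptions made for this example, not something a particular platform provides.

```python
def chatbot_rate(feedback):
    """Summarize thumbs-up / thumbs-down feedback.

    `feedback` holds one entry per rating prompt shown to a user:
    "up", "down", or None when the user ignored the prompt.
    """
    up = sum(1 for f in feedback if f == "up")
    down = sum(1 for f in feedback if f == "down")
    answered = up + down
    return {
        # Share of positive ratings among the users who answered.
        "positive_rate": up / answered if answered else None,
        # Share of prompts that received any answer at all.
        "response_rate": answered / len(feedback) if feedback else None,
    }


# Hypothetical feedback log, for illustration only.
sample_feedback = ["up", "up", "down", None, "up", None, "down", "up"]
summary = chatbot_rate(sample_feedback)
print(summary)
```

A low response rate is a signal in itself: it may mean the rating prompt appears at the wrong moment or is too easy to overlook.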

  • Fall Back Rates (FBR)

Most chatbots have fallback answers, programmed to respond when the user “explores” areas that are still unknown to your robot. Usually, the virtual attendant says that it does not know how to meet that demand.

Monitoring how often this type of response occurs is crucial, as it may point to a need for more training or simply reveal new intents and entities not currently covered in the bot design.

If we divide the number of times the chatbot has had to use a fallback response by the total number of answers it has given, we get the confusion rate.

Confusion rate = number of fallback answers / total answers offered
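The formula above translates directly into code. This is a minimal sketch with made-up counts; in practice the two numbers would come from your bot’s logs or analytics dashboard.

```python
def confusion_rate(fallback_answers, total_answers):
    """Confusion rate = number of fallback answers / total answers offered."""
    if total_answers <= 0:
        raise ValueError("total_answers must be positive")
    if not 0 <= fallback_answers <= total_answers:
        raise ValueError("fallback_answers must be between 0 and total_answers")
    return fallback_answers / total_answers


# Hypothetical counts: 45 fallback answers out of 900 answers in total.
rate = confusion_rate(45, 900)
print(f"{rate:.1%}")  # 5.0%
```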

  • Active Users

This metric is widely used in chatbot strategies. With it, you find out how many people are interacting with your bot in a given period, whether you monitor it daily, weekly, or monthly. If the number is low, it may mean you should rethink your approach, choose other channels, or even review the robot’s design.

For a complete overview, this metric is often used in conjunction with another one: the retention rate.
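Counting active users comes down to counting distinct users in a time window. The sketch below assumes a simple message log of (user id, date) pairs; the log format, the names, and the sample data are invented for illustration.

```python
from datetime import date


def active_users(events, start, end):
    """Count distinct users with at least one message between start and end (inclusive)."""
    return len({user for user, day in events if start <= day <= end})


# Hypothetical message log: one (user_id, date) pair per message.
sample_events = [
    ("ana", date(2019, 9, 2)),
    ("ana", date(2019, 9, 3)),
    ("bob", date(2019, 9, 3)),
    ("cai", date(2019, 9, 10)),
]

daily = active_users(sample_events, date(2019, 9, 3), date(2019, 9, 3))
weekly = active_users(sample_events, date(2019, 9, 2), date(2019, 9, 8))
print(daily, weekly)  # 2 2
```

Note that a user who sends several messages is counted only once per window, which is why a set is used.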

  • Retention rate

As we have seen, it is not enough to keep track of active users, and special attention must be given to the retention rate. The metric is intended to find out how many users returned to use the bot in a given period of time.

A high retention rate suggests that your strategy is on the right track: users are probably satisfied, and the conversation with the bot is meeting their needs.
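There is more than one way to define retention. The sketch below uses one common, simple definition: the share of users active in one period who came back in the next. That definition, the function name, and the user lists are assumptions made for this example.

```python
def retention_rate(previous_users, current_users):
    """Share of users from the previous period who returned in the current one."""
    previous = set(previous_users)
    if not previous:
        raise ValueError("no users in the previous period")
    returned = previous & set(current_users)
    return len(returned) / len(previous)


# Hypothetical user ids seen in two consecutive weeks.
week_1 = ["ana", "bob", "cai", "dee"]
week_2 = ["bob", "dee", "eli"]
print(retention_rate(week_1, week_2))  # 2 of 4 users returned -> 0.5
```

New users in the second period ("eli" here) add to the active user count but do not affect retention, which is why the two metrics are worth reading together.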

  • Conversation Interactions

The conversation interaction rate aims to find out how many times the user has exchanged messages with your bot during a session. You must be very careful when using this metric, as each chatbot has its own strategies and purposes.

A chatbot intended for service, for example, may have low interaction rates, since the ultimate goal is to promptly serve its user. If we are talking about a chatbot focused on entertainment (storytelling), on the other hand, high interaction rates mean that users are exploring the robot’s full capabilities!
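One simple way to measure this is the average number of user messages per session. The sketch below assumes each user message is logged with a session id; the log format and names are hypothetical.

```python
from collections import Counter


def interactions_per_session(message_session_ids):
    """Average number of user messages per session.

    `message_session_ids` holds one session id per user message.
    """
    counts = Counter(message_session_ids)
    if not counts:
        raise ValueError("no messages to analyze")
    return sum(counts.values()) / len(counts)


# Hypothetical log: six user messages spread over three sessions.
sample_log = ["s1", "s1", "s1", "s2", "s3", "s3"]
print(interactions_per_session(sample_log))  # 6 messages / 3 sessions -> 2.0
```

Whether 2.0 is good or bad depends on the bot’s purpose, as discussed above: a service bot may aim for a low number, an entertainment bot for a high one.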

  • Goal Completion Rate (GCR)

To see whether the chatbot is fulfilling its original purpose (finding and giving a correct answer), it is advisable to follow the Goal Completion Rate (GCR). This metric shows how often the bot delivers the results expected by the company and the user.

Is your goal to capture leads? If so, how many times did the chatbot get the requested data from the user? Is the idea to perform a specific service? How many times has the bot been able to solve a customer’s problem?
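In code, GCR is the share of sessions in which the goal was reached. How a “completed goal” is detected depends entirely on your bot (a captured lead, a solved ticket, and so on), so the `goal_completed` flag below is an assumption for the sake of the example.

```python
def goal_completion_rate(sessions):
    """Share of sessions in which the bot reached its goal."""
    if not sessions:
        raise ValueError("no sessions to analyze")
    completed = sum(1 for s in sessions if s["goal_completed"])
    return completed / len(sessions)


# Hypothetical sessions of a lead-capture bot.
sample_sessions = [
    {"id": "s1", "goal_completed": True},
    {"id": "s2", "goal_completed": False},
    {"id": "s3", "goal_completed": True},
    {"id": "s4", "goal_completed": True},
]
print(goal_completion_rate(sample_sessions))  # 3 of 4 sessions -> 0.75
```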

  • Session Length

Session length can be evaluated in conjunction with a metric we have already seen: conversation interactions. The difference is that this indicator provides other insights about your conversational robot.

Does my client spend a lot of time reading the messages? Are the buttons confusing? Is the text easy to read? These are just a few examples of the questions this indicator can help answer.

Again, the evaluation criterion for the success of this metric depends on the strategy and purpose of the chatbot.
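A minimal sketch of how session length could be computed, assuming each session is logged with a start and an end timestamp (the data below is invented). It reports both the mean and the median, because a few very long sessions can distort the mean.

```python
from datetime import datetime
from statistics import mean, median


def session_lengths(sessions):
    """Return the duration of each (start, end) session in seconds."""
    return [(end - start).total_seconds() for start, end in sessions]


# Hypothetical sessions as (start, end) timestamps.
sample_sessions = [
    (datetime(2019, 9, 13, 10, 0, 0), datetime(2019, 9, 13, 10, 1, 0)),
    (datetime(2019, 9, 13, 11, 0, 0), datetime(2019, 9, 13, 11, 3, 0)),
    (datetime(2019, 9, 13, 12, 0, 0), datetime(2019, 9, 13, 12, 5, 0)),
]

lengths = session_lengths(sample_sessions)
print(mean(lengths), median(lengths))
```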

How to monitor the indicators?

Luckily, most chatbot development tools have their own dashboards, with key metrics to track the bot’s impact. Some examples are platforms like HubSpot and Blip.

If you use a tool that does not show the indicators you need, it may be preferable to use solutions designed specifically for chatbot analytics, such as Chatbase and Dashbot.

How to improve results?

Chatbots are a relatively new tool. There are no preformatted templates or magic formulas for best results. The solution? Test and exploit the tool’s full potential!

A/B tests have never been so important. You should introduce improvements gradually and keep track of each of these metrics. This work demands maximum creativity from UX writers and UX designers, who need to look for new ways to promote a good user experience.
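When comparing two versions of a bot, it helps to check whether a difference in a metric is larger than random noise. The sketch below applies a two-proportion z-test to the goal completion rate of two variants. This is one common statistical approach rather than something this post prescribes, and the counts are made up.

```python
from math import erfc, sqrt


def compare_completion_rates(completed_a, total_a, completed_b, total_b):
    """Two-proportion z-test for the completion rates of variants A and B.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = completed_a / total_a
    rate_b = completed_b / total_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (completed_a + completed_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("cannot test: every session has the same outcome")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical test: variant A completed 120 of 400 sessions, B 150 of 400.
rate_a, rate_b, z, p_value = compare_completion_rates(120, 400, 150, 400)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value (0.05 is a frequent threshold) suggests the difference is unlikely to be due to chance alone. The test assumes reasonably large samples and independent sessions.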

What did you think of our post? As we prepare our last post of the series, how about downloading our complete practical guide on how to apply User Experience in your company?
