Identifying emotions in email with human-level accuracy

As part of the Master’s programme in Business Analytics at VU Amsterdam, Erwin Huijzer completed his master’s thesis at Anchormen:
“Identifying effective affective email responses; Predicting customer affect after email conversation”

When customers contact a company with queries and complaints, they often prefer to use email. Handling these emails is a massive task for the Customer Support department. Automating email handling can help reduce costs and shorten response time. However, awareness of customer emotion during the conversation is an important aspect of effective email handling.

In the thesis, sentiment analysis was applied to incoming customer emails to determine the customer’s initial emotion. Furthermore, affect analysis was applied to predict the customer’s emotion after the response email from Customer Support. Both analyses used supervised machine learning, which trains computer models on labelled data. This required manually labelling a set of emails with sentiment (None, Neg, Pos, Mix) and emotions (Anger, Disgust, Fear, Joy, Sadness).

Manual labelling revealed that humans find it very difficult to determine emotions in email. Still, using majority vote, a reliable labelset could be determined. Applying machine learning (a voting ensemble of Random Forest and Neural Net) to the labelled data resulted in human-level accuracy for Anger and Joy. For Disgust, the model even significantly outperformed human annotation. Adding an SVM to the same voting ensemble led to human-level performance on Sentiment too. For both sentiment and emotions, the domain-specific models trained on a small set of 742 emails outperformed a commercial model that was trained on millions of news sources.
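
Majority voting plays two roles in the paragraph above: it merges the labels of several annotators into one labelset, and it is one way a voting ensemble can merge the predictions of several models. The sketch below illustrates the idea only and is not the thesis implementation. It assumes hard voting (each voter contributes one label) and assumes that a tie yields no label; the thesis may have used soft voting or a different tie-breaking rule. The email IDs, annotations and model names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the most frequent vote, or None if there is no clear winner."""
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    # Assumption: when the two highest counts are equal, no label is assigned.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical sentiment annotations from three human annotators per email.
annotations = {
    "email_1": ["Neg", "Neg", "Mix"],
    "email_2": ["Pos", "Neg", "Mix"],  # three-way tie, so no label
}
labelset = {email_id: majority_vote(v) for email_id, v in annotations.items()}

# Hypothetical predictions of three models for a single email (hard voting).
model_predictions = {"random_forest": "Neg", "neural_net": "Neg", "svm": "Mix"}
ensemble_label = majority_vote(list(model_predictions.values()))
```

In this sketch an email without a clear majority simply gets no label. Whether such emails are dropped, re-annotated or resolved in some other way is a design choice that this summary does not describe.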

Using machine learning to predict customer affect showed low performance. Still, the results are significantly better than the benchmarks. A more direct measurement of customer affect may, however, drastically improve performance.

The full thesis is available for download here. The presentation is available here.
