Advances in customer intent prediction & pattern discovery

Changing banking for good with clickstream data

Cailing Dong

October 14, 2020

As customers become more digitally engaged and more businesses shift to the online space, a strong digital customer experience becomes an increasing necessity. Businesses are striving to keep up, developing customer-driven business strategies to increase customer satisfaction, conversion, and retention rates on their online sites. One area of opportunity that data scientists have found is clickstream data.

Clickstream data is a detailed log of how users navigate through websites or mobile applications. It typically includes when they visited a page or clicked on a button, how they arrived on the page, how much time they spent on each page, and where they navigated next. Clickstream data has been widely used in e-commerce to better understand user behavior, specifically to help introduce potential buyers to new and better product options, as well as to convert on-the-fence considerations to firm sales. However, the potential for applying clickstream data analysis beyond e-commerce sites has not been fully explored. Can clickstream data be used beyond the sales arena to help with customer service issues instead?

I am a Principal Data Scientist at Capital One working on exploring new opportunities to use data to increase customer satisfaction. My team conducted a systematic analysis of our clickstream data in hopes of identifying areas where it could be applied to customer service solutions in the digital banking industry. In this article, I will present three use cases - clickstream-based call intent prediction, search analysis and frequent sequential pattern mining - that demonstrate how clickstream data can help provide better customer service.

Real world value of clickstream data

Before we talk about use cases, let’s talk about why this matters in light of current events. During the COVID-19 global pandemic, many financial services companies have been experiencing an increasingly high volume of online visits and customer calls. Customers’ concerns range from canceling travel plans booked using credit cards or card rewards to worries about their financial situation, including their mortgages, rent, auto loans, and credit card payments. The financial industry carries a big responsibility to understand the common concerns of customers during this time, and to pass along the latest COVID-19 related information in a timely and efficient manner.

Clickstream data, especially customer search terms, becomes an important resource to understand the common concerns of customers during this time. As shown in Figure 3, starting from the middle of March 2020, the number of search activities on our site with both general terms about COVID-19 (covid_searches) and specific terms about COVID-19 (covid_only_searches) increased. Additionally, searches about stimulus checks (stimulus_checks) started to appear and increase in April, right after the first release period issued by the government.

graph with orange, blue, and green bell curves

Figure 3: Search activities about COVID-19.

This simple example shows how we can better understand our customers’ concerns and needs through clickstream data.
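A tally like the one behind the search-activity figure above can be sketched in a few lines. This is only an illustration: the keyword groups, the sample queries, and the `weekly_counts` helper are all hypothetical, since the article does not list the actual terms behind each series.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical keyword groups; the real term lists are not given in the article.
TERM_GROUPS = {
    "covid_only_searches": ("covid", "coronavirus"),
    "stimulus_checks": ("stimulus",),
}

# Hypothetical search log: (search date, search text).
searches = [
    (date(2020, 3, 3), "increase credit limit"),
    (date(2020, 3, 17), "covid payment help"),
    (date(2020, 3, 18), "coronavirus hardship"),
    (date(2020, 4, 14), "stimulus check deposit"),
    (date(2020, 4, 15), "covid stimulus"),
]

def weekly_counts(searches, term_groups):
    """Count searches per (week start, term group) using substring matches."""
    counts = Counter()
    for day, query in searches:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        text = query.lower()
        for group, keywords in term_groups.items():
            if any(keyword in text for keyword in keywords):
                counts[(week_start, group)] += 1
    return counts

counts = weekly_counts(searches, TERM_GROUPS)
for (week_start, group), n in sorted(counts.items()):
    print(week_start, group, n)
```

Plotting these weekly counts per group over time would give one curve per term group.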

Use case 1: Call intent prediction

Customer service is one of the most important ways to improve customer retention. Many companies have adopted Interactive Voice Response (IVR) for customer service over the phone, in an attempt to solve customer problems in a self-serviceable space rather than transferring them to the more expensive call agent. For example, you may want to activate your new card when you receive it. You call in and your account identity is verified. The IVR knows that a new card was recently sent and that one possible reason you called is to activate this card; so further instructions will be provided to help you successfully activate it without transferring you to a call agent.

Modern IVR systems usually use advanced speech recognition and natural language processing techniques to reactively understand why customers are calling based on their conversations (utterances) with the system. Can clickstream data proactively predict why a customer is calling, before they even express their concerns?

Reactive IVR systems usually have a number of pre-recorded options for customers to choose from, such as ‘make a payment’, ‘credit card due’, or ‘talk to a representative’. By nature, most customers prefer to talk to a real human to solve their problems as soon as possible, especially when their call reasons do not fall into any of the given options. However, if a company can anticipate customers' call reasons from their most recent searches and activities on the site and mobile app, it can customize the greetings and offer the right options right away.

For example, let’s say a customer logged into their account and reviewed many pages about redeeming credit card rewards and booking travel. If they made a customer service phone call immediately afterwards, it’s safe to say that redeeming rewards is one of the most likely reasons. By asking the right questions and providing the right options, IVR can proactively prompt customers to resolve their issues in the self-service space. When IVR can’t effectively solve the issue itself, knowing the call intent helps it direct the call to the right call agent who is an expert on the specific issue, avoiding multiple transfers. This means the proactive process can potentially reduce call volume reaching representatives and reduce the average call handling time.

To understand how clickstream data can help, my team started with some exploratory data analysis on the correlation between clickstream data and call reasons. It is worth noting that we only used information from customers logged into their accounts and only on our site. Specifically, we collected all the titles of the web pages viewed within one hour before a customer placed a customer service call, along with the existing call reason inferred by the IVR system based on the customer's utterances. From this we identified the most frequently viewed pages visited before placing certain types of customer service calls.
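The collection step described above can be sketched as follows. This is a minimal illustration, not the team's actual pipeline: the record layouts, sample values, and the `pages_before_calls` helper are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical page views: (customer_id, timestamp, page_title).
page_views = [
    ("c1", datetime(2020, 3, 2, 9, 10), "Redeem Rewards"),
    ("c1", datetime(2020, 3, 2, 9, 25), "Book Travel"),
    ("c1", datetime(2020, 3, 2, 7, 0), "Account Summary"),  # over an hour before the call
    ("c2", datetime(2020, 3, 2, 14, 5), "Dispute a Charge"),
    ("c2", datetime(2020, 3, 2, 14, 20), "Lock Card"),
    ("c3", datetime(2020, 3, 3, 11, 0), "Redeem Rewards"),
]
# Hypothetical calls: (customer_id, call_time, call reason inferred by the IVR).
calls = [
    ("c1", datetime(2020, 3, 2, 9, 40), "rewards"),
    ("c2", datetime(2020, 3, 2, 14, 30), "fraud"),
    ("c3", datetime(2020, 3, 3, 11, 30), "rewards"),
]

def pages_before_calls(page_views, calls, window=timedelta(hours=1)):
    """Count page titles viewed within `window` before each call, per call reason."""
    views_by_customer = defaultdict(list)
    for customer_id, viewed_at, title in page_views:
        views_by_customer[customer_id].append((viewed_at, title))

    counts = defaultdict(Counter)
    for customer_id, call_time, reason in calls:
        for viewed_at, title in views_by_customer[customer_id]:
            if call_time - window <= viewed_at < call_time:
                counts[reason][title] += 1
    return counts

counts = pages_before_calls(page_views, calls)
for reason, page_counts in counts.items():
    print(reason, page_counts.most_common(3))
```

The per-reason counters are the raw material for a chart of the most frequently viewed pages by call reason.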

Figure 1 demonstrates a generic example of the web pages a banking customer might view before calling IVR with call reasons/intents around ‘rewards’ and ‘fraud’. In this example, the larger the circle, the more frequently the page was visited. The darker the shade, the stronger the relation. This shows that customers’ digital activities captured by clickstream data are strong indicators of customers’ intents when calling into IVR.

table with black text and dots in various shades of blue

Figure 1: Example of most frequent pages viewed before customers called IVR with call reason ‘rewards’ and ‘fraud’, respectively.

A word-level analysis can also be conducted to further verify the correlation between clickstream data and call intents. Specifically, let’s use U and P to denote customer utterances during an IVR call and the set of pages viewed before the call, respectively.

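Each utterance $u_i \in U$ and page title $p_j \in P$ consists of a list of words, i.e.:

$u_i = (w^{u}_{i1}, w^{u}_{i2}, \ldots)$ and $p_j = (w^{p}_{j1}, w^{p}_{j2}, \ldots)$

Thus, the final U and P can each be represented by a word vector, where the words are obtained from all the $u_i$ and $p_j$, respectively.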

The importance of each word is measured by a technique called Term Frequency-Inverse Document Frequency (TF-IDF). We further adopted cosine similarity, one of the most commonly used document similarity measures in information retrieval, to calculate the similarity between the two word vectors. The pairwise similarities are demonstrated by the heatmap below (darker color indicates greater similarity).

Building off our example, we can see in Figure 2 that the largest similarity values occur on the diagonal. That is, the words that occurred in the web page titles and the IVR utterances are highly correlated when the web session and call happened within an hour, suggesting both events have the same intent. This further supports the idea that clickstream data can be used to predict call reasons and intents.

table with navy rows and white text and cells in varying shades of green

Figure 2: Heatmap of similarity between page title words and utterance words.
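To make the word-level comparison concrete, here is a minimal sketch of TF-IDF weighting followed by cosine similarity, written with the standard library only. The word lists are made up for illustration, and the smoothed IDF formula is one common variant chosen for this example; the article does not say which variant the team used.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {word: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            word: (count / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in term_freq.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical words per call intent: IVR utterances vs. page titles viewed before the call.
utterance_words = {
    "rewards": "redeem my travel rewards miles".split(),
    "fraud": "report fraud charge on my card".split(),
}
page_title_words = {
    "rewards": "redeem rewards book travel".split(),
    "fraud": "dispute charge lock card fraud alert".split(),
}

intents = list(utterance_words)
vectors = tfidf_vectors(
    [utterance_words[i] for i in intents] + [page_title_words[i] for i in intents]
)
utterance_vecs = dict(zip(intents, vectors[:len(intents)]))
page_vecs = dict(zip(intents, vectors[len(intents):]))

similarity = {
    (u, p): cosine_similarity(utterance_vecs[u], page_vecs[p])
    for u in intents for p in intents
}
for pair, score in similarity.items():
    print(pair, round(score, 3))
```

In this toy data the matching intent pairs score higher than the mismatched ones, which is the diagonal pattern the heatmap describes.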

Given the positive results of the above exploratory data analysis, one could build a customer call intent prediction model to predict the most likely reasons a customer is calling, even before the initial customer utterance. The rich information contained in clickstream data enables one to use not only page titles as input to the model, but also button-click and link-click activities.

We used the same strategy to represent the clickstream data associated with each call by a word vector, where the importance of each word is represented by the corresponding TF-IDF value. While different machine learning algorithms can be used, e.g., Logistic Regression, Random Forest, Naive Bayes, and Support Vector Machine, we found that logistic regression produced slightly better results with higher interpretability for this multi-class classification problem. Overall, this type of intent prediction model can achieve a high level of accuracy. The detailed performance metrics from an example model are shown below.

Table 1: Performance of call intent prediction model.

As you can see, this example model performs better on specific intents such as ‘fraud/dispute’ and ‘rewards/travel’, compared with general purpose intents such as ‘account information’ and ‘credit card information.’ This is likely because the latter are associated with many common-purpose activities, which are hard to classify even using human judgment.
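For readers who want to see the shape of such a model, below is a toy sketch of multi-class logistic regression on TF-IDF features, using only the standard library. Everything in it is an assumption for illustration: the sessions, the intent labels, the hyperparameters, and the plain batch gradient descent. A production model would more likely use an established machine learning library and far more data.

```python
import math
from collections import Counter

# Toy training data (hypothetical): clickstream text per call and its call intent.
sessions = [
    ("redeem rewards book travel", "rewards/travel"),
    ("rewards miles travel portal", "rewards/travel"),
    ("dispute charge lock card", "fraud/dispute"),
    ("report fraud dispute transaction", "fraud/dispute"),
    ("make payment due date", "payment"),
    ("payment autopay schedule payment", "payment"),
]
docs = [text.split() for text, _ in sessions]
labels = sorted({intent for _, intent in sessions})
y = [labels.index(intent) for _, intent in sessions]

# TF-IDF features with a smoothed IDF, L2-normalized per document.
doc_freq = Counter(word for doc in docs for word in set(doc))
idf = {w: math.log((1 + len(docs)) / (1 + c)) + 1 for w, c in doc_freq.items()}
vocab = sorted(idf)

def featurize(doc):
    term_freq = Counter(doc)  # words outside the vocabulary are ignored
    vec = [term_freq[w] * idf[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def predict_proba(x, weights, biases):
    """Softmax over one linear score per class."""
    scores = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(X, y, n_classes, epochs=200, lr=0.5):
    """Fit multinomial logistic regression with batch gradient descent."""
    n_features = len(X[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = predict_proba(x, weights, biases)
            for k in range(n_classes):
                error = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += error
                for j, value in enumerate(x):
                    grad_w[k][j] += error * value
        for k in range(n_classes):
            biases[k] -= lr * grad_b[k] / len(X)
            for j in range(n_features):
                weights[k][j] -= lr * grad_w[k][j] / len(X)
    return weights, biases

X = [featurize(doc) for doc in docs]
weights, biases = train(X, y, n_classes=len(labels))

# Score the clickstream of a new (hypothetical) session seen just before a call.
new_session = "lock card report fraud".split()
probs = predict_proba(featurize(new_session), weights, biases)
predicted = labels[probs.index(max(probs))]
print(predicted, [round(p, 2) for p in probs])
```

Because each class gets one weight per word, the learned weights can be read directly as "which words push toward which intent", which is the interpretability advantage mentioned above.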

Use case 2: Search analysis

Typically, clickstream data does not capture raw text from the fields a customer fills in due to the sensitivity of said data. However, one of the text fields that can be collected is customer search data. Although the percentage of customers that use the search bar for financial sites is small, we can learn a lot about the intent of these customers, as they explicitly tell us about their needs when they search.

For example, during tax season, many customers search for ‘1099’ or ‘tax form’. Before college starts, many customers search for ‘increase credit limit’ and ‘add authorized user’, probably for their college kids. Search terms not only reveal trends in customers’ interests, but can also indicate an individual customer’s intent. For example, Figure 3 shows the similarity between the text of what customers searched for on the web and what they said to the voice prompt in the IVR system (the “call utterance” is captured as a transcript using speech-to-text).

diagram made of black text and dots in varying shades of blue

Figure 3: Similarity between call utterances and search terms.

You see that, for many intents (reasons for calling), the website search text matches what the customer says verbally when prompted. For example, customers who call in about credit limit increases use similar words in their search queries and their verbal utterances. Knowing that search text can provide clues about what the customer is seeking, one can use such data along with pageview and button/link click data to help improve customers’ experiences with voice self-serve, reducing call volume via the customer call intent model described earlier.

Customer search terms are indeed valuable resources for understanding customers’ intents. However, through exploratory data analysis and conversations with our tech partners, my team discovered that search was based on exact keyword matches, which had difficulty handling the imprecise nature of language, including synonyms, misspellings, and conjugations, to name a few.

In order to improve the customer experience by delivering the most relevant search results, we focused on the need for a corpus, i.e., the domain-specific collection of all search terms ever captured from our site.

On average, customers use 2-3 words to describe what they are searching for. Therefore, traditional document classification or topic modeling techniques such as Latent Dirichlet Allocation are not as effective in finding search terms that are alike. In this case, one can leverage an unsupervised machine learning technique called Word2Vec to identify semantically related words in the corpus. The Word2Vec process transforms the corpus into a numeric space in which semantically similar terms sit closer together. When one pulls the top synonyms for some popular search terms, that is, the most closely related words in the numeric space, one can see that the synonyms pick up on nuances like common misspellings, missing spaces, and other words that are synonymous but may not be searchable with a direct keyword.

table with light blue header row and black text

Table 2: Top Synonyms based on Word2Vec.
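The "top synonyms" lookup is a nearest-neighbor search in the embedding space. The sketch below shows only that lookup step, not Word2Vec training itself: the three-dimensional vectors are hand-written stand-ins for trained embeddings, and the terms and the `top_synonyms` helper are hypothetical.

```python
import math

# Hypothetical embeddings standing in for vectors learned by Word2Vec.
embeddings = {
    "payment": [0.90, 0.10, 0.05],
    "pymt":    [0.88, 0.12, 0.07],  # a misspelling-style variant
    "autopay": [0.80, 0.30, 0.10],
    "rewards": [0.10, 0.90, 0.10],
    "miles":   [0.15, 0.85, 0.20],
    "fraud":   [0.05, 0.10, 0.95],
}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_synonyms(term, embeddings, k=2):
    """Return the k terms closest to `term` in the embedding space."""
    query = embeddings[term]
    scored = [
        (other, cosine(query, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

for term in ("payment", "rewards"):
    print(term, [(word, round(score, 3)) for word, score in top_synonyms(term, embeddings)])
```

A search backend could then expand a query with its nearest neighbors before running the keyword match, so that a misspelled or abbreviated query still reaches the intended results.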

Applying Word2Vec on search terms pulled from clickstream data enables one to leverage the ‘synonyms’ to provide customers with more meaningful search results for a broader range of words used, streamlining the search process and the overall customer experience. This prevents more customers from needing to call customer service for answers. After all, if customers only want or need a digital interaction, you want to be able to supply it. More effective search analysis can help supply this.

Use case 3: Frequent sequential pattern mining

What makes clickstream data so special and useful is that it consists of a series of ordered events triggered by user interactions. That is, every event, such as a page view or a button click, is associated with a timestamp representing the exact time this event happened. It is of great importance to understand common paths taken by users, and identify potential anomalies, as such patterns can provide insights about customer behaviors, website design and product feature effectiveness.

For this exploration, one can target customers’ activities in sessions where they are trying to complete certain tasks - such as making a payment, exploring credit cards, redeeming rewards, etc - to discover interesting, useful, and unexpected patterns, or sub-sequences, inherent in our data.

To achieve this goal, my team uses PrefixSpan, an effective frequent sequential pattern mining algorithm, to extract frequent sequential patterns. Unlike string matching or regular pattern mining, sequential pattern mining can find statistically relevant patterns in a data collection where the values are delivered in a sequence.
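As a rough illustration of how PrefixSpan works, here is a compact version for sequences of single events, where each session is an ordered list of page or click names. It is a simplified sketch, not the team's implementation: the session data and event names are hypothetical, and the full algorithm also handles sequences whose elements are sets of items.

```python
def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for patterns found in at least `min_support` sequences."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start position) pairs: the suffix of
        # each sequence that remains after matching the current prefix.
        extensions = {}
        for seq_idx, start in projected:
            seen = set()
            for pos in range(start, len(sequences[seq_idx])):
                item = sequences[seq_idx][pos]
                if item not in seen:  # count each sequence once, at the first occurrence
                    seen.add(item)
                    extensions.setdefault(item, []).append((seq_idx, pos + 1))
        for item, next_projected in extensions.items():
            if len(next_projected) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(next_projected)
                mine(pattern, next_projected)  # grow the pattern on the projected suffixes

    mine((), [(i, 0) for i in range(len(sequences))])
    return results

# Hypothetical sessions, each an ordered list of events.
sessions = [
    ["home", "prequalify", "compare_cards", "apply"],
    ["home", "compare_cards", "home", "prequalify", "apply"],
    ["home", "prequalify", "apply"],
    ["home", "payments", "make_payment"],
]

patterns = prefixspan(sessions, min_support=3)
for pattern, support in sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0])):
    print(" -> ".join(pattern), support)
```

Note that a pattern such as home -> prequalify -> apply is matched even when other events occur in between, which is what distinguishes sequential pattern mining from exact string matching.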

Table 3 lists a few simple examples of frequent sequential patterns, with indicators of whether a loop occurs in each pattern.

Table 3: Example of frequent sequential patterns

By extracting sequential patterns that commonly exist in sessions where customers try to complete a task, one can learn their common paths to completing the task, common issues that prevent them from successfully completing the task, etc. For example, Table 3 [a] and [d] show examples of the frequent sequential patterns that happen during the process of applying for a credit card and making a payment, respectively. To be more specific, Figure 4 shows the common sequential patterns that happen during sessions to apply for a credit card. You can see that, in the process of applying for a credit card, customers tend to go back to their home page, then return to the pre-qualified pages and card application pages. Customers also tend to compare different credit cards (card:ccp in Figure 4) before they apply for a card. This is a common process that falls within expectations. However, if different patterns emerged after web layout changes or certain marketing campaign launches, the new patterns might shed light on how the new website design or marketing campaign helps, or even hurts, the customers’ experience in completing some typical tasks.

flow chart with navy rectangles and arrows and white text

Figure 4: Common sequential patterns during sessions to apply for a credit card.

By extracting frequent sequential patterns between a few products under the same product family, one can learn more about the popularity of one product over another during a certain period. Table 3 [b] and [c] show examples of customer preferences for different credit cards (Venture vs. VentureOne, and Savor vs. Quicksilver) within different product families (Travel Rewards and Cash Back Rewards) in a short time period last year. Such patterns can also help verify the effectiveness of a marketing campaign on a specific product, which should create the expected preference for the marketed product over another (for example, ‘Venture’ over ‘VentureOne’ in the example in Table 3 [b]).

One can also apply frequent sequential pattern mining to the clickstream data belonging to each individual customer. On one hand, the frequent sequential patterns can help one build product- or service-oriented customer segmentations, and further inform the corresponding marketing decisions and product recommendations. On the other hand, by comparing a customer’s most recent behavior with the frequent sequential patterns extracted from their historical data, one may identify potential anomalies if the current behavior is very different from the previous common patterns.
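One simple way to picture the comparison against historical patterns is to measure how many of a customer's usual patterns appear in a new session. The sketch below is only one possible scoring scheme: the patterns, sessions, the `pattern_coverage` helper, and the 0.34 threshold are all hypothetical and not taken from the article.

```python
def contains_pattern(session, pattern):
    """True if `pattern` occurs in `session` in order, allowing other events in between."""
    events = iter(session)
    # `step in events` advances the iterator, so each step must follow the previous match.
    return all(step in events for step in pattern)

def pattern_coverage(session, frequent_patterns):
    """Fraction of the customer's frequent patterns that appear in this session."""
    matched = sum(contains_pattern(session, pattern) for pattern in frequent_patterns)
    return matched / len(frequent_patterns)

# Hypothetical frequent patterns mined from one customer's history.
historical_patterns = [
    ("login", "account_summary"),
    ("login", "make_payment"),
    ("account_summary", "make_payment"),
]

usual_session = ["login", "account_summary", "statements", "make_payment"]
unusual_session = ["login", "update_address", "add_authorized_user", "transfer_balance"]

THRESHOLD = 0.34  # hypothetical cut-off; would need tuning on real data
for name, session in (("usual", usual_session), ("unusual", unusual_session)):
    coverage = pattern_coverage(session, historical_patterns)
    print(name, round(coverage, 2), "flag for review" if coverage < THRESHOLD else "ok")
```

A low coverage score only says the session looks different from the customer's history; whether that difference matters would still need follow-up signals or review.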

Overall, frequent sequential pattern mining from clickstream data can help one identify common customer behaviors, shed light on how to enhance website design, and verify the effectiveness of a marketing campaign on a specific product. Furthermore, one can also use it to help uncover anomalous patterns indicative of adversarial threats, such as common online interactions that happened before a fraudulent incident.

Conclusion

In this article, I’ve covered three potential use cases for clickstream data: call intent prediction, search analysis, and frequent sequential pattern mining. Understanding and predicting customer intent is the key to providing better customer service, and to changing banking for good. Overall, exploring clickstream data at an individual customer level can help the financial services industry provide a more personalized experience to their customers, while identifying common patterns among customers can help identify seasonal trends, common customer pain points, and even abnormal behavior patterns.

Special thanks to Gretchen Wagner, Mehul Sharma, and Brian d'Alessandro, for working on the projects together and providing valuable comments on this blog post.

Cailing Dong, Principal Data Scientist

Cailing is a Principal Data Scientist at Capital One, where she worked closely on discovering interesting patterns and use cases of using clickstream data for prediction, marketing and servicing models. Cailing obtained her Ph.D. in Information Systems, and Master in Computer Science in 2017 and 2010, respectively. She is broadly interested in solving various predictive modeling problems using natural language processing, traditional machine learning, and deep learning methods. Outside work she enjoys Zumba and Yoga.