Dynamic Customer Embeddings & Understanding Customer Intent

How sequential recommendation and representation learning can be leveraged to model customer behavior

By Sam Sharpe, Senior Software Engineer and Karthik Rajasethupathy, Senior Manager, Data Science

Digitization has swept through the financial services industry with the explosion of online services for credit cards, rewards, loans, banking, investing, and budgeting. Innovation in this field has roughly mirrored trends in ecommerce, where companies have mastered personalization, marketing, and efficiency. From providing automated customer service to alerting customers about potentially fraudulent transactions, the services we provide are only improved by a deep understanding of our customers.

Online activities on web and mobile apps offer a completely new lens through which to gain this understanding. Not only is digital activity always changing, but, like transactions, these activities are high-dimensional and require feature engineering for specific tasks. In our recent paper accepted to the ICML 2021 Workshop on Representation Learning for Finance and E-Commerce Applications, Dynamic Customer Embeddings for Financial Service Applications, we explored methods for learning dynamic representations of user online activity that simplify and improve the use of digital activity data in downstream applications.

Sequential recommendation and representation learning

Sequential recommendation has progressed enormously since the introduction of collaborative filtering and the famous Netflix recommendation challenge, where time effects on recommendations were modeled with fixed functional forms. Recurrent neural networks (RNNs) have since accelerated research on methods that adapt to evolving user behavior.

Among the first methods to take advantage of RNNs were DeepCoevolve, a point process model parameterized by an RNN to capture the mutual influence between users and items over time, and Recurrent Recommender Networks, which use RNNs to update the user and movie representations used to predict ratings.

The evolution of user and movie states shows how predictions depend on which movies a user rated previously.

Recurrent Recommender Network (Wu et al 2017 - https://cseweb.ucsd.edu/classes/fa17/cse291-b/reading/rrn_wsdm2017.pdf)

At each interaction between a user and an item, DeepCoevolve uses RNNs to update the embeddings of both the user and the item.

Toy example of interactions and dynamic user embeddings in DeepCoevolve (Dai et al. 2016 - https://arxiv.org/pdf/1609.03675.pdf)

Over the past few years, other methods have tweaked how recommendation systems incorporate additional context (e.g., device, app, or user characteristics). Most recently, Kumar et al. introduced some important and unique concepts with their framework JODIE. They were the first to propose mutually recursive updates to items and users using a shared RNN. More importantly, they reframed the problem as representation learning on interaction networks, where the main goal is to create user embeddings that can predict the embeddings of items a user will interact with next.

The JODIE framework uses sequential interactions between items and users to project future item embeddings for recommendation.

Illustration of the JODIE recommendation framework (Kumar et al 2019 - https://cs.stanford.edu/~srijan/pubs/jodie-kdd2019.pdf)
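To make the mutually recursive update idea concrete, here is a minimal pure-Python sketch. This is not JODIE's actual architecture; the weight matrices, dimensions, and toy interaction stream below are all illustrative. The key point is that each party's new embedding is computed from the other party's previous embedding plus the time elapsed since its last interaction.

```python
import math
import random

random.seed(0)
DIM = 4  # embedding size (illustrative)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_update(W, own, other, dt):
    """One recurrent step: new embedding from the entity's own state,
    the other party's state, and the elapsed time dt."""
    x = own + other + [dt]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

W_user = rand_matrix(DIM, 2 * DIM + 1)
W_item = rand_matrix(DIM, 2 * DIM + 1)

# Toy interaction stream: (user_id, item_id, timestamp)
events = [(0, 0, 1.0), (1, 0, 2.5), (0, 1, 3.0)]
users = {u: [0.0] * DIM for u in (0, 1)}
items = {i: [0.0] * DIM for i in (0, 1)}
last_seen = {}

for u, i, t in events:
    dt_u = t - last_seen.get(("u", u), t)
    dt_i = t - last_seen.get(("i", i), t)
    # Mutually recursive: each new state reads the other's previous state.
    new_u = rnn_update(W_user, users[u], items[i], dt_u)
    new_i = rnn_update(W_item, items[i], users[u], dt_i)
    users[u], items[i] = new_u, new_i
    last_seen[("u", u)] = last_seen[("i", i)] = t
```

Because the updates are interleaved over one shared event stream, a user's embedding carries a trace of every item it has touched, and vice versa.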

Self-supervised RNN framework for digital customer embeddings

Inspired by JODIE and Spotify’s recommendation framework, we designed a method to learn dynamic representations of Capital One users’ online activity.

We treat each customer’s sequence of click-stream events (e.g., page views or actions), beginning with a login and ending with a logout, as a single digital session and encode each session into an embedding via seq2seq autoencoders.

Page views of each digital session are tokenized and embedded using seq2seq autoencoders.

Embedding of digital sessions via seq2seq autoencoders (Chitsazan et al. 2021 - https://arxiv.org/abs/2106.11880)
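The tokenization step can be sketched as follows. The session contents and token table here are made up, and a simple mean-pool of token vectors stands in for the trained seq2seq autoencoder's encoder, whose final hidden state would serve as the session embedding in practice:

```python
import random

random.seed(1)
EMB_DIM = 4  # token embedding size (illustrative)

# Toy click-stream sessions: each begins with a login and ends with a logout.
sessions = [
    ["login", "view_balance", "view_rewards", "logout"],
    ["login", "pay_bill", "confirm_payment", "logout"],
]

# Build a vocabulary over the page-view / action tokens.
vocab = {tok: idx for idx, tok in enumerate(
    sorted({tok for s in sessions for tok in s}))}

# Stand-in token vectors; a seq2seq autoencoder would learn these.
token_vecs = {idx: [random.gauss(0, 1) for _ in range(EMB_DIM)]
              for idx in vocab.values()}

def encode_session(session):
    """Tokenize a session, then mean-pool its token vectors.
    (A stand-in for the trained seq2seq encoder.)"""
    ids = [vocab[tok] for tok in session]
    return [sum(token_vecs[i][d] for i in ids) / len(ids)
            for d in range(EMB_DIM)]

session_embeddings = [encode_session(s) for s in sessions]
```

The output is one fixed-length vector per session, regardless of how many page views the session contained.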

We jointly model the sequence of embedded customer sessions along with time and financial context in order to fully represent users’ implicit intent and the temporal dynamics of customer behavior.

We can use the customer’s latent representation at any point in this sequence to more effectively predict the intents of the next session, anticipate customer service calls, and identify account takeover. For more details about our methodology, comparisons to previous dynamic recommendation tasks, and results on a variety of downstream applications, check out our paper Dynamic Customer Embeddings for Financial Service Applications!

Our framework consists of LSTMs that model user sequences of sessions, time patterns, and financial context.

Dynamic Customer Embedding framework (Chitsazan et al. 2021 - https://arxiv.org/abs/2106.11880)
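The joint modeling step above can be sketched as a recurrent pass over a customer's history. For brevity, a plain tanh cell stands in for the LSTMs in our framework, and the feature names and values (inter-session gap, financial context) are illustrative:

```python
import math
import random

random.seed(2)
SESS_DIM, CTX_DIM, HID_DIM = 4, 2, 6  # illustrative sizes

W = [[random.uniform(-0.1, 0.1)
      for _ in range(HID_DIM + SESS_DIM + 1 + CTX_DIM)]
     for _ in range(HID_DIM)]

def step(h, sess_emb, dt, fin_ctx):
    """One recurrent step over [previous state, session embedding,
    inter-session time gap, financial context]. A tanh cell stands in
    here for the LSTM used in the framework."""
    x = h + sess_emb + [dt] + fin_ctx
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

# Toy customer history: (session embedding, days since last session,
# hypothetical financial context such as [balance_ratio, paid_recently]).
history = [
    ([0.2, -0.1, 0.4, 0.0], 0.0, [0.3, 1.0]),
    ([0.1, 0.5, -0.2, 0.3], 2.0, [0.3, 0.0]),
]

h = [0.0] * HID_DIM  # dynamic customer embedding
for sess_emb, dt, fin_ctx in history:
    h = step(h, sess_emb, dt, fin_ctx)
```

After each session, `h` is the customer's up-to-date latent representation, ready to feed downstream predictors.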

Application & deployment of dynamic customer embeddings at Capital One

Through dynamic customer embeddings we have shown that a customer’s previous digital activity is representative of digital intent and behavioral preferences, and is predictive of future activity. Accordingly, the first applications at Capital One have been to help customers find relevant servicing messages and insights related to their accounts, and to help Capital One servicing agents select the best digital channels for communicating with our customers.

To support these applications, we have deployed customer embeddings as a batch scoring job that runs multiple times per day. After each batch run, we refresh the representations of existing customers with recent activity and generate representations for newly active customers. Once refreshed, a job is triggered to publish the embeddings to our centralized feature platform, which serves them via an API so that these and other downstream applications can consume the representations on demand.
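The refresh logic amounts to: recompute customers with recent activity, add newly active customers, then publish. A toy sketch with in-memory stand-ins (the store, activity flags, and recompute function below are all hypothetical, not our production pipeline):

```python
import time

# Hypothetical in-memory stand-ins for the feature store and activity log.
feature_store = {}  # customer_id -> (embedding, updated_at)
activity_since_last_run = {"c1": True, "c2": False, "c3": True}

def recompute_embedding(customer_id):
    """Stand-in for running the sequence model over recent sessions."""
    return [0.0]  # a real job would return the model's latent state

def batch_refresh():
    """One batch-scoring run: refresh customers with recent activity,
    create entries for newly active customers, and return the ids that
    would be published to the centralized feature platform."""
    published = []
    for cid, active in activity_since_last_run.items():
        if active or cid not in feature_store:
            feature_store[cid] = (recompute_embedding(cid), time.time())
            published.append(cid)
    return published

refreshed = batch_refresh()
```

On the first run every known customer is materialized; on subsequent runs only customers with fresh activity are recomputed, which keeps each batch cheap.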

Any major change in our digital assets (pages, events, layouts, etc.) or other exogenous event (such as COVID-19) can cause drift in the source data and customer representations. To detect such changes, we deploy a monitoring solution that triggers alerts based on shifts in the distribution of cosine distances between each customer's newer and older representations. With this monitoring, we can detect, alert on, and respond to large shifts in the representations.
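A minimal sketch of this kind of check, assuming a mean-distance statistic and a fixed threshold (both illustrative; any distributional test could be substituted):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def drift_alert(old_embs, new_embs, threshold=0.5):
    """Alert when the mean cosine distance between each customer's old
    and new representation exceeds a threshold."""
    dists = [cosine_distance(old_embs[c], new_embs[c]) for c in old_embs]
    return sum(dists) / len(dists) > threshold

# Toy representations: a stable refresh vs. a large shift.
old = {"c1": [1.0, 0.0], "c2": [0.0, 1.0]}
stable = {"c1": [0.9, 0.1], "c2": [0.1, 0.9]}
shifted = {"c1": [-1.0, 0.0], "c2": [0.0, -1.0]}

alert_stable = drift_alert(old, stable)
alert_shifted = drift_alert(old, shifted)
```

A small rotation of the embedding space barely moves the distance distribution, while a systematic flip pushes it past the threshold and fires the alert.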


We are constantly evolving our modeling framework at Capital One to capture and respond to granular shifts in customer behavior. Representation learning and temporal sequence modeling remain essential building blocks for providing teams across Capital One with meaningful features to build effective, personalized systems for the best customer experience. Check out our paper for more details!



DISCLOSURE STATEMENT: © 2021 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.
