Navigating the Dynamics of Financial Embeddings Over Time
How Graph Representation Learning can capture financial system patterns in a meaningful and robust way
Consider a shopping plaza. There are restaurants, a few big-box stores and national retail chains, a grocery store, local mom-and-pop shops, and several specialty stores. Each of these merchants fulfills one or many possible needs for a customer. These needs are driven by a complex mixture of physical and social realities, and it is these needs that determine how a customer makes their way through the shopping plaza. After all, customers may be looking for winter coats in December or swimwear in July, shopping at one retailer or another depending on the season. The shopping trip of a family that just moved into a new home will look different from their shopping pattern a year earlier, with a wide range of new furniture and home supply stores added to the list of merchants they interact with.
A more timely example: when COVID-19 started affecting our communities, a sharp line was drawn between essential and nonessential businesses, and between online and brick-and-mortar businesses. Many customers started looking for online products and services, and businesses adapted to the new reality by changing how they interact with their customers. As quarantine measures and the accessibility of products and services reshaped customer behavior, new shopping habits were formed.
In reality, everyone’s relationship with their preferred merchants changes depending on the season, social and family factors or the economic climate of the times. In essence, these relationships are mutable in both time and space. While these relationships are unique to individuals, as a financial institution Capital One is interested in its customers’ changing relationships with merchants.
Accounts as Words, Time as Space
As machine learning practitioners, we ask ourselves: how can we better understand these relationships from vast amounts of transactional data spanning multiple points in time?
The series of transactions someone makes as they shop at Restaurant A, Grocery Store B and Gas Station C in a limited amount of time forms a sequence, A->B->C. The hundreds, if not thousands, of customers that shop at some combination of these stores every day create a network of transaction sequences. In this instance, we are not interested in an individual’s relationship with the stores, but rather in the collective interaction patterns with these merchants. From a macroscopic perspective, we hypothesize that there are clearly identifiable trends, and most importantly inflection points, represented in these sequences.
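To make the idea of a sequence concrete, here is a minimal sketch of how per-account merchant sequences might be assembled from raw transactions. The record layout, the `build_sequences` helper, and the 24-hour `max_gap` threshold are illustrative assumptions, not the production pipeline.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def build_sequences(transactions, max_gap=timedelta(hours=24)):
    """Group transactions by account and split them into merchant sequences.

    transactions: iterable of (account_id, merchant_id, timestamp) tuples.
    A new sequence starts whenever the gap to the account's previous
    transaction exceeds max_gap (an assumed, illustrative threshold).
    """
    by_account = defaultdict(list)
    for account, merchant, ts in transactions:
        by_account[account].append((ts, merchant))

    sequences = []
    for events in by_account.values():
        events.sort()  # chronological order within one account
        current = [events[0][1]]
        for (prev_ts, _), (ts, merchant) in zip(events, events[1:]):
            if ts - prev_ts > max_gap:
                sequences.append(current)
                current = []
            current.append(merchant)
        sequences.append(current)
    return sequences

# Hypothetical toy data: acct_1 makes the A->B->C trip from the text.
toy_transactions = [
    ("acct_1", "Restaurant A", datetime(2020, 1, 6, 12, 0)),
    ("acct_1", "Grocery Store B", datetime(2020, 1, 6, 17, 30)),
    ("acct_1", "Gas Station C", datetime(2020, 1, 6, 18, 15)),
    ("acct_2", "Grocery Store B", datetime(2020, 1, 6, 9, 0)),
    ("acct_2", "Gas Station C", datetime(2020, 1, 9, 8, 0)),
]
sequences = build_sequences(toy_transactions)
```

In this toy run, acct_1 yields a single three-merchant sequence, while acct_2's two purchases are three days apart and therefore end up in separate sequences.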
To investigate this hypothesis, we project the aforementioned sequences into a dense latent space in which closeness represents similarity of shopping patterns. These representations, namely embeddings, are generated using a highly scalable shallow neural network known as a skip-gram model. This was covered in a previous Capital One blog post on DeepTrax, which you can read here: https://www.capitalone.com/tech/machine-learning/learning-embeddings-of-financial-graphs/
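As a rough illustration of what a skip-gram model consumes, the sketch below generates (target, context) training pairs from sequences in the usual word2vec style. The `skipgram_pairs` helper and the `window` parameter are assumptions made for illustration. The details of the DeepTrax pipeline are in the linked post and are not reproduced here.

```python
def skipgram_pairs(sequences, window=2):
    """Return (target, context) pairs for every item in every sequence.

    Each item is paired with the items at most `window` positions away
    from it in the same sequence.
    """
    pairs = []
    for seq in sequences:
        for i, target in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((target, seq[j]))
    return pairs

# The A->B->C sequence from the text, with a window of one step.
pairs = skipgram_pairs([["A", "B", "C"]], window=1)
```

A shallow network trained to predict the context item from the target item then pulls items that share contexts closer together in the embedding space.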
One way to view these embeddings is as an efficient mathematical summarization of the (latent) state of an entity relative to all others in that system. For our financial transactions, this means each embedding represents a merchant’s or an account’s state. Of course, people and economies change all the time, which affects that state. It therefore becomes imperative to update our understanding of their latent state in a dynamic setting. There are a few ways to do this. In recent work that we presented in a spotlight talk at the ICML Graph Representation Learning (GRL+) workshop, and which we will also be sharing at ICAIF (International Conference on AI in Finance) in October, we showed how we train dynamic embeddings of financial transactions and the meta-analysis steps that we perform to extract distinguishable real-world trends from them.
In our research, we seek to answer the following questions:
- How do embeddings change over time?
- Do they change seasonally?
- Do they change in response to large exogenous shocks such as COVID-19?
- How does one measure if this change is meaningful or not?
Measuring and Interpreting Changes
There is an important difference between meaningful changes and the random rotation of the embedding space inherent to a noisy training process. As we discussed in an earlier blog post, in graphs generated from transactional data, new nodes are continuously coming in while other nodes are becoming inactive at a slower but still consistent rate. New edges are formed while past connections can be eliminated, and the full set of timesteps is not available beforehand. To address these issues we use warm start training, meaning we initialize the next timestep’s model with the weights from the previous one and update only the nodes that obtained new edges. Additionally, we remove “inactive” nodes that have not received new transactions in over 12 months. In this continuously updating setting, we expect the latent representations to update smoothly but meaningfully. How do we measure this representation shift, and most importantly, how does it relate to real semantic shift?
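The warm start and pruning steps can be sketched as follows. This is a simplified illustration under stated assumptions: embeddings are plain dictionaries of lists, months are encoded as integer indices, and the helper names (`prune_inactive`, `warm_start`) and the random initialization of new nodes are hypothetical. The actual training loop is not shown.

```python
import random

def prune_inactive(last_seen, current_month, max_idle_months=12):
    """Keep nodes whose latest transaction is at most max_idle_months old.

    Months are integer indices (e.g. year * 12 + month), an assumed encoding.
    """
    return {node for node, month in last_seen.items()
            if current_month - month <= max_idle_months}

def warm_start(prev_embeddings, active, touched, dim, seed=0):
    """Build initial embeddings for the next timestep.

    Nodes seen before start from their previous vectors; new nodes get a
    small random vector (an assumed initialization). Only active nodes that
    gained new edges ("touched") are marked as trainable.
    """
    rng = random.Random(seed)
    init = {}
    for node in sorted(active):
        if node in prev_embeddings:
            init[node] = list(prev_embeddings[node])
        else:
            init[node] = [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
    trainable = active & touched
    return init, trainable

# Hypothetical toy state: m2 has been idle for 14 months, m4 is new.
prev = {"m1": [0.1, 0.2], "m2": [0.3, 0.4], "m3": [0.5, 0.6]}
last_seen = {"m1": 24, "m2": 10, "m3": 20, "m4": 24}
active = prune_inactive(last_seen, current_month=24)
init, trainable = warm_start(prev, active, touched={"m1", "m4"}, dim=2)
```

Here m2 is dropped, m3 keeps its previous vector without being updated, and only m1 and m4 would receive gradient updates in the next timestep.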
Consider some specific merchant or account, which we will call the seed. The seed’s reconstructed 2-hop neighborhood consists of its top k most similar accounts or merchants, measured by cosine distance in the embedding space, along with those neighbors’ own top k most similar entities. Neighborhoods can have different sizes and, most importantly, can change over time.
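The neighborhood reconstruction described above might look like the following sketch. The toy two-dimensional vectors, the helper names, and the choice to exclude the seed itself from the returned set are assumptions for illustration. A real embedding space would call for an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity; higher similarity means smaller cosine distance."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(embeddings, node, k):
    """Return the k nodes most similar to `node`, excluding the node itself."""
    scored = [(cosine_similarity(embeddings[node], vec), other)
              for other, vec in embeddings.items() if other != node]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by name
    return [other for _, other in scored[:k]]

def two_hop_neighborhood(embeddings, seed, k):
    """Top-k neighbors of the seed plus each neighbor's own top-k neighbors."""
    first_hop = top_k(embeddings, seed, k)
    neighborhood = set(first_hop)
    for neighbor in first_hop:
        neighborhood.update(top_k(embeddings, neighbor, k))
    neighborhood.discard(seed)  # assumed convention: the seed is not its own neighbor
    return neighborhood

# Hypothetical 2-D embeddings for a handful of entities.
toy_embeddings = {
    "seed": [1.0, 0.0],
    "a": [0.9, 0.1],
    "b": [0.6, 0.8],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
neighborhood = two_hop_neighborhood(toy_embeddings, "seed", k=2)
```

In this toy space, c is not among the seed's two nearest entities but enters the neighborhood through b, while d, which points in the opposite direction, stays out.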
Figure 1 shows an example of a reconstructed 2-hop neighborhood calculated from two different snapshots of the embedding space. The seed node is depicted in the center, with a larger size. The further away a node is from the seed node and the less central it is, the smaller it is drawn in this representation. The colors represent modularity-based communities inside the neighborhood. As new nodes get added (right figure), some nodes change community membership while others form more connections with one another and end up splitting apart from their original community.
By looking at neighborhood composition and the similarities between the same embeddings across time, we can get a clearer picture of the shifting embedding space and understand if there are global trends driving the representation change, versus what may be an artifact of random wiggling in the latent space. This will help us answer whether a node changes in accordance with its neighbors or whether nodes change drastically and potentially move to another neighborhood of the graph.
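Two simple measurements in this spirit are sketched below: the cosine distance between a node's embedding in two snapshots, and the Jaccard overlap between its neighborhoods at those two times. The function names and toy values are assumptions. The comparison across snapshots is only meaningful when the two spaces are aligned, which is what the warm start training described earlier is meant to provide.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def representation_shift(snapshot_t0, snapshot_t1):
    """Cosine distance of each node to its own earlier embedding.

    Only nodes present in both snapshots are compared.
    """
    shared = snapshot_t0.keys() & snapshot_t1.keys()
    return {node: cosine_distance(snapshot_t0[node], snapshot_t1[node])
            for node in shared}

def neighborhood_overlap(neighbors_t0, neighbors_t1):
    """Jaccard overlap between two neighborhood sets (1.0 means unchanged)."""
    union = neighbors_t0 | neighbors_t1
    return len(neighbors_t0 & neighbors_t1) / len(union) if union else 1.0

# Hypothetical snapshots: x only changes length, y rotates by 90 degrees.
t0 = {"x": [1.0, 0.0], "y": [0.0, 1.0]}
t1 = {"x": [2.0, 0.0], "y": [1.0, 0.0], "z": [1.0, 1.0]}
shift = representation_shift(t0, t1)
overlap = neighborhood_overlap({"a", "b", "c"}, {"b", "c", "d"})
```

A node with a large shift but a high neighborhood overlap moved together with its neighbors, whereas a large shift with low overlap suggests the node migrated to a different part of the graph.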
For merchants specifically, another way to aggregate changes to the embeddings over time is by looking at the type of merchant. Oftentimes merchants from the same industry occupy their own neighborhoods; that is, they move together. Airlines and travel industry merchants, for example, form one such segment of the full set of merchants.
Shifts and Disturbances Before and During COVID
Recent events related to the COVID-19 pandemic provide an interesting use case for measuring semantic shift. By comparing it with representation shift, we can draw conclusions about the effectiveness of our proposed framework and gain insights into the pandemic’s effects on merchant-consumer interaction.
First, when grouping by merchant category, we found that the average cosine distance shift over time differed between categories, as shown in the line chart in Figure 2. Even though peaks and valleys happen around the same timesteps, mostly due to global economic trends, the actual deltas (that is, the sizes of the shifts) differ per category. Shifts are often also related to seasonality, for instance towards the holiday season at the end of each year. When combined with the bar graph in Figure 3 below, which shows the number of merchants that had their maximum shift in each month, we see that periods of large cosine shift corresponded to major events, specifically the upheaval of the early stages of COVID-19 at the right end of the chart.
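The two aggregations behind these charts can be sketched as follows: the mean shift per merchant category and month, and a count of how many merchants had their largest shift in each month. All merchant names, categories, and shift values below are made up for illustration and are not the results shown in the figures.

```python
from collections import Counter, defaultdict

def mean_shift_by_category(shifts, category_of):
    """shifts: {merchant: {month: cosine shift}} -> {category: {month: mean shift}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for merchant, series in shifts.items():
        for month, value in series.items():
            buckets[category_of[merchant]][month].append(value)
    return {category: {month: sum(vals) / len(vals) for month, vals in months.items()}
            for category, months in buckets.items()}

def max_shift_month_counts(shifts):
    """Count, per month, the merchants whose largest shift fell in that month."""
    return Counter(max(series, key=series.get)
                   for series in shifts.values() if series)

# Hypothetical month-over-month cosine shifts per merchant.
toy_shifts = {
    "airline_1": {"2020-02": 0.125, "2020-03": 0.5},
    "airline_2": {"2020-02": 0.25, "2020-03": 0.75},
    "grocer_1": {"2020-02": 0.5, "2020-03": 0.125},
}
toy_categories = {"airline_1": "travel", "airline_2": "travel", "grocer_1": "grocery"}

category_means = mean_shift_by_category(toy_shifts, toy_categories)
peak_months = max_shift_month_counts(toy_shifts)
```

Plotting `category_means` per category over time gives a line chart of the kind described for Figure 2, and `peak_months` corresponds to the bar chart of maximum-shift counts.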
In Figure 4 below, we move to a microscopic view of neighborhood change for individual seed nodes. The seed node’s trajectory in cosine distance between timesteps is depicted in black, while its neighbors at the beginning (2017-12) are depicted in red and its neighbors at a later timestep (2020-03) are depicted in blue.
Two interesting observations can be made here:
- Neighbors often follow the same trends as the seed node (e.g. J Crew and Banana Republic).
- With the beginning of the pandemic, neighbors are mostly based on similarity of the services (e.g. Equinox and SoulCycle) and less on co-location of the businesses (e.g. Walmart and nail salons). Since a large portion of transactions were happening purely online, the effect of location in the transactional patterns became less prevalent.
Conclusion and Future Work
Ultimately, exploring the way our embeddings change over time gives us insight into the actual change in collective behavior of our customers and merchants. In a d-dimensional latent space, we need one more axis to account for time, and it is the projection onto this axis that will capture the complexity of transaction patterns and the dynamics that are derived from them. The ebb and flow of the embedding space, the speed of change in embeddings, and the aggregated statistics at each timestep give us a deeper understanding of our transactions that we didn’t have before.
DISCLOSURE STATEMENT: © 2020 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.