Three Tips for Harnessing Snowflake’s Data Cloud

How Capital One optimized Snowflake for real-time data analysis at scale


Since Capital One’s founding, data has been at the heart of the company. We believe in the power of data to drive insights and empower people to deliver real-time solutions to our millions of customers. Of course, the amount of data we analyze has skyrocketed over the last thirty years, making it more difficult to share data across the company and derive insights in real time. That’s where Snowflake comes in with its cloud data platform.

Snowflake separated data storage from compute for relational data warehouses — and for customers like Capital One, that means our hardware no longer limits us. Instead of racking up technical debt, we can focus on our data and what we do best: build personalized customer experiences that transform people’s relationship with their money.

Our unique journey with Snowflake

Capital One is the first U.S. bank to exit its on-premises data centers and go all in on the cloud, and we’ve written a great deal about our cloud journey and what we learned. We exited our data centers because we wanted to avoid being burdened by legacy technologies, technical debt, and silos.

As we worked to modernize our data operations in the cloud, we adopted Snowflake to enable our more than 6,000 analysts to run millions of queries with no degradation in performance. We needed performance that could scale infinitely and instantly for any workload, and that would allow multiple lines of business to seamlessly share data with proper fine-grained access control.

With Snowflake, multiple analysts can access the same data without affecting each other’s performance. In concrete terms, Snowflake allows our credit card team to make intensive queries without affecting the performance of other teams who are making queries on that same data. At the same time, we can have ETL jobs running different compute tasks on the same data without impacting anyone else.

Snowflake is so flexible and efficient that you can quickly go from “data starved” to “data drunk.” To avoid that data avalanche and its associated costs, we worked to put some controls in place before our users migrated to Snowflake. For example, users cannot select a larger cluster than their workload requires, and they cannot run workloads in a manner that never allows the Snowflake warehouse (its compute) to suspend.
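The two controls described above can be sketched as a simple policy check that runs before a warehouse is provisioned. This is only an illustrative sketch, not Capital One’s actual tooling: the `WarehouseRequest` type, the `validate_request` function, and the default limits are hypothetical. The one Snowflake fact it relies on is that a warehouse’s auto-suspend setting is expressed in seconds, and that a value of 0 or NULL means the warehouse never suspends.

```python
from dataclasses import dataclass
from typing import List, Optional

# A subset of Snowflake warehouse sizes in ascending order (illustration only).
SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"]


@dataclass
class WarehouseRequest:
    """Hypothetical self-service request for a new warehouse."""
    name: str
    size: str
    auto_suspend_seconds: Optional[int]  # None or 0 means "never suspend"


def validate_request(
    req: WarehouseRequest,
    max_size: str = "MEDIUM",      # assumed per-team cap, not a real default
    max_auto_suspend: int = 600,   # assumed upper bound on idle time, in seconds
) -> List[str]:
    """Return a list of policy violations; an empty list means the request passes."""
    problems = []

    # Control 1: the requested size must not exceed what the workload is allowed.
    if req.size not in SIZES:
        problems.append(f"unknown size {req.size!r}")
    elif SIZES.index(req.size) > SIZES.index(max_size):
        problems.append(f"size {req.size} exceeds the allowed maximum {max_size}")

    # Control 2: the warehouse must be able to suspend when idle.
    if not req.auto_suspend_seconds:
        problems.append("auto-suspend must be enabled")
    elif req.auto_suspend_seconds > max_auto_suspend:
        problems.append(
            f"auto-suspend of {req.auto_suspend_seconds}s exceeds {max_auto_suspend}s"
        )

    return problems


if __name__ == "__main__":
    print(validate_request(WarehouseRequest("etl_wh", "SMALL", 300)))
    print(validate_request(WarehouseRequest("adhoc_wh", "XLARGE", 0)))
```

Returning a list of violations rather than raising on the first one lets a portal show the requester everything that needs fixing in a single pass.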

Also, as a technology company in financial services, we operate in a regulated environment. Our model is unique, but our journey with Snowflake applies to any company in a regulated industry and, in many ways, to any company that must get value from its data.

To generate the most value, organizations need to integrate tools like Snowflake thoughtfully and, at times, creatively. We figured out how to take advantage of Snowflake’s speed and flexibility — while providing the kind of traceability a heavily regulated company like ours requires. Also, being a bank, we understand a thing or two about budgets. So we devised a way to ensure that usage levels were reasonable and on budget.

Best practices we’ve found for using Snowflake

1. Create ways to streamline onboarding and develop processes and solutions.

To provision and manage compute or storage resources, Capital One created an online self-service portal that equips teams with the resources they need. But our tools also fit into existing processes and organizational structures to control costs and ensure best practices are followed.

2. Ensure you track and optimize resources to control cost.

With Snowflake, your company unlocks access to data, and the change in data flow is like going from a garden hose to a fire hose. It’s important to manage and track usage, as costs can rise due to faulty configurations or inefficient queries. While it’s possible to centralize Snowflake access and provisioning through a department head, that method can reintroduce the bottlenecks you were trying to get rid of when you opted for Snowflake in the first place.

Capital One developed a dashboard interface that puts performance and cost management into the hands of key decision-makers — without slowing down the overall process. It generates alerts when there is a sudden increase in cost. It also automatically recommends a way to remediate. In short, you find out right away if something should go wrong.
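To make the idea of alerting on a sudden increase in cost concrete, here is a minimal sketch that compares the latest day’s credit usage against a trailing average. It is not a description of the dashboard itself: the function name `detect_cost_spike`, the seven-day window, and the 1.5x threshold are all assumptions chosen for illustration, and the input is simply a list of daily credit totals from whatever usage data you have.

```python
from statistics import mean
from typing import Dict, List, Optional


def detect_cost_spike(
    daily_credits: List[float],
    window: int = 7,      # assumed number of prior days used as the baseline
    ratio: float = 1.5,   # assumed multiple of the baseline that counts as a spike
) -> Optional[Dict[str, float]]:
    """Return alert details if the latest day exceeds ratio * trailing average."""
    # Not enough history to form a baseline yet.
    if len(daily_credits) < window + 1:
        return None

    # Baseline is the average of the `window` days before the latest one.
    baseline = mean(daily_credits[-(window + 1):-1])
    latest = daily_credits[-1]

    if baseline > 0 and latest > ratio * baseline:
        return {"latest": latest, "baseline": baseline, "ratio": latest / baseline}
    return None


if __name__ == "__main__":
    steady = [10.0] * 8
    spiky = [10.0] * 7 + [30.0]
    print(detect_cost_spike(steady))
    print(detect_cost_spike(spiky))
```

A relative threshold like this adapts to each team’s normal usage level, which a single fixed budget number would not.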

3. Govern securely and transparently.

As data becomes pervasive, ensuring it’s being managed responsibly grows increasingly critical. As a heavily regulated company, Capital One has built a traceability solution into its Snowflake system that enables approval workflows and data logging to support data remediation and retention use cases.

In Conclusion

At Capital One, we’re believers in Snowflake because it enables us to harness data and put it to work. But as with any technology, corporations must take a 360-degree look at what’s required to integrate a solution. Technology on its own is a resource, but as our use case demonstrates, we must also think creatively.

Learn more about how we’re leveraging data at scale with Snowflake at www.CapitalOne.com/Snowflake-Summit.


Salim Syed, Senior Director, Capital One Enterprise Shared Services

Salim Syed is a Senior Director for Capital One’s Enterprise Shared Services. He led Capital One’s data warehouse migration to AWS and is a specialist in deploying Snowflake to a large enterprise. Salim’s expertise lies in developing Big Data (Lake) and Data Warehouse strategy on the public cloud. He leads an organization of more than 100 data engineers, support engineers, DBAs and full stack developers in driving enterprise data lake, data warehouse, data management and visualization platform services. Salim has more than 25 years of experience in the data ecosystem. His career started in data engineering where he built data pipelines and then moved into maintenance and administration of large database servers using multi-tier replication architecture in various remote locations. He then worked at CodeRyte as a database architect and at 3M Health Information Systems as an enterprise data architect. Salim has been at Capital One for the past six years. He has a bachelor’s degree in math and computer science from Lewis & Clark College and a master’s degree from George Washington University. Salim is a passionate leader and uses his positive attitude, integrity and transparency to inspire others to deliver their best work and succeed. In his free time, he likes to play tennis, hike and go on the road with his daughter’s soccer team.


DISCLOSURE STATEMENT: © 2021 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.
