4 common data management challenges in the cloud

The cloud offers many benefits including instantly scalable capacity, cost efficiencies and access to infrastructure. But it can also present challenges that can limit the value of data and require rethinking how businesses manage data.

From our experience at Capital One and from our customer conversations, we found that there are four data management challenges that businesses commonly face in the cloud. Our learnings and the practical solutions we’ve implemented have helped us address these challenges to operate more efficiently at scale.

4 cloud data management challenges

When we moved our data workloads to the cloud, the volume and complexity of data increased, which created bottlenecks for our business. We knew we had to manage our data differently in order to scale in the cloud. To do so, we evolved our data management approach, adopted Snowflake as our cloud data platform and built Slingshot, a data management solution that helped us address four key challenges as we scaled:

  • Inadequate monitoring, insights and alerting
  • One-size-fits-all warehouse management
  • Centralized governance
  • Inefficient queries

Challenge 1: Inadequate monitoring, insights and alerting

Proper visibility is essential for understanding your cost, performance and data usage in the cloud. These insights provide critical data points for operating efficiently, allocating resources effectively and making strategic business decisions. Without that visibility, inefficient data consumption can lead to excessive or unnecessary costs.

Many organizations use native monitoring tools in the cloud to achieve this. For example, Snowflake has a feature called resource monitors that tracks the credits consumed by virtual warehouses and the cloud services that support them, and sends alerts to help prevent overspending. We expanded on this by giving our lines of business even greater visibility into their costs and performance with Slingshot. Comprehensive dashboards and proactive alerts allow our teams to stay up to date on credit usage, identify cost spikes and monitor warehouse performance in Snowflake. Plus, custom tagging makes it possible to break down warehouse costs by categories like line of business, environment or account for more granular analysis.
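To make the idea of tag-based cost breakdowns concrete, here is a minimal Python sketch. It is not how Slingshot or Snowflake implements tagging; the warehouse names, tag keys, credit figures and budgets are all hypothetical and exist only for illustration. It rolls up credit usage by a chosen tag and flags categories that exceed a budget.

```python
from collections import defaultdict

# Hypothetical usage records: (warehouse name, credits used). Made-up values.
USAGE = [
    ("ETL_WH", 120.0),
    ("BI_WH", 45.5),
    ("DS_WH", 80.0),
    ("ETL_DEV_WH", 10.5),
]

# Hypothetical custom tags attached to each warehouse.
TAGS = {
    "ETL_WH": {"line_of_business": "card", "environment": "prod"},
    "BI_WH": {"line_of_business": "bank", "environment": "prod"},
    "DS_WH": {"environment": "prod"},  # no line_of_business tag yet
    "ETL_DEV_WH": {"line_of_business": "card", "environment": "dev"},
}


def credits_by_tag(usage, tags, tag_key):
    """Sum credits per value of tag_key; untagged warehouses are grouped together."""
    totals = defaultdict(float)
    for warehouse, credits in usage:
        category = tags.get(warehouse, {}).get(tag_key, "untagged")
        totals[category] += credits
    return dict(totals)


def over_budget(totals, budgets):
    """Return the categories whose credits exceed their budget, sorted by name.

    Categories without a budget entry are never flagged.
    """
    return sorted(
        category
        for category, used in totals.items()
        if used > budgets.get(category, float("inf"))
    )


if __name__ == "__main__":
    by_lob = credits_by_tag(USAGE, TAGS, "line_of_business")
    print(by_lob)
    print(over_budget(by_lob, {"card": 100.0, "bank": 50.0}))
```

Grouping untagged warehouses under their own label keeps spend that nobody has claimed visible instead of silently dropping it from the breakdown.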

Read more about how we improved the monitoring, insights, and alerting for our data: Building observability for Snowflake data applications 

Challenge 2: One-size-fits-all warehouse management

Many data management strategies assume one warehouse size can be used for all types of workloads. But workloads change constantly depending on the time of day and day of the week. Warehouses in Snowflake are configurable, but static. To address this challenge, we built dynamic warehouse scheduling into Slingshot, which proactively right-sizes warehouses using custom scaling policies and frees data teams to work on generating business value. Slingshot's right-sizing recommendations feature also notifies users of new schedule recommendations so warehouses are continuously optimized for cost and performance.
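As a rough illustration of schedule-based right-sizing, the Python sketch below picks a warehouse size from the day of the week and hour, then builds the corresponding Snowflake `ALTER WAREHOUSE ... SET WAREHOUSE_SIZE` statement. This is an assumption-laden sketch, not Slingshot's scheduling logic: the `ScheduleRule` type, the rules themselves and the default size are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduleRule:
    """Hypothetical rule: use `size` on `days` from start_hour up to end_hour."""
    days: frozenset   # weekday numbers, Monday = 0
    start_hour: int   # inclusive
    end_hour: int     # exclusive
    size: str         # Snowflake warehouse size, e.g. 'LARGE'


WEEKDAYS = frozenset(range(5))
WEEKEND = frozenset({5, 6})
DEFAULT_SIZE = "XSMALL"

# Illustrative schedule; real policies would come from observed workload patterns.
SCHEDULE = [
    ScheduleRule(WEEKDAYS, 8, 18, "LARGE"),   # business-hours reporting
    ScheduleRule(WEEKDAYS, 0, 4, "MEDIUM"),   # nightly batch loads
    ScheduleRule(WEEKEND, 0, 24, "SMALL"),    # light weekend usage
]


def size_for(moment, schedule=SCHEDULE, default=DEFAULT_SIZE):
    """Return the size of the first rule matching `moment`, else the default."""
    for rule in schedule:
        in_window = rule.start_hour <= moment.hour < rule.end_hour
        if moment.weekday() in rule.days and in_window:
            return rule.size
    return default


def resize_statement(warehouse, moment):
    """Build the SQL that would apply the scheduled size to `warehouse`."""
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size_for(moment)}'"


if __name__ == "__main__":
    print(resize_statement("ETL_WH", datetime(2024, 1, 8, 9)))  # a Monday morning
```

The function only builds the statement. Running it on a timer against a live account, and deciding who may change the schedule, are left out of this sketch.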

Figure: Slingshot warehouse recommendations UI

Capital One realized a 27% projected Snowflake cost savings* using this tool.

Challenge 3: Centralized governance

Many organizations migrate to the cloud and continue to operate with traditional governance policies in place. While a centralized team usually vets and manages all aspects of data, this approach can create bottlenecks due to the sheer volume and increased complexity of data. To reduce these bottlenecks, it’s important to strike the right balance between good governance and the speed necessary to scale a business.

Our answer was to move from a centralized data team managing all data governance to a more federated data management model. This gives teams the flexibility to create and use data at their own pace, while leveraging common enterprise platforms and systems to get the benefits of scale.

Figure: Centralized vs. federated data management

These teams work with centralized tooling and central policies that we built into Slingshot. Slingshot enables our teams to operate in a self-service manner with central policies while still adhering to a change management workflow and auditability.

Learn more about how we found a balance between governance and enablement: How to unleash your data while managing risk and costs

Challenge 4: Inefficient queries

Lastly, inefficient queries are one of the biggest drivers of resource waste and cost spikes in the cloud. We’ve seen five common types of queries that most negatively impact performance:

  • Single row inserts
  • Select * from a large table
  • Cartesian product joins
  • Deep nested views
  • Spilling to disk

Identifying these queries in advance means you won’t waste a lot of resources running them on your systems. Slingshot’s Query Advisor analyzes queries, identifies inefficiencies and makes suggestions for how to optimize query text.
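To show what catching such patterns before execution might look like, here is a small Python sketch that scans query text for three of the patterns listed above. It is a simplified, regex-based heuristic and not how Slingshot's Query Advisor works; the function name and finding labels are hypothetical. Deep nested views and spilling to disk cannot be detected from query text alone, so they are out of scope here.

```python
import re

# Heuristic patterns; a real analyzer would parse the SQL instead of using regexes.
SELECT_STAR = re.compile(r"\bselect\s+\*", re.IGNORECASE)
CROSS_JOIN = re.compile(r"\bcross\s+join\b", re.IGNORECASE)
# An INSERT whose VALUES clause holds exactly one parenthesized row of literals.
SINGLE_ROW_INSERT = re.compile(
    r"^\s*insert\s+into\b.*\bvalues\s*\([^()]*\)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def lint_query(sql):
    """Return a list of hypothetical finding labels for the given query text."""
    findings = []
    if SELECT_STAR.search(sql):
        findings.append("select_star")
    if CROSS_JOIN.search(sql):
        findings.append("cartesian_join")
    if SINGLE_ROW_INSERT.match(sql):
        findings.append("single_row_insert")
    return findings


if __name__ == "__main__":
    for query in (
        "SELECT * FROM orders",
        "select o.id from orders o cross join customers c",
        "INSERT INTO t (a, b) VALUES (1, 2);",
    ):
        print(query, "->", lint_query(query))
```

These checks only catch explicit patterns. For example, a join that omits its condition produces a Cartesian product without ever using the words `CROSS JOIN`, which is why text matching is best treated as a first pass.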

Figure: Slingshot Query Advisor UI

Additionally, the dashboards we built for better monitoring and insights give us visibility into our costliest queries and who is executing them.

Overcoming data management challenges to scale in the cloud

Enterprises continue to scale their operations in the cloud with the promise of instant scalability, reduced operational costs and business agility. But governance, cost management and adequate monitoring are challenges that must be addressed in order to operate efficiently and scale in the cloud. We built Slingshot to help solve these challenges while empowering our teams to spend wisely, forecast confidently and make smart decisions. If you’re interested in how Slingshot can help your business overcome data management challenges, reach out to get a demo from one of our experts today.

*This figure is based on internal use of similar Slingshot functionality and may not be indicative of future results for your business.

Rahul Mode, Senior Director of Solutions Architecture at Capital One Software

Rahul Mode is the Senior Director of Solutions Architecture at Capital One Software. In this capacity, his primary responsibility is to ensure that customers derive the most value from Capital One Slingshot. Previously, Rahul spent 8 years at Amazon, leading a Solutions Architecture team, working with enterprise customers in the Cloud Tech and Digital Payments space.
