Layered data security: Balancing security and utility

Why modern data protection requires tokenization and granular masking.

Unlocking the next frontier in data security

With a mission to change banking for good, Capital One has always operated at the intersection of cutting-edge technology and finance.

As an early adopter of cloud-native architecture within our industry, we were faced with two non-negotiable mandates: uncompromising data security and operational agility at massive scale. As we began rearchitecting our data infrastructure for migration to the cloud, Capital One made the strategic decision to tokenize our sensitive data.

To handle the scale and complexity at which Capital One operates without compromising on data security, we developed an in-house tokenization engine designed to handle billions of operations per month. Purpose-built to be as performant as it is secure, the engine strengthened our security posture by protecting our most sensitive fields across operational and analytical data stores.

Balancing security and utility: The last mile problem

As enterprises continue to navigate evolving regulatory requirements, traditional data security practices create friction. Overly restrictive approaches can unintentionally silo critical data at a time when generative AI and digital transformation demand unprecedented data utility and access. This dynamic creates a new, more nuanced challenge: How do we balance foundational security requirements with the practical, day-to-day needs of our business?

The consequences of failing to strike the right balance are significant. When security protocols become a barrier to strategic innovation and operational agility, they can impede key functions, stall business intelligence and hinder customer support. Analytics teams may not need to see a single byte of plaintext data to conduct business intelligence operations or train an AI model, but customer service representatives may need to verify the last four digits of a Social Security number (SSN).

This is the new "last mile" problem of data protection: security versus utility.

Forcing a binary choice between total exposure and full de-identification creates friction and risk. A truly modern data security strategy isn't about choosing one; it's about layering both. Understanding when to use tokenization and when to layer in dynamic masking is key to unlocking the full utility of enterprise data without compromising security.

The foundation: Databolt tokenization

Inspired by our own data transformation journey, Capital One Software developed Databolt. Built for commercial use, it provides foundational data security by de-identifying sensitive data at rest, in transit and in use.

When data is tokenized, it is transformed into a fully de-identified, non-sensitive string (the token). This token can then be stored and used in place of the sensitive plaintext, ensuring the original raw data is not exposed in your systems.
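
To make the mechanics concrete, here is a minimal Python sketch of vault-style, format-preserving tokenization. It is illustrative only: the `tokenize` and `detokenize` functions and the in-memory vault are assumptions for this example, not Databolt's implementation, which relies on hardened cryptographic infrastructure rather than a lookup dictionary.

```python
import secrets
import string

_vault: dict[str, str] = {}   # token -> plaintext (the "vault"); illustrative only
_issued: dict[str, str] = {}  # plaintext -> token, so repeat inputs map consistently

def tokenize(ssn: str) -> str:
    """Swap an SSN like '123-45-6789' for a format-preserving, non-sensitive token."""
    if ssn in _issued:
        return _issued[ssn]  # consistent tokens preserve joins and data linkage
    alphabet = string.ascii_letters + string.digits
    # Mirror the SSN's 3-2-4 shape so downstream format checks still pass.
    token = "-".join(
        "".join(secrets.choice(alphabet) for _ in range(n)) for n in (3, 2, 4)
    )
    _vault[token] = ssn
    _issued[ssn] = token
    return token

def detokenize(token: str) -> str:
    """Privileged reverse lookup; in production this path is tightly gated and logged."""
    return _vault[token]

print(tokenize("123-45-6789"))  # e.g. 'AbC-12-DeFG' (random per run)
```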

The flexibility layer: Dynamic masking for granular visibility

But what about operational use cases that require access to at least a portion of the raw data? This is where Databolt’s dynamic masking capabilities come into play. Dynamic masking builds upon tokenization and provides a critical balance between security and usability by allowing for partial visibility of sensitive data on read.

With dynamic masking, the data remains fully tokenized and secure at rest, but when an authorized user requests to view it, a predefined mask is applied securely on your server. This design prevents raw sensitive data from leaving your data store while in transit to downstream applications and helps ensure that individual users only have access to the precise data they need to perform their job.
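
A sketch of that read path, building on the `tokenize`/`detokenize` example above: the role names and the `MASK_TEMPLATES` policy are hypothetical stand-ins, not Databolt's API, but they show the pattern described here, where detokenization and masking happen on the server and only the masked value is returned.

```python
# Hypothetical role-to-mask policy: '#' reveals the plaintext character at that
# position; any other template character is substituted in its place.
MASK_TEMPLATES = {
    "auditor": "***-**-****",  # format-preserved, fully redacted
    "support": "***-**-####",  # last four digits visible for identity checks
}

def apply_mask(plaintext: str, template: str) -> str:
    """Apply a same-length template mask on read."""
    return "".join(p if t == "#" else t for p, t in zip(plaintext, template))

def read_field(role: str, token: str) -> str:
    """Server-side read: raw plaintext never leaves this function unmasked."""
    template = MASK_TEMPLATES.get(role)
    if template is None:
        return token  # roles without a mask entry see only the token
    return apply_mask(detokenize(token), template)

token = tokenize("123-45-6789")
print(read_field("support", token))  # ***-**-6789
print(read_field("auditor", token))  # ***-**-****
print(read_field("analyst", token))  # the token itself, e.g. 'AbC-12-DeFG'
```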

By implementing dynamic masking alongside tokenization, organizations can avoid the “all-or-nothing” trap. This multi-layered approach maintains high security standards while granting granular role-based access to partial data visibility.

Operationalizing the Principle of Least Privilege

Effectively implementing a multi-layered approach to data security requires more than just the right technology; it demands a clear governance framework rooted in the Principle of Least Privilege. By pairing tokenization with Databolt’s dynamic, server-side masking, you create a mechanism to enforce this principle. The challenge then shifts to precisely defining which roles require access to which pieces of data, and what level of visibility each role needs to execute business operations effectively.

To help navigate these decisions, we’ve shared some example guidelines and best practices below. By default, we recommend all sensitive data be tokenized at ingestion and stored, transported and processed in this tokenized form. The strategic decision then becomes: What level of data visibility meets the specific access requirement of the user or application?


Display fully redacted (format-preserved)

  • Core function: Validation & format check only
  • Example data: ***-**-****
  • When to use: For automated systems or auditors who need to verify the presence and format of the data without seeing any of the sensitive value. Examples might include a downstream process checking whether an SSN field is populated and correctly structured without needing the value itself.

Display tokenized data

  • Core function: Highest security & data linkage
  • Example data: AbC-12-DeFG
  • When to use: For applications or users that require no knowledge of the plaintext. Ideal for internal processing, analytics, data linkage or reducing regulatory audit scope. As a best practice, Capital One Software recommends tokenizing all data in use, in transit and at rest.

Display partial plaintext (dynamic masking)

  • Core function: Balancing usability & security
  • Example data: ***-**-6789
  • When to use: For users who need to confirm a portion of the data; a template mask is applied on read. Examples might include customer support agents verifying identity using the last four digits of an SSN or account number.

Display full plaintext (detokenized)

  • Core function: Mandatory operational utility
  • Example data: 123-45-6789
  • When to use: For highly privileged users who must see all of the data to fulfill their role. Access should be highly restricted and logged. Examples might include fraud investigations and mandatory compliance reviews.
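
These guidelines can be encoded as a declarative access policy in which every role resolves to exactly one visibility level and anything unmapped defaults to tokenized. The Python sketch below is a hypothetical illustration of that least-privilege default (the field and role names are invented for the example), not Databolt configuration.

```python
from enum import Enum

class Visibility(Enum):
    REDACTED = "fully redacted"    # ***-**-****
    TOKENIZED = "tokenized"        # AbC-12-DeFG
    PARTIAL = "partial plaintext"  # ***-**-6789
    FULL = "full plaintext"        # 123-45-6789; restricted and logged

# Hypothetical (field, role) -> visibility policy mirroring the guidelines above.
POLICY = {
    ("ssn", "format_auditor"): Visibility.REDACTED,
    ("ssn", "analytics"): Visibility.TOKENIZED,
    ("ssn", "support_agent"): Visibility.PARTIAL,
    ("ssn", "fraud_investigator"): Visibility.FULL,
}

def resolve_visibility(field: str, role: str) -> Visibility:
    """Least privilege by default: unmapped role/field pairs stay tokenized."""
    return POLICY.get((field, role), Visibility.TOKENIZED)

print(resolve_visibility("ssn", "support_agent").value)  # partial plaintext
print(resolve_visibility("ssn", "marketing").value)      # tokenized (default)
```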


Final thoughts: A data protection strategy for security and utility

Modern data protection isn't a single product; it's a layered strategy.

  • Modern, format-preserving tokenization is the non-negotiable foundation. It de-identifies your data, protecting it down to its core.
  • Databolt’s dynamic masking capability is the flexible utility layer. It works seamlessly on top of data tokenized by Databolt to provide controlled, partial visibility for specific operational needs.

You no longer have to choose between locking down your data and making it useful. Databolt is built upon the same security principles that Capital One developed, validated and hardened to secure billions of sensitive transactions each day. By combining Databolt’s foundational tokenization with its granular dynamic masking, you can achieve a resilient, secure architecture that also unlocks the full utility of your data.

To learn how Databolt can address your most complex data security challenges, schedule a demo today.


Harper Foley, Senior Manager, Product Management

Harper is a senior product manager at Capital One Software where he leads the product strategy for secure data sharing and AI. His work centers on securing data for use in AI/ML models and serving critical data sharing use cases in highly regulated industries like financial services and healthcare. This work builds on his experience as a multi-sector tech founder, including co-founding a TechCrunch Disrupt finalist venture that develops AI-powered software for hospital systems.