DataAgents: How we turned 9 months of analysis into 10 days

An engineering deep dive into the pattern that changed how we approach large-scale classification problems.

Ram Manohar Bheemana

June 9, 2026 | 8 min read

Every engineering team has that project sitting in the backlog. The one where someone says, "We really should analyze all of these," and the room goes quiet. Everyone knows what "all of these" means—hundreds of entities, complex rules, no clear starting point.

For us, it was cloud resource dormancy detection.

We had around 350 distinct cloud resource types spread across AWS, Azure and Google Cloud Platform (GCP). Each type has different behavior patterns. An idle EC2 instance looks nothing like a dormant Amazon S3 bucket or an unattached Elastic IP. Detecting dormancy required understanding what "active" means for each specific resource type, then writing detection logic that wouldn't flood operations teams with false positives.

Traditional estimate: 6–9 months of expert analysis.

Actual time: 10 days.

Here's how we did it, and more importantly, here's the reusable pattern behind it.

The problem with large-scale analysis

Before we get to the solution, it's worth naming the pattern that makes these projects so painful. It shows up everywhere:

Cloud resources - Which of our 350 resource types are dormant?
Data governance - Which of our 800 tables have quality issues we should monitor?
Security - Which of our access entitlements violate least-privilege principles?
Compliance - Which of our 500 policy controls need remediation?

In every case, the structure is similar—a large catalog of heterogeneous entities, entity-specific rules that don't generalize, unknown priorities and a high cost for getting it wrong.

The traditional approach is not just slow. It's structurally limited. You get coverage of the "obvious" cases, inconsistent logic across analysts, and tribal knowledge that evaporates when people leave. What you need is something that can assess each entity, apply consistent criteria, prioritize by confidence and document its reasoning. The DataAgents pattern gets you there.

The data foundation: Why quality data comes first

The quality of your analysis is bounded by the quality of your data.

An AI agent can reason only over what it's given. If your data is incomplete, inconsistently structured or untrustworthy, the agent's outputs will be too—just faster and at greater scale.

Diagram showing an authoritative data product feeding into an AI agent, producing confidence-rated outputs with documented reasoning.

For our cloud analysis, we had a genuine advantage: an authoritative data product called cloud-asset-data-product. It contains a daily snapshot of every cloud resource across all providers, with a standardized schema. What made this data product valuable was its data quality and rich metadata: formal field definitions, primary key/foreign key (PK-FK) relationships for cross-entity reasoning and lineage tracking for provenance and freshness. Quality data isn't optional. It's what makes AI analysis trustworthy.

Its schema includes fields such as:

resource_id - Unique composite identifier
resource_type - EC2 instance, S3 bucket, EIP, etc.
service_id - Service grouping
business_application_name - Ownership
data_structured_tag - JSON: configuration, tags, state flags
resource_updated_utc_timestamp - Last change timestamp

The principle: the richer and more standardized your data product, the more an AI agent can do with it. Without quality data, AI analysis is unreliable. This is not a caveat. It's the prerequisite.
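To make the schema concrete, here is a minimal Python sketch of one snapshot row. The field names come from the list above. The Python types, the sample values and the JSON keys inside data_structured_tag ("state", "tags") are assumptions made for illustration, not the actual contents of cloud-asset-data-product.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CloudAssetRecord:
    """One row of the daily snapshot. Field names follow the article; types are assumed."""
    resource_id: str
    resource_type: str
    service_id: str
    business_application_name: str
    data_structured_tag: str  # raw JSON string: configuration, tags, state flags
    resource_updated_utc_timestamp: datetime

    def tag(self) -> dict:
        """Parse the JSON payload so state flags can be inspected."""
        return json.loads(self.data_structured_tag)


# Hypothetical sample row; every value below is invented for this example.
record = CloudAssetRecord(
    resource_id="aws:123456789012:us-east-1:i-0abc",
    resource_type="EC2 instance",
    service_id="ec2",
    business_application_name="example-app",
    data_structured_tag='{"state": "stopped", "tags": {"env": "dev"}}',
    resource_updated_utc_timestamp=datetime(2025, 1, 15, tzinfo=timezone.utc),
)
print(record.tag()["state"])
```

The point of the sketch is that a state flag and a last-change timestamp sit side by side in every row, which is what lets an agent reason about dormancy per resource type.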

The DataAgents pattern

Once you have the data foundation, the pattern is straightforward:

Authoritative data product + AI agent = DataAgent

A DataAgent is not just "AI doing analysis." It's a structured combination of:

An authoritative data source—the single source of truth for your domain
An AI agent that understands domain behavior, applies entity-specific rules and generates confidence-rated outputs with documented reasoning
A human-AI validation loop that catches errors and guides refinement

The output is not a spreadsheet of results. It's a self-documenting artifact—detection logic, confidence classifications, false-positive risk assessments and plain-English reasoning for every entity.

The three-phase process

Phase 1: Broad assessment

Input the full entity catalog. Ask the agent to categorize each resource type by analysis feasibility:

Config-detectable - Dormancy is detectable from configuration data alone
Needs telemetry - Reliable detection requires additional usage signals
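One way to picture the Phase 1 output is a mapping from resource type to feasibility category. The sketch below is an assumption about shape only: the enum, the function name and the three sample assignments are illustrative (the assignments follow the examples used in this article), and the judgment itself comes from the agent, not from this code.

```python
from collections import defaultdict
from enum import Enum


class Feasibility(Enum):
    CONFIG_DETECTABLE = "Config-detectable"
    NEEDS_TELEMETRY = "Needs telemetry"


def group_by_feasibility(
    assessments: dict[str, Feasibility],
) -> dict[Feasibility, list[str]]:
    """Group resource types by the category assigned to them, in name order."""
    grouped: dict[Feasibility, list[str]] = defaultdict(list)
    for resource_type, category in sorted(assessments.items()):
        grouped[category].append(resource_type)
    return dict(grouped)


# Illustrative assignments only, not our real Phase 1 results.
assessments = {
    "S3 bucket": Feasibility.NEEDS_TELEMETRY,
    "EC2 instance": Feasibility.CONFIG_DETECTABLE,
    "Elastic IP": Feasibility.CONFIG_DETECTABLE,
}
grouped = group_by_feasibility(assessments)
for category, resource_types in grouped.items():
    print(category.value, "->", resource_types)
```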

Phase 2: Classification and logic generation

For each entity, the agent analyzes three questions:

What indicates the target state (dormancy)? → Detection logic
How reliable is this signal? → Confidence level
What's the false-positive risk? → Risk assessment

Output: Spark SQL detection queries and documented reasoning for every resource type.
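For a sense of what a generated detection query might look like, here is a hedged sketch that builds one as a string in Python. The table name cloud_asset_data_product and the JSON key "state" are assumptions, and the helper build_state_query is hypothetical. The Spark SQL functions used (get_json_object, datediff, to_date, current_date) are standard built-ins. This is not one of our actual queries.

```python
def build_state_query(resource_type: str, dormant_state: str, min_age_days: int) -> str:
    """Build a Spark SQL query for resources stuck in a dormant state.

    A resource qualifies when its state flag equals `dormant_state` and its
    last configuration change is at least `min_age_days` old.
    """
    if min_age_days < 0:
        raise ValueError("min_age_days must be non-negative")
    for value in (resource_type, dormant_state):
        # Reject quotes and backslashes rather than trying to escape them.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in literal: {value!r}")
    return f"""
SELECT resource_id, resource_type, business_application_name
FROM cloud_asset_data_product  -- assumed table name
WHERE resource_type = '{resource_type}'
  AND get_json_object(data_structured_tag, '$.state') = '{dormant_state}'  -- assumed JSON key
  AND datediff(current_date(), to_date(resource_updated_utc_timestamp)) >= {min_age_days}
""".strip()


query = build_state_query("EC2 instance", "stopped", 90)
print(query)
```

Pairing each query with its confidence level and reasoning, rather than shipping the SQL alone, is what turns the output into something a later maintainer can audit.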

Phase 3: Deep validation (the game-changer)

This is where the approach separates itself from "run the AI and ship the output."

Human: "Double-check all MEDIUM confidence entities one by one."

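The agent reviewed every MEDIUM classification individually, not as a batch. Some were upgraded to HIGH. Some were reclassified as needing telemetry and set aside for later work (labeled "Phase 2" in our outputs, which is separate from Phase 2 of this process). We then repeated the exercise for LOW confidence. Every entity reviewed, every decision documented.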

Without Phase 3, you're stuck at, "Let's try some and see what happens." With it, you get systematic validation of every entity in your catalog.
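The one-by-one review can be sketched as a simple loop that revisits each entity at a given confidence level and logs the decision. Everything here is a hypothetical illustration: the Classification record, the review_one_by_one helper and the placeholder reviewer stand in for a fresh agent review that a human then checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Classification:
    resource_type: str
    confidence: str  # "HIGH", "MEDIUM", "LOW" or "NEEDS_TELEMETRY" (assumed labels)
    reasoning: str


def review_one_by_one(
    items: list[Classification],
    level: str,
    reviewer: Callable[[Classification], tuple[str, str]],
) -> list[tuple[str, str, str, str]]:
    """Re-examine each item at `level` individually and return an audit log.

    Each log entry is (resource_type, old_confidence, new_confidence, note).
    """
    audit_log = []
    for item in items:
        if item.confidence != level:
            continue
        new_confidence, note = reviewer(item)
        audit_log.append((item.resource_type, item.confidence, new_confidence, note))
        item.confidence = new_confidence
        item.reasoning = f"{item.reasoning} Review: {note}"
    return audit_log


def example_reviewer(item: Classification) -> tuple[str, str]:
    # Placeholder decisions for the demo, not real review outcomes.
    if item.resource_type == "Type A":
        return "HIGH", "Explicit state flag found."
    return "NEEDS_TELEMETRY", "Usage does not touch configuration."


items = [
    Classification("Type A", "MEDIUM", "Initial pass."),
    Classification("Type B", "MEDIUM", "Initial pass."),
    Classification("Type C", "HIGH", "Initial pass."),
]
log = review_one_by_one(items, "MEDIUM", example_reviewer)
for entry in log:
    print(entry)
```

Running the same loop again with "LOW" as the level mirrors the second pass described above.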

What the agent discovered that humans would miss

The most valuable outputs weren't the easy HIGH confidence cases—those are obvious. The value was in what systematic analysis uncovered across 350 types.

The false-positive trap. Some resource types look dormant by age but are actively used without any detectable configuration changes. An S3 bucket with no config updates for 90 days might be accessed millions of times per day, because access doesn't touch the configuration. Age-based detection on these types produces false-positive rates of 40-60%. The agent identified them and reclassified them as needing usage telemetry (the "Phase 2" bucket in our outputs).

State-based vs. age-based detection. The agent surfaced a clean framework from the analysis: High-confidence detection uses explicit state indicators—a binary state (stopped, unattached, disabled) that definitively signals dormancy. Low-confidence detection relies only on timestamps.
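The state-based versus age-based distinction can be sketched as a small decision function. The three dormant states and the 90-day threshold are taken from this article. Treating any other explicit state (such as "running") as active is an assumption of the sketch, and the function name assess is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DORMANT_STATES = {"stopped", "unattached", "disabled"}


def assess(
    state: Optional[str],
    last_updated: datetime,
    now: datetime,
    min_age_days: int = 90,
) -> tuple[bool, Optional[str]]:
    """Return (is_dormancy_candidate, confidence) for one resource."""
    if now - last_updated < timedelta(days=min_age_days):
        return (False, None)  # changed too recently to be a candidate
    if state in DORMANT_STATES:
        return (True, "HIGH")  # explicit state indicator plus age
    if state is None:
        return (True, "LOW")  # timestamp is the only signal available
    return (False, None)  # assumed: any other explicit state means active


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
old = now - timedelta(days=120)
print(assess("stopped", old, now))
print(assess(None, old, now))
```

The LOW branch is where the false-positive trap lives: a resource with no state flag and an old timestamp may still be heavily used.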

Disproportionate value concentration. Just 12 HIGH confidence resource types account for 30-40% of total dormancy savings—from 3.6% of resource types. Without systematic analysis of all 350, you'd find some of these, but you'd miss others.

A diagram showing false-positive rate for state-based, age-based with filter and pure age-based detection types, with an example.

The audit trail: Every decision documented

One of the underappreciated outputs of this approach is the reasoning column. Every entity gets a plain-English explanation of why it received its classification:

▎ HIGH Confidence: "Resource is in an explicit dormant state (stopped) and has not been updated for 90+ days. State-based detection with <5% false-positive rate. Automation-ready."

▎ Phase 2 Reclassification: "This resource type can be actively used without configuration changes, resulting in 40-60% false-positive rate with age-based detection. Requires usage telemetry for reliable detection."

▎ LOW Confidence: "No state indicators available. Age-based detection only—active entities may qualify. Manual owner review required before action."

This is not a nice-to-have. Months later, when someone asks, "Why does this logic work this way," the answer is in the output. When a developer is assigned to maintain the detection queries, the context is self-documenting. When stakeholders ask why a resource type is flagged, the reasoning is already written.

Without AI-generated documentation, these explanations either don't exist or live in someone's head.

Of course, this doesn't happen on its own. A human has to prompt the agent to write its reasoning into a dedicated output field.
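One lightweight way to enforce that is to reject any agent output row that lacks a reasoning field. The sketch below assumes the agent returns one JSON object per entity. The field names, the allowed confidence labels and the parse_agent_row helper are all hypothetical, chosen to match the examples in this section.

```python
import json

REQUIRED_FIELDS = ("resource_type", "confidence", "reasoning")  # assumed output contract
ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}


def parse_agent_row(raw: str) -> dict:
    """Parse one agent output row and insist on a non-empty reasoning field."""
    row = json.loads(raw)
    if not isinstance(row, dict):
        raise ValueError("expected a JSON object")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(row.get(name), str) or not row[name].strip()
    ]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    if row["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unknown confidence level: {row['confidence']!r}")
    return row


good = '{"resource_type": "EC2 instance", "confidence": "HIGH", "reasoning": "Stopped for 90+ days."}'
row = parse_agent_row(good)
print(row["confidence"], "-", row["reasoning"])
```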

Human-AI partnership: The part that actually makes it work

AI made errors. We should be clear about this.

Field name casing was wrong on several generated queries.
Column names in some detection logic referenced fields that didn't exist in the schema.
A handful of confidence levels were over-optimistic before the deep-dive review.

Every one of these errors was caught by human validation before production.
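The first two error types lend themselves to an automated pre-check before a human ever reads the query. Here is a minimal sketch, assuming you already have the list of column names a generated query references (extracting them from SQL would need a real parser and is out of scope here). The schema set reuses the field names listed earlier, and check_columns is a hypothetical helper, not part of our tooling.

```python
SCHEMA = {
    "resource_id",
    "resource_type",
    "service_id",
    "business_application_name",
    "data_structured_tag",
    "resource_updated_utc_timestamp",
}


def check_columns(referenced: list[str], schema: set[str]) -> dict[str, str]:
    """Return a problem description for every referenced column not in the schema."""
    by_lower = {name.lower(): name for name in schema}
    problems = {}
    for column in referenced:
        if column in schema:
            continue
        match = by_lower.get(column.lower())
        if match is not None:
            problems[column] = f"wrong casing; schema has {match}"
        else:
            problems[column] = "not in schema"
    return problems


# Hypothetical columns pulled from a generated query.
problems = check_columns(["resource_id", "Resource_Type", "last_accessed_date"], SCHEMA)
for column, issue in problems.items():
    print(f"{column}: {issue}")
```

A check like this catches mechanical mistakes cheaply. Over-optimistic confidence levels still need the human deep-dive.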

AI contributes speed, pattern recognition and consistency at scale. Humans contribute strategic direction, quality control and domain validation. The 18-27x speed improvement (6–9 months, or roughly 180-270 days, compressed into 10 days) is the net result after human validation, not in spite of it.

Where this pattern applies

The DataAgents pattern works for any domain with these properties:

A large catalog of heterogeneous entities
Entity-specific rules and thresholds (one-size-fits-all doesn't work)
An authoritative data product/source with a standardized schema
A need for confidence-based prioritization and documented reasoning

Beyond cloud resources, the same pattern applies to:

Data and governance - Generate data quality rules across hundreds of tables. Classify PII and sensitive data at scale. Detect schema drift with documented rationale.
Risk and compliance - Detect policy violations across entity catalogs. Review access entitlements. Map regulatory controls to technical implementations.
Security - Assess security posture across resource configurations. Identify misconfigured services with ranked confidence.

The pattern is reusable. The data product changes. The agent changes. The output structure adapts. But the three-phase process—broad assessment, classification with logic generation, deep validation—applies every time.

What you need to try this

Before you start:

Identify your highest-value analysis problem
Confirm you have an authoritative data product/source with standardized schema for that domain
Define what "target state" looks like for your entities (what does dormant/at-risk/non-compliant mean?)

The minimum viable setup:

A data product/source you trust
An AI agent with enough context about your domain (schema, sample data, domain documentation)
A human validator who knows the domain and can challenge the outputs

The investment: 10 days of focused work for something that would otherwise take 6–9 months. Most of that time goes to Phase 3, the deep validation loop. Don't skip it.

The bottom line

AI doesn't replace human expertise. It amplifies it.

What changed with DataAgents is that humans can now spend their time on judgment calls and validation—instead of manually analyzing entity 47 through entity 350. The AI capability is available. What's often missing is the structured data foundation that makes it reliable.

If you have that foundation, you have more analytical capability available to you right now than you probably realize.

Ram Manohar Bheemana, Senior Lead Data Engineer, Cloud Radar

Ram Bheemana is a Senior Data Engineer at Capital One, where he builds cloud-scale data products that give engineers and business teams real-time visibility into cloud resources across AWS, Azure and GCP. His recent work includes developing the DataAgents pattern—an approach that pairs authoritative data products with AI agents to automate large-scale analysis tasks that once took months, and pioneering Spark Streaming architectures that have delivered over a million dollars in annual cost savings. Ram is passionate about the intersection of data engineering and AI—not just using AI as a tool, but rethinking how data pipelines and intelligent agents can work together to solve problems at enterprise scale. Outside of work, he enjoys mentoring engineers, contributing to community initiatives and staying curious about what's next in the data and AI space.