DataAgents: How we turned 9 months of analysis into 10 days
An engineering deep dive into the pattern that changed how we approach large-scale classification problems.
Every engineering team has that project sitting in the backlog. The one where someone says, "We really should analyze all of these," and the room goes quiet. Everyone knows what "all of these" means—hundreds of entities, complex rules, no clear starting point.
For us, it was cloud resource dormancy detection.
We had around 350 distinct cloud resource types spread across AWS, Azure and Google Cloud Platform (GCP). Each type has different behavior patterns. An EC2 instance sitting idle looks nothing like a dormant Amazon S3 bucket or an unattached Elastic IP. Detecting dormancy required understanding what "active" means for each specific resource, then writing detection logic that wouldn't flood operations teams with false positives.
Traditional estimate: 6–9 months of expert analysis.
Actual time: 10 days.
Here's how we did it, and more importantly, here's the reusable pattern behind it.
The problem with large-scale analysis
Before we get to the solution, it's worth naming the pattern that makes these projects so painful. It shows up everywhere:
- Cloud resources - Which of our 350 resource types are dormant?
- Data governance - Which of our 800 tables have quality issues we should monitor?
- Security - Which of our access entitlements violate least-privilege principles?
- Compliance - Which of our 500 policy controls need remediation?
In every case, the structure is similar—a large catalog of heterogeneous entities, entity-specific rules that don't generalize, unknown priorities and a high cost for getting it wrong.
The traditional approach is not just slow. It's structurally limited. You get coverage of the "obvious" cases, inconsistent logic across analysts, and tribal knowledge that evaporates when people leave. What you need is something that can assess each entity, apply consistent criteria, prioritize by confidence and document its reasoning. The DataAgents pattern gets you there.
The data foundation: Why quality data comes first
The quality of your analysis is bounded by the quality of your data.
An AI agent can reason only over what it's given. If your data is incomplete, inconsistently structured or untrustworthy, the agent's outputs will be too—just faster and at greater scale.
For our cloud analysis, we had a genuine advantage: an authoritative data product called cloud-asset-data-product. It contains a daily snapshot of every cloud resource across all providers, with a standardized schema. What made this data product valuable was its data quality and rich metadata: formal field definitions, primary key/foreign key (PK-FK) relationships for cross-entity reasoning, and lineage tracking for provenance and freshness. Quality data isn't optional; it's what makes AI analysis trustworthy. The core fields:
- resource_id - Unique composite identifier
- resource_type - EC2 instance, S3 bucket, EIP, etc.
- service_id - Service grouping
- business_application_name - Ownership
- data_structured_tag - JSON: configuration, tags, state flags
- resource_updated_utc_timestamp - Last change timestamp
The principle: the richer and more standardized your data product, the more an AI agent can do with it. Without quality data, AI analysis is unreliable. This is not a caveat. It's the prerequisite.
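To make the schema concrete, here is a minimal sketch of one snapshot row as a typed record. The field names come from the list above; the Python types, the sample values and the keys inside the JSON tag (state, tags) are assumptions made for illustration, not the actual product schema.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CloudAssetRecord:
    """One daily-snapshot row. Field names follow the article; types are assumed."""
    resource_id: str
    resource_type: str
    service_id: str
    business_application_name: str
    data_structured_tag: str  # JSON string: configuration, tags, state flags
    resource_updated_utc_timestamp: datetime

    def tag(self) -> dict:
        """Parse the JSON payload; an empty string is treated as an empty object."""
        return json.loads(self.data_structured_tag or "{}")


# Hypothetical sample row: every value here is invented for illustration.
sample = CloudAssetRecord(
    resource_id="aws:123456789012:us-east-1:i-0abc",
    resource_type="EC2 instance",
    service_id="compute",
    business_application_name="billing-portal",
    data_structured_tag='{"state": "stopped", "tags": {"env": "dev"}}',
    resource_updated_utc_timestamp=datetime(2024, 1, 15, tzinfo=timezone.utc),
)

print(sample.tag().get("state"))
```

Keeping the JSON payload as a raw string and parsing it on demand mirrors how a single structured column is typically stored; whether the real product does this is not something the list above confirms.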
The DataAgents pattern
Once you have the data foundation, the pattern is straightforward:
Authoritative data product + AI agent = DataAgent
A DataAgent is not just "AI doing analysis." It's a structured combination of:
- An authoritative data source—the single source of truth for your domain
- An AI agent that understands domain behavior, applies entity-specific rules and generates confidence-rated outputs with documented reasoning
- A human-AI validation loop that catches errors and guides refinement
The output is not a spreadsheet of results. It's a self-documenting artifact—detection logic, confidence classifications, false-positive risk assessments and plain-English reasoning for every entity.
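One way to picture that artifact is as a record per resource type. The sketch below is an assumption about shape only: the class and field names are hypothetical, the detection logic string is invented, and the reasoning text is borrowed from the HIGH confidence example quoted later in this article.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Confidence(str, Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class Finding:
    """One self-documenting output row (hypothetical structure)."""
    resource_type: str
    detection_logic: str
    confidence: Confidence
    false_positive_risk: str
    reasoning: str


finding = Finding(
    resource_type="EC2 instance",
    detection_logic="state = 'stopped' AND days_since_update >= 90",  # invented
    confidence=Confidence.HIGH,
    false_positive_risk="<5%",
    reasoning=(
        "Resource is in an explicit dormant state (stopped) and has not "
        "been updated for 90+ days."
    ),
)

# Convert the enum to its plain string value before serializing.
row = {**asdict(finding), "confidence": finding.confidence.value}
print(json.dumps(row, indent=2))
```

The point of the structure is that the reasoning travels with the logic, so neither can be shipped without the other.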
The three-phase process
Phase 1: Broad assessment
Input the full entity catalog. Ask the agent to categorize each resource type by analysis feasibility:
- Config-detectable - Dormancy is detectable from configuration data alone
- Needs telemetry - Reliable detection requires additional usage signals
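A rough sketch of how the Phase 1 exchange could be wrapped in code follows. The prompt wording, the JSON response format and both function names are assumptions; the call to the agent itself is deliberately left out. The part worth copying is the check that the answer covers the whole catalog with known labels.

```python
import json

ALLOWED = {"config-detectable", "needs-telemetry"}


def build_prompt(resource_types: list[str]) -> str:
    """Assemble a Phase 1 prompt. The wording is illustrative only."""
    listing = "\n".join(f"- {rt}" for rt in resource_types)
    return (
        "For each resource type below, reply with a JSON object mapping the "
        "type to 'config-detectable' or 'needs-telemetry'.\n" + listing
    )


def parse_feasibility(resource_types: list[str], response: str) -> dict[str, str]:
    """Reject answers that skip a type, invent a type or use an unknown label."""
    answer = json.loads(response)
    expected, got = set(resource_types), set(answer)
    bad = {k: v for k, v in answer.items() if v not in ALLOWED}
    if expected != got or bad:
        raise ValueError(
            f"missing={sorted(expected - got)} "
            f"extra={sorted(got - expected)} bad_labels={bad}"
        )
    return answer


types = ["EC2 instance", "S3 bucket"]
# Hand-written stand-in for an agent reply; the labels are illustrative.
reply = '{"EC2 instance": "config-detectable", "S3 bucket": "needs-telemetry"}'
print(parse_feasibility(types, reply))
```

With a few hundred resource types, a silently dropped entry is easy to miss by eye, which is why the sketch fails loudly instead of returning a partial result.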
Phase 2: Classification and logic generation
For each entity, the agent answers three questions:
- What indicates the target state (dormancy)? → Detection logic
- How reliable is this signal? → Confidence level
- What's the false-positive risk? → Risk assessment
Output: Spark SQL detection queries and documented reasoning for every resource type.
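To give a feel for what a generated query might look like, here is a hedged sketch that renders a state-based Spark SQL check from a template. The table name cloud_asset_data_product, the '$.state' JSON path and the function name are all hypothetical; the column names come from the schema listed earlier, and get_json_object, datediff and current_date are standard Spark SQL functions.

```python
def render_state_based_query(
    resource_type: str, dormant_state: str, min_idle_days: int = 90
) -> str:
    """Render a Spark SQL query flagging resources in an explicit dormant state.

    Assumptions: the table name and the '$.state' JSON path are hypothetical.
    Values are interpolated directly, so pass only trusted, reviewed inputs.
    """
    return f"""
SELECT resource_id, business_application_name
FROM cloud_asset_data_product
WHERE resource_type = '{resource_type}'
  AND get_json_object(data_structured_tag, '$.state') = '{dormant_state}'
  AND datediff(current_date(), resource_updated_utc_timestamp) >= {min_idle_days}
""".strip()


query = render_state_based_query("EC2 instance", "stopped")
print(query)
```

The 90-day default matches the threshold quoted in the HIGH confidence reasoning later in this article; real thresholds would be set per resource type.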
Phase 3: Deep validation (the game-changer)
This is where the approach separates itself from "run the AI and ship the output."
Human: "Double-check all MEDIUM confidence entities one by one."
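The agent reviewed every MEDIUM classification individually, not as a batch. Some were upgraded to HIGH. Others were reclassified as needing telemetry and set aside for later, telemetry-based detection (our outputs label this "Phase 2", which is not the same thing as Phase 2 of this process). The agent then did the same for the LOW confidence entities. Every entity was reviewed and every decision documented.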
Without Phase 3, you're stuck at, "Let's try some and see what happens." With it, you get systematic validation of every entity in your catalog.
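The one-by-one loop can be sketched as below. Everything here is a stand-in: the Entity class, the NEEDS_TELEMETRY label and the stub reviewer (which plays the role of the agent plus the human check) are assumptions, and its rules are invented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entity:
    resource_type: str
    confidence: str  # "HIGH", "MEDIUM", "LOW" or "NEEDS_TELEMETRY"
    history: list[str] = field(default_factory=list)


def deep_validate(
    entities: list[Entity],
    level: str,
    review: Callable[[Entity], tuple[str, str]],
) -> list[Entity]:
    """Review each entity at `level` individually and record every decision."""
    for entity in entities:
        if entity.confidence != level:
            continue
        new_level, reason = review(entity)  # one call per entity, never a batch
        entity.history.append(f"{level} -> {new_level}: {reason}")
        entity.confidence = new_level
    return entities


def stub_review(entity: Entity) -> tuple[str, str]:
    # Invented rules standing in for the agent's per-entity analysis.
    if entity.resource_type == "Elastic IP":
        return "HIGH", "explicit unattached state available"
    return "NEEDS_TELEMETRY", "usage does not change configuration"


catalog = [
    Entity("Elastic IP", "MEDIUM"),
    Entity("S3 bucket", "MEDIUM"),
    Entity("EC2 instance", "HIGH"),
]
deep_validate(catalog, "MEDIUM", stub_review)
for e in catalog:
    print(e.resource_type, e.confidence, e.history)
```

Running the same function again with "LOW" as the level reproduces the second pass described above, and the history list is what later becomes the audit trail.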
What the agent discovered that humans would miss
The most valuable outputs weren't the easy HIGH confidence cases—those are obvious. The value was in what systematic analysis uncovered across 350 types.
The false-positive trap. Some resource types look dormant by age but are actively used without any detectable configuration changes. An S3 bucket with no config updates for 90 days might be accessed millions of times per day, because access patterns don't touch the config. Age-based detection on these types produces false-positive rates of 40-60%. The agent identified these and moved them to the needs-telemetry group (Phase 2).
State-based vs. age-based detection. The agent surfaced a clean framework from the analysis: High-confidence detection uses explicit state indicators—a binary state (stopped, unattached, disabled) that definitively signals dormancy. Low-confidence detection relies only on timestamps.
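As a simplified sketch of that framework: an explicit dormant state plus sufficient idle time yields HIGH confidence, while idle time alone yields LOW. The function name, the 90-day default and the treatment of other states are assumptions; the real logic was generated per resource type.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# The three explicit states named in the text.
DORMANT_STATES = {"stopped", "unattached", "disabled"}


def dormancy_confidence(
    state: Optional[str],
    last_updated: datetime,
    now: datetime,
    min_idle: timedelta = timedelta(days=90),
) -> Optional[str]:
    """Return "HIGH", "LOW" or None (not flagged) for a single resource."""
    if now - last_updated < min_idle:
        return None  # changed recently: not flagged
    if state in DORMANT_STATES:
        return "HIGH"  # state-based: explicit indicator plus age
    if state is None:
        return "LOW"  # age-based only: no state indicator available
    return None  # explicit non-dormant state such as "running"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old = now - timedelta(days=120)
print(dormancy_confidence("stopped", old, now))
print(dormancy_confidence(None, old, now))
```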
Disproportionate value concentration. Just 12 HIGH confidence resource types, fewer than 4% of the catalog, account for 30-40% of total dormancy savings. Without systematic analysis of all 350, you'd find some of these, but you'd miss others.
The audit trail: Every decision documented
One of the underappreciated outputs of this approach is the reasoning column. Every entity gets a plain-English explanation of why it received its classification:
▎ HIGH Confidence: "Resource is in an explicit dormant state (stopped) and has not been updated for 90+ days. State-based detection with <5% false-positive rate. Automation-ready."
▎ Phase 2 Reclassification: "This resource type can be actively used without configuration changes, resulting in 40-60% false-positive rate with age-based detection. Requires usage telemetry for reliable detection."
▎ LOW Confidence: "No state indicators available. Age-based detection only—active entities may qualify. Manual owner review required before action."
This is not a nice-to-have. Months later, when someone asks, "Why does this logic work this way," the answer is in the output. When a developer is assigned to maintain the detection queries, the context is self-documenting. When stakeholders ask why a resource type is flagged, the reasoning is already written.
Without AI-generated documentation, these explanations either don't exist or live in someone's head.
Of course, the agent only produces this reasoning when asked: the prompt has to request it explicitly as a field in the output.
Human-AI partnership: The part that actually makes it work
AI made errors. We should be clear about this.
- Field name casing was wrong on several generated queries.
- Column names in some detection logic referenced fields that didn't exist in the schema.
- A handful of confidence levels were over-optimistic before the deep-dive review.
Every one of these errors was caught by human validation before production.
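The first two error classes lend themselves to a mechanical check before a human ever reads the query. Below is a minimal sketch, assuming the columns a query references have already been extracted (doing that reliably needs a SQL parser, which is out of scope here). The schema set uses the field names listed earlier; the function name and the sample inputs are hypothetical.

```python
SCHEMA = {
    "resource_id",
    "resource_type",
    "service_id",
    "business_application_name",
    "data_structured_tag",
    "resource_updated_utc_timestamp",
}


def check_columns(
    referenced: list[str], schema: set[str] = SCHEMA
) -> tuple[dict[str, str], list[str]]:
    """Split bad references into wrong-case matches and unknown columns."""
    by_lower = {name.lower(): name for name in schema}
    wrong_case: dict[str, str] = {}
    unknown: list[str] = []
    for col in referenced:
        if col in schema:
            continue  # exact match: nothing to report
        if col.lower() in by_lower:
            wrong_case[col] = by_lower[col.lower()]  # suggest the real name
        else:
            unknown.append(col)
    return wrong_case, unknown


# Invented example: one casing slip and one column that does not exist.
wrong_case, unknown = check_columns(
    ["Resource_Type", "resource_id", "last_accessed_at"]
)
print(wrong_case)
print(unknown)
```

A check like this does not replace the domain review; it only clears the mechanical mistakes out of the way so the human validator can focus on the confidence levels.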
AI contributes speed, pattern recognition and consistency at scale. Humans contribute strategic direction, quality control and domain validation. The 18-27x speed improvement (6-9 months, or roughly 180-270 days, compressed into 10) is the net figure with human validation included, not a number achieved by skipping it.
Where this pattern applies
The DataAgents pattern works for any domain with these properties:
- A large catalog of heterogeneous entities
- Entity-specific rules and thresholds (one-size-fits-all doesn't work)
- An authoritative data product/source with a standardized schema
- A need for confidence-based prioritization and documented reasoning
Beyond cloud resources, the same pattern applies to:
- Data and governance - Generate data quality rules across hundreds of tables. Classify PII and sensitive data at scale. Detect schema drift with documented rationale.
- Risk and compliance - Detect policy violations across entity catalogs. Review access entitlements. Map regulatory controls to technical implementations.
- Security - Assess security posture across resource configurations. Identify misconfigured services with ranked confidence.
The pattern is reusable. The data product changes. The agent changes. The output structure adapts. But the three-phase process—broad assessment, classification with logic generation, deep validation—applies every time.
What you need to try this
Before you start:
- Identify your highest-value analysis problem
- Confirm you have an authoritative data product/source with standardized schema for that domain
- Define what "target state" looks like for your entities (what does dormant/at-risk/non-compliant mean?)
The minimum viable setup:
- A data product/source you trust
- An AI agent with enough context about your domain (schema, sample data, domain documentation)
- A human validator who knows the domain and can challenge the outputs
The investment: 10 days of focused work for something that would otherwise take 6-9 months. Most of that time is in Phase 3: the deep validation loop. Don't skip it.
The bottom line
AI doesn't replace human expertise. It amplifies it.
What changed with DataAgents is that humans can now spend their time on judgment calls and validation—instead of manually analyzing entity 47 through entity 350. The AI capability is available. What's often missing is the structured data foundation that makes it reliable.
If you have that foundation, you have more analytical capability available to you right now than you probably realize.
