Cloud compliance: What it is and how to build a framework

Understanding cloud compliance standards and key best practices for operating successfully in the cloud.

Goutham Kadhaba

December 18, 2024

As the world adopts cloud-based technologies, whether as a consumer or as a business, there exists a set of threats and risks that must be addressed so cloud adoption and all its associated benefits can be realized. An effective cloud compliance framework can provide protection from these vulnerabilities that can expose private and protected information while we leverage the advantages of the cloud.

What is cloud compliance?

Cloud compliance is a set of systematic operations that ensure a business is run in a compliant way, while at the same time protecting an organization’s resources, be it network, compute or storage. Cloud compliance maps to a wide range of regulations and best practices that organizations are expected to follow while using systems and services through cloud capabilities. The International Organization for Standardization (ISO) is a well-established body that defines many of the standards and operating procedures that apply to the use of cloud. There are many regulations with which an organization must maintain compliance. The following are some of the most widely referenced:

FISMA
HIPAA
PCI DSS
GDPR
SOX
FedRAMP

The steps to successful cloud adoption must include a foundational layer of compliance that is set to operate in accordance with regulatory and risk management aspects. A collaborative model of technology, process and people is at the root of such an ecosystem. Compliance is represented as a set of policies or rules that continuously consume machine state data, analyze it for risks and report on vulnerabilities. Setting up a dedicated organization chartered with implementing cloud compliance standards can help influence the business and engineering community to change behavior and adopt best practices for using cloud computing effectively.

What is the difference between cloud compliance and cloud governance?

Cloud compliance is not to be confused with cloud governance. Cloud compliance is a snapshot in time of the security posture of an organization in accordance with external regulations and standards. Regulations are specific to an industry, to the data consumed or produced and to the geography where the organization operates. Cloud governance is a set of policies and procedures established with the goal of securely consuming cloud-based services. It is a continuous process of evaluating existing policies and identifying the need to develop new policies aligned with the organization's goals, while ensuring compliance.

What are the implications of cloud compliance?

Cloud compliance covers a wide range of data sets identified by regulations to understand the risk posture of an organization as it uses cloud-based services and systems. Maintaining compliance lets organizations build and launch new products and services with confidence, while non-compliance can force them to shift priorities from building new products to remediating risk, based on compliance reports and guidance from regulators. Cloud compliance spans multiple domains, such as data privacy, data security, identity management, network security, data retention, service vulnerability and many more. Each of these needs specific solutions aimed at capturing granular data, identifying vulnerabilities and remediating them.

The importance of cloud compliance

Ensuring cloud compliance practices are leveraged at every level is more than a legal requirement. It's also a set of best practices that help ensure the integrity and responsible use of the information ecosystem.

Data protection and privacy

A commitment to ongoing cloud compliance ensures sensitive data is protected from unauthorized access and other common security threats. Teams can implement robust security measures to safeguard their data through Snowflake or other user-friendly platforms. This protects individuals' privacy, builds trust with customers and stakeholders and enhances the organization’s reputation for trustworthiness.

Legal and regulatory adherence

Compliance with legal and regulatory requirements is mandatory for organizations operating in regulated industries. Noncompliance can result in significant legal penalties, fines and potentially reputational damage. Developing and implementing a solid plan for cloud data compliance helps enterprises and providers avoid, or at least reduce, legal risk. Good compliance also helps organizations demonstrate their commitment to operating ethically and responsibly.

Reduced business risk

With cloud compliance measures in place, organizations better mitigate risks, including security incidents and regulatory violations. By proactively addressing these risks, an enterprise or provider can prevent costly incidents, minimize disruptions and help ensure business continuity. Cloud compliance also provides a framework for managing and responding to security threats, which reduces the overall risk to organizations.

Uninterrupted operations

A provider or data manager that's actually in compliance will have the necessary controls and safeguards to maintain uninterrupted operations. To get to this point, organizations must implement disaster recovery and business continuity plans, conduct regular audits and assessments and continuously monitor the cloud environment for emergent threats. By ensuring compliance, organizations can maintain the availability and reliability of their services, minimizing downtime and costly operational disruptions.

Challenges of managing machine events in cloud environments

The majority of cloud service providers support a mechanism in which machine events are shared as a stream or through APIs that can be consumed. Once machine events are available, rules can be applied on top of them to evaluate compliance. This process is time-sensitive and at the same time has to handle a large volume of machine events. It demands a high-performance platform that can be trusted to evaluate events without failures or errors, with upward of 99.9999% accuracy and availability, around the clock.

There are many proven patterns to divide and conquer such huge volumes of machine events at high speeds. It is also important to set guardrails to ensure that even the most granular change to any cloud resource is taken into consideration with a broader context of its ecosystem, like account, VPC and IAM, and not just the event itself in isolation.
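One way to picture the two ideas above, dividing the event volume and evaluating each event in its broader context, is the minimal Python sketch below. The event fields, the account identifiers and the ACCOUNT_CONTEXT lookup are hypothetical and stand in for whatever inventory an organization actually maintains.

```python
from collections import defaultdict

# Hypothetical account-level context; in practice this would come from an
# inventory or configuration store rather than a hard-coded dictionary.
ACCOUNT_CONTEXT = {
    "acct-1": {"environment": "prod", "vpc": "vpc-a"},
    "acct-2": {"environment": "dev", "vpc": "vpc-b"},
}


def enrich(event):
    """Return a copy of the event with its account context attached."""
    context = ACCOUNT_CONTEXT.get(event["account"], {})
    return {**event, "context": context}


def partition_by_account(events):
    """Group enriched events by account so each group can be processed independently."""
    partitions = defaultdict(list)
    for event in events:
        partitions[event["account"]].append(enrich(event))
    return dict(partitions)


events = [
    {"account": "acct-1", "resource": "bucket-1", "action": "update_acl"},
    {"account": "acct-2", "resource": "sg-9", "action": "open_port"},
    {"account": "acct-1", "resource": "role-3", "action": "attach_policy"},
]
partitions = partition_by_account(events)
```

Partitioning by account is only one possible key; region or resource type would work the same way.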

As engineers and product owners adopt newer cloud-based tools and services, the challenge for cloud governance is to be ready to roll out controls or policies that help engineers use each tool efficiently and responsibly. This implies that cloud governance has established a strong partnership with cloud compliance, cloud providers, regulators, cyber and audit groups. Together, these groups can look ahead to new cloud capabilities that could be adopted into the organization and identify what vulnerabilities those capabilities would have, or introduce, when integrated with existing resources and applications.

Elements of a cloud compliance framework

The following five elements are critical for a holistic cloud compliance strategy that can successfully provide visibility and protection for an organization.

1. Service assessment and vulnerability analysis

Cloud providers release new capabilities and services very often to keep up with competition and provide differentiated value to their customers. In addition, cloud providers change or deprecate existing APIs and services. As a consumer, it is therefore important to assess new services as they are released and to reassess existing services periodically for changes. These service assessments are meant to identify vulnerabilities associated with the native service. The scope of the assessment also includes the design and implementation of these cloud native services specific to your organization and your organization’s best practices.

Service assessments need strong cloud architects, risk and compliance subject-matter experts (SMEs), engineers and product owners coming together to evaluate various aspects of new and existing cloud native services. After the assessment, a report is shared that clearly describes the risk (if any), recommends a set of controls to be implemented before broader adoption, and suggests which service usage reports and service compliance reports should be collected and regularly reviewed to address the risks associated with adopting the service.

To ensure these assessments happen in an unbiased way, an independent group or a department like Cyber is chartered to manage and report on them.

2. Collecting machine data and machine events

Identifying vulnerabilities is time-sensitive, and the value of a finding depreciates as time passes. In many cases, though, the window of opportunity to react is much longer, which allows cloud compliance platforms to leverage a wide variety of data sources. In general, there are three main options for machine data:

Cloud native resources (e.g., CloudTrail)
Tool-based resources (e.g., ServiceNow)
Direct resources (e.g., APIs)

Direct resources via APIs are typically the fastest and easiest to process, but they get expensive as the volume of API calls goes up. Processing all machine data as events from API sources demands massive and resilient compliance platforms. There needs to be a balance between how many events are processed in real time and how much data is batched for evaluation of vulnerabilities.
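The balance between real-time and batch processing can be expressed as a simple routing rule. The Python sketch below assumes, purely for illustration, that the decision depends on resource type. The type names and the REALTIME_RESOURCE_TYPES set are hypothetical, and a real platform would likely weigh cost and risk in more detail.

```python
# Assumption: resource types whose changes are risky enough to evaluate immediately.
REALTIME_RESOURCE_TYPES = {"iam_role", "security_group"}


def choose_path(event):
    """Return "realtime" for high-risk resource types and "batch" for the rest."""
    if event["resource_type"] in REALTIME_RESOURCE_TYPES:
        return "realtime"
    return "batch"


def split_by_path(events):
    """Split events into a real-time queue and a batch queue."""
    queues = {"realtime": [], "batch": []}
    for event in events:
        queues[choose_path(event)].append(event)
    return queues


sample_events = [
    {"id": 1, "resource_type": "iam_role"},
    {"id": 2, "resource_type": "storage_bucket"},
    {"id": 3, "resource_type": "security_group"},
]
queues = split_by_path(sample_events)
```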

Collecting machine data, then classifying and categorizing it by resource type, criticality, risk or vulnerability, is fundamental because it helps distribute the processing. A key differentiator could be to build a data layer that is always available and democratized, so that multiple lines of business or departments can use it to build custom algorithms to identify risk. As data gets democratized across the organization, it is important to have clear disclosures associated with each stream or lake of data. These disclosures include age of data, source of data, signatures, and risk category.
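The disclosures listed above could travel with each stream as a small metadata record. The following Python sketch is one possible shape for such a record. The class name, field names and sample values are assumptions made for illustration, and the signature is a placeholder rather than a real digest.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StreamDisclosure:
    """Metadata published alongside a stream or lake of machine data."""
    name: str
    source: str
    signature: str
    risk_category: str
    last_updated: datetime

    def age(self, now):
        """Return how old the data is relative to the given time."""
        return now - self.last_updated

    def is_fresh(self, now, max_age):
        """Return True if the data is no older than max_age."""
        return self.age(now) <= max_age


now = datetime(2024, 12, 18, 12, 0, tzinfo=timezone.utc)
disclosure = StreamDisclosure(
    name="network-events",
    source="cloud-native-logs",
    signature="sha256:<digest>",  # placeholder value
    risk_category="security",
    last_updated=now - timedelta(minutes=20),
)
```

A consumer building a custom risk algorithm could call is_fresh before trusting the stream for time-sensitive decisions.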

Further, as the cloud footprint grows it will become necessary to establish a group or a department completely focused on managing the data layer of machine events. Many organizations that have large footprints on the cloud also provide learning platforms around machine data to help build intelligence on top of applying policies or rules that can start to bring predictability of risk.

3. Developing controls and policies

It’s a race between how fast the dark web learns to leverage resource-based vulnerabilities against organizations and how fast organizations learn to develop policies and rules that protect against such vulnerabilities. It’s all about the length of the learning curve. Cloud SMEs play a very big role here in assessing new service offerings from cloud providers, not just broadly but deeply in the context of how a business wants to leverage new cloud-based resources or products. Assessments are generally led by a group of SMEs who understand risk (technical, financial, and operational), coming together to score a particular cloud-based tool or resource. The group then produces a risk score that expresses the criticality as Low/Medium/High/Critical. These assessments are not tied to a fixed schedule; they happen as a standard process throughout the year and include re-assessments of existing services to account for changes to service APIs or deprecations.
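As a rough illustration of how SME scores might be turned into a Low/Medium/High/Critical rating, the Python sketch below takes the highest of three scores. The 1 to 5 scale, the thresholds and the "worst score wins" rule are assumptions for this example, not a method described by the article.

```python
def risk_level(technical, financial, operational):
    """Map three SME scores (1 to 5 each) to a criticality label.

    Assumption: the overall rating is driven by the highest individual score.
    """
    scores = (technical, financial, operational)
    for score in scores:
        if not 1 <= score <= 5:
            raise ValueError("scores must be between 1 and 5")
    worst = max(scores)
    if worst == 5:
        return "Critical"
    if worst == 4:
        return "High"
    if worst == 3:
        return "Medium"
    return "Low"


rating = risk_level(technical=4, financial=2, operational=3)
```

An averaging or weighted scheme would be an equally valid choice; taking the maximum simply avoids a single severe risk being diluted by two mild ones.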

Much needs to be sorted out on how these policies are released based on how many cloud accounts need to be provisioned. For example:

Is it to be done at the same time or progressively?
How many applications are impacted based on the release of these policies?
Does such a release affect the velocity of product development, and do the policies have built-in actions that may vary by environment?

It is evident that the release management aspects of these policies have the potential to affect the much larger organization both positively and negatively. On the positive side, the organization would be more secure, because these policies restrict the number of patterns in which cloud-based resources or services are leveraged, designed and integrated with business products. On the negative side, this restriction adds work for the application teams, who must retrospectively modify existing and live applications and/or make changes to upcoming designs and implementations.
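A progressive release can be planned by grouping accounts into waves. The Python sketch below assumes that lower environments receive a policy first. The environment names, their ordering and the account identifiers are hypothetical.

```python
# Assumption: policies reach lower environments before production.
ROLLOUT_ORDER = ["dev", "test", "prod"]


def rollout_waves(accounts):
    """Group account ids into ordered release waves by environment.

    accounts maps an account id to its environment name.
    """
    waves = []
    for environment in ROLLOUT_ORDER:
        wave = sorted(a for a, env in accounts.items() if env == environment)
        if wave:
            waves.append((environment, wave))
    return waves


accounts = {"acct-a": "prod", "acct-b": "dev", "acct-c": "test", "acct-d": "dev"}
waves = rollout_waves(accounts)
```

Counting the applications in each wave before release gives an early estimate of how many teams a policy will affect.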

End user impact analysis is a necessary part of policy development that attempts to reduce the impact of policies by automating remediation on behalf of the application teams. However, this requires a trusted relationship between application teams and cloud compliance organizations. In certain situations, cloud compliance teams will not have full visibility into the rationale or the implications of a particular design of a component that uses a cloud resource. In such scenarios, it is better to report non-compliance as a notification and give the application teams a window in which to own the remediation. To fully understand the impact on end users or the applications themselves, it is important to be able to create test scenarios in which the resources created, the evaluations performed and the remediation actions taken are all recorded and further analyzed to clearly identify the impacts that the released policies would cause. This part of controls testing or policy testing can be quite a puzzle and is often exhausting, but it is very rewarding.
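The choice between automated remediation and a notification with a remediation window could look like the Python sketch below. It assumes application teams explicitly opt in to automated remediation. The team names, the 14-day default window and the field names are illustrative assumptions.

```python
from datetime import date, timedelta


def plan_remediation(finding, auto_remediation_teams, today, window_days=14):
    """Decide whether to remediate automatically or notify the owning team.

    Assumption: automated remediation is used only for teams that opted in;
    every other team receives a notification and a remediation window.
    """
    if finding["team"] in auto_remediation_teams:
        return {"resource": finding["resource"], "action": "auto_remediate", "due": today}
    due = today + timedelta(days=window_days)
    return {"resource": finding["resource"], "action": "notify", "due": due}


today = date(2024, 12, 18)
opted_in = {"payments"}
plan_a = plan_remediation({"resource": "bucket-1", "team": "payments"}, opted_in, today)
plan_b = plan_remediation({"resource": "queue-7", "team": "search"}, opted_in, today)
```

Recording each returned plan during policy testing gives a simple log of what a release would have done to every resource.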

4. Processing machine events against policies

Controls (or policies) are developed as:

Detective controls
Corrective controls
Preventive controls

Preventive controls are the most effective way to mitigate compliance risks to begin with, but they are also complex to implement. Cloud compliance solutions are specifically designed to address each of the above categories of controls. Controls are developed to run in batch (near real time), on events or through APIs. The goal here is to get as close to the event as possible. CloudTrail data or any other cloud native log is the simplest source of machine events, but it is important to note that cloud native logs also have a delay between the actual event and the time the event becomes available in the log.

Controls are composed of one or more policies applied against different machine events originating across regions, environments and data classifications. Policies are rules that evaluate machine data against a set of conditions and have a Boolean outcome. This implies that a resource can only be compliant or non-compliant within the scope of a control objective. The outcome is determined by a wide range of simple to complex sets of rules that evaluate machine events from multiple perspectives of data, identity, access and visibility. The process of evaluation and inference consequently leads to identifying non-compliant resources, enabling visibility through various channels of notification and generally suggesting a pattern-based remediation.
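The relationship between a control, its policies and the Boolean outcome can be sketched in a few lines of Python. The two policies, the control name and the resource fields below are hypothetical examples; real policies would evaluate far richer machine data.

```python
def encryption_enabled(resource):
    """Policy: the resource must have encryption turned on."""
    return resource.get("encrypted") is True


def not_public(resource):
    """Policy: the resource must not be publicly accessible."""
    return resource.get("public") is not True


# A control groups one or more policies under a single objective.
STORAGE_CONTROL = {"name": "storage-baseline", "policies": [encryption_enabled, not_public]}


def evaluate(control, resource):
    """Return a compliant/non-compliant result and the policies that failed."""
    failed = [p.__name__ for p in control["policies"] if not p(resource)]
    return {"resource": resource["id"], "compliant": not failed, "failed_policies": failed}


good = evaluate(STORAGE_CONTROL, {"id": "bucket-1", "encrypted": True, "public": False})
bad = evaluate(STORAGE_CONTROL, {"id": "bucket-2", "encrypted": True, "public": True})
```

The list of failed policies is what a notification or pattern-based remediation would be built from.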

5. Identifying non-compliance and taking action to remediate

All types of controls need one or more ways to report on the state of compliance. These reports can be further standardized and customized, grouped and distributed through multiple channels like email notifications, compliance dashboards, operational reports, and risk state data. As an organization's footprint on the cloud increases, the set of machine events can also grow significantly, making the reports more complex and harder to interpret. The main focus should be on reports that clearly show non-compliance by risk category, such as regulatory, operational or security. These reports can also list non-compliance by resource type and by criticality, represented as high, medium, or low. Divisional and line-of-business (LoB) dashboards are helpful to drive prioritization conversations with LoBs and executive decisions, encouraging business and technology groups to come together on risk management and risk remediation.
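A report that shows non-compliance by risk category and criticality is, at its core, a grouped count. The Python sketch below uses hypothetical findings and field names to show the idea.

```python
from collections import Counter


def summarize(findings):
    """Count non-compliant findings per (risk category, criticality) pair."""
    return Counter((f["risk_category"], f["criticality"]) for f in findings)


findings = [
    {"resource": "bucket-1", "risk_category": "security", "criticality": "high"},
    {"resource": "role-3", "risk_category": "security", "criticality": "high"},
    {"resource": "db-2", "risk_category": "regulatory", "criticality": "medium"},
]
summary = summarize(findings)

# Print one row per group, sorted for a stable layout.
for (category, criticality), count in sorted(summary.items()):
    print(f"{category:<12}{criticality:<8}{count}")
```

The same grouping could be extended with a line-of-business field to feed the divisional dashboards.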

Certain channels used to report non-compliance, like email notifications and Slack notifications, gradually become ineffective as volumes grow each day. It is hard for developers to watch every notification that comes their way and drive quick remediation. Policies that produce these notifications need product owners and product managers with solid experience in user-empathy-based feature development, who can carve out effective, outcome-driven notifications that are intuitive. These notifications need to be supported with clear steps to remediate, expressed as instructions that help end users remediate faster and more consistently. Compliance solutions can go one step further by assigning incident tickets to application teams, with default remediation windows and automated escalation processes.
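Tickets with default remediation windows and automated escalation can be modeled as in the Python sketch below. The window lengths per criticality and the ticket fields are assumptions for illustration only.

```python
from datetime import date, timedelta

# Assumption: default remediation windows, in days, per criticality.
REMEDIATION_WINDOW_DAYS = {"high": 7, "medium": 30, "low": 90}


def open_ticket(finding, opened_on):
    """Create a ticket whose due date follows the default remediation window."""
    window = REMEDIATION_WINDOW_DAYS[finding["criticality"]]
    return {
        "resource": finding["resource"],
        "team": finding["team"],
        "due": opened_on + timedelta(days=window),
        "escalated": False,
    }


def escalate_overdue(tickets, today):
    """Return tickets with the escalated flag set on any that are past due."""
    return [{**t, "escalated": True} if today > t["due"] else t for t in tickets]


opened = date(2024, 12, 1)
tickets = [
    open_ticket({"resource": "bucket-1", "team": "payments", "criticality": "high"}, opened),
    open_ticket({"resource": "db-2", "team": "search", "criticality": "low"}, opened),
]
tickets = escalate_overdue(tickets, today=date(2024, 12, 18))
```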

Best practices for cloud compliance

Achieving and staying in cloud compliance calls for enterprises to take a strategic approach. Below are some best practices to help organizations ensure compliance across the board.

Regulatory requirements

Organizations should thoroughly understand the regulatory requirements applicable to their operations. This involves identifying the relevant laws, standards and regulations and assessing the impact these have on the organization’s cloud environment.

Data encryption

Encrypting data in transit and at rest is a critical security measure for cloud compliance. Encryption protects sensitive information from unauthorized access and ensures data remains secure even if it's intercepted, accessed by malicious actors or accidentally leaked.

Access control and authentication mechanisms

Implementing robust access control and authentication mechanisms helps protect sensitive data in the cloud. This includes using multi-factor authentication, role-based access control and least-privilege access so only authorized users can access critical data and systems. Regularly reviewing and updating these access controls helps maintain security, and it's a requirement for ongoing compliance in many cases.

Regular audits and assessments

Regular audits and assessments of the total security package are needed to verify compliance and identify areas for improvement. Internal and external audits provide insights into the effectiveness of security controls, highlight potential vulnerabilities and ensure the organization remains compliant with regulatory requirements. Enterprises should establish a routine audit schedule.

Data residency compliance

Data residency requirements dictate where and under what rules data can be stored and processed. This is particularly important for organizations operating in multiple regions with varying regulatory requirements. Implementing data residency controls helps organizations meet legal obligations and avoid potential compliance issues.
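A data residency control can be as simple as checking each data set's region against an allow-list for its classification. In the Python sketch below, the classifications, region names and data sets are hypothetical, and an unknown classification is treated as a violation so the check fails closed.

```python
# Assumption: regions permitted for each data classification.
ALLOWED_REGIONS = {
    "eu_personal_data": {"eu-west", "eu-central"},
    "public": {"eu-west", "eu-central", "us-east"},
}


def residency_violations(datasets):
    """Return the names of data sets stored outside their permitted regions."""
    violations = []
    for dataset in datasets:
        allowed = ALLOWED_REGIONS.get(dataset["classification"], set())
        if dataset["region"] not in allowed:
            violations.append(dataset["name"])
    return violations


datasets = [
    {"name": "customers", "classification": "eu_personal_data", "region": "us-east"},
    {"name": "catalog", "classification": "public", "region": "us-east"},
    {"name": "orders", "classification": "eu_personal_data", "region": "eu-west"},
]
violations = residency_violations(datasets)
```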

Secure configuration management

Reliable cloud compliance arguably starts with maintaining secure configuration management practices. Teams managing data should establish a baseline configuration for cloud resources, if only to define what the normal configuration should be. Secure configuration management helps prevent misconfigurations that could lead to security vulnerabilities and compliance violations.
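Once a baseline exists, detecting misconfiguration becomes a comparison between the baseline and the observed settings. The Python sketch below shows that comparison with hypothetical setting names and values.

```python
# Assumption: a baseline expressed as expected setting values.
BASELINE = {"encryption": "enabled", "public_access": "blocked", "logging": "enabled"}


def find_drift(baseline, actual):
    """Return settings whose actual value differs from, or is missing in, the baseline."""
    return {
        key: {"expected": expected, "actual": actual.get(key)}
        for key, expected in baseline.items()
        if actual.get(key) != expected
    }


observed = {"encryption": "enabled", "public_access": "allowed"}
drift = find_drift(BASELINE, observed)
```

Settings that are absent from the observed configuration are reported with an actual value of None, so a missing setting is flagged as drift as well.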

Incident response plans

Emergencies, almost by definition, happen unexpectedly. It could be a massive hacking attempt on a company’s cloud servers, a tornado knocking down wires at the data center or a distributed denial of service attack; any such unexpected incident can disrupt data security.

Robust incident response plans should outline procedures for detecting, responding to and recovering from security incidents. They also should have communication protocols for notifying stakeholders and the relevant regulatory authorities.

Data loss prevention (DLP)

Using DLP measures may help protect sensitive data from unauthorized access, which often includes unauthorized sharing and exfiltration. DLP solutions monitor and control data flows, detect potential data breaches and enforce policies to prevent data loss.

Continuous monitoring

Even the most robust cloud security and compliance measures fall out of step if not continuously monitored. Periodic reviews and audits, aided by automated tools for data security, such as the proactive alerts in Capital One Slingshot, help identify weak points and places where standards may have begun to slip.

Secure development practices

Cloud environments have to be secure from end to end. Developers should build a solid awareness of compliance requirements into their initial development work, as well as into the ongoing operations of the teams working in a cloud environment.

This requirement includes integrating security testing into the development life cycle, following secure coding guidelines and conducting regular vulnerability assessments. Secure development practices help prevent security vulnerabilities and keep applications compliant from the first step of development.

Why establishing a cloud compliance framework is the responsibility of every developer

In order to establish a collaborative and supported cloud compliance framework across the organization, all teams need to have clear intent, roles and responsibilities. Technology, architecture, operations, risk, regulatory compliance and governance groups need to work together to set, manage and socialize security standards, provide continuous support and guidance, and create a transparent process for new services adoption.

As organizations move towards being fully on cloud, compliance must play an integral role. Defining a cloud compliance framework within an organization will go through a journey of maturity and learning to balance roles, responsibilities and accountability. However, the standards outlined in this article should be helpful in successfully implementing and managing an effective cloud compliance program. As this domain evolves further over time and with continued innovation, cloud compliance as a practice will become a fully operational entity for every consumer.

Goutham Kadhaba, Director of Software Engineering, Enterprise Cloud Controls

Goutham Kadhaba is a Director of Technology at Capital One leading transformational work on Enterprise Controls & Cloud Compliance. He is focused on the development of automated and preventive controls that improve the end user experience while also improving compliance posture. Goutham has also led the modernization of Card Imaging and Card Payment Processing platforms at Capital One. Prior to Capital One, Goutham led development of microservices and streaming based platforms across the banking & financial services industry.