Governance in a DevOps Environment
As a technology governance director at Capital One with 20 years of risk management and audit experience, I’d like to share a little of the auditor’s perspective on DevOps processes. How is risk managed in an environment where developers are continuously pushing code to production? What are the lines of defense and the controls ensuring the code being produced is safe and well governed?
What is Governance?
When talking about software and technology guidelines, we in the industry prefer the word governance over compliance. Why governance? Because we don’t want developers approaching risk management like it’s a series of boxes to tick off. Governance means actively understanding and managing risk; it’s awareness of everything you do, it’s thinking about the impact of one’s actions, and it’s both establishing and following processes for those actions.
One popular risk management framework for thinking about governance is the Three Lines of Defense, developed by the Institute of Internal Auditors and available on their website. Widely used by regulators, it helps evaluate enterprise risk management.
1st Line — Who Owns the Risk — Individual Developers and Engineers.
2nd Line — Who Sets Policy and Monitors the Risk — Governance and Risk functions that set policy and monitor risk on a daily basis.
3rd Line — Independent Assurance — Internal audits that provide independent assurance and report directly to the audit committee or the board.
Alongside this popular enterprise model, there is an informal 4th line that is external to the company but works hand in hand with it to implement governance properly.
4th Line — External Partners — Auditors and regulators who must be brought into the conversation and given full transparency into development processes and risk management.
What is the Developer’s Role as Part of the First Line?
First of all, it’s awareness of risk. Developers need to be aware that their work can create risk and can cause consequences for the company with even small code changes. As the saying goes, “To fix a problem, first you have to realize you have one.” Being aware of the capacity to create risk is the first step in understanding and mitigating it.
Once awareness is in place, the next step is being responsible for that risk. This means making sure work is performed in a controlled manner and follows best practices and controls at all times.
So, what are controls? Formally, controls are processes that mitigate risk. Specifically, they are activities that assure operational effectiveness and efficiency, reliable financial reporting, and compliance with laws and regulations. A simple example from a financial system: one person enters a wire transaction, and a different person must approve the wire before it goes out for payment.
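The wire example can be sketched in a few lines of code. This is only an illustration of the idea, not how any real payment system works; the `WireTransfer` class and the `approve` and `release` functions are hypothetical names I made up for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WireTransfer:
    """A hypothetical wire record used only for this illustration."""
    amount: float
    entered_by: str
    approved_by: Optional[str] = None


def approve(wire: WireTransfer, approver: str) -> None:
    """Record an approval, refusing self-approval by whoever entered the wire."""
    if approver == wire.entered_by:
        raise PermissionError("the person who entered a wire cannot approve it")
    wire.approved_by = approver


def release(wire: WireTransfer) -> str:
    """Release the wire for payment only after a second person has approved it."""
    if wire.approved_by is None:
        raise PermissionError("wire has not been approved")
    return f"released {wire.amount:.2f}"
```

The control lives in the two checks: the approver must differ from the person who entered the wire, and nothing is released without an approval on record.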
Why are controls needed? In my mind, the quote that best sums it up is from W. Edwards Deming: “Uncontrolled variation is the enemy of quality.”
So, what are some of the controls seen in robust enterprise DevOps processes? There are three specific controls I’m going to focus on here.
Two Sets of Eyes
A key control before deploying changes is having a second set of eyes peer review every change. The reviewer must be qualified to perform the review in question. This isn’t a box that any developer can check off. It requires someone who understands the languages, libraries, environments, and products being used; what the application does; and what the change will impact.
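As a rough sketch of what an automated “two sets of eyes” check might look like, the code below accepts a change only if someone other than the author, and qualified for that component, has approved it. The `Change` class, the `QUALIFIED_REVIEWERS` mapping, and the component names are all assumptions for illustration; a real pipeline would pull this information from its own source control and review tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A hypothetical change record: who wrote it, what it touches, who approved it."""
    author: str
    component: str
    approvals: set = field(default_factory=set)


# Hypothetical mapping of components to the people qualified to review them.
QUALIFIED_REVIEWERS = {
    "payments-api": {"alice", "bob"},
    "infra": {"carol"},
}


def has_second_set_of_eyes(change: Change) -> bool:
    """True if at least one approver is qualified and is not the author."""
    qualified = QUALIFIED_REVIEWERS.get(change.component, set())
    independent = change.approvals - {change.author}
    return bool(independent & qualified)
```

Note that an approval from the author, or from someone not qualified for that component, does not count. That is the difference between a real review and a checked box.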
Least Privilege
A second control is least privilege. This is a key access concept for secure development pipelines: developers should not have access to directories or files they don’t need. On a daily basis, do most developers need write access to production? Read access, yes. Write access, probably not. By limiting access to specific roles and granting it through time-limited tokens issued under access approval rules (just-in-time admin), you prevent a “wild west attitude” around who can write to production. Too much standing access exposes the company to unnecessary risk.
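Here is a minimal sketch of just-in-time access, assuming a grant that needs an independent approver and expires on its own. The `AccessGrant` class, the `"prod-write"` role name, and the 30-minute default are hypothetical choices for the example, not a description of any particular access management product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """A hypothetical time-limited grant of a role to a user."""
    user: str
    role: str
    approved_by: str
    expires_at: datetime


def grant_access(user: str, role: str, approved_by: str, now: datetime,
                 ttl: timedelta = timedelta(minutes=30)) -> AccessGrant:
    """Issue a grant that expires after `ttl`; users cannot approve themselves."""
    if approved_by == user:
        raise PermissionError("access requests need an independent approver")
    return AccessGrant(user, role, approved_by, now + ttl)


def can_write_to_production(grant: AccessGrant, now: datetime) -> bool:
    """Write access holds only for the right role and only until the grant expires."""
    return grant.role == "prod-write" and now < grant.expires_at
```

Because the grant carries its own expiry, nobody has to remember to revoke it. Access simply lapses when the window closes.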
Unauthorized Change Monitoring
Two sets of eyes and least privilege are both preventative controls, which means they help prevent someone from doing something bad, whether inadvertently or maliciously. The third control that is essential to well-managed development is unauthorized change monitoring. By contrast, this is a detective control: it logs all change events and allows governance functions and management to review what changes were made and whether they were authorized. Let’s say a high-severity issue has come up. Did a code change cause it, or was it an external factor such as a network outage? Robust monitoring, set up with automated evidence and log collection, is what provides that detective control to trace and identify potential issues.
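To make the detective control concrete, the sketch below compares logged change events against a set of approved change tickets and flags anything that has no ticket or an unrecognized one. The `ChangeEvent` fields and the ticket ID format are assumptions for illustration; real change logs and ticketing systems will look different.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ChangeEvent:
    """A hypothetical logged change to production."""
    target: str
    actor: str
    ticket_id: Optional[str]


def find_unauthorized(events: Iterable[ChangeEvent],
                      approved_tickets: Set[str]) -> List[ChangeEvent]:
    """Return events with no ticket, or with a ticket that was never approved."""
    return [
        event for event in events
        if event.ticket_id is None or event.ticket_id not in approved_tickets
    ]
```

A check like this does not stop a change from happening. Its value is in giving management and governance functions a reviewable list of changes that need an explanation.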
Putting it Into Practice
“I have seen so many pipelines, both internally and externally, and I can find a hole in every pipeline. I learned that from Jen actually. Every time I thought I had a solution and Jen said, ‘What if?’ I had to go back to the drawing board. That’s how I learned many of these things about governance and controls.”
A developer’s view on good governance usually focuses on automation, such as:
- Building on every commit.
- Static code analysis on every build.
- Scanning for open source vulnerabilities.
- Static security scanning.
- Automated tests.
Doing all the above is essential in DevOps and Continuous Delivery. But we need to also take a higher view in order to have comprehensive governance of our DevOps processes. The biggest hurdle is often that one person cannot do it all and maintain appropriate segregation of duties.
To quote my colleague Topo again, to create safe and well governed DevOps environments it helps to adopt a “cleanroom model” where all product pipelines — whether they are application, test, or infrastructure code — are identified and registered under source control.
This means code changes can be monitored to ensure the “two sets of eyes” peer review took place. “Least privilege” prevents developers from going outside the pipeline to access the production box. Even better is to restrict production access for everyone and allow changes to production only via a “cleanroom” pipeline. “Unauthorized change monitoring” then makes it possible to detect changes made in production without a change order or ticket.
The results with this cleanroom model are striking. At Capital One, about twenty products were deploying multiple times a day in 2016; by 2017 it was hundreds. You could not make this happen manually. You need cleanrooms and control points to do this.
Risk cannot be mitigated with typical “CI/CD” automation alone. At a minimum, the three controls described here (two sets of eyes, least privilege, and unauthorized change monitoring) need to be built into the automation process. This is only possible via a good partnership with governance and audit teams. That partnership allows for safe, well governed DevOps pipelines where development teams own, and can proactively solve for, their risk without having to wait until an audit reveals problems. By teaming up early on, your first and second line teams can create processes that let developers build more effectively as they go while properly managing risk in a DevOps environment.