Machine Learning in the Enterprise: Lessons from the Front Lines

Dave Castillo

February 13, 2019

Machine learning (ML) and artificial intelligence (AI) are making their way into all kinds of business operations today as more companies explore how they can put their data to work. Just this past month, Capital One’s 2019 Machine Learning Survey (Are We There Yet? The 2019 State of Machine Learning Survey) showed that a surprisingly large number of early adopters are realizing new business value from ML: 80 percent of respondents said machine learning is impacting their organizations now. But the survey also revealed widespread disagreement as companies find their way.

There was a notable lack of consensus about the best approach to dealing with risk and other issues introduced by ML. As with any new field, the survey shows that people are learning on the fly and still figuring out what makes sense and what does not.

At Capital One, I lead the Center for Machine Learning, an internal center of excellence in ML that works with lines of business as they develop and implement AI and ML systems. Over the past two years alone, we’ve built a number of machine learning applications, and as our processes have evolved, we’ve established many best practices and playbooks, learning a lot along the way.

For companies that may just be getting started with ML, here are some of the lessons we’ve learned about how to approach ML implementation in the enterprise.

Be clear about the goal and whether it truly requires machine learning

It’s easy to be seduced once you set your sights on ML, so the first question to consider is whether an ML solution is really the right approach to achieve the outcome you envision. Sometimes you may just need to clean up your data or solve a business intelligence problem.

Among the most common reasons to apply ML to business processes is the ability of the technology to perform certain functions at scale, which would otherwise require vast amounts of time and resources. It’s also common to rely on ML to analyze volumes of data beyond human capability (like crunching hundreds of thousands of variables in real time).

These are just a few initial considerations. The key is to ensure your use case really is one that wouldn’t be feasible or make sense without machine learning, and that you have access to the data and skills required to make it work.

Do your data due diligence

Data is the fuel that drives a machine learning solution, so without the right data, it’s nearly impossible to get an ML solution to work properly.

But the data engineering piece can be an overwhelming challenge if you’re not prepared. And laying down the technical framework for getting access to that data can be daunting.

It’s important to have the proper feature ecosystem in place. Otherwise you will be frustrated watching your model performance deteriorate because you locked it into a fragmented or disconnected proprietary data store.

Take a holistic view

Another common pitfall in implementing ML is creating point solutions that only work for a single use case. As we all know, the world changes, and those changes can leave narrowly focused solutions behind, or at least limit their value in the organization.

Once you put a model into production, you must ensure that the proper interfaces are in place to deliver features to it. This becomes especially important when you are dealing with many use cases. At Capital One, we look at data signals across our entire organization.

By doing so, you can very quickly evolve from highly segmented views to robust models that incorporate almost every aspect of your data. This may not be possible for everyone, but when you step back, you can start to see all the business facets that are potential use cases for ML, which can allow you to create a much richer and more effective model.

Realize that not everything needs to be a deep learning solution

While deep learning is a fascinating and exciting field of ML, and one that we are deeply invested in, I’ve seen how its allure can be deceiving for enterprise solutions. In many cases, basic supervised learning techniques are sufficient to achieve the desired outcome.

At Capital One, we’re highly focused on identifying the right algorithm for a given use case. For example, highly regulated use cases that require high degrees of explainability are often better suited to basic supervised learning solutions. We’ve seen plenty of supervised learning algorithms that could solve many of the problems businesses face, and they can be much easier to get into production.
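
To make the explainability point concrete, here is a minimal sketch in Python of a basic supervised model: a logistic regression fit by plain gradient descent. The feature names (utilization, late_payments), the toy data, and the helper functions are all hypothetical and invented for illustration; this is not a description of any Capital One model. The only point is that a linear model’s score breaks down into per-feature contributions that a reviewer can read directly.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights and bias with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = sigmoid(z) - y  # gradient of log loss w.r.t. z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def contributions(weights, x, names):
    """Per-feature contribution to the log-odds for one example."""
    return {name: w * xi for name, w, xi in zip(names, weights, x)}

# Hypothetical toy data: both features scaled to the range 0..1.
FEATURES = ["utilization", "late_payments"]
ROWS = [[0.1, 0.0], [0.2, 0.0], [0.3, 0.2],
        [0.8, 0.6], [0.9, 0.4], [0.7, 1.0]]
LABELS = [0, 0, 0, 1, 1, 1]  # 1 = the outcome the model predicts

weights, bias = train_logistic(ROWS, LABELS)

applicant = [0.85, 0.8]  # hypothetical new example
parts = contributions(weights, applicant, FEATURES)
log_odds = bias + sum(parts.values())

print("contributions:", parts)
print("bias:", bias)
print("probability:", sigmoid(log_odds))
```

Because the prediction is just the bias plus a sum of weighted features, each entry in the printed contributions shows how much one input pushed the score up or down, which is the kind of transparency that is much harder to get from a deep network.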

Understand the risks

You can’t talk about machine learning without discussing the risks involved. Do you understand what this algorithm is doing? Is there the potential for unintended consequences, like bias or unequal outcomes?

Even assuming that your model developers are all highly qualified and responsible people, you may find situations where bias creeps into the model, which can translate into many things, from poor performance to regulatory issues or even disparate human impact. At Capital One, we think it’s paramount to advance the responsible use of ML, and we have several ongoing initiatives across the enterprise in this area, from research and partnerships to multidisciplinary internal working groups.
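
One simple way to start looking for unequal outcomes is to compare how often a model’s decisions favor each group. The Python sketch below is illustrative only: the group labels, the sample decisions, and the 0.8 threshold are assumptions made for this example. The threshold is borrowed from the "four-fifths" screening heuristic used in US employment guidelines, and it is not a method described in this article. A low ratio is a prompt for closer review, not proof of bias.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Return {group: share of favorable decisions} from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / totals[group] for group in totals}

def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received any favorable decision
    return min(rates.values()) / highest

# Hypothetical model decisions: group A approved 8 of 10, group B 5 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2 +
    [("B", True)] * 5 + [("B", False)] * 5
)

rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)
needs_review = ratio < 0.8  # assumed screening threshold, see note above

print("selection rates:", rates)
print("ratio:", ratio, "needs review:", needs_review)
```

A check like this only covers one narrow definition of fairness, so in practice it would be one input to the kind of multidisciplinary review described above rather than a replacement for it.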

We’re also very focused on creating a well-managed and well-governed ML environment, and are exploring ways to enhance these efforts by applying ML to the process itself. For example, we’re exploring whether applying machine learning to certain internal processes can compress the ML lifecycle from months to days, resulting in a transformational level of efficiency and agility.

Recognize it's early and machine learning will continue to evolve

Today, we’re still a long way from machine learning that’s totally automated and plug-and-play. It still requires highly skilled people to play a role, and a great degree of diligence in each phase of the lifecycle.

Yet while there are still many factors that have to be considered, including whether machine learning is even the right solution for you, there’s no doubt we’re in the midst of an exciting transformation as more companies figure out novel ways to make sense of their data.

Dave Castillo, MVP, Machine Learning

Dr. David Castillo is a Managing Vice President of Machine Learning at Capital One. He heads up Applied ML Research, University Partnerships, ML Technology, tools and frameworks, and ML Consulting within the Capital One LOBs on Machine Learning products and project opportunities. David has over 20 years of experience developing applications involving artificial intelligence, big data, machine learning, and large-scale distributed computing technologies across a wide variety of industries. David is an advocate of analyzing data streams "inflight" to distill meaningful content from volumes of data in near real time and of applying automation for creating model features and driverless ML model refitting. David is an active speaker and participant in industry events and serves as an Adjunct Professor of Computer Science at University of Maryland University College.