How to Train and Maintain Machine Learning Models

Video produced by Steve Nathans-Kelly

At Data Summit Connect 2021, Optum SVP Sanji Fernando explained how his organization approaches machine learning model training, evaluation, and retraining.

Optum created an analytic center of excellence to connect teams from different business segments to manage analytic capabilities and help advance analytics in healthcare. With the successful adoption of its first advanced machine learning and AI models, the company recognized the benefit of building a standard set of models for use broadly across its business. In his presentation, Fernando shared his insights into AI product development, how it informs better decisions, and how several teams can connect to help advance its capabilities and benefits.

"In order to train machine learning models, we almost have a massive paradigm shift in how we think about software since the advent of computing and software development. I've heard other, smarter data scientists and artificial intelligence experts describe programming as micromanagement, but now, with machine learning, we're actually driving towards more goal-oriented management," said Fernando.

"We're presenting information to systems and expressing the goal we want to achieve, but we're not explaining how to achieve that goal—I think Peter Norvig spoke about this once, years ago. It does change the way we think about how to build and train these solutions. It drives us towards a more experiment-driven approach, where initially we might try to design and train a model. More importantly, we establish the criteria by which we evaluate its performance. We then track that experimentation and see how well the model meets those goals, and over time, after much experimentation, if we find a model that works, we might be ready to deploy it as part of a broader business or software product solution."
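The workflow Fernando describes can be sketched in a few lines: fix the evaluation criterion up front, try candidate models, log every run, and only consider deployment once a model meets the criterion. Everything here (the `ACCURACY_TARGET`, the toy hold-out set, the threshold "models") is illustrative and not Optum's actual tooling.

```python
# Minimal sketch of an experiment-driven training loop: the success
# criterion is fixed BEFORE experimenting, and every run is tracked.
from dataclasses import dataclass, field

ACCURACY_TARGET = 0.80  # evaluation criterion, established up front

@dataclass
class ExperimentLog:
    runs: list = field(default_factory=list)

    def record(self, name, accuracy):
        self.runs.append({"model": name, "accuracy": accuracy})

    def best(self):
        return max(self.runs, key=lambda r: r["accuracy"])

def evaluate(model, data):
    """Fraction of (input, label) pairs the model predicts correctly."""
    correct = sum(1 for x, label in data if model(x) == label)
    return correct / len(data)

# Toy hold-out set: numeric inputs with binary labels.
holdout = [(1, 0), (3, 0), (5, 1), (7, 1), (9, 1)]

# Two candidate "models" (simple threshold rules stand in for real ones).
candidates = {
    "threshold_at_2": lambda x: int(x > 2),
    "threshold_at_4": lambda x: int(x > 4),
}

log = ExperimentLog()
for name, model in candidates.items():
    log.record(name, evaluate(model, holdout))

best = log.best()
ready_to_deploy = best["accuracy"] >= ACCURACY_TARGET
print(best["model"], best["accuracy"], ready_to_deploy)
```

In practice the log would live in an experiment-tracking service rather than an in-memory list, but the shape of the loop is the same: many runs against one pre-committed criterion.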

Fernando said that Optum uses best-of-breed deployment methods and then integrates the model with its software process. "But that's really not the finish line. These models do improve with more data." Optum monitors them to see whether they're operating as expected and can be improved with additional data, and then repeats the cycle while evaluating the model for performance and fairness.
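That monitor-and-retrain cycle can be sketched as follows: score fresh labeled data with the deployed model, and when accuracy drifts below a threshold, refit on the accumulated data. The alert threshold and the toy "model" (a midpoint between class means) are assumptions for illustration only.

```python
# Hedged sketch of a monitoring loop: detect performance drift on new
# data, then retrain on the combined history when drift is detected.
ALERT_THRESHOLD = 0.75

def fit_threshold(data):
    """Fit a 1-D decision threshold halfway between the class means."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, data):
    return sum(1 for x, y in data if int(x > threshold) == y) / len(data)

# Initial training data and the deployed model it produced.
history = [(1, 0), (2, 0), (8, 1), (9, 1)]
threshold = fit_threshold(history)  # midpoint of means 1.5 and 8.5 -> 5.0

# New labeled data arrives; the input distribution has shifted downward.
fresh = [(3, 1), (4, 1), (0, 0), (1, 0)]
live_acc = accuracy(threshold, fresh)

retrained = False
if live_acc < ALERT_THRESHOLD:
    history.extend(fresh)               # fold the new data in
    threshold = fit_threshold(history)  # refit and redeploy
    retrained = True

print(live_acc, retrained, threshold)
```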

"We may not be able to fix everything in healthcare, but we can understand how the model outputs might affect individuals of different protected classes and demographics, such as age, race, and gender or gender orientation. We're also at a point in time where we're all benefiting from lots of great breakthroughs in technology and hardware, but also in models of computing and open source tools, that allow us to put together what we hope will be best-of-breed tools today, and to add to this collection over time." Optum has chosen to use specific cloud providers such as Microsoft, and with their help, as well as the contributions of many others in the open source community, it can use tools that allow it to manage experiments, manage data transformations, execute and run models, evaluate for bias, and much more, said Fernando.
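One common way to examine how model outputs affect different protected classes is to compare positive-prediction rates across groups (a demographic-parity check). The records, group names, and the 0.2 disparity tolerance below are illustrative assumptions, not a clinical or regulatory standard.

```python
# Sketch of a group-level fairness check: compare the share of positive
# model outputs for each protected group and flag large disparities.
from collections import defaultdict

def positive_rates(records):
    """Map each group to its share of positive model outputs."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, prediction in records:
        totals[group] += 1
        positives[group] += prediction
    return {g: positives[g] / totals[g] for g in totals}

# (protected group, model output) pairs from a scored batch.
scored = [
    ("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

rates = positive_rates(scored)
gap = max(rates.values()) - min(rates.values())
flagged = gap > 0.2  # send the model for review if the disparity is large

print(rates, round(gap, 2), flagged)
```

A real pipeline would apply this per attribute (age, race, gender) and alongside error-rate comparisons, since parity in outputs alone does not capture every kind of unfairness.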

"We are constantly iterating on this to see if there are new tools that could be added or swapped out, again leveraging some of the great work done by technology companies and cloud providers, but also contributions from across the open source community, to make this process not simply seamless, but more performant, productive, and effective."