On Minimizing Loss

Guest Post By Zane Martin, 2020-2021 Sustainability Leadership Fellow and Postdoctoral Fellow in the Department of Atmospheric Science

2020 was a year of loss. In many facets of American life, we lost more than we gained over the past year: we have lost rituals and ceremonies; we have lost small pleasures, like sitting at a table with friends, or crowding into a park on a sunny day; we have lost lives to a pandemic that continues to plague us.

As a climate scientist, I also think about loss a lot in the context of my work: I think about what we have already lost environmentally and, unless we take action to curb rising emissions, what we stand to lose globally. But recently my work as a scientist has dealt more directly with the idea of loss. And in that capacity, it has given me reason to hope.

Over the past few months, I have worked with a series of tools in a branch of data science called machine learning. It sounds futuristic, and if you move in scientific circles these days it’s a prevalent and trendy topic. Google and Facebook and Amazon and the “who’s who” of the tech world use machine learning and are spending money and time and labor investing in it. Scientists in research labs and academia too have started to look at it.

Machine learning is what helps computers identify cats or babies or mountains in digital photo albums, translate sentences between languages, or operate driverless cars. It’s a powerful and flexible mathematical tool. At their core, all machine learning models do one thing: they minimize loss.

This may sound abstruse, but it’s relatively simple. A machine learning model (or, to be technically precise for a moment, a “supervised” machine learning model) is given data (pictures, or pixels, or rows of numbers) and it makes a prediction about some aspect of that data. It tells you whether it thinks there is a cat in a photo or not, or whether a handwritten digit is a 0 or a 1 or a 5. The model is “trained” on data where it knows the answer, so it has information about what it’s trying to guess and can check whether it was right or not.

Machine learning models can be trained to identify which number a hand-written digit shows. Image credit.

Usually it does this by combining different aspects of the data you gave it and weighting those aspects in different ways, like a painter combining colors. Too much yellow, or too much brown, and the painting looks off. But just the right combination of red, and orange, and green, and you’ll get the sunset you’re aiming for. The machine learning model settles on how much it needs to weight different aspects of the data, and then it makes its best guess:

“Yes, your Uncle Bob is in this picture.”

“No, we don’t need to apply the car’s brakes for that.”

“She’s written a 6 over here, and a 1 over here.”

After it makes its guess, the model measures whether it was right or not. And here at last, we’ve arrived at the loss.

In machine learning, the loss is the distance between the model’s guess and the truth. The loss measures how well or poorly the model did. The model makes a guess and calculates the loss. And then it does something mathematically simple, but seemingly miraculous: it follows a series of rules to improve itself in a way that makes the loss smaller. Like water flowing down a hill, the machine learning model figures out which direction is “down” (towards less loss) and moves in that direction, adjusting its weights to better minimize the loss. “A little more red paint, a little less orange, a little less white”, then the model predicts again. It measures the loss again. And it figures out where it needs to go next.
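For readers who like to see the gears turn, here is a minimal sketch of that guess, measure, adjust loop in Python. It is an illustration under my own assumptions, not the code behind any real system: the toy data, the function names, and the choice of mean squared error as the loss are all made up for this example.

```python
# Toy training data (hypothetical): each row is (features, true answer).
# The answers were generated with weights (2, -1), which the model must find.
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]

def predict(weights, features):
    # The guess: a weighted combination of the features.
    return sum(w * x for w, x in zip(weights, features))

def loss(weights, data):
    # Mean squared distance between the guesses and the truth.
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def gradient(weights, data):
    # Slope of the loss with respect to each weight ("which way is uphill").
    grad = [0.0] * len(weights)
    for x, y in data:
        error = predict(weights, x) - y
        for i, xi in enumerate(x):
            grad[i] += 2 * error * xi / len(data)
    return grad

def train(data, steps=200, step_size=0.1):
    weights = [0.0] * len(data[0][0])   # start with an uninformed guess
    history = [loss(weights, data)]     # record the loss after every step
    for _ in range(steps):
        grad = gradient(weights, data)
        # Move each weight a little way "downhill", against the slope.
        weights = [w - step_size * g for w, g in zip(weights, grad)]
        history.append(loss(weights, data))
    return weights, history

weights, history = train(data)
print("learned weights:", weights)
print("loss at start:", history[0], "loss at end:", history[-1])
```

Each pass through the loop is one round of "predict, measure the loss, adjust". The loss recorded in `history` shrinks as the weights drift towards the values that generated the data.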

Machine learning models are often trained to identify objects in pictures. Sometimes tasks that are easy for humans can be difficult for these models, as in the ‘cats or croissants’ example above. Image credit.

It’s important to recognize the model is a machine. The model is a series of rules, defined and strict, like a code of laws. And this is where we as scientists, or really we as humans – with our messiness and our nuance and our capacities for imagination and inspiration – come in. Only the scientist, not the model, can define the loss. We chart the path ahead. We determine what we are willing to compromise each time we step forward.
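To make that concrete, here is a small Python sketch, again with invented numbers and names, of how the choice of loss changes what counts as the "best" guess. It compares two common definitions, squared error and absolute error, on the same data, and simply tries every whole-number guess rather than training a model.

```python
# Hypothetical measurements, including one extreme value.
observations = [1, 2, 3, 4, 100]
candidates = range(0, 101)  # every whole-number guess from 0 to 100

def squared_loss(guess):
    # Punishes large misses very heavily.
    return sum((guess - x) ** 2 for x in observations)

def absolute_loss(guess):
    # Punishes every unit of miss equally.
    return sum(abs(guess - x) for x in observations)

best_squared = min(candidates, key=squared_loss)
best_absolute = min(candidates, key=absolute_loss)

print("best guess under squared loss:", best_squared)
print("best guess under absolute loss:", best_absolute)
```

Under squared loss the best guess is pulled all the way to the average of the data; under absolute loss it sits at the middle value and largely ignores the extreme one. Neither is wrong. Which one we want depends on what we are willing to compromise, and that is a decision the model cannot make for us.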

Even once we define the loss, these models are never perfect. For complex problems, even exceptional machine learning models rarely reach a loss of zero. As new data comes in and we move into uncharted realms, we can never not have loss. All we can hope to do is find a minimum, and step towards it. That step forward, towards an imperfect approximation of what we want, is the best we can do.

I’ve also learned, in working with these models, that sometimes we have to take smaller steps than we want. Like a car with too much momentum flying through a stop sign, or a reader skimming too fast and missing a vital plot point, sometimes in machine learning if we rocket too quickly towards the minimum we can blow right by it. In those instances, it’s smaller steps that get us where we want to be. If we make the steps smaller, the model takes longer to train, like a hiker inching their way down a mountain. But it’s safer, and surer. There are times when we have to endure more loss, and move more slowly, and work harder, to arrive where we want.
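Here is a rough sketch of that trade-off in Python, using the simplest hill I can think of: a loss equal to the square of a single weight, whose minimum sits at zero. The function name and the three step sizes (often called learning rates) are my own choices for illustration.

```python
def descend(step_size, start=1.0, steps=50):
    # Walk downhill on loss(w) = w**2, whose slope at w is 2*w.
    w = start
    path = [w]
    for _ in range(steps):
        w = w - step_size * 2 * w   # step against the slope
        path.append(w)
    return path

too_big = descend(step_size=1.1)    # overshoots the minimum on every step
moderate = descend(step_size=0.1)   # settles close to the minimum
tiny = descend(step_size=0.01)      # heads the right way, but slowly

for name, path in [("too big", too_big), ("moderate", moderate), ("tiny", tiny)]:
    print(name, "-> distance from the minimum after 50 steps:", abs(path[-1]))
```

With the largest step, every move leaps over the minimum and lands farther away than it started, so the walk never settles. The smallest step never overshoots, but after the same number of moves it is still well short of the bottom. The moderate step gets closest of the three.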

I’m writing this blog during a difficult period in our collective national life. But the scientist in me has reason to hope that we are in a system that ultimately, and somewhat miraculously, is built to minimize loss. That we as a people have defined what it is that we are and aren’t willing to sacrifice. That we have charted our course. That, in this instance, we have to accept a small step. That we are working, training, iterating at every moment, to minimize our loss.