What Actually Is Machine Learning?

Now that I’ve posted two things about going to the ML conference, I’ve had two people I know ask me about machine learning. For all the attention and press that the field has received, it seems that a lot of people still just don’t quite understand what it is.

I completely understand the confusion. It seems like every explanation I see focuses more on what machine learning can do, and not on what it is. For non-technical people, this is probably very helpful, but if you’re somewhat programming-savvy, I think there are only so many times you can hear “Well, do you know Siri? Siri uses machine learning to give you movie showtimes!” or whatever. At some point you might wonder what is actually going on.

I’m going to give a very simple explanation, just of machine learning itself. I’d like to give more explanations, but I find it tough to explain something like a neural network, for instance, in really simple language. I think the reason is that the concept is not easy to compare to some common human experience. If you want to explain a wig to someone, you say “You know how hair looks?” If you want to explain a plane to someone, you’d say “You know how birds fly?”

However, even though neural networks are based on human brains, the majority of people are not going to understand “You know how neurotransmitters diffuse across a synapse?” or whatever. So many discoveries and inventions start with someone observing something in the world, and then applying it in some other way. If all the ML concepts made for simple, clear analogies to stuff that everyone understands, it wouldn’t be a field that was exploding right now in the mid-2010s. More people would have jumped onto it earlier, and it would be even more widespread already.

Anyhow, I got a bit off topic, but here’s my explanation for the whole “What is machine learning?” subject:

In traditional computer programming, most of the time you have 3 things. It might help to visualize these as boxes:

  1. An input. This is some information/data that goes into a function.
  2. A function. This is usually some kind of command or procedure that does something to the input.
  3. A result. This is what you’re searching for when you stick box 1 into box 2.

For instance, Box 1 might have a number inside, Box 2 is a programming function that doubles that number, and Box 3 is the result. Box 3 is what you’re trying to discover. It’s the unknown step.

So you stick 33 in Box 1, you put it into Box 2, and when you open Box 3, you have 66.
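If it helps to see the boxes as actual code, here’s a tiny Python sketch. The names (double, box_1, box_3) are just ones I made up to match the boxes; nothing about them is special.

```python
def double(number):
    # Box 2: a function we already know, because we wrote it.
    return number * 2

box_1 = 33             # Box 1: the input
box_3 = double(box_1)  # Box 3: the result we were trying to discover

print(box_3)  # 66
```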

With Machine Learning, you are approaching the problem like this:

You start with a whole bunch of Box 1s and a whole bunch of corresponding Box 3s. Then the whole point of machine learning is to figure out what the function in Box 2 is. It basically looks at an input in Box 1, sees how it comes out in Box 3, and after doing this for thousands, or millions, of boxes, it makes a (hopefully very good) approximation of what Box 2 is doing. Box 2 is what you’re trying to discover. It’s the unknown step.
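Here’s a toy Python sketch of the same idea. It cheats a lot: the data is made up, and I’m assuming up front that Box 2 has the shape “multiply the input by some number”, so the only thing left to figure out is the number. It uses a basic least-squares fit to pick that number. Real machine learning works with far more flexible kinds of functions and far more data, and all the names here are just mine.

```python
# Lots of Box 1s (inputs) and their matching Box 3s (results).
# Made-up example data: each result happens to be double its input.
box_1s = [1, 2, 3, 4, 5]
box_3s = [2, 4, 6, 8, 10]

def learn_multiplier(inputs, results):
    # Assumption: Box 2 has the shape "input * some number".
    # Least squares picks the number that best fits all the pairs.
    top = sum(x * y for x, y in zip(inputs, results))
    bottom = sum(x * x for x in inputs)
    return top / bottom

learned_multiplier = learn_multiplier(box_1s, box_3s)

def box_2(number):
    # Our approximation of the unknown function.
    return number * learned_multiplier

print(learned_multiplier)  # 2.0
print(box_2(33))           # 66.0
```

Notice that nobody ever typed “times 2” anywhere. The 2 came out of the examples.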

So actually, now I’m worried I explained this confusingly, but the point is just that instead of working with a function that you know, and finding output that you don’t know, you’re flipping those things. You know what the output is, and you’re just trying to figure out what the function is.

How does this happen? Oh hell that’s complicated, and I don’t know if anyone reading this is that interested in it. I’ll see how this post goes over.