Quick Introduction to Instance-Based Learning in Machine Learning

Instance-based learning is a machine learning technique that relies on storing and recalling instances or examples of training data.

You may have also heard instance-based learning referred to as memory-based learning or lazy learning.

The names fit: the model keeps its training examples in memory and puts off nearly all of the work until it is asked for a prediction.

This guide will discuss instance-based learning, how it works, and some of its benefits.

We will also provide examples of how you can use this technique in your projects!


Why Do We Use Instance-Based Learning?

While all of machine learning learns from examples, instance-based learning does not fit a general function to them. Instead, it keeps the examples and treats them as the truth at prediction time.

This contrasts with other types of machine learning, which focus on generalizing to some equation from a set of training data.

This approach has both advantages and disadvantages.

Advantages of Instance-Based Learning

Below, we break down a few of the advantages of instance-based learning.

We Can Train Highly Specific Algorithms

Since our algorithms no longer try to “generalize,” we can build highly specific models based on only a few training data points.

Instead of building a “Cat Classifier,” we could create an “Is the Cat Wearing a Green Hat Classifier” very quickly.

Instance-Based Learning Can Work With Small Amounts of Training Data

Traditional machine learning models usually need a good amount of data to build their generalizations. In instance-based learning, this is less of a problem, as our model references our training data directly.
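
To make "reference our training data directly" concrete, here is a minimal sketch of a 1-nearest-neighbor classifier in plain Python. The four data points, their labels, and the `predict_1nn` helper are all made up for illustration; the point is only that there is no training step, just a lookup against the stored examples.

```python
import math

# Hypothetical toy data: (feature vector, label) pairs, stored as-is.
training_data = [
    ((1.0, 1.0), "green"),
    ((1.5, 2.0), "green"),
    ((5.0, 5.0), "blue"),
    ((6.0, 4.5), "blue"),
]

def predict_1nn(point, examples):
    """Return the label of the stored example closest to `point`."""
    # No model is fitted: we scan the stored examples at prediction time.
    _, nearest_label = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return nearest_label

print(predict_1nn((1.2, 1.4), training_data))
```

With only four stored points, this already gives sensible answers for queries that sit close to something it has seen.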

Easier to Understand The Results

Instance-based learning models are often much easier to understand than other methods, since they rely on simple examples rather than complex mathematical models.
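
One way to see this is that a prediction can be explained by pointing at the stored examples that produced it. The sketch below assumes a k-nearest-neighbors vote with k=3; the `explain_knn` helper and the data are hypothetical, written only to illustrate the idea.

```python
import math
from collections import Counter

def explain_knn(point, examples, k=3):
    """Return (predicted_label, neighbors) for the k closest stored examples."""
    neighbors = sorted(examples, key=lambda ex: math.dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    predicted_label = votes.most_common(1)[0][0]
    return predicted_label, neighbors

# Hypothetical stored examples.
examples = [
    ((1.0, 1.0), "green"),
    ((1.5, 2.0), "green"),
    ((2.0, 1.0), "green"),
    ((5.0, 5.0), "blue"),
    ((6.0, 4.5), "blue"),
    ((5.5, 6.0), "blue"),
]

label, neighbors = explain_knn((1.4, 1.3), examples)
print("prediction:", label)
for features, neighbor_label in neighbors:
    # Each line is a concrete training example behind the decision.
    print("because of stored example", features, neighbor_label)
```

The "explanation" here is just the list of neighbors, which anyone can inspect without knowing the math behind the model.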

Disadvantages of Instance-Based Learning

While those advantages are nice, they don’t come without some disadvantages:

Instance-Based Learning Stores Data In Memory

Since the training data has to be referenced directly, all of our training data must be in memory when we try to make a prediction.

As datasets keep growing, this alone is sometimes enough to make you try other models.
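
A rough back-of-the-envelope estimate shows how quickly this adds up. The sketch below assumes the examples are stored as dense 64-bit floats (8 bytes per value) and ignores any index or label overhead; the function name and the dataset size are made up for illustration.

```python
def estimated_memory_bytes(n_examples, n_features, bytes_per_value=8):
    """Rough size of a dense training set held in memory."""
    return n_examples * n_features * bytes_per_value

# Hypothetical dataset: 10 million examples with 100 features each.
total_bytes = estimated_memory_bytes(10_000_000, 100)
gib = total_bytes / 1024**3
print(f"{gib:.2f} GiB")  # roughly 7.45 GiB
```

A model that generalizes would typically only need to keep its learned parameters around at prediction time, not the full dataset.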

Lack of Generalization Leaves Blind Spots in Models

Since we do not create a generalized algorithm in instance-based learning, we sometimes are left with a model with “blind spots.”

If we receive data that is different from our training data, we will often receive a very poor prediction – since our algorithm hasn’t seen anything like this before.

Model Decay and Model Drift

As the world changes, these instance-based learners quickly become “stale.”

Because predictions come straight from the stored examples, instance-based learners will need to be refreshed with new data much more often.
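
One possible way to limit staleness is to keep only the most recent examples. The sketch below is a hypothetical `SlidingWindowNN` class that assumes newer examples matter more than older ones, which will not hold for every problem.

```python
import math
from collections import deque

class SlidingWindowNN:
    """1-nearest-neighbor over only the most recent `max_size` examples."""

    def __init__(self, max_size):
        # A deque with maxlen drops its oldest item when a new one is
        # appended to a full window.
        self.examples = deque(maxlen=max_size)

    def add(self, features, label):
        self.examples.append((features, label))

    def predict(self, point):
        _, label = min(self.examples, key=lambda ex: math.dist(point, ex[0]))
        return label

model = SlidingWindowNN(max_size=2)
model.add((1.0, 1.0), "old")
model.add((1.0, 1.1), "new")
model.add((9.0, 9.0), "new")  # the window is full, so the "old" example is dropped

print(model.predict((1.0, 1.0)))
```

The query sits exactly on the dropped point, but the prediction now comes from the newer examples that remain in the window.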

Instance-Based Machine Learning Algorithms

There are three main categories of instance-based machine learning algorithms:

  1. Lazy Learners (K-Nearest Neighbors)
  2. Radial Basis Functions (RBF Kernel)
  3. Case-Based Reasoning (CBR)
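
To give a feel for the second category, here is a small sketch of an RBF (Gaussian) kernel, which scores similarity as exp(-gamma * squared distance), used to weight a vote over stored examples. The `rbf_weighted_vote` helper, the `gamma` value, and the data are assumptions for illustration; this is not a full RBF network.

```python
import math
from collections import defaultdict

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian RBF similarity: 1.0 for identical points, near 0 for distant ones."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)

def rbf_weighted_vote(point, examples, gamma=1.0):
    """Each stored example votes for its label, weighted by its similarity."""
    scores = defaultdict(float)
    for features, label in examples:
        scores[label] += rbf_kernel(point, features, gamma)
    return max(scores, key=scores.get)

# Hypothetical stored examples.
examples = [
    ((1.0, 1.0), "green"),
    ((1.5, 2.0), "green"),
    ((5.0, 5.0), "blue"),
    ((6.0, 4.5), "blue"),
]

print(rbf_weighted_vote((1.2, 1.5), examples))
```

Unlike a plain k-nearest-neighbors vote, every stored example contributes here, but far-away examples contribute almost nothing.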

Instance-Based Learning Example

We think instance-based learning is easier to see with an example.

Let’s say you were building an instance-based classifier, and you had 6 points of data.

That is too little data to build a generalizable model, but we can leverage the training points directly and classify objects closely related to what we’ve seen.

In the image below, we build “zones” around our points and color each zone by the class of its point.

[Image: instance-based learning example, enjoymachinelearning.com]

For green and blue points that fall into our “zones,” this is awesome, and we’re able to get a great classifier.

But what happens when our new point falls outside of our zone?

The situation above is one of the classic problems with instance-based learners: nothing they have stored resembles the new data point, so they have no good basis for classifying it.
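
The example can be sketched in a few lines of Python. The six coordinates and the zone radius below are made up, since the original image does not give exact values, and returning `None` for an out-of-zone point is just one possible way to handle it.

```python
import math

# Hypothetical coordinates for the six points in the example.
points = [
    ((1.0, 1.0), "green"),
    ((1.5, 1.8), "green"),
    ((2.0, 1.2), "green"),
    ((6.0, 6.0), "blue"),
    ((6.5, 5.2), "blue"),
    ((5.8, 6.8), "blue"),
]

ZONE_RADIUS = 1.5  # assumed size of the "zone" around each stored point

def classify_with_zones(point, examples, radius=ZONE_RADIUS):
    """Return the nearest example's label, or None if the point is outside every zone."""
    features, label = min(examples, key=lambda ex: math.dist(point, ex[0]))
    if math.dist(point, features) > radius:
        return None  # nothing we have stored is close enough to trust
    return label

print(classify_with_zones((1.2, 1.3), points))   # inside a green zone
print(classify_with_zones((10.0, 0.0), points))  # outside every zone
```

Without the radius check, the classifier would still hand back the label of the nearest stored point for the far-away query, even though that point is nothing like it.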

Dylan Kaplan