How is Reinforcement Learning different from Supervised/Unsupervised Learning?

Pranav Anand Joshi
Dec 28, 2017
Updated: Feb 21, 2019

Let's understand the terms supervised learning, unsupervised learning, and reinforcement learning with real-world examples.

Supervised/Unsupervised Learning

Scenario 1

You are a kid and you see different types of animals. Your father tells you that a particular animal is a dog. After he has pointed out dogs a few times, you see a new type of dog that you have never seen before, and you still identify it as a dog and not as a cat, a monkey, or a potato.

Scenario 2

You go backpacking in a new country that you do not know much about: its food, culture, language, and so on. However, from day one you start making sense of things there: learning to eat new cuisines (including what not to eat), finding your way to the beach, and so on.

Scenario 1 is an example of supervised classification: you have a teacher who guides you as you learn concepts, so that when a new sample you have not seen before comes your way, you can still identify it.
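Scenario 1 can be sketched in a few lines of Python. This is only an illustration, not a recommended method: the features (weight and height), the numbers, and the `classify` helper are all made up for this example, and the "learning" is a simple nearest-neighbour lookup over the labeled examples the teacher provided.

```python
import math

# Hypothetical labeled examples: (weight in kg, height in cm) -> animal name.
# The labels play the role of the father in Scenario 1.
training_data = [
    ((30.0, 60.0), "dog"),
    ((8.0, 30.0), "dog"),
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((7.0, 50.0), "monkey"),
    ((9.0, 55.0), "monkey"),
]


def classify(sample, labeled_examples):
    """Return the label of the labeled example closest to `sample`."""
    _, nearest_label = min(
        labeled_examples,
        key=lambda example: math.dist(sample, example[0]),
    )
    return nearest_label


# A dog the learner has never seen before.
new_animal = (25.0, 55.0)
predicted = classify(new_animal, training_data)
print(predicted)
```

The key point is that every training example comes with its correct answer, and the prediction for a new sample is derived from those answers.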

Scenario 2 is an example of unsupervised learning: you have lots of information, but initially you do not know what to do with it. The major distinction is that there is no teacher to guide you, and you have to find your way on your own. Then, based on *some* criteria, you start sorting that information into groups that make sense to you.

Basically, in unsupervised learning we know only the inputs; there are no targets (no labeled outputs). We have to find a pattern ourselves, that is, the structure or relationships among the different inputs. One of the most important unsupervised learning techniques is clustering, which groups the inputs into clusters and can then place any new input in the appropriate cluster.
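To make the clustering idea concrete, here is a minimal k-means sketch. The data points, the choice of k = 2, and the decision to use the first k points as the initial centroids are all assumptions made for this illustration; real implementations usually initialise more carefully.

```python
import math


def assign(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, k, iterations=10):
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Step 1: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


# Hypothetical unlabeled inputs: no targets, only coordinates.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids = k_means(points, k=2)

# A new input is placed in the cluster whose centroid is nearest.
new_point = (8.5, 9.0)
print(assign(new_point, centroids))
```

Notice that no label appears anywhere: the groups come purely from how close the inputs are to one another.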

Other important points about Supervised/Unsupervised Learning:

“If the targets are expressed as classes, it is called a classification problem. Alternatively, if the target space is continuous, it is called a regression problem.”
Can supervised learning be carried out after clustering? Usually not, because clustering and classification (or supervised learning) address two different situations in machine learning. If you already have class labels, why would you do clustering? What would be the purpose of that? Conversely, if you don't have class labels, you can't do classification, and only clustering is possible to understand the possible groups within the data. It is very important to understand when we should do clustering and when we should do classification.
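As a small illustration of the regression case mentioned in the quote above, the sketch below fits a straight line to a continuous target using ordinary least squares. The puppy data and the `fit_line` helper are invented for this example; the only point is that the prediction is a number rather than a class name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical data: age of a puppy in months -> weight in kg (continuous target).
ages = [1.0, 2.0, 3.0, 4.0]
weights = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(ages, weights)
predicted_weight = slope * 5.0 + intercept  # regression: the output is a number
print(predicted_weight)
```

If the target were instead a label such as "dog" or "cat", the same inputs would define a classification problem.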

Reinforcement Learning

Reinforcement Learning is learning what to do, that is, how to map situations to actions, so as to maximize a numerical reward signal. The learner is not told which action to take, but instead must discover, by trying them, which actions yield the most reward. Let’s understand this with a simple example below.

Consider an example of a child learning to walk.

Here are the steps a child will take while learning to walk:

The first thing the child will do is observe how you walk. You use two legs, taking one step at a time in order to walk. Grasping this concept, the child tries to replicate you.
But soon he/she will understand that before walking, he/she has to stand up! This is a challenge that comes along while trying to walk. So now the child attempts to get up, staggering and slipping but still determined to get up.
Then there’s another challenge to cope with. Standing up was easy, but staying upright is another task altogether! Clutching at thin air to find support, the child manages to stay standing.
Now the real task for the child is to start walking. But it’s easier said than done. There are so many things to keep in mind, like balancing the body weight and deciding which foot to put forward next and where to put it.

Let’s formalize the above example. The “problem statement” of the example is to walk, where the child is an agent trying to manipulate the environment (which is the surface on which he/she walks) by taking actions (viz. walking), and he/she tries to go from one state (viz. each step he/she takes) to another. The child gets a reward (let’s say chocolate) when he/she accomplishes a submodule of the task (viz. taking a couple of steps) and will not receive any chocolate (a.k.a. negative reward) when he/she is not able to walk. This is a simplified description of a reinforcement learning problem.
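The agent, state, action, and reward vocabulary above can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the post itself does not name a specific one). Everything here is an assumption made for illustration: the "surface" is a corridor of five positions, the child starts at position 0, the chocolate is a reward of 1.0 at the last position, and every other step earns 0.0. The names `step`, `train`, and `greedy_policy` and all parameter values are hypothetical.

```python
import random

N_STATES = 5          # positions 0..4 on the floor; position 4 is the goal
ACTIONS = (-1, +1)    # step backward or step forward
GOAL = N_STATES - 1


def step(state, action):
    """Environment: move the child, then report (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0  # chocolate only at the goal
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random sometimes, and whenever both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Nobody tells the agent the right action; it only sees the reward.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-goal position."""
    return [ACTIONS[max(range(2), key=lambda a: q[s][a])] for s in range(GOAL)]


q_table = train()
print(greedy_policy(q_table))
```

After training, the greedy policy is to step forward from every position. The agent was never shown that answer; it discovered it from the reward signal alone, which is the contrast with the supervised example in Scenario 1.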

How is Reinforcement Learning different from Supervised/Unsupervised Learning?

[Comparison of Supervised/Unsupervised Learning and Reinforcement Learning; content not recovered]
