
Simplest algorithm to get started with machine learning

Published May 12, 2019. Last updated Nov 07, 2019.

Hello, my name is Alex, and this article is about the KNN (k-nearest neighbors) algorithm. It's been a while since I first came across the term "machine learning". For me, as a front-end developer, it has always been a struggle to understand. There are so many things to take in: supervised, unsupervised, and reinforcement learning, plus hundreds of types of algorithms and neural networks.

So I was looking for a gentle start: an algorithm that is easy to understand with a basic level of math.

How does it work?


First of all, let's define a simple problem to make the rest of the explanation easier to follow.

We have a field. Let's say one part of this field contains mainly circles, and the other part mainly triangles. All of them have coordinates (X, Y). Not quantum physics, ha.

The question: which type is a new point at the selected coordinates (x, y)?

Step 1 - let's measure the distance

Obviously, the closer the point is to the group of circles, the higher the probability that it is a circle as well. The same goes for triangles. So all we need to find out is how far this point is from each of these groups.

The thing is, we can't really measure the distance from a point to a whole group.
So we will calculate the distance from the selected point to every other element on the field.

Euclidean distance formula for 2-dimensional space, for two points (x1, y1) and (x2, y2):

distance = sqrt((x2 - x1)^2 + (y2 - y1)^2)


Distance Type
3.4 Triangle
3.7 Triangle
3.7 Triangle
4.1 Triangle
4.3 Triangle
5 Circle
2.7 Circle
2.9 Circle
3.3 Circle
4.1 Circle
3.9 Circle
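
Step 1 can be sketched in a few lines of Python. This is only an illustration: the article does not list the coordinates behind the table above, so the field and the selected point below are invented, and the name euclidean_distance is just a helper made up for this example.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two 2D points given as (x, y) tuples."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

# Hypothetical field: these coordinates are invented for illustration,
# they are NOT the points behind the table in the article.
field = [
    ((1.0, 1.0), "Circle"),
    ((2.0, 1.0), "Circle"),
    ((5.0, 5.0), "Triangle"),
    ((6.0, 5.0), "Triangle"),
]
selected = (2.0, 2.0)

# Measure the distance from the selected point to every element on the field.
distances = [(euclidean_distance(selected, point), kind) for point, kind in field]

for distance, kind in distances:
    print(f"{distance:.1f} {kind}")
```

The result is the same kind of list as the table: one (distance, type) pair per element on the field.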

Step 2 - sort a list

Now we have a list of distances to all points on the field. All that's left to do is sort it in ascending order.

2.7 Circle
2.9 Circle
3.3 Circle
3.4 Triangle
...
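
As a rough sketch, the sorting step could look like this in Python. The distances and types are copied from the table in Step 1; the variable names are made up for the example.

```python
# Distances and types copied from the table in Step 1.
distances = [
    (3.4, "Triangle"), (3.7, "Triangle"), (3.7, "Triangle"),
    (4.1, "Triangle"), (4.3, "Triangle"), (5.0, "Circle"),
    (2.7, "Circle"), (2.9, "Circle"), (3.3, "Circle"),
    (4.1, "Circle"), (3.9, "Circle"),
]

# Sort ascending by distance only. Python's sort is stable, so elements
# with equal distances keep their original relative order.
nearest_first = sorted(distances, key=lambda pair: pair[0])

for distance, kind in nearest_first[:4]:
    print(distance, kind)
```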

Step 3 - select K

The last step is to take the first K elements from the list and count whether there are more circles or triangles among them. Based on this, we can decide the type of the selected point.

So if we take K = 4, the selected point will be classified as a circle, because 3 of the first 4 elements are circles and only one is a triangle. Important note: there can be a situation where each category gets an equal number of votes. That's why, with two categories, it's better to use an odd K value (3, 5, 7, ...).
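
Here is a minimal sketch of the voting step, using the first four rows of the sorted list from Step 2. The function name vote and the choice to return None on a tie are assumptions made for this example; the article does not say how a tie should be resolved.

```python
from collections import Counter

def vote(sorted_neighbors, k):
    """Return the most common type among the first k neighbors, or None on a tie."""
    top_k = [kind for _, kind in sorted_neighbors[:k]]
    counts = Counter(top_k).most_common()
    # If the two leading types have the same count, there is no clear winner.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# First four rows of the sorted list from Step 2.
sorted_neighbors = [
    (2.7, "Circle"),
    (2.9, "Circle"),
    (3.3, "Circle"),
    (3.4, "Triangle"),
]

print(vote(sorted_neighbors, 4))
```

With K = 4 the three circles outvote the single triangle, which matches the example above.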

Now you might ask how to choose the K value. Well, the smaller it is, the stronger the influence of noisy data. And if the value is too big, the prediction might be less accurate, because far-away points start to vote too. So the best value is different for every dataset.

How to work with non-numerical values?

There might be a situation where you have not 2 dimensions but 10, and some of them are not numbers at all. They could be categories, like shape ("square", "round", ...), etc.

In this case we can represent them with the numbers 0 and 1 (this is known as one-hot encoding). Say a parameter has 2 possible values ("square", "round").
We replace it with 2 parameters: "square" and "round".
If an element is of type "square", it will have the values (1, 0).
And if an element is of type "round", it will have the values (0, 1).
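
The replacement described above might be sketched like this. The helper name one_hot is made up for the example, and raising an error for an unknown category is an assumption, not something the article specifies.

```python
def one_hot(value, categories):
    """Return a tuple with 1 at the position of `value` and 0 everywhere else."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return tuple(1 if value == category else 0 for category in categories)

# The two possible values from the example above, in a fixed order.
shapes = ("square", "round")

print(one_hot("square", shapes))
print(one_hot("round", shapes))
```

Each category becomes its own 0/1 dimension, so the distance formula from Step 1 can treat it like any other number.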

Normalization

If we are working with several dimensions that have different ranges, we need to apply normalization. For example, say one dimension has the range 0-1. Its influence on the results will be extremely small in comparison with a dimension that has the range 0-255. And this is not right.

The algorithm is very simple.

  1. Pick the dimension with the smallest range (dimension_A, range 0-1)
  2. Loop through all other dimensions and change their values

dimension_B_value / (dimension_B_range / dimension_A_range)

So if dimension B has a range from 0 to 200, its range is 200 times bigger than the small range of dimension A. So all values of dimension B should be divided by 200. If a value was 100, it will become 0.5.
The same goes for all other dimensions.
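
The formula above can be written as a small Python function. This is a sketch under one assumption: every dimension starts at 0, as in the article's examples, so a range is described by its upper bound alone. The name rescale and its parameters are invented for the example.

```python
def rescale(value, value_range, smallest_range):
    """Apply dimension_B_value / (dimension_B_range / dimension_A_range).

    Assumes every dimension starts at 0, so each range is just its upper bound.
    """
    return value / (value_range / smallest_range)

# Dimension A has range 0-1, dimension B has range 0-200.
dimension_a_range = 1
dimension_b_range = 200

# Rescale a few values of dimension B into the range of dimension A.
dimension_b_values = [0, 100, 200]
rescaled = [rescale(v, dimension_b_range, dimension_a_range) for v in dimension_b_values]

print(rescaled)
```

After this step, a value of 100 in dimension B becomes 0.5, so both dimensions contribute to the distance on the same scale.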

Wanna build cool things?

Check out my course - Hacking JavaScript Career.
https://www.udemy.com/javascript-career-jump-start/

Discover and read more posts from Alex Polymath