The Lazy Learning Algorithm: KNN

Sarthak Arora
3 min read | Jan 6, 2021

You must be aware of why we are advised to stay at least 6 ft away from other people whenever we go out. The closer we get to a COVID-positive patient, the higher the chance that we catch the disease as well.

The K-Nearest Neighbours algorithm works on a similar principle. It assumes that an entity takes on the characteristics of its k nearest neighbours.

That is why it is rightly said, ‘You become like the 5 people you spend the most time with.’ Here, the value of k is taken as 5. (You could really take out a moment to laugh, okay?)

Oh, we’re all lazy!

The KNN is also known as a Lazy Learning Algorithm. Do you know why?

This is because it does absolutely nothing when we give the command to ‘train’ it; it simply memorises the training data. The actual work starts at prediction time, when it calculates the distance of each test point from all the training points and picks the k nearest ones. You do have to specify the value of k before running the algorithm.

For example, linear regression and logistic regression learn their parameters during training. In contrast, the K-Nearest Neighbours algorithm has essentially no training phase at all.

Although this may sound very convenient, this property doesn’t come without a cost: the “prediction” step in K-NN is relatively expensive! Each time we want to make a prediction, K-NN searches for the nearest neighbour(s) across the entire training set!
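To make the “lazy” part concrete, here is a minimal from-scratch sketch (illustrative only, not the author’s code): `fit` merely stores the data, and all the distance work happens inside `predict`.

```python
import numpy as np

class SimpleKNN:
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # Lazy learning: nothing is learned here, the data is only memorised.
        self.X_train = np.asarray(X)
        self.y_train = np.asarray(y)
        return self

    def predict(self, X):
        preds = []
        for x in np.asarray(X):
            # Distance from this test point to every training point.
            dists = np.linalg.norm(self.X_train - x, axis=1)
            # Indices of the k nearest neighbours.
            nearest = np.argsort(dists)[:self.k]
            # Majority vote among the neighbours' labels.
            labels, counts = np.unique(self.y_train[nearest], return_counts=True)
            preds.append(labels[np.argmax(counts)])
        return np.array(preds)
```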

Why is scaling necessary to implement KNN?

KNN, being a distance-based algorithm, requires calculating the distance of a point from all the other points. If the features have different scales, some of them will carry far more weight in that distance than others.

For Example:
Person A weighs 70 kg and is 1.5 metres tall, whereas Person B weighs 60 kg and is 1.6 metres tall. If we calculate the distance between A(70, 1.5) and B(60, 1.6), the weight term will always dominate the height term, as there is a huge difference in scale between the two features.
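You can see the domination in a couple of lines of NumPy (a quick sketch of the example above, nothing more):

```python
import numpy as np

# Person A: 70 kg, 1.5 m; Person B: 60 kg, 1.6 m.
a = np.array([70.0, 1.5])
b = np.array([60.0, 1.6])

# Per-feature contribution to the squared Euclidean distance:
print((a - b) ** 2)           # [100.   0.01] -> weight dominates completely
print(np.linalg.norm(a - b))  # ~10.0005, essentially just the weight difference
```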

That is why it is vital to bring the features to the same scale before running the KNN algorithm on our data. We have a number of options when it comes to scaling; the Min-Max scaler is a popular choice here, as it brings every feature into the range [0, 1].
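Using scikit-learn, one way to wire this up (with a tiny made-up weight/height dataset, purely for illustration) is to put the scaler and the classifier in a single pipeline, so the scaling fitted on the training data is reused on anything you predict on later:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Made-up dataset: (weight in kg, height in m) -> class label.
X = np.array([[70, 1.5], [60, 1.6], [80, 1.8], [55, 1.55]])
y = np.array([0, 1, 0, 1])

# MinMaxScaler squashes every feature into [0, 1] before distances are computed,
# so weight no longer drowns out height.
model = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X, y)
print(model.predict(np.array([[65, 1.58]])))
```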

KNN is a great tool for imputing missing values:

Using central tendencies (mean, median, mode) to impute missing values is not always the right approach, because it ignores any dependence between the variable with missing values and the other variables. The KNN Imputer helps solve that: it looks at the neighbours of the data point with the missing value and fills the value in based on them.
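Here is a small sketch with scikit-learn’s KNNImputer on a made-up matrix (the data is purely illustrative): each missing entry is replaced by the mean of that feature among the row’s nearest neighbours.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Made-up matrix with missing values marked as np.nan.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Each np.nan is filled using the 2 nearest complete neighbours of its row.
imputer = KNNImputer(n_neighbors=2)
print(imputer.fit_transform(X))
```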

So, next time you could try using the KNN Imputer along with any basic imputation technique and compare the results. You never know what works out for you.

But beware: whatever you do, make sure it makes business sense, and mention it to the stakeholders beforehand, as this step could completely change the modelling results.

If you’re here, do give me a follow on Medium and let’s connect on LinkedIn to chat more!
