A guide to building AI apps and artifacts

Chapter 7 - Finding the nearest neighbors

Ken Kahn, University of Oxford


Browser compatibility

This chapter of the guide includes many interactive elements that are known to run well in the Chrome or Edge browsers. See the troubleshooting guide for how to deal with any problems you encounter.

Introduction

K-nearest neighbors (KNN) is a simple technique for finding the lists of numbers that are closest to a new list of numbers. Associated with each list is a label, and each of the nearest neighbors "votes" for its label. The votes are used to score how confident the system is about which label should be assigned to the new list. You can decide how many neighbors vote by specifying the "top k"; usually it is a small odd integer.

One could implement KNN in Snap! without any new blocks by iterating through a list of lists of numbers, computing the distance from the new list to each of the existing lists. The distance could be measured using Euclidean distance (in higher dimensions) or cosine similarity. The results can be sorted by distance and the top k elements used to tally the associated labels. However, using a TensorFlow.js library can speed this up many times over by taking advantage of the GPU.
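The iterate-sort-tally approach just described can be sketched in a few lines of plain JavaScript. This is only an illustration of the idea, not the Snap! library's implementation; the function names (euclideanDistance, knnClassify) are invented for this example.

```javascript
// Euclidean distance between two equal-length lists of numbers
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Classify a new example by letting the k nearest labelled examples vote
function knnClassify(examples, labels, newExample, k) {
  // Measure the distance from the new example to every stored example
  const scored = examples.map((example, i) => ({
    label: labels[i],
    distance: euclideanDistance(example, newExample),
  }));
  // Sort by distance and tally the labels of the top k nearest
  scored.sort((x, y) => x.distance - y.distance);
  const votes = {};
  for (const {label} of scored.slice(0, k)) {
    votes[label] = (votes[label] || 0) + 1;
  }
  return votes;
}

// Example: three labelled points in two dimensions, top k = 3
const votes = knnClassify(
  [[0, 0], [1, 1], [10, 10]],
  ['silly', 'silly', 'serious'],
  [0.5, 0.5],
  3);
console.log(votes); // {silly: 2, serious: 1}
```

A GPU-backed library such as TensorFlow.js computes all of these distances in parallel, which is why it is many times faster when there are thousands of examples.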

K Nearest Neighbors blocks

In the following, a KNN model is created containing 5 silly sentences and 5 serious ones. It is then queried with a new sentence. The Add example(s) block can be given a single example (i.e. a list of numbers) or a list of examples. Here the numbers are being generated by the features of sentences block. The numbers can come from any data so long as the number of numbers is the same in each list. The Classify block takes a new example and reports the votes by the "top k" examples of the model. There are several other KNN blocks, but one can do most things with just the Add example(s) and Classify blocks.
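To make the semantics of those two blocks concrete, here is a rough JavaScript sketch of a model offering analogous operations. The class and method names (KNNModel, addExamples, classify) are invented for illustration; they are not the library's actual API, and the real blocks run on the GPU via TensorFlow.js.

```javascript
// Hypothetical model with operations analogous to the
// "Add example(s)" and "Classify" blocks
class KNNModel {
  constructor() {
    this.examples = []; // each entry: {features: [numbers...], label}
  }

  // Accepts a single example (a list of numbers) or a list of examples,
  // mirroring the flexibility of the Add example(s) block
  addExamples(examplesOrExample, label) {
    const examples = Array.isArray(examplesOrExample[0])
      ? examplesOrExample
      : [examplesOrExample];
    for (const features of examples) {
      this.examples.push({features, label});
    }
  }

  // Reports the votes of the top k nearest stored examples
  classify(newExample, topK) {
    const scored = this.examples.map(({features, label}) => ({
      label,
      distance: Math.hypot(...features.map((x, i) => x - newExample[i])),
    }));
    scored.sort((a, b) => a.distance - b.distance);
    const votes = {};
    for (const {label} of scored.slice(0, topK)) {
      votes[label] = (votes[label] || 0) + 1;
    }
    return votes;
  }
}

const model = new KNNModel();
model.addExamples([[0, 0], [1, 0]], 'silly');  // a list of examples
model.addExamples([9, 9], 'serious');          // a single example
console.log(model.classify([0.2, 0.1], 3)); // {silly: 2, serious: 1}
```

Note that in a real application the feature lists would come from something like the features of sentences block rather than being typed in by hand.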

Other KNN blocks can be found in this project. The get dataset of KNN model block reports the state of a KNN model that can be saved or exported. The Set dataset of KNN model block can then be used to recreate the model from the saved state. This can be much faster than recreating the model using the Add example(s) block.
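In plain JavaScript terms, saving and restoring a model's dataset amounts to serializing the stored examples and loading them back in one step. The sketch below is only an illustration of that idea, assuming a model is just its stored examples; the getDataset and setDataset names are stand-ins for the blocks, not the library's actual functions.

```javascript
// Illustrative sketch of saving and restoring a KNN model's dataset

function getDataset(model) {
  // Serialize the stored examples so they can be saved or exported
  return JSON.stringify(model.examples);
}

function setDataset(model, savedDataset) {
  // Recreate the stored examples in one step, without re-adding
  // each example individually
  model.examples = JSON.parse(savedDataset);
}

const model = {examples: [{features: [1, 2, 3], label: 'silly'}]};
const saved = getDataset(model);   // a string that can be written to a file
const restored = {examples: []};
setDataset(restored, saved);
console.log(restored.examples[0].label); // "silly"
```

Restoring from a saved dataset is faster because the examples are loaded in bulk instead of being processed one at a time by repeated calls to Add example(s).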

Where to get these blocks to use in your projects

You can import the blocks presented here as a project or download them as a library to import into your projects.

Return to the previous chapter on neural networks.