Skip to content
Pusat Penelitian, Pengabdian kepada Masyarakat dan Publikasi Internasional
twitter
youtube
instagram
Pusat Penelitian, Pengabdian kepada Masyarakat dan Publikasi Internasional
Call Support 0822-7473-7806
Email Support [email protected]
Location Jl. Kolam No. 1 Medan Estate
  • Beranda
  • Tentang
    • Profil
    • Visi dan Misi
    • Struktur Organisasi
    • Pimpinan Pusat
    • Program Kerja
    • Sasaran, Program Strategis dan IK
  • Berita Kegiatan
  • Layanan & Informasi
    • Aplikasi
      • UMA
        • Penjaminan Mutu
        • Himpunan Aplikasi Online
        • Jurnal Ilmiah Online
        • Repositori UMA
        • Open Access Public Catalog
      • Unit
        • Aplikasi Penelitian & Pengabdian (LIPAN)
        • SWAMP-D
        • SUSITAO
        • SINTA Verifikator
        • BIMA Kemdiktisaintek
    • Arsip Digital
    • Helpdesk
    • Pendanaan
      • Penelitian
        • Penelitian Pendanaan Nasional
        • Penelitian Kerjasama Internasional
      • Pengabdian Kepada Masyarakat
        • PKM Pendanaan Nasional
    • Publikasi
      • Internasional Bereputasi
    • Reviewer Penelitian dan PKM
  • Kerjasama
  • Jadwal Kegiatan

An Introduction to K-Nearest Neighbors (K-NN)

Posted on October 16, 2024October 30, 2024 by admin
0

K-Nearest Neighbors (K-NN) is one of the most popular and easy-to-implement algorithms in machine learning, particularly in supervised learning. Despite its simplicity, K-NN is known for its effectiveness in solving both classification and regression problems. It is a non-parametric algorithm, meaning it doesn’t make any assumptions about the underlying data distribution, which makes it versatile and widely applicable.

How K-NN Works

The fundamental idea behind K-NN is that similar data points are likely to belong to the same class. To classify a new data point, K-NN looks at the ‘k’ nearest data points (neighbors) in the feature space and assigns the majority label among these neighbors to the new data point.

Here’s how it works step-by-step:

1. Choose a value for K: The user specifies the number of nearest neighbors, ‘k’, to be considered.
2. Calculate the distance: For the given new data point, the algorithm calculates the distance between this point and all the points in the training dataset. Common distance metrics include Euclidean distance, Manhattan distance, and Minkowski distance.
3. Find K Nearest Neighbors: Based on the distance calculations, the ‘k’ nearest points are identified.
4. Vote for the majority: In the case of classification, the new data point is assigned the class most common among its k nearest neighbors. In regression, the average value of the k nearest neighbors is calculated.

Key Features of K-NN

– Lazy Learning: K-NN is a ‘lazy learner’, meaning it doesn’t explicitly learn a model during the training phase. Instead, it memorizes the training dataset and only performs computation during the prediction phase.

– Instance-based Learning: The algorithm directly uses the training instances to make predictions, without abstracting the data into a model. This makes K-NN an instance-based learning algorithm.

Distance Metrics in K-NN

The performance of K-NN largely depends on the choice of the distance metric used to measure how far apart the data points are. Some common metrics include:

– Euclidean Distance: The straight-line distance between two points in Euclidean space. It is the most commonly used metric.

\[
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i – y_i)^2}
\]

– Manhattan Distance: Also known as L1 distance, it measures the distance between two points by summing the absolute differences of their coordinates.

\[
d(x, y) = \sum_{i=1}^{n} |x_i – y_i|
\]

– Minkowski Distance: A generalized distance metric that can represent both Euclidean and Manhattan distances, depending on the parameter ‘p’. For p=2, it becomes Euclidean, and for p=1, it becomes Manhattan.

\[
d(x, y) = \left(\sum_{i=1}^{n} |x_i – y_i|^p \right)^{1/p}
\]

Choosing the Right Value of K

One of the challenges in using K-NN is selecting the optimal value of ‘k’. A small value of ‘k’ (e.g., k=1) makes the algorithm sensitive to noise in the data, while a large value of ‘k’ can smooth out decision boundaries but may also introduce bias. A common strategy is to use cross-validation to select the best ‘k’ for a specific dataset.

Advantages

– Simple and Intuitive: K-NN is easy to understand and implement. It requires minimal training time since no model is learned upfront.
– No Assumptions About Data: K-NN is non-parametric, meaning it doesn’t assume any specific form of the underlying data distribution.
– Versatile: It can be used for both classification and regression tasks.

Disadvantages

– Computationally Expensive: Since K-NN requires calculating the distance to every training point for each prediction, it can become computationally expensive, especially for large datasets.

– Sensitive to the Scale of Data: Features with larger scales can dominate the distance calculation, leading to biased predictions. Feature scaling (e.g., normalization or standardization) is often required.

– Memory Intensive: it requires storing the entire training dataset, which can lead to high memory usage.

Applications of K-NN

– Image Recognition: it can be used for image classification by considering pixel values as features and identifying similar images.

– Recommendation Systems: In recommender systems, it can be used to find users with similar preferences to make product recommendations.

– Medical Diagnosis: it is often used in healthcare to classify patients based on symptoms and historical data, providing insights for diagnosis.

Conclusion

K-Nearest Neighbors is a powerful yet simple machine learning algorithm that performs well in various tasks, especially classification and regression. Its effectiveness depends on factors like the choice of distance metric, the value of ‘k’, and the quality of the training data. While it can be computationally expensive and sensitive to feature scaling, its non-parametric nature and ease of understanding make it a great starting point for many machine learning tasks.

Berita Terbaru
UMA Kukuhkan Posisi sebagai Kampus Swasta Terbaik di Sumut Versi SJR
Universitas Medan Area kembali mencatatkan pencapaian membanggakan di tingkat nasional dengan meraih predikat sebagai perguruan tinggi swasta terbaik di Sumatera...
UMA Terima Kunjungan STIE Graha Kirana: Perkuat Kolaborasi Tridharma dan Pengelolaan HKI
Medan, 24 April 2026 — Universitas Medan Area (UMA) menerima kunjungan akademik dari Sekolah Tinggi Ilmu Ekonomi (STIE) Graha Kirana...
KAMPUS I
Jalan Kolam Nomor 1 Medan Estate / Jalan Gedung PBSI, Medan 20223
(061) 7360168 CALL CENTER : 0811-6013-888
[email protected]
KAMPUS II
Jalan Sei Serayu No. 70 A / Jalan Setia Budi No. 79 B, Medan 20112
(061) 42402994
[email protected]

Statistik Pengunjung

  • 0
  • 36
  • 30
  • 21,764
  • 23,723
@Copyright 2026 BPDI | Universitas Medan Area

This will close in 10 seconds