Unsupervised Learning

Unsupervised learning finds patterns in data without labeled examples. The model discovers hidden structure, groups similar items, or reduces complexity, all without being told what to look for.

Key Difference: Unlike supervised learning, no labels are needed. The model explores the data to find natural groupings and patterns.

Main Types

Clustering

Group similar data points together.

Examples:

Customer segmentation
Document categorization
Image compression
Anomaly detection

Dimensionality Reduction

Reduce the number of features while preserving as much information as possible.

Examples:

Data visualization
Feature extraction
Noise reduction
Preprocessing for ML

Clustering Algorithms

K-Means

Partition data into K clusters by minimizing within-cluster variance.

1. Initialize K centroids randomly
2. Assign each point to its nearest centroid
3. Update each centroid to the mean of its assigned points
4. Repeat steps 2 and 3 until the centroids stop moving

✓ Fast, simple | ✗ Need to specify K, sensitive to outliers
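
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name `kmeans`, the `n_iters` and `seed` parameters, and the toy data are all assumptions made for this example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Cluster the rows of X into k groups; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize centroids by picking k distinct data points at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points;
        # a centroid with no points keeps its previous position.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy data (made up for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(50, 2)),
])
centroids, labels = kmeans(X, k=2)
print(centroids.round(2))
```

On this toy data the two centroids should land close to (0, 0) and (5, 5). With less cleanly separated data the result can depend on the random initialization, which is why the sketch exposes a `seed`.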

DBSCAN

Density-based clustering. Groups points that are closely packed.

Finds clusters of arbitrary shape
Automatically detects outliers

✓ No need to specify K, handles noise | ✗ Sensitive to its parameters (neighborhood radius and minimum points per neighborhood)
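
A small from-scratch sketch can make the density idea concrete. The names `dbscan`, `eps` (neighborhood radius), and `min_pts` (minimum neighbors for a point to count as "dense") are chosen for this example, and the grid data is made up; library implementations are more efficient than this all-pairs version.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Label each row of X with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Neighborhood of each point: every point within eps, including itself.
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    # A core point has at least min_pts points in its neighborhood.
    is_core = np.array([len(nb) >= min_pts for nb in neighbors])
    labels = np.full(n, -1)
    cluster_id = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from the unvisited core point i.
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster_id
                    # Only core points keep expanding the cluster;
                    # border points join it but do not spread it.
                    if is_core[q]:
                        frontier.append(q)
        cluster_id += 1
    return labels

# Toy data (made up): two 3x3 grids of points plus one isolated point.
grid = np.array([[x, y] for x in (0.0, 0.5, 1.0) for y in (0.0, 0.5, 1.0)])
X = np.vstack([grid, grid + 5.0, [[10.0, 0.0]]])
labels = dbscan(X, eps=0.6, min_pts=3)
print(labels)  # two clusters (0 and 1) and one noise point (-1)
```

Note that the number of clusters is never passed in. It falls out of `eps` and `min_pts`, which is also why results are sensitive to those two values.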

Hierarchical Clustering

Build tree of clusters (dendrogram).

Agglomerative: bottom-up (merge clusters)
Divisive: top-down (split clusters)

✓ No need to specify K up front, interpretable | ✗ Computationally expensive on large datasets
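
The bottom-up (agglomerative) variant can be sketched as follows. This example assumes single linkage, meaning the distance between two clusters is the smallest distance between any of their points; that choice, the function name `agglomerative`, and the toy data are assumptions for illustration. The recorded merge history is the information a dendrogram would draw.

```python
import numpy as np

def agglomerative(X, n_clusters):
    """Bottom-up clustering with single linkage; returns (labels, merges)."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = [[i] for i in range(len(X))]  # start: each point is its own cluster
    merges = []                              # (members_a, members_b, distance)
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters under single linkage.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dists[np.ix_(clusters[a], clusters[b])].min()
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], float(d)))
        # Replace the two clusters with their union.
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    labels = np.empty(len(X), dtype=int)
    for cid, members in enumerate(clusters):
        labels[members] = cid
    return labels, merges

# Toy data (made up): six points on a line forming two loose groups.
X = np.array([[0.0], [0.1], [0.4], [5.0], [5.2], [9.0]])
labels, merges = agglomerative(X, n_clusters=2)
for members_a, members_b, dist in merges:
    print(members_a, "+", members_b, "at distance", round(dist, 2))
```

Stopping at a different `n_clusters` corresponds to cutting the dendrogram at a different height. The nested loops make the cost of this naive version easy to see.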

Dimensionality Reduction Techniques

PCA (Principal Component Analysis)

Find directions of maximum variance in data.

Linear transformation to orthogonal axes
Preserves most variance with fewer dimensions

Use for: Visualization, noise reduction, preprocessing
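
As a rough sketch, PCA can be computed from the singular value decomposition of the centered data. The helper name `pca` and the synthetic data are assumptions for this example.

```python
import numpy as np

def pca(X, n_components):
    """Return (projected data, principal axes, explained-variance ratios)."""
    # Center the data so variance is measured around the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S ** 2 / (len(X) - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    # Project the centered data onto the chosen axes.
    return Xc @ components.T, components, ratio

# Toy data (made up): 3-D points that lie almost on a single line.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(200, 3))

Z, components, ratio = pca(X, n_components=2)
print(Z.shape)         # (200, 2)
print(ratio.round(4))  # share of total variance captured by each component
```

Because the points lie almost on a line, the first component should capture nearly all of the variance, so one dimension would be enough to describe this dataset.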

t-SNE

Non-linear technique for visualization (2D/3D).

Preserves local structure
Great for visualizing clusters

Use for: Visualization only (not for preprocessing, because distances between far-apart clusters in the embedding are not reliable)
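
A short usage sketch, assuming scikit-learn is installed. The toy data and the specific `perplexity` value are assumptions for illustration, not recommended settings.

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy data (made up): two groups of points in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(40, 10)),
    rng.normal(loc=6.0, scale=1.0, size=(40, 10)),
])

# perplexity roughly sets how many neighbors each point attends to;
# it must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=15, random_state=0)
embedding = tsne.fit_transform(X)
print(embedding.shape)  # (80, 2)
```

The resulting 2-D coordinates are meant for plotting. scikit-learn's `TSNE` offers `fit_transform` but no separate `transform` for unseen points, which is one practical reason it is not used as a preprocessing step.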

Autoencoders

Neural networks that learn compressed representations.

Encoder: compress to latent space
Decoder: reconstruct from latent space

Use for: Non-linear reduction, anomaly detection, denoising
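
To make the encoder/decoder idea concrete, here is a deliberately tiny autoencoder written in plain NumPy: one tanh encoder layer, one linear decoder layer, trained by gradient descent on reconstruction error. The layer sizes, learning rate, step count, and synthetic data are all assumptions for this sketch; real autoencoders are usually built with a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data (made up): 5-D points generated from 2 hidden factors plus noise.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(200, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature

d, k = X.shape[1], 2                      # k = size of the latent space
W_enc = 0.1 * rng.normal(size=(d, k))     # encoder weights
W_dec = 0.1 * rng.normal(size=(k, d))     # decoder weights
lr = 0.05
losses = []
for step in range(2000):
    Z = np.tanh(X @ W_enc)                # encoder: compress to latent space
    X_hat = Z @ W_dec                     # decoder: reconstruct from latent space
    err = X_hat - X
    losses.append(np.mean(err ** 2))      # mean squared reconstruction error
    # Backpropagate the loss through the decoder, then the encoder.
    grad_out = 2 * err / err.size
    grad_dec = Z.T @ grad_out
    grad_pre = (grad_out @ W_dec.T) * (1 - Z ** 2)  # derivative of tanh
    grad_enc = X.T @ grad_pre
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

print("first loss:", losses[0], "last loss:", losses[-1])
```

After training, `Z` holds the 2-D compressed representation of each point. Points that reconstruct poorly (large per-row error) are the ones an autoencoder-based anomaly detector would flag.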

Applications

Customer Segmentation

Group customers by behavior for targeted marketing

Anomaly Detection

Find unusual patterns (fraud, defects)

Recommendation Systems

Find similar items or users

Image Compression

Reduce image size by using K-means to replace similar colors with a small palette of representative colors

Topic Modeling

Discover themes in documents

Data Visualization

Reduce to 2D/3D for plotting

Key Takeaway: Unsupervised learning discovers hidden patterns without labels. Use clustering to group data and dimensionality reduction to simplify it.