Unsupervised Learning

Unsupervised learning finds patterns in data without labeled examples. The model discovers hidden structure, groups similar items, or reduces complexity—all without being told what to look for.

Key Difference: No labels needed! The model explores data to find natural groupings and patterns.

Main Types

Clustering

Group similar data points together.

Examples:
  • Customer segmentation
  • Document categorization
  • Image compression
  • Anomaly detection

Dimensionality Reduction

Reduce the number of features while preserving as much information as possible.

Examples:
  • Data visualization
  • Feature extraction
  • Noise reduction
  • Preprocessing for ML

Clustering Algorithms

K-Means

Partition data into K clusters by minimizing within-cluster variance.

1. Initialize K centroids randomly
2. Assign points to nearest centroid
3. Update centroids to cluster mean
4. Repeat steps 2 and 3 until the assignments stop changing

✓ Fast, simple | ✗ Must specify K in advance; sensitive to outliers and to the initial centroids
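
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation: the function name `kmeans`, its parameters (`n_iters`, `seed`), and the synthetic two-blob data are assumptions made for this example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means sketch: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two synthetic blobs (made-up data, for illustration only).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)
```

Because the starting centroids are random, different seeds can settle on different solutions; a common remedy is to run the algorithm several times and keep the result with the lowest within-cluster variance.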

DBSCAN

Density-based clustering. Groups points that are closely packed.

  • Finds clusters of arbitrary shape
  • Automatically detects outliers

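✓ No need to specify K, handles noise | ✗ Sensitive to its two parameters (neighborhood radius and minimum points per neighborhood); struggles when clusters have very different densities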
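
To make the density idea concrete, here is a minimal DBSCAN sketch in NumPy. The function name `dbscan` and the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors, counting the point itself, for a point to count as "core") are assumptions for this example, and the nine-point dataset is made up. It computes all pairwise distances, so it only suits small datasets.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN sketch: returns labels, with -1 marking noise."""
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Neighborhood of each point (includes the point itself).
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_pts for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from this unlabeled core point.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    # Only core points keep expanding the cluster;
                    # border points are labeled but not expanded.
                    if is_core[q]:
                        frontier.append(q)
        cluster += 1
    return labels

# Two tight groups of four points plus one isolated point.
X = np.array([[0.0, 0.0], [0.0, 0.5], [0.5, 0.0], [0.5, 0.5],
              [5.0, 5.0], [5.0, 5.5], [5.5, 5.0], [5.5, 5.5],
              [10.0, 0.0]])
labels = dbscan(X, eps=1.0, min_pts=3)
print(labels)  # the isolated point is labeled -1 (noise)
```

Note that the number of clusters is never passed in: it falls out of `eps` and `min_pts`, which is also why results are sensitive to those two values.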

Hierarchical Clustering

Build a tree of clusters (a dendrogram).

  • Agglomerative: bottom-up (merge clusters)
  • Divisive: top-down (split clusters)

✓ No need to specify K, interpretable | ✗ Computationally expensive
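
The bottom-up (agglomerative) variant can be sketched as follows. This is a deliberately naive version that assumes single linkage (distance between clusters = distance between their closest members); the function name `agglomerative`, the `n_clusters` stopping rule, and the five-point dataset are assumptions for this example. The recorded `merges` list plays the role of the dendrogram, and `n_clusters` only decides where to cut it.

```python
import numpy as np

def agglomerative(X, n_clusters):
    """Naive single-linkage agglomerative clustering.

    Returns (labels, merges); each merge is recorded as
    (members_of_a, members_of_b, distance).
    """
    clusters = [[i] for i in range(len(X))]  # start: every point alone
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    merges = []
    while len(clusters) > n_clusters:
        best = (np.inf, None, None)
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dists[np.ix_(clusters[a], clusters[b])].min()
                if d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((list(clusters[a]), list(clusters[b]), float(d)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    labels = np.empty(len(X), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels, merges

# Five points on a line: two close pairs and one far-away point.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0], [20.0, 0.0]])
labels, merges = agglomerative(X, n_clusters=2)
print(labels)
for a, b, d in merges:
    print(f"merged {a} + {b} at distance {d}")
```

The nested search over cluster pairs, repeated for every merge, is what makes this approach expensive on large datasets.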

Dimensionality Reduction

PCA (Principal Component Analysis)

Find directions of maximum variance in data.

  • Linear transformation to orthogonal axes
  • Preserves most variance with fewer dimensions

Use for: Visualization, noise reduction, preprocessing
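
A minimal PCA sketch using NumPy's singular value decomposition is shown below. The function name `pca`, its return values, and the synthetic dataset (3-D points that lie close to a 2-D plane) are assumptions for this example.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the centered data matrix.

    Returns (projected data, principal directions, explained-variance ratio).
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthogonal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S**2 / (len(X) - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio

# Made-up data: 200 points in 3-D generated from 2 underlying factors
# plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
X = latent @ W + 0.01 * rng.normal(size=(200, 3))

Z, components, ratio = pca(X, n_components=2)
print(Z.shape)      # (200, 2)
print(ratio.sum())  # fraction of total variance kept by 2 components
```

Here two components retain nearly all of the variance because the data was built from two factors; on real data, the explained-variance ratio is a common guide for choosing how many components to keep.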

t-SNE

Non-linear technique for visualization (2D/3D).

  • Preserves local structure
  • Great for visualizing clusters

Use for: Visualization only (not for preprocessing, since distances between clusters in the embedding are not reliable)

Autoencoders

Neural networks that learn compressed representations.

  • Encoder: compress to latent space
  • Decoder: reconstruct from latent space

Use for: Non-linear reduction, anomaly detection, denoising
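
The encoder/decoder idea can be sketched with a tiny autoencoder written directly in NumPy. Everything here is an assumption made for illustration: the one-layer tanh encoder, the linear decoder, the learning rate, the number of steps, and the synthetic 4-D data. Real autoencoders are normally built with a deep learning framework and more layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 points in 4-D that really vary along only 2 factors.
latent = rng.normal(size=(200, 2))
X = np.tanh(latent @ rng.normal(size=(2, 4)))

n_in, n_latent = X.shape[1], 2
W_enc = rng.normal(scale=0.5, size=(n_in, n_latent))
b_enc = np.zeros(n_latent)
W_dec = rng.normal(scale=0.5, size=(n_latent, n_in))
b_dec = np.zeros(n_in)

def encode(X):
    # Encoder: compress each input to the latent space.
    return np.tanh(X @ W_enc + b_enc)

def decode(Z):
    # Decoder: reconstruct the input from the latent code.
    return Z @ W_dec + b_dec

def mse(X):
    return float(np.mean((decode(encode(X)) - X) ** 2))

initial_loss = mse(X)
lr = 0.1
for _ in range(2000):
    Z = encode(X)
    X_hat = decode(Z)
    # Backpropagate the mean squared reconstruction error by hand.
    grad_out = 2.0 * (X_hat - X) / X.size
    grad_W_dec = Z.T @ grad_out
    grad_b_dec = grad_out.sum(axis=0)
    grad_pre = (grad_out @ W_dec.T) * (1.0 - Z**2)  # tanh derivative
    grad_W_enc = X.T @ grad_pre
    grad_b_enc = grad_pre.sum(axis=0)
    # Plain gradient descent step.
    W_dec -= lr * grad_W_dec
    b_dec -= lr * grad_b_dec
    W_enc -= lr * grad_W_enc
    b_enc -= lr * grad_b_enc
final_loss = mse(X)

print(f"reconstruction error: {initial_loss:.4f} -> {final_loss:.4f}")
# Per-sample reconstruction error; unusually large values can flag anomalies.
errors = np.mean((decode(encode(X)) - X) ** 2, axis=1)
```

The per-sample `errors` array at the end hints at the anomaly-detection use case: points the network reconstructs poorly are ones that do not fit the structure it learned.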

Applications

  • Customer Segmentation: group customers by behavior for targeted marketing
  • Anomaly Detection: find unusual patterns (fraud, defects)
  • Recommendation Systems: find similar items or users
  • Image Compression: reduce the number of colors in an image with K-means
  • Topic Modeling: discover themes in documents
  • Data Visualization: reduce data to 2D/3D for plotting

Key Takeaway: Unsupervised learning discovers hidden patterns without labels. Use clustering to group data and dimensionality reduction to simplify it.