Unsupervised Learning
Unsupervised learning finds patterns in data without labeled examples. The model discovers hidden structure, groups similar items, or reduces complexity—all without being told what to look for.
Key Difference: No labels needed! The model explores data to find natural groupings and patterns.
Main Types
Clustering
Group similar data points together.
- Customer segmentation
- Document categorization
- Image compression
- Anomaly detection
Dimensionality Reduction
Reduce the number of features while preserving as much information as possible.
- Data visualization
- Feature extraction
- Noise reduction
- Preprocessing for ML
Clustering Algorithms
K-Means
Partition data into K clusters by minimizing within-cluster variance.
1. Initialize K centroids (e.g., at random data points)
2. Assign each point to its nearest centroid
3. Update each centroid to the mean of its cluster
4. Repeat steps 2–3 until assignments stop changing
✓ Fast, simple | ✗ Need to specify K, sensitive to outliers
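The loop above is what scikit-learn's `KMeans` runs under the hood; a minimal sketch on toy 2-D data (the data values here are made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of three points each
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.1, 7.9], [7.8, 8.2]])

# n_clusters is the K we must specify up front
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # cluster index assigned to each point
print(km.cluster_centers_)  # the two learned centroids
```

Note that `n_clusters=2` had to be chosen by us, illustrating the "need to specify K" drawback.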
DBSCAN
Density-based clustering. Groups points that are closely packed.
Points in low-density regions are automatically flagged as outliers (noise)
✓ No need to specify K, handles noise | ✗ Sensitive to parameters
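A minimal sketch of DBSCAN with scikit-learn (toy data chosen so one point sits far from any dense region; the `eps`/`min_samples` values are illustrative, and tuning them is exactly the parameter sensitivity noted above):

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[1.0, 1.0], [1.1, 1.0], [0.9, 1.1],   # dense group A
              [5.0, 5.0], [5.1, 5.0], [4.9, 5.1],   # dense group B
              [20.0, 20.0]])                        # isolated point

# eps = neighborhood radius, min_samples = density threshold
db = DBSCAN(eps=0.5, min_samples=3).fit(X)
print(db.labels_)  # label -1 marks points classified as noise
```

Note that no cluster count was supplied: DBSCAN found two clusters on its own and labeled the isolated point `-1`.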
Hierarchical Clustering
Build tree of clusters (dendrogram).
Agglomerative: bottom-up (merge clusters)
Divisive: top-down (split clusters)
✓ No need to specify K, interpretable | ✗ Computationally expensive
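A minimal sketch of the agglomerative (bottom-up) variant using SciPy, which builds the full dendrogram and then cuts it into a chosen number of flat clusters (toy data made up for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.1, 0.2],   # pair near the origin
              [5.0, 5.0], [5.1, 4.9]])  # pair far away

Z = linkage(X, method="ward")  # the full merge tree (dendrogram data)
# Cut the tree into 2 flat clusters; K is chosen at cut time, not fit time
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Because the whole tree is kept, you can re-cut `Z` at a different level later without refitting, which is part of what makes the method interpretable.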
Dimensionality Reduction
PCA (Principal Component Analysis)
Find directions of maximum variance in data.
Preserves most variance with fewer dimensions
Use for: Visualization, noise reduction, preprocessing
t-SNE
Non-linear technique for visualization (2D/3D).
Great for visualizing clusters
Use for: Visualization only (not for preprocessing)
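A minimal t-SNE sketch with scikit-learn, embedding two well-separated groups of 10-D points into 2-D for plotting (the data and the `perplexity` value are illustrative):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Two groups of 30 points each in 10-D space
X = np.vstack([rng.normal(0.0, 1.0, (30, 10)),
               rng.normal(10.0, 1.0, (30, 10))])

emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
print(emb.shape)  # (60, 2)
```

The 2-D coordinates are meant for scatter plots only: t-SNE distorts global distances and has no `transform` for new points, which is why it is not used as a preprocessing step.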
Autoencoders
Neural networks that learn compressed representations.
Encoder: compress the input into a latent space
Decoder: reconstruct the input from the latent space
Use for: Non-linear reduction, anomaly detection, denoising
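A rough sketch of the autoencoder idea using scikit-learn's `MLPRegressor` trained to reproduce its own input through a narrow hidden layer (real autoencoders are usually built in PyTorch or TensorFlow; this linear toy setup is only to show the encode-then-decode structure):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# 4-D data that actually lies on a 2-D subspace
Z = rng.normal(size=(300, 2))
X = Z @ rng.normal(size=(2, 4))

# hidden_layer_sizes=(2,) is the 2-D latent bottleneck:
# input -> encoder -> 2-D latent -> decoder -> reconstruction
ae = MLPRegressor(hidden_layer_sizes=(2,), activation="identity",
                  max_iter=3000, random_state=0)
ae.fit(X, X)  # target == input: learn to reconstruct
recon = ae.predict(X)
print(np.mean((X - recon) ** 2))  # reconstruction error
```

Because the network must squeeze 4-D inputs through 2 hidden units, it is forced to learn the compressed representation; a large reconstruction error on a new point is the signal used for anomaly detection.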
Key Takeaway: Unsupervised learning discovers hidden patterns without labels. Use clustering to group data and dimensionality reduction to simplify it.