Dimensionality Reduction (Generalization)
"Aggregates specific features into higher-level features"
Currently used for:
- Recommendation Systems (★)
- Beautiful Visualizations
- Topic Modeling and Finding Similar Documents
- Fake Image Analysis
- Risk Management
Popular algorithms: Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA, pLSA, GLSA), t-SNE (for visualization)
In the early days, these were tasks for hardcore data scientists who had to find "something interesting" in huge piles of numbers. When Excel charts didn't help, they forced machines to do the pattern-finding. That's how they got dimensionality reduction or feature learning methods.
It's always easier for people to work with abstractions rather than a bunch of scattered features. For example, we can merge all dogs with triangular ears, long noses, and large tails into a nice abstraction - "shepherd". Yes, we lose some information about specific shepherds, but the new abstraction is much more useful for naming and explaining things. As a bonus, such "abstract" models learn faster, overfit less, and use fewer features.
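To make this concrete, here is a minimal sketch of that kind of feature compression using scikit-learn's PCA. The dog features and all their values are invented purely for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dogs: each row is one dog, columns are raw features
# [ear_pointiness, nose_length_cm, tail_length_cm, weight_kg]
# (all numbers are made up for the example).
dogs = np.array([
    [0.9, 22.0, 40.0, 32.0],  # shepherd-ish
    [0.8, 21.0, 38.0, 30.0],  # shepherd-ish
    [0.2,  8.0, 15.0,  6.0],  # lapdog-ish
    [0.3,  9.0, 17.0,  7.0],  # lapdog-ish
])

# Compress four raw features into one higher-level "abstract" feature.
pca = PCA(n_components=1)
abstract = pca.fit_transform(dogs)

print(abstract.ravel())  # shepherd-ish dogs land far from lapdog-ish ones
```

We lose the exact ear and tail measurements, but the single compressed feature already separates the "shepherd-ish" dogs from the "lapdog-ish" ones.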
These algorithms became an amazing tool for topic modeling, which is exactly what Latent Semantic Analysis (LSA) does. It's based on how frequently you see a word in a given topic: there are more technical terms in technical articles, the names of politicians appear more often in political news, and so on.
Yes, we could just build clusters from all the words in the articles, but we would lose all the important connections (for example, the fact that "battery" means the same thing in different documents). LSA handles this correctly, which is why it's called "latent semantic".
So to preserve these latent connections, we need to connect words and documents into a single feature space. It turns out that Singular Value Decomposition (SVD) does this surprisingly well, revealing useful topic clusters of words that occur together.
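As a rough sketch of that recipe (TF-IDF features compressed by a truncated SVD is the classic way to run LSA), assuming scikit-learn is available; the toy documents below are made up:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# A toy corpus invented for illustration: two "tech" and two "politics" docs.
docs = [
    "the battery drains fast on this phone",
    "replace the battery to extend laptop life",
    "the senator gave a speech about the new bill",
    "parliament voted on the controversial bill",
]

# TF-IDF turns documents into word-frequency features;
# TruncatedSVD compresses them into 2 latent "topics" - that's LSA.
lsa = make_pipeline(TfidfVectorizer(), TruncatedSVD(n_components=2))
doc_topics = lsa.fit_transform(docs)

print(doc_topics)  # tech docs load on one component, politics on the other
```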
Recommender systems and collaborative filtering are another incredibly popular application of dimensionality reduction. It turns out that if you use it to abstract user ratings, you get a great system for recommending movies, music, games, and whatever you want.
Fully understanding this machine-made abstraction is hardly possible, but some correlations can be spotted on closer inspection. Some of them correlate with the user's age - children play Contra and watch cartoons more. Others correlate with movie genre or the user's hobbies.
The machine picks up these high-level concepts without ever understanding them, based only on the knowledge of user ratings.
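Here is a minimal sketch of that idea with plain numpy: factor an invented user-by-item ratings matrix with SVD and keep just two latent "taste" factors (all ratings are made up):

```python
import numpy as np

# Made-up ratings matrix: rows are users, columns are movies/games.
ratings = np.array([
    [5.0, 4.0, 1.0, 0.5],
    [4.5, 5.0, 0.5, 1.0],
    [1.0, 0.5, 5.0, 4.5],
    [0.5, 1.0, 4.0, 5.0],
])

# SVD factors the matrix into user and item representations
# living in a small number of latent "taste" dimensions.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2                       # keep two latent factors
users = U[:, :k] * s[:k]    # each user as a point in "taste" space
items = Vt[:k, :].T         # each item as a point in the same space

# Predicted scores are dot products; high values suggest recommendations.
predicted = users @ items.T
print(np.round(predicted, 1))
```

Nobody told the machine what "action game" or "cartoon" means; the latent factors emerge from the ratings alone.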