Lecture 6: Clustering
Parts of the content of this class are copied from online materials, in particular:
1. Introduction to Computational Thinking and Data Science, by Prof. Eric Grimson, Prof. John Guttag and Dr. Ana Bell, MIT.
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/
2. Unsupervised Learning Clustering, by Shimon Ullman, Tomaso Poggio, Danny Harari, Daniel Zysman, Darren Seibert, MIT.
http://www.mit.edu/~9.54/fall14/slides/Class13.pdf
Machine learning paradigm
• Observe set of examples: training data
• Infer something about process that generated that data
• Use inference to make predictions about previously unseen data: test data
• Supervised: given a set of feature/label pairs, find a rule that predicts the label associated with a previously unseen input
• Unsupervised: given a set of feature vectors (without labels), group them into "natural clusters".
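The contrast between the two settings can be sketched on toy data. Everything below is an assumption made for illustration: the one-dimensional features, the labels "short" and "tall", the helper names, the nearest-neighbor rule used for the supervised case, and the gap threshold used for the unsupervised case. None of it comes from the slides.

```python
# Toy illustration only: data, function names, and both rules are assumptions.

def predict_label(training_pairs, x):
    """Supervised: predict the label of x from (feature, label) pairs
    by copying the label of the closest training feature."""
    _, nearest_label = min(training_pairs, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def group_by_gaps(features, max_gap):
    """Unsupervised: split sorted 1-D features into groups wherever two
    neighbors are more than max_gap apart. No labels are used."""
    ordered = sorted(features)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])      # gap too large: start a new group
        else:
            clusters[-1].append(value)    # close enough: extend current group
    return clusters


# Supervised: features come with labels; we predict a label for unseen input.
training_pairs = [(1.0, "short"), (1.2, "short"), (5.0, "tall"), (5.3, "tall")]
supervised_prediction = predict_label(training_pairs, 4.6)

# Unsupervised: the same features without labels; we only get groupings.
unsupervised_clusters = group_by_gaps([5.3, 1.0, 5.0, 1.2], max_gap=1.0)

print(supervised_prediction)
print(unsupervised_clusters)
```

Note that the unsupervised result is a set of groups with no names attached; deciding what each group means is left to whoever inspects it.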
What is Clustering?
What do we need for Clustering?
Distance Measures
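As one possible illustration of a distance measure, the sketch below implements the Minkowski distance between two feature vectors. The choice of this particular family is an assumption, not something stated on the slide; the function name and the sample vectors are hypothetical. With p = 1 it gives the Manhattan distance and with p = 2 the Euclidean distance.

```python
def minkowski_distance(v1, v2, p):
    """Minkowski distance between two equal-length feature vectors.

    p = 1 gives Manhattan distance, p = 2 gives Euclidean distance.
    """
    if len(v1) != len(v2):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1 / p)


# Hypothetical feature vectors, chosen so the arithmetic is easy to check.
origin = [0, 0]
point = [3, 4]

manhattan = minkowski_distance(origin, point, 1)  # |3| + |4|
euclidean = minkowski_distance(origin, point, 2)  # sqrt(3^2 + 4^2)

print(manhattan, euclidean)
```

Which measure is appropriate depends on the features: if one feature has a much larger numeric range than the others, it tends to dominate the distance unless the features are rescaled first.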
Clustering is an Optimization Problem
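One common way to phrase clustering as an optimization problem, assumed here rather than taken from the slide, is to score a clustering by the total squared Euclidean distance from each point to the mean of its cluster and then look for the clustering with the lowest score. The function names and sample points below are hypothetical.

```python
def centroid(cluster):
    """Component-wise mean of the points in a cluster."""
    n = len(cluster)
    dims = len(cluster[0])
    return [sum(point[d] for point in cluster) / n for d in range(dims)]


def variability(cluster):
    """Sum of squared Euclidean distances from each point to the centroid."""
    center = centroid(cluster)
    return sum(
        sum((x - m) ** 2 for x, m in zip(point, center))
        for point in cluster
    )


def dissimilarity(clusters):
    """Objective to minimize: total variability over all clusters."""
    return sum(variability(cluster) for cluster in clusters)


# Four hypothetical points forming two well-separated pairs.
p0, p1, p2, p3 = [0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]

tight_clustering = [[p0, p1], [p2, p3]]   # groups nearby points together
loose_clustering = [[p0, p2], [p1, p3]]   # groups distant points together

print(dissimilarity(tight_clustering))
print(dissimilarity(loose_clustering))
```

Minimizing this score alone is not enough: placing every point in its own cluster drives it to zero. The objective therefore needs a constraint, for example a fixed number of clusters.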