This document discusses hierarchical clustering, which produces a set of nested clusters organized as a tree and can be visualized with a dendrogram. There are two main types: agglomerative, which starts with each point as its own cluster and repeatedly merges the closest pair of clusters; and divisive, which starts with all points in one cluster and recursively splits it. Unlike partitional clustering, hierarchical clustering does not require specifying the number of clusters upfront, but it is generally slower, and the dendrogram can be difficult to interpret. The document also provides example applications and summarizes the pros and cons.
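
The agglomerative procedure can be sketched in a few lines of Python. This is a minimal illustration, not the document's own implementation: it assumes single linkage (cluster distance is the distance between the closest members) and Euclidean distance, neither of which the text specifies. The names `single_linkage` and `agglomerative` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(a, b, points):
    """Assumed linkage: distance between the closest members of a and b."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerative(points, n_clusters=1):
    """Naive agglomerative clustering.

    Returns the final clusters (lists of point indices) and the merge
    history, which is the information a dendrogram draws.
    """
    # Start with every point as its own cluster.
    clusters = [[i] for i in range(len(points))]
    merges = []  # each entry: (cluster_a, cluster_b, merge_distance)
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters under the assumed linkage.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], points),
        )
        merges.append((a, b, single_linkage(a, b, points)))
        # Replace the two clusters with their union.
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(sorted(a + b))
    return clusters, merges


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, merges = agglomerative(points, n_clusters=3)
print(sorted(clusters))  # [[0, 1], [2, 3], [4]]
```

Running with `n_clusters=1` records the full merge history, so any number of clusters can be read off afterwards by stopping at a chosen merge distance. That is why no cluster count is needed upfront. The repeated search over all cluster pairs also shows where the extra cost relative to partitional methods comes from.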