The document discusses big data and gives examples of how it can be collected and analyzed. One example is a master's thesis that collected 74,000 Dutch news articles over two months to analyze rare content. Another is a bachelor's thesis that automated the coding of tweets to determine the tone politicians used when referring to their opponents. The document also outlines the typical process of collecting, storing, and analyzing big data, and describes the infrastructure used in the workshop to collect tweets, news articles, and web snapshots.