This document gives an overview of the conceptual foundations and preprocessing steps of text mining. It discusses the difference between syntax and semantics in text and presents a general framework for text analytics comprising preprocessing, representation, and knowledge discovery. For text representation, it describes bag-of-words and vector space models, including frequency vectors, one-hot encoding, and TF-IDF weighting. It also introduces n-grams, which retain local word order that bag-of-words models discard.
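
The representations named above can be illustrated with a minimal Python sketch. It is not taken from the document itself. The two-sentence corpus, the regex tokenizer, and all function names are hypothetical, and the TF-IDF weight uses the plain variant tf × log(N / df), which is only one of several common formulations (libraries often apply smoothing or normalization).

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Assumed preprocessing: lowercase, keep alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())

def build_vocabulary(tokenized_docs):
    # Sorted list of distinct terms; its order fixes the vector dimensions.
    return sorted({tok for doc in tokenized_docs for tok in doc})

def frequency_vector(tokens, vocab):
    # Bag-of-words: raw count of each vocabulary term, word order ignored.
    counts = Counter(tokens)
    return [counts[term] for term in vocab]

def one_hot(term, vocab):
    # One-hot encoding of a single term: 1 at the term's index, 0 elsewhere.
    vec = [0] * len(vocab)
    vec[vocab.index(term)] = 1
    return vec

def tfidf_vectors(tokenized_docs, vocab):
    # Assumed variant: weight = tf * log(N / df), with no smoothing.
    n_docs = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        vectors.append([counts[term] * idf[term] for term in vocab])
    return vectors

def ngrams(tokens, n):
    # Contiguous windows of n tokens, preserving local word order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical toy corpus.
docs = ["The cat sat on the mat", "The dog sat"]
tokenized = [tokenize(d) for d in docs]
vocab = build_vocabulary(tokenized)        # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
counts0 = frequency_vector(tokenized[0], vocab)   # [1, 0, 1, 1, 1, 2]
dog = one_hot("dog", vocab)                # [0, 1, 0, 0, 0, 0]
weights = tfidf_vectors(tokenized, vocab)
bigrams = ngrams(tokenized[1], 2)          # [('the', 'dog'), ('dog', 'sat')]
```

In this toy corpus, "the" and "sat" occur in both texts, so their TF-IDF weight is zero under the unsmoothed variant, while "cat" keeps a weight of log 2 in the first text. The bigrams show what the bag-of-words vectors lose: the counts alone cannot tell "the dog sat" from "sat the dog".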