The paper applies the Expectation-Maximization (EM) algorithm in a semi-supervised setting to classify online documents, aiming to improve both the accuracy and the efficiency of categorization. It highlights the benefit of training on both labeled and unlabeled data, and of generating new classes dynamically when a document does not fit any predefined category. In experiments on car evaluation datasets, the semi-supervised method outperforms the traditional supervised approach in both accuracy and efficiency, with significant improvements.
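To make the idea of combining labeled and unlabeled data concrete, the following is a minimal sketch of semi-supervised EM, not the paper's implementation. It assumes a multinomial naive Bayes base classifier with Laplace smoothing, which the summary above does not specify. The function names (`train_nb`, `posterior`, `semi_supervised_em`) and the toy documents are hypothetical and chosen only for illustration. Dynamic creation of new classes is not shown.

```python
import math
from collections import Counter

def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """M-step: estimate priors and word probabilities from soft class weights."""
    total = sum(sum(w) for w in weights)
    priors, word_probs = [], []
    for c in range(n_classes):
        class_weight = sum(w[c] for w in weights)
        priors.append((class_weight + alpha) / (total + alpha * n_classes))
        counts = Counter()
        for doc, w in zip(docs, weights):
            for word in doc:
                counts[word] += w[c]  # fractional counts for unlabeled docs
        denom = sum(counts.values()) + alpha * len(vocab)
        word_probs.append({v: (counts[v] + alpha) / denom for v in vocab})
    return priors, word_probs

def posterior(doc, priors, word_probs):
    """E-step for one document: P(class | doc) under the current model."""
    logs = [
        math.log(prior) + sum(math.log(probs[w]) for w in doc if w in probs)
        for prior, probs in zip(priors, word_probs)
    ]
    m = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logs]
    z = sum(exps)
    return [e / z for e in exps]

def semi_supervised_em(labeled, labels, unlabeled, n_classes, n_iter=10):
    """Fit on labeled docs, then alternate E- and M-steps over all docs."""
    vocab = sorted({w for d in labeled + unlabeled for w in d})
    # Labeled documents keep fixed one-hot weights throughout.
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    priors, word_probs = train_nb(labeled, hard, n_classes, vocab)
    for _ in range(n_iter):
        soft = [posterior(d, priors, word_probs) for d in unlabeled]
        priors, word_probs = train_nb(
            labeled + unlabeled, hard + soft, n_classes, vocab
        )
    return priors, word_probs

# Toy data (invented): class 0 is car-related, class 1 is sports-related.
labeled = [["engine", "wheel"], ["goal", "match"]]
labels = [0, 1]
unlabeled = [
    ["engine", "brake"], ["wheel", "brake"],
    ["match", "referee"], ["goal", "referee"],
]
priors, word_probs = semi_supervised_em(labeled, labels, unlabeled, n_classes=2)
brake_post = posterior(["brake"], priors, word_probs)
referee_post = posterior(["referee"], priors, word_probs)
```

In this toy setup the words "brake" and "referee" never appear in a labeled document, so a purely supervised model would have no evidence about them. The EM iterations let the unlabeled documents associate them with the class of the words they co-occur with.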