The document discusses approaches to building real-time applications on Hadoop that are worth considering before adopting Impala. It recommends HBase for storing and querying data in real time, SolrCloud for secondary indexing, and streaming tools such as Storm on YARN for continuous data processing. It gives examples of real-time queries over log data and malware information. It emphasizes clarifying use cases, computing data in batches efficiently, and shrinking the interval between batches to approach real-time behavior. The document advises that Impala is not always needed and that the same problems can still occur with it, so the three-arrow approach of HBase, SolrCloud, and streaming often delivers good real-time functionality without over-engineering the solution.
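
The idea of shrinking the gap between batches can be illustrated with a minimal Python sketch. It is not taken from the document: the `IncrementalBatcher` class, the `host` field, and the in-memory `Counter` (standing in for a store such as HBase) are all hypothetical names chosen for illustration. Each run aggregates only the events that arrived since the previous run, so a batch stays cheap enough to be scheduled frequently.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IncrementalBatcher:
    """Hypothetical helper: aggregates only events not yet processed."""

    # Running per-host counts; an in-memory stand-in for a real store.
    totals: Counter = field(default_factory=Counter)
    # Index of the first unprocessed event in the log.
    last_offset: int = 0

    def run_batch(self, log):
        """Process events appended since the last run; return how many."""
        new_events = log[self.last_offset:]
        self.totals.update(event["host"] for event in new_events)
        self.last_offset = len(log)
        return len(new_events)


# Illustrative log data (assumed shape: one dict per log line).
log = [{"host": "web1"}, {"host": "web2"}]
batcher = IncrementalBatcher()

first = batcher.run_batch(log)    # handles the two initial events
log.append({"host": "web1"})
second = batcher.run_batch(log)   # handles only the newly appended event
```

Because each run touches only new events, its cost scales with the arrival rate rather than with the total data size, which is what allows the batch interval to be reduced.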