Talk from Kafka Summit San Francisco 2019 (https://kafka-summit.org/sessions/event-driven-model-serving-stream-processing-vs-rpc-kafka-tensorflow/). Video recording will be available for free on the Summit website.

Event-based stream processing is a modern paradigm for continuously processing incoming data feeds, e.g. for IoT sensor analytics, payment and fraud detection, or logistics. Machine Learning / Deep Learning models can be leveraged in different ways to make predictions and improve business processes. Analytic models are either deployed natively in the application or hosted in a remote model server. In the latter case, you combine stream processing with the RPC / Request-Response paradigm instead of doing inference directly within the application.

This talk discusses the pros and cons of both approaches and shows examples of stream processing vs. RPC model serving using Kubernetes, Apache Kafka, Kafka Streams, gRPC and TensorFlow Serving. The trade-offs of using a public cloud service like AWS or GCP for model deployment are also discussed and compared to local hosting for offline predictions directly "at the edge".

Key takeaways
• Machine Learning / Deep Learning models can be used in different ways to make predictions. Scalability and loose coupling are important success factors
• Stream processing vs. RPC / Request-Response for model serving involves many trade-offs; learn about alternatives and best practices for your different scenarios
• Understand the alternatives and trade-offs of model deployment in modern infrastructures like Kubernetes or Cloud Services like AWS or GCP
• See live demos with Java, gRPC, Apache Kafka, KSQL and TensorFlow Serving to understand the trade-offs
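
The difference between the two deployment options described above can be illustrated with a minimal plain-Java sketch. It is not taken from the talk and deliberately uses no Kafka Streams, gRPC or TensorFlow Serving APIs. `FraudModel`, `ModelClient`, `ModelServingSketch` and the 1000.0 threshold are hypothetical stand-ins: the "model" is a fixed rule, and the "remote model server" is simulated with a `CompletableFuture` where a real setup would make a network call.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical stand-in for a trained model: a fixed rule, not a real network.
final class FraudModel {
    double predict(double amount) {
        return amount > 1000.0 ? 0.9 : 0.1; // made-up threshold and scores
    }
}

// Hypothetical client interface for a remote model server (request/response).
interface ModelClient {
    CompletableFuture<Double> predict(double amount);
}

final class ModelServingSketch {

    // Option 1: the model is embedded in the stream processing application.
    // Inference is a local method call per event, with no network hop.
    static List<Double> scoreEmbedded(List<Double> events, FraudModel model) {
        return events.stream()
                .map(model::predict)
                .collect(Collectors.toList());
    }

    // Option 2: the model lives in a separate model server.
    // Each event triggers a request and the application waits for the response.
    static List<Double> scoreViaRpc(List<Double> events, ModelClient client) {
        return events.stream()
                .map(amount -> client.predict(amount).join())
                .collect(Collectors.toList());
    }

    // Simulated remote server: runs the same model asynchronously on another
    // thread. A real deployment would send the request over the network here.
    static ModelClient fakeRemoteServer(FraudModel model) {
        return amount -> CompletableFuture.supplyAsync(() -> model.predict(amount));
    }
}
```

Both methods return the same scores for the same events. What differs is where inference runs: `scoreEmbedded` couples the model to the application but avoids the per-event round trip, while `scoreViaRpc` keeps the model separately deployable at the cost of a request per event and a dependency on the server being reachable.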