July 2018 talk to the SW Data Meetup by Rob Vesse, Software Engineer at Cray Inc., discussing open source technologies for data science on high-performance systems (Spark, Hadoop, the PyData ecosystem, containers, etc.) and focusing on some of the implementation and scaling challenges they face.