Azure Synapse Analytics for SQL Server Users - Introduction to Spark
Daiyu Hatakeyama
Session material for the Japan SQL Server Users Group - 35th SQL Server 2019 study session - Azure Synapse Analytics - Introduction to SQL Pool.
It covers where Spark fits in, introductory usage within Synapse, and the value that only Synapse offers.
23. This course:
‣ This is a course about distributed data parallelism in Spark.
Not a machine learning or data science course!
‣ Extending familiar functional abstractions like functional lists over large clusters.
‣ Context: analyzing large data sets.
But...
This is not a machine learning or data science course, you know.
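To make the idea of "extending functional lists over large clusters" concrete, here is a minimal sketch in plain Python rather than Spark. It assumes a toy setup: a local list is split into chunks that stand in for partitions, and a thread pool stands in for the cluster. All function names are hypothetical and are not part of any Spark API.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(data, num_partitions):
    """Split a list into contiguous chunks (stand-ins for cluster partitions)."""
    size = max(1, -(-len(data) // num_partitions))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_even_squares(chunk):
    """Ordinary functional list pipeline: filter, then map, then reduce."""
    evens = filter(lambda x: x % 2 == 0, chunk)
    squares = map(lambda x: x * x, evens)
    return reduce(lambda a, b: a + b, squares, 0)


def distributed_sum_of_even_squares(data, num_partitions=4):
    """Run the same pipeline on each partition, then combine the partial results."""
    partitions = split_into_partitions(data, num_partitions)
    # Each partition is processed independently (threads stand in for worker nodes).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(sum_of_even_squares, partitions))
    # Combining partial sums works because addition is associative.
    return reduce(lambda a, b: a + b, partials, 0)
```

The pipeline applied to each partition is the same one you would write for a single list; only the splitting and the final combine step are new. That is roughly the mental model behind data-parallel collections in Spark, although Spark itself also handles scheduling, shuffling, and fault tolerance, which this sketch ignores.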
27. Rich deep learning support. Modeled after Torch, BigDL provides
comprehensive support for deep learning, including numeric computing (via
Tensor) and high-level neural networks; in addition, you can load pretrained
Caffe* or Torch models into the Spark framework, and then use the BigDL
library to run inference applications on their data.
Efficient scale out. BigDL can efficiently scale out to perform data analytics at
“big data scale” by using Spark as well as efficient implementations of
synchronous stochastic gradient descent (SGD) and all-reduce communications
in Spark.
Extremely high performance. To achieve high performance, BigDL uses Intel®
Math Kernel Library (Intel® MKL) and multithreaded programming in each
Spark task. Consequently, it is orders of magnitude faster than out-of-the-box
open source Caffe, Torch, or TensorFlow on a single-node Intel® Xeon®
processor (i.e., comparable with mainstream graphics processing units).
What is BigDL?
https://software.intel.com/en-us/articles/bigdl-distributed-deep-learning-on-apache-spark
A deep learning library that runs on Spark
Fast processing by making full use of Intel CPUs!
* As of v0.2.0, the documentation has improved considerably!
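The slide mentions synchronous SGD with all-reduce communication. The following is a rough sketch of that general pattern in plain Python, not BigDL's implementation or API. It assumes a one-parameter model y = w * x, uses each worker's whole data shard as its batch for simplicity, and simulates the all-reduce with a simple average; every name here is hypothetical.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Toy all-reduce: every worker ends up with the same averaged value."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def synchronous_sgd(shards, learning_rate=0.01, steps=200):
    """Each worker holds a model replica and one data shard."""
    weights = [0.0] * len(shards)  # one replica per worker, same starting point
    for _ in range(steps):
        # 1. Every worker computes a gradient on its own shard.
        grads = [local_gradient(w, shard) for w, shard in zip(weights, shards)]
        # 2. Gradients are averaged across workers (the all-reduce step).
        synced = all_reduce_mean(grads)
        # 3. Every worker applies the same update, so replicas stay identical.
        weights = [w - learning_rate * g for w, g in zip(weights, synced)]
    return weights
```

For example, calling `synchronous_sgd([[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]])` trains two replicas on data drawn from y = 3x. Because every step waits for all gradients before updating, the replicas never diverge, which is the "synchronous" part of the technique.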