Building DSLs with Scala
Mohit Jaggi
Code Ninja and Big (Data) Troublemaker
Ayasdi
Who am I?
• Software engineer and architect
• Past life: built networking, security and
application delivery appliances using ASICs,
microcode, C, some C++. Operating system
optional.
• This life: distributed systems, (big) data analytics,
machine learning using Java and Scala. And real
servers with a proper operating system!
Why do I love Scala?
• Less red tape
• Multi-paradigm
• Exploits the Java ecosystem
• Static typing and type inference
• Same language for everybody from casual script
writer to application programmer to library
designer
Who will find this useful?
• Beginner to intermediate Scala programmers
• If you write code, you have APIs
• Library designers
Agenda
• Motivation for DSLs
• Scala Constructs useful for DSLs
• Lessons from my limited experience
Motivation for DSLs
• API Design Considerations
• DSL
• Types of DSLs and tradeoffs
Considerations for a good API
• sufficient for current needs
• extensible for anticipated needs
• does not preclude unanticipated needs
• forward/backward compatibility
• easy to use
• little or no documentation required, but
• complete documentation available
DSL
• Domain Specific Language
• helps with the ease-of-use part
• e.g. SQL, R, jOOQ, bigdf
Types of DSLs
• External or standalone
• Internal or embedded
External DSL
• New language from “scratch”
• Write a parser using
something like lex/yacc, antlr
• Write an interpreter or
• Write a code generator
• Manage a run-time
environment (Java? native?
REPL?)
Internal DSL
• “internal” or “embedded”
in a general purpose
programming language
• Designed to “feel like” a
different language, one more
intuitive to the domain
• e.g. jOOQ in Java/Scala
feels like SQL
Tradeoffs
Internal DSL
• Benefit: less work
• Benefit: more completeness
• Cost: less syntax flexibility
• Cost: “global optimizations” are hard
• Cost: error messages are less meaningful
External DSL
• Benefit: more freedom in syntax
• Benefit: more “global optimizations” possible
• Benefit: better error messages are possible
• Cost: more work
• Cost: typically slower without good code generation
Why Scala for DSLs?
In Scala:
select(BOOK.ID) from BOOK where BOOK.TITLE === "Scala scala"
In Java:
select(BOOK.ID).from(BOOK).where(BOOK.TITLE.equal("Java java"))
Scala Constructs Useful for DSLs
Vanishing Method Call
• apply() and update() can be called without
naming them
df("age") == df.apply("age") == df.column("age")
df("age") = df("age") + 1
// the assignment above is rewritten to:
df.update("age", df("age") + 1)
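A minimal sketch of how apply() and update() might be wired up. `MiniDF` and its map-backed columns are hypothetical stand-ins for illustration, not the bigdf API.

```scala
import scala.collection.mutable

// Hypothetical toy data frame: each column is a named vector of Ints.
class MiniDF {
  private val columns = mutable.Map.empty[String, Vector[Int]]

  // df("age") is rewritten by the compiler to df.apply("age")
  def apply(name: String): Vector[Int] = columns(name)

  // df("age") = v is rewritten by the compiler to df.update("age", v)
  def update(name: String, values: Vector[Int]): Unit = columns(name) = values
}

object VanishingCallDemo {
  def run(): Vector[Int] = {
    val df = new MiniDF
    df("age") = Vector(20, 30)        // calls update
    df("age") = df("age").map(_ + 1)  // calls apply, then update
    df("age")
  }
}
```

Note that update() takes the column name as its first argument, which is why the assignment form reads naturally.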
Simulated dynamic typing
• extend the Dynamic trait
• hides compile errors: a misspelled name fails only at run time; is that cost worth paying?
df.age == df("age") == df.selectDynamic("age")
def selectDynamic(colName: String) = {
  val col = self.column(colName)
  if (col == null) logger.error(s"$colName does not match any DF API or column name")
  col
}
def updateDynamic(colName: String)(that: Column): Unit = update(colName, that)
df.age = df.age + 1
df.updateDynamic("age")(df.age + 1)
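A self-contained sketch of the Dynamic trait, assuming a hypothetical `Record` class whose fields live in a map. It is meant only to show the rewriting rules and the run-time cost, not how bigdf implements them.

```scala
import scala.language.dynamics
import scala.collection.mutable

// Hypothetical toy record; field names are checked only at run time.
class Record extends Dynamic {
  private val fields = mutable.Map.empty[String, Int]

  // rec.age is rewritten to rec.selectDynamic("age")
  def selectDynamic(name: String): Int =
    fields.getOrElse(name, throw new NoSuchElementException(s"$name does not match any field"))

  // rec.age = 5 is rewritten to rec.updateDynamic("age")(5)
  def updateDynamic(name: String)(value: Int): Unit = fields(name) = value
}

object DynamicDemo {
  def run(): Int = {
    val rec = new Record
    rec.age = 41
    rec.age = rec.age + 1
    rec.age
  }
}
```

A typo such as `rec.agee` still compiles here and only fails when that line runs, which is the cost the slide asks about.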
Dial 0 for operator
• the dot is optional, so methods look like operators
• almost any symbol can be used as an operator
• for example, to list people who can buy beer you
can use
df.where(df("age").gt(21))
df where (df("age") > 21)
• an operator ending in a colon is invoked on its right
operand; this provides more freedom (~active vs
passive voice)
object orange {
  def eat_:(who: String): Unit =
    println(s"$who is eating orange")
}
"monkey" eat_: orange
• be careful with operators like ==; use === instead
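The operator rules above can be sketched with a tiny query builder. `Col`, `Pred`, `Table`, and the generated SQL-like strings are all hypothetical names invented for this illustration.

```scala
// Hypothetical miniature of a column-expression DSL.
final case class Pred(text: String)

final case class Col(name: String) {
  def gt(n: Int): Pred = Pred(s"$name > $n")
  def >(n: Int): Pred = gt(n)                  // symbolic alias for gt
  def ===(n: Int): Pred = Pred(s"$name = $n")  // == is left alone; === builds a predicate
}

final class Table(name: String) {
  def where(p: Pred): String = s"SELECT * FROM $name WHERE ${p.text}"
  // A method name ending in ':' is invoked on the RIGHT operand.
  def +:(prefix: String): String = s"$prefix $name"
}

object OperatorDemo {
  val people = new Table("people")
  val age = Col("age")

  val dotted: String   = people.where(age.gt(21))
  val infix: String    = people where (age > 21)  // same call, no dots
  val prefixed: String = "table" +: people        // same as people.+:("table")
}
```

Keeping `gt` alongside `>` gives users who dislike symbolic names a plain alternative.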
Free companion object
• hides “new” when creating an object
val df = DF.fromCSVFile(…)
• also a good place to put implicit conversions
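A small sketch of a companion object used as a factory. `Frame` and `fromCSVString` are hypothetical names chosen for this example; the private constructor means callers must go through the companion.

```scala
// Hypothetical toy frame; the constructor is private so only the companion can call `new`.
final class Frame private (val header: Vector[String], val rows: Vector[Vector[String]])

object Frame {
  // Factory method: callers never write `new`.
  def fromCSVString(csv: String): Frame = {
    val lines = csv.trim.split("\n").toVector
      .map(line => line.split(",").toVector.map(_.trim))
    new Frame(lines.head, lines.tail)
  }

  // apply() lets callers write Frame("...") directly.
  def apply(csv: String): Frame = fromCSVString(csv)
}

object CompanionDemo {
  val f: Frame = Frame("name,age\nann,30\nbob,25")
}
```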
Reading between the lines
• implicit conversion to “enrich” types
• can also use “implicit class”
• the newer “value classes” construct (extends AnyVal), combined with an implicit class, can sometimes avoid allocating the wrapper
(2 MB) == Bytes(2 * 1024 * 1024)
class BytesMaker(n: Int) {
  def MB = Bytes(n * 1024L * 1024)
  // Long arithmetic: an Int would overflow from 2 GB upward
  def GB = Bytes(n * 1024L * 1024 * 1024)
}
case class Bytes(n: Long) { def print = println(s"bytes=$n") }
object Bytes {
  import scala.language.implicitConversions
  // callers need `import Bytes._` for this conversion to be in scope
  implicit def intToBytesMaker(n: Int): BytesMaker = new BytesMaker(n)
}
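The same enrichment can be sketched with an implicit value class instead of a separate conversion method. The `ByteUnits` wrapper object is an assumption made so the example is self-contained; `2.MB` is used instead of `2 MB` to avoid needing the postfix-operator language feature.

```scala
object ByteUnits {
  final case class Bytes(n: Long)

  // Implicit value class: adds MB/GB to Int without a separate implicit def.
  // `extends AnyVal` lets the compiler skip allocating the wrapper in many cases.
  implicit class BytesMaker(private val n: Int) extends AnyVal {
    def MB: Bytes = Bytes(n * 1024L * 1024L)
    def GB: Bytes = Bytes(n * 1024L * 1024L * 1024L)
  }
}

object ByteUnitsDemo {
  import ByteUnits._      // brings the implicit class into scope
  val two: Bytes = 2.MB
  val big: Bytes = 4.GB   // would overflow if Bytes held an Int
}
```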
My name is Bond
• Pass-by-name parameters are useful for passing in
“blocks of code”
def doTwice(action: => Unit): Unit = {
  action
  action
}
doTwice { println("hey") }
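To make the evaluation rule concrete, here is a short sketch contrasting by-value and by-name parameters. The helper names `twiceByValue`, `twiceByName`, and `next` are made up for this example.

```scala
object ByNameDemo {
  // By-value: the argument is evaluated once, before the call.
  def twiceByValue(x: Int): Int = x + x

  // By-name: the argument expression is re-evaluated at every use.
  def twiceByName(x: => Int): Int = x + x

  def run(): (Int, Int) = {
    var calls = 0
    def next(): Int = { calls += 1; calls }

    val byValue = twiceByValue(next())  // next() runs once: 1 + 1
    val byName  = twiceByName(next())   // next() runs twice: 2 + 3
    (byValue, byName)
  }
}
```

This re-evaluation is what lets doTwice run its block twice rather than reuse a single result.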
Multiple Parameter Lists
• Help avoid unwanted commas and allow more intuitive
parentheses/braces
def when(predicate: => Boolean)(action: => Unit): Unit = {
  if (predicate) action
}
when(1 == 1) { println("1") }
instead of
when(1 == 1, println("1"))
FWIW - API
• Don't provide unnecessary options; you can always add
more later if needed, but like "salt in a dish", once
added they can't be removed
• Names meaningful to the user, not the coder
• Consistent coding style
• Option[T] if truly optional, else exception
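One way to read the Option[T] guideline, sketched with a hypothetical user store (`User`, `users`, and both lookup methods are invented for illustration):

```scala
object OptionDemo {
  // Hypothetical user record: a nickname may legitimately be absent.
  final case class User(name: String, nickname: Option[String])

  private val users = Map(
    "ann" -> User("Ann", Some("annie")),
    "bob" -> User("Bob", None)
  )

  // Asking for a user that should exist is a caller error: fail loudly.
  def user(id: String): User =
    users.getOrElse(id, throw new NoSuchElementException(s"unknown user: $id"))

  // A nickname is genuinely optional, so the return type says so.
  def nickname(id: String): Option[String] = user(id).nickname
}
```

Here absence of a nickname is ordinary data, while an unknown id is treated as a mistake.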
FWIW - DSL
• Don’t take it to extremes (like “Baysick”)
• Assume users know some (or a lot of) Scala
• Try to generate useful error messages
• Careful with ==, right-associative operators, implicit
search order
• Aim to provide a gentle slope to Scala instead of none;
basic Scala is simple, your API users will appreciate it
Thanks!
We are hiring
http://engineering.ayasdi.com
http://www.ayasdi.com/careers
