Table of Contents: Language Core
Introduction 1.1
Language core
1 - Getting started 2.1
2 - Data types 2.2
3 - Control flow 2.3
4 - Functions 2.4
5 - Custom structures 2.5
6 - Input - Output 2.6
7 - Managing run-time errors (exceptions) 2.7
8 - Interfacing Julia with other languages 2.8
9 - Metaprogramming 2.9
10 - Performances (parallelisation, debugging, profiling..) 2.10
11 - Developing Julia packages 2.11
Useful packages
Plotting 3.1
DataFrames 3.2
JuMP 3.3
SymPy 3.4
Weave 3.5
LAJuliaUtils 3.6
IndexedTables 3.7
Pipe 3.8
Introduction
Update November 2019: This tutorial (largely updated, expanded and revised) has
evolved into a book thanks to Apress :-)
This tutorial itself is still updated and may include new stuff that will be the base of further
editions of the book.
Julia 1.2/1.3: From 24 November 2019. More than Julia itself (quite stable now), this
version accounts for major API changes of the various packages, DataFrames, JuMP,
PyCall..
Julia 1.0: From 5 September 2018
Julia 0.6: 19 July 2017 - 15 August 2018 versions
Julia 0.5: Versions before 19 July 2017
The purposes of this tutorial are (a) to store things I learn myself about Julia and (b) to help
those who want to start coding in Julia before reading the 982 pages of the (outstanding)
official documentation.
Antonello Lobianco
Latest version
The latest version of this tutorial can be found online on GitBook, at
https://syl1.gitbook.io/julia-language-a-concise-tutorial
PDF version (if it works)
A legacy interface (if it works)
Corresponding GIT repository
Citations
Please cite this tutorial as:
Acknowledgements
Development of this tutorial was supported by:
the French National Research Agency through the Laboratory of Excellence ARBRE, a
part of the “Investissements d'Avenir” Program (ANR 11 – LABX-0002-01).
1 - Getting started
Why Julia
Without going into a long discussion, Julia solves a trade-off (partially thanks to the recent
developments in just-in-time compilers) that has long existed in programming: fast coding vs.
fast execution.
On the one hand, Julia allows you to code in a dynamic language like Python, R or Matlab,
allowing for fast interaction with your program and exceptional expressive power (see the
Metaprogramming chapter, for example).
On the other hand, with minimum effort programs written in Julia can run nearly as fast as C
(see Performances).
While it is still young, Julia allows you to easily interface your code with all the major
programming languages (see Interfacing Julia with other languages), hence reusing their
huge set of libraries, when these are not already being ported into Julia.
Julia has its roots in the domain of scientific, high-performance computing, but it is
becoming more and more mature as a general-purpose programming language.
Installing Julia
All you need to run the code in this tutorial is a working Julia interpreter (aka REPL - Read
Eval Print Loop).
In Linux you can simply use your package manager to install julia , but for a more up-to-date
version, or for Windows/Mac packages, I strongly suggest downloading the binaries
available in the download section of the Julia web-site.
For an Integrated Development Environment, check out either Juno or IJulia, the Julia Jupyter
backend.
You can find their detailed setup instructions here:
Juno (a useful tip I always forget: the key binding for block selection mode is
ALT+SHIFT )
IJulia (in a nutshell: if you already have Jupyter installed, just run using Pkg;
Pkg.update();Pkg.add("IJulia") from the Julia console. That's all! ;-) )
You can also choose, at least to start with, not to install Julia at all, and try JuliaBox, a free
online IJulia notebook server that you access from your browser, instead.
Running Julia
There are several ways to run Julia code:
Once you have it installed, just type julia in a console and then enter your commands
in the prompt that follows. You can type exit() when you have finished;
A Julia script is a text file ending in .jl , which you can have Julia parse and run with
julia myscript.jl [arg1, arg2,..] . Script files can also be run from within the Julia
console with include("myscript.jl") ;
To make a shebang script, just add the location of the Julia interpreter on your system,
preceded by #! and followed by an empty row, to the top of the script. You can find the
full path of the Julia interpreter by typing which julia in a console, for example,
/usr/bin/julia . Be sure that the file is executable (e.g. chmod 755 myscript.jl ). Then
you can run the script directly with ./myscript.jl .
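As an illustration of the shebang approach, here is a minimal sketch of such a script. It assumes the interpreter lives at /usr/bin/julia (the example path given above, which may differ on your system); the file name myscript.jl and the helper sum_args are hypothetical names chosen for this example.

```julia
#!/usr/bin/julia

# myscript.jl: sums the numeric command-line arguments.
# `sum_args` is a hypothetical helper written for this illustration.
function sum_args(args)
    total = 0.0
    for a in args
        x = tryparse(Float64, a)        # `nothing` if `a` is not a number
        x === nothing || (total += x)   # skip the non-numeric arguments
    end
    return total
end

# ARGS holds the arguments given after the script name
println("Sum of the numeric arguments: ", sum_args(ARGS))
```

Once the file is executable, it could be called as ./myscript.jl 1.5 2.5 or, equivalently, as julia myscript.jl 1.5 2.5 .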
Julia keeps many things in memory within the same work session, so if this creates
problems in the execution of your code, you can restart Julia. You can also use the Revise.jl
package for a finer control over what Julia keeps in memory during a work session.
You can check which version of Julia you are using with versioninfo().
Syntax elements
Single line comments start with # and multi-line comments can be placed in between #=
and =# . Multi-line comments can be nested, as well.
In console mode, ; after a command suppresses the output (this is done automatically in
scripting mode), and typed alone switches to one-time command shell prompt.
Indentation doesn't matter, but empty spaces sometimes do, e.g. function calls must have the
round parentheses with the inputs strictly attached to the function name: println(x) is fine, while println (x) raises an error.
Note: If you come from C or Python, one important thing to remember is that Julia's arrays
and other ordered data structures are indexed starting from 1 and not 0 .
Packages
Many functions are provided in Julia by external "packages". The "standard library" is a
package that is shipped with Julia itself, but like normal packages the user is required to
manually load the standard library. Many standard features that were in the language core
before Julia 1.0 have been moved to the standard library, so if you're moving from an older
version of Julia be aware of this.
To install or manage packages from within Julia, you must first write using Pkg to
use Pkg 's capabilities (alternatively, only in the interactive console, you can type ] to enter
a "special" Pkg mode).
You can then run the desired command directly if you are in the Pkg mode of the console, or
as pkg"cmd to run" in a script (notice that there is no space between pkg and the quoted
command to run).
1. status Retrieves a list with name and versions of locally installed packages
2. update Updates your local index of packages and all your local packages to the latest
version
3. add myPkg Automatically downloads and installs a package
4. rm myPkg Removes a package and all its dependent packages that have been installed automatically only for it
To use the functions provided by a package, just include a using myPkg statement in the
console or at the beginning of the script. If you prefer to import the package but keep the
namespace clean, use import myPkg (you will then need to refer to a package function
as myPkg.myFunction ). You can also include other files using include("MyFile.jl") : when
that line is run, the included file is completely run (not only parsed) and any symbol defined
there becomes available in the namespace relative to where include has been called.
Winston or Plots (plotting) and DataFrames (R-like tabular data) are examples of packages.
For example (see the Plotting section for specific Plotting issues):
(note: as of writing, the Plots package has not yet been ported to Julia 1.0)
using Plots
pyplot()
plot(rand(4,4))
or
import Plots
const pl = Plots # this creates an alias, equivalent to Python "import Plots as pl". Declaring it constant may improve the performances.
pl.pyplot()
pl.plot(rand(4,4))
or
You can read more about packages in the relevant section of the Julia documentation, and if
you are interested in writing your own package, skip to the "Developing Julia packages"
section.
While an updated, expanded and revised version of this chapter is available in "Chapter 1 -
Getting Started" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress, this
tutorial remains in active development.
2 - Data types
Scalar types
In Julia, variable names can include a subset of Unicode symbols, allowing a variable to be
represented, for example, by a Greek letter.
In most Julia development environments (including the console), to type the Greek letter you
can use a LaTeX-like syntax, typing \ and then the LaTeX name for the symbol, e.g.
\alpha for α . Using LaTeX syntax, you can also add subscripts, superscripts and
decorators.
The main types of scalar are Int64 , Float64 , Char (e.g. x = 'a' ), String ¹ (e.g.
x="abc" ) and Bool .
Strings
Julia supports most typical string operations, for example: split(s) (default on
whitespaces), join([s1,s2], "") , replace(s, "toSearch" => "toReplace") and strip(s)
(remove leading and trailing whitespaces). Pay attention to use single quotes for chars and
double quotes for strings.
Concatenation
There are several ways to concatenate strings:
Concatenation operator: * ;
Function string(str1,str2,str3) ;
Combine string variables in a bigger one using the dollar symbol: a = "$str1 is a
string and $(myobject.int1) is an integer" ("interpolation")
Note: the first method doesn't automatically cast integers and floats to strings.
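The string operations and the three concatenation methods above can be tried with a short sketch like the following (the variable names and sample values are only illustrative):

```julia
s = "  Hello World  "
words = split(strip(s))                     # trim, then split on whitespace
joined = join(words, "-")
swapped = replace(joined, "World" => "Julia")

str1 = "Julia"
n = 3
c1 = str1 * " rocks"                        # concatenation operator
c2 = string(str1, " v", n)                  # string() also converts numbers
c3 = "$str1 has $(length(str1)) letters"    # interpolation
# str1 * n would raise a MethodError: * does not convert numbers to strings

println(joined, " | ", swapped, " | ", c1, " | ", c2, " | ", c3)
```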
Arrays (lists)
Arrays are N-dimensional mutable containers. In this section, we deal with 1-dimensional
arrays, in the next one we consider 2 or more dimensional arrays.
Row vector (Matrix container, alias for 2-dimensions array, see next section
"Multidimensional and nested arrays"): a = [1 2 3]
Arrays can be heterogeneous (but in this case the array will be of Any type and in general
much slower): x = [10, "foo", false] .
If you need to store a limited set of types in the array, you can use a Union type to
still have an efficient implementation, e.g. a = Union{Int64,String,Bool}[10, "Foo", false] .
An uninitialised 1-dimensional array of n elements of type T can be created with
a=Array{T,1}(undef,n) .
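A small sketch of the different ways to create the arrays mentioned in this section (the values are arbitrary, and the println is only there to inspect the results):

```julia
a = [1, 2, 3]                                   # column vector (1-dimensional array)
r = [1 2 3]                                     # row vector: a 1×3 Matrix
x = [10, "foo", false]                          # heterogeneous: element type Any
u = Union{Int64,String,Bool}[10, "Foo", false]  # restricted to three types
e = Array{Float64,1}(undef, 4)                  # 4 uninitialised Float64 elements

println(typeof(a), " ", size(r), " ", eltype(x), " ", eltype(u), " ", length(e))
```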
Square brackets are used to access the elements of an array (e.g. a[1] ). The slice syntax
[from:step:to] is generally supported and in several contexts will return a (fast) iterator
rather than a list (you can use the keyword end and, from Julia 1.4, also begin ). To then transform the
iterator into a list use collect(myiterator) . You can initialise an array with a mix of values and
ranges with either y=[2015; 2025:2030; 2100] (note the semicolon!) or y=vcat(2015,
2025:2030, 2100) .
Push an element to the end of a: push!(a,b) (as a single element even if it is an Array.
Equivalent to Python append )
To append all the elements of b to a: append!(a,b) (if b is a scalar, push! and
append! give the same result. Note that a string is treated as a list of characters! Equivalent to Python
extend or += )
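The slicing syntax and the difference between push! and append! can be seen in a short sketch (sample values only):

```julia
a = [10, 20, 30, 40, 50]
middle = a[2:4]                      # elements 2 to 4
odd_positions = a[1:2:end]           # every second element, from the first
r = 1:2:9                            # a range: lazy, the elements are not stored
materialised = collect(r)            # now a regular array
y = [2015; 2025:2030; 2100]          # mix of values and ranges

b = [1, 2, 3]
push!(b, 4)                          # adds one element
append!(b, [5, 6])                   # adds each element of the collection
nested = Any[1, 2]
push!(nested, [3, 4])                # the whole array becomes a single element
```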
original array)
Reversing an array: a[end:-1:1]
Checking for existence: in(1, a)
Getting the length: length(a)
Get the maximum value: maximum(a) or max(a...) ( max returns the maximum value
between the given arguments)
Get the minimum value: minimum(a) or min(a...) ( min returns the minimum value
between the given arguments)
Empty an array: empty!(a) (only column vector, not row vector)
Transform row vectors into column vectors: b = vec(a)
Random-shuffling the elements: shuffle(a) (or shuffle!(a) . From Julia 1.0 this
requires using Random before)
Checking if an array is empty: isempty(a)
Find the index of a value in an array: findall(x -> x == value, myarray) . This is a bit
tricky. The first argument is an anonymous function that returns a boolean value for
each value of myarray , and then findall() returns the index position(s).
Delete a given item from a list: deleteat!(myarray, findall(x -> x == myunwanteditem,
myarray))
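Several of the operations listed above, applied in sequence to a small sample array (a sketch with arbitrary values):

```julia
using Random   # needed for shuffle from Julia 1.0

a = [3, 1, 2, 1]
reversed = a[end:-1:1]
has_one = in(1, a)
largest = maximum(a)                 # same as max(a...)
smallest = minimum(a)                # same as min(a...)
positions = findall(x -> x == 1, a)  # indexes where the value is 1
shuffled = shuffle(a)                # new array, a is left untouched
deleteat!(a, positions)              # removes every 1 from a
remaining = copy(a)
empty!(a)                            # a has now no elements
```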
a = [[1,2,3] [4,5,6]] , i.e. [[elements of the first column] [elements of the second column]
...] (note that this is valid only if written on a single line. Use hcat(col1, col2) to write a
matrix column by column);
a = [1 4; 2 5; 3 6] , i.e. [elements of the first row; elements of the second row; ...].
Note that a = [[1,2,3] [4,5,6]] creates a 2-dimensional array (a matrix with 2 columns) with
three rows, while a = [[1,2,3], [4,5,6]] (with a comma) creates a 1-dimensional array with two elements, each of them being again a vector.
Nested arrays can be accessed with double square brackets, e.g. a[2][3] .
Elements of bidimensional arrays can be accessed instead with the a[row,col] syntax,
where again the slice syntax can be used, for example, given a is a 3x3 Matrix, a[1:2,:]
would return a 2x3 Matrix with all the column elements of the first and second row.
a = [[1,2,3] [4,5,6]]
mask = [[true,true,false] [false,true,false]]
a[mask] returns a 1-D array with 1, 2 and 5. Note that boolean selection always results in
a flattened array, even if you delete a whole row or a whole column of the original data. It is up to
the programmer to then reshape the data accordingly.
Note: for row vectors, both a[2] and a[1,2] return the second element.
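The different ways to build and index matrices and nested arrays discussed above can be compared in a short sketch (all names are illustrative):

```julia
m1 = [[1,2,3] [4,5,6]]        # by columns
m2 = [1 4; 2 5; 3 6]          # by rows
m3 = hcat([1,2,3], [4,5,6])   # column by column
nested = [[1,2,3], [4,5,6]]   # a vector of vectors, not a matrix

first_two_rows = m1[1:2, :]   # slice syntax on the rows, all the columns
mask = [[true,true,false] [false,true,false]]
selected = m1[mask]           # flattened, column by column
row = [10 20 30]              # row vector
```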
ndims(a) returns the number of dimensions of the array (e.g. 2 for a Matrix)
Reshaping, dropping dimensions and transposing: reshape(a, newDims...) , dropdims(a, dims=...) (the dimensions to drop
must all have a single element, i.e. be of size 1) and the
transpose ' operator. These operations perform a shallow copy, returning just a
different "view" of the underlying data (so modifying the original matrix modifies also the
reshaped/transposed matrix). You can use collect(reshape/dropdims/transpose) to
force an independent copy.
AbstractMatrix{T} is an alias to AbstractArray{T,2} .
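A minimal sketch of the memory-sharing behaviour of reshape and of the transpose operator, and of how collect breaks the link with the original data (sample values only):

```julia
a = [1 2 3; 4 5 6]
b = reshape(a, 3, 2)               # shares memory with a
c = collect(reshape(a, 3, 2))      # independent copy
t = a'                             # lazy transposed view of a
a[1, 1] = 100                      # visible through b and t, not through c

d = dropdims(ones(2, 1, 3), dims=2)   # drops the dimension of size 1
println(ndims(a), " ", size(d))
```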
Multidimensional Arrays can arise for example from using list comprehension: a = [3x + 2y
+ z for x in 1:2, y in 2:3, z in 1:2]
For further operations on arrays and matrices have a look at the QuantEcon tutorial.
Tuples
Use tuples to have a list of immutable elements: a = (1,2,3) or even without parentheses
a = 1,2,3
Tuples can be easily unpacked to multiple variables: var1, var2 = (x,y) (this is useful, for
example, to collect the values of functions returning multiple values)
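For instance, a short sketch of tuple creation, unpacking and immutability (minmax_of is a hypothetical helper defined only for this example):

```julia
t = (1, 2, 3)
t2 = 1, 2, 3                              # same tuple, without parentheses
var1, var2 = (10, "ten")                  # unpacking
minmax_of(v) = (minimum(v), maximum(v))   # hypothetical helper returning two values
lo, hi = minmax_of([4, 9, 1])

# Tuples are immutable: trying to change an element raises an error
is_immutable = try
    t[1] = 5
    false
catch
    true
end
```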
Useful tricks:
NamedTuples
NamedTuples are collections of items whose position in the collection (index) can be
identified not only by the position but also by name.
Like "normal" tuples, NamedTuples can hold any values, but cannot be modified (i.e. they are
"immutable").
Before Julia 1.0 Named Tuples were implemented in a separate package (NamedTuples.jl).
The idea is that, like for the Missing type, the separate package provides additional
functionality to the core NamedTuple type, but there is still a bit of confusion over it and, at
time of writing, the additional package still provides its own implementation (and many other
external packages require it), resulting in crossed incompatibilities.
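A minimal sketch of the core NamedTuple type (no external package is needed; names and values are illustrative):

```julia
nt = (a = 1, b = 2.5)
by_name = nt.a                     # access by name
by_position = nt[2]                # access by position
field_names = keys(nt)
as_dict = Dict(pairs(nt))          # conversion to a dictionary
updated = merge(nt, (b = 3.0, c = "new"))   # builds a new NamedTuple, nt is unchanged
```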
Dictionaries
Dictionaries store mappings from keys to values, and they have an apparently random
sorting.
You can create an empty (zero-elements) dictionary with mydict = Dict() , or initialize a
dictionary with values: mydict = Dict('a'=>1, 'b'=>2, 'c'=>3)
Look up values: mydict['a'] (it raises an error if looked-up value doesn't exist)
Look up value with a default value for non-existing key: get(mydict,'a',0)
Get all keys: keys(mydict) (the result is an iterator, not an Array. Use collect() to
transform it.)
Get all values: values(mydict) (result is again an iterator)
Check if a key exists: haskey(mydict, 'a')
Check if a given key/value pair exists (that is, if the key exists and has that specific
value): in(('a' => 1), mydict)
You can iterate through both the key and the values of a dictionary at the same time:
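A sketch of iterating over keys and values at the same time, together with some of the look-up functions listed above (sample data only):

```julia
mydict = Dict('a' => 1, 'b' => 2, 'c' => 3)
mydict['d'] = 4                       # add (or overwrite) an entry
fallback = get(mydict, 'z', 0)        # default value for a non-existing key
sorted_keys = sort(collect(keys(mydict)))

for (k, v) in mydict                  # key and value at the same time
    println("$k => $v")
end
total = sum(v for (k, v) in mydict)
```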
While named tuples and dictionaries can look similar, there are some important differences
between them:
The syntax is a bit less verbose and more readable with NamedTuples: nt.k1 vs d[:k1]
Overall, NamedTuples are generally more efficient and should be thought of more as
anonymous structs (see the "Custom structures" section) than as Dictionaries.
Sets
Use sets to represent collections of unordered, unique values.
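A short sketch of some common set operations (these are standard Base functions, not necessarily the ones the original list contained):

```julia
s = Set([1, 2, 2, 3])              # duplicates are dropped
push!(s, 4)
push!(s, 4)                        # no effect, 4 is already there
u = union(s, Set([5]))
i = intersect(s, Set([2, 3, 9]))
d = setdiff(s, Set([1]))
println(length(s), " ", 2 in s)
```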
Some methods:
"simple" types (e.g. Float64, Int64, but also String ) are deep copied
containers of simple types (or other containers) are shallow copied (their content is only
referenced, not copied)
copy(x)
deepcopy(x)
You can check if two objects have the same values with == and if two objects are actually
the same with === (in the sense that immutable objects are checked at the bit level and
mutable objects are checked for their memory address):
given a = [1, 2]; b = [1, 2]; , a == b and a === a are true, but a === b is false;
given a = (1, 2); b = (1, 2); , all a == b, a === a and a === b are true.
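The difference between copy and deepcopy on a nested container, and between == and === , can be checked with a small sketch (arbitrary values):

```julia
a = [[1, 2], [3, 4]]
b = copy(a)          # new outer container, inner arrays still shared
c = deepcopy(a)      # everything is copied
a[1][1] = 100        # visible through b, not through c
push!(a, [5, 6])     # not visible in b: the outer container is separate

x = [1, 2]; y = [1, 2]
p = (1, 2); q = (1, 2)
println(x == y, " ", x === y, " ", p === q)
```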
Pay attention to the keyword const . When applied to a variable (e.g. const x = 5 ) it doesn't
mean that the variable can't change value (as in C), but simply that it cannot change type.
Only global variables can be declared constant.
To convert ("cast") between types, use convertedObj = convert(T,x) . Still, when conversion
is not possible, e.g. trying to convert a 6.4 Float64 to an Int64 value, an error will be raised
( InexactError in this case).
To convert strings to numbers use parse(T, myString) , e.g. parse(Int64, "123") . For the opposite (to convert integers or floats to strings), use myString = string(123) .
You can "broadcast" a function to work over an Array (instead of a scalar) using the dot ( . )
operator.
For example, to broadcast parse to work over an array use: myNewList = parse.(Float64,
["1.1","1.2"]) (see also Broadcast in the "Functions" Section)
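A sketch of the conversions discussed above, including the InexactError case and the broadcast version of parse :

```julia
f = convert(Float64, 3)            # Int64 to Float64
i = convert(Int64, 6.0)            # fine: 6.0 has an exact integer value
err = try
    convert(Int64, 6.4)            # not exactly representable as an integer
catch e
    e
end
s = string(123)                    # number to string
n = parse(Int64, "123")            # string to number
myNewList = parse.(Float64, ["1.1", "1.2"])   # broadcast over an array
```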
Variable names have to start with a letter, as if they start with a number there is ambiguity if
the initial number is a multiplier or not, e.g. in the expression 6ax the variable ax is
multiplied by 6, and it is equal to 6 * ax (and note that 6 ax would result in a syntax
error). Conversely, ax6 would be a variable named ax6 and not ax * 6 .
You can import data from a file to a matrix using readdlm() (in standard library package
DelimitedFiles ). You can skip rows and/or columns using the slice operator.
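A minimal sketch of this workflow, assuming a small semicolon-delimited file with one header row and one identifier column (the file is written to a temporary path just for the example, and the final conversion to Float64 is one possible choice):

```julia
using DelimitedFiles

# Write a small sample file to a temporary location (illustrative content)
path = tempname()
write(path, "id;x;y\n1;1.5;2.5\n2;3.5;4.5\n")

raw = readdlm(path, ';')               # mixed content: a Matrix of Any
data = Float64.(raw[2:end, 2:end])     # skip the header row and the first column
println(data)
```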
Random numbers
Random float in [0,1): rand()
Random integer in [a,b]: rand(a:b)
Random float in [a,b] with "precision" to the second digit : rand(a:0.01:b)
This last can be executed faster and more elegantly using the Distributions package, with rand(Uniform(a,b)) .
You can obtain an Array or a Matrix of random numbers simply by specifying the requested size
to rand(), e.g. rand(2,3) or rand(Uniform(a,b),2,3) for a 2x3 Matrix.
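A sketch using only the standard library (the seed is fixed just to make the run reproducible; the Distributions variant is left as a comment because it requires an external package):

```julia
using Random
Random.seed!(42)          # optional: makes the sequence reproducible

u = rand()                # float in [0,1)
die = rand(1:6)           # integer between 1 and 6
f = rand(0:0.01:1)        # float in [0,1] with two-digit "precision"
m = rand(2, 3)            # 2x3 Matrix of floats in [0,1)
v = rand(1:10, 5)         # vector of 5 integers between 1 and 10

# With the Distributions package installed:
# using Distributions; rand(Uniform(0, 1), 2, 3)
```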
nothing (type Nothing ): is the value returned by code blocks and functions which do
not return anything. It is the single instance of the singleton type Nothing , and the closest
to C style NULL (sometimes it is referred to as the "software engineer’s null"). Most
operations with nothing values will result in a run-time error. In some contexts it is
printed as #NULL ;
missing (type Missing ): represents a missing value in a statistical sense: there should
be a value but we don't know which it is (so it is sometimes referred to as the "data
scientist’s null"). Most operations with missing values will result in missing being propagated
(silently). Containers can handle missing values efficiently when declared of type
Union{T,Missing} . The Missings.jl package provides additional methods to handle
missing elements;
NaN (type Float64 ): represents when an operation results in a Not-a-Number value
(e.g. 0/0). It is similar to missing in the fact that it propagates silently. Similarly, Julia
also offers Inf (e.g. 1/0 ) and -Inf (e.g. -1/0 ).
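The three kinds of "empty" values can be compared in a short sketch ( noop is a hypothetical function defined only for this example):

```julia
noop() = nothing                   # a function that returns nothing
r = noop()

x = [1, missing, 3]
total = sum(x)                     # missing propagates silently
clean_total = sum(skipmissing(x))  # ignores the missing values
typed = Union{Int64,Missing}[1, missing, 3]

notanumber = 0 / 0                 # NaN
println(r === nothing, " ", total, " ", clean_total, " ", notanumber)
```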
¹: Technically a String is an array in Julia (try to append a String to an array!), but for most
uses it can be thought of as a scalar type.
While an updated, expanded and revised version of this chapter is available in "Chapter 2 -
Data Types and Structures" of Antonello Lobianco (2019), "Julia Quick Syntax Reference",
Apress, this tutorial remains in active development.
3 - Control flow
All typical control flow constructs ( for , while , if / else , do ) are supported, and parentheses
around the condition are not necessary. Multiple conditions can be specified in the for loop,
e.g.:
for i=1:2,j=2:4
    println(i*j)
end
for (i,x) in enumerate(myarray) ( enumerate
returns an iterator to tuples with the index and the value of elements in an array)
[students[name] = sex for (name,sex) in zip(names,sexes)] ( zip returns an iterator of
tuples pairing the elements of its arguments)
When mapping a function with a single parameter, the parameter can be
omitted: a = map(f, [1,2,3]) is equal to a = map(x->f(x), [1,2,3]) .
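The constructs above can be combined in a small runnable sketch (the student data are invented for the example):

```julia
# Multiple conditions in a single for loop: j varies fastest
products = Int[]
for i = 1:2, j = 2:4
    push!(products, i * j)
end

# zip pairs the elements of two collections
student_names = ["Marc", "Anne"]
sexes = ['M', 'F']
students = Dict{String,Char}()
for (name, sex) in zip(student_names, sexes)
    students[name] = sex
end

# enumerate yields (index, value) tuples
indexed = [(i, x) for (i, x) in enumerate(["a", "b"])]

# map with a named function: no need to wrap it in x -> f(x)
f(x) = 2x
a = map(f, [1, 2, 3])
```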
Logical operators
And: &&
Or: ||
Not: !
Currently, and and or aliases for && and || respectively have not been implemented.
Do blocks
Do blocks allow you to define anonymous functions that are passed as the first argument to the
outer function. For example, findall(x -> x == value, myarray) expects the first argument
to be a function. Every time the first argument is a function, this can be written a posteriori
with a do block:
findall(myarray) do x
    x == value
end
This defines x as a variable that is passed to the body of the do block. It is the
task of the outer function to decide where to apply this anonymous function (in this case to the
myarray array) and what to do with its return values (in this case boolean values used to
select the indexes to return).
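A runnable sketch showing that the two forms are equivalent (sample data only):

```julia
myarray = [1, 5, 3, 5]
value = 5

# Anonymous function passed explicitly as first argument
idx1 = findall(x -> x == value, myarray)

# Same call written with a do block
idx2 = findall(myarray) do x
    x == value
end

println(idx1, " ", idx2)
```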
While an updated, expanded and revised version of this chapter is available in "Chapter 3 -
Control Flow and Functions" of Antonello Lobianco (2019), "Julia Quick Syntax Reference",
Apress, this tutorial remains in active development.
4 - Functions
Functions can be defined inline or using the function keyword, e.g.:
f(x,y) = 2x+y
function f(x)
    x+2
end
(a third way is to create an anonymous function and assign it to a variable, see later)
Arguments
Arguments are normally specified by position, while arguments given after a semicolon are
instead specified by name.
The call of the function must respect this distinction, calling positional arguments by position
and keyword arguments by name (e.g., it is not possible to call positional arguments by
name).
The last argument(s) (whether positional or keyword) can be specified together with a
default value.
myfunction(a,b=1;c=2) = (a+b)*3 # definition with 2 positional arguments and one keyword argument
myfunction(10,c=13) # calling (10+1)*3
To declare a function parameter as being either a scalar type T or a vector of T you can use
a Union: function f(par::Union{Float64, Vector{Float64}} = Float64[]) [...] end
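A sketch of positional, default, keyword and Union-typed arguments ( combine and halve are hypothetical functions written only for this illustration):

```julia
# Two positional arguments (the second with a default) and one keyword argument
combine(a, b=1; c=2) = (a + b) * c

r1 = combine(10)             # b and c take their defaults: (10+1)*2
r2 = combine(10, 2)          # (10+2)*2
r3 = combine(10, c=3)        # (10+1)*3
r4 = combine(10, 2; c=3)     # (10+2)*3

# A parameter accepting either a scalar Float64 or a vector of Float64
halve(par::Union{Float64, Vector{Float64}} = Float64[]) = par ./ 2
```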
The ellipsis (splat ... ) can be used in order to both specify a variable number of
arguments and "splice" a list or array into the parameters of a function call:
values = [1,2,3]
function average(init, args...) # The parameter that uses the ellipsis must be the last one
    s = 0
    for arg in args
        s += arg
    end
    return init + s/length(args)
end
a = average(10,1,2,3) # 12.0
a = average(10, values ...) # 12.0
Return value
Returning a value using the keyword return is optional: by default the last
computed value is returned.
The return value can also be a tuple (so returning effectively multiple values):
myfunction(a,b) = a*2,b+2
x,y = myfunction(1,2)
A parametric function, e.g. one declared as myfunction(x::T, y::T2, z::T2) where {T <: Number, T2} , first defines two types, T (a subtype of Number) and T2, and then specifies
which of these two types each parameter must be.
You can call it with (1,2,3) or (1,2.5,3.5) as parameters, but not with (1,2,3.5) as the
definition of myfunction constrains the second and third parameters to be of the same
type (whatever it is).
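A function matching this description could look like the following sketch. The name typed_sum and its body are assumptions made for illustration; only the type constraints follow the text above.

```julia
# T must be a subtype of Number; the second and third arguments must share the same type T2
function typed_sum(x::T, y::T2, z::T2) where {T <: Number, T2}
    return x + y + z
end

ok1 = typed_sum(1, 2, 3)        # T = Int64, T2 = Int64
ok2 = typed_sum(1, 2.5, 3.5)    # T = Int64, T2 = Float64

# Second and third arguments of different types: no matching method
rejected = try
    typed_sum(1, 2, 3.5)
    false
catch err
    err isa MethodError
end
```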
Functions as objects
Functions themselves are objects and can be assigned to new variables, returned, or
nested. E.g.:
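A short sketch of these three uses (all the function names are hypothetical, chosen only for this example):

```julia
f(x) = 2x
a = f                              # a is now another name for the same function

make_adder(n) = x -> x + n         # a function returning a function
add5 = make_adder(5)

apply_twice(g, x) = g(g(x))        # a function taking a function as argument

println(a(5), " ", add5(1), " ", apply_twice(f, 3))
```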
first one)
Anonymous functions
Sometimes we don't need to give a name to a function (e.g. within the map function). To
define anonymous (nameless) functions we can use the -> syntax, like this:
x -> x^2 + 2x - 1
This defines a nameless function that takes an argument, calls it x , and produces x^2 + 2x
- 1 . Multiple arguments can be provided using tuples: (x,y,z) -> x + y + z
You can still assign an anonymous function to a variable: f = (x,y) -> x+y
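For example, the anonymous functions above can be used as in this sketch:

```julia
# Passed directly to map, without ever being named
squares = map(x -> x^2 + 2x - 1, [1, 2, 3])

# Multiple arguments, then assignment to a variable
add3 = (x, y, z) -> x + y + z
f = (x, y) -> x + y

println(squares, " ", add3(1, 2, 3), " ", f(1, 2))
```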
Broadcast
You can "broadcast" a function to work over each element of an array (element-wise): myArray
= broadcast(i -> replace(i, "x" => "y"), myArray) . This is equivalent to (note the dot):
myArray = replace.(myArray, Ref("x" => "y")) ( Ref() is needed to protect the pair "x" => "y"
from being broadcast itself).
While in the past broadcast was available on a limited number of core functions only, the f.()
syntax is now automatically available for any function, including the ones you define.
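A sketch comparing the two broadcast forms, plus the dot syntax applied to a user-defined function ( addtwo is a hypothetical name):

```julia
myArray = ["xa", "bx"]

r1 = broadcast(i -> replace(i, "x" => "y"), myArray)
r2 = replace.(myArray, Ref("x" => "y"))     # dot syntax, the pair wrapped in Ref()

addtwo(x) = x + 2                           # user-defined scalar function
r3 = addtwo.([1, 2, 3])                     # broadcast over an array
```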
While an updated, expanded and revised version of this chapter is available in "Chapter 3 -
Control Flow and Functions" of Antonello Lobianco (2019), "Julia Quick Syntax Reference",
Apress, this tutorial remains in active development.
5 - Custom structures
Structures (previously known in Julia as "Types") are, for the most part (see later for the
difference), what in other languages are called classes, or "structured data": they define the
kind of information that is embedded in the structure, that is a set of fields (aka "properties"
in other languages), and then individual instances (or "objects") can be produced each with
its own specific values for the fields defined by the structure.
They are "composite" types, in the sense that they are not made of just a fixed amount of bits as
"primitive" types are.
Defining a structure
mutable struct MyOwnType
    property1
    property2::String
end
To increase performance in certain circumstances, you can optionally specify the type of
each field, as done in the previous example for property2.
You can omit the mutable keyword in front of struct when you want to enforce that once
an object of that type has been created, its fields can no longer be changed (i.e., structures
are immutable by default. Note that mutable objects, such as arrays, remain themselves mutable
also in an immutable structure). Although obviously less flexible, immutable structures are
much faster.
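The difference between mutable and immutable structures can be checked with a small sketch ( MutablePoint and FrozenRecord are hypothetical types defined only for this example):

```julia
mutable struct MutablePoint
    x::Float64
    y::Float64
end

struct FrozenRecord            # immutable by default
    id::Int64
    tags::Vector{String}
end

p = MutablePoint(1, 2)         # the arguments are converted to Float64
p.x = 10.0                     # allowed: the structure is mutable

r = FrozenRecord(1, ["a"])
push!(r.tags, "b")             # allowed: the array itself remains mutable

# Reassigning a field of an immutable structure raises an error
blocked = try
    r.id = 2
    false
catch
    true
end
```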
You can create abstract types using the keyword abstract type . Abstract types do not have
any field, and objects can not be instantiated from them, although concrete types
(structures) can be defined as subtypes of them (an issue to allow abstract classes to have
fields is currently open and may be implemented in the future).
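A sketch of an abstract type with two concrete subtypes (the Shape hierarchy and its functions are hypothetical, written only to illustrate the mechanism):

```julia
abstract type Shape end            # no fields, cannot be instantiated

struct Circle <: Shape
    r::Float64
end

struct Square <: Shape
    side::Float64
end

area(c::Circle) = pi * c.r^2
area(s::Square) = s.side^2

# A function defined once on the abstract type works for every subtype
describe(s::Shape) = "shape with area $(round(area(s), digits=2))"

println(describe(Circle(1.0)))
println(describe(Square(2.0)))
```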
Note that you initialise the object with the values in the order that has been specified in the
structure definition.
struct Person
    myname::String
    age::Int64
end
struct Shoes
    shoesType::String
    colour::String
end
struct Student
    s::Person
    school::String
    shoes::Shoes
end
function printMyActivity(self::Student)
    println("I study at $(self.school) school")
end
struct Employee
    s::Person
    monthlyIncomes::Float64
    company::String
    shoes::Shoes
end
function printMyActivity(self::Employee)
    println("I work at $(self.company) company")
end
gymShoes = Shoes("gym","white")
proShoes = Shoes("classical","brown")
# illustrative instances: the field values below are examples only
Marc = Student(Person("Marc",15),"Divine School",gymShoes)
MrBrown = Employee(Person("Brown",45),1200.0,"ABC Corporation Inc.",proShoes)
printMyActivity(Marc)
printMyActivity(MrBrown)
There are three big elements that distinguish the Julia implementation from a pure Object-Oriented
paradigm:
1. Firstly, in Julia you do not associate functions to a type. So, you do not call a
method over an object ( myobj.func(x,y) ) but rather you pass the object as a
parameter ( func(myobj, x, y) );
2. In order to extend the behaviour of any object, Julia doesn't use inheritance (only
abstract types can be inherited) but rather composition (a field of the subtype is of
the higher type, allowing access to its fields). I personally believe that this is somewhat of a limit
in the expressiveness of the language, as the code cannot consider directly different
concepts of relations between objects (e.g. Person->Student specialisation, Person->Arm
composition, Person->Shoes weak relation );
3. Multiple-inheritance is not supported (yet).
More on types
Some useful type-related functions:
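A sketch of a few standard type-related functions from Base (these are not necessarily the ones the original list contained, and Point2D is a hypothetical type defined only for this example):

```julia
struct Point2D
    x::Float64
    y::Float64
end

p = Point2D(1.0, 2.0)
t = typeof(p)                           # type of an object
parent = supertype(Int64)               # direct parent type
field_names = fieldnames(Point2D)       # tuple of field names
x_type = fieldtype(Point2D, :x)         # type of a given field

println(t, " ", parent, " ", field_names, " ", isa(p, Point2D), " ", Int64 <: Number)
```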
While an updated, expanded and revised version of this chapter is available in "Chapter 4 -
Custom Types" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress, this
tutorial remains in active development.
6 - Input - Output
File reading/writing
File reading/writing is similar to other languages where you first open the file, specify the
mode ( r read, w write or a append) and bind the file to an object, and finally operate
on this object and close() it when you are done.
A better alternative is however to encapsulate the file operations in a do block that closes
the file automatically when the block ends:
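Write:
open("afile.txt", "w") do f # "w" for writing
    write(f, "test\n") # \n for newline
end
Read line by line:
open("afile.txt", "r") do f
    for ln in eachline(f)
        println(ln)
    end
end
Read line by line, keeping track of the line number:
open("afile.txt", "r") do f
    for (i,ln) in enumerate(eachline(f))
        println("$i $ln")
    end
end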
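For comparison, here is a sketch of the explicit open/close workflow described at the start of this section, next to the do-block form. It writes to a temporary file so that it can be run anywhere; the file content is arbitrary.

```julia
path = tempname()                  # temporary file, just for this example

# Explicit open / operate / close, in "w" (write) mode
f = open(path, "w")
write(f, "a\nb\n")
close(f)

# Same pattern in "a" (append) mode
f = open(path, "a")
write(f, "c\n")
close(f)

# do-block form: the file is closed automatically at the end of the block
content = open(path, "r") do f
    read(f, String)                # read the whole file as a single string
end

lines = readlines(path)            # shortcut returning one string per line
```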
Other IO
Some packages that deal with IO are:
CSV: CSV.jl
Web stream: HTTP.jl
Spreadsheets (OpenDocument): OdsIO.jl
HDF5: HDF5.jl
Some basic examples that use them are available in the DataFrames section.
While an updated, expanded and revised version of this chapter is available in "Chapter 5 -
Input/Output" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress, this
tutorial remains in active development.
7 - Managing run-time errors (exceptions)
try
    # ..some dangerous code..
catch
    # ..what to do if an error happens, most likely send an error message using:
    error("My detailed message")
end
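A runnable sketch of the same pattern, where the "dangerous code" is a square root of a possibly negative number ( safe_sqrt is a hypothetical helper; returning NaN is just one possible way to handle the error):

```julia
function safe_sqrt(x)
    try
        return sqrt(x)             # throws a DomainError for negative reals
    catch err
        if err isa DomainError
            println("Cannot take the square root of $x, returning NaN")
            return NaN
        else
            rethrow()              # any other error is passed on to the caller
        end
    end
end

# error() raises an ErrorException carrying the given message
message = try
    error("My detailed message")
catch err
    err.msg
end
```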
8 - Interfacing Julia with other languages
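C
mylib.h:
#ifndef _MYLIB_H_
#define _MYLIB_H_
extern float iplustwo (float i);
#endif
mylib.c:
#include "mylib.h"
float
iplustwo (float i){
    return i+2;
}
Compiled with (for example, using gcc):
gcc -shared -fPIC -o libmylib.so mylib.c
Use in Julia:
i = 2
const mylib = joinpath(@__DIR__, "libmylib.so")
j = ccall((:iplustwo, mylib), Float32, (Float32,), i)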
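The same ccall mechanism can be tried without compiling anything, by calling a function of the C standard library. This sketch assumes that strlen is reachable from the running Julia process, which is normally the case on Linux and macOS.

```julia
# ccall(function name, return type, tuple of argument types, arguments...)
len = ccall(:strlen, Csize_t, (Cstring,), "hello")
println("C strlen returned ", Int(len))
```

Note that the tuple of argument types needs a trailing comma when there is a single argument, as in (Cstring,) here and (Float32,) above.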
Python
We show here an example with Python. The following code converts an ODS spreadsheet into
a Julia DataFrame, using the Python ezodf module (of course this has to be already
available in the local installation of Python):
using PyCall
using DataFrames
The first thing is to declare we are using PyCall and to @pyimport the Python module we
want to work with. We can then directly call its functions with the usual Python syntax
module.function() .
Type conversions are automatically performed for numeric, boolean, string, IO stream,
date/period, and function types, along with tuples, arrays/lists, and dictionaries of these
types.
Other types are instead converted to the generic PyObject type, as is the case for the
destDoc object returned by the module function.
You can then access its attributes and methods with myPyObject.attribute and
myPyObject.method() respectively.
While an updated, expanded and revised version of this chapter is available in "Chapter 7 -
Interfacing Julia with Other Languages" of Antonello Lobianco (2019), "Julia Quick Syntax
Reference", Apress, this tutorial remains in active development.
9 - Metaprogramming
Julia represents its own code as a data structure accessible from the language itself. Since
code is represented by objects that can be created and manipulated from within the
language, it is possible for a program to transform and generate its own code, that is to
create powerful macros (the term "metaprogramming" refers to the possibility to write code
that writes code that is then evaluated).
Note the difference with C or C++ macros. There, macros work by performing textual
manipulation and substitution before any actual parsing or interpretation occurs.
In Julia, macros work when the code has already been parsed and organised in a syntax
tree, and hence the semantics are much richer and allow for much more powerful
manipulations.
Expressions
There are many ways to create an expression:
a = 1
expr = :($a+2) # expr is now :(1+2)
Quote block
An alternative to the :([...]) operator is to use the quote [...] end block.
Parse a string
An expression can also be created starting from a string (that is, the original representation of source code for Julia):
expr = Meta.parse("1+2") # parses the string "1+2" and saves the `1+2` expression in the `expr` expression, same as expr = :(1+2)
eval(expr) # here the expression is evaluated and the code returns 3
The args field of an expression ( expr.args ) is an array of elements that can be symbols, literal values or other expressions.
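The different ways of building the same expression can be compared in a short sketch (the Expr constructor shown last is one further option; the variable names are illustrative):

```julia
e1 = :(1 + 2)                      # colon prefix operator
a = 1
e2 = :($a + 2)                     # interpolation of the value of a
e3 = Meta.parse("1+2")             # from a string
e4 = Expr(:call, :+, 1, 2)         # from the head and the arguments
q = quote                          # quote block
    1 + 2
end

println(e1.head, " ", e1.args)     # the two fields of the expression
result = eval(e1)
```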
Symbols
The second meaning of the : operator is to create symbols, and it is equivalent to the
Symbol() function that concatenate its arguments to form a symbol:
a = 2;
ex = Expr(:call, :*, a, :b) # ex is equal to :(2 * b). Note that b doesn't even need to be defined
a = 0; b = 2;               # no matter what now happens to a, as a is evaluated at the moment of creating the expression and the expression stores its value, without any more reference to the variable
eval(ex)                    # returns 4, not 0
Macros
The possibility of representing code as expressions is at the heart of the usage of macros.
Macros in Julia take one or more input expressions and return a modified expression (at
parse time). This contrasts with normal functions that, at runtime, take the input values
(arguments) and return a computed value.
Macro definition
Macro call
As with strings, the $ interpolation operator substitutes the variable with its content, in
this context the expression. So the "expanded" macro will in this case look like:
if !(3 in array)
println("array does not contain 3")
end
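A macro definition and call consistent with this expansion could be sketched as follows. The macro name `unless` and its argument names are assumptions made for illustration, and `esc()` is used here so that the names in the input expressions are resolved in the scope of the caller:

```julia
# Macro definition: takes two expressions and returns a new expression
macro unless(test_expr, branch_expr)
    quote
        if !$(esc(test_expr))      # esc(): resolve names in the caller's scope
            $(esc(branch_expr))
        end
    end
end

# Macro call
array = [1, 2, 'b']
@unless 3 in array println("array does not contain 3")
```

You can inspect what a macro call expands to with `@macroexpand`, e.g. `@macroexpand @unless 3 in array println("no 3")`.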
Note that the macro doesn't create a new scope, and variables declared or assigned
within the macro may collide with variables in the scope where the macro is actually
called.
While an updated, expanded and revised version of this chapter is available in "Chapter 6 -
Metaprogramming and Macros" of Antonello Lobianco (2019), "Julia Quick Syntax
Reference", Apress, this tutorial remains in active development.
10 - Performances (parallelisation, debugging, profiling..)
Julia is relatively fast when working with Any data, but when the JIT compiler is able to infer
the exact type of an object (or a Union of a few types) Julia runs at speeds of the same order of
magnitude as C.
As an example, here is how a typical loop-based function compares with the same function
written in other programming languages (to be fair: these other programming languages
can greatly improve the way they compute this function, e.g. using vectorised code):
Julia:
function f(n)
    s = 0
    for i = 1:n
        s += i/2
    end
    s
end
g++:
#include <iostream>
#include <chrono>
Non optimised: 2.48 seconds
Optimised (compiled with the -O3 switch): 0.83 seconds
Python:
start_time = time.time()
main()
print("--- %s seconds ---" % (time.time() - start_time))
Non optimised (without using numba and the @jit decorator): 98 seconds
Optimised (using the just in time compilation): 0.88 seconds
R:
f <- function(n){
# Start the clock!
ptm <- proc.time()
s <- 0
for (i in 1:n){
s <- s + (i/2)
}
print(s)
# Stop the clock
proc.time() - ptm
}
Non optimised: 287 seconds
Optimised (vectorised): the function returns an error (on my 8GB laptop), as too much memory
is required to build the arrays!
Human mind:
Of course the result is just n*(n+1)/4, so the best programming language is the human
mind... but compilers still perform pretty smart optimisations!
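The closed form follows from the sum of the first n integers being n*(n+1)/2, halved because each term is i/2. A quick check, using a variant of the loop above that starts the accumulator as a Float64 (this change and the name `closedForm` are mine, for illustration):

```julia
function f(n)
    s = 0.0               # Float64 accumulator, so the type of s never changes
    for i = 1:n
        s += i/2
    end
    s
end

closedForm(n) = n*(n+1)/4

n = 1000
println(f(n))             # prints 250250.0
println(closedForm(n))    # prints 250250.0
```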
Type annotation
In general (see above for exceptions) type annotation is not necessary. It is useful for
improving performance only in the few cases where the compiler can't determine the type.
Some tips to improve performance:
avoid global variables and run your performance-critical code within functions rather
than in the global scope;
annotate the inner type of a container, so it can be stored in memory contiguously;
annotate the fields of composite types (possibly using parametric types);
loop over matrices first by column and then by row, as Julia is column-major.
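The tips above can be sketched as follows. All type, variable and function names are illustrative only:

```julia
# Parametric composite type: the field types are concrete once T is known
struct Point{T<:Number}
    x::T
    y::T
end

# Container with an annotated inner type (its elements are stored contiguously)
vals = Float64[1.0, 2.0, 3.0]

# Performance-critical code inside a function, with a column-major traversal:
# the outer loop runs over the columns, the inner loop over the rows of each column
function sumByColumn(m::Matrix{Float64})
    s = 0.0
    for j in 1:size(m, 2)
        for i in 1:size(m, 1)
            s += m[i, j]
        end
    end
    return s
end

p = Point(1.5, 2.5)
println(sumByColumn([1.0 2.0; 3.0 4.0]))  # prints 10.0
```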
Code parallelisation
Julia provides core functionality to parallelise code using processes. These can even be on
different machines, where the connection is realised through SSH. Threads instead (that are
limited to the same machine but, contrary to processes, share the same memory) are available
through the Threads module, but it is much more difficult to "guarantee" safe multi-threaded
code than safe multi-process code.
This notebook shows how to use several functions to facilitate code parallelism:
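As a minimal sketch of process-based parallelism with the Distributed standard library (the number of workers and the function `square` are arbitrary choices for illustration):

```julia
using Distributed

addprocs(2)                       # start two local worker processes
@everywhere square(x) = x^2       # define the function on every process

squares = pmap(square, 1:10)      # each element is computed by one of the workers

total = @distributed (+) for i in 1:10   # parallel loop with a (+) reduction
    square(i)
end

println(squares)
println(total)                    # prints 385
rmprocs(workers())                # shut the workers down
```

`pmap` is better suited to a few expensive calls, while `@distributed` splits many cheap iterations across the workers.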
Debugging
A full debugger (both text-based and graphical) is now available in Julia. The base
functionality is provided by the JuliaInterpreter.jl package
(https://github.com/JuliaDebug/JuliaInterpreter.jl), while the user interface is provided by
the command-line packages Debugger.jl (https://github.com/JuliaDebug/Debugger.jl) and
Rebugger.jl (https://github.com/timholy/Rebugger.jl) or by the Juno IDE (https://junolab.org/)
itself: just type Juno.@enter myFunction(args) in Juno to start its graphical debugger.
Profiling
Profiling is the "art" of finding bottlenecks in the code.
A simple way to time a part of the code is to type @time myFunc(args) (but be sure
you have run that function at least once, or you will measure compile time rather than
run-time) or @benchmark myFunc(args) (from package BenchmarkTools ).
For more extensive coverage, Julia comes with an integrated statistical profiler, that is, it
samples the program every x milliseconds and records which line of code is executing at that
moment. Using this sampling method, at the cost of losing some precision, profiling can be very
efficient, with very small overheads compared to running the code normally.
Profile a function: Profile.@profile myfunct() (best after the function has already been
run once for JIT-compilation).
Print the profiling results: Profile.print() (number of samples in the corresponding line
and all downstream code; file name:line number; function name).
Explore a chart of the call graph with profiled data: ProfileView.view() (from package
ProfileView ).
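The timing and profiling steps above can be sketched as follows, using only the Profile standard library. The function `sumOfRoots` is a made-up workload:

```julia
using Profile

function sumOfRoots(n)
    s = 0.0
    for i in 1:n
        s += sqrt(i)
    end
    return s
end

sumOfRoots(10)                 # first call: triggers the JIT compilation
@time sumOfRoots(10^7)         # times the already compiled function

Profile.clear()                # discard samples from previous runs
Profile.@profile sumOfRoots(10^8)
Profile.print()                # text report of the collected samples
```

If the profiled call is too short, no samples may be collected and the report will be empty; in that case profile a longer run.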
While an updated, expanded and revised version of this chapter is available in "Chapter 8 -
Effectively Write Efficient Code" of Antonello Lobianco (2019), "Julia Quick Syntax
Reference", Apress, this tutorial remains in active development.
11 - Developing Julia packages
PkgDev.publish(pkg)
or (much better) use the package Registrator, which automates the workflow (after you
have installed Registrator on your GitHub repository, just create a new GitHub release in
order to spread it to the Julia package ecosystem).
It is good practice to document your own functions. You can use triple-quoted strings (""")
just before the function to document and use Markdown syntax in them. The Julia
documentation recommends that you insert a simplified version of the function signature,
together with an Arguments and an Examples section.
For example, this is the documentation string of the ods_readall function within the OdsIO
package:
"""
    ods_readall(filename; <keyword arguments>)
# Arguments
* `sheetsNames=[]`: the list of sheet names from which to import data.
* `sheetsPos=[]`: the list of sheet positions (starting from 1) from which to import data.
* `ranges=[]`: a list of pairs of tuples defining the ranges in each sheet from which to import data, in the format ((tlr,trc),(brr,brc))
* `innerType="Matrix"`: the type of the inner container returned. Either "Matrix", "Dict" or "DataFrame"
# Notes
* sheetsNames and sheetsPos can not be given together
* ranges is defined using integer positions for both rows and columns
* individual dictionaries or dataframes are keyed by the values of the cells in the first row specified in the range, or first row if `range` is not given
* innerType="Matrix", differently from innerType="Dict", preserves original column order, it is faster and requires less memory
* using innerType="DataFrame" also preserves original column order
# Examples
```julia
julia> outDic = ods_readall("spreadsheet.ods";sheetsPos=[1,3],ranges=[((1,1),(3,3)),((2,2),(6,4))], innerType="Dict")
Dict{Any,Any} with 2 entries:
  3 => Dict{Any,Any}(Pair{Any,Any}("c",Any[33.0,43.0,53.0,63.0]),Pair{Any,Any}("b",Any[32.0,42.0,52.0,62.0]),Pair{Any,Any}("d",Any[34.0,44.0,54.…
  1 => Dict{Any,Any}(Pair{Any,Any}("c",Any[23.0,33.0]),Pair{Any,Any}("b",Any[22.0,32.0]),Pair{Any,Any}("a",Any[21.0,31.0]))
```
"""
Plotting
Plotting in Julia can be obtained using a specific plotting package (e.g. Gadfly, Winston) or,
as I prefer, using the Plots package, which provides a unified API to several supported backends.
Backends are chosen by running chosenbackend() (that is, the name of the corresponding
backend package, but written all in lower case) before calling the plot function. You need
to install at least one backend before being able to use the Plots package. My preferred
one is PlotlyJS (a Julia interface to the plotly.js visualization library), but you may also be
interested in PyPlot (which uses the excellent Python matplotlib, version 2).
For example:
using Pkg
Pkg.add("Plots")
Pkg.add("PyPlot") # or Pkg.add("PlotlyJS")
using Plots
pyplot() # or plotlyjs()
plot(sin, -2pi, pi, label="sine function")
Be careful not to mix different plotting packages (e.g. Plots and one of its backends). I
had trouble with that. If you have already imported a plotting package and you want to use
another package, always restart the Julia kernel (this is not necessary, and it is one of the
advantages, when switching between different backends of the Plots package).
The first call to plot() creates a new plot. Calling plot!() instead modifies the plot that is
passed as first argument (if none, the latest plot is modified).
Saving
To save the figure just call one of the following:
savefig("fruits_plot.svg")
savefig("fruits_plot.pdf")
savefig("fruits_plot.png")
While an updated, expanded and revised version of this chapter is available in "Chapter 9 -
Working with Data" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress,
this tutorial remains in active development.
DataFrames
Dataframes
Julia has a library to handle tabular data, in a way similar to R or Pandas dataframes. The
name is, no surprises, DataFrames. The approach and the function names are similar,
although the way of actually accessing the API may be a bit different.
For complex analysis, DataFramesMeta adds some helper macros.
Documentation:
DataFrames: http://juliadata.github.io/DataFrames.jl/stable/,
https://en.wikibooks.org/wiki/Introducing_Julia/DataFrames
DataFramesMeta: https://github.com/JuliaStats/DataFramesMeta.jl
Stats in Julia in general: http://juliastats.github.io/
using CSV
supplytable = CSV.read(IOBuffer("""
prod Epinal Bordeaux Grenoble
Fuelwood 400 700 800
Sawnwood 800 1600 1800
Pannels 200 300 300
"""), delim=" ", ignorerepeated=true, copycols=true)
If the top rows of a column, which are used for type auto-recognition, contain only missing
values, but subsequent rows contain non-missing values, an error may appear. The trick is to
manually specify the column type with the types parameter (a Vector or a Dictionary).
If you need to edit the values of your imported dataframe, do not forget the copycols=true
option.
df = DataFrame(
colour = ["green","blue","white","green","green"],
shape = ["circle", "triangle", "square","square","circle"],
border = ["dotted", "line", "line", "line", "dotted"],
area = [1.1, 2.3, 3.1, missing, 5.2])
last(df, 6)
describe(df)
ENV["LINES"] = 60 # change the default number of lines before the content is truncated
Column names are Julia symbols. Hence, to programmatically compose a column name you need
to use the Symbol(String) constructor, e.g.:
Symbol("value_",0)
Referencing is obtained using the exclamation mark for the row position (to emphasize that
referenced data could be changed in the new object) or using the dot syntax:
myObjWithReferencedData = df[!,[cNames]] or myObjWithReferencedData = df.cName .
Copying uses instead the colon syntax: myObjWithCopiedData = df[:,[cName(s)]] .
Filter by value, based on a field being in a list of values, using boolean selection through
list comprehension: df[ [i in ["blue","green"] for i in df.colour], :]
Combined boolean selection: df[([i in ["blue","green"] for i in df.colour] .> 0) .&
(df.shape .== "triangle"), :] (the dot is needed to vectorize the operation. Note the
wrap it using the cols() function, e.g. col = Symbol("x"); @where(df, cols(col) .> 2)
Change a single value by filtering columns: df[ (df.product .== "hardWSawnW") .&
(df.year .== 2010) , :consumption] .= 200 (note the dot in .= , needed as the selection may
return more than one row)
Edit data
Replace values based on a dictionary: mydf.col1 = map(akey->myDict[akey], mydf.col1)
(the original data to replace can be in a different column or a totally different DataFrame)
Concatenate (string) values of several columns to create the values of a new column: df.c
= df.a .* " " .* df.b
To compute the value of a column based on other columns you need to use elementwise
operations using the dot, e.g. df.a = df.b .* df.c (note that the equal sign doesn't
have the dot, but if you have to make a comparison, the == operator also wants the
dot, i.e. .== )
Append a row: push!(df, [1 2 3])
Delete a given row: use deleterows!(df,rowIdx) or just copy a df without the rows that
are not needed, e.g. df2 = df[[1:(i-1);(i+1):end],:]
Empty a dataframe: df = similar(df,0)
Edit structure
Delete columns by name: select!(df, Not([:col1, :col2]))
Rename columns: names!(df, [:c1,:c2,:c3]) (all) or rename!(df, Dict(:c1 => :newCol))
(a selection)
Change column order: df = df[:,[:b, :a]]
Add an "id" column (useful for unstacking): df.id = 1:size(df, 1) # this makes it
easier to unstack
Add a Float64 column (all filled with missing by default): df.a =
Array{Union{Missing,Float64},1}(missing,size(df,1))
Add a column based on values of other columns: df.c = df.a .+ df.b (as an alternative,
use map: df.c = map((x,y) -> x + y, df.a, df.b) )
Insert a column at a position i: insert!(df, i, [colContent], :colName)
Convert columns:
from Int to Float: df.A = convert(Array{Float64,1},df.A)
from Float to Int: df.A = convert(Array{Int64,1},df.A)
from Any to T (including String, if the individual elements are already strings): df.A
= convert(Array{T,1},df.A)
You can "pool" specific columns in order to efficiently store repeated categorical
variables with categorical!(df, [:A, :B]) . Note that while the memory usage decreases,
filtering with categorical values is not quicker (indeed it is a bit slower). You can go back
to normal arrays with collect(df.A) .
Merge/Join/Copy datasets
Concatenate different dataframes (with same structure): df = vcat(df1,df2,df3) or df
= vcat([df1,df2,df3]...) (note the three dots at the end, i.e. the splat operator).
dropmissing!(df) (in both its versions, with or without exclamation mark) and
completecases(df) select only rows without missing values. The first returns the
skimmed DataFrame , while the second returns a boolean array, and you can also specify
on which columns you want to limit the application of this filter:
completecases(df[[:col1,:col2]]) . You can then get the df with df2 =
df[completecases(df[[:col1,:col2]]),:]
Within an operation (e.g. a sum) you can use skipmissing() in order to skip missing
values before the operation takes place.
Remove missing values on all string and numeric columns: [df[ismissing.(df[!,i]), i]
To make comparisons (e.g. for boolean selection or within the @where macro in
DataFramesMeta ) where missing values could be present you can use isequal.(a,b) , which
always returns true or false, while a .== b propagates the missing values.
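The behaviour of missing values in reductions and comparisons can be checked on plain vectors, without any package (the vectors `a` and `b` are illustrative):

```julia
a = [1, missing, 3]
b = [1, missing, 4]

println(sum(a))                # prints missing: the missing value propagates
println(sum(skipmissing(a)))   # prints 4

println(a .== b)               # the second element is missing
println(isequal.(a, b))        # only true/false elements
```

The same calls work on DataFrame columns, e.g. `sum(skipmissing(df.area))`.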
Split-Apply-Combine strategy
The DataFrames package supports the Split-Apply-Combine strategy through the by
function, which takes three arguments: (1) a DataFrame, (2) a column (or columns) to split
the DataFrame on, and (3) a function or expression to apply to each subset of the
DataFrame.
The function can return a value, a vector, or a DataFrame. For a value or vector, these are
merged into a column along with the cols keys. For a DataFrame, cols are combined along
columns with the resulting DataFrame. Returning a DataFrame is the clearest because it
allows column labelling.
The by function can also take the function as first argument, so as to allow the usage of do
blocks. Internally, it uses the groupby() function, as in the code it is defined as nothing else than:
Aggregate
Aggregate by several fields:
Note that all categorical fields have to be included in the list of fields over which to
aggregate, otherwise Julia will try to compute a sum also over them (but, them being
strings, it will raise an error) instead of just ignoring them.
The workaround is to remove the fields you don't want before doing the operation.
by(df, [:catfield1,:catfield2]) do df
DataFrame(m = sum(df.valueField))
end
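In more recent DataFrames releases (to my knowledge from version 1.0 onward) the by function is no longer available, and the same split-apply-combine step is written with groupby and combine. A minimal sketch, assuming such a release is installed and using made-up data with the column names of the example above:

```julia
using DataFrames

df = DataFrame(catfield1  = ["a", "a", "b", "b"],
               catfield2  = ["x", "x", "x", "y"],
               valueField = [1, 2, 3, 4])

# Split by the two categorical fields, then sum valueField within each group
agg = combine(groupby(df, [:catfield1, :catfield2]), :valueField => sum => :m)
println(agg)
```

The `source => function => target` pair names the column to read, the function to apply to each group and the name of the resulting column.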
df = DataFrame(region=["US","US","US","US","EU","EU","EU","EU"],
year = [2010,2011,2012,2013,2010,2011,2012,2013],
value=[3,3,2,2,2,2,1,1])
df.cumValue = copy(df.value)
[r.cumValue = df[(df.region .== r.region) .& (df.year .== (r.year-1)),:cumValue][1] + r.value for r in eachrow(df) if r.year != minimum(df.year)]
Pivot
Stack
Move columns to rows of a "variable" column, i.e. moving from wide to long format.
For stack(df,[cols]) you have to specify the column(s) that have to be stacked, for
melt(df,[cols]) , on the contrary, you specify the other columns, that represent the id
variables.
df = DataFrame(region = ["US","US","US","US","EU","EU","EU","EU"],
               product  = ["apple","apple","banana","banana","apple","apple","banana","banana"],
year = [2010,2011,2010,2011,2010,2011,2010,2011],
produced = [3.3,3.2,2.3,2.1,2.7,2.8,1.5,1.3],
consumed = [4.3,7.4,2.5,9.8,3.2,4.3,6.5,3.0])
long1 = stack(df,[:produced,:consumed])
long2 = melt(df,[:region,:product,:year])
long3 = stack(df)
long1 == long2 == long3 # true
Unstack
You can specify the dataframe, the name of the column whose content will become the row index
(id variable), the name of the column whose content will become the names of the columns
(column variable names) and the name of the column containing the values that will be placed
in the new table (column values):
widedf = unstack(longdf, [:ids], :variable, :value)
Alternatively you can omit the :id parameter and all the existing columns except the one
defining column names and the one defining column values will be preserved as index (row)
variables:
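A minimal round trip from wide to long and back, assuming the DataFrames package is installed and using a small made-up table, could look like this (here the id columns are inferred by unstack):

```julia
using DataFrames

df = DataFrame(region   = ["US", "US", "EU", "EU"],
               year     = [2010, 2011, 2010, 2011],
               produced = [3.3, 3.2, 2.7, 2.8],
               consumed = [4.3, 7.4, 3.2, 4.3])

longdf = stack(df, [:produced, :consumed])    # wide -> long
widedf = unstack(longdf, :variable, :value)   # long -> wide, id columns inferred
println(widedf)
```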
Sorting
sort!(df, cols = (:col1, :col2), rev = (false, false)) The (optional) reverse order
parameter (rev) must be a tuple of the same size as the cols parameter
Use LAJuliaUtils.jl
You can use (my own utility module) LAJuliaUtils.jl in order to Pivot and optionally filter
and sort in a single function in a spreadsheet-like Pivot Tables fashion. See the relevant
section.
Export to CSV
CSV.write("file.csv", df, delim = ';', header = true) (from package CSV )
Export to Dict
This exports to a dictionary where the keys are the unique elements of a df column and the
values are the split dataframes:
vars = Dict{String,DataFrame}()
[vars[x] = @where(df, :varName .== x) for x in unique(df.varName)]
[select!(vars[k], Not([:varName])) for k in keys(vars)]
To use hdf5 with the HDF5 package, some systems may require system-wide hdf5 binaries,
e.g. in Ubuntu Linux sudo apt-get install hdf5-tools.
h5write("out.h5", "mygroup/myDf", convert(Array, df[:,[list_of_cols]]))
The HDF5 package doesn't yet directly support dataframes, so you first need to export them
as a Matrix (a further limitation is that it doesn't accept a matrix of Any type, so you may want
to export a DataFrame in two pieces, the string and the numeric columns separately). You
can read back the data with data = h5read("out.h5", "mygroup/myDf") .
While an updated, expanded and revised version of this chapter is available in "Chapter 9 -
Working with Data" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress,
this tutorial remains in active development.
JuMP
JuMP is an algebraic modelling language for mathematical optimisation problems, similar
to GAMS, AMPL or Pyomo.
It is solver-independent. It also supports non-linear solvers, providing them with the Gradient
and the Hessian.
Note: The notebook has been updated to the latest JuMP 0.20
While an updated, expanded and revised version of this chapter is available in "Chapter 10 -
Mathematical Libraries" of Antonello Lobianco (2019), "Julia Quick Syntax Reference",
Apress, this tutorial remains in active development.
SymPy
SymPy
SymPy is a wrapper to the Python SymPy library for symbolic computation: solve
equations (or systems of equations), simplify them, find derivatives or integrals...
http://nbviewer.jupyter.org/github/sylvaticus/juliatutorial/blob/master/assets/Symbolic computation.ipynb
You can plot a function that includes symbols, e.g.: plot(2x,0,1) plots y=2x in the
[0,1] range
For the infinity symbol use either oo or Inf (possibly with + or -)
While an updated, expanded and revised version of this chapter is available in "Chapter 10 -
Mathematical Libraries" of Antonello Lobianco (2019), "Julia Quick Syntax Reference",
Apress, this tutorial remains in active development.
Weave
Weave allows you to produce dynamic documents where the script that produces the output is
embedded directly in the document, optionally with only the output rendered.
Save the document below in a file with extension jmd (e.g. testWeave.jmd)
---
title : Test of a document with embedded Julia code and citations
date : 5th September 2018
bibliography: biblio.bib
---
This is a strong affirmation that needs a citation [see @Lecocq:2011, pp. 33-35; @Caurla:2013b, ch. 1].
## Subsection 1.1
This should print a plot. Note that I am not showing the source code in the final PDF:
```{julia;echo=false}
using Plots
pyplot()
plot(sin, -2pi, pi, label="sine function")
Here instead I will put in the PDF both the script source code and the output:
using DataFrames
df = DataFrame(
colour = ["green","blue","white","green","green"],
shape = ["circle", "triangle", "square","square","circle"],
border = ["dotted", "line", "line", "line", "dotted"],
area = [1.1, 2.3, 3.1, missing, 5.2]
)
df
Note also that I can refer to variables defined in previous chunks (or "cells", following Jupyter
terminology):
df[:colour]
Subsubsection
For a much more complete example see the Weave documentation.
References
```
You can then "compile" the document (from within Julia) with:
In Ubuntu Linux (but most likely also in other systems), Weave needs pandoc and LaTeX
( texlive-xetex ) already installed in the system.
If you use Ubuntu, the version of pandoc in the official repositories is too old. Use instead
the deb available at https://github.com/jgm/pandoc/releases/latest .
While an updated, expanded and revised version of this chapter is available in "Chapter 11 -
Utilities" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress, this tutorial
remains in active development.
LAJuliaUtils
LAJuliaUtils is my personal repository for utility functions, mainly for dataframes.
of type(s) colsType
pivot(df::AbstractDataFrame, rowFields, colField, valuesField; <kwd args>) - Pivot
string
value)
rowFields : the field(s) to be used as row categories (also known as IDs or keys)
admissible values]
sort : optional row field(s) to sort
While an updated, expanded and revised version of this chapter is available in "Chapter 9 -
Working with Data" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress,
this tutorial remains in active development.
IndexedTables
IndexedTables are DataFrame-like data structures that, working with tuples and dictionaries,
are in my experience much faster at performing select operations.
There are two types of IndexedTables, table and ndsparse . The main difference from a
user point of view is that the former is looked up by position, while the latter can be looked up
by stored values (and hence is, at least for me, more useful):
# table constructor..
myTable = table(
    (param=param,item=item,region=region2,value00=value2000,value10=value2010);
    pkey = [:param, :item, :region]
)
# ndsparse constructor.. note the two separate NamedTuples for keys and values..
mySparseTable = ndsparse(
    (param=param,item=item,region=region),
    (value00=value2000,value10=value2010)
)
# Query data..
myTable[3]
mySparseTable["price",:,:] # ":" lets you select all values for the specific dimension
While an updated, expanded and revised version of this chapter is available in "Chapter 9 -
Working with Data" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress,
this tutorial remains in active development.
Pipe
The Pipe package improves on the pipe operator |> provided in Julia Base.
Chaining (or "piping") allows you to string together multiple function calls in a way that is
at the same time compact and readable. It avoids saving intermediate results without having to
embed function calls within one another.
With the chain operator |> the code to the right of |> operates on the result from
the code to the left of it. In practice, what is on the left becomes the argument of the function
call(s) that is on the right.
Chaining is very useful in data manipulation. Let's assume that you want the following
(silly) functions to operate one after the other on some data and print the final result:
add6(a) = a+6; div4(a) = 4/a;
You could either introduce temporary variables or embed the function calls:
a = 2; b = add6(a); c = div4(b); println(c) # 0.5
println(div4(add6(a)))                      # 0.5
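The same computation written with the pipe operator from Base, as a small sketch (the two functions are repeated here so that the block runs on its own):

```julia
add6(a) = a + 6
div4(a) = 4 / a

a = 2
result = a |> add6 |> div4    # same as div4(add6(a))
result |> println             # prints 0.5
```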
Pipes in Base are very limited, in the sense that they support only functions with one argument
and only a single function at a time.
Conversely, the Pipe package, with its @pipe macro, overrides the |> operator,
allowing you to use functions with multiple arguments (there you can use the
underscore character " _ " as placeholder for the value on the LHS) and multiple functions,
e.g.:
using Pipe
addX(a,x) = a+x; divY(a,y) = a/y
@pipe a |> addX(_,6) + divY(4,_) |> println # 10.0
Note that, as in the basic pipe, functions that require a single argument, when this is provided
by the piped data, don't need parentheses.
While an updated, expanded and revised version of this chapter is available in "Chapter 11 -
Utilities" of Antonello Lobianco (2019), "Julia Quick Syntax Reference", Apress, this tutorial
remains in active development.