Mastering Julia - Sample Chapter
Develop your analytical and programming skills further in Julia to solve complex data processing problems
Malcolm Sherrington
Preface
Julia is a relatively young programming language. The initial design work on the
Julia project began at MIT in August 2009, and by February 2012, it had become open
source. It is largely the work of three developers: Stefan Karpinski, Jeff Bezanson,
and Viral Shah. These three, together with Alan Edelman, remain actively
committed to Julia, and MIT currently hosts a variety of courses in Julia, many
of which are available over the Internet.
Initially, Julia was envisaged by its designers as a scientific language fast enough
to remove the need to model in an interactive language and then redevelop in a
compiled language such as C or Fortran. At that time, the major scientific languages
were proprietary ones such as MATLAB and Mathematica, and were relatively slow.
There were clones of these languages in the open source domain, such as GNU Octave
and Scilab, but these were even slower. When it launched, the community saw Julia
as a replacement for MATLAB, but this is not exactly the case. Although the syntax
of Julia is similar to MATLAB, so much so that anyone competent in MATLAB can
easily learn Julia, it was not designed as a clone. It is a more feature-rich language
with many significant differences that will be discussed in depth later.
The period since 2009 has seen the rise of two new computing disciplines:
big data/cloud computing, and data science. Big data processing on Hadoop
is conventionally seen as the realm of Java programming, since Hadoop runs
on the Java virtual machine. It is, of course, possible to process big data with
programming languages other than Java-based ones by utilizing the streaming-jar
paradigm, and Julia can be used in this way, just as C++, C#, and Python can.
The emergence of data science heralded the use of programming languages that were
simple for analysts with some programming skills but who were not principally
programmers. The two languages that stepped up to fill the breach have been R and
Python. Both of these are relatively old, with their origins back in the 1990s. However,
the popularity of these two has grown rapidly, ironically from around the time
when Julia was introduced to the world. Even so, against such esteemed and staid
opposition, Julia has excited the scientific programming community and continues to
make inroads in this space.
The aim of this book is to cover all aspects of Julia that make it appealing to the
data scientist. The language is evolving quickly. Binary distributions are available
for Windows, Mac OS X, and Linux, but these will lag behind the current sources. So,
to do some serious work with Julia, it is important to understand how to obtain and
build a running system from source. In addition, there are interactive development
environments available for Julia, and the book will discuss both the Jupyter and
Juno IDEs.
Introduction
Julia was first released to the world in February 2012 after a couple of years of
development at the Massachusetts Institute of Technology (MIT).
All the principal developers, Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan
Edelman, still maintain active roles in the language; they are responsible for the core,
but have also authored and contributed to many of the packages.
The language is open source, so all of it is available to view. There is a small amount of
C/C++ code plus some Lisp and Scheme, but much of the core is (very well) written in
Julia itself and may be perused at your leisure. If you wish to write exemplary Julia
code, this is a good place to go in order to seek inspiration. Towards the end of this
chapter, we will have a quick run-down of the Julia source tree as part of exploring
the Julia environment.
Philosophy
Julia was designed with scientific computing in mind. The developers all tell us that
they came with a wide array of programming skills: Lisp, Python, Ruby, R, and
MATLAB. Some, like myself, even claim to originate as Perl hackers. However, all
needed a fast compiled language such as C or Fortran in their armory, as the
languages listed previously are pitifully slow.
So, to quote the development team:
"We want a language that's open source, with a liberal license. We want the speed
of C with the dynamism of Ruby. We want a language that's homoiconic, with true
macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We
want something as usable for general programming as Python, as easy for statistics
as R, as natural for string processing as Perl, as powerful for linear algebra as
Matlab, as good at gluing programs together as the shell. Something that is dirt
simple to learn, yet keeps the most serious hackers happy. We want it interactive
and we want it compiled.
(Did we mention it should be as fast as C?)"
http://julialang.org/blog/2012/02/why-we-created-julia
With the introduction of compilation via the Low-Level Virtual Machine (LLVM), it has
become possible to achieve this goal and to design, from the outset, a language that
makes the two-language approach largely redundant.

Julia was designed as a language similar to other scripting languages and so should
be easy to learn for anyone familiar with Python, R, or MATLAB. It is syntactically
closest to MATLAB, but it is important to note that it is not a drop-in clone. There are
many important differences, which we will look at later.
[Benchmark table: relative execution times (C = 1.0) for the fib, mandel, pi_sum, rand_mat_stat, and rand_mat_mul micro-benchmarks, comparing Julia, Python, R, MATLAB, Octave, Mathematica, JavaScript, and Go.]
This table is useful in another respect too, as it lists all the major languages with which
Julia is compared. There are no real surprises here, except perhaps the range of
execution times.
Python: This has become the de facto data science language, and the range
of modules available is overwhelming. Both version 2 and version 3 are in
common usage; the latter is NOT a superset of the former and is around 10%
slower. In general, Julia is an order of magnitude faster than Python, except
where the established Python code is compiled or rewritten in C.
JavaScript and Go: These are linked together since they both use the Google
V8 engine. V8 compiles to native machine code before executing it, hence the
excellent performance timings, but both languages are targeted more at web-based
applications.
So, Julia would seem to be an ideal language for tackling data science problems. It's
important to recognize that many of the built-in functions in R and Python are not
implemented natively but are written in C. Julia performs roughly as well as C, so
Julia won't do any better than R or Python if most of the work you do in R or Python
calls built-in functions without performing any explicit iteration or recursion.
However, when you start doing custom work, Julia will come into its own. It is
the perfect language for advanced users of R or Python, who are trying to build
advanced tools inside of these languages. The alternative to Julia is typically
resorting to C; R offers this through Rcpp, and Python offers it through Cython.
There is the possibility of more cooperation than competition between Julia and R
and/or Python, although this is not the common view.
Features
The Julia programming language is free and open source (MIT licensed), and the
source is available on GitHub.
To the veteran programmer, it looks and feels similar to MATLAB. Blocks
created by the for, while, and if statements are all terminated by end rather than
by endfor, endwhile, and endif, or by using the familiar {} style syntax. However,
it is not a MATLAB clone, and sources written for MATLAB will not run on Julia.
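As a quick illustration of this block syntax (a minimal sketch, not one of the book's examples), the following snippet sums the odd numbers from 1 to 5:

total = 0
for i in 1:5          # for block
    if isodd(i)       # if block, nested inside
        total += i
    end               # closes the if
end                   # closes the for
println(total)        # prints 9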
Julia's core is implemented in C and C++, and its parser in Scheme; the LLVM
compiler framework is used for the JIT generation of machine code.
The standard library is written in Julia itself by using Node.js's libuv library for
efficient, cross-platform I/O.
Julia has a rich language of types for constructing and describing objects, which can
also optionally be used to make type declarations. It has the ability to define function
behavior across many combinations of argument types via multiple dispatch,
which is a key cornerstone of the language's design.
Julia can utilize code in other programming languages by directly calling routines
written in C or Fortran and stored in shared libraries or DLLs. This is a feature of the
language syntax and will be discussed in detail later.
In addition, it is possible to interact with Python via PyCall and this is used in the
implementation of the IJulia programming environment.
Getting started
Starting to program in Julia is very easy. The first place to look is the main Julia
language website: http://julialang.org. This is not cluttered with graphics; there is just
the Julia logo, some useful major links to other parts of the site, and a quick sampler on
the home page.
The site also links to the documentation and the package system, all of which we will
be discussing later. Moreover, the documentation can be downloaded as a PDF file, a
zipped file of HTML pages, or an ePub file.
Julia sources
At present, we will be looking at the download link. This provides links to 32-bit and
64-bit distros for Windows, Mac OS X, CentOS, and Ubuntu; both the stable release
and the nightly development snapshot. So, a majority of the users getting started
require nothing more than a download and a standard installation procedure.
For Windows, this is by running the downloaded .exe file, which will extract Julia
into a folder. Inside this folder is a batch file julia.bat, which can be used to start
the Julia console.
For Mac OS X, the users need to click on the downloaded .dmg file to run the disk
image and drag the app icon into the Applications folder. On Mac OS X, you will be
prompted to continue as the source has been downloaded from the Internet and so is
not considered secure.
Similarly, uninstallation is a simple process. In Windows, delete the julia folder,
and in Mac OS X, delete Julia.app. To do a "clean" uninstall, it is also necessary to
tidy up a few hidden files/folders, and we will consider this after talking about the
package system.
For Ubuntu (Linux), it's a little more involved, as you need to add a reference to a
Personal Package Archive (PPA) to your system. You will need root privileges to
execute the following commands:
sudo add-apt-repository ppa:staticfloat/juliareleases
sudo add-apt-repository ppa:staticfloat/julia-deps
sudo apt-get update
sudo apt-get install julia
The releases are provided by Elliot Saba, and there is a separate PPA for the nightly
snapshots: ppa:staticfloat/julianightlies.
It is only necessary to add the PPA once, so for updates, all you need to do is execute the
following command:
sudo apt-get update
The link is again available from the julialang.org downloads page. Julia uses
GitHub as a repository for its source distribution as well as for various Julia
packages. We will look at installing on CentOS, which is the community edition
of Red Hat and is widely used.
Installing on CentOS
CentOS can be downloaded as an ISO image from http://www.centos.org and
written to a DVD. It can be installed as a replacement for an existing Windows
system or to run alongside Windows as a dual-booted configuration.
CentOS does not come with the git command as standard, so upon installation,
the first task will be to install it. For this and other installation processes, we use
the yum command (Yellowdog Updater, Modified (YUM)).
You will need root/superuser privileges, so typically, you would type su -:
su -
(type root password)
yum update
yum install git
Yum will fetch the Git sources from a Red Hat repository, list what needs to be
installed, and prompt you to press Y/N to continue.
Once you have installed Git, we will need to grab the Julia sources from GitHub by
using the following command:
git clone git://github.com/JuliaLang/julia.git
This will produce a subfolder at the current location called julia with all the sources
and documentation.
To build, Julia requires development tools that are not normally present in a
standard CentOS distribution, particularly GCC, g++, and gfortran.
These can be installed as follows:
sudo yum install gcc
sudo yum install gcc-c++
sudo yum install gcc-gfortran
Other tools (which are usually present) such as GNU Make, Perl, and patch are
needed; a yum groupinstall of the development tools should take care of these too if
they are not present. We did find that an installation on Fedora 19 failed because the
m4 macro processor was not found, but again, yum install m4 was all that was
required, and the process could be resumed from where it failed.
So, to proceed with the build, we change into the cloned julia folder and issue the
make command. Note that for seasoned Linux open source builders, there is no need
for a configuration step. All the prerequisites are assumed to be in place (or else the
make fails), and the executable is created in the julia folder, so there is no make
install step.
The build process can take considerable time and produces a number of warnings on
individual source files, but when it has finished, it produces a file called julia in the
build folder. This is a symbolic link to the actual executable in the usr/bin folder.
So, typically, if all the tools are in place, the process may look like this:
[malcolm@localhost] cd ~
[malcolm@localhost] mkdir Build
[malcolm@localhost] cd Build
[malcolm@localhost Build] git clone git://github.com/JuliaLang/julia.git
[malcolm@localhost Build] cd julia
[malcolm@localhost julia] make
After the build:
[malcolm@localhost julia] ls -l julia
lrwxrwxrwx 1 malcolm malcolm 39 Jun 10 09:11 julia -> /home/malcolm/Build/julia/usr/bin/julia
If you have (or create) a bin folder just under the home folder, it is worth recreating
the link there as it will be automatically appended to the path.
[malcolm@localhost] cd ~/bin
[malcolm@localhost bin] ln -s /home/malcolm/Build/julia/usr/bin/julia julia
To test out the installation (assuming julia is on your path), use the following
command:
[malcolm@localhost] julia -q
The -q switch on the julia command suppresses the printing of the Julia banner.
julia> println("I've just installed Julia")
I've just installed Julia
The julia> prompt indicates the normal command mode. It is worth noting that
there are a couple of other modes that can be used at the console: help (?) and
shell (;).
For example:
julia> ?print
Base.print(x)
Write (to the default output stream) a canonical (un-decorated)
text representation of a value if there is one, otherwise call
"show". The representation used by "print" includes minimal
formatting and tries to avoid Julia-specific details.
julia> ;ls
asian-ascplot.jl  asian-winplot.jl  asian.jl  asian.m  asian.o  asian.r
run-asian.jl      time-asian.jl
We will be looking at an example of Julia code in the next section, but if you want to
be a little more adventurous, try typing in the following at the julia> prompt:
sumsq(x,y) = x^2 + y^2;

N = 1000000; x = 0;
for i = 1:N
    if sumsq(rand(), rand()) < 1.0
        x += 1;
    end
end
@printf "Estimate of PI for %d trials is %8.5f\n" N 4.0*(x / N);
On Mac OS X, you need to use a 64-bit gfortran compiler to build Julia. This can be
downloaded from HPC - Mac OS X on SourceForge http://hpc.sourceforge.net.
In order to work correctly, HPC gfortran requires HPC GCC to be installed as well.
From OS X 10.7, Clang is now used by default to build Julia, and Xcode version 5 or
later should be used. The minimum version of Clang needed to build Julia is v3.1.
Building under Windows is tricky and will not be covered here. It uses the
Minimalist GNU for Windows (MinGW) distribution, and there are many caveats.
If you wish to try it out, there is a comprehensive guide on the Julia site.
The source tree contains the following top-level folders:
base
contrib
deps
doc
etc
examples
src - the C/C++, Lisp, and Scheme files to build the Julia kernel
test
ui
To gain some insight into Julia coding, the best folders to look at are base, examples,
and test:
1. The base folder contains a great portion of the standard library, and the
coding style is exemplary.
2. The test folder has some code that illustrates how to write test scripts and
use the Base.Test system.
3. The examples folder gives Julia's take on some well-known old computing
chestnuts such as the Queens Problem, word counts, and the Game of Life.
If you have built Julia from source, you will have all these folders available in the
cloned julia folder; the build process creates a new folder tree starting
with usr, and the executable is in the usr/bin folder.
Juno
Juno is an IDE that is bundled for the stable distributions on the Julia website.
There are different versions for most popular operating systems.
It requires unzipping into a subfolder and putting the Juno executable on the
run-search path, so it is one of the easiest ways to get started on a variety of platforms.
It uses Light Table, so unlike IJulia (explained in the following section), it does not
need a helper task (viz. Python) to be present.
The driver is the Jewel.jl package, which is a collection of IDE-related code and is
responsible for communication with Light Table. The IDE has a built-in workspace
and navigator. Opening a folder in the workspace will display all the files via the
navigator.
Juno handles things such as the following:
Extensible autocomplete
Evaluation of code blocks with the correct file, line, and module data
Juno's basic job is to transform expressions into values, which it does on pressing
Ctrl + Enter (Cmd + Enter on Mac OS X) inside a block. The code block is evaluated
as a whole, rather than line by line, and the final result is returned.
By default, the result is "collapsed"; it is necessary to click on the bold text to toggle
the content of the result. Graphs, from Winston and Gadfly say, are displayed inline
within Juno, not in a separate window.
IJulia
IJulia is a backend interface to the Julia language, which uses the IPython interactive
environment. It is now part of Jupyter, a project to port the agnostic parts of IPython
for use with other programming languages.
This combination allows you to interact with the Julia language by using IPython's
powerful graphical notebook, which combines code, formatted text, math support,
and multimedia in a single document.
You need version 1.0 or later of IPython. Note that IPython 1.0 was released in
August 2013, so the version of Python required is 2.7, and the version pre-packaged
with your operating-system distribution may be too old to run it. If so, you may have
to install IPython manually.
On Mac OS X and Windows systems, the easiest way is to use the Anaconda Python
installer. After installing Anaconda, use the conda command to install IPython:
conda update conda
conda update ipython
On Ubuntu, we use the apt-get command, and it's a good idea to install matplotlib
(for graphics) plus a cocktail of other useful modules:
sudo apt-get install python-matplotlib python-scipy python-pandas python-sympy python-nose
IPython is available on Fedora (v18+) but not yet on CentOS (v6.5) although this
should be resolved with CentOS 7. Installation is via yum as follows:
sudo yum install python-matplotlib scipy python-pandas sympy python-nose
The IPython notebook interface runs in your web browser and provides a rich
multimedia environment. Furthermore, it is possible to produce some graphic
output via Python's matplotlib by using a Julia to Python interface. This requires
installation of the IJulia package.
Start IJulia from the command line by typing ipython notebook --profile julia,
which opens a window in your browser.
This can be used as a console interface to Julia; using the PyPlot package is also a
convenient way to plot some curves.
[Screenshot: an IJulia notebook session plotting the damped cosine curve exp(-0.1x)*cos(2.0x) with PyPlot.]
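A curve of this kind can be produced with just a few lines; this is a minimal sketch (using the v0.3-era vectorized syntax and assuming the PyPlot package has been added with Pkg.add("PyPlot")):

using PyPlot
x = linspace(0, 4pi, 200);
y = exp(-0.1x).*cos(2.0x);     # damped cosine, as in the notebook screenshot
plot(x, y)
title("exp(-0.1x)*cos(2.0x)")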
In order to set up the contract, the beneficiary must pay an agreed fee to the grantor.
The beneficiary's liability is therefore limited by this fee, while the grantor's liability
is unlimited. The following question arises: How can we arrive at a price that is fair
to both the grantor and the beneficiary? The price will be dependent on a number of
factors such as the price that the beneficiary wishes to pay, the time to exercise the
option, the rate of inflation, and the volatility of the stock.
Options characteristically exist in one of two forms: call options and put options.
A call option gives the beneficiary the right to require the grantor to sell the
stock to him/her at the agreed price upon exercise, and a put option gives the
beneficiary the right to require the grantor to buy the stock at the agreed price on
exercise. The problem of determining the option price was largely solved in
the 1970s by Fischer Black and Myron Scholes, who produced a formula for the price
after treating the stock movement as random (Brownian) and making a number of
simplifying assumptions.
We are going to look at the example of an Asian option, which is one for which there
can be no formula. This is a type of option (sometimes termed an average value
option) where the payoff is determined by the average underlying price over some
preset period of time up to exercise rather than just the final price at that time.
So, to solve this type of problem, we must simulate the possible paths (often called
random walks) for the stock by generating these paths using random numbers. We
have seen a simple use of random numbers earlier while estimating the value of
Pi. Our problem is that the accuracy of the result typically improves only with the
square root of the number of trials, so obtaining an extra significant figure needs a
hundred times more work. For our example, we are going to do 100000 simulations,
each of 100 steps, representing daily movements over a period of around 3 months.
For each simulation, we determine at the end whether, based on the average price
of the stock, there would be a positive gain for a call option or a negative one for a
put option, in which case we are "in the money" and would exercise the option.
By averaging all the cases where there is a gain, we can arrive at a fair price.
The code that we need to do this is relatively short and needs no special features
other than simple coding.
# Euler and Milstein discretization for Black-Scholes.
function run_asian(N = 100000, PutCall = 'C')

    # Option features.
    println("Setting option parameters");
    S0  = 100;       # Spot price
    K   = 100;       # Strike price
    r   = 0.05;      # Risk-free rate
    q   = 0.0;       # Dividend yield
    v   = 0.2;       # Volatility
    tma = 0.25;      # Time to maturity

    Averaging = 'A'; # 'A'rithmetic or 'G'eometric

    # Simulation settings.
    T  = 100;        # Number of time steps
    dt = tma/T;      # Time increment

    # Simulate the stock price under the Euler and Milstein schemes.
    # Take the average of the stock price along each path.
    println("Looping $N times.");
    S = zeros(Float64,N,T);
    S[:,1] = S0;
    A = zeros(Float64,N);
    for n = 1:N
        for t = 2:T
            dW = (randn(1)[1])*sqrt(dt);
            z0 = (r - q - 0.5*v*v)*S[n,t-1]*dt;
            z1 = v*S[n,t-1]*dW;
            z2 = 0.5*v*v*S[n,t-1]*dW*dW;
            S[n,t] = S[n,t-1] + z0 + z1 + z2;
        end
        if Averaging == 'A'
            A[n] = mean(S[n,:]);
        elseif Averaging == 'G'
            A[n] = exp(mean(log(S[n,:])));
        end
    end

    # Payoff for each path; the option is only exercised when it is "in the money".
    P = zeros(Float64,N);
    if PutCall == 'C'
        P = max(A - K, 0);
    elseif PutCall == 'P'
        P = max(K - A, 0);
    end

    # The fair price is the mean payoff over all the simulated paths.
    @printf "Option Price: %10.4f\n" mean(P);
end
We have wrapped the main body of the code in a function: run_asian(N, PutCall)
... end. The reason for this is to be able to time the execution of the task in
Julia, thereby eliminating the startup time associated with the Julia runtime when
using the console.
The stochastic behavior of the stock is modeled by the randn function; randn(N)
provides an array of N elements, normally distributed with zero mean and unit
variance.
All the work is done in the inner loop; the z-variables are just written to decompose
the calculation.
To store the averages for each path, use the zeros function to allocate and initialise
the array.
The option would only be exercised if the average value of the stock is above the
"agreed" (strike) price. This is called the payoff and is stored for each run in the array P.
It is possible to use arithmetic or geometric averaging. The code sets this as
arithmetic, but it could be parameterized.
The final price is set by applying the mean function to the P array. This is an example
of vectorized coding.
So, to run this simulation, start the Julia console and load the script as follows:
julia> include("asian.jl")
julia> run_asian()
To get an estimate of the time taken to execute this command, we can use the
tic()/toc() functions or the @elapsed macro:
include("asian.jl")
tic(); run_asian(1000000, 'C'); toc();
Option Price:     1.6788
elapsed time: 1.277005471 seconds
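Alternatively, the @elapsed macro returns the timing (in seconds) as a value; a minimal sketch:

@elapsed run_asian(1000000, 'C')   # returns the elapsed time as a Float64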
If we are not interested in the execution times, there are a couple of ways in which
we can proceed.
The first is just to append to the code a single line calling the function as follows:
run_asian(1000000, 'C')
Then, we can run the Asian option from the command prompt by simply typing the
following: julia asian.jl.
This is pretty inflexible, since we would like to pass different values for the number of
trials N and to determine the price for either a call option or a put option.
Julia provides an ARGS array to hold the passed arguments when a script is started
from the command line. So, we add the following code at the end of asian.jl:
nArgs = length(ARGS)
if nArgs >= 2
    run_asian(int(ARGS[1]), ARGS[2][1])
elseif nArgs == 1
    run_asian(int(ARGS[1]))
else
    run_asian()
end
Julia variables are case-sensitive, so we must use ARGS (uppercase) to pass the
arguments.
Because we have specified default values in the function definition, this will run
from the command line or if loaded into the console.
Arguments to Julia are passed as strings, so we convert them explicitly: int() for the
number of trials (N) and the first character of the second argument for the PutCall
flag; note that we are not doing any checking on what is passed.
Let's now display a single simulated path, using the same parameters as before. Since
we are doing a single walk, there is no need for an outer loop or for accumulating the
price estimates and averaging them to produce an option price.
By compacting the inner loop, we can write this as follows:
using ASCIIPlots;

S0  = 100;     # Spot price
K   = 102;     # Strike price
r   = 0.05;    # Risk-free rate
q   = 0.0;     # Dividend yield
v   = 0.2;     # Volatility
tma = 0.25;    # Time to maturity

T  = 100;      # Number of time steps
dt = tma/T;    # Time increment

S = zeros(Float64,T);
S[1] = S0;
dW = randn(T)*sqrt(dt);
[ S[t] = S[t-1] * (1 + (r - q - 0.5*v*v)*dt + v*dW[t] + 0.5*v*v*dW[t]*dW[t]) for t = 2:T ]
x = linspace(1,T);
scatterplot(x,S,sym='*');
Note that when adding a package, the Pkg.add() statement takes the package name as
a string ("ASCIIPlots"), whereas the using statement, which loads the package, does not.
The same path can be displayed rather better by using the Winston graphics package:

using Winston;

S0  = 100;     # Spot price
K   = 102;     # Strike price
r   = 0.05;    # Risk-free rate
q   = 0.0;     # Dividend yield
v   = 0.2;     # Volatility
tma = 0.25;    # Time to maturity

T  = 100;      # Number of time steps
dt = tma/T;    # Time increment

S = zeros(Float64,T);
S[1] = S0;
dW = randn(T)*sqrt(dt);
[ S[t] = S[t-1] * (1 + (r - q - 0.5*v*v)*dt + v*dW[t] + 0.5*v*v*dW[t]*dW[t]) for t = 2:T ]

x = linspace(1, T, T);
p = FramedPlot(title = "Random Walk, drift 5%, volatility 2%")
add(p, Curve(x,S,color="red"))
display(p)
My benchmarks
We compared the Asian option code above with similar implementations in the
"usual" data science languages discussed earlier.
The point of these benchmarks is to compare the performance of specific algorithms
for each language implementation. The code used is available to download.
Language      Timing (C = 1)    Asian option price
C             1.0               1.681
Julia         1.41              1.680
Python (v3)   32.67             1.671
R             154.3             1.646
Octave        789.3             1.632
The runs were executed on a Samsung RV711 laptop with an i5 processor and 4GB
RAM running CentOS 6.5 (Final).
Package management
We have noted that Julia uses Git as a repository for itself and for its packages, and
that the installation has a built-in package manager, so there is no need to interface
directly with GitHub. This repository is located in the Git folder of the installed system.
As a full discussion of the package system is given on the Julia website, we will only
cover some of the main commands to use.
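For instance, a registered package is installed by passing its name (as a string) to Pkg.add(); a minimal sketch, using the ASCIIPlots package from earlier:

julia> Pkg.add("ASCIIPlots")     # fetch a registered package and its dependencies
julia> Pkg.status()              # list the installed packages and their versions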
The latest versions of all installed packages can be updated with the
Pkg.update() command.
Notice that if the repository does not exist, the first use of a package command
such as Pkg.update() or Pkg.add() will call Pkg.init() to create it:
julia> Pkg.update()
INFO: Updating METADATA...
INFO: Computing changes...
INFO: No packages to install, update or remove.
julia> Pkg.rm("ASCIIPlots")
INFO: Removing ASCIIPlots
INFO: REQUIRE updated.
Additional packages:
 - Color      0.2.10
 - HTTPClient 0.1.0
 - IniFile    0.2.2
 - LibCURL    0.1.3
 - LibExpat   0.0.4
 - Tk         0.2.12
 - URIParser  0.0.2
 - URLParse   0.0.0
 - WinRPM     0.0.13
 - Zlib       0.1.5
Even with an old relatively untouched package, there is nothing to stop you checking
out the code and modifying or building on it. Any enhancements or modifications
can be applied and the code returned; that's how open source grows. Furthermore,
the principal author is likely to be delighted that someone else is finding the package
useful and taking an interest in the work.
It is not possible to create a specific taxonomy of Julia packages but certain groupings
emerge, which build on the backs of the earlier ones. We will be meeting many of
these later in this book, but before that, it may be useful to quickly list a few.
Data visualization
Graphics support in Julia has sometimes been given less than favorable press in
comparison with other languages such as Python, R, and MATLAB. It is a stated aim
of the developers to incorporate some degree of graphics support in the core, but at
present, this is largely the realm of package developers.
While it was true that v0.1.x offered very limited and flaky graphics, v0.2.x vastly
improved the situation and this continues with v0.3.x.
Firstly, there is a module in the core called Base.Graphics, which acts as an abstract
layer to packages such as Cairo and Tk/Gtk, which serve to implement much of the
required functionality.
Layered on top of these are a couple of packages, namely Winston (which we have
introduced already) and Gadfly. Normally, as a user, you will probably work with
one or the other of these.
Winston is a 2D graphics package that provides methods for curve plotting and
creating histograms and scatter diagrams. Axis labels and display titles can be
added, and the resulting display can be saved to files as well as shown on the screen.
Gadfly is a system for plotting and visualization equivalent to the ggplot2 module
in R. It can be used to render the graphics output to PNG, PostScript, PDF, and SVG
files. Gadfly works best with the following C libraries installed: cairo, pango, and
fontconfig. The PNG, PS, and PDF backends all require cairo, but without it, it is
still possible to create displays to SVG and Javascript/D3.
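As a minimal sketch (assuming Gadfly has been added with Pkg.add("Gadfly")), a random walk similar to the one shown earlier can be plotted and rendered to an SVG file as follows:

using Gadfly
walk = cumsum(randn(100));                 # a simple random series to plot
p = plot(x = 1:100, y = walk, Geom.line)
draw(SVG("walk.svg", 6inch, 4inch), p)     # render the plot to an SVG file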
There are a couple of different approaches which are worthy of note: Gaston
and PyPlot.
Gaston is an interface to the gnuplot program on Linux. You need to check whether
gnuplot is available, and if not, it must be installed in the usual way via yum or
apt-get. On OS X, you also need to install XQuartz, which must be started separately
before using Gaston.
Gaston can do whatever gnuplot is capable of. There is a very comprehensive
demonstration script available in the package, run by calling Gaston.demo().
We discussed PyPlot briefly before when looking at IJulia. The package uses
Julia's PyCall package to call the matplotlib Python module directly and can
display plots in any Julia graphical backend, including, as we have seen, inline
graphics in IJulia.
Any registered packages will be listed by the package manager or in Julia Studio.
However, it is possible to use an unregistered package by using Pkg.clone(url),
where url is a Git URL from which the package can be cloned. The package
should have the src and test folders and may have several others. If it contains
a REQUIRE file at the top of the source tree, that file can be used to determine any
dependent registered packages; these packages will be automatically installed.
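For example (a hypothetical URL, assuming the repository follows the standard package layout):

julia> Pkg.clone("git://github.com/MyUser/MyPackage.jl.git")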
If you are developing a package, it is possible to place the source in the .julia
folder alongside packages added with Pkg.add() or Pkg.clone(). Eventually,
you will wish to use GitHub in a more formal way; we will deal with that later
when considering package implementation.
Parallel processing
As a language aimed at the scientific community, it is natural that Julia should
provide facilities for executing code in parallel. In running tasks on multiple
processors, Julia takes a different approach to the popular message passing interface
(MPI). In Julia, communication is one-sided and appears to the programmer more as
a function call than the traditional message send and receive paradigm typified by
pyMPI on Python and Rmpi on R.
Julia provides two in-built primitives: remote references and remote calls. A remote
reference is an object that can be used from any processor to refer to an object stored
on a particular processor. A remote call is a request by one processor to call a certain
function with certain arguments on another, or possibly the same, processor.
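As a minimal sketch of these primitives (using the v0.3-era signatures; in recent Julia versions they live in the Distributed standard library and the argument order of remotecall has changed), and assuming Julia was started with additional worker processes, for example julia -p 2:

r = remotecall(2, rand, 2, 2)   # remote call: ask worker 2 to build a 2x2 random matrix
s = @spawnat 2 1 .+ fetch(r)    # run more code on worker 2, using the remote reference r
fetch(s)                        # fetch the result back to the master process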
Sending messages and moving data constitute most of the overhead in a parallel
program, and reducing the number of messages and the amount of data sent is
critical to achieving performance and scalability. We will be investigating how
Julia tackles this in a subsequent chapter.
Multiple dispatch
The choice of which method to execute when a function is applied is called dispatch.
Single dispatch polymorphism is a familiar feature in object-orientated languages
where a method call is dynamically executed on the basis of the actual derived type
of the object on which the method has been called.
Multiple dispatch is an extension of this paradigm where dispatch occurs by using
all of a function's arguments rather than just the first.
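A minimal sketch (the names here are illustrative, not from the standard library): the method that runs is chosen by the types of all the arguments.

area(r::Float64) = pi*r^2                 # one Float64 argument: a circle
area(w::Float64, h::Float64) = w*h        # two Float64 arguments: a rectangle
area(w::Int, h::Int) = w*h                # two Int arguments: a different method

area(2.0)        # dispatches on the single-Float64 method
area(3.0, 4.0)   # dispatches on the two-Float64 method
area(3, 4)       # dispatches on the Int method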
Homoiconic macros
Julia, like Lisp, represents its own code in memory by using a user-accessible data
structure, thereby allowing programmers to both manipulate and generate code that
the system can evaluate. This makes complex code generation and transformation far
simpler than in systems without this feature.
We met an example of a macro earlier in @printf, which mimics the C-like printf
statements. Its definition is given in the base/printf.jl file.
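A minimal sketch of a hand-rolled macro (illustrative only): the expression is received unevaluated as a data structure and spliced into new code that times it.

macro timeit(ex)
    quote
        t0 = time()
        val = $(esc(ex))              # splice the user's expression into the generated code
        println("elapsed: ", time() - t0, " seconds")
        val
    end
end

@timeit sum(rand(10^6))   # prints the elapsed time and returns the sum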
Interlanguage cooperation
We noted that Julia is often seen as a competitor to languages such as C, Python,
and R, but this is not the view of the language designers and developers.
Julia makes it simple to call C and Fortran functions that are compiled and saved
as shared libraries. This is done by using the in-built ccall function. This means that
there is no need for the traditional "wrapper code" approach, which acts on the
function inputs, transforms them into an appropriate form, loads the shared library,
makes the call, and then repeats the process in reverse on the return value. Julia's JIT
compilation generates the same low-level machine instructions as a native C call, so
there is no additional overhead in making the function call from Julia.
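A minimal sketch, assuming a Unix-like system whose C library exposes the clock() routine:

t = ccall((:clock, "libc"), Int32, ())   # call clock() from the shared C library
println("clock ticks: ", t)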
Additionally, we have seen that the PyCall package makes it easy to import Python
libraries, and this has been seen to be an effective method of displaying graphics
from Python's matplotlib. Further, inter-cooperation with Python is evident in the
provision of the IJulia IDE as an adjunct to IPython notebooks.
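A minimal sketch, assuming PyCall has been installed with Pkg.add("PyCall") (the @pyimport macro shown here is the v0.3-era style):

using PyCall
@pyimport math                 # import Python's math module
println(math.cos(pi/4))        # call a Python function from Julia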
There is also work on calling R libraries from Julia by using the Rif package and
calling Java programs from within Julia by using the JavaCall package. These
packages present the exciting prospect of opening up Julia to a wealth of existing
functionalities in a straightforward and elegant fashion.
Summary
This chapter introduced you to Julia, how to download it, install it, and build it from
source. We saw that the language is elegant, concise, and powerful. The next three
chapters will discuss the features of Julia in more depth.
We looked at interacting with Julia via the command line (REPL) in order to use a
random walk method to evaluate the price of an Asian option. We also discussed
the use of two interactive development environments (IDEs), Juno and IJulia, as an
alternative to REPL.
In addition, we reviewed the in-built package manager and how to add, update,
and remove modules, and then demonstrated the use of two graphics packages to
display the typical trajectories of the Asian option calculation. In the next chapter,
we will look at various other approaches to creating display graphics and quality
visualizations.