Research Software Engineering
with Python
Building software that makes research
possible
Damien Irving
Kate Hertweck
Luke Johnston
Joel Ostblom
Charlotte Wickham
Greg Wilson
First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2022 Damien Irving, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte Wickham, and
Greg Wilson
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their
use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003143482
Publisher's note: This book has been prepared from camera-ready copy provided by the authors.
To David Flanders
who taught me so much about growing and sustaining coding communities.
— Damien
To Joshua.
— Charlotte
To Brent Gorda
without whom none of this would have happened.
— Greg
All royalties from this book are being donated to The Carpentries,
an organization that teaches foundational coding and data science skills
to researchers worldwide.
Contents
Welcome
0.1 The Big Picture
0.2 Intended Audience
0.3 What You Will Learn
0.4 Using this Book
0.5 Contributing and Re-Use
0.6 Acknowledgments
1 Getting Started
1.1 Project Structure
1.2 Downloading the Data
1.3 Installing the Software
1.4 Summary
1.5 Exercises
1.6 Key Points
2.9 Summary
2.10 Exercises
2.11 Key Points
15 Finale
15.1 Why We Wrote This Book
Appendix
A Solutions
H YAML
I Anaconda
I.1 Package Management with conda
I.2 Environment Management with conda
J Glossary
K References
Index
Welcome
• Open science: Making data, methods, and results freely available to all
by publishing them under open licenses.
rather than to replace it. Sustainability isn’t just a property of the software:
it also depends on the skills and culture of its users.
People often conflate these three ideas, but they are distinct. For example, if
you share your data and the programs that analyze it, but don’t document
what steps to take in what order, your work is open but not reproducible.
Conversely, if you completely automate your analysis, but your data is only
available to people in your lab, your work is reproducible but not open.
Finally, if a software package is being maintained by a couple of post-docs who
are being paid a fraction of what they could earn in industry and have no
realistic hope of promotion because their field doesn’t value tool building, then
sooner or later it will become abandonware, at which point openness and
reproducibility become less relevant.
Nobody argues that research should be irreproducible or unsustainable, but
“not against it” and actively supporting it are very different things. Academia
doesn’t yet know how to reward people for writing useful software, so while
you may be thanked, the effort you put in may not translate into academic
job security or decent pay.
Some people worry that if they make their data and code publicly available,
someone else will use it and publish a result they could have come up with
themselves. This is almost unheard of in practice, but that doesn’t stop it being
used as a scare tactic. Other people are afraid of looking foolish or incompetent
by sharing code that might contain bugs. This isn’t just impostor syndrome:
members of marginalized groups are frequently judged more harshly than
others, so being wrong in public is much riskier for them.
With this course, we hope to give researchers the tools and knowledge to be
better research software developers, to be more efficient in their work, to make
fewer mistakes, and to work more openly and reproducibly. We hope that by having
more researchers with these skills and knowledge, research culture can improve
to address the issues raised above.
This book is written for researchers who are already using Python for their
data analysis, but who want to take their coding and software development
to the next level. You don’t have to be highly proficient with Python, but
you should already be comfortable doing things like reading data from files
and writing loops, conditionals, and functions. The following personas are
examples of the types of people that are our target audience.
Amira Khan completed a master’s in library science five years ago and has
since worked for a small aid organization. She did some statistics during
her degree, and has learned some R and Python by doing data science
courses online, but has no formal training in programming. Amira would
like to tidy up the scripts, datasets, and reports she has created in order
to share them with her colleagues. These lessons will show her how to do
this.
Jun Hsu completed an Insight Data Science fellowship last year after doing
a PhD in geology and now works for a company that does forensic audits.
He uses a variety of machine learning and visualization packages, and
would now like to turn some of his own work into an open source project.
This book will show him how such a project should be organized and how
to encourage people to contribute to it.
Sami Virtanen became a competent programmer during a bachelor’s degree
in applied math and was then hired by the university’s research computing
center. The kinds of applications they are being asked to support have
shifted from fluid dynamics to data analysis; this guide will teach them
how to build and run data pipelines so that they can pass those skills on
to their users.
Rather than simply providing reference material about good coding practices,
the book follows Amira and Sami as they work together to write an actual
software package to address a real research question. The data analysis task
that we focus on relates to a fascinating result in the field of quantitative
linguistics. Zipf’s Law states that the second most common word in a body
of text appears half as often as the most common, the third most common
appears a third as often, and so on. To test whether Zipf’s Law holds for a
collection of classic novels that are freely available from Project Gutenberg,
we write a software package that counts and analyzes the word frequency
distribution in an arbitrary body of text.
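To make the prediction concrete, here is a minimal sketch (ours, not the package developed later in the book) that counts the words in a string and sets each observed count beside the count Zipf’s Law would predict from the most common word. The function name zipf_table, the simple word pattern, and the toy text are all assumptions made for illustration.

```python
from collections import Counter
import re


def zipf_table(text, top=3):
    """Return (word, observed count, predicted count) for the top words.

    The prediction for rank r is the most common word's count divided by r.
    """
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top)
    if not ranked:
        return []
    highest = ranked[0][1]
    return [
        (word, count, highest / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


# A toy text constructed so that the counts are exactly 6, 3, and 2.
sample = "the " * 6 + "of " * 3 + "and " * 2
for word, observed, predicted in zipf_table(sample):
    print(word, observed, predicted)
```

Real novels will not match the prediction this neatly; the point of the analysis is to measure how close they come.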
In the process of writing and publishing a Python package to verify Zipf’s
Law, we will show you how to do the following:
This book was written to be used as the material for a (potentially) semester-
long course at the university level, although it can also be used for independent
self-study. Participatory live-coding is the anticipated style for teaching the
material, rather than lectures simply talking about the code presented (N. C.
C. Brown and Wilson 2018; Wilson 2019a). The chapters and their content
are generally designed to be used in the order given.
Chapters are structured with the introduction at the start, content in the
middle, and exercises at the end. Callout boxes are interspersed throughout
the content to be used as a supplement to the main text, but not a requirement
for the course overall. Early chapters have many small exercises; later chapters
have fewer but larger exercises. In order to break up long periods of live-
coding while teaching, it may be preferable to stop and complete some of
the exercises at key points throughout the chapter, rather than waiting until
the end. Possible exercise solutions are provided (Appendix A), in addition
to learning objectives (Appendix B) and key points (Appendix C) for each
chapter.
The source for the book can be found at the py-rse GitHub repository4
and any corrections, additions, or contributions are very welcome. Everyone
whose work is included will be credited in the acknowledgments. Check out our
4 https://github.com/merely-useful/py-rse
0.6 Acknowledgments
This book owes its existence to everyone we met through The Carpentries10.
We are also grateful to Insight Data Science11 for sponsoring the early stages of
this work, to the authors of Noble (2009), Haddock and Dunn (2010), Wilson
et al. (2014), Scopatz and Huff (2015), Taschuk and Wilson (2017), Wilson et
al. (2017), N. C. C. Brown and Wilson (2018), Devenyi et al. (2018), Sholler
et al. (2019), Wilson (2019b), and to everyone who has contributed, including
Madeleine Bonsma-Fisher, Jonathan Dursi, Christina Koch, Sara Mahallati,
Brandeis Marshall, and Elizabeth Wickes.
• Many of the explanations and exercises in Chapters 2–4 have been adapted
from Software Carpentry’s lesson The Unix Shell12.
• Many of the explanations and exercises in Chapters 6 and 7 have been
adapted from Software Carpentry’s lesson Version Control with Git13 and
an adaptation/extension of that lesson14 that is maintained by the
University of Wisconsin-Madison Data Science Hub.
• Chapter 9 is based on Software Carpentry’s lesson Automation and Make15
and on Jonathan Dursi’s Introduction to Pattern Rules16.
• Chapter 14 is based in part on Python 102 17 by Ashwin Srinath.
5 https://github.com/merely-useful/py-rse/blob/book/CONTRIBUTING.md
6 https://github.com/merely-useful/py-rse/blob/book/CONDUCT.md
7 https://github.com/merely-useful/py-rse/blob/book/LICENSE.md
8 https://creativecommons.org/licenses/by/4.0/
9 https://github.com/merely-useful/py-rse/blob/book/LICENSE-MIT.md
10 https://carpentries.org/
11 https://www.insightdatascience.com/
12 http://swcarpentry.github.io/shell-novice/
13 http://swcarpentry.github.io/git-novice/
14 https://uw-madison-datascience.github.io/git-novice-custom/
15 http://swcarpentry.github.io/make-novice/
16 https://github.com/ljdursi/make_pattern_rules
17 https://python-102.readthedocs.io/
1
Getting Started
As with many research projects, the first step in our Zipf’s Law analysis is
to download the research data and install the required software. Before doing
that, it’s worth taking a moment to think about how we are going to organize
everything. We will soon have a number of books from Project Gutenberg1
in the form of a series of text files, plots we’ve produced showing the word
frequency distribution in each book, as well as the code we’ve written to
produce those plots and to document and release our software package. If we
aren’t organized from the start, things could get messy later on.
Project organization is like a diet: everyone has one, it’s just a question of
whether it’s healthy or not. In the case of a project, “healthy” means that
people can find what they need and do what they want without becoming
frustrated. This depends on how well organized the project is and how familiar
people are with that style of organization.
As with good coding style, small pieces in predictable places with readable
names are easier to find and use than large chunks that vary from project to
project and have names like “stuff.” We can be messy while we are working
and then tidy up later, but experience teaches that we will be more productive
if we make tidiness a habit.
In building the Zipf’s Law project, we’ll follow a widely used template for
organizing small and medium-sized data analysis projects (Noble 2009). The
project will live in a directory called zipf, which will also be a Git repository
1 https://www.gutenberg.org/
zipf/
    .gitignore
    CITATION.md
    CONDUCT.md
    CONTRIBUTING.md
    LICENSE.md
    README.md
    Makefile
    bin
        book_summary.sh
        collate.py
        countwords.py
        ...
    data
        README.md
        dracula.txt
        frankenstein.txt
        ...
    docs
        ...
    results
        collated.csv
        dracula.csv
        dracula.png
        ...
    ...
Our project will contain a few standard files that should be present in every
research software project, open source or otherwise:
Some projects also include a CONTRIBUTORS or AUTHORS file that lists everyone
who has contributed to the project, while others include that information in
the README (we do this in Chapter 7) or make it a section in CITATION. These
files are often called boilerplate, meaning they are copied without change
from one use to the next.
Following Noble (2009), the directories in the repository’s root are organized
according to purpose:
This structure works well for many computational research projects and we
encourage its use beyond just this book. We will add some more folders and
files not directly addressed by Noble (2009) when we talk about testing
(Chapter 11), provenance (Chapter 13), and packaging (Chapter 14).
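The layout shown above can also be set up programmatically. The following is a minimal sketch, assuming we only want the empty directories and placeholder boilerplate files; the helper name make_skeleton is ours, and the skeleton is built in a throwaway temporary directory so that nothing real is touched.

```python
from pathlib import Path
import tempfile

# Names follow the layout shown above; the helper itself is hypothetical.
SUBDIRECTORIES = ["bin", "data", "docs", "results"]
BOILERPLATE = ["README.md", "LICENSE.md", "CITATION.md",
               "CONDUCT.md", "CONTRIBUTING.md"]


def make_skeleton(root):
    """Create an empty project skeleton under root and return its entries."""
    root = Path(root)
    for name in SUBDIRECTORIES:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in BOILERPLATE:
        (root / name).touch()
    return sorted(entry.name for entry in root.iterdir())


# Build the skeleton in a scratch directory that is deleted afterwards.
with tempfile.TemporaryDirectory() as scratch:
    created = make_skeleton(Path(scratch) / "zipf")
    print(created)
```

In practice we will create these files and directories by hand as the project grows, which is why the book introduces them one at a time.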
https://doi.org/10.6084/m9.figshare.13040516
We can download a zip file containing the data files by clicking “download all”
at this URL and then unzipping the contents into a new zipf/data directory
(also called a folder) that follows the project structure described above. Here’s
how things look once we’re done:
zipf/
    data
        README.md
        dracula.txt
        frankenstein.txt
        jane_eyre.txt
        moby_dick.txt
        sense_and_sensibility.txt
        sherlock_holmes.txt
        time_machine.txt
1. A Bash shell
2. Git version control
3. A text editor
4. Python 3 (via the Anaconda distribution)
5. GNU Make
• Linux (Debian/Ubuntu): Install it from the Bash shell using sudo apt-get
install make.
• Mac: Install Xcode5 (via the App Store).
• Windows: Follow the installation instructions6 maintained by the Master of
Data Science program at the University of British Columbia.
Software Versions
Throughout the book, we’ll be showing you examples of the output you
can expect to see. This output is derived from running a Mac with:
Git version 2.29.2, Python version 3.7.6, GNU bash version 3.2.57(1)-
release (x86_64-apple-darwin19), GNU Make 3.81, and conda 4.9.2. In
some cases, what you see printed to the screen may differ slightly based
on software version. We’ll help you understand how to interpret the
output so you can keep working and troubleshoot regardless of software
version.
1.4 Summary
Now that our project structure is set up, our data is downloaded, and our
software is installed, we are ready to start our analysis.
5 https://developer.apple.com/xcode/
6 https://ubc-mds.github.io/resources_pages/install_ds_stack_windows/#make
1.5 Exercises
Make sure you’ve downloaded the required data files (following Section 1.2)
and installed the required software (following Section 1.3) before progressing
to the next chapter.
• Make tidiness a habit, rather than cleaning up your project files later.
• Include a few standard files in all your projects, such as README,
LICENSE, CONTRIBUTING, CONDUCT and CITATION.
• Put runnable code in a bin/ directory.
• Put raw/original data in a data/ directory and never modify it.
• Put results in a results/ directory. This includes cleaned-up data and
figures (i.e., everything created using what’s in bin and data).
• Put documentation and manuscripts in a docs/ directory.
• Refer to The Carpentries software installation guide7 if you’re having
trouble.
7 https://carpentries.github.io/workshop-template/#setup
2
The Basics of the Unix Shell
Ninety percent of most magic merely consists of knowing one extra fact.
— Terry Pratchett
Computers do four basic things: store data, run programs, talk with each
other, and interact with people. They do the interacting in many different
ways, of which graphical user interfaces (GUIs) are the most widely used.
The computer displays icons to show our files and programs, and we tell it to
copy or run those by clicking with a mouse. GUIs are easy to learn but hard
to automate, and don’t create a record of what we did.
In contrast, when we use a command-line interface (CLI) we communicate
with the computer by typing commands, and the computer responds by
displaying text. CLIs existed long before GUIs; they have survived because they
are efficient, easy to automate, and automatically record what we have done.
The heart of every CLI is a read-evaluate-print loop (REPL). When we
type a command and press Return (also called Enter) the CLI reads the
command, evaluates it (i.e., executes it), prints the command’s output, and
loops around to wait for another command. If you have used an interactive
console for Python, you have already used a simple CLI.
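The read-evaluate-print cycle is simple enough to sketch in a few lines of Python. This toy version is only an illustration of the idea: it reads from a list instead of the keyboard, and the two commands it knows (echo and add) are made up for the example rather than being real programs.

```python
def repl(lines, commands):
    """Read each line, evaluate it, print the result, and loop."""
    transcript = []
    for line in lines:                          # read
        pieces = line.split()
        if not pieces:
            continue                            # nothing was typed
        name, args = pieces[0], pieces[1:]
        if name in commands:
            result = commands[name](*args)      # evaluate
        else:
            result = f"{name}: command not found"
        print(result)                           # print
        transcript.append(result)
    return transcript                           # loop ends when input runs out


# Two made-up commands stand in for real programs.
toy_commands = {
    "echo": lambda *words: " ".join(words),
    "add": lambda a, b: str(int(a) + int(b)),
}
history = repl(["echo hello shell", "add 1 2", "nosuchcommand"], toy_commands)
```

A real shell does far more than this (it finds programs on disk, runs them as separate processes, and manages their input and output), but the loop at its core has the same shape.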
This lesson introduces another CLI that lets us interact with our computer’s
operating system. It is called a “command shell,” or just shell for short, and
in essence is a program that runs other programs on our behalf (Figure 2.1).
Those “other programs” can do things as simple as telling us the time or as
complex as modeling global climate change; as long as they obey a few simple
rules, the shell can run them without having to know what language they are
written in or how they do what they do.
What’s in a Name?
Programmers have written many different shells over the last forty
years, just as they have created many different text editors and plotting
packages. The most popular shell today is called Bash (an acronym of
Bourne Again SHell, and a weak pun on the name of its predecessor,
the Bourne shell). Other shells may differ from Bash in minor ways, but
the core commands and ideas remain the same. In particular, the most
recent versions of MacOS use a shell called the Z Shell or zsh; we will
point out a few differences as we go along.
Please see Section 1.3 for instructions on how to install and launch the shell
on your computer.
However, different shells may use a different symbol: in particular, the zsh
shell, which is the default on newer versions of MacOS, uses %. As we’ll see in
Section 4.6, we can customize the prompt to give us more information.
2.1 Exploring Files and Directories
Let’s run a command to find out who the shell thinks we are:
$ whoami
amira
Learn by Doing
Amira is one of the learners described in Section 0.2. For the rest of
the book, we’ll present code and examples from her perspective. You
should follow along on your own computer, though what you see might
deviate in small ways because of differences in operating system (and
because your name probably isn’t Amira).
Now that we know who we are, we can explore where we are and what we
have. The part of the operating system that manages files and directories (also
called folders) is called the filesystem. Some of the most commonly used
commands in the shell create, inspect, rename, and delete files and directories.
Let’s start exploring them by running the command pwd, which stands for
print working directory. The “print” part of its name is straightforward; the
“working directory” part refers to the fact that the shell keeps track of our
current working directory at all times. Most commands read and write
files in the current working directory unless we tell them to do something else,
so knowing where we are before running a command is important.
$ pwd
/Users/amira
Slashes
The / character means two different things in a path. At the front of
a path or on its own, it refers to the root directory. When it appears
inside a name, it is a separator. Windows uses backslashes (\) instead
of forward slashes as separators.
Underneath /Users, we find one directory for each user with an account on
this machine. Jun’s files are stored in /Users/jun, Sami’s in /Users/sami, and
Amira’s in /Users/amira. This is where the name “home directory” comes
from: when we first log in, the shell puts us in the directory that holds our
files.
Now that we know where we are, let’s see what we have using the command
ls (short for “listing”), which prints the names of the files and directories in
the current directory:
$ ls
Again, our results may be different depending on our operating system and
what files or directories we have.
We can make the output of ls more informative using the -F option (also
sometimes called a switch or a flag). Options are exactly like arguments to
a function in Python; in this case, -F tells ls to decorate its output to show
what things are. A trailing / indicates a directory, while a trailing * tells us
something is a runnable program. Depending on our setup, the shell might
also use colors to indicate whether each entry is a file or directory.
$ ls -F
Here, we can see that almost everything in our home directory is a
subdirectory; the only thing that isn’t is a file called todo.txt.
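To see what this decoration involves, here is a rough Python sketch of the two markers described above. It is an approximation for illustration, not how ls is implemented: the function name classify is ours, the script run.sh is a hypothetical file, and real ls has further markers that are not covered here.

```python
import os
import tempfile
from pathlib import Path


def classify(path):
    """Return the entry's name with the marker ls -F would append."""
    path = Path(path)
    if path.is_dir():
        return path.name + "/"      # directories get a trailing slash
    if os.access(path, os.X_OK):
        return path.name + "*"      # runnable programs get a star
    return path.name                # ordinary files are left alone


with tempfile.TemporaryDirectory() as scratch:
    home = Path(scratch)
    (home / "zipf").mkdir()
    (home / "todo.txt").write_text("finish chapter 2\n")
    script = home / "run.sh"        # hypothetical script, not from the book
    script.write_text("#!/bin/sh\necho hi\n")
    script.chmod(0o755)             # mark it as executable
    listing = sorted(classify(entry) for entry in home.iterdir())
    print(listing)
```

The sketch assumes a Unix-like system, where executability is recorded in a file's permissions.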
Spaces Matter
1+2 and 1 + 2 mean the same thing in mathematics, but ls -F and
ls-F are very different things in the shell. The shell splits whatever
we type into pieces based on spaces, so if we forget to separate ls and
-F with at least one space, the shell will try to find a program called
ls-F and (quite sensibly) give an error message like ls-F: command
not found.
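Python’s standard shlex module splits a string in much the same way a POSIX shell does, so we can use it to see the pieces the shell would work with. This is a small illustration of the splitting step only; the variable names are ours.

```python
import shlex

with_space = shlex.split("ls -F")        # a program name and an option
extra_spaces = shlex.split("ls    -F")   # extra spaces make no difference
no_space = shlex.split("ls-F")           # a single piece: a program "ls-F"

print(with_space)
print(extra_spaces)
print(no_space)
```

The first two calls produce two pieces each, while the third produces one, which is why the shell goes looking for a program with that exact name.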
Some options tell a command how to behave, but others tell it what to act on.
For example, if we want to see what’s in the /Users directory, we can type:
$ ls /Users
We often call the file and directory names that we give to commands
arguments to distinguish them from the built-in options. We can combine options
and arguments:
$ ls -F /Users
but we must put the options (like -F) before the names of any files or
directories we want to work on, because once the command encounters something
that isn’t an option it assumes there aren’t any more:
$ ls /Users -F
$ ls /Users -F
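The rule that options stop at the first thing that isn’t an option can be seen in Python’s standard getopt module, which follows the same traditional convention. This is an illustration of the convention only, not a description of how ls itself is written, and some tools (notably many GNU programs on Linux) are more forgiving about ordering.

```python
import getopt

# The string "F" declares a single option, -F, that takes no value.
options, names = getopt.getopt(["-F", "/Users"], "F")
print(options, names)

# With the order reversed, parsing stops at /Users, so -F is treated
# as just another name rather than as an option.
options_late, names_late = getopt.getopt(["/Users", "-F"], "F")
print(options_late, names_late)
```

In the second call the list of recognized options comes back empty, which mirrors a command trying to treat -F as the name of a file or directory.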
$ ls -F
If we want to see what’s in the zipf directory we can ask ls to list its contents:
$ ls -F zipf
data/
Notice that zipf doesn’t have a leading slash before its name. This absence
tells the shell that it is a relative path, i.e., that it identifies something
starting from our current working directory. In contrast, a path like /Users/amira
is an absolute path: it is always interpreted from the root directory down, so
it always refers to the same thing. Using a relative path is like telling someone
to go two kilometers north and then half a kilometer east; using an absolute
path is like giving them the latitude and longitude of their destination.
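The difference is easy to check with Python’s pathlib. In this sketch the helper resolve is our own name, and PurePosixPath is used so that the example manipulates path names without touching any real files.

```python
from pathlib import PurePosixPath

working_directory = PurePosixPath("/Users/amira")


def resolve(path, cwd=working_directory):
    """Interpret path the way the shell would from the directory cwd."""
    path = PurePosixPath(path)
    if path.is_absolute():
        return path          # absolute: the starting point is irrelevant
    return cwd / path        # relative: start from the working directory


print(resolve("zipf"))
print(resolve("zipf", PurePosixPath("/tmp")))
print(resolve("/Users/amira", PurePosixPath("/tmp")))
```

The relative path zipf means something different from each starting directory, while the absolute path gives the same answer wherever we start.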
We can use whichever kind of path is easiest to type, but if we are going to do
a lot of work with the data in the zipf directory, the easiest thing would be
to change our current working directory so that we don’t have to type zipf
over and over again. The command to do this is cd, which stands for change
directory. This name is a bit misleading because the command doesn’t change
the directory; instead, it changes the shell’s idea of what directory we are in.
Let’s try it out:
$ cd zipf
cd doesn’t print anything. This is normal: many shell commands run silently
unless something goes wrong, on the theory that they should only ask for our
attention when they need it. To confirm that cd has done what we asked, we
can use pwd:
$ pwd
/Users/amira/zipf
$ ls -F
data/
2.2 Moving Around
$ cd -j
On the other hand, if we get the syntax right but make a mistake in
the name of a file or directory, it will tell us that:
$ cd whoops
We now know how to go down the directory tree, but how do we go up? This
doesn’t work:
$ cd amira
because amira on its own is a relative path meaning “a file or directory called
amira below our current working directory.” To get back home, we can either
use an absolute path:
$ cd /Users/amira
or a special relative path called .. (two periods in a row with no spaces), which
always means “the directory that contains the current one.” The directory that
contains the one we are in is called the parent directory, and sure enough,
.. gets us there:
$ cd ..
$ pwd
/Users/amira
$ ls -F -a
The output also shows another special directory called . (a single period),
which refers to the current working directory. It may seem redundant to have
a name for it, but we’ll see some uses for it soon.
Combining Options
You’ll occasionally need to use multiple options in the same command.
In most command-line tools, multiple options can be combined with a
single - and no spaces between the options:
$ ls -Fa
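One way to picture what happens is that the tool unpacks the cluster into separate single-letter options before acting on them. The sketch below shows that unpacking step; the function expand_options is hypothetical, and it deliberately ignores options that take a value, which a real parser has to handle.

```python
def expand_options(arguments):
    """Split clustered short options such as -Fa into -F and -a."""
    expanded = []
    for argument in arguments:
        is_cluster = (
            argument.startswith("-")
            and not argument.startswith("--")
            and len(argument) > 2
        )
        if is_cluster:
            # Each letter after the dash becomes its own option.
            expanded.extend("-" + letter for letter in argument[1:])
        else:
            expanded.append(argument)
    return expanded


print(expand_options(["-Fa", "zipf"]))
print(expand_options(["-F", "-a", "zipf"]))
```

Both calls produce the same list, which is why ls -Fa and ls -F -a behave identically.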
The special names . and .. don’t belong to cd: they mean the same thing to
every program. For example, if we are in /Users/amira/zipf, then ls .. will
display a listing of /Users/amira. When the meanings of the parts are the
same no matter how they’re combined, programmers say they are orthogonal.
Orthogonal systems tend to be easier for people to learn because there are
fewer special cases to remember.
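Because . and .. mean the same thing everywhere, Python understands them too. This short sketch uses the standard posixpath module to work out where a path leads from a given directory; the helper name from_directory is ours. Note that normpath works purely on the text of the path, so it can give a different answer from the shell when symbolic links are involved.

```python
import posixpath


def from_directory(cwd, path):
    """Resolve path against cwd, collapsing any . and .. components."""
    return posixpath.normpath(posixpath.join(cwd, path))


print(from_directory("/Users/amira/zipf", ".."))
print(from_directory("/Users/amira/zipf", "."))
print(from_directory("/Users/amira/zipf", "../Movies"))
```

The first call gives the parent directory, the second leaves us where we are, and the third goes up one level and then down into Movies.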
$ pwd
/Users/amira/Movies
$ cd
$ pwd
/Users/amira
No matter where we are, cd on its own always returns us to our home directory.
We can achieve the same thing using the special directory name ~, which is a
shortcut for our home directory:
$ ls ~
(ls doesn’t show any trailing slashes here because we haven’t used -F.) We
can use ~ in paths, so that (for example) ~/Downloads always refers to our
download directory.
Finally, cd interprets the shortcut - (a single dash) to mean the last
directory we were in. Using this is usually faster and more reliable than trying to
remember and type the path, but unlike ~, it only works with cd: ls - tries
to print a listing of a directory called - rather than showing us the contents
of our previous directory.
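The bookkeeping behind these shortcuts can be sketched as a toy model in Python. The class ToyShell is entirely hypothetical: it never checks that a directory exists, and it handles ~ inside cd for brevity, whereas a real shell expands ~ itself before any command sees it (which is why ~ works with ls but - does not).

```python
import posixpath


class ToyShell:
    """Track the current and previous directories the way cd does."""

    def __init__(self, home):
        self.home = home
        self.cwd = home
        self.previous = home

    def cd(self, target=None):
        if target is None:                      # plain "cd" goes home
            destination = self.home
        elif target == "-":                     # "cd -" goes back
            destination = self.previous
        elif target == "~" or target.startswith("~/"):
            destination = self.home + target[1:]
        else:                                   # relative or absolute path
            destination = posixpath.join(self.cwd, target)
        # Remember where we were, then move.
        self.previous, self.cwd = self.cwd, posixpath.normpath(destination)
        return self.cwd


shell = ToyShell("/Users/amira")
print(shell.cd("Movies"))
print(shell.cd("~/zipf"))
print(shell.cd("-"))
print(shell.cd())
```

The only state needed is the current directory and the one before it, which is why cd - can take us back one step but no further.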
$ cd ~/zipf
$ ls -F
data/
To create a new directory, we use the command mkdir (short for make
directory):
$ mkdir docs
Since docs is a relative path (i.e., does not have a leading slash) the new
directory is created below the current working directory:
$ ls -F
data/ docs/
Using the shell to create a directory is no different than using a graphical tool.
If we look at the current directory with our computer’s file browser we will see
the docs directory there too. The shell and the file explorer are two different
ways of interacting with the files; the files and directories themselves are the
same.
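Python offers a third way of doing the same thing. In this sketch, which works in a temporary directory standing in for Amira’s zipf project, Path.mkdir plays the role of the mkdir command; the variable names are ours.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    zipf = Path(scratch) / "zipf"
    (zipf / "data").mkdir(parents=True)     # the project as it stood before

    # Equivalent of running "mkdir docs" with zipf as the working directory.
    (zipf / "docs").mkdir()

    names = sorted(entry.name + "/" for entry in zipf.iterdir() if entry.is_dir())
    docs_contents = list((zipf / "docs").iterdir())
    print(names)
    print(docs_contents)
```

Whichever tool creates it, the result is an ordinary directory, and a newly created one starts out empty.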
2.3 Creating New Files and Directories
Since we just created the docs directory, ls doesn’t display anything when
we ask for a listing of its contents:
$ ls -F docs
Let’s change our working directory to docs using cd, then use a very simple
text editor called Nano to create a file called draft.txt (Figure 2.3):
$ cd docs
$ nano draft.txt
When we say “Nano is a text editor” we really do mean “text”: it can only
work with plain character data, not spreadsheets, images, Microsoft Word
files, or anything else invented after 1970. We use it in this lesson because it
runs everywhere, and because it is as simple as something can be and still be
called an editor. However, that last trait means that we shouldn’t use it for
larger tasks like writing a program or a paper.
Random documents with unrelated
content Scribd suggests to you:
those "Contes à Ninon" gave no warning of what was to follow from his pen.
And yet at the very time of writing most of them he was being weaned from
romance and fable and idyl. Not only had he taken considerable interest in
About's "Madelon," but he had been studying Balzac, and particularly
Flaubert's "Madame Bovary," the perusal of which had quite stirred him. A
man had come, axe in hand, into the huge and often tangled forest which
Balzac had left behind him; and the formula of the modern novel now
appeared in a blaze of light. When "Madame Bovary" was issued in 1860, the
average Parisian, the average literary man even, regarded it merely as a
succès de scandale. Many of those who praised the book failed to understand
its real import; and when Flaubert was satirised in the popular theatrical
révue, "Ohé! les petits Agneaux," half Paris, by way of deriding him, hummed
the trivial lines sung by the actress who impersonated "Madame Bovary":
Émile Zola's Home, Impasse Sylvacanne, Aix-in-Provence.—Photo by C.
Martinet
Zola took the hint (conveyed pleasantly enough) and gave notice to leave at
the end of the following January. And he was the better pleased at having
adopted that course, and having averted, perhaps, a direct dismissal, as a few
weeks after the appearance of "La Confession de Claude" the Procureur
Impérial, otherwise the public prosecutor, influenced by certain reviews of the
book, caused some inquiries to be made at Hachette's with respect to its
author. No prosecution ensued, and "Madame Bovary" having escaped scot
free, it is extremely doubtful if one would have succeeded even in those days
of judicial subserviency to the behests of the authorities, particularly as,
whatever might be the subject-matter of the "Confession," it was instinct
throughout with loathing and censure of the incidents it narrated. In any case,
Zola, on writing to Valabrègue early in January, 1866, with thoughts, perhaps,
of "Henriette Maréchal" and the Goncourts in his mind, was by no means
alarmed or cast down. If, said he, the "Confession" had damaged him in the
opinion of respectable folk, it had also made him known; he was feared and
insulted, classed among the writers whose works were read with horror. For
his part, he did not mean to pander to the likes or the dislikes of the crowd;
he intended to force the public to caress or insult him. Doubtless, indifference
would be loftier, more dignified; but he belonged to an impatient age, and if
he and his fellows did not trample the others under foot, the others would
certainly pass over them, and, personally, he did not desire to be crushed by
fools.
And now, then, having published two volumes, the first fairly well received,
the second virulently attacked, he quitted Hachette's, to give himself up
entirely to journalism and literature.
IV
1866-1868
Such writing as this was bound to ruffle many dovecotes. There had
previously been various efforts on behalf of the new school of painting, the
complaints of injustice having led one year to the granting of a Salon des
Refusés, but never had any writer hit out so vigorously, with such disregard
for the pretentious vanity of the artistic demigods of the hour. If, however,
Zola was banished from "L'Événement" as an art critic, he was not silenced,
for he republished his articles in pamphlet form,[10] with a dedicatory preface
addressed to Paul Cézanne, in which he said: "I have faith in the views I
profess; I know that in a few years everybody will hold me to be right. So I
have no fear that they may be cast in my face hereafter." In this again he was
fairly accurate: at least several of the views then held to be not merely
revolutionary but ridiculous have become commonplaces of criticism.
Though this campaign did not improve Zola's material position, it brought him
into notoriety among the public, and gave him quite a position among the
young men of the French art-world. At this time he still had his home in the
Rue de Vaugirard, overlooking the Luxembourg gardens, but in the summer of
1866 he was able to spend several weeks at Bennecourt, a little village on the
right bank of the Seine, near Bonnières, and—as the crow flies—about half-
way between Paris and Rouen. Here he was joined at intervals by some of his
Provençal friends, Baille, Cézanne, Marius Roux, and Numa Coste;[11] and
they roamed and boated, rested on the pleasant river islets and formed the
grandest plans for the future, while Paris became all excitement about the war
which had broken out between Prussia and Austria. The crash of Kœnigsgratz
echoed but faintly in that pleasant valley of the Seine, among those young
men whose minds were intent on art and literature. But politically the year
was an important one for France, for, from that time, the Franco-German War
became inevitable. The Napoleonic prestige was departing. The recall of the
expeditionary force from Mexico had become imperative. In vain did the
unhappy Empress Charlotte hasten to Paris and beg and pray and weep;
Napoleon III, who had placed her husband Maximilian in his dangerous
position, would give him no further help, and she, poor woman, was soon to
lose her reason and sink into living death.
The year which had opened so brightly for Zola was to end badly for him also.
After shocking the readers of "L'Événement" as an art critic, he imagined he
might be more successful with them as a story writer. So he proposed a serial
to Villemessant, who after examining a synopsis of the suggested narrative,
accepted the offer. The story which Zola then wrote was called "Le Vœu d'une
Morte," but it met with no more success than the art criticisms, and after
issuing the first part, Villemessant stopped the publication. The second part
was never written; yet the abortion—for it was nothing else—was issued in
volume form,[12] and of recent years has even been translated into English,
[13] and reviewed approvingly by English critics! Zola himself always regarded
it as the very worst of his productions. "What a wretched thing, my friend!" he
remarked in a letter to M. George Charpentier twenty years after this story's
first appearance. "Nowadays young men of eighteen turn out work ten times
superior in craftsmanship to what we produced when we were five and
twenty."
This second failure to catch the public fancy injured Zola considerably in the
opinion of Villemessant, but the latter continued to take various articles from
him, such as a series of literary character-sketches, entitled "Marbres et
Plâtres," in which figured such men as Flaubert, Janin, Taine, Paradol, and
About. These articles were merely signed "Simplice,"—Zola's name having
become odious to the readers of "L'Événement,"—and portions were worked
by the author into later studies on French literary men.
About this time Villemessant found himself in serious difficulties with the
authorities, through having sailed too near to politics in a journal only
authorised for literature and news. "L'Événement" was suppressed, but its
editor turned "Le Figaro" into a daily organ, and Zola's services were
transferred to the latter journal. He contributed to it a number of Parisian and
other sketches, portions of which will be found under the title "Souvenirs," in
a second volume of "Contes à Ninon," published in 1874.
In the latter part of 1866 his pecuniary position was a declining one. As he
wrote to his friend, Antony Valabrègue, he found himself in a period of
transition. He had penned a pretty and pathetic nouvelle, "Les Quatre
Journées de Jean Gourdon," for "L'Illustration,"[14] but he was chiefly turning
his thoughts to dramatic art, going, he said, as often as possible to the
theatre—with the idea, undoubtedly, that, as he had failed to conquer Paris as
an art critic and a novelist, he might yet do so as a playwright. The young
man was certainly indomitable; after each repulse he came up, smiling, to try
the effect of another attack. Already in 1865, although his comedy, "La Laide,"
had been declined by the Odéon Theatre, he had started on a three-act
drama, called "La Madeleine," and this now being finished he sent it to
Montigny, the director of the Gymnase Theatre, who replied, however, that
the play was "impossible, mad, and would bring down the very chandeliers if
an attempt were made to perform it." Harmant of the Vaudeville also declined
"La Madeleine," but on the ground that the piece was "too colourless," from
which, as Alexis points out, one may surmise that he had not troubled to read
it.
After this experience Zola slipped his manuscript into a drawer and turned to
other matters. In December, 1866, he is found informing Valabrègue that he
has received a very flattering invitation to the Scientific Congress of France,
[15] and asking him, as he cannot attend personally, to read on his behalf a
paper he has written for it. This was a "definition of the novel," prepared, said
Zola, according to the methods of Taine,[16] and it embodied at least the
germs of the theories which he afterwards applied to his own work. When
writing to Valabrègue on the subject he was in a somewhat despondent
mood, for his position on "Le Figaro" had now become very precarious. He
wished to undertake some serious work, he said, but it was imperative that he
should raise money, and he was "very unskilful in such matters." Indeed, in
spite of every effort, he did not earn more than an average of three hundred
francs a month. Nevertheless, he still received his friends every Thursday,
when Pissarro, Baille, Solari, and others went "to complain with him about the
hardness of the times."[17] And he at least had a ray of comfort amid his
difficulties, for he was now in love, was loved in return, and hoped to marry at
the first favourable opportunity. The young person was tall, dark haired, very
charming, very intelligent, with a gift, too, of that prudent thrift which makes
so many Frenchwomen the most desirable of companions for the men who
have to fight for position and fame. Her name was Alexandrine Gabrielle
Mesley; before very long she became Madame Zola.
In 1867 Zola put forth a large quantity of work. Early in the year he quitted
"Le Figaro," and bade good-bye to the Quartier Latin, removing to Batignolles,
quite at the other end of Paris; his new address being 1, Rue Moncey, at the
corner of the Avenue de Clichy. He was now near his artistic friends of
Montmartre, and complained to Valabrègue of having only painters around
him, without a single literary chum to join him in his battle. His association
with artists led, however, to the production of a fresh study on Manet,[18] and
to another abortive effort to write a "Salon," this time in a newspaper called
"La Situation," which the blind, despoiled King of Hanover had started in Paris
for the purpose of inciting the French against the Prussians. This journal was
edited by Édouard Grénier, a publiciste and minor poet of the time, who was
well disposed towards Zola, but the latter's articles again called forth so many
protests, that Grénier, fearing the newspaper would be wrecked when it was
barely launched, cast his contributor overboard.
Zola fortunately had other work in hand, having arranged with the director of
a Marseillese newspaper, "Le Messager de Provence," to supply him with a
serial story, based (so Zola wrote to Valabrègue), on certain criminal trials,
respecting which he had received such an infinity of documents that he hardly
knew how to reduce so much chaos to order and invest it with life. He hoped,
however, that the story, which he called "Les Mystères de Marseille," might
give him a reputation in the south of France, even if from a pecuniary
standpoint it provided little beyond bread and cheese, the remuneration being
fixed at no more than two sous a line. That, perhaps, was full value for such
matter; at all events the London Sunday papers and halfpenny evening
journals often pay no more, if indeed as much, for the serials they issue
nowadays, the majority of which are no whit better than was Zola's tale. It
was not literature certainly, but it was clearly and concisely written, and
generally good as narrative, in spite of some sentimental mawkishness and
sensational absurdity. As often happens with hack work of this description the
tale opens better than it ends. Long, indeed, before it was finished, the writer
had grown heartily tired of it, as many of its readers must have perceived. At
the same time it was not a work to be ashamed of, particularly in the case of
an author fighting for his daily bread, and Zola, when at the height of his
reputation, showed that he was not ashamed of it, for on his adversaries
casting this forgotten "pot boiler" in his face, he caused it to be reprinted,
with a vigorous preface, in which he recounted under what circumstances the
story had been written.[19]
The money paid for it had been very acceptable to him, for it had meant an
income of two hundred francs a month for nine months in succession; and it
had enabled him to give time to some real literary work, the writing of his first
notable novel, "Thérèse Raquin." This he had begun in 1866; the idea of it
then being suggested to him by Adolphe Belot and Ernest Daudet's "Vénus de
Gordes," in which a husband is killed by the wife's lover, who, with his
mistress, is sent to the Assizes. Zola, for his part, pictured a similar crime in
which the paramours escaped detection, but suffered all the torment of
remorse, and ended by punishing each other. An article, a kind of nouvelle
which he contributed to "Le Figaro" on the subject, led him to develop this
theme in the form of a novel. In parts, "Thérèse Raquin," as the author
afterwards remarked, was neither more nor less than a study of the animality
existing in human nature. It was, therefore, bound to be repulsive to many
folk. But if one accept the subject, the book will be found to possess
considerable literary merit, a quality which cannot be claimed for Émile
Gaboriau's "Crime d'Orcival," with which it has been compared by Mr. Andrew
Lang. Gaboriau was a clever man in his way, but he wrote in commonplace
language for the folk of little education who patronised the feuilletons of "Le
Petit Journal." No French critic, except, perhaps, the ineffable M. de
Brunetière, who has declared the illiterate Ponson du Terrail to be infinitely
superior to the Goncourts, would think of associating Gaboriau's name with
that of Émile Zola.
Under the title of "Un Mariage d'Amour," "Thérèse Raquin" was published
during the summer and autumn of 1867, in Arsène Houssaye's review,
"L'Artiste," which paid Zola the sum of six hundred francs[20] for the serial
rights. There was some delay and difficulty in the matter. Houssaye, who was
bien en cour, as the French say, and desirous of doing nothing that might
interfere with his admission to the Tuileries, informed Zola that the Empress
Eugénie read the review, and on that ground obtained his assent to the
omission of certain strongly worded passages from the serial issue. But the
author rebelled indignantly when he found that Houssaye, not content with
this expurgation, had written a fine moral tag at the end of the last sheet of
proofs. Zola would have none of it, and he was right; yet for years the great
quarrel between him and his critics arose less from the outspokenness with
which he treated certain subjects than from his refusal to interlard his
references to evil with pious ejaculations and moral precepts. But for all
intelligent folk the statement of fact should carry its own moral, and books are
usually written for intelligent folk, not for idiots. In the case in point the
spectacle of Arsène Houssaye, a curled, dyed, perfumed ex-lady killer,
tendering moral reflections to the author of "Thérèse Raquin," was extremely
amusing. Here was a man who for years had pandered to vice, adorned,
beautified, and worshipped it, not only in a score of novels, but also in
numerous semi-historical sketches. For him it was all "roses and rapture,"
whereas under Zola's pen it appeared absolutely vile. In the end Houssaye
had to give way, and the moral tag was deleted.
Zola took his story to M. Albert Lacroix, who in the autumn of 1867 published
it as a volume. Naturally it was attacked; and notably by Louis Ulbach, a
writer with whom Zola frequently came in contact, for Ulbach did a large
amount of work for Lacroix, and was often to be met at the afternoon
gatherings at the Librairie Internationale. It was he who had initiated the
most popular book of that year: Lacroix's famous "Paris Guide by the principal
authors and artists of France"; but at the same time he did not neglect
journalism, and just then he was one of the principal contributors to "Le
Figaro," for which he wrote under the pseudonym of "Ferragus." In an article
printed by that journal he frankly denounced "Thérèse Raquin" as "putrid
literature," and Zola, with Villemessant's sanction, issued a slashing reply. This
certainly attracted attention to the book, with the result that a second edition
was called for at the end of the year, which had not been a remunerative one
for the book-selling world, for it was that of the great Exhibition when Paris,
receiving visits from almost every ruler and prince of Europe, gave nearly all
its attention to sight-seeing and festivity.[21]
Zola had sent a copy of his book to Ste.-Beuve, for whom, as for Taine, he
always professed considerable deference, though he reproached him
somewhat sharply for having failed to understand Balzac, Flaubert, and
others. Ste.-Beuve, having read "Thérèse Raquin," pronounced it to be a
"remarkable and conscientious" work, but objected to certain of its features.
Some years afterwards Zola had occasion to refer to this subject, and the
remarks he then penned[22] may be quoted with the more advantage as they
embody his own criticism of his book:
About the time of the publication of "Thérèse Raquin" Zola at last obtained
the coveted honours of the footlights. In conjunction with his friend Marius
Roux he wrote a drama based on his "Mystères de Marseille," and the director
of the Marseillese Gymnase consented to stage it. It is possible that this
arrangement was effected during a visit which the director made to Paris, for,
according to some accounts, a trial performance of the play took place in the
capital.[23] Zola and Roux, being anxious to witness its production at
Marseilles, afterwards repaired thither, and superintended the last rehearsals;
but their hopes were scarcely fulfilled, for although, as Alexis points out rather
naïvely, the first performance[24] "proceeded fairly well, enlivened by only a
little hissing," no more than two others were ever given. And while it is true
that a "run" could hardly be expected in a provincial city, particularly in those
days, three solitary performances, followed by no revival, could not be
interpreted as signifying success.
Perhaps it was the failure of this effort that caused Zola to abandon for some
years all hope of making his way as a dramatic author. Judging by the
comparative success of "Thérèse Raquin," novel writing seemed the safer
course for him. Accordingly, he transformed his rejected play, "La Madeleine,"
into a novel, which he entitled "La Honte," and offered as a serial to a certain
M. Bauer, who had established a new "Événement." Bauer accepted it, but its
minute descriptions of the working of sensual passion in a woman shocked his
readers, and the publication ceased abruptly. On the whole, this story, written
in a large degree on the same lines as "Thérèse Raquin," was not a good
piece of work. When Lacroix published it, however, in volume form, under the
title of "Madeleine Férat," it soon went into a second edition.[25]
This was the chief literary work accomplished by Zola in 1868, when he also
published a variety of articles in different Paris newspapers. And as his books
were now selling fairly well, he began to think of giving some fulfilment to an
old and once vague project, to which the example of Balzac's works had at
last imparted shape. Writing in May, 1867, to his friend Valabrègue, he had
then said: "By the way, have you read all Balzac? What a man he was! I am
reperusing him at this moment. To my mind, Victor Hugo and the others
dwindle away beside him. I am thinking of a book on Balzac, a great study, a
kind of real romance."
That book was never written, but the perusal of "La Comédie Humaine" and
its haunting influence at least largely inspired "Les Rougon-Macquart."
[1] This was in the early sixties. Marx, who "interviewed" the boyish Prince
Imperial, Baron James de Rothschild, M. de Lesseps, and many others,
collected his articles in a volume entitled, "Indiscrétions Parisiennes."
[2] Alexis, l. c., p. 67.
[3] The first volume had appeared in 1863.
[4] Napoleon III. and his wife attended the first performance at the Odéon
(March, 1866), and when Got, one of the performers, had occasion to
exclaim, "England, the land of liberty!" nearly the entire audience,
composed of the intellectual leaders of Paris, rose and applauded
tumultuously, in spite of the Emperor's presence. He was deeply impressed
by this demonstration.
[5] Zola's "Les Romanciers Naturalistes," Paris, Charpentier, 1881 et seq.
[6] The present writer can speak of these matters from personal
knowledge; he well knew M. Bourdilliat, the founder of the Librairie
Nouvelle, and afterwards connected for many years with "Le Monde
Illustré," which Frank Vizetelly helped to establish, and of which he was
the first editor. As for the Librairie Internationale, it became the
commercial agency of the "Illustrated London News," which Henry Vizetelly
(the writer's father) represented in Paris for several years.
[7] "Le Maudit" was followed by "La Religieuse," "Le Jésuite," "Le Moine,"
etc., all of these books having very large sales in Paris.
[8] See ante, p. 66.
[9] The above passage corrects and supplements the particulars given by
the writer in the preface to the English translation of "L'Œuvre," edited by
him. "His Masterpiece," by É. Zola, London, Chatto and Windus, 1902.
[10] "Mon Salon," Paris, Librairie Centrale, 1866, 12mo, 99 pages. The
articles are also given in the volume entitled "Mes Haines" (Charpentier
and Fasquelle).
[11] M. Coste, who is well known as a publiciste in France, should have
been mentioned earlier in this work. Though not so intimate with Zola as
Baille and Cézanne, he knew him in his school days. He largely helped Paul
Alexis in the preparation of the latter's biographical work on Zola.
[12] "Le Vœu d'une Morte," Paris, Faure, 1866, 18mo. Reissued by
Charpentier, 1889 and 1891.
[13] "A Dead Woman's Wish," translated by Count C. S. de Soissons,
London, 1902.
[14] "L'Illustration," December 15, 1866, to February 16, 1867. The story is
included in the "Nouveaux Contes à Ninon," 1874.
[15] It must have been held, we think, at Marseilles or Aix.
[16] The substance of the paper was worked into the articles which Zola
collected in the volume entitled "Le Roman Expérimental," Paris, 1880 et
seq.
[17] "La Grande Revue," May, 1903, p. 254.
[18] First issued in the "Revue du XIXe Siècle"; afterwards in pamphlet
form by Dentu, with a portrait of Manet by Bracquemond, and an etching
of Manet's "Olympia" by the painter himself. The text was reprinted in the
volume, "Mes Haines."
[19] Besides appearing serially in "Le Messager de Provence," "Les
Mystères de Marseille" was issued in parts (16mo) by Mengelle of
Marseilles, 1867-1868; and in volume form (with preface) by Charpentier,
Paris, 1884. Both "La Lanterne" and "Le Corsaire," of Paris, published the
story serially after the Franco-German War. In the latter journal it was
called "Un Duel Social," by "Agrippa," under which title it was again issued
in parts (12mo) for popular consumption. There is an English translation:
"The Mysteries of Marseilles," translated by Edward Vizetelly. London,
Hutchinson & Co., 1895 et seq.
[20] £24 or about $120. Houssaye had previously paid Zola a third of that
amount for his study on Manet (see ante, p. 101), and the money had
reached the young author just in time to enable him to save his furniture
from being seized and sold by a creditor.
[21] "Thérèse Raquin," Paris, Librairie Internationale: 1st edition, 1867; 2d,
1868; 3d, 1872; 4th and 5th, 1876; 6th, 7th, etc., Charpentier, 1880, 1882,
etc. Illustrated editions: Marpon, 8vo, 1883; Charpentier, 32mo, 1884.
Popular edition at 60 centimes: Marpon, 16mo, 1887. English translations:
(1) anonymous, Vizetelly & Co., cir. 1886-1889; (2) by Edward Vizetelly,
London, Grant Richards, 1902.
[22] "Le Voltaire," August 10-14, 1880. See also "Documents Littéraires,"
by É. Zola, Paris, Charpentier (and Fasquelle), 1881 et seq.