MEAP Edition
Manning Early Access Program
Data Analysis with Python and PySpark
Version 7
—Jonathan Rioux
According to pretty much every news outlet, data is everything, everywhere. It’s the new oil, the
new electricity, the new gold, plutonium, even bacon! We call it powerful, intangible, precious,
dangerous. I prefer calling it useful in capable hands. After all, for a computer, any piece of data
is a collection of zeroes and ones, and it is our responsibility, as users, to make sense of how it
translates to something useful.
Just like oil, electricity, gold, plutonium and bacon (especially bacon!), our appetite for data is growing. So much, in fact, that computers aren't keeping up. Data is growing in size and complexity, yet consumer hardware has been stalling a little. RAM is hovering for most laptops at around 8 to 16 GB, and SSDs get prohibitively expensive past a few terabytes. Is the solution for the burgeoning data analyst to triple-mortgage their life to afford top-of-the-line hardware to tackle Big Data problems?
Introducing Spark, and its companion PySpark, the unsung heroes of large-scale analytical workloads. They take a few pages of the supercomputer playbook (powerful but manageable compute units meshed in a network of machines) and bring them to the masses. Add on top a powerful set of data structures ready for any work you're willing to throw at them, and you have a tool that will grow (pun intended) with you.
This book is a great introduction to data manipulation and analysis using PySpark. It tries to cover just enough theory to get you comfortable, while giving enough opportunities to practice. Each chapter except this one contains a few exercises to anchor what you just learned. The exercises are all solved and explained in Appendix A.
At its core, PySpark can be summarized as being the Python API to Spark. While this is an accurate definition, it doesn't tell you much unless you know the meaning of Python and Spark. If we were in a video game, I certainly wouldn't win any prize for being the most useful NPC. Let's continue our quest to understand what PySpark is by first answering the question: what is Spark?
Digging a little deeper, we can compare Spark to an analytics factory. The raw material (here, data) comes in, and data, insights, visualizations, models (you name it!) come out.
Just like a factory will often gain more capacity by increasing its footprint, Spark can process an increasingly vast amount of data by scaling out instead of scaling up. This means that, instead of buying thousands of dollars' worth of RAM to accommodate your data set, you'll rely instead on multiple computers, splitting the job between them. In a world where two modest computers are less costly than one large one, scaling out is less expensive than scaling up, which keeps more money in your pockets.
The problem with computers is that they crash or behave unpredictably once in a while. If instead of one you have a hundred, the chance that at least one of them goes down is now much higher. Spark therefore jumps through a lot of hoops to manage, scale, and babysit those poor, sometimes unstable little computers so you can focus on what you want, which is to work with data.
This is, in fact, one of the weird things about Spark: it's a good tool because of what you can do with it, but especially because of what you don't have to do with it. Spark provides a powerful API that makes it look like you're working with a cohesive, non-distributed source of data, while working hard in the background to optimize your program to use all the power available. You therefore don't have to be an expert in the arcane art of distributed computing: you just need to be familiar with the language you'll use to build your program. This leads us to…
Python is a dynamic, general-purpose language, available on many platforms and for a variety of tasks. Its versatility and expressiveness make it an especially good fit for PySpark. The language is one of the most popular across a variety of domains, and is currently a major force in data analysis and science. The syntax is easy to learn and read, and the number of libraries available means that you'll often find one (or more!) that's just the right fit for your problem.
PySpark packs a lot of advantages for modern data workloads. It sits at the intersection of fast,
expressive and versatile. Let’s explore those three themes one by one.
PYSPARK IS FAST
If you search for "Big Data" in a search engine, there is a very good chance that Hadoop will come up within the first few results. There is a very good reason for this: Hadoop popularized the famous MapReduce framework that Google pioneered in 2004, and it is now a staple in data lakes and Big Data warehouses everywhere.
Spark was created a few years later, building on Hadoop's incredible legacy. With an aggressive query optimizer, a judicious usage of RAM, and some other improvements we'll touch on in the next chapters, Spark can run up to 100x faster than plain Hadoop. Because of the integration between the two frameworks, you can easily switch your Hadoop workflow to Spark and gain the performance boost without changing your hardware.
PYSPARK IS EXPRESSIVE
Beyond the choice of Python, one of the most popular and easiest-to-learn languages, PySpark's API has been designed from the ground up to be easy to understand. Most programs read as a descriptive list of the transformations you need to apply to the data, which makes them easy to reason about. For those familiar with functional programming languages, PySpark code is conceptually closer to the "pipe" abstraction than to pandas, the most popular in-memory DataFrame library.
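To make this concrete, here is a minimal sketch of what such a chain of transformations can look like. The file name and column names are purely illustrative, and I assume a running Spark entry point named spark, as the pyspark shell provides.

from pyspark.sql import functions as F

results = (
    spark.read.csv("sales.csv", header=True, inferSchema=True)  # read the raw data
    .where(F.col("amount") > 0)                                  # keep the valid sales
    .groupby("region")                                           # one group per region
    .agg(F.sum("amount").alias("total_amount"))                  # total sales per region
    .orderBy("total_amount", ascending=False)                    # largest regions first
)

results.show(5)  # display the top five regions

Read top to bottom, the program is a description of what happens to the data, which is exactly the style this book will cultivate.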
You will obviously see many examples throughout this book. As I was writing those examples, I was pleased by how close the code ended up looking to my initial (pen-and-paper) reasoning. After understanding the fundamentals of the framework, I'm confident you'll be in the same situation.
PYSPARK IS VERSATILE
There are two components to this versatility. First, there is the availability of the framework.
Second, there is the diversified ecosystem surrounding Spark.
PySpark is everywhere. All three major cloud providers have a managed Hadoop/Spark cluster as part of their offering, which means you have a fully provisioned cluster at the click of a few buttons. You can also easily install Spark on your own computer to nail down your program before scaling it to a more powerful cluster. Appendix B covers how to get your own local Spark running, while Appendix C walks through the current main cloud offerings.
PySpark is open-source. Unlike some other analytical software, you aren't tied to a single company. You can inspect the source code if you're curious, and even contribute if you have an idea for a new functionality or find a bug. It also means a low barrier to adoption: download, learn, profit!
Finally, Spark's ecosystem doesn't stop at PySpark. There are also APIs for Scala, Java, and R, as well as a state-of-the-art SQL layer. This makes it easy to write a polyglot program in Spark. A Java software engineer can tackle the ETL pipeline in Spark using Java, while a data scientist can build a model using PySpark.
PySpark isn't the right choice if you're dealing with small data sets. Managing a distributed cluster comes with some overhead, and if you're only using a single node, you're paying the price without reaping the benefits. As an example, a PySpark shell will take a few seconds to launch: this is often more than enough time to process data that fits within your RAM.
PySpark is also at a disadvantage compared to the Java and Scala APIs. Since Spark is at its core a Scala program, Python code has to be translated to and from JVM instructions. While more recent versions have been bridging that gap pretty well, pure Python translation, which happens mainly when you're defining your own user-defined functions (UDFs), will perform slower. We will cover UDFs and some ways to mitigate the performance problem in Chapter 8.
Finally, while programming PySpark can feel easy and straightforward, managing a cluster can be a little arcane. Spark is a pretty complicated piece of software, and while the code base has matured remarkably over the past few years, the days when you can scale a 100-machine cluster and manage it as easily as a single node are still far ahead. We will cover some of the developer-facing configuration and problems in the chapter about performance, but for hairier problems, do what I do: befriend your devops team.
If we keep the factory analogy, we can imagine that the cluster of computers Spark sits on is the building.
There are two different ways to interpret a data factory. From the outside, it looks like a cohesive unit where projects come in and results come out; this is how it will appear to you most of the time. Under the hood, it looks more like a set of workbenches to which workers are assigned. The workbenches are like the computers in our Spark cluster: there is a fixed number of them, and adding or removing some is easy but needs to be planned in advance. The workers are called executors in Spark's literature: they are the ones performing the actual work on the machines.
One of the little workers looks spiffier than the others. That top hat definitely makes him stand out from the crowd. In our data factory, he's the manager of the work floor. In Spark terms, we call this the master. In the spirit of the open work space, he shares one of the workbenches with his fellow employees. The role of the master is crucial to the efficient execution of your program, so a section of its own is dedicated to it.
NOTE Spark provides its own cluster manager, but it can also play well with others when working in conjunction with Hadoop or another Big Data platform. We will definitely discuss the intricacies of managing the cluster manager (pun intended) in the chapter about performance, but in the meantime, if you read about YARN or Mesos in the wild, know that they are two of the most popular cluster managers nowadays.
Any directions about capacity (machines and executors) are encoded in a SparkContext object
which represents the connection to our Spark cluster. Since our instructions didn’t mention any
specific capacity, the cluster manager will allocate the default capacity prescribed by our Spark
installation.
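As an illustration, here is a minimal sketch of how capacity directions can be passed when creating the connection yourself instead of relying on the defaults. The configuration values are purely illustrative, not recommendations; spark.executor.memory and spark.executor.cores are standard Spark configuration keys.

from pyspark import SparkConf, SparkContext

conf = (
    SparkConf()
    .setAppName("capacity-example")
    .set("spark.executor.memory", "4g")    # memory requested for each executor
    .set("spark.executor.cores", "2")      # cores requested for each executor
)

sc = SparkContext(conf=conf)               # the connection to our Spark cluster
print(sc.defaultParallelism)               # the default number of parallel tasks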
We’re off to a great start! We have a task to accomplish, and the capacity to accomplish it.
What’s next? Let’s get working!
Your manager/master has all the qualities a good manager has: smart, cautious, and lazy. Wait, what? You read me right. Laziness in a programming context (and one could argue in the real world, too) can actually be a very good thing. Every instruction you provide in Spark can be classified into two categories: transformations and actions. Actions are what many programming languages would consider IO, such as printing data to the screen or writing it to disk. In Spark, we'll see those instructions most often via the show and write methods, as well as other methods calling those two in their body.
Transformations are pretty much everything else: filtering records, selecting or creating columns, joining data frames, and so on.
Why the distinction, you might ask? When thinking about computation over data, you, as the developer, are only concerned about the computation leading to an action. You'll always interact with the results of an action, because this is something you can see. Spark, with its lazy computation model, takes this to the extreme and will avoid performing data work until an action triggers the computation chain. Before that, the master will store (or cache) your instructions. This way of dealing with computation has many benefits when dealing with large-scale data.
First, storing instructions in memory takes much less space than storing intermediate data results. If you are performing many operations on a data set and materializing the data at each step of the way, you'll blow through your storage much faster, even though you don't need the intermediate results. We can all agree that less waste is better.
Second, by having the full list of tasks to be performed available, the master can optimize the
work between executors much more efficiently. It can use information available at run-time, such
as the node where specific parts of the data are located. It can also re-order and eliminate useless
transformations if necessary.
Finally, during interactive development, you don't have to submit a huge block of commands and wait for the computation to happen. Instead, you can iteratively build your chain of transformations, one at a time, and when you're ready to launch the computation (like during your coffee break), you can add an action and let Spark work its magic.
Lazy computation is a fundamental aspect of Spark’s operating model and part of the reason it’s
so fast. Most programming languages, including Python, R and Java, are eagerly evaluated. This
means that they process instructions as soon as they receive them. If you have never worked with
a lazy language before, it can look a little foreign and intimidating. If this is the case, don’t
worry: we’ll weave practical explanations and implications of that laziness during the code
examples when relevant. You’ll be a lazy pro in no time!
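Here is a minimal sketch of that laziness in action, assuming a Spark entry point named spark, as the pyspark shell introduced in the next chapter provides. The data frame names are purely illustrative.

from pyspark.sql import functions as F

df = spark.range(1_000_000)                        # transformation: a plan, no data moved yet
doubled = df.select((F.col("id") * 2).alias("x"))  # transformation: still nothing computed
filtered = doubled.where(F.col("x") % 3 == 0)      # transformation: Spark just takes notes

filtered.show(5)  # action: only now does Spark execute the whole chain

Until show() is called, Spark has only recorded our intentions; the three transformations cost essentially nothing.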
What's a manager without competent employees? Once the task, with its action, has been received, the master starts allocating data to what Spark calls executors. Executors are processes that run computations and store data for the application. Those executors sit on what's called a worker node, which is the actual computer. In our factory analogy, an executor would be an employee performing the work, while the worker node would be a workbench where many employees/executors can work. As we saw on our factory floor, our master wears a top hat and sits with his employees/workers at one of the workbenches.
That concludes our factory tour. Let’s summarize our typical PySpark program.
When submitting our program (or launching a PySpark shell), the cluster manager allocates
resources for us to use. Those will stay constant for the duration of the program.
The master ingests your code and translates it into Spark instructions. Those instructions are either transformations or actions.
Once the master reaches an action, it optimizes the whole computation chain and splits the work between executors. Executors are processes performing the actual data work, and they reside on machines labelled worker nodes.
That's it! As we can see, the overall process is quite simple, but it's obvious that Spark hides a lot of the complexity arising from efficient distributed processing. For a developer, this means more time spent on the data work itself and less on the plumbing of distributed computing. Through this book, you will learn how to:
read and write data from (and to) a variety of sources and formats;
deal with messy data with PySpark’s data manipulation functionality;
discover new data sets and perform exploratory data analysis;
build data pipelines that transform, summarize and get insights from data in an automated
fashion;
test, profile and improve your code;
troubleshoot common PySpark errors, recover from them, and avoid them in the first place.
After covering those fundamentals, we’ll also tackle different tasks that aren’t as frequent, but
are interesting and an excellent way to showcase the power and versatility of PySpark.
We are trying to cater to many potential readers, but are focusing on people with little to no
exposure to Spark and/or PySpark. More seasoned practitioners might find useful analogies for
when they need to explain difficult concepts and maybe learn a thing or two!
The book focuses on Spark version 2.4, which is currently the most recent available. Users on
older Spark versions will be able to go through most of the code in the book, but we definitely
recommend using at least Spark 2.0+.
We're assuming some basic Python knowledge: some useful concepts are outlined in Appendix D. If you feel like you need a more in-depth introduction to Python, I recommend The Quick Python Book, by Naomi Ceder (Manning, 2018).
A code editor will also be very useful for writing, reading and editing scripts as you go through
the examples and craft your own programs. A Python-aware editor, such as PyCharm, is a
nice-to-have but is in no way necessary. Just make sure it saves your code without any
formatting: don’t use Microsoft Word to write your programs!
The book’s code examples are available on GitHub, so Git will be a useful piece of software to
have. If you don’t know git, or don’t have it handy, GitHub provides a way to download all the
book’s code in a Zip file. Make sure you check regularly for updates!
Finally, I recommend that you have an analog way of drafting your code and schemas. I am a compulsive note-taker and doodler, and even if my drawings are very basic and crude, I find that working through a new piece of software via drawings helps in clarifying my thoughts. This means less code re-writing, and a happier programmer! Nothing spiffy: some scrap paper and a pencil will do wonders.
1.4 Summary
PySpark is the Python API for Spark, a distributed framework for large-scale data
analysis. It provides the expressiveness and dynamism of the Python programming
language to Spark.
PySpark provides a full-stack analytics workbench. It has an API for data manipulation,
graph analysis, streaming data as well as machine learning.
Spark is fast: it owes its speed to a judicious usage of the RAM available and an
aggressive and lazy query optimizer.
Spark provides bindings for Python, Scala, Java, and R. You can also use SQL for data
manipulation.
Spark uses a master which processes the instructions and orchestrates the work. The
executors receive the instructions from the master and perform the work.
All instructions in PySpark are either transformations or actions. Spark being lazy, only
actions will trigger the computation of a chain of instructions.
Data-driven applications, no matter how complex, all boil down to what I like to call three meta-steps, which are easy to distinguish in a program: we ingest the data, we transform it, and we export the results.
The next two chapters will introduce a basic workflow with PySpark via the creation of a simple
ETL (Extract, Transform and Load, which is a more business-speak way of saying Ingest,
Transform and Export). We will spend most of our time at the pyspark shell, interactively
building our program one step at a time. Just like normal Python development, using the shell or
REPL (I’ll use the terms interchangeably) provides rapid feedback and quick progression. Once
we are comfortable with the results, we will wrap our program so we can submit it in batch
mode.
Data manipulation is the most basic and important aspect of any data-driven program, and PySpark puts a lot of focus on it. It serves as the foundation of any reporting, machine learning, or data science exercise we wish to perform. This section will give you the tools not only to use PySpark to manipulate data at scale, but also to think in terms of data transformations. We obviously can't cover every function provided in PySpark, but I'll provide a good explanation of the ones we use. I'll also introduce how to use the shell as a friendly reminder for those cases when you forget how something works.
Since this is your first end-to-end program in PySpark, we'll get our feet wet with a simple problem to solve: counting the most popular words used in the English language. Now, since collecting all the material ever produced in the English language would be a massive undertaking, we'll start with a very small sample: Pride and Prejudice by Jane Austen. We'll make our program work with this small sample and then scale it to ingest a larger corpus of text. Since this is our first program, and I need to introduce many new concepts, this chapter will focus on the data manipulation part of the program. Chapter 3 will cover the final computation as well as wrapping our program and then scaling it.
TIP The book repository, containing the code and data used for the examples
and exercises, is available at https://github.com/jonesberg/PySparkInAction
.
For this chapter (and the rest of the book), I assume that you have access to a working installation of Spark, either locally or in the cloud. If you want to perform the installation yourself, Appendix B contains step-by-step instructions for Linux, macOS, and Windows. If you can't install it on your computer, or prefer not to, Appendix C provides a few cloud-powered options as well as additional instructions to upload your data and make it visible to Spark.
Once everything is set up, you can launch the pyspark shell by inputting pyspark into your terminal. You should see an ASCII-art version of the Spark logo, some useful information about your installation, and finally an In [1]: prompt waiting for your input. A few elements of that start-up output are worth pointing out:
When using PySpark locally, you most often won't have a full Hadoop cluster pre-configured. For learning purposes, this is perfectly fine.
Spark indicates the level of detail it'll provide in its logs. We will see how to configure this a little later in the chapter.
We are using Spark version 2.4.3.
PySpark is using the Python available on your path.
The pyspark shell provides an entry point for you through the variable spark. More on this in a moment.
The REPL is now ready for your input!
NOTE I highly recommend using IPython when using PySpark in interactive mode. IPython is a better front-end to the Python shell, containing many useful functionalities such as friendlier copy-and-paste and syntax highlighting. The installation instructions in Appendix B include configuring PySpark to use the IPython shell.
While all the information provided at start-up is useful, two elements are worth expanding on: the SparkSession entry point and the log level.
In Chapter 1, we spoke briefly about the Spark entry point called SparkContext. SparkSession is a super-set of that: it wraps the SparkContext and provides functionality for interacting with data in a distributed fashion. Just to prove our point, see how easy it is to get to the SparkContext from our SparkSession object: just call the sparkContext attribute from spark.
$ spark.sparkContext
# <SparkContext master=local[*] appName=PySparkShell>
The SparkSession object is a recent addition to the PySpark API, making its way in version 2.0. This is due to the API evolving in a way that makes more room for the faster, more versatile data frame as the main data structure over the lower-level RDD. Before that, you had to use another object (called the SQLContext) in order to use the data frame. It's much easier to have everything under a single umbrella.
This book will focus mostly on the data frame as our main data structure. I'll discuss the RDD in Chapter 8, when we cover lower-level PySpark programming and how to embed our own Python functions in our programs.
sc = spark.sparkContext    # the old sc entry point, for legacy code that expects it
sqlContext = spark         # the SparkSession takes over the role the SQLContext used to play
The available keywords you can pass to setLogLevel are, from quietest to chattiest, OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, and ALL. Each subsequent keyword contains all the previous ones, with the obvious exception of OFF, which doesn't show anything.
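For instance, to quiet the console down to warnings and errors while working in the shell (spark is the entry point the shell provides):

spark.sparkContext.setLogLevel("WARN")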
NOTE When using the pyspark shell, anything chattier than WARN might appear
when you’re typing a command, which makes it quite hard to input
commands into the shell. You’re welcome to play with the log levels as you
please, but we won’t show any output unless it’s valuable for the task at
hand.
Setting the log level to ALL is a very good way to annoy oblivious
co-workers if they don’t lock their computers. You haven’t heard it from me.
You now have the REPL fired up and ready for your input. Our program will need to perform the following steps (a condensed code sketch follows the list):
1. Read: Read the input data (we're assuming a plain text file).
2. Token: Tokenize each word.
3. Clean: Remove any punctuation and/or tokens that aren't words.
4. Count: Count the frequency of each word present in the text.
5. Answer: Return the top 10 (or 20, 50, 100).
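Here is a hedged, condensed sketch of those five steps, assuming a SparkSession named spark and a hypothetical path for the text file; the next two chapters build this chain carefully, one step at a time.

from pyspark.sql import functions as F

results = (
    spark.read.text("data/pride-and-prejudice.txt")                       # 1. Read
    .select(F.split(F.col("value"), " ").alias("line"))                    # 2. Token: split each line
    .select(F.explode(F.col("line")).alias("word"))                        #    one word per record
    .select(F.lower(F.col("word")).alias("word"))                          # 3. Clean: lowercase...
    .select(F.regexp_extract(F.col("word"), "[a-z']+", 0).alias("word"))   #    ...and keep letters only
    .where(F.col("word") != "")                                            #    drop empty tokens
    .groupby("word")                                                       # 4. Count
    .count()
    .orderBy(F.col("count").desc())                                        # 5. Answer: most frequent first
)

results.show(10)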
Our goal is quite lofty: the English language has produced, throughout history, an unfathomable amount of written material. Since we are learning, we'll start with a relatively small source, get our program working, and then scale it to accommodate a larger body of text. For this, I chose to use Jane Austen's Pride and Prejudice, since it's already in plain text and freely available.
The RDD can be seen as a distributed collection of objects. I personally visualize it as a bag that you give orders to: you pass orders to the RDD through regular Python functions over the items in the bag.
The data frame is a stricter version of the RDD: conceptually, you can think of it as a table, where each cell can contain one value. The data frame makes heavy usage of the concept of columns, where you perform operations on columns instead of on records, as in the RDD. Figure 2.2 provides a visual summary of the two structures.
If you've used SQL in the past, you'll find that the data frame implementation takes a lot of inspiration from SQL. The module for data organization and manipulation is even named pyspark.sql! Furthermore, Chapter 7 will teach you how to mix PySpark and SQL code within the same program.
Figure 2.2 An RDD vs. a data frame. In the RDD, each record is processed independently. With the data frame, we work with its columns, performing functions on them.
This book will focus on the data frame implementation, as it is more modern and performs faster for all but the most esoteric tasks. Chapter 8 will discuss the trade-offs between the RDD and the data frame. Don't worry: once you've learned the data frame, it'll be a breeze to learn the RDD.
Reading data into a data frame is done through the DataFrameReader object, which we can access through spark.read. The code below displays the object, as well as the methods it exposes. We recognize a few file formats: csv stands for comma-separated values (which we'll use as early as Chapter 4), json is for JavaScript Object Notation (a popular data exchange format), and text is plain text.
In [4]: dir(spark.read)
Out[4]: [<some content removed>, '_spark', 'csv', 'format', 'jdbc', 'json',
'load', 'option', 'options', 'orc', 'parquet', 'schema', 'table', 'text']
Let’s read our data file. I am assuming that you launched PySpark at the root of this book’s
repository. Depending on your case, you might need to change the path where the file is located.
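As a sketch, the read is a single line; the path here is hypothetical, so point it at wherever the Pride and Prejudice text file lives in your copy of the repository.

book = spark.read.text("data/pride-and-prejudice.txt")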
book
# DataFrame[value: string]
We get a data frame, as expected! If you input your data frame, conveniently named book, into the shell, you see that PySpark doesn't actually output any data to the screen. Instead, it prints the schema, which is the name of the columns and their type. In PySpark's world, each column has a type: it describes how the value is stored by Spark's engine. By having the type attached to each column, you instantly know what operations you can do on the data. With this information, you won't inadvertently try to add an integer to a string: PySpark won't let you add 1 to "blue". Here, we have one column, named value, composed of a string. A quick graphical representation of our data frame would look like Figure 2.3. Besides being a helpful reminder of the content of the data frame, types are integral to how Spark processes data quickly and accurately. We will explore the subject extensively in Chapter 5.
Figure 2.3 A high-level schema of our book data frame, containing a value string column. We can see the name of the column, its type, and a small snippet of the data.
If you want to see the schema in a more readable way, you can use the handy printSchema() method, illustrated below. This will print a tree-like version of the data frame's schema. It is probably the method I use the most when developing interactively!
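Calling it on our book data frame is a one-liner, whose output is shown just after:

book.printSchema()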
# root
# |-- value: string (nullable = true)
The shell can also remind you of what an object does: in IPython, appending a question mark to an object or method displays its documentation. Asking about spark.read, for instance, gives the following:
In [*]: spark.read?
Type: property
String form: <property object at 0x1159a0958>
Docstring:
Returns a :class:`DataFrameReader` that can be used to read data
in as a :class:`DataFrame`.
:return: :class:`DataFrameReader`
.. versionadded:: 2.0
Earlier, we saw that the default behaviour of inputting a data frame in the shell is to provide the schema, or column information, of the object. While very useful, sometimes we want to take a peek at the data.
The show() method displays a sample of the data back to you. Nothing more, nothing less. Along with printSchema(), it will become one of your best friends for performing data exploration and validation. By default, it will show 20 rows and truncate long values. The code below shows the default behaviour of the method, applied to our book data frame. For text data, the length limitation is limiting (pun intended). Fortunately, show() provides some options to display just what you need.
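The call itself is simply:

book.show()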
# +--------------------+
# | value|
# +--------------------+
# |The Project Guten...|
# | |
# |This eBook is for...|
# |almost no restric...|
# |re-use it under t...|
# |with this eBook o...|
# | |
# | |
# |Title: Pride and ...|
# | |
# | Author: Jane Austen|
# | |
# |Posting Date: Aug...|
# |Release Date: Jun...|
# |Last Updated: Mar...|
# | |
# | Language: English|
# | |
# |Character set enc...|
# | |
# +--------------------+
# only showing top 20 rows
The show() method takes three optional parameters:
n can be set to any positive integer, and will display that number of rows.
truncate, if set to True, will truncate the columns to display only 20 characters. Set it to False to display the whole length, or to any positive integer to truncate to a specific number of characters.
vertical takes a Boolean value and, when set to True, will display each record as a small table. Try it!
The code below shows a couple of those options in action, starting by showing 10 records and truncating them at 50 characters.
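A minimal sketch of that call on our book data frame (the printed output is omitted here):

book.show(10, truncate=50)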