Mastering Python High
Performance

Measure, optimize, and improve the performance of


your Python code with this easy-to-follow guide

Fernando Doglio

BIRMINGHAM - MUMBAI
Mastering Python High Performance

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, without the prior written
permission of the publisher, except in the case of brief quotations embedded in
critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy
of the information presented. However, the information contained in this book is
sold without warranty, either express or implied. Neither the author, nor Packt
Publishing, and its dealers and distributors will be held liable for any damages
caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2015

Production reference: 1030915

Published by Packt Publishing Ltd.


Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78398-930-0

www.packtpub.com
Credits

Author: Fernando Doglio
Reviewers: Erik Allik, Mike Driscoll, Enrique Escribano, Mosudi Isiaka
Commissioning Editor: Kunal Parikh
Acquisition Editors: Vivek Anantharaman, Richard Brookes-Bland
Content Development Editors: Akashdeep Kundu, Rashmi Suvarna
Technical Editor: Vijin Boricha
Copy Editors: Relin Hedly, Karuna Narayanan
Project Coordinator: Milton Dsouza
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Sheetal Aute
Production Coordinator: Arvindkumar Gupta
Cover Work: Arvindkumar Gupta
About the Author

Fernando Doglio has been working as a web developer for the past 10 years.
During that time, he shifted his focus to the Web and grabbed the opportunity of
working with most of the leading technologies, such as PHP, Ruby on Rails, MySQL,
Python, Node.js, AngularJS, AJAX, REST APIs, and so on.

In his spare time, Fernando likes to tinker and learn new things. This is why his
GitHub account keeps getting new repos every month. He's also a big open source
supporter and tries to win the support of new people with the help of his website,
lookingforpullrequests.com.

You can reach him on Twitter at @deleteman123.

When he is not programming, he spends time with his family.

I'd like to thank my lovely wife for putting up with me and the
long hours I spent writing this book; this book would not have
been possible without her continued support. I would also like to
thank my two sons. Without them, this book would've been finished
months earlier.

Finally, I'd like to thank the reviewers and editors. They helped me get
this book in shape and achieve the quality level that you deserve.
About the Reviewers

Erik Allik is a self-taught multilingual, multiparadigm full-stack software engineer.


He started programming at the age of 14. Since then, Erik has been working with
many programming languages (both imperative and functional) and various web
and non-web-related technologies.

He has worked primarily with Python, Scala, and JavaScript. Erik is currently
focusing on applying Haskell and other innovative functional programming
techniques in various industries and leveraging the power of a mathematical
approach and formalism in the wild.
Mike Driscoll has been programming in Python since 2006. He enjoys writing
about Python on his blog at http://www.blog.pythonlibrary.org/. Mike has
coauthored Core Python refcard for DZone. He recently authored Python 101 and
was a technical reviewer for the following books by Packt Publishing:

• Python 3 Object-Oriented Programming


• Python 2.6 Graphics Cookbook
• Tkinter GUI Application Development Hotshot

I would like to thank my beautiful wife, Evangeline, for supporting
me throughout. I would also like to thank my friends and family for
all their help. Also, thank you Jesus Christ for taking good care of me.
Enrique Escribano lives in Chicago and is working as a software engineer at
Nokia. Although he is just 23 years old, he holds a master's degree in computer
science from IIT (Chicago) and a master's degree in telecommunication
engineering from ETSIT-UPM (Madrid). Enrique has also worked as a software
engineer at KeepCoding and as a developer intern at Telefonica, SA, the most
important Spanish tech company.

He is an expert in Java and Python and is proficient in using C/C++. Most of his
projects involve working with cloud-based technologies, such as AWS, GAE,
Hadoop, and so on. Enrique is also working on an open source research project
based on security with software-defined networking (SDN) with professor
Dong Jin at IIT Security Lab.

You can find more information about Enrique on his personal website
at enriquescribano.com. You can also reach him on LinkedIn at
linkedin.com/in/enriqueescribano.

I would like to thank my parents, Lucio and Carmen, for all the
unconditional support they have provided me with over the years.
They allowed me to be as ambitious as I wanted. Without them,
I may never have gotten to where I am today.

I would like to thank my siblings, Francisco and Marta. Being
the eldest brother is challenging, but you both keep inspiring me
every day.

Lastly, I would also like to thank Paula for always being my main
inspiration and motivation since the very first day. I am so fortunate
to have her in my life.
Mosudi Isiaka is a graduate in electrical and computer engineering from the
Federal University of Technology Minna, Niger State, Nigeria. He demonstrates
excellent skills in numerous aspects of information and communication technology.
From a simple network to a mid-level complex network scenario of no less than
one thousand workstations (Microsoft Windows 7, Microsoft Windows Vista, and
Microsoft Windows XP), along with a Microsoft Windows 2008 Server R2 Active
Directory domain controller deployed in more than a single location, Mosudi has
extensive experience in implementing and managing a local area network. He has
successfully set up a data center infrastructure, VPN, WAN link optimization,
firewall and intrusion detection system, web/e-mail hosting control panel,
OpenNMS network management application, and so on.

Mosudi has the ability to use open source software and applications to achieve
enterprise-level network management solutions in scenarios that cover a virtual
private network (VPN), IP PBX, cloud computing, clustering, virtualization, routing,
high availability, customized firewall with advanced web filtering, network load
balancing, failover and link aggregation for multiple Internet access solutions, traffic
engineering, collaboration suites, network-attached storage (NAS), Linux systems
administration, virtual networking and computing.

He is currently employed as a data center manager at One Network Ltd., Nigeria.


Mosudi also works with ServerAfrica (http://www.serverafrica.com) as a
managing consultant (technicals).

You can find more information about him at http://www.mioemi.com. You can also
reach him at http://ng.linkedin.com/pub/isiaka-mosudi/1b/7a2/936/.

I would like to thank my amiable wife, Mosudi Efundayo Coker, for
her moral support.

Also, many thanks to my colleague, Oyebode Micheal Tosin,
for his timely reminders and technical suggestions during the
reviewing process.
www.PacktPub.com

Support files, eBooks, discount offers, and more


For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF
and ePub files available? You can upgrade to the eBook version at www.PacktPub.com
and as a print book customer, you are entitled to a discount on the eBook copy. Get in
touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign
up for a range of free newsletters and receive exclusive discounts and offers on Packt
books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital
book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?
• Fully searchable across every book published by Packt
• Copy and paste, print, and bookmark content
• On demand and accessible via a web browser

Free access for Packt account holders


If you have an account with Packt at www.PacktPub.com, you can use this to access
PacktLib today and view 9 entirely free books. Simply use your login credentials for
immediate access.
Table of Contents
Preface v
Chapter 1: Profiling 101 1
What is profiling? 2
Event-based profiling 2
Statistical profiling 5
The importance of profiling 6
What can we profile? 8
Execution time 8
Where are the bottlenecks? 10
Memory consumption and memory leaks 11
The risk of premature optimization 15
Running time complexity 15
Constant time – O(1) 16
Linear time – O(n) 16
Logarithmic time – O(log n) 17
Linearithmic time – O(n log n) 18
Factorial time – O(n!) 18
Quadratic time – O(n²) 19
Profiling best practices 22
Build a regression-test suite 22
Mind your code 22
Be patient 22
Gather as much data as you can 23
Preprocess your data 23
Visualize your data 23
Summary 25


Chapter 2: The Profilers 27


Getting to know our new best friends: the profilers 27
cProfile 28
A note about limitations 30
The API provided 30
The Stats class 34
Profiling examples 38
Fibonacci again 38
Tweet stats 44
line_profiler 52
kernprof 54
Some things to consider about kernprof 55
Profiling examples 56
Back to Fibonacci 56
Inverted index 58
Summary 69
Chapter 3: Going Visual – GUIs to Help
Understand Profiler Output 71
KCacheGrind – pyprof2calltree 72
Installation 72
Usage 73
A profiling example – TweetStats 75
A profiling example – Inverted Index 78
RunSnakeRun 82
Installation 83
Usage 84
Profiling examples – the lowest common multiplier 85
A profiling example – search using the inverted index 87
Summary 96
Chapter 4: Optimize Everything 97
Memoization / lookup tables 98
Performing a lookup on a list or linked list 102
Simple lookup on a dictionary 103
Binary search 103
Use cases for lookup tables 103
Usage of default arguments 108
List comprehension and generators 110
ctypes 115
Loading your own custom C library 116
Loading a system library 118


String concatenation 119


Other tips and tricks 123
Summary 126
Chapter 5: Multithreading versus Multiprocessing 127
Parallelism versus concurrency 128
Multithreading 128
Threads 130
Multiprocessing 143
Multiprocessing with Python 144
Summary 150
Chapter 6: Generic Optimization Options 151
PyPy 151
Installing PyPy 153
A Just-in-time compiler 154
Sandboxing 155
Optimizing for the JIT 156
Think of functions 156
Consider using cStringIO to concatenate strings 157
Actions that disable the JIT 159
Code sample 160
Cython 161
Installing Cython 162
Building a Cython module 163
Calling C functions 166
Solving naming conflicts 167
Defining types 168
Defining types during function definitions 169
A Cython example 171
When to define a type 173
Limitations 178
Generator expressions 178
Comparison of char* literals 179
Tuples as function arguments 179
Stack frames 179
How to choose the right option 180
When to go with Cython 180
When to go with PyPy 181
Summary 181


Chapter 7: Lightning Fast Number Crunching with Numba,
Parakeet, and pandas 183
Numba 184
Installation 185
Using Numba 187
Numba's code generation 187
Running your code on the GPU 194
The pandas tool 195
Installing pandas 195
Using pandas for data analysis 196
Parakeet 200
Installing Parakeet 201
How does Parakeet work? 202
Summary 204
Chapter 8: Putting It All into Practice 205
The problem to solve 205
Getting data from the Web 206
Postprocessing the data 209
The initial code base 209
Analyzing the code 217
Scraper 217
Analyzer 222
Summary 229
Index 231

Preface
The idea of this book came to me from the nice people at Packt Publishing.
They wanted someone who could delve into the intricacies of high performance
in Python and everything related to this subject, be it profiling, the available
tools (such as profilers and other performance enhancement techniques),
or even alternatives to the standard Python implementation.

Having said that, I welcome you to Mastering Python High Performance. In this
book, we'll cover everything related to performance improvements. Knowledge
about the subject is not strictly required (although it won't hurt), but knowledge
of the Python programming language is required, especially in some of the
Python-specific chapters.

We'll start by going through the basics of what profiling is, how it fits into the
development cycle, and the benefits related to including this practice in it. Afterwards,
we'll move on to the core tools required to get the job done (profilers and visual
profilers). Then, we will take a look at a set of optimization techniques and finally
arrive at a fully practical chapter that will provide a real-life optimization example.

What this book covers


Chapter 1, Profiling 101, provides information about the art of profiling to those who
are not aware of it.

Chapter 2, The Profilers, tells you how to use the core tools that will be mentioned
throughout the book.

Chapter 3, Going Visual – GUIs to Help Understand Profiler Output, covers how to
use the pyprof2calltree and RunSnakeRun tools. It also helps the developer to
understand the output of cProfile with different visualization techniques.


Chapter 4, Optimize Everything, talks about the basic process of optimization and a set
of good/recommended practices that every Python developer should follow before
considering other options.

Chapter 5, Multithreading versus Multiprocessing, discusses multithreading and


multiprocessing and explains how and when to apply them.

Chapter 6, Generic Optimization Options, describes and shows you how to install and
use Cython and PyPy in order to improve code performance.

Chapter 7, Lightning Fast Number Crunching with Numba, Parakeet, and pandas, talks
about tools that help optimize Python scripts that deal with numbers. These specific
tools (Numba, Parakeet, and pandas) help make number crunching faster.

Chapter 8, Putting It All into Practice, provides a practical example of profilers, finds
its bottlenecks, and removes them using the tools and techniques mentioned in this
book. To conclude, we'll compare the results of using each technique.

What you need for this book


Your system must have the following software before executing the code mentioned
in this book:

• Python 2.7
• Line profiler 1.0b2
• Kcachegrind 0.7.4
• RunSnakeRun 2.0.4
• Numba 0.17
• The latest version of Parakeet
• pandas 0.15.2

Who this book is for


Since the topics tackled in this book cover everything related to profiling and
optimizing the Python code, Python developers at all levels will benefit from
this book.

The only essential requirement is to have some basic knowledge of the Python
programing language.


Conventions
In this book, you will find a number of text styles that distinguish between different
kinds of information. Here are some examples of these styles and an explanation of
their meaning.

Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We
can print/gather the information we deem relevant inside the PROFILER function."

A block of code is set as follows:


import sys

def profiler(frame, event, arg):
    print 'PROFILER: %r %r' % (event, arg)

sys.setprofile(profiler)

When we wish to draw your attention to a particular part of a code block, the
relevant lines or items are set in bold:
Traceback (most recent call last):
File "cprof-test1.py", line 7, in <module>
runRe() ...
File "/usr/lib/python2.7/cProfile.py", line 140, in runctx
exec cmd in globals, locals
File "<string>", line 1, in <module>
NameError: name 're' is not defined

Any command-line input or output is written as follows:


$ sudo apt-get install python-dev libxml2-dev libxslt-dev


New terms and important words are shown in bold. Words that you see on the
screen, for example, in menus or dialog boxes, appear in the text like this: "Again,
with the Callee Map selected for the first function call, we can see the entire map
of our script."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Reader feedback
Feedback from our readers is always welcome. Let us know what you think about
this book—what you liked or disliked. Reader feedback is important for us as it helps
us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail feedback@packtpub.com, and mention
the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing
or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support
Now that you are the proud owner of a Packt book, we have a number of things to
help you to get the most from your purchase.

Downloading the example code


You can download the example code files from your account at
http://www.packtpub.com for all the Packt Publishing books you have purchased.
If you purchased this book elsewhere, you can visit http://www.packtpub.com/support
and register to have the files e-mailed directly to you.


Downloading the color images of this book


We also provide you with a PDF file that has color images of the screenshots/diagrams
used in this book. The color images will help you better understand the changes in
the output. You can download this file from: https://www.packtpub.com/sites/
default/files/downloads/9300OS_GraphicBundle.pdf.

Errata
Although we have taken every care to ensure the accuracy of our content, mistakes
do happen. If you find a mistake in one of our books—maybe a mistake in the text or
the code—we would be grateful if you could report this to us. By doing so, you can
save other readers from frustration and help us improve subsequent versions of this
book. If you find any errata, please report them by visiting
http://www.packtpub.com/submit-errata, selecting your book, clicking on the
Errata Submission Form
link, and entering the details of your errata. Once your errata are verified, your
submission will be accepted and the errata will be uploaded to our website or added
to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to
https://www.packtpub.com/books/content/support and enter the name of the book
in the search field. The required
information will appear under the Errata section.

Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all
media. At Packt, we take the protection of our copyright and licenses very seriously.
If you come across any illegal copies of our works in any form on the Internet, please
provide us with the location address or website name immediately so that we can
pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected
pirated material.

We appreciate your help in protecting our authors and our ability to bring you
valuable content.

Questions
If you have a problem with any aspect of this book, you can contact us at
questions@packtpub.com, and we will do our best to address the problem.

Profiling 101
Just like any infant needs to learn how to crawl before running 100 meters with
obstacles in under 12 seconds, programmers need to understand the basics of
profiling before trying to master that art. So, before we start delving into the
mysteries of performance optimization and profiling of Python programs,
we need to have a clear understanding of the basics.

Once you know the basics, you'll be able to learn about the tools and techniques.
So, to start us off, this chapter will cover everything you need to know about
profiling but were too afraid to ask. In this chapter we will do the following things:

• We will provide a clear definition of what profiling is and the different
profiling techniques.
• We will explain the importance of profiling in the development cycle,
because profiling is not something you do only once and then forget about
it. Profiling should be an integral part of the development process, just like
writing tests is.
• We will cover things we can profile. We'll go over the different types of
resources we'll be able to measure and how they'll help us find our problems.
• We will discuss the risk of premature optimization, that is, why optimizing
before profiling is generally a bad idea.
• You will learn about running time complexity. Understanding profiling
techniques is one step into successful optimization, but we also need to
understand how to measure the complexity of an algorithm in order to
understand whether we need to improve it or not.
• We will also look at good practices. Finally, we'll go over some good
practices to keep in mind when starting the profiling process of your project.


What is profiling?
A program that hasn't been optimized will normally spend most of its CPU cycles
in some particular subroutines. Profiling is the analysis of how the code behaves
in relation to the resources it's using. For instance, profiling will tell you how
much CPU time an instruction is using or how much memory the full program is
consuming. It is achieved by modifying either the source code of the program or the
binary executable form (when possible) to use something called a profiler.

Normally, developers profile their programs either when they need to optimize their
performance or when those programs are suffering from some kind of weird bug,
which can normally be associated with memory leaks. In such cases, profiling can
help them get an in-depth understanding of how their code is using the computer's
resources (that is, how many times a certain function is being called).

A developer can use this information, along with a working knowledge of the source
code, to find the program's bottlenecks and memory leaks. The developer can then
fix whatever is wrong with the code.

There are two main methodologies for profiling software: event-based profiling and
statistical profiling. When using these types of profilers, you should keep in mind
that they both have pros and cons.

Event-based profiling
Not every programming language supports this type of profiling. Here are some
programming languages that support event-based profiling:

• Java: The JVMTI (JVM Tools Interface) provides hooks for profilers to trap
events such as calls, thread-related events, class loads and so on
• .NET: Just like with Java, the runtime provides events (http://
en.wikibooks.org/wiki/Introduction_to_Software_Engineering/
Testing/Profiling#Methods_of_data_gathering)
• Python: Using the sys.setprofile function, a developer can
trap events such as python_[call|return|exception] or c_
[call|return|exception]


Event-based profilers (also known as tracing profilers) work by gathering data
on specific events during the execution of our program. These profilers generate a
large amount of data. Basically, the more events they listen to, the more data they
will gather. This makes them somewhat impractical to use, and they are not the
first choice when starting to profile a program. However, they are a good last resort
when other profiling methods aren't enough or just aren't specific enough. Consider
the case where you'd want to profile all the return statements. This type of profiler
would give you the granularity you'd need for this task, while others would simply
not allow you to execute this task.

A simple example of an event-based profiler in Python could be the following code
(we'll understand this topic better once we reach the upcoming chapters):

import sys

def profiler(frame, event, arg):
    print 'PROFILER: %r %r' % (event, arg)

sys.setprofile(profiler)

# Simple (and very inefficient) example of how to calculate the
# Fibonacci sequence for a number.
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

print fib_seq(2)


The preceding code contributes to the following output:


PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'return' 0
PROFILER: 'c_call' <built-in method append of list object at
0x7f570ca215f0>
PROFILER: 'c_return' <built-in method append of list object at
0x7f570ca215f0>
PROFILER: 'return' [0]
PROFILER: 'c_call' <built-in method extend of list object at
0x7f570ca21bd8>
PROFILER: 'c_return' <built-in method extend of list object at
0x7f570ca21bd8>
PROFILER: 'call' None
PROFILER: 'return' 1
PROFILER: 'c_call' <built-in method append of list object at
0x7f570ca21bd8>
PROFILER: 'c_return' <built-in method append of list object at
0x7f570ca21bd8>
PROFILER: 'return' [0, 1]
PROFILER: 'c_call' <built-in method extend of list object at
0x7f570ca55bd8>
PROFILER: 'c_return' <built-in method extend of list object at
0x7f570ca55bd8>
PROFILER: 'call' None
PROFILER: 'call' None
PROFILER: 'return' 1
PROFILER: 'call' None
PROFILER: 'return' 0
PROFILER: 'return' 1
PROFILER: 'c_call' <built-in method append of list object at
0x7f570ca55bd8>
PROFILER: 'c_return' <built-in method append of list object at
0x7f570ca55bd8>
PROFILER: 'return' [0, 1, 1]
[0, 1, 1]
PROFILER: 'return' None
PROFILER: 'call' None
PROFILER: 'c_call' <built-in method discard of set object at
0x7f570ca8a960>
PROFILER: 'c_return' <built-in method discard of set object at
0x7f570ca8a960>
PROFILER: 'return' None

PROFILER: 'call' None


PROFILER: 'c_call' <built-in method discard of set object at
0x7f570ca8f3f0>
PROFILER: 'c_return' <built-in method discard of set object at
0x7f570ca8f3f0>
PROFILER: 'return' None

As you can see, PROFILER is called on every event. We can print/gather the
information we deem relevant inside the PROFILER function. The last line on the
sample code shows that the simple execution of fib_seq(2) generates a lot of
output data. If we were dealing with a real-world program, this output would be
several orders of magnitude bigger. This is why event-based profiling is normally the
last option when it comes to profiling. There are other alternatives out there (as we'll
see) that generate much less output, but, of course, have a lower accuracy rate.

Statistical profiling
Statistical profilers work by sampling the program counter at regular intervals. This
in turn allows the developer to get an idea of how much time the target program is
spending on each function. Since it works by sampling the PC, the resulting numbers
will be a statistical approximation of reality instead of exact numbers. Still, it should
be enough to get a glimpse of what the profiled program is doing and where the
bottlenecks are.
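
To make the sampling idea more concrete, here is a minimal, Unix-only sketch
(this is our illustration, not code from the book) built on the standard signal
module: an interval timer interrupts the program at regular CPU-time intervals,
and the handler simply records which function the program counter was in.

import collections
import signal

samples = collections.Counter()

def take_sample(signum, frame):
    # Record the name of the function currently being executed; over many
    # samples, the hot functions end up dominating the counts.
    samples[frame.f_code.co_name] += 1

# Deliver SIGPROF roughly every 5 ms of consumed CPU time (Unix only).
signal.signal(signal.SIGPROF, take_sample)
signal.setitimer(signal.ITIMER_PROF, 0.005, 0.005)

# ... run the code to be profiled here ...

signal.setitimer(signal.ITIMER_PROF, 0)  # stop sampling
for name, count in samples.most_common():
    print name, count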

Some advantages of this type of profiling are as follows:

• Less data to analyze: Since we're only sampling the program's execution
instead of saving every little piece of data, the amount of information to
analyze will be significantly smaller.
• Smaller profiling footprint: Due to the way the sampling is made (using
OS interrupts), the target program suffers a smaller hit on its performance.
Although the presence of the profiler is not 100 percent unnoticed, statistical
profiling does less damage than the event-based one.

Here is an example of the output of OProfile
(http://oprofile.sourceforge.net/news/), a Linux statistical profiler:

Function name,File name,Times Encountered,Percentage
"func80000","statistical_profiling.c",30760,48.96%
"func40000","statistical_profiling.c",17515,27.88%
"func20000","static_functions.c",7141,11.37%
"func10000","static_functions.c",3572,5.69%
"func5000","static_functions.c",1787,2.84%
"func2000","static_functions.c",768,1.22%

"func1500","statistical_profiling.c",701,1.12%
"func1000","static_functions.c",385,0.61%
"func500","statistical_profiling.c",194,0.31%

Here is the output of profiling the same Fibonacci code shown earlier
using a statistical profiler for Python called statprof:
% cumulative self
time seconds seconds name
100.00 0.01 0.01 B02088_01_03.py:11:fib
0.00 0.01 0.00 B02088_01_03.py:17:fib_seq
0.00 0.01 0.00 B02088_01_03.py:21:<module>
---
Sample count: 1
Total time: 0.010000 seconds

As you can see, there is quite a difference between the output of both profilers for the
same code.
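
The book doesn't show the exact invocation here, but a minimal sketch of how
statprof is typically driven (assuming the statprof package is installed and
exposes start(), stop(), and display(), as in its published examples) would
look like this:

import statprof

def fib(n):
    return n if n in (0, 1) else fib(n - 1) + fib(n - 2)

statprof.start()        # begin sampling the program counter
try:
    fib(30)             # the code we want to profile
finally:
    statprof.stop()     # stop sampling
    statprof.display()  # print a statistical report like the one above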

The importance of profiling


Now that we know what profiling means, it is also important to understand
how relevant it is to actually do it during the development cycle
of our applications.

Profiling is not something everyone is used to doing, especially with non-critical software
(unlike pacemaker embedded software or any other type of execution-critical
system). Profiling takes time and is normally useful only after we've detected that
something is wrong with our program. However, it could still be performed before
that even happens to catch possible unseen bugs, which would, in turn, help chip away
at the time spent debugging the application at a later stage.

As hardware keeps advancing, getting faster and cheaper, it is increasingly hard
to understand why we, as developers, should spend resources (mainly time) on
profiling our creations. After all, we have practices such as test-driven development,
code review, pair programming, and others that assure us our code is solid and that
it'll work as we want it to. Right?


However, what we sometimes fail to realize is that the higher level our languages
become (we've gone from assembler to JavaScript in just a few years), the less
we think about CPU cycles, memory allocation, CPU registries, and so on. New
generations of programmers learn their craft using higher level languages because
they're easier to understand and provide more power out of the box. However,
they also abstract the hardware and our interaction with it. As this tendency keeps
growing, the chances that new developers will even consider profiling their software
as another step on its development grows weaker by the second.

Let's look at the following scenario:

As we know, profiling measures the resources our program uses. As I've stated earlier,
hardware resources keep getting cheaper and cheaper. So, the cost of getting our software out and the
cost of making it available to a higher number of users is also getting cheaper.

These days, it is increasingly easy to create and publish an application that will be
reached by thousands of people. If they like it and spread the word through social
media, that number can blow up exponentially. Once that happens, something that is
very common is that the software will crash, or it'll become impossibly slow and the
users will just go away.

A possible explanation for the preceding scenario is, of course, a badly thought-out and
non-scalable architecture. After all, one single server with a limited amount of RAM
and processing power will only get you so far before it becomes your bottleneck. However,
another possible explanation, one that proves to be true many times, is that we failed
to stress test our application. We didn't think about resource consumption; we just
made sure our tests passed, and we were happy with that. In other words, we failed
to go that extra mile, and as a result, our project crashed and burned.

Profiling can help avoid that crash and burn outcome, since it provides a fairly
accurate view of what our program is doing, no matter the load. So, if we profile it
with a very light load, and the result is that we're spending 80 percent of our time
doing some kind of I/O operation, it might raise a flag for us. Even if, during our
test, the application performed correctly, it might not do so under heavy stress.
Think of a memory leak-type scenario. In those cases, small tests might not generate
a big enough problem for us to detect it. However, a production deployment under
heavy stress will. Profiling can provide enough evidence for us to detect this problem
before it even turns into one.


What can we profile?


Going deeper into profiling, it is very important to understand what we can actually
profile. Measuring is the core of profiling, so let's take a detailed look at the things
we can measure during a program's execution.

Execution time
The most basic of the numbers we can gather when profiling is the execution time.
The execution time of the entire process or just of a particular portion of the code
will shed some light on its own. If you have experience in the area your program is
running (that is, you're a web developer and you're working on a web framework),
you probably already know what it means for your system to take too much time. For
instance, a simple web server might take up to 100 milliseconds when querying the
database, rendering the response, and sending it back to the client. However, if the
same piece of code starts to slow down and now it takes 60 seconds to do the same
task, then you should start thinking about profiling. You also have to consider that
numbers here are relative. Let's assume another process: a MapReduce job that is
meant to process 2 TB of information stored on a set of text files takes 20 minutes. In
this case, you might not consider it as a slow process, even when it takes considerably
more time than the slow web server mentioned earlier.

To get this type of information, you don't really need a lot of profiling experience or
even complex tools to get the numbers. Just add the required lines into your code
and run the program.

For instance, the following code will calculate the Fibonacci sequence for the
number 30:
import datetime

tstart = None
tend = None

def start_time():
    global tstart
    tstart = datetime.datetime.now()

def get_delta():
    global tstart
    tend = datetime.datetime.now()
    return tend - tstart

def fib(n):
    return n if n == 0 or n == 1 else fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

start_time()
print "About to calculate the Fibonacci sequence for the number 30"
delta1 = get_delta()

start_time()
seq = fib_seq(30)
delta2 = get_delta()

print "Now we print the numbers: "
start_time()
for n in seq:
    print n
delta3 = get_delta()

print "====== Profiling results ======="
print "Time required to print a simple message: %(delta1)s" % locals()
print "Time required to calculate fibonacci: %(delta2)s" % locals()
print "Time required to iterate and print the numbers: %(delta3)s" % locals()
print "====== ======="

Now, the code will produce the following output:


About to calculate the Fibonacci sequence for the number 30
Now we print the numbers:
0
1
1
2
3
5
8
13
21
#...more numbers
4181

6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
====== Profiling results =======
Time required to print a simple message: 0:00:00.000030
Time required to calculate fibonacci: 0:00:00.642092
Time required to iterate and print the numbers: 0:00:00.000102

Based on the last three lines, we see the obvious results: the most expensive part of
the code is the actual calculation of the Fibonacci sequence.

Downloading the example code


You can download the example code files from your account at
http://www.packtpub.com for all the Packt Publishing books
you have purchased. If you purchased this book elsewhere, you
can visit http://www.packtpub.com/support and register
to have the files e-mailed directly to you.

Where are the bottlenecks?


Once you've measured how much time your code needs to execute, you can profile
it by paying special attention to the slow sections. These are the bottlenecks, and
normally, they are related to one or a combination of the following reasons:

• Heavy I/O operations, such as reading and parsing big files, executing
long-running database queries, calling external services (such as HTTP
requests), and so on
• Unexpected memory leaks that start building up until there is no memory
left for the rest of the program to execute properly
• Unoptimized code that gets executed frequently
• Intensive operations that are not cached when they could be


I/O-bound code (file reads/write, database queries, and so on) is usually harder
to optimize, because that would imply changing the way the program is dealing
with that I/O (normally using core functions from the language). Instead, when
optimizing compute-bound code (like a function that is using a badly implemented
algorithm), getting a performance improvement is easier (although not necessarily
easy). This is because it just implies rewriting it.

A general indicator that you're near the end of a performance optimization process is
when most of the bottlenecks left are due to I/O-bound code.
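
Looking back at the list of common bottleneck causes, the last one (intensive
operations that are not cached) is often the cheapest to fix. The following is a
generic memoization sketch of that idea (our illustration, not one of the book's
listings):

def memoize(func):
    cache = {}
    def wrapper(*args):
        # Compute the expensive result only once per distinct input and
        # serve every later call for the same arguments from the cache.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def fib(n):
    return n if n in (0, 1) else fib(n - 1) + fib(n - 2)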

Memory consumption and memory leaks


Another very important resource to consider when developing software is memory.
Regular software developers don't really care much about it, since the era of
the 640 KB of RAM PC is long dead. However, a memory leak on a long-running
program can turn any server into a 640 KB computer. Memory consumption is not
just about having enough memory for your program to run; it's also about having
control over the memory that your programs use.

There are some developments, such as embedded systems, that actually require
developers to pay extra attention to the amount of memory they use, because it is a
limited resource in those systems. However, an average developer can expect their
target system to have the amount of RAM they require.

With plenty of RAM and higher level languages that come with automatic memory
management (like garbage collection), the developer is less likely to pay much
attention to memory utilization, trusting the platform to do it for them.


Keeping track of memory consumption is relatively straightforward. At least for a
basic approach, just use your OS's task manager. It'll display, among other things,
the amount of memory used or at least the percentage of total memory used by your
program. The task manager is also a great tool to check your CPU time consumption.
As you can see in the next screenshot, a simple Python program (the preceding one)
is taking up almost the entire CPU power (99.8 percent), and barely 0.1 percent of the
total memory that is available:

With a tool like that (the top command line tool from Linux), spotting memory leaks
can be easy, but that will depend on the type of software you're monitoring. If your
program is constantly loading data, its memory consumption rate will be different
from another program that doesn't have to deal much with external resources.
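
If you prefer to take these readings from inside the program instead of from the
task manager or top, the standard library can report them as well. A small sketch
using the resource module (Unix only; on Linux, ru_maxrss is reported in
kilobytes) could look like this:

import resource

def peak_memory_kb():
    # Maximum resident set size of the current process so far.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

data = [x ** 2 for x in range(1000000)]  # allocate something noticeable
print "Peak memory used: %d KB" % peak_memory_kb()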


For instance, if we were to chart the memory consumption over time of a program
dealing with lots of external data, it would look like the following chart:

There will be peaks, when these resources get fully loaded into memory, but there
will also be some drops, when those resources are released. Although the memory
consumption numbers fluctuate quite a bit, it's still possible to estimate the average
amount of memory that the program will use when no resources are loaded. Once
you define that area (marked as a green box in the preceding chart), you can spot
memory leaks.

Let's look at how the same chart would look with bad resource handling (not fully
releasing allocated memory):


In the preceding chart, you can clearly see that not all memory is released when a
resource is no longer used, which is causing the line to move out of the green box.
This means the program is consuming more and more memory every second, even
when the resources loaded are released.

The same can be done with programs that aren't resource heavy, for instance, scripts
that execute a particular processing task for a considerable period of time. In those
cases, the memory consumption and the leaks should be easier to spot.

Let's take a look at an example:

When the processing stage starts, the memory consumption should stabilize within a
clearly defined range. If we spot numbers outside that range, especially if it goes out
of it and never comes back, we're looking at another example of a memory leak.

Let's look at an example of such a case:


The risk of premature optimization


Optimization is normally considered a good practice. However, this doesn't hold
true when the act of optimization ends up driving the design decisions of the
software solution.

A very common pitfall developers face while starting to code a new piece of software
is premature optimization.

When this happens, the end result ends up being quite the opposite of the intended
optimized code. It can contain an incomplete version of the required solution, or it
can even contain errors derived from the optimization-driven design decisions.

As a normal rule of thumb, if you haven't measured (profiled) your code, optimizing
it might not be the best idea. First, focus on readable code. Then, profile it and find out
where the real bottlenecks are, and as a final step, perform the actual optimization.

Running time complexity


When profiling and optimizing code, it's really important to understand what
Running time complexity (RTC) is and how we can use that knowledge to
properly optimize our code.

RTC helps quantify the execution time of a given algorithm. It does so by providing
a mathematical approximation of the time a piece of code will take to execute for any
given input. It is an approximation, because that way, we're able to group similar
algorithms using that value.

RTC is expressed using something called Big O notation. In mathematics, Big O
notation is used to express the limiting behavior of a given function when the terms
tend to infinity. If we apply that concept to computer science, we can use Big O notation
to express the limiting behavior of the function describing the execution time.

In other words, this notation will give us a broad idea of how long our algorithm
will take to process an arbitrarily large input. It will not, however, give us a precise
number for the time of execution, which would require a more in-depth analysis of
the source code.

As I've said earlier, we can use this tendency to group algorithms. Here are some of
the most common groups:


Constant time – O(1)


This is the simplest of them all. This notation basically means that the action we're
measuring will always take a constant amount of time, and this time is not dependent
on the size of the input.

Here are some examples of code that have O(1) execution time:

• Determining whether a number is odd or even:


if number % 2:
    odd = True
else:
    odd = False

• Printing a message into standard output:


print "Hello world!"

Even something more conceptually complex, like finding the value of a key inside
a dictionary (or hash table), if implemented correctly, can be done in constant time.
Technically speaking, accessing an element on the hash takes O(1) amortized time,
which roughly means that the average time each operation takes (without taking into
account edge cases) is a constant O(1) time.
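
As a tiny, hypothetical illustration of that last point, the cost of the lookup
below does not depend on how many entries the dictionary holds:

ages = {"alice": 30, "bob": 25, "carol": 41}
# Hash-based access: no scanning of the dictionary is needed, so the
# (amortized) cost stays O(1) regardless of its size.
print ages["bob"]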

Linear time – O(n)


Linear time dictates that for a given input of arbitrary length n, the amount of time
required for the execution of the algorithm is linearly proportional to n, for instance,
3n, 4n + 5, and so on.


The preceding chart clearly shows that both the blue (3n) line and the red one
(4n + 5) have the same upper limit as the black line (n) when x tends to infinity.
So, to simplify, we can just say that all three functions are O(n).

Examples of algorithms with this execution order are:

• Finding the smallest value in an unsorted list


• Comparing two strings
• Deleting the last item inside a linked list
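
As an illustration of the first item in this list (our sketch, not one of the
book's listings), finding the smallest value in an unsorted list has to visit
every element exactly once:

def find_smallest(values):
    # One pass over the data: the running time grows in direct
    # proportion to len(values), that is, O(n).
    smallest = values[0]
    for value in values[1:]:
        if value < smallest:
            smallest = value
    return smallest

print find_smallest([7, 3, 9, 1, 4])  # prints 1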

Logarithmic time – O(log n)


An algorithm with logarithmic execution time is one that will have a very
determined upper limit time. A logarithmic function grows quickly at first,
but it'll slow down as the input size gets bigger. It will never stop growing,
but the amount it grows by will be so small that it will be irrelevant.

The preceding chart shows three different logarithmic functions. You can clearly
see that they all possess a similar shape, even as the value of x keeps
increasing to infinity.

Some examples of algorithms that have logarithmic execution time are:

• Binary search
• Calculating Fibonacci numbers (using matrix multiplications)
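
To make the first of these concrete, here is a small binary search sketch (ours,
not the book's): every iteration halves the remaining search space, so a sorted
list of n items needs at most about log2(n) comparisons.

def binary_search(sorted_values, target):
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2          # middle of the remaining range
        if sorted_values[mid] == target:
            return mid                   # found it
        elif sorted_values[mid] < target:
            low = mid + 1                # discard the lower half
        else:
            high = mid - 1               # discard the upper half
    return -1                            # not present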


Linearithmic time – O(n log n)


A particular combination of the previous two orders of execution is the linearithmic
time. It grows quickly as soon as the value of x starts increasing.

Here are some examples of algorithms that have this order of execution:

• Merge sort
• Heap sort
• Quick sort (at least its average time complexity)
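
A compact merge sort sketch (illustrative only, not taken from the book) shows
where the n log n comes from: the list is halved about log n times, and every
level of merging touches all n elements.

def merge_sort(values):
    if len(values) <= 1:
        return values
    middle = len(values) // 2
    left = merge_sort(values[:middle])
    right = merge_sort(values[middle:])

    # Merge the two sorted halves back together.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged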

Let's see a few examples of plotted linearithmic functions to understand them better:

Factorial time – O(n!)


Factorial time is one of the worst execution times we might get out of an algorithm.
It grows so quickly that it's hard to plot.


Here is a rough approximation of how the execution time of our algorithm would
look with factorial time:

An example of an algorithm with factorial execution time is the solution for
the traveling salesman problem using brute force search (basically checking every
single possible solution).
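
A brute-force sketch of that idea (with hypothetical names; distances is assumed
to be a dictionary mapping ordered city pairs to distances) makes the n! blow-up
explicit, since itertools.permutations yields every one of the n! possible
orderings:

import itertools

def shortest_route(cities, distances):
    # Every possible ordering of the cities is evaluated, which is
    # exactly why brute force takes O(n!) time.
    best_route, best_length = None, float('inf')
    for route in itertools.permutations(cities):
        length = sum(distances[(route[i], route[i + 1])]
                     for i in range(len(route) - 1))
        if length < best_length:
            best_route, best_length = route, length
    return best_route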

Quadratic time – O(n²)


Quadratic execution time is another example of a fast-growing algorithm. The bigger
the input size, the longer it's going to take (this is true for most complexities, but then
again, especially true for this one). Quadratic execution time is even less efficient than
linearithmic time.

Some examples of algorithms having this order of execution are:

• Bubble sort
• Traversing a 2D array
• Insertion sort
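
Bubble sort, the first item in this list, is the classic example; this small
sketch (ours, not the book's) makes the nested passes explicit:

def bubble_sort(values):
    values = list(values)  # work on a copy
    for i in range(len(values)):
        # Each outer pass bubbles one more element into its final place,
        # and each pass scans almost the whole list: roughly n * n steps,
        # which is O(n^2).
        for j in range(len(values) - 1 - i):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
    return values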

[ 19 ]
Exploring the Variety of Random
Documents with Different Content
THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the free


distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund from
the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only be


used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law in
the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms of
this agreement by keeping this work in the same format with its
attached full Project Gutenberg™ License when you share it without
charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the terms
of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears, or
with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived


from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted


with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning of
this work.

1.E.4. Do not unlink or detach or remove the full Project


Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this


electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1 with
active links or immediate access to the full terms of the Project
Gutenberg™ License.
1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or expense
to the user, provide a copy, a means of exporting a copy, or a means
of obtaining a copy upon request, of the work in its original “Plain
Vanilla ASCII” or other form. Any alternate format must include the
full Project Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing
access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™
electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except
for the “Right of Replacement or Refund” described in paragraph
1.F.3, the Project Gutenberg Literary Archive Foundation, the owner
of the Project Gutenberg™ trademark, and any other party
distributing a Project Gutenberg™ electronic work under this
agreement, disclaim all liability to you for damages, costs and
expenses, including legal fees. YOU AGREE THAT YOU HAVE NO
REMEDIES FOR NEGLIGENCE, STRICT LIABILITY, BREACH OF
WARRANTY OR BREACH OF CONTRACT EXCEPT THOSE
PROVIDED IN PARAGRAPH 1.F.3. YOU AGREE THAT THE
FOUNDATION, THE TRADEMARK OWNER, AND ANY
DISTRIBUTOR UNDER THIS AGREEMENT WILL NOT BE LIABLE
TO YOU FOR ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL,
PUNITIVE OR INCIDENTAL DAMAGES EVEN IF YOU GIVE
NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you
discover a defect in this electronic work within 90 days of receiving it,
you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or entity
that provided you with the defective work may elect to provide a
replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth in
paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and distribution
of Project Gutenberg™ electronic works, harmless from all liability,
costs and expenses, including legal fees, that arise directly or
indirectly from any of the following which you do or cause to occur:
(a) distribution of this or any Project Gutenberg™ work, (b)
alteration, modification, or additions or deletions to any Project
Gutenberg™ work, and (c) any Defect you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.

The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many small
donations ($1 to $5,000) are particularly important to maintaining tax
exempt status with the IRS.

The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.

Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.