TEXTS IN COMPUTER SCIENCE
THE
Data Science Design
MANUAL
Steven S. Skiena
Texts in Computer Science
Series editor
David Gries
Orit Hazzan
Fred B. Schneider
More information about this series at http://www.springer.com/series/3191
Steven S. Skiena
Series editors
David Gries
Department of Computer Science
Cornell University
Ithaca, NY
USA
Fred B. Schneider
Department of Computer Science
Cornell University
Ithaca, NY
USA
Orit Hazzan
Faculty of Education in Science and Technology
Technion—Israel Institute of Technology
Haifa
Israel
This book was advertised with a copyright holder in the name of the publisher in error, whereas the author(s) holds the copyright.
Making sense of the world around us requires obtaining and analyzing data from
our environment. Several technology trends have recently collided, providing
new opportunities to apply our data analysis savvy to greater challenges than
ever before.
Computer storage capacity has increased exponentially; indeed remembering
has become so cheap that it is almost impossible to get computer systems to forget.
Sensing devices increasingly monitor everything that can be observed: video
streams, social media interactions, and the position of anything that moves.
Cloud computing enables us to harness the power of massive numbers of machines
to manipulate this data. Indeed, hundreds of computers are summoned
each time you do a Google search, scrutinizing all of your previous activity just
to decide which is the best ad to show you next.
The result of all this has been the birth of data science, a new field devoted
to maximizing value from vast collections of information. As a discipline, data
science sits somewhere at the intersection of statistics, computer science, and
machine learning, but it is building a distinct heft and character of its own.
This book serves as an introduction to data science, focusing on the skills and
principles needed to build systems for collecting, analyzing, and interpreting
data.
My professional experience as a researcher and instructor convinces me that
one major challenge of data science is that it is considerably more subtle than it
looks. Any student who has ever computed their grade point average (GPA) can
be said to have done rudimentary statistics, just as drawing a simple scatter plot
lets you add experience in data visualization to your resume. But meaningfully
analyzing and interpreting data requires both technical expertise and wisdom.
That so many people do these basics so badly provides my inspiration for writing
this book.
To the Reader
I have been gratified by the warm reception that my book The Algorithm Design
Manual [Ski08] has received since its initial publication in 1997. It has been
recognized as a unique guide to using algorithmic techniques to solve problems
that often arise in practice. The book you are holding covers very different
material, but with the same motivation.
Equally important is what you will not find in this book. I do not emphasize
any particular language or suite of data analysis tools. Instead, this book provides
a high-level discussion of important design principles. I seek to operate at
a conceptual level more than a technical one. The goal of this manual is to get
you going in the right direction as quickly as possible, with whatever software
tools you find most accessible.
To the Instructor
This book covers enough material for an “Introduction to Data Science” course
at the undergraduate or early graduate student levels. I hope that the reader
has completed the equivalent of at least one programming course and has a bit
of prior exposure to probability and statistics, but more is always better than
less.
I have made a full set of lecture slides for teaching this course available online
at http://www.data-manual.com. Data resources for projects and assignments
are also available there to aid the instructor. Further, I make available online
video lectures using these slides to teach a full-semester data science course. Let
me help teach your class, through the magic of the web!
Pedagogical features of this book include:
• Chapter Notes: Finally, each tutorial chapter concludes with a brief notes
section, pointing readers to primary sources and additional references.
Dedication
My bright and loving daughters Bonnie and Abby are now full-blown teenagers,
meaning that they don’t always process statistical evidence with as much alacrity
as I would desire. I dedicate this book to them, in the hope that their analysis
skills improve to the point that they always just agree with me.
And I dedicate this book to my beautiful wife Renee, who agrees with me
even when she doesn’t agree with me, and loves me beyond the support of all
creditable evidence.
Acknowledgments
My list of people to thank is large enough that I have probably missed some.
I will try to enumerate them systematically to minimize omissions, but ask
those I’ve unfairly neglected for absolution.
First, I thank those who made concrete contributions to help me put this
book together. Yeseul Lee served as an apprentice on this project, helping with
figures, exercises, and more during summer 2016 and beyond. You will see
evidence of her handiwork on almost every page, and I greatly appreciate her
help and dedication. Aakriti Mittal and Jack Zheng also contributed to a few
of the figures.
Students in my Fall 2016 Introduction to Data Science course (CSE 519)
helped to debug the manuscript, and they found plenty of things to debug. I
particularly thank Rebecca Siford, who proposed over one hundred corrections
on her own. Several data science friends/sages reviewed specific chapters for
me, and I thank Anshul Gandhi, Yifan Hu, Klaus Mueller, Francesco Orabona,
Andy Schwartz, and Charles Ward for their efforts here.
I thank all the Quant Shop students from Fall 2015 whose video and modeling
efforts are so visibly on display. I particularly thank Jan (Dini) Diskin-Zimmerman,
whose editing efforts went so far beyond the call of duty I felt like
a felon for letting her do it.
My editors at Springer, Wayne Wheeler and Simon Rees, were a pleasure to
work with as usual. I also thank all the production and marketing people who
helped get this book to you, including Adrian Pieron and Annette Anlauf.
Several exercises were originated by colleagues or inspired by other sources.
Reconstructing the original sources years later can be challenging, but credits
for each problem (to the best of my recollection) appear on the website.
Much of what I know about data science has been learned through working
with other people. These include my Ph.D. students, particularly Rami al-Rfou,
Mikhail Bautin, Haochen Chen, Yanqing Chen, Vivek Kulkarni, Levon Lloyd,
Andrew Mehler, Bryan Perozzi, Yingtao Tian, Junting Ye, Wenbin Zhang, and
postdoc Charles Ward. I fondly remember all of my Lydia project masters
students over the years, and remind you that my prize offer to the first one who
names their daughter Lydia remains unclaimed. I thank my other collaborators
with stories to tell, including Bruce Futcher, Justin Gardin, Arnout van de Rijt,
and Oleksii Starov.
I remember all members of the General Sentiment/Canrock universe, particularly
Mark Fasciano, with whom I shared the start-up dream and experienced
what happens when data hits the real world. I thank my colleagues at Yahoo
Labs/Research during my 2015–2016 sabbatical year, when much of this book
was conceived. I single out Amanda Stent, who enabled me to be at Yahoo
during that particularly difficult year in the company’s history. I learned valuable
things from other people who have taught related data science courses,
including Andrew Ng and Hans-Peter Pfister, and thank them all for their help.
Caveat
It is traditional for the author to magnanimously accept the blame for whatever
deficiencies remain. I don’t. Any errors, deficiencies, or problems in this book
are somebody else’s fault, but I would appreciate knowing about them so as to
determine who is to blame.
Steven S. Skiena
Department of Computer Science
Stony Brook University
Stony Brook, NY 11794-2424
http://www.cs.stonybrook.edu/~skiena
[email protected]
May 2017
Contents
2 Mathematical Preliminaries
2.1 Probability
2.1.1 Probability vs. Statistics
2.1.2 Compound Events and Independence
2.1.3 Conditional Probability
2.1.4 Probability Distributions
2.2 Descriptive Statistics
2.2.1 Centrality Measures
2.2.2 Variability Measures
2.2.3 Interpreting Variance
2.2.4 Characterizing Distributions
2.3 Correlation Analysis
2.3.1 Correlation Coefficients: Pearson and Spearman Rank
2.3.2 The Power and Significance of Correlation
2.3.3 Correlation Does Not Imply Causation!
3 Data Munging
3.1 Languages for Data Science
3.1.1 The Importance of Notebook Environments
3.1.2 Standard Data Formats
3.2 Collecting Data
3.2.1 Hunting
3.2.2 Scraping
3.2.3 Logging
3.3 Cleaning Data
3.3.1 Errors vs. Artifacts
3.3.2 Data Compatibility
3.3.3 Dealing with Missing Values
3.3.4 Outlier Detection
3.4 War Story: Beating the Market
3.5 Crowdsourcing
3.5.1 The Penny Demo
3.5.2 When is the Crowd Wise?
3.5.3 Mechanisms for Aggregation
3.5.4 Crowdsourcing Services
3.5.5 Gamification
3.6 Chapter Notes
3.7 Exercises
13 Coda
13.1 Get a Job!
13.2 Go to Graduate School!
13.3 Professional Consulting Services
14 Bibliography
Chapter 1
What is Data Science?
What is data science? Like any emerging field, it hasn’t been completely defined
yet, but you know enough about it to be interested or else you wouldn’t be
reading this book.
I think of data science as lying at the intersection of computer science, statistics,
and substantive application domains. From computer science come machine
learning and high-performance computing technologies for dealing with
scale. From statistics comes a long tradition of exploratory data analysis, significance
testing, and visualization. From application domains in business and
the sciences come challenges worthy of battle, and evaluation standards to
assess when they have been adequately conquered.
But these are all well-established fields. Why data science, and why now? I
see three reasons for this sudden burst of activity:
• New technology makes it possible to capture, annotate, and store vast
amounts of social media, logging, and sensor data. After you have amassed
all this data, you begin to wonder what you can do with it.
• Computing advances make it possible to analyze data in novel ways and at
ever increasing scales. Cloud computing architectures give even the little
guy access to vast power when they need it. New approaches to machine
learning have led to amazing advances in longstanding problems, like
computer vision and natural language processing.
• Prominent technology companies (like Google and Facebook) and quantitative
hedge funds (like Renaissance Technologies and TwoSigma) have
proven the power of modern data analytics. Success stories applying data
to such diverse areas as sports management (Moneyball [Lew04]) and election
forecasting (Nate Silver [Sil12]) have served as role models to bring
data science to a large popular audience.
© The Author(s) 2017
S.S. Skiena, The Data Science Design Manual,
Texts in Computer Science, DOI 10.1007/978-3-319-55444-0_1
This introductory chapter has three missions. First, I will try to explain how
good data scientists think, and how this differs from the mindset of traditional
programmers and software developers. Second, we will look at data sets in terms
of the potential for what they can be used for, and learn to ask the broader
questions they are capable of answering. Finally, I introduce a collection of
data analysis challenges that will be used throughout this book as motivating
examples.
• Data vs. method centrism: Scientists are data driven, while computer
scientists are algorithm driven. Real scientists spend enormous amounts
of effort collecting data to answer their question of interest. They invent
fancy measuring devices, stay up all night tending to experiments, and
devote most of their thinking to how to get the data they need.
By contrast, computer scientists obsess about methods: which algorithm
is better than which other algorithm, which programming language is best
for a job, which program is better than which other program. The details
of the data set they are working on seem comparatively unexciting.
• Concern about results: Real scientists care about answers. They analyze
data to discover something about how the world works. Good scientists
care about whether the results make sense, because they care about what
the answers mean.
By contrast, bad computer scientists worry about producing plausible-looking
numbers. As soon as the numbers stop looking grossly wrong,
they are presumed to be right. This is because they are personally less
invested in what can be learned from a computation, as opposed to getting
it done quickly and efficiently.
• Robustness: Real scientists are comfortable with the idea that data has
errors. In general, computer scientists are not. Scientists think a lot about
possible sources of bias or error in their data, and how these possible problems
can affect the conclusions derived from them. Good programmers use
strong data-typing and parsing methodologies to guard against formatting
errors, but the concerns here are different.
Becoming aware that data can have errors is empowering. Computer
scientists chant “garbage in, garbage out” as a defensive mantra to ward
off criticism, a way to say that’s not my job. Real scientists get close
enough to their data to smell it, giving it the sniff test to decide whether
it is likely to be garbage.
Aspiring data scientists must learn to think like real scientists. Your job is
going to be to turn numbers into insight. It is important to understand the why
as much as the how.
To be fair, it benefits real scientists to think like data scientists as well. New
experimental technologies enable measuring systems on vastly greater scale than
ever possible before, through technologies like full-genome sequencing in biology
and full-sky telescope surveys in astronomy. With new breadth of view comes
new levels of vision.
Traditional hypothesis-driven science was based on asking specific questions
of the world and then generating the specific data needed to confirm or deny
them. This is now augmented by data-driven science, which instead focuses on
generating data on a previously unheard of scale or resolution, in the belief that
new discoveries will come as soon as one is able to look at it. Both ways of
thinking will be important to us:
There is another way to capture this basic distinction between software engineering
and data science. It is that software developers are hired to build
systems, while data scientists are hired to produce insights.
This may be a point of contention for some developers. There exists an
important class of engineers who wrangle the massive distributed infrastructures
necessary to store and analyze, say, financial transaction or social media data
• What things might you be able to learn from a given data set?
The key is thinking broadly: the answers to big, general questions often lie
buried in highly-specific data sets, which were by no means designed to contain
them.
Figure 1.2: Personal information on every major league baseball player is available
at http://www.baseball-reference.com.
The most obvious types of questions to answer with this data are directly
related to baseball:
These are interesting questions. But even more interesting are questions
about demographic and social issues. Almost 20,000 major league baseball players
have taken the field over the past 150 years, providing a large, extensively-documented
cohort of men who can serve as a proxy for even larger, less well-documented
populations. Indeed, we can use this baseball player data to answer
questions like:
• How often do people return to live in the same place where they were
born? Locations of birth and death have been extensively recorded in this
data set. Further, almost all of these people played at least part of their
career far from home, thus exposing them to the wider world at a critical
time in their youth.
• To what extent have heights and weights been increasing in the population
at large?
There are two particular themes to be aware of here. First, the identifiers
and reference tags (i.e. the metadata) often prove more interesting in a data set
than the stuff we are supposed to care about, here the statistical record of play.
Second is the idea of a statistical proxy, where you use the data set you have
to substitute for the one you really want. The data set of your dreams likely
does not exist, or may be locked away behind a corporate wall even if it does.
A good data scientist is a pragmatist, seeing what they can do with what they
have instead of bemoaning what they cannot get their hands on.
Figure 1.3: Representative film data from the Internet Movie Database.
Figure 1.4: Representative actor data from the Internet Movie Database.
Perhaps the most natural questions to ask IMDb involve identifying the
extremes of movies and actors:
• Which actors appeared in the most films? Earned the most money? Appeared
in the lowest rated films? Had the longest career or the shortest
lifespan?
• What was the highest rated film each year, or the best in each genre?
Which movies lost the most money, had the highest-powered casts, or got
the least favorable reviews?
Then there are larger-scale questions one can ask about the nature of the
motion picture business itself:
• How well does movie gross correlate with viewer ratings or awards? Do
customers instinctively flock to trash, or is virtue on the part of the creative
team properly rewarded?
• How do Hollywood movies compare to Bollywood movies, in terms of ratings,
budget, and gross? Are American movies better received than foreign
films, and how does this differ between U.S. and non-U.S. reviewers?
• What is the age distribution of actors and actresses in films? How much
younger is the actress playing the wife, on average, than the actor playing
the husband? Has this disparity been increasing or decreasing with time?
• Live fast, die young, and leave a good-looking corpse? Do movie stars live
longer or shorter lives than bit players, or compared to the general public?
Assuming that people working together on a film get to know each other,
the cast and crew data can be used to build a social network of the movie
business. What does the social network of actors look like? The Oracle of
Bacon (https://oracleofbacon.org/) posits Kevin Bacon as the center of
the Hollywood universe and generates the shortest path to Bacon from any
other actor. Other actors, like Samuel L. Jackson, prove even more central.
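The shortest-path idea can be illustrated with a breadth-first search over a co-star graph. This is only a minimal sketch: the cast lists, film titles, and actor names below are hypothetical toy data rather than IMDb records, the function names are invented for illustration, and nothing here is claimed to be how the Oracle of Bacon itself is implemented.

```python
from collections import defaultdict, deque

def build_costar_graph(casts):
    """Connect every pair of actors who appear in the same film."""
    graph = defaultdict(set)
    for actors in casts.values():
        for a in actors:
            for b in actors:
                if a != b:
                    graph[a].add(b)
    return graph

def shortest_path(graph, source, target):
    """Breadth-first search: return one shortest chain of co-stars, or None."""
    if source == target:
        return [source]
    parent = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = current
            if neighbor == target:
                # Walk the parent links back to the source.
                path = [target]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical toy cast lists, not real IMDb records.
casts = {
    "Film A": ["Actor 1", "Actor 2"],
    "Film B": ["Actor 2", "Actor 3"],
    "Film C": ["Actor 3", "Actor 4"],
    "Film D": ["Actor 5", "Actor 6"],
}
graph = build_costar_graph(casts)
path = shortest_path(graph, "Actor 1", "Actor 4")
print(path, "distance:", len(path) - 1)
```

Breadth-first search suits this problem because every co-star link counts the same, so the first time the target is reached the chain is already as short as possible.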
More critically, can we analyze this data to determine the probability that
someone will like a given movie? The technique of collaborative filtering finds
people who liked films that I also liked, and recommends other films that they
liked as good candidates for me. The 2007 Netflix Prize was a $1,000,000 competition
to produce a ratings engine 10% better than the proprietary Netflix
system. The ultimate winner of this prize (BellKor) used a variety of data
sources and techniques, including the analysis of links [BK07].
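To make the idea of collaborative filtering concrete, here is a minimal user-based sketch. It is an illustration under stated assumptions, not the method used by Netflix or by the prize winners: the ratings are hypothetical toy data, cosine similarity is one of several reasonable ways to compare viewers, and the function names are invented for this example.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Similarity of two viewers' rating dictionaries (0.0 if no shared films)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[film] * b[film] for film in shared)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)

def recommend(ratings, user, top_n=3):
    """Rank films the user has not seen by similarity-weighted average rating."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue                  # ignore viewers with no overlap in taste
        for film, rating in other_ratings.items():
            if film not in ratings[user]:
                totals[film] = totals.get(film, 0.0) + sim * rating
                weights[film] = weights.get(film, 0.0) + sim
    predicted = {film: totals[film] / weights[film] for film in totals}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "me":    {"Film A": 5, "Film B": 4},
    "fan":   {"Film A": 5, "Film B": 5, "Film C": 5, "Film D": 1},
    "other": {"Film E": 4},
}
suggestions = recommend(ratings, "me")
print(suggestions)
```

Here only "fan" shares any films with "me", so the suggestions are the films "fan" rated that "me" has not seen, ordered by those ratings. "Film E" never appears because its only rater has nothing in common with "me".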
Figure 1.5: The rise and fall of data processing, as witnessed by Google Ngrams.
Google makes this data freely available. So what are you going to do with it?
Observing the time series associated with particular words using the Ngrams
Viewer is fun. But more sophisticated historical trends can be captured by
aggregating multiple time series together. The following types of questions
seem particularly interesting to me:
• How has the amount of cursing changed over time? Use of the four-letter
words I am most familiar with seems to have exploded since 1960,
although it is perhaps less clear whether this reflects increased cussing or
lower publication standards.
• How often do new words emerge and get popular? Do these words tend
to stay in common usage, or rapidly fade away? Can we detect when
words change meaning over time, like the transition of gay from happy to
homosexual?
You can also use this Ngrams corpus to build a language model that captures
the meaning and usage of the words in a given language. We will discuss word
embeddings in Section 11.6.3, which are powerful tools for building language
models. Frequency counts reveal which words are most popular. The frequency
of word pairs appearing next to each other can be used to improve speech
recognition systems, helping to distinguish whether the speaker said that’s too
bad or that’s to bad. These millions of books provide an ample data set to build
representative models from.
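The word-pair idea can be sketched in a few lines. The example below counts adjacent word pairs (bigrams) in a tiny made-up corpus and uses those counts to choose between two candidate transcriptions. The corpus, the add-one scoring rule, and the function names are illustrative assumptions; a real system would draw its counts from the Ngrams data and use a proper probabilistic language model.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count how often each pair of adjacent words occurs."""
    return Counter(zip(tokens, tokens[1:]))

def more_likely(counts, candidates):
    """Pick the candidate whose word pairs are most frequent in the corpus."""
    def score(words):
        result = 1
        for pair in zip(words, words[1:]):
            result *= counts[pair] + 1   # add one so unseen pairs do not zero the score
        return result
    return max(candidates, key=score)

# Tiny made-up corpus standing in for millions of books.
corpus = "that's too bad but it is too late to go back to bed".split()
counts = bigram_counts(corpus)

best = more_likely(counts, [
    ["that's", "too", "bad"],
    ["that's", "to", "bad"],
])
print(" ".join(best))
```

The pairs ("that's", "too") and ("too", "bad") each occur in the corpus, while ("that's", "to") and ("to", "bad") never do, so the first transcription scores higher.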
The Ignition Point in the Cycle.—In practice, the firing takes place
before the crank has made the turn past the dead center, and this is
called pre-ignition, when the spark is advanced too far to the left.
The ignition should take place slightly before the crank turns,
because it takes a small interval of time for the charge to burn the
gases, and during this time the crank will have passed the dead
center, and started on its way downwardly.
From the diagrams it will be observed that two of the strokes,
namely the first and the third, are downward, and the second and
fourth are upward, and that the downward strokes take place during
the admission and impulse, and the compression and exhaust while
the piston moves upwardly.
The Fly-Wheel.—As the impulse in this type can take place only at
each second revolution, it is obvious that some means must be
provided to keep the shaft moving during the two turns, and for this
purpose the fly-wheel is utilized.
Practice has found the multi-cylinder type the most valuable, in
connection with the fly-wheel, as in employing two or more cylinders
in line, a smaller fly-wheel will be sufficient.
Impulses in 4-Cylinder Engine.—In such a case the four cylinders are
arranged so the impulse will be at four different points of the shaft,
and we may assume that the four cylinders in Figs. 63, 64, 65 and
66, show the relative positions of the four pistons in a four cylinder
engine.
The Cylinder Case, and Connections.—A cross section of a case and
the relative positions of the various parts, is shown in Fig. 67. The
cylinder A is provided with a water jacket B, so as to form a space C
around the cylinder which has an inlet pipe D at the bottom, and an
outlet pipe E at the upper end.
Fig. 72a. Increasing Cooling Area.
The area forward of the engine is the most available space for
placing the water tank, both because the radiator itself may be
utilized for inclosing the engine hood, and because the air, which is
only partially heated in passing through the radiator, serves to keep
the space within the hood reasonably cool.
Force System of Cooling.—Under the circumstances the water
should be caused to circulate by mechanical means, which, while it
adds another operative element to the machinery, is nevertheless so
much more effective that it is worth the care, attention and expense
which are involved.
The Radiator Connection.—In Fig. 74 a radiator, engine and
circulating system are connected together to show the relative
arrangement of the various elements, in which the pump A is placed
in the pipe line B running from the lower end of the radiator C to the
manifold D at the lower end of the water jacket of the engine.
The upper end of the radiator is connected by a pipe E with the
top of the jacket, and the pipes are thus so disposed as to be free of
the other mechanism, and are all contained within the hood of the
engine.
A fan F, suitably geared to the crank shaft of the engine, provides
a means for inducing an air current through the radiator whenever
the engine is running.
Radiators.—Much time and money has been spent in developing a
simple and efficient type of radiator. As, of necessity, it must be
made up of a multiplicity of parts, leakage is apt to occur, and while
in the past most of the constructions depended on soldering
together the various portions, it will be seen how insecure such a
system of construction must be necessarily.
Construction of Radiator.—In Fig. 75, is shown a front and a
sectional view of a portion of a simple type, which is made up of
square tubes A, their ends being fitted into square holes formed
through front and rear plates B C, and the tubes are so arranged
that there are small spaces D between the tubes.
When water enters through the inlet tube E, it fills the spaces, and
being cooled moves downwardly, while the air rushing through the
open-ended tubes, cools down the water over the large area thus
afforded.
All radiators employ substantially the same construction, the
illustration given being merely to show the principle of the device.
A drain cock G, Fig. 74 should be placed in the system below the
radiator, in the pipe line B, so that water can be drained off from all
the pipes, to prevent liability of freezing. The diagram shows the fan
shaft connected and run by a belt H. This is not the best
construction, as it is not a positive drive. Most cars are provided with
gearing for this purpose.
Operation of Radiator.—The water is thus carried from the bottom
of the radiator to the water jacket space, and from the upper end of
the jacketed area to the top of the radiator, and used over again.
More or less of the water is lost by evaporation, so more must be
added from time to time, and the radiator should be kept as full as
possible to get the best results. If the water level falls too far below
the return pipe at the top of the radiator, the area of the heating
surface and the decreased quantity of water exposed to the cooling
surface, are likely to cause undue heating, or vaporization.
The Pump.—A variety of pumps are used, but they are generally
based on the principle of the turbine impelling system, or on
centrifugal action. A type which utilizes both these principles is
shown in Figs. 76 and 77, in which the former is a cross vertical
section of 77 along line 1, and the latter is a central vertical section
on line 2 of Fig. 76.
The device comprises a cylindrical shell A, with an inlet B, at one
edge near the front wall, and an outlet C at the upper edge near the
rear wall.
Pump Construction.—Within is a revoluble tubular hub D, with one
end E projecting, to which power is applied. A disk partition G is
secured to this hub, midway between its ends, and on each side of
the partition is a pair of oppositely-projecting convolute blades,
those on the inlet side, indicated by H, and the ones in the discharge
side by I.
Fig. 76. Side View of Pump.
Fig. 77. Section.
Thus far we have the fuel oil control, together with the manner in
which the primary air supply is introduced. We shall now go a step
further, and illustrate the mixing chamber, discharge and throttle.
The Throttle Valve.—Referring to Fig. 80 it will be seen that directly
above the venturi tube described, is a space O. This is the mixing
chamber, which has an outlet P to the left, which connects with the
engine cylinders.
Within this tube is a throttle valve Q, operated by the throttle lever
on the steering wheel of the car. It is simply a disk which fits into the
interior of the conduit and is adapted to be turned by a stem R, on
which it is mounted.
While the lower inlets K are designed to supply the primary air for
carburetion, it is found necessary to admit a secondary supply, and
this should be taken into the mixing chamber directly instead of
passing the tube which conveys the oil.
The Secondary Air Supply.—The particular reasons for thus admitting
the air may be explained as follows: When the engine draws in a
supply of carbureted air, more or less of a vacuum is brought about
in the mixing chamber O. The faster the engine runs the richer will
the mixture become, because the additional suction draws in an
increasing quantity of gasoline, but the throat of the tube does not
change, and the requisite, proportionate quantity of air does not
follow, so that the mixture has too much fuel for the air.
Automatic Admission of Secondary Air.—If the engine should be
speeded up so twice the amount of oil is drawn into the mixing
chamber, the additional suction will not, at the same time, draw in
twice the amount of air.
This necessitates a provision whereby the secondary air shall be
admitted automatically only at times when the suction exceeds the
normal requirement, or to prevent too rich a mixture, which is
explained by reference to Fig. 80.
We may connect up the six cells in such a way that we can get
First: 9 volts, and 25 amperes, equal to 225 watts, or,
Second: 1-1/2 volts and 150 amperes, equal to 225 watts, or,
Third: 4-1/2 volts and 50 amperes, also equal to 225 watts.
In each case, you will see we have 225 watts. These three
windings are designated as series, parallel, and series multiple.
The Series Connection.—The illustration, Fig. 84, shows the series
winding. Here the positive wire B is connected with the carbon pole
C, and the wire D, wired up with the zinc pole, E, the connections
being made directly through each cell, to the outlet wire F. Now, as
we have six cells, the combined voltage is 1-1/2 × 6 = 9 volts.
As, however, all the cells now act as one cell, the amperage is just
the same as of one cell, namely, 25.
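The arithmetic for the three connections can be checked with a short sketch. It assumes identical cells of 1-1/2 volts and 25 amperes each, as in the text. The function name is invented for illustration, and the grouping used for the series multiple case (three cells in series per group, two groups in parallel) is an inference from the 4-1/2 volt, 50 ampere figures rather than something the text states.

```python
def battery_output(cells_in_series, parallel_groups, cell_volts=1.5, cell_amps=25.0):
    """Return (volts, amperes, watts) for a bank of identical cells.

    Cells in series add their voltages; groups in parallel add their currents.
    """
    volts = cells_in_series * cell_volts
    amps = parallel_groups * cell_amps
    return volts, amps, volts * amps

# Six cells arranged three ways (series multiple grouping is an assumption).
series = battery_output(cells_in_series=6, parallel_groups=1)
parallel = battery_output(cells_in_series=1, parallel_groups=6)
series_multiple = battery_output(cells_in_series=3, parallel_groups=2)

for name, result in [("series", series),
                     ("parallel", parallel),
                     ("series multiple", series_multiple)]:
    volts, amps, watts = result
    print(f"{name}: {volts} volts, {amps} amperes, {watts} watts")
```

Whatever the arrangement, the product of volts and amperes stays at 225 watts, because the same six cells are supplying the power in every case.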