Dejan Sarka

Advanced Analytics with Transact-SQL


Exploring Hidden Patterns and Rules in Your Data
1st ed.
Dejan Sarka
Ljubljana, Slovenia

ISBN 978-1-4842-7172-8 e-ISBN 978-1-4842-7173-5


https://doi.org/10.1007/978-1-4842-7173-5

© Dejan Sarka 2021

This work is subject to copyright. All rights are solely and exclusively
licensed by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in
any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather
than use a trademark symbol with every occurrence of a trademarked
name, logo, or image we use the names, logos, and images only in an
editorial fashion and to the benefit of the trademark owner, with no
intention of infringement of the trademark. The use in this publication
of trade names, trademarks, service marks, and similar terms, even if
they are not identified as such, is not to be taken as an expression of
opinion as to whether or not they are subject to proprietary rights.

The publisher, the authors and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Apress imprint is published by the registered company APress
Media, LLC part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY
10004, U.S.A.
Introduction
If you want to learn how to get information from your data with
Transact-SQL, or the T-SQL language, this book is for you. It teaches you
how to calculate statistical measures from descriptive statistics,
including centers, spreads, skewness, and the kurtosis of a distribution;
find the associations between pairs of variables, including calculating
the linear regression formula; calculate the confidence level with
definite integration; find the amount of information in your variables;
and do some machine learning or data science analysis, including
predictive modeling and text mining.
The T-SQL language is available in the latest editions of SQL Server, Azure SQL
Database, and Azure Synapse Analytics. It has so many business
intelligence (BI) improvements that SQL Server might become your primary
analytic database system. Many database developers and
administrators are already proficient with T-SQL. Occasionally they
need to analyze the data with statistical or data science methods, but
they do not want to, or do not have time to, learn a completely new language for
these tasks. In addition, they need to analyze huge amounts of data,
where specialized languages like R and Python might not be fast
enough. SQL Server has been optimized for work with big datasets for
decades.
To get the maximum out of these language constructs, you need to
learn how to use them properly. This in-depth book shows extremely
efficient statistical queries that use the window functions and are
optimized through algorithms that use mathematical knowledge and
creativity. The formulas and usage of those statistical procedures are
explained as well.
Any serious analysis starts with data preparation. This book
introduces some common data preparation tasks and shows how to
implement them in T-SQL.
No analysis is good without good data quality. The book introduces
data quality issues and shows how you can check for completeness and
accuracy with T-SQL and measure improvements in data quality over
time. It also shows how you can optimize queries with temporal data;
for example, when you search for overlapping intervals. More advanced
time-oriented information includes hazard and survival analysis.
Next, the book turns to data science. Some advanced algorithms can
be implemented in T-SQL. You learn about the market basket analysis
with association rules using different measures like support and
confidence, and sequential market basket analysis when there is a
sequence in the basket. Then the book shows how to develop predictive
models with a mixture of k-nearest neighbor and decision tree
algorithms and Bayesian inference analysis.
Analyzing text, or text mining, is a popular topic. You can do a lot of
text mining in pure T-SQL, and SQL Server can become a text mining
engine. The book explains how to analyze text in multiple natural
languages with pure T-SQL and features from full-text search (FTS).
In short, this book teaches you how to use T-SQL for statistical
analysis, data science methods, and text mining.

Who Should Read This Book


Advanced Analytics with Transact-SQL is for database developers and
database administrators who want to take their T-SQL programming
skills to the max. It is for those who want to efficiently analyze huge
amounts of data by using their existing knowledge of the T-SQL
language. It is also for those who want to improve querying by learning
new and original optimization techniques.

Assumptions
This book assumes that the reader already has good knowledge of the
Transact-SQL language. A few years of coding experience is very
welcome. A basic grasp of performance tuning and query optimization
can help you better understand how the code works.

The Organization of This Book


There are eight chapters in this book, which are logically structured in
four parts, each part with two chapters. The following is a brief
description of the chapters.
Part I: Statistics Most advanced analytics starts with good old
statistics. Sometimes statistical analysis might already provide the
needed information, and sometimes statistics is only used in an
overview of the data.
Chapter 1 : Descriptive Statistics With descriptive statistics, the
analyst gets an understanding of the distribution of a variable. One can
analyze either continuous or discrete variables. Depending on the
variable type, the analyst must choose the appropriate statistical
measures.
Chapter 2 : Associations Between Pairs of Variables When
measuring associations between pairs of variables, there are three
possibilities: both variables are continuous, both are discrete, or one is
continuous and the other one is discrete. Based on the type of the
variables, different measures of associations can be calculated. To
calculate the statistical significance of associations, the calculation of
the definite integrals is needed.
Part II: Data Quality and Preparation Before doing advanced
analyses, it is crucial to understand the quality of the input data. A lot of
additional work with appropriate data preparation is usually a big part
of an analytics project in real life.
Chapter 3 : Data Preparation There is no end to data preparation
tasks. Some of the most common tasks include converting strings to
numerical variables and discretizing continuous variables. Missing
values are typical in analytical projects. Many times, derived variables
help explain the values of a target variable more than the original input.
Chapter 4 : Data Quality Garbage in, garbage out is a very old rule.
Before doing advanced analyses, it is always advisable to check
the data quality. Measuring improvements in data quality over time
can help with understanding the factors that influence it.
Part III: Dealing with Time Queries that deal with time-oriented
data are very frequent in analytical systems. Beyond a simple
comparison of data in different time periods, much more complex
problems and analyses can arise.
Chapter 5 : Time-Oriented Data Understanding what kind of
temporal data can appear in a database is very important. Some types
of queries that deal with temporal data are hard to optimize. Data
preparation of time series data has its own rules as well.
Chapter 6 : Time-Oriented Analyses How long is a customer
faithful to the supplier or the subscribed services and service provider?
Which are the most hazardous days for losing a customer? What will be
the sales amount in the next few periods? This chapter shows how to
answer these questions with T-SQL.
Part IV: Data Science Some of the most advanced algorithms for
analyzing data are often grouped under the term data science.
Expressions such as data mining, machine learning, and text mining are also
very popular.
Chapter 7 : Data Mining Every online or retail shop wants to know
which products customers tend to buy together. Predicting a target
discrete or continuous variable with few input variables is important in
practically every type of business. This chapter introduces some of the
most popular algorithms implemented with T-SQL.
Chapter 8 : Text Mining The last chapter of the book introduces
text mining with T-SQL. Text mining can include semantic search, term
extraction, quantitative analysis of words and characters, and more.
Data mining algorithms like association rules can also be used to
understand analyzed text better.

System Requirements
You need the following software to run the code samples in this book.
Microsoft SQL Server 2019 Developer or Enterprise edition, which is
at www.microsoft.com/en-us/sql-server/sql-server-2019.
Azure SQL Server Managed Instance, which is at
https://azure.microsoft.com/en-us/services/azure-sql/sql-managed-instance/.
Most of the code should run on the Azure SQL Database, which is at
https://azure.microsoft.com/en-us/services/sql-database/.
If you would like to try the code on unlimited resources, you can use
Azure Synapse Analytics at
https://azure.microsoft.com/en-us/services/synapse-analytics/.
SQL Server Management Studio is the default client tool. You can
download it for free at
https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15.
Another free client tool is the Azure Data Studio at
https://docs.microsoft.com/en-us/sql/azure-data-studio/download-azure-data-studio?view=sql-server-ver15.
For demo data, you can find the AdventureWorks sample databases at
https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=ssms.
Some demo data comes from R. I explain how to get it and show the R
code for loading the demo data in SQL Server.
You can download all the code used for the companion content in this
book at https://github.com/Apress/adv-analytics-w-transact-sql.

Naming Conventions
When I create tables in SQL Server, I start with the column(s) that form
the primary key, and I use Pascal case (e.g., FirstName) for the physical
columns. For computed, typically aggregated columns from a query, I
tend to use camel case (e.g., avgAmount). However, the book deals with
the data from many sources. Demo data provided from Microsoft demo
databases is not enough for all my examples. Two demo tables come
from R. In R, the naming convention is not strict. I had a choice on how
to proceed.
I decided to go with the original names when data comes from R, so
the names of the columns in the table are all lowercase (e.g., carbrand).
However, Microsoft demo data is far from perfect as well. Many
dynamic management objects return all lowercase objects or even
reserved keywords as the names of the columns. For example, in
Chapter 8, I use two tabular functions provided by Microsoft, which
return two columns with names [KEY] and [RANK]. Both names are uppercase,
and both are even reserved words in SQL, so they need to be enclosed
in brackets. This is why sometimes the reader might get the impression
that the naming convention is not good. I made an arbitrary decision,
which I hope was the best in this situation.
Acknowledgments
Many people helped me with this book, directly or indirectly. I am not
mentioning the names explicitly because I could unintentionally omit
some of them. However, I am pretty sure they are going to recognize
themselves in the following text.
Thank you to my family, who understood that I could not spend so
much time with them while I was writing the book.
Thank you to my friends, who were always encouraging me to finish
the work.
Thank you to all the great people that work for or are engaged with
Apress. Without your constant support, this book would probably never
have been finished.
Thank you to the reviewers. A well-reviewed book is as important
as the writing itself. Thank you very much for your work.
Table of Contents
Part I: Statistics
Chapter 1: Descriptive Statistics
Variable Types
Demo Data
Frequency Distribution of Discrete Variables
Frequencies of Nominals
Frequencies of Ordinals
Descriptive Statistics for Continuous Variables
Centers of a Distribution
Measuring the Spread
Skewness and Kurtosis
Conclusion
Chapter 2: Associations Between Pairs of Variables
Associations Between Continuous Variables
Covariance
Correlation
Interpreting the Correlation
Associations Between Discrete Variables
Contingency Tables
Chi-Squared Test
Associations Between Discrete and Continuous Variables
Testing Continuous Variable Moments over a Discrete Variable
Analysis of Variance
Definite Integration
Conclusion
Part II: Data Preparation and Quality
Chapter 3: Data Preparation
Dealing with Missing Values
NULLs in T-SQL Functions
Handling NULLs
String Operations
Scalar String Functions
Aggregating and Splitting Strings
Derived Variables and Grouping Sets
Adding Computed Columns
Efficient Grouping
Data Normalization
Range and Z-score Normalization
Logistic and Hyperbolic Tangent Normalization
Recoding Variables
Converting Strings to Numerics
Discretizing Numerical Variables
Conclusion
Chapter 4: Data Quality and Information
Data Quality
Measuring Completeness
Finding Inaccurate Data
Measuring Data Quality over Time
Measuring the Information
Introducing Entropy
Mutual Information
Conditional Entropy
Conclusion
Part III: Dealing with Time
Chapter 5: Time-Oriented Data
Application and System Times
Inclusion Constraints
Demo Data
System-Versioned Tables and Issues
A Quick Introduction to System-Versioned Tables
Querying System-Versioned Tables Surprises
Optimizing Temporal Queries
Modifying the Filter Predicate
Using the Unpacked Form
Time Series
Moving Averages
Conclusion
Chapter 6: Time-Oriented Analyses
Demo Data
Exponential Moving Average
Calculating EMA Efficiently
Forecasting with EMA
ABC Analysis
Relational Division
Top Customers and Products
Duration of Loyalty
Survival Analysis
Hazard Analysis
Conclusion
Part IV: Data Science
Chapter 7: Data Mining
Demo Data
Linear Regression
Autoregression and Forecasting
Association Rules
Starting from the Negative Side
Frequency of Itemsets
Association Rules
Look-Alike Modeling
Training and Test Data Sets
Performing Predictions with LAM
Naïve Bayes
Training the NB Model
Performing Predictions with NB
Conclusion
Chapter 8: Text Mining
Demo Data
Introducing Full-Text Search
Full-Text Predicates
Full-Text Functions
Statistical Semantic Search
Quantitative Analysis
Analysis of Letters
Word Length Analysis
Advanced Analysis of Text
Term Extraction
Words Associations
Association Rules with Many Items
Conclusion
Index
About the Author
Dejan Sarka, MCT and Data Platform MVP, is an independent trainer and consultant
who focuses on developing database and business intelligence
applications, with more than 30 years of experience in this field.
Besides projects, he spends about half of his time on training and
mentoring. He is the founder of the Slovenian SQL Server and .NET
Users Group. Sarka is the author or co-author of 19 books about
databases and SQL Server. He has developed many courses and
seminars for Microsoft, RADACAD, SolidQ, and Pluralsight.
About the Technical Reviewer
Ed Pollack has more than 20 years of experience in database and systems
administration, developing a passion for performance optimization,
database design, and wacky analytics. He has spoken at many user
groups, data conferences, and summits. This led him to organize SQL
Saturday Albany, which has become an annual event for New York’s
Capital Region. Sharing these experiences with the community is a
passion. In his free time, Ed enjoys video games, backpacking, traveling,
and cooking exceptionally spicy foods.
Part I
Statistics
© The Author(s), under exclusive license to APress Media, LLC, part of Springer
Nature 2021
D. Sarka, Advanced Analytics with Transact-SQL
https://doi.org/10.1007/978-1-4842-7173-5_1

1. Descriptive Statistics
Dejan Sarka1
(1) Ljubljana, Slovenia

Descriptive statistics summarize or quantitatively describe variables
from a dataset. In SQL Server, a dataset is a set of rows, or a
rowset, that comes from a table, view, or tabular expression.
A variable is stored in a column of the rowset. In statistics, a variable is
frequently called a feature.
When you analyze a variable, you first want to understand the
distribution of its values. You can get a better understanding through
graphical representation and descriptive statistics. Both are important.
For most people, a graphical representation is easier to understand.
However, with descriptive statistics, where you get information through
numbers, it is simpler to analyze a lot of variables and compare their
aggregated values; for example, their means and variability. You can
always order numbers and quickly notice which variable has a higher
mean, median, or other measure.
Transact-SQL is not very useful for graphing. Therefore, I focus on
calculating descriptive statistics measures. I also include a few graphs,
which I created with Power BI.

Variable Types
Before I calculate the summary values, I need to introduce the types of
variables. Different types of variables require different calculations.
Variables are basically divided into two
groups: discrete and continuous.
Discrete variables can only take a value from a limited pool. For
example, there are only seven different or distinct values for the days of
the week. Discrete variables can be further divided into two groups:
nominal and ordinal.
If the values of a variable have no quantitative meaning (e.g., they are
labels for groups), it is a nominal variable. For example, a variable that describes
marital status could have three possible values: single, married, or
divorced.
Discrete variables that have an intrinsic order are
called ordinal variables. If the values are represented as numbers, it is
easy to notice the order. For example, evaluating a product purchased
on a website could be expressed with numbers from 1 to 7, where a
higher number means greater satisfaction with the product. If the
values of a variable are represented with strings, it is sometimes harder
to notice the order. For example, education could be represented with
strings, like high school degree, graduate degree, and so forth. You
probably don’t want to sort the values alphabetically because there is
an order hidden in the values. With education, the order is defined
through the years of schooling needed to get the degree.
If a discrete variable can take only two distinct values, it is a
dichotomous variable, also called an indicator, a flag, or a binary variable. If
the variable can only take a single value, it is a constant. Constants are
not useful for analysis; there is no information in a constant. After all,
variables are called variables because they introduce some variability.
Continuous variables can take a value from an unlimited,
uncountable set of possible values. They are represented with integral
or decimal numbers. They can be further divided into two classes:
intervals and numerics (or true numerics).
Intervals are limited on the lower side, the upper side, or both sides.
For example, temperature is an interval, limited by absolute zero on
the lower side. On the other hand, true numerics have no limits on either
side. For example, cashflow can be positive, negative, or zero.
It is not always completely clear if a variable is discrete or
continuous. For example, the number of cars owned is an integer and
can take any value from zero upward. You can use such
variables in both ways: as discrete, when needed, or as continuous.
For example, the naïve Bayes algorithm, which is explained in Chapter
7, uses only discrete variables, so you can treat the number of cars
owned variable as discrete. But the linear regression algorithm, which
is explained in the same chapter, uses only continuous variables, and
you can treat the same variable as continuous.

Demo Data
I use a couple of demo datasets for the demos in this book. In this
chapter, I use the mtcars demo dataset that comes from the R language;
mtcars is an acronym for MotorTrend Car Road Tests. The dataset
includes 32 cases, or rows, originally with 11 variables. For demo
purposes, I add a few calculated variables. The data comes from a 1974
MotorTrend magazine and includes design and performance aspects for
32 cars, all 1973 and 1974 models. You can learn more about this
dataset at
www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/mtcars.
I introduce variables when needed.
From SQL Server 2016, it is easy to execute R code inside SQL
Server Database Engine. You can learn more about machine learning
inside SQL Server with R or the Python language in official Microsoft
documentation. A good introduction is at
https://docs.microsoft.com/en-us/sql/machine-learning/sql-server-machine-learning-services?view=sql-server-ver15.
Since this book is about T-SQL and not R,
I will not spend more time explaining the R part of the code. I introduce
the code that I used to import the mtcars dataset, with some additional
calculated columns, into a SQL Server table.
First, you need to enable external scripts execution in SQL Server.

-- Configure SQL Server to enable external scripts
USE master;
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'external scripts enabled', 1;
RECONFIGURE;
GO
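Before moving on, it can be worth confirming that the option took effect. The following is a minimal sketch, not part of the book's code: it assumes that Machine Learning Services with the R runtime is installed on the instance, and the column name hello is purely illustrative. Depending on the SQL Server version, the running value of the option may only change after the instance is restarted.

```sql
-- Show the configured and running values of the option
EXEC sys.sp_configure 'external scripts enabled';
GO
-- Smoke test (assumes the R runtime is installed):
-- send a one-row set to R and return it unchanged
EXEC sys.sp_execute_external_script
 @language = N'R',
 @script = N'OutputDataSet <- InputDataSet;',
 @input_data_1 = N'SELECT 1 AS hello'
WITH RESULT SETS ((hello int NOT NULL));
GO
```

If the second statement returns a single row, the Database Engine can reach the external R runtime.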
I created a new table in the AdventureWorksDW2017 demo
database, which is a Microsoft-provided demo database. I use the data
from this database later in this book as well. You can find the
AdventureWorks sample databases at
https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=ssms.
For now, I won’t spend
more time on the content of this database. I just needed a database to
create a table in, and because I use this database later, it seems like the
best place for my first table with demo data. Listing 1-1 shows the
T-SQL code for creating the demo table.

-- Create a new table in the AWDW database
USE AdventureWorksDW2017;
DROP TABLE IF EXISTS dbo.mtcars;
CREATE TABLE dbo.mtcars
(
 mpg numeric(8,2),
 cyl int,
 disp numeric(8,2),
 hp int,
 drat numeric(8,2),
 wt numeric(8,3),
 qsec numeric(8,2),
 vs int,
 am int,
 gear int,
 carb int,
 l100km numeric(8,2),
 dispcc numeric(8,2),
 kw numeric(8,2),
 weightkg numeric(8,2),
 transmission nvarchar(10),
 engine nvarchar(10),
 hpdescription nvarchar(10),
 carbrand nvarchar(20) PRIMARY KEY
);
GO
Listing 1-1 Creating the Demo Table
I want to discuss the naming conventions in this book. When I
create tables in SQL Server, I start with the column(s) that form the
primary key and use Pascal case (e.g., FirstName) for the physical
columns. For computed columns, typically aggregated columns from a
query, I tend to use camel case (e.g., avgAmount). However, the book
deals with data from many sources. Demo data provided from Microsoft
demo databases is not enough for all of my examples. Two demo tables
come from R. In R, the naming convention is not strict. I had a choice to
make on how to proceed. I decided to go with the original names when
data comes from R, so the names of the columns in the table in Listing
1-1 are all lowercase (e.g., carbrand).

Note Microsoft demo data is far from perfect. Many dynamic
management objects return all lowercase objects or reserved
keywords as the names of the columns. For example, in Chapter 8, I
use two tabular functions by Microsoft that return two columns
named [KEY] and [RANK]. Both are uppercase reserved words in
SQL, so they need to be enclosed in brackets.

Now let’s use the sys.sp_execute_external_script system
stored procedure to execute the R code. Listing 1-2 shows how to
execute the INSERT...EXECUTE T-SQL statement to get the R
dataset in a SQL Server table.

-- Insert the mtcars dataset
INSERT INTO dbo.mtcars
EXECUTE sys.sp_execute_external_script
 @language = N'R',
 @script = N'
data("mtcars")
# Metric equivalents of mpg, disp, hp, and wt
mtcars$l100km = round(235.214583 / mtcars$mpg, 2)
mtcars$dispcc = round(mtcars$disp * 16.38706, 2)
mtcars$kw = round(mtcars$hp * 0.7457, 2)
mtcars$weightkg = round(mtcars$wt * 1000 * 0.453592, 2)
# Descriptive labels for the am and vs flags
mtcars$transmission = ifelse(mtcars$am == 0, "Automatic", "Manual")
mtcars$engine = ifelse(mtcars$vs == 0, "V-shape", "Straight")
# Horsepower in three ordered classes
mtcars$hpdescription =
  factor(ifelse(mtcars$hp > 175, "Strong",
                ifelse(mtcars$hp < 100, "Weak", "Medium")),
         order = TRUE,
         levels = c("Weak", "Medium", "Strong"))
mtcars$carbrand = row.names(mtcars)
 ',
 @output_data_1_name = N'mtcars';
GO
Listing 1-2 Inserting R Data in the SQL Server Demo Table
You can check whether the demo data was imported successfully with a simple
SELECT statement.

SELECT *
FROM dbo.mtcars;

Now that the demo data is loaded, let’s start analyzing it.
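A slightly stricter check is also possible. The following sketch, which is not part of the book's code, counts the rows and recalculates two of the derived columns in T-SQL with the same conversion factors used in Listing 1-2. The 0.01 tolerance and the numRows alias are assumptions made only for this illustration.

```sql
-- The dataset has 32 cases, so the table should hold 32 rows
SELECT COUNT(*) AS numRows
FROM dbo.mtcars;

-- Rows whose stored derived values differ from a T-SQL recalculation
-- by more than 0.01 (an arbitrary tolerance chosen for this sketch)
SELECT c.carbrand, c.mpg, c.l100km, c.hp, c.kw
FROM dbo.mtcars AS c
WHERE ABS(c.l100km - 235.214583 / c.mpg) > 0.01
   OR ABS(c.kw - c.hp * 0.7457) > 0.01;
```

If the data was loaded as in Listing 1-2, the second query should return no rows, because the stored values are the same expressions rounded to two decimals.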

Frequency Distribution of Discrete Variables


You usually represent the distribution of a discrete variable with a
frequency distribution, or frequencies. In the simplest case, you
calculate only the count of each value. You can also express these counts
as percentages of the total number of rows or cases.

Frequencies of Nominals
The following is a simple example of calculating the counts and
percentages for the transmission variable, which shows the
transmission type.
-- Simple, nominals
SELECT c.transmission,
 COUNT(c.transmission) AS AbsFreq,
 CAST(ROUND(100. * (COUNT(c.transmission)) /
  (SELECT COUNT(*) FROM dbo.mtcars), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS c
GROUP BY c.transmission;
The following is the result.

transmission AbsFreq     AbsPerc
------------ ----------- -----------
Automatic    19          59
Manual       13          41

I used a simple GROUP BY clause of the SELECT statement and the
COUNT() aggregate function. Graphically, you can represent the
distribution with vertical or horizontal bar charts. Figure 1-1 shows the
bar charts for three variables from the mtcars dataset, created with
Power BI.
Figure 1-1 Bar charts for discrete variables
You can see the distribution of the transmission, engine, and
cyl variables. The cyl variable is represented with the numbers 4, 6,
and 8, which represent the number of engine cylinders. Can you create
a bar chart with T-SQL? You can use the percentage number as a
parameter to the REPLICATE() function and mimic the horizontal bar
chart, or a horizontal histogram, as the following code shows.

WITH freqCTE AS
(
SELECT c.transmission,
 COUNT(c.transmission) AS AbsFreq,
 CAST(ROUND(100. * (COUNT(c.transmission)) /
  (SELECT COUNT(*) FROM dbo.mtcars), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS c
GROUP BY c.transmission
)
SELECT transmission,
 AbsFreq,
 AbsPerc,
 CAST(REPLICATE('*', AbsPerc) AS varchar(50)) AS Histogram
FROM freqCTE;
I used a common table expression to enclose the first query, which
calculated the counts and the percentages, and then added the
horizontal bars in the outer query. Figure 1-2 shows the result.

Figure 1-2 Counts with a horizontal bar

For nominal variables, this is usually all that you calculate. For
ordinals, you can also calculate running totals.
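
As a hedged alternative to the scalar subquery used in the queries above, the total number of rows can also be obtained with a window aggregate over the grouped result. This is only a sketch, not the author's code. It assumes the same dbo.mtcars table, and it uses COUNT(*) instead of COUNT(c.transmission), so rows where transmission is NULL would also be counted.

```sql
-- Sketch only: percentages via a window aggregate instead of a subquery.
-- Assumes the dbo.mtcars table used in the queries above.
SELECT c.transmission,
  COUNT(*) AS AbsFreq,
  -- SUM(COUNT(*)) OVER () adds up the per-group counts,
  -- which gives the total number of rows in the table
  CAST(ROUND(100. * COUNT(*) / SUM(COUNT(*)) OVER (), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS c
GROUP BY c.transmission;
```

The empty OVER () clause makes the window span all groups, so the table is referenced only once in the query text.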

Frequencies of Ordinals
Ordinals have intrinsic order. When you sort the values in the correct
order, it makes sense to also calculate the running totals. What is the
total count of cases up to some specific value? What is the running total
of percentages? You can use the T-SQL window aggregate functions to
calculate the running totals. Listing 1-3 shows the calculation for the
cyl variable.

-- Ordinals - simple with numerics
WITH frequency AS
(
SELECT v.cyl,
  COUNT(v.cyl) AS AbsFreq,
  CAST(ROUND(100. * (COUNT(v.cyl)) /
    (SELECT COUNT(*) FROM dbo.mtcars), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS v
GROUP BY v.cyl
)
SELECT cyl,
  AbsFreq,
  SUM(AbsFreq)
    OVER(ORDER BY cyl
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumFreq,
  AbsPerc,
  SUM(AbsPerc)
    OVER(ORDER BY cyl
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumPerc,
  CAST(REPLICATE('*', AbsPerc) AS varchar(50)) AS Histogram
FROM frequency
ORDER BY cyl;
Listing 1-3 Frequencies of an Ordinal Variable
The query returns the result shown in Figure 1-3.

Figure 1-3 Frequencies of an ordinal variable

Note If you are not familiar with the T-SQL window functions and
the OVER() clause, please refer to the official SQL Server
documentation at
https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15.
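
One detail worth checking in Listing 1-3 is that CumPerc sums percentages that are already rounded, so in general the last value can end up slightly above or below 100. The following sketch is a variation under stated assumptions, not the book's listing. It derives the cumulative percentage from the cumulative count and rounds only once. The column name CumPercExact is hypothetical.

```sql
-- Sketch only: cumulative percentage computed from cumulative counts.
-- Assumes the dbo.mtcars table; CumPercExact is a hypothetical column name.
WITH frequency AS
(
SELECT v.cyl,
  COUNT(v.cyl) AS AbsFreq
FROM dbo.mtcars AS v
GROUP BY v.cyl
)
SELECT cyl,
  AbsFreq,
  SUM(AbsFreq)
    OVER(ORDER BY cyl
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumFreq,
  -- running count divided by the grand total, rounded a single time
  CAST(ROUND(100. *
    SUM(AbsFreq)
      OVER(ORDER BY cyl
           ROWS BETWEEN UNBOUNDED PRECEDING
            AND CURRENT ROW) /
    SUM(AbsFreq) OVER(), 0) AS int) AS CumPercExact
FROM frequency
ORDER BY cyl;
```

Because the last running count equals the grand total, the final CumPercExact value is always 100 with this approach.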

Ordering by the cyl variable was simple because the values are
represented with integral numbers, and the order is automatically
correct. But if an ordinal is represented with strings, you need to be
careful with the proper order. You probably do not want to use
alphabetical order.
For a demo, I created (already in the R code) a derived variable,
hpdescription, from the continuous hp variable. It shows engine
horsepower in three classes: weak, medium, and strong. The following
query incorrectly returns the result in alphabetical order.

-- Ordinals - incorrect order with strings
WITH frequency AS
(
SELECT v.hpdescription,
  COUNT(v.hpdescription) AS AbsFreq,
  CAST(ROUND(100. * (COUNT(v.hpdescription)) /
    (SELECT COUNT(*) FROM dbo.mtcars), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS v
GROUP BY v.hpdescription
)
SELECT hpdescription,
  AbsFreq,
  SUM(AbsFreq)
    OVER(ORDER BY hpdescription
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumFreq,
  AbsPerc,
  SUM(AbsPerc)
    OVER(ORDER BY hpdescription
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumPerc,
  CAST(REPLICATE('*', AbsPerc) AS varchar(50)) AS Histogram
FROM frequency
ORDER BY hpdescription;

The results of this query are shown in Figure 1-4.
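
One possible way to get the intended order is to map each class to a number with a CASE expression and sort on that number instead of on the string. The following is a minimal sketch and not necessarily the approach the author takes. It assumes the stored labels are exactly 'Weak', 'Medium', and 'Strong' (the text only names the classes, not their spelling in the table), and the ordered CTE and hpOrder column are hypothetical names introduced for illustration.

```sql
-- Sketch only: explicit ordinal order for a string-valued variable.
-- Assumption: the labels are stored as 'Weak', 'Medium', 'Strong'.
WITH frequency AS
(
SELECT v.hpdescription,
  COUNT(v.hpdescription) AS AbsFreq,
  CAST(ROUND(100. * (COUNT(v.hpdescription)) /
    (SELECT COUNT(*) FROM dbo.mtcars), 0) AS int) AS AbsPerc
FROM dbo.mtcars AS v
GROUP BY v.hpdescription
),
ordered AS
(
SELECT hpdescription,
  AbsFreq,
  AbsPerc,
  -- hypothetical helper column that encodes the intended order
  CASE hpdescription
    WHEN 'Weak' THEN 1
    WHEN 'Medium' THEN 2
    WHEN 'Strong' THEN 3
  END AS hpOrder
FROM frequency
)
SELECT hpdescription,
  AbsFreq,
  SUM(AbsFreq)
    OVER(ORDER BY hpOrder
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumFreq,
  AbsPerc,
  SUM(AbsPerc)
    OVER(ORDER BY hpOrder
         ROWS BETWEEN UNBOUNDED PRECEDING
          AND CURRENT ROW) AS CumPerc
FROM ordered
ORDER BY hpOrder;
```

If a label does not match any WHEN branch, hpOrder is NULL and that row sorts first, which makes a misspelled or unexpected class easy to spot.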

