Descriptive Statistics-Summary Tables
Chapter 201
Introduction
This procedure is used to summarize continuous data. Large volumes of such data may be easily
summarized in statistical lists of means, counts, standard deviations, etc. Up to 8 categorical group variables
may be used to calculate summaries for individual group combinations. The summary lists may be output
directly to a new dataset.
This procedure produces lists of the following summary statistics:
• Count
• Missing Count
• Sum
• Mean
• Standard Deviation (Std Dev)
• Standard Error (Std Error)
• Lower 95% Confidence Limit for the Mean (95% LCL)
• Upper 95% Confidence Limit for the Mean (95% UCL)
• Median
• Minimum
• Maximum
• Range
• Interquartile Range (IQR)
• 10th Percentile (10th Pctile)
• 25th Percentile (25th Pctile)
• 75th Percentile (75th Pctile)
• 90th Percentile (90th Pctile)
• Variance
• Mean Absolute Deviation (MAD)
• Mean Absolute Deviation from the Median (MADM)
• Coefficient of Variation (COV)
• Coefficient of Dispersion (COD)
• Skewness
• Kurtosis
© NCSS, LLC. All Rights Reserved.
NCSS Statistical Software NCSS.com
Data Structure
The data below are a subset of the Resale dataset provided with the software. These (computer-simulated)
data give the selling price, the number of bedrooms, the total square footage (finished and unfinished), and
the size of the lots for 150 residential properties sold during the last four months in two states. These data are
representative of the type of data that may be analyzed with this procedure. Only the first 6 of the 150
observations are displayed.
Missing Values
Observations with missing values in either the group variables or the continuous data variables are ignored.
The procedure also allows you to specify up to 5 additional values to be considered as missing in categorical
group variables.
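The missing-value rule above can be illustrated with a minimal Python sketch. This is not the procedure's own code: the function names, the row layout (a list of dictionaries), and the sample values are all hypothetical, chosen only to show rows being dropped when either the group value or the data value is missing.

```python
import math

def is_missing(value, extra_missing=()):
    """Return True for None, NaN, blank strings, or a user-listed missing code."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return value in extra_missing

def usable_rows(rows, group_key, data_key, extra_missing=()):
    """Keep only rows whose group value and data value are both non-missing."""
    if len(extra_missing) > 5:
        # Mirrors the limit of 5 additional missing values described in the text.
        raise ValueError("at most 5 additional missing values may be given")
    return [
        row for row in rows
        if not is_missing(row[group_key], extra_missing)  # extra codes apply to groups
        and not is_missing(row[data_key])
    ]

# Hypothetical rows, for illustration only.
rows = [
    {"State": "Nevada", "Price": 150000.0},
    {"State": "Unknown", "Price": 210000.0},   # "Unknown" is treated as missing
    {"State": "Virginia", "Price": None},      # missing data value
    {"State": "Virginia", "Price": 185000.0},
]
kept = usable_rows(rows, "State", "Price", extra_missing=("Unknown",))
```

In this sketch the additional missing codes are applied only to the group variable, as the text describes.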
Summary Statistics
The following sections outline the summary statistics that are available in this procedure.
Count
The number of non-missing data values, n. If no frequency variable was specified, this is the number of rows
with non-missing values.
Missing Count
The number of missing data values. If no frequency variable was specified, this is the number of rows with
missing values.
Sum
The sum (or total) of the data values.
$$\mathrm{Sum} = \sum_{i=1}^{n} x_i$$
Mean
The average of the data values.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Variance
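The sample variance, $s^2$, is a popular measure of dispersion. It is an average of the squared deviations from
the mean, computed with a divisor of n − 1.
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Standard Deviation
The sample standard deviation, s, is the square root of the sample variance.
$$s = \sqrt{s^2}$$
Standard Error
The standard error of the mean, computed from the sample standard deviation and the number of non-missing
data values.
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$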
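As a rough illustration of the count, sum, mean, variance, and standard error definitions, the following Python sketch computes them for a small list of values. The function name and the sample values are hypothetical; the arithmetic uses the n − 1 divisor for the variance and s divided by the square root of n for the standard error.

```python
import math

def basic_summary(values):
    """Count, sum, mean, sample variance, standard deviation, standard error."""
    n = len(values)
    total = sum(values)
    mean = total / n
    # Sample variance: squared deviations from the mean, divided by n - 1.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    std_dev = math.sqrt(variance)
    std_error = std_dev / math.sqrt(n)
    return {
        "count": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_error": std_error,
    }

# Hypothetical values, for illustration only.
summary = basic_summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(summary)
```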
95% Confidence Interval for the Mean (95% LCL & 95% UCL)
These are the lower and upper limits of a 95% confidence interval estimate for the mean, based on a t
distribution with n – 1 degrees of freedom. This interval estimate assumes that the population standard
deviation is not known and that the data for this variable are normally distributed.
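One common way to write such limits is the mean plus or minus a t critical value times the standard error. The Python sketch below applies that form to the Sales Price figures reported in this chapter's example (mean 174392, standard deviation 97656.81, count 150). The function name is hypothetical, and the critical value 1.976 for 149 degrees of freedom is an assumed, rounded constant supplied by hand; in practice it would come from a t table or a statistics library.

```python
import math

# Assumed two-sided 95% critical value of the t distribution with
# 149 degrees of freedom, rounded to three decimals.
T_CRIT_149 = 1.976

def mean_confidence_limits(mean, std_dev, n, t_crit):
    """Return (lower, upper) limits: mean -/+ t_crit * std_dev / sqrt(n)."""
    std_error = std_dev / math.sqrt(n)
    margin = t_crit * std_error
    return mean - margin, mean + margin

# Sales Price values taken from the example summary table in this chapter.
lcl, ucl = mean_confidence_limits(174392, 97656.81, 150, T_CRIT_149)
print(round(lcl), round(ucl))
```

With these inputs the limits should land within rounding distance of the 158636 and 190148 shown for Sales Price in the example output.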
Minimum
The smallest data value.
Maximum
The largest data value.
Range
The difference between the largest and smallest data values.
Percentiles
The 100pth percentile is the value below which 100p% of data values may be found (and above which
100(1 - p)% of data values may be found). The 100pth percentile is computed as
$$Z_{100p} = (1 - g)\,X_{[k_1]} + g\,X_{[k_2]}$$
where k1 equals the integer part of p(n + 1), k2 = k1 + 1, g is the fractional part of p(n + 1), and X[k] is the kth
observation when the data are sorted from lowest to highest.
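The percentile definition can be sketched in Python as follows, assuming the percentile is the linear interpolation (1 − g)X[k1] + gX[k2] between the two neighboring sorted observations. The function names are hypothetical, and the handling of positions that fall before the first or after the last observation (clamping to the minimum or maximum) is an assumption of this sketch, not something stated in the text.

```python
import math

def percentile(values, p):
    """100p-th percentile using position p(n + 1) with linear interpolation."""
    xs = sorted(values)
    n = len(xs)
    pos = p * (n + 1)
    k1 = math.floor(pos)   # integer part of p(n + 1)
    g = pos - k1           # fractional part of p(n + 1)
    if k1 < 1:             # assumption: clamp to the smallest value
        return xs[0]
    if k1 >= n:            # assumption: clamp to the largest value
        return xs[-1]
    # X[k] is 1-based in the text, so X[k1] is xs[k1 - 1] and X[k2] is xs[k1].
    return (1 - g) * xs[k1 - 1] + g * xs[k1]

def interquartile_range(values):
    """Difference between the 75th and 25th percentiles."""
    return percentile(values, 0.75) - percentile(values, 0.25)

# Hypothetical values, for illustration only.
data = [5, 1, 4, 2, 3]
print(percentile(data, 0.25), percentile(data, 0.50), percentile(data, 0.75))
```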
Median
The median (or 50th percentile) is the “middle number” of the sorted data values.
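$$\mathrm{Median} = Z_{50}$$
Mean Absolute Deviation (MAD)
$$\mathrm{MAD} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$
Mean Absolute Deviation from the Median (MADM)
$$\mathrm{MADM} = \frac{\sum_{i=1}^{n} |x_i - \mathrm{Median}|}{n}$$
Coefficient of Variation (COV)
$$\mathrm{COV} = \frac{s}{\bar{x}}$$
Coefficient of Dispersion (COD)
$$\mathrm{COD} = \frac{\mathrm{MADM}}{\mathrm{Median}} = \frac{\left(\dfrac{\sum_{i=1}^{n} |x_i - \mathrm{Median}|}{n}\right)}{\mathrm{Median}}$$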
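The MAD, MADM, COV, and COD statistics can be sketched in Python as shown below. The function name and the sample values are hypothetical. The sketch uses the standard library's `statistics.median`, which averages the two middle values when n is even; this agrees with the 50th percentile at position 0.5(n + 1) under linear interpolation.

```python
import statistics

def dispersion_summary(values):
    """MAD, MADM, COV, and COD for a list of numeric values."""
    n = len(values)
    mean = sum(values) / n
    median = statistics.median(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    mad = sum(abs(x - mean) for x in values) / n      # deviations from the mean
    madm = sum(abs(x - median) for x in values) / n   # deviations from the median
    return {
        "MAD": mad,
        "MADM": madm,
        "COV": s / mean,       # undefined when the mean is zero
        "COD": madm / median,  # undefined when the median is zero
    }

# Hypothetical values, for illustration only.
result = dispersion_summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(result)
```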
Skewness
Measures the direction and degree of asymmetry in the data distribution.
$$\mathrm{Skewness} = \frac{m_3}{m_2^{3/2}}$$
where
$$m_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}$$
Kurtosis
Measures the heaviness of the tails in the data distribution.
$$\mathrm{Kurtosis} = \frac{m_4}{m_2^{2}}$$
where
$$m_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}$$
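The moment-based skewness and kurtosis can be sketched in Python as follows, with the rth central moment taken as the sum of (x − mean) to the rth power divided by n. The function names and sample values are hypothetical. Note that this kurtosis is not the "excess" form: a normal distribution has a value of about 3 on this scale.

```python
def central_moment(values, r):
    """r-th central moment with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def skewness(values):
    """m3 / m2^(3/2): zero for symmetric data, positive for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """m4 / m2^2: larger values indicate heavier tails."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

# Hypothetical values, for illustration only.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 10.0]
print(skewness(symmetric), kurtosis(symmetric), skewness(right_tailed))
```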
Setup
To run this example, complete the following steps:
Variables Tab
Summary Table
────────────────────────────────────────────────────────────────────────────────
                                              Variable
                    ────────────────────────────────────────────────────────────
                           Sales                              Garage  Total Area
Statistic                  Price    Bedrooms   Bathrooms        Size      (Sqft)
────────────────────────────────────────────────────────────────────────────────
Count                        150         150         150         150         150
Mean                      174392        2.42         2.4    1.266667     1893.38
Standard Deviation      97656.81   0.8919476   0.8047677   0.5636252    754.2496
Lower 95% CL Mean         158636    2.276093    2.270158    1.175731    1771.689
Upper 95% CL Mean         190148    2.563908    2.529842    1.357602    2015.071
────────────────────────────────────────────────────────────────────────────────
The table is created with the statistics as rows and the data variables as columns when the positions are
both set to “Auto”.
The plots are not very informative because the variables have vastly different scales.
Variables Tab
Summary Table
────────────────────────────────────────────────────────────────────────────────
                                             Statistic
                    ────────────────────────────────────────────────────────────
                                                Standard   Lower 95%   Upper 95%
Variable                   Count        Mean   Deviation     CL Mean     CL Mean
────────────────────────────────────────────────────────────────────────────────
Sales Price                  150      174392    97656.81      158636      190148
Bedrooms                     150        2.42   0.8919476    2.276093    2.563908
Bathrooms                    150         2.4   0.8047677    2.270158    2.529842
Garage Size                  150    1.266667   0.5636252    1.175731    1.357602
Total Area (Sqft)            150     1893.38    754.2496    1771.689    2015.071
────────────────────────────────────────────────────────────────────────────────
The table is now rotated with the data variables as rows and the statistics as columns. Notice that the actual
summary statistic values are exactly the same.
Setup
To run this example, complete the following steps:
Variables Tab
Summary Table
──────────────────────────────────────────────────────────────────────
                                                Variable
                                  ────────────────────────────────────
                                         Sales  Total Area    Lot Size
State         Statistic                  Price      (Sqft)      (Sqft)
──────────────────────────────────────────────────────────────────────
Nevada        Count                         88          88          88
              Mean                    170762.5     1881.33    8571.454
              Standard Deviation      98665.72     788.569     2419.88
Virginia      Count                         62          62          62
              Mean                    179543.5    1910.484    8076.597
              Standard Deviation      96771.49    708.6572    2301.226
The table displays the group variable values as the rows, the statistics as the subrows, and the data
variables as the columns. The plots are not shown because the variables have vastly different scales, which
makes the plots uninformative. Totals are given for the group variable.
Variables Tab
Summary Table
──────────────────────────────────────────────────────────────────────
                                           State
                                  ────────────────────────
Variable      Statistic                 Nevada    Virginia       Total
──────────────────────────────────────────────────────────────────────
Sales Price   Count                         88          62         150
              Mean                    170762.5    179543.5      174392
              Standard Deviation      98665.72    96771.49    97656.81
The table is now rotated with the data variables as rows and the group variable values as columns. Notice
that the actual summary statistic values are exactly the same.
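A table with one column per group value plus a Total column could be computed along the lines of the Python sketch below. The function names, the row layout, and the sample values are hypothetical; the point is only that the Total entry is computed from all non-missing rows pooled together, not by averaging the per-group results.

```python
import statistics
from collections import defaultdict

def describe(values):
    """Count, mean, and sample standard deviation of a list of numbers."""
    return {
        "Count": len(values),
        "Mean": statistics.mean(values),
        "Standard Deviation": statistics.stdev(values),
    }

def grouped_summary(rows, group_key, data_key):
    """Summaries for each group value, plus a 'Total' over all rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[data_key])
    table = {name: describe(values) for name, values in sorted(groups.items())}
    pooled = [value for values in groups.values() for value in values]
    table["Total"] = describe(pooled)
    return table

# Hypothetical rows, for illustration only.
rows = [
    {"State": "Nevada", "Price": 100.0},
    {"State": "Nevada", "Price": 200.0},
    {"State": "Virginia", "Price": 300.0},
    {"State": "Virginia", "Price": 500.0},
    {"State": "Virginia", "Price": 400.0},
]
table = grouped_summary(rows, "State", "Price")
for name, stats in table.items():
    print(name, stats)
```

Because the result is a nested dictionary keyed first by group value and then by statistic, rotating the table only changes the order in which the keys are looped over when printing; the values themselves do not change.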
Variables Tab
Summary Table
────────────────────────────────────────────────────────────
                                     Statistic
                        ────────────────────────────────────
                                                    Standard
Variable      State            Count        Mean   Deviation
────────────────────────────────────────────────────────────
Sales Price   Nevada              88    170762.5    98665.72
              Virginia            62    179543.5    96771.49
              Total              150      174392    97656.81
The table now has the data variables as rows and the group variable values as subrows with the statistics as
columns.
Setup
To run this example, complete the following steps:
Variables Tab
Output
The table displays Group Variable 1 (Drug) values as the rows, the statistics as the subrows, and Group
Variable 2 (Time) values as the columns.
Individual plots are created with the table row item (Group Variable 1, “Drug”) on the group (X) axis and
the table column item (Group Variable 2, “Time”) as the legend variable. A separate plot is created for each
statistic. These plots are very useful for seeing overall trends. From the plots shown here, it is apparent that
the average and minimum pain responses are lower for both drugs than for placebo and that pain control
improves over time. Based on these results, Kerlosin appears to control pain the best. Statistical tests would
need to be performed, however, to assert that the differences are statistically significant.
The combined plot displays all of the information in the table. We rotated the group axis labels so that they
would not overlap and would remain readable. The table row item (Group Variable 1, “Drug”) and table subrow
item (Statistic) are combined on the group (X) axis. The table column item (Group Variable 2, “Time”) is the
legend variable.
Variables Tab
1     Kerlosin     69.86      62      64      68      77      80
      Laposec      71.00      65      67      71      77      78
      Placebo      76.29      66      69      78      82      84
2     Kerlosin     54.00      46      47      56      57      64
      Laposec      60.57      54      56      61      65      66
      Placebo      72.14      67      70      73      73      76
3     Kerlosin     40.14      33      37      41      45      46
      Laposec      51.71      47      49      50      56      58
      Placebo      65.43      60      62      67      68      70
────────────────────────────────────────────────────────────────
The table displays Group Variable 2 (Time) values as the rows, Group Variable 1 (Drug) values as the
subrows, and the statistics as the columns.
The individual plots are different now, with the table row item (Group Variable 2, “Time”) on the group (X)
axis and the table column item (Group Variable 1, “Drug”) as the legend variable. A separate plot is created
for each statistic. These plots are again useful for seeing overall trends. There is a very distinct reduction in
pain over time.
Again, the combined plot displays all of the information in the table. The table row item (Group Variable 2,
“Time”) and table subrow item (Group Variable 1, “Drug”) are combined on the group (X) axis. The table
column item (Statistic) is the legend variable.
Variables Tab
(Report continues with table and plot for each Data Variable/Statistic combination)
A separate table is created for each Data Variable/Statistic combination. If more than one data variable were
entered, the report would be even longer. There is no combined plot in the output because the combined
plot is the same as the individual plot in this case.