math notes module 4A

Uploaded by

Coley Boyd
Copyright
© All Rights Reserved
Basics of statistics

● Descriptive vs inferential statistics


○ Descriptive statistics focus on gathering, sorting, summarizing, and displaying
data, while inferential statistics focus on using descriptive statistics to estimate
population parameters based on sample data.
● Population vs parameter
○ The population of a study is the group the collected data is intended to
describe.
○ A parameter is a value, such as an average or a percentage, calculated using
all data from the population.
■ Parameters are rarely calculated directly because collecting data from an
entire population takes a lot of time and resources, unless the population
is small or the data has already been collected.
○ A census is a survey of an entire population.
● Sample vs statistic
○ Samples are smaller subsets of the entire population, ideally one that is
representative of the population as a whole.
○ A statistic is a value that is calculated using data from a sample.
○ Example
■ The city of Raleigh has 8,700 registered voters. There are two candidates
for city council in an upcoming election: Brown and Feliz. The day before
the election, a telephone poll of 600 randomly selected registered voters
was conducted. 197 said they'd vote for Brown, 388 said they'd vote for
Feliz, and 15 were undecided.
■ Identify the following:
● Population
● Sample
● Sample statistic
● Number and percentage of voters expected to vote for (1) Brown,
(2) Feliz, and (3) Undecided. Round your answers to the nearest
person. Round your percentages to the nearest tenth.
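■ Population: all 8,700 registered voters
■ Sample: the 600 registered voters who were polled
■ Sample statistic: the poll percentages below (for example, 64.7% for Feliz)
■ Expected voters:
● Brown: 2,857, 32.8%
● Feliz: 5,626, 64.7%
● Undecided: 218, 2.5%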
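The arithmetic in this example can be checked with a short Python sketch. The variable and function names are illustrative, and it assumes the poll proportions carry over unchanged to all 8,700 registered voters. Python's built-in `round` rounds exact halves to the nearest even number, so a small helper is used here to round halves up instead.

```python
# Poll results from the example (600 randomly selected registered voters).
population_size = 8700
poll = {"Brown": 197, "Feliz": 388, "Undecided": 15}
sample_size = sum(poll.values())  # 600


def round_half_up_ratio(numerator, denominator):
    """Round numerator/denominator to the nearest integer, halves rounding up."""
    return (2 * numerator + denominator) // (2 * denominator)


results = {}
for candidate, count in poll.items():
    # Sample statistic: percentage of the sample, to the nearest tenth.
    percent = round(100 * count / sample_size, 1)
    # Scale the sample proportion up to the whole population.
    expected_voters = round_half_up_ratio(count * population_size, sample_size)
    results[candidate] = (expected_voters, percent)

for candidate, (expected_voters, percent) in results.items():
    print(f"{candidate}: {expected_voters} voters, {percent}%")
```

Because each count is rounded separately, the three expected counts do not have to add up to exactly 8,700.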
● Data Types
○ Categorical/qualitative data are pieces of data that allow us to classify the
objects under investigation into different categories.
○ Quantitative data are responses that are numerical in nature, with which we can
perform meaningful arithmetic calculations.
○ Examples
■ Classify these as categorical or quantitative
● Zip codes: Categorical
● Eye color of a certain group: Categorical
● Daily high temperature of a city over several weeks: Quantitative
● Annual income: Quantitative
● Sampling methods
○ A sampling method is biased if not every member of the population has a fair
chance of being included.
○ Random samples are ones where each member of the population has an equal
chance of being chosen.
■ A simple random sample is one where each member of the population
and every group of members of a given size has an equal chance of being
chosen.
○ Stratified sampling is where a population is divided into a number of subgroups
known as strata. Random samples are then taken from each stratum, with sample
sizes proportional to the size of the subgroup in the population.
■ Quota sampling is a variation on stratified sampling, where samples are
collected in each stratum until the quota is met.
○ Cluster sampling is where the population is separated into subgroups known as
clusters, and a set of these clusters is randomly selected to make up the sample.
○ Systematic sampling is where every Nth member of a population is selected
for the sample.
○ Convenience sampling is where the sample is made up of whoever is most
convenient to reach.
○ Voluntary response sampling is where the sample is made up of people who
volunteer to respond.
○ Examples
■ Identify each type of sampling used.
● A sample was selected to contain 25 men and 35 women.
Stratified sampling.
● Viewers of a new show are asked to vote on a website. Voluntary
response.
● Every 4th member of a class was asked. Systematic.
● A website randomly sends a survey to 50 users. Simple random.
● To survey voters in a town, a polling company randomly selects 10
city blocks and interviews everyone who lives on those blocks. Cluster.
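The sampling methods above can be sketched in Python with the standard `random` module. The population of 100 labeled people, the 40/60 split into two strata, and the blocks of 10 are all hypothetical values chosen for illustration, not data from these notes.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [f"person_{i}" for i in range(100)]  # hypothetical population

# Simple random sample: every group of 10 members is equally likely.
simple = rng.sample(population, 10)

# Systematic sample: every 4th member, starting from a random offset.
start = rng.randrange(4)
systematic = population[start::4]

# Stratified sample: a random sample from each stratum, sized in
# proportion to the stratum's share of the population.
strata = {"men": population[:40], "women": population[40:]}
stratified = []
for name, members in strata.items():
    k = round(10 * len(members) / len(population))  # 4 men, 6 women
    stratified.extend(rng.sample(members, k))

# Cluster sample: randomly choose whole clusters, then include everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]  # 10 "blocks"
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [person for block in chosen_clusters for person in block]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```

Convenience and voluntary response samples are not shown because they depend on who is easy to reach or who chooses to respond, not on a random draw.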
● Studies and experiments
○ An observational study is a study based on observations or measurements.
○ An experiment is a study where the effects of a treatment are measured.

Tables and graphs

● Frequency tables are tables with two columns, where one column lists the categories and
the other lists the frequency of each category, aka how many items fit the category.
○ A relative frequency table is a table with a column of fractions or
percents giving the relative frequency of each category.
● A bar graph is a graph that displays a bar for each category with the length of each bar
indicating the frequency of the category.
○ Pareto graphs are bar graphs that are ordered from highest to lowest.
● A pie chart is a circle with wedges cut into varying sizes, where the sizes indicate the
frequency of items in that category.
● A histogram is a graph that displays a rectangle for each numerical class interval, with
the height of each rectangle indicating the frequency of values in the interval. A
histogram is similar to a bar graph, but the horizontal axis is a number line, with all
class intervals being of equal width.
● A line chart shows each category as a point connected with a line.
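
A frequency table and a relative frequency table can be built in a few lines of Python. This is a minimal sketch using `collections.Counter`; the eye color data is made up for illustration and does not come from these notes.

```python
from collections import Counter

# Hypothetical categorical data (eye colors of eight people).
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "hazel", "brown"]

frequency = Counter(eye_colors)   # category -> how many items fit it
total = sum(frequency.values())

# most_common() lists categories from highest to lowest frequency,
# which is the same ordering a Pareto graph uses for its bars.
print(f"{'Category':<10}{'Frequency':>10}{'Relative':>10}")
for category, count in frequency.most_common():
    print(f"{category:<10}{count:>10}{count / total:>10.1%}")
```

The relative frequencies are each count divided by the total, so they always add up to 100%.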

Measures of center and variation


● Measures of central tendency
○ The distribution of a variable or data set refers to the way its values
are spread over all possible values. The distribution can be shown visually with a
table or graph.
■ Mean: the arithmetic mean, also known as the average, is the sum of all
values divided by the total number of values, or Sum/total.
■ Median: the median is the middle value when the data is sorted in
numerical order, or halfway between the two middle values if the number of
values is even.
■ Mode: the mode is the most common value or group of values in a data set.
■ Outliers are values that are much higher or lower than all other values.
An outlier pulls the mean toward it: up if it is high, down if it is low.
■ The range is the difference between the maximum and minimum values.
■ Standard deviation is a measure of variation based on measuring how far
each data value deviates or is different, from the mean.
● Standard deviations are never negative. They will be zero if all the
data values are equal, and will increase as the data spreads out.
● SD has the same units as the original data.
● SD is also affected by outliers.
● SD = square root of [sum of (deviations from the mean)^2 / (total
number of data values - 1)], or $s = \sqrt{\sum (x - \bar{x})^2 / (n - 1)}$
○ Examples
■ For the following dataset of contract offers, find the mean, median,
mode, range, and standard deviation: $50,000, $80,000, $100,000,
$90,000, $10,000,000
● N=5
● Mean: 2,064,000
● Median: 90,000
● Mode: none (no value repeats)
● Range: 9,950,000
● SD: about 4,436,398 (the square root of the variance, 1.968163e13)
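
These measures can be checked with Python's standard `statistics` module. This is a sketch with illustrative variable names. Note that `statistics.variance` returns the quantity under the square root in the SD formula (dividing by N - 1), while `statistics.stdev` returns its square root, which is the standard deviation itself.

```python
import statistics

# Contract offers from the example, in dollars.
offers = [50_000, 80_000, 100_000, 90_000, 10_000_000]

mean = statistics.mean(offers)           # 2,064,000
median = statistics.median(offers)       # 90,000
modes = statistics.multimode(offers)     # every value ties, so there is no single mode
value_range = max(offers) - min(offers)  # maximum minus minimum
variance = statistics.variance(offers)   # sample variance (divides by N - 1)
sd = statistics.stdev(offers)            # square root of the sample variance

print(f"mean={mean:,.0f} median={median:,.0f} range={value_range:,.0f}")
print(f"variance={variance:.6e} sd={sd:,.0f}")
```

The single $10,000,000 offer pulls the mean far above the median, which shows how strongly one outlier affects both the mean and the standard deviation.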
