Stat and Prob
Statistics
FAMOUS STATISTICIANS
Gertrude Cox
William Sealy Gosset
Florence Nightingale
Ronald A. Fisher
J. Stuart Hunter
George E.P. Box
Johann Carl Friedrich Gauss
Thomas Bayes
Areas of Statistics
Descriptive Statistics
- Describes the properties of sample and population data
- Includes the mean (average), variance, skewness, and kurtosis
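As a minimal sketch of these four measures, the Python snippet below computes them from population central moments (dividing by n). The function name describe and the sample values are hypothetical, chosen only for illustration; other references use sample (n - 1) versions of these formulas.

```python
def describe(data):
    """Return the mean, population variance, skewness, and excess kurtosis."""
    n = len(data)
    mean = sum(data) / n
    # Central moments in population form (divide by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,   # asymmetry of the data
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis (0 for a normal curve)
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
summary = describe(scores)
print(summary)
```

For this data the mean is 5 and the variance is 4; the positive skewness reflects the longer tail toward the larger values.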
Inferential Statistics
- Uses the properties of sample data to test hypotheses and draw conclusions about the population
- Includes linear regression analysis, analysis of variance (ANOVA), and null hypothesis testing
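To illustrate one of these techniques, here is a rough sketch of simple linear regression fitted by ordinary least squares. The function name fit_line and the hours/scores data are hypothetical; a full analysis would also test whether the slope differs significantly from zero.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]         # hypothetical study hours
scores = [52, 55, 61, 64, 68]   # hypothetical exam scores
slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

With these made-up numbers the fitted slope is about 4.1, meaning each additional hour is associated with roughly 4.1 more points in this sample.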
Sources of Data
Primary Data – the researcher gathers the data himself or herself
Secondary Data – the researcher uses data gathered by someone else
Data Science
- The center of data science is data, especially Big Data
- The purpose of data science is to obtain information or knowledge from data that supports
better decisions and a better understanding of how nature and society develop and change
- Data science is a multidisciplinary field that applies theories and technologies from several
disciplines
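R is a language and environment for statistical computing and graphics. It is similar to the S
language, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies); R
itself was created by Ross Ihaka and Robert Gentleman.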
Python is an object-oriented, interpreted, and interactive programming language developed by Guido
van Rossum.
The SAS language is a programming language developed by Anthony James Barr as a statistical
analysis tool.
Probability
- A number that reflects the chance or likelihood that a particular event will occur
- Ranges from 0 to 1, or equivalently from 0% to 100%
Interpretation of Probability
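Classical – all outcomes are assumed equally likely, so the probability of an event is the number
of favorable outcomes divided by the total number of outcomes
Frequentist – the long-run relative frequency of an event in repeatable experiments
Subjective – a probability derived from an individual’s personal judgement or own experience
Bayesian – a degree of belief that is updated as new evidence becomes available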
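The classical and frequentist interpretations can be compared with a small simulation. The sketch below assumes a fair six-sided die; the seed and the number of rolls are arbitrary choices made only so the run is repeatable.

```python
import random
from fractions import Fraction

# Classical: one favorable outcome (a 6) out of six equally likely faces.
classical = Fraction(1, 6)

# Frequentist: relative frequency of a 6 over many simulated rolls.
rng = random.Random(0)  # fixed seed so the simulation is repeatable
rolls = 100_000
sixes = sum(1 for _ in range(rolls) if rng.randint(1, 6) == 6)
frequentist = sixes / rolls

print("classical:  ", float(classical))
print("frequentist:", frequentist)
```

As the number of rolls grows, the relative frequency should settle near the classical value of 1/6.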
Discrete Probability Distribution – a table, graph, or formula listing all possible values that a
discrete random variable can take on, along with their associated probabilities
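As an illustrative example (not taken from the handout), the table form of a discrete distribution can be built for the sum of two fair dice by counting the 36 equally likely outcomes:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability table: each possible sum paired with its probability.
pmf = {total: Fraction(count, 36) for total, count in sorted(counts.items())}

for total, prob in pmf.items():
    print(total, prob)
```

Every probability lies between 0 and 1, and the probabilities across all possible values add up to exactly 1.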
Binomial Distribution
In an experiment of n trials, each trial has two (2) possible outcomes: success or failure.
The trials are independent, meaning the result of one trial does not affect the result of the
next.
The process is called a binomial experiment, and each trial, which has two (2) possible
outcomes, is called a Bernoulli trial.
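Assuming the probability of success p is the same in every trial, the probability of exactly k successes in n trials is given by the standard formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). A minimal sketch, with binomial_pmf as a hypothetical helper name:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    # comb(n, k) counts the ways to place k successes among n trials.
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Example: exactly 3 heads in 10 tosses of a fair coin.
prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 120 / 1024 = 0.1171875
```

Summing binomial_pmf(k, n, p) over k = 0, 1, ..., n gives 1, as required of any probability distribution.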
Poisson Distribution – gives the probability of a given number of rare events or successes
occurring in a specified time interval or region
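If the average number of events per interval is lam (usually written as lambda), the standard Poisson formula is P(X = k) = e^(-lam) lam^k / k!. The sketch below uses poisson_pmf as a hypothetical helper name and an assumed average of 2 events per interval:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average per interval is lam."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 2.0  # assumed average number of events per interval
for k in range(5):
    print(k, round(poisson_pmf(k, lam), 4))
```

Unlike the binomial distribution, there is no fixed number of trials here; k can be any non-negative integer.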