RESEARCH DESIGN
AND
BASICS OF STATISTICS
Prepared & Presented by
MANISH P. JAIN
Research Scholar (DS13CE006)
Credit Seminar - I
Guided by
Dr. S. S. Arkatkar
Assistant Professor
Civil Engg Dept, SVNIT, Surat
RESEARCH FORMULATION AND DESIGN
INTRODUCTION OF STATISTICS
SAMPLING TECHNIQUES
SCALING TECHNIQUES
HYPOTHESIS TESTING
Type – I
1. Exploratory Research
- which structures and identifies new problems
2. Constructive Research
- which develops solutions to a problem
3. Empirical Research
- which tests the feasibility of a solution using
empirical evidence.
Type – II
1. Qualitative Research
- understanding of human behavior and the
reasons that govern such behavior
2. Quantitative Research
- systematic empirical investigation of
quantitative properties and phenomena and
their relationships
RESEARCH OBJECTIVES
The objectives of a research project must indicate:
Problem statement
Purposes
Benefits
Theory/literature
Assumptions
Background
Variables
Measurement
Methodology
Sampling
Data Analysis
Conclusions
Interpretations
Recommendations
STATISTICS
Statistics is a subject consisting of scientific
methods for collecting and analyzing data and
drawing inferences from them.
Plan procedure for selecting sampling units
Conduct fieldwork
Basic Sampling Classifications
RANDOM (PROBABILITY): Multiphase, Cluster sample
NON-PROBABILITY: Quota sample
SCALING TECHNIQUES
Scale
A scale is basically a continuous spectrum or series of
categories and has been defined as any series of items that are
arranged progressively according to value or magnitude, into
which an item can be placed according to its quantification.
- Nominal scales
- Ordinal scales
- Interval scales
- Ratio scales
Primary scales of measurement
Nominal: numbers assigned to runners (e.g. 4, 81, 9)
Interval: arithmetic operations on intervals between numbers; statistics: count/frequencies, mean, median, mode, standard deviation, variance
Ratio: most powerful, with most meaningful answers; arithmetic operations on actual quantities; statistics: geometric mean, coefficient of variation
DATA ANALYSIS -
ORGANIZE, COMPARE
AND SUMMARIZE
Various methods for Describing, Exploring and
Comparing Data
1. Frequency Distribution
2. Central Tendency (Mean, Median, Mode)
3. Percentile Values (Deciles, Quartiles, etc.)
4. Graphical representation by Pie chart, Frequency polygon,
Histograms, Stem & Leaf, etc.
5. Measure of variation by Variance
6. Standard deviation
7. Coefficient of variation
Class   Speed class (kmph)   Frequency   Cumulative frequency
1       30-40                5           5
2       40-50                6           11
3       50-60                12          23
4       60-70                10          33
5       70-80                8           41
6       80-90                7           48
7       90-100               3           51
Total                        51
Average value (mean) = 63.43 kmph, Median = 62.5 kmph, Mode = 57.5 kmph
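The mean, median and mode in the table can be reproduced with the usual grouped-data formulas. The sketch below is one way to do it, assuming class midpoints stand in for the individual speeds and that the median and mode are interpolated within their classes; the function names are illustrative only.

```python
# Grouped speed data from the table: (lower limit, upper limit, frequency)
classes = [(30, 40, 5), (40, 50, 6), (50, 60, 12), (60, 70, 10),
           (70, 80, 8), (80, 90, 7), (90, 100, 3)]

def grouped_mean(classes):
    # Weight each class midpoint by its frequency
    total = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

def grouped_median(classes):
    # Interpolate inside the class that contains the N/2-th observation
    total = sum(f for _, _, f in classes)
    half = total / 2
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= half:
            return lo + (half - cumulative) / f * (hi - lo)
        cumulative += f

def grouped_mode(classes):
    # Interpolate inside the class with the highest frequency
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))
    lo, hi, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0                # frequency of previous class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # frequency of next class
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

mean_speed = grouped_mean(classes)
median_speed = grouped_median(classes)
mode_speed = grouped_mode(classes)
print(round(mean_speed, 2), median_speed, mode_speed)
```

With the frequencies above, this gives 63.43, 62.5 and 57.5 kmph, matching the values in the table.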
[Figure: pie chart of frequency share by speed class, e.g. 90-100 kmph: 3 (6%); 30-40 kmph: 5 (10%)]
e.g. what is the chance that, out of all 4-wheelers, 20% are big
cars?
Basic properties of probability
The probabilities of all possible events always sum to 1, and the
probability of any single event always lies between 0 and 1.
$$ P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} $$
Bayes' Theorem
In conditional probability we consider the probability of an event
when we have information about the occurrence of an earlier
event. Bayes' theorem determines the probability of an earlier event
based on information about the occurrence of a later event.
$$ P(A \mid B) = P(B \mid A) \, \frac{P(A)}{P(B)} $$
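As a small numerical illustration of Bayes' theorem, the sketch below computes P(A | B) from P(A), P(B | A) and P(B | not A). All the numbers are hypothetical (A = "vehicle is a big car", B = "vehicle is observed speeding"), and P(B) is obtained from the law of total probability, which the slides do not state explicitly.

```python
def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
    return p_b_given_a * p_a / p_b

# Hypothetical inputs, chosen only for illustration
p_big_car = 0.2            # P(A)
p_speeding_if_big = 0.3    # P(B | A)
p_speeding_if_other = 0.1  # P(B | not A)

posterior = bayes_posterior(p_big_car, p_speeding_if_big, p_speeding_if_other)
print(round(posterior, 4))
```

Here P(B) = 0.06 + 0.08 = 0.14, so P(A | B) = 0.06 / 0.14, roughly 0.43.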
Random Variables
Random variable - a quantity resulting from an experiment that, by chance, can assume
different values.
Types of Random Variables
Discrete Random Variable can assume only certain clearly separated values. It is
usually the result of counting something
PROBABILITY
DISTRIBUTIONS
A probability distribution is a listing of all outcomes of an
experiment and the probability associated with each outcome.
Experiment: Toss a
coin three times.
Observe the number of
heads. The possible
results are: zero
heads, one head, two
heads, and three
heads.
What is the probability
distribution for the
number of heads?
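One way to answer this question is to list all eight equally likely sequences of three tosses and count the heads in each. The sketch below does that by brute force, assuming a fair coin; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely sequences of heads (H) and tails (T)
outcomes = list(product("HT", repeat=3))

# Count how many sequences give 0, 1, 2 or 3 heads
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each number of heads = favourable sequences / total sequences
distribution = {heads: Fraction(n, len(outcomes))
                for heads, n in sorted(counts.items())}

for heads, prob in distribution.items():
    print(heads, prob)
```

The resulting distribution is 1/8, 3/8, 3/8 and 1/8 for zero, one, two and three heads, and the probabilities sum to 1.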
Binomial Probability distribution
$$ P(m, N, p) = C_{N,m} \, p^m q^{N-m} = \binom{N}{m} p^m q^{N-m} = \frac{N!}{m! \, (N-m)!} \, p^m q^{N-m} $$
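The binomial formula can be checked against the three-toss coin experiment. The sketch below is a minimal implementation, assuming q = 1 - p (the probability of failure on a single trial) and using Python's built-in `math.comb` for the number of combinations.

```python
from math import comb

def binomial_pmf(m, n, p):
    # Probability of exactly m successes in n independent trials
    q = 1 - p  # probability of failure on one trial (assumed q = 1 - p)
    return comb(n, m) * p**m * q**(n - m)

# Number of heads in three tosses of a fair coin: N = 3, p = 0.5
probs = [binomial_pmf(m, 3, 0.5) for m in range(4)]
print(probs)
```

For N = 3 and p = 0.5 this gives 0.125, 0.375, 0.375 and 0.125 for zero to three heads.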
Poisson Probability Distribution
$$ P(m, \lambda) = \frac{\lambda^m e^{-\lambda}}{m!} $$
λ = mean number of successes in a particular interval
e = constant 2.71828 (base of the Napierian logarithmic
system)
m = number of occurrences
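A short sketch of the Poisson probability follows. The parameter name `lam` stands for the mean number of successes per interval, and the value of 2 arrivals per interval is a hypothetical figure chosen only for illustration.

```python
from math import exp, factorial

def poisson_pmf(m, lam):
    # Probability of exactly m occurrences when the mean per interval is lam
    return lam**m * exp(-lam) / factorial(m)

# Hypothetical example: on average 2 vehicles arrive per interval
mean_arrivals = 2
for m in range(5):
    print(m, round(poisson_pmf(m, mean_arrivals), 4))
```

Summing the probabilities over all m approaches 1, which is a quick sanity check on the implementation.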
3. It is symmetric around the mean: the two halves of the curve are the
same (mirror images)
Significance level (α) represents the probability of making a type I error. It
indicates which portion of the sampling distribution is considered too unlikely
to occur by chance alone. If the sample mean falls into this region, we reject
the null hypothesis.
Typical α values are 0.10, 0.05, 0.02 and 0.01, or their percentage
equivalents 10%, 5%, 2% and 1%.
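To make the rejection rule concrete, here is a minimal sketch of a two-sided z-test for one sample mean. It assumes a large sample and a known population standard deviation, and every number in the example call is hypothetical; the function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    # Test statistic: distance of the sample mean from mu0 in standard errors
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Critical value that leaves alpha/2 in each tail of the standard normal
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided p-value for the observed z
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject the null hypothesis if z falls in the rejection region
    return z, z_crit, p_value, abs(z) > z_crit

# Hypothetical data: sample mean 63.4, hypothesised mean 60, sigma 15, n = 51
z, z_crit, p_value, reject = two_sided_z_test(63.4, 60.0, 15.0, 51, alpha=0.05)
print(round(z, 3), round(z_crit, 3), round(p_value, 3), reject)
```

With these inputs the test statistic is smaller than the critical value of about 1.96, so the null hypothesis is not rejected at the 5% level.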
P-values
[Figure: flowchart for choosing a hypothesis test. One sample: Z-value (large sample) or t-value (small sample). Two samples, independent or dependent: Z-value (large sample), t-value (small sample), F-test. More than two samples: ANOVA test. Other tests shown: 1. Mann-Whitney U, 2. Kruskal-Wallis test.]
Hypothesis testing
CORRELATION,
REGRESSION AND
STATISTICAL TESTS
Overview of Correlation and Regression
No correlation
Measuring the Relationship
Pearson’s Sample Correlation Coefficient, r
Spearman's Rank Correlation Coefficient, rs
$$ r_s = 1 - \frac{6 \sum D^2}{N (N^2 - 1)} $$
where
D = difference in rank between the two variables
N = number of data pairs
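The rank-difference formula for rs (one minus six times the sum of squared rank differences, divided by N(N² - 1)) can be sketched as below. The example assumes there are no tied values, and the speed and volume figures are hypothetical.

```python
def ranks(values):
    # Rank 1 = smallest value; this simple version assumes no ties
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rs(x, y):
    n = len(x)
    # Sum of squared rank differences D**2
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))

# Hypothetical paired observations
speeds = [35, 48, 52, 67, 71]
volumes = [90, 80, 85, 60, 55]

rs = spearman_rs(speeds, volumes)
print(round(rs, 2))
```

For these data the squared rank differences sum to 38, so rs = 1 - 228/120 = -0.9, a strong negative rank correlation.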
Regression
– Specific statistical methods for finding the “line of
best fit” for one response (dependent) numerical
variable based on one or more explanatory
(independent) variables.
– It is used:
1. To describe (or model)
Standard Error of the Estimate
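The following is a minimal sketch of a least-squares "line of best fit" and its standard error of the estimate. It assumes simple linear regression with one explanatory variable and takes the standard error as the square root of SSE / (n - 2), the usual textbook definition, which the slides do not spell out. The data are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    # Least-squares slope and intercept for y = intercept + slope * x
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def standard_error_of_estimate(x, y, slope, intercept):
    # Typical size of the residuals around the fitted line (n - 2 degrees of freedom)
    n = len(x)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return sqrt(sse / (n - 2))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
se = standard_error_of_estimate(x, y, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(se, 3))
```

For these data the fitted line is approximately y = 0.05 + 1.99x; a smaller standard error means the points lie closer to the line.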