Statistical Techniques - Notes

Research methodology notes

Uploaded by abhiabhinesh810

Statistical Techniques for Data Analysis

Two main families of statistical methods are used in data analysis: descriptive statistics, which summarizes data using indexes such as the mean and median, and inferential statistics, which draws conclusions from data using statistical tests such as Student's t-test.
The selection of an appropriate statistical method depends on the following three things:
 The aim and objectives of the study,
 The type and distribution of the data used, and
 The nature of the observations (paired/unpaired).
Definition
The term descriptive statistics covers statistical methods for describing data using statistical
characteristics, charts, graphics or tables.
It is important to note that only the properties of the respective sample are described and evaluated; no conclusions are drawn about other points in time or about the population. That is the task of inferential statistics, also called concluding statistics. The various sub-areas of descriptive statistics can be summarised as follows:

Depending on the research question and the measurement scale available, different key figures, tables and graphics are used for evaluation. The best known of these are:
 Location parameter: Mean value, median, mode, sum
 Dispersion parameter: Standard deviation, variance, range
 Tables: Absolute, relative and cumulative frequencies
 Charts: Histograms, bar charts, box plots, scatter charts, matrix plots
The first group of descriptive statistics consists of location parameters such as the mean and mode. They express the central tendency of the data set, i.e. they describe where the center of a sample lies or where a large part of the sample is located.
The second group consists of measures of dispersion. They provide information about how much the values of a variable in a sample differ from each other. Measures of dispersion therefore describe how strongly the values of a variable deviate from the mean: are the values close together, i.e. similar, or are they far apart and thus differ greatly? A classic example is the standard deviation.

Which measures of location or dispersion are suitable for describing the data depends on the
respective scales of measurement of the variable. Here, a distinction can be made
between metric, ordinal and nominal scales of measurement.
Finally, a large area of descriptive statistics consists of diagrams such as the bar chart, the pie chart, or the histogram.
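As a minimal sketch, the location and dispersion parameters listed above can be computed with Python's standard `statistics` module; the height values below are made up purely for illustration:

```python
import statistics

# Hypothetical sample: heights (cm) of 10 basketball players (made-up values)
heights = [198, 201, 195, 210, 188, 203, 199, 195, 207, 192]

# Location parameters (central tendency)
mean = statistics.mean(heights)      # arithmetic mean
median = statistics.median(heights)  # middle value of the sorted sample
mode = statistics.mode(heights)      # most frequent value

# Dispersion parameters
stdev = statistics.stdev(heights)        # sample standard deviation
variance = statistics.variance(heights)  # sample variance
value_range = max(heights) - min(heights)

print(f"mean={mean:.1f}, median={median}, mode={mode}")
print(f"stdev={stdev:.2f}, variance={variance:.2f}, range={value_range}")
```

Note that `statistics.stdev` and `statistics.variance` use the sample (n-1) formulas; the population versions are `pstdev` and `pvariance`.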

Inferential Statistics
Definition
Inferential statistics is a branch of statistics that uses various analytical tools to draw
conclusions about the population from sample data. For a given hypothesis about the
population, inferential statistics uses a sample and gives an indication of the validity of the
hypothesis based on the sample collected.

Example of inferential statistics


In the example above, a sample of 10 basketball players was drawn and then exactly this sample was described; this is the task of descriptive statistics. If you want to make a statement about the population, you need inferential statistics. For example, it could be of interest whether basketball players are taller than the average male population. To test this hypothesis, a one-sample t-test is calculated, which compares the sample mean with the mean of the population. Furthermore, the question could arise whether basketball players are taller than football players. For this purpose, a sample of football players is drawn, and the mean height of the basketball players can then be compared with that of the football players using an independent t-test. Now a statement can be made about whether, in the population, basketball players are taller than football players.
Since this statement is made only on the basis of the samples, and it could be pure coincidence that the basketball players are taller in exactly this sample, the statement can only be confirmed or rejected with a certain probability.
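The two t-tests described above can be sketched with `scipy.stats`. The heights here are simulated with assumed population means (198 cm for basketball players, 182 cm for football players, 178 cm for the general male population), not real measurements:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated height samples in cm (assumed, not real data)
basketball = rng.normal(loc=198, scale=6, size=10)
football = rng.normal(loc=182, scale=6, size=10)

# One-sample t-test: is the basketball sample mean different from
# an assumed average male population height of 178 cm?
t_one, p_one = stats.ttest_1samp(basketball, popmean=178)

# Independent two-sample t-test: basketball vs. football players
t_two, p_two = stats.ttest_ind(basketball, football)

print(f"one-sample: t={t_one:.2f}, p={p_one:.4f}")
print(f"two-sample: t={t_two:.2f}, p={p_two:.4f}")
```

A small p-value lets us reject the null hypothesis of equal means, but, as the text notes, only with a certain probability of error (the significance level).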
Parametric Tests and Non-Parametric Tests

Definition:
Parametric tests are statistical tests that make assumptions about the parameters of the population distribution from which the sample is drawn. These tests often assume normality and specific distributional characteristics.

Common Parametric Tests:

t-Test:
 Use: Compares the means of two groups.
 Conditions: Assumes normally distributed data, and the variability of the data in each group should be similar, which is known as homogeneity of variances.

ANOVA (Analysis of Variance):
 Use: Compares the means of more than two groups.
 Conditions: Assumes normally distributed data and homogeneity of variances.
 Homogeneity of variances (also known as homoscedasticity) means that the variability or spread of scores is similar across different groups or conditions. Imagine we are comparing the exam scores of students from three different schools. If the spread of scores in one school is much wider (higher variance) than in the others, it could affect the validity of the comparison. Homogeneity of variances ensures that the groups are comparable in terms of how spread out their scores are.
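The school-comparison example can be sketched with `scipy.stats`, checking homogeneity of variances (Levene's test) before running the one-way ANOVA. The exam scores are made up for illustration:

```python
from scipy import stats

# Hypothetical exam scores from three schools (made-up numbers)
school_a = [72, 75, 78, 70, 74, 77, 73, 76]
school_b = [68, 71, 69, 72, 70, 67, 73, 66]
school_c = [80, 83, 79, 82, 85, 81, 78, 84]

# Levene's test: a large p-value is consistent with equal variances,
# so the homogeneity assumption of ANOVA is plausible
levene_stat, levene_p = stats.levene(school_a, school_b, school_c)

# One-way ANOVA: do the mean scores differ between the three schools?
f_stat, p_value = stats.f_oneway(school_a, school_b, school_c)

print(f"Levene: p={levene_p:.3f}")
print(f"ANOVA: F={f_stat:.2f}, p={p_value:.4f}")
```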

Regression Analysis:
 Use: Examines relationships between variables.
 Conditions: Assumes normally distributed residuals.
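A simple linear regression can be sketched with `scipy.stats.linregress`; note that the normality assumption applies to the residuals, not the raw data, so the sketch checks the residuals with a Shapiro-Wilk test. The study-hours data is invented for illustration:

```python
from scipy import stats

# Hypothetical data: hours studied vs. exam score (made-up numbers)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [52, 57, 61, 64, 70, 73, 78, 81]

# Simple linear regression: score = intercept + slope * hours
result = stats.linregress(hours, score)
print(f"slope={result.slope:.2f}, intercept={result.intercept:.2f}, "
      f"r={result.rvalue:.3f}")

# Check the normality assumption on the residuals
residuals = [y - (result.intercept + result.slope * x)
             for x, y in zip(hours, score)]
shapiro_stat, shapiro_p = stats.shapiro(residuals)
print(f"Shapiro-Wilk on residuals: p={shapiro_p:.3f}")
```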

When to Use Parametric Tests:
 When the data is approximately normally distributed.
 When the assumptions of the specific test are met.
 When precision in parameter estimation is crucial, i.e. when obtaining accurate and narrow estimates of population parameters is of utmost importance.

Nonparametric Tests
Definition:
Nonparametric tests are statistical tests that make fewer assumptions about the population distribution. They are more flexible and applicable to a broader range of data types.

Common Nonparametric Tests:
Mann-Whitney U Test:
 Use: Compares the medians of two independent groups.
 Conditions: Few assumptions about the underlying distribution.
Kruskal-Wallis Test:
 Use: Compares the medians of more than two independent groups.
 Conditions: Few assumptions about the underlying distribution.
Wilcoxon Signed-Rank Test:
 Use: Compares the medians of two related groups.
 Conditions: Few assumptions about the underlying distribution.

When to Use Nonparametric Tests:
 When the data is not normally distributed.
 When the assumptions of parametric tests are violated.
 When dealing with ordinal or nominal data.
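Two of the tests above can be sketched with `scipy.stats` on ordinal data, where parametric tests would be inappropriate. The satisfaction ratings are invented for illustration:

```python
from scipy import stats

# Hypothetical satisfaction ratings on a 1-10 ordinal scale (made up)
group_a = [4, 5, 6, 5, 3, 4, 6, 5]   # independent group 1
group_b = [7, 8, 9, 8, 7, 9, 8, 7]   # independent group 2

# Mann-Whitney U test: two independent groups, no normality assumption
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)

# Wilcoxon signed-rank test: two related (paired) samples,
# e.g. the same people rated before and after an intervention
before = [5, 6, 4, 7, 5, 6, 4, 5]
after = [7, 8, 6, 8, 7, 9, 6, 7]
w_stat, w_p = stats.wilcoxon(before, after)

print(f"Mann-Whitney U: U={u_stat}, p={u_p:.4f}")
print(f"Wilcoxon signed-rank: W={w_stat}, p={w_p:.4f}")
```

For more than two independent groups, `stats.kruskal` is called the same way as `stats.f_oneway` above.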

Key Differences Between Parametric and Non-parametric Tests


The key differences between parametric and non-parametric tests can be summarized as
follows:
Assumptions: Parametric tests require assumptions about the data distribution, while non-
parametric tests do not have such assumptions.
Central Tendency Value: Parametric tests use the mean value to measure central tendency,
whereas non-parametric tests use the median value.
Correlation: Parametric tests employ Pearson correlation, while non-parametric tests use
Spearman correlation.
Probabilistic Distribution: Parametric tests assume a normal distribution, while non-
parametric tests can be applied to arbitrary distributions.
Population Knowledge: Parametric tests assume knowledge about the population (e.g. that it follows a known distribution), whereas non-parametric tests do not have this requirement.
Applicability: Parametric tests are suitable for interval or ratio data, while non-parametric tests are used for ordinal or nominal data.
Examples: Parametric tests include z-test and t-test, while non-parametric tests include
Kruskal-Wallis and Mann-Whitney.
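The correlation difference can be illustrated with made-up data: on a monotonic but non-linear relationship, Spearman's rank correlation is a perfect 1, while Pearson's correlation, which measures only linear association, is not:

```python
from scipy import stats

# Monotonic but non-linear relationship (illustrative data)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 4, 9, 16, 25, 36, 49, 64]  # y = x**2

pearson_r, _ = stats.pearsonr(x, y)    # parametric: linear association
spearman_r, _ = stats.spearmanr(x, y)  # nonparametric: rank association

print(f"Pearson r = {pearson_r:.3f}")      # high, but below 1: not linear
print(f"Spearman rho = {spearman_r:.3f}")  # 1.0: perfectly monotonic
```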
