
DESIGN AND ANALYSIS OF EXPERIMENTS


UNIT – 1, Experimental Design Fundamental
1. Importance of Experiments
a) Experiments play a crucial role in research as they allow researchers to establish
causal relationships between variables. By manipulating independent variables and
measuring their impact on dependent variables, experiments provide strong
evidence for cause and effect.
b) The importance of experiments lies in their ability to test hypotheses and theories.
Through carefully designed experimental conditions, researchers can systematically
investigate the effects of different factors or interventions on the outcome of
interest. This helps validate or invalidate existing knowledge and generate new
insights.
c) Experiments provide a controlled environment for studying phenomena. By
controlling and manipulating variables while holding others constant, experiments
reduce the influence of confounding variables. This allows researchers to isolate
specific causal factors and obtain more accurate assessments of the effects being
studied.
d) Replicability and reliability are key features of experiments. By documenting the
experimental design, procedures, and measurements, other researchers can replicate
the study and validate its results. This ensures the credibility and robustness of
scientific knowledge.
e) Evidence-based decision making relies on experimental results. Whether in
academia, industry, or public policy, experiments provide valuable insights that
guide choices and actions. Understanding the effects and consequences of different
interventions allows stakeholders to make informed decisions and optimize
outcomes.
f) Experiments drive innovation and discovery. By exploring new ideas, testing novel
interventions, and challenging existing assumptions, experiments can uncover
unexpected findings and lead to breakthroughs. This promotes advancements in
various fields and expands our understanding of the world.
g) Optimization and improvement are facilitated by experiments. Through iterative
experimentation and analysis, researchers can identify factors that contribute to
desired outcomes and refine them for maximum efficiency or effectiveness. This is
particularly relevant in fields like engineering, manufacturing, and user experience
design.
h) Quantitative analysis is a hallmark of experiments. By collecting data and applying
statistical methods, researchers can quantify the magnitude and significance of
effects, establish confidence intervals, and assess the generalizability of findings.
This adds precision and objectivity to research conclusions.

2. Experimental Strategies
a) Completely Randomized Design (CRD): In this strategy, treatments are
assigned randomly to the experimental units without any restriction. This is
suitable when the treatments are homogeneous and the experimental units are
homogeneous as well.

b) Randomized Block Design (RBD): This strategy involves grouping experimental
units into blocks based on specific characteristics that may affect the
response variable. Treatments are then randomly assigned within each block to
minimize variability due to these characteristics. (A randomization sketch for
CRD and RBD follows this list.)
c) Factorial Design: This strategy involves studying the effects of multiple factors
simultaneously and their interactions. It allows researchers to investigate the
main effects of each factor and their combined effects. Factorial designs provide
a more comprehensive understanding of the factors influencing the response
variable.
d) Response Surface Methodology (RSM): RSM is used to optimize response
variables by determining the optimal levels of factors. It involves fitting a
mathematical model to experimental data and using statistical techniques to find
the optimal settings that maximize or minimize the response.
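
To make the first two strategies concrete, the following minimal Python sketch
randomizes treatments for a CRD and an RBD. The treatment labels, block names,
and replicate counts are hypothetical, chosen only for illustration:

import random

treatments = ["A", "B", "C"]       # hypothetical treatment labels

# Completely Randomized Design: one unrestricted shuffle over all 12 units
crd_plan = treatments * 4          # 4 replicates of each treatment
random.shuffle(crd_plan)
print("CRD assignment:", crd_plan)

# Randomized Block Design: randomize separately within each block
blocks = ["Block1", "Block2", "Block3", "Block4"]
rbd_plan = {}
for block in blocks:
    within = treatments[:]         # each block receives every treatment once
    random.shuffle(within)         # randomization restricted to the block
    rbd_plan[block] = within
print("RBD assignment:", rbd_plan)

The only difference between the two plans is where the shuffling happens: over
all experimental units at once (CRD) versus separately within each block (RBD).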

3. Basic Principles of Design


a) Randomization: Randomization is the process of randomly assigning
treatments to experimental units. It helps minimize bias and ensures that the
groups being compared are comparable. Randomization is crucial for drawing
valid causal inferences.
b) Replication: Replication involves including multiple observations or
measurements for each treatment. This is important for obtaining reliable
estimates of treatment effects and assessing the variability of the response
variable (see the numerical sketch after this list).
c) Control: Control refers to holding extraneous factors constant or accounting for
their effects. By controlling for confounding variables, researchers can isolate
the effects of the variables of interest and attribute any observed differences to
those variables.
d) Blocking: Blocking involves grouping experimental units into blocks based on
specific characteristics that may influence the response variable. Treatments are
then randomly assigned within each block. Blocking reduces the variability due
to these characteristics, increases precision, and helps account for potential
sources of variability.
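
As a small numerical illustration of why replication matters, the sketch below
assumes a response standard deviation (the value 2.0 is invented) and shows how
the standard error of a treatment mean, sigma/sqrt(n), shrinks as replicates
are added:

import math

sigma = 2.0                         # assumed standard deviation of the response
for n in [2, 4, 8, 16, 32]:
    se = sigma / math.sqrt(n)       # standard error of the treatment mean
    print(f"n = {n:2d} replicates -> standard error = {se:.3f}")

Quadrupling the number of replicates halves the standard error, which is how
replication buys precision.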

4. Terminologies
a) Experimental unit: The experimental unit is the smallest entity to which a
treatment can be applied and from which data can be collected. It can be an
individual, a group, an object, or any other defined entity.
b) Treatment: A treatment refers to a specific condition or intervention being applied
to the experimental units. It represents a level or value of a factor being studied.
c) Factor: A factor is the variable or attribute being manipulated in the experiment. It
represents a potential source of variation that may affect the response variable.
d) Level: A level is a specific value or setting of a factor. Factors can have multiple
levels, and each level represents a unique condition or value of the factor.
e) Response variable: The response variable is the outcome or variable of interest
being measured in the experiment. It represents the effect or response being studied.


f) Control group: The control group is a group that does not receive the treatment or
intervention being studied. It serves as a baseline for comparison and helps assess
the effects of the treatment.

5. ANOVA
a) Analysis of Variance (ANOVA) is a statistical technique used to analyze the
differences between group means and assess the significance of these differences.
b) ANOVA determines the variation in the response variable that can be attributed to
different factors or treatments. It decomposes the total variation into components
associated with factors, error, and residual variation.
c) ANOVA helps identify significant factors and understand their impact on the
response variable. It provides statistical evidence to support conclusions about
treatment effects.
d) ANOVA can be extended to more complex designs, such as factorial designs, where
multiple factors and their interactions are considered.
ANOVA is particularly useful when comparing means across three or more groups
or treatments. Here are its key aspects and principles:
a) Variation and partitioning: ANOVA involves partitioning the total variation
observed in the data into different components associated with the treatment effects,
error, and residual variation. This partitioning allows for a quantitative assessment
of the sources of variation and their contributions to the overall variability in the
data.
b) Null hypothesis and alternative hypothesis: In ANOVA, the null hypothesis
assumes that there is no significant difference between the means of the groups or
treatments being compared. The alternative hypothesis, on the other hand, suggests
that at least one of the means is significantly different from the others.
c) F-statistic: ANOVA uses the F-statistic to test the null hypothesis. The F-statistic
compares the variation between the group means (explained variation) with the
variation within the groups (unexplained variation). If the observed difference
between the group means is significantly larger than the expected variation within
the groups, the null hypothesis is rejected.
d) Sum of Squares: ANOVA calculates the Sum of Squares to quantify the variation
in the data. The Total Sum of Squares (SST) represents the total variation in the
data, the Treatment Sum of Squares (SSTreatment) represents the variation between
the group means, and the Error Sum of Squares (SSError) represents the
unexplained variation within the groups.
e) Degrees of Freedom: Degrees of Freedom (df) indicate the number of independent
observations available for estimating the variation. In ANOVA, there are two types
of degrees of freedom: the degrees of freedom associated with the treatment
(dfTreatment) and the degrees of freedom associated with the error (dfError).
f) Mean Squares: Mean Squares are obtained by dividing the Sum of Squares by the
corresponding degrees of freedom. Mean Square Treatment (MSTreatment)
represents the average variation between the group means, and Mean Square Error
(MSError) represents the average unexplained variation within the groups.
g) F-distribution and p-value: The F-statistic follows the F-distribution under the
assumption of the null hypothesis. By comparing the observed F-statistic with the
critical value from the F-distribution, researchers can determine the statistical
significance of the results. The p-value is then calculated to quantify the probability
of observing the results under the null hypothesis.
h) Multiple comparisons: When ANOVA reveals a significant difference among the
group means, additional post hoc tests, such as Tukey's test or Bonferroni
correction, can be performed to identify which specific groups differ significantly
from each other.
i) Assumptions: ANOVA assumes that the data within each group or treatment are
independent and identically distributed, and that the residuals (unexplained
variation) are normally distributed with constant variance. Violations of these
assumptions may affect the validity of the ANOVA results.
ANOVA is a powerful tool for analyzing the differences between multiple groups or
treatments and determining whether these differences are statistically significant. It
provides a structured approach to comparing means and helps researchers draw
meaningful conclusions from their data.
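
The sketch below works through these quantities on three small invented
samples: it partitions the total variation into SSTreatment and SSError, forms
the mean squares and the F-statistic, and cross-checks the result against
scipy.stats.f_oneway:

import numpy as np
from scipy import stats

# Hypothetical responses for three treatment groups (4 replicates each)
groups = [np.array([18.0, 20.0, 21.0, 19.0]),
          np.array([22.0, 24.0, 23.0, 25.0]),
          np.array([19.0, 18.0, 20.0, 21.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
k, N = len(groups), all_obs.size

# Partition the total variation: SST = SSTreatment + SSError
ss_total = ((all_obs - grand_mean) ** 2).sum()
ss_treat = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_treat, df_error = k - 1, N - k
ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error
F = ms_treat / ms_error
p = stats.f.sf(F, df_treat, df_error)   # upper-tail area of the F-distribution

print(f"SST = {ss_total:.2f}, SSTreatment = {ss_treat:.2f}, SSError = {ss_error:.2f}")
print(f"F = {F:.3f}, p = {p:.4f}")
print(stats.f_oneway(*groups))          # should reproduce the same F and p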

6. Steps in Experimentation
a) Formulate research question and objectives: Clearly define the research question
and the specific objectives of the experiment.
b) Design the experiment: Determine the factors to be studied, their levels, and the
appropriate experimental strategy. Decide on the number of replications and the
randomization scheme.
c) Randomly assign treatments: Randomly assign treatments to the experimental units
according to the chosen experimental design.
d) Collect data: Measure the response variable for each experimental unit and record
the data.
e) Analyze the data: Apply appropriate statistical methods, such as ANOVA or
regression analysis, to analyze the data and test hypotheses.
f) Draw conclusions and make inferences: Interpret the results of the analysis in the
context of the research question and objectives. Draw conclusions and make
inferences about the effects of the factors on the response variable.
g) Communicate findings: Present the findings of the experiment through written
reports, presentations, or other appropriate means. Clearly communicate the results,
conclusions, and implications of the study.

7. Sample Size
a) Sample size refers to the number of experimental units or observations included
in the study. It is crucial for obtaining reliable and statistically valid results.
b) Determining an appropriate sample size depends on various factors, including
the desired level of precision, the expected effect size, the variability in the data,
and the desired statistical power.

c) A larger sample size generally leads to more precise estimates and higher
statistical power. However, larger sample sizes may also require more resources
and increase the cost and complexity of the study.
The sample size should be carefully chosen to ensure that the study has
adequate statistical power to detect meaningful effects and provide precise
estimates. Here are some key considerations when determining sample size:
a) Desired level of precision: The sample size should be sufficient to provide a
desired level of precision in estimating population parameters. A larger sample
size generally leads to more precise estimates with narrower confidence
intervals.
b) Expected effect size: The expected effect size refers to the magnitude of the
difference or relationship that the study aims to detect. A larger effect size
typically requires a smaller sample size to detect it with sufficient power.
c) Variability of the data: The variability or dispersion of the data also affects the
required sample size. Higher variability generally requires a larger sample size
to achieve a desired level of precision.
d) Statistical power: Statistical power is the probability of correctly rejecting the
null hypothesis when it is false. A larger sample size increases the statistical
power, giving a better chance of detecting true effects. Researchers often
aim for a power of 80% or higher.
e) Significance level: The significance level (alpha) is the probability of
incorrectly rejecting the null hypothesis when it is true. Commonly used values
for alpha are 0.05 (5%) or 0.01 (1%). The sample size calculation should
consider the desired significance level.
f) Study design and analysis methods: The sample size calculation may depend on
the study design and the analysis methods employed. Different study designs,
such as cross-sectional studies, case-control studies, or randomized controlled
trials, may require different sample size considerations.
g) Resources and feasibility: Practical considerations, such as available resources,
time constraints, and feasibility, may influence the determination of sample
size. It is important to strike a balance between obtaining a sample size that is
statistically valid and feasible within the limitations of the study.
h) Population characteristics: The characteristics of the target population may also
influence the sample size calculation. For example, if the population is highly
heterogeneous, a larger sample size may be needed to capture this variability.
i) Sampling technique: The sampling technique used may affect the required
sample size. If the sampling technique is stratified or clustered, adjustments may
be needed in the sample size calculation to account for the design effect.
j) Consultation with a statistician: It is often beneficial to consult with a statistician
during the planning stage to determine an appropriate sample size. Statisticians
can help perform power calculations or provide guidance based on the specific
study objectives and design.

Overall, determining an appropriate sample size is a critical step in designing
a study or experiment. It involves considering the desired precision, effect
size, variability, power, significance level, study design, resources, and
population characteristics. A well-chosen sample size enhances the validity and
reliability of the study's findings.
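
As a simple example of such a calculation, the sketch below applies the
standard normal-approximation formula for comparing two group means,
n = 2 * (z_(1-alpha/2) + z_(1-beta))^2 * sigma^2 / delta^2 units per group.
The variability and the smallest difference worth detecting are assumed values,
used only for illustration:

import math
from scipy.stats import norm

alpha = 0.05    # significance level (two-sided)
power = 0.80    # desired statistical power
sigma = 4.0     # assumed standard deviation of the response
delta = 3.0     # smallest difference in means worth detecting

z_alpha = norm.ppf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
z_beta = norm.ppf(power)            # about 0.84 for power = 0.80

# Approximate sample size per group for a two-sample comparison of means
n = 2 * ((z_alpha + z_beta) ** 2) * sigma ** 2 / delta ** 2
print(f"Approximately {math.ceil(n)} units per group are needed")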

8. Normal Probability Plot


a) A normal probability plot is a graphical tool used to assess whether a dataset
follows a normal distribution.
b) It plots the observed data quantiles against the expected quantiles of a normal
distribution.
c) If the data points closely follow a straight line, it indicates that the data are
approximately normally distributed.
d) Deviations from the straight line suggest departures from normality, such as
skewness or outliers.
A normal probability plot is also known as a normal quantile plot or a QQ plot
(quantile-quantile plot). It provides a visual comparison between the observed
data and the values expected if the data were normally distributed, helping
researchers evaluate the normality assumption that many statistical analyses
require.
Here's how a normal probability plot is constructed and interpreted:
a) Construction of the plot: The normal probability plot is created by plotting the
sorted observed data values on the y-axis against the corresponding expected
quantiles from a standard normal distribution on the x-axis. If the data follows
a normal distribution, the points on the plot will approximately fall along a
straight line.
b) Expected quantiles: The expected quantiles are determined based on the number
of data points and their rank order. These quantiles represent the values that the
data would have if it were normally distributed.
c) Interpretation of the plot: If the points on the plot closely follow a straight line,
it suggests that the data is approximately normally distributed. Deviations from
the straight line indicate departures from normality. Common deviations include
curvature, outliers, or systematic patterns such as an "S" shape or bends in the
plot.
d) If the points deviate from the line towards the tails, it suggests heavy-tailed or
skewed data.
e) If the points deviate from the line near the center, it suggests a peaked or flat
distribution.
f) Validity of the normality assumption: The normality assumption is important in
many statistical analyses, such as t-tests, ANOVA, and regression. If the data
significantly deviates from normality, it may affect the validity of these
analyses. In such cases, alternative methods or transformations may be
considered.


Benefits of normal probability plots: Normal probability plots offer several
advantages:
a) They provide a visual and intuitive assessment of the data's departure from
normality.
b) They can detect both subtle and significant departures from normality.
c) They can be used with small and large datasets alike.
d) They allow for the identification of specific regions where the data deviates
from normality.
Other types of probability plots: In addition to normal probability plots,
similar plots can be constructed to assess the fit of data to other specific
distributions, such as exponential, gamma, or Weibull distributions. These
plots follow the same general principles but use the expected quantiles based
on the specific distribution being evaluated.
Normal probability plots are a valuable diagnostic tool in statistics and data analysis.
They provide a visual assessment of the normality assumption and help researchers
make informed decisions about the appropriateness of statistical methods based on the
data distribution.
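
As an illustration of the construction described above, scipy's probplot
function pairs the sorted data with theoretical normal quantiles and overlays a
fitted line; the sample here is synthetic, generated only to show the
mechanics:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=40)   # synthetic, roughly normal sample

# A straight-line pattern in the plot supports the normality assumption
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm", plot=plt)
plt.title("Normal probability plot")
plt.show()
print(f"Correlation of the points with the fitted line: r = {r:.3f}")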

9. Linear Regression Model


a) Linear regression is a statistical technique used to model the relationship between a
dependent variable (response variable) and one or more independent variables
(predictor variables).
b) The linear regression model assumes a linear relationship between the variables,
where changes in the predictor variables are associated with proportional changes
in the response variable.
c) The model estimates the coefficients that represent the slope and intercept of the
linear equation. These coefficients quantify the relationship between the predictor
variables and the response variable.
d) The linear regression model can be used for prediction, understanding the effects of
predictor variables, and testing hypotheses about the relationships between
variables.

The basic idea behind linear regression is to find the best-fitting line (or hyperplane)
that minimizes the difference between the predicted values from the model and the
actual observed values. This line is represented by a linear equation of the form:

y = b0 + b1*x1 + b2*x2 + ... + bn*xn

where:

• y is the dependent variable (the variable to be predicted or explained).
• b0 is the intercept (the value of y when all independent variables are zero).
• b1, b2, ..., bn are the coefficients (or slopes) associated with each independent
variable.
• x1, x2, ..., xn are the independent variables (also called features or predictors).

The goal of linear regression is to estimate the values of the coefficients (b0, b1, ..., bn)
that minimize the sum of squared differences between the predicted values and the
actual observed values. This is typically done using a method called ordinary least
squares (OLS).

Once the model is trained and the coefficients are estimated, you can use it to make
predictions on new data by plugging in the values of the independent variables into the
equation.
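
A minimal sketch of fitting such a model by ordinary least squares with numpy,
on a small invented dataset with a single predictor, might look like this:

import numpy as np

# Synthetic data: y depends roughly linearly on one predictor x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Design matrix with a column of ones so the model includes the intercept b0
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: minimize the sum of squared residuals
(b0, b1), residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print(f"Fitted model: y = {b0:.3f} + {b1:.3f} * x")

# Use the fitted equation to predict the response for a new observation
x_new = 6.0
print(f"Prediction at x = {x_new}: {b0 + b1 * x_new:.3f}")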

Linear regression has several assumptions:

1. Linearity: The relationship between the dependent variable and the independent
variables is linear.
2. Independence: The observations are independent of each other.
3. Homoscedasticity: The variability of the errors is constant across all levels of the
independent variables.
4. Normality: The errors are normally distributed.
5. No multicollinearity: The independent variables are not highly correlated with each
other.

Linear regression is widely used for tasks such as predicting house prices, analyzing
the impact of advertising on sales, and studying the relationship between variables in
scientific research. It serves as a fundamental building block for more complex
regression models and machine learning algorithms.

Dr. A. Varun, [email protected], 9494226711
Department of Mechanical Engineering, B V Raju Institute of Technology, Narsapur
