Scientific Method Lab Report
Lab 1
Introduction
In this lab activity you will learn about the scientific method. You will
understand the steps of the method and the importance of good writing to
communicate scientific findings. The skills developed in this lab activity will
help you during the semester as you conduct further experiments.
Activity 1. Identify the steps of the scientific method
Scientists are inquisitive individuals who constantly ask questions and propose
potential explanations for them. Consider for a moment the following questions:
1. Problem definition
2. Hypothesis formulation
3. Experimental design and procedures
4. Data collection
5. Analysis of results
6. Publication of results
Problem Definition
To carry out good scientific research, the variables of the scientific
question that we focus on should be measurable with numbers.
Some things are very hard to measure with numbers. For example, the
beauty of a painting, our emotions and feelings, or our unconscious thinking
are very difficult (if not impossible) to measure numerically. In contrast,
things like our age, our weight, our grades, or general health conditions are
well suited to numerical analysis. So, some things can be the subject of
scientific analysis and others are very hard or impossible to measure. It is
essential to consider these conditions to define a good scientific question.
Hypothesis Formulation
Charles Darwin used the fossil record available in his time and the fact that
island organisms are very different from continental organisms to propose that
organisms change as a result of adaptation to different environmental
conditions. James Watson and Francis Crick were aware that a molecule
composed of nitrogenous bases and deoxyribose sugar was in some way related to
genetics. Several experiments conducted decades before their time provided
evidence for this. Different hypotheses existed about the structure of DNA:
different helices, globular forms, or compact conformations. Finally, they
developed a hypothesis of the model and supported it with experimental evidence.
Experimental Design and Procedures
The Controlled Variables are defined as the variables that are kept
constant during an experiment. Suppose that in our previous example
the plants that were exposed to chemical “A” show differences in plant
growth when compared to plants grown without the compound. Based on
these results, we may conclude that chemical “A” has an effect on
plant growth. However, in order to establish an accurate cause-effect
relationship we have to keep constant all the other variables (such as
humidity, air composition, soil content, light conditions, temperature,
water composition and many other factors) that could have an effect on
plant growth in both the experimental and control groups. In this way,
we can be confident that any measured difference in plant growth is due to
exposure to chemical “A” and not to any other external factor.
In this activity, you will identify the process and components of the scientific
method in a scientific article. The journal article that we will “dissect” reported
an experiment to study the effects of an aquatic pollutant on gene expression of
fish.
Your TA will provide one copy of the research paper to each team. The article is
entitled “Physiological changes and differential gene expression in
mummichogs (Fundulus heteroclitus) exposed to arsenic”. You will analyze
the contents of the article with your team members and answer the questions
below.
After a general visual inspection of the article you may notice that it is
organized in certain sections. Identify the following sections in the article:
1.- Abstract
2.- Introduction
3.- Materials and Methods
4.- Results
5.- Discussion
6.- References
In the following boxes, briefly describe what information is included in each
section of the journal article.
Abstract
Materials and Methods (Are the methods clear? Is it possible to replicate the
experiment based on this section?)
Results (Are the results of the experiment clear? Are they presented
in tables and graphs? Is there any connection between the
graphs and tables and the text?)
Did you notice any similarities between the steps of the scientific method
and the sections of the journal article?
Do you think that research articles must be written following the standard
format analyzed today, or should the organization of the writing be more
relaxed? Would this be convenient for scientific communication? Type your
ideas below.
2.2.- The standard deviation and the T-test
Notice that in the second and third columns of Table 5 in the paper
the number of molecules is expressed as a number followed by a
“plus” and “minus” sign (±). Do you know the meaning of this?
Also, in the fourth column of Table 5 some numbers have a “little star” (*)
written to their right. The little star also appears in one column of all the
graphs in Figure 2. Do you know the meaning of this?
[Excerpt from Table 5 of the paper, annotated with callouts that point to the
“plus” and “minus” sign (±) and to the “little star” (*)]
                            # of molecules   # of molecules   Difference
8-7    EST                  118±86           126±157          1.1
8-26   Type II keratin      4,379±1,107      6,777±998        1.5 *
8-68   Type I cytokeratin   14±28.1          4±7              0.31
* p< 0.05
** p< 0.01
[Excerpt from Figure 2 of the paper: bar graphs of the number of molecules of
myosin light chain 2, 7-72 and parvalbumin in the Control and 230 ppb Arsenic
groups; one column in each graph is marked with a star (*)]
The “plus” and “minus” sign (±) indicates the standard deviation. The “little
star” (*) means that the difference is statistically significant.
The standard deviation and the statistical significance are two important
parameters used to analyze our data during scientific investigation and provide
a measure of the quality of our results. It is impossible to measure all the
individuals or factors in the universe to establish a cause-effect relationship.
For this reason, a good statistical analysis is used to investigate if our
experimental samples and results are representative of the whole population
under study.
There are many statistical tests to analyze our experimental results and you
will learn them in detail in an advanced statistical course. Here, we will briefly
introduce you to the standard deviation and the T-test.
If the t-test indicates that the difference between our results is not
significant, we cannot rule out that the observed difference arose by chance. In
contrast, if the t-test indicates that the difference is significant, it is
unlikely to be due to chance alone, which supports (but does not by itself
prove) a cause-effect relationship and makes the results suitable for
publication. Statistical significance is indicated in scientific papers with a
“little star” beside the number or the figure.
How do we calculate the standard deviation and the T-test? A more detailed
explanation for both tests and other concepts needed in their calculation is
given in Appendix 1 (Section 1 and Section 2).
In this activity, you will learn with your TA how to write excellent lab
reports. Please review the sample lab report in Appendix 2 with your team.
Your TA will also open the sample lab report and project it on the screen for
explanation.
Did you notice some similarities between the sections of the lab report and
the sections of the scientific article? Are the sections the same?
Are there references in the lab report?
Are the references cited directly or are they used to support the flow of
information?
Are there graphs and tables that explain the results? Are the numbers
statistically significant?
Answer the questions above carefully since they will guide you in writing your
lab reports in this course.
Homework: Statistics in Appendix
1) Read the statistics sections 1 and 2 in Appendix 1 (The mean and the
standard deviation; the t-test).
Introduction to Statistics
Section 1. The mean and the standard deviation
Statistics is the science that analyzes numerical data. During experiments,
scientific variables are reported as numerical quantities so it is essential to do
an accurate and effective analysis of numbers. It is not uncommon to gather
vast amounts of data points during experiments and there is a need to
summarize them and represent them in a condensed manner.
THE MEAN- For a specific set of data the mean is defined as the sum of the
individual values divided by the number of values.
THE MEDIAN- The median is the middle numerical value in a data set when
the numbers are arranged according to their value. The median divides a
dataset into two segments: a lower half and an upper half.
THE MODE- The mode is the number that occurs most frequently in a specific
dataset. It is the most likely value to occur in the sample.
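The three definitions above can be tried out with a short Python sketch. The
five measurements are hypothetical and chosen only for illustration; the
functions come from Python's standard statistics module.

```python
import statistics

# Hypothetical measurements, used only to illustrate the definitions.
values = [20, 21, 21, 24, 29]

mean = statistics.mean(values)      # sum of the values / number of values
median = statistics.median(values)  # middle value once the data are sorted
mode = statistics.mode(values)      # value that occurs most frequently

print("mean:", mean, "median:", median, "mode:", mode)
```

In this small set the mean (23) is pulled upward by the single large value 29,
while the median and the mode both stay at 21.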
The variance and the standard deviation are parameters that describe
dispersion within a dataset. Since experiments yield vast amounts of numerical
quantities it is important to understand the relationship between the individual
numerical values and the population mean. This provides the base for more
complex statistical tests.
For example, consider the following data sets:
            VALUES           MEAN
DATASET 1   28  25  24  23   25
DATASET 2   35  48  10   7   25
The arithmetic mean for both datasets is the same: 25. However, if you take a
close look at both datasets you will observe that the values in
dataset 1 are closer to the mean than the values in dataset 2.
DATASET 1: DATASET 2:
28-25=3 35-25=10
25-25=0 48-25=23
24-25=-1 10-25=-13
23-25=-2 7-25=-18
For dataset 1 the difference is never greater than three units; for dataset 2 the
difference is always greater than three units.
The difference is clearly higher for dataset 2 than for dataset 1. So different
datasets can have the same mean while their individual values are very
different from or very close to that mean. In descriptive statistics, it is
important to know how different our values are from the mean. This allows the
identification of outliers and can give an estimate of the consistency of the
replicates during an experiment or reveal interesting relationships between the
variables. These relationships are essential for the mathematical foundation of
scientific theories.
The variance and the standard deviation are statistical parameters that
describe how different the values of a particular dataset are from its mean.
Both parameters are calculated using specific mathematical formulas and their
value is a number. A high numerical value for the variance and the standard
deviation indicates dispersion from the mean (the values in the set show
variation), while a small numerical value for the variance and the standard
deviation indicates that the numbers in the dataset are close to the mean.
In order to calculate the variance the formula that is used is:
$$S^2 = \frac{\sum_{i=1}^{N} (X_i - M)^2}{N - 1}$$
Where S^2 is the variance, M is the mean of the dataset, Xi is each individual
experimental value and N is the sample size.
The standard deviation (S) is defined as the square root of the variance. The
formula to calculate the standard deviation is then:
$$S = \sqrt{\frac{\sum_{i=1}^{N} (X_i - M)^2}{N - 1}}$$
So, to find the variance and the standard deviation of a dataset first find the
mean of the dataset, then subtract the mean from every individual number in
the dataset, raise every difference to the second power and add the results.
Divide the result of this sum by the number of “numbers” in the dataset minus
one. This operation will give you the variance. Take now the square root of the
variance in order to calculate the standard deviation.
The values for the control (no-exposure) are in the first column and the values
for the experimental group (exposed) are in the second column:
1) Calculate the sum and the mean (M) of each dataset:
       C      E
       32     36
       33     37
       36     39
       38     37
       30     38
Sum    169    187
M      33.8   37.4
2) Subtract the mean from every individual value for each dataset:
C         (Xi-M)c   E         (Xi-M)e
32-33.8   -1.8      36-37.4   -1.4
33-33.8   -0.8      37-37.4   -0.4
36-33.8    2.2      39-37.4    1.6
38-33.8    4.2      37-37.4   -0.4
30-33.8   -3.8      38-37.4    0.6
3) Raise every difference to the second power:
((Xi-M)c)^2   ((Xi-M)e)^2
3.24          1.96
0.64          0.16
4.84          2.56
17.64         0.16
14.44         0.36
4) Add the squared differences for each dataset:
              C      E
Ʃ((Xi-M)^2)   40.8   5.2
5) To calculate the variance for each dataset divide each value from the
previous sum by N-1. Since there are five numbers in each data set N=5
and N-1=4; divide each value by four:
6) Finally, calculate the square root of the numbers in (5) to find the
standard deviation:
C E
Ʃ((Xi-M)^2)/4 10.2 1.3
Standard deviation 3.193744 1.140175
The variance for the control group is 10.2 and for the experimental group 1.3.
The standard deviation for the control group is 3.19 and the standard
deviation for the experimental group is 1.14.
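1) Type the values of both datasets in a spreadsheet (control in column A,
experimental in column B):
    A    B
1   32   36
2   33   37
3   36   39
4   38   37
5   30   38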
2) Place the cursor in the cell below the last number of each dataset and
use the function for the standard deviation. Type “=STDEV(A1:A5)” for
the control and “=STDEV(B1:B5)” for the experimental group and press
Enter in each cell.
A B
1 32 36
2 33 37
3 36 39
4 38 37
5 30 38
=STDEV(A1:A5) =STDEV(B1:B5)
3) The computer will calculate the standard deviation for each dataset.
A B
1 32 36
2 33 37
3 36 39
4 38 37
5 30 38
3.193743885 1.140175425
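The same check can be made outside a spreadsheet. The following sketch uses
Python's standard statistics module, whose variance and stdev functions divide
by N-1 just like the hand calculation above; the variable names are
illustrative.

```python
import statistics

# Control (no exposure) and experimental (exposed) values from the worked example.
control = [32, 33, 36, 38, 30]
experimental = [36, 37, 39, 37, 38]

for name, data in (("control", control), ("experimental", experimental)):
    variance = statistics.variance(data)  # sample variance (divides by N - 1)
    sd = statistics.stdev(data)           # square root of the sample variance
    print(name, "variance:", round(variance, 2), "SD:", round(sd, 2))

control_sd = statistics.stdev(control)
experimental_sd = statistics.stdev(experimental)
```

The printed values should agree with the variances (10.2 and 1.3) and the
standard deviations (3.19 and 1.14) obtained by hand.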
Section 2. The t-test
The mean value of the experimental group is higher than the mean of the
control so we may assume that the fertilizer had an effect on the growth of the
plants. This may be true, but we need further statistical proof to see if the
means are really different (statistically significant).
Why do we need that additional test? The reason is that datasets with the same
mean can have very different variances. The numbers used to calculate the
means may be more “similar” to one another in some cases and
more “different” in other cases. For this reason, the numerical value of the
means is not enough to see if the results of the experimental groups are really
different from the results of the control, since the values of two
datasets with high variance may overlap.
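[Figure: three pairs of distributions (A, B and C) with the same difference
between means and decreasing variability; the shaded area marks the overlap]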
In A there is high variability within the numerical values and some values from
both datasets overlap (shaded area). In B there is medium variability and the
overlapping region is smaller. In C, there is less variability within the datasets
and the overlapping region is very small.
Although the numerical value of the difference of the means is the same in A, B
and C, we conclude that the difference is more “significant” in C than in A or B.
In other words, the numbers that produced the means in C are more
“different” (less overlapping). So in C, the difference of the means is more
credible (more significant). This conclusion is extremely important in science
because a significant difference gives stronger support for a cause-effect
relationship.
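A t-test compares the means of two datasets, taking their variability into
account, using a mathematical formula. The formula for the t-test is:
$$t = \frac{M_1 - M_2}{\sqrt{\dfrac{S_1^2}{N_1} + \dfrac{S_2^2}{N_2}}}$$
M1=mean of sample 1
M2=mean of sample 2
N1=sample 1 size
N2=sample 2 size
S1^2=variance of sample 1
S2^2=variance of sample 2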
Statistics books usually contain tables of critical values for the t-test.
Using them requires an alpha level and the degrees of freedom. Briefly, the
alpha level (or significance level) is the level of risk we accept in the test.
In general, we accept a level of risk of 0.05 (5%), which corresponds to a 95%
confidence level. This means that we accept a 5% chance of concluding that the
means are different when the difference is actually due to chance.
So to perform a t-test we need the T-value from the formula above, the alpha
value of 0.05 and the degrees of freedom.
T-test example
C1 T1
25 19
23 14
21 17
20 21
20 19
26 14
20 21
26 15
20 22
26 14
24 25
26 23
25 20
23 18
19 17
1.- Find the mean and the standard deviation for C1 and T1:
C1 T1
Mean 22.93333 18.6
ST Dev 2.685056 3.459975
2.- Place the values in the formula for the t-test:
$$t = \frac{22.93 - 18.6}{\sqrt{\dfrac{2.685^2}{15} + \dfrac{3.460^2}{15}}} = \frac{4.33}{1.13} = 3.83$$
3.- We will now compare our calculated t-value (3.83) with the values of a t-test
table to find if the means are different at the 95% confidence level. For two
independent samples the degrees of freedom are (N1-1)+(N2-1) = 14+14 = 28.
In the t-test table for 28 degrees of freedom and an alpha value of 0.05
(two-tailed) the critical value is 2.0484. Since our t-value 3.83>2.0484, the
difference in the means is significant. So if we want to publish the results of
this experiment we will write a little star in the bar graph of the results.
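The calculation in this example can be reproduced with a short Python sketch.
It assumes the two-sample form of the t formula in which each sample variance
is divided by its own sample size; the function name t_value is illustrative.

```python
import math
import statistics

# Control (C1) and treatment (T1) values from the t-test example.
c1 = [25, 23, 21, 20, 20, 26, 20, 26, 20, 26, 24, 26, 25, 23, 19]
t1 = [19, 14, 17, 21, 19, 14, 21, 15, 22, 14, 25, 23, 20, 18, 17]

def t_value(sample_1, sample_2):
    """t = (M1 - M2) / sqrt(S1^2/N1 + S2^2/N2), using sample variances."""
    m1, m2 = statistics.mean(sample_1), statistics.mean(sample_2)
    v1, v2 = statistics.variance(sample_1), statistics.variance(sample_2)
    standard_error = math.sqrt(v1 / len(sample_1) + v2 / len(sample_2))
    return (m1 - m2) / standard_error

t = t_value(c1, t1)
print("mean C1:", round(statistics.mean(c1), 2), "SD C1:", round(statistics.stdev(c1), 2))
print("mean T1:", round(statistics.mean(t1), 2), "SD T1:", round(statistics.stdev(t1), 2))
print("t-value:", round(t, 2))
```

The printed t-value is then compared with the critical value from the t-test
table, exactly as in step 3.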
1.- Type the numerical values from the previous example in a spreadsheet.
A B
1 25 19
2 23 14
3 21 17
4 20 21
5 20 19
6 26 14
7 20 21
8 26 15
9 20 22
10 26 14
11 24 25
12 26 23
13 25 20
14 23 18
15 19 17
2.- Place the cursor in cell B16 (below the last number in column B) and use
the function for the t-test. Type
“=T.TEST(A1:A15,B1:B15,2,2)” and press Enter. The p-value is 0.00066, which is
less than 0.05, therefore the results are significant.
25 19
23 14
21 17
20 21
20 19
26 14
20 21
26 15
20 22
26 14
24 25
26 23
25 20
23 18
19 17
0.000658
Section 3. The ANOVA test
The t-test is a statistical test that is used to investigate if two means are
actually different. What happens if an experiment contains more than two
groups (for example, three control groups and three experimental
groups)? In this case, a different test is used: the ANOVA (Analysis of Variance)
test. The ANOVA test tells us if three or more means are really different. A
different statistical parameter (the “F” value) is used in the ANOVA test.
Perform the following operations with your team in order to find the F value for
an ANOVA test.
            Values                       Average
DATASET 1   28  25  24  27  26  27  35   27.43
DATASET 2   31  25  27  28  26  24  41   28.86
DATASET 3   17  15  10  33  42  39  44   28.57
4) Calculate the average of all sample variances for the three datasets
Df1=3-1=2
Df2=(Na-1)+(Nb-1)+(Nc-1)=6+6+6=18
F-table (alpha = 0.05)
df2/df1   1      2      3      4      5      6      7
15        4.54   3.68   3.29   3.06   2.90   2.79   2.71
16        4.49   3.63   3.24   3.01   2.85   2.74   2.66
17        4.45   3.59   3.20   2.96   2.81   2.70   2.61
18        4.41   3.55   3.16   2.93   2.77   2.66   2.58
19        4.38   3.52   3.13   2.90   2.74   2.63   2.54
20        4.35   3.49   3.10   2.87   2.71   2.60   2.51
If your calculated F value is greater than 3.55 (the critical value for df1=2
and df2=18), the difference between the means is significant. If not, the
difference is not significant and may be due to chance.
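As a way to check a hand calculation, the F value for the three datasets above
can be computed with the Python sketch below. It assumes the standard one-way
ANOVA definition (variance between the group means divided by the average
variance within the groups), which may be organized differently from the steps
followed in class; the function name f_value is illustrative.

```python
import statistics

datasets = [
    [28, 25, 24, 27, 26, 27, 35],  # DATASET 1
    [31, 25, 27, 28, 26, 24, 41],  # DATASET 2
    [17, 15, 10, 33, 42, 39, 44],  # DATASET 3
]

def f_value(groups):
    """Return (F, df1, df2) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    df1 = k - 1            # number of groups minus one
    df2 = n_total - k      # (Na-1)+(Nb-1)+(Nc-1)
    return (ss_between / df1) / (ss_within / df2), df1, df2

f, df1, df2 = f_value(datasets)
CRITICAL_F = 3.55  # from the F-table for df1=2, df2=18

print("F =", round(f, 3), "df1 =", df1, "df2 =", df2)
print("significant" if f > CRITICAL_F else "not significant")
```

For these three datasets the means are close together while the values within
each group vary widely, so the F value falls far below the critical value.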
APPENDIX 2
Student’s name:___________
Biology 1107 Section: XXXX
Lab Day/Time:__ ________
Abstract
Introduction.
Organic waste has beneficial effects in soils. These effects can improve plant
growth in both natural and human ecosystems. As a result, it is possible to
improve agricultural yields and biomass production in forests. In this way, the
health of humans and the environment is enhanced. Organic waste improves
soil properties by optimizing porosity and transport of materials and by
complementing the chemical composition of soils. In this way, plants obtain
more nutrients, the activity of microbes in the soil is improved and the action
of plant growth regulators is enhanced (Edwards et al., 2002).
The sources of organic waste include sewage sludge, animal waste, crop
residues and waste from industry and vermicompost devices. For these
reasons, the effects of organic waste in soils are diverse. For example, organic
waste enhances the humification of organic matter and produces chemicals
that could affect plant growth such as auxins and humic acids (Tomati et al.,
1995). In addition, there is evidence that organic waste enhances the health of
plants by making them more resistant to disease and by minimizing the
presence of other organisms that compete with plants for space and resource
availability.
Tomatoes are important components of the human diet throughout the world.
Their consumption has been linked to several health benefits, including
reduced risk for cancer and cardiovascular disease. The reason for this is the
presence of an antioxidant molecule called lycopene in tomatoes (Agarwa et al.,
2000). In addition, tomatoes contain several valuable minerals and vitamin C
(Guil-Guerrero et al., 2009). The presence of these nutrients in the plants may
be affected by the conditions of the soil. There is a need to improve the growth
of tomatoes in the field and obtain better agricultural yields. One possible way
to achieve this is the use of organic waste in soils where the plants will grow.
For these reasons, this experiment will investigate the effects of organic waste
on the growth of tomato plants. It is hypothesized that the organic waste will
enhance plant growth as a function of concentration.
Materials and Methods.
A vermicompost device was filled with soil, worms and different organic
materials (fruit and vegetable remains). After a month, the materials in the
device were mixed and a sample was taken for analysis of total organic matter.
Five plastic containers were labeled 1, 2, 3, 4 and 5 and filled with different
amounts of soil (250, 300, 350, 400 and 450 grams, respectively). Different
samples were taken from the vermicompost device in order to mix them with
the soil in the plastic containers and obtain a final weight of 500 grams in each
container (250 grams in the first, 200 grams in the second, 150 grams in the
third, 100 grams in the fourth and 50 grams in the fifth). The soil and the
organic waste were mixed until a homogeneous state was achieved.
Ten tomato seeds were placed in the mid-part of the soil of each container and
covered with the organic waste-soil mix. Water was added to each container so
that the organic waste-soil mix was moistened completely. After this, the five
containers were placed in a hibernating chamber at a constant temperature of
25 °C. The plants were grown under 12 hours of light and 12 hours of dark
conditions using an artificial illumination system located in the hibernating
chamber. Each day, 100 ml of water was added to each container.
The plants were grown for seven weeks under these conditions and the total
number of plants that germinated as well as their weight and length were
recorded.
Results.
Tomato plants were grown in five plastic containers with different amounts of
organic waste and soil as described above. The number of plants that
germinated in each container was counted and their weight and height were
measured (Table 1). The experiment was conducted normally and without any
further modification or adaptation.
The seeds germinated in all containers. Eight plants germinated in C1, C2 and
C5 and seven plants in C3 and C4. On average, two to three tomato seeds did
not germinate in each container, perhaps as a result of light conditions or
soil type.
The average length for the plants in the first two containers (C1 and C2) was
clearly higher than the average length of the remaining containers. The average
length was similar for both C3 and C5 (in general low) and the plants in the C4
container grew moderately (Figure 2).
The maximum plant length was 17 cm (in plants of the C1 and C2 containers)
and the lowest was 8 cm (in plants of the C5 container). The average length for
all plants in all containers was 13.21 cm (Table 1; Figure 1). The plants in the
C3 container were on average shorter than those in the other four, but if we
compare the absolute values of length individually the lowest length is present
in plants from container 5 (C5) (Table 1).
Table 1. Individual plant length
                  C1               C2               C3               C4               C5
                  (250 g soil +    (300 g soil +    (350 g soil +    (400 g soil +    (450 g soil +
                  250 g organic    200 g organic    150 g organic    100 g organic    50 g organic
                  mix)             mix)             mix)             mix)             mix)
Length plant 1    10 cm            14 cm            No germination   13 cm            8 cm
Length plant 2    15 cm            No germination   15 cm            No germination   15 cm
Length plant 3    11 cm            17 cm            14 cm            No germination   8 cm
Length plant 4    13 cm            13 cm            12 cm            16 cm            10 cm
Length plant 5    16 cm            No germination   10 cm            16 cm            12 cm
Length plant 6    12 cm            12 cm            9 cm             12 cm            16 cm
Length plant 7    17 cm            15 cm            15 cm            11 cm            12 cm
Length plant 8    18 cm            17 cm            No germination   12 cm            No germination
Length plant 9    No germination   12 cm            12 cm            15 cm            9 cm
Length plant 10   No germination   13 cm            No germination   No germination   No germination
In general the experiment was conducted without problems and following the
original protocol. The only unexpected outcome was the lack of germination in
some seeds. This situation is common in agriculture so the cause-effect
relationships in our experiment were not affected.
[Figure: plant height (centimeters) of each individual plant; legend c1, c2,
c3, c4, c5]
[Figure: average plant height (centimeters) by group (c1, c2, c3, c4, c5)]
Conclusion.
After placing the seeds in the containers the plants were monitored and grown
for seven weeks. A total of eight plants germinated in C1, C2 and C5 and seven
plants in C3 and C4. The plants were taller on average in containers 1 and 2,
which also contain the highest proportion of organic waste from the
vermiculture station. It is possible to establish a cause-effect relationship
between the amount of organic matter and plant growth. An increase in the
amount of organic matter clearly enhanced plant growth. This fact supports
our original hypothesis.
The average length of the plants is smaller in the C3 container than that of the
plants from C4 and C5. This could be an argument against the relationship
explained above; however, if we consider the absolute length of plants in all
containers the conclusion is still valid since the plants in C4 and C5 are
smaller. We recommend performing a statistical test that could detect any
outlier value that could artificially change the average in C3 and give the
false impression of a decreased length.
The results from this experiment clearly demonstrate that the organic content
in soil has a positive effect on the growth of plants. These results have great
potential, since organic waste can be obtained from vermiculture stations that
recycle organic waste and in this way contribute to a better environment and
improved nutritional components (Wilson et al., 1989). In addition, since
tomatoes contain chemicals that improve heart conditions, these results may be
used in biotechnology to produce and isolate those compounds under reduced
spatial conditions (Khachik et al., 2002).
References.
Agarwa, I.A. and Rao, A.V. 2000. Tomato lycopene and its role in human health
and chronic diseases. Canadian Medical Association Journal. 163: 739-744.
Edwards, N.Q., Arancon, J.D., Metzger, L. 2002. The influence of humic acids
derived from earthworm-processed organic wastes. Bioresource Technology.
84: 7-14.
Tomati, V. and Galli, E. 1995. Earthworms, soil fertility and plant productivity.
Acta Zoologica Fennica. 196: 11-14.
Wilson, D.P. and Carlile, W.R. 1989. Plant growth in potting media containing
worm-worked duck waste. Acta Horticulturae. 238: 205-220.