Agri 225 Agricultural Meteorological Data
(e) Information relating to weather disasters and their influence on agriculture;
(f) Information relating to the distribution of weather and agricultural crops, and geographical information, including digital maps;
(g) Metadata that describe the observation techniques and procedures used.

3.2.2 Data collection

The collection of data is very important as it lays the foundation for agricultural weather and climate data systems that are necessary to expedite the generation of products, analyses and forecasts for agricultural cropping decisions, irrigation management, fire weather management, and ecosystem conservation. The impact on crops, livestock, water and soil resources, and forestry must be evaluated from the best available spatial and temporal array of parameters. Agrometeorology is an interdisciplinary branch of science requiring the combination of general meteorological data observations and specific biological parameters. Meteorological data can be viewed as typically physical elements that may be measured with relatively high accuracy, while other types of observations (namely, biological or phenological) may be more subjective. In collecting, managing and analysing the data for agrometeorological purposes, the source of data and the methods of observation define their character and management criteria. Some useful suggestions with regard to the storage and processing of data can be offered, however:
(a) Original data files, which may be used for reference purposes (the daily register of observations, and so on), should be stored at the observation site; this applies equally to atmospheric, biological, crop and soil data;
(b) The most frequently used data should be collected at national or regional agrometeorological centres and reside in host servers for network accessibility. This may not always be practical, however, since stations or laboratories under the control of different authorities (meteorological services, agricultural services, universities, research institutes) often collect unique agrometeorological data. Steps should therefore be taken to ensure that possible users are aware of the existence of such data, either through some form of data library or computerized documentation, and that appropriate data exchange mechanisms are available to access and share these data;
(c) Data resulting from special studies should be stored at the place where the research work is undertaken, but it would be advantageous to arrange for exchanges of data among centres carrying out similar research work. At the same time, the existence of these data should be publicized at the national level and possibly at the international level, if appropriate, especially in the case of longer series of special observations;
(d) All the usual data storage media are recommended:
    (i) The original data records, or agrometeorological summaries, are often the most convenient format for the observing stations;
    (ii) The format of data summaries intended for forwarding to regional or national centres, or for dissemination to the user community, should be designed so that the data may be easily transferred to a variety of media for processing. The format should also facilitate either the manual preparation or automated processing of statistical summaries (computation of means, frequencies, and the like). At the same time, access to and retrieval of data files should be simple, flexible and reproducible for assessment, modelling or research purposes;
    (iii) Rapid advances in electronic technology facilitate effective exchange of data files, summaries and charts of recording instruments, particularly at the national and international levels;
    (iv) Agrometeorological data should be transferred to electronic media in the same way as conventional climatological data, with an emphasis on automatic processing.

The availability of proper agricultural meteorological databases is a major prerequisite for studying and managing the processes of agricultural and forest production. The agricultural meteorology community has great interest in incorporating new information technologies into a systematic design for agrometeorological management to ensure timely and reliable data from national reporting networks for the benefit of the local farming community. While much more information has become available to the agricultural user, it is essential that appropriate standards be maintained for basic instrumentation, collection and observations, quality control, and archiving and dissemination. After they have been recorded, collected and transferred to the data centres, all agricultural meteorological data need to be standardized or technically treated so that they can be used for various purposes. The data centres need to maintain special databases. These databases should include meteorological, phenological,
CHAPTER 3. AGRICULTURAL METEOROLOGICAL DATA, THEIR PRESENTATION AND STATISTICAL ANALYSIS 3–3
edaphic and agronomic information. Database management and processing and the quality control, archiving, timely accessing and dissemination of data are all important components that render the information valuable and useful in agricultural research and operational programmes.

After they have been stored in a data centre, the data are disseminated to users. There have been major advancements in making more data products available to the user community through automation. The introduction of electronic transfer of data files via the Internet using the file transfer protocol (FTP) and the World Wide Web (WWW) has brought this information transfer process up to a new level. The Web allows users to access text, images and even sound files that can be linked together electronically. The Web’s attributes include the flexibility to handle a wide range of data presentation methods and the capability to reach a large audience. Developing countries have some access to this type of electronic information, but limitations still exist in the development of their own electronically accessible databases. These limitations will diminish as the cost of technology decreases and its availability increases.

3.2.3 Recording of data

Recording of basic data is the first step for agricultural meteorological data collection. When the environmental factors and other agricultural meteorological elements are measured or observed, they must be recorded on the same media, such as agricultural meteorological registers, diskettes, and the like, manually or automatically.
(a) The data, such as the daily register of observations and charts of recording instruments, should be carefully preserved as permanent records. They should be readily identifiable and include the place, date and time of each observation, and the units used.
(b) These basic data should be sent to analysis centres for operational uses, such as local agricultural weather forecasts, agricultural meteorological information services, plant protection treatment and irrigation guidance. Summaries (weekly, 10-day or monthly) of these data should be made regularly from the daily register of observations according to the user demand and then distributed to interested agencies and users.
(c) Observers need to record all measurements in compliance with rules for harmonization. This will ensure that the data are recorded in a standard format so that they can readily be transferred to data centres for automatic processing. Data can be transferred in several ways, including by mail, telephone, telegraph, fax and Internet, and via Comsat; transmission via the Internet and Comsat is more efficient. After reaching the data centres, data should be identified and processed by means of a special program in order to facilitate their dissemination to other users.

3.2.4 Scrutiny of data and acquisition of metadata

It is very important that all agricultural meteorological data be carefully scrutinized, both at the observing station and at regional or national centres, by means of subsequent automatic computer processing. All data should be identified immediately. The code parameters should be specified, such as types, regions, missing values and possible ranges for different measurements. The quality control should be done according to Wijngaard et al. (2003), WMO-TD No. 1236 (WMO, 2004a) and the current Guide to Climatological Practices (WMO, 1983). Every measurement code must be checked to make certain that the measurement is reasonable. If the value is unreasonable, it should be corrected immediately. After being scrutinized, the data can be processed further for different purposes. In order to ascertain the quality of observation data and determine whether to correct or normalize them before analysis, metadata are needed. These are the details and history of local conditions, and instrumentation, operational, data-processing and other factors relevant to the observation process. Such metadata should be documented and treated with the same care as the data themselves (see WMO 2003a, 2003b). Unfortunately, observation metadata are often incomplete and poorly organized.

In Chapter 2 of this Guide, essential metadata are specified for individual parameters and the organization of their acquisition is reviewed in 2.2.5. Many kinds of metadata can be recorded as simple numbers, as is the case with observation heights, for example; but more complex aspects, such as instrument exposure, must also be recorded in a manner that is practicable for the observers and station managers. Acquiring metadata on present observations and inquiring about metadata on past observations are now a major responsibility of data managers. Omission of metadata acquisition implies that the data will have low quality for applications. The optimal set-up of a database for metadata is at present still in development, because metadata characteristics are so variable. To be manageable, the optimal database should not only be efficient for archiving, but also easily accessible for those who are recording the metadata. To allow for future
3–4 GUIDE TO AGRICULTURAL METEOROLOGICAL PRACTICES
management. Thus, a database management system for agricultural applications should be comprehensive, bearing in mind the following considerations:
(a) Communication among climatologists, agrometeorologists and agricultural extension personnel must be improved to establish an operational database;
(b) The outputs must be adapted for an operational database in order to support specific agrometeorological applications at a national/regional/global level;
(c) Applications must be linked to the Climate Applications Referral System (CARS) project, spatial interpolated databases and a Geographical Information System (GIS).

Personal computers (PCs) are able to provide products formatted for easy reading and presentation, which are generated through simple processors, databases or spreadsheet applications. Some careful thought needs to be given, however, to what type of product is needed, what the product looks like and what it contains, before the database delivery design is finalized. The greatest difficulty often encountered is how to treat missing data or information (WMO, 2004a). This process is even more complicated when data from several different datasets, such as climatic and agricultural data, are combined. Some software programs for database management, especially the software for climatic database management, provide convenient tools for agrometeorological database management.

3.4.1 CLICOM Database Management System

CLICOM (CLImate COMputing) refers to the WMO World Climate Data Programme Project, which is aimed at coordinating and assisting the implementation, maintenance and upgrading of automated climate data management procedures and systems in WMO Member countries (that is, the National Meteorological and Hydrological Services in these countries). The goal of CLICOM is the transfer of three main components of modern technology, namely, desktop computer hardware, database management software and training in climate data management. CLICOM is standardized, automated database management software for use on a personal computer, and it is intended for introduction in developing countries. As of May 1996, CLICOM version 3.0 was installed in 127 WMO Member countries. CLICOM software is now available in Czech, English, French, Spanish and Russian. CLICOM Version 3.1 Release 2 became available in January 2000.

CLICOM provides tools (such as stations, observations and instruments) to describe and manage the climatological network. It offers procedures for the key entry, checking and archiving of climate data, and for computing and analysing the data. Typical standard outputs include monthly or 10-day data from daily data; statistics such as means, maximums, minimums and standard deviations; and tables and graphs. Other products requiring more elaborate data processing include water balance monitoring, estimation of missing precipitation data, calculation of the return period and preparation of the CLIMAT message.

The CLICOM software is widely used in developing countries. The installation of CLICOM as a data management system in many of these countries has successfully transferred the technology for use with PCs, but the resulting climate data management improvements have not yet been fully realized. Station network density as recommended by WMO has not been fully achieved and the collection of data in many countries remains inadequate. CLICOM systems are beginning to yield positive results, however, and there is a growing recognition of the operational applications of CLICOM.

There are a number of constraints that have been identified over time and recognized for possible improvement in future versions of the CLICOM system. Among the technical limitations, the list includes (WMO, 2000):
(a) The lack of flexibility to implement specific applications in the agricultural field and/or at a regional/global level;
(b) The lack of functionality in real-time operations;
(c) Few options for file import;
(d) The lack of transparent linkages to other applications;
(e) The risk of overlapping of many datasets;
(f) A non-standard georeferencing system;
(g) Storage of climate data without the corresponding station information;
(h) The possibility of easy modification of the data entry module, which may destroy existing data.

3.4.2 Geographical Information System (GIS)

A Geographical Information System (GIS) is a computer-assisted system for the acquisition, storage, analysis and display of observed data on spatial distribution. GIS technology integrates common database operations such as query and statistical analysis with the unique visualization and geographic analysis benefits offered by mapping overlays. Maps have traditionally been used to explore the Earth and
its resources. GIS technology takes advantage of computer science technologies, enhancing the efficiency and analytical power of traditional methodologies.

GIS is becoming an essential tool in the effort to understand complex processes at different scales: local, regional and global. In GIS, the information coming from different disciplines and sources, such as traditional point sources, digital maps, databases and remote sensing, can be combined in models that simulate the behaviour of complex systems.

The presentation of geographic elements is solved in two ways: using x, y coordinates (vectors), or representing the object as a variation of values in a geometric array (raster). The possibility of transforming the data from one format to the other allows fast interaction between different informative layers. Typical operations include overlaying different thematic maps; acquiring statistical information about the attributes; changing the legend, scale and projection of maps; and making three-dimensional perspective view plots using elevation data.

The capability to manage this diverse information, by analysing and processing the informative layers together, opens up new possibilities for the simulation of complex systems. GIS can be used to produce images: not only maps, but cartographic products, drawings, animations or interactive instruments as well. These products allow researchers to analyse their data in new ways, predicting the natural behaviours, explaining events and planning strategies.

For the agronomic and natural components in agrometeorology, these tools have taken the name Land Information Systems (LIS) (Sivakumar et al., 2000). In both GIS and LIS, the key components are the same, namely, hardware, software, data, techniques and technicians. LIS, however, requires detailed information on environmental elements, such as meteorological parameters, vegetation, soil and water. The final product of LIS is often the result of a combination of a large number of complex informative layers, whose precision is fundamental for the reliability of the whole system. Chapter 4 of this Guide contains an extensive overview of GIS.

3.4.3 Weather generators (WG)

Weather generators are widely used to generate synthetic weather data, which can be arbitrarily long for input into impact models, such as crop models and hydrological models that are used for assessing agroclimatic long-term risk and agrometeorological analysis. Weather generators are also the tool used for developing future climate scenarios based on global climate model (GCM) simulations or subjectively introduced climate changes for climate change impact models. Weather generators project future changes in means (averages) onto the observed historical weather series by incorporating changes in variability; these projections are widely used for agricultural impact studies. Daily climate scenarios can be used to study potential changes in agroclimatic resources. Weather generators can calculate agroclimatic indices on the basis of historical climate data and GCM outputs. Various agroclimatic indices can be used to assess crop production potentials and to rate the climatic suitability of land for crops. A methodologically more consistent approach is to use a stochastic weather generator, instead of historical data, in conjunction with a crop simulation model. The stochastic weather generator allows temporal extrapolation of observed weather data for agricultural risk assessment and provides an expanded spatial source of weather data by interpolation between the point-based parameters used to define the weather generators. Interpolation procedures can create both spatial input data and spatial output data. The density of meteorological stations is often low, especially in developing countries, and reliable and complete long-term data are scarce. Daily interpolated surfaces of meteorological variables rarely exist. More commonly, weather generators can be used to generate the weather variables in grids that cover large geographic regions and come from interpolated surfaces of weekly or monthly climate variables. On the basis of these interpolated surfaces, daily weather data for crop simulation models are generated using statistical models that attempt to reproduce series of daily data with means and a variability similar to those that would be observed at a given location.

Weather generators have the capacity to simulate statistical properties of observed weather data for agricultural applications, including a set of agroclimatic indices. They are able to simulate temperature, precipitation and related statistics. Weather generators typically calculate daily precipitation risk and use this information to guide the generation of other weather variables, such as daily solar radiation, maximum and minimum temperature, and potential evapotranspiration. They can also simulate statistical properties of daily weather series under a changing/changed climate through modifications to the weather generator parameters with optimal use of available information on climate change. For example, weather generators can simulate the frequency distributions of the wet and dry spells fairly well by modifying the four transition probabilities of the second-order Markov chain. Weather generators are generally based on the statistics. For example, to generate the amount
(j) Agrometeorological observations
    i. Soil moisture at regular depths;
    ii. Plant growth observations;
    iii. Plant population;
    iv. Phenological events;
    v. Leaf area index;
    vi. Above-ground biomass;
    vii. Crop canopy temperature;
    viii. Leaf temperature;
    ix. Crop root length.

3.5.1 Forecast information

Operational weather information is defined as real-time data that provide conditions of past weather (over the previous few days), present weather, as well as predicted weather. It is well known, however, that the forecast product deteriorates with time, so that the longer the forecast period, the less reliable the forecast. Forecasting of agriculturally important elements is discussed in Chapters 4 and 5.

3.6 STATISTICAL METHODS OF AGROMETEOROLOGICAL DATA ANALYSIS

The remarks set out here are intended to be supplementary to WMO-No. 100, Guide to Climatological Practices, Chapter 5, “The use of statistics in climatology”, and to WMO-No. 199, Some Methods of Climatological Analysis (WMO Technical Note No. 81), which contain advice generally appropriate and applicable to agricultural climatology.

Statistical analyses play an important role in agrometeorology, as they provide a means of interrelating series of data from diverse sources, namely biological data, soil and crop data, and atmospheric measurements. Because of the complexity and multiplicity of the effects of environmental factors on the growth and development of living organisms, and consequently on agricultural production, it is sometimes necessary to use rather sophisticated statistical methods to detect the interactions of these factors and their practical consequences.

It must not be forgotten that advice on long-term agricultural planning, selection of the most suitable farming enterprise, the provision of proper equipment and the introduction of protective measures against severe weather conditions all depend to some extent on the quality of the climatological analyses of the agroclimatic and related data, and hence, on the statistical methods on which these analyses are based. Another point that needs to be stressed is that one is often obliged to compare measurements of the physical environment with biological data, which are often difficult to quantify.

Once the agrometeorological data are stored in electronic form in a file or database, they can be analysed using public domain or commercial statistical software. Some basic statistical analyses can be performed in widely available commercial spreadsheet software. More comprehensive basic and advanced statistical analyses generally require specialized statistical software. Basic statistical analyses include simple descriptive statistics, distribution fitting, correlation analysis, multiple linear regression, non-parametrics and enhanced graphic capabilities. Advanced software includes linear/non-linear models, time series and forecasting, and multivariate exploratory techniques such as cluster analysis, factor analysis, principal components and classification analysis, classification trees, canonical analysis and discriminant analysis. Commercial statistical software for PCs would be expected to provide a user-friendly interface with self-prompting analysis selection dialogues. Many software packages include electronic manuals that provide extensive explanations of analysis options with examples and comprehensive statistical advice.

Some commercial packages are rather expensive, but some free statistical analysis software can be downloaded from the Web or made available upon request. One example of freely available software is INSTAT, which was developed with applications in agrometeorology in mind. It is a general-purpose statistics package for PCs that was developed by the Statistical Service Centre of the University of Reading in the United Kingdom. It uses a simple command language to process and analyse data. The documentation and software can be downloaded from the Web. Data for analysis can be entered into a table or copied and pasted from the clipboard. If CLICOM is used as the database management software, then INSTAT, which was designed for use with CLICOM, can readily be used to extract the data and perform statistical analyses. INSTAT can be used to calculate simple descriptive statistics, including minimum and maximum values, range, mean, standard deviation, median, lower quartile, upper quartile, skewness and kurtosis. It can be used to calculate probabilities and percentiles for standard distributions, normal scores, t-tests and confidence intervals, chi-square tests, and non-parametric statistics. It can be used to plot data for regression and correlation analysis and analysis of time series. INSTAT is designed to provide a range of
climate analyses. It has commands for 10-day, monthly and yearly statistics. It calculates water balance from rainfall and evaporation, start of rains, degree-days, wind direction frequencies, spell lengths, potential evapotranspiration according to Penman, and the crop performance index according to methodology used by the Food and Agriculture Organization of the United Nations (FAO). The usefulness of INSTAT for agroclimatic analysis is illustrated in Sivakumar et al. (1993): the major part of the analysis reported here was carried out using INSTAT.

for much shorter periods than those used for macroclimatic analyses, provided that they can be related to some long reference series;
(c) For bioclimatic research, the physical environment should be studied at the level of the plant or animal, or the pathogenic colony itself. Obtaining information about radiation energy, moisture and chemical exchanges involves handling measurements on the much finer scale of microclimatology;
(d) For research on the impacts of a changing climate, past long-term historical and future climate scenarios should be used.
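
Two of the climate analyses named above for INSTAT, degree-days and spell lengths, can be illustrated with a minimal Python sketch. This is not INSTAT's own method or command language: the function names, the base temperature of 10 °C, the use of the simple mean of daily maximum and minimum temperature, and the 1.0 mm wet-day threshold are all assumptions chosen only for illustration.

```python
def degree_days(tmax, tmin, base=10.0):
    """Sum the daily excess of mean temperature over a base temperature.

    Assumption: daily mean = (tmax + tmin) / 2; days below the base add zero.
    """
    total = 0.0
    for hi, lo in zip(tmax, tmin):
        mean = (hi + lo) / 2.0
        total += max(0.0, mean - base)
    return total


def dry_spell_lengths(rain_mm, wet_threshold=1.0):
    """Return the lengths of consecutive runs of dry days.

    Assumption: a day is dry when its rainfall is below wet_threshold (mm).
    """
    spells = []
    run = 0
    for amount in rain_mm:
        if amount < wet_threshold:
            run += 1
        else:
            if run:
                spells.append(run)
            run = 0
    if run:  # close a dry spell that reaches the end of the series
        spells.append(run)
    return spells


if __name__ == "__main__":
    # Hypothetical daily values, for illustration only.
    print(degree_days([20, 8, 15], [10, 2, 9]))
    print(dry_spell_lengths([0, 0, 5, 0, 2, 0, 0, 0]))
```

In practice the base temperature and the wet-day threshold depend on the crop and on the observing practice, so they are left as parameters rather than fixed values.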
3.6.1 Series checks
rhythms, since the arbitrary calendar periods (month, year) do not coincide with these. For example, in temperate zones, the starting point could be autumn (sowing of winter cereals) or spring (resumption of growth). In regions subject to monsoons or the seasonal movement of the intertropical convergence zone, it could be the onset of the rainy season. It could also be based on the evolution of a significant climatic factor considered to be representative of a biological cycle that is difficult to assess directly, for example, the summation of temperatures exceeding a threshold temperature necessary for growth.

3.6.2.3 Analysis of the effects of weather

The climatic elements do not act independently on the biological life cycle of living things: an analytical study of their individual effects is often illusory. Handling them all simultaneously, however, requires considerable data and complex statistical treatment. It is often better to try to combine several factors into single agroclimatic indices, considered as complex parameters, which can be compared more easily with biological data.

3.6.3 Population parameters and sample statistics

The two population characteristics $\mu$ and $\sigma$ are called parameters of the population, while each of the sample characteristics, such as sample mean $\bar{x}$ and sample standard deviation $s$, is called a sample statistic.

A sample statistic used to provide an estimate of a corresponding population parameter is called a point estimator. For example, $\bar{x}$ may be used as an estimator of $\mu$, the median may be used as an estimator of $\mu$ and $s^2$ may be used as an estimator of the population variance $\sigma^2$.

Any one of the statistics mean, median, mode and mid-interquartile range would seem to be suitable for use as an estimator of the population mean $\mu$. In order to choose the best estimator of a parameter from a set of estimators, three important desirable properties should be considered. These are unbiasedness, efficiency and consistency.

3.6.4 Frequency distributions

When dealing with a large set of measured data, it is usually necessary to arrange it into a certain number of equal groupings, or classes, and to count the number of observations that fall into each class. The number of observations falling into a given class is called the frequency for that class. The number of classes chosen depends on the number of observations. As a rough guide, the number of classes should not exceed five times the logarithm (base 10) of the number of observations. Thus, for 100 observations or more, there should be a maximum of 10 classes. It is also important that adjacent groups do not overlap. Table 3.1 serves as the basis for Table 3.2, which displays the result of this operation as a grouped frequency table.

The table has columns showing limits that define classes and another column giving lower and upper class boundaries, which in turn give rise to class widths or class intervals. Another column gives the mid-marks of the classes, and yet another column gives the totals of the tally known as the group or class frequencies.

Another column contains entries that are known as the cumulative frequencies. They are obtained from the frequency column by entering the number of observations with values less than or equal to the value of the upper class boundary of that group.

The pattern of frequencies obtained by arranging data into classes is called the frequency
Table 3.1. Climatological series of annual rainfall (mm) for Mbabane, Swaziland (1930–1979)

Year     0     1     2     3     4     5     6     7     8     9
193-  1063  1237  1495  1160  1513   912  1495  1769  1319  2080
194-  1350  1033  1707  1570  1480  1067  1635  1627  1168  1336
195-  1102  1195  1307  1118  1262  1585  1199  1306  1220  1328
196-  1411  1351  1115  1256  1226  1062  1546  1545  1049  1830
197-  1018  1690  1800  1528  1285  1727  1704  1741  1667  1260
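
The grouping procedure described in 3.6.4 can be sketched in Python using the Table 3.1 series. This is only an illustration under stated assumptions: the helper names are hypothetical, the classes are taken as equal-width intervals spanning the observed minimum to maximum, and each class is closed on the left and open on the right (with the last class closed on both sides) so that adjacent classes do not overlap. The class limits actually used in Table 3.2 may differ.

```python
import math

# Annual rainfall (mm) for Mbabane, 1930-1979, transcribed from Table 3.1.
RAINFALL_MM = [
    1063, 1237, 1495, 1160, 1513, 912, 1495, 1769, 1319, 2080,
    1350, 1033, 1707, 1570, 1480, 1067, 1635, 1627, 1168, 1336,
    1102, 1195, 1307, 1118, 1262, 1585, 1199, 1306, 1220, 1328,
    1411, 1351, 1115, 1256, 1226, 1062, 1546, 1545, 1049, 1830,
    1018, 1690, 1800, 1528, 1285, 1727, 1704, 1741, 1667, 1260,
]


def max_classes(n_obs):
    """Rough guide from the text: at most 5 * log10(n) classes (rounded down)."""
    return int(5 * math.log10(n_obs))


def grouped_frequencies(values, n_classes):
    """Build rows of (lower, upper, mid-mark, frequency, cumulative frequency).

    Assumption: equal-width classes from min(values) to max(values).
    """
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    freqs = [0] * n_classes
    for v in values:
        # The maximum value would index one past the end; put it in the last class.
        idx = min(int((v - low) / width), n_classes - 1)
        freqs[idx] += 1
    rows = []
    cumulative = 0
    for i, f in enumerate(freqs):
        cumulative += f
        lower = low + i * width
        upper = lower + width
        rows.append((lower, upper, (lower + upper) / 2, f, cumulative))
    return rows


if __name__ == "__main__":
    k = max_classes(len(RAINFALL_MM))
    for lower, upper, mid, f, cum in grouped_frequencies(RAINFALL_MM, k):
        print(f"{lower:7.1f} {upper:7.1f} {mid:7.1f} {f:3d} {cum:3d}")
```

The last cumulative frequency always equals the number of observations, which gives a quick check that no value was dropped or counted twice.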
Table 3.2. Frequency distribution of annual precipitation for Mbabane, Swaziland (1930–1979)
This fact is embodied in the central limit theorem; it states that if random samples of size n are drawn from a large population (hypothetically infinite), which has mean μ and standard deviation σ, then the theoretical sampling distribution of x̄ has mean μ and standard deviation σ/√n. The theoretical sampling distribution of x̄ can be closely approximated by the corresponding normal curve if n is large. Thus, even for quite small samples, particularly if one knows that the parent population is itself approximately normal, the theorem can be confidently applied. If one is not sure that the parent population is normal, application of the theorem should, as a rule, be restricted to samples of size ≥30. The standard deviation of a sampling distribution is often called the standard error of the sample statistic concerned. Thus σ_x̄ = σ/√n is the standard error of x̄.

A comparison among different distributions with different means and different standard deviations requires that they be transformed. One way would be to centre them about the same mean by subtracting the mean from each observation in each of the populations. This will move each of the distributions along the scale until they are centred about zero, which is the mean of all transformed distributions. Each distribution will still maintain a different bell shape, however.

3.6.4.1.2 The z-score

A further transformation is done by subtracting the mean of the distribution from each observation and dividing by the standard deviation of the distribution, a procedure known as standardization. The result is a variable Z, known as a z-score and having the standard normal form:

$$Z = \frac{X - \mu}{\sigma} \qquad (3.1)$$

This will give identical bell-shaped curves with normal distribution around zero mean and standard deviation equal to unity.

The z-scale is a horizontal scale set up for any given normal curve with some mean μ and some standard deviation σ. On this scale, the mean is marked 0 and the unit measure is taken to be σ, the particular standard deviation of the normal curve in question. A raw score X can be converted into a z-score by the above formula.

For instance, with μ = 80 and σ = 4, in order to formally convert the X-score 85 into a z-score, the following equation is used:

$$Z = \frac{X - \mu}{\sigma} = \frac{85 - 80}{4} = \frac{5}{4} = 1.25 \qquad (3.2)$$

The meaning here is that the X-score lies 1.25 standard deviations to the right of the mean. If a z-score equivalent of X = 74 is computed, one obtains:

$$Z = \frac{X - \mu}{\sigma} = \frac{74 - 80}{4} = \frac{-6}{4} = -1.5 \qquad (3.3)$$

The meaning of this negative z-score is that the original X-score of 74 lies 1.5 standard deviations (that is, six units) to the left of the mean. A z-score tells how many standard deviations removed from the mean the original x-score is, to the right (if Z is positive) or to the left (if Z is negative).

There are many different normal curves due to the different means and standard deviations. For a fixed mean μ and a fixed standard deviation σ, however, there is exactly one normal curve having that mean and that standard deviation.

Normal distributions can be used to calculate probabilities. Since a normal curve is symmetrical, having a total area of one square unit under it, the area to the right of the mean is half a square unit, and the same is true for the area to the left of the mean. The characteristics of the standard normal distribution are extremely well known, and tables of areas under specified segments of the curve are available in almost all statistical textbooks. The areas are directly expressed as probabilities. The probability of encountering a sample, by random selection from a normal population, whose measurement falls within a specified range can be found with the use of these tables. The variance of the population must, however, be known. The fundamental idea connected with the area under a normal curve is that if a measurement X is normally distributed, then the probability that X will lie in some range between a and b on any given occasion is equal to the area under the normal curve between a and b.

To find the area under a normal curve between the mean μ and some x-value, convert the x into a z-score and look that z-score up in a table of the standard normal distribution. The number tabulated is the desired area. If z turns out to be negative, just look it up as if it were positive. If the data are normally distributed, about 68 per cent of the data in the series will fall within ±1σ of the mean, that is, z = ±1. Similarly, about 95 per cent of the data fall within ±2σ of the mean, or z = ±2, and more than 99 per cent within ±3σ of the mean, or z = ±3.

3.6.4.1.3 Examples using the z-score

Suppose a population of pumpkins is known to have a normal distribution with a mean and
Here is a slightly more complicated example. If the heights of all the rice stalks in a farm are thought to be normally distributed with mean x̄ = 38 cm and standard deviation σ = 4.5 cm, find the probability that the height of a stalk taken at random will be between 35 and 40 cm. To solve this problem, one must find the area under a portion of the appropriate normal curve, between X = 35 and X = 40 (see Figure 3.1). It is necessary to convert these x-values into z-scores as follows: z = (35 – 38)/4.5 = –0.67 and z = (40 – 38)/4.5 = 0.44. The area between these two z-scores, and hence the required probability, is 0.418 6.

[Figure 3.1. Probabilities for a normal distribution with x̄ = 38 and σ = 4.5. The area under the curve between x = 35 (z = –0.67) and x = 40 (z = 0.44) is 0.418 6.]

3.6.4.2 Extreme value distributions

Certain crops may be exposed to lethal conditions (frost, excessive heat or cold, drought, high winds, and so on), even in areas where they are commonly grown. Extreme value analysis typically involves the collection and analysis of annual maxima of parameters that are observed daily, such as temperature, precipitation and wind speed. The process of extreme value analysis involves data gathering; the identification of a suitable probability model, such as the Gumbel distribution or generalized extreme value (GEV) distribution (Coles, 2001), to represent the distribution of the observed extremes; the estimation of model parameters; and the estimation of the return values for periods of fixed length.

The Gumbel double exponential distribution is the one most used for describing extreme values. An event that has occurred m times in a long series of n independent trials, one per year say, has an estimated probability p = m/n; conversely, the average interval between recurrences of the event during a
CHAPTER 3. AGRICULTURAL METEOROLOGICAL DATA, THEIR PRESENTATION AND STATISTICAL ANALYSIS 3–15
long period would be n/m; this is defined as the return period T where:

$$T = \frac{1}{p} \qquad (3.8)$$

For example, if there is a 5 per cent chance that an event will occur in any one year, its probability of occurrence is 0.05. Such an event is expected about five times in 100 years, so its return period is T = 1/0.05 = 20 years.

For a valid application of extreme value analysis, two conditions must be met. First, the data must be independent, that is, the occurrence of one extreme is not linked to the next. Second, the data series must be trend-free and the quantity of data must be large, usually not fewer than 15 values.

3.6.4.3 Probability and risk

Frequency distributions, which provide an indication of risk, are of particular interest in agriculture due to the existence of ecological thresholds which, when reached, may result either in a limited yield or in irreversible reactions within the living tissue. Histograms can be fitted to the most appropriate distribution function and used to make statements about probabilities or risk of critical climate conditions, such as freezing temperatures or dry spells of more than a specified number of days.

Cumulative frequencies are particularly suitable and convenient for operational use in agrometeorology. Cumulative distributions can be used to prepare tables or graphs showing the frequencies of occasions when the values of certain parameters exceed (or fall below) given threshold values during a selected period. If a sufficiently long series of observations (10 to 20 years) is available, it can be assumed to be representative of the total population, so that mean durations of the periods when the values exceed (or fall below) specified thresholds can be deduced.

When calculating these mean frequencies, it is often an advantage to extract information regarding the extreme values observed during the period chosen, such as the growing season, growth stage or period of particular sensitivity. Some examples are:
(a) Threshold values of daily maximum and minimum temperatures, which can be used to estimate the risk of excessive heat or frost and the duration of this risk;
(b) Threshold values of 10-day water deficits, taking into account the reserves in the soil. The quantity of water required for irrigation can then be estimated;
(c) Threshold values of relative humidity from hourly or 3-hour observations.

3.6.4.4 Distribution of sequences of consecutive days

The distribution of sequences of consecutive days in which certain climatic events occur is of special interest to the agriculturist. From such data one can, for example, deduce the likelihood of being able to undertake cultural operations requiring specific weather conditions and lasting for several days (haymaking, gathering grapes, and the like). The choice of protective measures to be taken against frost or drought may likewise be based on an examination of their occurrence and the distribution of the corresponding sequences. For whatever purpose the sequences are to be used, it is important to specify clearly the periods to which they refer (also whether or not they are for overlapping periods). Markov chain probability models have frequently been used to estimate the probability of sequences of certain consecutive days, such as wet days or dry days. Under many climate conditions, the probability, for example, that a day will be dry is significantly larger if the previous day is known to have been dry. Knowledge of the persistence of weather events such as wet days or dry days can be used to estimate the distribution of consecutive days using a Markov chain. INSTAT includes algorithms to calculate Markov chain models, to simulate spell lengths and to estimate probability using climatological data.

3.6.5 Measuring central tendency

One descriptive aspect of statistical analysis is the measurement of what is called central tendency, which gives an idea of the average or middle value about which all measurements coming from the process will cluster. To this group belong the mean, the median and the mode. Their symbols are as listed below:

x̄     arithmetic mean of a sample;
μ     population mean;
x̄w    weighted mean;
x̄h    harmonic mean.

3.6.5.1 The mean

While frequency distributions are undoubtedly useful for operational purposes, mean values of the main climatic elements (10-day, monthly or seasonal) may be used broadly to compare climatic regions. To show how the climatic elements are distributed, however, these mean values should be supplemented by other descriptive statistics, such as the standard deviation, coefficient of variation (variability), quintiles and extreme values. In
agroclimatology, series of observations that have not been made simultaneously may have to be compared. To obtain comparable means in such cases, adjustments are applied to the series so as to fill in any gaps (see Some Methods of Climatological Analysis, WMO-No. 199). Sivakumar et al. (1993) illustrate the application of INSTAT in calculating descriptive statistics for climate data and discuss the usefulness of the statistics for assessing agricultural potential. They produce tables of monthly mean, standard deviation, and maximum and minimum for rainfall amounts and for the number of rainy days for available stations. Descriptive statistics are also presented for the maximum and minimum air temperatures.

The arithmetic mean is the most commonly used measure of central tendency, defined as:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad i = 1, 2, \ldots, n \qquad (3.9)$$

This consists of adding all data in a series and dividing their sum by the number of data. The mean of the annual precipitation series from Table 3.1 is:

$$\bar{X} = \frac{\sum x_i}{n} = \frac{69\,449}{50} \approx 1\,389.0 \qquad (3.10)$$

The arithmetic mean may be computed using other

$$\bar{X}_w = \frac{\sum_{i=1}^{k} n_i x_i}{\sum_{i=1}^{k} n_i} \qquad (3.11)$$

For example, the average yield of maize for the five districts in the Ruvuma Region of Tanzania was 1.5, 2.0, 1.8, 1.3 and 1.9 tonnes per hectare (t/ha), respectively. The respective areas under maize were 3 000, 7 000, 2 000, 5 000 and 4 000 ha. If the values n1 = 3 000, n2 = 7 000, n3 = 2 000, n4 = 5 000 and n5 = 4 000 are substituted into equation (3.11), the overall mean yield of maize for these 21 000 ha of land is as follows:

$$\bar{X}_w = \frac{3\,000(1.5) + 7\,000(2.0) + 2\,000(1.8) + 5\,000(1.3) + 4\,000(1.9)}{3\,000 + 7\,000 + 2\,000 + 5\,000 + 4\,000} = \frac{36\,200}{21\,000} \approx 1.72\ \text{t/ha} \qquad (3.12)$$

In operational agrometeorology, the mean is normally computed for 10 days, known as dekads, as well as for the day, month, year and longer periods. This is used in agrometeorological bulletins and for describing current weather conditions. At agrometeorological stations where the maximum and minimum temperatures are read, a useful approximation of the daily mean temperature is given by taking the average of these two temperatures. These averages should be used with caution when comparing data from different stations, as such averages may differ systematically from each other.

Another measure of the mean is the harmonic mean, which is defined as n divided by the sum of the reciprocals or multiplicative inverses of the numbers:

$$\bar{X}_h = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}} \qquad (3.13)$$

The mode is the most frequent value in any array. Some series have even more than one modal value. Mean annual rainfall patterns in some sub-equatorial countries have bimodal distributions, meaning they exhibit two peaks. Unlike the mean, the mode is an actual value in the series. Its use is mainly in describing the average.
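
The area-weighted mean yield for the five Ruvuma districts can be checked with a few lines of Python. This is only a sketch: the function name `weighted_mean` and the variable names are illustrative, and the yields and areas are those quoted in the maize example.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# District yields (t/ha) and areas under maize (ha) from the example.
yields_t_per_ha = [1.5, 2.0, 1.8, 1.3, 1.9]
areas_ha = [3000, 7000, 2000, 5000, 4000]

total_production_t = sum(a * y for y, a in zip(yields_t_per_ha, areas_ha))
total_area_ha = sum(areas_ha)
overall_yield = weighted_mean(yields_t_per_ha, areas_ha)
simple_mean = sum(yields_t_per_ha) / len(yields_t_per_ha)

print(f"total production = {total_production_t:.0f} t over {total_area_ha} ha")
print(f"weighted mean = {overall_yield:.2f} t/ha, unweighted mean = {simple_mean:.2f} t/ha")
```

The weighted and unweighted means differ because the largest district also has one of the highest yields; weighting by area gives each hectare, rather than each district, equal influence.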
This is the difference between the largest and the smallest values. For instance, the annual range of mean temperature is the difference between the mean daily temperatures of the hottest and coldest months.

3.6.7.2 The variance and the standard deviation

The variance is the mean of the squares of the deviations from the arithmetic mean. The standard deviation s is the square root of the variance and is defined as the root-mean-square of the deviations from the arithmetic mean. To obtain the standard deviation of a given sample, the mean x̄ is computed first and then the deviations from the mean (x_i – x̄):

$$S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \qquad (3.15)$$

Statistical inference is a process of inferring information about a population from the data of samples drawn from it. The purpose of statistical inference is to help a decision-maker to be right more often than not, or at least to give some idea of how much danger there is of being wrong when a particular decision is made. It is also meant to ensure that long-term costs through wrong decisions are kept to a minimum.

Two main lines of attacking the problem of statistical inference are available. One is to devise sample statistics that may be regarded as suitable estimators of corresponding population parameters. For example, the sample mean X̄ may be used as an estimator of the population mean μ, or else the sample median may be used. Statistical estimation theory deals with the issue of selecting best estimators. The steps to be taken to arrive at a decision are as follows:
Step 1. Formulate the null and alternative hypotheses

Once the null hypothesis has been clearly defined, one can calculate what kind of samples to expect under the supposition that it is true. Then if a random sample is drawn, and if it differs markedly in some respect from what is expected, the observed difference is said to be significant and one is inclined to reject the null hypothesis and accept the alternative hypothesis. If the difference observed is not too large, one might accept the null hypothesis or call for more statistical data before coming to a decision. One can make the decision in a hypothesis test depending upon a random variable known as a test statistic, such as the z-score used in finding confidence intervals, and critical values of this test statistic can be specified that can be used to indicate not only whether a sample difference is significant, but also the strength of the significance.

For instance, in a coin experiment to determine if the coin is fair or loaded:

Null H0: p = 0.5 (namely, the coin is fair).
And alternative H1: p ≠ 0.5 (namely, the coin is biased).

(Or equivalently H1: p < 0.5 or p > 0.5; this is called a two-sided alternative).

Step 2. Choose an appropriate level of significance

The probability of wrongly rejecting a null hypothesis is called the level of significance (α) of the test. The value for α is selected first, before any experiments are carried out; the values most commonly used by statisticians are 0.05, 0.01 and 0.001. The level of significance α = 0.05 means that the test procedure has only 5 chances in 100 of leading one to decide that the coin is biased if in fact it is not.

Step 3. Choose the sample size n

It is fairly clear that if bias exists, a large sample will have more chance of demonstrating its existence than a small one. So one should make n as large as possible, especially if one is concerned with demonstrating a small amount of bias. Cost of experimentation, time involved in sampling, necessity of maintaining statistically constant conditions, amount of inherent random variation and possible consequences of making wrong decisions are among the considerations on which the sample sizes depend.

Step 4. Decide upon the test statistic to be used

The decision in a hypothesis test can be made depending upon a random variable known as a test statistic, such as z or t, as used in finding confidence intervals. Its sampling distribution, under the assumption that H0 is true, must be known. It can be normal, binomial or another type of sampling distribution.

Step 5. Calculate the acceptance and rejection regions

Assuming that the null hypothesis is true, and bearing in mind the chosen values of n and α, an acceptance region of values for the test statistic is now calculated. Values outside this region form the rejection region. The acceptance region is so chosen that if a value of the test statistic, obtained from the data of a sample, fails to fall inside it, then the assumption that H0 is true must be strongly doubted. In general, there is a test statistic X, whose sampling distribution, defined by certain parameters such as μ and σ, is known. The values of the parameters are specified in the null hypothesis H0. From integral tables of the sampling distribution, critical values X1, X2 are obtained such that

$$P[X_1 < X < X_2] = 1 - \alpha \qquad (3.18)$$

These determine an acceptance region, which gives a test for the null hypothesis at the appropriate level of significance (α).

Step 6. Formulate the decision rule

The general decision rule, or test of hypothesis, may now be stated as follows:

(a) Reject H0 at the α significance level if the sample value of X lies in the rejection region (that is, outside [X1, X2]). This is equivalent to saying that the observed sample value is significant at the 100α % level. The alternative hypothesis H1 is then to be accepted.
(b) Accept H0 if the sample value of X lies in the acceptance region [X1, X2]. (Sometimes, especially if the sample size is small, or if X is close to one of the critical values X1 and X2, the decision to accept H0 is deferred until more data are collected.)

Step 7. Carry out the experiment and make the test

The n trials of the experiment may now be carried out, and from the results, the value of the chosen
test statistic may be calculated. The decision rule described in Step 6 may then be applied. Note: All statistical test procedures should be carefully formulated before experiments are carried out. The test statistic, the level of significance, and whether a one- or two-tailed test is required, must be decided before any sample data are looked at. Switching tests in midstream, as it were, leads to invalid probability statements about the decisions made.

3.7.2 Two-tailed and one-tailed tests

The determination of whether one uses a two-tailed or a one-tailed test depends on how the hypothesis is characterized. If H1 is defined as μ ≠ 0, the critical region occupies both extremes of the test distribution. This is a two-tailed test, where the values could be on either side of μ. If H1 is defined as μ > 0 or μ < 0, the critical region occurs only at high or low values of the test statistic. This is known as a one-tailed test.

With a two-tailed test, the critical region containing 5 per cent of the area of the normal distribution is split into two equal parts, each containing 2.5 per cent of the total area. If the computed value of Z falls into the left-hand region, the sample is taken to have come from a population having a smaller mean than the known population. Conversely, if it falls into the right-hand region, the mean of the sample’s parent population is taken to be larger than the mean of the known population. From the standardized normal distribution table found in most statistical textbooks, one can find that approximately 2.5 per cent of the area of the curve is to the left of a Z value of –1.96 and 97.5 per cent of the area of the curve is to the left of +1.96. An example of a normal table can be accessed from http://www.isixsigma.com/library/content/zdistribution.asp.

Once the null hypothesis has been clearly defined, one can calculate what kind of samples to expect under the supposition that it is true. Then, if a random sample is drawn, and if it differs markedly in some respect from what is expected, one can say that the observed difference is significant, and one is inclined to reject the null hypothesis and accept the alternative hypothesis. If the difference observed is not too large, one might accept the null hypothesis, or one might call for more statistical data before coming to a decision. The decision in a hypothesis test can be made depending upon a random variable known as a test statistic, such as z or t, as used in finding confidence intervals, and critical values of this test statistic can be specified, which can be used to indicate not only whether a sample difference is significant but also the strength of the significance.

3.7.3 Interval estimation

Confidence interval estimation is a technique of calculating intervals for population parameters and measures of confidence placed upon them. If one has chosen an unbiased sample statistic b as the point estimator of β, the estimator will have a sampling distribution with mean E(b) = β and standard deviation S.D.(b) = σ_b. Here the parameter β is the unknown and the purpose is to estimate it. Based on the remarkable fact that many sample statistics used in practice have a normal or approximately normal sampling distribution, from the tables of the normal integral one can obtain the probability that a particular sample will provide a value of b within a given interval (β – d) to (β + d).

This is indicated in the diagram below. Conversely, for a given amount of probability, one can deduce the value d. For example, for 0.95 probability, one knows from standard normal tables that d/σ_b = 1.96. In other words, the probability that a sample will provide a value of b in the interval [β – 1.96σ_b, β + 1.96σ_b] is 0.95. This is written as P[β – 1.96σ_b ≤ b ≤ β + 1.96σ_b] = 0.95. After rearranging the inequalities inside the brackets to the equivalent form [b – 1.96σ_b ≤ β ≤ b + 1.96σ_b], one obtains the 95 per cent confidence interval for β, namely the interval [b – 1.96σ_b, b + 1.96σ_b]. In general, confidence intervals are expressed in the form [b – z·σ_b, b + z·σ_b], where z, the z-score, is the number obtained from tables of the sampling distribution of b. This z-score is chosen so that the desired percentage confidence may be assigned to the interval; it is now called the confidence coefficient, or sometimes the critical value. The endpoints of a confidence interval are known as the lower and upper confidence limits. The probable error of estimate is half the interval length of the 50 per cent confidence interval, namely, 0.674σ_b. Table 3.3 is an abbreviated table of confidence values for z.

The most commonly required point and interval estimates are for means, proportions, differences between two means, and standard deviations. Table 3.4 gives all the formulae needed for these estimates. The reader should note the standard form of b ± z·σ_b for each of the confidence interval estimators.

For the formulae to be valid, sampling must be random and the samples must be independent. In
some cases, σ_b will be known from prior information. Then the sample estimator will not be used. In each of the confidence interval formulae, the confidence coefficient z may be found from tables of the normal integral for any desired degree of confidence. This will give exact results if the population from which the sampling is done is normal; otherwise, the errors introduced will be small if n is reasonably large (n ≥ 30).

What should one do when samples are small? It is clear that the smaller the sample, the smaller the amount of confidence one can place on a particular interval estimate. Alternatively, for a given degree of confidence, the interval quoted must be wider than for larger samples. To bring this about, one must have a confidence coefficient that depends upon n. The letter t shall be used for this coefficient and confidence interval formulae shall be provided for the population mean and for the difference of two population means.

The reader will note that these are the same as for large samples, except that t replaces z. When the sample estimators for σ_x̄ and σ_(x̄1 – x̄2) are used, the correct values for t are obtained from what is called the Student t-distribution. For convenience, they are related not directly to sample sizes, but to a number known as “degrees of freedom”; this shall be denoted by ν. An abbreviated table of t-values is given in Table 3.5.

Table 3.4. Formulae for confidence interval estimates

                            Confidence interval             Degrees of freedom (ν)
1. Mean μ:                  x̄ ± t·σ̂(x̄)                      n – 1
2. Difference μ1 – μ2:      (x̄1 – x̄2) ± t·σ̂(x̄1 – x̄2)        n1 + n2 – 2

The observations in the sample were selected randomly from a normal population whose variance is known.

3.7.5 Tests for normal population means

A random sample of size n is drawn from a normal population having unknown mean μ and known standard deviation σ. The objective is to test the hypothesis H0: μ = μ′, that is, the assumption that the population mean has value μ′.

The variate

$$Z = \frac{\bar{X} - \mu'}{\sigma / \sqrt{n}}$$

has a standard normal distribution if H0 is true. Z (or X̄) may be used as the test statistic.

Example 1

Suppose that the shelf life of one-litre bottles of pasteurized milk is guaranteed to be at least 400 days, with a standard deviation of 60 days. If a sample of 25 bottles is randomly chosen from a production batch, and a sample mean shelf life of 375 days is calculated after testing has been performed, should the batch be rejected as not meeting the guarantee?

Solution: Let μ be the batch mean.

Step 1. Null hypothesis H0: μ = 400.

Alternative hypothesis H1: μ < 400 (one-sided: one is only interested in whether or not the mean is up to the guaranteed minimum value).

Steps 2 and 3. n = 25 (given); choose α = 0.05.

Step 4. If X̄ is the sample mean, the quantity
Table 3.3. Abbreviated table of confidence values for z

Confidence level            50%    60%   80%   86.6%  90%    92%   93.4%  94.2%  95%   95.6%  96%   97.4%  98%
Confidence coefficient z    0.674  0.84  1.28  1.50   1.645  1.75  1.84   1.90   1.96  2.01   2.05  2.23   2.33
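
The confidence coefficients tabulated for z can be reproduced from the standard normal distribution instead of being read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `confidence_coefficient` and `confidence_interval` are illustrative, not part of any library, and the interval assumes the b ± z·σ_b form with a known σ_b. The numbers passed to `confidence_interval` are made up purely for demonstration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def confidence_coefficient(level):
    """Two-sided z for a confidence level given as a fraction, e.g. 0.95."""
    if not 0 < level < 1:
        raise ValueError("level must lie strictly between 0 and 1")
    # A central area of `level` leaves (1 - level) / 2 in each tail.
    return STANDARD_NORMAL.inv_cdf((1 + level) / 2)

def confidence_interval(b, sigma_b, level=0.95):
    """Interval [b - z*sigma_b, b + z*sigma_b] for a normally distributed estimator."""
    z = confidence_coefficient(level)
    return b - z * sigma_b, b + z * sigma_b

for level in (0.50, 0.80, 0.90, 0.95, 0.98):
    print(f"{level:.0%} confidence -> z = {confidence_coefficient(level):.3f}")

# Hypothetical estimate b = 100 with standard error 5 (illustrative values only).
lower, upper = confidence_interval(100.0, 5.0, level=0.95)
print(f"95% interval: [{lower:.1f}, {upper:.1f}]")
```

For small samples the same structure applies with t in place of z, but the standard library has no Student t quantile function, so a statistical package or a t-table would be needed for that case.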
(a) Reject the production batch if the value of Z calculated from the sample is less than –1.65.
(b) Accept the batch otherwise.

Step 7. Carry out the test:

From the sample data one finds that

$$Z = \frac{375 - 400}{60 / \sqrt{25}} = -2.083 \qquad (3.21)$$

Decision: the production batch must be rejected, since –2.083 < –1.65. It is highly unlikely that the mean shelf life of milk bottles in the batch will be 400 days or more. The chance that this decision is wrong is smaller than 5 per cent.

Example 2

A sample of 66 seeds of a certain plant variety were planted on a plot using a randomized block design. Before planting, 30 of the seeds were subjected to a certain heat treatment. The times from planting to germination were observed. The 30 treated seeds took 52 days to germinate, while the 36 untreated seeds took 47 days. If the common standard deviation for time to germination applicable to individual seeds, calculated from several thousand seeds, may be taken as 12 days, can it be said that the heat treatment significantly speeds up a seed’s germination rate?

From the data given, it is clear that the heat-treated seeds had an earlier start in growth. One may consider, however, the wider question as to whether heat-treated seeds are significantly faster germinating generally than untreated seeds.

The test is as follows:

Step 1. Let μA, μB be the germination period population means for heat-treated and untreated seeds, respectively.

Null hypothesis H0: μA = μB.

Alternative hypothesis H1: μA > μB.

The specific question was whether the heat-treated seeds were faster germinating than the untreated seeds, so the one-sided alternative hypothesis is used.

Steps 2 and 3. nA = 30 and nB = 36 (given).

The significance level shall be α = 0.05.

Step 4. Test statistic:

No information other than the two sample means is given. Even if the individual seeds’ results were known, the paired comparison test could not be used; there would be no possible reason for linking the results in pairs.

The difference in means (x̄A – x̄B) is approximately normally distributed, with mean (μA – μB) and standard deviation

$$\sigma' = \sqrt{\frac{\sigma^2}{n_A} + \frac{\sigma^2}{n_B}} \qquad (3.22)$$

So one may use as a test statistic the standard normal variate

$$z = \frac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)}{\sigma'} \qquad (3.23)$$

And if H0 is true, μA – μB = 0; so the test statistic reduces to

$$z = \frac{\bar{x}_A - \bar{x}_B}{\sigma'} \qquad (3.24)$$

Step 5. Acceptance region:

The critical value of z at 5 per cent level of significance, for a one-tailed test, is 1.65. Therefore, the acceptance region for the null hypothesis is the set of values z less than or equal to 1.65.
Table 3.5. Abbreviated table of t-values

Confidence    Degrees of freedom ν
level            3     4     5     7     9    10    15    20    25    30
90%           2.35  2.13  2.02  1.89  1.83  1.81  1.75  1.72  1.71  1.70
95%           3.18  2.78  2.57  2.36  2.26  2.23  2.13  2.09  2.06  2.04
99%           5.84  4.60  4.03  3.50  3.25  3.17  2.95  2.85  2.79  2.75
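
The z statistics used in Examples 1 and 2 of this section can be evaluated with a short Python sketch. The function names `one_sample_z` and `two_sample_z` are illustrative; both assume a known population standard deviation, as the examples do, and the one-tailed critical value is taken from the standard normal distribution (about 1.645, which the text rounds to 1.65).

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def two_sample_z(mean_a, mean_b, sigma, n_a, n_b):
    """z for a difference of two means with a common known sigma, under H0: mu_A = mu_B."""
    sigma_prime = sigma * sqrt(1 / n_a + 1 / n_b)  # standard deviation of the difference
    return (mean_a - mean_b) / sigma_prime

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value

# Example 1: milk shelf life, H1: mu < 400, so reject H0 when z < -z_crit.
z_milk = one_sample_z(sample_mean=375, mu0=400, sigma=60, n=25)
reject_milk = z_milk < -z_crit

# Example 2: seed germination, H1: mu_A > mu_B, so reject H0 when z > z_crit.
z_seed = two_sample_z(mean_a=52, mean_b=47, sigma=12, n_a=30, n_b=36)
reject_seed = z_seed > z_crit

print(f"critical z = {z_crit:.3f}")
print(f"Example 1: z = {z_milk:.3f}, reject H0: {reject_milk}")
print(f"Example 2: z = {z_seed:.2f}, reject H0: {reject_seed}")
```

Keeping the direction of the alternative hypothesis explicit in the comparison (`<` for a lower-tail test, `>` for an upper-tail test) mirrors the requirement that the choice of a one- or two-tailed test be fixed before the data are examined.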
Step 6. Decision rule:

(a) If the sample value of z > 1.65, conclude that heat-treated seeds germinate significantly earlier (at the 5 per cent level) than untreated seeds.
(b) If z ≤ 1.65, the germination rates of both heat-treated and untreated seeds may well be the same.

Therefore

$$\sigma' = \sigma\sqrt{\frac{1}{n_A} + \frac{1}{n_B}} = 12\sqrt{\frac{1}{30} + \frac{1}{36}} \cong 2.966 \qquad (3.25)$$

And so the sample value of the test statistic is

$$z = \frac{\bar{x}_A - \bar{x}_B}{\sigma'} = \frac{52 - 47}{2.966} \cong 1.69 \qquad (3.26)$$

Decision: The heat-treated seed is just significantly earlier germinating at the 5 per cent level than the untreated seed.

3.7.6 The t-test

The uncertainty introduced into estimates based on samples can be accounted for by using a probability distribution that has a wider spread than the normal distribution. One such distribution is the t-distribution, which is similar to the normal distribution, but dependent on the size of sample taken. When the number of observations in the sample is infinite, the t-distribution and the normal distribution are identical. Tables of the t-distribution and other sample-based distributions are used in exactly the same manner as tables of the cumulative standard normal distribution, except that two entries are necessary to find a probability in the table. The two entries are the desired level of significance (α) and the degrees of freedom (ν), defined as the number of observations in the sample minus the number of parameters estimated from the sample.

Then for the test statistic one uses

$$t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$$

which has a Student’s t-distribution with n – 1 degrees of freedom.

Example using the z-test

A farmer was found to be selling pumpkins that looked like ordinary pumpkins except that these were very large, with an average diameter of 30.0 cm for 10 samples. The mean and standard deviation for pumpkins are 14.2 cm and 4.7 cm, respectively. The intent is to test whether the pumpkins that the farmer is selling are ordinary pumpkins.

One can hypothesize that the mean of the population from which the farmer’s pumpkins were taken is the same as the mean of the ordinary pumpkins by the null hypothesis H0: μ1 = μ0. The alternative hypothesis is H1: μ1 ≠ μ0, stating that the mean of the population from which the sample was drawn does not equal the specified population mean. If the two parent populations are not the same, one must conclude that the pumpkins that the farmer was selling were not drawn from the ordinary pumpkin population, but from the population of some other genus. One needs to specify levels of probability of correctness, or level of significance, denoted by α. A probability level of 5 per cent may be applied; this means a willingness to risk rejecting the hypothesis when it is correct 5 times out of 100 trials. One must have the variance of the population against which one is checking. A formal statistical test may now be set up in the following manner:

1. The hypothesis and alternative:

$$H_0: \mu_1 = \mu_0 \qquad (3.28)$$

$$H_1: \mu_1 \neq \mu_0 \qquad (3.29)$$

2. The level of significance:

$$\alpha = 0.05 \qquad (3.30)$$

3. The test statistic:

$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \qquad (3.31)$$

The test statistic, Z, has a frequency distribution that is a standardized normal distribution, provided that the observations in the sample were selected randomly from a normal population whose variance is known. If that has been specified, one is willing to reject the hypothesis of the equality of means when they actually are equal one time out of twenty: that is, one will accept a 5 per cent risk of being wrong. On the standardized normal distribution curve, therefore, the extreme regions that contain 5 per cent of the area of the curve need to be determined. This part of the probability curve is called the area of rejection or the critical region. If the computed value of the test statistic falls into this area, the null hypothesis will be
rejected. The hypothesis will be rejected if the test statistic is either too large or too small. The critical region, therefore, occupies the extremities of the probability distribution and each subregion contains 2.5 per cent of the total area of the curve.

3.7.7 Estimators using pooled samples

Let two random samples of sizes $n_1$, $n_2$, respectively, be drawn from a large population that has mean $\mu$ and variance $\sigma^2$. Suppose that the samples yield unbiased estimates $\bar{x}_1$ and $\bar{x}_2$ of $\mu$, and $s_1^2$, $s_2^2$ of $\sigma^2$. The problem arises of combining these pairs of estimates to obtain single unbiased estimates of $\mu$ and $\sigma^2$. The process of combining estimates from two or more samples is known as pooling. The correct ways to pool unbiased estimates of means and variances, to yield single unbiased estimates, are

Means:

$$\hat{\mu} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \tag{3.32}$$

Variances:

$$\hat{\sigma}^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \tag{3.33}$$

Example:

A soil scientist made six determinations of the strength of dilute sulphuric acid. His results showed a mean strength of 9.234 with a standard deviation of 0.12. Using acid from another bottle, he made eleven determinations, which showed a mean strength of 8.86 with a standard deviation of 0.21. Obtain 95 per cent confidence limits for the difference in mean strengths of the acids in the two bottles. Could the bottles have been filled from the same source?

Working:

The difference in mean strengths of the acids is estimated by

$$\bar{x}_1 - \bar{x}_2 = 9.234 - 8.86 = 0.374 \tag{3.34}$$

The standard deviation of the sampling distribution of $\bar{x}_1 - \bar{x}_2$, written as $\sigma_{\bar{x}_1 - \bar{x}_2}$, is estimated first. The data from the two samples are pooled, thus:

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = s\sqrt{\frac{1}{6} + \frac{1}{11}} \tag{3.35}$$

$$(\bar{x}_1 - \bar{x}_2) \pm t\,\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = 0.374 \pm 2.13 \times 0.141196 = 0.374 \pm 0.300748 \tag{3.39}$$

Thus, the 95 per cent confidence limits for the difference in mean strengths of the acids in the two bottles are 0.0733 and 0.6747; that is, one is 95 per cent confident that the difference lies between these two values. Because this interval does not include zero, it is unlikely that the two bottles were filled from the same source.

3.7.8 The paired comparison test and the difference between two means test

Example: Paired comparison test

The yields from two varieties of wheat were compared. The wheat was planted on 25 test plots. Each plot was divided into two equal parts; one part was chosen randomly and planted with the first variety and the other part was planted with the second variety of wheat. This process was repeated for all 25 plots. When the crop yields were measured, the difference in yields from each plot was recorded (second variety minus first variety). The sample mean plot yield difference was found to be 3.5 t/ha, and the variance of these differences was calculated to be 16 (t/ha)².
(a) Does the second variety produce significantly higher yields than the first variety?
(b) Test the hypothesis that the population mean plot yield difference is as high as 5 t/ha.
(c) Obtain 95 per cent confidence limits for the population mean plot yield difference.

It is clear that there is a good deal of variation in yields from plot to plot. This variation tends to
confound the main issue, which is to determine whether yields are increased by using a second variety. This confusion has been avoided by considering only the change in yields for each plot. If the second variety has no effect, the average change will be zero.

Data of this kind, in which results are combined in pairs and each pair arises from one experimental unit or has some clear reason for being linked in this way, are analysed by the paired comparison test. Each pair provides a single comparison as a measure of the effect of the treatment applied (for example, growing a different variety). Let D denote the difference in a given pair of results. D will have a normal distribution with mean $\mu$ and standard deviation $\sigma$ (both parameters are unknown in this case).

Example using the t-test

Step 1.

Null hypothesis $H_0$: $\mu$ = 0 (namely, the yield of the two wheat varieties is the same).

Alternative hypothesis $H_1$: $\mu$ > 0 (that is, the second variety yields are higher than the first variety yields).

$H_1$ is one sided; a one-tailed test must be applied.

Steps 2 and 3. Twenty-five plots were used, which means that n = 25. A significance level of $\alpha$ = 0.05 shall be used.

Step 4. The quantity

$$z = \frac{\bar{D} - 0}{\sigma/\sqrt{25}}$$

is a standard normal variate and may be used as the test statistic if $\sigma$ is known from previous experimentation.

The parameters of a population are rarely known. In this case, $\sigma$ is not given, so it must be estimated from the sample data.

Step 5. Acceptance region: the critical level of t at the 0.05 level of significance (one-tailed test) is the same as the upper 90 per cent confidence coefficient, as provided in Table 3.5. With 24 degrees of freedom, this value is 1.71. The acceptance region is therefore all values of t from –infinity to 1.71.

second wheat variety were due to chance variation in the experiment.

Step 7. Carry out the test:

From the sample data, with S = √16 = 4 and n = 25,

$$t = \frac{\bar{D} - 0}{S/\sqrt{n}} = \frac{3.5 - 0}{4/\sqrt{25}} = 4.375 \tag{3.40}$$

Decision: since 4.375 > 1.71, one can conclude at the 5 per cent level that the second variety produces significantly higher yields than the first variety.

3.7.9 Difference between two means

A sampling result, which is frequently used in inference tests, is one concerning the distribution of the difference in means of independent samples drawn from two different populations. Let a random sample of size $n_1$ be drawn from a population having mean $\mu_X$ and standard deviation $\sigma_X$; and let an independent sample of size $n_2$ be drawn from another population having mean $\mu_Y$ and standard deviation $\sigma_Y$. Consider the random variable $D = \bar{X} - \bar{Y}$; that is, the difference in means of the two samples. The theorem states that D has a sampling distribution with mean $\mu_D = \mu_X - \mu_Y$ and variance

$$\mathrm{Var}(D) = \frac{\sigma_X^2}{n_1} + \frac{\sigma_Y^2}{n_2} \tag{3.41}$$

3.7.10 The F-Test

It seems reasonable that the sample variances will range more from trial to trial if the number of observations used in their calculation is small. Therefore, the shape of the F-distribution would be expected to change with changes in sample size. The degrees of freedom idea comes to mind, except in this situation the F-distribution is dependent on two values of $\gamma$, one associated with each variance in the ratio. Since the F-ratio is the ratio of two positive numbers, the F-distribution cannot be negative. If the samples are large, the average of the ratios should be close to 1.0.
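The remark that sample variances range more from trial to trial when few observations are used can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: pairs of samples are drawn from a single normal population (mean 0 and standard deviation 1, an arbitrary choice), the ratio of their sample variances is formed, and the spread of those ratios is compared for small and large samples. The function name `variance_ratios`, the sample sizes, the number of trials and the seed are hypothetical choices, not values taken from this Guide.

```python
import random
import statistics

def variance_ratios(n, trials, rng):
    """Return `trials` ratios S1^2 / S2^2 for pairs of samples of size n
    drawn from the same normal population (mean 0, standard deviation 1)."""
    ratios = []
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # statistics.variance uses the n - 1 divisor (sample variance).
        ratios.append(statistics.variance(a) / statistics.variance(b))
    return ratios

rng = random.Random(225)  # fixed seed so that the run is repeatable
small = variance_ratios(5, 2000, rng)     # few observations per sample
large = variance_ratios(100, 2000, rng)   # many observations per sample

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(f"n = 5:   spread of the ratios = {spread_small:.2f}")
print(f"n = 100: spread of the ratios = {spread_large:.2f}, "
      f"mean ratio = {statistics.mean(large):.2f}")
```

Every ratio is positive, the ratios from the small samples scatter far more widely than those from the large samples, and the mean ratio for the large samples lies close to 1.0, in line with the description of the F-distribution.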
unlikely that such a ratio could be obtained, this can be seen to indicate that the samples come from different populations having different variances.

For any pair of variances, two ratios can be computed ($S_1^2/S_2^2$ and $S_2^2/S_1^2$). If one arbitrarily decides that the larger variance will always be placed in the numerator, the ratio will always be greater than 1.0 and the statistical tests can be simplified. Only one-tailed tests need to be utilized, and the alternative hypothesis actually is a statement that the absolute difference between the two sample variances is greater than expected if the population variances are equal. This is shown in Figure 3.2, a typical F-distribution curve in which the critical region or area of rejection has been shaded.

[Figure 3.2. A typical F-distribution curve; the shaded critical region contains 5% of the area]

$$F = \frac{S_1^2}{S_2^2} \tag{3.42}$$

where $S_1^2$ is the larger variance and $S_2^2$ is the smaller. Now the hypothesis

$$H_0: \sigma_1^2 = \sigma_2^2 \tag{3.43}$$

is tested against

$$H_1: \sigma_1^2 \neq \sigma_2^2 \tag{3.44}$$

The null hypothesis states that the parent populations of the two samples have equal variances: the alternative hypothesis states that they do not. Degrees of freedom associated with this test are ($n_1$ – 1) for $\gamma_1$ and ($n_2$ – 1) for $\gamma_2$. The critical value of F with $\gamma_1$ = 9 and $\gamma_2$ = 9 degrees of freedom and a level of significance of 5 per cent ($\alpha$ = 0.05) is 3.18.

The value of F calculated from (3.42) will fall into one of the two areas shown in Figure 3.2. If the calculated value of F exceeds 3.18, the null hypothesis is rejected and one concludes that the variation in porosity is not the same in the two groups. If the calculated value is less than 3.18, there would be no evidence for concluding that the variances are different (determine at 0.05 if variances are the same).

In most practical situations, one ordinarily has no knowledge of the parameters of the population, except for estimates made from samples. In comparing two samples, it is appropriate to first determine if their variances are statistically equivalent. If they appear to be equal and the samples have been selected without bias from a naturally occurring population, it is probably safe to proceed to additional statistical tests.

3.7.11.1 Correlation methods

Correlation methods are used to discover objectively and quantitatively the relationship that may exist between several variables. The correlation coefficient determines the extent to which values of two variables are linearly related; that is, the correlation is high if it can be approximated by a straight line (sloped upwards or downwards). This line is called the regression line. Correlation analysis is especially valuable in agrometeorology because of the many factors that may be involved, simultaneously or successively, during the development of a crop and
Care should be exercised in interpreting these correlations. Graphs and scatter plots should be used to give much more information about the nature of the relationship between variables. The discovery of a significant correlation coefficient should encourage the agrometeorologist, in most cases, to seek a physical or biological explanation for the relationship and not just be content with the statistical result.

b is the slope of the regression.

The least squares criterion requires that the line be chosen to fit the data so that the sum of the squares of the vertical deviations separating the points from the line will be a minimum.

The recommended formulae for estimating the two sample coefficients for least squares are:
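As a minimal sketch, and assuming that the usual textbook least-squares expressions are the ones intended (slope b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and intercept a = ȳ – b·x̄), the fit can be computed as follows. The data are the monthly sunshine normals of Table 3.6 (X = n/N, Y = R/RA); the function name `least_squares` and the variable names are illustrative only.

```python
import math

# Monthly sunshine normals from Table 3.6 (Lyamungu): X = n/N, Y = R/RA.
X = [0.660, 0.647, 0.536, 0.366, 0.251, 0.319,
     0.310, 0.409, 0.448, 0.542, 0.514, 0.602]
Y = [0.620, 0.578, 0.504, 0.395, 0.368, 0.399,
     0.395, 0.442, 0.515, 0.537, 0.503, 0.582]

def least_squares(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                    # slope of the regression line
    a = mean_y - b * mean_x          # intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r

a, b, r = least_squares(X, Y)
print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.3f}, r^2 = {r * r:.2f}")
```

Rounded to three decimals, the computed coefficients agree with the summary values quoted with Table 3.6 (b = 0.603, a = 0.205, r = 0.973).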
Table 3.6. Sunshine normals from Lyamungu, United Republic of Tanzania

Month   n/N (X)   R/RA (Y)
Jan     0.660     0.620
Feb     0.647     0.578
Mar     0.536     0.504
Apr     0.366     0.395
May     0.251     0.368
Jun     0.319     0.399
Jul     0.310     0.395
Aug     0.409     0.442
Sept    0.448     0.515
Oct     0.542     0.537
Nov     0.514     0.503
Dec     0.602     0.582

$N = 12$, $\bar{X} = 0.467$, $s_x = 0.132$, $\bar{Y} = 0.487$, $s_y = 0.081$, $b = 0.603$, $a = 0.205$, $r = 0.973$

The regression explains r² = 95 per cent of the variance of R/RA and is significant at the p = 0.01 level.

There are cases where a scatter diagram suggests that the relationship between variables is not linear. This can be turned into a linear regression by taking the logarithms of the relationship if it is exponential, or by turning it into a reciprocal if it is square, and so forth. For example, when the saturation vapour pressure is plotted against temperature, the curve suggests that a function like $y = p\,e^{bX}$ could probably be used to describe the function. This is turned into a linear regression $\ln(y) = \ln(p) + bX$, where X is the temperature function and y is the saturation vapour pressure. An expression of the form $y = aX^2$ can be turned into a linear form by taking the reciprocal $\frac{1}{y} = \frac{X^{-2}}{a}$.

3.7.11.3 Multiple regressions

The general purpose of multiple regression is to learn more about the relationship between several independent or predictor variables and a dependent variable. A linear combination of predictor factors is used to predict the outcomes or response factor. For example, multiple regression has been successfully used to estimate crop yield as a function of weather, or to estimate soil temperatures as a function of air temperature, soil characteristics and soil cover. It has been used to perform a trend analysis of agrometeorological parameters using a polynomial expansion of time. The general linear model is a generalization of the linear regression model, such that effects can be tested for categorical predictor variables, as well as effects for continuous predictor variables. An objective in performing multiple regression analysis is to specify a parsimonious model whose factors contribute significantly to variation in response. Statistical software such as INSTAT provides tools to select independent factors for a regression model. These programs include forward stepwise regression to individually add or delete the independent variables from the model at each step of the regression until the “best” regression model is obtained, or backward stepwise regression to remove the independent variables from the regression equation one at a time until the “best” regression model is obtained. It is generally recommended that one should have at least 10 times as many observations or cases as one has variables in a regression model.

Residual analysis is recommended as a tool to assess the multiple regression models and to identify violations of assumptions that threaten the validity of results. Residuals are the deviations of the observed values of the dependent variable from the predicted values. Most statistical software provides extensive residuals analyses, allowing one to use a variety of diagnostic tools in inspecting different residual and predicted values, and thus to examine the adequacy of the prediction model, the need for transformations of the variables in the model, and the existence of outliers in the data. Outliers (that is, extreme cases) can seriously bias the results.

3.7.11.4 Stepwise regression

This will be explained by using an example for yields. A combination of variables may work together to produce the final yield. These variables could be the annual precipitation, the temperature of a certain month, the precipitation of a certain month, the potential evapotranspiration of a certain month, or the difference between precipitation and potential evapotranspiration for a given month.

In stepwise regression, a simple linear regression for the yield is constructed on each of the variables and their coefficients of determination found. The variable that produces the largest r² statistic is selected. Additional variables are then brought in one by one and subjected to a multivariate regression with the best variable to see how much that variable would contribute to the model if it were to be included. This is done by calculating the F statistic for each variable. The variable with the largest F statistic that
is significant at the specified significance level for entry is included in the multivariate regression model. Other variables are included in the model one by one. If the partial F statistic of a variable is not significant at a specified level for staying in the regression model, it is left out. Only those variables that have produced significant F statistics are included in the regression. A more in-depth explanation can be found in Draper and Smith (1981).

3.7.11.5 Cluster analysis

Cluster analysis is a technique for grouping individuals or objects into unknown groups. In biology, cluster analysis has been used for taxonomy, which is the classification of living things into arbitrary groups on the basis of their characteristics. In agrometeorology, cluster analysis can be used to analyse historical records of the spatial and temporal variations in pest populations in order to classify regions on the basis of population densities and the frequency and persistence of outbreaks. The analysis can be used to improve regional monitoring and control of pest populations.

Clustering techniques require that one define a measure of closeness or similarity between two observations. Clustering algorithms may be hierarchical or non-hierarchical. Hierarchical methods can be either agglomerative or divisive. Agglomerative methods begin by assuming that each observation is a cluster and then, through successive steps, the closest clusters are combined. Divisive methods begin with one cluster containing all the observations and successively split off cases that are the most dissimilar to the remaining ones. K-means clustering is a popular non-hierarchical clustering technique. It begins with user-specified clusters and then reassigns data on the basis of the distance from the centroid of each cluster. See von Storch and Zwiers (2001) for more detailed explanations.

3.7.11.6 Classification trees

The goal of classification trees is to predict or explain responses on a categorical dependent variable. They have much in common with discriminant analysis, cluster analysis, non-parametric statistics and non-linear estimation. They are one of the main techniques used in data mining. The ability of classification trees to perform univariate splits, examining the effects of predictors one at a time, contributes to their flexibility. Classification trees can be computed for categorical predictors, continuous predictors or any mix of the two types of predictors when univariate splits are used. They readily lend themselves to graphical display, which makes them easier to interpret. Classification trees are used in medicine for diagnosis and in biology for classification. They have been used to predict levels of winter survival of overwintering crops using weather and categorical variables related to topography and crop cultivars.

3.7.12 Climatic periodicities and time series

Data are commonly collected as time series, namely, observations made on the same variable at repeated points in time. The INSTAT software provides facilities for descriptive analysis and display of such data. The goals of time series analysis include identifying the nature of the phenomenon represented by the sequence of observations and predicting future values of the time series. Moving averages are frequently used to “smooth” a time series so that trends and other patterns are seen more easily. Sivakumar et al. (1993) present a number of graphs showing the five-year moving averages of monthly and annual rainfall at selected sites in Niger. Most time series can be described in terms of trend and seasonality. When trends, such as seasonal or other deterministic patterns, have been identified and removed from a series, the interest focuses on the random component. Standard techniques can be used to look at its distribution. The feature of special interest, resulting from the time-series nature of the data, is the extent to which consecutive observations are related. A useful summary is provided by the sample autocorrelations at various lags, the autocorrelation at lag m being the correlation between observations m time units apart. In simple applications this is probably most useful for determining whether the assumption of independence of successive observations used in many elementary analyses is valid. The autocorrelations also give an indication of whether more advanced modelling methods are likely to be helpful. The cross-correlation function provides a summary of the relationship between two series from which all trend and seasonal patterns have been removed. The lag m cross-correlation is defined as the correlation between x and y lagged by m units.

More than any other user of climatic data, the agrometeorologist may be tempted to search for climatic periodicities that could provide a basis for the management of agricultural production. It should be noted that the Guide to Climatological Practices (WMO-No. 100) (section 5.3) is more than cautious with
regard to such periodicities and that, although they may be of theoretical interest, they have been found to be unreliable, having amplitudes that are too small for any practical conclusions to be drawn.

3.8 PUBLICATION OF RESULTS

3.8.1 General methods

scales used on the graph must be specified and their graduations should be shown. Publications intended for wide distribution among agricultural users should not have complicated scales (for instance, logarithmic, Gaussian, and so forth) with which the users may be unfamiliar, and which might lead to serious errors in interpreting the data. Furthermore, giving too much information on the same graph and using complicated conventional symbols should be avoided.
long-term agricultural decisions such as crop adaptation to weather patterns, marketing decisions or modelling.

Second, the users’ requirements must be clearly established, so that the most appropriate information is provided. This is possible only after discussion with them. In most cases, they do not have a clear picture of the type of information that is best suited for their purpose; the role of the agrometeorologist is crucial here.

Third, the methods of dissemination of information must be decided upon after consultation with the users. Some farmers may have full access to the Internet, while others have only limited access, and others have no such access to this technology. Obviously, the presentation of data for these categories will not be the same. Furthermore, some information must be provided as quickly as possible, while other information may be provided two or three weeks later.

Fourth, it is very important to consider the cost of the agmet bulletin that is proposed to the users, especially in developing countries where the financial burden is becoming heavier.

3.8.6.1 Some examples

Some examples of the presentation of agmet information are given below to illustrate the points mentioned.

3.8.6.1.1 Data in pentads

Table 3.7 shows part of an agmet bulletin issued by a government service in a tropical country where agriculture is an important component of the economy. This bulletin was developed to cater for all crops, ranging from tomatoes to sugar cane. It is issued on a half-monthly basis and is sent to the users by post and is also available on the service’s Website. Bearing in mind the time taken to collect the data, it would not be before the 20th day of the month at the very earliest that the bulletin would reach the users. To provide farmers (tomato growers, for example) with data relevant to their day-to-day activities, the agmet bulletin is supplemented by daily values of rainfall and maximum and minimum temperatures, which are broadcast on radio and television. Of course, data relevant to different geographical localities can be included.

Rainfall amounts (RR) and maximum temperature (MxT) are shown for a given area of a tropical country. Total rainfall amounts and the mean maximum temperature observed during the three pentads are compared to their respective normal values.

In agmet bulletins, extreme weather events, which are masked by the averaging procedure involved in the calculation of the pentad, must be highlighted, probably in the form of a footnote, to draw the attention of users. For example, in Table 3.7 it can be seen that during the period 6–15 July, the maximum temperature was below the normal by not more than 1.8°C. But in fact, during the period 9–12 July, maximum temperature was below the normal by 2.8°C to 3.0°C; this can be of importance to both animals and plants.

The presentation of data in this format, together with the broadcast of daily values on the radio and on television, is very effective. It can be used by farmers interested in day-to-day activities and by research workers and model builders. It is suitable for all types of crops, ranging from tomatoes and lettuce to sugar cane and other deep-root crops.

3.8.6.1.2 Data in 10-day intervals

On the basis of the agrometeorological requirements for a Mediterranean climate with two main seasons and two transitional seasons, the main climatic parameters should be published on a year-round basis. The selection of agromet parameters/indices should be published according to the season and the agricultural situation of the crops, including data representing the various agricultural regions of the country.

The bulletin should include daily data, 10-day means or totals, and deviation or per cent from average. In parameters such as maximum and minimum temperature and maximum and minimum relative humidity, absolute values of the decade based on a long series of years are also recommended.

The list of recommended data to be published in the agrometeorological bulletin is as follows:
(a) Daily data of maximum and minimum temperature and relative humidity;
(b) Temperature near the ground;
(c) Soil temperature;
(d) Radiation and/or sunshine duration in hours;
(e) Class A pan evaporation and/or Penman evapotranspiration;
(f) Rainfall amount;
(g) Accumulated rainfall from the beginning of the rainy season;
(h) Number of rainy days;
(i) Accumulated number of dry days since the last rainy day;
(j) Number of hours below different temperature thresholds depending on the crop;
(k) Number of hours temperature is below 0°C.

Examples of agrometeorological parameters or indices that should be published are:
(a) Accumulated number of dynamic model units since the beginning of the winter as an indication of budbreak in deciduous trees;
(b) Accumulated number of units above 13°C since the beginning of spring as an indication of citrus growth;
(c) Physiological days – the accumulated number of units above 12°C since the beginning of spring as an indication of cotton growth.

With the availability on the Internet of short-range weather forecasts (5 to 10 days) provided by World Weather Centres (WWCs), many Agrometeorological Services are providing 5- to 10-day weather forecasts to farmers. An example is given below, showing expected rainfall (in millimetres) for two rainfed farming areas. The information was released to users through e-mail and posted on the Website.

December 2003    West        North
Friday 12        <1.0        <1.0
Saturday 13      1.1–5.0     <1.0
Sunday 14        5.1–25.0    5.1–25.0
Monday 15        1.1–5.0     1.1–5.0
Tuesday 16       <1.0        <1.0

This weather outlook, based on model output received early on Thursday 11 December from WWCs, was released in the afternoon of 11 December; it was sent by e-mail to users through farming centres and posted on the Website. This outlook was not broadcast on either radio or television. The issue of such a weather outlook is important, but it must be carefully planned. Otherwise, it can lead to financial losses, as shown below.

Little rainfall was observed during the first two pentads of December 2003 and farmers were starting to get worried. The indication that significant rain was expected on Sunday 14 December (Table 3.8) had given great hope to the farmers and, because it was a weekend, they made plans on Friday to do some fieldwork on Saturday and on Monday. Such plans are costly because they imply the booking of manpower and transport, the purchase of fertilizers, and so forth. But model output received on Friday 12 December indicated that the probability of having rain during the following five days was negligible, and in fact, it was not before 31 December that significant rainfall was observed.

Here, it is not the validity of the weather outlook that is questioned. The point to be noted is that no update of the outlook could reach the farmers because the farming centres were closed for the weekend. If, besides being sent by e-mail and posted on the Website, the outlook had been broadcast on radio and television, the updated version would have reached the farmers and appropriate measures could have been taken. To avoid similar incidents, it is advisable to decide on the methods for dissemination of information.
Table 3.7. Part of an agmet bulletin issued by a government service in a tropical country
3.8.6.1.4 Seasonal forecast

An extract from a seasonal forecast issued in the first half of October 2003 for a country situated in the southern hemisphere, for summer 2003/2004 (summer in that country is from November to April), reads as follows: “The rainfall season may begin by November. The summer cumulative rainfall amount is expected to reach the long-term mean of 1 400 millimetres. Heavy rainfall is expected in January and February 2004.” This seasonal forecast was published in the newspapers and read on television.

Given that October and the first half of November 2003 were relatively dry and that a significant amount of rainfall was recorded during the second half of November, and noting that the seasonal forecast opted for normal rainfall during summer and that the rainfall season may start in November, the farmers thought that the rainy season had begun. Most of them started planting their crops during the last pentad of November. Unfortunately, the rainfall during the second half of November was a false signal: December was relatively dry. The rainy season started in January 2004.

Table 3.9. Real data for the period October 2003 to January 2004

Rainfall amounts in millimetres (mm)
               Oct. 2003   Nov. 2003   Dec. 2003   Jan. 2004
First half     1.8         4.1         5.2         176.4
Second half    12.8        35.7        12.8        154.1

To prevent seasonal forecasts from falling into the wrong hands, it is not advisable to have them published in the newspapers; these seasonal forecasts must be sent to specialists who are trained to interpret them and they should be supplemented by short-range weather forecasts.

Sooner or later, the financial situation in these countries will not be able to sustain the issuing of costly agmet bulletins by local personnel. So agrometeorologists must think carefully about the cost–benefit of the agmet bulletin, especially when developed countries are getting ready to offer their services for free. (And one must ask how long they will continue to be free.)

Already, shipping bulletins, cyclone warnings and aviation forecasts are being offered for free on a global scale by a few developed countries. But how long will these services be free? Sooner or later, the small and poor countries will have to pay for these services. It is very important to keep the cost of the agmet bulletin to a minimum.
ANNEX
Z | 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-------------------------------------------------------------------------------------------------------------------------------------------
0.0 | 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 | 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 | 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 | 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 | 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 | 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 | 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 | 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 | 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 | 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 | 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 | 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 | 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 | 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 | 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 | 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 | 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 | 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 | 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 | 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 | 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 | 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 | 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 | 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 | 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 | 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 | 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 | 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 | 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 | 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 | 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
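
The annex table carries no caption in this copy, but its entries match the cumulative probabilities of the standard normal distribution, P(Z <= z), with the row giving z to one decimal place and the column giving the second decimal. On that assumption, the following minimal sketch shows how individual entries or whole rows could be recomputed. The function names standard_normal_cdf and table_row are illustrative and do not come from the Guide.

```python
import math


def standard_normal_cdf(z):
    """Return P(Z <= z) for a standard normal variable Z.

    Uses the identity Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def table_row(z_tenths):
    """Recompute one table row: z_tenths + 0.00, 0.01, ..., 0.09.

    Values are rounded to four decimals, as in the annex
    (assumed layout: row label = z to one decimal place).
    """
    return [round(standard_normal_cdf(z_tenths + k / 100.0), 4)
            for k in range(10)]


if __name__ == "__main__":
    # Print the row labelled 1.0 in the same layout as the annex.
    row = table_row(1.0)
    print("1.0 | " + " ".join(f"{p:.4f}" for p in row))
```

For example, standard_normal_cdf(1.96) evaluates to approximately 0.9750, which agrees with the entry in row 1.9, column 0.06. For negative z, which the table does not list, the symmetry P(Z <= -z) = 1 - P(Z <= z) applies.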