
Polynomial regression

Daniel Borcard, Département de sciences biologiques, Université de Montréal

Reference: Legendre and Legendre (1998) p. 526

A variant form of multiple regression can be used to fit a nonlinear
model of an explanatory variable x (or several explanatory variables
x_j) to a response variable y. The method consists of using the
explanatory variable x raised to different powers in the regression
equation: power 1 (which is the original variable), power 2, power 3,
etc. The equation becomes a polynomial function of order k of variable x:
$$\hat{y} = a_1 x + a_2 x^2 + a_3 x^3 + \dots + a_k x^k + b$$
Adding an order to the equation adds a segment with a different slope
sign to the curve representing the fitted values. A first-order equation
is a straight line; a second-order equation is a parabola; a third-order
equation is represented by an S-shaped curve; and so on.

[Figure: four panels, each with axes from 0 to 30 (horizontal) and
0 to 3 (vertical); panel R2 values are 0.0139, 0.0836, 0.2566 and
0.7519.]

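As a rough illustration of the polynomial model described above, the
following Python sketch builds the columns x, x^2, ..., x^k and
estimates the coefficients by ordinary least squares. It assumes NumPy
is available; the function name fit_polynomial and the simulated data
are hypothetical and do not come from these notes.

```python
import numpy as np

def fit_polynomial(x, y, order):
    """Least-squares fit of y on x, x^2, ..., x^order plus an intercept b.

    Returns the coefficients (a1, ..., ak, b) and the R2 of the fit.
    """
    # Columns are x^1 ... x^order, followed by a column of ones for b.
    X = np.column_stack([x ** p for p in range(1, order + 1)] + [np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((y - fitted) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return coef, 1.0 - ss_res / ss_tot

# Hypothetical data: a noisy S-shaped response on x in [0, 30].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 30.0, 40)
u = (x - 15.0) / 15.0
y = 1.5 + u ** 3 - 0.5 * u + rng.normal(0.0, 0.1, x.size)

# R2 for polynomials of order 1, 2 and 3 fitted to the same data.
r2_by_order = {k: fit_polynomial(x, y, k)[1] for k in (1, 2, 3)}
for k, r2 in r2_by_order.items():
    print(f"order {k}: R2 = {r2:.4f}")
```

Because each model is nested in the next one, R2 cannot decrease when an
order is added; whether the extra term is worth keeping is a separate
question that R2 alone does not answer.
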
6.3.2 Trend surface analysis
This technique is a particular case of multiple regression, where
the explanatory variables are geographical (x-y) coordinates,
sometimes supplemented by higher-order polynomial terms. When
applying this method, one generally assumes that the spatial
structure of the observed variable results from one or two
generating processes that extend over the whole study area, and
that the resulting broad-scale structure of the dependent variable
can be modelled by means of a polynomial of the spatial
coordinates of the samples. A simple example follows:
Imagine a soil arthropod, the density of which (let us call it z)
increases from 0 (near a stream) to 100 individuals per square
meter (in a nearby meadow). If this density variation is linear, a
simple linear regression, with the distance to the stream (x) acting
as explanatory variable, is enough to model the arthropod density
in the whole meadow (Figure 44):
$$\hat{z} = b_0 + b_1 x$$

Figure 44 - Density of an arthropod species along a gradient and
linear model.
Now, if the stream (with its neighbouring meadows) extends from
higher mountains to sea level, perhaps the arthropod density also
varies with altitude (y). A second explanatory variable is then
necessary, i.e. the altitude, or possibly the distance to the source
along the stream. If the density variation with respect to
altitude is also linear, one gets a first-order multiple regression
equation of the form:
$$\hat{z} = b_0 + b_1 x + b_2 y$$
The result is thus a regression plane fitted through the z data
(densities) by means of the x-y coordinates of the arthropod
sampling points (Figure 45).
Figure 45 - Density of an arthropod species along a double
gradient and linear model.
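
The regression plane above can be sketched in a few lines of Python.
This is only a minimal example under stated assumptions: NumPy is
available, the function name fit_plane is hypothetical, and the
coordinates and densities are simulated rather than taken from the
arthropod example.

```python
import numpy as np

def fit_plane(x, y, z):
    """Return (b0, b1, b2) of the least-squares plane z = b0 + b1*x + b2*y."""
    # Design matrix: intercept column, then the two coordinates.
    X = np.column_stack([np.ones_like(x), x, y])
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return b

# Hypothetical sampling points: x = distance to the stream, y = altitude.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 50.0, 30)
y = rng.uniform(0.0, 500.0, 30)
z = 10.0 + 1.8 * x - 0.02 * y      # densities lying exactly on a plane

b0, b1, b2 = fit_plane(x, y, z)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

With noise-free data the estimated coefficients coincide with the ones
used to simulate z; with real densities they are the values that
minimize the sum of squared vertical distances to the plane.
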
If a plane does not explain enough variation, one can try to fit
higher-order polynomials by adding second-order, third-order, etc.
x-y terms and their products. The following equation is a cubic trend
surface equation:
$$\hat{z} = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2 + b_6 x^3 + b_7 x^2 y + b_8 xy^2 + b_9 y^3$$

It is easy to visualize the outcome of adding one order to a
trend surface model by remembering that each additional order
allows one more fold in the surface (Figure 46):
Figure 46 - Example of trend surface analysis, equations of order
1, 2, 3 and 5.
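
The terms of a trend surface of any order can be generated
mechanically. The sketch below, which assumes NumPy, lists the
monomials in the same sequence as the cubic equation above (1, x, y,
x^2, xy, y^2, x^3, ...) and fits them by least squares. The function
names and the simulated coordinates are hypothetical; the coordinates
are drawn in [-1, 1] only to keep the powers on comparable scales.

```python
import numpy as np

def trend_surface_terms(x, y, order):
    """Columns x^(d-j) * y^j for every total degree d = 0..order, j = 0..d."""
    cols = [x ** (d - j) * y ** j
            for d in range(order + 1)
            for j in range(d + 1)]
    return np.column_stack(cols)

def fit_trend_surface(x, y, z, order):
    """Least-squares coefficients and fitted values of a trend surface."""
    X = trend_surface_terms(x, y, order)
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return b, X @ b

# Hypothetical sampling points and a surface with a single xy product term.
rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 50)
y = rng.uniform(-1.0, 1.0, 50)
z = 1.0 + 2.0 * x - 1.0 * y + 0.5 * x * y

b, z_hat = fit_trend_surface(x, y, z, order=3)
print(np.round(b, 3))   # ten coefficients, b0 to b9
```

A surface of order k has (k + 1)(k + 2)/2 coefficients, i.e. 10 for the
cubic case, so the number of sampling points needed grows quickly with
the order.
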
Trend surface analysis can model relatively simple structures with
a reasonable number of “hills” and “holes” resulting from one or two
long-range trends (hence the name) across the sampling area. But
this method, although easy to compute, suffers from several
conceptual and practical problems, and should be used with great
care. Here are some of these problems:
Conceptual problem:
- fitting a trend surface is useful only when the trend has an
underlying physical or biological explanation, or if it can help
generate biological hypotheses; interpretation of individual
terms is often difficult;
Practical problems:
- when data points are few, extreme values can seriously distort the
surface;
- the surfaces are extremely susceptible to edge effects. Higher-
order polynomials can turn abruptly near area edges, leading to
unrealistic values;
- trend surfaces are inexact interpolators. Because they are long-
range models, extreme values of distant data points can exert an
unduly large influence, resulting in poor local estimates of the
studied variable.
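
The edge effect listed among the practical problems can be illustrated
in one dimension. The sketch below assumes NumPy and uses a purely
hypothetical smooth response (Runge's function, chosen here only
because its behaviour under polynomial fitting is well known; it is not
part of these notes). An order-10 polynomial is fitted to 11 evenly
spaced points along a transect scaled to [-1, 1], and its error is
compared near the centre and near the edges.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def runge(x):
    """Hypothetical smooth response used only for this illustration."""
    return 1.0 / (1.0 + 25.0 * x ** 2)

# Eleven evenly spaced sampling points along a transect scaled to [-1, 1].
x_obs = np.linspace(-1.0, 1.0, 11)
# Order-10 polynomial; coefficients are returned from low to high degree.
coefs = P.polyfit(x_obs, runge(x_obs), 10)

# Compare the fitted curve with the true response on a fine grid.
x_fine = np.linspace(-1.0, 1.0, 401)
error = np.abs(P.polyval(x_fine, coefs) - runge(x_fine))

centre_err = error[np.abs(x_fine) <= 0.2].max()
edge_err = error[np.abs(x_fine) >= 0.8].max()
print(f"largest error near the centre: {centre_err:.3f}")
print(f"largest error near the edges:  {edge_err:.3f}")
```

The polynomial passes through every sampling point, yet between the
outermost points it swings far away from the true response, while it
stays close to it in the middle of the transect.
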

Detrending
Despite its problems, trend surface analysis is very useful in one
specific case. It has been said in Section 6.2.5 that, for testing, the
condition of second-order stationarity or, at least, the intrinsic
assumption must be satisfied. Removing a trend from the data at
least makes the mean constant over the sampling area (although it
does not address any problem of heterogeneity of variance).
Furthermore, most methods of spatial analysis are devised to
model the intermediate-scale component of spatial variation and
are therefore much more powerful on detrended data. For these
reasons, trend surface analysis is often used to detrend data: one
fits a plane to the data and proceeds to analyze the finer-scale
structure in the residuals of this regression (this is equivalent to
subtracting the fitted values from the raw data and working with what
remains).
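
Detrending as described above amounts to a plane fit followed by a
subtraction. Here is a minimal Python sketch, assuming NumPy; the
function name detrend_plane and the simulated data (a linear trend plus
a random finer-scale component) are hypothetical.

```python
import numpy as np

def detrend_plane(x, y, z):
    """Residuals of z after removing the least-squares plane in x and y."""
    X = np.column_stack([np.ones_like(x), x, y])
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return z - X @ b          # raw data minus fitted values

# Hypothetical data: broad-scale linear trend plus finer-scale variation.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 100.0, 60)
y = rng.uniform(0.0, 100.0, 60)
fine_scale = rng.normal(0.0, 1.0, 60)
z = 5.0 + 0.3 * x - 0.1 * y + fine_scale

residuals = detrend_plane(x, y, z)
print(f"mean of raw data:  {z.mean():.3f}")
print(f"mean of residuals: {residuals.mean():.3e}")
```

Because the plane includes an intercept, the residuals have zero mean
and no remaining linear relationship with x or y; their variance is
left untouched by this step, in line with the remark above about
heterogeneity of variance.
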
