Symmetric Polynomials

Uploaded by

yigithalitkerem
Copyright
© © All Rights Reserved
ROOTS AND SYMMETRIC POLYNOMIALS

DAVID SMYTH

1. From finding roots to factoring.


To see the connection between finding roots and factoring a polynomial, we begin with
the following easy lemma. It says that finding a root α of f(x) is the same as factoring f(x)
into (x − α) and a factor of lower degree.
Lemma 1.1 (Remainder Theorem). Let k be a field, and let f(x) ∈ k[x] be a polynomial of
degree n ≥ 1. For any α ∈ k, we can write
f(x) = (x − α)g(x) + f(α),
where g(x) is a polynomial of degree n − 1. In particular, if f(α) = 0, then f(x) admits a
factorization as (x − α)g(x).
Proof. Using the usual division algorithm for polynomials, just divide f (x) by (x − α). We
will get
f (x) = (x − α)g(x) + c
where c ∈ k is some constant. By plugging α into both sides of this equation, we see that
c = f (α). 
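The division in this proof can be carried out mechanically. The following is a minimal sketch, assuming coefficients are listed from the highest degree down; the function names `divide_by_linear` and `evaluate` are illustrative and not part of the notes. It uses synthetic division (Horner's scheme) to produce g(x) and the remainder, which should equal f(α).

```python
from fractions import Fraction

def divide_by_linear(coeffs, alpha):
    """Divide f(x) by (x - alpha) using synthetic division.

    coeffs lists the coefficients of f from the highest degree down,
    e.g. [1, -6, 11, -6] stands for x^3 - 6x^2 + 11x - 6.
    Returns (g_coeffs, remainder) with f(x) = (x - alpha) * g(x) + remainder.
    """
    partial = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * alpha + c   # one Horner step
        partial.append(carry)
    # The last Horner value is the remainder; the earlier ones are g's coefficients.
    return partial[:-1], partial[-1]

def evaluate(coeffs, x):
    """Evaluate f at x by Horner's scheme."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

f = [1, -6, 11, -6]                  # (x - 1)(x - 2)(x - 3)
g, r = divide_by_linear(f, 2)        # alpha = 2 is a root, so r should be 0
g5, r5 = divide_by_linear(f, 5)      # alpha = 5 is not a root, so r5 should be f(5)
```

In this example g comes out as x^2 − 4x + 3 with remainder 0, and for α = 5 the remainder agrees with `evaluate(f, 5)`, as the lemma predicts.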
If we use this factoring procedure inductively, we get two useful corollaries.
Corollary 1.2. If f (x) ∈ k[x] is a degree n polynomial, then f has at most n roots.
Proof. If f has no roots, then there is nothing to prove, so we may assume that f has a root
α. By the Remainder Theorem, we may factor f as
f (x) = (x − α)g(x).
By induction on the degree of f , we may assume that g has no more than n − 1 roots. Since
any root of f must be either a root of (x − α) (namely α) or a root of g(x), it follows that
f (x) has no more than n roots. 
Corollary 1.3. If f(x) ∈ k[x] is a monic degree n polynomial with n distinct roots α1, ..., αn ∈ k,
then f can be factored as:
\[ f(x) = \prod_{i=1}^{n} (x - \alpha_i). \]

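Proof. By the Remainder Theorem, we can factor f(x) as:
\[ f(x) = (x - \alpha_1) g(x). \]
Since the roots are distinct, $\alpha_i - \alpha_1 \neq 0$ for all i = 2, ..., n, so the equation
$0 = f(\alpha_i) = (\alpha_i - \alpha_1) g(\alpha_i)$ forces $g(\alpha_i) = 0$. Thus, $\alpha_2, \alpha_3, \ldots, \alpha_n$ must be
roots of g(x). By induction on the degree of f, we may assume $g(x) = \prod_{i=2}^{n} (x - \alpha_i)$, and the
desired result follows.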
Now let us assume for the time being that f(x) actually has n distinct roots, so that we can
factor
\[ f(x) = \prod_{i=1}^{n} (x - \alpha_i). \]

Then we can view the roots αi as variables, and the coefficients of the polynomial as giving
equations involving these variables. At least in the case n = 2, this idea should be familiar
from high school, i.e. if we write

\[ (x - \alpha_1)(x - \alpha_2) = x^2 + a_1 x + a_2, \]

where a1 and a2 are given to begin with, then we see that the problem of finding the roots of
f is just the same as finding two numbers α1, α2 such that

\[ -(\alpha_1 + \alpha_2) = a_1, \qquad \alpha_1 \alpha_2 = a_2. \]

In the next lecture, we will generalize this system of equations to higher n.

2. Symmetric Functions
Definition 2.1 (Elementary Symmetric Functions). Let k[x1, ..., xn] be a polynomial ring in
n variables. For i = 1, ..., n, we define the following special polynomials si ∈ k[x1, ..., xn]:

\begin{align*}
s_1 &= x_1 + x_2 + \cdots + x_n \\
s_2 &= x_1 x_2 + x_1 x_3 + \cdots + x_{n-1} x_n \\
&\;\;\vdots \\
s_k &= \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k} \\
&\;\;\vdots \\
s_n &= x_1 x_2 \cdots x_n
\end{align*}

In words, we can say that the k-th elementary symmetric function is simply the sum of all
degree k monomials with no repeated variables.
The point of this definition is that the functions si precisely encode the relationship between
the roots of a polynomial and its coefficients. By some straightforward high school algebra,
you can check:
\[ \prod_{i=1}^{n} (x - \alpha_i) = x^n - s_1(\alpha_1, \ldots, \alpha_n) x^{n-1} + s_2(\alpha_1, \ldots, \alpha_n) x^{n-2} - \cdots + (-1)^n s_n(\alpha_1, \ldots, \alpha_n). \]

This means that finding the roots of a given polynomial $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$ (at
least under the assumption that f(x) has n distinct roots - later we’ll see this assumption is
unnecessary) is precisely equivalent to finding α1, ..., αn which satisfy the following equations,
obtained by comparing coefficients, i.e. $s_i(\alpha_1, \ldots, \alpha_n) = (-1)^i a_i$:

\begin{align*}
\alpha_1 + \cdots + \alpha_n &= -a_1 \\
\alpha_1 \alpha_2 + \cdots + \alpha_{n-1} \alpha_n &= a_2 \\
&\;\;\vdots \\
\alpha_1 \alpha_2 \cdots \alpha_n &= (-1)^n a_n
\end{align*}
In these equations, you should think of ai as given, and αi as being unknown numbers that
you are trying to find.
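The relationship between roots and coefficients can be checked numerically. The sketch below is illustrative only (the helper names `elementary_symmetric` and `expand_from_roots` are assumptions, not from the notes): it expands the product of the factors (x − αi) directly and compares the resulting coefficients with $(-1)^i s_i(\alpha_1, \ldots, \alpha_n)$, as read off from the expansion of the product given earlier.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """s_k evaluated at the given numbers: sum of products over all k-element subsets."""
    return sum(prod(subset) for subset in combinations(values, k))

def expand_from_roots(roots):
    """Coefficients of prod (x - r) over the given roots, highest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r):
        # shifting left multiplies by x, shifting right and scaling multiplies by -r.
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

roots = [1, 2, 3]
coeffs = expand_from_roots(roots)   # coefficients 1, a_1, a_2, a_3
signed = [(-1) ** i * elementary_symmetric(roots, i) for i in range(len(roots) + 1)]
```

For the roots 1, 2, 3 both lists describe x^3 − 6x^2 + 11x − 6, so a_1 = −s_1, a_2 = s_2 and a_3 = −s_3 in this case.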
We started off with a single equation in one variable, and now we have n equations in n
variables. How on earth could this be any easier than the original problem? The point is
that these are not just any random old equations; the elementary symmetric functions have
very special properties that will make them easier to work with than arbitrary functions. For
starters, they are symmetric. What exactly does that mean?
Definition 2.2. Let $S_n$ act on k[x1, ..., xn] by permuting the variables, i.e. $\sigma(f(x_1, \ldots, x_n)) =
f(x_{\sigma(1)}, \ldots, x_{\sigma(n)})$. We say that a function f is symmetric if σ(f) = f for every $\sigma \in S_n$.
Example 2.3 (n = 3). If σ = (123) is the cyclic permutation of 3 variables, then $\sigma(x_1^3 x_2^2 x_3) =
x_2^3 x_3^2 x_1$. Evidently, $x_1^3 x_2^2 x_3$ is not a symmetric function. On the other hand, $x_1^3 + x_2^3 + x_3^3$ is a
symmetric function.
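The action of a permutation can be made concrete in a few lines. This is a sketch under an assumed representation that does not appear in the notes: a polynomial is a dictionary mapping exponent tuples (e1, ..., en) to coefficients, and a permutation is a 0-indexed tuple `sigma` with `sigma[i]` standing for σ(i+1) − 1. The names `act` and `is_symmetric` are hypothetical.

```python
from itertools import permutations

def act(sigma, f):
    """Apply sigma to f by replacing each variable x_i with x_sigma(i)."""
    out = {}
    for exponents, coeff in f.items():
        new = [0] * len(exponents)
        for i, e in enumerate(exponents):
            new[sigma[i]] = e      # the exponent of x_i moves to x_sigma(i)
        out[tuple(new)] = coeff
    return out

def is_symmetric(f):
    """Check sigma(f) == f for every permutation of the variables."""
    n = len(next(iter(f)))
    return all(act(sigma, f) == f for sigma in permutations(range(n)))

cycle = (1, 2, 0)                                  # sigma = (123), 0-indexed
monomial = {(3, 2, 1): 1}                          # x1^3 x2^2 x3
cubes = {(3, 0, 0): 1, (0, 3, 0): 1, (0, 0, 3): 1} # x1^3 + x2^3 + x3^3
image = act(cycle, monomial)
```

Here `image` represents x1 x2^3 x3^2, matching Example 2.3, and `is_symmetric` distinguishes the two polynomials of that example. Checking all n! permutations is only practical for small n.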
The elementary symmetric functions si are all symmetric. While there are many symmetric
functions besides the elementary ones, it turns out that they are all generated as polynomial
combinations of the elementary symmetric functions. This is an astounding fact!
Theorem 2.4 (Fundamental Theorem of Symmetric Functions). Let f(x1, ..., xn) be any
symmetric polynomial. Then f can be expressed as a polynomial in the elementary symmetric
functions, i.e. f = g(s1, ..., sn) for some polynomial g.
Example 2.5. The theorem says that one can express $x_1^3 + x_2^3 + x_3^3$ as a polynomial in $s_1, s_2, s_3$.
One can easily check that
\[ x_1^3 + x_2^3 + x_3^3 = s_1^3 - 3 s_1 s_2 + 3 s_3. \]
Is there a way to derive this formula systematically? Yes, there is - we shall spell out the
algorithm in full detail when we prove the fundamental theorem, but it may be useful to sketch
the idea informally in the context of this example. To begin with, it’s easy to see that $s_1^3$, $s_1 s_2$, $s_3$
are the only monomials in $s_1, s_2, s_3$ that give rise to degree 3 monomials in $x_1, x_2, x_3$, so these
are the only monomials that can appear in our formula. In other words, we must have a
formula like
\[ x_1^3 + x_2^3 + x_3^3 = a s_1^3 + b s_1 s_2 + c s_3, \]
for some coefficients a, b, c, and the question is how to figure out these coefficients.
First, focus attention on the $x_1^3$ term. On the left, it occurs with coefficient 1. On the right,
it’s easy to see that only $s_1^3$ contains an $x_1^3$ term, and it occurs with coefficient 1. Thus, we
must have a = 1.
Next, let’s subtract $s_1^3$ from both sides, to get:
\[ x_1^3 + x_2^3 + x_3^3 - (x_1 + x_2 + x_3)^3 = b s_1 s_2 + c s_3. \]
If we expand out the left hand side, the $x_1^3$ term cancels, so let’s examine the next lowest order
term, i.e. $x_1^2 x_2$. On the left, it occurs with coefficient −3. On the right, it’s easy to see that
only $s_1 s_2$ contains an $x_1^2 x_2$ term, and it occurs with coefficient 1. Thus, we must have b = −3.
Next, let’s add $3 s_1 s_2$ to both sides to get:
\[ x_1^3 + x_2^3 + x_3^3 - (x_1 + x_2 + x_3)^3 + 3 (x_1 + x_2 + x_3)(x_1 x_2 + x_1 x_3 + x_2 x_3) = c s_3. \]
If we expand out the left hand side, we see that the $x_1^3$ and $x_1^2 x_2$ terms cancel - in fact everything
cancels except the $x_1 x_2 x_3$ term, which occurs with coefficient 3. Since $s_3 = x_1 x_2 x_3$, we must
have c = 3, and we are done.
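The formula of Example 2.5 can be sanity-checked by plugging in numbers. The sketch below (helper names `s`, `sum_of_cubes` and `formula` are illustrative assumptions) compares both sides at every point of a 4 × 4 × 4 integer grid. Since each side has degree at most 3 in each variable separately, agreement on such a grid is enough to conclude that the two polynomials coincide over the rationals.

```python
from itertools import combinations, product
from math import prod

def s(k, xs):
    """Elementary symmetric function s_k evaluated at the numbers xs."""
    return sum(prod(subset) for subset in combinations(xs, k))

def sum_of_cubes(xs):
    """Left-hand side: x1^3 + x2^3 + x3^3."""
    return sum(x ** 3 for x in xs)

def formula(xs):
    """Right-hand side: s1^3 - 3 s1 s2 + 3 s3."""
    return s(1, xs) ** 3 - 3 * s(1, xs) * s(2, xs) + 3 * s(3, xs)

# All points (x1, x2, x3) with each coordinate in {0, 1, 2, 3}.
grid = list(product(range(4), repeat=3))
identity_holds = all(sum_of_cubes(p) == formula(p) for p in grid)
```

A numerical check like this confirms a proposed formula but does not find the coefficients a, b, c; that is what the lowest-monomial procedure in the example is for.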
In the example, I made use of the idea of the “lowest” monomial without actually explaining
what I meant. The key technical tool in proving the general theorem is the introduction of an
ordering on monomials which makes this concept precise.
Definition 2.6 (Lexicographic Ordering). The lexicographic ordering is a total ordering on
all degree m monomials in n variables, which can be defined as follows. Given any monomial,
we can write it as $x_{i_1} x_{i_2} \cdots x_{i_m}$ with $i_1 \le i_2 \le \cdots \le i_m$. In other words, we can write it as a
product of m variables whose subscripts go from lowest to highest. To compare two monomials,
we then just look at the first subscript where the two monomials differ. More formally, we say that
\[ x_{i_1} x_{i_2} \cdots x_{i_m} < x_{j_1} x_{j_2} \cdots x_{j_m} \]
if $i_1 = j_1, i_2 = j_2, \ldots, i_{k-1} = j_{k-1}$ and $i_k < j_k$ for some k ∈ {1, ..., m}.
Example 2.7. The lexicographic ordering for degree 3 monomials in 3 variables goes like this:
\[ x_1^3 < x_1^2 x_2 < x_1^2 x_3 < x_1 x_2^2 < x_1 x_2 x_3 < x_1 x_3^2 < x_2^3 < x_2^2 x_3 < x_2 x_3^2 < x_3^3. \]
Definition 2.8. If f is a homogeneous polynomial of degree m (homogeneous means that
every monomial in f has the same degree), we let L(f ) be the “lowest” monomial of f , i.e.
the monomial of f which is least with respect to the lexicographic ordering.
Example 2.9. If $f = 2 x_1^2 x_2 + x_1 x_2 x_3 + 3 x_3^3$, then $L(f) = 2 x_1^2 x_2$, because $x_1^2 x_2 < x_1 x_2 x_3 < x_3^3$
in the lexicographic ordering.
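The ordering is easy to experiment with if a monomial is stored as its sorted tuple of subscripts, since Python compares tuples entry by entry from the left, which is exactly the comparison in Definition 2.6. This is a sketch under that assumed representation; `monomials` and `lowest_term` are hypothetical names.

```python
from itertools import combinations_with_replacement

def monomials(n, m):
    """All degree-m monomials in n variables as sorted subscript tuples, lowest first."""
    return sorted(combinations_with_replacement(range(1, n + 1), m))

def lowest_term(f):
    """f maps subscript tuples to coefficients; return (monomial, coefficient) of L(f)."""
    mono = min(m for m, c in f.items() if c != 0)
    return mono, f[mono]

order = monomials(3, 3)     # (1, 1, 1) stands for x1^3, (1, 1, 2) for x1^2 x2, ...

# f = 2 x1^2 x2 + x1 x2 x3 + 3 x3^3, as in Example 2.9
f = {(1, 1, 2): 2, (1, 2, 3): 1, (3, 3, 3): 3}
low = lowest_term(f)
```

The list `order` reproduces the chain of Example 2.7, and `low` picks out the term 2 x1^2 x2 of Example 2.9.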
If you think about it, you will see that certain monomials cannot occur as L(f) for a
symmetric function f. For example, $x_1 x_2^2$ could never be the lowest monomial of a symmetric
function. Why not? Because if f contains the monomial $x_1 x_2^2$, then it must also (by symmetry)
contain the monomial $x_1^2 x_2$, and $x_1^2 x_2 < x_1 x_2^2$. More generally, we have the following lemma.
Lemma 2.10. If f is a symmetric function, and $L(f) = c x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$, then $k_1 \ge k_2 \ge \cdots \ge k_n$.
Proof. Let f be a symmetric function with $L(f) = c x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$, and suppose the statement
of the lemma fails, i.e. suppose that the $k_i$’s are not ordered from largest to smallest. Let
$\sigma \in S_n$ be a permutation that orders the $k_i$’s correctly, i.e. such that
\[ k_{\sigma(1)} \ge k_{\sigma(2)} \ge \cdots \ge k_{\sigma(n)}. \]
Since f is symmetric, f must contain the monomial $x_1^{k_{\sigma(1)}} x_2^{k_{\sigma(2)}} \cdots x_n^{k_{\sigma(n)}}$. By the definition
of the lexicographic ordering, we have
\[ x_1^{k_{\sigma(1)}} x_2^{k_{\sigma(2)}} \cdots x_n^{k_{\sigma(n)}} < x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}. \]
But this is a contradiction, since we started by assuming that $x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$ was the lowest
monomial in f.
Now we are ready to prove the fundamental theorem of symmetric functions. The idea, as
demonstrated in Example 2.5, is to focus on the lowest monomial of our symmetric function,
and then find a monomial in the elementary symmetric functions which matches it. By
successively subtracting off appropriate multiples of monomials in the elementary symmetric
functions, we can work our way up the lexicographic ordering until there are no monomials left!
At that point, we have expressed f as a polynomial in the elementary symmetric functions.
Proof of Theorem 2.4. First, we reduce to the case that f is homogeneous. We claim that if we
know the fundamental theorem for homogeneous symmetric functions, then we know the fundamental
theorem for all symmetric functions. To see this, let f be any symmetric function and write
$f = f_0 + f_1 + \cdots + f_m$, where each $f_i$ is homogeneous of degree i. If f is symmetric, then each $f_i$
must be as well (because the action of $S_n$ preserves the degree of each monomial of f). If we
know the fundamental theorem for homogeneous symmetric functions, then we can write each
$f_i$ as a polynomial in the elementary symmetric functions. But then we clearly get a representation
of f as a polynomial in the elementary symmetric functions, as desired.
Now let f be a homogeneous symmetric function and let $L(f) = c x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$. By Lemma
2.10, we know that $k_1 \ge k_2 \ge \cdots \ge k_n$. We claim that there exists a monomial in the
elementary symmetric functions, say $c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n}$, such that
\[ L(f) = L(c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n}). \]
To check this, we need to investigate the lowest monomials of the elementary symmetric
functions. By the definition of the elementary symmetric functions, one easily checks that:
\[ L(s_i) = x_1 x_2 \cdots x_i. \]
Since the lowest monomial of a product is the product of the lowest monomials, and $x_j$ appears
in $L(s_i)$ exactly when $i \ge j$, it follows that
\[ L(c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n}) = c x_1^{l_1 + l_2 + \cdots + l_n} x_2^{l_2 + \cdots + l_n} \cdots x_n^{l_n}. \]
Thus, in order to get $L(f) = L(c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n})$, we simply need to find non-negative integers
$l_1, l_2, \ldots, l_n$ such that
\begin{align*}
l_1 + l_2 + \cdots + l_n &= k_1 \\
l_2 + \cdots + l_n &= k_2 \\
&\;\;\vdots \\
l_n &= k_n.
\end{align*}
Happily, the condition $k_1 \ge k_2 \ge \cdots \ge k_n$ guarantees that we can do this. Indeed, we
simply set $l_n = k_n$ and $l_i = k_i - k_{i+1}$ for i = 1, ..., n − 1. With this choice of $l_i$, we have
$L(f) = L(c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n})$ as desired.
Now we are basically done. If we let $f' := f - c s_1^{l_1} s_2^{l_2} \cdots s_n^{l_n}$, then $f'$ is a homogeneous symmetric
function of the same degree as f, and either $f' = 0$ or $L(f') > L(f)$. Thus, we can simply replace
f by $f'$ and repeat this procedure. As we do this, we will subtract off multiples of monomials
of the elementary symmetric functions to get a sequence of functions $f, f', f'', \ldots$ with higher
and higher lowest monomials. Since there are only finitely many monomials of a given degree,
this process must terminate, and the only way it can terminate is to have $f^{(k)} = 0$ for some k.
At that point, we have an equation expressing f as a sum of monomials of elementary symmetric
functions, i.e. a polynomial in the elementary symmetric functions.
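The procedure in this proof can be turned into a short program. The following is a minimal sketch, not an optimized implementation, and it assumes a representation that is not in the notes: a polynomial is a dictionary from exponent tuples (e1, ..., en) to coefficients. With that representation, the lowest monomial in the lexicographic ordering of Definition 2.6 is the one whose exponent tuple is largest in Python's tuple order. The names `poly_mul`, `elementary` and `to_elementary` are hypothetical.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def elementary(n, k):
    """The elementary symmetric function s_k in n variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def to_elementary(f, n):
    """Write a symmetric f as {(l_1, ..., l_n): c}, meaning the sum of c * s_1^l_1 ... s_n^l_n."""
    f = {e: c for e, c in f.items() if c != 0}
    result = {}
    while f:
        k = max(f)      # exponent tuple of the lowest monomial L(f)
        c = f[k]
        if any(k[i] < k[i + 1] for i in range(n - 1)):
            raise ValueError("input is not symmetric (Lemma 2.10 fails)")
        # Solve for the l_i as in the proof: l_n = k_n, l_i = k_i - k_{i+1}.
        l = tuple(k[i] - k[i + 1] for i in range(n - 1)) + (k[n - 1],)
        result[l] = result.get(l, 0) + c
        # Build c * s_1^l_1 ... s_n^l_n and subtract it from f.
        term = {(0,) * n: c}
        for i, li in enumerate(l):
            for _ in range(li):
                term = poly_mul(term, elementary(n, i + 1))
        for e, ce in term.items():
            value = f.get(e, 0) - ce
            if value:
                f[e] = value
            else:
                f.pop(e, None)
    return result

cubes = {(3, 0, 0): 1, (0, 3, 0): 1, (0, 0, 3): 1}   # x1^3 + x2^3 + x3^3
answer = to_elementary(cubes, 3)
squares = to_elementary({(2, 0): 1, (0, 2): 1}, 2)   # x1^2 + x2^2
```

For the sum of cubes, `answer` records the coefficients 1, −3 and 3 on s1^3, s1 s2 and s3, which is the formula of Example 2.5, and `squares` gives x1^2 + x2^2 = s1^2 − 2 s2. Each pass through the loop strictly raises the lowest monomial, so the loop ends for the same reason the proof does.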
