Taha Sochi - Tensor Calculus Made Simple (2016)
This book is prepared from personal notes and tutorials about tensor calculus at an
introductory level. The language and method used in presenting the ideas and techniques of
tensor calculus make it very suitable for beginners who have not previously been exposed
to this elegant discipline of mathematics. Some general background in arithmetic,
elementary algebra, calculus and linear algebra is needed to understand the book and
follow the development of its ideas and techniques. However, we made considerable efforts
to reduce this dependency on external literature by summarizing the main items needed in
this regard, so as to make the book self-contained. The book also contains a number of
graphic illustrations to aid the readers and students in their effort to visualize the ideas
and understand the abstract concepts. In addition to the graphic illustrations, we used
illustrative techniques such as highlighting key terms in boldface fonts.
The book also contains extensive sets of clearly explained exercises which cover most of
the material presented in the book, where each set is given in an orderly manner at the end
of its chapter. These exercises are designed to provide a thorough revision of the supplied
material and hence they form an essential component of the book and its learning
objectives; they should therefore not be considered a decorative accessory to the book.
We also populated the text with hyperlinks, for the ebook users, to facilitate referencing
and connecting related objects so that the reader can go back and forth with minimum effort
and time and without compromising the continuity of reading by interrupting the chain of
thoughts.
In view of all the above factors, the present text can be used as a textbook or reference
for an introductory course on tensor algebra and calculus, or as a guide for self-study. I
tried to be as clear as possible and to highlight the key issues of the subject at an
introductory level in a concise form. I hope I have achieved some success in reaching
these objectives for the majority of my target audience.
Finally, following the tradition in writing book prefaces, I should make a short statement
about credits. In fact, everything in the book was made by the author,
including all the graphic illustrations, front and back covers, indexing, typesetting, and
overall design. However, I should acknowledge the use of the LaTeX typesetting package
and the LaTeX based document preparation package LyX for facilitating many things in
typesetting and design which cannot be done easily or at all without their versatile and
powerful capabilities. I also used the Ipe extensible drawing editor program for making all
the graphic illustrations in the book as well as the front and back covers.
Taha Sochi
London, November 2016
Table of Contents
Preface
Nomenclature
1: Preliminaries
1.1: Historical Overview of Development & Use of Tensor Calculus
1.2: General Conventions
1.3: General Mathematical Background
1.3.1: Coordinate Systems
1.3.2: Vector Algebra and Calculus
1.3.3: Matrix Algebra
1.4: Exercises
2: Tensors
2.1: General Background about Tensors
2.2: General Terms and Concepts
2.3: General Rules
2.4: Examples of Tensors of Different Ranks
2.5: Applications of Tensors
2.6: Types of Tensor
2.6.1: Covariant and Contravariant Tensors
2.6.2: True and Pseudo Tensors
2.6.3: Absolute and Relative Tensors
2.6.4: Isotropic and Anisotropic Tensors
2.6.5: Symmetric and Anti-symmetric Tensors
2.7: Exercises
3: Tensor Operations
3.1: Addition and Subtraction
3.2: Multiplication of Tensor by Scalar
3.3: Tensor Multiplication
3.4: Contraction
3.5: Inner Product
3.6: Permutation
3.7: Tensor Test: Quotient Rule
3.8: Exercises
4: delta and epsilon Tensors
4.1: Kronecker delta
4.2: Permutation epsilon
4.3: Useful Identities Involving delta or/and epsilon
4.3.1: Identities Involving delta
4.3.2: Identities Involving epsilon
4.3.3: Identities Involving delta and epsilon
4.4: Generalized Kronecker delta
4.5: Exercises
5: Applications of Tensor Notation and Techniques
5.1: Common Definitions in Tensor Notation
5.2: Scalar Invariants of Tensors
5.3: Common Differential Operations in Tensor Notation
5.3.1: Cartesian Coordinate System
5.3.2: Cylindrical Coordinate System
5.3.3: Spherical Coordinate System
5.3.4: General Orthogonal Coordinate System
5.4: Common Identities in Vector and Tensor Notation
5.5: Integral Theorems in Tensor Notation
5.6: Examples of Using Tensor Techniques to Prove Identities
5.7: Exercises
6: Metric Tensor
6.1: Exercises
7: Covariant Differentiation
7.1: Exercises
References
Footnotes
Nomenclature
In the following table, we define some of the common symbols, notations and abbreviations
which are used in the book to avoid ambiguity and confusion.
∇ nabla differential operator
∇f gradient of scalar f
∇⋅A divergence of vector A
∇×A curl of vector A
∇2 or Δ Laplacian operator
⊥ perpendicular to
2D, 3D, nD two-dimensional, three-dimensional, n-dimensional
det determinant of matrix
Ei ith covariant basis vector
Ei ith contravariant basis vector
Eq./Eqs. Equation/Equations
hi scale factor for ith coordinate in general orthogonal system
iff if and only if
r, θ, φ coordinates of spherical system in 3D space
tr trace of matrix
u1, u2, u3 coordinates of general orthogonal system in 3D space
x1, x2, x3 labels of coordinate axes of Cartesian system in 3D space
X1, X2, X3 same as the previous entry
x1, x2, x3 (with superscript indices) coordinates of general curvilinear system in 3D space
x, y, z coordinates of points in Cartesian system in 3D space
ρ, φ, z coordinates of cylindrical system in 3D space
Chapter 1
Preliminaries
In this introductory chapter, we provide the reader with a general overview about the
historical development of tensor calculus and its role in modern mathematics, science and
engineering. We also provide a general set of notes about the notations and conventions
which are generally followed in the writing of this book. A general mathematical
background about coordinate systems, vector algebra and calculus and matrix algebra is
also presented to make the book, to some extent, self-sufficient. Although the general
mathematical background section is not comprehensive, it contains essential mathematical
terminology and concepts which are needed in the development of the ideas and methods of
tensor calculus in the subsequent chapters of the book.
A partial derivative symbol with a spatial subscript, rather than an index, is used to denote
partial differentiation with respect to that spatial variable. For instance:
(3) ∂r = ∂ ⁄ ∂r
is used for the partial derivative with respect to the radial coordinate r in spherical
coordinate systems identified by the spatial variables (r, θ, φ) (see Footnote 4 in § 8↓).
A partial derivative symbol with a repeated double index is used to denote the Laplacian
operator:
(4) ∂ii = ∂i∂i = ∇2 = Δ
The notation is not affected by using a repeated double index other than i (e.g. ∂jj or ∂kk).
The following notations: ∂2ii, ∂2 and ∂i∂i are also used in the literature of tensor calculus
to symbolize the Laplacian operator. However, these notations will not be used in the
present book.
We follow the common convention of using a subscript semicolon preceding a subscript
index (e.g. Akl;i) to symbolize the operation of covariant differentiation with respect to
the ith coordinate (see § 7↓). The semicolon notation may also be attached to the normal
differential operators for the same purpose, e.g. ∇;i or ∂;i to indicate covariant
differentiation with respect to the variable indexed by i.
Finally, all transformation equations in the present book are assumed to be continuous and
real, and all derivatives are continuous in their domain of variables. Based on the
continuity condition of the differentiable quantities, the individual differential operators in
the second (and higher) order partial derivatives with respect to different indices are
commutative, that is:
(5) ∂i∂j = ∂j∂i
1.3.1 Coordinate Systems
In generic terms, a coordinate system is a mathematical device used to identify the location
of points in a given space. In tensor calculus, a coordinate system is needed to define non-
scalar tensors in a specific form and identify their components in reference to the basis set
of the system. Hence, non-scalar tensors require a predefined coordinate system to be fully
identified (see Footnote 5 in § 8↓).
There are many types of coordinate system; the most common ones are the orthonormal
Cartesian, the cylindrical and the spherical. A 2D version of the cylindrical system is the
plane polar system. The most general type of coordinate system is the general curvilinear
system, a subset of which is the orthogonal curvilinear system. These types of coordinate
system are briefly investigated in the following subsections.
A. Orthonormal Cartesian Coordinate System
This is the simplest and the most commonly used coordinate system. It consists, in its
simplest form, of three mutually orthogonal straight axes that meet at a common point
called the origin of coordinates O. The three axes, assuming a 3D space, are scaled
uniformly and hence they all have the same unit length. Each axis has a unit vector oriented
along the positive direction of that axis (see Footnote 6 in § 8↓). These three unit vectors
are called the basis vectors or the bases of the system. These basis vectors are constant in
magnitude and direction throughout the system (see Footnote 7 in § 8↓). This system with
its basis vectors (e1, e2 and e3) is depicted in Figure 1↓.
The three axes, as well as the basis vectors, are usually labeled according to the right
hand rule, that is if the index finger of the right hand is pointing in the positive direction of
the first axis and its middle finger is pointing in the positive direction of the second axis
then the thumb will be pointing in the positive direction of the third axis.
Figure 1 Orthonormal right-handed Cartesian coordinate system and its basis vectors
e1, e2 and e3 in a 3D space (left frame) with the components of a vector v in this system
(right frame).
The transformation from the Cartesian coordinates (x, y, z) of a particular point in the
space to the cylindrical coordinates (ρ, φ, z) of that point, where the two systems are in a
standard position, is performed through the following equations (see Footnote 8 in § 8↓):
(6)
ρ = √(x2 + y2)
φ = arctan(y ⁄ x)
z = z
while the opposite transformation from the cylindrical to the Cartesian coordinates is
performed by the following equations:
(7)
x = ρ cosφ
y = ρ sinφ
z = z
The transformation from the Cartesian coordinates (x, y, z) of a particular point in the
space to the spherical coordinates (r, θ, φ) of that point, where the two systems are in a
standard position, is performed by the following equations (see Footnote 9 in § 8↓):
(10)
r = √(x2 + y2 + z2)
θ = arccos(z ⁄ √(x2 + y2 + z2))
φ = arctan(y ⁄ x)
while the opposite transformation from the spherical to the Cartesian coordinates is
performed by the following equations:
(11)
x = r sinθ cosφ
y = r sinθ sinφ
z = r cosθ
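These transformation equations are easy to check numerically. The following minimal Python sketch (an illustrative aid with an arbitrary sample point; arctan2 is used in place of arctan(y ⁄ x) so that φ falls in the correct quadrant) converts a point to cylindrical and spherical coordinates and back:

import numpy as np

def cart_to_cyl(x, y, z):
    # Eq. 6: Cartesian to cylindrical (rho, phi, z)
    return np.hypot(x, y), np.arctan2(y, x), z

def cyl_to_cart(rho, phi, z):
    # Eq. 7: cylindrical to Cartesian
    return rho * np.cos(phi), rho * np.sin(phi), z

def cart_to_sph(x, y, z):
    # Eq. 10: Cartesian to spherical (r, theta, phi)
    r = np.sqrt(x**2 + y**2 + z**2)
    return r, np.arccos(z / r), np.arctan2(y, x)

def sph_to_cart(r, theta, phi):
    # Eq. 11: spherical to Cartesian
    return (r * np.sin(theta) * np.cos(phi),
            r * np.sin(theta) * np.sin(phi),
            r * np.cos(theta))

p = (1.0, 2.0, 3.0)
print(cyl_to_cart(*cart_to_cyl(*p)))   # (1.0, 2.0, 3.0)
print(sph_to_cart(*cart_to_sph(*p)))   # approximately (1.0, 2.0, 3.0)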
1.3.2 Vector Algebra and Calculus
Figure 6 Demonstration of the geometric interpretation of the dot product of two vectors
a and b (left frame) as the projection of a onto b times the length of b (middle frame) or as
the projection of b onto a times the length of a (right frame).
Algebraically, the dot product is the sum of the products of the corresponding components
of the two vectors, that is:
(13) a⋅b = a1b1 + a2b2 + a3b3
where ai and bj (i, j = 1, 2, 3) are the components of a and b respectively. Here, we are
assuming an orthonormal Cartesian system in a 3D space; the formula can be easily
extended to an nD space, that is:
(14) a⋅b = Σi aibi (i = 1, …, n)
From Eq. 12↑, it is obvious that the dot product is positive when 0 ≤ θ < π ⁄ 2, zero when
θ = π ⁄ 2 (i.e. the two vectors are orthogonal), and negative when π ⁄ 2 < θ ≤ π. The
magnitude of the dot product is equal to the product of the lengths of the two vectors when
they have the same orientation (i.e. parallel or anti-parallel). Based on the above given
facts, the dot product is commutative, that is:
(15) a⋅b = b⋅a
Figure 7 Graphical demonstration of the cross product of two vectors a and b (left frame)
with the right hand rule (right frame).
Algebraically, the cross product of two vectors a and b is expressed by the following
determinant (see Determinant of Matrix in § 1.3.3↓) where the determinant is expanded
along its first row, that is:
(17)
a × b = det
| i   j   k  |
| a1  a2  a3 |
| b1  b2  b3 |
= (a2b3 − a3b2)i + (a3b1 − a1b3)j + (a1b2 − a2b1)k
Now, since |b × c| ( = |b||c| sinφ) is equal to the area of the parallelogram whose two main
sides are b and c, while |a|cosθ represents the projection of a onto the orientation of b × c
and hence it is equal to the height of the parallelepiped (refer to Figure 8↑), the magnitude
of the scalar triple product is equal to the volume of the parallelepiped whose three main
sides are a, b and c while its sign is positive or negative depending, respectively, on
whether the vectors a, b and c form a right-handed or left-handed system.
The scalar triple product is invariant to a cyclic permutation of the symbols of the three
vectors involved, that is:
(21) a⋅(b × c) = c⋅(a × b) = b⋅(c × a)
It is also invariant to an exchange of the dot and cross product symbols, that is:
(22) a⋅(b × c) = (a × b)⋅c
Hence, from the three possibilities of the first invariance with the two possibilities of the
second invariance, we have six equal expressions for the scalar triple product of three
vectors (see Footnote 10 in § 8↓). The other six possibilities of the scalar triple product
of three vectors, which are obtained from the first six possibilities with the opposite cyclic
permutations, are also equal to each other for the same reason. However, they are equal in
magnitude to the first six possibilities but are different in sign (see Footnote 11 in § 8↓).
From the above interpretation of the scalar triple product as the signed volume of the
parallelepiped formed by the three vectors, it is obvious that this product is zero when the
three vectors are coplanar. This, of course, includes the possibility of being collinear.
The scalar triple product of three vectors is also defined algebraically as the determinant
(refer to Determinant of Matrix in § 1.3.3↓) of the matrix formed by the components of the
three vectors as its rows or columns in the given order, that is:
(23)
a⋅(b × c) = det
| a1  a2  a3 |
| b1  b2  b3 |
| c1  c2  c3 |
= a1(b2c3 − b3c2) + a2(b3c1 − b1c3) + a3(b1c2 − b2c1)
where the rows contain the components of a, b and c along the x, y and z directions whose
unit vectors are i, j and k respectively.
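The determinantal form of Eq. 23 and the invariance properties of Eqs. 21 and 22 can be verified numerically, as in the following minimal Python sketch (with arbitrary sample vectors):

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
c = np.array([7.0, 8.0, 10.0])

stp = np.dot(a, np.cross(b, c))                      # a.(b x c) = -3
print(np.isclose(stp, np.dot(c, np.cross(a, b))))    # cyclic permutation (Eq. 21)
print(np.isclose(stp, np.dot(b, np.cross(c, a))))    # cyclic permutation (Eq. 21)
print(np.isclose(stp, np.dot(np.cross(a, b), c)))    # dot/cross exchange (Eq. 22)
print(np.isclose(stp, np.linalg.det(np.array([a, b, c]))))   # determinant (Eq. 23)
print(np.isclose(np.dot(a, np.cross(c, b)), -stp))   # opposite cyclic order flips the sign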
The gradient of a scalar field f(x, y, z) is a vector defined, in Cartesian coordinate
systems, by:
(27) ∇f = i(∂f ⁄ ∂x) + j(∂f ⁄ ∂y) + k(∂f ⁄ ∂z)
Geometrically, the gradient of a scalar field f(x, y, z), at any point in the space where the
field is defined, is a vector normal to the surface f(x, y, z) = constant (refer to Figure 9↓)
pointing in the direction of the fastest increase in the field at that point.
Figure 9 The gradient of a scalar field f(x, y, z) as a vector normal to the surface
f(x, y, z) = constant pointing in the direction of the fastest increase in the field at that point.
The gradient operation is distributive but not commutative or associative, that is:
(28) ∇(f + h) = ∇f + ∇h
(29) ∇f ≠ f∇
(30) (∇f)h ≠ ∇(fh)
where f and h are differentiable scalar functions of position.
The divergence of a vector field v(x, y, z) is a scalar quantity defined as the dot product
of the nabla operator with the vector. Hence, in Cartesian coordinate systems it is given by:
(31) ∇⋅v = ∂vx ⁄ ∂x + ∂vy ⁄ ∂y + ∂vz ⁄ ∂z
where vx, vy and vz are the components of v in the x, y and z directions respectively. In
broad terms, the physical significance of the divergence of a vector field is that it is a
measure of how much the field diverges or converges at a particular point in the space
where the field is defined. When the divergence of a vector field is identically zero, the
field is called solenoidal.
The divergence operation is distributive but not commutative or associative, that is:
(32) ∇⋅(A + B) = ∇⋅A + ∇⋅B
(33) ∇⋅A ≠ A⋅∇
(34) ∇⋅(fA) ≠ ∇f⋅A
where A and B are differentiable vector functions of position.
The curl of a vector field v(x, y, z) is a vector defined as the cross product of the nabla
operator with the vector. Hence, in Cartesian coordinate systems it is given by (refer to
Cross Product of Vectors in § 1.3.2↑):
(35)
∇ × v = det
| i        j        k       |
| ∂ ⁄ ∂x   ∂ ⁄ ∂y   ∂ ⁄ ∂z  |
| vx       vy       vz      |
= (∂vz ⁄ ∂y − ∂vy ⁄ ∂z)i + (∂vx ⁄ ∂z − ∂vz ⁄ ∂x)j + (∂vy ⁄ ∂x − ∂vx ⁄ ∂y)k
Broadly speaking, the curl of a vector field is a quantitative measure of the circulation or
rotation of the field at a given point in the space where the field is defined. When the curl
of a vector field vanishes identically, the field is called irrotational.
The curl operation is distributive but not commutative or associative, that is:
(36) ∇ × (A + B) = ∇ × A + ∇ × B
(37) ∇ × A ≠ A × ∇
(38) ∇ × (A × B) ≠ (∇ × A) × B
where A and B are differentiable vector functions of position.
The Laplacian (see Footnote 15 in § 8↓) scalar operator ∇2 is defined as the divergence
of the gradient operator and hence it is given, in Cartesian coordinates, by:
(39) ∇2 = ∂2 ⁄ ∂x2 + ∂2 ⁄ ∂y2 + ∂2 ⁄ ∂z2
The Laplacian can act on scalar, vector and tensor fields of higher rank. When the
Laplacian operates on a tensor (in its general sense which includes scalar and vector) it
produces a tensor of the same rank; hence the Laplacian of a scalar is a scalar, the
Laplacian of a vector is a vector, the Laplacian of a rank-2 tensor is a rank-2 tensor, and so
on.
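These differential operations can be checked symbolically. The following minimal Python sketch (using a sample scalar field; any twice-differentiable field would serve) confirms that the Laplacian of Eq. 39 is the divergence of the gradient, and that the curl of a gradient vanishes:

import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 * y + sp.sin(z)                                   # sample scalar field

grad_f = [sp.diff(f, v) for v in (x, y, z)]                # gradient (Eq. 27)
div_grad = sum(sp.diff(g, v) for g, v in zip(grad_f, (x, y, z)))
laplacian = sum(sp.diff(f, v, 2) for v in (x, y, z))       # Laplacian (Eq. 39)
print(sp.simplify(div_grad - laplacian))                   # 0

# x-component of curl(grad f) in the pattern of Eq. 35: identically zero
print(sp.simplify(sp.diff(grad_f[2], y) - sp.diff(grad_f[1], z)))   # 0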
F. Divergence Theorem
The divergence theorem, which is also known as Gauss theorem, is a mathematical
statement of the intuitive idea that the integral of the divergence of a vector field over a
given volume is equal to the total flux of the vector field out of the surface enclosing the
volume. Symbolically, the divergence theorem states that:
(40) ∭V∇⋅A dτ = ∬SA⋅n dσ
where A is a differentiable vector field, V is a bounded volume in an nD space enclosed
by a surface S, dτ and dσ are volume and surface elements respectively, and n is a variable
unit vector normal to the surface.
The divergence theorem is useful for converting volume integrals into surface integrals
and vice versa. In many cases, this can result in a considerable simplification of the
required mathematical work when one of these integrals is easier to manipulate and
evaluate than the other, or even overcoming a mathematical hurdle when one of the
integrals cannot be evaluated analytically. Moreover, the divergence theorem plays a
crucial role in many mathematical proofs and theoretical arguments in mathematical and
physical theories.
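As a concrete illustration, the following minimal Python sketch evaluates both sides of Eq. 40 by midpoint quadrature for a hypothetical test field A = (xy2, yz2, zx2), whose divergence is x2 + y2 + z2, over the cube [ − 1, 1]3:

import numpy as np

n = 100
s = (np.arange(n) + 0.5) * (2.0 / n) - 1.0      # midpoint grid on [-1, 1]
h = 2.0 / n
X, Y, Z = np.meshgrid(s, s, s, indexing='ij')
volume_integral = np.sum(X**2 + Y**2 + Z**2) * h**3    # integral of div A

# Outward flux through the six faces: A.n reduces to y**2 on both x = +/-1
# faces, to z**2 on both y faces, and to x**2 on both z faces, so the three
# pairs of faces give equal surface integrals.
U, V = np.meshgrid(s, s, indexing='ij')
flux = 3 * 2 * np.sum(U**2) * h**2

print(volume_integral, flux)    # both approach the exact value 8

Both sides converge to the exact value 8 as the grid is refined, as the theorem requires.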
G. Stokes Theorem
Stokes theorem is a mathematical statement that the integral of the curl of a vector field
over an open surface is equal to the line integral of the field around the perimeter
surrounding the surface, that is:
(41) ∬S(∇ × A)⋅n dσ = ∮CA⋅dr
where A is a differentiable vector field, C symbolizes the perimeter of the surface S, dr is
a vector element tangent to the perimeter, and the other symbols are as defined in the
divergence theorem. The perimeter should be traversed in a sense related to the direction
of the normal vector n by the right hand twist rule, that is when the fingers of the right
hand twist in the sense of traversing the perimeter the thumb will point approximately in the
direction of n, as seen in Figure 10↓.
Figure 10 Illustration of Stokes integral theorem (left frame) with the right hand twist rule
(right frame).
Similar to the divergence theorem, Stokes theorem is useful for converting surface
integrals into line integrals and vice versa, which is useful in many cases for reducing the
amount of mathematical work or overcoming technical and mathematical difficulties.
Stokes theorem is also crucial in the development of many proofs and theoretical arguments
in mathematics and science.
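As a concrete illustration, consider the field of Exercise 47 below, A = (2y, − 3x, 1.5z), whose curl is the constant vector (0, 0, − 5). For any surface bounded by the circle x2 + y2 = 9 in the plane z = 0 (e.g. the hemisphere of radius 3), the flux of the curl is − 5 times the disk area, i.e. − 45π. The following minimal Python sketch checks this against the line integral around the perimeter:

import numpy as np

R = 3.0
t = np.linspace(0.0, 2*np.pi, 20001)
x, y = R*np.cos(t), R*np.sin(t)

# A.dr = 2y dx - 3x dy along the circle (z = 0 on the perimeter C)
integrand = 2*y*(-R*np.sin(t)) + (-3*x)*(R*np.cos(t))
line_integral = np.sum(0.5*(integrand[1:] + integrand[:-1]) * np.diff(t))   # trapezoidal rule

print(line_integral, -5*np.pi*R**2)    # both approximately -141.372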
1.3.3 Matrix Algebra
There is a close relation between rank-2 tensors and square matrices where the latter
usually represent the former. Hence, there are many ideas, techniques and notations which
are common or similar between the two subjects. We therefore provide in this subsection a
set of short introductory notes about matrix algebra to supply the reader with the essential
terminology and methods of matrix algebra which are needed in the development of the
forthcoming chapters about tensor calculus.
A. Definition of Matrix
A matrix is a rectangular array of mathematical objects (mainly numbers or functions)
which is subject to certain rules in its manipulation and over which certain mathematical
operations are defined. Hence, two indices are needed to define a matrix unambiguously
where the first index labels the rows while the second index labels the columns. A matrix
which consists of m rows and n columns is said to be an m × n matrix.
The elements or entries of a matrix A are usually labeled with light-face symbols similar to
the symbol used to label the matrix where each element is suffixed with two indices: the
first refers to the row number of the entry and the second refers to its column number.
Hence, for a matrix A the entry in its second row and fifth column is labeled A25.
The two indices of a matrix are not required to have the same range since a matrix can have
a different number of rows and columns. When the two indices have the same range, the
matrix is described as a square matrix. Examples of matrices are:
(42)
When the range of the row/column index has only one value (i.e. 1) while the range of the
other index has multiple values, the matrix is described as a row/column matrix. Vectors may
be represented by row or column matrices (see Footnote 16 in § 8↓). Scalars may be
regarded as a trivial case of matrices.
For a square matrix, the entries with equal values of row and column indices are called the
main diagonal of the matrix. For example, the entries of the main diagonal of the third
matrix in Eq. 42↑ are C11 and C22. The elements of the other diagonal running from the top
right corner to the bottom left corner form the trailing or anti-diagonal, i.e. C12 and C21 in
the previous example.
B. Special Matrices
The zero matrix is a matrix all of whose entries are 0. The identity or unit or unity matrix
is a square matrix all of whose entries are 0 except those on its main diagonal, which are 1. A
matrix is described as singular iff its determinant is zero (see Determinant of Matrix in §
1.3.3↓). A singular matrix has no inverse (see Inverse of Matrix in § 1.3.3↓). A square
matrix is called diagonal if all of its elements which are not on the main diagonal are zero.
The transpose of a matrix is a matrix obtained by exchanging the rows and columns of the
original matrix. For example, if A is a 3 × 3 matrix and AT is its transpose then (see
Footnote 17 in § 8↓):
(43)
A =
| A11  A12  A13 |
| A21  A22  A23 |
| A31  A32  A33 |
AT =
| A11  A21  A31 |
| A12  A22  A32 |
| A13  A23  A33 |
The transposition operation is defined even for non-square matrices. For square matrices,
transposition represents a reflection of the matrix elements in the main diagonal of the
matrix.
C. Matrix Multiplication
The multiplication of two matrices, A of m × k dimensions and B of k × n dimensions, is
defined as an operation that produces a matrix C of m × n dimensions whose Cij entry is
the dot product of the ith row of the first matrix A and the jth column of the second matrix
B. Hence, if A is a 3 × 2 matrix and B is a 2 × 2 matrix, then their product AB is a 3 × 2
matrix which is given by:
(44)
AB =
| A11B11 + A12B21   A11B12 + A12B22 |
| A21B11 + A22B21   A21B12 + A22B22 |
| A31B11 + A32B21   A31B12 + A32B22 |
From the above, it can be seen that matrix multiplication is defined only when the number
of columns of the first matrix is equal to the number of rows of the second matrix. Matrix
multiplication is associative and distributive over a sum of compatible matrices, but it is
not commutative in general even if both forms of the product are defined, that is:
(45) (AB)C = A(BC)
(46) A(B + C) = AB + AC
(47) AB ≠ BA
As seen above, no symbol is used to indicate the operation of matrix multiplication
according to the notation of matrix algebra, i.e. the two matrices are put side by side with
no symbol in between. However, in tensor symbolic notation such an operation is usually
represented by a dot between the symbols of the two matrices, as will be discussed later in
the book (see Footnote 18 in § 8↓).
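The following minimal Python sketch (with arbitrary sample matrices; the @ operator performs matrix multiplication) illustrates the dimension rule and the non-commutativity described above:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])           # 3 x 2
B = np.array([[7.0, 8.0],
              [9.0, 0.0]])           # 2 x 2

C = A @ B                            # C_ij = sum over k of A_ik B_kj
print(C.shape)                       # (3, 2)

# BA is undefined here (2 x 2 times 3 x 2); even square matrices
# generally fail to commute:
P = np.array([[0.0, 1.0], [0.0, 0.0]])
Q = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.array_equal(P @ Q, Q @ P))  # False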
D. Trace of Matrix
The trace of a matrix is the sum of its diagonal elements, therefore if a matrix A is given
by:
(48)
A =
| A11  A12  A13 |
| A21  A22  A23 |
| A31  A32  A33 |
then its trace is given by:
(49) tr(A) = A11 + A22 + A33
From its definition, it is obvious that the trace of a matrix is a scalar and it is defined only
for square matrices.
E. Determinant of Matrix
The determinant is a scalar quantity associated with a square matrix. There are several
definitions for the determinant of a matrix; the most direct one is that the determinant of a 2
× 2 matrix is the product of the elements of its main diagonal minus the product of the
elements of its trailing diagonal, that is:
(50)
det
| A11  A12 |
| A21  A22 |
= A11A22 − A12A21
The determinant of an n × n (n > 2) matrix is then defined, recursively, as the sum of the
products of each entry of any one of its rows or columns times the cofactor of that entry
where the cofactor of an entry is defined as the determinant obtained from eliminating the
row and column of that entry from the parent matrix with a sign given by ( − 1)i + j with i
and j being the indices of the row and column of that entry (see Footnote 19 in § 8↓). For
example:
(51)
det
| A11  A12  A13 |
| A21  A22  A23 |
| A31  A32  A33 |
= A11(A22A33 − A23A32) − A12(A21A33 − A23A31) + A13(A21A32 − A22A31)
where the determinant is evaluated along the first row. It should be remarked that the
determinant of a matrix and the determinant of its transpose are equal, that is:
(52) det(A) = det(AT)
where A is a square matrix and T stands for the transposition operation. Another remark is
that the determinant of a diagonal matrix is the product of its main diagonal elements.
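The recursive definition translates directly into code. The following minimal Python sketch (a didactic implementation; practical work would use library routines) evaluates the determinant by cofactor expansion along the first row:

def det(M):
    # Determinant by cofactor expansion along the first row (the Eq. 51 pattern).
    n = len(M)
    if n == 1:
        return M[0][0]
    if n == 2:
        return M[0][0]*M[1][1] - M[0][1]*M[1][0]         # Eq. 50
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]   # delete the row and column of the entry
        total += (-1)**j * M[0][j] * det(minor)          # cofactor sign (-1)**(i+j) with i fixed at the first row
    return total

print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3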
F. Inverse of Matrix
The inverse of a square matrix A is a square matrix A−1 where:
(53) AA−1 = A−1A = I
with I being the identity matrix (see Special Matrices in § 1.3.3↑) of the same dimensions
as A. The inverse of a square matrix is formed by transposing the matrix of cofactors of the
original matrix and dividing each element of the transposed matrix of cofactors by the
determinant of the original matrix (see Footnote 20 in § 8↓). From this definition, it is
obvious that a matrix possesses an inverse only if its determinant is not zero, i.e. it must be
non-singular.
It should be remarked that this definition includes the 2 × 2 matrices where the cofactor of
an entry is a single entry with the designated sign, that is:
(54)
A−1 = (1 ⁄ det(A))
| A22   −A12 |
| −A21  A11  |
Another remark is that the inverse of an invertible diagonal matrix is a diagonal matrix
obtained by taking the reciprocal of the corresponding diagonal elements of the original
matrix.
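The cofactor recipe described above can be implemented directly, as in the following minimal Python sketch (the function name is arbitrary; NumPy is used for the minor determinants). For a 2 × 2 matrix it reproduces Eq. 54:

import numpy as np

def inverse_via_adjugate(A):
    # Transpose the matrix of cofactors and divide by the determinant.
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    cof = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1)**(i + j) * np.linalg.det(minor)
    return cof.T / np.linalg.det(A)

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])                     # det(A) = 1
print(inverse_via_adjugate(A))                 # [[ 3. -1.] [-5.  2.]]
print(np.allclose(A @ inverse_via_adjugate(A), np.eye(2)))   # Eq. 53: True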
1.4 Exercises
Exercise 1. Name three mathematicians accredited for the development of tensor calculus.
For each one of these mathematicians, give a mathematical technical term that bears his
name.
Exercise 2. What are the main scientific disciplines that employ the language and
techniques of tensor calculus?
Exercise 3. Mention one cause for the widespread use of tensor calculus in science.
Exercise 4. Describe some of the distinctive features of tensor calculus which contributed
to its success and extensive use in mathematics, science and engineering.
Exercise 5. Give preliminary definitions of the following terms: scalar, vector, tensor, rank
of tensor, and dyad.
Exercise 6. What is the meaning of the following mathematical symbols?
∂i
∂ii
∇
A, i
Δ
Ai;k
Exercise 7. Is the following equality correct? If so, is there any condition for this to hold?
∂k ∂l = ∂l ∂k
Exercise 8. Describe, briefly, the following six coordinate systems outlining their main
features: orthonormal Cartesian, cylindrical, plane polar, spherical, general curvilinear,
and general orthogonal.
Exercise 9. Which of the six coordinate systems in the previous exercise are orthogonal?
Exercise 10. What do the “basis vectors” of a coordinate system mean and what purpose do
they serve?
Exercise 11. Which of the six coordinate systems mentioned in the previous exercises have
constant basis vectors (i.e. some or all of their basis vectors are constant both in magnitude
and in direction)?
Exercise 12. Which of the above six coordinate systems have unit basis vectors by
definition or convention?
Exercise 13. Explain the meaning of the coordinates in the cylindrical and spherical
systems (i.e. ρ, φ and z for the cylindrical, and r, θ and φ for the spherical).
Exercise 14. What is the relation between the cylindrical and plane polar coordinate
systems?
Exercise 15. Are there any common coordinates between the above six coordinate systems?
If so, what are they? Investigate this thoroughly by comparing each pair of these systems.
Exercise 16. Write the transformation equations between the following coordinate systems
in both directions: Cartesian and cylindrical, and Cartesian and spherical.
Exercise 17. Make a sketch representing a spherical coordinate system, with its basis
vectors, superimposed on a rectangular Cartesian system in a standard position.
Exercise 18. What are the geometric and algebraic definitions of the dot product of two
vectors? What is the interpretation of the geometric definition?
Exercise 19. What are the geometric and algebraic definitions of the cross product of two
vectors? What is the interpretation of the geometric definition?
Exercise 20. What is the dot product of the vectors A and B if A = (1.9, − 6.3, 0) and B =
( − 4, − 0.34, 11.9)?
Exercise 21. What is the cross product of the vectors in the previous exercise? Write this
cross product in its determinantal form and expand it.
Exercise 22. Define the scalar triple product operation of three vectors geometrically and
algebraically.
Exercise 23. What is the geometric interpretation of the scalar triple product? What is the
condition for this product to be zero?
Exercise 24. Is it necessary to use parentheses in the writing of scalar triple products and
why? Is it possible to interchange the dot and cross symbols in the product?
Exercise 25. Calculate the following scalar triple products:
a⋅(b × c)
a⋅(d × c)
d⋅(c × b)
a⋅(c × b)
(a × b)⋅c
where a = (7, − 0.4, 9.5), b = ( − 12.9, − 11.7, 3.1), c = (2.4, 22.7, − 6.9) and d = ( −
56.4, 29.5, 33.8). Note that some of these products may be found directly from other
products with no need for detailed calculations.
Exercise 26. Write the twelve possibilities of the scalar triple product a⋅(b × c) and
divide them into two sets where the entries in each set are equal. What is the relation
between the two sets?
Exercise 27. What is the vector triple product of three vectors in mathematical terms? Is it
scalar or vector? Is it associative?
Exercise 28. Give the mathematical expression for the nabla ∇ differential operator in
Cartesian systems.
Exercise 29. State the mathematical definition of the gradient of a scalar field f in
Cartesian coordinates. Is it scalar or vector?
Exercise 30. Is the gradient operation commutative, associative or distributive? Express
these properties mathematically.
Exercise 31. What is the relation between the gradient of a scalar field f and the surfaces
of constant f? Make a simple sketch to illustrate this relation.
Exercise 32. Define, mathematically, the divergence of a vector field V in Cartesian
coordinates. Is it scalar or vector?
Exercise 33. Define “solenoidal” vector field descriptively and mathematically.
Exercise 34. What is the physical significance of the divergence of a vector field?
Exercise 35. Is the divergence operation commutative, associative or distributive? Give
your answer in words and in mathematical forms.
Exercise 36. Define the curl of a vector field V in Cartesian coordinates using the
determinantal and the expanded forms with full explanation of all the symbols involved. Is
the curl scalar or vector?
Exercise 37. What is the physical significance of the curl of a vector field?
Exercise 38. What is the technical term used to describe a vector field whose curl
vanishes identically?
Exercise 39. Is the curl operation commutative, associative or distributive? Express these
properties symbolically.
Exercise 40. Describe, in words, the Laplacian operator ∇2 and how it is obtained. What
are the other symbols used to denote it?
Exercise 41. Give the mathematical expression of the Laplacian operator in Cartesian
systems. Using this mathematical expression, explain why the Laplacian is a scalar rather
than a vector operator.
Exercise 42. Can the Laplacian operator act on rank-0, rank-1 and rank-n (n > 1) tensor
fields? If so, what is the rank of the resulting field in each case?
Exercise 43. Write down the mathematical expression for the divergence theorem, defining
all the symbols involved, and explain the meaning of this theorem in words.
Exercise 44. What are the main uses of the divergence theorem in mathematics and
science? Explain why this theorem is very useful theoretically and practically.
Exercise 45. If a vector field is given in the Cartesian coordinates by A = ( − 0.5, 9.3,
6.5), verify the divergence theorem for a cube defined by the plane surfaces x1 = − 1, x2 =
1, y1 = − 1, y2 = 1, z1 = − 1, and z2 = 1.
Exercise 46. Write down the mathematical expression for Stokes theorem with the
definition of all the symbols involved and explain its meaning in words. What is this
theorem useful for? Why is it very useful?
Exercise 47. If a vector field is given in the Cartesian coordinates by A = (2y, − 3x, 1.5z),
verify Stokes theorem for a hemispherical surface x2 + y2 + z2 = 9 for z ≥ 0.
Exercise 48. Make a simple sketch to demonstrate Stokes theorem with sufficient
explanations and definitions of the symbols involved.
Exercise 49. Give concise definitions for the following terms related to matrices: matrix,
square matrix, main diagonal, trailing diagonal, transpose, identity matrix, unit matrix,
singular, trace, determinant, cofactor, and inverse.
Exercise 50. Explain the way by which matrices are indexed.
Exercise 51. How many indices are needed in indexing a 2 × 3 matrix, an n × n matrix, and
an m × k matrix? Explain, in each case, why.
Exercise 52. Does the order of the matrix indices matter? If so, what is the meaning of
changing this order?
Exercise 53. Is it possible to write a vector as a matrix? If so, what is the condition that
should be imposed on the indices and how many forms can a vector have when it is written
as a matrix?
Exercise 54. Write down the following matrices in a standard rectangular array form
(similar to the examples in Eq. 42↑) using conventional symbols for their entries with a
proper indexing: 3 × 4 matrix A, 1 × 5 matrix B, 2 × 2 matrix C, and 3 × 1 matrix D.
Exercise 55. Give detailed mathematical definitions of the determinant, trace and inverse
of matrix, explaining any symbol or technical term involved in these definitions.
Exercise 56. Find the following matrix multiplications: AB, BC, and CB where:
Exercise 57. Referring to the matrices A, B and C in the previous exercise, find all the
permutations (repetitive and non-repetitive) involving two of these three matrices, and
classify them into two groups: those which do represent possible matrix multiplication and
those which do not.
Exercise 58. Is matrix multiplication associative? commutative? distributive over matrix
addition?
Exercise 59. Calculate the trace, the determinant, and the inverse (if the inverse exists) of
the following matrices:
Exercise 60. Which, if any, of the matrices D and E in the previous exercise is singular?
Chapter 2
Tensors
In this chapter, we present the essential terms and definitions related to tensors, the
conventions and notations which are used in their representation, the general rules that
govern their manipulation, and their main types and classifications. We also provide some
illuminating examples of tensors of various complexity as well as an overview of their use
in mathematics, science and engineering.
Transformations can be active, when they change the state of the observed object (e.g.
translating the object in space), or passive, when they keep the state of the object and
instead change the state of the coordinate system from which the object is observed.
Such a distinction is based on an implicit assumption of a more general frame of reference
in the background.
A permutation of a set of objects, which are normally numbers like (1, 2, …, n) or
symbols like (i, j, k), is a particular ordering or arrangement of these objects. An even
permutation is a permutation resulting from an even number of single-step exchanges (also
known as transpositions) of neighboring objects starting from a presumed original
permutation of these objects. Similarly, an odd permutation is a permutation resulting from
an odd number of such exchanges. It has been shown that when a transformation from one
permutation to another can be done in different ways, possibly with different numbers of
exchanges, the parity of all these possible transformations is the same, i.e. all are even or
all are odd, and hence there is no ambiguity in characterizing the transformation from one
permutation to another by the parity alone.
2.6.1 Covariant and Contravariant Tensors
These are the main types of tensor with regard to the rules of their transformation between
different coordinate systems. Covariant tensors are notated with subscript indices (e.g. Ai)
while contravariant tensors are notated with superscript indices (e.g. Aij). A covariant
tensor is transformed according to the following rule:
(65) Āi = (∂xj ⁄ ∂x̄i) Aj
where the barred and unbarred symbols represent the same mathematical object (tensor or
coordinate) in the transformed and original coordinate systems respectively.
An example of covariant tensors is the gradient of a scalar field while an example of
contravariant tensors is the displacement vector. Some tensors of rank > 1 have mixed
variance type, i.e. they are covariant in some indices and contravariant in others. In this
case the covariant variables are indexed with subscripts while the contravariant variables
are indexed with superscripts, e.g. Ai j which is covariant in i and contravariant in j. A
mixed type tensor transforms covariantly in its covariant indices and contravariantly in its
contravariant indices, e.g.
(67) Āi j = (∂xn ⁄ ∂x̄i) (∂x̄j ⁄ ∂xp) An p
We assume that the barred tensor and its coordinates are indexed with ijkl and the unbarred
are indexed with npqr, so we add these indices in their presumed order and position
(lower or upper) paying particular attention to the order in the mixed type:
(69)
Since the barred and unbarred tensors are of the same type, as they represent the same
tensor in two coordinate systems (see Footnote 30 in § 8↓), the indices on the two sides of
the equalities should match in their position and order. We then insert a number of partial
differential operators on the right hand side of the equations equal to the rank of these
tensors, which is 4 in our example. These operators represent the transformation rules for
each pair of corresponding coordinates, one from the barred and one from the unbarred:
(70)
Now we insert the coordinates of the barred system into the partial differential operators
noting that (i) the positions of any index on the two sides should match, i.e. both upper or
both lower, since they are free indices in different terms of tensor equalities, (ii) a
superscript index in the denominator of a partial derivative is in lieu of a covariant index
in the numerator (see Footnote 31 in § 8↓), and (iii) the order of the coordinates should
match the order of the indices in the tensor, that is:
(71)
For consistency, these coordinates should be barred as they belong to the barred tensor;
hence we add bars:
(72)
Finally, we insert the coordinates of the unbarred system into the partial differential
operators noting that (i) the positions of the repeated indices on the same side should be
opposite, i.e. one upper and one lower, since they are dummy indices and hence the
position of the index of the unbarred coordinate should be opposite to its position in the
unbarred tensor, (ii) an upper index in the denominator is in lieu of a lower index in the
numerator, and (iii) the order of the coordinates should match the order of the indices in
the tensor:
(73)
We also replaced the “≗” sign in the final set of equations with the strict equality sign “=”
as the equations are now complete.
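For illustration, assuming the example tensor is contravariant in ij and covariant in kl (an assumption about the equations above, which may also list the purely covariant and purely contravariant cases), the completed transformation produced by this procedure reads:
Āij kl = (∂x̄i ⁄ ∂xn) (∂x̄j ⁄ ∂xp) (∂xq ⁄ ∂x̄k) (∂xr ⁄ ∂x̄l) Anp qr
where ij and np are the contravariant indices, kl and qr are the covariant indices, and the dummy indices npqr imply summation.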
The covariant and contravariant types of a tensor are linked through the metric tensor, as
will be detailed later in the book (refer to § 6↓). As indicated before, for orthonormal
Cartesian systems there is no difference between covariant and contravariant tensors, and
hence the indices can be upper or lower although it is common to use lower indices in this
case.
A tensor of m contravariant indices and n covariant indices may be called a type (m, n)
tensor. When one or both variance types are absent, zero is used to refer to the absent
variance type in this notation. Accordingly, Aij k is a type (1, 2) tensor, Bik is a type (2, 0)
tensor, Cm is a type (0, 1) tensor, and Dpqr st is a type (2, 3) tensor.
The vectors providing the basis set for a coordinate system are of covariant type when
they are tangent to the coordinate axes, and they are of contravariant type when they are
perpendicular to the local surfaces of constant coordinates. These two sets, like the
tensors themselves, are identical for orthonormal Cartesian systems.
Formally, the covariant and contravariant basis vectors are given respectively by:
(74) Ei = ∂r ⁄ ∂xi,  Ei = ∇xi
where r = xiei is the position vector in Cartesian coordinates and xi is a general curvilinear
coordinate. As before, a superscript in the denominator of partial derivatives is
equivalent to a subscript in the numerator. It should be remarked that in general the basis
vectors (whether covariant or contravariant) are not necessarily of unit length and/or
mutually orthogonal although they may be so (see Footnote 32 in § 8↓).
The two sets of covariant and contravariant basis vectors are reciprocal systems and hence
they satisfy the following reciprocity relation:
(75) Ei⋅Ej = δi j
where δi j is the Kronecker delta (refer to § 4.1↓) which can be represented by the unity
matrix (see Special Matrices in § 1.3.3↑). The reciprocity of these two sets of basis
vectors is illustrated schematically in Figure 14↓ for the case of a 2D space.
Figure 14 The reciprocity relation between the covariant and contravariant basis vectors
in a 2D space where E1 ⊥ E2, E2 ⊥ E1, and |E1||E1| cosφ = |E2||E2| cosφ = 1.
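The reciprocity relation is easy to verify for a concrete system. The following minimal Python sketch uses a hypothetical oblique 2D system defined by x = u + v and y = v (so that u = x − y and v = y):

import numpy as np

# Covariant basis vectors: tangents to the coordinate curves.
E1 = np.array([1.0, 0.0])      # E_1 = dr/du
E2 = np.array([1.0, 1.0])      # E_2 = dr/dv
# Contravariant basis vectors: gradients of the coordinates.
E1c = np.array([1.0, -1.0])    # E^1 = grad u
E2c = np.array([0.0, 1.0])     # E^2 = grad v

for i, Ec in enumerate((E1c, E2c), start=1):
    for j, Ecov in enumerate((E1, E2), start=1):
        print(i, j, np.dot(Ec, Ecov))    # 1 when i = j, 0 otherwise (Eq. 75)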
2.6.2 True and Pseudo Tensors
These are also called polar and axial tensors respectively although it is more common to
use these terms for vectors. Pseudo tensors may also be called tensor densities (see
Footnote 33 in § 8↓). True tensors are proper or ordinary tensors and hence they are
invariant under coordinate transformations, while pseudo tensors are not proper tensors
since they do not transform invariantly as they acquire a minus sign under improper
orthogonal transformations which involve inversion of coordinate axes through the origin
of coordinates with a change of system handedness.
Figure 16↓ demonstrates the behavior of a true vector v and a pseudo vector p where the
former keeps its direction following a reflection of the coordinate system through the
origin of coordinates while the latter reverses its direction following this operation.
Figure 16 The behavior of a true vector (v and V) and a pseudo vector (p and P) on
reflecting the coordinate system in the origin of coordinates. The lower case symbols stand
for the objects in the original system while the upper case symbols stand for the same
objects in the reflected system.
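This behavior can be mimicked numerically. In the following minimal Python sketch (with arbitrary sample vectors), reflecting the two true vectors through the origin reverses each of them but leaves their cross product, a pseudo vector, unchanged:

import numpy as np

a = np.array([1.0, 2.0, 3.0])       # a true (polar) vector
b = np.array([-4.0, 0.5, 2.0])      # another true vector
p = np.cross(a, b)                   # a pseudo (axial) vector

# Reflection through the origin sends every true vector v to -v.
p_after = np.cross(-a, -b)           # pseudo vector rebuilt from the reflected vectors
print(np.array_equal(p_after, p))    # True: p does not reverse its direction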
Because true and pseudo tensors have different mathematical properties and represent
different types of physical entities, the terms of consistent tensor expressions and
equations should be uniform in their true and pseudo type, i.e. all terms are true or all are
pseudo.
The direct product (refer to § 3.3↓) of true tensors is a true tensor. The direct product of
an even number of pseudo tensors is a true tensor, while the direct product of an odd
number of pseudo tensors is a pseudo tensor. The direct product of a mix of true and
pseudo tensors is a true or pseudo tensor depending on whether the number of pseudo
tensors involved in the product is even or odd respectively.
Similar rules to those of the direct product apply to the cross product, including the curl
operation, involving tensors (which are usually of rank-1) with the addition of a pseudo
factor for each cross product operation. This factor is contributed by the permutation
tensor ε which is implicit in the definition of the cross product (see Eqs. 173↓ and 192↓).
As we will see in § 4.2↓, the permutation tensor is a pseudo tensor.
In summary, what determines the tensor type (true or pseudo) of the tensor terms involving
direct (see Footnote 34 in § 8↓) and cross products is the parity of the multiplicative
factors of pseudo type plus the number of cross product operations involved since each
cross product operation contributes an ε factor.
Examples of true scalars are temperature, mass and the dot product of two polar or two
axial vectors, while examples of pseudo scalars are the dot product of an axial vector and
a polar vector and the scalar triple product of polar vectors. Examples of polar vectors
are displacement and acceleration, while examples of axial vectors are angular velocity
and cross product of polar vectors in general, including the curl operation on polar vectors,
due to the involvement of the permutation symbol ε which is a pseudo tensor as stated
already. As indicated before, the essence of the distinction between true (i.e. polar) and
pseudo (i.e. axial) vectors is that the direction of a pseudo vector depends on the observer
choice of the handedness of the coordinate system whereas the direction of a true vector is
independent of such a choice.
Examples of true tensors of rank-2 are stress and rate of strain tensors, while examples
of pseudo tensors of rank-2 are direct products of two vectors: one polar and one axial.
Examples of true tensors of higher ranks are piezoelectric moduli tensor (rank-3) and
elasticity tensor (rank-4), while examples of pseudo tensors of higher ranks are the
permutation tensor of these ranks.
2.6.3 Absolute and Relative Tensors
A relative tensor of weight w acquires, in its transformation between coordinate systems,
an extra multiplicative factor |∂x ⁄ ∂x̃|w, where |∂x ⁄ ∂x̃| is the Jacobian of the
transformation between the two systems (see Footnote 35 in § 8↓). When w = 0 the tensor
is described as an absolute or true tensor, and
when w ≠ 0 the tensor is described as a relative tensor. When w = − 1 the tensor may be
described as a pseudo tensor, while when w = 1 the tensor may be described as a tensor
density (see Footnote 36 in § 8↓). As indicated earlier, a tensor of m contravariant indices
and n covariant indices may be described as a tensor of type (m, n). This may be extended
to include the weight w as a third entry and hence the type of the tensor is identified by
(m, n, w).
Relative tensors can be added and subtracted (see § 3.1↓) if they are of the same variance
type and have the same weight (see Footnote 37 in § 8↓); the result is a tensor of the same
type and weight. Also, relative tensors can be equated if they are of the same type and
weight. Multiplication of relative tensors produces a relative tensor whose weight is the
sum of the weights of the original tensors. Hence, if the weights add up to a non-zero
value the result is a relative tensor of that weight; otherwise it is an absolute tensor.
2.6.5 Symmetric and Anti-symmetric Tensors
These types of tensor apply to high ranks only (rank ≥ 2) (see Footnote 39 in § 8↓).
Moreover, these types are not exhaustive, even for tensors of rank ≥ 2, as there are high-
rank tensors which are neither symmetric nor anti-symmetric. A rank-2 tensor Aij is
symmetric iff for all i and j the following condition is satisfied:
(79) Aji = Aij
and anti-symmetric or skew-symmetric iff for all i and j the following condition is
satisfied:
(80) Aji = − Aij
Similar conditions apply to contravariant type tensors (refer also to the following).
A rank-n tensor Ai1…in is symmetric in its two indices ij and il iff the following condition
applies identically:
(81) Ai1…il…ij…in = Ai1…ij…il…in
and anti-symmetric in its two indices ij and il iff the following condition applies
identically:
(82) Ai1…il…ij…in = − Ai1…ij…il…in
Any rank-2 tensor Aij can be synthesized from (or decomposed into) a symmetric part
A(ij), which is marked with round brackets enclosing the indices, and an anti-symmetric
part A[ij], which is marked with square brackets, where the following relations apply:
(83) Aij = A(ij) + A[ij]
(84) A(ij) = (Aij + Aji) ⁄ 2
(85) A[ij] = (Aij − Aji) ⁄ 2
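The following minimal Python sketch (with an arbitrary sample matrix) verifies this decomposition:

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

S = (A + A.T) / 2    # symmetric part A(ij) (Eq. 84)
N = (A - A.T) / 2    # anti-symmetric part A[ij] (Eq. 85)
print(np.allclose(A, S + N))                       # Eq. 83: True
print(np.allclose(S, S.T), np.allclose(N, -N.T))   # True True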
Similarly, a rank-3 tensor Aijk can be symmetrized by the following relation:
(86) A(ijk) = (Aijk + Akij + Ajki + Aikj + Ajik + Akji) ⁄ 3!
and anti-symmetrized by the following relation:
(87) A[ijk] = (Aijk + Akij + Ajki − Aikj − Ajik − Akji) ⁄ 3!
More generally, a rank-n tensor Ai1…in can be symmetrized by:
(88) A(i1…in) = (sum of all even and odd permutations of the indices i1…in) ⁄ n!
and anti-symmetrized by:
(89) A[i1…in] = (sum of all even permutations minus sum of all odd permutations) ⁄ n!
A tensor of high rank ( > 2) may be symmetrized or anti-symmetrized with respect to only
some of its indices instead of all of its indices. For example, in the following the tensor A
is symmetrized and anti-symmetrized only with respect to its first two indices:
(90) A(ij)k = (Aijk + Ajik) ⁄ 2
(91) A[ij]k = (Aijk − Ajik) ⁄ 2
A tensor is described as totally symmetric iff it is symmetric with respect to all of its
indices, that is:
(92) Ai1…in = A(i1…in)
and totally anti-symmetric iff it is anti-symmetric in all of its indices, that is:
(93) Ai1…in = A[i1…in]
For a totally anti-symmetric tensor, non-zero entries can occur only when all the indices
are different.
It should be remarked that the indices whose exchange defines the symmetry and anti-
symmetry relations should be of the same variance type, i.e. both upper or both lower.
Another important remark is that the symmetry and anti-symmetry characteristic of a tensor
is invariant under coordinate transformations. Hence, a symmetric/anti-symmetric tensor in
one coordinate system is symmetric/anti-symmetric in all other coordinate systems.
Similarly, a tensor which is neither symmetric nor anti-symmetric in one coordinate system
remains so in all other coordinate systems (see Footnote 40 in § 8↓).
Finally, for a symmetric tensor Aij and an anti-symmetric tensor Bij (or the other way
around) we have the following useful and widely used identity:
(94) AijBij = 0
This is because an exchange of the two indices changes the sign of one tensor only, and
hence each term in the sum is paired with its own negation, making the sum identically
zero.
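In explicit terms, relabeling the dummy indices i and j (which is always permissible) and then using the symmetry of A and the anti-symmetry of B gives:
AijBij = AjiBji = Aij( − Bij) = − AijBij
and hence 2AijBij = 0, which is Eq. 94↑.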
2.7 Exercises
Exercise 1. Make a sketch of a rank-2 tensor Aij in a 4D space similar to Figure 11↑. What
does this tensor look like?
Exercise 2. What are the two main types of notation used for labeling tensors? State two
names for each.
Exercise 3. Make a detailed comparison between the two types of notation in the previous
question stating any advantages or disadvantages in using one of these notations or the
other. In which context is each one of these notations more appropriate to use than the
other?
Exercise 4. What is the principle of invariance of tensors and why is it one of the main
reasons for the use of tensors in science?
Exercise 5. What are the two different meanings of the term “covariant” in tensor calculus?
Exercise 6. State the type of each one of the following tensors considering the number and
position of indices (i.e. covariant, contravariant, rank, scalar, vector, etc.):
ai
Bi jk
f
bk
Cji
Exercise 7. Define the following technical terms which are related to tensors: term,
expression, equality, order, rank, zero tensor, unit tensor, free index, dummy index,
covariant, contravariant, and mixed.
Exercise 8. Which of the following is a scalar, vector or rank-2 tensor: temperature, stress,
cross product of two vectors, dot product of two vectors, and rate of strain?
Exercise 9. What is the number of entries of a rank-0 tensor in a 2D space and in a 5D
space? What is the number of entries of a rank-1 tensor in these spaces?
Exercise 10. What is the difference between the order and rank of a tensor considering the
different conventions in this regard?
Exercise 11. What is the number of entries of a rank-3 tensor in a 4D space? What is the
number of entries of a rank-4 tensor in a 3D space?
Exercise 12. Describe direct and inverse coordinate transformations between spaces and
write the generic equations for these transformations.
Exercise 13. What are proper and improper transformations? Draw a simple sketch to
demonstrate them.
Exercise 14. Define the following terms related to permutation of indices: permutation,
even, odd, parity, and transposition.
Exercise 15. Find all the permutations of the following four letters assuming no repetition:
(i, j, k, l).
Exercise 16. Give three even permutations and three odd permutations of the symbols (α,
β, γ, δ) in the stated order.
Exercise 17. Discuss all the similarities and differences between free and dummy indices.
Exercise 18. What is the maximum number of repetitive indices that can occur in each term
of a legitimate tensor expression?
Exercise 19. How many components are represented by each one of the following
assuming a 4D space?
Ai jk
f+g
Cmn − Dnm
5Dk + 4Ak = Bk
Exercise 20. What is the “summation convention”? To what type of indices does this
convention apply?
Exercise 21. Is it always the case that the summation convention applies when an index is
repeated? If not, what precaution should be taken to avoid ambiguity and confusion?
Exercise 22. In which cases should a pair of dummy indices be of different variance type
(i.e. one upper and one lower)? In what type of coordinate system can these repeated
indices be of the same variance type and why?
Exercise 23. What are the rules that the free indices should obey when they occur in the
terms of tensor expressions and equalities?
Exercise 24. What is illegitimate about the following tensor expressions and equalities
considering in your answer all the possible violations?
Aij + Bij k
Cn − Dn = Bm
Aij = Aji
Aj = f
Exercise 25. Which of the following tensor expressions and equalities is legitimate and
which is illegitimate?
Bi + Cijj
Ai − Bki
Cm + Dm = Bmmm
Bik = Aik
State in each illegitimate case all the reasons for illegitimacy.
Exercise 26. Which is right and which is wrong of the following tensor equalities?
∂nAn = An∂n
[B]k + [D]k = [B + D]k
ab = ba
AijMkl = MklAji
Explain in each case why the equality is right or wrong.
Exercise 27. Give at least two examples of tensors used in mathematics, science and
engineering for each one of the following ranks: 0, 1, 2 and 3.
Exercise 28. State the special names given to the rank-0 and rank-1 tensors.
Exercise 29. What is the difference, if any, between rank-2 tensors and matrices?
Exercise 30. Is the following statement correct? If not, re-write it correctly: “all rank-0
tensors are vectors and vice versa, and all rank-1 tensors are scalars and vice versa”.
Exercise 31. Give clear and detailed definitions of scalars and vectors and compare them.
What is common and what is different between the two?
Exercise 32. Make a simple sketch of the nine unit dyads associated with the double
directions of rank-2 tensors in a 3D space.
Exercise 33. Name three of the scientific disciplines that heavily rely on tensor calculus
notation and techniques.
Exercise 34. What are the main features of tensor calculus that make it very useful and
successful in mathematical, scientific and engineering applications?
Exercise 35. Why is tensor calculus used in the formulation and presentation of the laws of
physics?
Exercise 36. Give concise definitions for the covariant and contravariant types of tensor.
Exercise 37. Describe how the covariant and contravariant types are notated and how they
differ in their transformation between coordinate systems.
Exercise 38. Give examples of tensors used in mathematics and science which are
covariant and other examples which are contravariant.
Exercise 39. Write the mathematical transformation rules of the following tensors: Aijk to
Ãrst and Bmn to B̃ pq.
Exercise 40. Explain how mixed type tensors are defined and notated in tensor calculus.
Exercise 41. Write the mathematical rule for transforming the mixed type tensor Dij klm to
D̃ pq rst.
Exercise 42. Express the following tensors in indicial notation: a rank-3 covariant tensor
A, a rank-4 contravariant tensor B, a rank-5 mixed type tensor C which is covariant in ij
indices and contravariant in kmn indices where the indices are ordered as ikmnj.
Exercise 43. Write step-by-step, similar to the detailed example given in § 2.6.1↑, the
mathematical transformations of the following tensors: Aij to Ãrs, Blmn to B̃ pqr, Cij mn to
C̃ pq rs and Dm kl to D̃ r st.
Exercise 44. What is the relation between the rank and the (m, n) type of a tensor?
Exercise 45. Write, in indicial notation, the following tensors: A of type (0, 4), B of type
(3, 1), C of type (0, 0), D of type (3, 4), E of type (2, 0) and F of type (1, 1).
Exercise 46. What is the rank of each one of the tensors in the previous question? Are
there tensors among them which may not have been notated properly?
Exercise 47. Which tensor provides the link between the covariant and contravariant types
of a given tensor D?
Exercise 48. In what coordinate system(s) do the covariant and contravariant types of a
tensor not differ? What is the usual tensor notation used in this case?
Exercise 49. Define in detail, qualitatively and mathematically, the covariant and
contravariant types of the basis vectors of a general coordinate system explaining all the
symbols used in your definition.
Exercise 50. Is it necessary that the basis vectors of the previous exercise are mutually
orthogonal and/or of unit length?
Exercise 51. Is the following statement correct? “A superscript in the denominator of
partial derivatives is equivalent to a superscript in the numerator”. Explain why.
Exercise 52. What is the reciprocity relation that links the covariant and contravariant
basis vectors? Express this relation mathematically.
Exercise 53. What is the interpretation of the reciprocity relation (refer to Figure 14↑ in
your explanation)?
Exercise 54. Do the covariant and contravariant forms of a specific tensor A represent the
same mathematical object? If so, in what sense are they equal from the perspective of
different coordinate systems?
Exercise 55. Correct, if necessary, the following statement: “A tensor of any rank ( ≥ 1)
can be represented covariantly using contravariant basis tensors of that rank, or
contravariantly using contravariant basis tensors, or in a mixed form using a mixed basis of
the same type”.
Exercise 56. Make corrections, if needed, to the following equations assuming a general
curvilinear coordinate system where, in each case, all the possible ways of correction
should be considered:
B = B^i E_i
M = M^{ij} E_i
D = D^i E_i E_j
C = C^i E_j
F = F^n E_n
T = T^{rs} E_s E_r
Exercise 57. What is the technical term used to label the following objects: E_i E_j, E^i E^j,
E_i E^j and E^i E_j? What do they mean?
Exercise 58. What sort of tensor components should the objects in the previous question
be associated with?
Exercise 59. What is the difference between true and pseudo vectors? Which of these is
called axial and which is called polar?
Exercise 60. Make a sketch demonstrating the behavior of true and pseudo vectors.
Exercise 61. Is the following statement correct? “The terms of tensor expressions and
equations should be uniform in their true and pseudo type”. Explain why.
Exercise 62. There are four possibilities for the direct product of two tensors of true and
pseudo types. Discuss all these possibilities with respect to the type of the tensor produced
by this operation and if it is true or pseudo. Also discuss in detail the cross product and
curl operations from this perspective.
Exercise 63. Give examples for the true and pseudo types of scalars, vectors and rank-2
tensors.
Exercise 64. Explain, in words and equations, the meaning of absolute and relative
tensors. Do these intersect in some cases with true and pseudo tensors (at least according
to some conventions)?
Exercise 65. What do “Jacobian” and “weight” mean in the context of absolute and relative
tensors?
Exercise 66. Someone stated: “A is a tensor of type (2, 4, − 1)”. What do these three
numbers refer to?
Exercise 67. What is the type of the tensor in the previous exercise from the perspectives
of lower and upper indices and absolute and relative tensors? What is the rank of this
tensor?
Exercise 68. What is the weight of a tensor A produced from multiplying a tensor of
weight − 1 by a tensor of weight 2? Is A relative or absolute? Is it true or not?
Exercise 69. Define isotropic and anisotropic tensors and give examples for each using
tensors of different ranks.
Exercise 70. What is the state of the inner and outer products of two isotropic tensors?
Exercise 71. Why should a tensor equation that is valid in a particular coordinate system
also be valid in all other coordinate systems under admissible coordinate transformations?
Use the isotropy of the zero tensor in your explanation.
Exercise 72. Define “symmetric” and “anti-symmetric” tensors and write the mathematical
condition that applies to each assuming a rank-2 tensor.
Exercise 73. Do we have symmetric/anti-symmetric scalars or vectors? If not, why?
Exercise 74. Is it the case that any tensor of rank > 1 should be either symmetric or anti-
symmetric?
Exercise 75. Give an example, writing all the components in numbers or symbols, of a
symmetric tensor of rank-2 in a 3D space. Do the same for an anti-symmetric tensor of the
same rank.
Exercise 76. Give, if possible, an example of a rank-2 tensor which is neither symmetric
nor anti-symmetric assuming a 4D space.
Exercise 77. Is it true that any rank-2 tensor can be decomposed into a symmetric part and
an anti-symmetric part? If so, write down the mathematical expressions representing these
parts in terms of the original tensor. Is this also true for a general rank-n tensor?
Exercise 78. What is the meaning of the round and square brackets which are used to
contain indices in the indexed symbol of a tensor (e.g. A(ij) and B[km]n)?
Exercise 79. Can the indices of symmetry/anti-symmetry be of different variance type?
Exercise 80. Is it possible that a rank-n (n > 2) tensor is symmetric/anti-symmetric with
respect to some, but not all, of its indices? If so, give an example of a rank-3 tensor which
is symmetric or anti-symmetric with respect to only two of its indices.
Exercise 81. For a rank-3 covariant tensor A_{ijk}, how many possibilities of symmetry and
anti-symmetry do we have? Consider in your answer total, as well as partial, symmetry and
anti-symmetry. Is there another possibility (i.e. the tensor is neither symmetric nor anti-
symmetric with respect to any pair of its indices)?
Exercise 82. Can a tensor be symmetric with respect to some combinations of its indices
and anti-symmetric with respect to other combinations? If so, can you give a simple
example of such a tensor?
Exercise 83. Repeat the previous exercise considering the additional possibility that the
tensor is neither symmetric nor anti-symmetric with respect to another set of indices, i.e. it
is symmetric, anti-symmetric and neither with respect to different sets of indices (see
Footnote 41 in § 8↓).
Exercise 84. A is a rank-3 totally symmetric tensor and B is a rank-3 totally anti-
symmetric tensor. Write all the mathematical conditions that these tensors satisfy.
Exercise 85. Justify the following statement: “For a totally anti-symmetric tensor, non-zero
entries can occur only when all the indices are different”. Use mathematical, as well as
descriptive, language in your answer.
Exercise 86. For a totally anti-symmetric tensor Bijk in a 3D space, write all the elements
of this tensor which are identically zero. Consider the possibility that it may be easier to
find first the elements which are not identically zero, then exclude the rest (see Footnote
42 in § 8↓).
Chapter 3
Tensor Operations
There are various operations that can be performed on tensors to produce other tensors in
general. Examples of these operations are addition/subtraction, multiplication by a scalar
(rank-0 tensor), multiplication of tensors (each of rank > 0), contraction and permutation.
Some of these operations, such as addition and multiplication, involve more than one
tensor while others, such as contraction and permutation, are performed on a single
tensor. In this chapter we provide a glimpse of the main elementary tensor operations of
algebraic nature that permeate tensor algebra and calculus.
First, we should remark that the last section of this chapter, which is about the quotient rule
for testing tensors, is added to this chapter because it is the most appropriate place for it in
the present book considering the dependency of the definition of this rule on other tensor
operations; otherwise the section is not about a tensor operation in the same sense as the
operations presented in the other sections of this chapter. Another remark is that in tensor
algebra division is allowed only for scalars, hence if the components of an indexed tensor
should appear in a denominator, the tensor should be redefined to avoid this, e.g. B_i =
1 ⁄ A_i.
However, in the present book no symbol is being used for the operation of direct
multiplication and hence the operation is symbolized by putting the symbols of the tensors
side by side, e.g. AB where A and B are non-scalar tensors. In this regard, the reader
should be vigilant to avoid confusion with the operation of matrix multiplication which,
according to the notation of matrix algebra, is also symbolized as AB where A and B are
matrices of compatible dimensions, since matrix multiplication is an inner product, rather
than an outer product, operation.
The direct multiplication of tensors is not commutative in general as indicated above;
however it is distributive with respect to the algebraic sum of tensors, that is (see
Footnote 46 in § 8↓):
(102) AB ≠ BA
(103) A(B±C) = AB±AC and (B±C)A = BA±CA
As indicated before, the rank-2 tensor constructed by the direct multiplication of two
vectors is commonly called dyad. Tensors may be expressed as an outer product of vectors
where the rank of the resultant product is equal to the number of the vectors involved, e.g. 2
for dyads and 3 for triads. However, not every tensor can be synthesized as a product of
lower rank tensors. Multiplication of a tensor by a scalar (refer to § 3.2↑) may be regarded
as a special case of direct multiplication.
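As a concrete sketch of the direct multiplication operation (Python with numpy is assumed throughout these illustrative snippets; the book itself contains no code, so this is an illustration rather than the author's method), the following builds a dyad and checks Eq. 102↑:
```python
# A minimal numpy sketch: arrays stand for Cartesian tensor components.
import numpy as np

A = np.array([1.0, 2.0, 3.0])          # rank-1 tensor (vector)
B = np.array([4.0, 5.0, 6.0])          # rank-1 tensor (vector)

AB = np.einsum('i,j->ij', A, B)        # dyad [AB]_ij = A_i B_j
BA = np.einsum('i,j->ij', B, A)        # dyad [BA]_ij = B_i A_j

print(np.allclose(AB, BA))             # False: AB != BA in general (Eq. 102)
print(np.allclose(AB, BA.T))           # True: [AB]_ij = [BA]_ji
```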
3.4 Contraction
The contraction operation of a tensor of rank > 1 is to make two free indices identical, by
unifying their symbols, and perform summation over these repeated indices, e.g.
(104) A^j_i (contraction) A^i_i
(105) A^{jk}_{il} (contraction on jl) A^{mk}_{im}
Contraction results in a reduction of the rank by 2 since it implies the annihilation of two
free indices. Therefore, the contraction of a rank-2 tensor is a scalar, the contraction of a
rank-3 tensor is a vector, the contraction of a rank-4 tensor is a rank-2 tensor, and so on.
For general non-Cartesian coordinate systems, the pair of contracted indices should be
different in their variance type, i.e. one upper and one lower. Hence, contraction of a
mixed tensor of type (m, n) will, in general, produce a tensor of type (m − 1, n − 1). A
tensor of type (p, q) can, therefore, have p × q possible contractions, i.e. one contraction
for each combination of lower and upper indices.
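A small sketch of the contraction operation using numpy's einsum, where repeating an index symbol performs the summation; the rank-4 example mirrors Eq. 105↑ with Cartesian components assumed (so variance type can be ignored):
```python
import numpy as np

A = np.arange(16.0).reshape(2, 2, 2, 2)     # rank-4 tensor, axes ordered (j, k, i, l)

# Contract the first and last indices (j with l): rank 4 -> rank 2.
C = np.einsum('mkim->ki', A)
print(C.shape)                              # (2, 2)

# Contracting a rank-2 tensor gives a scalar: the trace.
M = np.arange(9.0).reshape(3, 3)
print(np.einsum('ii->', M), np.trace(M))    # identical results
```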
A common example of contraction is the dot product operation on vectors (see Dot
Product of Vectors in § 1.3.2↑) which can be regarded as a direct multiplication (refer to
§ 3.3↑) of the two vectors, which results in a rank-2 tensor, followed by a contraction.
Also, in matrix algebra, taking the trace of a square matrix, by summing its diagonal
elements, can be considered as a contraction operation on the rank-2 tensor represented by
the matrix, and hence it yields the trace which is a scalar.
Conducting a contraction operation on a tensor results in a tensor. Similarly, the
application of a contraction operation on a relative tensor (see § 2.6.3↑) produces a
relative tensor of the same weight as the original tensor.
3.6 Permutation
A tensor may be obtained by exchanging the indices of another tensor. For example, A^i_{kj}
is a permutation of the tensor A^i_{jk}. A common example of the permutation operation of
tensors is the transposition of a matrix (refer to Special Matrices in § 1.3.3↑) representing
a rank-2 tensor since the first and second indices, which represent the rows and columns of
the matrix, are exchanged in this operation.
It is obvious that tensor permutation applies only to tensors of rank > 1 since no exchange
of indices can occur on a scalar with no index or on a vector with a single index. The
collection of tensors obtained by permuting the indices of a reference tensor may be called
isomers.
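As a small illustrative sketch (numpy assumed), index permutation amounts to permuting the axes of the component array; for a rank-2 tensor this is matrix transposition:
```python
import numpy as np

A = np.arange(27.0).reshape(3, 3, 3)            # rank-3 tensor A_ijk

A_ikj = np.swapaxes(A, 1, 2)                    # isomer A_ikj (exchange j and k)
A_kji = np.transpose(A, (2, 1, 0))              # isomer A_kji

M = np.arange(9.0).reshape(3, 3)
print(np.allclose(np.swapaxes(M, 0, 1), M.T))   # True: transposition of a matrix
```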
3.8 Exercises
Exercise 1. Give preliminary definitions of the following tensor operations: addition,
multiplication by a scalar, tensor multiplication, contraction, inner product and
permutation. Which of these operations involve a single tensor?
Exercise 2. Give typical examples of addition/subtraction for rank-n (0 ≤ n ≤ 3) tensors.
Exercise 3. Is it possible to add two tensors of different ranks or different variance types?
Is addition of tensors associative or commutative?
Exercise 4. Discuss, in detail, the operation of multiplication of a tensor by a scalar and
compare it to the operation of tensor multiplication. Can we regard multiplying two scalars
as an example of multiplying a tensor by a scalar?
Exercise 5. What is the meaning of the term “outer product” and what are the other terms
used to label this operation?
Exercise 6. C is a tensor of rank 3 and D is a tensor of rank 2; what is the rank of their
outer product CD? What is the rank of CD if it is subjected subsequently to a double
contraction operation?
Exercise 7. A is a tensor of type (m, n) and B is a tensor of type (s, t); what is the type of
their direct product AB?
Exercise 8. Discuss the operations of dot and cross product of two vectors (see Dot
Product of Vectors in § 1.3.2↑ and Cross Product of Vectors in 1.3.2↑) from the
perspective of the outer product operation of tensors.
Exercise 9. Are the following two statements correct (make corrections if necessary)?
“The outer multiplication of tensors is commutative but not distributive over sum of
tensors” and “The outer multiplication of two tensors may produce a scalar”.
Exercise 10. What is the contraction of a tensor? How many free indices are consumed in
a single contraction operation?
Exercise 11. Is it possible that the contracted indices are of the same variance type? If so,
what is the condition that should be satisfied for this to happen?
Exercise 12. A is a tensor of type (m, n) where m, n > 1; what is its type after two
contraction operations assuming a general coordinate system?
Exercise 13. Does the contraction operation change the weight of a relative tensor?
Exercise 14. Explain how the operation of multiplication of two matrices, as defined in
linear algebra, involves a contraction operation. What is the rank of each matrix and what
is the rank of the product? Is this consistent with the rule of reduction of rank by
contraction?
Exercise 15. Explain, in detail, the operation of inner product of two tensors and how it is
related to the operations of contraction and outer product of tensors.
Exercise 16. What is the rank and type of a tensor resulting from an inner product
operation of a tensor of type (m, n) with a tensor of type (s, t)? How many possibilities do
we have for this inner product considering the different possibilities of the embedded
contraction operation?
Exercise 17. Give an example of a commutative inner product of two tensors and another
example of a non-commutative inner product.
Exercise 18. Is the inner product operation distributive over algebraic addition of tensors?
Exercise 19. Give an example from matrix algebra of inner product of tensors explaining
in detail how the two are related.
Exercise 20. Discuss specialized types of inner product operations that involve more than
one contraction operation focusing in particular on the operations A:B and A⋅⋅B where A
and B are two tensors of rank > 1.
Exercise 21. A double inner product operation is conducted on a tensor of type (1, 1) with
a tensor of type (1, 2). How many possibilities do we have for this operation? What is the
rank and type of the resulting tensor? Is it covariant, contravariant or mixed?
Exercise 22. Assess the following statement considering the two meanings of the word
“tensor” related to the rank: “Inner product operation of two tensors does not necessarily
produce a tensor”. Can this statement be correct in a sense and wrong in another?
Exercise 23. What is the operation of tensor permutation and how is it related to the
operation of transposition of matrices?
Exercise 24. Is it possible to permute scalars or vectors and why?
Exercise 25. What is the meaning of the term “isomers”?
Exercise 26. Describe in detail the quotient rule and how it is used as a test for tensors.
Exercise 27. Why is the quotient rule used instead of the standard transformation equations
of tensors?
Chapter 4
δ and ε Tensors
where n is the space dimension, and hence it can be considered as the identity matrix. For
example, in a 3D space the Kronecker δ tensor is given by:
(116) [δ_ij] = [1 0 0; 0 1 0; 0 0 1]
The components of the covariant, contravariant and mixed types of this tensor are the same,
that is:
(117) δ_{ij} = δ^{ij} = δ_i^j = δ^i_j
The Kronecker δ tensor is symmetric, that is:
(118) δ_{ij} = δ_{ji}
(119) δ^{ij} = δ^{ji}
where i, j = 1, 2, …, n. Moreover, it is conserved (see Footnote 55 in § 8↓) under all
proper and improper coordinate transformations. Since it is conserved under proper
transformations, it is an isotropic tensor (see Footnote 56 in § 8↓).
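As a quick sketch (numpy assumed), the Kronecker δ is the identity array, and contracting it with another tensor merely replaces an index:
```python
import numpy as np

delta = np.eye(3)                          # delta_ij in a 3D space (Eq. 116)

A = np.array([1.0, 2.0, 3.0])
print(np.allclose(np.einsum('ij,j->i', delta, A), A))   # delta_ij A_j = A_i

# delta_ij delta_jk = delta_ik, and the full self-contraction gives the dimension:
print(np.allclose(delta @ delta, delta))
print(np.einsum('ii->', delta))            # = n = 3
```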
Figure 17↓ is a graphical illustration of the rank-3 permutation tensor εijk while Figure 18↓,
which may be used as a mnemonic device, demonstrates the cyclic nature of the three even
permutations of the indices of the rank-3 permutation tensor and the three odd permutations
of these indices assuming no repetition in indices. The three permutations in each case are
obtained by starting from a given number in the cycle and rotating in the given direction to
obtain the other two numbers in the permutation.
Figure 17 Graphical illustration of the rank-3 permutation tensor εijk where circular
nodes represent 0, square nodes represent 1 and triangular nodes represent − 1.
Figure 18 Graphical demonstration of the cyclic nature of the even and odd permutations
of the indices of the rank-3 permutation tensor assuming no repetition in indices.
The definition of the rank-n permutation tensor (i.e. ε_{i1 i2…in}) is similar to the definition of
the rank-3 permutation tensor with regard to the repetition in its indices (i1, i2, ⋯, in) and
being even or odd permutations in their correspondence to (1, 2, ⋯, n), that is:
(122) ε_{i1 i2…in} = + 1 if (i1, i2, …, in) is an even permutation of (1, 2, …, n), − 1 if it is an odd permutation, and 0 if any index is repeated
As well as the inductive definition of the permutation tensor (as given by Eqs. 120↑, 121↑
and 122↑), the permutation tensor of any rank can also be defined analytically where the
entries of the tensor are calculated from closed form formulae. The entries of the rank-2
permutation tensor can be calculated from the following closed form equation:
(123) ε_{ij} = (j − i)
Similarly, for the rank-3 permutation tensor we have:
(124) ε_{ijk} = (1 ⁄ 2)(j − i)(k − i)(k − j)
while for the rank-4 permutation tensor we have:
(125) ε_{ijkl} = (1 ⁄ 12)(j − i)(k − i)(l − i)(k − j)(l − j)(l − k)
More generally, the entries of the rank-n permutation tensor can be obtained from the
following identity:
(126) ε_{i1 i2…in} = (1 ⁄ S(n − 1)) ∏_{1≤j<k≤n} (i_k − i_j)
where S(n − 1) is the super factorial function of the argument (n − 1) which is defined by:
(127) S(n) = ∏_{k=1}^{n} k! = 1! × 2! × ⋯ × n!
A simpler formula for calculating the entries of the rank-n permutation tensor can be
obtained from the previous one by dropping the magnitude of the multiplication factors and
taking their signs only, that is:
(128) ε_{i1 i2…in} = ∏_{1≤j<k≤n} sgn(i_k − i_j)
where sgn(k) is the sign function of the argument k which is defined by:
(129) sgn(k) = + 1 if k > 0, − 1 if k < 0, and 0 if k = 0
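The sign-product formula of Eq. 128↑ translates directly into code; the following sketch (numpy assumed; perm_tensor is a hypothetical helper name, not from the book) builds the rank-n permutation tensor:
```python
import numpy as np
from itertools import product

def perm_tensor(n):
    """Rank-n permutation tensor from the sign-product formula (Eq. 128)."""
    eps = np.zeros((n,) * n)
    for idx in product(range(1, n + 1), repeat=n):
        value = 1
        for j in range(n):
            for k in range(j + 1, n):
                value *= np.sign(idx[k] - idx[j])   # sgn(i_k - i_j)
        eps[tuple(i - 1 for i in idx)] = value
    return eps

eps3 = perm_tensor(3)
print(eps3[0, 1, 2], eps3[1, 0, 2], eps3[0, 0, 2])   # 1.0 -1.0 0.0
```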
The permutation tensor is totally anti-symmetric (see § 2.6.5↑) in each pair of its indices,
i.e. it changes sign on swapping any two of its indices, that is:
(130) ε_{i1…ik…il…in} = − ε_{i1…il…ik…in}
The reason is that any exchange of two indices requires an even/odd number of single-step
shifts to the right of the first index plus an odd/even number of single-step shifts to the left
of the second index, so the total number of shifts is odd and hence it is an odd permutation
of the original arrangement.
The permutation tensor is a pseudo tensor since it acquires a minus sign under an
improper orthogonal transformation of coordinates, i.e. inversion of axes with possible
superposition of rotation (see § 2.2↑). However, it is an isotropic tensor since it is
conserved under proper coordinate transformations.
The permutation tensor may be considered as a contravariant relative tensor of weight +
1 or a covariant relative tensor of weight − 1. Hence, in 2D, 3D and nD spaces we have
the following identities for the components of the permutation tensor (see Footnote 57 in §
8↓):
(131) ε_{ij} = ε^{ij}
(132) ε_{ijk} = ε^{ijk}
(133) ε_{i1 i2…in} = ε^{i1 i2…in}
Hence, in an nD space we obtain the following identity from the last two identities:
(138) ∂_i x_i = δ_ii = n
Based on the above identities and facts, the following identity can be shown to apply in
orthonormal Cartesian coordinate systems:
(139) ∂_i x_j = ∂x_j ⁄ ∂x_i = δ_ij
This identity is based on the two facts that the coordinates are independent, and the
covariant and contravariant types are the same in orthonormal Cartesian coordinate
systems.
Similarly, for a coordinate system with a set of orthonormal (see Footnote 59 in § 8↓)
basis vectors, such as the orthonormal Cartesian system, the following identity can be
easily proved:
(140) e_i ⋅ e_j = δ_ij
where the indexed e are the basis vectors. This identity is no more than a mathematical
statement of the fact that the basis vectors in orthonormal systems are mutually orthogonal
and of unit length.
Finally, the double inner product of two dyads (see § 3.5↑) formed by an orthonormal set of
basis vectors of a given coordinate system satisfies the following identity:
(141) e_i e_j : e_k e_l = δ_ik δ_jl
which is a combination of Eq. 110↑ and Eq. 140↑.
From the definition of the rank-3 permutation tensor, we have the following identity which
demonstrates the sense of cyclic order of the non-repetitive permutations of this tensor:
(142) ε_{ijk} = ε_{kij} = ε_{jki} = − ε_{ikj} = − ε_{jik} = − ε_{kji}
This identity is also a demonstration of the fact that the rank-3 permutation tensor is totally
anti-symmetric in all of its indices since a shift of any two indices reverses its sign (see
Footnote 60 in § 8↓). Moreover, it reflects the fact that this tensor has only one
independent non-zero component since any one of the non-zero entries, all of which are
given by Eq. 142↑, can be obtained from any other one of these entries.
We also have the following identity for the rank-n permutation tensor (see Footnote 61 in §
8↓):
(143) ε_{i1 i2⋯in} ε_{i1 i2⋯in} = n!
This identity is based on the fact that the left hand side is actually the sum of the squares
of ε_{i1 i2⋯in} over all the n! non-repetitive permutations of n different indices where the value
of ε of each one of these permutations is either + 1 or − 1 and hence in both cases their
square is 1.
The double inner product of the rank-3 permutation tensor and a symmetric tensor A_{jk} is
given by the following identity:
(144) ε_{ijk} A_{jk} = 0
This is because an exchange of the two indices of A_{jk} does not affect its value due to the
symmetry of A_{jk} whereas a similar exchange in these indices in ε_{ijk} results in a sign change;
hence each term in the sum has its own negative and therefore the total sum is identically
zero.
Another identity with a trivial outcome that involves the rank-3 permutation tensor and a
vector A is the following:
(145) ε_{ijk} A_i A_j = ε_{ijk} A_i A_k = ε_{ijk} A_j A_k = 0
This can be explained by the fact that, due to the commutativity of ordinary multiplication,
an exchange of the indices in A’s will not affect the value but a similar exchange in the
corresponding indices of ε_{ijk} will cause a change in sign; hence each term in the sum has its
own negative and therefore the total sum will be zero.
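These identities (Eqs. 143↑-145↑) can be spot-checked numerically, as in the following sketch (numpy assumed; the rank-3 ε array is built directly from its definition):
```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1      # even permutations
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1     # odd permutations

print(np.einsum('ijk,ijk->', eps, eps))             # Eq. 143: n! = 6

S = np.random.rand(3, 3)
S = S + S.T                                         # symmetric A_jk
print(np.allclose(np.einsum('ijk,jk->i', eps, S), 0))      # Eq. 144

A = np.random.rand(3)
print(np.allclose(np.einsum('ijk,i,j->k', eps, A, A), 0))  # Eq. 145
```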
Finally, for a set of orthonormal basis vectors in a 3D space with a right-handed
coordinate system, the following identities are satisfied:
(146) e_i × e_j = ε_{ijk} e_k
(147) e_i ⋅ (e_j × e_k) = ε_{ijk}
These identities are based, respectively, on the forthcoming definitions of the cross product
(see Eq. 173↓) and the scalar triple product (see Eq. 174↓) in tensor notation plus the fact
that these vectors are unit vectors.
For the rank-2 permutation tensor, we have the following identity which involves the
Kronecker delta in 2D:
(148) ε_{ij} ε_{kl} = det[δ_ik δ_il; δ_jk δ_jl] = δ_ik δ_jl − δ_il δ_jk
This identity can simply be proved inductively by building a table for the values on the left
and right hand sides as the indices are varied. The pattern of the indices in the determinant
of this identity is simple, that is the indices of the first ε provide the indices for the rows
while the indices of the second ε provide the indices for the columns (see Footnote 62 in §
8↓).
Another useful identity involving the rank-2 permutation tensor with the Kronecker delta in
2D is the following:
(149) ε_{il} ε_{kl} = δ_ik
This can be obtained from the previous identity by replacing j with l followed by a
minimal algebraic manipulation using tensor calculus rules (see Footnote 63 in § 8↓).
Similarly, we have the following identity which correlates the rank-3 permutation tensor to
the Kronecker delta in 3D:
(150) ε_{ijk} ε_{lmn} = det[δ_il δ_im δ_in; δ_jl δ_jm δ_jn; δ_kl δ_km δ_kn]
Again, the indices in the determinant of this identity follow the same pattern as that of Eq.
148↑.
Another useful identity in this category is the following:
(151) ε_{ijk} ε_{lmk} = δ_il δ_jm − δ_im δ_jl
This identity can be obtained from the identity of Eq. 150↑ by replacing n with k (see
Footnote 64 in § 8↓). The pattern of the indices in this identity is as before if we exclude
the repetitive indices.
More generally, the determinantal form of Eqs. 148↑ and 150↑, which link the rank-2 and
rank-3 permutation tensors to the Kronecker tensors in 2D and 3D spaces, can be extended
to link the rank-n permutation tensor to the Kronecker tensor in an nD space, that is:
(152) ε_{i1⋯in} ε_{j1⋯jn} = det[δ_{i_r j_s}] (r, s = 1, ⋯, n)
Again, the pattern of the indices in the determinant of this identity in their relation to the
indices of the two epsilons follow the same rules as those of Eqs. 148↑ and 150↑.
The identity of Eq. 151↑, which may be called the epsilon-delta identity, the contracted
epsilon identity or the Levi-Civita identity, is very useful in manipulating and simplifying
tensor expressions and proving vector and tensor identities; examples of which will be
seen in § 5.6↓. The sequence of indices of the δ’s in the expanded form on the right hand
side of this identity can be easily memorized using the following mnemonic expression
(see Footnote 65 in § 8↓):
(153) (FF × SS) − (FS × SF)
where the first and second F stand respectively for the first index in the first and second ε
while the first and second S stand respectively for the second index in the first and second
ε, as illustrated graphically in Figure 19↓. The mnemonic device of Eq. 153↑ can also be
used to memorize the sequence of indices in Eq. 148↑.
Figure 19 Graphical illustration of the mnemonic device of Eq. 153↑ which is used to
remember the sequence of indices in the epsilon-delta identity of Eq. 151↑.
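A numeric sketch (numpy assumed) that verifies the epsilon-delta identity of Eq. 151↑ for all index values in a 3D space, with the right hand side arranged as the (FF × SS) − (FS × SF) mnemonic:
```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1
delta = np.eye(3)

lhs = np.einsum('ijk,lmk->ijlm', eps, eps)              # eps_ijk eps_lmk
rhs = (np.einsum('il,jm->ijlm', delta, delta)           # (FF x SS)
       - np.einsum('im,jl->ijlm', delta, delta))        # - (FS x SF)
print(np.allclose(lhs, rhs))                            # True
```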
4.4 Generalized Kronecker delta
The generalized Kronecker delta δ^{i1…in}_{j1…jn} is defined by the following n × n determinant:
(159) δ^{i1…in}_{j1…jn} = det[δ^{i_r}_{j_s}] (r, s = 1, …, n)
where the δ^i_j entries in the determinant are the ordinary Kronecker deltas as defined
previously. In this equation, the pattern of the indices in the generalized Kronecker delta
symbol in connection to the indices in the determinant is similar to the previous patterns,
that is the upper indices in the symbol provide the upper indices in the ordinary deltas by
indexing the rows of the determinant, while the lower indices in the symbol provide the
lower indices in the ordinary deltas by indexing the columns of the determinant.
From the above given identities, it can be shown that:
(160) ε^{i1…in} ε_{j1…jn} = det[δ^{i_r}_{j_s}] (r, s = 1, …, n)
Now, on comparing the last equation with the definition of the generalized Kronecker delta,
i.e. Eq. 159↑, we conclude that:
(161) ε^{i1…in} ε_{j1…jn} = δ^{i1…in}_{j1…jn}
As an instance of Eq. 161↑, the relation between the rank-n permutation tensor in its
covariant and contravariant forms and the generalized Kronecker delta in an nD space is
given by:
(162) ε_{i1…in} = δ^{1 2…n}_{i1…in} and ε^{i1…in} = δ^{i1…in}_{1 2…n}
where the first of these equations can be obtained from Eq. 161↑ by substituting (1…n) for
(i1…in) in the two sides with relabeling j with i and noting that ε^{1…n} = 1, while the
second equation can be obtained from Eq. 161↑ by substituting (1…n) for (j1…jn) and
noting that ε_{1…n} = 1.
Hence, the permutation tensor ε can be considered as an instance of the generalized
Kronecker delta. Consequently, the rank-n permutation tensor can be written as an n × n
determinant consisting of the ordinary Kronecker deltas. Moreover, Eq. 162↑ can provide
another definition for the permutation tensor in its covariant and contravariant forms, in
addition to the previous inductive and analytic definitions of this tensor as given by Eqs.
122↑ and 126↑.
Returning to the widely used epsilon-delta identity of Eq. 151↑, if we define (see
Footnote 67 in § 8↓):
(163) δ^{ij}_{lm} ≡ δ^{ijk}_{lmk}
and consider the above identities which correlate the permutation tensor, the generalized
Kronecker tensor and the ordinary Kronecker tensor, then an identity equivalent to Eq.
151↑ that involves only the generalized and ordinary Kronecker deltas can be obtained,
that is:
(164) δ^{ij}_{lm} = δ^i_l δ^j_m − δ^i_m δ^j_l
The mnemonic device of Eq. 153↑ can also be used with this form of the identity with
minimal adjustments to the meaning of the symbols involved.
Other identities involving the permutation tensor and the ordinary Kronecker delta can also
be formulated in terms of the generalized Kronecker delta.
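The determinantal definition of Eq. 159↑ and the relation of Eq. 162↑ can be checked directly; in the following sketch (numpy assumed; gen_delta is a hypothetical helper, and indices are 0-based unlike the book's 1-based convention):
```python
import numpy as np

def gen_delta(upper, lower):
    """delta^{i1...in}_{j1...jn} = det of the matrix with entries delta^{i_r}_{j_s}."""
    n = len(upper)
    M = np.array([[1.0 if upper[r] == lower[s] else 0.0
                   for s in range(n)] for r in range(n)])
    return np.linalg.det(M)

print(gen_delta((0, 1), (0, 1)))    #  1.0
print(gen_delta((0, 1), (1, 0)))    # -1.0
print(gen_delta((0, 0), (0, 1)))    #  0.0 (repeated upper index)

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

# Eq. 162 (covariant form): eps_{ijk} = delta^{123}_{ijk}
print(all(np.isclose(eps[i, j, k], gen_delta((0, 1, 2), (i, j, k)))
          for i in range(3) for j in range(3) for k in range(3)))   # True
```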
4.5 Exercises
Exercise 1. What does “numerical tensor” mean in connection with the Kronecker δ and the
permutation ε tensors?
Exercise 2. State all the names used to label the Kronecker and permutation tensors.
Exercise 3. What is the meaning of “conserved under coordinate transformations” in
relation to the Kronecker and permutation tensors?
Exercise 4. State the mathematical definition of the Kronecker δ tensor.
Exercise 5. What is the rank of the Kronecker δ tensor in an nD space?
Exercise 6. Write down the matrix representing the Kronecker δ tensor in a 3D space.
Exercise 7. Is there any difference between the components of the covariant, contravariant
and mixed types of the Kronecker δ tensor?
Exercise 8. Explain how the Kronecker δ acts as an index replacement operator giving an
example in a mathematical form.
Exercise 9. How many mathematical definitions of the rank-n permutation tensor do we
have? State one of these definitions explaining all the symbols involved.
Exercise 10. What is the rank of the permutation tensor in an nD space?
Exercise 11. Make a graphical illustration of the array representing the rank-2 and rank-3
permutation tensors.
Exercise 12. Is there any difference between the components of the covariant and
contravariant types of the permutation tensor?
Exercise 13. How are the covariant and contravariant types of the permutation tensor
related to the concept of relative tensor?
Exercise 14. State the distinctive properties of the permutation tensor.
Exercise 15. How many entries does the rank-3 permutation tensor have? How many
non-zero entries does it have? How many independent entries does it have?
Exercise 16. Is the permutation tensor true or pseudo and why?
Exercise 17. State, in words, the cyclic property of the even and odd non-repetitive
permutations of the rank-3 permutation tensor with a simple sketch to illustrate this
property.
Exercise 18. Correct the following equations:
δ_ij A_j = A_j
δ_ij δ_jk = δ_jk
δ_ij δ_jk δ_ki = n!
x_{i,j} = δ_ii
Exercise 19. In what type of coordinate system does the following equation apply?
∂_i x_j = ∂_j x_i
Exercise 20. Complete the following equation assuming a 4D space:
∂_i x_i = ?
Exercise 21. Complete the following equations where the indexed e are orthonormal basis
vectors of a particular coordinate system:
e_i ⋅ e_j = ?
e_i e_j : e_k e_l = ?
Exercise 22. Write down the equations representing the cyclic order of the rank-3
permutation tensor. What is the conclusion from these equations with regard to the
symmetry or anti-symmetry of this tensor and the number of its independent non-zero
components?
Exercise 23. Write the analytical expressions of the rank-3 and rank-4 permutation tensors.
Exercise 24. Correct, if necessary, the following equations:
ε_{i1⋯in} ε_{i1⋯in} = n
ε_{ijk} C_j C_k = 0
ε_{ijk} D_{jk} = 0
e_i × e_j = ε_{ijk} e_j
(e_i × e_j) ⋅ e_k = ε_{ijk}
where C is a vector, D is a symmetric rank-2 tensor, and the indexed e are orthonormal
basis vectors in a 3D space with a right-handed coordinate system.
Exercise 25. What is wrong with the following equations?
ε_{ijk} δ_{1i} δ_{2j} δ_{3k} = − 1
ε_{ij} ε_{kl} = δ_ik δ_jl + δ_il δ_jk
ε_{il} ε_{kl} = δ_il
ε_{ij} ε_{ij} = 3!
Exercise 26. Write the following in their determinantal form describing the general pattern
of the relation between the indices of ε and δ and the indices of the rows and columns of
the determinant:
ε_{ijk} ε_{lmk}
ε_{ijk} ε_{lmn}
ε_{i1⋯in} ε_{j1⋯jn}
Exercise 27. Give two mnemonic devices used to memorize the widely used epsilon-delta
identity and make a simple graphic illustration for one of these.
Exercise 28. Correct, if necessary, the following equations:
ε_{rst} ε_{rst} = 3!
ε_{pst} ε_{qst} = 2 δ_pq
ε_{rst} δ_{rt} = ε_{rst} δ_{st}
Exercise 29. State the mathematical definition of the generalized Kronecker delta.
Exercise 30. Write each one of ε_{i1…in} and ε^{i1…in} in terms of the generalized Kronecker δ.
Exercise 31. Write the mathematical relation that links the covariant permutation tensor,
the contravariant permutation tensor, and the generalized Kronecker delta.
Exercise 32. State the widely used epsilon-delta identity in terms of the generalized and
ordinary Kronecker deltas.
Chapter 5
where the last two equalities represent the expansion of the determinant by row and by
column. Alternatively, the determinant of a 3 × 3 matrix can be given by:
(167) det(A) = (1 ⁄ 3!) ε_{ijk} ε_{lmn} A_{il} A_{jm} A_{kn}
More generally, for an n × n matrix representing a rank-2 tensor in an nD space, the
determinant is given by:
(168) det(A) = ε_{i1⋯in} A_{1 i1}…A_{n in} = ε_{i1⋯in} A_{i1 1}…A_{in n} = (1 ⁄ n!) ε_{i1⋯in} ε_{j1⋯jn} A_{i1 j1}…A_{in jn}
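Eqs. 167↑-168↑ can be confirmed numerically; a sketch for the 3 × 3 case (numpy assumed):
```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

A = np.random.rand(3, 3)
det1 = np.einsum('ijk,i,j,k->', eps, A[0], A[1], A[2])         # eps_ijk A_1i A_2j A_3k
det2 = np.einsum('ijk,lmn,il,jm,kn->', eps, eps, A, A, A) / 6  # (1/3!) double-eps form
print(np.allclose(det1, np.linalg.det(A)),
      np.allclose(det2, np.linalg.det(A)))                     # True True
```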
The inverse of a matrix A representing a rank-2 tensor is given by:
(169)
while the vector triple product of three vectors in a 3D space is given by:
(175) [A × (B × C)]_i = ε_{ijk} ε_{klm} A_j B_l C_m
The expression of the other principal form of the vector triple product [i.e. (A × B) × C]
can be obtained from the above form by changing the order of the factors in the external
cross product and reversing the sign; other operations, like relabeling the indices and
exchanging some of the indices of the epsilons with a shift in sign, can then follow to obtain
a more organized form. The expressions of the subsidiary forms of the vector triple product
[e.g. B × (A × C) or (A × C) × B] can be obtained from the above with relabeling the
vectors in the indicial form according to their order in the symbolic form.
Similarly, the gradient of a differentiable vector function of position A is the outer product
(refer to § 3.3↑) between the ∇ operator and the vector and hence it is a rank-2 tensor
given by:
(187) [∇A]_ij = ∂_i A_j
The divergence of a differentiable vector A is the dot product of the nabla operator and
the vector A and hence it is a scalar given by:
(188) ∇⋅A = ∂_i A_i
The divergence operation can also be viewed as taking the gradient of the vector followed
by a contraction. Hence, the divergence of a vector is invariant because it is the trace of a
rank-2 tensor (see § 5.2↑; also see Footnote 71 in § 8↓).
Similarly, the divergence of a differentiable rank-2 tensor A is a vector defined in one of
its forms by:
(189) [∇⋅A]_i = ∂_j A_ji
and in another form by:
(190) [∇⋅A]_j = ∂_i A_ji
These two different forms can be given, respectively, in symbolic notation by:
(191) ∇⋅A and ∇⋅AT
where AT is the transpose of A.
More generally, the divergence of a tensor of rank n ≥ 2, which is a tensor of rank-(n − 1),
can be defined in several forms, which are different in general, depending on the
combination of the contracted indices.
The curl of a differentiable vector A is the cross product of the nabla operator and the
vector A and hence it is a vector defined by:
(192) [∇ × A]_i = ε_{ijk} ∂_j A_k
The curl operation may be generalized to tensors of rank > 1, and hence the curl of a
differentiable rank-2 tensor A can be defined as a rank-2 tensor given by:
(193) [∇ × A]_ij = ε_{imn} ∂_m A_nj
The Laplacian scalar operator acting on a differentiable scalar f is given by:
(194) ∇²f = ∂_i ∂_i f = ∂_ii f
The Laplacian operator acting on a differentiable vector A is defined for each component
of the vector in a similar manner to the definition of the Laplacian acting on a scalar, that
is:
(195) [∇²A]_i = ∇²[A]_i = ∂_jj A_i
The following scalar differential operator is commonly used in science (e.g. in fluid
dynamics):
(196) A⋅∇ = A_i ∂_i
where A is a vector. As indicated earlier, the order of Ai and ∂i should be respected. The
following vector differential operator also has common applications in science:
(197) [A × ∇]_i = ε_{ijk} A_j ∂_k
where, again, the order should be respected.
It should be remarked that the differentiation of a tensor increases its rank by one, by
introducing an extra covariant index, unless it implies a contraction in which case it
reduces the rank by one. Therefore the gradient of a scalar is a vector and the gradient of a
vector is a rank-2 tensor (∂_i A_j), while the divergence of a vector is a scalar and the
divergence of a rank-2 tensor is a vector (∂_j A_ji or ∂_i A_ji). This may be justified by the fact
that nabla ∇ is a vector operator. On the other hand the Laplacian operator does not
change the rank since it is a scalar operator; hence the Laplacian of a scalar is a scalar
and the Laplacian of a vector is a vector.
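The effect of these operations on the rank can be checked symbolically; the following sketch (sympy assumed, borrowing the vector field of Exercise 15 below as an assumed example) computes a gradient, divergence and curl in Cartesian coordinates:
```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
x = (x1, x2, x3)
V = (2*x1 - 1.2*x2, x1 + x3, x2*x3)            # sample vector field

grad_V = sp.Matrix(3, 3, lambda i, j: sp.diff(V[j], x[i]))   # [grad V]_ij = d_i V_j
div_V = sum(sp.diff(V[i], x[i]) for i in range(3))           # d_i V_i (a scalar)

# [curl V]_i = eps_ijk d_j V_k, written out for 3D:
curl_V = [sp.diff(V[2], x2) - sp.diff(V[1], x3),
          sp.diff(V[0], x3) - sp.diff(V[2], x1),
          sp.diff(V[1], x1) - sp.diff(V[0], x2)]

print(grad_V)    # rank-2 array
print(div_V)     # scalar: x2 + 2
print(curl_V)    # rank-1 array (vector)
```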
For the cylindrical system identified by the coordinates (ρ, φ, z) with a set of orthonormal
basis vectors e_ρ, e_φ and e_z we have the following definitions for the nabla based
operators and operations (see Footnote 72 in § 8↓).
The nabla operator ∇ is given by:
(198) ∇ = e_ρ ∂ ⁄ ∂ρ + e_φ (1 ⁄ ρ) ∂ ⁄ ∂φ + e_z ∂ ⁄ ∂z
For the spherical system identified by the coordinates (r, θ, φ) with a set of orthonormal
basis vectors e_r, e_θ and e_φ we have the following definitions for the nabla based
operators and operations (see Footnote 73 in § 8↓).
The nabla operator ∇ is given by:
(203) ∇ = e_r ∂ ⁄ ∂r + e_θ (1 ⁄ r) ∂ ⁄ ∂θ + e_φ (1 ⁄ (r sinθ)) ∂ ⁄ ∂φ
For general orthogonal systems in a 3D space identified by the coordinates (u1, u2, u3)
with a set of unit basis vectors u_1, u_2 and u_3 and scale factors h_1, h_2 and h_3 where h_i
= |∂r ⁄ ∂u_i| and r = x_i e_i is the position vector, we have the following definitions for the
nabla based operators and operations.
The nabla operator ∇ is given by:
(208) ∇ = (u_1 ⁄ h_1) ∂ ⁄ ∂u_1 + (u_2 ⁄ h_2) ∂ ⁄ ∂u_2 + (u_3 ⁄ h_3) ∂ ⁄ ∂u_3
• (A × B) × (C × D) = [D⋅(A × B)]C − [C⋅(A × B)]D:
[(A × B) × (C × D)]_i = ε_{ijk}[A × B]_j[C × D]_k
(Eq. 173↑)
= ε_{ijk} ε_{jmn} A_m B_n ε_{kpq} C_p D_q
(Eq. 173↑)
= ε_{ijk} ε_{kpq} ε_{jmn} A_m B_n C_p D_q
(commutativity)
= ε_{ijk} ε_{pqk} ε_{jmn} A_m B_n C_p D_q
(Eq. 142↑)
= (δ_ip δ_jq − δ_iq δ_jp) ε_{jmn} A_m B_n C_p D_q
(Eq. 151↑)
= (δ_ip δ_jq ε_{jmn} − δ_iq δ_jp ε_{jmn}) A_m B_n C_p D_q
(distributivity)
= (δ_ip ε_{qmn} − δ_iq ε_{pmn}) A_m B_n C_p D_q
(Eq. 134↑)
= δ_ip ε_{qmn} A_m B_n C_p D_q − δ_iq ε_{pmn} A_m B_n C_p D_q
(distributivity)
= ε_{qmn} A_m B_n C_i D_q − ε_{pmn} A_m B_n C_p D_i
(Eq. 134↑)
= ε_{qmn} D_q A_m B_n C_i − ε_{pmn} C_p A_m B_n D_i
(commutativity)
= (ε_{qmn} D_q A_m B_n) C_i − (ε_{pmn} C_p A_m B_n) D_i
(grouping)
= [D⋅(A × B)] C_i − [C⋅(A × B)] D_i
(Eq. 174↑)
= [[D⋅(A × B)]C]_i − [[C⋅(A × B)]D]_i
(index definition)
= [[D⋅(A × B)]C − [C⋅(A × B)]D]_i
(Eq. 60↑).
Because i is a free index, the identity is proved for all components.
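The identity just proved can also be confirmed numerically for random vectors, as in the following sketch (numpy assumed):
```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.standard_normal(3) for _ in range(4))

lhs = np.cross(np.cross(A, B), np.cross(C, D))
rhs = np.dot(D, np.cross(A, B)) * C - np.dot(C, np.cross(A, B)) * D
print(np.allclose(lhs, rhs))    # True
```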
5.7 Exercises
Exercise 1. Write, in tensor notation, the mathematical expression for the trace,
determinant and inverse of an n × n matrix.
Exercise 2. Repeat the previous exercise for the multiplication of a matrix by a vector and
the multiplication of two n × n matrices.
Exercise 3. Define mathematically the dot and cross product operations of two vectors
using tensor notation.
Exercise 4. Repeat the previous exercise for the scalar triple product and vector triple
product operations of three vectors.
Exercise 5. Define mathematically, using tensor notation, the three scalar invariants of a
rank-2 tensor: I, II and III.
Exercise 6. Express the three scalar invariants I, II and III in terms of the other three
invariants I1, I2 and I3 and vice versa.
Exercise 7. Explain why the three invariants I1, I2 and I3 are scalars, using in your
argument the fact that the three main invariants I, II and III are traces.
Exercise 8. Justify, giving a detailed explanation, the following statement: “If a rank-2
tensor is invertible in a particular coordinate system it is invertible in all other coordinate
systems, and if it is singular in a particular coordinate system it is singular in all other
coordinate systems”. Use in your explanation the fact that the determinant is invariant under
admissible coordinate transformations.
Exercise 9. What are the ten joint invariants between two rank-2 tensors?
Exercise 10. Provide a concise mathematical definition of the nabla differential operator
∇ in Cartesian coordinate systems using tensor notation.
Exercise 11. What is the rank and variance type of the gradient of a differentiable scalar
field in general curvilinear coordinate systems?
Exercise 12. State, in tensor notation, the mathematical expression for the gradient of a
differentiable scalar field in a Cartesian system.
Exercise 13. What is the gradient of the following scalar functions of position f, g and h
where x1, x2 and x3 are the Cartesian coordinates and a, b and c are constants?
f = 1.3x_1 − 2.6e^{x_2} + 19.8x_3
g = ax_3 + be^{x_2}
h = a(x_1)³ − sin x_3 + c(x_3)²
Exercise 14. State, in tensor notation, the mathematical expression for the gradient of a
differentiable vector field in a Cartesian system.
Exercise 15. What is the gradient of the following vector where x1, x2 and x3 are the
Cartesian coordinates?
V = (2x_1 − 1.2x_2, x_1 + x_3, x_2 x_3)
What is the rank of this gradient?
Exercise 16. Explain, in detail, why the divergence of a vector is invariant.
Exercise 17. What is the rank of the divergence of a rank-n (n > 0) tensor and why?
Exercise 18. State, using vector and tensor notations, the mathematical definition of the
divergence operation of a vector in a Cartesian coordinate system.
Exercise 19. Discuss in detail the following statement: “The divergence of a vector is a
gradient operation followed by a contraction”. How is this related to the trace of a rank-2
tensor?
Exercise 20. Write down the mathematical expression of the two forms of the divergence
of a rank-2 tensor.
Exercise 21. How many forms do we have for the divergence of a rank-n (n > 0) tensor
and why? Assume in your answer that the divergence operation can be conducted with
respect to any one of the tensor indices.
Exercise 22. Find the divergence of the following vectors U and V where x1, x2 and x3 are
the Cartesian coordinates:
U = (9.3x_1, 6.3 cos x_2, 3.6x_1 e^{−1.2x_3})
V = (x_2 sin x_1, 5(x_2)³, 16.3x_3)
Exercise 23. State, in tensor notation, the mathematical expression for the curl of a vector
and of a rank-2 tensor assuming a Cartesian coordinate system.
Exercise 24. Define, in tensor notation, the Laplacian operator acting on a differentiable
scalar field in a Cartesian coordinate system.
Exercise 25. Is the Laplacian a scalar or a vector operator?
Exercise 26. What is the meaning of the Laplacian operator acting on a differentiable
vector field?
Exercise 27. What is the rank of a rank-n tensor acted upon by the Laplacian operator?
Exercise 28. Define mathematically the following operators assuming a Cartesian
coordinate system:
A⋅∇
A×∇
What is the rank of each one of these operators?
Exercise 29. Make a general statement about how differentiation of tensors affects their
rank discussing in detail from this perspective the gradient and divergence operations.
Exercise 30. State the mathematical expressions for the following operators and
operations assuming a cylindrical coordinate system: nabla operator, Laplacian operator,
gradient of a scalar, divergence of a vector, and curl of a vector.
Exercise 31. Explain how the expressions for the operators and operations in the previous
exercise can be obtained for the plane polar coordinate system from the expressions of the
cylindrical system.
Exercise 32. State the mathematical expressions for the following operators and
operations assuming a spherical coordinate system: nabla operator, Laplacian operator,
gradient of a scalar, divergence of a vector, and curl of a vector.
Exercise 33. Repeat the previous exercise for the general orthogonal coordinate system.
Exercise 34. Express, in tensor notation, the mathematical condition for a vector field to
be solenoidal.
Exercise 35. Express, in tensor notation, the mathematical condition for a vector field to
be irrotational.
Exercise 36. Express, in tensor notation, the divergence theorem for a differentiable vector
field explaining all the symbols involved. Repeat the exercise for a differentiable tensor
field of an arbitrary rank ( > 0).
Exercise 37. Express, in tensor notation, Stokes theorem for a differentiable vector field
explaining all the symbols involved. Repeat the exercise for a differentiable tensor field of
an arbitrary rank ( > 0).
Exercise 38. Express the following identities in tensor notation:
∇⋅r = n
∇(a⋅r) = a
∇⋅(∇ × A) = 0
∇(fh) = f∇h + h∇f
∇ × (fA) = f∇ × A + ∇f × A
A × (B × C) = B(A⋅C) − C(A⋅B)
∇ × (∇ × A) = ∇(∇⋅A) − ∇²A
∇⋅(A × B) = B⋅(∇ × A) − A⋅(∇ × B)
Exercise 39. Prove the following identities using the language and techniques of tensor
calculus:
∇×r=0
∇⋅(∇f) = ∇²f
∇ × (∇f) = 0
∇⋅(fA) = f∇⋅A + A⋅∇f
A⋅(B × C) = C⋅(A × B) = B⋅(C × A)
A × (∇ × B) = (∇B)⋅A − A⋅∇B
∇(A⋅B) = A × (∇ × B) + B × (∇ × A) + (A⋅∇)B + (B⋅∇)A
∇ × (A × B) = (B⋅∇)A + (∇⋅B)A − (∇⋅A)B − (A⋅∇)B
(A × B) × (C × D) = [D⋅(A × B)]C − [C⋅(A × B)]D
Chapter 6
Metric Tensor
The subject of the present chapter is the metric tensor which is one of the most important
special tensors, if not the most important of all, in tensor calculus. Its versatile usage and
functionalities permeate the whole discipline of tensor calculus.
The metric tensor is a rank-2 tensor which may also be called the fundamental tensor.
The main purpose of the metric tensor is to generalize the concept of distance to general
curvilinear coordinate frames and maintain the invariance of distance in different
coordinate systems.
In orthonormal Cartesian coordinate systems the distance element squared, (ds)², between
two infinitesimally neighboring points in space, one with coordinates x_i and the other with
coordinates x_i + dx_i, is given by:
(261) (ds)² = dx_i dx_i = δ_ij dx_i dx_j
This definition of distance is the key to introducing a rank-2 tensor, g_ij, called the metric
tensor which, for a general coordinate system, is defined by:
(262) (ds)² = g_ij dx^i dx^j
The above defined metric tensor is of covariant type. The metric tensor has also a
contravariant type which is usually notated with g^ij.
The components of the covariant and contravariant metric tensor are given by:
(263) g_ij = E_i ⋅ E_j and g^ij = E^i ⋅ E^j
where the indexed E are the covariant and contravariant basis vectors as defined in §
2.6.1↑.
The metric tensor has also a mixed type which is given by:
(264) g_i^j = E_i ⋅ E^j = δ_i^j and g^i_j = E^i ⋅ E_j = δ^i_j
and hence it is the same as the unity tensor.
For a coordinate system in which the metric tensor can be cast in a diagonal form where
the diagonal elements are ±1 the metric is called flat. For Cartesian coordinate systems,
which are orthonormal flat-space systems, we have:
(265) g_ij = δ_ij and g^ij = δ^ij
The metric tensor is symmetric in its two indices, that is:
(266) g_ij = g_ji and g^ij = g^ji
This can be easily explained by the commutativity of the dot product of vectors in reference
to the above equations involving the dot product of the basis vectors.
The contravariant metric tensor is used for raising covariant indices of covariant and
mixed tensors, e.g.
(267) A^i = g^ik A_k and A^{ij} = g^{ik} A_k^j
Similarly, the covariant metric tensor is used for lowering contravariant indices of
contravariant and mixed tensors, e.g.
(268) A_i = g_ik A^k and A_{ij} = g_{ik} A^k_j
In these raising and lowering operations the metric tensor acts, like a Kronecker delta, as
an index replacement operator as well as shifting the position of the index.
Because it is possible to shift the index position of a tensor by using the covariant and
contravariant types of the metric tensor, a given tensor can be cast into a covariant or a
contravariant form, as well as a mixed form in the case of tensors of rank > 1. However, it
should be emphasized that the order of the indices must be respected in this process,
because two tensors with the same indicial structure but with different indicial order are
not equal in general, as stated before. For example:
(269) A^i_j = g_jk A^{ik} ≠ A_j^i = g_jk A^{ki}
Some authors insert dots (e.g. A^i_{⋅j} and A_{⋅j}^i) to remove any ambiguity about the order of
the indices.
The covariant and contravariant metric tensors are inverses of each other, that is:
(270) [g^ij] = [g_ij]^{−1} and [g_ij] = [g^ij]^{−1}
Hence:
(271) g^ik g_kj = δ^i_j and g_ik g^kj = δ_i^j
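A short numeric sketch of Eqs. 267↑-271↑ (numpy assumed; the sample metric is the spherical one, evaluated at an arbitrary point):
```python
import numpy as np

r, theta = 2.0, 0.5                                     # arbitrary sample point
g_lo = np.diag([1.0, r**2, r**2 * np.sin(theta)**2])    # covariant g_ij
g_hi = np.linalg.inv(g_lo)                              # contravariant g^ij

print(np.allclose(g_hi @ g_lo, np.eye(3)))              # Eq. 271: g^ik g_kj = delta

A_lo = np.array([1.0, 2.0, 3.0])                        # covariant components A_k
A_hi = np.einsum('ik,k->i', g_hi, A_lo)                 # raising (Eq. 267)
print(np.allclose(np.einsum('ik,k->i', g_lo, A_hi), A_lo))  # lowering recovers A_k
```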
It is common to reserve the term metric tensor to the covariant form and call the
contravariant form, which is its inverse, the associate or conjugate or reciprocal metric
tensor. As a tensor, the metric has a significance regardless of any coordinate system
although it requires a coordinate system to be represented in a specific form. For
orthogonal coordinate systems the metric tensor is diagonal, i.e. g_ij = g^ij = 0 for i ≠ j.
As indicated before, for orthonormal Cartesian coordinate systems in a 3D space, the
metric tensor is given in its covariant and contravariant forms by the 3 × 3 unit matrix, that
is:
(272) [g_ij] = [g^ij] = [1 0 0; 0 1 0; 0 0 1]
For cylindrical coordinate systems with coordinates (ρ, φ, z), the metric tensor is given in
its covariant and contravariant forms by:
(273) [g_ij] = [1 0 0; 0 ρ² 0; 0 0 1] and [g^ij] = [1 0 0; 0 1 ⁄ ρ² 0; 0 0 1]
while for spherical coordinate systems with coordinates (r, θ, φ), the metric tensor is
given in its covariant and contravariant forms by:
(274) [g_ij] = [1 0 0; 0 r² 0; 0 0 r² sin²θ] and [g^ij] = [1 0 0; 0 1 ⁄ r² 0; 0 0 1 ⁄ (r² sin²θ)]
As seen, all these metric tensors are diagonal since all these coordinate systems are
orthogonal. We also notice that all the corresponding diagonal elements of the covariant
and contravariant types are reciprocals of each other. This can be easily explained by the
fact that these two types are inverses of each other, plus the fact that the inverse of an
invertible diagonal matrix is a diagonal matrix obtained by taking the reciprocal of the
corresponding diagonal elements of the original matrix, as stated in Inverse of Matrix in §
1.3.3↑.
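The entries of Eq. 274↑ can be derived rather than memorized; the following sympy sketch builds the covariant basis vectors E_i = ∂r ⁄ ∂u^i for spherical coordinates and applies Eq. 263↑:
```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
# Cartesian position vector expressed in spherical coordinates:
X = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
               r*sp.sin(th)*sp.sin(ph),
               r*sp.cos(th)])

E = [sp.diff(X, q) for q in (r, th, ph)]    # covariant basis vectors E_i
g = sp.Matrix(3, 3, lambda i, j: sp.simplify(E[i].dot(E[j])))   # g_ij = E_i . E_j

print(g)         # diag(1, r**2, r**2*sin(theta)**2), matching Eq. 274
print(g.inv())   # contravariant metric: reciprocal diagonal entries
```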
6.1 Exercises
Exercise 1. Describe in detail, using mathematical tensor language when necessary, the
metric tensor discussing its rank, purpose, designations, variance types, symmetry, its role
in the definition of distance, and its relation to the covariant and contravariant basis
vectors.
Exercise 2. What is the relation between the covariant and contravariant types of the
metric tensor? Express this relation mathematically. Also define mathematically the mixed
type metric tensor.
Exercise 3. Correct, if necessary, the following equations:
g_i^j = δ_i^i
g^ij = E_i ⋅ E_j
(ds) = g_ij dx^i dx^j
g_ij = E^i ⋅ E^j
E_i ⋅ E^j = δ_j^i
Exercise 4. What does “flat metric” mean? Give an example of a coordinate system with a
flat metric.
Exercise 5. Describe the index-shifting (raising/lowering) operators and their relation to
the metric tensor. How do these operators facilitate the transformation between the
covariant, contravariant and mixed types of a given tensor?
Exercise 6. What is wrong with the following equations?
C_i = g_ij C_j
D^i = g^ij D^j
A^i = δ_ij A_j
Make the necessary corrections considering all the possibilities in each case.
Exercise 7. Is it necessary to keep the order of the indices which are shifted by the index-
shifting operators and why?
Exercise 8. How and why may dots be inserted to avoid confusion about the order of the
indices following an index-shifting operation?
Exercise 9. Express, mathematically, the fact that the contravariant and covariant metric
tensors are inverses of each other.
Exercise 10. Correct, if necessary, the following statement: “The term metric tensor is
usually used to label the covariant form of the metric, while the contravariant form of the
metric is called the conjugate or associate or reciprocal metric tensor”.
Exercise 11. Write, in matrix form, the covariant and contravariant types of the metric
tensor of the Cartesian, cylindrical and spherical coordinate systems of a 3D flat space.
Exercise 12. Regarding the previous question, what do you notice about the corresponding
diagonal elements of the covariant and contravariant types of the metric tensor in these
systems? Does this relate to the fact that these types are inverses of each other?
Chapter 7
Covariant Differentiation
The focus of this chapter is the operation of covariant differentiation of tensors which, in a
sense, is a generalization of the ordinary differentiation. The ordinary derivative of a
tensor is not a tensor in general. The objective of covariant differentiation is to ensure the
invariance of derivative (i.e. being a tensor) in general coordinate systems, and this results
in applying more sophisticated rules using Christoffel symbols where different
differentiation rules for covariant and contravariant indices apply. The resulting covariant
derivative is a tensor which is one rank higher than the differentiated tensor.
The Christoffel symbol of the second kind is defined by:
(275) {k ij} = (g^{kl} ⁄ 2)(∂_i g_{jl} + ∂_j g_{il} − ∂_l g_{ij})
where the indexed g is the metric tensor in its contravariant and covariant forms with
implied summation over l. It is noteworthy that Christoffel symbols are not tensors. The
Christoffel symbols of the second kind are symmetric in their two lower indices, that is:
(276) {k ij} = {k ji}
For Cartesian coordinate systems, the Christoffel symbols are zero for all values of the
indices. For cylindrical coordinate systems, marked with the coordinates (ρ, φ, z), the
Christoffel symbols are zero for all values of the indices except:
(277) {1 22} = − ρ
(278) {2 12} = {2 21} = 1 ⁄ ρ
where (1, 2, 3) stand for (ρ, φ, z).
For spherical coordinate systems, marked with the coordinates (r, θ, φ), the Christoffel
symbols are zero for all values of the indices except:
(279) {1 22} = − r
(280) {1 33} = − r sin²θ
(281) {2 12} = {2 21} = 1 ⁄ r
(282) {2 33} = − sinθ cosθ
(283) {3 13} = {3 31} = 1 ⁄ r
(284) {3 23} = {3 32} = cotθ
where (1, 2, 3) stand for (r, θ, φ).
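These values can be reproduced symbolically; the following sketch (sympy assumed; christoffel is a hypothetical helper implementing Eq. 275↑) computes the spherical Christoffel symbols from the metric g = diag(1, r², r² sin²θ):
```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
u = (r, th, ph)
g = sp.diag(1, r**2, r**2*sp.sin(th)**2)   # covariant spherical metric
ginv = g.inv()

def christoffel(k, i, j):
    """{k ij} = (g^{kl}/2)(d_i g_{jl} + d_j g_{il} - d_l g_{ij}) (Eq. 275)."""
    return sp.simplify(sum(ginv[k, l]*(sp.diff(g[j, l], u[i])
                                       + sp.diff(g[i, l], u[j])
                                       - sp.diff(g[i, j], u[l]))/2
                           for l in range(3)))

print(christoffel(0, 1, 1))   # {1 22} = -r                    (Eq. 279)
print(christoffel(1, 2, 2))   # {2 33} = -sin(theta)*cos(theta) (Eq. 282)
print(christoffel(2, 1, 2))   # {3 23} = cot(theta)             (Eq. 284)
```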
For a differentiable scalar f, the covariant derivative is the same as the ordinary partial
derivative, that is:
(285) f_{;i} = f_{,i} = ∂_i f
This is justified by the fact that the covariant derivative is different from the ordinary
partial derivative because the basis vectors in general coordinate systems are dependent on
their spatial position, and since a scalar is independent of the basis vectors the covariant
and partial derivatives are identical.
For a differentiable vector A, the covariant derivative of the covariant and contravariant
forms of the vector is given by:
(286) A_{j;i} = ∂_i A_j − {k ji} A_k (covariant)
(287) A^j_{;i} = ∂_i A^j + {j ki} A^k (contravariant)
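As a sketch of how Eq. 286↑ can be applied mechanically (sympy assumed; the christoffel helper repeats the construction of Eq. 275↑ for the cylindrical metric, and the field A is an arbitrary assumed example):
```python
import sympy as sp

rho, phi, z = sp.symbols('rho phi z', positive=True)
u = (rho, phi, z)
g = sp.diag(1, rho**2, 1)              # covariant cylindrical metric (Eq. 273)
ginv = g.inv()

def christoffel(k, i, j):
    """{k ij} built from the metric as in Eq. 275."""
    return sp.simplify(sum(ginv[k, l]*(sp.diff(g[j, l], u[i])
                                       + sp.diff(g[i, l], u[j])
                                       - sp.diff(g[i, j], u[l]))/2
                           for l in range(3)))

A = [rho*phi, rho + z, sp.sin(phi)]    # assumed covariant components A_j

def cov_deriv(j, i):
    """A_{j;i} = d_i A_j - {k ji} A_k (Eq. 286)."""
    return sp.simplify(sp.diff(A[j], u[i])
                       - sum(christoffel(k, j, i)*A[k] for k in range(3)))

print(cov_deriv(1, 0))    # A_{2;1}: includes the -{2 21} A_2 correction term
```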
For a differentiable rank-2 tensor A, the covariant derivative of the covariant,
contravariant and mixed forms of the tensor is given by:
(288) A_{jk;i} = ∂_i A_{jk} − {l ji} A_{lk} − {l ki} A_{jl} (covariant)
(289) A^{jk}_{;i} = ∂_i A^{jk} + {j li} A^{lk} + {k li} A^{jl} (contravariant)
(290) A^k_{j;i} = ∂_i A^k_j + {k li} A^l_j − {l ji} A^k_l (mixed)
More generally, for a differentiable rank-n tensor A, the covariant derivative is given by:
(291) A^{ij…k}_{lm…p;q} = ∂_q A^{ij…k}_{lm…p} + {i aq} A^{aj…k}_{lm…p} + … + {k aq} A^{ij…a}_{lm…p} − {a lq} A^{ij…k}_{am…p} − … − {a pq} A^{ij…k}_{lm…a}
From the last equations, a pattern for the covariant differentiation operation emerges: it
starts with an ordinary partial derivative term, then for each tensor index an extra
Christoffel symbol term is added, positive for superscripts and negative for subscripts,
where the differentiation index is the second of the lower indices in the Christoffel symbol.
Since the Christoffel symbols are identically zero in Cartesian coordinate systems, the
covariant derivative is the same as the ordinary partial derivative for all tensor ranks.
Another important fact about covariant differentiation is that the covariant derivative of the
metric tensor in its covariant, contravariant and mixed forms is zero in all coordinate
systems and hence it is treated like a constant in covariant differentiation.
Several rules of ordinary differentiation similarly apply to covariant differentiation. For
example, covariant differentiation is a linear operation with respect to algebraic sums of
tensor terms, that is:
(292) ∂_{;i}(aA ± bB) = a ∂_{;i}A ± b ∂_{;i}B
where a and b are scalar constants and A and B are differentiable tensor fields.
The product rule of ordinary differentiation also applies to covariant differentiation of
tensor multiplication, that is:
(293) ∂_{;i}(AB) = (∂_{;i}A)B + A ∂_{;i}B
However, as seen in this equation, the order of the tensors should be observed since tensor
multiplication, unlike ordinary algebraic multiplication, is not commutative. The product
rule is also valid for the inner product of tensors because the inner product is an outer
product operation followed by a contraction of indices, and covariant differentiation and
contraction of indices do commute.
Since the covariant derivative of the metric tensor is identically zero, as stated above, the
covariant derivative operator bypasses the index raising/lowering operator, that is:
(294) ∂_{;m}(g_ij A^j) = g_ij ∂_{;m}A^j
and hence the metric tensor behaves like a constant with respect to the covariant
differential operator.
A principal difference between the partial differentiation and the covariant differentiation
is that for successive differential operations with respect to different indices the partial
derivative operators do commute with each other, assuming certain continuity conditions,
but the covariant differential operators do not commute, that is:
(295) ∂_i ∂_j = ∂_j ∂_i but ∂_{;i}∂_{;j} ≠ ∂_{;j}∂_{;i}
Higher order covariant derivatives are similarly defined as derivatives of derivatives; however, the order of differentiation, in the case of differentiating with respect to different indices, should be respected, as explained above.
7.1 Exercises
Exercise 1. Is the ordinary derivative of a tensor necessarily a tensor or not? Can the
ordinary derivative of a tensor be a tensor? If so, give an example.
Exercise 2. Explain the purpose of the covariant derivative and how it is related to the
invariance property of tensors.
Exercise 3. Is the covariant derivative of a tensor necessarily a tensor? If so, what is the
rank of the covariant derivative of a rank-n tensor?
Exercise 4. How is the Christoffel symbol of the second kind symbolized? Describe the
arrangement of its indices.
Exercise 5. State the mathematical definition of the Christoffel symbol of the second kind
in terms of the metric tensor defining all the symbols involved.
Exercise 6. The Christoffel symbols of the second kind are symmetric in which of their
indices?
Exercise 7. Why are the Christoffel symbols of the second kind identically zero in Cartesian coordinate systems? Use the mathematical definition of the Christoffel symbols in your explanation.
Exercise 8. Give the Christoffel symbols of the second kind for the cylindrical and
spherical coordinate systems explaining the meaning of the indices used.
Exercise 9. What is the meaning, within the context of tensor differentiation, of the comma
“,” and semicolon “;” when used as subscripts preceding a tensor index?
Exercise 10. Why is the covariant derivative of a differentiable scalar the same as the ordinary partial derivative, and how is this related to the basis vectors of coordinate systems?
Exercise 11. Differentiate the following tensors covariantly:
$A_s$
$B^t$
$C^{i}_{\;j}$
$D_{pq}$
$E^{mn}$
$A^{ij \ldots k}_{lm \ldots p}$
Exercise 12. Explain the mathematical pattern followed in the operation of covariant
differentiation of tensors. Does this pattern also apply to rank-0 tensors?
Exercise 13. The covariant derivative in Cartesian coordinate systems is the same as the
ordinary partial derivative for all tensor ranks. Explain why.
Exercise 14. What is the covariant derivative of the covariant and contravariant forms of
the metric tensor for an arbitrary type of coordinate system? How is this related to the fact
that the covariant derivative operator bypasses the index-shifting operator?
Exercise 15. Which rules of ordinary differentiation apply equally to covariant
differentiation and which do not? Make mathematical statements about all these rules with
sufficient explanation of the symbols and operations involved.
Exercise 16. Make corrections, where necessary, in the following equations explaining in
each case why the equation should or should not be amended:
$(\mathbf{C} \pm \mathbf{D})_{;i} = \partial_{;i}\mathbf{C} \pm \partial_{;i}\mathbf{D}$
$\partial_{;i}(\mathbf{A}\mathbf{B}) = \mathbf{B}\, (\partial_{;i}\mathbf{A}) + \mathbf{A}\, \partial_{;i}\mathbf{B}$
$(g_{ij}A^j)_{;m} = g_{ij}\, A^{j}_{\;;m}$
$\partial_{;i}\partial_{;j} = \partial_{;j}\partial_{;i}$
$\partial_i \partial_j = \partial_j \partial_i$
Exercise 17. How do you define the second and higher order covariant derivatives of
tensors? Do these derivatives follow the same rules as the ordinary partial derivatives of
the same order in the case of different differentiation indices?
References
• G.B. Arfken; H.J. Weber; F.E. Harris. Mathematical Methods for Physicists: A Comprehensive Guide. Elsevier Academic Press, seventh edition, 2013.
• R.B. Bird; R.C. Armstrong; O. Hassager. Dynamics of Polymeric Liquids, volume 1. John
Wiley & Sons, second edition, 1987.
• R.B. Bird; W.E. Stewart; E.N. Lightfoot. Transport Phenomena. John Wiley & Sons,
second edition, 2002.
• M.L. Boas. Mathematical Methods in the Physical Sciences. John Wiley & Sons Inc.,
third edition, 2006.
• C.F. Chan Man Fong; D. De Kee; P.N. Kaloni. Advanced Mathematics for Engineering
and Science. World Scientific Publishing Co. Pte. Ltd., first edition, 2003.
• T.L. Chow. Mathematical Methods for Physicists: A Concise Introduction. Cambridge University Press, first edition, 2003.
• J.H. Heinbockel. Introduction to Tensor Calculus and Continuum Mechanics. 1996.
• D.C. Kay. Schaum’s Outline of Theory and Problems of Tensor Calculus. McGraw-Hill,
first edition, 1988.
• K.F. Riley; M.P. Hobson; S.J. Bence. Mathematical Methods for Physics and Engineering.
Cambridge University Press, third edition, 2006.
• D. Zwillinger, editor. CRC Standard Mathematical Tables and Formulae. CRC Press,
32nd edition, 2012.
Footnotes
Footnote 1. Oblique coordinate systems are usually characterized by features similar to those of orthonormal Cartesian systems but with non-perpendicular axes.
Footnote 2. Since matrices in this book are supposed to represent rank-2 tensors, they also
follow the rules of labeling tensors symbolically by using non-indexed upper case bold
face non-italic Latin letters.
Footnote 3. This mainly applies in this book to Cartesian coordinate systems.
Footnote 4. As indicated, in notations like ∂r the subscript is used as a label rather than an
index.
Footnote 5. There are special tensors of numerical nature, such as the Kronecker and
permutation tensors, which do not require a particular coordinate system for their full and
unambiguous identification since their components are invariant under coordinate
transformations (the reader is referred to Chapter 4↑ for details). However, this issue may
be debated.
Footnote 6. As indicated before, these features are what qualify a Cartesian system to be
described as orthonormal.
Footnote 7. In fact, the basis vectors are constant only in rectangular Cartesian and oblique
Cartesian coordinate systems. As indicated before, the oblique Cartesian systems are the
same as the rectangular Cartesian but with the exception that their axes are not mutually
orthogonal. Also, labeling the oblique as Cartesian may be controversial.
Footnote 8. In the second equation, arctan(y ⁄ x) should be selected consistent with the
signs of x and y to be in the right quadrant.
Footnote 9. Again, arctan(y ⁄ x) in the third equation should be selected consistent with the
signs of x and y.
Footnote 10. This equality can be explained by the fact that the magnitude of the scalar triple product is equal to the volume of the parallelepiped, while the cyclic permutation preserves the sign, across these six possibilities.
Footnote 11. The stated facts about the other six possibilities can be explained by the two
invariances (as explained in the previous footnote) plus the fact that the cross product
operation is anti-commutative.
Footnote 12. Associativity here should be understood in its context as stated by the
inequality.
Footnote 13. Although we generally use (x, y, z) for the Cartesian coordinates of a
particular point in the space while we use (x1, x2, x3) to label the axes and the coordinates
of the Cartesian system in general, in this subsection we use (x, y, z), instead of
(x1, x2, x3), because it is more commonly used in vector algebra and calculus and is
notationally clearer, especially at this level and at this stage in the book. We also label the
basis vectors of the Cartesian system with (i, j, k), instead of (e1, e2, e3), for similar
reasons.
Footnote 14. As we will see later in the book, similar differential operations can also be
defined and performed on higher rank tensor fields.
Footnote 15. This operator is also known as the harmonic operator.
Footnote 16. In this book, only square matrices are of primary interest as they are
qualified to represent uniformly dimensioned rank-2 tensors. Row and column matrices are also qualified to represent vectors.
Footnote 17. The indexing of the entries of AT is not standard; the purpose of this is to
demonstrate the exchange of rows and columns.
Footnote 18. In brief, AB represents an inner product of A and B according to the matrix
notation, and an outer product of A and B according to the symbolic notation of tensors,
while A⋅B represents an inner product of A and B according to the symbolic notation of
tensors. Hence, AB in the matrix notation is equivalent to A⋅B in the symbolic notation of
tensors.
Footnote 19. In fact, this rule for the determinant of an n × n matrix applies even to the 2 ×
2 matrix if the cofactor of an entry in this case is taken as a single entry with the designated
sign. However, we separated the 2 × 2 matrix case in the definition to make it clearer and to avoid possible confusion.
Footnote 20. The matrix of cofactors (or cofactor matrix) is made of the cofactors of its
elements taking the same positions as the positions of these elements. The transposed
matrix of cofactors may be called the adjugate or adjoint matrix, although the terminology may differ between authors.
Footnote 21. This assertion, in fact, applies to the common cases of tensor applications;
however, there are instances, for example in the differential geometry of curves and
surfaces, of tensors which are not uniformly dimensioned because the tensor is related to
two spaces with different dimensions.
Footnote 22. As indicated previously, the symbol “ ≡ ” may be used for identity as well as
for definition.
Footnote 23. In the literature of tensor calculus, rank and order of tensors are generally
used interchangeably; however some authors differentiate between the two as they assign
order to the total number of indices, including contracted indices, while they reserve rank
to the number of free indices. We think the latter is better and hence in the present book we
embrace this terminology.
Footnote 24. We adopt this assertion, which is common in the literature of tensor calculus,
as we think it is suitable for this level. However, there are many instances in the literature
of tensor calculus where indices are repeated more than twice in a single term. The bottom
line is that as long as the tensor expression makes sense and the intention is clear, such
repetitions should be allowed with no need in our view to take special precaution like
using parentheses. In particular, the forthcoming summation convention will not apply
automatically in such cases although summation on such indices, if needed, can be carried
out explicitly, by using the summation symbol ∑ or by a special declaration of such
intention similar to the summation convention. Anyway, in the present book we will not use
indices repeated more than twice in a single term.
Footnote 25. These precautions are obviously needed if the summation convention is adopted in general, but the convention does not apply in some exceptional cases where repeated indices are needed in the notation with no intention of summation.
Footnote 26. This should not be confused with the order of tensor as defined above in the
same context as tensor rank. It should also be noticed that the order of indices in legitimate tensor expressions and equalities is not required to be the same in all terms; the point of the remark about the order of indices is to highlight the importance of this aspect of the indicial structure so that it is observed and clarified, making the tensor well defined and correctly notated and used.
Footnote 27. We exaggerate the spacing here for clarity.
Footnote 28. In many places in this book (like other books) and for the convenience in
typesetting, the order of the indices is not clarified by spacing or inserting dots in the case
of mixed type tensors. This commonly occurs where the order of the indices is irrelevant in
the given context or the order is clear. Sometimes, the order of the indices may be indicated
implicitly by the alphabetical order of the selected indices.
Footnote 29. The focus of this section is on providing examples of tensors of different
ranks. As we will see later in this chapter (refer to § 2.6.2↑), there are true and pseudo
scalars, vectors and tensors, and hence some of the statements and examples given here
may qualify for certain restrictions and conditions.
Footnote 30. Similar basis vectors are assumed.
Footnote 31. The use of upper indices in the symbols of general coordinates is to indicate
the fact that the coordinates and their differentials transform contravariantly.
Footnote 32. In fact there are standard mathematical procedures to orthonormalize the basis set if it is not orthonormal, should this be needed.
Footnote 33. The terminology in this part, like many other parts, is not universal.
Footnote 34. Inner product (see § 3.5↑) is the result of a direct product (see § 3.3↑)
operation followed by a contraction (see § 3.4↑) and hence it is like a direct product in this
context.
Footnote 35. The Jacobian J is the determinant of the Jacobian matrix J of the
transformation between the unbarred and barred systems, that is:
(296) $J = \det(\mathbf{J}) = \left| \dfrac{\partial x^i}{\partial \bar{x}^j} \right|$
For more details, the reader is advised to consult more advanced textbooks on this subject.
Footnote 36. Some of these labels are used differently by different authors as the
terminology of tensor calculus is not universally approved and hence the conventions of
each author should be checked. Also, there is an obvious overlap between this
classification (i.e. absolute and relative) and the previous classification (i.e. true and
pseudo) at least according to some conventions.
Footnote 37. This statement should be generalized by including w = 0 which corresponds
to absolute tensors and hence “relative” in this statement is more general than being
opposite to “absolute”. Accordingly, and from the perspective of relative tensors (i.e.
assuming that other qualifications such as matching in the indicial structure are met), two
absolute tensors can be added/subtracted but an absolute and a relative tensor (i.e. with w
≠ 0) cannot since they are “relative” tensors with different weights.
Footnote 38. For improper rotation, this is more general than being isotropic.
Footnote 39. Symmetry and anti-symmetry of tensors require in their definition two free
indices at least; hence a scalar with no index and a vector with a single index do not
qualify to be symmetric or anti-symmetric.
Footnote 40. In this context, like many other contexts in this book, there are certain
restrictions on the type and conditions of the coordinate transformations under which such
statements are valid. However, these details cannot be discussed here due to the
elementary level of this book.
Footnote 41. The best way to tackle this sort of exercise is to build a table or array of
appropriate dimensions where the indexed components in symbolic or numeric formats are
considered and a trial and error approach is used to investigate the possibility of creating
such a tensor.
Footnote 42. The concepts of repetitive and non-repetitive permutations may be useful in
tackling this question.
Footnote 43. Here, “type” refers to variance type (covariant/contravariant/mixed) and
true/pseudo type as well as other qualifications to which the tensors participating in an
addition or subtraction operation should match such as having the same weight if they are
relative tensors, as outlined previously (refer for example to § 2.6.3↑).
Footnote 44. Associativity and commutativity can include subtraction if the minus sign is
absorbed in the subtracted tensor; in which case the operation is converted to addition.
Footnote 45. As indicated before, there are cases of tensors which are not uniformly
dimensioned, and in some cases these tensors can be regarded as the result of an outer
product of lower rank tensors.
Footnote 46. Regarding the associativity of direct multiplication, there seem to be cases in which this operation is not associative. The interested reader is advised to refer to the research literature on this subject.
Footnote 47. The non-commutativity of the inner product in the case where only one of the involved tensors is of rank > 1 may not be obvious; however, a simple example is multiplying $A_i$ and $B^{j}_{\;kl}$ with a contraction of j and k or j and l.
Footnote 48. In fact, this statement is rather vague and rudimentary and may not apply in
some cases. There are many details related to the issue of commutativity of inner product
of tensors which cannot be discussed here due to the level of this book. In general, several
issues should be considered in this regard such as the order of the indices in the outer
product of the two tensors involved in the inner product, the (m, n) type of the outer
product and whether the contracted indices are contributed by the same tensor or the two
tensors involved in the product assuming that the first case is conventionally an inner
product operation. Another important issue to be considered is that the contracted indices
must, in general, be of opposite variance type. Many of these details can be worked out
rather easily by the vigilant reader from first principles if they cannot be obtained from the
textbooks of tensor calculus.
Footnote 49. It should be emphasized that we are using the symbolic notation of tensor
calculus, rather than the matrix notation, in writing Ab and A⋅b to represent, respectively,
the outer and inner products. In matrix notation, Ab is used to represent the product of a
matrix by a vector which is an inner product according to the terminology of tensor
calculus.
Footnote 50. In these statements, we assume that the contracted indices can be contributed
by the same tensor in the product as well as by the two tensors (i.e. one index from each
tensor); otherwise more details are required. We are also assuming a general coordinate
system where the contracted indices should be opposite in their variance type.
Footnote 51. This is also defined differently by some authors.
Footnote 52. This should not be confused with the quotient rule of differentiation.
Footnote 53. We assume, of course, that the rules of contraction of indices, such as being
of opposite variance type in the case of non-Cartesian coordinates, are satisfied in this
operation.
Footnote 54. This name is usually used for the rank-3 tensor. Also some authors
distinguish between the permutation tensor and the Levi-Civita tensor even for rank-3.
Moreover, some of the common labels and descriptions of ε are more specific to rank-3.
Footnote 55. “Conserved” means that the tensor keeps the values of its components
following a coordinate transformation.
Footnote 56. Here, being conserved under all transformations is stronger than being
isotropic as the former applies even under improper coordinate transformations while
isotropy is restricted to proper transformations.
Footnote 57. Considering the difference in weight, the difference in the variance type on
the two sides of the equations should not be considered as a violation to the rules of
indices as stated in § 2.3↑. Similarly, considering the difference in the variance type, the
difference in weight should not be considered as a violation to the rules of equating
relative tensors as stated in § 2.6.3↑.
Footnote 58. This identity, like many other identities in this chapter and in the book in
general, is valid even for general coordinate systems although we use Cartesian notation to
avoid unnecessary distraction at this level. The alert reader should be able to notate such
identities in their general forms.
Footnote 59. As explained previously, “orthonormal” means that the vectors in the set are
mutually orthogonal and each one of them is of unit length.
Footnote 60. This also applies to the zero entries of this tensor which correspond to the
permutations with repetitive indices.
Footnote 61. This is a product of the rank-n permutation tensor by itself entry-by-entry
with the application of the summation convention and hence it can be seen as a multi-contraction inner product of the permutation tensor by itself.
Footnote 62. The role of these indices in indexing the rows and columns can be shifted.
This can be explained by the fact that the positions of the two epsilons can be exchanged,
since ordinary multiplication is commutative, and hence the role of the epsilons in
providing the indices for the rows and columns will be shifted. This can also be done by
taking the transposition of the array of the determinant, which does not change the value of
the determinant since det(A) = det(AT), with an exchange of the indices of the Kronecker
symbols since the Kronecker symbol is symmetric in its two indices.
Footnote 63. That is:
$\epsilon_{il}\, \epsilon_{kl} = \delta_{ik}\delta_{ll} - \delta_{il}\delta_{lk} = 2\delta_{ik} - \delta_{il}\delta_{lk} = 2\delta_{ik} - \delta_{ik} = \delta_{ik}$
Footnote 64. That is:
$\epsilon_{ijk}\, \epsilon_{lmk} = \delta_{il}\delta_{jm}\delta_{kk} + \delta_{im}\delta_{jk}\delta_{kl} + \delta_{ik}\delta_{jl}\delta_{km} - \delta_{il}\delta_{jk}\delta_{km} - \delta_{im}\delta_{jl}\delta_{kk} - \delta_{ik}\delta_{jm}\delta_{kl}$
$= 3\delta_{il}\delta_{jm} + \delta_{im}\delta_{jl} + \delta_{im}\delta_{jl} - \delta_{il}\delta_{jm} - 3\delta_{im}\delta_{jl} - \delta_{il}\delta_{jm}$
$= \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$
Footnote 65. In fact, the determinantal form given by Eq. 151↑ can also be considered as a
mnemonic device where the first and second indices of the first ε index the rows while the
first and second indices of the second ε index the columns, as given above. However, the
mnemonic device of Eq. 153↑ is more economic in terms of the required work and more
convenient in writing. It should also be remarked that the determinantal form in all the
above equations is in fact a mnemonic device for these equations where the expanded form,
if needed, can be easily obtained from the determinant which can be easily built following
the simple pattern of indices, as explained above.
Footnote 66. That is:
$\epsilon_{ijk}\, \epsilon_{ljk} = \delta_{il}\delta_{jj} - \delta_{ij}\delta_{jl} = 3\delta_{il} - \delta_{il} = 2\delta_{il}$
Footnote 67. In fact this can be obtained from the determinantal form of $\delta^{ijk}_{lmn}$ by relabeling n with k.
Footnote 68. Matrix multiplication in matrix algebra is equivalent to inner product in
tensor algebra.
Footnote 69. The direction is also invariant but it is not a scalar! In fact the magnitude
alone is invariant under coordinate transformations even for pseudo vectors because it is a
true scalar.
Footnote 70. There is another reason, namely that the components given in the cylindrical and spherical coordinates are physical components, not covariant or contravariant ones, and hence suffixing with coordinates looks more appropriate. The interested reader should consult more advanced textbooks of tensor calculus on this issue.
Footnote 71. It may also be argued more simply that the divergence of a vector is a scalar
and hence it is invariant.
Footnote 72. It should be obvious that since ρ, φ and z are labels for specific coordinates
and not variable indices, the summation convention does not apply.
Footnote 73. Again, the summation convention does not apply to r, θ and φ.