Notes2018 7 PDF
Jan W. Thomsen
Spring 2010
Contents
1 Introduction
2 Maxwell’s Equations
2.1 General form of Maxwell’s Equations
2.2 Maxwell’s Equations in vacuum
2.2.1 Measurements of the speed of light
2.2.2 Solutions to the wave equation
2.3 Model of electron motion in materials
2.3.1 Rayleigh scattering of light
2.3.2 Energy of an EMW
2.3.3 Momentum of light
2.4 Maxwell’s Equations for an insulator
2.4.1 Simple model of the index of refraction
2.4.2 Real and complex index of refraction
2.4.3 Group and phase velocity
2.4.4 Dispersion viewed in ω, k-diagrams
2.5 Negative refractive index
2.6 Maxwell’s Equations for a conductor
2.6.1 Frequency dependent conductivity: Drude’s model
3 Propagation of light
3.1 Fermat’s principle
3.1.1 Solutions to Euler Lagrange equations
3.2 Snell’s law
3.2.1 Applications of Snell’s law
3.3 Fresnel’s laws of reflection
3.3.1 Application of Fresnel’s equations
3.3.2 Intensity relations
3.3.3 Metals
4 Geometrical optics
4.1 Optical elements
4.1.1 Aspherical surfaces
4.1.2 Imaging with spherical surfaces
4.1.3 Common spherical lens errors
4.2 Ray tracing
4.2.1 Thick lenses
4.2.2 Combination of two thin lenses
5 Polarization of light
5.1 Polarization states of monochromatic waves
5.1.1 Jones vectors
5.1.2 Mathematical description of light: Stokes vector
5.1.3 Optically active crystals
5.1.4 Maxwell’s equations for an anisotropic medium
5.1.5 Group velocity vg in anisotropic materials
5.2 Production and manipulation of polarized light
5.2.1 Polarizers
5.2.2 Wave plates and retarders
5.2.3 Optical activity, magneto-optical effect and Faraday-rotators
6 Interference of light
6.1 Coherence of light fields
6.1.1 Interference - general considerations
6.2 Young’s experiment
6.3 Thin films
6.4 Interferometers and their applications
7 Diffraction of light
7.1 Interference of N sources - the grating
7.2 The modified Huygens-Fresnel construction
7.3 Fraunhofer diffraction
7.4 Babinet’s principle
7.5 Rayleigh’s criterion for angular resolution
1
Introduction
Before studying any field one should begin by asking an important
question: Why? Why study optics? Three good reasons:
• It is inherently beautiful.
• It is very important for present and future high-tech industry.
• It is the basis for many other physics disciplines.
The field of optics is one of the most fascinating and amazing topics
of science. It is visible and very appealing! Most of us perceive the world
with the incredibly powerful visual sense, so it is very natural to be cu-
rious about how vision and optics in general actually work. Yet many
elements of the theory are abstract and indeed complicated to imagine.
But in optics we can illustrate complicated theories by simple experi-
ments and demonstrations that support our understanding of light and
how it interacts with atoms and matter. Many beautiful optical phenom-
ena can be observed in nature, such as rainbows, halos, sun dogs, green
flashes etc. A good understanding of these phenomena is a must for any
physics student. On the front cover of these notes you can see a “camera
obscura” – people speculate that this is the oldest optical instrument
used already by cavemen early in the history of mankind.
The science of optics is one of the most important fields of physics
as it provides a firm basis for many other disciplines of science and
technology. In industry, telecommunication and optical fiber based data
transfer is becoming increasingly important. As an example, every day
the fiber industry makes fibers corresponding to a length that can reach
seven times around the planet. Any improvements in these systems will
have tremendous impact on economy and society. Here you can make a
difference!
If you wish to study in detail any field of science or physics you are
bound to come across the field of optics. That may be laser technology
in Atomic Molecular and Optical physics (AMO), near field imaging
microscopes in biophysics, optical simulation of LHC detectors, gravita-
tional wave detectors, telescopes in astronomy. Therefore, it is not our
job to turn you into an optician. However, we will provide you with a
toolbox that you can hopefully use on your way in the science landscape.
In the future, many ways and things may come and go, but Maxwell’s
equations will never be abandoned.
This set of notes was prepared in connection with an optics class
taught at the Niels Bohr Institute in spring 2010. The class was divided
into a series of lectures, problem sessions and experimental classes. As a
prerequisite some formal knowledge of Maxwell’s equations in differen-
tial form was assumed. These notes are a living document and you can
contribute to improve them by pointing out things that are unclear to
you to your teachers and instructors.
The notes are intended to help you to learn the ropes in optics, but
they cover by no means the full breadth of the field. A very useful supple-
ment is the splendid (alas, expensive) book “Optics” by Eugene Hecht,
which – even if somewhat old-fashioned – provides a very good overview
and many examples accessible for undergraduates. Check out your fa-
vorite book-store, ask older students, or browse the web to get a hand
on the 4th or 5th edition. For a quick brush-up on the use of vector anal-
ysis in electromagnetism you should always have your copy of Griffiths’
“Introduction to Electrodynamics” within reach.
2
Maxwell’s Equations
In this chapter we will study Maxwell’s four equations and their solutions
in three distinct cases. These four equations are truly a milestone in
human history. So enjoy the adventure you are about to begin, one you
may keep exploring and thinking about for the rest of your life! After a
short general introduction, we begin with the simplest case, namely
vacuum, or empty space if you like, a medium with no sources or currents.
In the second case we look at an insulator, where the conductivity is
zero, and finally we will study a conductor, such as a metal, which has a
finite conductivity.
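\begin{align}
\nabla \cdot \mathbf{D} &= \rho \tag{2.1} \\
\nabla \cdot \mathbf{B} &= 0 \tag{2.2} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{2.3} \\
\nabla \times \mathbf{H} &= \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{2.4}
\end{align}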
These laws must be treated with a certain amount of respect. They tell
how fields are produced, what causes them and how the fields are inter-
connected. The first law we call “Gauss’ law”, then follows “no mag-
netic monopoles”, on to “Faraday’s law”, and finally, “Ampère’s
law” with Maxwell’s correction. Maxwell’s equations unify electricity,
magnetism and light. Gauss’ law expresses the electric field in terms of
its sources - the charges. The second equation expresses that there are
no magnetic “charges”, magnetic sources always come with a south and
a north pole. Faraday’s law states that a time varying magnetic field
produces a circulatory electric field, - an electromotive force. Ampère’s
law states how a magnetic field is related to its sources, - moving charges
and a time varying electric field. The perhaps most important law for
our everyday life is Faraday’s law, controlling electric motors, generation
of electricity, electromagnets etc.
As you may have observed Maxwell’s equations do not contain ex-
plicitly any material specific parameters such as the permeability µ and
permittivity ε. Those are introduced through the constitutive relations
stated below and express how fields interact with matter. They also de-
fine how the auxiliary fields D and H are related to the fields E and B
appearing in the Lorentz force law. For simple homogeneous materials
in moderate fields we have the linear field relations:
D = εE (2.5)
B = µH. (2.6)
Below we list for reference the connection to free space or vacuum per-
mittivity ε0 and vacuum permeability µ0 .
ε = ε0 (1 + χ) (2.7)
µ = µ0 (1 + χm ) (2.8)
B = µ0 (H + M) = µ0 (1 + χm )H (2.11)
where the first term is the polarization of the medium in absence of external
applied electric field, often zero, the next term is linear in the field, third term
quadratic in the field etc. The non-linear terms give rise to a huge amount of new
physics. We call it non-linear optics, a very rich field of optics with phenomena
such as frequency doubling (two photons of frequency ω combine to one photon
of frequency 2ω), down conversion (one photon of frequency ω disintegrates into
two new photons of frequency ω′ and ω″) etc.
The units can also be expressed as: [F/m] = [As/(Vm)] and [H/m] =
[N/A²].
ε = ε′ + iε″, (2.14)
ε’ and ε’’
depicted in figure 2.2. Notice that not all four frequency regimes, ionic,
rotational, atomic and electronic need to be present for a given material.
For some materials the rotational regime may dominate, others only the
ionic part. Inside metals, things are a bit different. A good conductor
is easy to polarize, but no DC electric fields can exist inside a good
conductor. So at DC we would expect ε′ to be infinite (or very large).
However at higher frequencies, this is not the case. For example a good
conductor (Ag, Au, Al, etc.) at optical frequencies (∼ 10^14 Hz) has
negative relative permittivity ε′ ∼ −5. This reflects that electrons cannot
follow infinitely fast, they have an inertial mass limiting how fast they
may be dragged around.
Let’s return to dielectrics. At zero frequency, i.e., at DC we will mea-
sure the DC dielectric constant. As the frequency increases we observe
a decreasing dielectric constant. In the frequency range 1 to 10^6 Hz we are
Vacuum
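\begin{align}
\nabla \cdot \mathbf{E} &= 0 \tag{2.15} \\
\nabla \cdot \mathbf{B} &= 0 \tag{2.16} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{2.17} \\
\nabla \times \mathbf{B} &= \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}, \tag{2.18}
\end{align}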
as we have no sources. The material equations become particularly sim-
ple ε = ε0 , µ = µ0 . If we take the curl of Faraday’s law we get
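\begin{align}
\nabla \times (\nabla \times \mathbf{E}) &= -\frac{\partial (\nabla \times \mathbf{B})}{\partial t} \tag{2.19} \\
-\nabla^2 \mathbf{E} &= -\varepsilon_0 \mu_0 \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{2.20}
\end{align}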
In the last step we used Ampère’s law and the vector identity ∇ × (∇ ×
A) = ∇(∇ · A) − ∇²A together with Gauss’ law ∇ · E = 0. Finally, we
recognize the result as a wave equation for the E-field:
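\begin{align}
\nabla^2 \mathbf{E} &= \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{2.21} \\
\nabla^2 \mathbf{B} &= \frac{1}{c^2} \frac{\partial^2 \mathbf{B}}{\partial t^2}. \tag{2.22}
\end{align}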
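Comparing (2.20) with (2.21) identifies the propagation speed as
c = 1/√(ε0 µ0). The short Python sketch below evaluates this number. The
constant values and the helper name `wave_speed` are assumptions chosen
for illustration and are not part of the notes.

```python
import math

# Vacuum constants in SI units (assumed values for this sketch).
MU_0 = 4e-7 * math.pi          # H/m, classical defined value of the permeability
EPSILON_0 = 8.8541878128e-12   # F/m, vacuum permittivity


def wave_speed(epsilon: float, mu: float) -> float:
    """Propagation speed 1/sqrt(epsilon*mu) read off from the wave equation."""
    return 1.0 / math.sqrt(epsilon * mu)


c = wave_speed(EPSILON_0, MU_0)
print(f"c = {c:.6e} m/s")
```

The same helper can be reused with material values of ε and µ to get the
propagation speed in a medium.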
Problem 2.3 Obtain the wave equation for the B-field using Maxwell’s
equations.
[Figure: measurements of the speed of light. Labels: Sun, Earth orbit, Jupiter; observer, semi-transparent mirror plate, rotating cogwheel, L = 4315 m.]
the infrared laser, and the resulting value for c, is given below:
• Plane waves
• Spherical waves
• Cylindrical waves
• Gaussian waves (approximate solution used in laser physics)
\[ A_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{2.27} \]
where we in the last step only looked for the magnitude. Below we will
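see that vp = c. Note that if we assume traveling plane wave solutions
for, say, the E-field,
\[ \mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{2.29} \]
it implies that all fields are harmonic plane wave solutions of this form.
So in total we operate with a set of solutions like:
\begin{align}
\mathbf{E}(\mathbf{r}, t) &= \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{2.30} \\
\mathbf{D}(\mathbf{r}, t) &= \mathbf{D}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{2.31} \\
\mathbf{B}(\mathbf{r}, t) &= \mathbf{B}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{2.32} \\
\mathbf{H}(\mathbf{r}, t) &= \mathbf{H}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. \tag{2.33}
\end{align}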
Just by acting with the ∇ operator on those equations, remembering the
Leibniz rule 3 and using Maxwell’s equations we can derive a number
of important relations for the propagation of light. For example from
Gauss’ law and the law stating ”no magnetic monopoles” we obtain:
∇ · E = ik · E = 0 and ∇ · B = ik · B = 0. (2.34)
so both the E- and B-fields are perpendicular to the k vector, the direc-
tion of propagation. This shows that the electromagnetic wave is trans-
verse. Using Faraday’s law we get:
∇ × E = ik × E = −(−iωB), so k × E = ωB. (2.35)
Then we can write E × B as:
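\[ \mathbf{E} \times \mathbf{B} = \frac{1}{\omega}\, \mathbf{E} \times (\mathbf{k} \times \mathbf{E}) \tag{2.36} \]
Using the vector identity
\[ \mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = (\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{A} \cdot \mathbf{B})\mathbf{C} \]
and the transversality k · E = 0 from (2.34), we get
\[ \mathbf{E} \times \mathbf{B} = \frac{(\mathbf{E} \cdot \mathbf{E})}{\omega}\, \mathbf{k}. \tag{2.37} \]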
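As a numerical sanity check of (2.35) and (2.37), the sketch below builds
B from k × E = ωB for one real, transverse field amplitude and compares
E × B with (E · E)k/ω. The frequency, the field amplitude and the helper
names `cross` and `dot` are illustrative assumptions; real amplitudes are
used, which corresponds to a snapshot of the in-phase fields.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


c = 299_792_458.0                  # speed of light in vacuum, m/s
omega = 2 * math.pi * 5e14         # illustrative optical angular frequency, rad/s
k = (0.0, 0.0, omega / c)          # wave vector along +z with |k| = omega / c
E = (3.0, 4.0, 0.0)                # transverse amplitude in V/m, so k . E = 0

# Faraday's law for plane waves, k x E = omega B  (2.35)
B = tuple(component / omega for component in cross(k, E))

lhs = cross(E, B)                                  # E x B
rhs = tuple(dot(E, E) / omega * ki for ki in k)    # (E . E) k / omega  (2.37)

E_mag = math.sqrt(dot(E, E))
B_mag = math.sqrt(dot(B, B))
print("E x B           =", lhs)
print("(E.E) k / omega =", rhs)
print("E / (c B)       =", E_mag / (c * B_mag))
```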
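3 Two tricks that often come in handy:
\[ \nabla \cdot (\mathbf{A} f(x, y, z)) = (\nabla \cdot \mathbf{A}) f(x, y, z) + \mathbf{A} \cdot (\nabla f(x, y, z)) \]
\[ \nabla \times (\mathbf{A} f(x, y, z)) = (\nabla \times \mathbf{A}) f(x, y, z) + \nabla f(x, y, z) \times \mathbf{A} \]
where f(x, y, z) is a well behaved function of x, y, z and A is a vector. For
example, for a constant vector E0,
\[ \nabla \cdot \left(\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\right) = i\mathbf{k} \cdot \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \quad \text{and} \quad \nabla \times \left(\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\right) = i\mathbf{k} \times \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \]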
Problem 2.5 Show that for a traveling plane wave ω = ck , and de-
duce the phase velocity vp = c. Show also that E = cB linking the
magnitude of the electric field and the magnetic field, an important re-
lation we will use many times.
E ⊥ k (2.38)
B ⊥ k (2.39)
E × B = ((E · E)/ω) k (2.40)
E = cB (2.41)
Figure 2.5 Traveling electromagnetic wave propagating in the +z
direction. Notice how the E-field and the B-field are in phase and
orthogonal to each other. The direction of the E-field we call the
polarization of the light, here linearly polarized light, polarized along
the x-direction.
[Figure: sunlight scattered towards an observer; labels: k in, k out.]
polarized
sun glasses. Orienting the polarizer along the flash lamp direction will
kill all the scattered light as it is perfectly polarized perpendicular to
this direction.
where q is the charge, E the electric field, v the charge velocity and B
the magnetic field.
We model the active electrons attracted to the positive charge (nu-
cleus) by a harmonic potential with a resonance frequency ω0 . Actually,
4 Dichroism means light rays having different polarizations are absorbed by
different amounts.
Figure 2.7 An atom or molecule interacting with a plane wave EMW
of the form E = E0 e^{−iωt}. We can ignore the spatial dependence
following the dipole approximation, see main text for details.
\[ x(t) = \frac{q E_0}{m(\omega_0^2 - \omega^2)}\, e^{-i\omega t}. \tag{2.45} \]
What does this mean? As the light frequency approaches the resonance
frequency ω0 of our atom or molecule, the amplitude becomes larger and
larger, and eventually it diverges. This is because damping is not
included: we keep pumping energy into the system. The divergence is
unphysical. However, such a limited model gives the correct physics in
the limit we will be discussing, except when ω0 ∼ ω.
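To get a feeling for how quickly the amplitude in (2.45) grows near
resonance, the following Python sketch evaluates it at a few driving
frequencies. The resonance frequency, the field strength and the function
name `amplitude` are hypothetical choices made only for illustration.

```python
import math

E_CHARGE = 1.602176634e-19    # C, magnitude of the electron charge
E_MASS = 9.1093837015e-31     # kg, electron mass


def amplitude(omega, omega0, E0, q=E_CHARGE, m=E_MASS):
    """Amplitude q*E0 / (m*(omega0**2 - omega**2)) of the undamped solution (2.45)."""
    return q * E0 / (m * (omega0 ** 2 - omega ** 2))


omega0 = 2 * math.pi * 3.0e15   # hypothetical resonance in the UV, rad/s
E0 = 1.0                        # illustrative field amplitude, V/m

for ratio in (0.1, 0.5, 0.9, 0.99, 0.999):
    x0 = amplitude(ratio * omega0, omega0, E0)
    print(f"omega/omega0 = {ratio:5.3f}  ->  x0 = {x0:.3e} m")
```

Above resonance the sign of the amplitude flips, which means the electron
oscillates in antiphase with the driving force.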
The model presented in this chapter seems at first glance a bit simplistic.
However, it turns out to be a very powerful tool to get the physics.
Interestingly, the simple model of a driven damped harmonic oscillator
fully agrees with the result of quantum mechanics. We will use the simple
model again and again. It is a good idea to familiarize yourself with it.
\[ \mathbf{S} = \frac{\ddot{P}^2 \sin^2\theta}{16\pi^2 \varepsilon_0 c^3 r^2} \cdot \frac{\mathbf{r}}{r}. \tag{2.46} \]
Here r is the vector to the observation point and θ the angle of the
dipole axis with respect to r. For a proof we refer to other texts and
books7 . The amount of power radiated by an accelerated charge is ob-
tained by integrating the above expression for the Poynting vector over
7 Classical Electrodynamics by J.D. Jackson, 3rd ed., Classical Theory of Fields by
L.D. Landau. et al., R. Feynman et al, Lectures on Physics vol.2, D.J. Griffiths
Introduction to electrodynamics, Reitz et al, Foundations Of Electromagnetic
Theory, 3rd Edition
the unit sphere and using that the dipole moment can be written as
P = qx(t). This gives:
\[ P_{\mathrm{rad}} = \frac{q^2}{4\pi\varepsilon_0} \frac{2a^2}{3c^3}, \tag{2.47} \]
where a = ẍ is the acceleration of the charge. Now we put up a simple
model of an atom or molecule, as above, to see what amount of power
that is radiated and at what frequency. The medium we have in mind is
a dilute transparent one, such as a gas where the molecules are placed
at random and uncorrelated positions8 like e.g. the atmosphere of earth.
In these systems the resonance transitions of the atoms or molecules are
in the UV or far infrared and usually not in the visible range 400 nm -
700 nm, hence there is low absorption in the visible range. Taking the
double time derivative of equation (2.44) and using the limit ω0 ≫ ω
gives the scattered power:
Rayleigh Scattering
\[ P_{\mathrm{rad}} \propto \frac{\omega^4}{\omega_0^4}. \tag{2.48} \]
Rayleigh scattering applies when particles, atoms and molecules are sig-
nificantly smaller than the light wavelength λ. Generally, scattering is
explained by the Mie theory for an arbitrary size. For small scatterer sizes
the Mie theory reduces to the Rayleigh approximation.
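Because ω is proportional to 1/λ, the ω⁴ law in (2.48) can be turned into
a quick estimate of how much more strongly short wavelengths are
scattered. The sketch below does this for two wavelengths; the values
450 nm and 650 nm and the function name `rayleigh_ratio` are illustrative
assumptions.

```python
def rayleigh_ratio(lambda_a_nm: float, lambda_b_nm: float) -> float:
    """Scattered power at wavelength a relative to wavelength b.

    Uses P_rad ~ omega**4 ~ 1/lambda**4 from (2.48); the common resonance
    frequency omega0 cancels in the ratio.
    """
    return (lambda_b_nm / lambda_a_nm) ** 4


blue_nm, red_nm = 450.0, 650.0   # illustrative "blue" and "red" wavelengths
ratio = rayleigh_ratio(blue_nm, red_nm)
print(f"Blue light is scattered about {ratio:.1f} times more strongly than red")
```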
and only red is left. The sky becomes a more intense blue, and the
sunset a more intense red, the more we pollute, sadly. We may then ask
why the sunset is not purple. We need to include two things: the
detector response of our eyes and the wavelength distribution of the
sunlight. The eye is most sensitive to light at 550 nm and the sun’s
emission spectrum peaks at 500 nm. Taking that into account settles the
most intense observed light at the blue color.
As we discussed before, you can also observe the effect using a flash
lamp and a glass of water, very similar to example 2.6. Shine the flash
lamp through the water and observe the scattered light at 90 degrees as
you slowly add a bit of full-fat milk. Milk contains tiny fat particles of
sub-micron size. As you add milk slowly you will see blue light at 90
degrees and the flash lamp will appear red when you look through the
milky water. In a more advanced version add 20 grams of sodium thiosulfate
to 2 l of water stored in a small fish tank. Then add 20 ml sulfuric acid
and stir. This will allow small sulfur particles to be generated that
scatter the light. After about 5 minutes you will see a nice blue color
at 90 degrees and a nice sunset for light traversing the fish tank.
Problem 2.9 The clouds are white. Is this a result of Rayleigh
scattering? What could be the reason for the white color?
U = ε0 E² = ε0 EBc. (2.51)
Poynting’s vector
\[ \mathbf{S} = \frac{\mathbf{E} \times \mathbf{B}}{\mu_0} = \mathbf{E} \times \mathbf{H} \tag{2.53} \]
representing the energy flux per unit area. Note that the rightmost ex-
pression can be used also inside all materials.
\[ \langle f(t) \rangle_T = \frac{1}{T} \int_t^{t+T} f(t')\, dt', \tag{2.56} \]
where T is chosen sufficiently large10 . For a monochromatic field T =
2π/ω is sufficient. The time averaged Poynting vector becomes:
\[ \langle \mathbf{S}(t) \rangle_T \equiv S = \frac{E_0 B_0}{2\mu_0} = \frac{1}{2} \varepsilon_0 c E_0^2, \tag{2.58} \]
since the time average of a cosine or sine squared over an optical period
or longer is equal to 1/2. Remark: detectors are sensitive to intensity
and not electric field, this includes the eye. It is important to note that
the energy flux per unit area scales as the square of the E- or B-field.
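The factor 1/2 and the quadratic scaling in (2.58) can be checked
numerically. The sketch below averages cos² over one period with a simple
sum and converts between intensity and field amplitude. The sample count,
the test intensity of 1000 W/m² and the function names are assumptions
made for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12   # F/m, vacuum permittivity
C = 299_792_458.0              # m/s, speed of light in vacuum


def mean_cos_squared(n_samples: int = 10_000) -> float:
    """Average of cos^2 over one full period, using equally spaced samples."""
    total = sum(math.cos(2 * math.pi * i / n_samples) ** 2 for i in range(n_samples))
    return total / n_samples


def intensity(E0: float) -> float:
    """Time-averaged energy flux (1/2) eps0 c E0^2 from (2.58), in W/m^2."""
    return 0.5 * EPSILON_0 * C * E0 ** 2


def field_amplitude(I: float) -> float:
    """Invert (2.58): field amplitude in V/m for a given intensity in W/m^2."""
    return math.sqrt(2.0 * I / (EPSILON_0 * C))


print("mean of cos^2 over one period:", mean_cos_squared())
E0 = field_amplitude(1000.0)   # illustrative intensity of 1 kW/m^2
print(f"E0 for 1000 W/m^2: {E0:.1f} V/m")
print(f"Doubling E0 multiplies the intensity by {intensity(2 * E0) / intensity(E0):.1f}")
```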
10 For a given electromagnetic field, not necessarily monochromatic, we need to
consider the limit
\[ \lim_{T \to \infty} \langle f(t) \rangle_T = \lim_{T \to \infty} \frac{1}{T} \int_t^{t+T} f(t')\, dt', \tag{2.57} \]
If you calculate the energy flux into your eye (pupil radius = 2 mm)
you obtain about 5 mW. This is also the danger level for working with
lasers and for red light an upper limit of 1 mW is set (max exposure
time one second). This means light levels below 1 mW (exposure times
at one second) will not seriously harm the functioning of your eye. Most
laser pointers, however, you can buy deliver about 5-20 mW total opti-
cal power or more. Most green laser pointers emit also infrared radiation
with considerable power. The output beams can all be very very danger-
ous, even after being reflected off common surfaces, like a glass window
or just a finger nail. Never look into lasers or laser pointers! Never
look directly at the sun!
Photon momentum
\[ p = \frac{h}{\lambda} \tag{2.60} \]
The cooling effect happens as the laser light is tuned below resonance
of the atomic transition, νlaser < ν0. Since the atom moves, the Doppler
effect shifts the light into resonance and the absorption process
happens frequently. The emission, however, takes place at ν0 with very
high probability. So on average we remove the energy ∆E = h(ν0 − νlaser)
for each absorption-emission cycle. Remember we have about 10^8 of
these cycles per second. This energy is removed from the center of mass
motion of the atom and it slows down. Physicists were awarded the 1997
Nobel prize in physics for this spectacular achievement.
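\begin{align}
\nabla \cdot \mathbf{E} &= 0 \tag{2.65} \\
\nabla \cdot \mathbf{B} &= 0 \tag{2.66} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{2.67} \\
\nabla \times \mathbf{B} &= \varepsilon \mu \frac{\partial \mathbf{E}}{\partial t}, \tag{2.68}
\end{align}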
as we have no sources and the material equations are controlled by ε, µ.
This gives a wave equation similar to before:
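\[ \nabla^2 \mathbf{E} = \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2}, \]
with a propagation speed
\[ v = \frac{1}{\sqrt{\varepsilon \mu}}. \]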
With this relation in mind we define the index of refraction n as
the ratio:
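\[ n = \frac{c}{v} = \sqrt{\frac{\varepsilon}{\varepsilon_0} \frac{\mu}{\mu_0}}. \]
\[ n(\omega) = \sqrt{\frac{\varepsilon}{\varepsilon_0}}. \]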
For harmonic waves, the wave equation (2.69) imposes the dispersion
relation:
\[ \omega = \frac{c}{n(\omega)} \cdot k. \tag{2.70} \]
\[ v_p = \frac{c}{n(\omega)} \tag{2.71} \]
Problem 2.15 Show equation (2.70) using the expression for a har-
monic wave E = E0 cos(ωt − k · r) and equation (2.69).
Analogous to the case of EMW in vacuum we deduce:
Insulator
E ⊥ k (2.72)
B ⊥ k (2.73)
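\begin{align}
\mathbf{E} \times \mathbf{B} &= \frac{1}{\omega} (\mathbf{E} \cdot \mathbf{E})\, \mathbf{k} \tag{2.74} \\
E &= \frac{c}{n} B \tag{2.75} \\
\mathbf{S} &= \mathbf{E} \times \mathbf{H} \tag{2.76} \\
\langle S \rangle &= \frac{1}{2\mu_0} \frac{n}{c} E_0^2 \tag{2.77}
\end{align}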
The last relation follows from the energy density which can be written
as:
\[ U = \varepsilon E_0^2 = \frac{1}{\mu_0} \left(\frac{n}{c}\right)^2 E_0^2. \]
Isotropic insulator
\[ \frac{n\omega}{c}\, \hat{\mathbf{S}} = \mathbf{k}. \tag{2.78} \]
c
This relation holds in isotropic materials and shows that the flow of
energy is directed along the k-vector, which seems very natural indeed.
Later we will see that this is not always the case, in particular for non-
isotropic materials, such as uniaxial and bi-axial crystals.
D = ε0 E + P (2.79)
= εE , with ε = ε0 (1 + χ)
\[ n(\omega)^2 = 1 + \frac{N}{V} \cdot \frac{q^2}{m \varepsilon_0 (\omega_0^2 - \omega^2 - i\gamma\omega)}. \tag{2.80} \]
\[ n^2 = 1 + \frac{q^2 N}{m \varepsilon_0 V} \sum_j \frac{f_j}{\omega_{0j}^2 - \omega^2 - i\gamma_j \omega}. \tag{2.84} \]
Figure 2.11 Plot of the real index n′ and the imaginary index n″ as a function
of ω using equation (2.84). The real part shows strong dispersion around
the resonance frequency while the imaginary part is a Lorentz curve
showing maximal absorption at ω = ω0.
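The behaviour described in the caption can be reproduced with a few lines
of Python based on the single-resonance expression (2.80). The number
density, resonance frequency and damping rate below are hypothetical
values for a dilute medium, and the function name `refractive_index` is
an assumption of this sketch.

```python
import cmath
import math

E_CHARGE = 1.602176634e-19     # C
E_MASS = 9.1093837015e-31      # kg
EPSILON_0 = 8.8541878128e-12   # F/m


def refractive_index(omega, omega0, gamma, density):
    """Complex n = n' + i n'' from (2.80) for a single resonance.

    density is N/V in m^-3. cmath.sqrt returns the root with a
    non-negative real part, which is the physically relevant branch here.
    """
    n_squared = 1 + density * E_CHARGE ** 2 / (
        E_MASS * EPSILON_0 * (omega0 ** 2 - omega ** 2 - 1j * gamma * omega)
    )
    return cmath.sqrt(n_squared)


omega0 = 2 * math.pi * 1.0e15   # hypothetical resonance frequency, rad/s
gamma = 0.05 * omega0           # hypothetical damping rate
density = 1.0e25                # hypothetical number density N/V, m^-3

for ratio in (0.5, 0.9, 1.0, 1.1, 1.5):
    n = refractive_index(ratio * omega0, omega0, gamma, density)
    print(f"omega/omega0 = {ratio:3.1f}:  n' = {n.real:.6f},  n'' = {n.imag:.6f}")
```

Below the resonance the real part is slightly above 1, above the
resonance it drops below 1, and the imaginary part is largest at the
sampled point ω = ω0.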
[Figure: index of refraction (axis ticks 1.3 and 1.5) versus wavelength λ from 300 nm to 1000 nm.]
equation (2.84), or (2.80) if you prefer. The reason is that our assump-
tions regarding the local electric field are not completely true. In dense
materials atoms will feel neighboring atoms located in the vicinity which
modifies the local electric field. This field must be added to the external
field from the EMW. In many books on EM you may find the expression
for the total field on the atom:
\[ \mathbf{E}_{\mathrm{tot}} = \mathbf{E} + \frac{\mathbf{P}}{3\varepsilon_0}. \]
Using this expression the index of refraction becomes:
\[ \frac{n^2 - 1}{n^2 + 2} = \frac{q^2}{3 m \varepsilon_0} \cdot \frac{N}{V} \sum_j \frac{f_j}{\omega_{0j}^2 - \omega^2 - i\gamma_j \omega}. \tag{2.85} \]
\[ n = n' + i n''. \]
The findings in equations (2.72-2.76) are still correct. From the wave
equation we obtain again the dispersion relation
\[ k = \omega \frac{n(\omega)}{c}, \]
as before. Mathematically, we have the following two possibilities: (a)
either the wave vector is complex and ω is real or (b) ω is complex and
the wave vector is real. Obviously, we prefer to have ω real and the
k-vector complex, k = k′ + ik″, because we want to consider stationary
situations.
Figure 2.13 Measured index of refraction for BK7 and fused silica
glass. BK7 is the most common glass type you will find used in camera
lenses etc. BK7 is a borosilicate glass.
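\[ = \left(\mathbf{E}_0 e^{-\mathbf{k}'' \cdot \mathbf{r}}\right) \exp\left(i(\mathbf{k}' \cdot \mathbf{r} - \omega t)\right), \tag{2.87} \]
\[ k' + i k'' = \omega \frac{(n' + i n'')}{c}, \]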
and
11 This is usually the case when the source of light is residing inside the absorbing
medium. For light entering from the outside with an incidence angle away from
the surface normal, things become slightly more intricate. We will discuss that
when we derive and study the Fresnel equations.
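\[ = \mathbf{E}_0\, e^{-\frac{\omega n''}{c} \hat{\mathbf{q}} \cdot \mathbf{r}} \cdot e^{-i\omega\left(t - \frac{n'}{c} \hat{\mathbf{q}} \cdot \mathbf{r}\right)} \tag{2.89} \]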
\[ v_p = \frac{\omega}{k}. \tag{2.90} \]
\[ \omega = \frac{c}{n(\omega)} \cdot k, \tag{2.91} \]
so phase velocity is not the same for all frequency components of the
wave. Some components advance faster than others and consequently
change phase with respect to one another. Consider a weighted sum of
monochromatic waves with angular frequencies ω(k) and g(k) is some
peaked weight function. The resulting wave is given by the integral:
\[ \Phi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk\, g(k)\, e^{i(kx - \omega(k)t)}. \]
The pre-factor 1/√(2π) is put for convenience so that the integral can be
interpreted as the (unitary) Fourier transform. It will make no difference
in our final conclusion, and could as well be omitted. With the Fourier
transform in mind the weight function g(k) can be obtained from, see
figure 2.14:
\[ g(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dx\, \Phi(x, 0)\, e^{-ikx}. \]
As the k-space and the x-space are linked by the Fourier transform, the
initial spatial pulse spread ∆x and the spread in the g(k)-function ∆k
obey an ”uncertainty” relation ∆x∆k ∼ 1.
Assume the case of a peaked g(k) function, as depicted in figure 2.14.
Since n(ω) does not change much over a narrow range of ω we can Taylor
\begin{align}
\omega(k) &\simeq \omega(k_0) + \frac{d\omega}{dk}(k - k_0) + \ldots \tag{2.92} \\
&= \omega_0 + v_g (k - k_0), \tag{2.93}
\end{align}
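\begin{align}
\Phi(x, t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk\, g(k)\, e^{i(kx - (\omega_0 + v_g (k - k_0))t)} \tag{2.94} \\
&= \underbrace{e^{i(k_0 x - \omega_0 t)}}_{\text{unimportant phase factor}} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk\, g(k)\, e^{i(k - k_0)\cdot(x - v_g t)}, \tag{2.95}
\end{align}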
which shows that the pulse form is unchanged and moving with speed
vg . More generally in three dimensions it is defined as:
Group velocity
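\[ \mathbf{v}_g = \left.\frac{\partial \omega}{\partial \mathbf{k}}\right|_{\mathbf{k}_0} \tag{2.96} \]
Phase velocity
\[ \mathbf{v}_p = \frac{\omega_0}{k_0} \frac{\mathbf{k}}{|\mathbf{k}|} \tag{2.97} \]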
Figure 2.14 The group velocity vg is the speed at which the pulse
(black envelope) travels along. Phase velocity vp is the speed of
constant phase. Here the overall phase speed is vp = ω0/k0. The figure
was constructed using g(k) = exp(−(k − 1)²/∆k²), ∆k = 0.1 and
ω(k) = √(k² + 1). Time was put to t = 1.
\[ v_g = \left.\frac{\partial \omega}{\partial k}\right|_{k_0} \tag{2.98} \]
\[ v_g \cdot v_p = c^2 \]
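The relation between the two velocities can be illustrated with the
dispersion relation quoted in the caption of figure 2.14, ω(k) = √(k² + 1).
The sketch below assumes dimensionless units in which c = 1 and evaluates
the derivative with a central finite difference. The step size and the
function names are illustrative choices.

```python
import math


def omega(k: float) -> float:
    """Dispersion relation from the caption of figure 2.14 (units with c = 1)."""
    return math.sqrt(k * k + 1.0)


def phase_velocity(k: float) -> float:
    """v_p = omega / k, equation (2.90)."""
    return omega(k) / k


def group_velocity(k: float, dk: float = 1e-6) -> float:
    """v_g = d(omega)/dk evaluated by a central finite difference."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


k0 = 1.0   # centre of the weight function g(k) used in figure 2.14
vp = phase_velocity(k0)
vg = group_velocity(k0)
print(f"v_p = {vp:.4f}, v_g = {vg:.4f}, v_g * v_p = {vg * vp:.6f}")
```

For this particular dispersion relation the phase velocity exceeds c
while the group velocity stays below it, and their product equals c².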
In the presence of dispersion both group and phase velocity may take
different values. A question often asked: is it possible for these velocities
to be larger than the speed of light in vacuum? The answer is yes for
both of them. It is a widely spread misconception that the phase velocity
may exceed c while the group velocity must be below the speed of light
since it is connected to the energy transport velocity. In fact, only when
the light frequency is far off strong absorption lines and for systems
with no gain, it is true that the group velocity is the energy transport
velocity. The group velocity may be shown to be given by:
\[ v_g = \frac{c}{n(\omega) + \omega \frac{dn}{d\omega}}, \tag{2.99} \]
see the problem above. The feature to notice is the factor n(ω) + ω dn/dω.
In a spectral region where we have normal dispersion, dn/dω > 0, the
group velocity is less than c, and may even be significantly less than c,
as shown in for example Lene Hau’s experiment on slow light, which we
will encounter in the exercise class. In a spectral region of anomalous
dispersion where dn/dω < 0, it is possible for the group velocity to be
larger than c, even infinite or negative. For a long time this was believed
to be in conflict with Einstein’s special theory of relativity, but Som-
merfeld and Brillouin proved it not to be13 . The important thing here,
is to prove that no signal velocity, enabling transport of information,
may exceed c. In no experiment so far have we observed transport of
information faster than the speed of light!
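Equation (2.99) is easy to evaluate once a model for n(ω) is chosen. The
sketch below uses a hypothetical Cauchy-type glass, n = A + B/λ², with
made-up coefficients, and a finite-difference derivative. Neither the
model nor the names `n_model` and `group_velocity` come from the notes.

```python
import math

C = 299_792_458.0   # m/s, speed of light in vacuum


def n_model(omega: float) -> float:
    """Hypothetical Cauchy-type index n = A + B / lambda^2 (normal dispersion)."""
    A, B = 1.5, 4.0e-15                  # illustrative coefficients, B in m^2
    wavelength = 2.0 * math.pi * C / omega
    return A + B / wavelength ** 2


def group_velocity(n_func, omega: float, rel_step: float = 1e-6) -> float:
    """Group velocity c / (n + omega dn/domega), equation (2.99)."""
    d_omega = omega * rel_step
    dn_domega = (n_func(omega + d_omega) - n_func(omega - d_omega)) / (2.0 * d_omega)
    return C / (n_func(omega) + omega * dn_domega)


omega = 2.0 * math.pi * C / 500e-9       # angular frequency at 500 nm
vp = C / n_model(omega)
vg = group_velocity(n_model, omega)
print(f"n = {n_model(omega):.4f}, v_p/c = {vp / C:.4f}, v_g/c = {vg / C:.4f}")
```

With dn/dω > 0 the group velocity comes out smaller than the phase
velocity, as stated in the text for normal dispersion.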
You may wonder what is really a signal then? A pure sine wave is not
a signal since it has no beginning and no end, it extends from minus
infinity to plus infinity. We have no way of saying what the onset of
a pure sine wave is. If we want to say something about propagation
of a signal, we have to define a start, a beginning, an onset showing
an element of ”surprise” which we could not have predicted from the
eternal periodic wave motion itself.
A good example of a signal, which may be used to transfer information,
13 See for example L Brillouin ”Wave Propagation and Group Velocity” (New York:
Academic) 1960 or P.W. Milonni ”Fast Light, Slow Light and Left-Handed
Light” (IOP) Series in Optics and Optoelectronics (2004)
Figure 2.15 Dispersion relations plotted in an ω, k- diagram. The
straight line corresponds to no dispersion ω = vk. The two other
cases show dispersion. The local slope of the curves yields the group
velocity vg which is compared to the ratio vp = ω/k, by looking at
the subtended angle.
\[ n = n' + i n'', \tag{2.101} \]
\[ \mathbf{k} = \omega \frac{n}{c}\, \hat{\mathbf{S}}, \tag{2.102} \]
where n is the refractive index and c the speed of light in vacuum.
Notice that when both ε < 0 and µ < 0 the value n² is naturally positive,
but effectively we lose the sign, due to the square. So what to do then?
Let us look at Maxwell’s equations. We obtain the relations (Faraday’s
law, Ampère’s law):
k × E = ωµH (2.103)
k × H = −ωεE. (2.104)
When ε > 0 and µ > 0, as for regular dielectric media, the vectors E,
H and k form a right-handed triad, as do the vectors E, B and k.
However, when ε < 0 and µ < 0 the vectors E, H and k form a left-handed
triad! For this reason such materials are often named left-handed
materials. To recognize the left-handed triad we may recast equations
[Figure: split-ring resonators and wires.]
Conductor
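\begin{align}
\nabla \cdot \mathbf{D} &= \rho \tag{2.107} \\
\nabla \cdot \mathbf{B} &= 0 \tag{2.108} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{2.109} \\
\nabla \times \mathbf{H} &= \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{2.110}
\end{align}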
J=σ·E
to the electric field and the conductivity σ of the metal ([σ] = (Ω m)^−1).
Generally, the conductivity is a function of the frequency ω. Again, we
seek to derive the wave equation for the electric and magnetic field com-
ponents.
Firstly, we now approximate ∇ · E = 0. This follows from:
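\[ \nabla \cdot (\nabla \times \mathbf{H}) = 0 = \sigma \nabla \cdot \mathbf{E} + \frac{\partial (\nabla \cdot \mathbf{D})}{\partial t}. \]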
Using Gauss’ law and Ohm’s law we can write the continuity equation
in terms of charge density as
\[ \frac{d\rho}{dt} + \frac{\sigma}{\varepsilon} \rho = 0, \]
with τ = ε/σ. Since σ is typically a few 10^6 (Ω m)^−1 and ε ∼ 10^−12 F/m,
the time constant τ becomes 10^−18 s, significantly shorter than 1/ω for
visible light, which is 10^−15 s. What does this mean? Generally, in a
conductor if a charge density is built up, by some process, like in our
case where the electric field of the EMW pushes the free electrons, the
charge density will rapidly decay. This is due to the repulsive nature of
the electrons and the fact they are nearly free to move around. It
corresponds to the same case you studied in electrostatics where charges
in a conductor reside entirely on its surface. We can therefore safely
put ∇ · E = 0 for a conductor interacting with visible light or radiation
of lower frequencies (see footnote 19). For x-rays or even higher
frequencies this approximation is not valid.
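The comparison of time scales made above can be repeated in a few lines.
The sketch below uses the order-of-magnitude values quoted in the text for
σ and ε; the wavelength of 500 nm standing in for "visible light" and the
function name `relaxation_time` are assumptions of this example.

```python
import math

C = 299_792_458.0   # m/s, speed of light in vacuum


def relaxation_time(epsilon: float, sigma: float) -> float:
    """Charge relaxation time tau = epsilon / sigma."""
    return epsilon / sigma


sigma = 3.0e6        # (Ohm m)^-1, "a few 10^6" as quoted in the text
epsilon = 1.0e-12    # F/m, order of magnitude quoted in the text
tau = relaxation_time(epsilon, sigma)

wavelength = 500e-9                       # m, illustrative visible wavelength
omega = 2.0 * math.pi * C / wavelength    # angular frequency of the light
print(f"tau = {tau:.2e} s,  1/omega = {1.0 / omega:.2e} s,  omega*tau = {omega * tau:.2e}")
```

As long as ωτ is much smaller than one, any charge imbalance decays well
within one optical cycle, which is the content of the approximation
∇ · E = 0.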
The wave equation is obtained, as before:
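\[ \nabla \times (\nabla \times \mathbf{E}) = -\nabla^2 \mathbf{E} = -\frac{\partial (\nabla \times \mathbf{B})}{\partial t} = -\mu\varepsilon \frac{\partial^2 \mathbf{E}}{\partial t^2} - \sigma\mu \frac{\partial \mathbf{E}}{\partial t} \]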
We finally have:
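\[ -\nabla^2 \mathbf{E} + \mu\varepsilon \frac{\partial^2 \mathbf{E}}{\partial t^2} + \sigma\mu \frac{\partial \mathbf{E}}{\partial t} = 0 \]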
The first two terms we recognize as just the “normal” wave equation
ingredients for dielectric materials. The third term, on the other hand,
we did not have before. This term is an attenuation or damping term.
The higher σ is, the better a conductor our metal is and the larger the
damping term becomes. The damping term forces the amplitude of the
EMW in the metal to decay really quickly or, in other words, over very
short distances.
For monochromatic waves of the form E(r, t) = E(r)e^{−iωt}, we have:
\[ \nabla^2 \mathbf{E} + (\varepsilon\mu\omega^2 + i\omega\sigma\mu)\mathbf{E} = 0, \]
19 You might have spotted a flaw in our argument. We already encountered a
frequency dependent permittivity and we will see that the conductivity is also
frequency dependent. So how should one interpret this time constant? Let us just
state that the approximation chosen works.
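\[ \nabla^2 \mathbf{E} + \frac{\omega^2}{c^2}\left(\varepsilon_r + i\frac{\sigma}{\omega\varepsilon_0}\right)\mathbf{E} = 0. \]
\[ (\nabla^2 + k^2)\mathbf{E} = 0 \]
with k complex:
\[ k^2 = \frac{\omega^2}{c^2}\left(\varepsilon_r + i\frac{\sigma}{\omega\varepsilon_0}\right) \]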
\begin{align}
n' &= \sqrt{\frac{1}{2}\left[a + \sqrt{a^2 + b^2}\right]}, \tag{2.113} \\
n'' &= \sqrt{\frac{1}{2}\left[-a + \sqrt{a^2 + b^2}\right]}. \tag{2.114}
\end{align}
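Expressions (2.113) and (2.114) have the form of the real and imaginary
parts of a complex square root. The sketch below assumes that a and b
denote the real and imaginary parts of n², i.e. n² = a + ib with b ≥ 0
(for instance a = εr and b = σ/(ωε0), which is an assumption because the
defining text is not part of this excerpt), and compares the closed-form
result with Python's complex square root.

```python
import cmath
import math


def complex_index(a: float, b: float):
    """Return (n', n'') from (2.113) and (2.114), assuming n^2 = a + i b, b >= 0."""
    root = math.hypot(a, b)                 # sqrt(a^2 + b^2)
    n_real = math.sqrt(0.5 * (a + root))
    n_imag = math.sqrt(0.5 * (-a + root))
    return n_real, n_imag


# Illustrative values only: a weakly and a strongly conducting case.
for a, b in ((2.25, 0.1), (1.0, 100.0), (-5.0, 1.0)):
    n_real, n_imag = complex_index(a, b)
    reference = cmath.sqrt(complex(a, b))   # principal root, Re >= 0
    print(f"a = {a:6.2f}, b = {b:6.1f}:  n' = {n_real:.4f}, n'' = {n_imag:.4f}, "
          f"cmath.sqrt = {reference:.4f}")
```

For b much larger than |a| the two parts become nearly equal, which is the
regime of a good conductor at low frequencies.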
\[ \frac{d\mathbf{v}}{dt} = \frac{-e}{m}\mathbf{E} - \frac{\mathbf{v}}{\tau_c}. \]
Here, τc is the timescale for collisions that scramble the electron velocity
leading to damping of the drift speed. The differential equation looks
very similar to the equation we studied previously in connection with
Rayleigh scattering and the model of the index of refraction:
\[ \ddot{x} + \gamma \dot{x} = \frac{q E_0}{m} e^{-i\omega t}. \]
But now we have ω0 → 0 as the conduction electrons are free to move
within the metal without any local restoring force. Notice γ = 1/τc . The
stationary solution to this differential equation is:
\[ x(t) = \frac{e E_0}{m(\omega^2 + i\gamma\omega)}\, e^{-i\omega t} \]
J = σ · E = −e ne v,
where ne is the electron density, e the electron charge and v the electron
speed imposed by the electric field E. The electron velocity is deduced
20 The drift speed is typically some mm/s as opposed to the mean velocity of the
electron in the atom which is of the order αc ∼ 10^6 m/s. For conduction
electrons in a metal the relevant velocity is the Fermi velocity, which has similar
magnitude.
21 Ultimately, these collisions are the cause for the conversion of electromagnetic
energy from the driving field into heat.
52 Maxwell’s Equations
from equation (2.115) and inserted into equation (2.115) to yield for σ:
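$$\sigma = \frac{e^2 n_e}{m}\cdot\frac{i}{\omega + i\gamma} \tag{2.115}$$
$${} = \frac{e^2 n_e \tau_c}{m\varepsilon_0}\cdot\frac{\varepsilon_0}{1 - i\omega\tau_c}$$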
This leads to a dynamic conductivity of:
$$\sigma(\omega) = \frac{\sigma_0}{1 - i\omega\tau_c}$$
with
$$\sigma_0 = \frac{e^2 n_e \tau_c}{m}$$
Let us analyze equation 2.116. For very low frequencies the real part of
the conductivity dominates. At d.c., i.e., for ω → 0, the conductivity becomes
purely real and equal to σ0. We conclude that current and electric field are in
phase for low EM frequencies. As the frequency increases, the imaginary
part becomes more pronounced and a significant phase shift between
electric field and current develops. Finally, the conductivity vanishes,
σ → 0, with increasing ω, as we expected: the electrons can no longer
follow the field because of their finite inertial mass.
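The frequency dependence just described can be tabulated with a few lines of code. This is a minimal sketch in normalised units (σ0 = 1, τc = 1); the function name `drude_conductivity` and the chosen sample frequencies are illustrative, not part of the notes.

```python
import cmath
import math

def drude_conductivity(omega: float, sigma0: float, tau_c: float) -> complex:
    """Dynamic conductivity sigma(omega) = sigma0 / (1 - i*omega*tau_c)."""
    return sigma0 / (1 - 1j * omega * tau_c)

sigma0 = 1.0   # normalised d.c. conductivity
tau_c = 1.0    # normalised collision time

for omega_tau in (0.0, 0.1, 1.0, 10.0, 100.0):
    sigma = drude_conductivity(omega_tau / tau_c, sigma0, tau_c)
    # Phase of sigma = phase lag of the current relative to the field.
    phase_deg = math.degrees(cmath.phase(sigma))
    print(f"omega*tau_c = {omega_tau:6.1f}   |sigma|/sigma0 = {abs(sigma):.4f}   phase = {phase_deg:5.1f} deg")
```

The magnitude starts at σ0 and falls off once ωτc exceeds one, while the phase grows from 0 towards 90 degrees, in line with the discussion above.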
We introduce a characteristic frequency ωp , the electron plasma
frequency which plays an important role in plasma and solid state
physics:
$$\omega_p = \sqrt{\frac{e^2 n_e}{\varepsilon_0 m}}$$
where, as before, ne is the electron density and m the electron mass.
We can relate the electron plasma frequency to the DC conductivity as
$\sigma_0 = \omega_p^2 \varepsilon_0 \tau_c$. For Cu the plasma frequency is $\omega_p \approx 1.6 \cdot 10^{16}$ rad/s. As you can see, the plasma frequency
is different for each species. Physically, the electron plasma frequency
$$m\,\frac{d^2 \Delta x}{dt^2} = eE_x = -m\,\omega_p^2\,\Delta x,$$
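To get a feeling for the magnitude of ωp, the following sketch evaluates the plasma frequency formula for copper. It assumes one conduction electron per Cu atom and uses handbook values for the mass density and molar mass of copper; these inputs and the function name `plasma_frequency` are assumptions made for this illustration.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge [C]
EPSILON_0 = 8.8541878128e-12    # vacuum permittivity [F/m]
M_ELECTRON = 9.1093837015e-31   # electron mass [kg]
N_AVOGADRO = 6.02214076e23      # Avogadro constant [1/mol]

def plasma_frequency(n_e: float) -> float:
    """Electron plasma (angular) frequency omega_p = sqrt(e^2 n_e / (eps0 m))."""
    return math.sqrt(E_CHARGE ** 2 * n_e / (EPSILON_0 * M_ELECTRON))

# Assumption: one free electron per copper atom.
density_cu = 8960.0          # mass density of Cu [kg/m^3]
molar_mass_cu = 63.546e-3    # molar mass of Cu [kg/mol]
n_e_cu = density_cu / molar_mass_cu * N_AVOGADRO

omega_p_cu = plasma_frequency(n_e_cu)
print(f"n_e = {n_e_cu:.3e} m^-3, omega_p = {omega_p_cu:.2e} rad/s")
```

Under these assumptions the result lands at about 1.6 · 10^16 rad/s.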
Today we know that this is not exactly true. In fact the principle must
be modified to
By stationary time we mean here, that the travel time along the actual
path taken can be either minimal, maximal or a point of inflection (a
saddle point) when compared to an infinitesimally changed path. Proof
of the principle of stationary time can be carried out rigorously using
3.1 Fermat’s principle 59
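[Figure 3.2: (a) elliptical mirror with the points A, B, P, Q and R; (b) the curves M and N together with the ellipse at the point R, with A and B.]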
distance from A to B is the same, no matter where you reflect off the
mirror! This is a unique property of the ellipse that no other curve
shares. So this is a perfect example of a saddle point. Next
we consider the two cases shown in figure 3.2(b). Here we display two
curves (M, N) in addition to the ellipse. At point R the red curve M has
a curvature which is less than that of the ellipse, while the blue curve
N has a greater curvature compared to the ellipse. For the curve M you
may easily see that the distance ARB is the shortest possible, while all
other routes have longer path lengths. We may conclude this is a case of
a minimum. In case of the blue curve N it is reversed. Here the distance
ARB is the longest and all other routes are shorter. Hence this is a case
of a maximum.
where n is the real part of the index of refraction. We used that v = ds/dt
and v = c/n. The differential of optical path length (OPL) is defined
as the product of the index of refraction n and the physical path length
differential ds, so
$$\mathrm{OPL} = \int_A^B n(\mathbf{r})\,ds \tag{3.4}$$
$$\mathrm{OPL} = \int_{x_1}^{x_2} n(x, y)\,\sqrt{(y')^2 + 1}\;dx$$
[Figure: path element ds with components dx and dy on the curve from A to B, between x1 and x2.]
$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx$$
stationary if
$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = 0$$
The program is clear. Specify the index of refraction n(x, y) and solve
the differential equation (3.5), then you know which path light takes.
So our optical version of the Euler-Lagrange differential equation becomes:
$$\frac{d}{dx}\left(\frac{n(x, y)\,y'}{\sqrt{(y')^2 + 1}}\right) - \frac{\partial n(x, y)}{\partial y}\,\sqrt{(y')^2 + 1} = 0$$
In three dimensions:
$$I = \int_{z_1}^{z_2} n(x, y, z)\,\sqrt{(x')^2 + (y')^2 + 1}\;dz$$
and
$$\frac{d}{dz}\left(\frac{n(x, y, z)\,y'}{\sqrt{(x')^2 + (y')^2 + 1}}\right) - \frac{\partial n(x, y, z)}{\partial y}\,\sqrt{(x')^2 + (y')^2 + 1} = 0.$$
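Using $ds/dz = \sqrt{(x')^2 + (y')^2 + 1}$ we can rewrite the 3-D conditions as:
$$\frac{d}{ds}\left(n\,\frac{dx}{ds}\right) = \frac{\partial n}{\partial x} \tag{3.6}$$
$$\frac{d}{ds}\left(n\,\frac{dy}{ds}\right) = \frac{\partial n}{\partial y} \tag{3.7}$$
$$\frac{d}{ds}\left(n\,\frac{dz}{ds}\right) = \frac{\partial n}{\partial z}. \tag{3.8}$$
$$\frac{d}{ds}\left(n(\mathbf{r})\,\frac{d\mathbf{r}}{ds}\right) = \nabla n(\mathbf{r})$$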
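The vector form of the ray equation lends itself to direct numerical integration. The sketch below traces a ray in two dimensions through a medium whose index depends on height only. The linear profile n(y) = N0 + GRAD·y, its numerical values, and the simple midpoint integrator are all assumptions chosen for illustration; they are not taken from the notes.

```python
import math

N0 = 1.40      # hypothetical index at y = 0
GRAD = -0.05   # hypothetical dn/dy per unit length (index grows towards the bottom)

def n_index(y: float) -> float:
    """Linear index profile n(y) = N0 + GRAD * y."""
    return N0 + GRAD * y

def trace_ray(x: float, y: float, angle: float, ds: float = 1e-3, steps: int = 2000):
    """Integrate d/ds (n dr/ds) = grad n with a midpoint scheme.

    The state is the position (x, y) and the vector T = n * dr/ds.
    Because n depends on y only, dT_x/ds = 0 and dT_y/ds = dn/dy = GRAD.
    """
    n = n_index(y)
    tx, ty = n * math.cos(angle), n * math.sin(angle)
    path = [(x, y)]
    for _ in range(steps):
        # Half step to the midpoint of the interval.
        y_mid = y + 0.5 * ds * ty / n_index(y)
        ty_mid = ty + 0.5 * ds * GRAD
        n_mid = n_index(y_mid)
        # Full step using the midpoint slope.
        x += ds * tx / n_mid
        y += ds * ty_mid / n_mid
        ty += ds * GRAD
        path.append((x, y))
    return path, (tx, ty)

# Launch a ray horizontally at the origin and follow it over a path length of 2.
path, (tx_end, ty_end) = trace_ray(0.0, 0.0, 0.0)
x_end, y_end = path[-1]
print(f"ray ends at x = {x_end:.4f}, y = {y_end:.4f}")
```

The ray bends towards the region of higher index. A useful consistency check is that the length of T stays equal to the local index n along the whole path.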
Get your hands on a fish tank with a volume of around 20-30 liters and
cover the bottom with about 1.5 cm of regular sugar. Fill the tank with hot water
by placing a cup and saucer on the sugar and pouring slowly into the cup.
After adding the water, gently remove the cup and saucer. Let the tank rest
2 For further reading consult: Andrew T. Young Journal of the Optical Society of
America A, vol. 17, pp. 2129-2139 (2000)
64 Propagation of light
Figure 3.4 Bending of laser light in the atmosphere. The density gra-
dient results in a gradient in the index of refraction increasing towards
the Earth’s surface. This makes light bend more and more.
for 2-3 days. This will result in a strong sugar gradient in the water,
where the sugar concentration increases towards the bottom of the tank.
Consequently, we will have an index gradient, increasing towards the
bottom of the fish tank, just as in the atmosphere, and light will bend
when entering the solution. This is shown in figure 3.5.
With the refraction that takes place in the atmosphere and given
that the index of refraction changes with height, one may ask if it is
possible to bend light around in a circle with radius of curvature equal
to that of the Earth? Obviously, shooting a laser parallel to the surface of
the Earth we know that it will soon be reduced in intensity and finally
become invisible, at least for moderate intensities. Anyway, we are
going to investigate the problem using Fermat’s principle. Firstly, how
does the index of refraction vary with height? The index of refraction
at the Earth’s surface (300 K and 1 atmosphere) is about n = 1.00029.
We assume n − 1 to be proportional to the density such that
Figure 3.5 Bending of laser light in a sugar solution. The fish tank
contains water with a strong gradient of sugar concentration increas-
ing towards the bottom which makes the light beam bend more and
more toward the bottom. Notice how light is reflected from the glass
bottom of the fish tank.
(isothermal atmosphere):
so
$$ds = \left(\left(\frac{dr}{d\theta}\right)^2 + r^2\right)^{1/2} d\theta. \tag{3.12}$$
and finally we find ρ/ρ0 = 4.86. Thus, if the air density were about a factor
of five higher at sea level, light could bend around in a circle and we
could watch the sunset for ever! We would feel some pressure, though...
point y is given by $v = \sqrt{2gy}$ since we have conservation of mechanical
energy. From the geometry, see figure 3.6, we find:
$$\sin(\theta) = \cos(\phi) = \frac{1}{\sqrt{1 + \tan^2(\phi)}} = \frac{1}{\sqrt{1 + (y')^2}}.$$
If the mass is going from A to B in the least amount of time we can use
Snell’s law, which by Fermat’s principle ensures the travel time to be a
minimum. From equation (3.10) and equation (3.17) we have:
$$\sqrt{y\left(1 + (y')^2\right)} = q$$
where q is a constant.
The solution to this differential equation is the cycloid curve, here
parameterized by α:
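$$\frac{d}{dx}\frac{\partial f}{\partial y'} = \frac{\partial^2 f}{\partial y'^2}\cdot\frac{d^2 y}{dx^2} = 0$$
if $\partial^2 f/\partial y'^2 \neq 0$ then
$$\frac{d^2 y}{dx^2} = 0 \quad\text{and}\quad y = ax + b.$$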
This part corresponds to straight lines, exactly the case when n(x, y) is
a constant.
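$$\frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y'} = \text{constant}$$
$$y'\,\frac{\partial f}{\partial y'} - f = \text{constant}$$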
4 See Ashby, N., Brittin, W. E., Love, W. F., and Wyss, W. “Brachistochrone with
Coulomb Friction.” Amer. J. Phys. 43, 902-905, 1975.
3.2 Snell’s law 69
θ1 = θ3
n1 sin(θ1 ) = n2 sin(θ2 )
There are several ways to prove Snell’s law. We will not go through all
of them here, but refer to other optics books. However, let us look at two
prominent proofs both illustrating the kinematic nature of the law. The
first is perhaps the most general. Since there is translation symmetry
along the y-direction, momentum will be conserved in this direction.
For a photon entering along k1 and exiting along k2 the y-component
of its momentum is conserved:
$$\frac{h\sin(\theta_1)}{\lambda_1} = \frac{h\sin(\theta_2)}{\lambda_2} \tag{3.22}$$
$$\frac{n_1\sin(\theta_1)}{\lambda_0} = \frac{n_2\sin(\theta_2)}{\lambda_0} \tag{3.23}$$
where we used the expression for the photon momentum p = h/λ in the
first line, h is Planck’s constant, and λ0 is the vacuum wavelength. The
reflection law θ1 = θ3 can be shown in a similar manner. From Fermat’s
principle we can also show Snell’s law. Looking at figure 3.9 we write up
Figure 3.9 Snell’s law shown by the use of Fermat’s principle (Prin-
ciple of least time).
$$\mathrm{OPL}(x) = n_1\sqrt{x^2 + a^2} + n_2\sqrt{(s - x)^2 + b^2} \tag{3.25}$$
so
$$\mathrm{OPL}'(x) = 0 = n_1\,\frac{x}{\sqrt{x^2 + a^2}} + n_2\,\frac{-(s - x)}{\sqrt{(s - x)^2 + b^2}} \tag{3.26}$$
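The stationarity condition (3.26) can also be checked numerically: minimise the optical path length (3.25) over the crossing point x and compare n1 sin(θ1) with n2 sin(θ2). The sketch below uses a plain ternary search, which works because OPL(x) is convex. The geometry values a, b, s and the function names are illustrative assumptions.

```python
import math

def opl(x, n1, n2, a, b, s):
    """Optical path length (3.25) for a crossing point x on the interface."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(s - x, b)

def minimise_opl(n1, n2, a, b, s, iterations=100):
    """Ternary search for the minimum of the convex function OPL(x) on [0, s]."""
    lo, hi = 0.0, s
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if opl(m1, n1, n2, a, b, s) < opl(m2, n1, n2, a, b, s):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Illustrative geometry: source at height a above, target at depth b below the interface.
n1, n2, a, b, s = 1.0, 1.5, 1.0, 1.0, 2.0
x_min = minimise_opl(n1, n2, a, b, s)

sin_theta1 = x_min / math.hypot(x_min, a)
sin_theta2 = (s - x_min) / math.hypot(s - x_min, b)
print(f"x_min = {x_min:.6f}")
print(f"n1 sin(theta1) = {n1 * sin_theta1:.6f}, n2 sin(theta2) = {n2 * sin_theta2:.6f}")
```

The two printed products agree to within the accuracy of the search, which is Snell's law recovered from the principle of least time.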
$$\sin(\theta_2) = \frac{n_1}{n_2} \tag{3.28}$$
$$\sin(\theta_c) = \frac{n_2}{n_1}.$$
with
$${} = k_2\,y\,\frac{n_1}{n_2}\sin(\theta_1) - k_2\,z\,\sqrt{1 - \left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1)} \tag{3.34}$$
If we now consider n1 > n2 then for θ1 > θc the square root is purely
imaginary. We can thus write the transmitted field as:
$$\alpha = k_2\,\frac{n_1}{n_2}\sin(\theta_1), \tag{3.36}$$
$$\beta = k_2\,\sqrt{\left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1) - 1}. \tag{3.37}$$
scale of the wavelength as you can see. This, for example, is how finger
scanners work. The finger is brought close to the surface, actually within
a wavelength or so, and modifies the field which can then be detected
on the reflected beam. What about the Poynting vector? As we saw in
chapter 2 the Poynting vector is directed along the k-vector, in this case
k2 . So the wave will travel along the +y-direction, but have a complex
component describing the decay in medium 2.
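To see that the evanescent field really lives on the scale of a wavelength, the sketch below evaluates β from (3.37) and the corresponding 1/e decay length 1/β. It assumes that β is the decay constant of the field amplitude away from the interface, and it uses an illustrative glass-air interface (n1 = 1.5, n2 = 1.0) with a vacuum wavelength of 633 nm at 45 degrees incidence. All of these numbers and the function names are assumptions for the example.

```python
import math

def critical_angle(n1: float, n2: float) -> float:
    """Critical angle from sin(theta_c) = n2/n1 (requires n1 > n2)."""
    return math.asin(n2 / n1)

def decay_constant(n1: float, n2: float, theta1: float, wavelength_vacuum: float) -> float:
    """beta from (3.37), with k2 = 2*pi*n2/lambda0; valid only beyond the critical angle."""
    k2 = 2.0 * math.pi * n2 / wavelength_vacuum
    arg = (n1 / n2 * math.sin(theta1)) ** 2 - 1.0
    if arg <= 0.0:
        raise ValueError("theta1 must exceed the critical angle")
    return k2 * math.sqrt(arg)

n1, n2 = 1.5, 1.0            # illustrative glass-air interface
wavelength = 633e-9          # assumed vacuum wavelength [m]
theta1 = math.radians(45.0)  # assumed angle of incidence

theta_c = critical_angle(n1, n2)
beta = decay_constant(n1, n2, theta1, wavelength)
depth = 1.0 / beta
print(f"critical angle = {math.degrees(theta_c):.2f} deg")
print(f"1/e decay length of the field = {depth * 1e9:.0f} nm")
```

For these inputs the decay length comes out at a few hundred nanometres, i.e. a fraction of the wavelength.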
In the optics industry optical waveguides play a huge role today, and
evanescent waves allow power to be transferred between waveguides
that are brought within a wavelength of each other. We are
talking about objects of 0.1 µm size on small chips, where light may be
manipulated to target high-speed optical communication and the
supercomputers of tomorrow. In biophysics there are quite a number of
applications too. One of the most prominent is perhaps the total internal
reflection fluorescence (TIRF) microscope. In TIRF microscopy,
incident laser light is totally reflected within the glass of a coverslip,
producing an evanescent wave outside the glass that illuminates objects
within a short range of about 100 nm beyond the coverslip. The portion
of the specimen within the evanescent field can therefore be excited to
emit fluorescence selectively.
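$$E_{\tan}^{(1)} = E_{\tan}^{(2)}$$
$$\varepsilon_r^{(1)} E_n^{(1)} - \varepsilon_r^{(2)} E_n^{(2)} = \frac{\rho_s}{\varepsilon_0}$$
$$B_n^{(1)} = B_n^{(2)}$$
$$\frac{B_{\tan}^{(1)}}{\mu_r^{(1)}} = \frac{B_{\tan}^{(2)}}{\mu_r^{(2)}}$$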
Remember that ki, kr and kt all lie in the same plane, the plane
of incidence; in fact the three k-vectors span the plane of incidence. In
the first case we assume the E-field to be perpendicular to the plane of
incidence, i.e., we only look at the component $E^i_\perp$. The other two
6 Note that in fig 3.11 the plane of incidence is the paper plane.
3.3 Fresnel’s laws of reflection 77
cases you should do yourself. Now we are hunting for two equations, in
the E-field, which enable us to give the ratio of transmitted and reflected
field to the incoming field. The situation is shown in figure 3.13. Since
the field Ei⊥ is tangential we use first law of the boundary conditions
(3.38):
For the B-field, which we know is orthogonal to the E-field, we also look
for the tangential component, i.e. $B^i_{\parallel,\tan}$. Using figure 3.13 we find:
$$B^j = \frac{n_j}{c}\,E^j, \quad j \in \{i, r, t\} \tag{3.42}$$
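$$E^i_\perp + E^r_\perp = E^t_\perp \tag{3.43}$$
and
$$n_1\,(E^i_\perp - E^r_\perp)\cos(\theta_1) = n_2\,E^t_\perp\cos(\theta_2) \tag{3.44}$$
$$r_\perp = \frac{E^r_\perp}{E^i_\perp} = \frac{n_1\cos(\theta_1) - n_2\cos(\theta_2)}{n_1\cos(\theta_1) + n_2\cos(\theta_2)} \tag{3.45}$$
and
$$t_\perp = \frac{E^t_\perp}{E^i_\perp} = \frac{2 n_1\cos(\theta_1)}{n_1\cos(\theta_1) + n_2\cos(\theta_2)} \tag{3.46}$$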
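$$r_\perp = \frac{n_1\cos(\theta_1) - n_2\cos(\theta_2)}{n_1\cos(\theta_1) + n_2\cos(\theta_2)}$$
$$t_\perp = \frac{2 n_1\cos(\theta_1)}{n_1\cos(\theta_1) + n_2\cos(\theta_2)}$$
$$r_\parallel = \frac{n_2\cos(\theta_1) - n_1\cos(\theta_2)}{n_2\cos(\theta_1) + n_1\cos(\theta_2)}$$
$$t_\parallel = \frac{2 n_1\cos(\theta_1)}{n_2\cos(\theta_1) + n_1\cos(\theta_2)}$$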
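The four Fresnel coefficients are easy to evaluate once the refraction angle has been eliminated with Snell's law. The sketch below is one possible implementation for real indices; the function name `fresnel`, the use of a complex square root for cos(θ2), and the sample interface (air to glass, n = 1.5, 30 degrees) are assumptions for illustration.

```python
import cmath
import math

def fresnel(n1: float, n2: float, theta1: float):
    """Amplitude coefficients (r_perp, t_perp, r_par, t_par) for real n1, n2."""
    c1 = math.cos(theta1)
    # cos(theta2) from Snell's law; the complex root keeps this defined beyond the critical angle.
    c2 = cmath.sqrt(1.0 - (n1 / n2 * math.sin(theta1)) ** 2)
    r_perp = (n1 * c1 - n2 * c2) / (n1 * c1 + n2 * c2)
    t_perp = 2.0 * n1 * c1 / (n1 * c1 + n2 * c2)
    r_par = (n2 * c1 - n1 * c2) / (n2 * c1 + n1 * c2)
    t_par = 2.0 * n1 * c1 / (n2 * c1 + n1 * c2)
    return r_perp, t_perp, r_par, t_par

# Illustrative case: air to glass at 30 degrees.
n1, n2 = 1.0, 1.5
theta1 = math.radians(30.0)
r_s, t_s, r_p, t_p = fresnel(n1, n2, theta1)

# Power balance: reflected plus transmitted power, with the factor
# n2 cos(theta2) / (n1 cos(theta1)) accounting for index and beam cross section.
cos2 = cmath.sqrt(1.0 - (n1 / n2 * math.sin(theta1)) ** 2).real
area_factor = n2 * cos2 / (n1 * math.cos(theta1))
balance_s = abs(r_s) ** 2 + area_factor * abs(t_s) ** 2
balance_p = abs(r_p) ** 2 + area_factor * abs(t_p) ** 2
print(f"r_perp = {r_s.real:+.4f}, r_par = {r_p.real:+.4f}")
print(f"power balance: perp = {balance_s:.6f}, par = {balance_p:.6f}")
```

Both power balances come out as one, which is a quick way to catch sign or index mix-ups when coding these formulas.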
Note that there are several ways to write the Fresnel equations. You
can see that we use both refractive indices, the incidence angle and
the refraction angle on the right hand side in 3.47. But, those are not
at all independent variables – they are constrained via Snell’s law! For
practical optics design applications it is often useful to eliminate the
refraction angle, since it is you who decides on which materials to use
and which incidence angles are interesting. This form is also very useful
for plotting reflectance and phase shift, so we do that at the end of
this section. Alas, the expressions become a little less easy to remember.
7 In most of the optics literature the ⊥-polarization is called s-polarization (from
the German word senkrecht), while the ∥-polarization is denoted as
p-polarization.
We can also write the equations just in terms of incidence angle and
refraction angle, which looks so neat that we will do it later, when we
discuss Brewster’s angle. But now we want to look at a simple special
case, to see what the equations actually tell us and where we have to be
careful in interpreting the equations.
Example 3.9 Fresnel’s equations at normal incidence.
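At normal incidence θ1 = θ2 = 0. Looking at r⊥ (r∥) and t⊥ (t∥) we
find:
$$r_\perp = \frac{n_1 - n_2}{n_1 + n_2} \quad\text{and}\quad t_\perp = \frac{2 n_1}{n_1 + n_2} \tag{3.48}$$
$$r_\perp^2 + t_\perp^2 \neq 1 \tag{3.50}$$
$$r_\perp^2 + \frac{n_2}{n_1}\,t_\perp^2 = 1 \tag{3.51}$$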
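A two-line calculation makes the difference between (3.50) and (3.51) concrete. The sketch below uses an illustrative air-glass interface (n1 = 1, n2 = 1.5); the function name `normal_incidence` is an assumption.

```python
def normal_incidence(n1: float, n2: float) -> tuple[float, float]:
    """Amplitude coefficients (r, t) at normal incidence, equation (3.48)."""
    r = (n1 - n2) / (n1 + n2)
    t = 2.0 * n1 / (n1 + n2)
    return r, t

n1, n2 = 1.0, 1.5                        # illustrative air-glass interface
r, t = normal_incidence(n1, n2)
naive_sum = r ** 2 + t ** 2              # left-hand side of (3.50)
energy_sum = r ** 2 + (n2 / n1) * t ** 2 # left-hand side of (3.51)
print(f"r = {r:+.2f}, t = {t:.2f}")
print(f"r^2 + t^2 = {naive_sum:.2f}, r^2 + (n2/n1) t^2 = {energy_sum:.2f}")
```

The amplitudes alone do not add up to one, but the sum weighted with n2/n1 does; the reflected power fraction r² is 4 % for this interface.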
Problem 3.10 Using the boundary conditions (3.38) show the two
other Fresnel equations.
Using Snell’s law we can further reduce Fresnel’s equations to:
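$$r_\perp = \frac{\sin(\theta_2 - \theta_1)}{\sin(\theta_1 + \theta_2)}$$
$$t_\perp = \frac{2\cos(\theta_1)\sin(\theta_2)}{\sin(\theta_1 + \theta_2)}$$
$$r_\parallel = \frac{\tan(\theta_1 - \theta_2)}{\tan(\theta_1 + \theta_2)}$$
$$t_\parallel = \frac{2\cos(\theta_1)\sin(\theta_2)}{\sin(\theta_1 + \theta_2)\cos(\theta_1 - \theta_2)}$$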
$$n_1\sin(\theta_1) = n_2\sin(\theta_2) = n_2\sin\left(\frac{\pi}{2} - \theta_1\right) \tag{3.53}$$
so
$$\tan(\theta_1) = \frac{n_2}{n_1} \tag{3.54}$$
$$\tan(\theta_B) = \frac{n_2}{n_1}.$$
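Brewster's angle follows directly from this relation. The sketch below evaluates it for an illustrative glass of index 1.5 in air, in both directions, and confirms that r∥ vanishes there. The index value and the function names are assumptions for the example.

```python
import math

def brewster_angle(n1: float, n2: float) -> float:
    """Brewster angle from tan(theta_B) = n2/n1, in radians."""
    return math.atan2(n2, n1)

def r_parallel(n1: float, n2: float, theta1: float) -> float:
    """Fresnel coefficient for parallel polarization (below the critical angle)."""
    theta2 = math.asin(n1 / n2 * math.sin(theta1))
    num = n2 * math.cos(theta1) - n1 * math.cos(theta2)
    den = n2 * math.cos(theta1) + n1 * math.cos(theta2)
    return num / den

n_air, n_glass = 1.0, 1.5   # illustrative indices
theta_b_in = brewster_angle(n_air, n_glass)    # air to glass
theta_b_out = brewster_angle(n_glass, n_air)   # glass to air
print(f"air-glass: {math.degrees(theta_b_in):.1f} deg, glass-air: {math.degrees(theta_b_out):.1f} deg")
print(f"r_par at Brewster's angle: {r_parallel(n_air, n_glass, theta_b_in):.1e}")
```

The two angles add up to 90 degrees, as they must: the Brewster angle inside the glass is the refraction angle belonging to the Brewster angle outside.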
For an air-glass interface we find θB ≃ 56.3 degrees and for glass-air θB ≃
33.7 degrees. Brewster’s angle has a beautiful physical interpretation.
When θ1 + θ2 = π/2 the transmitted ray and the reflected ray are at
90 degrees with respect to each other, see figure 3.15. But since the
E∥ electric field is perpendicular to the transmitted ray kt, then it is
Figure 3.15 When θ1 +θ2 = π/2 the transmitted ray and the reflected
ray are at 90 degrees with respect to each other. The dipole induced
in the material generates the reflected ray. But here the reflected ray
is along the dipole axis and no light emerges.
Figure 3.16 TOP: Schematic drawing of a HeNe laser. The gas mix-
ture is kept in a glass cell with mounted Brewster windows to mini-
mize optical losses. The black arrow at the output indicates the po-
larization state of the laser light. BOTTOM: Picture of a Brewster
mounted window in a real HeNe laser, here marked with the white
circle.
The output polarization is consequently fixed and very clean for this
type of laser. In figure 3.16 the polarization state of the laser is indicated
with a black arrow. The perpendicular component suffers too large a loss
in the cavity to make it above the laser threshold and the laser will not
emit light with this polarization state.
means areas are in play, see figure 3.17. Consider first the reflected beam
(reflected power/ incident power):
$$R_{\parallel,\perp} = \frac{I^r A_r}{I^i A_i} = \frac{n_1 d_1}{n_1 d_1}\left|\frac{E^r}{E^i}\right|^2 = |r|^2 \tag{3.56}$$
In the last step we used the expression for intensity from equation (2.76),
$I = \tfrac{1}{2}\varepsilon_0 c\,n\,|E|^2$. So for both R-coefficients it is just the square modulus
of r as we argued before10. For the transmittance we have (transmitted
power/ incident power):
$$T_{\parallel,\perp} = \frac{I^t A_t}{I^i A_i} = \frac{n_2 d_2}{n_1 d_1}\left|\frac{E^t}{E^i}\right|^2 = t^2\cdot\frac{n_2\cos(\theta_2)}{n_1\cos(\theta_1)} \tag{3.57}$$
You may wonder why is it not the square of d1 cos(θ1 ) that must be used
as we are talking about areas? But notice it is only one dimension of the
beam cross section that changes! The circle dimension “out of the paper”
10 Note, we left out the subscripts ∥, ⊥ on the left hand side. Orthogonal
polarizations do not interfere, so you can calculate reflectance and transmittance
separately for s- and p-polarization.
at angles below the critical angle θc. However, for angles greater than
the critical angle, where we have total internal reflection, things become a bit
more complicated. Let us calculate the phase shift for θ ≥ θc. Recall the
Fresnel equations for glass-air (with θ2 eliminated using Snell’s law)
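$$r_\parallel = \frac{n_2\cos(\theta_1) - n_1\sqrt{1 - \left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1)}}{n_2\cos(\theta_1) + n_1\sqrt{1 - \left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1)}} \tag{3.58}$$
$${} = \frac{\cos(\theta_1) - n\sqrt{1 - n^2\sin^2(\theta_1)}}{\cos(\theta_1) + n\sqrt{1 - n^2\sin^2(\theta_1)}} \tag{3.59}$$
$$r_\perp = \frac{n_1\cos(\theta_1) - n_2\sqrt{1 - \left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1)}}{n_1\cos(\theta_1) + n_2\sqrt{1 - \left(\frac{n_1}{n_2}\right)^2\sin^2(\theta_1)}} \tag{3.60}$$
$${} = \frac{n\cos(\theta_1) - \sqrt{1 - n^2\sin^2(\theta_1)}}{n\cos(\theta_1) + \sqrt{1 - n^2\sin^2(\theta_1)}} \tag{3.61}$$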
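$$r_\parallel = \frac{\cos(\theta_1) - i\,n\sqrt{n^2\sin^2(\theta_1) - 1}}{\cos(\theta_1) + i\,n\sqrt{n^2\sin^2(\theta_1) - 1}}, \quad \theta_1 \geq \theta_c \tag{3.62}$$
$$r_\perp = \frac{n\cos(\theta_1) - i\sqrt{n^2\sin^2(\theta_1) - 1}}{n\cos(\theta_1) + i\sqrt{n^2\sin^2(\theta_1) - 1}}, \quad \theta_1 \geq \theta_c \tag{3.63}$$
$$\frac{x - iy}{x + iy}, \tag{3.64}$$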
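$$|r|e^{i\delta} = \frac{x - iy}{x + iy} = |r|\,\frac{e^{-i\alpha}}{e^{i\alpha}} = |r|e^{-i2\alpha}, \tag{3.65}$$
where δ is the phase we are looking for and α just a convenient parameter.
Since $\delta = -2\alpha = -2\arctan(y/x)$ we have:
$$\tan\left(\frac{\delta_\parallel}{2}\right) = -\frac{n\sqrt{n^2\sin^2(\theta_1) - 1}}{\cos(\theta_1)}, \quad \theta_1 \geq \theta_c \tag{3.66}$$
$$\tan\left(\frac{\delta_\perp}{2}\right) = -\frac{\sqrt{n^2\sin^2(\theta_1) - 1}}{n\cos(\theta_1)}, \quad \theta_1 \geq \theta_c. \tag{3.67}$$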
These phase shifts are plotted in figure 3.18. This is not just a long
calculation: the result can be used to build useful optical components.
As you can see in the figure, the difference in phase shift for the two
polarization components can reach π/4 for standard glass. So, two bounces
without losses give a total phase difference of π/2, which is enough to convert
linearly polarized light into circularly polarized light over a wide range of
wavelengths, limited only by material dispersion. This is used in Fresnel rhombs,
named after - guess who! We will discuss more methods to manipulate
the polarization state of light in a later chapter.
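The claim about the π/4 phase difference can be checked by scanning (3.66) and (3.67) over all angles above the critical angle. The sketch below does this for n = n1/n2 = 1.5, an illustrative value for glass against air; the function name `tir_phases` and the grid resolution are assumptions.

```python
import math

def tir_phases(n: float, theta1: float) -> tuple[float, float]:
    """Phase shifts (delta_par, delta_perp) from (3.66) and (3.67); n = n1/n2 > 1."""
    root = math.sqrt((n * math.sin(theta1)) ** 2 - 1.0)
    c1 = math.cos(theta1)
    delta_par = -2.0 * math.atan2(n * root, c1)
    delta_perp = -2.0 * math.atan2(root, n * c1)
    return delta_par, delta_perp

n = 1.5                                  # illustrative relative index (glass to air)
theta_c = math.asin(1.0 / n)

# Scan angles strictly between the critical angle and grazing incidence.
angles = [theta_c + (math.pi / 2 - theta_c) * i / 2000 for i in range(1, 2000)]
differences = [(abs(tir_phases(n, th)[0] - tir_phases(n, th)[1]), th) for th in angles]
best_diff, best_angle = max(differences)

print(f"largest phase difference = {best_diff:.4f} rad (pi/4 = {math.pi / 4:.4f})")
print(f"reached at theta1 = {math.degrees(best_angle):.1f} deg")
```

For n = 1.5 the maximum phase difference lies just above π/4, so there are two angles of incidence at which a single reflection gives exactly π/4. A Fresnel rhomb uses two such reflections.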
3.3.3 Metals
For metals we can also consider Fresnel’s equations. The procedure is
identical to that shown above except now we need to take into account
the fact that metals have complex index of refraction, often with the
imaginary part dominating. We will not consider this in detail here, just
show the reflectance for the two cases of a nickel and a silver surface,
see figure (3.19).
When metal surfaces are involved, say an air-metal interface, we will
have a case of n1 = 1 and θ1 real while $n_2 = n_r + i n_i$ and θ2 are complex
numbers. The complex index we defined in the previous chapter. Let us
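$$R = \left|\frac{n_1 - n_2}{n_1 + n_2}\right|^2 = \frac{n_1 - n_2}{n_1 + n_2}\cdot\left(\frac{n_1 - n_2}{n_1 + n_2}\right)^{*}$$
$$R_\perp = \left|\frac{\cos(\theta_1) - n_2\sqrt{1 - \frac{1}{n_2^2}\sin^2(\theta_1)}}{\cos(\theta_1) + n_2\sqrt{1 - \frac{1}{n_2^2}\sin^2(\theta_1)}}\right|^2$$
$$R_\parallel = \left|\frac{n_2\cos(\theta_1) - \sqrt{1 - \frac{1}{n_2^2}\sin^2(\theta_1)}}{n_2\cos(\theta_1) + \sqrt{1 - \frac{1}{n_2^2}\sin^2(\theta_1)}}\right|^2 \tag{3.69}$$
where n1 = 1 is assumed and $n_2 = n_r + i n_i$.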
Characteristic for metals is the absence of a genuine Brewster’s angle
where the reflectance goes to zero. Metals will typically have a minimum
in the R∥ component and are thus also able to marginally polarize light
upon reflection. If the imaginary part ni ≫ nr then R → 1 and the
difference between R∥ and R⊥ will be negligible. This happens when
the conductivity is good. This explains the difference in the reflectance
of nickel and silver, as silver has a far better conductivity than nickel.
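The trend described here can be reproduced with the Fresnel formulas for n1 = 1 and a complex n2. The sketch below uses two hypothetical complex indices, one with ni ≫ nr and one with comparable real and imaginary parts. These are illustrative numbers only, not measured data for silver or nickel, and the function name `metal_reflectance` is an assumption.

```python
import cmath
import math

def metal_reflectance(n2: complex, theta1: float) -> tuple[float, float]:
    """Return (R_perp, R_par) for light incident from n1 = 1 onto a medium of complex index n2."""
    c1 = math.cos(theta1)
    # Complex cos(theta2) from Snell's law with n1 = 1.
    c2 = cmath.sqrt(1.0 - (math.sin(theta1) / n2) ** 2)
    r_perp = (c1 - n2 * c2) / (c1 + n2 * c2)
    r_par = (n2 * c1 - c2) / (n2 * c1 + c2)
    return abs(r_perp) ** 2, abs(r_par) ** 2

# Hypothetical indices, chosen only to illustrate the trend.
n_good = 0.1 + 4.0j   # imaginary part dominates (good conductor)
n_poor = 2.0 + 4.0j   # sizeable real part (poorer conductor)

for label, n2 in (("n_i >> n_r", n_good), ("n_i ~ n_r ", n_poor)):
    r0, _ = metal_reflectance(n2, 0.0)
    r_perp, r_par = metal_reflectance(n2, math.radians(60.0))
    print(f"{label}: R(0 deg) = {r0:.3f}, R_perp(60 deg) = {r_perp:.3f}, R_par(60 deg) = {r_par:.3f}")
```

With the dominant imaginary part the reflectance stays close to one and the two polarizations differ little, while the second index gives a lower reflectance and a clearly visible dip in the parallel component.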
1 We will see later in chapter 5 on polarization, that not all lossless optical
elements are reciprocal.
2 I.e. the paper plane in the drawings. Note that we implicitly assume either
rotational symmetry around the x-axis or translational symmetry along the
z-axis. We also do not consider skew rays – rays in 3-D which are not parallel to
the x-axis, but never intersect it.
94 Geometrical optics
OPL = n0 So + n1 Si = n0 do + n1 di = constant.
Accordingly, the travel time from 0 to the plane Q must be the same
Figure 4.2 Geometry for a curved surface collimating light rays from
a point source.
for all rays. Introducing the cartesian coordinates (x, y) for points on
the surface, we can rewrite the equation as:
$$n_0 S_o + n_1 S_i = n_0\sqrt{x^2 + y^2} + n_1\sqrt{\left((S_o + S_i) - x\right)^2}.$$
When we transform to polar coordinates, $r = \sqrt{x^2 + y^2}$ and x = r cos(φ),
we find accordingly:
$$r(\phi) = \frac{(n_0 - n_1)\,S_o}{n_0 - n_1\cos(\phi)}$$
4.1 Optical elements 95
for the shape of the surface achieving our task. This curve will be a
hyperbola, when n1 > n0 and an ellipse, when n0 > n1 .
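That this surface really equalises the optical path length for all rays can be verified numerically. The sketch below evaluates the optical path from the source to the plane Q for a fan of rays, using the surface r(φ) derived above. The indices and distances are illustrative assumptions, as are the function names.

```python
import math

def surface_radius(phi: float, n0: float, n1: float, s_o: float) -> float:
    """Surface shape r(phi) = (n0 - n1) S_o / (n0 - n1 cos(phi))."""
    return (n0 - n1) * s_o / (n0 - n1 * math.cos(phi))

def optical_path(phi: float, n0: float, n1: float, s_o: float, s_i: float) -> float:
    """OPL from the source to the plane Q at x = S_o + S_i for the ray leaving at angle phi."""
    r = surface_radius(phi, n0, n1, s_o)
    x = r * math.cos(phi)
    return n0 * r + n1 * ((s_o + s_i) - x)

# Illustrative values: air to glass, source 10 units in front of the vertex.
n0, n1, s_o, s_i = 1.0, 1.5, 10.0, 5.0
angles = [-0.3 + 0.05 * i for i in range(13)]          # ray angles in radians
paths = [optical_path(phi, n0, n1, s_o, s_i) for phi in angles]
reference = n0 * s_o + n1 * s_i                          # on-axis optical path

print(f"on-axis OPL = {reference:.4f}")
print(f"largest deviation over the fan = {max(abs(p - reference) for p in paths):.2e}")
```

The deviation is at the level of rounding errors, so within the range of angles used every ray has the same optical path length.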
Conic sections
If you remember the more advanced bits of your education in analytical
geometry, you will recognize the last equation as the parametric repre-
sentation of a conic section. A conic section is the curve formed by the
set of points lying on the intersection of the surface of a (double) cone
with a plane as illustrated in Fig.4.3.
In polar coordinates, (r, φ), the conic section has the form:
$$r = \frac{\epsilon\cdot d}{1 - \epsilon\cdot\cos(\phi)},$$
where $\epsilon$ is the eccentricity, a measure of how strongly curved a conic
section is. When $\epsilon < 1$, the curve is an ellipse. When $\epsilon = 1$, it is a
parabola. For $\epsilon > 1$, it is a hyperbola3.
Problem 4.1 Show that the optimal shape r(φ) is indeed a hyperbola
for n1 > n0 and an ellipse for n0 > n1. Establish $\epsilon = n_1/n_0$ for the
hyperbola and $\epsilon = n_0/n_1$ for the ellipse. What happens when n0 = n1?
Plot the curves for n0 = 1 and n1 = 1.5 and vice versa.
$$r(\phi) = \frac{(n_0 - n_1)\,S_o}{n_0 - n_1\cos(\phi)}$$
$$r = \frac{a\cdot(\epsilon^2 - 1)}{1 - \epsilon\cdot\cos(\phi)} \quad\text{for } \epsilon > 1$$
where a is a distance that characterizes the conic section.
differential equation is
$$r(\phi) = \frac{C_1}{n_0 - n_1\cos(\phi)} \tag{4.5}$$
$$r(\phi) = \frac{(n_0 - n_1)\,S_o}{n_0 - n_1\cos(\phi)} \tag{4.6}$$
$$\mathrm{OPL} = n_0 L_o + n_1 L_i \tag{4.7}$$
$${} = n_0\left((S_o + R)^2 + R^2 - 2R(S_o + R)\cos(\phi)\right)^{1/2} + n_1\left((S_i - R)^2 + R^2 + 2R(S_i - R)\cos(\phi)\right)^{1/2}, \tag{4.8}$$
Figure 4.5 Single spherical surface with radius R used as a lens. The
point C denotes the center of the circle.
$$\frac{n_0}{S_o} + \frac{n_1}{S_i} = \frac{n_1 - n_0}{R}$$
The object distance So in this special case we call the (object side) focal
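[Figure: panels (2) and (3) of the spherical-surface cases, each drawn along the x-axis with the surface normal and the center of curvature c indicated.]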
the image. The image distance Si becomes negative, in the sense that
the (virtual) image point sits on the object side of the surface and to
the right of the object at So . This you may also see from equation
(4.9). On the right hand side, we have a positive number, n1 − n0
is positive, and R is positive. But when So becomes small, at some
stage we have
$$\frac{n_1}{S_i} = \frac{n_1 - n_0}{R} - \frac{n_0}{S_o}, \tag{4.10}$$
so Si must become negative. We interpret this, as written above, as
both object and (virtual) image being located on the object side.
3. In the last case we look at a spherical surface with the center of
curvature flipped to the opposite side, as shown in panel 3 of the
figure. For this case, we assume an incoming parallel beam So → ∞.
Applying the same extrapolation as before we can see the image must
be to the left of the lens, i.e., again on the same side as the object.
ventional lens from a slab of glass by polishing the two end faces into
spherical shape. Let us apply what we have just learned and deduce the
lens makers formula for a thin lens of thickness d → 0, as sketched in
fig.4.7. We have to consider the effect of the two surfaces sequentially and
be mindful about our sign conventions, when we stitch the two results
together. For the effect of the first surface we obtain:
$$\frac{n_0}{S_{1o}} + \frac{n_1}{S_{1i}} = \frac{n_1 - n_0}{R_1}. \tag{4.11}$$
To continue we need to look at the case, where the beam starts inside
the lens and meets the second spherical surface with radius of curvature
R2 . Here, we obtain:
$$\frac{n_1}{S_{2o}} + \frac{n_0}{S_{2i}} = \frac{n_0 - n_1}{R_2}. \tag{4.12}$$
These two formulas are considered exact within the paraxial approxima-
tion. So far, we have not yet used our assumption about the vanishing
thickness of the lens (d). From the geometry, we observe that d + S2o =
−S1i . With our thin lens assumption d → 0 we have S2o = −S1i . Ac-
cordingly, we can combine the two equations (4.11) and (4.12) to arrive
at:
$$\frac{n_0}{S_{1o}} + \frac{n_0}{S_{2i}} = (n_1 - n_0)\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{4.13}$$
Finally, for the compound system we identify the object distance as S1o
and the image distance as S2i to find:
$$\frac{1}{f} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{4.14}$$
and
$$\frac{1}{f} = \frac{1}{S_o} + \frac{1}{S_i}. \tag{4.15}$$
These two equations are so important, that they deserve a box and a
personal name.
For a thin5 lens with two spherical surfaces of radii R1 and R2 one
may assign a unique focal length, given by:
$$\frac{1}{f} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$
$$\frac{1}{f} = \frac{1}{S_o} + \frac{1}{S_i}$$
The equations (4.16) and (4.17) are the two central equations for geo-
metrical optics! We will derive these equations in a more elegant way
using ray tracing matrices presented in the next section. For now, we
will just illustrate the use of lenses and of the lens makers formula.
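As a small worked example of the two boxed equations, the sketch below computes the focal length of a symmetric glass lens in air and the image distance for a given object distance. The lens data (n1 = 1.5, |R| = 0.1 m, object at 0.3 m) and the function names are illustrative assumptions; the sign convention for the radii follows the one used in the text.

```python
def lensmaker_focal_length(n0: float, n1: float, r1: float, r2: float) -> float:
    """Thin-lens focal length from 1/f = (n1 - n0)/n0 * (1/R1 - 1/R2)."""
    return 1.0 / ((n1 - n0) / n0 * (1.0 / r1 - 1.0 / r2))

def image_distance(f: float, s_o: float) -> float:
    """Image distance from the thin-lens equation 1/f = 1/S_o + 1/S_i."""
    return 1.0 / (1.0 / f - 1.0 / s_o)

n0, n1 = 1.0, 1.5                 # air and an illustrative glass
radius = 0.1                      # magnitude of both radii of curvature [m]

f_biconvex = lensmaker_focal_length(n0, n1, radius, -radius)    # R1 = -R2 = R
f_biconcave = lensmaker_focal_length(n0, n1, -radius, radius)   # R1 = -R2 = -R
s_i = image_distance(f_biconvex, 0.3)

print(f"biconvex:  f = {f_biconvex:+.3f} m")
print(f"biconcave: f = {f_biconcave:+.3f} m")
print(f"object at 0.300 m in front of the biconvex lens -> image at {s_i:.3f} m")
```

The biconvex lens comes out with a positive focal length and the biconcave one with the same magnitude but negative sign. Swapping n0 and n1 in the call flips both signs.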
Example 4.4 Symmetric positive and negative lenses
We start with some gymnastics with signs. Consider a glass lens in air
(n0 < n1 ) with radii of curvature R1 = −R2 = R. This lens type is
called by its shape a symmetric biconvex lens6 . The lens transforms
5 A thin lens has no thickness, according to the definition. That means the focal
length is measured relative to its center.
6 It looks like a lentil seed – and this is how the optical lens got its name, as is still
quite evident in Danish or German language.
parallel rays from the object side to converging rays on the image side,
accordingly it is called a converging or positive lens. The opposite part-
ner with R1 = −R2 = −R is called a symmetric biconcave lens. This
lens transforms parallel rays from the object side instead to diverging
rays on the image side and is hence referred to as a diverging or nega-
tive lens. Can you figure out what happens, when n0 > n1 as for an air
bubble in water or a concave air gap in glass? Looking at the signs, we
conclude that the role of the shapes will simply switch.
In the following examples we will image object points which are not
on the optical axis. In order to find graphically the location of the image
it is very convenient to put an arrow vertically up from the optical axis
with the tip on the object point. Now, we can easily draw three rays
from the object point source to locate the image point. The first ray
starting parallel to the optical axis has to cross by definition the optical
axis at the image side focal point fi . The second ray we draw, intersects
the optical axis at the center of the lens7 . This ray will not be deflected
by the lens. The third ray to draw from the object point crosses the
optical axis in the object side focal point fo . This third ray will be
transformed by the lens into a ray parallel to the optical axis, according
to the definition of the object side focal length. If we have set up the
drawing correctly, the three8 transformed rays will intersect in one point
located at the image of the object.
7 This is convenient for thin lenses. The proper prescription for thick lenses will be
given later.
8 Of course, we can do with just two of those rays, but it is nice to use also the
third one as a sanity check. Do that with pencil and ruler in the next figures!
Figure 4.8 Imaging with a thin positive lens. A thin lens has no
extension and may be considered as a line. This is consistent with
the ray yo Qyi through the center not deviating when traversing the
lens.
$$M = \frac{-S_i}{S_o}$$
$$x_o\cdot x_i = f^2$$
Figure 4.10 Sign convention for curved surfaces and lenses. When the
center of curvature C is to the right of the reference vertex point
V, we associate a positive radius of curvature R > 0.
Spherical aberration occurs as the spherical shape is not ideal for the
imaging. We abandoned the aspheric surfaces as they were generally too
expensive to produce. A convex spherical lens bends rays which enter
far from its center more strongly compared to the ideal hyperbolic lens.
This means that parallel incident light rays far away from the optical
axis will come to a focus closer to the lens than rays entering close to
the center, leading again to a blurred focus. This is the price to be paid
for the paraxial approximation. To minimize spherical aberration, we
must restrict ourselves to the lens area close to the center and
compromise between image sharpness and image brightness. One has
to invest in an aspherical or more exotic lens9, if applications are very
critical.
9 Modern production techniques allow to produce also lenses with a gradient of the
refractive index both in the radial and axial directions, which gives more design
freedom to optimize the performance of a lens.
located on the object side of the lens; the distance from this virtual point
to the lens is known as the focal length. Similar to plano-concave
lenses, bi-concave lenses also have negative focal lengths, so
they cause collimated incident light to diverge. Bi-concave lenses
have equal radii of curvature on both sides of the lens. They are
generally used to expand the beam or increase the focal length in existing
systems, such as beam expanders and projection systems.
Positive Meniscus lenses are designed to minimize spherical aberration
and are generally used in small f/number applications (f/number
less than 2.5). Positive meniscus lenses have a smaller radius of curvature
on the convex side and a larger radius of curvature on the concave
side, so they are thicker at the center than at the edges. A positive
meniscus lens can maintain the angular resolution of the optical system
while decreasing the focal length of the other lens, resulting in a tighter
focal spot. It can thus be used to shorten the focal
length and increase the numerical aperture of an optical system when it
is paired with another lens. For the best performance, the curved surface
should face the largest object distance or the infinite conjugate in order
to reduce the spherical aberrations.
Negative Meniscus lenses are designed as an alternative
to other negative lenses. Without causing additional spherical aberration,
a negative meniscus lens can increase the divergence of the beam, making
it a good choice for beam expanding applications. Negative meniscus
lenses can be used to increase the focal length of another lens while
maintaining the angular resolution of the optical system. Negative
meniscus lenses have a larger radius of curvature on the convex side
and a smaller radius of curvature on the concave side, so they are thinner
at the center than at the edges.
Achromats (Achromatic Doublets) consist of a positive low-index
crown glass (low dispersion, high Abbe number) element cemented
to a negative high-index flint glass (high dispersion, low Abbe number)
element. The elements are chosen to cancel chromatic aberration
at two well separated wavelengths, usually in the blue and the red region
of the spectrum. Achromats bring these two wavelengths into focus
in the same image plane, so shifts of the focal length are virtually
eliminated across a considerable range of visible wavelengths. Usually,
these lenses are computer-designed by the manufacturer to effectively
minimize spherical aberration and coma when operating at an infinite
conjugate ratio. Unlike for singlet lenses, this results in a constant focal
length independent of aperture and far better off-axis performance.
Freedom from spherical aberration and coma means that
achromats are superior to singlet lenses for monochromatic applications
at any visible wavelength.
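$$\begin{pmatrix} y_1 \\ \alpha_1 \end{pmatrix} = \begin{pmatrix} 1 & L \\ 0 & 1 \end{pmatrix}\begin{pmatrix} y_0 \\ \alpha_0 \end{pmatrix}.$$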
12 There is no law that says, we have to represent a ray by a column vector (y, α).
Different authors use different representations. For instance, in Hecht’s “Optics”
column vectors (α, y) are used. Be careful not to mix element matrices based on
different conventions!
13 Keep in mind that you have to measure angles in radians for the approximation
to make sense!
4.2 Ray tracing 113
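$$M_{tot} = \begin{pmatrix} 1 & L_2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & L_1 \\ 0 & 1 \end{pmatrix} \tag{4.22}$$
$${} = \begin{pmatrix} 1 & L_1 + L_2 \\ 0 & 1 \end{pmatrix}$$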
Example 4.14 Propagation through (n0 , n1 ) interface
Looking at the figure 4.14, we are looking for the matrix M for the
interface n0 , n1 . Notice that, here: y0 = y1 , since we are right at the
interface. For the angles we apply Snell’s law:
n0 sin(α0 ) = n1 sin(α1 ) or n0 α0 = n1 α1 ,
again applying the paraxial approximation. We finally obtain:
$$M = \begin{pmatrix} 1 & 0 \\ 0 & \frac{n_0}{n_1} \end{pmatrix}.$$
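$$M = \begin{pmatrix} 1 & 0 \\ 0 & \frac{n_1}{n_0} \end{pmatrix}\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & \frac{n_0}{n_1} \end{pmatrix} \tag{4.23}$$
$${} = \begin{pmatrix} 1 & d\,\frac{n_0}{n_1} \\ 0 & 1 \end{pmatrix}.$$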
Notice the order: the first element we meet, M1, is the rightmost matrix
in the product and the last element is the leftmost, so M = M3 · M2 · M1.
In conclusion, we have:
$$M_{tot} = M_N \cdots M_2\cdot M_1.$$
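The bookkeeping of the matrix order is easy to get wrong, so it helps to let a small helper do it. The sketch below represents rays as (y, α) pairs and composes element matrices in the order the ray meets them. The helper names (`matmul`, `translation`, `flat_interface`, `system_matrix`, `propagate`) and the numbers are illustrative assumptions.

```python
def matmul(m2, m1):
    """2x2 matrix product m2 * m1 (m1 acts on the ray first)."""
    (a, b), (c, d) = m2
    (e, f), (g, h) = m1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def translation(length):
    """Free propagation over a distance `length`."""
    return ((1.0, length), (0.0, 1.0))

def flat_interface(n_in, n_out):
    """Paraxial refraction at a flat interface from index n_in to n_out."""
    return ((1.0, 0.0), (0.0, n_in / n_out))

def system_matrix(*elements):
    """Total matrix for elements listed in the order the ray meets them."""
    total = ((1.0, 0.0), (0.0, 1.0))
    for element in elements:
        total = matmul(element, total)   # each new element multiplies from the left
    return total

def propagate(matrix, ray):
    """Apply a system matrix to a ray (y, alpha)."""
    (a, b), (c, d) = matrix
    y, alpha = ray
    return (a * y + b * alpha, c * y + d * alpha)

# Illustrative slab: thickness d of index n1 surrounded by index n0.
n0, n1, d = 1.0, 1.5, 0.03
slab = system_matrix(flat_interface(n0, n1), translation(d), flat_interface(n1, n0))
two_gaps = system_matrix(translation(0.2), translation(0.3))

print("slab matrix:", slab)
print("ray after slab:", propagate(slab, (0.0, 0.01)))
```

The slab matrix reproduces the result for propagation through a plate: it acts like free space of reduced length d·n0/n1.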
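$$M = \begin{pmatrix} 1 & 0 \\ \frac{n_0 - n_1}{n_1 R} & \frac{n_0}{n_1} \end{pmatrix}$$
Refraction at n0, n1 interface
$$M = \begin{pmatrix} 1 & 0 \\ 0 & \frac{n_0}{n_1} \end{pmatrix}$$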
with
$$\frac{1}{f} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right).$$
Problem 4.16 Given a transfer matrix for an optical element (lens):
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$
show that C = −1/f, where f is the focal length. Hint: inject a ray with a
given y0 and α0 = 0, i.e., a beam parallel to the optical axis.
Example 4.17 The refraction matrix for a spherical surface
[Figure: refraction of a ray at a spherical surface of radius R between media n1 and n2.]
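\[
M = \begin{pmatrix} 1 & 0 \\ \frac{n_1 - n_0}{n_0 R_2} & \frac{n_1}{n_0} \end{pmatrix}
    \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}
    \begin{pmatrix} 1 & 0 \\ \frac{n_0 - n_1}{n_1 R_1} & \frac{n_0}{n_1} \end{pmatrix}
  = \begin{pmatrix} A & B \\ C & D \end{pmatrix}
\tag{4.31}
\]
with
\begin{align}
A &= 1 + \frac{(n_0 - n_1)\,d}{n_1 R_1} \tag{4.32} \\
B &= \frac{n_0}{n_1}\,d \tag{4.33} \\
C &= \frac{n_1 - n_0}{n_0 R_2}\left[1 + \frac{(n_0 - n_1)\,d}{n_1 R_1}\right]
     + \frac{n_0 - n_1}{n_0 R_1} \tag{4.34} \\
D &= 1 - \frac{(n_0 - n_1)\,d}{n_1 R_2} \tag{4.35}
\end{align}
With C = −1/f the focal length follows as
\[
\frac{1}{f} = \frac{n_1 - n_0}{n_0}
\left[\frac{1}{R_1} - \frac{1}{R_2} + \frac{(n_1 - n_0)\,d}{n_1 R_1 R_2}\right].
\]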
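As a cross-check of the thick-lens result, the following sketch multiplies the three matrices of equation (4.31) numerically and compares −1/C with the closed-form focal length. The function names and the lens data (a symmetric biconvex lens with R2 taken negative) are assumptions for illustration.

```python
def matmul2(a, b):
    """Return the product a·b of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def free_space(length):
    """Transfer matrix for propagation over `length`, acting on (y, alpha)."""
    return ((1.0, length), (0.0, 1.0))


def spherical_interface(n_in, n_out, radius):
    """Paraxial refraction at a spherical surface, going from n_in into n_out."""
    return ((1.0, 0.0), ((n_in - n_out) / (n_out * radius), n_in / n_out))


def thick_lens(n0, n1, r1, r2, d):
    """Thick lens of index n1 in a medium n0: second surface · gap · first surface."""
    first = spherical_interface(n0, n1, r1)
    second = spherical_interface(n1, n0, r2)
    return matmul2(second, matmul2(free_space(d), first))


def focal_length_formula(n0, n1, r1, r2, d):
    """Closed-form thick-lens focal length quoted in the text."""
    inv_f = (n1 - n0) / n0 * (1 / r1 - 1 / r2 + (n1 - n0) * d / (n1 * r1 * r2))
    return 1.0 / inv_f


# Assumed example: biconvex glass lens in air, radii 10 cm, thickness 1 cm.
n0, n1, r1, r2, d = 1.0, 1.5, 0.10, -0.10, 0.01
(A, B), (C, D) = thick_lens(n0, n1, r1, r2, d)
f_matrix = -1.0 / C
f_formula = focal_length_formula(n0, n1, r1, r2, d)
```

Both routes give the same focal length, and the determinant AD − BC equals one, as it should when the lens is surrounded by the same medium on both sides.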
Figure 4.17 Principal planes and nodal planes of a thick lens. The focal
distance is referred to the principal planes. The nodal planes
coincide with the principal planes when the media to the left and
right of the lens have identical indices of refraction.
\[
|V_2 H_2| = -\frac{f\,(n_1 - 1)\,d}{R_1 n_1}. \tag{4.38}
\]
So, the focal lengths of a thick lens are referenced to the two
principal planes. This also makes sure the focal length of a lens is
unique, when immersed in a single substance.
From the thin lens we are used to another special ray, the central ray,
which travels from an off-axis object point without angular deflection
and without displacement through the center of the lens. To find the
analogous ray for the case of a thick lens, we look for a ray which travels
from the object point without angular deflection through the lens, but
we have to allow for a parallel displacement on the image side. Extending
the object and image side rays as virtual rays inside the lens, we find
the intersection points of the virtual rays with the optical axis. Those
points are the nodal points N 1 and N 2, with corresponding nodal planes.
For the simple case of a thick lens surrounded by the same medium on
the object side and the image side, the principal points and the nodal
points are at the same location. In more general settings they need not
coincide. Knowing the location of principal and nodal points for a lens
can simplify the task of ray tracing considerably, which is why they are
typically determined early when designing a lens.
\[
\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{L}{f_1 \cdot f_2},
\]
which is an important result. Double lens systems can cancel the first
order chromatic effects. Consider the combination of two lenses f1 and
f2 described above. We can write their lens equations as:
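\begin{align}
\frac{1}{f_1} &= (n - 1)\left(\frac{1}{R_{11}} - \frac{1}{R_{12}}\right) = K_1 (n - 1) \tag{4.40} \\
\frac{1}{f_2} &= (n - 1)\left(\frac{1}{R_{21}} - \frac{1}{R_{22}}\right) = K_2 (n - 1) \tag{4.41}
\end{align}
\[
L = \frac{1}{2}\,(f_1 + f_2)
\]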
Would this only apply to thin lenses? No – you should check that we
can use thick lenses, as long as we have R1 = ∞ or R2 = ∞ i.e. one of
the lens surfaces is flat (plano).
5
Polarization of light
This chapter deals with the polarization state of light. We discuss the
different polarization states of light: linearly polarized, circularly
polarized, and elliptically polarized light. We give two mathematical
toolboxes to describe the state of polarization – one for strictly
monochromatic light and a second, more general one applicable also to
incoherent mixtures of light waves. Of course, we also want to know how
we can change and control the polarization state of light in practice.
This will lead us from a discussion of birefringent materials to optical
devices that can perform almost any desired preparation and
transformation of light polarization.
We know that electromagnetic waves are transverse, i.e. k · E = 0 at
least in free space. From this we can already conclude that there are just
two independent (orthogonal) polarizations. This makes the polarization
degree of freedom resemble a bit the two possible spin projections of an
electron. In fact, you will find in your quantum mechanics courses a few
mathematical tools that are just “borrowed”1 from optics. The analogy
is, however, not complete. In the quantum mechanical description of
light, the photon as a quantum of light is a spin one (S = 1) particle
(boson) with zero rest mass2 , which has two possible projections of its
spin onto the propagation direction. You certainly have heard about
research aimed at developing quantum computers based on photonic
qubits – one possible encoding of qubits relies on the two polarization
states of single photons.
Figure 5.1 Linear (A) and elliptical (B) polarized states of light. For
the special case E1 = E2 and δ = π/2 we talk about circularly
polarized light.
are associated with the two possible ±ħ angular momentum projections
onto the z-axis.
Remark By construction, monochromatic light is always polarized, because
the amplitude, frequency, and phase δ are constants. If, say, the phase
fluctuated in time, then at one instant the E-vector would point in some
particular direction and the next moment in another. If the phase
fluctuation behaves as random noise and all phases in the interval
0 to 2π are equally probable, we cannot talk about polarized light. In
fact, this case corresponds to randomly polarized light, sometimes also
called unpolarized light. On the other hand, if a light beam is polarized
we cannot conclude that it is monochromatic. Take the following E-field
as a counterexample:
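so
\[
\frac{E_y}{E_2} - \frac{E_x}{E_1}\cos(\delta) = -\sin(kz - \omega t)\,\sin(\delta).
\]
But we can rewrite the last sine term in terms of the cosine term for Ex .
Then
\[
\left(\frac{E_y}{E_2} - \frac{E_x}{E_1}\cos(\delta)\right)^2
= \left[1 - \left(\frac{E_x}{E_1}\right)^2\right]\sin^2(\delta),
\]
which is the equation of a tilted ellipse with the major axis oriented at
an angle α to the x-axis:
\[
\tan(2\alpha) = \frac{2 E_1 E_2 \cos(\delta)}{E_1^2 - E_2^2}.
\]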
Figure 5.2 Tilted ellipse with the major axis oriented at an angle α
to the x-axis.
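\[
\left(\frac{E_y}{E_2}\right)^2 + \left(\frac{E_x}{E_1}\right)^2
- 2\,\frac{E_y}{E_2}\,\frac{E_x}{E_1}\cos(\delta) = \sin^2(\delta),
\]
with
\[
\tan(2\alpha) = \frac{2 E_1 E_2 \cos(\delta)}{E_1^2 - E_2^2},
\qquad 0 \le \alpha \le \pi/2 .
\]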
Problem 5.1 Show that equation (5.1) defines a rotated ellipse in the
Ex , Ey frame, making an angle α with the Ex -axis.
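\[
\mathbf{E} = \begin{pmatrix} E_x \\ E_y \end{pmatrix}
= \begin{pmatrix} E_1 e^{i\varphi_x} \\ E_2 e^{i\varphi_y} \end{pmatrix}
\]
A field polarized along the x-direction (horizontal), which we could
call E0 , becomes: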
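\[
E_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix},
\]
similarly, a field polarized along the y-direction, which we could call E90 , is:
\[
E_{90} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]
For a field linearly polarized along the +45 degree direction we have:
\[
E_{45} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
\]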
A circularly polarized field, say with photons of angular momentum
projection +ħ, is
\[
E_{+1} = \begin{pmatrix} E_1 e^{i\varphi_x} \\ E_1 e^{i(\varphi_x - \pi/2)} \end{pmatrix}
= E_1 e^{i\varphi_x} \begin{pmatrix} 1 \\ -i \end{pmatrix}
\;\rightarrow\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix}.
\]
The angular momentum of light is defined with respect to the k vector
(right hand rule). Our wave is propagating in the +z direction and the
y-component lags behind the x-component by a phase of π/2, so we must
have angular momentum +ħ in this case.
Below, we give a summary of the polarization states in the Jones
notation.
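\begin{align*}
E_0 &= \begin{pmatrix} 1 \\ 0 \end{pmatrix}, &
E_{90} &= \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\
E_{45} &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, &
E_{135} &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \\
E_{+1} &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix}, &
E_{-1} &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}
\end{align*}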
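\[
M_{90} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\]
It will produce zero output for E0 , but 1 for E90 , as expected. For E45
the output is
\[
E_{\mathrm{out}} = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]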
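The action of the analyzer on the three linear states can be reproduced with a few lines of code. This is a minimal sketch; the helpers `apply_jones` and `power` are names introduced here for illustration, and "power" means the relative quantity |Ex|² + |Ey|² with all prefactors dropped.

```python
import math


def apply_jones(m, e):
    """Apply a 2x2 Jones matrix to a Jones vector (Ex, Ey)."""
    return (m[0][0] * e[0] + m[0][1] * e[1], m[1][0] * e[0] + m[1][1] * e[1])


def power(e):
    """Relative power |Ex|^2 + |Ey|^2 carried by a Jones vector."""
    return abs(e[0]) ** 2 + abs(e[1]) ** 2


s = 1 / math.sqrt(2)
E0, E90, E45 = (1, 0), (0, 1), (s, s)
M90 = ((0, 0), (0, 1))  # analyzer transmitting the y-direction

# Fraction of the input power that passes the analyzer for each state.
transmitted = {
    name: power(apply_jones(M90, e))
    for name, e in (("E0", E0), ("E90", E90), ("E45", E45))
}
```

For E45 half of the power is transmitted, which is what the amplitude factor 1/√2 in the output vector implies.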
Problem 5.3 For an E+1 state, how much power is transmitted through an
M90 analyzer?
accept that wave-plates have a different optical path length for two orthogonal
linear polarization components.
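\begin{align*}
M_0 &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} &
M_{90} &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \\
M_{45} &= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} &
M_{135} &= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \\
M_{+\lambda/4} &= e^{+i\pi/4} \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} &
M_{-\lambda/4} &= e^{-i\pi/4} \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix}
\end{align*}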
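Problem 5.5 Show that the matrix of a linear polarizer oriented with
transmission axis at an angle θ with respect to the x-direction is given by:
\[
M_{\mathrm{lin}}(\theta) = \begin{pmatrix}
\cos^2(\theta) & \cos(\theta)\sin(\theta) \\
\cos(\theta)\sin(\theta) & \sin^2(\theta)
\end{pmatrix}
\]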
Problem 5.6 Show that a λ/2 plate with fast axis oriented at an angle
θ with respect to the x-direction is described by the matrix:
\[
M_{\lambda/2}(\theta) = \begin{pmatrix}
\cos(2\theta) & \sin(2\theta) \\
\sin(2\theta) & -\cos(2\theta)
\end{pmatrix}
\]
Generally, for a series of elements the light encounters on its path, we
perform the operations element by element, leaving us with the rule:
Eout = MN · . . . · M1 · Ein
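To illustrate the chaining rule, the sketch below sends E45 first through the quarter-wave plate M+λ/4 and then through the polarizer M0, using the matrices listed above. The helper names are assumptions introduced for this example.

```python
import cmath
import math


def apply_jones(m, e):
    """Apply a 2x2 Jones matrix to a Jones vector (Ex, Ey)."""
    return (m[0][0] * e[0] + m[0][1] * e[1], m[1][0] * e[0] + m[1][1] * e[1])


def power(e):
    """Relative power |Ex|^2 + |Ey|^2 carried by a Jones vector."""
    return abs(e[0]) ** 2 + abs(e[1]) ** 2


s = 1 / math.sqrt(2)
E45 = (s, s)
M0 = ((1, 0), (0, 0))                    # polarizer along x
phase = cmath.exp(1j * math.pi / 4)
M_QWP = ((phase, 0), (0, 1j * phase))    # M_{+lambda/4} from the list above

# The element met first acts first: E_out = M0 · M_QWP · E45.
after_plate = apply_jones(M_QWP, E45)
after_polarizer = apply_jones(M0, after_plate)

# Phase of Ey relative to Ex behind the plate.
relative_phase = cmath.phase(after_plate[1] / after_plate[0])
```

Behind the plate the two components have equal magnitude and a relative phase of π/2, i.e. the light is circularly polarized, and the subsequent polarizer transmits half of the power.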
\[
\langle E(t)^2 \rangle_T = \lim_{T \to \infty} \left( \frac{1}{T} \int_0^T E(t)^2 \, dt \right)
\]
4 This representation is completely equivalent to the density matrix, that you will
encounter in quantum mechanics.
5 We can choose any direction in the plane perpendicular to the k-vector.
S0 = I(0) + I(90)
S1 = I(0) − I(90)
S2 = I(45) − I(135)
S3 = I(σ+ ) − I(σ− )
Problem 5.7 Show that for the general field in equation (5.1) the Stokes
vector becomes
\begin{align*}
S_0 &= E_1^2 + E_2^2 \\
S_1 &= E_1^2 - E_2^2 \\
S_2 &= 2 E_1 E_2 \cos(\delta) \\
S_3 &= 2 E_1 E_2 \sin(\delta),
\end{align*}
where we have omitted the factor (1/2)ε0 c. You can use the Jones
formalism or project the E-field on the x- and y-directions. For S3
remember to add the π/2 phase shift in one of the directions. Based on
the measured numbers S1 , S2 , S3 , how do we extract E1 , E2 and δ?
Finally, show that $S_0^2 = S_1^2 + S_2^2 + S_3^2$ holds for polarized light.
Problem 5.8 Let the phase δ be white noise (white random process) in
[0, 2π]. Show that S1 = S2 = S3 = 0. What is S0 ?
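A quick numerical experiment can make the two problems above more concrete. The sketch below evaluates the Stokes components from E1, E2 and δ and then averages them over many phases. Two assumptions are made here: equally spaced phases stand in for the random phase, and equal amplitudes E1 = E2 are used in the averaging step.

```python
import math


def stokes(e1, e2, delta):
    """Stokes components (S0, S1, S2, S3) with the prefactor omitted."""
    return (
        e1**2 + e2**2,
        e1**2 - e2**2,
        2 * e1 * e2 * math.cos(delta),
        2 * e1 * e2 * math.sin(delta),
    )


def mean_stokes(e1, e2, n_phases=360):
    """Average the Stokes vector over n_phases equally spaced phases in [0, 2*pi)."""
    samples = [stokes(e1, e2, 2 * math.pi * k / n_phases) for k in range(n_phases)]
    return tuple(sum(s[i] for s in samples) / n_phases for i in range(4))


# Fully polarized light: S0^2 - (S1^2 + S2^2 + S3^2) should vanish.
s0, s1, s2, s3 = stokes(1.0, 0.5, 0.7)
polarized_residual = s0**2 - (s1**2 + s2**2 + s3**2)

# Phase-averaged light with equal amplitudes.
m0, m1, m2, m3 = mean_stokes(1.0, 1.0)
```

In this example the phase-dependent components average to zero while S0 is unchanged, which corresponds to the center of the Poincaré sphere.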
We summarize important properties of the Stokes vector in the fol-
lowing box.
Figure 5.3 Poincaré sphere for polarization states. Points on the unit
sphere surface represent fully polarized light. Partially polarized light
states are plotted inside the ball. Completely random polarized light
lies at the center (0, 0, 0). North and south poles correspond to right
and left circularly polarized light, while points on the equator describe
linearly polarized light. Note that walking along the equator by 90
degrees changes the linear polarization direction by only 45 degrees.
\[
v_\parallel = \frac{c}{n_\parallel} = \frac{c}{n_e}
\]
\[
v_\perp = \frac{c}{n_\perp} = \frac{c}{n_o}
\]
In figure 5.6 we imagine a point source embedded in a negative uniaxial
crystal. Wavefronts belonging to E∥ , i.e., E-fields parallel to the optic
axis, will move faster than wavefronts with E-fields perpendicular
to the optic axis (dots on the circle). This happens because ne < no . At
the semi-minor axis both polarizations are perpendicular to the optic axis
and they move with the same speed v⊥ .
Example 5.10 Propagation of light through a uniaxial crystal.
Let the optic axis be oriented at 45 degrees with respect to incoming
wave vector, but in the plane of the paper. Consider first case (A) in
figure 5.7. Using Huygens’ principle we can construct secondary wavelets
to establish the next wavefront. As we observed above, the speed
of propagation is the same in all directions in the paper plane when the
electric field is perpendicular to the optic axis. Our secondary wavelets
are consequently circles. The wavefront will propagate straight through,
as we have seen for isotropic media. This case is named ordinary for
that reason.
Case (B) is different. Now the electric field has a component parallel to
the optic axis and a component perpendicular to it. The secondary wavelets
will be elliptical, and according to Huygens’ principle the beam will
get a kink upwards – the ray bends! The angle between the k vector and
wave propagation direction is about 6 degrees for calcite, so the effect
is actually quite small. Figure 5.8 provides a ”zoom in” on the physics
of case (B) summarizing the different fields involved. Notice that the
E-field is no longer oriented perpendicular to the wave vector, but still
perpendicular to the Poynting vector S. However, the D vector is still
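\begin{align}
\nabla \cdot \mathbf{D} &= 0 \tag{5.9} \\
\nabla \cdot \mathbf{B} &= 0 \tag{5.10} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{5.11} \\
\nabla \times \mathbf{H} &= \frac{\partial \mathbf{D}}{\partial t}, \tag{5.12}
\end{align}
\begin{align*}
D_i &= \varepsilon_i E_i \\
\varepsilon_1 &= \varepsilon_2 = \varepsilon_\perp \\
\varepsilon_3 &= \varepsilon_\parallel
\end{align*}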
going to evaluate the indices, but first we want to work out what we can
say about the angles between the various field and wave vectors.
For the present analysis we assume a monochromatic plane wave as
usual. Let E = E0 ei(k·r−ωt) , all other fields will be of the same form.
From Gauss’ law we have
∇ · D = ik · D = 0
and
∇ · B = ik · B = 0
∇ × H = ik × H = −iωD
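\begin{align}
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t}, \tag{5.14} \\
i\mathbf{k} \times \mathbf{E} &= i\omega \mathbf{B}, \tag{5.15}
\end{align}
so
\[
\mathbf{H} = \frac{1}{\mu}\mathbf{B} = \frac{1}{\mu\omega}\,\mathbf{k} \times \mathbf{E}. \tag{5.16}
\]
\[
\mathbf{E} \cdot \mathbf{H} = \frac{1}{\mu\omega}\,\mathbf{E} \cdot (\mathbf{k} \times \mathbf{E})
= -\frac{1}{\mu\omega}\,\mathbf{k} \cdot (\mathbf{E} \times \mathbf{E}) = 0. \tag{5.17}
\]
\[
\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = -\mathbf{B} \cdot (\mathbf{A} \times \mathbf{C}). \tag{5.18}
\]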
So we can conclude:
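\begin{align*}
\mathbf{k} \cdot \mathbf{D} &= 0 \\
\mathbf{k} \cdot \mathbf{H} &= 0 \\
\mathbf{D} &= -\frac{1}{\omega}\,\mathbf{k} \times \mathbf{H}
\end{align*}
\[
\nabla \times (\nabla \times \mathbf{E}) = -\frac{\partial (\nabla \times \mathbf{B})}{\partial t}.
\]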
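Performing the derivatives gives
\[
\mathbf{k} \times (\mathbf{k} \times \mathbf{E})
= \mu\varepsilon\,\frac{\partial^2 \mathbf{E}}{\partial t^2}
= -\mu\varepsilon\,\omega^2 \mathbf{E},
\]
where we used B = µH. Notice that here ε is the tensor from equation (5.13).
Using c² = 1/ε0 µ0 we find
\[
\mathbf{k} \times (\mathbf{k} \times \mathbf{E})
= -\frac{\mu\varepsilon}{\varepsilon_0 \mu_0}\,\frac{\omega^2}{c^2}\,\mathbf{E}
\]
and finally
\[
\mathbf{k} \times (\mathbf{k} \times \mathbf{E}) = -\frac{\omega^2}{c^2}\,\varepsilon_t \mathbf{E}
\]
with
\[
\varepsilon_t = \begin{pmatrix} n_o^2 & 0 & 0 \\ 0 & n_o^2 & 0 \\ 0 & 0 & n_e^2 \end{pmatrix}.
\]
Expanding the double cross product, k × (k × E) = (k · E)k − k²E, this
becomes
\[
(\mathbf{k} \cdot \mathbf{E})\,\mathbf{k} - k^2 \mathbf{E} + \frac{\omega^2}{c^2}\,\varepsilon_t \mathbf{E} = 0.
\]
\[
\left( \frac{k_x^2}{n_o^2} + \frac{k_y^2}{n_o^2} + \frac{k_z^2}{n_o^2} - \frac{\omega^2}{c^2} \right)
\cdot
\left( \frac{k_x^2}{n_e^2} + \frac{k_y^2}{n_e^2} + \frac{k_z^2}{n_o^2} - \frac{\omega^2}{c^2} \right) = 0.
\]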
Setting the first or the second factor in parentheses to zero we find two
independent classes of solutions for the k-vector. Fixing the frequency,
we see that for the first term k solutions lie on a sphere. This is the or-
dinary wave solution, in good agreement with the picture from Huygens
principle. For the second term k solutions form the surface of an ellipsoid
as we change direction. This represents the extraordinary solution.
Problem 5.11 Let the angle between the k vector and the optical axis
be 45 degrees. Show that the angle α between k and S, shown in figure
5.8, is given by:
\[
\cos(\alpha) = \frac{1}{\sqrt{2}} \cdot \frac{n_e^2 + n_o^2}{\sqrt{n_e^4 + n_o^4}}
\]
and that α = 6.2 degrees for calcite. Is this the maximal α?
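The numerical value quoted in the problem can be checked directly from the formula. The refractive indices used below are assumptions (commonly quoted values for calcite near 589 nm); they are not given in the text.

```python
import math


def walkoff_angle_45(n_o, n_e):
    """Angle in radians between k and S when k is at 45 degrees to the optic axis."""
    cos_alpha = (n_e**2 + n_o**2) / (math.sqrt(2) * math.sqrt(n_e**4 + n_o**4))
    return math.acos(cos_alpha)


# Assumed calcite indices near 589 nm.
N_O, N_E = 1.658, 1.486
alpha_deg = math.degrees(walkoff_angle_45(N_O, N_E))
```

With these indices the angle comes out close to the 6.2 degrees stated above; for an isotropic medium (n_o = n_e) the formula gives zero, as expected.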
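\begin{align}
\omega \mathbf{D} &= -\mathbf{k} \times \mathbf{H} \tag{5.22} \\
\omega \mathbf{B} &= +\mathbf{k} \times \mathbf{E}, \tag{5.23}
\end{align}
so
\begin{align}
\omega (\mathbf{D} \times \mathbf{B}) &= \mathbf{D} \times (\mathbf{k} \times \mathbf{E}) \tag{5.27} \\
&= (\mathbf{D} \cdot \mathbf{E})\,\mathbf{k} \tag{5.29}
\end{align}
\[
\mathbf{k} = \frac{\mathbf{D} \times \mathbf{B}}{\mathbf{D} \cdot \mathbf{E}}\,\omega. \tag{5.30}
\]
The step to (5.29) uses D × (k × E) = (D · E)k − (D · k)E together
with k · D = 0.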
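For the final steps we look for an expression of the form ω = A · k, where
A is some vector. This will allow us to read off the group velocity
directly as vg = ∂ω/∂k = A. We can write
\[
\frac{\mathbf{D} \times \mathbf{B}}{\mathbf{D} \cdot \mathbf{E}} \cdot
\frac{\mathbf{E} \times \mathbf{H}}{\mathbf{D} \cdot \mathbf{E}}
= \frac{(\mathbf{D} \cdot \mathbf{E})(\mathbf{B} \cdot \mathbf{H})
      - (\mathbf{B} \cdot \mathbf{E})(\mathbf{D} \cdot \mathbf{H})}
       {(\mathbf{D} \cdot \mathbf{E})(\mathbf{D} \cdot \mathbf{E})}
= 1, \tag{5.31}
\]
\[
\omega = \frac{\mathbf{E} \times \mathbf{H}}{\mathbf{D} \cdot \mathbf{E}} \cdot \mathbf{k}. \tag{5.32}
\]
\[
\mathbf{v}_g = \frac{\partial \omega}{\partial \mathbf{k}}
= \frac{\mathbf{E} \times \mathbf{H}}{\mathbf{D} \cdot \mathbf{E}}
= \frac{\mathbf{S}}{\mathbf{D} \cdot \mathbf{E}}. \tag{5.33}
\]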
This result tells us that a light beam, modeled as a wave packet of plane
waves with different directions, moves along the direction of the Poynting
vector. This is, of course, also true for light beams in isotropic materials.
5.2.1 Polarizers
The job of a polarizer is to separate two orthogonal polarization
components. This can be achieved either by splitting the path, e.g. as
shown in fig. 5.7, or by selectively absorbing one of the polarization
components, as in the polaroid sheet polarizers you use in the lab
experiments. The quality of a linear polarizer is characterized by its
extinction ratio Rex , the ratio of the intensity transmitted when two
such polarizers are crossed (θ = π/2) to the intensity transmitted when
the two polarizers are parallel (θ = 0). Typically the extinction ratio
of good polarizers is about 10⁻⁵ (using quartz prisms), and values of
10⁻⁷ or better are possible (using calcite prisms); going significantly
beyond 10⁻⁷ is, however, very difficult 8 . Polaroid sheet polarizers
have Rex ≈ 5 · 10⁻³ . How do we
8 Measuring the extinction ratio of a high-quality polarizer (Rex < 10⁻⁵ ) is a
challenge. It means detecting very low power levels, typically in the nanowatt
range, and controlling the input polarization to the required level. Any
absorption loss must be measured and taken into account as well. In some cases
Figure 5.10 Polarizer based on a Nicol calcite prism. The blue layer
is the glue cementing the prisms together. Here ne < nglue < no . The
optic axis is marked with “OA”.
interpret the extinction ratio physically? Well, for a polarizer with an
extinction ratio of, say, 10⁻³ the direction of the polarization vector
has an angular uncertainty ∆θ, which can be estimated from Malus’s law.
The law describes the transmitted intensity through an ideal polarizer
as a function of the angle θ between the incoming polarization and the
transmission axis of the polarizer: I = I0 cos²(θ).
\[
R_{\mathrm{ex}} = \frac{I}{I_0} = 10^{-3}
= \cos^2\!\left(\frac{\pi}{2} - \Delta\theta\right)
= \sin^2(\Delta\theta) \approx \Delta\theta^2 \tag{5.34}
\]
so
\[
\Delta\theta = \sqrt{10^{-3}}\ \mathrm{rad} \simeq 1.8\ \text{degrees}. \tag{5.35}
\]
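The same estimate can be repeated for the other extinction ratios mentioned in this section. The short sketch below is only an illustration; the function names are assumptions, and it compares the exact inversion of sin²(∆θ) with the small-angle form used in equation (5.35).

```python
import math


def angular_uncertainty_deg(r_ex):
    """Angle in degrees solving sin^2(dtheta) = r_ex exactly."""
    return math.degrees(math.asin(math.sqrt(r_ex)))


def small_angle_estimate_deg(r_ex):
    """Small-angle estimate dtheta ~ sqrt(r_ex), converted to degrees."""
    return math.degrees(math.sqrt(r_ex))


# Extinction ratios quoted in the text: sheet polarizer, the 1e-3 example,
# good prism polarizers, and the best calcite prisms.
estimates = {r: angular_uncertainty_deg(r) for r in (5e-3, 1e-3, 1e-5, 1e-7)}
```

For all of these values the small-angle approximation is very accurate, and each factor of 100 in extinction ratio reduces the angular uncertainty by a factor of 10.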
light around 250 nm, which limits the range of wavelengths where the
Nicol prism may be used. Also, the glue reduces the amount of power that
can be used. An air-spaced version of the Nicol prism, the so-called
Foucault prism, avoids the glue and can be used at considerably higher
power levels10 .
A major drawback of the Nicol and Foucault prisms is the small but
notable parallel displacement of the output beam. Many applications
rely on the ability to rotate the polarizer without displacing the output
beam. The most common polarizing prisms used in modern optics belong
to the Glan prism “family”: Glan-Thompson, Glan-Taylor and Glan-
Foucault prism. These prisms are shown in figure 5.11. They all have the
advantage of a non-displaced transmitted beam. The air-gap versions
furthermore have the advantage of allowing high powers, but at the
expense of being more sensitive to alignment. Normally, the extinction
ratio for these prisms is in the range of 10⁻⁵ or better. It is always the
extraordinary ray which has the clean polarization and is used as the
output. Frequently the ordinary ray is sent into an absorbing layer or
caught in a beam dump.
There are, of course, many applications, where it is desirable to keep
both of the output beams. In figure 5.12 we show common prisms for
separating or combining beams. The most common type used is the
polarization beam splitter cube (PBS). It relies on thin dielectric
coating layers11 at the interface between the prisms, designed to
show near perfect transmission (reflection) for incoming p-polarized
(s-polarized) light12 . Typically, extinction ratios range from about
10⁻³ to 10⁻⁴ .
10 A Foucault prism has a length to width ratio of 1:1 and a reduced angular
acceptance of ± 4 degrees.
11 We will return to thin film coatings, when we discuss interference.
12 Note that this is opposite to most beam splitters based on birefringent materials.
θ3 = θ2 − α (5.38)
φ3 = φ2 − α. (5.39)
Now we may compute the exit angle of the prism backside to be (assum-
Figure 5.13 Wollaston prism. The light emerges from the prism
configuration as two orthogonally polarized beams separated by an angle β.
The prism wedge angle is α. Refraction angles belonging to the
polarization state pointing out of the paper are denoted by θ, the
orthogonal ones by φ. At the prism interface the incident angles are
θ1 = φ1 = α, while the refracted angles are θ2 , φ2 and so forth.
Figure 5.14 The optical bench of the read head in an early CD ROM
drive.
In many cases one seeks to convert linearly polarized light into
circularly polarized light or, conversely, circularly polarized light
into linearly polarized light. A retarder that rotates the direction
of linearly polarized light is called a half-wave plate (λ/2 plate). In
figure 5.15 we show how to make such a plate from a thin slab of
birefringent material, say a uniaxial material, with the optic axis
orthogonal to the propagation direction of the light through the plate.
When light, in this case linearly polarized light, hits the plate we can
decompose the polarization into the components along the optic axis x̂
of the crystal and perpendicular to it ŷ.
The component polarized along the x̂-direction (optic axis) advances
faster compared to the ŷ component, since no > ne for calcite. Thus one
component will be phase shifted with respect to the other component.
The input electric field can be written as:
Figure 5.15 Retardation plates with optic axis along x and incident
polarization at 45 degrees to the optic axis. Left: λ/2-plate, the red
wave is shifted half a wavelength with respect to the blue wave, thus
flipping the E-vector at the output. Right: λ/4-plate, the red wave is
shifted by a quarter wavelength with respect to the blue wave, thus
producing circularly polarized light at the output.
equatorial plane is given by twice the angle of the optic axis with the
horizontal.
Problem 5.14 Show graphically, using the Poincaré sphere, that circularly
polarized light entering a quarter-wave plate is transformed into
linearly polarized light and determine the polarization direction with
respect to the optic axis.
13 We have to invert the direction of all currents for complete time reversal, also the
currents in the source of the applied magnetic field.
14 Incidentally, much of the research done in the Quantop group at NBI uses the
Faraday effect in atomic vapors.
6
Interference of light
\[
E(t) = \begin{cases}
E_0 & \text{for } -\tau/2 \le t \le +\tau/2 \\
0 & \text{else}
\end{cases} \tag{6.1}
\]
The Fourier transform of this field is the sinc function, so the intensity
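I(ω) becomes:
\[
I(\omega) = E_0^2 \tau^2\, \mathrm{sinc}^2\!\left(\frac{\omega\tau}{2}\right). \tag{6.2}
\]
2 The Fourier transform of a function f is given by:
\[
\hat{f}(\omega) = \mathcal{F}(f(t)) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt
\]
\[
f(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{f}(\omega)\, e^{+i\omega t}\, d\omega
\]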
τc ∆ν ∼ 1
Coherence length:
lc = cτc
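To get a feeling for the orders of magnitude, the relation τc ∆ν ∼ 1 can be turned into a small calculator. The bandwidths below are assumed illustration values, not numbers from the text, and the results are order-of-magnitude estimates only.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def coherence_time(bandwidth_hz):
    """Order-of-magnitude coherence time from tau_c * delta_nu ~ 1."""
    return 1.0 / bandwidth_hz


def coherence_length(bandwidth_hz):
    """Coherence length l_c = c * tau_c in metres."""
    return C * coherence_time(bandwidth_hz)


# Assumed bandwidths for illustration.
examples = {
    "source with 1 GHz bandwidth": coherence_length(1e9),
    "laser with 1 MHz bandwidth": coherence_length(1e6),
}
```

A thousand times narrower bandwidth gives a thousand times longer coherence length, here roughly 0.3 m versus 300 m.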
Problem 6.2 In the above estimate for the coherence time of light from
a sodium lamp we have considered only collisions as the mechanism for
random phase changes. We have neglected the finite natural lifetime of
the atomic excited state. This lifetime is about 16 ns for the relevant
energy level in sodium. Is it reasonable to neglect this contribution?
Experiments to demonstrate interference must take source bandwidth
and coherence length into account. Typical optical path length
differences in the experiment must be kept below the coherence length of
the light from the source. At longer path length differences,
interference effects wash out and may not be observed.
Besides the longitudinal coherence length, the transverse coherence
length needs to be considered as well in interference experiments. All
real light sources have a finite size of the emitting surface and typically
the light emitted from different surface patches of the source will be
statistically independent. This means, if we try to superpose the waves
from different surface patches to see interference, we will most likely fail.
In the next chapter, we will see how diffraction limits our ability to know
the exact location of a source – at first sight paradoxically, this limitation
can be used to restore interference with light from real sources and even
put to good use in astronomy to measure the angular distance between
close stars with a Michelson stellar interferometer. We will come back to
this in the chapter on diffraction, but for now we want to look at general
properties of interference.
The intensity 3
becomes (modulo 1/2ε0 c):
δ = (k1 − k2 ) · r + ϕ1 − ϕ2 (6.7)
and
3 The time average is defined as:
$\langle f(t') \rangle_T = \frac{1}{T} \int_t^{t+T} f(t')\, dt'$. In principle,
for non-monochromatic fields we may have to let T → ∞.
Here, we can draw two important conclusions: First, if the two fields
oscillate at the same frequency the interference phase and hence the
interference pattern is stable in time. For fields with different frequencies
a beat note at the frequency difference is observed. Whether or not this
beat note is actually registered with the photodetector depends on how
fast the photodetector reacts to changing intensity. Secondly, fields that
are in orthogonal polarization states will not produce interference by
virtue of the dot product structure of the interference term.
Problem 6.3 Let the sources S1 and S2 in figure 6.2 be randomly po-
larized. Would you be able to see interference?
Assume the two fields to have the same frequency and to be in the
same polarization state. Then we can write the intensity as:
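\[
I = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos(\delta). \tag{6.9}
\]
\begin{align}
I_{\max} &= I_1 + I_2 + 2\sqrt{I_1 I_2}, & \delta &= m \cdot 2\pi, & m &= 0, \pm 1, \pm 2, \ldots \tag{6.10} \\
I_{\min} &= I_1 + I_2 - 2\sqrt{I_1 I_2}, & \delta &= (2m + 1) \cdot \pi, & m &= 0, \pm 1, \pm 2, \ldots \tag{6.11}
\end{align}
\[
I = I_0 + I_0 + 2 I_0 \cos(\delta) = 4 I_0 \cos^2\!\left(\frac{\delta}{2}\right). \tag{6.12}
\]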
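Equations (6.9) to (6.12) are simple to evaluate numerically. The sketch below does so and also computes the fringe visibility (Imax − Imin)/(Imax + Imin), a standard figure of merit that is introduced here as an addition and is not defined in the text.

```python
import math


def two_beam_intensity(i1, i2, delta):
    """Intensity of two superposed beams with interference phase delta, eq. (6.9)."""
    return i1 + i2 + 2 * math.sqrt(i1 * i2) * math.cos(delta)


def visibility(i1, i2):
    """Fringe visibility (I_max - I_min) / (I_max + I_min)."""
    i_max = two_beam_intensity(i1, i2, 0.0)
    i_min = two_beam_intensity(i1, i2, math.pi)
    return (i_max - i_min) / (i_max + i_min)


# Equal beams give full contrast; a weak second beam still gives clear fringes.
equal_beams = visibility(1.0, 1.0)
unequal_beams = visibility(1.0, 0.04)
```

Even when the second beam carries only 4 % of the intensity of the first, the visibility is still close to 0.4, because the interference term scales with the field amplitudes rather than the intensities.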
δ = m · 2π , m = 0, ±1, ±2, . . .
Destructive interference:
Figure 6.3 In 1801 Thomas Young carried out a double-slit experiment
to decide whether light consists of waves or particles. A small hole
pre-selects a spatially coherent part of the light. The spherical waves
hit two holes. The interference pattern is observed on a screen placed
a distance L ≫ d away. The intensity on the screen is I = 4I0 cos²(δ/2).
\[
\delta = (\mathbf{k}_1 - \mathbf{k}_2) \cdot \mathbf{r} + \varphi_1 - \varphi_2
= \frac{2\pi}{\lambda}\, d \sin(\theta). \tag{6.14}
\]
Since we assume the waves from the source to be in phase at the openings
in screen 2, we have ϕ1 = ϕ2 . Increasing now the observation angle θ,
every time source 1 is ahead of source 2 by an additional full wavelength
λ, the phase will increase by 2π. This means, constructive interference
will take place at observation angles:
\[
\sin(\theta) = m \cdot \frac{\lambda}{d}, \qquad m = 0, \pm 1, \pm 2, \ldots \tag{6.15}
\]
\[
I = 4 I_0 \cos^2\!\left(\frac{\delta}{2}\right). \tag{6.16}
\]
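For a concrete picture of the fringe pattern, the following sketch lists the angles of constructive interference from equation (6.15) and evaluates the intensity of equation (6.16). The wavelength and the slit separation are assumed illustration values.

```python
import math


def bright_fringe_angles(wavelength, slit_separation, max_order):
    """Angles (radians) with sin(theta) = m * lambda / d for |m| <= max_order."""
    angles = {}
    for m in range(-max_order, max_order + 1):
        s = m * wavelength / slit_separation
        if abs(s) <= 1.0:  # orders with |sin(theta)| > 1 do not exist
            angles[m] = math.asin(s)
    return angles


def young_intensity(theta, wavelength, slit_separation, i0=1.0):
    """Two-slit intensity 4 * I0 * cos^2(delta / 2) with delta from eq. (6.14)."""
    delta = 2 * math.pi / wavelength * slit_separation * math.sin(theta)
    return 4 * i0 * math.cos(delta / 2) ** 2


# Assumed values: red laser light and a slit separation of 0.1 mm.
WAVELENGTH = 633e-9
SLIT_SEPARATION = 0.1e-3
angles = bright_fringe_angles(WAVELENGTH, SLIT_SEPARATION, 3)
```

With these numbers neighbouring bright fringes are separated by about 6 mrad, so on a screen one metre away the fringe spacing is a few millimetres.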
Remark As the waves from the two openings are assumed spherical, the
intensity from each of the sources 1 and 2 scales as 1/R², where R is
the distance from that source. That means the interference cannot be
complete, except at the center of the observation plane, where the
waves from both sources have traveled the same distance. Everywhere else
in the observation plane the distances to the two sources are different.
In figure 6.4 we show the intensity distribution as a function of sin(θ),
as it could be observed in the laboratory with monochromatic light.
Figure 6.5 Thin film of thickness d. The small electric field arrows
indicate when the field changes phase by π. This happens only at the
first interface as n1 < n2 .
Table 6.1 Power reflected from a thin film with thickness d = 133 nm
at various wavelengths.

    λ0        δ      cos²(δ/2)
    400 nm    3π     0

\[
\delta = \frac{4\pi d}{\lambda_{\mathrm{oil}}} + \pi = \frac{4\pi d\, n_2}{\lambda_0} + \pi
\]
since we travel twice the distance d and have one phase shift of π. Ob-
viously, we cannot have complete destructive interference in this case,
as 3.7 % cannot match the 4 % reflected from the top surface, but the
interference contrast is still very high.
Problem 6.6 Above we only took one reflection at the oil-air interface
into account, which is a very good approximation. Why? For incident
plane waves there will naturally be an infinite number of reflections.
To get a more accurate expression for the reflected power, sum up the
amplitudes for all beam paths and give a general formula for the
reflected power, assuming n1 < n2 .
Example 6.8 Assume a thin film of thickness 133 nm. We shine white
light on the film. What colour will the film appear to have?
As we can see from table 6.1 the intensity reflected at 650 nm is signif-
icant. So the film appears red! At viewing angles different from normal
incidence the film will take other colours since d effectively changes.
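The colour argument can be made quantitative by evaluating the interference factor cos²(δ/2) across the visible spectrum. In the sketch below the film index n2 = 1.5 is an assumption, chosen because it reproduces δ ≈ 3π at 400 nm for d = 133 nm, and the list of wavelengths is arbitrary.

```python
import math


def reflection_factor(wavelength_nm, thickness_nm=133.0, n_film=1.5):
    """Interference factor cos^2(delta/2), delta = 4*pi*d*n2/lambda0 + pi."""
    delta = 4 * math.pi * thickness_nm * n_film / wavelength_nm + math.pi
    return math.cos(delta / 2) ** 2


# Interference factor at normal incidence for a few visible wavelengths.
table = {lam: reflection_factor(lam) for lam in (400, 450, 500, 550, 600, 650, 700)}
```

The factor is close to zero in the violet and rises towards the red end of the spectrum, consistent with the reddish appearance described above.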
One important application of thin films is anti-reflection coatings min-
imizing the amount of reflected light from a refracting surface. Those
reflections create unwanted, shifted and blurry images. In addition, you
lose power in an optical system. In optics experiments reflections pro-
duce ghosting (multiple interference patterns) which ultimately sets a
limit on the performance of your set-up. Here is one way to avoid that.
Consider a film with refractive index n2 deposited on a glass surface, as
shown in figure 6.6.
Let the index of the glass substrate be n3 > n2 > n1 . In that case the
reflection phases from surface 1 and surface 2 are the same and we
find
\[
\delta = \frac{4\pi t\, n_2}{\lambda_0} = \frac{4\pi t}{\lambda_{\mathrm{film}}}.
\]
If we choose the film thickness to be t = λfilm /4, we have δ = π and the
field components reflected from surface 1 and 2 cancel to first order.
Problem 6.10 Show that for n3 > n2 > n1 the best choice of coating,
$n_2 = \sqrt{n_1 n_3}$, results in zero reflection to first order.
• Air → glass π
• Air → mirror π
• Glass → air 0.
If we now assume the reflecting surface of the beam splitter to face
L1, the interference phase in output A becomes:
δA = 2k(L1 − L2 ) + 2π − π = 2k(L1 − L2 ) + π,
as we neglect the thickness of the beam splitter. The 2π phase shift in the
middle expression is associated with the two air-glass interface
reflections for path L1.
The Michelson interferometer is typically used in connection with
In the case of perfect alignment, what happens to the light when there
is destructive interference and no output in the detector arm A? Well, the
beams traveling in the arms are sent back by the beam splitter towards
the source. This can be quite a nuisance, because not all light sources
like it to be confronted with their past output, but fortunately there are
ways to avoid it. Let us make a sanity check of the results by looking
at energy conservation. For the interference phase of light reflected back
into port B we have:
δB = 2k(L1 − L2 ) + 3π − π = 2k(L1 − L2 ) + 2π = δA + π,
Now, energy conservation demands that the total output power equals
the input power:
\[
I_{\mathrm{tot}} = I_A + I_B
= 4\,\frac{I_0}{4}\cos^2(\delta_A/2) + 4\,\frac{I_0}{4}\cos^2(\delta_B/2) = I_0 ,
\]
which works out nicely as expected. Notice, we are only using Maxwell’s
equations and conservation of energy comes out naturally.
The Mach-Zehnder interferometer, shown in Fig. 6.8, is similar to the
Michelson interferometer, but does not share the nasty property of
reflecting one output beam back into the source. This allows both output
ports to be monitored. By taking the sum and difference of the output
signals in ports A and B, we can double the signal and reject the
influence of input intensity fluctuations. For these reasons the
Mach-Zehnder configuration is more popular than the Michelson
interferometer in many applications. Note that in a Mach-Zehnder
interferometer one can also use an unbalanced input beam splitter,
R1 /T1 , together with a 50/50 beam splitter at the output, if one has a
delicate sample in arm L1 which limits the amount of power one can send
through this path. This is often used in spectroscopy experiments, where
the signal-to-noise ratio needs to be optimized by using enough power at
the detectors, while at the same time keeping a delicate sample “alive”.
The Sagnac interferometer differs essentially from the two previous
configurations. Light is injected into the interferometer and split 50/50
to go clockwise and anti-clockwise around the interferometer. Since the
optical path length is, at least at first sight, exactly the same for both
directions by construction, it appears pointless to use it as a measure-
ment instrument. A path length difference can only occur due to effects
which differentiate between the two path orientations. One such effect
is a global rotation of the whole setup. If the whole interferometer is
rotating with angular frequency Ω, there will be a phase difference
between the two directions when they are overlapped on detector D. The
phase can be calculated by considering the round trip times. We assume a
circular path for simplicity, around a circle with circumference 2πR and
area πR². We take into account that the beam splitter moves while light
is traveling around the loop. For the clockwise round trip time we have
\[
t_2 = \frac{2\pi R}{c - \Omega R}
\]
and for the anti-clockwise sense we get
\[
t_1 = \frac{2\pi R}{c + \Omega R}.
\]
Thus there will be an effective length difference traveled between the
two directions:
\[
\Delta L = c\,(t_2 - t_1) \simeq \frac{4 A \Omega}{c}
\]
and a corresponding phase change:
\[
\Delta\phi = \frac{8\pi A \Omega}{c \lambda}.
\]
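The approximation made in the last step can be checked numerically. The sketch below compares c(t2 − t1), computed from the two round trip times, with 4AΩ/c for a circular loop and evaluates the phase formula exactly as written above. The loop radius, rotation rate and wavelength are assumed illustration values.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def round_trip_times(radius, omega):
    """Round trip times (t2, t1) for the co- and counter-rotating beams."""
    t_co = 2 * math.pi * radius / (C - omega * radius)
    t_counter = 2 * math.pi * radius / (C + omega * radius)
    return t_co, t_counter


def sagnac_phase(area, omega, wavelength):
    """Phase change 8*pi*A*Omega / (c*lambda), as given in the text."""
    return 8 * math.pi * area * omega / (C * wavelength)


# Assumed values: loop of 1 m radius rotating at 1 rad/s, red laser light.
R, OMEGA, WAVELENGTH = 1.0, 1.0, 633e-9
t2, t1 = round_trip_times(R, OMEGA)
delta_l_exact = C * (t2 - t1)
delta_l_approx = 4 * math.pi * R**2 * OMEGA / C
phase = sagnac_phase(math.pi * R**2, OMEGA, WAVELENGTH)
```

Because ΩR is tiny compared with c, the two length differences agree to many digits, which justifies dropping the term (ΩR)² in the denominator.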
By measuring this phase change you can determine the rotation speed
of your frame7 ! Michelson and Gale were the first to measure the
absolute rotation speed of the Earth in this way, in 1925, in an
experiment performed on the prairies west of Chicago. They used a
rectangular optical loop 2/5 mile long and 1/5 mile wide, evacuated to
reduce absorption and index variations from the air. Theory predicted a
shift of 236/1000 of a fringe, while they measured 230/1000 of a fringe.
Today, with stable laser systems, we can measure rotation rates below
10⁻⁸ rad/s and readily observe changes in the Earth’s rotation period
of 1-2 ms per 24 h.
7 The derivation above for the phase shift leads to the correct result, but is a little
questionable in the light of special and general relativity. If you want to learn
more, take a look at the articles published on the course homepage.
As you can see, the phase difference is proportional to the area of the
interferometer. With a laser and up to 5000 m of optical fibre curled up
in a ring, we have a powerful tool that can be used as a rotation sensor.
Today this type of Sagnac interferometer is commonly used in
high-performance satellites. The considerably smaller and self-calibrating
ring laser gyroscopes also use the Sagnac effect to measure the rate of
rotation and are frequently used in aviation.
7
Diffraction of light
Unlike particles, waves have the ability to bend around corners. We call
this phenomenon diffraction. So far we have entirely disregarded
this effect; in this respect we have assumed the limit λ → 0. In optics,
diffraction effects become pronounced when the wavelength of the light is
comparable to the characteristic length of an object illuminated by the
light. The sharp edges of any object in the way of a wave always give
rise to diffraction.
As waves bend around obstacles and are redirected, interference will take
place. The distinction between light that is self-interfering and light that
is diffracting is therefore rather blurred. Later in this chapter you will see
that Young’s experiment, which we studied in the last chapter, can be handled
equally well with the theory of diffraction. We will start this chapter by
studying the interference from multiple sources. This not only has important
applications, but it will also lead naturally to the more general problem of
diffraction.
The historical Huygens-Fresnel principle states that the propagation
of a wavefront in space can be seen as the result of interference of light
emitted from closely spaced virtual point sources sitting on the wavefront
in all places where the wavefront is not obstructed by opaque obstacles.
This principle was later put on a more solid theoretical foundation by
Kirchhoff and Sommerfeld with their scalar diffraction theory¹. We will not
study this theory in any detail, but rather focus on two limiting cases,
Fresnel and Fraunhofer diffraction, where the latter has important
implications for the resolving power of imaging instruments.
It should be mentioned that with the computing power we have
at our hands today it is possible to solve electromagnetic wave propa-
¹ If you want to learn more, chapters 10-12 and Appendix 2 in Hecht “Optics”
contain an accessible introduction to scalar diffraction theory and many
application examples.
7.1 Interference of N sources - the grating
illuminating, and hence exciting the array of dipoles with a plane wave
from the left. The point P of observation² is assumed to be far away
compared to d, the distance between the sources. From our discussion of
Young’s experiment in the last chapter we know that the phase difference
of two waves arriving at the observation point from two adjacent sources
is given by the expression:
\[ \delta = \frac{2\pi}{\lambda} d \sin(\theta). \]
With this in mind we can write down the total electric field at the point of
observation P:
\[
\begin{aligned}
E_P &= E_0(r) e^{-i\omega t} e^{ikr_1} \frac{e^{i\delta N} - 1}{e^{i\delta} - 1} \\
    &= E_0(r) e^{-i\omega t} e^{ikr_1} \exp\left( \frac{i(N-1)\delta}{2} \right) \frac{\sin(N\delta/2)}{\sin(\delta/2)}.
\end{aligned}
\tag{7.1}
\]
The observed intensity distribution becomes:
\[ I(\delta) = I_0 \left( \frac{\sin(N\delta/2)}{\sin(\delta/2)} \right)^2 \]
with
\[ \delta = \frac{2\pi}{\lambda} d \sin(\theta), \qquad I(\delta = 0) = I_0 N^2. \]
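The closed form for the intensity can be checked against a direct sum of the N phasors. The sketch below is only an illustration: the function names are hypothetical, the common prefactor E_0(r) e^{-iωt} e^{ikr_1} is set to one, and the number of sources and the phase step are arbitrary.

```python
import cmath
import math


def field_sum(n_sources, delta):
    """Direct phasor sum  sum_{j=0}^{N-1} exp(i*j*delta)  (common prefactor set to 1)."""
    return sum(cmath.exp(1j * j * delta) for j in range(n_sources))


def grating_intensity(n_sources, delta, i0=1.0):
    """Closed form I(delta) = I0 * (sin(N*delta/2) / sin(delta/2))**2."""
    s = math.sin(delta / 2)
    if abs(s) < 1e-12:
        # Principal maximum: the ratio of sines tends to +/-N, so I = I0 * N**2
        return i0 * n_sources**2
    return i0 * (math.sin(n_sources * delta / 2) / s) ** 2


n = 8
delta = 0.7
print(abs(field_sum(n, delta)) ** 2, grating_intensity(n, delta))
print(grating_intensity(n, 0.0))  # equals N**2 for I0 = 1
```

The explicit branch for sin(δ/2) ≈ 0 is a design choice to avoid a division by zero at the principal maxima.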
We see that for large values of N the intensity in directions for which
δ = 0 (or, more generally, δ is an integer multiple of 2π) becomes
enormously higher, because there we have constructive
interference of all N sources. This happens in all directions satisfying
\[ m = \frac{d}{\lambda} \sin(\theta_m), \]
where m is a whole number specifying the diffraction order of the grating.
Inspecting this expression, we observe that for a spacing d > λ we
get solutions other than the trivial one at θ_0 = 0 and that the corresponding
angles depend on the wavelength. This immediately suggests
an application: if we drive the emitters in phase and measure those
principal diffraction angles, we can determine the wavelength of the light,
provided we know the spacing between the scatterers. Looking at the expression
in equation (7.2) we can also see that in the vicinity of the diffraction
orders, we can approximate the sine function in the denominator
by its argument (Taylor expansion). This means that around the diffraction
orders the intensity distribution follows an oscillating sinc²(Nδ/2) function;
the first side maximum is less than 5% of the principal maximum
in height, so the diffraction orders really stick out. As we increase the
number of emitters the width of the main peak decreases in inverse
proportion to N. Summarizing, we have:
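\[ I(0) = I_0 N^2 \]
\[ \text{Peak FWHM:} \quad \Delta\theta \simeq \frac{\lambda}{Nd} \]
Starting from the grating condition
\[ m 2\pi = \frac{2\pi}{\lambda} \sin(\theta) d \]
we interpret λ as a function of θ and take the derivative:
\[ m \Delta\lambda = d \cos(\theta) \Delta\theta. \]
Inserting the peak width Δθ ≃ λ/(Nd) gives
\[ \frac{\Delta\lambda}{\lambda} = \frac{\cos(\theta)}{m N}. \]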
This states that the higher the diffraction order and the larger the number of
sources, the better the resolution³.
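The grating condition and the resolution estimate above can be evaluated with a few lines of code. This is a minimal sketch: the function names and the numbers for the spacing, wavelength and number of sources are assumptions made for illustration.

```python
import math


def diffraction_angles(spacing, wavelength):
    """Return {m: theta_m} for all orders m with |m * lambda / d| <= 1."""
    m_max = int(math.floor(spacing / wavelength))
    angles = {}
    for m in range(-m_max, m_max + 1):
        # Clamp to [-1, 1] to guard against rounding when m*lambda/d is 1
        s = max(-1.0, min(1.0, m * wavelength / spacing))
        angles[m] = math.asin(s)
    return angles


def resolvable_wavelength_step(wavelength, order, n_sources, theta):
    """Delta lambda = lambda * cos(theta) / (m * N), as derived above."""
    return wavelength * math.cos(theta) / (order * n_sources)


# Assumed example values: d = 1.6 um, lambda = 600 nm, N = 1000 sources
d, lam, n = 1.6e-6, 600e-9, 1000
angles = diffraction_angles(d, lam)
for m in sorted(angles):
    print(m, math.degrees(angles[m]))
print(resolvable_wavelength_step(lam, 1, n, angles[1]))
```

Only orders with |m| ≤ d/λ exist, which is why a spacing d > λ is needed to see anything beyond the zeroth order.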
7.2 The modified Huygens-Fresnel construction
was the case, then the secondary waves from all points on the sphere
would interfere constructively in the backward direction towards the
source point. But this is not what happens in practice, so we need to
modify the angular dependence of the secondary sources such that the
backward propagating wave is avoided. This is achieved by multiplying the
spherical waves with an inclination factor K(θ_n) = (1 + cos(θ_n))/2,
which describes the emission pattern of the virtual sources. We can now
write the resulting field dE_P at the observation point as:
\[ dE_P = \frac{K(\theta_n) E_0}{\rho r \lambda} \exp\left( ik(\rho + r) - i\pi/2 \right) dS. \]
Here, E_0 is the strength of the primary source and we have omitted
the time dependence, since we deal with a monochromatic source. The
peculiar extra phase shift of π/2 has to be added to keep the end result,
after integrating over the spherical surface, the same as the expression
one gets from just writing the primary spherical wave amplitude at the
observation point. We now want to interpret the expression. We see
that as we scan the sphere starting from the north pole (ϕ = 0), the
distance r to the observation point grows and the inclination angle θ
slowly increases. Since r enters the optical path length in the complex
exponential function, the phase angle varies as we move away from the
north pole. It makes sense to divide the surface of the sphere into zones
within which the distance to the observation point varies by at most half a
wavelength. All surface elements within a zone will contribute with the same
sign to the resultant field at P, while contributions of adjacent zones tend to
cancel each other. We call the zones Fresnel zones. A more in-depth analysis
shows⁴ that when we integrate over the whole sphere, the resulting field
is to a very good approximation half as big as the contribution from the
first polar zone.

⁴ See, e.g., chapter 10.3 in Hecht.

So far, this has just been a complicated way to describe the propagation
of a spherical wave between two points, but can we make any
practical use of the zone construction? A first surprising result is that
we can quadruple the intensity at the observation point by blocking all
the light apart from the first polar Fresnel zone with an opaque screen.
But we can achieve even more: if we number the zones starting from the
polar region and block all even numbered zones, we observe that the on-axis
intensity grows quadratically with the number of zones. Optical elements
based on this idea are called zone plates and are used as focusing elements
in wavelength regions where transparent materials are not readily
available. Instead of blocking the light from every other zone, we can
also phase delay the light from every other zone by half a wavelength
by using transparent layers of suitable thickness. This is the basic idea
behind Fresnel lenses, which have been used in lighthouses since the 1820s
and nowadays can be found in more mundane applications as magnifiers
on mobile phone display screens. Diffractive optical elements can
save a lot of space and material and play an important role in modern
integrated optics systems.
In general, we can now judge at least qualitatively what happens to
the intensity at the observation point if we place a screen with a centered
hole of variable diameter between the primary source and the observation
point. As we gradually decrease the diameter of the opening or,
equivalently, move the observation point further away from the screen,
the intensity at the observation point will oscillate. The intensity varies
between almost zero, when an even number of Fresnel zones is transmitted,
and some maximal value, whenever an odd number of Fresnel zones
covers the opening. When the opening becomes so small that only the
first Fresnel zone is transmitted, the intensity will vary smoothly from
four times the unobstructed intensity to zero. This behavior is shown in
Fig. 7.3.
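The zone picture can be mimicked with a toy model in which zone n contributes an amplitude of alternating sign and slowly decreasing magnitude. The geometric decay used below is purely an assumption made for illustration (in the real construction the slow decrease comes from the inclination factor and the growing distance); the function names are hypothetical as well.

```python
def zone_amplitudes(n_zones, a1=1.0, decay=0.999):
    """Signed on-axis contributions of Fresnel zones 1..n_zones.

    Assumption: magnitudes fall off geometrically, a_n = a1 * decay**(n - 1),
    and adjacent zones contribute with opposite sign.
    """
    return [(-1) ** (n - 1) * a1 * decay ** (n - 1) for n in range(1, n_zones + 1)]


def on_axis_field(amplitudes, blocked=()):
    """Sum the zone contributions, skipping the 1-based zone numbers in `blocked`."""
    return sum(a for n, a in enumerate(amplitudes, start=1) if n not in blocked)


zones = zone_amplitudes(20000)

unobstructed = on_axis_field(zones)          # close to a1 / 2
first_zone_only = on_axis_field(zones[:1])   # a1
two_zones = on_axis_field(zones[:2])         # almost zero
zone_plate = on_axis_field(zones[:10], blocked={2, 4, 6, 8, 10})

print(unobstructed, first_zone_only, two_zones, zone_plate)
print((first_zone_only / unobstructed) ** 2)  # intensity ratio close to 4
```

In this model the unobstructed field comes out close to half the first-zone contribution, an even number of open zones nearly cancels, and a zone plate with five open odd zones gives a field roughly five times that of the first zone alone.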
To set up a proper definition for the Fresnel number and to distin-
guish the different diffraction regimes for the case of a plane screen, we
consider an obstacle or small hole of diameter 2a, as shown in Fig.7.4,
between source and observation point. We ask to what extent the spher-
ical wavefronts from a given source or to a given observation point can
be approximated as planar wavefronts. When wavefronts may be con-
sidered as plane we speak of Fraunhofer diffraction and if they are
curved we speak of Fresnel diffraction.
Let us estimate the distances when this happens. There is a distance
R to the opening, so ∆, the deviation of the plane wavefront from the
Fraunhofer regime:
\[ R \gg \frac{a^2}{\lambda} \]
Fresnel regime:
\[ R \sim \frac{a^2}{\lambda} \]
Fresnel number:
\[ F = \frac{a^2}{R\lambda} \]
The Fresnel number essentially counts how many Fresnel zones are
“visible” in the opening. We see that the Fraunhofer regime is charac-
terized by a Fresnel number much smaller than one, corresponding to
the case where a fraction of the first Fresnel zone already completely
covers the hole. This is the most common diffraction regime and we will
focus on the Fraunhofer regime in the following.
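A short helper makes the regime criterion concrete. This is a sketch only: the threshold that stands in for "much smaller than one" is an arbitrary assumption, and the example dimensions are made up.

```python
def fresnel_number(a, distance, wavelength):
    """F = a**2 / (R * lambda) for an opening of radius a at distance R."""
    return a**2 / (distance * wavelength)


def diffraction_regime(a, distance, wavelength, far_field_limit=0.1):
    """Classify the regime from the Fresnel number.

    `far_field_limit` is an assumed cut-off standing in for F << 1.
    """
    f = fresnel_number(a, distance, wavelength)
    return "Fraunhofer" if f < far_field_limit else "Fresnel"


# Assumed example: lambda = 500 nm, observation distance R = 1 m
print(fresnel_number(0.5e-3, 1.0, 500e-9), diffraction_regime(0.5e-3, 1.0, 500e-9))
print(fresnel_number(0.1e-3, 1.0, 500e-9), diffraction_regime(0.1e-3, 1.0, 500e-9))
```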
7.3 Fraunhofer diffraction
emerging from the aperture⁵. Now it is our job to sum those spherical
wavelets up at some point of observation. The phase and field intensity
are constant across the aperture. The field contribution from the
differential surface element dS becomes:
\[ dE = \frac{E_A}{r} e^{i(kr - \omega t)} dS, \]
where E_A is a source strength with units of electric field per unit length.
Coordinates (y, z) denoted with small letters belong to the aperture screen,
while capital coordinates (X, Y, Z) are observation coordinates. Let us
expand the distance r from the surface element to the observation point,
writing R = (X² + Y² + Z²)^{1/2}:
\[
\begin{aligned}
r &= \left[ X^2 + (Y - y)^2 + (Z - z)^2 \right]^{1/2} \\
  &= R \left[ \frac{X^2 + Y^2 + Z^2}{R^2} + \frac{y^2 + z^2}{R^2} - 2\frac{yY + zZ}{R^2} \right]^{1/2} \\
  &\simeq R - \frac{yY + zZ}{R}
\end{aligned}
\tag{7.6}
\]

⁵ Note that we neglect the inclination factor, since all angles in the problem can be
considered small.
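The quality of the linear approximation r ≃ R − (yY + zZ)/R can be checked numerically. In the sketch below all coordinates are assumed example values; the leading neglected term is (y² + z²)/(2R), which must stay small compared to the wavelength for the Fraunhofer approximation to hold.

```python
import math


def exact_distance(X, Y, Z, y, z):
    """Distance from the aperture point (0, y, z) to the observation point (X, Y, Z)."""
    return math.sqrt(X**2 + (Y - y) ** 2 + (Z - z) ** 2)


def far_field_distance(X, Y, Z, y, z):
    """Linearised distance R - (y*Y + z*Z)/R with R = sqrt(X^2 + Y^2 + Z^2)."""
    R = math.sqrt(X**2 + Y**2 + Z**2)
    return R - (y * Y + z * Z) / R


# Assumed geometry: screen 1 m away, aperture point 0.1-0.2 mm off axis
X, Y, Z = 1.0, 0.01, 0.02
y, z = 1e-4, -2e-4

error = exact_distance(X, Y, Z, y, z) - far_field_distance(X, Y, Z, y, z)
print(error, (y**2 + z**2) / (2 * X))
```

For these numbers the error is a few times 10⁻⁸ m, close to the estimate (y² + z²)/(2R).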
Notice that the phase is linear in the screen coordinates (y, z) in this
approximation. This is a characteristic of Fraunhofer diffraction. We can now
sum up our field contributions over S:
\[
\begin{aligned}
E &= \frac{E_A e^{i(kR - \omega t)}}{R} \int_S e^{-ik(yY + zZ)/R} dS \\
  &= \frac{E_A e^{i(kR - \omega t)}}{R} \int A(y, z) e^{-ik(yY + zZ)/R} dS,
\end{aligned}
\tag{7.8}
\]
where
\[
A(y, z) =
\begin{cases}
1 & \text{for } (y, z) \in S \\
0 & \text{else}
\end{cases}
\tag{7.9}
\]
is the aperture function describing the transmissivity of the aperture
screen. We see that we can also handle a semi-transparent aperture,
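\[ E = \psi \int_{-a/2}^{a/2} e^{ikzZ/R} dz = \psi \cdot a \cdot \frac{\sin\left( \frac{kZa}{2R} \right)}{\frac{kZa}{2R}} \tag{7.11} \]
where we named
\[ \psi = \frac{E_L e^{i(kR - \omega t)}}{R}. \]
Writing sin(θ) = Z/R, the intensity is proportional to (sin(β)/β)²
with
\[ \beta = ka \sin(\theta)/2. \]
In figure 7.8 we plot the sinc expression as a function of sin(θ) for the
single slit.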
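The single slit pattern can be sketched as follows. The normalisation to one on axis is a choice made here, and the slit width and wavelength are assumed example values.

```python
import math


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def single_slit_intensity(sin_theta, slit_width, wavelength, i0=1.0):
    """I = I0 * (sin(beta)/beta)**2 with beta = k*a*sin(theta)/2 = pi*a*sin(theta)/lambda."""
    beta = math.pi * slit_width * sin_theta / wavelength
    return i0 * sinc(beta) ** 2


# Assumed example: a = 10 um, lambda = 500 nm
a, lam = 10e-6, 500e-9
print(single_slit_intensity(0.0, a, lam))               # on-axis value
print(single_slit_intensity(lam / a, a, lam))           # first zero at sin(theta) = lambda/a
print(single_slit_intensity(1.4303 * lam / a, a, lam))  # near the first side maximum
```

The zeros lie at sin(θ) = mλ/a for non-zero integer m, and the first side maximum reaches a little under 5% of the central peak.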
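With ξ = kZ/R we find
\begin{align}
E &= \psi \left( \int_{-d-a}^{-d+a} e^{-ikzZ/R} dz + \int_{d-a}^{d+a} e^{-ikzZ/R} dz \right) \tag{7.13} \\
  &= -\frac{2\psi}{2i\xi} \left[ e^{-i\xi(a-d)} - e^{i\xi(a+d)} + e^{-i\xi(a+d)} - e^{i\xi(a-d)} \right] \tag{7.14} \\
  &= \frac{2\psi}{\xi} \left[ \sin(\xi(a-d)) + \sin(\xi(a+d)) \right] \tag{7.15} \\
  &= \psi \cdot 4a \cdot \frac{\sin(\xi a)}{\xi a} \cos(\xi d) \tag{7.16}
\end{align}
where we used
\[ \sin(A) + \sin(B) = 2 \sin\left( \frac{A+B}{2} \right) \cos\left( \frac{A-B}{2} \right). \]
The intensity is therefore proportional to (sin(α)/α)² cos²(β)
with
\[ \alpha = ka \sin(\theta) \]
and
\[ \beta = kd \sin(\theta) \]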
function describing diffraction from a single slit. So, the "real" double
slit interference pattern is the product of the single slit
diffraction pattern with the double point source interference
term (Young’s pattern).
In figure 7.9 we plot the intensity distribution as a function of sin(θ)
for the double slit.
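The product structure of the double slit pattern can be verified by integrating the aperture integral numerically. The sketch below assumes two slits of width 2a centred at ±d, as in the integration limits above, uses a simple midpoint rule, and takes made-up dimensions; the function names are hypothetical.

```python
import cmath
import math


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def double_slit_intensity(sin_theta, a, d, wavelength):
    """Normalised pattern (sin(alpha)/alpha)**2 * cos(beta)**2."""
    k = 2 * math.pi / wavelength
    alpha = k * a * sin_theta
    beta = k * d * sin_theta
    return sinc(alpha) ** 2 * math.cos(beta) ** 2


def double_slit_numeric(sin_theta, a, d, wavelength, n=2000):
    """Midpoint-rule evaluation of the aperture integral, normalised to 1 on axis."""
    xi = 2 * math.pi / wavelength * sin_theta
    h = 2 * a / n
    total = 0j
    for centre in (-d, d):
        for j in range(n):
            z = centre - a + (j + 0.5) * h
            total += cmath.exp(-1j * xi * z) * h
    return abs(total) ** 2 / (4 * a) ** 2


# Assumed example: half-width a = 10 um, half-separation d = 50 um, lambda = 500 nm
a, d, lam = 10e-6, 50e-6, 500e-9
print(double_slit_intensity(0.013, a, d, lam), double_slit_numeric(0.013, a, d, lam))
```

Letting a shrink at fixed d makes the sinc factor flat and leaves the pure cos² pattern of two point sources.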
Example 7.4 Delta function slit
Let us analyze what happens if you diffract light from a slit whose
width a → 0. Our aperture function is thus a delta function⁹ δ(z).
Our diffraction integral becomes
\[ E = \psi \int_{-\infty}^{+\infty} \delta(z) e^{ikzZ/R} dz = \text{constant}, \tag{7.19} \]
in agreement with the single slit expression in the limit a → 0. See figure 7.8:
when a → 0 the first zero moves towards infinity and we get a constant
diffraction pattern. Obviously, the light intensity will be distributed over
the entire z-axis, so you may argue that the intensity also goes to zero.
\begin{align}
E &= \psi \int_{-\infty}^{+\infty} \left( \delta(y + y_0) + \delta(y - y_0) \right) e^{ikyY/R} dy \tag{7.21} \\
  &= 2\psi \frac{e^{+i\xi y_0} + e^{-i\xi y_0}}{2} = 2\psi \cos(\xi y_0) \tag{7.22}
\end{align}
With our definition of d = 2y_0 the intensity becomes:
\[ I = 4 I_0 \cos^2\left( \frac{kd \sin(\theta)}{2} \right) = 4 I_0 \cos^2\left( \frac{\delta}{2} \right) \]
This is Young’s result! In chapter 6 we did assume our source to be
pointlike. Here we can learn: Interference is naturally taken into
account in our diffraction formalism.
Problem 7.6 Imagine a Young’s type of experiment using not two but
four point sources, two placed on the y-axis and two placed on the z-axis.
The aperture function is given by:
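\[
\begin{aligned}
E &= \psi \left( \int_{-b/2}^{+b/2} e^{i\xi_y y} dy \int_{-a/2}^{+a/2} e^{i\xi_z z} dz \right) \\
  &= a \cdot b \cdot \psi \cdot \frac{\sin(\alpha)}{\alpha} \frac{\sin(\beta)}{\beta}
\end{aligned}
\tag{7.23}
\]
with
\[ \alpha = \frac{kaZ}{2R} \]
and
\[ \beta = \frac{kbY}{2R} \]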
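\[ \begin{pmatrix} y \\ z \end{pmatrix} = \rho \begin{pmatrix} \sin(\varphi) \\ \cos(\varphi) \end{pmatrix} \]
and
\[ \begin{pmatrix} Y \\ Z \end{pmatrix} = q \begin{pmatrix} \sin(\Phi) \\ \cos(\Phi) \end{pmatrix} \]
\[
\begin{aligned}
E &= \psi \left( \int_0^a \int_0^{2\pi} e^{i\xi\rho \cos(\varphi - \Phi)} \rho \, d\rho \, d\varphi \right) \\
  &= 2\pi\psi \cdot \int_0^a J_0(\xi\rho) \rho \, d\rho \\
  &= 2\pi a^2 \psi \cdot \frac{J_1(\xi a)}{\xi a}.
\end{aligned}
\tag{7.25}
\]
Here, J_n is the n’th order Bessel function of the first kind. Using the
usual expression sin(θ) = q/R we find that the intensity is proportional to
(J_1(α)/α)²
with
\[ \alpha = ka \sin(\theta) \]
It is important to notice that the zeros are controlled by the Bessel function.
This means that the first zero, when plotted against sin(θ), is located at
1.22λ/D, the next at 2.23λ/D and so forth, where D = 2a is the diameter of
the aperture; see figure 7.13.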
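The positions of the zeros can be reproduced without any special-function library. The sketch below evaluates J_1 from its power series and brackets the first two zeros by bisection; the bracketing intervals, the normalisation of the pattern to one on axis and the function names are choices made for this illustration.

```python
import math


def bessel_j1(x, terms=40):
    """J_1(x) from the power series sum_m (-1)^m (x/2)^(2m+1) / (m! (m+1)!).

    Adequate for the moderate arguments (x < 10) used here.
    """
    total = 0.0
    term = x / 2  # m = 0 term
    for m in range(terms):
        total += term
        term *= -((x / 2) ** 2) / ((m + 1) * (m + 2))
    return total


def find_zero(f, lo, hi, iterations=60):
    """Bisection for a sign change of f between lo and hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def airy_intensity(sin_theta, diameter, wavelength):
    """(2*J1(alpha)/alpha)**2 with alpha = pi*D*sin(theta)/lambda, equal to 1 on axis."""
    alpha = math.pi * diameter * sin_theta / wavelength
    if alpha == 0:
        return 1.0
    return (2 * bessel_j1(alpha) / alpha) ** 2


first_zero = find_zero(bessel_j1, 3.0, 5.0)
second_zero = find_zero(bessel_j1, 6.0, 8.0)
# alpha = pi * D * sin(theta) / lambda, so sin(theta) = (alpha / pi) * lambda / D
print(first_zero / math.pi, second_zero / math.pi)
```

Dividing the zeros of J_1 by π gives the coefficients of λ/D quoted in the text.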
7.4 Babinet’s principle
\[
A(z) =
\begin{cases}
1 & \text{for } z \in S \\
0 & \text{else}
\end{cases}
\]
\begin{align}
E &= \frac{E_A e^{i(-\omega t + kR)}}{R} \iint A(y, z) e^{ik(yY + zZ)/R} dS \tag{7.27} \\
  &= \Psi \int A(z) e^{i\alpha z} dz \tag{7.28} \\
  &= \Psi \mathcal{F}(A) \tag{7.29}
\end{align}
where α = kZ/R. Here the initial integral runs over all of two dimensional
space R × R; however, we reduced A(y, z) to A(z) since we are only
considering a slit in one dimension, i.e., the final integral only runs over
R, meaning from −∞ to +∞. With F we designate the Fourier transform,
and Ψ is just a complex constant scaling the diffraction pattern
intensity, but not influencing the shape of the diffraction pattern. Now
the aperture function of an obstacle rather than a slit, corresponding to
the "negative" of case (A), is given by:
\[
\tilde{A}(z) =
\begin{cases}
0 & \text{for } z \in S \\
1 & \text{else}
\end{cases}
\]
Since Ã(z) = 1 − A(z) and the Fourier transform is linear, the field behind
the obstacle is
\[ \tilde{E} = \Psi' \mathcal{F}(\tilde{A}) = \Psi' \left[ \mathcal{F}(1) - \mathcal{F}(A) \right] = \Psi' \left[ \delta(z) - \mathcal{F}(A) \right], \]
where δ(z) = F(1) is the Dirac delta function. For values z ≠ 0 we have
a connection between the two diffraction patterns:
\[ \tilde{E} = -\frac{\Psi'}{\Psi} E \]
and thus we conclude that the shapes of the two diffraction patterns are the
same! The intensity at various points may be different as the constants Ψ
and Ψ′ are different, but the shape is the same. Any physical conclusion
one draws from the shape, such as the location of maxima and minima for
example, will be the same for case (A) and (B). This is a really powerful
and beautiful principle indeed.
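A discrete analogue of this statement is easy to test: for a sampled aperture function A and its complement 1 − A, the discrete Fourier transforms differ only in the zero-frequency component and in sign. The sketch below uses a plain DFT on an assumed 64-point grid with an arbitrarily placed slit; it illustrates the principle and is not a simulation of a real screen.

```python
import cmath


def dft(samples):
    """Plain discrete Fourier transform, O(n^2), fine for small n."""
    n = len(samples)
    return [
        sum(s * cmath.exp(2j * cmath.pi * j * m / n) for m, s in enumerate(samples))
        for j in range(n)
    ]


n = 64
slit = [1.0 if 28 <= m < 36 else 0.0 for m in range(n)]  # case (A): aperture function
obstacle = [1.0 - s for s in slit]                        # case (B): complementary screen

pattern_slit = [abs(e) ** 2 for e in dft(slit)]
pattern_obstacle = [abs(e) ** 2 for e in dft(obstacle)]

# Away from zero frequency the two patterns coincide
largest_difference = max(abs(p - q) for p, q in zip(pattern_slit[1:], pattern_obstacle[1:]))
print(largest_difference)
print(pattern_slit[0], pattern_obstacle[0])  # the forward direction differs
```

Only the j = 0 component, the analogue of the delta function term, distinguishes the two patterns.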
[Figure: aperture functions (values 0 and 1) across the screen for cases (A) and (B).]