Script Swiss Physics Olympiad 4 Edition

This document is a script for the Physics Olympiad that provides strategies for solving physics problems and reviews key concepts in physics. It was created by volunteers and covers topics like vector algebra, calculus, mechanics, and gravity. The script is intended to help students prepare for the Physics Olympiad competition.

Physics Olympiad

Script

Fourth Edition

Autumn 2020
PHYSICS.
OLYMPIAD.CH
PHYSIK-OLYMPIADE
OLYMPIADES DE PHYSIQUE
OLIMPIADI DELLA FISICA
Physics Olympiad
8000 Zürich
physics.olympiad.ch
[email protected]

This script exists only thanks to the many volunteers who helped with writing, typesetting and correcting,
namely:
Editor: Rafael Winkler
Chapters from: Cyrill, Levy, David, Quentin, Lionel, Sven, Sebastian and Rafael
Proofreading: Markus, Claudio, Stephen, Sebastian, Viviane
Graphics: Sebastian, Oscar, Rafael

Contents

1 Solving Strategies 1
1.1 Get an Overview and Elaborate a Strategy . . . . . . . . . . . . . 2
1.2 Get the Key Aspects of the Problem . . . . . . . . . . . . . . . . . 2
1.3 Write Clearly and Keep the Overview . . . . . . . . . . . . . . . . 3
1.4 Drawings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Symmetries and Order of Magnitude . . . . . . . . . . . . . . . . . 6
1.6 Introduce new variables . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7 Check Your Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.8 Calculator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.9 Other Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.10 Calculated example . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Mathematics 19
2.1 Vector algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.1 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.2 Vector product . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Differential calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.1 Derivative of a function . . . . . . . . . . . . . . . . . . . . 26
2.2.2 Differentiation rules . . . . . . . . . . . . . . . . . . . . . . 29
2.2.3 More derivatives . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.4 Overview about derivatives . . . . . . . . . . . . . . . . . . 39
2.2.5 Higher derivatives . . . . . . . . . . . . . . . . . . . . . . . 40
2.2.6 Taylor approximation . . . . . . . . . . . . . . . . . . . . . . 40
2.2.7 The idea of differential equations . . . . . . . . . . . . . . 44
2.3 Integral calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.3.1 Antiderivatives . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.3.2 Integral as an area . . . . . . . . . . . . . . . . . . . . . . . 47
2.3.3 Fundamental theorem of calculus . . . . . . . . . . . . . . 49

2.3.4 More integration rules . . . . . . . . . . . . . . . . . . . . . 54
2.3.5 The idea of multidimensional integrals . . . . . . . . . . . 62
2.4 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.4.2 Representation of a complex number, Euler formula . . 64
2.4.3 A first simple application . . . . . . . . . . . . . . . . . . . 66
2.4.4 Physical examples . . . . . . . . . . . . . . . . . . . . . . . . 67

3 Mechanics 1 69
3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.2 Kinematics of Point-like Particles . . . . . . . . . . . . . . . . . . . 71
3.2.1 General Description . . . . . . . . . . . . . . . . . . . . . . 71
3.2.2 Linear Uniform Acceleration . . . . . . . . . . . . . . . . . 73
3.2.3 Circular Motion . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.2.4 General 2 Dimensional Motion . . . . . . . . . . . . . . . . 78
3.3 Dynamics of Point-like Particles . . . . . . . . . . . . . . . . . . . . 79
3.3.1 Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.3.2 Choice of Frame of Reference . . . . . . . . . . . . . . . . 80
3.3.3 Newton’s Laws . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.3.4 Center of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.3.5 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.3.6 Gravitational Force . . . . . . . . . . . . . . . . . . . . . . . 86
3.3.7 Independence of Motion . . . . . . . . . . . . . . . . . . . . 87
3.3.8 Volume and Surface Forces . . . . . . . . . . . . . . . . . . 89
3.3.9 Friction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.4 Momentum, Work, Energy and Power . . . . . . . . . . . . . . . . 91
3.4.1 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.4.2 Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.4.3 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.4.4 Potential Energy . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.4.5 Kinetic Energy . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.4.6 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.4.7 Rotation Energy . . . . . . . . . . . . . . . . . . . . . . . . . 97
3.4.8 Angular Momentum . . . . . . . . . . . . . . . . . . . . . . . 98

4 Mechanics 2 101
4.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.1.1 Angle and Angular Velocity as Vectors . . . . . . . . . . . 102
4.1.2 Angular Acceleration and General Motion . . . . . . . . . 105
4.1.3 Accelerated Frames and Fictitious Forces . . . . . . . . . 105
4.1.4 Centrifugal force . . . . . . . . . . . . . . . . . . . . . . . . 106
4.2 Rigid Bodies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.2.1 Definition and Basic Properties . . . . . . . . . . . . . . . . 111
4.2.2 Center of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.2.3 Moment of Inertia . . . . . . . . . . . . . . . . . . . . . . . 115
4.2.4 Parallel Axis Theorem (Steiner’s Theorem) . . . . . . . . . 118
4.3 Dynamics of Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.3.1 Torque . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.3.2 Angular Momentum . . . . . . . . . . . . . . . . . . . . . . . 122
4.3.3 Rotational Energy . . . . . . . . . . . . . . . . . . . . . . . . 124
4.3.4 General Motion of a Rigid Body . . . . . . . . . . . . . . . . 126
4.3.5 Analogy Translation and Rotation . . . . . . . . . . . . . . 127
4.4 Gravity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.4.1 NEWTON’s Law of Gravity . . . . . . . . . . . . . . . . . . . 128
4.4.2 Gravitational Fields . . . . . . . . . . . . . . . . . . . . . . . 129
4.4.3 Energy and Angular Momentum in Gravitational Fields . 133
4.4.4 Two Objects Subject to Mutual Attraction . . . . . . . . . 134
4.4.5 KEPLER’s Laws of Planetary Motion* . . . . . . . . . . . . . 136

5 Thermodynamics 141
5.1 Important definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.2 The temperature scale . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.3 Zeroth law of thermodynamics . . . . . . . . . . . . . . . . . . . . 145
5.4 Thermal energy and heat capacity . . . . . . . . . . . . . . . . . . 145
5.4.1 Molar heat capacities of ideal gases . . . . . . . . . . . . . 146
5.5 Ideal gas law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6 First law of thermodynamics . . . . . . . . . . . . . . . . . . . . . 147
5.7 Thermodynamic systems . . . . . . . . . . . . . . . . . . . . . . . . 147
5.8 Equipartition theorem . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.9 Thermodynamic processes . . . . . . . . . . . . . . . . . . . . . . . 149
5.10 Second law of thermodynamics . . . . . . . . . . . . . . . . . . . . 152
5.11 Heat engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

5.12 Kinetic gas theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.13 Phase transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.14 Real gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.15 Stefan-Boltzmann law . . . . . . . . . . . . . . . . . . . . . . . . . . 158

6 Oscillations 161
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.2 Harmonic Oscillations . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.2.1 Harmonic Oscillations, Spring/Mass Systems and Differential Equations . . . . . . . 164
6.2.2 Further Examples . . . . . . . . . . . . . . . . . . . . . . . . 167
6.2.3 Importance of Harmonic Oscillations . . . . . . . . . . . . 169
6.3 Beyond Harmonic Oscillations . . . . . . . . . . . . . . . . . . . . . 170
6.3.1 Example of Forced Oscillating Systems . . . . . . . . . . . 174

7 Waves 175
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.2 Harmonic Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.3 Waves in 3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.3.1 Waves as Functions of 3D Spatial Coordinates . . . . . . . 179
7.3.2 Waves as 3D Functions, Transversal and Longitudinal Waves, Polarization . . . . . . 179
7.4 Waves Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.4.1 Sound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.4.2 Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.4.3 Seismic Waves . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.4.4 Transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.4.5 Doppler Effect . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.5 Waves Propagation at Interfaces . . . . . . . . . . . . . . . . . . . 184
7.5.1 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.5.2 Refraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.5.3 Diffraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.6 Multi-Waves Phenomena . . . . . . . . . . . . . . . . . . . . . . . . 188
7.6.1 Same amplitude, frequency and wavenumber . . . . . . 189
7.6.2 Same amplitude and frequency, opposite wavenumber . 190
7.6.3 Slightly different frequencies . . . . . . . . . . . . . . . . 191
7.6.4 Fourier Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 192

8 Fluid Dynamics 195
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
8.1.1 What is fluid dynamics about? . . . . . . . . . . . . . . . . 196
8.1.2 How can we model such a fluid? . . . . . . . . . . . . . . . 196
8.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
8.3 Pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
8.3.1 Compressible and incompressible fluids . . . . . . . . . . 198
8.3.2 Hydrostatic pressure . . . . . . . . . . . . . . . . . . . . . . 198
8.3.3 Buoyancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
8.4 Continuity equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.5 Bernoulli’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.5.1 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.6 Surface tension, energy and capillary pressure . . . . . . . . . . 201
8.7 Friction in fluids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

9 Electro- and Magnetostatics 205

9.1 Electrostatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
9.1.1 Coulomb Force . . . . . . . . . . . . . . . . . . . . . . . . . 206
9.1.2 Electrostatic field . . . . . . . . . . . . . . . . . . . . . . . . 207
9.1.3 Superposition . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.1.4 Continuous charge distributions . . . . . . . . . . . . . . . 209
9.1.5 Gauss’ law . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
9.1.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
9.2 Potential and Voltage . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9.2.1 Electric potential . . . . . . . . . . . . . . . . . . . . . . . . 215
9.2.2 Electric potential of a point charge . . . . . . . . . . . . . 216
9.2.3 Potential of multiple charges . . . . . . . . . . . . . . . . . 218
9.2.4 Voltage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.2.5 Potential and conducting material . . . . . . . . . . . . . . 219
9.2.6 Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.2.7 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
9.3 Current and magnetic field . . . . . . . . . . . . . . . . . . . . . . 224
9.3.1 Current and conservation of charge . . . . . . . . . . . . . 224
9.3.2 Magnets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
9.3.3 Magnet and electric current . . . . . . . . . . . . . . . . . 227
9.3.4 Lorentz force . . . . . . . . . . . . . . . . . . . . . . . . . . 228

10 Direct current circuits 231
10.1 Ohm’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.2 Equivalent circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.2.1 Wire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.2.2 Series circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.2.3 Parallel circuit . . . . . . . . . . . . . . . . . . . . . . . . . . 234
10.2.4 Voltage source . . . . . . . . . . . . . . . . . . . . . . . . . . 234
10.3 Electric power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
10.4 Electric components . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.5 Kirchhoff’s circuit law . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.5.1 Current law . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.5.2 Voltage law . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.5.3 Applying Kirchhoff’s law . . . . . . . . . . . . . . . . . . . . 236
10.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
10.6.1 Maximal power from a real power source . . . . . . . . . 239
10.6.2 Charging a capacitor . . . . . . . . . . . . . . . . . . . . . . 240

11 Electrodynamics 243
11.1 Magnetism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
11.1.1 Magnetic Field and Flux . . . . . . . . . . . . . . . . . . . . 244
11.1.2 Ampere’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.1.3 Magnetic Field of a Moving Point Charge . . . . . . . . . . 246
11.1.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
11.2 Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
11.2.1 Approach and Definition . . . . . . . . . . . . . . . . . . . . 252
11.2.2 Self induction . . . . . . . . . . . . . . . . . . . . . . . . . . 254
11.2.3 Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
11.2.4 Transformer . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.3 Displacement current . . . . . . . . . . . . . . . . . . . . . . . . . . 259
11.4 Maxwell’s equations and their conclusions . . . . . . . . . . . . . 261
11.4.1 Maxwell’s equations . . . . . . . . . . . . . . . . . . . . . . 261
11.4.2 Electromagnetic wave . . . . . . . . . . . . . . . . . . . . . 262
11.5 Electro-magnetic field in Materials . . . . . . . . . . . . . . . . . . 265
11.5.1 Polarizability and dielectric constant . . . . . . . . . . . . 265
11.5.2 Electric displacement and Polarisation . . . . . . . . . . . 268
11.5.3 Continuity equations at interfaces . . . . . . . . . . . . . 270
11.5.4 Magnetic field . . . . . . . . . . . . . . . . . . . . . . . . . . 271

11.5.5 Maxwell’s equations in Materials . . . . . . . . . . . . . . . 273
11.5.6 Electromagnetic waves . . . . . . . . . . . . . . . . . . . . 274
11.6 Energy of the electromagnetic field . . . . . . . . . . . . . . . . . 276
11.6.1 Energy density of the electric field . . . . . . . . . . . . . 276
11.6.2 Energy density of the magnetic field . . . . . . . . . . . . 277
11.6.3 Poynting vector . . . . . . . . . . . . . . . . . . . . . . . . . 277

12 Alternating current (AC) 281

12.1 Describing alternating voltage and current . . . . . . . . . . . . . 282
12.1.1 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . 282
12.1.2 Usual real notation . . . . . . . . . . . . . . . . . . . . . . . 283
12.1.3 Phasor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
12.1.4 Complex notation . . . . . . . . . . . . . . . . . . . . . . . . 283
12.2 Impedance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
12.2.1 Ohmic resistor . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.2.2 Capacitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.2.3 Inductor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.3 Combinations of R,C and L . . . . . . . . . . . . . . . . . . . . . . . 290
12.3.1 Kirchhoff’s laws . . . . . . . . . . . . . . . . . . . . . . . . . 290
12.3.2 Serial and parallel circuit . . . . . . . . . . . . . . . . . . . 291
12.3.3 High pass filter . . . . . . . . . . . . . . . . . . . . . . . . . 292
12.3.4 Resonant circuit . . . . . . . . . . . . . . . . . . . . . . . . . 295
12.4 Power consideration and effective values . . . . . . . . . . . . . . 300
12.4.1 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
12.4.2 Effective values . . . . . . . . . . . . . . . . . . . . . . . . . 302
12.4.3 Active, reactive and apparent power . . . . . . . . . . . . 302
12.5 Three-phase electric power . . . . . . . . . . . . . . . . . . . . . . 303
12.5.1 Definition and production . . . . . . . . . . . . . . . . . . . 303
12.5.2 Star and Delta circuit . . . . . . . . . . . . . . . . . . . . . 304
12.5.3 Advantage of a three-phase system . . . . . . . . . . . . . 307

13 Special Relativity 309
13.1 Historical Milestones . . . . . . . . . . . . . . . . . . . . . . . . . . 310
13.1.1 Aether and electromagnetic waves . . . . . . . . . . . . . 310
13.1.2 Flying electron . . . . . . . . . . . . . . . . . . . . . . . . . 311
13.2 Galileo transformations . . . . . . . . . . . . . . . . . . . . . . . . . 313
13.2.1 Reference System . . . . . . . . . . . . . . . . . . . . . . . . 313
13.2.2 Inertial frame of reference . . . . . . . . . . . . . . . . . . 314
13.2.3 Galileo Transformation . . . . . . . . . . . . . . . . . . . . . 315
13.3 Lorentz transformation . . . . . . . . . . . . . . . . . . . . . . . . . 317
13.3.1 Einstein’s Postulates . . . . . . . . . . . . . . . . . . . . . . 317
13.3.2 Synchronisation of clocks . . . . . . . . . . . . . . . . . . . 318
13.3.3 Time dilation . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
13.3.4 Lorentz contraction . . . . . . . . . . . . . . . . . . . . . . 320
13.3.5 Symmetry of time dilation and Lorentz contraction . . . 322
13.3.6 Lorentz transformation . . . . . . . . . . . . . . . . . . . . 323
13.4 Minkowski metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
13.4.1 Definition of Minkowski metric . . . . . . . . . . . . . . . . 326
13.4.2 Properties of Minkowski metric . . . . . . . . . . . . . . . . 326
13.4.3 Four vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
13.4.4 Note on rigorous derivation . . . . . . . . . . . . . . . . . . 328
13.5 Velocities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
13.5.1 Addition of parallel velocities . . . . . . . . . . . . . . . . 329
13.5.2 Addition of perpendicular velocities . . . . . . . . . . . . 330
13.5.3 velocity four vector . . . . . . . . . . . . . . . . . . . . . . 330
13.6 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
13.6.1 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
13.6.2 momentum four vector . . . . . . . . . . . . . . . . . . . . 333
13.6.3 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
13.6.4 Acceleration and forces . . . . . . . . . . . . . . . . . . . . 335
13.7 Paradoxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
13.7.1 Ladder and barn . . . . . . . . . . . . . . . . . . . . . . . . . 336
13.7.2 Twin paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
13.7.3 Solution to the flying electron problem . . . . . . . . . . 338

14 Quantum Mechanics 341
14.1 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
14.1.1 Black body radiation . . . . . . . . . . . . . . . . . . . . . . 342
14.1.2 Photoelectric effect . . . . . . . . . . . . . . . . . . . . . . 345
14.1.3 Double slit experiment . . . . . . . . . . . . . . . . . . . . . 348
14.2 Laws of Quantum Mechanics . . . . . . . . . . . . . . . . . . . . . . 351
14.2.1 Wavefunction and probability . . . . . . . . . . . . . . . . 351
14.2.2 Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 351
14.2.3 De Broglie hypothesis . . . . . . . . . . . . . . . . . . . . . 352
14.2.4 Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . 353
14.2.5 Schrödinger Equation . . . . . . . . . . . . . . . . . . . . . . 354
14.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
14.3.1 Bohr model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
14.3.2 Rigorous example . . . . . . . . . . . . . . . . . . . . . . . . 357

15 Introduction to Statistics 363

15.1 Location and Spread of a single Set of Data . . . . . . . . . . . . 364
15.1.1 Bivariate Analysis . . . . . . . . . . . . . . . . . . . . . . . . 365
15.2 Uncertainty Propagation . . . . . . . . . . . . . . . . . . . . . . . . 366
15.2.1 Quantification of Uncertainty . . . . . . . . . . . . . . . . 366
15.2.2 Propagation of Uncertainty . . . . . . . . . . . . . . . . . . 366
15.3 Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
15.3.1 The International System of Units (SI) . . . . . . . . . . . 368
15.3.2 Prefixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
15.3.3 Dimensional Analysis . . . . . . . . . . . . . . . . . . . . . . 369
15.4 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
15.4.1 Elements of good graphs . . . . . . . . . . . . . . . . . . . . 369
15.4.2 Logarithmic Plots . . . . . . . . . . . . . . . . . . . . . . . . 370

Appendices 370

A Further derivations 373

A.1 Derivations of Statistics . . . . . . . . . . . . . . . . . . . . . . . . 374
A.1.1 Alternative formulations for Variance and Covariance . . 374
A.1.2 Derivation of the Least Squares Coefficients . . . . . . . 374

B Tables 377
B.1 List of physical constants (in SI units) . . . . . . . . . . . . . . . . 378
B.2 List of named, SI derived units . . . . . . . . . . . . . . . . . . . . 379
B.3 List of material constants . . . . . . . . . . . . . . . . . . . . . . . 379

Chapter 1

SOLVING STRATEGIES
The problem is not the problem; the
problem is your attitude about the
problem.
Captain Jack Sparrow.

1.1 Get an Overview and Elaborate a Strategy . . . . . . . . . . . . . 2

1.2 Get the Key Aspects of the Problem . . . . . . . . . . . . . . . . . 2
1.3 Write Clearly and Keep the Overview . . . . . . . . . . . . . . . . 3
1.4 Drawings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Symmetries and Order of Magnitude . . . . . . . . . . . . . . . . . 6
1.6 Introduce new variables . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7 Check Your Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.8 Calculator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.9 Other Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.10 Calculated example . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1 Solving Strategies
The tests of the Physics Olympiad are rather tricky and sometimes beyond the scope of
physics at school. Often it is important to have a good idea or to approach the problem
from a good starting point. The hints and tricks presented in this chapter should help
you to approach a problem and to solve it efficiently.

1.1 Get an Overview and Elaborate a Strategy

The time and the problems are chosen such that it is almost impossible to solve all the
problems correctly within the given time. This might sound cruel, but look at it from the
positive side: treat the different topics as a choice and focus on those you can solve
best. Nobody expects you to solve everything correctly (the goal is not to get all the points
but to get more than the others). Hence good time management is crucial. Do not despair
if you get stuck on a problem; most of the other competitors may be stuck as well,
and there might be another problem that you can solve. In summary, the most
important points are:

• Read through the whole exam and try to solve the easy tasks first (they are sometimes
hidden after a more difficult task, so read the whole problem).

• Focus on the problems and topics you like more or think you have a better chance
of solving correctly. Use your time wisely.

• Fight for every small point: even an unsuccessful attempt might earn some points,
while writing nothing certainly earns none.

• Do not despair: the test is hard for everyone. Focus on fighting for
points.

1.2 Get the Key Aspects of the Problem

Before you can really start, you have to understand what is asked for and which quantities
are given. Sometimes a problem looks hard at first sight but gets much easier on a second
look. The opposite is also possible... Some tricks to improve the overview
are:

• Read the problem twice; you may have missed an important detail on the first reading
and catch it on the second.

• Also read the subsequent tasks; they may give you a clue about what to do.

• Make a drawing of the situation (see Section 1.4).

• If there are a lot of variables involved and you have lost the overview, make a list of
these variables (and label them in the drawing). Also write down (obvious) relations
between the variables. This might already give some points.

• The number of points available is indicated for each task. This is
a hint as to how difficult or easy the task is.

1.3 Write Clearly and Keep the Overview

Neat writing and a clear layout do not only please those who correct the exam.
They also help you keep the overview and thus minimize calculation errors.
Some hints (see also the examples in the next chapters):

• Leave enough space in all directions and in particular between different problems
/ parts / questions. This will improve the overview and in case you have to add
something, there is still space left.

• Explain what you are doing (also in words) and make nice drawings (see also next
section).

• If you can name the variables yourself, give them intuitive names, for example
Greek letters for angles and the symbols you are used to for energy, momentum and
so on.

• Clearly write down your assumptions and simplifications.

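• For a longer computation, do not write a long row of equal signs on one line. Instead,
write the successive right-hand sides below each other, aligned at the equal sign. For example:

\begin{align*}
ax^2 + bx + c &= a\left(x^2 + \frac{b}{a}x\right) + c \\
&= a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) + c - \frac{b^2}{4a} \\
&= a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}.
\end{align*}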

1.4 Drawings

Sometimes you are asked to make a drawing, or you want to make one to get a better
overview. Whatever the purpose of the drawing, a few points should
be considered:

• Make a BIG drawing. And by big we mean: really use the space on the page
(drawings smaller than a third of the page are in most cases too small).

• If possible, use ruler and compass for the drawing and try to draw it as precisely
as possible (without wasting time on the drawing). Sometimes a problem can be
solved geometrically, which is why a precise drawing might be useful.

• Use different colors to distinguish different properties.

• Label the quantities in the drawing that are given or asked or that might be helpful
(see next section).

Second round 2020, part of the first question:

Consider the following schematic of a mirror telescope setup, using an ocular lens.
The sketch is not to scale.
[Figure: schematic of the mirror telescope, with the quantities lR, dR and D labelled.]

The telescope has a total length of L = 2 m, and the diameter of the opening
and the parabolic mirror is D = 50 cm. In the interior of the telescope, there
is a planar secondary mirror with diameter dR = 9 cm. The secondary mirror is
placed lR = 1.8 m from the parabolic mirror, and tilted by 45°. The ocular lens has
diameter dO = 1 cm and is placed at an adjustable distance lO outside the housing,
in such a way that a sharp image forms. The focal length of the parabolic mirror is
fP = 2.1 m, and the focal length of the ocular lens is fO = 3 cm.

Question 1: Sketch the trajectory of light from a very distant star through the telescope.
Solving Strategy: We make a big drawing and try to draw the situation to scale 1 : 10
(as far as possible, see also the Note below). Since the star is very far away,
we can assume that the rays are parallel when entering the telescope. When the rays hit
the parabolic mirror, they are focused towards the focal point and would intersect there. The
small planar mirror deflects the rays, so they intersect at the mirrored
focal point instead; in the drawing we "reflect" the focal point. For a sharp image, the
rays should be parallel after the ocular. For this, all of them must pass
through the focal point of the ocular (note that the diameter of the ocular and its distance
are not drawn to scale).
Solution 1:

Figure 1.1
Note: We made a big and neat drawing¹, which will be useful in the next two tasks!
Furthermore, it is important to read the task exactly and perhaps also to mark things
in the drawing that seem obvious but are easily forgotten (there were students
who mistook the mirror telescope for a lens telescope and drew the rays through the mirror).

Question 2: At which distance lO does the ocular lens need to be placed so that the
rays from a very distant star are parallel after the ocular lens?
Solving Strategy: For the rays to be parallel after the lens, they must pass through
its focal point. If we place the lens such that its focal point lies where
all rays meet (the mirrored focal point of the parabolic mirror), this condition is fulfilled. From the
drawing we see that the focal points of the lens and of the mirror must coincide.

Solution 2: $f_P + f_O = l_R + \frac{1}{2}D + l_O$, which leads to $l_O = f_P + f_O - l_R - \frac{1}{2}D = 8\,\mathrm{cm}$.

Note: Since we drew the whole situation very carefully, this solution is easier to see.
If the drawing is to scale, we can even check the numerical result (not possible here,
as the ocular could not be drawn at the right position).
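
The arithmetic of Solution 2 can also be checked with a few lines of Python. This is only a minimal sketch: the variable names are chosen here for illustration, and all lengths are converted to metres before they are added, since mixing cm and m is a common source of errors.

```python
# Numerical check of Solution 2 (all lengths in metres).
# Variable names are illustrative; the values are those given in the problem.
f_P = 2.1    # focal length of the parabolic mirror
f_O = 0.03   # focal length of the ocular lens (3 cm)
l_R = 1.8    # distance between parabolic mirror and secondary mirror
D = 0.50     # diameter of the telescope opening (50 cm)

# Condition from the drawing: the mirrored focal point of the mirror
# coincides with the focal point of the ocular lens,
#   f_P + f_O = l_R + D/2 + l_O
l_O = f_P + f_O - l_R - D / 2

print(f"l_O = {l_O * 100:.1f} cm")
```

The printed value should agree with the 8 cm obtained by hand.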

Question 3: Determine the magnification of the telescope.

Solving Strategy: From a geometrical point of view we can use the intercept theorem: the
ratio between the distance from the parabolic mirror to the focal point and the distance
from the ocular to the focal point is equal to the ratio of the corresponding diameters of
the ray bundle.
Solution 3: $M = \frac{f_P}{f_O} = 70$, using the intercept theorem.
Note: Again, a good drawing helps a lot.
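
As a quick sanity check of Solution 3, the ratio can be evaluated numerically. The sketch below uses illustrative variable names. The diameter of the outgoing ray bundle is not part of the original solution; it is added here only to show what the intercept theorem says about the two diameters, assuming the incoming bundle fills the whole opening D.

```python
# Numerical check of Solution 3 (lengths in metres); names are illustrative.
f_P = 2.1    # focal length of the parabolic mirror
f_O = 0.03   # focal length of the ocular lens (3 cm)
D = 0.50     # diameter of the incoming ray bundle (assumed equal to the opening)

M = f_P / f_O    # magnification from the intercept theorem
d_out = D / M    # diameter of the parallel ray bundle behind the ocular

print(f"M = {M:.0f}")
print(f"diameter of the outgoing bundle = {d_out * 1000:.1f} mm")
```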

The first problem continues, but the drawing itself is no longer that useful. Nevertheless,
with these solutions we earned 4.25 points out of 16².

1.5 Symmetries and Order of Magnitude

The physics taught in this script is formulated quite generally and the formulas might
look more complicated than they are. In a specific problem, there might be symmetries
involved that massively simplify the problem (in particular in electrodynamics).
Furthermore, the problem might contain quantities that have completely different
orders of magnitude. This might allow you to simplify the problem (e.g. in most problems
the earth can be considered flat). The most important hints on this topic are:

• What is the proper dimensionality of the problem? Often it is not necessary to treat
the problem in 3 dimensions. For example the motion of a planet or the ballistic
treatment of a football can be done in two dimensions as they move in a plane.

• Use rotational or mirror symmetries.

• If two quantities in the problem are explicitly related via a $\ll$ or a $\gg$, you
should always simplify your computations. The difficulty is then to figure out how much
the problem can be simplified without neglecting the effect under investigation, i.e.
whether it is allowed to set one value completely to zero or whether a more sophisticated
simplification such as a Taylor expansion is needed. For example, for a pendulum
the displacement s shall be much smaller than the length of the pendulum l ($s \ll l$).
But setting s = 0 would not allow us to investigate the oscillation,
hence we need the first order approximation s/l = sin(φ) ≈ φ where φ is the
displacement angle.

¹ The page layout does not allow the drawing to be reproduced at its original scale. The
squares originally have a size of 5 mm.
² The average score for the whole of problem 1 (including the tasks that are not listed
here) in the second round 2020 was ≈ 1.8.
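The pendulum approximation sin(φ) ≈ φ can be checked numerically. The following is only a minimal sketch; the function name and the sample angles are illustrative choices, not part of the script.

```python
import math

def small_angle_error(phi):
    """Relative error made when sin(phi) is replaced by phi (phi in radians)."""
    return abs(phi - math.sin(phi)) / math.sin(phi)

# Sample angles chosen for illustration only
for degrees in (1, 5, 10, 30):
    phi = math.radians(degrees)
    print(f"{degrees:2d} deg: relative error {small_angle_error(phi):.2e}")
```

The error grows roughly like φ²/6, so the approximation is excellent for a few degrees and noticeably worse at 30°.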

1.6 Introduce new variables

In some cases it is very useful to introduce a new variable (or several) that is not given
by the problem. It might be easier to split the problem into smaller steps and find relations
between given quantities and introduced ones instead of only using the given variables.
In most cases, the newly introduced variable just helps you to find useful equations
and to simplify the physics. In the end it is often possible to get rid of the introduced
variable by having enough equations to eliminate it (see also the example below). Some tricks
are:

• Say clearly how you define your new variable. If the definition is not clear you
might mix up different quantities.

• Introduce only variables you know how to deal with. Think a bit if the new variable
really describes something useful and meaningful.

• Do not introduce variables that are very obviously related to other variables (e.g.
radius and diameter of a circle). Otherwise you risk losing track of all
the variables.

• Keep track of your variables and name them in an intuitive way, e.g. Greek letters
for angles (see also section 1.3).

• If you have a drawing, also label this new variable.


Example:
Problem: A coin with diameter d = 2 cm lies on the floor of a white tea cup which
has an inner diameter b = 6 cm and height l = 8 cm. The coin lies in the middle of
the cup. From which angle is it possible to see the whole upper surface of the coin
if the cup is filled with water up to a height h? Write down a relation between the
given quantities (no need to solve it).

Solving strategy: First we think about the symmetry of the problem. Assuming the cup
is round, it does not matter from which side we look at it. We therefore have
a rotationally symmetric problem and we simply choose one vertical plane going through
the middle of the cup. We have thereby reduced the 3-dimensional problem to 2 dimensions,
which will simplify our calculations and drawings. We draw the situation (see figure 1.2) and
try to understand what limits the visibility of the coin. Looking at the coin from the
top we obviously can see the coin. When turning to the side there is an angle where
the light from the edge of the coin just touches the rim of the cup (corresponding angle
drawn as β). We therefore have to find this β. The difficulty is that the light gets
refracted at the water-air surface and it is not clear where on the surface this happens.
So we introduce the distance x between that point and the side of the cup. We now
try to find enough equations to eliminate x. In the drawing, we have four given
variables (b, d, h, l) and three unknown variables (α, β and x). We have to find three
(independent) equations relating all these variables.

Figure 1.2: Drawing of the situation (labelled quantities: α, l, h, x, d).


Solution: We can apply Snell’s law:

$$\sin(\alpha)\, n_w = \sin(\beta) \tag{1.1}$$

where $n_w$ is the refractive index of water (known).

Second we compute the horizontal path length from the edge of the coin to the rim
of the cup

$$b - \frac{d}{2} = \tan(\alpha)\, h + x. \tag{1.2}$$

Third we have

$$\tan(\beta) = \frac{x}{l - h}. \tag{1.3}$$
Having these three equations we can eliminate all unknown variables: Solve equation
1.3 for x and insert it in 1.2. Furthermore solve equation 1.1 for α and also insert it
into equation 1.2. We then get

$$b - \frac{d}{2} = \tan\!\left(\arcsin\!\left(\frac{\sin(\beta)}{n_w}\right)\right) h + (l - h)\tan(\beta) \tag{1.4}$$

This equation defines β. As we are not asked to solve it, we are done.

Note: Of course we could have solved the problem without introducing x but some-
times it is very helpful to introduce a variable to split the problem into smaller steps
which can be solved easier.
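Although the task does not ask for it, equation 1.4 can be solved numerically once all quantities are known. The sketch below uses bisection; the water height h = 4 cm and the refractive index n_w = 1.33 are assumed values (the problem leaves h open), and the function names are illustrative.

```python
import math

def residual(beta, b, d, l, h, n_w):
    """Right-hand side minus left-hand side of equation (1.4)."""
    alpha = math.asin(math.sin(beta) / n_w)  # Snell's law, equation (1.1)
    return math.tan(alpha) * h + (l - h) * math.tan(beta) - (b - d / 2)

def solve_beta(b, d, l, h, n_w, tol=1e-12):
    """Find beta in (0, pi/2) by bisection; the residual grows with beta."""
    lo, hi = 0.0, math.pi / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid, b, d, l, h, n_w) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# b, d, l are taken from the problem (in cm); h and n_w are assumptions
beta = solve_beta(b=6.0, d=2.0, l=8.0, h=4.0, n_w=1.33)
print(f"beta = {math.degrees(beta):.1f} deg")
```

Bisection is chosen here because the residual increases monotonically with β on the interval, so no derivative is needed.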

1.7 Check Your Result

If you got a result there are some sanity checks you should do to avoid avoidable mistakes
such as forgetting a square or using the wrong units:

• Analysis of dimension: Does the result have the correct dimension? Sometimes it
is even possible to guess a formula just by looking at the dimension of the given
quantities.

• Is the argument of certain functions without dimension? This point concerns in
particular the trigonometric functions and the exponential and logarithm. For example
in sin(x), x must be without dimension. So x = ωt where t is time and ω an
(angular) frequency is fine, but x = t makes no sense (what is the sine of a time?!).

• If specific numerical values are given, the result should be meaningful. For example
the speed of a car is of the order of 10 m·s⁻¹ and certainly not 1000 m·s⁻¹ or
0.01 m·s⁻¹.

• Do a back-of-the-envelope estimate to cross-check results obtained with the calculator.
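The idea of guessing a formula from dimensions alone can be written as a small linear system. As an illustration (this example is an assumption and not taken from the script), suppose the period T of a pendulum depends only on its mass m, its length l and the gravitational acceleration g, so that T ∝ mᵃ lᵇ gᶜ. Matching the exponents of kg, m and s gives three equations for a, b and c:

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Columns: mass m, length l, gravitational acceleration g.
# Rows: exponents of the base units kg, m, s carried by each quantity.
dimensions = [
    [1, 0, 0],   # kg
    [0, 1, 1],   # m
    [0, 0, -2],  # s
]
target = [0, 0, 1]  # a period has the dimension s^1

a, b, c = solve_linear(dimensions, target)
print(f"T ~ m^{a} * l^{b} * g^{c}")
```

The exponents only fix the combination of quantities; a dimensionless prefactor (such as 2π) cannot be found this way.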

1.8 Calculator
At our exams, you are only allowed to use a simple calculator without any algebraic solving
system or graphic display. With some tricks such a calculator is enough to solve the
problems¹.

• If numerical values are given, insert them at the very end of your calculation. It is
easier to calculate with variables than with numbers and you will make fewer mistakes.

• If you split your computations into different steps, save your intermediate results and
continue with the saved values. This way you avoid rounding errors.

• When inserting the numbers into your formula, it is easier to write them in scientific
notation, i.e. c = 3 · 10⁸ instead of c = 300000000. Doing this, you can even
go one step further and simplify all the powers of ten in your formula before
typing it into the calculator.

• Get used to your calculator: You should know how your calculator works and
where the different operations are located. Some calculators (which are allowed at
our exams) have even some more functions such as computing the mean or the
standard deviation. Knowing these functions might be very helpful.

1.9 Other Hints

There are some other hints that might help:

• Even when specific values are given (e.g. velocity v = 3 m·s⁻¹), only plug them in
at the end, as you will normally get points for the final formula.

• If you have no clue, look at the dimensions of the given variables. Maybe you can
guess the answer. This is also very useful when you are not sure to which power
you have to raise the quantities or if you forgot the formula.

• Usually the tasks are such that the solution only needs given quantities (maybe not
given as numerical values but as variables) or natural constants. Nevertheless there
are sometimes questions where you have to make an assumption. Examples are
the diameter of an eye (in optics) or the mass of an object.

• Take your time to read the problem and get some understanding of the systems
and definitions. Start solving the first exercise once you have done this.

¹ When studying mathematics or physics, in most exams you are not allowed to use any
calculator or only a simple one. So it is a very good exercise to get used to solving
mathematical or physical problems without a calculator or with a simple one.

1.10 Calculated example

In this section we have a look at an entire former second round problem (second round
2020, problem 3).
Intro:
Let’s consider a helical ramp. The helix’s axis is vertical, its radius R (the horizontal
distance from each point of the ramp to the axis) is constant. The ramp’s slope is also
constant and such that the vertical distance between two coils (distance which is called
the helix’s ”pitch”) is s.

[Figure: helical ramp; labelled quantities: $\vec g$, s]

We study the motion of a marble of mass m that rolls on the ramp. The marble’s position
l(t) on the helix is described by the distance it travelled along the ramp from its initial
position.


Part A: A point object on a line

First, let's consider that the ramp is analogous to a line along which the marble moves
without friction and without leaving the ramp.

Task A i) (1P): What is the length L of one helix’s turn, that is, the distance the marble
travelled when it crosses the vertical of its initial position for the first time after being let
go along the ramp?
Solving Strategy: Before hurrying to the answer, we first make some general considerations
about the problem:
• Read the whole problem or at least part A. There might be some hints hidden in
the following tasks.

• Think about the dimensionality of the problem and how to describe the motion
efficiently (reading the whole part A first might help answer this question). The
motion itself is confined to the helix. The helix is a line, so one dimension is
sufficient to describe the motion (in Task A iv) - vi) this will be relevant).
Nevertheless the helix is embedded in three dimensions, so when talking about the
helix itself, we may have to do the computation in 3D.
As this question deals with the helix, we have to consider more than one dimension. But
we can simplify the computation if we unroll the helix². We then see that one winding
corresponds to the hypotenuse of a right triangle. One cathetus of this triangle
is the circumference 2πR, the other is the pitch s.
Solution: Unrolling the helix and applying Pythagoras leads to

$$L = \sqrt{(2\pi R)^2 + s^2}.$$

Note: No hurry at the beginning, we will see that the general considerations will not be
done in vain.

Task A ii) (1P): What is the angle α between the ramp and the horizontal plane?
Solving Strategy: We are dealing with the helix again and we can reuse the picture gained
in the first task.
Solution:

$$\alpha = \arctan\left(\frac{s}{2\pi R}\right).$$

Note: Understanding the first task well and making a drawing (at least in your head by
unrolling the helix) helped in solving this one.

² To unroll the helix, imagine a piece of paper rolled into a cylinder. Draw the helix onto
the paper and unroll it. The helix becomes a set of parallel, slanted lines.

Task A iii) (1P): Draw the applied forces acting on the marble in the reference frame of
your choice.
Solving Strategy: As we need three dimensions to draw the helix, we also need all three
dimensions to draw the forces. Since a three-dimensional drawing is difficult, we draw
the situation once from the side and once from the front (a good 3D drawing
is also possible).
Solution: There is the gravitational force $F_G$, the centripetal force $F_Z$ and the normal
force $F_N$. The vertical line in the drawing corresponds to the axis around which the helix
winds.

Figure 1.3 (two panels: side view and front view; labelled quantities: $F_N$, $F_G$, $-F_G$, $F_Z$, $R$)

Note: In this task different drawings are possible, but it must be clear from which
perspective they are drawn.

Task A iv) (1.5P): Compute the marble’s acceleration a(t) tangent to the ramp as a
function of time.
Solving Strategy: We use Newton's second law: The sum of all forces must be equal to
the acceleration times the mass. As we only have to consider the tangential acceleration, we
only have to consider the forces drawn in the side view picture. The normal force and the
gravitational force compensate each other in the direction perpendicular to the direction
of motion. The resulting force is therefore the force pointing parallel to the ramp.


Solution: The tangential force is constant and since there is no friction we get

$$a = g\sin(\alpha) = \frac{g s}{L}.$$

Note: Understanding the geometry of the problem is again very useful.

Task A v) (0.5P): We let the marble roll along the ramp with an initial velocity v0 (tangent
to the ramp). Compute the marble’s position l(t) as a function of time.
Solving Strategy: As the acceleration is constant, we deal with the usual formula for
constant acceleration.
Solution:

$$l(t) = v_0 t + \frac{1}{2} a t^2$$

Note: This question might look difficult at first sight, but looking at the number of points
we realize there must be a simple solution.

Task A vi) (2P): If the initial velocity is directed up the ramp, after what time
τ will the marble cross its initial position again? Find a numerical value for τ using
R = s = 20 cm and $v_0$ = 1 m·s⁻¹.
Solving Strategy: Being at the same position again means l(τ) = 0. Inserting this into the
equation above and solving it for τ leads to the correct result.
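Solution: We want l(τ) = 0. Taking the downward direction along the ramp as positive (as in
Task A v), an initial velocity pointing up the ramp enters with a negative sign, so
$l(t) = -v_0 t + \frac{1}{2} a t^2$. There are two solutions; at t = 0 the marble starts rolling
up, hence we want the non-zero solution:

$$\begin{aligned}
0 &= -v_0 + \frac{1}{2} a \tau \\
\tau &= \frac{2 v_0}{a} \\
&= \frac{2 v_0}{g s}\sqrt{(2\pi R)^2 + s^2} \\
&= \frac{2 v_0}{g}\sqrt{\left(2\pi \frac{R}{s}\right)^2 + 1}.
\end{aligned}$$

Therefore τ ≈ 1.3 s.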
Note: The numerical values are inserted at the very end.
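The results of Part A can be evaluated in a few lines. This is a minimal sketch with the values from Task A vi; the value g = 9.81 m·s⁻² is an assumption, since the task does not state it.

```python
import math

g = 9.81       # m/s^2, assumed value of the gravitational acceleration
R = s = 0.20   # m, from Task A vi
v0 = 1.0       # m/s, from Task A vi

L = math.hypot(2 * math.pi * R, s)        # Task A i: length of one turn
alpha = math.atan(s / (2 * math.pi * R))  # Task A ii: slope angle
a = g * math.sin(alpha)                   # Task A iv: tangential acceleration
tau = 2 * v0 / a                          # Task A vi: return time

print(f"L = {L:.3f} m, alpha = {math.degrees(alpha):.1f} deg")
print(f"a = {a:.3f} m/s^2, tau = {tau:.2f} s")
```

The computed τ agrees with the value of about 1.3 s given in the solution, and a = g sin(α) coincides with g s / L as expected.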


Part B: Like a slide

We consider now that the ramp’s cross-section is a half-circle of radius r, the two rims
being at same height. The marble is still considered a point object with frictionless motion.
The position of the marble inside the ramp is determined by the angle φ(t) (taken in the
vertical plane containing the helix’s axis) and the distance l(t) from its initial position,
measured along the bottom of the half-pipe (where φ = 0).

[Figure: cross-section of the ramp; labelled quantities: φ, r]

Task B i) (5P): Find an equation that links the variables φ(t), R, r, s, l(t) and L if the
marble’s initial conditions are v0 = 0 and φ0 = 0. No other variable than these six ones
should appear in this equation. You are not asked to solve this equation.
Solving Strategy: OK, this is a hard task, as the 5 points already suggest! But there is no
reason to despair, let's solve it step by step.

1. Understand the geometry of the problem: In part A, the helix consisted of an
infinitely thin line, which is why we could treat the movement of the marble in one
dimension. In this part B, the line is replaced by a half pipe. This half pipe is then
wound up as a helix. The difficulty is that the motion of the marble is not confined
to one dimension anymore. Instead the marble can also move on the semi-circle
given by the half pipe, which is parametrized by φ.

2. After understanding the geometry, we start to think about the motion in an intuitive
way. Along the helix, the marble will always get faster (as it goes down). But the
faster motion requires a bigger centripetal force, hence the marble will be pushed
outwards, meaning φ gets bigger (intuitively, but wrongly, speaking: the centrifugal force
gets bigger and pushes the marble out).

3. Think about how to simplify the computation³: The speed along the helix will
always be much bigger than the speed along the semi-circle. Therefore we neglect the
velocity along the semi-circle.

³ If you did not consider the motion along the semi-circle, don't worry, you have a good
physical intuition about the relevant aspects.

4. Now we start collecting equations: First we apply Newton's laws in vertical and
horizontal direction:

$$N\cos(\varphi) - m g_\perp = 0$$

$$N\sin(\varphi) = m a_c = m\,\frac{v^2}{R + r\sin(\varphi)}.$$
The first line basically tells you that the component $m g_\perp$ of the gravitational force
is compensated by the normal force N. The second line includes the centripetal
acceleration $a_c$: The horizontal component of the normal force acts as centripetal force
forcing the marble on the curved path of the helix. Since the result should not
include $g_\perp$ and the velocity v, we have to find more equations relating them to
other given quantities.
As the velocity along the circle is negligible, there must be no force pointing tangent
to the circle. This means that the normal force points towards the center of the circle,
see the following figure

[Figure: normal force $\vec N$ at angle φ]

As the vertical component of $\vec N$ is $m g_\perp$, we get another equation

$$\frac{v^2}{R + r\sin(\varphi)} = g_\perp \tan(\varphi).$$
In addition we somehow have to connect the velocity with the position along the
path. This is most easily done with energy conservation, equating the potential and the
kinetic energy:

$$\frac{1}{2} m v^2 = m g_\perp \left( s\,\frac{l}{L} - r\,(1 - \cos(\varphi)) \right).$$
With these equations it should be possible to eliminate all introduced variables and
to solve the problem.


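Solution: The forces are

$$N\cos(\varphi) - m g_\perp = 0$$

$$N\sin(\varphi) = m a_c = m\,\frac{v^2}{R + r\sin(\varphi)}.$$

Furthermore the force must act perpendicular to the circle:

$$\frac{v^2}{R + r\sin(\varphi)} = g_\perp \tan(\varphi) \;\Rightarrow\; m v^2 = m g_\perp \tan(\varphi)\,(R + r\sin(\varphi))$$

And according to energy conservation:

$$\frac{1}{2} m v^2 = m g_\perp \left( s\,\frac{l}{L} - r\,(1 - \cos(\varphi)) \right) \;\Rightarrow\; m v^2 = 2 m g_\perp \left( s\,\frac{l}{L} - r\,(1 - \cos(\varphi)) \right)$$

Merging the last two equations, we get

$$\tan(\varphi)\,(R + r\sin(\varphi)) = 2\left( s\,\frac{l}{L} - r\,(1 - \cos(\varphi)) \right).$$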

Note: With the last equation, we are done as it relates the given quantities. This was a
really tough task, nobody will be able to solve this within a few minutes. So do not
despair if you don't see the solution immediately but try to do the steps explained above.
Furthermore we did not use the first two equations, but they are good for understanding the
problem and where the forces come from.

Task B ii) (1P): Will the marble jump off the ramp?
Solving Strategy / Solution: The vertical component of the normal force always has to
compensate the gravitational force, so the normal force always has a non-zero
vertical component. Hence the normal force will never be perfectly horizontal and
therefore the angle is always smaller than 90°. Therefore the ball will never jump off the ramp⁴.
Note: Once again it helped understanding the previous problem and the unused equa-
tions about the forces become useful. In addition this task was solvable even without
completely solving the previous one. So always look at all tasks and do not stop solving
if you cannot solve one.

⁴ Here the assumption enters that there is no motion in the semi-circle and therefore no
oscillation or other weird motion which (hypothetically) could cause the ball to jump off the ramp.

Task B iii) (2P): How is the equation simplified if we assume $R \gg r$?

Solving Strategy: We divide the solution of Task B i) by R and use $1 \gg r/R$, meaning
we discard all terms where r/R is added to a constant.
Solution:

$$\tan(\varphi)\left(1 + \frac{r}{R}\sin(\varphi)\right) = \frac{2 s l}{L R} - 2\,\frac{r}{R}\,(1 - \cos(\varphi))$$

$$\tan(\varphi) \approx \frac{2 s l}{L R}$$

using that sin(φ) and cos(φ) are bounded.
Note: A wrong result in Task B i) might also lead to a wrong result in this task. So if you
get some very unrealistic result here, you might already have made a mistake in B i). But
you might still get partial points for this task, so certainly write something down.

Task B iv) (1P): Provide the numerical value of φ(t) when the marble has completed 5 turns
(with R = 10 m, r = 2 cm and s = 2 m).
Solving Strategy: The marble performing 5 turns means l/L = 5. Inserting all these
values, we get the numerical result. Furthermore $R \gg r$, therefore we can use the
simplified formula.
Solution: We use the simplified formula, then φ ≈ 1.1 rad (= 63°).
Note: Do not insert numerical values earlier, only at the very end.
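To see how good the simplification R ≫ r is, the full relation from Task B i) can be solved numerically and compared with the simplified formula. The following sketch uses bisection with the values from Task B iv; the function names are illustrative and not part of the original solution.

```python
import math

def full_equation(phi, R, r, s, turns):
    """Left minus right side of the Task B i) relation, with l/L = turns."""
    left = math.tan(phi) * (R + r * math.sin(phi))
    right = 2 * (s * turns - r * (1 - math.cos(phi)))
    return left - right

def solve_phi(R, r, s, turns, tol=1e-12):
    """Bisection on (0, pi/2); the residual is negative at 0 and grows with phi."""
    lo, hi = 0.0, math.pi / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if full_equation(mid, R, r, s, turns) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R, r, s, turns = 10.0, 0.02, 2.0, 5  # values from Task B iv in SI units
phi_simple = math.atan(2 * s * turns / R)  # simplified formula from Task B iii
phi_full = solve_phi(R, r, s, turns)
print(f"simplified: {phi_simple:.4f} rad, full: {phi_full:.4f} rad")
```

For these values the two angles differ only in the third decimal place, which supports using the simplified formula.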

Chapter 2

MATHEMATICS
2×3=4
Pippi Långstrump

2.1 Vector algebra
2.2 Differential calculus
2.3 Integral calculus
2.4 Complex Numbers


This part of the script presents the most important mathematical tools. Some things will
be known from school, other things may be new. The goal is not to present the
entire high school mathematics, but to give an overview of the concepts which are
needed in the physics part. The proofs are not really important for the physics, so one
can omit them if not interested.

2.1 Vector algebra

In physics, vectors are of crucial importance, since many physical quantities are repre-
sented by vectors. The foundation of vector algebra should be known from school.
Therefore we will only repeat the two most important concepts for physics. The main
reference for this chapter is [1].

2.1.1 Scalar product

There is an intuitive way to add two vectors and to multiply a vector by a number. There
are essentially two ways how two vectors can be multiplied. One of them is the scalar
product (also known as dot product).

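Definition: Let $\vec a$ and $\vec b$ be vectors. Then the scalar product of $\vec a$ and $\vec b$ is

$$\vec a \cdot \vec b = \begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix} \cdot \begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix} = a_x b_x + a_y b_y + a_z b_z.$$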

There is a nice connection between the scalar product and the angle $\varphi$ between the
vectors $\vec a$ and $\vec b$ (see figure 2.1). We have the following alternative expression
for the scalar product:

$$\vec a \cdot \vec b = a_x b_x + a_y b_y + a_z b_z = |\vec a| \cdot |\vec b| \cdot \cos(\varphi).$$


Figure 2.1: $\vec c = \vec a - \vec b$

Proof: The equality $a_x b_x + a_y b_y + a_z b_z = |\vec a| \cdot |\vec b| \cdot \cos(\varphi)$ follows from the cosine formula¹.
We look at figure 2.1 and calculate the length of $\vec c = \vec a - \vec b$ in two ways. On one hand, we have:

$$\begin{aligned}
|\vec c|^2 = |\vec a - \vec b|^2 &= (a_x - b_x)^2 + (a_y - b_y)^2 + (a_z - b_z)^2 \\
&= a_x^2 + a_y^2 + a_z^2 + b_x^2 + b_y^2 + b_z^2 - 2 a_x b_x - 2 a_y b_y - 2 a_z b_z \\
&= |\vec a|^2 + |\vec b|^2 - 2\,(a_x b_x + a_y b_y + a_z b_z).
\end{aligned}$$

On the other hand, it follows from the cosine formula that:

$$|\vec c|^2 = |\vec a|^2 + |\vec b|^2 - 2 \cdot |\vec a| \cdot |\vec b| \cdot \cos(\varphi).$$

Comparing the two expressions for $|\vec c|^2$, we find

$$a_x b_x + a_y b_y + a_z b_z = |\vec a| \cdot |\vec b| \cdot \cos(\varphi).$$

¹ The cosine formula tells us that in a triangle with side lengths a, b and c we have

$$c^2 = a^2 + b^2 - 2ab\cos(\varphi),$$

where $\varphi$ is the angle between a and b.

Remarks

• The scalar product is often used to calculate the angle $\varphi$ between two vectors $\vec a$
and $\vec b$. Using the two different expressions for the scalar product, we get

$$\cos(\varphi) = \frac{\vec a \cdot \vec b}{|\vec a| \cdot |\vec b|} = \frac{a_x b_x + a_y b_y + a_z b_z}{|\vec a| \cdot |\vec b|}.$$

• The scalar product has an intuitive interpretation: One first projects the vector $\vec a$
onto the vector $\vec b$ to get a vector $\vec{a}'$ of length

$$|\vec{a}'| = |\vec a| \cdot \cos(\varphi)$$

(see fig. 2.2). Then the scalar product of $\vec a$ and $\vec b$ is the product

$$|\vec a| \cdot \cos(\varphi) \cdot |\vec b| = |\vec{a}'| \cdot |\vec b|$$

of the length of $\vec{a}'$ with the length of $\vec b$. This means that only the part of $\vec a$ which is
parallel to $\vec b$ contributes to the scalar product. The same holds if one interchanges
$\vec a$ and $\vec b$.
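The angle formula from the first remark can be tried out directly. The following is a minimal sketch in plain Python; the helper names and the two sample vectors are illustrative choices.

```python
import math

def dot(a, b):
    """Scalar product of two vectors given as tuples of components."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a vector, using a . a = |a|^2."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians from cos(phi) = a . b / (|a| |b|)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

a = (1.0, 0.0, 0.0)
b = (1.0, 1.0, 0.0)
print(math.degrees(angle_between(a, b)))  # close to 45
```

Because of rounding, the quotient passed to `acos` can end up marginally outside [-1, 1] for (anti)parallel vectors; a more careful version would clamp it.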

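Properties: We summarize the most important rules for calculations involving the
scalar product. Let $\vec a$, $\vec b$, $\vec c$ be three vectors and s a real number. Then we have:

$$\vec a \cdot \vec a = |\vec a|^2 = a_x^2 + a_y^2 + a_z^2$$

$$\vec a \cdot \vec b = \vec b \cdot \vec a$$

$$(s\vec a) \cdot \vec b = s \cdot (\vec a \cdot \vec b)$$

$$\vec a \cdot (\vec b + \vec c) = \vec a \cdot \vec b + \vec a \cdot \vec c.$$

Furthermore: If $\vec a, \vec b \neq \vec 0$, then $\vec a \cdot \vec b = 0$ if and only if $\vec a$ and $\vec b$ are orthogonal.

Exercise: Does $(\vec a \cdot \vec b) \cdot \vec c = \vec a \cdot (\vec b \cdot \vec c)$ hold?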


Figure 2.2: Projection of $\vec a$ on $\vec b$

2.1.2 Vector product

The scalar product of two vectors is a number (scalar). The vector product (also known
as cross product) of two vectors is a vector, concretely:

Definition: Let $\vec a$ and $\vec b$ be vectors. The vector product of $\vec a$ and $\vec b$ is defined as

$$\vec a \times \vec b = \begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix} \times \begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix} = \begin{pmatrix} a_y b_z - a_z b_y \\ a_z b_x - a_x b_z \\ a_x b_y - a_y b_x \end{pmatrix}$$

Properties
1. The vector product $\vec a \times \vec b$ is orthogonal to both $\vec a$ and $\vec b$, given that $\vec a$ and
$\vec b$ aren't parallel (see fig. 2.3). If $\vec a$ and $\vec b$ are parallel then $\vec a \times \vec b = \vec 0$.
2. The vectors $\vec a$, $\vec b$ and $\vec a \times \vec b$ follow the right hand rule (see fig. 2.4).


Figure 2.3: Vector product

3. If $\varphi$ is the angle between $\vec a$ and $\vec b$, we have $|\vec a \times \vec b| = |\vec a| \cdot |\vec b| \cdot \sin(\varphi)$.
Therefore the absolute value of the vector product is equal to the area of
the parallelogram with sides $\vec a$ and $\vec b$ (see fig. 2.3). This means that only
the part of $\vec a$ which is orthogonal to $\vec b$ contributes to $\vec a \times \vec b$.

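Figure 2.4: Right hand rule.

Proof:

1. One calculates $\vec a \cdot (\vec a \times \vec b) = 0$ and $\vec b \cdot (\vec a \times \vec b) = 0$. So it follows that $\vec a$ and $\vec b$ are
orthogonal to $\vec a \times \vec b$.

2. This is not really an obvious fact. The idea is to first show the fact for $\vec a$ and $\vec b$ in the
xy-plane and then to reduce the general case to this.

3. One has $\sin^2(\varphi) = 1 - \cos^2(\varphi)$. Therefore:

$$\begin{aligned}
\left(|\vec a| \cdot |\vec b| \cdot \sin(\varphi)\right)^2 &= |\vec a|^2 \cdot |\vec b|^2 \cdot \left(1 - \cos^2(\varphi)\right) \\
&= |\vec a|^2 \cdot |\vec b|^2 - \left(|\vec a| \cdot |\vec b| \cdot \cos(\varphi)\right)^2 \\
&= |\vec a|^2 \cdot |\vec b|^2 - \left(\vec a \cdot \vec b\right)^2.
\end{aligned}$$

On the other hand, one can calculate explicitly that $|\vec a|^2 \cdot |\vec b|^2 - (\vec a \cdot \vec b)^2 = |\vec a \times \vec b|^2$.
Therefore one gets $|\vec a \times \vec b| = |\vec a| \cdot |\vec b| \cdot \sin(\varphi)$.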
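The explicit calculations mentioned in the proof can be spot-checked for concrete numbers. The sketch below is not a proof, only a check for one illustrative pair of vectors; the helper names are assumptions.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product following the component definition above."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

a = (1, 2, 3)
b = (4, 5, 6)
c = cross(a, b)

print(c)                      # components of a x b
print(dot(a, c), dot(b, c))   # both vanish: c is orthogonal to a and b
# |a x b|^2 compared with |a|^2 |b|^2 - (a . b)^2
print(dot(c, c), dot(a, a) * dot(b, b) - dot(a, b) ** 2)
```

Integer components are used so that the comparison is exact and not affected by rounding.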

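More properties: We summarize the most important rules for calculations involving
the vector product. Let $\vec a$, $\vec b$, $\vec c$ be three vectors and s a real number. Then we have:

$$\vec a \times \vec a = \vec 0$$

$$\vec a \times \vec b = -\left(\vec b \times \vec a\right)$$

$$(s\vec a) \times \vec b = \vec a \times (s\vec b) = s \cdot \left(\vec a \times \vec b\right)$$

$$\vec a \times \left(\vec b + \vec c\right) = \vec a \times \vec b + \vec a \times \vec c$$

$$\left(\vec a + \vec b\right) \times \vec c = \vec a \times \vec c + \vec b \times \vec c$$

Exercise: Does $(\vec a \times \vec b) \times \vec c = \vec a \times (\vec b \times \vec c)$ hold?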


2.2 Differential calculus

Physics without calculus is impossible, since most physical laws in their general formula-
tion use derivatives or integrals. In this chapter, we look at differential calculus. The next
chapter will treat integral calculus. The main reference for this chapter is [2].

2.2.1 Derivative of a function

Given a real function f, which maps a real number to another real number, we are often
interested in the slope of the graph of f at some point $x_0$. By the slope of f, we mean
the slope of the tangent to the graph of f through $f(x_0)$ (see fig. 2.5). How can we
calculate this slope? For this we first think about how to calculate the slope of the secant
through $f(x_0)$ and $f(x_0 + h)$ for some h > 0 (see fig. 2.5): This slope is simply the
difference in height divided by the difference in length, hence

$$\frac{f(x_0 + h) - f(x_0)}{(x_0 + h) - x_0} = \frac{f(x_0 + h) - f(x_0)}{h}.$$

If we make h smaller, the secant approaches the tangent and the slope of the secant
approaches the slope of the tangent. Therefore the slope of the tangent is the limit of the
slope of the secant when h goes to 0, written as $\lim_{h \to 0}$. We call the slope of the tangent
of f at the point $x_0$ the derivative $f'(x_0)$ of f at the point $x_0$.

Definition: Let f be a real function and $x_0$ a real number. Then we define
the derivative of f at the point $x_0$ as

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \tag{2.1}$$

Remarks
• One often also writes $\frac{df}{dx}(x_0)$ for $f'(x_0)$. Then df and dx stand for very small
("infinitesimal") differences $f(x_0 + h) - f(x_0)$ and $(x_0 + h) - x_0$.
• In physics f is often a function of the time t. Then, we often write

$$\dot f(t_0) = \frac{df}{dt}(t_0)$$

for the derivative of f at the point $t_0$.


Figure 2.5: Tangent and secant

• The limit in the definition of the derivative doesn't exist for every function
everywhere. However, we will only work with functions which are "nice enough" and
we will always assume the limit exists everywhere.

• One often calls the derivative $f'(x_0)$ the "instantaneous rate of change" of f at
the point $x_0$, because it tells how fast f is changing at $x_0$. With the help of the
derivative, we can approximate f near $x_0$: For small $\Delta x$, we have

$$f(x_0 + \Delta x) \approx f(x_0) + \Delta x \cdot f'(x_0), \tag{2.2}$$

since $f(x_0) + \Delta x \cdot f'(x_0)$ is the value of the tangent line at the point $x_0 + \Delta x$ (see
fig. 2.6).


Figure 2.6: Derivative as approximation

Example:
We calculate the derivative of the function $f(x) = x^2$ at the point $x_0 = 1$. So we
calculate

$$\begin{aligned}
f'(1) = f'(x_0) &= \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = \lim_{h \to 0} \frac{(1 + h)^2 - 1^2}{h} \\
&= \lim_{h \to 0} \frac{1 + 2h + h^2 - 1}{h} = \lim_{h \to 0} \frac{2h + h^2}{h} \\
&= \lim_{h \to 0} (2 + h) = 2.
\end{aligned}$$
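The limit in this example can be watched numerically: the secant slope equals 2 + h and approaches 2 as h shrinks. A minimal sketch (the function name and the chosen step sizes are illustrative):

```python
def secant_slope(f, x0, h):
    """Slope of the secant through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

def square(x):
    return x ** 2

for h in (1.0, 0.1, 0.01, 0.001):
    print(h, secant_slope(square, 1.0, h))
```

For very small h (far below the values used here) floating-point rounding starts to dominate, so numerically one cannot make h arbitrarily small.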

Mostly, we do not only want to know the derivative of f at some point $x_0$ (as in the
example), but generally at an arbitrary point. We define the derivative function of f
(mostly, we only say derivative of f) as the function $f' = \frac{df}{dx}$, which maps every real
number x to the derivative $f'(x)$ of f at the point x. Therefore the derivative $f'$ of
a function f is again a function. We also say "differentiate" instead of "calculating the
derivative".


Example:
We calculate the derivative function of $f(x) = x^2$. We proceed just as before, but
we write x instead of $x_0 = 1$:

$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2hx + h^2 - x^2}{h} = \lim_{h \to 0} \frac{2hx + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) = 2x.
\end{aligned}$$

So the derivative of $f(x) = x^2$ is $f'(x) = 2x$.

One can show analogously that for a positive integer n, the derivative of $f(x) = x^n$ is
the function $f'(x) = n \cdot x^{n-1}$. For example for $f(x) = x^{25}$, we have $f'(x) = 25 \cdot x^{24}$,
or for $g(x) = x$ we have $g'(x) = 1$ (this can be verified immediately, since the slope of
$g(x) = x$ is equal to 1 everywhere).

2.2.2 Differentiation rules

Often it is not necessary to calculate the derivative explicitly using the limit (2.1), since
often the function is a combination of simpler functions, whose derivatives are already
known. In this chapter we look at the corresponding differentiation rules.

Factor rule: Let s be a real number and g a function. If $f(x) = s \cdot g(x)$,
then the derivative is $f'(x) = s \cdot g'(x)$.

Sum rule: Let g and k be functions. If $f(x) = g(x) + k(x)$, then $f'(x) = g'(x) + k'(x)$.


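Proof:
• Let $f(x) = s \cdot g(x)$. Then we have

$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{s \cdot g(x + h) - s \cdot g(x)}{h} \\
&= \lim_{h \to 0} \left( s \cdot \frac{g(x + h) - g(x)}{h} \right) = s \cdot \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} \\
&= s \cdot g'(x).
\end{aligned}$$

• Let $f(x) = g(x) + k(x)$. Then we have

$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{g(x + h) + k(x + h) - (g(x) + k(x))}{h} \\
&= \lim_{h \to 0} \frac{g(x + h) - g(x) + k(x + h) - k(x)}{h} \\
&= \lim_{h \to 0} \left( \frac{g(x + h) - g(x)}{h} + \frac{k(x + h) - k(x)}{h} \right) \\
&= \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} + \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} = g'(x) + k'(x).
\end{aligned}$$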

Example:
We calculate the derivative of $f(x) = 3x^4 - 2x^2$. We set $g(x) = 3x^4$ and $k(x) = -2x^2$,
so we have $f(x) = g(x) + k(x)$. Using the factor rule (and the rule for
derivatives of powers in the last section), we get $g'(x) = 3 \cdot 4 \cdot x^3 = 12x^3$ and
$k'(x) = (-2) \cdot 2 \cdot x = -4x$. Using the sum rule, we get $f'(x) = g'(x) + k'(x) = 12x^3 - 4x$.
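A derivative obtained with the rules can be cross-checked numerically. The sketch below compares the result of this example with a central difference quotient; the step size h = 1e-5 and the sample points are assumed, illustrative choices.

```python
def central_difference(f, x, h=1e-5):
    """Numerical estimate of f'(x) from a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    return 3 * x ** 4 - 2 * x ** 2

def f_prime(x):
    # Result obtained above with the factor and sum rules
    return 12 * x ** 3 - 4 * x

for x in (-2.0, 0.5, 1.0, 3.0):
    print(x, central_difference(f, x), f_prime(x))
```

The symmetric quotient is used instead of the one-sided one from definition (2.1) because its error shrinks faster with h; both converge to the same derivative.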

We are now able to differentiate sums (and also differences using the factor rule). We
also want to differentiate products and quotients. The first guess $(g(x) \cdot k(x))' = g'(x) \cdot k'(x)$
can easily be shown to be wrong, for example by looking at $g(x) = k(x) = x$.
The correct rules are:


Product rule: Let g and k be functions. If $f(x) = g(x) \cdot k(x)$, then we have

$$f'(x) = g'(x) \cdot k(x) + g(x) \cdot k'(x).$$

Quotient rule: Let g and k be functions. If $f(x) = \frac{g(x)}{k(x)}$, then we have

$$f'(x) = \frac{g'(x) \cdot k(x) - g(x) \cdot k'(x)}{(k(x))^2}.$$

Proof:
• Let $f(x) = g(x) \cdot k(x)$. Then we have

$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{g(x + h) \cdot k(x + h) - g(x) \cdot k(x)}{h} \\
&= \lim_{h \to 0} \frac{g(x + h) \cdot k(x + h) - g(x + h) \cdot k(x) + g(x + h) \cdot k(x) - g(x) \cdot k(x)}{h} \\
&= \lim_{h \to 0} \frac{k(x) \cdot (g(x + h) - g(x)) + g(x + h) \cdot (k(x + h) - k(x))}{h} \\
&= \lim_{h \to 0} \frac{k(x) \cdot (g(x + h) - g(x))}{h} + \lim_{h \to 0} \frac{g(x + h) \cdot (k(x + h) - k(x))}{h} \\
&= k(x) \cdot \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} + \lim_{h \to 0} g(x + h) \cdot \lim_{h \to 0} \frac{k(x + h) - k(x)}{h} \\
&= g'(x) \cdot k(x) + g(x) \cdot k'(x)
\end{aligned}$$

• The quotient rule follows from the product rule: Let $f(x) = \frac{g(x)}{k(x)}$. Then we have
$g(x) = f(x) \cdot k(x)$. Therefore by the product rule $g'(x) = f'(x) \cdot k(x) + f(x) \cdot k'(x)$. If we
solve for $f'(x)$ and plug in $f(x) = \frac{g(x)}{k(x)}$, we find

$$\begin{aligned}
f'(x) &= \frac{g'(x) - f(x) \cdot k'(x)}{k(x)} \\
&= \frac{g'(x) - \frac{g(x)}{k(x)} \cdot k'(x)}{k(x)} \\
&= \frac{g'(x) \cdot k(x) - g(x) \cdot k'(x)}{(k(x))^2}.
\end{aligned}$$


Example:
1. We calculate the derivative of $f(x) = \frac{x-1}{x+1}$: We set $g(x) = x - 1$ and $k(x) = x + 1$. Then we have $f(x) = \frac{g(x)}{k(x)}$. Furthermore $g'(x) = 1$ and $k'(x) = 1$. Therefore by the quotient rule:
$$f'(x) = \frac{g'(x) \cdot k(x) - g(x) \cdot k'(x)}{(k(x))^2} = \frac{1 \cdot (x+1) - (x-1) \cdot 1}{(x+1)^2} = \frac{2}{(x+1)^2}.$$

2. For a positive integer $n$, let $f(x) = \frac{1}{x^n} = x^{-n}$. Then we have by the quotient rule
$$
\begin{aligned}
f'(x) &= \frac{(1)' \cdot x^n - 1 \cdot (x^n)'}{(x^n)^2} = \frac{0 \cdot x^n - 1 \cdot n \cdot x^{n-1}}{x^{2n}} \\
&= -n \cdot x^{n-1-2n} = -n \cdot x^{-n-1} = -n \cdot \frac{1}{x^{n+1}}.
\end{aligned}
$$
In particular, we have found that the rule $(x^n)' = n \cdot x^{n-1}$ also holds for negative integers $n$.
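A result obtained with the quotient rule can be sanity-checked numerically by comparing it with a difference quotient for a small h. The following is only a minimal sketch for the first example; the helper name `central_difference` and the step size `h` are illustrative choices, not part of the script.

```python
def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    # The function from example 1.
    return (x - 1) / (x + 1)


def f_prime(x):
    # The derivative obtained with the quotient rule in example 1.
    return 2 / (x + 1) ** 2


for x in (0.0, 0.5, 2.0):
    # Both columns should agree up to a small numerical error.
    print(x, central_difference(f, x), f_prime(x))
```

The symmetric quotient is used here because its error shrinks faster with h than the one-sided quotient from the definition; the one-sided version would work as well.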

We are now able to differentiate all polynomial functions and all fractions of polynomial functions. However, functions often appear as compositions of two functions. For example, the function $f(x) = (x^2 - 3x + 13)^{17}$ can be written as $f(x) = u(v(x))$, where $u(y) = y^{17}$ and $v(x) = x^2 - 3x + 13$. Theoretically, we could calculate $f'$ by expanding $(x^2 - 3x + 13)^{17}$ and applying the above rules, but that would not be a very satisfying solution. We now formulate the chain rule, which deals with such compositions.

Chain rule: Let $u$ and $v$ be functions and $f(x) = u(v(x))$. Then
$$f'(x) = u'(v(x)) \cdot v'(x).$$

It is easy to remember the chain rule using the notation $f' = \frac{df}{dx}$: We formally "expand the fraction" and get
$$f' = \frac{df}{dx} = \frac{du(v)}{dx} = \frac{du(v)}{dv} \cdot \frac{dv}{dx} = u'(v) \cdot v'.$$

Proof: Let $f(x) = u(v(x))$. Then we have
$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{u(v(x+h)) - u(v(x))}{h} \\
&= \lim_{h \to 0} \left( \frac{u(v(x+h)) - u(v(x))}{v(x+h) - v(x)} \cdot \frac{v(x+h) - v(x)}{h} \right) \\
&= \lim_{h \to 0} \frac{u(v(x+h)) - u(v(x))}{v(x+h) - v(x)} \cdot \lim_{h \to 0} \frac{v(x+h) - v(x)}{h} \\
&= \lim_{h \to 0} \frac{u(v(x+h)) - u(v(x))}{v(x+h) - v(x)} \cdot v'(x).
\end{aligned}
$$

We now define $\hat{h} = v(x+h) - v(x)$. For $h \to 0$, we also have $\hat{h} \to 0$ (because we assumed $v$ to be "nice enough"). Therefore
$$
\begin{aligned}
\lim_{h \to 0} \frac{u(v(x+h)) - u(v(x))}{v(x+h) - v(x)} &= \lim_{h \to 0} \frac{u(v(x) + (v(x+h) - v(x))) - u(v(x))}{v(x+h) - v(x)} \\
&= \lim_{\hat{h} \to 0} \frac{u(v(x) + \hat{h}) - u(v(x))}{\hat{h}} = u'(v(x)).
\end{aligned}
$$

We get $f'(x) = u'(v(x)) \cdot v'(x)$.

Example:
We calculate the derivative of $f(x) = (x^2 - 3x + 13)^{17}$. We have $f(x) = u(v(x))$, where $u(y) = y^{17}$ and $v(x) = x^2 - 3x + 13$. We differentiate $u$ and $v$ and get $u'(y) = 17y^{16}$ and $v'(x) = 2x - 3$. Therefore $f'(x) = u'(v(x)) \cdot v'(x) = 17 \cdot (x^2 - 3x + 13)^{16} \cdot (2x - 3)$.
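The structure of the chain rule, an outer derivative evaluated at the inner function times the inner derivative, can be mirrored directly in code. The sketch below builds f' from the pieces u, u', v, v' of this example and compares it with a difference quotient at one point; the evaluation point x = 1 and the step size are arbitrary choices made for illustration.

```python
def u(y):
    return y ** 17


def u_prime(y):
    return 17 * y ** 16


def v(x):
    return x ** 2 - 3 * x + 13


def v_prime(x):
    return 2 * x - 3


def f(x):
    # Composition f(x) = u(v(x)).
    return u(v(x))


def f_prime(x):
    # Chain rule: outer derivative at v(x) times inner derivative.
    return u_prime(v(x)) * v_prime(x)


x, h = 1.0, 1e-6
numeric = (f(x + h) - f(x - h)) / (2 * h)
# The values are huge (about 10**17), so compare relative to the size of f'.
relative_error = abs(numeric - f_prime(x)) / abs(f_prime(x))
print(f_prime(x), numeric, relative_error)
```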


2.2.3 More derivatives

We want to differentiate more functions. We start with the trigonometric functions sine, cosine and tangent. We will always measure angles in radians and not in degrees.² The graphs of sine and cosine are presented in fig. 2.7.

Figure 2.7: The graphs of sine and cosine

We now have a closer look at the graph of the sine function $f(x) = \sin(x)$ (see fig. 2.7). The slope at the point $0$ seems to be approximately $f'(0) \approx 1$. At the point $\frac{\pi}{2}$, the sine function attains its maximum, therefore the derivative is $f'\left(\frac{\pi}{2}\right) = 0$. At the point $\pi$, the derivative is approximately $f'(\pi) \approx -1$, at the point $\frac{3\pi}{2}$ it is again $f'\left(\frac{3\pi}{2}\right) = 0$, etc. We can draw the derivative function (see fig. 2.8). If we compare the figures 2.7 and 2.8, we conjecture that the derivative of the sine function is $f'(x) = \cos(x)$.

One can indeed prove this, but we do not do this here, since the proof is quite technical. Analogously, one can find the derivative of $g(x) = \cos(x)$: It is $g'(x) = -\sin(x)$.

² Repetition: The magnitude of an angle in radians is the length of the corresponding arc of the unit circle. For example, $360^\circ$ corresponds to $2\pi$ in radians and $180^\circ$ corresponds to $\pi$. In general, $y^\circ$ corresponds to
$$\frac{y^\circ}{360^\circ} \cdot 2\pi = \frac{y^\circ}{180^\circ} \cdot \pi$$
in radians.

Figure 2.8: Derivative of the sine function

For the tangent, we use
$$\tan(x) = \frac{\sin(x)}{\cos(x)}$$
and apply the quotient rule. Then we have (where we use that $\sin^2(x) + \cos^2(x) = 1$):
$$
\begin{aligned}
\tan'(x) &= \frac{\sin'(x) \cdot \cos(x) - \sin(x) \cdot \cos'(x)}{(\cos(x))^2} \\
&= \frac{\cos(x) \cdot \cos(x) - \sin(x) \cdot (-\sin(x))}{\cos^2(x)} \\
&= \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)}.
\end{aligned}
$$
Alternatively, we can also write this differently and get:
$$\tan'(x) = \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} = 1 + \frac{\sin^2(x)}{\cos^2(x)} = 1 + \tan^2(x).$$
Finally, we have
$$\tan'(x) = \frac{1}{\cos^2(x)} = 1 + \tan^2(x).$$
To summarize:

Derivatives of trigonometric functions:
$$
\begin{aligned}
\sin'(x) &= \cos(x) \\
\cos'(x) &= -\sin(x) \\
\tan'(x) &= \frac{1}{\cos^2(x)} = 1 + \tan^2(x)
\end{aligned}
$$


Example:
We calculate the derivative of $f(x) = \sin(x^2) + (\sin(x))^2$. For $g(x) = \sin(x^2)$ we get using the chain rule:
$$g'(x) = \sin'(x^2) \cdot 2x = 2x \cdot \cos(x^2).$$
For $k(x) = (\sin(x))^2$ we also get using the chain rule
$$k'(x) = 2 \cdot \sin(x) \cdot \sin'(x) = 2 \sin(x) \cdot \cos(x).$$
Using the sum rule, we finally get
$$f'(x) = g'(x) + k'(x) = 2x \cdot \cos(x^2) + 2 \sin(x) \cdot \cos(x).$$

Exponential function: We now want to find the derivative of the exponential function
$$f(x) = a^x,$$
where we should have $a > 0$. We start with the limit (2.1). Using the power rules, we get:
$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{a^{x+h} - a^x}{h} \\
&= \lim_{h \to 0} \frac{a^x \cdot a^h - a^x}{h} = \lim_{h \to 0} \left( \frac{a^h - 1}{h} \cdot a^x \right) \\
&= \left( \lim_{h \to 0} \frac{a^h - 1}{h} \right) \cdot a^x. \qquad (2.3)
\end{aligned}
$$

We see that the limit
$$\lim_{h \to 0} \frac{a^h - 1}{h} \qquad (2.4)$$
does not depend on $x$ anymore, so it is simply a number.
Exercise: What is the geometric meaning of the limit (2.4)?
We want to understand this limit better and calculate it approximately for $a = 2$ and $a = 3$ by plugging in small numbers for $h$ (see table 2.1).

                 h = 0.1     h = 0.01    h = 0.001
(2^h − 1)/h      0.717...    0.695...    0.693...
(3^h − 1)/h      1.161...    1.104...    1.099...

Table 2.1: Calculation of the limit (2.4) for a = 2, 3

For $a = 2$, the limit (2.4) is smaller than 1 and for $a = 3$ it is bigger than 1. Therefore there exists a number $e$, with $2 < e < 3$, such that
$$\lim_{h \to 0} \frac{e^h - 1}{h} = 1.$$
We therefore find with equation (2.3) that
$$\frac{d}{dx} e^x = e^x.$$

The derivative of $e^x$ is again $e^x$. One can calculate that
$$e = 2.7182...$$

The number $e$ is called Euler's number and plays an important role in mathematics.

More about e: We defined $e$ as the number that satisfies $\lim_{h \to 0} \frac{e^h - 1}{h} = 1$. We want to find a better representation for $e$. By the definition of the limit, we get for very small $h$
$$\frac{e^h - 1}{h} \approx 1.$$
Therefore $e^h \approx 1 + h$, respectively
$$e \approx (1 + h)^{\frac{1}{h}}.$$
If we take the limit for $h \to 0$, the $\approx$ becomes an $=$ again. If we set $h = \frac{1}{n}$, we find
$$e = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n.$$

This is a widely used representation for $e$ and is often also seen as the definition of $e$.
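Both limits in this part can be explored numerically. The following sketch evaluates the difference quotient (2.4) for a = 2, a = 3 and a = e at the step sizes used in table 2.1, and then the expression (1 + 1/n)^n for growing n. The function name `difference_quotient` is an illustrative choice; printed values are rounded, whereas the table truncates them, so the last digit may differ.

```python
import math


def difference_quotient(a, h):
    """The quotient (a^h - 1) / h whose limit for h -> 0 is the number (2.4)."""
    return (a ** h - 1) / h


# Quotients for the bases from table 2.1, plus a = e where the limit is 1.
for a in (2, 3, math.e):
    values = [round(difference_quotient(a, h), 3) for h in (0.1, 0.01, 0.001)]
    print(a, values)

# The representation of e as a limit: (1 + 1/n)^n approaches e for large n.
for n in (10, 1000, 100000):
    print(n, (1 + 1 / n) ** n, math.e)
```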


We come back to the question about the derivative of $a^x$. For this, we define the natural logarithm
$$\ln(x) = \log_e(x).$$

This means that the number $u = \ln(x)$ satisfies the equation
$$e^u = e^{\ln(x)} = x.$$

We can use this as follows: Let $a > 0$ and $f(x) = a^x$. With the power rules, we find
$$f(x) = a^x = \left( e^{\ln(a)} \right)^x = e^{\ln(a) \cdot x}.$$

With the chain rule:
$$f'(x) = e^{\ln(a) \cdot x} \cdot \ln(a) = \left( e^{\ln(a)} \right)^x \cdot \ln(a) = a^x \cdot \ln(a).$$

The derivative of $a^x$ therefore is $\ln(a) \cdot a^x$.

An interesting result is the derivative of $\ln(x)$. We have
$$\frac{d}{dx} \ln(x) = \frac{1}{x}.$$

Proof: We use the identity $x = e^{\ln(x)}$. If we differentiate this on both sides, we get using the chain rule
$$1 = \frac{d}{dx} x = \frac{d}{dx} e^{\ln(x)} = e^{\ln(x)} \cdot \frac{d}{dx} \ln(x) = x \cdot \frac{d}{dx} \ln(x).$$
Thus
$$\frac{d}{dx} \ln(x) = \frac{1}{x}.$$


Example:

1. We calculate the derivative of $f(x) = 2^{3x}$. Using the chain rule, we get $f'(x) = 3 \cdot \ln(2) \cdot 2^{3x}$.

2. We calculate the derivative of $f(x) = x^r$ for any real number $r$. We notice that
$$f(x) = x^r = \left( e^{\ln(x)} \right)^r = e^{\ln(x) \cdot r}.$$
We set $u(y) = e^y$ and $v(x) = \ln(x) \cdot r$. Therefore we have $f(x) = u(v(x))$. By the chain rule, we get
$$f'(x) = u'(v(x)) \cdot v'(x) = e^{\ln(x) \cdot r} \cdot r \cdot \frac{1}{x} = r \cdot \frac{x^r}{x} = r \cdot x^{r-1}.$$

2.2.4 Overview about derivatives

We summarize in table 2.2 the most important functions and their derivatives. They
appear very often and one should know them by heart.

f(x)        f'(x)
x^r         r · x^(r−1)
e^x         e^x
a^x         ln(a) · a^x
ln(x)       1/x
sin(x)      cos(x)
cos(x)      −sin(x)
tan(x)      1/cos²(x) = 1 + tan²(x)

Table 2.2: Derivatives of the most important functions


2.2.5 Higher derivatives

We define the second derivative $f''(x)$ of a function $f(x)$ as the derivative of $f'(x)$. Analogously, we define the third derivative $f'''(x)$ as the derivative of the second derivative $f''(x)$, etc. In general, we define the $n$-th derivative $f^{(n)}(x)$ recursively as the derivative of the $(n-1)$-th derivative $f^{(n-1)}(x)$.

Example:

1. Let $f(x) = e^x$. Then
$$f''(x) = \frac{d}{dx} (f'(x)) = \frac{d}{dx} (e^x) = e^x.$$

2. Let $g(x) = \sin(x)$. Then
$$g''(x) = \frac{d}{dx} (g'(x)) = \frac{d}{dx} (\cos(x)) = -\sin(x).$$

3. Let $k(x) = \frac{1}{2} x^2$. Then
$$k''(x) = \frac{d}{dx} (k'(x)) = \frac{d}{dx} (x) = 1.$$

2.2.6 Taylor approximation

We have seen at the very beginning (see equation (2.2)) that the derivative of a function $f$ at a point $x_0$ can be used to approximate the function $f$ at $x$ around $x_0$: One just calculates the value of the linear function that is tangent to $f$ at $x_0$, i.e. at $x$ around $x_0$ one approximates
$$f(x) \approx f(x_0) + (x - x_0) \cdot f'(x_0) =: T_{1,f,x_0}(x).$$

This is the best linear approximation of the function $f$ near $x_0$.

This is already a special case (and by far the most important!) of a concept called Taylor approximation. The idea is the following: Why restrict to linear approximations? How about the best quadratic approximation? For this we look for a quadratic function $T_{2,f,x_0}(x)$ such that at $x_0$ it goes through $f(x_0)$ and has the same first and second derivative as $f$ at $x_0$. It is not hard to show that $T_{2,f,x_0}(x)$ is given by³
$$T_{2,f,x_0}(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2.$$

Figure 2.9: Graphs of $f$, $T_{1,f,x_0}$ and $T_{2,f,x_0}$

³ This is essentially because a quadratic function has three "degrees of freedom", i.e. three parameters can be chosen freely.

Similarly, we can find the polynomial $T_{n,f,x_0}(x)$ of degree $n$ which approximates $f$ best around $x_0$, i.e. the polynomial of degree $n$ such that for all $k \le n$
$$T_{n,f,x_0}^{(k)}(x_0) = f^{(k)}(x_0).$$

Definition: This polynomial $T_{n,f,x_0}(x)$ is called the Taylor polynomial of degree $n$ of $f$ at $x_0$ and is given by
$$
\begin{aligned}
T_{n,f,x_0}(x) &= f(x_0) + \sum_{k=1}^{n} \frac{1}{k!} f^{(k)}(x_0)(x - x_0)^k \\
&= f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \ldots + \frac{1}{n!} f^{(n)}(x_0)(x - x_0)^n,
\end{aligned}
$$
where $n! = n \cdot (n-1) \cdot \ldots \cdot 2 \cdot 1$. (We define $0! = 1$.)

Figure 2.10 shows the Taylor polynomials up to degree 5 for the same function f as
above.

Figure 2.10: Taylor polynomials of f (solid line) up to degree 5.

Example:

1. Look at $f(x) = e^x$ and $x_0 = 0$. Since $\frac{d}{dx} e^x = e^x$, we have $f^{(n)}(x) = e^x$ for all $n$. Thus $f^{(n)}(x_0) = e^0 = 1$ for all $n$. Thus
$$T_{n,f,0}(x) = 1 + \sum_{k=1}^{n} \frac{1}{k!} x^k$$
is the best polynomial approximation for $e^x$ of degree $n$ around $0$. Most importantly, we have
$$e^x \approx 1 + x$$
for $x$ around $0$.

2. Look at $f(x) = \sin(x)$ and $x_0 = 0$. We have $f(0) = \sin(0) = 0$. Then $f'(x) = \cos(x)$, so $f'(0) = 1$ and $f''(x) = -\sin(x)$, so $f''(0) = 0$. Thus
$$T_{2,f,0}(x) = f(0) + f'(0)x + \frac{1}{2} f''(0)x^2 = x,$$
in particular for $x$ around $0$,
$$\sin(x) \approx x.$$

3. Look at $f(x) = \cos(x)$ and $x_0 = 0$. We have $f(0) = \cos(0) = 1$. Furthermore $f'(x) = -\sin(x)$, so $f'(0) = -\sin(0) = 0$ and $f''(x) = -\cos(x)$, so $f''(0) = -\cos(0) = -1$. Thus
$$T_{2,f,0}(x) = f(0) + f'(0)x + \frac{1}{2} f''(0)x^2 = 1 - \frac{x^2}{2},$$
which means that for $x$ around $0$,
$$\cos(x) \approx 1 - \frac{x^2}{2}.$$

One can actually state more precisely how accurately the function $f$ is approximated by $T_{n,f,x_0}$: It holds that (if $f$ satisfies certain conditions⁴)
$$f(x) = T_{n,f,x_0}(x) + O(|x - x_0|^{n+1}).$$

$O(|x - x_0|^{n+1})$ means that the approximation error is of order $|x - x_0|^{n+1}$, i.e. there is a constant $C > 0$ such that when $|x - x_0|$ is small enough,
$$|f(x) - T_{n,f,x_0}(x)| \le C \cdot |x - x_0|^{n+1}.$$

⁴ that we always assume to hold...
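The order of the approximation error can be observed numerically. The sketch below evaluates the Taylor polynomials of e^x around 0 from the first example and prints the error divided by x^(n+1); if the error is of order |x|^(n+1), this ratio should stay roughly constant as x shrinks. The function name `taylor_exp` and the sample points are illustrative assumptions.

```python
import math


def taylor_exp(x, n):
    """Taylor polynomial T_{n,f,0}(x) of f(x) = e^x around x0 = 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


for n in (1, 2):
    for x in (0.1, 0.01):
        error = abs(math.exp(x) - taylor_exp(x, n))
        # For an error of order x^(n+1), this ratio is bounded by some constant C.
        ratio = error / x ** (n + 1)
        print(f"n={n}, x={x}, error={error:.3e}, error/x^(n+1)={ratio:.4f}")
```

Reducing x by a factor of 10 should shrink the error by roughly a factor of 100 for n = 1 and 1000 for n = 2.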


2.2.7 The idea of differential equations

Differential equations are (functional) equations that also contain derivatives of the functions. The solutions of a differential equation are functions. Differential equations are crucial for physics since many physical laws can be formulated using differential equations.
Let us start with an example:
$$f'(x) = f(x).$$
This equation means that we are looking for a function $f$ such that its derivative $f'$ is equal to $f$. We know already that $f(x) = e^x$ is a solution to this differential equation. But actually, for every $c \in \mathbb{R}$, also $f(x) = c e^x$ is a solution. So the solution to the above differential equation is not unique. To ensure uniqueness of the solution, we also need to fix the value of $f$ at a point, e.g. $x = 0$. If we then present the differential equation as
$$f'(x) = f(x), \quad f(0) = 3,$$
(a so-called initial value problem) the solution is given by $f(x) = 3e^x$. One can show that this solution is actually the unique solution.
We do not develop the theory of differential equations in this script (there are books on this topic...). We just mention the general result that, if the differential equation does not behave too badly, then for any given initial value there always exists a unique solution.

Example:

1. We slightly generalize our first example. Let $c \in \mathbb{R}$ and $f_0 > 0$, and let
$$f'(x) = c f(x), \quad f(0) = f_0.$$
Then the solution is given by
$$f(x) = f_0 e^{cx}.$$

2. Look at the initial value problem
$$f'(x) = 1 + f(x)^2, \quad f(0) = 0.$$
We already know that
$$f(x) = \tan(x)$$
is a solution.

3. We now look at a second order differential equation (i.e. also second derivatives appear). To make the solution unique, we need two initial conditions, for example on $f(0)$ and $f'(0)$. Let $\alpha > 0$ and $f_0 \in \mathbb{R}$, and let
$$f''(x) = -\alpha^2 f(x), \quad f(0) = f_0, \quad f'(0) = 0.$$
This is the differential equation corresponding to a harmonic oscillator. The solution to this differential equation is given by
$$f(x) = f_0 \cos(\alpha x).$$

4. We generalize this equation a bit: Let $\alpha > 0$ again, and let
$$f''(x) + 2\alpha f'(x) + \alpha^2 f(x) = 0.$$
Then the solution is given by
$$f(x) = (b_1 + b_2 x) e^{-\alpha x},$$
where $b_1$ and $b_2$ are chosen according to the initial conditions. This corresponds to the case of critical damping of a harmonic oscillator.
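Initial value problems like the first-order ones above can also be approximated numerically, which gives a way to check a claimed solution. The sketch below uses the explicit Euler method, a technique not covered in this script: it repeatedly applies the linear approximation f(x + h) ≈ f(x) + h · f'(x) with a small step h. The function name `euler`, the end point x = 1 and the number of steps are assumptions made for illustration.

```python
import math


def euler(rhs, f0, x_end, steps):
    """Approximate f(x_end) for f'(x) = rhs(x, f(x)), f(0) = f0 (explicit Euler)."""
    h = x_end / steps
    x, f = 0.0, f0
    for _ in range(steps):
        f += h * rhs(x, f)  # linear approximation over one small step
        x += h
    return f


# f'(x) = f(x), f(0) = 3 should reproduce f(1) = 3 * e.
approx_exp = euler(lambda x, f: f, 3.0, 1.0, 100_000)
print(approx_exp, 3 * math.exp(1.0))

# f'(x) = 1 + f(x)^2, f(0) = 0 should reproduce f(1) = tan(1).
approx_tan = euler(lambda x, f: 1 + f * f, 0.0, 1.0, 100_000)
print(approx_tan, math.tan(1.0))
```

Euler's method is only first-order accurate, so many small steps are needed; it is used here because it follows directly from the linear approximation discussed earlier.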

2.3 Integral calculus

Roughly speaking, integral calculus deals with the area enclosed by the graphs of func-
tions. We will see that differential and integral calculus are closely related. The main
reference for this chapter is [2].

2.3.1 Antiderivatives

Definition: An antiderivative of a function $f$ is a function $F$ which satisfies
$$f(x) = F'(x) = \frac{dF}{dx}(x).$$

Calculating an antiderivative is therefore the reverse of calculating the derivative. For example, an antiderivative of $f(x) = x^2$ is the function $F(x) = \frac{1}{3} x^3$, since $F'(x) = \frac{1}{3} \cdot 3 \cdot x^2 = x^2 = f(x)$. However, the antiderivative $F$ is not unique. For example, the function $G(x) = \frac{1}{3} x^3 + 42$ also satisfies $G'(x) = f(x)$.
In general: If $f$ is a function and $F$ an antiderivative of $f$, then for any real number $c$, also $F(x) + c$ is an antiderivative of $f$. This follows because of
$$\frac{d}{dx} (F(x) + c) = \frac{d}{dx} F(x) + 0 = F'(x) = f(x).$$
It even holds that for two arbitrary antiderivatives $F$ and $G$ of $f$ there exists a constant $c$ such that
$$G(x) = F(x) + c.$$
(If $F$ and $G$ are antiderivatives of $f$, then the derivative of the difference $F(x) - G(x)$ is
$$\frac{d}{dx} (F(x) - G(x)) = \frac{d}{dx} F(x) - \frac{d}{dx} G(x) = f(x) - f(x) = 0.$$
Therefore $F(x) - G(x)$ is constant.)

Antiderivatives of many functions can be easily calculated. We can essentially interchange the two columns of table 2.2 and get table 2.3. In the table, only one antiderivative is listed for each function. One obtains all the other antiderivatives by adding the corresponding constant $c$.

f(x)                F(x)
x^r for r ≠ −1      x^(r+1) / (r+1)
1/x                 ln|x|
e^x                 e^x
a^x                 a^x / ln(a)
sin(x)              −cos(x)
cos(x)              sin(x)

Table 2.3: Antiderivatives of the most important functions

We have a factor rule and a sum rule for antiderivatives. They correspond to the factor and sum rules for derivatives.

Factor rule: Let $f$ be a function with antiderivative $F$ and let $s$ be a real number. Then $s \cdot F$ is an antiderivative of $s \cdot f$.

Sum rule: Let $f$ and $g$ be functions with antiderivatives $F$ and $G$. Then $F + G$ is an antiderivative of $f + g$.

2.3.2 Integral as an area

We define the integral of a function f as the ”signed area” between the graph of f and
the x-axis.

Definition: Let $f$ be a function and let $a$ and $b$ be real numbers with $a \le b$. Then the integral of $f$ from $a$ to $b$
$$\int_a^b f(x)\,dx$$
is the area between the graph of $f$ and the x-axis in the region between $a$ and $b$, where the area under the x-axis counts negative (see fig. 2.11).

So the integral is a real number.

Figure 2.11: The integral $\int_a^b f(x)\,dx$

In the example in figure 2.11, the integral is
$$\int_a^b f(x)\,dx = A - B + C.$$


Example:
We calculate
$$\int_1^2 f(x)\,dx$$
for the function $f(x) = x$. We look at figure 2.12 and see that the area under the graph between 1 and 2 is a trapezoid with height $(2 - 1)$ and side lengths 1 and 2.

Figure 2.12: The integral $\int_1^2 x\,dx$

Therefore the area of this trapezoid is
$$\int_1^2 x\,dx = \frac{1 + 2}{2} \cdot (2 - 1) = \frac{3}{2}.$$

For $a > b$, we define
$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx.$$

In this case, the area above the x-axis counts negative and the area under the x-axis positive.
For arbitrary $a$, $b$ and $c$, we then have
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx. \qquad (2.5)$$

This follows because one can simply add the corresponding areas. This property is called
”interval additivity”.

Approximation of integrals using Riemann sums: Let us mention a useful way to approximate integrals, which can be generalized later. One can calculate the integral $\int_a^b f(x)\,dx$ of a function $f$ by approximating it using so-called Riemann sums: One divides the interval $[a, b]$ using a sequence of points $a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b$ and calculates the area of the rectangles with width $\Delta x_k := x_k - x_{k-1}$ and height $f(x_k)$ for all $k = 1, \ldots, n$ (see figure 2.13). The sum of the areas of the rectangles is
$$\sum_{k=1}^{n} f(x_k) \cdot (x_k - x_{k-1}) = \sum_{k=1}^{n} f(x_k) \cdot \Delta x_k.$$

If one makes the partition $a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b$ finer (i.e. increases $n$), then the sum of the areas of the rectangles will converge to the integral
$$\int_a^b f(x)\,dx.$$
The notation of the integral is motivated by this approximation: In the limit, the $\sum$ becomes an $\int$ and the $\Delta x_k$ becomes a $dx$.
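The Riemann sum translates almost literally into a short program. The sketch below uses an equidistant partition and the right end point of each subinterval as the height, as in the formula above, and applies it to the trapezoid example with f(x) = x on [1, 2]. The function name `riemann_sum` and the chosen values of n are illustrative.

```python
def riemann_sum(f, a, b, n):
    """Sum of f(x_k) * dx over an equidistant partition of [a, b] with n pieces."""
    dx = (b - a) / n
    # x_k = a + k * dx for k = 1, ..., n are the right end points.
    return sum(f(a + k * dx) * dx for k in range(1, n + 1))


# The integral of f(x) = x from 1 to 2 is 3/2 (area of the trapezoid).
for n in (10, 100, 1000):
    print(n, riemann_sum(lambda x: x, 1.0, 2.0, n))
```

For this particular function the sum can be worked out by hand as 3/2 + 1/(2n), so the printed values approach 3/2 from above as n grows.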

2.3.3 Fundamental theorem of calculus

In this section, we are going to connect antiderivatives and integrals. The idea is to vary the upper limit of the integral. In this way, one gets a function of the upper limit of the integral. Let $f$ be a function and $a$ an arbitrary number. Then we define
$$I_{f,a}(x) = \int_a^x f(t)\,dt.$$

Figure 2.13: Approximation of integral with sums

Since $x$ already appears in the limits of the integral, we have to take another integration variable $t$. The expression $I_{f,a}(x)$ is independent of $t$, which is just a "dummy" variable. Of course, we have for any real number $b$ that
$$I_{f,a}(b) = \int_a^b f(t)\,dt = \int_a^b f(x)\,dx.$$

Example:
Let $f(x) = x$. We calculate $I_{f,1}(x)$. We can proceed analogously to the example above; we only have to replace 2 by $x$. Then we have
$$I_{f,1}(x) = \int_1^x f(t)\,dt = \int_1^x t\,dt = \frac{1 + x}{2} \cdot (x - 1) = \frac{x^2 - 1}{2}.$$
If we differentiate $I_{f,1}(x) = \frac{x^2 - 1}{2}$ with respect to $x$, we get
$$I_{f,1}'(x) = \frac{d}{dx} \left( \frac{x^2 - 1}{2} \right) = x = f(x).$$


This is not a coincidence, but exactly the statement of the fundamental theorem of calculus:

Fundamental theorem of calculus: Let $f$ be a function and $a$ a real number. Then $I_{f,a}$ is an antiderivative of $f$. Concretely, this means:
$$I_{f,a}'(x) = \frac{d}{dx} (I_{f,a}(x)) = \frac{d}{dx} \int_a^x f(t)\,dt = f(x).$$

Proof: We want to calculate
$$I_{f,a}'(x) = \lim_{h \to 0} \frac{I_{f,a}(x+h) - I_{f,a}(x)}{h}.$$
For this, we look at $I_{f,a}(x+h) - I_{f,a}(x)$ more closely. Using the interval additivity (2.5), we have
$$I_{f,a}(x+h) - I_{f,a}(x) = \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt. \qquad (2.6)$$
We look at figure 2.14. Let $f_{h,\min}$ be the minimal value of $f$ between $x$ and $x + h$, and let $f_{h,\max}$ be the maximal value of $f$ between $x$ and $x + h$. Then (for $h > 0$) we surely have
$$h \cdot f_{h,\min} \le \int_x^{x+h} f(t)\,dt \le h \cdot f_{h,\max}.$$

If we divide by $h$, we get
$$f_{h,\min} \le \frac{1}{h} \cdot \int_x^{x+h} f(t)\,dt \le f_{h,\max}.$$

Using equation (2.6), we have
$$f_{h,\min} \le \frac{I_{f,a}(x+h) - I_{f,a}(x)}{h} \le f_{h,\max}.$$
If we let $h \to 0$, then $f_{h,\min} \to f(x)$ and $f_{h,\max} \to f(x)$, hence
$$f(x) \le \lim_{h \to 0} \frac{I_{f,a}(x+h) - I_{f,a}(x)}{h} \le f(x).$$
Finally, we get
$$I_{f,a}'(x) = \lim_{h \to 0} \frac{I_{f,a}(x+h) - I_{f,a}(x)}{h} = f(x)$$
and therefore $I_{f,a}$ is an antiderivative of $f$.
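The statement of the theorem can be observed numerically: approximate the integral function by a sum, differentiate it with a difference quotient, and compare with f. The sketch below does this for f = sin and a = 0; the midpoint rule used for the integral, the helper names and the step sizes are assumptions chosen for illustration and are not part of the proof.

```python
import math


def integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx


def integral_function(x):
    # I_{f,a}(x) for f = sin and a = 0.
    return integral(math.sin, 0.0, x)


x, h = 1.0, 1e-3
# Difference quotient of the integral function at x.
slope = (integral_function(x + h) - integral_function(x - h)) / (2 * h)
print(slope, math.sin(x))
```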


Figure 2.14: Proof of the fundamental theorem of calculus

The fundamental theorem of calculus gives us a useful tool to calculate integrals. We want to know the value of the integral
$$\int_a^b f(x)\,dx = I_{f,a}(b).$$

Let $F$ be an antiderivative of $f$. From the fundamental theorem of calculus, we know that $I_{f,a}(x)$ also is an antiderivative of $f(x)$. Therefore there exists a real number $c$ such that
$$I_{f,a}(x) = F(x) + c.$$
But then we have
$$F(b) - F(a) = I_{f,a}(b) - c - (I_{f,a}(a) - c) = I_{f,a}(b) - I_{f,a}(a) = I_{f,a}(b),$$
since
$$I_{f,a}(a) = \int_a^a f(x)\,dx = 0.$$
Hence we get
$$\int_a^b f(x)\,dx = I_{f,a}(b) = F(b) - F(a). \qquad (2.7)$$

We emphasize again that the formula (2.7) does not depend on the choice of the antiderivative.
For $F(b) - F(a)$, we also write
$$[F(x)]_a^b = F(b) - F(a).$$

Example:

1. We calculate
$$\int_1^2 x\,dx.$$
An antiderivative of $f(x) = x$ is given by $F(x) = \frac{1}{2} x^2$. So using equation (2.7),
$$\int_1^2 x\,dx = \left[ \frac{1}{2} x^2 \right]_1^2 = \frac{1}{2} \cdot 2^2 - \frac{1}{2} \cdot 1^2 = \frac{3}{2}.$$
We indeed get the same result as above.

2. We calculate
$$\int_0^1 x^2\,dx.$$
An antiderivative of $f(x) = x^2$ is given by $F(x) = \frac{1}{3} x^3$. So we get
$$\int_0^1 x^2\,dx = \left[ \frac{1}{3} x^3 \right]_0^1 = \frac{1}{3} \cdot 1^3 - \frac{1}{3} \cdot 0^3 = \frac{1}{3}.$$

3. We calculate
$$\int_0^\pi \sin(x)\,dx.$$
An antiderivative of $\sin(x)$ is given by $-\cos(x)$. Therefore,
$$\int_0^\pi \sin(x)\,dx = [-\cos(x)]_0^\pi = -\cos(\pi) - (-\cos(0)) = -(-1) + 1 = 2.$$


The fundamental theorem of calculus motivates the following notation: Let $f$ be a function. Then we also write
$$\int f(x)\,dx$$
for an arbitrary antiderivative of $f$. Note that this notation is not really mathematically correct, since $\int f(x)\,dx$ is not unique, but it should always be clear from the context what is meant.

2.3.4 More integration rules

Let $f$ be a function. If we are given a candidate $F$ for an antiderivative of $f$, it is in general easy to check that $F$ is really an antiderivative of $f$, since we just have to calculate the derivative $F'$. But it can be very difficult to find an antiderivative.⁵ In this section we will present more methods to calculate antiderivatives in certain situations. The first one is the method of integration by parts. Integration by parts essentially is the reversed product rule. The second one is integration by substitution, which is the reversed chain rule.

Integration by parts: We start with the product rule. Let $u$ and $v$ be two functions and $f(x) = u(x) \cdot v(x)$. Then the product rule says that
$$f'(x) = \frac{d}{dx} (u(x) \cdot v(x)) = u'(x) \cdot v(x) + u(x) \cdot v'(x).$$
If we calculate antiderivatives on both sides, we get
$$
\begin{aligned}
u(x) \cdot v(x) &= \int \left( u'(x) \cdot v(x) + u(x) \cdot v'(x) \right) dx \\
&= \int u'(x) \cdot v(x)\,dx + \int u(x) \cdot v'(x)\,dx.
\end{aligned}
$$

This implies
$$\int u'(x) \cdot v(x)\,dx = u(x) \cdot v(x) - \int u(x) \cdot v'(x)\,dx. \qquad (2.8)$$

This equation may look very abstract at first glance and will now be clarified by some examples.

⁵ There even exist functions where this is analytically impossible. E.g. the function $f(x) = e^{-x^2}$ does not have an antiderivative that can be expressed in terms of elementary functions.


Example:

1. We calculate an antiderivative of $e^x \cdot x$. Set $u(x) = e^x$ and $v(x) = x$. Then
$$e^x \cdot x = u'(x) \cdot v(x).$$
Using equation (2.8), an antiderivative of $e^x \cdot x$ is given by
$$
\begin{aligned}
\int e^x \cdot x\,dx &= \int u'(x) \cdot v(x)\,dx \\
&= u(x) \cdot v(x) - \int u(x) \cdot v'(x)\,dx \\
&= e^x \cdot x - \int e^x \cdot 1\,dx \\
&= e^x \cdot x - e^x.
\end{aligned}
$$
If we differentiate $e^x \cdot x - e^x$, we see that it is indeed an antiderivative of $e^x \cdot x$.

2. We calculate an antiderivative of $\sin^2(x)$. We set $u(x) = -\cos(x)$ and $v(x) = \sin(x)$. Then $u'(x) = \sin(x)$ and therefore
$$\sin^2(x) = u'(x) \cdot v(x).$$
Using equation (2.8), we calculate
$$
\begin{aligned}
\int \sin^2(x)\,dx &= \int u'(x) \cdot v(x)\,dx \\
&= u(x) \cdot v(x) - \int u(x) \cdot v'(x)\,dx \\
&= -\cos(x) \cdot \sin(x) - \int (-\cos(x)) \cdot \cos(x)\,dx \\
&= -\cos(x) \cdot \sin(x) + \int \cos^2(x)\,dx.
\end{aligned}
$$
We now use
$$\cos^2(x) = 1 - \sin^2(x)$$
and get
$$
\begin{aligned}
\int \sin^2(x)\,dx &= -\cos(x) \cdot \sin(x) + \int \cos^2(x)\,dx \\
&= -\cos(x) \cdot \sin(x) + \int \left( 1 - \sin^2(x) \right) dx \\
&= -\cos(x) \cdot \sin(x) + \int 1\,dx - \int \sin^2(x)\,dx \\
&= -\cos(x) \cdot \sin(x) + x - \int \sin^2(x)\,dx.
\end{aligned}
$$
Now we solve for $\int \sin^2(x)\,dx$ and get
$$
\begin{aligned}
\int \sin^2(x)\,dx &= \frac{1}{2} \cdot (-\cos(x) \cdot \sin(x) + x) \\
&= \frac{x}{2} - \frac{\cos(x) \cdot \sin(x)}{2}.
\end{aligned}
$$
Again, we can check by differentiating that this is indeed an antiderivative of $\sin^2(x)$.

3. We calculate an antiderivative of $\ln(x)$. This does not look like a case for integration by parts, but we can write
$$\ln(x) = 1 \cdot \ln(x).$$
Then we define $u(x) = x$ and $v(x) = \ln(x)$. Now we have
$$\ln(x) = u'(x) \cdot v(x).$$
Using equation (2.8) and $v'(x) = \frac{1}{x}$ we get
$$
\begin{aligned}
\int \ln(x)\,dx &= \int 1 \cdot \ln(x)\,dx = \int u'(x) \cdot v(x)\,dx \\
&= u(x) \cdot v(x) - \int u(x) \cdot v'(x)\,dx \\
&= x \cdot \ln(x) - \int x \cdot \frac{1}{x}\,dx \\
&= x \cdot \ln(x) - \int 1\,dx \\
&= x \cdot \ln(x) - x.
\end{aligned}
$$
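As noted in the examples, any claimed antiderivative can be checked by differentiating it. A quick numerical version of that check is sketched below for the three results above: each candidate F is differentiated with a difference quotient and compared with the original function f at a few sample points. The helper name `derivative`, the sample points and the tolerance are illustrative assumptions.

```python
import math


def derivative(F, x, h=1e-6):
    """Approximate F'(x) by the symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Pairs (antiderivative F found by integration by parts, original function f).
candidates = [
    (lambda x: math.exp(x) * x - math.exp(x), lambda x: math.exp(x) * x),
    (lambda x: x / 2 - math.cos(x) * math.sin(x) / 2, lambda x: math.sin(x) ** 2),
    (lambda x: x * math.log(x) - x, math.log),
]

# Largest deviation between F'(x) and f(x) over all pairs and sample points.
max_error = max(
    abs(derivative(F, x) - f(x))
    for F, f in candidates
    for x in (0.5, 1.0, 2.0)
)
print(max_error)
```

A small value of `max_error` supports the results but does not prove them; the exact check is the symbolic differentiation mentioned in the text.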


Integration by substitution: The second method we look at is integration by substitution. This is essentially the converse of the chain rule. We start with an example: We want to calculate an antiderivative of
$$f(x) = 2x \cdot e^{x^2}.$$

We can write $f$ as
$$f(x) = k'(x) \cdot g'(k(x)),$$
where $k(x) = x^2$ and $g(y) = e^y$. Applying the chain rule, we get
$$f(x) = k'(x) \cdot g'(k(x)) = \frac{d}{dx} g(k(x)).$$
Therefore by definition of an antiderivative,
$$F(x) = g(k(x)) = e^{x^2}$$
is an antiderivative of $f$. This is actually the whole idea of integration by substitution.

Integration by substitution (version 1): Let $u$ and $v$ be functions and let $U$ be an antiderivative of $u$. Then $U(v(x))$ is an antiderivative of $u(v(x)) \cdot v'(x)$. Hence
$$\int u(v(x)) \cdot v'(x)\,dx = U(v(x)).$$

One can check that $U(v(x))$ is really an antiderivative of $u(v(x)) \cdot v'(x)$ by differentiating using the chain rule. The simplest application of integration by substitution is the case where $v$ is of the form
$$v(x) = ax + b$$
with $a \neq 0$. If $u$ is a function and $U$ is an antiderivative of $u$, then an antiderivative of $u(ax + b)$ is given by
$$\int u(ax + b)\,dx = \frac{1}{a} \int u(ax + b) \cdot a\,dx = \frac{1}{a} U(ax + b).$$


Remark: Integration by substitution can be remembered and applied using a simple trick: We simply pretend that we can handle $dv$, $dx$ and $\frac{dv}{dx}$ just as normal variables. Then we can do the following formal⁶ calculation. Write $v'(x) = \frac{dv}{dx}$. Then "$v'(x)\,dx = dv$", so
$$\int u(v(x)) \cdot v'(x)\,dx = \int u(v)\,dv = U(v(x)).$$

Example:

1. We calculate an antiderivative of $e^{3x-2}$:
$$\int e^{3x-2}\,dx = \frac{1}{3} \cdot e^{3x-2}.$$

2. We calculate an antiderivative of $\tan(x)$. For this we write
$$\tan(x) = \frac{\sin(x)}{\cos(x)} = -\frac{-\sin(x)}{\cos(x)}.$$
Set $v(x) = \cos(x)$. Then $\frac{dv}{dx} = -\sin(x)$, so we write
$$-\sin(x)\,dx = dv.$$
Therefore
$$
\begin{aligned}
\int \tan(x)\,dx &= -\int \frac{1}{v(x)} \cdot (-\sin(x))\,dx \\
&= -\int \frac{1}{v}\,dv = -\ln(|v|) = -\ln(|\cos(x)|).
\end{aligned}
$$

Definite integrals: If one calculates definite integrals using the substitution rule, one gets the following formula:

Integration by substitution (version 2): Let $u$ and $v$ be functions and let $a$ and $b$ be real numbers. Then
$$\int_a^b u(v(x)) \cdot v'(x)\,dx = \int_{v(a)}^{v(b)} u(v)\,dv.$$

⁶ Note that here "formal" means that we write things that are not properly defined.


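Example:
We calculate
$$\int_0^{\frac{\pi}{2}} \sin(x) \cdot \cos(x)\,dx.$$
Set
$$v(x) = \sin(x).$$
Thus $dv = \cos(x)\,dx$, so
$$
\begin{aligned}
\int_0^{\frac{\pi}{2}} \sin(x) \cdot \cos(x)\,dx &= \int_0^{\frac{\pi}{2}} v(x) \cdot \cos(x)\,dx \\
&= \int_{v(0)}^{v\left(\frac{\pi}{2}\right)} v\,dv \\
&= \int_{\sin(0)}^{\sin\left(\frac{\pi}{2}\right)} v\,dv \\
&= \left[ \frac{v^2}{2} \right]_0^1 \\
&= \frac{1}{2}.
\end{aligned}
$$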

Two more elaborate examples:

At first glance, functions for which one can apply integration by substitution seem to have a very special form. But actually, the manipulations seen above can be applied to a wide range of functions, which we want to illustrate with two examples.

1. We want to calculate an antiderivative of
$$\frac{1}{e^x + 1}.$$
We set $v(x) = e^x$. Thus
$$dv = e^x\,dx.$$
Using integration by substitution, we have
$$
\begin{aligned}
\int \frac{1}{e^x + 1}\,dx &= \int \frac{1}{(e^x)^2 + e^x} \cdot e^x\,dx \\
&= \int \frac{1}{v(x)^2 + v(x)} \cdot e^x\,dx \\
&= \int \frac{1}{v^2 + v}\,dv.
\end{aligned}
$$

Now we write
$$
\begin{aligned}
\frac{1}{v^2 + v} &= \frac{1}{v(v+1)} = \frac{(v+1) - v}{v(v+1)} \\
&= \frac{v+1}{v(v+1)} - \frac{v}{v(v+1)} \\
&= \frac{1}{v} - \frac{1}{v+1}.
\end{aligned}
$$
We can simply integrate this and get that
$$\int \frac{1}{v^2 + v}\,dv = \ln(|v|) - \ln(|v+1|).$$

Therefore
$$
\begin{aligned}
\int \frac{1}{e^x + 1}\,dx &= \int \frac{1}{v^2 + v}\,dv \\
&= \ln(|e^x|) - \ln(|e^x + 1|) \\
&= \ln(e^x) - \ln(e^x + 1) \\
&= x - \ln(e^x + 1).
\end{aligned}
$$

If one differentiates $x - \ln(e^x + 1)$ with respect to $x$, one can check that this is indeed an antiderivative of $\frac{1}{e^x + 1}$.

2. We want to find an antiderivative of
$$\sqrt{1 - x^2}.$$
This is a case where one can apply integration by substitution "backwards", i.e. one replaces $x$ by some function $x(u)$ depending on some other variable $u$. With a bit of experience¹, one sees that
$$x(u) = \cos(u)$$
could be a good choice for a substitution. Then we have (for $0 \le u \le \pi$, where $\sin(u) \ge 0$)
$$\sqrt{1 - x^2} = \sqrt{1 - \cos^2(u)} = \sin(u).$$
Note that $x = \cos(u)$, so
$$dx = -\sin(u)\,du.$$
Using integration by substitution, we get
$$
\begin{aligned}
\int \sqrt{1 - x^2}\,dx &= \int \sqrt{1 - \cos^2(u)} \cdot (-\sin(u))\,du \\
&= \int \sin(u) \cdot (-\sin(u))\,du \\
&= -\int \sin^2(u)\,du \\
&= -\left( \frac{u}{2} - \frac{\cos(u) \cdot \sin(u)}{2} \right),
\end{aligned}
$$
where we used the antiderivative of $\sin^2(u)$ which we calculated in the section about integration by parts. Now, we have to write this term as a function of $x$ again. Note that
$$\sin(u) = \sqrt{1 - \cos^2(u)} = \sqrt{1 - x^2}.$$
Furthermore, we write
$$u = \arccos(\cos(u)) = \arccos(x).$$
Therefore
$$
\begin{aligned}
\int \sqrt{1 - x^2}\,dx &= -\left( \frac{u}{2} - \frac{\cos(u) \cdot \sin(u)}{2} \right) \\
&= -\left( \frac{\arccos(x)}{2} - \frac{x \cdot \sqrt{1 - x^2}}{2} \right).
\end{aligned}
$$
We can differentiate this to see that it is indeed an antiderivative of $\sqrt{1 - x^2}$.²


2.3.5 The idea of multidimensional integrals

The idea of integrals is not restricted to functions of one dimension. It can be generalized to higher dimensions. Let us illustrate this with an example in three dimensions. Assume we have some object $C$, which we regard as a subset of the three dimensional space $\mathbb{R}^3$. We would like to know the mass of $C$, but we only know the density $\rho$ and the volume $V$ of $C$. This is easy, since we can just calculate the mass $m$ of $C$ from this information:
$$m = \rho \cdot V.$$
But what if the density $\rho$ is not constant? Let us first assume that it is at least piecewise constant, i.e. we can divide $C$ into disjoint pieces $C_1, \ldots, C_n$ such that $C_1 \cup C_2 \cup \ldots \cup C_n = C$ and, on each of $C_1, \ldots, C_n$, the density is constant and equal to $\rho_1, \ldots, \rho_n$, respectively. Furthermore, let $V_1, \ldots, V_n$ denote the corresponding volumes. Then the total mass of $C$ is just the sum of all the masses of $C_1, \ldots, C_n$, i.e.:
$$m = \rho_1 \cdot V_1 + \ldots + \rho_n \cdot V_n = \sum_{k=1}^{n} \rho_k \cdot V_k.$$

Now let us assume that the density $\rho$ is a function of the location $\vec{x}$ in $C$, i.e. $\rho = \rho(\vec{x})$, where $\vec{x} \in C \subset \mathbb{R}^3$. Now we can still approximate the mass of $C$ by dividing $C$ into a large number of small disjoint subsets $\Delta C_1, \ldots, \Delta C_n$, such that $\Delta C_1 \cup \ldots \cup \Delta C_n = C$. Then for each $\Delta C_k$, we choose a location $\vec{x}_k \in \Delta C_k$. For $k = 1, \ldots, n$ let $\Delta V_k$ denote the volume of $\Delta C_k$. Then we approximate the mass $m$ of $C$ by
$$m \approx \sum_{k=1}^{n} \rho(\vec{x}_k) \cdot \Delta V_k,$$
where the approximation gets better as the partition is made finer. This value then converges to the actual mass $m$ of $C$. In analogy to the one dimensional case, we write
$$m = \iiint_C \rho(\vec{x})\,dV(\vec{x}).$$
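The approximating sum for the mass can be tried out on a simple body. The sketch below is a purely illustrative setup that does not come from the script: the object C is taken to be the unit cube [0, 1]³, it is divided into n³ small cubes of equal volume, the sample point of each small cube is its centre, and the density ρ(x, y, z) = x² + y² + z² is a made-up example whose exact mass over the unit cube is 1.

```python
def mass(density, n):
    """Approximate the mass of the unit cube by summing density * dV over n^3 small cubes."""
    d = 1.0 / n          # edge length of each small cube
    dV = d ** 3          # volume of each small cube
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Sample point: centre of the small cube with index (i, j, k).
                x, y, z = (i + 0.5) * d, (j + 0.5) * d, (k + 0.5) * d
                total += density(x, y, z) * dV
    return total


def rho(x, y, z):
    # Hypothetical, non-constant density used only for this illustration.
    return x ** 2 + y ** 2 + z ** 2


for n in (5, 10, 20):
    print(n, mass(rho, n))  # approaches the exact mass 1 as n grows
```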
¹ One has to guess/see which function is suitable. There are even tables where one can look up the most common "standard substitutions".
² We use that
$$\frac{d}{dx} \arccos(x) = -\frac{1}{\sqrt{1 - x^2}}.$$
To see this, one differentiates $y = \cos(\arccos(y))$ using the chain rule and $\cos^2(x) + \sin^2(x) = 1$.


Figure 2.15: An object with a small piece of volume $dV(\vec{x})$. Dividing the object into small pieces of volume dV allows us to assume a constant density (or, in general, a constant function) on each small piece. Summing (integrating) the contributions ρ dV of all pieces gives the total mass.

As in the one-dimensional case, the intuition is that in the limit the Σ becomes a ∭ and the ∆ becomes a d. If we replace the density $\rho = \rho(\vec{x})$ by an arbitrary function $f = f(\vec{x})$, we have now already defined
\[
\iiint_C f(\vec{x}) \, dV(\vec{x})
\]

for arbitrary⁷ functions f from ℝ³ to ℝ and subsets C ⊂ ℝ³.

At the moment, we omit the discussion of how one actually calculates those integrals.
It is more important to understand the idea. This idea is not restricted to the three-dimensional case. If we replace the volume V by the area A, we can similarly define the integral
\[
\iint_C f(\vec{x}) \, dA(\vec{x})
\]
for a subset C ⊂ ℝ² and a function $f = f(\vec{x})$ from ℝ² to ℝ. This integral corresponds to the (signed) volume between C and the two-dimensional graph of f.
7
We implicitly assume that f is ”nice enough”.


We can generalize this idea even more. For example, we can integrate functions on curved two-dimensional surfaces in three-dimensional space. In the same spirit, we can also integrate functions along one-dimensional curves in three-dimensional space. The idea always stays the same: one cuts the set on which one wants to integrate into small parts and then assumes the function f to be constant on each of those small parts. Then one just sums over all the small parts to approximate the integral.

2.4 Complex Numbers

This Chapter is (except for very small modifications) equal to Chapter 5 in [3], which was
written by Lionel Philippoz.

2.4.1 Introduction
Should you encounter the following equation in a textbook

\[
x^2 = 1 \tag{2.9}
\]

and be asked to find its solutions in R, it would not be that difficult to conclude that there
are two of them, namely 1 and −1. But what happens if one slightly modifies Eq. (2.9)
by changing a sign?
\[
x^2 = -1 \tag{2.10}
\]
Can you find a real solution? Actually not, since for any real number x, x² ≥ 0. In the real numbers, √(−1) is not defined. This equation thus possesses no solutions in R.
However, one can expand the set of real numbers to the so-called set of complex numbers
C in which Eq. (2.10) actually has two (complex) solutions.

2.4.2 Representation of a complex number, Euler formula

Complex numbers are not so different from real numbers, and all you actually need to know is that we define a new number i ∈ C such that i² = −1. And that’s it! You can now solve Eq. (2.10) in C and find its two solutions: i and −i.
Any complex number z ∈ C can be written as the sum of two numbers:

\[
z = x + iy \tag{2.11}
\]


where x, y ∈ R and i is as previously defined. x is called the real part of z (sometimes written as ℜ(z) or Re(z)), whereas y is called the imaginary part of z (written as ℑ(z) or Im(z)).
If y = 0, then z is simply a real number, and when x = 0, we say that z is an imaginary number. As you can see, any complex number can now be defined using two “coordinates” x and y, which means it can actually be represented in a plane, the so-called complex plane!

Figure 2.16: The example z = 3 + 2i can be seen as a point in the complex plane (horizontal axis: real part, vertical axis: imaginary part), with cartesian coordinates (x, y) or polar ones (|z|, θ).

Another possibility to describe the position of a point in a plane consists in using polar
coordinates, where one needs to give the distance from the origin as well as the angle
between the x-axis (here the real axis) and the line connecting the origin and the point8 .
If you know trigonometry well, you can then easily relate both coordinate systems by
writing:

x = |z| cos(θ)
y = |z| sin(θ)

and z can thus be written as

8
If you want to consider z as a vector, then x and y are the components of that vector, |z| the norm of
z and θ the angle between the vector and the x-axis.


\begin{align*}
z &= x + iy \\
  &= |z| \cos(\theta) + i \, |z| \sin(\theta) \\
  &= |z| \left( \cos(\theta) + i \sin(\theta) \right) \\
  &= |z| \, e^{i\theta} \tag{2.12}
\end{align*}

The last step is actually performed using the Euler formula:

\[
e^{i\theta} = \cos(\theta) + i \sin(\theta) \tag{2.13}
\]

which we will take as the definition of the complex exponential function in this script. This is a very important relation which allows one to switch between a trigonometric representation and an exponential one, which is much easier to handle. You can actually write both trigonometric functions sin(θ) and cos(θ) in terms of complex exponentials! This is done as follows:

\begin{align*}
e^{i\theta} &= \cos(\theta) + i \sin(\theta) \tag{2.14} \\
e^{-i\theta} &= \cos(-\theta) + i \sin(-\theta) \\
             &= \cos(\theta) - i \sin(\theta) \tag{2.15}
\end{align*}

If you now add (resp. subtract) Eq. (2.14) and Eq. (2.15), you can eliminate the sin part (resp. cos part) and solve for cos (resp. sin), which leads to

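\begin{align*}
\cos(\theta) &= \frac{e^{i\theta} + e^{-i\theta}}{2} \tag{2.16} \\
\sin(\theta) &= \frac{e^{i\theta} - e^{-i\theta}}{2i} \tag{2.17}
\end{align*}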
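
The relations of this section can be checked numerically. The following Python sketch uses the standard `cmath` module; the test angle θ = 0.7 and the variable names are arbitrary choices made for this illustration, and z = 3 + 2i is the example from the complex plane figure.

```python
import cmath
import math

theta = 0.7  # arbitrary test angle in radians (an assumption for this example)

# Euler formula: e^{i*theta} = cos(theta) + i*sin(theta)
euler_lhs = cmath.exp(1j * theta)
euler_rhs = complex(math.cos(theta), math.sin(theta))

# cos and sin written with complex exponentials
cos_from_exp = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2
sin_from_exp = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / (2j)

# Cartesian <-> polar form for the example z = 3 + 2i
z = 3 + 2j
modulus, angle = cmath.polar(z)       # |z| and theta
z_back = cmath.rect(modulus, angle)   # |z| * e^{i*theta}

print(abs(euler_lhs - euler_rhs))              # numerically zero
print(cos_from_exp.real - math.cos(theta))     # numerically zero
print(sin_from_exp.real - math.sin(theta))     # numerically zero
print(modulus, angle, z_back)
```

All differences vanish up to floating-point rounding, and converting z to polar form and back returns the original number.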

2.4.3 A first simple application

One useful application of the Euler formula lies in the fact that dealing with trigonometric functions is not always that easy. For instance, do you know by heart how to write sin(2θ) or cos(2θ) using only functions of the angle θ, not 2θ? If you forgot those formulas, just switch to the complex world!


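\begin{align*}
e^{i(2\theta)} &= \underbrace{\cos(2\theta)}_{\Re(e^{2i\theta})} + i \cdot \underbrace{\sin(2\theta)}_{\Im(e^{2i\theta})} \\
e^{i(2\theta)} &= \left( e^{i\theta} \right)^2 \\
               &= \left( \cos(\theta) + i \sin(\theta) \right)^2 \\
               &= \cos^2(\theta) + 2i \sin(\theta) \cos(\theta) + i^2 \sin^2(\theta) \\
               &= \underbrace{\cos^2(\theta) - \sin^2(\theta)}_{\Re(e^{2i\theta})} + i \cdot \underbrace{2 \sin(\theta) \cos(\theta)}_{\Im(e^{2i\theta})}
\end{align*}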

where we used the fact that i² = −1, as you should know by now. It is now easy to conclude by equating the two expressions for the real part (and likewise for the imaginary part):

\begin{align*}
\sin(2\theta) &= 2 \sin(\theta) \cos(\theta) \\
\cos(2\theta) &= \cos^2(\theta) - \sin^2(\theta)
\end{align*}

2.4.4 Physical examples

One possible application of complex numbers in physics is the representation of any periodic system whose description would require trigonometric functions, such as waves. If we consider a wave function ξ(x, t)⁹, it will usually be written as
\[
\xi(x, t) = \xi_0 \cos(kx - \omega t)
\]
or, using the Euler formula,
\[
\xi(x, t) = \xi_0 \, \Re\left( e^{i(kx - \omega t)} \right) .
\]
However, you will rarely encounter this notation with the real part (or the imaginary part, if you considered a sin function) in physics books, where the wave is simply denoted as
\[
\xi(x, t) = \xi_0 \, e^{i(kx - \omega t)} .
\]
Physicists simply assume that the real part is taken at the end of their calculations (that way, it is shorter to write and much easier to read, don’t you agree?).
9
ξ(x, t) is the displacement of the wave, which depends on the position x and the time t, and ξ0 is the amplitude.


What is now the velocity $\dot{\xi}(x, t) \equiv \frac{d\xi(x, t)}{dt}$ or the acceleration $\ddot{\xi}(x, t) \equiv \frac{d^2 \xi(x, t)}{dt^2}$? Well, you just need to differentiate the wave function ξ(x, t), and it is much easier to do so when considering an exponential function!
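
As a short worked sketch (assuming that ξ0, k and ω are real constants), each time derivative of the exponential form simply multiplies it by the factor −iω:

```latex
\begin{align*}
\dot{\xi}(x, t) &= -i\omega \, \xi_0 \, e^{i(kx - \omega t)} ,
  & \Re\big( \dot{\xi} \big) &= \omega \, \xi_0 \sin(kx - \omega t) , \\
\ddot{\xi}(x, t) &= (-i\omega)^2 \, \xi_0 \, e^{i(kx - \omega t)} = -\omega^2 \, \xi(x, t) ,
  & \Re\big( \ddot{\xi} \big) &= -\omega^2 \, \xi_0 \cos(kx - \omega t) .
\end{align*}
```

Taking the real part at the end gives the same result as differentiating ξ0 cos(kx − ωt) directly.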

Chapter 3

MECHANICS 1
I like to move it, move it.
King Julien

3.1 Model
3.2 Kinematics of Point-like Particles
3.3 Dynamics of Point-like Particles
3.4 Momentum, Work, Energy and Power


Mechanics is one of the most important topics in physics. It describes how a body moves if the interaction of that body with other bodies is known. For example, in Newtonian mechanics¹ the way a body moves is described by its acceleration, and the interaction by the force. Newton’s second law brings these two things together such that, in principle, we can calculate the motion. In mechanics we do not ask where the force comes from; this is the subject of other topics, for example electromagnetism. We only describe its influence on the motion.
All the following topics in physics are more or less based on mechanics. For example, thermodynamics and fluid dynamics describe the collective behavior of many particles. Oscillations and waves describe the motion of one or many bodies on which a linear force acts. And finally, electrodynamics describes the interaction of electric and magnetic fields with a body.
Since mechanics contains a lot of subjects, we divide it into two parts. In this chapter we will focus on point-like particles and introduce the main concepts to describe their motion. In the next chapter we will then look at rigid bodies, which consist of many atoms, and generalize circular motion, including the motion of planets around the sun.

3.1 Model

Any body is made up of a large number of atoms. To describe the motion of that body precisely, one would need to describe the interaction of each atom with every other atom and with other bodies, and then compute the motion of each single atom. For a macroscopic body, which contains on the order of 10²³ atoms, this calculation is far beyond any available computing power. In order to describe the motion of a body to a good approximation, we have to figure out a model.

• A first good approximation is to consider the body as point-like and to assign collective properties like (total) mass and (total) charge to it. This approximation can be justified if the size of the body is small compared to the length scale of its motion as a whole. For example, if you kick a football, its travelling distance is much larger than its diameter; the same holds for planets. Therefore one can, to a good approximation, describe a football and planets as point-like particles².
1
Other common descriptions are Lagrangian and Hamiltonian mechanics. They describe interactions by the potential energy, and it is possible to generalize the formalism, at the price that it gets more complicated. In the end, the equations one has to solve to get the motion are the same for all descriptions.
2
Nevertheless, there are effects that arise from their non-zero size. One then has to model the body in a more complicated way.


• If one considers the distances between the atoms to be constant, one describes the body as a rigid body. This is treated in chapter 4.

• One can also look at the deformation of a body. The description as a rigid body is then no longer valid and one has to go one step further.

3.2 Kinematics of Point-like Particles

As already mentioned above, in this whole chapter we will only consider point-like particles. In this section we will look at how one can describe the motion of such a particle. We are not interested in why the particle performs this motion; this will be discussed in the next section about Newton’s laws.
In this section, a general description is given first, which is basically a short repetition of math. Afterwards we will look at the most important cases.

3.2.1 General Description

Our universe is three-dimensional and its whole evolution can be parametrized by time. Or in simpler words, the position of any body can be described by a three-dimensional vector which depends only on time and on its starting point. Let’s look at this description step by step:
First of all we need to define a reference point, often denoted as O. This point is the zero point of the coordinate system and the coordinates of all bodies refer to this point, see also figure 3.1. For our coordinate system we furthermore need three coordinate axes. In principle one is free to choose any point in space as the zero point, and also the directions of the axes. Nevertheless, it is very often possible to reduce the difficulty of a problem by a good choice of the reference point and of the directions of the axes (see sections 3.2.2 and 3.2.3). In particular it is useful to take perpendicular axes, since then the scalar product of the unit vectors along two different axes is always zero. Additionally, one should choose the axes such that they obey the right-hand rule (see also 2.1.2). This means that if the x-axis points in the direction of the thumb and the y-axis in the direction of the forefinger, then the z-axis should point in the direction of the middle finger. This right-hand rule will become important when we look at vector products of vectors, such as angular momentum or torque. The position of a point-like particle at time t is given by the position vector $\vec{r}(t)$, namely

 
\[
\vec{r}(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix}
\]


where x(t) is the coordinate along the x-axis at time t. This means that the distance between the reference point O and the particle, measured in the direction of the x-axis, is x(t). Note: x(t) is not simply a number, it also carries a unit such as meter or kilometer. Depending on the unit, the numerical value of x(t) can be different, but the distance is always the same. The same holds for y and z.

Figure 3.1: General setup at a time t: A particle is moving on a (curved) trajectory through space (from left to right). The position $\vec{r}(t)$ of the particle, with coordinates x(t), y(t) and z(t), is measured with respect to a reference point O. The velocity $\vec{v}(t)$ of the particle is tangent to the curve at that point.

The change of the position vector ~r(t) divided by the time ∆t needed for this change is
the velocity. Assume the particle travels between the time points t1 and t2 from ~r(t1 ) to
~r(t2 ). The average velocity is then

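\[
\vec{v}(t) = \frac{\Delta \vec{r}(t)}{\Delta t} = \frac{\vec{r}(t_2) - \vec{r}(t_1)}{t_2 - t_1} = \begin{pmatrix} \frac{\Delta x}{\Delta t} \\ \frac{\Delta y}{\Delta t} \\ \frac{\Delta z}{\Delta t} \end{pmatrix} .
\]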

To get the instantaneous velocity one looks at the limit of very small ∆t which is the
derivative. The instantaneous velocity vector ~v (t) is therefore


\[
\vec{v}(t) = \frac{d\vec{r}(t)}{dt} = \begin{pmatrix} \frac{dx(t)}{dt} \\ \frac{dy(t)}{dt} \\ \frac{dz(t)}{dt} \end{pmatrix} = \begin{pmatrix} v_x(t) \\ v_y(t) \\ v_z(t) \end{pmatrix} .
\]

Geometrically, the instantaneous velocity is the tangent to the curve $\vec{r}(t)$ which points in the direction of motion and has length $v = |\vec{v}(t)| = \sqrt{v_x^2 + v_y^2 + v_z^2}$, see also figure 3.1.
We can proceed and look at the change of the velocity per time. This means we take the
derivative of the velocity. This quantity is the acceleration, denoted by ~a. Formally we
get

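\[
\vec{a}(t) = \frac{d\vec{v}(t)}{dt} = \frac{d^2 \vec{r}(t)}{dt^2} = \begin{pmatrix} \frac{dv_x(t)}{dt} \\ \frac{dv_y(t)}{dt} \\ \frac{dv_z(t)}{dt} \end{pmatrix} = \begin{pmatrix} a_x(t) \\ a_y(t) \\ a_z(t) \end{pmatrix} .
\]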

Of course one can also look at higher derivatives but they usually have no physical im-
portance.
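
The definitions of velocity and acceleration as derivatives can be tried out numerically. The following Python sketch replaces the limit ∆t → 0 by a small but finite step h (a central difference). The trajectory (t, t², sin t), the step size and the function names are assumptions made for this illustration only.

```python
import math


def position(t):
    """Hypothetical trajectory r(t) = (t, t^2, sin(t)), chosen only for illustration."""
    return (t, t * t, math.sin(t))


def derivative(f, t, h=1e-4):
    """Central-difference approximation of df/dt for a vector-valued function f."""
    forward, backward = f(t + h), f(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(forward, backward))


def velocity(t):
    """v(t) = dr/dt, component by component."""
    return derivative(position, t)


def acceleration(t):
    """a(t) = dv/dt, component by component."""
    return derivative(velocity, t)


print(velocity(1.0))      # close to (1, 2, cos(1))
print(acceleration(1.0))  # close to (0, 2, -sin(1))
```

For this trajectory the exact results are v = (1, 2t, cos t) and a = (0, 2, −sin t), which the finite differences reproduce up to a small numerical error.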

3.2.2 Linear Uniform Acceleration

One of the most important non-trivial cases³ is linear uniform acceleration. This means that $\vec{a}$ is constant. We assume that the acceleration points in the direction of the x-axis and that the y and z coordinates do not change⁴.
Obviously we do not need to describe this motion in three dimensions, and we can do the whole calculation with scalars, considering only the x-axis. For this, assume⁵ that at time t0 the particle is at position x0 and moves with velocity v0. We search for the velocity at a (later) time t. Since the acceleration is the derivative of the velocity, we obtain the velocity
3
Trivial cases would be no acceleration and therefore constant velocity, because then we could make a
change of reference system and our object would be at rest. This is related to the definition of inertial frames,
see section 3.3.2.
4
Otherwise we could choose our coordinate system such that the x-axis points in the direction of the
acceleration. Furthermore we could choose the reference point moving with the same velocity in y and z
direction as the particle. Then the y and z coordinate with respect to this moving reference point would
not change. The reason why we can choose a reference point that moves with constant velocity is given in
section 3.3.2.
5
We could also choose a reference frame in which the particle is at rest and at the origin at time t0. But to show how the calculation is done, let us consider the more general case.


by integrating the acceleration and choosing the integration constant such that at t = t0
the velocity is v0 . Since the integral from t0 to t = t0 is zero, the integration constant6 is
exactly v0 .

\begin{align*}
v(t) &= v_0 + \int_{t_0}^{t} a(t') \, dt' \\
     &= v_0 + \int_{t_0}^{t} a \, dt' = v_0 + a (t - t_0).
\end{align*}

Note that the prime in t′ has nothing to do with derivatives. We simply need a variable to parametrise the integral. We could also replace t′ by any other symbol (e.g. x, α or ℵ), as it does not appear outside the integral. The position of the particle is obtained by one more integration. By the same argument, the integration constant is x0.

\begin{align*}
x(t) &= x_0 + \int_{t_0}^{t} v(t') \, dt' \\
     &= x_0 + \int_{t_0}^{t} \left( v_0 + a (t' - t_0) \right) dt' \\
     &= x_0 + v_0 (t - t_0) + \frac{1}{2} a (t - t_0)^2 .
\end{align*}
The calculation is illustrated graphically in figure 3.2.
It is important to remember that, for constant acceleration a and a start from rest (v0 = 0, t0 = 0), the displacement is proportional to the square of the time. One intuitive explanation for this quadratic dependence is that the mean velocity is proportional to the time t and that the duration of the motion is also proportional to t. Therefore the distance is proportional to t², see again figure 3.2.

If the acceleration is linear (the vector ~a(t) points always in the same direction) but not
uniform (~a(t) not constant), we can treat the problem also in one dimension but we have
to calculate the integral for the function ~a(t).
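
The two integrations can also be carried out numerically, which is a useful cross-check and works for non-uniform acceleration as well. The Python sketch below sums up small time steps and compares the result with the closed formulas for constant a. The function names, the step count and the numerical values are assumptions chosen for this example.

```python
def closed_form(t, t0, x0, v0, a):
    """Position and velocity from the formulas for constant acceleration a."""
    dt = t - t0
    return x0 + v0 * dt + 0.5 * a * dt ** 2, v0 + a * dt


def integrate(accel, t, t0, x0, v0, steps=100_000):
    """Integrate step by step; accel is a function giving the acceleration at a time."""
    h = (t - t0) / steps
    x, v, time = x0, v0, t0
    for _ in range(steps):
        x += v * h             # small area under the velocity graph
        v += accel(time) * h   # small area under the acceleration graph
        time += h
    return x, v


# Constant acceleration: both methods should agree closely.
x_exact, v_exact = closed_form(t=2.0, t0=0.0, x0=1.0, v0=0.5, a=3.0)
x_num, v_num = integrate(lambda time: 3.0, t=2.0, t0=0.0, x0=1.0, v0=0.5)
print(x_exact, x_num)
print(v_exact, v_num)

# Linear but non-uniform acceleration a(t) = 2t (hypothetical): only the integral changes.
print(integrate(lambda time: 2.0 * time, t=2.0, t0=0.0, x0=1.0, v0=0.5))
```

With smaller steps the numerical result approaches the exact one, in the same way as the sums in the section on integral calculus approach the integral.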
6
It’s not always that simple to find the integration constant. For example if the velocity v0 is not given
at t0 but at t0 − 1s.


Figure 3.2: Connection between the acceleration a, the velocity v and the distance x as functions of time (here for linear uniform acceleration). Left: Graph of the acceleration, which is constant (since the acceleration is uniform). The area under the graph (shaded) from t0 to t corresponds to the change in velocity v − v0 at time t, which grows linearly. Middle: Graph of the velocity. The area under the graph corresponds to the distance x − x0. Right: Graph of the distance, which grows quadratically.

Example:
Figure 3.3: Schematic drawing of the example: a vertical x-axis with marks from 1 to 4 and the ball starting at x0 = 4.

Let’s calculate a concrete case, namely the motion of a ball dropped from the second floor of a house. Since in the gravitational field on earth all bodies are accelerated with the same⁷ vertical acceleration⁸ g = −9.81 m·s⁻², we encounter the case of linear uniform acceleration. Take the coordinate axis to point upwards⁹ and the zero point on the ground, see also figure 3.3. Assume the window on the second floor of the house is at a height of 4 m and we let the ball drop at t = 0. Then its velocity until it hits the ground is v = gt = −9.81 m·s⁻² · t. The velocity is negative, as expected, since the x coordinate of the ball gets smaller and smaller (from initially 4 m towards 0). The position is then

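\begin{align*}
x(t) &= x_0 + \int_{t_0}^{t} v(t') \, dt' \\
     &= x_0 + \int_{t_0}^{t} \left( v_0 + a t' \right) dt' , \qquad v_0 = 0 \\
     &= 4\,\mathrm{m} - \frac{1}{2} \cdot 9.81\,\mathrm{m \cdot s^{-2}} \cdot t^2 .
\end{align*}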
To get the time the ball needs to reach the ground, we have to set x(t) = 0 and solve for t, leading to

\[
t = \sqrt{\frac{2 \cdot 4\,\mathrm{m}}{9.81\,\mathrm{m \cdot s^{-2}}}} \approx 0.90\,\mathrm{s} .
\]
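
The numbers of this example are quickly checked with a few lines of Python. The variable names are chosen for this sketch; the impact velocity is an additional quantity computed from v = gt.

```python
import math

g = 9.81    # magnitude of the gravitational acceleration in m/s^2
x0 = 4.0    # initial height of the ball in m

fall_time = math.sqrt(2 * x0 / g)    # from x(t) = x0 - g*t^2/2 = 0
impact_velocity = -g * fall_time     # negative: the ball moves downwards

print(f"{fall_time:.2f} s, {impact_velocity:.2f} m/s")   # prints "0.90 s, -8.86 m/s"
```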

3.2.3 Circular Motion

After having discussed the kinematic problems which can be solved in one dimension, let’s now look at the simplest problem which needs two dimensions. This is circular motion. In this case the particle moves on a circle in a plane, see figure 3.4. Since the particle always stays on the circle, we can parametrize its position by an angle ϕ(t):

x = R cos(ϕ)
y = R sin(ϕ).

If the angle changes at a constant rate, we call the motion uniform circular motion. The rate of change is described by the angular velocity ω = dϕ/dt (which is constant). This then leads to an angle ϕ(t) = ωt + ϕ0. The velocity is given by


Figure 3.4: Situation for the circular motion: A particle moves on a circle with radius R around the origin O. Its position $\vec{r}(t)$, with components R cos(ϕ) and R sin(ϕ), is described by the angle ϕ(t) = ωt + ϕ0.

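\begin{align*}
v_x &= \frac{dx(t)}{dt} = \frac{d}{dt} \left[ R \cos(\omega t + \varphi_0) \right] = -R\omega \sin(\omega t + \varphi_0) \\
v_y &= \frac{dy(t)}{dt} = \frac{d}{dt} \left[ R \sin(\omega t + \varphi_0) \right] = R\omega \cos(\omega t + \varphi_0).
\end{align*}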

The velocity is perpendicular to the position vector $\vec{r}$, as one can check from $\vec{r} \cdot \vec{v} = 0$. This is related to the constraint of the motion to the circle, where $|\vec{r}|$ is constant. To see this relation, assume the velocity were not perpendicular to the position. Then there would be a component of the velocity pointing along the position vector, which means that the position vector would get longer or shorter. The particle would then leave the circle, which contradicts $|\vec{r}|$ being constant. The general motion in two dimensions will be discussed in the next section 3.2.4. Let’s come back to circular motion and calculate the absolute value of $\vec{v}$:


\[
|\vec{v}| = \sqrt{v_x^2 + v_y^2} = R\omega \sqrt{\cos^2(\omega t + \varphi_0) + \sin^2(\omega t + \varphi_0)} = R\omega .
\]

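Similar to the position vector, the velocity vector only changes its direction and not its absolute value. We can therefore repeat all the steps above to calculate the acceleration, which is then given by

\begin{align*}
a_x &= \frac{dv_x(t)}{dt} = \frac{d}{dt} \left[ -R\omega \sin(\omega t + \varphi_0) \right] = -R\omega^2 \cos(\omega t + \varphi_0) \\
a_y &= \frac{dv_y(t)}{dt} = \frac{d}{dt} \left[ R\omega \cos(\omega t + \varphi_0) \right] = -R\omega^2 \sin(\omega t + \varphi_0).
\end{align*}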
From the discussion above the acceleration was expected to be perpendicular to the ve-
locity and since we are in two dimensions it must be parallel to the position. But position
and acceleration point in the opposite direction or in mathematical language ~r · ~a < 0.
This means ~a points towards the origin O of the circular motion.
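
These statements about uniform circular motion can be verified numerically. The Python sketch below evaluates position, velocity and acceleration at one instant and checks that the velocity is perpendicular to the position, that its length is Rω, and that the acceleration equals −ω² times the position vector. The values of R, ω, ϕ0 and t are arbitrary assumptions for this example.

```python
import math

R, omega, phi0 = 2.0, 1.5, 0.3   # hypothetical radius (m), angular velocity (rad/s), start angle (rad)


def state(t):
    """Position, velocity and acceleration of uniform circular motion at time t."""
    phi = omega * t + phi0
    r = (R * math.cos(phi), R * math.sin(phi))
    v = (-R * omega * math.sin(phi), R * omega * math.cos(phi))
    a = (-R * omega ** 2 * math.cos(phi), -R * omega ** 2 * math.sin(phi))
    return r, v, a


def dot(u, w):
    """Scalar product of two vectors in the plane."""
    return u[0] * w[0] + u[1] * w[1]


r, v, a = state(0.8)
print(dot(r, v))                     # numerically zero: v is perpendicular to r
print(math.hypot(*v), R * omega)     # both equal R*omega
print(dot(r, a))                     # negative: a points towards the origin O
print(a, tuple(-omega ** 2 * c for c in r))   # a equals -omega^2 * r
```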

3.2.4 General 2 Dimensional Motion

We now look at motion in two dimensions, as this occurs in many situations. Examples of motion in two dimensions are a ball rolling on a plane, a car driving on a street or the trajectory of a flying object under the gravitational force.
Consider the following situation: A point-like particle moves on a plane. At time t the particle shall be at the position¹⁰ $\vec{r}(t)$, moving with a velocity $\vec{v}(t)$ and accelerated with an acceleration $\vec{a}(t)$. The question is what the relation between these quantities is. Formally the relation is the one described in section 3.2.1, but there is a more intuitive approach. For this, let’s first look at the relation between the acceleration $\vec{a}(t)$ and the velocity $\vec{v}(t)$. The acceleration describes the change of the velocity. We split the acceleration into the part parallel to the velocity, denoted by $\vec{a}_\parallel(t)$, and the part perpendicular to the velocity, denoted by $\vec{a}_\perp(t)$, see also figure 3.5. From the discussion in the section about circular motion (see 3.2.3), it follows that $\vec{a}_\perp(t)$ only changes the direction of $\vec{v}(t)$ but not its length. On the other hand, $\vec{a}_\parallel(t)$ only changes the length of $\vec{v}(t)$ but not its direction. Similarly we can argue concerning the relation between the position $\vec{r}(t)$ and the velocity $\vec{v}(t)$.
It might seem that this intuitive approach does not lead to a lot of insight. But, for example, the motion of planets around the sun, and in particular the fact that their orbits are not circular but elliptic, can be explained intuitively with this argument.
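
The splitting of the acceleration into a parallel and a perpendicular part is an ordinary vector projection. A minimal Python sketch, assuming two-dimensional vectors stored as tuples and a non-zero velocity (the function names and numbers are made up for this illustration):

```python
def dot(u, w):
    """Scalar product of two vectors in the plane."""
    return u[0] * w[0] + u[1] * w[1]


def split_acceleration(a, v):
    """Return (a_parallel, a_perpendicular) of a with respect to a non-zero velocity v."""
    factor = dot(a, v) / dot(v, v)            # projection coefficient of a onto v
    a_par = (factor * v[0], factor * v[1])    # changes only the length of v
    a_perp = (a[0] - a_par[0], a[1] - a_par[1])   # changes only the direction of v
    return a_par, a_perp


v = (3.0, 4.0)   # hypothetical velocity in m/s
a = (1.0, 2.0)   # hypothetical acceleration in m/s^2
a_par, a_perp = split_acceleration(a, v)
print(a_par, a_perp)
print(dot(a_perp, v))   # numerically zero: the perpendicular part is orthogonal to v
```

In uniform circular motion the parallel part vanishes, which is why the speed stays constant there.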
10
All vectors in this section shall be two dimensional vectors on the plane.


Figure 3.5: The acceleration $\vec{a}(t)$, pointing in an arbitrary direction, is split into its components $\vec{a}_\parallel(t)$ and $\vec{a}_\perp(t)$ parallel and perpendicular to the velocity $\vec{v}(t)$.

3.3 Dynamics of Point-like Particles

In the last section we discussed how one can describe the motion of a particle. But we did not ask what makes the particle move on that trajectory. To do this, we now proceed to the description of the interaction between bodies and how the interaction is connected to the motion, i.e. what forces the particle to move the way it does. We will describe the interaction by forces¹¹. The relation to the motion is then described by Newton’s laws. After this rather general part, we look at the most important forces and how one can categorize forces.

3.3.1 Force
Although the concept of a force is very important in this chapter and one might have an intuition from everyday life, it is not that easy to define precisely what is meant by a force in physics. It is rather the relation to other (maybe more intuitive) quantities that defines a force. For the moment, let us define the force as the (mechanical) resistance a body opposes. For example, if someone wants to deform a body, one needs to apply a force. Or if one wants to change the velocity, one needs to apply a force, which then leads to the
11
One could also describe the interaction by the interaction energy which is done in Lagrangian and
Hamiltonian mechanics.


acceleration (see Newton’s second law). On the other hand, these two examples allow us to measure a force: if a body gets deformed, we know that a force is acting on it. Or if an object accelerates, we know a force is causing this acceleration¹². Since the resistance as described above might be different in different directions of space, the force is in general a vector.

3.3.2 Choice of Frame of Reference

Until now we simply took a frame of reference (usually a suitable one) without thinking about its consequences for the laws of physics. To be more precise, until now (meaning the kinematic part) we actually did not do physics, because we only defined different quantities. These definitions are in principle math, and physics comes in when we connect different concepts such as force and acceleration¹³. Depending on the choice of the frame of reference, the laws of physics might look different.

An example is driving through a curve with a car: In the frame of the driver, the car and the driver do not move; therefore the driver’s velocity is always zero and hence so is his/her acceleration. Nevertheless he/she feels a force in the curve, without any acceleration. For a person standing on the road, this force is needed in order to accelerate the car such that it can take the curve (for details see 4.1.3). So in both frames we have a force, but only in one of them is there an acceleration.

There is a very special class of reference frames in which Physics takes the simplest form.
These reference frames are called inertial frames of reference. They are characterized by
Newton’s first law. Most of the laws of physics are only valid in inertial frames. We
will have a look at non-inertial frames in section 4.1.3. Let’s now proceed and look at
Newton’s laws:

12
This statement is not as easy as it sounds because it depends on the choice of reference frame (see 3.3.2).
It might lead to fictitious forces if one does not choose an inertial frame.
13
The relation between position and velocity and acceleration is also math because we gave the different
derivatives of the position new names. The force is a completely independent concept and needs to be
related to the kinematic properties.


3.3.3 Newton’s Laws

The three laws that connect the description of motion (position, velocity, acceleration) with the description of interaction (force, interaction energy) are Newton’s laws (see also figure 3.6):

1. A body on which no force acts keeps a constant velocity¹⁴.

2. The total force $\vec{F}$ acting on a body causes an acceleration that is proportional to the force and inversely proportional to the mass m of the body: $\vec{a} = \frac{\vec{F}}{m}$.

3. If body A acts with a force $\vec{F}$ on body B, body B acts with a force $-\vec{F}$ on A.

Here we introduced the mass m for the first time. This is a property of the body which tells you how inert it is or, using the terminology of the definition of the force: the mass tells you how big the resistance of a body is if a force acts on it¹⁵.

Figure 3.6: Newton’s laws as pictures: Left: Newton’s first law: If no force acts ($\vec{F} = 0 \Rightarrow \vec{a} = 0$), a particle moves with constant velocity. Middle: Newton’s second law: Force is equal to mass times acceleration. Right: Newton’s third law: The force acting from one particle on the other is equal to minus the force acting from the other on the first one ($\vec{F}_{AB} = -\vec{F}_{BA}$).

The laws stated above contain a lot of information that needs to be discussed:
As already mentioned in section 3.3.2, the first law gives a definition for an inertial frame.
To be more precise: A reference frame is an inertial frame if and only if any body on
which no (total) force acts, does not move or moves with constant velocity. Returning
to the example of the car driving a curve we observe that the car is not an inertial frame:
For this we only look at the motion on the plane where the car is driving and do not take
14
Including the case where it remains in rest.
15
Be aware that defining the mass this way, the gravitational property of a mass is not included. In fact the
two properties ”resistance ” and ”two masses attract each other” are à priori two independent properties. A
body could have these two properties independently so a ”resistive mass” and an ”attractive mass”. These
two properties get unified in terms of general relativity.


into account the vertical gravitational force. A body that is not attached to the car will then be accelerated in the frame of the car, although there is no force acting on it¹⁶. In the frame of the road, that body will simply move straight ahead, as we would expect from a body on which no force acts.
The second law is the most important one because it answers the question of how a body behaves if a force is acting on it. One might think that the first law is contained in the second one. This is only partially true, because the second law is only valid in inertial frames. Therefore, if we want to link the motion and the forces (as given in the second law), we first have to ensure that we have an inertial frame, and for this we need the first law. If multiple forces act on a body, one has to sum them up (as vectors) to get the total force. According to the second law, the total force is then equal to $\vec{F}_{\text{tot}} = m\vec{a}$. Or to be more clear: if multiple forces act on a body such that they cancel each other, that body will not be accelerated. In principle you can solve all problems simply by summing up all forces and calculating the acceleration and then the path. But in general the force depends on the position or the velocity of the body, and you end up with complicated differential equations. One can often avoid this difficult way and get the result more easily using conservation laws and some tricks.
The third law is in one-to-one correspondence with the conservation of momentum. We will look at this more closely in section 3.4.1. The third law might sound contradictory to the second law: if a body A acts with a force $\vec{F}$ on B and B with a force $-\vec{F}$ on A, the total force is zero. And indeed, the total system composed of A and B will not accelerate (meaning the center of mass will not accelerate, see also 3.3.4). But as long as there is nothing that prevents the two bodies from accelerating (for example a rod between the two bodies), the total force on each body is non-zero and both will accelerate¹⁷. If there is, for example, a rod separating the two bodies, one also has to take into account the forces of the rod acting on each body. In this case the total force on each body is in fact zero.

The second law of Newton can be formulated in a more general way18 as

\[
\frac{d(m\vec{v})}{dt} = \vec{F} \tag{3.1}
\]
dt
16
This might sound contradictory to the statement in section 3.3.2, where we stated that in both frames a
force is acting. But there we looked at the driver and the driver is attached to the car and therefore a force
is needed in order to make the curve (otherwise the driver would leave the car).
17
They accelerate such that the total momentum is conserved. Or equivalently the center of mass is not
accelerating. This is ensured by the third law.
18
Newton already stated it in that way.


where the first term m~v is the momentum (see section 3.4.1). If the mass m is constant,
we can take it out of the derivative and we get the formula above. In most cases one can
use the simpler version of Newton’s second law.
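
As a brief sketch of this step, the only ingredient is the product rule applied to the momentum (writing the mass as a possibly time-dependent quantity is done here for illustration only):

```latex
\[
\vec{F} = \frac{d(m\vec{v})}{dt}
        = \frac{dm}{dt} \, \vec{v} + m \, \frac{d\vec{v}}{dt}
        \;\overset{dm/dt \, = \, 0}{=}\; m \, \vec{a}
\qquad \Longrightarrow \qquad
\vec{a} = \frac{\vec{F}}{m} .
\]
```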

3.3.4 Center of Mass

Consider a body consisting of N point-like particles, see figure 3.7. Assume these particles interact with each other such that there is a force between each pair of particles (in case two particles do not interact, their force is zero). In addition, assume there is an external force acting on each particle. Let us enumerate all the particles and focus on one particular particle with the number i. We can split the total force acting on this particle into the contributions of all other particles and the external force $\vec{F}_{i,\text{ext}}$.

Figure 3.7: Body consisting of three particles. The forces acting on each particle (the internal forces $\vec{F}_{ji}$ and the external forces $\vec{F}_{i,\text{ext}}$) are drawn.

Formally we get

\[
\vec{F}_i = \sum_{j \neq i} \vec{F}_{ji} + \vec{F}_{i,\text{ext}}
\]

where $\vec{F}_{ji}$ is the force¹⁹ acting from a particle j ≠ i on i. The total force acting on the entire body is given by the sum of all forces acting on all particles
\[
\vec{F}_{\text{tot}} = \sum_i \vec{F}_i = \sum_i \left( \sum_{j \neq i} \vec{F}_{ji} + \vec{F}_{i,\text{ext}} \right) .
\]

19
In some literature the indices i and j are swapped, so that $\vec{F}_{ji}$ is the force from i on j.

3 Mechanics 1

The first term (the sum over all forces $\vec{F}_{ji}$) is called the internal force and the second
term (the sum over all external forces) is called the external force. According to Newton’s third
law, the force acting from particle i on particle j is opposite to the one from j on i,
$\vec{F}_{ij} = -\vec{F}_{ji}$. As a consequence the sum over all $\vec{F}_{ji}$ vanishes and the
total force acting on the body is only

\[ \vec{F}_{\mathrm{tot}} = \sum_i \vec{F}_{i,\mathrm{ext}}. \]

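According to Newton’s second law, the total force acting on a particle i is equal to its
acceleration $\frac{d^2\vec{r}_i}{dt^2}$ times its mass $m_i$, which leads to

\[ \vec{F}_{\mathrm{tot}} = \sum_i \vec{F}_{i,\mathrm{ext}} = \sum_i m_i \frac{d^2\vec{r}_i}{dt^2} = M \frac{d^2}{dt^2} \frac{\sum_i m_i \vec{r}_i}{M} \qquad (3.2) \]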

where $M = \sum_i m_i$ is the total mass. This is a very useful statement because it means
that the motion of the entire body can be described as if it were a point-like particle at
the position

\[ \vec{R}_C = \frac{\sum_i m_i \vec{r}_i}{M}. \]

The position $\vec{R}_C$ is called the center of mass. Equation (3.2) can then be written as

\[ \vec{F}_{\mathrm{tot}} = M \frac{d^2\vec{R}_C}{dt^2} \]

which is exactly the equation of motion of a point-like particle at position $\vec{R}_C$. This is
especially remarkable since we do not have to know anything about the internal forces;
we can completely neglect them. Be aware that the center of mass $\vec{R}_C$ only describes
the motion of the entire body and not internal motion such as rotations or oscillations of its
constituents. In the case of a rigid body, have a look at section 4.2 as well.
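The two statements of this section can be checked numerically. The following is only a minimal sketch: the three masses, their positions and the spring-like internal force are made-up illustration values (not taken from the script), and the helper names `center_of_mass` and `spring_force` are hypothetical.

```python
# Sketch: center of mass of point masses and cancellation of internal forces.
# All numbers and the spring-like force law are assumptions for illustration.

def center_of_mass(masses, positions):
    """Return R_C = sum(m_i * r_i) / M for point masses in any dimension."""
    total_mass = sum(masses)
    dim = len(positions[0])
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(dim)
    )

def spring_force(r_from, r_on, k=1.0):
    """Hypothetical internal force on the particle at r_on, pulling it toward r_from."""
    return tuple(k * (a - b) for a, b in zip(r_from, r_on))

masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

R_C = center_of_mass(masses, positions)

# Sum of all internal forces F_ji (j != i) over all particles i.
internal_sum = [0.0, 0.0]
for i, r_i in enumerate(positions):
    for j, r_j in enumerate(positions):
        if i != j:
            f = spring_force(r_j, r_i)
            internal_sum[0] += f[0]
            internal_sum[1] += f[1]

print("center of mass:", R_C)
print("sum of internal forces:", internal_sum)
```

Because every pair contributes two opposite forces, the internal sum vanishes for any force law that obeys Newton’s third law, not only for the spring force assumed here.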


3.3.5 Equilibrium

In different topics of physics we will encounter equilibria. In general an equilibrium is
a state which does not change with time. Usually equilibria are easier to calculate than a general
state where an explicit time dependence needs to be calculated.

A body is in a mechanical equilibrium if it remains at rest or moves with constant velocity.

We focus on resting bodies. If a body remains at rest, its velocity and acceleration are zero.
According to Newton’s second law, the total force acting on the body must therefore also be zero.
There are three possibilities how this can happen, see also figure 3.8.

Figure 3.8: Examples of stable, unstable and indifferent equilibriums. On top, the case
for a ball rolling on a surface is drawn. On the bottom, a rod is suspended at the white
point.

Stable

The body is in a (local) minimum of energy. If it is displaced slightly, a force pushes it back
towards the minimum, so it returns to its position of equilibrium. See also the two examples
in figure 3.8. For small displacements, the restoring force is in most cases proportional
to the displacement. If slightly displaced, the body then performs a harmonic oscillation,
see also 6.2.


Unstable
In this case the body is energetically in a (local) maximum. A small displacement causes
the body to be pushed away from the initial position. Examples are shown in the middle
of figure 3.8. If the body is placed exactly at the maximum of the energy then the total
force vanishes also.

Indifferent
A third interesting case is the indifferent equilibrium. If the body is slightly displaced,
there is no force acting at all. See also the right part of figure 3.8.

3.3.6 Gravitational Force

One of the most important forces is gravity. According to Newton’s laws, a mass has the
property that it opposes the change of velocity. Or in other words, a mass is inert. But
masses also have another property: they attract each other.
Consider two masses $m_1$ and $m_2$ separated by a distance r. Then the attractive force [20]
between the two masses is given by

\[ F = \frac{G m_1 m_2}{r^2} \]

where $G = 6.67 \cdot 10^{-11}\,\mathrm{m^3 \cdot kg^{-1} \cdot s^{-2}}$ is the gravitational
constant. Obviously the force is proportional to each of the two masses. This has a very
important implication: if the only force acting on a body is gravity, then its motion is
independent of its mass. To see this, assume we double the mass of a body. Then the force on it
is also doubled. To get the acceleration of that body we have to divide the force by its mass,
and there the factor two cancels out [21].
Like in electromagnetism, we can define a field strength of a gravitational field. This field
strength is called gravitational acceleration [22]. The field strength is the quotient of the
force divided by the mass of one body. For example if we look at the earth with mass
M, the gravitational acceleration at the surface of the earth is given by
20
Since the form of this formula is closely related to the electric force between two point charges,
further properties can be found in the discussion of the electric field in chapter 9.1.
21
In electromagnetism, the force between point-like charges looks basically the same. But since that
force is proportional to the charge q, only bodies with the same quotient $\frac{q}{m}$ have
equal motion.
22
This terminology is a bit unfortunate, because one should think of a field strength and not an
acceleration. The gravitational acceleration corresponds to the electric field (strength) in
electromagnetism.


\[ g = \frac{GM}{r^2} \approx 9.81\,\mathrm{m \cdot s^{-2}} \]
where for the numerical value we took r to be the radius of the earth [23]. This means that a
body with mass m is attracted by the mass M with a force F = mg.
To write gravity in vector form, we have to take into account that two masses attract
each other. Denoting by $\vec{r}$ the vector pointing from mass M to mass m (see figure 3.9),
the force acting on mass m is given by

Figure 3.9: Two masses attract each other. The big arrow denotes the position $\vec{r}$ of m
with respect to M and the small arrow denotes the gravitational force acting on m.

\[ \vec{F}_{\mathrm{grav}} = -\frac{GMm}{|\vec{r}|^2} \vec{e}_{\vec{r}} = -\frac{GMm}{|\vec{r}|^3} \vec{r} \]

where $\vec{e}_{\vec{r}} = \frac{\vec{r}}{|\vec{r}|}$ is the unit vector along $\vec{r}$. The
minus sign is important as it represents the fact that two masses attract each other.
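As a rough numerical check of the formulas above, the following sketch evaluates the gravitational acceleration at the surface of the earth and the vector form of the force. The values for the mass and mean radius of the earth are commonly quoted reference values that are not given in this script, and the function names are hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the earth in kg (assumed reference value)
R_EARTH = 6.371e6    # mean radius of the earth in m (assumed reference value)

def gravitational_acceleration(mass, distance):
    """Field strength g = G * M / r^2 at a distance r from a point mass M."""
    return G * mass / distance**2

def gravitational_force(M, m, r):
    """Force vector on m, where r is the vector pointing from M to m."""
    norm = math.sqrt(sum(c * c for c in r))
    factor = -G * M * m / norm**3
    return tuple(factor * c for c in r)

g = gravitational_acceleration(M_EARTH, R_EARTH)
# A 1 kg body sitting on the z axis at the surface of the earth.
F = gravitational_force(M_EARTH, 1.0, (0.0, 0.0, R_EARTH))

print("g =", g)
print("F =", F)
```

The z component of the force is negative, so the force points from m back towards M, which is the attraction expressed by the minus sign.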

3.3.7 Independence of Motion

In many cases it is useful to split a three-dimensional movement into its x, y and z
coordinates and calculate the motion along each coordinate separately.
Example: Let’s look at projectile motion where, after we shoot a body (with mass m), the only
force acting is the gravitational force. Shooting it at time t = 0 with an initial velocity v
at an angle α (see figure 3.10), the horizontal velocity $\vec{v}_x$ and vertical velocity
$\vec{v}_y$ are

\[ \vec{v}_x = \cos(\alpha)\, v\, \vec{e}_x \]
\[ \vec{v}_y = \sin(\alpha)\, v\, \vec{e}_y \]
23
The earth is obviously not a point-like particle, and using the formula for point-like particles
needs to be justified. The proof that this is valid is the same as in the case of the static
electric field; we refer the reader to 9.1.


Putting the origin at the place where we shoot the body leads to $\vec{r}(t = 0) = 0$. We can
now calculate the motion along the different axes independently (for t > 0).

Figure 3.10: Projectile motion. (Labels: initial velocity $\vec{v}_0$ at angle α to the x axis,
with components $v_x = v_0 \cos(\alpha)$ and $v_y = v_0 \sin(\alpha)$.)

• As there is no initial velocity in the z direction and no force acts in the z direction,
the z component will not change. So we have z(t) = 0.

• Along the x axis, there is no force acting, so no acceleration. But we have an initial
velocity, so we get a linearly growing x component: $x(t) = v_x t$ where $v_x = |\vec{v}_x|$.

• In y direction we have the gravitational force acting as well as an initial velocity.
Therefore we can use the uniformly accelerated linear motion from section 3.2.2 with
$t_0 = 0$, $x_0 = 0$, $v_0 = v_y = |\vec{v}_y|$ and $a = -g < 0$, where $g > 0$ and the minus
sign is due to gravity pointing in the direction opposite to the y axis. We then have the motion
$y(t) = v_y t - \frac{g}{2} t^2$.

Due to this splitting we were able to reduce this three-dimensional motion to the known
motion in one dimension. We can even go a step further and calculate the y component
as a function of the x component, meaning that for each point along the x axis we know
the height of the body. For this we solve $x = v_x t$ for t, get $t = \frac{x}{v_x}$ and
substitute it into the function for the y component:

\[ y(x) = v_y t(x) - \frac{g}{2} t(x)^2 = v_y \frac{x}{v_x} - \frac{g}{2} \left( \frac{x}{v_x} \right)^2 = \frac{v_y}{v_x} x - \frac{g}{2 v_x^2} x^2 \]
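The splitting into independent x and y motions can be tried out directly. The sketch below uses made-up launch values (20 m·s⁻¹ at 30°) and hypothetical function names; it computes the position from the time and compares the height with the closed form y(x).

```python
import math

g = 9.81  # gravitational acceleration in m/s^2

def position(t, v0, alpha):
    """x(t) and y(t) computed independently along each axis."""
    vx = v0 * math.cos(alpha)
    vy = v0 * math.sin(alpha)
    return vx * t, vy * t - 0.5 * g * t**2

def height_at(x, v0, alpha):
    """Trajectory y(x), obtained by substituting t = x / vx into y(t)."""
    vx = v0 * math.cos(alpha)
    vy = v0 * math.sin(alpha)
    return (vy / vx) * x - g / (2 * vx**2) * x**2

# Assumed example values: 20 m/s at 30 degrees, evaluated after 1.5 s.
v0, alpha = 20.0, math.radians(30)
x, y = position(1.5, v0, alpha)

print("x =", x, "y =", y)
print("y(x) from the closed form:", height_at(x, v0, alpha))
```

Both ways of computing the height agree, which is just the statement that eliminating t does not change the physics.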


3.3.8 Volume and Surface Forces

There are two conceptually different ways a force can act on a body: a force can act on the
entire body or only on its surface. Correspondingly one calls these forces volume and surface
forces. Obviously these concepts do not hold for point-like particles
since they have no volume and no surface [24].
In case of a volume force, the force acts on the entire body. It does not have to act equally
on all parts of the body. Examples of volume forces are gravity, or the forces of a magnetic
field or of an electric field in the case of insulators [25].
This is different from surface forces, where the force only acts on the surface of the body.
Examples of these are pressure and friction. One can even split up the surface forces
into normal and tangential forces. A normal force acts perpendicular to the surface. For
example if a stone is placed on a horizontal table, the table opposes the gravitational force
on the stone by pushing it up. This upward force only acts on the part of the stone’s surface
that touches the table and nowhere else. Since gravity acts vertically, the force between the
table and the stone is also vertical and therefore perpendicular to the surface. This is
different in case of a tangential force, which acts parallel to the surface. For example if the
stone is pushed over the table, there is friction which acts parallel to the surface. For more
details, see also the next section.

3.3.9 Friction
An exact description of friction (including fluid resistance) is pretty difficult. Nevertheless
there are different models, of which we will look at the simplest one.
This model states that the friction of a (moving) body is proportional to its normal force,
see also figure 3.11. The proportionality constant is called the coefficient of friction µ. One
can distinguish three different types of friction: static friction, dynamic friction and rolling
resistance. Each of these three types has its own coefficient.
Static friction happens if two bodies are placed one onto the other but do not move relative to
each other, for example a stone on a table. As long as a horizontal force is smaller than the
maximal static friction, the stone does not move. This maximal static friction is given by
$F_s = \mu_s F_N$ where $\mu_s$ is the static friction coefficient and $F_N$ is the normal force
of the stone, e.g. on a horizontal table its gravitational force $F_N = mg$.
24
In fact these concepts only make sense for bodies which consist of a lot of particles, so that
we can describe them as a continuum.
25
In case of conductors in an electric field, the charge is distributed on the surface of the body. See also
9.2.5.


Figure 3.11: Body on a plane surface (e.g. table) is pulled with a force F. The friction $F_f$
is proportional to the force $F_N$ that pushes the body onto the surface.

Be aware that this is only the maximal horizontal force one can apply before the stone
starts sliding. If one applies a force $F < F_s$, the friction is equal to F and not $F_s$ [26].
If the force is bigger than the static friction or if the body is already sliding, dynamic
friction happens. This means the friction is $F_d = \mu_d F_N$ where $\mu_d \le \mu_s$ is the
dynamic friction coefficient. In our example, the stone slides with a constant speed if the
applied force is equal to the dynamic friction, $F = F_d$. If the applied force is bigger, the
stone accelerates with an acceleration $a = \frac{1}{m}(F - F_d)$.
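The distinction between static and dynamic friction in this simple model can be written as a small case distinction. The following is a sketch under stated assumptions: the function names, the `sliding` flag and the numerical values (a 10 kg stone with µs = 0.5 and µd = 0.4) are hypothetical and only serve as illustration.

```python
def friction_force(applied, normal, mu_s, mu_d, sliding=False):
    """Magnitude of the friction opposing a horizontal applied force (inputs >= 0)."""
    if not sliding and applied <= mu_s * normal:
        return applied        # static friction exactly balances the applied force
    return mu_d * normal      # dynamic friction once the body slides

def acceleration(applied, mass, mu_s, mu_d, g=9.81, sliding=False):
    """Horizontal acceleration a = (F - F_friction) / m on a horizontal table."""
    normal = mass * g         # normal force equals the weight on a horizontal table
    return (applied - friction_force(applied, normal, mu_s, mu_d, sliding)) / mass

# Assumed example: a 10 kg stone with mu_s = 0.5 and mu_d = 0.4.
print(acceleration(5.0, 10.0, 0.5, 0.4))    # below the static limit: no motion
print(acceleration(60.0, 10.0, 0.5, 0.4))   # above the static limit: stone accelerates
```

For a body that is already sliding (`sliding=True`) and an applied force smaller than the dynamic friction, the returned acceleration is negative, meaning the body slows down.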
Instead of sliding, a body could also roll. In this case one calls the “friction” rolling
resistance. One can also assign a coefficient $\mu_r$ to this resistance. For well-formed bodies
(such as cylinders or spheres), the rolling resistance is much smaller than dynamic friction. Be
aware that rolling is closely related to static friction, because at each instant one point of
the rolling body is not moving (the one touching the table). This point is subject to static
friction. For better understanding imagine riding a bike. If you brake, it is not the rolling
resistance that allows you to brake: as long as the wheel is still turning, your braking force
is limited by the static friction between the ground and the wheel. If you brake harder, you
overcome the static friction and you start sliding. This is worse because the dynamic friction
is smaller than the static one and you brake more slowly (smaller deceleration). That is the
reason most cars have ABS: during emergency braking, the on-board computer regulates the braking
such that the wheels never lock, so they always brake with static friction and not dynamic
friction.

26
Otherwise the total horizontal force would not vanish and the stone should accelerate horizontally.


3.4 Momentum, Work, Energy and Power

In principle one can always proceed as described in the previous sections. However, the resulting
equations of motion are very often too difficult to solve. This does not mean we cannot
say anything about the behaviour of a complicated system: we can use conserved
quantities, such as momentum and energy, which we treat in this chapter.

3.4.1 Momentum
The momentum already appeared in the most general formulation of Newton’s second
law (see equation (3.1)). It is defined as $\vec{p} = m\vec{v}$. As long as no force is acting,
the time derivative of the momentum is zero, meaning momentum is conserved/constant. Or in
other words, momentum is only changing if a force acts. The change of momentum
$\Delta\vec{p}$ is then given by

\[ \Delta\vec{p} = \vec{p}(t_2) - \vec{p}(t_1) = \int_{t_1}^{t_2} \vec{F}(t)\, dt. \]
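When the force is not constant, the integral for the change of momentum can be approximated numerically. The sketch below uses the trapezoidal rule on a made-up force pulse (a half sine wave of 100 N peak lasting 0.01 s, acting on a 0.5 kg ball); the pulse shape, the numbers and the function name `impulse` are assumptions for illustration only.

```python
import math

def impulse(force, t1, t2, steps=10_000):
    """Approximate the integral of force(t) from t1 to t2 with the trapezoidal rule."""
    dt = (t2 - t1) / steps
    total = 0.5 * (force(t1) + force(t2))
    for k in range(1, steps):
        total += force(t1 + k * dt)
    return total * dt

# Assumed example: a short kick modelled as half a sine wave.
F0, T, m = 100.0, 0.01, 0.5

def kick(t):
    return F0 * math.sin(math.pi * t / T)

delta_p = impulse(kick, 0.0, T)   # change of momentum in kg m/s
delta_v = delta_p / m             # resulting change of velocity in m/s

print("delta p =", delta_p)
print("delta v =", delta_v)
```

For this particular pulse the integral can also be done by hand and gives 2·F0·T/π, which the numerical value reproduces closely.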

3.4.2 Work
The concept of work is pretty important in physics as it connects energy and force.
Roughly speaking, work is one possibility to convert one form of energy into another.
For example if someone drops something, e.g. a stone, it gets accelerated and the poten-
tial energy converts into kinetic energy (see also sections below). This transformation of
energy happens due to the gravitational force acting along the path of the stone.
For a constant force F always pointing in the direction of movement, the work W is
defined as

\[ \Delta W = F \Delta s \]

where $\Delta s$ is the length of the considered path. The unit of work is the newton meter
(N·m), which is the same as the joule (J) or the watt second (W·s). For example, when dropping a
stone of 1 kg from a height of 2 m, the work done by gravity is
$W = F s = 1\,\mathrm{kg} \cdot 9.81\,\mathrm{m \cdot s^{-2}} \cdot 2\,\mathrm{m} = 19.62\,\mathrm{J}$.
In case the force points perpendicular to the direction of motion, no work is done.
Intuitively this can be seen in the case of circular motion, where no energy is supplied but a
force acts radially and therefore perpendicular to the movement. Or in case of gravity: moving
our stone horizontally does not change its potential energy, so there is no work done by


gravity [27]. In general we can split up the force into components parallel and perpendicular
to the motion, see also figure 3.12. Then only the parallel component contributes
to the work. This is achieved by taking the scalar product of force and path

\[ \Delta W = \vec{F} \cdot \Delta\vec{s} \]

Figure 3.12: A body is elevated along $\Delta\vec{s}$ (only vertical movement) by a force
$\vec{F}$. Only the component $\vec{F}_\parallel$ along $\vec{s}$ contributes to the work
$\Delta W = \vec{F}_\parallel \cdot \Delta\vec{s} = \vec{F} \cdot \Delta\vec{s}$.

In general the force might not be constant along the path and the path not a straight
line. Then the calculation can get pretty tedious [28]. In this general case we split up the
path into small pieces $\Delta\vec{s}$ on which we can assume the force $\vec{F}$ to be constant
and the piece to be a straight line. Then we calculate the work done on each small piece and sum
all this work up. The limit of splitting the path into ever smaller pieces (length going towards
zero) is mathematically an integral. In this integral we integrate (= sum) the small work pieces
$dW = \vec{F} \cdot d\vec{s}$, which leads to the formula

\[ \Delta W = \int_{\vec{s}}^{\vec{s} + \Delta\vec{s}} \vec{F} \cdot d\vec{s}. \]
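The idea of splitting the path into small straight pieces translates directly into a sum. The following sketch reuses the stone from the example above (1 kg, falling 2 m) and compares a straight drop with a curved path that loses the same height; the path parametrisation and the function name `work` are assumptions made for this illustration.

```python
import math

def work(force, path, steps=10_000):
    """Sum F . ds over small straight pieces of path(u), with u running from 0 to 1."""
    total = 0.0
    prev = path(0.0)
    for k in range(1, steps + 1):
        cur = path(k / steps)
        mid = tuple(0.5 * (a + b) for a, b in zip(prev, cur))   # force evaluated mid-piece
        ds = tuple(b - a for a, b in zip(prev, cur))            # small straight piece
        total += sum(f * d for f, d in zip(force(mid), ds))     # scalar product F . ds
        prev = cur
    return total

m, g = 1.0, 9.81

def gravity(r):
    return (0.0, -m * g)   # homogeneous field pointing in -y direction

def straight(u):
    return (0.0, 2.0 * (1.0 - u))   # vertical drop from y = 2 m to y = 0

def arc(u):
    angle = u * math.pi / 2         # quarter circle from (0, 2) to (2, 0)
    return (2.0 * math.sin(angle), 2.0 * math.cos(angle))

print("straight drop:", work(gravity, straight))
print("quarter circle:", work(gravity, arc))
```

Both paths give the same work of about 19.62 J, because only the component of each piece along the force contributes.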
27
Although one might need to apply work in order to move the stone, for example when pushing it
over a table where friction occurs. Then the work is only the friction times the path length.
28
In most cases the general calculation is not needed but it is important to understand what this general
formula (intuitively) means.


3.4.3 Energy
The definition of energy is the ability to perform work. This means if we want to perform
work, we must spend some kind of energy which is then transferred (by the work) into
another kind of energy. Energy is also a conserved quantity. However, there exist
a lot of different forms of energy. The conservation of energy is stated as the first law of
thermodynamics [29]. In order to apply the conservation one needs to consider all kinds of
energy, which makes it in some cases less applicable than conservation of momentum.

3.4.4 Potential Energy

If a force acts between two bodies, in some cases it is possible to attribute a potential
energy. Since this is in general a bit tricky, we will first have a look at an easy case and
then generalize.
The easiest case is the one of a homogeneous field, e.g. the gravitational field on earth [30].
We choose the coordinate system such that the z-axis points vertically up (see also figure
3.13). The force on a body with mass m is then $\vec{F}_{\mathrm{grav}} = -g m \vec{e}_z$. If we
lift a body vertically we have to apply a force $\vec{F}_e = -\vec{F}_{\mathrm{grav}}$. Lifting it
up by a vertical distance s, we have to apply the work
$W = |\vec{F}_e| s = \vec{F}_e \cdot s\vec{e}_z = -\vec{F}_{\mathrm{grav}} \cdot s\vec{e}_z > 0$.
If we take the xy plane as reference, we can assign to each height z the energy we have to apply
to reach it. This energy is called potential energy $E_{\mathrm{pot}}$ and is given by

\[ E_{\mathrm{pot}} = -\vec{F}_{\mathrm{grav}} \cdot \vec{r} = |\vec{F}_{\mathrm{grav}}| z \]

where $\vec{r}$ is the vector pointing at a certain position (see also figure 3.13). We therefore
managed to attribute a gravitational energy to every point in space. Knowing the potential
energy, we can also go back to the force by taking the derivative

\[ \vec{F}_{\mathrm{grav}} = -\frac{dE_{\mathrm{pot}}}{dz} \vec{e}_z. \]
Here we used, that we already know the direction of the force. A general expression is
given in the next paragraph, where we look at general potentials.

29
In Newtonian mechanics this cannot be proven. But in Lagrangian mechanics this is associated to
the assumption that physics is independent of time, meaning the laws of physics are true yesterday, today,
tomorrow and at any other time.
30
Here we only look near the surface of the earth such that earth looks like a plane.


For the general case we consider two bodies, denoted by their masses M and m, which
interact with each other. This interaction leads to a force $\vec{F}(\vec{r})$ between them,
where $\vec{r}$ denotes the vector from M to m, see also figure 3.14. In some cases [31], we can
attribute a potential energy to this configuration. If this is possible, we can proceed as
follows. We choose a reference point $\vec{r}_0$ to which we attribute the energy
$E_{\mathrm{pot}} = 0$. The choice of this reference point is arbitrary, due to equation (3.5).
Then we calculate the work W that needs to be done to move [32] m from $\vec{r}_0$ to $\vec{r}$.
The potential energy is then

\[ E_{\mathrm{pot}}(\vec{r}) = W(\vec{r}_0 \to \vec{r}) = -\int_{\vec{r}_0}^{\vec{r}} \vec{F} \cdot d\vec{s} \qquad (3.3) \]

where the minus sign takes into account that we have to apply the external force
$\vec{F}_e = -\vec{F}$ in order to move the body m.
This definition of the potential energy is only meaningful if it does not depend on the
path from $\vec{r}_0$ to $\vec{r}$. In particular this means we can go from point $\vec{r}_0$ to
$\vec{r}$ and back (by a different path) and gain no energy. This leads to the constraint which
the interaction between the two bodies has to fulfil in order to be described by a potential
energy: the work we have to apply along a closed path must be zero, or as a formula

\[ W = \oint \vec{F}_e \cdot d\vec{s} = 0 \qquad (3.4) \]

where the circle in the integral symbolizes that we take a closed path. Such an interaction is
associated with a so-called conservative field; for further information see the section about
electromagnetism, chapter 9.2. Not all interactions fulfil this constraint, for example
friction or many time-varying fields such as a time-dependent magnetic field [33].
If we can describe an interaction with a potential energy, many calculations simplify. For
example, to calculate the energy difference between two points [34] $\vec{r}_1$ and $\vec{r}_2$,
we can simply take the potential energy at $\vec{r}_2$ and subtract the one at $\vec{r}_1$:
31
The condition to succeed is given in equation (3.4).
32
We could also move M which yields the same result. But since ~r points to m it is more intuitive to
move m.
33
This is used in a transformer, where the electrons ”flying” around a varying magnetic field gain energy,
meaning a voltage builds up.
34
The difference to the calculation before is that neither of these points needs to be the
reference point $\vec{r}_0$.


 
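\[ \Delta E_{\mathrm{pot}} = -\int_{\vec{r}_1}^{\vec{r}_2} \vec{F} \cdot d\vec{s} = -\int_{\vec{r}_0}^{\vec{r}_2} \vec{F} \cdot d\vec{s} - \left( -\int_{\vec{r}_0}^{\vec{r}_1} \vec{F} \cdot d\vec{s} \right) = E_{\mathrm{pot}}(\vec{r}_2) - E_{\mathrm{pot}}(\vec{r}_1). \qquad (3.5) \]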

On the other hand, if the potential energy of an interaction is given, we can also go back
to the force [35]. Similar to the example above with the homogeneous gravitational field,
we can apply a derivative to the potential energy. This is also clear from an analytic
point of view: the potential energy is the integral of the force, and the “inverse” of the
integral is the derivative. So applying an appropriate derivative to the potential energy
should give back the force. Indeed we can get the force by

\[ \vec{F}(\vec{r}) = -\begin{pmatrix} \frac{\partial}{\partial x} \\ \frac{\partial}{\partial y} \\ \frac{\partial}{\partial z} \end{pmatrix} E_{\mathrm{pot}}(\vec{r}) \]

where these curly derivative signs ∂ are the usual derivatives but indicate that the potential
energy not only depends on one parameter but on x, y, and z. The minus sign is there
because of the same reason as in the definition of the potential energy, see equation (3.3).
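Going back from a potential energy to the force can be illustrated with finite differences instead of exact derivatives. The sketch below is a minimal example for the homogeneous gravitational field, where the potential energy is m·g·z; the mass, the evaluation point, the step size `h` and the function name `gradient_force` are assumptions for illustration.

```python
def gradient_force(e_pot, r, h=1e-6):
    """Approximate F = -grad E_pot at the point r using central differences."""
    force = []
    for k in range(len(r)):
        plus = list(r)
        minus = list(r)
        plus[k] += h
        minus[k] -= h
        force.append(-(e_pot(plus) - e_pot(minus)) / (2 * h))
    return tuple(force)

m, g = 2.0, 9.81   # assumed mass in kg and gravitational acceleration in m/s^2

def e_pot(r):
    return m * g * r[2]   # potential energy in the homogeneous field, z axis pointing up

F = gradient_force(e_pot, (1.0, -3.0, 5.0))
print("F =", F)
```

Only the z component is non-zero and it is negative, so the force points downwards with magnitude m·g regardless of the point chosen.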

3.4.5 Kinetic Energy

Take a bouncy ball and let it fall: it bounces up again. When flying up, it gains potential
energy, so work must be done to bring it up again. Hence there must be another kind of energy
which allows the ball to regain potential energy. This other energy is the kinetic energy
(neglecting elastic energy when it bounces). In order to calculate the kinetic energy, we make
use of the conservation of energy. This means the potential energy lost must be converted into
kinetic energy. When the ball falls for a time t (starting from rest), it covers a distance
$h = \frac{1}{2} g t^2$ and loses the potential energy $mgh$.
The same energy must then be gained as kinetic energy. Therefore the kinetic energy is

35
Since this includes more advanced math, you do not need to know this for the exams.


\[ E_{\mathrm{kin}} = mgh = \frac{1}{2} m g^2 t^2 = \frac{1}{2} m v^2 = \frac{1}{2} p v, \qquad \text{using } gt = v, \]

where v is the final velocity of the ball.

Doubling the velocity leads to four times the kinetic energy. This is clear because
to double the velocity the ball has to fall for twice as long and therefore over a four times
longer distance. As a consequence four times more potential energy is converted into kinetic
energy.

3.4.6 Power

Another interesting quantity is the amount of work per time, which is called power P.
If work is done at a constant rate, so that the work $\Delta W$ is done during a period of time
$\Delta T$, the power is simply $P = \frac{\Delta W}{\Delta T}$.
Therefore if one needs longer for the same work, one’s power is smaller and vice versa.
Similar to the discussion about mean velocity and instantaneous velocity in section 3.2.1,
we can look at mean and instantaneous power. The instantaneous power is defined as

\[ P(t) = \frac{dW}{dt}. \]

This instantaneous power describes how much work per second is done at a certain time.
To get the mean power we choose a slightly different approach, which is more common
in terms of power and work. Let’s calculate the mean power between $t_1$ and $t_2$. For this
we average the instantaneous power. Intuitively speaking we split the time $T = t_2 - t_1$
into N small pieces and consider the power to be constant during each piece. Then we
sum all these pieces up and divide by the total number of pieces. Letting N go towards
infinity we end up with an integral


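\[ \bar{P} \approx \frac{1}{N} \sum_{j=1}^{N} P\left( t_1 + \frac{Tj}{N} \right) = \sum_{j=1}^{N} P\left( t_1 + \frac{Tj}{N} \right) \frac{1}{T} \frac{T}{N} \to \frac{1}{T} \int_{t_1}^{t_2} P(t)\, dt = \bar{P} \]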

where we used $\frac{T}{N} \to dt$ for N going towards infinity. The unit of power is the watt
(W), which is one joule per second. Using this, the unit of energy is sometimes written as kW·h,
with 1 kW·h = 3.6 MJ, which is the work done by a power of one kilowatt during one hour.
If you’re not used to thinking in terms of work and power, you might mix them up (as many
politicians and journalists do). For example, a light bulb has a power of 40 W; this means that
each second it converts 40 J of electric energy into light and heat. If the bulb shines for one
hour it has “worked” (not mechanical work) 40 W · 3600 s = 144000 J, which is of course an
energy.
Other example: a house needs (let’s say) 7500 kW·h per year. This is a power, because
you have energy per year, which is equivalent to work per time.
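The two examples above can be redone as a short unit conversion. This is only a sketch: the constant and function names are hypothetical, and the year is assumed to have 365 days.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 365 * 24   # assumption: a non-leap year

def kwh_to_joule(kwh):
    """Convert an energy in kW·h to joule (1 kW·h = 1000 W * 3600 s)."""
    return kwh * 1000 * SECONDS_PER_HOUR

# Light bulb: a power of 40 W during one hour gives an energy.
bulb_energy = 40 * SECONDS_PER_HOUR   # in J

# House: an energy of 7500 kW·h per year corresponds to a mean power.
house_mean_power = kwh_to_joule(7500) / (HOURS_PER_YEAR * SECONDS_PER_HOUR)   # in W

print("bulb energy in J:", bulb_energy)
print("house mean power in W:", house_mean_power)
```

Under these assumptions the house consumes on average a bit less than one kilowatt, which makes it clear that "kW·h per year" is a power and not an energy.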

3.4.7 Rotation Energy

One other form of mechanical energy is the rotation energy. Consider a rotating body
consisting of (multiple) mass point(s) $m_i$. Then the rotation energy is given by the kinetic
energy of all the mass points

\[ E_{\mathrm{rot}} = \sum_i E_{\mathrm{kin},i} = \sum_i \frac{1}{2} v_i^2 m_i = \sum_i \frac{1}{2} \omega^2 r_i^2 m_i = \frac{1}{2} \omega^2 I \]

where [36] $\omega$ is the angular frequency, and we used that the velocity is $v_i = \omega r_i$
and that $I = \sum_i r_i^2 m_i$ is the moment of inertia. Here $r_i$ is the distance of mass
point i from the rotation axis. We will have a closer look at rotating bodies in the
next chapter 4.
36
In scalar form. For the vector form we would need the vector product, see also 2.1.2.


Example:
Someone is riding a bike with total mass M = 80 kg with a velocity
$v = 10\,\mathrm{m \cdot s^{-1}}$; then the kinetic energy is
$E_{\mathrm{kin}} = \frac{1}{2} M v^2 = 4000\,\mathrm{J}$. Additionally the wheels
are rotating, meaning they also have rotation energy. Assuming (two) cylindrical wheels, each
with mass m = 0.6 kg, radius R = 0.5 m and all of its mass at the rim, the moment of inertia
is $I = mR^2 = 0.15\,\mathrm{kg \cdot m^2}$. Using $\omega = \frac{v}{R} = 20\,\mathrm{s^{-1}}$
we get a rotational energy $E_{\mathrm{rot}} = 2 \cdot \frac{1}{2} \omega^2 I = 60\,\mathrm{J}$.
The total energy is therefore $E_{\mathrm{tot}} = 4060\,\mathrm{J}$.
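The numbers of the bike example can be reproduced in a few lines. The sketch below uses the values from the example; the function and variable names are hypothetical, and the moment of inertia I = mR² assumes, as in the example, that the wheel mass sits at the rim.

```python
def kinetic_energy(mass, v):
    """Translational kinetic energy 1/2 * m * v^2."""
    return 0.5 * mass * v**2

def rotation_energy(inertia, omega):
    """Rotation energy 1/2 * I * omega^2."""
    return 0.5 * inertia * omega**2

M, v = 80.0, 10.0        # total mass in kg and velocity in m/s
m_wheel, R = 0.6, 0.5    # mass in kg and radius in m of one wheel

I = m_wheel * R**2       # moment of inertia of one wheel (mass at the rim)
omega = v / R            # angular frequency of a wheel rolling without slipping

E_kin = kinetic_energy(M, v)
E_rot = 2 * rotation_energy(I, omega)   # two wheels
E_tot = E_kin + E_rot

print(E_kin, E_rot, E_tot)
```

The rotation of the wheels contributes only about 1.5 % of the total energy here, which is why it is often neglected in rough estimates.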

3.4.8 Angular Momentum

In terms of conservation laws one should clearly mention the angular momentum which
we will also investigate more precisely in the next chapter. Nevertheless the most impor-
tant formulas shall be given already now.
For a point-like particle at position $\vec{r}$ and with mass m the angular momentum L is defined
as [37]

\[ L = r m v_\perp \]

where $v_\perp$ is the velocity component perpendicular to $\vec{r}$ and $r = |\vec{r}|$ is the
absolute value of the position. For symmetric bodies consisting of many particles, we can use the
moment of inertia introduced in section 3.4.7 and we get

\[ L = I\omega. \]

37
In fact the angular momentum is a vector quantity. For simplicity we only give the scalar form.


Figure 3.13: A body in the gravitational field is lifted from the ground to a coordinate z. The
work is $W(z) = |\vec{F}_{\mathrm{grav}}| z$. This means the work done is stored as potential
energy. With the reference point $E_{\mathrm{pot}}(z = 0) = 0$ we get
$E_{\mathrm{pot}}(z) = W(z) = |\vec{F}_{\mathrm{grav}}| z$.


Figure 3.14: Two bodies with mass M and m interact with each other. The potential
energy at $\vec{r}$ is equal to the work that needs to be applied to m when moving it from the
reference point $\vec{r}_0$ to $\vec{r}$.

Chapter 4

MECHANICS 2
I was the first camera team to visit the
moons of Jupiter.
Galileo

4.1 Rotations
4.2 Rigid Bodies
4.3 Dynamics of Rotation
4.4 Gravity

After we got familiar with the basic concepts of mechanics (see chapter 3) we now want to
apply these concepts for more advanced topics and examples. Namely we will introduce
a more general notation and description for rotations and discuss related topics such as
fictitious forces and rigid bodies. Finally we will make a little excursion to the motion of
planets and in particular to Kepler’s laws.

4.1 Rotations

Since most of this chapter is about rotational motion it is important to have a reliable
mathematical description. This will be important to precisely describe and calculate
effects arising from rotations [1]. In particular we will introduce the angular velocity vector
and redefine all the rotational quantities using this vectorial notation. In the end we will have
a look at rotating frames and the appearance of fictitious forces.

4.1.1 Angle and Angular Velocity as Vectors

When we started with kinematics, we started with the position of a point-like particle and
continued with the velocity and acceleration. The corresponding quantities for rotations
are the angle $\varphi$, the angular velocity $\omega$ and the angular acceleration $\alpha$.
Since the angle is only defined up to multiples of 360° (or equivalently 2π in radians), it is
less important than the angular velocity. Hence it is more intuitive to start with the angular
velocity and then define the angle.
Assume a body (for example a point-like particle) is performing a circular motion around
a given axis, see also figure 4.1. In addition we want to assume that the time T for one
round trip is constant, meaning we have a constant angular velocity $\omega = 2\pi/T$. The
motion of the body has three important properties

• The body moves in a plane perpendicular to the axis. Therefore the velocity (as
vector) lies in this plane.

• As we assumed the frequency to be constant, its speed is proportional to the distance
of the body to the axis $r_\perp$.

• The speed is proportional to the frequency ω.

1
Rotational motion can be quite unintuitive that’s why we need a reliable mathematical formalism


Figure 4.1: A body (black dot) is rotating around an arbitrary axis $\vec{\omega}$. Its motion
happens in a plane perpendicular to that axis.

These three properties can be merged into one formula, for which we need to define the
angular velocity vector $\vec{\omega}$: it points in the direction of the axis and its length is
equal to the angular velocity, $\omega = |\vec{\omega}|$. The direction of $\vec{\omega}$ is such
that if the thumb of the right hand points in the direction of $\vec{\omega}$, the velocity
$\vec{v}$ points in the direction of the other fingers. Using this definition and assuming the
center of the coordinate system is on the rotation axis, the velocity $\vec{v}$ is related to the
angular velocity by

\[ \vec{v} = \vec{\omega} \times \vec{r} \]

where $\vec{r}$ is the position vector of the body pointing from the center of the rotation to the
body itself. The vector product ensures all three properties mentioned above. It is a
good exercise to prove the three properties from this formula.
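The three properties can also be checked numerically for a concrete case. The following sketch uses made-up values (a rotation with 2 rad/s about the z axis and a body at distance 3 from that axis); the helper functions `cross` and `dot` are written out by hand and are not part of the script.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

omega = (0.0, 0.0, 2.0)   # assumed rotation about the z axis with 2 rad/s
r = (3.0, 0.0, 4.0)       # assumed position; distance to the z axis is 3

v = cross(omega, r)
speed = math.sqrt(dot(v, v))

print("v =", v)
print("v . omega =", dot(v, omega))   # zero: v lies in the plane perpendicular to the axis
print("speed =", speed)               # equals |omega| times the distance to the axis
```

The component of $\vec{r}$ along the axis (here the 4 in z direction) drops out of the vector product, so only the distance to the axis matters for the speed.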
Since the angular velocity $\vec{\omega}$ represents the rate of change of an angle
$\vec{\varphi}$, we can define an angle by integrating $\vec{\omega}$:

\[ \vec{\varphi} = \int \vec{\omega}\, dt + \vec{C} \]


where $\vec{C}$ is just a constant angle (integration constant). The properties of this angle
vector $\vec{\varphi}$ are inherited from those of $\vec{\omega}$. The direction of
$\vec{\varphi}$ is along the rotation axis and the length corresponds to the rotated angle
(usually in radians). The length of the path l of an object rotated by $\vec{\varphi}$ is

\[ l = |\vec{\varphi} \times \vec{r}|. \]

Before closing this section, we investigate the vectorial property of $\vec{\omega}$ more
precisely. For this consider a body which simultaneously performs rotations around the x, y, and
z axes, see figure 4.2.

Figure 4.2: A body (gray shaded) is simultaneously rotating around all three axes. This
rotation is equivalent to a rotation around the angular velocity vector given as the vectorial
sum of the rotations around each axis.

This rotation corresponds to a single rotation given by

\[ \vec{\omega} = \begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix}. \]


4.1.2 Angular Acceleration and General Motion

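For a linear motion we found that the rate of change of the velocity is the acceleration
$\vec{a} = \frac{d\vec{v}}{dt}$. Similarly we can define an angular acceleration

\[ \vec{\alpha} = \frac{d\vec{\omega}}{dt} = \frac{d^2\vec{\varphi}}{dt^2}. \]

The tangential acceleration of a body (the part caused by the change of $\vec{\omega}$) is then
given by

\[ \vec{a} = \vec{\alpha} \times \vec{r} \]

where $\vec{r}$ points from the rotation axis to the body.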

If the angular acceleration is constant, $\vec{\alpha} = \text{const}$, we can derive the angle
and the angular velocity with the same argument as for the linear motion and we get

\[ \vec{\omega}(t) = \vec{\alpha} t + \vec{\omega}_0 \]
\[ \vec{\varphi}(t) = \frac{1}{2} \vec{\alpha} t^2 + \vec{\omega}_0 t + \vec{\varphi}_0 \]

with $\vec{\omega}_0$ and $\vec{\varphi}_0$ the corresponding values at t = 0.

4.1.3 Accelerated Frames and Fictitious Forces

Imagine, for example, sitting on a bus at night when the cutest kitten suddenly appears
on the street a couple of meters ahead of the bus. The driver almost immediately pushes
down the brake pedal. You feel how your body is somehow dragged forward and you
almost fall off your seat. It feels as if some magic force acting on your body had appeared
at the instant of braking. In this chapter we want to investigate this magic force and
we will see that it is an effect of being in an accelerated frame. Because the mother of
the kitten might have seen it from the roadside. For her, as much as she might have
been scared, the whole situation was nothing but a manifestation of Newton’s first law of
motion. The bus would decelerate, and you would just keep moving at your initial speed
along the direction of travel, no magic force involved. Still, from your point of view, the
magic force must have existed, as otherwise Newton’s first law would have been violated
in your reference frame.
As we already discussed in chapter 3.3.2, the simplest frames of reference are the inertial ones, where Newton’s laws are valid. If we have an inertial frame, then all frames that move with constant velocity with respect to the inertial one are also inertial systems (for details see section 13.2.2). In this section we assume that we are in an inertial frame and look at an accelerated system and try to describe the dynamics in that accelerated one. For this we denote all quantities we measure in our system without a prime ($\vec{x}$, $\vec{v}$, $\vec{a}$, $\vec{F}$) and those measured in the accelerated system with a prime ($\vec{x}'$, $\vec{v}'$, $\vec{a}'$, $\vec{F}'$). As the general case is rather difficult, we will first have a look at special cases and in the end derive the general result and interpret its constituents using the special cases.

4.1.4 Centrifugal force

The most common fictitious force is the centrifugal force, which is related to the centripetal force. Assume the accelerated system moves on a circle with constant angular velocity $\omega$ and it contains a mass $m$ which is at rest in this (accelerated) system. Then from our inertial frame it is obvious that a force needs to be applied to the mass: according to Newton’s first law, without a force the mass would perform a uniform linear motion ($\vec{v} = \text{const.}$). We have to force it to perform the circular motion, and the corresponding force points towards the center of the motion and is given by

$$\vec{F} = m\vec{a} = -m\omega^2\vec{r}$$

where $\vec{a}$ is the centripetal acceleration as computed in section 3.2.3. The minus sign comes from the fact that the acceleration, and hence also the force, points toward the center of the circle whereas $\vec{r}$ points from the center to the mass $m$. From this perspective (the inertial frame) everything is clear. But in the system attached to the mass, this is different. In that (accelerated) system, the mass does not move. Nevertheless a force acts on that mass, which could also be measured². Therefore we have a force without acceleration, which obviously contradicts Newton’s second law. This only seems paradoxical: the frame is accelerated, and in such a frame Newton’s laws are not valid anymore. In that accelerated frame the mass pushes outwards with the force

$$\vec{F}' = m\omega^2\vec{r}$$

which is the same as the centripetal force up to the minus sign. This is the centrifugal force. The mass opposes the change of direction, and that is the force we see in the frame of the mass.

² For example, when attaching the mass to a spring, the spring would change its length in both frames, hence indicating a force.

Coriolis Force

Figure 4.3: A body (black dot) of mass $m$ moves with constant velocity $\vec{v}$, shown at the times $t = 0$, $t = \Delta t$ and $t = 2\Delta t$. Seen from the rotating frame ($x'$, $y'$), its trajectory is not a straight line but curved; $s(t)$ denotes the apparent tangential distance.

You might have experienced this force when walking around on a carousel. To approach this force we consider a body and, attached to it, a coordinate system that rotates with a constant angular velocity $\omega$, see figure 4.3. In our inertial frame a body is moving with constant velocity. According to Newton’s second law, no force is acting on that body. From the view of the rotating frame still no force is acting on this body. But the body does not move on a straight line; it seems accelerated. Again this contradiction is due to the acceleration of the frame. To get a formal expression, assume the body starts at the center of the rotating frame and moves with speed $v$. After a time $\Delta t$ it has covered a distance $\Delta x = v\Delta t$. In the rotating frame the motion consists of two components: a radial one, $v'_r$, pointing radially away from the center, and a tangential one, $v'_t$, pointing in the direction of the rotation. The first one stays constant as it is nothing but the velocity we see from outside, $v'_r = v$. The tangential velocity changes with the distance $r(t)$ of the body to the center, $v'_t = r(t)\omega$, and the apparent tangential distance is $s(t) = \varphi\, r(t) = \omega t \cdot v t = \omega v t^2$. Hence it depends quadratically on time $t$ and as a consequence it corresponds to a constant acceleration. For a constant acceleration we had the formula $x(t) = \frac{1}{2} a t^2$. Equating $x(t)$ and $s(t)$ we can solve for $a$ and get the Coriolis acceleration

$$a'_{\text{cor}} = 2\omega v.$$

This is therefore the acceleration seen from the rotating frame. Doing the computation more rigorously (see below) we find the vectorial dependence

$$\vec{a} = -2\,\vec{\omega} \times \vec{v}.$$
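The heuristic argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: the body moves along the inertial x axis starting at the center, the frame rotates counter-clockwise around the z axis, and the function name and numerical values are hypothetical. For a short time $t$ the sideways displacement seen in the rotating frame should match $\frac{1}{2} a t^2$ with $a = 2\omega v$.

```python
import math

def position_in_rotating_frame(v, omega, t):
    """Coordinates (x', y') of a body that moves along the inertial x axis."""
    x = v * t                      # inertial position, starting at the center
    angle = omega * t              # angle by which the frame has turned
    x_rot = x * math.cos(angle)    # component along the rotated x' axis
    y_rot = -x * math.sin(angle)   # component along the rotated y' axis
    return x_rot, y_rot

v, omega, t = 2.0, 0.5, 1e-3       # assumed values in m/s, rad/s and s
_, y_rot = position_in_rotating_frame(v, omega, t)
a_estimate = 2 * abs(y_rot) / t**2  # solve s = a t^2 / 2 for a
print(a_estimate, 2 * omega * v)    # the two numbers agree closely for small t
```

The negative sign of `y_rot` reflects the direction given by $-2\,\vec{\omega} \times \vec{v}$ for $\vec{\omega}$ along z and $\vec{v}$ along x.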

Similarly we can have a look at a body that moves along a straight line in the rotating frame. In our inertial frame it performs a curved trajectory, hence there is a force acting which is given as $\vec{F}_{\text{cor}} = m\vec{a}_{\text{cor}}$. Note that this force is not the Coriolis force itself, because the Coriolis force is a fictitious force and this force is not. The relation between this force and the Coriolis force is the same as the relation between centripetal and centrifugal force discussed in section 4.1.4.

Linear acceleration

If the accelerated system does not move on a circle but along a line and accelerates with an acceleration $a$, then all objects that move with constant velocity in our frame will also be seen as accelerated in the accelerated frame. From the view of the accelerated frame this is again the case where no force acts but an acceleration is measured. On the other hand we can look at the case where a body does not move with respect to the accelerated frame. From the inertial frame this body also accelerates with $a$, and a force $F$ must act on that object to enforce this acceleration. Once again the problem arises in the accelerated frame, because there the body does not accelerate (as it does not move) but a force acts on that body which is opposite to the one viewed from outside, $F' = -F$. This is because the body opposes its acceleration. This opposition is seen as the force $F'$ in the accelerated frame. Nevertheless this is not a real force, as it does not lead to an acceleration. It only appears as a force since the frame is accelerated.

Rigorous Computation
To derive a formally complete and correct formula for the fictitious forces, we describe how a position vector of a point mass $m$ from an inertial frame $\Sigma$ is transformed into an accelerated frame $\Sigma'$. Each system consists of a choice of basis vectors, $\vec{e}_x$, $\vec{e}_y$ and $\vec{e}_z$ in $\Sigma$ and correspondingly with a prime in $\Sigma'$. The accelerated system moves with respect to the inertial one. Its position with respect to the inertial one is given by a time dependent vector $\vec{R}(t)$, see figure 4.4.

Figure 4.4: The inertial frame $\Sigma$ (axes $x$, $y$, $z$) and the accelerated frame $\Sigma'$ (axes $x'$, $y'$, $z'$) are separated by a vector $\vec{R}(t)$. The mass $m$ is located at $\vec{r}$ in $\Sigma$ and at $\vec{r}'$ in $\Sigma'$.

The relation between the position vectors in the two frames is given by

$$\vec{r}(t) = \vec{R}(t) + \vec{r}'(t).$$

As the mass might move with time, $\vec{r}(t)$ and $\vec{r}'(t)$ are assumed to be time dependent³. The velocity of the mass $m$ fulfils the relation

$$\vec{v}(t) = \frac{d\vec{r}(t)}{dt} = \dot{\vec{r}}(t) = \dot{\vec{R}}(t) + \dot{\vec{r}}'(t) = \vec{V}(t) + \dot{\vec{r}}'(t)$$

where $\vec{V}(t)$ is the velocity of the accelerated frame with respect to the inertial one. For the second term, where we take the derivative of $\vec{r}'(t)$, we have to be more careful, because not only the coordinates change but also the basis, due to a rotation $\vec{\omega}$ of the system $\Sigma'$. Using the product rule for derivatives we get (dropping the time dependences)

$$\dot{\vec{r}}' = \dot{r}'_x \vec{e}'_x + \dot{r}'_y \vec{e}'_y + \dot{r}'_z \vec{e}'_z + r'_x \dot{\vec{e}}'_x + r'_y \dot{\vec{e}}'_y + r'_z \dot{\vec{e}}'_z = \vec{v}' + \vec{u}$$

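where $\vec{v}'$ is the velocity of the mass $m$ in the accelerated frame and $\vec{u} = \vec{\omega} \times \vec{r}'$ is the velocity due to the motion of the frame itself. This velocity originates from the rotation of the frame around its axis. To get the acceleration, we have to take one more time derivative and get

$$\vec{a} = \dot{\vec{v}} = \dot{\vec{V}} + \dot{\vec{v}}' + \vec{\omega} \times \dot{\vec{r}}'$$

where we assumed that the angular velocity $\vec{\omega}$ is constant. With the same argument as above we find that $\dot{\vec{v}}' = \vec{a}' + \vec{\omega} \times \vec{v}'$. In addition we use $\vec{\omega} \times \dot{\vec{r}}' = \vec{\omega} \times \vec{v}' + \vec{\omega} \times (\vec{\omega} \times \vec{r}')$ and, writing $\vec{A} = \dot{\vec{V}}$ for the acceleration of the frame, we end up with

$$\vec{a} = \vec{a}' + \vec{A} + 2\,\vec{\omega} \times \vec{v}' + \vec{\omega} \times (\vec{\omega} \times \vec{r}').$$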

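In the inertial frame $\Sigma$, Newton’s law holds such that $m\vec{a} = \vec{F}$. This is not true in the accelerated frame $\Sigma'$, where $m\vec{a}' = \vec{F} + \vec{F}_f$ with $\vec{F}_f$ being the sum of all fictitious forces. Solving for $\vec{a}'$ we get

$$\vec{a}' = \vec{a} - \vec{A} - 2\,\vec{\omega} \times \vec{v}' - \vec{\omega} \times (\vec{\omega} \times \vec{r}').$$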

With this result the calculation is done and we can interpret the result.

• $\vec{a}$ is the acceleration in the inertial frame and is related to the total force via $\vec{F} = m\vec{a}$.

• $\vec{a}'$ is the acceleration measured in the accelerated frame. It differs from $\vec{a}$ by the different fictitious accelerations:

• $-\vec{A}$ is the fictitious acceleration due to the linear acceleration of the frame itself.

• $-2\,\vec{\omega} \times \vec{v}'$ is the Coriolis acceleration.

• $-\vec{\omega} \times (\vec{\omega} \times \vec{r}')$ is the centrifugal acceleration.

With this we have derived all the fictitious forces in vector form.
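As a plausibility check, the final formula can be evaluated for the centrifugal special case discussed earlier. The following sketch uses plain tuples as vectors; the helper names `cross` and `acceleration_in_accelerated_frame` and all numbers are assumptions made for illustration. A mass at rest in a frame rotating with constant $\vec{\omega}$ has the centripetal acceleration $-\omega^2 \vec{r}$ in the inertial frame, and the formula should then give $\vec{a}' = 0$.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def acceleration_in_accelerated_frame(a, A, omega, v_rel, r_rel):
    """a' = a - A - 2 omega x v' - omega x (omega x r')."""
    coriolis = cross(omega, v_rel)
    centrifugal = cross(omega, cross(omega, r_rel))
    return tuple(a[k] - A[k] - 2 * coriolis[k] - centrifugal[k] for k in range(3))

omega = (0.0, 0.0, 2.0)        # assumed rotation around the z axis, in rad/s
r_rel = (3.0, 0.0, 0.0)        # position of the mass, at rest in the rotating frame
a = (-12.0, 0.0, 0.0)          # centripetal acceleration -omega^2 r in the inertial frame
a_rel = acceleration_in_accelerated_frame(a, (0.0, 0.0, 0.0), omega, (0.0, 0.0, 0.0), r_rel)
print(a_rel)                   # all components vanish: the mass is at rest in that frame
```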


4.2 Rigid Bodies

A very important concept in the context of circular motion is that of rigid bodies. Although they consist of many point-like particles, their fixed shape and fixed mass distribution simplify the computations a lot. After defining rigid bodies we will define and compute useful quantities such as the center of mass or the moment of inertia. In the end we will explain how to describe different axes of rotation.

4.2.1 Definition and Basic Properties

As the name already says, a rigid body is a body whose form does not change. More
precisely:

A rigid body consists of many point-like particles whose mutual distances are constant.

The distances remain constant even if external forces act on the body. It therefore keeps its shape and mass distribution.

Once again this is only a model; in nature no body exists that does not slightly deform when a force is applied. This model is therefore only valid if the deformation is negligible compared to other effects. For example, the trajectory of a football can be well described by treating the football as a rigid body. Nevertheless, when the football gets kicked or hits the ground, it is deformed and then this model is not valid anymore.
The advantage of dealing with rigid bodies lies in their rigidity, which reduces the number of variables that need to be considered when describing the motion. The positions of three point-like particles (being part of that body) suffice to describe the positions of all other constituents⁴. Instead of describing the rigid body by the positions of three constituents, it is usually easier to describe its position and its orientation, that is, where it is and how it is rotated.
In most cases there is another very useful approximation. Considering the atoms as point-like constituents of a rigid body, their number is usually very big (of the order of $10^{23}$) and their interatomic distance much smaller than the size of the body. Instead of treating the atoms as point-like particles with mass $m$, we consider them to be continuously distributed in the body using their density $\rho = mn$, where $n$ is the number of atoms per volume in the body. This allows us to replace cumbersome sums by computable integrals.
⁴ Knowing the positions of two particles does not suffice, as the third one could still rotate around the axis given through the other two particles.

4.2.2 Center of Mass

As already mentioned in the last section, the kinematics of a rigid body is completely described by its position and its orientation. There is one particularly interesting point of the body, the so-called center of mass.
We already encountered the center of mass in section 3.3.4. Before we apply it to the rigid body, we want to view it from a different perspective. Assume we have a system with $N$ point-like particles where no external force is acting⁵. Therefore the total momentum

$$\vec{P} = \sum_{i=1}^{N} \vec{p}_i = \sum_{i=1}^{N} m_i \vec{v}_i$$

is conserved, where $m_i$ is the constant mass of the $i$th particle and $\vec{v}_i$ its velocity. Since the velocity is the time derivative of the position and the mass of each particle is assumed to be constant, we can take the time derivative in front of the sum

$$\vec{P} = \sum_{i=1}^{N} m_i \frac{d\vec{r}_i}{dt} = \frac{d}{dt} \sum_{i=1}^{N} m_i \vec{r}_i = \frac{d}{dt}\left( M \vec{R}_{CM} \right) \tag{4.1}$$

where $M$ is the total mass. The last sum is nothing but the center of mass $\vec{R}_{CM}$ of the system times the total mass (compare also with section 3.3.4). Since $\vec{P}$ is constant, equation 4.1 tells us that the center of mass remains at rest or moves with a constant velocity (otherwise the derivative would not be constant). This shows again the equivalence between Newton’s first law, the conservation of momentum and the constant velocity of the center of mass. All these laws can be derived from each other and hence have the same physical meaning.
We now treat the rigid body as a system of many point-like particles. The formulas that follow might look difficult or ugly, but there is no need for you to be able to perform abstract computations with these formulas. You should rather get the concept and be able to apply it to simple cases such as the ones in the exercises. We start with the formula used above, enumerate the particles (e.g. atoms) of the rigid body again by the index $i$ and compute the center of mass

$$\vec{R}_C = \frac{\sum_i m_i \vec{r}_i}{\sum_i m_i}.$$
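For a handful of particles the sum can be evaluated directly. The following minimal sketch assumes that positions are given as 3-component tuples; the function name `center_of_mass` and the two-particle example are illustrative choices.

```python
def center_of_mass(masses, positions):
    """Mass-weighted average position of point-like particles."""
    total_mass = sum(masses)
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(3)
    )

# Assumed example: 1 kg at the origin and 3 kg at x = 4 m.
masses = [1.0, 3.0]
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(center_of_mass(masses, positions))  # (3.0, 0.0, 0.0)
```

The center of mass lies three times closer to the heavier particle, as expected from the weighting with the masses.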

This is in general very complicated to compute and we pass on to the continuous description of a rigid body. We split the whole body into small pieces. The piece lying at the position $\vec{r}$ shall have the volume $dV(\vec{r})$. The mass $dm(\vec{r})$ of such a piece is then given as

$$dm(\vec{r}) = \rho(\vec{r})\, dV(\vec{r})$$

where $\rho(\vec{r})$ is the density (mass per volume) at that position $\vec{r}$. Replacing the sum by an integral we arrive at

$$\vec{R}_C = \frac{\int_V \vec{r}\, dm(\vec{r})}{\int_V dm(\vec{r})} = \frac{\int_V \vec{r}\, \rho(\vec{r})\, dV(\vec{r})}{\int_V \rho(\vec{r})\, dV(\vec{r})}$$

where the subscript $V$ denotes the summation/integration over the whole volume of the body.

⁵ Maybe the particles interact via internal forces, but these cancel each other anyway, see section 3.3.4.
For symmetric, homogeneous bodies, the center of mass lies in the symmetry point of that body. Examples are the homogeneous sphere, where the center of mass indeed is the center of the sphere, or similarly a cube or a cylinder. For bodies with rotational symmetry, there is a useful trick to compute the center of mass by reducing the three-dimensional integral to a one-dimensional one. We split the body into small discs along the axis. Each disc at the position $h$ has a thickness $dh$. The radius $r(h)$ of such a disc is a function of its position $h$. Assuming a constant density $\rho$ of our body, the mass of such a disc is $dm = \pi\rho\, r(h)^2\, dh$. Since the body is rotationally symmetric, the center of mass certainly lies on the symmetry axis. We obtain the distance $h_C$ of the center of mass from the origin by summing/integrating the contributions of all these discs and arrive at

$$h_C = \frac{1}{M} \int_{h_1}^{h_2} h\, dm = \frac{1}{M} \pi\rho \int_{h_1}^{h_2} r(h)^2\, h\, dh \tag{4.2}$$

where $h_1$ and $h_2$ denote the start and end point of the body along the symmetry axis.
Let’s have a look at an example: We compute the center of mass of a cone, see figure 4.5. We set the origin on the symmetry axis at the top of the cone. The radius of the cone at height $h$ is

$$r(h) = \frac{R}{H} h$$

where $H$ is the total height of the cone.

Figure 4.5: A cone with height $H$ and maximal radius $R$.

Inserting this in equation 4.2 we get the distance of the center of mass from the top by

$$\begin{aligned}
h_C &= \frac{1}{M} \pi\rho \int_{h_1}^{h_2} r(h)^2\, h\, dh \\
&= \frac{1}{M} \pi\rho \int_0^H \left( \frac{R}{H} h \right)^2 h\, dh \\
&= \frac{1}{M} \pi\rho \frac{R^2}{H^2} \int_0^H h^3\, dh \\
&= \frac{1}{M} \pi\rho \frac{R^2}{H^2} \frac{1}{4} H^4 \\
&= \frac{1}{M} \frac{3}{4} M H = \frac{3}{4} H
\end{aligned}$$

where we used the total mass being $M = \rho \frac{1}{3} \pi R^2 H$. For example, for $H = 12$ cm, the distance from the top to the center of mass is 9 cm.
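The slicing into discs can also be carried out numerically, which is a useful cross-check when the integral is less friendly. The sketch below sums the contributions of thin discs with the midpoint rule; the function name `cone_center_of_mass`, the number of discs and the radius are assumptions for illustration, and the density is set to 1 because it cancels in the ratio.

```python
import math

def cone_center_of_mass(R, H, n=1000):
    """Distance of the center of mass from the tip, summed over n thin discs."""
    dh = H / n
    weighted_sum = 0.0                           # approximates the integral of h dm
    mass = 0.0                                   # approximates the integral of dm
    for i in range(n):
        h = (i + 0.5) * dh                       # midpoint of the i-th disc
        dm = math.pi * (R / H * h) ** 2 * dh     # dm = pi rho r(h)^2 dh with rho = 1
        weighted_sum += h * dm
        mass += dm
    return weighted_sum / mass

print(cone_center_of_mass(R=5.0, H=12.0))        # close to 3/4 * 12 = 9
```

The result does not depend on the chosen radius, in agreement with $h_C = \frac{3}{4} H$.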

4.2.3 Moment of Inertia

For translations, the mass is the ratio between the force and the acceleration, $m = F/a$. Similarly we can define a quantity for rotations describing the ratio between the torque and the angular acceleration. This ratio is called the moment of inertia and we have already encountered it briefly in chapter 3.4.7. After having warmed up by calculating the center of mass, we continue with the definition and computation of the moment of inertia and focus on the physical background later (i.e. the torque in section 4.3.1 or the rotation energy in section 4.3.3). To give a little bit of context, remember that the speed of a small piece of the rigid body is $v = r\omega$, where $r$ is the distance from the rotation axis. To compute the kinetic energy, we have to square this velocity, leading to a factor of $r^2$. As the angular velocity is the same for the entire rigid body, the whole difficulty of the computation lies in the factor $r^2$.
The moment of inertia $I$ is defined as the integral of the square of the distance from the rotation axis, weighted with the density, or formally:

$$I = \rho \int_V r_\perp^2\, dV$$

where we inserted the subscript in $r_\perp$ to remember that only the distance from the rotation axis counts and not the distance from the origin. Once again, this formula looks complicated but there is no need to be able to compute it for a body with a weird shape. Comparing with the center of mass: there we had to integrate the position $\vec{r}$ of a small volume element $dV$ over the whole volume. Now we integrate the distance squared, $r_\perp^2$.

Note one important difference to the center of mass: We can always choose a coordinate frame such that the center of mass is at the origin, meaning the center of mass can be zero. Since there is a square in the moment of inertia, it can never be negative, and it is zero only if $r_\perp = 0$ everywhere, hence if the body is not extended perpendicular to the rotation axis, e.g. an infinitely thin rod rotating around its axis. In addition it is important that the moment of inertia depends on the rotation axis, because $r_\perp$ depends on the choice of the rotation axis⁶.
The easiest non-trivial body to compute the moment of inertia for is a thin ring. Consider a ring with radius $R$ and a negligible thickness, with the rotation axis being its rotational symmetry axis. As the thickness is negligible, each piece of mass has the same distance $r_\perp = R$ to the rotation axis and we can take it out of the integral. Using $dm = \rho\, dV$, we obtain

$$I = \rho \int_V r_\perp^2\, dV = R^2 \int_V dm = R^2 M \tag{4.3}$$

where $M$ is the total mass of the ring.

⁶ For a very general description, one would need to describe the moment of inertia by a tensor, i.e. a 3x3 matrix. Nevertheless, when looking at a particular rotation axis and considering a symmetric body rotating around an axis parallel to one of its symmetry axes, it is sufficient to compute the moment of inertia as a number.

To compute the moment of inertia for rotationally symmetric bodies, we will again divide them into small discs. So we first need to know the moment of inertia of a disc (same as for a cylinder): Consider a disc with height $H$ and radius $R$ with the rotation axis along the symmetry axis of the cylinder, see figure 4.6 left. First we note that the moment of inertia is proportional to the height. Doubling the height of the disc corresponds to the case where we stick two discs together along the symmetry axis. This leads to a doubling of the moment of inertia. Hence, we can separate the integral into the integral along the height $h$ and over the area perpendicular to it, $dV = dh \cdot dA$. The area itself is split into thin rings with width $dr$, see figure 4.6 right. As we know how to compute the moment of inertia of a ring, we do a similar trick as before with the discs: Each ring with radius $r$ has a volume and a corresponding moment of inertia according to equation 4.3

$$dV = H\, 2\pi r\, dr$$

$$dI = r^2 \rho\, dV.$$

Figure 4.6: Left: Overview of the cylinder. Right: Top view, one little ring.
The total moment of inertia is therefore

$$I = \int dI = \rho \int_V r_\perp^2\, dV = \rho H 2\pi \int_0^R r^2 \cdot r\, dr = 2\pi\rho H \frac{1}{4} R^4 = \frac{1}{2} R^2 M$$

where we used again $M = \rho H \pi R^2$ and that we defined $r$ perpendicular to the rotation axis, so $r_\perp = r$. This result is worth keeping in mind as it will be used often.

Example:
Using the moment of inertia for a disc, we can now compute it for a sphere. Consider a sphere with radius $R$ and assume the rotation axis going through the center and the origin lying in the center. We slice the sphere into thin discs along the rotation axis. Let $h$ denote the distance of such a disc from the origin. The radius of this disc is given by $r(h) = \sqrt{R^2 - h^2}$ and hence its moment of inertia is $dI = \frac{1}{2} \pi\rho\, r(h)^4\, dh$. Summing up the moments of inertia of all these discs, we arrive at the total moment of inertia

$$\begin{aligned}
I = \int dI &= \int_{-R}^{R} \frac{1}{2} \pi\rho\, r(h)^4\, dh \\
&= \frac{\pi}{2} \rho \int_{-R}^{R} (R^2 - h^2)^2\, dh \\
&= \frac{\pi}{2} \rho \int_{-R}^{R} \left( R^4 - 2R^2 h^2 + h^4 \right) dh \\
&= \frac{\pi}{2} \rho \left( 2R^5 - \frac{4}{3} R^5 + \frac{2}{5} R^5 \right) \\
&= \frac{\pi}{2} \rho \frac{16}{15} R^5 = \frac{2}{5} R^2\, \frac{4}{3} \pi\rho R^3 = \frac{2}{5} R^2 M
\end{aligned}$$

where $M$ is again the total mass.

Similarly the moment of inertia of another rotationally symmetric body can be computed.
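The same slicing can be done numerically. The following sketch adds up the disc contributions $dI = \frac{1}{2} \pi\rho\, r(h)^4\, dh$ with the midpoint rule and compares the result with $M R^2$; the function name `sphere_moment_of_inertia`, the radius and the density are assumed values chosen only for illustration.

```python
import math

def sphere_moment_of_inertia(R, rho=1.0, n=2000):
    """Moment of inertia of a homogeneous sphere, summed over n thin discs."""
    dh = 2 * R / n
    total = 0.0
    for i in range(n):
        h = -R + (i + 0.5) * dh                       # midpoint of the i-th disc
        r_squared = R**2 - h**2                       # r(h)^2 = R^2 - h^2
        total += 0.5 * math.pi * rho * r_squared**2 * dh
    return total

R, rho = 0.1, 7800.0                                  # assumed: 10 cm sphere, steel-like density
M = rho * 4 / 3 * math.pi * R**3
ratio = sphere_moment_of_inertia(R, rho) / (M * R**2)
print(ratio)                                          # close to 2/5
```

Replacing `r_squared` by the profile of another rotationally symmetric body gives its moment of inertia in the same way.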


4.2.4 Parallel Axis Theorem (Steiner’s Theorem)

We have seen how we can calculate the center of mass and the moment of inertia and have done this also for different bodies. The question arises how these quantities transform when changing the rotation axis. In particular we are interested in the case where we shift the rotation axis, i.e. the axes before and after the shift are parallel to each other.
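For the center of mass, the answer is pretty simple. Assume we move the coordinate system by a vector $\vec{T}$. Then the new center of mass is

$$\begin{aligned}
\vec{R}'_C &= \frac{\int_V (\vec{r} + \vec{T})\, dm(\vec{r})}{\int_V dm(\vec{r})} \\
&= \frac{\int_V \vec{r}\, dm(\vec{r})}{\int_V dm(\vec{r})} + \frac{\int_V \vec{T}\, dm(\vec{r})}{\int_V dm(\vec{r})} \\
&= \frac{\int_V \vec{r}\, dm(\vec{r})}{\int_V dm(\vec{r})} + \vec{T}\, \frac{\int_V dm(\vec{r})}{\int_V dm(\vec{r})} \\
&= \vec{R}_C + \vec{T}
\end{aligned}$$

where $\vec{R}_C$ is the center of mass before the shift. We find that the center of mass is simply shifted by $\vec{T}$.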

Shifting the axis is more complicated in case of the moment of inertia because the radius is squared in the integral. To proceed efficiently and find a useful formula we have to assume that the rotation axis before the shift goes through the center of mass. For this case, we denote the moment of inertia by $I$. The new axis is shifted by a distance $d$, see figure 4.7, and its moment of inertia is denoted by $I'$. Inserting the definition we get

$$\begin{aligned}
I' &= \rho \int_V (r_\perp + d)^2\, dV \\
&= \rho \int_V (r_\perp^2 + 2 r_\perp d + d^2)\, dV \\
&= I + d^2 \rho \int_V dV \\
&= I + d^2 M
\end{aligned}$$

where the term containing $2 r_\perp d$ drops because the axis passes through the center of mass⁷. The derived formula is called the parallel axis theorem and it is very useful (see example below).

Figure 4.7: Two axes, one going through the center of mass, the other being parallel to the first one.
There is one important thing to mention which is often forgotten: We can only apply this formula if one of the axes passes through the center of mass. If this is not the case, we first have to use the formula to obtain the moment of inertia for the axis passing through the center of mass, and from this we can continue the computation.

Example:
In the last section, we derived the moment of inertia of a sphere. In case the sphere rolls on a plane, the rotation axis goes through the contact point of the sphere on the plane and not through the middle point (note that this axis is not constant in time, as the contact point moves when the sphere is rolling). Therefore the moment of inertia of the sphere rolling on the plane is

$$I = \frac{2}{5} R^2 M + R^2 M = \frac{7}{5} R^2 M.$$

Thanks to the parallel axis theorem, we could just do this simple addition and avoid any integration.
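The parallel axis theorem can be tested numerically on a body whose moment of inertia is already known from above, a thin homogeneous disc. The sketch below cuts the disc into small area elements and sums $dm \cdot r_\perp^2$ for two parallel axes; the function names, the grid resolution and the numbers are illustrative assumptions, with `sigma` denoting an assumed mass per area.

```python
import math

def disc_elements(R, sigma=1.0, n_r=100, n_phi=100):
    """Split a thin homogeneous disc into small mass elements (x, y, dm)."""
    dr = R / n_r
    dphi = 2 * math.pi / n_phi
    elements = []
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            dm = sigma * r * dr * dphi               # mass of the area element
            elements.append((r * math.cos(phi), r * math.sin(phi), dm))
    return elements

def moment_of_inertia(elements, axis_x=0.0, axis_y=0.0):
    """Sum of dm * r_perp^2 for an axis parallel to z through (axis_x, axis_y)."""
    return sum(dm * ((x - axis_x) ** 2 + (y - axis_y) ** 2) for x, y, dm in elements)

R, d = 0.2, 0.3                                      # assumed radius and shift of the axis
elements = disc_elements(R)
M = sum(dm for _, _, dm in elements)
I_cm = moment_of_inertia(elements)                   # axis through the center of mass
I_shifted = moment_of_inertia(elements, axis_x=d)    # parallel axis at distance d
print(I_shifted, I_cm + M * d**2)                    # both numbers agree
```

The mixed term cancels in the sum because the elements are distributed symmetrically around the center of mass, which is exactly the argument used in the derivation.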

⁷ Since $r_\perp$ is not squared anymore, we have to include "on which side" $r_\perp$ lies, i.e. this term is nothing but the computation of the distance of the center of mass from the axis. Since the axis passes through the center of mass, this distance is zero.

4.3 Dynamics of Rotation

In a system consisting of $N$ point-like particles, each particle has three degrees of freedom. The dynamics of each point-like particle can be described by Newton’s laws, and knowing the dynamics of each particle also determines the dynamics of the whole system. The problem is again the impossibility of solving such a problem, as the number of equations is $3N$. As described above, the description simplifies in case of a rigid body. A rigid body has 6 degrees of freedom: the translations in all three spatial directions and the rotations around all three axes. As discussed in the section about the center of mass (see section 4.2.2), the motion of the center of mass can be described by Newton’s laws. Therefore we know how to describe the translational motion. In this section we have a look at the rotational motion. For this we introduce the torque, the angular momentum and the rotational energy. After introducing these quantities, we describe the general motion of a rigid body, which is a superposition of translation and rotation. Using these expressions and the moment of inertia, we find a set of equations very similar to the one for the translational motion, which we will summarize at the end of this chapter.

4.3.1 Torque
To change the velocity of any mass, we need to apply a force. The same concept also exists for rotational motion and it is called torque. Namely, to change the angular velocity of a rigid body, we need to apply a torque. To motivate the formula, consider two discs with different radii. Around each disc, a string is wound and at the end of each string a constant force is applied, see figure 4.8. The work needed per rotation of each disc is given by the force times the path length, $W = F\, 2\pi r$. Although the force is the same, the work is different because of the different path length⁸. Therefore the work scales with the radius of the disc. Hence it will be convenient to introduce a new quantity related to the work which takes into account the radius.

Figure 4.8: Two discs with different radii $r_1$ and $r_2$. Around each disc a string with a mass at one end is wound, so that the same force $F$ acts on each string.
This quantity is the torque. It is basically defined as the force times the distance between the rotation axis and the point where the force acts. There is one thing we have to take into account: only the force acting perpendicular to that distance will influence the rotation. A force pointing towards the rotation axis will only push against that axis but not cause any rotation. Once again the vector product is very convenient as it allows us to respect all the properties we discussed. The torque is defined as

$$\vec{M} = \vec{r} \times \vec{F}$$

$$|\vec{M}| = r F_\perp$$

where $\vec{r}$ is the vector pointing from the rotation axis to the point where the force $\vec{F}$ acts, $r$ is its absolute value and $F_\perp$ the component of the force acting perpendicular to $\vec{r}$. In case you don’t remember whether the torque is $\vec{r} \times \vec{F}$ or $\vec{F} \times \vec{r}$, there is a trick to find the proper version. Take your right hand and bend your fingers slightly (as if you would hold a cup of tea) and hold the thumb up. When the torque points in the direction of the thumb, the force has to act in the direction of all the other fingers.
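A short numerical illustration of the definition: the sketch below evaluates $\vec{r} \times \vec{F}$ for a force perpendicular to $\vec{r}$ and for a force along $\vec{r}$. The helper names `cross` and `torque` and the values are assumptions made for this example.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torque(r, F):
    """Torque M = r x F of a force F acting at the position r."""
    return cross(r, F)

r = (0.5, 0.0, 0.0)                 # assumed lever arm of 0.5 m along x
F_perpendicular = (0.0, 10.0, 0.0)  # 10 N perpendicular to r
F_radial = (10.0, 0.0, 0.0)         # 10 N along r

print(torque(r, F_perpendicular))   # (0.0, 0.0, 5.0)
print(torque(r, F_radial))          # (0.0, 0.0, 0.0)
```

Only the perpendicular force produces a torque, here of magnitude $r F_\perp = 5$ Nm along the z axis, which is the direction given by the right-hand rule.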
Note that the torque is not only defined for a rigid body. For example, you could consider the torque of a system with many point-like particles which do not form a rigid body. Then, with $\vec{r}_i$ being the position of the $i$th particle and $\vec{F}_i$ the force acting on it, the torque acting on that particle is $\vec{M}_i = \vec{r}_i \times \vec{F}_i$. Nevertheless, for rigid bodies there are some very useful relations to other quantities we encountered so far. The vectorial definition of the torque is compatible with the vectorial definition of the angular velocity and angular acceleration. In case of a rigid body, the applied torque is proportional to the angular acceleration $\vec{\alpha}$ and the proportionality constant is the moment of inertia $I$

$$\vec{M} = I \vec{\alpha}.$$

Proof: To see this, consider a rigid body with moment of inertia $I$ and assume there is one force $\vec{F}$ acting at one point $\vec{r}$. As it is a rigid body, there are internal forces between the different parts of the body causing the entire body to change its angular velocity. For each little piece $dm(\vec{r}')$ located at $\vec{r}'$, we can apply Newton’s law $d\vec{F}(\vec{r}') = dm(\vec{r}')\, \vec{a} = dm(\vec{r}')\, \vec{\alpha} \times \vec{r}'$. Here $\vec{\alpha}$ is the same for all these pieces whereas $\vec{a}$ is not. The equations still hold when applying the vector product with the position $\vec{r}'$:

$$\begin{aligned}
d\vec{M}(\vec{r}') = \vec{r}' \times d\vec{F}(\vec{r}') &= dm(\vec{r}')\, \vec{r}' \times (\vec{\alpha} \times \vec{r}') \\
&= dm(\vec{r}')\, \vec{r}' \times (\vec{\alpha} \times \vec{r}'_\perp) \\
&= dm(\vec{r}')\, {r'_\perp}^2\, \vec{\alpha}
\end{aligned}$$

using in the second line that the vector product does not change when only considering $\vec{r}'_\perp$, the component perpendicular to $\vec{\alpha}$, and in the end $\vec{a} \times (\vec{b} \times \vec{c}) = (\vec{a} \cdot \vec{c})\, \vec{b} - (\vec{a} \cdot \vec{b})\, \vec{c}$ and $\vec{r}'_\perp \cdot \vec{r}' = {r'_\perp}^2$, keeping only the component along $\vec{\alpha}$. The sum of all these torques acting on the different pieces must be equal to the applied torque

$$\begin{aligned}
\vec{M} = \vec{r} \times \vec{F} &= \int d\vec{M}(\vec{r}') \\
&= \int {r'_\perp}^2\, \vec{\alpha}\, dm(\vec{r}') \\
&= \vec{\alpha} \int {r'_\perp}^2\, dm(\vec{r}') \\
&= I \vec{\alpha}.
\end{aligned}$$

⁸ The path length per rotation is the circumference, hence $2\pi r_1 \neq 2\pi r_2$.

When considering the static equilibrium of a rigid body, we also have to consider the torque:

A rigid body is in a static equilibrium only if the total force and the total torque are zero. If the torque is not zero, the body starts rotating around its center of mass.

4.3.2 Angular Momentum

For translations we found that the momentum is conserved if no force acts. Similarly
we can find a conserved quantity for rotations as long as no torque is acting. This quan-
tity is the angular momentum and we can derive its conservation in a similar way as for
momentum.

The angular momentum of a point-like particle is defined as

$$\vec{L} = \vec{r} \times \vec{p}$$

where $\vec{r}$ is the position and $\vec{p}$ the momentum of that particle.

Taking the time derivative on both sides, and using $\vec{v} \parallel \vec{p}$ and the product rule for derivatives adapted to the vector product, we arrive at

$$\frac{d\vec{L}}{dt} = \frac{d\vec{r}}{dt} \times \vec{p} + \vec{r} \times \frac{d\vec{p}}{dt} = \vec{r} \times \vec{F} = \vec{M}.$$

This means the angular momentum is changed when a torque is applied.

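For a system of $N$ particles, this relation is valid when considering the total angular momentum and the total torque. This follows directly from the linearity of the derivative⁹:

$$\begin{aligned}
\frac{d\vec{L}_{\text{tot}}}{dt} &= \frac{d}{dt} \sum_{i=1}^{N} \vec{L}_i \\
&= \sum_{i=1}^{N} \frac{d\vec{L}_i}{dt} \\
&= \sum_{i=1}^{N} \left( \frac{d\vec{r}_i}{dt} \times \vec{p}_i + \vec{r}_i \times \frac{d\vec{p}_i}{dt} \right) \\
&= \sum_{i=1}^{N} \vec{M}_i = \vec{M}_{\text{tot}}.
\end{aligned}$$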

The computation of the angular momentum simplifies again for a rigid body. The derivation is similar to the case of the torque (see section 4.3.1).

The angular momentum $\vec{L}$ of a rigid body is

$$\vec{L} = I \vec{\omega}$$

where $I$ is the moment of inertia and $\vec{\omega}$ is the angular velocity. An applied torque changes the angular momentum

$$\frac{d\vec{L}}{dt} = I \frac{d\vec{\omega}}{dt} = I \vec{\alpha} = \vec{M}$$

⁹ Meaning the sum and factor rule.

4.3.3 Rotational Energy

In contrast to the previously introduced concepts like torque or angular momentum, the rotational energy is not a new concept, see section 3.4.7. It is convenient to introduce the rotational energy for a rigid body, as the distinction between the translation of the center of mass and the rotation around the center of mass is very intuitive. In case the rotation is not performed around the center of mass, the partition into (translational) kinetic energy and rotational energy is sometimes a bit arbitrary, see also the example at the end of this section.
Consider a rigid body rotating around an axis without translation. We again split the body into small pieces $dm(\vec{r})$ located at the position $\vec{r}$. The rotation energy $dE_{\text{rot}}$ of such a piece is

$$dE_{\text{rot}} = \frac{1}{2}\, dm\, v^2 = \frac{1}{2}\, dm\, r_\perp^2 \omega^2$$

where $r_\perp$ is the distance of $dm(\vec{r})$ from the axis. Integrating over all these small contributions to the rotational energy, and using that $\omega$ is the same for all pieces, we arrive at

$$E_{\text{rot}} = \int_V dE_{\text{rot}} = \int_V \frac{1}{2} r_\perp^2 \omega^2\, dm = \frac{1}{2} \omega^2 \int_V r_\perp^2\, dm = \frac{1}{2} I \omega^2.$$
In close analogy to the translational kinetic energy, we found:

The rotational energy of a rigid body with moment of inertia $I$ and angular velocity $\omega$ is

\[ E_{rot} = \frac{1}{2} I \omega^2 \]

For a given $\omega$, it is minimal when the rotation axis passes through the center of mass, as $I$ is then minimal (see section 4.2.4).

As discussed in the section about the moment of inertia (see section 4.2.3), the moment of inertia depends on the rotation axis. In general it is possible to change the rotation axis by introducing an additional translational motion. We illustrate this in the following example.

Example:
Consider a cylinder with mass m rolling on a horizontal plane, see figure 4.9. There are two different rotation axes that can be considered. Each choice leads to a different rotational energy, but of course the total energy remains the same.

Figure 4.9: Left: The rotation axis is the center of the cylinder. The center
of mass moves with a velocity v = rω. Right: The rotation axis is the point
where the cylinder touches the plane. Each point on the cylinder has its own
axis, the axis changes permanently. In this case, the body does not perform
any translation.
The first case is more intuitive. The cylinder performs its rotation around its center and performs a translation with velocity $v = r\omega$, where $r$ is the radius of the cylinder. We can compute the kinetic, the rotational and the total energy as

\begin{align*}
E_{kin} &= \frac{1}{2} m v^2 = \frac{1}{2} m r^2 \omega^2 \\
E_{rot} &= \frac{1}{2} I_0 \omega^2 = \frac{1}{2} \cdot \frac{1}{2} m r^2 \omega^2 \\
E_{tot} &= E_{kin} + E_{rot} = \frac{3}{4} m r^2 \omega^2 = \frac{3}{4} m v^2 ,
\end{align*}
where we denoted the moment of inertia for the axis passing through the center of mass by $I_0$. Note that for this choice of the axis, the rotational energy is minimal.
Now we consider a different rotation axis, the line where the cylinder touches the plane. In this case the cylinder only performs a rotation around this axis but no translation. Note that in this description the rotation axis is not fixed in the body: at each instant it passes through whichever point of the cylinder currently touches the plane. To compute the moment of inertia for this axis, we have to apply the parallel axis theorem

\[ I = I_0 + m r^2 = \frac{1}{2} m r^2 + m r^2 = \frac{3}{2} m r^2 . \]


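Computing the different energy contributions, we arrive at

\begin{align*}
E_{kin} &= 0 \\
E_{rot} &= \frac{1}{2} I \omega^2 = \frac{1}{2} \cdot \frac{3}{2} m r^2 \omega^2 \\
E_{tot} &= E_{kin} + E_{rot} = \frac{3}{4} m r^2 \omega^2 = \frac{3}{4} m v^2 .
\end{align*}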
As mentioned above, the total energy is the same, but the partition between kinetic and rotational energy is not unique. We can interpret the resulting total energy as a redefinition (renormalization) of the mass: The rotational energy contributes to the total energy, and the total energy of the rolling cylinder is the same as that of a point-like particle of mass $\tilde{m} = \frac{3}{2} m$ and velocity $v = r\omega$. This is useful if the cylinder does not move on a plane but in a more complicated landscape. As long as the cylinder rolls and does not slide, the motion of its center of mass is the same as that of a point-like mass $\tilde{m}$ moving along the same trajectory as the center of mass.
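The two decompositions of the example can be checked numerically. The following is a minimal sketch; the function name and the sample values (m = 2 kg, r = 0.1 m, ω = 30 rad/s) are assumptions chosen only for illustration.

```python
def rolling_cylinder_energies(m, r, omega):
    """Return (E_kin, E_rot, E_tot) for both choices of the rotation axis."""
    v = r * omega                       # rolling without sliding
    i_center = 0.5 * m * r**2           # solid cylinder, axis through the center
    i_contact = i_center + m * r**2     # parallel axis theorem

    # Choice 1: translation of the center of mass plus rotation about it
    e_kin_1 = 0.5 * m * v**2
    e_rot_1 = 0.5 * i_center * omega**2

    # Choice 2: pure rotation about the contact line, no translation
    e_kin_2 = 0.0
    e_rot_2 = 0.5 * i_contact * omega**2

    return ((e_kin_1, e_rot_1, e_kin_1 + e_rot_1),
            (e_kin_2, e_rot_2, e_kin_2 + e_rot_2))


center, contact = rolling_cylinder_energies(m=2.0, r=0.1, omega=30.0)
print(center)   # split into translation and rotation
print(contact)  # everything counted as rotation
```

Both tuples end in the same total energy, which also equals (3/4) m v², while the split between the first two entries differs.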

4.3.4 General Motion of a Rigid Body

For a rigid body that is free to move and rotate, we can in general divide the motion into the motion of the center of mass and a rotation around the center of mass. The motion of the center of mass is determined by the total force acting on the body (see also section 4.2.2). The center of mass moves as if it were a point-like particle with mass $m$ (the mass of the rigid body) according to Newton's laws

\[ m \frac{d^2 \vec{r}_C}{dt^2} = m \vec{a}_C = \vec{F}_{tot} . \]

The rotation of the body around its center of mass is determined by the total torque and we find

\[ I \frac{d^2 \vec{\varphi}}{dt^2} = I \vec{\alpha} = \vec{M}_{tot} \]

where $I$ is the moment of inertia with respect to the axis that passes through the center of mass and points in the direction of $\vec{\varphi}$. In general this motion is very complicated and we will only treat simple special cases.


4.3.5 Analogy Translation and Rotation

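Formally, there is a one-to-one correspondence between the expressions used to describe the translation and the rotation. We summarize this in the following table.

| Translation | Scalar | Vectorial | Rotation | Scalar | Vectorial |
|---|---|---|---|---|---|
| Distance | $r$ | $\vec{r}$ | Angle | $\varphi$ | $\vec{\varphi}$ |
| Velocity | $v$ | $\vec{v}$ | Ang. vel. | $\omega$ | $\vec{\omega}$ |
| Acceleration | $a$ | $\vec{a}$ | Ang. acc. | $\alpha$ | $\vec{\alpha}$ |
| Mass | $m$ | | Mom. of in. | $I$ | |
| Momentum | $p = mv$ | $\vec{p} = m\vec{v}$ | Ang. mom. | $L = I\omega$ | $\vec{L} = I\vec{\omega}$ |
| Force | $F = ma$ | $\vec{F} = m\vec{a}$ | Torque | $M = I\alpha$ | $\vec{M} = I\vec{\alpha}$ |
| Work | $W = F\Delta r$ | $W = \int \vec{F} \cdot d\vec{r}$ | Work | $W = M\Delta\varphi$ | $W = \int \vec{M} \cdot d\vec{\varphi}$ |
| Power | $P = Fv$ | $P = \vec{F} \cdot \vec{v}$ | Power | $P = M\omega$ | $P = \vec{M} \cdot \vec{\omega}$ |
| Energy | $E_{kin} = \frac{1}{2} m v^2$ | $E_{kin} = \frac{1}{2} m \vec{v}^2$ | Energy | $E_{rot} = \frac{1}{2} I \omega^2$ | $E_{rot} = \frac{1}{2} I \vec{\omega}^2$ |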


4.4 Gravity

4.4.1 NEWTON’s Law of Gravity

I guess there's no need to explain to you what gravitational attraction is; you've surely heard about it in school. Let's get straight to the point by introducing a notation for gravitational forces that takes their vectorial nature into account:

Newton’s Law of Gravity

A point-like mass $m_1$ located at $\vec{r}_1$ exerts the gravitational force

\[ \vec{F}_{1 \to 2} = G \frac{m_1 m_2}{|\vec{r}_1 - \vec{r}_2|^3} (\vec{r}_1 - \vec{r}_2) \tag{4.4} \]

on another point-like mass $m_2$ located at $\vec{r}_2$, where

\[ G = 6.67 \cdot 10^{-11} \, \mathrm{m^3 \, s^{-2} \, kg^{-1}} \tag{4.5} \]

is the universal gravitational constant.

In school you might have encountered something more like

\[ F = G \frac{m_1 m_2}{r^2} \tag{4.6} \]

for the magnitude $F$ of gravitational attraction between the point-like masses $m_1$ and $m_2$ at distance $r$ apart. This formula indeed looks simpler than (4.4), but it contains no information about the direction of the force. From (4.4), you see that the force exerted by $m_1$ on $m_2$ is attractive, as $\vec{r}_1 - \vec{r}_2$ points from $m_2$ towards $m_1$. We leave it to you to show that formula (4.6) is easily derived from (4.4).
Look at how the third law of motion is contained in (4.4): by swapping the subscripts 1 and 2, you see that

\[ \vec{F}_{2 \to 1} = -\vec{F}_{1 \to 2} , \]

that is, the two bodies 1 and 2 act on each other by means of opposite gravitational forces of equal magnitude.
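To make the vector form (4.4) concrete, here is a minimal sketch in Python. The function name and the sample masses and positions are assumptions for illustration; positions are plain 3-tuples in metres.

```python
import math

G = 6.67e-11  # m^3 s^-2 kg^-1, value from (4.5)


def gravitational_force(m1, r1, m2, r2):
    """Force exerted by m1 (at r1) on m2 (at r2), following (4.4)."""
    d = [a - b for a, b in zip(r1, r2)]        # r1 - r2, points from m2 to m1
    dist = math.sqrt(sum(c * c for c in d))
    factor = G * m1 * m2 / dist**3
    return [factor * c for c in d]


m1, r1 = 5.0, (0.0, 0.0, 0.0)
m2, r2 = 3.0, (3.0, 4.0, 0.0)                  # 5 m away from m1

f_12 = gravitational_force(m1, r1, m2, r2)     # force on m2
f_21 = gravitational_force(m2, r2, m1, r1)     # force on m1
magnitude = math.sqrt(sum(c * c for c in f_12))
print(f_12, f_21, magnitude)
```

The force on m2 points back towards m1, its magnitude agrees with (4.6) for r = 5 m, and swapping the roles of the two masses flips the sign of every component.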


4.4.2 Gravitational Fields

Consider a large mass distribution, like a planet or a star, exerting gravitational forces on other smaller objects in its surroundings, like meteoroids, comets, satellites, or human beings. These objects are assumed to be so small that they do not affect the motion of the larger mass distribution.
To calculate the gravitational force by which an extended mass distribution acts on a small mass $m$, we can think of the large mass distribution as being composed of a large number of point masses $m_i$, each of which exerts its 'own' gravitational attraction $\vec{F}_i$ on $m$ according to

\[ \vec{F}_i = G \frac{m_i m}{|\vec{r}_i - \vec{r}|^3} (\vec{r}_i - \vec{r}) , \]

where $\vec{r}_i$ denotes the position of the i-th point mass $m_i$ and $\vec{r}$ the position of $m$. To calculate the total gravitational attraction on $m$, we make use of the extremely important

Principle of Superposition
Consider three point masses mA , mB and m. If mA and m were alone, let F~A be the
gravitational force mA would exert on m. Analogously, let F~B be the gravitational force
by which mB would act on m if mB and m were alone. The gravitational force acting
on m if both mA and mB are around is then given by

F~A + F~B .

In other words: The gravitational force by which a system of point masses acts on another
point mass is given as the vector sum of all gravitational forces by which each of the point
masses of the system alone would act on this other point mass.

Applied to our situation, this simply means that we can calculate the total gravitational attraction acting on $m$ as

\begin{align*}
\sum_{\text{all masses } m_i} \vec{F}_i &= \sum_i G \frac{m_i m}{|\vec{r}_i - \vec{r}|^3} (\vec{r}_i - \vec{r}) \\
&= m \cdot \left( \sum_i G \frac{m_i}{|\vec{r}_i - \vec{r}|^3} (\vec{r}_i - \vec{r}) \right) .
\end{align*}

The very last term in brackets is a quantity describing how the distribution of the masses $m_i$ in space exerts gravitational forces on a mass located at $\vec{r}$. We will call this term the gravitational field $\vec{g}(\vec{r})$ produced by this distribution of masses at $\vec{r}$, such that we can write

\[ \vec{F} = m \, \vec{g}(\vec{r}) \tag{4.7} \]

with

\[ \vec{g}(\vec{r}) = \sum_i G \frac{m_i}{|\vec{r}_i - \vec{r}|^3} (\vec{r}_i - \vec{r}) . \tag{4.8} \]
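The sum (4.8) translates almost literally into a loop. The sketch below is only an illustration: the function name, the list-of-pairs data layout and the two sample masses are assumptions, not part of the script.

```python
import math

G = 6.67e-11  # m^3 s^-2 kg^-1


def gravitational_field(point_masses, r):
    """Field at position r produced by (mass, position) pairs, see (4.8)."""
    g = [0.0, 0.0, 0.0]
    for m_i, r_i in point_masses:
        d = [a - b for a, b in zip(r_i, r)]    # r_i - r
        dist = math.sqrt(sum(c * c for c in d))
        for k in range(3):
            g[k] += G * m_i * d[k] / dist**3   # superposition: just add up
    return g


# Two equal masses placed symmetrically about the origin
masses = [(10.0, (1.0, 0.0, 0.0)), (10.0, (-1.0, 0.0, 0.0))]
g_mid = gravitational_field(masses, (0.0, 0.0, 0.0))   # contributions cancel
g_off = gravitational_field(masses, (0.0, 2.0, 0.0))   # points back towards the masses
print(g_mid, g_off)
```

By symmetry the field vanishes at the midpoint, while on the y-axis only the component pointing back towards the pair survives. The force on a test mass m then follows from (4.7) by multiplying with m.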

So much for systems consisting of point masses. Calculation of gravitational fields produced by extended mass distributions turns out to be somewhat more difficult. Exactly as we did before, we can treat continuous mass distributions by introducing their mass density $\varrho(\vec{r})$ and turning the sum into an integral:

The gravitational field produced in point $\vec{r}$ by a continuous mass distribution $\varrho$ confined to the portion of space $\Omega$ is given by

\[ \vec{g}(\vec{r}) = G \int_\Omega \frac{\varrho(\vec{r}')}{|\vec{r}' - \vec{r}|^3} (\vec{r}' - \vec{r}) \, dV' . \tag{4.9} \]

Again, your math courses will show you how such integrals can be computed in various situations. You might realize that dealing with the integrand of (4.9) can be a rather tedious affair. Fortunately, a very elegant law exists that makes calculation of gravitational fields in particularly symmetric situations quite simple:

Gauss’ Law for Gravitational Fields

Let $\Omega$ be a volume in space and $\partial\Omega$ its outer surface. For a gravitational field $\vec{g}$, the following identity is valid:

\[ \oint_{\partial\Omega} \vec{g} \cdot d\vec{A} = -4\pi G m_\Omega , \tag{4.10} \]

where $m_\Omega$ denotes the total mass located inside $\Omega$.

You probably already know a very similar law by Gauss for electric fields and charge distributions. Indeed, both the gravitational and the electrostatic versions of Gauss' law are nothing but a direct consequence of the inverse-square-law nature of Newton's law of gravity and Coulomb's law of electrostatic interaction, respectively. Again, your math courses will tell you what the integral on the left-hand side of (4.10) means and how to calculate it.
We won’t discuss Gauss’ law in great detail as you can learn more about it in your elec-
trodynamics courses. However, we will show you one very important application.
Many Olympiad problems involving gravity deal with planets or stars, which are in most cases treated as spheres of homogeneous mass density. We will use Gauss' law to calculate the field induced by such a homogeneous sphere of total mass $M$ and radius $R$ at any point of space. This situation shows perfect spherical symmetry in space, that is, one could rotate everything about the center of the sphere by any angle and you wouldn't be able to tell the difference. Therefore, the magnitude of the gravitational field must be the same at any point located at the same distance $r$ from the center of the planet. Furthermore, still due to spherical symmetry, the vector $\vec{g}$ must point in the radial direction anywhere in space: spherical symmetry would be broken if $\vec{g}$ had at some point in space a non-vanishing component not pointing in the radial direction. Choose a coordinate system with origin at the center of the sphere. The gravitational field at any point in space $\vec{r}$ can be described as

\[ \vec{g}(\vec{r}) = g(r) \, \hat{r} \]

with the projection $g(r)$ of the gravitational field on $\hat{r}$. First, we consider a point located inside the sphere, that is, $r < R$. Now apply (4.10) to a sphere $V(r)$ of radius $r$ concentric with the planet. The surface integral on the left-hand side is obtained by subdividing the surface $\partial V(r)$ of the sphere into many infinitesimal area elements $dA$ with normal vectors $\hat{n}$ and taking the sum over the scalar products $\vec{g} \cdot dA \, \hat{n}$ at every such area element, that is, something like

\begin{align*}
& \int_{\partial V} dA \; \vec{g}(\text{at this area element}) \cdot \hat{n}(\text{at this area element}) \\
&= \int_{\partial V} dA \, (g(r) \hat{r}) \cdot \hat{r} \\
&= g(r) \int_{\partial V} dA \\
&= g(r) \cdot \text{surface area of } \partial V(r) \\
&= g(r) \cdot 4\pi r^2 ,
\end{align*}

where we used $\hat{n} = \hat{r}$ for all surface elements on the sphere. Our result,

\[ \oint_{\partial V(r)} \vec{g} \cdot d\vec{A} = 4\pi r^2 \cdot g(r) , \]

is, according to (4.10), equal to $-4\pi G$ times the total mass $m_{V(r)} = M \cdot (r/R)^3$ contained within $V(r)$, that is,

\begin{align*}
4\pi r^2 \cdot g(r) &= -4\pi G \cdot \frac{M}{R^3} r^3 \\
\Leftrightarrow \quad g(r) &= -\frac{GM}{R^3} r .
\end{align*}

Inside the sphere, the magnitude of the gravitational field increases linearly with the distance from its center.
Now apply the same strategy for any point outside the sphere, that is, $r > R$. The mass $m_{V(r)}$ contained inside a spherical surface of radius $r$ around the center is now just the total mass $M$ of our spherical body. Gauss' law now yields

\begin{align*}
4\pi r^2 g(r) &= -4\pi G M \\
\Leftrightarrow \quad g(r) &= -\frac{GM}{r^2} ,
\end{align*}

which is the same as the field produced by a point mass $M$ located at the origin. This is a very important result:

The gravitational field inside a homogeneous sphere of mass $M$ and radius $R$ increases linearly with the distance from the center of the sphere:

\[ \vec{g}(\vec{r}) = -\frac{GM}{R^3} \, r \, \hat{r} , \qquad 0 < r < R . \tag{4.11} \]

Outside the sphere, the gravitational field behaves as if the sphere were a point mass $M$ located at its center:

\[ \vec{g}(\vec{r}) = -\frac{GM}{r^2} \, \hat{r} , \qquad r > R . \tag{4.12} \]
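The two branches (4.11) and (4.12) can be combined into one small function. This is a sketch under stated assumptions: the function name is made up, and the Earth is treated as a homogeneous sphere with rounded values for its mass and radius, which is only a rough model of the real planet.

```python
G = 6.67e-11  # m^3 s^-2 kg^-1


def sphere_field_radial(r, mass, radius):
    """Radial component g(r) of a homogeneous sphere, (4.11) and (4.12)."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r < radius:
        return -G * mass * r / radius**3   # inside: grows linearly with r
    return -G * mass / r**2                # outside: like a point mass


# Assumed round values for the Earth
M_EARTH = 5.97e24   # kg
R_EARTH = 6.371e6   # m

g_surface = sphere_field_radial(R_EARTH, M_EARTH, R_EARTH)
g_half = sphere_field_radial(0.5 * R_EARTH, M_EARTH, R_EARTH)
g_double = sphere_field_radial(2.0 * R_EARTH, M_EARTH, R_EARTH)
print(g_surface, g_half, g_double)
```

At the surface this gives a magnitude close to the familiar 9.8 m/s². Halfway to the center the field is half as strong, and at twice the radius it has dropped to a quarter.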


4.4.3 Energy and Angular Momentum in Gravitational Fields

In school, you might already have encountered the expression

\[ E_{pot} = -G \frac{Mm}{r} \tag{4.13} \]

for the potential energy of a small mass $m$ subject to the gravitational field of the larger mass $M$ at distance $r$ from it. This expression arises from the definition of the potential energy of the particle as the amount of work an external force would have to perform to move the mass inside the field from infinity to a point at distance $r$ from the center of the field.
Consider a point $\vec{r}$ at distance $r$ from the center of the field and another arbitrary point $\vec{r}'$ at distance $r'$ from the center. The amount of work done by an external force when displacing the mass from $\vec{r}'$ to $\vec{r}$ along a path $\gamma$ connecting $\vec{r}'$ to $\vec{r}$ is the line integral

\[ W_\gamma = -\int_\gamma m \vec{g} \cdot d\vec{s} . \]

Decompose the path $\gamma$ into very small segments $d\vec{s}$ in such a way that their direction is either radial or perpendicular to the radial direction. The force vector $m\vec{g}$ is always directed radially, therefore only the radial segments $d\vec{s}$ contribute to the integral, since all other terms vanish with $\vec{g} \cdot d\vec{s} = 0$. Therefore, the value of $W_\gamma$ is independent of the exact shape of $\gamma$ as long as it connects $\vec{r}'$ to $\vec{r}$. Furthermore, the exact positions of $\vec{r}'$ and $\vec{r}$ are not relevant for the value of $W_\gamma$; they could be replaced by any points located at distance $r'$ and $r$, respectively, from the center of $M$. For the calculation of $W_\gamma$, we can therefore replace $\vec{r}'$ and $\vec{r}$ by the two points $(r', 0, 0)$ and $(r, 0, 0)$, respectively:

\begin{align*}
W_\gamma &= +\int_{x=r'}^{r} \frac{GMm}{x^2} \, \hat{x} \cdot (dx \, \hat{x}) \\
&= GMm \int_{x=r'}^{r} \frac{dx}{x^2} \\
&= GMm \left( \frac{1}{r'} - \frac{1}{r} \right) ,
\end{align*}

that is,


Suppose a large mass $M$ located at the origin of a coordinate system generates a gravitational field. The work an external force has to do to bring a small mass $m$ from a point at distance $r'$ from the origin to another point at distance $r$ from the origin is equal to

\[ GMm \left( \frac{1}{r'} - \frac{1}{r} \right) . \tag{4.14} \]

Now back to the concept of potential energy. In a homogeneous gravitational field, one can define the potential energy of an object as the amount of work an external force has to perform to bring the object from a fixed reference height to the height of the object's current position. For central gravitational fields, we can apply an analogous definition. It seems reasonable to choose infinity as the reference distance: if we let $r' \to \infty$, the corresponding term of (4.14) vanishes. That is, we are left with equation (4.13).
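The closed form (4.14) can be cross-checked against a direct numerical integration of the external force along a radial path. The following sketch uses a simple midpoint rule; the function names, the step count and the sample values (an assumed Earth-like mass and a 1 kg test mass) are illustrative assumptions.

```python
G = 6.67e-11  # m^3 s^-2 kg^-1


def work_closed_form(big_m, m, r_start, r_end):
    """Work of the external force according to (4.14)."""
    return G * big_m * m * (1.0 / r_start - 1.0 / r_end)


def work_numeric(big_m, m, r_start, r_end, steps=100_000):
    """Midpoint-rule integral of G*M*m/x^2 dx from r_start to r_end."""
    dx = (r_end - r_start) / steps
    total = 0.0
    for k in range(steps):
        x = r_start + (k + 0.5) * dx
        total += G * big_m * m / x**2 * dx
    return total


M = 5.97e24      # kg, assumed central mass
R = 6.371e6      # m, assumed reference radius
w_exact = work_closed_form(M, 1.0, 2.0 * R, R)   # moving 1 kg inwards
w_approx = work_numeric(M, 1.0, 2.0 * R, R)
print(w_exact, w_approx)
```

Both numbers are negative: when the mass is moved inwards, the external force has to hold it back, so it performs negative work. Letting `r_start` grow very large makes `work_closed_form` approach the potential energy (4.13).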
Problems involving gravity can often be solved by considering conservation of mechanical energy of an object moving in a gravitational field:

Total mechanical energy

\[ E = \frac{1}{2} m v^2 - G \frac{mM}{r} \tag{4.15} \]

of an object of mass $m$ moving in a gravitational field produced by a large mass $M$ at the origin is constant over time.

Gravitational fields such as those of point masses or homogeneous spheres are radial. In particular, the angular momentum of particles is conserved as they move in such fields.
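As a short worked example of how (4.15) is typically used, consider an object launched from distance $r$ with just enough speed to reach infinity with zero velocity. This scenario and the symbol $v_{esc}$ are not part of the text above; they are introduced here only for illustration. At infinity both terms of (4.15) vanish, so the conserved total energy must be zero:

```latex
\begin{align*}
E = \frac{1}{2} m v_{esc}^2 - G \frac{mM}{r} &= 0 \\
\Rightarrow \quad v_{esc} &= \sqrt{\frac{2GM}{r}} .
\end{align*}
```

With assumed round values for the Earth, $M \approx 5.97 \cdot 10^{24}$ kg and $r \approx 6.37 \cdot 10^{6}$ m, this gives roughly 11.2 km/s. Note that the mass $m$ of the object drops out.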

4.4.4 Two Objects Subject to Mutual Attraction

So far, we have only considered small masses subject to an external gravitational field produced by a mass distribution so large that it is not affected by the small mass. We turn now to the discussion of systems consisting of two masses $m_1$, $m_2$ of similar order of magnitude, such that the motion of each is influenced by gravitational attraction from the respective other mass. In such a system, if not subject to any other external forces, the following conservation laws hold:


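• The center of mass $(m_1 \vec{r}_1 + m_2 \vec{r}_2) / (m_1 + m_2)$ moves with constant velocity. In the center-of-mass frame, its position is constant over time.

• Total energy

\begin{align*}
E &= E_{kin,1} + E_{kin,2} + E_{pot} \\
&= \frac{1}{2} m_1 v_1^2 + \frac{1}{2} m_2 v_2^2 - G \frac{m_1 m_2}{r_{12}}
\end{align*}

is a conserved quantity, where $r_{12}$ denotes the distance between the two particles.

• Total angular momentum

\[ \vec{L} = m_1 \vec{r}_1 \times \vec{v}_1 + m_2 \vec{r}_2 \times \vec{v}_2 \]

is a conserved quantity.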

Note how the expression for total energy consists of three parts: one term for the kinetic energy of each particle, $E_{kin,1}$ and $E_{kin,2}$, and one for potential energy

\[ E_{pot} = -G \frac{m_1 m_2}{r_{12}} . \tag{4.16} \]

This expression for the potential energy of a system of two point masses looks very similar to the expression derived in the previous section for a single particle moving in a central gravitational field. The two situations are slightly different though. In the last section, we dealt with a gravitational field, and the potential energy of the particle was defined as the amount of work done by an external force when dragging the particle from infinity to its current position in the field. Here we are dealing with two point masses, and using the concept of a field as in the last section doesn't make too much sense, as either mass influences the motion of the other one with its gravitational attraction. We can, however, define the potential energy of the system as the amount of work one would have to perform to assemble it in its current spatial configuration. This assembling might take place in two steps as follows: take $m_1$ from infinity and put it at $\vec{r}_1$ first, then take $m_2$ from infinity and place it at $\vec{r}_2$. The first step does not require any work, as $m_1$ is all alone on its way to $\vec{r}_1$ and no force acts on it. For the second step, we need to move $m_2$ against the gravitational attraction exerted by $m_1$. The work required is exactly the same as if we thought of $m_2$ being moved in a gravitational field produced by $m_1$, and it is therefore just the term on the right-hand side of (4.16). Note that while $m_2$ is put at $\vec{r}_2$, some force must prevent $m_1$ from slipping off its position at $\vec{r}_1$ due to attraction by the approaching $m_2$. This force does not contribute any mechanical work, since $m_1$ is not displaced under its influence.

4.4.5 KEPLER’s Laws of Planetary Motion*

Here as well, you most probably already know these laws from your high school physics
courses. This section will give you short clarifications about the concepts involved, as they
might provide a deeper understanding of the principles governing gravity and Newton’s
laws of motion.

Kepler’s Laws of Planetary Motion

1. The orbit of a planet around the sun is a plane ellipse, the sun lies at one of its two
foci.

2. Consider the radius vector pointing from the sun to the planet as it moves along
its orbit. During equal amounts of time, the radius vector sweeps out segments of
the ellipse of equal area.

3. The third power of the semi-major axis of an elliptical planet orbit is proportional
to the square of its orbital period around the sun.

The First Law

To understand the first law, we need to clarify some geometry. An ellipse can be defined
in different but equivalent ways:
A: As the figure you get by stretching a circle by any factor with respect to a straight
line.

B: As the locus of all points in the plane such that the sum of their distance to two
given, fixed points, called the foci of the ellipse, is a constant.

C: As the locus of all points in the plane whose distance from a given point, called a focus of the ellipse, is proportional to their distance from a given straight line, called the directrix, with a constant of proportionality < 1.

D: As the figure you get by intersecting a cone with a plane under such an angle that
the figure is closed.
You might find it an interesting exercise in geometry to prove the equivalence of all these definitions. Anyway, for Olympiad problems, the following two descriptions of an ellipse in the xy-plane are useful:

\[ \left( \frac{x}{a} \right)^2 + \left( \frac{y}{b} \right)^2 = 1 \tag{4.17} \]

in Cartesian coordinates and

\[ r(\vartheta) = \frac{p}{1 + \varepsilon \cos\vartheta} \tag{4.18} \]

in polar coordinates. You might verify that both equations describe the same curve in the plane under certain conditions for $a$, $b$, $p$, $\varepsilon$.
Equation (4.17) is probably the most useful to understand the concept of major and minor axis. These are defined as the largest and smallest diameter, respectively, of the ellipse. It is easy to show that their values are given by $2a$ and $2b$, respectively, if the ellipse is described as in (4.17) with $a > b$. Furthermore, they coincide with the x- and y-axis. Note how an ellipse is also supposed to have two foci, such that the sum of the distances of any point on the ellipse to the two foci is a constant. You might verify that the foci of the ellipse as introduced in definition B are situated at $(-s, 0)$ and $(s, 0)$ with $s = \sqrt{a^2 - b^2}$ if $a > b$, or at $(0, -s)$ and $(0, s)$ with $s = \sqrt{b^2 - a^2}$ if $b > a$. In particular, this means the foci are always situated on the major axis. Conversely, by starting with definition B with two foci lying on either the x- or y-axis, you might retrieve equation (4.17) for the description of the ellipse in the xy-plane.
In (4.18), the quantities $p$ and $\varepsilon$ are called semi-latus rectum and eccentricity of the ellipse, respectively. You can verify that the curve described by this equation satisfies the condition of definition B with one focus at the origin and the other located at $\left( -2\varepsilon p / (1 - \varepsilon^2), 0 \right)$ on the x-axis. Definition C is satisfied for the focus at the origin with a directrix described by the equation $x = p/\varepsilon$. The constant of proportionality between the distance from the points to the origin (which corresponds to $r$) and the distance from the points to the directrix (which is $p/\varepsilon - x$ for a point with x-coordinate $x$) is equal to $\varepsilon$. The ellipse intersects the y-axis at two points at distance $p$ from the origin.
I bet you can’t wait to see how an equation like (4.17) or (4.18) can arise from the equa-
tions of motion of a body subject to universal gravitation. The calculations involved
are somewhat elaborate and may not be of great help for solving physics competition
problems.
At this point, it is just important for you to understand what an ellipse is, what its foci are,
and to keep in mind that closed planetary orbits are ellipses with the sun at one focus.
In addition, it sure wouldn’t hurt to feel comfortable around the different geometrical
definitions of an ellipse, so feel free to practice your skills in analytic geometry and play
around.
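If you would like to play around numerically before doing the algebra, the following sketch checks definition B for the polar form (4.18): the sum of the distances to the two foci should be the same for every point. The function names and the sample values p = 1 and ε = 0.5 are assumptions for illustration. The expected sum 2p/(1 − ε²) is simply r(0) + r(π), the length of the major axis.

```python
import math


def ellipse_point(p, eps, theta):
    """Cartesian coordinates of the point r(theta) from (4.18)."""
    r = p / (1.0 + eps * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)


def focal_distance_sum(p, eps, theta):
    """Distance to the focus at the origin plus distance to the second focus."""
    x, y = ellipse_point(p, eps, theta)
    f2_x = -2.0 * eps * p / (1.0 - eps**2)     # second focus on the x-axis
    return math.hypot(x, y) + math.hypot(x - f2_x, y)


p, eps = 1.0, 0.5
sums = [focal_distance_sum(p, eps, 2.0 * math.pi * k / 12) for k in range(12)]
major_axis = 2.0 * p / (1.0 - eps**2)          # r(0) + r(pi)
print(sums, major_axis)
```

All twelve sample points give the same sum, equal to the length of the major axis, as definition B requires.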

The Second Law

Kepler's second law is a direct consequence of conservation of angular momentum. Put the whole orbit into a polar coordinate system with the sun at the origin. The position of the planet can be written as

\[ \vec{r} = (r \cos\varphi) \, \hat{x} + (r \sin\varphi) \, \hat{y} \]

at any time. The velocity vector is obtained by taking the time derivative:

\[ \vec{v} = (\dot{r} \cos\varphi - r \dot{\varphi} \sin\varphi) \, \hat{x} + (\dot{r} \sin\varphi + r \dot{\varphi} \cos\varphi) \, \hat{y} . \]

The angular momentum vector is then found to be

\[ \vec{L} = m \, \vec{r} \times \vec{v} = m r^2 \dot{\varphi} \, \hat{z} \]

if $m$ is the mass of the planet.
During a small amount of time $dt$, let the position vector of the planet sweep out a small area $dA$ of the ellipse, thereby covering an angle $d\varphi = \omega \, dt$ with $\omega = \dot{\varphi}$. The small area can be approximated as a triangle, such that it amounts to $dA = \frac{1}{2} \cdot r \cdot r \, d\varphi = \omega r^2 \, dt / 2$. Due to conservation of angular momentum, we know that $\omega r^2 = L/m$ is a conserved quantity. This just means that the area $dA$ swept out during $dt$ is a constant and that it does not depend on the position of the planet along its orbit. We see that this is also true for any finite time interval of a given length $\Delta t$ between any times $t_1$ and $t_2 = t_1 + \Delta t$, as the area swept out by the position vector is just calculated as

\[ \int_{[t_1, t_2]} dA = \int_{t_1}^{t_2} \frac{1}{2} \omega r^2(t) \, dt = \int_{t_1}^{t_2} \frac{L}{2m} \, dt = \frac{L}{2m} \cdot \Delta t , \]

which does not depend on the exact position of the time interval as long as its length is $\Delta t$. This is just the statement of the second law.
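The equal-area statement can also be observed in a small simulation. The sketch below is an illustration under assumptions that are not part of the script: units with GM = 1, a planet of unit mass starting at its perihelion, and a velocity-Verlet integrator (a standard scheme, swapped in here for simplicity). The swept area is approximated by thin triangles between consecutive position vectors.

```python
def simulate_orbit(x, y, vx, vy, dt, steps, gm=1.0):
    """Integrate the planet's motion with velocity Verlet; return all positions."""
    def acceleration(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -gm * px / r3, -gm * py / r3

    positions = [(x, y)]
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax          # second half kick
        vy += 0.5 * dt * ay
        positions.append((x, y))
    return positions


def swept_area(positions):
    """Sum of the thin triangles spanned by consecutive position vectors."""
    return sum(0.5 * abs(x0 * y1 - y0 * x1)
               for (x0, y0), (x1, y1) in zip(positions, positions[1:]))


dt = 1e-3
orbit = simulate_orbit(1.0, 0.0, 0.0, 1.2, dt, 6000)
area_near = swept_area(orbit[0:1001])      # 1000 steps starting at perihelion
area_far = swept_area(orbit[4000:5001])    # 1000 steps further out on the orbit
print(area_near, area_far)
```

With L/m = 1 · 1.2 and Δt = 1000 · dt = 1, both areas come out as L Δt / (2m) = 0.6, even though the planet is much farther from the sun and moves more slowly during the second interval.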


The Third Law

It is quite easy to verify the third law for a circular orbit. Suppose the planet of mass $m$ moves on a circular orbit of radius $R$ around the sun of mass $M$ with a time period $T$. Then the gravitational attraction of the sun on the planet must account for the centripetal acceleration, and, by the second law of motion:

\[ m \left( \frac{2\pi}{T} \right)^2 R = G \frac{mM}{R^2} , \]

which is equivalent to

\[ \frac{R^3}{T^2} = \frac{GM}{4\pi^2} , \]

in other words, circular orbits satisfy Kepler's third law. Calculations for general elliptic orbits are somewhat more complex.
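Solving the circular-orbit relation for T gives T = 2π √(R³/(GM)). The sketch below evaluates it for the Earth's orbit. The function name and the rounded values for the solar mass and the orbital radius are assumptions for illustration, and the real orbit is only approximately circular.

```python
import math

G = 6.67e-11  # m^3 s^-2 kg^-1


def circular_orbit_period(radius, central_mass):
    """Orbital period T from R^3 / T^2 = G M / (4 pi^2)."""
    return 2.0 * math.pi * math.sqrt(radius**3 / (G * central_mass))


M_SUN = 1.989e30          # kg, assumed rounded value
R_EARTH_ORBIT = 1.496e11  # m, assumed rounded value

period = circular_orbit_period(R_EARTH_ORBIT, M_SUN)
period_days = period / 86400.0
print(period_days)
```

The result is close to one year, as it should be. Because R³/T² depends only on the central mass, the same function applies to any other body orbiting the sun on a nearly circular orbit.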

Chapter 5

THERMODYNAMICS
Alice: “Brrrr, it's cold in here, only 17 °C.” Bob: “It's 4 °C outside, just open the window and let those remaining degrees in.”
Note that Maxwell’s demon may help Bob.

5.1 Important definitions
5.2 The temperature scale
5.3 Zeroth law of thermodynamics
5.4 Thermal energy and heat capacity
5.5 Ideal gas law
5.6 First law of thermodynamics
5.7 Thermodynamic systems
5.8 Equipartition theorem
5.9 Thermodynamic processes
5.10 Second law of thermodynamics
5.11 Heat engines
5.12 Kinetic gas theory
5.13 Phase transitions
5.14 Real gases
5.15 Stefan-Boltzmann law


Thermodynamics is a phenomenological treatment of classical macroscopic systems and their properties. The systems looked at in thermodynamics involve many degrees of freedom ($\sim 10^{23}$). Therefore it is impossible to solve the equations of motion exactly.

In this chapter, we will first look at some important definitions, before looking at ideal
gases and heat engines. In the end, we will also have a quick look at real gases and the
Stefan-Boltzmann law.

5.1 Important definitions

Mole: base unit for the amount of substance. One mole of a substance contains exactly $N_A$ particles. (See figure 5.1.)

Avogadro constant $N_A$: the number of particles in one mole. $N_A \approx 6.022 \times 10^{23} \, \mathrm{mol^{-1}}$.

Boltzmann constant $k_b$: an important proportionality constant in statistical physics that relates energy and temperature. $k_b \approx 1.381 \times 10^{-23} \, \mathrm{J \, K^{-1}}$.

Gas constant $R$: proportionality constant that relates the energy of a mole of particles to its temperature. Molar version of $k_b$: $R = k_b \cdot N_A$.

Volume $V$: space in which the system is confined.

Temperature $T$: quantity measured by a thermometer.

Pressure $p$: force applied per unit area.

Number of molecules $N$: number of molecules confined in the considered volume.

Amount of substance $n$: molar version of $N$. (See figure 5.1.)

Work $W$: energy transferred from or to the system by expansion/compression.

Heat $Q$: energy transferred between systems due to temperature differences.


Internal energy U :
also sometimes called just energy, all the energy inside the system.

Ideal gas: a collection of neutral atoms or molecules where the interaction between
individual particles can be neglected.

[Diagram: mass, volume, number of particles and amount of substance, connected by density, molar mass, molar volume, mass of one particle and the Avogadro constant.]

Figure 5.1: Connection between mass, volume, number of particles and amount of substance. The value for the molar volume is for one mole of an ideal gas at 0 °C and 101.325 kPa. Adapted from [9].

5.2 The temperature scale

Since there is no easy way to define temperature, we just state that the temperature is a
physical quantity that tells us in which direction the heat flows. Heat always flows in the
direction of the body with the lower temperature. A way to measure this temperature is
by using a thermometer, which uses thermal properties of materials to measure the tem-
perature. Examples are the thermal expansion of mercury, an electrical resistance that
changes with temperature or the volume of a gas at constant pressure.


The temperature is a scalar (a number) measure which is constant in an isolated system in thermodynamic equilibrium. There are 3 main scales for temperature (Kelvin, Celsius and Fahrenheit), but in physics, only Kelvin and, in rare cases, Celsius are used. In figure 5.2, a comparison between the scales can be seen.

The Kelvin scale starts at absolute zero, which is equal to −273.15 °C. At 0 K all thermal motion freezes. There is no temperature below 0 K. The Kelvin scale uses the same temperature step as the Celsius scale and therefore we can easily switch between the two:

\[ T[\mathrm{K}] = T[°\mathrm{C}] + 273.15 \tag{5.1} \]

Note that most equations in physics only work properly if you express every temperature in kelvin and in no other unit.
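Since forgetting this conversion is a common source of errors, it can help to wrap (5.1) in two tiny helper functions. The names below are made up for this sketch, and the range checks are an added convenience rather than something the text prescribes.

```python
ABSOLUTE_ZERO_CELSIUS = -273.15


def celsius_to_kelvin(t_celsius):
    """Convert a Celsius temperature to kelvin, see (5.1)."""
    if t_celsius < ABSOLUTE_ZERO_CELSIUS:
        raise ValueError("temperature below absolute zero")
    return t_celsius + 273.15


def kelvin_to_celsius(t_kelvin):
    """Convert a temperature in kelvin back to degrees Celsius."""
    if t_kelvin < 0.0:
        raise ValueError("temperature below absolute zero")
    return t_kelvin - 273.15


room = celsius_to_kelvin(17.0)
print(room, kelvin_to_celsius(room))
```

A temperature difference has the same numerical value in both scales, so only absolute temperatures need to be converted.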

Figure 5.2: Comparison of different temperature scales related to the energy scale via the Boltzmann constant $k_b$ (e.g. for $T = 300$ K ⇒ $k_b T = 4.1 \times 10^{-21}$ J) with the corresponding energies in zeptojoule (1 zJ = $1 \times 10^{-21}$ J).[10]


5.3 Zeroth law of thermodynamics

The zeroth law of thermodynamics states that if we bring two bodies (A & B) with differ-
ent temperatures (TA & TB ) into contact, the two bodies will reach a thermal equilibrium
(Teq ) after a certain time. Since the temperature is a measure of thermal movement of
atoms/molecules (see chapter 5.12), the equilibrium temperature Teq is between TA and
TB :

TA < Teq < TB or TA > Teq > TB , (5.2)

depending on which body had the higher temperature in the beginning.

5.4 Thermal energy and heat capacity

When two bodies with different temperatures touch each other, the heat of one body
flows to the other until they have the same temperature. This thermal energy Q is often
(especially in chemistry) measured in calories (cal):

1 cal = 4.1868 J, (5.3)

where one calorie is the energy one needs to heat up 1 g of water at normal pressure (p = 1 bar) from 287.65 K to 288.65 K. Today the SI unit joule should be used, but in many textbooks one can still find calories.

Different materials also take different amounts of energy $\Delta Q$ to increase the temperature by a certain amount $\Delta T$. Therefore we define the heat capacity $C$, which is a measure of how much thermal energy the body needs to heat up:

\[ C = \frac{\Delta Q}{\Delta T} \quad \Leftrightarrow \quad \Delta Q = C \cdot \Delta T \tag{5.4} \]

Usually the heat capacity is normalized to the mass of the body (specific heat capacity $C_m$) or the amount of substance (molar heat capacity $C_M$), i.e.

\[ C_m = \frac{1}{m} \frac{\Delta Q}{\Delta T} \tag{5.5} \]

and

\[ C_M = \frac{1}{n} \frac{\Delta Q}{\Delta T} . \tag{5.6} \]
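As a quick numerical illustration of (5.5), the sketch below computes the energy needed to heat a quarter litre of water. The function name is made up, and the specific heat capacity of water is taken from the calorie definition (5.3) and treated as constant over the whole temperature range, which is an approximation.

```python
CAL_IN_JOULE = 4.1868                 # from (5.3)
C_WATER = CAL_IN_JOULE * 1000.0       # J/(kg K): 1 cal per gram and kelvin (assumed constant)


def heat_required(mass_kg, specific_heat, delta_t):
    """Thermal energy from (5.5) solved for dQ: dQ = m * C_m * dT."""
    return mass_kg * specific_heat * delta_t


# 250 g of water heated from 20 degrees Celsius to 100 degrees Celsius
q = heat_required(0.25, C_WATER, 80.0)
print(q)
```

Only the temperature difference enters, so it does not matter here whether the temperatures are given in kelvin or degrees Celsius.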


5.4.1 Molar heat capacities of ideal gases

In general there are two different types of heat capacities (for gases). The heat capacity of a substance depends on how one measures it: If you measure it while keeping the pressure constant ($C_p$), the heat capacity is higher than when you keep the volume constant ($C_V$). This means that one needs different amounts of energy to heat the same amount of particles, depending on how one heats them up. This is because additional work is done when expanding the system while keeping the pressure constant. This can be seen from figures 5.3 and 5.4. The difference between the two molar heat capacities is exactly the gas constant

\[ R = C_p - C_V . \tag{5.7} \]

For a derivation look at equation (5.15) and divide both sides by $n \Delta T$.

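[Diagram: a gas in the state (V, p, T) receives the heat ∆Q = nC_V ∆T and ends in the state (V, p + ∆p, T + ∆T).]

Figure 5.3: Heat capacity with constant volume.

[Diagram: a gas in the state (V, p, T) receives the heat ∆Q = nC_p ∆T and ends in the state (V + ∆V, p, T + ∆T).]

Figure 5.4: Heat capacity with constant pressure.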


5.5 Ideal gas law

The most important equation in classical thermodynamics is the ideal gas law

pV = nRT, (5.8)

which can also be written as

pV = N kb T. (5.9)

It is a combination of the empirical Boyle’s law (p ∝ 1/V), Charles’s law (V ∝ T) and
Avogadro’s law (V ∝ n) and is only valid for ideal gases.[11] Such an ideal gas cannot
be liquefied like a real gas. For real gases there is, among others, the van der Waals
equation, which describes not only the behaviour in the gaseous phase, but also the liquid phase
as well as the phase transition between them.
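To get a feeling for the numbers, here is a minimal Python sketch of equations (5.8) and (5.9). The constants are the usual SI values; the chosen conditions (one mole at 273.15 K and 101325 Pa) and the function name are illustrative assumptions.

```python
# Sketch: the ideal gas law pV = nRT = N k_b T (eqs. 5.8 and 5.9).
R = 8.314462618      # J/(mol K), molar gas constant
K_B = 1.380649e-23   # J/K, Boltzmann constant
N_A = 6.02214076e23  # 1/mol, Avogadro constant


def volume(n_mol, temperature_k, pressure_pa):
    """Volume in m^3 of n_mol moles of an ideal gas."""
    return n_mol * R * temperature_k / pressure_pa


# One mole at 273.15 K and 101325 Pa (illustrative conditions).
v_molar = volume(1.0, 273.15, 101325.0)
print(f"{v_molar * 1000:.2f} litres")

# Both forms of the law agree because R = N_A * k_b.
p_from_particles = N_A * K_B * 273.15 / v_molar
print(f"{p_from_particles:.1f} Pa")
```

The second computation recovers the input pressure from the particle number N = n NA, which shows that the two forms of the law are the same statement.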

5.6 First law of thermodynamics

Temperature is connected to the kinetic energy of the particles in the gas (see chapter
5.12). This internal energy U(p, V, T) is a function of the parameters of the system:
pressure p, volume V and temperature T. These are state variables, as described in chapter 5.7.
A change in internal energy depends only on the states at the beginning (A) and the end (E) of
the process, ∆U = UE − UA. It therefore does not matter how one got from
one state to the other.

The first law of thermodynamics states that the energy of a system changes when thermal
energy Q is added or when mechanical work W is performed on the system:

$$ dU = \delta Q^{\swarrow} + \delta W^{\swarrow}. \tag{5.10} $$

This means that for an isolated system, to which we do not add thermal energy and on which
we do no work, the energy is constant. Note that here we introduced the convention that work
done on the system and thermal energy added to the system are positive (superscript ↙), while the work the
system does on its surroundings (superscript ↗) is negative.

5.7 Thermodynamic systems

A thermodynamic system is a collection of particles in thermal equilibrium and is
characterized by so-called state variables like volume V, temperature T, pressure p, entropy S,
number of particles N, amount of substance n and many more. In contrast, heat Q and
work W are not state variables but process functions, since they do not describe an
(equilibrium) state.

The different types of thermodynamic systems are characterized as follows.

Isolated systems
An isolated system exchanges neither matter nor energy with its surroundings. It is
completely isolated and will stay in thermodynamic equilibrium.

Closed systems
Like isolated systems, closed systems do not exchange matter with their surroundings.
They can, however, exchange energy, for example through thermal contact or by performing
work.

Open systems
Open systems may exchange energy as well as matter with their surroundings.

5.8 Equipartition theorem

The equipartition theorem states that the mean energy of each molecule with f degrees
of freedom (the number of independent motions that are allowed, e.g. moving in x, y and z
corresponds to 3 degrees of freedom) is given by

$$ \text{mean energy per molecule} = \frac{f}{2} k_b T. \tag{5.11} $$

Thus the energy of an ideal gas with n moles of molecules is given by

$$ U = n N_A \frac{f}{2} k_b T = n \frac{f}{2} R T. \tag{5.12} $$

Using ∆Q = nCV ∆T, we also find that CV = (f/2) R and therefore Cp = ((f + 2)/2) R. Due to
quantum mechanical effects, degrees of freedom can be “frozen out” at low temperatures.
This means they cannot be excited and therefore do not contribute to the internal energy.
This can be seen in figure 5.5.
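The relations CV = (f/2) R and Cp = CV + R can be tabulated quickly. The following Python sketch does this for two illustrative cases; the function name is hypothetical, and the ratio kappa = Cp/CV it returns is the quantity that appears later for adiabatic processes.

```python
# Sketch: molar heat capacities from the number of degrees of freedom f,
# using C_V = f/2 * R and C_p = C_V + R (eq. 5.7).
R = 8.314462618  # J/(mol K)


def molar_heat_capacities(f):
    """Return (C_V, C_p, kappa) for an ideal gas with f degrees of freedom."""
    c_v = f / 2 * R
    c_p = c_v + R
    return c_v, c_p, c_p / c_v


for label, f in [("monatomic (f = 3)", 3), ("diatomic, no vibration (f = 5)", 5)]:
    c_v, c_p, kappa = molar_heat_capacities(f)
    print(f"{label}: C_V = {c_v:.2f}, C_p = {c_p:.2f}, kappa = {kappa:.3f}")
```

Freezing out degrees of freedom, as described above, corresponds to lowering f in this sketch.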


Figure 5.5: Specific heat of H2 (schematic). The y-axis corresponds to f/2 = CV/R. The
vibration contributes two degrees of freedom, since it can store energy both as potential and as kinetic
energy.[10]

5.9 Thermodynamic processes

In this chapter we look at different processes that change the system by doing work on it or
extracting work from it, or by heating or cooling it. These processes are mostly characterized by the
variables they leave constant. The different processes are shown in a p-V diagram in
figure 5.6.

Isobaric processes
In an isobaric process the pressure is held constant. The work done on an expanding gas
from the outside is given by

$$ W^{\swarrow} = \int_{V_a}^{V_b} \delta W^{\swarrow} = -\int_{V_a}^{V_b} p \, dV, \tag{5.13} $$

where Va is the volume at the beginning and Vb the volume at the end of the process. Since
the pressure is constant over the whole expansion, we can take it out of the
integral and get the total work

$$ W^{\swarrow} = -p \int_{V_a}^{V_b} dV = -p (V_b - V_a) = -nR (T_b - T_a). \tag{5.14} $$

By using the equipartition theorem, we can also find the change in internal energy ∆U =
nCV ∆T and therefore, according to the first law of thermodynamics and the ideal gas
law, the heat exchanged is

$$ Q^{\swarrow} = \Delta U - W^{\swarrow} = n C_V \Delta T + nR \Delta T = n C_p \Delta T. \tag{5.15} $$
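The energy bookkeeping of an isobaric process can be checked numerically. The sketch below assumes an ideal gas with f degrees of freedom and uses the sign convention of the text (quantities entering the system are positive); the function name and the numbers are illustrative assumptions.

```python
# Sketch: energy bookkeeping for an isobaric process of an ideal gas
# (eqs. 5.14 and 5.15). Quantities entering the system are positive.
R = 8.314462618  # J/(mol K)


def isobaric(n_mol, f, t_a, t_b):
    """Return (W_in, delta_U, Q_in) in joules for heating from t_a to t_b."""
    c_v = f / 2 * R
    delta_t = t_b - t_a
    w_in = -n_mol * R * delta_t      # work done on the gas, eq. (5.14)
    delta_u = n_mol * c_v * delta_t  # change in internal energy
    q_in = delta_u - w_in            # first law, eq. (5.15)
    return w_in, delta_u, q_in


w_in, delta_u, q_in = isobaric(n_mol=1.0, f=3, t_a=300.0, t_b=400.0)
print(w_in, delta_u, q_in)
```

The work on the gas comes out negative because the gas expands while being heated, so part of the supplied heat leaves the system again as work.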


Figure 5.6: Different processes in a p-V diagram. [12]

Isothermal processes
For isothermal processes the temperature is held constant (T = const.) and therefore,
according to the equipartition theorem, the internal energy is constant as well (U = U(T) =
const.). The ideal gas law then tells us that the product pV is also constant. Since
the internal energy remains constant, the first law of thermodynamics tells
us that the incoming heat is entirely converted to work:

$$ Q^{\swarrow} = \int \delta Q^{\swarrow} = -\int \delta W^{\swarrow} = -W^{\swarrow} = W^{\nearrow} \tag{5.16} $$

Using the ideal gas law we can then calculate the work done by the system as

$$ W^{\nearrow} = \int_{V_a}^{V_b} p \, dV = nRT \int_{V_a}^{V_b} \frac{dV}{V} = nRT \ln \frac{V_b}{V_a} \tag{5.17} $$
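Equation (5.17) is easy to evaluate. The following Python sketch computes the work done by one mole of an ideal gas whose volume doubles at 300 K; these numbers and the function name are assumptions chosen for illustration.

```python
import math

R = 8.314462618  # J/(mol K)


def isothermal_work_by_gas(n_mol, temperature_k, v_a, v_b):
    """Work done by an ideal gas going from v_a to v_b at fixed T, eq. (5.17)."""
    return n_mol * R * temperature_k * math.log(v_b / v_a)


# Doubling the volume of one mole at 300 K (illustrative numbers).
w_out = isothermal_work_by_gas(1.0, 300.0, 1.0e-3, 2.0e-3)
q_in = w_out  # eq. (5.16): the absorbed heat is fully converted to work
print(f"W_out = Q_in = {w_out:.1f} J")
```

For a compression (Vb smaller than Va) the logarithm is negative, meaning that work is done on the gas and heat flows out of it.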

Isochoric processes
In isochoric processes the volume does not change (V = const.). As a consequence no
work is done and the first law of thermodynamics tells us

$$ dU = \delta Q^{\swarrow}. \tag{5.18} $$


Therefore the change in energy is simply given by

$$ \Delta Q^{\swarrow} = \int_{T_a}^{T_b} n C_V \, dT = n C_V \Delta T. \tag{5.19} $$

Adiabatic processes

If during a process the system does not exchange heat with the surroundings, one speaks
of adiabatic processes. This is for example the case when the system is thermally isolated
or when the process is so fast that the heat exchange with the surroundings is negligible.

We can write down the work and the change in internal energy for the process:

$$ \delta W = -p \, dV = -\frac{nRT}{V} \, dV \tag{5.20} $$
$$ dU = n C_V \, dT. \tag{5.21} $$

Since no heat is exchanged (∆Q = 0), the first law gives dU = δW, so for any adiabatic
process nCV dT + (nRT/V) dV = 0. Dividing by nT and integrating, we get

$$ \int \frac{C_V}{T} \, dT = -\int \frac{R}{V} \, dV \tag{5.22} $$
$$ C_V \ln(T) = -R \ln(V) + \text{const.} \tag{5.23} $$
$$ \Rightarrow T = \text{const.} \cdot V^{-R/C_V}, \tag{5.24} $$

which is exactly

$$ T V^{\kappa - 1} = \text{const.} \tag{5.25} $$

or equivalently

$$ p V^{\kappa} = \text{const.} \tag{5.26} $$
$$ T^{\kappa} p^{1 - \kappa} = \text{const}'. \tag{5.27} $$

Here κ is the ratio κ = Cp/CV, which implies R/CV = κ − 1 (using equation (5.7)).
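The adiabatic relations can be used directly to compute the final state of a gas. The sketch below applies equations (5.25) and (5.26) to a compression to one tenth of the volume; the value κ = 1.4 (a diatomic gas without vibration) and the starting conditions are illustrative assumptions.

```python
def adiabatic_final_state(t_a, p_a, v_a, v_b, kappa):
    """Return (T_b, p_b) after a reversible adiabatic change from v_a to v_b."""
    t_b = t_a * (v_a / v_b) ** (kappa - 1)  # T V^(kappa-1) = const, eq. (5.25)
    p_b = p_a * (v_a / v_b) ** kappa        # p V^kappa = const, eq. (5.26)
    return t_b, p_b


# Compressing a diatomic gas (kappa = 1.4 assumed) to a tenth of its volume.
t_b, p_b = adiabatic_final_state(t_a=300.0, p_a=1.0e5, v_a=1.0, v_b=0.1, kappa=1.4)
print(f"T_b = {t_b:.0f} K, p_b = {p_b:.3g} Pa")
```

The gas heats up considerably although no heat is exchanged, because all the work done on it goes into internal energy.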


5.10 Second law of thermodynamics

The first law of thermodynamics was about the exchange of energy with the surroundings.
The second law of thermodynamics is about the distribution of the molecules within the
volume of the system. The entropy of a system is a measure of the disorder in the system.
There are many formulations, but the best known one is

S = kb ln W, (5.28)

where S is the entropy and W the thermodynamic probability for the system to be in a given state. (To
be more precise, W is the number of microstates that realize a given macrostate defined by state variables; it must not be confused with the work W.
For a better explanation you can have a look at [13].)

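The second law of thermodynamics states that the entropy of an isolated system can never
decrease. More generally, when the heat δQ↙ is added to a system at temperature T,

$$ dS \geq \frac{\delta Q^{\swarrow}}{T}. \tag{5.29} $$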
There are also other formulations of the second law, like the one from Rudolf Clausius:

“Heat can never pass from a colder to a warmer body without some other change,
connected therewith, occurring at the same time.” [14]

or Lord Kelvin:

“It is impossible, by means of inanimate material agency, to derive mechanical
effect from any portion of matter by cooling it below the temperature of the coldest
of the surrounding objects.” [14]

The second law also splits processes into those which are reversible and conserve the
entropy, and those which are irreversible and do not conserve it. After a reversible
process, the system and its surroundings can be returned to their initial states. For reversible processes we
therefore have

$$ dS = \frac{\delta Q^{\swarrow}}{T}. \tag{5.30} $$

There is also the third law of thermodynamics which fixes the entropy at absolute zero
S(0 K) = 0. For more information have a look at [13].


5.11 Heat engines

Heat engines do work by transferring heat between two reservoirs at different tempera-
tures. One can imagine it a bit like a water mill, where the water from the higher level
produces mechanical work by running to the lower level.

Heat engines can be characterized by the associated cycle, which is a closed curve in
a p-V diagram. There are many different heat engines, like conventional car engines,
but we will have a look at the theoretically most efficient process, the Carnot cycle. The
Carnot cycle, which is depicted in figure 5.7, consists of two isothermal and two adiabatic
processes.

Figure 5.7: The Carnot cycle with its two isotherms and two adiabats. The area that
is enclosed by the cycle equals the total work done. [13]

To see how efficient the Carnot cycle is we have to write down the equation for every
step, shown in table 5.1.


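Table 5.1: The Carnot process.

| step   | process                | T       | W                        | Q                        |
|--------|------------------------|---------|--------------------------|--------------------------|
| 1: a→b | adiabatic compression  | T2 → T1 | δW1↙ = nCV (T1 − T2)     | 0                        |
| 2: b→c | isothermal expansion   | T1      | δW2↗ = nRT1 ln(Vc/Vb)    | δQ2↙ = nRT1 ln(Vc/Vb)    |
| 3: c→d | adiabatic expansion    | T1 → T2 | δW3↗ = nCV (T1 − T2)     | 0                        |
| 4: d→a | isothermal compression | T2      | δW4↙ = nRT2 ln(Vd/Va)    | δQ4↗ = nRT2 ln(Vd/Va)    |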

Furthermore we know that T V^(κ−1) = const. during adiabatic processes, so

$$ T_2 V_a^{\kappa - 1} = T_1 V_b^{\kappa - 1} \tag{5.31} $$
$$ T_2 V_d^{\kappa - 1} = T_1 V_c^{\kappa - 1} \tag{5.32} $$

and if we divide the two equations, we get the condition

$$ V_a \cdot V_c = V_b \cdot V_d. \tag{5.33} $$

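With all of that we can define the Carnot efficiency

$$ \eta_C = \frac{\text{work done}}{\text{supplied heat}} = \frac{-\delta W_1^{\swarrow} + \delta W_2^{\nearrow} + \delta W_3^{\nearrow} - \delta W_4^{\swarrow}}{\delta Q_2^{\swarrow}} = \frac{T_1 - T_2}{T_1}. \tag{5.34} $$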
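One way to convince oneself of equation (5.34) is to add up the four steps of table 5.1 numerically. The following Python sketch does this for an ideal gas with f degrees of freedom; the volumes, temperatures and the function name are illustrative assumptions, and the remaining two volumes follow from equations (5.31) and (5.32).

```python
import math

R = 8.314462618  # J/(mol K)


def carnot_efficiency(n_mol, f, t_hot, t_cold, v_b, v_c):
    """Efficiency from the four steps of table 5.1 (T1 = t_hot, T2 = t_cold)."""
    c_v = f / 2 * R
    kappa = (f + 2) / f
    # The adiabats fix the remaining volumes: T V^(kappa-1) = const.
    v_a = v_b * (t_hot / t_cold) ** (1 / (kappa - 1))
    v_d = v_c * (t_hot / t_cold) ** (1 / (kappa - 1))
    w1_in = n_mol * c_v * (t_hot - t_cold)            # adiabatic compression
    w2_out = n_mol * R * t_hot * math.log(v_c / v_b)  # isothermal expansion
    w3_out = n_mol * c_v * (t_hot - t_cold)           # adiabatic expansion
    w4_in = n_mol * R * t_cold * math.log(v_d / v_a)  # isothermal compression
    q2_in = w2_out                                    # heat from the hot reservoir
    return (-w1_in + w2_out + w3_out - w4_in) / q2_in


eta = carnot_efficiency(n_mol=1.0, f=3, t_hot=500.0, t_cold=300.0, v_b=1.0e-3, v_c=3.0e-3)
print(eta, 1 - 300.0 / 500.0)
```

The two adiabatic contributions cancel, so the result depends only on the two reservoir temperatures and not on the chosen volumes or on f.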

The Carnot cycle defines a thermodynamic cyclic process that produces work with the highest
achievable efficiency. All reversible heat engines between two heat reservoirs are
exactly as efficient as a Carnot engine operating between the same reservoirs.[14] Therefore
no real process can be more efficient:

$$ \frac{\text{work done}}{\text{supplied heat}} = \eta_{\text{real}} \leq \eta_C = \frac{T_{\text{hot}} - T_{\text{cold}}}{T_{\text{hot}}}. \tag{5.35} $$
It is also nice to see that the entropy does not change over a fully reversible cycle:

$$ \Delta S = \oint dS = \oint \frac{\delta Q}{T} = 0 \tag{5.36} $$

This is known as the Clausius equality for reversible processes.[15]


5.12 Kinetic gas theory

In this chapter we look at how microscopic motion explains the macroscopic properties
of a system. Solving the problem exactly is impossible due to the huge number of parti-
cles within a macroscopic volume, but with some approximations we can model the real
system very accurately. The most important approximations are:

• The gas consists of very many, small particles and we approximate them as points,
so we can neglect the volume they take.

• All particles are identical.

• The particles move with a constant, random velocity.

A full list of the approximations can be found on Wikipedia[16].

Figure 5.8: Molecules move around in a volume with random direction and speed, indi-
cated by arrows.[17]

We look at a cube of side length L in which the particles are confined, like the one in
figure 5.8. When a particle hits a wall elastically, the change in its momentum in the direction
perpendicular to the wall (e.g. x) is given by

$$ \Delta P = P_{\text{before},x} - P_{\text{after},x} = P_{\text{before},x} - (-P_{\text{before},x}) = 2 P_{\text{before},x} = 2 m v_x, \tag{5.37} $$


where P is the momentum. Furthermore, the particle hits any given wall periodically, with a time

$$ \Delta t = \frac{2L}{v_x} \tag{5.38} $$

between collisions, which gives an average force of one particle on the wall of

$$ F_{\text{particle}} = \frac{\Delta P}{\Delta t} = \frac{m v_x^2}{L}. \tag{5.39} $$

This results in a total force on the wall of

$$ F = N \cdot \bar{F}_{\text{particle}} = N \frac{m \overline{v_x^2}}{L}, \tag{5.40} $$

where the bar over the force and the velocity indicates mean values over all particles.
Since we do not have a bias in any direction, the average squared speed in any direction
should be the same:

$$ \overline{v^2} = \overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2} = 3 \overline{v_x^2}. \tag{5.41} $$

This leads to a pressure of

$$ p = \frac{F}{L^2} = \frac{N m \overline{v^2}}{3V}, \tag{5.42} $$

with a volume V = L³.

Comparing with the ideal gas law

$$ pV = \frac{N m \overline{v^2}}{3} = N k_b T, \tag{5.43} $$

and knowing that kb, m and N are constant, we see that the mean squared velocity is directly
connected to the temperature:

$$ k_b T = \frac{m \overline{v^2}}{3}. \tag{5.44} $$

By using this we can write down the average kinetic energy per particle

$$ \frac{1}{2} m \overline{v^2} = \frac{3}{2} k_b T, \tag{5.45} $$

which we recognize as the equipartition theorem for particles with 3 degrees of freedom
(they are able to move in x, y and z).
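Equation (5.45) gives a typical molecular speed once the mass of a molecule is known. The Python sketch below computes the root-mean-square speed; the molecular mass of nitrogen (about 28 u) is an assumed round value, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23            # J/K, Boltzmann constant
ATOMIC_MASS = 1.66053907e-27  # kg, atomic mass unit


def rms_speed(temperature_k, molecular_mass_u):
    """Root-mean-square speed from (1/2) m <v^2> = (3/2) k_b T, eq. (5.45)."""
    m = molecular_mass_u * ATOMIC_MASS
    return math.sqrt(3 * K_B * temperature_k / m)


# Nitrogen molecules (about 28 u, an assumed round value) at 300 K.
v_n2 = rms_speed(300.0, 28.0)
print(f"{v_n2:.0f} m/s")
```

The speed comes out at several hundred metres per second and grows only with the square root of the temperature, so four times the temperature doubles the speed.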


5.13 Phase transitions

Ideal gases do not show any phase transitions (e.g. condensation), but real gases, as
described in chapter 5.14, do. For every point in the p-T diagram, there is exactly one phase
that minimizes the energy. The system always tries to minimize this energy, and
therefore phase transitions happen when the system goes from one region to another. During
a phase transition the two phases coexist.

When a phase transition happens usually depends on the pressure as well as on the
temperature, as can be seen in figure 5.9. The figure shows the phase diagram of water,
with the freezing point at 0 °C and the boiling point at 100 °C for normal atmospheric
pressure. With changing pressure, the temperature necessary to boil or freeze water is
shifted, and one can even get liquid water that is colder than 0 °C.

Figure 5.9: Phase diagram of H2O. On the very left is the solid phase, in the middle the
liquid phase and on the right, extending all the way to the left at low pressure, is the
gaseous phase. [18]

When a system transitions from one phase to the other, heat is either released to the
surroundings (e.g. condensation) or taken from the surroundings (e.g. evaporation). This
heat is called the latent heat L and is usually given as the specific latent heat Lm, which is
normalized by the mass. Therefore one can calculate the heat Q needed to evaporate a
material only by knowing its mass and the specific latent heat:

Q = mLm . (5.46)

This is of course only valid if the system is already at the right temperature.
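If the material is not yet at the transition temperature, the heating and the phase transition have to be added up. The following Python sketch does this for water; both material constants are assumed approximate literature values that do not come from this script, and the function name is hypothetical.

```python
# Assumed approximate literature values for water, not taken from this script.
C_M_WATER = 4186.0   # J/(kg K), specific heat capacity of liquid water
L_M_VAPOUR = 2.26e6  # J/kg, specific latent heat of vaporisation near 100 °C


def heat_to_boil_away(mass_kg, start_temp_c):
    """Heat to bring liquid water to 100 °C and then evaporate it completely."""
    q_heating = mass_kg * C_M_WATER * (100.0 - start_temp_c)  # eq. (5.5)
    q_latent = mass_kg * L_M_VAPOUR                           # eq. (5.46)
    return q_heating, q_latent


q_heating, q_latent = heat_to_boil_away(1.0, 20.0)
print(f"heating: {q_heating / 1e3:.0f} kJ, evaporation: {q_latent / 1e3:.0f} kJ")
```

With these values the evaporation needs several times more energy than heating the water from room temperature to the boiling point.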

5.14 Real gases

In this script we only looked at ideal gases, but as an outlook we shall briefly discuss
real gases.

As a first approximation we can introduce two empirical parameters into the ideal gas law,
bringing us to the van der Waals gas:

$$ \left( p + \frac{a n^2}{V^2} \right) (V - nb) = nRT. \tag{5.47} $$
The two corrections are:

1) In addition to the outer pressure, the interaction between the molecules is
taken into account by the parameter a.

2) The volume in which the gas can move is reduced by the volume the molecules
occupy. This is taken care of by the parameter b.

Probably the biggest advantage of the van der Waals equation is that this
model can also take phase transitions into account, which the ideal
gas law cannot.
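To see how large the corrections are, one can solve equation (5.47) for the pressure and compare it with the ideal gas law. In the Python sketch below the parameters a and b for CO2 are assumed approximate literature values (they are not given in this script), and the function names are hypothetical.

```python
R = 8.314462618  # J/(mol K)

# Approximate literature values for CO2 (assumed, not from this script).
A_CO2 = 0.364    # Pa m^6 / mol^2
B_CO2 = 4.27e-5  # m^3 / mol


def pressure_ideal(n_mol, temperature_k, volume_m3):
    """Pressure from the ideal gas law, eq. (5.8)."""
    return n_mol * R * temperature_k / volume_m3


def pressure_van_der_waals(n_mol, temperature_k, volume_m3, a, b):
    """Pressure obtained by solving eq. (5.47) for p."""
    return (n_mol * R * temperature_k / (volume_m3 - n_mol * b)
            - a * n_mol**2 / volume_m3**2)


# One mole of CO2 at 300 K in one litre (illustrative conditions).
p_id = pressure_ideal(1.0, 300.0, 1.0e-3)
p_vdw = pressure_van_der_waals(1.0, 300.0, 1.0e-3, A_CO2, B_CO2)
print(f"ideal: {p_id / 1e5:.1f} bar, van der Waals: {p_vdw / 1e5:.1f} bar")
```

For a = b = 0 the second function reduces to the ideal gas law, which is a quick consistency check of the rearrangement.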

5.15 Stefan-Boltzmann law

Every body radiates power in the form of electromagnetic waves because of its temperature.
For example, molten metals emit light in the visible range. The power radiated
per area by a black body (a body that absorbs all incoming light) is given by the
Stefan-Boltzmann law

$$ P = \sigma T^4. \tag{5.48} $$


σ is the Stefan-Boltzmann constant and is given by

$$ \sigma = \frac{2 \pi^5 k_b^4}{15 c^2 h^3} \approx 5.670 \times 10^{-8} \ \mathrm{W\,m^{-2}\,K^{-4}}, \tag{5.49} $$

where c is the speed of light and h is the Planck constant.
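The numerical value in equation (5.49) can be reproduced from the fundamental constants. The following Python sketch does this and evaluates equation (5.48) for a black body at 300 K; the temperature and the function name are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # J/K, Boltzmann constant
H = 6.62607015e-34  # J s, Planck constant
C = 299792458.0     # m/s, speed of light

SIGMA = 2 * math.pi**5 * K_B**4 / (15 * C**2 * H**3)  # eq. (5.49)


def radiated_power_per_area(temperature_k):
    """Power per unit area radiated by a black body, eq. (5.48)."""
    return SIGMA * temperature_k**4


print(f"sigma = {SIGMA:.4e} W m^-2 K^-4")
print(f"black body at 300 K: {radiated_power_per_area(300.0):.0f} W/m^2")
```

Because of the fourth power, doubling the temperature increases the radiated power by a factor of 16.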

Chapter 6

OSCILLATIONS
A physicist asks a mathematician to
help him check the turn signals of his
car. The physicist enters the car and
activates the turn signal. He asks the
mathematician whether the signal
works. The reply: “Works, doesn’t
work, works, doesn’t work, works,
doesn’t work, ...”
Mathematicians normally argue that the roles
are reversed.

6.1 Introduction
6.2 Harmonic Oscillations
6.3 Beyond Harmonic Oscillations

6.1 Introduction
Important sources for this chapter include [19] and [20].
The term oscillation generally refers to the (periodic) variation of some physical quantity
with time around a point of equilibrium.
There is no strong consensus among physicists on a proper definition of oscillations.
Some prefer broader definitions, to the point of including phenomena that do not
“oscillate” in the common sense, while others prefer a narrower definition, at
the risk of excluding movements that show the characteristic “back-and-forth” variation.
Damping is the common example of a phenomenon in this “grey zone”; more on this
later.
A way of physically defining oscillatory systems (oscillators) is to require that the varying
quantity has an equilibrium value, and that the system naturally tends to go back to that
value if taken out of equilibrium by an external perturbation.
From that definition, we already see that oscillators are very common in physics, in
virtually every domain; in particular, oscillations happen each time the considered quantity
(which can be almost any mathematical quantity: a position, the strength of a field, the
voltage of an alternating current circuit, etc.) is bound to a potential with a local minimum.
In the following, we will take a more precise look at simple oscillations, notably harmonic
oscillations and their extensions - damping and resonance.
As a preamble, let us first define periodicity. This term refers to the idea of having some
time-dependent quantity that takes the same values over and over, in a regular pattern.
For example, considering the time-dependent variable x(t), we call it periodic if we can
find some T such that

x(t) = x(t + T )

for all times t.

As can be seen, if we find such a T , then 2T also satisfies the criterion, as well as any
multiple of T . Thus, we define the period of a periodic system as the (positive) smallest
such T .
It is clear from this definition that the period T has the units of an interval of time, thus
is given in s (seconds) in the SI, and measures the time it takes an oscillator to accomplish
an oscillation cycle.
We also define the frequency f (also commonly referred to as ν): it is the inverse of the
period, i.e. 1/T, and it measures how many oscillation cycles an oscillator accomplishes
per unit of time. Its SI unit is s⁻¹, also called Hz (hertz).


6.2 Harmonic Oscillations

Harmonic oscillations are considered the simplest type of oscillation. Despite this (or
more probably, because of this), they also provide a convenient model for most types of
(more complicated) physical oscillations. As is often the case, mathematical and physical
simplicity come together.
Mathematically, a quantity x(t) following a harmonic oscillation has the following form:

x(t) = A sin(ωt + ϕ) , (6.1)

that is, it simply follows a sine function, up to some scaling and shifting factors:

• A is called amplitude, and has the same units as x. It corresponds to the maximal
value of x.

• ω is called angular frequency (sometimes also pulsation) and corresponds to 2πf . Its
units are that of an angle over a time, i.e. radians per second in the SI. Mathemat-
ically, it is a scaling factor that allows transforming the parameter t into an object
of the right dimension for the sine function to absorb, i.e. ωt should have units of
an angle.

• ϕ is called phase and has units of an angle (for the same reason as just explained for
ωt), usually given in radians.

Note:

• some textbooks refer to 2A as the amplitude. Be careful!

• another definition can be created using a cosine instead of a sine. Those definitions
are equivalent and only differ by a shift of π/2 in the phase.


It is easy to check that harmonic oscillations are periodic with period T = 1/f; this simply
follows from the 2π-periodicity of the sine function:

x(t + T ) = A sin (ω(t + T ) + ϕ)

= A sin (ωt + ωT + ϕ)
= A sin (ωt + 2πf T + ϕ)
= A sin (ωt + 2πT /T + ϕ)
= A sin (ωt + ϕ + 2π)
= A sin (ωt + ϕ)
= x(t) .

6.2.1 Harmonic Oscillations, Spring/Mass Systems and Differential Equations
At this point, we want to look at a first concrete example of a harmonic oscillator and try
to relate the physical equations of such a system to the mathematical form of harmonic
oscillations described above.
The simplest example of a harmonic oscillator is perhaps the spring/mass system, where one
end of the (massless) spring is fixed and the other is attached to a free mass m. We
will only consider the 1D case here, where the mass can only move along one dimension.
We do not take any gravitational force into account.
We will see later that by accounting for further interactions (friction, driving by an external
movement, etc.) we can generalize this simple example and thereby study
various interesting oscillatory cases.

Newton’s equation for the mass is as follows:

$$ \sum_i F_i = ma. $$

The only force acting on m is the restoring force of the spring (Hooke’s law):

F = −kx ,


[Figure 6.1 shows a mass m attached to a spring with spring constant k.]

Figure 6.1: Spring-mass system. Adapted from [21].

where k is the spring constant (defining its stiffness) in N·m⁻¹ (equivalent to kg·s⁻²).
The origin of x is taken at the point of equilibrium of the spring.
Thus we have, by reordering and using that the acceleration is the second time derivative
of the position (a = ẍ):

$$ \ddot{x} + \frac{k}{m} x = 0. \tag{6.2} $$
Such an equation is called a homogeneous second-order linear constant coefficient ordinary differential
equation:

• differential, because it contains both a function (x(t)) and some of its derivatives
(ẍ(t)), and the goal is to determine x(t);

• ordinary, because the function x depends on only one variable t;

• constant coefficient, because the function and its derivatives only show up with con-
stant coefficients in the equation;

• linear, because the function and its derivatives only show up with degree 1 in the
equation (no square or such);

• second-order, because the highest derivative of x present is the second one;


• homogeneous, because there is no “free” constant or function of t in the equation (all
terms contain an x or at least one of its derivatives).

Due to Newton’s law, most of mechanics revolves around finding solutions to second-order
differential equations.
Our hypothesis above was that spring/mass systems are harmonic oscillators, i.e. we
believe that such systems perform harmonic oscillations (with the variable quantity being
the position of the mass).
To verify this, we have to check whether (6.1) and (6.2) are compatible, i.e. whether (6.1) is a
solution of (6.2).
Let’s insert (6.1) into (6.2):

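$$ x(t) = A \sin(\omega t + \varphi) $$
$$ \dot{x}(t) = \omega A \cos(\omega t + \varphi) $$
$$ \ddot{x}(t) = -\omega^2 A \sin(\omega t + \varphi) $$

$$ \ddot{x} + \frac{k}{m} x = -\omega^2 A \sin(\omega t + \varphi) + \frac{k}{m} A \sin(\omega t + \varphi) = \left( -\omega^2 + \frac{k}{m} \right) A \sin(\omega t + \varphi) \overset{!}{=} 0. $$

For the last equality to hold for all t, we need to have

$$ -\omega^2 + \frac{k}{m} = 0, $$

i.e.

$$ \omega = \sqrt{\frac{k}{m}}. $$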
Thus, a spring/mass system with the characteristics defined above does behave as a
harmonic oscillator with angular frequency √(k/m).
We can also conclude that any system with a single force −Kx, for some positive constant K,
acting on it is harmonic. Such a system has a potential V(x) = (1/2) K x², which also
suffices to characterize it.
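The result ω = √(k/m) can also be checked by integrating Newton’s equation numerically. The Python sketch below uses the velocity-Verlet scheme, which is our choice for this illustration and not part of the derivation above; the values of k, m and the initial position are arbitrary assumptions.

```python
import math


def simulate_spring(k, m, x0, v0, dt, steps):
    """Integrate x'' = -(k/m) x with the velocity-Verlet scheme; return final (x, v)."""
    x, v = x0, v0
    a = -k / m * x
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt**2
        a_new = -k / m * x
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v


# Illustrative values: k = 4 N/m, m = 1 kg, released at rest from x0 = 0.1 m.
k, m, x0 = 4.0, 1.0, 0.1
omega = math.sqrt(k / m)
period = 2 * math.pi / omega

steps = 10000
x_end, v_end = simulate_spring(k, m, x0, 0.0, period / steps, steps)
# After one full period the mass should be back near its starting point.
print(x_end, v_end)
```

If the predicted period were wrong, the mass would not return to its starting position with vanishing velocity after exactly one period.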


6.2.2 Further Examples

Beyond the spring/mass, other simple systems represent harmonic oscillators:
• Torsion Pendulum

Figure 6.2: Torsion pendulum. Adapted from [22].

Hooke’s law also holds for the angular position ϕ of an object of moment of inertia
J attached to a torsion spring of angular spring constant κ (with the torque M):

$$ M = -\kappa \varphi, $$

thus with Newton’s law for rotations:

$$ \sum_i M_i = J \alpha $$

$$ \ddot{\varphi} + \frac{\kappa}{J} \varphi = 0. $$

And we again find a solution of the form ϕ(t) = Φ sin(√(κ/J) t + θ).


• Simple Pendulum

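[Figure 6.3 shows a pendulum of length l with mass m, deflection angle φ and gravitational acceleration g.]

Figure 6.3: Simple pendulum.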

A simple pendulum is composed of a (massless) rod of length l fixed at one of its
ends to a pivot, and attached at the other end to a moving mass.
In the common case, one restricts oneself to two dimensions. Using the angle ϕ of the rod
with the vertical, one can then parametrize the system with one single variable;
thus the system has one single degree of freedom.
Two forces act on the mass, the gravity $\vec{F}_P$ and the tension $\vec{F}_T$.
Seen from the pivot point, the tension produces no torque (because it is collinear
with the rod, MT = 0), while the weight of the mass creates

MP = −mgl sin(ϕ)

(the minus sign indicating that the torque opposes the deviation of the rod
from the vertical).
Thus we get, using Newton’s law in rotational form (we assume the mass to be
point-like, so J = ml² by Steiner’s theorem):

$$ \ddot{\varphi} + \frac{g}{l} \sin(\varphi) = 0. \tag{6.3} $$

As can be seen, we again find a homogeneous second-order constant coefficient
ordinary differential equation, but this time it is not linear, due to the sine!
Therefore the simple pendulum is not harmonic; as can be shown, (6.1) is not a
solution of (6.3).
However, in the case of small oscillations, the angle ϕ itself remains small and we
can linearize the equation. This is because of the Taylor expansion of the sine:

$$ \sin(x) = x - \frac{x^3}{6} + \frac{x^5}{120} + O(x^7), $$

which allows us to approximate sin(x) ≈ x for small x.

Thus, for ϕ ≪ 1 (with ϕ in radians), we have

$$ \ddot{\varphi} + \frac{g}{l} \varphi \approx 0 $$

and we can say that the pendulum is approximately harmonic with angular
frequency √(g/l). Note that the condition ϕ ≪ 1 is necessary for the approximation
to hold!
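How small the angle has to be can be judged from the relative error of the approximation sin(ϕ) ≈ ϕ. The following Python sketch prints this error for a few angles; the chosen angles and the function name are illustrative assumptions.

```python
import math


def small_angle_relative_error(phi_rad):
    """Relative error made by replacing sin(phi) with phi."""
    return (phi_rad - math.sin(phi_rad)) / math.sin(phi_rad)


for degrees in (1, 5, 10, 30):
    phi = math.radians(degrees)
    print(f"{degrees:>2} deg: relative error {small_angle_relative_error(phi):.2e}")
```

To leading order the error grows like ϕ²/6, in agreement with the cubic term of the Taylor expansion above.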

6.2.3 Importance of Harmonic Oscillations

The fact that non-harmonic oscillations can be approximated in the vicinity of equilibrium
points by harmonic oscillations is general, and is the reason why harmonic oscillations are
so common and important.
More concretely, if we have a system with potential V(x) we can Taylor-expand (say
around 0):

$$ V(x) = V(0) + V'(0) x + \frac{V''(0)}{2} x^2 + O(x^3). $$

If the potential has a local minimum at 0, we have V′(0) = 0 and V″(0) > 0, thus for
small x we can neglect the higher orders and get


$$ V(x) \approx V(0) + \frac{V''(0)}{2} x^2, $$

which is the potential of a harmonic oscillator (V(0) is arbitrary and we can redefine V
in order to get rid of it). In terms of force we get

$$ F = -\frac{\partial V}{\partial x} = -V''(0) x, $$

which is obviously linear in x (V″(0) is the value of the second derivative at 0 and therefore
simply a number).

6.3 Beyond Harmonic Oscillations

Here we will consider our mass-spring system, but accounting for further phenomena.
In fact, harmonic oscillations represent an idealized case. In reality, we often have two
more ingredients in oscillatory systems:

• Damping
Damping occurs every time there is friction in the system, opposing the motion.
• Forcing
Forcing (also called driving force) appears when an external element interacts with the
oscillating system. Depending on this interaction, it can greatly increase
or reduce the oscillation amplitude.

In this script we will consider the following simple situation only: linear damping (with a
force proportional to the velocity, −bv) and sinusoidal forcing (with an external excitation
B sin(Ωt)).
Newton’s law gives for such a system:

−kx − bv + B sin(Ωt) = ma ,

which leads to the (in general inhomogeneous, due to the forcing) differential equation:

$$ \ddot{x} + 2 \zeta \omega \dot{x} + \omega^2 x = \frac{B}{m} \sin(\Omega t), $$


where ω = √(k/m) is the angular frequency of the undamped oscillator and ζ = b/(2√(mk)) is called the damping ratio.

The full mathematical treatment of this differential equation is beyond the scope of this
script (it is partly covered in the analogous case of oscillatory circuits in the AC part), so we
only provide the solutions without derivation below. We distinguish the following
cases:

• B=0
This is the free case, i.e. there is no forcing. The behavior of the system depends
on the damping ratio:

– ζ=0
There is no damping here, thus this is the common harmonic oscillator, as
discussed above.
As soon as ζ gets bigger than 0, the oscillator loses energy through the damping
(friction) and is not periodic anymore, but rather tends to go back to its
equilibrium point and stop there.
– 0≤ζ<1
This case is called underdamped and corresponds to solutions of the form

$$ x(t) = C e^{-\zeta \omega t} \sin(\tilde{\omega} t + \tilde{\varphi}), $$

with C and ϕ̃ arbitrary constants depending on the initial conditions. As
can be seen, the movement is a sinusoid modulated (squashed) by
an exponential decay. It is not periodic, but one can regard ω̃ as a
pseudo angular frequency and 2π/ω̃ as a so-called pseudoperiod of the oscillation. We have

$$ \tilde{\omega} = \omega \sqrt{1 - \zeta^2}. $$

– ζ=1
This is the critically damped case, with solutions

$$ x(t) = (C_1 t + C_2) e^{-\zeta \omega t}, $$

where C1 and C2 are again arbitrary constants depending on the initial conditions.


The system here approaches its equilibrium point in the fastest possible way,
which makes it technically useful in real-world applications, such as springs
for automatically closing doors (that should close fast but without oscillating)
or shock absorbers on vehicles.
– ζ>1
Here we have the overdamped case, with the system going back to its point of
equilibrium without oscillating, but more slowly than in the critical case described
above.
The solutions have the form

$$ x(t) = C_1 e^{-\left( \zeta - \sqrt{\zeta^2 - 1} \right) \omega t} + C_2 e^{-\left( \zeta + \sqrt{\zeta^2 - 1} \right) \omega t}, $$

with C1 and C2 defined in an equivalent way as above.

Figure 6.4: Linear damping with various ζ values. Adapted from [26].

• B ≠ 0
In the presence of forcing, we have to distinguish two contributions:

– First, there is a transitory solution depending on the initial conditions.

– Then, there is a steady-state solution depending only on the forcing.


The general solution is the sum of the transitory and steady-state solutions; here we
will consider the steady-state solution only. It has the form

$$ x(t) = \frac{B}{m \sqrt{4 \omega^2 \Omega^2 \zeta^2 + (\omega^2 - \Omega^2)^2}} \sin(\Omega t + \Phi) $$

with some phase Φ that we won’t discuss in this script. The interesting part is the
square root Z = √(4ω²Ω²ζ² + (ω² − Ω²)²).
For ζ < 1/√2, Z has a minimum at the forcing frequency Ωr = ω √(1 − 2ζ²),
thus the amplitude of the oscillation is strongly increased for that forcing frequency.
In the undamped case (ζ = 0), the amplitude even diverges
to infinity, meaning that for that forcing frequency Ωr (called resonance frequency) the
forcing continuously adds energy to the system.

[Figure 6.5 plots the amplitude ratio x(Ω)/x(0) against Ω/ω.]

Figure 6.5: Amplitude resonance with sinusoidal forcing. The maxima line corresponds
to Ωr. Adapted from [25].
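The position of the resonance can be cross-checked numerically. The Python sketch below evaluates the steady-state amplitude B/(mZ) on a grid of forcing frequencies and compares the location of the maximum with Ωr = ω √(1 − 2ζ²); all parameter values and function names are illustrative assumptions.

```python
import math


def steady_state_amplitude(B, m, omega, zeta, Omega):
    """Amplitude B / (m Z) of the steady-state solution."""
    z = math.sqrt(4 * omega**2 * Omega**2 * zeta**2 + (omega**2 - Omega**2) ** 2)
    return B / (m * z)


def resonance_frequency(omega, zeta):
    """Forcing frequency of maximal amplitude, valid for zeta < 1/sqrt(2)."""
    return omega * math.sqrt(1 - 2 * zeta**2)


# Illustrative values: B = 1 N, m = 1 kg, omega = 2 rad/s, zeta = 0.1.
B, m, omega, zeta = 1.0, 1.0, 2.0, 0.1
Omega_r = resonance_frequency(omega, zeta)

# Brute-force search over forcing frequencies as a cross-check.
grid = [i * 1e-4 for i in range(1, 40001)]
Omega_best = max(grid, key=lambda Om: steady_state_amplitude(B, m, omega, zeta, Om))
print(Omega_r, Omega_best)
```

The brute-force maximum agrees with the formula up to the grid spacing, and the amplitude there is several times larger than for a very slow forcing.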


6.3.1 Example of Forced Oscillating Systems

One of the most common examples of forcing is the swing: one has to supply energy
at the right time (i.e. with the right frequency) for the swing to increase its oscillation
amplitude.
More generally, any sufficiently weakly damped mechanical system has a resonance frequency
that can be used to make it oscillate strongly, even with a driving interaction of
relatively small amplitude.
During the engineering of any mechanical product, resonance has to be taken into account
in order to prevent unwanted oscillations that could even make it break. Among others,
vehicle parts should not be brought to resonance by the vibrations induced by the engine.
Buildings are another type of mechanical system where resonance has to be well
controlled. Modern bridges and skyscrapers often contain mechanical parts designed to
absorb oscillations.
One of the most well-known examples of a so-called resonance disaster is the destruction of
the Tacoma Narrows Bridge in 1940, where wind passing through its structure brought it to
twist and eventually led to its collapse.

Figure 6.6: The Tacoma Narrows Bridge collapsing [23]. See also [24].

Chapter 7

WAVES
~~~~~~~~~~~~
~~~~~~~~~~~~
~~~\______/~~~
~~~~~~~~~~~~
Shake the script to see the waves move.

7.1 Introduction
7.2 Harmonic Waves
7.3 Waves in 3D
7.4 Waves Propagation
7.5 Waves Propagation at Interfaces
7.6 Multi-Waves Phenomena


7.1 Introduction

In the previous chapter, we spoke about oscillations. Now we can ask ourselves the
question “what happens if we couple many oscillators together into a network, so that
oscillations of one of them also influence its neighbors?”
What happens is that the perturbation will little by little, neighbor after neighbor, “travel”
across the network of oscillators. Such a propagating perturbation is called a progressive
wave. Most waves are progressive waves, but there are some counterexamples such as
standing waves.
An exact definition of waves is difficult to construct, but one can nevertheless say that waves
are perturbations (variations of some quantity, typically in an oscillating way) in space and
time. This is the main difference from oscillations, which describe more or less abstract
quantities varying with time only: waves add the idea of space.
Mathematically, waves are often represented as functions of space and time $u(\vec{r}, t)$
($u(x, t)$ in the 1D case). Sometimes the oscillations composing the wave also have a
definite direction in space; the wave is then represented by a vector function
$\vec{u}(\vec{r}, t)$.
We will not go into much detail in this script, but the temporal evolution of those
functions is fixed by the wave equation, which in 1D has the form

$$\frac{\partial^2 u(x, t)}{\partial t^2} = v^2 \, \frac{\partial^2 u(x, t)}{\partial x^2} . \tag{7.1}$$

This is a partial differential equation (as opposed to an ordinary differential equation),
meaning that it contains partial derivatives: u is a function of several variables and is
differentiated (as in ∂u/∂t) with respect to one of those variables at a time, while the
other variables are held constant. This is a direct consequence of our letting space play a
role in the description of waves, in contrast to the ordinary differential equations we saw
for oscillations, where only time plays a role.
The v in the wave equation already hints at the velocity of a wave, which
is in general an important property. Sometimes c is used instead of v.
Waves can be seen as generalizations of oscillations, so in the following we will begin by
taking a look at general properties of waves, which may remind us of oscillations. Then
we will talk about their propagation and the associated media. Finally, the interference
phenomenon (two or more waves interacting with each other) will be discussed.


7.2 Harmonic Waves

We will provide a more in-depth description of waves using the example of harmonic
waves. Those are waves that can be described using a sinusoidal function:

u(x, t) = A sin(ωt − kx + ϕ) . (7.2)

As can be seen, this mathematical form is quite similar to the one we used for harmonic
oscillations, with the difference that the argument of the sine gains a spatial term.
Sometimes the reversed convention is used, u(x, t) = A sin(kx − ωt + ϕ′). Both conventions
are equivalent, up to a change in the phase from ϕ to ϕ′ = π − ϕ. Here we use a positive
sign for the time term and a negative sign for the space term, as it relates more directly
to the case of oscillations.

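[Figure, two panels. Space: u(x, 0) and u(x, ∆t) plotted against x, with the wavelength λ and the amplitudes ±A marked. Time: u(0, t) and u(∆x, t) plotted against t, with the period T and the amplitudes ±A marked.]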

Figure 7.1: A plane wave u(x, t) (with ϕ = 0) represented along each axis, with the other
set first to 0, then to some small value (dashed plot). Both plots are very similar, up to the
choice of the origin.


As was the case before, u as a function of t (provided that x is held constant) has a
sinusoidal form. This means that, at any given point, the “quantity the wave moves in”
oscillates with time. But it is also to be noted that the spatial coordinate x plays a role
in the formula equivalent to that of the temporal coordinate t. Thus the other way round
also holds: at any given time, the “quantity the wave moves in” oscillates with space. It is
therefore very important, when looking at a plot describing the amplitude of a harmonic
wave, to check whether it is drawn against time or space. In both cases the figure will
appear sinusoidal.
When talking about oscillations, we defined ω as the angular frequency, a measure of how
rapidly the quantity varies with time. In the formula above, x has a similar role
as t and it also has an associated factor k, called the wavenumber. And again similarly to
the time, where we defined the period T = 2π/ω as the time interval between (say) two
peaks of the oscillation (or wave, at a given point), we can define the wavelength λ = 2π/k,
which is nothing but the space interval (i.e. the distance) between (say) two peaks of the
wave (at a given, fixed time).
From this, we can derive a very important equation by inserting 7.2 into 7.1:

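$$\begin{aligned}
\frac{\partial u}{\partial t} &= \omega A \cos(\omega t - kx + \phi) \\
\frac{\partial^2 u}{\partial t^2} &= -\omega^2 A \sin(\omega t - kx + \phi) \\
\frac{\partial u}{\partial x} &= -k A \cos(\omega t - kx + \phi) \\
\frac{\partial^2 u}{\partial x^2} &= -k^2 A \sin(\omega t - kx + \phi)
\end{aligned}$$

$$-\omega^2 A \sin(\omega t - kx + \phi) = \frac{\partial^2 u}{\partial t^2} \overset{!}{=} v^2 \frac{\partial^2 u}{\partial x^2} = v^2 (-k^2) A \sin(\omega t - kx + \phi)$$

$$\Rightarrow \; \omega^2 = v^2 k^2 \quad \Rightarrow \quad v = \omega/k = \lambda/T = \lambda f ,$$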

where we assume v, ω and k to be positive (which is mostly a matter of convention).
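The relation v = ω/k can also be checked numerically. The following Python sketch is only an illustration: the parameter values and the helper names (`harmonic_wave`, `second_derivative`, `wave_equation_residual`) are assumptions that do not appear in the script. It approximates both second derivatives of (7.2) by finite differences and evaluates the difference between the two sides of (7.1).

```python
import math

def harmonic_wave(x, t, A=1.0, omega=2.0, k=4.0, phi=0.3):
    """Harmonic wave u(x, t) = A sin(omega*t - k*x + phi), as in (7.2)."""
    return A * math.sin(omega * t - k * x + phi)

def second_derivative(f, s, h=1e-4):
    """Central finite-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h**2

def wave_equation_residual(x, t, omega=2.0, k=4.0):
    """Left side minus right side of (7.1), using v = omega / k."""
    v = omega / k
    u_tt = second_derivative(lambda s: harmonic_wave(x, s, omega=omega, k=k), t)
    u_xx = second_derivative(lambda s: harmonic_wave(s, t, omega=omega, k=k), x)
    return u_tt - v**2 * u_xx

# The residual should be tiny compared with the individual terms,
# which are of order omega**2 = 4 for these assumed values.
print(wave_equation_residual(x=0.7, t=1.3))
```

Choosing any other value for v in `wave_equation_residual` would leave a residual of the same order as the terms themselves, which is the numerical counterpart of the condition ω² = v²k².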


This relates the temporal and spatial characteristics of waves through a parameter, the
velocity of the wave (which is not very surprising, considering that velocities express in
fact the division of a space interval by a time interval).

7.3 Waves in 3D

7.3.1 Waves as Functions of 3D Spatial Coordinates

When going from waves in one dimension to three, the first thing to notice is that
the function u(x, t) is now replaced by $u(\vec{r}, t)$.
In order to adapt our formula for harmonic waves, we define the wavevector $\vec{k}$ and write:

$$u(\vec{r}, t) = A \sin(\omega t - \vec{k} \cdot \vec{r} + \phi) . \tag{7.3}$$

While $\vec{k}$ still indicates the “rapidity” of variation of the wave with space (through its norm
$k = |\vec{k}|$), it now also has a direction, which is actually the direction of propagation of the
wave itself.
Formula 7.3 describes so-called plane waves: at a given time, all points in a plane
perpendicular to the wavevector share the same oscillation state.
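As a small illustration of this property, the Python sketch below evaluates formula 7.3 at a few points. The wavevector, the angular frequency and the sample points are arbitrary assumed values, and `plane_wave` is a hypothetical helper name.

```python
import math

def plane_wave(r, t, k_vec, omega, A=1.0, phi=0.0):
    """u(r, t) = A sin(omega*t - k.r + phi) for 3D tuples r and k_vec."""
    k_dot_r = sum(ki * ri for ki, ri in zip(k_vec, r))
    return A * math.sin(omega * t - k_dot_r + phi)

k_vec = (0.0, 0.0, 2.0)   # assumed wavevector: propagation along z
omega = 3.0               # assumed angular frequency
t = 0.4

r1 = (0.0, 0.0, 1.0)      # two points in the same plane z = 1,
r2 = (5.0, -7.0, 1.0)     # which is perpendicular to k_vec
r3 = (0.0, 0.0, 1.5)      # a point in a different plane

print(plane_wave(r1, t, k_vec, omega))  # same value as for r2
print(plane_wave(r2, t, k_vec, omega))
print(plane_wave(r3, t, k_vec, omega))  # different oscillation state
```

Only the component of the position along the wavevector enters the scalar product, which is why r1 and r2 give the same value.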

7.3.2 Waves as 3D Functions, Transversal and Longitudinal Waves, Polarization
Depending on the “quantity the wave moves in”, not only can the oscillations defining the
wave depend on all three spatial coordinates, but the oscillations themselves can
have a well-defined direction in space, which is mathematically indicated by a vectorial
function $\vec{u}(\vec{r}, t)$. In the harmonic case, we generally just modify 7.3 by using a
vectorial amplitude $\vec{A}$ to take this oscillation direction into account.
As we saw before, waves in 3D have a particular direction determined by $\vec{k}$, the
propagation direction. For waves whose oscillations themselves have a well-defined direction,
we have two possible cases:

• The oscillation direction can be perpendicular to the propagation direction, in
which case the wave is called transversal.

• Or it can be parallel, and we have a longitudinal wave.


Figure 7.2: Wave propagation in 3D. Top: transversal, vertically polarized wave; the
oscillations happen in the vertical direction, thus perpendicularly to the propagation
direction of the wave (given by $\vec{k}$). Bottom: longitudinal wave; the oscillations are parallel to $\vec{k}$.

In the first case, even for a fixed propagation direction the oscillation direction is not
unique, but can be freely chosen in the plane perpendicular to $\vec{k}$. We thus define the
polarization, which is the oscillation direction. Two transversal waves going in the same
direction can nevertheless have different polarization states.

7.4 Waves Propagation

In the introduction to the present chapter, we said that waves are perturbations travelling
through a “network” of coupled oscillators. Later on, we used the expression “quantity
the wave moves in” in several sentences. It is now time to try answering the question
“what does a wave move in?”, i.e. to find out what the propagation medium of waves is.


This in turn depends on the type of wave, and in particular on the nature of the oscillators.
In the following, we will introduce a couple of examples of wave forms as concrete cases
for the different properties that were discussed in the previous sections. This will allow
us to also get an insight into the different propagation media.

7.4.1 Sound
Sound is made of so-called mechanical waves, i.e. waves whose oscillators are the positions
of matter particles, be it in a solid, liquid or gas. In the case of sound, those waves
are longitudinal, so matter particles get pushed in the direction of propagation by
the preceding ones and in turn hit the next ones, which propagates the wave further and
lets them bounce back (on average) to their original position.
Like any mechanical wave, sound cannot propagate in the absence of a physical medium
(as in vacuum). Even if an enormous asteroid closely passes by your spacecraft in outer
space, you have no chance of hearing anything! (Unless the ship actually gets hit by some
part of the asteroid...)
The frequency of sound is directly related to the perceived pitch: a higher frequency
leads to a higher pitch. It also leads to a shorter wavelength, as sound travels at
approximately the same velocity in a given material (around 343 m·s⁻¹ in air) under given
conditions, independently of its frequency.
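To make the link between frequency and wavelength concrete, here is a minimal Python sketch based on v = λf from section 7.2. The function name `wavelength` and the three example frequencies are assumptions chosen for illustration (20 Hz and 20 kHz are roughly the limits of human hearing); the speed of sound is the approximate value quoted above.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, the approximate value quoted in the text

def wavelength(frequency_hz, speed=SPEED_OF_SOUND_AIR):
    """Wavelength in metres from lambda = v / f."""
    return speed / frequency_hz

for f in (20.0, 440.0, 20000.0):  # assumed example frequencies in Hz
    print(f, "Hz ->", wavelength(f), "m")
```

Audible sound in air thus spans wavelengths from centimetres to several metres.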

7.4.2 Light
While light also consists of waves, and while we can also sense it (in some cases),
one could hardly imagine anything more different from sound than light!
The first fundamental difference between them is that light is not a material wave, i.e. it is
not made of matter particles brought to oscillate. Moreover, it does not need any matter
to propagate (the question of the nature of a hypothetical physical medium for light
propagation, the “luminiferous aether”, actually played a part in the development of relativity).
While light does not need any material to propagate, the presence of a material can have
an impact on its propagation, such as changing its velocity, direction, etc. We will talk
about these general effects later on.
We said that waves travel on “networks of oscillators” so, even for light, there has to be
something supporting it. This “something” is the electromagnetic field, and thus light is
actually an electromagnetic wave. Concretely, this means that light beams are perturbations
across space that make the electric and magnetic fields oscillate locally.
In most cases (and most importantly in vacuum), light is a transversal wave. Thus, at any
point on the light path, both $\vec{E}$ and $\vec{B}$ are perpendicular to $\vec{k}$ ($\vec{E}$ and $\vec{B}$ are actually also
perpendicular to each other).

As a transversal wave, light is subject to polarization, given by the direction of the electric
field. The existence of polarization filters, which can block light depending on its
polarization orientation, is crucial for many technologies in both research and engineering;
but it is also at the origin of the most common 3D cinema system, where two images
are projected on the screen simultaneously, with perpendicular polarizations. The glasses
worn by the audience have two lenses with corresponding polarization filters, so that each
lens lets exactly one of the images pass through while blocking the other one.
The wavelength and frequency of electromagnetic waves cover a huge range of orders of
magnitude, which also leads to very different types of interaction with matter.

[Figure: chart of the electromagnetic spectrum, indicating for each band whether it penetrates Earth's atmosphere.
Radiation type: Radio, Microwave, Infrared, Visible, Ultraviolet, X-ray, Gamma ray.
Wavelength (m): 10^3, 10^−2, 10^−5, 0.5×10^−6, 10^−8, 10^−10, 10^−12.
Approximate scale of wavelength: Buildings, Humans, Butterflies, Needle Point, Protozoans, Molecules, Atoms, Atomic Nuclei.
Frequency (Hz): 10^4, 10^8, 10^12, 10^15, 10^16, 10^18, 10^20.
Temperature of objects at which this radiation is the most intense wavelength emitted: 1 K (−272 °C), 100 K (−173 °C), 10,000 K (9,727 °C), 10,000,000 K (~10,000,000 °C).]

Figure 7.3: Electromagnetic spectrum. [28]

7.4.3 Seismic Waves

Like sound, seismic waves (waves created by geological events in the Earth’s interior,
such as earthquakes) are material waves. Unlike sound, though, they are not necessarily
longitudinal. In fact, a single event often creates a bunch of seismic waves of different
types: longitudinal and transversal, but also waves travelling through the Earth or
across its surface, etc. These different types generally come with different
wavelengths, frequencies and velocities, which also depend on the propagation material.
In turn, seismology can take advantage of all these properties to better understand the
Earth’s inner structure as well as to study geological risks.


7.4.4 Transport
While they may transport energy, it is important to note that waves do not transport
matter (not taking into account the small local displacements in material waves).
This may seem counterintuitive, for example for waves on a water surface: the wave peaks
move across the surface, but the water particles themselves only move up and down. Thus
an object floating on the water, such as a duck, is itself made to oscillate (almost) only
vertically.

7.4.5 Doppler Effect

The Doppler effect is a phenomenon occurring when the emitter (an object creating a wave)
and/or the receiver (a hypothetical object observing the wave oscillations) of a wave move
with respect to each other and, in the case of material waves, with respect to the
propagation medium.
In this script we will restrict ourselves to the 1D case for sound waves.
Let’s consider an emitter travelling with velocity $v_{em}$ and creating a sound wave of
frequency $f_{em}$ and velocity v. Let’s also assume that a receiver travelling with velocity $v_{re}$ is
at a distance d from the emitter at a given time (which we will set to zero together with
the emitter’s initial position for the sake of simplicity).
Note that we do not make any assumption on the signs of $v_{em}$ and $v_{re}$ but, as we are
considering the 1D case, each object is moving either directly towards or away from the
other one. We will simply define $v_{em}$ and $v_{re}$ as positive when they are in the same
direction as the vector going from the emitter to the receiver, and as negative otherwise.
Moreover, both $v_{em}$ and $v_{re}$ are relative to the medium.

Let $t_1$ be the time taken by the sound emitted at the initial instant to arrive at the receiver.
We have:

$$v t_1 = d + v_{re} t_1 ,$$

thus $t_1 = d/(v - v_{re})$. Similarly, let $t_2$ be the time taken by the sound emitted one
period of the sound ($1/f_{em}$) later to arrive at the receiver, so:

$$v t_2 = d - v_{em} \frac{1}{f_{em}} + v_{re} \frac{1}{f_{em}} + v_{re} t_2 ,$$

so we have $t_2 = \left[ d + (v_{re} - v_{em}) \frac{1}{f_{em}} \right] / (v - v_{re})$.


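[Figure: emitter (velocity $v_{em}$) and receiver (velocity $v_{re}$) separated by a distance d; the sound travels with velocity v.]
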
Figure 7.4: Emitter and receiver. Due to the Doppler effect, the frequency of the sound
will be different for the emitter, the receiver and a neutral observer at rest with respect to
the medium.

From $t_1$ and $t_2$ we can calculate the frequency of the sound $f_{re}$ as observed by the receiver,
as a function of $f_{em}$ ($t_2 - t_1$ representing the difference in periods between the sound
received and emitted):

$$f_{re} = \frac{1}{t_2 - t_1 + \frac{1}{f_{em}}} = \frac{1}{\frac{v_{re} - v_{em}}{v - v_{re}} \frac{1}{f_{em}} + \frac{v - v_{re}}{v - v_{re}} \frac{1}{f_{em}}} = \frac{v - v_{re}}{v - v_{em}} \, f_{em} .$$

Note: the signs in the fractions depend on the way we defined the velocities. They vary
among textbooks and one should be careful when applying the formula.
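Because the sign convention is easy to get wrong, it may help to see the formula applied to a few cases. The Python sketch below uses the convention of this section (velocities positive when pointing from the emitter to the receiver, all relative to the medium). The function name `doppler_frequency`, the default sound speed of 343 m/s, the siren frequency and the vehicle speed are assumptions for illustration, as is the restriction to speeds below the wave speed in the positive direction.

```python
def doppler_frequency(f_em, v_em, v_re, v=343.0):
    """Received frequency f_re = f_em * (v - v_re) / (v - v_em).

    v_em and v_re are positive when they point from the emitter towards
    the receiver, as defined in the text; v is the speed of sound.
    """
    if v_em >= v or v_re >= v:
        # Assumed restriction: the emitter does not outrun its own sound
        # and the receiver does not outrun the sound chasing it.
        raise ValueError("v_em and v_re must be smaller than the wave speed v")
    return f_em * (v - v_re) / (v - v_em)

siren = 440.0  # Hz, assumed siren frequency
print(doppler_frequency(siren, v_em=30.0, v_re=0.0))   # emitter approaching: higher
print(doppler_frequency(siren, v_em=-30.0, v_re=0.0))  # emitter receding: lower
print(doppler_frequency(siren, v_em=20.0, v_re=20.0))  # no relative motion: unchanged
```

With both objects moving at the same velocity through the medium, the numerator and denominator are equal and the received frequency equals the emitted one.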
The Doppler effect typically leads to a higher pitch for an approaching source and a lower
one when it recedes from the observer. An everyday example is the siren of emergency
vehicles: one can clearly hear the frequency drop when the vehicle passes by.
It can also be used in technical applications, by sending a signal and analysing the
frequency of the signal reflected back: radars (using electromagnetic waves) and blood
flow imaging in cardiology (using ultrasound) are typical uses.

7.5 Waves Propagation at Interfaces

Under this title we understand phenomena where, in contrast to what we saw before
with (implicitly assumed) homogeneous, isotropic and infinite propagation spaces, there
is some element in the path, preventing the wave from freely propagating further.
An important principle for understanding those phenomena is Fermat’s principle: it states
that light (and this can be generalized to most waves) always “chooses” the locally fastest
way between two points, i.e. the way along which the wave takes the shortest time to travel.
We will see later on why this “locally” is important, when studying reflections.

7.5.1 Reflection
Reflection happens when a wave bounces back at an interface between its propagation
medium (be it material or not) and some material element.
Supposing that the element’s interface with the medium is sufficiently smooth, we can
define the incidence angle ϕi and the reflection angle ϕr between the surface normal and the
respective path elements.
Using Fermat’s principle, it can be shown that the path is locally the fastest exactly when
both angles are equal, ϕi = ϕr. We also easily see the local aspect of the principle: of course this
path is not the globally shortest one (a straight line between the extremities of the path
would be shorter), but of the continuum of paths going from one point to the other while
hitting the interface, it is the shortest one.
A straightforward way to see how Fermat’s principle implies equal angles is to imagine that
the mirror does not exist, but that instead one of the extremities of the path is mirrored,
i.e. on the other side of the mirror’s plane. Then it is clear that to minimize the distance,
one has to consider the straight line from that mirrored extremity to the other, unmodified
one. And it is also directly visible that the equality of angles should be respected in that
case.
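The argument can also be checked numerically. In the Python sketch below, the mirror is taken to be the line y = 0, the wave travels from A = (0, a) to B = (L, b), and the hit point (x, 0) is varied until the path is shortest (in a homogeneous medium, the shortest path is also the fastest). The geometry, the numerical values and all function names are assumptions made for this illustration.

```python
import math

def path_length(x, a, b, L):
    """Length of the path A -> (x, 0) -> B for A = (0, a) and B = (L, b)."""
    return math.hypot(x, a) + math.hypot(L - x, b)

def fastest_hit_point(a, b, L, iterations=200):
    """Ternary search for the hit point x in [0, L] minimizing the path length."""
    lo, hi = 0.0, L
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if path_length(m1, a, b, L) < path_length(m2, a, b, L):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def incidence_and_reflection_angles(x, a, b, L):
    """Angles between the surface normal and the two path segments."""
    return math.atan2(x, a), math.atan2(L - x, b)

a, b, L = 1.0, 2.0, 3.0            # assumed example geometry
x_min = fastest_hit_point(a, b, L)
phi_i, phi_r = incidence_and_reflection_angles(x_min, a, b, L)
print(x_min, L * a / (a + b))      # numerical minimum vs. mirror-image construction
print(phi_i, phi_r)                # the two angles agree to numerical precision
```

The second value on the first printed line is where the straight line from the mirrored point (0, −a) to B crosses the mirror, i.e. the construction described above.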

7.5.2 Refraction
Refraction occurs when a wave arrives at an interface between two propagation media, which
typically have different properties, such that the wave has a different propagation velocity
in each of them.
In order to take these velocities into account, one defines (in the case of light) the
refractive index n of a medium as the ratio of the velocity in vacuum c to the velocity in the
medium v: n = c/v. The refractive index depends slightly on the wavelength of the light,
which is one of the main reasons for chromatic aberration in refractive optical parts. We will
nevertheless ignore this detail in this script.
As can be shown using Fermat’s principle, the angles of incidence ϕ1 and of refraction
ϕ2 at the interface between media of refractive indices n1 and n2 are related by the
Snell-Descartes law:

n1 sin(ϕ1 ) = n2 sin(ϕ2 ) .
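A short Python sketch of how this law can be applied follows. The function names and the refractive indices (about 1.33 for water and 1.00 for air) are assumptions used for illustration. The sketch also covers the case, possible only for n2 < n1, in which the equation has no solution for ϕ2 and the wave is totally reflected (see Figure 7.6).

```python
import math

def refraction_angle(phi1, n1, n2):
    """Solve n1 sin(phi1) = n2 sin(phi2) for phi2 (angles in radians).

    Returns None when no refracted wave exists (total internal reflection).
    """
    s = n1 * math.sin(phi1) / n2
    if abs(s) > 1.0:
        return None
    return math.asin(s)

def critical_angle(n1, n2):
    """Incidence angle for which phi2 = pi/2; only defined for n2 < n1."""
    if n2 >= n1:
        raise ValueError("a critical angle only exists for n2 < n1")
    return math.asin(n2 / n1)

N_AIR, N_WATER = 1.00, 1.33   # assumed approximate refractive indices

# Air -> water: the wave is bent towards the normal.
print(math.degrees(refraction_angle(math.radians(30.0), N_AIR, N_WATER)))

# Water -> air: beyond the critical angle there is no refracted wave.
print(math.degrees(critical_angle(N_WATER, N_AIR)))
print(refraction_angle(math.radians(60.0), N_WATER, N_AIR))  # None
```

Returning `None` rather than raising an error for total internal reflection is a design choice of this sketch; it lets the caller treat "no refracted wave" as a normal outcome.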


[Figure: reflection geometry with the angles ϕi, ϕr and ϕ′i and the image point Ã marked.]

Figure 7.5: Reflection on a plane surface of a wave travelling from A to B. By constructing
the image Ã of A by symmetry w.r.t. the plane, one sees that ϕi = ϕ′i as well as ϕr = ϕ′i
(Fermat’s principle requiring ÃB to be a straight line). Hence ϕi = ϕr.


[Figure, three panels, each showing the interface between the media n1 and n2: the angles ϕ1 and ϕ2 with the points A and C; the angles ϕ′1 and ϕ′2 with the point A′; the angle ϕ″1 with the point A″.]

Figure 7.6: Top: Refraction at a medium interface of a wave travelling from A (in a medium
with refractive index n1) to C (index n2). Here, n2 < n1 is implied. Middle: limit case
with ϕ′2 = π/2; ϕ′1 is called the critical angle. Bottom: for ϕ″1 > ϕ′1, no refraction
is possible anymore, only reflection. This phenomenon is therefore called total internal
reflection.


7.5.3 Diffraction
Diffraction is what happens when part of the wavefront (the most advanced part of the
perturbation, with all points in the same oscillation state) hits an object: secondary waves
are formed at the hitting points, which make the wave ”turn around” the object, following
its curvature a bit. For the diffraction to have an important effect, the wavelength of the
wave has to be of roughly the same scale as the object in the way. This is typically the
reason why we can hear a sound without actually seeing its source, because the objects in
the way are at our scale - which is roughly also the sound’s scale - but are much too big
for visible light to get diffracted enough to come to us.

Figure 7.7: Diffraction of a plane wave passing through a slit. Notice how the diffracted
wave is approximately circular and thus also reaches zones (even if attenuated) that are
”hidden” sideways of the slit, along the barrier. [27]

7.6 Multi-Waves Phenomena

Until now, we only considered single waves going through space. However, as waves
are simply perturbations, nothing forbids two waves from being at the same place at the
same time. Thus we have to consider the so-called wave superposition, or wave interference.
The main idea here is the superposition principle, which states that, given two waves $u(\vec{r}, t)$
and $v(\vec{r}, t)$ of the same type (i.e. sharing the same oscillators), the perturbation resulting from
their combined effect is simply their algebraic sum $u(\vec{r}, t) + v(\vec{r}, t)$. The same goes for
vectorial waves as well.
The resulting perturbation can have different forms depending on what the original waves
look like, but the most important property is that the result is again a wave. In the
following, we will study the most common and interesting interference phenomena for 1D
harmonic waves.
In general, we will therefore look at

u(x, t) + v(x, t) = A1 sin(ω1 t − k1 x + ϕ1 ) + A2 sin(ω2 t − k2 x + ϕ2 )

and try to understand what is going on depending on the relation between A1 and A2, ω1
and ω2, k1 and k2, and ϕ1 and ϕ2.

7.6.1 Same amplitude, frequency and wavenumber

The simplest case is when A1 = A2 = A, ω1 = ω2 = ω and k1 = k2 = k, i.e. the
waves are very similar but can simply be shifted by some phase. For the sake of simplicity,
we will set ϕ1 = 0 and ϕ2 = ϕ.
We can use the sine addition formula to rewrite the interference:

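$$\begin{aligned}
u(x, t) + v(x, t) &= A \sin(\omega t - kx) + A \sin(\omega t - kx + \phi) \\
&= 2A \sin\left(\omega t - kx + \frac{\phi}{2}\right) \cos\left(\frac{\phi}{2}\right) .
\end{aligned}$$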

As can be seen, the resulting wave is again harmonic, with the same frequency and
wavenumber. Depending on ϕ, its amplitude can vary:

• For ϕ = 0, ±2π, ±4π, etc., the amplitude is maximal and equal to 2A. This is the
perfect addition of two identical waves.

• For ϕ = ±π, ±3π, etc., the amplitude is zero, i.e. the waves are exactly out of
phase and thus completely cancel each other.
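The two special cases above, and everything in between, can be reproduced with a few lines of Python. This is only a sketch: the function names and the values chosen for A, ω and k are assumptions. It compares the direct sum of the two waves with the product form and evaluates the amplitude 2A|cos(ϕ/2)| of the resulting wave.

```python
import math

def direct_sum(x, t, phi, A=1.0, omega=2.0, k=3.0):
    """u + v for two waves differing only by the phase shift phi."""
    return A * math.sin(omega * t - k * x) + A * math.sin(omega * t - k * x + phi)

def product_form(x, t, phi, A=1.0, omega=2.0, k=3.0):
    """The same sum rewritten with the sine addition formula."""
    return 2.0 * A * math.sin(omega * t - k * x + phi / 2.0) * math.cos(phi / 2.0)

def resulting_amplitude(phi, A=1.0):
    """Amplitude of the superposition, 2A |cos(phi/2)|."""
    return 2.0 * A * abs(math.cos(phi / 2.0))

for phi in (0.0, math.pi / 4, 3 * math.pi / 4, math.pi):
    print(phi, resulting_amplitude(phi))

# Both expressions agree at an arbitrary point and time.
print(direct_sum(0.3, 1.1, 0.8), product_form(0.3, 1.1, 0.8))
```

The amplitude exceeds that of a single wave as long as |cos(ϕ/2)| > 1/2.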


[Figure, four panels for ϕ = 0, ϕ = π/4, ϕ = 3π/4 and ϕ = π, each plotted against x with the amplitudes ±A marked.]

Figure 7.8: Superposition (thick plot) of two waves u(x, t) (dashed) and v(x, t) (dotted)
with the same amplitude, frequency and wavenumber and a phase difference ϕ for v with
respect to u. Due to the coefficient cos(ϕ/2) in the amplitude, the amplitude of the resulting
superposition wave can be bigger than those of the original waves (positive, or constructive,
interference) or smaller or even zero (negative, or destructive, interference).

7.6.2 Same amplitude and frequency, opposite wavenumber

This situation corresponds to two identical waves propagating in opposite directions, so
A1 = A2 = A, ω1 = ω2 = ω, k1 = −k2 = k. We can set ϕ1 = ϕ2 = 0 without loss
of generality.
Again using the addition of sines:

u(x, t) + v(x, t) = A sin(ωt − kx) + A sin(ωt + kx)

= 2A sin(ωt) cos(kx) .

Very interestingly, by summing the waves we get a separation of ωt and kx, which are
therefore uncoupled. This means that the oscillations of the resulting wave are static:
some places will constantly have zero amplitude due to the cos(kx) (“nodes”) while
others will have maximal amplitude (“antinodes”). Similarly, at some times the whole
wave will be zero everywhere due to the sin(ωt), etc.
Such a wave, which does not seem to travel along its medium, is called a stationary wave or
standing wave. It is quite common, as it forms every time a wave gets reflected back
and interferes with itself.
Note that the distance from node to node and from antinode to antinode is λ/2 (for space)
and T/2 (for time).
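As an illustration, the following Python sketch locates the spatial nodes of such a standing wave and checks that the sum of the two counter-propagating waves vanishes there at an arbitrary time. The wavenumber k = π (so that λ = 2), the other parameter values and the function names are assumptions made for this example.

```python
import math

def standing_wave(x, t, A=1.0, omega=2.0, k=math.pi):
    """Sum of two identical waves travelling in opposite directions."""
    return A * math.sin(omega * t - k * x) + A * math.sin(omega * t + k * x)

def node_positions(k, n_values):
    """Zeros of cos(k x): x = lambda/4 + n * lambda/2 with lambda = 2*pi/k."""
    lam = 2.0 * math.pi / k
    return [lam / 4.0 + n * lam / 2.0 for n in n_values]

nodes = node_positions(math.pi, range(-2, 3))   # lambda = 2 for k = pi
print(nodes)                                    # neighbouring nodes are lambda/2 apart

t = 0.37                                        # arbitrary time
print([standing_wave(x, t) for x in nodes])     # all (numerically) zero

# At the antinode x = 0 the oscillation reaches 2A when sin(omega*t) = 1.
print(standing_wave(0.0, math.pi / 4.0))
```

The node positions do not depend on t, which is exactly what distinguishes a standing wave from a progressive one.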

[Figure, two panels. Space: w(x, 0), w(x, ∆t) and w(x, 2∆t) plotted against x from −2λ to 2λ, with the amplitudes ±2A marked. Time: w(0, t), w(∆x, t) and w(2∆x, t) plotted against t from −2T to 2T, with the amplitudes ±2A marked.]

Figure 7.9: The resulting stationary wave w(x, t) = u(x, t) + v(x, t) represented in
both space and time. In both plots, nodes and antinodes remain at the same x, resp. t
coordinate, hence “stationary”. For our choice of wave equation and ϕ’s, spatial antinodes
are at nλ/2, while temporal nodes are at nT/2, n ∈ Z. Note that w(x, 0) = 0 for all x’s.

7.6.3 Slightly different frequencies

Another interesting case is A1 = A2 = A, k1 = k2 = k but ω1 ≠ ω2. We can set both
phases to zero without loss of generality.
Using once more the addition of sines:

$$\begin{aligned}
u(x, t) + v(x, t) &= A \sin(\omega_1 t - kx) + A \sin(\omega_2 t - kx) \\
&= 2A \sin\left(\frac{\omega_1 + \omega_2}{2} t - kx\right) \cos\left(\frac{\omega_1 - \omega_2}{2} t\right) .
\end{aligned}$$

What we have here is a conventional harmonic wave of angular frequency (ω1 + ω2)/2
(i.e. oscillating at the average of the two original angular frequencies), modulated by an
oscillation (not a wave: note the absence of x) of angular frequency (ω1 − ω2)/2. In the
case where ω1 ≈ ω2, this modulation will be slow with respect to the oscillation of the wave itself.
The result is the so-called beat phenomenon, with an almost harmonic wave periodically
increasing and decreasing in intensity. ω1 ≈ ω2 is needed for the phenomenon to be
visible/audible.
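A numerical example may help here. Since the intensity is maximal whenever the modulating cosine equals +1 or −1, maxima occur twice per period of the cosine, i.e. |f1 − f2| times per second for frequencies f = ω/2π. The Python sketch below is only an illustration with assumed tone frequencies (440 Hz and 444 Hz) and hypothetical function names; it also checks the product form of the sum at the fixed point x = 0.

```python
import math

def beat_frequency(f1, f2):
    """Number of intensity maxima per second, |f1 - f2| (frequencies in Hz)."""
    return abs(f1 - f2)

def two_tones(t, f1, f2, A=1.0):
    """Sum of the two waves observed at the fixed point x = 0."""
    return A * math.sin(2 * math.pi * f1 * t) + A * math.sin(2 * math.pi * f2 * t)

def product_form(t, f1, f2, A=1.0):
    """The same signal written as a fast oscillation times a slow modulation."""
    w1, w2 = 2 * math.pi * f1, 2 * math.pi * f2
    return 2 * A * math.sin(0.5 * (w1 + w2) * t) * math.cos(0.5 * (w1 - w2) * t)

f1, f2 = 440.0, 444.0            # assumed example tones in Hz
print(beat_frequency(f1, f2))    # beats per second
print(two_tones(0.123, f1, f2), product_form(0.123, f1, f2))  # identical values
```

Two slightly detuned instruments playing these tones together would thus produce a sound at about 442 Hz whose loudness rises and falls four times per second.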

[Figure, two panels plotted against t: the two factors of the beat (dotted harmonic wave and dashed modulating cosine), and the resulting beat wave w between −2A and 2A, with the times ±π/(ω1 − ω2) marked.]

Figure 7.10: The modulated beat wave is the result of the multiplication of the dotted
wave by a temporal oscillation (dashed) and an amplitude factor, 2A. Note that both the
beat and the dotted lines represent waves, and as such also move in space. In contrast,
the dashed line represents a pure temporal oscillation that only “modulates” (constrains
the amplitude of) the beat wave. Note also that the “envelope” of the beat wave is a
periodic, non-sinusoidal oscillation with a period half that of the modulating cosine.

7.6.4 Fourier Analysis

We just saw that the sum of two harmonic waves is again a wave, and in some
cases is even harmonic itself. We can generalize this idea: any superposition of harmonic
waves whose frequencies are integer multiples of a common frequency is periodic.
Conversely, an important result says that any periodic oscillation can be decomposed into
a sum of (potentially infinitely many) harmonic oscillations. Even more interestingly,
there exists a well-defined mathematical operation, called the Fourier transform, that does
exactly that. More precisely, given a periodic signal, it allows one to calculate its spectrum,
i.e. the distribution of amplitudes over the frequencies of the harmonic oscillations composing it.
Fourier analysis can be thought of as the counterpart of Taylor expansion in terms of
periodic signals instead of polynomials, and is a crucial instrument in the modern digital
world, among other uses. For example, analog audio signals can be converted (digitized)
into a sequence of numbers representing a finite approximation of the signal’s
spectrum, thus allowing efficient digital processing and storage.
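To give an idea of what such a decomposition looks like in practice, the Python sketch below computes the first few sine amplitudes of a square wave by numerical integration (for a periodic signal, this discrete set of amplitudes is usually called a Fourier series). The choice of signal, the midpoint integration rule and all names are assumptions made for this illustration. For this particular signal, the amplitudes are known to be 4/(nπ) for odd n and zero for even n.

```python
import math

def square_wave(t, period=1.0):
    """Periodic signal: +1 during the first half of each period, -1 otherwise."""
    return 1.0 if (t % period) < period / 2.0 else -1.0

def fourier_sine_coefficient(signal, n, period, samples=20000):
    """b_n = (2/T) * integral over one period of signal(t) * sin(2*pi*n*t/T),
    approximated with the midpoint rule."""
    dt = period / samples
    total = 0.0
    for i in range(samples):
        t = (i + 0.5) * dt
        total += signal(t) * math.sin(2.0 * math.pi * n * t / period)
    return 2.0 * total * dt / period

for n in range(1, 6):
    expected = 4.0 / (n * math.pi) if n % 2 == 1 else 0.0
    print(n, fourier_sine_coefficient(square_wave, n, 1.0), expected)
```

Summing the harmonic oscillations sin(2πnt/T) weighted by these amplitudes reproduces the square wave better and better as more terms are included.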

Chapter 8

FLUID DYNAMICS
How do you sink a submarine manned
by mathematicians? Just knock,
someone will surely open the hatch.
Toilets on submarines are similarly
dangerous. Search for U-1206.

8.1 Introduction
8.2 Notation
8.3 Pressure
8.4 Continuity equation
8.5 Bernoulli’s equation
8.6 Surface tension, energy and capillary pressure
8.7 Friction in fluids

8 Fluid Dynamics

In this chapter we discuss the behaviour of fluids macroscopically. Macroscopically
means that we do not look at the individual particles forming the fluid but at the fluid as a
continuous system of particles. This will then lead to important concepts such as
hydrostatic pressure or buoyancy. Furthermore we will encounter very fundamental equations
such as the continuity equation or Bernoulli’s equation.

8.1 Introduction

We will first explain the basic assumptions here and then present some important results
in the next sections.

8.1.1 What is fluid dynamics about?

Fluid dynamics describes the dynamic properties of fluids. Dynamic properties (or
systems) are those which can change in time, as opposed to static ones¹. The other part of the
name is fluid. A fluid is generally something which can flow² (i.e. it is a gas or a liquid).
However, the distinction between a fluid and a solid is not that easy in general, as
something may look solid if we only look at it for a short time but flow on larger timescales
(e.g. a glacier or the pitch drop experiment - google it).

8.1.2 How can we model such a fluid?

The first formal concept used is that of a trajectory. Intuitively this is the path of some
small particle put into the fluid, e.g. a leaf in a stream. For each point in space we can find
exactly one trajectory passing through this point (at any given point in time). The second
- more important - concept is that of a velocity field, which is just a function giving us
the velocity of a fluid at each point in space and time (see figure 8.1). If we have such a
velocity field we can define flow lines (analogously to field lines in electrodynamics). These
lines are such that they are tangent to the vectors of the velocity field at each point. In
the cases we look at here, they are identical to the trajectories³.

¹ Normally static systems are seen as a (mostly) easier special case of the corresponding dynamic systems,
i.e. we will also look at static properties in this chapter.
² Technically speaking, we could say that, in contrast to a solid, in a fluid two initially neighbouring particles
can move arbitrarily far away from each other.
³ In general they are not, namely if the velocity field depends explicitly on time.


Figure 8.1: A velocity field and a flowline (dashed).

8.2 Notation

Before we start, here is a short summary of the notation used throughout this chapter:

Velocity $\vec{v}$: the velocity of the fluid at some point (in space and time).

Speed $v = |\vec{v}|$: the absolute value of the velocity.

Pressure p: force applied per unit area.

Density ρ: the density of a fluid (in mass per volume).


8.3 Pressure
Pressure in fluid dynamics is the same concept as in thermodynamics. It is a force per
area which acts on any surface a fluid touches. The force is always perpendicular to the
surface on which it acts.

Note: In contrast to thermodynamics, the pressure here can also depend on the position
in the fluid (see below).

8.3.1 Compressible and incompressible fluids

One big difference between water and air is the compressibility. We say air is
compressible, which means we can change its density, e.g. by applying pressure. We cannot
do this with water⁴. To be able to describe this we would need some equation relating
pressure and density. One example would be the ideal gas law (exercise: why?). But we
will not calculate compressible flows.

8.3.2 Hydrostatic pressure

Suppose we have a long vertical pipe with radius r, closed at the bottom with a lid (see
fig. 8.2). If we now fill it with water up to some height h, the total volume of water
above the lid is V = πr²h. The force of all the water pushing on the lid is thus given
by $F = V \rho_{water} g$, where g = 9.81 m·s⁻² is the gravitational acceleration. The pressure
on the lid is now given by the force divided by the area πr² of the lid, which leads to p = ρgh. The
astonishing thing is that the area of the lid cancels. This means that the pressure at some
depth d (measured from the water surface) is independent of the shape of the tube. It also
means that if we have connected tubes the pressure at any height h should be the same
in every tube⁵.
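The cancellation of the lid area can be made explicit with a few lines of Python. This is a sketch only: the function names, the pipe radii, the water height and the round density of 1000 kg·m⁻³ for water are assumed values; g is the value quoted above.

```python
import math

G = 9.81             # m/s^2, gravitational acceleration from the text
RHO_WATER = 1000.0   # kg/m^3, assumed round value for water

def pressure_on_lid(radius, height, rho=RHO_WATER, g=G):
    """Pressure computed step by step: weight of the water column / lid area."""
    volume = math.pi * radius**2 * height
    force = volume * rho * g
    area = math.pi * radius**2
    return force / area

def hydrostatic_pressure(height, rho=RHO_WATER, g=G):
    """Closed form p = rho * g * h."""
    return rho * g * height

print(pressure_on_lid(0.05, 10.0))   # thin pipe, 10 m of water
print(pressure_on_lid(2.0, 10.0))    # wide pipe, same height: same pressure
print(hydrostatic_pressure(10.0))    # about 9.8e4 Pa
```

Note that this is only the pressure due to the water itself; any pressure acting on the water surface from above (such as atmospheric pressure) would have to be added.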

⁴ Of course we can, but the effects are much smaller than in air, and we neglect them here.
⁵ Taking tubes with open tops, we can deduce that they have the same water level.

Figure 8.2: A pipe filled with water.

8.3.3 Buoyancy
Buoyancy is a phenomenon which is due to the depth-dependence of the hydrostatic
pressure. This also means that we need a force like gravity acting on our fluid to have
buoyancy. To find a formula for this effect, consider a small cube of side length $a$ fully
submersed in water (fig. 8.3). Assume the top and bottom of the cube are perpendicular
to the gravitational field. Now we look at the forces due to the pressure. By symmetry,
the forces on opposite sides (front/back and left/right) cancel. Assume the top is at a
depth $d$; then the (downward) force on the top is $F_{\text{top}} = p a^2 = \rho g d a^2$. The bottom is at
depth $d + a$ and the (upward) force on it is $F_{\text{bottom}} = \rho g (d + a) a^2$. So the total (upward)
buoyancy force on the cube is given by $F_{\text{up}} = F_{\text{bottom}} - F_{\text{top}} = \rho g a \cdot a^2 = \rho g V_d$, where $V_d$ is the
volume of the water displaced by the cube⁶. If the cube is only partially submersed, we just
set $F_{\text{top}} = 0$ and arrive at the same equation. The same formula also holds for other bodies
(e.g. boats): intuitively, just think of them as being built from small cubes⁷. A fully submersed
body rises and ends up floating if $F_{\text{up}} > F_g = mg$. We can translate this into a
condition on the densities, namely a homogeneous body floats if $\rho_{\text{body}} < \rho_{\text{water}}$.

⁶ As long as the cube is fully submersed, this is the volume of the cube.
⁷ You could also do surface integrals over the whole body, which is much more tedious to do and leads
to the same results.
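As an illustration of the floating condition, here is a small sketch. The function names and the density used for ice (917 kg/m³) are assumptions made for this example; the submerged fraction follows from setting the buoyancy force of the displaced volume equal to the weight of the body.

```python
def buoyancy_force(rho_fluid, displaced_volume, g=9.81):
    """Upward force F_up = rho g V_d for a given displaced volume."""
    return rho_fluid * g * displaced_volume


def floats(rho_body, rho_fluid):
    """A homogeneous body floats if it is less dense than the fluid."""
    return rho_body < rho_fluid


def submerged_fraction(rho_body, rho_fluid):
    """Fraction of a floating homogeneous body that lies below the surface.

    Follows from F_up = m g, i.e. rho_fluid g V_d = rho_body g V.
    """
    if not floats(rho_body, rho_fluid):
        return 1.0  # the body does not float, so it is fully submersed
    return rho_body / rho_fluid


if __name__ == "__main__":
    # Ice in water, with an assumed ice density of 917 kg/m^3
    print(floats(917.0, 1000.0), submerged_fraction(917.0, 1000.0))
```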

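Figure 8.3: A cube submersed in water. The top face is at depth $d$, the bottom face at depth $d + a$; the pressure forces $\vec{F}_{\text{top}}$, $\vec{F}_{\text{bottom}}$, $\vec{F}_{\text{left}}$ and $\vec{F}_{\text{right}}$ act on the faces.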

8.4 Continuity equation

The continuity equation is a consequence of the conservation of mass. In a first step
assume we have water flowing through a tube. We take the tube to have different
cross-section areas, say $A$ at point $a$ and $B$ at point $b$. The total mass passing point $a$ in a
time interval $\Delta t$ is given by⁸ $v_a A \rho_a \Delta t$, and analogously at $b$. Conservation of mass
now means that those two quantities need to be equal, i.e. $v_a A \rho_a \Delta t = v_b B \rho_b \Delta t$. Time
cancels on both sides and we are left with⁹: $v_a A \rho_a = v_b B \rho_b$.

Now we look at a different system. Let's say we have some sort of bottle (of constant
volume $V$) and fill it with air through an opening of area $A$. We can again look at the total
mass flowing into the bottle in some $\Delta t$, which is given by $\Delta M = v A \rho_{\text{incoming}} \Delta t$. As the
air cannot escape, the mass inside the bottle increases and so does the density: $\Delta M = \Delta\rho\, V$.
We set this equal to the incoming mass (conservation of mass) and get
$\Delta\rho\, V = v A \rho_{\text{incoming}} \Delta t$. We can simplify this by introducing the time derivative of $\rho$:
$\frac{d\rho}{dt} = v \frac{A}{V} \rho_{\text{incoming}}$. We need to be a bit careful here, as the density of
the incoming fluid $\rho_{\text{incoming}}$ is independent of the change of density inside the bottle, $\frac{d\rho}{dt}$.

Combining the two parts, i.e. looking at a tube with changing density between two points
$a$ and $b$, we find:
$$
v_a A \rho_a - v_b B \rho_b = V_{ab} \frac{d\rho}{dt}, \tag{8.1}
$$
where $V_{ab}$ is the volume between the cross-sections $A$ and $B$.

⁸ We assume $v$ to be constant over the whole cross-section.
⁹ $\rho$ would also cancel here, but as we also want to look at gases, in general $\rho$ is not the same at $a$ and $b$.
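The two special cases of the continuity equation can be written as small helper functions. This is only a sketch under the assumptions of the text (velocity constant over the cross-section); the function names and the example numbers are made up for illustration.

```python
def outlet_speed(v_a, area_a, area_b, rho_a=1.0, rho_b=1.0):
    """Steady flow: solve v_a A rho_a = v_b B rho_b for v_b.

    For an incompressible fluid rho_a == rho_b and the densities cancel.
    """
    return v_a * area_a * rho_a / (area_b * rho_b)


def density_rate(v, area, volume, rho_incoming):
    """Bottle being filled: d(rho)/dt = v (A / V) rho_incoming."""
    return v * area / volume * rho_incoming


if __name__ == "__main__":
    # Water at 2 m/s in a tube narrowing from 4 cm^2 to 1 cm^2
    print(outlet_speed(2.0, 4e-4, 1e-4))
    # Air (1.2 kg/m^3) entering a 1 litre bottle through 1 cm^2 at 10 m/s
    print(density_rate(10.0, 1e-4, 1e-3, 1.2))
```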

8.5 Bernoulli’s equation

Let us just start with the full equation and then explain what it says:

$$
\rho \frac{v^2}{2} + \rho g h + p = \text{const.} \tag{8.2}
$$
This equation relates the velocity, the gravitational potential and the pressure. Note however
that it is only valid for incompressible fluids and that the constant needs to be taken along
one flow line (but in most problems here this constant will be the same for all flow lines).

8.5.1 Derivation
Assume we have a tube with a piston at each end, filled with water (see figure 8.4). Denote
the two ends and all quantities there with $a$ and $b$ respectively. Suppose piston $i$ has an
area of $A_i$ and is at height $h_i$ above ground. Let the water have pressure $p_i$ at the piston.
If we now displace piston $a$ by a small distance $d_a$, pushing a volume $V = d_a A_a$ of water,
piston $b$ will move by $d_b = \frac{A_a}{A_b} d_a = \frac{V}{A_b}$.

The work needed to displace piston $a$ is $W_a = d_a A_a p_a = V p_a$ and for piston $b$:
$W_b = -d_b A_b p_b = -V p_b$, where the minus sign comes from the fact that the force coming
from pressure now points into the other direction (compared to piston $a$). So the total
work we put into the system is given by $W_{\text{tot}} = W_a + W_b = V (p_a - p_b)$. If we look at
the energy of the fluid, we find two effects: We displace a volume $V$ of water from $h_a$ to
$h_b$, which gives a change in potential energy of $\Delta E_{\text{pot}} = \rho V g (h_b - h_a)$. The second effect
is the increase in kinetic energy $\Delta E_{\text{kin}} = \frac{\rho V}{2} (v_b^2 - v_a^2)$. If we now combine everything,
$W_{\text{tot}} = \Delta E_{\text{pot}} + \Delta E_{\text{kin}}$, divide by $V$ and rearrange the terms, we get Bernoulli's equation:
$p_a + \rho g h_a + \rho \frac{v_a^2}{2} = p_b + \rho g h_b + \rho \frac{v_b^2}{2}$.
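To see how Bernoulli's equation is used in practice, the following sketch solves it for the pressure at point $b$ when all quantities at $a$ and the velocity and height at $b$ are known. The function name and the numbers in the example are assumptions for illustration only.

```python
def pressure_at_b(p_a, v_a, h_a, v_b, h_b, rho=1000.0, g=9.81):
    """Solve rho v^2 / 2 + rho g h + p = const. along one flow line for p_b."""
    total = 0.5 * rho * v_a**2 + rho * g * h_a + p_a   # the constant, evaluated at a
    return total - 0.5 * rho * v_b**2 - rho * g * h_b  # remaining pressure term at b


if __name__ == "__main__":
    # Horizontal pipe whose cross-section shrinks by a factor 4, so by
    # continuity the water speeds up from 1 m/s to 4 m/s.
    print(pressure_at_b(p_a=2.0e5, v_a=1.0, h_a=0.0, v_b=4.0, h_b=0.0))
```

In the example the pressure drops where the water flows faster, as the equation demands when the height does not change.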

Figure 8.4: Derivation of Bernoulli's equation [45].

8.6 Surface tension, energy and capillary pressure

Surface tension comes from the fact that molecules at the surface of a liquid have no
molecules 'above' them and are attracted by those below. So there is a force acting on
them which needs to be compensated (e.g. by pressure) to have a static surface. Assume
that we increase the surface of a liquid by a small area $\Delta A$ (e.g. by bulging it out just a
little bit). In general we need to do some work $\Delta E$ to achieve this. The surface tension
is now given by $\sigma = \frac{\Delta E}{\Delta A}$.

Capillary pressure is the pressure forcing water up a thin tube
with radius $a$. This is also a surface effect, as the water molecules need less energy when
they are at the surface to the wall compared to somewhere in the liquid (i.e. they stick to
the wall). This pressure is given by:
$$
p_c = \frac{2\gamma \cos(\Theta)}{a} \tag{8.3}
$$
where $\gamma$ is the surface tension relative to the wall and $\Theta$ is the contact angle. To get the
height the liquid rises to, equate this pressure with the hydrostatic pressure $\rho g h$ and solve for
$h$.
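Equating the capillary pressure (8.3) with the hydrostatic pressure $\rho g h$ gives $h = \frac{2\gamma\cos(\Theta)}{\rho g a}$. A minimal sketch of this last step follows; the function name and the example value $\gamma \approx 0.072\,\mathrm{N/m}$ for water are assumptions, not values from the script.

```python
import math


def capillary_rise(gamma, contact_angle, radius, rho=1000.0, g=9.81):
    """Height h at which rho g h balances p_c = 2 gamma cos(Theta) / a."""
    p_c = 2.0 * gamma * math.cos(contact_angle) / radius  # capillary pressure (8.3)
    return p_c / (rho * g)                                # solve rho g h = p_c for h


if __name__ == "__main__":
    # Tube of radius 0.5 mm, contact angle 0, assumed gamma = 0.072 N/m
    print(capillary_rise(0.072, 0.0, 0.5e-3))
```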


8.7 Friction in fluids

Until now we tacitly assumed that our flows have no friction. As friction is in general
very complicated, we will only state two approximate results which hold for objects
moving through a fluid (e.g. a submarine, a ball, a car). There are two types of flows,
laminar and turbulent, which have different formulas for friction. In the laminar case we
get a force proportional to the velocity. An exact formula (only valid for a sphere with
radius $R$)¹⁰ is:
$$
F_r = 6\pi\eta R v \tag{8.4}
$$
where $v$ is the velocity and $\eta$ is the dynamic viscosity. The turbulent friction is given by:
$$
F_r = c_W A \rho \frac{v^2}{2} \tag{8.5}
$$
where $c_W$ is the drag coefficient (a constant depending on the material and form of the
object), $\rho$ is the density of the fluid and $A$ is the area of the object perpendicular to the
velocity (i.e. the area you see if you look in the direction of flow of the fluid).
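The different velocity dependence of the two friction laws can be made explicit with a short sketch. The function names are chosen for illustration, and the drag coefficient 0.47 used in the example is an assumed typical value for a sphere, not a number from the script.

```python
import math


def stokes_drag(eta, radius, v):
    """Laminar friction on a sphere, equation (8.4): F = 6 pi eta R v."""
    return 6.0 * math.pi * eta * radius * v


def turbulent_drag(c_w, area, rho, v):
    """Turbulent friction, equation (8.5): F = c_W A rho v^2 / 2."""
    return c_w * area * rho * v**2 / 2.0


if __name__ == "__main__":
    # Doubling the speed doubles the laminar force but quadruples the turbulent one
    print(stokes_drag(1e-3, 0.01, 0.2) / stokes_drag(1e-3, 0.01, 0.1))
    print(turbulent_drag(0.47, 1.0, 1.2, 20.0) / turbulent_drag(0.47, 1.0, 1.2, 10.0))
```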

10
A farmer wants to improve the milk production of his cows. He asks a biologist, an economist and a
physicist to help him. All three of them come to his farm and observe everything. After a day the economist
gives the advice to fire all cows and outsource the farm to China to improve the production by 2%. The
farmer doesn’t like this suggestion and waits for the other two. After a week the biologist presents the idea
of replacing the cows by gene manipulated algae to produce 10% more milk. The farmer decides to wait for
the physicist. After several weeks, the physicist appears with tons of paperwork and claims to have found an
idea to improve the production by over 60%. The farmer is really interested and asks the physicist to explain
his idea in more detail. The latter starts: "Assume cows to be spherical and in a vacuum (see fig. 8.5)."

Figure 8.5: A cow in a vacuum.

Chapter 9

ELECTRO- AND MAGNETOSTATICS

If it weren’t for electricity, we’d all be
watching television by candlelight.
George Gobel

9.1 Electrostatics
9.2 Potential and Voltage
9.3 Current and magnetic field

Electric attraction and repulsion is a fundamental property of charged particles. It is
described by the electric field. When charged particles move they generate an additional
field, the magnetic field¹. The theory of electrodynamics describes these fields,
how they interact with each other and with charges, and how they evolve in time.
Since the general case of time-dependent systems is quite complicated, we will focus on
static setups. This means that the charges, wires or whatever we are looking at do not
move and always stay at the same position. The main sources for this chapter are [29] and
[30].

9.1 Electrostatics
This chapter examines the interaction between point charges and how this interaction can
be described. Knowing the physics of point charges one can easily derive the physics for
charge distributions.

9.1.1 Coulomb Force

The basic experimental observation of electrodynamics is that there exist "things" that
can attract or repel each other in a way that is not due to gravity. These "things" we
call charges, and the attracting or repelling force the electric force or Coulomb force. One
observes that there exist two types of charge; we call them positive and negative charge.
A charge whose spatial extent is very small compared to the distance to other charges we
call a point charge. An ideal point charge is just a point in space with a charge.
If we take two point charges $q_1$ and $q_2$ which are separated by a constant distance $r$ we
measure a force $\vec{F}_{12}$ acting on $q_2$. This force is related to the charges according to
$$
\vec{F}_{12} = k \frac{q_1 q_2}{r^2} \vec{e}_r \tag{9.1}
$$
where $\vec{e}_r = \frac{\vec{r}}{|\vec{r}|}$ is the vector that points from $q_1$ to $q_2$ and has length 1. Such a vector
with length 1 we call a unit vector. $k$ is a constant which depends on how we define the basic
unit of the electrodynamic theory. There exist different systems of units and as a
consequence $k$ is different in each of those unit systems. We use the SI system, where the
current and the time are defined and therefore also the charge (see 9.3.1). The unit of
charge is the coulomb C (see exact definition in section 11.1.4). One electron has a charge
of magnitude $1.602 \cdot 10^{-19}$ C. We then get $k = \frac{1}{4\pi\varepsilon_0}$, where $\varepsilon_0 = 8.85 \cdot 10^{-12}\,\mathrm{F\,m^{-1}}$. The reason
why $k$ contains a factor $4\pi$ is explained in chapter 9.1.5.

¹ In relativity, the electric and magnetic field are not two independent fields, they have a strong relation
to each other. That is why it is often called the electromagnetic field. The relativistic treatment is out of
scope for this course.

The important properties we learn from this formula are:

• The Coulomb force always points along the line connecting the two
charges. The force acting on $q_1$ has the same magnitude as but opposite direction to the
force acting on $q_2$, which agrees with Newton's third law (actio = reactio).

• If both charges are positive or both negative, the two charges repel each other; if the
charges have opposite signs they attract each other.

• The force has a $1/r^2$ dependence as we know it from gravity (see also 9.1.5).

• In contrast to the gravitational force, repulsion is also possible. As a consequence
it is possible to shield a charge from the influence of other charges.
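The properties listed above can be checked with a direct implementation of equation (9.1). This is a sketch only; the function name `coulomb_force` and the representation of positions as coordinate tuples are choices made for this example.

```python
import math

EPS0 = 8.85e-12                   # vacuum permittivity in F/m, value from the text
K = 1.0 / (4.0 * math.pi * EPS0)  # Coulomb constant k in SI units


def coulomb_force(q1, pos1, q2, pos2):
    """Force on q2 due to q1 (equation 9.1) as an (x, y, z) tuple in newtons."""
    r_vec = [b - a for a, b in zip(pos1, pos2)]   # vector pointing from q1 to q2
    r = math.sqrt(sum(c * c for c in r_vec))      # distance between the charges
    magnitude = K * q1 * q2 / r**2                # signed: positive means repulsion
    return tuple(magnitude * c / r for c in r_vec)


if __name__ == "__main__":
    # Two equal charges of 1 microcoulomb, 1 m apart on the x-axis
    print(coulomb_force(1e-6, (0.0, 0.0, 0.0), 1e-6, (1.0, 0.0, 0.0)))
```

Swapping the roles of the two charges flips the sign of the result, and changing the sign of one charge turns repulsion into attraction.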

9.1.2 Electrostatic field

In the 18th and 19th century, when the theory of electromagnetism was developed,
there was a big discussion about how the force described by equation (9.1) can act over distances.
One point of view was that the force acts directly and that the two charges interact
instantaneously. This view contradicts relativity, which states that
information can propagate at most with the speed of light. Faraday successfully described the
Coulomb force by introducing the electric field: space has an additional property, the
electric field, which is influenced by the presence of charges, and the charges interact with
the field. If we apply this field theory to the example above with the two point charges,
we get that a charge, for example $q_1$, influences the field in such a way that the interaction
between $q_2$ and the field at the position of $q_2$ results in the Coulomb force. The simplest
way to describe this interaction is defining the electric field $\vec{E}$ as the quotient of the force $\vec{F}$ and the
charge $q_2$:
$$
\vec{E}_{q_1} = \frac{\vec{F}_{12}}{q_2} = \frac{q_1}{4\pi\varepsilon_0 r^2} \vec{e}_r \tag{9.2}
$$

To get the electric field $\vec{E}(\vec{r})$ at the position $\vec{r}$ of a given setup, one calculates the force
$\vec{F}$ that would act on a very small charge $q_0$ at the position $\vec{r}$ and divides the force by
the charge:
$$
\vec{E}(\vec{r}) = \frac{\vec{F}(\vec{r})}{q_0}
$$
The reason why $q_0$ has to be very small is that otherwise it might influence the other
charges and therefore the electric field. This becomes important if one looks at time-dependent
fields, which we do not treat here.

Fields are often drawn using field lines, see figure 9.1. If we put a very small charge into a field
line picture, the force on the small charge always points in the direction of the field line.
Therefore in a field line picture the field is always tangential to the field lines and the
strength of the field is proportional to the density of the field lines. Electric field lines
have the following qualitative properties: They start and end at the sources; in the electric
case they start at positive charges or at infinity and end at negative charges or at
infinity. Additionally, field lines want to be as short as possible but they repel each other.
This tendency to shorten and the mutual repulsion define a stable state for the field
lines which they will take. Of course this is only a qualitative argument for what the field
looks like, but often one gets a first intuition for a problem.
The advantage of describing the interaction of charges by a field is that it is possible to
describe the changing of the field with a finite speed. Therefore a change at one charge is felt
by the other only after a delay set by the speed of light, and not instantaneously.

9.1.3 Superposition
The Coulomb force also allows us to examine the field of multiple charges, because every
pair of charges interacts according to equation (9.1). Therefore the total force on a charge
$q$ is the sum of all forces between $q$ and the other charges $q_1 \dots q_N$. The definition of the
electric field is still the same as in equation (9.2) and we get
$$
\vec{E} = \frac{\vec{F}}{q} = \frac{\sum_{j=1}^{N} \frac{q_j q}{4\pi\varepsilon_0 r_j^2} \vec{e}_{r_j}}{q} = \sum_{j=1}^{N} \frac{q_j}{4\pi\varepsilon_0 r_j^2} \vec{e}_{r_j} \tag{9.3}
$$
where $r_j$ is the distance between $q_j$ and $q$ and $\vec{e}_{r_j}$ is the unit vector pointing from $q_j$ to $q$.
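Equation (9.3) translates directly into a loop over the charges. The sketch below is illustrative only: the function name and the representation of the charges as a list of `(q_j, position)` pairs are assumptions made for this example.

```python
import math

EPS0 = 8.85e-12  # vacuum permittivity in F/m, value from the text


def electric_field(charges, point):
    """Field at `point` from a list of (q_j, (x, y, z)) pairs, equation (9.3)."""
    field = [0.0, 0.0, 0.0]
    for q_j, pos in charges:
        r_vec = [p - s for p, s in zip(point, pos)]   # from q_j to the field point
        r = math.sqrt(sum(c * c for c in r_vec))
        # q_j / (4 pi eps0 r^2) times the unit vector r_vec / r
        factor = q_j / (4.0 * math.pi * EPS0 * r**3)
        for i in range(3):
            field[i] += factor * r_vec[i]
    return tuple(field)


if __name__ == "__main__":
    q = 1e-9
    # Midpoint between two equal charges: the contributions cancel
    print(electric_field([(q, (-1.0, 0.0, 0.0)), (q, (1.0, 0.0, 0.0))], (0.0, 0.0, 0.0)))
    # Midpoint between opposite charges: the contributions add up
    print(electric_field([(q, (-1.0, 0.0, 0.0)), (-q, (1.0, 0.0, 0.0))], (0.0, 0.0, 0.0)))
```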

Figure 9.1: Field lines of two charges: on the left both positive, on the right one negative
and one positive. In the left picture some vectors of the electric field are drawn, which
are tangent to the field lines. In the right picture the electric field in the oval with the
continuous line is much stronger than in the one with the dashed line, because in the oval
with the continuous line the field lines are much denser [31].

9.1.4 Continuous charge distributions

When there are a lot of point charges involved ($1\,\mathrm{C} \approx 6.2 \cdot 10^{18}$ electron charges) it is useful
to use the charge density $\rho = \frac{q}{V}$ instead of describing the electric field of every point
charge. The charge density contains the information how much charge $q$ a volume $V$
contains. Therefore the charge in the volume is given by $q = \rho V$ if $\rho$ is constant all over
$V$. It can also happen that the density depends on the position $\vec{y}$; we then write $\rho(\vec{y})$.
To compute the electric field at the position $\vec{x}$ we treat the charge that is in a very small
volume $dV(\vec{y})$ as a point charge located at the position $\vec{y}$. We then integrate the contributions
of all these $dV(\vec{y})$ which are located inside the total volume $V$ (so $\vec{y}$ is inside $V$). This
leads to the formula
$$
\vec{E}(\vec{x}) = \iiint_V \frac{\rho(\vec{y})}{4\pi\varepsilon_0 |\vec{x} - \vec{y}|^2} \frac{\vec{x} - \vec{y}}{|\vec{x} - \vec{y}|} \, dV(\vec{y}) \tag{9.4}
$$
where the $\rho(\vec{y})$ symbolises that one has to take the charge density at the location where
the $dV(\vec{y})$ is. It is not important to be able to calculate the electric field for a difficult
charge distribution. It is more important to understand the concept (see application 9.1.6).
Often the problems have some nice symmetries which make them easier to solve.
There might also be two-dimensional surface charge distributions or one-dimensional
charge distributions, which can be treated analogously to the three-dimensional case
discussed above (see 9.1.6).


9.1.5 Gauss’ law

The $1/r^2$ dependence in the Coulomb law (9.1) has an important consequence. When
we look at a point charge $Q$ around which we place an imaginary sphere, we observe that
the number of field lines that pass through the sphere does not depend on the radius of
the sphere, because field lines start and end at charges or at infinity. Let's formulate this
more mathematically. The electric flux $\Psi$ for a homogeneous electric field $\vec{E}$ is defined
as $\Psi = \vec{E} \cdot \vec{A}$, where $\vec{A}$ is the surface vector of a plane surface. The surface vector points
perpendicular to the surface. The length of the vector is equal to the area of the surface.
If the sign of $\Psi$ is positive, the electric field goes through the surface in the same sense as
the surface vector is pointing; if it is negative, in the opposite sense. Visually speaking, the
flux indicates how many field lines pass through the surface. If we now have a curved
surface, as is the case for a sphere, we have to split the surface $A$ of the sphere into small
pieces $d\vec{S}$ and we get a small amount of the flux by $d\Psi = \vec{E} \cdot d\vec{S}$ (see figure 9.2).

Figure 9.2: Curved surface with a small surface vector $d\vec{S}$ and the electric field $\vec{E}$ going
through the surface [32].

Since the surface $A$ of the sphere is always perpendicular to the radius vector, $\vec{E}$ and $d\vec{S}$
are parallel and the flux is simply $d\Psi = \pm E\,dS$, where $E$ and $dS$ are the absolute values
of $\vec{E}$ and $d\vec{S}$ (the $\pm$ indicates that one has to take into account whether the field lines go
into the sphere, which is the case if $Q < 0$, or out of the sphere, for
$Q > 0$). As we stated, the whole flux through the sphere should be independent of the
radius, and indeed the calculation confirms this:

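$$
\begin{aligned}
\Psi &= \oiint_A d\Psi \\
&= \oiint_A \vec{E} \cdot d\vec{S} \\
&= \oiint_A E \, dS \\
&= E \oiint_A dS = \frac{Q}{4\pi\varepsilon_0 r^2} \, 4\pi r^2 = \frac{Q}{\varepsilon_0}
\end{aligned} \tag{9.5}
$$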

The two integral symbols indicate that the integral is over a two-dimensional surface, and
the circle in the integral symbolizes that the surface is closed (it is the surface of a volume).
The step from the second last to the last line is valid since the electric field is everywhere
on the surface parallel to $d\vec{S}$ and has the same strength $E = \frac{Q}{4\pi\varepsilon_0 r^2}$ (because on the
sphere the distance to the charge is everywhere the same); therefore it does not depend
on $dS$ and we can take it out of the integral. The last integral is simply the sum of the
small surface areas $dS$ all over $A$ and therefore simply $A = 4\pi r^2$ itself.
Following the idea of field lines it is also clear that in any volume $V$ that has no charge in
it the number of field lines that enter the volume is the same as the number of field lines
that leave the volume. In mathematical notation this means that the total flux is zero:
$$
\Psi = \oiint_{\partial V} \vec{E} \cdot d\vec{S} = 0 \tag{9.6}
$$
where $\partial V$ is the surface of the volume. If we now combine this with the result from
equation (9.5), we get that, independent of the shape of the volume, the flux only depends on the amount
of charge that is inside the volume:
$$
\Psi = \oiint_{\partial V} \vec{E} \cdot d\vec{S} = \frac{Q_{\text{in}}}{\varepsilon_0} = \frac{1}{\varepsilon_0} \iiint_V \rho(\vec{y}) \, dV(\vec{y}) \tag{9.7}
$$
This formula is called Gauss' law, and its statement is that the flux through a closed surface
only depends on the electric charge inside the enclosed volume and not on the form of the surface.
One can conclude that the field lines are produced by the charges. It is very useful, for
example, to calculate the electric field in symmetric problems (see 9.1.6) or to find out
how charge is distributed.
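Gauss' law can be tested numerically for a surface that is not a sphere. The following sketch estimates the flux of a point charge through the surface of a cube with a simple midpoint rule; the function name, the choice of a cube and the grid resolution are assumptions made for this illustration. For a charge inside the cube the result should be close to $Q/\varepsilon_0$, for a charge outside close to zero.

```python
import math

EPS0 = 8.85e-12  # vacuum permittivity in F/m, value from the text


def flux_through_cube(q, charge_pos, half_side=1.0, n=100):
    """Midpoint-rule estimate of the flux of a point charge's field through
    the surface of a cube centred at the origin (side length 2 * half_side)."""
    h = 2.0 * half_side / n          # edge length of one small surface patch
    total = 0.0
    for axis in range(3):            # faces perpendicular to x, y and z
        for sign in (-1.0, 1.0):     # the two opposite faces
            for i in range(n):
                for j in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half_side
                    point[(axis + 1) % 3] = -half_side + (i + 0.5) * h
                    point[(axis + 2) % 3] = -half_side + (j + 0.5) * h
                    r_vec = [p - c for p, c in zip(point, charge_pos)]
                    r = math.sqrt(sum(c * c for c in r_vec))
                    # Field component along the face axis (Coulomb field)
                    e_axis = q * r_vec[axis] / (4.0 * math.pi * EPS0 * r**3)
                    # Outward normal is sign * unit vector, patch area is h * h
                    total += sign * e_axis * h * h
    return total


if __name__ == "__main__":
    q = 1e-9
    print(flux_through_cube(q, (0.2, 0.1, -0.3)) / (q / EPS0))  # charge inside
    print(flux_through_cube(q, (3.0, 0.0, 0.0)) / (q / EPS0))   # charge outside
```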


9.1.6 Examples
We now want to use the laws above to calculate the electric field of some configurations.

Infinite plane
Let's assume there is a thin (height = 0) plate with a surface charge density $\sigma$ (charge
per area) in the x-y plane (see figure 9.3). We again use Gauss' law (9.1.5), with a small
cylinder which has one face above and one face below the x-y plane, each parallel to the x-y plane and with
area $S$.

Figure 9.3: Infinite plate with the cylinder [34].

Since the plane is infinite and the charge density is uniform, the electric field has no
component that points parallel to the x-y plane, so the field only points in z-direction.
Therefore the whole flux through the surface of the volume goes through the faces
above and below the plane, and using Gauss' law one gets
$$
\frac{Q_{\text{in}}}{\varepsilon_0} = \iint_{2S} \vec{E} \cdot d\vec{S} \tag{9.8}
$$
$$
\frac{\sigma S}{\varepsilon_0} = E \cdot 2S \tag{9.9}
$$
Therefore the electric field is $\vec{E} = \pm \frac{\sigma}{2\varepsilon_0} \vec{e}_z$. The $\pm$ depends on whether one calculates the
electric field above or below the plate.

Infinitely long wire

Let's assume there is an infinitely long, straight wire on the z-axis which has a constant
charge density (per length) $\lambda$. The goal is to calculate the electric field at every point.
We'll use two different approaches.

In the first approach we use Gauss' law (9.1.5) and some symmetries to calculate the
electric field. To use Gauss' law we need some volume, and since the problem is rotationally
symmetric around the z-axis we choose a cylinder with length $l$ and radius $r$ around the
wire (see figure 9.4). Since the wire is infinitely long, there is no midpoint of the wire and
as a consequence the electric field must point radially. Hence the flux through
the two end circles $S$ (see figure 9.4) is zero. Furthermore the electric field has the
same strength everywhere on the side of the cylinder and is pointing outwards, parallel to
the surface vector. Therefore the scalar product is the same as the product of the
absolute values of the surface vector $dS$ and the electric field $E$. Now using Gauss' law
one gets

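$$
\begin{aligned}
\frac{\lambda l}{\varepsilon_0} = \frac{Q_{\text{in}}}{\varepsilon_0} &= \oiint_{\text{surface of cylinder}} \vec{E} \cdot d\vec{S} \\
&= \iint_A E \, dS \\
&= E \iint_A dS \\
&= E \, 2\pi r l
\end{aligned}
$$
where $A$ is the area of the side of the cylinder with $A = 2\pi r l$.

The electric field is therefore
$$
\vec{E} = \frac{\lambda}{2\pi\varepsilon_0 r} \vec{e}_r = \frac{\lambda}{2\pi\varepsilon_0 (x^2 + y^2)} \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}
$$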

Figure 9.4: Cylinder around the wire. The wire is along the z-axis.

The second approach uses the superposition principle for continuous distributions (see
9.1.4) and is more complicated. Since the whole problem is rotationally symmetric around
the z-axis and also translation invariant along the z-axis, the electric field only depends on
the distance $r$ from the wire, and to simplify the calculation we look at the electric field at
the point $x = z = 0$ and $y = r$. The density in this problem is per unit length, therefore
the $dV$ is replaced by a $dz$ in formula (9.4). So the integral looks like
$$
\begin{aligned}
\vec{E} &= \int_{z\text{-axis}} \frac{\lambda}{4\pi\varepsilon_0 \sqrt{x^2 + y^2 + z^2}^{\,3}} \begin{pmatrix} x \\ y \\ z \end{pmatrix} ds \\
&= \int_{-\infty}^{\infty} \frac{\lambda}{4\pi\varepsilon_0 \sqrt{r^2 + z^2}^{\,3}} \begin{pmatrix} 0 \\ r \\ z \end{pmatrix} dz
\end{aligned}
$$
Here $(x, y, z)$ is the vector from the charge element on the wire to the point where we
evaluate the field, so integrating over the whole wire corresponds to letting $z$ run over all
real numbers.

So we can look at the z and y components of the $\vec{E}$ field separately. For the z component
we get 0 because $\frac{z}{\sqrt{z^2 + r^2}^{\,3}}$ is an odd function (it is point symmetric about the origin) and the
integral of an odd function over a symmetric interval is always zero. Physically this can
be interpreted in the following way: when we compute the contribution to the electric field
from a point on the wire at $z_0$, we find a point $-z_0$ which contributes the same amount to
the z-component of the electric field but in the opposite direction. So the contributions from
$z_0$ and $-z_0$ to the z-component cancel out. Now let's calculate the y-component of the
electric field, which we have to do by solving the integral. Since $\lambda$ is constant over the
whole wire we can take it out of the integral:

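$$
E_y = \frac{\lambda}{4\pi\varepsilon_0} \int_{-\infty}^{\infty} \frac{r}{\sqrt{r^2 + z^2}^{\,3}} \, dz
$$

We now apply the substitution $z = r \sinh(u)$, $dz = r \cosh(u)\,du$ and get
$$
\begin{aligned}
E_y &= \frac{\lambda}{4\pi\varepsilon_0} \int_{-\infty}^{\infty} \frac{r^2 \cosh(u)}{\left(r^2 (\sinh(u)^2 + 1)\right)^{3/2}} \, du \\
&= \frac{\lambda}{4\pi\varepsilon_0} \int_{-\infty}^{\infty} \frac{1}{r \cosh(u)^2} \, du \\
&= \frac{\lambda}{4\pi\varepsilon_0 r} \Big[ \tanh(u) \Big]_{-\infty}^{\infty} \\
&= \frac{\lambda}{4\pi\varepsilon_0 r} \cdot 2 = \frac{\lambda}{2\pi\varepsilon_0 r}.
\end{aligned}
$$
This is the same result as we got with the first approach.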
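The agreement of the two approaches can also be checked numerically. The sketch below integrates the y-component for a long but finite wire with a midpoint rule and compares it with the Gauss' law result; the function names, the finite half-length and the number of steps are assumptions made for this illustration.

```python
import math

EPS0 = 8.85e-12  # vacuum permittivity in F/m, value from the text


def wire_field_numeric(lam, r, half_length, n=200_000):
    """E_y at distance r from a straight wire reaching from -half_length to
    +half_length, obtained by midpoint-rule integration of the superposition
    integral."""
    h = 2.0 * half_length / n
    total = 0.0
    for i in range(n):
        z = -half_length + (i + 0.5) * h
        total += r / (r * r + z * z) ** 1.5 * h   # integrand r / sqrt(r^2 + z^2)^3
    return lam / (4.0 * math.pi * EPS0) * total


def wire_field_gauss(lam, r):
    """Field of the infinitely long wire from Gauss' law."""
    return lam / (2.0 * math.pi * EPS0 * r)


if __name__ == "__main__":
    lam, r = 1e-9, 1.0
    # A wire 2000 times longer than the distance r is practically infinite
    print(wire_field_numeric(lam, r, half_length=1000.0), wire_field_gauss(lam, r))
```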

9.2 Potential and Voltage

An electric field produces a force on a charged particle. If one displaces the particle work
has to be done. This chapter looks at this work and the energetic properties of the electric
field.

9.2.1 Electric potential

The force $\vec{F}$ on a charged particle $q$ in an electric field $\vec{E}$ is given by $\vec{F} = q\vec{E}$. If one wants
to move the particle from one position $P_1$ to another position $P_2$, one has to overcome
the force $\vec{F}$; therefore one has to apply the force $\vec{F}_{\text{ext}} = -\vec{F} = -q\vec{E}$. Moving $q$ along a
path $S$ one has to do the work $W$
$$
W = \int_{P_1}^{P_2} \vec{F}_{\text{ext}} \cdot d\vec{s} = -\int_{P_1}^{P_2} \vec{F} \cdot d\vec{s} = -q \int_{P_1}^{P_2} \vec{E} \cdot d\vec{s}
$$

The minus sign causes the work to be positive if one drags a positive charge $q > 0$ against
the electric field, so one has to do work (actively). If one moves a charge $q > 0$
in the direction of the electric field, one gains energy. Therefore, if the work from $P_1$ to $P_2$
is positive, the charge has a higher potential energy at $P_2$ than at $P_1$. If the
energy at the reference point $P_1$ is chosen to be $E_0$, then the energy at the point $P_2$
is exactly $E_p = W + E_0$. Since only the energy difference between the two points
can be used, $E_0$ can be set to zero without changing the behaviour of the system. As a
consequence the energy at $P_2$ is proportional to the charge and we define a new property,
the electrostatic potential $\varphi(P_2)$ (often only called electric potential), which is defined as
$$
\varphi(P_2) = \frac{E_p}{q} = -\int_{P_1}^{P_2} \vec{E} \cdot d\vec{s} \tag{9.10}
$$

This formula is only true in the electrostatic case. For the dynamic case one can also
define an electric potential, but more care is needed. The electrostatic
potential describes the influence of the electric field on the energy of a charged particle $q$.

9.2.2 Electric potential of a point charge

If we understand the electric potential of a point charge we can easily generalize it to
multiple point charges or even to a general charge distribution.
Let $Q$ be a point charge at the origin of the 3-dimensional coordinate system and $q$
another point charge which we move. We want to examine the potential energy of $q$
depending on its position. Let's first move $q$ from a point $P_1$ to a point $P_2$ where both
points have the same distance $r$ from the origin. As path $S$ we choose a path on which we
always have the same distance $r$ from $Q$ (which is at the origin). Therefore we never
move radially and thus always perpendicular to the electric field, since the electric field
points radially (this means in the same direction as the connecting line of $Q$ and $q$, see
chapter 9.1.1). To calculate the potential at $P_2$ we integrate according to equation (9.10),
and since the electric field and the path are always perpendicular to each other the scalar
product is zero and as a consequence so is the whole integral. Therefore the energy of $q$ only
depends on the distance to $Q$, which is reasonable since the whole problem is spherically
symmetric.

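Let's now examine the dependence of $\varphi$ on $r$. The point $P_1$ shall have the distance $r_1$
and $P_2$ the distance $r_2$ from the origin. Then the potential difference is
$$
\varphi_2 - \varphi_1 = -\int_{r_1}^{r_2} \vec{E} \cdot d\vec{s} = -\int_{r_1}^{r_2} \frac{Q}{4\pi\varepsilon_0 r^2} \, dr \tag{9.11}
$$
$$
= -\frac{Q}{4\pi\varepsilon_0} \left[ -\frac{1}{r} \right]_{r_1}^{r_2} = \frac{Q}{4\pi\varepsilon_0} \left( \frac{1}{r_2} - \frac{1}{r_1} \right) \tag{9.12}
$$
Since the reference energy can be chosen freely, it is also possible to choose the reference
point $r_1$ freely. The simplest choice is $r_1 = \infty$, which gives
$$
\varphi(R) = -\int_{\infty}^{R} \frac{Q}{4\pi\varepsilon_0 r^2} \, dr = \frac{Q}{4\pi\varepsilon_0} \frac{1}{R} \tag{9.13}
$$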

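If we want to calculate the potential difference $\Delta\varphi$ between two distances $r_1$ and $r_2$ we
can use equation (9.13):
$$
\begin{aligned}
\Delta\varphi &= -\int_{r_1}^{r_2} \frac{Q}{4\pi\varepsilon_0 r^2} \, dr \\
&= -\int_{r_1}^{\infty} \frac{Q}{4\pi\varepsilon_0 r^2} \, dr - \int_{\infty}^{r_2} \frac{Q}{4\pi\varepsilon_0 r^2} \, dr \\
&= \varphi(r_2) - \varphi(r_1)
\end{aligned}
$$
which makes it easy to calculate potential differences of point charges.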
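As a cross-check of this result, the following sketch compares $\varphi(r_2) - \varphi(r_1)$ from equation (9.13) with a direct numerical evaluation of the integral over the field. The function names and the midpoint-rule integration are assumptions made for this example.

```python
import math

EPS0 = 8.85e-12  # vacuum permittivity in F/m, value from the text


def potential(Q, R):
    """Potential of a point charge Q at distance R, reference at infinity (9.13)."""
    return Q / (4.0 * math.pi * EPS0 * R)


def potential_difference_numeric(Q, r1, r2, n=100_000):
    """Minus the integral of E dr from r1 to r2, evaluated with the midpoint rule."""
    h = (r2 - r1) / n
    total = 0.0
    for i in range(n):
        r = r1 + (i + 0.5) * h
        total += Q / (4.0 * math.pi * EPS0 * r * r) * h   # E(r) dr
    return -total


if __name__ == "__main__":
    Q = 1e-9
    print(potential(Q, 2.0) - potential(Q, 0.5))
    print(potential_difference_numeric(Q, 0.5, 2.0))
```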


9.2.3 Potential of multiple charges

We want to examine the potential energy $E_p$ of a point charge $q$ at position $\vec{r}$ in a system
of $n$ charges $q_1 \dots q_n$ at the positions $\vec{r}_1 \dots \vec{r}_n$, where infinity has potential energy zero. Since
the force at any point is the sum of the forces between $q$ and each of the other charges,
the total potential energy is also the sum of the energies between $q$ and each other charge:
$$
\begin{aligned}
E_p(\vec{r}) &= -\int_{\infty}^{\vec{r}} \sum_{j=1}^{n} \frac{q q_j}{4\pi\varepsilon_0 |\vec{r} - \vec{r}_j|^3} (\vec{r} - \vec{r}_j) \cdot d\vec{s} \\
&= -\sum_{j=1}^{n} \int_{\infty}^{\vec{r}} \frac{q q_j}{4\pi\varepsilon_0 |\vec{r} - \vec{r}_j|^3} (\vec{r} - \vec{r}_j) \cdot d\vec{s} \\
&= -\sum_{j=1}^{n} \int_{\infty}^{r_j'} \frac{q q_j}{4\pi\varepsilon_0 r^2} \, dr = \sum_{j=1}^{n} \frac{q q_j}{4\pi\varepsilon_0 r_j'}
\end{aligned}
$$
where $r_j' = |\vec{r} - \vec{r}_j|$ is the distance between $q$ and $q_j$. The step from the second last to
the last line is possible because inside the sum we treat the interactions separately and we
can therefore apply equation (9.13). Therefore the potential of the $n$ charges is given by
$$
\varphi(\vec{r}) = \sum_{j=1}^{n} \frac{q_j}{4\pi\varepsilon_0 r_j'}
$$
We also find the potential of a continuous charge distribution $\rho$ in a volume $V$ by applying
the same argument as when we looked at the electric field in chapter 9.1.4, and we find
the potential at the position $\vec{r}$ by
$$
\varphi(\vec{r}) = \int_V \frac{\rho(\vec{y})}{4\pi\varepsilon_0 |\vec{r} - \vec{y}|} \, dV(\vec{y})
$$
where $\vec{y}$ is the position of the infinitesimally small charge $dq = \rho(\vec{y})\,dV(\vec{y})$ inside the volume.
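Because the potential is a scalar, the sum over point charges is even simpler to implement than the sum for the field. A minimal sketch, with a function name and a data layout (list of `(q_j, position)` pairs) chosen for illustration:

```python
import math

EPS0 = 8.85e-12  # vacuum permittivity in F/m, value from the text


def potential_at(charges, point):
    """Potential at `point` of point charges given as (q_j, (x, y, z)) pairs,
    with the reference point at infinity."""
    total = 0.0
    for q_j, pos in charges:
        distance = math.dist(point, pos)              # r_j' = |r - r_j|
        total += q_j / (4.0 * math.pi * EPS0 * distance)
    return total


if __name__ == "__main__":
    q = 1e-9
    # Halfway between two opposite charges the contributions cancel
    print(potential_at([(q, (-1.0, 0.0, 0.0)), (-q, (1.0, 0.0, 0.0))], (0.0, 0.0, 0.0)))
```

Note that the potential vanishes at the midpoint of the two opposite charges although the electric field there does not.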


9.2.4 Voltage
A potential difference is called voltage. The voltage between two points indicates how
much energy per charge a charge gains if it moves from one point to the other. The unit
of voltage is the volt V.
If a charge of one coulomb is moved between two points with a voltage of one volt between
them, then this charge gains one joule of energy. Therefore one volt is defined as 1 V = 1 J / 1 C. Once the volt is
defined one can also define a new energy unit: The energy one electron gains passing through a voltage of one
volt is called one electronvolt eV.
Voltage plays an important role in electric circuits (see chapter 10).
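The relation between joule, volt and electronvolt is simple arithmetic; the sketch below only makes the unit conversion explicit. The function names are made up for this example, and the elementary charge is the value quoted in section 9.1.1.

```python
E_CHARGE = 1.602e-19  # magnitude of the electron charge in C, value from the text


def energy_gain(charge, voltage):
    """Magnitude of the energy in joule gained by `charge` (C) passing `voltage` (V)."""
    return abs(charge * voltage)


def joule_to_ev(energy):
    """Convert an energy from joule to electronvolt."""
    return energy / E_CHARGE


if __name__ == "__main__":
    print(energy_gain(1.0, 1.0))                       # one coulomb through one volt
    print(joule_to_ev(energy_gain(E_CHARGE, 1000.0)))  # one electron through 1000 V
```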

The fact that the potential is only a function of position allows us to make an important
statement about the electric field. If we calculate the energy $W$ we have to apply to
move a charge $q$ from a starting point $\vec{r}_0$ along a closed path $\gamma$ whose end point equals
the starting point $\vec{r}_0$, we see that the applied energy is zero, $W = 0$, since the potential
difference between $\vec{r}_0$ and $\vec{r}_0$ is zero. Since $W$ is proportional to the charge $q$, the integral
of the electric field along the path must be zero:
$$
\oint_\gamma \vec{E} \cdot d\vec{s} = 0 \tag{9.14}
$$
This property makes the electric field an irrotational field (in contrast to the magnetic field,
see chapter 11.1.2). The statement that the electric field is an irrotational field is only
true for the static electric field. Therefore the electrostatic field is an irrotational source
field, which means that equation (9.14) holds and that the electric field lines begin and end at
sources, which we call charges.

9.2.5 Potential and conducting material

If a body made of conducting material is placed in an external electric field, the charges
in that material will move according to the electric field. In the end there will be a stable
state with some interesting properties:
1. The total electric field (sum of the external field and the field of the moved charges in the
conducting material) points perpendicular to the surface at every point on the surface.
If there were also a parallel component, the charge would accelerate
in this direction and it would not be a stable state.
2. Inside the body there is no electric field. Otherwise charge would again be
accelerated, which contradicts the assumption of a stable state.
3. Since the electric field inside the body is zero, the potential is the same all over the
body. If we look at two points of the body and want to calculate the
potential difference, we have to apply equation (9.10), and since $\vec{E} = 0$ we get that
the potential difference is zero.
4. Inside the body there is no net charge (this means positive and negative charge
have the same density). Otherwise it would not be possible to have the same potential
everywhere in and on the body. Therefore all net charge is on the surface of the body.

9.2.6 Capacity

Let’s take two bodies made of conducting material, each of which carries no net charge.
We call them electrically neutral (the negative and the positive charge on each have the
same amount). If we take some charge from the first body and put it on the second body,
there will be a potential difference ∆U between these bodies. Taking a very small charge q
(so small that it does not influence the potential or the electric field) and moving it from
one body to the other, we recognize that no matter which way we take, the absolute
value of the energy is always |∆U q|. Therefore the charge on each body is distributed
in a very particular way such that the voltage between any point on one body and any point
on the other is always ∆U. If we move more charge from the first to the second body the
voltage will be bigger, but the energy gained by displacing q from one body to the other is
still independent of the path. This means that the charge is distributed in the same way,
only with a bigger amount of charge. Therefore the direction of the electric field does not
depend on how much charge was moved from the first to the second body. Only the strength of
the electric field depends on this amount, and its sense (whether it points in one or the
other direction) depends on the sign of the displaced charge. Since the electric field is
proportional to the charge ∆Q displaced from the first body to the second, the charge on the
bodies and the voltage between them are proportional to each other:

C · ∆U = ∆Q

where C is the proportionality constant, which is called capacity (in English more commonly
capacitance). The capacity only depends on the geometry of the two bodies.


9.2.7 Example

Let’s look at some examples to get used to the theory.

Plate capacitor

Putting two metallic plates with area A each and both parallel to the y-z plane at a distance
d (with $d \ll \sqrt{A}$) we get a plate capacitor (see figure 9.5). Let us take the charge Q
from the left plate and put it on the right plate. Since $d \ll \sqrt{A}$ we model the
electric field of this setup as that of infinitely extended plates. This means the electric
field between the plates has the same strength and the same direction everywhere (see
footnote 2). Such a field is called a homogeneous field. Additionally we define the charge
density σ by $\sigma = Q/A$.

Figure 9.5: Plate capacitor [35].

We know from chapter 9.1.6 that the electric field is perpendicular to the plates and has
the same strength everywhere. Therefore we get the voltage U between the two plates by

Footnote 2: This is because the plates are infinitely extended. Therefore the electric field
must look the same everywhere (translation and rotation symmetry).


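$$
\begin{aligned}
U &= \int_{\text{left plate}}^{\text{right plate}} \vec{E}_{\text{tot}} \cdot d\vec{s} \\
  &= \int_{\text{left plate}}^{\text{right plate}} \vec{E}_{\text{left plate}} \cdot d\vec{s}
   + \int_{\text{left plate}}^{\text{right plate}} \vec{E}_{\text{right plate}} \cdot d\vec{s} \\
  &= E_{\text{left plate}}\, d - E_{\text{right plate}}\, d \\
  &= \frac{Q/A}{2\varepsilon_0}\, d - \frac{-Q/A}{2\varepsilon_0}\, d = \frac{Q d}{A \varepsilon_0}
\end{aligned}
$$

Therefore the capacity C is given by

$$
C = \frac{Q}{U} = \frac{A \varepsilon_0}{d}.
$$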
This problem was easy to solve because the electric field is homogeneous: in a
homogeneous electric field the voltage between two points $\vec{r}_1$ and $\vec{r}_2$ is
simply given by $U = \vec{E} \cdot (\vec{r}_2 - \vec{r}_1)$. If the line connecting the two
points is parallel to the electric field this simplifies even more to $U = |\vec{E}|\, d$,
where d is the distance between the two points.

Additionally we can calculate the energy Epot stored in the capacitor. In principle the
energy is given as Epot = QU. But we have to pay attention: while we charge the
capacitor, both the voltage and the charge change, so we have to add up the potential energy
of the different stages of the charging process by taking the integral

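$$
\begin{aligned}
E_{\text{pot}} &= \int_0^Q U(q)\, dq \\
               &= \int_0^Q \frac{q}{C}\, dq = \left[ \frac{q^2}{2C} \right]_0^Q = \frac{Q^2}{2C}
\end{aligned}
$$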

where we used the relation between the voltage and the charge given by $C = Q/U$. Instead
of integrating with respect to the charge we can also integrate with respect to the voltage
and we get


$$
\begin{aligned}
E_{\text{pot}} &= \int_0^U Q(u)\, du \\
               &= \int_0^U C u\, du = \frac{C}{2} U^2 = \frac{Q^2}{2C}.
\end{aligned}
$$

There is another approach to calculate the energy stored in the capacitor. Assume we
have two plates carrying charges of magnitude Q and opposite sign, separated by a small
distance δ. The force that one plate exerts on the other is
$F = EQ = \frac{Q}{2\varepsilon_0 A}\, Q$, where A is the area of the plates and E is the
field of a single plate. We can now compute the energy needed to pull one plate to the
distance d from the other plate. This energy is given as

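$$
\begin{aligned}
E_{\text{pot}} &= \int_\delta^d F\, ds \\
               &= \int_\delta^d \frac{Q^2}{2\varepsilon_0 A}\, ds = \frac{Q^2}{2\varepsilon_0 A} (d - \delta).
\end{aligned}
$$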

If we now set δ = 0 and use $C = \frac{A \varepsilon_0}{d}$ we recover the stored energy from
above: $E_{\text{pot}} = \frac{Q^2}{2C}$.
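The charging integral can also be checked numerically. The following is only a minimal
sketch: the plate size, the plate distance and the charge are hypothetical example values,
and the function names are chosen for illustration. It adds up the small energy portions
U(q) dq and compares the sum with the closed form Q²/(2C).

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA value)


def plate_capacity(area, distance):
    """Capacity C = epsilon_0 * A / d of an ideal plate capacitor."""
    return EPSILON_0 * area / distance


def charging_energy(capacity, total_charge, steps=10_000):
    """Add up U(q) * dq with U(q) = q / C (midpoint rule)."""
    dq = total_charge / steps
    energy = 0.0
    for i in range(steps):
        q_mid = (i + 0.5) * dq           # charge already on the plates
        energy += (q_mid / capacity) * dq
    return energy


# Hypothetical example values: 10 cm x 10 cm plates, 1 mm apart, charged to 1 nC.
C = plate_capacity(area=0.01, distance=0.001)
Q = 1e-9
numeric_energy = charging_energy(C, Q)
closed_form_energy = Q**2 / (2 * C)

print(f"C               = {C:.4e} F")
print(f"energy (sum)    = {numeric_energy:.6e} J")
print(f"energy (Q^2/2C) = {closed_form_energy:.6e} J")
```

Because U(q) is linear in q, the midpoint sum agrees with the closed form up to rounding
errors.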

Potential of an infinitely long wire

As we have seen in chapter 9.1.6 the electric field of an infinitely long wire with charge
density λ is $\vec{E} = \frac{\lambda}{2\pi\varepsilon_0 r}\, \vec{e}_r$ with $\vec{e}_r$ the
unit vector pointing radially away from the wire. To calculate the voltage between two
distances R1 and R2 we integrate radially from R1 to R2 (as a consequence $\vec{E}$ and
$d\vec{r}$ are parallel).


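$$
\begin{aligned}
\varphi(r) &= \int_{R_1}^{R_2} \vec{E} \cdot d\vec{r} \\
           &= \int_{R_1}^{R_2} E\, dr \\
           &= \int_{R_1}^{R_2} \frac{\lambda}{2\pi\varepsilon_0 r}\, dr \\
           &= \frac{\lambda}{2\pi\varepsilon_0} \left[ \ln(r) \right]_{R_1}^{R_2}
            = \frac{\lambda}{2\pi\varepsilon_0} \left( \ln(R_2) - \ln(R_1) \right)
\end{aligned}
$$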
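As a plausibility check, the radial integral can be added up numerically and compared with
the logarithm. This is only a sketch: the line charge density and the two radii are
hypothetical example values, and the function names are chosen for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA value)


def wire_field(line_charge, r):
    """Field strength E = lambda / (2 * pi * epsilon_0 * r) at distance r."""
    return line_charge / (2.0 * math.pi * EPSILON_0 * r)


def wire_voltage_numeric(line_charge, r1, r2, steps=20_000):
    """Add up E(r) * dr from r1 to r2 (midpoint rule)."""
    dr = (r2 - r1) / steps
    return sum(wire_field(line_charge, r1 + (i + 0.5) * dr) * dr
               for i in range(steps))


def wire_voltage_exact(line_charge, r1, r2):
    """Closed form lambda / (2 * pi * epsilon_0) * (ln(r2) - ln(r1))."""
    return line_charge / (2.0 * math.pi * EPSILON_0) * (math.log(r2) - math.log(r1))


# Hypothetical example values: 1 nC per metre, from 1 cm to 10 cm.
numeric_voltage = wire_voltage_numeric(1e-9, 0.01, 0.1)
exact_voltage = wire_voltage_exact(1e-9, 0.01, 0.1)
print(numeric_voltage, exact_voltage)
```

Note that the voltage only depends on the ratio R2/R1, so it grows without bound for
R2 going to infinity. This is a consequence of the idealisation of an infinitely long wire.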

9.3 Current and magnetic field

Current is basically moving charge. A current produces an additional field, called the
magnetic field, which can be measured. The existence of the magnetic field can be predicted
using relativity, but the calculation is far beyond the scope of this chapter, so we take a
phenomenological approach.

9.3.1 Current and conservation of charge

If charge is moving one speaks of a current. The unit of the current I is the ampere (A).
The precise definition of the current is $I = \frac{dQ}{dt}$, where Q is the charge that
passes a certain point.

An important concept of electrodynamics is the conservation of charge. This means that
the total charge of a closed system cannot change. It is for example possible to take an
electron away from a neutral atom. Then the atom is positively charged, but the total
charge, i.e. the sum of the charges of the electron and the atom, is still zero.
Therefore the charge at a point can only change if a current flows to or away from that
point. Conversely, a current either starts and ends at points where the charge changes, or
it flows in a closed circuit.


9.3.2 Magnets
Everybody has already seen a magnet, which usually looks like a small piece of metal. These
magnets are called permanent magnets since they are always magnetic. There also exist
magnets that work with current, which are called electromagnets (see 9.3.3). A magnet
produces a magnetic field which is somewhat similar to the electric field but has some very
important differences. A magnet always has two poles. These are the parts of the magnet
where the magnetic field leaves or enters the magnet. The north pole is the part where
the magnetic field leaves the magnet and the south pole is where the magnetic field enters
the magnet (see figure 9.6). Inside the magnet the field lines go from the south to the
north pole, so they form closed field lines (see chapter 11.1.4 and equation (9.15)).
The names of the poles come from the fact that the earth also has a magnetic field: the
north pole of a magnet is attracted towards the geographic north pole and the south pole
of the magnet towards the geographic south pole.

Figure 9.6: Magnet with its magnetic field, which leaves the magnet on the right side,
where the north pole is, and enters the magnet on the left side, where the south pole is
[37].

As we know from electrostatics, field lines want to be as short as possible and they repel
each other. If we now put two magnets together (see figure 9.7), we see from the total
magnetic field that equal poles repel each other and different poles attract each
other. In the first case the field lines repel each other and as a consequence also
push the two magnets apart. In the second case the field lines can form closed loops which
want to get shorter and therefore pull the magnets together.


Figure 9.7: Two magnets with the field lines of the total magnetic field. [38].

We want to formalize the magnetic field a bit. What we called magnetic field is formally
the magnetic flux density $\vec{B}$ with unit tesla (T). The connection to other SI units is
$1\,\mathrm{T} = 1\,\mathrm{kg \cdot A^{-1} \cdot s^{-2}} = 1\,\mathrm{W \cdot s \cdot A^{-1} \cdot m^{-2}}$.
One tesla is a pretty strong field: for example the magnetic field of the earth is about
$5 \cdot 10^{-5}$ T and a usual magnet produces a field of the order of 0.1 T.

One important difference to the electric charge and field is that there exist no magnetic
monopoles. This means that it is not possible to separate the north pole from the south
pole. They always form an inseparable pair (see chapter 11.1.4). But this also means that
magnetic field lines have no start and no end, therefore the magnetic field is a source-free
rotational field. If we adapt Gauss's law of electrostatics (see chapter 9.1.5) we find the
law

$$
\oiint_{\partial V} \vec{B} \cdot d\vec{S} = 0 \qquad (9.15)
$$


9.3.3 Magnet and electric current

If one puts a freely movable permanent magnet near a wire carrying a current, one can
observe that the permanent magnet orients itself in a particular way, which is shown
in figure 9.8.

Figure 9.8: Permanent magnets oriented along a wire where a current flows. [39].

If we now imagine the magnetic field to be such that the magnets are tangent to the field
lines, we find that the magnetic field looks like in figure 9.9. The rule is the following:
if one points the thumb of the right hand in the direction of the current, then the other
four (curled) fingers show the direction of the field.

Figure 9.9: Magnetic field around a wire. [40].


9.3.4 Lorentz force

Putting a wire in a homogeneous magnetic field and letting a current flow through the wire,
the homogeneous magnetic field and the magnetic field of the wire superpose to a total
magnetic field, which is shown in figure 9.10.

Figure 9.10: The homogeneous field and the field from the wire are indicated by the dashed
lines. The total field is indicated by the continuous lines. The cross on the wire shows
that the current flows into the page. [41].

From the picture one can see that on the left side of the wire the magnetic field lines are
pushed closer together than on the right side. Since magnetic field lines repel each other
there is a force pushing on the left side of the wire. On the other hand the field lines on
the right side are not straight lines but a bit curved. Since field lines want to be as
short as possible they pull on the wire. The resulting force on the wire, which points to
the right, is called Lorentz force $\vec{F}$. Formally it is described by the formula

$$
\vec{F} = I\, \vec{l} \times \vec{B} \qquad (9.16)
$$

where $\vec{l}$ is a vector along the wire which points in the direction of the current and
whose length equals the length of the wire inside the field, and $\vec{B}$ is the magnetic
field. If we want to calculate the force on a single charge q we use the mathematically not
really precise but intuitively correct equation

$$
I\, \vec{l} = \frac{dq}{dt}\, \vec{l} = \frac{dq\, \vec{l}}{dt} = q\, \frac{d\vec{l}}{dt} = q\, \vec{v}
$$


where $\vec{v}$ is the velocity of the charge. Therefore we get that the force on a moving
charge is

$$
\vec{F} = q\, \vec{v} \times \vec{B} \qquad (9.17)
$$
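To get a feeling for equation (9.17), the vector product can be evaluated for a concrete
case. The following is a small sketch with hypothetical example values: a proton moving in
the x-direction through a field pointing in the z-direction. The helper names are chosen
for illustration.

```python
def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(charge, velocity, b_field):
    """Force F = q * (v x B) on a moving point charge."""
    return tuple(charge * component for component in cross(velocity, b_field))


q = 1.602176634e-19        # elementary charge in C (charge of a proton)
v = (1.0e5, 0.0, 0.0)      # hypothetical velocity in m/s, along +x
B = (0.0, 0.0, 0.1)        # hypothetical field in T, along +z

F = lorentz_force(q, v, B)
print(F)                                       # force points in -y direction
print(lorentz_force(q, (0.0, 0.0, 0.0), B))    # charge at rest: no force
```

The force is perpendicular to both the velocity and the field, and it vanishes for a
charge at rest or for a velocity parallel to the field.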

It is important that a charge which is not moving experiences no Lorentz force: for
$\vec{v} = 0$ equation (9.17) gives zero force. Intuitively, only a moving charge has a
magnetic field, and a magnetic field can only interact with another magnetic field and not
with an electric field. If one looks at electric and magnetic fields which change in time,
then a changing electric field produces a magnetic field and a changing magnetic field
produces an electric field. This is described by the Maxwell equations, which are too
complicated to be treated here. Therefore an electric field can only interact with a
magnetic field if one of them changes with time: either the changing electric field
produces a magnetic field which can interact with the other magnetic field, or the changing
magnetic field produces an electric field which can interact with the other electric field.
Since we (nearly) always look at static systems (which do not change with time) we will not
need the Maxwell equations.

Chapter 10

DIRECT CURRENT CIRCUITS

Resistance is futile
The Borg Collective

10.1 Ohm’s law
10.2 Equivalent circuit
10.3 Electric power
10.4 Electric components
10.5 Kirchhoff’s circuit law
10.6 Examples


Besides the theoretical treatment of electric and magnetic fields there is also a very
practical use: all electronic devices are based on the laws of electrodynamics. Since
calculations with the basic equations of electrodynamics are often very complicated, one
simplifies the calculation of electric circuits by introducing electrical components. The
laws of electrodynamics then define the behaviour of the components (for an overview see
chapter 10.4). We will now look at electrical circuits which have a constant voltage (and
mostly also a constant current). Circuits of this sort are called direct current (DC)
circuits. There also exists alternating current, where the voltage and the current change
with time, but this is not treated here.

10.1 Ohm’s law

Applying a variable voltage U to a body, the current I through the body might depend
on many influences such as temperature or humidity. The quotient $U/I$ is called the
resistance of the body. The simplest (non-trivial) dependence is the proportional one: the
current is proportional to the voltage, $U = RI$, with a constant R which is called
ohmic resistance. The linearity is just a model which is valid for many materials and
bodies. Mathematically it is not completely wrong to describe the dependence between
U and I by a linear function, because any (nice) function can be approximated locally by a
linear function. But there also exist components with a non-linear dependence, such as the
light bulb or the diode. The light bulb is a typical example of the dependence of the
resistance of a material on the temperature. For metallic materials the resistance is higher
if the metal is warmer, and since the wire of a light bulb gets hotter if a higher voltage
is applied, one can recognise that the resistance is higher at the higher voltage.

10.2 Equivalent circuit

Sometimes it is possible and useful to describe a collection of components by a single
component. Since there are many possible combinations we want to look at the most
important ones.

10.2.1 Wire
A real wire usually has a small resistance. Since this resistance is spread all over the
wire, it is cumbersome to describe the wire as a chain of little ohmic resistors. Instead
one adds to an ideal wire (with no resistance) a single ohmic resistor which describes the
total resistance of the wire. Since two wires also have a capacity, one could additionally
add a capacitor to a closed circuit. As one can imagine, it is nearly impossible to describe
all effects, and it is often not necessary to take all of them into account but only to
consider the relevant ones.

10.2.2 Series circuit

If two ohmic resistors are connected one behind the other, one talks about a series circuit
(see figure 10.1). Since the current I through R1 and R2 is the same, the voltage drop
over both resistors is $U = R_1 I + R_2 I = I(R_1 + R_2)$. Therefore the resistance of both
resistors together is $R = R_1 + R_2$.
By the same argument one can calculate the total resistance $R_{\text{tot}}$ of an arbitrary
number of resistors which are all connected in series. For n resistors
$R_1, R_2, \dots, R_n$ the total resistance is given by
$R_{\text{tot}} = \sum_{j=1}^{n} R_j$.

Figure 10.1: Left: serial circuit of two resistors. Right: parallel circuit of two resistors.


10.2.3 Parallel circuit

If we put two resistors parallel to each other we get a parallel circuit (see figure 10.1).
The voltage U over both resistors is the same and the total current that flows through the
two resistors is $I_{\text{tot}} = \frac{U}{R_3} + \frac{U}{R_4}$. Therefore the total
resistance is

$$
R = \frac{U}{I_{\text{tot}}} = \frac{1}{\frac{1}{R_3} + \frac{1}{R_4}}
$$

Similarly the resistance of n parallel connected resistors is given by

 −1
n
X 1 
Rtot =
Rj
j=1
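The two rules for series and parallel circuits translate directly into two short functions.
This is a minimal sketch. The function names and the resistor values are chosen for
illustration only.

```python
def series(*resistances):
    """Total resistance of resistors connected in series: sum of all R_j."""
    return sum(resistances)


def parallel(*resistances):
    """Total resistance of parallel resistors: inverse of the sum of 1 / R_j."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Hypothetical example values in ohms.
print(series(100.0, 220.0))                    # two resistors in series
print(parallel(100.0, 100.0))                  # two equal resistors in parallel
print(series(50.0, parallel(100.0, 100.0)))    # combinations can be nested
```

A parallel combination is always smaller than its smallest resistor, which is a quick
sanity check when simplifying a circuit.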

10.2.4 Voltage source

An ideal voltage source is a device whose voltage is always the same, independent of the
current. Since already the wires leaving the voltage source have a resistance, one has to
add a resistance RS in series to the ideal voltage source to describe a real voltage
source. RS is usually very small, so one can often neglect it. Taking RS into account
is only important if one wants to draw a lot of power from the voltage source or if the
rest of the electrical circuit has a very low resistance of the order of RS.

10.3 Electric power

The voltage describes how much energy one coulomb of charge gains when it passes through
the voltage. A current describes how much charge passes per unit of time. Therefore the
product of a voltage and a current describes how much energy is transferred to the charge
per unit of time, i.e. the product tells us the power that the current delivers.

P = UI

where P is the power that the current I emits over the voltage drop U .


10.4 Electric components

Table 10.1 gives an overview of the most common symbols.

ideal wire (with no resistance)

switch to open and close the electric circuit

ohmic resistor

variable ohmic resistor

(ideal) voltage source

ground (reference voltage in infinity), connected to the earth

capacitor (to store charge)

inductor (inductive resistor, often a coil)

light bulb

LED (light emitting diode)

diode (lets current only flow in direction of the arrow)

horn

Table 10.1: Overview over different symbols [44]


10.5 Kirchhoff’s circuit law

There are two very important rules which can be used to determine the current and voltage
through any circuit.

10.5.1 Current law

Because charge is conserved we have a restriction on the currents: at any
point where no charge is stored the sum of all currents must be zero. It is important to
choose the currents that flow to that point with one sign (positive or negative) and the
currents that flow away with the other sign (see figure 10.2).

Figure 10.2: Knot where many currents flow together. The sum I1 − I2 + I3 − I4 − I5
must be zero. The currents that flow to the knot have positive sign, the currents that flow
away negative sign.

10.5.2 Voltage law

As we have seen in chapter 9.2.4 the sum of all voltages around a closed loop of a circuit
is zero. Therefore we define a direction of summation (clockwise or anticlockwise) and take
the sum over all electric components in the loop (see figure 10.3).

10.5.3 Applying Kirchhoff’s law

A knot (also called node or junction) is a point in the electrical circuit where more than
two currents meet. For every knot we apply the current law. For k knots this gives only
k − 1 independent equations, because the equation for the last knot follows from the sum of
the others. For every closed circuit one applies the voltage law, which gives one more
equation for every independent closed circuit. One has to pay attention to the independence
of the circuits (see figure 10.4).


Figure 10.3: Here the sense of summation is clockwise. For all the components (which
do not necessarily have to be ohmic resistors) we take the potential at the end of the
arrow minus the potential at the start of the arrow. This gives us the voltage in the
direction of summation. The sum of all these voltages must be zero:
$U_1 + U_2 + U_3 + U_4 + U_5 = 0$ [42].

If one now has n equations (from the current and the voltage law) one has to express the
voltages by the currents or vice versa. Then one should have n equations with n variables,
which should be solvable.
This is a very useful method for complicated circuits. For simpler circuits one can apply
other methods, for example replacing a complicated system of resistors by a single resistor
(see 10.2) and then applying Kirchhoff’s voltage law by saying that the voltage over the
voltage source must be equal to the voltage over the single resistor.
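The procedure can be sketched for a small circuit. The circuit below is hypothetical and
not one of the figures of this script: an ideal voltage source U0 in series with R1,
followed by R2 and R3 in parallel. The current law at one knot and the voltage law for the
two smallest loops give three equations for the three currents, which are solved here by
a plain Gaussian elimination (the helper function is an assumption of this sketch).

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the equations are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


# Hypothetical example values (volts and ohms).
U0, R1, R2, R3 = 10.0, 100.0, 200.0, 300.0

# Unknowns: I1 (through R1), I2 (through R2), I3 (through R3).
equations = [
    [1.0, -1.0, -1.0],   # current law at the knot: I1 - I2 - I3 = 0
    [R1, R2, 0.0],       # voltage law, loop source-R1-R2: R1*I1 + R2*I2 = U0
    [0.0, R2, -R3],      # voltage law, loop R2-R3: R2*I2 - R3*I3 = 0
]
right_hand_side = [0.0, U0, 0.0]

I1, I2, I3 = solve_linear_system(equations, right_hand_side)
print(I1, I2, I3)
```

If one used the voltage law for all three loops of this circuit instead of the current
law, the three equations would not be independent and the function above would raise an
error. The result can be cross-checked with the equivalent resistance R1 + R2 R3 / (R2 + R3).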


Figure 10.4: There are three closed circuits: circuit 1: ACDB, circuit 2: AEFB and circuit
3: CEFD. But the three circuits are not independent because circuit 3 is basically circuit
2 minus circuit 1. Therefore the voltage law must only be applied to two of the three
circuits (it does not matter which two). To be sure not to do anything wrong, only
take the closed circuits which are the smallest possible circuits. For example circuit 3 can
be contracted to circuit 2 by making a shortcut from B to A. [43].


10.6 Examples

10.6.1 Maximal power from a real power source

Let’s look at a real voltage source with voltage U0 and connect it to an ohmic resistor (see
figure 10.5). Since the voltage source has an ohmic resistance itself it is not possible to
get infinite power out of the source.

Figure 10.5: A real voltage source connected to an ohmic resistor. The real voltage source
is the gray box.

We want to calculate the maximal power that one can use at the resistor. The power P
on the resistor R is $P = UI = I^2 R$. The current is given by $I = \frac{U_0}{R + R_S}$.
Therefore the power is $P = U_0^2\, \frac{R}{(R + R_S)^2}$. To get the maximal power we
consider the power as a function of R and set the derivative to zero:

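$$
\begin{aligned}
0 &= \frac{dP}{dR} = U_0^2\, \frac{(R + R_S)^2 - 2 (R + R_S) R}{(R + R_S)^4} \\
\Rightarrow \quad 0 &= R + R_S - 2R \\
\Rightarrow \quad R &= R_S
\end{aligned}
$$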

Therefore one can take the maximal power out of a real voltage source if the resistance of
the resistor is equal to the internal resistance of the voltage source. Of course the same
amount of power that one can use at the resistor R then heats up the voltage source because
of RS.
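The result R = RS can be illustrated by simply trying out many load resistors. The
following sketch uses hypothetical values for the source voltage and the internal
resistance. The function name is chosen for illustration.

```python
def load_power(u0, r_source, r_load):
    """Power P = I^2 * R in the load resistor, with I = U0 / (R + RS)."""
    current = u0 / (r_load + r_source)
    return current**2 * r_load


U0 = 9.0   # hypothetical source voltage in volts
RS = 2.0   # hypothetical internal resistance in ohms

# Try load resistors from 0.1 ohm to 10 ohm in steps of 0.1 ohm.
candidates = [0.1 * k for k in range(1, 101)]
best_load = max(candidates, key=lambda r: load_power(U0, RS, r))

print(best_load)                      # load resistor with the largest power
print(load_power(U0, RS, best_load))  # maximal power, equal to U0^2 / (4 * RS)
```

Inserting R = RS into the formula for P gives the maximal power U0²/(4 RS), which is what
the scan finds within the resolution of the chosen grid.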


10.6.2 Charging a capacitor

A capacitor with capacity C is connected in series with a resistor with resistance R to a
voltage source with voltage U0. At the time t = 0 the switch closes the circuit and the
capacitor starts to charge (see figure 10.6).

Figure 10.6: Circuit to charge a capacitor

We want to calculate the voltage over the capacitor $U_C$ as a function of time. We apply
Kirchhoff’s voltage law: $U_0 = U_R + U_C$ with $U_R = RI$ the voltage over the resistor and
$U_C = \frac{Q}{C}$ with Q the charge in the capacitor. Taking the time derivative (and
using $I = \frac{dQ}{dt}$) we get

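$$
\begin{aligned}
0 &= R\, \frac{dI}{dt} + \frac{I}{C} \\
  &= RC\, \frac{dI}{dt} + I \\
-\frac{1}{RC}\, dt &= \frac{dI}{I} \\
-\frac{1}{RC} \int_0^t dt' &= \int_{I_0}^{I(t)} \frac{dI}{I} \\
-\frac{1}{RC}\, t &= \ln\!\left( \frac{I(t)}{I_0} \right) \\
I(t) &= I_0\, e^{-\frac{1}{RC} t}
\end{aligned}
$$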

where I0 is the current at the moment when the switch is closed. It is given by
$I_0 = \frac{U_0}{R}$ because at the first moment no voltage drops over the capacitor
(since it is empty) and therefore all the voltage drops over the resistor. Therefore we get
for the voltage over the capacitor


$$
\begin{aligned}
U_C(t) &= U_0 - R\, I(t) \\
       &= U_0 \left( 1 - e^{-\frac{1}{RC} t} \right)
\end{aligned}
$$

This contains some expected properties: at the beginning, when there is no charge in the
capacitor, there is no voltage drop over it. For very large t nearly all the voltage drops
over the capacitor, which is also clear since for a constant voltage the capacitor is an
interruption of the circuit.
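The charging curve can also be obtained without solving the differential equation by hand:
one steps through time and moves the charge I dt onto the capacitor in every small time
step. The following sketch compares this simple (Euler) stepping with the exponential
formula. The component values are hypothetical and the function names are chosen for
illustration.

```python
import math


def capacitor_voltage_exact(t, u0, r, c):
    """Closed form U_C(t) = U0 * (1 - exp(-t / (R * C)))."""
    return u0 * (1.0 - math.exp(-t / (r * c)))


def capacitor_voltage_stepwise(t_end, u0, r, c, steps=10_000):
    """Step through time using Kirchhoff's voltage law U0 = R * I + Q / C."""
    dt = t_end / steps
    charge = 0.0                          # the capacitor is empty at t = 0
    for _ in range(steps):
        current = (u0 - charge / c) / r   # current in this time step
        charge += current * dt            # charge added during dt
    return charge / c


# Hypothetical example values: 5 V source, 1 kOhm, 1 uF, so R * C = 1 ms.
U0, R, C = 5.0, 1.0e3, 1.0e-6
t_end = 3.0 * R * C

numeric = capacitor_voltage_stepwise(t_end, U0, R, C)
exact = capacitor_voltage_exact(t_end, U0, R, C)
print(numeric, exact)
```

The product RC has the unit of a time and sets the time scale of the charging process.
After a few multiples of RC the capacitor voltage is very close to U0.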

Chapter 11

ELECTRODYNAMICS
Like Electrostatics, but now with extra
maths.
Don’t worry, it might look ugly in the
beginning, but it’s worth the struggle.

11.1 Magnetism
11.2 Induction
11.3 Displacement current
11.4 Maxwell’s equations and their conclusions
11.5 Electro-magnetic field in Materials
11.6 Energy of the electromagnetic field


In this chapter we continue with electrodynamics. First we look at magnetism from a more
formal point of view, which leads to Ampère’s law and Biot-Savart’s law. Then we
look at time-dependent fields and phenomena related to them, for example induction and
displacement current. Putting all these equations together leads to the famous Maxwell’s
equations. In a second part we will investigate the behavior of electromagnetism in the
presence of matter. At the very end, energy considerations of the electromagnetic field
are discussed.

11.1 Magnetism

In the first part on electrodynamics (chapter 9) we looked at magnetism from a more
phenomenological point of view, without formal relations. We will catch up on this now.
First we will have a closer look at the magnetic field $\vec{B}$ itself and introduce the
magnetic flux. Then we will investigate an important property of the magnetic field:
remember that a current (flowing through a wire) creates a magnetic field around the wire
(see 9.3.3). It is a circular field, which needs some new formalism to describe it. Finally
we look at the magnetic field of a point charge and some more examples.

11.1.1 Magnetic Field and Flux

In section 9.3.2 we already gave the magnetic field $\vec{B}$ a unit, namely the tesla. To
be more precise, $\vec{B}$ is the magnetic flux density (see footnote 1). In a field line
picture, the flux density indicates how many field lines pass through a certain area. For a
given area A, we can therefore define a quantity which corresponds to the total number of
field lines through that area. This quantity is called flux Φ and it is defined as

$$
\Phi = \iint_A \vec{B} \cdot d\vec{A}
$$

where the differential $d\vec{A}$ is a small area element pointing perpendicular to the
surface, see figure 9.2. The scalar product $\vec{B} \cdot d\vec{A}$ indicates the flux
through $d\vec{A}$ which, summed (integrated) over all small elements, gives the total flux.

Footnote 1: There is also a magnetic field strength, analogous to the electric field
strength $\vec{E}$. This plays a less important role in physics, see also 11.5.4.


If the area is not curved and the magnetic field is homogeneous over the whole surface,
the formula simplifies to

$$
\Phi = \vec{B} \cdot \vec{A}
$$

where $\vec{A}$ is the surface vector pointing perpendicular to the surface and its absolute
value is equal to the area of the surface.

11.1.2 Ampere’s Law

Similar to Gauss’s law (see chapter 9.1.5), where the flux through the surface of a volume
only depends on the charge inside, we can formulate a law with currents. But there is a very
important difference between the (static) electric and the (static) magnetic field: the
(static) electric field starts and ends on charges. Since there are no "magnetic charges",
the magnetic field has no starting and no ending point. Therefore the magnetic field lines
are always closed and we can quantify them using this property. Instead of the flux through
a closed surface we look at the magnetic field along a closed path (enclosing a surface
S). From an intuitive point of view we have to relate the current through a wire with the
magnetic field around the wire (see footnote 2). If we have a wire through which a current I
flows and a surface S (that is not closed) we have that

$$
\oint_{\partial S} \vec{B} \cdot d\vec{l} = \mu_0 I_{\text{in}} = \iint_S \mu_0\, \vec{j} \cdot d\vec{S} \qquad (11.1)
$$

where ∂S is the boundary line of the surface S and $d\vec{l}$ is an infinitesimally short
tangent vector on that boundary line (see figure 11.1). $I_{\text{in}}$ is the current that
flows through the surface S and $\vec{j}$ is the current density (current per area), which
points in the direction of the current flow. The direction of $d\vec{l}$ is given by the
right hand rule: if the thumb of the right hand points in the direction of the current, then
the other four fingers show the direction of $d\vec{l}$, and $d\vec{S}$ points in the
direction of the current (see also figure 9.9).
Again, it is not important to be able to calculate the integrals above for arbitrary cases,
but it is very useful to understand the formula.

Footnote 2: This is in analogy to Gauss’s law, where we relate the charge in a volume to the
flux through the surface of the volume. Here we relate the charge flowing through a surface
to the magnetic field around the surface.

Figure 11.1: The black point with the circle around it denotes the current, which flows
perpendicularly out of the sheet. The dashed line symbolizes the magnetic field. The shaded
area is the surface S through which the current flows (note that it does not matter where it
flows through). A small piece of the integration path along the boundary of S is denoted by
$d\vec{l}$.

The important thing about equation (11.1) is that there exists a quantity connected to
the magnetic field (namely the integral on the left hand side) which only depends on the
current. In symmetric cases this is very useful, see example 11.1.4.

11.1.3 Magnetic Field of a Moving Point Charge

Another way to calculate the magnetic field of a current is to look at the explicit
dependence of the magnetic field on a moving charge. Assume a point charge with charge q
at the position $\vec{r}_0$ is moving with velocity $\vec{v}$. We want to understand the
formula for the magnetic field of this point charge at a point $\vec{r}$. The formula is

$$
\vec{B}(\vec{r}) = k\, \frac{q}{|\vec{r} - \vec{r}_0|^2}\; \vec{v} \times \frac{\vec{r} - \vec{r}_0}{|\vec{r} - \vec{r}_0|} \qquad (11.2)
$$

where $\vec{r} - \vec{r}_0$ is the vector pointing from the point charge to the point
$\vec{r}$ and k is a constant. In SI units $k = \frac{\mu_0}{4\pi}$, where
$\mu_0 = 4\pi \cdot 10^{-7}\,\mathrm{V \cdot s \cdot A^{-1} \cdot m^{-1}}$. This formula seems
extremely complicated at first, but looking at it a bit more closely and comparing it with
the electric field of a point charge, it gets a lot simpler. The $\vec{B}$ field has the same
$\frac{q}{|\vec{r} - \vec{r}_0|^2}$ dependence as the $\vec{E}$ field in Coulomb's law. This
is pretty reasonable since, as mentioned above, the electric field and the magnetic field
have a strong connection (see footnote 3). This means that both should have the same
dependence on q and on the distance $|\vec{r} - \vec{r}_0|$. The factor $4\pi$ is part of the
constant and is separated out for a similar reason as in the electric case, where $4\pi r^2$
is the area of the sphere with radius r around the charge. Additionally some other laws take
a nice form, see equation (11.1). The big difference between the formulas of the electric
and the magnetic field is the vector product. But the vector product fulfils exactly the
properties of the magnetic field we stated in chapter 9.3.3: the vector product ensures that
the magnetic field always points tangent to a circle around the direction of the current
(which here is $\vec{v}$), and since the magnetic field should be zero for
$\vec{r} - \vec{r}_0$ parallel to $\vec{v}$, the angle between $\vec{v}$ and
$\vec{r} - \vec{r}_0$ also plays a role, which the vector product takes into account.

The formula above is not correct with respect to relativity, because it assumes that the
moving particle has an immediate influence at the position $\vec{r}$, which is not possible
due to relativity. But if we consider $|\vec{v}| \ll c$, where c is the speed of light,
the error is negligibly small.

To calculate the magnetic field of a current flowing through a wire we use equation (11.2)
and redefine some quantities. The charge dq in a short part of the wire with length dl is
$dq = \rho\, dl$, where ρ is the charge density per unit length. Assume that the charge is
moving with a (mean) speed v. To pass the length dl the time dt is needed. We therefore have
a current $I = \frac{dq}{dt} = \frac{\rho\, dl}{dt} = \rho v$. Hence we have
$dq\, v = \rho\, dl\, v = I\, dl$. Since the charge dq is assumed to be small, and located at
a small spot, we can use equation (11.2) to calculate the magnetic field $d\vec{B}$ caused by
the current through the small piece of wire $d\vec{l}(\vec{r}_0)$ at $\vec{r}_0$.
We turned the path element dl into a vector in order to calculate the vector product. The
vector has to point in the direction of the current. The formula is then given as

$$
d\vec{B}(\vec{r}) = \frac{\mu_0}{4\pi}\, I\, \frac{d\vec{l}(\vec{r}_0) \times (\vec{r} - \vec{r}_0)}{|\vec{r} - \vec{r}_0|^3}.
$$

This formula is called Biot-Savart’s law. The total magnetic field at the point $\vec{r}$ is
then the integral of all the $d\vec{B}$ of the whole wire, namely

Footnote 3: This becomes obvious in relativity.


$$
\vec{B} = \int_{\text{wire}} d\vec{B} = \int_{\text{wire}} \frac{\mu_0}{4\pi}\, I\, \frac{d\vec{l}(\vec{r}_0) \times (\vec{r} - \vec{r}_0)}{|\vec{r} - \vec{r}_0|^3}.
$$

This integral does basically nothing else than sum up all the contributions of the different
parts of the wire. If the current through the wire is constant, the equation above is also
relativistically correct, because the magnetic field is constant in time.

11.1.4 Example
Magnetic Field of an Infinitely Long Straight Wire
Consider a wire along the x-axis with a current I flowing in the positive x-direction, see
also figure 11.2. Since the wire is infinitely long there is no component of the magnetic
field pointing in the x-direction. Additionally the problem is rotationally symmetric around
the x-axis. Therefore the strength of the magnetic field at a point only depends on the
distance to the wire. We imagine a circle around the x-axis with radius R. As the magnetic
field also forms circles around the wire (see figure 9.9), the line element $d\vec{s}$ of
the circle and the $\vec{B}$ field point in the same direction and therefore
$\vec{B} \cdot d\vec{s} = B\, ds$, where B and ds are the absolute values of the respective
vectors. Therefore equation (11.1) leads to

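$$
\begin{aligned}
\mu_0 I &= \oint_{\partial S} \vec{B} \cdot d\vec{s} \\
        &= \oint_{\partial S} B\, ds \\
        &= B \oint_{\partial S} ds \\
        &= B\, 2\pi R \\
\Rightarrow \quad B &= \frac{\mu_0 I}{2\pi R}.
\end{aligned}
$$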

B can be taken out of the integral since B is constant at a constant distance R from the
wire.

Figure 11.2: Infinitely long wire along the x axis. The magnetic field (at radius R) is
indicated by the dashed line, the path of integration is indicated by the solid line.
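
The result for the straight wire can be cross-checked against Biot-Savart’s law by summing the contributions of many short wire pieces numerically. The following is only a sketch under stated assumptions: the function name, the finite wire length of 100 m (standing in for "infinite") and the values of I and R are chosen for illustration and do not come from the script.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V*s/(A*m)


def field_straight_wire(current, distance, half_length, segments=200_000):
    """Sum the Biot-Savart contributions of a wire lying on the x-axis.

    The wire runs from -half_length to +half_length and the field is
    evaluated at the point (0, distance, 0). Only the z-component of the
    field is non-zero there, so a scalar sum is sufficient.
    """
    dl = 2 * half_length / segments
    total = 0.0
    for k in range(segments):
        x0 = -half_length + (k + 0.5) * dl  # midpoint of the wire piece
        # dl-vector = (dl, 0, 0), r - r_0 = (-x0, distance, 0)
        # z-component of the vector product: dl * distance
        r_cubed = (x0 * x0 + distance * distance) ** 1.5
        total += dl * distance / r_cubed
    return MU_0 / (4 * math.pi) * current * total


I, R = 2.0, 0.05  # assumed example values: 2 A, 5 cm
numeric = field_straight_wire(I, R, half_length=50.0)
analytic = MU_0 * I / (2 * math.pi * R)
print(numeric, analytic)
```

Both numbers should agree closely as long as the wire is much longer than the distance R, which is the regime in which the Ampère's-law result is valid.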

Force on Two Parallel Infinitely Long Wires

Let us take a wire along the x-axis and one parallel to the x-axis through the point y =
r > 0. Assume the wires only lie in the xy plane. Assume that currents I1 and I2 flow
through the first and second wire, respectively. We take them to be positive if they flow
in +x-direction. According to the example above the first wire produces a magnetic field
at the position of the second wire



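$$\vec{B} = \frac{\mu_0 I_1}{2\pi r} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

Therefore the force on a length l on the second wire is due to the Lorentz force

$$\vec{F} = I_2 \begin{pmatrix} l \\ 0 \\ 0 \end{pmatrix} \times \vec{B}
= \frac{\mu_0 I_1 I_2}{2\pi r} \begin{pmatrix} l \\ 0 \\ 0 \end{pmatrix} \times \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \frac{\mu_0 I_1 I_2 l}{2\pi r} \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}.$$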

As a consequence the two wires attract each other if both currents flow in the same
direction and repel each other if the two currents flow in opposite direction.


The formula above was used until the 2019 revision of the SI to define the SI-unit for
electromagnetism, which is the current:
”The ampere is that constant current which, if maintained in two straight parallel conductors of infinite
length, of negligible circular cross-section, and placed 1 metre apart in vacuum, would produce between
these conductors a force equal to $2 \cdot 10^{-7}$ newton per metre of length.” [36]
Under this definition $\mu_0 = 4\pi \cdot 10^{-7}\ \mathrm{V\,s\,A^{-1}\,m^{-1}}$ was an exact constant.
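
The number in the quoted definition can be reproduced from the force formula for two parallel wires. The sketch below evaluates the vector product for two currents of 1 A at a distance of 1 m; the helper names `cross` and `force_on_second_wire` are made up for this illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # V*s/(A*m)


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def force_on_second_wire(i1, i2, r, length):
    """Force on a piece of the second wire (through y = r) caused by the first."""
    b_field = (0.0, 0.0, MU_0 * i1 / (2 * math.pi * r))  # field of wire 1 at wire 2
    current_element = (i2 * length, 0.0, 0.0)            # I2 * (l, 0, 0)
    return cross(current_element, b_field)


parallel = force_on_second_wire(1.0, 1.0, r=1.0, length=1.0)
antiparallel = force_on_second_wire(1.0, -1.0, r=1.0, length=1.0)
print(parallel)      # y-component negative: pulled towards the first wire
print(antiparallel)  # y-component positive: pushed away from the first wire
```

The magnitude of the y-component is $2 \cdot 10^{-7}$ N for one metre of wire, and its sign shows attraction for equal and repulsion for opposite current directions.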

Magnetic Field of a Coil

If we wind a wire into circles and place them close together we get a coil. As we have seen
above, the magnetic field of a current flowing through a straight wire has a very small
impact$^{4}$. This is different in a coil, which can be viewed from different aspects: As the
windings are close to each other, the total magnetic field at a point is the superposition of
the magnetic fields produced by each winding. Having many windings, the magnetic field
becomes much larger. Equivalently one could say that a current flows through each winding
and therefore the total current corresponds to the current through the wire multiplied
by the number of windings. Therefore coils play an important role when considering
magnetism$^{5}$.
We now look at a coil whose diameter is much smaller than its length. To calculate the
magnetic field of a coil we consider a rectangle with width l and length L which we place
as seen in figure 11.3.

Let n be the number of windings per unit length. By Ampère’s law (see equation (11.1))
we get

$$\begin{aligned}
\mu_0 I_{\text{tot}} &= \oint_{\partial S} \vec{B} \cdot d\vec{s} \\
\mu_0 I n l &= l B \\
B &= \mu_0 n I
\end{aligned}$$

where $I_{\text{tot}} = nlI$ is the total current flowing through the rectangle and I is the current
through the wire. The step from the first to the second line holds because, if we look at the
sides of the rectangle, the magnetic field is almost perpendicular to the sides of the
rectangle (using that l is much smaller than the length of the coil) and as a consequence
$\vec{B} \cdot d\vec{s} = 0$ for the sides.
Additionally we make L very large so that at the top of the rectangle the magnetic field
is very weak and therefore the contribution can be neglected.

$^{4}$ With $10^{-7}$ N one obviously cannot make an electromagnet.
$^{5}$ In fact one usually neglects the effect of the magnetic field of all devices except coils or coil-like devices.

Figure 11.3: Coil with a gray rectangle. We choose the surface vector of the rectangle
pointing into the plane because the current through the rectangle points into the plane.
In other words, integrating in the same direction of the magnetic field corresponds to a
surface vector pointing into the plane (right hand rule).
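
The superposition picture of the coil can be tested numerically. The sketch below adds up the fields of N separate circular windings at the centre of a long coil and compares the sum with $B = \mu_0 n I$. It relies on the standard on-axis field of a single circular loop, $B = \mu_0 I a^2 / (2 (a^2 + z^2)^{3/2})$, which is not derived in this script and is taken here as an assumption; all numerical values are illustrative.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V*s/(A*m)


def loop_field_on_axis(current, radius, z):
    """On-axis field of one circular winding at distance z from its centre."""
    return MU_0 * current * radius**2 / (2 * (radius**2 + z**2) ** 1.5)


def coil_field_at_centre(current, radius, length, windings):
    """Superpose the fields of all windings, evaluated at the coil centre."""
    spacing = length / windings
    total = 0.0
    for k in range(windings):
        z = -length / 2 + (k + 0.5) * spacing  # position of winding k
        total += loop_field_on_axis(current, radius, z)
    return total


I, N, length, radius = 1.5, 1000, 1.0, 0.01  # assumed: long, thin coil
superposed = coil_field_at_centre(I, radius, length, N)
ampere = MU_0 * (N / length) * I             # B = mu_0 * n * I
print(superposed, ampere)
```

For a coil that is 100 times longer than its radius the two values differ only by a small fraction of a percent; the remaining difference comes from the finite length of the coil.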

The magnetic field of a coil is similar to the one of a permanent magnet. Therefore one
could explain the magnetic field of a permanent magnet by assuming little circular
currents in the magnet. This explanation is not really true since in quantum mechanics
there also exist magnetic fields which do not originate from currents. But one can imagine
how the magnetic field looks inside a permanent magnet where the field lines go from the
south to the north pole. The field lines inside and outside the permanent magnet form
closed lines, as stated by equation (9.15).


11.2 Induction
Induction is an important phenomenon in electrodynamics because it gives a (first)
connection between the electric and magnetic field. It basically tells us that a magnetic field
that changes with time produces an electric field$^{6}$. After introducing induction, we will
look at two examples, namely the generator and the transformer.

11.2.1 Approach and Definition

To approach induction we consider a metallic pipe that is falling down. In that pipe a
magnet is kept fixed (see figure 11.4). In the metal there are charges (positive nucleus
and negative electron gas). On these moving charges the Lorentz force is acting$^{7}$ and
accelerates the electrons. They start circling in the pipe such that the magnetic field they
produce opposes the change of the external magnetic field$^{8}$ (see also figure 11.5).

Figure 11.4: Left: frame of the magnet, the pipe is falling and the magnet is kept fixed.
The white circles with the cross or the point indicate the direction of the current due to
Lorentz force. The circle with point indicates that the current flows out of the sheet and
the circle with the cross that it flows into the sheet. Right: frame of the pipe: The pipe
stands still and the magnet is moving. The white circles again correspond to the current
in the pipe.
$^{6}$ Also the opposite is true, see 11.3.
$^{7}$ Since the nuclei are much heavier than the electrons the influence of the magnetic field on them is much smaller.
$^{8}$ External means here outside the metallic pipe, i.e. the magnetic field of the magnet.

In the rest frame of the pipe the pipe itself is not moving. As a consequence there is
no moving charge and therefore no Lorentz force acting. But the current needs to be
independent of the frame. The explanation of this is induction: Since the magnet is
moving (in the frame of the pipe), the magnetic field at a certain position of the pipe is
changing with time. This change causes an electric field, which causes the movement of
the charges.

Figure 11.5: Top of the pipe. The pipe is the gray ring, the current (and the electric field)
is indicated as a circle by arrows.

If we integrate the electric field along a closed path (for example the one indicated in
figure 11.5) we get a voltage $u_{\text{ind}}$. This voltage is connected to the magnetic field by

$$\oint \vec{E} \cdot d\vec{s} = u_{\text{ind}} = -\frac{d\Phi}{dt} \tag{11.3}$$

where Φ is the flux that flows through the closed path. The obtained equation (11.3) has
some very important properties:

• The negative sign corresponds to the fact that the current is such that it opposes a
change of the magnetic field. It therefore represents energy conservation. If there
were no negative sign, the current would amplify the magnetic field, which would in
turn lead to a bigger current. This self-amplification would lead to an infinite
current (in absence of resistance) which is of course not physical and in particular
would violate energy conservation$^{9}$.

• In contrast to the electrostatic case, the electric field we obtained above has no
starting and no ending point. The work required to change the position of a
charge depends on the path, therefore one cannot find a potential which is equal
to the potential energy of a particle.

$^{9}$ In concrete cases one has to choose some conventions, such as the positive direction of current. Due to
these conventions it might happen that the induction law does not contain a negative sign (see section 12.2.3). Be
aware of this and check at the end if the result is meaningful or leads to non-physical behavior.

11.2.2 Self induction

In most electrical components and in most cases there is only a weak interaction between
the component and the magnetic field. Nevertheless there is one component that couples
very strongly to the magnetic field. This component is the coil. Since the different windings
of a coil are close to each other and the same current flows through each winding, there
is a lot of charge moving on a small spot. This current produces a strong magnetic field
(see section 11.1.4). Since the magnetic field is proportional$^{10}$ to the current I and since
the cross section of a coil does not change with time, the total flux Φ through the coil is
proportional to the current, Φ = LI. The proportionality constant L is called inductance
of the coil.
If we apply an alternating current to a coil, the magnetic field through the coil (caused by
the current) is also alternating. Therefore induction happens. The induced voltage in the
coil is

$$u_{\text{ind}} = -\frac{d\Phi}{dt} = -L\frac{di}{dt}.$$

The coil therefore opposes a change of the current by inducing a voltage. This phenomenon
is called self induction. For more details also see the AC impedance of an inductor
(see section 12.2.3).

In order to get a feeling for the inductance, let’s compute the inductance of a very long
coil (see also 11.1.4). The magnetic field of such a coil is $B = \mu n I$ (with $\mu = \mu_0$ if the
coil is surrounded by vacuum or air) where n is the number of windings per unit length,
l the length of the coil and I the current through the coil. Assume that we have a coil
with cross section area A. Then the magnetic flux is $\Phi = BA = \mu n I A$. Applying
an AC voltage with angular frequency ω, the amplitude of the induced voltage in each
winding is $U_{\text{ind}} = \omega\Phi = \omega\mu n I A$. Therefore the total voltage for all N = ln windings
is $U = \omega\mu n I A N = \omega\mu N^2 \frac{A}{l} I = L\omega I$, where we got the inductance

$$L = \frac{\mu N^2 A}{l} = \frac{N^2}{R_m}$$

where $R_m$ is the magnetic resistance of the material around the coil$^{11}$. It obviously scales
with $N^2$. The first factor N comes from the fact that the total current density scales with
N, meaning that if we double N, the amount of moving charge is also doubled (not the
current through the wire itself, but the same current passes twice as many times). The
second factor comes from summing up the voltage at each winding. This means the ratio
between the (amplitude of the) voltage and the current scales as $N^2$, but the magnetic
field scales as N.

$^{10}$ This is not always the case, for example if the magnetic field goes through a ferromagnetic material.
There, so called saturation can occur.
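
To get a feeling for the numbers, the inductance formula for the long coil can be evaluated directly. This is a minimal sketch with assumed example values (500 and 1000 windings, 1 cm² cross section, 10 cm length, air core); the function names are illustrative only.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V*s/(A*m)


def inductance_long_coil(windings, area, length, mu=MU_0):
    """L = mu * N^2 * A / l for a coil much longer than its diameter."""
    return mu * windings**2 * area / length


def voltage_amplitude(inductance, omega, current_amplitude):
    """U = L * omega * I for a sinusoidal current of amplitude I."""
    return inductance * omega * current_amplitude


L1 = inductance_long_coil(500, 1e-4, 0.1)
L2 = inductance_long_coil(1000, 1e-4, 0.1)
print(L1, L2, L2 / L1)                               # doubling N quadruples L
print(voltage_amplitude(L2, 2 * math.pi * 50, 1.0))  # 50 Hz, 1 A amplitude
```

The ratio of the two inductances illustrates the $N^2$ scaling discussed above.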

11.2.3 Generator
The biggest part of electricity is produced with generators$^{12}$. The principle is in most
cases the same and uses induction: Some external energy source (such as water or hot steam)
drives a rotation (e.g. a turbine). This rotation causes a magnet to turn and leads to a changing
magnetic field. This changing field induces an electric field in a coil or equivalently a
voltage.
To have a more formal look, assume we have a fixed coil and a magnet turning near
the coil, see also figure 11.6. For simplicity we assume that the magnet and the coil are
very close to each other such that the magnetic field is homogeneous and constant over
the area of the coil. As the magnet is rotating, the magnetic field at the coil changes
periodically. The magnetic field is

$$\vec{B} = B \begin{pmatrix} \sin(\omega t) \\ \cos(\omega t) \end{pmatrix}$$

where B is the amplitude of the field (at the coil) and ω is the angular frequency of the
rotation.

$^{11}$ One can think about magnetic circuits similarly to electric circuits. If the magnetic field has two possible
”paths” to ”flow”, the total resistance corresponds to a parallel circuit. Similarly if the magnetic field is forced
to take a longer path, one has to add up the resistances of the paths.
$^{12}$ Only photovoltaics produce it differently.

Figure 11.6: Magnet rotating near a coil where a voltage $u_{\text{ind}}$ is induced.

Therefore the flux through the coil is $\Phi = B_x A$ where A is the cross-section area of the
coil and $B_x = B\sin(\omega t)$ is the magnetic field component pointing towards the coil (here
along the x-axis). In each winding, the voltage

$$u_{\text{ind}} = -\frac{d\Phi}{dt} = -BA\omega\cos(\omega t)$$

is induced and as a consequence the total voltage$^{13}$ is $U_{\text{tot}} = N u_{\text{ind}}$.
There is another very common setting where the magnet is fixed and the coil is rotating.
The disadvantage of this configuration is that one has to make a connection between the
rotating coil and the static consumer. This is usually done by brushes. The advantage is
that it is also possible to generate (pulsed) DC voltages.
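
The induced voltage of the generator can be checked by differentiating the flux numerically. The sketch below uses assumed values for the field amplitude, coil area, rotation frequency and number of windings; none of them come from the script.

```python
import math

B_AMP = 0.5               # field amplitude in tesla (assumed)
AREA = 0.01               # cross-section area of the coil in m^2 (assumed)
OMEGA = 2 * math.pi * 50  # rotation at 50 Hz (assumed)
WINDINGS = 200            # number of windings N (assumed)


def flux(t):
    """Flux through one winding, Phi = B_x * A with B_x = B sin(omega t)."""
    return B_AMP * math.sin(OMEGA * t) * AREA


def induced_voltage(t):
    """Analytic result u_ind = -B A omega cos(omega t) for one winding."""
    return -B_AMP * AREA * OMEGA * math.cos(OMEGA * t)


def induced_voltage_numeric(t, dt=1e-7):
    """Central difference approximation of -dPhi/dt."""
    return -(flux(t + dt) - flux(t - dt)) / (2 * dt)


t = 0.003
u_exact = induced_voltage(t)
u_numeric = induced_voltage_numeric(t)
print(u_exact, u_numeric)
print(WINDINGS * u_exact)  # total voltage U_tot = N * u_ind
```

The analytic and the numerical value should agree to many digits, which confirms the sign and the factor ω in the formula above.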

11.2.4 Transformer
Another very important application is the transformer. This device allows one to change the
voltage of an alternating current (AC) circuit. It consists of a loop of iron and two coils
wound on this loop, see also figure 11.7.

Figure 11.7: The piece of iron is gray. The two coils are placed at the top and bottom
part of the iron. $N_p$ is the number of windings of the primary coil and $N_s$ the number of
windings of the secondary coil.

$^{13}$ One could also include the factor N in another way. Namely by saying that the area where the magnetic
field goes through is N times larger than the cross-section area of the coil.

One of the coils is connected to an AC source and is called primary coil. The other is
connected to a consumer, this coil is called secondary coil. As we have seen above (see
example of section 11.2.2), the magnetic field scales with the number of windings. So the
magnetic field in the iron is proportional to $N_p$, the number of windings of the primary
coil. In addition, the induced voltage in the secondary coil is proportional to $N_s$, its
number of windings. Therefore we have that

$$\frac{U_s}{U_p} = C\,\frac{N_s}{N_p}$$
where C is a proportionality constant we do not know yet from the above considerations.
And here the role of the iron comes in: If there were no iron, the magnetic field of
the primary coil would not necessarily pass through the secondary coil. As iron strongly
amplifies the magnetic field (by a factor of about 5000), most of the magnetic field produced
by the primary coil ”flows” inside the iron and therefore passes through the secondary
coil. As a consequence all the magnetic properties in both coils are the same and therefore
C = 1. To derive this assume we apply an alternating voltage with angular frequency ω
and amplitude $U_p$ at the primary coil. The voltage then is of the form $u_p(t) = U_p \sin(\omega t)$,
see also chapter 12. The magnetic field in the iron is related to the primary voltage as

$$\begin{aligned}
u_p(t) &= N_p \frac{d\big(B(t)A\big)}{dt}, \\
B(t) &= \frac{1}{N_p A} \int u_p(t)\, dt = -\frac{U_p}{N_p A \omega} \cos(\omega t)
\end{aligned}$$

where A is the cross-section area of the iron. Note that in the above equation there is no
minus sign. This is discussed in more detail in section 12.2.3 and is not important in this
example as we are only interested in the amplitudes and not in the phase relation. You
might be puzzled why we use the induction law although there is no external magnetic field
inducing a voltage. The point is that the self-induced voltage of the primary coil must be
equal to the applied voltage (assuming there is no resistance). If this were not the case, there
would be a voltage difference over the primary coil causing a larger current. This current
would lead to a larger magnetic field until there is no voltage difference left.
On the other hand, the induced voltage in the secondary coil is given by (once again
neglecting the minus sign in the induction law)

$$u_s(t) = N_s \frac{d(BA)}{dt} = \frac{N_s A}{N_p A}\, u_p(t)$$

where we used that the magnetic field through the secondary coil is the same as through
the primary coil.
We therefore get the famous equation for the ideal transformer

$$\frac{U_s}{U_p} = \frac{N_s}{N_p} = \frac{I_p}{I_s}$$

where $I_p$ and $I_s$ are the currents in the two coils. Their relation can easily be found by
power conservation: $P_p = P_s$. Note: This equation is only true if the magnetic fields of
the two coils are strongly coupled, i.e. the magnetic field through both coils is the same.
For example if the consumer on the secondary side draws a lot of current, this current
produces a magnetic field that opposes the one from the primary coil. As a consequence
the primary coil needs more current to sustain the magnetic field$^{14}$. On the other hand,
the magnetic field ”looks for” an alternative way to avoid the secondary coil. Therefore
the two coils no longer have the same magnetic field and the above equation is
not valid. This breakdown is related to the construction of the transformer and in
particular how the coils are placed. For example if the coils are on top of each other, their
magnetic fields are more strongly coupled than if they are beside each other as in figure
11.7 above.
$^{14}$ Remember: the magnetic field and the voltage of the primary coil are connected to each other by the
induction law. So for a given voltage, a certain magnetic field must pass through the coil.
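
The ideal transformer equation is easy to apply numerically. The following sketch assumes a perfectly coupled, lossless transformer, as in the derivation above; the function name and the example values (230 V, 0.5 A, 1000 and 50 windings) are illustrative assumptions.

```python
def ideal_transformer(u_primary, i_primary, n_primary, n_secondary):
    """Return (U_s, I_s) of an ideal transformer from U_s/U_p = N_s/N_p = I_p/I_s."""
    ratio = n_secondary / n_primary
    u_secondary = u_primary * ratio
    i_secondary = i_primary / ratio
    return u_secondary, i_secondary


u_s, i_s = ideal_transformer(u_primary=230.0, i_primary=0.5,
                             n_primary=1000, n_secondary=50)
print(u_s, i_s)                # voltage goes down, current goes up
print(230.0 * 0.5, u_s * i_s)  # power on both sides
```

The last line shows that the power on the primary and the secondary side is the same, which is exactly the condition used to relate the currents.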


11.3 Displacement current

In the previous section we discussed how a time dependent magnetic field causes an
electric field. In this section we will look at the opposite, namely a time dependent electric
field causing a magnetic field.
There is a very common way to introduce this topic. Consider a (plate) capacitor that gets
charged with a current I (see figure 11.8). The current causes a magnetic field B around
the wire. If we only consider the current, the magnetic field between the plates would be
smaller than outside the capacitor, because the magnetic field is stronger near the wire.
As there is no wire between the plates, the contribution from the current itself is smaller.
But if we would measure the magnetic field around the plates, it would be the same as
around the wires$^{15}$. The current that charges the capacitor changes the electric field E. This
changing electric field also causes a magnetic field. Outside of the capacitor, one cannot
tell from the magnetic field whether there is a current or a changing electric field creating
that magnetic field.

Figure 11.8: A plate capacitor (grey planes) gets charged by a current I. The current causes
a magnetic field B. Between the plates there is no current, but the changing electric field
E causes also a magnetic field.

To get a formal description consider a plate capacitor with capacity $C = \epsilon_0 \frac{A}{d}$ where each
plate has cross section area A and the plates are separated by d. From the basic equation
for capacitors we can relate the current and the changing electric field by

$$\begin{aligned}
Q &= CU, \\
I &= \frac{dQ}{dt} = C \frac{dU}{dt} \\
&= \epsilon_0 \frac{A}{d}\, d\, \frac{dE}{dt} \\
&= \epsilon_0 A \frac{dE}{dt}
\end{aligned}$$

where we used that the electric field E and the voltage U (for a homogeneous electric
field) depend on each other as U = Ed. This ”current” I between the plates is called
displacement current and it has the same effect on the magnetic field as a usual current.
As a consequence we have to take the displacement current also into account in Ampère’s
law (see section 11.1.2). This then leads to

$$\begin{aligned}
\oint_{\partial S} \vec{B} \cdot d\vec{l} &= \mu_0 I_{\text{tot}} = \mu_0 \left( I + \epsilon_0 A \frac{dE}{dt} \right) \\
&= \iint_S \left( \mu_0 \vec{j} + \mu_0 \epsilon_0 \frac{d\vec{E}}{dt} \right) \cdot d\vec{S}.
\end{aligned}$$

$^{15}$ At least outside of the plates and only with respect to the same distance from the wire.

The last line is the most general description where we assume an arbitrary area S with
boundary ∂S and a current density $\vec{j}$. If the current density $\vec{j}$ is zero, the formula above
is very similar to the induction formula: The change of the electric flux through an area
S is proportional to the integral of the magnetic field around a closed loop (see also
section 12.2.3).

This displacement current might look a bit irrelevant and not very useful for applications.
But it is of great theoretical importance as it ensures conservation of charge.
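
The equality between the charging current in the wire and the displacement current between the plates can be illustrated with numbers. The sketch below assumes a plate capacitor in vacuum whose voltage rises at a constant rate; the plate area, plate distance and voltage ramp are made-up example values.

```python
EPS_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def capacity(area, distance):
    """Capacity of a plate capacitor in vacuum, C = eps_0 * A / d."""
    return EPS_0 * area / distance


def charging_current(area, distance, du_dt):
    """Current in the wire, I = C * dU/dt."""
    return capacity(area, distance) * du_dt


def displacement_current(area, de_dt):
    """Displacement current between the plates, I = eps_0 * A * dE/dt."""
    return EPS_0 * area * de_dt


area, distance, du_dt = 0.01, 1e-3, 1e6  # assumed: 100 cm^2, 1 mm, 1 V per microsecond
de_dt = du_dt / distance                 # from U = E * d
i_wire = charging_current(area, distance, du_dt)
i_disp = displacement_current(area, de_dt)
print(i_wire, i_disp)
```

Both values are the same (up to rounding), so the total current entering Ampère’s law does not jump at the capacitor plates.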


11.4 Maxwell’s equations and their conclusions

11.4.1 Maxwell’s equations

In this section we summarize the basic equations that we found in electrodynamics.
They will form a set of integral equations$^{16}$ that describe the behavior of the electric field
$\vec{E}$ and the magnetic field $\vec{B}$. This set is called Maxwell’s equations. In principle these
equations combined with the Lorentz force describe everything about electromagnetism.
Nevertheless it is far too complicated to solve these equations in many configurations,
in particular when many charges are involved like in materials. So one simplifies/adapts
these Maxwell equations for an electromagnetic field in materials which we will do in the
next section 11.5.

We use the following notation: With V we denote a volume and ∂V is the surface$^{17}$ that
confines the volume V. A small element of the volume is dV. To integrate over the
surface of V we need to split ∂V into small pieces denoted by $d\vec{S}$ where the area of the
small piece is equal to the absolute value $|d\vec{S}|$ and the direction of $d\vec{S}$ points perpendicular
to the surface, out of the volume.
With A we denote an area and the border of the area$^{18}$ is denoted by ∂A which is a closed
line/loop. The area is again split into small pieces $d\vec{A}$. We also divide the closed line into
small pieces denoted $d\vec{l}$. The length of these pieces is equal to the absolute value $|d\vec{l}|$ and
they point tangent to the line. The direction of the small area pieces $d\vec{A}$ and the small curve
pieces $d\vec{l}$ need to fulfill the right hand rule: If $d\vec{A}$ points in the direction of the thumb of
the right hand, the other fingers of that hand indicate the direction of $d\vec{l}$.
With this notation Maxwell’s equations are given as

$$\begin{aligned}
\oiint_{\partial V} \vec{E} \cdot d\vec{S} &= \frac{1}{\epsilon_0} \iiint_V \rho \, dV && \text{Gauss’s law,} \\
\oiint_{\partial V} \vec{B} \cdot d\vec{S} &= 0 && \text{no magnetic monopoles exist,} \\
\oint_{\partial A} \vec{E} \cdot d\vec{l} &= -\frac{d}{dt} \iint_A \vec{B} \cdot d\vec{A} && \text{Faraday’s law (Induction),} \\
\oint_{\partial A} \vec{B} \cdot d\vec{l} &= \mu_0 \iint_A \vec{j} \cdot d\vec{A} + \mu_0 \epsilon_0 \frac{d}{dt} \iint_A \vec{E} \cdot d\vec{A} && \text{Ampère’s law,}
\end{aligned}$$

$^{16}$ One can also write down equivalent differential equations but they are more complicated.
$^{17}$ Take ∂V as notation for the surface and not as partial derivative or small piece of V.
$^{18}$ This area does not have to be closed (as the surface of a volume is), this is why we use another variable than above.

where ρ is the charge density and $\vec{j}$ the current density. The circles at the integrals on the
left side denote that the surface or the path is closed (therefore the surface of a volume
or the boundary of an area). The number of integral signs denotes the dimensionality of
the integral, meaning the dimension of the space the integral has to be taken over.
In addition to Maxwell’s equations one has to mention the Lorentz force in order to
describe the interaction between charges and fields. The Lorentz force is given as

$$\vec{F} = q(\vec{E} + \vec{v} \times \vec{B})$$

where q is the charge of a particle and $\vec{v}$ its velocity.

With these five equations it is basically possible to describe any problem involving charges
and the electromagnetic field.
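
As a small illustration of the Lorentz force, the sketch below evaluates $\vec{F} = q(\vec{E} + \vec{v} \times \vec{B})$ for an electron moving along the x-axis through a magnetic field pointing in the z-direction. The helper names and the field and velocity values are assumptions made for this example.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # in coulomb


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, e_field, velocity, b_field):
    """F = q * (E + v x B), all vectors given as 3-component tuples."""
    v_cross_b = cross(velocity, b_field)
    return tuple(q * (e + vb) for e, vb in zip(e_field, v_cross_b))


force = lorentz_force(q=-ELEMENTARY_CHARGE,
                      e_field=(0.0, 0.0, 0.0),
                      velocity=(1.0e6, 0.0, 0.0),  # 10^6 m/s along x (assumed)
                      b_field=(0.0, 0.0, 0.1))     # 0.1 T along z (assumed)
print(force)  # only the y-component is non-zero, and it is positive
```

For a positive charge the force would point in the −y direction; the negative charge of the electron flips the sign.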

11.4.2 Electromagnetic wave

Maxwell introduced his equations in 1865. These equations predict the existence of
electromagnetic waves which then were experimentally measured by Heinrich Hertz in 1886.
It is impressive how Maxwell managed to predict these waves purely from theoretical
considerations.
We now want to deduce the electromagnetic waves and in particular some important
properties. As the derivation is pretty tedious, you will not need to know it, but the
results are pretty important. So we state them first and give the proof(s) afterwards.
We are going to prove that a time dependent electric and magnetic field lead to a wave
satisfying the wave equations

$$\begin{aligned}
\frac{\partial^2 E_y}{\partial x^2} &= \epsilon_0 \mu_0 \frac{\partial^2 E_y}{\partial t^2} \\
\frac{\partial^2 B_z}{\partial x^2} &= \epsilon_0 \mu_0 \frac{\partial^2 B_z}{\partial t^2}
\end{aligned}$$

where we choose the coordinate system such that the electric field points in the direction
of the y axis and the magnetic field in the direction of the z axis. Do not get confused
by the ∂ sign, these are simple derivatives and the ∂ only indicates that $\vec{E}$ and $\vec{B}$ depend
on multiple variables (x, y, z, t).

From these wave equations we can read off the speed of light c which is given by

$$c = \frac{1}{\sqrt{\epsilon_0 \mu_0}}.$$

As the spatial derivative is along the x axis, we note that the wave propagates in the x
direction meaning the $\vec{E}$ and $\vec{B}$ field are perpendicular to the direction of propagation.
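
Inserting numbers into this formula gives the familiar value of the speed of light. The sketch below uses $\mu_0 = 4\pi \cdot 10^{-7}\ \mathrm{V\,s\,A^{-1}\,m^{-1}}$ and a tabulated value of $\epsilon_0$; the variable names are chosen only for this illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V*s/(A*m)
EPS_0 = 8.8541878128e-12   # vacuum permittivity in A*s/(V*m)

c = 1 / math.sqrt(EPS_0 * MU_0)
print(c)  # close to 3 * 10^8 m/s
```

The result agrees with the defined value of 299 792 458 m/s to better than one metre per second.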

Proof:
To apply Maxwell’s laws we have to use the third and fourth law, meaning we have to consider
different areas where we perform the integrations. For this we consider a small cuboid with small
side lengths dx, dy and dz, see also figure 11.9.

Figure 11.9: Small cuboid

We start by considering a time and space dependent magnetic field $\vec{B}$ pointing in the z direction
of the coordinate system. Since the sides are small we can assume $\vec{B}$ being constant along the z
axis of the cuboid but we have to take into account the change in the x direction. We consider
the square with the four arrows as our area where we have to integrate around to get the left side
of Ampère’s law and we have to calculate the electric flux through the square for the right side.
The left side of Ampère’s law then reads

$$\oint_{\partial A} \vec{B} \cdot d\vec{l} = \left( B_z + \frac{\partial B_z}{\partial x} dx \right) dz - B_z\, dz = \frac{\partial B_z}{\partial x}\, dx\, dz.$$

To calculate the electric flux, we assume the electric field is constant across the area. Since the
area is parallel to the xz plane, only the y component of the electric field contributes to the flux
so the right side of Ampère’s law is

$$\mu_0 \epsilon_0 \frac{d}{dt} \iint_A \vec{E} \cdot d\vec{A} = -\mu_0 \epsilon_0 \frac{\partial E_y}{\partial t}\, dx\, dz$$

where the minus sign enters because due to the right hand rule, the surface vector of this area
points in the −y direction. Equating these two sides yields

$$\frac{\partial B_z}{\partial x} = -\mu_0 \epsilon_0 \frac{\partial E_y}{\partial t} \tag{11.4}$$

Next we have to use the induction law which is in absence of charges or currents structurally
analogous to Ampère’s law. With the same argumentation applied to the area dxdy we get

$$\frac{\partial E_y}{\partial x} = -\frac{\partial B_z}{\partial t}.$$

Taking the derivative with respect to x on both sides and inserting the first equation (11.4), we
find

$$\begin{aligned}
\frac{\partial^2 E_y}{\partial x^2} &= -\frac{\partial}{\partial x} \frac{\partial B_z}{\partial t} \\
&= -\frac{\partial}{\partial t} \frac{\partial B_z}{\partial x} \\
&= \epsilon_0 \mu_0 \frac{\partial^2 E_y}{\partial t^2}.
\end{aligned}$$

This is nothing but the wave equation we looked for. Taking the derivative with respect to t
instead of x and eliminating $E_y$ would yield the equation for $B_z$.
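
As a plausibility check of the two first-order equations used in the proof, one can insert a sinusoidal wave. The ansatz below, with amplitudes $E_0$ and $B_0$, wave number k and angular frequency ω, is an assumption made for this illustration and is not part of the derivation in the script.

$$\begin{aligned}
E_y(x,t) &= E_0 \sin(kx - \omega t), \qquad B_z(x,t) = B_0 \sin(kx - \omega t), \\
\frac{\partial E_y}{\partial x} = -\frac{\partial B_z}{\partial t}
&\;\Rightarrow\; k E_0 = \omega B_0, \\
\frac{\partial B_z}{\partial x} = -\mu_0 \epsilon_0 \frac{\partial E_y}{\partial t}
&\;\Rightarrow\; k B_0 = \mu_0 \epsilon_0\, \omega E_0, \\
\Rightarrow\; \frac{\omega^2}{k^2} &= \frac{1}{\epsilon_0 \mu_0}, \qquad B_0 = \frac{k}{\omega} E_0 .
\end{aligned}$$

Both conditions can only be fulfilled at the same time if the wave travels with the speed $\omega/k = 1/\sqrt{\epsilon_0 \mu_0}$, and the amplitude of the magnetic field is then fixed by the amplitude of the electric field.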


11.5 Electro-magnetic field in Materials

All materials are built of protons, neutrons and electrons. Protons and electrons are charged and
therefore interact with the electromagnetic field. As a consequence the presence of a
material influences the electromagnetic field. We will discuss this influence and its
description in this section. Since the influence on the electric field is more intuitive, we will
start with the electric field and treat it more precisely and then claim a similar behavior
for the magnetic field.

11.5.1 Polarizability and dielectric constant

The case of a conductor in an electric field was already discussed in section 9.2.5$^{19}$. We
now want to look at insulators. In an insulator, charge cannot move freely, nevertheless
the electric field influences the charge distribution in two different ways.

Molecular polarization
Being an insulator does not mean that the electrons cannot move at all. It only means
that the binding of an electron to its atom is strong enough that it cannot hop from one
atom to the next. But it still can move slightly such that there is more negative charge
on one side of the atom than on the other, see also figure 11.10. As a consequence the
negative electron cloud moves to one side and forms a dipole together with the positively
charged rest of the molecule.

Figure 11.10: Polarization of a molecule: The external electric field shifts the electron
cloud (little black dots) towards the positive charge (causing the external electric field).
Therefore the molecule gets a dipole moment.

$^{19}$ The electrons get redistributed such that there is no electric field in the conductor.

Orientational polarization

Some molecules are already polarized. For example water molecules, where the electrons
are more strongly attracted by the oxygen than by the hydrogen and therefore the region around
the oxygen is negatively ”charged” with respect to the region near the hydrogen atoms.
When an electric field is applied, the molecules get rotated such that the positive part of
the molecule points in the direction of the electric field.

Independent of how the electric field influences the insulator the effect is always the same:
The polarization of the insulator leads to an electric field $\vec{E}_p$ that opposes the external field
$\vec{E}_0$. This situation gets very obvious when we look at the situation drawn in figure 11.11.

Figure 11.11: An electric field originates from two charged plates (left and right). This
electric field points from the left to the right and is called external field as it acts from
outside on the insulator. A polarizable insulator is placed between the two plates. The
positive part of each insulator molecule moves towards the negatively charged plate and vice
versa, see left picture. Inside the insulator the positive and negative moments compensate
each other but at the edge of the insulator the positive moment is not compensated on
the left side and the negative not on the right side. This leads to an electric field in the
insulator pointing opposite the external one, see right picture.


This means the effective electric field between the plates $\vec{E} = \vec{E}_0 + \vec{E}_p$ is smaller than
the one applied to the plates. For not too large electric fields, this polarization can be
assumed to be proportional to the external field$^{20}$, we call the proportionality constant
electric permittivity $\epsilon_r$.

$$\vec{E}_0 = \epsilon_r \vec{E}, \tag{11.5}$$
$$\vec{E}_p = \vec{E} - \vec{E}_0 = (1 - \epsilon_r)\vec{E} = -\chi \vec{E} \tag{11.6}$$

where $\chi = \epsilon_r - 1$ is called the electric susceptibility. For vacuum (and approximately air) we get
$\epsilon_r = 1$ and therefore χ = 0. Many materials have a permittivity between $\epsilon_r = 1$ and
$\epsilon_r = 10$ but there are some in the region of hundreds or even thousands.
In our consideration so far we always kept the external field $\vec{E}_0$ fixed. If we think of the two
charged plates as a plate capacitor, this is equivalent to a constant charge on the plate
capacitor. In most cases the situation is slightly different, i.e. the voltage applied to a
capacitor is given and not the charge. As we have discussed in chapter 9.2.7, applying a
voltage U to a plate capacitor leads to an electric field $E = \frac{U}{d}$ where d is the distance
between the plates. If we now insert an insulator between the plates, it opposes the
electric field. As the voltage and as a consequence also the electric field is fixed, more
charge needs to flow on the plates to create a stronger electric field. Assume the insulator
has permittivity $\epsilon_r$ and it fills the whole space between the plates (else see 11.5.3). Then
the field produced by the charge on the plates must be

$$E_0 = \epsilon_r E = \frac{\epsilon_r U}{d}.$$

The charge (surface) density on the plates must therefore be

$$\sigma = \epsilon_0 E_0 = \frac{U \epsilon_r \epsilon_0}{d}$$

and the capacity is therefore given by $C = \frac{A \epsilon_0 \epsilon_r}{d}$ where A is the area of the plates.
Obviously the capacity of a capacitor can be increased by inserting an insulator with high
permittivity.
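
The effect of the insulator on a capacitor at fixed voltage can be illustrated with a short calculation. The following sketch assumes the insulator fills the whole space between the plates; the plate area, plate distance, voltage and the permittivity $\epsilon_r = 4$ are made-up example values.

```python
EPS_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def plate_capacity(area, distance, eps_r=1.0):
    """C = eps_0 * eps_r * A / d for a completely filled plate capacitor."""
    return EPS_0 * eps_r * area / distance


def surface_charge_density(voltage, distance, eps_r=1.0):
    """sigma = eps_0 * eps_r * U / d on the plates at fixed voltage U."""
    return EPS_0 * eps_r * voltage / distance


area, distance, voltage, eps_r = 0.01, 1e-3, 10.0, 4.0  # assumed values
c_vacuum = plate_capacity(area, distance)
c_filled = plate_capacity(area, distance, eps_r)
charge = surface_charge_density(voltage, distance, eps_r) * area
print(c_vacuum, c_filled)
print(charge, c_filled * voltage)  # Q = sigma * A agrees with Q = C * U
```

The capacity, and with it the charge stored at a given voltage, grows by the factor $\epsilon_r$ compared to the empty capacitor.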
$^{20}$ An attentive reader might argue that when all molecules are rotated, the polarization saturates. But this
is only the case with very strong fields, otherwise thermal fluctuations and other effects oppose this alignment.

11.5.2 Electric displacement and Polarisation

When we introduced the electric field, its properties were characterized by Gauss' law, see
9.1.5. Taking into account the polarisation effect described in the previous chapter, we
need a new version of Gauss' law. This becomes obvious, for example, when we look at the
plate capacitor described above in the case where the insulator does not fill the whole space between
the plates. Then the electric field between the plates is not the same inside and outside of the insulator,
although according to Gauss' law it should be (see also the next chapter 11.5.3). This is
because the insulator creates an electric field without being charged.
To fix this problem, we introduce a new field $\vec{D}$ called the displacement field. To
understand the conceptual difference between the $\vec{E}$ and $\vec{D}$ fields, let's go one step back
and have a closer look at how we introduced the $\vec{E}$ field. We started with the Coulomb
force and deduced that a force $\vec{F}$ acts on a charged particle $q$ in an electric field according to
$\vec{F} = q\vec{E}$. Hence the electric field $\vec{E}$ is primarily related to the force and only related to
the charge via the dielectric constant $\varepsilon_0$. And this $\varepsilon_0$ "causes" trouble in the case of polarized
insulators, meaning we have to introduce a new constant $\varepsilon_r$ renormalizing $\varepsilon_0$. For the
new field $\vec{D}$, we want to approach from the opposite side, meaning we want to relate
it primarily to the charge and then somehow to the force. As the two fields should be
connected, we expect them not to differ too much. In fact, if a point-like charge $q$ is
placed in empty space, the $\vec{D}$ field is given by

\[ \vec{D} = \frac{q}{4\pi r^2}\,\vec{e}_r \]

which only differs from the $\vec{E}$ field by the missing factor $1/\varepsilon_0$. With the same argumentation
as for the electric field we can deduce Gauss' law. For the $\vec{D}$ field it reads

\[ \oint_{\partial V} \vec{D} \cdot d\vec{S} = \iiint_V \rho_\text{free}\, dV. \]

In this law we find another small difference between the $\vec{E}$ and $\vec{D}$ fields. In the case of
the $\vec{D}$ field we only want to consider the free (movable) charge $\rho_\text{free}$ and not small charge
separations due to polarisation²¹. For this, also have a look at figure 11.12.

21
The origin of the problem with Gauss' law and the $\vec{E}$ field lies in this polarisation: the polarisation
produces an electric field without really separating charges, only shifting them slightly.

11.5. ELECTRO-MAGNETIC FIELD IN MATERIALS

Figure 11.12: Gauss' law is applied to the same situation but with two different volumes,
indicated by the dashed line. On the left side, the volume only contains the right plate. We
can apply Gauss' law as usual. For both the $\vec{E}$ and the $\vec{D}$ field the result is correct. Looking
at the right side, the volume also contains a part of the insulator. In the case of the electric
field, we know that Gauss' law for the $\vec{E}$ field does not work. This is because the insulator
produces an additional field but the charge is the same. For the $\vec{D}$ field, we only take into
account the charge on the plate. As the insulator has no free charge, it does not contribute
at all to the $\vec{D}$ field; therefore Gauss' law is valid, see also the next chapter.

In most materials the two fields are connected to each other by the permittivity

\[ \vec{D} = \varepsilon_0 \varepsilon_r \vec{E}. \]

Using this equation and the linear relation between the applied and effective field (see
equation 11.5) we can define a new quantity called polarization $\vec{P}$. It is defined as

\[ \vec{P} = \vec{D} - \varepsilon_0 \vec{E} = \chi \varepsilon_0 \vec{E}. \]


11.5.3 Continuity equations at interfaces

We now want to have a closer look at what happens at the interface of two insulators (or
vacuum). There are two cases we have to examine: the case where the field is perpendicular
to the surface and the case where it is parallel. For all other cases we can split the field into
these two components and then use the corresponding relations.
In our example of the plate capacitor, we always had perpendicular fields, so let's first
have a look at them. Consider the situation shown in figure 11.12. We assume the usual
plate capacitor²², so that the field is perpendicular to the plates. Applying
Gauss' law to the situation on the right side of figure 11.12, we get $D \cdot 2A = Q_\text{free}$, where
$A$ is the surface area of the plate²³. As a consequence, the electric displacement caused by
one plate is $D = \frac{Q_\text{free}}{2A}$. The other plate contributes the same amount, such that the total
$D$ field is $D = \frac{Q_\text{free}}{A}$. Considering the left side of figure 11.12, we also get $D \cdot 2A = Q_\text{free}$
for one plate. This is because the insulator has no free charge and the considered surface
is the same. Therefore we get the same electric displacement $D = \frac{Q_\text{free}}{A}$. We see that the
electric displacement $D$ is continuous at the surface of the insulator. This is
different for the electric field in the case of an insulator with $\varepsilon_r \neq 1$, because inside the insulator
the electric field is $E_\text{in} = \frac{D}{\varepsilon_0 \varepsilon_r}$, whereas outside the insulator it is $E_\text{out} = \frac{D}{\varepsilon_0} \neq \frac{D}{\varepsilon_0 \varepsilon_r}$.
To get the behaviour of the $\vec{D}$ and $\vec{E}$ field parallel to the surface we have to go back
to Maxwell's induction law²⁴. Assuming static fields, the time derivative of the
magnetic field is zero, and hence so is the induced voltage. Consider the situation shown in
figure 11.13, where an interface of an insulator and air (or vacuum) is drawn. To apply
the induction law we have to consider a surface $S$ through which we would have to calculate the
magnetic flux. As the magnetic field is constant, its time derivative is zero everywhere,
so we do not need to calculate it. The other side of the equation is the integral
of the electric field along the path enclosing the area. As we only consider the electric
field parallel to the interface, its scalar product with the horizontal boundaries of the surface $S$
is zero and we only have to look at the vertical ones. Assume that the area is small enough
that the electric field is constant along the vertical sides, where it is $\vec{E}_r$ in the insulator
and $\vec{E}_a$ in the air.

22
Distance between the plates much smaller than the length and width of the plates.
23
Remember the factor 2 enters because the field passes through the left and right side of the surface of
the volume.
24
In fact we should first prove that Maxwell’s induction law is still valid before we can use it here. We will
have a look at it in section 11.5.5.


We then have

\[ u_\text{ind} = \vec{E}_r \cdot \vec{e}_z l - \vec{E}_a \cdot \vec{e}_z l = 0, \]
\[ E_r = E_a, \]

where $\vec{e}_z l$ is the vector along the left boundary. For the second line we used that the electric
field and the path are parallel, and that the path on the right boundary points in the
opposite direction of the electric field, leading to the minus sign. We see that the parallel
electric field is continuous across the interface.

Figure 11.13: Interface of an insulator with dielectric permittivity $\varepsilon_r$ (gray) and air. To
apply the induction law we consider the area $S$ and integrate along its boundary (rectangle
with arrows).

11.5.4 Magnetic field

Very similar considerations can be done in case of the magnetic field which we will not
repeat here. Nevertheless the most important facts and relations are summarized.

Distinction between the $\vec{H}$ and $\vec{B}$ field

As in the case of the electric field, we have to introduce a new field, called the magnetic field
$\vec{H}$. But the correspondence is not as expected: the $\vec{B}$ field discussed so far corresponds to the
$\vec{D}$ field and the new $\vec{H}$ field corresponds to the $\vec{E}$ field. To be precise, the $\vec{B}$ field is the
magnetic flux density²⁵.
25
As long as only the $\vec{B}$ field is involved, it is simply called magnetic field. In this section we will explicitly
call it flux density to distinguish it from the magnetic field $\vec{H}$.


Magnetisation $\vec{M}$ and its relation to the $\vec{H}$ and $\vec{B}$ field

Similarly to the polarisation $\vec{P}$ in the case of the electric field, we can introduce a magnetisation
$\vec{M}$, also called magnetic polarisation. This magnetisation describes how strongly the little
magnetic blocks in a material are aligned or anti-aligned (pointing in the opposite direction) with
the magnetic field. We then find the relations

\[ \vec{B} = \mu_0 \mu_r \vec{H} = \mu_0 (\vec{H} + \vec{M}), \]
\[ \vec{M} = \chi_m \vec{H}, \]

where, similarly to the electric case, we introduced a magnetic permeability $\mu_r$ and a magnetic
susceptibility $\chi_m$.

Dia-, Para- and Ferromagnetism

In the case of the electric field, the charge always gets redistributed such that $\varepsilon_r \geq 1$. This
corresponds to para- and ferromagnetism, where the small magnetic pieces in a body align
along the magnetic field such that they amplify the magnetic field outside the magnet. In
the case of paramagnetism, this effect is small, usually $1 < \mu_r \lesssim 1.1$, and it immediately
vanishes if the external magnetic field is turned off. This is different in ferromagnetism,
where $\mu_r \approx 100$ to $10000$ is possible. The values depend strongly on the temperature,
the external magnetic field and other effects. This effect is so strong that after
switching off the external magnetic field, a magnetization remains. The only everyday
ferromagnetic materials are iron, nickel and cobalt. Since magnetism is a more complex
phenomenon which involves many different aspects, including quantum mechanics, it is
also possible to have $\mu_r < 1$. This is called diamagnetism, which weakens the magnetic field
outside the material. The most important diamagnetic material is water, but some
metals are diamagnetic as well.
As we see, except in the case of ferromagnetism, the influence of a material on the magnetic
field is very small, usually $0.99 \leq \mu_r \leq 1.01$, such that the effect can be neglected and
one does not need to consider the $\vec{H}$ field.


Continuity equations at interfaces

Across the surface of a body, the flux density $\vec{B}$ perpendicular to the surface is continuous,
whereas the magnetic field $\vec{H}$ is continuous parallel to the surface. This corresponds to
the expected analogy with the electric field.

11.5.5 Maxwell’s equations in Materials

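As motivated above, the interaction of a material and the electromagnetic field requires us to
introduce the $\vec{D}$ and $\vec{H}$ fields. As a consequence we have to adapt Maxwell's equations.
Since the whole influence of the material can be absorbed in the constants $\varepsilon = \varepsilon_0 \varepsilon_r$ and
$\mu = \mu_0 \mu_r$, we want to reformulate Maxwell's equations such that these constants do not appear any
more. This leads to the equations

\[ \oint_{\partial V} \vec{D} \cdot d\vec{S} = \iiint_V \rho_\text{free}\, dV, \]
\[ \oint_{\partial V} \vec{B} \cdot d\vec{S} = 0, \]
\[ \oint_{\partial A} \vec{E} \cdot d\vec{l} = -\frac{d}{dt} \iint_A \vec{B} \cdot d\vec{S}, \]
\[ \oint_{\partial A} \vec{H} \cdot d\vec{l} = \iint_A \vec{j} \cdot d\vec{A} + \frac{d}{dt} \iint_A \vec{D} \cdot d\vec{A} \]

together with the connection of the fields

\[ \vec{D} = \varepsilon_0 \varepsilon_r \vec{E}, \]
\[ \vec{B} = \mu_0 \mu_r \vec{H}. \]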


11.5.6 Electromagnetic waves

In section 11.4.2 we derived the existence of electromagnetic waves and some of
their properties. The derivation in the most general case, also involving the fields $\vec{D}$ and
$\vec{H}$, is very similar and we will not repeat it. Nevertheless, in this proper derivation one
would use the $\vec{H}$ field instead of the $\vec{B}$ field. This is because, when also taking into account the
propagation through a surface separating two materials, one needs the continuity equation
for the field parallel to the surface when using the path integrals, and the $\vec{H}$ field is the
continuous one, as discussed in 11.5.4. In addition one has to include $\varepsilon_r$ and $\mu_r$. So one
gets the two wave equations

\[ \frac{\partial^2 E_y}{\partial x^2} = \varepsilon_0 \varepsilon_r \mu_0 \mu_r \frac{\partial^2 E_y}{\partial t^2}, \]
\[ \frac{\partial^2 H_z}{\partial x^2} = \varepsilon_0 \varepsilon_r \mu_0 \mu_r \frac{\partial^2 H_z}{\partial t^2}. \]

From these equations we find some interesting facts:

Speed of light

The speed of light is modified by $\varepsilon_r$ and $\mu_r$. It therefore reads

\[ c = \frac{1}{\sqrt{\varepsilon_0 \varepsilon_r \mu_0 \mu_r}} = \frac{c_0}{n}, \qquad n = \sqrt{\varepsilon_r \mu_r}, \]

where $c_0$ is the speed of light in vacuum and we introduced the refractive index $n$, which
we know from optics.


Impedance

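The amplitudes of the two waves are not arbitrary; they are related to each other. To see
this, consider the following form of the $\vec{E}$ and $\vec{H}$ field:

\[ E_y = E_0 \sin\left(\omega\left(t - \frac{x}{c}\right)\right), \]
\[ H_z = H_0 \sin\left(\omega\left(t - \frac{x}{c}\right)\right), \]

where we assumed an angular frequency $\omega$ and the amplitudes $E_0$ and $H_0$.

In the derivation of the wave equation we encountered the equation (adapted to our
situation)

\[ \frac{\partial H_z}{\partial x} = -\varepsilon_0 \varepsilon_r \frac{\partial E_y}{\partial t} \]

which, applied to our ansatz and after cancelling the common minus sign, leads to

\[ \frac{\omega}{c} H_0 \cos\left(\omega\left(t - \frac{x}{c}\right)\right) = \varepsilon_0 \varepsilon_r \omega E_0 \cos\left(\omega\left(t - \frac{x}{c}\right)\right), \]
\[ H_0 = c\, \varepsilon_0 \varepsilon_r E_0. \]

Using the relation we found for the speed of light, we find the ratio between the $\vec{E}$ and
the $\vec{H}$ field, which is also called the impedance $Z$:

\[ Z = \frac{E_0}{H_0} = \frac{|\vec{E}|}{|\vec{H}|} = \sqrt{\frac{\mu_0 \mu_r}{\varepsilon_0 \varepsilon_r}}. \]

This impedance is only valid for electromagnetic waves; in the case of static charges it is
obviously wrong.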
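To get a feeling for the numbers, the speed of light, the refractive index and the wave impedance can be evaluated for a given material. This is a minimal sketch; the function name `wave_properties` and the value $\varepsilon_r = 2.25$ (a hypothetical glass-like, non-magnetic material) are assumptions for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
MU0 = 1.25663706212e-6   # vacuum permeability in H/m


def wave_properties(eps_r, mu_r=1.0):
    """Return (c, n, Z) for a material with relative permittivity eps_r and permeability mu_r."""
    c = 1.0 / math.sqrt(EPS0 * eps_r * MU0 * mu_r)  # speed of light in the material
    n = math.sqrt(eps_r * mu_r)                     # refractive index
    Z = math.sqrt(MU0 * mu_r / (EPS0 * eps_r))      # wave impedance E0 / H0
    return c, n, Z


c0, n0, Z0 = wave_properties(1.0)       # vacuum
eps_r = 2.25                            # hypothetical material
c, n, Z = wave_properties(eps_r)

print("vacuum:   c0 =", c0, "m/s, Z0 =", Z0, "Ohm")
print("material: c  =", c, "m/s, n =", n, ", Z =", Z, "Ohm")
```

For vacuum the sketch reproduces the well-known values of roughly $3 \cdot 10^8\,\mathrm{m/s}$ and $377\,\Omega$, and the relation $H_0 = c\,\varepsilon_0 \varepsilon_r E_0$ is equivalent to $H_0 = E_0 / Z$.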


11.6 Energy of the electromagnetic field

Now that we have introduced the $\vec{H}$ and $\vec{D}$ field, we can derive the energy density of the
electric field in its full generality. In addition we can introduce a quantity which quantifies
the energy flux of the electromagnetic field.

11.6.1 Energy density of the electric field

As discussed in section 9.2.7, the stored energy of a capacitor is $W = \frac{1}{2} QU$, where $Q$
is the (free) charge and $U$ is the applied voltage. We choose this formula because
no capacity $C$ appears in it, i.e. it does not contain the dielectric constant²⁶, which might be
affected by polarization. As we derived in section 11.5.3, the charge and the $D$ field are
connected by

\[ Q = DA, \]

where $A$ is the area of the plates. In addition we can relate the electric field and the voltage
by the usual formula²⁷

\[ U = Ed, \]

where $d$ is the distance between the plates. Inserting this into the equation above, we obtain

\[ W = \frac{1}{2} A d E D. \]

As $Ad$ is the volume of the capacitor, the energy density $u$ of the electric field is

\[ u = \frac{1}{2} \vec{E} \cdot \vec{D}, \]

where we made the step to the most general formula using the scalar product of $\vec{E}$ and
$\vec{D}$. In most cases, $\vec{E}$ and $\vec{D}$ are parallel and therefore the scalar product is not necessary.
26
Remember the capacity of a plate capacitor is given by $C = \frac{\varepsilon_0 A}{d}$, where $A$ is the area of a capacitor
plate and $d$ the distance between them.
27
This formula is still valid since the voltage is a measure for the energy per charge, and the energy is
something like force times displacement. Since the electric field is defined via the force, its connection to
the voltage is still valid.


11.6.2 Energy density of the magnetic field

With similar arguments we could derive the energy density of the magnetic field. Nevertheless
we will not do this, but use the strong similarity of the electric and magnetic field
formalism. Using the most general form of the energy density of the electric field, we simply claim that
the energy density of the magnetic field is

\[ u = \frac{1}{2} \vec{H} \cdot \vec{B}. \]

11.6.3 Poynting vector

The energy density of the electric field is $u_e = \frac{1}{2} DE = \frac{\varepsilon_0 \varepsilon_r}{2} E^2$ and the one of the
magnetic field is $u_m = \frac{1}{2} HB = \frac{\mu_0 \mu_r}{2} H^2$. Considering an electromagnetic wave, we can
use the impedance $E = ZH$, where $Z = \sqrt{\frac{\mu_0 \mu_r}{\varepsilon_0 \varepsilon_r}}$, to find $u_e = u_m$. For the total energy
density we then find in scalar notation

\[ u = u_e + u_m = \varepsilon_0 \varepsilon_r E^2 = \mu_0 \mu_r H^2 = \frac{EH}{c}. \]

The energy density is related to the intensity $I$ as

\[ I = uc \]

where $c$ is the speed of light. To derive this equation, consider an electromagnetic wave
transporting the energy density $u$ and moving with the speed of light, see also figure 11.14.
The energy $W$ passing through a surface of area $A$ in a time $\Delta t$ is the density times the
volume that passes the surface

\[ W = u A c \Delta t. \]

The intensity is the energy per time and per surface, so it is simply $I = uc$.


Figure 11.14: In the time $\Delta t$, the volume $A c \Delta t$ passes through the surface $A$.

Inserting the energy density we found for the electromagnetic wave, we get an intensity
of

\[ I = EH. \]

We can even go one step further and give the intensity a direction, namely the direction
of propagation of the electromagnetic wave. The resulting vector is called the Poynting vector
$\vec{S}$. As we know from the derivation of the electromagnetic wave, the direction of
propagation, the $\vec{E}$ field and the $\vec{H}$ field are mutually orthogonal, which leads to the formula

\[ \vec{S} = \vec{E} \times \vec{H}. \]

Interestingly, the Poynting vector is not restricted to electromagnetic waves
but can be applied to any configuration. It basically tells you in which direction how
much electromagnetic power flows per area. Take for example a simple DC circuit with
a voltage source and a resistor. In figure 11.15, the electric and magnetic field as well as
the Poynting vector are drawn schematically. At the source, the Poynting vector
points away, as the electric energy is fed into the circuit, i.e. given away by the source.
The opposite happens at the resistor, to which the electric energy flows.
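The relations $u_e = u_m$, $I = uc$ and $\vec{S} = \vec{E} \times \vec{H}$ can be checked for a plane wave in vacuum. This is a minimal sketch; the amplitude $E_0 = 100\,\mathrm{V/m}$ and the helper function `cross` are assumptions for illustration, and the fields are taken at an instant where both have their maximal value.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
MU0 = 1.25663706212e-6   # vacuum permeability in H/m


def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


c0 = 1.0 / math.sqrt(EPS0 * MU0)  # speed of light in vacuum
Z0 = math.sqrt(MU0 / EPS0)        # wave impedance of vacuum

E0 = 100.0                        # hypothetical field amplitude in V/m
H0 = E0 / Z0                      # magnetic field amplitude in A/m

E = (0.0, E0, 0.0)                # electric field along y
H = (0.0, 0.0, H0)                # magnetic field along z
S = cross(E, H)                   # Poynting vector, points along x

u_e = 0.5 * EPS0 * E0 ** 2        # electric energy density
u_m = 0.5 * MU0 * H0 ** 2         # magnetic energy density

print("S =", S, "W/m^2")
print("u_e =", u_e, "J/m^3, u_m =", u_m, "J/m^3")
print("(u_e + u_m) * c0 =", (u_e + u_m) * c0, "W/m^2")
```

With $\vec{E}$ along $y$ and $\vec{H}$ along $z$, the Poynting vector points along $x$, which is the propagation direction of the wave considered in section 11.5.6.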


Figure 11.15: Usual DC circuit with a voltage source (left) and a resistor (right). The vertical
arrows indicate the electric field, the circles the magnetic field (caused by the current,
use the right-hand rule) and the horizontal arrows indicate the Poynting vector.

Chapter 12

ALTERNATING CURRENT (AC)

Professor to student: "Does a tram
actually run on direct or alternating
current?" Student: "With alternating
current!" Professor: "But wouldn't it
have to go back and forth all the
time?" Student: "But that's what it
does!"

12.1 Describing alternating voltage and current
12.2 Impedance
12.3 Combinations of R, C and L
12.4 Power consideration and effective values
12.5 Three-phase electric power


When the electric voltage periodically oscillates there are a couple of phenomena which
can be observed but do not appear when a constant voltage is applied. In the following
chapter we will describe sinusoidal voltages and the behaviour of electric devices when
they are connected to it. The main difference to constant voltages is that there exist
devices whose resistances depend on the frequency of the voltage. In particular we will
have a look at ohmic resistors, capacitors and inductors and combinations of them.

12.1 Describing alternating voltage and current

Since the voltage (and therefore also the current) oscillates with time the voltage is not
simply a number but a function of time. There are a couple of possibilities to describe
this and we will have a look at the most important ones in the next sections.

12.1.1 Fourier series

There is an important theorem in mathematics which states that any periodic function
$f(t)$ can be decomposed into a sum of harmonic oscillations¹. This theorem is called
Fourier's theorem and the decomposition is called the Fourier series.
Let's formulate this a bit more mathematically: Let $f(t)$ be a periodic function with period
$T$. This means that for any $t$, $f(T + t) = f(t)$. Then the theorem states that this function
$f(t)$ can be written as

\[ f(t) = \sum_{n=0}^{\infty} \left[ A_n \sin(\omega_0 n t) + B_n \cos(\omega_0 n t) \right] \]

where $\omega_0 = \frac{2\pi}{T}$ is the angular frequency and $A_n$ and $B_n$ are constants which depend on
the periodic function $f(t)$. The whole theory about Fourier series is too complicated to
be treated here. The important point is that if we know the behaviour of a device for a
sinusoidal voltage, we can construct its behaviour for any other periodic voltage. So from
now on we always consider the voltages to be sinusoidal unless explicitly mentioned.
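As an illustration of such a decomposition, consider a square wave which is $+1$ in the first half of each period and $-1$ in the second half. Its Fourier coefficients are known to be $A_n = \frac{4}{\pi n}$ for odd $n$, while all other coefficients vanish. The following minimal sketch sums the first terms of the series; the function name `square_wave_partial_sum` and the chosen values are assumptions for illustration.

```python
import math


def square_wave_partial_sum(t, T, n_max):
    """Partial Fourier sum of a square wave with period T and amplitude 1."""
    omega0 = 2.0 * math.pi / T
    total = 0.0
    for n in range(1, n_max + 1, 2):  # only odd n contribute
        total += 4.0 / (math.pi * n) * math.sin(n * omega0 * t)
    return total


T = 20.0  # hypothetical period
for n_max in (1, 5, 51, 201):
    # At t = T/4 the square wave has the value +1
    print(n_max, square_wave_partial_sum(T / 4, T, n_max))
```

The more terms are included, the closer the sum gets to the value of the square wave, and every partial sum is periodic with period $T$.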

1
Harmonic oscillations are nothing else than sinusoidal oscillations.


12.1.2 Usual real notation

A sinusoidal voltage takes the general form

\[ u(t) = U_0 \cos(\omega t + \varphi) \]

where $U_0$ is the amplitude of the oscillating voltage, $\omega = \frac{2\pi}{T}$ is the angular frequency
and $\varphi$ is the phase. Of course we could also choose the sine instead of the cosine with
a different phase, but this description is more consistent with another description (see
section 12.1.4). It is convenient in AC notation to write time-dependent quantities with
lower-case letters ($u(t)$) and time-independent quantities with capital letters ($U_0$, $U_\text{eff}$).
We assume that the voltage has already been oscillating for a long time. Therefore the whole physics
does not change if we shift time. A shift in time corresponds to an additional phase.
It is often easier for calculations to shift time such that the phase of the voltage or the
current is zero. If we make a particular choice of the phase, it is not guaranteed that other
quantities have the same phase.

12.1.3 Phasor
If the voltage oscillates harmonically, one can see the voltage $u(t)$ at a certain time $t$ as
a leg (cathetus) of a right triangle (see picture 12.1) with a hypotenuse of length $U_0$. Let us
now rotate the hypotenuse of the triangle with angular frequency $\omega$ in the mathematically
positive orientation, which is anti-clockwise. Then the leg of the triangle behaves
exactly like the alternating voltage. This means the whole information of the alternating
voltage is encoded in the vector in picture 12.1: the amplitude, the angular frequency
and also the phase are given. Therefore we can imagine an alternating voltage or current as a vector
which rotates with constant frequency, and by looking at its projection on the x-axis we
get the familiar real notation.

12.1.4 Complex notation

A phasor corresponds to a two-dimensional vector which has its starting point at zero
and its endpoint at the coordinates $x$ and $y$. We can now associate this two-dimensional
vector with a complex number $z = x + jy \in \mathbb{C}$, where $j$ is the imaginary unit² and
$x, y \in \mathbb{R}$ are real. Therefore a phasor can be associated with a complex number with radius
$r = \sqrt{z\bar{z}} = \sqrt{x^2 + y^2}$ and a turning phase. This can easily be written as (see section
2.4.2)
2
Usually i is the imaginary unit. Since i = i(t) is already the time dependent current, j is used.


Figure 12.1: Phasor u(t) with hypotenuse length U0 . The phasor rotates around the
origin with the angular frequency ω. The projection on the real axis corresponds to the
measured quantity (for example the voltage).

\[ z = r e^{j(\omega t+\varphi)} = r \cos(\omega t + \varphi) + j r \sin(\omega t + \varphi). \]

In the case of $z$ representing a voltage, $r$ is the amplitude, so $r = U_0$. The voltage
is the projection of the phasor on the x-axis.
Therefore it is the real part of $z$:

\[ u(t) = \mathrm{Re}(z) = U_0 \cos(\omega t + \varphi). \]
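The connection between the rotating complex number and the real voltage can be made explicit in a few lines. This is a minimal sketch; the function name `phasor` and the values of $U_0$, $\omega$, $\varphi$ and $t$ are assumptions for illustration.

```python
import cmath
import math


def phasor(U0, omega, phi, t):
    """Complex phasor U0 * exp(j(omega*t + phi)) at time t."""
    return U0 * cmath.exp(1j * (omega * t + phi))


U0, omega, phi = 9.0, math.pi / 10, 1.0  # hypothetical amplitude, angular frequency, phase
t = 2.0                                  # hypothetical time

z = phasor(U0, omega, phi, t)
u_real = U0 * math.cos(omega * t + phi)  # usual real notation

print("phasor z =", z)
print("Re(z)    =", z.real)
print("u(t)     =", u_real)
print("|z|      =", abs(z))
```

The real part of the phasor agrees with the real notation at every time $t$, while its absolute value stays equal to the amplitude $U_0$.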

The complex notation might be a bit strange at the beginning because we use a non-real
quantity (a complex number) to describe something real (for example a voltage). This is
not really a problem because the complex number is just a notation. Nevertheless there
are cases where one can really think about "turning voltages", and then the phasor and
also the complex notation as a description of the phasor get some real properties (see
12.5.1).
In this script we use the complex notation, therefore we write a voltage as $u(t) =
U_0 e^{j(\omega t+\varphi)}$ and keep in mind that the physical quantity is only the real part. The complex
notation also allows us to add or subtract complex numbers, because the real part of the
sum of two complex numbers is equal to the sum of the real parts of the two numbers.
Thus for $z_1 = a_1 + jb_1$ and $z_2 = a_2 + jb_2$ where $a_1, a_2, b_1, b_2 \in \mathbb{R}$ it follows

\[ \mathrm{Re}(z_1 + z_2) = \mathrm{Re}(a_1 + a_2 + j(b_1 + b_2)) = a_1 + a_2 = \mathrm{Re}(a_1 + jb_1) + \mathrm{Re}(a_2 + jb_2) = \mathrm{Re}(z_1) + \mathrm{Re}(z_2). \]

This property is important when we formulate Kirchhoff's laws, see section 12.3.1. It
allows us to perform all the (linear) calculations in the complex notation as long as we do
not multiply voltage and current. Such a multiplication occurs when we look at the power³.
The complex notation is often used and it simplifies the equations a lot. Nevertheless it is
also possible to do the whole alternating current theory without complex numbers. One
then always has to think about the amplitude and the phase and examine the different
phenomena according to both of these quantities. With the complex notation everything
can be done with one number, since the complex number contains the amplitude and the
phase.

12.2 Impedance
In this section we examine the relation between the applied alternating voltage and
current for different electrical components. The components we examine are the
resistor, the capacitor and the inductor. Most other electrical devices behave like a
combination of these three basic components, we discuss this in section 12.3.

The impedance is a generalisation of the resistance of an electrical device. The resistance
$R$ in a direct current circuit is the ratio of the voltage and the current: $R = \frac{U}{I}$. Since
we now examine alternating currents, there are more degrees of freedom than only
the "amount" of voltage or current, which corresponds to the amplitude. We additionally
have a phase shift between the applied voltage and the current. Therefore the impedance
not only takes into account the ratio of the amplitudes of the voltage and current but also
their phase shift. Since the voltage as well as the current are represented by a complex
function, the impedance $Z$ is usually a complex number⁴. The absolute value of $Z$
corresponds to the ratio of the amplitudes of voltage and current. The angle between the
real axis and $Z$ corresponds to the phase shift.
Assume that we have an electrical device and we measure a voltage $u(t) = U_0 e^{j\omega t}$ and a
current $i(t) = I_0 e^{j(\omega t+\varphi)}$. Then the impedance is given as

\[ Z = \frac{u(t)}{i(t)} = \frac{U_0}{I_0} e^{-j\varphi}. \]

From this impedance we conclude that the ratio of the amplitudes is given by $|Z| = \frac{U_0}{I_0}$
and that the cosine curve of the current is shifted by the phase $\varphi$ with respect to the cosine
curve of the voltage; for $\varphi > 0$ the current leads the voltage (see picture 12.2).
3
In this case one has to first take the real part and then multiply the two quantities.

12.2.1 Ohmic resistor

An ohmic resistor is a device where the current is always proportional to the voltage.
Therefore there is no phase shift between the voltage and the current and the impedance
of an ohmic resistor therefore is a real number R. Of course R is exactly the ohmic
resistance known from the direct current. This means

\[ R = \frac{u(t)}{i(t)} = \frac{U_0}{I_0}. \]

12.2.2 Capacitor
A capacitor is a device where charge can be stored (see electrodynamics 1 section 9.2.6).
The basic equation of a capacitor with capacity C is

\[ C = \frac{Q}{U} \qquad (12.1) \]
4
If one describes AC without complex notation one has to consider the ratio of the amplitudes and the
phase shift separately as two real numbers.


Figure 12.2: Sinusoidal voltage and current. The amplitude of the voltage is $U_0 = 9$
and the one of the current is $I_0 = 4.5$. The absolute value of the impedance is therefore
$|Z| = 2$ and the phase shift is $\varphi = 1\,\text{rad}$. The angular frequency is $\omega = \frac{\pi}{10}$, the period
therefore is $T = \frac{2\pi}{\omega} = 20$.

where $Q$ is the stored charge and $U$ is the applied voltage. As we have seen, the capacity
does not depend on the applied voltage. Therefore it is constant for an alternating voltage
as well. If we now multiply equation (12.1) by the voltage $u(t) = U_0 e^{j(\omega t+\varphi)}$ and take
the derivative with respect to time of the whole equation, we get

\[ u(t) C = q(t), \]
\[ C \frac{du(t)}{dt} = \frac{dq(t)}{dt} = i(t), \]
\[ C \frac{d\left(U_0 e^{j(\omega t+\varphi)}\right)}{dt} = C U_0 j\omega e^{j(\omega t+\varphi)} = jC\omega u(t) = i(t). \]

As a consequence we get the impedance $Z_C$ of a capacitor by

\[ Z_C = \frac{u(t)}{i(t)} = \frac{u(t)}{Cj\omega u(t)} = \frac{1}{jC\omega} = -\frac{j}{C\omega}. \]

In figure 12.3 the phasor and the time evolution of the voltage and the current are shown.
The $-j$ in the impedance means that the voltage is rotated $90^\circ$ clockwise⁵ with respect to the current.
This is intuitively clear because the capacitor has to be charged in order to have a voltage and it is the
current that charges the capacitor. Therefore the current comes first and after that, when the
capacitor is already charged a bit, one can measure a voltage. The angular frequency $\omega$ in
the denominator is intuitively also clear because if $\omega$ is big, the capacitor gets charged and
discharged fast. As a consequence the current is big and therefore the impedance small.
With the capacity $C$ in the denominator it is nearly the same, because a big capacity can
store more charge, therefore the current is big.
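The magnitude and phase of the capacitor impedance can be evaluated directly with complex numbers. This is a minimal sketch; the function name `capacitor_impedance` and the values ($C = 1\,\mu\mathrm{F}$, frequency 50 Hz) are assumptions for illustration.

```python
import cmath
import math


def capacitor_impedance(C, omega):
    """Impedance Z_C = 1 / (j * omega * C) of an ideal capacitor."""
    return 1.0 / (1j * omega * C)


C = 1e-6                     # hypothetical capacity in F
omega = 2.0 * math.pi * 50   # angular frequency for a 50 Hz voltage

Z_C = capacitor_impedance(C, omega)
phase_deg = math.degrees(cmath.phase(Z_C))

print("Z_C   =", Z_C, "Ohm")
print("|Z_C| =", abs(Z_C), "Ohm")
print("phase =", phase_deg, "degrees")
```

The phase of $-90^\circ$ corresponds to the clockwise rotation discussed above, and doubling either $\omega$ or $C$ halves the absolute value of the impedance.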

Figure 12.3: Phasor at time $t = 0$ and time diagram for a capacitor. Be aware that the
voltage or current is the projection on the x-axis.

5
Multiplying a complex number with j rotates that number by an angle of 90◦ in the positive orientation.
Therefore the multiplication with −j rotates it 90◦ in the negative orientation, therefore clockwise.


12.2.3 Inductor
An inductor is an electrical device with a high inductance; an ideal inductor has no
resistance and no capacity. The inductance $L = \frac{\Phi}{I}$ of a device is the ratio between
the magnetic flux $\Phi$ and the current $I$ through the device (causing the magnetic field), see
also section 11.2.2. In AC there is an additional phenomenon, called self-inductance:
an alternating current causes an alternating magnetic field, which in turn induces a voltage
in that device. The induced voltage is such that it opposes a further growth of the
current.
From electromagnetism we get the following equation:

\[ u(t) = -u_\text{ind}(t) = L \frac{di(t)}{dt} \]

where $L$ is the inductance of the inductor. The switch of the sign between $u(t)$ and
$u_\text{ind}(t)$ comes from an energy consideration: If we look at an ohmic resistor, the resistance
is bigger than zero. As a consequence $u(t)$ and $i(t)$ are in phase. If we look at a voltage
source, then the voltage and the current have a phase shift of $180^\circ$. This is basically
a convention but it ensures that Kirchhoff's laws hold⁶. The induced voltage is like a
source voltage; it therefore has the opposite sign to the applied voltage $u(t)$. Assume that
the current is $i(t) = I_0 e^{j(\omega t+\varphi)}$. Then the voltage is

\[ u(t) = L \frac{di(t)}{dt} = L \frac{d\left(I_0 e^{j(\omega t+\varphi)}\right)}{dt} = L I_0 j\omega e^{j(\omega t+\varphi)} = jL\omega i(t). \]

The impedance of the inductor therefore is

\[ Z = \frac{u(t)}{i(t)} = \frac{jL\omega i(t)}{i(t)} = jL\omega. \]

The phasor and the time dependence are shown in figure 12.4. The $j$ causes the voltage
to be $90^\circ$ ahead of the current. This is clear: If one applies a voltage to an inductor,
the growing current causes a growing magnetic field which opposes the growth of the
current. The current therefore grows slowly whereas the voltage over the inductor drops
6
If there is no external alternating magnetic field through the circuit, the sum of all voltages over the
devices of a closed path is zero. If we go in the direction of the current then it is obvious that the source
and the consuming devices have different sign. See also 12.3.1.


immediately⁷. The dependence on the angular frequency and the inductance is intuitively
clear: a large inductance causes a stronger magnetic field, and therefore the induced
voltage, which opposes the growth of the current, is also larger and, as a consequence,
the current is smaller. For high frequencies, the magnetic field changes fast. Since the
induced voltage is proportional to the rate of change of the magnetic field, the current is small and
therefore the impedance is high.
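The frequency dependence of the inductor impedance can be illustrated in the same way as for the capacitor. This is a minimal sketch; the function name `inductor_impedance`, the inductance of 10 mH and the list of frequencies are assumptions for illustration.

```python
import cmath
import math


def inductor_impedance(L, omega):
    """Impedance Z = j * omega * L of an ideal inductor."""
    return 1j * omega * L


L_henry = 10e-3  # hypothetical inductance in H

for f in (50.0, 500.0, 5000.0):  # hypothetical frequencies in Hz
    omega = 2.0 * math.pi * f
    Z_L = inductor_impedance(L_henry, omega)
    phase_deg = math.degrees(cmath.phase(Z_L))
    print("f =", f, "Hz: |Z| =", abs(Z_L), "Ohm, phase =", phase_deg, "degrees")
```

The absolute value of the impedance grows linearly with the frequency, while the phase stays at $+90^\circ$, i.e. the voltage is always a quarter period ahead of the current.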

Figure 12.4: Phasor at time t = 0 and time diagram for an inductor. Be aware that the
voltage or current is the projection on the x-axis.

12.3 Combinations of R,C and L

In this chapter we have a look at the most important combinations of basic electrical
elements. Here the usefulness of the complex notation becomes obvious. Nevertheless we
calculate one example with real impedances as well in order to show how this works.

12.3.1 Kirchhoff’s laws

We can nearly adapt Kirchhoff’s laws with some small modifications from direct current
(DC). The first modification is that we use complex currents, voltages and impedances.
Since charge cannot be created or destroyed, we get the continuity equation: For any
7
This is different to the capacitor where the capacitor has to be charged in order to have a voltage drop


point in an electrical circuit and for all times, the sum of all currents $i_k(t)$ flowing to a
particular point has to be equal to the change of charge $q(t)$ at that point:

\[ \sum_k i_k(t) = \frac{dq(t)}{dt}. \]

Assuming no charge is stored, as is the case for (nearly) all electrical elements except
the capacitor, we get that the sum of all currents flowing to a point has to be zero,
$\sum_k i_k(t) = 0$. This is one of Kirchhoff's laws.

From electrodynamics we know that at any time, the sum of all voltages $u_k(t)$ around a
closed path has to be equal to the negative change of the external magnetic flux $\Phi_\text{ext}$:

\[ \sum_k u_k(t) = u_\text{ind} = -\frac{d\Phi_\text{ext}}{dt}. \]

Assuming there is no time-dependent magnetic field, the equation simplifies to the known
Kirchhoff's law $\sum_k u_k(t) = 0$. Be aware that the voltages are complex quantities, which
means that the vectorial sum of all the voltage phasors has to be zero.

12.3.2 Serial and parallel circuit

With the same argument as in the case of DC, one gets formulas for the total impedance
of electric components connected in series or in parallel.
If two or more components are connected in series, the current $i(t)$ through the
components is the same everywhere. Multiplying the current by the impedance of each component, one gets
the voltage drop $u_k(t) = i(t) Z_k$ over each element. The sum of all these voltage
drops must be equal to the applied voltage, $u(t) = \sum_k u_k(t)$⁸. The total impedance is
then given by

\[ Z = \frac{u(t)}{i(t)} = \frac{i(t) \sum_k Z_k}{i(t)} = \sum_k Z_k, \]

which is exactly what we expect from DC, with the difference that the impedances $Z_k$
are complex.
8
Here the voltage over the source is measured in the opposite direction, which causes a change of the
sign. It is then compatible with the notation in the Kirchhoff’s laws.

12 Alternating current (AC)

With the analogous argument for components which are connected in parallel to the
voltage source, one gets that the total impedance is given as

\[ Z = \left( \sum_k \frac{1}{Z_k} \right)^{-1} \]

which is also exactly what we expect from DC.
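
The two rules can be tried out numerically with complex numbers. The following is only a minimal sketch: the helper names `series` and `parallel` and the component values are our own assumptions for illustration, and the impedances $Z_C = -j/(\omega C)$ and $Z_L = j\omega L$ are the ones used in this chapter.

```python
import math


def series(*impedances):
    """Total impedance of components in series: Z = sum of all Z_k."""
    return sum(impedances, 0j)


def parallel(*impedances):
    """Total impedance of components in parallel: Z = (sum of 1/Z_k)^-1."""
    return 1 / sum(1 / z for z in impedances)


# Hypothetical example values (not taken from the text)
omega = 2 * math.pi * 50.0        # angular frequency of a 50 Hz source
R, C, L = 100.0, 10e-6, 0.5       # ohm, farad, henry

Z_R = complex(R, 0.0)             # resistor
Z_C = -1j / (omega * C)           # capacitor
Z_L = 1j * omega * L              # inductor

print("series R, C, L:", series(Z_R, Z_C, Z_L))
print("R parallel C:  ", parallel(Z_R, Z_C))
```

As in the DC case, the series impedance always has the real part R, while the parallel combination of R and C gets both a real and an imaginary part.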

12.3.3 High pass filter

One important AC circuit is the high pass filter. A high pass filter has a high impedance
for low frequencies and a low impedance for high frequencies. The high pass filter is
often used as an audio filter to keep low frequencies away from the tweeters. There are
different realisations of a high pass filter with different characteristics. One of the easiest
is the RC circuit as shown in figure 12.5.

Figure 12.5: High pass circuit. An alternating voltage uin (t) is applied to a capacitor and
a resistor and the voltage uout (t) over the resistor is measured.


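The total impedance is given as

\[
\begin{aligned}
Z &= Z_R + Z_C = R - \frac{j}{\omega C}, \\
\frac{1}{Z} &= \frac{1}{R - \frac{j}{\omega C}} = \frac{\omega C}{R\omega C - j} = \frac{\omega C (R\omega C + j)}{(R\omega C)^2 + 1}, \\
|Z| &= \sqrt{R^2 + \left(\frac{1}{\omega C}\right)^2}
\end{aligned}
\]

where $R$ is the resistance of the resistor, $C$ the capacitance of the capacitor and $\omega$ the
angular frequency of the applied voltage. If a voltage $u(t) = U_0 e^{j\omega t}$ is applied to the
high pass filter, the voltage drop over the resistor is given by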

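\[ u_R(t) = R\,i(t) = R\,\frac{u(t)}{Z} = R\,u(t)\,\frac{\omega^2 C^2 R + j\omega C}{(R\omega C)^2 + 1}. \]

This leads to an amplitude $U_{R0}$ and phase $\varphi$ of

\[
\begin{aligned}
U_{R0} &= \frac{R U_0 \sqrt{\omega^4 C^4 R^2 + \omega^2 C^2}}{(R\omega C)^2 + 1}
        = \frac{R U_0\, \omega C \sqrt{(R\omega C)^2 + 1}}{(R\omega C)^2 + 1}
        = \frac{R U_0\, \omega C}{\sqrt{(R\omega C)^2 + 1}}, \\
\tan(\varphi) &= \frac{\omega C}{R\omega^2 C^2} = \frac{1}{R\omega C}.
\end{aligned}
\]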
This means that one measures a voltage with amplitude $U_{R0}$ and angular frequency ω
which leads the voltage of the AC source by the angle ϕ. The voltage over the resistor
leads the voltage of the source because the voltage drop over the resistor is proportional
to the current, and at the capacitor the current leads the voltage.
In the limit ω → ∞ there is no phase shift and all the voltage drops over the resistor. This
is also clear, because at very high frequencies the capacitor has a very low impedance and
therefore almost no influence on the circuit. In figure 12.6 the amplitude and the phase
are plotted as functions of ω.
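
To see the formulas at work, one can let the computer do the complex division and compare the result with the closed expressions for $U_{R0}$ and $\tan(\varphi)$. This is a small sketch under our own assumptions: the function name `high_pass_output` is hypothetical, and the component values are the ones quoted for figure 12.6.

```python
import cmath
import math


def high_pass_output(omega, R, C, U0=1.0):
    """Amplitude and phase of the resistor voltage, computed with complex numbers."""
    Z = R - 1j / (omega * C)      # total impedance Z = Z_R + Z_C
    u_R = U0 * R / Z              # phasor of u_R relative to the source voltage
    return abs(u_R), cmath.phase(u_R)


R, C, U0 = 1000.0, 1e-6, 1.0      # values quoted for figure 12.6

for f in (10.0, 159.0, 10000.0):  # frequencies in Hz, chosen for illustration
    omega = 2 * math.pi * f
    amplitude, phase = high_pass_output(omega, R, C, U0)
    # Closed formulas from the text, for comparison
    amplitude_formula = R * U0 * omega * C / math.sqrt((R * omega * C) ** 2 + 1)
    phase_formula = math.atan(1 / (R * omega * C))
    print(f, amplitude, amplitude_formula, math.degrees(phase), math.degrees(phase_formula))
```

For each frequency the two amplitudes and the two phases agree. Around f ≈ 159 Hz we have RωC ≈ 1, so the amplitude is close to $U_0/\sqrt{2}$ and the phase close to 45°.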

Now let us do the calculation once in real notation in order to see how elegant the complex
notation is. For the real notation there are different approaches; the easiest is to do the
calculation with phasors by geometrically adding impedances. But this is basically the same
as what we did above in the complex notation, with vectors instead of complex numbers.


Figure 12.6: Amplitude $U_{R0}$ and phase ϕ of the high pass filter. As expected, the output
voltage is higher for high frequencies. For the plot the following values were assumed:
$U_0 = 1\,\text{V}$, $R = 1000\,\Omega$ and $C = 10^{-6}\,\text{F}$. Be aware that the x-axis is a frequency $f = \frac{\omega}{2\pi}$.

For the calculation with real quantities let the current be $i(t) = I_0 \cos(\omega t)$⁹. This does
not mean that the voltage at the voltage source is of the form $u(t) = U_0 \cos(\omega t)$, because
there will be a phase shift between current and voltage. We now compute this phase shift
as well as the relation between $U_0$ and $I_0$.
From Ohm’s law we know that the voltage over a resistor is always proportional to the
current, therefore $u_R(t) = R\,i(t) = R I_0 \cos(\omega t)$. Additionally we know that the voltage
over a capacitor always lags the current by 90° and therefore $u_C(t) = \frac{I_0}{\omega C} \sin(\omega t)$. The voltage
over the voltage source is equal to the voltage over the RC combination and this is the
sum of the two voltages

\[ u_\text{in}(t) = u_R(t) + u_C(t) = I_0 \left( R \cos(\omega t) + \frac{1}{C\omega} \sin(\omega t) \right) = U_0 \cos(\omega t - \varphi). \]

Using some trigonometric identities, we get

9
We could also assume a different phase because only the phase shift between current and voltage matters.
So we choose the phase of the current to be zero because the current through both components is the same,
so this simplifies the calculation.


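\[ U_0 = I_0 \sqrt{R^2 + \left(\frac{1}{\omega C}\right)^2}, \qquad \varphi = \arctan\left(\frac{1}{\omega C R}\right). \]

The absolute value of the total impedance therefore is $\frac{U_0}{I_0} = \sqrt{R^2 + \left(\frac{1}{\omega C}\right)^2}$, the same
as above in the complex notation. The phase shift is the same as well.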
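
The trigonometric step can also be checked numerically: with $U_0$ and ϕ as given above, the sum of the cosine and the sine term must coincide with $U_0 \cos(\omega t - \varphi)$ at every time. A short sketch, where the function name and all numerical values are assumptions chosen only for illustration:

```python
import math


def source_amplitude_and_phase(I0, R, C, omega):
    """U0 and phi such that I0*(R*cos(wt) + sin(wt)/(w*C)) = U0*cos(wt - phi)."""
    U0 = I0 * math.sqrt(R ** 2 + (1.0 / (omega * C)) ** 2)
    phi = math.atan(1.0 / (omega * C * R))
    return U0, phi


# Hypothetical example values
I0, R, C = 1e-3, 1000.0, 1e-6
omega = 2 * math.pi * 100.0
U0, phi = source_amplitude_and_phase(I0, R, C, omega)


def sum_of_voltages(t):
    """u_R(t) + u_C(t) in real notation."""
    return I0 * (R * math.cos(omega * t) + math.sin(omega * t) / (omega * C))


def single_cosine(t):
    """The same voltage written as one shifted cosine."""
    return U0 * math.cos(omega * t - phi)


# Largest deviation between both expressions over one period (10 ms)
max_diff = max(abs(sum_of_voltages(k * 1e-4) - single_cosine(k * 1e-4)) for k in range(100))
print(U0, math.degrees(phi), max_diff)
```

The deviation is at the level of the floating point rounding error, and the ratio $U_0/I_0$ equals the absolute value of the complex impedance $R - j/(\omega C)$.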

12.3.4 Resonant circuit

There are two different types of resonant circuits, the parallel circuit and the serial circuit.
They behave differently but the equations to solve the problem are almost the same. So
we only look at the serial circuit.
A serial resonant circuit is a circuit where a capacitor, a resistor and an inductor are
connected in series. If there is no resistor (or its resistance is zero) the circuit is called an
ideal serial circuit or an LC circuit. Since this is a special case, we consider the general case
where R ≠ 0. Figure 12.7 shows the serial resonant circuit. We connect the resonant
circuit to a voltage source with an angular frequency Ω.

Figure 12.7: Serial resonant circuit. An alternating voltage with angular frequency Ω is
applied to an inductor, a resistor and a capacitor.


Minimal resistance
The total impedance of the circuit is given as

\[ Z = Z_R + Z_C + Z_L = R + j \left( L\Omega - \frac{1}{\Omega C} \right). \]

Obviously the absolute value of the impedance is minimal if the imaginary part vanishes.
In this case the angular frequency is often called resonance frequency¹⁰, Ω = ω0. It then
holds

\[ L\omega_0 = \frac{1}{\omega_0 C} \quad\Rightarrow\quad \omega_0 = \frac{1}{\sqrt{LC}}. \]

In fact this is only the resonance frequency for the ideal LC circuit; for a non-ideal
circuit the resonance frequency is a bit smaller (see below). Since the impedance of the
circuit is minimal, the current from the source through the circuit is maximal at this frequency.

Natural frequency
Since the resonant circuit is a very beautiful example of a harmonic oscillator, we briefly
repeat its properties. As it is an oscillator, it can oscillate on its own. Consider the following
case: We open the circuit, charge the capacitor with a charge $q_0$ and close the circuit
again. As soon as the circuit is closed, the overall voltage must be zero, and the sum of
all voltages is given as

\[
\begin{aligned}
0 &= u_L(t) + u_R(t) + u_C(t) = L \frac{di(t)}{dt} + R\,i(t) + \frac{1}{C} q(t) \\
  &= L \frac{d^2 q}{dt^2} + R \frac{dq}{dt} + \frac{1}{C} q. \qquad (12.2)
\end{aligned}
\]

We therefore search for a function $q = q(t)$ which solves the equation above. We try the
ansatz $q(t) = q_0 e^{\lambda t}$. Inserting the ansatz into equation 12.2 we get a quadratic equation for
λ, namely $L\lambda^2 + R\lambda + \frac{1}{C} = 0$. Since we search for oscillating solutions, the discriminant
is constrained by $R^2 - 4\frac{L}{C} < 0$ and therefore the square root is imaginary.
10
Resonance is the phenomenon where an oscillating system gets maximally excited by an external
excitation. Since maximal excitation is not the same as minimal resistance, the frequency we discussed is often
misleadingly called resonance frequency although the resonance frequency is something different.


The general solution is

\[ q(t) = e^{-\delta t} \left( A e^{j\sqrt{\omega_0^2 - \delta^2}\,t} + B e^{-j\sqrt{\omega_0^2 - \delta^2}\,t} \right), \qquad \delta = \frac{R}{2L}, \quad \omega_0 = \frac{1}{\sqrt{LC}}. \]

The constants A and B depend on the conditions at t = 0. In our case we choose them
such that the capacitor is maximally charged at t = 0, which leads to

\[ q(t) = q_0 e^{-\delta t} \cos\left( \sqrt{\omega_0^2 - \delta^2}\,t \right). \]

This means that the circuit oscillates with a frequency $\omega = \sqrt{\omega_0^2 - \delta^2} < \omega_0$ which is
called natural frequency. The oscillation is damped, which is not surprising because energy
gets dissipated at the resistor (see also section 12.4.1).
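
The damping constant and the natural frequency can be read off directly from the roots of the quadratic equation for λ: the real part is −δ and the imaginary part is $\pm\sqrt{\omega_0^2 - \delta^2}$. The following sketch solves the quadratic equation numerically; the function name and the component values are assumptions for illustration only.

```python
import cmath
import math


def characteristic_roots(R, L, C):
    """Both roots of L*lam^2 + R*lam + 1/C = 0 (complex in the oscillating case)."""
    root = cmath.sqrt(R ** 2 - 4.0 * L / C)   # imaginary if R^2 - 4L/C < 0
    return (-R + root) / (2.0 * L), (-R - root) / (2.0 * L)


# Hypothetical example values with weak damping
R, L, C = 10.0, 0.1, 1e-6
delta = R / (2.0 * L)
omega0 = 1.0 / math.sqrt(L * C)

lam_plus, lam_minus = characteristic_roots(R, L, C)
print("roots:             ", lam_plus, lam_minus)
print("-delta:            ", -delta)
print("natural frequency: ", math.sqrt(omega0 ** 2 - delta ** 2))
```

Both roots have the real part −δ, which gives the decaying factor $e^{-\delta t}$, and their imaginary parts are plus and minus the natural frequency.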

Resonance (maximal current)

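If we apply a harmonically oscillating voltage to the circuit we can observe resonance.
Assume we apply the voltage $u(t) = U_0 e^{j\Omega t}$. According to Kirchhoff’s law the applied
voltage is equal to the total voltage over the consuming devices

\[
\begin{aligned}
U_0 e^{j\Omega t} &= jL\Omega\, i(t) + R\, i(t) - \frac{j}{\Omega C}\, i(t) = \frac{-jL\, i(t)}{\Omega} \left( \omega_0^2 - \Omega^2 + 2j\delta\Omega \right), \qquad \delta = \frac{R}{2L}, \quad \omega_0 = \frac{1}{\sqrt{LC}}, \\
i(t) &= U_0\, \frac{j\Omega}{L}\, \frac{1}{\omega_0^2 - \Omega^2 + 2j\delta\Omega}\, e^{j\Omega t}, \\
\frac{1}{Z} &= \frac{j\Omega}{L}\, \frac{1}{\omega_0^2 - \Omega^2 + 2j\delta\Omega}, \\
\frac{1}{|Z|} &= \frac{\Omega}{L}\, \frac{1}{\sqrt{(\omega_0^2 - \Omega^2)^2 + (2\delta\Omega)^2}}.
\end{aligned}
\]

If we search for the frequency with maximal current, we can take the derivative of $\frac{1}{|Z|}$
with respect to Ω and set it equal to zero¹¹.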
11
Since the first derivative vanishes at the extrema.


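\[ 0 = \frac{d}{d\Omega} \frac{1}{|Z|} = \frac{1}{L}\, \frac{D - \Omega\, \dfrac{2(\omega_0^2 - \Omega^2)(-2\Omega) + 4\delta^2\, 2\Omega}{2D}}{D^2}, \]

where $D = \sqrt{(\omega_0^2 - \Omega^2)^2 + (2\delta\Omega)^2}$ is the nasty square root in the denominator of $\frac{1}{|Z|}$.
Multiplying this equation by $L D^3$ one gets

\[ 0 = \left( (\omega_0^2 - \Omega^2)^2 + 4\delta^2\Omega^2 \right) + 2\Omega^2 (\omega_0^2 - \Omega^2) - 4\delta^2\Omega^2 = \omega_0^4 - \Omega^4, \qquad \Omega = \pm\omega_0. \]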

This result was expected because if the impedance is minimal, the current is maximal.

Resonance (maximal voltage)

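Now let us have a look at the voltage drop over the capacitor:

\[
\begin{aligned}
u_C &= -\frac{j}{\Omega C}\, i(t) = U_0\, \frac{1}{LC}\, \frac{1}{(\omega_0^2 - \Omega^2) + 2j\delta\Omega}\, e^{j\Omega t}, \\
U_{C0} &= \frac{U_0}{LC}\, \frac{1}{\sqrt{(\omega_0^2 - \Omega^2)^2 + (2\delta\Omega)^2}} \\
       &= U_0\, \frac{\omega_0^2}{\sqrt{(\omega_0^2 - \Omega^2)^2 + (2\delta\Omega)^2}}
        = U_0\, \frac{1}{\sqrt{\left(1 - \frac{\Omega^2}{\omega_0^2}\right)^2 + 4\, \frac{\delta^2}{\omega_0^2}\, \frac{\Omega^2}{\omega_0^2}}}
\end{aligned}
\]

where $U_{C0}$ is the amplitude of $u_C(t)$. Once again let us have a look at the maximal
voltage. With the same square root D as above we get

\[ 0 = \frac{dU_{C0}}{d\Omega} = -\frac{U_0}{LC}\, \frac{2(\omega_0^2 - \Omega^2)(-2\Omega) + 4\delta^2\, 2\Omega}{2D^3} \quad\Rightarrow\quad \Omega^2 = \omega_0^2 - 2\delta^2. \qquad (12.3) \]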

This means that one can measure the highest voltage at the capacitor at the frequency
$\Omega = \sqrt{\omega_0^2 - 2\delta^2}$. This frequency is usually called resonance frequency. Figure 12.8
shows the voltage over the capacitor for different frequencies depending on the damping
δ. This frequency dependence is called resonance curve. Concerning the asymptotic
behaviour of the resonance curve we get the expected result: For very small frequencies,


the voltage drop over the capacitor converges toward the applied voltage and is exactly the
applied voltage U0 in the case of DC. This is because when applying a constant voltage,
the capacitor corresponds to an interruption in the circuit and therefore the whole voltage
drops over it. For very high frequencies, the impedance of the capacitor goes towards
zero and therefore the voltage drop also goes towards zero.
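
The difference between the frequency of maximal current and the frequency of maximal capacitor voltage can be made visible with a simple numerical scan over Ω. This is only a sketch: a brute-force search on a grid replaces the derivative used in the text, and the function names as well as the (deliberately strongly damped) component values are assumptions.

```python
import math


def current_amplitude(Omega, U0, R, L, C):
    """I0 = U0 / |Z| with Z = R + j(L*Omega - 1/(Omega*C))."""
    return U0 / abs(complex(R, L * Omega - 1.0 / (Omega * C)))


def capacitor_amplitude(Omega, U0, R, L, C):
    """U_C0 = |Z_C| * I0 = I0 / (Omega*C)."""
    return current_amplitude(Omega, U0, R, L, C) / (Omega * C)


# Hypothetical example values with strong damping so that the shift is visible
U0, R, L, C = 1.0, 200.0, 0.1, 1e-6
omega0 = 1.0 / math.sqrt(L * C)
delta = R / (2.0 * L)

# Scan from 0.5*omega0 to 1.5*omega0 in steps of 1e-4*omega0
grid = [omega0 * (0.5 + 1e-4 * k) for k in range(10001)]
Omega_max_current = max(grid, key=lambda W: current_amplitude(W, U0, R, L, C))
Omega_max_voltage = max(grid, key=lambda W: capacitor_amplitude(W, U0, R, L, C))

print("maximal current at:          ", Omega_max_current, " expected:", omega0)
print("maximal capacitor voltage at:", Omega_max_voltage,
      " expected:", math.sqrt(omega0 ** 2 - 2 * delta ** 2))
```

Within the resolution of the grid, the current is maximal at ω0 while the capacitor voltage is maximal at the lower frequency given by equation 12.3. For very small Ω the function `capacitor_amplitude` approaches U0, as discussed above.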

Figure 12.8: Voltage drop UC0 over the capacitor as a function of the applied frequency
for different dampings δ. The y-axis is normalised to the excitation. The
x-axis and the damping δ are normalised with respect to ω0. The dotted line shows the
position of the maxima, i.e. the resonance frequency for each damping (for a given Ω the δ was
calculated according to equation 12.3).


Let us make a summary of the different frequencies:

Name                                  | Formula                           | Property
natural frequency                     | $\sqrt{\omega_0^2 - \delta^2}$    | Frequency at which the circuit oscillates without external excitation.
minimal impedance or maximal current  | $\omega_0$                        | The two cases are the same since for minimal impedance the current gets maximal at a given voltage.
resonance frequency                   | $\sqrt{\omega_0^2 - 2\delta^2}$   | Maximal voltage drop over the capacitor.

Table 12.1: Overview of different frequencies where $\omega_0 = \frac{1}{\sqrt{LC}}$ and $\delta = \frac{R}{2L}$.

12.4 Power consideration and effective values

Until now we always considered equations which were linear in the voltage or in the
current. Therefore there was no problem when adding complex voltages or currents and
then taking the real part to get the quantity one measures at the circuit. Now we look at
the power and therefore multiply current and voltage. So we have to take the real part
first and then multiply the two quantities.

12.4.1 Power

The power consumption at time t of an electrical device is p(t) = u(t)i(t) which is
time dependent. In many cases the instantaneous power is not so important and one is
more interested in the mean power consumption¹². Since we look at periodic oscillations
one only has to consider one period T to calculate the mean value. Let’s apply a voltage
u(t) = U0 cos(ωt) to a device which causes a current i(t) = I0 cos(ωt + ϕ) with phase
shift ϕ flowing through that device. The mean power then is

12
For example consider a resistor: There the mean heat dissipation is more relevant than its instantaneous
power consumption because the resistor also has some heat capacity.


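\[
\begin{aligned}
P &= \frac{1}{T} \int_0^T u(t)\, i(t)\, dt \qquad (12.4) \\
  &= \frac{U_0 I_0}{T} \int_0^T \cos(\omega t) \cos(\omega t + \varphi)\, dt \\
  &= \frac{U_0 I_0}{T} \left( \int_0^T \cos^2(\omega t) \cos(\varphi)\, dt - \int_0^T \cos(\omega t) \sin(\omega t) \sin(\varphi)\, dt \right) \\
  &= \frac{U_0 I_0}{2} \cos(\varphi) \qquad (12.5)
\end{aligned}
\]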

where we use $\cos(\omega t + \varphi) = \cos(\omega t)\cos(\varphi) - \sin(\omega t)\sin(\varphi)$ and
$\sin(\omega t)\cos(\omega t) = \frac{1}{2}\sin(2\omega t)$. The first integral gives¹³ $\frac{T}{2}\cos(\varphi)$ and the second one gives zero.
This means that for fixed current amplitude I0 and voltage amplitude U0 the mean power
depends on their phase shift. To be more precise, the power we calculated is the one that
flows into that device and gets converted into another form of energy, for example heat at
the resistor or rotation in case of an electric motor. The power is maximal for ϕ = 0, so for
example for a resistor. On the other hand the mean power vanishes for ϕ = ±90° which
corresponds to a capacitor or an inductor. In case of the capacitor, the phase shift is ϕ = 90° (see figure 12.3).
After the capacitor is completely charged (which is the case when the voltage at the capacitor
is maximal, i.e. u(t) = U0) the capacitor gets discharged. This means energy flows
back from the capacitor to the source until there is no charge left in the capacitor. This is
the case when the voltage is zero, and then it gets charged again. While charging, energy
flows from the source to the capacitor. In the time diagram, the discharging periods
are those where the voltage and the current have opposite sign and as a consequence
the instantaneous power is negative. Obviously the charging and discharging energy is
the same, which means that the capacitor has no mean power consumption. A similar
argument can be made with a coil where the energy is stored in the magnetic field.
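
The result for the mean power can be checked by averaging the instantaneous power p(t) = u(t)i(t) numerically over one period. The sketch below uses a simple midpoint sum; the function name, the number of steps and the example values are assumptions for illustration.

```python
import math


def mean_power(U0, I0, phi, omega=2 * math.pi * 50.0, n=10000):
    """Average of u(t)*i(t) over one period T, computed with a midpoint sum."""
    T = 2 * math.pi / omega
    dt = T / n
    energy = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        u = U0 * math.cos(omega * t)
        i = I0 * math.cos(omega * t + phi)
        energy += u * i * dt          # energy transferred during dt
    return energy / T


# Hypothetical example: amplitudes 325 V and 2 A
for phi_deg in (0.0, 60.0, 90.0):
    phi = math.radians(phi_deg)
    print(phi_deg, mean_power(325.0, 2.0, phi), 325.0 * 2.0 / 2 * math.cos(phi))
```

For ϕ = 0 the full value $\frac{U_0 I_0}{2}$ is obtained, for ϕ = 60° half of it, and for ϕ = 90° the mean power vanishes up to rounding errors, as for the ideal capacitor.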

13
One can calculate this using partial integration or using that the cosine and the sine are the same function
up to a phase shift and therefore $\int_0^T \cos^2(\omega t)\, dt = \int_0^T \sin^2(\omega t)\, dt = A$ for a certain number A. Then
$A = \int_0^T \cos^2(\omega t)\, dt = \int_0^T \left( 1 - \sin^2(\omega t) \right) dt = \int_0^T 1\, dt - A = T - A$, which leads to $A = \frac{T}{2}$.


12.4.2 Effective values

In direct current, the power is simply P = U I whereas in AC there are the additional
factors¹⁴ $\frac{1}{2}$ and cos(ϕ). One defines

\[ U_\text{eff} = \frac{1}{\sqrt{2}}\, U_0, \qquad I_\text{eff} = \frac{1}{\sqrt{2}}\, I_0, \]

as the effective voltage and effective current. Sometimes these quantities are called the RMS
(root mean square) of the voltage or current. Using Ueff and Ieff one gets the mean power
of a device as

\[ P = U_\text{eff}\, I_\text{eff} \cos(\varphi). \]

In everyday life most voltages and currents are given as their effective value
instead of the amplitude. This has the advantage that one can simply multiply the voltage
and the current in order to get the power consumption¹⁵. As an example, the voltage at
the power sockets has $U_\text{eff} = 230\,\text{V}$, which means that the amplitude of the voltage is
$U_0 = \sqrt{2} \cdot 230\,\text{V} \approx 325\,\text{V}$.
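
The name "root mean square" describes exactly how the effective value can be computed from sampled values of a signal. The sketch below does this for a sinusoidal voltage and, for comparison, for a square wave with the same amplitude. The square wave is our own illustrative addition; it shows that the factor $\frac{1}{\sqrt{2}}$ only holds for sinusoidal signals.

```python
import math


def rms(samples):
    """Root mean square: square all samples, take the mean, take the root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


n = 1000                                    # samples per period (assumed)
U0 = math.sqrt(2) * 230.0                   # amplitude of the socket voltage, about 325 V

sine = [U0 * math.cos(2 * math.pi * k / n) for k in range(n)]
square = [U0 if s >= 0 else -U0 for s in sine]   # square wave with the same amplitude

print("sine:   amplitude", U0, " effective value", rms(sine))
print("square: amplitude", U0, " effective value", rms(square))
```

The sine gives an effective value of 230 V (up to rounding), whereas for the square wave the effective value is equal to the amplitude.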

12.4.3 Active, reactive and apparent power

A closer look at the mean power in equation (12.5) allows an interesting view on the
phase shift. If we apply a voltage u(t) = U0 cos(ωt) to a device, we can rewrite the current
as i(t) = I0 cos(ωt + ϕ) = I0 (cos(ωt) cos(ϕ) − sin(ωt) sin(ϕ)) which is basically a
superposition of a sine and a cosine oscillation with amplitudes −I0 sin(ϕ) and I0 cos(ϕ)
respectively. The cosine oscillation is in phase with the voltage and therefore it
corresponds to the part of the current which dissipates energy at the device. One calls this the
active current, which means that it is the part of the current that does (useful) work. The
sine oscillation does not perform any work on average; it only causes energy to be transferred
from the source to the device and back. This current is called reactive current.
Analogously one can define active, reactive and apparent power. What we have defined
as mean power is also called active power P = Ueff Ieff cos(ϕ) since this is the electric
14
The factor 1/2 is only valid for sinusoidal signals; for other signal forms one has to calculate it from
equation (12.4).
15
Assuming there is no phase shift, ϕ = 0. This is the case for most everyday devices.
Nevertheless, for big electric motors cos(ϕ) is indicated.


power that is used by the electric device. The apparent power S = Ueff Ieff is the total
power that is transferred between the source and the device. One part of this total power
is the part used as active power and the other part is transferred back to the source. The
second part is the energy oscillating between the source and the device. It is called reactive
power and defined as Q = Ueff Ieff sin(ϕ). Using sin²(ϕ) + cos²(ϕ) = 1 one gets the
following relation between the three different powers:

\[ S^2 = P^2 + Q^2. \]

Generally one wants to have small reactive currents because a reactive current does not
transfer energy to the device but causes a larger current than necessary, which leads to
more losses in the cables. There are different possibilities to avoid reactive current. The
simplest is to connect a capacitor or an inductor in parallel to the device such that the total
impedance satisfies ϕ = 0. Then the reactive current only oscillates between the device
and the capacitor or inductor and not through the cables from the power station to the
device.
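
As an illustration of the three powers and of the compensation idea, consider a device modelled as a resistor in series with an inductor. This model, the function names and all values below are assumptions made only for this sketch. Requiring that the imaginary part of the total admittance vanishes gives the parallel capacitance $C = \frac{L}{R^2 + (\omega L)^2}$ for this particular model.

```python
import cmath
import math


def powers(U_eff, I_eff, phi):
    """Active power P, reactive power Q and apparent power S."""
    S = U_eff * I_eff
    return S * math.cos(phi), S * math.sin(phi), S


def compensating_capacitance(R, L, omega):
    """Parallel capacitance that cancels the reactive current of a series R-L device."""
    return L / (R ** 2 + (omega * L) ** 2)


# Hypothetical inductive device at a 230 V, 50 Hz socket
omega = 2 * math.pi * 50.0
R, L, U_eff = 10.0, 0.05, 230.0
Z_device = complex(R, omega * L)

I_eff = U_eff / abs(Z_device)
phi = -cmath.phase(Z_device)          # phase of the current relative to the voltage
P, Q, S = powers(U_eff, I_eff, phi)
print("P =", P, " Q =", Q, " S =", S, " cos(phi) =", math.cos(phi))

# With the capacitor in parallel the total impedance becomes purely real
C_comp = compensating_capacitance(R, L, omega)
Z_total = 1 / (1 / Z_device + 1j * omega * C_comp)
print("C =", C_comp, " phase of Z_total =", cmath.phase(Z_total))
```

The active power is unchanged by the capacitor, but the source now only has to deliver the current belonging to the real impedance `Z_total`, which is smaller than the current through the uncompensated device.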

12.5 Three-phase electric power

The world-wide power grid, which also provides electricity at home¹⁶, is a somewhat more
sophisticated version of AC than the one we have discussed until now. This version is called
three-phase electric power and it has some nice applications we will look at.

12.5.1 Definition and production

Usually, a three-phase power supply consists of three wires and to each wire an AC voltage
is applied. Additionally there is one wire which is called the neutral wire, which we consider
later. The three wires carrying the voltage are called phases. The amplitude of the voltage
is the same in all three phases, but the voltages are shifted by $120^\circ = \frac{2\pi}{3}$ with respect to
each other, which is shown in figure 12.9.

There are different ways to create three-phase power; the easiest is the three-phase
generator. It works like a usual generator but the three coils placed around the rotating magnet
16
The power is brought to your home with a three-phase system and then the three phases are split up.
Therefore at a usual power socket there is only one phase together with the ground wire and the neutral
wire. Only very few power sockets are three-phase sockets. They are usually only needed for machines that
consume a lot of energy. These sockets (usually) have five poles.


Figure 12.9: The three voltages of a three-phase system. The wires are labelled by L1 , L2
and L3 .

are displaced by 120° each, see figure 12.10. The magnet in the center rotates and
therefore the magnetic field at the coils is changing. This induces a voltage in the coils. Since
the coils are displaced by 120°, the induced voltages are also shifted by 120°. Be aware that
the induced voltage at a coil reaches its maximum when the magnet changes its polarity,
because the induced voltage is proportional to the variation of the magnetic field in time.
This variation is maximal when the magnet changes its polarity.
The setup described above allows a visualisation of the phasor: Treat the phasor as a
vector perpendicular to the North-South direction of the magnet such that the maximal
voltage is induced in a coil when the phasor is pointing to that coil. Then the induced
voltage in each coil is the projection of the phasor on the axis of the coil.

12.5.2 Star and Delta circuit

A big advantage of the three-phase power circuit is that there are two possibilities of
connecting a device. A three-phase device is (usually) connected to all three phases and
it (usually) consists of three independent loads, which are drawn as resistors in figure
12.11. The two possibilities how these loads can be connected are shown in figure 12.11. Let
U0 be the amplitude of all three phases; the first phase therefore has the voltage
u1 = U0 cos(ωt), the second phase has u2 = U0 cos(ωt + 120°) and the third one has
u3 = U0 cos(ωt + 240°).


Figure 12.10: A three-phase generator. The three coils are rotated by 120° with respect to
each other. The magnet in the center is rotating with angular frequency ω and induces an
AC voltage in each coil. One pole of each coil is connected to the neutral wire N, the other
poles are the phases L1, L2 and L3.

Star circuit
The intuitively easier one is the star circuit, where each load is connected in the same way
as the coils in the three-phase generator. At the first load (connected to phase L1) the
voltage u1 drops, as expected. The same holds analogously for the other loads. If all loads
have the same impedance then there is no current flowing from the three loads to the
generator through the neutral wire. This is because the sum of all three currents is zero¹⁷.
Therefore it is not necessary to connect the common point of the three loads to the neutral
wire; the loads only need to be connected to each other at one point. It is often useful to
connect the common point with the neutral wire to stabilize the circuit, see also chapter 12.5.3.
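
That the three currents cancel for equal loads can be checked quickly with complex phasors. The following sketch uses a hypothetical load impedance and our own helper names; it only illustrates the statement above.

```python
import cmath
import math


def phase_voltages(U0):
    """Phasors of the three phase voltages, shifted by 120 degrees each."""
    return [U0 * cmath.exp(1j * math.radians(angle)) for angle in (0.0, 120.0, 240.0)]


def neutral_current_balanced(U0, Z):
    """Sum of the three load currents if every load has the same impedance Z."""
    return sum(u / Z for u in phase_voltages(U0))


U0 = math.sqrt(2) * 230.0      # amplitude of each phase
Z = complex(50.0, 20.0)        # hypothetical load impedance, equal for all phases

print(abs(neutral_current_balanced(U0, Z)))   # zero up to rounding errors
```

Geometrically, the three current phasors have the same length and are rotated by 120° with respect to each other, so they add up to a closed triangle.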

17
This can easily be seen if one adds the currents as phasors. Since we assume that all the three impedances
are the same, the three currents have the same amplitude and are shifted by 120° with respect to each other.
The three phasors form an equilateral triangle, therefore the starting point is equal to the ending point and
as a consequence the sum of the three currents is zero.


Figure 12.11: Left side: The star circuit, where each load is connected to a phase and the
neutral wire. Right side: The Delta circuit, where each load is connected to two phases; the
neutral wire is not used at all.

Delta circuit
The other possibility is the Delta circuit where each load is connected to two phases. The
advantage of this circuit is that the voltage drop over each load is higher. To understand
this let’s have a look at the voltage u(t) over one load, for example connected to L1 and
L2:

\[
\begin{aligned}
u(t) &= u_1 - u_2 = U_0 \left( \cos(\omega t) - \cos(\omega t + 120^\circ) \right) \\
     &= U_0 \left( \cos((\omega t + 60^\circ) - 60^\circ) - \cos((\omega t + 60^\circ) + 60^\circ) \right) \\
     &= U_0 \big( \cos(\omega t + 60^\circ)\cos(60^\circ) + \sin(\omega t + 60^\circ)\sin(60^\circ) \\
     &\qquad - \cos(\omega t + 60^\circ)\cos(60^\circ) + \sin(\omega t + 60^\circ)\sin(60^\circ) \big) \\
     &= \sqrt{3}\, U_0 \sin(\omega t + 60^\circ).
\end{aligned}
\]

This means that the voltage drop over each load has an amplitude of $\sqrt{3}\, U_0$.
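
The same result follows in one line from the difference of two phasors. In the sketch below the function name is hypothetical, and the value of 230 V is the effective socket voltage mentioned in section 12.4.2.

```python
import cmath
import math


def line_to_line_phasor(U0):
    """Phasor of u1 - u2 for two phases of amplitude U0 shifted by 120 degrees."""
    u1 = U0 * cmath.exp(1j * 0.0)
    u2 = U0 * cmath.exp(1j * math.radians(120.0))
    return u1 - u2


U_eff = 230.0                               # effective voltage of one phase
u12 = line_to_line_phasor(U_eff)

print(abs(u12), math.sqrt(3) * U_eff)       # both are about 398 V
print(math.degrees(cmath.phase(u12)))       # phase of the difference: -30 degrees
```

The phase of −30° agrees with the result above, because $\sin(\omega t + 60^\circ) = \cos(\omega t - 30^\circ)$. In terms of effective values, a phase voltage of 230 V therefore corresponds to roughly 400 V between two phases.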

To start a big electric motor one can make use of the properties of both the star and the
Delta circuit. When starting, the motor needs a lot of current. If one applies a smaller voltage,
the current is also smaller¹⁸. So one starts the motor in the star circuit and as soon as it
rotates with constant speed, one connects it as Delta circuit and the motor has more
18
This might sound trivial but it is important for the fuse because it should not melt.


power because the load is connected to a higher voltage. In the past this was often used
to start large motors and there were special switches which allowed one to change easily from
the star to the Delta circuit. Of course it is a bad idea to connect a motor which is not
built for a Delta circuit to a Delta circuit as it might get damaged.

12.5.3 Advantage of a three-phase system

There are different reasons why our power supply at home is a three-phase system. We
now investigate some of them.

Role of neutral wire

As we have seen in the section about the star circuit (see section 12.5.2), there is no
current flowing through the neutral wire as long as the impedances at the three phases
are the same. But if the impedances of the three consumers are not the same, a current
flows through the neutral wire. If the common point in the star circuit is not connected
to the neutral wire, this common point has a non-zero voltage with respect to the neutral
wire. This is because at the common point the total current has to be zero¹⁹. This is only
possible if the three currents are the same²⁰. Since not all impedances are the same, the
voltage drops over the loads will not be the same and as a consequence the sum of the
voltages will not be zero. This sum of the voltage drops is the same as the voltage between
the common point and the neutral wire. To guarantee that the voltage drop over all the
consumers is the same, one connects the common point with the neutral wire and then
the neutral wire acts like a reference point where current can flow away if there are different
impedances.
This is exactly the situation at home: The houses are connected to the three-phase net.
In the house the different sockets are connected to different phases and to the neutral
wire. If all the phases are loaded the same (connected with the same impedance), then
there is no current flowing out of the house through the neutral wire. But if for example
only one power socket is used then the current flows through the neutral wire out of the
house.
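
Both situations can be put into numbers with phasors. If the common point is connected to the neutral wire, the neutral current is simply the sum of the three load currents. If it is not connected, requiring that the currents into the common point add up to zero gives its voltage $v = \frac{\sum_k u_k / Z_k}{\sum_k 1/Z_k}$ with respect to the neutral wire (this relation is known as Millman's theorem; it is not derived in this script). The function names and load values below are assumptions for illustration.

```python
import cmath
import math


def phase_voltages(U0):
    """Phasors of the three phase voltages, shifted by 120 degrees each."""
    return [U0 * cmath.exp(1j * math.radians(angle)) for angle in (0.0, 120.0, 240.0)]


def neutral_current(U0, impedances):
    """Current in the neutral wire if the common point is connected to it."""
    return sum(u / z for u, z in zip(phase_voltages(U0), impedances))


def star_point_voltage(U0, impedances):
    """Voltage of an unconnected common point relative to the neutral wire."""
    return neutral_current(U0, impedances) / sum(1 / z for z in impedances)


U0 = math.sqrt(2) * 230.0
balanced = (100.0, 100.0, 100.0)      # hypothetical equal loads (ohm)
unbalanced = (100.0, 100.0, 50.0)     # hypothetical unequal loads (ohm)

for loads in (balanced, unbalanced):
    print(loads, abs(neutral_current(U0, loads)), abs(star_point_voltage(U0, loads)))
```

For equal loads both values vanish. For the unequal loads chosen here, a current of amplitude $U_0/100\,\Omega$ flows through the connected neutral wire, or, without the connection, the common point is shifted by $U_0/4$ so that the loads no longer see the same voltage.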

Energy transfer
Concerning the energy transfer the three-phase system offers a very efficient way to trans-
port electrical energy over a (long) distance. If the impedance at the three phases is the
19
There is no capacitor and Kirchhoff’s law holds.
20
To distinguish: The three currents are the same if the common point is not connected to the neutral
wire. They are not the same if the common point is connected to the neutral wire.


same then (nearly) no current flows through the neutral wire. Therefore we only need
the three phases to transport the electrical energy, whereas in a simple AC circuit we need
two wires for a closed circuit, which corresponds to one single phase. This means that in the
three-phase system the three wires transport as much energy as three simple AC circuits
consisting of six wires.
This gets very important if one wants to transport a lot of energy over a long distance
because a lot of energy means thick cables and long distances mean long cables. Hence, a
lot of wire. The power grid consists of three thick cables which carry the phases
and one thin cable for the neutral wire (to stabilize the grid). The different phases are
then cleverly connected to the different houses (or even different villages) such that the
impedance on all three phases is approximately the same. In terms of whole villages
it is not relevant if someone at home uses a lot of power from one single phase²¹. In
conclusion the impedances applied to the three phases can be considered the same.

Three-phase motor
The three-phase motor is in principle the same as a generator (see figure 12.10) but instead
of turning the magnet and inducing a voltage in the coils, a voltage is applied to the
coils, which creates a magnetic field that causes the magnet to turn. Since the
magnetic field of a coil is proportional to the current flowing through it, the current is
proportional to the applied voltage²² and the voltage corresponds to a rotating phasor, the
magnetic field also rotates. The direction of the rotation (whether it rotates clockwise
or anti-clockwise) depends on how the three phases are connected to the three coils: in
clockwise or anti-clockwise order. This means it depends on whether the three coils coil 1,
coil 2 and coil 3 are connected to L1, L2 and L3 respectively or to L1, L3 and L2. By
swapping the phases of two coils the motor changes its direction of rotation.
This property seems pretty unspectacular but it allows one to run a motor in a certain direction,
which is not that simple to achieve in a simple AC circuit: In a simple AC circuit there is
no rotating magnetic field (because there is only one phase). There is only a magnetic field
that always points along the same axis and changes its strength. As a consequence the
motor does not know in which direction it should turn, so it starts turning in an arbitrary
direction. Of course there are some tricks how one can force the motor of a simple AC
circuit to turn in a specific direction but it is more complicated than in the case of a
three-phase system.
21
Additionally each phase at home is connected to a fuse, the power consumption is therefore limited. In
terms of the whole village the consumption of one single house is small/negligible
22
Maybe with phase shift but this phase shift is the same for the three coils in a three-phase motor.

Chapter 13

SPECIAL RELATIVITY
Everything is relative.
Certainly not Albert Einsteinᵃ

ᵃ As explained in this chapter, not everything is relative.

13.1 Historical Milestones
13.2 Galileo transformations
13.3 Lorentz transformation
13.4 Minkowski metric
13.5 Velocities
13.6 Dynamics
13.7 Paradoxes

Roughly 100 years ago there were some phenomena in different fields of physics that
could not be explained by the theories of that time. Einstein recognised that (at least
some of) these phenomena can be described using a modification of classical mechanics
and taking into account how one observes another reference frame. The complete theory
of relativity is very complex and needs a lot of maths. Nevertheless there is a small part of
it which already gives a lot of interesting insights and which gets by with high school
maths. This easier part is called special relativity and the more general theory is called
general relativity. Unless mentioned otherwise, in this chapter we will always do
special relativity and sometimes simply call it relativity. The main sources of this chapter
are [63] and [64].
In this chapter we will first have a look at classical mechanics. After we understand the
most important properties of classical mechanics we will look first at the assumptions of
special relativity and then at its conclusions and interpretations. In the end we will resolve
some of the most common paradoxes. Before we start doing physics let’s have a look at
some historical milestones that led to relativity.

13.1 Historical Milestones

Before Einstein developed relativity there were different experiments and parts of the
theory which were incompatible. In electrodynamics this inconsistency was especially
big. Let’s have a look at the big questions physicists had roughly 150 years ago.

13.1.1 Aether and electromagnetic waves

In 1864, Maxwell formulated the famous Maxwell’s equations which describe (nearly)
everything in classical electrodynamics. These equations also predict electromagnetic waves,
which were then experimentally confirmed by Heinrich Hertz in 1887. If one thinks about
waves in everyday life one always imagines a medium in which these waves propagate. For
example sound waves consist of air (or another medium) which gets compressed and
stretched. But the important thing is that there is a medium in which the wave propagates,
i.e. if there is no medium, nothing can get compressed and stretched so no sound can
propagate. Only in the reference frame where the medium is at rest do the wave equations
take the simple form; for all other reference systems one has to add additional terms¹.
Until the beginning of the 20th century physicists thought that light would also propagate
in a medium called aether. But Michelson and Morley showed in their famous experiment
that this cannot be true. For this they measured the speed of light in the direction of the
rotation of the earth and perpendicular to it. They thought that the aether should rotate a
1
To show this mathematically one needs a bit more math, therefore we skip this here


bit slower than the earth because it must somehow be attached to the rest of the universe which
is not rotating. Therefore the speed parallel and perpendicular to the direction of rotation
should differ, which was not the case. Since light propagates as a wave it fulfils the wave
equations. In the frame of the medium they take the simple form given in equation 7.1. If
one calculates the transition from one reference frame to another (moving) one using classical
mechanics, the simple wave equation changes. Since light does not travel in a medium, the
simple wave equations must hold in all reference frames, which contradicts the calculation
of classical mechanics. This was a clear hint that something was wrong. At the beginning
of the 20th century, famous physicists such as Lorentz and Poincaré figured out how one has to
calculate the transition from one reference frame to the other. This led to the Lorentz
transformation, which is nowadays the basis of relativity. So they found the equations
some years before Einstein, and Einstein surely knew the equations already. But the big
success of Einstein was to give the equations a physical meaning. Lorentz and
Poincaré (and many others) didn’t realise that these equations contain a deep property of
nature; they simply used these equations to resolve the contradiction in the theory.

13.1.2 Flying electron

We now want to look at an example where classical mechanics leads to a contradiction.
Consider the case where a current flows through a wire and an electron is flying with a
constant velocity parallel to the wire, see figure 13.1. The current in the wire creates a
magnetic field around the wire. The Lorentz force acting on the moving electron pulls it
towards the wire (or pushes it away, depending on the directions of the current and of the
electron’s motion). This is different if we change the reference system. Let’s choose the
system moving with the electron. In this system the electron does not move. As a
consequence there is no Lorentz force and therefore the electron will not be attracted by
the wire.
This seems to be a paradox, but it can be resolved using special relativity. We will
have a look at this at the end of this chapter, when we understand relativity better.


Figure 13.1: In the first reference frame (left) the electron (black dot) is moving parallel to
the wire through which a current flows. The current creates a magnetic field which causes
a force acting on the electron (Lorentz force). The second reference frame (right) moves
with the electron. Since the electron is not moving there, no Lorentz force is acting.


13.2 Galileo transformations

In order to understand the basic concepts of relativity, such as the transformation from one
reference system to another, let’s first have a look at these concepts in classical mechanics.
There the transformation from one reference system to another is called Galileo
transformation. But before we start with the Galileo transformation we have to define what a
reference system, also called frame of reference, is and which kinds of frames we will look
at².

13.2.1 Reference System

To describe physics it is often very useful to work in a reference system. A reference
system consists of a choice of an origin and three axes. Usually one takes the axes pairwise
orthogonal, which gives a Cartesian coordinate system³. In this system each object has
at every time $t$ a certain position $\vec{x}(t)$. The origin is usually fixed at a certain object,
such as the sun if one wants to describe the motion of the planets, or one corner of a desk
when describing the motion of a ball on the desk. As a consequence the object which is
connected to the origin does not move in that frame of reference, because it has for all
times the position $\vec{x}(t) = \vec{0}$. If we fix the origin at a certain object X (X is here the name
of the object) we say that we describe the problem in the reference frame of X or in the
rest frame of X. For example, in section 13.1.2 we described the problem in the reference
system of the wire (left) and in the system of the electron (right)⁴.
Sometimes we will give our reference frames names; for example we will often denote
two frames as Σ and Σ′. One then says that a quantity, for example time or position, is
measured in some (specific) frame. This often sounds pretty abstract and it is sometimes
useful to visualise it. One convenient visualisation is to think of Σ as the rest
frame of the earth. Measuring a quantity in Σ then means that someone (for example you)
is standing on the earth and measures this quantity. If there are two frames involved, it
is often easy to think of a space ship as the second frame Σ′. For this assume that a
colleague of yours flies in a space ship and performs his own measurement.
There are different names for the most important and most common reference frames.

² The transformation between arbitrary reference frames is described by general relativity.
³ One could also take polar or spherical coordinates. The choice depends on the symmetry of the problem,
but in many cases Cartesian coordinates are a good choice (at least to start with).
⁴ In that example it was not necessary to choose concrete axes, so we left them out.

Rest frame
As already defined above this is the system which is connected to a certain object. This
object will then rest in that system, therefore it’s the rest frame of that object.

Laboratory frame of reference

This is usually the system where we start describing a problem. If one performs a mea-
surement this is the rest frame of the experiment.

Centre-of-momentum frame
This is the frame where the total momentum is zero. If the momentum is conserved⁵,
the total momentum will stay zero, which often simplifies the calculation.

Inertial frame of reference

Since this is a very important type of reference frame, let’s have a closer look at it in the
next section.

13.2.2 Inertial frame of reference

Of particular importance is the inertial frame of reference. It is a frame where Newton’s
laws are valid⁶. In particular this means that if no force is acting on an object, this object
will stay at rest or move with a constant velocity.
For example the earth is approximately an inertial frame⁷, because any force acting on an
object (for example gravity) is proportional and parallel to the acceleration. It is not exactly
an inertial frame because the earth is moving around the sun and rotating around its axis;
therefore it is accelerated and there are fictitious forces such as the centrifugal force or the
Coriolis force.
If we look at a space station orbiting the earth, for example the ISS, we might
think this is an inertial system. But in fact it is not, because all the masses in the space
station seem to be weightless (for example a floating pen), although there is a gravitational
force acting on them.
Once we have found an inertial frame of reference, any system moving with a constant
velocity relative to that system is also an inertial system. In Newtonian mechanics this is obvious.
Let’s assume the frame Σ is an inertial system and that the system Σ̃ is moving with velocity
$v(t)$ measured in Σ. If a force $F$ (measured in Σ) is acting on an object, the object gets
accelerated with $a = \frac{F}{m}$, since Σ is an inertial frame of reference. If in Σ̃ the same force
$\tilde{F}$ and the same acceleration $\tilde{a}$ are measured, then Σ̃ is also an inertial system. This
restricts the velocity $v(t)$, since

$$a = \tilde{a} + \frac{dv}{dt} = \tilde{a},$$

where the first step is simply the transformation of the acceleration between two reference
frames (in Newtonian mechanics) and the second step is the assumption that the two
accelerations are equal. The consequence is that $\frac{dv}{dt} = 0$ and therefore the velocity is
constant (including $v = 0$). This means that Σ̃ is an inertial frame if and only if it is
moving with constant velocity with respect to Σ; in addition it may be displaced by a
constant vector or rotated by a fixed angle (assuming Σ is an inertial system).

⁵ This means no external forces act on our system.
⁶ This means a force acting on an object is proportional to its acceleration: $\vec{F} = m\vec{a}$.
⁷ Neglecting relativistic effects due to gravity.

13.2.3 Galileo Transformation

Now we have enough definitions to look at the Galileo transformation. This transformation
describes how time and space transform between different inertial frames in
classical mechanics. Let Σ and Σ′ be two inertial frames of reference⁸. In classical mechanics
there is no reason why the time in one frame should elapse slower or faster than in the
other⁹. This means the time in both frames is the same, or mathematically $t = t'$. This
is the transformation of time between the two systems. To examine the transformation
of space, assume that the axes of the two systems are parallel to each other and that Σ′ is
moving with a constant speed $v$ along the x axis of Σ (see figure 13.2). In Σ the origin of
Σ′ has therefore the coordinates

$$O_{\Sigma'} = \begin{pmatrix} vt \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} x_0 \\ 0 \\ 0 \end{pmatrix},$$

where $x_0$ is the position of the origin of Σ′ at $t = 0$ measured in Σ. Choosing $x_0 = 0$ (the two
origins coincide at $t = 0$), a point $\vec{r}\,'$ measured in Σ′ has in Σ the coordinates $\vec{r}$ with

$$\vec{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} vt \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} vt \\ 0 \\ 0 \end{pmatrix} + \vec{r}\,'.$$

This means that the y and z coordinates of the two systems are always the same, i.e.
$y = y'$ and $z = z'$. The x coordinate transforms as $x = vt + x'$ or $x' = x - vt$.

⁸ We denote all quantities measured in Σ′ with a prime (′) and those measured in Σ without.
⁹ They might have a different zero point of time and therefore the time differs by a constant. But this
constant has no influence on the physics, so we can without loss of generality set it to zero.

Figure 13.2: The two frames represented by their axes. One is moving with respect to the
other along the x axis with the speed v.

If we put all the transformations together we get

$$\begin{aligned}
t &= t' \\
x &= vt + x' \\
y &= y' \\
z &= z'
\end{aligned}$$

where obviously time and space only mix in the x component. Loosely speaking: time
influences measurements of space but not the other way round. This will be different in relativity.
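
The Galileo transformation can be tried out numerically. The following Python sketch is only an illustration: the function names `galileo_transform` and `spatial_distance` and all sample numbers are assumptions of this example and do not come from the script. It maps two simultaneous events from Σ to Σ′ and compares their spatial distance in both frames.

```python
import math


def galileo_transform(t, x, y, z, v):
    """Map an event (t, x, y, z) measured in Σ to Σ′.

    Assumption of this sketch: Σ′ moves with constant speed v along the
    x axis of Σ and the two origins coincide at t = 0.
    """
    return (t, x - v * t, y, z)


def spatial_distance(event_a, event_b):
    """Euclidean distance between the positions of two events."""
    return math.dist(event_a[1:], event_b[1:])


v = 4.0                              # speed of Σ′ in m/s (made-up value)
event_a = (2.0, 10.0, 1.0, 0.0)      # (t, x, y, z) in Σ
event_b = (2.0, 13.0, 5.0, 0.0)      # same time, different place

a_prime = galileo_transform(*event_a, v)
b_prime = galileo_transform(*event_b, v)

print(a_prime, b_prime)
print(spatial_distance(event_a, event_b), spatial_distance(a_prime, b_prime))
```

Only the x coordinate changes, and the distance between the two simultaneous events is the same in Σ and Σ′.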


13.3 Lorentz transformation

In this chapter we will describe the basic transformation in relativity. Further
phenomena will be discussed in the next chapter.

13.3.1 Einstein’s Postulates

All of special relativity can be derived from two assumptions about nature. These
assumptions are called Einstein’s postulates, which are:

1. The laws of physics are the same in all inertial frames of reference.

2. The speed of light, denoted by c, has a finite value¹⁰.

Sometimes the postulates are formulated in a slightly different way, but the conclusions
remain the same. In the regime of classical mechanics, and considering the definition of
inertial frames, it is clear that the physics in all inertial frames is the same¹¹. In relativity
we have to state this explicitly, because we cannot assume any more that the acceleration
has the same value in two frames. The second postulate needs some more explanation.
What is called the speed of light is in fact a much more important speed, namely the speed
of information. A precise discussion of this would go beyond the scope of this script and is
not needed to understand relativity, so we only illustrate it with a small example. Consider
two objects that interact with each other, for example by electromagnetic or gravitational
forces. This interaction propagates only with the speed of light, which means that one
object “sees” or “feels” the other object where it was some time before. This is because
the information about where the objects are spreads with the speed of light, so the objects
have already moved a bit before one sees their position.

Without calculation we can already state some interesting things.

• Since the physics is the same in all inertial frames and the speed of light is a physical
property, the speed of light has the same value in all inertial frames, independent
of the movement of the light source or of the observer.

• Since the speed of light is the maximal possible speed¹², the addition of velocities
will not behave the same as in classical mechanics.
¹⁰ The value of c is $c = 299\,792\,458\ \mathrm{m\,s^{-1}}$.
¹¹ At least if the speed of light is considered infinite.
¹² If something were faster than light, it could deliver information faster than light, which
contradicts the fact that information cannot be delivered faster than light.

In the following we will always look at inertial frames of reference because for all other
frames one needs one more postulate which leads to general relativity.

13.3.2 Synchronisation of clocks

Since time will not be an absolute quantity¹³ in relativity, we have to find a method to
synchronise two clocks. Two clocks are synchronised if they “show” the same
time. The quotation marks are needed because if we look at two distant clocks we might see
two different times, since the light from the clock further away needs longer to reach our
eyes than the light from the nearer one. If the distance from the observer to each clock is
the same, then the observer sees the same time on the two synchronised clocks. In fact
“show” is defined more precisely by the process of synchronisation. For the synchronisation
we of course neglect all effects that might happen in a real-life experiment, such as imprecise
clocks, delayed detectors and so on.
To synchronise two clocks we send out a light pulse at the time $t_0$ from one clock, which we
call clock 1, to the other, clock 2. There it is (immediately) reflected and returns to
clock 1. The whole time from clock 1 to clock 2 and back to clock 1 we denote as $\Delta t$.
Clock 2 is now set such that at the moment the light pulse arrived it would have shown¹⁴
the time $t_0 + \frac{\Delta t}{2}$. We have therefore constructed the synchronisation such that two
light pulses which are sent out from the two clocks at the same time meet exactly in the
middle between the two clocks. As we will see later, simultaneity is not the same in all
frames; therefore it does not make sense to synchronise two clocks that belong to different
frames of reference.
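
The synchronisation rule can be written down in a few lines. The following Python sketch is a minimal illustration under stated assumptions: the function names, the clock separation and the reading of the unsynchronised clock 2 are made up for this example. It computes the time clock 2 has to show at the reflection and the correction that has to be applied afterwards.

```python
C = 299_792_458.0  # speed of light in m/s


def reading_at_reflection(t0, round_trip_time):
    """Time clock 2 has to show at the moment the pulse is reflected."""
    return t0 + round_trip_time / 2


def clock2_correction(t0, round_trip_time, recorded_arrival):
    """Amount by which clock 2 has to be advanced (negative: set back)."""
    return reading_at_reflection(t0, round_trip_time) - recorded_arrival


distance = 3.0e8                     # separation of the clocks in m (made-up value)
t0 = 10.0                            # emission time shown by clock 1 in s
round_trip_time = 2 * distance / C   # what clock 1 measures as Δt

# Hypothetical reading of the still unsynchronised clock 2 at the reflection.
correction = clock2_correction(t0, round_trip_time, recorded_arrival=12.5)

print(reading_at_reflection(t0, round_trip_time), correction)
```

Clock 2 only has to remember its own reading at the reflection; the correction can be applied once clock 1 has measured the round trip time.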

¹³ Absolute time means time does not depend on the position and the reference frame.
¹⁴ This looks as if we had to set the time of clock 2 before we actually know which time we have to set.
But this is not necessary, because clock 2 can remember the time the light pulse arrived; then, comparing
with the time $t_0 + \Delta t$ at clock 1, one can calculate by which amount one has to change the time at clock 2.

13.3.3 Time dilation

Let’s start examining how the different quantities transform from one inertial frame to
another. We start with a time interval. It shall be pointed out already now that the
transformation of time itself behaves differently than the transformation of time intervals,
which can be seen in section 13.3.6. To investigate the transformation of time intervals
we need a clock or, even more fundamentally, something that creates a periodic signal. The
simplest clock is the so-called light clock, which is shown in figure 13.3. A light clock
consists of two mirrors between which a short light pulse is trapped. Assume we observe
a light clock moving parallel to the mirrors with a constant velocity $v$. The frame from which
we observe the moving clock we call Σ, and the rest frame of the clock Σ′. One period $\Delta t'$
in Σ′ is then the time the light needs to make one complete cycle, which is

$$\Delta t' = \frac{2L'}{c},$$

where $L'$ is the distance between the mirrors and $c$ is the speed of light, which is the same
in all frames (and therefore carries no prime).

Figure 13.3: Left: the light clock in its own rest frame. The light pulse is moving
up and down between the mirrors. Right: the same light clock observed from a frame
which is moving with respect to the clock. The light is not moving straight up and down
but follows a zigzag path.

In our system we see the same distance between the mirrors¹⁵, so $L = L'$. In Σ, however,
the light does not move perpendicularly to the mirrors, as the whole clock is moving. The
period is defined as the time the light needs to go from one mirror to the other and back.
In Σ this is

$$\Delta t = \frac{2L}{v_\perp},$$

where $v_\perp$ is the component of the light’s velocity perpendicular to the mirrors. Since the
light moves with speed $c$ and its component parallel to the mirrors is $v$, Pythagoras gives
$v_\perp = \sqrt{c^2 - v^2}$, and one gets for $\Delta t$:

$$\begin{aligned}
\Delta t &= 2L\,\frac{1}{v_\perp} \\
&= 2L\,\frac{1}{\sqrt{c^2 - v^2}} \\
&= \frac{2L}{c}\,\frac{1}{\sqrt{1 - \beta^2}} = \gamma \Delta t',
\end{aligned}$$

where $\beta$ and $\gamma$ are often-used abbreviations defined as

$$\beta = \frac{v}{c}, \qquad \gamma = \frac{1}{\sqrt{1 - \beta^2}}.$$

¹⁵ This would follow if one derived relativity more rigorously, but this would need more math and would
not give more insight.

This means that if on a clock in Σ′ a certain time interval passes, for example one second, in
our frame Σ more time passes, for example one and a half seconds. Since both systems
are inertial systems, one cannot tell which of the two is moving. Therefore it does
not matter whether we look at a light clock or a (Swiss) watch: all clocks at rest in the same
frame show the same time and therefore show the phenomenon described above. This means
we see the clocks in a moving frame going slower than ours. This phenomenon is called time
dilation.
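
The light clock argument can be checked numerically. The following Python sketch is an illustration only: the function names `gamma` and `light_clock_period` and the chosen mirror distance are assumptions of this example. The value of β is picked such that γ = 1.5, which corresponds to the example of one second turning into one and a half seconds.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(beta):
    """Lorentz factor for beta = v / c with 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def light_clock_period(mirror_distance, v):
    """Period of a light clock moving with speed v parallel to its mirrors."""
    v_perp = math.sqrt(C ** 2 - v ** 2)   # Pythagoras: c^2 = v^2 + v_perp^2
    return 2.0 * mirror_distance / v_perp


L = 1.0                          # mirror distance in m (made-up value)
beta = math.sqrt(5.0) / 3.0      # about 0.745, chosen so that gamma is 1.5
rest_period = 2.0 * L / C        # period in the rest frame of the clock
moving_period = light_clock_period(L, beta * C)

print(gamma(beta), moving_period / rest_period)
```

Both printed numbers should agree up to rounding, because the ratio of the two periods is exactly the factor γ.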

13.3.4 Lorentz contraction

Similarly to time, we can have a look at distances. Since we don’t know yet how distances
transform, we cannot simply compare an unknown distance with a metre stick in two
different systems¹⁶. Since the speed of light is the same in all systems, we can reduce this
problem to measuring the time the light needs to pass a certain distance. First we notice
that distances perpendicular to the direction of motion do not change¹⁷. This is because
if the light moves perpendicular to the direction of motion there is no difference in time
whether it moves in one or the other direction (to understand this, keep on reading).
A first (wrong) approach to get the transformation for distances parallel to the direction
of motion might be $\Delta L = c\Delta t$, which would then lead to $\Delta L = \gamma \Delta L'$. This would mean
that a moving object seems longer than in its rest frame (which is wrong). The mistake in
this derivation is that we did not take into account that in a moving system
the time the light needs for a certain distance depends on whether the light propagates in the
direction of flight or opposite to it (see figure 13.4). In fact, moving objects seem shorter
than in their rest frame.

¹⁶ We don’t know how long a metre seems in a moving system.
¹⁷ We have already implicitly used this in the derivation of the time dilation.

Figure 13.4: 1): Sending out a light pulse. 2) Arrival of the light pulse at the mirror after
the time ∆t1 . 3) Reflection of the light at the mirror (this happens immediately). 4) Arrival
of the light pulse at the detector, the time the light needs from the mirror to the detector
is ∆t2 .

To get the right result we have to do the following: We send out a light pulse along the
direction we would like to measure. At the other end of this distance we place a mirror
which reflects the light pulse. The time the light needs to pass the distance source-mirror
we denote by $\Delta t_1$, the time from the mirror back to the detector by $\Delta t_2$. Measuring the time
for twice the distance ($\Delta t_1 + \Delta t_2$) and multiplying it by $\frac{c}{2}$ we get the right length. In the
rest frame this is easy because the times to the mirror and back are the same. We denote
this distance by $\Delta L'$ (we call the rest frame Σ′ again, since we want to measure the distance
from our system Σ; Σ′ is therefore the system moving with respect to our system Σ). The time
for the distance in Σ′ is therefore $\Delta t' = \frac{\Delta L'}{c}$. In Σ we get for $\Delta t_1$ and $\Delta t_2$

$$\Delta t_1 = \frac{\Delta L}{c - v}, \qquad \Delta t_2 = \frac{\Delta L}{c + v},$$
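where the denominators are given this way because the light pulse moves in the direction of
flight or opposite to it. The fact that we simply add or subtract velocities is no
contradiction to the statement above, because there is no particle moving with $c + v$. The
term $c + v$ appears because both the light pulse and Σ′ are moving. From time dilation we
know that $\Delta t_1 + \Delta t_2 = \gamma(\Delta t'_1 + \Delta t'_2) = 2\gamma \Delta t'$, and using some math we get

$$\begin{aligned}
\Delta t' &= \frac{1}{2\gamma}\left(\Delta t_1 + \Delta t_2\right) \\
\frac{\Delta L'}{c} &= \frac{1}{2\gamma}\left(\frac{\Delta L}{c - v} + \frac{\Delta L}{c + v}\right) \\
&= \frac{2c\,\Delta L}{2}\,\frac{\sqrt{1 - \frac{v^2}{c^2}}}{c^2 - v^2} \\
&= \frac{\Delta L}{c}\,\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \\
\Delta L' &= \gamma \Delta L.
\end{aligned}$$

Since $\gamma \geq 1$, a moving object seems shorter than it is in its rest frame.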
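
The derivation can be cross-checked with numbers. The following Python sketch is a hedged illustration: it works in units where c = 1, and the function names and the rest length of 10 are assumptions of this example. It inserts the contracted length into the two light travel times and compares the sum with the time-dilated round trip time of the rest frame.

```python
import math


def gamma(beta):
    """Lorentz factor for beta = v / c with 0 <= beta < 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def round_trip_time(length, v, c):
    """Light travel time to the mirror and back for a rod moving with speed v.

    length is the length of the rod measured in the frame in which it moves.
    """
    return length / (c - v) + length / (c + v)


c = 1.0                 # units with c = 1 (assumption of this sketch)
beta = 0.6
rest_length = 10.0      # ΔL′, length in the rest frame of the rod

contracted_length = rest_length / gamma(beta)        # ΔL = ΔL′ / γ
t_rest = 2.0 * rest_length / c                        # round trip time in Σ′
t_moving = round_trip_time(contracted_length, beta * c, c)

print(t_moving, gamma(beta) * t_rest)
```

The two printed times should agree up to rounding, which is exactly the condition Δt₁ + Δt₂ = 2γΔt′ used in the derivation. With the uncontracted length the check would fail.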

13.3.5 Symmetry of time dilation and Lorentz contraction

According to the first postulate, the equations (time dilation and Lorentz contraction)
look the same in all inertial frames. Assume you are observing a space ship. According
to time dilation you see the clocks in the space ship going slower than yours, with
$\Delta t = \gamma \Delta t'$. A colleague in the space ship would observe the same with your clocks, namely
that your clocks go slower than his. According to the colleague, the equation $\Delta t' = \gamma \Delta t$
holds.
One might now argue that from this follows $\Delta t = \gamma \Delta t' = \gamma^2 \Delta t$ and therefore
$\gamma^2 = 1$, which of course questions everything we did (since $\gamma > 1$ if $v > 0$). This argument
is of course wrong, because we have to compare like with like, which we
did not. When observing a moving clock (which goes slower than yours) one has to
measure the time at two different places (whereas the moving clock stays
at the same position in its rest frame). This leads to an asymmetry in measuring the time on
a clock.

13.3.6 Lorentz transformation

Similarly to the Galileo transformation we can now calculate the transformation of space
and time between moving reference frames. To do this we first have to be aware of what
we mean by transforming time and space. From classical mechanics we can keep the
definition of a (spatial) reference frame as we have defined it in section 13.2.1, but since
space and time will mix, we have to develop a new intuition for time¹⁸. For this we
imagine an (imaginary) clock at every point in space, and each inertial frame of
reference has its own clocks. All the clocks belonging to one system show the same time
in their rest frame (see also section 13.3.2 about the synchronisation of clocks). We can then
compare the time shown on the clocks of different frames at the same position. This
is necessary because if we compared the time shown on clocks at different positions we
would have to take into account the time the light needs from the clock to our eyes. In
relativity, this union of space and time is called spacetime¹⁹.

¹⁸ In classical mechanics time is absolute, which means that it is a given quantity for all frames (up to a
constant shift in time).
¹⁹ Mathematically it is a 4-dimensional (vector) space, but with an important difference to the usual
4-dimensional (spatial) space: there is a different measure of distances, see section 13.4.

Let Σ and Σ′ be two inertial frames of reference with their clocks showing the times $t$ and
$t'$. In the two frames, a certain point is given as $\vec{r} = (x, y, z)$ in Σ and $\vec{r}\,' = (x', y', z')$
in Σ′. For convenience we choose the two systems such that the corresponding axes are
parallel to each other and that Σ′ is moving along the x-axis with the speed $v$. Furthermore
we assume that the two origins coincide at $t = t' = 0$. The transformation is then
given as

$$\begin{aligned}
t' &= \gamma\left(t - \frac{v}{c^2}x\right) \\
x' &= \gamma\left(x - vt\right) \\
y' &= y \\
z' &= z
\end{aligned}$$

where $\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$ is the factor introduced above. Before we look at a (rough) proof,
we can rewrite the transformation above in order to make the two equations symmetric. If
we replace the time $t$ by the distance $ct$ and rewrite the equations in terms of $ct$, we get

$$x' = \gamma\left(x - \frac{v}{c}\,ct\right) \tag{13.1}$$

$$ct' = \gamma\left(ct - \frac{v}{c}\,x\right). \tag{13.2}$$

To prove these transformations we basically have to repeat the steps we did when deducing
time dilation and Lorentz contraction. Hence we give a sketch of a proof rather
than a rigorous proof.

• The transformation of space is in principle the same as in classical mechanics (see
Galileo transformation in section 13.2.3). The only difference is that we have to
take into account Lorentz contraction, which causes the γ factor in equation (13.1).

• For the transformation of time, we first look at the spatial dependence. This
dependence involves only the coordinate along the direction of relative motion of
the two frames. When we look at the synchronisation of clocks in a moving frame
(as described in section 13.3.2), we observe that the time the light pulse needs in
one direction is longer than in the other. We have already observed this in the
section about Lorentz contraction (section 13.3.4). As a consequence we observe
for a moving frame that two clocks (at different positions in the moving frame)
are not synchronous, which literally means they do not show the same time²⁰. The
time in the moving frame therefore depends on the position.

• When we look at the time dependence in the time transformation, one might expect
that it is given by time dilation alone (see section 13.3.3). But there we looked at a
slightly different case than here: In section 13.3.3 we observed a time interval on one
clock of the moving frame from our (rest) frame. Here we look at the clock of Σ′
that passes a certain point $\vec{r}$ at a certain time $t$. Therefore we have to find out
which clock is at the time $t$ at the position $\vec{r}$. It is the clock that was at time $t = 0$
at the position $\vec{r}_0 = \vec{r} - \vec{v}t$. For this clock we first determine from the clock
synchronisation the time it shows at $t = 0$ (denoted by $t'_0$) and then calculate how much
time passes in Σ′ until this clock reaches the point $\vec{r}$. The synchronisation at $t = 0$
leads to $t'_0 = -\gamma\frac{v}{c^2}x_0$, and the time that passes on the moving clock is $\frac{t}{\gamma}$ according to
time dilation. With $x_0 = x - vt$ the two contributions add up to
$t' = -\gamma\frac{v}{c^2}(x - vt) + \frac{t}{\gamma} = \gamma\left(t - \frac{v}{c^2}x\right)$, which is equation (13.2).
²⁰ To avoid any misunderstanding: they show the same time in their rest frame, but in our system this
frame is moving and then they do not show the same time any more.

The transformation from Σ′ to Σ is the same²¹, with the only difference that the
velocity changes sign, and we get

$$x = \gamma\left(x' + \frac{v}{c}\,ct'\right) \tag{13.3}$$

$$ct = \gamma\left(ct' + \frac{v}{c}\,x'\right). \tag{13.4}$$

Before we finish this section about the Lorentz transformation, let’s deduce the time
dilation from the Lorentz transformation. For this consider two points in time, $t_1$ and
$t_2$ in Σ, corresponding to $t'_1$ and $t'_2$ in Σ′. For the time dilation we have to observe one clock
in Σ′, which shall be located at $x_1$ at time $t_1$ and at $x_2 = x_1 + v(t_2 - t_1)$ at $t_2$, whereas²²
$x'_1 = x'_2$. We then get

$$\begin{aligned}
\Delta t' = t'_2 - t'_1 &= \gamma\left(t_2 - \frac{v}{c^2}x_2\right) - \gamma\left(t_1 - \frac{v}{c^2}x_1\right) \\
&= \gamma\left(t_2 - \frac{v}{c^2}\left(x_1 + v(t_2 - t_1)\right)\right) - \gamma\left(t_1 - \frac{v}{c^2}x_1\right) \\
&= \gamma(t_2 - t_1) - \gamma\frac{v^2}{c^2}(t_2 - t_1) \\
&= \frac{1 - \frac{v^2}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}\,(t_2 - t_1) = \frac{1}{\gamma}\Delta t,
\end{aligned}$$

as we had it in section 13.3.3.
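
The transformation and its inverse are easy to test numerically. The following Python sketch is a minimal illustration: the function names `boost` and `inverse_boost`, the value β = 0.6 and the sample events are assumptions of this example. It follows one clock that is at rest in Σ′ and reproduces the time dilation derived above.

```python
import math


def gamma(beta):
    """Lorentz factor for beta = v / c with 0 <= beta < 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def boost(ct, x, beta):
    """Equations (13.1) and (13.2): (ct, x) in Σ to (ct′, x′) in Σ′."""
    g = gamma(beta)
    return g * (ct - beta * x), g * (x - beta * ct)


def inverse_boost(ct_prime, x_prime, beta):
    """Equations (13.3) and (13.4): same form with the sign of beta flipped."""
    g = gamma(beta)
    return g * (ct_prime + beta * x_prime), g * (x_prime + beta * ct_prime)


beta = 0.6
# A clock at rest in Σ′ moves in Σ; two of its positions (made-up numbers):
ct1, ct2 = 1.0, 6.0
x1 = 2.0
x2 = x1 + beta * (ct2 - ct1)

ct1_p, x1_p = boost(ct1, x1, beta)
ct2_p, x2_p = boost(ct2, x2, beta)

print(x2_p - x1_p)                                   # position change in Σ′
print(ct2_p - ct1_p, (ct2 - ct1) / gamma(beta))      # Δt′ and Δt / γ (times c)
print(inverse_boost(ct1_p, x1_p, beta))              # should return (ct1, x1)
```

The clock does not change its position in Σ′, the elapsed time on it is the Σ interval divided by γ, and applying the inverse transformation leads back to the original event.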

13.4 Minkowski metric

In classical mechanics a distance in 3-dimensional space is conserved by the Galileo
transformations. This means a distance between two points has the same length in all
inertial frames.
Similarly one can define something like a “distance” in the 4-dimensional spacetime
which is invariant under Lorentz transformations. This section will examine this
“distance” and define so-called four vectors.

²¹ This is because the two systems are equivalent according to the first postulate of Einstein, and in Σ′ the
velocity of Σ is $-v$ instead of $v$.
²² Since the y and z coordinates have no influence on the transformation, we omit them.

13.4.1 Definition of Minkowski metric

Let Σ and Σ′ be two frames of reference. In Σ we have two points in space and time,
$t_1, \vec{r}_1 = (x_1, y_1, z_1)$ and $t_2, \vec{r}_2 = (x_2, y_2, z_2)$. These points are often called events,
because an event happens at a certain time at a certain place. In Σ′ we look at the same
events, which are given according to the Lorentz transformation (see section 13.3) and
denoted by $t'_1, \vec{r}\,'_1 = (x'_1, y'_1, z'_1)$ and $t'_2, \vec{r}\,'_2 = (x'_2, y'_2, z'_2)$. Then the “distance” in the
4-dimensional spacetime given by

$$\begin{aligned}
\Delta s^2 &= c^2 \Delta t^2 - \Delta\vec{r}^{\,2} \\
&= c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2 \\
&= c^2 (t'_2 - t'_1)^2 - (x'_2 - x'_1)^2 - (y'_2 - y'_1)^2 - (z'_2 - z'_1)^2 \\
&= c^2 \Delta t'^2 - \Delta\vec{r}\,'^{\,2} = \Delta s'^2
\end{aligned} \tag{13.5}$$

is the same in both frames ($\Delta s'^2 = \Delta s^2$). This “distance” is called Minkowski metric²³
and it has some properties that are similar to the distances we know from 3 dimensions.
To prove that the Minkowski metric is invariant under Lorentz transformations we assume
for simplicity that Σ′ is moving along the x-axis of Σ with velocity $v$ and that the
corresponding axes of the two systems are parallel to each other. Therefore we can
already state that $(y_2 - y_1)^2 + (z_2 - z_1)^2 = (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2$. So we only have
to look at time and the x-coordinate. We get

$$\begin{aligned}
c^2 \Delta t'^2 - \Delta x'^2 &= c^2 (t'_2 - t'_1)^2 - (x'_2 - x'_1)^2 \\
&= \gamma^2 \left[ (ct_2 - \beta x_2 - ct_1 + \beta x_1)^2 - (x_2 - \beta ct_2 - x_1 + \beta ct_1)^2 \right] \\
&= \gamma^2 \left[ c^2\Delta t^2 + \beta^2\Delta x^2 - 2c\beta\Delta t\Delta x - \Delta x^2 - c^2\beta^2\Delta t^2 + 2c\beta\Delta t\Delta x \right] \\
&= \frac{1}{1 - \beta^2}\, c^2\Delta t^2 \left(1 - \beta^2\right) - \frac{1}{1 - \beta^2}\, \Delta x^2 \left(1 - \beta^2\right) = c^2\Delta t^2 - \Delta x^2 .
\end{aligned}$$
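
The invariance can also be checked numerically for a few velocities. The following Python sketch is only an illustration: the function names and the two sample events are assumptions of this example, and only the coordinates $ct$ and $x$ are considered, as in the proof above.

```python
import math


def boost(ct, x, beta):
    """Lorentz transformation of (ct, x) for a frame moving with beta = v / c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (ct - beta * x), g * (x - beta * ct)


def interval_squared(event_a, event_b):
    """Δs² = (cΔt)² − Δx² for two events given as (ct, x)."""
    d_ct = event_b[0] - event_a[0]
    d_x = event_b[1] - event_a[1]
    return d_ct ** 2 - d_x ** 2


event_a = (0.0, 0.0)    # (ct, x) in Σ, made-up values
event_b = (5.0, 3.0)

results = []
for beta in (0.0, 0.3, 0.6, 0.9):
    a_prime = boost(*event_a, beta)
    b_prime = boost(*event_b, beta)
    results.append(interval_squared(a_prime, b_prime))
    print(beta, interval_squared(event_a, event_b), results[-1])
```

Up to rounding errors, every line should show the same value of Δs² in Σ and in Σ′, although $ct'$ and $x'$ individually depend on β.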

²³ In some books, the Minkowski metric is defined as $\Delta s^2 = \Delta\vec{r}^{\,2} - c^2\Delta t^2$. Obviously this definition
does not change the fact that $\Delta s^2$ is invariant under Lorentz transformations, and the properties are also the
same (up to a sign).

13.4.2 Properties of Minkowski metric

The reason why the Minkowski metric is always denoted as “distance” with quotation
marks is that there is one important difference to the distances we know from 3 dimensions.
This difference is not the dimensionality but the minus sign between the time
and space parts; if there were a plus sign, one would also call this a distance.
A distance is something that always has a positive value
(or zero). Using the Minkowski metric, however, it is also possible to get a negative value,
namely if $c^2\Delta t^2 < \Delta\vec{r}^{\,2}$. This change of sign allows us to classify the “distance” of two events:

light cone
If two events have $\Delta s^2 = 0$ we say that they lie on the light cone. This means that if at the
time $t_1$ a light pulse were sent from $\vec{r}_1$ towards $\vec{r}_2$, it would reach $\vec{r}_2$ exactly at $t_2$. This
property is obvious because $\Delta s^2 = 0$ leads to

$$c\Delta t = |\Delta\vec{r}| \quad\Longrightarrow\quad c = \frac{|\Delta\vec{r}|}{\Delta t},$$

where the right-hand side is the mean velocity of an object moving from $\vec{r}_1$ to $\vec{r}_2$ in the time
$\Delta t$. This velocity is the speed of light $c$.

timelike
In case $\Delta s^2 > 0$ the “distance” is called timelike. This means the 3-dimensional distance
between the two events is smaller than the distance a light pulse travels in the time between
the first and the second event. Therefore there exists a frame of reference where the two
events happen at the same position. This frame is travelling with the speed
$v = \frac{|\Delta\vec{r}|}{\Delta t} < c$. Furthermore
the condition $\Delta s^2 > 0$ means that the first event might have caused the second event.
This is because if at $t_1$ a light pulse were sent out at $\vec{r}_1$, it would reach $\vec{r}_2$ before $t_2$
and might therefore cause an action at $t_2$.

spacelike
The opposite of timelike is spacelike, where $\Delta s^2 < 0$. This means light would need
longer than $\Delta t$ to travel between the spatial positions of the two events. Since the speed of
light is the highest speed for any interaction, there is no possibility that the first event could
influence the second one. But there is a frame of reference where the two events happen at the
same time. Since in this frame $\Delta t = 0$ and $\Delta s^2$ is independent of the frame, $\Delta\vec{r}^{\,2}$ is
minimal in this frame.
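
The three cases can be summarised in a small function. The following Python sketch is an illustration under stated assumptions: the function names are invented for this example, the inputs are $c\Delta t$ and $|\Delta\vec{r}|$ as lengths, and the test for $\Delta s^2 = 0$ uses an exact comparison, which is only adequate for clean sample numbers. For timelike and spacelike pairs it also returns β = v/c of the special frame mentioned in the text; the spacelike value $c\Delta t / |\Delta\vec{r}|$ follows from setting $\Delta t' = 0$ in the Lorentz transformation.

```python
def classify_interval(c_dt, dr):
    """Classify two events by the sign of Δs² = (cΔt)² − |Δr|²."""
    s2 = c_dt ** 2 - dr ** 2
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "light cone"


def special_frame_beta(c_dt, dr):
    """beta = v / c of the frame in which the events share position or time.

    timelike:  frame where both events happen at the same position
    spacelike: frame where both events happen at the same time
    light cone: no such frame exists, None is returned
    """
    kind = classify_interval(c_dt, dr)
    if kind == "timelike":
        return dr / abs(c_dt)
    if kind == "spacelike":
        return abs(c_dt) / dr
    return None


for c_dt, dr in [(5.0, 3.0), (3.0, 5.0), (4.0, 4.0)]:
    print(c_dt, dr, classify_interval(c_dt, dr), special_frame_beta(c_dt, dr))
```

In both non-trivial cases the returned β is smaller than 1, so the special frame can actually be reached.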


13.4.3 Four vectors

A compact notation for an event is the so called four vector. A four vector is a four
dimensional vector which has as first component time multiplied by the speed of light c
and the other four components are the space components. The four vector to the event
at t1 and position r~1 is therefore

 
ct1
 x1 
r1 = 
 y1 


where we use the convention that a four vector has no arrow above and a spatial (position)
vector has an arrow. Usually it is clear from context if a symbol represents a four vector
or a scalar. In this script we will not really use four vectors so you do not have to bother
about the distinction of four vectors and scalars.
The special feature about four vectors is that one can define a ”scalar product” for them.
We have already seen this ”scalar product” and it is nothing else than the Minkowski
metric. So the (relativistic) ”scalar product” of two four vectors is defined as

   
$$\begin{pmatrix} ct_1 \\ x_1 \\ y_1 \\ z_1 \end{pmatrix} \cdot \begin{pmatrix} ct_2 \\ x_2 \\ y_2 \\ z_2 \end{pmatrix} = ct_1\,ct_2 - (x_1 x_2 + y_1 y_2 + z_1 z_2)$$
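As a small numerical illustration of this "scalar product", the following Python sketch evaluates it for two arbitrarily chosen events and checks that the value does not change under a Lorentz transformation along the x-axis. The function names `minkowski_product` and `boost_x`, the sample events and the boost speed are assumptions made for this example and do not appear in the script.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def minkowski_product(a, b):
    """Relativistic "scalar product" of two four vectors (ct, x, y, z)."""
    return a[0] * b[0] - (a[1] * b[1] + a[2] * b[2] + a[3] * b[3])


def boost_x(r, v):
    """Lorentz transformation of a four vector into a frame moving with v along x."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct, x, y, z = r
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)


# Two hypothetical events, written as (ct, x, y, z) in metres.
r1 = (C * 2.0e-8, 1.0, 2.0, 3.0)
r2 = (C * 5.0e-8, 4.0, -1.0, 0.5)

before = minkowski_product(r1, r2)
after = minkowski_product(boost_x(r1, 0.6 * C), boost_x(r2, 0.6 * C))
print(before, after)  # both values agree up to rounding errors
```

The components of the two four vectors change under the boost, but their product stays the same, which is the property that makes four vectors useful.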

13.4.4 Note on rigorous derivation

To derive relativity more rigorously one would start from the Minkowski metric. Since this
derivation would need more math but would not give more intuitive insight into relativity,
we omit it. Nevertheless a very rough sketch of the derivation shall be given in order to
show the importance of the Minkowski metric and the use of four vectors²⁴.
From Newton's laws we know that in an inertial frame of reference a free particle²⁵ moves
with a constant velocity. This must be true in all inertial frames because otherwise
Einstein's postulate number one would be violated. From this one can conclude that the
transformation between two frames (which will turn out to be the Lorentz transformation)
must be a linear map. Additionally a light pulse should travel with the speed of light in
all inertial frames, therefore we search for a transformation that leaves the Minkowski
metric invariant²⁶. The task is therefore to find the transformations of a 4-dimensional
(vector) space that leave the Minkowski metric invariant. One can then directly deduce
the Lorentz transformation as a transformation of four vectors.

²⁴ If you cannot follow the sketch, do not worry: you will see it in full detail if you study physics.
²⁵ A free particle is a particle on which no force is acting.

13.5 Velocities
In this section we will take a closer look at the physical quantity velocity. We will examine
how velocities "add" in relativity, which will prepare us for the dynamical part of relativity.
Additionally we will have a look at the four vector of the velocity.
Before we start let's repeat an important property of the velocity between two frames of
reference. Assume you are in an inertial frame and you observe a space ship passing with
a constant velocity $v$. Seen from the frame of the space ship you are moving with the
velocity $-v$. This means the relative speed between two frames is the same in both
frames and is not itself relative.

13.5.1 Addition of parallel velocities

In $\Sigma$ you are observing another frame $\Sigma'$ which is moving with a constant speed $v$.
Assume that with respect to $\Sigma'$ an object is moving in the same direction as $\Sigma'$ with
a speed $u'$ (measured in $\Sigma'$)²⁷. The question is: what is the speed $u$ of that object in $\Sigma$?
As already mentioned, it cannot be $u = v + u'$ because this might be greater than the
speed of light $c$. To calculate the right velocity we have to go back to the definition of
the velocity and apply the Lorentz transformation²⁸. We denote by $x$ the position of the
object; the velocity is then given as

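$$u = \frac{dx}{dt} = \frac{\gamma\left(dx' + v\,dt'\right)}{\gamma\left(dt' + \frac{v}{c^2}\,dx'\right)} = \frac{\frac{dx'}{dt'} + v}{1 + \frac{v}{c^2}\frac{dx'}{dt'}} = \frac{u' + v}{1 + \frac{v u'}{c^2}}$$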

where we reduced the fraction by $dt'$ and used the definition $u' = \frac{dx'}{dt'}$. The obtained
result has some interesting features:

• For small speeds ($v \ll c$ and $u' \ll c$) the denominator is nearly 1 and as a consequence we get the classical result $u \approx u' + v$.
• If $v = c$ or $u' = c$ we also get $u = c$.

The obtained result might lead to the conclusion that the relativistic addition of velocities
is always symmetric in $v$ and $u'$. But this is not the case, as becomes obvious if one
looks at velocities which are not parallel to each other. The symmetry only appears here because
the $\gamma$ in the numerator and the denominator cancels.

²⁶ Because the Minkowski metric fulfils exactly the condition with the light pulse.
²⁷ This might be written a bit theoretically. So assume the following example: You observe a space ship ($\Sigma'$) which moves with $v$. In that space ship someone throws a ball with speed $u'$ in the same direction as the space ship is moving.
²⁸ Actually one needs multidimensional analysis, but since the Lorentz transformation is a linear map, the derivative "behaves" like the Lorentz transformation itself.
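The addition formula for parallel velocities and its two limiting cases can be tried out numerically. The following Python sketch is only an illustration; the function name `add_parallel` and the sample speeds are assumptions chosen for this example.

```python
C = 299_792_458.0  # speed of light in m/s


def add_parallel(u_prime, v):
    """Speed in Sigma of an object moving with u_prime in Sigma',
    where Sigma' moves with v (parallel to u_prime) relative to Sigma."""
    return (u_prime + v) / (1.0 + u_prime * v / C**2)


half_plus_half = add_parallel(0.5 * C, 0.5 * C)  # stays below c
light_pulse = add_parallel(C, 0.3 * C)           # a light pulse keeps the speed c
everyday = add_parallel(30.0, 20.0)              # practically the classical sum

print(half_plus_half / C, light_pulse / C, everyday)
```

Two speeds of half the speed of light add up to about 0.8 c instead of c, while for everyday speeds the deviation from the classical sum of 50 m/s is far too small to notice.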

13.5.2 Addition of perpendicular velocities

Assume nearly the same setup as above (section 13.5.1) with the difference that the object
in $\Sigma'$ moves perpendicular to the relative motion between $\Sigma$ and $\Sigma'$. To distinguish
this case from the calculation above we denote the velocities that are measured perpendicular to
$v$ with a subscript $\perp$, therefore $u_\perp$ and $u'_\perp$. Before we calculate the relation between $u_\perp$
and $u'_\perp$ let's state an important fact: Although the object is moving perpendicular to $v$
in the frame $\Sigma'$, it is not moving perpendicular to $v$ in $\Sigma$ because there it also moves with the
speed $v$ in the direction of the relative motion between $\Sigma$ and $\Sigma'$. Of course it also has a
perpendicular component, which is exactly the one denoted by $u_\perp$. This means $u_\perp$ is not
the speed of the object in $\Sigma$; the speed is given by $\sqrt{v^2 + u_\perp^2}$. Now let's calculate
$u_\perp$ by following the same steps as above but using that there is no Lorentz contraction
perpendicular to the direction of motion, so $x_\perp = x'_\perp$:

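$$u_\perp = \frac{dx_\perp}{dt} = \frac{dx'_\perp}{\gamma\left(dt' + \frac{v}{c^2}\,dx'_\parallel\right)} = \frac{u'_\perp}{\gamma\left(1 + \frac{v}{c^2}\,u'_\parallel\right)} = \frac{u'_\perp}{\gamma}$$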

where we used that the parallel component $u'_\parallel$ of the velocity of the object in $\Sigma'$ is zero²⁹. In
the case of $u_\perp$ it is obvious that $v$ and $u'_\perp$ do not enter the formula symmetrically ($v$ enters
the formula through $\gamma$).

²⁹ If the object moved in an arbitrary direction (not perpendicular) we would keep $u'_\parallel$.

13.5.3 Velocity four vector

Similarly to the four vector defined in section 13.4.3 one can define four vectors for
velocities. Let $r = (ct, x, y, z)$ be the four vector of an object moving with a constant
speed. This means at the time $t$ that object is at the position $\vec{r} = (x, y, z)$. A first (wrong)
approach to a four vector for the velocity $u$ might be


   
$$u = \frac{d}{dt}\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} c \\ v_x \\ v_y \\ v_z \end{pmatrix}$$
where $v_x$ is the $x$ component of the velocity and similarly for $y$ and $z$. But this approach
contradicts the idea that the absolute value of a four vector is invariant under Lorentz
transformations. This is because the scalar product of $u$ with itself is given as
$$u \cdot u = u^2 = c^2 - v_x^2 - v_y^2 - v_z^2.$$
Depending on the frame the velocities are different, so $u^2$ is not invariant.
To get the right result by an intuitive approach we have to go back to the definition of the
derivative. A derivative is basically the fraction of two quantities $\frac{\Delta x}{\Delta t}$ in the limit of $\Delta t$
going towards zero (and $\Delta x$ too). If the numerator is a four vector (as we assume here)
and the denominator is a quantity that is the same in all frames, then the whole fraction
is again a four vector that has an absolute value independent of the frame. There is one
special time, namely the proper time $\tau$, which is independent of the frame. This is the
time in the rest frame of the moving object. We obtain $\Delta\tau$ from the time dilation, which
leads to $\Delta\tau = \frac{1}{\gamma}\Delta t$. Taking the limit for the derivative we have to replace the $\Delta$ by the
$d$ (roughly speaking) and we get
   
$$u = \frac{d}{\frac{1}{\gamma}\,dt}\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \gamma\begin{pmatrix} c \\ v_x \\ v_y \\ v_z \end{pmatrix}.$$
If our construction of the four vector was successful, $u \cdot u$ should be invariant under
Lorentz transformations. Let's check this by considering $u \cdot u$ in the rest frame of the
object and in an arbitrary frame. In the rest frame the object does not move, therefore
$v_x = v_y = v_z = 0$ and therefore $u \cdot u = c^2$. In an arbitrary frame it is moving and we
get
$$u \cdot u = u^2 = \gamma^2\left(c^2 - v_x^2 - v_y^2 - v_z^2\right) = \frac{1}{1 - \frac{v^2}{c^2}}\,c^2\left(1 - \frac{v^2}{c^2}\right) = c^2$$
where we used $v_x^2 + v_y^2 + v_z^2 = v^2$. Obviously this is invariant under Lorentz
transformations and we succeeded in constructing a four vector for the velocity.
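The invariance of $u \cdot u$ can also be checked numerically. The following Python sketch builds the velocity four vector for a few arbitrarily chosen velocities and evaluates its square; the function names and sample velocities are assumptions made for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def four_velocity(vx, vy, vz):
    """Velocity four vector gamma * (c, vx, vy, vz)."""
    v_squared = vx**2 + vy**2 + vz**2
    gamma = 1.0 / math.sqrt(1.0 - v_squared / C**2)
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)


def minkowski_square(u):
    """Relativistic "scalar product" of a four vector with itself."""
    return u[0] ** 2 - (u[1] ** 2 + u[2] ** 2 + u[3] ** 2)


# Hypothetical velocities: at rest, along x, and in an oblique direction.
velocities = [(0.0, 0.0, 0.0), (0.3 * C, 0.0, 0.0), (0.5 * C, 0.5 * C, 0.5 * C)]
ratios = [minkowski_square(four_velocity(*v)) / C**2 for v in velocities]
print(ratios)  # every entry is 1 up to rounding errors
```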


13.6 Dynamics
Until now we only did relativistic kinematics, which means we developed a formalism to
describe the motion of an object and how this description changes under a Lorentz
transformation. Now we will look at dynamics. A proper description of dynamics in special
relativity, like the one we know from classical mechanics (Newton's laws), is more difficult.
Nevertheless we can define quantities such as momentum and energy which are conserved and
which allow us to calculate a lot of examples (using these conservation laws). To derive them
we start with the 3-dimensional momentum and then search for the four vector of the
momentum. From the four vector of the momentum one can derive energy considerations
including Einstein's famous formula $E = mc^2$. At the end of this section we will briefly
look at acceleration and forces.

13.6.1 Momentum
Looking at the following example one can see that the momentum cannot simply be
$\vec{p} = m\vec{v}$, where $m$ is the (rest) mass of the object: We observe a space ship that moves
with velocity $v$ along the x-axis. In the space ship two balls of mass $m$ are moving with speed $v$
perpendicular to the x-axis towards each other (see figure 13.5). When they meet they bounce
off each other such that they move in the x-direction. From energy and momentum conservation
we can deduce that in the frame of the space ship they then move with speed $v$ along the
x-axis, one to the right and one to the left. In our frame (where the space ship is moving), the
total momentum of the two balls before the collision is $2p$, where $p$ is the momentum in the
x-direction of each ball (since the balls are moving towards each other, the total momentum
perpendicular to the x-axis is zero). After the collision the total momentum must be the same. But the ball moving to
the left does not move in our frame because in the space ship it moves with $-v$ and the
space ship itself moves with $v$, so the overall velocity of this ball is zero. The second
ball, moving to the right, has according to the relativistic addition of velocities a speed of

$$v_2 = \frac{v + v}{1 + \frac{v\,v}{c^2}} = \frac{2v}{1 + \frac{v^2}{c^2}}$$

If the momentum were $p = mv$, this would lead to a contradiction since
$$p_\text{before} = 2mv \neq m\,\frac{2v}{1 + \frac{v^2}{c^2}} = p_\text{after}.$$


Figure 13.5: Collision and scattering of two balls in the moving space ship. Initially, the
two balls move with the velocity v perpendicular to the direction of flight of the space
ship. After the scattering they move in the same direction as the space ship.

Consequently the momentum cannot depend linearly on the velocity. The right dependence
is given as
$$p = \gamma m v = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\,mv.$$

By a longer calculation one can show that the momentum defined this way fulfils the
conservation of momentum. Since there is another method, based on four vectors, which shows
that this is a good candidate for the momentum, we will look at four vectors next.
One important note about the mass $m$: Sometimes a relativistic mass $m_\text{rel} = \gamma m$ is
defined. This is an unsuitable definition (see section 13.6.4). When we talk about masses
we always mean the (rest) mass, so the mass measured in the rest frame of the object.
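For the two-ball example above, the claim that $p = \gamma m v$ is conserved while $p = mv$ is not can be checked with a few lines of Python. This is only a numerical sketch for one hypothetical speed; the helper names are assumptions, and the velocity of the balls before the collision in our frame is taken from the perpendicular velocity addition of section 13.5.2.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def momentum_x(m, vx, vy):
    """x component of p = gamma * m * v for a ball with velocity (vx, vy)."""
    return gamma(math.hypot(vx, vy)) * m * vx


m = 1.0      # mass of each ball in kg (hypothetical)
v = 0.5 * C  # speed of the space ship and of the balls inside it (hypothetical)

# Before the collision, seen from our frame: each ball has the x velocity v of
# the space ship and a perpendicular velocity of +-v / gamma(v).
u_perp = v / gamma(v)
p_before = momentum_x(m, v, u_perp) + momentum_x(m, v, -u_perp)

# After the collision: one ball is at rest, the other moves with 2v / (1 + v^2/c^2).
v2 = 2.0 * v / (1.0 + v**2 / C**2)
p_after = momentum_x(m, 0.0, 0.0) + momentum_x(m, v2, 0.0)

classical_before = 2.0 * m * v
classical_after = m * v2

print(p_before / p_after, classical_before / classical_after)
```

The first ratio is 1 up to rounding errors, whereas the classical momenta before and after the collision differ by a factor $1 + v^2/c^2$.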

13.6.2 Momentum four vector

Since $\vec{p} = m\vec{v}$ is wrong, the next approach might be to take the last three components of
the velocity four vector and multiply them by the mass. And indeed this leads to the
right result $\vec{p} = \gamma m\vec{v}$. Encouraged by this success we define the four vector for the momentum
as



$$p = mv = \gamma m\begin{pmatrix} c \\ v_x \\ v_y \\ v_z \end{pmatrix}.$$
As the absolute value of the velocity four vector is invariant under Lorentz transformations
and the mass is a constant, the absolute value of the momentum four vector is also
invariant, with $p^2 = m^2c^2$. What we don't understand yet is the first component of the
momentum four vector, namely $\gamma mc$. This will become clear in the next section.

13.6.3 Energy
The (total) energy of an object can be calculated by
$$E = \int F\,ds = \int \frac{dp}{dt}\,ds = m\gamma c^2 \qquad (13.6)$$
where many steps were left out, including some longer calculation. Looking at the
momentum four vector, we see that the first component is exactly the energy divided by $c$.
This means we can rewrite the four vector as

$$p = mv = \gamma m\begin{pmatrix} c \\ v_x \\ v_y \\ v_z \end{pmatrix} = \begin{pmatrix} E/c \\ p_x \\ p_y \\ p_z \end{pmatrix}.$$
This discovery leads to some interesting conclusions:

Rest Energy
In the rest frame of an object one has $\vec{p} = 0$ and $\gamma = 1$. Since $p^2 = m^2c^2$ we get for the
rest frame
$$\frac{E^2}{c^2} = m^2c^2 \quad\Rightarrow\quad E = mc^2$$
which is exactly Einstein's famous mass-energy equivalence. This means that mass itself
is energy and that the minimal possible energy of an object is $E = mc^2$, the rest mass of
that object times $c^2$.


Energy-momentum relation
For an arbitrary system the square of the absolute value of $p$ multiplied by $c^2$ leads to
$$p^2c^2 = m^2c^4 = E^2 - \vec{p}^{\,2}c^2$$
$$E^2 = \vec{p}^{\,2}c^2 + m^2c^4. \qquad (13.7)$$
This equation relates the momentum of an object to its energy. Of course this energy is
the same as the one obtained above in equation (13.6), as the following calculation shows:

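$$E = \sqrt{\vec{p}^{\,2}c^2 + m^2c^4} = mc^2\sqrt{\gamma^2\beta^2 + 1} = mc^2\sqrt{\frac{\beta^2}{1 - \beta^2} + 1} = mc^2\sqrt{\frac{\beta^2 + 1 - \beta^2}{1 - \beta^2}} = \gamma mc^2,$$
where $\beta = \frac{v}{c}$.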
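The agreement between equation (13.6) and equation (13.7) can also be verified numerically. The Python sketch below uses an approximate electron mass and an arbitrarily chosen speed; these values and the function names are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def energy(m, v):
    """Total energy E = gamma * m * c^2, equation (13.6)."""
    return gamma(v) * m * C**2


def momentum(m, v):
    """Relativistic momentum p = gamma * m * v."""
    return gamma(v) * m * v


m = 9.109e-31  # approximate electron mass in kg
v = 0.8 * C    # hypothetical speed

left_side = energy(m, v) ** 2
right_side = (momentum(m, v) * C) ** 2 + (m * C**2) ** 2  # equation (13.7)
print(left_side, right_side)
```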

Small speed limit

If we look at small velocities, the energy as written in equation (13.6) can be approximated
by a Taylor expansion around $v = 0$ and one obtains
$$E = \gamma mc^2 \approx \left(1 + v^2\left.\frac{d\gamma}{d(v^2)}\right|_{v^2 = 0}\right)mc^2 = \left(1 + \frac{1}{2}\frac{v^2}{c^2}\right)mc^2 = mc^2 + \frac{1}{2}mv^2.$$
The first term is the rest energy, which is always there, but the second term is interesting
since this is exactly the kinetic energy we know from classical mechanics. So in the
limit of small velocities, the kinetic energy of an object reduces to the one from classical
mechanics.
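To get a feeling for how good this approximation is, one can compare the exact kinetic energy $(\gamma - 1)mc^2$ with $\frac{1}{2}mv^2$ for several speeds. The Python sketch below does this; the function names and the list of speeds are assumptions made for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def kinetic_exact(m, v):
    """Kinetic energy as total energy minus rest energy."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma - 1.0) * m * C**2


def kinetic_classical(m, v):
    return 0.5 * m * v**2


m = 1.0  # mass in kg (hypothetical)
ratios = {}
for beta in (0.001, 0.01, 0.1, 0.5):
    v = beta * C
    ratios[beta] = kinetic_exact(m, v) / kinetic_classical(m, v)
    print(beta, ratios[beta])
```

For one percent of the speed of light the two expressions differ by less than 0.01 %, whereas at half the speed of light the exact kinetic energy is already more than 20 % larger than the classical one.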

13.6.4 Acceleration and forces

According to Newton's law of motion the force acting on a body is equal to the rate of
change of the momentum, $\vec{F} = \frac{d\vec{p}}{dt}$. Since $\vec{p} = m\gamma\vec{v}$ and $m$ is constant, the derivative is
only acting on $\gamma\vec{v}$. There are two simple cases one has to consider:

Radial acceleration
If the force acts perpendicular to the direction of motion, as is the case for circular
motion, the absolute value of $\vec{v}$ does not change. Therefore $\gamma$ does not change and we
get
$$\vec{F} = \gamma m\frac{d\vec{v}}{dt}$$

Linear acceleration
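If we accelerate in the direction of motion we also have to take the derivative of $\gamma$, which
is $\frac{d\gamma}{dt} = \gamma^3\,\frac{\vec{v}}{c^2}\cdot\frac{d\vec{v}}{dt}$. This then leads to
$$\vec{F} = m\left(\gamma^3\frac{v^2}{c^2}\frac{d\vec{v}}{dt} + \gamma\frac{d\vec{v}}{dt}\right) = m\gamma^3\frac{d\vec{v}}{dt}$$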

Since these two cases scale differently with $\gamma$, an arbitrary force is in general not parallel to
the acceleration any more! That's why it makes no sense to talk about a relativistic mass:
a relativistic mass would have to be defined as the proportionality constant between $\vec{F}$ and
$\vec{a}$. But since these two quantities are not proportional any more (except in the two cases
above), one cannot talk about a relativistic mass. So the only meaningful mass is the rest
mass $m$, as we have defined above (see section 13.6.1).
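The factor $\gamma^3$ for acceleration along the direction of motion can be checked without any algebra by differentiating $p = \gamma m v$ numerically. The following Python sketch uses a central difference; the function names, the step size `h` and the sample speed are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def momentum(m, v):
    return gamma(v) * m * v


def dp_dv_numeric(m, v, h=1.0e3):
    """Central difference approximation of dp/dv (h is a small speed step in m/s)."""
    return (momentum(m, v + h) - momentum(m, v - h)) / (2.0 * h)


m = 1.0      # mass in kg (hypothetical)
v = 0.6 * C  # hypothetical speed, gamma = 1.25

# With F = dp/dt = (dp/dv) * (dv/dt), the quantity dp/dv plays the role of the
# proportionality factor between force and acceleration in the linear case.
numeric = dp_dv_numeric(m, v)
predicted = m * gamma(v) ** 3
print(numeric, predicted, m * gamma(v))
```

The numerical derivative agrees with $m\gamma^3$ and clearly differs from the factor $m\gamma$ that applies to a force perpendicular to the motion.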

13.7 Paradoxes
In this section we will look at some paradoxes which might appear when mixing classical
mechanics and relativity. Some can be resolved using special relativity, some need general
relativity and we won't be able to totally resolve them.

13.7.1 Ladder and barn

A farmer has a ladder and a barn where the ladder is slightly longer than the barn. The
farmer wants to put the ladder in the barn but since the barn is shorter this does not work.
His idea is to put the ladder on his high speed tractor and then drive with ≈ 0.25c into
the barn. Due to length contraction the ladder should be shorter and therefore fit into
the barn (see figure 13.6). The son of the farmer, who read this script and therefore
understood relativity very well, redoes the calculation and points out that in the frame of
the moving ladder, the barn seems shorter and therefore there is absolutely no way
the ladder would fit into the barn. Father and son decide to risk the experiment. Who is
right?


Figure 13.6: Ladder and barn. There is a door on each side of the barn such that the
farmer can drive through the barn. Left: The moving ladder and the barn in the frame of
the barn. Right: The ladder and the moving barn in the frame of the ladder.

The question "fit into the barn or not" is equivalent to the question whether the head
of the ladder passes the second door before the back of the ladder passes the first door
or not. This is therefore a question about simultaneity, which is of course relative.
Or to be more precise: the frame of the tractor and the frame of the barn use different
clocks and the question when which part of the ladder passes which door depends on
the time indicated on the clocks. So the father on the tractor effectively measures³⁰ that
the ladder does not fit into the barn whereas the son standing outside briefly sees the
ladder disappear in the barn (assuming the father drives with constant speed through the
barn).
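The role of simultaneity can be made concrete with numbers. The Python sketch below assumes a ladder of 5.1 m and a barn of 5.0 m (hypothetical lengths, the text only says "slightly longer") and the speed 0.25c from the text. It computes the contracted lengths in both frames and the times of the two events "back of the ladder passes the first door" and "front of the ladder passes the second door" in both frames. All variable and function names are assumptions made for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s
v = 0.25 * C       # speed of the tractor from the text
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

ladder_rest = 5.1  # rest length of the ladder in m (hypothetical)
barn_rest = 5.0    # rest length of the barn in m (hypothetical)

ladder_in_barn_frame = ladder_rest / gamma  # contracted ladder seen by the son
barn_in_ladder_frame = barn_rest / gamma    # contracted barn seen by the father

# Events in the barn frame as (t, x); the first door is at x = 0.
# A: the back of the ladder passes the first door at t = 0.
# B: the front of the ladder passes the second door at x = barn_rest.
event_a = (0.0, 0.0)
event_b = ((barn_rest - ladder_in_barn_frame) / v, barn_rest)


def time_in_ladder_frame(t, x):
    """Time coordinate of an event in the frame of the ladder (Lorentz transformation)."""
    return gamma * (t - v * x / C**2)


t_a_ladder = time_in_ladder_frame(*event_a)
t_b_ladder = time_in_ladder_frame(*event_b)

print(ladder_in_barn_frame, barn_in_ladder_frame)
print(event_b[0] - event_a[0], t_b_ladder - t_a_ladder)
```

In the barn frame event B happens after event A, so for a short time the whole ladder is inside the barn. In the ladder frame the time difference has the opposite sign: the front of the ladder leaves through the second door before the back has entered through the first door.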

Since the son is a curious guy, he suggests the following modification of the experiment:
He installs a photoelectric sensor at each door of the barn and puts a lamp in the middle
of the barn. If the front of the ladder passes the photoelectric sensor at the second door
before the end of the ladder passes the photoelectric sensor at the first door, the lamp
turns on. This experiment should allow them to decide whether the ladder fits into the
barn or not. What will they observe?
They indeed observe that the light goes on. In the frame of the son (standing next to the
barn) this is obvious since the ladder gets shorter as it moves. In the frame of the father
on the tractor this is not that obvious and we have to consider how the signal from the
photoelectric sensor gets to the lamp. Assume there is a cable between each photoelectric
sensor and the lamp in which the signal moves with the speed of light $c$ and that there are
no effects that delay the signal. Since the speed of light is a natural constant, both the
father and the son see it move with $c$. But in the frame of the father the lamp is moving
with velocity $v$. Therefore the overall³¹ effect of the second door moving with velocity
$v$ and the signal moving with $c$ is the same as a velocity of $c - v$. At the first door it is the
opposite: there the farmer sees an overall velocity of $c + v$. Although the distance between the
lamp and each door is the same, the signal from the second door reaches the lamp first
and as a consequence the lamp shines.
In this paradox we see very nicely that there are examples in relativity which seem
contradictory, but at a closer look there is no mistake in the theory.

³⁰ Measure instead of see because to see something, the light needs some time to travel the corresponding distance.

13.7.2 Twin paradox

Probably the most famous paradox is the twin paradox: One of two twins makes a
space trip in a very fast space ship whereas the other twin stays at home. After several
years the first twin returns and meets the other twin. The one who made the trip is much
younger than the one who stayed at home.
Not knowing relativity this is astonishing because time is usually considered as absolute
and should not be influenced by movements. Knowing relativity this might be
astonishing because each of the twins sees the other twin move and therefore the other twin
should stay younger. The point in this paradox is that one of the twins has to accelerate in
order to return to the other twin. This acceleration breaks the symmetry of the problem
and leads to the fact that one twin stays younger than the other.
The simplest way to calculate the paradox is to assume that the travelling twin
moves with constant velocity $v$ until he changes to another space ship which moves with
$-v$ back to earth. The clocks of the two space ships are set such that they show the same
time at the moment and at the position where they meet.
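With this simplification the age difference follows directly from time dilation, since on both legs of the trip the clock of the travelling twin runs slower by the factor $\gamma$ as seen from the earth. The Python sketch below evaluates this for a hypothetical trip; the function name `traveller_time`, the duration of 10 years and the speed 0.8c are assumptions made for this example.

```python
import math


def traveller_time(earth_time, beta):
    """Time that passes for the travelling twin while earth_time passes on earth.

    Both legs of the trip are flown with the same speed beta * c, so the total
    time is simply the earth time divided by gamma.
    """
    return earth_time * math.sqrt(1.0 - beta**2)


earth_years = 10.0  # duration of the whole trip measured on earth (hypothetical)
beta = 0.8          # speed of both space ships in units of c (hypothetical)

travel_years = traveller_time(earth_years, beta)
print(earth_years, travel_years)
```

For these numbers roughly 6 years pass for the travelling twin while 10 years pass on earth.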

³¹ This is the addition of two relative (and parallel) velocities measured in the same frame and not the relativistic velocity addition discussed in section 13.5.1.

13.7.3 Solution to the flying electron problem

We now want to go back to the problem from the beginning, namely the flying electron
(see section 13.1.2). For simplicity we assume that the electrons in the wire move with
the same speed as the electron above the wire³². Furthermore we assume that in the lab
frame (where the electron is moving) the wire is neutral, which means the distance between
two neighbouring positive atoms and the distance between two neighbouring negative electrons
is the same. Let's denote this distance with $L$.
First of all we have to state that from the point of view of the lab we see the distance between two
neighbouring moving electrons in the wire with a Lorentz contraction. Therefore their
distance (in the direction of flight) in their own frame is $L'_e = \gamma L$, where $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$
and $\beta = \frac{v}{c}$ is the velocity in terms of $c$. If we now look at the frame of the moving
electron, the electrons in the wire are not moving any more (by our assumption above), their
distance is therefore $L'_e$. Additionally the positive atoms are moving, which according
to the Lorentz contraction leads to a distance between neighbouring atoms of $L'_p = \frac{1}{\gamma}L$. As
a consequence the density of the positive atoms is bigger than the one of the negative
electrons, so the wire is positively charged. This positive charge leads to
an attraction of the single electron, just as we observe in the lab frame.
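The size of this effect can be illustrated with a short Python sketch that computes the two spacings and the resulting net charge per unit length in the frame of the flying electron. The spacing `L = 1` (arbitrary units) and the deliberately large value `beta = 0.1` are assumptions chosen only to make the numbers visible; they are not taken from the script.

```python
import math


def spacings_in_electron_frame(L, beta):
    """Spacing of the wire electrons and of the positive atoms, both measured
    in the frame of the flying electron."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * L, L / gamma


L = 1.0     # spacing in the lab frame, arbitrary units (hypothetical)
beta = 0.1  # v / c, deliberately large for illustration (hypothetical)

spacing_electrons, spacing_atoms = spacings_in_electron_frame(L, beta)

# Number of charges per unit length is the inverse spacing; the difference is
# the net positive charge per unit length in units of the elementary charge.
net_charge_density = 1.0 / spacing_atoms - 1.0 / spacing_electrons
print(spacing_electrons, spacing_atoms, net_charge_density)
```

The net charge density is positive for every nonzero speed and equals $\gamma\beta^2/L$ in units of the elementary charge.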
This example shows how important relativity is in electrodynamics. Without relativity
electrodynamics would not be a consistent theory, meaning that different observers would
observe different results of one experiment. Nevertheless one needs a bit more math to
precisely formulate electrodynamics with relativity.

³² One can do the whole calculation without this assumption, which leads to the same result but needs a longer calculation and is less intuitive.

Chapter 14

QUANTUM MECHANICS
Werner Heisenberg was driving in his
car, thinking about the problem of
time. Suddenly a police car appeared
and stopped him. The policewoman
asked: “Do you know how fast you
were driving?”. Upon which
Heisenberg answered: “Do you know
where I have been?”.
Exercise: why was he fined nonetheless?

14.1 Experiments
14.2 Laws of Quantum Mechanics
14.3 Examples


Classical mechanics describes how particles or bodies behave in time. So it characterises in
which way a particle moves. In Newtonian mechanics the main formula that characterises
the way particles move is $\vec{F} = m\vec{a}$. One then assumes that the particle moves according
to the formula above on a well defined path. At the end of the 19th century some
phenomena did not fit the predictions of classical mechanics. A new description was
developed which is called quantum mechanics.
Quantum mechanics also describes how a particle behaves in time but the laws are very
different from those of classical mechanics and intuitively not really understandable. The
main idea is that a particle does not move on a well defined path but the probability to find
it at a certain location is connected to wave properties. This chapter shows the
phenomena which did not fit classical mechanics and what changed in the description when switching
to quantum mechanics. Then some concepts of quantum mechanics are discussed. Since
the general description of quantum mechanics is pretty complicated, the concepts are
discussed in a more intuitive way. At the end we will look at what a rigorous calculation in
quantum mechanics looks like. This rigorous calculation is surely not necessary to know
for the Olympiads (not even for the IPhO) but it should show what quantum mechanics
really looks like. The main sources for this chapter are [46], [47] and [48].

14.1 Experiments
We want to look at some experimental setups and discuss how quantum mechanics
changed the understanding of physics and what the main statements are.

14.1.1 Black body radiation

A body with a certain temperature T emits electromagnetic waves, called radiation. De-
pending on the surface, the body can emit electromagnetic waves for certain frequencies
better than for others. If there are two bodies and one body is hotter than the other,
energy can only go from the hotter to the colder body. Therefore for any frequency the
capability of a body to absorb an electromagnetic wave is the same as the capability to
emit a wave.
A black body is a body which absorbs all electromagnetic waves. As a consequence it
also emits waves very well. If a body absorbs an electromagnetic wave it gets heated up
by the energy of the wave. The emission of an electromagnetic wave is also due to the
thermal energy of a body.
A surface which has the properties of a black body can be achieved if one takes a closed
cavity with a little hole (see figure 14.1). The area of the hole behaves like a black
body because all the waves incident on the hole go into the cavity
and are therefore absorbed by the body.

Figure 14.1: A cavity with a little hole [49].

Inside the cavity there are some other interesting properties: Since all the walls of the
cavity have the same temperature there is no net flow of energy from one wall to the
other. Therefore the electromagnetic waves inside the cavity have an intensity such that
the emission and absorption at the walls are in equilibrium. We assume the hole to be
very small, therefore at the hole the radiation is the same as inside the cavity. Since the
hole behaves like a black body, the electromagnetic waves inside the cavity look as if they
were emitted by a black body, although the walls of the cavity need not be black. This
is because the electromagnetic field and the walls are in equilibrium.

At the end of the 19th and beginning of the 20th century there was a big discussion about
the question how the power that a black body radiates is distributed as a function of the
frequency $f$. This means how much power is radiated in a small interval between $f$ and
$f + df$.
The classical approach is to model the electric field inside the cavity as standing waves
between opposite walls. According to thermodynamics any possible standing wave has
the same amount of energy (equipartition theorem). But for higher frequencies diagonal
standing waves can also occur, therefore the number of standing waves per frequency
is not fixed. Since the dimension of such a cavity is usually of the order of centimetres and the
wavelength of light is of the order of 500 nanometres, the discrete difference between
two successive frequencies is very small and we treat the spectrum as continuous. A rigorous
calculation then gives that the number of standing waves between $f$ and $f + df$
is


$$g(f)\,df = \frac{4\pi f^2 V}{c^3}\,df$$
where V is the volume of the cavity. We see that the number of standing waves of a
certain frequency is proportional to the square of the frequency (for high frequencies).
The equipartition theorem from thermodynamics states that each standing wave has the
mean energy kB T where kB is the Boltzmann constant and T the temperature. Therefore
we have an energy density (Energy per volume) inside the cavity of

$$\rho(f)\,df = \frac{8\pi f^2}{c^3}\,k_B T\,df$$
This law is called the Rayleigh-Jeans law (see figure 14.2). The additional factor 2 appears
because there are two possible polarisations. The big problem is that the energy
density goes to infinity for high frequencies. Therefore there should be an infinite amount of
energy in the cavity, which is of course not possible.

In 1900, Max Planck introduced another idea: He said that the energy of light (of a
certain frequency) does not have an arbitrary amount. Instead he claimed that light at a
certain frequency has a smallest amount of energy which is given by $E_0 = hf$ where $h$ is
the Planck constant with $h = 6.626 \cdot 10^{-34}$ J·s. He did not believe that his assumption
describes a real property of nature, he just tried to do the calculation a bit differently. The
calculation led to the formula

$$\rho(f)\,df = \frac{8\pi h f^3}{c^3}\,\frac{1}{e^{\frac{hf}{k_B T}} - 1}\,df$$
The additional exponential in the denominator causes the curve to get small for high
frequencies. The curve is drawn for different temperatures in figure 14.2. The measurements
confirm Planck's calculation.

For small frequencies we have $e^{\frac{hf}{k_B T}} - 1 \approx 1 + \frac{hf}{k_B T} - 1 = \frac{hf}{k_B T}$; the Rayleigh-Jeans law is
therefore an approximation of Planck's formula for small frequencies.
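The relation between the two laws can be seen by evaluating both formulas at a low and at a high frequency. The following Python sketch does this for one temperature; the function names, the temperature of 5000 K and the two frequencies are assumptions made for this illustration, and the constants are rounded values.

```python
import math

H = 6.626e-34      # Planck constant in J*s
K_B = 1.381e-23    # Boltzmann constant in J/K (rounded)
C = 299_792_458.0  # speed of light in m/s


def rayleigh_jeans(f, temperature):
    """Classical energy density per frequency interval."""
    return 8.0 * math.pi * f**2 / C**3 * K_B * temperature


def planck(f, temperature):
    """Planck's energy density per frequency interval."""
    x = H * f / (K_B * temperature)
    return 8.0 * math.pi * H * f**3 / C**3 / math.expm1(x)  # expm1(x) = e^x - 1


temperature = 5000.0  # in K (hypothetical)
ratio_low = planck(1.0e11, temperature) / rayleigh_jeans(1.0e11, temperature)
ratio_high = planck(1.0e15, temperature) / rayleigh_jeans(1.0e15, temperature)
print(ratio_low, ratio_high)
```

At the low frequency both laws agree to better than 0.1 %, while at the high frequency Planck's formula gives less than 1 % of the classical value, which removes the divergence of the Rayleigh-Jeans law.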

This was then the beginning of a new theory, the quantum theory. The introduction
of the smallest energy $E_0$ quantises the energy of an electromagnetic field. Therefore
electromagnetic waves behave like waves but also have a quantised property which fits
better to the model of particles than to that of waves. This particle property gets clearer in the next
section.

Figure 14.2: Radiated intensity for different temperatures. The x-axis is the wavelength
and it is obvious that for high frequencies (small wavelengths) the intensity according to
Planck's calculation gets small. For the classical approach the intensity goes towards infinity
[50].

14.1.2 Photoelectric effect

If one shines light on a charged metallic plate one can observe that the plate loses charge
with time. To examine this effect let's look at the experimental setup shown in figure
14.3. We shine light with a frequency $f$ on the left metallic plate. We will see that this
plate loses electrons, therefore we call it the cathode. Parallel to this plate we have another
plate on which no light shines; we call it the anode. Between these two plates we connect
a voltage source with variable voltage $U$. We define $U > 0$ if the cathode is connected
to the plus pole of the voltage source (so electrons flow from the cathode to the voltage
source) and $U < 0$ if the cathode is connected to the minus pole. Additionally we put an
ammeter into the electric circuit in order to measure the current $I$ that flows.


Figure 14.3: Light shines on a metal plate (cathode) which emits electrons. Depending
on the voltage $U$ the electrons reach another metal plate (anode). If they reach the anode
the electric circuit is closed and a current flows which is measured by the ammeter (A) [51].

Dependence of voltage U
We see that for all voltages that are smaller than a certain voltage $U_t$ (also called threshold
voltage) there is a current flowing from the anode to the cathode. Since metallic atoms
cannot really leave the metallic plate we conclude that the electrons move. Since the
electrons are negatively charged and the current flows from the anode to the cathode, the
electrons move from the cathode to the anode. The light seems to knock electrons out of
the cathode and they travel to the anode. The existence of $U_t$ (for a certain intensity) is
reasonable because for higher voltages the electrons need more energy to fly against the
electric field between the anode and cathode.

Dependence of intensity
If the voltage is smaller than the threshold voltage, $U < U_t$, we observe that the current
is proportional to the intensity. This also seems reasonable because if we shine more
light on the cathode there will be more electrons knocked out which reach the anode (until
saturation effects occur).
For a certain intensity the existence of a maximal voltage $U_t$ is reasonable because if
we raise the voltage the electrons need more energy to overcome the electric field. The
remarkable thing that the experiment shows is that $U_t$ is independent of the intensity.
This means that an electron does not collect energy from the light until it can leave the
cathode and reach the anode. Instead it receives a certain amount of energy once, and if this
energy is big enough to leave the cathode and reach the anode the electron will do so.
Therefore the energy of the light is concentrated in small packets. We call these energy
packets photons.

Dependence of frequency
If we change the frequency of the light we see that the threshold voltage also changes. If
we raise the frequency the threshold voltage also rises. The dependence seems to be a linear
function, which is drawn in figure 14.4. Therefore the energy of a photon grows linearly
with the frequency. The y-intercept can be interpreted as the work that has to be applied to
the electron in order to extract it from the metallic plate. The work to leave the plate is
therefore $V_0 e$, where $V_0$ is the voltage read off at the y-intercept and $e$ is the charge of an electron.

Figure 14.4: Dependence of the cut-off potential Ut on the frequency f [52].

Interpretation
As we already stated, the energy of the light does not flow continuously but in small
packets. We can read off the amount of energy per frequency from figure 14.4 by taking
the slope of the linear function. If we do this for the energy $e U_t$, we find that the slope is $h$,
Planck's constant. Therefore the energy of such an energy packet is given by $E = hf$, as we
already assumed in chapter 14.1.1.
The intensity of the light is only a measure of how many photons arrive at the cathode
per second.
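
The linear relation between threshold voltage and frequency can be sketched numerically. The following Python snippet is only an illustration: it assumes the relation $e U_t = hf - W$ with a work function $W$, and the value of 2.0 eV for the cathode is a hypothetical choice, not a value from the experiment described above.

```python
# Physical constants (rounded)
H = 6.626e-34         # Planck constant in J*s
E_CHARGE = 1.602e-19  # elementary charge in C

def stopping_voltage(frequency_hz, work_function_ev):
    """Threshold voltage U_t in volts for light of the given frequency.

    Returns 0.0 when a single photon carries less energy than the
    work function, i.e. when no electron is released at all.
    """
    photon_energy_ev = H * frequency_hz / E_CHARGE
    return max(photon_energy_ev - work_function_ev, 0.0)

# Hypothetical cathode with a work function of 2.0 eV (assumed value)
for f in (4.0e14, 6.0e14, 8.0e14):
    print(f"f = {f:.1e} Hz -> U_t = {stopping_voltage(f, 2.0):.2f} V")
```

Above the cut-off frequency the function rises linearly with slope $h/e$, and the intensity of the light does not enter the calculation at all.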
Light is therefore on one hand a wave because we have wave phenomena like refraction
and diffraction (see chapter 14.1.3). On the other hand it has particle-like properties
because its energy (and also its momentum) travels in small packets. This shows the very
unintuitive nature of quantum mechanics.

14.1.3 Double slit experiment

The double slit experiment is an experiment which demonstrates the wave property of quantum
objects. This experiment (see figure 14.5) consists of a light source which emits light
with a wavelength $\lambda$ and a wall which absorbs light except at two small slits (small
compared to the wavelength) which are separated by a distance similar to $\lambda$. The light can pass
through these two slits and is measured on a screen.

Figure 14.5: Double slit experiment: The source shines on the wall, which gives a pattern.
Two different patterns are shown. In the left pattern the intensity $I_1$ from the first
slit (upper one) is drawn for the case that we block the second slit, and also the intensity $I_2$ from the
second slit if we block the first one. Additionally the sum $I = I_1 + I_2$ is drawn. The
right pattern is the wave pattern which one expects when a wave hits both slits and gets
diffracted [53].

Let’s now discuss what we measure on the screen and what happens if we change the
setup slightly:

Normal setup, no change

The light behaves as a wave and gets diffracted at the two slits. This means that we have
to treat each slit as a source of new spherical waves $A_1$ and $A_2$ and then, for each point on
the screen, add the amplitudes from each slit, $A = A_1 + A_2$. The measured pattern is then
proportional to $A^2$ because we measure the intensity. Therefore we get an interference
pattern (the right pattern in figure 14.5).

Closing one slit

If we close one slit the light goes only through the other slit. A spherical wave emerges
there and hits the screen, producing a pattern like $I_1$ (or $I_2$ if the first slit is closed).

Measuring the path

If we have both slits open we get a diffraction pattern as described above. Thinking of the
light as a flow of photons which (should) behave like particles, we might be interested in
measuring through which slit the individual photons pass. Therefore we install a detector
which measures where the particles come from. As soon as the detector is installed and
measures, the light behaves differently: the diffraction pattern disappears and we get a
pattern as if each particle flew through one or the other slit, so we get the left
pattern in figure 14.5. This (process) is called collapse of the wave (function) because the
wave property has disappeared. We conclude that the measurement influences the outcome of
the experiment, which is typical for quantum mechanics.

Very low intensity

If we reduce the intensity until only one photon is emitted at a time we will also get
the interference pattern if we wait long enough (see figure 14.6). Therefore each single
photon interferes with itself.

Figure 14.6: The light shines with very low intensity such that single photons reach the
screen. On the left side the beginning of the experiment is shown, where not many photons
have hit the screen. On the right side many photons have hit the screen and an interference pattern
is visible [54].

The pattern is then no longer a continuous intensity distribution because single photons produce
only small dots. The light dots show that light hits the screen as a small packet (in fact it
is not possible to see a photon hitting a usual screen, but if one replaces the screen by
a very sensitive detector it is possible to measure single photons). Therefore the pattern
is the sum of all these photons and it is proportional to the probability that a photon
arrives at that position.
We can also switch on the detector to measure through which slit a photon passes. If we
do so we again measure a probability distribution like the left pattern in figure 14.5.

Interpretation
Since the behaviour of the experiment does not change if we perform it with single photons,
we have to describe the observations by the behaviour of single photons (and not by
the interaction of many photons). We relate the fact that a photon is a light packet to a
particle property. Therefore a photon is a particle which travels, and if it hits the screen
we see a dot like we would expect if a ball hits a wall. The interference pattern, however,
is clearly caused by a wave property.
If we combine these two properties we find that light is a flow of particles and the
probability to find a particle is connected to a wave-like probability distribution. If we describe
the light this way we can also explain intuitively the collapse of the wave when we perform
the measurement of the path: in the moment we see the photon going through
one slit, the probability to find that photon in the other slit is zero. Therefore the photon
starts its wave-like behaviour from the slit where we observe it, and therefore it gets another
probability distribution.

14.2 Laws of Quantum Mechanics

As we have seen in the experiments above, quantum systems behave quite differently from
what we know from classical mechanics. We now want to examine some basic properties
of quantum mechanics. Since the general description is too complicated for this level, we
will not be able to derive all laws and understand all connections.

14.2.1 Wavefunction and probability

As we have seen in the double slit experiment, the probability $P$ is proportional to the
square of the superposed light waves, $P \propto (E_1 + E_2)^2$, where $E_1$ is the electric field
coming from the first slit and $E_2$ the one from the second slit. Therefore the interference is
due to the fact that one takes the sum and then squares, and not the other way round:
$(E_1 + E_2)^2 \neq E_1^2 + E_2^2$.
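
The difference between "add, then square" and "square, then add" can be checked with a few numbers. The sketch below is an illustration under assumptions that are not part of the text: both slits contribute waves of equal amplitude, and each wave is represented as a complex number whose phase encodes the path difference to the chosen point on the screen.

```python
import cmath
import math

def intensities(path_difference, wavelength, amplitude=1.0):
    """Return (wave pattern, sum of the two single-slit intensities)."""
    phase = 2 * math.pi * path_difference / wavelength
    a1 = amplitude                           # wave from slit 1 (reference phase)
    a2 = amplitude * cmath.exp(1j * phase)   # wave from slit 2, shifted in phase
    wave_pattern = abs(a1 + a2) ** 2                 # add first, then square
    particle_pattern = abs(a1) ** 2 + abs(a2) ** 2   # square first, then add
    return wave_pattern, particle_pattern

for fraction in (0.0, 0.25, 0.5):
    wave, particles = intensities(fraction, 1.0)
    print(f"path difference {fraction} lambda: wave {wave:.2f}, sum {particles:.2f}")
```

The second value stays the same for every path difference, while the first one varies between zero and twice the second, which is exactly the interference pattern.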
If we describe quantum systems generally we have a function $\Psi(x)$ which is called the
wavefunction and which assigns to each place $x$ a value $\Psi(x)$. The probability $P$ to find a
particle at a certain place $x$ is then given by $P = |\Psi(x)|^2$. The absolute value is necessary
because $\Psi$ can take complex values.
The probability $P$ is a probability distribution. This means that the probability to find
the particle at exactly the position $x_0$ is zero for every $x_0$. It is like shooting a football at a
target: it is impossible to hit the target exactly in the middle, and if it seems that one has hit it in
the middle, one has to look more precisely (perform a more exact measurement) and one
will find out that it was not the middle. Therefore the probability has to be understood a
bit differently: the probability $P_{12}$ to find a particle between $x_1$ and $x_2$ is

\[ P_{12} = \int_{x_1}^{x_2} P(x)\,dx = \int_{x_1}^{x_2} |\Psi(x)|^2\,dx \]

Therefore one can understand a probability distribution (of the place) as follows: $P\,dx$ is the
probability to find a particle between $x$ and $x + dx$.

14.2.2 Measurement
If we perform a measurement, the wave behaviour (of the measured quantity) of a quantum
mechanical particle disappears and we get a concrete value (with a certain uncertainty,
see chapter 14.2.4). If our wavefunction $\Psi(x)$ is related to the probability to find the
particle at a certain position $x$ (to be more precise, to find it between $x$ and $x + dx$) and
we measure the position, we get a value of the position of the particle according to the
probability distribution.
But if $\Psi(x)$ is the wavefunction to find the particle at a certain position and we measure
the momentum of the particle, we will not get a momentum distribution according
to $|\Psi(x)|^2$. To get the probability distribution $|\Phi(p)|^2$ for the momentum $p$ with
wavefunction $\Phi(p)$ we have to make some more calculations which we do not treat here. But
we recognise that performing a measurement of the momentum knowing the position
distribution is similar to transforming the position distribution to the momentum
distribution. Therefore measurements are associated with operators: the measurement of the
momentum is like an operator $\hat{p}$ which acts as $\hat{p}(\Psi(x)) = \Phi(p)\Psi_p(x)$, where $\Psi_p(x)$ is
the wavefunction for which $|\Psi_p(x)|^2$ describes the probability distribution to find the particle
with momentum $p$ at the position $x$ (for any momentum $p$ we get a certain distribution
$\Psi_p(x)$). If we measure the momentum we measure a concrete value $p_0$ with wavefunction
$\Psi_{p_0}(x)$. If $\Psi_{p_0}(x) \neq \Psi(x)$ we get a new wavefunction. This means that the
measurement might change the behaviour of the quantum system.

14.2.3 De Broglie hypothesis

Until now we only looked at the photon as a quantum object. In 1924 De Broglie stated
that any object is a quantum object which behaves according to quantum laws. Since
quantum physics works with waves we have to assign a wavelength to each object.
De Broglie stated that the wavelength $\lambda$ of any object is given by

\[ \lambda = \frac{h}{p} \tag{14.1} \]

where $p$ is the momentum of the object and $h = 6.626 \cdot 10^{-34}$ J·s the Planck constant.
Therefore big objects with a high mass have a very small wavelength. This is also the
reason why we do not experience quantum effects in everyday life: to show the
typical properties of a wave, the wave has to hit structures which have a similar size as
the wavelength (for example, diffraction is hardly observable if slits are very far away
from each other). Since the wavelength of objects around us (like a football) is very small
($\approx 10^{-34}$ m), there exist no slits or anything similar with such a small distance (atoms
have a diameter of $\approx 10^{-10}$ m). Therefore no interference phenomena occur with everyday
objects.
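
The orders of magnitude quoted above can be reproduced with equation (14.1). The following sketch uses assumed, illustrative values (a football of 0.43 kg moving at 10 m/s and an electron moving at $10^6$ m/s); they are not taken from the text.

```python
H = 6.626e-34  # Planck constant in J*s

def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Wavelength lambda = h / p with the classical momentum p = m * v."""
    return H / (mass_kg * speed_m_per_s)

# Assumed illustrative values: a 0.43 kg football at 10 m/s and
# an electron (9.109e-31 kg) at 1e6 m/s, slow enough to ignore relativity.
football = de_broglie_wavelength(0.43, 10.0)
electron = de_broglie_wavelength(9.109e-31, 1.0e6)
print(f"football: {football:.1e} m")
print(f"electron: {electron:.1e} m")
```

The wavelength of the electron comes out comparable to the diameter of an atom, which is why structures on the atomic scale can diffract electrons.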
But one can take, for example, electrons or atoms and perform experiments similar to
the double slit experiment (see chapter 14.1.3), and one also observes the wave behaviour of
those particles.

A comment on the photon: from relativity there is the equation

\[ E^2 = c^2 p^2 + c^4 m^2 \]

with $E$ the energy, $p$ the momentum, $c$ the speed of light and $m$ the mass (to be more
precise, the rest mass). Since the photon has no mass, one gets $E = pc$. If we put
this into equation (14.1) we get $\lambda = \frac{hc}{E}$, and using $\lambda f = c$ we have $E = hf$, as we stated for
the black body radiation and the photoelectric effect (chapters 14.1.1 and 14.1.2).

14.2.4 Uncertainty Principle

If we want to measure the position of a quantum object we might use light. A limitation
of the measuring precision is the angular resolution: to distinguish two objects which
are separated by a distance $l$ we need light with a wavelength $\lambda < l$. Otherwise the
interference patterns of the two objects in our eye (or on the film in a camera) are too
close to each other to distinguish them. If we want to measure the position of a quantum
object very precisely we need light with a very short wavelength. But light with a very short
wavelength has a very high momentum, and if the light of our measurement interacts with
the quantum object, the quantum object might gain a lot of momentum. Therefore we
know its position very well but we know nothing about its momentum.
Heisenberg stated that the uncertainty of the position $\sigma_x$ and the uncertainty of the
momentum $\sigma_p$ are not independent. There is a natural bound which prevents us from
measuring both the position and the momentum very precisely. The Uncertainty Principle is

\[ \sigma_x \cdot \sigma_p \geq \frac{h}{4\pi} \]

where $h$ is again the Planck constant.
This inequality does not only hold if we measure with light; it is a fundamental uncertainty
and independent of the measuring method.
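
To get a feeling for the size of this bound, the following sketch evaluates it for an assumed example that is not part of the text: an electron whose position is known to within an atomic diameter of $10^{-10}$ m.

```python
import math

H = 6.626e-34              # Planck constant in J*s
ELECTRON_MASS = 9.109e-31  # electron mass in kg

def min_momentum_uncertainty(sigma_x):
    """Smallest sigma_p allowed by sigma_x * sigma_p >= h / (4 pi)."""
    return H / (4 * math.pi * sigma_x)

# Assumed example: position known to within 1e-10 m
sigma_p = min_momentum_uncertainty(1.0e-10)
sigma_v = sigma_p / ELECTRON_MASS  # corresponding spread in velocity
print(f"sigma_p >= {sigma_p:.2e} kg*m/s, sigma_v >= {sigma_v:.2e} m/s")
```

For an electron the resulting velocity spread is several hundred kilometres per second, whereas for an everyday object the same bound is far too small to be noticed.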
There is also an uncertainty relation between time and energy, which is given by

\[ \sigma_E \cdot \sigma_t \geq \frac{h}{4\pi} \]

where $\sigma_E$ is the uncertainty of the energy and $\sigma_t$ that of the time.

The uncertainty principle can be interpreted through the diffraction of a quantum object at
a very small slit: the smaller the slit is, the more precisely we know where the particle is
when it crosses the slit. But a small slit produces a wide diffraction pattern; therefore the
smaller the slit is, the less we know about the momentum perpendicular to the slit.

14.2.5 Schrödinger Equation

Until now we never discussed how a quantum system changes with time. The
reason is that the time evolution is described by the Schrödinger equation, which is a
rather complicated differential equation. Therefore this chapter is meant to give a complete
overview of quantum mechanics, and it is not necessary to know or understand
the equations.
Let $\Psi(x, t)$ be the wavefunction of a quantum particle along the x-axis which also
depends on the time $t$. The (time dependent) Schrödinger equation is given by

\[ i\frac{h}{2\pi}\frac{\partial \Psi}{\partial t} = -\frac{h^2}{8\pi^2 m}\frac{\partial^2 \Psi}{\partial x^2} + V(x)\Psi \tag{14.2} \]

where $i$ is the imaginary unit, $m$ the mass of the particle and $V(x)$ the potential energy.
The term $\frac{\partial \Psi}{\partial t}$ is the derivative with respect to the time. We use $\partial$ instead of
$d$ because $\Psi$ depends on the time $t$ and the position $x$, and by using $\partial$ we indicate that
we only differentiate with respect to time. $\frac{\partial^2 \Psi}{\partial x^2}$ is the second derivative with respect to $x$
and it is related to the kinetic energy of the particle.
The differential operator $i\frac{h}{2\pi}\frac{\partial}{\partial t}$ is the operator which is related to the energy distribution.
If we have a constant energy $E$ the energy operator gives us just the energy, therefore
$i\frac{h}{2\pi}\frac{\partial \Psi}{\partial t} = E\Psi$. This leads to the time independent Schrödinger equation

\[ E\Psi = -\frac{h^2}{8\pi^2 m}\frac{\partial^2 \Psi}{\partial x^2} + V(x)\Psi \tag{14.3} \]

In a problem one often searches for a function $\Psi$ which solves the equation. As you can
imagine, this is rather complicated.

14.3 Examples
We now want to look at some examples and apply the principles of quantum physics.

14.3.1 Bohr model

One major application of quantum physics is the description of atoms. The simplest atom
is the hydrogen atom with one proton as nucleus in the middle and one electron "flying"
around the proton. Since the precise calculation of the hydrogen atom is rather laborious
and needs a lot of mathematics, we will derive the Bohr model instead. The Bohr model is a
semi-classical description of the hydrogen atom which is not really correct but gives some good
predictions.
The Bohr model assumes the following properties:

• The electron orbits around the nucleus. Since an accelerated charge radiates energy,
a classical orbit, like the motion of planets around the sun, is not possible. Instead
we assume that the electron behaves like a wave with frequency $f$. This wave goes
around the nucleus and has to be a standing wave (this means that the "start" and
"end" point of the wave must meet each other). These standing waves are drawn
in figure 14.7.

• If an electron changes the orbit with energy difference ∆E it emits a photon with
frequency ν according to ∆E = hν.

Figure 14.7: The nucleus in the middle and two possible electron orbits. The orbits are
like closed standing waves [55].

The first assumption leads to the restriction that the circumference of the orbit must be a
multiple of the wavelength of the electron wave. Therefore

\[ 2\pi r = n\lambda = \frac{nh}{mv} \]

where $r$ is the radius of the orbit and $n$ is an integer. We also used the de Broglie
wavelength $\lambda = \frac{h}{p}$, where the momentum is classically given by $p = mv$, with $m$ the
mass of the electron and $v$ its velocity.

Another equation is given by the classical orbit equation, where the centripetal force is
due to the electric attraction. Therefore we get

\[ m\frac{v^2}{r} = \frac{Ze^2}{4\pi\varepsilon_0 r^2} \tag{14.4} \]

where $Z$ is the number of protons in the nucleus (which is 1 for hydrogen, but we want
to calculate it more generally).

If we combine these two equations we get

\[ r_n = n^2\,\frac{\varepsilon_0 h^2}{\pi m Z e^2} \]

where $r_n$ labels the different orbits according to $n$. For the smallest radius of the
hydrogen atom we get $r_1 = 5.3 \cdot 10^{-11}$ m.

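If we now calculate the energy of the electron going from an orbit with radius $r_a$ to
one with radius $r_b$ we have to take into account two terms: the kinetic energy and the
potential energy. The potential energy is given by

\[ \Delta E_{pot} = \frac{Ze^2}{4\pi\varepsilon_0}\left(\frac{1}{r_a} - \frac{1}{r_b}\right) \]

The kinetic energy of the electron on an orbit $r_n$ is given by

\[ E_{kin}(r_n) = \frac{1}{2} m v_n^2 = \frac{1}{2}\,\frac{Ze^2}{4\pi\varepsilon_0 r_n} \]

where we used equation (14.4) to express the velocity by the radius.

The total energy difference between $r_a$ and $r_b$ is therefore

\[ \begin{aligned}
\Delta E_{tot} &= \Delta E_{pot} + E_{kin}(r_b) - E_{kin}(r_a) \\
&= \frac{1}{2}\,\frac{Ze^2}{4\pi\varepsilon_0}\left(\frac{1}{r_a} - \frac{1}{r_b}\right) \\
&= \frac{1}{2}\,\frac{Ze^2}{4\pi\varepsilon_0 r_1}\left(\frac{1}{a^2} - \frac{1}{b^2}\right)
\end{aligned} \]

where we used that $r_a = a^2 r_1$ and $r_b = b^2 r_1$ with $a$ and $b$ integers.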

If $r_a > r_b$ the electron loses energy, which causes a photon to leave the atom with the
frequency $\nu = \frac{E}{h}$ and the wavelength $\lambda = \frac{ch}{E}$, where $E = |\Delta E_{tot}|$. For the transition from the
second ($a = 2$) to the first ($b = 1$) orbit we get a wavelength of 121 nm, which is clearly in
the ultraviolet. Since any other transition to the first orbit has more energy, none of the transitions
to the first orbit is visible. This is different for transitions to the second orbit, $b = 2$,
where we might get visible light. This is shown in figure 14.8. An important property
of the calculated energies is that they have discrete values. Therefore a hydrogen atom
can emit light only at certain frequencies. These frequencies correspond to the
spectral lines of hydrogen.
If we calculate how much energy is needed to take the electron from the first orbit to
infinity we get an energy of 13.6 eV, which agrees with the measurement.
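
The numbers quoted for the Bohr model can be reproduced directly from the formulas for $r_n$ and $\Delta E_{tot}$. The following Python sketch is only an illustration; the function names are chosen here for convenience and the constants are rounded values.

```python
import math

H = 6.626e-34         # Planck constant in J*s
C = 2.998e8           # speed of light in m/s
EPS0 = 8.854e-12      # vacuum permittivity in F/m
M_E = 9.109e-31       # electron mass in kg
E_CHARGE = 1.602e-19  # elementary charge in C

def orbit_radius(n, z=1):
    """Radius r_n = n^2 * eps0 * h^2 / (pi * m * Z * e^2) in metres."""
    return n**2 * EPS0 * H**2 / (math.pi * M_E * z * E_CHARGE**2)

def energy_difference(a, b, z=1):
    """Delta E_tot in joules for an electron going from orbit a to orbit b."""
    prefactor = 0.5 * z * E_CHARGE**2 / (4 * math.pi * EPS0 * orbit_radius(1, z))
    return prefactor * (1 / a**2 - 1 / b**2)

def photon_wavelength(a, b, z=1):
    """Wavelength lambda = c*h / E of the photon emitted for a > b."""
    return C * H / abs(energy_difference(a, b, z))

print(f"r_1 = {orbit_radius(1):.2e} m")
print(f"2 -> 1: {photon_wavelength(2, 1) * 1e9:.0f} nm")
print(f"3 -> 2: {photon_wavelength(3, 2) * 1e9:.0f} nm")
# Taking the electron from the first orbit to infinity (b -> infinity)
ionisation_ev = energy_difference(1, math.inf) / E_CHARGE
print(f"ionisation energy: {ionisation_ev:.1f} eV")
```

The transition from the third to the second orbit falls into the visible range, in line with the statement that transitions to the second orbit can produce visible light.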

The Bohr model is a first approach to describe the behaviour of atoms with only one
electron. The problem with the model is that it still does not really avoid the problem
that an accelerated charge radiates electromagnetic waves and therefore loses energy.
Additionally it violates Heisenberg's uncertainty principle. To get the precise description
of an atom one has to solve Schrödinger's equation (see chapter 14.2.5).

14.3.2 Rigorous example

In this chapter we want to look at what a calculation with the wave function looks like.
This is not relevant for any selection round, and not even for the IPhO. Since the math
is sometimes beyond this level, not everything can be derived.

The simplest case of a quantum system is a particle in an interval $[0, a]$
where the potential is zero, with infinite potential everywhere else (see figure 14.9).

We now search for static wave functions, which means that they do not change with time.
This simplifies the calculation a lot because we can use the time independent equation
(14.3) instead of the time dependent one (14.2). For $x < 0$ or $x > a$ we therefore get

\[ -\frac{h^2}{8\pi^2 m}\frac{\partial^2 \Psi}{\partial x^2} = (E - V)\Psi \]

Since $E - V$ is something like minus infinity we see that the particle would need infinite
energy to be in this region, which is impossible. Therefore the probability to find the
particle there is zero, and as a consequence the wave function is zero too.
More interesting is the region $0 < x < a$ because there we have to solve the equation

\[ -\frac{h^2}{8\pi^2 m}\frac{\partial^2 \Psi}{\partial x^2} = E\Psi \tag{14.5} \]

If the second derivative appears it is always a good idea to think of $\sin(x)$ and $\cos(x)$.
We try to solve equation (14.5) with the trial function $\Psi(x) = A\sin(kx) + B\cos(kx)$, where
$A$, $B$ and $k$ are constants which have to be determined. If we rearrange equation (14.5) a
bit and insert the trial function we get

\[ \begin{aligned}
-\frac{8\pi^2 mE}{h^2}\,\Psi &= \frac{\partial^2 \Psi}{\partial x^2} \\
-\frac{8\pi^2 mE}{h^2}\left(A\sin(kx) + B\cos(kx)\right) &= -k^2\left(A\sin(kx) + B\cos(kx)\right) \\
k &= \sqrt{2mE}\,\frac{2\pi}{h}
\end{aligned} \]

To get a restriction on $A$ and $B$ we have to consider the following: at the position $x = 0$
the wave must have amplitude zero because there the infinite potential begins. Therefore
$B = 0$. Additionally we have the restriction $\Psi(a) = 0$, which means that
the argument of the sine must be a multiple of $\pi$, i.e. $ka = n\pi$ where $n$ is an
integer. But this can only be the case if the energy $E$ has a certain value, namely

\[ E_n = \frac{n^2\pi^2 h^2}{2ma^2\,4\pi^2} = \frac{n^2 h^2}{8ma^2} \]

This means that only discrete energy levels are allowed in order to get time independent
solutions. The subscript $n$ of $E_n$ indicates which energy level we look at.

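To get $A$ we use the condition that the total probability to find the particle between
0 and $a$ must be 1:

\[ \begin{aligned}
1 &= \int_0^a |\Psi(x)|^2\,dx \\
&= \int_0^a A^2 \sin^2(kx)\,dx \\
&= A^2\left[\frac{1}{2}\left(-\frac{1}{k}\sin(kx)\cos(kx) + x\right)\right]_0^a \\
&= A^2\,\frac{a}{2} \\
A &= \sqrt{\frac{2}{a}}
\end{aligned} \]

Therefore the possible wavefunctions are superpositions of the $\Psi_n(x)$, which are

\[ \Psi_n(x) = \sqrt{\frac{2}{a}}\,\sin\left(\frac{n\pi}{a}x\right) \]

where $\Psi_n(x)$ has the energy

\[ E_n = \frac{n^2\pi^2 h^2}{2ma^2\,4\pi^2} = \frac{n^2 h^2}{8ma^2} \]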
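
The result can be checked numerically. The sketch below is an illustration under assumed values (an electron in a box of width 1 nm, which is not an example from the text): it evaluates $E_n$, confirms that $\Psi_n$ is normalised and estimates the probability to find the particle in the first quarter of the box with a simple midpoint rule.

```python
import math

H = 6.626e-34         # Planck constant in J*s
M_E = 9.109e-31       # electron mass in kg
E_CHARGE = 1.602e-19  # elementary charge in C, used to convert J to eV

def box_energy(n, width, mass=M_E):
    """Energy E_n = n^2 h^2 / (8 m a^2) in joules."""
    return n**2 * H**2 / (8 * mass * width**2)

def psi(n, x, width):
    """Stationary wavefunction sqrt(2/a) * sin(n pi x / a) inside the box."""
    return math.sqrt(2 / width) * math.sin(n * math.pi * x / width)

def probability(n, x1, x2, width, steps=10_000):
    """Midpoint-rule estimate of the integral of |psi_n|^2 from x1 to x2."""
    dx = (x2 - x1) / steps
    return sum(psi(n, x1 + (i + 0.5) * dx, width) ** 2 for i in range(steps)) * dx

WIDTH = 1.0e-9  # assumed box width of 1 nm
ground_state_ev = box_energy(1, WIDTH) / E_CHARGE
total = probability(1, 0.0, WIDTH, WIDTH)          # should be close to 1
first_quarter = probability(1, 0.0, WIDTH / 4, WIDTH)
print(f"E_1 = {ground_state_ev:.3f} eV")
print(f"total probability = {total:.6f}")
print(f"probability in first quarter = {first_quarter:.4f}")
```

In the ground state the particle is found less often near the walls than a uniform distribution would suggest, because the standing wave has its maximum in the middle of the box.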

Figure 14.8: Transitions to the second orbit [56].

Figure 14.9: Infinite potential (V ) for x < 0 and x > a. Additionally the first three
standing waves are shown.

Chapter 15

INTRODUCTION TO STATISTICS
What is the average air speed velocity
of an unladen swallow?
What do you mean, an African or European
swallow?

15.1 Location and Spread of a single Set of Data
15.2 Uncertainty Propagation
15.3 Units
15.4 Graphs

The main goal of statistical methods is to make inferences based on data. There are three
important steps in this process: Collecting the data, describing the data and analyzing the
data. While we will focus on descriptive statistics in this introduction, it is important to
mention that every step heavily relies on the previous step. If your experimental setup
does not provide good data, you will not be able to draw any meaningful conclusions
(Garbage in - Garbage out).
The main sources are [57], [58], [59], [60] and [61].

15.1 Location and Spread of a single Set of Data

Let $X = \{x_1, x_2, \ldots, x_n\}$ denote a set of $n$ data points. The mean $\bar{x}$ and the variance
$\sigma^2$ of $X$ are defined as

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{15.1} \]

\[ \sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{15.2} \]

By taking the square root of $\sigma^2$, we find the standard deviation, a common measure for
how spread out the data is, which has the nice property of having the same units as the original
data. By dividing the standard deviation by the mean, we get the coefficient of variation,
an indicator for the relative spread of the data.

\[ \sigma = \sqrt{\sigma^2} \tag{15.3} \]

\[ CV = \frac{\sigma}{\bar{x}} \tag{15.4} \]
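
Equations (15.1) to (15.4) translate almost line by line into code. The following Python sketch uses made-up measurement values purely for illustration.

```python
import math

def mean(data):
    """Arithmetic mean, equation (15.1)."""
    return sum(data) / len(data)

def variance(data):
    """Variance with the n - 1 denominator of equation (15.2)."""
    x_bar = mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

def std_dev(data):
    """Standard deviation, equation (15.3)."""
    return math.sqrt(variance(data))

def coefficient_of_variation(data):
    """Relative spread, equation (15.4)."""
    return std_dev(data) / mean(data)

# Made-up example: five repeated length measurements in metres
lengths = [1.02, 0.98, 1.01, 0.99, 1.00]
print(f"mean = {mean(lengths):.3f} m")
print(f"standard deviation = {std_dev(lengths):.4f} m")
print(f"CV = {coefficient_of_variation(lengths):.4f}")
```

Note that the standard deviation carries the unit of the data (metres here), while the coefficient of variation is dimensionless.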
Another way of describing how data is located is by quantiles. The idea is to divide
the data into equally large groups and indicate where these cuts are. As an example, when
calculating the lower and upper quartile, one quarter of the data points are smaller than the
lower quartile and one quarter are larger than the upper quartile, with the remaining half
being located between these two values. It is important to note that for this calculation
the data must first be sorted from smallest to largest (i.e. $x_{i-1} \leq x_i \leq x_{i+1}$). The general
formula for calculating quantiles is then given by (15.5) for a cut at the fraction $p$ (i.e. $p$
of the data points below the cut and $(1 - p)$ above the cut). As an example, for the lower
and upper quartiles one would calculate $Q(0.25)$ and $Q(0.75)$.

\[ Q(p) = (1 - g)\cdot x_j + g\cdot x_{j+1} \tag{15.5} \]

\[ j = \left\lfloor pn + \frac{1}{2} \right\rfloor \qquad g = pn + \frac{1}{2} - j \tag{15.6} \]

(Note that $\lfloor z \rfloor$ designates the floor of $z$, i.e. the largest integer not greater than $z$.)
Common quantiles used are quartiles ($p = \frac{k}{4}$), deciles ($p = \frac{k}{10}$), percentiles ($p = \frac{k}{100}$)
and of course the median ($p = \frac{1}{2}$), for which we can simplify the calculation:

\[ \tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & n \text{ odd} \\ \frac{1}{2}\cdot\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right) & n \text{ even} \end{cases} \tag{15.7} \]
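
As a sketch of how (15.5) and (15.6) can be applied, the following Python function computes $Q(p)$ for a list of numbers. Two points are assumptions added here and not part of the formulas above: Python lists start counting at 0, so $x_j$ becomes `xs[j - 1]`, and cuts that would require $x_0$ or $x_{n+1}$ are clamped to the smallest or largest data point.

```python
import math

def quantile(data, p):
    """Quantile Q(p) following equations (15.5) and (15.6)."""
    xs = sorted(data)                 # the formula requires sorted data
    n = len(xs)
    j = math.floor(p * n + 0.5)
    g = p * n + 0.5 - j
    if j < 1:                         # cut below the first data point (assumption: clamp)
        return xs[0]
    if j >= n:                        # cut at or above the last data point (assumption: clamp)
        return xs[-1]
    # x_j is xs[j - 1] and x_{j+1} is xs[j] because of 0-based indexing
    return (1 - g) * xs[j - 1] + g * xs[j]

sample = [7, 1, 3, 9, 5]              # made-up data
print(quantile(sample, 0.25), quantile(sample, 0.5), quantile(sample, 0.75))
```

For $p = \frac{1}{2}$ the function reproduces the simplified median rule (15.7) for both odd and even $n$.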

15.1.1 Bivariate Analysis

If we have two sets of data $X$ and $Y$ collected in parallel so that each $x_i$ is associated to its
corresponding $y_i$, it might be interesting to measure how the data in these sets might be
connected. This can be quantified by calculating the covariance and correlation of these
sets:

\[ \mathrm{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \tag{15.8} \]

\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \cdot \sigma_y} \tag{15.9} \]

Finally, if we suspect that there is a linear relationship between $X$ and $Y$, we can try to
calculate a line $y = a + bx$ that fits our data as well as possible. This is called linear
regression and is often done using the least squares method. This method minimizes
the sum of the squared errors between the calculated line and the observed data points. The
coefficients can be calculated as shown in (15.10) and (15.11). The derivation can be
found in the appendix.

\[ b = \frac{\mathrm{Cov}(X, Y)}{\sigma_x^2} \tag{15.10} \]

\[ a = \bar{y} - b\bar{x} \tag{15.11} \]
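
A minimal Python sketch of equations (15.8) to (15.11) could look as follows. The data points are made up for illustration, and the function names are chosen here for convenience.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Covariance with the n - 1 denominator of equation (15.8)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Correlation, equation (15.9); the variance of X is Cov(X, X)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def linear_fit(xs, ys):
    """Return (a, b) of the least squares line y = a + b x, (15.10) and (15.11)."""
    b = covariance(xs, ys) / covariance(xs, xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Made-up measurements that roughly follow y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = linear_fit(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, correlation = {correlation(xs, ys):.4f}")
```

A correlation close to 1 or -1 indicates that a straight line describes the data well, while a value near 0 means that the fitted line has little meaning.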

15.2 Uncertainty Propagation

When performing experiments one should always be aware of uncertainties in measurements.
There are many sources of uncertainties: noise, minimum resolution, misalignment,
calibration, etc. To assess the impact of uncertain measurements on the results of complex
calculations, the propagation of uncertainty has to be analyzed.

15.2.1 Quantification of Uncertainty

There are two different representations of uncertainty for a measurement. Additive
uncertainty is expressed as an absolute range by which the real value might be different from
the measured value (e.g. ±0.001 m), while relative uncertainty is expressed as a fraction
of the measurement (e.g. ±0.1%). They are easily convertible into one another:

\[ \varepsilon_{rel} = \frac{\varepsilon_{add}}{|X|} \tag{15.12} \]

Since it is often impossible to find definitive upper and lower bounds for the error of a
measurement, we usually express uncertainty in terms of the expected standard deviation
of the measurement, $x_{real} = x_m \pm \sigma_{x_m}$, where $\sigma_{x_m}$ can be found either by performing
a measurement multiple times or by carefully assessing the different possible sources of
uncertainties (e.g. for a ruler with mm markings, it is reasonable to assume an uncertainty
of ±0.5 mm). For this reason, we will express uncertainties as additive uncertainties $\delta X$
in this document.

15.2.2 Propagation of Uncertainty

In this section, we will consider different types of operations $R$ being applied to uncertain
variables $X \pm \delta X$, $Y \pm \delta Y$ and $Z \pm \delta Z$, and will quantify the uncertainty $\delta R$ of the
result.

• Addition of a constant
The addition of a constant will not affect additive uncertainty.

\[ R(X) = X + c \tag{15.13} \]
\[ \delta R = \delta X \tag{15.14} \]

• Addition of uncertain variables
By adding multiple uncertain variables, the variance of the result is the combined
variance of the individual elements. However, as we express uncertainties with
standard deviation, we need to take the square root of the added variances.

\[ R(X, Y, Z) = X + Y - Z \tag{15.15} \]
\[ \delta R = \sqrt{(\delta X)^2 + (\delta Y)^2 + (\delta Z)^2} \tag{15.16} \]

• Multiplication with a constant
By multiplying an uncertain quantity with a constant, the uncertainty is simply
multiplied with the absolute value of the constant.

\[ R(X) = a \cdot X \tag{15.17} \]
\[ \delta R = |a| \cdot \delta X \tag{15.18} \]

• Multiplication of uncertain variables
When multiplying uncertain variables, this corresponds to adding the relative variances
to find the relative variance of the result.

\[ R(X, Y, Z) = \frac{X \cdot Y}{Z} \tag{15.19} \]
\[ \delta R = |R| \cdot \sqrt{\left(\frac{\delta X}{X}\right)^2 + \left(\frac{\delta Y}{Y}\right)^2 + \left(\frac{\delta Z}{Z}\right)^2} \tag{15.20} \]

• General Operations
For general operations, the formula can be derived by considering the variation of
the result due to every variable and again adding them just like standard deviations.
In fact, all previous rules are just special cases of this rule and can be derived easily.

\[ R(X, Y, \ldots) \tag{15.21} \]
\[ \delta R = \sqrt{\left(\frac{\partial R}{\partial X}\,\delta X\right)^2 + \left(\frac{\partial R}{\partial Y}\,\delta Y\right)^2 + \ldots} \tag{15.22} \]
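
The general rule (15.22) can be evaluated numerically when the partial derivatives are tedious to work out by hand. The sketch below is one possible approach, not a prescribed method: it approximates each partial derivative with a central difference (the step size is an arbitrary choice made here) and compares the result with the special rule (15.20) for a made-up example. As in the rules above, the uncertainties of the variables are treated as independent of each other.

```python
import math

def propagate(func, values, uncertainties, rel_step=1e-6):
    """Estimate delta R from equation (15.22) with central-difference derivatives."""
    total = 0.0
    for i, (value, delta) in enumerate(zip(values, uncertainties)):
        step = rel_step * max(abs(value), 1.0)
        up = list(values)
        down = list(values)
        up[i] = value + step
        down[i] = value - step
        partial = (func(*up) - func(*down)) / (2 * step)  # approximates dR/dX_i
        total += (partial * delta) ** 2
    return math.sqrt(total)

def ratio(x, y, z):
    """Example operation R = X * Y / Z from equation (15.19)."""
    return x * y / z

# Made-up measured values and their additive uncertainties
values = [2.0, 3.0, 4.0]
uncertainties = [0.1, 0.2, 0.1]

general = propagate(ratio, values, uncertainties)
special = abs(ratio(*values)) * math.sqrt(
    sum((d / v) ** 2 for d, v in zip(uncertainties, values))
)
print(f"general rule: {general:.5f}, product rule: {special:.5f}")
```

Both results agree, which illustrates the statement that the product rule is a special case of the general formula.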

15.3 Units

A good understanding of the different units and their respective dimensions can be very
helpful to avoid careless mistakes.

15.3.1 The International System of Units (SI)

The International System of Units (SI, système international (d’unités)), is the most widely
used system of measurement due to its simplicity regarding unit conversions. The system
comprises seven base units from which many other units can be derived
(e.g. $1\,\mathrm{N} = 1\,\frac{\mathrm{kg \cdot m}}{\mathrm{s}^2}$).

Dimension Unit Abbreviation

Electric Current Ampere A
Temperature Kelvin K
Time Second s
Length Meter m
Mass Kilogram kg
Luminous Intensity Candela cd
Amount of Substance Mole mol

Table 15.1: The seven SI base units

15.3.2 Prefixes
Prefixes can be used to change the order of magnitude of a unit.

Prefix   Symbol   Factor        Prefix   Symbol   Factor

femto    f        $10^{-15}$    peta     P        $10^{15}$
pico     p        $10^{-12}$    tera     T        $10^{12}$
nano     n        $10^{-9}$     giga     G        $10^{9}$
micro    µ        $10^{-6}$     mega     M        $10^{6}$
milli    m        $10^{-3}$     kilo     k        $10^{3}$
centi    c        $10^{-2}$     hecto    h        $10^{2}$
deci     d        $10^{-1}$     deca     da       $10^{1}$

Table 15.2: Metric Prefixes

15.3.3 Dimensional Analysis

We can use the fact that dimensions corresponding to the seven base units cannot be
created from any other base dimensions to quickly check equations. If an equality does
not have the same dimension on each side, then it cannot be true (however, the
converse does not hold: even if the dimensions agree there could be a mistake in the form of
a dimensionless factor). To perform dimensional analysis on an equation, we replace all
involved variables with their respective dimension. We then simplify both sides to see if
the dimensions agree. As an example, this technique is applied to Newton’s
second Law:

\[ F = a \cdot m \]
\[ \frac{[M][L]}{[T]^2} = \frac{[L]}{[T]^2} \cdot [M] \]
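
The bookkeeping of dimensional analysis can also be automated. The following Python sketch is a simplified illustration: it assumes that a dimension is stored as a tuple of exponents and only tracks mass, length and time instead of all seven base dimensions.

```python
# A dimension is stored as the exponents of (mass, length, time).
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)

def multiply(a, b):
    """Multiplying quantities adds the exponents of their dimensions."""
    return tuple(x + y for x, y in zip(a, b))

def divide(a, b):
    """Dividing quantities subtracts the exponents."""
    return tuple(x - y for x, y in zip(a, b))

def power(a, exponent):
    """Raising a quantity to a power multiplies the exponents."""
    return tuple(x * exponent for x in a)

VELOCITY = divide(LENGTH, TIME)
ACCELERATION = divide(LENGTH, power(TIME, 2))
FORCE = divide(multiply(MASS, LENGTH), power(TIME, 2))   # kg*m/s^2

print(FORCE == multiply(ACCELERATION, MASS))   # F = a*m has matching dimensions
print(FORCE == multiply(MASS, VELOCITY))       # F = m*v does not
```

As stated above, matching dimensions are necessary but not sufficient: a wrong dimensionless factor would pass this check unnoticed.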

15.4 Graphs

15.4.1 Elements of good graphs

When presenting your results in form of graphs, there are some guidelines that you should
respect to make your graph clear and understandable:

1. Complexity: A graph should not be more complex than the data it represents.
Avoid irrelevant decoration, 3D effects and distortion.

2. Scaling:

(a) The data should not be clumped in one section of the graph
(b) The scale should not change along one axis
(c) Your axes should include 0 and have no jumps
(d) Use simple steps: one square or tick mark could represent 1, 2, 5, 10, etc.

3. Title: Your graphs should have a descriptive title that contains information about
the origin of the data.

4. Multiple Data Sets: If you're plotting multiple sets of data on the same graph, make
sure they’re easily distinguishable and include a key/legend (which should not obstruct
the data).

5. Labeled Axes: Label your axes with the name of the variable, its unit and the scale
(Ticks and Numbers). There are multiple ways of including the units, however we
suggest you use the ISO standard variable_name /unit.

6. Readability: When drawing graphs by hand, use a Ruler.

15.4.2 Logarithmic Plots

Logarithmic and semi-logarithmic plots are useful tools to identify special types of
relationships between variables. They are characterized by one or both axes being scaled
logarithmically instead of linearly. For a logarithmic plot, this means that monomials of
the form $y = ax^k$ appear as straight lines with slope $k$. This can be seen by applying a
log function to both sides of the equation:

\[ \log(y) = \log(ax^k) = \log(a) + k\log(x) \tag{15.23} \]

In a similar fashion, we can see that relations of the form $y = \lambda a^{\gamma x}$ appear as a line with
slope $\gamma$ on a semi-log plot:

\[ \log_a(y) = \log_a(\lambda a^{\gamma x}) = \gamma x + \log_a(\lambda) \tag{15.24} \]

Or using a base 10 log:

\[ \log y = \log(\lambda a^{\gamma x}) = (\gamma \log(a))\,x + \log(\lambda) \tag{15.25} \]
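
Equation (15.23) also gives a practical recipe: taking the logarithm of both variables and fitting a straight line yields the exponent $k$ as the slope. The sketch below illustrates this with data generated from the assumed relation $y = 3x^2$; the least squares formulas are the ones from chapter 15.1.1, written without the common factor $n - 1$.

```python
import math

def fit_line(xs, ys):
    """Least squares line y = a + b x; returns (a, b)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - b * x_bar, b

def power_law_fit(xs, ys):
    """Fit y = a * x^k by a straight line through the log-log data; returns (a, k)."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    intercept, slope = fit_line(log_x, log_y)
    return 10 ** intercept, slope

# Data generated from the assumed relation y = 3 * x^2
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 2 for x in xs]
a, k = power_law_fit(xs, ys)
print(f"a = {a:.3f}, k = {k:.3f}")
```

The same idea works for exponential relations if only $y$ is replaced by its logarithm, following equation (15.25).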

Figure 15.1: Different Monomials on a Logarithmic Plot.

Figure 15.2: Different Exponentials on a Semi-Logarithmic Plot.

Appendix A

FURTHER DERIVATIONS
A.1 Derivations of Statistics

A.1 Derivations of Statistics

A.1.1 Alternative formulations for Variance and Covariance
By applying the sum to individual elements and using ni=1 xi = nx̄, alternative formu-
P
lations can be found that are sometimes more comfortable to apply
n
1 X
σ2 = (xi − x̄)2
n−1
i=1
n
1 X
x2i − 2xi x̄ + x̄2

=
n−1
i=1
n n n
!
1 X X X
= x2i − 2x̄ xi + x̄ 2
n−1
i=1 i=1 i=1
n
!
1 X
= x2i − 2nx̄2 + nx̄2
n−1
i=1
n
!
1 X
= x2i − nx̄2
n−1
i=1

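\begin{align*}
\mathrm{Cov}(X, Y) &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
                   &= \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x}\bar{y} \right) \\
                   &= \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i - \bar{x} \sum_{i=1}^{n} y_i + \sum_{i=1}^{n} \bar{x}\bar{y} \right) \\
                   &= \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} - n\bar{y}\bar{x} + n\bar{x}\bar{y} \right) \\
                   &= \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \right)
\end{align*}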
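As a quick numerical cross-check of the two derivations above, the following Python sketch evaluates both the definition and the alternative formulation on a small data set. The values in `xs` and `ys` are made up for illustration and are not measurements from this script.

```python
# Illustrative data (assumed values, chosen so that the means are simple).
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 7.0, 8.0]
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Variance: definition versus alternative formulation.
var_def = sum((x - x_mean) ** 2 for x in xs) / (n - 1)
var_alt = (sum(x * x for x in xs) - n * x_mean ** 2) / (n - 1)

# Covariance: definition versus alternative formulation.
cov_def = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)
cov_alt = (sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean) / (n - 1)

print(var_def, var_alt)
print(cov_def, cov_alt)
```

Both pairs of values agree up to rounding. For data with a large mean and a small spread, the alternative formulation subtracts two nearly equal numbers and can lose precision in floating-point arithmetic, so the definition is usually the safer choice on a computer.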

A.1.2 Derivation of the Least Squares Coefficients

We want to generate a linear regression to predict $y$ for a given $x$. We will call $\hat{y} = a + bx$
our predictor for $y$. We define the sum of squared errors as a function of the coefficients
$a$ and $b$:

\begin{align*}
S(a, b) &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \\
        &= \sum_{i=1}^{n} (y_i - a - b x_i)^2
\end{align*}

Since we want to find a and b that minimize this function, we will set the partial derivatives
with respect to a and b equal to zero and solve for a and b.

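\begin{align}
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0 \tag{A.1} \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} \left( x_i (y_i - a - b x_i) \right) = 0 \tag{A.2}
\end{align}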

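We will start by simplifying the sum in (A.1):

\begin{align}
\sum_{i=1}^{n} (y_i - a - b x_i) &= \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} a - b \sum_{i=1}^{n} x_i \tag{A.3} \\
                                 &= n\bar{y} - na - bn\bar{x} \tag{A.4}
\end{align}

Inserting this into (A.1):

\begin{align}
-2(n\bar{y} - na - bn\bar{x}) &= 0 \tag{A.5} \\
\bar{y} - a - b\bar{x} &= 0 \tag{A.6} \\
a &= \bar{y} - b\bar{x} \tag{A.7}
\end{align}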


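We will use this expression for $a$ and simplify the sum in (A.2):

\begin{align}
\sum_{i=1}^{n} \left( x_i (y_i - a - b x_i) \right) &= \sum_{i=1}^{n} \left( x_i (y_i - \bar{y} + b\bar{x} - b x_i) \right) \tag{A.8} \\
  &= \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \bar{y} + \sum_{i=1}^{n} x_i b\bar{x} - \sum_{i=1}^{n} b x_i^2 \tag{A.9} \\
  &= \sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i + b\bar{x} \sum_{i=1}^{n} x_i - b \sum_{i=1}^{n} x_i^2 \tag{A.10} \\
  &= \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} + b \left( n\bar{x}^2 - \sum_{i=1}^{n} x_i^2 \right) \tag{A.11} \\
  &= (n-1) \cdot \mathrm{Cov}(X, Y) - b(n-1) \cdot \sigma_x^2 \tag{A.12}
\end{align}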

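Inserting this into (A.2):

\begin{align}
-2 \left( (n-1) \cdot \mathrm{Cov}(X, Y) - b(n-1) \cdot \sigma_x^2 \right) &= 0 \tag{A.13} \\
\mathrm{Cov}(X, Y) - b\sigma_x^2 &= 0 \tag{A.14} \\
b &= \frac{\mathrm{Cov}(X, Y)}{\sigma_x^2} \tag{A.15}
\end{align}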
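The results (A.7) and (A.15) translate directly into a few lines of code. The following Python sketch is a minimal illustration under assumed data: the values in `xs` and `ys` are invented, and the variable names are chosen for this example only. It computes $a$ and $b$ and then checks that both partial derivatives (A.1) and (A.2) vanish at these values.

```python
# Illustrative measurements (assumed values, roughly following y = 1 + 2x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Sample covariance and variance with the 1/(n-1) normalization used above.
cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - x_mean) ** 2 for x in xs) / (n - 1)

b = cov_xy / var_x         # slope, equation (A.15)
a = y_mean - b * x_mean    # intercept, equation (A.7)

# Both partial derivatives of S(a, b) should vanish at the minimum.
dS_da = -2 * sum(y - a - b * x for x, y in zip(xs, ys))
dS_db = -2 * sum(x * (y - a - b * x) for x, y in zip(xs, ys))

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"dS/da = {dS_da:.2e}, dS/db = {dS_db:.2e}")
```

Note that the factor $1/(n-1)$ cancels in the ratio (A.15), so the same slope is obtained if both sums are left unnormalized.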

Appendix B

TABLES
B.1 List of physical constants (in SI units) . . . . . . . . . . . . . . . . 378
B.2 List of named, SI derived units . . . . . . . . . . . . . . . . . . . . 379
B.3 List of material constants . . . . . . . . . . . . . . . . . . . . . . . 379


B.1 List of physical constants (in SI units)

Name                                       Symbol   Value               Unit

Atomic mass unit                           u        1.660 · 10^{-27}    kg
Atomic mass unit                           u c^2    931.49              MeV
Avogadro constant                          N_A      6.022 · 10^{23}     mol^{-1}
Bohr radius                                a_0      5.2917 · 10^{-11}   m
Boltzmann constant                         k_B      1.3806 · 10^{-23}   J·K^{-1}
Elementary charge                          e        1.602 · 10^{-19}    C
Vacuum permittivity / electric constant    ε_0      8.8541 · 10^{-12}   A·s·V^{-1}·m^{-1}
Gravitational acceleration (average)       g        9.807               m·s^{-2}
Universal gas constant                     R        8.3145              J·mol^{-1}·K^{-1}
Gravitational constant                     G        6.673 · 10^{-11}    m^3·kg^{-1}·s^{-2}
Speed of light                             c        2.9979 · 10^{8}     m·s^{-1}
Vacuum permeability / magnetic constant    μ_0      4π · 10^{-7}        V·s·A^{-1}·m^{-1}
Normal pressure                            p_0      101325              Pa
Planck constant                            h        6.626 · 10^{-34}    J·s
Mass of electron                           m_e      9.109 · 10^{-31}    kg
Mass of neutron                            m_n      1.675 · 10^{-27}    kg
Mass of proton                             m_p      1.673 · 10^{-27}    kg
Rydberg constant                           R_H      1.097 · 10^{7}      m^{-1}
Stefan-Boltzmann constant                  σ        5.670 · 10^{-8}     W·m^{-2}·K^{-4}
Wave impedance of vacuum                   Z_0      376.7               Ω

Table B.1: List of physical constants [29]
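Several entries of Table B.1 are linked by standard relations, which makes a quick plausibility check possible. The following Python sketch is not part of the cited source; it assumes the well-known identities $c = 1/\sqrt{\varepsilon_0 \mu_0}$, $Z_0 = \sqrt{\mu_0/\varepsilon_0}$ and $R = N_A k_B$, takes the Avogadro constant with exponent $+23$, and compares the derived values with the tabulated ones.

```python
import math

# Values as listed in Table B.1 (SI units).
mu_0 = 4 * math.pi * 1e-7   # vacuum permeability
eps_0 = 8.8541e-12          # vacuum permittivity
k_B = 1.3806e-23            # Boltzmann constant
N_A = 6.022e23              # Avogadro constant (exponent +23 assumed)

# Derived quantities from the assumed standard relations.
c_derived = 1 / math.sqrt(eps_0 * mu_0)   # speed of light
Z0_derived = math.sqrt(mu_0 / eps_0)      # wave impedance of vacuum
R_derived = N_A * k_B                     # universal gas constant

print(f"c   = {c_derived:.4e} m/s     (table: 2.9979e8)")
print(f"Z_0 = {Z0_derived:.1f} Ohm         (table: 376.7)")
print(f"R   = {R_derived:.4f} J/(mol K) (table: 8.3145)")
```

Small deviations in the last digits are expected, because the tabulated values are rounded.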


B.2 List of named, SI derived units

Unit      Symbol   Quantity                  Equivalents    SI equivalent

hertz     Hz       frequency                 1/s            s^{-1}
radian    rad      angle                     m/m            1
newton    N        force                     kg·m/s^2       kg·m·s^{-2}
pascal    Pa       pressure, stress          N/m^2          kg·m^{-1}·s^{-2}
joule     J        energy, work, heat        N·m, W·s       kg·m^2·s^{-2}
watt      W        power                     J/s, V·A       kg·m^2·s^{-3}
coulomb   C        electric charge           F·V            A·s
volt      V        voltage                   W/A, J/C       kg·m^2·s^{-3}·A^{-1}
farad     F        capacitance               C/V            kg^{-1}·m^{-2}·s^4·A^2
ohm       Ω        resistance, impedance     1/S, V/A       kg·m^2·s^{-3}·A^{-2}
siemens   S        conductance               1/Ω, A/V       kg^{-1}·m^{-2}·s^3·A^2
tesla     T        magnetic field strength   V·s/m^2        kg·s^{-2}·A^{-1}
henry     H        inductance                V·s/A, Ω·s     kg·m^2·s^{-2}·A^{-2}

Table B.2: List of named, SI derived units

B.3 List of material constants


            Density       Speed of sound   Linear expansion   Specific heat capacity   Melting temperature
                                           coefficient
Name        ρ/kg·m^{-3}   c_s/m·s^{-1}     α/K^{-1}           C/J·kg^{-1}·K^{-1}       T_m/°C

Aluminium   2700          5240             23.8 · 10^{-6}     896                      660.1
Lead        11340         1250             31.3 · 10^{-6}     129                      327.4
Iron        7860          5170             12.0 · 10^{-6}     450                      1535
Gold        19290         3240             14.3 · 10^{-6}     129                      1063
Copper      8920          3900             16.8 · 10^{-6}     383                      1083
Brass       8470          -                18 · 10^{-6}       380                      905
Silver      10500         -                19.7 · 10^{-6}     235                      860.8

            Heat conductivity    Specific electric resistance (at 20 °C)   Magnetic permeability
Name        λ/W·m^{-1}·K^{-1}    ρ_e/Ω·m                                   μ_r

Aluminium   239                  2.82 · 10^{-8}                            1 + 2.1 · 10^{-5}
Lead        34.8                 2.2 · 10^{-7}                             diamagnetic
Iron        80                   1 · 10^{-7}                               ≈ 5800
Gold        312                  2.2 · 10^{-8}                             1 − 3.4 · 10^{-5}
Copper      390                  1.7 · 10^{-8}                             1 − 6.4 · 10^{-6}
Brass       79                   7.8 · 10^{-8}                             -
Silver      428                  1.59 · 10^{-8}                            -

Table B.3: Properties of different metals[62].

           Density       Speed of sound   Volume expansion   Specific heat capacity   Melting temperature
                                          coefficient
Name       ρ/kg·m^{-3}   c_s/m·s^{-1}     α/K^{-1}           C/J·kg^{-1}·K^{-1}       T_m/°C

Acetone    792           1190             1.49 · 10^{-3}     2160                     −94.86
Benzene    879           1326             1.23 · 10^{-3}     1725                     5.53
Ethanol    789           1170             1.1 · 10^{-3}      2430                     −114.5
Oil        ≈ 900         -                -                  -                        -
Mercury    13546         1430             1.84 · 10^{-4}     139                      −38.87
Water      998           1483             2.07 · 10^{-4}     4182                     0

           Boiling temperature   Enthalpy of fusion   Enthalpy of vaporization
Name       T_b/°C                L_m/J·kg^{-1}        L_b/J·kg^{-1}

Acetone    56.25                 9.8 · 10^{4}         5.25 · 10^{5}
Benzene    80.1                  1.28 · 10^{5}        3.94 · 10^{5}
Ethanol    78.33                 1.08 · 10^{5}        8.4 · 10^{5}
Oil        -                     -                    -
Mercury    356.58                1.18 · 10^{4}        2.85 · 10^{5}
Water      100                   3.338 · 10^{5}       2.256 · 10^{6}

Table B.4: Properties of different fluids[62].

                 Density       Speed of       Molar heat capacity    Melting       Boiling       Van der Waals       Van der Waals
                               sound          (p constant)           temperature   temperature   constant a          constant b
Name             ρ/kg·m^{-3}   c_s/m·s^{-1}   C_p/J·mol^{-1}·K^{-1}  T_m/°C        T_b/°C        a/N·m^4·mol^{-2}    b/m^3·mol^{-1}

Argon            1.784         -              20.9                   −77.7         −33.4         0.425               3.73 · 10^{-5}
Helium           0.1785        1005           20.9                   -             −268.94       0.0034              2.36 · 10^{-5}
Carbon dioxide   1.977         268            36.8                   -             −78.45        0.366               4.28 · 10^{-5}
Air              1.293         344            29.1                   -             −191.4        0.135               3.65 · 10^{-5}
Methane          0.717         445            35.6                   -             −191.4        0.229               4.28 · 10^{-5}
Neon             0.9           -              20.8                   −248.61       −245.06       0.0217              1.74 · 10^{-5}
Oxygen           1.429         326            29.3                   −218.79       −182.97       0.138               3.17 · 10^{-5}
Nitrogen         1.25          1310           29.1                   −210.0        −195.82       0.137               3.87 · 10^{-5}
Water vapour     -             -              33.6                   0             100           0.553               3.04 · 10^{-5}
Hydrogen         0.0889        1310           28.9                   −259.2        −252.77       0.0248              2.66 · 10^{-5}

Table B.5: Properties of different gases[62].

BIBLIOGRAPHY
[1] Jankovics, Peter (Hrsg.): Lambacher Schweizer 9/10: Grundlagen der Mathematik
für Schweizer Maturitätsschulen. 1. Auflage. Klett und Balmer Verlag, Zug, 2011. S.
226-232.

[2] Jankovics, Peter (Hrsg.): Lambacher Schweizer 11/12: Grundlagen der Mathematik
für Schweizer Maturitätsschulen. 1. Auflage. Klett und Balmer Verlag, Zug, 2013. S.
42-60, 106-113, 119-127, 142-163, 178-179.

[3] Philippoz, Lionel: EPFL Trainingscamp: Mathematics. Unpublished, 2016.

[4] Tipler, P. A. and Mosca, G. 2007. Physics for Scientists and Engineers – Vol. 1. 6th ed.
London: Macmillan Education

[5] Kittel, C., Knight, W. and Ruderman, M. A. 1963. Berkeley Physics Course – Vol. 1:
Mechanics. New York City: McGraw-Hill

[6] Geckeler C. and Lind, G. 2002. Physik zum Nachdenken. Köln: Aulis Verlag Deubner

[7] Gnädig, P., Honyek, G. and Riley, K. 2001. 200 Puzzling Physics Problems. Cambridge:
Cambridge University Press

[8] Feynman, R. P. 2005. The Feynman Lectures on Physics, The Definitive Edition – Vol. 1.
2nd ed. Boston: Addison Wesley

[9] https://de.wikipedia.org/wiki/Stoffmenge

[10] Physik I, ETH Zürich 2013, Prof. Dr. Werner Wegscheider.

[11] https://en.wikipedia.org/wiki/Ideal_gas_law

[12] http://www.aplusphysics.com/courses/honors/thermo/thermodynamics.html


[13] Grundkurs Theoretische Physik 4, Springer Verlag 2012, Wolfgang Nolting.

[14] https://en.wikipedia.org/wiki/Second_law_of_thermodynamics

[15] https://en.wikipedia.org/wiki/Clausius_theorem

[16] https://en.wikipedia.org/wiki/Kinetic_theory_of_gases

[17] http://www.met.reading.ac.uk/pplato2/h-flap/phys7_5.html

[18] https://courses.lumenlearning.com/cheminter/chapter/phase-diagram-for-water/

[19] Duc, P. F. 2012. EPFL Trainingscamp: Oscillations & Perturbations. Unpublished

[20] Stöcker, H. 2004. Taschenbuch der Physik: Formeln, Tafeln, Übersichten. 5th ed. Harri
Deutsch

[21] Wikimedia; https://commons.wikimedia.org/wiki/File:Harmonic_oscillator.svg

[22] Wikimedia; https://commons.wikimedia.org/wiki/File:Torzni_kyvadlo.svg

[23] Wikipedia; https://en.wikipedia.org/wiki/File:Image-Tacoma_Narrows_Bridge1.gif

[24] Wikipedia: Tacoma Narrows Bridge (1940); https://en.wikipedia.org/wiki/Tacoma_Narrows_Bridge_(1940)

[25] Wikimedia; https://commons.wikimedia.org/wiki/File:Mplwp_resonance_zeta_envelope.svg

[26] Wikimedia; https://commons.wikimedia.org/wiki/File:Damping_1.svg

[27] Wikimedia; https://commons.wikimedia.org/wiki/File:Wave_Diffraction_4Lambda_Slit.png

[28] Wikimedia; https://commons.wikimedia.org/wiki/File:EM_Spectrum_Properties_edit.svg

[29] Heribert Stroppe; Physik für Studierende der Natur- und Ingenieurwissenschaften;
15., aktualisierte Auflage; Carl Hanser Verlag; 2012 München


[30] Joachim Grehn, Joachim Krause; Metzler Physik; 4. Auflage; Bildhaus Schulbuchverlage;
2011

[31] Wikimedia; https://upload.wikimedia.org/wikipedia/commons/8/8f/Camposcargas.PNG

[32] http://hyperphysics.phy-astr.gsu.edu/hbase/electric/gaulaw.html

[33] Youtube; https://www.youtube.com/watch?v=Jx0IdlBKYNk

[34] http://slideplayer.com/slide/10856582/

[35] http://archive.cnx.org/contents/3e23d0f1-22c7-47a2-8185-c9a9fc23a9fd@2/capacitors-and-dielectrics

[36] Bureau International des Poids et Mesures; http://www.bipm.org/en/publications/si-brochure/ampere.html

[37] Wikipedia; https://en.wikipedia.org/wiki/Magnet

[38] http://www.homofaciens.de/technics-electrical-engineering-magnetic-field-energy_ge.htm

[39] Forphys; http://www.forphys.de/Website/student/bfeld.html

[40] http://www.fyzikalni-experimenty.cz/en/electromagnetism/oersted-experiment-perpendicular/

[41] http://freeweb.dnet.it/motor/Kap1.htm

[42] https://de.wikipedia.org/wiki/Kirchhoffsche_Regeln

[43] https://commons.wikimedia.org/wiki/Category:Kirchhoff

[44] https://de.wikipedia.org/wiki/Liste_der_Schaltzeichen_(Elektrik/Elektronik)

[45] Roland Puntaier; http://www.texample.net/tikz/examples/bernoulli/; modified

[46] Joachim Grehn, Joachim Krause; Metzler Physik; 4. Auflage; Bildhaus Schulbuchverlage;
2011

[47] Gerthsen, Vogel; Physik; 17. Auflage; Springer Lehrbuch; 1993


[48] Warner, Cheung; A Cavendish Quantum Mechanics Primer; 2. Edition; Periphyseos
Press Cambridge; 2013

[49] http://physical-easy.blogspot.ch

[50] https://zh.wikipedia.org/wiki/File:Blackbody-lg.png

[51] http://physicsabout.com/photoelectric-effect/

[52] http://www.dronstudy.com/book/dual-nature-previous-years-questions/

[53] http://nanotechweb.org/cws/article/lab/57800

[54] http://miro.romanvlahovic.com/2016/08/29/pob/

[55] https://www.tf.uni-kiel.de/matwis/amat/mw1_ge/kap_2/backbone/r2_1_2.html

[56] http://hyperphysics.phy-astr.gsu.edu/hbase/Bohr.html

[57] J-M. Helbling, Course «Probabilités et Statistique» (EPFL, Fall 2015)

[58] Michigan State University Online Lecture «Error Propagation»

[59] http://lectureonline.cl.msu.edu/~mmp/labs/error/e2.htm

[60] Wikipedia - Log-Log Plot, https://en.wikipedia.org/wiki/Log-log_plot

[61] Wikipedia - Semi-Log Plot, https://en.wikipedia.org/wiki/Semi-log_plot

[62] Adrian Wetzel, Formelsammlung der Physik, 2. Auflage 2009

[63] Galison, Peter (2003): Einsteins Uhren und Poincarés Karten: Die Arbeit an der
Ordnung der Zeit. Frankfurt am Main: S. Fischer, S. 226-307.

[64] https://de.wikipedia.org/wiki/Geschichte_der_speziellen_Relativitätstheorie

