
Optics in Telecommunications


Springer Series in

OPTICAL SCIENCES
Founded by H.K.V. Lotsch
Editor-in-Chief: W.T. Rhodes, Atlanta
Editorial Board: T. Asakura, Sapporo
K.-H. Brenner, Mannheim
T.W. Hänsch, Garching
T. Kamiya, Tokyo
F. Krausz, Vienna and Garching
B. Monemar, Linköping
H. Venghaus, Berlin
H. Weber, Berlin
H. Weinfurter, Munich

101

Springer Series in

optical sciences
The Springer Series in Optical Sciences, under the leadership of Editor-in-Chief William T. Rhodes, Georgia
Institute of Technology, USA, provides an expanding selection of research monographs in all major areas of
optics: lasers and quantum optics, ultrafast phenomena, optical spectroscopy techniques, optoelectronics,
quantum information, information optics, applied laser technology, industrial applications, and other
topics of contemporary interest.
With this broad coverage of topics, the series is of use to all research scientists and engineers who need
up-to-date reference books.
The editors encourage prospective authors to correspond with them in advance of submitting a manuscript. Submission of manuscripts should be made to the Editor-in-Chief or one of the Editors.

Editor-in-Chief
William T. Rhodes
Georgia Institute of Technology
School of Electrical and Computer Engineering
Atlanta, GA 30332-0250, USA
E-mail: [email protected]

Editorial Board
Ferenc Krausz
Max-Planck-Institut für Quantenoptik
Hans-Kopfermann-Straße 1
85748 Garching, Germany
E-mail: [email protected]
and
Institute for Photonics
Gußhausstraße 27/387
1040 Wien, Austria

Toshimitsu Asakura
Hokkai-Gakuen University
Faculty of Engineering
1-1, Minami-26, Nishi 11, Chuo-ku
Sapporo, Hokkaido 064-0926, Japan
E-mail: [email protected]

Karl-Heinz Brenner
Chair of Optoelectronics
University of Mannheim
Institute of Computer Engineering
B6, 26
68131 Mannheim, Germany
E-mail: [email protected]

Theodor W. Hänsch
Max-Planck-Institut für Quantenoptik
Hans-Kopfermann-Straße 1
85748 Garching, Germany
E-mail: [email protected]

Takeshi Kamiya
Ministry of Education, Culture, Sports
Science and Technology
National Institution for Academic Degrees
3-29-1 Otsuka, Bunkyo-ku
Tokyo 112-0012, Japan
E-mail: [email protected]

Bo Monemar
Department of Physics
and Measurement Technology
Materials Science Division
Linköping University
58183 Linköping, Sweden
E-mail: [email protected]

Herbert Venghaus
Heinrich-Hertz-Institut
für Nachrichtentechnik Berlin GmbH
Einsteinufer 37
10587 Berlin, Germany
E-mail: [email protected]

Horst Weber
Technische Universität Berlin
Optisches Institut
Straße des 17. Juni 135
10623 Berlin, Germany
E-mail: [email protected]

Harald Weinfurter
Ludwig-Maximilians-Universität München
Sektion Physik
Schellingstraße 4/III
80799 München, Germany
E-mail: [email protected]

Jay N. Damask

Polarization Optics in
Telecommunications
With 202 Figures

Jay N. Damask
[email protected]

Library of Congress Cataloging-in-Publication Data


Damask, Jay N.
Polarization optics in telecommunications / Jay N. Damask.
p. cm (Springer series in optical sciences, ISSN 0342-4111 ; v. 101)
Includes bibliographical references and index.
ISBN 0-387-22493-9
1. Optical communication systems. 2. Fiber optics. 3. Polarization (Light) I. Title. II.
Series.
TK5103.592.F52.D36 2004
621.382'7-dc22
2004056603
ISBN 0-387-22493-9

ISSN 0342-4111

Printed on acid-free paper.

© 2005 Springer Science+Business Media, Inc.


All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY
10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.
Printed in the United States of America.
9 8 7 6 5 4 3 2 1
springeronline.com

(SBA)

SPIN 10949047

To Diana Castelnuovo-Tedesco,
to my Family,
and in loving memory of A. C. Damask

Preface

I have written this book to fill a void between theory and practice, a void that
I perceived while conducting my own research and development of components
and instruments over the last five years. In the chapters that follow I have
pulled materials from the technical and patent literature that are relevant
to the understanding and practice of polarization optics in telecommunications, material that is often known by the respective experts in industry and
academia but is rarely if ever found in one place. By bringing this material
into one monograph, and by applying a single formalism throughout, I hope to
create a base level upon which future research and development can grow.
Polarization optics in telecommunications is an ever-evolving field. Each
year significant advancements are made, punctuated by important discoveries.
The references upon which this book is based are only a snapshot in time.
Areas that remain unresolved at the time of publication may very well be clarified in the years to come. Moreover, the focus of the field changes in time: for
instance, there have been few passive nonreciprocal component advancements
reported in the last few years, but PMD and PDL advancement continues
with only modest abatement.
The framework used throughout the monograph is the spin-vector calculus
of polarization. The spin-vector calculus as applied to telecommunications
optics has long been advocated by N. Frigo, N. Gisin, and J. Gordon. The
calculus has its origins in the quantum mechanical description of electron
spin and in classical dynamics of rotating bodies. While this calculus may be
unfamiliar to the reader, the advantage is its inherent geometric nature and
its compact form. Spin-vector calculus abstracts the matrix algebra generally
used to describe polarization into a purely vector form. Compound operations
are evaluated on the vector field before being resolved onto a local coordinate
system. Without exception I have found every derivation in this book shorter,
more intuitive, and sometimes surprisingly revealing when using spin-vector
calculus. Chapter 2 is entirely dedicated to this formalism. I assure the reader
that the time invested learning this material will be rewarding.

The monograph is divided into three logical sections: theory, components,
and fiber polarization. The three sections can be treated with some independence. Chapters 1–3 present the basic theory of Maxwell's equations, polarization, and the classical interaction of light with dielectric media. Next,
Chapters 4–7 detail passive optical components, their design, and the building
blocks upon which they are based. Special to this section is Chapter 4, which
attempts to bridge theory and practice by tabulating known properties of the
most commonly used materials and offering practical explanation of simple
optical combinations. Lastly, Chapters 8–10 present aspects of polarization-mode dispersion and polarization-dependent loss.
Even though this monograph is entitled Polarization Optics in Telecommunications, the reader should be cognizant of subjects that are missing.
Notably absent are, for example, electro-optic effects, used in polarization
controllers; liquid-crystal elements, used for switching and attenuation; and
interleaver filters, used in wavelength-division multiplexing. These omissions
are a measure of my limited experience rather than the fertility of the fields.
I have been fortunate to have a number of experts read various chapters of this book. Their help and dedication have clarified a variety of points
and helped prevent mistakes. I am indebted to Dr. C. R. Doerr, Distinguished Member Technical Staff, Bell Laboratories, Lucent Technologies;
Dr. N. J. Frigo, Division Manager, AT&T Laboratories; Dr. J. P. Mattia,
Co-Founder, Big Bear Networks; Prof. T. E. Murphy, Assistant Professor,
University of Maryland, College Park; Dr. K. R. Rochford, Division Chief,
Optoelectronics, National Institute of Standards and Technology; Dr. M. Shirasaki, Co-Founder and Chief Scientist, Arasor; and Dr. P. Westbrook, Technical Manager, Photonics Device Research, OFS Labs. Complementing my
Readers, Dr. P. A. Williams of the National Institute of Standards and Technology has carefully answered my questions throughout the entire writing of
this book; I am pleased to acknowledge his great support.
I have also contacted many other experts when I needed clarification on
particular topics. I would like to thank Mr. M. Alexandrovich, Prof. H. Ammari, Dr. N. Bergano, Mr. A. Boschi, Dr. S. Evangelides, Prof. A. Eyal,
Dr. V. Fratello, Prof. D. Hagen, Dr. D. Harris, Dr. G. Harvey, Prof. E. Ippen,
Dr. P. Leo, Dr. J. Livas, Dr. C. Madsen, Prof. A. Meccozi, Prof. C. Menyuk,
Mr. P. Myers, Dr. J. Nagel, Dr. K. Nordsieck, Dr. B. Nyman, Dr. C. Poole,
Dr. G. Shtengel, Mr. G. Simer, and Dr. P. Xie.
While I am indebted to these contributors, all mistakes are my responsibility alone. You can contact me at [email protected] and
I look forward to receiving your feedback.
I wish to thank The MathWorks Company, and especially C. Esposito, for
generous support through the MathWorks Authors program. Many of the
code pieces I used to generate the figures will be available courtesy of the
MathWorks at www.mathworks.com.
The people at Springer, New York, have generously given their time and
encouragement over the last eighteen months. In particular, I am indebted to
my editor Dr. H. Koelsch, and to F. Ganz and M. Mitchell. Their professionalism and expertise have made this project a pleasure for me.
I wish also to thank the library services at the Massachusetts Institute of
Technology. The M.I.T. technical library is a national resource and is second
to none. The professional staff and on-line databases have helped me find
original references of all sorts.
I wish to remember M.I.T. Institute Professor Hermann A. Haus, who,
over a decade, supported my pursuit into the beauties of optics.
Finally, I am indebted to my family, especially Mary and John, and to my
friends, who encouraged me throughout this project. Special acknowledgement
goes to my wife D. C.-T., without whose unwavering support this book would
not have been written.

New York City


July 2004

Jay N. Damask

Contents

1 Vectorial Propagation of Light
    1.1 Maxwell's Equations and Free-Space Solutions
    1.2 The Vector and Scalar Potentials
    1.3 Time-Harmonic Solutions
    1.4 Classical Description of Polarization
        1.4.1 Stokes Vectors, Jones and Müller Matrices
        1.4.2 The Poincaré Sphere
    1.5 Partial Polarization
        1.5.1 Coherently Polarized Waves
        1.5.2 Incoherently Depolarized Waves
        1.5.3 Pseudo-Depolarized Waves
        1.5.4 A Heterogeneous Ray Bundle
    References

2 The Spin-Vector Calculus of Polarization
    2.1 Motivation
    2.2 Vectors, Length, and Direction
        2.2.1 Bra and Ket Vectors
        2.2.2 Length and Inner Products
        2.2.3 Projectors and Outer Products
        2.2.4 Orthonormal Basis
    2.3 General Vector Transformations
        2.3.1 Operator Relations
    2.4 Eigenstates, Hermitian and Unitary Operators
        2.4.1 Hermitian Operators
        2.4.2 Unitary Operators
        2.4.3 Connection between Hermitian and Unitary Matrices
        2.4.4 Similarity Transforms
        2.4.5 Construction of General Unitary Matrix
        2.4.6 Group Properties of SU(2)
    2.5 Vectors Cast in Jones and Stokes Spaces
        2.5.1 Complete Measurement of the Polarization Ellipse
        2.5.2 Pauli Spin Matrices
        2.5.3 The Pauli Spin Vector
        2.5.4 Spin-Vector Identities
        2.5.5 Conservation of Length
        2.5.6 Orthogonal Polarization States
        2.5.7 Non-Orthogonal Polarization States
        2.5.8 Pauli Spin Operators
    2.6 Equivalent Unitary Transformations
        2.6.1 Group Properties of SU(2) and O(3)
        2.6.2 Matrix Entries of R in a Fixed Coordinate System
        2.6.3 Vector Expression of R in a Local Coordinate System
        2.6.4 Select Vector Identities
        2.6.5 Euler Rotations
        2.6.6 Some Relevant Transformation Applications
    References

3 Interaction of Light and Dielectric Media
    3.1 Introduction of Media Terms into Maxwell's Equations
    3.2 Constitutive Relation Tensors
    3.3 The kDB System
    3.4 The Lorentz Force
    3.5 Isotropic Materials
        3.5.1 Permittivity of Isotropic Materials
        3.5.2 Propagation in Isotropic Materials
        3.5.3 Refraction at an Interface
        3.5.4 Reflection and Transmission for TE Waves
        3.5.5 Reflection and Transmission for TM Waves
        3.5.6 Total Internal Reflection
    3.6 Birefringent Materials
        3.6.1 Propagation in Uniaxial Materials
        3.6.2 Refraction at an Interface
        3.6.3 Total Internal Reflection
        3.6.4 Polarization Transformation
    3.7 Gyrotropic Materials
        3.7.1 Magnetic Material Classes
        3.7.2 Permittivity of Diamagnetic Materials
        3.7.3 Propagation in Gyrotropic Materials
        3.7.4 Faraday Rotation
        3.7.5 The Verdet Constant
        3.7.6 Faraday Rotation in Ferrous Materials
    3.8 Optically Active Materials
        3.8.1 Propagation in Bi-Isotropic Media
    References

4 Elements and Basic Combinations
    4.1 Wavelength-Division Multiplexed Frequency Grid
    4.2 Properties of Select Materials
        4.2.1 Isotropic Glass Materials
        4.2.2 Birefringent Crystals
        4.2.3 Iron Garnets
        4.2.4 Packaging Alloys
    4.3 Fabry-Perot and Gires-Tournois Interferometers
        4.3.1 Fabry-Perot Response
        4.3.2 Gires-Tournois Response
    4.4 Temperature Dependence of Select Birefringent Crystals
        4.4.1 Experimental Setup
        4.4.2 Quadratic Temperature-Dependence Model
        4.4.3 Association of Resonant Peak Shift With Temperature Coefficients
        4.4.4 Group Index and Thermal-Optic Coefficients
        4.4.5 Passive Temperature Compensation
    4.5 Compound Crystals For Off-Axis Delay
    4.6 Polarization Retarders
        4.6.1 Half-Wave and Quarter-Wave Waveplates
        4.6.2 Birefringent Waveplate Technologies
        4.6.3 Waveplate Combinations
        4.6.4 Elementary Polarization Control
        4.6.5 TIR Polarization Retarders
    4.7 Single and Compound Prisms
        4.7.1 Wollaston and Rochon Prisms
        4.7.2 Kaifa Prism
        4.7.3 Shirasaki Prism
    References

5 Collimator Technologies
    5.1 Collimator Assemblies
    5.2 Gaussian Optics
        5.2.1 q Transformation and ABCD Matrices
        5.2.2 ABCD Ray Tracing
        5.2.3 Action of a Single Lens
        5.2.4 Action of a GRIN Lens
        5.2.5 Some Limitations of the ABCD Matrix
    5.3 Select Collimators Analyzed with the ABCD Matrix
    5.4 Fiber-to-Fiber Coupling by a Lens Pair
        5.4.1 Coupling Coefficients
    References

6 Isolators
    6.1 Polarizing Isolator
    6.2 Comparison of Lens Systems
    6.3 Deflection-Type Isolators
    6.4 Displacement-Type Isolators
    6.5 Two-Stage Isolators
    6.6 PMD-Compensated Isolators
    References

7 Circulators
    7.1 Polarizing Circulator
    7.2 Historical Development
    7.3 Displacement Circulators
    7.4 Deflection Circulators
    7.5 Summary
    References

8 Properties of PDL and PMD
    8.1 Polarization-Dependent Loss
        8.1.1 Definitions
        8.1.2 Change of Polarization State
        8.1.3 Repolarization
        8.1.4 PDL Evolution Equations
    8.2 Polarization-Mode Dispersion
        8.2.1 A PMD Primer
        8.2.2 Fundamental Derivations
        8.2.3 Connection Between Jones and Stokes Space
        8.2.4 Concatenation Rules for PMD
        8.2.5 PMD Evolution Equations
        8.2.6 Time-Domain Representation
        8.2.7 Fourier Analysis of the DGD Spectrum
    8.3 Combined Effects of PMD and PDL
        8.3.1 Frequency-Dependence of the Polarization State
        8.3.2 Non-Orthogonality of PSPs
        8.3.3 PMD and PDL Evolution Equations
        8.3.4 Separation of PMD and PDL
    References

9 Statistical Properties of Polarization in Fiber
    9.1 Polarization Evolution Model
        9.1.1 Random Birefringent Orientation
        9.1.2 Random Component Birefringence
    9.2 Polarization Diffusion
    9.3 RMS Differential-Group Delay Evolution
    9.4 PMD Statistics
        9.4.1 Probability Densities
        9.4.2 Autocorrelation Functions
        9.4.3 Mean-DGD Measurement Uncertainty
        9.4.4 Discrete Waveplate Model
        9.4.5 Karhunen-Loève Expansion of Brownian Motion
    9.5 PDL Statistics
    References
10 Review of Polarization Test and Measurement
    10.1 SOP Measurement
    10.2 PDL Measurement
    10.3 PMD Measurement
        10.3.1 Mean DGD Measurement
        10.3.2 PMD Vector Measurement
        10.3.3 Polarization OTDR
    10.4 Programmable PMD Sources
        10.4.1 Sources of DGD and Depolarization
        10.4.2 ECHO Sources
    10.5 Receiver Performance Validation
    References
A Addition of Multiple Coherent Waves

B Select Magnetic Field Profiles
    References

C Efficient Calculation of PMD Spectra

D Multidimensional Gaussian Deviates

Index

1
Vectorial Propagation of Light

Maxwell's equations are the basis of all optical studies. In vacuum the equations can be stripped to a pure form where the wave motion is most easily
described. Moreover, as the equations in vacuum are linear, each Fourier component of a wave can be individually studied and subsequently superimposed
to construct a composite wavefront or ray bundle. When the electromagnetic
wave propagates through media, additional terms are added to Maxwell's
equations to account for the interaction. These terms come in as constitutive
laws of the media. Constitutive laws can encompass lossy, charged, dielectric, nonlinear, or relativistic media. There is almost no end to the studies on
optical interactions already undertaken over the last several hundred years.
The purpose of this chapter and that of Chapters 2 and 3 is to derive
the necessary governing equations for studies of birefringent media, birefringent components, and birefringent effects in optical fiber. This chapter exclusively deals with Maxwell's equations in vacuum. The classical description
of polarization motion and the degree of polarization is emphasized. Chapter 2 presents a modern description of polarization that adopts well-developed
mathematical formalisms from quantum mechanics to polarization studies.
Chapter 3 adds interaction terms to Maxwell's equations to describe optical
propagation through birefringent linear dielectrics.


1.1 Maxwell's Equations and Free-Space Solutions


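The four vectorial Maxwell's equations are

Faraday's law:
$$\nabla \times \mathbf{E}(\mathbf{r},t) = -\mu_o \frac{\partial}{\partial t}\mathbf{H}(\mathbf{r},t) - \mu_o \frac{\partial}{\partial t}\mathbf{M}(\mathbf{r},t)$$

Ampère's law:
$$\nabla \times \mathbf{H}(\mathbf{r},t) = \varepsilon_o \frac{\partial}{\partial t}\mathbf{E}(\mathbf{r},t) + \frac{\partial}{\partial t}\mathbf{P}(\mathbf{r},t) + \mathbf{J}(\mathbf{r},t)$$

Gauss's electric law:
$$\varepsilon_o \nabla \cdot \mathbf{E}(\mathbf{r},t) = -\nabla \cdot \mathbf{P}(\mathbf{r},t) + \rho_f(\mathbf{r},t)$$

Gauss's magnetic law:
$$\mu_o \nabla \cdot \mathbf{H}(\mathbf{r},t) = -\mu_o \nabla \cdot \mathbf{M}(\mathbf{r},t)$$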

where the vector quantities E, H, P, M, and J and the scalar $\rho_f$ are real functions of time t
and the three-dimensional spatial vector r. These quantities are

E(r, t) :    electric field strength     (V/m)
H(r, t) :    magnetic field strength     (A/m)
P(r, t) :    polarization density        (C/m^2)
M(r, t) :    magnetization density       (A/m)
J(r, t) :    current density             (A/m^2)
$\rho_f$(r, t) :   electric charge density     (C/m^3)

where V is volts, A is amperes, and C is coulombs. The free electric charge
density $\rho_f$ is distinguished from the bound charge density $\rho_b$ as the bound
density is the generator of the polarization density vector P.
The physical constants $\varepsilon_o$ and $\mu_o$ are the permittivity and permeability of
vacuum, respectively. The values and units are [8]

$$\varepsilon_o \approx 8.854187817 \times 10^{-12} \ \text{(F/m)}$$
$$\mu_o = 4\pi \times 10^{-7} \ \text{(H/m)}$$

where F is farads and H is henries.
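As a quick numerical sanity check on these two constants, the following minimal Python sketch evaluates $1/\sqrt{\mu_o \varepsilon_o}$, which is the standard expression for the speed of light in vacuum. That identification is a well-known result rather than something stated at this point in the text, and the function name `vacuum_wave_speed` is an illustrative choice, not a name taken from the book.

```python
import math

# Vacuum constants as quoted in the text (SI units).
EPSILON_0 = 8.854187817e-12      # permittivity of vacuum, F/m
MU_0 = 4.0 * math.pi * 1e-7      # permeability of vacuum, H/m


def vacuum_wave_speed(eps: float = EPSILON_0, mu: float = MU_0) -> float:
    """Return 1/sqrt(mu * eps) in m/s for the given permittivity and permeability."""
    return 1.0 / math.sqrt(mu * eps)


if __name__ == "__main__":
    # Prints the wave speed implied by the two quoted constants.
    print(f"1/sqrt(mu_o * eps_o) = {vacuum_wave_speed():.6e} m/s")
```

The same product $\mu_o \varepsilon_o$ appears as the factor of proportionality in the wave equation derived later in this section.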
Maxwell's equations completely describe the propagation and spatial extent of electromagnetic waves in free-space and in any medium. Faraday's law
states that the curl of the electric field is generated by the temporal change of
the magnetic field and the magnetization density vector. Ampère's law states
that the curl of the magnetic field is generated by the temporal change of
the electric field and the polarization density vector, as well as by currents of
charged particles. Gauss's two laws govern the divergence of the electric and
magnetic fields. The divergence is zero except in the presence of dipoles and
electric charges.
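It is customary when considering a restricted class of problems to eliminate
various non-essential terms from the equations. As this text is predominantly
focused on passive birefringent optical components, including interaction with
fixed electric and magnetic fields, the current density J(r, t) and the free
electric charge density $\rho_f$(r, t) are set to zero. The reduced equations are

$$\nabla \times \mathbf{E}(\mathbf{r},t) = -\mu_o \frac{\partial}{\partial t}\mathbf{H}(\mathbf{r},t) - \mu_o \frac{\partial}{\partial t}\mathbf{M}(\mathbf{r},t) \tag{1.1.1}$$

$$\nabla \times \mathbf{H}(\mathbf{r},t) = \varepsilon_o \frac{\partial}{\partial t}\mathbf{E}(\mathbf{r},t) + \frac{\partial}{\partial t}\mathbf{P}(\mathbf{r},t) \tag{1.1.2}$$

$$\varepsilon_o \nabla \cdot \mathbf{E}(\mathbf{r},t) = -\nabla \cdot \mathbf{P}(\mathbf{r},t) \tag{1.1.3}$$

$$\nabla \cdot \mathbf{H}(\mathbf{r},t) = -\nabla \cdot \mathbf{M}(\mathbf{r},t) \tag{1.1.4}$$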

The field terms E and H are the two complementary components of an
electromagnetic wave. The polarization and magnetization density vectors P
and M, respectively, are the means to describe the interaction of the electromagnetic field with matter. The density vectors P and M are related to the
field quantities E and H by constitutive relations. The constitutive relations
for various dielectric materials will be presented in detail in Chapter 3. This
chapter details the simplest solutions to Maxwell's equations, the field
solutions in vacuum.
In a vacuum, vectors P and M are zero. The wave equation for the electric field is derived by taking the curl of Faraday's law, substituting in Ampère's law, and reordering the temporal derivative $\partial/\partial t$, since it commutes with $\nabla\times$. The wave equation for the magnetic field is similarly found. The electric-field wave equation is
$$\nabla\times\nabla\times\mathbf{E} = -\mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}\mathbf{E} \tag{1.1.5}$$
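With application of the vector identity $\nabla\times\nabla\times = \nabla(\nabla\cdot) - \nabla^2$, and recognition that Gauss' law (1.1.3) dictates zero electric-field divergence in the absence of a fixed charge density, (1.1.5) simplifies to the Helmholtz wave equation:
$$\nabla^2\mathbf{E} = \mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}\mathbf{E} \tag{1.1.6}$$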

The Helmholtz equation relates the spatial curvature of the electric field $\mathbf{E}(\mathbf{r},t)$ to its temporal second derivative, the factor of proportionality being $\mu_o\varepsilon_o$. The wave equation is otherwise invariant to spatial and temporal translation, spatial rotation, time reversal, and coordinate system selection. Moreover, the wave equation is linear in that
$$\nabla^2(\mathbf{E}_1 + \mathbf{E}_2) = \mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}(\mathbf{E}_1 + \mathbf{E}_2) \tag{1.1.7}$$

The linear property of the wave equation allows arbitrarily complex field distributions E to be constructed by Fourier synthesis or the method of superposition.
A monochromatic solution to (1.1.6) is
$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_o\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) \tag{1.1.8}$$
where $\mathbf{E}_o$, $\mathbf{k}$, and $\mathbf{r}$ are three-dimensional real-valued vectors and $\omega$ is the radial oscillatory frequency of the wave. $\mathbf{E}_o$ is the field amplitude at time and distance zero. $\mathbf{k}$ is the propagation vector of the field. The magnitude of $\mathbf{k}$, having units of inverse length, is the wavenumber $k = |\mathbf{k}|$. The monochromatic wave (1.1.8) is a travelling plane wave that propagates in the direction of $\mathbf{k}$ and oscillates at frequency $\omega$.
When an underlying coordinate system is chosen so that the propagation direction of the wave $\mathbf{k}$ is coincident with a coordinate axis $\mathbf{r}$, i.e. $\mathbf{k} \parallel \mathbf{r}$, the spatial argument of (1.1.8) reduces to $\mathbf{k}\cdot\mathbf{r} = kr$. The monochromatic solution simplifies to $\mathbf{E}(r,t) = \mathbf{E}_o\cos(\omega t - kr)$. This is the equation of a plane wave whose phase fronts are constant in the plane perpendicular to $\mathbf{r}$ and whose amplitude is likewise constant in that plane. Picking a fixed phase position along the wavefront as it propagates along $\mathbf{r}$, $\omega t - kr = \text{constant}$, it is found that the phase front travels at phase velocity $v_{ph} = \omega/k$. The wavelength $\lambda$, as defined by the length along $\mathbf{r}$ between two adjacent field maxima, is $\lambda = 2\pi/k$.
Substitution of the monochromatic plane wave solution (1.1.8) into the wave equation (1.1.6) yields the dispersion relation that relates the wavenumber to the radial frequency:
$$k = \omega\sqrt{\mu_o\varepsilon_o} \tag{1.1.9}$$
As wave equation (1.1.6) is written for vacuum, the electromagnetic wavefront velocity is the speed of light, $c$. Using the dispersion relation, the speed of light is related to the free-space permittivity and permeability as
$$c = 1/\sqrt{\mu_o\varepsilon_o} \tag{1.1.10}$$
The speed of light in vacuum is [8]
$$c = 299{,}792{,}458\ \text{(m/s)}$$
The wavenumber is therefore related to frequency and the speed of light via $k = \omega/c$.
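The relations (1.1.9) and (1.1.10) can be checked numerically. The following is a minimal sketch, assuming the constants quoted above; the 1550 nm wavelength is a hypothetical example value chosen for illustration and does not come from the text.

```python
import math

# Vacuum constants as quoted in the text [8].
EPS0 = 8.854187817e-12       # permittivity of vacuum (F/m)
MU0 = 4.0 * math.pi * 1e-7   # permeability of vacuum (H/m)

# Speed of light from (1.1.10): c = 1/sqrt(mu_o * eps_o).
c = 1.0 / math.sqrt(MU0 * EPS0)

# Hypothetical example wavelength; 1550 nm is an assumption, not from the text.
wavelength = 1550e-9              # (m)
k = 2.0 * math.pi / wavelength    # wavenumber, from lambda = 2*pi/k
omega = k * c                     # radial frequency, from k = omega/c
v_phase = omega / k               # phase velocity, omega/k

print(f"c       = {c:.1f} m/s")
print(f"k       = {k:.6e} 1/m")
print(f"omega   = {omega:.6e} rad/s")
print(f"v_phase = {v_phase:.1f} m/s")
```

In vacuum the phase velocity returned by the sketch coincides with c, as the dispersion relation requires.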
The monochromatic wave (1.1.8) can be resolved into cartesian coordinates as follows. The field amplitude vector is resolved into three scalar components $\mathbf{E}_o = [E_x\ E_y\ E_z]^T$; the coordinate vector $\mathbf{r}$ is resolved as
$$\mathbf{r} = \hat{\mathbf{x}}x + \hat{\mathbf{y}}y + \hat{\mathbf{z}}z \tag{1.1.11}$$
the wave vector $\mathbf{k}$ is resolved as
$$\mathbf{k} = \hat{\mathbf{x}}k_x + \hat{\mathbf{y}}k_y + \hat{\mathbf{z}}k_z \tag{1.1.12}$$
A particular vector component of (1.1.8) takes associated elements from $\mathbf{E}$ and $\mathbf{k}\cdot\mathbf{r}$, e.g. $E(x,t) = E_x\cos(\omega t - k_x x)$. From (1.1.12), the wavenumber in cartesian coordinates is
$$k = \sqrt{k_x^2 + k_y^2 + k_z^2} \tag{1.1.13}$$
The monochromatic electric-field solution (1.1.8) has a magnetic field counterpart. Introduction of (1.1.8) into Faraday's law and solving for H by taking the time integral yields the magnetic field monochromatic solution
$$\mathbf{H}(\mathbf{r},t) = \sqrt{\frac{\varepsilon_o}{\mu_o}}\ \hat{\mathbf{k}}\times\mathbf{E}_o\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) \tag{1.1.14}$$
where the value of the wavenumber $k$ has been pulled through by writing $\mathbf{k} = k\hat{\mathbf{k}}$ and where $\hat{\mathbf{k}}$ is a unit vector pointing in the direction of $\mathbf{k}$. The magnetic field has the same spatial and temporal dependence as the associated electric field. The scalar constant that relates the two field amplitudes is $\sqrt{\varepsilon_o/\mu_o}$. This physical constant is called the characteristic admittance of vacuum. The characteristic impedance, the inverse of the admittance, is approximately [8]
$$\sqrt{\frac{\mu_o}{\varepsilon_o}} \approx 376.730313461\ \text{(ohms)}$$
Substitution of the field equations (1.1.8) and (1.1.14) into Maxwell's equations (1.1.1-1.1.4) for vacuum yields
$$\mathbf{k}\times\mathbf{E} = \omega\mu_o\mathbf{H} \tag{1.1.15a}$$
$$\mathbf{k}\times\mathbf{H} = -\omega\varepsilon_o\mathbf{E} \tag{1.1.15b}$$
$$\mathbf{k}\cdot\mathbf{E} = 0 \tag{1.1.15c}$$
$$\mathbf{k}\cdot\mathbf{H} = 0 \tag{1.1.15d}$$
These equations show the relation of the electric and magnetic field oscillations with respect to one another and with respect to the propagation direction $\mathbf{k}$. The divergence equations for the electric and magnetic fields (1.1.15c,d) show that there are no field components in the direction of propagation. That is, the longitudinal field components are zero; only transverse components exist. Both the electric and magnetic field oscillations are therefore perpendicular to $\mathbf{k}$. Moreover, the electric and magnetic field oscillations are mutually perpendicular. Calculation of $\mathbf{E}\cdot\mathbf{H}$ via (1.1.15a,b) results in
$$\mathbf{E}\cdot\mathbf{H} = -\frac{1}{\omega^2\mu_o\varepsilon_o}(\mathbf{k}\times\mathbf{H})\cdot(\mathbf{k}\times\mathbf{E})$$
Application of the vector relation $\mathbf{a}\times\mathbf{b}\cdot\mathbf{c} = \mathbf{a}\cdot\mathbf{b}\times\mathbf{c}$ shows that $\mathbf{E}\cdot\mathbf{H} = 0$.
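The transversality conditions can be illustrated with a small numerical check. This is a sketch under stated assumptions: the propagation direction and field amplitude below are hypothetical values chosen so that the amplitude is perpendicular to the propagation direction, and the helper functions `dot`, `cross`, and `norm` are illustrative names, not part of the text.

```python
import math

EPS0 = 8.854187817e-12       # permittivity of vacuum (F/m)
MU0 = 4.0 * math.pi * 1e-7   # permeability of vacuum (H/m)

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

# Hypothetical unit propagation direction and a field amplitude chosen
# perpendicular to it (both are assumptions for illustration).
k_hat = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
E_o = (2.0, -1.0, 0.0)

# Magnetic amplitude from (1.1.14): H_o = sqrt(eps_o/mu_o) k_hat x E_o.
admittance = math.sqrt(EPS0 / MU0)
H_o = tuple(admittance * h for h in cross(k_hat, E_o))

k_dot_E = dot(k_hat, E_o)     # (1.1.15c)
k_dot_H = dot(k_hat, H_o)     # (1.1.15d)
E_dot_H = dot(E_o, H_o)       # mutual perpendicularity
impedance = norm(E_o) / norm(H_o)

print(k_dot_E, k_dot_H, E_dot_H, impedance)
```

The ratio of the two amplitudes recovers the characteristic impedance of vacuum, and all three scalar products vanish to within rounding.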

Combination of Faraday's and Ampère's laws has led to the wave equation (1.1.6), which in turn yielded a monochromatic plane-wave solution for both field components (1.1.8) and (1.1.14). Substitution of these field expressions into Maxwell's equations for vacuum leads to the conclusion that the vectors (E, H, k) are mutually perpendicular. What remains is the calculation of energy flow of the propagating electromagnetic wave.
Poynting's theorem shows explicitly that conservation of energy is an immediate result of Maxwell's equations. The theorem states that the electromagnetic power flow into a volume must equal the rate of increase of stored electric and magnetic energy plus the total power dissipated. To arrive at the conservation equation, take the dot product of H with Faraday's law and the dot product of E with Ampère's law, and use the vector identity $\nabla\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{c}\cdot\nabla\times\mathbf{b} - \mathbf{b}\cdot\nabla\times\mathbf{c}$. Poynting's energy conservation equation is
$$\nabla\cdot(\mathbf{E}\times\mathbf{H}) + \frac{\partial}{\partial t}\left(\frac{1}{2}\varepsilon_o\mathbf{E}\cdot\mathbf{E}\right) + \frac{\partial}{\partial t}\left(\frac{1}{2}\mu_o\mathbf{H}\cdot\mathbf{H}\right) + \mathbf{E}\cdot\frac{\partial\mathbf{P}}{\partial t} + \mathbf{H}\cdot\frac{\partial(\mu_o\mathbf{M})}{\partial t} + \mathbf{E}\cdot\mathbf{J} = 0 \tag{1.1.16}$$

The Poynting theorem introduces a new vector quantity: $\mathbf{E}\times\mathbf{H}$. This is called the Poynting vector; it represents the electromagnetic power flow density and has units of W/m². It is customary to represent the Poynting vector by the symbol S:
$$\mathbf{S}(\mathbf{r},t) = \mathbf{E}(\mathbf{r},t)\times\mathbf{H}(\mathbf{r},t) \tag{1.1.17}$$
The direction of S is the direction of power flow. The power flow direction is always orthogonal to both the E and H fields. Recalling Gauss' integral theorem,
$$\int_V \nabla\cdot\mathbf{F}\,dV = \oint_S \mathbf{F}\cdot d\mathbf{a}$$
the divergence of F integrated over the volume V equals the flux of F through the surface S out of the volume. Accordingly, $\nabla\cdot\mathbf{S}$ represents the power flow out of a differential volume. This power flow is balanced by the increase of stored electromagnetic energy $W$ and by the power dissipated $P_d$. Symbolically [2],
$$\nabla\cdot\mathbf{S} + \frac{\partial W}{\partial t} + P_d = 0$$

The energy stored in the system is recoverable; the stored energy is reactive rather than resistive. The power dissipated is non-recoverable. In terms of the conservation equation, energy that can be grouped after the $\partial/\partial t$ operator is stored while the fixed power is dissipated. As an example, consider a volume V through which electric energy $W_e = \frac{1}{2}\varepsilon_o\mathbf{E}\cdot\mathbf{E}$ flows. Denote the temporal profile as $W_e(t) = W_o f(t)$ where $f(t)$ is a positive, bounded scalar function of time and $W_o$ is the maximum electric energy. The profile function is zero at $t = \pm\infty$. The time-integrated reactive power is
$$\int_{-\infty}^{\infty}\frac{\partial}{\partial t}\left(W_o f(t)\right)dt = 0$$
Integration over all time shows that no net energy was left in volume V. Shown another way [1], for any intermediate time $t_o$, the energy into the volume V is
$$\int_{-\infty}^{t_o}\frac{\partial}{\partial t}\left(W_o f(t)\right)dt = +W_o f(t_o)$$
After $t_o$, the energy into the volume V is
$$\int_{t_o}^{\infty}\frac{\partial}{\partial t}\left(W_o f(t)\right)dt = -W_o f(t_o)$$
The energy that flows into V up to time $t_o$ is fully recovered as $t \rightarrow +\infty$.
However, consider the $\mathbf{E}\cdot\mathbf{J}$ term. Using Ohm's law relating current density to electric field, $\mathbf{J} = \sigma\mathbf{E}$, where $\sigma$ is the conductivity, the energy dissipated over all time is
$$\int_{-\infty}^{\infty}\sigma E^2(t)\,dt = \sigma E_o^2 I_p$$
where $I_p$ is the integral over all time of the squared field profile normalized to its peak, $E^2(t)/E_o^2$, and $E_o$ is the maximum field amplitude assuming a bounded field-amplitude time profile. For finite, nonzero $\sigma$ the dissipated power vanishes only if $E(t) = 0$ for all time, but this is the trivial case.
With this understanding of what constitutes stored energy and dissipated power, the stored energy present in Poynting's theorem is identified with
$$W = W_e + W_m = \frac{1}{2}\varepsilon_o\mathbf{E}\cdot\mathbf{E} + \frac{1}{2}\mu_o\mathbf{H}\cdot\mathbf{H} \tag{1.1.18}$$
and the power dissipated is identified with
$$P_d = \mathbf{E}\cdot\mathbf{J} \tag{1.1.19}$$

This leaves the remaining terms $\mathbf{E}\cdot\partial\mathbf{P}/\partial t$ and $\mathbf{H}\cdot\partial(\mu_o\mathbf{M})/\partial t$ open to interpretation as energy storage terms or power dissipative terms. In general these two terms can be either; the particulars depend on the nature of the matter with which the electromagnetic field interacts. For example, in the case of linear dielectrics, $\mathbf{P} = \varepsilon_o\chi_e\mathbf{E}$, the dipole density follows the electric field instantaneously. The change of energy of the polarization density is then
$$\mathbf{E}\cdot\frac{\partial\mathbf{P}}{\partial t} = \frac{\partial}{\partial t}\left(\frac{1}{2}\varepsilon_o\chi_e\mathbf{E}\cdot\mathbf{E}\right)$$
where the energy stored in the polarization density is clearly reactive. If, on the other hand, the dipole density exhibits a delayed reaction to the electric field, as can be the case in highly resistive media, then one could write $d\mathbf{P}/dt = a\mathbf{E}$ where $a$ is a scaling parameter [2]. Then,
$$\mathbf{E}\cdot\frac{\partial\mathbf{P}}{\partial t} = a\,\mathbf{E}\cdot\mathbf{E}$$
and the system is dissipative.


Earlier in this section the general plane-wave monochromatic field solutions in vacuum were found for both the electric and magnetic fields. The power flow density is found by $\mathbf{S} = \mathbf{E}\times\mathbf{H}$. Taking the cross product of (1.1.8) and (1.1.14) yields
$$\mathbf{S}(\mathbf{r},t) = \hat{\mathbf{k}}\sqrt{\frac{\varepsilon_o}{\mu_o}}\,E_o^2\cos^2(\omega t - \mathbf{k}\cdot\mathbf{r}) \tag{1.1.20}$$
The time average of the Poynting vector yields the average power flow of the electromagnetic field:
$$\langle\mathbf{S}(\mathbf{r},t)\rangle = \frac{1}{2\pi}\int_0^{2\pi}\mathbf{S}(\mathbf{r},t)\,d(\omega t) = \frac{1}{2}\hat{\mathbf{k}}\sqrt{\frac{\varepsilon_o}{\mu_o}}\,E_o^2 \tag{1.1.21}$$
The time-average power flow of the electromagnetic field in vacuum is along the $\hat{\mathbf{k}}$ direction, where $\hat{\mathbf{k}}$ is perpendicular to planes of constant phase along the wave front. In the following chapters, dielectric anisotropy is introduced. The anisotropy will, in general, break the apparent identity that S and k run parallel to one another and instead induce the power flow and wave-front propagation directions to diverge.
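The factor of one half in the time-averaged Poynting vector can be confirmed by direct sampling over one optical cycle. The sketch below assumes a hypothetical field amplitude of 1 V/m; the function names are illustrative only.

```python
import math

EPS0 = 8.854187817e-12       # permittivity of vacuum (F/m)
MU0 = 4.0 * math.pi * 1e-7   # permeability of vacuum (H/m)
ADMITTANCE = math.sqrt(EPS0 / MU0)

def poynting_magnitude(E0, phase):
    """Instantaneous |S| from (1.1.20), with phase = omega*t - k.r."""
    return ADMITTANCE * E0 ** 2 * math.cos(phase) ** 2

def time_averaged_poynting(E0, samples=1000):
    """Average (1.1.20) over one full cycle by uniform sampling."""
    total = 0.0
    for n in range(samples):
        total += poynting_magnitude(E0, 2.0 * math.pi * n / samples)
    return total / samples

E0 = 1.0                                  # hypothetical amplitude (V/m)
numeric = time_averaged_poynting(E0)
closed_form = 0.5 * ADMITTANCE * E0 ** 2  # right-hand side of (1.1.21)

print(numeric, closed_form)
```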

1.2 The Vector and Scalar Potentials


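In the absence of currents, free charges, and electric and magnetic dipoles, Maxwell's equations reduce to
$$\nabla\times\mathbf{E} = -\mu_o\frac{\partial\mathbf{H}}{\partial t} \tag{1.2.1a}$$
$$\nabla\cdot\mu_o\mathbf{H} = 0 \tag{1.2.1b}$$
$$\nabla\times\mathbf{H} = \varepsilon_o\frac{\partial\mathbf{E}}{\partial t} \tag{1.2.1c}$$
$$\nabla\cdot\varepsilon_o\mathbf{E} = 0 \tag{1.2.1d}$$
Under these circumstances, the magnetic and electric fields are solenoidal (having zero divergence). It is appealing to find the class of fields that a priori guarantee the solenoidal nature. Note the following vector identities:
$$\nabla\cdot(\nabla\times\mathbf{F}) = 0 \tag{1.2.2a}$$
$$\nabla\times(\nabla\phi) = 0 \tag{1.2.2b}$$
that is, the curl $\nabla\times\mathbf{F}$ of an arbitrary field is solenoidal and the gradient $\nabla\phi$ of an arbitrary potential is irrotational.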

The solenoidal nature of $\mu_o\mathbf{H}$ is guaranteed by equating it with the curl of the vector potential $\mathbf{A}$:
$$\mu_o\mathbf{H} = \nabla\times\mathbf{A} \tag{1.2.3}$$
Substitution of (1.2.3) into (1.2.1a) yields
$$\nabla\times\left(\mathbf{E} + \frac{\partial}{\partial t}\mathbf{A}\right) = 0 \tag{1.2.4}$$
Following (1.2.2b), (1.2.4) is guaranteed by defining E as
$$\mathbf{E} = -\nabla\phi - \frac{\partial}{\partial t}\mathbf{A} \tag{1.2.5}$$
where $\phi$ is the scalar potential. Maxwell's equations (1.2.1a,b) are guaranteed to be satisfied when E and H are expressed in terms of the vector potential $\mathbf{A}$ and scalar potential $\phi$ as above. That said, $\mathbf{A}$ is not yet uniquely determined, as any field is defined by both its curl and divergence. The divergence of $\mathbf{A}$ has not yet been established. Without this, a shift of the vector potential by an arbitrary gradient, e.g. $\mathbf{A}' = \mathbf{A} + \nabla\psi$, would change neither E nor H but would indeed change $\phi$.
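The divergence of $\mathbf{A}$ must be set with an eye toward guaranteeing the solutions to the remaining Maxwell equations (1.2.1c,d). Substitution of (1.2.3, 1.2.5) into (1.2.1c) gives
$$\nabla\times(\nabla\times\mathbf{A}) = \mu_o\varepsilon_o\frac{\partial}{\partial t}\left(-\nabla\phi - \frac{\partial}{\partial t}\mathbf{A}\right) \tag{1.2.6}$$
Expanding the double-curl on the left side and rearranging terms makes
$$\nabla^2\mathbf{A} = \mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}\mathbf{A} + \nabla\left(\nabla\cdot\mathbf{A} + \mu_o\varepsilon_o\frac{\partial\phi}{\partial t}\right) \tag{1.2.7}$$
The selection of the vector potential divergence is arbitrary since E and H are invariant. Therefore the most convenient choice is suitable. Accordingly, a wave equation for the vector potential can be established given the definition
$$\nabla\cdot\mathbf{A} + \mu_o\varepsilon_o\frac{\partial\phi}{\partial t} = 0 \tag{1.2.8}$$
This choice is called the Lorentz gauge. This gauge in turn is used to generate a wave equation for the scalar potential through substitution into (1.2.1d). Together the wave equations are
$$\nabla^2\mathbf{A} = \mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}\mathbf{A} \tag{1.2.9a}$$
$$\nabla^2\phi = \mu_o\varepsilon_o\frac{\partial^2}{\partial t^2}\phi \tag{1.2.9b}$$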


In summary, the vector and scalar potentials are self-consistent fields that are constructed to satisfy all of Maxwell's equations by definition. The divergence and curl of the vector potential are completely specified, through which the link to the scalar potential is defined. The vector and scalar potentials provide an alternative means to find solutions to Maxwell's equations. In particular, plane wave solutions exemplified by (1.1.8) are highly convenient when the electromagnetic source is modelled infinitely far away and any dielectric or magnetic media are piece-wise uniform; Fourier techniques can be used to assemble a ray bundle that satisfies some boundary condition. In contrast, point sources generate nonuniform field patterns that cannot be modelled by plane waves. The vector and scalar potentials are necessary to find the requisite field solutions. As a particularly relevant example, Gaussian beam optics takes the adiabatic expansion of a ray bundle as fundamental. In this paraxial limit, the eigen-waves have a spherical phase curvature that is not present in a plane wave. In practice, which formalism is used, field solutions or vector potential solutions, is determined by the problem and the required degree of accuracy.

1.3 Time-Harmonic Solutions


The above developments of Maxwell's equations, monochromatic field solutions, and Poynting's theorem were performed in vector notation with only passing reference to an underlying coordinate system. Pure vector notation provides the most compact form of the equations, provides for direct comparison of the vector quantities, and allows for resolution onto any convenient coordinate system. In an analogous manner, complex exponential notation is like vector notation because there is no a priori selection of an underlying time reference. The use of cosine solutions in the previous section is certainly acceptable, but choice of (sin, cos) requires selecting an underlying time reference from the beginning. To keep with real-valued functions at this point will lead to unnecessary analytic complexity when adding phases or multiplying frequencies. The equations and solutions of the preceding section will be recast into complex exponential notation to simplify the analytics.
One problem with complex exponential notation is that there is no customary sign of the argument. Physics texts usually use $\exp(-i\omega t)$, while engineering texts usually use $\exp(j\omega t)$. Either selection is fine, as long as the derivations, particularly those regarding polarization, are consistent. This text chooses to use $\exp(j\omega t)$.
The operators $\mathrm{Re}$ and $\mathrm{Im}$ are used to translate between real functions and complex exponential functions. For a complex exponential $z = \exp(j\theta)$, the following relations are defined:
$$\mathrm{Re}\{z\} = \frac{z + z^*}{2},\qquad \mathrm{Im}\{z\} = \frac{z - z^*}{2j} \tag{1.3.1}$$
and
$$z = \mathrm{Re}\{z\} + j\,\mathrm{Im}\{z\} \tag{1.3.2}$$
where $z^*$ is the complex conjugate of $z$.
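The operator relations, with the imaginary part taken as $(z - z^*)/(2j)$, can be exercised with Python's built-in complex type. This is a small illustrative sketch; the angle is an arbitrary example value and the function names are hypothetical.

```python
import cmath
import math

def real_part(z):
    """Re{z} = (z + z*)/2, following (1.3.1)."""
    return ((z + z.conjugate()) / 2).real

def imag_part(z):
    """Im{z} = (z - z*)/(2j), following (1.3.1)."""
    return ((z - z.conjugate()) / 2j).real

theta = 0.7                                  # arbitrary example angle (rad)
z = cmath.exp(1j * theta)                    # z = exp(j*theta)
rebuilt = real_part(z) + 1j * imag_part(z)   # (1.3.2)

print(real_part(z), math.cos(theta))
print(imag_part(z), math.sin(theta))
```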


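The real-valued electric field is defined using complex exponential notation as
$$\mathbf{E}(\mathbf{r},t) = \mathrm{Re}\left\{\mathbf{E}\,e^{j(\omega t - \mathbf{k}\cdot\mathbf{r})}\right\} \tag{1.3.3}$$
where $\mathbf{E}$ is a complex vector. Moreover, $\mathbf{E}$ is written rather than $\mathbf{E}_o$ only for compactness of notation, but it is recognized that $\mathbf{E}$ is evaluated at $t = 0$ and $\mathbf{r} = 0$. The real part of (1.3.3) is the same as (1.1.8). The remaining field, dipole, and current terms in Maxwell's equations undergo a similar substitution. Operations $\nabla$ and $\partial/\partial t$ on the complex field undergo the following mapping:
$$\nabla \rightarrow -j\mathbf{k},\qquad \frac{\partial}{\partial t} \rightarrow j\omega$$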
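Substitution of (1.3.3) and like terms into Faraday's law yields [7]
$$\mathrm{Re}\left\{-j\left(\mathbf{k}\times\mathbf{E} - \omega(\mu_o\mathbf{H} + \mu_o\mathbf{M})\right)e^{j(\omega t - \mathbf{k}\cdot\mathbf{r})}\right\} = 0$$
This equation must hold true for all time and position. As the real part of the exponential term can take any value in the range $-1 \le \mathrm{Re}(\exp(j\theta)) \le 1$, the remaining expression must equal zero. To summarize, Maxwell's equations in time-harmonic, plane-wave form are
$$\mathbf{k}\times\mathbf{E} = \omega\mu_o(\mathbf{H} + \mathbf{M}) \tag{1.3.4}$$
$$\mathbf{k}\times\mathbf{H} = -\omega(\varepsilon_o\mathbf{E} + \mathbf{P}) \tag{1.3.5}$$
$$\mathbf{k}\cdot(\varepsilon_o\mathbf{E} + \mathbf{P}) = 0 \tag{1.3.6}$$
$$\mathbf{k}\cdot(\mu_o\mathbf{H} + \mu_o\mathbf{M}) = 0 \tag{1.3.7}$$
where the fixed charge and current densities have been excluded. It is particularly relevant to remark that since the electric and magnetic Gaussian laws show zero divergence, (1.3.4) and (1.3.5) describe the field motion exclusively in the plane perpendicular to $\mathbf{k}$.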
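The Poynting theorem can likewise be recast into complex notation. Taking the dot product of $\mathbf{H}^*$ with (1.3.4) and of $\mathbf{E}$ with the complex conjugate of (1.3.5), and adding the two results, the theorem is
$$2\,\mathbf{k}\cdot(\mathbf{E}\times\mathbf{H}^*) = \omega\mu_o|\mathbf{H}|^2 + \omega\varepsilon_o|\mathbf{E}|^2 + \omega\mathbf{H}^*\cdot\mu_o\mathbf{M} + \omega\mathbf{E}\cdot\mathbf{P}^* \tag{1.3.8}$$
As long as there is no phase difference between $\mathbf{H}$ and $\mu_o\mathbf{M}$, and similarly between $\mathbf{E}$ and $\mathbf{P}$, the power flow density experiences no gain or loss. However, a lead or lag of $\mathbf{M}$ to $\mathbf{H}$, or $\mathbf{P}$ to $\mathbf{E}$, introduces gain or loss into the system.
The complex Poynting vector is defined as
$$\mathbf{S} = \mathbf{E}\times\mathbf{H}^* \tag{1.3.9}$$
and the time-average of $\mathbf{S}$ is found by
$$\langle\mathbf{S}\rangle = \frac{1}{2}\mathrm{Re}\left\{\mathbf{E}\times\mathbf{H}^*\right\} \tag{1.3.10}$$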

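The following identities are useful for time-harmonic calculations:
$$\langle a(\mathbf{r},t)\,b(\mathbf{r},t)\rangle = \frac{1}{2}\mathrm{Re}\left\{a(\mathbf{r})\,b^*(\mathbf{r})\right\} \tag{1.3.11a}$$
$$\langle\mathbf{a}(\mathbf{r},t)\cdot\mathbf{b}(\mathbf{r},t)\rangle = \frac{1}{2}\mathrm{Re}\left\{\mathbf{a}(\mathbf{r})\cdot\mathbf{b}^*(\mathbf{r})\right\} \tag{1.3.11b}$$
$$\langle\mathbf{a}(\mathbf{r},t)\times\mathbf{b}(\mathbf{r},t)\rangle = \frac{1}{2}\mathrm{Re}\left\{\mathbf{a}(\mathbf{r})\times\mathbf{b}^*(\mathbf{r})\right\} \tag{1.3.11c}$$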
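The scalar time-average identity can be checked by comparing a direct average of two real sinusoids over one cycle with the phasor expression. The sketch assumes two arbitrary example phasors; the function name is hypothetical.

```python
import cmath
import math

def time_average_product(a, b, samples=360):
    """Average Re{a e^{j wt}} * Re{b e^{j wt}} over one full cycle."""
    total = 0.0
    for n in range(samples):
        rotation = cmath.exp(1j * 2.0 * math.pi * n / samples)
        total += (a * rotation).real * (b * rotation).real
    return total / samples

# Arbitrary example phasors (assumptions for illustration).
a = 2.0 * cmath.exp(0.3j)
b = 0.5 * cmath.exp(-1.1j)

direct = time_average_product(a, b)
phasor = 0.5 * (a * b.conjugate()).real   # one half of Re{a b*}

print(direct, phasor)
```

The term that oscillates at twice the frequency averages to zero over a full cycle, which is why only the conjugate product survives.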

1.4 Classical Description of Polarization


Thus far the study of the vectorial nature of light has shown that a planar electromagnetic wave is a solution to Maxwell's equations in free space, and that the wave has a phase velocity, wavelength, and dispersion relation. Moreover, the relation between electric and magnetic fields and the power flow of the wave have been determined. This section is addressed to the evolution of the electric field in the plane perpendicular to the propagation direction. The motion of the electric field in this plane governs the polarization of the wave. Separate discussion of the magnetic wave motion is redundant as the magnetic field is immediately derived from the electric field using Faraday's law.
Consider a time-harmonic monochromatic plane wave (1.3.3) that travels in the $\hat{\mathbf{z}}$ direction ($\mathbf{k}\cdot\mathbf{r} = kz$), Fig. 1.1. Since $\mathbf{k}\cdot\mathbf{E} = 0$ in vacuum, there is no $z$ component to the electric field. The most general form of the electric field vector is then
$$\mathbf{E}(z,t) = \begin{pmatrix} E_x e^{j\phi_x} \\ E_y e^{j\phi_y} \end{pmatrix} e^{j(\omega t - kz)} \tag{1.4.1}$$
where $E_{x,y}$ are signed real numbers. The complex 2-row column vector is called the Jones polarization vector [5].
This plane wave propagates along the z-axis with wavelength $2\pi/k$ and phase velocity $c$. The two field components lie in the $(x, y)$ plane and complete full cycles at rate $\omega$. The polarization of the wave is governed by the electric-field evolution in the $(x, y)$ plane. For convenience of notation but without loss of generality, $kz = \phi_x$. Using this reference plane and converting (1.4.1) to its real-valued counterpart, the electric field vector is
$$\mathbf{E}(x,y,t) = \hat{\mathbf{x}}E_x\cos(\omega t) + \hat{\mathbf{y}}E_y\cos(\omega t + \phi) \tag{1.4.2}$$

where $\phi = \phi_y - \phi_x$. Equation (1.4.2) describes an ellipse in the plane perpendicular to $\hat{\mathbf{z}}$. The convention used in this text to describe the state and handedness of the polarization ellipse is: the field is observed as it propagates towards the observer; that is, the observer faces in the $-\hat{\mathbf{z}}$ direction (see Fig. 1.1). The field is right-hand polarized when one's right-hand thumb points along $+\hat{\mathbf{z}}$ and one's fingers curl in the direction of electric-field vector motion.
Fig. 1.1. In a vacuum, $\mathbf{k}\cdot\mathbf{E} = 0$, restricting the electric field to lie in the plane perpendicular to the propagation direction. Polarization is the motion of the electric field in the perpendicular plane.
The elliptical equation is derived from (1.4.2) as follows. The field amplitudes as projected along the $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ directions are
$$x = E_x\cos(\omega t) \tag{1.4.3a}$$
$$y = E_y\cos(\omega t + \phi) \tag{1.4.3b}$$
Taking the square of the parametric equations, adding and absorbing terms by identification with $xy/E_xE_y$ yields the elliptical equation
$$\frac{x^2}{E_x^2} + \frac{y^2}{E_y^2} - \frac{2xy}{E_xE_y}\cos\phi = \sin^2\phi \tag{1.4.4}$$
There are three independent variables that govern the shape of the ellipse: $E_x$, $E_y$, and $\phi$.
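A quick numerical check confirms that every point of the parametric motion (1.4.3) lies on the ellipse (1.4.4). The amplitudes and phase below are hypothetical example values, and the function name is illustrative.

```python
import math

def ellipse_residual(Ex, Ey, phi, wt):
    """Left side minus right side of (1.4.4) at optical phase wt."""
    x = Ex * math.cos(wt)            # (1.4.3a)
    y = Ey * math.cos(wt + phi)      # (1.4.3b)
    lhs = (x / Ex) ** 2 + (y / Ey) ** 2 - 2.0 * x * y * math.cos(phi) / (Ex * Ey)
    return lhs - math.sin(phi) ** 2

# Hypothetical amplitudes and phase difference (assumptions).
Ex, Ey, phi = 2.0, 1.0, math.pi / 3.0

# Sample the motion over one full cycle and record the largest residual.
worst = max(abs(ellipse_residual(Ex, Ey, phi, 2.0 * math.pi * n / 100))
            for n in range(100))

print(worst)
```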
Figure 1.2 illustrates a general polarization ellipse resolved onto two coordinate systems. A general ellipse is one where there is no zero component in the $(E_x, E_y, \phi)$ triplet. In Fig. 1.2(a), $E_{x,y}$ mark the projections of the ellipse onto the $(x, y)$ basis, and the angle $\chi$ is defined as $\tan\chi = E_y/E_x$ [4]. From the tangent relation between $E_y$ and $E_x$, the Jones vector can be rewritten in normalized form:
$$\mathbf{E} = E_o\begin{pmatrix}\cos\chi \\ \sin\chi\,e^{j\phi}\end{pmatrix} \tag{1.4.5}$$

where $E_o = \sqrt{E_x^2 + E_y^2}$ is the field amplitude irrespective of coordinate system. With this normalization, the state of polarization is described uniquely by the $(\chi, \phi)$ pair of polarimetric parameters.
Fig. 1.2. Analysis of a general polarization ellipse ($\chi = \pi/6$, $\phi = \pi/3$) onto the $(x, y)$ and $(u, v)$ coordinate systems. a) $E_{x,y}$ show maximum extent of elliptical motion on $(x, y)$ basis. b) Same ellipse but where $(u, v)$ basis is aligned to the major and minor elliptical axes. The $(u, v)$ basis is rotated with respect to the $(x, y)$ basis.
Now, as any ellipse has a major and minor axis, a coordinate system can be defined to align to these axes. Call this basis $(u, v)$, Fig. 1.2(b). In the $(u, v)$ basis the elliptical equation is
$$\frac{u^2}{a^2} + \frac{v^2}{b^2} = 1 \tag{1.4.6}$$
where $(a, b)$, the major and minor axes of the ellipse, are the projections onto the $u$ and $v$ axes, respectively. The parametric time-evolution equations that result in ellipse (1.4.6) are
$$u = a\cos\omega t \tag{1.4.7a}$$
$$v = b\sin\omega t \tag{1.4.7b}$$

As is dened as the tangent angle between Ey and Ex , is likewise dened


as tan = b/a. The ellipses described by (1.4.4) and (1.4.6) are related by a
rotation in the plane through angle . That is,


x
cos sin
u

=
(1.4.8)
y
sin cos
v
Substituting the elliptical projections (1.4.3) and (1.4.7) into the above rotation, the angle of rotation is
tan 2 = tan 2 cos

(1.4.9)

To verify that the rotation was unitary, one can show that $a^2 + b^2 = E_x^2 + E_y^2$. An important conclusion is that while the $(u, v)$ basis is the natural coordinate system for an ellipse having arbitrary rotation, any unit ellipse may equally well be described on an arbitrary $(x, y)$ basis by the $(\chi, \phi)$ pair. The $(\chi, \phi)$ pair and the pair formed by the rotation and ellipticity angles of the ellipse are in one-to-one correspondence.
Fig. 1.3. Two states of circular polarization, counterclockwise (right-hand circular, or R) and clockwise (left-hand circular, or L). Right- and left-hand circular states are distinguished by the curl of one's fingers with the thumb pointing along the $+\hat{\mathbf{z}}$ direction. Circular polarization exists when $\chi = 45^\circ$ and $\phi = \pm\pi/2$. a) Counterclockwise (R) corresponds to $\phi = +\pi/2$. b) Clockwise (L) corresponds to $\phi = -\pi/2$.
Fig. 1.4. Linear states of polarization exist when $\phi = m\pi$, where $m$ is an integer. The orientation of the state is determined by $\chi$, or alternatively by the rotation angle of the ellipse. From a) to c), the value of $\chi$ increases.
Fig. 1.5. Three elliptical polarization states. All three states have the same value of $\chi$. The phase difference $\phi$ increases: a) $\phi = \pi/6$, b) $\phi = \pi/3$, and c) $\phi = \pi/2$. Both $\chi$ and $\phi$ play a role in the orientation of the ellipse, as governed by (1.4.9).
The parametric electric field described by (1.4.2) exhibits a handedness that depends on the sign of $\phi$. For the range $-\pi < \phi < 0$, the evolution of the ellipse is in the clockwise (cw) direction and the handedness is left (L). For the range $0 < \phi < \pi$, the evolution is in the counterclockwise (ccw) direction and the handedness is right (R). The sense of the handedness is lost in elliptical equation (1.4.4) since $\cos\phi$ is an even function and $\sin^2\phi$ is positive definite. The same loss of handedness shows, however, that the shape of the ellipse is independent of the rotary sense.
There are three general categories of polarization state: circular, linear, and elliptical. Taken as a progression, circular is the most restrictive on the possible $(\chi, \phi)$ values, linear is less restrictive, and elliptical places no restrictions on $(\chi, \phi)$. In particular, circular polarization requires $\chi = \pm\pi/4$ and $\phi = \pm\pi/2$. Handedness is the only distinguishing property. When $(\chi, \phi)$ have the same sign, the sense is R; when the signs are opposite the sense is L. Linear polarization lets $\chi$ take any value and requires $\phi = m\pi$, where $m$ is an integer. Elliptical polarization includes circular and linear states as well as all other possible values of $(\chi, \phi)$. Figures 1.3–1.5 provide examples of these three categories.
The polarization ellipse is completely described by the $(\chi, \phi)$ pair. The question is how to determine these polarimetric parameters uniquely for an arbitrary state having arbitrary intensity. The following series of seven measurements will uniquely determine the state. The first measurement is for the overall time-averaged intensity. For a fixed polarization state
$$\mathbf{E} = \begin{pmatrix}E_x \\ E_y e^{j\phi}\end{pmatrix} \tag{1.4.10}$$
where $E_x$ and $E_y$ are real, the time-averaged intensity is$^1$
$$I_o = \frac{1}{2}\mathrm{Re}\left(\mathbf{E}^\dagger\mathbf{E}\right) \tag{1.4.11}$$
$$\phantom{I_o} = (E_x^2 + E_y^2)/2 \tag{1.4.12}$$

The remaining six measurements use a linear polarizer and, in two cases, a quarter-wave waveplate. The projection matrix is a suitable model of a linear polarizer [10]
$$\mathbf{P}(\theta) = \begin{pmatrix}\cos^2\theta & \cos\theta\sin\theta \\ \cos\theta\sin\theta & \sin^2\theta\end{pmatrix} \tag{1.4.13}$$
The origin of this matrix is derived in Chapter 2. The angle $\theta$ is the angle of the polarizer to the horizontal axis. Any particular component intensity is calculated from $I_k \propto \mathbf{E}^\dagger\mathbf{P}(\theta)\mathbf{E}$, with the factor of one half from the time average as in (1.4.11). The first pair of measurements orient the polarizer in the $\hat{\mathbf{x}}$ direction and $\hat{\mathbf{y}}$ direction. The component intensities are
$$I_x = E_x^2/2 \tag{1.4.14a}$$
$$I_y = E_y^2/2 \tag{1.4.14b}$$
$^1$ The time-average here is only over a few optical cycles. Partial polarization takes time-averages over longer periods.

The second pair of measurements orient the polarizer in the $+45^\circ$ and $-45^\circ$ directions. The component intensities are
$$I_{+45} = (E_x^2 + E_y^2)/4 + (E_xE_y/2)\cos\phi \tag{1.4.15a}$$
$$I_{-45} = (E_x^2 + E_y^2)/4 - (E_xE_y/2)\cos\phi \tag{1.4.15b}$$
One more measurement pair is necessary because handedness cannot be determined since $\cos\phi$ is an even function of $\phi$. To complete the measurements, the optical beam is passed through a $+45^\circ$-oriented quarter-wave waveplate and an $\hat{\mathbf{x}}$- or $\hat{\mathbf{y}}$-oriented polarizer so as to convert R and L hand circular polarizations to linear horizontal and vertical, respectively. The resulting intensities are
$$I_R = (E_x^2 + E_y^2)/4 + (E_xE_y/2)\sin\phi \tag{1.4.16a}$$
$$I_L = (E_x^2 + E_y^2)/4 - (E_xE_y/2)\sin\phi \tag{1.4.16b}$$

These seven measurements can be succinctly combined into four terms called Stokes parameters, which are defined by the equations
$$\begin{aligned}
S_0 &= I_x + I_y &&= (E_x^2 + E_y^2)/2 &&= \tfrac{1}{2}E_o^2 \\
S_1 &= I_x - I_y &&= (E_x^2 - E_y^2)/2 &&= \tfrac{1}{2}E_o^2\cos 2\chi \\
S_2 &= I_{+45} - I_{-45} &&= E_xE_y\cos\phi &&= \tfrac{1}{2}E_o^2\sin 2\chi\cos\phi \\
S_3 &= I_R - I_L &&= E_xE_y\sin\phi &&= \tfrac{1}{2}E_o^2\sin 2\chi\sin\phi
\end{aligned} \tag{1.4.17}$$
From these equations the polarization coordinates $(\chi, \phi)$ can be uniquely determined. Table 1.1 displays representative states in Jones and Stokes form.
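The measurement sequence can be sketched in code. The following illustration models the polarizer with the projection matrix (1.4.13) and assumes the normalization $I = \frac{1}{2}\mathrm{Re}(\mathbf{E}^\dagger\mathbf{P}\mathbf{E})$ so that the results agree with (1.4.14); the circular pair is taken from the closed forms (1.4.16) rather than from a waveplate model. All function names are hypothetical.

```python
import cmath
import math

def polarizer(theta):
    """Projection matrix (1.4.13) for a polarizer at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def intensity(E, P):
    """(1/2) Re(E^dagger P E); the 1/2 is an assumed normalization
    chosen to agree with (1.4.14)."""
    PE0 = P[0][0] * E[0] + P[0][1] * E[1]
    PE1 = P[1][0] * E[0] + P[1][1] * E[1]
    return 0.5 * (E[0].conjugate() * PE0 + E[1].conjugate() * PE1).real

def stokes_from_measurements(Ex, Ey, phi):
    """Combine the intensity pairs into (S0, S1, S2, S3) as in (1.4.17)."""
    E = (complex(Ex), Ey * cmath.exp(1j * phi))
    Ix = intensity(E, polarizer(0.0))
    Iy = intensity(E, polarizer(math.pi / 2.0))
    Ip45 = intensity(E, polarizer(math.pi / 4.0))
    Im45 = intensity(E, polarizer(-math.pi / 4.0))
    # Circular pair from the closed forms (1.4.16); waveplate not modeled.
    IR = (Ex ** 2 + Ey ** 2) / 4.0 + (Ex * Ey / 2.0) * math.sin(phi)
    IL = (Ex ** 2 + Ey ** 2) / 4.0 - (Ex * Ey / 2.0) * math.sin(phi)
    return (Ix + Iy, Ix - Iy, Ip45 - Im45, IR - IL)

def stokes_closed_form(Ex, Ey, phi):
    """Right-most column of (1.4.17) in terms of (Eo, chi, phi)."""
    Eo2 = Ex ** 2 + Ey ** 2
    chi = math.atan2(Ey, Ex)
    return (0.5 * Eo2,
            0.5 * Eo2 * math.cos(2.0 * chi),
            0.5 * Eo2 * math.sin(2.0 * chi) * math.cos(phi),
            0.5 * Eo2 * math.sin(2.0 * chi) * math.sin(phi))

measured = stokes_from_measurements(2.0, 1.0, math.pi / 3.0)
expected = stokes_closed_form(2.0, 1.0, math.pi / 3.0)
right_circular = stokes_from_measurements(1.0, 1.0, math.pi / 2.0)
print(measured, expected, right_circular)
```

For the right-hand circular example only $S_0$ and $S_3$ are non-zero, as expected for a state at the pole of the Stokes space.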
1.4.1 Stokes Vectors, Jones and Mueller Matrices
The Stokes vector S is defined by the projector construct (1.4.17). In general, one can write
$$\mathbf{S} = \begin{pmatrix}S_0 \\ S_1 \\ S_2 \\ S_3\end{pmatrix} \tag{1.4.18}$$
The Stokes vector is the analogue to the Jones vector (1.4.5). One must recognize that directly underlying the Jones vector are Maxwell's equations. The problem is that the Jones vector cannot be directly measured, but the Stokes vector can. The Jones vector is reconstructed from a Stokes vector to within a complex constant $c$ by inverting (1.4.17):
$$\mathbf{E} = c\begin{pmatrix}\sqrt{\tfrac{1}{2}(1 + S_1/S_0)} \\ \sqrt{\tfrac{1}{2}(1 - S_1/S_0)}\,\exp\left(j\tan^{-1}(S_3/S_2)\right)\end{pmatrix} \tag{1.4.19}$$
Other than the undetermined complex constant $c$, there are three free variables in (1.4.19). A Jones vector, however, has four free variables: two amplitudes and two phases. The fourth free variable is the common phase of the two polarization components; this common phase is lost in the intensity measurements.
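The inversion from a Stokes vector back to a Jones vector can be sketched as follows. The example assumes a fully polarized state and sets the undetermined constant to unity; using the quadrant-aware `math.atan2` in place of a plain arctangent is an assumption made here so that the phase is recovered over its full range. The function names are hypothetical.

```python
import cmath
import math

def jones_from_stokes(S0, S1, S2, S3):
    """Unit-amplitude Jones vector from a Stokes vector, with c = 1."""
    ex = math.sqrt(max(0.0, 0.5 * (1.0 + S1 / S0)))
    ey = math.sqrt(max(0.0, 0.5 * (1.0 - S1 / S0)))
    phase = math.atan2(S3, S2)   # quadrant-aware arctangent (assumption)
    return (complex(ex, 0.0), ey * cmath.exp(1j * phase))

def stokes_from_jones(E):
    """Forward map consistent with (1.4.17) for E = (Ex, Ey e^{j phi})."""
    ex, ey = E
    cross_term = ex.conjugate() * ey
    return (0.5 * (abs(ex) ** 2 + abs(ey) ** 2),
            0.5 * (abs(ex) ** 2 - abs(ey) ** 2),
            cross_term.real,
            cross_term.imag)

# Round trip for a hypothetical state chi = pi/6, phi = pi/3.
chi, phi = math.pi / 6.0, math.pi / 3.0
E_in = (complex(math.cos(chi)), math.sin(chi) * cmath.exp(1j * phi))
E_out = jones_from_stokes(*stokes_from_jones(E_in))

print(E_in)
print(E_out)
```

The round trip returns the input only because the input was already normalized with a real, positive first component; the overall amplitude and common phase are not recoverable from the Stokes vector.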
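When light propagates through a medium, the interaction between medium and light can impart a change in the polarization state. In Stokes space, the change of state to $\mathbf{S}'$ from $\mathbf{S}$ is determined by the Mueller matrix $\mathbf{M}$. The general transformation is
$$\begin{pmatrix}S_0' \\ S_1' \\ S_2' \\ S_3'\end{pmatrix} = \begin{pmatrix}m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \\ m_{41} & m_{42} & m_{43} & m_{44}\end{pmatrix}\begin{pmatrix}S_0 \\ S_1 \\ S_2 \\ S_3\end{pmatrix} \tag{1.4.20}$$
In matrix form one writes $\mathbf{S}' = \mathbf{M}\mathbf{S}$. The Mueller matrix is a $4\times 4$ matrix with real-valued entries. Polarimetric measurements find the Mueller matrix elements directly.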
Underlying a Stokes-state transformation $\mathbf{M}$ is the Jones-state transformation $\mathbf{J}$. As with vectors, the Jones transformation matrix comes directly from Maxwell's equations; were it not for the natural advantages of polarimetric measurements the Mueller matrix would simply be a tautology. The Mueller matrix is in any case the analogue to the Jones matrix. In Jones space, an output vector $\mathbf{E}'$ is related to the input vector $\mathbf{E}$ through
$$\mathbf{E}' = \mathbf{J}\mathbf{E} \tag{1.4.21}$$
The Jones matrix $\mathbf{J}$ is a $2\times 2$ matrix with complex-valued entries. The connection between Jones and Mueller matrices is derived using Pauli matrices (cf. §2.6.2). The Mueller matrix is derived from the Jones matrix via
$$M_{i+1,j+1} = \frac{1}{2}\mathrm{Tr}\left(\mathbf{J}\sigma_j\mathbf{J}^\dagger\sigma_i\right) \tag{1.4.22}$$
where $i, j = 0, 1, 2$ or $3$, $\sigma_i$ is the $i$th Pauli matrix, and Tr is the trace operator. The derivation of this expression is given in §2.6.2.
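A direct implementation of the trace formula helps make the index convention concrete. The sketch below is illustrative only: the ordering of the Pauli matrices is an assumption, chosen so that $\sigma_1$, $\sigma_2$, $\sigma_3$ reproduce $S_1$, $S_2$, $S_3$ of (1.4.17), with $\sigma_0$ the identity, and the two example Jones matrices are hypothetical test cases.

```python
# Pauli matrices; sigma_0 is the identity. The ordering is an assumption
# chosen so that sigma_1..sigma_3 reproduce S1..S3 of (1.4.17).
PAULI = [
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[c][r]).conjugate() for c in range(2)] for r in range(2)]

def mueller_from_jones(J):
    """Zero-based M[i][j] = (1/2) Tr(J sigma_j J^dagger sigma_i), cf. (1.4.22)."""
    Jd = dagger(J)
    M = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            prod = matmul(matmul(matmul(J, PAULI[j]), Jd), PAULI[i])
            trace = prod[0][0] + prod[1][1]
            M[i][j] = 0.5 * complex(trace).real
    return M

# Hypothetical examples: an ideal x-oriented polarizer and a retarder with
# a pi phase step between x and y (a half-wave plate aligned to the axes).
M_pol = mueller_from_jones([[1, 0], [0, 0]])
M_hwp = mueller_from_jones([[1, 0], [0, -1]])

for row in M_pol:
    print(row)
for row in M_hwp:
    print(row)
```

The polarizer fills the upper-left block of the Mueller matrix, while the retarder leaves $S_0$ untouched and only reflects the $S_2$ and $S_3$ components.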

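Equation (1.4.22) is not invertible directly. However, R. C. Jones prescribes the way to reconstruct a Jones matrix from output Stokes vectors after three measurements [3, 6]. The three input states for the measurement are $\mathbf{S}_a = (1, 1, 0, 0)^T$, $\mathbf{S}_b = (1, -1, 0, 0)^T$, and $\mathbf{S}_c = (1, 0, 1, 0)^T$. Three output Jones vectors are constructed from the sequence:
$$\begin{pmatrix}\mathbf{S}_a \\ \mathbf{S}_b \\ \mathbf{S}_c\end{pmatrix} \xrightarrow{\ \mathbf{M}\ } \begin{pmatrix}\mathbf{S}_a' \\ \mathbf{S}_b' \\ \mathbf{S}_c'\end{pmatrix} \xrightarrow{\ \text{to Jones}\ } \begin{pmatrix}\mathbf{E}_a' \\ \mathbf{E}_b' \\ \mathbf{E}_c'\end{pmatrix} \tag{1.4.23}$$
From these three Jones vectors four complex ratios are calculated:
$$k_1 = E_{xa}'/E_{ya}',\qquad k_2 = E_{xb}'/E_{yb}',\qquad k_3 = E_{xc}'/E_{yc}',\qquad k_4 = \frac{k_3 - k_2}{k_1 - k_3} \tag{1.4.24}$$
To within a complex constant $c$, as before, the reconstructed Jones matrix is
$$\mathbf{J} = c\begin{pmatrix}k_1k_4 & k_2 \\ k_4 & 1\end{pmatrix} \tag{1.4.25}$$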
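The reconstruction can be exercised on a synthetic example. In the sketch below the Jones matrix and the complex scale factors applied to each output vector are hypothetical values; the scale factors stand in for the amplitude and common phase that are lost when each output Jones vector is rebuilt from its Stokes vector. The sketch also assumes the output y-components are non-zero and that $k_1 \neq k_3$.

```python
import cmath

def apply(J, E):
    """Output Jones vector E' = J E, cf. (1.4.21)."""
    return (J[0][0] * E[0] + J[0][1] * E[1],
            J[1][0] * E[0] + J[1][1] * E[1])

def reconstruct_jones(Ea, Eb, Ec):
    """Ratios (1.4.24) and the matrix (1.4.25) with c = 1."""
    k1 = Ea[0] / Ea[1]
    k2 = Eb[0] / Eb[1]
    k3 = Ec[0] / Ec[1]
    k4 = (k3 - k2) / (k1 - k3)
    return [[k1 * k4, k2], [k4, 1.0]]

# Hypothetical Jones matrix to be recovered (an assumption).
J_true = [[1.0 + 0.5j, 0.2 + 0.0j], [0.3j, 0.8 - 0.1j]]

# Jones vectors of the three input states Sa, Sb, Sc.
s = 2 ** -0.5
inputs = [(1.0, 0.0), (0.0, 1.0), (s, s)]

# Arbitrary unknown complex factors, one per measured output vector.
unknown = [cmath.exp(0.4j), 2.0 * cmath.exp(-1.3j), 0.7j]

outputs = []
for E_in, u in zip(inputs, unknown):
    out = apply(J_true, E_in)
    outputs.append((u * out[0], u * out[1]))

J_rec = reconstruct_jones(*outputs)

# Fix the free constant by matching the lower-right element.
scale = J_true[1][1]
J_scaled = [[scale * J_rec[r][col] for col in range(2)] for r in range(2)]
print(J_scaled)
```

Because only ratios of the output components enter, the unknown factors drop out and the matrix is recovered up to the single constant.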
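Two classes of Jones matrices are particularly important for polarization studies: the Hermitian matrix and unitary matrix. Either matrix is written in the form
$$\mathbf{J} = \begin{pmatrix}a_0 + a_1 & a_2 - ja_3 \\ a_2 + ja_3 & a_0 - a_1\end{pmatrix} \tag{1.4.26}$$
A Hermitian matrix represents a measurement of the polarization state and thus has real-valued eigenvalues. All four coefficients $a_{0,1,2,3}$ in (1.4.26) are real numbers. A unitary matrix represents a coordinate transformation of the Stokes vectors but imparts no loss or gain. Its eigenvalues are related through the matrix exponential, cf. §2.4.3. The Mueller equivalents to these matrices depend on the details, but the characteristic matrix forms are
$$\mathbf{J}_H \rightarrow \mathbf{M}_H = \begin{pmatrix}\bullet & \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet & \bullet\end{pmatrix},\qquad \mathbf{J}_U \rightarrow \mathbf{M}_U = \begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet\end{pmatrix} \tag{1.4.27}$$
While a Hermitian matrix scatters energy to all elements of the Mueller matrix, a unitary matrix keeps all of the light within the three spherical Stokes coordinates; the vector length $S_0$ remains unchanged. This characteristic form shows that $\mathbf{J}_U$ imparts only a rotation.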

Fig. 1.6. Spherical representation of polarization states. a) The cartesian basis is $(S_1, S_2, S_3)$. The equivalent spherical basis is $(r, \theta, \varphi)$. On a unit sphere, $r = 1$, so $(\theta, \varphi)$ coordinates uniquely determine position. b) Identification of particular polarization states on the Poincaré sphere. Along the equator lie linear states. At the north and south poles lie ccw (R) and cw (L) circular states. All remaining points are elliptical states. Orthogonal states are point pairs on opposite sides of the sphere connected by a chord that runs through the origin.

1.4.2 The Poincaré Sphere
Every possible polarization state can be represented on the surface of a unit sphere. The unit sphere is called the Poincaré sphere after H. Poincaré, its creator. A unit sphere is made by normalizing the three-directional Stokes components $S_{1,2,3}$ by the intensity component $S_0$. On a unit sphere, the declination and azimuth angles $\theta$ and $\varphi$ describe any point on the surface. Referring to the polar coordinates illustrated in Fig. 1.6(a), the azimuth and declination angles are projected onto the $(S_1, S_2, S_3)$ basis as
$$\begin{aligned}S_1 &= \sin\theta\cos\varphi \\ S_2 &= \sin\theta\sin\varphi \\ S_3 &= \cos\theta\end{aligned} \tag{1.4.28}$$
Associating spherical parameters to ellipse parameters = 2 and = 2, the
normalized Stokes components S1,2,3 of (1.4.17) are related to the spherical
coordinates as
S1 /S0 = sin 2 cos 2 = cos 2
S2 /S0 = sin 2 cos 2 = sin 2 cos
S3 /S0 = cos 2

= sin 2 sin

(1.4.29)
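As a hedged illustration of the projection onto the sphere, the sketch below converts a Jones vector to Stokes components and then to declination and azimuth angles. It assumes the Jones vector has the form (cos χ, sin χ e^{jφ}) and the Stokes convention S1 = |ex|² − |ey|², S2 = 2 Re(ex* ey), S3 = 2 Im(ex* ey); the function names are illustrative only.

```python
import numpy as np


def stokes_from_jones(ex, ey):
    """Return (S0, S1, S2, S3) for complex field components ex, ey."""
    cross = np.conj(ex) * ey
    return np.array([
        abs(ex) ** 2 + abs(ey) ** 2,
        abs(ex) ** 2 - abs(ey) ** 2,
        2 * cross.real,
        2 * cross.imag,
    ])


def sphere_angles(S):
    """Declination (measured from the +S3 pole) and azimuth on the unit sphere."""
    s1, s2, s3 = S[1:] / S[0]
    declination = np.arccos(np.clip(s3, -1.0, 1.0))
    azimuth = np.arctan2(s2, s1)
    return declination, azimuth


# Example state: projection angle chi = 30 degrees, phase slip phi = 60 degrees.
chi, phi = np.deg2rad(30.0), np.deg2rad(60.0)
S = stokes_from_jones(np.cos(chi), np.sin(chi) * np.exp(1j * phi))
theta, az = sphere_angles(S)
print(S, np.rad2deg(theta), np.rad2deg(az))
```

Rebuilding (sin θ cos φ, sin θ sin φ, cos θ) from the returned angles reproduces the normalized Stokes components, which is the content of the spherical projection.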

Fig. 1.7. Polarization contours on the $(S_1, S_2, S_3)$ sphere. a) Contour of states
for fixed $\chi$ and $0 \le \phi \le 2\pi$. The phase slips through a full revolution. This
effect can be achieved physically by transmission through a waveplate. b) Contour
of states for fixed $\varepsilon$ and $-\pi/2 \le \psi \le \pi/2$. The ellipse does a full rotation while
maintaining its eccentricity. This effect can be achieved physically by transmission
through an optically active waveplate. c) Contour of states for fixed $\phi$ and for
$-\pi/2 \le \chi \le \pi/2$. $\phi$ determines the tilt of the plane. Any two orthogonal states
lie on such a contour, the states being separated by 180°. d) Contour of states for
fixed $\psi$ and varying $\varepsilon$. The eccentricity of the ellipse varies between linear and
circular, but the pointing direction remains either vertical or horizontal.


Figure 1.6(b) illustrates the polarization states on the coordinate axes.
Figure 1.7(a-d) illustrates various contours on the Poincaré sphere and their
associations with $\chi$, $\phi$, $\psi$, and $\varepsilon$.
It is significant that the variables $\chi$, $\psi$, and $\varepsilon$ have a multiplier of two
in (1.4.29) while $\phi$ does not. Physically, any full $2\pi$ phase slip of $\phi$ yields
the identical polarization state; distinct optical phases within a $2\pi$ range
correspond to distinct polarization states. In contrast, a $\pi$ change in the $\chi$, $\psi$,
and $\varepsilon$ parameters does not change the state. This is physically reasonable
as an ellipse is preserved under 180° rotation, and $(E_x, E_y) \rightarrow (-E_x, -E_y)$
or $(a, b) \rightarrow (-a, -b)$ inversion. Jones space includes a built-in degeneracy of
elliptical parameters $\chi$, $\psi$, and $\varepsilon$.
The spherical representation provides a geometric interpretation of the
transformations that polarization states undergo when propagating through
birefringent media. This representation will be used extensively throughout
the text. There are, however, two drawbacks to the geometric interpretation.
First, as the Stokes parameters are determined through measurements of
intensity, only the polarization phase modulo $2\pi$ can be determined. In the
study of polarization-mode dispersion, two orthogonally polarized waves can
accrue thousands of $2\pi$ phase revolutions. As delay is defined as $\tau = \phi/\omega$,
it is essential to track the total number of phase revolutions as well as any
partial slip. Polarization-mode dispersion requires a modification to the Stokes
calculus to treat the delay as well as the phase. Second, the polarization of a
state by an arbitrarily oriented polarizer is difficult to picture in Stokes space.
The projection due to the polarizer is more easily pictured in physical space.
It is good practice to intuit a polarization state seamlessly in both Stokes and
Jones space as a more robust understanding is achieved.

1.5 Partial Polarization


A wave is fully polarized when all component polarizations of a ray-bundle
oscillate coherently. Such is the case with a laser. By contrast, natural light,
such as light from the sun, is fully depolarized: the components of a ray-bundle
are completely incoherent and the instantaneous polarization over a differential
bandwidth can point in any direction on the Poincaré sphere. Partially
polarized light can be naturally partially polarized in that some fraction of
the ray-bundle is polarized and the remaining part naturally depolarized,
or can be pseudo-depolarized in that all components individually remain
fully polarized but the polarization of the sum is not. The instantaneous
polarization of pseudo-depolarized light touches a limited locus of points on the
Poincaré sphere.
There are two ways to express partial polarization: the degree of polarization
(DOP, denoted D) and the Jones coherency matrix J. DOP is a scalar
value between zero and one and can be expressed in terms of Stokes or Jones
parameters. The Jones coherency matrix is derived from the dyadic form of
the Jones vector and is used to trace depolarization through a system in Jones
space. The coherency matrix is a necessary augmentation to Jones calculus
because the 16 free variables of the Mueller matrix are enough to include
depolarization directly, while the eight free variables of the Jones matrix do
not provide enough freedom.
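In terms of Stokes parameters, DOP is defined as
$$
D = \frac{\sqrt{\langle S_1 \rangle^2 + \langle S_2 \rangle^2 + \langle S_3 \rangle^2}}{\langle S_0 \rangle}
\tag{1.5.1}
$$
where the time averages are given by
$$
\langle S(t) \rangle = \frac{1}{T} \int_0^T S(t)\, dt
$$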

The time average is taken over all time-varying quantities, i.e. $\omega t$, $\chi(t)$, $\phi(t)$,
etc. D = 1 means that all waves that make up a ray bundle each have fully
determined, time-invariant polarizations. D = 0 means the polarimetric terms
of the ray bundle have vanishing time averages, but the underlying cause,
e.g. whether from incoherence or pseudo-depolarization, cannot be discerned
using D alone. An intermediate value of D means that some of the optical
power is polarized and the remaining power is not.
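In terms of the coherency matrix, DOP is defined as
$$
D = \sqrt{1 - \frac{4 \det(J)}{\mathrm{Tr}(J)^2}}
\tag{1.5.2}
$$
The coherency matrix is defined by $J = \langle E E^\dagger \rangle$ [9], where
$$
E(t) = \begin{pmatrix} e_x(t) \\ e_y(t) \end{pmatrix},
\quad \text{and} \quad
J = \begin{pmatrix}
\langle e_x e_x^* \rangle & \langle e_x e_y^* \rangle \\
\langle e_x^* e_y \rangle & \langle e_y e_y^* \rangle
\end{pmatrix}
\tag{1.5.3}
$$
and where $(e_x, e_y)$ are complex numbers. Finally, the time-averaged Stokes
parameters in terms of the coherency-matrix elements are
$$
\begin{pmatrix} \langle S_0 \rangle \\ \langle S_1 \rangle \\ \langle S_2 \rangle \\ \langle S_3 \rangle \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & j & -j
\end{pmatrix}
\begin{pmatrix} J_{xx} \\ J_{yy} \\ J_{xy} \\ J_{yx} \end{pmatrix}
\tag{1.5.4}
$$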
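The two DOP definitions can be compared on sampled data. The sketch below is an illustration under stated assumptions: the field is represented by arrays of complex samples, the time average is replaced by a sample mean, and the example signal (half of the samples linearly polarized along x, half at 45°) is invented for the demonstration. All function names are hypothetical.

```python
import numpy as np


def coherency_matrix(ex, ey):
    """Sample-mean estimate of J = <E E^dagger> from complex field samples."""
    return np.array([
        [np.mean(ex * np.conj(ex)), np.mean(ex * np.conj(ey))],
        [np.mean(np.conj(ex) * ey), np.mean(ey * np.conj(ey))],
    ])


def dop_from_coherency(J):
    """DOP as sqrt(1 - 4 det(J) / Tr(J)^2)."""
    ratio = 4 * np.linalg.det(J).real / np.trace(J).real ** 2
    return np.sqrt(max(0.0, 1.0 - ratio))


def stokes_from_coherency(J):
    """Averaged Stokes parameters from (Jxx, Jyy, Jxy, Jyx)."""
    A = np.array([[1, 1, 0, 0], [1, -1, 0, 0], [0, 0, 1, 1], [0, 0, 1j, -1j]])
    return (A @ np.array([J[0, 0], J[1, 1], J[0, 1], J[1, 0]])).real


def dop_from_stokes(S):
    """DOP as |(S1, S2, S3)| / S0."""
    return np.linalg.norm(S[1:]) / S[0]


# Invented example: equal dwell time in x-linear and 45-degree linear states.
n = 1000
ex = np.concatenate([np.ones(n), np.full(n, 1 / np.sqrt(2))]).astype(complex)
ey = np.concatenate([np.zeros(n), np.full(n, 1 / np.sqrt(2))]).astype(complex)

J = coherency_matrix(ex, ey)
S = stokes_from_coherency(J)
print(dop_from_coherency(J), dop_from_stokes(S))
```

Both routes give the same value for this example because Tr(J)² − 4 det(J) equals the sum of the squared averaged Stokes components S1, S2, and S3.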
Both D and J are inherently time-averaged measures. The integration period
can affect the reported values. For instance, a monochromatic source that
has a coherence time of 0.1 sec certainly produces polarized waves on time
scales $T \ll 0.1$ sec. However, polarization states separated by $T > 0.1$ sec are
uncorrelated. A D measure taken over a long time scale would produce a
sub-unity value, while a D measure over a short time scale would produce $D \approx 1$.
Both answers are technically correct and the issue reduces to what is a relevant
time scale. That will depend on the application.
The following studies of partial polarization are grouped into ray bundles
comprised of coherent, or polarized, components; incoherent, or depolarized,
components; heterogeneous combinations of coherent and incoherent components;
and pseudo-depolarized components. In all cases the ray-bundle components
are collinear. In the following calculations, the electric-field spectrum
is denoted as
$$
E(\omega) = E_o\, G(\omega) \sum_n \hat{p}_n(\omega)
\tag{1.5.5}
$$
where $G(\omega)$ is the spectral profile, $E_o$ is complex, and $\hat{p}_n(\omega)$ is the nth
polarization at $\omega$. The time-dependent field $E(t)$ is the inverse Fourier transform
of $E(\omega)$:
$$
E(t) = \int E_o\, G(\omega) \sum_n \hat{p}_n(\omega)\, e^{j\omega t}\, d\omega
\tag{1.5.6}
$$

1.5.1 Coherently Polarized Waves


The common feature of the four cases studied below is that the polarization
of each component is time-invariant and independent of frequency. The study
begins with a single monochromatic wave and generalizes to narrowband ray
bundles having either discrete or continuous spectra. The studies show that for
coherently polarized waves, only pseudo-depolarization can reduce the degree
of polarization below unity.
A Monochromatic Polarized Wave
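The simplest case is a single monochromatic polarized plane wave. The field
spectrum is
$$
E(\omega) = E_o\, \delta(\omega - \omega_o)\, \hat{p}
\tag{1.5.7}
$$
where $\delta(\omega - \omega_o)$ is the Dirac delta function centered at $\omega_o$. In the time domain,
the plane wave is
$$
E(t) = E_o\, e^{j\omega_o t} \begin{pmatrix} \cos\chi \\ \sin\chi\, e^{j\phi} \end{pmatrix}
$$
The corresponding Stokes parameters are
$$
S = |E_o|^2 \begin{pmatrix} 1 \\ \cos 2\chi \\ \sin 2\chi \cos\phi \\ \sin 2\chi \sin\phi \end{pmatrix}
\tag{1.5.8}
$$
As $\chi$ and $\phi$ are fixed in time, substitution of (1.5.8) into (1.5.1) yields $D = 1$.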


The coherency matrix is
$$
J = \begin{pmatrix}
\cos^2\chi & e^{-j\phi} \sin\chi \cos\chi \\
e^{j\phi} \sin\chi \cos\chi & \sin^2\chi
\end{pmatrix}
\tag{1.5.9}
$$
The polarization state of this wave is completely determined.


A Monochromatic Wave Having Multiple Polarizations
The spectrum of a ray bundle that comprises multiple monochromatic polarized
waves of multiple polarization components is written as
$$
E(\omega) = \sum_n E_{on}\, \delta(\omega - \omega_o)\, \hat{p}_n
\tag{1.5.10}
$$
The time-domain field of the ray bundle is
$$
E(t) = e^{j\omega_o t} \sum_n E_{on} \begin{pmatrix} \cos\chi_n \\ \sin\chi_n\, e^{j\phi_n} \end{pmatrix}
$$
While the polarimetric parameters of the combined wave may be complicated,
they do not vary in time. One can verify that
$$
\langle S_1 \rangle^2 + \langle S_2 \rangle^2 + \langle S_3 \rangle^2 = \left( \langle e_x e_x^* \rangle + \langle e_y e_y^* \rangle \right)^2
$$
and thus $D = 1$. A ray bundle that is constituted from multiple monochromatic
coherent waves has a polarization state that is completely determined.
The intensity of the ray bundle is calculated from $S_0$, or
$$
I_{coh} = \langle S_0 \rangle = \sum_n |E_{on}|^2
\tag{1.5.11}
$$
where $I_{coh}$ denotes the intensity of the coherent waves.


Narrowband Polarized Waves with Discrete Spectrum
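Consider an extension of (1.5.7) where the spectrum comprises multiple
frequency components, each component itself being polarized:
$$
E(\omega) = \sum_n E_{on}\, \delta(\omega - \omega_n)\, \hat{p}_n
\tag{1.5.12}
$$
In the time domain, this discretely polychromatic wave is
$$
E(t) = \sum_n e^{j\omega_n t} E_{on} \begin{pmatrix} \cos\chi_n \\ \sin\chi_n\, e^{j\phi_n} \end{pmatrix}
\tag{1.5.13}
$$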

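The polarimetric parameters $\chi_n$ and $\phi_n$ for each frequency component are
fixed in time ($D_n = 1$) and the frequencies $\omega_n$ are distinct. After summation,
however, the composite polarimetric parameters do depend on time. In general
this leads to $D_{total} < 1$.
The depolarization is calculated as follows. Consider first the $S_1$ Stokes
parameter:
$$
\begin{aligned}
S_1 &= e_x e_x^* - e_y e_y^* \\
&= \Big( \sum_m e^{j\omega_m t} E_{om} \cos\chi_m \Big) \Big( \sum_n e^{-j\omega_n t} E_{on}^* \cos\chi_n \Big) \\
&\quad - \Big( \sum_m e^{j\omega_m t} E_{om} \sin\chi_m\, e^{j\phi_m} \Big) \Big( \sum_n e^{-j\omega_n t} E_{on}^* \sin\chi_n\, e^{-j\phi_n} \Big)
\end{aligned}
$$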

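The time averages on normalized components are
$$
\Big\langle \Big( \sum_m e^{j\omega_m t} \cos\chi_m \Big) \Big( \sum_n e^{-j\omega_n t} \cos\chi_n \Big) \Big\rangle = \sum_n \cos^2\chi_n
$$
and
$$
\Big\langle \Big( \sum_m e^{j\omega_m t} \sin\chi_m\, e^{j\phi_m} \Big) \Big( \sum_n e^{-j\omega_n t} \sin\chi_n\, e^{-j\phi_n} \Big) \Big\rangle = \sum_n \sin^2\chi_n
$$
where the time-average window is $T \gg [\min(\omega_n - \omega_m)]^{-1}$. All cross terms
are eliminated upon averaging, and the same holds for $S_2$ and $S_3$. In general
the three time-averaged Stokes parameters are
$$
\langle S_k \rangle = \sum_n \langle S_{kn} \rangle
\tag{1.5.14}
$$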

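Accordingly, the degree of polarization is
$$
D = \frac{\sqrt{\left( \sum_n \langle S_{1n} \rangle \right)^2 + \left( \sum_n \langle S_{2n} \rangle \right)^2 + \left( \sum_n \langle S_{3n} \rangle \right)^2}}{\sum_n \langle S_{0n} \rangle}
$$
Since $D_n$ of each component is unity, it follows that
$\langle S_{0n} \rangle^2 = \langle S_{1n} \rangle^2 + \langle S_{2n} \rangle^2 + \langle S_{3n} \rangle^2$.
By iterating the triangle inequality
$$
|r_1 + r_2| \le |r_1| + |r_2|
$$
one concludes that
$$
\sqrt{\left( \sum_n \langle S_{1n} \rangle \right)^2 + \left( \sum_n \langle S_{2n} \rangle \right)^2 + \left( \sum_n \langle S_{3n} \rangle \right)^2} \le \sum_n \langle S_{0n} \rangle
$$
Therefore, in general, the DOP for a discretely polychromatic ray bundle is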

$$
D = \frac{\sqrt{\left( \sum_n \langle S_{1n} \rangle \right)^2 + \left( \sum_n \langle S_{2n} \rangle \right)^2 + \left( \sum_n \langle S_{3n} \rangle \right)^2}}{I_{coh}}
\tag{1.5.15}
$$

Fig. 1.8. Stokes vectors $r_k$ in a plane ($S_1$, $S_2$ axes). On the left, individual vector
components $r_1, \ldots, r_5$: the vector direction is a function of frequency. On the right,
the length of the vector sum $|r_1 + r_2 + \ldots + r_5|$ is generally less than the arithmetic
sum of the vector lengths.

where $I_{coh}$ is given by (1.5.11). Equation (1.5.15) does provide some physical
insight even though a specific expression is lacking. As Fig. 1.8 illustrates,
when the Stokes vectors for the various frequencies are nearly aligned, then
$D \approx 1$. However, when the vector components are not aligned the overall DOP
is reduced. Passage through a birefringent element can pseudo-depolarize this
ray bundle (more detail is found in §1.5.3), but otherwise the addition of
more coherent components in and of itself does not decrease the degree of
polarization of the total.
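The vector-sum picture of Fig. 1.8 can be sketched numerically. In the minimal example below, each frequency component is assumed to be fully polarized with Stokes vector proportional to (1, cos 2χ, sin 2χ cos φ, sin 2χ sin φ); the five-component bundles and the helper names are invented for illustration.

```python
import numpy as np


def stokes_vector(power, chi, phi):
    """Stokes vector of one fully polarized component with the given power."""
    return power * np.array([
        1.0,
        np.cos(2 * chi),
        np.sin(2 * chi) * np.cos(phi),
        np.sin(2 * chi) * np.sin(phi),
    ])


def dop_discrete(components):
    """DOP of a bundle: length of the summed (S1, S2, S3) over the summed S0."""
    total = np.sum(components, axis=0)
    return np.linalg.norm(total[1:]) / total[0]


# Five equal-power components with identical polarization.
aligned = [stokes_vector(1.0, np.pi / 4, 0.0) for _ in range(5)]

# Five equal-power components fanned over half a great circle.
fanned = [stokes_vector(1.0, np.pi / 4, phi) for phi in np.linspace(0.0, np.pi, 5)]

d_aligned = dop_discrete(aligned)
d_fanned = dop_discrete(fanned)
print(d_aligned, d_fanned)
```

The aligned bundle keeps a DOP of one, while the fanned bundle gives a value well below one because the vector sum is shorter than the arithmetic sum of the lengths.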
A Narrowband Polarized Wave With Continuous Spectrum
A narrowband polarized wave is one where a modulation has been imprinted
on a carrier. The broadening of the spectrum in this way does not entail a
frequency-dependent polarization rotation. Accordingly, the spectrum is
written as
$$
E(\omega) = E_o\, |G(\omega)|\, e^{j\phi_G(\omega)}\, \hat{p}
\tag{1.5.16}
$$
where $G(\omega)$ is the modulated spectral profile. $G(\omega)$ is continuous for
broadband modulation and discrete for harmonic modulation. The polarization
direction is fixed along $\hat{p}$ and the profile amplitude is taken as a bounded function
which goes to zero outside a bandwidth of $\Delta\omega$. The time-domain electric field
is
$$
E(t) = \hat{p} \int E_o\, |G(\omega)|\, e^{j\phi_G(\omega)}\, e^{j\omega t}\, d\omega
$$
Consider first the $e_x e_x^*$ product:
$$
e_x e_x^* = \iint |E_o|^2\, |G(\omega_1)|\, |G(\omega_2)|\, e^{j(\phi_G(\omega_2) - \phi_G(\omega_1))}\, e^{j(\omega_2 - \omega_1)t} \cos^2\chi \; d\omega_1\, d\omega_2
$$

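Simplification comes with the time-average operation, where
$$
\langle e_x e_x^* \rangle = \frac{1}{T} \int_0^T e_x e_x^*\, dt
$$
generates a Dirac delta $\delta(\omega_2 - \omega_1)$ once the temporal integral is moved through
to the $\exp j(\omega_2 - \omega_1)t$ term. Therefore,
$$
\begin{aligned}
\langle e_x e_x^* \rangle &= \iint |E_o|^2\, |G(\omega_1)|\, |G(\omega_2)|\, e^{j(\phi_G(\omega_2) - \phi_G(\omega_1))} \cos^2\chi \; \delta(\omega_2 - \omega_1)\, d\omega_1\, d\omega_2 \\
&= |E_o|^2\, I_G \cos^2\chi
\end{aligned}
\tag{1.5.17}
$$
where the integral $I_G$ is
$$
I_G = \int |G(\omega)|^2\, d\omega
\tag{1.5.18}
$$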

Following the same procedure,
$$
\langle e_y e_y^* \rangle = |E_o|^2\, I_G \sin^2\chi
$$
and
$$
\langle e_x^* e_y \rangle = \langle e_x e_y^* \rangle^* = |E_o|^2\, I_G \sin\chi \cos\chi\, e^{j\phi}
$$
The time-averaged Stokes parameters are
$$
\langle S \rangle = |E_o|^2\, I_G \begin{pmatrix} 1 \\ \cos 2\chi \\ \sin 2\chi \cos\phi \\ \sin 2\chi \sin\phi \end{pmatrix}
\tag{1.5.19}
$$
and thus $D = 1$. This derivation shows that line broadening due to modulation
does not in itself alter the degree of polarization of the light. The light
can be pseudo-depolarized, however. Contrary to a discrete spectrum, for a
continuous spectrum $D \rightarrow 0$ monotonically with increasing bandwidth-delay
product from the depolarizing element.
1.5.2 Incoherently Depolarized Waves
Incoherently depolarized waves are comprised of individual components having
time-varying polarimetry parameters. Light from the sun or noise from
an optical amplifier are examples of completely depolarized light. An exposed
air-gap polarization-dependent delay line, used to generate differential-group
delay, can have a time-dependent retardance with a fixed ellipsometric
orientation. The DOP of this source depends on the orientation of the input
state.

A Narrowband Incoherent Wave

A narrowband incoherent wave is one where the projection angle $\chi$ and/or
the phase slip $\phi$ changes with time. The field amplitude may also change
in time, but that impacts only the wave intensity rather than the polarization
state. The time scale for $\chi(t)$ and $\phi(t)$ change is assumed to be significantly
shorter than the integration time of the D measurement. Moreover, it
is understood that $\phi(t)$ of a single wave is synonymous with frequency shift,
which makes the wave technically narrowband rather than monochromatic;
it is assumed that $\phi(t)$ changes slowly enough so that the line broadening is
inconsequential. A narrowband, incoherent-wave spectrum is written as
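$$
E(\omega) = E_o\, \delta(\omega - \omega_o)\, \hat{p}(B)
\tag{1.5.20}
$$
where B denotes a spectral bandwidth which is consistent with the integration
time. The time-domain field is
$$
E(t) = E_o\, e^{j\omega_o t} \begin{pmatrix} \cos\chi(t) \\ \sin\chi(t)\, e^{j\phi(t)} \end{pmatrix}
\tag{1.5.21}
$$
The corresponding Stokes parameters are
$$
S(t) = |E_o|^2 \begin{pmatrix} 1 \\ \cos 2\chi(t) \\ \sin 2\chi(t) \cos\phi(t) \\ \sin 2\chi(t) \sin\phi(t) \end{pmatrix}
\tag{1.5.22}
$$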

Consider an exposed air-gap polarization-dependent delay line with a stable
input polarization. The input polarization beam splitter projects the input
light onto two orthogonal axes and delays one with respect to the other. For
a stable input polarization, the projection is fixed in time: $\chi(t) = \chi_o$. The
exposed delay arm, however, imparts a time-varying retardance. In this case,
the time-averaged Stokes parameters are
$$
\langle S \rangle = |E_o|^2 \begin{pmatrix} 1 \\ \cos 2\chi_o \\ 0 \\ 0 \end{pmatrix}
$$
The degree of polarization is therefore
$$
D = |\cos 2\chi_o|
\tag{1.5.23}
$$
D can attain values $0 \le D \le 1$. When $\chi_o = 0$ all light travels in one arm or the
other. Therefore $D = 1$ as no relative phase shift is experienced. Alternatively,
when $\chi_o = 45°$, the light is equally split between the two arms and $D = 0$.
One should be careful about the stability of air-gap polarization controllers.
Separately, consider the more general case where both $\chi$ and $\phi$ change in
time. In this case $\langle S \rangle = [1\ 0\ 0\ 0]^T$ and $D = 0$ over suitably long integration
periods.
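The delay-line result can be reproduced with a short numerical average. The sketch below assumes a fixed projection angle and a retardance that sweeps uniformly over a full 2π revolution during the integration window; the uniform sweep and the function name are illustrative assumptions, not a model of any particular device.

```python
import numpy as np


def dop_with_drifting_phase(chi0, n_samples=3600):
    """DOP for fixed projection angle chi0 and uniformly swept phase slip."""
    phi = np.linspace(0.0, 2 * np.pi, n_samples, endpoint=False)
    s1 = np.full_like(phi, np.cos(2 * chi0))
    s2 = np.sin(2 * chi0) * np.cos(phi)
    s3 = np.sin(2 * chi0) * np.sin(phi)
    # With unit intensity (S0 = 1), the DOP is the length of the mean vector.
    return np.linalg.norm([s1.mean(), s2.mean(), s3.mean()])


for chi0_deg in (0.0, 22.5, 45.0):
    print(chi0_deg, dop_with_drifting_phase(np.deg2rad(chi0_deg)))
```

The S2 and S3 averages vanish over the full sweep, so only the cos 2χ term along S1 survives, which matches the absolute-cosine dependence on the input projection angle.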

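Multiple Narrowband Incoherent Waves

As an extension of (1.5.21), a ray bundle composed of multiple narrowband
incoherent waves is written as
$$
E(t) = e^{j\omega t} \sum_n E_{on} \begin{pmatrix} \cos\tilde{\chi}_n \\ \sin\tilde{\chi}_n\, e^{j\tilde{\phi}_n} \end{pmatrix}
\tag{1.5.24}
$$
where $\tilde{\chi}$ and $\tilde{\phi}$ denote random variables $\chi$ and $\phi$ in time. The distributions
of $\tilde{\chi}$ and $\tilde{\phi}$ are uniform for each wave in the ray bundle. The compound
polarimetric parameters depend on time, too, and the averages are found as
follows. Consider first the $S_1$ term, where
$$
\begin{aligned}
S_1 &= e_x e_x^* - e_y e_y^* \\
&= \Big( \sum_m E_{om} \cos\tilde{\chi}_m \Big) \Big( \sum_n E_{on}^* \cos\tilde{\chi}_n \Big)
 - \Big( \sum_m E_{om} \sin\tilde{\chi}_m\, e^{j\tilde{\phi}_m} \Big) \Big( \sum_n E_{on}^* \sin\tilde{\chi}_n\, e^{-j\tilde{\phi}_n} \Big)
\end{aligned}
\tag{1.5.25}
$$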

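Now, since $\tilde{\chi}_m$ and $\tilde{\chi}_n$ are uncorrelated, only diagonal components of the
product-of-sums are non-zero after time averaging. For any pair of indices,
$$
\langle \cos\tilde{\chi}_m \cos\tilde{\chi}_n \rangle = \frac{1}{2}\, \delta_{m,n}
$$
where $\delta_{m,n}$ is the Kronecker delta function defined by $\delta_{m,n} = 1$ if $m = n$ and
$\delta_{m,n} = 0$ otherwise. The time averages over the sums are therefore
$$
\Big\langle \Big( \sum_m \cos\tilde{\chi}_m \Big) \Big( \sum_n \cos\tilde{\chi}_n \Big) \Big\rangle = \frac{N}{2}
$$
and
$$
\Big\langle \Big( \sum_m \sin\tilde{\chi}_m\, e^{j\tilde{\phi}_m} \Big) \Big( \sum_n \sin\tilde{\chi}_n\, e^{-j\tilde{\phi}_n} \Big) \Big\rangle = \frac{N}{2}
$$
where the time-average is long enough and the absence of the weighting
coefficients is irrelevant in the limit. Therefore,
$$
\langle S_1 \rangle \rightarrow 0
$$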
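Now consider
$$
\begin{aligned}
\langle S_2 \rangle &= \langle e_x e_y^* + e_y e_x^* \rangle \\
&= \Big\langle \Big( \sum_m \cos\tilde{\chi}_m \Big) \Big( \sum_n \sin\tilde{\chi}_n\, e^{-j\tilde{\phi}_n} \Big)
 + \Big( \sum_m \sin\tilde{\chi}_m\, e^{j\tilde{\phi}_m} \Big) \Big( \sum_n \cos\tilde{\chi}_n \Big) \Big\rangle
\end{aligned}
$$
Unlike (1.5.25), the time averages for both on- and off-diagonal components
of $\langle S_2 \rangle$ are zero. Consequently,
$$
\langle S_2 \rangle \rightarrow 0, \quad \text{and} \quad \langle S_3 \rangle \rightarrow 0
$$
The only non-vanishing Stokes parameter is $S_0$, the total intensity. The
time-average intensity $I_{incoh}$ for an incoherently depolarized ray bundle is
$$
I_{incoh} = \langle S_0 \rangle = \sum_n |E_{on}|^2
\tag{1.5.26}
$$
and the degree of polarization is $D = 0$.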


A significant extension of the preceding derivation is that multiple incoherent
waves need not be narrowband but can be discretely or continuously
polychromatic. Relocation of the $\exp(j\omega t)$ term of (1.5.24) within the
summations does not change the vanishing time-average nature of $S_{1,2,3}$. However,
polychromatic wave addition can relax the distribution property constraints
of $\tilde{\chi}$ and $\tilde{\phi}$ to achieve $D = 0$.
1.5.3 Pseudo-Depolarized Waves
Pseudo-depolarized waves are waves that start fully polarized and are then
depolarized by passage through a birefringent crystal. This configuration is
called a Lyot depolarizer. The depolarizer imparts a frequency-dependent
polarization on the components of the input light. Unlike natural polarization
where each light component uniformly covers the Poincaré sphere,
pseudo-depolarized light retains a well-defined pointing direction for each polarization
component; these directions vary with frequency.
Consider a single-crystal depolarizer oriented at 45° to a horizontally
polarized input state. Denote $\tau = \Delta n L / c$, where $\Delta n$ is the birefringence, L is
the length, and c is the speed of light. The output polarization state is
$$
\begin{pmatrix} e^{-j\omega\tau/2} & 0 \\ 0 & e^{j\omega\tau/2} \end{pmatrix}
\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \frac{e^{-j\omega\tau/2}}{\sqrt{2}} \begin{pmatrix} 1 \\ e^{j\omega\tau} \end{pmatrix}
\tag{1.5.27}
$$
It is readily verified that $S_1 = 0$. The non-vanishing Stokes parameters are
$$
S_2 = \cos\omega\tau, \qquad S_3 = \sin\omega\tau
$$
These parameters are time invariant, but the pointing direction of the Stokes
vector changes with frequency. For this example, an arc along a line of
longitude on the Poincaré sphere is traced, the subtended arc angle being $\Delta\omega\,\tau$.
More generally, consider the Jones matrix in (1.5.27) operating on a polarized
narrowband wave having a continuous spectrum (1.5.16). The spectrum
has a modified polarimetric parameter due to the $\exp(j\omega\tau)$ term. The
time-domain field components are

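$$
\begin{aligned}
e_x(t) &= E_o \cos\chi \int G(\omega)\, e^{j\omega t}\, d\omega \\
e_y(t) &= E_o \sin\chi\, e^{j\phi} \int G(\omega)\, e^{j\omega\tau}\, e^{j\omega t}\, d\omega
\end{aligned}
$$
Following the time-averaging procedure of (1.5.17), the off-diagonal components
of J are
$$
\langle e_x^* e_y \rangle = \langle e_x e_y^* \rangle^* = |E_o|^2 \sin\chi \cos\chi\, e^{j\phi}\, I_G(\tau)
$$
where
$$
I_G(\tau) = \int |G(\omega)|^2\, e^{j\omega\tau}\, d\omega
= \int |G(\omega)|^2 \cos(\omega\tau)\, d\omega + j \int |G(\omega)|^2 \sin(\omega\tau)\, d\omega
\tag{1.5.29}
$$
The diagonal components of J are
$$
\begin{aligned}
\langle e_x e_x^* \rangle &= |E_o|^2\, I_G(0) \cos^2\chi \\
\langle e_y e_y^* \rangle &= |E_o|^2\, I_G(0) \sin^2\chi
\end{aligned}
$$
Taking these factors into account, the Stokes parameters for a
pseudo-depolarized narrowband wave are
$$
\langle S \rangle = |E_o|^2 \begin{pmatrix}
I_G(0) \\
I_G(0) \cos 2\chi \\
|I_G(\tau)| \sin 2\chi \cos\left( \phi + \angle I_G(\tau) \right) \\
|I_G(\tau)| \sin 2\chi \sin\left( \phi + \angle I_G(\tau) \right)
\end{pmatrix}
\tag{1.5.30}
$$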

Since $|G(\omega)|^2$ is always positive, the sine and cosine integrands in (1.5.29)
are the only sources able to decrease $I_G(\tau)$, see Fig. 1.9. In the limit that
$\tau \rightarrow 0$, the oscillatory terms are nearly stationary and $I_G(\tau) \rightarrow I_{G,max}$.
Conversely, when there is enough birefringent delay such that $\tau \gg \Delta\omega^{-1}$, the
oscillatory terms vary rapidly, resulting in $I_G(\tau) \rightarrow 0$. For a continuous spectrum,
the DOP decreases monotonically with increasing delay-bandwidth product.
It is interesting to note that $\tau \gg \Delta\omega^{-1}$ is a necessary but not sufficient
condition for a single-stage Lyot depolarizer to drive $D \rightarrow 0$. If the input
polarization is aligned to an eigenaxis of the crystal then there is no dispersion
of the polarization vector over frequency. The DOP remains unity. The DOP
is minimized when the input polarization is equally split between axes of
the crystal. For this reason, two or more stages are generally used in a Lyot
depolarizer.
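The dependence of the DOP on delay and on input alignment can be explored with a small numerical sketch. It assumes a Gaussian power spectrum |G(ω)|² = exp(−(ω/Δω)²), which is an illustrative choice rather than anything specified in the text, and evaluates the spectral integral with a plain Riemann sum. The DOP follows from the Stokes components of the pseudo-depolarized wave as D² = cos² 2χ + sin² 2χ |I_G(τ)|² / I_G(0)².

```python
import numpy as np


def lyot_dop(chi, tau, bandwidth, n=4001):
    """DOP after a single-stage depolarizer for an assumed Gaussian spectrum.

    chi       : projection angle of the input state onto the crystal axes
    tau       : differential delay of the crystal
    bandwidth : spectral width parameter of the assumed Gaussian |G|^2
    """
    omega = np.linspace(-6 * bandwidth, 6 * bandwidth, n)
    d_omega = omega[1] - omega[0]
    power = np.exp(-(omega / bandwidth) ** 2)  # |G(omega)|^2 (assumption)
    ig_0 = np.sum(power) * d_omega
    ig_tau = np.sum(power * np.exp(1j * omega * tau)) * d_omega
    ratio = abs(ig_tau) / ig_0
    return np.sqrt(np.cos(2 * chi) ** 2 + (np.sin(2 * chi) * ratio) ** 2)


taus = np.linspace(0.0, 6.0, 13)  # delay in units of 1/bandwidth
dop_split = np.array([lyot_dop(np.pi / 4, t, 1.0) for t in taus])
dop_aligned = np.array([lyot_dop(0.0, t, 1.0) for t in taus])
print(np.round(dop_split, 4))
print(np.round(dop_aligned, 4))
```

For this assumed spectrum the equally split input decays as exp(−(Δω τ)²/4), while the input aligned to a crystal axis keeps unit DOP at every delay, which illustrates why a large delay alone is not sufficient.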

Fig. 1.9. Single-stage Lyot depolarizer impact on a continuous narrowband
spectrum (signal, birefringence variation, and composite spectrum plotted against
frequency). a) Delay smaller than inverse signal bandwidth yields slow birefringence
variation; the depolarizer has small effect on the integral $I_G(\tau)$. b) Delay much
larger than inverse signal bandwidth; $I_G(\tau)$ is significantly smaller in this case. As
$\tau$ increases, $D \rightarrow 0$ monotonically.

In contrast to the continuous-spectrum case, consider a discrete spectrum
described by
$$
G(\omega) = \sum_n g_n\, \delta(\omega - \omega_n)
$$
where amplitudes $g_n$ decrease away from $\omega_o$. In this case $I_G(\tau)$ converts to
$$
I_G(\tau) = \sum_n g_n^2 \exp(j\omega_n \tau)
\tag{1.5.31}
$$
In contrast with the continuous wave, the integral $I_G(\tau)$ does not
monotonically decrease. Rather, the sum oscillates with a decreasing envelope as $\tau$
increases. The components of (1.5.31) are phasors (see Fig. 1.8), and the angle
between adjacent phasors is determined by $\tau$. As the phasors fan out for
increasing $\tau$ eventually all even phasors point along +1 and all odd phasors
point along −1. The sum is zero if the spectrum is symmetric. Subsequent
doubling of $\tau$ points all phasors along +1. Such oscillation persists until the
birefringence wraps around within the linewidth of an individual spectral
component.
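The oscillation of the phasor sum can be illustrated with a few ideal spectral lines. The sketch below assumes five equally spaced lines of zero linewidth with invented amplitudes and an invented 10 GHz spacing; because the lines are ideal, the revival is complete, whereas the text describes a decaying envelope once finite linewidth matters.

```python
import numpy as np


def ig_discrete(weights, omegas, tau):
    """Phasor sum I_G(tau) = sum_n g_n^2 exp(j omega_n tau)."""
    weights = np.asarray(weights, dtype=float)
    omegas = np.asarray(omegas, dtype=float)
    return np.sum(weights ** 2 * np.exp(1j * omegas * tau))


spacing = 2 * np.pi * 10e9                      # assumed line spacing, rad/s
omegas = spacing * np.arange(-2, 3)             # five lines about the carrier
weights = np.array([0.2, 0.6, 1.0, 0.6, 0.2])   # invented amplitudes g_n

taus = np.linspace(0.0, 2 * np.pi / spacing, 201)
envelope = np.array([abs(ig_discrete(weights, omegas, t)) for t in taus])
envelope /= np.sum(weights ** 2)                # normalize to I_G(0)

print(envelope[0], envelope[100], envelope[-1])
```

At half the revival delay the even and odd phasors oppose one another and the normalized sum drops to 0.2 for these amplitudes; at the full revival delay every phasor points along +1 again and the sum returns to one.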
1.5.4 A Heterogeneous Ray Bundle: Coherent and Incoherent Waves
The preceding sections have studied the DOP for coherent and incoherent ray
bundles separately. Signals in a practical system such as a fiber-optic
communication link are generally comprised of both coherent and incoherent terms.
Coherent light comes from the laser source and incoherent light comes from
both the noise of optical amplifiers and depolarization due to
polarization-mode dispersion. The degree of polarization for such a heterogeneous mixture
is
$$
D = \frac{\sqrt{\langle S_{1coh} + S_{1incoh} \rangle^2 + \langle S_{2coh} + S_{2incoh} \rangle^2 + \langle S_{3coh} + S_{3incoh} \rangle^2}}{\langle S_{0coh} + S_{0incoh} \rangle}
$$

Since the incoherent components have vanishing time-averaged Stokes
parameters other than $S_0$, only the coherent terms in the numerator survive. When
there is no pseudo-depolarization in the system, the expression for the DOP
is
$$
D = \frac{I_{coh}}{I_{coh} + I_{incoh}}
\tag{1.5.32}
$$
but when the spectrum is pseudo-depolarized, cf. (1.5.15), the DOP expression
is
$$
D \le \frac{I_{coh}}{I_{coh} + I_{incoh}}
\tag{1.5.33}
$$
For instance, when $I_{incoh} = 0$, pseudo-depolarization can drive the DOP to
$D = 0$. One generally finds expression (1.5.32) in the literature, but the very
real effect of polarization mode dispersion in fiber-optic systems leads to the
more general expression (1.5.33).
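A short numerical illustration of the mixture formula follows. The function and its `coherent_dop` parameter are hypothetical: the parameter stands for the DOP that the coherent part would have on its own (one when there is no pseudo-depolarization), which scales the surviving numerator. The 20 dB power ratio is an invented example value.

```python
def dop_mixture(i_coh, i_incoh, coherent_dop=1.0):
    """DOP of polarized power i_coh mixed with depolarized power i_incoh.

    coherent_dop is a hypothetical knob in [0, 1] for the DOP of the coherent
    part alone; 1.0 reproduces the case without pseudo-depolarization.
    """
    if i_coh < 0 or i_incoh < 0 or i_coh + i_incoh == 0:
        raise ValueError("powers must be non-negative and not both zero")
    if not 0.0 <= coherent_dop <= 1.0:
        raise ValueError("coherent_dop must lie in [0, 1]")
    return coherent_dop * i_coh / (i_coh + i_incoh)


# Invented example: signal 20 dB above the depolarized noise power.
ratio_db = 20.0
d_clean = dop_mixture(10 ** (ratio_db / 10), 1.0)
d_pmd = dop_mixture(10 ** (ratio_db / 10), 1.0, coherent_dop=0.5)
print(d_clean, d_pmd)
```

With any `coherent_dop` below one the result falls under the noise-only bound, which is the inequality form of the expression.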

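Table 1.1. Polarization States in Equivalent Representations

$$
\begin{array}{llll}
\text{Polarization state} & \text{Jones vector} & \text{Stokes vector} & \text{Coherency matrix} \\[4pt]
\text{Linear } x
 & \begin{pmatrix} 1 \\ 0 \end{pmatrix}
 & (1,\ 1,\ 0,\ 0)^T
 & \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \\[8pt]
\text{Linear } y
 & \begin{pmatrix} 0 \\ 1 \end{pmatrix}
 & (1,\ -1,\ 0,\ 0)^T
 & \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \\[8pt]
\text{Linear at } 45^\circ
 & \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
 & (1,\ 0,\ 1,\ 0)^T
 & \dfrac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \\[8pt]
\text{Right-hand circular}
 & \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ j \end{pmatrix}
 & (1,\ 0,\ 0,\ 1)^T
 & \dfrac{1}{2} \begin{pmatrix} 1 & -j \\ j & 1 \end{pmatrix} \\[8pt]
\text{Left-hand circular}
 & \dfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -j \end{pmatrix}
 & (1,\ 0,\ 0,\ -1)^T
 & \dfrac{1}{2} \begin{pmatrix} 1 & j \\ -j & 1 \end{pmatrix} \\[8pt]
\text{Elliptical}
 & \begin{pmatrix} \cos\chi \\ \sin\chi\, e^{j\phi} \end{pmatrix}
 & (1,\ \cos 2\varepsilon \cos 2\psi,\ \cos 2\varepsilon \sin 2\psi,\ \sin 2\varepsilon)^T
 & \begin{pmatrix} c^2 & e^{-j\phi} s c \\ e^{j\phi} s c & s^2 \end{pmatrix},\ c = \cos\chi,\ s = \sin\chi \\[8pt]
\text{Unpolarized}
 & \text{none}
 & (1,\ 0,\ 0,\ 0)^T
 & \dfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{array}
$$

All vectors are normalized to a Jones vector of unit length.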
References
1. H. A. Haus, Waves and Fields in Optoelectronics. Englewood Cliffs, New Jersey:
Prentice-Hall, 1984.
2. H. A. Haus and J. R. Melcher, Electromagnetic Fields and Energy. Englewood
Cliffs, New Jersey: Prentice-Hall, 1989.
3. B. L. Heffner, "Automated measurement of polarization mode dispersion using
Jones matrix eigenanalysis," IEEE Photonics Technology Letters, vol. 4, no. 9,
pp. 1066-1068, 1992.
4. S. Huard, Polarization of Light. New York: John Wiley & Sons, 1997.
5. R. Jones, "A new calculus for the treatment of optical systems, Part I. Description
and discussion of the calculus," Journal of the Optical Society of America,
vol. 31, no. 7, pp. 488-493, July 1941.
6. R. Jones, "A new calculus for the treatment of optical systems, Part VI. Experimental
determination of the matrix," Journal of the Optical Society of America,
vol. 37, pp. 110-112, 1947.
7. J. A. Kong, Electromagnetic Wave Theory. New York: John Wiley & Sons,
1989.
8. P. Mohr and B. Taylor, "CODATA recommended values of the fundamental physical
constants," Reviews of Modern Physics, vol. 72, no. 2, pp. 351-495, 2000.
9. K. B. Rochford, Encyclopedia of Physical Science and Technology, 3rd ed. San
Diego: Academic Press, 2002, ch. Polarization and Polarimetry, pp. 521-538.
10. G. Strang, Linear Algebra and its Applications, 3rd ed. New York: Harcourt
Brace Jovanovich College Publishers, 1988.

2 The Spin-Vector Calculus of Polarization

Spin-vector calculus is a powerful tool for representing linear, unitary
transformations in Stokes space. Spin-vector calculus attains a high degree of
abstraction because rules for vector operations in Stokes space are expressed in
vector form; there is no a priori reference to an underlying coordinate system.
Absence of the underlying coordinate system allows for an elegant, compact
calculus well suited for polarization studies.
Spin-vector calculus is well known in quantum mechanics, especially relating
to quantized angular momentum. Aso, Frigo, Gisin, and Gordon and
Kogelnik have greatly assisted the optical engineering community by adapting
this calculus to telecommunications applications [1, 3-5]. The purpose of this
chapter is to bring together a complete description of the calculus as found in
a variety of disparate sources [2, 3, 5-8], and to tailor the presentation with
a vocabulary familiar to the electrical engineer. Tables 2.2 and 2.3 located at
the end of the chapter offer a summary of the principal relations.

2.1 Motivation
The purpose of this calculus is to build a geometric interpretation of
polarization transformations. The geometric interpretation of polarization states
was already developed in §1.4. The Jones matrix, while a direct consequence
of Maxwell's equations when light travels through a medium, is a
complex-valued 2 × 2 matrix. This is hard to visualize. The Mueller matrix, however,
can be visualized as rotations and length-changes in Stokes space. The
spin-vector formalism makes a bilateral connection between the Jones and Mueller
matrices.
Of all the possible Jones matrices, two classes predominate in polarization
optics: the unitary matrix and the Hermitian matrix. The unitary matrix
preserves lengths and imparts a rotation in Stokes space. A retardation plate is
described as a unitary matrix. The Hermitian matrix comes from a measurement,
such as that of a polarization state. Since all measured values must be
real quantities, the eigenvalues of a Hermitian matrix are real. The projection
induced by a polarizer is described as a Hermitian matrix.
Based on the characteristic form (1.4.27) of the Mueller matrix
for a unitary matrix, defined by $U^\dagger U = I$, one can write
$$
J_U \rightarrow M_U =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & & & \\
0 & & R & \\
0 & & &
\end{pmatrix}
\tag{2.1.1}
$$
where R is a 3 × 3 rotation matrix having real-valued entries. Since the
polarization transformation through multiple media is described as the product
of Jones matrices, one would expect a one-to-one correspondence between
multiple unitary matrices and multiple rotation matrices. This would lead to
$$
J_{U_2 U_1} \rightarrow M_{U_2 U_1} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & & & \\
0 & & R_2 R_1 & \\
0 & & &
\end{pmatrix}
\tag{2.1.2}
$$
This is indeed the case. Moreover, the Mueller matrix representing passage
of light through any number of retardation plates always keeps the form
of (2.1.1). The rotation matrices R therefore form a group that is closed under rotation.
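The correspondence between products of unitary Jones matrices and products of rotations can be checked numerically. The sketch below is an illustration under assumptions: it uses the mapping M[i, k] = Tr(s_i J s_k J†)/2 with basis matrices ordered for S1 = |ex|² − |ey|², S2 = 2 Re(ex* ey), S3 = 2 Im(ex* ey), and it models each retardation plate as a linear retarder with an invented retardance and axis angle. The names `jones_to_mueller` and `waveplate` are hypothetical.

```python
import numpy as np

# Identity plus three Pauli-type matrices; the ordering is an assumption.
SIGMA = np.array([
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)


def jones_to_mueller(J):
    """Mueller matrix with entries M[i, k] = Tr(s_i J s_k J^dagger) / 2."""
    J = np.asarray(J, dtype=complex)
    M = np.empty((4, 4))
    for i in range(4):
        for k in range(4):
            M[i, k] = 0.5 * np.trace(SIGMA[i] @ J @ SIGMA[k] @ J.conj().T).real
    return M


def waveplate(retardance, axis_angle):
    """Linear retarder: rotate into the plate axes, apply the slip, rotate back."""
    c, s = np.cos(axis_angle), np.sin(axis_angle)
    rot = np.array([[c, -s], [s, c]])
    plate = np.diag([np.exp(-0.5j * retardance), np.exp(0.5j * retardance)])
    return rot @ plate @ rot.T


U1 = waveplate(np.pi / 2, np.deg2rad(20.0))    # invented quarter-wave plate
U2 = waveplate(np.pi / 3, np.deg2rad(-35.0))   # invented second plate

M1 = jones_to_mueller(U1)
M2 = jones_to_mueller(U2)
M21 = jones_to_mueller(U2 @ U1)

R1, R2, R21 = M1[1:, 1:], M2[1:, 1:], M21[1:, 1:]
print(np.allclose(M21, M2 @ M1), np.allclose(R21, R2 @ R1))
```

In this sketch the lower-right block of the cascade equals the product of the individual blocks and remains orthogonal with unit determinant, so the cascade stays in the same block form.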
Taking the abstraction one step further, any rotation has an axis of
rotation and an angle through which the system rotates. Instead of describing
the rotation R as a 3 × 3 matrix, it is more general to describe the rotation
as a vector quantity: $R = f(\hat{r}, \varphi)$, where $\hat{r}$ is the rotation axis in Stokes space
and $\varphi$ is the angle of rotation. The vector $\hat{r}$ need not be resolved onto an
orthonormal basis to give $\hat{r} = \hat{x} r_x + \hat{y} r_y + \hat{z} r_z$; this operation may be
postponed indefinitely. This is in contrast to writing R as a 3 × 3 matrix where
the underlying orthonormal basis is explicit. Accordingly, $\hat{r}$ exists as a vector
in vector space and can undergo operations such as rotation, inner product,
and cross product with respect to other vectors.
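In parallel to the unitary-matrix case, for the Mueller matrix that corresponds
to a Hermitian matrix, defined by $H = H^\dagger$, one can write
$$
J_H \rightarrow M_H =
\begin{pmatrix}
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet
\end{pmatrix}
\tag{2.1.3}
$$
This indeed is a tautology. As with the unitary matrices, products of
Hermitian matrices in Jones space result in products of Mueller matrices in Stokes
space. That is,
$$
J_{H_2 H_1} \rightarrow M_{H_2 H_1} = M_{H_2} M_{H_1}
\tag{2.1.4}
$$
All Hermitian operations are closed within the 4 × 4 Mueller matrix.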

As it appears, products of unitary-corresponding Mueller matrices change
only entries in the lower-right-hand 3 × 3 sub-matrix. Inclusion of even a single
Hermitian-corresponding Mueller matrix scatters those nine elements into all
sixteen matrix positions. This is a non-reversible process. There is, however, a
remarkable exception. A traceless Hermitian matrix H, defined by $\mathrm{Tr}\,H = 0$,
has a corresponding Mueller matrix of the form
$$
J_H \rightarrow M_H =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & & & \\
0 & & V & \\
0 & & &
\end{pmatrix}
\tag{2.1.5}
$$
where V is a Stokes-space vector having a length and pointing direction. (Note
that a rotation operator has unit length, two angles that determine the vector
direction, and one angle of rotation. A Stokes vector has a length and two
angles that determine the vector direction. Both rotation operator and Stokes
vector have three parameters). Arbitrary products of unitary matrix U and
traceless Hermitian matrix H form an extended closed group in which entries
change only in the lower right-hand 3 × 3 sub-matrix of M.
Throughout this chapter and the chapters on polarization-mode dispersion,
one looks for zero trace of Hermitian matrices. If this property is
established, then a calculus that includes lossless rotations of vectors can be
applied to the system. This calculus is called spin-vector calculus, and is the
topic of the present chapter.

2.2 Vectors, Length, and Direction


Physical systems can often be described by the state the system is in at a
particular time and position. The span of all possible states for a given system
is called a state space. Any particular state represents all the information that
one can know about the system at that time and position. Interaction between
a physical system and external influences, such as transmission through media
or applied force, can change the state. So, there are two categories of study:
the description of state, and the transformation of state.
A state that describes wave motion can be represented by a vector with
complex scalar entries. The dimensionality of the state vector is determined
by the number of states that are invariant to an external influence. That
is, the dimension of a state vector equals the number of eigenstates of the
system. For polarization, the dimensionality is two. The important properties
of a vector space are direction, length, and relative angles. These metrics will
form a common theme throughout the following development.
2.2.1 Bra and Ket Vectors
Bra and ket spaces are two equivalent vector spaces that describe the same
state space. Bra and ket spaces, or bracket space, is a formulation developed
by P. A. M. Dirac and used extensively in quantum mechanics. Bras and kets
are vectors with dimension equal to the state dimension. When a bra space
and ket space describe the same state vector, the bra and ket are duals of one
another. For a state vector a, the ket is written $|a\rangle$ and the bra is written $\langle a|$.
The entries in bra and ket vectors are complex scalar numbers. A ket vector
suitable for polarization studies is
$$
|a\rangle = \begin{pmatrix} a_x \\ a_y \end{pmatrix}
\tag{2.2.1}
$$
where $a_x$ and $a_y$ are the components along an orthogonal basis. The entries
are complex and accordingly there are four independent parameters contained
in (2.2.1). Since the entries are complex, they have magnitude and phase:
$$
|a\rangle = \begin{pmatrix} |a_x|\, e^{j\phi_x} \\ |a_y|\, e^{j\phi_y} \end{pmatrix}
= e^{j\phi_x} \begin{pmatrix} |a_x| \\ |a_y|\, e^{j\phi} \end{pmatrix}
\tag{2.2.2}
$$
where $\phi_x$ is a common phase and $\phi = \phi_y - \phi_x$ is the phase difference of the second row.
In the following the explicit magnitude symbols | | will be dropped and the
intent of magnitude or complex number should be clear from the context. Bra
vector $\langle a|$ is said to be the dual of $|a\rangle$ because they are not equal but they
describe the same state:
$$
|a\rangle \overset{\text{dual}}{\longleftrightarrow} \langle a|
$$
The bra vector $\langle a|$ corresponding to $|a\rangle$ is
$$
\langle a| = \begin{pmatrix} a_x^* & a_y^* \end{pmatrix}
\tag{2.2.3}
$$
for every $|a\rangle$. The bra vector is the adjoint (†), or complex-conjugate transpose,
of the corresponding ket vector:
$$
\langle a| = \left( |a\rangle \right)^\dagger
\tag{2.2.4}
$$

Bra and ket vectors obey algebraic additive properties of identity, addition,
commutation, and associativity. Identity and addition rules for kets are

identity    $|a\rangle + |0\rangle = |a\rangle$
addition    $|a\rangle + |b\rangle = |\gamma\rangle$

where $|0\rangle$ is the null ket and $|\gamma\rangle$ is another ket in the same space. Commutation and associativity are straightforward
to prove using the matrix representation. A bra or ket vector can also be
multiplied by a scalar quantity c:

$$c\,|a\rangle = |a\rangle\, c \tag{2.2.5}$$
Physically, the multiplication of a state vector by a scalar does not change the
state and therefore the two commute. Operations that have no meaning are

the multiplication of multiple ket vectors or bra vectors. For example, $|b\rangle |a\rangle$
is meaningless.
Finally, it should be understood that state vectors $\langle a|$ and $|a\rangle$ are a more
general representation than column and row vectors (2.2.1) and (2.2.3). A
state vector is a coordinate-free abstraction that has the properties of length
and direction; a row or column vector is a representation of a state vector
given a choice of an underlying coordinate system.
2.2.2 Length and Inner Products
Bra and ket vectors have properties of length, phase, and pointing direction.
The length of a real-valued vector is a scalar quantity and is determined by
the dot product: $|a|^2 = a \cdot a$. For complex-valued bra-ket vectors, the inner
product is used to find the length of a vector and is determined by multiplying its
bra representation $\langle a|$ with its ket representation $|a\rangle$: $\|a\|^2 = \langle a|a\rangle$, where $\|\cdot\|$
is the norm of the vector.
More generally, one wants to measure the length of one vector as projected
onto another. The inner product of two different vectors is the product of the
bra form of one vector and the ket form of the other: $\langle b|a\rangle$. For real-valued vectors it is clear that $b \cdot a = a \cdot b$. However, for bra-ket vectors, having complex
entries, the order of multiplication dictates the sign of the resulting phase.
That is,

$$\langle b|a\rangle = |\langle b|a\rangle|\, e^{j\phi} \tag{2.2.6a}$$
$$\langle a|b\rangle = |\langle b|a\rangle|\, e^{-j\phi} \tag{2.2.6b}$$

The two inner products are related by the complex conjugate:

$$\langle b|a\rangle = (\langle a|b\rangle)^* \tag{2.2.7}$$

The inner product of a bra and ket is a complex-valued scalar. Based on
(2.2.7) it is clear that the inner product of a vector onto itself yields a real
number, and since the inner product is a measure of length, the real number is
positive definite: $\langle a|a\rangle$ = real number $\geq 0$. Only the null ket has length zero.
Any finite ket has a length greater than zero. Throughout the body of the
text, polarization vectors are taken to be unit vectors unless otherwise stated.
A unit vector has a direction, phase, and unity length. Any vector can be
converted to a unit vector by division by its norm:

$$|\tilde{a}\rangle = \frac{1}{\sqrt{\langle a|a\rangle}}\, |a\rangle \tag{2.2.8}$$

so that

$$\langle \tilde{a}|\tilde{a}\rangle = 1 \tag{2.2.9}$$

In the following the tilde over the vectors will be dropped.


Two vectors are defined as orthogonal to one another when the inner
product vanishes:

$$\langle b|a\rangle = 0 \tag{2.2.10}$$

This is an essential inner product used regularly.
When two polarization vectors are resolved onto a common coordinate
system,

$$\langle b|a\rangle = b_x^* a_x + b_y^* a_y \tag{2.2.11}$$

Finally, the inner product in matrix representation of a normalized vector is
the sum of the component magnitudes squared:

$$\langle a|a\rangle = |a_x|^2 + |a_y|^2 = 1 \tag{2.2.12}$$
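As a numerical illustration of the inner-product rules in this subsection, the following is a minimal sketch in plain Python. The helper names `inner` and `normalize` and the example kets are assumptions chosen for illustration; they are not part of the text.

```python
import math

def inner(b, a):
    """Inner product <b|a>: conjugate the entries of b, multiply, and sum."""
    return sum(complex(bi).conjugate() * ai for bi, ai in zip(b, a))

def normalize(a):
    """Return a / sqrt(<a|a>), the unit vector of (2.2.8)."""
    length = math.sqrt(inner(a, a).real)
    return [ai / length for ai in a]

# Arbitrary example kets (assumed values)
a = [1 + 1j, 2 - 1j]
b = [0.5j, 1.0]

ab = inner(a, b)          # <a|b>
ba = inner(b, a)          # <b|a>, expected to be the conjugate of <a|b>
a_unit = normalize(a)     # unit-length version of |a>

# Two orthogonal states on a common coordinate system
h = [1.0, 0.0]
v = [0.0, 1.0]

print(ab, ba)
print(inner(a_unit, a_unit))
print(inner(v, h))
```

Swapping the bra and ket conjugates the result, as in (2.2.7), and the self inner product is real, as required for a length.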

2.2.3 Projectors and Outer Products


The inner product measures the length of a vector or the projection of one
vector onto another. The result is a complex scalar quantity. In contrast,
the outer product retains a vector nature while also producing length by
projection. There are two outer product types to study: the projector, having
the form $|p\rangle\langle p|$; and the outer product $|p\rangle\langle q|$. The form $|p\rangle\langle q|$ is called a dyadic
pair because the vector pair has neither a dot nor cross product between them.
In quantum mechanics the projector $|p\rangle\langle p|$ is called the density operator for
the state.
Consider a projector that operates on ket $|a\rangle$:

$$|p\rangle\langle p|a\rangle = |p\rangle\,(\langle p|a\rangle) = c\,|p\rangle \tag{2.2.13}$$

The quantity $c = \langle p|a\rangle$ is just a complex scalar and commutes with the ket.
Operating on $|a\rangle$ the projector measures the length of $|a\rangle$ on $|p\rangle$ and produces
a new vector along $|p\rangle$.
The effect of the projector is to point along the $|p\rangle$ direction where the
length of $|p\rangle$ is scaled by $\langle p|a\rangle$. Projectors work equally well on bras, e.g.

$$\langle a|p\rangle\langle p| = c^*\,\langle p| \tag{2.2.14}$$

so in fact it should be clear that

$$\langle a|p\rangle\langle p| = (|p\rangle\langle p|a\rangle)^\dagger \tag{2.2.15}$$

The adjoint operator connects the bra and ket forms.


The behavior of the outer product $|p\rangle\langle q|$ is similar to the projector but
for the fact that the projection vector and resultant pointing direction differ.
The resultant pointing direction depends on whether the outer product operates
on a ket or a bra. Acting on a ket, the outer product yields

$$|p\rangle\langle q|a\rangle = (\langle q|a\rangle)\,|p\rangle \tag{2.2.16}$$

whereas acting on a bra of the same vector, the outer product yields

$$\langle a|p\rangle\langle q| = (\langle a|p\rangle)\,\langle q| \tag{2.2.17}$$

The resultant pointing direction and projected length depend on whether the
outer product operates on a bra or ket vector.
In the study of polarization, the outer product is a 2 × 2 matrix with
complex entries:

$$|b\rangle\langle a| = \begin{pmatrix} b_x a_x^* & b_x a_y^* \\ b_y a_x^* & b_y a_y^* \end{pmatrix} \tag{2.2.18}$$

The determinant is

$$\det(|b\rangle\langle a|) = 0 \tag{2.2.19}$$

and therefore the outer product is non-invertible. The determinant of an outer
product of any dimension is likewise zero. That means the action of $|b\rangle\langle a|$
on a ket is irreversible, which is reasonable because the original direction of
the ket is lost. So, while all outer products are operators, not all operators
are outer products. Operators that are linear combinations of projectors are
reversible under the right construction.
In summary, the outer product follows these rules:

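equivalence     $(|b\rangle\langle a|)^\dagger = |a\rangle\langle b|$
associative     $(|b\rangle\langle a|)\,|\gamma\rangle = |b\rangle\,(\langle a|\gamma\rangle)$
trace           $\mathrm{Tr}(|b\rangle\langle a|) = \langle a|b\rangle$
irreversible    $\det(|b\rangle\langle a|) = 0$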

where Tr stands for the trace operation. The trace connects the outer product
to the inner product.
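The outer-product rules summarized above can be checked numerically. The sketch below uses plain Python lists as 2 × 2 matrices; the helper names (`outer`, `matvec`, `det2`, `trace`, `adjoint`) and the example kets are assumptions for illustration only.

```python
def inner(b, a):
    """Inner product <b|a>."""
    return sum(complex(bi).conjugate() * ai for bi, ai in zip(b, a))

def outer(b, a):
    """Outer product |b><a|: entry (m, n) is b_m * conj(a_n)."""
    return [[bm * complex(an).conjugate() for an in a] for bm in b]

def matvec(M, v):
    """Apply a matrix to a ket."""
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    return sum(M[k][k] for k in range(len(M)))

def adjoint(M):
    """Complex-conjugate transpose of a square matrix."""
    size = len(M)
    return [[M[n][m].conjugate() for n in range(size)] for m in range(size)]

# Arbitrary example kets (assumed values)
a = [1 + 2j, 0.5 - 1j]
b = [2j, 1 + 1j]
g = [0.3 + 0j, -1j]

ba = outer(b, a)                    # |b><a|
acted = matvec(ba, g)               # |b><a|g>
expected = [inner(a, g) * bm for bm in b]   # (<a|g>) |b>

print(det2(ba))                     # numerically zero: the action is irreversible
print(trace(ba), inner(a, b))       # the trace equals <a|b>
```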
2.2.4 Orthonormal Basis
An orthonormal basis is a complete set of orthogonal unit-length axes on which
any vector in the space can be resolved. Consider a vector space with N dimensions and orthogonal unit vectors $(|a_1\rangle, |a_2\rangle, \ldots, |a_N\rangle)$. The orthogonality
requires

$$\langle a_m|a_n\rangle = \delta_{m,n} \tag{2.2.20}$$

where $\delta_{m,n}$ is the Kronecker delta function. Only a vector projected onto
itself yields a non-vanishing inner product. When the set is complete, the
outer products are closed, where closure is defined as

$$\sum_n |a_n\rangle\langle a_n| = I \tag{2.2.21}$$


When a basis set, or group, is closed, any operation to a member of the group
results in another member within the group. Together, (2.2.20) and (2.2.21) are the
two conditions that define an orthonormal basis.
Given an orthonormal basis, any arbitrary vector can be resolved onto the
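basis using (2.2.21). An arbitrary ket $|s\rangle$ is resolved as

$$|s\rangle = \sum_n |a_n\rangle\langle a_n|s\rangle = \sum_n c_n\,|a_n\rangle \tag{2.2.22}$$

where the complex coefficients are given by $c_n = \langle a_n|s\rangle$. The inner product $\langle s|s\rangle$ is the sum of the absolute-value squares of the coefficients $c_n$:

$$\langle s|s\rangle = \sum_n |c_n|^2 \tag{2.2.23}$$

When $|s\rangle$ is normalized, $\sum_n |c_n|^2 = 1$.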
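The resolution of a ket onto an orthonormal basis can be illustrated with a short sketch. The basis chosen here (linear states at plus and minus 45 degrees) and the example ket are assumptions for illustration.

```python
import math

def inner(b, a):
    """Inner product <b|a>."""
    return sum(complex(bi).conjugate() * ai for bi, ai in zip(b, a))

# Assumed example basis: linear states at +45 and -45 degrees
r = 1 / math.sqrt(2)
basis = [[r, r], [r, -r]]

# Assumed example ket, already normalized (0.36 + 0.64 = 1)
s = [0.6 + 0j, 0.8j]

# Coefficients c_n = <a_n|s>
c = [inner(an, s) for an in basis]

# Rebuild |s> as the sum of c_n |a_n>
rebuilt = [sum(c[n] * basis[n][m] for n in range(2)) for m in range(2)]

# Closure: the sum over n of |a_n><a_n| should equal the identity
closure = [[sum(basis[n][m] * complex(basis[n][k]).conjugate() for n in range(2))
            for k in range(2)] for m in range(2)]

print(c)
print(rebuilt)
print(sum(abs(cn) ** 2 for cn in c))
```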

2.3 General Vector Transformations


Interaction between a physical system and external influences can change the
state of a system. Left unperturbed, a state persists indefinitely. Operators
embody the action of external influences and are distinct from the state of
the system itself. The bra and ket vectors of the preceding section are two
equivalent spaces that describe the same state space. Operators also have two
distinct and equivalent spaces that describe the same state transformation.
While there is no special notation to represent a ket operator or a bra
operator, equivalence between operator spaces is maintained under

$$X|a\rangle \overset{\text{dual}}{\longleftrightarrow} \langle a|X^\dagger \tag{2.3.1}$$

$X^\dagger$ is said to be the adjoint operator of X. Care should be taken because the
action of $X|a\rangle$ is not the same as $\langle a|X$; these two results are different.
2.3.1 Operator Relations
Operators always act on kets from the left and bras from the right, e.g. $X|a\rangle$
or $\langle a|X$. The expressions $|a\rangle X$ and $X\langle a|$ are undefined. An operator multiplying a ket produces a new ket, and an operator multiplying a bra produces
a new bra. In general, an operator changes the state of the system,

$$X|a\rangle = c\,|b\rangle \tag{2.3.2}$$

where c is a scaling factor induced solely by X. Operators are said to be equal
if

$$X|a\rangle = Y|a\rangle \;\Rightarrow\; X = Y \tag{2.3.3}$$
Operators obey the following arithmetic properties of addition:

commutative     $X + Y = Y + X$
associative     $X + (Y + Z) = (X + Y) + Z$
distributive    $X\,(|a\rangle + |b\rangle) = X|a\rangle + X|b\rangle$

Operators in general do not commute under multiplication. That is,

$$XY \neq YX \tag{2.3.4}$$

In matrix form, $XY = YX$ holds when X and Y are diagonal in the same basis, that is, when they share a complete set of eigenvectors.
Other multiplicative properties are

identity        $I|a\rangle = |a\rangle$
associative     $X(YZ) = (XY)Z$
distributive    $X\,(Y|a\rangle) = XY|a\rangle$

All of the above arithmetic properties apply equally well to bra vectors.
The effect X has on state a is measured by

$$\text{expectation value of } X \text{ on } a = \frac{\langle a|X|a\rangle}{\langle a|a\rangle} \tag{2.3.5}$$

In general, an inner product that encloses an operator gives a complex number:

$$\langle b|\,(X|a\rangle) = \langle b|X|a\rangle = \text{complex number} \tag{2.3.6}$$

Consider dual constructions, first where $X|a\rangle$ is left-multiplied by $\langle b|$, and
second where the dual $\langle a|X^\dagger$ is right-multiplied by $|b\rangle$:

$$\langle b|X|a\rangle = \langle a|X^\dagger|b\rangle^* \tag{2.3.7}$$

These two cases are duals of one another and are therefore complex conjugates.
In the study of polarization, operators are represented as 2 × 2 complex-valued Jones matrices:

$$X = \begin{pmatrix} a\,e^{j\alpha} & b\,e^{j\beta} \\ c\,e^{j\gamma} & d\,e^{j\eta} \end{pmatrix} \tag{2.3.8}$$

There are eight independent variables contained in the operator. If $\det X \neq 0$,
then X is invertible and the action of X can be undone.
The properties of operators are summarized as follows:
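operator duality               $X|a\rangle \overset{\text{dual}}{\longleftrightarrow} \langle a|X^\dagger$
change of state                $X|a\rangle = c\,|b\rangle$,  $\langle a|X^\dagger = c^*\,\langle b|$
inner product with operator    $\langle b|X|a\rangle$ = complex number
conjugate relation             $\langle b|X|a\rangle = \langle a|X^\dagger|b\rangle^*$
conjugate transpose            $(XY)^\dagger = Y^\dagger X^\dagger$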


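Just as an arbitrary ket can be resolved onto an orthonormal basis, an
arbitrary operator X can be resolved onto a set of projection operators formed
on the orthonormal basis. Applying the closure relation (2.2.21) yields

$$X = \Big(\sum_m |a_m\rangle\langle a_m|\Big)\, X\, \Big(\sum_n |a_n\rangle\langle a_n|\Big) = \sum_n \sum_m |a_m\rangle\,\langle a_m|X|a_n\rangle\,\langle a_n| \tag{2.3.9}$$

The indexing symmetry of (2.3.9) looks like a matrix with $\langle a_m|X|a_n\rangle$ as
the (m, n) entry. For polarization, the matrix is 2 × 2 and looks like

$$\sum_n \sum_m |a_m\rangle\,\langle a_m|X|a_n\rangle\,\langle a_n| \;\rightarrow\; \begin{pmatrix} \langle a_1|X|a_1\rangle & \langle a_1|X|a_2\rangle \\ \langle a_2|X|a_1\rangle & \langle a_2|X|a_2\rangle \end{pmatrix} \tag{2.3.10}$$

The resolved form of X in (2.3.10) will become particularly simple in discussion of Hermitian and unitary matrices.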

2.4 Eigenstates, Hermitian and Unitary Operators


Many physical systems exhibit particular states that are not transformed by
interaction with the system. These invariant states are called eigenstates of
the system. In spin-vector calculus, operators embody the influence of a phenomenon. The eigenvectors of an operator are the eigenstates of the system.
When an operator X acts on its own eigenstate $a_1$,

$$X|a_1\rangle = a_1\,|a_1\rangle \tag{2.4.1a}$$
$$\langle a_1|X^\dagger = a_1^*\,\langle a_1| \tag{2.4.1b}$$

the state of the system is unaltered but for a scaling factor $a_1$. The scale factor
is the eigenvalue of X associated with eigenstate $a_1$. Each eigenvector has an
associated eigenvalue, and a well-conditioned matrix has as many eigenvectors
as rows in the matrix or, equivalently, dimensions in the state space.
The eigenvectors of Hermitian and unitary operators are orthogonal when
the associated eigenvalues are distinct. The eigenvalues of a Hermitian operator are real-valued scalars, and the eigenvalues of a unitary operator are
complex exponential scalars. A Hermitian or unitary operator X having N
eigenkets $(|a_1\rangle, |a_2\rangle, \ldots, |a_N\rangle)$ and associated eigenvalues $(a_1, a_2, \ldots, a_N)$ produces the series of inner products

$$\langle a_m|X^\dagger X|a_n\rangle = |a_m|^2\,\delta_{m,n} \tag{2.4.2}$$

The operator $X^\dagger X$ scales each axis by a different amount, but does not rotate
nor create reflection of the original basis. The eigenvalues of operator X are
related to the determinant and trace by

$$\det(X) = a_1 a_2 \cdots a_N \tag{2.4.3a}$$
$$\mathrm{Tr}(X) = a_1 + a_2 + \cdots + a_N \tag{2.4.3b}$$

Since the eigenvalues of a Hermitian matrix are real, its determinant and trace
are real.
2.4.1 Hermitian Operators
The defining property of a Hermitian operator is

$$H^\dagger = H \tag{2.4.4}$$

The associated Hermitian matrix in polarization studies has only four independent variables: three amplitudes and one phase. This contrasts with the
general Jones matrix (2.3.8) which has eight.
The eigenvectors of H form a complete orthonormal basis and the eigenvalues are real. That the eigenvalues are real is proved from the following
difference:

$$\langle a_n|H^\dagger - H|a_m\rangle = (a_n^* - a_m)\,\langle a_n|a_m\rangle = 0 \tag{2.4.5}$$

Non-trivial solutions are found when neither vector is null. The eigenvectors
may be the same or different. Consider first when the eigenvectors are the
same. Since $\langle a_n|a_n\rangle \neq 0$, $(a_n^* - a_n) = 0$ and the eigenvalue is real. Consider
when the eigenvectors are different. Unless $a_m = a_n$ (the degenerate case), it must be the case that $\langle a_n|a_m\rangle = 0$.
All eigenvalues are therefore real, and eigenvectors with distinct eigenvalues are orthogonal. A Hermitian operator H scales its own basis
set:

$$\langle a_m|H^\dagger H|a_n\rangle = a_m^2\,\delta_{m,n} \tag{2.4.6}$$

When $\det(H) \neq 0$, H is invertible and the action of H on the state of a system
is reversible.
The expansion of H onto its own basis generates a diagonal eigenvalue
matrix. Under construction (2.3.9) the expansion yields

$$H = \sum_n \sum_m |a_m\rangle\,\langle a_m|H|a_n\rangle\,\langle a_n| = \sum_m a_m\,|a_m\rangle\langle a_m| \tag{2.4.7}$$

where $\langle a_m|H|a_n\rangle = a_n\,\delta_{m,n}$. The orthonormal expansion is written in matrix
form as $H = S \Lambda S^{-1}$, where S is a square matrix whose columns are the
eigenvectors of H and $\Lambda$ is a diagonal matrix whose entries are the associated
eigenvalues. Schematically,

$$S = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_N \\ | & | & & | \end{pmatrix}, \quad \text{and} \quad \Lambda = \begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_N \end{pmatrix} \tag{2.4.8}$$

where |an  = vnT .


2.4.2 Unitary Operators
The defining property of a unitary operator is

$$T^\dagger T = I \tag{2.4.9}$$

Acting on its orthogonal eigenvectors $|a_n\rangle$, the unitary operator preserves the
unity basis length:

$$\langle a_m|T^\dagger T|a_n\rangle = \delta_{m,n} \tag{2.4.10}$$

Taking the determinant of both sides of (2.4.9) gives $\det(T^\dagger T) = 1$. Since the
determinant of a product is the product of the determinants and the adjoint
operator preserves the norm, the determinant of T must be

$$\det(T) = e^{j\theta} \tag{2.4.11}$$

Since the determinant is the product of eigenvalues, the eigenvalues of T
must themselves be complex exponentials and, accounting for (2.4.10), they
must have unity magnitude. Therefore T acting on an eigenvector yields

$$T|a_n\rangle = e^{j\theta_n}\,|a_n\rangle \tag{2.4.12}$$

The eigenvalues of T lie on the unit circle in the complex plane.
A special form of T exists where the determinant is unity. This special
form is denoted U and is characterized by det(U) = +1. To transform from T
to U, the common phase factor $e^{j\beta} = \exp(j\theta/N)$ must be extracted from each
eigenvalue of T, where N is the dimensionality of the operator. The T and U
forms are thereby related:

$$T = e^{j\beta}\,U \tag{2.4.13}$$

It should be noted that when det(U) = −1, a reflection is present along an
odd number of axes in the basis set of U.
The eigenvalue equation for U is

$$U|a_n\rangle = e^{j\lambda_n}\,|a_n\rangle \tag{2.4.14}$$

U expands on its own basis set in the same way H expands (2.4.7):

$$U = \sum_m e^{j\lambda_m}\,|a_m\rangle\langle a_m| \tag{2.4.15}$$

Fig. 2.1. Eigenvalue loci of H and U. Left: eigenvalues of H lie on the real number
line. Right: eigenvalues of U lie on the unit circle in the complex plane.

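This orthonormal expansion has the matrix analogue of

$$U = S \exp(j\Lambda)\, S^{-1} \tag{2.4.16}$$

where the diagonal matrix is

$$\exp(j\Lambda) = \begin{pmatrix} e^{j\lambda_1} & & & \\ & e^{j\lambda_2} & & \\ & & \ddots & \\ & & & e^{j\lambda_N} \end{pmatrix}$$

and S is the same form as (2.4.8).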
2.4.3 Connection between Hermitian and Unitary Matrices
The connection between Hermitian and unitary operators is quite intimate.
Figure 2.1 illustrates the eigenvalue domains for H and U . The eigenvalues
of H lie on the real number line, while those of U lie on the unit circle in
the complex plane. Multiplying the eigenvalues of H by j and taking the
exponential, one can construct the eigenvalues of U. Note that the eigenvalues
of U are cyclic, so only the real number line modulo $2\pi$ is significant.
Based on the operator expansions of the preceding sections, one has

$$H = S \Lambda_H S^{-1} \tag{2.4.17a}$$
$$U = S\, e^{j\Lambda_U} S^{-1} \tag{2.4.17b}$$

Since in general $\exp(S \Lambda S^{-1}) = S \exp(\Lambda) S^{-1}$, the H and U operators may be
connected as

$$U = e^{jH} \;\Rightarrow\; S\, e^{j\Lambda_U} S^{-1} = S\, e^{j\Lambda_H} S^{-1} \tag{2.4.18}$$

For every Hermitian operator H there is an associated unitary operator U that


shares the same basis set and has eigenvalues related through the complex
exponential.
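The shared-basis relation between H and U can be sketched numerically by building both operators from the same projectors, following the expansions (2.4.7) and (2.4.15). The eigenvectors (the two circular states) and the eigenvalues ±0.7 are assumed example values; the eigenvalues are chosen to sum to zero so that det(U) = +1.

```python
import cmath
import math

def outer(b, a):
    """Outer product |b><a|."""
    return [[bm * complex(an).conjugate() for an in a] for bm in b]

def weighted_sum(weights, kets):
    """Sum over m of w_m |a_m><a_m| for 2-component kets."""
    total = [[0j, 0j], [0j, 0j]]
    for w, ket in zip(weights, kets):
        proj = outer(ket, ket)
        for m in range(2):
            for n in range(2):
                total[m][n] += w * proj[m][n]
    return total

def adjoint(M):
    return [[M[n][m].conjugate() for n in range(2)] for m in range(2)]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(2)) for n in range(2)]
            for m in range(2)]

# Assumed orthonormal eigenvectors and real eigenvalues
r = 1 / math.sqrt(2)
kets = [[r, 1j * r], [r, -1j * r]]
eigenvalues = [0.7, -0.7]

H = weighted_sum(eigenvalues, kets)                             # Hermitian
U = weighted_sum([cmath.exp(1j * a) for a in eigenvalues], kets)  # unitary
UdU = matmul(adjoint(U), U)                                     # should be I
detU = U[0][0] * U[1][1] - U[0][1] * U[1][0]

print(H)
print(UdU)
print(detU)
```

Because both operators are built from the same projectors, they share eigenvectors by construction, and each eigenvalue of U is the complex exponential of the corresponding eigenvalue of H.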
2.4.4 Similarity Transforms
Frequently one has Hermitian operator H and orthonormal basis $|p_n\rangle$ that
are not aligned. That is, the eigenvectors $|a_n\rangle$ of H are not parallel to vectors $|p_n\rangle$. Expansion of H into $|p_n\rangle$ using the expansion expression (2.3.10)
generates a matrix that is not diagonal. However, the expansion matrix can
be diagonalized by rotating basis $|p_n\rangle$ into $|a_n\rangle$. The unitary matrix does this
operation. Taking advantage of $U^\dagger U = 1$, one can write

$$\langle p|H|p\rangle = \langle p|U^\dagger U H U^\dagger U|p\rangle = \langle a|U H U^\dagger|a\rangle = \langle a|H_T|a\rangle \tag{2.4.19}$$

Since (2.4.19) holds for any choice of initial basis $|p_n\rangle$, the operators

$$H_T = U H U^\dagger \tag{2.4.20}$$

must be equal. Equation (2.4.20) is known as a similarity transform. Both the
determinant and trace of H are independent of basis; that is

$$\det(H_T) = \det(U)\,\det(H)\,\det(U^\dagger) = \det(H) \tag{2.4.21}$$

and

$$\mathrm{Tr}(H_T) = \mathrm{Tr}(U H U^\dagger) = \mathrm{Tr}(H) \tag{2.4.22}$$

The trace is always preserved under a similarity transform.


2.4.5 Construction of General Unitary Matrix
The characteristic property $T^\dagger T = 1$ of unitary matrices restricts the eight
independent variables generally available for a polarization operator (2.3.8).
Derivation of the restrictions generates a general form of the unitary matrix.
First consider U, where det(U) = +1. Substitution of X (2.3.8) for U
in $U U^\dagger = 1$ generates the following requirements:

$$|a|^2 + |b|^2 = 1, \quad |a|^2 = |d|^2, \quad |b|^2 = |c|^2, \quad ac\,e^{j(\alpha-\gamma)} + bd\,e^{j(\beta-\eta)} = 0 \tag{2.4.23}$$

The determinant requirement generates

$$ad\,e^{j(\alpha+\eta)} - bc\,e^{j(\beta+\gamma)} = 1 \tag{2.4.24}$$

The amplitude restrictions in (2.4.23) are satisfied by $a = \cos\kappa$ and $b = \sin\kappa$.
There remains, however, a sign degeneracy in that $c = \pm b$ and $d = \pm a$. This
degeneracy is insignificant in that any choice flows through the restriction
criteria and produces the same matrix form.
Combination of the last equation in (2.4.23) and (2.4.24) generates two
restrictions on the phase:

$$e^{j(\alpha+\eta)} = e^{j(\beta+\gamma)}, \qquad e^{j\eta} = e^{-j\alpha}, \qquad e^{j\gamma} = e^{-j\beta} \tag{2.4.25}$$

There are only two independent phases. Combining all of the above restrictions, the general matrix form of U is written

$$U = \begin{pmatrix} e^{j\alpha}\cos\kappa & e^{j\beta}\sin\kappa \\ -e^{-j\beta}\sin\kappa & e^{-j\alpha}\cos\kappa \end{pmatrix} \tag{2.4.26}$$

There are three independent variables in U: one amplitude and two phases.
The fourth independent variable has been suppressed because of the arbitrary
selection det(U) = +1. The unitary matrix T includes the common phase, written here as $\beta_0$:

$$T = e^{j\beta_0} \begin{pmatrix} e^{j\alpha}\cos\kappa & e^{j\beta}\sin\kappa \\ -e^{-j\beta}\sin\kappa & e^{-j\alpha}\cos\kappa \end{pmatrix} \tag{2.4.27}$$

where there are now four independent variables: one amplitude and three
phases.
The Cayley-Klein form of U, using complex entries a and b, is

$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \tag{2.4.28}$$

The inverse of the unitary matrix is $U^{-1}(a, b) = U(a^*, -b)$.
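The general form of U and its Cayley-Klein inverse can be checked with a short sketch. The parameter names `kappa`, `alpha`, and `beta`, and the placement of the minus sign on the lower-left entry, are assumptions made for this illustration; the sign degeneracy noted above permits other equivalent placements.

```python
import cmath
import math

def cayley_klein(a, b):
    """Cayley-Klein form with rows (a, b) and (-conj(b), conj(a))."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def unitary(kappa, alpha, beta):
    """One amplitude angle and two phases give a unit-determinant unitary matrix."""
    a = cmath.exp(1j * alpha) * math.cos(kappa)
    b = cmath.exp(1j * beta) * math.sin(kappa)
    return cayley_klein(a, b)

def adjoint(M):
    return [[M[n][m].conjugate() for n in range(2)] for m in range(2)]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(2)) for n in range(2)]
            for m in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Assumed example parameters
U = unitary(0.4, 1.1, -0.3)
a, b = U[0][0], U[0][1]

# Inverse obtained by the substitution (a, b) -> (conj(a), -b)
U_inv = cayley_klein(a.conjugate(), -b)

print(det2(U))
print(matmul(adjoint(U), U))
print(matmul(U_inv, U))
```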
2.4.6 Group Properties of SU(2)
For polarization studies, unitary operators are 2 × 2 square matrices with
complex entries. The group defined by multiplication operations of 2 × 2 unitary matrices is called U(2), and the subgroup of unitary matrices where
det(U) = +1 is called SU(2), S for special. The group properties for multiplication are

Identity         $U I = U$
Closure          $U_1 U_2 = U_3$
Inverse          $U^{-1} U = I$
Associativity    $(U_1 U_2) U_3 = U_1 (U_2 U_3)$

where in all cases $U_{1,2,3} \in$ SU(2). SU(2) is closed under these four operations.


2.5 Vectors Cast in Jones and Stokes Spaces


Thus far, state spaces and operators have been presented without restriction
on their dimensionality. The properties of these vectors and matrices have
been studied in general with passing observations about polarization-specific
points of interest. At this stage the scope of presentation will concentrate
on the study of polarization so that a formal connection between Jones and
Stokes space can be established. The tools developed in the preceding sections
are essential to make the bilateral connections that follow.
Recall from (1.4.5) on page 13 that a polarization vector is written as

$$|s\rangle = E_o\, e^{j\phi_0} \begin{pmatrix} \cos\chi \\ \sin\chi\, e^{j\phi} \end{pmatrix} \tag{2.5.1}$$

where $E_o$ is real. There are two polar angles in (2.5.1): $\chi$ and $\phi$. The common
phase $\exp(j\phi_0)$ is lost on conversion to Stokes space.
There are seven measurements necessary to determine the polarization
ellipse uniquely. The first measurement is for the overall intensity and the
remaining measurements project the ellipse onto six different reference axes.
The formal construction of a projection matrix is necessary at this point.
Consider points along two orthogonal axes and their projection onto a
line L inclined by angle $\theta$ that passes through the origin. As illustrated in
Fig. 2.2, the coordinate (1, 0) is projected to point a on line L. The coordinates
of a as measured along the two orthogonal axes are $(\cos^2\theta, \sin\theta\cos\theta)$. After
a similar analysis for the coordinate (0, 1), one can construct the projection
matrix P:

$$P = \begin{pmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{pmatrix} \tag{2.5.2}$$

It is clear that det(P) = 0; P is non-invertible and its action is irreversible.
There is loss of information after projection. Moreover, $P^2 = P$, so once
the projection is taken, subsequent projections along the same line L do not
change the result.
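The two stated properties of the projection matrix, zero determinant and idempotence, can be confirmed with a brief sketch. The function name `projection` and the example angle are assumptions for illustration.

```python
import math

def projection(theta):
    """Projection matrix onto a line through the origin inclined by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, s * c], [s * c, s * s]]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(2)) for n in range(2)]
            for m in range(2)]

theta = math.radians(30)            # assumed example inclination
P = projection(theta)

det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]   # zero: information is lost
P_twice = matmul(P, P)                          # equals P: repeated projection changes nothing

# Image of the unit coordinate (1, 0): the first column of P
point_a = (P[0][0], P[1][0])

print(det_P)
print(P_twice)
print(point_a)
```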
2.5.1 Complete Measurement of the Polarization Ellipse
There are seven measurements necessary for complete determination of the
polarization ellipse. The first measurement is one of total intensity; the remaining six measurements are projections. The projections are defined in pairs and
the difference values are associated with the Stokes coordinates. The result of
an intensity measurement of the polarization ellipse is the inner product

$$\langle s|s\rangle = \langle s| \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} |s\rangle \tag{2.5.3}$$

Fig. 2.2. Projection of unit coordinates (1, 0) and (0, 1) onto line L, which is inclined
by angle $\theta$ and passes through the origin. The projected coordinates are
$a = \cos\theta\,(\cos\theta, \sin\theta)$ and $b = \sin\theta\,(\cos\theta, \sin\theta)$. A second projection of a and b onto L does not change the coordinates
of a and b. The projection operator is non-invertible.

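Without loss of generality, $\langle s|s\rangle = 1$ in the following.
The first projection pair is $\theta = 0$ and $\theta = \pi/2$. The projection measure
comes from the inner products

$$P_0 = \langle s| \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} |s\rangle, \quad \text{and} \quad P_{\pi/2} = \langle s| \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} |s\rangle \tag{2.5.4}$$

The Stokes parameter $s_1$ is defined as

$$s_1 = P_0 - P_{\pi/2} \tag{2.5.5}$$

Substitution of (2.5.4) into (2.5.5) makes

$$s_1 = \langle s| \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} |s\rangle \tag{2.5.6}$$

The second projection pair is $\theta = \pm\pi/4$. These projections produce

$$P_{+\pi/4} = \tfrac{1}{2} \langle s| \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} |s\rangle, \quad \text{and} \quad P_{-\pi/4} = \tfrac{1}{2} \langle s| \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} |s\rangle \tag{2.5.7}$$

The Stokes parameter $s_2$ is defined as

$$s_2 = P_{+\pi/4} - P_{-\pi/4} \tag{2.5.8}$$

which makes

$$s_2 = \langle s| \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} |s\rangle \tag{2.5.9}$$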
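The last projection requires the measurement of the ellipse circularity. By
convention, right-hand circular polarization rotates in the counter-clockwise
(ccw) direction when observed along the $\hat{z}$ axis (looking into the light).
The normalized right-hand circular polarization vector is

$$|s\rangle_R = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ j \end{pmatrix} \tag{2.5.10}$$

The ccw vector needs mapping to the $\theta = 0$ axis; a unitary transform does
the rotation. The right- and left-hand projections are calculated via

$$P_R = \langle s|\, U^\dagger P_0 U\, |s\rangle \tag{2.5.11a}$$
$$P_L = \langle s|\, U^\dagger P_{\pi/2} U\, |s\rangle \tag{2.5.11b}$$

The unitary matrix

$$U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix} \tag{2.5.12}$$

maps right-hand circular polarization to the $\theta = 0$ axis:

$$\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ j \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{2.5.13}$$

Substituting (2.5.2) and (2.5.12) into (2.5.11) produces

$$P_R = \tfrac{1}{2} \langle s| \begin{pmatrix} 1 & -j \\ j & 1 \end{pmatrix} |s\rangle, \quad \text{and} \quad P_L = \tfrac{1}{2} \langle s| \begin{pmatrix} 1 & j \\ -j & 1 \end{pmatrix} |s\rangle \tag{2.5.14}$$

The Stokes parameter $s_3$ is defined as

$$s_3 = P_R - P_L \tag{2.5.15}$$

which makes

$$s_3 = \langle s| \begin{pmatrix} 0 & -j \\ j & 0 \end{pmatrix} |s\rangle \tag{2.5.16}$$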

From these seven measurements one can transform from a ket in Jones
space to three Stokes coordinates that lie on the unit sphere:

$$|s\rangle \rightarrow \hat{s} \tag{2.5.17}$$

where the vector $\hat{s}$ is a column vector defined by $\hat{s} = (s_1, s_2, s_3)^T$.


This completes the measurement of the polarization ellipse. From these
measurements the polarimetric angles $\chi$ and $\phi$ are uniquely determined. These
measurements combined with the denition of the Stokes parameters generate
the Pauli spin matrices. This is the topic of the next section.
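The six projection measurements and their pairwise differences can be sketched as follows. The projector matrices are written so that each pair sums to the identity, with a factor of one half on the circular pair; that normalization, the names `expect` and `stokes_from_projections`, the angle names `chi` and `phi`, and the test state are assumptions of this illustration.

```python
import cmath
import math

def expect(M, s):
    """Evaluate <s|M|s> for a 2x2 matrix M and a Jones ket s."""
    return sum(complex(s[m]).conjugate() * M[m][n] * s[n]
               for m in range(2) for n in range(2))

# Projection matrices for the three measurement pairs
PROJECTORS = {
    "0":   [[1, 0], [0, 0]],
    "90":  [[0, 0], [0, 1]],
    "+45": [[0.5, 0.5], [0.5, 0.5]],
    "-45": [[0.5, -0.5], [-0.5, 0.5]],
    "R":   [[0.5, -0.5j], [0.5j, 0.5]],
    "L":   [[0.5, 0.5j], [-0.5j, 0.5]],
}

def stokes_from_projections(s):
    """Stokes coordinates as differences of paired projection measurements."""
    p = {name: expect(M, s).real for name, M in PROJECTORS.items()}
    return (p["0"] - p["90"], p["+45"] - p["-45"], p["R"] - p["L"])

# Assumed test state with polarimetric angles chi and phi
chi, phi = 0.3, 1.2
s = [math.cos(chi), math.sin(chi) * cmath.exp(1j * phi)]
s1, s2, s3 = stokes_from_projections(s)

# A right-hand circular state should land on the s3 axis
rhc = [1 / math.sqrt(2), 1j / math.sqrt(2)]
rhc_stokes = stokes_from_projections(rhc)

print(s1, s2, s3)
print(rhc_stokes)
```

For the test state the three coordinates reduce to cos 2χ, sin 2χ cos φ, and sin 2χ sin φ, and their squares sum to one.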
2.5.2 Pauli Spin Matrices
The Pauli spin matrices connect Jones to Stokes spaces through the projection
measurements of the preceding section. The identity Pauli matrix is

$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.5.18}$$

The Pauli spin matrices are$^1$

$$\sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad \sigma_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_3 = \begin{pmatrix} 0 & -j \\ j & 0 \end{pmatrix} \tag{2.5.19}$$

The spin matrices are both Hermitian and unitary:

$$\sigma_k^\dagger = \sigma_k \quad \text{and} \quad \sigma_k^\dagger \sigma_k = I \tag{2.5.20}$$

The determinants of the spin matrices are −1 and the traces zero:

$$\det(\sigma_k) = -1 \quad \text{and} \quad \mathrm{Tr}(\sigma_k) = 0 \tag{2.5.21}$$

A spin matrix multiplied by itself yields

$$\sigma_k \sigma_k = I \tag{2.5.22}$$

and multiplied by other matrices gives

$$\sigma_i \sigma_j = -\sigma_j \sigma_i = j\sigma_k \tag{2.5.23}$$

where the indices of the multiplication table (i, j, k) are cyclic permutations
of (1, 2, 3).
Each Stokes coordinate of a polarization state $|s\rangle$ is calculated by inserting
the associated Pauli matrix into the inner product $\langle s|\cdot|s\rangle$. The individual
Stokes coordinates are

$$s_k = \langle s|\sigma_k|s\rangle \tag{2.5.24}$$

This is shorthand for the projection-difference measurements of (2.5.6, 2.5.9,
2.5.16). Since the spin matrices are Hermitian, the Stokes coordinates $s_k$ are
real, signed quantities. Moreover, since $\det(\sigma_k) = -1$ and the Jones vector $|s\rangle$
is assumed to be normalized, $s_k$ is bounded by $-1 \leq s_k \leq +1$. The proof that
the norm of $\hat{s}$ is unity, $|\hat{s}| = 1$, is shown below.
2.5.3 The Pauli Spin Vector and the Bilateral Connection
Between Jones and Stokes Vectors
The Pauli spin vector condenses further the notation of (2.5.24). The spin
vector is defined as

$$\vec{\sigma} = \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{pmatrix} \tag{2.5.25}$$

$^1$ In physics texts the z direction is denoted by the $\sigma_1$ spin matrix while here it is
denoted by $\sigma_3$. Historically, the Pauli spin matrices describe electron spin, which
is either up or down in the z direction. In polarization optics, one usually thinks
of a horizontal polarization state aligned to the x axis.


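where $\vec{\sigma}$ is a vector of matrices. The vector of Stokes coordinates $\hat{s}$ is derived
from the Jones vector $|s\rangle$ using the spin vector:

$$\begin{pmatrix} s_1 \\ s_2 \\ s_3 \end{pmatrix} = \begin{pmatrix} \langle s|\sigma_1|s\rangle \\ \langle s|\sigma_2|s\rangle \\ \langle s|\sigma_3|s\rangle \end{pmatrix} \tag{2.5.26}$$

More concisely,

$$\hat{s} = \langle s|\vec{\sigma}|s\rangle \tag{2.5.27}$$

This is the most compact way to map Jones vectors to Stokes vectors.
The reciprocal connection is made through an eigenvalue equation whose
parameters are the Stokes vector $\hat{s}$ and the spin vector. First, observe that the
spin vector behaves both as a 3 × 1 vector and as a 2 × 2 matrix, depending on
the context. The expression above shows the spin vector acting as a 3 × 1 vector. Alternatively,
the dot product of $\hat{s}$ with the spin vector yields

$$\hat{s} \cdot \vec{\sigma} = s_1 \sigma_1 + s_2 \sigma_2 + s_3 \sigma_3 = \begin{pmatrix} s_1 & s_2 - js_3 \\ s_2 + js_3 & -s_1 \end{pmatrix} \tag{2.5.28}$$

$\hat{s} \cdot \vec{\sigma}$ in this case is a 2 × 2 Jones matrix and, since the coefficients $s_k$ are real,
$\hat{s} \cdot \vec{\sigma}$ is Hermitian: $(\hat{s} \cdot \vec{\sigma})^\dagger = (\hat{s} \cdot \vec{\sigma})$.
Next, recall from §2.2.3 that the trace operation connects the projector
with its inner product: $\mathrm{Tr}(|s\rangle\langle s|) = \langle s|s\rangle$. Since the trace of each Pauli matrix
is zero it is also true that $\mathrm{Tr}(\hat{s} \cdot \vec{\sigma}) = 0$. For a normalized state vector such
that $\langle s|s\rangle = 1$, one can construct the projector for ket $|s\rangle$ in terms of the spin
vector:

$$|s\rangle\langle s| = \tfrac{1}{2}\,(I + \hat{s} \cdot \vec{\sigma}) \tag{2.5.29}$$

Subsequent multiplication on the right by $|s\rangle$ generates the eigenvalue equation

$$\hat{s} \cdot \vec{\sigma}\,|s\rangle = |s\rangle \tag{2.5.30}$$

This is the most compact way to map Stokes vectors to Jones vectors. The
eigenvector of $\hat{s} \cdot \vec{\sigma}$ associated with eigenvalue +1 generates the Jones vector
$|s\rangle$ from Stokes vector $\hat{s}$.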
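The bilateral connection can be sketched numerically: map a Jones ket to Stokes coordinates, then recover a Jones ket as the +1 eigenvector of the dot product of the Stokes vector with the spin vector. The recovery below takes the first column of the matrix I plus that dot product, which is one convenient choice assumed for this illustration; the function names and test state are likewise assumptions.

```python
import cmath
import math

def stokes(s):
    """Stokes coordinates s_k = <s|sigma_k|s>, written out for a 2-component ket."""
    ax, ay = complex(s[0]), complex(s[1])
    cross_term = ax.conjugate() * ay
    return (abs(ax) ** 2 - abs(ay) ** 2, 2 * cross_term.real, 2 * cross_term.imag)

def s_dot_sigma(sv):
    """The 2x2 matrix s1*sigma1 + s2*sigma2 + s3*sigma3."""
    s1, s2, s3 = sv
    return [[s1, s2 - 1j * s3], [s2 + 1j * s3, -s1]]

def jones_from_stokes(sv):
    """Unit eigenvector of s.sigma with eigenvalue +1, up to a common phase."""
    s1, s2, s3 = sv
    if s1 > -1 + 1e-12:
        v = [1 + s1, s2 + 1j * s3]   # first column of I + s.sigma
    else:
        v = [0.0, 1.0]               # s = (-1, 0, 0): the vertical state
    n = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return [v[0] / n, v[1] / n]

# Assumed test state, including a common phase that Stokes space cannot retain
chi, phi = 0.5, -0.8
common = cmath.exp(0.9j)
s = [common * math.cos(chi), common * math.sin(chi) * cmath.exp(1j * phi)]

sv = stokes(s)
recovered = jones_from_stokes(sv)

# Eigenvalue check: (s.sigma)|recovered> should equal |recovered>
M = s_dot_sigma(sv)
acted = [sum(M[m][n] * recovered[n] for n in range(2)) for m in range(2)]

# Overlap magnitude is 1 when the kets differ only by a common phase
overlap = abs(sum(recovered[m].conjugate() * s[m] for m in range(2)))

print(sv)
print(recovered)
print(overlap)
```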
2.5.4 Spin-Vector Identities
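Vector operations that include spin-vectors do not yield to the same intuition one is accustomed to with normal vectors. For example, while one is
quite familiar with $\vec{a} \cdot (\vec{b} \times \vec{a}) = 0$, since $\vec{a}$ is orthogonal to $\vec{b} \times \vec{a}$, the spin-vector analogue produces $\vec{\sigma} \cdot (\vec{a} \times \vec{\sigma}) = -2j(\vec{a} \cdot \vec{\sigma})$. The difference comes from
the cyclic multiplication table for spin-vectors (2.5.22) and (2.5.23), where the sign
of a product is determined by the order in which the spin-vectors appear.
The purpose of the following identity tabulation is to provide reductions
in the order k of $(\vec{\sigma})^k$. For the following identities, $\vec{a}$ and $\vec{b}$ are real-valued 3 × 1
vectors and $\vec{\sigma}$ is the spin vector. Real vectors $\vec{a}$ and $\vec{b}$ are not interchangeable
with the spin vector $\vec{\sigma}$.
Identities of order $(\vec{\sigma})^0$ and $(\vec{\sigma})$:

$$\vec{a} \cdot \vec{a} = a^2 \tag{2.5.31}$$
$$\vec{a} \cdot \vec{\sigma} = \vec{\sigma} \cdot \vec{a} \tag{2.5.32}$$
$$\vec{a}\,(\vec{a} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{\sigma})\,\vec{a} \tag{2.5.33}$$

Identities of order $(\vec{\sigma})^2$:

$$\vec{\sigma} \cdot \vec{\sigma} = 3I \tag{2.5.34}$$
$$\vec{\sigma}\,(\vec{a} \cdot \vec{\sigma}) = \vec{a}\,I + j\,\vec{a} \times \vec{\sigma} \tag{2.5.35}$$
$$(\vec{a} \cdot \vec{\sigma})\,\vec{\sigma} = \vec{a}\,I - j\,\vec{a} \times \vec{\sigma} \tag{2.5.36}$$
$$(\vec{a} \cdot \vec{\sigma})(\vec{a} \cdot \vec{\sigma}) = a^2 I \tag{2.5.37}$$
$$(\vec{a} \cdot \vec{\sigma})(\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b})\,I + (j\,\vec{a} \times \vec{b}) \cdot \vec{\sigma} \tag{2.5.38}$$
$$[(\vec{a} \cdot \vec{\sigma}), \vec{\sigma}] = -2j\,\vec{a} \times \vec{\sigma} \tag{2.5.39}$$
$$\{(\vec{a} \cdot \vec{\sigma}), \vec{\sigma}\} = 2\vec{a}\,I \tag{2.5.40}$$
$$[(\vec{a} \cdot \vec{\sigma}), (\vec{b} \cdot \vec{\sigma})] = 2\,(j\,\vec{a} \times \vec{b}) \cdot \vec{\sigma} \tag{2.5.41}$$
$$\{(\vec{a} \cdot \vec{\sigma}), (\vec{b} \cdot \vec{\sigma})\} = 2\,(\vec{a} \cdot \vec{b})\,I \tag{2.5.42}$$
$$\vec{\sigma} \cdot (j\,\vec{a} \times \vec{\sigma}) = 2\,(\vec{a} \cdot \vec{\sigma}) \tag{2.5.43}$$
$$(j\,\vec{a} \times \vec{\sigma}) \cdot \vec{\sigma} = -2\,(\vec{a} \cdot \vec{\sigma}) \tag{2.5.44}$$
$$(j\,\vec{a} \times \vec{\sigma})(\vec{a} \cdot \vec{\sigma}) = a^2 \vec{\sigma} - \vec{a}\,(\vec{a} \cdot \vec{\sigma}) \tag{2.5.45}$$
$$(\vec{a} \cdot \vec{\sigma})(j\,\vec{a} \times \vec{\sigma}) = \vec{a}\,(\vec{a} \cdot \vec{\sigma}) - a^2 \vec{\sigma} \tag{2.5.46}$$

where $[A, B] = AB - BA$ is the commutator and $\{A, B\} = AB + BA$ is the
anti-commutator.
Identities of order $(\vec{\sigma})^3$:

$$\vec{\sigma} \cdot ((\vec{a} \cdot \vec{\sigma})\,\vec{\sigma}) = (\vec{\sigma}\,(\vec{a} \cdot \vec{\sigma})) \cdot \vec{\sigma} = -(\vec{a} \cdot \vec{\sigma}) \tag{2.5.47}$$
$$\vec{\sigma} \cdot (\vec{\sigma}\,(\vec{a} \cdot \vec{\sigma})) = ((\vec{a} \cdot \vec{\sigma})\,\vec{\sigma}) \cdot \vec{\sigma} = 3\,(\vec{a} \cdot \vec{\sigma}) \tag{2.5.48}$$
$$(\vec{a} \cdot \vec{\sigma})\,\vec{\sigma}\,(\vec{a} \cdot \vec{\sigma}) = 2\vec{a}\,(\vec{a} \cdot \vec{\sigma}) - a^2 \vec{\sigma} \tag{2.5.49}$$

Identity of order $(\vec{\sigma})^n$:

$$(\vec{a} \cdot \vec{\sigma})^n = \begin{cases} a^n I & n \text{ even} \\ a^{n-1}\,(\vec{a} \cdot \vec{\sigma}) & n \text{ odd} \end{cases} \tag{2.5.50}$$

Finally, there are identities that relate to inner products taken with various
forms of the spin vector. These identities are as follows:

$$\langle s|\vec{a} \cdot \vec{\sigma}|s\rangle = \vec{a} \cdot \langle s|\vec{\sigma}|s\rangle = \vec{a} \cdot \hat{s} \tag{2.5.51}$$
$$\langle s|\vec{a} \times \vec{\sigma}|s\rangle = \vec{a} \times \langle s|\vec{\sigma}|s\rangle = \vec{a} \times \hat{s} \tag{2.5.52}$$
$$\langle s|R\vec{\sigma}|s\rangle = R\,\langle s|\vec{\sigma}|s\rangle = R\hat{s} \tag{2.5.53}$$

where $\vec{a}$ and $\vec{b}$ are arbitrary vectors in Stokes space, a is the length of $\vec{a}$,
and R is a 3 × 3 matrix. Identity (2.5.53) is always a source of confusion, so it
is repeated explicitly:

$$\begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix} \begin{pmatrix} \langle s|\sigma_1|s\rangle \\ \langle s|\sigma_2|s\rangle \\ \langle s|\sigma_3|s\rangle \end{pmatrix} = \begin{pmatrix} \langle s|r_{11}\sigma_1 + r_{12}\sigma_2 + r_{13}\sigma_3|s\rangle \\ \langle s|r_{21}\sigma_1 + r_{22}\sigma_2 + r_{23}\sigma_3|s\rangle \\ \langle s|r_{31}\sigma_1 + r_{32}\sigma_2 + r_{33}\sigma_3|s\rangle \end{pmatrix}$$

where $R\hat{s}$ is on the left and $\langle s|R\vec{\sigma}|s\rangle$ is on the right.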
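Two of the identities can be spot-checked with explicit matrices: the product rule for two dot products with the spin vector, and the sum over k of each Pauli matrix sandwiching the dot product, which evaluates to its negative. The sketch below uses the Pauli matrices in the ordering of (2.5.19); the helper names and the example vectors are assumptions for illustration.

```python
# Pauli matrices in the ordering used in this chapter
SIGMA = [
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
]
IDENTITY = [[1, 0], [0, 1]]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(2)) for n in range(2)]
            for m in range(2)]

def add(A, B):
    return [[A[m][n] + B[m][n] for n in range(2)] for m in range(2)]

def scale(c, A):
    return [[c * A[m][n] for n in range(2)] for m in range(2)]

def dot_sigma(a):
    """The 2x2 matrix a1*sigma1 + a2*sigma2 + a3*sigma3."""
    total = [[0j, 0j], [0j, 0j]]
    for ak, sk in zip(a, SIGMA):
        total = add(total, scale(ak, sk))
    return total

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Arbitrary real example vectors (assumed values)
a = (0.3, -1.2, 0.8)
b = (1.5, 0.4, -0.6)

# Product rule: (a.sigma)(b.sigma) = (a.b) I + j (a x b).sigma
lhs = matmul(dot_sigma(a), dot_sigma(b))
rhs = add(scale(dot(a, b), IDENTITY), scale(1j, dot_sigma(cross(a, b))))

# Sandwich sum: sum over k of sigma_k (a.sigma) sigma_k = -(a.sigma)
sandwich = [[0j, 0j], [0j, 0j]]
for sk in SIGMA:
    sandwich = add(sandwich, matmul(matmul(sk, dot_sigma(a)), sk))
minus_a_sigma = scale(-1, dot_sigma(a))

print(lhs)
print(rhs)
print(sandwich)
```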
2.5.5 Conservation of Length
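Expressions (2.5.27) and (2.5.30) complete the bilateral connection between
Jones and Stokes vectors. Length must be conserved, of course, and this is
verified now.
The Stokes-vector length is derived from the product $\hat{s} \cdot \hat{s} = s_1^2 + s_2^2 + s_3^2$.
Consider one coordinate alone,

$$s_k^2 = \langle s|\sigma_k|s\rangle \langle s|\sigma_k|s\rangle \tag{2.5.54}$$

Substitution of the projector $|s\rangle\langle s|$, (2.5.29), for the innermost term gives

$$s_k^2 = \tfrac{1}{2} \langle s|s\rangle + \tfrac{1}{2} \langle s|\sigma_k\,(\hat{s} \cdot \vec{\sigma})\,\sigma_k|s\rangle \tag{2.5.55}$$

The sum of all three terms gives

$$s_1^2 + s_2^2 + s_3^2 = \tfrac{3}{2} \langle s|s\rangle + \tfrac{1}{2} \langle s|\,\vec{\sigma} \cdot ((\hat{s} \cdot \vec{\sigma})\,\vec{\sigma})\,|s\rangle \tag{2.5.56}$$

The spin-vector identity (2.5.47) simplifies (2.5.56):

$$s_1^2 + s_2^2 + s_3^2 = \tfrac{3}{2} \langle s|s\rangle - \tfrac{1}{2} \langle s|\hat{s} \cdot \vec{\sigma}|s\rangle = \langle s|s\rangle \tag{2.5.57}$$

Thus, $\|\hat{s}\|^2 = \langle s|s\rangle$. Length is clearly preserved in this direction. For the
reverse mapping, construction of (2.5.30) without the assumption $\langle s|s\rangle = 1$
produces

$$\vec{s} \cdot \vec{\sigma}\,|s\rangle = \langle s|s\rangle\,|s\rangle \;\rightarrow\; \hat{s} \cdot \vec{\sigma}\,|s\rangle = |s\rangle \tag{2.5.58}$$

That length is conserved over the bilateral connections is thus established.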
There is, however, one piece of information that is lost in the mapping
from Jones to Stokes. Since Stokes coordinates are derived from intensity
measurements, the common phase of the Jones vector is lost. Transformation
from Stokes back to Jones does not reintroduce this phase. Any Jones vector
constructed from a Stokes vector is accurate to the true Jones vector to
within an arbitrary common phase. Physically this just means that the absolute time it took for the light to travel from its source to the observer cannot
be determined from Stokes measurements.
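The length conservation and the loss of common phase can be checked numerically. The following is a minimal sketch, assuming the Pauli matrices in the ordering of this chapter ($\sigma_1$ diagonal); the helper name `stokes_from_jones` and the sample Jones vector are illustrative choices, not part of the text.

```python
import numpy as np

# Pauli matrices in the ordering used in this chapter (sigma_1 is diagonal).
SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def stokes_from_jones(ket):
    """Return the 3-vector with components <s|sigma_k|s> (hypothetical helper)."""
    ket = np.asarray(ket, dtype=complex)
    return np.real(np.array([np.vdot(ket, sk @ ket) for sk in SIGMA]))

# A normalized Jones vector: 0.36 + 0.64 = 1.
ket = np.array([0.6, 0.8 * np.exp(1j * 0.7)])

s_vec = stokes_from_jones(ket)
norm_jones = np.vdot(ket, ket).real      # <s|s>
norm_stokes_sq = s_vec @ s_vec           # s1^2 + s2^2 + s3^2

# Multiplying by a common phase changes the Jones vector but not the Stokes vector.
s_vec_phased = stokes_from_jones(np.exp(1j * 1.3) * ket)
```

Both norms evaluate to unity for this input, and `s_vec_phased` coincides with `s_vec`, which is the numerical counterpart of the statement that the common phase cannot be recovered from Stokes measurements.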
2.5.6 Orthogonal Polarization States
For every polarization state $|s_+\rangle$ there is a unique polarization state $|s_-\rangle$ such that $\langle s_-|s_+\rangle = 0$. These states $|s_+\rangle$ and $|s_-\rangle$ are orthogonal. Given $|s_+\rangle$, how does one construct the orthogonal state $|s_-\rangle$ and its Stokes equivalent? From (2.5.30) on page 56 one writes

$$\langle s_-|s_+\rangle = \langle s_-|\,(\hat s_-\cdot\vec\sigma)^\dagger(\hat s_+\cdot\vec\sigma)\,|s_+\rangle = 0 \tag{2.5.59}$$

Since $(\hat s_-\cdot\vec\sigma)$ is Hermitian, (2.5.59) is rewritten using spin-vector identity (2.5.38) as

$$\langle s_-|s_+\rangle = (\hat s_-\cdot\hat s_+)\langle s_-|s_+\rangle + j\,\langle s_-|\,(\hat s_-\times\hat s_+)\cdot\vec\sigma\,|s_+\rangle \tag{2.5.60}$$

As $\langle s_-|s_+\rangle = 0$, (2.5.60) requires that $(\hat s_-\times\hat s_+) = 0$. There are two orientations that produce $(\hat s_-\times\hat s_+) = 0$: $\hat s_-\cdot\hat s_+ = \pm 1$. If $\hat s_-\cdot\hat s_+ = +1$, then $\langle s_-|s_+\rangle = 1$, contradicting the orthogonality of the two states. Therefore it must be the case that

$$\hat s_-\cdot\hat s_+ = -1 \tag{2.5.61}$$

The Stokes coordinates for any two orthogonal polarization states are on opposite sides of the Poincaré sphere: $\hat s_- = -\hat s_+$. Specifically, a chord that connects any two orthogonal states crosses through the origin of the sphere. The polarimetric parameters $\chi$ and $\phi$ are related through the Stokes vectors as

$$\begin{pmatrix} \cos 2\chi_- \\ \sin 2\chi_-\cos\phi_- \\ \sin 2\chi_-\sin\phi_- \end{pmatrix} = -\begin{pmatrix} \cos 2\chi_+ \\ \sin 2\chi_+\cos\phi_+ \\ \sin 2\chi_+\sin\phi_+ \end{pmatrix} \tag{2.5.62}$$

A sufficient requirement to satisfy (2.5.62) is

$$2\chi_- = 2\chi_+ + \pi \tag{2.5.63a}$$

$$\phi_- = \phi_+ \tag{2.5.63b}$$

Equations (2.5.61) and (2.5.63) show that orthogonal polarization states have opposite handedness and perpendicular orientations of the respective elliptical major axes. The Jones and Stokes representation of orthogonal polarization pairs is illustrated in Fig. 2.3.

Fig. 2.3. Orthogonal polarization states in Jones and Stokes space. a) The handedness of the polarization ellipse is reversed and the major axis is rotated by $\pi/2$. b) Points on opposite sides of the Poincaré sphere are orthogonal.
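The antipodal relation between orthogonal states can be illustrated with a short numerical sketch. It assumes the chapter's Pauli-matrix ordering and uses the construction $(a, b) \to (-b^*, a^*)$ for the orthogonal Jones vector, which is one convenient choice (the orthogonal state is only defined up to a common phase); the helper names are hypothetical.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def stokes(ket):
    """Unit Stokes vector of a Jones vector of any non-zero norm."""
    ket = np.asarray(ket, dtype=complex)
    s = np.real(np.array([np.vdot(ket, sk @ ket) for sk in SIGMA]))
    return s / np.vdot(ket, ket).real

def orthogonal_state(ket):
    """For ket = (a, b) return (-b*, a*), which satisfies <s-|s+> = 0."""
    a, b = ket
    return np.array([-np.conj(b), np.conj(a)])

ket_plus = np.array([np.cos(0.4), np.sin(0.4) * np.exp(1j * 1.1)])
ket_minus = orthogonal_state(ket_plus)

overlap = np.vdot(ket_minus, ket_plus)     # Jones-space inner product
s_plus = stokes(ket_plus)
s_minus = stokes(ket_minus)
dot_product = s_plus @ s_minus             # Stokes-space dot product
```

For this pair the Jones overlap vanishes while the Stokes vectors are exact negatives of one another, in line with (2.5.61).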
2.5.7 Non-Orthogonal Polarization States
The inner product magnitude between two polarization states may be calculated either in Jones or Stokes space. Consider two Jones vectors $|p\rangle$ and $|q\rangle$ that are not normalized, and recall that $\mathrm{Tr}(|p\rangle\langle q|) = \langle q|p\rangle$. In a manner similar to (2.5.29), the outer product of $|p\rangle$ with itself is written

$$|p\rangle\langle p| = \tfrac{1}{2}\,(I + \hat p\cdot\vec\sigma)\,\langle p|p\rangle \tag{2.5.64}$$

Multiplication on the right by $|q\rangle$ and on the left by $\langle q|$, and some rearrangement, makes

$$\frac{|\langle p|q\rangle|^2}{\langle p|p\rangle\,\langle q|q\rangle} = \tfrac{1}{2}\,(1 + \hat p\cdot\hat q) \tag{2.5.65}$$

When $|p\rangle$ and $|q\rangle$ are normalized, the identity reduces to

$$|\langle p|q\rangle|^2 = \tfrac{1}{2}\,(1 + \hat p\cdot\hat q) \tag{2.5.66}$$

The magnitude of the inner product in Jones space is derived directly from the Stokes vectors using this equation. What cannot be discerned, however, is the phase of the inner product. To recover the phase, $\langle p|q\rangle$ must be calculated explicitly in Jones space. To construct the Jones vectors, one must either solve the eigenvalue equation (2.5.30) or form the Jones vector (1.4.19) on page 18 for both $|p\rangle$ and $|q\rangle$.
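Identity (2.5.65) lends itself to a direct numerical check. The sketch below draws two random, unnormalized Jones vectors (the seed and the helper name `unit_stokes` are arbitrary illustrative choices) and compares the Jones-space and Stokes-space evaluations of the overlap.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def unit_stokes(ket):
    """Unit Stokes vector: <k|sigma|k> divided by <k|k>."""
    s = np.real(np.array([np.vdot(ket, sk @ ket) for sk in SIGMA]))
    return s / np.vdot(ket, ket).real

rng = np.random.default_rng(0)
p = rng.normal(size=2) + 1j * rng.normal(size=2)   # not normalized
q = rng.normal(size=2) + 1j * rng.normal(size=2)   # not normalized

# Left side of (2.5.65): normalized squared magnitude of the Jones inner product.
jones_side = abs(np.vdot(p, q)) ** 2 / (np.vdot(p, p).real * np.vdot(q, q).real)

# Right side of (2.5.65): evaluated from the Stokes vectors alone.
stokes_side = 0.5 * (1.0 + unit_stokes(p) @ unit_stokes(q))
```

The two sides agree to machine precision; the phase of `np.vdot(p, q)` itself has no counterpart on the Stokes side.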
2.5.8 Pauli Spin Operators
A general operator can be constructed from the identity matrix and spin vector by the form

$$A = a_0 I + \vec a\cdot\vec\sigma = a_0 I + a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3 = \begin{pmatrix} a_0 + a_1 & a_2 - ja_3 \\ a_2 + ja_3 & a_0 - a_1 \end{pmatrix} \tag{2.5.67}$$

where all $a_k$ are complex numbers. This matrix has the eight requisite independent variables necessary for a general Jones matrix. The entries in $A$ are isolated by the trace:

$$a_0 = \tfrac{1}{2}\mathrm{Tr}(A) \quad\text{and}\quad \vec a = \tfrac{1}{2}\mathrm{Tr}(A\,\vec\sigma) \tag{2.5.68}$$

A Hermitian operator is a special case of $A$:

$$H = a_0 I + \vec a\cdot\vec\sigma, \qquad a_k \text{ real} \tag{2.5.69}$$

The determinant is $\det(H) = a_0^2 - (a_1^2 + a_2^2 + a_3^2)$. Moreover, when the trace of $H$ is zero, the Hermitian matrix equals a spin-vector form

$$H_{\mathrm{Tr}=0} = \vec a\cdot\vec\sigma \tag{2.5.70}$$

Throughout much of this text, Hermitian operators with zero trace, and operators that preserve that trace, are associated with the spin-vector form.
The general operator $A$ can be decomposed into Hermitian and skew-Hermitian matrices

$$A = H_r + jH_i \tag{2.5.71}$$

where the operator $K = (jH_i)$ is skew-Hermitian: $K^\dagger = -K$. The eigenvalues of skew-Hermitian matrices are purely imaginary. The matrices $H_r$ and $H_i$ contain the real and imaginary parts of $A$, respectively. The decomposition is taken further by separating the finite-trace component from the traceless components. Writing the complex number $a_0$ as $a_0 = a_0' + ja_0''$ and identifying each traceless Hermitian matrix with a spin-vector form, one has

$$A = \vec a_r\cdot\vec\sigma + j\,\vec a_i\cdot\vec\sigma + (a_0' + ja_0'')\,I \tag{2.5.72}$$

This is the most general spin-vector form of an arbitrary operator $A$. In practice, the decomposition matrices are derived from $A$ as follows:

$$H_r = \tfrac{1}{2}\left(A + A^\dagger\right), \quad\text{and}\quad H_i = \tfrac{1}{2j}\left(A - A^\dagger\right) \tag{2.5.73}$$

The matrices $H_r$ and $H_i$ are made traceless by calculating $a_0 = \tfrac{1}{2}\mathrm{Tr}(H)$ for the real and imaginary components. The real-valued Stokes vectors $\vec a_r$ and $\vec a_i$ can be read from the matrices $\vec a_{r,i}\cdot\vec\sigma = H_{r,i} - a_{r,i,0}\,I$.
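The trace relations (2.5.68) and the decomposition (2.5.71) to (2.5.73) can be exercised on an arbitrary complex 2 × 2 matrix. This is a minimal sketch; the random test matrix and all variable names are illustrative assumptions.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)
I2 = np.eye(2)

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))   # arbitrary operator

# (2.5.68): isolate the coefficients with the trace.
a0 = 0.5 * np.trace(A)
a = np.array([0.5 * np.trace(A @ sk) for sk in SIGMA])

# (2.5.67): rebuild A from its coefficients.
A_rebuilt = a0 * I2 + np.tensordot(a, SIGMA, axes=1)

# (2.5.73): Hermitian parts of A.
H_r = 0.5 * (A + A.conj().T)
H_i = (A - A.conj().T) / 2j

# Real-valued Stokes vectors and trace parts read from the complex coefficients.
a_r, a_i = a.real, a.imag
H_r_rebuilt = a0.real * I2 + np.tensordot(a_r, SIGMA, axes=1)
H_i_rebuilt = a0.imag * I2 + np.tensordot(a_i, SIGMA, axes=1)
```

Because each $\sigma_k$ is Hermitian, the real and imaginary parts of the complex coefficients $a_k$ are exactly the coefficients of $H_r$ and $H_i$, which is what the last two lines confirm.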
The Pauli spin operator $S$ produces the most compact way to describe operators and concatenations of operators. The spin operator is a matrix exponential form of $A$ and can describe Hermitian, unitary, and general operators. The matrix exponential is written

$$S = \exp(A/2) \tag{2.5.74}$$

The exponential is evaluated using its Taylor expansion. For instance,

$$\begin{aligned} \exp(M) &= I + M + \frac{1}{2!}M^2 + \frac{1}{3!}M^3 + \frac{1}{4!}M^4 + \dots \\ &= \left(I + \frac{1}{2!}M^2 + \frac{1}{4!}M^4 + \dots\right) + \left(M + \frac{1}{3!}M^3 + \frac{1}{5!}M^5 + \dots\right) \end{aligned} \tag{2.5.75}$$

For a Hermitian matrix, $M = \alpha_o I + (\vec\alpha\cdot\vec\sigma)$ where $\alpha_k$ are real. The $n$th-order spin-vector identity (2.5.50) gives the necessary reductions for $(\vec\alpha\cdot\vec\sigma)^n$, which gives

$$\exp(\vec\alpha\cdot\vec\sigma/2) = I\cosh(\alpha/2) + (\hat\alpha\cdot\vec\sigma)\sinh(\alpha/2) \tag{2.5.76}$$

where $\vec\alpha = \alpha\,\hat\alpha$. The Pauli spin operator for a Hermitian operator is thus

$$S_H = \exp(\alpha_0/2)\exp(\vec\alpha\cdot\vec\sigma/2) = e^{\alpha_0/2}\left[\,I\cosh(\alpha/2) + (\hat\alpha\cdot\vec\sigma)\sinh(\alpha/2)\right] \tag{2.5.77}$$

One can interpret this operator as that for a partial polarizer: the common loss is $\alpha_0/2$ (which is a negative quantity for loss), the maximum and minimum differential losses are $1 \pm \tanh(\alpha/2)$, and the Stokes direction of partial polarization is $\hat\alpha$.

The Pauli spin operator for a unitary matrix is constructed by recalling the connection between Hermitian and unitary operators (2.4.18) on page 49. For coefficients $\beta_k$ real, the unitary form of $M$ is $M = -j(\beta_o I + (\vec\beta\cdot\vec\sigma))$. The equivalent to (2.5.76) is

$$\exp\left(-j\vec\beta\cdot\vec\sigma/2\right) = I\cos(\beta/2) - j(\hat\beta\cdot\vec\sigma)\sin(\beta/2) \tag{2.5.78}$$

where $\vec\beta = \beta\,\hat\beta$. The Pauli spin operator for a unitary operator is thus

$$S_T = \exp(-j\beta_0/2)\exp\left(-j\vec\beta\cdot\vec\sigma/2\right) = e^{-j\beta_0/2}\left[\,I\cos(\beta/2) - j(\hat\beta\cdot\vec\sigma)\sin(\beta/2)\right] \tag{2.5.79}$$

One interprets this operator as a retardation plate: $\beta_0/2$ is the common phase, the full retardance is $\beta$, and the axis of retardation is $\hat\beta$. When the common phase is removed the unitary matrix $U$ is recovered.
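The closed forms for the Hermitian and unitary spin operators can be compared against a direct summation of the Taylor series (2.5.75). The sketch below is illustrative only: `expm_taylor` is a hypothetical helper that truncates the series after a fixed number of terms, and `alpha_vec` and `beta_vec` are labels chosen here for the real Stokes vectors in the two exponents.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)
I2 = np.eye(2, dtype=complex)

def dot_sigma(v):
    """Return v . sigma for a real 3-vector v."""
    return np.tensordot(np.asarray(v, dtype=float), SIGMA, axes=1)

def expm_taylor(M, terms=40):
    """Truncated Taylor series I + M + M^2/2! + ... (adequate for small norms)."""
    result = I2.copy()
    term = I2.copy()
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

# Hermitian case: exp(alpha.sigma/2) = I cosh(alpha/2) + (alpha_hat.sigma) sinh(alpha/2)
alpha_vec = np.array([0.3, -0.4, 1.2])
alpha = np.linalg.norm(alpha_vec)
S_H_closed = I2 * np.cosh(alpha / 2) + dot_sigma(alpha_vec / alpha) * np.sinh(alpha / 2)
S_H_series = expm_taylor(dot_sigma(alpha_vec) / 2)

# Unitary case: exp(-j beta.sigma/2) = I cos(beta/2) - j (beta_hat.sigma) sin(beta/2)
beta_vec = np.array([-0.7, 0.2, 0.5])
beta = np.linalg.norm(beta_vec)
S_T_closed = I2 * np.cos(beta / 2) - 1j * dot_sigma(beta_vec / beta) * np.sin(beta / 2)
S_T_series = expm_taylor(-1j * dot_sigma(beta_vec) / 2)
```

The Hermitian result has real eigenvalues $e^{\pm\alpha/2}$ (differential loss), while the unitary result satisfies $S_T S_T^\dagger = I$ (pure retardation).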
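In the particular case where the axes of polarization and retardance coincide, the compound effect is expressed as the spin vector $(-j\vec\beta + \vec\alpha)\cdot\vec\sigma$. Following the form of (2.5.76) the Pauli spin operator expands to matrix form as

$$\exp\big((-j\beta_0 + \alpha_0)/2\big)\exp\big((-j\vec\beta + \vec\alpha)\cdot\vec\sigma/2\big) = e^{(-j\beta_0 + \alpha_0)/2}\left[\,I\cosh\big((-j\beta + \alpha)/2\big) + (\hat r\cdot\vec\sigma)\sinh\big((-j\beta + \alpha)/2\big)\right] \tag{2.5.80}$$

where $\hat r$ is the axis of polarization and retardance.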


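Two relevant properties of matrix exponentials are

$$\frac{\partial}{\partial t}\,e^{At} = Ae^{At} = e^{At}A \tag{2.5.81}$$

$$\frac{\partial}{\partial t}\left(e^{At}e^{Bt}e^{Ct}\right) = Ae^{At}e^{Bt}e^{Ct} + e^{At}Be^{Bt}e^{Ct} + e^{At}e^{Bt}e^{Ct}C \tag{2.5.82}$$

while two non-intuitive results are

$$\frac{\partial}{\partial A}\,e^{At} \neq te^{At} \tag{2.5.83}$$

$$e^{At}e^{Bt} \neq e^{(A+B)t} \tag{2.5.84}$$

unless, for the second equation, $A$ and $B$ commute.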
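The failure of $e^{A}e^{B} = e^{A+B}$ for non-commuting operators is easy to see with two Pauli matrices. The following sketch uses a truncated Taylor series (`expm_taylor`, a hypothetical helper) and the illustrative choices $A = \sigma_1/2$ and $B = \sigma_2/2$.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def expm_taylor(M, terms=40):
    """Truncated Taylor series of the matrix exponential."""
    result = np.eye(2, dtype=complex)
    term = np.eye(2, dtype=complex)
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

A = 0.5 * SIGMA[0]
B = 0.5 * SIGMA[1]          # does not commute with A
C = 2.0 * A                 # commutes with A

commutator_AB = A @ B - B @ A
commutator_AC = A @ C - C @ A

product_AB = expm_taylor(A) @ expm_taylor(B)
combined_AB = expm_taylor(A + B)

product_AC = expm_taylor(A) @ expm_taylor(C)
combined_AC = expm_taylor(A + C)
```

`product_AB` and `combined_AB` differ because `commutator_AB` is non-zero, whereas `product_AC` equals `combined_AC`. This is why a concatenation of operators generally has to be evaluated by multiplying the matrix forms rather than by adding exponents.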


When one writes a compound operation such as

$$H_1UH_2 = \exp\big((\vec\alpha_1\cdot\vec\sigma)/2\big)\exp\big(-j(\vec\beta\cdot\vec\sigma)/2\big)\exp\big((\vec\alpha_2\cdot\vec\sigma)/2\big) \tag{2.5.85}$$

one is only figuratively expressing the series of operations on a polarization state. The evaluation of $H_1UH_2$ requires substitution of the matrix form for each operator.

2.6 Equivalent Unitary Transformations


The preceding sections have established the bilateral connection between Jones and Stokes vectors, and have constructed Hermitian, unitary, and general Jones operators that act on Jones vectors. The connection between a Hermitian matrix and an equivalent Stokes matrix is made with the Mueller matrix, (1.4.22) on page 18 and figuratively (2.1.3) on page 38. Mueller matrices operate on 4 × 1 Stokes vectors to create new, transformed 4 × 1 Stokes vectors.
The connection between a unitary matrix and an equivalent Stokes matrix is also made with the Mueller matrix, but as indicated by (2.1.1) on page 38, only the lower right 3 × 3 sub-matrix $R$ is relevant. Sub-matrix $R$ maps spherical coordinates $(S_1, S_2, S_3)$ into new spherical coordinates $(S_1', S_2', S_3')$ without change of length. Therefore one expects the existence of a rotation operator $R$ corresponding to matrix $R$ that performs rotations on the Poincaré sphere. The operator $R$ does indeed exist and its derivation and properties are so central to the description of retardance that this entire section is devoted to its understanding.
Operators $U$ and $R$ are equivalent representations of the same transformation cast in two different vector spaces. The operators are called isomorphic because they have similar, but not equal, effects. The isomorphism is two-to-one since, as will be seen, there are two Jones operations that have the same effect as every one Stokes operation.
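Consider equivalent vectors $|s\rangle$ and $\hat s$ at the input of a system and equivalent vectors $|t\rangle$ and $\hat t$ at the output. In Jones space a unitary transformation $T$, corresponding to the underlying Maxwell's equations in anisotropic media, links the input and output. In Stokes space the rotation operator $R$ links the input and output. The parallel transformations are

$$|t\rangle = T|s\rangle \;\overset{\text{dual}}{\longleftrightarrow}\; \hat t = R\,\hat s \tag{2.6.1}$$

Expansion of the Stokes vectors on the right side of (2.6.1) into their corresponding inner products gives the relation between $R$ and $T$

$$\langle t|\vec\sigma|t\rangle = R\,\langle s|\vec\sigma|s\rangle \tag{2.6.2}$$

Replacing $|t\rangle$ with $T|s\rangle$ and applying identity (2.5.53) gives

$$\langle s|\,T^\dagger\vec\sigma\,T\,|s\rangle = \langle s|\,R\,\vec\sigma\,|s\rangle \tag{2.6.3}$$

Since (2.6.3) holds for any $|s\rangle$, the embedded operators must be equal. Therefore

$$R\,\vec\sigma = U^\dagger\vec\sigma\,U \tag{2.6.4}$$

where the common phase of $T$ commutes with $\vec\sigma$ and is eliminated. Equation (2.6.4) has an unusual form; the interpretation is

$$\underbrace{R}_{3\times 3}\begin{pmatrix}\sigma_1 \\ \sigma_2 \\ \sigma_3\end{pmatrix} = \begin{pmatrix}U^\dagger\sigma_1U \\ U^\dagger\sigma_2U \\ U^\dagger\sigma_3U\end{pmatrix} \tag{2.6.5}$$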
That $R$ is unitary is derived by multiplying (2.6.4) by its adjoint:

$$\vec\sigma^\dagger R^\dagger\cdot R\,\vec\sigma = U^\dagger\vec\sigma^\dagger U\cdot U^\dagger\vec\sigma\,U \tag{2.6.6}$$

The product $\vec\sigma^\dagger\cdot\vec\sigma = 3I$, so only if $R^\dagger R = I$ does (2.6.6) hold. Therefore,

$$R^\dagger R = RR^\dagger = I \tag{2.6.7}$$

Since $R$ is unitary it embodies a pure rotation; there is no scaling or translation; $\hat t$ is related to $\hat s$ by a rotation in Stokes space. The group properties of $R$ and their correspondence to the group properties of $U$ are listed in the following section.
There are two ways to derive an explicit expression for $R$, either through the matrix form of $U$, generating a matrix form of $R$; or through the Pauli spin operator form of $U$, generating a vector form of $R$. The vector form is more lightweight and powerful than the matrix form because successive operations in Stokes space are evaluated purely in vector form without an a priori choice of orthonormal basis. The vector form is also a template with which to construct any matrix $R$ without matrix multiplication.
2.6.1 Group Properties of SU(2) and O(3)
The group defined by multiplication operators of $R$ is called O(3), where O stands for orthogonal and (3) for the three rotational dimensions of $R$. The group properties for multiplication mirror those for SU(2), cf. §2.4.6:

Identity        $UI = U$                         $RI = R$
Closure         $U_1U_2 = U_3$                   $R_1R_2 = R_3$
Inverse         $U^{-1}U = I$                    $R^{-1}R = I$
Associativity   $(U_1U_2)U_3 = U_1(U_2U_3)$      $(R_1R_2)R_3 = R_1(R_2R_3)$

where in all cases $U_{1,2,3} \in$ SU(2) and $R_{1,2,3} \in$ O(3).
To confirm the entries in the above table, note that multiplication of successive $U$ and $R$ operators forms a one-to-one correspondence. For example,

$$|t\rangle = T_2T_1|s\rangle \;\overset{\text{dual}}{\longleftrightarrow}\; \hat t = R_2R_1\,\hat s \tag{2.6.8}$$

Operator equality is generated through the expression

$$R_2R_1\,\vec\sigma = U_1^\dagger U_2^\dagger\,\vec\sigma\,U_2U_1 \tag{2.6.9}$$

Making the correspondences $R' = R_2R_1$ and $U' = U_2U_1$, (2.6.9) is equivalent to (2.6.4). Given that $U_2U_1 \in$ SU(2) and $U' \leftrightarrow R'$, one infers closure on O(3).
There are twice as many entries in the group SU(2) as in O(3). For every $U_o$, the corresponding $R_o$ is $R_o\vec\sigma = U_o^\dagger\vec\sigma\,U_o$. For every $-U_o$ the corresponding operator $R_o$ is the same: $R_o\vec\sigma = (-U_o)^\dagger\vec\sigma\,(-U_o)$. The isomorphism between SU(2) and O(3) is two-to-one.

2.6.2 Matrix Entries of R in a Fixed Coordinate System


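One way to generate $R$ explicitly is through the matrix form of $U$. Earlier, the generation of matrix entries for the Mueller matrix from entries in the Jones matrix was given without derivation, (1.4.22) on page 18. That equation created a 4 × 4 matrix; only for a unitary matrix is a distinct 3 × 3 sub-matrix formed. The matrix entries for $M$ are found as follows.
Start with the two relations $t_j = \langle t|\sigma_j|t\rangle$ and $t_j = \mathrm{Tr}(M\sigma_j)$, equations (2.5.24) on page 55 and (2.5.68) on page 61, respectively. The index $j$ runs over $j = 0, 1, 2, 3$. Moreover, assume the system $|t\rangle = J|s\rangle$. Identification with $\langle t|t\rangle = \mathrm{Tr}(|t\rangle\langle t|)$ gives

$$t_j = \langle t|\sigma_j|t\rangle = \mathrm{Tr}\left(|t\rangle\langle t|\,\sigma_j\right) = \mathrm{Tr}\left(J|s\rangle\langle s|J^\dagger\sigma_j\right) \tag{2.6.10}$$

Next, the outer product $|s\rangle\langle s|$ is replaced with something close to the spin-vector form (2.5.29) on page 56. Since the vector is not necessarily normalized, the expression $|s\rangle\langle s| = \tfrac{1}{2}\langle s|s\rangle\,(I + \hat s\cdot\vec\sigma)$ will be used. Thus,

$$\begin{aligned} \mathrm{Tr}\left(J|s\rangle\langle s|J^\dagger\sigma_j\right) &= \tfrac{1}{2}\langle s|s\rangle\,\mathrm{Tr}\left(J\,(I + \hat s\cdot\vec\sigma)\,J^\dagger\sigma_j\right) \\ &= \tfrac{1}{2}\langle s|s\rangle\,\mathrm{Tr}\left(J\,(\hat s\cdot\vec\sigma)\,J^\dagger\sigma_j\right) \\ &= \tfrac{1}{2}\sum_{k=0}^{3}\mathrm{Tr}\left(J\sigma_kJ^\dagger\sigma_j\right)s_k \end{aligned} \tag{2.6.11}$$

where $\vec\sigma$ has been loosely indexed here to include $\sigma_0$, $s_k = \langle s|s\rangle\,\hat s_k$, and the common phase in $T$ commutes with $(\hat s\cdot\vec\sigma)$ and is eliminated. The last equation has the matrix multiplication form

$$t_j = \sum_{k=0}^{3}M_{j+1,k+1}\,s_k \tag{2.6.12}$$

Identification of the matrix entries gives the final expression

$$M_{j+1,k+1} = \tfrac{1}{2}\mathrm{Tr}\left(J\sigma_kJ^\dagger\sigma_j\right) \tag{2.6.13}$$

The specialized case of $J = U$ is of immediate interest. Substitution of $U$ for $J$ in (2.6.13) shows that for $k = 0$, $j \neq 0$, and for $j = 0$, $k \neq 0$, the matrix entries are identically zero. This is because $\mathrm{Tr}(\sigma_j) = 0$. Other than the $M_{1,1}$ matrix entry, which is unity, only the sub-matrix $R$ survives the trace. Explicitly, the matrix entries for $R$ given a matrix form of $U$ are

$$R_{j,k} = \tfrac{1}{2}\mathrm{Tr}\left(U\sigma_kU^\dagger\sigma_j\right) \tag{2.6.14}$$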
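The trace formula (2.6.14) translates directly into a few lines of code. The sketch below is an illustration under stated assumptions: the unitary matrix is built from the closed form $I\cos(\varphi/2) - j(\hat r\cdot\vec\sigma)\sin(\varphi/2)$, and the helper names, axis, angle, and test state are arbitrary choices made here.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def unitary(r_hat, phi):
    """U = I cos(phi/2) - j (r_hat . sigma) sin(phi/2)."""
    return np.eye(2) * np.cos(phi / 2) - 1j * np.sin(phi / 2) * np.tensordot(r_hat, SIGMA, axes=1)

def rotation_from_unitary(U):
    """R[j, k] = (1/2) Tr(U sigma_k U^dagger sigma_j), per (2.6.14)."""
    Ud = U.conj().T
    return np.array([[0.5 * np.trace(U @ SIGMA[k] @ Ud @ SIGMA[j]).real
                      for k in range(3)] for j in range(3)])

def stokes(ket):
    """Stokes 3-vector <s|sigma|s> of a Jones vector."""
    return np.real(np.array([np.vdot(ket, sk @ ket) for sk in SIGMA]))

r_hat = np.array([2.0, -1.0, 2.0]) / 3.0      # unit axis
U = unitary(r_hat, 0.9)
R = rotation_from_unitary(U)

ket = np.array([0.6, 0.8j])                    # normalized input state
t_via_jones = stokes(U @ ket)                  # transform in Jones space, then map
t_via_stokes = R @ stokes(ket)                 # map, then rotate in Stokes space

R_from_minus_U = rotation_from_unitary(-U)     # two-to-one correspondence
```

Both routes give the same output Stokes vector, `R` is orthogonal with unit determinant, and `-U` produces the same `R`, which is the two-to-one property noted in the group discussion.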

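Table 2.1. Elementary Rotations in Jones and Stokes Space

$$\begin{array}{lll} \kappa = 0 & U = \begin{pmatrix} e^{j\phi} & 0 \\ 0 & e^{-j\phi} \end{pmatrix} & R_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos 2\phi & \sin 2\phi \\ 0 & -\sin 2\phi & \cos 2\phi \end{pmatrix} \\[3ex] \phi = 0,\ \psi = \pi/2 & U = \begin{pmatrix} \cos\kappa & j\sin\kappa \\ j\sin\kappa & \cos\kappa \end{pmatrix} & R_2 = \begin{pmatrix} \cos 2\kappa & 0 & -\sin 2\kappa \\ 0 & 1 & 0 \\ \sin 2\kappa & 0 & \cos 2\kappa \end{pmatrix} \\[3ex] \phi = \psi = 0 & U = \begin{pmatrix} \cos\kappa & \sin\kappa \\ -\sin\kappa & \cos\kappa \end{pmatrix} & R_3 = \begin{pmatrix} \cos 2\kappa & \sin 2\kappa & 0 \\ -\sin 2\kappa & \cos 2\kappa & 0 \\ 0 & 0 & 1 \end{pmatrix} \end{array}$$

Using the general form of $U$ given in (2.4.26) on page 51, $R$ is resolved as

$$R = \begin{pmatrix} \cos 2\kappa & \cos(\phi - \psi)\sin 2\kappa & \sin(\phi - \psi)\sin 2\kappa \\ -\cos(\phi + \psi)\sin 2\kappa & \cos 2\phi\cos^2\kappa - \cos 2\psi\sin^2\kappa & \sin 2\phi\cos^2\kappa + \sin 2\psi\sin^2\kappa \\ \sin(\phi + \psi)\sin 2\kappa & -\sin 2\phi\cos^2\kappa + \sin 2\psi\sin^2\kappa & \cos 2\phi\cos^2\kappa + \cos 2\psi\sin^2\kappa \end{pmatrix} \tag{2.6.15}$$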
Calculation of the determinant gives

$$\det(R) = 1 \tag{2.6.16}$$

which verifies that $R$ is invertible, unitary, and contains no reflections.
Table 2.1 gives the three elementary rotations about the Stokes axes $(S_1, S_2, S_3)$ and their original unitary form. Notice that all angles which appear in the unitary matrices are doubled in the corresponding Stokes matrices. Stokes angles are twice the physical, or laboratory, angles. This is also why the Jones to Stokes isomorphism is two-to-one: rotation by $\pi$ in Jones space is invariant, and so is rotation by $2\pi$ in Stokes space.
2.6.3 Vector Expression of R in a Local Coordinate System
The vector form of $R$ abstracts away any notion of an underlying, fixed coordinate system. Rather, each operation $R$ has its own local coordinate system based on the eigenvectors and spin direction of $R$. The vectorial form of $R$ gives the highest level of geometric interpretation to transformation mechanics in Stokes space.

The vector expression for $R$ is derived from the vector form of $U$. The operator $U$ is resolved into its eigenvector-based projectors using (2.4.15) on page 48; the resolution for a two-dimensional system gives

$$U = e^{-j\varphi/2}\,|r_+\rangle\langle r_+| + e^{j\varphi/2}\,|r_-\rangle\langle r_-| \tag{2.6.17}$$

where the projectors are equated to the spin-vector form via

$$|r_\pm\rangle\langle r_\pm| = \tfrac{1}{2}\,(I \pm \hat r\cdot\vec\sigma) \tag{2.6.18}$$

Note that $\hat r_+ = -\hat r_-$ are orthogonal Stokes coordinates. In the following, $\hat r$ will denote the eigenvector with the positive subscript. Substitution of the spin-vector form of the projector into $U$ gives

$$U = I\cos(\varphi/2) - j(\hat r\cdot\vec\sigma)\sin(\varphi/2) \tag{2.6.19}$$

Equation (2.6.19) is now in familiar form and can be mapped to the exponential equivalent as

$$U = e^{-j(\varphi/2)(\hat r\cdot\vec\sigma)} \tag{2.6.20}$$

where $(\varphi/2)(\hat r\cdot\vec\sigma)$ is the Hermitian operator associated with $U$.
Substitution of (2.6.19) into the equivalence relation (2.6.4), and applying the spin-vector identities (2.5.35), (2.5.36), and (2.5.49) produces

$$R\,\vec\sigma = \cos\varphi\;\vec\sigma + (1 - \cos\varphi)\,\hat r(\hat r\cdot\vec\sigma) + \sin\varphi\,(\hat r\times\vec\sigma) \tag{2.6.21}$$

Since each term on the left- and right-hand sides of (2.6.21) operates on $\vec\sigma$, one can extract the embedded relation for $R$

$$R = \cos\varphi\;I + (1 - \cos\varphi)(\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times) \tag{2.6.22}$$

Grouping like terms gives

$$R = (\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times) + \cos\varphi\,\big(I - (\hat r\hat r\cdot)\big) \tag{2.6.23}$$

Recalling the vector identity $\vec a\times(\vec b\times\vec c) = \vec b\,(\vec a\cdot\vec c) - \vec c\,(\vec a\cdot\vec b)$, the last term on the right-hand side is identified as

$$(\hat r\times)(\hat r\times) = (\hat r\hat r\cdot) - I \tag{2.6.24}$$

The final vectorial form of $R$ is then

$$R = (\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times) - \cos\varphi\,(\hat r\times)(\hat r\times) \tag{2.6.25}$$

Equation (2.6.25) is a beautifully compact expression for the action any unitary operator has on a polarization vector. The vector $\hat r$ points in the direction of the positive eigenvector of $U$. The vector operators $\{(\hat r\hat r\cdot),\ (\hat r\times),\ (\hat r\times)(\hat r\times)\}$ form a local orthonormal basis. The local basis requires a vector about which the basis can be fully resolved; for instance, operation on state $\hat s$ generates the basis $(\hat r,\ \hat r\times\hat s,\ \hat r\times\hat r\times\hat s)$. In the absence of being fully resolved, the local basis has immutable properties that are independent of the resolving vector.

Fig. 2.4. Vector components of rotation operator $R$. a) The local orthonormal basis $\{(\hat r\hat r\cdot),\ (\hat r\times),\ (\hat r\times)(\hat r\times)\}$ as resolved on $\hat s$. b) Transformation to $\hat t$ from $\hat s$ via precession about $\hat r$, travelling through precession angle $\varphi$.
Figure 2.4(a) illustrates the local basis resolved by $\hat s$. Vector $\hat s$ in relation to $\hat r$ defines a precession circle, the circle about which $\hat s$ travels. Local axis $(\hat r\hat r\cdot)$ always points parallel to $\hat r$. The local axes $(\hat r\times)$, $(\hat r\times)(\hat r\times)$ define the plane of the precession circle and are perpendicular to $\hat r$. The local axis $(\hat r\times)$ is tangent to the precession circle and $(\hat r\times)(\hat r\times)$ points to the origin of the precession circle. The particular pointing directions of $(\hat r\times)$ and $(\hat r\times)(\hat r\times)$, while always in the precession plane, are determined only after determination of $\hat s$. Figure 2.4(b) illustrates transformation to state $\hat t$ from $\hat s$ about $\hat r$. The precession angle is $\varphi$ and the precession direction follows the right-hand rule.
Since the motion of precession is so central in the description of polarization transformation mechanics, Fig. 2.5 is included to describe precession in a local coordinate system. Consider the input state $\hat s$ and the precession axis $\hat r$. The precession axis can be the birefringent axis of a dielectric medium or the principal-state-of-polarization axis used to describe polarization-mode dispersion. In any case, the angle $\delta$ separates the two vectors. The motion of precession is to turn $\hat s$ about $\hat r$ in a circle while keeping the angle $\delta$ fixed. This is the same motion a gyroscope exhibits under gravitational influence. The angle subtended by projections of states $\hat s$ and $\hat t$ onto the base circle is the precession angle. The differential equation of motion can be deduced from $R$ in local-coordinate form. Consider state $\hat s$ that undergoes a small change in angle $\Delta\varphi$. The motion is

$$\hat s + \Delta\hat s = R\,\hat s \tag{2.6.26}$$

Taking $R$ in the form (2.6.22) and simplifying for small angles,

$$\hat s + \Delta\hat s = I\,\hat s + \Delta\varphi\;\hat r\times\hat s \tag{2.6.27}$$

which is rewritten in differential form as

$$\frac{d\hat s}{d\varphi} = \hat r\times\hat s \tag{2.6.28}$$

The $\hat r\times\hat s$ term dictates that $\hat s$ moves perpendicular to $\hat r$.

Fig. 2.5. Precessional motion of $\hat s$ about $\hat r$, passing through state $\hat t$. Angle $\delta$ remains fixed, while angle $\varphi$, as projected onto the base, is the degree of precession.


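Equation (2.6.22) can be used as a template to construct a matrix representation for $R$ given any $\hat r$ and $\varphi$. The matrix representations for $\hat r\hat r\cdot$ and $\hat r\times$ are

$$(\hat r\hat r\cdot) = \begin{pmatrix} r_1r_1 & r_1r_2 & r_1r_3 \\ r_2r_1 & r_2r_2 & r_2r_3 \\ r_3r_1 & r_3r_2 & r_3r_3 \end{pmatrix}, \qquad (\hat r\times) = \begin{pmatrix} 0 & -r_3 & r_2 \\ r_3 & 0 & -r_1 \\ -r_2 & r_1 & 0 \end{pmatrix} \tag{2.6.29}$$

The first matrix is a projector, as verified by $\det(\hat r\hat r\cdot) = 0$. The second matrix is derived from $\hat r\times = \hat r\times\hat S_1 + \hat r\times\hat S_2 + \hat r\times\hat S_3$, where $\hat S_{1,2,3}$ are the three Stokes axes; that is, the columns of $(\hat r\times)$ are $\hat r\times\hat S_1$, $\hat r\times\hat S_2$, and $\hat r\times\hat S_3$.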
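The template can be written out as a small numerical sketch. It assumes the right-handed form $R = \cos\varphi\,I + (1-\cos\varphi)(\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times)$ and the equivalent form built from $(\hat r\times)(\hat r\times)$; function names, the axis, and the test state are illustrative.

```python
import numpy as np

def dyad(r_hat):
    """Matrix of the operator (r r.), i.e. the outer product r r^T."""
    return np.outer(r_hat, r_hat)

def cross_matrix(r_hat):
    """Matrix of the operator (r x), so that cross_matrix(r) @ v == np.cross(r, v)."""
    r1, r2, r3 = r_hat
    return np.array([[0.0, -r3, r2],
                     [r3, 0.0, -r1],
                     [-r2, r1, 0.0]])

def rotation(r_hat, phi):
    """R = cos(phi) I + (1 - cos(phi)) (r r.) + sin(phi) (r x)."""
    return (np.cos(phi) * np.eye(3)
            + (1.0 - np.cos(phi)) * dyad(r_hat)
            + np.sin(phi) * cross_matrix(r_hat))

def rotation_local_basis(r_hat, phi):
    """R = (r r.) + sin(phi) (r x) - cos(phi) (r x)(r x)."""
    rx = cross_matrix(r_hat)
    return dyad(r_hat) + np.sin(phi) * rx - np.cos(phi) * (rx @ rx)

r_hat = np.array([1.0, 2.0, 2.0]) / 3.0
s_hat = np.array([0.0, 0.6, 0.8])
phi = 0.75

R_a = rotation(r_hat, phi)
R_b = rotation_local_basis(r_hat, phi)

# Small-angle limit: (R s - s) / dphi approaches r x s.
dphi = 1e-6
finite_difference = (rotation(r_hat, dphi) @ s_hat - s_hat) / dphi
```

The two forms coincide, the axis $\hat r$ is left unchanged by $R$, and the finite difference reproduces the precession rule $d\hat s/d\varphi = \hat r\times\hat s$ to within the step size.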
2.6.4 Select Vector Identities
Figure 2.6 illustrates two identities that have a simple geometric interpretation. The identity

$$\hat r\cdot(R\,\vec a) = \hat r\cdot\vec a \tag{2.6.30}$$

is illustrated by Fig. 2.6(a). The rotation of $\vec a$ generated by $R$ about $\hat r$ through angle $\varphi$ does not change the angle between $\hat r$ and $\vec a$. Thus the dot product may be taken with or without the intermediate rotation. The identity

$$(R\,\vec a)\cdot(R\,\vec b) = \vec a\cdot\vec b \tag{2.6.31}$$

is illustrated by Fig. 2.6(b). The rotation due to $R$ does not change the angle between vectors $\vec a$ and $\vec b$, so again the dot product may be taken with or without the intermediate rotation.

Fig. 2.6. Geometric representation of two rotational identities. a) $\hat r\cdot(R\,\vec a) = \hat r\cdot\vec a$. b) $(R\,\vec a)\cdot(R\,\vec b) = \vec a\cdot\vec b$.

2.6.5 Euler Rotations


The Euler rotations are an alternative method to construct the operator $R$ in matrix form (2.6.15). While there are several ways to establish the connection, the simplest uses the fact that multiplication in Jones space corresponds to multiplication in Stokes space: construct the operator $U$ in the form (2.4.26) as a product, and map the factors from Jones to Stokes space. One can verify that

$$U = \begin{pmatrix} e^{ju} & 0 \\ 0 & e^{-ju} \end{pmatrix}\begin{pmatrix} \cos\kappa & \sin\kappa \\ -\sin\kappa & \cos\kappa \end{pmatrix}\begin{pmatrix} e^{jv} & 0 \\ 0 & e^{-jv} \end{pmatrix} \tag{2.6.32}$$

where $u = (\phi + \psi)/2$ and $v = (\phi - \psi)/2$. Identification of the Jones operators in (2.6.32) with the Stokes operators in Table 2.1 gives the equivalent Stokes transformations

$$R = R_1(\phi + \psi)\,R_3(2\kappa)\,R_1(\phi - \psi) \tag{2.6.33}$$

These rotations generate the general rotation matrix in (2.6.15).
Another way to view the unitary operator is to diagonalize the matrix. The eigenvalues of $U$ are complex exponentials that have unity magnitude and they are conjugates of one another: $U|r_\pm\rangle = \exp(\mp j\varphi/2)\,|r_\pm\rangle$. The operator $U$ can be separated as

$$U = V\Lambda V^\dagger \tag{2.6.34}$$

where the matrix $V$ is a 2 × 2 unitary matrix with the two eigenvectors $|r_\pm\rangle$ of $U$ entered as columns of $V$, and where the matrix $\Lambda$ is a diagonal 2 × 2 matrix with the eigenvalues of $U$ on the diagonal. The corresponding Stokes operation is

$$R = R_E\,R_1(\varphi)\,R_E^\dagger \tag{2.6.35}$$

where $R_E$ is the Euler rotation associated with $V$. Now, several observations can be made. If the state at the input is $|r_\pm\rangle$, then the action of $U$ does not change the state, only a phase is contributed. The state is invariant under $U$. For every $|r_\pm\rangle$ there are corresponding $\hat r_\pm$ vectors in Stokes space. The behavior of $R_E^\dagger$ is to rotate $\hat r_+$ to $\hat s_1$; $R_1(\varphi)$ then pirouettes the state about the $\hat s_1$ axis, and $R_E$ returns the state back to $\hat r_+$.
The decomposition of $U$ as in (2.6.34) has much significance in relation to propagation through birefringent media. For example, the propagation constants for ordinary and extraordinary waves in a birefringent medium are $\beta_o = \omega n_o/c$ and $\beta_e = \omega n_e/c$. The eigenvalues of the propagation matrix are $\exp(\pm j(\beta_e - \beta_o)z/2)$. The polarization transformation in Stokes space is accordingly

$$R = R_E\,R_x(\Delta\beta\,z)\,R_E^\dagger \tag{2.6.36}$$

with $\Delta\beta = \beta_e - \beta_o$. The inner matrix $R_x$ creates precession about the $\hat s_1$ axis in Stokes space while the Euler rotation and its adjoint transform the eigenstates of the system onto the $\hat s_1$ axis and then restore the pointing direction of the eigenstate. The precession about $\hat s_1$ is transformed to precession about $\hat r$.
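The diagonalization view can be illustrated numerically: given a unitary matrix, the precession angle and axis are recoverable either from traces or from the eigen-decomposition. This is a sketch under stated assumptions: $U$ is generated from a chosen axis and angle with the form $I\cos(\varphi/2) - j(\hat r\cdot\vec\sigma)\sin(\varphi/2)$, the angle is restricted to $0 < \varphi < 2\pi$, and all names are illustrative.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def stokes(ket):
    """Unit Stokes vector of a Jones vector."""
    s = np.real(np.array([np.vdot(ket, sk @ ket) for sk in SIGMA]))
    return s / np.vdot(ket, ket).real

r_true = np.array([1.0, 2.0, 2.0]) / 3.0
phi_true = 1.1
U = (np.eye(2) * np.cos(phi_true / 2)
     - 1j * np.sin(phi_true / 2) * np.tensordot(r_true, SIGMA, axes=1))

# Route 1: traces. Tr(U)/2 = cos(phi/2) and Tr(U sigma)/2 = -j sin(phi/2) r_hat.
phi = 2.0 * np.arccos(np.clip(0.5 * np.trace(U).real, -1.0, 1.0))
a = np.array([0.5 * np.trace(U @ sk) for sk in SIGMA])
r_hat = np.real(1j * a) / np.sin(phi / 2)

# Route 2: eigen-decomposition U = V Lambda V^dagger.
eigvals, V = np.linalg.eig(U)
U_rebuilt = V @ np.diag(eigvals) @ V.conj().T

# The eigenvector with eigenvalue exp(-j phi/2) maps to +r_hat in Stokes space.
idx = np.argmin(np.angle(eigvals))
r_from_eigenvector = stokes(V[:, idx])
eigen_phases = np.sort(np.angle(eigvals))
```

Both routes return the same axis, and the eigenvalue phases are $\mp\varphi/2$. The eigenvectors are invariant under $U$ apart from that phase, which is why their Stokes images sit on the rotation axis.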
2.6.6 Some Relevant Transformation Applications
Four examples are presented here to give some illustrative detail on how to use the Stokes transformation operator $R$ cast in local-coordinate form. First, differential precession rules for a single homogeneous birefringent section are written. Then, the polarization evolution through a concatenation of two misaligned birefringent sections, as a function of length, is illustrated. Third, the shortest distance between two polarization states is found. Finally, uniform and biased polarization scattering examples are given.
For polarization studies, the axis $\hat r$ is the extraordinary axis of a birefringent medium. For a birefringent crystal the birefringent axis lies in the equatorial plane in Stokes space. An input state $\hat s$ precesses about $\hat r$ as the state propagates through the medium. The retardation of a birefringent plate is just the angle through which the state precesses: $\varphi = \omega\Delta nL/c$, albeit any two rotations of $\varphi$ that are the same modulo $2\pi$ yield identical Stokes transformations. The angle $\varphi$ is called the birefringent phase.
The differential equation of motion for a state of polarization as it evolves through a homogeneous birefringent material is the same for differentials in either position or frequency. The retardation as a function of length $z$ and radial frequency $\omega$ is

$$\varphi = \frac{\omega\,\Delta n\,z}{c} \tag{2.6.37}$$

Moreover, the birefringent axis is parallel to $\hat r$. Differentiation of (2.6.37) with respect to $z$ and substitution into (2.6.28) gives

$$\frac{d\hat s}{dz} = \vec\beta\times\hat s \tag{2.6.38}$$

where $\vec\beta = (\omega\Delta n/c)\,\hat r$. Equally possible is differentiation with respect to $\omega$, which gives

$$\frac{d\hat s}{d\omega} = \vec\tau\times\hat s \tag{2.6.39}$$

where, from (2.6.37), $\vec\tau = (\Delta n\,z/c)\,\hat r$. The differential precession rule for a single homogeneous birefringent section is the same whether the position or frequency changes. This simplicity is quickly broken when two or more homogeneous sections are concatenated. Birefringent concatenation is in the category of polarization-mode dispersion.

Fig. 2.7. a) Polarization evolution through two misaligned birefringent sections. Input state (a) precesses about $\hat r_1$ to state (b). That state enters the second stage, precesses about $\hat r_2$, and leaves as state (c). b) Construction of great circle through states $\hat s$ and $\hat t$. The normal to the circle is $\hat r$.
The polarization state evolution through two misaligned birefringent sections as a function of length can be evaluated using (2.6.25) in the following way. Since the media are misaligned, their birefringent axes are not parallel; that is, $\hat r_1 \neq \hat r_2$. Figure 2.7(a) illustrates the polarization evolution through the sections. The input state, arbitrarily selected, is located at position (a). That state precesses about $\hat r_1$ through angle $\varphi_1$, dictated by the length and birefringence of the section, as well as the input frequency. The output polarization from the first section is located at position (b). That state then enters the second section which transforms it about $\hat r_2$. The polarization state now traces a second circle that is different from the first. The output state is eventually located at position (c). The aggregate polarization transformation is calculated by the concatenation of $R_2$ and $R_1$. The compounded polarization transformation is

$$R_2R_1 = \Big[(\hat r_2\hat r_2\cdot) + \sin\varphi_2\,(\hat r_2\times) - \cos\varphi_2\,(\hat r_2\times)(\hat r_2\times)\Big] \tag{2.6.40}$$

$$\qquad\quad \times\Big[(\hat r_1\hat r_1\cdot) + \sin\varphi_1\,(\hat r_1\times) - \cos\varphi_1\,(\hat r_1\times)(\hat r_1\times)\Big] \tag{2.6.41}$$

Each transformation is denoted by a unique index and the concatenation is written right-to-left as is usual for matrix multiplication. Sometimes the vector products offer simplifications that can reduce the complexity of the overall motion.
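The two-section concatenation can be evaluated numerically in both spaces. The sketch below assumes two linear birefringent axes in the equatorial plane of Stokes space with illustrative angles; the helper names are hypothetical, and the Jones-space matrices use the form $I\cos(\varphi/2) - j(\hat r\cdot\vec\sigma)\sin(\varphi/2)$.

```python
import numpy as np

SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def rotation(r_hat, phi):
    """Stokes-space rotation about r_hat through phi (right-hand rule)."""
    rx = np.array([[0.0, -r_hat[2], r_hat[1]],
                   [r_hat[2], 0.0, -r_hat[0]],
                   [-r_hat[1], r_hat[0], 0.0]])
    return np.outer(r_hat, r_hat) + np.sin(phi) * rx - np.cos(phi) * (rx @ rx)

def unitary(r_hat, phi):
    """Jones-space counterpart of rotation(r_hat, phi)."""
    return np.eye(2) * np.cos(phi / 2) - 1j * np.sin(phi / 2) * np.tensordot(r_hat, SIGMA, axes=1)

def rotation_from_unitary(U):
    """R[j, k] = (1/2) Tr(U sigma_k U^dagger sigma_j)."""
    Ud = U.conj().T
    return np.array([[0.5 * np.trace(U @ SIGMA[k] @ Ud @ SIGMA[j]).real
                      for k in range(3)] for j in range(3)])

# Two misaligned birefringent axes in the equatorial plane (r3 = 0).
r1 = np.array([1.0, 0.0, 0.0])
r2 = np.array([np.cos(0.8), np.sin(0.8), 0.0])
phi1, phi2 = 1.3, 0.7

R_total = rotation(r2, phi2) @ rotation(r1, phi1)     # section 1 acts first
U_total = unitary(r2, phi2) @ unitary(r1, phi1)
R_from_U = rotation_from_unitary(U_total)

R_swapped = rotation(r1, phi1) @ rotation(r2, phi2)   # sections in reverse order
```

`R_total` agrees with the rotation derived from the Jones-space product, and it differs from `R_swapped`: the order of misaligned sections matters.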

Fig. 2.8. Uniform and biased scattering through operator $R$. a) Uniform scattering. $\hat r$ points in any direction with equal likelihood. $\varphi$ is uniformly distributed. b) Biased scattering, $a = 0.05$. $\tilde R$ is constructed along $\hat s_3$ and oriented toward $\hat s_o$.

As another application, the shortest distance between points $\hat s$ and $\hat t$ on a unit sphere is along a great circle. The axis normal to the great circle is evidently

$$\hat r = \frac{\hat s\times\hat t}{|\hat s\times\hat t|} \tag{2.6.42}$$

and the rotation angle between the two points is

$$\cos\varphi_{st} = \hat s\cdot\hat t \tag{2.6.43}$$

Figure 2.7(b) illustrates the motion. $\hat r$ is derived from the cross product of $\hat s$ and $\hat t$. Angle $\varphi_{st}$ rotates $\hat s$ through to $\hat t$.
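The great-circle construction can be sketched in a few lines. The example assumes two non-parallel unit Stokes vectors (the cross product vanishes for parallel or antiparallel states, where the axis is not unique); the states and helper names are illustrative.

```python
import numpy as np

def rotation(r_hat, phi):
    """Rotation about unit axis r_hat through angle phi (right-hand rule)."""
    rx = np.array([[0.0, -r_hat[2], r_hat[1]],
                   [r_hat[2], 0.0, -r_hat[0]],
                   [-r_hat[1], r_hat[0], 0.0]])
    return np.cos(phi) * np.eye(3) + (1.0 - np.cos(phi)) * np.outer(r_hat, r_hat) + np.sin(phi) * rx

def great_circle(s_hat, t_hat):
    """Return the axis and angle that carry s_hat to t_hat along a great circle."""
    normal = np.cross(s_hat, t_hat)
    r_hat = normal / np.linalg.norm(normal)          # undefined if s and t are (anti)parallel
    phi_st = np.arccos(np.clip(s_hat @ t_hat, -1.0, 1.0))
    return r_hat, phi_st

s_hat = np.array([2.0, 1.0, 2.0]) / 3.0
t_hat = np.array([0.0, 0.6, 0.8])

r_hat, phi_st = great_circle(s_hat, t_hat)
t_reached = rotation(r_hat, phi_st) @ s_hat
```

The rotated state `t_reached` lands on `t_hat`, and the axis is perpendicular to both endpoints, as expected for the normal of the great circle.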
Uniform and biased polarization scattering is useful in connection with polarization-mode dispersion fiber-modelling calculations. The scattering process occurs between any two adjacent birefringent sections and is intended to model the relative alignments of the respective birefringent axes. A uniform scattering process sends the polarization state at the output of one section, $\hat s_o$, to any point on the Poincaré sphere with equal probability. That state, $\hat s_i$, is then input to the next birefringent section. The biased scattering process weights the scattering along a predetermined direction, often the direction of $\hat s_o$. For either uniform or biased scattering an operator $R$ needs to be constructed.
There are two variables contained in $R$, (2.6.25): pointing direction $\hat r$ and precession angle $\varphi$. Direction $\hat r$ itself has two independent variables, the polar angles of declination and azimuth. Combined, $R$ has three independent variables. The random process is derived using the unit deviate $\tilde u$, drawn independently for each variable. To have $\hat r$ point in any direction on the unit sphere with equal likelihood, the azimuth angle and position along the $\hat s_3$ axis are both uniformly distributed. Also, the precession angle is uniformly distributed to generate precessions with equal likelihood. The random variable expressions are

$$\tilde r_3 = 2\tilde u - 1 \tag{2.6.44a}$$

$$\tilde\xi = \pi\,(2\tilde u - 1) \tag{2.6.44b}$$

$$\tilde\varphi = \pi\,(2\tilde u - 1) \tag{2.6.44c}$$

where $\tilde\xi$ is the azimuth angle. Relating $\tilde r_3$ to the polar angle $\tilde\theta$ as $\tilde r_3 = \cos\tilde\theta$, the remaining coordinates are

$$\tilde r_1 = \sin\tilde\theta\cos\tilde\xi \tag{2.6.45a}$$

$$\tilde r_2 = \sin\tilde\theta\sin\tilde\xi \tag{2.6.45b}$$

The random variable $\tilde R$ can now be constructed. Since uniform scattering is completely symmetric on the unit sphere, the pointing direction $\hat r$ does not need to be oriented toward $\hat s_o$. Figure 2.8(a) illustrates an output state scattered on the Poincaré sphere.
Biased scattering is used to enhance the likelihood of rare events, rare events being those where multiple birefringent axes are preferentially aligned, misaligned, or some other construction. To preferentially align the birefringent sections, $\hat r$ should be biased to point toward $\hat s_o$. A simple way to generate the bias is first to bias $\hat r$ towards the $\hat s_3$ direction using the following formula:

$$\tilde r_3 = 2\tilde u^{1/a} - 1 \tag{2.6.46}$$

For $a \gg 1$ the bias is toward $+\hat s_3$ and for $a \ll 1$ the bias is toward $-\hat s_3$. The scattering operator $\tilde R$ is now biased toward $\hat s_3$ and is denoted $\tilde R_3$. Before $\tilde R_3$ can be applied to $\hat s_o$ the former needs to be rotated into the latter. Following the previous example of the shortest distance between two points in Stokes space, a deterministic operator $R_{3\to s_o}$ is constructed to perform the required rotation. Operator $R_{3\to s_o}$ needs to be calculated only once. Figure 2.8(b) illustrates an output state scattered on the Poincaré sphere and biased toward $\hat s_o$.
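The sampling recipe for uniform and biased scattering can be sketched as follows. Everything here is an illustration under stated assumptions: the biasing formula is taken as $\tilde r_3 = 2\tilde u^{1/a} - 1$, for which the mean of $\tilde r_3$ works out to $(a-1)/(a+1)$; the function names, seed, sample size, and the target state `s_o` are arbitrary; and `s_o` must not be parallel to the $\hat s_3$ axis for the great-circle rotation to be defined.

```python
import numpy as np

rng = np.random.default_rng(7)

def rotation(r_hat, phi):
    """Rotation about unit axis r_hat through angle phi (right-hand rule)."""
    rx = np.array([[0.0, -r_hat[2], r_hat[1]],
                   [r_hat[2], 0.0, -r_hat[0]],
                   [-r_hat[1], r_hat[0], 0.0]])
    return np.cos(phi) * np.eye(3) + (1.0 - np.cos(phi)) * np.outer(r_hat, r_hat) + np.sin(phi) * rx

def sample_axes(n, a=1.0):
    """Draw n unit axes; a = 1 is uniform on the sphere, a != 1 biases along s3."""
    r3 = 2.0 * rng.random(n) ** (1.0 / a) - 1.0          # position along s3
    azimuth = np.pi * (2.0 * rng.random(n) - 1.0)        # uniform on (-pi, pi)
    sin_polar = np.sqrt(1.0 - r3 ** 2)
    return np.column_stack((sin_polar * np.cos(azimuth),
                            sin_polar * np.sin(azimuth),
                            r3))

def sample_angles(n):
    """Precession angles, uniform on (-pi, pi)."""
    return np.pi * (2.0 * rng.random(n) - 1.0)

def rotation_between(u_hat, v_hat):
    """Great-circle rotation carrying u_hat to v_hat (vectors must not be parallel)."""
    normal = np.cross(u_hat, v_hat)
    return rotation(normal / np.linalg.norm(normal),
                    np.arccos(np.clip(u_hat @ v_hat, -1.0, 1.0)))

n = 20000
uniform_axes = sample_axes(n)
biased_axes = sample_axes(n, a=20.0)                     # mean r3 = 19/21 for a = 20

# Deterministic rotation of the s3 axis onto s_o, computed once.
s_o = np.array([0.6, 0.0, 0.8])
R_3_to_so = rotation_between(np.array([0.0, 0.0, 1.0]), s_o)
oriented_axes = biased_axes @ R_3_to_so.T                # rotate every sampled axis

# One realization of the biased scattering operator.
R_scatter = rotation(oriented_axes[0], sample_angles(1)[0])
```

In this sketch the bias is applied to the axis first and the fixed rotation `R_3_to_so` then points the biased distribution toward `s_o`; the uniform case needs no such orientation step.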

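Table 2.2. Jones and Stokes Equivalent Expressions

$$\begin{array}{ll} \textbf{Jones expressions} & \textbf{Stokes expressions} \\[1ex] |s\rangle = \begin{pmatrix} s_x e^{j\phi_x} \\ s_y e^{j\phi_y} \end{pmatrix} & \hat s = \begin{pmatrix} s_1 \\ s_2 \\ s_3 \end{pmatrix} \\[3ex] |t\rangle = T|s\rangle & \hat t = R\,\hat s \\[1ex] |t\rangle = T_2T_1|s\rangle & \hat t = R_2R_1\,\hat s \\[1ex] T^\dagger T = TT^\dagger = I & RR^\dagger = R^\dagger R = I \\[1ex] |s\rangle = \hat s\cdot\vec\sigma\,|s\rangle & \hat s = \langle s|\vec\sigma|s\rangle \\[1ex] |s\rangle\langle s| = \tfrac{1}{2}(I + \hat s\cdot\vec\sigma) & \hat s = \mathrm{Tr}\left(|s\rangle\langle s|\,\vec\sigma\right) \\[1ex] U^\dagger\vec\sigma\,U & R\,\vec\sigma \\[1ex] U = I\cos(\varphi/2) - j(\hat r\cdot\vec\sigma)\sin(\varphi/2) & R = (\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times) - \cos\varphi\,(\hat r\times)(\hat r\times) \\[1ex] U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} & R_{j,k} = \tfrac{1}{2}\mathrm{Tr}\left(U\sigma_kU^\dagger\sigma_j\right) \\[3ex] U_1 = \begin{pmatrix} e^{j\phi} & 0 \\ 0 & e^{-j\phi} \end{pmatrix} & R_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos 2\phi & \sin 2\phi \\ 0 & -\sin 2\phi & \cos 2\phi \end{pmatrix} \\[3ex] U_2 = \begin{pmatrix} \cos\kappa & j\sin\kappa \\ j\sin\kappa & \cos\kappa \end{pmatrix} & R_2 = \begin{pmatrix} \cos 2\kappa & 0 & -\sin 2\kappa \\ 0 & 1 & 0 \\ \sin 2\kappa & 0 & \cos 2\kappa \end{pmatrix} \\[3ex] U_3 = \begin{pmatrix} \cos\kappa & \sin\kappa \\ -\sin\kappa & \cos\kappa \end{pmatrix} & R_3 = \begin{pmatrix} \cos 2\kappa & \sin 2\kappa & 0 \\ -\sin 2\kappa & \cos 2\kappa & 0 \\ 0 & 0 & 1 \end{pmatrix} \\[3ex] \dfrac{d|s\rangle}{d\varphi} = -\dfrac{j}{2}\,(\hat r\cdot\vec\sigma)\,|s\rangle & \dfrac{d\hat s}{d\varphi} = \hat r\times\hat s \end{array}$$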

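Table 2.3. Spin-Vector Expressions

$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \quad \sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad \sigma_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_3 = \begin{pmatrix} 0 & -j \\ j & 0 \end{pmatrix}$$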

= 2

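$$a_0 I + \vec a\cdot\vec\sigma = \begin{pmatrix} a_0 + a_1 & a_2 - ja_3 \\ a_2 + ja_3 & a_0 - a_1 \end{pmatrix}$$

$$R = (\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times) + \cos\varphi\,\big(I - (\hat r\hat r\cdot)\big)$$

$$\hat r\hat r\cdot = \begin{pmatrix} r_1r_1 & r_1r_2 & r_1r_3 \\ r_2r_1 & r_2r_2 & r_2r_3 \\ r_3r_1 & r_3r_2 & r_3r_3 \end{pmatrix}, \qquad \hat r\times = \begin{pmatrix} 0 & -r_3 & r_2 \\ r_3 & 0 & -r_1 \\ -r_2 & r_1 & 0 \end{pmatrix}$$

$$H = \exp(\alpha_0/2)\exp(\vec\alpha\cdot\vec\sigma/2) = e^{\alpha_0/2}\left[\,I\cosh(\alpha/2) + (\hat\alpha\cdot\vec\sigma)\sinh(\alpha/2)\right]$$

$$T = \exp(-j\beta_0/2)\exp\left(-j\vec\beta\cdot\vec\sigma/2\right) = e^{-j\beta_0/2}\left[\,I\cos(\beta/2) - j(\hat\beta\cdot\vec\sigma)\sin(\beta/2)\right]$$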

References
1. O. Aso, I. Ohshima, and H. Ogoshi, "Unitary-conserving construction of the Jones matrix and its applications to polarization-mode dispersion analysis," Journal of the Optical Society of America A, vol. 14, no. 8, pp. 1988-2005, Aug. 1997.
2. D. M. Brink and G. R. Satchler, Angular Momentum, 3rd ed. Oxford: Oxford Science Publications, 1999.
3. N. Frigo, "A generalized geometric representation of coupled mode theory," IEEE Journal of Quantum Electronics, vol. QE-22, no. 11, pp. 2131-2140, 1986.
4. N. Gisin and B. Huttner, "Combined effects of polarization mode dispersion and polarization dependent losses in optical fibers," Optics Communications, vol. 142, pp. 119-125, Oct. 1997.
5. J. P. Gordon and H. Kogelnik, "PMD fundamentals: Polarization mode dispersion in optical fibers," Proceedings of National Academy of Sciences, vol. 97, no. 9, pp. 4541-4550, Apr. 2000. [Online]. Available: http://www.pnas.org
6. M. Rose, Elementary Theory of Angular Momentum. New York: Dover Publications, 1995.
7. J. J. Sakurai, Modern Quantum Mechanics. New York: Addison-Wesley, 1985.
8. G. Strang, Linear Algebra and its Applications, 3rd ed. New York: Harcourt Brace Jovanovich College Publishers, 1988.

3
Interaction of Light and Dielectric Media

Optical components and waveguiding fiber are designed to control the interaction between light and media. The regime of interest in this text is material transparency, to first order. Transparent glasses, crystals, and garnets are the building blocks from which passive optical components and optical fiber are made. Moreover, the interactions addressed in this text are optically linear in that the material response is assumed linear with field intensity. This is not strictly the case since, for example, optical fiber has a prominent Kerr effect, but linearity suffices for the studies to follow. An equally broad topic is the interaction between light and semiconductors such as diode lasers and optical detectors, but such interactions are not covered here.
The main purpose of this chapter is to introduce elementary classical descriptions for the constitutive relations of isotropic, anisotropic, gyrotropic, and optically active materials, and to detail how these constitutive relations, when included in Maxwell's equations, change the wavefront, power flow, and polarization of light. Glasses and birefringent crystals, principal examples of isotropic and anisotropic materials, are well characterized by classical wave-electron interaction. Faraday rotation induced by diamagnetic materials, a particular example of a gyrotropic material, yields to classical analysis, but a quantum-mechanical model is necessary to describe rotation in ferrimagnetic garnets. Finally, a constitutive relation of optical activity based on a classical description can be sketched to give the flavor, but a detailed dipole-dipole interaction model is really necessary.
There are two levels of treatment for the interaction of light and media. First, a constitutive relation must be found that dictates how the incident field affects the dipole moments (and possibly free electrons) of the material, and how these dipole moments in turn affect the field. Second, Maxwell's equations are solved with the constitutive relation included. The solution of Maxwell's equations, in this context, is completely classical, and a rigorous treatment in the kDB system is provided.

3.1 Introduction of Media Terms into Maxwell's Equations
The modifications to Maxwell's equations to include media terms are treated first. Overall, the equations must show how light behaves within a bulk, homogeneous medium and at the interface into and out of the region. In vacuum an electromagnetic wave is described by electric and magnetic fields E and H. Within a material, however, the wave is described by the electric and magnetic flux densities D and B. The flux densities incorporate the original field and the media reaction. The following sections then classify possible interaction types and recast Maxwell's equations for solution within a material.
When an electromagnetic wave propagates through a material, the electric and magnetic fields couple to bound and free electrons. A material with free electrons is conductive, and any currents that are generated by the field absorb energy from that field and impart loss. A material without free electrons is dielectric. The travelling field forces the bound electrons to oscillate and they, in turn, radiate. In a pure dielectric without impurities, the induced radiation field is not diffuse but instead co- and counter-propagates with the incident field, so, when combined, the total field has a shortened wavelength and slowed energy flow.
There are two interactions to account for when light propagates within a dielectric medium: the radiation field from electric dipoles that are stimulated by the incident field; and the radiation field from magnetic dipoles, or current loops, that are likewise stimulated. The electric field can directly couple to the electric dipoles and the magnetic field can directly couple to the magnetic dipoles. There can also be cross-coupling where the electric field induces magnetic dipole radiation and the magnetic field induces electric dipole radiation, as is the case in a chiral medium.
First the electric-dipole radiation. Including the paired charge density $\rho_p$ and the unpaired (or free) charge density $\rho_u$, Gauss' law for the electric field is

$$\epsilon_0 \nabla\cdot\mathbf{E} = \rho_p + \rho_u \qquad (3.1.1)$$

A dielectric, non-conductive medium has only paired charges since every electron is bound to an atom; the unpaired charge density is zero, $\rho_u = 0$. Without free electrons there is no current, so $\mathbf{J} = 0$ as well. For incident field energies below the ionization energy, the field stimulates the electrons to oscillate. To first order atomic nuclei do not move, so the electrons move closer to and farther from the nucleus to generate an oscillating dipole. The oscillation frequency matches the frequency of the incident field.
The electric dipole moment of the oscillator is $\mathbf{p} = -e\mathbf{r}$, where $-e$ is the electron charge and $\mathbf{r}$ is the vector displacement of the electron from the nucleus, so that $\mathbf{p}$ points in the direction of positive charge. The polarization density vector $\mathbf{P}$ of the media is the product of the individual dipole moments and the number of dipoles N per unit volume, or

$$\mathbf{P} = -N e\,\mathbf{r} \qquad (3.1.2)$$

As the electrons oscillate, the electric charge changes position. A differential volume dV containing a dipole density N has at any time a charge of $q = -N e\,\mathbf{r}\cdot d\mathbf{a}$ exterior to its volume, where $d\mathbf{a}$ is a differential surface-area element. The total net charge Q remaining on the interior comes from integrating across the surface,

$$Q = -\oint_S (-N e\,\mathbf{r})\cdot d\mathbf{a}$$

Rewriting Q in terms of dipole density $\mathbf{P}$, Gauss' integral law shows that

$$\oint_S \mathbf{P}\cdot d\mathbf{a} = \int_V \nabla\cdot\mathbf{P}\,dV$$

The net charge within volume V is then

$$Q = -\int_V \nabla\cdot\mathbf{P}\,dV$$

But by definition, the net charge Q within the same volume due to the paired charge density $\rho_p$ is

$$Q = \int_V \rho_p\,dV$$

Comparing these two expressions for the net charge, the paired charge density is related to the divergence of the polarization density by

$$\rho_p = -\nabla\cdot\mathbf{P} \qquad (3.1.3)$$

Substitution back into Gauss' law gives

$$\nabla\cdot(\epsilon_0\mathbf{E} + \mathbf{P}) = 0 \qquad (3.1.4)$$

So now it is clear how to define the electric-flux density $\mathbf{D}$:

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}, \quad\text{and}\quad \nabla\cdot\mathbf{D} = 0 \qquad (3.1.5)$$

where $\mathbf{D}$ has units of (C/m$^2$) and the electric-flux density has zero divergence.
Next comes magnetic-dipole radiation. Gauss' law for the magnetic field is

$$\mu_0\nabla\cdot\mathbf{H} = 0 \qquad (3.1.6)$$

The divergence of the field is zero because there is no magnetic monopole. Magnetic fields and the currents that generate them must close on themselves. The elementary model for a unit magnet is a current loop having current i circulating along a perfect conductor of radius R enclosing area $\mathbf{a}$, where the direction of $\mathbf{a}$ is normal to the surface element. The magnetic dipole moment $\mathbf{m}$ can be identified as $\mathbf{m} = i\mathbf{a}$. In analogy with the electric polarization density, the magnetization density $\mathbf{M}$ is defined as

$$\mathbf{M} = N\mathbf{m}$$

Now, in the far field, the scalar potentials of an electric dipole and a magnetic dipole have the same form, provided that the magnetic dipole moment is identified as $\mu_0\mathbf{m}$ [8]. So, in analogy with (3.1.3), the magnetic charge density is defined as

$$\rho_m = -\mu_0\nabla\cdot\mathbf{M} \qquad (3.1.7)$$

Introducing $\rho_m$ into Gauss' magnetic law gives

$$\nabla\cdot(\mu_0\mathbf{H} + \mu_0\mathbf{M}) = 0 \qquad (3.1.8)$$

In parallel with the electric-flux density $\mathbf{D}$, the magnetic-flux density $\mathbf{B}$ is defined by

$$\mathbf{B} = \mu_0\mathbf{H} + \mu_0\mathbf{M}, \quad\text{and}\quad \nabla\cdot\mathbf{B} = 0 \qquad (3.1.9)$$

where $\mathbf{B}$ has units of (V-s/m$^2$), or the derived unit of (Tesla).
All materials at the very least exhibit the magnetic effects of nuclear and electronic spin. In general the resultant magnetic fields close on themselves on an atomic scale and therefore average out on a macro-scale. Moreover, a magnetic moment has an angular momentum, and any reorientation of that moment lags a sinusoidal driving field. At optical frequencies the induced magnetization, e.g. of a ferrite, is essentially zero.
Bianisotropic materials exhibit a helical molecular structure or crystalline structure which can combine to couple the electric and magnetic fields via dipole and current excitation. These materials cause optical activity and are of growing importance in telecommunications since liquid crystal structures have a helical component.
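Having defined the electric and magnetic flux densities $\mathbf{D}$ and $\mathbf{B}$, they must be incorporated into Faraday's and Ampère's laws. Faraday's law says that the curl of $\mathbf{E}$ is generated by the time variation of a field $\mathbf{F}$:

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{F}}{\partial t}$$

Taking the divergence of both sides yields

$$\nabla\cdot(\nabla\times\mathbf{E}) = -\frac{\partial}{\partial t}\nabla\cdot\mathbf{F} = 0$$

Since the divergence of a curl is identically zero, the divergence of $\mathbf{F}$ must be constant in time: $\nabla\cdot\mathbf{F} = c_m$. Since $\nabla\cdot\mathbf{B} = 0$, the flux density $\mathbf{B}$ is a suitable field to include in Faraday's law. The same analysis holds for Ampère's law, with $\mathbf{D}$ in place of $\mathbf{B}$. Therefore one can restate Maxwell's equations in terms of $\mathbf{D}$ and $\mathbf{B}$: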

Fig. 3.1. Analysis of electric and magnetic flux density continuity across a boundary. a) Electric flux density continuity determined by a pillbox across the surface. b) Magnetic flux density continuity determined by a loop through the surface.

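$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \qquad (3.1.10a)$$

$$\nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t} \qquad (3.1.10b)$$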

The electric and magnetic flux densities are now fully incorporated into Maxwell's equations. To use these equations, constitutive relations are required that relate the polarization $\mathbf{P}$ and magnetization $\mu_0\mathbf{M}$ to the electric and magnetic fields.
Normal to an interface the electric and magnetic flux densities are continuous. Tangent to an interface the electric and magnetic fields are continuous. These continuity conditions are derived as follows.
The continuity conditions for the fluxes are derived from Gauss' law. Consider a small volume V in the shape of a pillbox that intersects a smooth, charge-free interface between two homogeneous regions denoted by (a) and (b) (Fig. 3.1(a)). The pillbox has height $\delta h$ and faces of area A whose normal is the interface normal. Also, a unit vector $\hat{n}$ points from region (b) to (a). Integration of $\nabla\cdot\mathbf{D} = 0$ over the volume and taking the limit to zero height gives

$$\lim_{\delta h\to 0}\int_V \nabla\cdot\mathbf{D}\,dV = \lim_{\delta h\to 0}\oint_S \mathbf{D}\cdot d\mathbf{a} = \lim_{\delta h\to 0}\left[\hat{n}\cdot\left(\mathbf{D}^{(a)} - \mathbf{D}^{(b)}\right)A + \oint_C \mathbf{D}\cdot d\mathbf{s}\,\delta h\right] = 0$$

where S is the surface enclosing volume V, C is the contour around area element A, and Gauss' integral law is used to transform from volume to surface integrals. The side-wall term vanishes with $\delta h$, so the electric flux density normal to the interface is continuous:

$$\hat{n}\cdot\left(\mathbf{D}^{(a)} - \mathbf{D}^{(b)}\right) = 0 \qquad (3.1.11)$$

The normal component of the electric field, however, is not continuous. Just consider the step between vacuum and a dielectric:

$$\hat{n}\cdot\left(\epsilon_0\mathbf{E}^{(a)} - (\epsilon_0\mathbf{E}^{(b)} + \mathbf{P}^{(b)})\right) = 0$$

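Table 3.1. Inclusion of Electric and Magnetic Media Terms in Maxwell's Equations

Electric terms    Magnetic terms

$\epsilon_0\nabla\cdot\mathbf{E} = -\nabla\cdot\mathbf{P}$    $\mu_0\nabla\cdot\mathbf{H} = -\mu_0\nabla\cdot\mathbf{M}$

$\mathbf{P} = -N e\,\mathbf{r}$    $\mathbf{M} = N\mathbf{m}$

$\nabla\cdot(\epsilon_0\mathbf{E} + \mathbf{P}) = 0$    $\nabla\cdot(\mu_0\mathbf{H} + \mu_0\mathbf{M}) = 0$

$\nabla\times\mathbf{H} = \dfrac{\partial}{\partial t}(\epsilon_0\mathbf{E} + \mathbf{P})$    $\nabla\times\mathbf{E} = -\dfrac{\partial}{\partial t}(\mu_0\mathbf{H} + \mu_0\mathbf{M})$

$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$    $\mathbf{B} = \mu_0\mathbf{H} + \mu_0\mathbf{M}$

$\mathbf{D}$ : (C/m$^2$)    $\mathbf{B}$ : (V·s/m$^2$)

$\hat{n}\cdot\left(\mathbf{D}^{(b)} - \mathbf{D}^{(a)}\right) = 0$    $\hat{n}\cdot\left(\mathbf{B}^{(b)} - \mathbf{B}^{(a)}\right) = 0$

$\hat{n}\times\left(\mathbf{E}^{(b)} - \mathbf{E}^{(a)}\right) = 0$    $\hat{n}\times\left(\mathbf{H}^{(b)} - \mathbf{H}^{(a)}\right) = 0$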

The amplitude of $\mathbf{E}^{(b)}$ must take into account the strength of $\mathbf{P}^{(b)}$ for this relation to hold.
Using Gauss' law for the magnetic flux gives, via the same analysis,

$$\hat{n}\cdot\left(\mathbf{B}^{(a)} - \mathbf{B}^{(b)}\right) = 0 \qquad (3.1.12)$$

The normal component of the magnetic flux density is continuous across an interface.
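The tangential continuity conditions are derived from Faraday's and Ampère's laws. Consider a small loop that intersects the same interface (Fig. 3.1(b)). The area of the loop is A and its normal $\hat{i}_n$ is parallel to the interface. Integration of (3.1.10a) and taking the limit of zero height yields

$$\lim_{\delta h\to 0}\int_S (\nabla\times\mathbf{E})\cdot d\mathbf{a} = -\lim_{\delta h\to 0}\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot d\mathbf{a}$$

Separate evaluation of the integrals gives

$$\lim_{\delta h\to 0}\oint_C \mathbf{E}\cdot d\mathbf{s} = \lim_{\delta h\to 0}\,\hat{i}_s\cdot\left(\mathbf{E}^{(a)} - \mathbf{E}^{(b)}\right)L$$

and

$$\lim_{\delta h\to 0}\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot d\mathbf{a} = \lim_{\delta h\to 0}\frac{\partial}{\partial t}\left(\hat{i}_n\cdot\mathbf{B}\right)\delta h\,L$$

where L is the length of the loop along the interface. The flux term vanishes with $\delta h$. Using the identities $\hat{i}_s = \hat{i}_n\times\hat{n}$ and $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c}$, the integrals produce the continuity law for the tangential electric field:

$$\hat{n}\times\left(\mathbf{E}^{(a)} - \mathbf{E}^{(b)}\right) = 0 \qquad (3.1.13)$$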


The same analysis of Ampère's law gives the continuity condition for the tangential magnetic field:

$$\hat{n}\times\left(\mathbf{H}^{(a)} - \mathbf{H}^{(b)}\right) = 0 \qquad (3.1.14)$$

Table 3.1 summarizes the results of this section.

3.2 Constitutive Relation Tensors


The preceding section introduced polarization $\mathbf{P}$ and magnetization $\mu_0\mathbf{M}$ terms into Maxwell's equations. These terms are derived from the radiation fields of bound electrons within a dielectric medium. The radiation fields are themselves excited by incident electromagnetic waves; the combined incident and radiated fields inextricably co- and counter-propagate throughout the medium. The combined electric and magnetic fields are denoted by $\mathbf{D}$ and $\mathbf{B}$ and are called the electric and magnetic flux densities, respectively. Maxwell's equations including the flux densities are

$$\nabla\times\mathbf{E}(\mathbf{r},t) = -\frac{\partial}{\partial t}\mathbf{B}(\mathbf{r},t) \qquad (3.2.1a)$$
$$\nabla\times\mathbf{H}(\mathbf{r},t) = \frac{\partial}{\partial t}\mathbf{D}(\mathbf{r},t) \qquad (3.2.1b)$$
$$\nabla\cdot\mathbf{D}(\mathbf{r},t) = 0 \qquad (3.2.1c)$$
$$\nabla\cdot\mathbf{B}(\mathbf{r},t) = 0 \qquad (3.2.1d)$$

A specific medium is modelled by the relation between the field terms $\mathbf{E}$ and $\mathbf{H}$ and the flux density terms $\mathbf{D}$ and $\mathbf{B}$. The most general form of these constitutive relations is

$$\begin{pmatrix} c\mathbf{D} \\ \mathbf{H} \end{pmatrix} = \begin{pmatrix} \bar{P} & \bar{L} \\ \bar{M} & \bar{Q} \end{pmatrix}\begin{pmatrix} \mathbf{E} \\ c\mathbf{B} \end{pmatrix} \qquad (3.2.2)$$

where c is the speed of light and $\bar{P}$, $\bar{L}$, $\bar{M}$, and $\bar{Q}$ are $3\times 3$ matrix tensors. The form of (3.2.2) is preferred for its invariance to relativistic transformation [11]. The constitutive tensors are generally frequency dependent, and when cast in time-harmonic form, are generally complex quantities.
Materials are classified according to the matrix entries of (3.2.2). When the cross-coupling terms $\bar{L}$ and $\bar{M}$ are non-zero, the medium is called bianisotropic. Optically active materials are bianisotropic. When the cross-coupling terms are zero ($\bar{L} = \bar{M} = 0$) the medium is anisotropic. Within anisotropic materials the electric field excites only the electric flux, and the magnetic field excites only the magnetic flux. These excitations are generally not spatially uniform and depend on the eigenvectors of $\bar{P}$ and $\bar{Q}$. Isotropy is a special case of anisotropy in that $\bar{P}$ and $\bar{Q}$ are diagonal tensors with all entries equal. Physically, excitation of dipole moments is spatially uniform for isotropic materials.
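The materials considered in this text are lossless to first order. Losslessness imposes certain symmetry conditions on the constitutive tensors. These conditions are determined by Poynting's conservation theorem. Poynting's theorem derived from time-harmonic versions of (3.2.1a,b) is

$$\nabla\cdot(\mathbf{E}\times\mathbf{H}^*) = j\omega\left(\mathbf{E}\cdot\mathbf{D}^* - \mathbf{H}^*\cdot\mathbf{B}\right)$$

Losslessness requires that the divergence of time-averaged power flow vanishes:

$$\nabla\cdot\langle\mathbf{S}\rangle = \frac{1}{2}\,\mathrm{Re}\left\{j\omega\left(\mathbf{E}\cdot\mathbf{D}^* - \mathbf{H}^*\cdot\mathbf{B}\right)\right\} = 0 \qquad (3.2.3)$$

To evaluate the impact of this constraint on the constitutive relations, expression (3.2.2) is re-ordered to put the fields on one side and the flux densities on the other:

$$\begin{pmatrix} \mathbf{D} \\ \mathbf{B} \end{pmatrix} = C_{EH}\begin{pmatrix} \mathbf{E} \\ \mathbf{H} \end{pmatrix}, \quad\text{and}\quad \begin{pmatrix} \mathbf{E} \\ \mathbf{H} \end{pmatrix} = C_{DB}\begin{pmatrix} \mathbf{D} \\ \mathbf{B} \end{pmatrix} \qquad (3.2.4)$$

where

$$C_{EH} = \begin{pmatrix} \bar{\epsilon} & \bar{\xi} \\ \bar{\zeta} & \bar{\mu} \end{pmatrix}, \quad\text{and}\quad C_{DB} = \begin{pmatrix} \bar{\kappa} & \bar{\chi} \\ \bar{\gamma} & \bar{\nu} \end{pmatrix} \qquad (3.2.5)$$

The entries of $C_{EH}$ and $C_{DB}$ are, generally, $3\times 3$ tensors.
Fluxes $\mathbf{D}$ and $\mathbf{B}$ in (3.2.3) can now be expressed in terms of $\mathbf{E}$ and $\mathbf{H}$. With the identity $\mathbf{E}\cdot\bar{\epsilon}^{*}\cdot\mathbf{E}^{*} = \mathbf{E}^{*}\cdot\bar{\epsilon}^{\dagger}\cdot\mathbf{E}$ and a similar one for $\mathbf{H}$ and $\bar{\mu}$, the lossless condition imposes the following constraints on the tensors of $C_{EH}$ and $C_{DB}$:

$$\bar{\epsilon} = \bar{\epsilon}^{\dagger}, \quad \bar{\mu} = \bar{\mu}^{\dagger}, \quad \bar{\xi} = \bar{\zeta}^{\dagger} \qquad (3.2.6)$$

and

$$\bar{\kappa} = \bar{\kappa}^{\dagger}, \quad \bar{\nu} = \bar{\nu}^{\dagger}, \quad \bar{\chi} = \bar{\gamma}^{\dagger} \qquad (3.2.7)$$

Losslessness forces the on-diagonal tensors to be Hermitian and the off-diagonal tensors to possess conjugate-transpose symmetry with one another. Accordingly, transmission through any such medium retains a real-valued dispersion relation and is therefore lossless.
Tensors $\bar{\xi}$ and $\bar{\zeta}$ are non-zero for bianisotropic media and identically zero for anisotropic media. Lossless isotropic materials have scalar $(\epsilon, \mu)$, rather than tensor $(\bar{\epsilon}, \bar{\mu})$, values. In anisotropic and isotropic materials, $\bar{\epsilon}$ and $\bar{\mu}$ are readily identified as the electric permittivity and magnetic permeability tensors. Likewise, $\bar{\kappa}$ and $\bar{\nu}$ are identified as the impermittivity and impermeability tensors, where

$$\bar{\kappa} = \bar{\epsilon}^{-1}, \quad\text{and}\quad \bar{\nu} = \bar{\mu}^{-1} \qquad (3.2.8)$$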
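The Hermitian requirement on the permittivity tensor can be illustrated numerically. The sketch below evaluates only the electric part of the time-averaged divergence in (3.2.3) for an arbitrary complex field; the tensor entries, the field amplitudes, and the function name are hypothetical values chosen for illustration.

```python
def electric_power_divergence(eps, e_field, omega=1.0):
    """Return (1/2) Re{ j*omega * E . D* } with D = eps E (electric part only)."""
    d_flux = [sum(eps[i][j] * e_field[j] for j in range(3)) for i in range(3)]
    e_dot_dconj = sum(e_field[i] * d_flux[i].conjugate() for i in range(3))
    return 0.5 * (1j * omega * e_dot_dconj).real


# Hypothetical complex field amplitudes.
E = [1.0 + 0.5j, 0.3 - 0.2j, 0.7j]

# Hermitian tensor with imaginary off-diagonal entries (gyrotropic-like form).
eps_hermitian = [[2.0, -0.1j, 0.0],
                 [0.1j, 2.0, 0.0],
                 [0.0, 0.0, 2.5]]

# Non-Hermitian tensor: a negative imaginary part on the diagonal models loss
# under the exp(j*omega*t) time convention.
eps_lossy = [[2.0 - 0.1j, 0.0, 0.0],
             [0.0, 2.0 - 0.1j, 0.0],
             [0.0, 0.0, 2.0 - 0.1j]]

lossless_term = electric_power_divergence(eps_hermitian, E)
lossy_term = electric_power_divergence(eps_lossy, E)
```

For the Hermitian tensor the term vanishes to rounding error, while the lossy tensor gives a negative value, that is, net power absorbed by the medium.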


For anisotropic and isotropic materials, the fields and flux densities are decoupled:

$$\mathbf{D} = \bar{\epsilon}\,\mathbf{E} \qquad (3.2.9a)$$

$$\mathbf{B} = \bar{\mu}\,\mathbf{H} \qquad (3.2.9b)$$

Only when $\bar{\epsilon}$ and $\bar{\mu}$ are scalars are the fluxes necessarily aligned to the fields. Otherwise, fields and fluxes align only along the eigenvectors of their respective tensors.
Finally, Poynting's theorem restated to include explicit time dependence is

$$\nabla\cdot\mathbf{S} + \frac{\partial W}{\partial t} = 0 \qquad (3.2.10)$$

where the total stored energy is

$$W = \frac{1}{2}\left(\mathbf{E}\cdot\mathbf{D} + \mathbf{H}\cdot\mathbf{B}\right) \qquad (3.2.11)$$

In the time-domain picture, losslessness requires an instantaneous response of the polarization and magnetization densities to the applied electromagnetic field; a phase lag of $\mathbf{D}$ to $\mathbf{E}$, or $\mathbf{B}$ to $\mathbf{H}$, generates loss in the medium.

3.3 The kDB System


Section 3.1 showed that within a medium the incident fields and induced dipole radiation propagate together as electric and magnetic flux densities. These flux densities are the natural terms in which to describe the electromagnetic behavior. There is a second reason why these are the natural terms: the k-vector always lies perpendicular to plane-wave solutions of $\mathbf{D}$ and $\mathbf{B}$. One could instead consider using the fields $\mathbf{E}$ and $\mathbf{H}$ as descriptors within a medium since the Poynting vector always lies perpendicular to the fields: $\mathbf{S} = \mathbf{E}\times\mathbf{H}$. The choice of which system to use is made on grounds of mathematical simplicity, not physics. Since the k-vector always lies perpendicular to $\mathbf{D}$ and $\mathbf{B}$, one of the three components of each flux vector drops out of Maxwell's equations. Thus the so-called kDB system is introduced. Once the fluxes are everywhere determined, the fields are calculated from the inverse constitutive relations and, ultimately, the Poynting vector is determined.
The kDB system is constructed to take advantage of the natural basis defined by the triplet $(\mathbf{k}, \mathbf{D}, \mathbf{B})$. Plane-wave solutions separate the oscillatory term $\exp\{j(\omega t - \mathbf{k}\cdot\mathbf{r})\}$ from a slowly varying and (piecewise) spatially independent flux envelope. Maxwell's equations (3.2.1) are thus recast as

$$\mathbf{k}\times\mathbf{E} = \omega\mathbf{B} \qquad (3.3.1a)$$

$$\mathbf{k}\times\mathbf{H} = -\omega\mathbf{D} \qquad (3.3.1b)$$

$$\mathbf{k}\cdot\mathbf{D} = 0 \qquad (3.3.1c)$$

$$\mathbf{k}\cdot\mathbf{B} = 0 \qquad (3.3.1d)$$


The natural coordinates for these flux vectors and k-vector are $(\hat{e}_1, \hat{e}_2, \hat{e}_3)$. In particular, $\hat{e}_3$ always points along the k-vector ($\mathbf{k} = \hat{e}_3 k$) and the $\mathbf{D}$ and $\mathbf{B}$ vectors lie in the DB plane defined by $(\hat{e}_1, \hat{e}_2)$ normal to $\hat{e}_3$ (Fig. 3.2). The result is that $D_3 = B_3 = 0$ when resolved on the kDB coordinate system. This provides the promised simplification.
Typically the constitutive tensors are written along their eigenvectors. The permittivity tensor for a birefringent crystal, for example, is generally written with only diagonal entries. However, the flux densities can propagate in an arbitrary direction. To reconcile the two reference frames, the eigenvector frame of the tensors, denoted by coordinate system (x, y, z), is rotated into the natural frame of the fluxes and k-vector. This transformation is done with the rotation operator T, which defines a first rotation about z by $\phi$ and a second rotation about $\hat{e}_1$ by $\theta$, the latter being the local x axis. Thus

$$T = R_x(\theta)R_z(\phi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

For vectors $\mathbf{A}$ and $\mathbf{A}_k$ resolved in the crystal and flux coordinates, respectively, the forward and inverse vector transformations are

$$\mathbf{A}_k = T\mathbf{A}, \quad\text{and}\quad \mathbf{A} = T^{-1}\mathbf{A}_k$$

where

$$T = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\cos\theta\sin\phi & \cos\theta\cos\phi & \sin\theta \\ \sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta \end{pmatrix} \qquad (3.3.2)$$

and $T^{-1} = T^{T}$. The operator T is therefore unitary (here, real orthogonal): $T T^{T} = I$.
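The rotation operator can be assembled and checked numerically. The following is a minimal sketch, assuming the passive-rotation sign convention reconstructed in (3.3.2); the helper names are illustrative and the angles are arbitrary.

```python
import math


def rot_x(theta):
    """Coordinate rotation about the local x axis by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def rot_z(phi):
    """Coordinate rotation about the z axis by phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def kdb_rotation(theta, phi):
    """T = R_x(theta) R_z(phi): crystal frame (x, y, z) to kDB frame."""
    return matmul3(rot_x(theta), rot_z(phi))


def apply(t, v):
    """Resolve vector v in the rotated frame: v_k = T v."""
    return [sum(t[i][j] * v[j] for j in range(3)) for i in range(3)]


theta, phi = 0.4, 1.1   # arbitrary example angles in radians
T = kdb_rotation(theta, phi)

# Under this convention the third row of T is the k direction in (x, y, z).
k_hat = T[2]
k_in_kdb = apply(T, k_hat)
identity_check = matmul3(T, transpose3(T))
```

Here `k_in_kdb` should come out as (0, 0, 1), confirming that the third kDB axis lies along the k-vector, and the first row of T has no z component, so the first kDB axis stays in the (x, y) plane.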


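When acting on matrices rather than vectors, the transformation operator T imparts a similarity transform on the coordinate system, cf. §2.4.4. Given the kDB constitutive relation

$$\begin{pmatrix} \mathbf{E} \\ \mathbf{H} \end{pmatrix} = \begin{pmatrix} \bar{\kappa} & \bar{\chi} \\ \bar{\gamma} & \bar{\nu} \end{pmatrix}\begin{pmatrix} \mathbf{D} \\ \mathbf{B} \end{pmatrix} \qquad (3.3.3)$$

resolved on an underlying (x, y, z) coordinate system, the constitutive relation is rotated into the flux frame according to

$$\begin{pmatrix} \mathbf{E}_k \\ \mathbf{H}_k \end{pmatrix} = \begin{pmatrix} T\bar{\kappa}T^{-1} & T\bar{\chi}T^{-1} \\ T\bar{\gamma}T^{-1} & T\bar{\nu}T^{-1} \end{pmatrix}\begin{pmatrix} \mathbf{D}_k \\ \mathbf{B}_k \end{pmatrix} \qquad (3.3.4a)$$

$$= \begin{pmatrix} \bar{\kappa}_k & \bar{\chi}_k \\ \bar{\gamma}_k & \bar{\nu}_k \end{pmatrix}\begin{pmatrix} \mathbf{D}_k \\ \mathbf{B}_k \end{pmatrix} \qquad (3.3.4b)$$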

Fig. 3.2. The kDB coordinate system is written relative to a (x, y, z) coordinate system that is typically aligned to the eigenvectors of the constitutive tensors. First, a right-hand rotation about z by $\phi$, then a right-hand rotation about the $\hat{e}_1$ axis by $\theta$. Axis $\hat{e}_3$ is aligned to the k-vector, and $\hat{e}_1$ always lies in the (x, y) plane.

Each constitutive relation tensor is transformed by the coordinate-system change, but since T is unitary, there is no dilation or compression of the tensors, only a pure rotation.
Now, since $\mathbf{k} = k\,\hat{e}_3$ and $D_3 = B_3 = 0$, the fields $\mathbf{E}$ and $\mathbf{H}$ can be eliminated in (3.3.1a,b) to solve for $D_{1,2}$ and $B_{1,2}$. The resulting coupled equations are

$$\begin{pmatrix} \kappa_{11} & \kappa_{12} \\ \kappa_{21} & \kappa_{22} \end{pmatrix}\begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = -\begin{pmatrix} \chi_{11} & \chi_{12} - u \\ \chi_{21} + u & \chi_{22} \end{pmatrix}\begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \qquad (3.3.5a)$$

$$\begin{pmatrix} \nu_{11} & \nu_{12} \\ \nu_{21} & \nu_{22} \end{pmatrix}\begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = -\begin{pmatrix} \gamma_{11} & \gamma_{12} + u \\ \gamma_{21} - u & \gamma_{22} \end{pmatrix}\begin{pmatrix} D_1 \\ D_2 \end{pmatrix} \qquad (3.3.5b)$$

where the phase velocity is defined as

$$u = \omega/k \qquad (3.3.6)$$

As written, the tensor components $\kappa_{ij}$, $\nu_{ij}$, $\chi_{ij}$, and $\gamma_{ij}$ are resolved in the kDB system. Depending on the complexity of the material, (3.3.5a,b) can be solved simultaneously, or either $\mathbf{D}_k$ or $\mathbf{B}_k$ can be eliminated and the resulting equation can be solved.
Equations (3.3.5a,b) are of the utmost importance to the description of electromagnetic propagation within dielectric media. These equations will be used in the following to determine the eigenmodes of propagation within a medium, the phase and group velocities, the dispersion relations, and the direction of energy flow. The elegance of the kDB system is that all linear, lossless materials can be described by this one system of equations, the particulars depending only on the tensor entries.


3.4 The Lorentz Force


As outlined in the introduction to the chapter, constitutive relations for isotropic, birefringent, gyrotropic, and optically active materials are derived in the following from classical electron-wave interaction. The Lorentz force and Newton's second law are sufficient to model simple behavior of isotropic and anisotropic media. Anisotropic media, of course, include birefringent and gyrotropic materials. The constitutive relation for optical activity is derived heuristically, as any robust model requires more work than space allows.
In isotropic and anisotropic materials the electric field couples to the bound electrons through the Lorentz force. The Lorentz force $\mathbf{f}$ is defined by

$$\mathbf{f} = -e\left(\mathbf{E} + \frac{\partial\mathbf{r}}{\partial t}\times\mathbf{B}\right) \qquad (3.4.1)$$

where $(-e)$ is the charge of the electron and $\mathbf{r}$ is the displacement of the electron from its neutral position. Since the electrons in lossless material are bound to the nucleus, the charge attraction serves as a restoring force. For low-intensity light the electron displacement is purely linear, so the restoring-force constant K can be taken as independent of displacement $\mathbf{r}$. Newton's second law $\mathbf{f} = m\mathbf{a}$ generates the equation of motion for a bound electron driven by the Lorentz force:

$$m\frac{\partial^2\mathbf{r}}{\partial t^2} = -K\mathbf{r} - e\mathbf{E} - e\frac{\partial\mathbf{r}}{\partial t}\times\mathbf{B} \qquad (3.4.2)$$

In the absence of an externally applied magnetic field, the cross-product force $\partial\mathbf{r}/\partial t\times\mathbf{B}$ is vanishingly small compared to $\mathbf{E}$. Indeed, recasting in time-harmonic form and substituting Faraday's law gives $|\mathbf{r}\times(\mathbf{k}\times\mathbf{E})| \le |\omega r/c|\,|\mathbf{E}|$. For $r \approx 0.1$ Å and $\omega \approx 2\pi\cdot 200$ THz, $|rk| \sim 10^{-5}$. To turn the cross-product term on, either an intense external magnetic field must be applied or the internal magnetization must be high, as is the case for materials with magnetic domains.
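The size of the magnetic term relative to the electric term can be estimated directly from the quoted displacement and optical frequency. This is a minimal sketch, assuming the free-space wavenumber k = ω/c; the function name is illustrative.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def magnetic_to_electric_ratio(displacement_m, frequency_hz):
    """Estimate |r k| = r * omega / c, the relative size of the v x B force."""
    omega = 2.0 * math.pi * frequency_hz
    return displacement_m * omega / SPEED_OF_LIGHT


# 0.1 angstrom displacement at 200 THz, the values quoted in the text.
ratio = magnetic_to_electric_ratio(0.1e-10, 200e12)
```

The ratio comes out at a few parts in 10^5, which is why the magnetic term is dropped from the equation of motion unless a strong bias field or magnetization is present.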

3.5 Isotropic Materials


The equation of motion for a bound electron in a linear isotropic material is

$$m\frac{\partial^2\mathbf{r}}{\partial t^2} = -K\mathbf{r} - e\mathbf{E} - m\gamma\frac{\partial\mathbf{r}}{\partial t} \qquad (3.5.1)$$

where $\gamma$ is a resonant damping coefficient, and the restorative and drag forces on the electron are centro-symmetric. The magnetic-induced force present in (3.4.2) is neglected. When the incident field is a plane wave at frequency $\omega$, (3.5.1) is rewritten in time-harmonic form:

$$\left(-m\omega^2 + j\omega m\gamma + K\right)\mathbf{r} = -e\mathbf{E} \qquad (3.5.2)$$

The following subsection incorporates this equation of motion into Maxwell's equations to obtain a dispersion relationship and the functional forms for the material permittivity, refractive index, and absorption.


3.5.1 Permittivity of Isotropic Materials


Equation (3.1.2) relates the material polarization to the electron displacement. Solving (3.5.2) for $\mathbf{r}$ and multiplying by $-Ne$, the frequency dependence of the polarization density is

$$\mathbf{P} = \frac{N e^2}{K - m\omega^2 + j\omega m\gamma}\,\mathbf{E} \qquad (3.5.3)$$

The polarization $\mathbf{P}$ is a linear function of the applied electric field $\mathbf{E}$ and points in the same direction. The frequency-dependent coefficient is a resonant factor where the resonance frequency $\omega_0$ is defined as

$$\omega_0 = \sqrt{K/m} \qquad (3.5.4)$$

The resonance frequency is not a function of the field energy or intensity (to this order of approximation) and is therefore a fixed quantity that depends on the composition of the material.
The material susceptibility $\bar{\chi}_e$ is the tensor that relates $\mathbf{E}$ to $\mathbf{P}$ for linear media. For isotropic media the tensor is a scalar $\chi_e$. The field and polarization are thus related via

$$\mathbf{P} = \epsilon_0\chi_e(\omega)\mathbf{E} \qquad (3.5.5)$$

Identifying (3.5.5) with (3.5.3), the complex susceptibility of a simple isotropic material is

$$\chi_e(\omega) = \frac{N e^2}{m\epsilon_0}\,\frac{1}{\omega_0^2 - \omega^2 + j\omega\gamma} \qquad (3.5.6)$$

The susceptibility tensor or scalar is fundamental because it embodies the homogeneous response of a material.
Susceptibility $\chi_e$ is used to define permittivity in the following way. The electric-flux density $\mathbf{D}$ is the combination of the incident field and its induced dipole polarization:

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} = \epsilon(\omega)\mathbf{E} \qquad (3.5.7)$$

where (3.5.5) is used to define the material permittivity accordingly:

$$\epsilon(\omega) = \epsilon_0\left(1 + \chi_e(\omega)\right) \qquad (3.5.8)$$

When the susceptibility is a tensor so is the permittivity. Often the permittivity relative to vacuum is used to compare materials. The relative permittivity $\epsilon_r$ is defined as

$$\epsilon_r = \epsilon/\epsilon_0 \qquad (3.5.9)$$

By analogy to permittivity, material permeability $\mu$ is defined by $\mathbf{B} = \mu\mathbf{H}$. However, for materials considered here there is little or no interaction between the field and the magnetic moments of the material. Therefore, $\mu$ is treated as a scalar, frequency-independent quantity.
With the above definitions of the material permittivity and permeability, the Helmholtz wave equation is (compare (1.1.6))

$$\nabla^2\mathbf{E} = \mu\epsilon\frac{\partial^2\mathbf{E}}{\partial t^2} \qquad (3.5.10)$$

Plane-wave solutions of the form

$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0\exp\{j(\omega t - \mathbf{k}\cdot\mathbf{r})\}$$

when inserted into (3.5.10), generate the dispersion relation

$$k = \omega\sqrt{\mu\epsilon}$$

Now, the permittivity is a complex scalar, so k is complex as well. That means absorption as well as retardation are fundamental implications of the Lorentz equation of electron motion.
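The frequency dependence of the loss and refractive index of an isotropic material is found as follows. Expansion of the permittivity and dispersion relation into real and imaginary parts,

$$\epsilon = \epsilon' + j\epsilon''$$

$$(k + j\alpha) = \frac{\omega}{c}(n + j\kappa)$$

gives an expression for the real and imaginary parts of the relative permittivity:

$$(n + j\kappa)^2 = \epsilon_r' + j\epsilon_r''$$

The real and imaginary parts of the relative permittivity are identified with the refractive index n and extinction coefficient $\kappa$ as

$$\epsilon_r' = n^2 - \kappa^2, \quad\text{and}\quad \epsilon_r'' = 2n\kappa \qquad (3.5.11)$$

Finally, expanding the resonant form of the susceptibility $\chi_e$ into its real and imaginary parts and identifying with (3.5.11) gives the coupled equations for the refractive index and extinction coefficient as

$$n^2 - \kappa^2 = 1 + \frac{N e^2}{m\epsilon_0}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + (\omega\gamma)^2} \qquad (3.5.12)$$

and

$$2n\kappa = -\frac{N e^2}{m\epsilon_0}\,\frac{\omega\gamma}{(\omega_0^2 - \omega^2)^2 + (\omega\gamma)^2} \qquad (3.5.13)$$

With the time dependence $\exp(j\omega t)$ used here, the damping term makes $\epsilon_r''$, and hence $\kappa$, negative, which corresponds to attenuation of the plane wave along the propagation direction.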
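The coupled equations (3.5.12) and (3.5.13) can be evaluated by taking the complex square root of the relative permittivity. The sketch below assumes a single Lorentz resonance as in (3.5.6); the oscillator density, resonance frequency, and damping rate are hypothetical illustration values, not material data from the text.

```python
import cmath
import math

ELECTRON_CHARGE = 1.602176634e-19   # C
ELECTRON_MASS = 9.1093837015e-31    # kg
EPS0 = 8.8541878128e-12             # F/m


def susceptibility(omega, omega0, gamma, density):
    """chi_e = (N e^2 / (m eps0)) / (omega0^2 - omega^2 + j omega gamma)."""
    strength = density * ELECTRON_CHARGE ** 2 / (ELECTRON_MASS * EPS0)
    return strength / (omega0 ** 2 - omega ** 2 + 1j * omega * gamma)


def index_and_extinction(chi):
    """Return (n, kappa) from (n + j kappa)^2 = 1 + chi, taking n >= 0."""
    root = cmath.sqrt(1.0 + chi)
    return root.real, root.imag


# Hypothetical ultraviolet resonance probed in the near infrared.
omega0 = 2.0 * math.pi * 3.0e15    # resonance, rad/s
gamma = 1.0e13                     # damping rate, 1/s
density = 2.0e28                   # oscillators per m^3
omega = 2.0 * math.pi * 193.0e12   # probe near 1550 nm, rad/s

chi = susceptibility(omega, omega0, gamma, density)
n, kappa = index_and_extinction(chi)
```

Well below the resonance the index exceeds unity and the extinction coefficient is small and, in this sign convention, negative.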

In the transparent regime, the bound-electron resonance is far away from the frequency of the incident field. In this case, $\kappa \approx 0$ and the dispersion relation is approximately

$$k = \frac{\omega}{c}\,n \qquad (3.5.14)$$

The phase velocity of the plane wave as it propagates within the medium is $v_{ph} = c/n$. Inspection of (3.5.12) in light of $\kappa \approx 0$ shows that the refractive index n is greater than unity. This is generally the case, although exceptions are possible, such as in the x-ray region where multiple material resonances are below, rather than above, the excitation frequency. For the near-infrared transparent regime important to telecommunications, the refractive index is greater than one. Another implication of polarization of the material is the wavelength change within the medium. The wavelength $\lambda$ in the material is related to the free-space wavelength $\lambda_0$ as

$$\lambda = \lambda_0/n \qquad (3.5.15)$$

That the wavelength is reduced is not surprising since the frequency of the light does not change whether the light is in vacuum or material, and $\omega = 2\pi v_{ph}/\lambda$ always holds.
The phase velocity $v_{ph}$ is the velocity of a monochromatic wavefront. However, the velocity of energy flow of a narrowband pulse is related to the frequency derivative of the wavenumber:

$$\frac{1}{v_g} = \frac{dk}{d\omega} \qquad (3.5.16)$$

where $v_g$ is the group velocity. As has been shown, even a simple dielectric material has permittivity dispersion, so the phase and group velocities generally differ. The group index $n_g$, defined by $v_g = c/n_g$, is

$$n_g = n - \lambda\frac{dn}{d\lambda} \qquad (3.5.17)$$

Generally, material and waveguide dispersion has a negative index slope with wavelength. So, generally, the group index lies above the refractive index.
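The relation between group index and refractive index can be illustrated with a numerical derivative. The following is a minimal sketch, assuming a one-pole dispersion model with hypothetical coefficients (not data for any real glass) and the wavelength form of the group index, n_g = n − λ dn/dλ.

```python
import math

# Hypothetical one-pole coefficients, for illustration only.
B_COEFF = 1.1    # dimensionless oscillator strength
C_COEFF = 0.01   # resonance wavelength squared, in um^2


def refractive_index(wavelength_um):
    """n(lambda) from n^2 = 1 + B lambda^2 / (lambda^2 - C)."""
    lam2 = wavelength_um ** 2
    return math.sqrt(1.0 + B_COEFF * lam2 / (lam2 - C_COEFF))


def index_slope(wavelength_um):
    """Analytic dn/dlambda for the one-pole model."""
    lam2 = wavelength_um ** 2
    dn2 = -2.0 * B_COEFF * C_COEFF * wavelength_um / (lam2 - C_COEFF) ** 2
    return dn2 / (2.0 * refractive_index(wavelength_um))


def group_index(wavelength_um, step_um=1e-4):
    """n_g = n - lambda * dn/dlambda using a central difference."""
    slope = (refractive_index(wavelength_um + step_um)
             - refractive_index(wavelength_um - step_um)) / (2.0 * step_um)
    return refractive_index(wavelength_um) - wavelength_um * slope


n_1550 = refractive_index(1.55)
ng_1550 = group_index(1.55)
```

Because the index slope with wavelength is negative on the long-wavelength side of the resonance, the computed group index sits above the refractive index.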
Now, the key simplification so far has been that a single resonance exists within the isotropic material. In general this is not the case. Multiple resonances can be related to various absorption bands of the atoms or molecules that constitute the material. A more robust model includes multiple resonances and weighs the contributions to the susceptibility according to the fraction $f_n$ of atoms that are associated with each resonance frequency $\omega_{0n}$. Well into the transparent regime where the damping factor can be ignored, the square of the refractive index is modelled as

$$n^2(\omega) = 1 + \frac{N e^2}{m\epsilon_0}\sum_n \frac{f_n}{\omega_{0n}^2 - \omega^2}$$

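Often an additional pole at low frequency and another at high frequency are added based on phenomenological experience. The refractive index equation is then

$$n^2(\omega) = 1 + \frac{a_\infty}{\omega_\infty^2} + \frac{N e^2}{m\epsilon_0}\sum_n \frac{f_n}{\omega_{0n}^2 - \omega^2} - \frac{a_0}{\omega^2}$$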

Table 3.2. Atomic Lines for Abbe Number Definition

Wavelength (nm)    Symbol    Spectral line    Element
656.2725    C    Red hydrogen line    H
589.2938    D    Yellow sodium line    Na
587.5618    d    Yellow helium line    He
486.1327    F    Blue hydrogen line    H

Converting frequency to wavelength and introducing material-specific fitting coefficients, the Sellmeier equation for refractive index dispersion is

$$n^2(\lambda) = A + \sum_n \frac{B_n\lambda^2}{\lambda^2 - C_n} - D\lambda^2 \qquad (3.5.18)$$

There are, in fact, any number of forms of this equation. While the measured refractive index data are absolute, the values of the coefficients in (3.5.18) depend on which equation is used to model the data. As an example of another form of the Sellmeier equation, the equation published by Schott Glass is

$$n^2(\lambda) - 1 = \frac{B_1\lambda^2}{\lambda^2 - C_1} + \frac{B_2\lambda^2}{\lambda^2 - C_2} + \frac{B_3\lambda^2}{\lambda^2 - C_3} \qquad (3.5.19)$$

The Abbe number is a measure of the refractive index dispersion throughout the visible region. The Abbe number generally works well for glasses rather than semiconductors or crystals because the amorphous, homogeneous nature of glass precludes strong resonances. The Abbe number $v_d$ is defined on the d-line as

$$v_d = \frac{n_d - 1}{n_F - n_C} \qquad (3.5.20)$$

where $n_d$, $n_F$, and $n_C$ are the measured refractive indices on the d, F, and C lines, respectively. The wavelengths for these lines are defined in Table 3.2.
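The three-term form (3.5.19) and the Abbe number (3.5.20) are straightforward to evaluate together. The sketch below uses coefficients commonly quoted for a borosilicate crown glass of the BK7 type, with wavelength in micrometres; these coefficients are an assumption brought in for illustration and do not come from this text, so they should be checked against a current glass catalog before any real use.

```python
import math

# Assumed three-term Sellmeier coefficients (BK7-type crown glass).
# B terms are dimensionless; C terms are in um^2.
B_TERMS = (1.03961212, 0.231792344, 1.01046945)
C_TERMS = (0.00600069867, 0.0200179144, 103.560653)

# Spectral lines of Table 3.2, converted from nm to um.
LINE_C_UM = 0.6562725
LINE_D_UM = 0.5875618   # helium d line
LINE_F_UM = 0.4861327


def sellmeier_index(wavelength_um):
    """n(lambda) from n^2 - 1 = sum_i B_i lambda^2 / (lambda^2 - C_i)."""
    lam2 = wavelength_um ** 2
    n_squared = 1.0 + sum(b * lam2 / (lam2 - c)
                          for b, c in zip(B_TERMS, C_TERMS))
    return math.sqrt(n_squared)


def abbe_number():
    """v_d = (n_d - 1) / (n_F - n_C)."""
    n_d = sellmeier_index(LINE_D_UM)
    n_f = sellmeier_index(LINE_F_UM)
    n_c = sellmeier_index(LINE_C_UM)
    return (n_d - 1.0) / (n_f - n_c)


n_d = sellmeier_index(LINE_D_UM)
v_d = abbe_number()
```

With these assumed coefficients the d-line index is close to 1.517 and the Abbe number is in the mid-sixties, typical of a low-dispersion crown glass.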

3.5.2 Propagation in Isotropic Materials

Even though the kDB system is intended for more complex propagation calculations, propagation within an isotropic material will be cast in this formalism for the sake of example. Within a material, one needs to find the directions of the k-vector and the Poynting vector, as well as the wavenumber. The refraction properties into and out of the material also need to be determined; that is the topic of the following section.
In an isotropic material, tensors $\bar{\chi}$ and $\bar{\gamma}$ in $C_{DB}$ are zero, and $\kappa$ and $\nu$ are scalars. The constitutive relations are

$$\mathbf{E} = \kappa\mathbf{D}, \qquad \mathbf{H} = \nu\mathbf{B} \qquad (3.5.21)$$


The underlying coordinate system is arbitrary. A rotation of the coordinate


system into kDB coordinates leaves the constitutive relations unchanged since
T T 1 = . Thus, substitution of (3.5.21) into the coupled kDB equations (3.3.5) gives


 


0 u
B1
D1
=
(3.5.22a)

D2
B2
u 0



 

B1
0 u
D1

=
(3.5.22b)
B2
D2
u 0
Elimination of B generates the governing equation
 

 2
D1
0
u
=0
D2
0
u2

(3.5.23)

The electric field may be polarized along the $\hat{e}_1$ direction, the $\hat{e}_2$ direction, or a mixture of the two. For example, when the field is polarized along $\hat{e}_1$ then $D_1 \neq 0$ and $D_2 = 0$. In any case, the governing equation is satisfied when the phase velocity is

$$u = \sqrt{\kappa\nu} = \frac{1}{\sqrt{\mu\varepsilon}} \tag{3.5.24}$$

Substituting $u = \omega/k$ gives the dispersion relationship

$$k = \omega\sqrt{\mu\varepsilon} \tag{3.5.25}$$

from which the refractive index can be extracted:

$$n = \sqrt{\mu_r \varepsilon_r} \tag{3.5.26}$$

It is important to note that $|\mathbf{k}| = k$ and $k = n\omega/c$. Geometrically, the k-vector within the material can point in any direction, while the vector length is always $k$. The surface defined by all possible pointing directions of the k-vector is a sphere of radius $n\omega/c$:

$$k = \sqrt{k_x^2 + k_y^2 + k_z^2} \tag{3.5.27}$$

Finally, the Poynting vector is determined by the fields. Since $\kappa$ and $\nu$ are scalars, the fields and respective flux densities are aligned. Since by definition the k-vector is aligned to $\hat{e}_3$, the Poynting vector is

$$\mathbf{S} = \mathbf{E}_k \times \mathbf{H}_k = \kappa\nu\, \mathbf{D}_k \times \mathbf{B}_k = \frac{1}{\mu\varepsilon}\, |\mathbf{D}_k|\,|\mathbf{B}_k|\, \hat{e}_3 \tag{3.5.28}$$

So, $\mathbf{k} \parallel \mathbf{S}$ in an isotropic material.


3.5.3 Refraction at an Interface


There are two questions to be addressed when dealing with refraction at a smooth interface: the angle of refraction, and the transmission and reflection coefficients. The refraction angle at an interface of two dissimilar isotropic materials is determined by the relative refractive indices alone and is independent of the polarization of the incident wave. The transmission and reflection coefficients, however, do depend on the incident polarization.
Phase matching of two waves, one in a first material and the other in a second material, is the single condition that must be satisfied along an interface. Refraction results when the refractive indices of the two materials differ. The canonical example is illustrated in Fig. 3.3(a). Here the plane wave k-vector in material 1 is inclined from the interface normal by $\theta_1$. The wavelength within the material is $\lambda_1 = \lambda_o/n_1$ and the wave period as projected onto the interface is $\lambda_1/\sin\theta_1$. The same analysis applies to the second material, with $\theta_2$ and $n_2$ in replacement. Phase matching at the interface requires

$$\frac{\lambda_1}{\sin\theta_1} = \frac{\lambda_2}{\sin\theta_2} \tag{3.5.29}$$

Snell's law is a restatement of (3.5.29) using the refractive index instead:

$$n_1 \sin\theta_1 = n_2 \sin\theta_2 \tag{3.5.30}$$

Since the refractive index of either (isotropic) material is independent of the polarization of the wave, the refraction angle is independent of polarization.
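Snell's law (3.5.30) is simple to evaluate numerically. The sketch below is a minimal illustration; the function name and the choice to return `None` when no propagating transmitted wave exists are assumptions made for the example.

```python
import math

def refraction_angle(n1, n2, theta1_deg):
    """Refraction angle in degrees from n1*sin(theta1) = n2*sin(theta2).

    Returns None when sin(theta2) would exceed 1, i.e. when phase
    matching cannot be met by a propagating wave in material 2.
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))

# Air to glass: the ray bends toward the normal.
theta2 = refraction_angle(1.0, 1.5, 30.0)
# Glass to air at steep incidence: no transmitted plane wave.
blocked = refraction_angle(1.5, 1.0, 60.0)
```

No polarization argument appears in the function, in keeping with the statement that the refraction angle between isotropic media is polarization independent.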
A geometric construction of refraction and reflection is illustrated in Fig. 3.3(b). Recall that within an isotropic medium the k-surface is a sphere having radius $n\omega/c$. The interface between two media bisects each sphere, dividing them into hemispheres of different radii. In Fig. 3.3(b) the refractive index of the first material is less than the second. The top and bottom half-circles represent the k-surfaces as projected onto the page, and the $\mathbf{k}_i$, $\mathbf{k}_t$, and $\mathbf{k}_r$ vectors are illustrated, subscripts i, t, and r referring to incident, transmitted, and reflected waves, respectively. Phase matching at the interface requires that the projection lengths of all three k-vectors match on the interface. In order for the tip of $\mathbf{k}_t$ to reach the $n_2\omega/c$ circle, the direction of $\mathbf{k}_t$ must change with respect to $\mathbf{k}_i$. For the reflected wave, however, $|\mathbf{k}_i| = |\mathbf{k}_r|$ and therefore the angle of the reflected wave is the same, albeit mirror-imaged about the normal to the interface. Finally, since these are isotropic materials, the Poynting and k-vectors for all three waves are aligned and coincident.
3.5.4 Reflection and Transmission for TE Waves
Next the reflection and transmission coefficients are determined. These coefficients are different for waves with transverse-electric (TE) and transverse-magnetic (TM) polarizations. The TE and TM polarization states are distinguished because, in the first case, the electric field oscillates in the plane of the interface while, in the second case, the magnetic field oscillates in that plane. The tangential continuity conditions (3.1.13-3.1.14) determine the relative field amplitudes in the two materials.

Fig. 3.3. Phase matching at a smooth dielectric interface. a) The free-space wavelength is reduced by the refractive index for waves within the dielectric. For $n_2 > n_1$ the wavelength in material 2 is shorter than that in material 1. Phase matching at the interface requires that the k-vector change direction at the interface. b) A k-vector diagram for refraction. The half-circle radii represent the respective refractive indices; the contours are circular in isotropic media. Phase matching requires the $k_x$ vectors of both waves to match.
For TE plane waves, illustrated in Fig. 3.4(a), the total electric fields in the two materials are

$$\mathbf{E}_y^{(1)} = \hat{y} E_o \left( e^{-jk_{z1}z} + \Gamma e^{jk_{z1}z} \right) e^{-jk_x x}$$
$$\mathbf{E}_y^{(2)} = \hat{y} E_o T e^{-jk_{z2}z} e^{-jk_x x} \tag{3.5.31}$$

where $\Gamma$ and $T$ are the complex coefficients of the reflection and transmission amplitudes, respectively. Faraday's law determines the associated magnetic fields:

$$\mathbf{H} = \frac{1}{j\omega\mu_o} \left( \hat{x} \frac{\partial}{\partial z} E_y - \hat{z} \frac{\partial}{\partial x} E_y \right)$$

where $\mathbf{E}_y = \hat{y} E_y$. The magnetic field components are therefore

$$\mathbf{H}_x^{(1)} = -\hat{x} \frac{k_{z1}}{\omega\mu_o} E_o \left( e^{-jk_{z1}z} - \Gamma e^{jk_{z1}z} \right) e^{-jk_x x}$$
$$\mathbf{H}_x^{(2)} = -\hat{x} \frac{k_{z2}}{\omega\mu_o} E_y^{(2)}$$

and

$$\mathbf{H}_z^{(1,2)} = \hat{z} \frac{k_x}{\omega\mu_o} E_y^{(1,2)}$$

Only the tangential E and H field components are required to satisfy continuity across the interface. Application of the boundary conditions

$$E_y^{(1)} = E_y^{(2)} \quad \text{and} \quad H_x^{(1)} = H_x^{(2)}$$

on the interface, $z = 0$, gives

$$1 + \Gamma = T \tag{3.5.32a}$$
$$k_{z1}(1 - \Gamma) = k_{z2} T \tag{3.5.32b}$$

Combining (3.5.32a,b) gives the TE reflection coefficient in terms of material indices and ray angles:

$$\Gamma_{TE} = \frac{n_1 \cos\theta_1 - n_2 \cos\theta_2}{n_1 \cos\theta_1 + n_2 \cos\theta_2} \tag{3.5.33}$$

or, in terms of the index ratio and $\theta_1$ alone,

$$\Gamma_{TE} = \frac{\sqrt{1 - \sin^2\theta_1} - (n_2/n_1)\sqrt{1 - (n_1/n_2)^2 \sin^2\theta_1}}{\sqrt{1 - \sin^2\theta_1} + (n_2/n_1)\sqrt{1 - (n_1/n_2)^2 \sin^2\theta_1}} \tag{3.5.34}$$

Fig. 3.4. Refraction diagrams for TE and TM waves. a) TE wave: the electric field lies tangential to the interface. b) TM wave: the magnetic field lies tangential to the interface.

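Now, if one considers a box drawn around the point of incidence, then all the power that flows into the box from the incident wave must flow out of the box via the reflected and transmitted waves. The time-averaged Poynting vectors are

$$\langle \mathbf{S}_i \rangle = \frac{1}{2\omega\mu_o} \left( \hat{x} k_x + \hat{z} k_{z1} \right) |E_o|^2$$
$$\langle \mathbf{S}_r \rangle = \frac{1}{2\omega\mu_o} \left( \hat{x} k_x - \hat{z} k_{z1} \right) |\Gamma|^2 |E_o|^2$$
$$\langle \mathbf{S}_t \rangle = \frac{1}{2\omega\mu_o} \left( \hat{x} k_x + \hat{z} k_{z2} \right) |T|^2 |E_o|^2$$

Because the fields are uniform along the interface, the power flowing along $\hat{x}$ enters and leaves the box in equal measure, and only the components normal to the interface contribute. Power conservation therefore requires that

$$\hat{z} \cdot \langle \mathbf{S}_i \rangle = -\hat{z} \cdot \langle \mathbf{S}_r \rangle + \hat{z} \cdot \langle \mathbf{S}_t \rangle \tag{3.5.35}$$

Substitution of the TE Poynting vectors gives

$$1 = |\Gamma|^2 + \frac{k_{z2}}{k_{z1}} |T|^2 = |\Gamma|^2 + \frac{n_2 \cos\theta_2}{n_1 \cos\theta_1} |T|^2$$

Defining $R = |\Gamma|^2$ and $\mathcal{T} = (n_2 \cos\theta_2 / n_1 \cos\theta_1) |T|^2$, which reduces to $(n_2/n_1)|T|^2$ at normal incidence, generates the power conservation equation:

$$R + \mathcal{T} = 1 \tag{3.5.36}$$

Clearly all that is reflected and transmitted must come from the incident power.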
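The TE results can be checked numerically. The sketch below is an illustration under stated assumptions: it evaluates (3.5.33) with $T = 1 + \Gamma$ from (3.5.32a), and weights the transmitted power by the normal-component ratio $n_2\cos\theta_2 / n_1\cos\theta_1$, which equals $n_2/n_1$ only at normal incidence. The function name is hypothetical and the routine is valid only below any critical angle.

```python
import math

def fresnel_te(n1, n2, theta1_deg):
    """TE amplitude and power coefficients for a lossless interface.

    Returns (gamma, t, R, T_power). Assumes a propagating transmitted
    wave, i.e. (n1/n2)*sin(theta1) <= 1.
    """
    th1 = math.radians(theta1_deg)
    cos1 = math.cos(th1)
    cos2 = math.sqrt(1.0 - (n1 / n2 * math.sin(th1)) ** 2)
    gamma = (n1 * cos1 - n2 * cos2) / (n1 * cos1 + n2 * cos2)  # (3.5.33)
    t = 1.0 + gamma                                            # (3.5.32a)
    R = gamma ** 2
    T_power = (n2 * cos2) / (n1 * cos1) * t ** 2
    return gamma, t, R, T_power

# Air-to-glass example: R + T_power stays at unity for every angle tried.
balance = [sum(fresnel_te(1.0, 1.5, a)[2:]) for a in (0.0, 30.0, 60.0, 85.0)]
```

At normal incidence the example gives $\Gamma = -0.2$ and $R = 0.04$, the familiar 4% reflection from an uncoated glass surface.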
3.5.5 Reflection and Transmission for TM Waves
For TM plane waves, illustrated in Fig. 3.4(b), the magnetic field oscillates in the plane of the interface. In analogy to the TE wave solution, the total magnetic fields in the two materials are

$$\mathbf{H}_y^{(1)} = \hat{y} H_o \left( e^{-jk_{z1}z} + \Gamma e^{jk_{z1}z} \right) e^{-jk_x x}$$
$$\mathbf{H}_y^{(2)} = \hat{y} H_o T e^{-jk_{z2}z} e^{-jk_x x} \tag{3.5.37}$$

where $\Gamma$ and $T$ are the complex coefficients of the TM-wave reflection and transmission amplitudes, respectively. The associated electric fields are determined from Ampère's law,

$$\mathbf{E} = -\frac{1}{j\omega\varepsilon} \left( \hat{x} \frac{\partial}{\partial z} H_y - \hat{z} \frac{\partial}{\partial x} H_y \right)$$

where $\mathbf{H}_y = \hat{y} H_y$. Solving for the fields and matching the tangential continuity condition across the interface yields

$$1 + \Gamma = T \tag{3.5.38a}$$
$$\frac{k_{z1}}{\varepsilon_1}(1 - \Gamma) = \frac{k_{z2}}{\varepsilon_2} T \tag{3.5.38b}$$

Combining (3.5.38a,b) gives the TM reflection coefficient in terms of material indices and ray angles:

$$\Gamma_{TM} = \frac{n_2 \cos\theta_1 - n_1 \cos\theta_2}{n_2 \cos\theta_1 + n_1 \cos\theta_2} \tag{3.5.39}$$

or, in terms of the index ratio and $\theta_1$,

$$\Gamma_{TM} = \frac{\sqrt{1 - \sin^2\theta_1} - (n_1/n_2)\sqrt{1 - (n_1/n_2)^2 \sin^2\theta_1}}{\sqrt{1 - \sin^2\theta_1} + (n_1/n_2)\sqrt{1 - (n_1/n_2)^2 \sin^2\theta_1}} \tag{3.5.40}$$

Fig. 3.5. Reflection intensities for TE and TM waves as a function of incident angle ($n_1 = 1.0$, $n_2 = 1.5$; Brewster's angle is marked on the TM curve).

Figure 3.5 compares the reflection intensities $|\Gamma|$ for TE and TM waves at an air-glass boundary as a function of incident angle. The TE reflection intensity increases monotonically, while the TM wave goes through a zero point along the way. The angle at which TM reflection is zero is called Brewster's angle.
The remarkable property of TM waves is that at Brewster's angle all reflection is extinguished. Brewster's angle satisfies the condition

$$\theta_1 + \theta_2 = \pi/2 \tag{3.5.41}$$

That is, the transmitted and reflected waves are perpendicular to one another (see Fig. 3.6(a)). That there is no power in the reflected wave is reasonable based on physical considerations. The radiation pattern of an electric dipole is null along the polar axis. Brewster's condition orients the dipole excitation and the direction of the reflected wave perpendicular to one another. Substituting Brewster's condition (3.5.41) into (3.5.39) gives the requisite incident angle $\theta_B$:

$$\theta_B = \tan^{-1}(n_2/n_1) \tag{3.5.42}$$

Brewster's condition cannot be satisfied by TE waves because the electric fields of the reflected and transmitted waves are always parallel.
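A short numerical check of Brewster's condition follows. It is a sketch only: the function names are hypothetical, and the air-glass indices are the illustrative values used in Fig. 3.5. The code evaluates (3.5.42), confirms that (3.5.39) vanishes there, and confirms that the two ray angles sum to 90 degrees as required by (3.5.41).

```python
import math

def gamma_tm(n1, n2, theta1):
    """TM reflection coefficient (3.5.39); theta1 in radians, no TIR."""
    cos1 = math.cos(theta1)
    cos2 = math.sqrt(1.0 - (n1 / n2 * math.sin(theta1)) ** 2)
    return (n2 * cos1 - n1 * cos2) / (n2 * cos1 + n1 * cos2)

def brewster_angle(n1, n2):
    """Brewster's angle (3.5.42) in radians."""
    return math.atan2(n2, n1)

n1, n2 = 1.0, 1.5                      # illustrative air-glass values
theta_b = brewster_angle(n1, n2)
theta_t = math.asin(n1 * math.sin(theta_b) / n2)   # Snell's law
residual = gamma_tm(n1, n2, theta_b)
```

For these indices Brewster's angle is about 56.3 degrees, in line with the zero of the TM curve in Fig. 3.5.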
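To write the power conservation equation in the form of (3.5.36), the reflection and transmission coefficients are identified to be

$$R = |\Gamma|^2, \quad \text{and} \quad \mathcal{T} = \frac{n_1 \cos\theta_2}{n_2 \cos\theta_1} |T|^2$$

where the transmission weight reduces to $n_1/n_2$ at normal incidence.
As a concluding remark, apart from the incident angle and polarization state, the refraction angle and the reflection and transmission coefficients depend only on the refractive-index ratio $n_2/n_1$, not on the absolute refractive indices.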

Fig. 3.6. a) The condition at Brewster's angle: the reflected and transmitted waves are perpendicular to one another. Brewster's condition of zero reflectance applies only to TM waves. b) The critical angle for total internal reflection, $n_2 < n_1$.

3.5.6 Total Internal Reflection


Total internal reflection (TIR) occurs when the phase-matching condition at an interface cannot be satisfied by plane-wave solutions. A plane-wave solution does not exist when phase matching requires a phase velocity along the interface that is lower than the speed of light in the second material. In this case all the incident power is reflected.
For TIR to exist, the refractive index of the material in which the incident wave travels must exceed the refractive index beyond the interface: $n_1 > n_2$. This condition is required whether the material is isotropic or anisotropic. In isotropic media the refractive index is independent of polarization state, so TIR between two isotropic media is polarization agnostic.
For TIR to occur at an interface where $n_1 > n_2$, the incident angle must be at or beyond the critical angle. The critical angle for the incident wave is that angle which makes the transmitted wave run parallel to the interface, $\theta_2 = \pi/2$ (see Fig. 3.6(b)). Beyond the critical angle the wavefront period as projected onto the interface from the incident side is shorter than the wavelength of a plane wave on the transmission side. There is no orientation of $\mathbf{k}_t$ that achieves phase matching. Instead, the normal component of $\mathbf{k}_t$ becomes imaginary and the transmitted field decays exponentially into the lower-index medium.
Referring to Fig. 3.3(a), the incident angle at which $\lambda_1/\sin\theta_1 = \lambda_2$ is called the critical angle. Snell's law generates the relation

$$\theta_c = \sin^{-1}(n_2/n_1) \tag{3.5.43}$$

Resolving $k_2$ into normal and transverse components,

$$k_2^2 = k_x^2 + k_{z2}^2,$$

at and above the critical angle $k_x \geq k_2$. This condition is only satisfied when $k_{z2}$ is imaginary:

$$k_{z2} = -j\alpha_{z2} = -j\sqrt{k_x^2 - k_2^2}$$

This shows the exponential decay of the field along the axis normal to the surface. Parallel to the surface the incident and evanescent fields are phase matched.

Fig. 3.7. Reflection intensity and phase before and after the onset of total internal reflection ($n_1 = 1.5$, $n_2 = 1.0$). a) Reflection intensity for TE and TM waves. b) Reflection phase shifts for TE and TM. Notice the sign change for TM reflection across the Brewster's angle boundary.
In comparison to the reflected intensities from air to glass, $n: 1.0 \to 1.5$, plotted in Fig. 3.5, the reflection intensity and phase for the reverse direction, $n: 1.5 \to 1.0$, are plotted in Fig. 3.7. Since both media are isotropic the critical angle is the same for both polarizations. Below the critical angle of $\theta_c \approx 41.8^\circ$ the reflection coefficients are below unity and there is no phase slip. At and beyond the critical angle, however, the reflection coefficients are unity and a phase develops between the incident and reflected waves. The phase of the TM wave goes through a phase shift at Brewster's angle.
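The critical angle quoted above, and the decay constant of the evanescent field beyond it, can be reproduced with a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the symbol `alpha` stands for the magnitude of the imaginary normal wavenumber in the second medium, and the 1.55 um free-space wavelength is an illustrative value only.

```python
import math

def critical_angle_deg(n1, n2):
    """Critical angle (3.5.43) in degrees; requires n1 > n2."""
    return math.degrees(math.asin(n2 / n1))

def decay_constant(n1, n2, theta1_deg, wavelength):
    """sqrt(kx^2 - k2^2) for incidence beyond the critical angle.

    The result has units of 1/length, matching the units of wavelength.
    """
    k0 = 2.0 * math.pi / wavelength
    kx = n1 * k0 * math.sin(math.radians(theta1_deg))
    k2 = n2 * k0
    if kx <= k2:
        raise ValueError("incident angle is not beyond the critical angle")
    return math.sqrt(kx ** 2 - k2 ** 2)

theta_c = critical_angle_deg(1.5, 1.0)
alpha = decay_constant(1.5, 1.0, 60.0, 1.55)   # illustrative wavelength in um
depth = 1.0 / alpha                            # 1/e field decay length, um
```

For glass to air at 60 degrees incidence the field decays over roughly a third of a micrometre at the assumed wavelength.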
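When the incident angle exceeds the critical angle, the reflection coefficient takes unity magnitude and develops a phase shift. The phase shift is called the Goos-Hänchen shift. Defining the reflection phase as $\Phi = 2\phi$, so that $\Gamma = e^{2j\phi}$, the boundary conditions for TE (3.5.32) and TM (3.5.38) are rewritten as

$$1 + e^{2j\phi} = T \tag{3.5.44a}$$
$$q_1 \left(1 - e^{2j\phi}\right) = q_2 T \tag{3.5.44b}$$

where for TE $q = k_z$ and for TM $q = k_z/\varepsilon$. With $k_{z2} = -j\alpha_{z2}$, the Goos-Hänchen phase shifts are

$$\phi_{TE} = \tan^{-1}\left(\frac{\alpha_{z2}}{k_{z1}}\right) \quad \text{and} \quad \phi_{TM} = \tan^{-1}\left(\frac{\varepsilon_1}{\varepsilon_2}\frac{\alpha_{z2}}{k_{z1}}\right) \tag{3.5.45}$$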
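Expanding in terms of the indices and incident angle, the reflection phases $\Phi = 2\phi$ are

$$\Phi_{TE} = 2\tan^{-1}\left(\frac{\sqrt{\sin^2\theta_1 - (n_2/n_1)^2}}{\cos\theta_1}\right) \tag{3.5.46a}$$

$$\Phi_{TM} = 2\tan^{-1}\left(\frac{\sqrt{\sin^2\theta_1 - (n_2/n_1)^2}}{(n_2/n_1)^2\cos\theta_1}\right) \tag{3.5.46b}$$

The retardance induced by a TIR reflection, $\delta = -(\Phi_{TE} - \Phi_{TM})$, is

$$\delta = 2\tan^{-1}\left(\frac{\cos\theta_1\sqrt{\sin^2\theta_1 - (n_2/n_1)^2}}{\sin^2\theta_1}\right) \tag{3.5.47}$$

Since the sign of $\delta$ is positive, the magnitude of phase shift imparted on the TM wave is greater than that for the TE wave.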
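The reflection phases and the retardance can be cross-checked numerically. The sketch below assumes the forms of (3.5.46a,b) and (3.5.47) in which the TM phase carries the extra factor $(n_1/n_2)^2$; the function names and the glass-to-air example values are assumptions for illustration.

```python
import math

def tir_phases(n1, n2, theta1_deg):
    """Full reflection phases (TE, TM) in radians, beyond the critical angle."""
    th = math.radians(theta1_deg)
    m = (n2 / n1) ** 2
    root = math.sqrt(math.sin(th) ** 2 - m)   # real only beyond critical angle
    phase_te = 2.0 * math.atan(root / math.cos(th))
    phase_tm = 2.0 * math.atan(root / (m * math.cos(th)))
    return phase_te, phase_tm

def tir_retardance(n1, n2, theta1_deg):
    """Retardance between TM and TE reflection phases, closed form."""
    th = math.radians(theta1_deg)
    m = (n2 / n1) ** 2
    root = math.sqrt(math.sin(th) ** 2 - m)
    return 2.0 * math.atan(math.cos(th) * root / math.sin(th) ** 2)

phase_te, phase_tm = tir_phases(1.5, 1.0, 60.0)
delta = tir_retardance(1.5, 1.0, 60.0)
```

In this example the closed-form retardance agrees with the difference of the two phases, about 40 degrees at 60 degrees incidence, and the TM phase is the larger of the two.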
Total internal reflection can be understood as a partial penetration of the electromagnetic wave into the material of lower index (cf. Fig. 3.8). The field component normal to the interface is a standing wave where the first null lies within the lower-index material. The phase shift between the interface location and the first field maximum is called the Goos-Hänchen shift. This shift is polarization dependent, as determined above, with the TM wave penetration greater than the TE penetration for the same inclination. Since the Goos-Hänchen phases differ for TE and TM waves, the state of polarization of a reflected field can be transformed with respect to the incident field.
For TE waves, the electric field expressions above the critical angle are

$$\mathbf{E}_y^{(1)} = \hat{y}\, 2E_o e^{j\phi} \cos(k_{z1} z + \phi)\, e^{-jk_x x} \tag{3.5.48a}$$

$$\mathbf{E}_y^{(2)} = \hat{y}\, 2E_o e^{j\phi} \cos\phi\; e^{-\alpha_{z2} z} e^{-jk_x x} \tag{3.5.48b}$$

The phase factor $\phi$ in the cosine term of the standing wave indicates that the last null of the standing wave lies below the interface and in region 2, Fig. 3.8(a). If the lower dielectric material were removed and a perfect metal conductor were placed at the location of the last null, the same standing wave pattern would persist.
The existence of the Goos-Hänchen phase shift alters the reflection diagrams of Fig. 3.4: the reflected light ray is no longer coincident with the incident ray at the interface. Rather, the incident ray penetrates into the lower material and reemerges forward-shifted with respect to its point of entry (Fig. 3.8(b)). The forward shift $2x_s$ is also rather confusingly called the Goos-Hänchen shift.
To construct the expression for the forward shift, consider two plane waves incident on the interface at slightly different angles $\theta_i \pm \Delta\theta$. The incident and reflected waves at $z = 0$ are

$$E_y^{(i)} = E_o e^{-j(k_x \pm \Delta k_x)x}$$
$$E_y^{(r)} = E_o e^{-j(k_x \pm \Delta k_x)x}\, e^{2j(\phi \pm \Delta\phi)}$$
Fig. 3.8. Views of the Goos-Hänchen shift ($n_1 > n_2$). a) A standing wave oscillates along the normal to the interface; the first null penetrates into the lower-index material. b) The inclined wave penetrates into the lower-index material and reemerges forward-shifted with respect to the incident wave. The penetration depth depends on the polarization state.

The total incident and reflected fields, summed over the two slightly different incident angles, are

$$E_y^{(i)} = 2E_o \cos(\Delta k_x x)\, e^{-jk_x x} \tag{3.5.49a}$$

$$E_y^{(r)} = 2E_o \cos(\Delta k_x x - 2\Delta\phi)\, e^{-jk_x x} e^{2j\phi} \tag{3.5.49b}$$

Clearly the reflected field is shifted forward with respect to the incident field. Expanding the Goos-Hänchen phase about $k_x$ gives

$$\phi(k_x + \Delta k_x) = \phi(k_x) + \frac{\partial\phi}{\partial k_x}\Delta k_x$$

Substitution into (3.5.49b) gives the expression for the reflected wave

$$E_y^{(r)} = 2E_o \cos\big(\Delta k_x (x - 2x_s)\big)\, e^{-jk_x x} e^{2j\phi} \tag{3.5.50}$$

where the lateral Goos-Hänchen shift $x_s$ is defined as

$$x_s = \frac{\partial\phi}{\partial k_x} \tag{3.5.51}$$

The Goos-Hänchen shifts for TE and TM reflections are therefore

$$x_s^{(TE)} = \frac{\tan\theta_1}{\sqrt{k_x^2 - k_2^2}} \tag{3.5.52a}$$

$$x_s^{(TM)} = \frac{x_s^{(TE)}}{\left(\dfrac{k_x}{k_2}\right)^2 + \left(\dfrac{k_x}{k_1}\right)^2 - 1} \tag{3.5.52b}$$

The penetration depth for TE incident light is

$$z_s = \frac{1}{\alpha_{z2}} \tag{3.5.53}$$

and the TM penetration is greater. The penetration depth of (3.5.53) makes sense since a point-and-slope fit to the decaying field in (3.5.48b) gives the null location at $\alpha_{z2}^{-1}$.
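The closed-form lateral shifts can be verified against the definition $x_s = \partial\phi/\partial k_x$ by finite differences. The sketch below is an illustration only: the function names are hypothetical, `alpha` denotes $\sqrt{k_x^2 - k_2^2}$, and the glass-to-air indices, 1.55 um wavelength and 60 degree incidence are assumed example values.

```python
import math

def half_phase(kx, k1, k2, pol="TE"):
    """Half reflection phase phi for k2 < kx < k1 (see (3.5.45))."""
    alpha = math.sqrt(kx ** 2 - k2 ** 2)     # decay constant in medium 2
    kz1 = math.sqrt(k1 ** 2 - kx ** 2)       # normal wavenumber in medium 1
    weight = 1.0 if pol == "TE" else (k1 / k2) ** 2   # eps1/eps2 for TM
    return math.atan(weight * alpha / kz1)

def lateral_shift(kx, k1, k2, pol="TE"):
    """x_s from (3.5.52a,b); the reflected ray is displaced by 2*x_s."""
    alpha = math.sqrt(kx ** 2 - k2 ** 2)
    kz1 = math.sqrt(k1 ** 2 - kx ** 2)
    xs_te = (kx / kz1) / alpha               # tan(theta1) / alpha
    if pol == "TE":
        return xs_te
    return xs_te / ((kx / k2) ** 2 + (kx / k1) ** 2 - 1.0)

def numeric_shift(kx, k1, k2, pol, h=1e-6):
    """Central-difference estimate of d(phi)/d(kx)."""
    return (half_phase(kx + h, k1, k2, pol)
            - half_phase(kx - h, k1, k2, pol)) / (2.0 * h)

k0 = 2.0 * math.pi / 1.55                    # illustrative wavelength, um
k1, k2 = 1.5 * k0, 1.0 * k0
kx = k1 * math.sin(math.radians(60.0))
xs_te = lateral_shift(kx, k1, k2, "TE")
xs_tm = lateral_shift(kx, k1, k2, "TM")
```

For these example values the TE shift $x_s$ is about half a micrometre. Whether the TM shift is larger or smaller than the TE shift depends on the sign of $(k_x/k_2)^2 + (k_x/k_1)^2 - 2$, so the comparison changes with incident angle.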

3.6 Birefringent Materials


Most glasses when unstrained are isotropic due to their amorphous nature and lack of long-range periodicity. Crystals, however, can exhibit anisotropy due to natural symmetries of the lattice structure. There are seven distinct crystalline classes, one class being cubic, where the atomic lengths of the three unit cell dimensions are equal; three classes having one unique and two identical unit cell dimensions; and three classes having different unit cell dimensions in all three directions. Cubic crystals, being centrosymmetric, are the only isotropic crystalline class. As the polarizability of a material is intimately related to the binding energy of electrons within the lattice, differences in unit cell dimensions of the remaining six classes cause differences in the material susceptibility. Accordingly, the dielectric constant of the material depends on how the bound electrons oscillate.
In the transparent regime, a high binding energy corresponds to a tight spring constant in the electron-oscillator model, which in turn corresponds to a high resonance frequency. A low binding energy corresponds to a low resonance frequency. At an excitation frequency below both resonance frequencies, the electron oscillation along the axis (or axes) having the lower resonance frequency will induce a higher refractive index (Fig. 3.9). A uniaxial crystal has two different refractive indices and a biaxial crystal has three different refractive indices. Both uniaxial and biaxial materials are called birefringent materials. Table 3.3 summarizes the correspondence of crystal class symmetries with susceptibilities.
When a linearly polarized light ray propagates through a crystal such that its electric field is aligned to a crystalline lattice axis, denoted by index $i$, the equation of motion for the bound electrons is

$$m \frac{\partial^2 r_i}{\partial t^2} = -K_i r_i - eE_i - m\gamma_i \frac{\partial r_i}{\partial t} \tag{3.6.1}$$

where $K_i$ and $\gamma_i$ are the spring and damping constants for the $i$th direction. In general, however, the electric fields are not aligned to a crystalline axis and multiple vibrational modes are excited. Substitution of the electric susceptibility tensor $\bar{\chi}_e$ for the scalar $\chi_e$ provides the mathematical framework to describe this more complex oscillation. The tensor relation between the induced polarization density and incident electric field is

$$\mathbf{P} = \varepsilon_o \bar{\chi}_e(\omega)\, \mathbf{E} \tag{3.6.2}$$

(cf. (3.5.5)). In the natural coordinate system of the crystal, the tensor $\bar{\chi}_e$ is written as

$$\mathbf{P} = \varepsilon_o \begin{pmatrix} \chi_a & & \\ & \chi_b & \\ & & \chi_c \end{pmatrix} \mathbf{E} \tag{3.6.3}$$

where each diagonal susceptibility may be different and all off-diagonal components are zero. The lack of off-diagonal components in the lattice coordinate system indicates that under pure excitation along a crystalline axis there is no coupling to the other axes.

Fig. 3.9. Resonance model of birefringent vibrations and corresponding refractive index $n(\omega)$, showing the resonances $\omega_e$ and $\omega_o$ and the transparency region.
That the susceptibility is a tensor recasts the dielectric constant as a tensor: $\bar{\varepsilon} = \varepsilon_o (1 + \bar{\chi}_e)$. The relation of D to E, while still linear, now depends on the orientation of the electric field. For example, in the kDB system, the electric constitutive relation is

$$\mathbf{D}_k = T \bar{\varepsilon}\, T^{-1} \mathbf{E}_k \tag{3.6.4}$$

The tensor $T \bar{\varepsilon}\, T^{-1}$ in general has off-diagonal components, mixing the electron oscillation modes. The following studies analyze the propagation and polarization states of light rays within birefringent media.
3.6.1 Propagation in Uniaxial Materials
Uniaxial birefringence exists in trigonal, tetragonal, and hexagonal crystals. The axis that has the unique lattice constant is called the extraordinary axis; the remaining two axes are called the ordinary axes. A uniaxial crystal can be positive uniaxial or negative uniaxial, depending on the relative refractive indices. The convention is

positive (+) uniaxial: $n_e > n_o$
negative (−) uniaxial: $n_e < n_o$

Whether a crystal is positive or negative uniaxial depends on the particular constituents and group symmetries.

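Table 3.3. Crystal Classes and Birefringence

Class                                              Unit cell            Susceptibility
Isotropic: Cubic                                   $a = b = c$          $\bar{\chi}_e = \mathrm{diag}(\chi_a, \chi_a, \chi_a)$
Uniaxial: Trigonal, Tetragonal, Hexagonal          $a = b \neq c$       $\bar{\chi}_e = \mathrm{diag}(\chi_o, \chi_o, \chi_e)$
Biaxial: Triclinic, Monoclinic, Orthorhombic       $a \neq b \neq c$    $\bar{\chi}_e = \mathrm{diag}(\chi_a, \chi_b, \chi_c)$

After Fowles [5].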

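In a uniaxial material, tensors $\bar{\chi}$ and $\bar{\gamma}$ in $C_{DB}$ are zero, $\nu$ is a scalar, and $\bar{\kappa}$ is a tensor. The constitutive relations are

$$\mathbf{E} = \bar{\kappa} \cdot \mathbf{D} \tag{3.6.5a}$$
$$\mathbf{H} = \nu \mathbf{B} \tag{3.6.5b}$$

The impermittivity tensor $\bar{\kappa}$ is diagonal when D and E are aligned to its eigenvectors:

$$\bar{\kappa} = \begin{pmatrix} \kappa_o & & \\ & \kappa_o & \\ & & \kappa_e \end{pmatrix} \tag{3.6.6}$$

where the subscripts o and e refer to ordinary and extraordinary axes, respectively. In diagonal form, $\kappa_i = 1/\varepsilon_i$. Moreover, the extraordinary axis is aligned to the z axis of the lattice.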
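A plane wave propagating within the crystal has a k-vector direction. In the kDB system the k-vector is always aligned to the $\hat{e}_3$ axis, while vectors $D_1$ and $D_2$ are aligned to $\hat{e}_1$ and $\hat{e}_2$, respectively. The fields and fluxes, after rotation into kDB coordinates from lattice coordinates, are governed by the coordinate-transformed constitutive relations

$$\mathbf{E}_k = \bar{\kappa}_k \cdot \mathbf{D}_k \tag{3.6.7a}$$
$$\mathbf{H}_k = \nu \mathbf{B}_k \tag{3.6.7b}$$

where $\bar{\kappa}_k = T \bar{\kappa}\, T^{-1}$, or

$$\bar{\kappa}_k = \begin{pmatrix} \kappa_o & 0 & 0 \\ 0 & \kappa_o \cos^2\theta + \kappa_e \sin^2\theta & (\kappa_o - \kappa_e)\sin\theta\cos\theta \\ 0 & (\kappa_o - \kappa_e)\sin\theta\cos\theta & \kappa_o \sin^2\theta + \kappa_e \cos^2\theta \end{pmatrix} \tag{3.6.8}$$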

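The impermittivity tensor $\bar{\kappa}_k$ is independent of the azimuth $\phi$: a uniaxial crystal is isotropic in the plane normal to $\hat{z}$, so the orientation of the k-vector projection within that plane makes no difference. Substitution of (3.6.7) into the coupled kDB equations (3.3.5) gives

$$\begin{pmatrix} \kappa_{11} & 0 \\ 0 & \kappa_{22} \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = \begin{pmatrix} 0 & u \\ -u & 0 \end{pmatrix} \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}$$

$$\nu \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} 0 & -u \\ u & 0 \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix}$$

where

$$\kappa_{11} = \kappa_o, \qquad \kappa_{22} = \kappa_o \cos^2\theta + \kappa_e \sin^2\theta$$

Elimination of B generates the governing equation

$$\begin{pmatrix} u^2 - \nu\kappa_{11} & 0 \\ 0 & u^2 - \nu\kappa_{22} \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.6.9}$$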

In general, there are three non-trivial solutions to (3.6.9). They are

(1) $D_1 \neq 0$, $D_2 = 0$: $u = \sqrt{\nu\kappa_{11}}$
(2) $D_1 = 0$, $D_2 \neq 0$: $u = \sqrt{\nu\kappa_{22}}$
(3) $D_1 \neq 0$, $D_2 \neq 0$: $u = \sqrt{\nu\kappa_{11}}$ and $u = \sqrt{\nu\kappa_{22}}$

For solutions (1) and (2) the k-vector can point in any direction. The dispersion relation for solution (1) is

$$k = \frac{\omega}{c} n_o \tag{3.6.10}$$

Physically, the $D_1$ flux vector lies along $\hat{e}_1$, which in turn always lies in the $(x, y)$ lattice plane. The $D_1$ vector therefore never excites the extraordinary vibrational mode; only the ordinary refractive index is experienced.
On the other hand, the dispersion relation for solution (2) is

$$k = \frac{\omega}{c} \left( \frac{\cos^2\theta}{n_o^2} + \frac{\sin^2\theta}{n_e^2} \right)^{-1/2} \tag{3.6.11}$$

Physically, the $D_2$ flux vector can point in any direction in the lattice coordinates. Whether $D_2$ excites the extraordinary vibrational mode, the ordinary vibrational mode, or a mixture, depends only on $\theta$, the declination angle of the k-vector to the $\hat{z}$ axis. The effective index $n_e(\theta)$ is defined by identifying terms in (3.6.10, 3.6.11):

$$n_e(\theta) = \frac{n_e n_o}{\sqrt{n_e^2 \cos^2\theta + n_o^2 \sin^2\theta}} \tag{3.6.12}$$

Fig. 3.10. Two solutions to linearly polarized propagation in a (−) uniaxial medium. a) $\mathbf{D} \perp \hat{z}$, therefore $\mathbf{S} \parallel \mathbf{k}$. b) $\mathbf{D} \not\perp \hat{z}$, therefore $\mathbf{S} \nparallel \mathbf{k}$. For a (+) uniaxial crystal, $\mathbf{S}_e$ lies between $\hat{z}$ and $\mathbf{k}$.

When the k-vector is aligned with $\hat{z}$, $n_e(\theta) = n_o$; and when the k-vector is perpendicular to $\hat{z}$, $n_e(\theta) = n_e$. Otherwise, a mixture of vibrational modes is excited and an intermediate refractive index is experienced.
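The angular dependence of the extraordinary effective index is easy to tabulate. The sketch below is an illustration of (3.6.12); the function name is hypothetical, and the indices are approximate visible-wavelength values for calcite, a negative uniaxial crystal, used here only as an assumed example.

```python
import math

def effective_index(n_o, n_e, theta_deg):
    """Effective extraordinary index versus the angle between k and z."""
    th = math.radians(theta_deg)
    return n_e * n_o / math.sqrt(
        (n_e * math.cos(th)) ** 2 + (n_o * math.sin(th)) ** 2
    )

# ASSUMPTION: approximate calcite indices near 589 nm (negative uniaxial).
n_o, n_e = 1.658, 1.486
along_axis = effective_index(n_o, n_e, 0.0)      # k parallel to z
across_axis = effective_index(n_o, n_e, 90.0)    # k perpendicular to z
oblique = effective_index(n_o, n_e, 45.0)        # intermediate value
```

The two limits reproduce the ordinary and extraordinary indices, and any oblique direction gives a value between them.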
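The characteristic direction of the Poynting vectors is another fundamental difference between solutions (1) and (2). As illustrated in Fig. 3.10, the k- and Poynting vectors are aligned for solution (1) and are misaligned for solution (2). For solution (1), $D_2 = 0$ and the Poynting vector is

$$\mathbf{D}_k = \hat{e}_1 D_1, \qquad \mathbf{E}_k = \hat{e}_1 \kappa_{11} D_1$$
$$\mathbf{B}_k = \hat{e}_2 (u/\nu) D_1, \qquad \mathbf{H}_k = \hat{e}_2 u D_1$$
$$\mathbf{S}_k = \hat{e}_3 \kappa_{11} u D_1 D_1^* \tag{3.6.13}$$

In this case $\mathbf{S}_k \parallel \mathbf{k}$. Figure 3.10(a) illustrates that the $D_1$ vector is perpendicular to $\hat{z}$ and that the k- and Poynting vectors are aligned. Solution (2) on the other hand, where $D_1 = 0$, generates a Poynting vector according to

$$\mathbf{D}_k = \hat{e}_2 D_2, \qquad \mathbf{E}_k = \hat{e}_2 \kappa_{22} D_2 + \hat{e}_3 \kappa_{32} D_2$$
$$\mathbf{B}_k = -\hat{e}_1 (u/\nu) D_2, \qquad \mathbf{H}_k = -\hat{e}_1 u D_2$$
$$\mathbf{S}_k = (\hat{e}_3 \kappa_{22} - \hat{e}_2 \kappa_{32})\, u D_2 D_2^* \tag{3.6.14}$$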

This is a characteristically different solution. The E field has a longitudinal component along k and the Poynting vector is no longer parallel to the k-vector, $\mathbf{S}_k \nparallel \mathbf{k}$. Planes of constant phase lie perpendicular to the k-vector by definition, but those planes are tilted with respect to the energy-flow propagation direction. The walkoff angle $\gamma$ between the ordinary and extraordinary Poynting vectors is

$$\tan\gamma = \frac{\kappa_{32}}{\kappa_{22}} = \frac{(n_e^2 - n_o^2)\sin\theta\cos\theta}{n_e^2 \cos^2\theta + n_o^2 \sin^2\theta} \tag{3.6.15}$$

The angle $\gamma$ is a signed value: $\mathbf{S}_e$ is inclined less than k from the $\hat{z}$ axis in a positive uniaxial material; $\mathbf{S}_e$ is inclined further from $\hat{z}$ than k in a negative uniaxial material. In either case, $\mathbf{S}_e$ lies in the $(\hat{z}, \mathbf{k})$ plane. Similar to the reflection and transmission coefficients at a boundary, $\gamma$ is a function of the $n_e/n_o$ ratio.
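The walkoff expression (3.6.15) can be cross-checked against the geometric statement that the extraordinary Poynting vector is normal to the k-surface. The sketch below assumes the standard consequence of that statement, $\tan\theta_S = (n_o^2/n_e^2)\tan\theta$ for the declination $\theta_S$ of the Poynting vector from the z axis, and a sign convention in which positive walkoff tilts the Poynting vector toward the z axis. The function names and the approximate calcite indices are assumptions for illustration.

```python
import math

def walkoff_angle(n_o, n_e, theta):
    """Signed walkoff angle in radians from (3.6.15); theta in radians."""
    s, c = math.sin(theta), math.cos(theta)
    return math.atan((n_e ** 2 - n_o ** 2) * s * c
                     / (n_e ** 2 * c ** 2 + n_o ** 2 * s ** 2))

def poynting_declination(n_o, n_e, theta):
    """Angle of the e-ray Poynting vector from z (k-surface normal)."""
    return math.atan((n_o / n_e) ** 2 * math.tan(theta))

theta = math.radians(45.0)
n_o, n_e = 1.658, 1.486          # ASSUMPTION: approximate calcite values
gamma_neg = walkoff_angle(n_o, n_e, theta)     # negative uniaxial
gamma_pos = walkoff_angle(n_e, n_o, theta)     # indices swapped: positive uniaxial
theta_s = poynting_declination(n_o, n_e, theta)
```

With these values the walkoff is about 6 degrees in magnitude; it is negative for the negative uniaxial case (Poynting vector further from z than k) and positive when the two indices are interchanged.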
There is an important rule for the behavior of uniaxial crystals. The rule is

$\mathbf{D} \perp \hat{z}$ or $\mathbf{D} \parallel \hat{z}$ $\implies$ $\mathbf{S}_e \parallel \mathbf{k}_e$: When the electric flux vector is either parallel or perpendicular to the $\hat{z}$ axis of the crystal, the extraordinary k- and Poynting vectors coincide.

$\mathbf{D} \not\perp \hat{z}$ and $\mathbf{D} \nparallel \hat{z}$ $\implies$ $\mathbf{S}_e \nparallel \mathbf{k}_e$: When the electric flux vector is aligned in any other orientation, the extraordinary k- and Poynting vectors are misaligned.

Many designs use the fact that the ordinary and extraordinary rays can be separated by walkoff in a uniaxial crystal.
Returning to (3.6.9), solution (3) allows for the simultaneous existence of $D_1$ and $D_2$, but only when k is aligned along $\hat{z}$. With $D_1 \neq 0$ and $D_2 \neq 0$, (3.6.9) can only be satisfied for $\theta = 0$. Physically, the k-vector points along $\hat{z}$ and the perpendicular plane contains only the ordinary vibrational mode; only the ordinary refractive index is experienced and the material appears isotropic.
In general, only linearly polarized plane waves can propagate in a uniaxial birefringent crystal. There are two solutions for each orientation of the k-vector, each solution having a distinct wavenumber, effective refractive index, and polarization. An arbitrarily polarized light ray incident on a birefringent crystal is decomposed into two linearly polarized light rays, one associated with each wavenumber. Refraction into and out of a birefringent crystal shows the distinction between these two solutions.
The behavior of light in a birefringent medium, and its refraction into and out of the medium, is described geometrically by the indicatrix. The indicatrix encompasses all possible effective refractive indices of the medium, is associated with the k-vector and polarization vectors, and generates the orientation of the Poynting vector.

Fig. 3.11. The ordinary and extraordinary indicatrices. a) The ordinary indicatrix is isotropic to k-vector orientation. b) The extraordinary indicatrix is an ellipsoid, with major axis along $\hat{z}$. This is a negative uniaxial indicatrix, the $\hat{z}$ axis being longer than the others.
For each k-vector there are two indicatrices: one for $D_1 \neq 0$ and the other for $D_2 \neq 0$. In the former case, dispersion relation (3.6.10) is isotropic. Resolving the k-vector into components along the lattice coordinates, the ordinary indicatrix is a sphere (Fig. 3.11(a)), which follows the expression

$$\frac{k_x^2}{n_o^2} + \frac{k_y^2}{n_o^2} + \frac{k_z^2}{n_o^2} = \frac{\omega^2}{c^2} \tag{3.6.16}$$

For the $D_2 \neq 0$ case, the following coordinate associations are made:

$$k_z = k\cos\theta, \qquad k_y = k\sin\theta\sin\phi, \qquad k_x = k\sin\theta\cos\phi$$

Substitution into (3.6.11) generates an ellipsoidal indicatrix (Fig. 3.11(b)), which follows the expression

$$\frac{k_x^2}{n_e^2} + \frac{k_y^2}{n_e^2} + \frac{k_z^2}{n_o^2} = \frac{\omega^2}{c^2} \tag{3.6.17}$$

This is the extraordinary indicatrix. The major and minor axes of this indicatrix are aligned to the axes of the lattice, the major axis aligned to the $\hat{z}$ axis, and the axis lengths are the wavenumbers $k_{x,y,z}$ along the associated lattice axes. Equivalently, the axis lengths are the refractive indices along the lattice directions.

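For either indicatrix the wavenumber is determined by the length of the k-vector at the point of intersection between that vector and the ellipsoid surface. The group velocity is normal to the surface at this intersection:

$$\mathbf{v}_g = \nabla_k\, \omega \tag{3.6.18}$$

where $\nabla_k = \partial/\partial\mathbf{k}$. Recall that the group velocity is the result of different wavelengths travelling at different speeds. In a birefringent medium a change of k-vector at a fixed frequency is sufficient to alter the propagation speed. A perturbation analysis of Maxwell's equations (3.3.1a,b) shows the geometric interpretation. Expanding these equations to first order in $\delta\mathbf{k}$ and making the indicated dot products gives

$$\mathbf{H}^* \cdot \left( \delta\mathbf{k} \times \mathbf{E} + \mathbf{k} \times \delta\mathbf{E} = \omega\,\delta\mathbf{B} \right)$$
$$\mathbf{E} \cdot \left( \delta\mathbf{k} \times \mathbf{H}^* + \mathbf{k} \times \delta\mathbf{H}^* = -\omega\,\delta\mathbf{D}^* \right)$$

Taking the difference and applying the vector identity $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a})$ yields

$$2\,\delta\mathbf{k} \cdot (\mathbf{E} \times \mathbf{H}^*) = \omega \left( \mathbf{H}^* \cdot \delta\mathbf{B} - \delta\mathbf{H}^* \cdot \mathbf{B} + \mathbf{E} \cdot \delta\mathbf{D}^* - \delta\mathbf{E} \cdot \mathbf{D}^* \right)$$

Substitution of D and B with the inverse constitutive relations (3.6.5), recognizing that $\delta\mathbf{E} \cdot \bar{\varepsilon}^* \cdot \mathbf{E}^* = (\mathbf{E} \cdot \bar{\varepsilon}^* \cdot \delta\mathbf{E}^*)^*$ (cf. §3.2), and applying the lossless conditions $\bar{\varepsilon} = \bar{\varepsilon}^\dagger$ and $\mu = \mu^*$, reduces the expansion to

$$2\,\delta\mathbf{k} \cdot (\mathbf{E} \times \mathbf{H}^*) = \omega \left( (\mathbf{E} \cdot \bar{\varepsilon}^* \cdot \delta\mathbf{E}^*) - (\mathbf{E} \cdot \bar{\varepsilon}^* \cdot \delta\mathbf{E}^*)^* + (\mathbf{H}^* \cdot \mu\,\delta\mathbf{H}) - (\mathbf{H}^* \cdot \mu\,\delta\mathbf{H})^* \right)$$

The right-hand side of this expression is an entirely imaginary quantity. As the time-averaged Poynting vector is defined as $\langle\mathbf{S}\rangle = \Re e\,(\mathbf{E} \times \mathbf{H}^*)/2$, this last expression is further reduced to

$$\delta\mathbf{k} \cdot \langle\mathbf{S}\rangle = 0 \tag{3.6.19}$$

Geometrically, $\langle\mathbf{S}\rangle$ is normal to the indicatrix surface at the intersection point k. As this coincides with the direction of the group velocity, it is clear that the energy flow travels at the group velocity and may not coincide with the direction of the k-vector.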
3.6.2 Refraction at an Interface
Refraction and reflection at an interface between isotropic and uniaxial materials are treated nearly the same way as detailed in §3.5, but with the following modifications:
1) the o-ray and e-ray are calculated separately;
2) the effective index of the e-ray determines the refraction angle and reflection coefficient; and
3) the Poynting vector of the e-ray is not generally coincident with the k-vector and must be calculated separately.
Snell's law remains intact for birefringent materials because it enforces phase matching along the interface.

Fig. 3.12. Two orientations of the extraordinary indicatrix at a boundary. a) The $\hat{z}$ axis is normal to the interface; view P is isotropic about $\hat{z}$. b) The $\hat{z}$ axis is in the plane of the interface. Rays Q and R are orthogonal slices through the indicatrix and have different refraction angles.
Work with birefringent crystals requires the distinction between the physical shape of the crystal and the internal orientation of the lattice. Crystal cuts can be limited by the brittleness of the material, but there are nonetheless several customary orientations that can be cut. As illustrated in Fig. 3.12, two common orientations are for the $\hat{z}$ axis to be normal to, or to lie in, the interface plane. The figures illustrate the extraordinary indicatrix of a positive uniaxial crystal intersected by a plane, and rays P, Q, and R refracting into the interface.
Figure 3.12(a) illustrates the $\hat{z}$ axis cut perpendicular to the interface plane. The vertical plane defined by view P is isotropic about $\hat{z}$; the effective index of e-ray P depends only on its inclination. Figure 3.12(b) illustrates the $\hat{z}$ axis cut within the interface plane. The vertical plane defined by view Q forms an elliptical intersection with the indicatrix while the plane defined by view R forms a circle. The effective index of an e-ray refracting into a uniaxial material with this cut depends both on the inclination angle and the azimuth orientation about the interface normal.
Figures 3.13(a–f) show the refractive index manifolds for views P, Q, and R for positive and negative uniaxial crystals. In each figure, the ordinary and extraordinary indicatrices are shown in plan view and bisected by the interface plane. The center of each indicatrix must lie in the plane of the interface and be located where the respective k-vector breaches the interface. As illustrated, the centers of the e- and o-indicatrices coincide.
The drawings illustrate the relative refraction for the e- and o-rays. The horizontal components of the $\mathbf{k}_e$- and $\mathbf{k}_o$-vectors are equal, satisfying phase matching along the interface. The e- and o-Poynting vectors are normal to the corresponding indicatrices at the point of intersection with the respective k-vectors. The polarization orientation is also illustrated for each ray. The ordinary ray always has its polarization state perpendicular to the extraordinary axis z. Depending on the direction of z at the interface, the ordinary ray may be either TE or TM. Finally, the fast axis is always the axis that has the lower refractive index; like TE and TM, the fast axis can be either the e- or o-ray, depending on the orientation of z to the interface.

Fig. 3.13a. Refraction manifold for orientation P in a positive uniaxial crystal. Ordinary (dashed) and extraordinary (solid) indicatrices are bisected by the interface plane. Poynting vectors S are normal to the respective indicatrices. The o-ray is TE.
Fig. 3.13b. Refraction manifold for orientation P in a negative uniaxial crystal. Ordinary (dashed) and extraordinary (solid) indicatrices are bisected by the interface plane. Poynting vectors S are normal to the respective indicatrices. The o-ray is TE.
Fig. 3.13c. Refraction manifold for orientation Q (positive uniaxial). The o-ray is TE.
Fig. 3.13d. Refraction manifold for orientation Q (negative uniaxial). The o-ray is TE.
Fig. 3.13e. Refraction manifold for orientation R (positive uniaxial). The o-ray is TM. This is the only orientation that is isotropic for both e-rays and o-rays.
Fig. 3.13f. Refraction manifold for orientation R (negative uniaxial). The o-ray is TM. This is the only orientation that is isotropic for both e-rays and o-rays.

Fig. 3.14. Refraction from an isotropic material into a positive uniaxial material, views P and Q. a) Poynting vector $\mathbf{S}_e$ lies between $\mathbf{S}_o$ and z. The e-ray is slow. b) Poynting vector $\mathbf{S}_e$ lies outside of $\mathbf{S}_o$. The e-ray is again slow. In both cases the o-ray is TE.
Refraction into positive uniaxial crystals for views P and Q is shown in detail in Figs. 3.14(a,b). For view P (Fig. 3.14(a)), the o-ray sees a refractive index of $n_o$ and is TE. Snell's law determines the angle of refraction:
$$n_i \sin\theta_i = n_o \sin\theta_o \tag{3.6.20}$$
The Poynting vector $\mathbf{S}_o$ coincides with $\mathbf{k}_o$. The e-ray, which is TM, sees an effective refractive index $n_e'$ given by
$$n_e' = \frac{n_e n_o}{\sqrt{n_e^2 \cos^2\theta_e + n_o^2 \sin^2\theta_e}} \tag{3.6.21}$$
where $\theta_e$ is the angle between $\mathbf{k}_e$ and the z axis. Snell's law then reads
$$n_i \sin\theta_i = n_e' \sin\theta_e \tag{3.6.22}$$
However, this version of Snell's law contains the angle $\theta_e$ both in $n_e'$ and in the sine term. Given $n_e$, $n_o$, and $\theta_i$, (3.6.22) can be solved for $\theta_e$:
$$\tan\theta_e = \frac{n_e}{n_o}\,\frac{n_i \sin\theta_i}{\sqrt{n_e^2 - n_i^2 \sin^2\theta_i}} \tag{3.6.23}$$

Fig. 3.15. A walkoff cut: a uniaxial crystal cut where the z axis is inclined by $\gamma$ from the normal. a) The (positive) extraordinary indicatrix as intersected by the interface plane. View V is indicated. b) Plan view of refraction at the interface. Ordinary (dashed) and extraordinary (solid) indicatrices are shown. $\mathbf{S}_e$ is inclined by $\alpha$ from the normal.

One can verify that when $n_e \to n_o$, (3.6.23) reduces to (3.6.20). The normal to the extraordinary indicatrix at the point of intersection with $\mathbf{k}_e$ determines the direction of the $\mathbf{S}_e$ vector. The angle $\xi$ between $\mathbf{S}_o$ and $\mathbf{S}_e$ is calculated from (3.6.15) using $\theta = \theta_e$. For view P with a positive uniaxial crystal, $\xi$ is negative and $\mathbf{S}_e$ lies between $\mathbf{S}_o$ and z.
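The closed-form solution (3.6.23) can be checked numerically against Snell's law (3.6.22). The following is a minimal sketch rather than a reference implementation: the function names and the index values (air into a crystal with $n_o = 1.95$, $n_e = 2.15$) are illustrative assumptions, not values taken from the text.

```python
import math

def effective_index(n_o, n_e, theta_e):
    """Effective e-ray index (3.6.21); theta_e is the angle between k_e and z."""
    return n_e * n_o / math.sqrt((n_e * math.cos(theta_e)) ** 2
                                 + (n_o * math.sin(theta_e)) ** 2)

def e_ray_angle(n_i, n_o, n_e, theta_i):
    """Refraction angle of the e-ray for view P, closed form (3.6.23)."""
    s = n_i * math.sin(theta_i)
    return math.atan((n_e / n_o) * s / math.sqrt(n_e ** 2 - s ** 2))

# Illustrative (assumed) values: air into a positive uniaxial crystal.
n_i, n_o, n_e = 1.0, 1.95, 2.15
theta_i = math.radians(30.0)

theta_o = math.asin(n_i * math.sin(theta_i) / n_o)   # o-ray, (3.6.20)
theta_e = e_ray_angle(n_i, n_o, n_e, theta_i)        # e-ray, (3.6.23)

# Residual of Snell's law (3.6.22); it should vanish to rounding error.
residual = (n_i * math.sin(theta_i)
            - effective_index(n_o, n_e, theta_e) * math.sin(theta_e))

print(math.degrees(theta_o), math.degrees(theta_e), residual)
```

Setting `n_e` equal to `n_o` in `e_ray_angle` returns the ordinary refraction angle, which is the isotropic limit mentioned above.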
View Q is for a cut such that z lies in the interface plane; this is a waveplate cut. When $\theta_i = 0$, the ordinary and extraordinary k- and Poynting vectors coincide but the refractive index of the e-ray differs from that of the o-ray. As the two rays transit the crystal, the e- and o-rays slip relative to one another, which in turn transforms the polarization state from input to output.
Figure 3.15 illustrates a third important crystal cut, the walkoff cut. A walkoff crystal is used to polarize the input light and spatially separate the resultant ordinary and extraordinary components. The crystal is cut such that its z axis is inclined from the plane of the interface. Since no walkoff occurs at 0° and 90° inclination, there is an intermediate inclination angle that maximizes the angular separation of $\mathbf{S}_e$ and $\mathbf{S}_o$.
Figure 3.15(b) illustrates the refraction manifold for an inclined-cut positive-uniaxial crystal for view V (Fig. 3.15(a)). The z axis is inclined by $\gamma$ with respect to the normal. When the incident light is normal to the surface, the ordinary component is not refracted and continues in the same direction. The ordinary Poynting vector runs parallel to the $\mathbf{k}_o$ vector. The extraordinary component, however, behaves differently. Since the input angle (as illustrated) is normal, the extraordinary k-vector also continues into the crystal without refraction and runs parallel to the $\mathbf{k}_o$ vector. However, the point of intersection of $\mathbf{k}_e$ with the extraordinary indicatrix leads to a deflected Poynting vector. The deflection for a positive-uniaxial crystal is in the direction of the z axis.
The deflection angle $\alpha$, also known as the walkoff angle, is governed by (3.6.15) with $\gamma$ in place of $\theta$ (note that $n_e$ is not replaced by $n_e'$: the deflection equation is written in terms of the cardinal indices). In order to maximize the walkoff angle $\alpha$, (3.6.15) is differentiated with respect to $\gamma$ and set to zero. The cut angle $\hat\gamma$ that maximizes the walkoff angle is
$$\tan\hat\gamma = \frac{n_e}{n_o} \tag{3.6.24}$$
and the corresponding deflection angle is
$$\tan\alpha_{max} = \frac{1}{2}\left(\frac{n_e}{n_o} - \frac{n_o}{n_e}\right) \tag{3.6.25}$$
The governing quantity is the ratio $n_e/n_o$, rather than the difference. A positive-uniaxial crystal having 10% birefringence, that is, $n_e/n_o = 1.1$, gives a maximum walkoff angle $\alpha_{max} \approx 5.45°$ for a cut angle of $\hat\gamma \approx 47.7°$.
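The numbers quoted for 10% birefringence can be reproduced in a few lines. This is a sketch under one stated assumption: the walkoff relation (3.6.15) is not reproduced in this section, so the function `walkoff_angle` below uses the standard uniaxial form $\tan\alpha = (n_e^2 - n_o^2)\tan\gamma/(n_e^2 + n_o^2\tan^2\gamma)$, taken as a magnitude without regard to the sign convention of the text. All names are illustrative.

```python
import math

def walkoff_angle(n_o, n_e, gamma):
    """Angle between S_e and k_e when k_e makes angle gamma with the z axis.

    Assumed standard uniaxial form (magnitude only); stands in for (3.6.15).
    """
    t = math.tan(gamma)
    return math.atan((n_e ** 2 - n_o ** 2) * t / (n_e ** 2 + n_o ** 2 * t ** 2))

n_o, n_e = 1.0, 1.1                       # only the ratio n_e/n_o matters

gamma_hat = math.atan(n_e / n_o)                       # optimum cut, (3.6.24)
alpha_max = math.atan(0.5 * (n_e / n_o - n_o / n_e))   # max walkoff, (3.6.25)

# Independent check: brute-force scan of cut angles in 0.01 degree steps.
scan_alpha, scan_gamma_deg = max(
    (walkoff_angle(n_o, n_e, math.radians(0.01 * i)), 0.01 * i)
    for i in range(1, 9000)
)

print(round(math.degrees(gamma_hat), 1), round(math.degrees(alpha_max), 2))
# -> 47.7 5.45
```

The scan locates the same optimum as the closed forms, which illustrates that the maximum is broad: a cut-angle error of a degree or so costs very little walkoff.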
A walkoff crystal makes two parallel but laterally displaced, orthogonally polarized output rays when the input ray is perpendicular to the input face, and the input and output faces are cut parallel to one another. The index of the ordinary ray is $n_o$, while that of the extraordinary ray is (compare (3.6.21))
$$n_e' = \frac{n_e n_o}{\sqrt{n_e^2 \cos^2(\theta_e + \gamma) + n_o^2 \sin^2(\theta_e + \gamma)}} \tag{3.6.26}$$
For a crystal cut at $\hat\gamma$ the effective index is
$$n_e'^{\,2} = \frac{1}{2}\left(n_e^2 + n_o^2\right) \tag{3.6.27}$$
This is, in fact, just the average of the permittivities along the cardinal directions.
The effective path length $L_e$ of the extraordinary ray is the length traversed by the $\mathbf{k}_e$ vector times the effective index. That is, the path length of $\mathbf{S}_e$ does not enter into the effective path. The path-length difference for a walkoff crystal of length $L$ is $L_e - L_o = (n_e' - n_o)\,L$.
3.6.3 Total Internal Reflection
Total internal reflection (TIR) for birefringent materials is more interesting than for isotropic materials because its onset differs for e- and o-rays. Otherwise, the physics of TIR remains the same as detailed in §3.5.6. This section analyzes a birefringent polarizer that uses selective TIR, and asymmetric TIR where the input and output angles are unequal.
Figure 3.16 illustrates a polarizing birefringent wedge. The z axis of the wedge is cut normal to the plane of the page and the output face is tilted by angle $\theta_p$ from the input face. The incident light is normal to the input face. Since the polarization of the o- and e-rays is perpendicular to the z axis, the Poynting vectors coincide with their respective k-vectors. No refraction is experienced at the input. However, the inclination of the output face cuts off transmission for the e-ray but not the o-ray. This is because the refractive indices of the two rays differ. The critical angles for the two polarizations are
$$\theta_{c,e} = \sin^{-1}(n_i/n_e)$$
for the e-ray and
$$\theta_{c,o} = \sin^{-1}(n_i/n_o)$$
for the o-ray, where $n_i$ is the isotropic index, possibly air. As long as $\theta_p$ falls within $\theta_{c,e} < \theta_p < \theta_{c,o}$, only the o-ray emerges from the output face. The e-ray experiences TIR. The angle at which the o-ray emerges with respect to the axis of the input ray is
$$\theta_o = \sin^{-1}\left(\frac{n_o}{n_i}\sin\theta_p\right)$$
The o-ray is TM at the second interface and will suffer reflection, reducing its transmission. The reflection coefficient is determined by (3.5.40). The Shirasaki prism solves this problem by setting $\theta_p$ to Brewster's angle (see §4.7.3).

Fig. 3.16. Polarizing wedge. a) Cross-section of a polarizing birefringent wedge cut with z pointing out of the page. Only the o-ray survives refraction. b) Plan view of output refraction manifold.
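The window of wedge angles that passes only the o-ray follows directly from the two critical angles. The sketch below evaluates it for assumed, illustrative indices ($n_o = 1.95$, $n_e = 2.15$, output into air); neither the values nor the function name come from the text.

```python
import math

# Illustrative (assumed) indices for a positive uniaxial wedge in air.
n_i, n_o, n_e = 1.0, 1.95, 2.15

theta_ce = math.asin(n_i / n_e)   # critical angle of the e-ray
theta_co = math.asin(n_i / n_o)   # critical angle of the o-ray

def o_ray_exit_angle(theta_p):
    """Exit angle of the o-ray for wedge angle theta_p, or None under TIR."""
    s = n_o * math.sin(theta_p) / n_i
    return math.asin(s) if s <= 1.0 else None

# Pick a wedge angle in the middle of the polarizing window.
theta_p = 0.5 * (theta_ce + theta_co)
e_ray_is_tir = n_e * math.sin(theta_p) / n_i > 1.0
theta_out = o_ray_exit_angle(theta_p)

print(math.degrees(theta_ce), math.degrees(theta_co))
print(e_ray_is_tir, math.degrees(theta_out))
```

With these assumed indices the window is only about three degrees wide, and the o-ray leaves the face at a steep angle, which is why the residual TM reflection mentioned above matters in practice.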
Another example of the effects of birefringent total internal reflection is the asymmetric reflection created in a walkoff-cut crystal, Fig. 3.17. The left side of the figure illustrates the configuration, where the z axis is cut at angle $\gamma$ and the incident light is normal to the input. The k-vector does not refract into the crystal, but the extraordinary Poynting vector tilts upward by angle $\alpha$. The Poynting vector propagates until it hits the roof of the crystal, whereupon it experiences TIR as long as the outside refractive index is suitably low. After total internal reflection the k- and Poynting vectors point downward. The k- and Poynting vectors refract at the output surface and subsequently run parallel to one another.

Fig. 3.17. TIR from roof of walkoff block. a) Path of e-ray k- and Poynting vectors. Position A: normal incidence, walkoff of extraordinary ray. Position B: TIR from roof, asymmetric reflection and direction change of k-vector. Position C: refraction at output. b) Detail of Position B.

Figure 3.17(b) illustrates the detail of the TIR at the roof of the walkoff crystal. The extraordinary-index refraction manifold is shown tilted by angle $\gamma$ with respect to the roof. The incident vector $\mathbf{k}_e$ is parallel to the roof. The reflected wave must maintain phase matching along the roof, and, unlike an isotropic material, the ellipsoidal shape of the extraordinary indicatrix allows for two solutions: $\mathbf{k}_e$ that runs parallel with the roof, and $\mathbf{k}_e'$ that tilts downward at angle $\beta$. To calculate the angle $\beta$, the effective index $n_e''$ associated with $\mathbf{k}_e'$ (when projected along the horizontal axis) must match the effective index $n_e'$ associated with $\mathbf{k}_e$. Thus,
$$n_e'' \cos\beta = n_e'$$
where
$$n_e'' = \frac{n_e n_o}{\sqrt{n_e^2 \cos^2(\gamma + \beta) + n_o^2 \sin^2(\gamma + \beta)}}$$
Solving for $\tan\beta$ gives
$$\tan\beta = \frac{2\,(n_e^2 - n_o^2)\sin\gamma\cos\gamma}{n_e^2 \sin^2\gamma + n_o^2 \cos^2\gamma} \tag{3.6.28}$$
The vector $\mathbf{k}_e'$ is tilted downward by angle $\beta$. The deviation angle $\alpha'$ between Poynting vector $\mathbf{S}_e'$ and $\mathbf{k}_e'$ is calculated from (3.6.15), replacing $\theta$ with $\gamma + \beta$. The reflection angle $\theta_b$ with respect to the roof is $\theta_b = \beta + \alpha'$. Generally speaking, $\theta_a \neq \theta_b$, although the values can be close.
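The tilt formula (3.6.28) can be checked against the phase-matching condition it was derived from. In the sketch below the downward tilt of the reflected k-vector is called `beta` and the cut angle `gamma`; these names, the helper functions, and the 30° cut with $n_e/n_o = 1.1$ are illustrative assumptions chosen only to exercise the formula.

```python
import math

def n_eff(n_o, n_e, theta):
    """Effective e-ray index for a k-vector at angle theta from the z axis."""
    return n_e * n_o / math.sqrt((n_e * math.cos(theta)) ** 2
                                 + (n_o * math.sin(theta)) ** 2)

def tilt_angle(n_o, n_e, gamma):
    """Downward tilt of the reflected k-vector at the roof, per (3.6.28)."""
    num = 2.0 * (n_e ** 2 - n_o ** 2) * math.sin(gamma) * math.cos(gamma)
    den = (n_e * math.sin(gamma)) ** 2 + (n_o * math.cos(gamma)) ** 2
    return math.atan(num / den)

n_o, n_e = 1.0, 1.1            # assumed indices (ratio 1.1)
gamma = math.radians(30.0)     # assumed cut angle

beta = tilt_angle(n_o, n_e, gamma)

# Phase matching along the roof: the projected index of the reflected wave
# must equal the index of the incident wave, which travels along the roof.
mismatch = n_eff(n_o, n_e, gamma + beta) * math.cos(beta) - n_eff(n_o, n_e, gamma)

print(math.degrees(beta), mismatch)
```

The mismatch vanishes to rounding error for any cut angle, and `beta` goes to zero as `n_e` approaches `n_o`, recovering the single reflected solution of an isotropic medium.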
When the tilt angle is optimized for maximum walkoff, $\tan\hat\gamma = n_e/n_o$. Substituting this angle into (3.6.28) gives
$$\tan\beta(\hat\gamma) = \frac{n_e}{n_o} - \frac{n_o}{n_e} \tag{3.6.29}$$
For $n_e/n_o = 1.1$, $\beta \approx 10.8°$. Another relevant identity is
$$\tan\left(\hat\gamma + \beta(\hat\gamma)\right) = \left(\frac{n_e}{n_o}\right)^3 \tag{3.6.30}$$
Combined, the inclination of $\mathbf{S}_e'$ to $\mathbf{k}_e'$ is given by
$$\tan\alpha'(\hat\gamma) = \frac{1 - (n_e/n_o)^2}{1 + (n_e/n_o)^4}\,\frac{n_e}{n_o} \tag{3.6.31}$$
Substitution yields $\alpha' \approx -5.36°$. Therefore, the reflected angle is $\theta_b \approx 5.44°$. This is compared with $\theta_a \approx 5.45°$. Here the reflected and incident TIR angles are nearly the same. This is because $\hat\gamma$ maximizes $\alpha$; at the maximum the system is stationary. Asymmetry occurs for lower walkoff angles.
3.6.4 Polarization Transformation
A waveplate is a birefringent crystal cut so that its extraordinary axis lies in the plane of the input, and the input and output crystal faces are parallel. A light ray normally incident on the input face is resolved into ordinary and extraordinary components, according to the relative orientation of the input polarization and the e-axis of the waveplate. The k- and Poynting vectors within the crystal propagate collinearly. The wavenumbers for the two components are
$$k_e = \frac{\omega}{c}\,n_e, \qquad k_o = \frac{\omega}{c}\,n_o \tag{3.6.32}$$
Therefore the phase velocities differ. The difference in phase velocity results in the fast component walking through the slow component, slipping one full optical wave every birefringent beat length.
Strictly speaking, the ordinary and extraordinary waves within the waveplate propagate independently from one another, and it is accurate to say only linearly polarized fields exist. Once the fields emerge from the waveplate they continue to run collinearly but have a phase slip between them. This phase slip makes the output polarization (generally) different from the input polarization. Within a waveplate, the phase slip at any position $z_o$ is represented by a polarization rotation as if the waveplate ended at $z_o$.
Figure 3.18 illustrates the polarization evolution along a long waveplate in laboratory coordinates and Stokes coordinates. In the laboratory frame, an input polarization state $|s\rangle$ is resolved into ordinary and extraordinary components at the input face of the waveplate. These two components have different wavelengths and travel at different phase velocities. The phase slip between the components is called the birefringent phase and as a function of travel distance $z$ is defined by
$$\varphi(\omega, z) = \frac{\omega\,\Delta n\,z}{c} \tag{3.6.33}$$

Fig. 3.18. Polarization transformation along a waveplate. a) The input SOP $|s\rangle$ is projected onto $\hat z$, which is tilted by angle $\eta$ within the polarization plane. The projection generates two collinear waves having different wavelengths. The phase slip along the waveplate transforms the state of polarization from state a through e. b) Stokes picture of polarization change. Precession of $\hat s$ about $\hat r$.

where $\Delta n$ is the birefringence $\Delta n = n_e - n_o$. The travel distance over which the components slip by one full wave is called the birefringent beat length. This distance is defined by
$$\Lambda = \frac{\lambda_o}{\Delta n} \tag{3.6.34}$$
where $\lambda_o$ is the free-space wavelength. The positive uniaxial crystal YVO4 has a birefringence of $\Delta n \approx 0.2$; the corresponding beat length is only 7.5 µm at $\lambda_o \approx 1.5$ µm.
The phase slip through the waveplate changes the polarization state. Figure 3.18(b) illustrates the corresponding Stokes transformation. As only the
phase changes but not the power split between e- and o-components, the
precession circle is traced. The birefringent axis of the waveplate lies on the
equator of the Poincaré sphere. The polarization evolution is periodic with
travel distance.
The birefringent phase also depends on frequency. The free-spectral range is the analogue to the birefringent beat length but for frequency change, not length change. However, there is an important subtlety: the ordinary and extraordinary refractive indices are generally frequency dependent. To first order the refractive index $n$ is replaced by the group index $n_g$. Accounting for the group index, the free-spectral range (FSR) of a waveplate of length $L$ is defined by the frequency change $\omega_2 - \omega_1 = 2\pi\,\mathrm{FSR}$ over which the birefringent phase advances by $2\pi$, or
$$\mathrm{FSR} = \frac{c}{\Delta n_g L} \tag{3.6.35}$$
The free-spectral range is given in cyclic frequency. The subsection entitled "Free-Spectral Range and Group Index" starting on page 158 details the relation between phase and group indices.

The accrual of phase slip due to frequency change is tantamount to a relative delay between the two components. This delay is called the differential group delay (DGD). As delay is defined as $\tau = \partial\varphi/\partial\omega$, the DGD for a waveplate is
$$\tau = \frac{\Delta n_g L}{c} = \frac{1}{\mathrm{FSR}} \tag{3.6.36}$$
With this definition of differential group delay, the birefringent phase as a function of frequency is written $\varphi = \omega\tau$. An output polarization state precesses about the birefringent axis (of a single homogeneous section) in Stokes space just the same as if the length changed, but in the case of length change the precession angle is $\varphi = 2\pi z/\Lambda$.
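The three waveplate quantities (beat length, free-spectral range, and differential group delay) are simple to evaluate together. The sketch below uses the YVO4 figures quoted above; the 1 mm plate length and the choice of setting the group birefringence equal to the phase birefringence are assumptions made only for illustration, and the function names are hypothetical.

```python
import math

C = 299_792_458.0          # speed of light in vacuum, m/s

def beat_length(lambda_0, delta_n):
    """Birefringent beat length, (3.6.34)."""
    return lambda_0 / delta_n

def free_spectral_range(delta_n_g, length):
    """Free-spectral range in cyclic frequency (Hz), (3.6.35)."""
    return C / (delta_n_g * length)

def birefringent_phase(omega, delta_n, z):
    """Birefringent phase at angular frequency omega and distance z, (3.6.33)."""
    return omega * delta_n * z / C

delta_n = 0.2              # YVO4 birefringence quoted in the text
lambda_0 = 1.5e-6          # free-space wavelength, m
length = 1.0e-3            # ASSUMED plate length: 1 mm
delta_n_g = delta_n        # ASSUMPTION: dispersion of the birefringence ignored

lb = beat_length(lambda_0, delta_n)
fsr = free_spectral_range(delta_n_g, length)
dgd = 1.0 / fsr                                       # (3.6.36)

# Stepping the optical frequency by one FSR advances the phase by 2*pi.
omega = 2.0 * math.pi * C / lambda_0
phase_step = (birefringent_phase(omega + 2.0 * math.pi * fsr, delta_n, length)
              - birefringent_phase(omega, delta_n, length))

print(lb, fsr, dgd, phase_step / (2.0 * math.pi))
```

Under these assumptions the 1 mm plate has an FSR near 1.5 THz and a DGD near 0.67 ps. With a real dispersive crystal `delta_n_g` differs from `delta_n`, and the phase step per FSR is exactly $2\pi$ only to first order.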

3.7 Gyrotropic Materials


Gyrotropic materials are those whose optical properties are affected by either an applied magnetic field or their own internal magnetization. In either case the eigen-axes are elliptical or circular rather than linear. Moreover, unique to gyrotropic materials is their nonreciprocal polarization rotation. This nonreciprocal rotation is due to a reorientation of the eigen-axes between forward and backward transit. The term non-reciprocal is of unfortunate historical origin because one typically thinks of a non-linear system being non-reciprocal. Non-reciprocity of a non-linear system means one cannot reverse time and end up with what was at the beginning. Gyrotropic processes, instead, are fully reciprocal under time reversal. Time reversal not only flips the propagation direction but also reverses the electron spin, which is the source of magnetization. Non-reciprocity for gyrotropic media refers to polarization transformation for forward and backward propagation.
There is a vast array of materials and conditions that exhibit gyrotropic behavior. Gyrotropic effects are known in gases, liquids, and solids. Gyrotropic effects are also known to exist from radio frequency through the ultraviolet. A full exposition of gyrotropy requires a quantum mechanical formalism, which is not the thrust of this text. Nonetheless, there are two steps involved when studying these materials. First, a microscopic-level analysis determines the origin and character of the gyrotropy, culminating in a constitutive relation. Second, given the constitutive relation, a kDB analysis predicts the interaction of light with the material. Common to almost all gyrotropic materials is the constitutive relation of the permittivity tensor:
$$\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon & -j\varepsilon_g & 0 \\ j\varepsilon_g & \varepsilon & 0 \\ 0 & 0 & \varepsilon_z \end{pmatrix} \tag{3.7.1}$$
The diagonal entries look like a uniaxial birefringent material. The off-diagonal entries are characteristic of gyrotropy. These entries are purely imaginary and keep the tensor Hermitian. For low, non-optical frequencies, the permeability tensor takes the form of (3.7.1) and the permittivity $\varepsilon$ is a scalar.

3.7.1 Magnetic Material Classes

There are five classes of magnetic materials that exhibit magneto-optical effects: diamagnetic, paramagnetic, ferromagnetic, antiferromagnetic, and ferrimagnetic. Diamagnetic and paramagnetic materials have no intrinsic magnetization; these materials exhibit weak magneto-optical effects. Ferro-, antiferro-, and ferrimagnetic materials do have intrinsic magnetization and their magneto-optical effects are dominated by their internal magnetization rather than an external field (unless of high coercivity). These materials can exhibit strong magneto-optical effects.
A diamagnetic material has no magnetic dipole moments at a microscopic level. Only through coercion from an external field does this material class exhibit gyrotropic behavior. The external field splits the excited-state electron levels of the constituent atoms, thereby inducing a refractive index splitting. A paramagnetic material does possess intrinsic magnetic dipole moments among some if not all of the constituent atoms. These magnetic moments exist in the ground state and tend toward cancellation on a nano-scale. An externally applied magnetic field splits the ground state, rather than the excited state, of the atoms, inducing a refractive index splitting. A finite temperature populates the upper level of the split ground state, again leading to general cancellation of the magnetization. Magneto-optical effects can be stronger in paramagnetic materials than in diamagnetic materials, but at the cost of a far higher temperature sensitivity.
Ferro-, antiferro-, and ferrimagnetic materials are more complex. These materials all exhibit long-range spin ordering. A homogeneously ordered region is called a domain. Domains exist even in the absence of an applied magnetic field and the magnetization within a domain can be orders of magnitude stronger than for dia- or paramagnetic materials. In ferromagnetic materials, the spins of unpaired valence electrons align, generating a large magnetization within a domain. However, on a microscopic scale neighboring domains are coerced to align anti-parallel and buck a large spontaneous magnetization. An external field, however, coerces the domains to align preferentially in the direction of the applied field, which in turn induces a strong magnetic reaction to the field. In antiferromagnetic materials the spins of adjacent unpaired valence electrons anti-align and cancel. While the process of spin-orbit coupling places antiferromagnetic materials in the same category as ferromagnetic materials, magnetization of this type is weak. Finally, ferrimagnetic materials exhibit a coupling of two interstitial magnetic sublattices that couple antiferromagnetically. Generally, one sublattice has a stronger magnetization than the other, offering some degree of overall magnetization.
Ferro-, antiferro-, and ferrimagnetic materials have such strong intrinsic magnetization that their magneto-optical properties are dominated by the internal fields rather than applied fields (of a coercion order one would expect in a telecommunications environment). Often, such complex material structures require measurements of their magneto-optical properties since a first-principles derivation is too complex. Moreover, these materials can have significant temperature dependence since thermal vibration disrupts the zero-Kelvin alignment. In fact, at zero Kelvin, the magnetization of a ferromagnet is largest, the magnetization decreases monotonically with increased temperature, and finally vanishes at the Curie temperature $T_c$.
In this text, a short derivation using classical electron motion is presented to motivate the construction of the constitutive relation with linear entries. Indeed, the Lorentz force is the only tool presented in this text to address the present analysis. Only diamagnetic materials yield to the Lorentz-force analysis. The reader is cautioned about adapting this derivation to other materials. Rather, references [3, 6] and the references found within provide a solid foundation on which to pursue more detailed studies of the microscopic behavior.
3.7.2 Permittivity of Diamagnetic Materials
Magnetization of diamagnetic materials is generated by an externally applied magnetic field. As discussed in §3.4, the magnetic field of an electromagnetic wave is not strong enough to generate an appreciable force on the bound electron since $|e\,\dot{\mathbf{r}} \times \mathbf{B}| \ll |e\mathbf{E}|$. However, an external biasing field can easily influence the motion. Consider once again the bound-electron equation of motion (3.4.2) rewritten in time-harmonic form and with an external DC magnetic field aligned along $+\hat z$:
$$\left(-\omega^2 + \omega_o^2\right)\mathbf{r} + j\omega\,(e/m)\,(\mathbf{r} \times \mathbf{B}_z) = -(e/m)\,\mathbf{E} \tag{3.7.2}$$
The spring-constant resonance is $\omega_o^2 = K/m$. Before the susceptibilities are calculated, intuition can be developed by solving (3.7.2) under simplifying assumptions. Consider a ray propagating along the $+\hat z$ direction; in this case $\mathbf{r} \cdot \hat z = 0$. Separating (3.7.2) into its components and solving for $\mathbf{r}$ gives
$$\begin{pmatrix} r_x \\ r_y \end{pmatrix} = \frac{a_2}{1 - a_1^2} \begin{pmatrix} 1 & -ja_1 \\ ja_1 & 1 \end{pmatrix} \begin{pmatrix} E_x \\ E_y \end{pmatrix}$$
where $a_1 = \omega (e/m) B_z / (\omega_o^2 - \omega^2)$ and $a_2 = -(e/m)/(\omega_o^2 - \omega^2)$. The eigenvalue and eigenvector solutions are
$$\lambda_+ = 1 + a_1, \qquad \lambda_- = 1 - a_1$$
$$\mathbf{E}_+ = \begin{pmatrix} 1 \\ j \end{pmatrix}, \qquad \mathbf{E}_- = \begin{pmatrix} 1 \\ -j \end{pmatrix}$$
Counter-clockwise and clockwise polarization states of $\mathbf{E}$ are the eigenstates of the system. These two states in turn induce a circular motion of the bound electrons about the $\mathbf{B}_z$ field, the counter-clockwise orbit having the larger radius of the two. The circular electron trajectory is called cyclotron motion.
As the susceptibility is intimately related to the electron motion, one should expect that the eigen-polarizations of the system, when simplified in this manner, are circular and that the refractive indices are different for ccw and cw polarizations, the larger index corresponding to the larger electron radius.
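The circular eigenstates of the simplified electron-motion matrix can be confirmed by direct multiplication. The sketch below assumes the off-diagonal sign convention in which the first row of the matrix is $(1,\,-ja_1)$; with the opposite convention the two eigenvalues simply swap. The value of `a1` and all names are illustrative.

```python
a1 = 0.05                                  # assumed small normalized field strength

# Transverse response matrix, assumed sign convention: [[1, -j a1], [j a1, 1]].
M = [[1.0, -1j * a1],
     [1j * a1, 1.0]]

def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

E_plus = (1.0, 1j)                         # circular state paired with 1 + a1
E_minus = (1.0, -1j)                       # circular state paired with 1 - a1

out_plus = matvec(M, E_plus)
out_minus = matvec(M, E_minus)

# Relative orbit radii: eigenvalue divided by the common factor (1 - a1**2).
radius_plus = (1.0 + a1) / (1.0 - a1 ** 2)
radius_minus = (1.0 - a1) / (1.0 - a1 ** 2)

print(out_plus, out_minus, radius_plus, radius_minus)
```

Each output vector is the input scaled by its eigenvalue, and for positive `a1` the state paired with $1 + a_1$ drives the larger electron orbit.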
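Returning to the detailed analysis of susceptibilities based on the Lorentz force, recall that the polarization density is defined as $\mathbf{P} = -Ne\,\mathbf{r}$. The equation of motion (3.7.2) is recast in terms of $\mathbf{P}$ and $\mathbf{E}$:
$$\left(\omega_o^2 - \omega^2\right)\mathbf{P} + j\omega\,(e/m)\,(\mathbf{P} \times \mathbf{B}_z) = (Ne^2/m)\,\mathbf{E} \tag{3.7.3}$$
Expanding the cross product, the Cartesian components in matrix form are
$$\begin{pmatrix} \omega_o^2 - \omega^2 & +j\omega\omega_c & 0 \\ -j\omega\omega_c & \omega_o^2 - \omega^2 & 0 \\ 0 & 0 & \omega_o^2 - \omega^2 \end{pmatrix} \begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} = \frac{Ne^2}{m} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}$$
where the cyclotron resonance frequency $\omega_c$ is defined as
$$\omega_c = \frac{eB_z}{m} \tag{3.7.4}$$
The dielectric susceptibility tensor $\boldsymbol{\chi}_e$ relates the components of the electric field to the polarization density: $\mathbf{P} = \varepsilon_o \boldsymbol{\chi}_e \mathbf{E}$. Inverting (3.7.4) gives the components of the electric susceptibility tensor
$$\boldsymbol{\chi}_e = \begin{pmatrix} \chi_{11} & -j\chi_{12} & 0 \\ j\chi_{12} & \chi_{11} & 0 \\ 0 & 0 & \chi_{33} \end{pmatrix} \tag{3.7.5}$$
where
$$\chi_{11} = \frac{Ne^2}{m\varepsilon_o}\,\frac{\omega_o^2 - \omega^2}{(\omega_o^2 - \omega^2)^2 - \omega^2\omega_c^2} \tag{3.7.6a}$$
$$\chi_{12} = \frac{Ne^2}{m\varepsilon_o}\,\frac{\omega\omega_c}{(\omega_o^2 - \omega^2)^2 - \omega^2\omega_c^2} \tag{3.7.6b}$$
$$\chi_{33} = \frac{Ne^2}{m\varepsilon_o}\,\frac{1}{\omega_o^2 - \omega^2} \tag{3.7.6c}$$
The susceptibility $\chi_{12}$ is directly related to the strength and direction of the fixed magnetic field $B_z$. The susceptibility $\chi_{11}$ experiences a second-order detuning due to the field, but is generally comparable in magnitude to $\chi_{33}$ when far away from absorption resonance. The permittivity tensor relates to the susceptibility in the usual way: $\boldsymbol{\varepsilon} = \varepsilon_o (1 + \boldsymbol{\chi}_e)$. The entries in $\boldsymbol{\varepsilon}$ correspond to those in $\boldsymbol{\chi}_e$ by
$$\varepsilon = \varepsilon_o (1 + \chi_{11}) \tag{3.7.7a}$$
$$\varepsilon_z = \varepsilon_o (1 + \chi_{33}) \tag{3.7.7b}$$
$$\varepsilon_g = \varepsilon_o \chi_{12} \tag{3.7.7c}$$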

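where the subscripts z and g denote the dielectric constant along the z axis and the gyrotropic dielectric value, respectively. The impermittivity tensor required for the kDB calculations is
$$\boldsymbol{\kappa} = \begin{pmatrix} \varepsilon & -j\varepsilon_g & 0 \\ j\varepsilon_g & \varepsilon & 0 \\ 0 & 0 & \varepsilon_z \end{pmatrix}^{-1} = \begin{pmatrix} \kappa & j\kappa_g & 0 \\ -j\kappa_g & \kappa & 0 \\ 0 & 0 & \kappa_z \end{pmatrix} \tag{3.7.8}$$
where
$$\kappa = \frac{\varepsilon}{\varepsilon^2 - \varepsilon_g^2}, \qquad \kappa_g = \frac{\varepsilon_g}{\varepsilon^2 - \varepsilon_g^2}, \qquad \kappa_z = \frac{1}{\varepsilon_z} \tag{3.7.9}$$
Note from (3.7.8) that indeed $\boldsymbol{\kappa} = \boldsymbol{\kappa}^\dagger$, so the system remains ideally lossless: no work is done on the electrons due to the fixed magnetic field, and all energy coupled into the cyclotron motion is recovered.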
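The chain from equation of motion to susceptibility to impermittivity can be verified numerically on the transverse 2×2 block. The sketch below works in normalized units and assumes the sign convention in which the first row of the motion matrix carries $+j\omega\omega_c$; the prefactor `pref` stands for $Ne^2/(m\varepsilon_o)$, and every numerical value is an arbitrary illustrative choice.

```python
def inv2(m):
    """Inverse of a 2x2 (possibly complex) matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Normalized, illustrative values (assumptions, not from the text).
omega_0, omega, omega_c = 2.0, 1.0, 0.1
pref = 1.5                       # stands for N e^2 / (m eps_0)

a = omega_0 ** 2 - omega ** 2
b = omega * omega_c

# Transverse block of the matrix acting on P (assumed sign convention).
M = [[a, 1j * b], [-1j * b, a]]
chi = [[pref * x for x in row] for row in inv2(M)]

# Closed forms (3.7.6a) and (3.7.6b).
chi_11 = pref * a / (a * a - b * b)
chi_12 = pref * b / (a * a - b * b)

# Relative permittivity block and its inverse (relative impermittivity).
eps_r = [[1.0 + chi[0][0], chi[0][1]], [chi[1][0], 1.0 + chi[1][1]]]
kappa_r = inv2(eps_r)

# Closed forms (3.7.9), written with relative quantities.
e, g = 1.0 + chi_11, chi_12
kappa = e / (e * e - g * g)
kappa_g = g / (e * e - g * g)

print(chi, kappa_r)
```

The off-diagonal signs flip between the permittivity and impermittivity blocks, and both blocks stay Hermitian, which is the lossless property noted above.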
3.7.3 Propagation in Gyrotropic Materials
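The constitutive relations for gyrotropic materials are
$$\mathbf{E} = \boldsymbol{\kappa}\,\mathbf{D}$$
$$\mathbf{H} = \nu\,\mathbf{B}$$
where $\boldsymbol{\kappa}$ is given by (3.7.8). Rotation into the kDB frame gives
$$\mathbf{E}_k = \boldsymbol{\kappa}_k \mathbf{D}_k \tag{3.7.10a}$$
$$\mathbf{H}_k = \nu\,\mathbf{B}_k \tag{3.7.10b}$$
where the transformed impermittivity tensor relation is
$$\boldsymbol{\kappa}_k = \begin{pmatrix} \kappa & j\kappa_g \cos\theta & j\kappa_g \sin\theta \\ -j\kappa_g \cos\theta & \kappa\cos^2\theta + \kappa_z \sin^2\theta & (\kappa - \kappa_z)\sin\theta\cos\theta \\ -j\kappa_g \sin\theta & (\kappa - \kappa_z)\sin\theta\cos\theta & \kappa\sin^2\theta + \kappa_z\cos^2\theta \end{pmatrix} \tag{3.7.11}$$
There is no dependence on $\phi$. The eigenvectors of (3.7.8) are circular in the plane perpendicular to z, so $\boldsymbol{\kappa}_k$ is invariant to any rotation $\phi$. Substitution of (3.7.11) into the coupled kDB equations (3.3.5) gives
$$\begin{pmatrix} \kappa_{11} & j\kappa_{12} \\ -j\kappa_{12} & \kappa_{22} \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = \begin{pmatrix} 0 & u \\ -u & 0 \end{pmatrix} \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \tag{3.7.12a}$$
$$\nu \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} 0 & -u \\ u & 0 \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} \tag{3.7.12b}$$
where
$$\kappa_{11} = \kappa, \quad \kappa_{12} = \kappa_g \cos\theta, \quad \text{and} \quad \kappa_{22} = \kappa\cos^2\theta + \kappa_z\sin^2\theta$$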

Elimination of $\mathbf{B}$ generates the governing equation
$$\begin{pmatrix} (u^2/\nu - \kappa) & -j\kappa_g\cos\theta \\ j\kappa_g\cos\theta & (u^2/\nu - \kappa) + (\kappa - \kappa_z)\sin^2\theta \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.7.13}$$
Notice that the sign of the off-diagonal elements changes with $\theta \to \theta + \pi$: reversal of propagation direction generates a transposition of the eigenvectors. More on this to follow.
In order to arrive at a compact expression for the eigenvectors and eigenvalues, (3.7.13) is first rearranged to the form
$$\begin{pmatrix} p & -jq \\ jq & p + 2 \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.7.14}$$
where
$$p = \frac{2\,(u^2/\nu - \kappa)}{(\kappa - \kappa_z)\sin^2\theta}, \qquad q = \frac{2\kappa_g\cos\theta}{(\kappa - \kappa_z)\sin^2\theta}$$
The eigenvectors and corresponding eigenvalues of (3.7.14) are
$$|r_\pm\rangle = \frac{1}{\sqrt{2\left(1 + q^2 \mp \sqrt{1 + q^2}\right)}} \begin{pmatrix} q \\ j\left(1 \mp \sqrt{1 + q^2}\right) \end{pmatrix} \tag{3.7.15}$$
and
$$p + 1 \mp \sqrt{1 + q^2} = 0 \tag{3.7.16}$$
The gyrotropic angle $\psi$ is defined as
$$\tan 2\psi = \frac{2\kappa_g\cos\theta}{(\kappa - \kappa_z)\sin^2\theta} \tag{3.7.17}$$
Identifying the gyrotropic angle with (3.7.15), the eigenvectors are expressed in the more revealing form:
$$|r_+\rangle = \begin{pmatrix} \cos\psi \\ -j\sin\psi \end{pmatrix}, \qquad |r_-\rangle = \begin{pmatrix} \sin\psi \\ j\cos\psi \end{pmatrix}$$
In Stokes space the eigenvectors are
$$\hat r_+ = \begin{pmatrix} \cos 2\psi \\ 0 \\ -\sin 2\psi \end{pmatrix}, \qquad \hat r_- = \begin{pmatrix} -\cos 2\psi \\ 0 \\ \sin 2\psi \end{pmatrix}$$
The eigenvectors of (3.7.13) correspond to polarization states that can propagate through the gyrotropic medium without change. The eigenstates of polarization are elliptical and the ellipse axes in the laboratory frame are aligned to the horizontal and vertical. The eccentricity depends on the gyrotropic angle $\psi$, which in turn depends on the propagation direction, wavelength, and strength of the applied magnetic field. In contrast to birefringent materials where the eigen-polarizations are linear, gyrotropic materials can have any eigen-polarization along a line of longitude through $S_1$. An input polarization state is resolved onto the two eigen-polarization states and in general the two states propagate with different phase velocities and energy-flow directions.
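The step from the normalized eigenvectors to the gyrotropic-angle form can be checked numerically. The sketch below assumes the sign convention in which the governing 2×2 matrix reads $[[p,\,-jq],[jq,\,p+2]]$ and restricts the gyrotropic angle (called `psi` here) to $0 < 2\psi < \pi/2$; under the opposite convention the sign of every $j$ flips. Names and the 20° test value are illustrative.

```python
import math

def eigvec(q, sign):
    """Normalized eigenvector of [[p, -jq], [jq, p + 2]]; sign=+1 or -1."""
    s = math.sqrt(1.0 + q * q)
    norm = math.sqrt(2.0 * (1.0 + q * q - sign * s))
    return (q / norm, 1j * (1.0 - sign * s) / norm)

def residual(q, sign):
    """Apply the matrix, with p taken from p + 1 -/+ sqrt(1 + q^2) = 0."""
    p = -1.0 + sign * math.sqrt(1.0 + q * q)
    d1, d2 = eigvec(q, sign)
    return (p * d1 - 1j * q * d2, 1j * q * d1 + (p + 2.0) * d2)

psi = math.radians(20.0)          # assumed gyrotropic angle
q = math.tan(2.0 * psi)

r_plus = eigvec(q, +1)            # expected (cos psi, -j sin psi)
r_minus = eigvec(q, -1)           # expected (sin psi,  j cos psi)

print(r_plus, r_minus, residual(q, +1), residual(q, -1))
```

Both residuals vanish to rounding error, and the two vectors reduce to the trigonometric forms, so the ellipticity of the eigenstates is set by `psi` alone.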
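The phase-velocity eigenvalues of (3.7.13) are
$$u_\pm^2 = \nu\,\kappa_o\,\kappa_\pm \tag{3.7.18}$$
where $\kappa_o = 1/\varepsilon_o$ and
$$\kappa_\pm = \frac{\kappa}{\kappa_o} - \frac{(\kappa - \kappa_z)\sin^2\theta}{2\kappa_o}\left(1 \mp \sqrt{1 + \left(\frac{2\kappa_g\cos\theta}{(\kappa - \kappa_z)\sin^2\theta}\right)^2}\,\right) \tag{3.7.19}$$
The dispersion relations are therefore
$$k_\pm = \frac{\omega}{c\,\sqrt{\kappa_\pm}} \tag{3.7.20}$$
It follows that the effective refractive indices are $n_\pm = 1/\sqrt{\kappa_\pm}$. Heuristically, the refractive index has the form $n_\pm = n_o - \delta n \pm n_g$, where $n_o$ is the intrinsic refractive index, $\delta n$ is a field-induced diminution of $n_o$, and $n_g$ is the splitting factor due to the cyclotron motion of the bound electrons.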
Like birefringent media, the Poynting vectors in gyrotropic media do not necessarily align with the k-vectors of the eigenstates. Inserting (3.7.11) into (3.7.10) and remembering that $D_3 = 0$, the fields and flux densities are
$$\mathbf{D}_k = \hat e_1 D_1 + \hat e_2 D_2$$
$$\mathbf{E}_k = \hat e_1 (\kappa_{11} D_1 + \kappa_{12} D_2) + \hat e_2 (\kappa_{21} D_1 + \kappa_{22} D_2) + \hat e_3 (\kappa_{31} D_1 + \kappa_{32} D_2)$$
$$\mathbf{B}_k = (u/\nu)\,(-\hat e_1 D_2 + \hat e_2 D_1)$$
$$\mathbf{H}_k = u\,(-\hat e_1 D_2 + \hat e_2 D_1)$$
The Poynting vector is therefore
$$\mathbf{S}_k = -\hat e_1 E_3 H_2 + \hat e_2 E_3 H_1 + \hat e_3 (E_1 H_2 - E_2 H_1) \tag{3.7.21}$$
The Poynting vector is generally not aligned to the k-vector. In contrast to birefringent materials, the gyrotropic Poynting vector is tilted away from the k-vector along both the $\hat e_1$ and $\hat e_2$ axes, and out of the plane of $\mathbf{k}$ and z. Only when $E_3 = 0$ does the Poynting vector align with the k-vector. The $E_3$ field component is
$$E_3 = -j\kappa_g \sin\theta\, D_1 + (\kappa - \kappa_z)\sin\theta\cos\theta\, D_2$$
which shows that $E_3 = 0$ only when either $B_z = 0$ and/or $\sin\theta = 0$. Since $B_z \neq 0$ by design, the Poynting vector aligns with the k-vector only when the propagation direction is aligned with or counter to the magnetic field direction.
3.7.4 Faraday Rotation
Applications of gyrotropic materials in telecommunication components are generally limited to applications where the Poynting vector aligns with the k-vector. In the presence of a magnetic field, this is only possible when $\sin\theta = 0$. The polarization-transforming effect that is induced when a ray travels along the $\pm\hat z$ direction is called Faraday rotation. As detailed below, Faraday rotation is a nonreciprocal polarization rotation. The absence of reciprocity enables key components such as isolators and circulators to be realized.
The following analysis details propagation in the $+\hat z$ and $-\hat z$ directions. Since the signs will quickly become difficult to track, the analysis is divided into a first section that details $\theta = 0$ propagation and a second section that details $\theta = \pi$ propagation.
When $\theta = 0$, the ray travels along the $+\hat z$ direction. The governing equation (3.7.13) simplifies to
$$\begin{pmatrix} u^2/\nu - \kappa & -j\kappa_g \\ j\kappa_g & u^2/\nu - \kappa \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.7.22}$$
The gyrotropic angle $\psi$, as determined by (3.7.17), is
$$\psi = \pi/4$$
The eigenvectors of forward propagation are therefore
$$|r_+\rangle = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ -j \end{pmatrix}, \qquad |r_-\rangle = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ j \end{pmatrix} \tag{3.7.23}$$
The eigenvectors of the system are circular polarization states. The precession axis $\hat r$ in Stokes space for the forward propagation direction is $\hat r = +S_3$.
The circular eigenstates of polarization are reminiscent of the circular electron orbital motion that was calculated at the beginning of §3.7.2. It should be no surprise that the cyclotron electron motion produces circular eigen-polarization states.
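The phase-velocity eigenvalues of the system are
$$u_\pm^2 = \nu\,(\kappa \pm \kappa_g) \tag{3.7.24}$$
The wavenumbers and effective indices are found by substituting $u = \omega/k$, $\nu = 1/\mu_o$, and (3.7.9) for the impermittivities. The wavenumbers are
$$k_\pm = \frac{\omega}{c}\sqrt{\frac{\varepsilon_r^2 - \varepsilon_{gr}^2}{\varepsilon_r \pm \varepsilon_{gr}}} \tag{3.7.25}$$
where $\varepsilon_r = \varepsilon/\varepsilon_o$ and $\varepsilon_{gr} = \varepsilon_g/\varepsilon_o$. In terms of susceptibility, the wavenumbers are
$$k_\pm = \frac{\omega}{c}\sqrt{1 + \chi_{11} \mp \chi_{12}} \tag{3.7.26}$$
The gyrotropic effect splits the intrinsic susceptibility by $\chi_{12}$.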
Since the $+\hat{z}$ propagation of the ray keeps the Poynting vectors aligned with the k-vectors, and further assuming normal incidence onto the gyrotropic medium to ensure alignment of the k-vectors, the polarization state of an incident ray is resolved into right- and left-circular components which co-propagate through the medium with different phase velocities. The circular basis of the eigenmodes makes the polarization state transform along a line of latitude on the Poincaré sphere. Indeed, Table 2.1 on page 67 shows that for an eigenvector that points to $+S_3$, the transformation matrix $U$ is

$$U = \begin{pmatrix} \cos(\varphi/2) & -\sin(\varphi/2) \\ \sin(\varphi/2) & \cos(\varphi/2) \end{pmatrix} \tag{3.7.27}$$

where the birefringent phase is defined as usual:

$$\varphi = (k_+ - k_-)z \tag{3.7.28}$$

$U$ rotates an input state of polarization about the $+S_3$ axis by angle $\varphi$ and the rotation direction is right-handed. This rotation is illustrated in Fig. 3.19(a). The contour traced by the polarization state is a line of constant latitude. Recall from §1.4 that a family of polarization states on a line of latitude has constant ellipticity and handedness. It is the tilt of the major axis that rotates with longitude. In the laboratory frame, the effect of Faraday rotation is to rotate the major axis of the input polarization ellipse by angle $(\varphi/2)$.
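The precession described above can be checked numerically with a small sketch. It assumes a right-handed rotation of the Stokes vector about $+S_3$ by the angle $\varphi$; the helper names `precess_about_s3` and `azimuth_deg` and the sample input states are illustrative choices, not taken from the text.

```python
import math

def precess_about_s3(stokes, phi):
    """Right-handed rotation of a Stokes vector (s1, s2, s3) about +S3 by phi (radians)."""
    s1, s2, s3 = stokes
    c, s = math.cos(phi), math.sin(phi)
    # S3 (the latitude) is untouched; only the longitude in the (S1, S2) plane changes.
    return (c * s1 - s * s2, s * s1 + c * s2, s3)

def azimuth_deg(stokes):
    """Physical tilt of the polarization-ellipse major axis, half the Stokes longitude."""
    s1, s2, _ = stokes
    return 0.5 * math.degrees(math.atan2(s2, s1))

# Hypothetical elliptical input state on the unit sphere.
state_in = (0.6, 0.0, 0.8)
state_out = precess_about_s3(state_in, math.pi / 3)
print(state_out[2])            # latitude component is unchanged
print(azimuth_deg(state_out))  # major axis tilted by about phi/2 = 30 degrees
```

With a linear horizontal input `(1, 0, 0)` and $\varphi = \pi$, the same helper returns (to rounding) `(-1, 0, 0)`, the orthogonal linear state, which corresponds to a physical rotation of 90 degrees.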
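Next, consider when $\theta = \pi$; the ray travels along the $-\hat{z}$ direction while the magnetic field still points in the $+\hat{z}$ direction. The governing equation (3.7.13) simplifies to

$$\begin{pmatrix} u^2 - \nu\kappa & -j\nu\kappa_g \\ j\nu\kappa_g & u^2 - \nu\kappa \end{pmatrix} \begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.7.29}$$

The gyrotropic angle, as determined by (3.7.17), is $-\pi/4$. The eigenvectors of backward propagation are therefore

$$|r_+\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -j \end{pmatrix}, \quad |r_-\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ j \end{pmatrix} \tag{3.7.30}$$

with corresponding eigenvalues

$$u_\pm = \sqrt{\nu(\kappa \pm \kappa_g)} \tag{3.7.31}$$

[Figure 3.19: Poincaré-sphere sketches with axes $S_1$, $S_2$, $S_3$ for panels a) and b), showing the precession axis $\hat{r}$, the field $B_z$, and the Faraday rotator (FR) traversed along $z$.]

Fig. 3.19. Nonreciprocal polarization transformation via Faraday rotation. a) Forward propagation along the direction of the fixed magnetic field. Input polarization (a) right-hand precesses about $+S_3$ by $\varphi$ to state (b). b) Backward propagation against the direction of the magnetic field. Polarization state (b) left-hand precesses about $-S_3$ by $|\varphi|$ to state (c).
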

In comparison with (3.7.23-3.7.24), the eigenvectors for backward propagation have flipped sign, while the eigenvalues remain unchanged. Rather than precessing about the $+S_3$ axis, the polarization state of a ray travelling backward precesses about the $-S_3$ axis. The transformation matrix $U$ for backward propagation is

$$U = \begin{pmatrix} \cos(\varphi/2) & \sin(\varphi/2) \\ -\sin(\varphi/2) & \cos(\varphi/2) \end{pmatrix} \tag{3.7.32}$$

where $\varphi$ is defined by (3.7.28).
The off-diagonal sign of $U$ has changed in comparison with (3.7.27) because of precession about $-S_3$. However, the precession expression $\varphi$ remains the same. Since precession increases with forward travel and decreases with backward travel, the net result of the backward polarization rotation is a left-hand rotation about $-S_3$ by $|\varphi|$. Figure 3.19(b) illustrates this.
The analyses for $\pm\hat{z}$ propagation show that polarization states are transformed via right-hand-rule rotation about $+S_3$ for forward and backward propagation. Regardless of travel direction, the polarization state precesses in the same direction in Stokes space. This is completely different behavior in comparison with optical activity (covered in the following section), where a polarization transformation due to forward travel is undone by backward travel.
The nonreciprocal character of Faraday rotation is not tantamount to violation of time reversibility. Time reversal would change the spin direction of the electrons that generate the fixed magnetic field, in addition to mapping ccw to cw circular polarizations and vice versa. All polarization transformations induced by Faraday rotation would be undone under time reversal.
At this point it is almost trivial to point out that when the input polarization state is linear, the output state is also linear but with its major axis rotated by the medium. This behavior is the basis for all telecommunications-grade Faraday rotators. As a rule, when the input polarization is a linear state, the output state is the orthogonal linear state after a $\varphi = \pi$ rotation. There is some confusion in the literature that Faraday rotation magically drives any input state to its orthogonal state. This analysis has shown the invalidity of such a view.
For linear input polarizations, the Faraday angle $\theta_F$ is defined as

$$\theta_F = \varphi/2 \tag{3.7.33}$$

The rotation angle $\theta_F$ is the physical, not Stokes, rotation of the linear state. The physical Faraday angle $\theta_F$ is an important quantity when considering isolator and circulator designs because this angle determines the relative orientation between two polarizers.
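The contrast between nonreciprocal and reciprocal rotation can be illustrated with a minimal sketch of a round trip through a rotator. It assumes everything is expressed in one fixed laboratory frame, so that a Faraday rotator turns a linear state in the same sense on both passes while a reciprocal rotator undoes its forward rotation on the return pass. The function names and the 45-degree example are illustrative assumptions rather than a design taken from the text.

```python
import math

def rotation(angle_rad):
    """2x2 Jones matrix for a physical rotation of the field by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s), (s, c))

def apply(matrix, vec):
    """Multiply a 2x2 matrix by a 2-element Jones vector."""
    return (matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1])

def round_trip_azimuth_deg(theta_deg, nonreciprocal):
    """Azimuth of an x-polarized input after a forward and a backward pass."""
    theta = math.radians(theta_deg)
    forward = rotation(theta)
    # Lab-frame assumption: Faraday rotation keeps its sense, reciprocal rotation reverses.
    backward = rotation(theta if nonreciprocal else -theta)
    ex, ey = apply(backward, apply(forward, (1.0, 0.0)))
    return math.degrees(math.atan2(ey, ex))

print(round_trip_azimuth_deg(45.0, nonreciprocal=True))   # Faraday rotator
print(round_trip_azimuth_deg(45.0, nonreciprocal=False))  # reciprocal rotator
```

For $\theta_F = 45$ degrees the Faraday round trip leaves the state at roughly 90 degrees to the input, so an input polarizer would block the returning light, while the reciprocal round trip returns the state to roughly 0 degrees.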
3.7.5 The Verdet Constant
Considering that an external magnetic field is not necessarily uniform throughout the diamagnetic medium, the overall retardation from one end to the other is

$$\varphi = \int_0^L (k_+ - k_-)\, dz$$

In the transparency regime, the wavenumbers are related to the refractive indices in the usual way: $k_\pm = \omega n_\pm/c$. The splitting of the refractive index for small $\chi_{12}$ in light of (3.7.26) is

$$n_+ - n_- = \frac{\chi_{12}}{\sqrt{1 + \chi_{11}}} \tag{3.7.34}$$

The denominator is approximately the intrinsic refractive index to the same degree of approximation. Referring to (3.7.7a), the average refractive index is $n \approx \sqrt{1 + \chi_{11}}$, or

$$n^2 = 1 + \frac{N e^2}{m\varepsilon_o}\,\frac{1}{\omega_o^2 - \omega^2}$$

A small susceptibility $\chi_{12}$ occurs when the cyclotron resonance frequency follows $\omega\omega_c \ll (\omega_o^2 - \omega^2)$. Inserting this condition into (3.7.7b) eliminates the susceptibility terms from (3.7.34):

$$n_+ - n_- \approx \frac{\omega N e^3 B_z}{n m^2 \varepsilon_o (\omega_o^2 - \omega^2)^2} \tag{3.7.35}$$

The refractive index splitting is also related to the dispersion of the intrinsic index $dn/d\lambda$ by

$$n_+ - n_- \approx -\frac{\lambda^2 e}{2\pi c m}\left(\frac{dn}{d\lambda}\right) B_z$$

Putting all of this together, the functional form of the Faraday rotation angle is

$$\theta_F = V \int_0^L B_z\, dz \tag{3.7.36}$$

where the Verdet constant is identified as

$$V = -\frac{e}{2mc}\,\lambda\left(\frac{dn}{d\lambda}\right) \tag{3.7.37}$$

This expression is known as the Becquerel formula of the Verdet constant [3]. The negative sign is cancelled out by $dn/d\lambda$ for usual dispersions. The Verdet constant measures the rotary power of a material and can be used as a point of comparison between different materials. In SI units, the Verdet constant is measured in rad/(m·T).
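As a rough sketch of how the integral form of the Faraday angle might be evaluated for a non-uniform field, the function below applies the trapezoidal rule to $\theta_F = V\int_0^L B_z\,dz$. The Verdet constant, field profile, and length used in the example are hypothetical placeholders chosen only to exercise the formula; they are not material data.

```python
def faraday_angle(verdet, b_z, length, steps=1000):
    """Approximate theta_F = V * integral_0^L B_z(z) dz with the trapezoidal rule.

    verdet : Verdet constant in rad/(m*T)
    b_z    : callable returning the axial field in tesla at position z (metres)
    length : element length L in metres
    """
    dz = length / steps
    total = 0.5 * (b_z(0.0) + b_z(length))
    for i in range(1, steps):
        total += b_z(i * dz)
    return verdet * total * dz

# Hypothetical values: V = 2 rad/(m*T), uniform 0.5 T field, 0.1 m path.
theta_uniform = faraday_angle(2.0, lambda z: 0.5, 0.1)

# Hypothetical non-uniform field that ramps linearly from 0 T to 0.5 T along the element.
theta_ramp = faraday_angle(2.0, lambda z: 0.5 * z / 0.1, 0.1)

print(theta_uniform, theta_ramp)
```

For a uniform field the result reduces to $V B_z L$, and for the linear ramp it is half of that, which is a quick sanity check on the integration.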
Fundamentally, the Verdet constant is a function of frequency and varies only to second order with the magnetic field strength. It is generally well documented that the Verdet constant has a $\lambda^{-2}$ wavelength dependence, as indicated in (3.7.35) [9]. As with the study of refractive index and material susceptibility, the Verdet constant of (3.7.37) is based here on a single oscillator model. More complicated materials may have multiple contributions to the Verdet constant, and the Verdet constant can certainly go negative.
3.7.6 Faraday Rotation in Ferrous Materials
The preceding analysis of diamagnetic materials resulted in an equation that linearly relates the Faraday rotation angle $\theta_F$ to the applied magnetic field $B_z$ (cf. (3.7.36)). However, the Verdet constant for diamagnetic materials is too weak to find useful applications in telecommunications. In order to achieve a compact size the magnetization must be strong. Rare-earth iron garnets are ferrimagnetic materials that have orders of magnitude stronger rotary power than the best diamagnetic materials for the same applied magnetic field.
As outlined in §3.7.1, ferro-, antiferro-, and ferrimagnetic materials exhibit domain structure on a microscopic scale, where the magnetization within a domain is uniform and the domains spontaneously orient to prevent a large constructive magnetic field (at least at high enough temperatures or in a demagnetized sample). In the presence of an external magnetic field the domains reorient preferentially along the field lines. However, unlike diamagnetic materials where the magnetization linearly tracks the applied field strength, all materials with domain structures saturate once a sufficiently strong field is applied. Saturation occurs when the domain boundaries are pushed to the edges of the material and only a single, unidirectional domain is left. No further magnetization occurs with increased applied field. Moreover, all magneto-optical effects are dominated by the internal magnetization rather than the applied field.

[Figure 3.20: panels a) to c) show the domain structure under an applied field $H_z$; the lower plots show $\theta_F$ versus $H_z$ for linear, hysteresis, and latching responses, with $\theta_{F,sat}$, $\pm H_{sat}$, and $+H_n$ marked.]

Fig. 3.20. Illustration of domain structure and alignment with applied external field. a) A demagnetized ferrimagnet: equal number of spin-up and spin-down domains. b) Partially magnetized material; some spin-down domains remain. c) Saturated material; a single spin-up domain spans the material. Below, saturation curves of $\theta_F$ vs. $H_z$: linear, hysteretic, and latching [14, 15].

Figure 3.20 illustrates domain alignment and various magnetization curves. Figures 3.20(a-c) illustrate a demagnetized ferrimagnet having equal number of spin-up and spin-down domains (a), a partially magnetized ferrimagnet where spin-up domains expand at the expense of spin-down domains (b), and a saturated ferrimagnet where a single aligned domain spans the material (c). Beyond the saturation field $H_{sat}$ there is no further magnetization of the material.
The magnetization curves in the lower half of Fig. 3.20 illustrate three types of material responses. A linear magnetization curve has a one-to-one correspondence between $H_z$ and $\theta_F$; above $H_{sat}$ the angle $\theta_F$ is no longer responsive and remains fixed at $\theta_{F,sat}$. A hysteretic magnetization curve is one where the magnetization, or equivalently the Faraday rotation angle $\theta_F$, traces one path for increasing $H_z$ and a separate path for decreasing $H_z$. In particular, once the applied field exceeds the saturation field, $\theta_F$ remains fixed at $\theta_{F,sat}$ until the applied field goes below the nucleation field $H_{nuc}$. The nucleation field is the field strength where internal fields coerce a fracturing of the single domain to reduce the potential energy of the system. A latching magnetization curve has particular interest for component applications because once the saturation field is applied, the Faraday rotation remains at $\theta_{F,sat}$ even for zero external field.
When the Faraday rotation can saturate, the Verdet constant is no longer a reasonable measure since, by definition, the Verdet constant is the constant of proportionality between field and rotation. Instead, the specific rotation $\rho_F$ is defined as

$$\rho_F = \frac{\theta_{F,sat}}{L} \tag{3.7.38}$$

or the saturation rotation $\theta_{F,sat}$ per unit length. The specific rotation has units of rad/m [3].
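The definition of specific rotation can be inverted to size a saturated element for a target rotation. The sketch below does this for a 45-degree rotator; the function name and the specific-rotation value of 1000 rad/m are hypothetical placeholders for illustration and are not properties of any particular garnet.

```python
import math

def element_length(theta_target_rad, specific_rotation_rad_per_m):
    """Length L of a saturated element giving theta_target, from L = theta / (rotation per unit length)."""
    return theta_target_rad / abs(specific_rotation_rad_per_m)

# Hypothetical specific rotation, used only to exercise the formula.
hypothetical_specific_rotation = 1000.0  # rad/m

length_m = element_length(math.radians(45.0), hypothetical_specific_rotation)
print(length_m * 1e3, "mm")
```

Because the element operates in saturation, the length rather than the applied field sets the rotation, which is why the calculation involves no field strength.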
In order for a ferrimagnet such as iron garnet to work well as a Faraday rotator it must be saturated. A demagnetized or partially magnetized element scatters light to an unacceptable degree. The light is scattered because the domains locally impart a polarization rotation, but the domains have no coherence or alignment. A radiation field with numerous k-vectors and polarization components must be constructed to match the boundary condition of the scattered light on the output face of the element.
That the ferrimagnet must operate in saturation is certainly a benefit because sensitivity to the applied field strength is eliminated. Without this built-in nonlinearity, much care would have to go into designing an external magnetic field that is highly uniform throughout the volume. A calculation of the magnetic field profile generated by a toroidal magnet is given in Appendix B. Moreover, the saturation nonlinearity aids with the stringent aging requirements of all telecommunication components since the fixed magnet will degrade over time and still, with proper design, exceed the saturation field. A key design goal for an iron garnet is to have a low saturation field so that the requisite magnet can be small.
The remaining factors that must be accounted for when using iron garnets are the temperature sensitivity and wavelength variation of the specific rotation $\rho_F(T, \lambda)$, and the wavelength dependence of the element [13-15].

3.8 Optically Active Materials


Materials that are chiral exhibit optical activity. A chiral material is one where the crystalline unit cell, or the molecular structure, differs from its mirror image; that is, the molecules have twist. A chiral molecule and its mirror image are called isomers. Most organic molecules are chiral, including sugar and DNA. For example, dextrose is right-handed sugar and fructose is left-handed sugar.
Chiral materials have a different radiation response to an optical field because the molecular twist makes nearest-neighbor dipole polarizations add constructively. The symmetry of isotropic and anisotropic materials cancels all nearest-neighbor dipole contributions, which is why they were ignored in the preceding sections. The twist of chiral materials induces both electric and magnetic dipole responses to an external electric field; these coupled responses cause optical activity.

[Figure 3.21: panel a) shows a wire with a single helical turn; panels b) and c) show the vectors $\mathbf{k}$, $\mathbf{M}$, $\nabla\times\mathbf{M}$, $\mathbf{P}_e$, and $\mathbf{P}_m$ for achiral and chiral materials.]

Fig. 3.21. Simple model of a chiral molecule and induced polarization components [7]. a) Perfectly conducting wire of length $\ell$ with right-handed single turn. The applied electric field $\mathbf{E}$ induces current $i$, which generates both an electric polarization component $\mathbf{P}$ and a parallel magnetization component $\mathbf{M}$. b) An achiral material. $\mathbf{M}$ is perpendicular to $\mathbf{E}$ ($\mathbf{M} \parallel \mathbf{H}$), so magnetically induced polarization component $\mathbf{P}_m$ is parallel to $\mathbf{P}_e$ and makes an insignificant contribution. c) A chiral material. By construction, $\mathbf{M}$ is parallel to $\mathbf{E}$, generating magnetization contribution $\mathbf{P}_m$ perpendicular to $\mathbf{P}_e$. This one-sided persistent bias rotates the polarization state of the propagating wave.

Optical activity causes the right-hand or left-hand rotation of a linear polarization state as it propagates through the medium. A material that induces right-hand polarization rotation is called dextrorotatory and a material that induces left-hand rotation is called levorotatory. There is no a priori relation between the handedness of an isomer and dextro- or levo-rotatory behavior. However, if a right-handed isomer is dextrorotatory, then its left-handed counterpart will be levorotatory.
The microscopic origins of optical activity can vary widely depending on the nature of the molecular or atomic structure. Optically active materials can be either reciprocal or non-reciprocal, bi-isotropic or bianisotropic, lossless or lossy. The constitutive relations differentiate these possible conditions. The common theme underlying optical activity is a coupling of the magnetic field to the electric polarizability, and the electric field to the magnetic polarizability.
As an example, consider a perfectly conducting wire of length $\ell$ having a single spiral turn half-way along its length (Fig. 3.21(a)). When an external electric field generates a current that oscillates up and down along the wire, the current through the turn generates a magnetic flux in the direction of the electric field. The external electric field thus elicits a collinear electric and magnetic response.
The handedness of the spiral turn determines whether the magnetic response is parallel or antiparallel with the electric response. Moreover, flipping the wire upside-down does not change the wire's handedness; handedness is an inherent property of the wire structure.
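To pursue this example further, Hagan [7] considers Maxwell's equations in the absence of a current source:

$$\nabla\times\mathbf{E} = -\frac{\partial}{\partial t}\left(\mu_o\mathbf{H} + \mu_o\mathbf{M}\right) \tag{3.8.1a}$$

$$\nabla\times\mathbf{H} = \frac{\partial}{\partial t}\left(\varepsilon_o\mathbf{E} + \mathbf{P}\right) \tag{3.8.1b}$$

In a charge-free region, the divergence of the electric field is zero. Rewriting in time-harmonic form and using the vector identity $\nabla\times\nabla\times = \nabla(\nabla\cdot) - \nabla^2$ gives

$$\nabla^2\mathbf{E} = -\omega^2\mu_o\mathbf{P}_{\mathrm{eff}} \tag{3.8.2}$$

where an effective polarization density $\mathbf{P}_{\mathrm{eff}}$ is identified as

$$\mathbf{P}_{\mathrm{eff}} = \varepsilon\mathbf{E} - \frac{j}{\omega}\nabla\times\mathbf{M} \tag{3.8.3a}$$

$$\phantom{\mathbf{P}_{\mathrm{eff}}} = \mathbf{P}_e + \mathbf{P}_m \tag{3.8.3b}$$

The polarization of the medium has two contributors: $\mathbf{P}_e$ associated with the linear dielectric and $\mathbf{P}_m$ associated with the curl of the induced magnetic flux. In achiral materials $\mathbf{E}\cdot\mathbf{H} = 0$, so the curl of the magnetic flux ($\nabla\times\mathbf{M}$) is parallel or antiparallel with the polarization $\mathbf{P}_e$. Since generally $|\mathbf{E}| \gg |\mathbf{M}/c|$, the $\mathbf{P}_m$ contribution is negligible. However, in chiral materials, the $\mathbf{P}_m$ component lies perpendicular to $\mathbf{P}_e$. That is, since a component of $\mathbf{M}$ is generated parallel with $\mathbf{E}$, $\nabla\times\mathbf{M}$ lies perpendicular. It is the $\mathbf{P}_m$ component that distinguishes optical activity from other processes.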
In order to analyze this further, the magnetization must be related to the electric field. The canonical constitutive relation between $\mathbf{M}$ and $\mathbf{H}$ is

$$\mathbf{M} = \chi_m\mathbf{H} \tag{3.8.4}$$

Now, given the particular geometry of the wire with embedded loop, the magnetic flux generated by current flow through the loop is parallel to the applied field. Moreover, the current lags the voltage due to the loop inductance. With these considerations, (3.8.4) can be rewritten as

$$\mathbf{M} = \pm j\chi_m|\mathbf{H}|\hat{E} \tag{3.8.5}$$

where $|\mathbf{H}|$ is the magnitude of the magnetic field and $\hat{E}$ is the direction of the electric field. The sign accounts for the handedness of the loop, ($-$) for a right-hand loop and ($+$) for a left-hand loop. From (3.8.1b), the magnetic field magnitude is

$$|\mathbf{H}| = \left|\frac{\omega\varepsilon}{k}\mathbf{E} - \frac{j}{k}\nabla\times\mathbf{M}\right| \tag{3.8.6}$$

Since $\mathbf{E}$ dominates the right-hand side magnitude, the curl term is neglected. The curl of $\mathbf{M}$ in (3.8.5) is then just

$$\nabla\times\mathbf{M} \approx \pm j\chi_m\frac{\omega\varepsilon}{k}\nabla\times\mathbf{E} \tag{3.8.7}$$

The effective polarization density is therefore

$$\mathbf{P}_{\mathrm{eff}} = \varepsilon\mathbf{E} + \varepsilon\beta\nabla\times\mathbf{E} \tag{3.8.8}$$

where the chirality parameter is defined as $\beta = \pm\chi_m/k$. The sign of the chirality parameter designates the handedness, and the units of $\beta$ are length. The constitutive relation between $\mathbf{D}$ and $\mathbf{E}$ is therefore

$$\mathbf{D} = \varepsilon_{DBF}\left(\mathbf{E} + \beta\nabla\times\mathbf{E}\right) \tag{3.8.9}$$

This is the Drude-Born-Fedorov constitutive relation for reciprocal chiral material. Detailed investigation of conservation of energy [4] requires a complementary constitutive relation for the magnetic flux vector,

$$\mathbf{B} = \mu_{DBF}\left(\mathbf{H} + \beta\nabla\times\mathbf{H}\right) \tag{3.8.10}$$

The notation of $\varepsilon_{DBF}$ and $\mu_{DBF}$ follows from [12] and is used to distinguish these values when compared to more general forms of the constitutive relations. Of immediate importance is the existence of $\nabla\times$ terms in the constitutive relations. The $\nabla$ part is a spatial derivative, which physically means that neighboring fields contribute to the polarization. The cross in $\nabla\times$ indicates that the neighboring field contributions are perpendicular to the applied field. The presence of the $\nabla\times$ terms in (3.8.9-3.8.10) generates a persistent bias perpendicular to the propagation direction which rotates the fields in a circular motion. Circular states of polarization are in fact the eigenstates of optical activity.
The above derivation is heuristic but not particularly rigorous. More sophisticated calculations start with dipole-dipole interactions within chiral
molecules and proceed to generate coupled equations of bound-electron motion. From these the spatially averaged polarization and magnetization vectors
are derived [2]. While telecommunications applications rarely require more
detailed knowledge, bio-optics is replete with applications of molecular mechanics and optical activity. The interested reader is referred to [1, 2, 12].
3.8.1 Propagation in Bi-Isotropic Media
The most general constitutive relations for bi-isotropic optically active media are [12]

$$\mathbf{D} = \varepsilon\mathbf{E} + (\chi_T - j\chi_P)\sqrt{\mu_o\varepsilon_o}\,\mathbf{H} \tag{3.8.11a}$$

$$\mathbf{B} = \mu\mathbf{H} + (\chi_T + j\chi_P)\sqrt{\mu_o\varepsilon_o}\,\mathbf{E} \tag{3.8.11b}$$

where $\chi_T$ is the Tellegen magnetoelectric parameter and $\chi_P$ is the Pasteur chirality parameter. When $\chi_T \neq 0$ the medium is non-reciprocal. When $\chi_P \neq 0$ the medium is chiral. Either or both terms can be non-zero. The constitutive relations (3.8.11) follow from (3.8.9-3.8.10) for a proper association of constants $(\varepsilon_{DBF}, \mu_{DBF}, \beta) \rightarrow (\varepsilon, \mu, \chi_P)$ [12] and for $\chi_T = 0$. That is, the phenomenological derivation of the preceding section was based on a purely reciprocal effect. A Tellegen material is well beyond the scope of this text, but suffice it to say that it is a complex material having bound-electric and magnetic permanent dipoles. A Pasteur material is an isotropic chiral material, where the axial directions of the helices are uniformly randomly oriented. An alignment of the helical axes makes the material anisotropic, which has the implication that $\chi_P$ becomes a tensor.
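In the following analysis, a lossless, reciprocal, bi-isotropic medium is considered. The constitutive relations (3.8.11) form the scalar transfer relation

$$C_{EH} = \begin{pmatrix} \varepsilon & -j\chi_P \\ j\chi_P & \mu \end{pmatrix} \tag{3.8.12}$$

The kDB formalism requires the inverse constitutive relations $C_{DB} = C_{EH}^{-1}$. The bi-isotropic kDB constitutive relations are written as

$$\mathbf{E} = \kappa\mathbf{D} + j\chi\mathbf{B} \tag{3.8.13a}$$

$$\mathbf{H} = -j\chi\mathbf{D} + \nu\mathbf{B} \tag{3.8.13b}$$

There is no transformation of the constitutive parameters into kDB, as they are all scalars. Substitution into the coupled kDB equations (3.3.5) yields

$$\kappa\begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = \begin{pmatrix} -j\chi & u \\ -u & -j\chi \end{pmatrix}\begin{pmatrix} B_1 \\ B_2 \end{pmatrix} \tag{3.8.14a}$$

$$\nu\begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} j\chi & -u \\ u & j\chi \end{pmatrix}\begin{pmatrix} D_1 \\ D_2 \end{pmatrix} \tag{3.8.14b}$$

Elimination of $\mathbf{B}$ generates the governing equation

$$\begin{pmatrix} u^2 - \kappa\nu + \chi^2 & 2j\chi u \\ -2j\chi u & u^2 - \kappa\nu + \chi^2 \end{pmatrix}\begin{pmatrix} D_1 \\ D_2 \end{pmatrix} = 0 \tag{3.8.15}$$

The eigenvectors of (3.8.15) are circular states:

$$|r_+\rangle = \begin{pmatrix} 1 \\ j \end{pmatrix}, \quad |r_-\rangle = \begin{pmatrix} 1 \\ -j \end{pmatrix} \tag{3.8.16}$$

with corresponding phase-velocity eigenvalues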


$$u_\pm = \sqrt{\kappa\nu} \pm \chi \tag{3.8.17}$$

The corresponding wavenumbers are

$$k_\pm = \frac{\omega n}{c}\,\frac{1}{1 \pm \chi n/c} \tag{3.8.18}$$

where $n$ is an average index. For positive $\chi$ values, right-hand circularly polarized light travels slower than left-hand polarization.
As the constitutive relations in the present model are scalars, the Poynting vector coincides with the k-vector in the medium. Polarization transformation therefore occurs through transit. In Stokes space, the precession axis $\hat{r}$ points along $+S_3$. The precession angle is determined by the phase difference (3.7.28). The transformation matrix $U$ is therefore

$$U = \begin{pmatrix} \cos(\varphi/2) & -\sin(\varphi/2) \\ \sin(\varphi/2) & \cos(\varphi/2) \end{pmatrix} \tag{3.8.19}$$

For positive $\chi$ values the precession in Stokes space for $+z$ travel follows $-|\varphi|$.

[Figure 3.22: Poincaré-sphere sketches with axes $S_1$, $S_2$, $S_3$ for panels a) and b), showing the optically active (OA) element traversed along $z$ between states (a) and (b).]

Fig. 3.22. An optically active medium is reciprocal. a) Forward travel. The precession axis is $+S_3$; the eigenvectors are circular polarization states. Transit through the medium transforms an input polarization state from (a) to (b). b) Backward travel. The precession axis remains $+S_3$. Transit through the medium transforms input polarization state (b) to (a).

Polarization transformation due to optical activity is similar to that of Faraday rotation: the precession axes for both mechanisms are $\pm S_3$. This is in stark contrast to birefringent transformation where the precession axis lies in the $(S_1, S_2)$ plane. However, unlike Faraday rotation, when $\chi_T = 0$ an optically active medium is reciprocal (see Fig. 3.22). Consider the upper-right-hand matrix component from the Faraday and OA governing equations (3.7.13, 3.8.15):

    Faraday             $j\nu\kappa_g\cos\theta$
    Optical Activity    $2j\chi u$

For Faraday rotation, the sign of the off-diagonal term changes when the propagation direction is reversed. This is not the case with optical activity, where the sign is unaffected by direction. Therefore the eigenvectors for OA do not change when the propagation direction is reversed.
Like Faraday rotation, a chiral medium will rotate a linear polarization state from one angle to another. The rotary power of a chiral material is this polarization rotation per unit length. From the above analysis, the rotary power is $\rho = \varphi/2z$. Biot's law (circa 1812) gives a phenomenological although rather accurate wavelength dependence of the rotary power:

$$\rho = a + \frac{b}{\lambda^2} \tag{3.8.20}$$

Drude in the nineteenth century proposed an extension of Biot's law to account for multiple material resonances, akin to Sellmeier's equations. Drude's equation is

$$\rho = \sum_i \frac{b_i}{\lambda^2 - \lambda_i^2} \tag{3.8.21}$$

These models are complex to derive. For more information on the relevant expressions and the tools required for derivation, see [10].
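The two dispersion models for rotary power can be compared with a short sketch. It assumes Biot's law in the form $a + b/\lambda^2$ and a Drude-type sum of terms $b_i/(\lambda^2 - \lambda_i^2)$; the function names and every coefficient below are hypothetical placeholders, not fitted values for any material.

```python
def biot_rotary_power(wavelength, a, b):
    """Biot's law: rotary power a + b / wavelength**2."""
    return a + b / wavelength**2

def drude_rotary_power(wavelength, terms):
    """Drude-type sum over resonances; terms is a list of (b_i, lambda_i) pairs."""
    return sum(b_i / (wavelength**2 - lam_i**2) for b_i, lam_i in terms)

# Hypothetical coefficients in arbitrary but mutually consistent units.
wavelength = 1.55
single_resonance = [(7.0, 0.10)]

print(biot_rotary_power(wavelength, a=0.0, b=7.0))
print(drude_rotary_power(wavelength, single_resonance))
```

When the resonance wavelength is much shorter than the operating wavelength, a single Drude term approaches the $b/\lambda^2$ behavior of Biot's law, which is why the two printed values are close for these placeholder numbers.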


References
1. H. Ammari, K. Hamdache, and J. Nedelec, "Chirality in the Maxwell equations by the dipole approximation," SIAM Journal of Applied Math., vol. 59, pp. 2045-2059, 1999.
2. D. J. Caldwell and H. Eyring, The Theory of Optical Activity. New York: Wiley-Interscience, 1971.
3. M. N. Deeter, G. W. Day, and A. H. Rose, CRC Handbook of Laser Science and Technology, Supplement 2: Optical Materials. Boca Raton, Florida: CRC Press, 1995, ch. Magnetooptic Materials, pp. 367-402.
4. F. Fedorov, "On the theory of optical activity of crystals. I. Energy conservation law and optical activity tensors," Optics and Spectroscopy, vol. 6, pp. 85-93, 1959.
5. G. R. Fowles, Introduction to Modern Optics. New York: Dover Publications, 1989.
6. V. J. Fratello and R. Wolfe, Handbook of Thin Film Devices, Vol. 4: Magnetic Thin Film Devices. San Diego: Academic Press, 2001, ch. Epitaxial Garnet Films for Nonreciprocal Magneto-Optic Devices, pp. 93-141.
7. D. J. Hagan, private communication, 2002, from lecture notes, School of Optics, University of Central Florida. [Online]. Available: http://www.creol.ucf.edu/
8. H. A. Haus and J. R. Melcher, Electromagnetic Fields and Energy. Englewood Cliffs, New Jersey: Prentice-Hall, 1989.
9. A. Jain, J. Kumar, F. Zhou, L. Li, and S. Tripathy, "A simple experiment for determining Verdet constants using alternating current magnetic fields," Am. J. Phys., vol. 67, pp. 714-717, 1999.
10. W. Kaminsky, "Experimental and phenomenological aspects of circular birefringence and related properties in transparent crystals," Rep. Prog. Phys., vol. 63, pp. 1575-1640, 2000.
11. J. A. Kong, Electromagnetic Wave Theory. New York: John Wiley & Sons, 1989.
12. I. Lindell, A. Sihvola, S. Tretyakov, and A. Viitanen, Electromagnetic Waves in Chiral and Bi-Isotropic Media. Boston, Massachusetts: Artech House, 1994.
13. K. B. Rochford, A. H. Rose, and G. Day, "Magneto-optic sensors based on iron garnets," IEEE Transactions on Magnetics, vol. 32, no. 5, pp. 4113-4117, 1996.
14. K. Shirai, K. Ishikura, and N. Takeda, "Low saturated magnetic field bismuth-substituted rare earth iron garnet single crystal and its use," U.S. Patent 5,512,193, Aug. 30, 1996.
15. K. Shirai and N. Takeda, "Faraday rotator," U.S. Patent 5,535,046, July 9, 1996.

4
Elements and Basic Combinations

4.1 Wavelength-Division Multiplexed Frequency Grid


Specification of an internationally recognized standard for the channel frequency locations of a wavelength-division multiplexed optical communication system is essential for component development and system interoperability. Optical components such as signal lasers, multiplexers and demultiplexers, wavelength add-drop filters, and interleavers require a channel location specification to design to.
The current standard for dense wavelength-division multiplexed (DWDM) transmission is specified by the document ITU-T G.694.1 [30]. An anchor frequency of 193.1 THz is defined, from which all other channel locations can be derived. Channel spacings are defined for 12.5 GHz, 25 GHz, 50 GHz, and 100 GHz separations. Figure 4.1 illustrates a partial frequency grid at 100 GHz channel separation. The channel frequency locations $f_n$ are defined as

$$f_n = 193.1 + n \times C \quad \text{(THz)} \tag{4.1.1}$$

where $n$ is a positive or negative integer including zero, and where $C = 0.0125$, $C = 0.025$, $C = 0.050$, or $C = 0.100$. To convert between frequency and wavelength, the value of the speed of light found on page 4 is used by the Standard. Other than the anchor frequency and channel separations there is no adopted standard for how many channels can run on a DWDM system or where those channels are located. The standard is agnostic to implementation.
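The grid definition lends itself to a short calculation. The sketch below generates channel frequencies from the anchor at 193.1 THz and converts them to vacuum wavelength; it assumes the exact SI value of the speed of light, 299 792 458 m/s, and the function names are illustrative.

```python
C_LIGHT_M_PER_S = 299_792_458.0  # exact SI value, assumed for the frequency-to-wavelength conversion

def channel_frequency_thz(n, spacing_thz=0.100):
    """Grid frequency f_n = 193.1 + n * C in THz; n may be negative, zero, or positive."""
    return 193.1 + n * spacing_thz

def wavelength_nm(freq_thz):
    """Vacuum wavelength in nanometres for a frequency given in THz."""
    return C_LIGHT_M_PER_S / (freq_thz * 1e12) * 1e9

# Every tenth channel on the 100 GHz grid, from 30 channels above the anchor to 10 below.
for n in range(30, -11, -10):
    f = channel_frequency_thz(n)
    print(f"n = {n:+3d}  f = {f:.3f} THz  wavelength = {wavelength_nm(f):.2f} nm")
```

The anchor channel (n = 0) falls near 1552.52 nm, and stepping n by 10 on the 100 GHz grid moves the frequency by 1 THz, which matches the tick marks of Fig. 4.1.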
Nonetheless, there is the practical issue that an optically amplified multi-span link requires periodic amplification of the optical signal. An optical amplifier amplifies all channels at once [18]. There are, however, limitations on the bandwidth of the gain due to the physical nature of the rare-earth ions such as erbium that are doped into the optical fiber. There is no single bandwidth that can be attributed to an optical amplifier because the useable bandwidth is governed by both the particular architecture of the amplifier as well as the system application. What is true is that communications carriers who purchase the optical transport systems want the lowest overall system cost for the largest aggregate transmission.

[Figure 4.1: frequency axis (THz) with ticks every 1 THz from 196.100 down to 187.100 and the corresponding wavelength axis (nm) from 1528.77 to 1602.31; the Center (C) band, guard band, Long (L) band, and the anchor frequency are marked.]

Fig. 4.1. Overview of ITU-T G.694.1 spectral grid for DWDM applications. The anchor frequency is located at 193.100 THz. As illustrated, channel centers are spaced by 100 GHz above and below the anchor frequency. While the Standard is open to higher and lower frequencies than indicated, rough demarcation of center (C) and long (L) bands, with possible intermediate guard band, is shown.

There has evolved a banding of the spectrum based on the optical amplifier architectures that are economically manufactured. The C-band, or center band, is the original band where an erbium-doped optical amplifier provides high gain efficiency. This band is often delineated by the range 192.1 to 196.1 THz. The L-band, or long band, is recently available using erbium-doped fiber. That band is often delineated by the range 186.1 to 191.1 THz. These ranges vary from vendor to vendor. Because the C- and L-bands have different gain efficiencies for pump wavelengths of 1480 or 980 nm, separate amplifiers are built for each band. A Raman amplifier can pump the entire spectrum seamlessly, however. For diode-pumped systems, a band-separation filter splits the two bands prior to amplification and then combines them prior to transmission. Typically there is a guard band between the C- and L-bands to accommodate the band-separation filter. However, recently demonstrated filter improvements can eliminate the need for a guard band.
There is no standard for the deviation of laser or filter center frequency from the center frequency of the grid [29]. The end-of-life specification depends on many factors, including the maximum foreseeable channel density. For the purposes of analysis in this text, the allowable beginning-of-life frequency deviation will be taken as $\delta f = \pm 2.5$ GHz.
There are several reasons the definition of the spectral grid is important for component designers. These reasons include

• The tolerance on the FSR of a periodic component must allow for channel alignment across a band.
• The frequency centering of a periodic component with the correct FSR must align to the channel locations.
• The bandwidth of polarization transforming elements such as waveplates must cover a band.

[Figure 4.2: panels a) and b) plot filter passbands against the grid on a frequency (THz) axis running from 196.100 to 192.100 THz (1528.77 to 1560.61 nm), with the FSR and the anchor frequency marked.]

Fig. 4.2. Two types of filter placement errors in relation to the DWDM spectral grid. a) FSR error leads to walkoff between spectral grid and filter centers. While at one frequency the grid and filter may align ($\delta = 0$), at the band edges the filter is misaligned. b) Frequency location error, often called phase error. Even if the FSR is within tolerance, the filter center frequencies may suffer a common misalignment to the spectral grid.

Figure 4.2 illustrates the first two error types. In Fig. 4.2(a) the FSR of a periodic element such as an interleaver filter is too small, leading to a walkoff over the band. Denote the frequency separation between the grid and the filter at either band end as $\delta$. The tolerable FSR error of a component is then

$$\frac{\Delta\mathrm{FSR}}{C} = \frac{|C - \mathrm{FSR}|}{C} = \frac{|\delta|}{N C} \tag{4.1.2}$$

where $C$ is the designed channel separation and $N$ is the number of channels between band center and band edge. For example, in a C-band designed with 40 channels on 100 GHz centers, $N = 20$. With the aforementioned filter location tolerance of $\delta f_n = \pm 2.5$ GHz, the FSR tolerance of the filter is

$$\frac{\Delta\mathrm{FSR}}{C} = \frac{|2.5|}{20 \times 100} = 0.125\% \tag{4.1.3}$$

For a resonant element such as a Fabry-Perot, a 0.125% tolerance for a 1 mm long cavity is about one micron. The broader the band coverage the tighter the cavity-length tolerance.
In Fig. 4.2(b) $\delta\mathrm{FSR} = 0$, but there is a common frequency offset error across all the channels. The frequency tolerance is generally looser than the FSR tolerance because the error does not accumulate across multiple channels. As such, the frequency tolerance $\delta f$ is

$$\frac{\delta f}{\mathrm{FSR}} \le \frac{|\delta f_n|}{C} \tag{4.1.4}$$

As with the preceding example, the frequency tolerance is $\delta f = 2.5\%$ FSR.
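The arithmetic behind (4.1.3) and (4.1.4) can be sketched in a few lines of Python. The function names are illustrative and do not come from the text; the inputs are the 2.5 GHz location tolerance, the 100 GHz spacing, and N = 20 used in the example above.

```python
def fsr_tolerance(delta_f_ghz, spacing_ghz, half_band_channels):
    """Fractional FSR error that walks off by delta_f at the band edge."""
    return abs(delta_f_ghz) / (half_band_channels * spacing_ghz)


def frequency_tolerance(delta_f_ghz, spacing_ghz):
    """Common frequency offset expressed as a fraction of the FSR."""
    return abs(delta_f_ghz) / spacing_ghz


fsr_tol = fsr_tolerance(2.5, 100.0, 20)
freq_tol = frequency_tolerance(2.5, 100.0)
# Length tolerance of a 1 mm (1000 um) resonant cavity at the same fraction.
cavity_tol_um = fsr_tol * 1000.0

print(f"FSR tolerance:       {fsr_tol:.3%}")
print(f"frequency tolerance: {freq_tol:.1%} of the FSR")
print(f"1 mm cavity:         {cavity_tol_um:.2f} um")
```

The factor of N between the two results is the accumulation of the FSR error over the half band.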

In relation to polarization elements and material dispersion, the component must tolerate the spectral coverage of the system. For a C-band that covers 192.1 to 196.1 THz, the spectral coverage $S_C$ of this band is

$$S_C = \frac{196.1 - 192.1}{194.1} \simeq 2.061\% \tag{4.1.5}$$

Likewise, an L-band that covers 186.1 to 191.1 THz has a spectral coverage $S_L$ of

$$S_L = \frac{191.1 - 186.1}{188.6} \simeq 2.651\% \tag{4.1.6}$$

The combined spectral coverage is $S_{C+L} = 5.23\%$. These are broad coverages for waveplates, Faraday rotators, and anti-reflection coatings to cover. In many cases compromises or increased complexities are required to satisfy these bandwidth demands.
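As a quick check of (4.1.5) and (4.1.6), the coverage is the band width divided by the band-center frequency. The helper name below is hypothetical; the band edges are those quoted in the text.

```python
def spectral_coverage(f_low_thz, f_high_thz):
    """Fractional bandwidth: band width over the band-center frequency."""
    center = 0.5 * (f_low_thz + f_high_thz)
    return (f_high_thz - f_low_thz) / center


s_c = spectral_coverage(192.1, 196.1)    # C-band
s_l = spectral_coverage(186.1, 191.1)    # L-band
s_cl = spectral_coverage(186.1, 196.1)   # combined C+L

for name, value in (("C", s_c), ("L", s_l), ("C+L", s_cl)):
    print(f"{name}-band coverage: {value:.3%}")
```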

4.2 Properties of Select Materials


A central theme of this text is to provide information necessary to realize
the components whose architectures are detailed in subsequent chapters. To
this end it is essential to have on hand the physical and optical properties of
the relevant materials. This section tabulates the properties of select isotropic
optical materials, select birefringent materials, and select metal and alloy materials used for packaging. Faraday rotators made from iron-garnet derivatives
are covered at the end. These are not intended to be an exhaustive tabulation
but rather a short reference for the commonly used materials.
4.2.1 Isotropic Glass Materials
Isotropic glass materials are used both as optical elements and as packaging and assembly parts. There are a large number of well-characterized glasses available from major suppliers [28, 38, 45]. Principal factors used to select a low-loss glass for optical transmission use include its refractive index at the wavelength of use; the refractive index dispersion; its thermal-optic coefficient, or change of refractive index with temperature; and its thermal-expansion coefficient [44]. The refractive index and its dispersion govern attributes such as the angle of a prism made from a particular glass, while the thermal-optic coefficient is necessary to tolerance the component over the required temperature range. The thermal-expansion coefficient is important because a package assembled from a variety of materials must maintain its integrity over its lifetime. If one part expands significantly more than the others then adhesion, for example, can be compromised.
Glass parts are sometimes used for packaging and assembly parts as well. Two key factors when choosing such a glass are its thermal-expansion coefficient and its ultraviolet (UV) transmissivity. A glass package part is often used because its thermal expansion matches that of other glassy parts that are in the transmission path. Another reason is that the transmission parts need to be visible during assembly, for alignment and/or for UV tacking with epoxy. Finally, glass windows are sometimes brazed into metal packages to make a clear path for collimators while maintaining hermetic integrity.
While a complete glass catalog should be referred to in order to choose an optical glass for a particular application, Table 4.1 provides select material properties of two commonly used transmission and packaging glasses: fused silica and BK7.

Table 4.1. Select Isotropic Glass Materials

Property                      Fused silica(a)   N-BK7(b)                           Units
Density                       2.2               2.51                               g/cm³
Hardness                      5-6                                                  Mohs
Thermal conductivity          1.31              1.114                              W/(m·K)
Thermal expansion             0.5(c)            7.1(d)                             10⁻⁶/K
Refractive index n_1529.6     1.44424           1.50091
Thermal optic(e)              9.9               2.4 (1060 nm line), 3.0 (e-line)   10⁻⁶/K
Sellmeier coefficients(f)                                                          λ in μm
  B1                          0.66942           1.03961
  B2                          0.43458           0.23179
  B3                          0.87169           1.01047
  C1                          0.00448           0.00600
  C2                          0.01328           0.02002
  C3                          95.3414           103.561

(a) Reported by Schott [45].
(b) Reported by Schott [46].
(c) 25 °C to 100 °C, (d) −30 °C to +70 °C, (e) 20 °C to 40 °C, $\Delta n_\mathrm{rel}/\Delta T$.
(f) $n^2(\lambda) - 1 = \dfrac{B_1\lambda^2}{\lambda^2 - C_1} + \dfrac{B_2\lambda^2}{\lambda^2 - C_2} + \dfrac{B_3\lambda^2}{\lambda^2 - C_3}$, [45].
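The Sellmeier expression in footnote (f) of Table 4.1 can be evaluated directly. The sketch below assumes the coefficient assignment read from the flattened table (first value fused silica, second value N-BK7) and wavelengths in micrometres; the dictionary and function names are illustrative.

```python
import math

# (B1, B2, B3), (C1, C2, C3) as read from Table 4.1; C values in um^2.
SELLMEIER = {
    "fused_silica": ((0.66942, 0.43458, 0.87169), (0.00448, 0.01328, 95.3414)),
    "N-BK7": ((1.03961, 0.23179, 1.01047), (0.00600, 0.02002, 103.561)),
}


def refractive_index(material, wavelength_um):
    """Evaluate n from n^2 - 1 = sum_i B_i lam^2 / (lam^2 - C_i)."""
    b_coeffs, c_coeffs = SELLMEIER[material]
    lam2 = wavelength_um ** 2
    n_squared = 1.0 + sum(
        b * lam2 / (lam2 - c) for b, c in zip(b_coeffs, c_coeffs)
    )
    return math.sqrt(n_squared)


n_silica = refractive_index("fused_silica", 1.5296)
n_bk7 = refractive_index("N-BK7", 1.5296)
print(f"fused silica: {n_silica:.5f}")
print(f"N-BK7:        {n_bk7:.5f}")
```

Evaluated at 1529.6 nm, both results agree with the refractive-index row of the table to within rounding of the five-digit coefficients, which supports the assumed column assignment.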
4.2.2 Birefringent Crystals
Birefringent crystals are the basic building blocks for the birefringent components detailed in the following chapters. The crystal materials may be divided into two application regimes: applications requiring high birefringence and those requiring low birefringence. Rutile and yttrium orthovanadate (YVO4) are examples of very high birefringent material, both having about 10% birefringence at 1.55 μm. Crystalline quartz is a readily available low birefringent material, having a birefringence of 0.0084 at 1.55 μm.
Materials of intermediate birefringence are nonetheless required for practical reasons. For example, the birefringent phase of a birefringent crystal is tem-

Table 4.2. Select Birefringent Material Properties


Property

LiNbO3

YVO4

Tetragonal

Hexagonal

Hexagonal

Hexagonal

Birefringent type

+ uniaxial

uniaxial

uniaxial

uniaxial

R3c

D4h

R3c

Density

4.22(a)

4.65(b)

(a)

Hydroscopic susceptibility(a)

4.5

none

low

low

Thermal expansion

4.43(a)
1.9447

1.9787

Group birefringence(d)

(s)
(d)

2.2112

A1

(f )

12.547(s)
0.08

7.5
2.1381

1.6749

4.1(f )

2.1914

2.2643

35.0
2.1856

3.77834(a) 4.59905

17.060

0.80

5.1(s)

6.2

W/(mK)

33.3

-3.7(s)

25.1

106 /K

1.5555

-9.3(a)

1.6629

15.14

33.08

4.9048(f )

4.582

1.4885

2.1(s)

1.5295

1.6586

11.9
1.4820

1.29

-4.84

5.21

2.69705(a) 2.18438

A2 0.069736

0.1105

0.1176

0.09921

0.01921

0.04724

0.04813

0.04753

0.04448

0.0182

0.01018

A4 0.010813

0.01227

0.02715

0.02194

0.01516

0.00244

Reported by Casix [32]:

(e)

reported for 1550 nm,

Reported by Crystal Technology [14],

(f )

(g)

reported for 532 nm,

(h)

dn/dT 106 /K

-0.1766

-0.1292
1.41

(h)

-0.1744
-6.6

1.6587

Mohs

4.990(s)

-0.1194

-0.0787

4.59

(g)

g/cm3

12.736

(s)

0.5(s)

-0.0732
3.0

7.99

A3

(b)

2.1486

13.866
(b)

15(b)

11.37

+0.2127

Thermal optic(d)

(a)

4.2

5.23

8.5(c)

(d)

Sellmeier coecients

(e)

5.151(b)

+0.2039

Thermal optic
Group index

6.29

(a)

5.10

Refractive index

2.711(s)

Thermal conductivity

Birefringence

3.84(s)

none
7.12(c)

Lattice constants


R3

Space group

Units

Crystal type

Hardness


CaCO3 (Calcite)

-BBO
c

1 dng L
6
10 /K
ng L dT
in m

0.00873

reported for 1300 nm.

from Sellmeier at 1.55 m and 24.5 C.

Reported by Handbook of Optics [3].


Reported in this text by Damask, 25 C100 C, 15151575 nm.

Sellmeier expression: n () = A1 +

A2
2
A4
2 A3

Table 4.3. Select Birefringent Material Properties


Property

Crystal Quartz

Hexagonal

Birefringent type

(s)

TiO2

Crystal type

Lead Molybdate

Rutile

(s)

SiO2

PbMoO4
a

Tellurium Dioxide Magnesium Fluoride

(c)

(s)

TeO2
a

Tetragonal

Tetragonal

Tetragonal

MgF2
a

Units

(s)

Tetragonal

+ uniaxial

+ uniaxial

uniaxial

+ uniaxial

+ uniaxial

Space group

P32 21

P42 /mmm

I41 /a

P41 21 2

P42 /mmm

Density

2.648

4.25

6.95

6.019

3.171

g/cm3

Mohs

none

none

none

none

Hardness
Hydroscopic susceptibility

Thermal expansion
Thermal optic
Refractive index
Birefringence

4.9136

5.4051

4.594

2.962

7.5

12.7

8.3

11.8

6.88

6.86

8.97

12.38
6.2

(s1 )

1.5352

(s2 )

7.0

1.5440

2.432

+0.0088

5.4312

12.4

9
2.683

12.1065

26.7
190

2.260

(c1 )

2.170

4.810

15.0
9

2.18

(c)
(d)

Reported by Handbook of Optics [3]:


Reported by Isomet Corporation [31],

4.623

3.053

30

21

W/(mK)

13.6

106 /K

0.32

dn/dT 106 /K

9.4
(s4 )

0.88

2.32

1.3734

+0.14

1.3851

+0.0117

11.5(d)

Optical rotary power


(s)

4.9

(s3 )

0.090

+0.251

7.613

(s1 )
(c1 )

reported at 546 nm,

(s2 )

reported at 405 nm,

ng = 0.104, d(ng )/dT is reported.

At = 1.55m. Follows Biots law: = o /2 .

(s3 )

reported at 644 nm,

deg/mm
(s4 )

reported at 1.15 m.


Lattice constants
Thermal conductivity


perature dependent. But two crystals having complementary thermal-optic coecients can be cascaded to mitigate the temperature dependence passively.
Useful complementary crystals for YVO4 are LiNbO3 and lead molybdate, both of which have an intermediate birefringence. In addition, intermediate birefringent crystals such as LiNbO3 have strong electro-optic coefficients, making them useful for switching and polarization-control applications.
Tables 4.2 and 4.3 provide a compilation of data on nine widely used birefringent materials. All of these materials are synthetic except for calcite, which is found in nature and mined. Because the crystal quality can vary and the material is not sufficiently hydrophobic, calcite is not a favored component-grade material. In comparison with rutile, YVO4 is favored because it is more easily grown to large boule sizes and is easier to handle at the grinding and polishing stages. To compensate for the temperature dependence of YVO4, LiNbO3 is commonly used, although lead molybdate has been recently proposed. Tellurium dioxide is a reasonably high birefringent crystal with the added feature that it can be grown in high-purity dextrorotatory and laevorotatory chiralities, enabling the crystal to be used for optical activity. α-BBO is a negative uniaxial crystal that is not commonly used due to its intermediate birefringence, but one interleaver design pairs its negative uniaxial property with the positive uniaxial YVO4 to make a compound crystal that imparts no Poynting vector walkoff.
Finally, crystalline quartz is extensively used for waveplates because of its high material quality and its low birefringence. The birefringent beat length of quartz is about 184 μm at 1.55 μm wavelength, so a true zero-order half-wave waveplate at this wavelength is 92 μm thick. In contrast, the birefringent beat length of LiNbO3 is about 19 μm; an equivalent half-wave waveplate is 9.5 μm thick, a nearly impossible thickness to attain reproducibly.
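The beat-length figures follow from dividing the wavelength by the birefringence, and a zero-order half-wave plate is half a beat length thick. A minimal sketch, with hypothetical function names, using the quartz birefringence of 0.0084 quoted earlier; the LiNbO3 birefringence is inferred here from the stated 19 μm beat length rather than taken from a table.

```python
def beat_length_um(wavelength_um, birefringence):
    """Length over which the two eigenpolarizations slip by one full wave."""
    return wavelength_um / abs(birefringence)


def zero_order_half_wave_um(wavelength_um, birefringence):
    """Thickness of a true zero-order half-wave plate."""
    return 0.5 * beat_length_um(wavelength_um, birefringence)


quartz_beat_um = beat_length_um(1.55, 0.0084)
quartz_half_wave_um = zero_order_half_wave_um(1.55, 0.0084)
# Birefringence implied by a 19 um beat length at 1.55 um (inferred value).
linbo3_birefringence = 1.55 / 19.0

print(f"quartz beat length:     {quartz_beat_um:.1f} um")
print(f"quartz half-wave plate: {quartz_half_wave_um:.1f} um")
print(f"implied LiNbO3 birefringence: {linbo3_birefringence:.3f}")
```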
4.2.3 Iron Garnets
Faraday rotators (FR) provide the nonreciprocal polarization rotation necessary for isolators and circulators. The optical properties of Faraday rotation were detailed in §3.7.6. Here the material properties and design goals of iron garnet FRs are outlined.
An iron garnet is a ferrimagnetic material that has several orders of magnitude greater rotary power than diamagnetic materials. Practical iron garnets are most commonly grown by liquid-phase epitaxy. A substrate such as gadolinium gallium garnet (GGG) is used to nucleate the crystalline growth and support the film. After the film is grown, usually to 250–500 μm thick, the substrate is ground away and both sides of the remaining film are polished. Anti-reflection coatings are then applied and the film is diced into square parts, typically 2 × 2 mm².
To use an iron garnet FR, the finished part is placed at the center of an annular permanent magnet such as samarium-cobalt (Sm-Co). The major face of the garnet part is placed in the clear aperture of the permanent magnet and, accordingly, the magnetization vector within the film is normal to the film surface. The strength of the magnet must be sufficient to saturate fully the domain structure of the garnet over the specified temperature range. Multi-magnet schemes have been proposed to enhance the magnetic field in the region surrounding the magnet [23, 25], although these concepts are not currently used in telecom-grade components. The direction of magnetization sets the direction of polarization rotation, whether clockwise or counterclockwise. Light transits the garnet part either parallel or anti-parallel to the magnetization direction.
To attain component-quality performance, the FR must have a low saturation magnetization $H_\mathrm{sat}$ so the permanent magnet can be small, a low absorption, a low temperature-dependent specific rotation (defined by (3.7.38) on page 135), and a low wavelength-dependent specific rotation. Moreover, the film must closely lattice match to the substrate, over the range of room and growth temperatures, to enable crystal growth. The requisite qualities of a telecommunications-grade iron garnet used in isolators and circulators, as opposed to magneto-optic sensor applications, are listed in Table 4.4.

Table 4.4. Non-Latching Iron Garnet Design Goals

Design goal                        Target
High specific Faraday rotation     θF > 0.1 °/μm
Low thickness for θF = 45°         t < 500 μm
Low absorption                     IL < 0.1 dB
Low saturation magnetization       Hsat < 400 Oe
Good substrate lattice matching    match to GGG or like
Low pitting density                < 10/cm²
Temperature range                  −20 to +70 °C
Low temperature coefficient        d|θF|/dT < 0.07 °/°C
Low wavelength dependence          d|θF|/dλ < 0.08 °/nm
High Curie temperature             Tc > 150 °C
Yttrium iron garnet (YIG) is an early garnet material that was a suitable replacement for diamagnetic FRs of the time. With a chemical formula of Y3Fe5O12, the iron content is the sole contributor to the magnetization as Y has no net magnetic moment. In relation to the component requirements, however, a YIG film must be about 2.7 mm thick to achieve a rotation of $\theta_F = 45^\circ$. Also, YIG has a large saturation magnetization (about 1800 G), requiring a large permanent magnet.
To reduce the requisite film thickness, bismuth exchanged rare-earth iron garnets (Bi:RIG) were introduced. With the chemical formula (BiRE)3Fe5O12, the combined bismuth and rare-earth (RE) ions greatly enhance the specific rotation. There are, however, tradeoffs due to the exchange of (BiRE) for yttrium. The bismuth ion increases the lattice constant of the film; it increases the temperature dependence of the specific rotation; it increases the thermal expansion of the film; and it increases the possibility of pitting in the film. The lattice mismatch limits bismuth incorporation to about one atom in three. These deleterious effects may be somewhat compensated by the selection and concentration of the rare-earths. Addition of terbium (Tb), for example, will decrease the temperature dependence. Which rare-earth atoms are suitable depends in large part on their absorption spectra and the operating wavelength of the garnet. At 1.55 μm, Tb, gadolinium (Gd), holmium (Ho), and europium (Eu) are all suitable to varying degrees. However, at 980 nm their absorption is too high, and other means, such as heavy bismuth loading and a 25 μm film [21], are all that can be expected.
To reduce the saturation magnetization, gallium and aluminum can be substituted for iron as in (BiRE)3(FeGaAl)5O12. Introduction of Ga and/or Al, however, concurrently reduces the Curie temperature (the temperature at which the magnetization is zero) and increases the temperature dependence of the specific rotation.
Studies have been conducted to tailor the overall material properties through introduction and balancing of rare-earth ions as well as gallium and aluminum. With the optimal balance, there is a window of material compositions in which all of the iron garnet design goals listed in Table 4.4 can be achieved. A single source that presents the various tradeoffs is [21]. The combined patent work of [1, 27, 47–51, 53] provides many practical details about compositions and materials processing.
As a specific example, $(\mathrm{Tb}_{1.69}\mathrm{Bi}_{1.31})(\mathrm{Fe}_{4.38}\mathrm{Ga}_{0.42}\mathrm{Al}_{0.20})\mathrm{O}_{12}$ [27] exhibits $H_\mathrm{sat} = 340$ Oe at +60 °C, $\theta_F = 0.099$ °/μm and a temperature dependence of 0.062 °/°C. It should be noted that the wavelength dependence of iron garnets is generally small and does not follow Biot's law.
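To put the quoted garnet figures in perspective, the sketch below computes the film thickness needed for 45° of rotation and the rotation drift over a temperature excursion. It assumes, as a simplification not stated in the text, that the 0.062 °/°C coefficient applies to the finished 45° part and is linear in temperature; the function names are illustrative.

```python
def rotator_thickness_um(rotation_deg, specific_rotation_deg_per_um):
    """Film thickness that yields the requested Faraday rotation."""
    return rotation_deg / abs(specific_rotation_deg_per_um)


def rotation_drift_deg(temp_coeff_deg_per_c, delta_t_c):
    """Rotation change over a temperature excursion (linear assumption)."""
    return temp_coeff_deg_per_c * delta_t_c


thickness_um = rotator_thickness_um(45.0, 0.099)
# 90 degC corresponds to the full -20 to +70 degC range of Table 4.4.
drift_deg = rotation_drift_deg(0.062, 90.0)

print(f"thickness for 45 deg: {thickness_um:.1f} um")
print(f"drift over 90 degC:   {drift_deg:.2f} deg")
```

The thickness falls under the 500 μm goal of Table 4.4, while the drift figure shows why the temperature coefficient is a primary design constraint.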
All of the above-described iron garnets follow the linear saturation curve of Fig. 3.20. These garnets are non-latching and require the presence of a permanent magnet to maintain alignment of the magnetic domains. Latching garnets, in contrast, require only a one-time poling by a strong magnet and then retain their magnetization indefinitely under normal conditions. A latching garnet is perfectly hysteretic, as illustrated by the latching curve in Fig. 3.20. As a practical matter, once the garnet is poled, proper orientation in a component is critical: reversing the part will create high transmission rather than high isolation. To make the direction of magnetization easy to identify, reference [52] reports the idea of making the AR coating on the two sides of the film different in color. A light purple can be made from a three-layer coating while a bluish purple can be made from a single-layer coating.
A linear hysteresis loop results from the film's natural tendency to fracture into domains rather than remain in a single domain. The magneto-static energy of the film is proportional to the square of the saturation magnetization: a high $H_\mathrm{sat}$ provides sufficient energy for domain break up. However, doping with Ga and/or Al decreases the saturation magnetization, increasing in turn the hysteresis of the material. By lowering $H_\mathrm{sat}$ to below 100 G, latching can occur. As an example, $(\mathrm{Bi}_{0.75}\mathrm{Eu}_{1.5}\mathrm{Ho}_{0.75})(\mathrm{Fe}_{4.1}\mathrm{Ga}_{0.9})\mathrm{O}_{12}$ has a thickness of 86 μm for 45° rotation and a saturation magnetization of 14 Oe [7]. However, the temperature dependence necessarily increases due to the high Ga concentration. The temperature dependence of the preceding film is reported as 0.093 °/°C [21].

Table 4.5. Select Alloy Packaging Materials

Property                     Kovar(a)    Invar-36(b)   Units
Element
  Carbon                     0.020%      0.020%
  Silicon                    0.20%       0.20%
  Cobalt                     17%
  Manganese                  0.30%       0.35%
  Nickel                     29.0%       36.0%
  Iron                       balance     balance
Density                      8.39        8.09          g/cm³
Thermal Conductivity         17.3        10.5          W/(m·K)
Thermal Expansion(c)         5.85        1.30          10⁻⁶ cm/cm/K
Electrical Resistivity(d)    487         820           μΩ-mm
Melting Point                1470        1445

(a) Reported by Carpenter [12].
(b) Reported by Carpenter [11].
(c) 25 °C to 100 °C, (d) 70.0 °F.
The latching garnet is perfectly suited for an isolator placed at the output of a diode laser within the hermetic housing [22]. A semiconductor laser diode, used for signal and pump lasers, requires a miniature housing where elimination of the permanent magnet is a significant advantage. To maintain the lasing wavelength, laser-diode sub-mounts are attached to Peltier thermoelectric coolers that maintain the temperature. The latching garnet is also placed on the cooler. Under these conditions the latching garnet performs well. In passive component applications, however, where the garnet must remain stable over a wide temperature range, low temperature-dependent garnets are a better choice.
4.2.4 Packaging Alloys
Packaging materials play an integral role in component design and assembly because they house the optical core and must resist environmental interference. Industry standards such as [58] place stringent conditions on the heat and humidity a component must tolerate over its 25-year lifetime. Generally, telecommunications-grade components require hermetic sealing to protect the optical surfaces and attachment epoxies from heat and humidity. Specialty components such as air-gap interleavers do a final frequency tuning by adjusting an inert-gas overpressure prior to sealing the package, after which that overpressure is expected to be maintained.
Table 4.5 provides information on two commonly used alloys, Kovar and Invar. These alloys are suitable for their machinability and low thermal expansion coefficient. Typically these alloys are plated with one or two mils of nickel and gold, in that order. The nickel promotes the gold adhesion, and the gold prevents corrosion while providing a quality surface on which to solder and weld.

4.3 Fabry-Perot and Gires-Tournois Interferometers


Two elementary components are the Fabry-Perot (FP) and Gires-Tournois (GT) interferometers. Both interferometers are resonators that store energy between two partially reflecting mirrors when on resonance. For the Fabry-Perot, leakage of the stored energy through the partial reflectors interferes constructively (destructively) with the input light to alter the transmission (reflection) properties. By comparison, on anti-resonance there is no energy stored and the transmission is minimum. The Gires-Tournois is also a resonator but with the second mirror fully reflecting. The reflection coefficient is unity but there is a frequency-dependent delay associated with resonance.
The response of these two resonators is typically derived using an infinite series construct. Here, scattering and transmission matrices are first derived and then a matrix concatenation is used to reach the solution. In either formalism the final equations are the same, but the transformation matrix approach is applicable to a broader range of calculations.
The first step is to derive a scattering matrix S across an index-step boundary such as a partially reflecting mirror. The matrix relates the outputs across the boundary to the inputs (Fig. 4.3(S)):

$$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} s_1 & s_2 \\ s_3 & s_4 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \tag{4.3.1}$$

Since the system is lossless, the scattering matrix must be unitary:

$$S^\dagger S = I \tag{4.3.2}$$
The scattering matrix elements are found from (3.5.31) on page 97. A phase reference plane is established on either side of the boundary. On the side of the reflection, the phase reference is chosen so that $\Gamma/|\Gamma| = +1$, or $2k_{z1}z = 2n\pi$. On the side of transmission, the phase reference is chosen so that $T/|T| = -j$, or $k_{z2}z = (2n - 1/2)\pi$. One can say that for a given wavelength the phase reference plane for reflection coincides with the boundary surface and the phase reference plane for transmission is set back by a quarter-wave on the far side of the boundary. With these definitions and enforcing the unitary property of the scattering matrix, the scattering matrix of a partially transmissive mirror is

$$S = \begin{pmatrix} r & -jt \\ -jt & r \end{pmatrix} \tag{4.3.3}$$

where r and t are the reflection and transmission field amplitudes, respectively. The reflection and transmission coefficients are related by

$$r^2 + t^2 = 1 \tag{4.3.4}$$

which ensures a lossless interface. The physical interpretation of the $-j$ multiplier to the transmission coefficient is that the passage of a wave across the boundary is delayed by $\pi/2$ with respect to a wave reflected.

[Figure 4.3: S) inputs a1, a2 and outputs b1, b2 across a boundary between indices n1 and n2; T) forward waves f1, f2 and backward waves g1, g2 at positions z1 and z2.]

Fig. 4.3. Scattering and transmission coefficients across an index-step boundary. S) Scattering coefficients, inputs denoted by a, outputs denoted by b. T) Transmission coefficients, forward waves denoted by f, backward waves denoted by g.
Next the scattering matrix is converted to a transformation matrix. The latter matrix relates the forward and backward waves at one location $z_1$ to the forward and backward waves at another location $z_2$ (Fig. 4.3(T)). A transformation matrix can be concatenated with other transformation matrices to find the response of a more complicated system. This is not the case for the scattering matrix, which is the reason for the conversion. In regard to the figure, the field amplitudes of the scattering and transformation frameworks are related in the following way:

$$f_1 = a_1, \quad f_2 = b_2, \quad g_1 = b_1, \quad g_2 = a_2.$$

The field amplitudes are related to the transformation matrix as

$$\begin{pmatrix} f_1 \\ g_1 \end{pmatrix} = \begin{pmatrix} t_1 & t_2 \\ t_3 & t_4 \end{pmatrix} \begin{pmatrix} f_2 \\ g_2 \end{pmatrix} \tag{4.3.5}$$
The transformation matrix for the partially transmissive mirror is therefore

$$T(z_1, z_2) = \frac{j}{t} \begin{pmatrix} 1 & -r \\ r & -1 \end{pmatrix} \tag{4.3.6}$$

Note that the determinant of (4.3.6) is det(T) = jt; reflection appears as loss to the forward-going waves.

[Figure 4.4: a) air-gap Fabry-Perot with mirrors r1 and r2 separated by L, unit input, outputs r12 and t12, g2 = 0; b) solid block of index n2 in a medium of index n1 with face reflections r1 and −r1, outputs r11 and t11, g2 = 0.]

Fig. 4.4. Fabry-Perot interferometers with planar partially reflective mirrors. a) Two mirrors with reflectivities r1 and r2 separated by distance L. b) Model of a solid of index n with equal magnitude Fresnel reflections on either face. One reflection is the negative of the other.
A Fabry-Perot is modelled with two partially reflecting mirrors separated by gap L. Free propagation to position $z'$ from $z$, where $z' > z$, is simply

$$T(z', z) = \begin{pmatrix} e^{jkL} & 0 \\ 0 & e^{-jkL} \end{pmatrix} \tag{4.3.7}$$

where the wavenumber k is given by the dispersion relationship $k = n\omega/c$. Two Fabry-Perots are illustrated in Fig. 4.4, one with an air gap and the other made of a solid block.
An important simplification for finding the transmission and reflection coefficients of the FP is to null one input, in this case $g_2 = 0$ as illustrated in Fig. 4.4. The transmission and reflection amplitudes $t_{12}$ and $r_{12}$, respectively, are then determined from $t_{12} = f_2/f_1$ and $r_{12} = g_1/f_1$. The left- and right-hand sides are related by the FP transformation matrix

$$T_\mathrm{fp}(z_1, z_2) = \frac{-1}{t_1 t_2} \begin{pmatrix} 1 & -r_1 \\ r_1 & -1 \end{pmatrix} \begin{pmatrix} e^{jkL} & 0 \\ 0 & e^{-jkL} \end{pmatrix} \begin{pmatrix} 1 & -r_2 \\ r_2 & -1 \end{pmatrix} \tag{4.3.8}$$

Expansion of (4.3.8) and identification of terms leads to

$$t_{12} = \frac{-t_1 t_2}{e^{jkL} - r_1 r_2 e^{-jkL}} \tag{4.3.9a}$$

$$r_{12} = +\frac{r_1 e^{jkL} - r_2 e^{-jkL}}{e^{jkL} - r_1 r_2 e^{-jkL}} \tag{4.3.9b}$$

The transmitted and reflected powers are the squared magnitudes of the preceding expressions.
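The matrix concatenation can be checked numerically. The sketch below multiplies a mirror matrix, a propagation matrix, and a second mirror matrix, then extracts the two amplitudes with the backward input nulled. The mirror sign pattern (j/t)[[1, −r], [r, −1]] is an assumed reading of (4.3.6), and all function names are illustrative.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mirror(r):
    """Lossless mirror, (j/t) [[1, -r], [r, -1]] (assumed sign pattern)."""
    t = math.sqrt(1.0 - r * r)
    s = 1j / t
    return [[s, -s * r], [s * r, -s]]


def gap(kl):
    """Free propagation over a phase length kL."""
    return [[cmath.exp(1j * kl), 0.0], [0.0, cmath.exp(-1j * kl)]]


def fabry_perot(r1, r2, kl):
    """Return (t12, r12) with the backward input g2 set to zero."""
    m = matmul(matmul(mirror(r1), gap(kl)), mirror(r2))
    return 1.0 / m[0][0], m[1][0] / m[0][0]


t12, r12 = fabry_perot(0.6, 0.8, 0.7)
# For lossless mirrors the transmitted and reflected powers sum to one.
print(abs(t12) ** 2 + abs(r12) ** 2)
```

Because the mirrors are lossless, power conservation is a convenient sanity check on the sign conventions: any sign error in the mirror matrix usually breaks it.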

4.3.1 Fabry-Perot Response
The general transmission and reflection coefficients for a Fabry-Perot are given by (4.3.9). Two special cases are considered here.
For the first special case, $r_1 = r_2 = r$. The power coefficients are

$$|t_{11}|^2 = \frac{(1 - R)^2}{1 + R^2 - 2R\cos(2kL)} \tag{4.3.10a}$$

$$|r_{11}|^2 = \frac{2R\,(1 - \cos(2kL))}{1 + R^2 - 2R\cos(2kL)} \tag{4.3.10b}$$

where $R = r^2$, $T = t^2$, $R + T = 1$, and coefficient T and transformation matrix T are distinguished by context.
For the second special case, illustrated in Fig. 4.4(b), the reflection coefficients on the two boundaries have opposite sign: $r_2 = -r_1$. The power transmitted is

$$|t_{11}|^2 = \frac{(1 - R)^2}{1 + R^2 + 2R\cos(2kL)} \tag{4.3.11}$$

where the effect of the negative second reflection coefficient is to shift the transmission (and reflection) spectrum by a half period as compared with (4.3.10). An exemplar transmission spectrum is illustrated in Fig. 4.5, where the FP interferometer is a solid block.
The transmission spectrum of a Fabry-Perot is periodic and depends on the phase accumulation in the resonator. The transmission is related to the resonator phase as

$$|t_{11}|^2 = \begin{cases} 1, & \text{resonance: } 2kL = 2n\pi \\[4pt] \dfrac{(1 - R)^2}{(1 + R)^2}, & \text{anti-resonance: } 2kL = (2n + 1)\pi \end{cases} \tag{4.3.12}$$

Of course, if there is loss in the system then the transmission maximum is less than unity. Also, the conditions for resonance and anti-resonance for the solid-block FP are switched from those in (4.3.12).
In general, a modulation depth MD is defined as

$$\mathrm{MD} = \frac{I_\mathrm{max} - I_\mathrm{min}}{I_\mathrm{max} + I_\mathrm{min}} \tag{4.3.13}$$

where I is intensity. Substitution of (4.3.12) into (4.3.13) gives the modulation depth for a Fabry-Perot interferometer:

$$\mathrm{MD}_\mathrm{FP} = \frac{2R}{1 + R^2} \tag{4.3.14}$$

The range of modulation depth is from zero to unity and is governed solely by the mirror reflectivity R.
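The step from (4.3.12) and (4.3.13) to (4.3.14) can be confirmed numerically. This sketch evaluates the symmetric-mirror transmission at resonance and anti-resonance and compares the resulting modulation depth with the closed form; the function names and the test value R = 0.3 are illustrative choices.

```python
import math


def fp_transmission(R, round_trip_phase):
    """Power transmission of a symmetric lossless FP; phase is 2kL."""
    return (1.0 - R) ** 2 / (1.0 + R * R - 2.0 * R * math.cos(round_trip_phase))


def modulation_depth(R):
    """Closed-form modulation depth, 2R / (1 + R^2)."""
    return 2.0 * R / (1.0 + R * R)


R = 0.3
i_max = fp_transmission(R, 0.0)        # resonance
i_min = fp_transmission(R, math.pi)    # anti-resonance
md_from_extremes = (i_max - i_min) / (i_max + i_min)

print(f"MD from extremes: {md_from_extremes:.5f}")
print(f"MD closed form:   {modulation_depth(R):.5f}")
```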

[Figure 4.5: solid Fabry-Perot schematic (length L, unit input, reflection R11, transmission T11) and exemplar transmission spectrum with Tmax = 1, Tmin, the 50% level, FSR, and δf_FWHM marked on a frequency axis (ν_n, ν_o, ν_n+1).]

Fig. 4.5. Solid Fabry-Perot interferometer; the second reflection coefficient is the opposite sign of the first. For a unit input there is frequency-dependent transmission and reflection. Left: exemplar transmission spectrum for solid FP. A comb of transmission peaks exists in frequency, spaced by the free-spectral range (FSR) of the cavity. The modulation depth is governed solely by the boundary reflectivity.

Free-Spectral Range and Group Index
As illustrated in Fig. 4.5 the transmission (and reflection) spectrum is periodic. The periodic structure is dictated by the resonator phase $\phi = 2kL$. The free-spectral range (FSR) of the interferometer is defined as the frequency shift required to offset the spectrum by one full period. That is, the change in phase must be $\Delta\phi = 2\pi$. Expanding the phase to show radial frequency $\omega$,

$$\phi = 2\,\frac{\omega\, n(\omega)\, L}{c} \tag{4.3.15}$$

the phase difference is related to the frequency difference by

$$\Delta\phi = \frac{2L}{c}\left(\omega_2 n(\omega_2) - \omega_1 n(\omega_1)\right) = \frac{2L}{c}\,\Delta(\omega n(\omega)) \tag{4.3.16}$$

The difference on the right-hand side may be written

$$\Delta(\omega n(\omega)) = \frac{d(\omega n(\omega))}{d\omega}\,\Delta\omega = \left(n + \omega\frac{dn}{d\omega}\right)\Delta\omega \tag{4.3.17}$$

After the more customary conversion to wavelength from frequency, the group index is defined as

$$n_g = n - \lambda\frac{dn}{d\lambda} \tag{4.3.18}$$

Substitution of (4.3.17-4.3.18) into (4.3.16) and conversion to cycle frequency from radial frequency yields

$$\Delta\phi = 2\pi\,\frac{2 n_g L}{c}\,\Delta f \tag{4.3.19}$$

The free-spectral range $\mathrm{FSR} = \Delta f$ is defined for $\Delta\phi = 2\pi$, or

$$\mathrm{FSR} = \frac{c}{2 n_g L} \tag{4.3.20}$$

That the FSR is defined by the group index is a statement that it is the round-trip time of the optical energy, and not phase, that matters. The difference between phase and group index is critically important when building precision instruments such as optical clocks, where the phase and group velocities in a resonant cavity must be locked [33, 37]. Optical fibers also exhibit a difference between phase and group indices, a difference that varies from unspun to spun fibers, across vintages, and across fiber types [36].
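As a numerical illustration of (4.3.20), the sketch below evaluates the FSR of a 1 mm cavity and the inverse problem of sizing a cavity for a 100 GHz FSR. The group index of 1.5 is an assumed round number for a glass etalon, not a tabulated value, and the function names are illustrative.

```python
C_LIGHT_M_PER_S = 299_792_458.0


def free_spectral_range_hz(group_index, length_m):
    """FSR = c / (2 n_g L): the inverse of the cavity round-trip time."""
    return C_LIGHT_M_PER_S / (2.0 * group_index * length_m)


def cavity_length_m(group_index, fsr_hz):
    """Cavity length that yields the requested FSR."""
    return C_LIGHT_M_PER_S / (2.0 * group_index * fsr_hz)


fsr_hz = free_spectral_range_hz(1.5, 1.0e-3)
length_for_100ghz_m = cavity_length_m(1.5, 100.0e9)

print(f"FSR of a 1 mm cavity:   {fsr_hz / 1e9:.2f} GHz")
print(f"length for 100 GHz FSR: {length_for_100ghz_m * 1e3:.4f} mm")
```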
Resonant Bandwidth
Also illustrated in Fig. 4.5 is the full-width half-maximum $\delta f_\mathrm{FWHM}$ of the transmission spectrum. For a symmetric lossless Fabry-Perot, as long as the mirror reflectivity satisfies

$$R \ge \frac{\sqrt{2} - 1}{\sqrt{2} + 1} \tag{4.3.21}$$

then the transmission will fall below 50%. Setting (4.3.10) to 50% and solving for the resonant bandwidth $\delta f_\mathrm{FWHM}$ yields

$$\delta f_\mathrm{FWHM} = \frac{2}{\pi}\,\mathrm{FSR}\,\sin^{-1}\!\left(\frac{1 - R}{2\sqrt{R}}\right) \tag{4.3.22}$$

In the highly resonant case, where $\delta f_\mathrm{FWHM} \ll \mathrm{FSR}$, then

$$\delta f_\mathrm{FWHM} \simeq \frac{1 - R}{\pi\sqrt{R}}\,\mathrm{FSR} \tag{4.3.23}$$

Fabry-Perot interferometers are classically used as narrowband filters because the transmission on resonance is unity and a high mirror reflectivity makes the resonant bandwidth narrow. For practical applications, FP filters are not good wavelength-division multiplexing filters unless they are cascaded to make a multi-pole filter with higher roll off and flatter transmission profiles. The FP resonator is sometimes used, however, as a frequency standard on which to lock a transmission laser frequency.
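The exact and approximate bandwidth expressions can be compared directly. The sketch below assumes the forms $\delta f = (2/\pi)\,\mathrm{FSR}\,\sin^{-1}[(1-R)/(2\sqrt{R})]$ and $\delta f \simeq (1-R)\,\mathrm{FSR}/(\pi\sqrt{R})$, which follow from setting the symmetric-mirror transmission to one half; the names and the sample value R = 0.9 are illustrative.

```python
import math

# Smallest reflectivity for which the transmission dips to 50%.
R_MIN = (math.sqrt(2.0) - 1.0) / (math.sqrt(2.0) + 1.0)


def fp_transmission(R, round_trip_phase):
    """Power transmission of a symmetric lossless FP; phase is 2kL."""
    return (1.0 - R) ** 2 / (1.0 + R * R - 2.0 * R * math.cos(round_trip_phase))


def fwhm_exact(R, fsr):
    """Full-width half-maximum from the half-power condition."""
    if R < R_MIN:
        raise ValueError("transmission never falls to 50% for this R")
    return (2.0 / math.pi) * fsr * math.asin((1.0 - R) / (2.0 * math.sqrt(R)))


def fwhm_approx(R, fsr):
    """High-reflectivity approximation of the bandwidth."""
    return (1.0 - R) / (math.pi * math.sqrt(R)) * fsr


exact_ghz = fwhm_exact(0.9, 100.0)
approx_ghz = fwhm_approx(0.9, 100.0)
print(f"exact:  {exact_ghz:.4f} GHz")
print(f"approx: {approx_ghz:.4f} GHz")
```

At R = 0.9 the two expressions differ by well under one percent, so the approximation is adequate for typical high-finesse designs.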
Tolerancing the Resonance Frequency
The following tolerance analysis illustrates the necessary specifications for a telecommunications-grade Fabry-Perot interferometer. Consider a FP etalon made of BK7 glass that nominally aligns its transmission spectrum to a DWDM grid of C = 100 GHz in the C-band. The frequency-dependent argument (2kL) of the transmission expression (4.3.11) nominally aligns to the DWDM comb of frequencies $f_n$:

$$2\pi f_n\,\frac{2 n_g L}{c} = (2n + 1)\pi \tag{4.3.24}$$

Nominally FSR = C, so

$$f_n = \left(n + \frac{1}{2}\right) C \tag{4.3.25}$$

Due to errors, however, the free-spectral range may have an error. That error requires an offset $\delta f$ from $f_n$ to reach the nominal frequency comb. That is,

$$2\pi\,\frac{f_n}{\mathrm{FSR} \mp \delta\mathrm{FSR}} = 2\pi\,\frac{f_n \pm \delta f}{C} \tag{4.3.26}$$

Substitution of the nominal grid (4.3.25) into (4.3.26) gives the error proportionality

$$\frac{\delta\mathrm{FSR}}{C} = \frac{\delta f}{f_n} \tag{4.3.27}$$

Using the specification in §4.1 of $\delta f_n = \pm 2.5$ GHz and a center frequency of $f_n = 194.1$ THz, the allowable error on the free-spectral range is

$$\left|\frac{\delta\mathrm{FSR}}{C}\right| \le 13\ \mathrm{ppm} \tag{4.3.28}$$

This error tolerance is almost impossibly small.
To relax the tolerance one can separate the frequency-location error from
the FSR error. That is, rather than insisting the FSR exactly match the grid
exactly, the FSR is allowed to walko the grid over some bandwidth. This is
illustrated in Fig. 4.2(a). The calculation made in (4.1.3) gives a 1250 ppm
error tolerance to the FSR.
Separate from the FSR accuracy, the resonant peaks must still align to the
grid. At center frequency the number of modes in the cavity is n  1941. The
length of BK7 to realize the cavity, using round numbers, is L = 1.000 mm. As
there are 1941 modes in the cavity, the mode length is l = 0.515 m. A change
in cavity length by 0.515 m times an integral number adds or removes modes
to the cavity, but a non-integral length change shifts the frequency location
of the resonant peaks. Using the frequency-error tolerance fn = 2.5 GHz,
any change in cavity index-length product must be within


 (ng L) 
1.5


(4.3.29)
 n  1941 2.5%
or
|(ng l)| 0.020 m

(4.3.30)

Expansion of the index-length product to account for temperature dependence
gives

$$\delta(n_g L) = n_g\,\Delta T\,\frac{dL}{dT} + L\,\Delta T\,\frac{dn_g}{dT} \tag{4.3.31}$$

4.3 Fabry-Perot and Gires-Tournois Interferometers

161

Using the thermal expansion and thermal-optic coefficients for BK7 found in
Table 4.1 and accounting for a 50 °C temperature swing, the error budget
due to temperature change is

$$n_g\,\Delta T\,\frac{dL}{dT} + L\,\Delta T\,\frac{dn_g}{dT} \approx 0.683\ \mu\text{m} \tag{4.3.32}$$

In comparison with (4.3.30) one can clearly see that a bulk Fabry-Perot cavity
does not meet the tolerance requirements for a telecommunications component. These cavities require active temperature control to meet the necessary
specifications. In practice, the cavities are made with an air gap sealed hermetically into a package, and the construction materials are selected to minimize
temperature-dependent expansion.
4.3.2 Gires-Tournois Response
A Gires-Tournois interferometer replaces the second mirror with a perfect
reflector. In this case $t_2 = 0$ and $r_2 = 1$. Therefore $t_{12} = 0$ and the reflection
coefficient (4.3.9) simplifies to

$$r_{GT} = -\frac{r e^{jkL} - e^{-jkL}}{e^{jkL} - r e^{-jkL}} \tag{4.3.33}$$

The magnitude is unity for all frequencies: $|r_{GT}| = 1$. The GT interferometer
is an all-reflection filter, but the phase is frequency dependent. Making the
denominator real-valued, the reflection coefficient is

$$r_{GT} = \frac{\left(r e^{jkL} - e^{-jkL}\right)^2}{1 + r^2 - 2r\cos(2kL)} \tag{4.3.34}$$

Denoting the phase of the reflection as $\phi_{GT} = \angle r_{GT}$, the GT phase is

$$\phi_{GT} = -2\tan^{-1}\left(\frac{1 + r}{1 - r}\tan(kL)\right) \tag{4.3.35}$$

The phase reference plane from which $\phi_{GT}$ is measured is located on the
interface of the leading mirror.
In the limiting cases, when r = 0 the phase propagates normally over
length 2L while when r = 1 the reflection is negative one. Between these two
limits there is variation of the phase. The effect of the resonant cavity is more
clearly shown through the group delay $\tau_g$, where

$$\tau_g = -\frac{d\phi_{GT}}{d\omega} \tag{4.3.36}$$

or, by expansion

$$\tau_g = \frac{2 n_g L}{c}\,\frac{2h}{(1 + h^2) + (1 - h^2)\cos(2kL)} \tag{4.3.37}$$

[Figure 4.6: GTI schematic (partial reflector r, gap length L, index n) and its group-delay spectrum; the plot marks τ_g,max, τ_g,max/2, τ_g,min, the FSR, and δf_FWHM around the resonances ν_{n-1}, ν_n, ν_{n+1}.]

Fig. 4.6. Solid Gires-Tournois interferometer (GTI) with gap length L and refractive index n. The GTI has 100% reflection at all frequencies, but there is frequency-dependent delay of the reflection. The delay spectrum is similar to the transmission
spectrum of a Fabry-Perot interferometer in that it is periodic. The FSR of the
group-delay comb is dictated by the gap length and refractive index, and the maximum delay is dictated by the FSR and the reflectivity r of the partial reflector.

where

$$h = \frac{1 + r}{1 - r} \tag{4.3.38}$$

(compare (4.3.10) on page 157). Figure 4.6 illustrates an exemplar group-delay
spectrum. On resonance, $2kL = 2n\pi$ and the group delay is maximum because
the round-trip optical path is an integral multiple of the resonance wavelength, allowing
the storage of energy. On anti-resonance the group delay is at a minimum as
there is no energy stored. Indeed the maximum and minimum group delays
are

$$\tau_{g,\max} = \frac{1}{\mathrm{FSR}}\,\frac{1 + r}{1 - r}, \qquad 2kL = 2n\pi \tag{4.3.39a}$$

$$\tau_{g,\min} = \frac{1}{\mathrm{FSR}}\,\frac{1 - r}{1 + r}, \qquad 2kL = (2n + 1)\pi \tag{4.3.39b}$$

One can see that the group delay is related to the inverse of the free-spectral
range and the leading mirror partial reflectivity. The inverse free-spectral
range is a unit of delay for the cavity. As with the Fabry-Perot interferometer,
the free-spectral range is related to cavity parameters by (4.3.20).
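The relation between the group-delay spectrum (4.3.37) and its extremes (4.3.39) can be checked numerically. The sketch below is illustrative only: it takes 2kL = 2πf/FSR (neglecting any difference between refractive and group index), and the 100 GHz FSR and r = 0.5 are assumed values, not figures from the text.

```python
import math

def gt_group_delay(f_hz, fsr_hz, r):
    """Group delay (4.3.37) of a GT interferometer, in seconds.

    Uses 2kL = 2*pi*f/FSR and 2*n_g*L/c = 1/FSR, with h from (4.3.38).
    """
    h = (1.0 + r) / (1.0 - r)
    phase = 2.0 * math.pi * f_hz / fsr_hz  # round-trip phase 2kL
    return (1.0 / fsr_hz) * 2.0 * h / ((1.0 + h**2) + (1.0 - h**2) * math.cos(phase))

fsr = 100e9   # assumed free-spectral range, Hz
r = 0.5       # assumed reflection coefficient of the leading mirror
tau_max = gt_group_delay(0.0, fsr, r)        # on resonance, 2kL = 2n*pi
tau_min = gt_group_delay(fsr / 2.0, fsr, r)  # anti-resonance, 2kL = (2n+1)*pi
print(f"tau_max = {tau_max * 1e12:.2f} ps, tau_min = {tau_min * 1e12:.2f} ps")
```

For these inputs the two values equal (1/FSR)(1 + r)/(1 − r) and (1/FSR)(1 − r)/(1 + r), as (4.3.39a) and (4.3.39b) state.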
Conservation of Delay Bandwidth Product
The Gires-Tournois interferometer group-delay response obeys a conservation rule in that the product of peak delay and resonance bandwidth is a
constant for fixed r. Following the Fabry-Perot example, the resonance bandwidth $\delta f_{FWHM}$ of the group delay can be calculated at one-half the maximum
level. Setting (4.3.37) equal to $\tau_{g,\max}/2$, one finds that

$$\delta f_{FWHM} = \frac{\mathrm{FSR}}{\pi\sqrt{h^2 - 1}} \tag{4.3.40}$$

The peak delay/bandwidth product is therefore

$$\tau_{g,\max}\,\delta f_{FWHM} = \frac{h}{\pi\sqrt{h^2 - 1}} = \frac{1 + r}{2\pi\sqrt{r}} \tag{4.3.41}$$

The partial reflectivity of the front mirror alone governs the peak delay/bandwidth product. The FSR of the cavity plays no role but rather is scaled out.
Higher peak delays are achieved with higher reflectivities (cf. (4.3.39)), but
the bandwidth (4.3.40) narrows by a commensurate amount. The partial reflectivity of the leading mirror is the only degree of freedom for this simple
GT, so independent control of bandwidth and peak delay is not possible.

4.4 Temperature Dependence of Select Birefringent Crystals
There is little data available on the temperature and wavelength dependence
of the refractive and group indices for components-grade birefringent crystals
in the 1520–1610 nm wavelength region. Yet birefringent components such as
interleavers require a detailed specification of these properties. Moreover, for
components such as the off-axis compound crystals, detailed in §4.5, the design
requires separate determination of the ordinary and extraordinary indices, not
just their difference.
Measurement of the refractive and group indices requires different techniques. The method presented below is suitable to ascertain the temperature- and coarse wavelength-dependence of the group indices along the two birefringent axes. The group index accounts for first-order wavelength dependence
and determines the group velocity (cf. (4.3.18)). The refractive index is defined at a specific wavelength and determines the refraction properties. The
measurement technique here involves the response to a variation in the input
wavelength, so it is the group index that is measured.
4.4.1 Experimental Setup
Crystal samples of YVO4, LiNbO3, α-BBO, and calcite were fabricated into
Fabry-Perot resonators. The samples were cut as waveplates, with the extraordinary axis lying in the polished face of the crystal, perpendicular to the light.
The samples were nominally 1.000 mm thick, and each was coated
with a single thin-film layer on either face to bring the Fresnel reflectivity to
approximately 50%. The parallelism specification called for less than 5 arcsec
of wedge.
Figure 4.7(a) illustrates the experimental setup. The Fabry-Perot etalon
response of each sample was measured with a super-luminescent diode (SLD)
having 100 nm bandwidth and less than 0.25 dB ripple, and an optical spectrum analyzer with 0.05 nm resolution. The measurement band was 1515–1575 nm
for all samples. The optical spectrum analyzer (OSA) was set to

[Figure 4.7, panel labels: a) SLD, fiber, lens, λ/2, polarizer, crystal, metal, insulator, lens, fiber, OSA; b) SLD, fiber, circulator, power meter, lens, 5-axis alignment.]

Fig. 4.7. Illustration of experimental setup to measure temperature dependence
of group index. a) A superluminescent diode (SLD) provides broadband, ripple-free, polarized light in the 1515–1575 nm band. The polarizer is aligned to the
extraordinary or ordinary axis of the sample crystal etalon, the latter being loaded
in a heating cell. The light is collected by an optical spectrum analyzer. b) To ensure
that the crystal sample is perpendicular to the light, back-reflection is collected
through a circulator and the 5-axis alignment stage is adjusted to maximize this
reflection.

maximum sensitivity and eight averages per scan. It is noted that the spectra
were stable and averaging created little change.
To couple light through a sample, a single-mode fiber was routed from
the SLD to a collimator, the collimator expanded the beam to approximately 0.75 mm in diameter, and the collimated beam transited the sample.
The beam was refocused through a second lens to a single-mode fiber which
was routed to the OSA. Since the samples are birefringent, an in-line Polarcor polarizer was placed before the sample. Rotation of the polarizer selected
either the ordinary or extraordinary axis of the sample, or a mixture of the
two. For the measurements, the polarizer was always rotated for maximum
extinction of one axis or the other. No measurement was made as to the extinction ratio, but visual inspection of the spectrum on the OSA indicated
that the extinction was better than 10 dB. Since the SLD generates linearly
polarized white light, a half-wave waveplate was located before the polarizer
and independently rotated to maximize transmission through the polarizer.
For each experiment, the sample crystal was loaded into a small brass
fixture that supplied resistive heating. The aperture was rectangular and the
position of one wall of the aperture was adjustable by a screw. In order to
allow the crystal to expand physically the screw was only lightly tightened. The
brass fixture was insulated with Delrin and Teflon. A closed-loop temperature
controller controlled the heating of the fixture to within 0.1 °C. The samples

[Figure 4.8: YVO4 Fabry-Perot, ordinary and extraordinary. Two panels, a) and b), of power (mW, 0.0 to 1.5) versus frequency (190.4 to 191.0 THz); panel a) marks f_n, f_{n+1}, and f_ref.]

Fig. 4.8. Measured Fabry-Perot etalon response of YVO4 crystal sample. a) Transmission response along the ordinary axis. b) Transmission response along the extraordinary axis. Note the free-spectral range along the two axes is different.

were taken from 25 °C to 100 °C in 5 °C increments. Five minutes were allowed
for thermal stabilization at each temperature.
A critical attribute of this experiment is the optical path through the
sample. When the sample is canted the path length increases, but the increase is not an easily measurable quantity. To guarantee that the crystal
was positioned perpendicular to the beam, a preliminary alignment was done.
Figure 4.7(b) illustrates the setup with the waveplate and polarizer removed,
and an optical circulator inserted between the SLD and sample. The staging
that holds the sample was adjusted to maximize the back-reflection. Once
maximized the sample was considered aligned. This measurement was performed before and after the sample was temperature cycled to assess the degree of position change. It is believed that position shifts did not affect the data to within
the present level of accuracy.
Figure 4.8 shows the measured transmission response of a YVO4 etalon
along the ordinary and extraordinary axes over a narrow bandwidth. In both
spectra there is a comb of resonant peaks and the period of the peaks differs for
the two axes. As YVO4 is a positive uniaxial crystal, the free-spectral range
of the extraordinary axis is narrower than that of the ordinary axis. Each
peak, for either spectrum, corresponds to a resonant mode. As the frequency
is increased, more modes are added to the cavity; this is indicated at frequencies $f_n$ and $f_{n+1}$ in Fig. 4.8(a). Additionally, a reference frequency $f_{ref}$ is
defined to measure spectral shift with temperature. The choice of $f_{ref}$ is arbitrary but remains constant throughout. A spectral phase is defined between
resonant frequency $f_n$ and the reference frequency as

[Figure 4.9: YVO4 Fabry-Perot, ordinary and extraordinary. Two panels, a) and b), of power (mW, 0.0 to 1.5) versus frequency (190.4 to 191.0 THz), each with traces at temperatures T1, T2, and T3.]

Fig. 4.9. Measured Fabry-Perot etalon response of YVO4 crystal sample over increasing temperature, 10 °C increments. The comb of resonant frequencies shifts to
lower frequency, indicating an increase in the index-length product of the sample. a)
Ordinary axis. b) Extraordinary axis. Note the temperature-dependent shift differs
between the two axes.

$$\phi_n = 2\pi\,\frac{f_n - f_{ref}}{\mathrm{FSR}} \tag{4.4.1}$$

This spectral phase will be used to extract the thermal-optic coefficient in the
following.
Figure 4.9 shows the measured transmission response of the YVO4 crystal
as the temperature is increased from T1 to T3. For both axes the comb of
resonant peaks shifts to lower frequencies. When the cavity expands or when
the refractive index increases with increased temperature, a lower frequency
is required to maintain the same mode order n that corresponds to resonant
frequency $f_n$. The figure shows that for both the ordinary and extraordinary
axes the product of the index and length increases with temperature, detuning
the resonance comb to lower frequencies.
What is also evident in Fig. 4.9 is that the index-length product changes by
different degrees for the ordinary and extraordinary axes. This is typical of all
birefringent crystals. One effect of this birefringent temperature dependency
is that the birefringent phase changes with temperature. This is a distinct
disadvantage for birefringent filters that need to remain locked to the DWDM
grid.
4.4.2 Quadratic Temperature-Dependence Model
The transmission expression (4.3.11) on page 157 is a function of phase
$\phi = 2kL$. Expansion of the phase with temperature to second order gives

$$\phi(T) \approx \phi_o + \phi'\,\Delta T + \frac{1}{2}\phi''\,(\Delta T)^2 \tag{4.4.2}$$

Similarly, the equivalent expression $2kL$ is expanded to second order as

$$\frac{2\omega}{c}\,n_g L \approx \frac{2\omega}{c}\left[(n_g L) + \frac{d(n_g L)}{dT}\,\Delta T + \frac{1}{2}\frac{d^2(n_g L)}{dT^2}\,(\Delta T)^2\right] \tag{4.4.3}$$

In light of these expansions, the temperature-dependent phase $\phi(T)$ can be
approximated to second order as

$$\phi(T) \approx \frac{2\pi f}{\mathrm{FSR}}\left[1 + K^{(1)}\Delta T + \frac{1}{2}K^{(2)}(\Delta T)^2\right] \tag{4.4.4}$$

where the linear and quadratic temperature coefficients are defined as

$$K^{(1)} = \frac{1}{n_g L}\,\frac{d(n_g L)}{dT} \tag{4.4.5a}$$

$$K^{(2)} = \frac{1}{n_g L}\,\frac{d^2(n_g L)}{dT^2} \tag{4.4.5b}$$

Extension of the temperature dependence to second order is necessary because the first-order term can be identically cancelled by proper selection of
complementary crystals, as detailed in §4.4.5. However, the quadratic term
cannot be so cancelled, leaving a residual error that may be significant given
the requirements for telecommunications-grade components.
4.4.3 Association of Resonant Peak Shift With Temperature Coefficients
The frequency locations of the etalon resonances shift with temperature due to
changes in the $n_g L$ product. At the nth resonance the optical phase is $\phi = 2n\pi$.
When the temperature changes the peak frequencies shift to maintain this
resonant condition. One can write the resonance frequency on the nth mode
for two different temperatures as

$$f_n(T_1) = \frac{nc}{2 n_g L} \tag{4.4.6a}$$

$$f_n(T_2) = \frac{nc}{2\left(n_g L + \delta(n_g L)\right)} \tag{4.4.6b}$$

The change $\delta(n_g L)$ can then be expressed as a function of resonant peak
frequencies:

$$\frac{\delta(n_g L)}{n_g L} = \frac{f_n(T_1)}{f_n(T_2)} - 1 \tag{4.4.7}$$

Using the quadratic expansion (4.4.3) and temperature coefficient definitions (4.4.5), the ratio of resonant frequencies is related to the temperature
dependence as

$$\frac{f_n(T)}{f_n(T + \Delta T)} - 1 = K^{(1)}\Delta T + \frac{1}{2}K^{(2)}(\Delta T)^2 \tag{4.4.8}$$

If the frequency peaks can unambiguously be determined from the data, then
estimates of $K^{(1)}$ and $K^{(2)}$ can be determined from (4.4.8).
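One way to obtain the two coefficients from (4.4.8) is an ordinary least-squares fit of the measured frequency ratios against ΔT. The sketch below is only an illustration of that idea, not the procedure used by the author: it solves the 2×2 normal equations directly and is exercised on synthetic data generated from made-up coefficient values.

```python
def fit_temperature_coefficients(delta_t, ratio_minus_one):
    """Least-squares fit of y = K1*dT + 0.5*K2*dT**2, per (4.4.8).

    delta_t: temperature offsets dT (K); ratio_minus_one: f_n(T)/f_n(T+dT) - 1.
    Returns (K1, K2) in 1/K and 1/K**2.
    """
    # Normal equations for the two basis functions a = dT and b = 0.5*dT**2.
    saa = sab = sbb = say = sby = 0.0
    for dt, y in zip(delta_t, ratio_minus_one):
        a = dt
        b = 0.5 * dt * dt
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * y
        sby += b * y
    det = saa * sbb - sab * sab
    k1 = (say * sbb - sby * sab) / det
    k2 = (saa * sby - sab * say) / det
    return k1, k2

# Synthetic check: generate data from known coefficients and recover them.
true_k1, true_k2 = 8.0e-6, 0.015e-6          # illustrative values only
dts = [float(t) for t in range(-35, 40, 5)]  # offsets about a mid-range temperature
ys = [true_k1 * dt + 0.5 * true_k2 * dt * dt for dt in dts]
k1, k2 = fit_temperature_coefficients(dts, ys)
print(f"K1 = {k1:.3e} /K, K2 = {k2:.3e} /K^2")
```

Centering ΔT on the middle of the temperature range, as done here, keeps the linear and quadratic basis functions nearly orthogonal, which makes the two estimates less sensitive to each other.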
A problem with the determination of $f_n(T)$ is that the resonant peak
locations of the etalon do not coincide with the frequencies at which the OSA
measures the transmitted power. This can be observed by close inspection
of Fig. 4.8. The periodic nature of the spectrum, however, can be used to
advantage by applying a Fourier analysis. Using a Fourier transform to extract
the spectral phase of a mode improves the certainty of the phase
value. The phase difference between two temperatures determines $f_n(T + \Delta T)$
via

$$f_n(T + \Delta T) = \frac{\mathrm{FSR}}{2\pi}\left(\phi_n(T + \Delta T) - \phi_n(T)\right) + f_n(T) \tag{4.4.9}$$

as long as $\mathrm{FSR}(T + \Delta T) \approx \mathrm{FSR}(T)$.
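The idea behind (4.4.9) can be sketched with a single-tone Fourier projection on synthetic data. This is an illustration under simplifying assumptions, not the author's analysis code: the spectrum is an ideal raised cosine, the window spans an exact integer number of periods (real data would need windowing), and the 75 GHz FSR and 6 GHz shift are made-up values.

```python
import cmath
import math

def spectral_phase(freqs_hz, power, fsr_hz, f_ref_hz):
    """Phase of the fundamental tone (period = FSR) of a sampled comb spectrum.

    The sign is chosen so that the phase grows as the peaks move up in
    frequency, matching the definition (4.4.1).
    """
    tone = sum(p * cmath.exp(2j * math.pi * (f - f_ref_hz) / fsr_hz)
               for f, p in zip(freqs_hz, power))
    return cmath.phase(tone)

def peak_shift(phase_ref, phase_new, fsr_hz):
    """Frequency shift of the comb from a phase difference, following (4.4.9)."""
    # Wrap the difference into (-pi, pi] so shifts below FSR/2 are unambiguous.
    dphi = math.atan2(math.sin(phase_new - phase_ref),
                      math.cos(phase_new - phase_ref))
    return fsr_hz / (2.0 * math.pi) * dphi

# Synthetic comb with peaks at f0 + n*FSR, sampled over exactly 8 periods.
fsr = 75e9                 # assumed free-spectral range, Hz
f_ref = 190.4e12           # arbitrary reference frequency, Hz
samples = 128
freqs = [f_ref + i * (8 * fsr / samples) for i in range(samples)]

def comb(f0):
    return [1.0 + math.cos(2.0 * math.pi * (f - f0) / fsr) for f in freqs]

true_shift = -6e9          # comb moves 6 GHz to lower frequency
phi_a = spectral_phase(freqs, comb(f_ref), fsr, f_ref)
phi_b = spectral_phase(freqs, comb(f_ref + true_shift), fsr, f_ref)
print(f"recovered shift = {peak_shift(phi_a, phi_b, fsr) / 1e9:.3f} GHz")
```

Because the phase is only known modulo 2π, the recovered shift is unambiguous only while it stays below half an FSR between successive temperatures.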
There is a certain tradeoff for the Fourier analysis. On one hand the spectral phase accuracy is improved by taking the Fourier transform over a large
number of periods. On the other hand, association of the resultant spectral
phase to a particular mode is increasingly less certain as the transform window
includes more modes. The presence of $f_n(T)$ on the right-hand side of (4.4.9)
requires a certainty of the mode to which the spectral phase is associated. The
wavelength range for these measurements is 1515–1575 nm, or about 3.9%
spectral coverage. If the Fourier transform were taken over the entire data
set there would be a ±2% error in the certainty of $f_n$.
The tradeoff used here is to partition the data set into subsets having
128 points, or about 3.8 nm, each. The certainty of $f_n$ is then ±0.12%. The
spectral phase of the fundamental tone for each partition was extracted from
its Fourier transform, and the sequence of phases for a temperature ramp was
inserted into (4.4.9). This was done for each data partition and the results were
compared. Overall, there was no discernible trend from partition to partition,
although small differences were evident.
For each partition and at each temperature the free-spectral range was
estimated via

$$\mathrm{FSR}^{(est)}(T) = \frac{f_k(T) - f_j(T)}{N_p - 1} \tag{4.4.10}$$

where the approximate peak frequencies $f_j$ and $f_k$ were taken at either end of the
partition, and $N_p$ is the number of peaks in a partition. The FSR estimates
over all temperatures were compared. There was no trend of the FSR estimates, which is reasonable since only one or two modes were added over the
total temperature range, depending on the material.
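The estimate (4.4.10) is simply the span between the end peaks of a partition divided by the number of gaps between peaks. A minimal sketch, with a hypothetical function name and made-up peak frequencies:

```python
def estimate_fsr(peak_freqs_hz):
    """FSR estimate (4.4.10): span between end peaks over the number of gaps."""
    n_peaks = len(peak_freqs_hz)
    if n_peaks < 2:
        raise ValueError("need at least two peaks to estimate the FSR")
    return (peak_freqs_hz[-1] - peak_freqs_hz[0]) / (n_peaks - 1)

# Illustrative peak list (hypothetical values, evenly spaced by 75.8 GHz).
peaks = [190.40e12 + i * 75.8e9 for i in range(7)]
print(f"FSR estimate = {estimate_fsr(peaks) / 1e9:.1f} GHz")
```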
4.4.4 Group Index and Thermal-Optic Coefficients
Table 4.6 tabulates the results of the data analysis, and Figs. 4.10–4.11 show
the determination of $\delta(n_g L)$ over temperature for the four crystals. The coefficients were generated for $\Delta T = 0$ at $T_o = 62.5$ °C. Figure 4.10 shows the
change of the group-index–length product as a function of temperature, the
room-temperature group index being subtracted from the data to better show
the variation. The solid lines are the quadratic curve fit. LiNbO3 is clearly
the most temperature-dependent material of the four. Calcite has a negative
thermal-optic coefficient for its ordinary axis. Whether the negative coefficient
comes from change in group index or length cannot be distinguished from this
experiment.

[Figure 4.10: four panels (YVO4, LiNbO3, α-BBO, Calcite) of δ(n_g L) (ppt) versus temperature (°C, 20 to 100), with separate ordinary (ord) and extraordinary (ext) traces.]

Fig. 4.10. Measured and quadratically fit temperature-dependent refractive index
change for select crystals. The ordinary and extraordinary axes are separately measured. LiNbO3 has a much larger thermal-optic effect than the other crystals. Calcite
has a negative thermal-optic effect on the ordinary axis.

Figure 4.11 shows the same data as in Fig. 4.10 but with the best-fit
linear slope removed. This highlights the deviation from linear temperature
dependence. Again, LiNbO3 has the largest quadratic effect. Calcite shows an
irregular undulation for the ordinary axis. The quadratic terms for YVO4 and
α-BBO are small but may be significant.

[Figure 4.11: four panels (YVO4, LiNbO3, α-BBO, Calcite) of the linear-fit error (ppt, -0.08 to 0.08) versus temperature (°C, 20 to 100).]

Fig. 4.11. Measured and estimated quadratic residual change in index, with the
linear term removed. LiNbO3 has a large quadratic shift. Calcite shows a spurious
undulation along the ordinary axis.

Table 4.6. Group Index and Thermal-Optic Coefficients

Property      YVO4       LiNbO3     α-BBO      Calcite    Units
n_g,e         2.1918     2.1834     1.5296     1.4832
n_g,o         1.9770     2.2656     1.6593     1.6593
Δn_g          +0.2148    −0.0822    −0.1297    −0.1761
K_e^(1)       4.9077     32.228     5.2729     1.3008     10^-6/K
K_o^(1)       8.1226     15.351     1.7364     −4.9372    10^-6/K
K_Δn^(1)      −24.682    −432.93    −39.971    −57.477    10^-6/K
K_e^(2)       0.0112     0.0707     0.0069     0.0035     10^-6/K^2
K_o^(2)       0.0149     0.0424     0.0061     −0.0009    10^-6/K^2
K_Δn^(2)      −0.0224    −0.7093    −0.0033    −0.0380    10^-6/K^2

Measurement range: 1515–1575 nm, 25 °C–100 °C.
ΔT = 0 at T = 62.5 °C.

In some cases the thermal-optic coefficient K for the birefringence needs
to be known, rather than the coefficient for either index. The birefringent
coefficient for linear and quadratic terms is derived from

$$K^{(1,2)}_{\Delta n_g} = \frac{1}{\Delta n_g}\left(n_{g,e} K_e^{(1,2)} - n_{g,o} K_o^{(1,2)}\right) \tag{4.4.11}$$

Passive compensation of birefringent phase requires this combined coefficient.
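As a consistency check on (4.4.11), the linear birefringent coefficients can be recomputed from the per-axis entries of Table 4.6. In the sketch below the negative sign on calcite's ordinary-axis coefficient follows the statement in the text that this coefficient is negative; the function name is hypothetical.

```python
def birefringent_coefficient(n_e, n_o, k_e, k_o):
    """Birefringent thermal-optic coefficient from (4.4.11)."""
    return (n_e * k_e - n_o * k_o) / (n_e - n_o)

# Group indices and linear coefficients (in 1e-6/K) as read from Table 4.6.
# The negative sign on calcite's ordinary-axis coefficient follows the text.
table = {
    "YVO4":    (2.1918, 1.9770, 4.9077, 8.1226),
    "LiNbO3":  (2.1834, 2.2656, 32.228, 15.351),
    "a-BBO":   (1.5296, 1.6593, 5.2729, 1.7364),
    "Calcite": (1.4832, 1.6593, 1.3008, -4.9372),
}
results = {}
for name, (n_e, n_o, k_e, k_o) in table.items():
    results[name] = birefringent_coefficient(n_e, n_o, k_e, k_o)
    print(f"{name:8s} K_dn(1) = {results[name]:8.2f} x 1e-6/K")
```

All four values come out negative, which is the property used later to decide whether a crystal pair must be aligned or crossed.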
4.4.5 Passive Temperature Compensation
In applications of birefringent filters, birefringent crystals cut as waveplates
are used to generate a periodic filter response. The periodicity is generated by
[Figure 4.12: a) first crystal with index curves n_e1(T) and n_o1(T) and birefringence Δn1; b) second crystal with index curves n_o2(T) and n_e2(T) and birefringence Δn2; c) cascade of lengths L1 and L2 with combined index curves n+(T) and n−(T). The temperature axis is marked Ta and Tb.]

Fig. 4.12. Passive temperature compensation of birefringent phase. a) A birefringent crystal cut as a waveplate; the line in the face indicates the extraordinary axis. The temperature dependence of the refractive index differs for the ordinary and extraordinary indices,
as indicated on the right graph. b) Another birefringent crystal having different
thermal-optic coefficients from a). c) The length ratio of the two crystals is designed so that
the temperature dependencies of n+ and n− are matched. The birefringent phase of the
cascade is temperature-compensated to first order.

the frequency-dependent polarization transformation of the waveplate rather
than by the Fabry-Perot effect discussed in the preceding sections. Like the
Fabry-Perot interferometers, the comb spectrum of a waveplate shifts in frequency due to the temperature dependence of the birefringence.
The birefringent phase is defined through the difference between extraordinary and ordinary indices as

$$\phi = \frac{\omega}{c}\,(n_e - n_o)\,L \tag{4.4.12}$$

A change in temperature changes the phase, to first order, as

$$\Delta\phi = \frac{\omega}{c}\,(n_e K_e - n_o K_o)\,L\,\Delta T \tag{4.4.13}$$

Consider for example a YVO4 waveplate that is 10 mm long, and the temperature is changed by 75 °C. Using the entries from Table 4.6 one finds the
birefringent phase changes by $\Delta\phi/2\pi \approx 2.6$ in magnitude, or over two waves. The comb spectrum shifts by more than two beats. For a LiNbO3 waveplate of equal length
the birefringent phase shifts by $\Delta\phi/2\pi \approx 17.1$.
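The YVO4 figure can be reproduced from (4.4.13) with the Table 4.6 entries. The sketch below assumes an optical frequency of 194.1 THz (the center frequency used elsewhere in the chapter; the text does not state which frequency was used for this example) and reports the phase in waves, that is, in units of 2π.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def birefringent_phase_shift_waves(f_hz, n_e, n_o, k_e, k_o, length_m, delta_t):
    """First-order phase change (4.4.13), returned in waves (units of 2*pi)."""
    omega = 2.0 * math.pi * f_hz
    dphi = omega / C_LIGHT * (n_e * k_e - n_o * k_o) * length_m * delta_t
    return dphi / (2.0 * math.pi)

# YVO4 entries from Table 4.6 (group indices; K in 1/K); 194.1 THz is assumed.
waves = birefringent_phase_shift_waves(
    194.1e12, 2.1918, 1.9770, 4.9077e-6, 8.1226e-6, 10e-3, 75.0)
print(f"YVO4, 10 mm, 75 K: {waves:+.2f} waves")
```

The negative sign reflects that, for YVO4, the ordinary-axis term in (4.4.13) is the larger of the two; the magnitude agrees with the value of about 2.6 waves quoted above.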
Figure 4.12(a,b) illustrates the reason for the change. The ordinary and
extraordinary indices generally have different thermal-optic coefficients, which
in turn causes a temperature-dependent shift in birefringent phase.
To correct for first-order temperature dependence, two complementary
crystals can be cascaded. As illustrated in Fig. 4.12(c), given the correct length


ratio the temperature dependence on one polarization axis can be adjusted to
equal the temperature dependence on the orthogonal polarization axis. The
term polarization axis is used here because the extraordinary axes of the
two crystals may be aligned or crossed, depending on the materials, so the
designation of extraordinary axes for the combination loses its meaning.
In practice, an athermal crystal pair is designed for a specific free-spectral
range. The length ratio of the pair sets the athermalization and the total
length sets the free-spectral range. The free-spectral range for a waveplate is
determined from the differential group delay $\Delta\tau$, where $\Delta\tau$ is defined as

$$\Delta\tau = \frac{\Delta n_g L}{c} \tag{4.4.14}$$

and the waveplate free-spectral range is $\mathrm{FSR} = 1/\Delta\tau$. Note that the FSR of
a waveplate is different from that of a Fabry-Perot (4.3.20): the waveplate FSR is
governed by its group-index difference and crystal length (see (3.6.35) on
page 121); the Fabry-Perot FSR is governed by the group index and twice the
cavity length. The combination of two crystals gives a total differential group
delay of

$$\Delta\tau_s = \Delta\tau_1 \pm \Delta\tau_2 \tag{4.4.15a}$$

$$\Delta\tau_s = \left(\Delta n_{g,1} L_1 \pm \Delta n_{g,2} L_2\right)/c \tag{4.4.15b}$$

where the + and − signs refer to parallel and perpendicular alignment, respectively, of the extraordinary axes of the two crystals.
The temperature dependence of the birefringent phase to first order is

$$\Delta\phi = \frac{\omega}{c}\,\Delta n_g L\,K^{(1)}_{\Delta n_g}\,\Delta T \tag{4.4.16}$$

Stripping unnecessary sub- and superscripts, the combined temperature dependence is cancelled when

$$\frac{\omega}{c}\left(\Delta n_1 L_1 K_1 \pm \Delta n_2 L_2 K_2\right) = 0 \tag{4.4.17}$$

where the ± sign has the same meaning as for (4.4.15b). Combining (4.4.15b)
and (4.4.17), the length ratio is

$$L_2/L_1 = \mp\frac{\Delta n_1 K_1}{\Delta n_2 K_2} \tag{4.4.18}$$

and the length of the first crystal is

$$L_1 = \frac{c\,\Delta\tau_s}{\Delta n_1 \pm \Delta n_2\,(L_2/L_1)} \tag{4.4.19}$$
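A short sketch shows how (4.4.18) and (4.4.19) determine the two lengths for a target delay. It is an illustration under stated assumptions: the birefringence is signed (positive for YVO4, negative for LiNbO3), the birefringent coefficients are taken as negative as inferred from (4.4.11), and the function name and `aligned` flag are hypothetical.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm/ps

def athermal_pair(dn1, k1, dn2, k2, delay_ps, aligned=True):
    """Crystal lengths (mm) from (4.4.18) and (4.4.19).

    dn1, dn2: signed group birefringence; k1, k2: birefringent coefficients.
    aligned=True uses the + sign of (4.4.15b), False the - sign.
    """
    sign = 1.0 if aligned else -1.0
    ratio = -sign * (dn1 * k1) / (dn2 * k2)   # L2/L1, from (4.4.17)
    l1 = C_MM_PER_PS * delay_ps / (dn1 + sign * dn2 * ratio)
    return l1, ratio * l1

# YVO4 followed by LiNbO3 with aligned e-axes, Table 4.6 values (K in 1e-6/K;
# only the ratio of the K values matters). The negative signs are inferred.
l1, l2 = athermal_pair(0.2148, -24.682, -0.0822, -432.93, 10.0, aligned=True)
print(f"L1 = {l1:.3f} mm, L2 = {l2:.3f} mm, total = {l1 + l2:.3f} mm")
```

For a 10 ps target delay this gives lengths close to 14.80 mm and 2.205 mm.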

The one necessary variable to attain a solution for any pair of crystals is
the alignment or crossing of the extraordinary axes. Table 4.6 shows that

Table 4.7. Passive Temperature Compensation Combinations

Combination          L1 (mm)    L2 (mm)    L1 + L2 (mm)
YVO4 + LiNbO3        14.801     2.205      17.005
YVO4 + α-BBO         36.488     37.315     73.803
YVO4 + Calcite       24.461     12.812     37.273
α-BBO − LiNbO3       25.465     3.710      29.175
Calcite − LiNbO3     19.630     5.583      25.213
α-BBO − Calcite      75.890     38.870     114.761

L1 + L2 generates Δτ_s = 10.0 ps.
+ and − indicate aligned or crossed extraordinary axes, respectively.

the signs for the birefringent thermal-optic coefficients are the same for all
crystals. However, YVO4 is positive uniaxial and the rest are negative uniaxial.
Referring to (4.4.17), if the birefringence of a crystal pair has the same sign
the extraordinary axes must be crossed; otherwise they are aligned.
Table 4.7 tabulates the crystal lengths required for all six combinations of
crystal pairs such that the pair generates $\Delta\tau_s = 10$ ps. Clearly the YVO4 and
LiNbO3 combination yields the shortest total length.
The residual phase error is calculated from the quadratic deviation. Similar
to (4.4.13), the quadratic error is

$$\Delta\phi = \frac{\omega\,(\Delta T)^2}{2c}\left(\Delta n_1 L_1 K_1^{(2)} \pm \Delta n_2 L_2 K_2^{(2)}\right) \tag{4.4.20}$$

For the YVO4-LiNbO3 combination at f = 194.1 THz and $\Delta T = 37.5$ °C,
$\Delta\phi/2\pi \approx 0.026$. This is roughly a two order-of-magnitude decrease in temperature
dependence of birefringent phase as compared with either crystal separately.
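The residual can be estimated from (4.4.20) with the lengths of Table 4.7 and the quadratic birefringent coefficients of Table 4.6. The sketch below rests on two assumptions that are not stated explicitly in the text: the quadratic coefficients are taken in units of 1/K² and both are given the same (negative) sign. Only the magnitude of the result is meaningful under these assumptions.

```python
import math

C_MM_PER_S = 2.99792458e11  # speed of light in mm/s

def residual_phase_waves(f_hz, delta_t, dn1, l1_mm, k2_1, dn2, l2_mm, k2_2,
                         aligned=True):
    """Quadratic residual (4.4.20) of a crystal pair, in waves (units of 2*pi)."""
    sign = 1.0 if aligned else -1.0
    omega = 2.0 * math.pi * f_hz
    dphi = omega * delta_t**2 / (2.0 * C_MM_PER_S) * (
        dn1 * l1_mm * k2_1 + sign * dn2 * l2_mm * k2_2)
    return dphi / (2.0 * math.pi)

# YVO4 + LiNbO3 with lengths from Table 4.7 and quadratic coefficients from
# Table 4.6, taken here in 1/K^2 with inferred (assumed) negative signs.
waves = residual_phase_waves(194.1e12, 37.5,
                             0.2148, 14.801, -0.0224e-6,
                             -0.0822, 2.205, -0.7093e-6)
print(f"residual = {abs(waves):.3f} waves")
```

The magnitude comes out near 0.026 waves, matching the value quoted above.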

4.5 Compound Crystals For Off-Axis Delay


Applications such as birefringent filters and polarization-mode dispersion
(PMD) sources require the cascade of several crystals of equal length or having
integral multiples of a unit length. The Lyot filter as adapted to interleavers
is a classic example. But birefringent crystals are expensive and if the light
beam is to pass through a cascade of equal-length crystals it would be better
to fold the beam to pass through the same crystal several times. This has
been proposed for both interleavers [13] and PMD generators [15]. For either
component, after each pass of the crystal the polarization is rotated with a
waveplate; the delay (crystal)/rotation (waveplate) sequence can be designed
to make a birefringent filter.


There remains a problem, however: the beam must enter
and exit the crystal, or temperature-compensated crystal pair, normal to the
input face or else suffer double refraction. Double refraction is compounded
by every pass and results in polarization-dependent loss (PDL). This is illustrated
in Fig. 4.13(a). Here an off-normal beam is double-refracted by a first high-birefringent crystal. The two beams inside the crystal walk off from one another and emerge offset. After polarization rotation from the intermediate
waveplate, the two beams enter the second crystal and are each double refracted. Four beams emerge from the second crystal; however, the center two
will overlap if the two crystal lengths are identical. Since the displaced beams
will couple to a receiving collimator with different efficiencies, the concatenation generates PDL.
Double refraction for off-normal incidence onto a waveplate-cut crystal must be accepted because one polarization will see an effective index;
the other, the ordinary index. This effect is treated in §3.6.2. One solution
that fixes the walkoff problem but does not otherwise work is illustrated in
Fig. 4.13(b). Here two equal-length crystals of the same material are placed
with e-axes perpendicular to one another. The crystals are either both positive or both negative uniaxial. The double refraction of the first crystal is corrected
by the second crystal because extraordinary and ordinary rays are exchanged
with one another. However, as shown in Fig. 4.13(c), the exchange used to
cancel the net walkoff also exchanges the fast and slow axes; the delay imparted by the first crystal is cancelled by an equal and opposite delay from
the second. In principle there is no net effect.
Inspection of Fig. 3.13 on page 114 shows that there is a solution for
a birefringent delay having zero net walkoff with off-normal incidence. Any
solution focuses on the Poynting vector directions, not the k-vector directions.
One possible configuration is shown in Fig. 4.14(a): a positive uniaxial crystal
is followed by a negative uniaxial crystal oriented such that the extraordinary
axes are perpendicular [16]. In the first crystal the e-ray is refracted more
than the ordinary ray because the crystal is positive uniaxial, Fig. 4.14(b).
However, the e-Poynting vector is deflected by the refraction less than the
ordinary Poynting vector. That is, the o-Poynting vector lies between the
extraordinary Poynting and k-vectors. The e-Poynting vector splits from its k-vector because its linear polarization state is neither parallel nor perpendicular
to the e-axis of the first crystal.
Entrance into the second crystal exchanges the designations of the two rays.
Also, both polarization states are either parallel or perpendicular to the e-axis,
so there is no further splitting of the e-Poynting vector. Even though the o- and e-ray designations are flipped, the slow ray remains slow and the fast
ray remains fast because the second crystal is negative uniaxial. Moreover,
examination of the Poynting vectors shows that they will converge in the
second crystal. The length ratio of the first and second crystal can be designed
to impart a target delay and yield zero net walkoff.

[Figure 4.13: panels a), b), c) showing off-axis passage through two positive uniaxial (+uniax) crystals with an intermediate λ/2 waveplate, and the k-vector ray trace (k_e, k_o, fast and slow) along z.]

Fig. 4.13. Off-axis incidence angle. a) Off-axis passage through two delay crystals
with an intermediate waveplate that rotates polarization (as in a filter) creates
beam-splitting due to double refraction. The multiple output beams produce PDL.
b) A solution that has zero net walkoff but no accrued delay. Two like crystals
are placed perpendicular to one another. c) Ray trace of k-vectors. Fast exchanges
with slow. No net delay.

[Figure 4.14: panels a), b), c) showing a positive uniaxial (+uniax) crystal of length L1 followed by a negative uniaxial (−uniax) crystal of length L2; the ray trace of k_e, k_o, S_e, and S_o along z; and rays a, b, c with displacements d_a, d_b, d_c, where d_c = d_b at the output.]

Fig. 4.14. An off-axis zero net walkoff crystal pair that accrues delay. a) Two
different crystals with e-axes perpendicular; the first crystal is positive uniaxial and
the second negative uniaxial. b) Ray trace. The e-Poynting vector splits from its k-vector in the first crystal, undergoing less deflection than the o-Poynting vector but
at slower speed. In the second crystal e ↔ o but the slow path remains slow. c) Proper
length ratio L2/L1 achieves zero net walkoff.

The length ratio for a two-crystal solution is calculated by equating the
cumulative displacement of the two Poynting vectors. Figure 4.14(c) illustrates
an a-ray, a b-ray, and a c-ray. The a-ray is the k-vector of the e-ray in the first
crystal, the b-ray is the k- and Poynting vectors for the o-ray, and the c-ray
is the Poynting vector of the e-ray. The b- and c-rays are to converge.
For small inclination angles all refraction expressions are linearized to good
approximation. The displacements through the first crystal are

$$d_b(L_1) = \theta_b^{(1)} L_1$$

$$d_c(L_1) = \left(\theta_a^{(1)} + \rho\right) L_1$$

where $\theta_a^{(1)}$ and $\theta_b^{(1)}$ are the refraction angles and $\rho$ is the Poynting-vector
tilt angle within the first crystal. The cumulative displacements through the
second crystal are

$$d_b(L_1 + L_2) = d_b(L_1) + \theta_a^{(2)} L_2$$

$$d_c(L_1 + L_2) = d_c(L_1) + \theta_b^{(2)} L_2$$

where $\theta_a^{(2)}$ and $\theta_b^{(2)}$ are the refraction angles in the second crystal; because the ray designations are exchanged there, the b-ray refracts as the extraordinary ray and the c-ray as the ordinary ray. Zero net
walkoff is achieved for $d_b(L_1 + L_2) = d_c(L_1 + L_2)$. This condition gives

$$\frac{L_2}{L_1} = \frac{\theta_a^{(1)} + \rho - \theta_b^{(1)}}{\theta_a^{(2)} - \theta_b^{(2)}} \tag{4.5.1}$$

Taking the ambient index as one, the linearized refraction angles are

$$\theta_a^{(1)} = \theta_o/n_{e,1},\quad \theta_b^{(1)} = \theta_o/n_{o,1},\quad \theta_a^{(2)} = \theta_o/n_{e,2},\quad \text{and}\quad \theta_b^{(2)} = \theta_o/n_{o,2}$$

The Poynting-vector tilt angle $\rho$, that is, the angle at which the vector tilts
away from its corresponding k-vector, comes from linearization of (3.6.15) on
page 110 where $\theta \approx 90^\circ$. As the negative tilt has already been accounted for,
the linearized tilt angle is

$$\rho = \left((n_{e,1}/n_{o,1})^2 - 1\right)\frac{\theta_o}{n_{e,1}}$$

In terms of refractive indices, the length ratio is

$$\frac{L_2}{L_1} = \frac{n_{o,2}\,(n_{e,1}/n_{o,1} - 1)}{n_{o,1}\,(n_{o,2}/n_{e,2} - 1)} \tag{4.5.2}$$

This ratio is positive when the first crystal is positive uniaxial and the second
crystal is negative uniaxial. A YVO4 and α-BBO crystal pair will satisfy this equation. One can verify that for this length ratio, the change in net displacement
Fig. 4.15. Three-crystal solution: accrued delay with zero net walkoff with inclined incidence and first-order temperature compensation. a) Specific crystal design (YVO4, α-BBO, and LiNbO3).
b) Ray trace of Poynting vectors. The Poynting vector is split from the e-ray k-vector
in the first and third crystal.

for a change of incident angle is zero to first order. That is, net walkoff grows
as second order with change in incident angle, making the compound crystal
more tolerant to alignment.
An effective index for the crystal pair can be defined as $n_{\mathrm{eff}} = \theta_o/\theta_{\mathrm{eff}}$.
The effective refraction angle is the ratio of total displacement to length, or
$d_c = \theta_{\mathrm{eff}}(L_1 + L_2)$. Combining terms gives

$$ n_{\mathrm{eff}} = n_{o,2}\,\frac{L_2/L_1 + 1}{L_2/L_1 + n_{o,2}/n_{o,1}} \tag{4.5.3} $$
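
The zero-walkoff condition can be checked numerically. The following is a minimal Python sketch, not part of the original text: it evaluates the length ratio (4.5.2) and the linearized displacements of the b- and c-rays. The index values are placeholders chosen for illustration (they are not the entries of Table 4.6), the function names are hypothetical, and the sketch assumes, following the caption of Fig. 4.14, that the ray that is extraordinary in the first crystal is ordinary in the second.

```python
def length_ratio(n_o1, n_e1, n_o2, n_e2):
    """L2/L1 for zero net walkoff, following (4.5.2)."""
    return n_o2 * (n_e1 / n_o1 - 1.0) / (n_o1 * (n_o2 / n_e2 - 1.0))


def displacements(theta_o, L1, L2, n_o1, n_e1, n_o2, n_e2):
    """Linearized cumulative displacements (d_b, d_c) after both crystals."""
    # Poynting-vector tilt of the e-ray in crystal 1 (linearized).
    tilt = ((n_e1 / n_o1) ** 2 - 1.0) * theta_o / n_e1
    # Crystal 1: b is the o-ray, c is the e-ray Poynting vector.
    d_b = (theta_o / n_o1) * L1
    d_c = (theta_o / n_e1 + tilt) * L1
    # Crystal 2 (assumption: e and o exchange, no tilt in the plane of incidence).
    d_b += (theta_o / n_e2) * L2
    d_c += (theta_o / n_o2) * L2
    return d_b, d_c


# Placeholder indices: crystal 1 positive uniaxial, crystal 2 negative uniaxial.
N_O1, N_E1 = 1.95, 2.15
N_O2, N_E2 = 1.65, 1.53

ratio = length_ratio(N_O1, N_E1, N_O2, N_E2)
L1 = 10.0            # arbitrary length unit
L2 = ratio * L1

# Net walkoff d_c - d_b for several small incidence angles (radians).
walkoffs = []
for theta_o in (0.01, 0.02, 0.05):
    d_b, d_c = displacements(theta_o, L1, L2, N_O1, N_E1, N_O2, N_E2)
    walkoffs.append(d_c - d_b)
```

Because every term in the linearized model scales with the incidence angle, the net walkoff vanishes for all small angles once the ratio is set, which is the first-order alignment tolerance described above.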

The remaining problem is that the two crystals necessary to make the compound crystal are not necessarily temperature compensated. The one degree
of freedom available, the length ratio, was used to enforce zero net walkoff.
To make the compound crystal temperature insensitive as well, a third crystal
is necessary (Fig. 4.15(a)). Here YVO4, α-BBO, and LiNbO3 crystals stacked
so that the e-axis of the α-BBO crystal is perpendicular to the other two axes
give a solution.
A set of three linear equations can be solved to determine the three crystal
lengths such that there is zero net walkoff, temperature is compensated to first
order, and a target delay $\tau$ is achieved. In matrix form the equations are

$$ \begin{pmatrix} \Delta\theta_1 & \Delta\theta_2 & \Delta\theta_3 \\ \Delta n_1 K_1 & -\Delta n_2 K_2 & \Delta n_3 K_3 \\ \Delta n_1 & -\Delta n_2 & \Delta n_3 \end{pmatrix} \begin{pmatrix} L_1 \\ L_2 \\ L_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ c\tau \end{pmatrix} \tag{4.5.4} $$

where $K$ is the first-order temperature coefficient defined by (4.4.5a) (but for
the refractive index, not group index), and the angle differences are

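$$ \Delta\theta_1 = \theta_b^{(1)} - \left(\theta_a^{(1)} + \alpha^{(1)}\right) = \left[\frac{1}{n_{o,1}} - \frac{1}{n_{e,1}}\left(\frac{n_{e,1}}{n_{o,1}}\right)^2\right]\theta_o \tag{4.5.5a} $$
$$ \Delta\theta_2 = \theta_a^{(2)} - \theta_b^{(2)} = \left(\frac{1}{n_{e,2}} - \frac{1}{n_{o,2}}\right)\theta_o \tag{4.5.5b} $$
$$ \Delta\theta_3 = \theta_b^{(3)} - \left(\theta_a^{(3)} + \alpha^{(3)}\right) = \left[\frac{1}{n_{o,3}} - \frac{1}{n_{e,3}}\left(\frac{n_{e,3}}{n_{o,3}}\right)^2\right]\theta_o \tag{4.5.5c} $$

Here $\theta_a^{(i)} = \theta_o/n_{e,i}$, $\theta_b^{(i)} = \theta_o/n_{o,i}$, and $\alpha^{(i)}$ is the linearized Poynting-vector tilt angle in crystal $i$.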

The negative sign in the second column of (4.5.4) accounts for the perpendicular orientation of the α-BBO e-axis with respect to the other crystals.
Group index and thermal coefficients for YVO4, α-BBO, and LiNbO3 are
found in Table 4.6. The refractive indices were not measured for this table,
but it was observed that the frequency dependence of the index was below
an observable level. So the refractive indices are approximated by the group
indices. Using the tabulated values, a YVO4 crystal of length $L_{\mathrm{yvo}} = 9.49$ mm,
an α-BBO length of $L_{\mathrm{abbo}} = 9.37$ mm, and a LiNbO3 length of $L_{\mathrm{ln}} = 2.84$ mm
satisfy (4.5.4). The ray-trace of the Poynting vectors is shown in Fig. 4.15(b).
The practical drawback of temperature-compensated birefringent crystal
sets, either on-axis or off-axis, is the widely disparate, highly anisotropic thermal expansion coefficients. This is seen in Tables 4.2 and 4.3. Coupling these
materials with glasses and packaging alloys makes for a complex thermal design which stretches the ability to achieve athermal birefringent phase over
a 70 °C operating range or more. The on-axis athermal crystal set of YVO4
and LiNbO3 has been used quite successfully, however, in the laboratory environment.

4.6 Polarization Retarders


Polarization retarders are essential parts used to build birefringent components such as isolators, circulators, interleavers, and polarization mode dispersion generators. Broadly speaking, a retarder is any optical element that
transforms the state of polarization without loss. Birefringent crystals cut as
waveplates, electro-optic materials, optically active materials, Faraday media,
and total-internal-reflection rhombs can all transform polarization. The distinguishing attribute is that two polarization components slip in phase within
the medium, transforming the polarization state.
As a comment on nomenclature, retardation and birefringent phase represent the same physical effect. Both characterize a phase slip between two
co-propagating waves having different wavelengths. The nuance is that retardation denotes the fractional phase slip when the overall slip is zero, one, or
a few birefringent beats. As used in this text, birefringent phase is the total
phase slip, fractional and integral, between two co-propagating plane waves
of dissimilar wavelength.


The preponderance of this section analyzes birefringent waveplate retarders and select combinations. Highly multi-order waveplates and the thermal stabilization of birefringent phase were covered in §4.4. Total internal
reflection retarders are covered last.
4.6.1 Half-Wave and Quarter-Wave Waveplates
The calculus of polarization transformation is extensively treated in Chapter 2 and the reader is particularly directed to §2.6. These tools are used
below to write by inspection the transformation matrix operators and vector
expressions for quarter- and half-wave waveplates.
A waveplate is made of birefringent material such as solid crystals, polyimide films, or liquid crystals, to cite a few. The waveplate technologies addressed in this section are solid crystals, although the formalism for polarization
transformation is applicable to any technology. A waveplate made from birefringent crystal, typically a uniaxial crystal, is cut so that the extraordinary
axis lies in the crystal face. Drawing from §3.6.4, the birefringent phase of a
waveplate is defined by

$$ \varphi = \frac{\omega}{c}\,(n_e - n_o)\,L \tag{4.6.1} $$

where $L$ is the length of the crystal. Half- and quarter-wave waveplates generate the birefringent phases

$$ \varphi_{\lambda/2} = (2n + 1)\,\pi \tag{4.6.2a} $$
$$ \varphi_{\lambda/4} = \left(2n + \tfrac{1}{2}\right)\pi \tag{4.6.2b} $$

respectively, where the order $n$ is a non-negative integer. The length
of an arbitrary order half-wave waveplate is

$$ L_{\lambda/2} = \frac{2n + 1}{2}\,\frac{\lambda}{\Delta n} \tag{4.6.3} $$

where $\lambda/\Delta n$ is recognized as the birefringent beat length in the crystal.
Clearly, as the order decreases so does the plate length. A true zero-order
($n = 0$) half-wave plate fabricated from crystalline quartz for $\lambda = 1545$ nm is
about 92 µm thick, which requires special care to fabricate. The extra cost
associated with the true zero-order plate is why low-order waveplates are
attractive for many non-telecommunications applications. The polarization
transformation of a waveplate of any retardation is given in Jones space
by (2.6.19) on page 68 and in Stokes space by (2.6.25) on page 68, where the
birefringent axis post-fix operators are resolved into matrix form via (2.6.29)
on page 70.
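
As a quick arithmetic check of (4.6.3), the Python sketch below computes plate thickness from the beat length. It is an illustration only; the function name is hypothetical, and the quartz birefringence of 0.0084 is the value this chapter quotes for 1.55 µm, applied here to 1545 nm as an approximation.

```python
def waveplate_length_um(wavelength_um, birefringence, order=0, fraction=0.5):
    """Plate length in micrometers for (order + fraction) birefringent beats.

    fraction = 0.5 gives a half-wave plate, phase (2n + 1) * pi;
    fraction = 0.25 gives a quarter-wave plate, phase (2n + 1/2) * pi.
    """
    beat_length_um = wavelength_um / birefringence
    return (order + fraction) * beat_length_um


QUARTZ_DN = 0.0084  # birefringence of crystalline quartz near 1.55 um

half_zero = waveplate_length_um(1.545, QUARTZ_DN, order=0, fraction=0.5)
quarter_zero = waveplate_length_um(1.545, QUARTZ_DN, order=0, fraction=0.25)
half_fourth = waveplate_length_um(1.55, QUARTZ_DN, order=4, fraction=0.5)
```

Under these assumptions the zero-order half-wave plate comes out near 92 µm and the fourth-order half-wave plate near 0.83 mm.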
Consider a waveplate with extraordinary axis rotated by $\theta$ to the horizontal
in the plane perpendicular to propagation direction. The birefringent vector $\hat r$
in Stokes space is

$$ \hat r(\theta) = \begin{pmatrix} \cos 2\theta \\ \sin 2\theta \\ 0 \end{pmatrix} \tag{4.6.4} $$

Fig. 4.16. Mirror image produced by a perfect half-wave waveplate. a) Incident
linear polarization vector is mirror imaged about the extraordinary axis, producing
a vector with the opposite tilt. b) Mirror image of a linear input state inclined to
the extraordinary axis by $\alpha$ appears as a rotation by $2\alpha$.

For a half-wave waveplate, the retardation is $\varphi = \pi$ and the Jones matrix
operator is

$$ U_{\lambda/2}(\theta) = -j \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix} \tag{4.6.5} $$

The $-1$ coefficient to the second diagonal term indicates a mirror image. The
equivalent Stokes operator in vector form is

$$ R_{\lambda/2} = 2(\hat r \hat r\,\cdot) - I \tag{4.6.6} $$

When resolved onto the basis implicit in (4.6.4), the Stokes matrix operator
is

$$ R_{\lambda/2}(\theta) = \begin{pmatrix} \cos 4\theta & \sin 4\theta & 0 \\ \sin 4\theta & -\cos 4\theta & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{4.6.7} $$

Again the mirror image is apparent. A perfect half-wave waveplate generates
the mirror image

$$ (S_1, S_2, S_3) \rightarrow (S_1, -S_2, -S_3) \tag{4.6.8} $$

along with rotation by $4\theta$ about $S_3$. The Stokes matrix operator imparts a
fourfold multiple of the physical waveplate angle in its arguments. This is
accounted for by first considering the twofold multiple that results by going from
Jones to Stokes space, and then the twofold multiple generated by the mirror image of the input state about the birefringent axis. Figure 4.16 illustrates the
mirror image effect of a perfect half-wave waveplate and its apparent rotation
about the birefringent axis. Figure 4.17(a) illustrates the Stokes space view
of an $n = 0$ half-wave waveplate transformation. Polarization state a rotates
to state b by precession about $\hat r$ by $\varphi = \pi$. Note that when the input state is
linear (lies on the equator) the output is its orthogonal state.

Fig. 4.17. Half-wave and quarter-wave waveplate transformations in Stokes space.
a) Stokes picture of half-wave transformation. Birefringent axis $\hat r$ lies on the equator
and is rotated by $2\theta$ from $S_1$. The birefringent axis corresponds to the extraordinary
axis of the crystal. Input state a is transformed to b by $\varphi = \pi$ about $\hat r$. b) Stokes
picture of quarter-wave transformation. Input state a rotates to b through $\varphi = \pi/2$.

For a quarter-wave waveplate rotated by $\theta$, the retardation is $\varphi = \pi/2$ and
the Jones matrix operator is

$$ U_{\lambda/4}(\theta) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 - j\cos 2\theta & -j\sin 2\theta \\ -j\sin 2\theta & 1 + j\cos 2\theta \end{pmatrix} \tag{4.6.9} $$

The equivalent Stokes operator in vector form is

$$ R_{\lambda/4} = (\hat r \hat r\,\cdot) + (\hat r\,\times) \tag{4.6.10} $$

When resolved onto the basis implicit in (4.6.4), the Stokes matrix operator
is

$$ R_{\lambda/4}(\theta) = \begin{pmatrix} \cos^2 2\theta & \sin 2\theta \cos 2\theta & \sin 2\theta \\ \sin 2\theta \cos 2\theta & \sin^2 2\theta & -\cos 2\theta \\ -\sin 2\theta & \cos 2\theta & 0 \end{pmatrix} \tag{4.6.11} $$

Since no mirror image is derived from the quarter-wave plate, the arguments
in $R_{\lambda/4}$ retain their $2\theta$ multiple. The Stokes operator matrix is more complex
as a result. Figure 4.17(b) illustrates the Stokes space view of an $n = 0$ quarter-wave waveplate transformation. Polarization state a rotates to state b via
precession about $\hat r$ by $\varphi = \pi/2$.
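
The closed-form Stokes matrices for the half-wave and quarter-wave plates can be compared against a general rotation about the birefringent axis. The Python sketch below is illustrative and makes one assumption about sign convention: the retarder is modeled as a right-handed rotation by the retardation about $\hat r = (\cos 2\theta, \sin 2\theta, 0)$, written as $\hat r\hat r^T + \sin\varphi\,[\hat r\times] - \cos\varphi\,[\hat r\times]^2$. All function names are hypothetical.

```python
import math


def cross_matrix(r):
    """Matrix form of the operator (r x)."""
    return [[0.0, -r[2], r[1]], [r[2], 0.0, -r[0]], [-r[1], r[0], 0.0]]


def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def stokes_retarder(theta, phi):
    """Stokes rotation for a waveplate at physical angle theta, retardation phi."""
    r = (math.cos(2 * theta), math.sin(2 * theta), 0.0)
    k = cross_matrix(r)
    k2 = matmul3(k, k)
    return [[r[i] * r[j] + math.sin(phi) * k[i][j] - math.cos(phi) * k2[i][j]
             for j in range(3)] for i in range(3)]


def half_wave_closed_form(theta):
    c4, s4 = math.cos(4 * theta), math.sin(4 * theta)
    return [[c4, s4, 0.0], [s4, -c4, 0.0], [0.0, 0.0, -1.0]]


def quarter_wave_closed_form(theta):
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [[c2 * c2, s2 * c2, s2], [s2 * c2, s2 * s2, -c2], [-s2, c2, 0.0]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))
```

For any plate angle, `stokes_retarder(theta, math.pi)` should agree with `half_wave_closed_form(theta)` and `stokes_retarder(theta, math.pi / 2)` with `quarter_wave_closed_form(theta)`, which makes the fourfold versus twofold angle multiples easy to inspect numerically.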
Frequency Dependence
Birefringent phase (4.6.1) is directly proportional to optical frequency $\omega$. The
chromatic dependence is

$$ \frac{d\varphi}{d\omega} = \varphi \left( \frac{1}{\omega} + \frac{1}{\Delta n_g} \frac{d(\Delta n_g)}{d\omega} \right) \tag{4.6.12} $$

Even excluding material dispersion, the chromatic dependence depends directly on the target value of $\varphi$:

$$ \Delta\varphi = \varphi\,\frac{\Delta\omega}{\omega} \tag{4.6.13} $$

Waveplate order plays a role in the chromatic dependence. There is a
direct tradeoff between the ease of fabrication versus the birefringent phase
variation. For example, the frequency-dependent birefringent phase change for
an $n$th-order half-wave waveplate is

$$ \Delta\varphi_{\lambda/2} = (2n + 1)\,\pi\,\frac{\Delta\omega}{\omega} \tag{4.6.14} $$

The change is a function of the retardation and the spectral bandwidth. Consider the spectral coverage for the C-band (4.1.5): $S_C \approx 2\%$. The change of
birefringent phase across this band for three different waveplate orders is

$$ \Delta\varphi_{\lambda/2} = \begin{cases} \pm 1.8^\circ, & n = 0 \\ \pm 5.4^\circ, & n = 1 \\ \pm 37.8^\circ, & n = 10 \end{cases} \tag{4.6.15} $$

The change increment from zero order to first order alone imparts a threefold
increase in chromatic dependence. Conversely, the bandwidth for a fixed
tolerance suffers a threefold decrease.
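
The scaling in (4.6.14) is simple to tabulate. The Python sketch below is illustrative; the function name is hypothetical, material dispersion is ignored, and the 1% fractional deviation is an assumption: it takes the deviation as half of the 2% C-band coverage, measured from band center, which is the reading that reproduces the values listed in (4.6.15).

```python
def half_wave_phase_change_deg(order, fractional_deviation):
    """First-order phase change, in degrees, of an nth-order half-wave plate.

    Implements (2n + 1) * 180 deg * (delta_omega / omega); dispersion ignored.
    """
    return (2 * order + 1) * 180.0 * fractional_deviation


# Assumption: deviation of 1% from band center (half of the 2% coverage).
HALF_BAND = 0.01

changes = {n: half_wave_phase_change_deg(n, HALF_BAND) for n in (0, 1, 10)}
```

The dictionary `changes` maps each order to its phase excursion, making the threefold step from zero order to first order explicit.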
Depending on design requirements, waveplates in a component may have
to be eliminated where possible to broaden the spectrum over which the component specifications are held. For example, in a circulator the extraordinary
axes of the birefringent prisms can be cut in non-standard ways to align to the
polarization axes rather than have a waveplate make the rotation. For other
components where a waveplate is absolutely required, achromats made from
waveplate combinations can in some cases be used.
4.6.2 Birefringent Waveplate Technologies
Fabrication of a waveplate requires high accuracy between the target retardation and the actual retardation of the part. Low-order waveplates are single- or
double-side polished and the retardation is directly measured by interrupting
the process. The waveplate is compared to a reference standard at the target
wavelength and polishing continues until the waveplate is within tolerance.
So that there is minimum retardation change for each polish step, a very
low birefringent material is used, such as crystalline quartz. Quartz has a birefringence of $\Delta n = 0.0084$ at 1.55 µm and associated birefringent beat length
of 184 µm. A typical quartz waveplate is specified to within $\lambda/500$, which
corresponds to a 0.2 µm tolerance. Only with optical feedback is this tolerance possible.

Fig. 4.18. Illustrations of solid-crystal waveplate technologies. a) Multi-order waveplate, single block of, e.g., quartz. b) True zero-order waveplate. Minimum thickness
given the material birefringence to achieve required retardance. c) Compound zero-order waveplate makes for easier fabrication but poorer extinction ratio and angular aperture. d) Optically contacted true zero-order waveplate (quartz on a BK7 host), very useful because
quarter-wave waveplates of quartz, a low-birefringent material, are 45 µm thick at
zero order. e) Zero-order achromat, such as MgF2/crystalline quartz.
Figure 4.18 illustrates five common types of waveplates. The easiest part to
make is a multi-order waveplate (Fig. 4.18(a)). This waveplate has a thickness
of $t = m\lambda/\Delta n + \Delta t$, where $m$ is the waveplate order, $\Delta n$ is the birefringence,
and $\Delta t$ is the minimum thickness required to achieve the target retardation.
A half-wave waveplate at fourth order made with quartz is about 0.83 mm
thick. Such a part is certainly easy to handle. However, as discussed in the
previous section, the bandwidth of such a high-order plate is unsuitably low.
A true zero-order waveplate is the thinnest plate possible that meets the
target retardation (Fig. 4.18(b)). The thickness of a true zero-order half-wave
and quarter-wave waveplate made in crystalline quartz is $L_{\lambda/2} \approx 92$ µm and
$L_{\lambda/4} \approx 46$ µm at 1.55 µm, respectively. It is hard to fixture and hold such thin
plates, which until recently has made these plates expensive. However, optical
contacting of quartz blanks to an optically flat fused silica block and one-side
polishing of the blank eases the fixturing problem. The finished waveplate is
removed by heating the block. Such waveplates can be very high quality and
exhibit extinction ratios better than 55 dB.
A classic solution to the cost required to make a true zero-order plate is to
make a compound zero-order plate (Fig. 4.18(c)). Here two thick quartz blocks
are cemented or optically contacted with the extraordinary axes crossed. The
difference in thickness determines the retardation. The two parts can be made
arbitrarily thick as long as the difference is maintained. However, the compound zero-order waveplate is limited by the accuracy with which the birefringent axes can be crossed. Typical extinction ratios are no better than 27 dB.

The retardation produced by a compound zero-order plate is very sensitive
to misalignment to the beam. The plate is zero-order not just at the center
wavelength but when perpendicular to the beam. The angular aperture of the
compound plate must be considered, and is much smaller than that of a true
zero-order waveplate. Overlooking the effect of the angular aperture will lead
to erroneous experimental results. Lastly, if the waveplate is to be rotated, the
thickness of the compound waveplate will displace the beam if there is any
wobble in the rotation. Programmable PMD sources are an example where
the waveplates are mechanically rotated.
A very good option which is becoming more common today is to fabricate a true zero-order waveplate optically contacted to a permanent host such
as BK7 (Fig. 4.18(d)). Typically a larger part is made and several waveplates
with hosts are separated by dicing. Optical contacting is the process of bonding two highly polished glass surfaces together using van der Waals force and
is the preferred method, as cement bonding causes undesired reflections at the
interface. To strengthen the contact, the parts are annealed to make them inseparable. The hosted true zero-order waveplate allows the part to be handled
and fixed into location with ease.
Finally, a bi-crystalline waveplate achromat can be made with materials
having complementary wavelength dispersions. The MgF2/crystalline quartz
achromat is one example (Fig. 4.18(e)). As both crystalline quartz and MgF2
are positive uniaxial crystals, the birefringent axes are necessarily crossed.
The difference in thickness-index product sets the net retardation [19].
4.6.3 Waveplate Combinations
Waveplate combinations can be used to create a variety of effects such as
waveplate achromats or polarization controllers. Below are three important
examples of waveplate combinations: the Evans phase shifter, the Pancharatnam achromat, and the Shirasaki achromat. Each combination uses multiple
waveplates of various retardations and orientations to achieve a particular
result. Not presented below but of some importance is the Koester stacked
half-wave achromat [34]. Dynamic polarization control in elementary form is
left to §4.6.4.
The Evans Phase Shifter
The Evans phase shifter [20] mechanically imparts a precession on an input
polarization state, about an associated principal axis, that is equivalent to
a change in frequency. Evans originally applied the phase shifter to tune a
Lyot filter designed for astronomical observations. Other demonstrations have
shown further application to astronomy [6], to interleavers [8], and to PMD
sources [17].
The phase shifter, illustrated in Fig. 4.19, is a three-element combination
of quarter-, half-, and quarter-wave waveplates. The equivalent birefringent
axis of the combination is called the principal axis. The phase shifter can
be located adjacent to a highly multi-order delay stage (such as the YVO4-LiNbO3 stage, as illustrated) and will tune the birefringent phase of the stage
when the principal axis of the phase shifter is aligned to the birefringent axis
of the delay.

Fig. 4.19. Evans phase shifter. Quarter-, half-, quarter-wave waveplate combination, with outer plates fixed and center plate rotatable, tunes the birefringent phase
of the adjacent principal waveplates. The quarter-wave plates are fixed at +45°
with respect to the principal waveplate. In Stokes space: a) locus of output SOP
from principal plates over frequency; open circle denotes one frequency. b) Quarter-wave transformation to lower pole. c) Half-wave transformation to upper pole and
mirror image about birefringent axis. d) Quarter-wave transformation back to principal waveplate axis. The change in open-circle position is equivalent to a frequency
change. The null position of the tuning plate is at 45° with respect to the principal
axis.
In the phase shifter, the two quarter-wave waveplates are aligned and
their axes are further rotated by 45° with respect to the principal axis. The
half-wave waveplate, called the tuning plate and located between the two
quarter-wave waveplates, controls the birefringent phase of the cascade. The
null position of the tuning plate is at 45° with respect to the principal axis.
A rotation of the tuning plate is tantamount to a shift of the birefringent
phase. Jones calculus shows that the quarter-, half-, quarter-waveplate cascade
combines as

$$ \frac{1}{2} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix} \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix} \begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix} = \begin{pmatrix} e^{-j2\theta} & 0 \\ 0 & -e^{j2\theta} \end{pmatrix} \tag{4.6.16} $$

where the first and third Jones matrices describe the quarter-wave waveplates
at +45° and the second Jones matrix describes the half-wave waveplate rotated by physical angle $\theta$. The resultant matrix shows a net birefringent
phase plus a mirror image taken about the principal axis. The total birefringent phase of the delay and phase shifter is

$$ \varphi(\omega, \theta) = \omega\tau_s + (4\theta - \pi) \tag{4.6.17} $$

The action of the Evans phase shifter in Stokes space is illustrated in Fig. 4.19.
While the phase shifter can endlessly and continuously tune the birefringent
phase of the cascade, there is no significant delay through the shifter. Endless
rotation creates endless frequency shift (of the periodic spectrum) but without
change in free-spectral range.
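
The quarter-, half-, quarter-wave cascade can be multiplied out numerically. The Python sketch below is an illustration under stated assumptions: the quarter-wave plates at +45° are taken as $(1/\sqrt{2})[[1, -j], [-j, 1]]$, the global $-j$ factor of the half-wave plate is dropped, and a retarder is read in the convention diag$(e^{-j\varphi/2}, e^{j\varphi/2})$, so the tuned phase is the phase of the second diagonal element relative to the first. Function names are hypothetical.

```python
import cmath
import math


def matmul2(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def quarter_wave_at_45():
    """Quarter-wave plate with extraordinary axis at +45 degrees."""
    s = 1.0 / math.sqrt(2.0)
    return [[s, -1j * s], [-1j * s, s]]


def half_wave_core(theta):
    """Half-wave plate at physical angle theta, global phase omitted."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]


def evans_cascade(theta):
    q = quarter_wave_at_45()
    return matmul2(q, matmul2(half_wave_core(theta), q))


def tuning_phase(theta):
    """Birefringent phase added by the shifter, 4*theta - pi."""
    return 4.0 * theta - math.pi
```

Evaluating `evans_cascade(theta)` for a few angles gives a diagonal matrix whose relative phase follows `tuning_phase(theta)` modulo $2\pi$, and the added phase vanishes at the 45° null position.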
The Pancharatnam Achromat
Pancharatnam [5, 39] determined the conditions under which three waveplates
can be combined so that the equivalent birefringent axis lies in the equatorial plane and the equivalent retardation is a prescribed value. His work can
be reduced to the Evans phase shifter, which he briefly described and seems
to have developed independently, but is more generally used to build achromatic retardation plates such as quarter-wave achromats. The Pancharatnam
combination has a direct analogue in the cascaded Mach-Zehnder interferometers used in integrated optics to form achromatic waveguide-waveguide couplers [10]. The
Pancharatnam and waveguide achromats were invented independently.
The Pancharatnam achromat is constructed with three waveplates, a first
and last waveplate having equal retardation and extraordinary axis orientation, and an intermediate waveplate having a possibly different retardation
and orientation. In this case, any choice of retardation and orientation values
keeps the principal axis of the combination on the equator. There are two
steps for the achromatic calculation: a first step derives the retardation and
principal axis of the combination, and a second calculation determines the
achromatic behavior.
As a departure from Pancharatnam's derivation, spin-vector calculus in
Jones form is used here to determine the governing equations. For first and
third waveplates having orientation $\hat r_1$ and retardation $\varphi_1$, and a second waveplate having orientation $\hat r_2$ and retardation $\varphi_2$, the Jones operators are

$$ U_{1,2} = \cos(\varphi_{1,2}/2)\,I - j\,(\hat r_{1,2} \cdot \vec\sigma)\,\sin(\varphi_{1,2}/2) \tag{4.6.18} $$

As an aid to resolve the following operator product, the vector $\hat r_2$ is projected
onto $\hat r_1$ and an orthogonal axis $\hat r_\perp$ as

$$ \hat r_2 = \hat r_1(\hat r_2 \cdot \hat r_1) + \hat r_\perp(\hat r_2 \cdot \hat r_\perp) \tag{4.6.19} $$

and, for the following, $\hat r_2 \cdot \hat r_1 = \cos 2\theta_{21}$, as is customary. Using the spin-vector
identities in §2.5.4, the waveplate combination is written

$$ \begin{aligned} U_1 U_2 U_1 = {} & \cos\varphi_1 \cos(\varphi_2/2) - \cos 2\theta_{21} \sin\varphi_1 \sin(\varphi_2/2) \\ & - j\,(\hat r_1 \cdot \vec\sigma)\,\cos 2\theta_{21} \cos\varphi_1 \sin(\varphi_2/2) \\ & - j\,(\hat r_1 \cdot \vec\sigma)\,\sin\varphi_1 \cos(\varphi_2/2) - j\,(\hat r_2 \cdot \hat r_\perp)\,(\hat r_\perp \cdot \vec\sigma)\,\sin(\varphi_2/2) \end{aligned} \tag{4.6.20} $$

Notice that there are no $\hat s_3$ components in (4.6.20); the principal axis of the
combination lies in the equatorial plane.

Fig. 4.20. Pancharatnam waveplate combination equivalent to single principal
waveplate. a) Polarization evolution comparison, launched along eigen-axis of principal waveplate. Through combination, polarization first rotates along (a) about $\hat r_1$,
then along (b) about $\hat r_2$, then returning along (c) about $\hat r_1$. b) Launch state along
eigen-axis of $\hat r_1$. Through principal waveplate polarization rotates along (d) about $\hat r_p$.
Through combination, polarization pirouettes about $\hat r_1$, rotates along (b) about $\hat r_2$,
and rotates along (c) about $\hat r_1$. c) Equivalence between combination and principal
waveplate; first and third waveplates of combination have same retardation and
orientation.
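The waveplate combination can be identified with a single principal waveplate $U_p$,

$$ U_p = \cos(\varphi_p/2)\,I - j\,(\hat r_p \cdot \vec\sigma)\,\sin(\varphi_p/2) \tag{4.6.21} $$

Making identification with (4.6.20), the principal retardation $\varphi_p$ is

$$ \cos(\varphi_p/2) = \cos\varphi_1 \cos(\varphi_2/2) - \cos 2\theta_{21} \sin\varphi_1 \sin(\varphi_2/2) \tag{4.6.22} $$

and the principal axis $\theta_p$ is

$$ \cot(2\theta_p) = \csc(2\theta_{21}) \left( \sin\varphi_1 \cot(\varphi_2/2) + \cos 2\theta_{21} \cos\varphi_1 \right) \tag{4.6.23} $$

where, in the latter case, the arc cotangent between the orthogonal $(\hat r_1 \cdot \vec\sigma)$
and $(\hat r_\perp \cdot \vec\sigma)$ axes was taken. Equations (4.6.22)-(4.6.23) are the two main results
first derived by Pancharatnam.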
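
The identification of the three-plate product with a single principal waveplate can be verified by direct Jones multiplication. The Python sketch below is illustrative, with hypothetical function names. It assumes the outer plates lie along the horizontal ($\hat r_1$ along $S_1$), so that $\theta_{21}$ and $\theta_p$ are physical angles measured from the outer-plate axis, and it takes the branch $0 \le \varphi_p \le 2\pi$ so that $\sin(\varphi_p/2)$ is non-negative.

```python
import math


def waveplate(theta, phi):
    """Jones matrix cos(phi/2) I - j (r . sigma) sin(phi/2), axis at angle theta."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [[c - 1j * s * c2, -1j * s * s2], [-1j * s * s2, c + 1j * s * c2]]


def matmul2(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def principal_waveplate(phi1, phi2, theta21):
    """Principal retardation and axis angle from (4.6.22) and (4.6.23)."""
    cos_half = (math.cos(phi1) * math.cos(phi2 / 2)
                - math.cos(2 * theta21) * math.sin(phi1) * math.sin(phi2 / 2))
    cos_half = max(-1.0, min(1.0, cos_half))      # guard against round-off
    phi_p = 2.0 * math.acos(cos_half)
    # Components of the principal axis along r1 and along the orthogonal axis.
    along = (math.sin(phi1) * math.cos(phi2 / 2)
             + math.cos(2 * theta21) * math.cos(phi1) * math.sin(phi2 / 2))
    across = math.sin(2 * theta21) * math.sin(phi2 / 2)
    theta_p = 0.5 * math.atan2(across, along)     # cot(2 theta_p) = along / across
    return phi_p, theta_p


def combination(phi1, phi2, theta21):
    """Product U1 U2 U1 with the outer plates at angle zero."""
    u1 = waveplate(0.0, phi1)
    u2 = waveplate(theta21, phi2)
    return matmul2(u1, matmul2(u2, u1))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

For any parameter set, `combination(phi1, phi2, theta21)` should coincide with `waveplate(theta_p, phi_p)` built from the values returned by `principal_waveplate`.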
The equivalence between a single waveplate and a Pancharatnam combination is illustrated in Fig. 4.20. There the polarization transformation through
a principal waveplate is compared with that through an equivalent combination of three plates, where the
center plate is half-wave. Note the distinctly different contours
that are traced and yet the output states are the same in either case.

Fig. 4.21. Realization of a Pancharatnam achromat: $\varphi_p = \pi/2$, $\varphi_1 = 116.2^\circ$,
$\varphi_2 = \pi$, $\theta_{21} = 69.1^\circ$ physical angle, $\theta_p = 30.3^\circ$ physical angle. a) Comparison of single λ/4 waveplate (principal axis dotted) and achromatic waveplate (extraordinary
axes solid) over detuning range $\delta = \pm 25\%$, projected on Stokes space in plan view.
Achromat shows tighter progression than single plate. b) Different view of same, input state rotated to top pole. c) Comparison of transmitted power through polarizer
oriented along $\hat s_1$ for single plate and achromat (power transmitted versus frequency detuning).
The Pancharatnam combination supports enough degrees of freedom to
implement a waveplate combination that is less chromatically dependent than
a single waveplate. This is called the Pancharatnam achromat. Consider a
frequency deviation such that

$$ \omega_\pm = \omega\,(1 \pm \delta) \tag{4.6.24} $$

Quantities evaluated at $\omega_\pm$ are written with a $\pm$ superscript until later expansion. The equations
that define the solution require the same principal retardation and axis at $\omega_\pm$:

$$ \cos(\varphi_p/2) = \cos\varphi_1^+ \cos(\varphi_2^+/2) - \sin\varphi_1^+ \sin(\varphi_2^+/2) \cos 2\theta_{21} \tag{4.6.25a} $$
$$ \cos(\varphi_p/2) = \cos\varphi_1^- \cos(\varphi_2^-/2) - \sin\varphi_1^- \sin(\varphi_2^-/2) \cos 2\theta_{21} \tag{4.6.25b} $$
$$ \sin\varphi_1^+ \cot(\varphi_2^+/2) + \cos\varphi_1^+ \cos 2\theta_{21} = \sin\varphi_1^- \cot(\varphi_2^-/2) + \cos\varphi_1^- \cos 2\theta_{21} \tag{4.6.25c} $$

After some cumbersome manipulations and setting the center waveplate to
half-wave retardation $\varphi_2 = \pi$, the retardation of the outer two plates satisfies
the equation

$$ \sin(\delta\varphi_1) = \frac{\sin(\delta\pi/2)}{\cos(\varphi_p/2)}\,\sin\varphi_1 \tag{4.6.26} $$

The inclination of the center waveplate with respect to the outer plates is

$$ \cos 2\theta_{21} = -\frac{\tan(\delta\pi/2)}{\tan(\delta\varphi_1)} \tag{4.6.27} $$

Lastly, the inclination of the principal axis is calculated from

$$ \sin 2\theta_{21} \cot 2\theta_p = -\sin\big[(1 + \delta)\varphi_1\big] \tan(\delta\pi/2) + \cos\big[(1 + \delta)\varphi_1\big] \cos 2\theta_{21} \tag{4.6.28} $$

Inspection of (4.6.26) shows that the possible achromatic bandwidth depends
on the principal retardation. A half-wave retardation cannot be achromatic
while a quarter-wave retardation exhibits the broadest possible range. For this
reason the Pancharatnam achromat is typically quarter-wave. Note, however,
that (4.6.26)-(4.6.28) are derived for $\varphi_2 = \pi$; relaxation of this parameter allows
more complexity and, in particular, an achromatic half-wave combination.
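
The three design equations can be solved numerically for a given principal retardation and detuning. The Python sketch below is a minimal illustration under stated assumptions: the center plate is half-wave, (4.6.26) is read as $\sin(\delta\varphi_1) = [\sin(\delta\pi/2)/\cos(\varphi_p/2)]\sin\varphi_1$, (4.6.27) carries a leading minus sign, and the root for $\varphi_1$ is bracketed between 1 and 3 rad (a bracket chosen for the quarter-wave, 25% case, not a general rule). The function name is hypothetical.

```python
import math


def design_achromat(phi_p, delta, lo=1.0, hi=3.0):
    """Return (phi1, theta21, theta_p) in radians for a half-wave center plate."""
    k = math.sin(delta * math.pi / 2) / math.cos(phi_p / 2)

    def f(phi1):
        return math.sin(delta * phi1) - k * math.sin(phi1)

    if f(lo) >= 0.0 or f(hi) <= 0.0:
        raise ValueError("bracket does not enclose a sign change")
    for _ in range(200):                      # bisection on (4.6.26)
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    phi1 = 0.5 * (lo + hi)

    # (4.6.27): inclination of the center plate to the outer plates.
    cos_2theta21 = -math.tan(delta * math.pi / 2) / math.tan(delta * phi1)
    two_theta21 = math.acos(cos_2theta21)

    # (4.6.28): inclination of the principal axis.
    rhs = (-math.sin((1 + delta) * phi1) * math.tan(delta * math.pi / 2)
           + math.cos((1 + delta) * phi1) * cos_2theta21)
    two_theta_p = math.atan2(math.sin(two_theta21), rhs)
    return phi1, two_theta21 / 2, two_theta_p / 2


phi1, theta21, theta_p = design_achromat(math.pi / 2, 0.25)
phi1_deg = math.degrees(phi1)
theta21_deg = math.degrees(theta21)
theta_p_deg = math.degrees(theta_p)
```

For a quarter-wave principal retardation and 25% detuning, the result can be compared with the realization quoted in the caption of Fig. 4.21.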
Figure 4.21(a,b) shows a Stokes-space comparison between a quarter-wave
waveplate and an achromat designed to cover a detuning of $\delta = \pm 25\%$. As illustrated, a polarization state on the equator transformed through the achromat traces a tight loop about $\hat s_3$. In contrast, the same state through the
quarter-wave waveplate traces a unidirectional arc through $\hat s_3$ as is expected
from simple precession. Figure 4.21(c) shows the output states as transmitted
through a polarizer. Again the bandwidth of the achromat is substantially
broader than the single waveplate.
The Shirasaki Achromat
The Shirasaki achromat [55] compensates to first order the frequency dependence of a Faraday rotator (or optically active) waveplate for a particular
input state of polarization. The input state is known in components such as
optical isolators and circulators. A Faraday rotator waveplate precesses an
input state of polarization about the $\pm\hat s_3$ axis, the sign determined by the
relation between the magnetization vector and the propagation direction. Table 4.8 on page 207 lists Jones and Stokes operators for Faraday rotation.
One realization of the achromat, illustrated in Fig. 4.22(a), is constructed
with a half-wave and then quarter-wave waveplate, followed by the Faraday
rotator. The extraordinary axis of the half-wave plate is rotated by $\theta/2$ from
the horizontal while that of the quarter-wave plate is rotated by $\theta$. The input
state of polarization is expected to be linear and along the horizontal $+\hat s_1$.
All three plates change retardation to first order with frequency; denote the
changes as $\Delta\varphi_F$, $\Delta\varphi_{\lambda/4}$, and $\Delta\varphi_{\lambda/2}$ for the Faraday, quarter-wave, and half-wave plates, respectively.

Fig. 4.22. Two realizations of the Shirasaki achromat: half-wave waveplate followed by a quarter-wave waveplate, the combination preceding a Faraday rotator.
a) Waveplates are oriented at $\theta/2$ and $\theta$, where $\theta = 30^\circ$. To first order, the Stokes
view shows frequency-dependent motion counter to that of a $+\hat s_3$-oriented Faraday
rotator. To second order the contour curvatures of the waveplates add, reducing the
bandwidth. b) Quarter-wave waveplate rotated by 90° from a). Curvatures largely
cancel and the bandwidth is increased. The transformation motion runs counter to
a), requiring a reversed orientation of the Faraday rotator.
In Stokes space, the half-wave waveplate rotates the horizontal input polarization to another point on the equator, the angle of separation being $2\theta$.
Considering small changes in frequency, the locus of polarization states forms
a line perpendicular to the equator, to first order. The quarter-wave waveplate,
whose birefringent axis is at $2\theta$, rotates the locus parallel to the equator. The
achromat generates a state on the equator at $2\theta$ that moves toward $+\hat s_1$ with
increased frequency. This motion is counter to that of a Faraday rotator having its orientation along $+\hat s_3$, which rotates the polarization state toward $-\hat s_1$
with increased frequency. These two motions can cancel.
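A rigorous analysis is easily done with spin-vector operators. Stokes operators are constructed for each plate and expanded to include first-order
frequency deviation. The first-order operators are

$$ R_F + \Delta R_F = \hat s_3(\hat s_3\,\cdot) + (\hat s_3\,\times) + \Delta\varphi_F\,(\hat s_3 \times \hat s_3\,\times) \tag{4.6.29a} $$
$$ R_{\lambda/4} + \Delta R_{\lambda/4} = \hat r_4(\hat r_4\,\cdot) + (\hat r_4\,\times) + \Delta\varphi_{\lambda/4}\,\big(\hat r_4(\hat r_4\,\cdot) - I\big) \tag{4.6.29b} $$
$$ R_{\lambda/2} + \Delta R_{\lambda/2} = 2\hat r_2(\hat r_2\,\cdot) - I - \Delta\varphi_{\lambda/2}\,(\hat r_2\,\times) \tag{4.6.29c} $$

where the birefringent vectors $\hat r_2$ and $\hat r_4$ lie in the equatorial plane. To first
order, a frequency change is expanded as

$$ \Delta\big(R_F R_{\lambda/4} R_{\lambda/2}\big) \simeq (\Delta R_F) R_{\lambda/4} R_{\lambda/2} + R_F (\Delta R_{\lambda/4}) R_{\lambda/2} + R_F R_{\lambda/4} (\Delta R_{\lambda/2}) \tag{4.6.30} $$

Interestingly, the second term on the right-hand side vanishes because $\hat r_4$ is
aligned to the nominal polarization state produced by $R_{\lambda/2}$. When the operator (4.6.30) is applied to state $\hat s_1$ the contributing difference terms evaluate
to

$$ (\Delta R_F) R_{\lambda/4} R_{\lambda/2}\,\hat s_1 = \Delta\varphi_F\,(\hat s_3 \times \hat s_3\,\times)\,(\hat s_1 \cos 2\theta + \hat s_2 \sin 2\theta) \tag{4.6.31a} $$
$$ R_F R_{\lambda/4} (\Delta R_{\lambda/2})\,\hat s_1 = \Delta\varphi_{\lambda/2} \sin\theta\,\big(\hat s_3(\hat s_3\,\cdot) + (\hat s_3\,\times)\big)\,(\hat s_1 \sin 2\theta - \hat s_2 \cos 2\theta) \tag{4.6.31b} $$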
By completing the vector products and setting the result to zero, a simple
relation between frequency deviations of the Faraday and half-wave plates is
found

$$ \Delta\varphi_F = \Delta\varphi_{\lambda/2}\,\sin 2\theta_{\lambda/2} \tag{4.6.32} $$

where $\theta_{\lambda/2} = \theta/2$ is the physical angle of the half-wave plate.
This relation makes physical sense because the precession rates need to be
matched. The half-wave waveplate has a $\sin 2\theta_{\lambda/2}$ multiplier because the radius
of the precession circle depends on the angle between the input state and the
extraordinary axis. In light of (4.6.13), the Stokes angle $2\theta_{\lambda/2}$ is defined by

$$ \sin 2\theta_{\lambda/2} = \frac{\varphi_F}{\varphi_{\lambda/2}} \tag{4.6.33} $$

In the case of an isolator, $\varphi_F = \pi/2$. Accordingly, the physical angles for
the waveplates are $\theta_{\lambda/2} = 15^\circ$ and $\theta_{\lambda/4} = 30^\circ$. To maximize the second-order
bandwidth, the waveplates should be made true zero-order.
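
The first-order cancellation can be observed numerically in Stokes space. The Python sketch below is an illustration, not the author's calculation. It assumes right-handed rotations about each birefringent axis, a Faraday rotation of $\pi/2$ about $+\hat s_3$, zero-order plates whose retardations all scale as $(1 + \epsilon)$ with fractional frequency change $\epsilon$, and Stokes-space axes at $\theta$ (half-wave) and $2\theta$ (quarter-wave) with $\sin\theta = 1/2$. Function names are hypothetical.

```python
import math


def rotation(axis, phi):
    """Right-handed Stokes rotation by phi about a unit axis (Rodrigues form)."""
    x, y, z = axis
    k = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    c, s = math.cos(phi), math.sin(phi)
    return [[c * (1.0 if i == j else 0.0) + s * k[i][j]
             + (1.0 - c) * axis[i] * axis[j] for j in range(3)]
            for i in range(3)]


def apply(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def distance(a, b):
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))


def achromat_output(eps, theta):
    """Half-wave, quarter-wave, then Faraday rotator acting on +s1."""
    scale = 1.0 + eps
    r2 = (math.cos(theta), math.sin(theta), 0.0)          # half-wave axis
    r4 = (math.cos(2 * theta), math.sin(2 * theta), 0.0)  # quarter-wave axis
    s = (1.0, 0.0, 0.0)
    s = apply(rotation(r2, math.pi * scale), s)
    s = apply(rotation(r4, 0.5 * math.pi * scale), s)
    return apply(rotation((0.0, 0.0, 1.0), 0.5 * math.pi * scale), s)


def faraday_only_output(eps, theta):
    """Faraday rotator alone acting on the state at Stokes angle 2*theta."""
    s = (math.cos(2 * theta), math.sin(2 * theta), 0.0)
    return apply(rotation((0.0, 0.0, 1.0), 0.5 * math.pi * (1.0 + eps)), s)


theta = math.asin((math.pi / 2) / math.pi)   # sin(theta) = phi_F / phi_half
eps = 1e-3
drift_achromat = distance(achromat_output(eps, theta), achromat_output(0.0, theta))
drift_faraday = distance(faraday_only_output(eps, theta), faraday_only_output(0.0, theta))
```

With these conventions the output drift of the compensated cascade is second order in the detuning, while the uncompensated Faraday rotator drifts at first order.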
Another consideration is necessary when analyzing the achromat to second order. As illustrated in Fig. 4.22(a) the curvature of states can add, which
reduces the achromatic bandwidth. As an alternative, Fig. 4.22(b) illustrates
the same achromat with the quarter-wave waveplate rotated by 90°, which
is tantamount to exchange of the ordinary and extraordinary axes. In this case
the curvatures of states imparted by the half-wave and quarter-wave waveplates
subtract, providing a very broad band over which the combination is achromatic. When the quarter-wave plate is oriented like this, the state motion with
increasing frequency is opposite that of the original configuration. The Faraday
rotator must be accordingly reversed. Whether the quarter-wave waveplate is
to be rotated an additional 90° depends on $\theta_{\lambda/4}$: if $\theta_{\lambda/4} < 45^\circ$ then the additional
rotation is necessary, otherwise it is not.
4.6.4 Elementary Polarization Control
A polarization controller dynamically changes its transformation effect to either track an incoming state of polarization or alter the output state of polarization in a programmed way. Although a polarization controller can be arbitrarily complex with enough waveplates, the fundamental question is how few waveplates are needed for certain transformations.


A particularly useful control is arbitrary-to-arbitrary, wherein an arbitrary input polarization state is mapped to an arbitrary output state. Such a controller offers complete control. One subset is linear-to-arbitrary, which is useful when the input comes from a laser. A strictly endless controller is built with parts that are themselves endlessly rotatable, like a birefringent-crystal waveplate mounted on a rotary stage. Other controller types provide an apparently endless transformation but do so either by unwinding or digitally flipping the elementary parts. As an example, liquid crystal birefringent elements have a limited voltage range, which inhibits endless control of any one element. The same applies for fiber squeezers that induce birefringence through stress. Theoretically, unwinding or digital flipping can create endless polarization control, but the algorithms rely on certainty of the element retardations, which in practice is rarely the case. An extensive review of polarization control can be found in [60].
The elementary controllers considered in this section are built with a cascade of birefringent waveplates. These plates are physical pieces that have a fixed retardation and variable angle, although the birefringent axis is constrained to the equatorial plane. A common alternative in the industry is the lithium-niobate electro-optic polarization controller [26, 59]. Both birefringent and electro-optic controllers are strictly endless, and the electro-optic controllers can actuate in the megahertz range or higher. Also, unlike the birefringent waveplates, the bias voltage on an electro-optic controller can be changed as well as the orientation voltage, which changes the retardation. A polarization controller that combines the effects of waveplate rotation and retardation modulation is called a hybrid controller.
Before doing an analysis of waveplate combinations, it should be obvious that a single fixed-retardation waveplate element cannot make arbitrary transformations. The only possible transformation type through a single waveplate is when the cross-product between the input state and desired output state lies on the equator, so that the birefringent axis can point in that direction, and the dot-product between the states is equal to the retardation of the plate.
Figure 4.23 illustrates transformations through single quarter- and half-wave waveplates over a full revolution of the plates for a fixed input state. Figures 4.23(a,b) illustrate the bow-tie pattern created by a quarter-wave waveplate at a fixed frequency, while Figures 4.23(c,d) show the constant-latitude pattern created by a half-wave waveplate. In particular, note that a quarter-wave waveplate can transform a circular state to a linear state and back, while a half-wave waveplate maps linear states to linear and circular states to circular.
λ/4, λ/4 Arbitrary-To-Arbitrary Control
Two independently adjustable quarter-wave waveplates are the minimum
requirement for arbitrary-to-arbitrary polarization transformation. To show
that two quarter-wave plates with variable extraordinary axis inclination

Fig. 4.23. Polarization control for single quarter- and half-wave waveplates. Trajectories show the output locus for a fixed input over a full revolution of the respective birefringent waveplate (shown in inset). a) Bow-tie locus traced by a quarter-wave plate for horizontal linear input polarization. b) Distorted bow-tie for elliptical input polarization. c) Line-of-latitude locus traced by a half-wave plate for horizontal (or any) linear input polarization. d) Elevated line-of-latitude for elliptical input polarization. Note the eccentricity does not change.

(Fig. 4.24(a)) can map any arbitrary polarization state to another arbitrary state, it is sufficient to show that orthogonal input states, such as along $\hat s_1$, $\hat s_2$, and $\hat s_3$, can each be mapped anywhere in Stokes space. An arbitrary input state can then be composed of these orthogonal states without violation of the mapping.
Recall that the Stokes operator for a quarter-wave plate is
$$R_{\lambda/4} = (\hat r\,\hat r\cdot) + (\hat r\times) \qquad (4.6.34)$$
Two quarter-wave plates in cascade generate the operator

Fig. 4.24. Illustration of three birefringent waveplate polarization controllers. Each plate can be rotated independently. a) Quarter-, quarter-wave waveplate combination. Can map arbitrary-to-arbitrary polarization states. b) Half-, quarter-wave waveplate combination. Can map linear-to-arbitrary states. c) Quarter-, half-, quarter-wave waveplate combination. Can map arbitrary-to-arbitrary states.

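$$\begin{aligned}
R_2 R_1 &= \hat r_2(\hat r_2\cdot\hat r_1)(\hat r_1\cdot) + \hat r_2(\hat r_2\cdot\hat r_1\times) + (\hat r_2\times\hat r_1)(\hat r_1\cdot) + (\hat r_2\times\hat r_1\times) \\
&= \hat r_2\left[\cos\theta_{21}\,(\hat r_1\cdot) - \sin\theta_{21}\,(\hat s_3\cdot)\right] + (\hat r_2\times\hat r_1\times) - \hat s_3\,\sin\theta_{21}\,(\hat r_1\cdot)
\end{aligned} \qquad (4.6.35)$$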
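Operating on three orthogonal input states, the output states are
$$R_2 R_1 \hat s = \begin{cases}
\hat r_2(\cos\theta_{21}\cos\theta_{1s}) + \hat r_\perp(\sin\theta_{1s}) - \hat s_3(\sin\theta_{21}\cos\theta_{1s}), & \hat s = \hat s_1 \\
\hat r_2(\cos\theta_{21}\sin\theta_{1s}) - \hat r_\perp(\cos\theta_{1s}) - \hat s_3(\sin\theta_{21}\sin\theta_{1s}), & \hat s = \hat s_2 \\
-\hat r_2(\sin\theta_{21}) - \hat s_3(\cos\theta_{21}), & \hat s = \hat s_3
\end{cases} \qquad (4.6.36)$$
where $\theta_{21}$ is the Stokes-space angle from $\hat r_1$ to $\hat r_2$, $\theta_{1s}$ is the angle between $\hat r_1$ and $\hat s_1$, and where, as a variation on (4.6.19),
$$\hat r_1 = \hat r_2(\hat r_1\cdot\hat r_2) + \hat r_\perp(\hat r_1\cdot\hat r_\perp) \qquad (4.6.37)$$
By definition, $\hat r_\perp$ is a perpendicular axis in the equatorial plane such that $\hat r_2\cdot\hat r_\perp = 0$ and $\hat r_2\times\hat r_\perp = \hat s_3$.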
Consider the transformation of $\hat s_1$. The coefficients to $\hat r_2$, $\hat r_\perp$, and $\hat s_3$ can each independently span the range $[-1, 1]$, provided a judicious choice of $\theta_{21}$ and $\theta_{1s}$ is made. Therefore $R_2 R_1$ can map $\hat s_1$ to anywhere in Stokes space. The same argument applies to transformations of $\hat s_2$ and $\hat s_3$. In the latter case note that the direction of $\hat r_2$ is arbitrary: the linear component of the output polarization can be arbitrarily oriented.
Since an arbitrary input state is a linear combination of projections onto orthogonal axes, the linearity of $R_2 R_1$ ensures that any such input state can be mapped to an arbitrary output state.
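The closed-form outputs of the quarter-wave pair can be cross-checked by applying the operator $(\hat r\,\hat r\cdot) + (\hat r\times)$ twice to each basis state. The sketch below is illustrative only: the helper names are hypothetical, plate angles are expressed directly as Stokes-space angles of the birefringent axes, and the sign convention $\theta_{21} = $ (angle of $\hat r_2$) $-$ (angle of $\hat r_1$) is an assumption made so that the closed forms and the operator agree.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def equatorial_axis(angle):
    """Unit birefringent axis in the S1-S2 plane at Stokes-space angle `angle` from s1."""
    return (math.cos(angle), math.sin(angle), 0.0)

def quarter_wave(r, s):
    """Apply the quarter-wave Stokes operator R = (r r.) + (r x) to the vector s."""
    d, c = dot(r, s), cross(r, s)
    return tuple(d * r[i] + c[i] for i in range(3))

def cascade(angle_1, angle_2, s):
    """Plate 1 acts first, then plate 2 (operator R2 R1)."""
    return quarter_wave(equatorial_axis(angle_2),
                        quarter_wave(equatorial_axis(angle_1), s))

def closed_form(angle_1, angle_2, index):
    """Closed-form output for input s1, s2 or s3 (index 1, 2, 3)."""
    t21 = angle_2 - angle_1          # assumed sign convention for theta_21
    t1s = angle_1                    # angle between r1 and s1
    r2 = equatorial_axis(angle_2)
    r_perp = (-math.sin(angle_2), math.cos(angle_2), 0.0)  # r2 x r_perp = s3
    s3 = (0.0, 0.0, 1.0)
    coeffs = {
        1: (math.cos(t21) * math.cos(t1s), math.sin(t1s), -math.sin(t21) * math.cos(t1s)),
        2: (math.cos(t21) * math.sin(t1s), -math.cos(t1s), -math.sin(t21) * math.sin(t1s)),
        3: (-math.sin(t21), 0.0, -math.cos(t21)),
    }[index]
    return tuple(coeffs[0] * r2[i] + coeffs[1] * r_perp[i] + coeffs[2] * s3[i]
                 for i in range(3))

BASIS = {1: (1.0, 0.0, 0.0), 2: (0.0, 1.0, 0.0), 3: (0.0, 0.0, 1.0)}

def max_mismatch(steps=12):
    """Largest component-wise difference between cascade and closed form on an angle grid."""
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            a1 = 2.0 * math.pi * i / steps
            a2 = 2.0 * math.pi * j / steps
            for index, s in BASIS.items():
                out = cascade(a1, a2, s)
                ref = closed_form(a1, a2, index)
                worst = max(worst, max(abs(out[k] - ref[k]) for k in range(3)))
    return worst

print("largest mismatch:", max_mismatch())
```

A mismatch at the level of floating-point round-off indicates that the operator and the tabulated outputs are consistent under the stated convention.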
λ/2, λ/4 Linear-To-Arbitrary Control
A half-wave plate followed by a quarter-wave plate (Fig. 4.24(b)) can transform any linear input state to an arbitrary output state. To show this, recall that the Stokes operator for a half-wave plate is
$$R_{\lambda/2} = 2(\hat r\,\hat r\cdot) - I \qquad (4.6.38)$$


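The half-wave ($R_2$), quarter-wave ($R_1$) cascade generates the operator
$$\begin{aligned}
R_2 R_1 &= 2\hat r_2(\hat r_2\cdot\hat r_1)(\hat r_1\cdot) - \hat r_1(\hat r_1\cdot) + 2\hat r_2(\hat r_2\cdot\hat r_1\times) - (\hat r_1\times) \\
&= 2\hat r_2\left[\cos\theta_{21}\,(\hat r_1\cdot) - \sin\theta_{21}\,(\hat s_3\cdot)\right] - \hat r_1(\hat r_1\cdot) - (\hat r_1\times)
\end{aligned} \qquad (4.6.39)$$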

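Operating on three orthogonal input states, the output states are
$$R_2 R_1 \hat s = \begin{cases}
\hat r_2(\cos\theta_{21}\cos\theta_{1s}) + \hat r_\perp(\sin\theta_{21}\cos\theta_{1s}) + \hat s_3(\sin\theta_{1s}), & \hat s = \hat s_1 \\
\hat r_2(\cos\theta_{21}\sin\theta_{1s}) + \hat r_\perp(\sin\theta_{21}\sin\theta_{1s}) - \hat s_3(\cos\theta_{1s}), & \hat s = \hat s_2 \\
-\hat r_2(\sin\theta_{21}) + \hat r_\perp(\cos\theta_{21}), & \hat s = \hat s_3
\end{cases} \qquad (4.6.40)$$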
As in the case of cascaded quarter-wave waveplates, input states $\hat s_1$ and $\hat s_2$ can be mapped arbitrarily. However, this is not so for a launch along $\hat s_3$. In that case the half-, quarter-wave pair only maps to the equator; any remaining component along $\hat s_3$ vanishes. This makes physical sense because a circular state that transits a half-wave plate is rotated to the orthogonal circular state. The quarter-wave plate in turn rotates the resultant circular state to the equator.
The half-wave, quarter-wave cascade transforms any linear state to an arbitrary state. In the opposite direction, any arbitrary state can be mapped to a linear state. This may be useful when the light is subsequently analyzed by a polarizer.
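The restriction on circular inputs can be illustrated numerically. The sketch below follows the operator ordering of (4.6.39), with the quarter-wave plate acting first and the half-wave plate second; the helper names are hypothetical and the plate angles are Stokes-space angles of the birefringent axes, both assumptions made for illustration.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def equatorial_axis(angle):
    """Unit birefringent axis in the S1-S2 plane at Stokes-space angle `angle`."""
    return (math.cos(angle), math.sin(angle), 0.0)

def quarter_wave(r, s):
    """Quarter-wave Stokes operator: (r r.) + (r x)."""
    d, c = dot(r, s), cross(r, s)
    return tuple(d * r[i] + c[i] for i in range(3))

def half_wave(r, s):
    """Half-wave Stokes operator: 2 (r r.) - I."""
    d = dot(r, s)
    return tuple(2.0 * d * r[i] - s[i] for i in range(3))

def half_after_quarter(angle_q, angle_h, s):
    """Quarter-wave plate (R1) first, then half-wave plate (R2)."""
    return half_wave(equatorial_axis(angle_h),
                     quarter_wave(equatorial_axis(angle_q), s))

S1 = (1.0, 0.0, 0.0)
S3 = (0.0, 0.0, 1.0)

# A circular input always lands on the equator (zero S3 component).
grid = [2.0 * math.pi * k / 16 for k in range(16)]
worst_s3 = max(abs(half_after_quarter(aq, ah, S3)[2]) for aq in grid for ah in grid)

# A linear input can reach the pole: quarter-wave axis along s2, any half-wave angle.
linear_to_circular = half_after_quarter(math.pi / 2, 0.3, S1)

print("largest |S3| for circular input:", worst_s3)
print("s1 mapped to:", linear_to_circular)
```

The second case shows one linear-to-circular setting; other targets require a search over both plate angles, which is omitted here.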
λ/4, λ/2, λ/4 Arbitrary-To-Arbitrary Control
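The same procedure is used to analyze the cascade of quarter-, half-, quarter-wave waveplates. To aid with the reductions, $\hat r_3\cdot\hat r_{3\perp} = 0$ and $\hat r_3\times\hat r_{3\perp} = \hat s_3$ define the vector $\hat r_{3\perp}$. The concatenated Stokes operator is
$$\begin{aligned}
R_3 R_2 R_1 ={}& 2\hat r_3(\hat r_3\cdot\hat r_2)\left(\cos\theta_{21}\,(\hat r_1\cdot) - \sin\theta_{21}\,(\hat s_3\cdot)\right) - \hat r_3(\hat r_3\cdot\hat r_1)(\hat r_1\cdot) - \hat r_3(\hat r_3\cdot\hat r_1\times) \\
&+ 2(\hat r_3\times\hat r_2)\left(\cos\theta_{21}\,(\hat r_1\cdot) - \sin\theta_{21}\,(\hat s_3\cdot)\right) - (\hat r_3\times\hat r_1)(\hat r_1\cdot) - (\hat r_3\times\hat r_1\times)
\end{aligned} \qquad (4.6.41)$$
Transformation of the three orthogonal input states generates the following expressions:
$$\begin{aligned}
R_3 R_2 R_1 \hat s_1 &= \hat r_3\cos(\theta_{32}-\theta_{21})\cos\theta_{1s} - \hat r_{3\perp}\sin\theta_{1s} - \hat s_3\sin(\theta_{32}-\theta_{21})\cos\theta_{1s} \\
R_3 R_2 R_1 \hat s_2 &= \hat r_3\cos(\theta_{32}-\theta_{21})\sin\theta_{1s} + \hat r_{3\perp}\cos\theta_{1s} - \hat s_3\sin(\theta_{32}-\theta_{21})\sin\theta_{1s} \\
R_3 R_2 R_1 \hat s_3 &= \hat r_3\sin(\theta_{32}-\theta_{21}) + \hat s_3\cos(\theta_{32}-\theta_{21})
\end{aligned}$$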
As was the case with two quarter-wave waveplates, the output polarization for each orthogonal input is mapped arbitrarily in Stokes space provided a judicious choice of waveplate angles.
In comparison to (4.6.36), the present transformation makes the angle substitution $\theta_{21} \rightarrow (\theta_{32} - \theta_{21})$. The addition of the intermediate half-wave plate provides a degree of freedom not available for the quarter-wave pair, wherein the value of $(\theta_{32} - \theta_{21})$ can be changed without simultaneous change of $\hat r_3$, $\sin\theta_{1s}$, or both. For this reason the quarter-, half-, quarter-wave combination is often preferred over the lone pair.
4.6.5 TIR Polarization Retarders
A total-internal reflection (TIR) retarder is nearly achromatic and accordingly has practical application in components and instruments. As analyzed in §3.5.6, total internal reflection retards light due to the difference of evanescent field decay between TE and TM plane waves. While the retardation of a waveplate is directly proportional to frequency (4.6.1), the frequency dependence of TIR retardation is governed by the material dispersion alone. Indeed, differentiation of the retardance expression (3.5.47) on page 103 with respect to frequency yields
$$\frac{d\varphi}{d\omega} = \frac{2u}{u^2+1}\;\frac{1}{n(\omega)\left(n^2(\omega)\sin^2\theta - 1\right)}\;\frac{dn(\omega)}{d\omega} \qquad (4.6.42)$$
where
$$u = \frac{\cos\theta\,\sqrt{n^2(\omega)\sin^2\theta - 1}}{n(\omega)\sin^2\theta} \qquad (4.6.43)$$
Selection of a low-dispersion material will produce a highly achromatic retarder.


A common TIR retarder is the Fresnel rhomb, illustrated in Fig. 4.25(a). The Fresnel rhomb is designed to convert linearly polarized light of the proper inclination to circular polarization after two reflections while maintaining the output collinear with the input. The three design parameters are the retardance per TIR, the rhombohedral angle $\theta$, and the material index $n$. Association of physical space to Stokes space is made by referencing the Stokes axis $\hat s_1$ to the TE direction on the plane of incidence. Since $\hat s_1$ is also associated with the positive eigenvector of the Jones operator $U$, the retardance expression
$$\varphi = 2\tan^{-1}u \qquad (4.6.44)$$
follows a right-hand precession rule about $\hat s_1$. Accordingly, linearly polarized light aligned to $+\hat s_2$ is transformed to circular polarization $\hat s_3$ when $\varphi = \pi/2$.
The practical design of a Fresnel rhomb should minimize the retardance sensitivities to frequency and incident angle. The frequency sensitivity is given above, and the angular sensitivity of total internal reflection retardance is
$$\frac{d\varphi}{d\theta} = \frac{2u}{u^2+1}\;\frac{2 - (n^2+1)\sin^2\theta}{\sin\theta\,\cos\theta\,\left(n^2\sin^2\theta - 1\right)} \qquad (4.6.45)$$

Fig. 4.25. A Fresnel rhomb can transform linear 45° polarization into circular polarization over a bandwidth limited only by material dispersion. a) Illustration of the rhomb, where two total-internal reflections impart a combined quarter-wave shift. b) Retardance $\varphi$ (deg) of the rhomb as a function of apex angle $\theta$ (deg), where $n = 1.497$. Retardance is zero at the critical angle $\theta_c$ and at grazing angle. The retardance is 45° at $\theta = 51.8^\circ$, where the first-order retardance sensitivity to input angle, by design, vanishes.

The angular sensitivity vanishes for
$$\sin\theta_o = \sqrt{\frac{2}{n^2+1}} \qquad (4.6.46)$$
The angle $\theta_o$ also yields the maximum retardance for a given index $n$. Substitution of (4.6.46) into (4.6.43) gives the maximum $u$ value for a given $n$:
$$u_{\max} = \frac{n^2 - 1}{2n} \qquad (4.6.47)$$
For example, a lead-doped glass having index $n \simeq 1.8$ generates a retardance per TIR of $\simeq 64^\circ$. As $\varphi = 90^\circ$ is necessary for linear to circular conversion, the Fresnel rhomb typically uses two reflections to accumulate the full $\pi/2$ retardance.
To minimize the angular sensitivity while $\varphi = \pi/4$, the value of $u$ is determined from (4.6.44) and the associated index $n$ is calculated from (4.6.47). Figure 4.25(b) plots the retardance for a single reflection as a function of angle for this solution. The angular sensitivity vanishes just at the point of eighth-wave shift.
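The design numbers quoted in this subsection follow directly from (4.6.43), (4.6.44), (4.6.46), and (4.6.47). The following is a minimal sketch of that arithmetic; the function names are hypothetical, and inverting (4.6.47) for the index uses the positive root of $n^2 - 2un - 1 = 0$.

```python
import math

def tir_u(n, theta):
    """u of (4.6.43) for index n and internal incidence angle theta (radians)."""
    s2 = math.sin(theta) ** 2
    return math.cos(theta) * math.sqrt(n * n * s2 - 1.0) / (n * s2)

def retardance(n, theta):
    """Retardance per total internal reflection, (4.6.44), in radians."""
    return 2.0 * math.atan(tir_u(n, theta))

def optimum_angle(n):
    """Angle of vanishing angular sensitivity, (4.6.46), in radians."""
    return math.asin(math.sqrt(2.0 / (n * n + 1.0)))

def u_max(n):
    """Maximum u for a given index, (4.6.47)."""
    return (n * n - 1.0) / (2.0 * n)

def index_for_peak_retardance(phi):
    """Index whose maximum retardance per reflection equals phi (radians)."""
    u = math.tan(phi / 2.0)
    return u + math.sqrt(u * u + 1.0)   # positive root of n^2 - 2 u n - 1 = 0

# Lead-doped glass example: maximum retardance per reflection for n = 1.8.
peak_deg = math.degrees(2.0 * math.atan(u_max(1.8)))

# Eighth-wave design: index and angle giving 45 deg per reflection at the stationary point.
n_design = index_for_peak_retardance(math.pi / 4)
theta_design_deg = math.degrees(optimum_angle(n_design))

print(f"n = 1.8 peak retardance: {peak_deg:.1f} deg")
print(f"eighth-wave design: n = {n_design:.3f}, theta = {theta_design_deg:.1f} deg")
```

The sketch returns roughly 64° for $n = 1.8$ and recovers the $n = 1.497$, $\theta = 51.8^\circ$ operating point of Fig. 4.25(b).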
Beyond the elementary analysis presented here, two complete studies of rhomb sensitivities can be found in [4, 42]. Also, reference [9] applies TIR prisms to isolators and circulators to greatly extend their bandwidth; however, the implementations are not particularly practical.
Separate from Fresnel rhombs, high-sensitivity magneto-optic sensors can use turning prisms to complete an optical circuit around a conductor [40]. When the light transits one or more unsaturated iron-garnet Faraday rotator elements located in proximity to the conductor, the Faraday rotation is proportional to the current-induced magnetic field. For such sensors, retardance generated by the prisms reduces the small-signal sensitivity [41]. While the retardance per reflection can be reduced by bringing the incidence angle close to the critical angle, as can be seen in Fig. 4.25(b), the error sensitivities become impractically large. To overcome this limitation, a specially designed thin-film coating can be applied to the hypotenuse of the prism to reduce the retardance while remaining away from the critical angle. Reference [43] cites a design where the retardance was reduced to 1° at 1.3 μm and the retardance remained within 6° for a 5° angular error.

4.7 Single and Compound Prisms


Prisms are a cornerstone of birefringent optical components. Simple isotropic prisms are used as turning prisms to bend the light 90° or 180° within a component, or as straightening prisms to compensate for the angular divergence between two beams emergent from a dual-fiber collimator. Birefringent prism pairs are used in isolators to create polarization diversity. Compound prisms such as the Wollaston, Rochon, and Kaifa prisms are used in many circulator designs for polarization diversity and angular compensation for dual-fiber collimators. The prisms studied in this section are designed in the small-angle limit, where Snell's law may be linearly approximated. A broad range of prisms is discussed in [24].
The isotropic isosceles prism illustrated in Fig. 4.26(a) has an apex angle $\alpha$ and refractive index $n$ which, in general, is a function of wavelength. For a given angle of incidence $\theta_1$ on one face of the prism, the deflection angle is
$$\beta = \theta_1 - \alpha + \sin^{-1}\!\left[\sin\alpha\,\sqrt{n^2 - \sin^2\theta_1} - \cos\alpha\,\sin\theta_1\right] \qquad (4.7.1)$$
The angle of minimum deviation is that $\theta_1$, or $\theta_m$, which minimizes $\beta$. While the expression is complicated in the general case, minimum deviation requires symmetry between the input and output angles: $\theta_1 = \theta_4$.
Fig. 4.26. Isosceles and small-angle prisms. a) Isosceles prism with apex angle $\alpha$ and refractive index $n$. Prism deflects input beam by angle $\beta$. b) Small-angle prism with near-normal incidence. For small angles $\beta = (n - 1)\alpha$. This shape is also called a wedge prism.

Fig. 4.27. Birefringent prisms having extraordinary axis perpendicular to input beam. The deflection of the beam is polarization dependent. For a (+) uniaxial crystal the polarization aligned to the extraordinary axis is deflected more than that aligned to the ordinary axis. The output polarization states are aligned to the ordinary and extraordinary axes of the prism. Deflection and output polarization states are independent. a) e axis tilted at angle $\phi_1$. b) e axis tilted at angle $\phi_2$.

For prisms with small apex angles and near-normal incidence (Fig. 4.26(b)), (4.7.1) is linearized to yield the deflection of a small-angle prism,
$$\beta \simeq (n - 1)\,\alpha \qquad (4.7.2)$$
The input angle for minimum deviation in this case follows from the symmetric ray path, for which the internal ray makes an angle $\alpha/2$ with each face normal: $\theta_m \simeq n\alpha/2$.
The birefringent prism is a prism made of birefringent crystalline material, typically uniaxial material. Birefringent prisms for isolators and most circulators are small-angle prisms. A birefringent prism refracts like an isotropic prism and obeys (4.7.1), but the refractive index depends on the input polarization. In turn, the deflection is polarization dependent. Figure 4.27 illustrates two birefringent prisms made of the same material and having the same apex angle. The difference between the two prisms is the orientation of the extraordinary axis. For positive uniaxial material, the e-ray deflects more than the o-ray. The linear polarization states of the two output beams are aligned to the ordinary and extraordinary axes of the prism. The e-axis orientation determines the output polarization orientation but does not contribute to the deflection.
The following studies detail birefringent prism pairs in Wollaston, Rochon, Kaifa, and Shirasaki configurations.
4.7.1 Wollaston and Rochon Prisms
The Wollaston and Rochon compound prisms are birefringent prism pairs that
angularly separate orthogonal linear polarization states. The Wollaston type
uses two prisms with the same apex angle and material, and the extraordinary
axes are crossed (Fig. 4.28(a)). The line of contact between the two parts is
the hypotenuse of the prisms, and the input and output faces are parallel
to one another. The prism is generally oriented perpendicular to, or with a
small tilt to, the input beam. Two beams emerge from the compound prism,

Fig. 4.28. Wollaston (a) and Rochon (b) compound prisms separate an input beam based on its polarization. The Wollaston prism symmetrically separates the orthogonal states while the Rochon prism deflects only the state aligned with the e-axis in prism B. The lower panels show ray traces of the orthogonal polarization components.

the u-beam following the u-path and the v-beam following the v-path. The output angles are calculated from Snell's equation applied to each interface. From left to right, the equations for the u-path are
$$\begin{aligned}
\sin\theta_1 &= n_e\sin\theta_2 &\quad& (4.7.3a) \\
n_e\sin(\theta_2+\alpha) &= n_g\sin(\theta_3+\alpha) && (4.7.3b) \\
n_g\sin(\theta_3+\alpha) &= n_o\sin(\theta_4+\alpha) && (4.7.3c) \\
n_o\sin\theta_4 &= \sin\theta_5 && (4.7.3d)
\end{aligned}$$
where $n_g$ is the index in the gap. For the v-path the equations are
$$\begin{aligned}
\sin\theta_1 &= n_o\sin\theta_2 &\quad& (4.7.4a) \\
n_o\sin(\theta_2+\alpha) &= n_g\sin(\theta_3+\alpha) && (4.7.4b) \\
n_g\sin(\theta_3+\alpha) &= n_e\sin(\theta_4+\alpha) && (4.7.4c) \\
n_e\sin\theta_4 &= \sin\theta_5 && (4.7.4d)
\end{aligned}$$
Taking incident angles as small but allowing the apex angle to be significant,¹ the two output deflections are related to the birefringence and apex angle as
$$\theta_5 - \theta_1 \simeq (n_e - n_o)\tan\alpha, \qquad \theta_5 - \theta_1 \simeq -(n_e - n_o)\tan\alpha \qquad (4.7.5)$$
for the u-path and v-path, respectively. The Wollaston deflection $\theta_W$ is the full angle between the outputs, which is
$$\theta_W = 2\,(n_e - n_o)\tan\alpha \qquad (4.7.6)$$
¹ The approximation is $\sin(\theta + \alpha) \simeq \theta\cos\alpha + \sin\alpha$.

Fig. 4.29. Modified Wollaston and modified Rochon prisms. a) Modified Wollaston tilts the e-axes of the birefringent prisms while maintaining a 90° separation. The output polarization states are aligned to the e- and o-axes of the second prism. b) Modified Rochon prism tilts the e-axis of the second prism.

As an example, to compensate for a 3° full-angle divergence between beams emergent from a dual-fiber collimator, using YVO₄ material, which has a birefringence at 1.55 μm of $\Delta n = 0.2039$, the required apex angle is $\alpha \simeq 14.4^\circ$.
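The apex-angle calculation can be sketched by inverting (4.7.6). The function name below is hypothetical. Taken literally, (4.7.6) gives an apex angle of about 7.3° when the full angle between the two outputs is 3°; the 14.4° value is recovered when each beam is deflected by 3° (6° between the outputs), so the definition of divergence should be checked against the collimator in use.

```python
import math

def wollaston_apex_angle_deg(full_angle_deg, delta_n):
    """Apex angle alpha (degrees) from theta_W = 2 * delta_n * tan(alpha), small-angle limit."""
    theta_w = math.radians(full_angle_deg)
    return math.degrees(math.atan(theta_w / (2.0 * delta_n)))

DELTA_N_YVO4 = 0.2039  # birefringence at 1.55 um, as quoted in the text

alpha_for_3deg_full = wollaston_apex_angle_deg(3.0, DELTA_N_YVO4)
alpha_for_6deg_full = wollaston_apex_angle_deg(6.0, DELTA_N_YVO4)

print(f"3 deg between outputs: alpha = {alpha_for_3deg_full:.1f} deg")
print(f"6 deg between outputs: alpha = {alpha_for_6deg_full:.1f} deg")
```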
The Rochon compound prism (Fig. 4.28(b)) is like the Wollaston in shape, but the e-axis of prism A is oriented along the propagation axis. In this way both u- and v-path polarizations see the ordinary index and experience the same refraction into the gap. Moreover, in prism B the u-path also sees the ordinary index, which in turn imparts zero net deflection. Only the v-path is deflected at the output. The Rochon deflection $\theta_R$ is the full angle between the outputs, which is
$$\theta_R = (n_e - n_o)\tan\alpha \qquad (4.7.7)$$
The Rochon has half of the deflecting power of the Wollaston, but has the advantage of keeping one beam parallel to the input.
Path balancing is another key difference between the Wollaston and Rochon. For the Wollaston, in prism A the u-path has the extraordinary polarization while the v-path has the ordinary. In prism B these associations are reversed. As long as the prism lengths are the same, the two paths are temporally balanced and one expects no appreciable PMD. In contrast, there is no temporal difference imparted by prism A of the Rochon, but there is by prism B. A single Rochon is not temporally balanced. The differential-group delays accumulated in the Wollaston and Rochon prisms are

$$\tau_W \simeq \frac{(n_e - n_o)(L_A - L_B)}{c} \qquad (4.7.8a)$$
$$\tau_R \simeq \frac{(n_e - n_o)\,L_B}{c} \qquad (4.7.8b)$$

where $L_A$ and $L_B$ are the lengths of the first and second prisms in each pair, and the small-angle limit is used. Using YVO₄ with a length $L_B = 2$ mm, the delay from a Rochon prism is $\tau_R \simeq 1.3$ ps. This is an appreciable imbalance in the context of current component PMD specifications.
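The delay estimate can be reproduced with a few lines of arithmetic. This is a minimal sketch of (4.7.8a) and (4.7.8b); the function names are hypothetical, and the birefringence value is the YVO₄ figure quoted earlier in this section.

```python
C_LIGHT = 299_792_458.0   # speed of light in vacuum, m/s
DELTA_N_YVO4 = 0.2039     # birefringence at 1.55 um, as quoted in the text

def rochon_dgd(length_b, delta_n=DELTA_N_YVO4):
    """Differential-group delay (seconds) of a Rochon prism, small-angle limit."""
    return delta_n * length_b / C_LIGHT

def wollaston_dgd(length_a, length_b, delta_n=DELTA_N_YVO4):
    """Differential-group delay (seconds) of a Wollaston prism, small-angle limit."""
    return delta_n * (length_a - length_b) / C_LIGHT

tau_rochon = rochon_dgd(2e-3)               # L_B = 2 mm
tau_wollaston = wollaston_dgd(2e-3, 2e-3)   # equal prism lengths

print(f"Rochon:    {tau_rochon * 1e12:.2f} ps")
print(f"Wollaston: {tau_wollaston * 1e12:.2f} ps")
```

The Rochon result is about 1.36 ps, in line with the figure in the text, while a Wollaston with equal prism lengths is balanced.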
A variation of the Wollaston and Rochon compound prisms of significant practical importance is illustrated in Fig. 4.29 [56, 61]. The modified Wollaston compound prism changes the cut of the e-axes of prisms A and B while maintaining a 90° difference between them. The modified Rochon compound prism changes the e-axis cut in prism B. The modified Wollaston and modified Rochon prisms impart the same polarization-dependent deflections as their standard counterparts, but the linear states of polarization are rotated to align with the ordinary and extraordinary axes in prism B. Tilting of the extraordinary axes in this way adds a degree of freedom in the polarization-evolution schemes used in isolators and circulators.
4.7.2 Kaifa Prism
The Kaifa prism is a hybrid of the Wollaston and Rochon prisms and is illustrated in Fig. 4.30 [2]. The Kaifa prism serves two functions at once: the displacement of one polarization from the other, and the deflection of the two polarizations. The prism can be designed with no differential-group delay.
The compound prism is made from two birefringent prisms. Unlike the preceding prisms, the extraordinary axis in prism A is cut at angle $\alpha_{BC}$ to the longitudinal axis to produce Poynting vector walkoff along the u-path. For normal incidence, the k-vectors of the e- and o-rays remain coincident. At the hypotenuse interface the v-path follows the same path as in the Wollaston and Rochon prisms, while the u-path experiences a deflection that is between zero and $\theta_W/2$.
The Kaifa deflection $\theta_K$ is determined from ray tracing. The u-path follows
$$\begin{aligned}
\sin\theta_1 &= n_e'\sin\theta_2 &\quad& (4.7.9a) \\
n_e'\sin(\theta_2+\alpha) &= n_g\sin(\theta_3+\alpha) && (4.7.9b) \\
n_g\sin(\theta_3+\alpha) &= n_o\sin(\theta_4+\alpha) && (4.7.9c) \\
n_o\sin\theta_4 &= \sin\theta_5 && (4.7.9d)
\end{aligned}$$
where the effective extraordinary index $n_e'$ is determined from the birefringence and extraordinary axis angle $\alpha_{BC}$ (cf. (3.6.26) on page 117):
$$n_e' = \frac{n_e n_o}{\sqrt{n_e^2\cos^2\alpha_{BC} + n_o^2\sin^2\alpha_{BC}}} \qquad (4.7.10)$$

Fig. 4.30. The Kaifa prism is a hybrid of the Wollaston and Rochon prisms. Due to inclination of the extraordinary axis in prism A, the Poynting vector of the extraordinary ray walks away from the ordinary ray. Prism B deflects both rays, but due to the intermediate refraction angle from prism A, the angle of the u-path output from prism B lies between that of the Wollaston and Rochon prisms.

Note that if the incident angle is not $\theta_1 = 0$ then (4.7.9a) must be replaced with (3.6.23) on page 115. For small incident angles, the deflection along the u-path is then
$$\theta_5 - \theta_1 \simeq (n_e' - n_o)\tan\alpha \qquad (4.7.11)$$
where $n_e'$ is the effective extraordinary index of (4.7.10). The v-path follows (4.7.4) with deflection (4.7.5). Accordingly, the Kaifa compound prism deflection angle is
$$\theta_K = (n_e + n_e' - 2n_o)\tan\alpha \qquad (4.7.12)$$
The pointing direction, the center line of the two paths, is
$$\theta_{pt} = \tfrac{1}{2}\,(n_e' - n_e)\tan\alpha \qquad (4.7.13)$$
When $\alpha_{BC} = 0$ the Kaifa prism reverts to a Rochon prism with $n_e' = n_o$. Likewise, when $\alpha_{BC} = 90^\circ$ the prism reverts to a Wollaston prism with $n_e' = n_e$. In either of these two extremes there is no walkoff of the extraordinary path from the ordinary path.
The optical path lengths can be balanced in the Kaifa prism. With $n_e'$ the effective extraordinary index of (4.7.10), the accumulated phase along the u- and v-paths is
$$\varphi_u = \frac{\omega}{c}\,(n_e' L_1 + n_o L_2) \qquad (4.7.14a)$$
$$\varphi_v = \frac{\omega}{c}\,(n_o L_1 + n_e L_2) \qquad (4.7.14b)$$
The requisite prism length ratio that balances the phase and thereby eliminates differential-group delay is
$$\frac{L_2}{L_1} = \frac{n_e' - n_o}{n_e - n_o} \qquad (4.7.15)$$
The length $L_1$ of the first prism determines the displacement of the e-ray. The walkoff angle of the Poynting vector is governed by (cf. (3.6.15) on page 110)
$$\tan\gamma = \frac{(n_e^2 - n_o^2)\sin\alpha_{BC}\cos\alpha_{BC}}{n_e^2\cos^2\alpha_{BC} + n_o^2\sin^2\alpha_{BC}} \qquad (4.7.16)$$
The displacement $d_1$ at the end of prism A is
$$d_1 \simeq L_1\tan\gamma \qquad (4.7.17)$$

The change in displacement after transiting prism B is


(d1 d2 )  L2 tan (4 4 )

(4.7.18)

Given the displacement at the end face of prism B and the full deection angle,
the distance to the crossing point of the u- and v-paths is approximately
dc 

d2
K

Expansion of the respective terms yields


"!
"
!

no
neff
no
tan nneff

no
ne tan
e no
L1
dc 
(ne + ne 2no ) tan

(4.7.19)

(4.7.20)

The Kaifa prism is useful in some circulator as well as interleaver applications because of the simplicity of its displacement and deflection behavior.
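The Kaifa design relations (4.7.10), (4.7.12), (4.7.15), and (4.7.16) can be evaluated together to see how the axis cut angle trades deflection against walkoff. The sketch below is illustrative only: the function names are hypothetical, and the YVO₄ indices $n_o = 1.9447$ and $n_e = 2.1486$ are assumed values chosen to be consistent with the birefringence of 0.2039 quoted earlier.

```python
import math

N_O = 1.9447   # assumed ordinary index of YVO4 near 1.55 um
N_E = 2.1486   # assumed extraordinary index of YVO4 near 1.55 um

def effective_index(alpha_bc, n_o=N_O, n_e=N_E):
    """Effective extraordinary index for an axis cut at alpha_bc (radians), (4.7.10)."""
    c2, s2 = math.cos(alpha_bc) ** 2, math.sin(alpha_bc) ** 2
    return n_e * n_o / math.sqrt(n_e ** 2 * c2 + n_o ** 2 * s2)

def walkoff_angle(alpha_bc, n_o=N_O, n_e=N_E):
    """Poynting-vector walkoff angle gamma (radians), (4.7.16)."""
    s, c = math.sin(alpha_bc), math.cos(alpha_bc)
    return math.atan((n_e ** 2 - n_o ** 2) * s * c / (n_e ** 2 * c * c + n_o ** 2 * s * s))

def kaifa_deflection(alpha, alpha_bc, n_o=N_O, n_e=N_E):
    """Full deflection angle theta_K (radians) for apex angle alpha, (4.7.12)."""
    return (n_e + effective_index(alpha_bc, n_o, n_e) - 2.0 * n_o) * math.tan(alpha)

def balanced_length_ratio(alpha_bc, n_o=N_O, n_e=N_E):
    """L2 / L1 that removes differential-group delay, (4.7.15)."""
    return (effective_index(alpha_bc, n_o, n_e) - n_o) / (n_e - n_o)

alpha = math.radians(10.0)      # example apex angle (assumption)
alpha_bc = math.radians(45.0)   # example axis cut (assumption)

rochon = (N_E - N_O) * math.tan(alpha)
wollaston = 2.0 * (N_E - N_O) * math.tan(alpha)
kaifa = kaifa_deflection(alpha, alpha_bc)

print(f"n_e' = {effective_index(alpha_bc):.4f}")
print(f"walkoff = {math.degrees(walkoff_angle(alpha_bc)):.2f} deg")
print(f"L2/L1 = {balanced_length_ratio(alpha_bc):.3f}")
print(f"deflection (deg): Rochon {math.degrees(rochon):.2f}, "
      f"Kaifa {math.degrees(kaifa):.2f}, Wollaston {math.degrees(wollaston):.2f}")
```

The limiting cases reproduce the statements in the text: a cut angle of 0 gives the Rochon prism and a cut angle of 90° gives the Wollaston prism, with no walkoff at either extreme.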
4.7.3 Shirasaki Prism
The Shirasaki compound prism illustrated in Fig. 4.31 is a birefringent prism pair designed to split an input light ray into orthogonal polarization components that run parallel to one another [54, 57]. The birefringent prisms are designed such that one polarization component undergoes total internal reflection while at the same interface the remaining polarization component is transmitted through the Brewster angle. The Shirasaki prism is a variation on the Glan-Taylor prism but directs the outputs to run parallel.

Fig. 4.31. The Shirasaki compound prism. Two birefringent prisms cut as illustrated separate a light beam into orthogonal linear components. At the air-gap interface between the two prisms one polarization is totally internally reflected while the other is transmitted through Brewster's angle. The extraordinary axis is aligned perpendicular to the page, and the output surfaces are AR coated. a) Input through port 1 separates polarizations onto two parallel paths; the p polarization runs along the top path. b) Input through port 2 also separates polarization; the p polarization runs along the bottom path. c) Optical path at air-gap interface. d) Angular orientation of four ports.

As illustrated in Fig. 4.31(a,b), there are four ports to the compound prism. Ports 1 and 2 have their polarization components resolved by the prism and diverted to run parallel at the output ports. Whether the p polarization component runs along the upper or lower output path depends on the input port, and likewise for the s polarization. The birefringent prisms have their extraordinary axes cut perpendicular to the plane in which the light travels. The compound prism is easily designed for positive uniaxial crystals.
The characteristic action of the compound prism lies at interface $i_1$ (Fig. 4.31(c)). Here one polarization component experiences total internal reflection, which is determined by the condition
$$n_2\sin\theta_2 \ge n_1 \qquad (4.7.21)$$
The other polarization is to experience total transmission through the Brewster angle. Recalling that the Brewster condition requires the reflected and refracted light to lie at right angles, or $\theta_1 + \theta_2 = \pi/2$, the reflection coefficient (3.5.39) on page 99 vanishes when
$$\tan\theta_2 = n_1/n_2 \qquad (4.7.22)$$


With this condition satisfied, one writes $\theta_2 = \theta_B$, where here $\theta_B$ is the internal Brewster angle. For an isotropic material (4.7.21) and (4.7.22) cannot be simultaneously satisfied. However, they can be simultaneously satisfied for a uniaxial birefringent material. Consider, without loss of generality, a positive uniaxial material such that TIR occurs for the extraordinary index. In this case, the governing expressions are
$$n_e\sin\theta_2 \ge n_1 \qquad (4.7.23a)$$
$$n_o\sin\theta_2 = n_1\cos\theta_2 \qquad (4.7.23b)$$
Combination of the square of these relations generates the condition that
$$n_e^2 - n_o^2 \ge n_1^2 \qquad (4.7.24)$$
When the gap index is air, $n_1 = 1$ and (4.7.24) is satisfied. Rutile and YVO₄ satisfy this condition.
A consideration with this compound prism is the reflection coefficient when the prism index changes, as with temperature. With a temperature change, the input angle does not change but the index does. If one writes (3.5.39) as $\Gamma_{TM} = f/g$, then to first order about the Brewster angle,
$$\frac{d\Gamma_{TM}}{dn_2} \simeq \frac{1}{g}\,\frac{df}{dn_2} \qquad (4.7.25)$$
Taking the differential and then substituting in Brewster's expression (4.7.22), the reflectivity change from zero is
$$\frac{d\Gamma_{TM}}{dn_2} \simeq \frac{(n_2/n_1)^2 - 1}{2n_2} \qquad (4.7.26)$$
Consider that the prisms are made of YVO₄. At room temperature the reflection of the ordinary ray is zero. For a 40 °C temperature change, given a thermal-optic coefficient of $dn_o/dT \simeq 8.6\times10^{-6}$ per degree, the reflection changes to $|\Gamma_{TM}|^2 \simeq -70$ dB. The Shirasaki prism was used to demonstrate a polarization-independent circulator having high extinction [35, 57].
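The thermal estimate can be reproduced from (4.7.26). The following is a minimal sketch; the function names are hypothetical, the ordinary index $n_o = 1.9447$ for YVO₄ is an assumed value not given in this passage, and the gap is taken as air.

```python
import math

def brewster_reflection_slope(n2, n1=1.0):
    """First-order change of the TM reflection coefficient per unit change of n2, (4.7.26)."""
    return ((n2 / n1) ** 2 - 1.0) / (2.0 * n2)

def thermal_reflection_db(n2, dn_dT, delta_T, n1=1.0):
    """Power reflection (dB) after the index drifts by dn_dT * delta_T from the Brewster design."""
    gamma = brewster_reflection_slope(n2, n1) * dn_dT * delta_T
    return 20.0 * math.log10(abs(gamma))

N_O_YVO4 = 1.9447   # assumed ordinary index near 1.55 um
reflection_db = thermal_reflection_db(N_O_YVO4, dn_dT=8.6e-6, delta_T=40.0)

print(f"reflection after a 40 degree change: {reflection_db:.1f} dB")
```

With these inputs the result is close to −72 dB, in line with the approximately −70 dB figure in the text.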

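Table 4.8. Summary of Waveplate Vector and Matrix Operators

Quarter-wave plate
Jones expression: $U_{\lambda/4} = \frac{1}{\sqrt{2}}\left(I - j(\hat r\cdot\vec\sigma)\right)$
Stokes expression: $R_{\lambda/4} = (\hat r\,\hat r\cdot) + (\hat r\times)$
Operators at angle $\theta$:
$$U_{\lambda/4}(\theta) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 - j\cos 2\theta & -j\sin 2\theta \\ -j\sin 2\theta & 1 + j\cos 2\theta \end{pmatrix}, \qquad R_{\lambda/4}(\theta) = \begin{pmatrix} \cos^2 2\theta & \sin 2\theta\cos 2\theta & \sin 2\theta \\ \sin 2\theta\cos 2\theta & \sin^2 2\theta & -\cos 2\theta \\ -\sin 2\theta & \cos 2\theta & 0 \end{pmatrix}$$
Operators at $45^\circ$:
$$U_{\lambda/4}(45^\circ) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -j \\ -j & 1 \end{pmatrix}, \qquad R_{\lambda/4}(45^\circ) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix}$$

Half-wave plate
Jones expression: $U_{\lambda/2} = -j(\hat r\cdot\vec\sigma)$
Stokes expression: $R_{\lambda/2} = 2(\hat r\,\hat r\cdot) - I$
Operators at angle $\theta$:
$$U_{\lambda/2}(\theta) = -j\begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix}, \qquad R_{\lambda/2}(\theta) = \begin{pmatrix} \cos 4\theta & \sin 4\theta & 0 \\ \sin 4\theta & -\cos 4\theta & 0 \\ 0 & 0 & -1 \end{pmatrix}$$
Operators at $45^\circ$:
$$U_{\lambda/2}(45^\circ) = -j\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad R_{\lambda/2}(45^\circ) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

Faraday rotator (a)
Jones expression: $U_F = \cos\theta_F\, I \mp j\sigma_3\sin\theta_F$
Stokes expression: $R_F = \cos 2\theta_F\, I + (1 - \cos 2\theta_F)(\hat s_3\,\hat s_3\cdot) \pm \sin 2\theta_F\,(\hat s_3\times)$
Operators at rotation $\theta_F$:
$$U_F(\theta_F) = \begin{pmatrix} \cos\theta_F & \mp\sin\theta_F \\ \pm\sin\theta_F & \cos\theta_F \end{pmatrix}, \qquad R_F(\theta_F) = \begin{pmatrix} \cos 2\theta_F & \mp\sin 2\theta_F & 0 \\ \pm\sin 2\theta_F & \cos 2\theta_F & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Operators at $45^\circ$:
$$U_F(45^\circ) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & \mp 1 \\ \pm 1 & 1 \end{pmatrix}, \qquad R_F(45^\circ) = \begin{pmatrix} 0 & \mp 1 & 0 \\ \pm 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

(a) The (+) and (−) signs refer to the relation of the magnetization vector and Faraday rotation direction of the material. Once a sign is set, it is fixed for both forward and backward propagation.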


References
1. M. Arii, N. Takeda, Y. Tagami, and K. Shirai, "Magneto-optic garnet," U.S. Patent 4,932,760, June 12, 1990.
2. V. Au-Yeung, Q.-D. Gao, and X. L. Wang, "Optical circulator," U.S. Patent 6,331,912, Dec. 18, 2001.
3. M. Bass, Ed., Handbook of Optics: Volume II. New York: McGraw-Hill, Inc., 1995.
4. J. M. Bennett, "A critical evaluation of rhomb-type quarterwave retarders," Applied Optics, vol. 9, pp. 2123–2129, 1970.
5. B. H. Billings, Ed., Selected Papers on Polarization. Bellingham, Washington: SPIE Optical Engineering Press, 1990, vol. MS 23, SPIE Milestone Series.
6. J. Bland-Hawthorn, W. V. Breugel, P. R. Gillingham, I. K. Baldry, and D. H. Jones, "A tunable Lyot filter at prime focus: A method for tracing supercluster scales at z ∼ 1," The Astrophysical Journal, vol. 563, pp. 611–628, Dec. 2001.
7. C. D. Brandle, V. J. Fratello, and S. J. Licht, "Article comprising a magneto-optic material having low magnetic moment," U.S. Patent 5,608,570, Mar. 4, 1997.
8. C. F. Buhrer, "Four waveplate dual tuner for birefringent filters and multiplexers," Applied Optics, vol. 26, no. 17, pp. 3628–3632, 1987.
9. C. F. Buhrer, "Quasi-achromatic optical isolators and circulators using prisms with total internal Fresnel reflection," U.S. Patent 4,991,938, Feb. 12, 1991.
10. S. Cao, J. Chen, J. N. Damask, C. Doerr, L. Guiziou, G. Harvey, Y. Hibino, H. Li, S. Suzuki, K.-Y. Wu, and P. Xie, "Interleaver technology: Comparisons and applications requirements," Journal of Lightwave Technology, vol. 22, no. 1, pp. 281–289, Jan. 2004.
11. "Carpenter Invar 36 alloy," Carpenter Technology Corporation, Wyomissing, Pennsylvania, 1990, edition date 08/01/1990.
12. "Kovar alloy," Carpenter Technology Corporation, Wyomissing, Pennsylvania, 1990, edition date 10/01/1990.
13. J.-H. Chen, K.-W. Chang, K. Tai, H.-W. Mao, and Y. Yin, "Apparatus capable of operating as interleaver/deinterleavers for filters," U.S. Patent 6,333,816, Dec. 25, 2001.
14. "Lithium niobate optical crystals," Crystal Technology, Inc., Palo Alto, CA, 1999. [Online]. Available: http://www.crystaltechnology.com/LN Optical Crystals.pdf
15. J. N. Damask, "Polarization mode dispersion generator," U.S. Patent 2002/0 012 487 A1, Jan. 31, 2002.
16. J. N. Damask, "Composite birefringent crystal and filter," U.S. Patent 6,577,445, June 10, 2003.
17. J. N. Damask, P. R. Myers, G. J. Simer, and A. Boschi, "Methods to construct programmable PMD sources, Part II: Instrument demonstrations," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1006–1013, Apr. 2004.
18. E. Desurvire, Erbium-Doped Fiber Amplifiers, Principles and Applications. Hoboken, New Jersey: Wiley-Interscience, 2002.
19. S. M. Etzel, A. H. Rose, and C. M. Wang, "Dispersion of the temperature dependence of the retardance in SiO2 and MgF2," Applied Optics, vol. 39, no. 31, pp. 5796–5800, Nov. 2000.
20. J. W. Evans, "The birefringent filter," Journal of the Optical Society of America, vol. 39, no. 3, pp. 229–242, 1949.

References

209

21. V. J. Fratello and R. Wolfe, Handbook of Thin Film Devices, Vol. 4: Magnetic
Thin Film Devices. San Diego: Academic Press, 2001, ch. Epitaxial Garnet
Films for Nonreciprocal Magneto-Optic Devices, pp. 93141.
22. C. E. Gaebe, Optical isolator and alignment method, U.S. Patent 5,737,349,
Apr. 7, 1999.
23. D. J. Gauthier, P. Narum, and R. W. Boyd, Simple, compact, high-performance
permanent-magnet faraday isolator, Optics Letters, vol. 11, no. 10, pp. 623625,
1986.
24. E. Hecht, Optics, 2nd ed. Reading, Massachusetts: Addison-Wesley Publishing
Company, 1987.
25. A. J. Heiney and D. K. Wilson, Optical isolators employing oppositely signed
faraday rotating materials, U.S. Patent 5,087,984, Feb. 11, 1992.
26. F. Heismann, Analysis of a reset-free polarization controller for fast automatic
polarization stabilition in ber-optic transmission systems, Journal of Ligthwave Technology, vol. 12, no. 4, pp. 690699, Apr. 1994.
27. K. Hiramatsu, K. Shirai, and N. Takeda, Low magnet-saturation bismuthsubstituted rare-earth iron garnet single crystal lm, U.S. Patent 6,031,654,
Feb. 29, 2000.
28. Hoya glass catalog, Hoya, Incorporated, 2004.
29. Optical Interfaces for Multichannel Systems with Optical Ampliers, International Telecommunication Union Std. ITU-T G.692, Oct. 1998.
30. Spectral Grids for WDM Applications: DWDM Frequency Grid, International
Telecommunication Union Std. ITU-T G.694.1, June 2002.
31. PbMoO4 data sheet, Isomet Corporation, Springeld, Virginia, 2003.
[Online]. Available: http://www.isomet.com/
32. Casix product catalog 2003, JDSU, Inc., Canada, 2003. [Online]. Available:
http://www.casix.com/crystals/birefringentcrystal.htm
33. D. Jones, S. Diddams, J. Ranka, A. Stentz, R. Windeler, J. Hall, and S. Cundi,
Carrier-envelope phase control of femtosecond modelocked lasers and direct
optical frequency synthesis, Science, vol. 288, pp. 635639, 2000.
34. C. J. Koester, Achromatic combinations of half-wave plates, Journal of the
Optical Society of America, vol. 49(4), pp. 405409, Apr. 1959.
35. H. Kuwahara, Optical circulator, U.S. Patent 4,650,289, Mar. 17, 1987.
36. M. Legre, M. Wegmuller, and N. Gisin, Investigation of the ratio between phase
and group birefringence in optical single-mode bers, Journal of Lightwave
Technology, vol. 21, no. 12, pp. 33743378, Dec. 2003.
37. U. Morgner, R. Ell, G. Metzler, T. R. Schibli, F. X. Kartner, J. G. Fujimoto,
H. A. Haus, and E. P. Ippen, Nonlinear optics with phase-controlled pulses in
the sub-two-cycle regime, Physical Review Letters, vol. 86, no. 24, pp. 5462
5465, 2001.
38. Ohara glass catalog, Ohara, Incorporated, Kanagawa, Japan, 2004. [Online].
Available: http://www.oharacorp.com/swf/catalog.html
39. S. Pancharatnam, Achromatic combinations of birefringent plates, Proc. Indian Acad. Sci., vol. A41, pp. 137144, 1955.
40. K. B. Rochford, A. H. Rose, and G. Day, Magneto-optic sensors based on iron
garnets, IEEE Transactions on Magnetics, vol. 32, no. 5, pp. 41134117, 1996.
41. K. B. Rochford, A. H. Rose, M. N. Deeter, and G. W. Day, Faraday eect
current sensor with improved sensitivity-bandwidth product, Optics Letters,
vol. 19, no. 22, p. 1903, Nov. 1994.

210

4 Elements and Basic Combinations

42. K. B. Rochford, A. H. Rose, P. A. Williams, C. M. Wang, I. G. Clarke, P. D.


Hale, and G. W. Day, Design and performance of a stable linear retarder,
Applied Optics, vol. 36, no. 26, pp. 64586465, Sept. 1997.
43. K. B. Rochford, A. Rose, M. Deeter, and G. Day, Faraday eect current sensor
with improved sensitivity-bandwidth product, in Tenth Intl Conference on
Fiber Optic Sensors (SPIE 2360), B. Culshaw and J. Jones, Eds., Glasgow,
Scotland, Oct. 1994, pp. 3235.
44. Optical glass description of properties, Schott Glass, Mainz, Germany, Sept.
2000, version 1.2. [Online]. Available: http://www.us.schott.com/sgt/english/
products/catalogs.html
45. Optical glass properties, Schott Glass, Mainz, Germany, Sept. 2000, version 1.2. [Online]. Available: http://www.us.schott.com/sgt/english/products/
catalogs.html
46. Synthetic fused silica, Schott Lithotec AG, Mainz, Germany, 2001. [Online].
Available: http://www.schott.com/lithotec/english/products/fused silica.html
47. K. Shirai, K. Ishikura, and N. Takeda, Bismuth-substituted rare earth iron
garnet single crystal, U.S. Patent 5,565,131, Oct. 15, 1996.
48. , Low saturated magnetic eld bismuth-substituted rare earth iron garnet
single crystal and its use, U.S. Patent 5,512,193, Aug. 30, 1996.
49. K. Shirai, M. Sumitani, N. Takeda, and M. Arii, Optical isolator, U.S. Patent
5,278,853, Jan. 11, 1994.
50. K. Shirai, T. Takano, N. Takeda, and M. Arii, Polarization-independent optical
isolator, U.S. Patent 5,345,329, Sept. 6, 1994.
51. K. Shirai and N. Takeda, Faraday rotator, U.S. Patent 5,535,046, July 9, 1996.
52. , Bismuth-substituted rare earth iron garnet single crystal lm, U.S.
Patent 5,925,474, July 20, 1999.
53. K. Shirai, N. Takeda, and K. Hiramatsu, Faraday rotator having a rectangular
shaped hysteresis, U.S. Patent 5,898,516, Apr. 27, 1999.
54. M. Shirasaki, Prism polarizer, U.S. Patent 4,392,722, July 12, 1983.
55. , Polarization rotation compensator and optical isolator using the same,
U.S. Patent 4,712,880, Dec. 15, 1987.
56. M. Shirasaki and K. Asama, Compact optical islator for bers using birefringent wedges, Applied Optics, vol. 21, no. 23, pp. 42964299, 1982.
57. M. Shirasaki, H. Kuwahara, and T. Obokata, Compact polarization-independent optical circulator, Applied Optics, vol. 20, no. 15, pp. 26832687, Aug.
1981.
58. Generic Reliability Assurance Requirements for Passive Optical Components,
Telcordia Technologies Std. GR-1221-CORE, 1999. [Online]. Available:
http://telecom-info.telcordia.com/site-cgi/ido/index.html
59. S. Thaniyavarn, Wavelength-independent polarization converter, U.S. Patent
4,691,984, Sept. 8, 1987.
60. G. R. Walker and N. G. Walker, Polarization control for coherent communications, Journal of Ligthwave Technology, vol. 8, no. 3, pp. 438458, Mar.
1990.
61. P. Xie and Y. Huang, Compact polarization insensitive circulators with simplifed structure and low polarization mode dispersion, U.S. Patent 6,052,228,
Apr. 18, 2000.

5
Collimator Technologies

Fiber-optic collimation and focusing assemblies, together known as collimators, are used to
launch a beam of light from an optical fiber into free space and then to capture that light
and refocus it into the same or another fiber. Collimator technologies are necessary whenever
the gap introduced by optical component cores exceeds several hundred microns. For example,
the numerical aperture of Corning single-mode fiber (SMF-28) is N.A. ≈ 0.14, which corresponds
to a 16° full-angle beam expansion cone. Any fiber-to-fiber separation over a distance of
several hundred microns introduces severe insertion loss into the system. A collimator
assembly adds a lens between the fiber termination and free space region to allow the light
to travel tens or hundreds of millimeters without severe loss penalties.
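The relation between numerical aperture and cone angle quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming propagation in air and a simple ray-optics cone; the function names and the sample gaps are illustrative choices and do not come from the text.

```python
import math

def full_cone_angle_deg(numerical_aperture: float) -> float:
    """Full-angle divergence cone in air: twice the half-angle asin(N.A.)."""
    return 2.0 * math.degrees(math.asin(numerical_aperture))

def geometric_beam_diameter_um(numerical_aperture: float, gap_um: float) -> float:
    """Ray-optics estimate of the beam diameter after an uncollimated gap."""
    half_angle = math.asin(numerical_aperture)
    return 2.0 * gap_um * math.tan(half_angle)

NA_SMF28 = 0.14  # value quoted in the text

print(f"full cone angle: {full_cone_angle_deg(NA_SMF28):.1f} deg")
for gap_um in (100.0, 300.0, 1000.0):  # illustrative gaps, not from the text
    diameter = geometric_beam_diameter_um(NA_SMF28, gap_um)
    print(f"gap {gap_um:6.0f} um -> beam diameter ~ {diameter:.0f} um")
```

Compared with a mode-field diameter near 10 μm, a beam that has spread to many tens of microns overlaps the receiving fiber mode poorly, which is the qualitative reason a lens is needed.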
Collimators are essential elements for micro-optic packaging of isolators, circulators,
interleavers, thin-film filters, free-space dispersion compensators, and free-space
polarization-based components. The lengths of these exemplar components require end-to-end
beam coupling that ranges from 5 mm to 150 mm. A collimator assembly and lensing design can
be optimized for any particular gap, but in general the larger the gap, the greater are the
sensitivities to wavefront distortion.
There are two axes in the collimator technology selection matrix, one axis being the choice
of lens and the other axis the choice of assembly. The two predominant lenses are Graded-Index,
or GRIN, lenses, and shaped lenses, often called C-lenses (Fig. 5.1). A third type of lens
that has some applications for micro-optic components is the Gradium lens, but these lenses
are not ubiquitous. The two predominant collimator assemblies are air-gap collimators and
fused collimators. Air-gap collimators have a physical air gap between the fiber termination
and the lens entry surface; the adjacent surfaces are AR coated to maximize transmission.
Fused collimators fusion-splice the fiber directly to an index-matched lens. A third assembly
type, used early in the development chronology and now obsolete, is the epoxy-joint collimator
where the fiber and lens were directly attached with a bead of epoxy in the light path.

Fig. 5.1. Collimation of light from a fiber core by a) a shaped lens, and b) a GRIN
lens. The wave-front curvature is eliminated by the curved surface of the shaped
lens and progressively by the lateral index gradient of the GRIN lens.

Across the technology selection matrix there is a common set of optical, mechanical, and
manufacturing-related design goals; these design goals are captured in Table 5.1. Insertion
loss is typically reported by manufacturers as loss through a collimator pair. The high return
loss is necessary to avoid coherent interference in a long chain of components. Pointing
accuracy identifies the angular offset between the collimator housing and the axis of the
exiting beam. WDM and spike power handling refer, respectively, to continuous operation and
to the collimator's ability to tolerate a large surge of optical power, such as when an
amplifier chain is cut or when a signal is first applied to the chain.
The mechanical goals include the means to attach and fix the collimator assembly to a
micro-optic package. Most packages must be hermetically sealed to pass the lifetime
requirements for passive components [21]. Accordingly, these packages are made of metal and
all seams are soldered or welded. The collimator must conform to these hermeticity
requirements as well. A collimator must maintain its integrity over a wide temperature range,
which often includes a shipping range and an operational range. The pull tolerance refers to
the handleability of the part and in particular to the avoidance of separation between the
fiber pigtail and the lens/ferrule assembly. Lastly, as miniaturization is a demand on all
micro-optic components, the collimator assembly must be small.
Finally, any collimator design must be manufacturable with high yield at low cost. The design
must be, in a word, simple. As of this writing, most collimators are manually assembled. There
is some push toward automation of the assembly process, but it is not clear that the
tolerances achieved from automation are required, superior, or cost effective.
The following sections provide theoretical and practical detail regarding collimator optics
and technologies. The purpose of the theoretical analyses is to enable back-of-the-envelope
estimates of component layout, dimensions, and lens selection. All truly accurate designs must
be computed with commercially available ray-trace and beam propagation software. These
numerical tools capture the effects of aberrations and provide the necessary level of
tolerancing required for a robust design. The purpose of the technical overview is to
highlight the central issues and challenges concerning collimator assemblies.

Table 5.1. Common Collimator Design Goals

Category        Specification           Goal

Optical         Low insertion loss      < 0.25 dB (a)
                High return loss        > 60 dB
                Pointing accuracy       1°
                WDM power handling      > 100 mW
                Spike power handling    > 1 W

Mechanical      Housing fixity          Solder or weld
                Temperature stability   -20 to +60 °C
                Pull tolerance          5 N
                Small diameter          < 4 mm
                High reliability        Pass Telcordia qualifications (b)

Manufacturing   High yield              > 90 %
                Simple process
                Low cost

(a) Measured through collimator pair.
(b) Telcordia GR-1221-CORE [21].

5.1 Collimator Assemblies


This section details the assembly technologies of collimators. There are three assembly
types: the epoxy-joint, the air-gap, and the fused-joint collimators. All assemblies seek to
fix the position between a polished end of one or more single-mode fibers and a micro-optic
lens. Moreover, all assemblies must be fixable to the metal component housing.
Epoxy-Joint Collimators
The epoxy-joint assembly, illustrated in Fig. 5.2, is the earliest type of integrated package
though now obsolete. The three key elements are the fiber ferrule, the lens, and the sleeve.
The ferrule is a quartz cylinder specially manufactured so that a wet chemical etch opens a
capillary tube lengthwise down the center. One or more unjacketed single-mode fibers are
inserted through the tube and epoxied into place with heat-curing epoxy. A typical heat-curing
epoxy is 353ND manufactured by Epoxy Technology, Inc., in Billerica, MA. Typically a one-hour
cure at 85 °C completely fixes the fiber and ferrule together. The fiber end(s) are then
clipped and the end face is polished so that the fiber and ferrule terminate on the same plane.

Fig. 5.2. Epoxy-joint collimator. Fiber is threaded through ferrule and fixed with
heat-curing epoxy. Ferrule and fiber end face are then polished at an angle (6–8°).
Ferrule and lens (with angle-polished facet) are aligned and set with UV-curing
epoxy. The assembly is inserted into a metal sleeve (the sleeve may or may not cover
the ferrule/lens joint) and fixed with heat-curing epoxy. A strain-relief elastomer is
added around the exposed fiber to increase pull tolerance. Final assembly is soldered
to micro-optic package.
[Figure labels: AR, fiber, strain-relief elastomer, lens, heat epoxy, ferrule, UV epoxy, metal sleeve.]

Fig. 5.3. Air-gap collimator. Lens and ferrule are aligned within a glass insulator
sleeve. Inner facets of lens and ferrule are polished at an angle (6–8°) and subsequently
AR coated to limit back reflection. The gap is adjusted for optimal position
and then tacked with UV epoxy around the perimeter of the assembly but not within
the gap. Assembly is further fixed with heat-curing epoxy. Assembly is then loaded
into metal sleeve and fixed with heat-curing epoxy. Final assembly is soldered to
micro-optic package.
[Figure labels: AR, air gap, AR coatings, glass insulator, metal sleeve.]

Fig. 5.4. Fusion-joint collimator. One or two fibers are directly fused to an
index-matched lens. The fiber is threaded through a glass stabilizer tube for mechanical
integrity. Neither angled facets nor AR coatings are required. The assembly is loaded
into a metal sleeve and fixed with heat-curing epoxy. Final assembly is preferably
laser-welded to micro-optic package.
[Figure labels: AR, glass stabilizer tube, fusion splice, index-matched lens.]

All early collimator assemblies used GRIN lenses because of their small size and
availability. For this generation and the ones to follow, the outer face of the lens is
anti-reflection coated to increase transmission and reduce back reflection. The pitch of the
GRIN lens (defined by (5.2.44) on page 232) is generally selected as P = 0.23 [18, 24], where
a pitch of P = 0.25 is the theoretical choice for a collimating lens. The small reduction of
the pitch serves two purposes. First, it is a practical step necessary to allow for a small
gap between the front face of the ferrule and the back face of the lens. A quarter-pitch lens
requires the fiber end face to be positioned directly on the back face of the lens. Second,
in recognition that the fiber core is not a point source, rays emitted from the edge of the
core are not over-collimated with a P = 0.23 lens.
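The link between a reduced pitch and the gap it opens behind the lens can be estimated with a paraxial ray matrix. The sketch below assumes the common convention that a quadratic-index rod of length L and gradient constant g has pitch P with gL = 2πP, and that both faces are flat and normal to the axis (the angle polish is ignored). The on-axis index n0 = 1.59 and the gradient constant g = 0.33 per mm are hypothetical values chosen only for illustration; they are not taken from the text.

```python
import math

def grin_collimating_gap_mm(pitch: float, n0: float, g_per_mm: float) -> float:
    """Air gap between an on-axis point source and the GRIN back face that
    yields a collimated output, in the paraxial ray-matrix model.

    For a point source at distance d, the output ray angle is
        theta_out = (cos(gL) - n0 * g * d * sin(gL)) * theta_in,
    which vanishes when d = 1 / (n0 * g * tan(gL)), with gL = 2*pi*pitch.
    """
    gl = 2.0 * math.pi * pitch
    return 1.0 / (n0 * g_per_mm * math.tan(gl))

N0 = 1.59        # hypothetical on-axis index
G_PER_MM = 0.33  # hypothetical gradient constant, 1/mm

for pitch in (0.25, 0.24, 0.23):
    gap = grin_collimating_gap_mm(pitch, N0, G_PER_MM)
    print(f"P = {pitch:.2f} -> collimating gap ~ {gap * 1000:.0f} um")
```

Under these assumptions a quarter-pitch lens places the source on the lens face, while P = 0.23 leaves a gap of a few hundred microns, consistent with the first purpose described above.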
A ferrule end face that is perpendicular to the core creates high Fresnel reflection which
leads to unacceptable back reflection into the fiber. To reduce the back reflection the
ferrule is polished with a tilt angle, typically in the range of 6–8°. The back face of the
GRIN lens is likewise polished. The early designs used the same angle of polish for the
ferrule and lens, accounting neither for an unintended Fabry-Perot cavity nor for the
refraction difference due to the different indices of the fiber and lens.
The epoxy-joint collimator assembly fixes the ferrule to the lens with a UV epoxy. Prior to
the UV shot, the ferrule is positioned with positioning stages to the appropriate location
behind the lens. Reference [24] cites the OG154 UV-curing epoxy from Epoxy Technology. It is
interesting to record the indices at 1.55 μm: for SMF-28 fiber n_e = 1.4682 [9], for OG154
n ≈ 1.545, and for GRIN lens n_o ≈ 1.57–1.61 [15]. The UV epoxy does a better job of index
matching the fiber to the lens than would air, but the residual difference creates excess
loss. After the UV tack, a heat-curing epoxy is painted around the joint and the assembly is
heat cured. As a final step a metal sleeve is slipped around the assembly and heat-cured into
place. Early sleeves were stainless steel [18] with a gold plating for better soldering
ability to a metal housing.
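The index values above make it possible to compare interface reflections directly. The following is a minimal sketch using the normal-incidence Fresnel formula for a single interface; it ignores the angle polish, the Fabry-Perot cavity between the two interfaces, and the guided-mode profile. The fiber value is the effective index quoted in the text, used here as a stand-in for the glass index, and n = 1.59 for the GRIN lens is an assumed midpoint of the quoted range.

```python
import math

def fresnel_reflectance(n1: float, n2: float) -> float:
    """Power reflectance of a single interface at normal incidence."""
    return ((n1 - n2) / (n1 + n2)) ** 2

def reflectance_db(reflectance: float) -> float:
    """Reflected power expressed as a positive return loss in dB."""
    return -10.0 * math.log10(reflectance)

N_FIBER = 1.4682  # SMF-28 value quoted in the text
N_EPOXY = 1.545   # OG154 value quoted in the text
N_GRIN = 1.59     # assumed midpoint of the quoted 1.57-1.61 range
N_AIR = 1.0

interfaces = {
    "fiber / air": (N_FIBER, N_AIR),
    "fiber / epoxy": (N_FIBER, N_EPOXY),
    "epoxy / GRIN": (N_EPOXY, N_GRIN),
}
for name, (n1, n2) in interfaces.items():
    r = fresnel_reflectance(n1, n2)
    print(f"{name:14s} R = {100 * r:.3f} %  ({reflectance_db(r):.1f} dB)")
```

The epoxy interfaces reflect far less than a bare fiber-to-air facet, yet neither is zero, which is the residual mismatch the paragraph refers to.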
The problems with epoxy-joint collimators are many. First, the UV epoxy in the optical path
severely degrades the reliability of the component, especially under the damp-heat tests
specified by [21]. UV epoxies in general have low humidity resistance as compared with
heat-curing epoxies. In particular, the epoxy-joint collimator is reported to tolerate the
85–85 test (85% humidity at 85 °C) for only 350 hours [25]. Second, the attachment of the
ferrule and lens directly to the metal sleeve does not provide enough heat resistance when
the collimator is subsequently soldered to a component housing. All early attachments were
done by hand and overheating the part was easy. The epoxy in the optical path cannot tolerate
severe heating and breaks down, both darkening the light path and impairing its bonding
strength. Third, the epoxy that is in the optical path exhibits an expansion coefficient and
a temperature-dependent refractive index. An insertion loss variation of 0.12 dB is reported
over the temperature range of 0–80 °C [25]. Finally, the power-handling ability of any epoxy
is weak. For a mode-field diameter of 10 μm at the fiber core, the power density is
12.5 MW/m²/mW. Even in CW operation the epoxy is exposed to a large power density. Moreover,
power transients that generate spikes will irreversibly darken the material. In view of the
design goals stated in Table 5.1, few conditions are achieved with the epoxy-joint technology.
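The power-density figure can be reproduced to within rounding by dividing the power by the area of a disc whose diameter equals the mode-field diameter. The sketch below makes that uniform-disc assumption, which is a simplification of the true mode profile; it gives about 12.7 MW/m² per milliwatt, close to the value quoted in the text. The 100 mW and 1 W cases are the WDM and spike levels listed in Table 5.1.

```python
import math

def power_density_mw_per_m2(power_mw: float, mode_field_diameter_um: float) -> float:
    """Average power density in MW/m^2, assuming the power is spread
    uniformly over a disc of diameter equal to the mode-field diameter."""
    radius_m = 0.5 * mode_field_diameter_um * 1e-6
    area_m2 = math.pi * radius_m ** 2
    power_w = power_mw * 1e-3
    return power_w / area_m2 / 1e6

MFD_UM = 10.0  # mode-field diameter quoted in the text

for power_mw in (1.0, 100.0, 1000.0):
    density = power_density_mw_per_m2(power_mw, MFD_UM)
    print(f"{power_mw:7.0f} mW -> {density:10.1f} MW/m^2")
```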
Air-Gap Collimators
Figure 5.3 illustrates the second generation of collimator assembly, an assembly that has few
drawbacks and is ubiquitously used today. The three achievements of the air-gap assembly are
that the in-path epoxy is removed, the index mismatch between fiber and lens to air is
mitigated with AR coatings, and the overheating from soldering is reduced using an
intermediate quartz sleeve. Additionally, some augmentations are made taking advantage of
better technology and understanding. In particular, the lens selection today includes both
GRIN and shaped lenses, and the polished facets of the ferrule and lens can be tuned to
account for the fiber/lens index difference.
To assemble an air-gap collimator, an angle-polished and AR-end-coated lens is inserted into
a quartz sleeve and bonded into place with a UV tack around the perimeter. The UV exposure
lasts only seconds. The likewise-prepared ferrule is inserted into the other end of the quartz
sleeve and adjusted to the correct location. The correct location is determined by actively
monitoring the beam collimation or spot size while illuminating a target with laser light
that is transmitted through the fiber and lens. The ferrule is tacked into place with UV
epoxy applied around the perimeter. The sub-assembly is then coated with a heat-curing epoxy
and cured. This sub-assembly is then inserted into a gold-plated metal sleeve and fixed into
place with heat-curing epoxy. To finish the component and increase pull strength on the fiber
end, a bead of strain-relief elastomer is added around the exposed fiber and cured in place.
The quartz sleeve has two advantages. First, the sleeve simplifies the alignment process by
eliminating all degrees of freedom except the gap between the parts and the azimuth angle.
Second, the quartz adds a heat-resistant layer between the metal housing and the ferrule/lens
joint. Accordingly, the soldering process degrades the collimator epoxies less, especially as
no UV epoxy is relied on as the primary bonding agent.
In relation to the aforementioned design goals of any collimator, the air-gap collimator
answers most demands with good performance. Table 5.2 shows a favorable comparison of the
air-gap technology to the epoxy-joint technology (as well as the fused-joint technology, to
be detailed shortly). Also, the air-gap column compares favorably with the design goals in
Table 5.1. As an example, the temperature dependence of the insertion loss is less than
0.05 dB over 0–80 °C and the air-gap collimators can tolerate the 85–85 damp-heat stress test
for at least 2500 hours [25].

An important characteristic of epoxy-joint and air-gap collimators is that the pointing
direction of the output beam is generally not parallel to the mechanical axis of the housing.
This is because of the angle polish of the ferrule. As shown in Fig. 5.13 for a GRIN lens and
Fig. 5.12 for a shaped lens, the output beam is deflected due to the offset of the input beam
from the centerline of the lens. This effect is unavoidable, and in fact exacerbated for
long-working-distance lenses that require large beam expansion before collimation. The
specification of pointing accuracy addresses this effect. Solutions have been proposed such
as laterally offsetting the ferrule from the centerline of the lens [17], adding a wedge
between the ferrule and lens [19], or adjusting the polish angle of the lens to account for
its index so that the refracted beam is straightened albeit offset nonetheless [5]. The last
solution is defined by (5.3.1) on page 235. The first two solutions make manufacture more
difficult either from creating an asymmetric part or adding another element. The third
solution is probably the best to mitigate yet not eliminate the problem. However, it should
be recognized that a zero pointing error enhances back reflection. Depending on the focal
length and type of lens, it seems that the optimal pointing deflection is between 0.5° and
1.0°.
For reasons of manufacturability there is a natural division between use of GRIN and shaped
lenses. The GRIN lens is suitable for short working distances because the beam does not
require much expansion to counter diffraction. However, larger working distances require a
larger beam which in turn requires a larger lens diameter. The manufacturing process of a
GRIN lens, typically a chemical vapor deposition process, cannot sufficiently control the
quadratic curvature of the index profile, nor can it introduce too much sag [14], which
together limit the practical size of the lens. The shaped lens is in fact suitable for long
working distances because the outside surface can be ground or molded to sufficient
tolerances, and in fact the surface can approximate an asphere to mitigate spherical
aberration. It is the short-working-distance shaped lens that is difficult to manufacture
because of the requisite small radius of curvature. The choice of lens therefore is roughly
divided depending on application: short-reach collimators use GRIN lenses and long-reach
collimators use aspheric lenses.
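The scaling of beam size with working distance can be estimated from standard gaussian-beam relations. The sketch below assumes a symmetric collimator pair with the beam waist at mid-gap and asks which waist radius minimizes the beam radius at the lenses; minimizing w² = w0² + (λd/2πw0)² gives a Rayleigh range equal to half the working distance. This is a textbook estimate offered as an illustration, not a design rule from the text, and the 1.55 μm wavelength is assumed.

```python
import math

def optimal_waist_radius_m(working_distance_m: float, wavelength_m: float = 1.55e-6) -> float:
    """Waist radius that minimizes the beam radius at both lenses of a
    symmetric pair separated by working_distance_m (Rayleigh range = d/2)."""
    return math.sqrt(wavelength_m * working_distance_m / (2.0 * math.pi))

def beam_radius_at_lens_m(working_distance_m: float, wavelength_m: float = 1.55e-6) -> float:
    """Beam radius one Rayleigh range from the waist: sqrt(2) times the waist."""
    return math.sqrt(2.0) * optimal_waist_radius_m(working_distance_m, wavelength_m)

for distance_mm in (5.0, 20.0, 150.0):  # 5 mm and 150 mm are the limits quoted in the text
    d = distance_mm * 1e-3
    w0_um = optimal_waist_radius_m(d) * 1e6
    w_lens_um = beam_radius_at_lens_m(d) * 1e6
    print(f"d = {distance_mm:5.0f} mm -> waist {w0_um:5.0f} um, radius at lens {w_lens_um:5.0f} um")
```

The required beam radius grows as the square root of the working distance, so the span from 5 mm to 150 mm calls for roughly a fivefold larger beam and a correspondingly larger clear aperture.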
Fused-Joint Collimators
Figure 5.4 illustrates the third generation of collimator assembly, the fused-joint
collimator. Unlike the preceding collimator assemblies, the fiber is not attached to a
ferrule but directly fused to the lens. A glass stabilizer tube through which the fiber is
threaded is used to add mechanical integrity. The advantages of a fused joint over the air
gap with AR coatings are the former's power-handling ability and environmental stability. It
is reported that the fused-joint collimator design can handle 10 W of optical power [13], a
factor of 20 increase over the power-handling ability of AR coatings. And clearly the direct
fusion eliminates issues of gap change with temperature and lifetime. The barriers to high
performance of a fused-joint collimator are the ability to fuse the fiber to the lens and the
index matching of the fiber mode to the lens material.
Regarding the fusion process there are several barriers: the melt points of the lens and
fiber must be similar; the temperature coefficients of the lens and fiber must be similar;
and enough heat must be added to the lens, having a large thermal mass, to cross the glass
transition temperature while not melting away the much smaller fiber. The melting point of a
typical GRIN lens is 500–600 °C and that of quartz glass fiber is 1700 °C. Either the GRIN
material set must be changed or a shaped lens made from quartz must be used to avoid
deformation. Both concepts have been demonstrated [14]. As with the melt point difference,
similar material systems must be used for fiber and lens to avoid cracking and fracture of
the fused joint during cooling.
Finally, a clever way to deliver the heat without deformation of either part is necessary for
the fusion splice. One proposal is to add an intermediate coupling rod that is larger in
diameter than the fiber yet smaller than the lens [22], but this adds more steps and more
parts. The more elegant demonstration is with direct fiber-to-lens fusion using a CO2
laser [24]. In the reported process, a fiber is threaded through a narrow slot cut in a
mirror tailored for high reflectivity of CO2 radiation. The mirror is tilted to 45° with
respect to the back face of the lens. The laser beam is focused to the back face and heats
only in proximity of the fiber placement. After a short pre-heat step the lens and fiber are
fused. The fusion happens within seconds, so active alignment is not required during
fixation. Moreover, the high temperature eliminates all contaminants yet the low amount of
heat allows the front face of the lens to be AR coated in advance. Optimization of the
process includes the creation of a thin (2 μm) melt layer on the lens to overcome any
residual Fresnel reflection. Measurement of return loss can validate the quality of the joint.
The second barrier is the index matching of the fiber to the lens. A shaped lens can be made
of fused silica without difficulty. This is the basis of some products [13]. However, a large
effort has gone into the fabrication of index-matched GRIN lenses [1] with good success.
These lenses are made with a plasma-enhanced chemical vapor deposition (PECVD) technology
using a Ge:SiO2 material system. The PECVD process controls the radial uniformity through
rotation of the preform rod and controls the longitudinal uniformity through programmed
up-and-down motion of the plasma along the rod length. After deposition the rod is
consolidated at 2000 °C and collapses to a radius of 0.3–1.2 mm, depending on application.
There are two more characteristics that need to be addressed for fused-joint collimators.
First, even though the fiber is directly attached to the lens, the fiber must be slightly
offset from center to avoid back-reflection from the front lens surface. A typical offset is
5 μm. With this offset the spot of the slightly reflected beam is offset from the fiber core.
The resultant pointing deflection is typically 0.5°. Second, as all epoxy is removed from the
fiber-to-lens attachment, the insulating glass sleeve used in the air-gap collimator assembly
may be eliminated for the fused-joint assembly. This reduces the diameter of the collimator
which increases part density.

The third column of Table 5.2 includes specifications to compare with air-gap and epoxy
collimators. While the fused-joint collimator looks attractive for a few of the
aforementioned reasons, the reported performance of fused-joint and air-gap collimators is
about the same save the power-handling ability.

Table 5.2. Collimator Technology Comparison

Specification           Epoxy joint          AR-coated air gap            Fusion joint

Optical
Low insertion loss      med (a)              < 0.2 dB @ 20 mm             < 0.2 dB @ 20 mm
                                             < 0.5 dB @ 150 mm (b,c)      < 1.0 dB @ 150 mm (d)
High return loss        low (a)              > 60 dB (b,c)                > 55 dB, > 75 dB (d,e)
Pointing accuracy       med (a)              1.0 deg (c)                  0.5 deg (d)
WDM power handling      low (a)              0.5 W (b,c)                  10 W (d)
Spike power handling    low (a)              n/a                          > 10 W (d)

Mechanical
Temperature stability   low (a)              0 to +70 °C (b,c)            -20 to +60 °C (d)
Pull tolerance          med (a)              5 N (b,c)                    high (d)
                                             Pass Telcordia (f)
Housing fixity          low (a)              high                         high
                        lens, ferrule        intermediate glass tube      lens, ferrule attached
                        direct attachment    between lens, ferrule        directly to housing in
                        to housing           and housing                  anticipation of laser
                                                                          welding

Manufacturing
UV cures                in optical path      tack                         no
Heat cures              in housing           in housing                   in housing
Fusion splicing         no                   no                           yes
AR coatings             lens output face     ferrule, two lens faces      lens output face
Active alignment        yes                  yes                          no

(a) US 6,148,126 [24].
(b) Koncent single-fiber collimator [11].
(c) Koncent long-working-distance collimator [11].
(d) LightPath Generation 3 collimator [13].
(e) LightPath [3].
(f) Casix Reliability Report [6, 7].

5.2 Gaussian Optics


Gaussian optics is an analytic formalism that is a reasonable alternative to numerical ray
tracing techniques when a component design is being roughed out. Gaussian optics addresses
the adiabatic expansion of an optical beam as it propagates in isotropic media, covers the
focusing properties of shaped and GRIN lenses, and can estimate the mode coupling from one
fiber to another. However, gaussian optics does not include lens aberration theory nor the
true mode profiles of fiber-guided modes. In particular, the profile of a guided mode in
single-mode fiber is a Bessel function, but that profile is approximated as a gaussian beam
with a certain diffraction angle. Once the gaussian-mode envelope function is derived,
subsequent augmentation produces the ABCD ray tracing matrices, which are useful for
analytically tracing a paraxial ray through a cascade of optical elements.
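As an illustration of the kind of estimate gaussian optics allows, the sketch below computes the coupling loss between two identical, perfectly aligned fibers separated by an air gap with no lens. It uses the standard gaussian-beam overlap result for a longitudinal offset, 1 / (1 + (z / 2z_R)²), where z_R is the Rayleigh range. The waist radius of 5 μm is taken as half the 10 μm mode-field diameter quoted earlier in the chapter, the wavelength of 1.55 μm is assumed, and reflections at the facets are ignored.

```python
import math

def rayleigh_range_um(waist_radius_um: float, wavelength_um: float) -> float:
    """Rayleigh range of a gaussian beam in free space."""
    return math.pi * waist_radius_um ** 2 / wavelength_um

def gap_loss_db(gap_um: float, waist_radius_um: float, wavelength_um: float) -> float:
    """Coupling loss between two identical gaussian modes separated by a
    longitudinal gap, with no lateral or angular misalignment."""
    z_r = rayleigh_range_um(waist_radius_um, wavelength_um)
    coupling = 1.0 / (1.0 + (gap_um / (2.0 * z_r)) ** 2)
    return -10.0 * math.log10(coupling)

WAIST_UM = 5.0        # half of the 10 um mode-field diameter
WAVELENGTH_UM = 1.55  # assumed operating wavelength

print(f"Rayleigh range ~ {rayleigh_range_um(WAIST_UM, WAVELENGTH_UM):.0f} um")
for gap_um in (50.0, 100.0, 300.0, 1000.0):
    print(f"gap {gap_um:6.0f} um -> loss ~ {gap_loss_db(gap_um, WAIST_UM, WAVELENGTH_UM):.1f} dB")
```

With a Rayleigh range of only tens of microns, the estimated loss reaches several decibels once the gap is a few hundred microns, in line with the motivation for collimators given at the start of the chapter.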
In Chapter 1 plane wave solutions were sought for the wave equation (1.1.6) of the electric
field. These solutions are valid when the source of those fields can be considered at
infinity. In contrast, a mode that emerges from a fiber into free space has an aperture at
the fiber/free-space boundary. The emerging field has a spherical phase front across its
leading edge. As discussed in §1.2 on page 8 the vector potential is required to find field
solutions from point sources or, in this case, approximations of point sources. In
particular, the vector and scalar wave equations (1.2.9) govern the field evolution.
As a starting point, consider a vector potential trial solution
$$\mathbf{A}(\mathbf{r}, t) = \hat{n}\,\psi(x, y, z)\, e^{j\omega t} \tag{5.2.1}$$
where $\hat{n}$ is a unit vector denoting a single pointing direction and $\psi$ is a scalar field absent the fast oscillation $\exp(j\omega t)$ term. Substitution of (5.2.1) into (1.2.9a) yields a time-harmonic scalar wave equation
$$\left(\nabla^2 + k^2\right)\psi = 0 \tag{5.2.2}$$
where the wavenumber $k$ is defined as usual: $k^2 = \omega^2 \mu_o \varepsilon_o$. To this point the implications of the trial solution (5.2.1) have been exact. The next step establishes the paraxial approximation where change of the field is predominantly along the direction of propagation and only small changes occur in the transverse direction. The k vector is accordingly approximated as $k \simeq k_z$, or
$$k_z = \sqrt{k^2 - (k_x^2 + k_y^2)} \simeq k - \frac{k_x^2 + k_y^2}{2k} \tag{5.2.3}$$
The above approximation is called the paraxial limit of the k-vector. An envelope function $u$ is defined after removing the fast $\exp(-jkz)$ dependence of the field:
$$\psi = u(x, y, z)\, e^{-jkz} \tag{5.2.4}$$
Substitution of (5.2.4) into (5.2.2) generates the paraxial wave equation
$$\nabla_T^2 u - 2jk\,\frac{\partial u}{\partial z} = 0 \tag{5.2.5}$$
where the approximation $|\partial^2 u/\partial z^2| \ll k\,|\partial u/\partial z|$ eliminates terms of order $\partial^2 u/\partial z^2$.

Construction of formal solutions to (5.2.5) requires Fresnel and Fraunhofer diffraction theorems. It serves the present purposes to propose a trial solution [23] and determine a priori unknown constants through substitution into (5.2.5). For a fundamental gaussian mode, rather than higher-order Hermite-gaussian modes, the trial solution to the scalar envelope is
$$u = u_o \exp\left\{-j\left[p(z) + \frac{k(x^2 + y^2)}{2q(z)}\right]\right\} \tag{5.2.6}$$
For the interested reader, the component $p(z)$ comes from the Fraunhofer diffraction integral and the component $k(x^2+y^2)/2q(z)$ comes from the Fresnel kernel [10, 12]. Substitution of (5.2.6) into (5.2.5) generates the parametric equation
$$-2k\left[p'(z) + \frac{j}{q}\right] + \frac{k^2(x^2 + y^2)}{q^2(z)}\left[q'(z) - 1\right] = 0$$
where primes denote differentiation with respect to $z$. Solutions to this parametric equation are
$$p' = -\frac{j}{q}, \quad \text{and} \quad q' = 1$$
Integration of $q' = 1$ generates $q(z) = z + c_a$, where $c_a$ is a constant of integration yet to be determined. With this general solution, $p'(z)$ is integrated. The two general solutions are
$$p(z) = -j \ln(z + c_a) + c_b, \quad \text{and} \quad q(z) = z + c_a$$
Choosing $c_a$ as real only produces an offset of $q(z)$. Alternatively, choosing $c_a$ as imaginary introduces an orthogonal coordinate that can be used to scale the field solutions to the initial aperture. With the solution
$$q(z) = z + jb \tag{5.2.7}$$
the parameter $p(z)$ is determined as
$$p(z) = -\tan^{-1}\left(\frac{z}{b}\right) - j \ln\sqrt{1 + \left(\frac{z}{b}\right)^2}$$
Pulling these parameters together, the unnormalized envelope solution is
$$u(x, y, z) = \frac{u_o}{\sqrt{1 + (z/b)^2}}\, \exp\left[j\tan^{-1}(z/b)\right] \exp\left[-j\frac{kz(x^2 + y^2)}{2(z^2 + b^2)}\right] \exp\left[-\frac{kb(x^2 + y^2)}{2(z^2 + b^2)}\right] \tag{5.2.8}$$
The expression for $b$ is determined by identifying the $e^{-2}$ power decay of the mode at $z = 0$ (which is $e^{-1}$ decay of the field):
$$u(x, y, 0) = u_o \exp\left[-\frac{k(x^2 + y^2)}{2b}\right] \tag{5.2.9}$$
where, in the transverse direction at radius $x^2 + y^2 = w_o^2$,
$$\frac{k(x^2 + y^2)}{2b} = 1$$
or
$$b = \frac{k w_o^2}{2} = \frac{\pi w_o^2}{\lambda} \tag{5.2.10}$$
The parameter $b$ is called the confocal parameter or the Rayleigh length and is related to the minimum beam radius $w_o$ according to the above equation. The unit of $b$ is length. In order to complete the trial envelope solution the field must be normalized. Normalization can be calculated at any point $z$ since there is no loss or gain in the system. Normalization of $u(x, y, z)$ such that
$$\iint |u(x, y, 0)|^2 \, dx\, dy = 1$$
yields $u_o = \sqrt{k/\pi b}$. Pulling together all the pieces, the envelope solution to the scalar wave equation is
$$u(x, y, z) = \sqrt{\frac{2}{\pi w^2}}\, \exp(j\phi)\, \exp\left(-\frac{x^2 + y^2}{w^2}\right) \exp\left[-j\frac{k(x^2 + y^2)}{2R}\right] \tag{5.2.11}$$
where the mode parameters are defined as
$$w^2(z) = w_o^2\left(1 + \frac{z^2}{b^2}\right) \tag{5.2.12}$$
$$\frac{1}{R(z)} = \frac{z}{z^2 + b^2} \tag{5.2.13}$$
$$\tan\phi = \frac{z}{b} \tag{5.2.14}$$

Before the various definitions are discussed in detail, note that the single
parameter $q(z)$ completely determines the behavior of the beam. This will be
important in the following sections.
Symbol $w(z)$ is the $e^{-2}$ waist radius (in power) along $z$; $R(z)$ is the radius of the phase-front, or field, curvature of the mode along $z$; and $\phi$ is the common phase of the mode (Fig. 5.5). These parameters have two solutions in the extreme. First, at $z = 0$, the waist does not get smaller than diameter $2w_o$. This is in contrast to a ray-optic model, where paraxial rays cross on a focal plane, indicating a zero beam waist. A gaussian mode always has a non-zero minimum waist. Moreover, the radius of the phase-front $R$ is infinite at $z = 0$. There is no phase curvature, the beam is a plane wave at this position, and the mode is in focus on the plane. Second, in the far field, the waist and phase-front radius approach asymptotic limits: they both grow linearly with $z$. In particular, the beam waist expands in a cone whose half-angle is called the diffraction angle. The diffraction angle is calculated from $w(z)/z$, or
$$\tan\theta = \frac{\lambda}{\pi w_o} \tag{5.2.15}$$
Clearly the diffraction angle increases as the minimum beam waist, or the aperture a collimated beam passes through, decreases. The level of approximation used herein is that the aperture be at least a wavelength large.

Fig. 5.5. A gaussian mode passing through a waist minimum. The field curvature is governed by $R(z)$ and the $e^{-2}$ beam waist is governed by $w(z)$. In the far field the adiabatic expansion of the beam waist falls within the diffraction angle $\theta \simeq \lambda/\pi w_o$. The smaller the minimum waist $w_o$ the larger the beam divergence. Also, the depth of focus, $2b$, is the length between points where the beam waist is $\sqrt{2}\,w_o$.
The numerical aperture also describes the half-angle of the cone within which a beam of light adiabatically expands, but the diffraction angle and numerical aperture are not precisely the same. The numerical aperture is defined as
$$\mathrm{N.A.} = n \sin\theta_{na} \tag{5.2.16}$$
where $\theta_{na}$ is the angle the marginal ray in a ray-optic formalism takes when propagating away from a point source. The diffraction angle is defined for a beam waist at $e^{-2}$, where 13.5% of the optical power is accordingly excluded. The marginal ray for a sufficiently large collecting lens covers all the optical power and would therefore trace a larger angle. For example, Corning reports a numerical aperture of its SMF-28 fiber of 0.14 and a mode-field diameter of 10.4 ± 0.8 µm [9]. These two quantities are directly measured using the procedure referenced in [8]. Asserting $w_o \simeq 5.2$ µm yields a diffraction angle of $\theta = 0.094$ rad, which is less than the measured numerical aperture. Conversely, asserting $\theta = 0.14$ rad yields a beam diameter of 7.0 µm. Similar results are obtained either way, but the point has been made that the diffraction angle and numerical aperture are similar but not the same.
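The two numbers quoted for SMF-28 can be reproduced from (5.2.15). The following is a minimal sketch, assuming an operating wavelength of 1.55 µm, which this passage does not state explicitly; the function names are illustrative only.

```python
import math

WAVELENGTH_UM = 1.55  # assumed operating wavelength in micrometres (not stated in the text)

def diffraction_angle(w0_um, wavelength_um=WAVELENGTH_UM):
    """Half-angle theta from tan(theta) = lambda / (pi * w0), eq. (5.2.15)."""
    return math.atan(wavelength_um / (math.pi * w0_um))

def waist_from_angle(theta_rad, wavelength_um=WAVELENGTH_UM):
    """Invert (5.2.15): minimum waist radius w0 for a given half-angle."""
    return wavelength_um / (math.pi * math.tan(theta_rad))

# Mode-field radius of 5.2 um -> diffraction angle in radians
theta = diffraction_angle(5.2)

# Treating the 0.14 numerical aperture as a diffraction angle -> beam diameter in um
diameter = 2.0 * waist_from_angle(0.14)

print(f"theta = {theta:.4f} rad, diameter = {diameter:.2f} um")
```

Under the assumed wavelength the first result falls just below 0.095 rad and the second near 7.0 µm, in line with the figures quoted in the text.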
Another parameter derived from (5.2.12) is the depth of focus. The depth of focus is the full length about a waist minimum where the waist crosses through $\sqrt{2}\,w_o$. In fact, the depth of focus is just twice the confocal parameter: $2b$. Of importance is that the depth of focus grows as the square of the minimum beam waist (cf. (5.2.10)). A doubling of the minimum waist quadruples the depth of focus, allowing the optical throw between lenses to become large.
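The mode parameters (5.2.10) and (5.2.12) to (5.2.14) are simple enough to evaluate directly. The sketch below is illustrative only: the helper names are hypothetical, and the wavelength and waist values are assumptions chosen to resemble a single-mode fiber at 1.55 µm.

```python
import math

def confocal_parameter(w0, wavelength):
    """Confocal parameter b = pi * w0**2 / lambda, eq. (5.2.10)."""
    return math.pi * w0**2 / wavelength

def beam_radius(z, w0, wavelength):
    """e^-2 beam radius w(z), eq. (5.2.12)."""
    b = confocal_parameter(w0, wavelength)
    return w0 * math.sqrt(1.0 + (z / b) ** 2)

def phase_front_radius(z, w0, wavelength):
    """Phase-front radius R(z) from 1/R = z / (z**2 + b**2), eq. (5.2.13)."""
    if z == 0:
        return math.inf  # plane phase front at the waist
    b = confocal_parameter(w0, wavelength)
    return (z**2 + b**2) / z

def common_phase(z, w0, wavelength):
    """Common phase phi from tan(phi) = z / b, eq. (5.2.14)."""
    return math.atan(z / confocal_parameter(w0, wavelength))

def depth_of_focus(w0, wavelength):
    """Full length between the points where w(z) = sqrt(2) * w0, i.e. 2b."""
    return 2.0 * confocal_parameter(w0, wavelength)

wavelength = 1.55e-6  # assumed wavelength in metres
w0 = 5.2e-6           # assumed minimum waist radius in metres
b = confocal_parameter(w0, wavelength)

# Doubling the waist should quadruple the depth of focus.
ratio = depth_of_focus(2 * w0, wavelength) / depth_of_focus(w0, wavelength)
# At z = b the waist has grown by sqrt(2).
growth = beam_radius(b, w0, wavelength) / w0
```

With these assumed inputs `b` comes out near 55 µm, which shows why a bare fiber mode diverges so quickly and why a collimator must expand the waist before any useful throw is possible.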

To conclude this section, the functional form of the electric field profile is derived from the vector potential. With the vector potential defined by (5.2.1), the scalar potential in time-harmonic form is found from (1.2.8):
$$\Phi = \frac{j}{\omega \mu_o \varepsilon_o}\, \nabla \cdot \mathbf{A} \tag{5.2.17}$$
The electric field (1.2.5) in time-harmonic form is therefore
$$\mathbf{E} = -j\omega\left[\mathbf{A} + \frac{1}{\omega^2 \mu_o \varepsilon_o}\, \nabla(\nabla \cdot \mathbf{A})\right] \tag{5.2.18}$$
Suppose for instance that $\hat{n} = \hat{x}$. The electric field (5.2.18) has a primary vector component in the $\hat{x}$ direction, but also a weak component in the longitudinal, or $z$, direction. This is in contrast to a plane wave, where the electric field components lie only in a plane perpendicular to the direction of propagation. For gaussian beams, with a spherical phase front as illustrated in Fig. 5.5, the electric field at a point off of the z-axis clearly has a longitudinal component.
5.2.1 q Transformation and ABCD Matrices
The gaussian beam analysis of the preceding section provides a functional form to determine parameters such as beam expansion, minimum beam waist, and field curvature. The next step addresses the behavior of a gaussian beam through a lens, a dielectric block, and more generally through a cascade of on-axis optical elements. The following analysis develops what is called the ABCD matrix formalism to calculate the behavior of a gaussian beam through a system. The reader should note that the ABCD formalism has limitations, including the absence of wavefront aberrations and the inability to directly calculate passage through a wedge. Following the warning at the introduction of the preceding section, ABCD matrices, like gaussian beam analysis, provide only an analytic framework with which to rough out a design; a ray tracing and beam propagation software program should be used to confirm and tolerance any manufactured component design.
Recall that the $q(z)$ parameter completely describes a gaussian beam. Accordingly, it is sufficient to track the transformation of $q$ from one position to another, through one interface to another. The $q$ parameter is defined by (5.2.7). The inverse $q$ parameter can easily be related to the beam waist $w$ and phase-front radius $R$:
$$\frac{1}{q} = \frac{1}{R} - j\frac{2}{k w^2} \tag{5.2.19}$$
First consider propagation over length $d$. The $q$ parameter is transformed to $q'$ as
$$q' = q + d$$
The beam semi-profile is illustrated in Fig. 5.6(a).

Fig. 5.6. Gaussian mode transformations. a) Adiabatic expansion over length $d$. b) Passage through a dielectric block of index $n$ and length $d$ having input and output faces perpendicular to the beam direction. c) Passage through a spherical surface into dielectric medium of index $n_2$ from medium of index $n_1$. d) Geometry for phase difference calculation.


Next consider refraction through a dielectric block having index $n$ and length $d$ (Fig. 5.6(b)). There are two flat interfaces, both perpendicular to the direction of propagation. At the first interface the initial wavenumber $k$ is changed to $nk$. At the second interface $nk$ changes back to $k$. The change in wavenumber needs to be accounted for in the gaussian envelope equation (5.2.11). Consider the input interface. First, the confocal parameter changes within the medium: $b \rightarrow nb$. Second, the mode waist is continuous across the interface, so $w(z_-) = w(z_+)$. To maintain the same mode waist while accommodating the change of the confocal parameter, $z$ must transform to $nz$. The $q$ parameter thus transforms as
$$q' = nq \tag{5.2.20}$$
The resulting change of the mode envelope function within the medium produces a reduction of the field curvature. Within the medium the expansion of the mode is diminished by the factor $n$. However, expansion continues unabated because no curved surface or transverse index gradient has been encountered. Finally, the change at the output face is reversed from that at the input face, leading to the $q$ transformation:
$$q' = q/n \tag{5.2.21}$$
Next consider the passage through a spherical lens surface from index $n_1$ to index $n_2$ (Fig. 5.6(c,d)). The surface is oriented such that the z-axis crosses its center. As detailed in Fig. 5.6(d), the phases accrued at lateral coordinates $y_a$ and $y_b$ while traveling along $\delta z$ are
$$\phi_a = k n_2\, \delta z, \quad \text{and} \quad \phi_b = k n_1\, \delta z$$
Note that the wavenumber $k$ is written for vacuum. The surface, being spherical, is described by $x^2 + y^2 + z^2 = R_s^2$, where $R_s$ is the physical radius of curvature. The difference in phase between an axial ray and a ray off-axis is given by the equation
$$\Delta\phi = k(x^2 + y^2)\,\frac{(n_2 - n_1)}{2R_s} \tag{5.2.22}$$
The differential phase delay of the surface imparts a curvature on the gaussian mode, leaving a new field curvature $R'$. That curvature is
$$\frac{1}{R'} = \frac{1}{R} - \frac{n_2 - n_1}{R_s} \tag{5.2.23}$$
Notice that in the absence of a refractive index discontinuity at the surface there is no change of field curvature. In order to arrive at the correct $q$ transformation, care must be taken to recognize that the mode first travels in $n_1$ and passes to $n_2$. Applying (5.2.20)–(5.2.21) and (5.2.23) yields
$$\frac{1}{q'} = \frac{n_1}{n_2}\,\frac{1}{q} - \frac{(n_2 - n_1)}{R_s}\,\frac{1}{n_2}$$
All of the above transformations are instances of the bilinear transformation. A bilinear transformation is a transformation in the complex plane to coordinate $q'$ from coordinate $q$ via
$$q' = \frac{Aq + B}{Cq + D} \tag{5.2.24}$$
where coefficients $A$ through $D$ are real numbers. The four bilinear coefficients can be grouped in 2 × 2 matrix form. This matrix is called the ABCD matrix. In the case of uninterrupted propagation, the ABCD matrix is
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \tag{5.2.25}$$
For transformations through a flat interface into and out of a dielectric medium having index $n$, the ABCD matrices are
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1/n \end{pmatrix}, \quad \text{and} \quad \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & n \end{pmatrix}, \tag{5.2.26}$$
respectively. Finally, transformation through a spherical surface produces
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\dfrac{(n_2 - n_1)}{n_2 R_s} & \dfrac{n_1}{n_2} \end{pmatrix} \tag{5.2.27}$$
The matrix operation is carried out as
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} q_a \\ q_b \end{pmatrix} = \begin{pmatrix} q_a' \\ q_b' \end{pmatrix}$$
where subsequently $q' = q_a'/q_b'$.
Needless to say, armed with a matrix formalism, an arbitrary cascade of spherical lens surfaces, planar index discontinuities, and free-space propagation can be calculated and the resultant $q$ parameter determined. This formalism will be used in the following to calculate collimator-to-collimator designs.
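The bilinear transformation (5.2.24) and the element matrices (5.2.25) to (5.2.27) translate almost line for line into code. The following is a sketch under stated assumptions rather than a design tool: all function names are hypothetical, the flat interface is written as the infinite-radius limit of (5.2.27), and the wavelength, waist, and block index are illustrative values.

```python
import math

def propagate(d):
    """Uninterrupted propagation over length d, eq. (5.2.25)."""
    return ((1.0, d), (0.0, 1.0))

def spherical_surface(n1, n2, rs):
    """Spherical surface of radius rs from index n1 into n2, eq. (5.2.27)."""
    return ((1.0, 0.0), (-(n2 - n1) / (n2 * rs), n1 / n2))

def flat_interface(n1, n2):
    """Flat interface: the rs -> infinity limit of (5.2.27), cf. (5.2.26)."""
    return ((1.0, 0.0), (0.0, n1 / n2))

def cascade(*elements):
    """Multiply ABCD matrices listed in the order the beam meets them."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for (a2, b2), (c2, d2) in elements:
        a, b, c, d = (a2 * a + b2 * c, a2 * b + b2 * d,
                      c2 * a + d2 * c, c2 * b + d2 * d)
    return ((a, b), (c, d))

def transform_q(m, q):
    """Bilinear transformation q' = (A q + B) / (C q + D), eq. (5.2.24)."""
    (a, b), (c, d) = m
    return (a * q + b) / (c * q + d)

def waist_and_radius(q, k):
    """Recover w and R from 1/q = 1/R - j * 2 / (k * w**2), eq. (5.2.19)."""
    inv = 1.0 / q
    w = math.sqrt(-2.0 / (k * inv.imag))
    r = math.inf if inv.real == 0.0 else 1.0 / inv.real
    return w, r

wavelength, w0 = 1.55e-6, 5.2e-6   # assumed values, in metres
k = 2.0 * math.pi / wavelength
b = k * w0**2 / 2.0                # confocal parameter, eq. (5.2.10)
q0 = complex(0.0, b)               # q = jb at the waist, eq. (5.2.7)

# Free-space propagation over 1 mm
q_air = transform_q(propagate(1e-3), q0)
w_air, r_air = waist_and_radius(q_air, k)

# Dielectric block of index 1.5 and length 1 mm, entered from and exiting to air
block = cascade(flat_interface(1.0, 1.5), propagate(1e-3), flat_interface(1.5, 1.0))
q_block = transform_q(block, q0)
```

Listing elements in traversal order and multiplying from the left keeps the code aligned with how the optical train is drawn. The block case reduces to an effective propagation length of d/n, the same compression that appears later in (5.2.46b).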
5.2.2 ABCD Ray Tracing
To the current level of approximation, ray tracing is derived from the gaussian mode in the limit of infinite beam waist. In this limit, there is no field curvature and the waves are planar. The $q$ parameter becomes
$$\lim_{w \rightarrow \infty} \frac{1}{q} = \frac{1}{L}$$
where the symbol $L$ replaces $R$ since the phase-front radius is infinite. Rather, $L$ is the length of the ray between two planes along $z$. Note that the $q$ parameter is now purely real. Even with the elimination of the field curvature, the paraxial limit remains. In particular
$$\sin\theta = y/L \simeq \theta, \qquad \cos\theta = z/L \simeq 1, \qquad \tan\theta = y/z \simeq \theta, \qquad y' = y/z$$
With these approximations, the ray length from one boundary plane to another is
$$L = \frac{y}{y'} \tag{5.2.28}$$
The bilinear $q$ transformation can also be rewritten for the plane wave where $L$ replaces $q$:
$$q' = \frac{Aq + B}{Cq + D} \;\Longrightarrow\; L' = \frac{AL + B}{CL + D}$$
or, using (5.2.28),
$$L' = \frac{Ay + By'}{Cy + Dy'} \tag{5.2.29}$$
With this transition to ray optics from gaussian optics in the paraxial limit, the use of the ABCD matrices (5.2.25)–(5.2.27) remains valid. Figure 5.7 illustrates a ray trace from an object to an image through a thick lens.

Fig. 5.7. Ray trace from object to image through a thick lens. The length of the ray $L_{1 \ldots 3}$ between each boundary plane is indicated. The coordinate $(y, y')$ at the intersection of the ray with each boundary plane completely describes the trace of the ray.

5.2.3 Action of a Single Lens


Figure 5.8 illustrates a thin symmetric-convex lens of index $n$ with an object positioned at A (distance $l_1$ from the lens) and an image at C (distance $l_2$ from lens). The object and image heights are $y_1$ and $y_2$, respectively. The ray trace concatenation from object to image is
$$\mathbf{A} = \begin{pmatrix} 1 & l_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\dfrac{(n-1)}{R_{lens}} & n \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\dfrac{(n-1)}{nR_{lens}} & \dfrac{1}{n} \end{pmatrix} \begin{pmatrix} 1 & l_1 \\ 0 & 1 \end{pmatrix} \tag{5.2.30}$$
The center two matrices transform through the first and second spherical surfaces. Combined, they yield the focal length of the lens:
$$\begin{pmatrix} 1 & 0 \\ -\dfrac{(n-1)}{R_{lens}} & n \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\dfrac{(n-1)}{nR_{lens}} & \dfrac{1}{n} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}$$
where the inverse focal length is defined as
$$\frac{1}{f} = \frac{2(n-1)}{R_{lens}} \tag{5.2.31}$$
Thus the focal length is related to the lens curvature. As the lens was defined as a symmetric-convex lens, both surfaces refract the ray. The optical power $O$ of a single spherical surface is defined as
$$O = \frac{(n_2 - n_1)}{R_{lens}} \tag{5.2.32}$$
Optical power is the ability of an interface to change the field curvature of a gaussian beam. A flat interface has an infinite radius and therefore no optical power. A curved interface with zero index discontinuity likewise has no optical power. The lens surfaces described in this and the preceding sections do have optical power, and in anticipation of the derivation of the GRIN lens equation below, a flat interface having a lateral index gradient also has optical power.

Fig. 5.8. Ray trace from object to image through a thin lens. Lengths $l_1$ and $l_2$ are related by the focal length of the lens. The focal length $f$ is $f = R/2(n-1)$.
Returning to the concatenation (5.2.30), the product is
$$\mathbf{A} = \begin{pmatrix} 1 - l_2/f & l_1 + l_2 - l_1 l_2/f \\ -1/f & 1 - l_1/f \end{pmatrix} \tag{5.2.33}$$
On the image plane, all rays that emanate from the object converge to the same point at the image. This is illustrated in Fig. 5.8, where chief ray AOC and paraxial ray ABC both converge at point C. As this is a thin lens, the chief ray transits through the center of the lens, where there is no curvature, without a change of its direction. The paraxial ray travels straight to the lens and then alters slope so as to pass through the focal point $f$. In either case, the final position is independent of the ray slope $y'$ at the object. Therefore $B = 0$. Application of this condition on (5.2.33) generates the formula for a thin lens:
$$\frac{1}{l_1} + \frac{1}{l_2} = \frac{1}{f} \tag{5.2.34}$$
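The imaging condition can be checked numerically: choosing $l_2$ from (5.2.34) should drive the B element of (5.2.33) to zero. A minimal sketch follows; the focal length and object distance are hypothetical values in arbitrary length units, and the helper names are not from the text.

```python
def thin_lens_system(l1, l2, f):
    """A, B, C, D elements of (5.2.33): propagate l1, thin lens f, propagate l2."""
    a = 1.0 - l2 / f
    b = l1 + l2 - l1 * l2 / f
    c = -1.0 / f
    d = 1.0 - l1 / f
    return a, b, c, d

def image_distance(l1, f):
    """Solve (5.2.34) for l2. Requires l1 != f; a negative result is a virtual image."""
    return 1.0 / (1.0 / f - 1.0 / l1)

f, l1 = 2.0, 3.0                 # hypothetical focal length and object distance
l2 = image_distance(l1, f)       # image distance satisfying (5.2.34)
a, b, c, d = thin_lens_system(l1, l2, f)

# With B = 0 the output height depends only on the input height: y2 = A * y1.
print(f"l2 = {l2:.3f}, A = {a:.3f}, B = {b:.3e}")
```

At the image plane the A element equals $-l_2/l_1$, whose magnitude is the magnification, and the determinant $AD - BC$ stays at unity because the object and image spaces share the same index.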
In terms of the $q$ parameter, in the object plane $1/R = 0$ and $q = jb$. The matrix (5.2.33) yields
$$\frac{1}{q'} = \frac{1}{R'} - j\frac{1}{b'} \tag{5.2.35}$$
where
$$R' = \frac{f}{l_1/l_2}, \quad \text{and} \quad b' = \left(\frac{l_2}{l_1}\right)^2 b \tag{5.2.36}$$
Consider three cases: the object is placed behind, at, and in front of the front focal plane, the front focal plane being on the side of the object. In the first instance, the lens focuses the object onto the image plane. The image plane is a finite distance $l_2$ from the lens and the image is magnified by
$$M = \left|\frac{l_2}{l_1}\right| \tag{5.2.37}$$
The magnification can be greater or less than unity, depending on the relative positions of object and lens. Likewise, the gaussian beam waist is magnified as indicated by $b'$ in (5.2.36): $M = w_o'/w_o$.
In the second instance, $l_1 = f$ and the lens collimates the light from the object. In the paraxial limit, all rays passing through the lens subsequently run parallel to one another. The ABCD matrix (5.2.33) for this condition yields only one meaningful relation, which is the inclination angle of the collimated beam as a function of offset on the back focal plane:
$$\theta_{pt} = -\frac{y}{f} \tag{5.2.38}$$
where, in anticipation of what is to follow, the inclination angle is denoted $\theta_{pt}$ for the pointing direction. A simple lens transforms positional offset on the focal plane to angle of the collimated beam. This transformation property is the basis of much Fourier optic filtering work as well as some telecommunications components [20].
The final instance is when $l_1 < f$. In this case the curvature of the lens is not sufficient to counter the adiabatic expansion of the mode; the mode will continue to expand. However, a virtual image is formed on the same side of the lens as the object, at location $l_2$. To an observer on the far side of the lens, the object will appear located at the virtual image position.
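The offset-to-angle property of the collimating case can be put into numbers. The sketch below assumes the sign follows the C element of (5.2.33), so that a positive offset on the focal plane gives a negative inclination. The inputs are the image offset and effective focal length listed in the caption of Fig. 5.14; whether that figure's pointing angle was computed this way is not stated, so the comparison is only indicative.

```python
import math

def pointing_angle(offset, focal_length):
    """Inclination of the collimated beam for a ray starting at height `offset`
    on the focal plane (l1 = f): y'_out = C * y with C = -1/f from (5.2.33)."""
    return -offset / focal_length

# Caption values from Fig. 5.14: image offset 0.130 mm, EFL 2.870 mm
angle_rad = pointing_angle(0.130, 2.870)
angle_deg = math.degrees(angle_rad)

print(f"pointing angle = {angle_deg:.2f} degrees")
```

The magnitude lands close to the 2.59° pointing error listed for zero ferrule offset in that caption, which suggests the image-offset construction and the simple lens relation are at least consistent with one another.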
5.2.4 Action of a GRIN Lens
A GRIN lens substitutes the optical power of a curved surface across an index discontinuity with the optical power of an index gradient in the plane perpendicular to the propagation axis. A canonical index profile is
$$n(x, y) = n_o\left(1 - \frac{x^2 + y^2}{2p^2}\right) \tag{5.2.39}$$
where $n_o$ is the axial index of the lens and $p$ is the curvature of the index gradient having units of length. The phase retardation across an infinitesimal slab $\delta z$, determined in the same way (5.2.22) was derived, is
$$\Delta\phi = k\,\frac{(x^2 + y^2)}{2p^2}\, n_o\, \delta z$$
The focal length of this infinitesimal slab is identified as
$$\frac{1}{f} = \frac{n_o\, \delta z}{p^2}$$
A single slab is described by a cascade of three ABCD matrices: a first matrix that propagates $\delta z/2n_o$, a second matrix that focuses with focal length $f$, and a final matrix that again propagates $\delta z/2n_o$. Keeping terms to order $(\delta z)^2$, the resulting concatenation is
$$\mathbf{A} = \begin{pmatrix} 1 - \dfrac{\delta z^2}{2p^2} & \dfrac{\delta z}{n_o} \\ -\dfrac{n_o\, \delta z}{p^2} & 1 - \dfrac{\delta z^2}{2p^2} \end{pmatrix} \tag{5.2.40}$$
The determinant of (5.2.40) is
$$\det(\mathbf{A}) = 1 + \left(\frac{\delta z}{\sqrt{2}\, p}\right)^4$$
Only in the limit $\delta z \rightarrow 0$ is the concatenation matrix $\mathbf{A}$ unimodular, that is, of unit determinant.
A GRIN lens having length $L$ is divided into $n$ sections of length $\delta z$ so that $L = n\,\delta z$. The limits $\delta z \rightarrow 0$ and $n \rightarrow \infty$ are then taken. In order to carry out these limits, matrix (5.2.40) is separated into its eigenvectors and eigenvalues as in (2.6.34). When $\mathbf{A}$ is cascaded $n$ times, the inner eigenvector matrices telescope and leave only
$$\mathbf{A}^n = V \Lambda^n V^{-1} \tag{5.2.41}$$
where $V$ is a square matrix whose columns are the eigenvectors of $\mathbf{A}$ and $\Lambda$ is a diagonal matrix of the corresponding eigenvalues. The eigenvectors and eigenvalues of $\mathbf{A}$ are
$$V = \begin{pmatrix} -\dfrac{jp}{n_o} & \dfrac{jp}{n_o} \\ 1 & 1 \end{pmatrix}, \quad \text{and} \quad \lambda_\pm = 1 - \frac{\delta z^2}{2p^2} \pm j\frac{\delta z}{p}$$
with the columns of $V$ ordered as $\lambda_+$, $\lambda_-$. Notice that $\delta z$ enters only in the eigenvalues and not the vectors; only the eigenvalues get integrated so that $\delta z$ will become length $L$. Retaining only terms of order $\delta z$ in $\lambda_\pm$ and recalling the limit expression for the exponential function, the cascaded eigenvalues take the form
$$\lim_{n \rightarrow \infty} (\lambda_\pm)^n = \lim_{n \rightarrow \infty} \left(1 \pm j\frac{L}{np}\right)^n = \exp(\pm j\theta_g)$$
where the GRIN lens angle is
$$\theta_g = \frac{L}{p} \tag{5.2.42}$$
Denoting $\mathbf{A}^n$ as $\mathbf{A}_L$, the ABCD matrix of a GRIN lens, derived from (5.2.41), is
$$\mathbf{A}_L = \begin{pmatrix} \cos\theta_g & \dfrac{p}{n_o}\sin\theta_g \\ -\dfrac{n_o}{p}\sin\theta_g & \cos\theta_g \end{pmatrix} \tag{5.2.43}$$
This is an unusual governing equation because of its periodicity. Every $\theta_g = 2\pi$ yields a replication of the object in real space. Half of this length generates an object inversion. One-quarter of this length will collimate a point source. Along a long GRIN rod the light rays are bound and trace a periodic longitudinal pattern.

Fig. 5.9. GRIN lens reference planes and definitions (after [15]). Scale-factor $\sqrt{A}$ is the index gradient constant and has units of m$^{-1}$. Labeled in the figure: F = front focal point; F′ = rear focal point; S = front surface vertex; S′ = rear surface vertex; P = front principal plane; P′ = rear principal plane; $n_1$ = object space index; $n_2$ = image space index; O = object plane; I = image plane; object distance $L_1$ = OS; image distance $L_2$ = S′I. The figure also gives the lens matrix in industry notation:
$$\begin{pmatrix} \cos(L\sqrt{A}) & \dfrac{n_1}{n_o\sqrt{A}}\sin(L\sqrt{A}) \\ -\dfrac{n_o\sqrt{A}}{n_2}\sin(L\sqrt{A}) & \dfrac{n_1}{n_2}\cos(L\sqrt{A}) \end{pmatrix}$$
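The limit that turns the slab cascade (5.2.40) into the closed form (5.2.43) can be checked by brute force: multiply many thin slabs and compare with the sine and cosine expression. The sketch below is illustrative only. It uses the lens data listed in the caption of Fig. 5.13 and treats $p$ as the reciprocal of the index gradient constant, which is the reading that makes $\theta_g = L/p$ agree with the argument $L\sqrt{A}$; the function names are hypothetical.

```python
import math

def matmul(m2, m1):
    """2x2 matrix product m2 * m1 (m1 is applied first)."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return ((a2 * a1 + b2 * c1, a2 * b1 + b2 * d1),
            (c2 * a1 + d2 * c1, c2 * b1 + d2 * d1))

def slab_matrix(dz, n0, p):
    """Single-slab ABCD matrix of eq. (5.2.40), kept to order dz**2."""
    diag = 1.0 - dz**2 / (2.0 * p**2)
    return ((diag, dz / n0), (-n0 * dz / p**2, diag))

def grin_by_cascade(length, n0, p, sections):
    """Cascade `sections` identical slabs of thickness length / sections."""
    slab = slab_matrix(length / sections, n0, p)
    total = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(sections):
        total = matmul(slab, total)
    return total

def grin_closed_form(length, n0, p):
    """Closed-form GRIN matrix of eq. (5.2.43) with theta_g = length / p."""
    theta_g = length / p
    s, c = math.sin(theta_g), math.cos(theta_g)
    return ((c, p / n0 * s), (-n0 / p * s, c))

# Assumed inputs (mm): n0 = 1.5503, sqrt(A) = 0.237 per mm, L = 6.10 mm
n0, p, length = 1.5503, 1.0 / 0.237, 6.10
numeric = grin_by_cascade(length, n0, p, sections=2000)
exact = grin_closed_form(length, n0, p)
max_error = max(abs(numeric[i][j] - exact[i][j])
                for i in range(2) for j in range(2))
```

Repeated multiplication is used here instead of the eigen-decomposition (5.2.41) so that the check is independent of the derivation it is meant to confirm. The residual shrinks as the number of sections grows, as the limit argument requires.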

In order to maintain consistency with the industry norm, the symbol $\sqrt{A}$ replaces $1/p$ and is called the index gradient constant, having units of m$^{-1}$. Figure 5.9 shows the matrix form with industry notation. The pitch of a GRIN lens $P$ is defined as
$$P = \frac{L\sqrt{A}}{2\pi} \tag{5.2.44}$$
When a GRIN rod is cut to a quarter pitch, $P = 0.25$, and the rod is placed in air, the focal length of the lens is
$$\frac{1}{f_{1/4}} = n_o\sqrt{A} \tag{5.2.45}$$
This relation makes it clear that the stronger the index gradient, the higher the effective curvature of the lens. As a matter of common practice, a GRIN collimator has a pitch of $P = 0.23$; the purpose of this reduced pitch is to pull the front focal plane away from the physical face of the rod so that a fiber ferrule can be located in proximity to the lens without touching it. Table 5.3 provides the paraxial design equations [15] for a GRIN lens immersed in differing media. The locations of the reference planes are indicated in Fig. 5.9.
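The pitch definition (5.2.44) can be exercised against the lens data quoted in the captions of Figs. 5.13 and 5.14. The sketch below takes the effective focal length of a rod in air as $1/(n_o\sqrt{A}\sin(L\sqrt{A}))$, which reduces to (5.2.45) at quarter pitch; the function names are hypothetical and lengths are in millimetres.

```python
import math

def grin_pitch(length, sqrt_a):
    """Pitch P = L * sqrt(A) / (2 * pi), eq. (5.2.44)."""
    return length * sqrt_a / (2.0 * math.pi)

def grin_effective_focal_length(length, sqrt_a, n0, n_outside=1.0):
    """Effective focal length n / (n0 * sqrt(A) * sin(L * sqrt(A)))."""
    return n_outside / (n0 * sqrt_a * math.sin(length * sqrt_a))

# Caption data, Fig. 5.13: L = 6.10 mm, sqrt(A) = 0.237 per mm, n0 = 1.5503
pitch_a = grin_pitch(6.10, 0.237)
efl_a = grin_effective_focal_length(6.10, 0.237, 1.5503)

# Caption data, Fig. 5.14: L = 2.324 mm, sqrt(A) = 0.322 per mm, n0 = 1.5902
pitch_b = grin_pitch(2.324, 0.322)
efl_b = grin_effective_focal_length(2.324, 0.322, 1.5902)

print(f"Fig. 5.13 lens: P = {pitch_a:.3f}, EFL = {efl_a:.3f} mm")
print(f"Fig. 5.14 lens: P = {pitch_b:.3f}, EFL = {efl_b:.3f} mm")
```

Both lenses reproduce the pitch and EFL values listed in their captions to the quoted precision, which is a useful sanity check on the units of the index gradient constant.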
Table 5.3. GRIN Lens Paraxial Equations

Parameter: Function
Front focal length: $FS = \dfrac{n_1 \cos(L\sqrt{A})}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Effective front focal length: $FP = \dfrac{n_1}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Rear focal length: $S'F' = \dfrac{n_2 \cos(L\sqrt{A})}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Effective rear focal length: $P'F' = \dfrac{n_2}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Front principal distance: $SP = \dfrac{n_1\,|1 - \cos(L\sqrt{A})|}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Rear principal distance: $P'S' = \dfrac{n_2\,|1 - \cos(L\sqrt{A})|}{n_o\sqrt{A}\,\sin(L\sqrt{A})}$
Magnification: $M = \dfrac{n_1}{n_1\cos(L\sqrt{A}) - n_o L_1\sqrt{A}\,\sin(L\sqrt{A})}$
Angular magnification: $M_a = \dfrac{n_1\cos(L\sqrt{A}) - n_o L_1\sqrt{A}\,\sin(L\sqrt{A})}{n_2}$

5.2.5 Some Limitations of the ABCD Matrix
Care must be taken when applying the ABCD matrix formalism to practical systems. The formalism is derived for a gaussian mode as it travels through a cascade of refractive indices and surfaces having optical power. Changes in ray direction originate from a change in the modal wavenumber $k$ and changes in the field curvature. Absent is the ability to change direction of the mode through a wedge or a prism. According to (5.2.29), the ABCD matrix transforms $(y_a, y_a')$ to $(y_b, y_b')$. One is generally more accustomed to transformations on coordinates $(y, z)$. In the latter case rotation matrices can be applied to rotate the coordinate system through flat, inclined refractive surfaces. But this does not apply to the ABCD formalism.
As an example, consider the propagation over length $d$ through air and through a medium with index $n$. The initial ray comes from an axially located point source, i.e. $y_a = 0$. The two cases are
$$\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ y_a' \end{pmatrix} = \begin{pmatrix} d\,y_a' \\ y_a' \end{pmatrix} \tag{5.2.46a}$$
$$\begin{pmatrix} 1 & d/n \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ y_a' \end{pmatrix} = \begin{pmatrix} d\,y_a'/n \\ y_a' \end{pmatrix} \tag{5.2.46b}$$
Using Snell's law, one expects the light ray to refract into the medium and thereby change its inclination. However, (5.2.46) does not directly indicate that the ray inclination has changed: the $y'$ entry in both output vectors is the same. The way refraction is handled in the ABCD formalism is to compress the offset component to $y/n$ from $y$.
This limitation poses problems for the analysis of most collimators. As discussed in §5.1 on page 213, epoxy-joint and air-gap collimator assemblies incline the ferrule and lens facets to minimize back reflection. How is the inclination to be handled? Ordinarily Snell's law is applied at the interface and a new paraxial angle $y'$ is determined. However, that is incompatible with the ABCD formalism. To fix this problem, an image point is created as a complement to the original object point. When a ray emanating from the image point transits a perpendicular interface (having the same refractive step as the inclined face), the refraction angle with respect to the horizontal (or other fixed reference) is the same as that created by a ray from the object point which transits the inclined face. On the refracted-ray side, one cannot distinguish whether the ray comes from the object or image point.

Fig. 5.10. Image location correction for a gaussian mode through an inclined face. a) Object point inclined by $\theta_1 + \theta_t$ to interface normal, the interface being tilted by $\theta_t$. Refraction angle is $\theta_s$. b) Image angle $\theta_2$ such that $\theta_s$ is the same as in a). c) Over fixed gap $L_{gap}$ image point is offset $\delta p$ from object point.
Figure 5.10 illustrates the required analysis. All calculations are taken in the paraxial limit of small angles. In Figs. 5.10(a,b) the refracted angles are $(\theta_1 + \theta_t) = n(\theta_s + \theta_t)$ and $\theta_2 = n\theta_s$, respectively, where $\theta_t$ is the tilt angle of the facet. The relation between $\theta_1$ and $\theta_2$ for a fixed $\theta_s$ is
$$\theta_2 = \theta_1 - (n - 1)\theta_t \tag{5.2.47}$$
The position offset over a fixed gap length $L_{gap}$ is
$$\delta p = L_{gap}(\theta_2 - \theta_1) \tag{5.2.48}$$
This offset is illustrated in Fig. 5.10(c). Armed with these position and angular corrections, the ABCD formalism may be applied to inclined-facet collimators.
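The image-point construction can be verified directly: the ray from the object point through the inclined facet and the ray from the image point through a perpendicular facet should leave at the same refracted angle. The sketch below works in the small-angle limit with angles in radians; the function names and the numerical inputs are hypothetical.

```python
def image_point_correction(theta1, theta_t, n, l_gap):
    """Return (theta2, delta_p) from eqs. (5.2.47) and (5.2.48)."""
    theta2 = theta1 - (n - 1.0) * theta_t
    delta_p = l_gap * (theta2 - theta1)
    return theta2, delta_p

def refracted_angle_inclined(theta1, theta_t, n):
    """theta_s for the object ray, from (theta1 + theta_t) = n * (theta_s + theta_t)."""
    return (theta1 + theta_t) / n - theta_t

def refracted_angle_perpendicular(theta2, n):
    """theta_s for the image ray, from theta2 = n * theta_s."""
    return theta2 / n

# Hypothetical inputs: ray angle, facet tilt, index step, gap length (arbitrary units)
theta1, theta_t, n, l_gap = 0.05, 0.10, 1.55, 0.5
theta2, delta_p = image_point_correction(theta1, theta_t, n, l_gap)

s_object = refracted_angle_inclined(theta1, theta_t, n)
s_image = refracted_angle_perpendicular(theta2, n)
```

The two refracted angles agree, so downstream of the facet the substitution is invisible; the price is the lateral shift `delta_p`, which must be carried into the subsequent ABCD trace as an input offset.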

5.3 Select Collimators Analyzed with the ABCD Matrix


Four collimator examples are presented to highlight the analyses of the preceding sections. The first example is shown in Fig. 5.12 and uses a shaped lens. The optical data are detailed in the caption. This collimator is an air-gap collimator where the ferrule and lens rod are angle polished. As proposed in [5], the angle of the lens can be tailored with respect to the ferrule angle so that the beam that enters the lens runs parallel to the axis of the fiber even when the lens and fiber refractive indices differ. This detail is important when trying to minimize the pointing error of an air-gap collimator and can be applied to either a GRIN or shaped lens. Figure 5.11 illustrates the relevant calculation. The known parameters are the fiber and lens indices and the ferrule facet angle. In the small angle limit, the angle of the beam as it emerges from the ferrule, with respect to the fiber axis (horizontal), is
$$\theta_a = (n_{fiber} - 1)\,\theta_{ferrule}$$
where the index of the air gap is taken as unity. The angle of the beam after refraction into the lens with respect to the horizontal is
$$\theta_b = (\theta_a + \theta_{lens})/n_{lens} - \theta_{lens}$$
The goal is to have the reentrant beam run parallel to the fiber axis: $\theta_b = 0$. The lens facet angle with respect to the ferrule facet angle is therefore
$$\theta_{lens} = \left(\frac{n_{fiber} - 1}{n_{lens} - 1}\right)\theta_{ferrule} \tag{5.3.1}$$
The tilt angle $\theta_t$, the angle difference between ferrule and lens, is
$$\theta_t = \left(\frac{n_{lens} - n_{fiber}}{n_{lens} - 1}\right)\theta_{ferrule} \tag{5.3.2}$$

Fig. 5.11. Ray trace to determine tilt angle $\theta_t$ such that the reentrant beam runs parallel to the mode in the fiber given that the fiber and lens refractive indices differ.

For the example shown in Fig. 5.12, the ferrule and lens angles are 8° and 6.7°, respectively. The refraction from the object point through the wedge facet and the refraction from the image point through the planar facet produce the same ray trace. The image ray trace, which accounts for the faceting of the lens and is calculated via (5.2.47)–(5.2.48), is shown in both figures and detailed in the lower figure. The pointing direction of the collimated beam is downward due to the upward offset of the central ray accrued in the gap. While the collimator in the figure shows a substantial gap between ferrule and shaped lens, it is clear that relocation of the ferrule immediately behind the lens facet and re-optimization of the lens focal length can minimize, albeit not eliminate, the pointing error.
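The facet-angle relations (5.3.1) and (5.3.2) are linear in the small-angle limit, so they can be evaluated directly in degrees. The sketch below uses the indices and ferrule angle from the Fig. 5.12 example; the function names are hypothetical.

```python
def beam_angle_in_gap(n_fiber, theta_ferrule):
    """Angle of the beam leaving the ferrule, relative to the fiber axis."""
    return (n_fiber - 1.0) * theta_ferrule

def beam_angle_in_lens(theta_a, theta_lens, n_lens):
    """Angle of the beam after refraction into the lens, relative to the fiber axis."""
    return (theta_a + theta_lens) / n_lens - theta_lens

def lens_facet_angle(n_fiber, n_lens, theta_ferrule):
    """Lens facet angle that straightens the central ray, eq. (5.3.1)."""
    return (n_fiber - 1.0) / (n_lens - 1.0) * theta_ferrule

def tilt_angle(n_fiber, n_lens, theta_ferrule):
    """Angle difference between ferrule and lens facets, eq. (5.3.2)."""
    return (n_lens - n_fiber) / (n_lens - 1.0) * theta_ferrule

n_fiber, n_lens, theta_ferrule = 1.46, 1.55, 8.0   # degrees, from the Fig. 5.12 example

theta_lens = lens_facet_angle(n_fiber, n_lens, theta_ferrule)
theta_t = tilt_angle(n_fiber, n_lens, theta_ferrule)
theta_a = beam_angle_in_gap(n_fiber, theta_ferrule)
theta_b = beam_angle_in_lens(theta_a, theta_lens, n_lens)

print(f"lens facet = {theta_lens:.2f} deg, tilt = {theta_t:.2f} deg, residual = {theta_b:.2e} deg")
```

The lens facet angle rounds to the 6.7° quoted for the example, and the residual beam angle inside the lens vanishes, confirming that the two expressions are consistent with the goal $\theta_b = 0$.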

Fig. 5.12. Scale drawing of central and paraxial rays emergent from an SMF-28 fiber and captured by a shaped lens. The lens collimates the beam. The surface of optical power is superimposed over the physical curvature of the lens. The lens facet is designed to straighten the central ray after refraction. The parameters are: $n_{fiber}$ = 1.46, $\theta_{ferrule}$ = 8.0°, N.A. = 0.11, $n_{lens}$ = 1.55, $L_{lens}$ = 2.0 mm, $R_{lens}$ = 1.0 mm, $\theta_{lens}$ = 6.7°, $L_{gap}$ = 0.53 mm, offset = 0.0 mm, image offset = +0.033 mm, $\theta_{pt}$ = −1.07°, vertical scale = 2×.

Fig. 5.13. Scale drawing of central and paraxial rays emergent from an SMF-28 fiber and captured by an NSG SLS2 lens [16]. The parameters are: $n_{fiber}$ = 1.46, $\theta_{ferrule}$ = 8.0°, N.A. = 0.11, $n_{lens}$ = 1.5503, $L_{lens}$ = 6.10 mm, $\sqrt{A}$ = 0.237 mm$^{-1}$, EFL = 2.743 mm, $\theta_{lens}$ = 6.0°, $L_{gap}$ = 0.343 mm, $P$ = 0.23, offset = 0 mm, image offset = +0.020 mm, $\theta_{pt}$ = −0.25°, vertical scale = 2×.

Fig. 5.14. Scale drawing of central and paraxial rays emergent from an SMF-28 fiber and captured by a long-reach GRIN lens [17]. Lateral offset of the ferrule can reduce the pointing error. The parameters are: $n_{fiber}$ = 1.46, $\theta_{ferrule}$ = 8.0°, N.A. = 0.11, $n_{lens}$ = 1.5902, $L_{lens}$ = 2.324 mm, $\sqrt{A}$ = 0.322 mm$^{-1}$, EFL = 2.870 mm, $\theta_{lens}$ = 6.0°, $L_{gap}$ = 2.10 mm, $P$ = 0.119, offset = 0, 125, 250 µm, image offset = +0.130 mm, $\theta_{pt}$ = −2.59°, +0.10°, +2.39°, vertical scale = 9.5×.

The second example is shown in Fig. 5.13 and uses a GRIN lens. The optical data are detailed in the caption and the design follows that of [16]. The ferrule and lens facets are angle polished to minimize back reflection and, while the angles differ, they do not exactly straighten the central ray. As with the shaped lens, the image method was used to calculate the beam refraction through the GRIN facet. The GRIN lens changes the wavefront curvature continuously throughout the body of the rod until collimation is achieved; the pitch of this lens is $P = 0.23$. Even with the small gap there is a downward pointing direction due to the lateral offset of the beam.
The third example is shown in Fig. 5.14 and uses a long-reach GRIN lens. The example follows that of [17] and is a study of pointing direction as a function of lateral ferrule offset. The optical data are detailed in the caption. A long working distance lens requires a large beam expansion to overcome diffraction. In this example the expansion occurs predominantly in the air gap. Accordingly, the ray bundle that impinges on the back face of the lens is substantially offset, resulting in a large pointing error. Progressive offset by 125 µm steps shows how the pointing direction changes, the optimal offset being ≈ 125 µm.
Finally, the fourth example is that of a dual-fiber collimator. Dual-fiber collimators are essential components for micro-optic devices because one lens acts to collimate light from and focus light onto two separate fibers. A dual-fiber collimator is the most compact way to fit two fibers into a small package. Two separate collimators use the spatial offset of the lenses to distinguish one port from another. A single dual-fiber collimator uses the divergence angle to distinguish between ports. Many elegant schemes for circulator, interleaver, and thin-film filter architectures have been developed to take advantage of this component.

Fig. 5.15. Scale drawings of central and paraxial rays emergent from two SMF-28 fibers positioned in a dual-fiber ferrule (inset) and captured by an NSG SLS2 lens [16]. a) Fiber cores not in plane, b) Fiber cores in plane. The parameters for a) are: $n_{fiber}$ = 1.46, $\theta_{ferrule}$ = 8.0°, N.A. = 0.11, $n_{lens}$ = 1.5503, $L_{lens}$ = 6.10 mm, $\sqrt{A}$ = 0.237 mm$^{-1}$, EFL = 2.743 mm, $\theta_{lens}$ = 6.0°, $L_{gap}$ = 0.343 mm, $P$ = 0.23, offset = ±125 µm, $\theta_{pt}$ = −3.03°, +2.20°, $\theta_{div}$ = 5.23°. The parameters for b) are the same except: $\theta_{pt}$ = ±2.61°, $\theta_{div}$ = 5.22°.
Figure 5.15 shows a dual-fiber collimator using a GRIN lens. A shaped
lens could be used as well. The inset illustrates the face of the two fibers
inserted into the ferrule. The fibers are stripped of their jacket so that the
separation is twice that of the fiber radius. For SMF-28 fiber the core-to-core
separation is 250 μm. For an air-gap collimator there is a choice on how to
orient the dual-fiber ferrule with respect to the lens facet. In one case one fiber
core extends beyond the other fiber core (Fig. 5.15(a)), and in the other case
the fiber cores are flush (Fig. 5.15(b)). It is reported that GRIN collimators
typically use the first orientation and shaped-lens collimators use the second
orientation [5].
In light of the change in pointing direction with lateral fiber offset, studied
in Fig. 5.14, the angle between output collimated beams is calculated
using (5.2.38) on page 230. The divergence angle $\theta_{div}$ is

$$\theta_{div} = \frac{y_2 - y_1}{f} \qquad (5.3.3)$$

where the position $y_k$ is taken from the centerline of the lens. For small gap
lengths, the approximate and generally used expression for the divergence
angle is

$$\theta_{div} = \frac{s}{f} \qquad (5.3.4)$$

where $s$ is the separation between fiber cores. A typical divergence angle for
commercially available collimators is 3°.
While small pointing errors of a single-fiber collimator are not significant
since the collimator housing is simply tilted in compensation, the divergence
angle of a dual-fiber collimator is inviolate as the two fibers in the ferrule are
fixed in position. For a dual-fiber collimator that receives two output beams,
the component architecture must account for the requirement that the beams
must enter the collimator at the divergence angle.
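As a quick numerical illustration of the small-gap expression $\theta_{div} = s/f$, the sketch below evaluates it for the Fig. 5.15 data (core separation 250 μm, EFL 2.743 mm). The function names are illustrative and the angle is converted from radians to degrees.

```python
import math

def divergence_angle_deg(y1_mm, y2_mm, f_mm):
    """Divergence angle (y2 - y1) / f, positions measured from the lens centerline."""
    return math.degrees((y2_mm - y1_mm) / f_mm)

def divergence_small_gap_deg(s_mm, f_mm):
    """Small-gap approximation s / f, with s the core-to-core separation."""
    return math.degrees(s_mm / f_mm)

# Fig. 5.15: SMF-28 cores 250 um apart, lens EFL = 2.743 mm.
theta_div = divergence_small_gap_deg(0.250, 2.743)
print(f"theta_div = {theta_div:.2f} deg")
```

The result, about 5.2°, is in line with the divergence angles listed in the caption of Fig. 5.15.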

5.4 Fiber-to-Fiber Coupling by a Lens Pair


The function of the collimator is to minimize the fiber-to-fiber insertion loss
when the beam transits a component core. The elements within the core and
the wavefront distortion imparted by the lenses both introduce loss. These
losses must be accounted for on a case-by-case basis. Errors in imaging may
also lead to losses, and these losses can be estimated by the coupling
coefficients derived below.
Figure 5.16 illustrates the two optimal coupling configurations for a gap of
length $L_{gap}$. The first optimal coupling is when the lenses produce a collimated
beam. In this case the beam waist between the two lenses is $w = f\,\mathrm{N.A.}$ and
the beam suffers minimum diffraction. The second optimal coupling is when
the lenses produce an intermediate focus positioned between the lenses. The
waist on this focal plane is determined from the magnification of the lenses:

$$M = \frac{w_b}{w_a} = \frac{L_{gap}}{2f} - 1 \qquad (5.4.1)$$

where the latter equation comes from (5.2.34). While not a necessary condition,
it is worth noting that the depth of focus equals the gap length when

$$w_b = \sqrt{\frac{\lambda L_{gap}}{\pi}} \qquad (5.4.2)$$

For example, with ideal lenses, a gap length of 100 mm, and λ = 1.545 μm,
a beam diameter of 450 μm puts the depth of focus at the gap length. In
turn this corresponds to a magnification factor of M ≈ 43.
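The worked example can be reproduced with a few lines of arithmetic. This is a sketch under two stated assumptions: that (5.4.2) has the form $w_b = \sqrt{\lambda L_{gap}/\pi}$, and that the fiber mode radius is $w_a \approx 5.2$ μm (a typical SMF-28 value near 1550 nm that is not given in this section). The focal length follows from rearranging $M = L_{gap}/2f - 1$.

```python
import math

def waist_for_depth_of_focus_mm(wavelength_mm, gap_mm):
    """Intermediate waist w_b, assuming (5.4.2) reads w_b = sqrt(lambda * L_gap / pi)."""
    return math.sqrt(wavelength_mm * gap_mm / math.pi)

def focal_length_for_magnification_mm(gap_mm, magnification):
    """Focal length from M = L_gap / (2 f) - 1, rearranged for f."""
    return gap_mm / (2.0 * (magnification + 1.0))

wavelength_mm = 1.545e-3   # 1.545 um
gap_mm = 100.0
w_a_mm = 5.2e-3            # ASSUMED fiber mode radius, not stated in the text

w_b_mm = waist_for_depth_of_focus_mm(wavelength_mm, gap_mm)
magnification = w_b_mm / w_a_mm
f_mm = focal_length_for_magnification_mm(gap_mm, magnification)

print(f"beam diameter = {2e3 * w_b_mm:.0f} um")
print(f"M = {magnification:.1f}, f = {f_mm:.2f} mm")
```

With these inputs the beam diameter comes out near 450 μm and the magnification near 43, consistent with the figures quoted in the example.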
Fig. 5.16. Gaussian beam profile from one fiber to another with two similar lenses.
There are two optimal coupling conditions for fixed gap $L_{gap}$: collimation (a) and
focusing (b). a) To collimate, the fibers are located on the front focal plane and the
beam between lenses, of diameter $2 f\,\mathrm{N.A.}$, nominally does not expand. b) To focus,
the fibers are located behind the front focal plane and the intermediate beam achieves
focus between the lenses, with depth of focus $2b$ and $M = w_b/w_a$. The image on the
back focal plane is an image of the fiber face magnified by $M$.

That the aforementioned coupling conditions are optimal is shown using
the ABCD matrix formalism. The matrix concatenation for two like collimators is

$$\mathbf{A} = \begin{pmatrix} 1 & l_p \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} \begin{pmatrix} 1 & L_{gap} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} \begin{pmatrix} 1 & l_p \\ 0 & 1 \end{pmatrix}$$

The resulting product is

$$\mathbf{A} = \frac{1}{f^2} \begin{pmatrix} L_{gap} l_p - 2 f l_p - f L_{gap} + f^2 & (l_p - f)(L_{gap} l_p - 2 f l_p - f L_{gap}) \\ L_{gap} - 2f & L_{gap} l_p - 2 f l_p - f L_{gap} + f^2 \end{pmatrix}$$

To achieve optimal coupling, the B entry must be zero. It is in this circumstance
that a point source is imaged to another point source with no regard
to the inclination of the rays that emanate from the original source. One finds
two possible solutions:

$$(l_p - f) \left( \frac{1}{f} - \frac{1}{L_{gap}/2} - \frac{1}{l_p} \right) = 0$$

The solutions are

Collimate: $l_p = f$

Focus: $\dfrac{1}{f} = \dfrac{1}{L_{gap}/2} + \dfrac{1}{l_p}$

To collimate, the fiber facets are positioned on the front focal planes of the lenses.
The ABCD matrix is

$$\mathbf{A}_{coll} = \begin{pmatrix} -1 & 0 \\ \dfrac{L_{gap} - 2f}{f^2} & -1 \end{pmatrix} \qquad (5.4.3)$$

Alternatively, to focus, the fiber facets are pulled behind the front focal planes
of the lenses. A focus is reached midway between the lenses with the
magnification factor (5.4.1). The ABCD matrix in this case is

$$\mathbf{A}_{focus} = \begin{pmatrix} 1 & 0 \\ \dfrac{L_{gap} - 2f}{f^2} & 1 \end{pmatrix} \qquad (5.4.4)$$
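The two solutions can be checked numerically by multiplying out the five ray matrices. The sketch below uses plain Python lists for the 2×2 matrices; the focal length and gap are arbitrary illustrative values, and the helper names are not from the text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def space(length):
    """Free-space propagation over the given length."""
    return [[1.0, length], [0.0, 1.0]]

def lens(f):
    """Thin lens of focal length f."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def lens_pair(lp, f, gap):
    """ABCD matrix: fiber -> lp -> lens -> gap -> lens -> lp -> fiber."""
    m = space(lp)  # first element the ray meets is the rightmost factor
    for element in (lens(f), space(gap), lens(f), space(lp)):
        m = matmul(element, m)
    return m

f, gap = 2.0, 50.0  # illustrative values in mm

# Collimating solution: l_p = f.
a_coll = lens_pair(f, f, gap)

# Focusing solution: 1/f = 1/(gap/2) + 1/l_p.
lp_focus = 1.0 / (1.0 / f - 2.0 / gap)
a_focus = lens_pair(lp_focus, f, gap)

print("collimate:", a_coll)
print("focus:    ", a_focus)
```

In both cases the B entry vanishes and the C entry equals $(L_{gap} - 2f)/f^2$; the diagonal entries are −1 for collimation (inverted image) and +1 for focusing (upright image).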
Consider first the case of collimation in the ray-optic framework. Denote by $d$
the excess of the lens-to-lens gap over two focal lengths, such
that $d = L_{gap} - 2f$. The output position and slope are related to the inputs as

$$\begin{pmatrix} y_1 \\ y_1' \end{pmatrix} = \begin{pmatrix} -y_o \\ (d/f^2)\, y_o - y_o' \end{pmatrix}$$

All points $y_o$ in the object plane are reconstructed on the image plane with
up-down inversion and unity magnification. For the point on the object plane
that is also on axis ($y_o = 0$) the angle of an image ray is the negative of the
angle of an associated object ray: $y_1' = -y_o'$. For points on the object plane
off axis, the angle is adjusted to ensure that at the image plane the points
reconstruct the object with up-down inversion. In the gaussian framework,
the output $q$ parameter is related to the input $q$ parameter as

$$\frac{1}{q_1} = -\frac{d}{f^2} + \frac{1}{q_o} \qquad (5.4.5)$$

When the object is in focus, $1/q_o = -j 2/(k w_o^2)$. Clearly the waist of $q_1$ is the
same as that of $q_o$, representing unity magnification, and field curvature is added
when the excess gap is non-zero.
The same observations regarding the collimating lens pair apply to the
focusing lens pair with the exception that the image is not up-down inverted
but is instead recovered upright. The notion of excess gap is not applicable
to the focusing system because the focal position is designed to lie between
the two lenses. Field curvature is imparted at the object plane via the same
mechanism that applies to the collimating lens pair. Only when the object
and image are at infinity do they both have zero field curvature.


5.4.1 Coupling Coefficients


Calculation of the coupling from fiber to fiber is essential to the design of
robust components. Gaussian optics provides a framework from which to
calculate the coupling coefficients. However, it is once again emphasized that
robust tolerancing must be done with commercially available ray-tracing
software, although reasonable loss estimates can be made with gaussian optics.
One method to calculate the coupling loss is that of overlap integrals. A
first gaussian mode profile $e_1$ from one beam is multiplied by a second profile
$e_2$ of another beam and integrated over the cross-section. The general mode
overlap integral is

$$I = \frac{\iint e_1 e_2^* \, dx\,dy}{\sqrt{\iint e_1 e_1^* \, dx\,dy \, \iint e_2 e_2^* \, dx\,dy}} \qquad (5.4.6)$$

The magnitude of the overlap integral $I$ is bounded by zero and unity. The overlap
integrals derived in the following account for magnification errors, offset, tilt, and
defocusing. The method of overlap integrals allows the coupling coefficient to
be calculated at any convenient plane along the optical path, not just, for
instance, at the output. The midpoint along a path is often a convenient
location to calculate the mode overlap. Note, however, that care must be taken to
account properly for propagation direction when the overlap integral is taken
away from the source or target; a mode reverse propagated to the overlap
plane must be reversed on the plane to account for the complex conjugate
found in the integral.
Magnification and Offset Errors
Consider a first gaussian mode with circular beam radius $w_1$ and centered
along the axis of propagation. The normalized mode profile is

$$e_1(x, y) = \frac{1}{\sqrt{\pi w_1^2}} \exp\left( -\frac{x^2 + y^2}{2 w_1^2} \right) \qquad (5.4.7)$$

This mode might be the approximate eigenmode of a fiber. Now, consider a
second gaussian mode that is imaged onto the first mode. The second mode
has magnification error, where the beam radius $w_2$ is not $w_1$, and offset error
where the offset $\Delta x$ is taken, without loss of generality, along the $x$ axis. The
second mode profile is written as

$$e_2(x, y) = \frac{1}{\sqrt{\pi w_2^2}} \exp\left( -\frac{(x - \Delta x)^2 + y^2}{2 w_2^2} \right)$$

The coupling coefficient of these two modes is calculated from (5.4.6):

$$I_a = \frac{2 w_1 w_2}{w_1^2 + w_2^2} \exp\left( -\frac{\Delta x^2}{2 (w_1^2 + w_2^2)} \right) \qquad (5.4.8)$$


When both mode radii are the same, the overlap integral is unity for zero
offset and decreases to zero as the offset is increased. In the asymptotic limit
that one mode is much larger than the other, the overlap integral is dominated
by the smaller mode size.
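The closed form for magnification and offset errors can be checked against a direct numerical evaluation of the overlap of the two normalized gaussian profiles. The sketch below assumes the profiles $e_k \propto \exp(-r^2/2w_k^2)$ with normalization $1/\sqrt{\pi w_k^2}$, and the result $I_a = \frac{2 w_1 w_2}{w_1^2 + w_2^2}\exp\left(-\frac{\Delta x^2}{2(w_1^2 + w_2^2)}\right)$; the sample radii and offset are illustrative values in micrometers.

```python
import math

def coupling_offset(w1, w2, dx):
    """Closed-form overlap for radii w1, w2 and lateral offset dx."""
    s = w1 ** 2 + w2 ** 2
    return 2.0 * w1 * w2 / s * math.exp(-dx ** 2 / (2.0 * s))

def _overlap_1d(w1, w2, shift, n=4001, span=10.0):
    """Riemann-sum overlap of the 1-D factors of the two normalized gaussians."""
    half = span * max(w1, w2) + abs(shift)
    h = 2.0 * half / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half + i * h
        total += math.exp(-x * x / (2.0 * w1 * w1)
                          - (x - shift) ** 2 / (2.0 * w2 * w2))
    return total * h / math.sqrt(math.pi * w1 * w2)

def coupling_offset_numeric(w1, w2, dx):
    """The 2-D overlap separates into an x factor (offset) and a y factor."""
    return _overlap_1d(w1, w2, dx) * _overlap_1d(w1, w2, 0.0)

w1, w2, dx = 5.2, 6.0, 1.0  # illustrative values in micrometers
closed = coupling_offset(w1, w2, dx)
numeric = coupling_offset_numeric(w1, w2, dx)
print(f"closed form = {closed:.6f}, numeric = {numeric:.6f}")
```

The two evaluations should agree to within the quadrature error, and the coefficient returns to unity when the radii match and the offset is zero.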
Tilt Error
Another error is the tilt error. This can come from misalignment of the
collimators or may be built into the architecture of the device, e.g. a wedge-type
polarization-independent isolator. The tilt is accounted for by a phase front
rotation by the angle between the two beams. For a tilt angle of $\theta$, the phase
rotation is $\exp(+jkx \tan\theta)$, where $k$ is the wavenumber. For normalized modes
and in the small angle limit, the phase tilt is added to (5.4.6) via

$$I = \iint e_1 \, e^{-jk\theta x} \, e_2^* \, dx\,dy \qquad (5.4.9)$$

Two beams may be tilted and offset along different directions. The
perpendicular and parallel cases are considered here. When the tilt and offset are
perpendicular, the resultant coupling coefficient is

$$I_b^{\perp} = \frac{2 w_1 w_2}{w_1^2 + w_2^2} \exp\left( -\frac{k^2 \theta^2 w_1^2 w_2^2 + \Delta x^2}{2 (w_1^2 + w_2^2)} \right) \qquad (5.4.10)$$

When the beams are aligned but for the tilt and the magnification is unity,
the tilt penalty is

$$I_b = \exp\left( -\frac{k^2 \theta^2 w^2}{4} \right) \qquad (5.4.11)$$

When the tilt and offset are parallel, the resultant coupling coefficient is

$$I_b^{\parallel} = \frac{2 w_1 w_2}{w_1^2 + w_2^2} \exp\left( -\frac{k^2 \theta^2 w_1^2 w_2^2 + 2 j k \theta \Delta x\, w_1^2 + \Delta x^2}{2 (w_1^2 + w_2^2)} \right) \qquad (5.4.12)$$

This coupling coefficient is a complex number. The phase of the coefficient
does not matter when one beam overlaps with another. However, accounting
for the phase is critical when two or more co-polarized beams are coupled into
the same fiber or waveguide; in this case coherent interference will occur and
more or less coupling can be achieved with offset or tilt of the beam. In the
present case, only one beam couples to another, in which case the magnitude
of (5.4.12) is what matters. That magnitude is the same as written in (5.4.10),
that is

$$\left| I_b^{\parallel} \right| = I_b^{\perp} \qquad (5.4.13)$$
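A short sketch makes the relationship between the perpendicular and parallel cases concrete. It assumes the coupling coefficient has the form of a real prefactor $2 w_1 w_2/(w_1^2 + w_2^2)$ times an exponential whose real part is $-(k^2\theta^2 w_1^2 w_2^2 + \Delta x^2)/2(w_1^2 + w_2^2)$, with an additional imaginary part $-k\theta\,\Delta x\, w_1^2/(w_1^2 + w_2^2)$ in the parallel case. The beam radius, wavelength, and tilt are illustrative values for a collimated beam, not data from the text.

```python
import cmath
import math

def coupling_tilt_offset(w1, w2, dx, theta_rad, wavelength, parallel=False):
    """Complex overlap with tilt theta and offset dx (same length units throughout)."""
    k = 2.0 * math.pi / wavelength
    s = w1 ** 2 + w2 ** 2
    exponent = -(k ** 2 * theta_rad ** 2 * w1 ** 2 * w2 ** 2 + dx ** 2) / (2.0 * s)
    if parallel:
        # Cross term 2j*k*theta*dx*w1^2 / (2*s) appears only for parallel tilt and offset.
        exponent = complex(exponent, -k * theta_rad * dx * w1 ** 2 / s)
    return 2.0 * w1 * w2 / s * cmath.exp(exponent)

def tilt_penalty(w, theta_rad, wavelength):
    """Tilt-only penalty exp(-k^2 theta^2 w^2 / 4) for equal radii and zero offset."""
    k = 2.0 * math.pi / wavelength
    return math.exp(-(k * theta_rad * w) ** 2 / 4.0)

# Illustrative collimated-beam numbers in micrometers and radians.
w, wavelength, theta, dx = 200.0, 1.55, math.radians(0.05), 20.0

perp = coupling_tilt_offset(w, w, dx, theta, wavelength, parallel=False)
par = coupling_tilt_offset(w, w, dx, theta, wavelength, parallel=True)
print(f"|perpendicular| = {abs(perp):.4f}, |parallel| = {abs(par):.4f}")
print(f"parallel-case phase = {cmath.phase(par):.4f} rad")
print(f"tilt-only penalty = {tilt_penalty(w, theta, wavelength):.4f}")
```

The parallel case differs from the perpendicular case only by a phase factor, so the magnitudes coincide; the phase would matter only if several co-polarized beams were combined coherently.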


Focus Error
Focus error can be treated in the same manner as tilt error, where a phase term
is added to the overlap integral. For tilt the phase front was modified by a
linear increase in phase as a function of lateral coordinate. For focus error,
within the gaussian approximation where all field curvatures are spherical, a
quadratic phase increase (or decrease) as a function of lateral coordinate is
inserted. The overlap integral takes the form

$$I = \iint e_1 \, \exp\left( -j \frac{x^2 + y^2}{2 r^2} \right) e_2^* \, dx\,dy \qquad (5.4.14)$$

where $r$ is the field curvature. Eliminating tilt and offset errors but retaining
magnification error, the coupling coefficient with focus error is

$$I_c = \frac{1}{w_1 w_2} \left( \frac{1}{2 w_1^2} + \frac{1}{2 w_2^2} + \frac{j}{2 r^2} \right)^{-1} \qquad (5.4.15)$$

As with the tilt error, the integral (5.4.15) is a complex number. The imaginary
part is associated with the field curvature error. When more than one
co-polarized beam is coupled to the same output port, interference due to the
field curvature results. However, in the present case no such interference is
considered and the magnitude of the coupling coefficient generates the loss
penalty. The magnitude of (5.4.15) is

$$|I_c| = \frac{1}{w_1 w_2} \left[ \left( \frac{1}{2 w_1^2} + \frac{1}{2 w_2^2} \right)^2 + \left( \frac{1}{2 r^2} \right)^2 \right]^{-1/2} \qquad (5.4.16)$$

and finally when the magnification is unity the defocus penalty reduces to

$$|I_c| = \frac{1}{\sqrt{\left( \dfrac{w^2}{2 r^2} \right)^2 + 1}} \qquad (5.4.17)$$

Taylor expansion of (5.4.17) shows that the initial penalty accrues as quickly
as that for offset (5.4.8) and tilt (5.4.11) errors.

References
1. K. Asano and H. Hosoya, "Collimator lens, fiber collimator and optical parts,"
U.S. Patent Application 2002/0 168 140 A1, Nov. 14, 2002.
2. P. Bernard, M. A. Fitch, P. Fournier, M. F. Harris, and W. P. Walters, "Fabrication
of collimators employing optical fibers fusion-spliced to optical elements
of substantially larger cross-sectional areas," U.S. Patent 6,360,039 B1, Mar. 19,
2002, same spec as US 2002/041742 A1 and US 2002/0054735 A1.
3. P. Bernard, M. A. Fitch, P. Fournier, M. F. Harris, and W. P. Walters, "Fabrication
of collimators employing optical fibers fusion-spliced to optical elements
of substantially larger cross-sectional areas," U.S. Patent Application
2002/0 041 742 A1, Apr. 11, 2002, same spec as US 6,360,039 B1 and US
2002/0054735 A1.
4. P. Bernard, M. A. Fitch, P. Fournier, M. F. Harris, and W. P. Walters, "Fabrication
of collimators employing optical fibers fusion-spliced to optical elements
of substantially larger cross-sectional areas," U.S. Patent Application
2002/0 054 735 A1, May 9, 2002, same spec as US 6,360,039 B1 and US
2002/0041742 A1.
5. C. Brophy and A. K. Thompson, "Dual fiber collimator," U.S. Patent Application
2003/0 021 531, Jan. 30, 2003.
6. Casix Quality Assurance Department, "Reliability test report on c-collimator,"
Casix, Inc., Fuzhou, Fujian, P.R. China, Tech. Rep. TR1201020 Issue 01, Oct.
1999. [Online]. Available: http://www.casix.com
7. Casix Quality Assurance Department, "Reliability test report on collimator,"
Casix, Inc., Fuzhou, Fujian, P.R. China, Tech. Rep. TR1201001 Issue 01, Dec. 1999.
[Online]. Available: http://www.casix.com
8. "Mode-field diameter measurement method," Corning Incorporated, Corning,
NY, Aug. 2001, MM16.
9. "Corning SMF-28 optical fiber product information," Corning Incorporated,
Corning, NY, Aug. 2002, PI1036.
10. H. A. Haus, Waves and Fields in Optoelectronics. Englewood Cliffs, New Jersey:
Prentice-Hall, 1984.
11. "Micro optics for telecom catalog 2002," Koncent Communications, Fuzhou,
Fujian, P.R. China, 2002. [Online]. Available: http://www.koncent.com/
12. J. A. Kong, Electromagnetic Wave Theory. New York: John Wiley & Sons,
1989.
13. "Lightpath technologies product catalog," LightPath Technologies, Orlando,
FL, 2003. [Online]. Available: http://www.lightpath.com/literature.html
14. Z. Liu, "Optical collimator with long working distance," U.S. Patent 6,469,835
B1, Oct. 22, 2002.
15. NSG America, Inc., Somerset, NJ, object at Dispersion Equations and Paraxial
Optics Formulae. [Online]. Available: http://www.nsgamerica.com/technical.shtml
16. "Selfoc microlens table," NSG America, Inc., Somerset, NJ. [Online]. Available:
http://www.nsgamerica.com/technology/microlens.cfm
17. I. Ooyama, T. Fukuzawa, and S. Kai, "Optical fiber collimator," U.S. Patent
Application 2002/0 094 163 A1, July 18, 2002.
18. J.-J. Pan, M. Shih, and J. Xu, "Integrable fiberoptic coupler and resulting
devices and system," U.S. Patent 5,889,904, Mar. 30, 1999.
19. C. Qian and Y. Qin, "Optical fiber collimator with long working distance and
low insertion loss," U.S. Patent Application 2002/0 197 020 A1, Dec. 26, 2002.
20. M. Shirasaki, "Optical apparatus which uses a virtually imaged phased array to
produce chromatic dispersion," U.S. Patent 5,930,045, July 27, 1999.
21. Generic Reliability Assurance Requirements for Passive Optical Components,
Telcordia Technologies Std. GR-1221-CORE, 1999. [Online]. Available:
http://telecom-info.telcordia.com/site-cgi/ido/index.html
22. L. Ukrainczyk, "Optical fiber collimators and their manufacture," U.S. Patent
Application 2003/0 026 535 A1, Feb. 6, 2003.
23. A. Yariv and P. Yeh, Optical Waves in Crystals. Hoboken, New Jersey:
Wiley-Interscience, John Wiley & Sons, Inc., 2003.
24. Y. Zheng, "Dual fiber optical collimator," U.S. Patent 6,148,126, Nov. 14, 2000.
25. Y. Zheng, "Reliable low-cost dual fiber optical collimator," U.S. Patent 6,246,813
B1, June 12, 2001.

6
Isolators

An isolator is a one-input, one-output component that transmits light in a forward
direction and blocks light in a reverse direction. All isolators incorporate
at least one Faraday rotator (FR) as the nonreciprocal element. There are two
classes of isolators: polarizing isolators that isolate by rejecting the unwanted
polarization state, and polarization-independent isolators that isolate by spatial
filtering. Polarizing isolators have a single optical path that transits a first
polarizer, the FR, and a second polarizer. In the isolation direction the unwanted
polarization is either absorbed or deflected. Polarization-independent
isolators use polarization diversity to form two co-directional optical paths,
one for each polarization. Along the forward direction the two paths recombine
at the collimator while along the reverse direction the two paths fall outside
of the fiber aperture.
Analysis of the polarizing isolator highlights the issues regarding transmission
and isolation as a function of temperature, wavelength, and manufacturing
error. The materials-related shortcomings of iron garnets are apparent
in this analysis. The polarization-diversity schemes that realize
polarization-independent isolators guide the choice of the best lensing system and
open up the issue of path-length balancing. A path-length imbalance imparts
polarization-mode dispersion (PMD). PMD-compensated isolators are realized by select
changes in the optical elements used to build the isolator or by explicit addition
of a compensating element.

6.1 Polarizing Isolator


Polarizing isolators were the first isolator type to demonstrate the importance
of the nonreciprocal Faraday rotator. Early literature shows a Verdet-based
isolator [28] and a terbium-aluminum garnet-based isolator [16]. A garnet-based
polarizing isolator is illustrated in Fig. 6.1. The polarizing axes of the
first and second polarizers are rotated 45° with respect to one another. A
Faraday rotator with a fixed magnetization direction is placed between the
two polarizers. In practice the FR is a saturated Bi:RIG iron garnet (cf. §4.2.3)
where the magnetization is fixed by a permanent magnet, such as Sm-Co, or
where the iron garnet is latching and pre-poled. Multi-magnet schemes have
been proposed to concentrate the magnetic field around the FR [5, 6, 18],
but in practice a single magnet is used. The FR is designed to rotate a linear
polarization state by +45° (or −45°) irrespective of transit direction.

Fig. 6.1. Faraday rotation of 45° between two polarizers. Polarizer and analyzer
are rotated 45° with respect to one another to maximize transmission and isolation.
a) Forward path allows transmission. Horizontal linear polarization is rotated +45°
by FR and transits analyzer $P_{45}$ without loss. b) Reverse, isolation path. +45° linear
polarization is rotated +45° by FR and is extinguished by analyzer $P_o$.

In the forward, or transmission, direction, the lead polarizer $P_o$ polarizes
the light along the horizontal (the absolute direction being, of course,
immaterial). The FR subsequently rotates the polarization by +45°, which aligns it
for complete transmission through the second polarizer $P_{45}$. In the reverse, or
isolation, direction, the lead polarizer $P_{45}$ polarizes the light at +45°. The FR
rotates the polarization by +45°, at which point the polarization is clipped
by the second polarizer $P_o$. The second polarizer absorbs the light.
Problems with this system arise from the wavelength and temperature
dependence of the FR plate, manufacturing error of the plate length, residual
linear birefringence in the garnet, and multiple reflections due to imperfect
antireflection coatings. Residual linear birefringence is reduced by removing
the garnet from its substrate and annealing the film, although linear
birefringence remains the ultimate limiting factor. Imperfect antireflection coatings
cause multiple reflections inside the material; each full pass rotates the
polarization by approximately 90°, so every other round-trip reflection creates
a polarization component that reduces isolation.
The specific rotation of an iron garnet has temperature and wavelength
dependencies:

$$\hat{\theta}_F(\lambda, T) = \frac{2\pi}{\lambda} \Delta n(\lambda, T) \qquad (6.1.1)$$

The Faraday rotation angle $\theta_F$ for a plate of length $L$ is

$$\theta_F = \hat{\theta}_F L \qquad (6.1.2)$$

At a nominal temperature, wavelength, and thickness the target rotation
is $\theta_{Fo}$. The actual rotation for small deviations is $\theta_F = \theta_{Fo} + \Delta\theta_F$, where
the total deviation from the target is

$$\Delta\theta_F = \frac{d\theta_F}{dL} \Delta L + \frac{d\theta_F}{dT} \Delta T + \frac{d\theta_F}{d\lambda} \Delta\lambda \qquad (6.1.3)$$

The first term comes from manufacturing error, the second from temperature
dependence of the garnet, and the third from the total wavelength dependence.
The wavelength dependence has two components, one from the fixed waveplate
thickness and the other from the wavelength-dependent $\Delta n$ of the garnet:

$$\frac{d\theta_F}{d\lambda} = -\frac{1}{\lambda_o} \theta_{Fo} + \frac{\theta_{Fo}}{\Delta n} \frac{d\Delta n}{d\lambda} \qquad (6.1.4)$$

The first term is simply the frequency dependence of the waveplate and comes
from (4.6.13) on page 182. The second term highlights the material
dependence on wavelength. This term is not zero and generally increases the overall
wavelength dependence of the plate.
Recall that the eigen-axis of a Faraday rotator is $\pm\hat{s}_3$. The associated Jones
operator $U_F$ is

$$U_F = \cos\theta_F \, I \mp j \sigma_3 \sin\theta_F \qquad (6.1.5)$$

The operator $U_F$ must be treated carefully: due to the nonreciprocal nature
of the FR, the signs of $\theta_F$ and $\sigma_3$ are invariant to transit direction. The ± sign
encompasses only the magnetization direction $M$ and the particular Bi:RIG
material. Once these are fixed, the sign is fixed. The point of including the
sign is to represent the possibility that the FR or permanent magnet can be
flipped around. Without loss of generality, the (−) sign will be used in the
following.
Also, recall that a polarizer is represented by the projection matrix (2.5.2)
on page 52. The two polarizers for this nominal isolator are

$$P_o = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{and} \quad P_{45} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \qquad (6.1.6)$$

Equipped with these polarizer and FR matrices, the forward and reverse
isolator paths can be analyzed.
In the forward direction, output state $|t\rangle$ is generated from input state $|s\rangle$
via

$$|t\rangle = P_{45} U_F P_o |s\rangle$$

Recalling that $P^2 = P$, the output intensity is

$$\langle t | t \rangle = \langle s | P_o U_F^\dagger P_{45} U_F P_o | s \rangle \qquad (6.1.7)$$

Combining (6.1.5-6.1.6) into (6.1.7) gives the forward transmission matrix

$$T^+_{iso} = \frac{1}{2} (1 + \sin 2\theta_F) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

At the nominal Faraday rotation angle $\theta_{Fo} = +45^\circ$ the transmission is unity.
In the reverse direction, the output state $|s'\rangle$ is generated from input
state $|t'\rangle$ via

$$|s'\rangle = P_o U_F P_{45} |t'\rangle$$

The output intensity is

$$\langle s' | s' \rangle = \langle t' | P_{45} U_F^\dagger P_o U_F P_{45} | t' \rangle$$

which yields the reverse transmission matrix

$$T^-_{iso} = \frac{1}{4} (1 - \sin 2\theta_F) \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$

Stripping away the polarization orientation, the norm of $T^-_{iso}$ gives the total
reverse intensity

$$|T^-_{iso}| = \frac{1}{2} (1 - \sin 2\theta_F) \qquad (6.1.8)$$

At the nominal Faraday rotation angle $\theta_{Fo} = +45^\circ$ the transmission is zero.
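The forward and reverse transmissions can be reproduced by applying the polarizer and rotator matrices to explicit input states. This sketch assumes a representation in which $\sigma_3 = \begin{pmatrix} 0 & -j \\ j & 0 \end{pmatrix}$, so that $U_F$ with the (−) sign is the real rotation matrix $\begin{pmatrix} \cos\theta_F & -\sin\theta_F \\ \sin\theta_F & \cos\theta_F \end{pmatrix}$; that representation and the helper names are assumptions made for illustration.

```python
import math

def apply(m, v):
    """Apply a real 2x2 matrix to a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def intensity(v):
    """Squared norm of a real Jones vector."""
    return v[0] ** 2 + v[1] ** 2

P0 = [[1.0, 0.0], [0.0, 0.0]]    # horizontal polarizer
P45 = [[0.5, 0.5], [0.5, 0.5]]   # polarizer at +45 degrees

def faraday(theta_deg):
    """U_F as a real rotation by theta (ASSUMED sigma_3 representation)."""
    t = math.radians(theta_deg)
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def forward_transmission(theta_deg):
    """Horizontal input through P0, the FR, then P45."""
    s = [1.0, 0.0]
    return intensity(apply(P45, apply(faraday(theta_deg), apply(P0, s))))

def reverse_transmission(theta_deg):
    """+45 degree input through P45, the same FR rotation, then P0."""
    t = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
    return intensity(apply(P0, apply(faraday(theta_deg), apply(P45, t))))

for theta in (45.0, 44.0):
    print(theta, forward_transmission(theta), reverse_transmission(theta))
```

The same rotation matrix is used in both directions, which is how the nonreciprocity enters; at 45° the forward transmission is unity and the reverse transmission vanishes, and away from 45° the values follow $\frac{1}{2}(1 \pm \sin 2\theta_F)$.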
The nominal forward and reverse transmissions are unity and zero,
respectively. However, rotation deviations included in (6.1.3) degrade the
performance. Accounting for deviations, the forward and reverse transmissions
are

$$T^+_{iso} = \cos^2 \Delta\theta_F, \quad \text{and} \quad T^-_{iso} = \sin^2 \Delta\theta_F$$

Isolation is defined as

$$I_{iso} = -10 \log_{10} \left( T^-_{iso} \right) \quad \text{(dB)} \qquad (6.1.9)$$

The forward transmission, or insertion loss (IL), changes to second order with
change in $\Delta\theta_F$ and the isolation changes to first order.
Consider the frequency dependence of the rotation alone:

$$\Delta\theta_F = \frac{\omega - \omega_o}{\omega_o} \theta_{Fo} \qquad (6.1.10)$$

where $\theta_{Fo}$ coincides with frequency $\omega_o$. Figure 6.2(a) plots the isolation as a
function of frequency over the C-band. This idealized FR imparts +45° rotation
at 194.1 THz. Recalling the spectral coverage for the C-band from (4.1.5)
on page 146, the frequency deviation from center is $\Delta f/f_o = \pm 1.03\%$, which
translates to $\Delta\theta_F \approx \pm 0.45^\circ$. In turn, the minimum isolation on either side of
the C-band is $I_{iso} \approx 42$ dB. This is a high level of isolation that under ideal
conditions would be suitable for many applications.
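The band-edge numbers follow from the two relations above, as the sketch below illustrates. It assumes the reverse transmission $\sin^2\Delta\theta_F$, isolation defined as $-10\log_{10}$ of that transmission, and a rotation error that scales linearly with fractional frequency deviation; the band edges of 192.1 THz and 196.1 THz are read from the frequency axis of Fig. 6.2.

```python
import math

def isolation_db(delta_theta_deg):
    """Isolation in dB for a rotation error, assuming T_reverse = sin^2(delta_theta)."""
    return -10.0 * math.log10(math.sin(math.radians(delta_theta_deg)) ** 2)

def rotation_error_deg(f_thz, f0_thz=194.1, theta0_deg=45.0):
    """Rotation error from frequency detuning: (f - f0) / f0 * theta0."""
    return (f_thz - f0_thz) / f0_thz * theta0_deg

edge_error = rotation_error_deg(196.1)
print(f"band-edge rotation error = {edge_error:.2f} deg")
print(f"isolation at band edge   = {isolation_db(edge_error):.1f} dB")
print(f"isolation at 1.45 deg    = {isolation_db(1.45):.1f} dB")
```

A rotation error of roughly 0.45° at the band edge gives about 42 dB of isolation, and adding a 1° manufacturing error brings the figure down to about 32 dB.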
The remaining three factors further degrade the practical performance
of an isolator. One factor is that the FR plate is generally toleranced to
$\theta_F = 45^\circ \pm 1^\circ$ [17]. This translates into a frequency shift of the maximum
isolation point. Figure 6.2(b) shows such a frequency shift for $\Delta\theta_F = 0.25^\circ$. The
isolation for a 1° rotation error, combined with the 0.45° frequency-dependent
rotation, gives $\Delta\theta_F = 1.45^\circ$, or $I_{iso} \approx 32$ dB. This level of isolation is
commonly found in component specification sheets of polarization-independent
isolators at room temperature [9].

Fig. 6.2. The frequency dependence of the Faraday rotation plate, ignoring material
wavelength dependence, results in frequency-dependent isolation. a) Isolation over
the C-band. b) Any change in the FR angle changes the frequency of peak isolation.
[Both panels plot isolation (dB) against frequency (THz) from 192.1 to 196.1 THz.]
The remaining two factors are the change in specific rotation as a function
of temperature and wavelength. Examples of practical temperature and
wavelength ranges are 0 °C to +70 °C and the short to long wavelength sides of the
C-band. Taking room temperature as RT = 25 °C, the maximum temperature
excursion is $\Delta T = \pm 45$ °C. The C-band is covered by $\Delta\lambda = \pm 16$ nm.
The total of all four deviation factors should not cause worst-case
isolation to fall below a specified value. For example, $I_{iso} = 20$ dB requires
$\Delta\theta_F|_{max} = \pm 5.75^\circ$. Subtracting 1.0° for manufacturing tolerance, there
remains 4.75° for temperature and total wavelength dependencies. Setting the
coefficients equal gives

$$\left| \frac{d\theta_F}{d\lambda} \right| \le 0.08^\circ/\text{nm}, \quad \text{and} \quad \left| \frac{d\theta_F}{dT} \right| \le 0.08^\circ/{}^\circ\text{C} \qquad (6.1.11)$$

These coefficients translate to a 1.3° allowance for total wavelength
dependence and a 3.6° allowance for temperature dependence. Moreover, the
wavelength-dependent component of the waveplate is 0.45°, so the material
component must be less than 0.85°. As discussed in §4.2.3, these coefficients
are challenging from a materials standpoint.
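The error budget can be laid out step by step. The sketch below inverts the isolation definition, assuming a reverse transmission of $\sin^2\Delta\theta_F$, to obtain the maximum rotation error for a 20 dB floor, subtracts the 1° manufacturing tolerance, and splits the remainder over the ±16 nm and ±45 °C excursions with numerically equal coefficients, as the text does.

```python
import math

def max_rotation_error_deg(isolation_floor_db):
    """Largest rotation error that keeps isolation at or above the floor."""
    return math.degrees(math.asin(10.0 ** (-isolation_floor_db / 20.0)))

budget_deg = max_rotation_error_deg(20.0)   # total allowed rotation error
remaining_deg = budget_deg - 1.0            # after 1 deg manufacturing tolerance

delta_lambda_nm = 16.0                      # C-band half-width
delta_t_c = 45.0                            # excursion from 25 C to 70 C

# Numerically equal coefficients in deg/nm and deg/C.
coefficient = remaining_deg / (delta_lambda_nm + delta_t_c)
wavelength_allowance_deg = coefficient * delta_lambda_nm
temperature_allowance_deg = coefficient * delta_t_c

print(f"total budget     = {budget_deg:.2f} deg")
print(f"coefficient      = {coefficient:.3f} deg per nm and per C")
print(f"wavelength share = {wavelength_allowance_deg:.2f} deg")
print(f"temperature share= {temperature_allowance_deg:.2f} deg")
```

The unrounded coefficient comes out slightly below 0.08, so the allowances computed here are a little smaller than the 1.3° and 3.6° obtained in the text from the rounded value.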
That the worst-case isolation can fall to 20 dB even with perfect polarizing elements is quite a remarkable fact. The relatively poor performance
of the FR over the full operating range necessitates two-stage isolators for
high-performance applications.
One simple improvement ubiquitous in the industry is to change the angle
between polarizers to account for manufacturing error of the FR plate. A 1°
manufacturing error in rotation is 18% of the total error budget. In the
isolation direction, a manufacturing error of $\Delta\theta_F$ is made up with a one-to-one
change in the rotation of the analyzing polarizer. This, however, changes the
insertion loss of the forward direction. If the analyzer is rotated by angle
$45^\circ + \Delta\theta$, the insertion loss varies with analyzer error as

$$|T^+_{iso}(\Delta\theta)| = \cos^2 2\Delta\theta \qquad (6.1.12)$$

Note that the IL goes as $2\Delta\theta$, but still imparts only a second-order change.

Fig. 6.3. Two-stage, complementary FR configuration [19]. a) One stage has FR
biased to an angle below 45° while the other stage is biased above 45°. b) Two-stage
isolation across temperature. The product of positive- and negative-biased FR
angles provides better overall isolation than two stages with the same FR centered
at 25 °C. [Panel a) plots isolation (dB) against Faraday rotation angle (deg) at RT;
panel b) plots isolation (dB) against temperature (°C).]
To improve the overall isolation, a two-stage isolator is necessary. Shiraishi
proposes a two-stage isolator where the two FRs are detuned in frequency
about a center frequency [19, 20]. The detuning is easily achieved by
polishing the iron garnets to different thicknesses. For detuning rotations $\Delta\theta_{F1}$
and $\Delta\theta_{F2}$, the forward and reverse transmissions in cascade are

$$T^+_{iso} = \cos^2 \Delta\theta_{F1} \cos^2 \Delta\theta_{F2} \qquad (6.1.13a)$$

$$T^-_{iso} = \sin^2 \Delta\theta_{F1} \sin^2 \Delta\theta_{F2} \qquad (6.1.13b)$$

As the peak isolation frequencies shift with wavelength and temperature,
the overall isolation remains higher than that of a two-stage isolator using the
same FR. Figure 6.3 illustrates the principle. Figure 6.3(a) plots the
frequency-dependent isolation from two individual isolators where the FRs are different.
The combined isolation over temperature is shown in Fig. 6.3(b). The
combined isolation does not fall below 48 dB while that for two stages of a
single FR type centered at 25 °C is slightly worse at both temperature extremes.
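The benefit of detuning can be seen with a small numerical sketch. It assumes the cascaded reverse transmission $\sin^2\Delta\theta_{F1}\,\sin^2\Delta\theta_{F2}$ and models a temperature excursion as a common rotation shift applied to both stages; the bias of ±1.5° and the 3° shift are hypothetical values chosen for illustration, not data from Fig. 6.3.

```python
import math

def two_stage_isolation_db(d1_deg, d2_deg):
    """Isolation of two cascaded stages with rotation errors d1 and d2."""
    t_reverse = (math.sin(math.radians(d1_deg)) ** 2
                 * math.sin(math.radians(d2_deg)) ** 2)
    return -10.0 * math.log10(t_reverse)

def two_stage_forward(d1_deg, d2_deg):
    """Forward transmission of two cascaded stages."""
    return (math.cos(math.radians(d1_deg)) ** 2
            * math.cos(math.radians(d2_deg)) ** 2)

bias_deg = 1.5    # HYPOTHETICAL detuning: stages biased to 45 +/- 1.5 deg
shift_deg = 3.0   # HYPOTHETICAL common rotation shift at a temperature extreme

detuned = two_stage_isolation_db(shift_deg + bias_deg, shift_deg - bias_deg)
identical = two_stage_isolation_db(shift_deg, shift_deg)

print(f"detuned stages  : {detuned:.1f} dB")
print(f"identical stages: {identical:.1f} dB")
print(f"forward transmission, detuned: "
      f"{two_stage_forward(shift_deg + bias_deg, shift_deg - bias_deg):.4f}")
```

For a common shift larger than the bias, the product of the two detuned errors is smaller than the square of the single error, so the detuned pair holds a higher isolation at the extremes; the price is a finite rather than unbounded isolation at the nominal operating point.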

6.2 Comparison of Lens Systems


The next logical step in isolator development is the polarization-independent
(PI) isolator. There are two different types of PI isolators: deflection-type and
displacement-type. These types are detailed in the following sections. Here, a
look at the different optimal lensing systems serves to emphasize the contrast
between the two isolator types.

Fig. 6.4. Lensing systems for deflection and displacement isolators. a) Deflection
isolator uses a collimating design for shortest length. A collimating lens system
transforms angle in image space to position on the focal plane. b) Displacement
isolator uses a focusing design to minimize the necessary walkoff, in turn minimizing
the length of the birefringent crystals.

Figure 6.4 illustrates a deflection-type isolator (top) and displacement-type
isolator (bottom) and their respective collimating and focusing lens systems
(cf. §5.4). The core of the deflection isolator changes the angle of the beam
as a function of propagation direction. The collimating system transforms
angle into lateral offset at the focal plane, which leads to spatial filtering in
the isolation direction. The core of the displacement isolator operates on the
principle of walkoff, where beams are laterally displaced but remain collinear.
The lensing system transforms offset in the image plane to offset in the object
plane, the magnification of the system being the scale factor between the two
offsets. These offsets also lead to spatial filtering in the isolation direction.
Figure 6.5 illustrates the different configurations relevant to isolator
designs. Figure 6.5(a) shows an idealized alignment between the axis of a
collimated beam and a lens. All rays converge at the focal plane F. Figure 6.5(b)
shows the same collimating lens but where the beam is offset from the lens
axis by $y_o$, the angle remaining aligned to that axis. Since angle transforms to
offset for a collimating system, all rays converge at the original focal point of
Fig. 6.5(a), albeit along the inclination $\theta_c$. Figure 6.5(c) shows a collimated
beam deflected by angle $\theta_d$; the beam is focused to point $y_i$, which is offset
from the on-axis focal point. The focusing lens shown in Fig. 6.5(d) shows a
diverging beam offset from and parallel to the lens axis. On the object plane U
the rays focus to a point offset by $y_i$. If the magnification of the system is $M$,
the spot size $w_o$ and offset $y_o$ on the object plane are scaled to $w_i = w_o/M$
and $y_i = y_o/M$ on the image plane.
With these two configurations in mind, the action of the deflection-type
and displacement-type isolator cores and their relationship to the lensing
systems will be more apparent.

Fig. 6.5. Isolator-related beam transformations through a lens. Collimated systems are in (a–c) and a focusing system is in (d). a) Collimated beam travels through center of lens. b) Offset collimated beam focuses on same spot as (a). c) Deflected beam translates to focal-point offset. d) Offset diverging beam is imaged to a point offset from center.

6.3 Deflection-Type Isolators


The first polarization-independent isolators are reported in [12, 13] and [26]. These isolators used iron garnet Faraday rotators and polarization diversity schemes. However, the number of parts and the required size of the isolators, not to mention the deleterious inclusion of a bandwidth-limiting half-wave waveplate in the design by [12], make these PI isolators impractical.
The first practical polarization-independent isolator was invented by Shirasaki [21, 22]. The Shirasaki isolator is a deflection-type isolator that uses birefringent wedges to create polarization diversity and to polarize and analyze the light. The core of the deflection isolator is illustrated in Fig. 6.6. Two birefringent wedge prisms (cf. §4.7) are oriented as shown and a Faraday rotator is placed in between. The wedge angle of the prism deflects the incoming light, and the birefringence of the prism creates a polarization-dependent deflection. The extraordinary axis of the birefringent crystal is cut to lie in the face of the prism; in this way there is no walkoff but there is polarization-dependent refraction. The only required relation between the extraordinary axes of the two wedges is that they are 45° apart in the plane perpendicular to propagation. As illustrated in Fig. 6.6(a), when the wedge is cut such that the e-axis is at +22.5° from an edge of the rectangular aperture, the same wedge can be used at both ends. The Faraday rotator is a Bi:RIG iron garnet that has its saturated magnetization maintained by a permanent magnet, Fig. 6.6(b). Originally, a YIG garnet was used.
The forward and reverse ray-trace diagrams are shown in Fig. 6.7. (A detailed ray-trace that includes chief and marginal rays is available in [1].) In the forward direction (Fig. 6.7(a)), the input light is refracted by the first wedge into u- and v-paths, the paths being orthogonally polarized. On plane (a) the position and polarization of the u- and v-paths are indicated, where the polarization of the v-path is extraordinary in relation to the first wedge while the u-path is ordinary. Transit of the two beams through the FR rotates their linear polarization states by 45° in a clockwise manner. The polarizations of these beams are now aligned to the extraordinary and ordinary axes of the second wedge.
The v-path refracts based on the extraordinary index of the second wedge while the u-path refracts based on the ordinary index. Provided wedge angles and materials are the same, the second wedge deflection cancels the first. The u- and v-paths run collinear before the second lens, and the lens brings the two beams to focus at the same point.

Fig. 6.6. Core of single-stage deflection-type isolator. a) Two birefringent wedges having the same wedge angle are configured as shown. A Faraday rotator plate is located between the wedges and the magnetization vector points along the optic axis. b) The magnetization of the FR is typically held in saturation by a permanent magnet such as Sm-Co.

In the reverse direction the two wedges conspire to deflect the beams out of the aperture of the return fiber. Accounting for the polarization rotation from the Faraday rotator, the two wedges form a Wollaston prism as viewed from the isolation direction. As shown in Fig. 6.7(b), the wedge w2 imparts double refraction based on the polarization state of the input light (the same as wedge w1 for the forward path). The linear polarization states of paths u′ and v′ are shown at plane (a′). Transit through the FR again rotates the linear polarization states by 45° in a clockwise manner. At the leading face of wedge w1 the polarizations on the two paths are not aligned to the wedge. The v′-path, which was refracted by the extraordinary index in wedge w2, now refracts by the ordinary index, retaining a residual deflection that is calculated below. The opposite alignment occurs for the u′-path, with the effect of a residual deflection in the opposite direction. Together, the two wedges split and deflect the incoming light.

Fig. 6.7. Ray-trace diagrams for forward and isolation directions in a deflection-type isolator. a) Forward-path ray trace: beams of cross polarizations converge at the output fiber. Right: spot diagram through core. b) Isolation-path ray trace: beams of cross polarizations are deflected by the wedges, falling outside of the aperture of the return fiber. The frames around the spot diagrams are a guide for the eye only.

There are four calculations required for the deflection-type isolator: in the forward direction, the loss due to the beam offset before lens $L_2$; in the reverse direction, the loss due to the beam deflection by the wedge pair; the angle of the wedge; and the path-length imbalance, or PMD, in the forward direction.
In the forward direction, offset $y_o$ on the left side of lens $L_2$ is mapped by the lens into tilt angle $\theta_c$. For simplicity consider that the overall magnification of the lens pair is unity. The beam waists $w_o$ of the fiber and focused-beam modes are then the same. Using (5.4.11) on page 243, the mode-overlap due to tilt $\theta_c$ is
$$ I_b = \exp\left( -\frac{k^2 w_o^2}{4}\,\theta_c^2 \right) \tag{6.3.1} $$

where $k$ is the wavenumber and the tilt angle $\theta_c$ is related to offset $y_o$ and lens focal length $f$ via
$$ y_o = \theta_c f \tag{6.3.2} $$
The beam offset is therefore related to the lens parameters and mode overlap as
$$ y_o = \frac{2f}{k w_o}\sqrt{-\ln I_b} \tag{6.3.3} $$
The offset in turn is determined by the difference in displacement between the u- and v-paths. The displacement difference is
$$ d_v - d_u = (n_e - n_o)\,\theta_w \left( \frac{2 l_w}{n_e n_o} + l_g \right) \tag{6.3.4} $$
where $\theta_w$ is the angle of the wedge, $l_w$ and $l_g$ are the wedge and gap lengths, and $n_e$ and $n_o$ are the extraordinary and ordinary refractive indices. (Note that if the wedge prisms were exchanged such that the flat facets face the lenses, only the gap length $l_g$ would contribute to the displacement.) Provided that the axis of lens $L_2$ bisects the displacements $d_u$ and $d_v$, the offset is related to the displacement difference as $y_o = (d_v - d_u)/2$.
To appreciate the order of magnitude for tolerable offset $y_o$, consider $f = 1$ mm, $\omega = 2\pi \times 194.1$ THz, and $w_o = 5$ µm. For an insertion loss of $I_b = 0.05$ dB attributable to the beam displacement (and not imperfect AR coatings or material losses), the required offset is $y_o \approx 7$ µm.
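
The offset tolerance can be evaluated directly from (6.3.3). The following is a minimal sketch, not the author's calculation. The function name is hypothetical, and which wavenumber the example intends is an assumption: with the free-space wavenumber the result is roughly 10.5 µm, while scaling the wavenumber by a fiber index of 1.44 (the scaling that §6.4 applies to the mode overlap) gives roughly 7.3 µm, close to the quoted value.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def offset_for_loss(loss_db, f, w_o, freq_hz, n=1.0):
    """Beam offset y_o from (6.3.3) for a given insertion loss in dB.

    n scales the free-space wavenumber; n = 1.44 is an assumed fiber index.
    """
    k = n * 2 * math.pi * freq_hz / C
    overlap = 10 ** (-loss_db / 10)          # dB loss to linear mode overlap
    return 2 * f * math.sqrt(-math.log(overlap)) / (k * w_o)

y_free = offset_for_loss(0.05, 1e-3, 5e-6, 194.1e12)           # free-space k
y_fiber = offset_for_loss(0.05, 1e-3, 5e-6, 194.1e12, n=1.44)  # k scaled by n
```

Either way the tolerable offset is on the order of ten micrometers, which is the point of the example.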

In the reverse direction, deflection $\theta_d$ imparted by the wedge pair is mapped by lens $L_1$ to an offset position $y_i$ on focal plane $F_1$. Displacement of the reverse beams from lens center also creates a tilt angle when mapped through the lens, but given the small displacement calculated for the forward direction this angle is ignored here. Provided unity magnification, the mode-overlap due to displacement $y_i$ between the focused beams and the fiber facet, given by (5.4.8) on page 242, is
$$ I_a = \exp\left( -\frac{y_i^2}{4 w_o^2} \right) \tag{6.3.5} $$
where the displacement $y_i$ is related to the deflection angle via
$$ y_i = \theta_d f \tag{6.3.6} $$
The required deflection angle for a given isolation $I_a$ is therefore
$$ \theta_d = \frac{2 w_o}{f}\sqrt{-\ln I_a} \tag{6.3.7} $$
Continuing with the same example, an isolation of $I_{iso} = 45$ dB requires a deflection angle of $\theta_d \approx 1.8^\circ$.
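
As a numerical check of (6.3.7), the sketch below converts the isolation target from dB to a linear overlap and evaluates the deflection angle for the example values (f = 1 mm, w_o = 5 µm). The function name is illustrative only.

```python
import math

def deflection_for_isolation(isolation_db, f, w_o):
    """Deflection angle (rad) from (6.3.7) for an isolation given in dB."""
    overlap = 10 ** (-isolation_db / 10)     # 45 dB -> 10^-4.5
    return 2 * w_o * math.sqrt(-math.log(overlap)) / f

theta_d = deflection_for_isolation(45.0, 1e-3, 5e-6)
theta_d_deg = math.degrees(theta_d)
```

The result is about 1.84°, consistent with the value of approximately 1.8° quoted in the text.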
The deflection angle $\theta_d$ is directly related to the wedge angle $\theta_w$ and the birefringence of the crystal. Recall the deflection angle for a small-angle prism, given by (4.7.2) on page 199. Consider first the forward u-path. The deflections due to wedges w1 and w2 are
$$ \theta_{u1} = (n_o - 1)\,\theta_w \tag{6.3.8a} $$
$$ \theta_{u2} = -(n_o - 1)\,\theta_w \tag{6.3.8b} $$
Both deflections are based on the ordinary refractive index seen by the u-path. One sign is the opposite of the other because the bases of the wedge prisms are inverted. The total deflection is $\theta_{u1} + \theta_{u2} = 0$. This is consistent with the previous analysis of the forward path.
In the reverse direction the wedge deflections do not cancel. Following the same analysis, the deflections of the u′-path are
$$ \theta_{u'2} = -(n_o - 1)\,\theta_w \tag{6.3.9a} $$
$$ \theta_{u'1} = (n_e - 1)\,\theta_w \tag{6.3.9b} $$
The total deflection of the u′-path is therefore
$$ \theta_{u'} = \theta_d = (n_e - n_o)\,\theta_w \tag{6.3.10} $$
The deflection along the v′-path is the negative of (6.3.10).
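
The sign bookkeeping of (6.3.8) through (6.3.10) can be illustrated with the small-angle prism formula. This is a sketch under stated assumptions: the YVO4 indices (n_o = 1.9447, n_e = 2.1486 near 1550 nm) are commonly quoted values not given in the text, the 9° wedge is taken from the worked example, and the sign argument is a hypothetical way of encoding the inverted wedge bases.

```python
import math

def prism_deflection(n, wedge_angle, sign=+1):
    """Small-angle prism deflection (n - 1) * wedge_angle, signed by base orientation."""
    return sign * (n - 1.0) * wedge_angle

N_O, N_E = 1.9447, 2.1486      # assumed YVO4 indices near 1550 nm
theta_w = math.radians(9.0)    # wedge angle from the worked example

# Forward: each path sees the same index in both wedges, so the deflections cancel.
forward_u = prism_deflection(N_O, theta_w, +1) + prism_deflection(N_O, theta_w, -1)
forward_v = prism_deflection(N_E, theta_w, +1) + prism_deflection(N_E, theta_w, -1)

# Reverse: the Faraday rotation swaps the index seen in wedge w1.
reverse_u = prism_deflection(N_E, theta_w, +1) + prism_deflection(N_O, theta_w, -1)
reverse_v = prism_deflection(N_O, theta_w, +1) + prism_deflection(N_E, theta_w, -1)
```

The forward sums vanish, while the reverse sums are equal and opposite with magnitude (n_e - n_o) times the wedge angle, about 1.8° here.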


The deflection angle $\theta_d$ can now be associated with the isolation requirement. Substitution of (6.3.10) into (6.3.7) gives
$$ \theta_w = \frac{2 w_o}{\Delta n\, f}\sqrt{-\ln I_a} \tag{6.3.11} $$
where $\Delta n = n_e - n_o$. Using the exemplar values from above and using YVO4 as the wedge material, the wedge angle to provide $I_{iso} = 45$ dB is $\theta_w \approx 9.0^\circ$. This angle is consistent with YVO4 wedges that are used in the industry.
Together, equations (6.3.3), (6.3.4), and (6.3.11) define the specification of a wedge-type isolator. For these three equations there are three free variables: focal length $f$, wedge thickness $l_w$, and gap length $l_g$. The remaining parameters are the fiber, the material, and the transmission and isolation specifications.
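
To show how the three design equations interlock, the sketch below evaluates the wedge angle from (6.3.11) and then inverts (6.3.4) for the gap length. The YVO4 indices and the 0.4 mm wedge length are assumptions chosen for illustration; only f, w_o, the 45 dB isolation, and the 7 µm offset come from the worked example.

```python
import math

N_O, N_E = 1.9447, 2.1486   # assumed YVO4 indices near 1550 nm
F, W_O = 1e-3, 5e-6         # focal length and beam waist from the example

def wedge_angle(isolation_db):
    """Wedge angle (rad) from (6.3.11)."""
    overlap = 10 ** (-isolation_db / 10)
    return 2 * W_O * math.sqrt(-math.log(overlap)) / ((N_E - N_O) * F)

def gap_length(y_o, theta_w, l_w):
    """Invert (6.3.4), with y_o = (d_v - d_u) / 2, for the gap length l_g."""
    return 2 * y_o / ((N_E - N_O) * theta_w) - 2 * l_w / (N_E * N_O)

theta_w = wedge_angle(45.0)
l_g = gap_length(7e-6, theta_w, 0.4e-3)   # 0.4 mm wedge length is hypothetical
```

With these inputs the wedge angle comes out near 9.0° and the allowed gap is a few tenths of a millimeter. A negative gap would signal that the chosen wedge length alone already exceeds the offset budget.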
Table 6.1 presents deflection-type isolator specifications for a one-stage isolator reported by a manufacturer. The high return loss is achieved by collimator selection as well as by orienting the angled facet of the wedges toward the collimators to minimize back reflection. Power handling is limited by the collimator technology (cf. §5.1), and as reported here, the collimators are likely air-gap type.
The remaining calculation is the path imbalance, commonly referred to as the PMD of the device. Here, the use of the term PMD is not precise, and while the industry will not likely change its terminology based on a small discrepancy, the discerning reader should know the difference. In the forward direction the u- and v-paths experience different refractive indices. Accordingly, one path is fast while the other is slow. Since the paths are split according to polarization, one polarization state is delayed with respect to the other. This is, precisely, differential-group delay (DGD). Polarization-mode dispersion results from the concatenation of multiple, non-aligned DGD elements and is characterized in most cases by a PMD vector that changes its pointing direction with frequency. This is not the case for a single isolator. In this work, the PMD of a component will be used when relating to industry usage; but otherwise DGD, or $\tau$, is used. For the deflection-type isolator, note that in the forward direction the u-path experiences the ordinary index, while the v-path experiences the extraordinary. This is true through both crystals. The FR does not impart any significant differential delay. Since the deflection angle differences are small for a practical isolator, the wedge length $l_w$ approximates the actual path. The differential-group delay between the two paths is
$$ \tau = \frac{2 (n_{g,e} - n_{g,o})\, l_w}{c} \tag{6.3.12} $$
where $n_{g,e}$ and $n_{g,o}$ are the group indices of the e- and o-axes. A YVO4 wedge 0.5 mm long at the base with a 9° wedge has a path length of approximately 0.35 mm on the thin side. The differential-group delay for an isolator made from this deflection-type component is $\tau \approx 0.45$ ps (cf. §4.2.2). Substitution of LiNbO3 for YVO4 can further reduce the differential-group delay at the expense of an increase in gap length and wedge angle. For example, LiNbO3 wedges of the same length yield a differential-group delay of $\tau \approx 0.16$ ps.


Table 6.1. Deflection-Type Isolator Technology Comparison

Specification(a)                             1-Stage   2-Stage   PMD-Comp   Units
Isolation (λc ± 15 nm, 23 °C, all SOP)       32        60        32         dB
Isolation (λc ± 15 nm, 0-70 °C, all SOP)     22        42        22         dB
Insertion Loss (λc, 23 °C, all SOP)          0.2       0.4       0.3        dB
Insertion Loss (λc, 0-70 °C, all SOP)        0.3       0.6       0.5        dB
PMD                                          0.20      0.05      0.02       ps
PDL (λc, 23 °C)                              <------- 0.05 ------->         dB
Return loss (λc, 23 °C)                      <-------  60  ------->         dB
Power handling                               <------ 1,000 ------->         mW
Operating temperature                        <------ 0-70  ------->         °C

(a) Specification values reported by Koncent for C-band operation [9].

As a concluding remark, the deflection-type isolator can be generalized to a 2 × 2 port component to reduce size while increasing functionality [23].

6.4 Displacement-Type Isolators


Building on the spatial-walkoff method of polarization diversity first proposed in [12], Chang and Sorin developed a practical alternative to the Shirasaki isolator [2]. The Chang and Sorin isolator, herein called a displacement-type isolator, uses three birefringent walkoff crystals cut as parallelepiped blocks to achieve polarization independence. One of several variations of their isolator core is illustrated in Fig. 6.8 and its relation to the associated lensing system is found in Fig. 6.4(b).
The core is characterized by two walkoff blocks of length $a$ and the remaining block of length $\sqrt{2}a$ placed in the sequence $\sqrt{2} : 1 : 1$. The Faraday rotator is located between the longer block and the first shorter block. All blocks are cut so as to impart beam walkoff between orthogonal linear polarization states (cf. §3.6.1). Accordingly, the extraordinary axis has vector components that lie both transverse and longitudinal to the optical path. To maximize the walkoff angle, the extraordinary axis is inclined into the crystal by about 45°; the precise optimal angle, being material dependent, is calculated from (3.6.24) on page 117. For example, the optimal angles of inclination from the optical axis for YVO4 and rutile are both $\alpha_{max} \approx 47.8^\circ$. The three walkoff crystals are cut so that the walkoff directions are different from one another. Following Fig. 6.8, the walkoff directions are +90°, −45°, and +45° from front to back, respectively.
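
The optimal cut can be checked numerically. The sketch below assumes the standard uniaxial-crystal result that the walkoff expression, which appears later in this chapter as (6.6.7), is maximized when the tangent of the inclination angle equals n_e/n_o; the referenced (3.6.24) is not reproduced here, so treating it as equivalent is an assumption. The refractive indices are approximate values near 1550 nm, also assumed.

```python
import math

def optimal_inclination(n_o, n_e):
    """E-axis inclination (rad) that maximizes walkoff: tan(alpha) = n_e / n_o."""
    return math.atan(n_e / n_o)

def walkoff_angle(n_o, n_e, alpha):
    """Walkoff angle (rad) for e-axis inclination alpha, following (6.6.7)."""
    d = (n_e * math.cos(alpha)) ** 2 + (n_o * math.sin(alpha)) ** 2
    return math.atan((n_e**2 - n_o**2) * math.sin(alpha) * math.cos(alpha) / d)

# Approximate indices near 1550 nm (assumed values, not from the text).
CRYSTALS = {"YVO4": (1.9447, 2.1486), "rutile": (2.453, 2.709)}

results = {}
for name, (n_o, n_e) in CRYSTALS.items():
    alpha = optimal_inclination(n_o, n_e)
    gamma = walkoff_angle(n_o, n_e, alpha)
    results[name] = (math.degrees(alpha), math.degrees(gamma))
```

Both crystals give an optimal inclination close to 47.8°, and YVO4 gives a maximum walkoff near 5.7°, in line with the values quoted in this section.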

Fig. 6.8. Core of single-stage displacement-type isolator. Three birefringent walkoff blocks are configured as shown. The first block is $\sqrt{2}$ times longer than the other blocks. A Faraday rotator plate is located between the first and second blocks. The cuts of the extraordinary axes all maximize the walkoff and are oriented relative to one another as 90°, −45°, and 45°. The FR magnetization is typically held in saturation by a permanent magnet.

The principle of the displacement-type isolator differs from the deflection type. Within the displacement-type core, all beam paths run collinear between crystals. Within the crystals, polarizations are split and laterally shifted from one position to another. The interaction of the core with the lenses also differs because both offset and tilt are the principal means of spatial filtering. In order to minimize the crystal lengths, and corresponding spot-offset distances, a focusing lens system is used. The spatial distribution of spots in the transverse ray-trace diagram in the image plane (located within the core) is imaged onto the object plane (at the fiber face) by the receiving lens with a magnification factor $M^{-1}$. Only spots that fall within the fiber aperture are coupled; the remaining spots and associated beam paths are lost.
The forward and reverse ray-trace and spot diagrams are illustrated in Fig. 6.9. In the forward direction, walkoff block wo1 splits the two polarizations and translates the extraordinary polarized light along the v-path to the top line. The FR rotates both u- and v-path polarization states by +45°. Walkoff block wo2 translates the v-spot to the point indicated in frame (d) of Fig. 6.9(a). Lastly, walkoff block wo3 translates the u-spot to overlap with the v-spot. Running collinear and along the $L_2$ lens axis, the combined forward beam is coupled into the output fiber.
In the reverse direction, the ray-trace through blocks wo2 and wo3 is the same as along the forward path. After the FR, however, the beams diverge. The Faraday rotator imparts a +45° rotation to the u′- and v′-path polarization states, creating a misalignment to the extraordinary axis of block wo1. The misalignment causes the v′-path to trace a straight line through the last block and the u′-path to walk off in a downward direction. The resultant spot diagram before lens $L_1$ is shown in frame (e′) of Fig. 6.9(b). With proper design the u′- and v′-spots fall outside of the fiber face and are lost.

Fig. 6.9. Ray-trace and spot-trace diagrams for forward and isolation directions. a) Forward path splits polarizations along u- and v-paths. These beams, always collinear between blocks, converge at the output lens. b) Isolation path prevents beam convergence at the return lens and instead displaces both beams out of the aperture of the return fiber.

There are several interlocking calculations required for the displacement-type isolator. In the forward direction the fiber-to-fiber coupling must have maximum transmission. While the fiber-to-fiber magnification is unity, the single-lens magnification sets the Rayleigh length, which in turn should be on the same order as the lens-to-lens gap. In the reverse direction the requisite isolation determines, in conjunction with the lens magnification, the necessary spot offset. The spot offset in turn determines the unit crystal length $a$. Finally, the path-length imbalance, or DGD, is calculated for the forward direction.

In the forward direction, the depth-of-focus $2b$ should be on the same order as the lens-to-lens gap $L_{gap}$ to ensure good coupling. In view of the focusing diagram in Fig. 5.16(b), the gaussian beam waist $w_b$ between the lenses is related to the lens magnification and fiber beam waist as $w_b = M w_o$. Using the expression for the confocal parameter (5.2.10) on page 222, the magnification is
$$ M = \frac{1}{w_o}\sqrt{\frac{2b}{k_o}} \tag{6.4.1} $$
Note that the wavenumber $k_o$ here is for free space.
In the reverse direction the requisite isolation along with the lens attributes determine the necessary spot offset. Referring to Fig. 6.9(b), the offset in frame (e′) maps to beam offset and tilt at the object plane $U_1$. The tilt angle $\theta_u$ is related to the offset via the focal length:
$$ \theta_u = \frac{y_o}{f} \tag{6.4.2} $$
Moreover, the offset $y_o$ to the right of the lens relates to the offset $y_i$ on the $U_1$ plane through the magnification
$$ y_o = M y_i \tag{6.4.3} $$
Folding these relations into the overlap integral (5.4.10) on page 243, the offset is related to the isolation, lens, and fiber parameters via
$$ y_o = \frac{2 w_o M}{\sqrt{1 + \left( \dfrac{k_o n w_o^2 M}{f} \right)^2}}\, \sqrt{-\ln I_b} \tag{6.4.4} $$

As the mode overlap occurs within the fiber glass, the wavenumber $k_o$ in (6.4.4) is scaled by the fiber index $n$. Finally, the u′- and v′-spot offsets from the axis of lens $L_1$ are both
$$ y_o = \sqrt{2}\, a \tan\gamma \tag{6.4.5} $$
where the walkoff angle $\gamma$ is determined in general from (3.6.15) on page 110, or from (3.6.25) on page 117 for maximum walkoff. For example, the maximum walkoff in a YVO4 crystal is $\gamma_{max} \approx 5.70^\circ$.
Together, equations (6.4.1), (6.4.4), and (6.4.5) determine the crystal thickness $a$. For example, consider an approximate solution using YVO4 where $I_{iso} = 45$ dB, $2b = 5$ mm, $f = 1$ mm, $n = 1.44$ (of the fiber), and $\omega = 2\pi \times 194.1$ THz. The requisite magnification is $M \approx 7.0$. This in turn sets the necessary spot offset to $y_o \approx 156$ µm. For this magnification, the minimum beam waist between the lenses is $2w_b \approx 70$ µm and the displacement by the walkoff crystals is $y_o \approx 2.2 \times (2w_b)$. Finally, the unit block length given the above walkoff angle is $a \approx 1.1$ mm. As a check, the total length of the blocks is $(2 + \sqrt{2})a \approx 3.8$ mm, which is a bit less than the depth-of-focus.
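
The chain of calculations in this example can be reproduced with a short script. This is a sketch rather than the author's worked solution: it assumes the fiber beam waist of 5 µm carries over from the deflection-type example, uses the quoted 5.70° maximum walkoff for YVO4, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def magnification(depth_of_focus, w_o, k_o):
    """Single-lens magnification from (6.4.1); depth_of_focus is 2b."""
    return math.sqrt(depth_of_focus / k_o) / w_o

def spot_offset(isolation_db, M, w_o, f, k_o, n_fiber):
    """Required spot offset from (6.4.4), combining offset and tilt filtering."""
    overlap = 10 ** (-isolation_db / 10)
    tilt_term = k_o * n_fiber * w_o**2 * M / f
    return 2 * w_o * M * math.sqrt(-math.log(overlap)) / math.sqrt(1 + tilt_term**2)

def unit_block_length(y_o, walkoff_deg):
    """Unit crystal length a from (6.4.5)."""
    return y_o / (math.sqrt(2) * math.tan(math.radians(walkoff_deg)))

k_o = 2 * math.pi * 194.1e12 / C           # free-space wavenumber
M = magnification(5e-3, 5e-6, k_o)          # 2b = 5 mm, w_o = 5 um (assumed)
y_o = spot_offset(45.0, M, 5e-6, 1e-3, k_o, 1.44)
a = unit_block_length(y_o, 5.70)
total_length = (2 + math.sqrt(2)) * a
```

The script gives a magnification near 7.0, a spot offset of roughly 157 µm, a unit length near 1.1 mm, and a total block length near 3.8 mm, matching the quoted figures to within rounding.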

The remaining calculation is the path imbalance in the forward direction. It should be clear that the differential-group delays through walkoff blocks wo2 and wo3 cancel. The imbalance is solely due to walkoff block wo1. Within this block, the u-path sees the ordinary group index $n_{g,o}$, while the v-path sees the effective group index. The effective refractive index is a mixture of the ordinary and extraordinary refractive indices, the mixture set by the inclination angle of the extraordinary axis from the optical axis, see (3.6.26) on page 117. The differential-group delay requires the effective group index, which generally is approximately the same as the refractive index for highly birefringent crystals in the near infrared. Nonetheless, keeping track of the proper indices, the differential-group delay is
$$ \tau = \frac{\sqrt{2}\, a\, (n_{ge,e} - n_{g,o})}{c} \tag{6.4.6} $$
Continuing with the example and using an effective refractive index, use of YVO4 walkoff crystals imparts a differential delay of $\tau \approx 1.1$ ps. There is, in fact, a simple alteration that eliminates $\tau$ for a two-stage block-type isolator. This is detailed in §6.6.
As a concluding remark, folded architectures of the one-stage block-type
isolator have been proposed wherein a single relay lens with a mirror backing is
placed at a midpoint along the polarization evolution [4, 8, 15]. These isolators
can be very compact, but none of the reports demonstrates an epoxy-free path.

6.5 Two-Stage Isolators


To increase the isolation over a wide temperature and wavelength range, two or more isolation stages are necessary. Two-stage isolators were already considered in §6.1 with respect to the garnet design. However, that discussion did not include polarization diversity. Nonetheless, the lessons from that earlier section apply here just as well.
Both the deflection-type and displacement-type isolators are expandable to two stages. Rather than blindly cascade two like stages, however, economies can be found to improve the performance or reduce the size of a two-stage component.
An example of a two-stage deflection-type isolator is illustrated in Fig. 6.10. Here the extraordinary axes of the second wedge pair are rotated by 90° with respect to the first wedge pair. As indicated in the ray-trace diagram, this rotation leads to a convergence of the u- and v-paths in the forward direction. This is an improvement in PDL and alignment sensitivity over the single-stage deflection-type isolator. Moreover, the path imbalance along the forward path is cancelled. That is, the expected differential-group delay through the component is zero, although in practice differences in where the light paths intersect the wedges leave a residual imbalance. Table 6.1 shows the reported reduction in PMD value for a two-stage deflection-type isolator.

Fig. 6.10. A two-stage deflection-type isolator, one particular realization. To recombine the forward beams and to balance the u- and v-path lengths, the extraordinary axes of the second stage wedges are cut 90° in rotation from the first stage. a) Forward direction ray-trace. b) Isolation direction ray-trace. The deflection angles are twice that of the single-stage counterpart.

In the reverse direction, the u′- and v′-beams are deflected twice. The total deflection for either beam is therefore
$$ \theta_d = 2 (n_e - n_o)\,\theta_w \tag{6.5.1} $$
Since the mode coupling falls off exponentially with the square of the deflection angle, cf. (6.3.5), some redesign from a one-stage deflection isolator may be possible to reduce the wedge angle.
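
One way to see the redesign margin is to compare the wedge angle needed for a fixed isolation target in the one-stage and two-stage cases. The sketch below combines (6.3.5) through (6.3.7) with (6.3.10) and (6.5.1); the birefringence value of 0.2039 for YVO4 is an assumption, and the idealized overlap model ignores scattering and Faraday-rotation errors.

```python
import math

def isolation_db(theta_d, f, w_o):
    """Isolation in dB for total deflection theta_d, from (6.3.5) and (6.3.6)."""
    return 10 * math.log10(math.e) * (theta_d * f) ** 2 / (4 * w_o ** 2)

def wedge_angle(isolation_target_db, f, w_o, delta_n, stages=1):
    """Wedge angle (rad) meeting the target; the total deflection is stages * delta_n * theta_w."""
    overlap = 10 ** (-isolation_target_db / 10)
    theta_d = 2 * w_o * math.sqrt(-math.log(overlap)) / f
    return theta_d / (stages * delta_n)

DELTA_N = 0.2039   # assumed YVO4 birefringence near 1550 nm
one_stage = wedge_angle(45.0, 1e-3, 5e-6, DELTA_N, stages=1)
two_stage = wedge_angle(45.0, 1e-3, 5e-6, DELTA_N, stages=2)
check_db = isolation_db(2 * DELTA_N * two_stage, 1e-3, 5e-6)
```

For the same 45 dB target the two-stage wedge angle is half the one-stage value, roughly 4.5° instead of 9°. Keeping the 9° wedge instead would, in this idealized model, quadruple the isolation expressed in dB.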
A two-stage displacement-type isolator, in the configuration proposed by Chang and Sorin [3], is illustrated in Fig. 6.11. The $\sqrt{2}a$ block is placed between the two $a$ blocks instead of in front. All three extraordinary axes are cut to maximize the walkoff, as before. Two FRs having the same rotary direction are added as indicated, and the walkoff direction in the plane perpendicular to the optic axis changes by 45° from one block to the next. Unlike the two-stage deflection-type isolator, this isolator does not increase the spatial filtering of the principal rays. However, the isolation is increased because light is scattered into more locations. Also, the differential-group delay is reduced, but not eliminated. Figure 6.11(a,b) details the principal and error paths along the forward and reverse directions, while (c) shows a detailed spot-evolution diagram. The error paths originate from incorrect Faraday rotation. The differential-group delay for this two-stage isolator is
$$ \tau = \frac{(\sqrt{2} - 1)\, a\, \Delta n_g}{c} \tag{6.5.2} $$

where, as shown in the figure, the u-path partially cancels the differential-group delay from the v-path.

Fig. 6.11. Ray-trace and spot-trace diagrams for forward and isolation directions. This two-stage displacement isolator places walkoff blocks in a $1 : \sqrt{2} : 1$ sequence, with FRs located between blocks. Solid lines are nominal beam paths; dashed lines are error paths that occur when the Faraday rotation is not precisely 45°. a) Side and top views of forward paths. b) Side and top views of isolation paths. c) Spot-trace diagrams that include residual light from imperfect Faraday rotation.

6.6 PMD-Compensated Isolators


The two-stage deflection-type isolator (Fig. 6.10) illustrated one way to compensate for the differential-group delay, colloquially known as PMD, in that configuration. There are other significant PMD-compensation techniques for one-stage isolators, both for deflection and displacement types.
For a one-stage deflection-type isolator, two methods have been invented to compensate for differential-group delay and differential beam displacement. Swan proposes the addition of a rhombohedral crystal with its extraordinary axis cut in the plane perpendicular to the optical path [24, 25] (Fig. 6.12(a)). The rhombohedral angle is set to combine the u- and v-paths through differential refraction. Alternatively, Xie proposes the addition of a parallelepiped crystal with its extraordinary axis cut as a walkoff block [7], Fig. 6.12(b). The parallelepiped length and extraordinary axis compensation angle are set concurrently to eliminate the DGD and differential beam displacement. There is some advantage of the Swan method over the Xie method because the former scheme lets the two wedge prisms be the same part, whereas the latter method is best suited for two different prism parts.
Figure 6.12(a) shows the orientation of the three birefringent crystals in the Swan scheme. As in §6.3 the extraordinary axis of each wedge is cut at 22.5°. The orientation of the second wedge with respect to the first, as shown in the figure, automatically positions the second e-axis at 45° with respect to the first. In the forward path, the v-path refracts more than the u-path, assuming the wedges are made of positive uniaxial crystal. Concomitant with the greater refraction is an arrival-time delay of the v-path with respect to the u-path. In the forward direction the FR ensures that the v-path sees the e-axis of both wedges. Therefore, ignoring the second-order correction for the refraction angle, DGD is accrued over length $2l_w$.
Accordingly, the rhombohedral compensation crystal is cut as a waveplate with its e-axis rotated 90° to the e-axis of the second wedge. That is, the extraordinary path through the wedge pair becomes the ordinary path through the compensation crystal. The waveplate cut ensures that the e-axis lies in the plane perpendicular to the optical path. With length $l_c = 2l_w$ (under the assumption that the wedge and compensation parts are the same material) the DGD is eliminated to first order. Second-order corrections can be made to account for the refracted paths in the crystals.
The differential beam displacement $d_v - d_u$ can also be corrected by cutting the compensation crystal to rhombohedral angle $\theta_r$. In general, the refraction angle into the crystal depends on the input polarization. The e- and o-ray refraction angles in the small-angle approximation are
$$ \theta_{e,o} = \left( \frac{1}{n_{e,o}} - 1 \right) \theta_r \tag{6.6.1} $$
The difference in displacement through the compensation crystal is therefore
$$ d_e - d_o = l_c\, \theta_r \left( \frac{1}{n_e} - \frac{1}{n_o} \right) \tag{6.6.2} $$
The differential beam displacement due to the wedge pair, (6.3.4) on page 256, is cancelled by the compensation crystal when the two displacements are equal. The required rhombohedral angle is therefore
$$ \theta_r = \frac{d_v - d_u}{l_c \left( \dfrac{1}{n_e} - \dfrac{1}{n_o} \right)} \tag{6.6.3} $$
Precisely speaking, a non-zero rhombohedral angle requires the plane in which the extraordinary axis of the compensation crystal lies to be tilted so that the refracted paths remain perpendicular to that plane. This prevents walkoff.

Fig. 6.12. Forward ray-trace diagrams of Swan and Xie PMD-compensated deflection-type isolators. a) Swan method uses a compensation crystal cut as a waveplate and rhombohedral angle $\theta_r$ to compensate the DGD from the wedge pair and to refract differentially the u- and v-paths. b) Xie method uses a compensation crystal cut as a partial walkoff and partial waveplate parallelepiped to compensate the DGD from the wedge pair and to translate the u-path onto the v-path.

Figure 6.12(b) shows the orientation of the three birefringent crystals in the Xie scheme. There are two mechanisms that simultaneously affect the light through the compensation crystal. One is the walkoff experienced by one path because the e-axis of the crystal is cut with a vector component along the optical path. The other is differential-group delay experienced because one path sees the ordinary index and the other sees an effective index that depends on the inclination angle of the e-axis. The compensation crystal length $l_c$ and e-axis inclination angle $\alpha_c$ are tailored to compensate the DGD and differential beam displacement.
Two equations with two free parameters together determine the length $l_c$ and e-axis angle $\alpha_c$ of the compensation crystal. The DGD of the crystal is
$$ \tau_c = \frac{l_c \left( n_g(\alpha_c) - n_{g,o} \right)}{c} \tag{6.6.4} $$
and the differential beam displacement is
$$ d_v - d_u = l_c \tan\gamma_c(\alpha_c) \tag{6.6.5} $$
The effective index $n_e(\alpha_c)$ is given by (3.6.26) on page 117, or for normal incidence,
$$ n_e(\alpha_c) = \frac{n_e n_o}{\sqrt{n_e^2 \cos^2\alpha_c + n_o^2 \sin^2\alpha_c}} \tag{6.6.6} $$
The walkoff angle $\gamma_c$, (3.6.15) on page 110, is also governed by the inclination angle:
$$ \tan\gamma_c = \frac{(n_e^2 - n_o^2) \sin\alpha_c \cos\alpha_c}{n_e^2 \cos^2\alpha_c + n_o^2 \sin^2\alpha_c} \tag{6.6.7} $$
As a technical point, the group indices are used in (6.6.6) while the refractive indices are used in (6.6.7). The two free parameters in these two equations are $l_c$ and $\alpha_c$. Given the DGD from the wedge pair and the displacement $d_v - d_u$, a unique solution can be found.
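
The two-parameter solution can be sketched numerically. The example below is an illustration under simplifying assumptions, not the method of the cited patent: group indices are approximated by the refractive indices, the YVO4 index values are assumed, and the e-axis inclination is found by bisection on the ratio of displacement to delay, which decreases monotonically with inclination for a positive uniaxial crystal.

```python
import math

C = 299_792_458.0           # speed of light in vacuum, m/s
N_O, N_E = 1.9447, 2.1486   # assumed YVO4 indices near 1550 nm

def effective_index(alpha):
    """Effective extraordinary index for e-axis inclination alpha, as in (6.6.6)."""
    d = (N_E * math.cos(alpha)) ** 2 + (N_O * math.sin(alpha)) ** 2
    return N_E * N_O / math.sqrt(d)

def tan_walkoff(alpha):
    """Tangent of the walkoff angle for e-axis inclination alpha, as in (6.6.7)."""
    d = (N_E * math.cos(alpha)) ** 2 + (N_O * math.sin(alpha)) ** 2
    return (N_E**2 - N_O**2) * math.sin(alpha) * math.cos(alpha) / d

def dgd(l_c, alpha):
    """Crystal DGD from (6.6.4), with group indices approximated by phase indices."""
    return l_c * (effective_index(alpha) - N_O) / C

def displacement(l_c, alpha):
    """Differential beam displacement from (6.6.5)."""
    return l_c * tan_walkoff(alpha)

def solve_compensator(tau_target, d_target, tol=1e-12):
    """Return (l_c, alpha) matching both targets; root assumed within (1 mrad, pi/2 - 1 mrad)."""
    target = d_target / (C * tau_target)
    lo, hi = 1e-3, math.pi / 2 - 1e-3
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        ratio = tan_walkoff(mid) / (effective_index(mid) - N_O)
        if ratio > target:      # ratio falls as alpha grows, so the root lies to the right
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    return d_target / tan_walkoff(alpha), alpha

# Round trip: generate targets from a known crystal, then recover it.
tau_ref = dgd(1e-3, math.radians(30.0))
d_ref = displacement(1e-3, math.radians(30.0))
l_c, alpha_c = solve_compensator(tau_ref, d_ref)
```

The round trip recovers the 1 mm length and 30° inclination used to generate the targets, which illustrates the uniqueness claim for this material model. A real design would substitute the measured wedge-pair DGD and displacement and the proper group indices.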
Note that a shortcoming of the Xie design is that the e-axis orientation in the two wedges is different than before (Fig. 6.12(b)). To recombine the u- and v-paths perfectly, the linear polarization state orientations should be parallel and perpendicular to the walkoff direction.
Table 6.1 shows the improvement of a single-stage deflection-type PMD-compensated isolator over a single-stage deflection-type isolator. It is not known which of the two compensation schemes is used for this product.

Displacement-type isolators can also have PMD compensation. The best overall performance with the fewest parts is a two-stage configuration similar to the original Chang and Sorin proposal [3] but with a variation proposed by Konno [10] and further improved here. Using the spot-vector diagrams so clearly presented in [11], four displacement-type isolators and their behavior are illustrated in Fig. 6.13. Another variation of the spot diagram is found in [14], and an alternative PMD-compensation scheme for displacement-type isolators is reported in [27].
The original Chang and Sorin device [2] is shown in Fig. 6.13(a). The associated spot-vector diagrams trace the beam locations through the device in a plane perpendicular to the optical path. In the forward direction, both input polarizations begin at point $P_o$. The first and second walkoff blocks translate the v-path along vectors $p_{v1}$ and $p_{v2}$, respectively. The third block translates the u-path along vector $p_u$ to combine the two beams. In the reverse direction, the input polarizations begin at point $P_o$ and trace the indicated paths. Both beams are displaced out of the $P_o$ aperture, leading to isolation.
The path lengths $p_u$ and $p_v$ are analogues for the DGD accrued through the isolator core. The optical phase of either path through a crystal is $\phi = \omega n L / c$. The phase difference between a walkoff path and the associated straight-through path is $\Delta\phi = \omega (n_e - n_o) L / c$, or, accounting for the walkoff angle $\gamma$,
$$ \Delta\phi = \frac{\omega\, (n_e - n_o)\, p_{u,v} \cot\gamma}{c} \tag{6.6.8} $$

Provided that the e-axis inclination angle is the same for each crystal, ensuring that $n_e$ and $\alpha$ are constant, the path-length difference $p_v - p_u$ is proportional to the DGD. The first Chang and Sorin isolator has a DGD that goes as $\sqrt{2}\,a$, where $a$ is the unit crystal length. Clearly, then, the two-stage Chang and Sorin component [3] (Fig. 6.13(b)) has a DGD that scales as $(2 - \sqrt{2})\,a$. Recognizing the importance of path balancing, Kuzuta proposes the modified two-stage block isolator shown in Fig. 6.13(c). Addition of the second $\sqrt{2}\,a$-length block translates both u- and v-paths equally. As shown in the diagram, $p_{u1} + p_{u2} = p_{v1} + p_{v2}$.
A more economical and elegant approach is shown in Fig. 6.13(d), which is a variation of the Konno proposal [10]. The path lengths are balanced by increasing the effective index of the u-path in relation to the v-paths. Using (6.6.8) and $\tau = d\Delta\phi/d\omega$, the path lengths are balanced when
\[ \left( n_{ge}(\theta_c) - n_{g,o} \right) \cot\alpha_c = \sqrt{2} \left( n_{ge} - n_{g,o} \right) \cot\alpha \tag{6.6.9} \]
The center crystal has to be lengthened accordingly. This scheme is a small change to the original Chang and Sorin two-stage proposal but with a significant improvement in DGD mitigation.
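The balance condition can be explored numerically. The sketch below is a hedged illustration, not the author's procedure: it assumes a positive uniaxial crystal, uses a single index pair for both the delay and the walkoff terms (the text distinguishes group and refractive indices), and treats the ratio of u-path to unit displacement as a parameter, `path_ratio`, defaulting to the square root of two for a diagonal walk. All names are hypothetical.

```python
import math


def n_eff(n_e, n_o, theta):
    """Effective extraordinary index at inclination angle theta, as in (6.6.6)."""
    return n_e * n_o / math.sqrt((n_e * math.cos(theta)) ** 2 + (n_o * math.sin(theta)) ** 2)


def tan_walkoff(n_e, n_o, theta):
    """Tangent of the walkoff angle at inclination angle theta, as in (6.6.7)."""
    s, c = math.sin(theta), math.cos(theta)
    return (n_e ** 2 - n_o ** 2) * s * c / ((n_e * c) ** 2 + (n_o * s) ** 2)


def delay_per_displacement(n_e, n_o, theta):
    """(n_e(theta) - n_o) * cot(alpha): delay per unit walkoff displacement, times c."""
    return (n_eff(n_e, n_o, theta) - n_o) / tan_walkoff(n_e, n_o, theta)


def balance_center_crystal(n_e, n_o, theta, a, path_ratio=math.sqrt(2.0)):
    """Return (theta_c, l_c) for the center block of a path-balanced isolator.

    theta      e-axis inclination of the outer blocks [rad]
    a          length of each outer block [m]
    path_ratio u-path displacement divided by the unit displacement (assumed > 1)
    """
    target = path_ratio * delay_per_displacement(n_e, n_o, theta)
    lo, hi = theta, math.pi / 2 - 1e-6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if delay_per_displacement(n_e, n_o, mid) < target:
            lo = mid
        else:
            hi = mid
    theta_c = 0.5 * (lo + hi)
    # The center block must still displace the beam by path_ratio * a * tan(alpha).
    l_c = path_ratio * a * tan_walkoff(n_e, n_o, theta) / tan_walkoff(n_e, n_o, theta_c)
    return theta_c, l_c
```

Because the walkoff angle falls off once the inclination passes its optimum, the returned `l_c` exceeds `path_ratio * a` when the outer blocks are cut near maximum walkoff, which is the lengthening mentioned above.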

Fig. 6.13. Displacement isolators, non-PMD-compensated (a,b) and PMD-compensated (c,d), with corresponding spot-vector diagrams. a) Single-stage displacement isolator as before. Forward path lengths are such that $p_{v1} + p_{v2} \neq p_u$, generating PMD. b) Dual-stage displacement isolator as before. c) PMD-compensated dual-stage displacement isolator. Forward path lengths are equalized between the two polarizations. d) Elegant PMD-compensated dual-stage displacement isolator. Orientation of the extraordinary axis in the second block increases the u-path delay to equalize it to the v-path.

References
1. D. W. Anthon and D. L. Sipes, "Multi-function optical isolator," U.S. Patent 6,088,153, July 11, 2000.
2. K. W. Chang and W. V. Sorin, "Polarization independent isolator using spatial walkoff polarizers," IEEE Photonics Technology Letters, vol. 1, no. 3, pp. 68-80, 1989.
3. K. W. Chang and W. V. Sorin, "High-performance single-mode fiber polarization-independent isolators," Optics Letters, vol. 15, no. 8, pp. 449-451, 1990.
4. Y. Cheng and G. S. Duck, "Multi-stage optical isolator," U.S. Patent 5,768,005, June 16, 1998.
5. D. J. Gauthier, P. Narum, and R. W. Boyd, "Simple, compact, high-performance permanent-magnet Faraday isolator," Optics Letters, vol. 11, no. 10, pp. 623-625, 1986.
6. A. J. Heiney and D. K. Wilson, "Optical isolators employing oppositely signed Faraday rotating materials," U.S. Patent 5,087,984, Feb. 11, 1992.
7. Y. Huang, P. Xie, X. Luo, and L. Du, "Optical isolator with reduced insertion loss and minimized polarization mode dispersion," U.S. Patent 2002/0 060 843, May 23, 2002.
8. R. S. Jameson, "Polarization independent optical isolator," U.S. Patent 5,033,830, July 23, 1991.
9. "Micro optics for telecom catalog 2002," Koncent Communications, Fuzhou, Fujian, P.R. China, 2002. [Online]. Available: http://www.koncent.com/
10. Y. Konno, S. Aoki, and K. Ikegai, "Polarization independent optical isolator," U.S. Patent 5,774,264, June 30, 1998.
11. N. Kuzuta, "Optical isolator," U.S. Patent 5,237,445, Aug. 17, 1993.
12. T. Matsumoto, "Polarization-independent isolators for fiber optics," Electronics and Communications in Japan, vol. 62-C, no. 7, pp. 113-119, 1979.
13. T. Matsumoto, "Optical nonreciprocal device," U.S. Patent 4,239,329, Dec. 16, 1980.
14. H. Ohta and N. Nakamura, "Optical isolator," U.S. Patent 5,151,955, Sept. 29, 1992.
15. J.-J. Pan, "Highly miniaturized, folded reflection optical isolator," U.S. Patent 6,212,305, Apr. 3, 2001.
16. F. J. Sansalone, "Compact optical isolator," Applied Optics, vol. 10, no. 10, pp. 2329-2331, 1971.
17. K. Shirai, M. Sumitani, N. Takeda, and M. Arii, "Optical isolator," U.S. Patent 5,278,853, Jan. 11, 1994.
18. K. Shiraishi, F. Tajima, and S. Kawakami, "Compact Faraday rotator for an optical isolator using magnets arranged with alternating polarities," Optics Letters, vol. 11, no. 2, pp. 82-84, 1986.
19. K. Shiraishi and S. Kawakami, "Cascaded optical isolator configuration having high-isolation characteristics over a wide temperature and wavelength range," Optics Letters, vol. 12, no. 7, pp. 462-464, 1987.
20. K. Shiraishi, S. Sugaya, and S. Kawakami, "Fiber Faraday rotator," Applied Optics, vol. 23, no. 7, pp. 1103-1105, 1984.
21. M. Shirasaki, "Optical device," U.S. Patent 4,548,478, Oct. 22, 1985.
22. M. Shirasaki and K. Asama, "Compact optical isolator for fibers using birefringent wedges," Applied Optics, vol. 21, no. 23, pp. 4296-4299, 1982.
23. J. Y. Song, "Optical isolator," U.S. Patent 6,061,167, May 9, 2000.

24. C. B. Swan, "Optical isolator with polarization dispersion and differential transverse deflection correction," U.S. Patent 5,631,771, May 20, 1997.
25. C. B. Swan, "Optical isolator with polarization dispersion and differential transverse deflection correction," U.S. Patent 5,930,038, July 27, 1999.
26. T. Uchida and A. Ueki, "Optical isolator," U.S. Patent 4,178,073, Dec. 11, 1979.
27. T. Watanabe, S. Sugiyama, and T. Ryuo, "Multiple-stage optical isolator," U.S. Patent 6,049,425, Apr. 11, 2000.
28. C. G. Young, "Multiple wavelength optical isolator," U.S. Patent 3,602,575, Aug. 31, 1971.

7
Circulators

An optical circulator is a generalized isolator having three or more ports. While an isolator causes loss in the isolation direction, a circulator collects the light and directs it to a nonreciprocal output port. Figure 7.1 illustrates several possible circulator configurations. Figure 7.1(a) illustrates the port mapping for a four-port circulator. The ports cyclically map 1 → 2 → 3 → 4 → 1. This is called a strict-sense circulator because every input port has a specific nonreciprocal output port. Construction of a strict-sense circulator with more ports becomes inelegant, but ones with three ports can be simple [22]. Figure 7.1(b) illustrates a non-strict-sense circulator having any number of ports greater than two. In this case each input port has a specific nonreciprocal output port except for the last port; the light input to the last port is lost. The ladder diagram reflects the optical path within the component and indicates the disconnect between the first and last ports. Figure 7.1(c) illustrates a three-port non-strict-sense circulator. This circulator has significance in telecommunications applications because return of light from port 3 to port 1 is often not necessary. For instance, the reflected light from a fiber Bragg grating need only be separated from the input light without loss, but as optical links are not typically operated in reverse there is no need for strict-sense behavior.
Other than architecture, most of the considerations for the optical circulator have already been addressed in preceding chapters. As a nonreciprocal device, a circulator has at least one Faraday rotator (FR) in the optical path. The wavelength and temperature performance of FRs was treated in §6.1. In that same section, the transmission and isolation as a function of polarizer alignment was treated. The modern deflection-type circulators use birefringent prism combinations such as the Wollaston and Rochon prisms, detailed in §4.7, along with dual-fiber collimators, detailed in §5.3. Likewise, the differential-group delay due to path-length imbalance was treated in §6.6. Accordingly, all the tools necessary to appreciate and design optical circulators have been developed earlier, allowing the current chapter to focus exclusively on architectures and performance.

Fig. 7.1. Three types of circulator port connections. a) Strict-sense circulator with four ports. Each input port has a specific nonreciprocal output port. b) Non-strict-sense circulator in ladder topology. Any number of ports greater than two is possible; however, light input to the last port is lost. c) Non-strict-sense three-port circulator. This topology has significant applications in unidirectional telecommunications links and has good economies compared with (a) or (b).

As with isolators, circulators can be polarization dependent or polarization independent. The polarization-dependent circulator is an important starting point because the minimum requirements for circulatory behavior are clear. Polarization-independent circulators are further categorized as displacement-type and deflection-type. Except for the earliest designs, displacement-type circulators use birefringent walkoff crystals to achieve polarization diversity. Deflection-type circulators use birefringent prisms and dual-fiber collimators for the same purpose. The most compact and least expensive circulators use deflection-type designs.

7.1 Polarizing Circulator


Figure 7.2 illustrates a YIG-based polarizing circulator first demonstrated by Shibukawa [32, 33]. This configuration is the minimum necessary to achieve circulatory behavior and should be compared to the polarizing isolator in Fig. 6.1 on page 248. Circulatory behavior exists only for prescribed linear input polarization states. The main issue addressed by Shibukawa was the addition of input and output ports on either side of the FR, as compared to a polarizing isolator, to realize a strict-sense circulator. To create two ports on either side of the FR, Glan-Taylor prisms were used (Fig. 7.2(c)), for two reasons. First, the inventors realized the importance of high polarization contrast, yet thin-film polarization beam-splitting cubes were neither readily available nor good performers. Second, as compared with the Glan-Thompson, Wollaston, and Rochon prisms, the Glan-Taylor prism has a wide deflection angle, e.g. 110° for calcite. This enabled them to build a relatively compact component.
A Glan-Taylor prism is a birefringent prism pair cut so that along the hypotenuse one polarization state, e.g. the extraordinary ray for a positive uniaxial crystal, experiences total internal reflection (TIR) at the crystal/air boundary while the orthogonally polarized ray exits the interface. The birefringent prism can be cut at Brewster's angle to maximize the transmission, but such was not the case in the work of Shibukawa. Accordingly, reflected ordinary light co-propagates with the TIR extraordinary light but is refracted at a different angle upon exiting the crystal. It should be noted that in most early demonstrations none of the optical interfaces were anti-reflection coated.

Fig. 7.2. A polarizing circulator in its simplest embodiment. Two polarization-dividing prisms and a Faraday rotator ($\theta_F = 45^\circ$) are used. The prisms are rotated by 45° with respect to one another along the longitudinal axis. a) Transmission of 1 → 2 for linear vertical polarization input. Linear horizontal input is deflected by prism $P_0$. Light after the FR not aligned to prism $P_{45}$ is output on port 4, reducing path isolation. b) Transmission of 2 → 3 for linear polarization input. Light after the FR not aligned to prism $P_0$ is output on port 1, reducing path isolation. c) A Glan-Taylor birefringent prism (positive uniaxial crystal), where e-rays are totally internally reflected and o-rays are partially reflected at the crystal/air interface along the hypotenuse.

Referring to Fig. 7.2, all that is required for minimal circulatory action is a Faraday rotator with $\theta_F = 45^\circ$, an input polarization splitter, and an output polarization splitter rotated by 45°. To pass from port 1 to port 2, a vertically aligned linear polarization state is input. This state transits the first polarization splitter, is rotated by the FR, and transits the second polarization splitter. In the reverse direction, the same linear polarization state now input to port 2 is again rotated by the FR and is diverted by the first polarization splitter to port 3. Further path tracing shows that this is a strict-sense polarizing circulator.
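The port routing just described can be traced with a few lines of bookkeeping on real Jones vectors. The following Python sketch is an idealized illustration only: it works in a fixed laboratory frame, assumes lossless splitters with perfect extinction, takes the pass axes of the two splitters as 90° (vertical) and 135°, and assumes the Faraday rotation has the same sense for both propagation directions. The function names and the port bookkeeping are hypothetical.

```python
import math

P0_AXIS = 90.0    # pass axis of the first splitter (vertical), degrees
P45_AXIS = 135.0  # pass axis of the second splitter, rotated by 45 degrees


def rotate(field, deg):
    """Rotate a linear polarization vector (x, y) by deg in the lab frame."""
    t = math.radians(deg)
    x, y = field
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))


def split(field, pass_axis_deg):
    """Return (passed power, deflected power) for an ideal polarization splitter."""
    t = math.radians(pass_axis_deg)
    x, y = field
    along = x * math.cos(t) + y * math.sin(t)
    across = -x * math.sin(t) + y * math.cos(t)
    return along ** 2, across ** 2


def forward(theta_f=45.0):
    """Vertical light into port 1: returns power at (port 2, port 4)."""
    field = (0.0, 1.0)               # passes the first splitter unchanged
    field = rotate(field, theta_f)   # Faraday rotation
    return split(field, P45_AXIS)


def reverse(theta_f=45.0):
    """Light into port 2 aligned to the second splitter: returns power at (port 1, port 3)."""
    t = math.radians(P45_AXIS)
    field = (math.cos(t), math.sin(t))
    field = rotate(field, theta_f)   # nonreciprocal: same rotation sense as forward
    return split(field, P0_AXIS)
```

With the default rotation, `forward()` sends all power to port 2 and `reverse()` sends all power to port 3. Passing a detuned rotation such as `reverse(46.0)` shows the leakage back to port 1 growing as the square of the sine of the rotation error, which is one way to see why the wavelength and temperature dependence of the FR limits isolation.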
With high-quality thin-film polarization beam-splitting cubes and high-performance iron garnet materials now available, the polarizing circulator of Fig. 7.2 can exhibit good performance other than PDL. However, at the time, losses were incurred through lack of AR coatings, poor extinction ratio of the prisms, losses in the YIG garnet, and poor fiber coupling. The performance reported at the time is an insertion loss of 2 dB and an isolation variation from 13 dB to 28 dB, depending on port combination. The authors were cognizant of wavelength dependence but made no reports on temperature dependence.

Fig. 7.3. Conceptual development of polarizing circulators. a) Simple polarizing circulator like that of Shibukawa; PBS cubes illustratively replace the Glan-Taylor prisms. The four ports do not lie in a plane. b) Addition of a reciprocal element, here a half-wave waveplate, allows all four ports to lie in a plane. The polarization plane can be rotated by 90°. c) A two-stage polarizing circulator. The reciprocal element of (b) is replaced with a second FR and a polarizing beam splitter, such as a Rochon prism, located between the FR plates. The polarization plane is rotated 90° along the forward path. The polarizing-cube, FR, prism, FR, polarizing-cube sequence forms two stages of isolation.

The polarizing circulator leads to a conceptual framework that encompasses essential aspects of any circulator design. The circulator of Shibukawa, or one similar as in Fig. 7.3(a), where the Glan-Taylor prisms are illustratively replaced with polarization beam splitting (PBS) cubes, has four ports that do not lie in a plane. This makes the component form-factor less convenient. To rectify this shortcoming, the FR can be preceded or followed by a reciprocal element such as a half-wave waveplate, having its birefringent axis at 22.5° with respect to the cube axis, or an optically active crystal that rotates the linear polarization state by 45° (Fig. 7.3(b)). In either case, from the FR looking through the reciprocal plate, the PBS cube that is in view appears rotated by 45°. Optically the cube is rotated but physically it is not, allowing all ports to lie in the same plane.
The reciprocal and nonreciprocal rotators together can rotate the plane of linear polarization by 90°. The 90° rotation is characteristic of most circulators. However, as a general rule a waveplate reduces the bandwidth of a component. Whether such limitation is tolerable or not depends on many factors. Yet a better method is to use two Faraday rotators with an intermediate polarizing beam splitter (Fig. 7.3(c)). Note that other than material dispersion, the second FR has the same wavelength dependence as an equivalent reciprocal rotator. However, the dual FR design accomplishes two goals: the plane of polarization is rotated by 90° in the forward direction and 0° in reverse, and the isolation of the circulator is squared because this is a two-stage circulator. A two-stage circulator is a natural consequence of placing all ports on the same plane.
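The action of the waveplate-plus-FR pair of Fig. 7.3(b) can be checked with simple angle bookkeeping. The sketch below is an idealized illustration with assumed conventions: polarization angles are tracked in a fixed laboratory frame modulo 180°, a half-wave waveplate mirrors the polarization angle about its axis, the FR adds 45° regardless of propagation direction, and the forward path is taken to meet the waveplate first. The function names are hypothetical.

```python
def half_wave_plate(angle_deg, axis_deg=22.5):
    """A half-wave plate mirrors a linear polarization angle about its axis (reciprocal)."""
    return (2.0 * axis_deg - angle_deg) % 180.0


def faraday(angle_deg, theta_f=45.0):
    """A Faraday rotator adds theta_f in the lab frame for either direction (nonreciprocal)."""
    return (angle_deg + theta_f) % 180.0


def forward(angle_deg):
    """Forward pass: waveplate first, then the Faraday rotator."""
    return faraday(half_wave_plate(angle_deg))


def reverse(angle_deg):
    """Reverse pass: Faraday rotator first, then the waveplate."""
    return half_wave_plate(faraday(angle_deg))
```

Under these conventions the two splitter eigenstates at 0° and 90° are exchanged in the forward direction and left unchanged in the reverse direction, which is the behavior that lets both PBS cubes share one physical orientation.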

7.2 Historical Development


Circulator architectures are characterized by periods of rapid development toward an optimal design, given available materials, sub-assemblies, and recognized application-specific requirements, separated by plateaus during which little changed. Very early work in optical circulators, circa the 1960s, derived motivation from radio-frequency circulators which, at the time, required only single-polarization performance [28, 31]. The late 1970s are characterized by the application-specific realization that single-mode optical fiber does not preserve polarization and therefore circulators have to be polarization-independent [17, 29, 36]. Indeed, early descriptions claimed a 3 dB loss for a component with infinite polarization-dependent loss (PDL) [17]. Optical circulators thus transformed from polarization-dependent (PD) to polarization-independent (PI) via polarization-diversity schemes. The goal at the time was to achieve high isolation and low loss, which in turn required the development of high-contrast polarization splitters. Specifications of wavelength dependence and PDL were also recognized at the time, but performance was often poor by today's standards. Temperature dependence (the exception being [36]) and differential-group delay were almost never addressed. Lastly, the early PI circulators were strict-sense four-port designs.
Figure 7.4 illustrates the architectural development between 1978 and 1981 from PD circulators to PI circulators. Indeed the submission dates of the articles are so close it is difficult to reconstruct precisely who invented what first. Figure 7.4(a) illustrates the Shibukawa PD circulator detailed in the preceding section. Figures 7.4(b)-(d) are all PI circulators with an additional important distinction: in all three topologies the four ports lie in the same plane. To do this a reciprocal polarization rotator was added: a half-wave waveplate was used by Matsumoto and Shirasaki, while an optically active (OA) rotator was used by Iwamura. At the time an OA rotator was considered by some as advantageous because of easy alignment [17], although the required crystal length for quartz is 15.8 mm at 1.55 µm [20]. In either case, the addition of the reciprocal rotator reduces the bandwidth of the component.

Fig. 7.4. Evolution of early four-port strict-sense circulators. Goals were to provide polarization-independent circulatory behavior with high isolation. a) The Shibukawa polarizing (PD) circulator circa 1978. b) First polarization-independent (PI) proposal, the Matsumoto PI circulator, circa 1979. Modified Glan-Thompson (M-GT) prisms and dual half-wave waveplate plus FR pairs were used. c) Alternative PI proposal, the Iwamura PI circulator, circa 1979. Thin-film polarization beam splitters were used, along with optically active quartz for the reciprocal 45° rotation. d) Shirasaki PI circulator circa 1980 using high-extinction-ratio Shirasaki polarization splitters, an FR, and a half-wave waveplate.

The Matsumoto PI circulator [28, 29] in Fig. 7.4(b) uses modified Glan-Thompson (M-GT) birefringent prisms to separate the polarization. Similar to the Glan-Taylor prism, the Glan-Thompson prism extracts one polarization component through TIR. The latter prism has the gap between the two prism sections filled with a bonding agent such as epoxy to reduce the angular deflection of the TIR light. Using calcite, Matsumoto reported a 36.4° full-angle deflection. Unlike the Shibukawa PD circulator, the present circulator captures both transmitted and deflected light and directs the two paths through separate half-wave and FR pairs. The waveplate was a true zero-order half-wave waveplate and the FR was YIG with an Sm-Co permanent magnet for saturation. The polarizations output from the rotators are combined by a second Glan-Thompson prism. The reported insertion loss was 3.7 dB.
One principal drawback of the Matsumoto scheme is that the modified Glan-Thompson prisms reflected -12 dB of the non-TIR light in the direction of the TIR light. This made for a very low isolation floor. The other drawback is the duplication of the reciprocal and nonreciprocal rotators.


The Iwamura PI circulator [17] in Fig. 7.4(c) is a far more suitable architecture, but the inventors were limited by the low-quality thin-film polarization beam splitters. The significant improvement is a single reciprocal/nonreciprocal rotator pair through which both optical paths transit. While none of the optical surfaces were anti-reflection coated, the inventors reported an insertion loss of 1.2-1.6 dB and an isolation of 16-19 dB.
The Shirasaki circulator [22, 36] in Fig. 7.4(d) employs the Shirasaki birefringent prism (cf. §4.7) to act as a high-extinction-ratio polarization splitter. Taken as a pair, the rutile prisms exhibited over 40 dB contrast with a loss less than 0.5 dB. Like the Iwamura circulator, the two optical paths run parallel and transit a single rotator pair. The FR was YIG and the half-wave waveplate was a true zero-order quartz plate. It should be noted that the birefringent axis of the waveplate was inclined by 22.5° in the plane perpendicular to the optical path. The resultant circulator had 0.4-0.6 dB insertion loss and 25-32 dB isolation. Moreover, the inventors characterized their circulator over a 5-45 °C temperature range and demonstrated a 0.3 dB insertion loss shift and 1 dB isolation variation.
Emkey [7, 9, 10] plays a pivotal role in the development of circulators for two reasons. First, he recognized that birefringent walkoff crystals have a higher extinction ratio than the polarization-splitting prisms that preceded him. Second, he recognized that for optical communication links a circulator need not be a strict-sense four-port component, but rather a non-strict-sense three-port circulator was satisfactory and certainly more economical. While his component design is awkward and not repeated here, Emkey set the stage for Koga, Fujii, Xie, and others to develop displacement-type circulators in the early 1990s. Displacement-type circulators also incorporated superior iron garnet materials, specifically, the Bi:RIG garnets being developed at the time, and superior single-fiber collimators. A large advance in performance was thus recorded.
The final substantial improvement came about in the late 1990s with the development of the dual-fiber collimator. The dual-fiber collimator simplified and miniaturized the housing size of the lenses. Equally importantly, due to its convenient interaction with Wollaston, Rochon, and Kaifa compound prisms, the use of dual-fiber collimators ushered in a family of deflection-type schemes that substantially reduced the necessary volume of birefringent material, which in turn further reduced the size and cost.

7.3 Displacement Circulators


A displacement-type circulator is one where the polarization components are spatially separated and combined by birefringent crystals cut as walkoff blocks. Birefringent walkoff can yield an extinction ratio between the two polarization components in excess of 50 dB. Such an extinction ratio is better than most commercially available thin-film polarization beam splitting cubes. Accordingly, the displacement-type circulators developed in the early 1990s exhibited superior performance in comparison with the earlier developed Matsumoto, Iwamura, and Shirasaki circulators.
The series of displacement-type circulators developed by Koga and Matsumoto highlight conceptual developments that have bearing on deflection-type circulators. Their first-reported design was a strict-sense four-port single-stage circulator. The complexity and limited performance of this design was overcome by their second design, which was a non-strict-sense ladder-type two-stage circulator. Even this design, however, was limited in part by the incorporation of reciprocal polarization rotators. These rotators add to the part count and reduce the isolation bandwidth. Their last-reported design was a non-strict-sense ladder-type two-stage circulator using only nonreciprocal polarization rotation. This is the most compact and best performing of their devices.
Other inventors have proposed displacement-type circulators as well. In particular, Fujii proposed several designs [11-13] using both walkoff crystals and PBS cubes. Separately, Cheng developed a variety of reflection-based displacement circulators that were more compact than previous architectures [2, 4, 5]. He also addressed issues of low PMD and alignment improvements for manufacturing [3, 6]. More recently, Xie and Huang proposed a compact two-stage design that incorporates thermally expanded-core (TEC) fibers [42]. Use of TEC fibers reduces the necessary beam displacement and in turn the size of the component. TEC fibers will be seen again in the deflection circulators to follow. Finally, Liu et al. have proposed a strict-sense four-port two-stage circulator that includes a polarization beam-splitter cube to route the last port back to the first [27].
Figure 7.5 illustrates the first Koga and Matsumoto PI circulator [18, 21]. The core of their circulator is a variation of the Shibukawa PD circulator where the Glan-Taylor prisms are replaced with walkoff blocks (Fig. 7.5(a)). The center reciprocal rotator allows all four ports to lie in the same plane. This PD circulator was embedded between two polarization-conditioning sections that, all together, form a strict-sense circulator (Fig. 7.5(b)). The spot-trace diagrams at the bottom of the figure detail the connection between all port pairs.
As a critique, one can say that the polarization beam splitters to either
side of the center FR each require four reciprocal parts. The extinction ratio
of the splitters and the insertion loss of the device critically depend on the
alignment of these elements. As the circulator is single-stage and the reciprocal
parts will be misaligned in a real product, one cannot expect too much from
this architecture.
A better design is illustrated in Fig. 7.6. This is a two-stage circulator having fewer components than the first design [18, 20, 21]. The simpler architecture was achieved by switching to a ladder-type device from a strict-sense device. Reciprocal rotators are still included, but the two-stage design helps with the isolation. As with the first device, all paths into and out of the circulator are parallel, and separate collimating lenses were used to couple to and from the fiber. Given a minimum spacing of adjacent collimators based on the form factor, the displacement crystals must be long enough to couple to either lens. Koga and Matsumoto somewhat overcame this limitation by using turning prisms to deflect the light from a small core to a more widely spaced lens pair. However, use of turning prisms in production is not often attractive. Nonetheless, the inventors reported an insertion loss of < 1.5 dB, a PDL of 0.25 dB, and isolation at room temperature and center wavelength of over 67 dB. They reported a 70 nm wavelength range centered at 1550 nm where the isolation was at least 60 dB.

Fig. 7.5. Koga and Matsumoto strict-sense single-stage displacement-type circulator [18, 21], altered by the author to include half-wave waveplates rather than optically active plates and to simplify the quarter-size elements. a) A variation of the Shibukawa PD circulator, where walkoff blocks substitute for Glan-Taylor prisms and a half-wave waveplate allows all four ports to lie in the same plane. b) The PI circulator with embedded PD circulator. The polarization-conditioning stages generate the strict-sense PI circulatory behavior. All paths into and out of the circulator run parallel, and four separate collimators couple the light to and from fiber. At the bottom, spot-trace diagrams detail the connection between port pairs.

Fig. 7.6. Koga and Matsumoto ladder-type two-stage displacement circulator [18, 20, 21], altered by the author to include half-wave waveplates rather than optically active plates. The center walkoff block is the polarizing element between the first and second Faraday rotators, resulting in a two-stage circulator. All paths into and out of the circulator run parallel, and four separate collimators couple light to and from fiber. At the bottom, spot-trace diagrams detail the connection between port pairs. The connection 4 → 1 is not available.

The final architecture in this series, developed by Koga, eliminates the reciprocal rotators altogether [19]. The stated purpose of the inventor was to reduce the component length by removing the optically active rotators. However, substantial length reduction could well have been achieved through substitution of half-wave waveplates. Nonetheless, the bandwidth of the resultant component is increased by the removal of the reciprocal rotators.

Fig. 7.7. A ladder-type two-stage displacement circulator with no reciprocal rotators, proposed by Koga [19]. Tiled FR elements ($\theta_F = \pm 45^\circ$) are located between walkoff blocks. The center walkoff block walks the extraordinary rays along a 45° line in the plane perpendicular to the long component axis. All paths into and out of the circulator run parallel, and four separate collimators couple light to and from fiber. At the bottom, spot-trace diagrams detail the connection between port pairs as well as the error paths.

The quartz-free two-stage ladder-type circulator demonstrated by Koga is illustrated in Fig. 7.7. Here a checkerboard of FR elements is used to intersect the internal optical paths in the appropriate way to create a PI circulator. The center walkoff block is also changed to impart walkoff along a 45° direction in the plane perpendicular to the light path. Since the spacing in the FR checkerboard was small, only one external magnet could be used. To achieve both clockwise and counterclockwise polarization rotation the inventor used two different Faraday materials, YIG and a Bi:RIG derivative. The drawback was that the YIG garnet was 2.1 mm thick and the Bi:RIG garnet was 0.48 mm thick. This leads to a relatively high PDL (1.1 dB) and a differential group delay of 21 ps (although no more than about 8 ps can be accounted for via index and path-length difference alone). Also, while not reported, the temperature coefficients of these two materials differ, reducing the isolation over a normal operating range. Nonetheless, it was reported that the insertion loss was below 1.75 dB and the isolation better than 65 dB.


It should be noted that the YIG and Bi:RIG garnets can be replaced today with matched latching garnets. The ±45° rotations are realized by reversing the orientation of one part with respect to the other part. Also, the permanent magnet is removed. Using latching garnets, one expects the PDL, temperature dependence, and differential-group delay to improve substantially.
As a final note, like the earlier displacement circulators, the input and output ports are parallel to one another and individual collimators couple the light to fiber. The component size cannot be reduced beyond the displacement necessary for light to couple to either of two collimators on the same side of the component. As a consequence, there is a minimum volume of required crystal material which sets a floor on the price. These limitations are overcome by deflection circulators.

7.4 Deflection Circulators


The size of a circulator can be reduced by using a dual-fiber collimator. A dual-fiber collimator uses a single lens to collimate or focus the light from two closely spaced fibers threaded through the same ferrule. As derived in §5.3, the consequence of using a single lens for two fibers is an angular divergence between collimated light paths. A typical full-angle divergence is 3°, although this varies by product.
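To get a feel for the numbers, the divergence can be estimated with a paraxial thin-lens sketch. This is an illustration only and is not the derivation of §5.3: the 125 µm core spacing (two standard fibers touching in the ferrule), the 2.4 mm focal length, and the function name are all assumptions.

```python
import math


def dfc_full_angle_deg(core_spacing_um, focal_length_mm):
    """Full angle between the two collimated beams of a dual-fiber collimator.

    Paraxial thin-lens assumption: each fiber core sits half the core spacing
    off the lens axis, so its collimated beam is tilted by
    atan(offset / focal_length) with respect to that axis.
    """
    offset_mm = 0.5 * core_spacing_um * 1e-3
    return 2.0 * math.degrees(math.atan(offset_mm / focal_length_mm))


# Assumed values (not from the text): 125 um core spacing, 2.4 mm focal length.
angle = dfc_full_angle_deg(125.0, 2.4)
print(f"full-angle divergence ~ {angle:.2f} deg")
```

Under these assumed values the estimate lands close to the 3° figure quoted above; a different fiber spacing or lens changes it proportionally.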
Combination of dual-fiber collimators (DFC) with deflection prisms is natural. The simplest example of coupling a single-fiber collimator to a dual-fiber collimator is the polarization-beam splitter. Figure 7.8 illustrates four polarization-beam splitters, all of which use a birefringent prism for deflection. Figure 7.8(a) uses a Wollaston prism to convert parallel paths to paths that match the angular aperture of the DFC, and uses a walkoff crystal to laterally translate the beams into position [14]. Figure 7.8(b) combines the first two crystals into one, and when combined with a second prism, as shown, the combination is called the Kaifa prism (cf. §4.7.2). As with the first example, one beam is displaced before the two beams are deflected into the angular aperture of the DFC. Figure 7.8(c) achieves displacement by concatenating two Wollaston prisms with a complete gap in between [16]. The second Wollaston prism imparts stronger deflection than the first and orients the light into the angular aperture of the DFC. These three polarization-beam splitters all require translation of one or both beams because there is a presumption that large optical elements will be inserted between the collimators. However, very small designs may place the effective deflection plane of a prism right at the crossing point of the DFC [15, 25]. Specially designed DFCs can have crossing distances of 2.4 mm. Figure 7.8(d) illustrates a Wollaston prism located as described to form a compact polarization-beam splitter.
As a critique of the first three designs, the third polarization-beam splitter uses less crystal volume than the first two. The deflection from the first prism makes the two beams diverge, creating displacement without the need for crystal. Also, the complete gap is a convenient alignment degree of freedom unavailable in the first two designs. The fourth splitter uses the least crystal volume of all but requires that all necessary parts fit between the ferrule and the crossing point.

Fig. 7.8. Four polarization-beam splitters using deflection from compound birefringent prisms to match the angular aperture of an output dual-fiber collimator. a) Combined walkoff crystal and Wollaston prism. The prism imparts deflection into the angular aperture of the DFC, while the walkoff crystal provides the necessary translation. b) The Kaifa compound prism combines walkoff and deflection in one compound prism. c) A pair of Wollaston prisms with an intermediate complete gap. The second prism imparts more deflection than the first. The complete gap is adjusted to fine-tune the spatial translation for optimal coupling. d) Single Wollaston prism placed at the crossing point of a DFC. Such a system typically has a small gap between lenses.

Based on this primer, five significant circulator architectures are detailed in the following. These circulators are all deflection-type, non-strict-sense three- or four-port components that were independently developed between 1997 and 1999. The first two circulators are the Kaifa designs, invented by Li, Au-Yeung, Guo, and Wang. The third and last circulators are New Focus designs invented by Xie and Huang. The fourth circulator is a Fujitsu-Avanex design invented by Shirasaki and Cao. These designs highlight the aforementioned techniques to integrate dual-fiber collimators into two-stage circulators.


Kaifa Circulators
Figure 7.9 illustrates a Kaifa non-strict-sense two-stage three-port circulator that uses a deflection scheme [23, 24]. The chain of optical elements is shown in Fig. 7.9(a). Polarization diversity is carried out in the vertical plane via walkoff crystals wo1 and wo3, while deflection happens in the horizontal plane via the Wollaston prism and walkoff crystal wo2. The separation of polarization diversity and deflection into orthogonal planes is important because fine placement adjustments can be made independently of one another. The circulator is two-stage because each FR is located between two polarizers, each either a walkoff crystal or a compound prism. The nonreciprocal FR plates are each preceded by a reciprocal half-wave waveplate pair. The pair is split along the horizontal and oriented so that, at the center wavelength of the waveplates, one plate rotates the polarization state by +45° while the other imparts a rotation of −45°. Figures 7.9(b,c) show the spot-trace diagrams, and the top and side views of the component, for connections 1 → 2 and 2 → 3.
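The role of the split waveplate pair can be illustrated with simple angle bookkeeping. The sketch below is a simplification under stated assumptions: rotations are tracked in a fixed lab frame, the waveplate is treated as an ideal reciprocal rotator at its center wavelength, the Faraday rotator contributes 45° in both directions, and the function name is hypothetical.

```python
def net_rotation_deg(waveplate_deg, faraday_deg, forward=True):
    """Net polarization rotation of a waveplate + Faraday-rotator pair.

    Angles are measured in a fixed lab frame. The waveplate is reciprocal,
    so its rotation changes sign when the propagation direction reverses.
    The Faraday rotator is nonreciprocal, so its rotation keeps its sign.
    """
    reciprocal = waveplate_deg if forward else -waveplate_deg
    return reciprocal + faraday_deg


# The two halves of the split waveplate pair rotate by +45 and -45 degrees.
for wp in (+45.0, -45.0):
    fwd = net_rotation_deg(wp, 45.0, forward=True)
    bwd = net_rotation_deg(wp, 45.0, forward=False)
    print(f"waveplate {wp:+.0f} deg: forward {fwd:.0f} deg, backward {bwd:.0f} deg")
```

In this idealized picture one beam is rotated by 90° and the other by 0° in the forward direction, so two orthogonally polarized beams emerge with a common polarization; in the reverse direction the two roles swap, which is what allows the polarizers to route backward-travelling light along a different path.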
As a critique, the reciprocal elements add to the part count, may be difficult to precisely align, and reduce the isolation bandwidth. Moreover, the second walkoff block is an unnecessary addition to the part count.
These two problems are remedied in the second Kaifa design, shown in Fig. 7.10 [1]. Like the first, this circulator is a non-strict-sense two-stage three-port circulator, but here the Kaifa compound prism is incorporated to impart displacement and deflection. Moreover, the reciprocal half-wave waveplates are eliminated. To accommodate the elimination, the Faraday rotator plates are each split in two, the upper having a clockwise rotary direction and the lower having the opposite rotary direction; and the first and third walkoff crystals are re-cut to displace light along a 45° angle (rather than vertical) in the plane perpendicular to the longitudinal axis of the component. This change in crystal cut rotates by 45° the output polarizations of the displaced and straight-through beams. The polarization states incident on the Kaifa prism in either direction remain the same as those in Fig. 7.9.
The result of the re-cut of the first and third walkoff blocks is a coupling of the horizontal and vertical axes, which makes alignment more difficult. However, the component has three fewer parts and no reciprocal rotators.
A First Xie-Huang Circulator
A first Xie-Huang deflection-type circulator is illustrated in Fig. 7.11 [40, 41, 43]. This circulator is in the same spirit as the Kaifa circulators, and is a non-strict-sense two-stage three-port design. While the first Kaifa circulator uses waveplates and the second circulator eliminates them but couples the horizontal and vertical axes, the present Xie-Huang circulator combines the best features of the two. At the core of the Xie-Huang circulator is a pair of Wollaston prisms separated by a complete gap. As with the aforementioned polarization-beam splitters, one prism deflects more strongly than the other.

Fig. 7.9. Kaifa non-strict-sense two-stage three-port circulator (US 5,930,039). Deflection via Wollaston prism and walkoff crystal. a) Chain of optical elements. b, c) Spot-trace diagrams with top and side views for connections 1 → 2 and 2 → 3.

Fig. 7.10. Kaifa non-strict-sense two-stage three-port circulator (US 6,331,912). Deflection via Kaifa prism. a) Chain of optical elements. b, c) Spot-trace diagrams with top and side views for connections 1 → 2 and 2 → 3.

Fig. 7.11. Xie-Huang non-strict-sense two-stage three-port circulator (US 6,049,426). Deflection via Wollaston prism pair and complete gap. a) Chain of optical elements. b, c) Spot-trace diagrams with top and side views for connections 1 → 2 and 2 → 3.

The combination of the weaker prism and the complete gap together generate the displacement embodied by the Kaifa designs. However, since the orientation of the birefringent axes in the Wollaston prisms is arbitrary, the Xie-Huang design uses a modified Wollaston prism pair with birefringent axis orientations of ±45°. These axes are directly aligned to the polarization states resolved by the walkoff crystals and the Faraday rotator pairs. Accordingly, the polarization diversity generated by walkoff crystals wo1 and wo2 may be in the plane perpendicular to the deflection of the Wollaston prism pair. Moreover, the Wollaston prism pair allows for simultaneous elimination of differential-group delay and balancing of diffraction along the two paths. These properties lead to low PMD and low PDL, respectively.
The complete gap is an ingenious feature that allows fine-tuned alignment between collimators. The convergence angle may be just a few degrees. For instance, a 3° convergence angle gives roughly a 20 : 1 ratio between longitudinal adjustment and lateral displacement.
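The quoted leverage can be checked with one line of trigonometry. The sketch below assumes the simplest geometric model, in which sliding the gap by a length z shifts the beam laterally by z·tan(θ) for a convergence angle θ; the function name is hypothetical.

```python
import math


def gap_leverage(convergence_deg):
    """Longitudinal gap adjustment needed per unit of lateral displacement.

    Assumes a beam inclined at the convergence angle, so that a gap change z
    produces a lateral shift of z * tan(angle).
    """
    return 1.0 / math.tan(math.radians(convergence_deg))


ratio = gap_leverage(3.0)
print(f"longitudinal : lateral ~ {ratio:.1f} : 1")
```

The result is a little over 19 : 1, consistent with the round figure of 20 : 1. The practical meaning is that a coarse longitudinal adjustment produces a fine lateral correction.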
The inventors use latching iron garnet Faraday rotators to eliminate the
permanent magnets and to allow the same material to be used for both rotary
directions in the pair. This is an important innovation that reduces size and
balances paths, and is possible because the two-stage isolation accommodates
the increased temperature sensitivity of the latching garnet.
Shirasaki-Cao Circulator
To make a circulator smaller yet, the displacement for the polarization diversity must be reduced. The displacement, of course, must be sufficient to separate light into two distinct beams. Shirasaki and Cao impart displacement at the ferrule and before the lens, where the beam waist is on the order of 10 µm (separately noted in [26]). The displacement crystal need only be 100 µm thick, a factor of 25 reduction from a typical crystal length where the walkoff follows the lens. With the removal of walkoff crystals from the optical path between the lens pair, there is room to place the deflection prism at the crossing point of the dual-fiber collimator. Architectures of this type are the most compact of all circulators.
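The size argument reduces to simple scaling, sketched below. The 10 : 1 ratio of crystal length to beam displacement is an assumption inferred from the 10 µm and 100 µm figures in the paragraph above, and the function name is hypothetical.

```python
def walkoff_crystal_length_um(displacement_um, length_per_displacement=10.0):
    """Crystal length needed to displace a beam by displacement_um.

    length_per_displacement is an assumed rule of thumb (about 10:1),
    inferred from the 10 um beam and 100 um crystal quoted in the text.
    """
    return displacement_um * length_per_displacement


# Walkoff at the ferrule: the beam is only about 10 um wide.
at_ferrule_um = walkoff_crystal_length_um(10.0)

# Walkoff after the lens: the text quotes a factor of 25 more crystal.
after_lens_um = 25.0 * at_ferrule_um

print(f"at ferrule: {at_ferrule_um:.0f} um, after lens: {after_lens_um:.0f} um")
```

Under these assumptions the conventional crystal is about 2.5 mm long, which corresponds to separating a collimated beam a few hundred micrometers wide.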
Figure 7.12 illustrates the Shirasaki-Cao non-strict-sense two-stage four-port circulator [35], an improvement on an earlier design by Shirasaki [34]. As a four-port scheme, dual-fiber collimators are located on either end of the component. While any deflection prism may be used, the illustration shows a Rochon prism. As shown in Fig. 7.12(a), the walkoff and deflection directions are orthogonal, which decouples the adjustment of polarization diversity from isolation. Also note that, in contrast to the preceding deflection-type circulators, the present design uses a cross-over design in the plane of polarization diversity. The lens axis is located between the axes of the walkoff light and straight-through light so as to deflect both paths equally.
In this architecture, a walkoff crystal and half-wave waveplate are located between the ferrule and lens (Fig. 7.12(a)). Accordingly, the polarizations on the two paths at the lens are nominally the same. After the lens, the polarization state is rotated by 90° by transit of the first and second Faraday rotators. The deflecting prism located between the two FRs is cut so that the birefringent axes are at ±45°. Using a Rochon prism, one polarization state (i.e. +45°) is transmitted straight through while the orthogonal state is deflected. As shown by the ray-trace diagrams in Fig. 7.12(b,c), the component can be designed so that no deflection makes the connections 1 → 2 and 3 → 4, while deflection makes the connection 2 → 3. Light input to port 4 is deflected and lost toward point P. As discussed by the inventors, the diffraction of the u-path is less than that of the v-path because the former path transits the two waveplates. Moreover, the differential diffraction occurs in the high-N.A. region between ferrule and lens. In principle this leads to either higher insertion loss or higher PDL. If a particular design cannot be made satisfactorily, a glass plate can be inserted beside the waveplate to balance the diffraction.
As a critique, the presence of the waveplates limits the bandwidth of the device. Moreover, the Rochon prism should be as thin as possible to minimize the imparted differential-group delay. Otherwise, the Shirasaki-Cao design is very compact and overall rather simple to make.

Fig. 7.12. Shirasaki-Cao non-strict-sense two-stage four-port circulator (US 6,226,115). Deflection via single Rochon prism. a) Chain of optical elements. b, c) Ray-trace diagrams with top and side views for connections 1 → 2 and 2 → 3.

A Second Xie-Huang Circulator
Many variations are possible with the Shirasaki-Cao architecture, some of which depend on the particular technologies that are employed. An independently invented Xie-Huang circulator that falls into this class is illustrated in Fig. 7.13 [44]. Here the Faraday rotator plates are brought between the ferrule and lens. The plates are split in two along a horizontal line and the two halves impart opposite rotations. This allows the removal of the half-wave waveplates present in the former scheme. This step is technically difficult for the following reasons. The FRs must be latching garnets to enable opposite rotation with the same material. A latching garnet for 1.55 µm applications is 500 µm thick, compared with 350 µm for non-latching Bi:RIG garnets. The latching garnet thickness is to be compared to the half-wave waveplate thickness of 92 µm.
The five-fold thickness increase, especially between ferrule and lens, requires more displacement and a larger lens. Alternatively, the inventors use TEC fiber to reduce the numerical aperture of the light emergent from the ferrule. It is in this way that the highly compact split-FR design can be realized.
Figure 7.13(a) illustrates their non-strict-sense two-stage three-port circulator. Since only three ports are used (and accordingly only one DFC), the deflecting prism is naturally a Wollaston prism. When viewed from port 2, one polarization is deflected toward port 1 and the other toward port 3. The ray-trace diagrams in Fig. 7.13(b,c) show the 1 → 2 and 2 → 3 connections. Light input to port 3 is deflected to the region above port 2 and is lost. Since this circulator is diffraction, path-length, and temperature balanced between the two paths, one can expect very good performance. Table 7.1 lists the specifications for a high-performance compact circulator such as this one.

Fig. 7.13. Xie-Huang non-strict-sense two-stage three-port circulator (US 6,175,448). Deflection via single Wollaston prism. a) Chain of optical elements, with TEC fiber at both ferrules. b, c) Ray-trace diagrams with top and side views for connections 1 → 2 and 2 → 3.

Table 7.1. Two-Stage Deflection-Type Circulator Performance

Specification (a)                                Nominal    Units
Center wavelength                                1550       nm
Bandwidth                                        50         nm
Isolation (λc, 23 °C, all SOP)                   50         dB
Isolation (λc ± 25 nm, 0–65 °C, all SOP)         40         dB
Insertion loss (λc, 23 °C, all SOP)              0.5        dB
Insertion loss (λc ± 25 nm, 0–70 °C, all SOP)    0.7        dB
PMD                                              0.05       ps
PDL                                              < 0.15     dB
Return loss                                      50         dB
Maximum crosstalk                                50         dB
Power handling                                   500        mW
Operating temperature                            0–65       °C

(a) Specification values reported by New Focus [30].
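For readers who prefer linear quantities, the decibel figures of Table 7.1 can be converted to power fractions. The sketch below is a plain unit conversion; the dictionary labels are shorthand for the table rows and the function name is hypothetical.

```python
def db_to_power_fraction(value_db):
    """Fraction of optical power transmitted for a loss (or leakage) in dB."""
    return 10.0 ** (-value_db / 10.0)


# Values copied from Table 7.1; labels are shorthand for the table rows.
specs_db = {
    "insertion loss, center wavelength": 0.5,
    "insertion loss, full band and temperature": 0.7,
    "isolation, center wavelength": 50.0,
    "isolation, full band and temperature": 40.0,
}

for name, value_db in specs_db.items():
    fraction = db_to_power_fraction(value_db)
    print(f"{name}: {value_db} dB -> {fraction:.3g} of the input power")
```

A 0.5 dB insertion loss corresponds to roughly 89 % transmission, while 40 dB of isolation means that about one part in ten thousand of the backward power leaks through.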

7.5 Summary
Circulator architectures have experienced periods of rapid and competitive development followed by relatively stable design concepts. The two driving factors behind periods of development are the advent of new materials and sub-assemblies, such as iron garnets and dual-fiber collimators, and application-specific demands, such as compact form factor and low polarization-dependent loss. This chapter has selected from the broader patent and technical literature those circulators that highlight the conceptual evolution. Within the framework developed here, other circulators may be categorized and their relative performance predicted.
At the time of this writing, initiatives to create new functionality are under way. The bi-isolator [39] and the bi-circulator [8, 37, 38] are two leading examples. Both components are designed to work on a uniformly spaced wavelength-division multiplexed grid in order to facilitate bi-directional communication. Consider the separation of DWDM wavelengths into even and odd channels, and run the even channels east and the odd channels west. On even channels, the bi-isolator prevents light from travelling west; on odd channels the bi-isolator prevents light from travelling east. The bi-circulator operates on a similar principle in that the component circulates clockwise for even channels and counterclockwise for odd channels. The bi-circulator can act as a gateway between unidirectional and bidirectional communication systems. The element common to both the bi-isolator and bi-circulator is an interleaving filter. The filter designs are discussed in the references.
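The even/odd channel rule can be written down as a small truth table. The sketch below is an idealized behavioral model only: it ignores the interleaving-filter passband shape, and all function names are hypothetical.

```python
def propagation_direction(channel_index):
    """Direction assigned to a DWDM channel: even runs east, odd runs west."""
    return "east" if channel_index % 2 == 0 else "west"


def bi_isolator_passes(channel_index, direction):
    """True if an ideal bi-isolator passes this channel in this direction."""
    return direction == propagation_direction(channel_index)


def bi_circulator_sense(channel_index):
    """Circulation sense of an ideal bi-circulator for this channel."""
    return "clockwise" if channel_index % 2 == 0 else "counterclockwise"


for ch in range(4):
    print(
        ch,
        propagation_direction(ch),
        bi_isolator_passes(ch, "west"),
        bi_circulator_sense(ch),
    )
```

Each channel is passed in exactly one direction, so neighboring channels on the grid can share one fiber while travelling in opposite directions.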

References
1. V. Au-Yeung, Q.-D. Gao, and X. L. Wang, "Optical circulator," U.S. Patent 6,331,912, Dec. 18, 2001.
2. Y. Cheng, "Reflective optical non-reciprocal devices," U.S. Patent 5,471,340, Nov. 28, 1995.
3. Y. Cheng, "Optical circulator," U.S. Patent 5,574,596, Nov. 12, 1996.
4. Y. Cheng, "Optical circulator," U.S. Patent 5,878,176, Mar. 2, 1999.
5. Y. Cheng, "Optical circulator," U.S. Patent 5,930,422, June 27, 1999.
6. Y. Cheng, "Optical circulator," U.S. Patent 5,991,076, Nov. 23, 1999.
7. J. S. V. Delden, "Optical circulator having a simplified construction," U.S. Patent 5,212,586, May 18, 1993.
8. T. Ducellier, K. Tai, K.-W. Chang, J. Chen, and Y. Cheng, "Bi-directional circulator," U.S. Patent 2002/00 224 730, Feb. 28, 2002.
9. W. L. Emkey, "A polarization-independent optical circulator for 1.3 microns," Journal of Lightwave Technology, vol. LT-1, no. 3, pp. 466–469, Sept. 1983.
10. W. L. Emkey, "Optical circulator," U.S. Patent 4,464,022, Aug. 7, 1984.
11. Y. Fujii, "High-isolation polarization-insensitive optical circulator," Journal of Lightwave Technology, vol. 9, no. 10, pp. 1238–1243, Oct. 1991.
12. Y. Fujii, "High-isolation polarization-insensitive optical circulator coupled with single-mode fiber," Journal of Lightwave Technology, vol. 9, no. 4, pp. 456–460, Apr. 1991.
13. Y. Fujii, "High-isolation polarization-insensitive quasi-optical circulator," Journal of Lightwave Technology, vol. 10, no. 9, pp. 1226–1229, Sept. 1992.
14. Y. Huang and P. Xie, "Optical polarization beam combiner/splitter," U.S. Patent 6,331,913, Dec. 18, 2001.
15. Y. Huang and P. Xie, "Optical polarization beam combiner/splitter," U.S. Patent 6,282,025, Aug. 28, 2001.
16. Y. Huang and P. Xie, "Optical polarization beam combiner/splitter," U.S. Patent 6,373,631, Apr. 16, 2002.
17. H. Iwamura, H. Iwasaki, K. Kubodera, Y. Torii, and J. Noda, "Compact optical circulator for near-infrared region," Electronics Letters, vol. 15, no. 25, pp. 830–831, Dec. 1979.
18. M. Koga, "Optical circulator," U.S. Patent 5,204,771, Apr. 20, 1993.
19. M. Koga, "Compact quartzless optical quasi-circulator," Electronics Letters, vol. 30, no. 17, pp. 1438–1440, Aug. 1994.
20. M. Koga and T. Matsumoto, "Polarisation-insensitive high-isolation nonreciprocal device for optical circulator application," Electronics Letters, vol. 27, no. 11, pp. 903–905, May 1991.
21. M. Koga and T. Matsumoto, "High-isolation polarization-insensitive optical circulator for advanced optical communication systems," Journal of Lightwave Technology, vol. 10, no. 9, pp. 1210–1217, Sept. 1992.
22. H. Kuwahara, "Optical circulator," U.S. Patent 4,650,289, Mar. 17, 1987.
23. W.-Z. Li, V. Au-Yeung, and Q.-D. Gao, "Optical circulator," U.S. Patent 5,909,310, June 1, 1999.
24. W.-Z. Li, V. Au-Yeung, and Q.-D. Gao, "Optical circulator," U.S. Patent 5,930,039, July 27, 1999.
25. W. Z. Li and Y. Yang, "Method and system for splitting or combining optical signal," U.S. Patent 6,353,691, Mar. 5, 2002.
26. W. Z. Li, Y. Yang, F. Liu, and W. Luo, "Polarization splitter and combiner and optical devices using the same," U.S. Patent 6,493,140, Dec. 10, 2002.
27. Z. Liu, M. S. Wang, and J. Xu, "Loop optical circulator," U.S. Patent 2003/0 007 244 A1, Jan. 9, 2003.
28. T. Matsumoto, "Optical circulator," U.S. Patent 4,272,159, June 9, 1981.
29. T. Matsumoto and K. Sato, "Polarization-independent optical circulator: An experiment," Applied Optics, vol. 19, no. 1, pp. 108–112, Jan. 1980.
30. "New focus optical circulators (c-band)," New Focus Corporation, San Jose, California, 2003. [Online]. Available: http://www.newfocus.com/Online Catalog/literature/Cband.pdf
31. W. B. Ribbens, "An optical circulator," Applied Optics, vol. 4, pp. 1037–1038, 1965.
32. A. Shibukawa and M. Kobayashi, "Compact optical circulator for near-infrared region," Electronics Letters, vol. 14, no. 25, pp. 816–817, Dec. 1978.
33. A. Shibukawa and M. Kobayashi, "Compact optical circulator for optical fiber transmission," Applied Optics, vol. 18, no. 21, pp. 3700–3703, Nov. 1979.
34. M. Shirasaki, "Optical device," U.S. Patent 5,982,539, Nov. 9, 1999.
35. M. Shirasaki and S. Cao, "Optical circulator or switch having a birefringent wedge positioned between faraday rotators," U.S. Patent 6,226,115, May 1, 2001.
36. M. Shirasaki, H. Kuwahara, and T. Obokata, "Compact polarization-independent optical circulator," Applied Optics, vol. 20, no. 15, pp. 2683–2687, Aug. 1981.
37. K. Tai, Q. Guo, K. W. Chang, J. Chen, and M. Xu, "4-port interleavers and fully circulating bi-directional circulators," in Tech. Dig., Optical Fiber Communications Conference (OFC'01), Anaheim, CA, Mar. 2001, paper MK5.
38. K. Tai, K.-W. Chang, J. Chen, T. Ducellier, and Y. Cheng, "Wavelength-interleaving bidirectional circulators," IEEE Photonics Technology Letters, vol. 13, no. 4, pp. 320–322, 2001.
39. K. Tai, K.-W. Chang, J. Chen, T. Ducellier, and Y. Cheng, "Bi-directional isolator," U.S. Patent 6,587,266, July 1, 2003.
40. P. Xie and Y. Huang, "Compact polarization insensitive circulators with simplified structure and low polarization mode dispersion," U.S. Patent 6,049,426, Apr. 11, 2000.
41. P. Xie and Y. Huang, "Compact polarization insensitive circulators with simplified structure and low polarization mode dispersion," U.S. Patent 6,052,228, Apr. 18, 2000.
42. P. Xie and Y. Huang, "Compact polarization insensitive circulators with simplified structure and low polarization mode dispersion," U.S. Patent 6,212,008, Apr. 3, 2001.
43. P. Xie and Y. Huang, "Compact polarization insensitive circulators with simplified structure and low polarization mode dispersion," U.S. Patent 6,285,499, Sept. 4, 2001.
44. P. Xie and Y. Huang, "Optical circulators using beam angle tuners," U.S. Patent 6,175,448, Jan. 16, 2001.

8
Properties of Polarization-Dependent Loss and
Polarization-Mode Dispersion

Polarization-dependent loss (PDL) and polarization-mode dispersion (PMD) are two linear properties that are encountered when dealing with long spans of single-mode fiber. PDL and PMD are not confined to fiber, however, and may be present in components, simple optical elements such as birefringent crystals and polarizers, and instruments that generate these effects. Yet the study of PDL and PMD and their concatenation properties was motivated by research conducted over the last 30 years on fiber-optic communication links. Indeed, polarization-dependent effects in fiber optics have been studied since the 1970s [37, 55].
Polarization-dependent loss refers to energy loss that is preferential to one polarization state. In the Jones picture, one axis suffers more loss than the other. This differential loss changes the output polarization state and imparts a common loss to an unpolarized light beam. The complement of polarization-dependent loss is polarization-dependent gain (PDG). PDG is an effect that is related to preferential gain between signal and noise in optical amplifiers, which, if left unaddressed, can cause impairments of its own sort. The specifics of PDL detailed in this work are equally applicable to PDG, but impairment analysis of the latter must include the combined effects of noise and signal. This is beyond the scope of the present text and the Reader is referred to Desurvire [7].
Polarization-mode dispersion refers to the polarization effects of concatenated lossless birefringent segments. Each homogeneous segment produces differential-group delay. PMD is generated when two or more differential-group delay segments are placed in cascade. While PMD can have the properties of a single homogeneous birefringent element (when the birefringent axes are aligned along a multi-segment cascade), a single homogeneous birefringent element does not possess all of the properties of PMD. PMD is not defined according to an eigenvector analysis of the transformation matrix as one is accustomed to for a single birefringent element.
The combination of PDL and PMD creates effects which are quite complicated and which can impair a communication system more than either effect alone. For example, PDL is generally wavelength independent, while PMD is generally wavelength dependent. Addition of some PMD to PDL will generally result in wavelength-dependent PDL. The addition of some PDL to PMD can in some cases result in differential-group delay that exceeds what one would expect from the PMD alone.
In terms of partial polarization, PDL and PMD have opposite effects. PDL tends to polarize a partially polarized signal since it is a partial polarizer; PMD, however, tends to depolarize a signal since PMD creates polarization-state dispersion in frequency. Lyot depolarizers exploit polarization-state dispersion to (pseudo-)depolarize a transmission signal (cf. §1.5.3), but since this is tantamount to adding PMD to the system, such one- and two-stage depolarizers, studied extensively in the late 1980s, have fallen out of favor.
The Mueller matrix for PDL and PMD highlights some of the properties one can expect from these two effects. PDL, being the effect of a partial polarizer, is represented by a Jones matrix that is Hermitian. When converted to a Mueller matrix $\mathbf{M}_{PDL}$ in Stokes space, all 16 entries may be occupied:

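$$
H \rightarrow \mathbf{M}_{PDL} =
\begin{pmatrix}
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet \\
\bullet & \bullet & \bullet & \bullet
\end{pmatrix},
\quad \text{and} \quad
H \rightarrow \mathbf{M}_{PMD} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \bullet & \bullet & \bullet \\
0 & \bullet & \bullet & \bullet \\
0 & \bullet & \bullet & \bullet
\end{pmatrix}
$$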

Given this form, PDL is not reversible unless gain is added. PMD, being a physical quantity that is measurable, is also represented by a Hermitian Jones matrix. Unlike PDL, however, the PMD matrix is identically traceless. One can argue this simply based on the lossless cascade of retarders that generates PMD. Accordingly, its Mueller matrix $\mathbf{M}_{PMD}$ only contains entries in the lower sub-matrix. The effect of PMD, therefore, can in principle be reversed. Finally, the addition of PDL anywhere along a birefringent cascade scatters the PMD Mueller-matrix entries into all 16 sites. The combination cannot therefore be strictly inverted.
The following sections of this chapter define the PDL and PMD vectors, show their effects on input states of polarization, derive equations of motion for concatenated vectors, and illustrate how these effects impact a waveform. The following chapter details the statistical properties of these effects.

8.1 Polarization-Dependent Loss


There are many mechanisms that generate polarization-dependent loss along a fiber-optic link. Micro-optic components such as those studied in the preceding chapters generate PDL when there is preferential coupling through a focusing lens of one polarization-diversity path over another. Integrated optic filters generate PDL because the passband frequency locations are polarization dependent. Fused fiber couplers generate PDL due to polarization-dependent coupling ratios. Optical amplifiers generate PDG due to preferential gain of the signal polarization over the orthogonal noise polarization [59]. Finally, micro-bends in fiber will generate PDL. Materials themselves can generate PDL because of dichroism of the molecules, such as in polymer waveguides. Regardless of the origin, however, a single formalism describes the effects.

Fig. 8.1. Effects of PDL. a) A completely depolarized signal is completely polarized after transmission through a perfect polarizer. b) A partial PDL element continuously changes the polarization state of input light and diminishes the overall intensity.

8.1.1 Definitions
Table 8.1 gives a symbol list for the terms used in the following analysis. There is some inconsistency of notation in the literature, but the notation adopted here is intended to include the broadest overlap.
Figure 8.1 illustrates two effects of PDL. When PDL is complete, the element acts as a perfect polarizer (Fig. 8.1(a)). A perfect polarizer transmits only states that have a finite projection along the polarizer axis. After polarization, a completely depolarized signal becomes completely polarized. The PDL studied below is for partial polarization, not complete. Moreover, polarization occurs continuously through a differentially lossy medium. Figure 8.1(b) illustrates the evolution of light that is initially circularly polarized along a homogeneous PDL element. As the light travels, the intensity along the loss axis diminishes while that along the neutral axis is unaltered. One can see how PDL transforms the polarization state along the element. Light launched along the PDL axis is unaltered and undiminished, and light launched along the orthogonal axis is unaltered in state but suffers loss.
Polarization-dependent loss $\rho_{dB}$ is defined by international standards bodies such as the TIA and IEC as [34, 35, 60]

$$
\rho_{dB} \equiv 10 \log_{10} \left( \frac{T_{\max}}{T_{\min}} \right) \tag{8.1.1}
$$

where $T_{\max}$ and $T_{\min}$ are the maximum and minimum transmission intensities through the system. PDL is defined in decibels and is positive. Maximum transmission intensity occurs when the polarization state of a completely polarized probe beam is aligned to the maximum transmission axis of the PDL element. That axis may be anywhere on the Poincaré sphere. Minimum transmission occurs for the orthogonal polarization (even in the presence of PMD).
Consider an intuitive example. The Jones matrix of a PDL element aligned to the horizontal ($S_1$ in Stokes space) may be written as

$$
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & e^{-\alpha} \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \tag{8.1.2}
$$

where $\alpha$ is the loss coefficient. When the input polarization is $(1, 0)^T$ the output intensity is $|v_1|^2 = 1$. Similarly, for $(0, 1)^T$ the output intensity is $|v_2|^2 = \exp(-2\alpha)$. Therefore the PDL is

$$
\rho_{dB} = (20 \log_{10} e)\, \alpha \tag{8.1.3}
$$

This relation holds true for any orientation of the PDL vector. Note that $20 \log_{10} e \approx 8.686$. Moreover, for $\rho_{dB} = 3$ dB, $\alpha \approx 0.345$.
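The conversion between the loss coefficient and PDL in decibels is easy to check numerically. The sketch below assumes the diagonal example of (8.1.2), for which the extreme transmissions are 1 and exp(−2α); the function and variable names are hypothetical.

```python
import math

# Conversion factor 20*log10(e), about 8.686 dB per unit of loss coefficient.
DB_PER_UNIT_ALPHA = 20.0 * math.log10(math.e)


def pdl_db_from_transmissions(t_max, t_min):
    """PDL in dB from the extreme transmission intensities, as in (8.1.1)."""
    return 10.0 * math.log10(t_max / t_min)


def pdl_db_from_alpha(alpha):
    """PDL in dB from the loss coefficient, as in (8.1.3)."""
    return DB_PER_UNIT_ALPHA * alpha


def alpha_from_pdl_db(pdl_db):
    """Loss coefficient from PDL in dB (inverse of (8.1.3))."""
    return pdl_db / DB_PER_UNIT_ALPHA


alpha = alpha_from_pdl_db(3.0)

# Extreme transmissions of the diagonal element in (8.1.2).
t_max = 1.0
t_min = math.exp(-2.0 * alpha)

print(f"alpha for 3 dB of PDL: {alpha:.3f}")
print(f"PDL recovered from T_max/T_min: {pdl_db_from_transmissions(t_max, t_min):.3f} dB")
```

Both routes give the same PDL, which is the consistency between (8.1.1) and (8.1.3) claimed in the text.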
In a fiber-optic link a PDL element is generally located somewhere between the terminations. As single-mode fiber does not preserve polarization in a practical link, the apparent orientation of the PDL vector at the fiber termination generally will not give a purely diagonal Jones matrix. Instead, the PDL vector can point in any direction. In particular the output matrix is $P' = U P V$, where $U, V$ are unitary operators.
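The generalization of the PDL operator $P$, whose simple case is that in (8.1.2), comes in the matrix exponential form:

$$
P = e^{-\alpha/2} \exp\left( \frac{\vec{\alpha} \cdot \vec{\sigma}}{2} \right) \tag{8.1.4}
$$

where the local PDL vector is $\vec{\alpha} = \alpha \hat{\alpha}$ and $\hat{\alpha}$ is a unit vector in Stokes space that points in the direction of maximum transmission. This matrix exponential operator is expanded using (2.5.77) on page 62, yielding

$$
P = e^{-\alpha/2} \left( I \cosh(\alpha/2) + (\hat{\alpha} \cdot \vec{\sigma}) \sinh(\alpha/2) \right) \tag{8.1.5}
$$

Indeed (8.1.2) is recovered with $\hat{\alpha} = \hat{s}_1$.
An input state of polarization is altered by the PDL element according to

$$
|t\rangle = P\, |s\rangle
$$

Assuming that $\langle s | s \rangle = 1$, the transmission is

$$
\langle t | t \rangle = \langle s |\, P^\dagger P \,| s \rangle
$$

As the exponent of $P$ is real, one has $P^\dagger P = P^2$. Therefore,

$$
\langle t | t \rangle = e^{-\alpha} \langle s |\, \left( I \cosh\alpha + (\hat{\alpha} \cdot \vec{\sigma}) \sinh\alpha \right) | s \rangle
= e^{-\alpha} \left( \cosh\alpha + (\hat{\alpha} \cdot \hat{s}) \sinh\alpha \right)
$$

Denoting $T_p = \langle t | t \rangle$, the transmission intensity is

$$
T_p = \frac{1}{1 + \tanh\alpha} \left( 1 + \tanh\alpha \, (\hat{\alpha} \cdot \hat{s}) \right) \tag{8.1.6}
$$

Table 8.1. Symbol Definitions for PDL

Symbol                              Range                          Definition
$\rho_{dB}$                         $0 \le \rho_{dB} < \infty$     Polarization-dependent loss in decibels (dB) as defined by the TIA and IEC
$\vec{\alpha}$, $\hat{\alpha}$                                     PDL vector and unit vector in Stokes space. $\hat{\alpha}$ points in the direction of maximum transmission
$\alpha$                            $0 \le \alpha < \infty$        Loss coefficient $|\vec{\alpha}|$, related to $\rho_{dB}$ via: $\rho_{dB} = (20 \log_{10} e)\, \alpha$
$\Gamma$                            $0 \le \Gamma \le 1$           Unit loss; normalization of $\alpha$ via: $\Gamma = \tanh\alpha$
$\vec{\Gamma}$                                                     Vector of cumulative PDL over a concatenation. $\hat{\Gamma}$ points in the direction of maximum transmission
$\vec{\alpha}(z)$                                                  PDL vector per unit length
$\vec{\Gamma}(z)$                                                  Cumulative PDL vector per unit length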

The transmission depends not only on the loss coecient but the relative
orientation of the PDL
to the incoming state of polarization s. Note that s
is the state incident on the PDL element, not necessarily the state launched
into a ber far away from the element. The extrema of transmission are
1

s = 1
Tp =
2
e

s = 1
Calculation of the PDL in dB recovers the denition (8.1.3). The relation
between loss coecient PDL in decibels clearly holds in general.
One point to note is that transmission coefficients $T_p$ do not multiply. That is, $T_{p21} \neq T_{p2}\,T_{p1}$. This is of course the result of the change in the output polarization state from each PDL element; a cascade exhibits multiple polarization alterations along its length. The correct calculation is $T_{p12} = \langle s|\,P_1^{\dagger}P_2^{\dagger}P_2P_1\,|s\rangle$.
The scalar transmission $T_p$ can be mapped in Stokes space to show all possible polarization inputs. Figure 8.2 illustrates four such examples. The surfaces are the product of the transmission coefficient with input states that lie on the unit sphere. Figures 8.2(a,b) illustrate single PDL elements with 3 dB and 30 dB PDL, respectively. Note that the latter surface goes concave. The transmission contour along the equator is projected to the bottom surface, where the orthogonal directions of no-loss and high-loss are apparent. Figures 8.2(c,d) illustrate the transmission through two PDL elements in cascade, both having 3 dB PDL. In the first case the PDL elements are aligned, resulting in a 6 dB overall PDL. However, in the second case the PDL elements are crossed. This results in a sphere with radius 0.5. There is no PDL at the output, but there is a uniform insertion loss of 3 dB. This shows a simple example of the more general fact that PDL can increase or decrease depending on the relative orientation of multiple elements, but the insertion loss always accumulates.
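The aligned and crossed cascades described above can be reproduced numerically. The sketch below is illustrative only: it builds the operator of (8.1.5) as a 2x2 Jones matrix, assuming the Pauli ordering in which $\sigma_1 = \mathrm{diag}(1, -1)$ so that $S_1$ is the horizontal axis, and extracts $T_{max}$ and $T_{min}$ as the eigenvalues of the Hermitian product of the cascade with its adjoint. The helper names are hypothetical.

```python
import math


def pdl_operator(alpha, a_hat):
    """Jones operator of (8.1.5) for loss coefficient alpha and unit Stokes vector a_hat."""
    a1, a2, a3 = a_hat
    g = math.exp(-alpha / 2.0)
    c, s = math.cosh(alpha / 2.0), math.sinh(alpha / 2.0)
    # a_hat . sigma, assuming sigma_1 = diag(1, -1) so that S1 is the horizontal axis
    return [[g * (c + s * a1), g * s * (a2 - 1j * a3)],
            [g * s * (a2 + 1j * a3), g * (c - s * a1)]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def cascade_summary(operators):
    """Return (PDL in dB, T_depol) for operators listed in order of propagation."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for op in operators:
        total = matmul(op, total)      # later elements multiply from the left
    m = matmul(dagger(total), total)   # Hermitian; its eigenvalues are Tmax and Tmin
    mean = (m[0][0].real + m[1][1].real) / 2.0
    radius = math.sqrt(((m[0][0].real - m[1][1].real) / 2.0) ** 2 + abs(m[0][1]) ** 2)
    t_max, t_min = mean + radius, mean - radius
    return 10.0 * math.log10(t_max / t_min), mean


alpha = 3.0 / (20.0 * math.log10(math.e))          # 3 dB per element, via (8.1.3)
s1_axis, minus_s1_axis = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)

aligned_db, aligned_tdepol = cascade_summary(
    [pdl_operator(alpha, s1_axis), pdl_operator(alpha, s1_axis)])
crossed_db, crossed_tdepol = cascade_summary(
    [pdl_operator(alpha, s1_axis), pdl_operator(alpha, minus_s1_axis)])

print(f"aligned: PDL = {aligned_db:.2f} dB")
print(f"crossed: PDL = {crossed_db:.2f} dB, T_depol = {crossed_tdepol:.3f}")
```

Crossed elements correspond to antiparallel PDL vectors in Stokes space, which is why the second call uses $-\hat s_1$. The mean of the two eigenvalues is the transmission averaged over all input states.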
Thus far the input has been considered completely polarized. When the input is completely depolarized, the transmission is averaged over all polarization states. Since it is clear that $\langle\hat\alpha\cdot\hat s\rangle = 0$, the average transmission for unpolarized light is

$$\langle T_p\rangle = T_{depol}$$

where $T_{depol}$ is the transmission coefficient for depolarized light. To connect with the literature, (8.1.6) is recast into

$$T_p = T_{depol}\left(1 + \Gamma\,(\hat\alpha\cdot\hat s)\right) \tag{8.1.7}$$

where the loss coefficient is functionally normalized as

$$\Gamma \equiv \tanh\alpha \tag{8.1.8}$$

and the transmission for depolarized light is

$$T_{depol} = \frac{1}{1+\Gamma} \tag{8.1.9}$$

The minimum and maximum transmissions are therefore

$$T_{max} = T_{depol}\,(1 + \Gamma), \qquad T_{min} = T_{depol}\,(1 - \Gamma) \tag{8.1.10}$$

The relationship between the normalized loss coefficient and the transmission extrema is

$$\Gamma = \frac{T_{max} - T_{min}}{T_{max} + T_{min}}, \quad\text{and}\quad \frac{T_{max}}{T_{min}} = \frac{1+\Gamma}{1-\Gamma} \tag{8.1.11}$$

The decibel expression for PDL can be written in these terms:

$$\rho_{dB} = 10\log_{10}\left(\frac{1+\Gamma}{1-\Gamma}\right) \tag{8.1.12}$$
Moreover, these definitions are used to write the Jones and Mueller matrices for a PDL element oriented along the horizontal:

$$J_{s_1} = T_{depol}^{1/2}\begin{pmatrix} \sqrt{1+\Gamma} & 0 \\ 0 & \sqrt{1-\Gamma} \end{pmatrix} \tag{8.1.13}$$

and

$$M_{s_1} = T_{depol}\begin{pmatrix} 1 & \Gamma & 0 & 0 \\ \Gamma & 1 & 0 & 0 \\ 0 & 0 & \sqrt{1-\Gamma^2} & 0 \\ 0 & 0 & 0 & \sqrt{1-\Gamma^2} \end{pmatrix} \tag{8.1.14}$$

where the Mueller matrix is related to the Jones matrix through the spin-vector expression (1.4.22) on page 18. Note that the Jones matrix can also be written as $J = \mathrm{diag}(\sqrt{T_{max}}, \sqrt{T_{min}})$.

Fig. 8.2. These surfaces plot the transmission $T_p$ as a function of input polarization state $\hat s_{in}$. The contours at the bottom are projections of $T_p$ along the equator. a) Transmission surface for single 3 dB PDL vector at +45°. b) Transmission surface for same PDL element but with 30 dB PDL. Note this surface is concave. c) Transmission surface after two aligned PDL elements, 3 dB PDL each. d) Transmission surface after two orthogonally aligned PDL elements, 3 dB PDL each. Note surface is spherical (no PDL) but has a smaller radius reflecting the insertion loss.


8.1.2 Change of Polarization State
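The state of polarization at the output from a PDL element generally differs from the input state. This is easily imagined: a right-hand circular state that is transmitted through a PDL element has one axis shortened. Accordingly, the state is altered from circular to elliptical.
The output polarization state is determined from the PDL operator $P$ in the following way. The output unit Stokes vector is

$$\hat t = \frac{\langle t|\,\vec\sigma\,|t\rangle}{\langle t|t\rangle} \tag{8.1.15}$$

Substitution of the $P$ expansion (8.1.5) into the numerator yields

$$\langle t|\,\vec\sigma\,|t\rangle = e^{-\alpha}\,\langle s|\left(I c_{\alpha/2} + s_{\alpha/2}(\hat\alpha\cdot\vec\sigma)\right)\vec\sigma\left(I c_{\alpha/2} + s_{\alpha/2}(\hat\alpha\cdot\vec\sigma)\right)|s\rangle$$

where $s_{\alpha/2} = \sinh(\alpha/2)$ and $c_{\alpha/2} = \cosh(\alpha/2)$. Application of the identities

$$\langle s|\,\vec\sigma\,|s\rangle = \hat s$$

$$\langle s|\,(\hat\alpha\cdot\vec\sigma)\vec\sigma + \vec\sigma(\hat\alpha\cdot\vec\sigma)\,|s\rangle = 2\hat\alpha$$

$$\langle s|\,(\hat\alpha\cdot\vec\sigma)\,\vec\sigma\,(\hat\alpha\cdot\vec\sigma)\,|s\rangle = 2\langle s|\,\hat\alpha(\hat\alpha\cdot\vec\sigma)\,|s\rangle - \langle s|\,\vec\sigma\,|s\rangle = 2(\hat\alpha\cdot\hat s)\,\hat\alpha - \hat s$$

produces

$$\langle t|\,\vec\sigma\,|t\rangle = e^{-\alpha}\left(\hat s + \sinh\alpha\left(1 + \tanh(\alpha/2)\,(\hat\alpha\cdot\hat s)\right)\hat\alpha\right)$$

Using the previously determined expression for $\langle t|t\rangle$, the unit Stokes vector $\hat t$ is governed by the relation [17]

$$\hat t = \frac{\sqrt{1-\Gamma^2}}{1 + \Gamma(\hat\alpha\cdot\hat s)}\,\hat s + \frac{\Gamma + \left(1 - \sqrt{1-\Gamma^2}\right)(\hat\alpha\cdot\hat s)}{1 + \Gamma(\hat\alpha\cdot\hat s)}\,\hat\alpha \tag{8.1.16}$$

where the following identification is made

$$\frac{\tanh(\alpha/2)}{\tanh\alpha} = \frac{1 - \sqrt{1-\Gamma^2}}{\Gamma^2}$$

so that $\tanh(\alpha/2) = \left(1 - \sqrt{1-\Gamma^2}\right)/\Gamma$.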

Fig. 8.3. Two examples of $\hat t_{out}$. Left figures show mesh of normalized $\hat t_{out}$ for one and two PDL elements. Right figures show vector difference plot to indicate the change $\hat t_{out} - \hat s_{in}$. a) and b) Single 3 dB PDL element aligned along $S_2$. All states but $S_2$ are pulled toward $S_2$. c) and d) Two 3 dB PDL elements cascaded, second element having elliptical PDL vector (or intermediate unitary rotation and linear PDL vector). Note in d) that cumulative PDL vector $\vec\Gamma$ points in a direction between $\hat\alpha_1$ and $\hat\alpha_2$. It is along $\vec\Gamma$ that the output states are pulled.

The transformation expression (8.1.16) has a characteristically different form than encountered previously in this text (Fig. 8.3). One is accustomed to $\hat t = R\,\hat s$, where $R$ is a rotation operator in Stokes space. Instead, polarization transformation through a PDL element has the form $\hat t = u\,\hat s + v\,\hat\alpha$, where $u$, $v$ are positive coefficients that are themselves a function of the relative orientation of $\hat s$ and $\hat\alpha$. Physically, the input polarization state is pulled toward the PDL vector direction, where the pull strength is dictated by $\Gamma$ and the relative orientation. This pulling effect is shown in Fig. 8.3(b,d).
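As a cross-check on the pulling form $\hat t = u\,\hat s + v\,\hat\alpha$, the sketch below evaluates the output state two ways for a PDL vector along $S_1$: from the closed form of (8.1.16), written here with coefficients $u = \sqrt{1-\Gamma^2}/(1+\Gamma\,\hat\alpha\cdot\hat s)$ and $v = (\Gamma + (1-\sqrt{1-\Gamma^2})\,\hat\alpha\cdot\hat s)/(1+\Gamma\,\hat\alpha\cdot\hat s)$, and from the Mueller matrix (8.1.14) followed by normalization. The function names and the test states are illustrative assumptions.

```python
import math


def output_state_formula(gamma, a_hat, s_hat):
    """Unit output Stokes vector in the form t = u*s + v*a of (8.1.16)."""
    root = math.sqrt(1.0 - gamma ** 2)
    dot = sum(a * s for a, s in zip(a_hat, s_hat))
    u = root / (1.0 + gamma * dot)
    v = (gamma + (1.0 - root) * dot) / (1.0 + gamma * dot)
    return tuple(u * s + v * a for s, a in zip(s_hat, a_hat))


def output_state_mueller_s1(gamma, s_hat):
    """Same quantity from the Mueller matrix (8.1.14), PDL vector along S1."""
    root = math.sqrt(1.0 - gamma ** 2)
    s1, s2, s3 = s_hat
    s0_out = 1.0 + gamma * s1          # the common factor T_depol cancels in the ratio
    return ((gamma + s1) / s0_out, root * s2 / s0_out, root * s3 / s0_out)


gamma = math.tanh(3.0 / (20.0 * math.log10(math.e)))   # 3 dB PDL, via (8.1.3) and (8.1.8)
circular = (0.0, 0.0, 1.0)                              # a circular state at the S3 pole
t_formula = output_state_formula(gamma, (1.0, 0.0, 0.0), circular)
t_mueller = output_state_mueller_s1(gamma, circular)
norm = math.sqrt(sum(x * x for x in t_formula))
print(t_formula, norm)
```

For the circular input the output acquires an $S_1$ component equal to $\Gamma$ while remaining on the unit sphere, which is the circular-to-elliptical change described at the start of this subsection.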

Fig. 8.4. Degree-of-polarization surfaces, before and after single PDL element with 3 dB loss and aligned along $S_1$, illustrate repolarization. a) View of the $(S_1, S_3)$ plane. b) View of the $(S_1, S_2)$ plane. In both cases, spherical surface has $D_{in} = 0.5$. The $D_{out}$ surface is a map of $D_{out}(\hat s_{in})$, or the DOP as a function of input polarization. The right hemisphere of both plots shows repolarization of the signal ($D_{out} \ge D_{in}$). The left hemisphere shows increased depolarization.

8.1.3 Repolarization
Intuitively one expects that depolarized light which passes through an ideal polarizer attains a unity degree of polarization. Repolarization [45] is the effect of generating partially polarized light from depolarized light by transiting through one or more PDL elements. However, transit of a PDL element with partially polarized light (cf. §1.5) can either increase or decrease the degree of polarization, depending on orientation [17].
The Mueller matrix for a single PDL element (8.1.14) governs re- and depolarization. Without loss of generality the matrix elements will remain fixed in the analysis below while the input polarization state varies. To summarize the findings:

$$D_{in} = 1 \;\longrightarrow\; D_{out} = 1$$

$$D_{in} = 0 \;\longrightarrow\; D_{out} = \Gamma$$

$$D_{in} = d \;\longrightarrow\; D_{out} = f(d, \Gamma, \hat\alpha\cdot\hat s)$$

where $f$ is the function (8.1.18). These cases are derived as follows. The output Stokes vector for an arbitrary input having $D_{in} = d$ is

$$S_{out} = T_{depol}\begin{pmatrix} 1 & \Gamma & 0 & 0 \\ \Gamma & 1 & 0 & 0 \\ 0 & 0 & \sqrt{1-\Gamma^2} & 0 \\ 0 & 0 & 0 & \sqrt{1-\Gamma^2} \end{pmatrix}\begin{pmatrix} 1 \\ d\cos\phi\sin\theta \\ d\sin\phi\sin\theta \\ d\cos\theta \end{pmatrix} \tag{8.1.17}$$

Fig. 8.5. Repolarization surfaces for $D_{in} = 0.25$ after one (a) and three (b) PDL elements. All elements have 3 dB PDL. Note that for these large PDL values the repolarization is quickly established.

For $d = 1$ the output partial polarization in all cases is $D_{out} = 1$. Conversely, for $d = 0$, only the $S_0$ and $S_1$ terms survive so $D_{out} = \Gamma$. For an arbitrary $d$ such that $0 \le d \le 1$, the output DOP is

$$D_{out} = \frac{\sqrt{(1 + d\,\Gamma\,c_\phi s_\theta)^2 - (1 - d^2)(1 - \Gamma^2)}}{1 + d\,\Gamma\,c_\phi s_\theta} \tag{8.1.18}$$

where $c_\phi s_\theta = \cos\phi\sin\theta$ is the $S_1$ component of the unit input state. Two degree-of-polarization surfaces are plotted in Fig. 8.4. The partial polarization at the output increases when the polarized portion of the input light is aligned to the direction of maximum transmission and decreases when aligned to the direction of maximum extinction. This behavior follows what one would expect.
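The three cases summarized above can be verified directly. The following sketch, with hypothetical function names, computes the output degree of polarization once by applying the $S_1$-aligned Mueller matrix (8.1.14) to a partially polarized input and once from the closed form (8.1.18), where the argument `projection` stands for the $S_1$ component of the unit input state.

```python
import math


def dop_out_mueller(gamma, d, s_hat):
    """DOP after the S1-aligned Mueller matrix (8.1.14) for input DOP d along s_hat."""
    root = math.sqrt(1.0 - gamma ** 2)
    s1, s2, s3 = (d * x for x in s_hat)       # polarized part of the input Stokes vector
    s0_out = 1.0 + gamma * s1                 # T_depol cancels in the ratio below
    pol_out = math.sqrt((gamma + s1) ** 2 + (root * s2) ** 2 + (root * s3) ** 2)
    return pol_out / s0_out


def dop_out_formula(gamma, d, projection):
    """Closed form (8.1.18); projection is the S1 component of the unit input state."""
    x = 1.0 + d * gamma * projection
    return math.sqrt(x ** 2 - (1.0 - d ** 2) * (1.0 - gamma ** 2)) / x


gamma = math.tanh(3.0 / (20.0 * math.log10(math.e)))   # 3 dB PDL element
aligned = dop_out_formula(gamma, 0.5, +1.0)            # polarized part along max transmission
opposed = dop_out_formula(gamma, 0.5, -1.0)            # polarized part along max extinction
print(f"D_in = 0.5: D_out = {aligned:.3f} (aligned), {opposed:.3f} (opposed)")
```

The aligned case exceeds the input DOP of 0.5 and the opposed case falls below it, matching the two hemispheres of Fig. 8.4.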
While the above examples used a single PDL vector aligned along $S_1$, a Mueller matrix can be calculated for any PDL operator $P$ or concatenation of operators. Such a generalized Mueller matrix is used to calculate the repolarization surfaces illustrated in Fig. 8.5. In both cases $D_{in} = 0.25$ and PDL for each segment is 3 dB. While the single-segment case exhibits both re- and de-polarization, the three-segment case exhibits repolarization for all input states.
The effect of repolarization is statistical when many PDL elements are distributed along a birefringent cascade. This is the model of a transmission link. Repolarization can be an impairment for long-haul transmission systems that launch polarized light to mitigate polarization-dependent gain (PDG) effects from the erbium amplifiers. As the light becomes repolarized, PDG impairments begin to accumulate. The statistics of repolarization are derived in [45].

8.1.4 PDL Evolution Equations


Polarization-dependent loss in a communications link comes from optical components or imperfections along the optical fiber. In either case a local PDL vector $\vec\alpha$ describes each source. Polarization-dependent loss through a chain of PDL elements accumulates and is denoted by $\vec\Gamma$. The cumulative PDL vector will always try to track the local PDL element, but if the elements have low PDL and are randomly distributed, the cumulative vector eventually decorrelates. In this section the evolution of $\vec\Gamma$ is derived; its driving term is the local PDL $\vec\alpha$.
Two slightly different derivations for the evolution of $\vec\Gamma$ through a chain of arbitrarily oriented PDL elements are given by Gisin and Huttner, and Mecozzi and Shtaif [44]. The Gisin and Huttner derivation [18, 31] tracks the evolution of the PDL vector as defined at the output. Mecozzi and Shtaif do the opposite, tracking the PDL vector as defined at the input. Consider the output polarization produced by a chain of PDL elements acting on an input state: $|t\rangle = A\,T\,|s\rangle$ where, as in (8.1.4), $A$ is the common loss and $T$ is a Hermitian operator. The output intensity is then

$$\langle t|t\rangle = A^2\,\langle s|\,T^{\dagger}T\,|s\rangle \tag{8.1.19}$$

Since $T$ is Hermitian, the operator $T^{\dagger}T$ may be written in spin-vector form: $T^{\dagger}T = p_o I + \vec p\cdot\vec\sigma$. Taking $\langle s|s\rangle = 1$, one finds

$$\langle t|t\rangle = A^2\,(p_o + \vec p\cdot\hat s) \tag{8.1.20}$$

The maximum and minimum output intensities are for alignment and anti-alignment of the input state $\hat s$ with $\vec p$. Therefore $\vec p$ represents the cumulative PDL vector referenced to the input.
Reference to the output state comes from the complement of (8.1.19):

$$\langle s|s\rangle = A^{-2}\,\langle t|\left(T T^{\dagger}\right)^{-1}|t\rangle$$

Denoting $Q Q^{\dagger}$ as the inverse of $T T^{\dagger}$, the spin-vector form is $Q Q^{\dagger} = q_o I + \vec q\cdot\vec\sigma$. Still assuming the input intensity is normalized, the output intensity can be expressed as a function of the output polarization state:

$$\langle t|t\rangle = \frac{A^2}{q_o + \vec q\cdot\hat t}$$

Now the minimum and maximum output intensities are for alignment and anti-alignment of the output state $\hat t$ with $\vec q$.
The derivation below follows the Mecozzi and Shtaif choice of reference frame: equations of motion are referred to the input.
Referring back to (8.1.20), the extrema in transmission are clearly

$$T_{max} = A_N^2\,(p_o + p), \quad\text{and}\quad T_{min} = A_N^2\,(p_o - p)$$

where $p = |\vec p|$. The PDL is therefore

$$\rho_{dB} = 10\log_{10}\left(\frac{1 + p/p_o}{1 - p/p_o}\right) \tag{8.1.21}$$

In light of the analogous expression (8.1.12), one can expect that the cumulative PDL magnitude is related to the spin-vector as $\Gamma = p/p_o$. The vector form follows this relation and will be used in a moment.
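The evolution of $p_o$ and $\vec p$ is determined by the evolution of $T^{\dagger}T$. In the continuum limit, the cumulative loss $A$ and PDL operator $T$ are

$$A = \exp\left(-\frac{1}{2}\int_0^z \alpha(z')\,dz'\right), \quad\text{and}\quad T = \exp\left(\frac{1}{2}\int_0^z \vec\alpha(z')\cdot\vec\sigma\,dz'\right) \tag{8.1.22}$$

where $\vec\alpha(z)$ represents the derivative in $z$ of the local PDL vector. (Gisin and Huttner use a discretized version to emphasize the form of the derivatives to follow.) The output intensity is

$$\langle t|t\rangle = A^2\,\langle s|\,T^{\dagger}T\,|s\rangle = A^2\,\langle s|\,p_o I + \vec p\cdot\vec\sigma\,|s\rangle \tag{8.1.23}$$

Now, the evolution is determined by the differential change of $T^{\dagger}T$. Taking the derivative along $z$ (from input to output) of both sides of (8.1.23) gives

$$\frac{dT^{\dagger}}{dz}\,T + T^{\dagger}\,\frac{dT}{dz} = \frac{d}{dz}\left(p_o I + \vec p\cdot\vec\sigma\right)$$

with the initial conditions of $p_o = 1$ and $\vec p = 0$ at $z = 0$. The spatial evolution of $T$ and its adjoint comes from (8.1.22) and (2.5.82) on page 63:

$$\frac{dT^{\dagger}}{dz}\,T + T^{\dagger}\,\frac{dT}{dz} = \frac{1}{2}\left[(\vec\alpha(z)\cdot\vec\sigma)\,T^{\dagger}T + T^{\dagger}T\,(\vec\alpha(z)\cdot\vec\sigma)\right]$$

Plugging in the spin-vector form of $T^{\dagger}T$ and using the anti-commutator relation $\{(\vec a\cdot\vec\sigma), (\vec b\cdot\vec\sigma)\} = 2(\vec a\cdot\vec b)\,I$ gives

$$\frac{d}{dz}\left(p_o I + \vec p\cdot\vec\sigma\right) = p_o\,(\vec\alpha(z)\cdot\vec\sigma) + (\vec\alpha(z)\cdot\vec p)\,I$$

Finally, separation of spin-vector from scalar terms gives the coupled evolution equations

$$\frac{dp_o}{dz} = \vec\alpha(z)\cdot\vec p, \quad\text{and}\quad \frac{d\vec p}{dz} = p_o\,\vec\alpha(z) \tag{8.1.24}$$

These equations are converted to equations for the cumulative PDL vector and depolarization transmission. Define the cumulative PDL vector as

$$\vec\Gamma(z) \equiv \frac{\vec p(z)}{p_o(z)} \tag{8.1.25}$$

and the depolarization transmission $T_{depol}$ as

$$T_{depol} = A_N^2\,p_o \tag{8.1.26}$$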

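Differentiation with respect to length and substitution of (8.1.24) yields the equations of motion for $\vec\Gamma$ and $T_{depol}$:

$$\frac{d\vec\Gamma}{dz} = \vec\alpha - \left(\vec\alpha\cdot\vec\Gamma\right)\vec\Gamma \tag{8.1.27a}$$

$$\frac{dT_{depol}}{dz} = -\left(\alpha - \vec\alpha\cdot\vec\Gamma\right)T_{depol} \tag{8.1.27b}$$

These elegant equations predict the direction and length of the PDL vector in Stokes space as well as the depolarized transmission coefficient. Note that $0 \le |\vec\Gamma| \le 1$, so the PDL vector is bound by the unit sphere.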
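The equations of motion (8.1.27a,b) are simple to integrate numerically. The sketch below is an illustration under stated assumptions, not the procedure used to produce the book's figures: it uses a fixed-step fourth-order Runge-Kutta integrator, treats each segment as having unit length with a constant local vector $\vec\alpha$, and propagates $\ln T_{depol}$ rather than $T_{depol}$. The case run is two crossed 9 dB segments, for which the PDL should rise to 9 dB, fall back to zero, and leave 9 dB of insertion loss.

```python
import math

DB_PER_UNIT_ALPHA = 20.0 * math.log10(math.e)   # conversion factor of (8.1.3)


def derivs(state, alpha_vec):
    """Right-hand sides of (8.1.27a) and of d(ln T_depol)/dz from (8.1.27b)."""
    gam = state[:3]
    a_dot_g = sum(a * g for a, g in zip(alpha_vec, gam))
    a_mag = math.sqrt(sum(a * a for a in alpha_vec))
    d_gam = [a - a_dot_g * g for a, g in zip(alpha_vec, gam)]
    return d_gam + [-(a_mag - a_dot_g)]


def integrate_segment(state, alpha_vec, length=1.0, steps=400):
    """Fixed-step RK4 across one segment with constant local PDL vector."""
    h = length / steps
    for _ in range(steps):
        k1 = derivs(state, alpha_vec)
        k2 = derivs([s + 0.5 * h * k for s, k in zip(state, k1)], alpha_vec)
        k3 = derivs([s + 0.5 * h * k for s, k in zip(state, k2)], alpha_vec)
        k4 = derivs([s + h * k for s, k in zip(state, k3)], alpha_vec)
        state = [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


def pdl_db(state):
    """Cumulative PDL from |Gamma| using the form of (8.1.12)."""
    g = math.sqrt(sum(x * x for x in state[:3]))
    return 10.0 * math.log10((1.0 + g) / (1.0 - g))


def tdepol_db(state):
    return 10.0 * math.log10(math.e) * state[3]


alpha_seg = 9.0 / DB_PER_UNIT_ALPHA              # 9 dB accumulated over a unit-length segment
start = [0.0, 0.0, 0.0, 0.0]                     # Gamma = 0 and ln(T_depol) = 0 at z = 0
after_first = integrate_segment(start, (alpha_seg, 0.0, 0.0))
after_second = integrate_segment(after_first, (-alpha_seg, 0.0, 0.0))
print(f"after 1st: PDL = {pdl_db(after_first):.3f} dB, T_depol = {tdepol_db(after_first):.3f} dB")
print(f"after 2nd: PDL = {pdl_db(after_second):.3f} dB, T_depol = {tdepol_db(after_second):.3f} dB")
```

Replacing the two constant vectors with a longer list of randomly oriented ones gives a quick way to explore the random-walk behavior of the cumulative vector.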
As for discussion, note how the local PDL element $\vec\alpha$ pulls the cumulative PDL vector. If the cumulative and local vectors are initially orthogonal at the start of the $n$th segment, then $\vec\alpha\cdot\vec\Gamma = 0$. The differential equation (8.1.27a) then pulls $\vec\Gamma$ strongly in the direction of $\vec\alpha$. Moreover, since $\Gamma \le 1$ always, $\vec\Gamma$ asymptotically approaches $\hat\alpha$, but never aligns perfectly unless they are aligned initially. This said, the magnitude of $\vec\Gamma$ may increase or decrease depending on the cascade.
In contrast, $T_{depol}$ monotonically decreases in all cases. At best $T_{depol}$ is stationary. Lost power is never recovered when propagating through PDL media. Since $0 \le \Gamma \le 1$, $\alpha \ge \vec\alpha\cdot\vec\Gamma$ always.
Figure 8.6 illustrates four examples of cumulative PDL evolution. The details are given in the caption. One point to note, however, is the behavior of the insertion loss in the case of many small-valued PDL elements. Figures 8.6(c,d) show a trend of linear decrease (on a log scale) of $T_{depol}$. The origin of this behavior is as follows. The general solution to (8.1.27b) is

$$T_{depol} = \exp\left(-\int_0^z \left(\alpha(z') - \vec\alpha(z')\cdot\vec\Gamma(z')\right)dz'\right)$$

In a long chain of PDL elements, the cumulative vector eventually loses track of the local PDL element vector. That is, $\vec\alpha$ and $\vec\Gamma$ decorrelate. Once the two vectors decorrelate, one can write that the statistical average of the inner product vanishes: $\langle\vec\alpha_j\cdot\vec\Gamma_j\rangle = 0$. Beyond this point, the mean value of the insertion loss goes as

$$\langle T_{depol}\rangle \simeq \exp\left(-\int_0^z \alpha(z')\,dz'\right)$$
The insertion loss plots in Figures 8.6(c,d) illustrate this trend. Moreover, the variance of $T_{depol}$ is related to the variance $\mathrm{var}(\vec\alpha\cdot\vec\Gamma)$. The variance of the cosine function uniformly distributed in angle is 1/2. When all PDL elements have the same value, the variance of $T_{depol}$ is approximately
var (Tdepol )  exp (
(z)  z/2)
where  is the mean over the link.

Fig. 8.6. PDL-vector evolution examples. Each panel shows the cumulative PDL (dB) and $T_{depol}$ (dB) versus length (a.u.). a) Two orthogonal PDL segments $\vec\alpha_{1,2}$, 9 dB each. The PDL accumulates to a maximum of 9 dB through the first element and diminishes to zero at the termination of the second element. The insertion loss $T_{depol}$ monotonically decreases to −9 dB. b) Three PDL segments $\vec\alpha_{1,2,3}$, 9 dB each, oriented at right angles. The cumulative PDL vector $\vec\Gamma$ tries to track the element vectors with increasing disparity. c) One hundred 1 dB randomly oriented PDL segments confined to the equator. The cumulative vector $\vec\Gamma$ does a random walk. The insertion loss decreases almost linearly (on log scale). d) One hundred 1 dB PDL segments with random orientation in all directions.

8.2 Polarization-Mode Dispersion


C. D. Poole and R. E. Wagner's seminal paper, entitled "Phenomenological Approach to Polarisation Dispersion in Long Single-Mode Fibres" [52], marks the historical dividing line between the classical treatment of birefringence and the modern development of polarization-mode dispersion (PMD). In 1986, Drs. Poole and Wagner, while working at Bell Laboratories on fiber communications, created a characteristic description of the frequency-dependence of concatenations of birefringent elements, since recognized as the point of discovery of PMD. Few discoveries, however, are made in a vacuum; indeed, a scan of the technical literature shows titles including the term "Polarization-Mode Dispersion" dating back at least to 1978 [2, 56], and one is inclined to ask about the context which led to the breakthrough.
The year 1986 was before the advent of the fiber-based erbium-doped optical amplifier; the pinnacle of long-distance communications rested at the time on coherent communications. Coherent communications mixes a local oscillator (LO) with the received signal to pull the signal out of the noise. As early as the 1970s, the birefringence of single-mode fiber was recognized, and it was well understood that the output polarization state would drift with temperature or shift abruptly when the fiber was hit. Since the LO needs to be aligned to the signal polarization, the matter of polarization tracking and control was under intense investigation. Many papers can be found on these subjects throughout the 1980s. In this sense, polarization effects were treated in the time domain, for the speed and control of polarization reside there.
As head of a research department at Crawford Hill Research, Wagner tasked Poole to investigate the overall link properties and characteristics that needed to be understood in order to make coherent communications practicable. A Bell Labs colleague of Dr. Poole's at the time, N. S. Bergano, believed that the job would be completed somewhat quickly. Regarding polarization Dr. Bergano was to say, "it's x, y, and a phase; how hard can it be?"
Part of Dr. Poole's work included building a relatively fast polarimeter to measure the speed of polarization change. When a fiber was terminated by the polarimeter and a probe diode laser was turned on, Poole noticed that the output polarization went through a transient but would settle out in a few minutes. Intrigued by this effect, Poole determined that the diode laser ramping up in temperature, which in turn produced a frequency sweep, was responsible for the changing output polarization. The unavoidable conclusion was that the output polarization state was frequency dependent. Determined to describe the effect as an eigenvalue problem, Poole found that the eigenvectors of a Jones transformation matrix always change to first-order in frequency. Nothing distinctive could be derived from that behavior. Poole and Wagner then asked if there exists an output state that is stationary to first-order in frequency; consequently the two researchers discovered the principal characteristic of PMD.

The principal distinction between PMD research before and after Poole and Wagner's paper is that PMD enables a global description of the birefringence. Only local descriptions were available before 1986. Indeed, polarization optics had been studied for centuries, but always in the context of local birefringent behavior. Perhaps the closest earlier researchers got to the questions that encompass PMD are the several inventions of birefringent filters, including those of Lyot, Šolc, Jones, Pancharatnam, and Harris. Even so, no one had put a global description together before, and it is not too surprising that communications researchers were the first to make this observation.
With the advent of the optical amplifier, work on coherent communications went into decline. PMD, by contrast, has remained at the forefront of research ever since. Particularly important work includes the statistical treatment of PMD; the use of the statistics to develop link budgets for installed system designs; the measurement and mitigation of PMD in single-mode fiber; the measurement of PMD in components, installed fiber, operating links, and all manner of configurations; programmable PMD generation; and interaction of impairments such as PMD with PDL and chromatic dispersion, and PMD with nonlinear effects.
Moreover, the wealth of research developed in polarization tracking for coherent communications was redirected to active PMD compensation. The father of the optical PMD compensator (PMDC) is Fred Heismann, who through his work on lithium-niobate electro-optic polarization controllers [27, 28] demonstrated the first closed-loop PMDC [29]. Yet at the time of writing, the economics behind optical PMDCs appear unfavorable, even in light of proven high-quality, live-traffic-certified products [50]; and, simultaneously, lower-cost chip sets are being developed to make corrections electronically.
The following sections of this chapter cover the major highlights of the time, frequency, Fourier, and Stokes properties of PMD. These sections will be successful if they arm the reader with the tools necessary to read the literature and patents critically and informatively.
8.2.1 A PMD Primer
Polarization-mode dispersion is an optical effect present in concatenations of lossless birefringent elements. The following observation was first made by Poole [52]:

Observation: There always exists an orthogonal pair of polarization states output from a birefringent concatenation which are stationary to first order in frequency. These two states are called the Principal States of Polarization (PSP). A differential delay exists between signals launched along one PSP and its orthogonal complement.

Based on this observation, the PMD vector, which has a length and a pointing direction in Stokes space, is defined as follows [20]¹:

Definition Part 1: The pointing direction of the PMD vector is aligned to the slow PSP, the PSP that imparts more delay than the other.

Definition Part 2: The length of the PMD vector is the differential-group delay between slow and fast PSPs.

The definition of the PMD vector follows immediately from the Poole observation, and that observation is clearly different from the usual input-to-output eigenstate definition for a birefringent concatenation.
The usual "eigenstate" of any concatenation is that polarization state which is the same at the output as at the input. (In contrast, the polarization states associated with the input and output PSPs are generally not the same.) The failure of the eigenstate description happens because the eigenstate almost always changes to first order with change in frequency. Since group delay requires a frequency derivative, the derivative of an eigenstate and its phase results in two terms: the derivative of the phase, which is delay, and the derivative of the eigenstate, which rotates the polarization. The lack of a stationary eigen-polarization prevents the separation of group delay from its eigenstate. In the context of eigenstates, differential-group delay is not uniquely determined.
Another way to express the need for a definition of the PMD vector is to say that a parallelism should exist for infinitesimal rotations of polarization state in length and in frequency. As seen many times earlier in this text, the differential change of output polarization state due to differential change in length of the last birefringent element in a concatenation is

$$\frac{d\hat s}{dz} = \vec n \times \hat s$$

where $\vec n$ is the birefringent vector of the last element. This is the customary precession rule. What vector, then, does the output polarization precess about for differential change in frequency? That is, what is the vector $\vec V$ such that

$$\frac{d\hat s}{d\omega} = \vec V \times \hat s\ ?$$

Equivalent to a change in frequency would be a change in which all elements in the concatenation altered their length, or refractive index, or both, at the same frequency. That is, the optical phase for any section is

$$\varphi = \frac{\omega\,n\,L}{c}$$

¹ Note: This definition of PMD follows that of Gordon and not that adopted by Poole. The Gordon definition aligns the PMD vector along the slow PSP and uses a right-hand coordinate system in Stokes space. A detailed comparison between the two systems is available [21].

Proportional change of any parameter of the optical phase along an entire cascade leads to eigenstate change for first-order change in that parameter. For example, a change in temperature will change the refractive index, and the eigenstates of the cascade will change to first order with it. This generalization of PMD was recognized by Shieh and Kogelnik [58]. It is customary, nonetheless, to use frequency as the underlying parameter.
It is important to observe that so far no specific context for PMD has been given other than lossless birefringent concatenations. PMD exists in optical fiber as well as birefringent crystals; it exists for any number of homogeneous sections, whether one or thousands; it exists for any type of statistical distribution, whether Maxwellian or not, and any delay within or accumulated across multiple sections. PMD also exists in the presence of PDL, although its description is more complicated. Whether PMD is a significant effect for an optical communications link or a single component depends on these particulars, but the particulars do not alter its definition.

It is not at all obvious that for any arbitrary lossless birefringent cascade of any length and composition a pair of principal polarization states exists. A good heuristic is to construct and to compare the differential-rotation rules for length and frequency, and to draw an analogy between the principal-state system and the eigenstate system.
To begin, an eigenstate ties the output to the input: the same polarization state must exist at both ends (at any frequency). Likewise, a principal state ties the output to the input but in a different way: the input polarization state must be oriented such that the output polarization state does not change to first order in frequency. Generally, the eigenstate and principal state are not the same. Figure 8.7 illustrates the eigenstates for a single homogeneous birefringent element. When an input polarization is oriented to the "fast" eigen-axis of the element (Fig. 8.7(a)), the refractive index seen by the light is the smaller of the two. The output polarization is the same as the input polarization, and a pulse of light exits at a certain time. Next, when the input polarization is oriented to the "slow" eigen-axis of the element (Fig. 8.7(b)), the refractive index seen by the light is the larger of the two. The output polarization is also the same as the input polarization, and an output pulse is delayed with respect to the fast-axis output pulse. The relative delay time is $\tau$, and is called the differential-group delay (DGD). Note that when the input polarization is oriented either to the fast or slow axis, only one polarization state and one pulse is present at the output; this defines the eigenstate of the system.
In general an input polarization is not aligned to the birefringent axis of the element (Fig. 8.7(c)). In this case, the input state is projected onto the two orthogonal birefringent axes of the element and these projections propagate independently to the output. Accordingly, an input pulse is split into two pulses, one delayed by $\tau$ with respect to the other. The relative intensity of the two pulses is dictated by the angle between the input polarization state and the birefringent axis of the element.

Fig. 8.7. Eigenstates and effect of single homogeneous birefringent element. a) Launch along fast axis gives an output polarization state that is the same as the input state. b) Launch along slow axis also leaves the input state unaltered. There is, however, a delay with respect to the fast axis. c) Arbitrary polarization state at the input is projected onto the two eigen-axes, and these two projections propagate independently.
Next, consider two birefringent elements in cascade. The polarization evolution is best described in Stokes space. As in Fig. 8.8(a), consider one stage again. The input polarization $\hat s_{in}$ precesses about birefringent axis $\hat r_1$ to output state $\hat s_1(z_1)$, where the propagation axis is denoted as $z$. When a second stage is added (Fig. 8.8(b)), polarization state $\hat s_1(z_1)$ precesses about birefringent axis $\hat r_2$ to new output polarization $\hat s_2(z_2)$. Consider now a small increase in length $\Delta z$ of the second stage. The output polarization comes from a continuation of the precession about $\hat r_2$ up to $z_2 + \Delta z$ (Fig. 8.8(c)). Thus the differential precession rule for change of length is $d\hat s/dz = \hat r_2 \times \hat s$.
Now, instead of an increment in length, consider a decrement in frequency.² At a first frequency $\omega$, the output polarization from two stages is $\hat s_2(\omega)$ (Fig. 8.9(a)). When the frequency is decremented (Fig. 8.9(b)), the precession about $\hat r_1$ decreases, terminating the precession about $\hat r_1$ early. Accordingly, the radius of the precession circle centered on $\hat r_2$ is increased, and the precession angle about $\hat r_2$ is decreased due to the lower frequency. The output polarization is $\hat s_2(\omega - \Delta\omega)$. For this choice of input polarization state $\hat s_{in}$, the output polarization changes to first-order with frequency, Fig. 8.9(c). This input state does not yield a principal state of polarization for the system at frequency $\omega$.

² In the following the frequency is decremented. The choice of increase or decrease in frequency is of no consequence; the decrease in frequency was selected only to make the illustrations clearer.

Fig. 8.8. Polarization evolution through one and two birefringent elements. a) Input state $\hat s_{in}$ precesses about birefringent axis $\hat r_1$ as the light travels along length $z$. b) The state output from the first stage is input to the second stage and precesses about $\hat r_2$. c) A small increase in length of the second stage continues the polarization precession to state $\hat s_2(z_2 + \Delta z)$ from $\hat s_2(z_2)$ about $\hat r_2$.

Fig. 8.9. Polarization evolution through two birefringent elements at two different frequencies. a) Evolution through two stages as in Fig. 8.8(b). b) A decrease in frequency reduces the precession through stages one and two. Reduced precession about $\hat r_1$ increases the radius of the precession circle centered on $\hat r_2$. c) For the same $\hat s_{in}$, the output polarization changes to first-order with frequency: $\left(\hat s_2(\omega) - \hat s_2(\omega - \Delta\omega)\right)/\Delta\omega \neq 0$.
A principal state of polarization for this system can be nonetheless found
for a dierent input state.
Figure 8.10 shows the construction necessary for two equal-length stages. At frequency $\omega$, input state $\hat{p}_{in}$ precesses about $\hat{r}_1$ until it reaches the equator. The polarization state then precesses about $\hat{r}_2$ to output state $\hat{p}_{out}$. Now, a decrement in frequency moves the intermediate polarization state $\hat{s}_1$ away from the equator. However, as shown in the inset, the left-right motion of the polarization state as it precesses about $\hat{r}_1$ changes only to second order; the same holds for precession about $\hat{r}_2$. This is because the two precession circles share the same tangent line at their intersection. (That the two circles are tangent is the key point; the point of tangency happens to be at the equator in this example.) Accordingly, first, the precession circle centered on $\hat{r}_2$ does not change radius to first order. Second, the decrease in precession about $\hat{r}_1$ is compensated by a decrease in precession about $\hat{r}_2$ necessary to reach $\hat{p}_{out}$ at the output, to first order. Neither precession radii nor arc lengths change to first order with change in frequency. Therefore $\hat{p}_{out}$ is a principal state of the system. The stationary property is expressed as

$$\lim_{\Delta\omega \to 0} \frac{\hat{p}_{out}(\omega - \Delta\omega) - \hat{p}_{out}(\omega)}{\Delta\omega} = 0 \tag{8.2.1}$$

Fig. 8.10. Principal input and output states at $\omega$ for two equal-length birefringent stages. The insets show that a decrement in frequency does not change the radius of the precession circle about $\hat{r}_2$ to first order; and that the decrease in precession about $\hat{r}_1$ is compensated by the requisite decrease in precession about $\hat{r}_2$ necessary to reach $\hat{p}_{out}$ to first order.

The output polarization state $\hat{p}_{out}$ is an output PSP. Its orthogonal state is also a PSP. The output principal states have corresponding input principal states, one of which is $\hat{p}_{in}$. The input and output principal states are related by the transformation matrix of the system:

$$|p_{out}\rangle = T\,|p_{in}\rangle$$

Unlike an eigenstate of $T$, the output PSP is generally not the same as the input PSP. However, the two output PSPs are orthogonal to one another, as are the two input PSPs.
Now, since the output PSP is stationary with frequency to first order, a group delay can be defined and is separable from the polarization state on which it travels. The existence of this PSP-dependent group delay was first experimentally demonstrated in the time domain by Poole and Giles [51]. The above construction has not determined which of the two output PSPs is the slow axis and which is the fast axis, but nevertheless the frequency derivative of the phase of $\hat{p}_{out}$ imparted by $T$ yields the differential-group delay denoted by $\tau$. The DGD is a length and therefore always a positive quantity. By construction, the PMD vector points along the direction of the slow PSP and has length $\tau$; that is,

$$\vec{\tau} \equiv \tau\,\hat{p} \tag{8.2.2}$$

where $\hat{p}$ is the slow output PSP. The PMD vector defines the infinitesimal precession rule about which any output polarization state travels for small changes in frequency. That is,

$$\frac{d\hat{s}}{d\omega} = \vec{\tau} \times \hat{s} \tag{8.2.3}$$

where $\hat{s}$ is any polarization state at the output. This frequency-dependent precession rule was used by Curti et al. [1] as early as 1987 to measure the differential-group delay of a single-mode fiber.

Fig. 8.11. The PMD vector defines the precession rule in frequency for output polarizations. a) Evolution of polarization state through two stages from fixed $\hat{s}_{in}$ for a range of frequencies. Precession circle $\vec{\tau} \times \hat{s}_{out}(\omega)$ is shown for comparison. b) Magnified view of output polarization state as a function of frequency with precession circle. The two deviate only to second order.
Figure 8.11 illustrates a calculation of the output polarization states for nine different frequencies for an arbitrary but fixed input polarization state $\hat{s}_{in}$. Overlaid on the figure is the precession circle $\vec{\tau} \times \hat{s}$, where the output PSP is defined by $\hat{p}_{out}$ in Fig. 8.10. It is clear that the output states circle about $\vec{\tau}$ and deviate from the precession circle only to second order.
That there is deviation at all between the output polarization states and the precession circle underscores a fundamental aspect of PMD. The PMD vector is defined as a vector that is stationary to first order. However, the PMD vector is itself frequency dependent. The vector $\vec{\tau}$ in the infinitesimal precession rule (8.2.3) must be updated for each new frequency. One generally denotes the frequency dependence of the PMD vector as $\vec{\tau}(\omega)$.

Fig. 8.12. The principal state of polarization changes with frequency. a) At $\omega_1$ input and output PSPs are shown for a two-stage concatenation. b) At a different frequency $\omega_2$ new input and output PSPs are required to achieve a stationary output. c) The output PSP spectrum for all frequencies. For two stages the spectrum is a circle. This spectrum is periodic.
Figure 8.12 illustrates the fundamental nature of PMD vector change. The PMD vector construction requires an output polarization that is stationary to first order with frequency for every frequency. Figure 8.12(a) shows the input and output PSPs at $\omega_1$, the same frequency as in Fig. 8.10. However, at a higher frequency $\omega_2$, more precession occurs about the birefringent axes. A new stationary state must account for this increase; this is shown in Figure 8.12(b). A locus of output PSPs is determined by a full frequency sweep. The locus is called the principal-state spectrum. The output PSP spectrum for two equal-length stages is shown in Figure 8.12(c). For two stages the PSP spectrum is always a circle in Stokes space. For more stages the spectrum shows more complicated motion.
That the PMD vector changes pointing direction with frequency is of great
practical importance in optical communication systems. Generally there is also
an associated change in the DGD with frequency. One should recognize that
a principal state of polarization is as delicate and ephemeral as any state of
polarization, and will change with any perturbation on the system.
The preceding work is now generalized to show the meaning and behavior of a PMD spectrum associated with any lossless system. Figure 8.13(a) illustrates the PMD vector in Stokes space for a particular frequency, $\vec{\tau}(\omega)$. The vector has a length $\tau(\omega)$ and a pointing direction $\hat{p}(\omega)$. The spectrum of this vector is composed of two components: the PSP spectrum (Fig. 8.13(b)), and the DGD spectrum (Fig. 8.13(c)). The PSP spectrum is a vector spectrum which indicates the principal state as a function of frequency. The DGD spectrum is a positive scalar spectrum which indicates the differential-group delay as a function of frequency. The PMD of any lossless system is entirely determined by these two spectra.

Fig. 8.13. The two components of a PMD spectrum. a) The PMD vector illustrated in Stokes space. The vector has a length $\tau$ and a pointing direction $\hat{p}$. These are functions of frequency. b) A PSP spectrum: $\hat{p}(\omega)$. The pointing direction changes with frequency on the unit sphere. c) A DGD spectrum: $\tau(\omega)$. The vector length, which is a positive scalar value, changes with frequency.
The connection between the two defining PMD spectra and the time domain is simple for narrow-band signals. Four narrow-band signals are illustrated in Fig. 8.14(a), having frequency centers at $\omega_1, \ldots, \omega_4$. The overlaid DGD spectrum determines the delay between the two orthogonal polarization components of each signal. For instance, at $\omega_1$, the DGD spectrum has a high value which in turn imparts a large delay between the two components of the associated signal. At $\omega_4$, however, the DGD spectrum has a small value, so the temporal delay for orthogonal components of the associated narrow-band signal is small. Note, however, that the DGD spectrum provides no information as to the relative weight between split components of the signals. That is left to the PSP spectrum.
The effect of the PSP spectrum is illustrated in Fig. 8.14(b). At a particular frequency, the relative weight between orthogonal polarization components is determined by the projection of the input state onto the PSP for that frequency. While the illustration is not rigorous, the projection for the signal centered at $\omega_1$, which has a large differential delay, is about equal. In contrast, the projection for the signal centered at $\omega_4$, which has a small differential delay, is lopsided. The change of relative weights between these two signals is due to the change of the PSP pointing direction with frequency. Note, however, that the PSP spectrum provides no information as to the differential delay between split components of the signals. The DGD and PSP spectra are complementary and the two must be considered together.

Fig. 8.14. Relation between frequency and time domains for various narrow-band signals affected by PMD in terms of the scalar and vector PMD spectra. a) The DGD spectrum determines the time delay between orthogonal polarization components of a narrow-band signal. For the signal centered at $\omega_1$, the time delay is $\tau(\omega_1)$; the figure illustrates a large differential delay. For the signal centered at $\omega_4$, the time delay is $\tau(\omega_4)$; the figure illustrates a small differential delay. b) The PSP spectrum determines the relative weight between orthogonal signal components for each frequency. The change in projection between the fixed input polarization and the PSP vector for different frequencies changes the relative weight between orthogonal signal components on each narrow-band signal.
More complicated pulse deformations occur when the PMD spectrum varies over the bandwidth of the signal. For the same PMD, the higher the data rate the broader the signal spectrum. In this case each spectral component can be analyzed as in Fig. 8.14, but then the interference between these components must be accounted for. A measure of pulse-deformation complexity is found by the degree of PMD change over a signal bandwidth. If the PMD changes little, then only first-order PMD affects the pulse. If the PMD changes a lot, then higher-order PMD effects become pronounced.

Fig. 8.15. Definition of the second-order PMD (SOPMD) vector $\vec{\tau}_\omega$. a) PMD vectors $\vec{\tau}(\omega_1)$ and $\vec{\tau}(\omega_2)$. The vectors have differential lengths and pointing directions. b) Second-order PMD is the vector difference $\vec{\tau}_\omega = (\vec{\tau}(\omega_2) - \vec{\tau}(\omega_1))/\Delta\omega$ as $\omega_2 - \omega_1 \to 0$. The SOPMD vector can be resolved on the $\vec{\tau}(\omega_1)$ axis into perpendicular and parallel components. The perpendicular component $\vec{\tau}_{\omega\perp}$ is called depolarization and the parallel component $\vec{\tau}_{\omega\parallel}$ is called polarization-dependent chromatic dispersion (PDCD).
The higher-order PMD effects generate complicated pulse deformations. One higher-order effect that draws much attention is second-order PMD. Second-order PMD (SOPMD) is a vector generated by a first-order change of the PMD vector with frequency. The SOPMD vector is illustrated in Fig. 8.15. In general, the PMD vector $\vec{\tau}$ is different for different frequencies. Consider two vectors in particular: $\vec{\tau}(\omega_1)$ and $\vec{\tau}(\omega_2)$, where $\omega_2 - \omega_1$ is a small change. The vector difference, divided by the frequency change, is the SOPMD vector, denoted by $\vec{\tau}_\omega$. Like any vector, $\vec{\tau}_\omega$ has a length and a pointing direction. The pointing direction is generally neither parallel nor perpendicular to either first-order PMD vector, and the length is zero only when $\vec{\tau}$ pirouettes about itself or when there is only one birefringent element. The length of the SOPMD vector in relation to the DGD determines the significance of the second-order vector.

The SOPMD vector is resolved onto two components called the depolarization component and the polarization-dependent chromatic dispersion (PDCD) component (Fig. 8.15(b)). The depolarization component $\vec{\tau}_{\omega\perp}$ runs perpendicular to $\vec{\tau}(\omega_1)$ and indicates the rate at which the pointing direction of the PMD vector changes. The PDCD component $\vec{\tau}_{\omega\parallel}$ runs parallel to $\vec{\tau}(\omega_1)$ and indicates the change of DGD with frequency. The length of the PDCD component is in fact the frequency derivative of the DGD spectrum at $\omega_1$.

Fig. 8.16. Measurement of PSP, DGD, and magnitude-SOPMD spectra from a line of single-mode fiber. a) PSP spectrum $\hat{p} = \vec{\tau}/|\vec{\tau}|$ in Stokes space. b) DGD $|\vec{\tau}|$ (ps) and SOPMD $|\vec{\tau}_\omega|$ (ps$^2$) versus wavelength (nm). Courtesy D. Peterson, MCI [50].

The term depolarization has a connotation in the time domain, so why does SOPMD, best described in the frequency domain, have a depolarization component? The change in pointing direction of the PMD vector with frequency disperses a fixed input polarization into many states at the output. For each frequency there is one output state. When the signal is inverse Fourier transformed into the time domain, the dispersed output polarizations are folded into the time-domain signal so that each time interval contains many polarizations. A time average over these states reduces the degree of polarization as compared to the input; that is, the input state is depolarized by the PMD.
Second-order PMD is the frequency derivative of the PMD vector (8.2.2):

$$\vec{\tau}_\omega = \underbrace{\tau_\omega\,\hat{p}}_{\text{PDCD}} + \underbrace{\tau\,\hat{p}_\omega}_{\text{depolarization}} \tag{8.2.4}$$

where the two orthogonal vector components are identified in the expression. Statistically the depolarization component dominates the SOPMD vector.

Like first-order PMD, second-order PMD is itself dependent on frequency: $\vec{\tau}_\omega = \vec{\tau}_\omega(\omega)$. SOPMD therefore has a scalar and a vector spectrum. Often the magnitudes of first- and second-order PMD are plotted when comparing the two orders. Figure 8.16 shows measurements of the PSP, DGD, and magnitude SOPMD $|\vec{\tau}_\omega|$ as a function of frequency made on a line of single-mode fiber.
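
The decomposition in (8.2.4) can be sketched numerically. The snippet below is only an illustration under stated assumptions: `example_tau` is a made-up PMD spectrum (DGD growing linearly, PSP rotating in the $S_1$-$S_2$ plane), `sopmd_components` is a hypothetical helper name, and the frequency derivative is approximated by a central difference rather than computed analytically.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def sopmd_components(tau_vec, omega, h=1e-4):
    """Split d(tau_vec)/d(omega) into its PDCD (parallel) and
    depolarization (perpendicular) parts, as in (8.2.4).

    tau_vec is a hypothetical callable returning the PMD vector at a frequency.
    """
    t0 = tau_vec(omega)
    tp, tm = tau_vec(omega + h), tau_vec(omega - h)
    tau_w = [(p - m) / (2 * h) for p, m in zip(tp, tm)]  # central difference
    dgd = math.sqrt(dot(t0, t0))
    p_hat = [x / dgd for x in t0]                        # PSP direction
    dgd_w = dot(tau_w, p_hat)                            # d(DGD)/d(omega)
    pdcd = [dgd_w * x for x in p_hat]                    # parallel component
    depol = [w - q for w, q in zip(tau_w, pdcd)]         # perpendicular component
    return pdcd, depol

def example_tau(omega):
    """Assumed toy spectrum: DGD = 10 + 2*omega, PSP rotating at 0.3 rad per unit omega."""
    dgd = 10.0 + 2.0 * omega
    return (dgd * math.cos(0.3 * omega), dgd * math.sin(0.3 * omega), 0.0)

pdcd, depol = sopmd_components(example_tau, 0.0)
print(math.sqrt(dot(pdcd, pdcd)), math.sqrt(dot(depol, depol)))
```

For this toy spectrum the PDCD magnitude approximates the DGD slope and the depolarization magnitude approximates the DGD times the PSP rotation rate, which is the content of the two terms in (8.2.4).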

An alternative but equivalent view of PMD is to look at its response directly in the time domain. To do so, one looks at the impulse response of a cascade of birefringent sections. An impulse in time is a delta function whose spectrum is uniform over all frequency. Since no two impulses can overlap unless they are precisely at the same time instant, no accounting for interference is required to determine the output response.

Fig. 8.17. Impulse response from one or more birefringent elements. The polarization of the impulses has been abstracted. a) Impulse response from one stage alone. An input impulse is split along the two birefringent axes and one pulse is delayed with respect to the other by $\tau_1$. b) Impulse response from two stages. The two impulses from the first stage are each divided into two parts; the slow components are then delayed by $\tau_2$. c) Impulse response from five stages generates $2^5$ or 32 impulses. There is a first and last pulse, and the time response is FIR.
Figure 8.17 illustrates the impulse response from one, two, and five different-length birefringent elements. For one stage, Fig. 8.17(a), an impulse is split along the two birefringent axes and one impulse is delayed with respect to the other by $\tau_1$, the DGD of the element. When a second stage is added, Fig. 8.17(b), each impulse from the first stage is projected onto the birefringent axes of the second. The projection onto the second stage is independent of the projection onto the first stage, so the relative impulse heights differ. The two components aligned to the slow axis of the second stage are delayed by $\tau_2$, yielding in general four impulses. Note that in the figure the state of polarization for each impulse is not indicated but only its relative weight and time position are shown. Through a cascade of five stages, Fig. 8.17(c), there are five projections and five delays, resulting in $2^5$ or 32 distinct impulses. The impulse response can be complicated, but a characteristic of the response is that it is finite in duration. In general, PMD is a linear effect that acts as a finite-impulse response (FIR) filter. The temporal extent of the impulse response is $\sum_k \tau_k$.
The temporal extent can be compared with a pulse duration of a signal to determine the impairment of PMD. When a signal pulse, such as a non-return-to-zero ONE, is much longer in time than the FIR response of the birefringent cascade, there is little effect of PMD on the pulse. However, when the time extent of the signal pulse and the birefringent cascade are comparable, the signal pulse can be significantly distorted. Strictly, the polarization-dependent convolution of the signal pulse with the FIR response of the cascade determines the shape of the output signal. Convolution in time is multiplication in frequency, and the frequency response of PMD has already been heuristically constructed.

Fig. 8.18. Comparison between eigenstate and principal-state systems. Left: Eigen-system for a single birefringent stage. Input polarizations aligned to the two birefringent axes are output with no change in polarization. Temporally, one axis has a lower group index than the other, so the two eigenstates have a delay $\tau$ between them. Right: Principal-state system for an arbitrary birefringent cascade. For any frequency, there exists an input polarization such that the output polarization does not change to first order in frequency. Along the pair of principal axes, there is a differential-group delay $\tau(\omega)$ between signals launched on orthogonal principal axes. Generally, the PSP and DGD change with frequency.

Fig. 8.19. Comparison of infinitesimal rotation for changes in length and frequency. a) Increment in length of the last element, $d\hat{s}/dz = \hat{r}_n \times \hat{s}$. The output polarization precesses about the birefringent axis of the last element. b) Increment in frequency, $d\hat{s}/d\omega = \vec{\tau} \times \hat{s}$. The output polarization precesses about the principal-state axis of the system.
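
The impulse-response picture described above can be sketched in a few lines. This is a minimal illustration, not the author's construction: it assumes linear birefringence with real slow-axis angles, ignores the common delay and the carrier phase accumulated in each section, and uses hypothetical names (`impulse_response`, `stages`). Each stage projects every incoming impulse onto its fast and slow axes and delays the slow part by that stage's DGD.

```python
import math

def impulse_response(stages, jones_in=(1.0, 0.0)):
    """Return a list of (delay, (ex, ey)) impulses after a birefringent cascade.

    stages: list of (dgd, slow_axis_angle_in_radians), an assumed parameterization.
    """
    impulses = [(0.0, jones_in)]
    for dgd, angle in stages:
        c, s = math.cos(angle), math.sin(angle)
        split = []
        for delay, (ex, ey) in impulses:
            slow = ex * c + ey * s        # projection onto the slow axis
            fast = -ex * s + ey * c       # projection onto the fast axis
            split.append((delay, (-fast * s, fast * c)))        # fast part, not delayed
            split.append((delay + dgd, (slow * c, slow * s)))   # slow part, delayed by dgd
        impulses = split
    return impulses

# Five different-length stages with arbitrary (assumed) axis angles.
stages = [(1.0, 0.3), (0.5, 1.1), (2.0, -0.4), (0.7, 0.9), (1.3, 0.2)]
out = impulse_response(stages)
delays = [d for d, _ in out]
power = sum(abs(ex) ** 2 + abs(ey) ** 2 for _, (ex, ey) in out)
print(len(out), max(delays) - min(delays), power)
```

The number of impulses doubles at every stage, the spread between first and last impulse equals the sum of the stage DGDs, and the total power is conserved because each split is an orthogonal projection.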

To conclude this primer, Fig. 8.18 shows the analogy between the eigenstate system for a single birefringent element and the principal-state system for a birefringent cascade. Figure 8.19 illustrates the two infinitesimal rotation laws that govern PMD, one for change in length and the other for change in frequency. These rotation laws are derived and applied in the next sections in a more rigorous manner.
8.2.2 Fundamental Derivations
The output principal state vector is that vector which is stationary to first order in frequency. The principal state vector is well defined for a lossless birefringent concatenation of any length and composition. The issue at hand is to derive an equation whose eigenvectors are the principal states of the system and whose eigenvalues are the differential group delays along the PSP axes. Together the eigenvectors and eigenvalues are used to define the PMD vector. This and the following section follow the spin-vector-based derivations set forth by Frigo [16], Gisin [19], and so well elucidated by Gordon and Kogelnik [20].
First, a remark about the analytic tools available for these PMD studies. The introduction of this chapter showed that while a Hermitian matrix fills all 16 entries of the corresponding Mueller matrix, a traceless Hermitian matrix fills only the lower right-hand $3 \times 3$ sub-matrix of the Mueller matrix. It was also shown in §2.1 that a unitary matrix also fills only the lower right-hand $3 \times 3$ sub-matrix of the corresponding Mueller matrix. Comparison of these matrices shows

$$H \rightarrow M_H = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \end{pmatrix}, \quad \text{and} \quad U \rightarrow M_U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \end{pmatrix} \tag{8.2.5}$$

It will be shown that the PMD operator is an identically traceless Hermitian operator. In Stokes space, PMD represents a vector having a length and pointing direction. Unitary matrices act on PMD matrices without scattering Mueller entries outside of the lower $3 \times 3$ sub-matrix. The action of a unitary matrix is to rotate the PMD matrix in Stokes space through similarity transforms: $H_T = U H U^\dagger$. A complete representation of vectors and rotational transformations can therefore be described solely within the O(3) group.
The Hermitian PMD operator and its traceless property are now derived. Table 8.2 lists the symbols used to describe PMD. An output polarization state is related to the input polarization state via the system's transformation matrix $T$. Recall that for a lossless system, $T = \exp(-j\phi_o)\,U$, where $U$ is a unitary matrix. An arbitrary input is transformed at the output as

$$|t\rangle = e^{-j\phi_o}\,U\,|s\rangle \tag{8.2.6}$$

The frequency derivative (denoted by a subscript $\omega$) is

$$|t\rangle_\omega = \left(-j\phi_{o\omega}\,U + U_\omega\right) e^{-j\phi_o}\,|s\rangle$$

Substitution of (8.2.6) into the frequency derivative yields an expression that relates the output polarization change in frequency to the output polarization itself:

$$|t\rangle_\omega = -j\left(\phi_{o\omega} + jU_\omega U^\dagger\right)|t\rangle \tag{8.2.7}$$

The derivative of the common phase is a delay, that is, $\phi_{o\omega} = \tau_o$. Physically this is the common delay experienced by both polarization states. By analogy one expects $jU_\omega U^\dagger$ to have units of delay as well.
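To proceed further, the properties of $jU_\omega U^\dagger$ must be determined. Recall that a unitary matrix is defined as $U^\dagger U = U U^\dagger = I$. Taking the frequency derivative and multiplying both sides by $j$ yields

$$jU_\omega U^\dagger = -jU U_\omega^\dagger$$

Notice that

$$\left(jU_\omega U^\dagger\right)^\dagger = jU_\omega U^\dagger$$

Therefore $jU_\omega U^\dagger$ is Hermitian: its eigenvalues are real. The question is whether this Hermitian operator has trace or not. The answer is that $jU_\omega U^\dagger$ is identically traceless. One way to show this is by Taylor expansion. Given that the determinant of the unitary operator $U$ is +1 for all frequencies (the common phase having been factored out into $\phi_o$), one has

$$\det\left(U(\omega)\right) = \det\left(U(\omega + \Delta\omega)\right) = +1$$

The determinant of the Taylor expansion of $U(\omega + \Delta\omega)$ must be unity to first order in frequency. For $U(\omega + \Delta\omega) \simeq U(\omega) + U_\omega \Delta\omega$,

$$\det\left(U(\omega + \Delta\omega)\right) = \det\left(I + U_\omega U^\dagger \Delta\omega\right)\det\left(U\right)$$

In general, the determinant of a $2 \times 2$ matrix $A$ plus the identity matrix yields

$$\det(I + A) = 1 + \mathrm{Tr}(A) + \det(A)$$

For the present case,

$$\det\left(I + U_\omega U^\dagger \Delta\omega\right) = 1 + \mathrm{Tr}(U_\omega U^\dagger)\,\Delta\omega + \det(U_\omega U^\dagger)\,(\Delta\omega)^2$$

The coefficient of the determinant is second order in frequency while that of the trace is first order in frequency. In order for the determinant of $U(\omega + \Delta\omega)$ to be unity to first order, the trace must vanish. Therefore

$$\mathrm{Tr}\left(jU_\omega U^\dagger\right) = 0 \tag{8.2.8}$$

Table 8.2. Symbol Definitions for PMD

Symbol : Expression : Definition
$\vec{\tau}$ : $= \tau\hat{p}$ : PMD vector in Stokes space. Length $\tau$, pointing direction $\hat{p}$
$\hat{p}$ : : Pointing direction of PMD vector, called principal state of polarization (PSP). PSP $\hat{p}$ is defined at output and aligned to slow PSP axis
$\tau$ : : Differential-group delay (DGD), $\tau \ge 0$
$\varphi$ : : Birefringent phase, expressed as function of DGD or as function of local birefringence
$\vec{\tau}_n$ : : Local PMD vector of $n$th element
$\vec{\tau}(n)$ : : Cumulative PMD vector through $n$th element
$\langle\tau\rangle$ : $\langle|\vec{\tau}(\omega)|\rangle$ : Mean PMD. Frequency average of DGD
$\langle\tau^2\rangle$ : $\langle\vec{\tau}(\omega)\cdot\vec{\tau}(\omega)\rangle$ : Mean-square PMD. Frequency average of DGD$^2$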

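A Hermitian matrix with zero trace has important properties. First, its eigenvalues are equal in magnitude and opposite in sign. Second, the matrix in SU(2) Jones space is equivalent to a vector in O(3) Stokes space. That vector is determined by the eigenvalue equation for $jU_\omega U^\dagger$:

$$jU_\omega U^\dagger\,|p_\pm\rangle = \lambda_\pm\,|p_\pm\rangle \tag{8.2.9}$$

where $|p_\pm\rangle$ are the eigenvectors of $jU_\omega U^\dagger$. The eigenvalues are defined as $\pm\tau/2$, and thus

$$jU_\omega U^\dagger\,|p_\pm\rangle = \pm\frac{\tau}{2}\,|p_\pm\rangle \tag{8.2.10}$$

Since the determinant is the product of the eigenvalues, $\det\left(jU_\omega U^\dagger\right) = -\tau^2/4$. Moreover, the determinant of a product of matrices is the product of the determinants, and $\det U^\dagger = 1$, so the eigenvalues are determined by

$$\tau = 2\sqrt{\det U_\omega} \tag{8.2.11}$$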
Combining the eigenvalue equation (8.2.9) with (8.2.7), and choosing an output polarization along an eigenvector of $jU_\omega U^\dagger$, one finds

$$|p_\pm\rangle_\omega = -j\left(\tau_o + jU_\omega U^\dagger\right)|p_\pm\rangle = -j\left(\tau_o \pm \tau/2\right)|p_\pm\rangle$$

The total group delay of the signals along the principal state axes is

$$\tau_g = \tau_o \pm \tau/2 \tag{8.2.12}$$

where $\tau_o$ is the common delay and $\pm\tau/2$ is the differential delay. The slow principal state is $|p_+\rangle$, with corresponding delay $\tau_o + \tau/2$, while the fast principal state is $|p_-\rangle$, with corresponding delay $\tau_o - \tau/2$. In the following, $|p_+\rangle$ is denoted simply by $|p\rangle$.
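
The properties derived above (Hermitian, traceless, eigenvalues $\pm\tau/2$, and $\tau = 2\sqrt{\det U_\omega}$) can be checked numerically. The sketch below is an illustration only: it assumes each section has the form $U = I\cos(\varphi/2) - j(\hat{r}\cdot\vec{\sigma})\sin(\varphi/2)$ with $\varphi = \omega\tau$, uses an arbitrary two-section cascade (DGDs 1.0 and 0.7, axes along $S_1$ and $S_2$), and approximates $U_\omega$ by a central difference. All function names are hypothetical.

```python
import math

def section(omega, tau, r):
    """Jones matrix of one homogeneous section (assumed form, phi = omega * tau)."""
    c, s = math.cos(omega * tau / 2), math.sin(omega * tau / 2)
    r1, r2, r3 = r
    return [[c - 1j * s * r1, -1j * s * (r2 - 1j * r3)],
            [-1j * s * (r2 + 1j * r3), c + 1j * s * r1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def cascade(omega):
    """Two-section example: first section along S1, second along S2."""
    u1 = section(omega, 1.0, (1.0, 0.0, 0.0))
    u2 = section(omega, 0.7, (0.0, 1.0, 0.0))
    return matmul(u2, u1)

def pmd_operator(omega, h=1e-5):
    """Return (j * U_w * U^dagger, U_w) with U_w from a central difference."""
    up, um = cascade(omega + h), cascade(omega - h)
    u_w = [[(up[i][j] - um[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    g = matmul([[1j * x for x in row] for row in u_w], dagger(cascade(omega)))
    return g, u_w

g, u_w = pmd_operator(2.0)
trace = g[0][0] + g[1][1]
hermitian_error = abs(g[0][1] - g[1][0].conjugate())
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
det_uw = u_w[0][0] * u_w[1][1] - u_w[0][1] * u_w[1][0]
dgd_from_eigenvalues = 2 * math.sqrt(-det_g.real)   # det = -(tau/2)^2
dgd_from_det = 2 * math.sqrt(det_uw.real)           # (8.2.11)
print(abs(trace), hermitian_error, dgd_from_eigenvalues, dgd_from_det)
```

Up to finite-difference error, the trace and the Hermitian mismatch should print as numerically zero and the two DGD estimates should agree.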
To verify that the output polarization $|p\rangle$ is stationary to first order for either principal state, the Jones vector is converted to a Stokes vector as

$$\hat{p}_\omega = \left(\langle p|\vec{\sigma}|p\rangle\right)_\omega = j\tau_g\,\langle p|\vec{\sigma}|p\rangle - j\tau_g\,\langle p|\vec{\sigma}|p\rangle$$

which is evidently zero.

In summary, the eigenvectors of $jU_\omega U^\dagger$ are the principal states of polarization of the system, and the eigenvalues are the differential group delays along the two axes. The PMD vector is defined to point along the slow output principal state and have a length of the total differential-group delay:

$$\vec{\tau} \equiv \tau\,\hat{p}, \qquad \hat{p} \text{ is slow output PSP} \tag{8.2.13}$$

The Stokes vector interpretation of (8.2.13) is illustrated in Fig. 8.13(a).


8.2.3 Connection Between Jones and Stokes Space
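That $jU_\omega U^\dagger$ is Hermitian and traceless implies a connection between the SU(2) Jones space and the O(3) Stokes space (cf. §2.6). In particular, observe that for a Stokes vector $\vec{\tau}$, the following two eigenvalue equations are equal to within a factor of two:

$$jU_\omega U^\dagger\,|p\rangle = (\tau/2)\,|p\rangle$$
$$(\vec{\tau}\cdot\vec{\sigma})\,|p\rangle = \tau\,|p\rangle$$

The spin-vector operator and $jU_\omega U^\dagger$ are clearly related:

$$jU_\omega U^\dagger = \frac{1}{2}\left(\vec{\tau}\cdot\vec{\sigma}\right) \tag{8.2.14}$$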

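The spin-vector can be used to determine how an arbitrary output polarization state changes with frequency. In Stokes space, the frequency change is

$$\hat{t}_\omega = \left(\langle t|_\omega\right)\vec{\sigma}\,|t\rangle + \langle t|\,\vec{\sigma}\left(|t\rangle_\omega\right) \tag{8.2.15}$$

Employing the spin-vector form (8.2.14) and the equation of motion (8.2.7), (8.2.15) generates

$$\hat{t}_\omega = j\,\langle t|\left(\tau_o + \tfrac{1}{2}(\vec{\tau}\cdot\vec{\sigma})\right)\vec{\sigma}\,|t\rangle - j\,\langle t|\,\vec{\sigma}\left(\tau_o + \tfrac{1}{2}(\vec{\tau}\cdot\vec{\sigma})\right)|t\rangle$$
$$= \frac{j}{2}\,\langle t|\,(\vec{\tau}\cdot\vec{\sigma})\,\vec{\sigma} - \vec{\sigma}\,(\vec{\tau}\cdot\vec{\sigma})\,|t\rangle$$
$$= \langle t|\,\vec{\tau}\times\vec{\sigma}\,|t\rangle$$

and thus

$$\frac{d\hat{t}}{d\omega} = \vec{\tau}\times\hat{t} \tag{8.2.16}$$

This is the infinitesimal frequency-change rule for an arbitrary output polarization state. The output state precesses about the PMD vector $\vec{\tau}$. Only if the output state is aligned or anti-aligned along $\vec{\tau}$ will its state not change with frequency. Illustrations of the precession rule in frequency are given by Figs. 8.11 and 8.19(b).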
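
The precession rule (8.2.16) can be checked numerically for a small cascade. The sketch below rests on several assumptions: the Pauli ordering implied by (8.2.23), sections of the form $U = I\cos(\varphi/2) - j(\hat{r}\cdot\vec{\sigma})\sin(\varphi/2)$ with $\varphi = \omega\tau$, an arbitrary two-section example, and central differences in place of analytic derivatives. Function names are hypothetical.

```python
import math

def section(omega, tau, r):
    """Jones matrix of one homogeneous section (assumed form, phi = omega * tau)."""
    c, s = math.cos(omega * tau / 2), math.sin(omega * tau / 2)
    r1, r2, r3 = r
    return [[c - 1j * s * r1, -1j * s * (r2 - 1j * r3)],
            [-1j * s * (r2 + 1j * r3), c + 1j * s * r1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cascade(omega):
    """Two-section example: first section along S1, second along S2."""
    return matmul(section(omega, 0.7, (0.0, 1.0, 0.0)),
                  section(omega, 1.0, (1.0, 0.0, 0.0)))

def output_stokes(omega, jones_in):
    """Stokes vector <t|sigma|t> of the output state |t> = U|s>."""
    u = cascade(omega)
    x = u[0][0] * jones_in[0] + u[0][1] * jones_in[1]
    y = u[1][0] * jones_in[0] + u[1][1] * jones_in[1]
    xy = x.conjugate() * y
    return [abs(x) ** 2 - abs(y) ** 2, 2 * xy.real, 2 * xy.imag]

def pmd_vector(omega, h=1e-5):
    """Read tau from j*U_w*U^dagger = (1/2) tau.sigma, see (8.2.14) and (8.2.23)."""
    up, um, u = cascade(omega + h), cascade(omega - h), cascade(omega)
    u_w = [[(up[i][j] - um[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    u_dag = [[u[j][i].conjugate() for j in range(2)] for i in range(2)]
    g = matmul([[1j * v for v in row] for row in u_w], u_dag)
    return [2 * g[0][0].real, 2 * g[1][0].real, 2 * g[1][0].imag]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

s_in = [complex(0.6, 0.0), complex(0.0, 0.8)]   # fixed, unit-norm input state
omega, h = 2.0, 1e-5
lhs = [(p - m) / (2 * h) for p, m in
       zip(output_stokes(omega + h, s_in), output_stokes(omega - h, s_in))]
rhs = cross(pmd_vector(omega), output_stokes(omega, s_in))
print(lhs, rhs)
```

The two printed vectors, the finite-difference derivative of the output Stokes vector and $\vec{\tau}\times\hat{t}$, should agree to within the finite-difference error.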
The PMD vector $\vec{\tau}$ is defined at the output of the system. There is a corresponding PMD vector at the input of the system. Denote $\vec{\tau}_t$ and $\vec{\tau}_s$ as the output and input PMD vectors, respectively. Since in general the output density matrix $D_t$ of a unitary transformation is related to the input density $D_s$ via $D_t = U D_s U^\dagger$, and the density operator is related to the spin-vector through (2.5.29) on page 56, the relation between input and output spin vectors is

$$(\vec{\tau}_t\cdot\vec{\sigma}) = U\,(\vec{\tau}_s\cdot\vec{\sigma})\,U^\dagger$$

Isolating $(\vec{\tau}_s\cdot\vec{\sigma})$ and substitution of (8.2.14) yields

$$\frac{1}{2}\,(\vec{\tau}_s\cdot\vec{\sigma}) = jU^\dagger U_\omega \tag{8.2.17}$$

It makes sense that the input PMD vector is determined by writing the unitary matrices $U_\omega$ and $U^\dagger$ in reverse order to that for the output PMD vector. Moreover, as the operators are unitary, the DGD for both the input and output PMD vectors is identical.
For the unitary transformation in Jones space $|t\rangle = T\,|s\rangle$, the equivalent transformation in Stokes space is

$$\hat{t} = R\,\hat{s} \tag{8.2.18}$$

where a vector form of the rotation operator $R$ is given in (2.6.25) on page 68, repeated here due to its significance:

$$R = (\hat{r}\hat{r}\cdot) + \sin\varphi\,(\hat{r}\times) - \cos\varphi\,(\hat{r}\times)(\hat{r}\times) \tag{8.2.19}$$

The precession vector $\hat{r}$ of operator $R$ points along an eigenvector of $U$ and the precession angle $\varphi$ is set by the angle of the complex eigenvalue of $U$.
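
As a small illustration of the vector form (8.2.19), the sketch below builds $R$ as a $3 \times 3$ matrix from a unit axis $\hat{r}$ and an angle $\varphi$, using the cross-product operator written as a skew-symmetric matrix. The function names and the example axis and angle are assumptions chosen for illustration.

```python
import math

def rotation_matrix(r, phi):
    """R = (r r.) + sin(phi) (r x) - cos(phi) (r x)(r x) for a unit vector r."""
    r1, r2, r3 = r
    rx = [[0.0, -r3, r2],
          [r3, 0.0, -r1],
          [-r2, r1, 0.0]]                      # cross-product operator (r x)
    rx2 = [[sum(rx[i][k] * rx[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]                  # (r x)(r x)
    s, c = math.sin(phi), math.cos(phi)
    return [[r[i] * r[j] + s * rx[i][j] - c * rx2[i][j] for j in range(3)]
            for i in range(3)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

r_hat = (1.0, 0.0, 0.0)        # precession axis along S1 (example choice)
phi = 0.8                      # precession angle in radians (example choice)
R = rotation_matrix(r_hat, phi)
t_hat = apply(R, (0.0, 1.0, 0.0))   # rotate the S2 state about S1
print(t_hat)
```

A state aligned with $\hat{r}$ is left unchanged by $R$, while a state perpendicular to $\hat{r}$ is carried around the precession circle by the angle $\varphi$.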
An expression analogous to $jU_\omega U^\dagger$ exists in Stokes space. Consider a frequency change for (8.2.18):

$$\hat{t}_\omega = R_\omega\,\hat{s}$$

where the input polarization state is fixed in frequency. Substitution of $\hat{s} = R^\dagger\hat{t}$ gives

$$\hat{t}_\omega = R_\omega R^\dagger\,\hat{t}$$

By comparison with the previously derived infinitesimal rotation rule (8.2.16) one is led to the identification that

$$\vec{\tau}\times = R_\omega R^\dagger \tag{8.2.20}$$

The value of $R_\omega R^\dagger$ will be clear when deriving the PMD concatenation rules. Yet even at this point its use is significant. Consider again the input and output PMD vectors $\vec{\tau}_s$ and $\vec{\tau}_t$. It was already determined by (8.2.17) that the lengths of these two vectors are identical. Since in general $\hat{t} = R\,\hat{s}$, one can choose the input and output polarizations parallel to the PSPs of $jU_\omega U^\dagger$. Accordingly,

$$\vec{\tau}_t = R\,\vec{\tau}_s \tag{8.2.21}$$

That is, the input and output PMD vectors are related by $R$. How is the second-order PMD component, $\vec{\tau}_{t\omega}$, related to $\vec{\tau}_{s\omega}$? Taking the frequency derivative of (8.2.21),

$$\vec{\tau}_{t\omega} = R_\omega\,\vec{\tau}_s + R\,\vec{\tau}_{s\omega}$$

along with the substitution of (8.2.21) and (8.2.20) yields

$$\vec{\tau}_{t\omega} = \vec{\tau}_t\times\vec{\tau}_t + R\,\vec{\tau}_{s\omega} = R\,\vec{\tau}_{s\omega} \tag{8.2.22}$$

Both the first- and second-order PMD vectors transform from input to output through $R$. Higher-order frequency derivatives quickly become more complex.
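The spin-vector form $(\vec{\tau}\cdot\vec{\sigma})$ also assists in the evaluation of the three Stokes components of the vector $\vec{\tau}$. In particular, (2.5.29) on page 56 gives

$$\vec{\tau}\cdot\vec{\sigma} = \begin{pmatrix} \tau_1 & \tau_2 - j\tau_3 \\ \tau_2 + j\tau_3 & -\tau_1 \end{pmatrix} \tag{8.2.23}$$

Likewise, (8.2.20) is identified with (2.6.29) on page 70 to give

$$\vec{\tau}\times = \begin{pmatrix} 0 & -\tau_3 & \tau_2 \\ \tau_3 & 0 & -\tau_1 \\ -\tau_2 & \tau_1 & 0 \end{pmatrix} \tag{8.2.24}$$

When the unitary matrix is written in Cayley-Klein form, given by (2.4.28) on page 51, the matrix entries of $jU_\omega U^\dagger$ are identified with (8.2.23) as

$$\tau_1 = 2j\left(a_\omega a^* + b_\omega b^*\right) \tag{8.2.25a}$$
$$\tau_2 = 2\,\mathrm{Im}\left(a_\omega b - b_\omega a\right) \tag{8.2.25b}$$
$$\tau_3 = 2\,\mathrm{Re}\left(a_\omega b - b_\omega a\right) \tag{8.2.25c}$$

The DGD can be determined either from $\tau^2 = \tau_1^2 + \tau_2^2 + \tau_3^2$ or from (8.2.11). In either case the resulting expression is

$$\tau = 2\sqrt{a_\omega a_\omega^* + b_\omega b_\omega^*} \tag{8.2.26}$$

In comparison with (8.2.26) it is interesting to note that $aa^* + bb^* = 1$. The frequency derivative of the Jones matrix elements brings down group-delay coefficients which are responsible for the differential-group delay of the system.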
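
The component formulas (8.2.25) and the DGD expression (8.2.26) can be compared numerically. The sketch below assumes the Cayley-Klein form $U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$ (the referenced (2.4.28) is not reproduced here, so this form is an assumption), builds an arbitrary two-section cascade, and estimates $a_\omega$ and $b_\omega$ by central differences. Names are hypothetical.

```python
import math

def section(omega, tau, r):
    """Jones matrix of one homogeneous section (assumed form, phi = omega * tau)."""
    c, s = math.cos(omega * tau / 2), math.sin(omega * tau / 2)
    r1, r2, r3 = r
    return [[c - 1j * s * r1, -1j * s * (r2 - 1j * r3)],
            [-1j * s * (r2 + 1j * r3), c + 1j * s * r1]]

def cayley_klein(omega):
    """Return (a, b) of a two-section cascade U = U2 * U1 = [[a, b], [-b*, a*]]."""
    u1 = section(omega, 1.0, (1.0, 0.0, 0.0))
    u2 = section(omega, 0.7, (0.0, 1.0, 0.0))
    a = u2[0][0] * u1[0][0] + u2[0][1] * u1[1][0]
    b = u2[0][0] * u1[0][1] + u2[0][1] * u1[1][1]
    return a, b

omega, h = 2.0, 1e-5
a, b = cayley_klein(omega)
ap, bp = cayley_klein(omega + h)
am, bm = cayley_klein(omega - h)
a_w, b_w = (ap - am) / (2 * h), (bp - bm) / (2 * h)   # central differences

tau1 = (2j * (a_w * a.conjugate() + b_w * b.conjugate())).real   # (8.2.25a)
x = a_w * b - b_w * a
tau2 = 2 * x.imag                                                # (8.2.25b)
tau3 = 2 * x.real                                                # (8.2.25c)

dgd_from_components = math.sqrt(tau1 ** 2 + tau2 ** 2 + tau3 ** 2)
dgd_direct = 2 * math.sqrt(abs(a_w) ** 2 + abs(b_w) ** 2)        # (8.2.26)
print(dgd_from_components, dgd_direct, abs(a) ** 2 + abs(b) ** 2)
```

The two DGD values should agree to within finite-difference error, and the last printed number checks the normalization $aa^* + bb^* = 1$.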
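Finally, the spin-vector form is used to relate the eigenvector of $U$ to the output PMD vector. The exponential form of the unitary operator

$$U = \exp\left[-j\varphi\,(\hat{r}\cdot\vec{\sigma})/2\right] \tag{8.2.27}$$

has a frequency derivative of (cf. (2.6.19) on page 68)

$$U_\omega = -j\,(\varphi_\omega/2)\,(\hat{r}\cdot\vec{\sigma})\,U - j\,(\hat{r}_\omega\cdot\vec{\sigma})\sin(\varphi/2)$$

The product $jU_\omega U^\dagger$ is

$$jU_\omega U^\dagger = \frac{\varphi_\omega}{2}\,(\hat{r}\cdot\vec{\sigma}) + (\hat{r}_\omega\cdot\vec{\sigma})\sin(\varphi/2)\left[I\cos(\varphi/2) + j\,(\hat{r}\cdot\vec{\sigma})\sin(\varphi/2)\right]$$

Identification with (8.2.14) and use of spin-vector product identity (2.5.38) on page 57 generates the relation between the eigenvector $\hat{r}$ of $U$ and the PMD vector $\vec{\tau}$ of $jU_\omega U^\dagger$:

$$\vec{\tau} = \varphi_\omega\,\hat{r} + \sin\varphi\;\hat{r}_\omega - (1 - \cos\varphi)\;\hat{r}_\omega\times\hat{r} \tag{8.2.28}$$

As discussed in §8.2.1, the eigenvector of $U$ generally changes to first order with frequency. That first-order effect is accounted for by $\hat{r}_\omega$ in (8.2.28). An important special case is when the PMD vector refers to a single homogeneous birefringent section. In this case $\hat{r}_\omega = 0$ and the PMD vector of the element is aligned to the birefringent axis: $\vec{\tau} = \varphi_\omega\,\hat{r}$. The frequency derivative of the phase is the group delay of the element, or $\varphi_\omega = \tau$.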
8.2.4 Concatenation Rules for PMD
The PMD concatenation rules are very helpful to gain physical insight and intuition on how PMD behaves. The concatenation rules track the PMD vectors themselves as more and more sections are added to a system. Early work on PMD statistics clearly deals with the concatenation of many (indeed infinitely many) birefringent sections, but the discussion of PMD statistics is left for the next chapter. The four works from the technical literature that present concatenation rules are from Curti et al. [4], who presented a two-stage concatenation and numerically extended the work to large numbers; Gisin [19], who derived the recurrence relation for an arbitrary number of sections; Karlsson [38], who recast Gisin's recurrence relation in spin-vector form; and Gordon and Kogelnik [20], who extended the prior work.

The concatenation rules are rules for adding two or more output PMD vectors. The rules could be written for the input PMD vectors just as well, but they are not for a mixture of input and output vectors.

[Figure 8.20: block diagrams. (a) one block $R_1$; (b) two blocks $R_1$, $R_2$; (c) N blocks $R_1 \ldots R_N$; each cascade maps input state $\hat{s}$ to output state $\hat{t}$, with block PMD vectors $\vec{\tau}_1 \ldots \vec{\tau}_N$.]

Fig. 8.20. Block diagrams of one, two, and N birefringent blocks in concatenation. It
is assumed that the blocks are lossless. A birefringent block may have heterogeneous
or homogeneous birefringence.

One Birefringent Section


Without even one section of birefringence there is no PMD at all. The PMD
vector first appears after a single homogeneous birefringent section. The
relation between a single birefringent element and the PMD vector is given
in (8.2.28), where $\hat{r}_\omega = 0$. Thus,

$$\vec{\tau} = \varphi_\omega\,\hat{r} \tag{8.2.29}$$

where $\hat{r}$ points in the direction of the slow birefringent axis of the element.
The DGD is given by $\tau = \varphi_\omega$. $\vec{\tau}$ is the first-order PMD vector for the section.
The second-order vector is

$$\vec{\tau}_\omega = 0 \tag{8.2.30}$$

There is no second-order frequency dependence of the PMD vector. This is a
unique case for PMD, as concatenations with two or more stages will always
have finite $\vec{\tau}_\omega$ except, possibly, at particular frequencies when the second-order
vector momentarily vanishes.
One Birefringent Block
A birefringent block is distinguished from a birefringent section in that the
latter is homogeneous birefringence, i.e. with no intermediate polarization
mode coupling, while the former can be birefringence of any composition and
inhomogeneity. Any output polarization is related to the input polarization
by $\hat{t} = R_1\hat{s}$, while the first- and second-order PMD of the block is denoted
simply by $\vec{\tau}_1$ and $\vec{\tau}_{1\omega}$ (Fig. 8.20(a)). The PMD vector can be determined from
$\vec{\tau}_1\times = R_{1\omega}R_1^\dagger$.
Two Birefringent Blocks
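Denote the PMD vector generated by a first birefringent block as $\vec{\tau}_1$ and by
a second birefringent block as $\vec{\tau}_2$ (Fig. 8.20(b)). When these two blocks are
concatenated the overall input-to-output transformation is $R = R_2R_1$. The
cumulative PMD vector $\vec{\tau}$ is related to the block transfer operators as

$$\begin{aligned}
\vec{\tau}\times &= (R_2R_1)_\omega\,(R_2R_1)^\dagger \\
 &= R_{2\omega}R_1R_1^\dagger R_2^\dagger + R_2R_{1\omega}R_1^\dagger R_2^\dagger \\
 &= \vec{\tau}_2\times + R_2\,(\vec{\tau}_1\times)\,R_2^\dagger \\
 &= \vec{\tau}_2\times + (R_2\vec{\tau}_1)\times
\end{aligned}$$

The last line is derived from the preceding one as $R\,(\vec{v}\times)\,R^\dagger$ is a unitary
transformation on $\vec{v}$. The embedded expression for first-order PMD is

$$\vec{\tau} = \vec{\tau}_2 + R_2\vec{\tau}_1 \tag{8.2.31}$$

The second-order expression comes from the frequency derivative

$$\vec{\tau}_\omega = \vec{\tau}_{2\omega} + R_2\vec{\tau}_{1\omega} + R_{2\omega}\vec{\tau}_1$$

Using (8.2.31) to write $\vec{\tau}_1 = R_2^\dagger(\vec{\tau} - \vec{\tau}_2)$, the second-order expression reduces
to

$$\vec{\tau}_\omega = \vec{\tau}_{2\omega} + R_2\vec{\tau}_{1\omega} + \vec{\tau}_2\times\vec{\tau} \tag{8.2.32}$$

The second-order expression for $\vec{\tau}_\omega$ is almost like that for $\vec{\tau}$ except for the
additional $\vec{\tau}_2\times\vec{\tau}$ on the right-hand side. The additional vector generates a pulling
of the second-order cumulative vector $\vec{\tau}_\omega$ in a direction defined by $\vec{\tau}_2\times\vec{\tau}$,
which is orthogonal to both the local PMD component $\vec{\tau}_2$ and the cumulative
first-order vector $\vec{\tau}$.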
Since these derivations have applied to heterogeneous birefringent blocks,
the block PMD vectors and the transformation operators R are generally a
function of frequency. Accordingly, the first-order cumulative vector is more
accurately written as

$$\vec{\tau}(\omega) = \vec{\tau}_2(\omega) + R_2(\omega)\,\vec{\tau}_1(\omega) \tag{8.2.33}$$

For small frequency changes, (8.2.33) establishes a precession rule. The PMD
vector $\vec{\tau}_1$ precesses about the axis $\hat{r}_2$ of the second PMD block through
angle $\varphi_2$. This is the effect of $R_2$ on $\vec{\tau}_1$. The rotated vector is then added to $\vec{\tau}_2$.
As Gordon and Kogelnik point out, "The rule is very similar to that for
impedances of a transmission line: to get the PMD vector of an assembly,
transform the PMD vectors of each individual section to a common reference
cross section and take the sum of all the vectors" [20].
A significant special case of (8.2.33) is when $R_2$ and $R_1$ refer to homogeneous
birefringent sections. In this case there is no frequency dependence
of $\hat{r}_{1,2}$ or $\tau_{1,2}$, and the precession behavior is clearer since the birefringent
axes are fixed. The two-section precession rule (8.2.31) is expanded to

$$\vec{\tau} = \vec{\tau}_2 + \left[(\hat{r}_2\hat{r}_2\cdot) + \sin\varphi_2\,(\hat{r}_2\times) - \cos\varphi_2\,(\hat{r}_2\times)(\hat{r}_2\times)\right]\vec{\tau}_1$$

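[Figure 8.21: panels (a)–(e). (a) two sections $\vec{\tau}_1$, $\vec{\tau}_2$ joined at angle $\theta_{21}$; (b) vector addition in Stokes space about $\hat{r}_2$, with precession angle $\omega\tau_2$ and fixed angle $2\theta_{21}$; (c)–(e) cumulative vector $\vec{\tau}(\omega_1)$, $\vec{\tau}(\omega_2)$, $\vec{\tau}(\omega_3)$ at three frequencies.]

Fig. 8.21. PMD concatenation rule for two homogeneous birefringent sections.
a) Concatenation of two birefringent sections having section DGDs of $\tau_1$ and $\tau_2$, and
relative angle between birefringent axes $\theta_{21}$. b) PMD vector addition. $\vec{\tau}_1$ precesses
about $\vec{\tau}_2$ with retardance $\varphi_2$. The cumulative PMD vector $\vec{\tau}$ is the vector sum
of the components. The pointing direction of $\vec{\tau}$ is the output PSP of the cascade.
c–e) Component PMD vectors at three different frequencies. The PSP spectrum is
periodic and the DGD spectrum is constant.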

where $\hat{r}_2$ points in the direction of $\vec{\tau}_2$ and $\varphi_2$ is the birefringent phase of the
second section. This motion and the associated physical construct are illustrated
in Fig. 8.21(a,b).
Two homogeneous birefringent sections with mode-mixing at a well-defined
junction are illustrated in Fig. 8.21(a). The mode-mixing angle is the difference
between the angles of the birefringent axes of the two sections. The two-section
concatenation rule is illustrated in Fig. 8.21(b), which is drawn in Stokes space.
The base of $\vec{\tau}_1$ is joined to the tip of $\vec{\tau}_2$. The angle between the two vectors
is $2\theta_{21}$, twice the physical angle at the mode-mixing junction; this angle is
fixed in frequency. When frequency is changed, the PMD vector of the first
section $\vec{\tau}_1$ precesses about the axis of the second PMD vector, $\hat{r}_2$. The angle
of precession is $\varphi_2$, which is solely dictated by the length of the second PMD
vector. The precession rate is $\tau_2$. Over all frequencies the tip of $\vec{\tau}_1$ traces a
circle; the circular motion is periodic with a free-spectral range of FSR = $1/\tau_2$.
Figures 8.21(c–e) illustrate this motion. At a first frequency the cumulative
PMD vector $\vec{\tau}$ points upward; at a second frequency it points downward; and
at a third frequency it points up again. Since $\vec{\tau}$ points in the direction of the
slow output PSP, the circle traced by $\vec{\tau}_1$ in frequency is the PSP spectrum of
the two-section concatenation. The length of the cumulative vector $\vec{\tau}$ is the
vector sum of the components. In this case there are only two fixed-length
components, so $\vec{\tau}$ completes the triangle rule and remains constant in length
over frequency.
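The two-section precession rule can be checked numerically. The sketch below is a minimal illustration, assuming the second section's axis lies along S1, the first section's axis sits at the Stokes angle $2\theta_{21}$ from it, the birefringent phase is $\varphi_2 = \omega\tau_2$ with no additional residual phase, and rotations are right-handed; the function names are illustrative.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def rotate(axis, angle, v):
    """Right-handed rotation of v about the unit vector axis:
    (r r.) + sin(angle) (r x) - cos(angle) (r x)(r x) applied to v."""
    dot = sum(a * b for a, b in zip(axis, v))
    c, s = math.cos(angle), math.sin(angle)
    rxv = cross(axis, v)
    return tuple(dot * a * (1.0 - c) + c * x + s * y
                 for a, x, y in zip(axis, v, rxv))

def two_section_pmd(omega, tau1, tau2, theta21):
    """Cumulative PMD vector tau = tau_2 + R_2 tau_1 for two homogeneous sections."""
    r2 = (1.0, 0.0, 0.0)                       # assumed axis of section 2
    r1 = (math.cos(2.0 * theta21), math.sin(2.0 * theta21), 0.0)
    tau1_vec = tuple(tau1 * c for c in r1)
    rotated = rotate(r2, omega * tau2, tau1_vec)
    return tuple(tau2 * a + b for a, b in zip(r2, rotated))

def length(v):
    return math.sqrt(sum(c * c for c in v))

for omega in (0.0, 0.7, 1.9, 4.2):
    print(omega, length(two_section_pmd(omega, 3.0, 2.0, math.pi / 8)))
```

Under these assumptions the printed DGD is the same at every frequency, $\sqrt{\tau_1^2 + \tau_2^2 + 2\tau_1\tau_2\cos 2\theta_{21}}$, while the pointing direction repeats whenever $\omega$ advances by $2\pi/\tau_2$.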


N Birefringent Blocks
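The concatenation rule for N birefringent blocks is derived from repeated
application of (8.2.31) and (8.2.32). Denoting $\vec{\tau}_k$ the PMD vector of the kth block
and $\vec{\tau}(k)$ the cumulative PMD vector through the kth block, the cumulative
first- and second-order PMD vectors are boot-strapped from the origin

$$\begin{aligned}
\vec{\tau}(1) &= \vec{\tau}_1, & \vec{\tau}_\omega(1) &= 0 \\
\vec{\tau}(2) &= \vec{\tau}_2 + R_2\vec{\tau}(1), & \vec{\tau}_\omega(2) &= \vec{\tau}_2\times\vec{\tau}(2) \\
\vec{\tau}(3) &= \vec{\tau}_3 + R_3\vec{\tau}(2), & \vec{\tau}_\omega(3) &= \vec{\tau}_3\times\vec{\tau}(3) + R_3\vec{\tau}_\omega(2) \\
 &\;\;\vdots & &\;\;\vdots \\
\vec{\tau}(n) &= \vec{\tau}_n + R_n\vec{\tau}(n-1), & \vec{\tau}_\omega(n) &= \vec{\tau}_n\times\vec{\tau}(n) + R_n\vec{\tau}_\omega(n-1)
\end{aligned} \tag{8.2.34}$$

Another form for $\vec{\tau}(n)$ is to write recursively

$$\vec{\tau}(n) = \vec{\tau}_n + R_n\left(\vec{\tau}_{n-1} + R_{n-1}\left(\vec{\tau}_{n-2} + \cdots R_3\left(\vec{\tau}_2 + R_2\vec{\tau}_1\right)\right)\cdots\right) \tag{8.2.35}$$

This form gives some physical insight. Starting from the beginning of the
cascade, component vector $\vec{\tau}_1$ is placed on the tip of $\vec{\tau}_2$ and precesses about
its axis at rate $\tau_2$. This is the action of $R_2$. Together these two components
are placed on the tip of $\vec{\tau}_3$ and they precess about the $\hat{r}_3$ axis at rate $\tau_3$.
This is the action of $R_3$. Keep in mind that while $\vec{\tau}_2$ and $\vec{\tau}_1$ precess about $\vec{\tau}_3$,
$\vec{\tau}_1$ continues to precess about $\vec{\tau}_2$. The procedure is repeated through the nth
section.
Figure 8.22 illustrates the precession for three homogeneous birefringent
sections in cascade. Unlike the two-section case, the motion of the tip of $\vec{\tau}_1$ is
more complicated, tracing a "folded-eight" curve in Stokes space and having
a frequency-dependent vector sum. The vector sum $\vec{\tau}$ is the cumulative PMD
vector of the cascade.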
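Finally, a compact form of the cumulative first- and second-order PMD
vectors is written as

$$\vec{\tau}(n) = \sum_{k=1}^{n} R(n, k+1)\,\vec{\tau}_k \tag{8.2.36}$$

and

$$\vec{\tau}_\omega(n) = \sum_{k=1}^{n} R(n, k+1)\left(\vec{\tau}_{k\omega} + \vec{\tau}_k\times\vec{\tau}(k)\right) \tag{8.2.37}$$

where

$$R(n, k) = R_nR_{n-1}\cdots R_k \tag{8.2.38}$$

and $R(n, n+1)$ is taken as the identity. Note that the evaluation of $\vec{\tau}_\omega$ requires
the concurrent evaluation of $\vec{\tau}$. Expressions (8.2.36)–(8.2.37) offer a fast way
to evaluate numerically the first- and second-order PMD vectors for an arbitrary
cascade. The evaluation occurs directly in Stokes space and $\vec{\tau}_\omega$ is determined
without a numerical derivative.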

[Figure 8.22: panels (a)–(e). (a) three sections $\vec{\tau}_1$, $\vec{\tau}_2$, $\vec{\tau}_3$ joined at angles $\theta_{21}$ and $\theta_{32}$; (b) vector addition in Stokes space about $\hat{r}_2$ and $\hat{r}_3$, with precession angles $\omega\tau_2$, $\omega\tau_3$ and fixed angles $2\theta_{21}$, $2\theta_{32}$; (c)–(e) cumulative vector $\vec{\tau}(\omega_1)$, $\vec{\tau}(\omega_2)$, $\vec{\tau}(\omega_3)$ at three frequencies.]

Fig. 8.22. PMD concatenation rule for three homogeneous birefringent sections.
a) Concatenation of three birefringent sections having section DGDs of $\tau_{1,2,3}$ and
relative angle between birefringent axes $\theta_{21}$ and $\theta_{32}$. The drawing is for $\tau_1 = \tau_3 = 2\tau_2$.
b) PMD vector addition. $\vec{\tau}_1$ precesses about $\vec{\tau}_2$ with retardance $\varphi_2$; combined, the
vectors precess about $\vec{\tau}_3$ with retardance $\varphi_3$. The motion at the tip of $\vec{\tau}_1$ is more
complicated than the two-section case, and the length of the cumulative PMD vector
now changes with frequency. c–e) Component PMD vectors at three different
frequencies. The PSP spectrum is periodic but more complicated than for two sections.
The DGD spectrum varies with frequency and is also periodic.

An alternative method, the multiplication of Jones matrices where each matrix
represents a homogeneous element and conversion to PMD vectors via
(8.2.25)–(8.2.26), requires substantially more computations and requires evaluation at
two frequencies in order to estimate $\vec{\tau}_\omega$.
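The recursion for N sections lends itself to a short Stokes-space routine. The following is a minimal sketch for homogeneous sections only, assuming $\vec{\tau}_{k\omega} = 0$, a birefringent phase $\varphi_k = \omega\tau_k$ with no extra residual phase, and right-handed rotations; `cumulative_pmd` and the helper names are illustrative and not from the text.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(u, k):
    return tuple(k * a for a in u)

def rotate(axis, angle, v):
    """Right-handed rotation of v about the unit vector axis."""
    dot = sum(a * b for a, b in zip(axis, v))
    c, s = math.cos(angle), math.sin(angle)
    return add(add(scale(axis, dot * (1.0 - c)), scale(v, c)),
               scale(cross(axis, v), s))

def cumulative_pmd(sections, omega):
    """sections: list of (dgd, unit_axis), first section first.
    Returns (tau, tau_w): first- and second-order cumulative PMD vectors."""
    tau = (0.0, 0.0, 0.0)
    tau_w = (0.0, 0.0, 0.0)
    for dgd, axis in sections:
        tau_k = scale(axis, dgd)
        tau = add(tau_k, rotate(axis, omega * dgd, tau))        # tau(k)
        tau_w = add(cross(tau_k, tau),                          # tau_k x tau(k)
                    rotate(axis, omega * dgd, tau_w))           # + R_k tau_w(k-1)
    return tau, tau_w

def axis_at(angle):
    return (math.cos(angle), math.sin(angle), 0.0)

sections = [(3.0, axis_at(0.0)), (1.5, axis_at(0.8)), (3.0, axis_at(2.1))]
tau, tau_w = cumulative_pmd(sections, omega=1.3)
print(tau, tau_w)
```

Starting both vectors at zero reproduces the first row of the recursion automatically, and the second-order vector comes out of the same pass through the cascade, with no second frequency point needed.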
8.2.5 PMD Evolution Equations
The PMD concatenation equations developed in the preceding section abstracted the
underlying birefringence and focused solely on how PMD vectors combine
in length and precess in frequency. Studies of optical pulse propagation in
fiber and statistical properties of PMD require an explicit connection between
the cumulative PMD vector and the underlying local birefringence. There
are at least two ways to derive the PMD evolution equation, one due to
Poole et al. [53], and another due to Gisin et al. [19] and Gordon et al. [20]. The
latter derivation type makes the connection between the PMD concatenation
rules and the underlying birefringence. The Poole derivation is detailed below.


A note on notation. The PDL evolution equations are written in terms of a local
PDL vector per unit length (cf. §8.1.4). For PMD, the local birefringence
per unit length is customarily denoted $\beta$ (as in $\exp(-j\beta z)$ where $\beta = \omega\Delta n/c$).
The connection to the local PMD vector $\vec{\tau}$ is $\vec{\tau} = \vec{\beta}_\omega L$ where L is the length
of the segment.
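The Poole derivation starts with the established precession rules in length
and frequency for an arbitrary polarization state:

$$\frac{\partial\hat{s}}{\partial z} = \vec{\beta}\times\hat{s}, \quad\text{and}\quad \frac{\partial\hat{s}}{\partial\omega} = \vec{\tau}\times\hat{s}$$

where $\vec{\beta}$ is the local birefringence vector and $\vec{\tau}$ is the cumulative PMD vector
up to location z. Taking the frequency derivative of the first equation and the
length derivative of the second gives

$$\frac{\partial^2\hat{s}}{\partial\omega\,\partial z} = \vec{\beta}_\omega\times\hat{s} + \vec{\beta}\times\hat{s}_\omega$$

$$\frac{\partial^2\hat{s}}{\partial z\,\partial\omega} = \vec{\tau}_z\times\hat{s} + \vec{\tau}\times\hat{s}_z$$

Under the assumption of continuity of the function $\hat{s}(z, \omega)$, the left-hand sides
are equal. Using the vector identity
$(\vec{\beta}\times\vec{\tau})\times\hat{s} = \vec{\beta}\times(\vec{\tau}\times\hat{s}) - \vec{\tau}\times(\vec{\beta}\times\hat{s})$
results in the PMD evolution equation

$$\frac{\partial\vec{\tau}}{\partial z} = \vec{\beta}_\omega + \vec{\beta}\times\vec{\tau} \tag{8.2.39}$$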
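For readers who want the intermediate algebra, the steps between the two mixed derivatives and (8.2.39) can be written out as below. This is a sketch of the same argument, using only the two precession rules and the vector identity quoted in the derivation.

```latex
% Substitute s_omega = tau x s and s_z = beta x s, then equate the mixed derivatives:
\vec{\beta}_\omega\times\hat{s} + \vec{\beta}\times(\vec{\tau}\times\hat{s})
   = \vec{\tau}_z\times\hat{s} + \vec{\tau}\times(\vec{\beta}\times\hat{s})

% Collect the double cross products with the vector identity:
\vec{\tau}_z\times\hat{s}
   = \vec{\beta}_\omega\times\hat{s}
     + \left[\vec{\beta}\times(\vec{\tau}\times\hat{s})
             - \vec{\tau}\times(\vec{\beta}\times\hat{s})\right]
   = \left(\vec{\beta}_\omega + \vec{\beta}\times\vec{\tau}\right)\times\hat{s}

% The relation holds for an arbitrary polarization state s, hence
\vec{\tau}_z = \vec{\beta}_\omega + \vec{\beta}\times\vec{\tau}
```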

The local birefringence changes the cumulative PMD vector in both an
additive and multiplicative sense: additive through $\vec{\beta}_\omega$ and multiplicative
through $|\vec{\beta}\times\vec{\tau}|$. Moreover, $\vec{\beta}_\omega$ drives the average direction of $\vec{\tau}_z$ while
$\vec{\beta}\times\vec{\tau}$ drives $\vec{\tau}_z$ in a perpendicular direction.
Figure 8.23 illustrates the motion of $\vec{\tau}$ through two birefringent sections,
where $\vec{\tau}(z = 0) = 0$. Through the first section the cross-product vanishes, so $\vec{\tau}$
directly tracks $\vec{\beta}_1$. However, once the light enters the second section, the local
birefringence and cumulative PMD vector are no longer parallel. The motion
of the PMD vector is helical about a center axis $\vec{\beta}_2$. The $\vec{\beta}_{2\omega}$ term pulls the
average direction of $\vec{\tau}$ along $\vec{\beta}_2$ while the $\vec{\beta}\times\vec{\tau}$ term drives the helical motion.
The physical interpretation of the motion of $\vec{\tau}$ in section $\vec{\beta}_2$ is as follows.
Recall that the length of the PMD vector is the differential-group delay.
The DGD is, roughly, a measure of the number of full-wave slips between
orthogonal polarizations that has occurred. For each accumulated full-wave
slip along $\vec{\beta}_2$ there is an increment of the DGD by the associated delay. For
instance, at 1.55 µm a one-wave slip is a delay of about 5 fs. The projection
of $\vec{\tau}$ onto the $\vec{\beta}_2$ axis is approximately the number of full-wave phase slips
that has occurred in the section. Clearly the longer the section the longer the
projected vector.
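The helical motion can be reproduced by integrating the evolution equation directly. The sketch below is a minimal illustration in normalized units, assuming a uniform section with $\vec{\beta} = \omega b\,\hat{r}$ and $\vec{\beta}_\omega = b\,\hat{r}$, where `delay_per_length` plays the role of $b = \Delta n/c$; it uses a fourth-order Runge-Kutta step, which is a choice of convenience rather than anything prescribed by the text.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(u, k):
    return tuple(k * a for a in u)

def rhs(tau, axis, delay_per_length, omega):
    """Right-hand side beta_w + beta x tau for a uniform section."""
    beta_w = scale(axis, delay_per_length)
    beta = scale(axis, omega * delay_per_length)
    return add(beta_w, cross(beta, tau))

def evolve(tau_in, axis, delay_per_length, omega, length, steps=2000):
    """Integrate d(tau)/dz over one uniform section with RK4."""
    h = length / steps
    tau = tau_in
    for _ in range(steps):
        k1 = rhs(tau, axis, delay_per_length, omega)
        k2 = rhs(add(tau, scale(k1, h / 2)), axis, delay_per_length, omega)
        k3 = rhs(add(tau, scale(k2, h / 2)), axis, delay_per_length, omega)
        k4 = rhs(add(tau, scale(k3, h)), axis, delay_per_length, omega)
        slope = add(add(k1, scale(k2, 2.0)), add(scale(k3, 2.0), k4))
        tau = add(tau, scale(slope, h / 6))
    return tau

def rotate(axis, angle, v):
    """Right-handed rotation of v about the unit vector axis."""
    dot = sum(a * b for a, b in zip(axis, v))
    c, s = math.cos(angle), math.sin(angle)
    return add(add(scale(axis, dot * (1.0 - c)), scale(v, c)),
               scale(cross(axis, v), s))

axis1 = (1.0, 0.0, 0.0)
axis2 = (math.cos(1.0), math.sin(1.0), 0.0)
omega, delay_per_length = 5.5, 1.0
tau_after_1 = evolve((0.0, 0.0, 0.0), axis1, delay_per_length, omega, 3.0)
tau_after_2 = evolve(tau_after_1, axis2, delay_per_length, omega, 2.0)
print(tau_after_1, tau_after_2)
```

With these numbers the second section imparts $\omega b L = 11$ rad, or 1.75 revolutions, and the integrated result agrees with the two-section concatenation rule $\vec{\tau} = \vec{\tau}_2 + R_2\vec{\tau}_1$ evaluated with the same rotation sense.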

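[Figure 8.23: trajectory of $\vec{\tau}$ in Stokes space (axes S1, S2, S3) through sections $\vec{\beta}_1$ and $\vec{\beta}_2$, with the $\vec{\beta}_2\times\vec{\tau}$ direction indicated and $\vec{\tau} = \vec{\beta}_\omega L$.]

Fig. 8.23. Cumulative PMD vector evolution through birefringent sections $\vec{\beta}_1$
and $\vec{\beta}_2$. The cumulative vector follows $\vec{\beta}_1$ directly. Once the light enters $\vec{\beta}_2$ the
cross-product between local and cumulative vectors is non-zero and precession commences.
$\vec{\tau}$ traces a helical path in the direction of $\vec{\beta}_2$. The birefringence $\vec{\beta}_{2\omega}$ is the driving term
of the equation while $\vec{\beta}_2\times\vec{\tau}$ forces the cumulative vector in a direction perpendicular
to $\vec{\beta}_2$.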

Since $\vec{\tau}$ is a Stokes vector as well, it must track the principal state of
polarization as the light propagates. As the phase slips through one full wave
the principal state (as well as the polarization state) precesses about $\vec{\beta}_2$. A
full revolution is made for length z such that

$$\beta_2 z = 2\pi$$

If the helical motion had zero pitch then the motion of $\vec{\tau}$ about $\vec{\beta}_2$ would be
a pure precession. But given that the length of the PMD vector tracks the
number of orthogonal-wave slips the helix pitch is greater than zero.
Figure 8.23 shows less than two full revolutions of $\vec{\tau}$ about $\vec{\beta}_2$. This
corresponds to less than two full-wave phase slips, or less than 10 fs. A birefringent
crystal or fiber segment that has a substantial DGD, say 1 ps, has about 200
revolutions of $\vec{\tau}$ in Stokes space at 1.55 µm. A 100 ps delay has accordingly
some 20,000 revolutions. This is the origin of an "order-of-magnitude" distinction
that is common when dealing with PMD. The DGD would have to be
written to an accuracy better than one part in 20,000 in order to capture the
fraction of a phase slip a long birefringent concatenation imparts. Yet only
two or three significant figures are generally reported. Three significant
figures leave unresolved hundreds of revolutions for a 100 ps delay. They also leave
unresolved the fraction of a phase slip that occurs for even a 1 ps delay. But
the fraction of a phase slip is essential information to properly track the PMD
vector evolution. The criticality of the fractional phase slip is illustrated by
the example of Fig. 8.24.
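The revolution counts quoted above follow from simple arithmetic: one full-wave slip corresponds to one optical period, $\lambda/c$. A minimal sketch, assuming the vacuum speed of light and a wavelength of 1.55 µm (function names are illustrative):

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def one_wave_delay(wavelength_m):
    """Delay in seconds of a single full-wave slip (one optical period)."""
    return wavelength_m / SPEED_OF_LIGHT

def revolutions(dgd_s, wavelength_m):
    """Number of full-wave slips, i.e. revolutions in Stokes space, for a DGD."""
    return dgd_s / one_wave_delay(wavelength_m)

period = one_wave_delay(1.55e-6)
print(period)                           # about 5.17e-15 s
print(revolutions(1e-12, 1.55e-6))      # about 193 for a 1 ps DGD
print(revolutions(100e-12, 1.55e-6))    # about 19,300 for a 100 ps DGD
```

These values round to the 5 fs, 200 and 20,000 figures used in the discussion.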

[Figure 8.24: panels (a) and (b). Trajectories of the cumulative PMD vector $\vec{\tau}$ in Stokes space (axes S1, S2, S3) through three segments $\vec{\beta}_1$, $\vec{\beta}_2$, $\vec{\beta}_3$; the center segment imparts 1.5 revolutions in (a) and 2.0 revolutions in (b).]

Fig. 8.24. The importance of residual birefringent phase on the evolution of the
cumulative PMD vector. The three-segment cascades are identical but for the
second segment: in (a) the segment imparts 1.5 revolutions while in (b) the segment
imparts 2.0 revolutions. a) Trajectory of three-segment cascade where center
segment has 1.5 birefringent phase slips. The cumulative DGD increases monotonically.
b) Trajectory of three-segment cascade where center segment has 2.0 birefringent
phase slips. The cumulative DGD decreases when it enters the third segment. The
output PSPs of the two cascades are nearly in opposite directions.

Figure 8.24 shows the importance of the fractional phase slip, also known as
the residual birefringent phase, in the evolution of the PMD vector. The figure
illustrates the evolution through two concatenations of three segments each.
The concatenations are almost the same; the only difference is that the second
segment in Fig. 8.24(a) has 1.5 revolutions while the one in Fig. 8.24(b) has 2.0
revolutions. In the first sequence the PMD vector output from $\vec{\beta}_2$ points in
the direction of $\vec{\beta}_3$; the cumulative PMD vector continues to increase in length
through the third segment. In the second sequence, the extra half-wave
rotation of $\vec{\tau}$ through the second segment orients the resultant PMD vector in
the opposite direction from $\vec{\beta}_3$. Propagation through the third segment still
pulls $\vec{\tau}$ toward $\vec{\beta}_3$ but does so first by decreasing the PMD vector length. The
resultant cumulative vector length and pointing direction are very different


for the two cases; yet the only underlying difference is a half-wave shift along
the second birefringent segment. This discussion highlights the importance of
the residual birefringent phase; this phase will present itself time and again
in the context of the Fourier spectrum of the DGD and programmable PMD
generation.
As a point of comparison to PDL evolution, the cumulative PMD vector
can point in any direction in Stokes space even if the underlying birefringent
vectors of the segments all lie in the same plane. For PDL, however, if the
underlying PDL vectors all lie in the same plane, the cumulative PDL vector
cannot leave that plane. This is illustrated in Fig. 8.6(c) on page 311. The
difference stems from the corresponding coupling term in the PDL evolution
equation, which pulls the cumulative PDL vector toward the local PDL vector,
while the $\vec{\beta}\times\vec{\tau}$ term in the PMD evolution equation drives the cumulative
PMD vector in a helical motion about the local birefringence. The helical motion
is in a plane nearly orthogonal to the local birefringent vector (it has a small
longitudinal component along the local vector). Hence the cumulative PMD vector
will likely lie off the plane of the local birefringent vectors.
8.2.6 Time-Domain Representation
The predominant representation of PMD thus far has been in the frequency
domain. This is a natural consequence of the frequency-centric definition of
the PMD vector. However, the parallel representation is the PMD impulse
response in the time domain. The impulse response is generally a more difficult
characteristic with which to make computations, but a richness of intuition is
gained by understanding the parallels between the two domains.
Between the extremes of sine-wave response and impulse response lies the
signal response of a communications channel, particularly the distortion
imparted on a signal due to PMD. The signal response is fundamentally the
convolution of the input waveform with the impulse response. What makes
the calculation tricky is that co-polarized signal-image components that
result from the convolution interfere coherently; the temporal locations of the
impulses matter to within a fraction of a wave. When in one case two
co-polarized signal images are in phase and add constructively, a dephasing by $\pi$
leads to destructive interference between the signal images. While the impulse
weights change with changes in mode mixing, the temporal locations of the
impulse response change only when the composition of the PMD concatenation
changes. This implies that for a specific concatenation, the impulse
response may extend well into the duration of a signal pulse; how the signal
is distorted depends on which co-polarized signal images make constructive
interference and which make destructive interference. In some cases the signal
will look undistorted while in others it will look quite distorted. How the PMD
impacts the signal depends on the expression of this coherent interference.
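The sensitivity to a dephasing by $\pi$ can be seen with two co-polarized signal images alone. The sketch below is a deliberately simplified illustration, assuming two unit-amplitude rectangular images of equal weight, offset by `delay` and differing by a carrier phase `phase`; real impulse weights depend on the mode mixing and are not modeled here.

```python
import cmath
import math

def rect(t, width):
    """Unit-amplitude rectangular envelope occupying 0 <= t < width."""
    return 1.0 if 0.0 <= t < width else 0.0

def intensity(t, width, delay, phase):
    """Square-law detected intensity of two co-polarized signal images."""
    field = rect(t, width) + cmath.exp(1j * phase) * rect(t - delay, width)
    return abs(field) ** 2

width, delay = 100.0, 20.0   # ps, illustrative values
for phase in (0.0, math.pi / 2, math.pi):
    overlap = intensity(50.0, width, delay, phase)   # both images present
    leading = intensity(10.0, width, delay, phase)   # only the first image
    print(phase, overlap, leading)
```

In the overlap region the intensity is $2 + 2\cos(\text{phase})$, so it swings between four times the single-image level and zero, while the non-overlapping edges are unaffected by the phase.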


The following three subsections present three views of the time-domain


response of a signal to PMD: simple distortions, signal-moments analysis, and
impulse-moments analysis.
Signal Distortion
The impact of PMD on a signal is calculated by multiplying the waveform
spectrum with the PMD spectrum or convolving the waveform with the PMD
impulse response. Either way a reconciliation must be made as it is natural to
describe the signal waveform in the time domain and PMD in the frequency
domain.
In general, the time- and frequency-domain representations of the input
signal are

$$e_s(t) = f(t)\,|s\rangle, \quad\text{and}\quad E_s(\omega) = f(\omega)\,|s\rangle \tag{8.2.40}$$

where $f(t)$ is the waveform envelope of the electric field $e_s(t)$ and $|s\rangle$ is its
input polarization state; and where $E_s(\omega)$ and $f(\omega)$ are their Fourier transform
equivalents³.
The output electric field is related to the input via the PMD transformation
matrix $T(\omega)$:

$$E_t(\omega) = T(\omega)E_s(\omega) = f(\omega)\,T(\omega)\,|s\rangle \tag{8.2.41}$$

where $f(\omega)$ is a complex-valued function of frequency and $T(\omega)$ is an operator.
In the time domain, the output waveform envelope is governed by the
polarization transfer function

$$e_t(t) = \begin{pmatrix} f(t)*t_{11}(t) & f(t)*t_{12}(t) \\ f(t)*t_{21}(t) & f(t)*t_{22}(t) \end{pmatrix}|s\rangle \tag{8.2.42}$$

where $t_{ij}(t)$ is the inverse Fourier transform of the respective matrix element
in $T(\omega)$. It is important to recognize that the waveform envelope and
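³ The Fourier transform pair used herein follows Haus [22]:

$$e(t) = \int_R E(\omega)\,e^{j\omega t}\,d\omega, \qquad E(\omega) = \frac{1}{2\pi}\int_R e(t)\,e^{-j\omega t}\,dt$$

$$W = \int_R e^*(t)\,e(t)\,dt = 2\pi\int_R E^*(\omega)\,E(\omega)\,d\omega$$

where the last equation is Parseval's theorem. Dirac delta functions are defined
by the following integrals:

$$2\pi\,\delta(t - t') = \int_R e^{j\omega(t - t')}\,d\omega, \quad\text{and}\quad 2\pi\,\delta(\omega - \omega') = \int_R e^{j(\omega - \omega')t}\,dt$$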

[Figure 8.25: a signal slot of duration $T_{slot}$ convolved with a PMD impulse response of width $T_{PMD}$; the output spans $T_{slot} + T_{PMD}$ with an undistorted center of duration $T_{slot} - T_{PMD}$. Horizontal axis: time.]

Fig. 8.25. Polarization-independent convolution of waveform envelope with PMD
impulse response. The convolution broadens the output signal by $T_{PMD}$ and reduces
the duration of the undistorted portion of the signal to $T_{slot} - T_{PMD}$. The details
of the distortion in the transition regions depend on the interference of co-polarized
signal images. These details can be calculated only when the precise nature of the
PMD and signal characteristic are known.

polarization state of the output field are not necessarily separable; that is,
$e_t(t) = f_t(t)\,|t\rangle$ is not necessarily true. Rather, the waveform and polarization
state are entangled. Such is the effect of depolarization: while each spectral
component has a distinct polarization state, the inverse Fourier transform
into the time domain folds the polarization states together so that on every
time interval multiple polarization states exist; the degree of polarization is
accordingly reduced (cf. §1.5.3).
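The simplest instance of (8.2.42) is a single homogeneous section. The following sketch assumes the Jones matrix is written in the basis of the birefringent axes with the common phase dropped, that the first component is the slow axis, and that the factor of $2\pi$ from the transform-pair normalization is absorbed into the convolution symbol; $s_1$ and $s_2$ denote the components of $|s\rangle$ in that basis.

```latex
% Single section with DGD tau, in its own eigenbasis:
T(\omega) = \begin{pmatrix} e^{-j\omega\tau/2} & 0 \\ 0 & e^{+j\omega\tau/2} \end{pmatrix}
\quad\Rightarrow\quad
t_{11}(t) \propto \delta(t - \tau/2), \quad
t_{22}(t) \propto \delta(t + \tau/2), \quad
t_{12} = t_{21} = 0

% Convolution with a delta function only shifts the envelope:
e_t(t) = \begin{pmatrix} f(t - \tau/2)\, s_1 \\ f(t + \tau/2)\, s_2 \end{pmatrix}
```

The output factors into a single envelope times a single polarization state only when $s_1 = 0$ or $s_2 = 0$, that is, for launch along one of the birefringent axes; for any mixed launch the two components carry differently shifted envelopes and the field is not separable.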
The elements that affect the output waveform are present in (8.2.42): the
input state $|s\rangle$, the impulse response of the PMD $t_{ij}(t)$, and its convolution
with the waveform. It is hard to make generalizations about a system that can
be arbitrarily complex. Instead, the following presents a few case studies to
show different aspects of general PMD, first-order PMD, second-order PMD,
and higher-order PMD.
For an arbitrary birefringent concatenation with low overall PMD, signal
distortion starts at the edges. Figure 8.25 illustrates schematically the effect
of a signal "one" convolved with a simple PMD impulse response; the
illustration is polarization independent but two of the four impulses are orthogonal
to the others. Pulse images in the same polarization state coherently combine
either constructively or destructively depending on the relative phase of the
corresponding impulses. The regions of these combinations are at either
transition edge of the signal. The temporal extent of the transition regions equals
the width of the PMD impulse response $T_{PMD}$. Due to the convolution, the
overall signal duration is increased by $T_{PMD}$ and the duration of the center
part of the signal is $T_{slot} - T_{PMD}$.
When there is a large number of impulses (there are $2^N$ impulses
for N birefringent segments), the Gaussian distribution of impulses broadens
the transition edges by an amount related to the standard deviation of the
distribution. The calculation by Gisin and Pellaux [19], detailed in the third
subsection, shows that the standard deviation of the impulse response equals
the mean DGD.
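The count of $2^N$ impulses can be made concrete by enumerating arrival times. The sketch below assumes each section contributes either $+\tau_k/2$ or $-\tau_k/2$ to the arrival time of an impulse, measured from the mean delay, and it ignores the impulse weights, which depend on the mode mixing between sections; `impulse_times` is an illustrative name.

```python
from collections import Counter
from itertools import product

def impulse_times(delays):
    """All arrival times, relative to the mean delay, for sections with the
    given DGDs; each section adds +d/2 (slow path) or -d/2 (fast path)."""
    return [sum(sign * d / 2 for sign, d in zip(signs, delays))
            for signs in product((-1, 1), repeat=len(delays))]

unequal = impulse_times([1.0, 2.0, 4.0])
print(len(unequal), sorted(unequal))

equal = Counter(impulse_times([1.0, 1.0, 1.0, 1.0]))
print(sorted(equal.items()))
```

For unequal delays all $2^N$ arrival times are distinct. For equal delays they coincide on $N + 1$ time slots with binomial multiplicity, which crowds the impulses toward the center of the response in the manner of the bell-shaped distribution described above.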
First-order PMD is distinct from all other forms because the output waveform is the sum of two identically shaped, orthogonally polarized, time-shifted

[Figure 8.26: panels (a)–(e). Output intensity (0 to 1.0) versus time (ps, 0 to 3000) for launch along the slow axis, along the fast axis, and at 45°; waveforms shifted by $\pm\tau/2$; analyzer outputs shown in (d) and (e).]

Fig. 8.26. Time response to first-order PMD. a–b) Output signal waveforms delayed
and advanced by $\tau/2$; the corresponding launch conditions are (a) and (b). The
output signals are undistorted. c) Distorted output waveform due to launch at 45°
with respect to either birefringent axis. When analyzed by output polarizer aligned
to either slow or fast axis, the original waveform is recovered, (d) and (e).

copies of the input. Pure first-order PMD comes only from a single
birefringent section. Figure 8.26 illustrates the extrema conditions. When the
input polarization state is aligned to either the slow or fast birefringent axis,
Figs. 8.26(a) and (b) respectively, the output signal is either delayed or
advanced with respect to the average delay, (a) and (b). All the light remains in
a single polarization state.
Alternatively, when the input polarization state is equally divided between
slow and fast axes (Fig. 8.26(c)), the signal is equally divided between the
two axes and the two halves are time-shifted relative to one another. The
square-law detected output is distorted as shown in Fig. 8.26(c). Now, when an
analyzer is placed at the output and aligned to the fast axis (d), all the light
from the slow axis is clipped; only the signal along the fast axis emerges. The
shape of the analyzed output waveform is identical to the input waveform but
with 3 dB less intensity (d). Similarly, when an analyzer is aligned to the slow
axis (e), the analyzed output waveform is also identical to the input but
time-delayed by $\tau$ (e).
Finally, the polarization dependence of the distortion is evident from
(a–c): launch along either birefringent axis leaves the signal undistorted, while
the equally-mixed state launch maximally distorts the signal. The distortion
is first-order launch-state dependent.
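The launch-state dependence can be quantified with a small model. The sketch below is a minimal illustration, assuming a Gaussian field envelope, a launch at Jones-space angle `launch_angle` from the slow axis, and orthogonal components that add in intensity at a square-law detector; the envelope shape, the 50 ps width and the 40 ps DGD are illustrative values, not taken from the text.

```python
import math

SIGMA = 50.0  # ps, illustrative field-envelope width

def envelope(t):
    """Gaussian field envelope f(t)."""
    return math.exp(-t * t / (2.0 * SIGMA * SIGMA))

def analyzed_intensity(t, dgd, launch_angle, axis):
    """Intensity behind an analyzer aligned to the 'slow' or 'fast' axis."""
    if axis == "slow":
        amplitude = math.cos(launch_angle) * envelope(t - dgd / 2.0)
    else:
        amplitude = math.sin(launch_angle) * envelope(t + dgd / 2.0)
    return amplitude * amplitude

def output_intensity(t, dgd, launch_angle):
    """Square-law detected output: the two orthogonal copies add in intensity."""
    return (analyzed_intensity(t, dgd, launch_angle, "slow")
            + analyzed_intensity(t, dgd, launch_angle, "fast"))

def mean_and_variance(dgd, launch_angle, span=500, dt=1.0):
    """First and second central temporal moments of the output intensity."""
    times = [k * dt for k in range(-span, span + 1)]
    weights = [output_intensity(t, dgd, launch_angle) for t in times]
    total = sum(weights)
    mean = sum(t * w for t, w in zip(times, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(times, weights)) / total
    return mean, var

print(mean_and_variance(40.0, 0.0))           # launch along the slow axis
print(mean_and_variance(40.0, math.pi / 4))   # equally mixed launch
```

Launch along the slow axis shifts the pulse by $\tau/2$ without changing its variance, whereas the equally mixed launch leaves the mean unchanged and adds $(\tau/2)^2$ to the variance; behind either analyzer the original shape is recovered at half the intensity.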
With this one example complete, the reader is warned that DGD is not
PMD; DGD is one aspect of PMD. It is all too often forgotten or ignored that
second-order and higher-order PMD have significant and characteristically
different effects on a signal. Examples of significant activities that have been
conducted under the "PMD is DGD" misconception are optical and electronic
PMD compensator development; PMD emulator construction; PMD
measurement; and PMD outage probability calculations. At least second-order PMD
must be considered, and in fact the mean DGD of a link must also be folded
into the analysis. Anything short of this richer set of considerations will likely
render the calculation or product ineffective for industry applications.
The first venture into second-order PMD (SOPMD) comes from two
birefringent sections alone. These two birefringent sections impart
depolarization as well as DGD onto the signal; the second component of SOPMD, the
polarization-dependent chromatic dispersion (PDCD), is identically zero for
two sections. The distortion of a signal due to second-order PMD is typified
by overshoot and false floors. Moreover, the output waveform can never be
resolved into two identical, time-shifted copies of the input, as was the case
for first-order PMD alone.
Figure 8.26 showed that launch of a signal along a birefringent axis (or
equivalently, an input PSP) into a one-stage system left the output signal
undistorted. This is not true when SOPMD is present. Figure 8.27(a) shows
the output distortion when the signal is launched along an input PSP. The
output exhibits overshoot and false floors. When an analyzer aligned to either
output PSP is placed after the two birefringent sections one observes light
along both polarizations. Fig. 8.27(b) shows the analyzed components of the
output signal, and (c) shows an excerpt. The evident effect is that the light

[Figure 8.27: panels (a)–(d). (a), (b) Intensity (0 to 1.5) versus time (ps, 0 to 3000), with an excerpt marked; (c) amplitude of the excerpt along PSP_out and its orthogonal state; (d) Stokes-space plot (axes S1, S2, S3) of the launch state $\hat{s}_{in}$, the input PSP spectrum PSP_in($\omega$), the output signal polarization, PSP_out($\omega_o$), the analyzer axis, and the signal spectral density.]

Fig. 8.27. Time-response to second-order PMD with only depolarization. a)
Distorted output signal when launched state is aligned to the input PSP at a center
frequency. b) Output signal analyzed by a polarizer aligned along the center-frequency
output PSP and its orthogonal state. Light comes through the polarizer in both
states; the output cannot be constructed from two time-shifted identical copies of
the input. c) Blowup of excerpted signal, plotted in amplitude rather than intensity
to highlight the relation of the waveforms. The transition edges of the input signal
have in part been rotated down to the orthogonal polarization axis. d) Input launch
state $\hat{s}_{in}$, input PSP spectrum, and output signal polarization spectrum.

on the orthogonal PSP axis coincides with the transition edges of the input
waveform.
Along the transition edges the high-frequency components of the signal
spectrum have their phases aligned; one can say the high-frequency Fourier
components of the signal dominate at the edges, while the low-frequency
components dominate at the centers of the ones and zeros. With this in mind,
the polarization spectra of the input PSP and output signal polarization are
shown in Fig. 8.27(d). The launch polarization state is fixed in frequency,
but the input PSP is not. At a center frequency the input PSP coincides
with the launch state (here, not in general), but the PSP vector traces a
circle in frequency. Accordingly, frequency components of the input signal are
mapped onto a locus of polarization states at the output; this is called
polarization-state dispersion. The output signal spectrum is drawn in the
figure. Overlaid with the output signal spectrum are small circles that indicate
the amplitude of the signal spectrum. The signal spectrum is densely packed
about PSP_out($\omega_o$), but at higher and lower relative frequencies the output
polarization makes large excursions. Thus the high-frequency components of
the signal are misaligned to the output analyzer (when aligned to the output
PSP) and come through. Those high-frequency components appear on the
transition edges of the signal.
The impulse response of two birefringent sections consists of two
impulses aligned along one axis and two impulses aligned along an orthogonal
axis. This is shown in Fig. 8.28(a) for two equal delays. The output signal
is the convolution of the input signal with this impulse response. The effect
of interference, resulting in coherent addition or cancellation, of co-polarized
signal images can now be well illustrated. The three columns in the center of
Fig. 8.28 show how co-polarized signal images add for three different residual
birefringent phases in the first delay section. When $\varphi = 0^\circ$, the impulses
along the u-polarization have zero phase difference. Therefore when the
convolved signal images overlap the underlying carriers are in phase and add
(Fig. 8.28(c–d)). The phase relation is indicated by "+" signs. However, along
the v-polarization the impulses are out of phase by 180°; when the signal
images overlap the fields subtract, indicated by the "−" sign, but when they do
not overlap the partial signal images come through (Fig. 8.28(d)). A
square-law detector adds the orthogonal components in quadrature. The waveform
in Fig. 8.28(e) for $\varphi = 0^\circ$ shows the result.
In the case when $\varphi = 90^\circ$, the phase difference between both pairs of
co-polarized impulses is zero. In this case the signal images along the u-axis add
as do the signal images along the v-axis. The resultant waveform is shown
in Fig. 8.28(e). Finally, when $\varphi = 180^\circ$ the u-axis impulses are out of phase
while the pair on the v-axis are in phase. The result is the mirror image of
the $\varphi = 0^\circ$ waveform about $\tau_o$.
Interference of co-polarized signal images plays a central role in how the
PMD manifests itself on an input signal. A slight phase change clearly can
change the entire shape of the signal. When the width of the PMD impulse

[Figure 8.28, panels (a)-(g): axes are time and amplitude; columns for $\phi = 0^\circ$, $90^\circ$, and $180^\circ$; panel (f) shows random input states on the Poincaré sphere ($S_1$, $S_2$, $S_3$); panel (g) is the eye diagram for $\phi = 0^\circ$ with the eye closure marked.]

Fig. 8.28. Interference of an impulse response from two stages. a) Two stages
generate an impulse response with two co-polarized impulses along a u-axis and two more
co-polarized impulses along a v-axis. c-e) Four signal images as convolved onto the
impulse response. When two co-polarized impulses are aligned in phase the signal
images add (denoted by +); when the same impulses are out of phase the signal
images subtract (denoted by −). The complete output signal as measured by a
square-law detector adds the coherently-added components in quadrature. f) Randomly generated launch states, and g) eye diagram of cumulative distortion.

response is on the order of the signal duration, the expression of coherent
addition deep within a signal "one" or "zero" depends on this phase; when
the phases cancel, the distortion will take place closer to the transitions, but
when the phases add the distortion will be toward the center. In either case
the temporal impulse locations do not change more than a fraction of a wave.
One aspect that is similar to first-order PMD is that signal distortion is
critically dependent on launch state. Every launch state produces a different

[Figure 8.29, panels (a)-(e): (a) intensity versus time (ps); (b) DGD (ps), SOPMD (ps²), and PDCD (ps²) versus relative frequency (GHz), with the signal spectrum overlaid; (c) input PSP on the Poincaré sphere ($S_1$, $S_2$, $S_3$), with PSPin($\omega_o$) marked; (d) eye diagram versus time (ps); (e) four delay sections $\tau$ with mode-mixing angles $\theta = 0^\circ$, $27^\circ$, $128^\circ$, $154^\circ$.]

Fig. 8.29. Signal distortion for DGD with low average SOPMD. a) Comparison of
distorted signal to input signal. Launch state is aligned to the input PSP at band
center, indicated in (c). Four 25 ps delay sections are concatenated with mode-mixing
angles shown in (e). Resultant scalar PMD spectra are shown in (b). For comparison,
the signal spectrum is overlaid with the DGD spectrum. The eye diagram (d) is
calculated from uniformly distributed random launch states.

distortion effect, although degeneracies may exist. Figure 8.28(g) shows the
calculation of an eye diagram for 64 randomly and uniformly distributed input
states. Since each delay section is 25 ps long, the outer 25 ps of the pulse are
completely blurred. The overshoot is also evident. This gives a flavor of why any
measurement of bit-error rate or eye-margin penalty should be made while
scrambling the input polarization, and why the measurement must last until the
Poincaré sphere is reasonably covered by the input state.
Extension beyond two birefringent stages leads toward anecdotal examples
or a statistical treatment of PMD effects. A couple of important examples still
remain to be shown, even though they are two of a subset in a larger class.

[Figure 8.30, panels (a)-(e): (a) intensity versus time (ps); (b) DGD (ps), SOPMD (ps²), and PDCD (ps²) versus relative frequency (GHz), with the signal spectrum overlaid; (c) input PSP on the Poincaré sphere ($S_1$, $S_2$, $S_3$), with PSPin($\omega_o$) marked; (d) eye diagram versus time (ps); (e) four delay sections $\tau$ with mode-mixing angles $\theta = 0^\circ$, $26^\circ$, $89^\circ$, $115^\circ$.]

Fig. 8.30. Signal distortion for low average DGD with finite SOPMD. a) Comparison of distorted signal to input signal. Launch state is perpendicular to the input
PSP at band center, the latter indicated in (c). Four 25 ps delay sections are concatenated with mode-mixing angles shown in (e). Resultant scalar PMD spectra are
shown in (b). For comparison, the signal spectrum is overlaid with the DGD spectrum. The eye diagram (d) is calculated from uniformly distributed random launch
states. The SOPMD creates large waveform distortions in the center region of the
eye.

The examples are signal distortion for DGD with low average SOPMD, and
for low average DGD with finite SOPMD. Figures 8.29 and 8.30 illustrate the
two cases. Both calculations show that even when one PMD component is
diminished with respect to the other, complicated distortions still occur.
The first example is DGD with low average SOPMD (Fig. 8.29). The four
birefringent sections, each 25 ps in this example, and their relative alignment
are shown in (e). This example is special because the SOPMD vanishes at band
center (b). The SOPMD can vanish when the output PSP pirouettes about a
stationary point as the frequency changes. The PSP vector eventually stops


its pirouette and continues on a course along the Poincaré sphere; the input
PSP spectrum is shown in (c). The launch state chosen for this example is the
input PSP at the pirouette position. The scalar spectra of the PMD condition
are shown in (b) (see footnote 4); the signal spectrum is overlaid with the DGD spectrum for
comparison. The output waveform is shown in (a); observe the large overshoots
and false floors even though the SOPMD is significant only at higher waveform
frequencies. An eye diagram for uniformly distributed random input launch
states is shown in (d). Since the four-section delay totals 100 ps, the width of
the impulse response is 100 ps, 50 ps of which extends into the interior of the
signal waveform. This is why the center region of the waveform is distorted.
However, the particular location of the signal spectrum with respect to the
PMD spectrum prevents the marginal impulses at either end of the PMD
impulse response from having strong weight. Thus the eye center is not closed.
The second example is low average DGD with finite SOPMD (Fig. 8.30).
Here, the DGD vanishes at band center, while the SOPMD remains significant.
The DGD can vanish at one frequency when the component PMD vectors form
a closed loop in Stokes space. Since in this case there are four component
vectors, the closed loop is a square or rhombus. The depolarization must be
finite in this case, or else the closed loop of PMD component vectors would
not open back up (or close on itself in the first place). Nonetheless, even with
the signal spectrum aligned to the vanishing point of the DGD, the signal
distortion is significant. The launch state associated with the distortion in (a)
is perpendicular to the input PSP at band center. The input PSP spectrum
is shown in (c). Of particular interest is the eye diagram generated for a large
number of randomly selected launch states. Even when the average DGD is
low, significant distortion appears all across the pulse. This is caused by the
marginal impulses in the PMD impulse response having significant weight.
The eye center is still not closed, though, because the average DGD is low.
Signal Moments and Distortion
One can make some general statements about the effects of PMD by looking
at the moments of the signal waveform in the time domain.
Karlsson calculates the first and second moments of a signal waveform
when affected by PMD [38]. Gordon offers details of Karlsson's presentation [20]. The main results are that: 1) the differential-group delay is the
maximum possible delay of a narrowband signal launched into the fast and
slow input PSPs; all other launch conditions yield a first moment less than
the DGD; 2) there is a minimum, non-zero pulse broadening in the presence of
second-order PMD, principally due to the rotation of the PSPs in frequency.
The first and second moments of the waveform are calculated by
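\[
\langle t \rangle = \frac{1}{W}\int_{\mathbb{R}} t\, e^{\dagger}(t)\, e(t)\, dt
 = \frac{2\pi j}{W}\int_{\mathbb{R}} E^{\dagger}(\omega)\, E_{\omega}(\omega)\, d\omega \tag{8.2.43a}
\]
\[
\langle t^{2} \rangle = \frac{1}{W}\int_{\mathbb{R}} t^{2}\, e^{\dagger}(t)\, e(t)\, dt
 = \frac{2\pi}{W}\int_{\mathbb{R}} E_{\omega}^{\dagger}(\omega)\, E_{\omega}(\omega)\, d\omega \tag{8.2.43b}
\]
where the subscript $\omega$ denotes differentiation with respect to frequency.

[Footnote 4: Appendix C shows how to efficiently calculate the vector and scalar spectra for
an arbitrary birefringent concatenation.]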

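The first moment at the input is simply
\[
\langle t_s \rangle = \frac{2\pi j}{W}\int_{\mathbb{R}} E_s^{\dagger} E_{s\omega}\, d\omega
 = \frac{2\pi j}{W}\int_{\mathbb{R}} f^{*}(\omega)\, f_{\omega}(\omega)\, d\omega
\]
If the waveform spectrum is real and symmetric then $\langle t_s \rangle = 0$. Since the
output-field spectrum is $E_t = \exp(-j\phi_o)\, U E_s$, where $\phi_o$ is the common phase
through the system and $U$ is the unitary operator for the cascade, the integrand to $\langle t_t \rangle$ at the output is
\[
E_t^{\dagger} E_{t\omega} = -j E_s^{\dagger}\left(\tau_o + \tfrac{1}{2}(\vec{\tau}_s\cdot\vec{\sigma})\right)E_s + E_s^{\dagger} E_{s\omega}
\]
where $\vec{\tau}_s$ is the input principal-state (PMD) vector (recall $2jU^{\dagger}U_{\omega} = \vec{\tau}_s\cdot\vec{\sigma}$) and $\tau_o$ is the
common delay through the system. Note that $\tau_o$ and $\vec{\tau}_s$ are both functions of
frequency. The first moment of the output waveform is therefore
\[
\langle t_t \rangle = \langle t_s \rangle + \frac{2\pi}{W}\int_{\mathbb{R}} E_s^{\dagger}\left(\tau_o + \tfrac{1}{2}(\vec{\tau}_s\cdot\vec{\sigma})\right)E_s\, d\omega
\]
\[
\phantom{\langle t_t \rangle} = \langle t_s \rangle + \frac{2\pi}{W}\int_{\mathbb{R}} |f(\omega)|^{2}\left(\tau_o + \tfrac{1}{2}(\vec{\tau}_s\cdot\hat{s})\right) d\omega \tag{8.2.44}
\]
The mean signal delay $\tau_g$ through the medium is simply the difference of first
moments:
\[
\tau_g = \langle t_t \rangle - \langle t_s \rangle \tag{8.2.45}
\]
This can be expressed in spectrally averaged form as
\[
\tau_g = \left\langle \tau_o(\omega)\, |f(\omega)|^{2} \right\rangle
 + \tfrac{1}{2}\left\langle \vec{\tau}_s(\omega)\cdot\hat{s}(\omega)\, |f(\omega)|^{2} \right\rangle \tag{8.2.46}
\]
where $\hat{s}(\omega)$ allows for the possibility of frequency dependence of the input
state. Physically, the mean signal delay is the normalized spectral average
of the common delay weighted by the waveform envelope, plus the spectral
average of the input launch polarization as projected onto the input PSP,
again weighted by the waveform envelope.
An important observation is that the phase of the waveform envelope $f(t)$
is eliminated in (8.2.46); initial chirp or chromatic dispersion of the pulse does
not affect its average position at the output.
Consider two extreme cases for (8.2.46): monochromatic input, and any
input to a single birefringence segment. For monochromatic input, the waveform input is a sine wave, so in frequency $f(\omega) = \delta(\omega - \omega_o)$. The mean signal
delay for any concatenation is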

\[
\tau_g = \tau_o + \tfrac{1}{2}\,\tau\,(\hat{s}\cdot\hat{r}_s) \tag{8.2.47}
\]

where each term is evaluated at $\omega_o$. This expression is the main result which
connects the mean signal delay to the spectral description of PMD. The mean
delay at $\omega_o$ is $\tau_g = \tau_o \pm \tau/2$ when the launch state is parallel or perpendicular
to the input PSP. Any intermediate launch condition produces a first moment
that is between these extrema. Accordingly, one can say that the DGD is the
maximum delay at a particular frequency between the fast and slow axes of
a cascade. This interpretation was used in the time-frequency correspondence
figure (Fig. 8.18 on page 326).
When there is only one homogeneous birefringent element then $\tau_o$ and $\vec{\tau}_s$
are stationary with frequency; the mean signal delay has the same form
as (8.2.47). This is an important connection to the impulse response of a
cascade which is considered in the next section.
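The pulse spreading between the output and input can be measured by
the second moments of the signal. The pulse spread is defined as
\[
\sigma^{2} = \left(\langle t_t^{2} \rangle - \langle t_s^{2} \rangle\right) - \left(\langle t_t \rangle - \langle t_s \rangle\right)^{2} \tag{8.2.48}
\]
The second moment of the input waveform is
\[
\langle t_s^{2} \rangle = \frac{2\pi}{W}\int_{\mathbb{R}} E_{s\omega}^{\dagger} E_{s\omega}\, d\omega
 = \frac{2\pi}{W}\int_{\mathbb{R}} |f_{\omega}(\omega)|^{2}\, d\omega \tag{8.2.49}
\]
At the output, $E_{t\omega} = T_{\omega} E_s + T E_{s\omega}$, where $T = e^{-j\phi_o} U$. The difference between the output and input
second-moment integrands is
\[
E_{t\omega}^{\dagger} E_{t\omega} - E_{s\omega}^{\dagger} E_{s\omega}
 = E_s^{\dagger} T_{\omega}^{\dagger} T_{\omega} E_s + E_s^{\dagger} T_{\omega}^{\dagger} T E_{s\omega} + E_{s\omega}^{\dagger} T^{\dagger} T_{\omega} E_s
\]
The frequency derivative of the transformation matrix is
\[
T_{\omega} = -j e^{-j\phi_o}\, U \left(\tau_o + \tfrac{1}{2}\,\vec{\tau}_s\cdot\vec{\sigma}\right)
\]
where the substitution $jU^{\dagger}U_{\omega} = \tfrac{1}{2}\,\vec{\tau}_s\cdot\vec{\sigma}$ was made. Since $jU^{\dagger}U_{\omega}$ is Hermitian,
the matrix product is $T_{\omega}^{\dagger} T_{\omega} = \left(\tau_o + \tfrac{1}{2}\,\vec{\tau}_s\cdot\vec{\sigma}\right)^{2}$, or
\[
T_{\omega}^{\dagger} T_{\omega} = \tau_o^{2} + \tfrac{1}{4}\tau^{2} + \tau_o\,(\vec{\tau}_s\cdot\vec{\sigma})
\]
where the identity $(\vec{\tau}_s\cdot\vec{\sigma})(\vec{\tau}_s\cdot\vec{\sigma}) = \tau^{2}$ was used. Therefore
\[
E_s^{\dagger} T_{\omega}^{\dagger} T_{\omega} E_s = |f|^{2}\left(\tau_o^{2} + \tfrac{1}{4}\tau^{2} + \tau_o\,(\vec{\tau}_s\cdot\hat{s})\right)
\]
For the remaining two terms, one has $T^{\dagger} T_{\omega} = -j\left(\tau_o + \tfrac{1}{2}\,\vec{\tau}_s\cdot\vec{\sigma}\right)$. The sum is
\[
E_s^{\dagger} T_{\omega}^{\dagger} T E_{s\omega} + E_{s\omega}^{\dagger} T^{\dagger} T_{\omega} E_s
 = j\left(f^{*} f_{\omega} - f_{\omega}^{*} f\right)\left(\tau_o + \tfrac{1}{2}\,\vec{\tau}_s\cdot\hat{s}\right)
\]
Expanding the complex waveform envelope into amplitude and phase,
$f(\omega) = a(\omega)\exp(-j\phi(\omega))$, one finds that
\[
j\left(f^{*} f_{\omega} - f_{\omega}^{*} f\right) = 2\,\phi_{\omega}\,|f|^{2}
\]
where $\phi_{\omega}$ is the frequency derivative of the waveform phase. Now that each
integrand has been reduced, the complete second moment of the pulse spread
is
\[
\langle t_t^{2} \rangle - \langle t_s^{2} \rangle
 = \frac{2\pi}{W}\int_{\mathbb{R}} |f(\omega)|^{2}\left(\tau_o^{2} + \tfrac{1}{4}\tau^{2} + \tau_o\,(\vec{\tau}_s\cdot\hat{s})\right) d\omega
 + \frac{4\pi}{W}\int_{\mathbb{R}} |f(\omega)|^{2}\,\phi_{\omega}(\omega)\left(\tau_o + \tfrac{1}{2}\,\vec{\tau}_s\cdot\hat{s}\right) d\omega \tag{8.2.50}
\]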
W R
The first term on the right-hand side is similar in form to (8.2.46), where the
waveform envelope intensity is the weighting factor to the spectral average of
the delay components. Like the first moment, this term of the second moment
does not depend on the phase of the waveform. The second term, however,
includes the derivative of the waveform phase. Therefore, the pulse spread is
related not only to the PMD but to the phase across the waveform as well.
For example, for a linear delay across the waveform, $\phi_{\omega}(\omega) \propto \omega$, and the
resultant multiplier in the integrand of the second term is equivalent to a
time-derivative of the waveform intensity. The pulse spread then depends on
the temporal details of the signal shape as well as the intensity spectrum.
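Consider an example of one section of birefringence, where the PMD is only
first order and the signal is real ($\phi_{\omega}(\omega) = 0$). The differences between output
and input first and second moments are then
\[
\langle t_t^{2} \rangle - \langle t_s^{2} \rangle = \tau_o^{2} + \tfrac{1}{4}\tau^{2} + \tau_o\,(\vec{\tau}_s\cdot\hat{s})
\]
\[
\langle t_t \rangle - \langle t_s \rangle = \tau_o + \tfrac{1}{2}\,(\vec{\tau}_s\cdot\hat{s})
\]
where $\hat{s}$ is the input launch state. Expansion of the frequency-independent dot
product between launch state and input PMD vector gives $(\vec{\tau}_s\cdot\hat{s}) = \tau\cos\theta$,
where $\theta$ is the Stokes angle between the vectors. The pulse spread is therefore
\[
\sigma = \tfrac{1}{2}\,\tau\sin\theta \tag{8.2.51}
\]
For comparison, the signal delay is
\[
\tau_g = \tau_o + \tfrac{1}{2}\,\tau\cos\theta
\]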
Figure 8.31 illustrates the example. The maximum arrival-time deviation
from $\tau_o$ occurs when the signal is launched along the fast or slow axis of the
birefringent element (Fig. 8.31(a)). In this case $\cos\theta = \pm 1$. There is no pulse
spreading with this condition. This is reasonable since all the energy is inserted into one eigenstate; no differential delay is experienced. However, if
the launch is at $45^\circ$ with respect to the birefringent axes ($\theta = 90^\circ$ in Stokes space), then $\tau_g = \tau_o$ and the pulse
spreading achieves its maximum at $\sigma = \tau/2$ (Fig. 8.31(b)). Clearly when the
DGD of the system approaches the time-slot duration of the signal then all
communication can be lost.
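As a quick numerical check of (8.2.51) and the accompanying delay expression, the following Python sketch models the single birefringent section as two orthogonally polarized impulses at $\tau_o \pm \tau/2$ with power weights $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$. This two-impulse picture, the function names, and the sample values (a 10 ps common delay and a 25 ps DGD) are illustrative assumptions and are not taken from the text.

```python
import math


def two_impulse_moments(tau_o, tau, theta):
    """Mean arrival time and rms spread of a two-impulse response.

    theta is the Stokes angle between the launch state and the input PMD
    vector. The launch power splits into the two eigenstates with weights
    cos^2(theta/2) and sin^2(theta/2); because the two copies are
    orthogonally polarized, their powers add without interference.
    """
    w_slow = math.cos(theta / 2.0) ** 2
    w_fast = math.sin(theta / 2.0) ** 2
    t_slow = tau_o + tau / 2.0
    t_fast = tau_o - tau / 2.0
    mean = w_slow * t_slow + w_fast * t_fast
    second = w_slow * t_slow ** 2 + w_fast * t_fast ** 2
    # Guard against a tiny negative variance caused by rounding.
    spread = math.sqrt(max(second - mean ** 2, 0.0))
    return mean, spread


def closed_form(tau_o, tau, theta):
    """Delay tau_o + (tau/2) cos(theta) and spread (tau/2) |sin(theta)|."""
    return tau_o + 0.5 * tau * math.cos(theta), 0.5 * tau * abs(math.sin(theta))


if __name__ == "__main__":
    for theta_deg in (0, 45, 90, 135, 180):
        theta = math.radians(theta_deg)
        print(theta_deg, two_impulse_moments(10.0, 25.0, theta),
              closed_form(10.0, 25.0, theta))
```

The enumeration and the closed forms agree because the mean of the two weighted arrival times is $\tau_o + \tfrac{1}{2}\tau(\cos^2(\theta/2) - \sin^2(\theta/2))$, and the variance is $\tfrac{1}{4}\tau^2(1 - \cos^2\theta)$.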

[Figure 8.31, panels (a)-(b): signal envelopes versus time for launch conditions $\hat{\tau}_s\cdot\hat{s} = +1$, $-1$, and $0$, with the first moments $\langle t_s \rangle$ and $\langle t_t \rangle$ and the arrival times $\tau_o - \tau/2$, $\tau_o$, and $\tau_o + \tau/2$ marked.]

Fig. 8.31. Illustration of signal envelope through a very simple high-birefringence
(first-order PMD only) system. a) When the signal is launched along a birefringent
axis it is either advanced or delayed with respect to the isotropic travel time $\tau_o$.
b) Three launch conditions: aligned to slow axis, aligned to fast axis, and equally
mixed between slow and fast. The waveform shapes represent amplitude and not
polarization. The mixed launch imparts pulse spreading.

Consider an example of two sections of birefringence (cf. Fig. 8.21). This
is the simplest form of second-order PMD: only depolarization exists. The
output PMD vector is calculated from the concatenation rule: $\vec{\tau}_t = \vec{\tau}_2 + R_2\vec{\tau}_1$.
The moment calculations use the input PSP so that the launch state $\hat{s}$ can
be projected. The corresponding input PMD vector is $\vec{\tau}_s = \vec{\tau}_1 + R_1^{\dagger}\vec{\tau}_2$. The
motion of the input PSP is illustrated in Fig. 8.21, but with the terms $\vec{\tau}_1$ and $\vec{\tau}_2$
interchanged. The input PSP traces a circle in Stokes space over frequency
while the cumulative DGD remains fixed; the period of the circle in frequency is
$2\pi/\tau_1$. The mean delay of the output waveform is
\[
\tau_g = \tau_o + \frac{2\pi}{W}\int_{\mathbb{R}} |f|^{2}\,\tfrac{1}{2}\left(\vec{\tau}_s(\omega)\cdot\hat{s}\right) d\omega
\]
Defining the integral
\[
I_s = \frac{2\pi}{W}\int_{\mathbb{R}} |f|^{2}\left(\hat{r}_s(\omega)\cdot\hat{s}\right) d\omega
\]
where $\hat{r}_s$ is the pointing direction of the input PSP, itself a function of frequency, the mean delay is then recast as
\[
\tau_g = \tau_o + \tfrac{1}{2}\,\tau\, I_s
\]
The cumulative PMD $\tau$ was extracted from the integral $I_s$ because its magnitude is fixed in frequency, as is the case for two birefringent stages. In a
similar manner, the square of the pulse spreading at the output is
\[
\sigma^{2} = \tau_o^{2} + \tfrac{1}{4}\tau^{2} + \tau_o\,\tau\, I_s - \tau_g^{2}
 = \tfrac{1}{4}\tau^{2}\left(1 - I_s^{2}\right)
\]
[Figure 8.32, panels (a)-(c): Poincaré spheres ($S_1$, $S_2$, $S_3$) showing the locus of the input PSP, PSPin($\omega$), the direction $\hat{r}_s(\omega_o)$, and the launch states $\hat{s}_1$, $\hat{s}_2$, and $\hat{s}_3$.]

Fig. 8.32. Three launch conditions into a depolarizing system. Two birefringent sections generate precession of the input PMD vector about $\vec{\tau}_1$. a) Launch state for maximum pulse spread, $\hat{s}_1 = \hat{\tau}_1 \times \hat{\tau}_2$. b) Intermediate launch state where $\hat{s}_2\cdot\hat{r}_s$ is constant in frequency. c) Launch state for largest minimum pulse spread, $\hat{s}_3 = \hat{r}_s(\omega_o)$.

Since $I_s$ enters the equation as a squared quantity, the pulse spread can only
decrease with $I_s$. The value of $I_s$ depends on the launch state at the input
and the degree of rotation of $\hat{r}_s$. Figure 8.32 illustrates three launch states, the
first and last states being the extrema. The maximum pulse spreading occurs
when $I_s = 0$. State $\hat{s}$ can always be selected to drive $I_s$ to zero. For a symmetric spectrum centered at $\omega_o$ and launch state $\hat{s}_1 = \hat{\tau}_1 \times \hat{\tau}_2$ (Fig. 8.32(a)),
where $\hat{\tau}_1$ and $\hat{\tau}_2$ are evaluated at $\omega_o$, the product $\hat{r}_s\cdot\hat{s}_1$ is antisymmetric.
Integral $I_s$ therefore vanishes and the pulse spread is $\sigma^{2} = \tau^{2}/4$.
An interesting albeit non-extremal case is where the inner product is fixed
in frequency. This occurs when $\hat{s}$ is aligned with $\vec{\tau}_1$ (Fig. 8.32(b)). In this case
$I_s = \cos\theta$. Only when the two birefringent axes are aligned does the pulse
spread reach zero. But this is simply the case of a single birefringent segment
made up of two parts. The general case is when there is mode mixing between
sections. Thus in general $\hat{s}_2$ is not a state that produces minimum pulse
spreading.
The launch state for minimum pulse spreading is illustrated in Fig. 8.32(c).
In general, $I_s$ will be less than unity, so the pulse always experiences non-zero
minimum spread. This contrasts with the first-order PMD situation where
launch along a PSP ensures zero spreading. The problem here is that the
PSPs move with frequency, so there is no single PSP to launch into.
One can calculate the largest minimum pulse spread. This case coincides
with maximum depolarization, which is when $\tau_1 = \tau_2$ and $\hat{r}_s\cdot\hat{s}_1 = 0$. For perpendicular $\hat{r}_1$ and $\hat{r}_2$, the
PMD vector at the input is
\[
\vec{\tau}_s/\tau_1 = \hat{r}_1 - \sin(\omega\tau_1)\,\hat{r}_1\times\hat{r}_2 - \cos(\omega\tau_1)\,\hat{r}_1\times\hat{r}_1\times\hat{r}_2
\]
and the launch state is $\hat{s}_3 = (\hat{r}_1 + \hat{r}_2)/\sqrt{2}$. Since the length of the PMD vector
is $\tau_s = \sqrt{2}\,\tau_1$, the frequency dependence of the inner product is
\[
\hat{s}_3\cdot\hat{r}_s = \tfrac{1}{2}\left(1 + \cos\omega\tau_1\right)
\]
In the regime where the bandwidth of the waveform is greater than $1/\tau_1$,
the frequency average of the inner product is driven to one-half. In this case
the minimum pulse spread is approximately (ignoring details of the waveform
spectrum) $\sigma^{2} \simeq (3/16)\,\tau^{2}$. The comparison between the two extrema is
\[
\begin{aligned}
\text{maximum pulse spread:}\quad & \hat{s} = \hat{\tau}_1 \times \hat{\tau}_2, & \sigma &= \tau/2 \\
\text{minimum pulse spread:}\quad & \hat{s} = \hat{r}_s(\omega_o), & \sigma &\simeq \sqrt{3}\,\tau/4
\end{aligned} \tag{8.2.52}
\]
Nearly the same pulse spreading is found for the best and worst launch
conditions. Clearly depolarization can have a strong effect on the waveform.
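The two extrema in (8.2.52) can be reproduced numerically. The sketch below is a minimal illustration under stated assumptions: two equal 25 ps sections with perpendicular Stokes axes, a flat signal spectrum covering exactly one period $2\pi/\tau_1$ of the PSP rotation, and a Rodrigues rotation with the sign convention of (8.2.55). The helper names and sample values are hypothetical and are not taken from the text.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate(axis, angle, v):
    """Rodrigues rotation of v about the unit vector axis by angle."""
    along = dot(axis, v)
    c1 = cross(axis, v)
    c2 = cross(axis, c1)
    return tuple(axis[i] * along + math.sin(angle) * c1[i]
                 - math.cos(angle) * c2[i] for i in range(3))


def input_psp(omega, tau1, r1, tau2, r2):
    """Unit vector along tau_s = tau_1 r1 + R_1^dagger (tau_2 r2)."""
    back = rotate(r1, -omega * tau1, r2)  # R^dagger is the inverse rotation
    vec = tuple(tau1 * r1[i] + tau2 * back[i] for i in range(3))
    norm = math.sqrt(dot(vec, vec))
    return tuple(x / norm for x in vec)


def spread_ratio(launch, tau1=25.0, samples=2000):
    """Return (I_s, sigma^2 / tau^2) for a flat spectrum over one period."""
    r1, r2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    total = 0.0
    for n in range(samples):
        omega = 2.0 * math.pi * n / (samples * tau1)
        total += dot(input_psp(omega, tau1, r1, tau1, r2), launch)
    i_s = total / samples
    return i_s, 0.25 * (1.0 - i_s ** 2)


S1 = cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))           # launch for maximum spread
S3 = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0), 0.0)  # PSP direction at band center

if __name__ == "__main__":
    print("s1:", spread_ratio(S1))
    print("s3:", spread_ratio(S3))
```

Under these assumptions the ratio $\sigma^2/\tau^2$ comes out as 1/4 for $\hat{s}_1$ and 3/16 for $\hat{s}_3$, so the rms spreads differ only by the factor $\sqrt{3}/2$.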
PMD Impulse Response
Gisin and Pellaux deduced the formal connection between the PMD impulse
response and the DGD spectrum of a lossless birefringent cascade [19]. Their
result shows that the root-mean-square (rms) DGD is equal to the standard
deviation of the impulse response. The derivation is elegant and yet the
result seems to be underrepresented in the subsequent literature.
The derivation constructs a recurrence relation for the frequency average of
the mean-squared DGD, denoted $\langle \tau^{2}(\omega) \rangle_{\omega}$, and a separate recurrence relation
for the second moment of the impulse response, denoted $\langle h^{2}(t) \rangle_{t}$, from the
same concatenation. The recurrence relations are then shown to be equivalent.
Consider a concatenation of $N$ homogeneous birefringent sections, each
section having DGD of $\tau_n$ and birefringent-vector orientation $\hat{r}_n$. In the frequency domain, the two-block PMD concatenation rule (8.2.33) on page 335 is
written to relate the last element, element $N$, to the preceding $N-1$ elements
as
\[
\vec{\tau}(N) = \vec{\tau}_N + R_N\,\vec{\tau}(N-1) \tag{8.2.53}
\]
where $\vec{\tau}(N)$ denotes the cumulative PMD through section $N$ and $\vec{\tau}_N$ denotes
the $N$th local birefringent element. The cumulative vector $\vec{\tau}(N)$ is clearly a
function of frequency while the local elements, such as $\vec{\tau}_N$, are not. The dot
product of $\vec{\tau}$ with itself provides the DGD squared, in general $\tau^{2} = \vec{\tau}\cdot\vec{\tau}$. The
DGD squared for $\vec{\tau}(N)$ from recurrence relation (8.2.53) is
\[
\tau^{2}(N) = \tau_N^{2} + \tau^{2}(N-1) + 2\,\vec{\tau}_N\cdot\vec{\tau}(N-1)
\]
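The concatenation rule (8.2.53) translates directly into a short loop. The following Python sketch is illustrative only: it assumes each section rotates the accumulated PMD vector about its own axis through the angle $\omega\tau_n$, using a Rodrigues rotation with the sign convention of (8.2.55), and it takes delays in ps and frequency in rad/ps. The function names and the two-section example are assumptions rather than part of the original text.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate(axis, angle, v):
    """Rodrigues rotation: r(r.v) + sin(angle) r x v - cos(angle) r x (r x v)."""
    along = dot(axis, v)
    c1 = cross(axis, v)
    c2 = cross(axis, c1)
    return tuple(axis[i] * along + math.sin(angle) * c1[i]
                 - math.cos(angle) * c2[i] for i in range(3))


def cumulative_pmd(sections, omega):
    """Apply tau(N) = tau_N + R_N tau(N-1) section by section.

    sections is a list of (tau_n, r_n) pairs, r_n being a unit Stokes vector.
    """
    tau_vec = (0.0, 0.0, 0.0)
    for tau_n, r_n in sections:
        rotated = rotate(r_n, omega * tau_n, tau_vec)
        tau_vec = tuple(tau_n * r_n[i] + rotated[i] for i in range(3))
    return tau_vec


def dgd(sections, omega):
    """Differential group delay: the length of the cumulative PMD vector."""
    vec = cumulative_pmd(sections, omega)
    return math.sqrt(dot(vec, vec))


if __name__ == "__main__":
    two_sections = [(25.0, (1.0, 0.0, 0.0)), (25.0, (0.0, 1.0, 0.0))]
    for omega in (0.0, 0.05, 0.1, 0.2):
        print(omega, dgd(two_sections, omega))
```

For two sections the rotation $R_N$ leaves $\vec{\tau}_N\cdot\vec{\tau}(N-1)$ unchanged, so the computed DGD stays fixed in frequency while the vector itself precesses, which matches the two-section behavior described earlier.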

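The DGD-squared spectrum, $\tau^{2}(N;\omega)$, is then averaged over all frequency to
find its mean square. The average is written
\[
\langle \tau^{2}(N) \rangle_{\omega} = \tau_N^{2} + \langle \tau^{2}(N-1) \rangle_{\omega} + 2\,\langle \vec{\tau}_N\cdot\vec{\tau}(N-1) \rangle_{\omega}
\]
The value $\tau_N^{2}$ comes through the average as it is just a number. The last term on
the right-hand side may be further reduced by applying (8.2.53) to the $N-1$
term:
\[
\begin{aligned}
\langle \vec{\tau}_N\cdot\vec{\tau}(N-1) \rangle_{\omega}
 &= \langle \vec{\tau}_N\cdot\left(\vec{\tau}_{N-1} + R_{N-1}\,\vec{\tau}(N-2)\right) \rangle_{\omega} \\
 &= \vec{\tau}_N\cdot\vec{\tau}_{N-1} + \langle \vec{\tau}_N\cdot R_{N-1}\,\vec{\tau}(N-2) \rangle_{\omega}
\end{aligned} \tag{8.2.54}
\]
The evaluation of the last term on the right-hand side is at the heart of the
derivation. Expansion of the last term to include the rotation operator $R_{N-1}$
gives
\[
\begin{aligned}
R_{N-1}\,\vec{\tau}(N-2) ={}& \hat{r}_{N-1}\left(\hat{r}_{N-1}\cdot\vec{\tau}(N-2)\right) \\
 &+ \sin(\omega\tau_{N-1})\;\hat{r}_{N-1}\times\vec{\tau}(N-2) \\
 &- \cos(\omega\tau_{N-1})\;\hat{r}_{N-1}\times\hat{r}_{N-1}\times\vec{\tau}(N-2)
\end{aligned} \tag{8.2.55}
\]
When frequency averaged, the last two terms include sine and cosine factors that average to zero. Even though $\vec{\tau}(N-2)$ itself generally varies with frequency, for
a concatenation with enough segments the frequency average will eventually
drive these terms to zero. In contrast, the disposition of the first term on the
right-hand side is not so clear, so it remains. Insertion of (8.2.55) into (8.2.54) and
some manipulation produces
\[
\langle \vec{\tau}_N\cdot\vec{\tau}(N-1) \rangle_{\omega}
 = \tau_N\,(\hat{r}_N\cdot\hat{r}_{N-1})\left(\tau_{N-1} + \langle \hat{r}_{N-1}\cdot\vec{\tau}(N-2) \rangle_{\omega}\right)
\]
The complete recurrence relation in the frequency domain is therefore
\[
\langle \tau^{2}(N) \rangle_{\omega} = \tau_N^{2} + \langle \tau^{2}(N-1) \rangle_{\omega}
 + 2\,\tau_N\,(\hat{r}_N\cdot\hat{r}_{N-1})\left(\tau_{N-1} + \langle \hat{r}_{N-1}\cdot\vec{\tau}(N-2) \rangle_{\omega}\right) \tag{8.2.56}
\]
The form of this expression has fully identified the effect of $\vec{\tau}_N$ on $\vec{\tau}(N-1)$.
It is this recurrence relation that will be compared to the impulse-response
relation.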
Now for the impulse response. Each impulse has a position in time and a
weight. As there is no loss, the sum of all the weights remains fixed regardless
of the number of sections in the concatenation. The time position and weight
of each impulse depend on the path the light takes. If there are $n$ sections,
there are $2^{n}$ paths and $2^{n}$ impulses at the output. Each one has to be enumerated to construct the impulse response. The time position of the $k$th pulse
with respect to the common delay through the concatenation is
\[
t_k = \epsilon_1(k)\,\tau_1 + \epsilon_2(k)\,\tau_2 + \dots + \epsilon_N(k)\,\tau_N \tag{8.2.57}
\]
where $\epsilon_i = \pm 1$ depending on whether the impulse travels along the slow or fast
axis of the $i$th segment (footnote 5). To enumerate each path only once, the binary equivalent of the decimal index is useful. If for each index integer $k \in [0, 2^{N}-1]$
the integer is converted to its binary form $k \to b_1 b_2 \cdots b_N$, then the path
selector $\epsilon$ is defined by $\epsilon_i = 1 - 2b_i$. The path selector is further indexed by $k$
so that $\epsilon_i(k)$ is generated by the $i$th binary digit of the binary representation
of the index $k$.

[Footnote 5: The factor of one-half is dropped to conform with the PMD vector definition
$\vec{\tau} = \tau\hat{p}$ rather than $\vec{\tau} = (\tau/2)\,\hat{p}$.]

The weight of the $k$th impulse is determined by the projection of adjacent
birefringent axes and whether the impulse is going between fast axis in one
segment to the fast axis in the next, or between fast and slow, slow and fast,
or slow and slow. The weight of the impulse is
\[
w(k) = \frac{1}{2}\prod_{n=2}^{N}\frac{1}{2}\left[1 + \epsilon_{n-1}(k)\,\epsilon_n(k)\;\hat{r}_{n-1}\cdot\hat{r}_n\right] \tag{8.2.58}
\]
where the first one-half factor comes from a $45^\circ$ launch into the first element,
that is $\hat{s}_{in}\cdot\hat{r}_1 = 0$. The weights are normalized such that
\[
\sum_{k=1}^{2^{N}} w(k) = 1
\]

The impulse response of the birefringent concatenation is
\[
h(t, N) = \sum_{k=1}^{2^{N}} w(k)\,\delta(t - t_k)
\]
where $\delta(t - t_o)$ is the Dirac delta function that has zero value everywhere but
at $t_o$. In the following the explicit time dependence of $h$ will be dropped. The
average position of the impulse response relative to the common delay is zero:
\[
\langle h(N) \rangle_{t} = \int_{\mathbb{R}} t\, h(t, N)\, dt = \sum_{k=1}^{2^{N}} w(k)\, t_k = 0 \tag{8.2.59}
\]
The pulse train is symmetric about $t = 0$. The second moment of $h(t, N)$
requires more work. The second moment is
\[
\langle h^{2}(N) \rangle_{t} = \int_{\mathbb{R}} t^{2}\, h(t, N)\, dt = \sum_{k=1}^{2^{N}} w(k)\, t_k^{2} \tag{8.2.60}
\]
The square of the $k$th time position is expanded with (8.2.57) to give
\[
\begin{aligned}
t_k^{2} &= \sum_{i=1}^{N}\sum_{j=1}^{N} \epsilon_i(k)\,\epsilon_j(k)\,\tau_i\,\tau_j \\
 &= \sum_{i=1}^{N} \tau_i^{2} + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} \epsilon_i(k)\,\epsilon_j(k)\,\tau_i\,\tau_j
\end{aligned}
\]
where the first term on the right-hand side of the second line is the sum along
the diagonal of the $N \times N$ matrix while the second term is the sum over the
upper triangle of the matrix. Substitution back into (8.2.60) gives
\[
\langle h^{2}(N) \rangle_{t} = \sum_{i=1}^{N} \tau_i^{2}
 + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} \tau_i\,\tau_j \sum_{k=1}^{2^{N}} \epsilon_i(k)\,\epsilon_j(k)\, w(k)
\]
where the normalization of $w(k)$ was applied to the first right-hand-side term.
The sum over $k$ in the second term has a significant simplification. The
binary-weight product $\epsilon_i(k)\epsilon_j(k)$ changes sign when counting over $k$ with a
frequency that depends on $j$. For instance with $N = 4$, $\epsilon_i(k)\epsilon_j(k)$ changes
sign for every increment of $k$ when $j = 4$, but changes sign for every two
increments when $j = 3$. Concurrently, $\epsilon_{n-1}(k)\epsilon_n(k)$ changes sign with a rate
when counting over $k$ related to $n$. The combined result is that all terms of
$\epsilon_{n-1}(k)\epsilon_n(k)$ that change sign at a counting rate faster than $\epsilon_i(k)\epsilon_j(k)$ vanish
from the sum over $k$ while the remaining terms add to a unity coefficient.
That is, all $n > j$ terms vanish.
The second moment of $h(N)$ simplifies to
\[
\langle h^{2}(N) \rangle_{t} = \sum_{i=1}^{N} \tau_i^{2}
 + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} \tau_i\,\tau_j\,(\hat{r}_i\cdot\hat{r}_{i+1})\cdots(\hat{r}_{j-1}\cdot\hat{r}_j)
\]

To construct a recurrence relation, the sum $\langle h^{2}(N) \rangle_{t}$ must be related to the
partial sum $\langle h^{2}(N-1) \rangle_{t}$. Recall that $\langle h^{2}(N) \rangle_{t}$ can be viewed as the element
sum over a symmetric $N \times N$ matrix, the first right-hand-side term being
the sum along the diagonal and the second term being the sum of the upper
triangle. The difference between $\langle h^{2}(N) \rangle_{t}$ and $\langle h^{2}(N-1) \rangle_{t}$ is therefore the
element sum of the $N$th column. Accordingly, the recurrence relation is
\[
\langle h^{2}(N) \rangle_{t} = \tau_N^{2} + \langle h^{2}(N-1) \rangle_{t}
 + 2\sum_{i=1}^{N-1} \tau_i\,\tau_N\,(\hat{r}_i\cdot\hat{r}_{i+1})\cdots(\hat{r}_{N-1}\cdot\hat{r}_N)
\]
or more succinctly,
\[
\langle h^{2}(N) \rangle_{t} = \tau_N^{2} + \langle h^{2}(N-1) \rangle_{t} + 2\,\tau_N\, S_N \tag{8.2.61}
\]
where
\[
S_N = (\hat{r}_N\cdot\hat{r}_{N-1})\left(\tau_{N-1} + (\hat{r}_{N-1}\cdot\hat{r}_{N-2})\left(\tau_{N-2} + \cdots\right)\right) \tag{8.2.62}
\]
Comparison of the frequency-average recurrence relation (8.2.56) and the
time-average recurrence relation (8.2.61) shows that $\langle \tau^{2}(N) \rangle_{\omega} = \langle h^{2}(N) \rangle_{t}$.
The second moment of the impulse response equals the mean square of the DGD
spectrum. This result of Gisin and Pellaux is the formal tie between the impulse and
frequency response of a birefringent concatenation.
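The time-domain half of this argument can be checked by brute force. The sketch below enumerates all $2^N$ paths using (8.2.57) and (8.2.58) and compares the resulting second moment with the closed-form double sum. It is a minimal illustration: the section delays, the random axis orientations, the seed, and the function names are arbitrary assumptions, and the path selector is generated with `itertools.product` rather than from the binary digits of $k$.

```python
import itertools
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def random_unit_vector(rng):
    """Uniformly distributed direction on the sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)


def impulse_response(taus, axes):
    """List of (t_k, w_k) over all 2^N paths, per (8.2.57) and (8.2.58)."""
    n = len(taus)
    impulses = []
    for eps in itertools.product((1, -1), repeat=n):
        t_k = sum(e * tau for e, tau in zip(eps, taus))
        w_k = 0.5  # equal split at the launch into the first section
        for m in range(1, n):
            w_k *= 0.5 * (1.0 + eps[m - 1] * eps[m] * dot(axes[m - 1], axes[m]))
        impulses.append((t_k, w_k))
    return impulses


def second_moment_closed_form(taus, axes):
    """Sum of tau_i^2 plus twice the chained-projection cross terms."""
    n = len(taus)
    total = sum(tau * tau for tau in taus)
    for i in range(n):
        chain = 1.0
        for j in range(i + 1, n):
            chain *= dot(axes[j - 1], axes[j])
            total += 2.0 * taus[i] * taus[j] * chain
    return total


RNG = random.Random(0)
TAUS = [1.0, 0.7, 1.3, 0.5, 0.9]               # section delays, arbitrary units
AXES = [random_unit_vector(RNG) for _ in TAUS]  # birefringent-axis directions

if __name__ == "__main__":
    impulses = impulse_response(TAUS, AXES)
    print("weight sum:", sum(w for _, w in impulses))
    print("mean:", sum(w * t for t, w in impulses))
    print("second moment:", sum(w * t * t for t, w in impulses))
    print("closed form:", second_moment_closed_form(TAUS, AXES))
```

The enumeration grows as $2^N$, so it is practical only for short cascades; the closed form is what makes the comparison with the frequency-domain recurrence tractable.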
Figure 8.33 shows three concatenation calculations in both the frequency
and time domains. The number of segments is four, six, and eight. The left-hand figures plot $\tau(\omega)$ as a function of frequency; the right-hand figures
plot $h(t)$, or the impulse response of the cascade. The impulse response is

[Figure 8.33: for $N_{segments}$ = 4, 6, and 8, DGD spectra $\tau(\omega)$ in ps versus relative frequency (THz) on the left, with $\langle \tau^{2}(\omega) \rangle_{\omega}$ marked, and impulse responses $h(t)$ (time versus impulse amplitude) on the right, with $\langle h^{2}(t) \rangle_{t}$ marked.]

Fig. 8.33. Root-mean-square of DGD spectrum equals the standard deviation of
its impulse response for lossless birefringent cascades of various lengths (after [19]).
Calculated DGD spectra (left) and impulse responses for $t \ge 0$ (right) are shown for
$N_{seg} = 4, 6, 8$. The dotted line shows the calculated rms DGD (left) and standard
deviation (right) of the respective representations. The amplitude of the impulses
as illustrated is the square root of $w(k)$ to indicate all impulses more clearly. The
birefringent vector orientations are randomly selected over a uniform distribution
on the Poincaré sphere; the DGD values are randomly selected on a Maxwellian distribution where the underlying iid Gaussian distributions have a standard deviation
of unity.

symmetric about $t = 0$ because $\hat{s}_{in}\cdot\hat{r}_1$ was set to zero; the plots show only
the positive side of the response. The calculated rms DGD in frequency and
the standard deviation of the impulse spread in time are indicated by the
dotted lines. It is remarkable how quickly the two averages converge as the
number of segments increases.
Armed with the Gisin and Pellaux result, a full circle can be closed on the
triad of time-domain analyses of the preceding. The Signal Distortion section
that started on page 343 covered the temporal extent of the PMD impulse
response and the interference of co-polarized signal images that result from the
[Figure 8.34, panels (a)-(c): (a) PMD impulse envelope versus time, of width $2\sigma = 2\langle\tau\rangle_{rms}$; (b) pulse-edge transition versus time, broadened by $2\langle\tau\rangle_{rms}$; (c) signal pulse of width $T_{pulse}$ with PMD impulse envelopes extending $\pm 3\langle\tau\rangle_{rms}$ about the edges.]

Fig. 8.34. Pulse-width broadening due to PMD. a) For a long, sufficiently random link, the PMD impulse response converges in distribution to a gaussian. Gisin
and Pellaux show that the standard deviation is the rms DGD of the link, $\sqrt{\langle \tau^{2} \rangle}$.
b) Leading transition edge is broadened by the impulse response. c) Three standard
deviations covers 99.9% of the gaussian. The center of a pulse can be distorted when
three standard deviations of the impulse response equal half the pulse width.

convolution. In that section only simple impulse responses were considered.
Now, the effects on a signal by the most general of impulse responses can be
intuited.
For a long birefringent concatenation that is sufficiently random, the impulse response envelope converges in distribution to a gaussian shape; this is illustrated in Fig. 8.34(a). As a measure, the width of the gaussian is twice its
standard deviation; but it is now known that the standard deviation is the
rms DGD of the link. When a transition edge is convolved by the gaussian
impulse response, the edge is characteristically broadened by the gaussian
width, which is twice the rms DGD (Fig. 8.34(b)). Moreover, counted cumulatively from the far tail, one standard
deviation covers only 84% of the envelope; two and three standard deviations
include 97.7% and 99.9% of the gaussian envelope, so the transition edges are
likely to be broadened further than one standard deviation.
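The quoted coverage figures correspond to the cumulative gaussian distribution evaluated at one, two, and three standard deviations. A minimal Python check, using only the standard-library error function, is sketched below; the function name is an illustrative choice.

```python
import math


def gaussian_cumulative(k):
    """Fraction of a gaussian envelope lying below k standard deviations."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(100.0 * gaussian_cumulative(k), 1), "%")
```

These are one-sided fractions, measured from the far tail up to $+k\sigma$; the symmetric interval $\pm 1\sigma$ would contain about 68% of the envelope instead.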
Since the impulse response is the inverse Fourier transform of the entire
PMD spectrum, the impulse response does not depend on the particular DGD,
SOPMD, and higher-order PMD experienced by the signal spectrum. The
signal carrier may be frequency shifted to observe different aspects of the PMD
spectrum, but the only effect on the convolution is how the signal images
interfere. Therefore, the overlap between a signal "one" and the impulse-response gaussian envelope at the transition edges (Fig. 8.34(c)) remains
fixed for any carrier frequency. The tails of the gaussians have reasonable
probability three to four standard deviations into the signal; therefore some
conditions of interference can excite distortion this far into the signal pulse.
One can say roughly that when $T_{pulse}/2 \simeq 3\langle\tau\rangle_{rms}$, the center of the pulse
can be affected by PMD. In the next chapter it is shown that for long fibers
$\langle\tau\rangle_{rms} \simeq 1.085\,\langle\tau\rangle$, that is, the mean and rms are within 10% of one another.
As a measure of complete channel loss due to distortion, the metric of
\[
T_{pulse} \simeq 3\,\langle\tau\rangle \qquad \text{(metric of complete distortion)}
\]
is commonly used in the industry. For instance, a 10 Gb/s NRZ communication link has 100 ps time slots. The signal can be completely distorted when
running on a fiber with accrued mean PMD of $\langle\tau\rangle \simeq 33$ ps.
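The rule of thumb above is easy to tabulate for other line rates. The Python sketch below is illustrative: it assumes NRZ signaling with a time slot equal to the inverse bit rate, and it takes the 1.085 factor to be the rms-to-mean ratio $\sqrt{3\pi/8}$ of a Maxwellian-distributed DGD. The function and constant names are hypothetical.

```python
import math

# rms-to-mean ratio of a Maxwellian distribution, sqrt(3*pi/8), about 1.085
RMS_OVER_MEAN = math.sqrt(3.0 * math.pi / 8.0)


def time_slot_ps(bit_rate_gbps):
    """NRZ time-slot duration in ps for a bit rate given in Gb/s."""
    return 1000.0 / bit_rate_gbps


def mean_pmd_for_complete_distortion_ps(bit_rate_gbps):
    """Mean DGD at which T_pulse is about three times the mean DGD."""
    return time_slot_ps(bit_rate_gbps) / 3.0


if __name__ == "__main__":
    print("rms/mean:", RMS_OVER_MEAN)
    for rate in (2.5, 10.0, 40.0):
        print(rate, "Gb/s:", mean_pmd_for_complete_distortion_ps(rate), "ps")
```

Because the time slot scales inversely with bit rate, the tolerable mean PMD under this metric drops by a factor of four for each fourfold increase in line rate.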
Concluding Remarks on Time-Domain Studies
The preceding was intended to familiarize the reader with the cornerstone features of the time-domain representation. Many researchers have gone further
than this text in the study of particular aspects. Heffner offers several papers
on the time/frequency equivalence of PMD and particularly the impact of input signal chirp and coherence on the output distortion [23-25]. The moments
analysis of Karlsson is extended by Shieh [57]. The correspondence between
time and frequency domains in the study of second-order PMD is principally
given by Gisin et al. [3], and separately by Penninckx et al. [13, 14]. The impact of higher orders of PMD has been studied by Gisin [30] and separately
by Kogelnik et al. [40]. The degree of polarization as a function of PMD and
optical data spectrum is analyzed by Nezam et al. [46].
Many researchers have studied the Jones-matrix representation of SOPMD
and PMD in general. Significant works come from Eyal [9], Kogelnik [41], and
Vincetti [47] for SOPMD; and Heismann [26], Penninckx [49], and Vincetti [12,
48] for general PMD.
8.2.7 Fourier Analysis of the DGD Spectrum
The Fourier analysis of the DGD spectrum offers yet another view into the nature of PMD. This Fourier analysis is not the PMD impulse response nor its Fourier transform into the associated vector and scalar spectra, but strictly a Fourier analysis of one scalar spectrum. The analysis can be extended to include the magnitude-SOPMD scalar spectrum. The purpose of this analysis is twofold. First, to go beyond second-order PMD, one can take higher and higher derivatives of the PMD vector, as reported by Kogelnik [40], or one can look at the degree of variation over a frequency window, as first analyzed by Poole and Favin [54]. In the former case, increased precision is found but over an infinitesimally narrow bandwidth; in the latter case, approximation is made but how the PMD spectrum overlays with the finite-width spectrum of a signal can be better understood.
The Poole and Favin analysis showed that in the long-length regime the expected number of level crossings $\langle N_m\rangle$ of the DGD spectrum through a given level over an interval $\Delta\omega$ is

$$\langle N_m\rangle = \frac{\langle\tau\rangle\,\Delta\omega}{4}$$

The number of level crossings increases linearly with the mean DGD of the link. Therefore the variation of the DGD spectrum must increase linearly with mean DGD as well. For instance, if one conservatively wants to place a channel between two adjacent level crossings, one would take $\langle N_m\rangle = 1$ and estimate the maximum $\langle\tau\rangle$ as $\langle\tau\rangle_{\max} \approx 4/\Delta\omega$. For a channel bandwidth of $\Delta f = 20$ GHz, the maximum $\langle\tau\rangle$ is $\langle\tau\rangle_{\max} \approx 33$ ps.
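The level-crossing estimate can be checked with a few lines of arithmetic. The sketch below assumes the relation "expected crossings = mean DGD times angular bandwidth divided by 4" and takes the angular bandwidth as 2π times the channel bandwidth in hertz; the function name `max_mean_dgd` is hypothetical. For a 20 GHz channel it evaluates to roughly 32 ps, in line with the approximately 33 ps figure quoted in the text.

```python
import math

def max_mean_dgd(channel_bandwidth_hz, crossings=1.0):
    """Largest mean DGD (in seconds) that gives `crossings` expected level
    crossings across the channel, from <N_m> = <tau> * delta_omega / 4."""
    delta_omega = 2.0 * math.pi * channel_bandwidth_hz  # angular bandwidth, rad/s
    return 4.0 * crossings / delta_omega

tau_max = max_mean_dgd(20e9)  # 20 GHz channel, one crossing allowed
print(f"maximum mean DGD: {tau_max * 1e12:.1f} ps")
```

Doubling the channel bandwidth halves the tolerable mean DGD, which is the linear scaling described above.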
The following analysis is presented for the short-length regime [5, 6, 64].
Its extension to the long-length regime may be possible but to date this has
not been done. The purpose of presenting this analysis is to emphasize the
origin of oscillatory variation in the DGD spectrum and to exhibit the spin-vector formalism used to arrive at the conclusions. In light of the Gisin and
Pellaux time-response derivation based on recurrence relations, it is believed
a similar approach can be used to extend the Fourier analysis into the long-length regime.
The problem at hand is similar to that of angular momentum. Analyses
of angular momentum relate to coupled spinning objects or particles and the
total overall momentum. Often one looks for the probability density of the
overall angular momentum given all possible orientations of the component
spins. Alternatively, the extrema can be determined. The quantized angular
momentum analysis of coupled subatomic spins, such as electron and nuclear
spins, determines the quantized levels of the total angular momentum and the
state densities.
PMD concatenations are similar because component PMD vectors precess, or spin, about the axes of adjacent component vectors as frequency is
swept. While the probability density of the overall PMD pointing direction
or PMD vector length can be evaluated, the focus of the present analysis is
to determine the Fourier components embedded in the variation of the PMD
vector length when frequency is swept. The embedded Fourier components
depend only on the delays of the component PMD vectors and not on their
relative orientation or the frequency. The amplitude and phase of the Fourier
components, however, do depend on these details.
A note on nomenclature. The Fourier content determined in the following refers to the oscillatory rate of a DGD spectrum, that is, the "frequency" of variation. The use of "frequency" as related to Fourier content differs from the use of "frequency" as related to the carrier frequency of a probe signal that measures the PMD. The optical carrier frequency is completely different from the frequencies of the Fourier content of the DGD spectrum.
The following analysis builds DGD spectra and the respective Fourier components from two, three, and four birefringent stages. The concatenation rules
for each are illustrated in Fig. 8.35 and the corresponding DGD spectra and
Fourier analysis are illustrated in Fig. 8.36.
The simplest concatenation is that of two vectors $\vec\tau_1$ and $\vec\tau_2$ (Fig. 8.35(a)). The angle between the two vectors in the diagram is determined by the mode-mixing angle between the stages and is not a function of frequency. The output vector is

$$\vec\tau = \vec\tau_2 + R_2\vec\tau_1 \tag{8.2.63}$$

where $\vec\tau_k = \tau_k\hat r_k$, $k = 1, 2$. Component $\vec\tau_1$ precesses about $\vec\tau_2$ with birefringent phase $\phi = \omega\tau_2$; the free-spectral range is FSR $= 1/\tau_2$.

A note on the order of precession. The figure shows $\vec\tau_1$ precessing about $\vec\tau_2$, although in the concatenation $\vec\tau_1$ comes first. Physically, the cumulative PMD vector $\vec\tau$ is defined at the output. Looking from the output toward the input, one sees $\vec\tau_2$ immediately and $\vec\tau_1$ through the aperture of the second element that generates $\vec\tau_2$. When the frequency is changed, the appearance of Stokes orientation $\vec\tau_1$ is rotated due to the birefringence of $\vec\tau_2$, precisely in the same way a polarization state is altered due to the birefringence of $\vec\tau_2$. This gives the precession of $\vec\tau_1$ about $\vec\tau_2$.

Fig. 8.35. Component PMD vector concatenations for two, three, and four stages. a) Two stages. Angle $\theta_{21}$ is determined by the mode mixer and is frequency independent. Vector $\vec\tau_1$ precesses about $\vec\tau_2$ at rate $\omega\tau_2$. The PMD vector is the sum of its component vectors; the length $\tau_a$ is the DGD. b) Three stages, two different precessions: $\omega\tau_2$ and $\omega\tau_3$. c) Four stages, three different precessions: $\omega\tau_2$, $\omega\tau_3$, and $\omega\tau_4$.
The magnitude-squared of the DGD spectrum is

$$\vec\tau\cdot\vec\tau = \tau_2^2 + \tau_1^2 + 2\tau_2\tau_1\,\hat r_2\cdot\hat r_1 \tag{8.2.64}$$

The dot product in the last term,

$$\hat r_2\cdot\hat r_1 = \cos\theta_{21}, \tag{8.2.65}$$

is frequency independent and $\theta_{21}$ is a Stokes angle. Since there is no frequency dependence in the DGD spectrum, the spectrum can be characterized as

$$\vec\tau\cdot\vec\tau = a_0 \tag{8.2.66}$$

where $a_0$ is a real number and the Fourier content is only DC, see Fig. 8.36(a).

Fig. 8.36. Magnitude-square DGD spectra and associated Fourier analysis corresponding to precession diagrams in Fig. 8.35. The magnitude-square DGD spectra are plotted as a function of carrier frequency. The Fourier analyses show the "Fourier" frequencies that are present in the respective spectra. Only Fourier amplitudes are shown, although the Fourier phases are not necessarily zero. a) Two-stage spectrum has no oscillatory Fourier components and is governed by (8.2.66). Only a DC Fourier component is present. b) Three-stage spectrum has a single oscillatory component with well-defined FSR and is governed by (8.2.70). The Fourier spectrum has a DC plus a single oscillatory component at $\tau_2$; only the center stage dictates the frequency of oscillation. c) Four-stage spectrum has four oscillatory components, governed by (8.2.75). The Fourier spectrum has one constant plus four oscillatory components, including sum and difference terms. Only the center two stage delays contribute to the oscillation.
The next case is the concatenation of three component vectors $\vec\tau_1$, $\vec\tau_2$, and $\vec\tau_3$ (Fig. 8.35(b)). As illustrated, the birefringent axes between the stages are not aligned, which results in mode mixing. The resultant PMD vector is

$$\vec\tau = \vec\tau_3 + R_3\vec\tau_2 + R_3R_2\vec\tau_1 \tag{8.2.67}$$

The figure shows the motion of the three vector components. Vector $\vec\tau_1$ precesses about the $\vec\tau_2$ axis with birefringent phase $\phi_2 = \omega\tau_2$. Vectors $\vec\tau_1$ and $\vec\tau_2$ combined precess about the $\vec\tau_3$ axis with birefringent phase $\phi_3 = \omega\tau_3$. The length and pointing direction of $\vec\tau$ exhibit a more complicated motion than that of the two-stage example and are in general frequency dependent. The magnitude-squared of the DGD spectrum is

$$\vec\tau\cdot\vec\tau = \tau_3^2 + \tau_2^2 + \tau_1^2 + 2\tau_3\tau_2\,\hat r_3\cdot\hat r_2 + 2\tau_2\tau_1\,\hat r_2\cdot\hat r_1 + 2\tau_3\tau_1\,\hat r_3\cdot(R_2\hat r_1) \tag{8.2.68}$$

In light of (8.2.65), the first five terms on the right-hand side are frequency independent. The last term, however, generates one non-DC Fourier component. The last term expands to

$$\hat r_3\cdot(R_2\hat r_1) = \cos\theta_{32}\cos\theta_{21} + \sin\phi_2\;\hat r_3\cdot(\hat r_2\times\hat r_1) - \cos\phi_2\;\hat r_3\cdot(\hat r_2\times\hat r_2\times\hat r_1) \tag{8.2.69}$$

The last two terms on the right-hand side add to yield a single oscillatory term governed by $\phi_2$ (see Appendix A). Combining Eqs. (8.2.65), (8.2.69), and using the identity

$$A\cos\phi - B\sin\phi = \sqrt{A^2 + B^2}\,\cos\!\left(\phi + \tan^{-1}(B/A)\right)$$

yields the Fourier form of the magnitude-squared DGD spectrum:

$$\vec\tau\cdot\vec\tau = a_0 + a_1\cos(\phi_2 + \zeta) \tag{8.2.70}$$

where, as before, $a_0$ and $a_1$ are real numbers that are independent of frequency. The spectrum is periodic where the periodicity is determined solely by the center section $\tau_2$, see Fig. 8.36(b). The Fourier-component phase shift $\zeta$ is determined from (8.2.69):

$$\zeta = \tan^{-1}\!\left[\frac{\hat r_3\cdot(\hat r_2\times\hat r_1)}{\hat r_3\cdot(\hat r_2\times\hat r_2\times\hat r_1)}\right] \tag{8.2.71}$$

where in general the phase changes as the relative angles between PMD components change. Only when all three birefringent axes lie in the same plane, which leads to $\hat r_3\cdot(\hat r_2\times\hat r_1) = 0$, does the phase shift vanish.
The coupling of Fourier-component phase shift to the mode-mixing angles $\hat r_2\cdot\hat r_1$ and $\hat r_3\cdot\hat r_2$ is an interesting effect that is illustrated in Fig. 8.37. Both three-stage examples in the figure show a range of DGD spectral shapes when the center section is rotated as indicated. When the birefringent axes $\hat r_{1,2,3}$ of all three sections lie in the same plane, such as the equator, then the Fourier-component phase shift is identically zero. In this case the frequency location of the maximum DGD value does not change even though the shape of the spectrum changes (Fig. 8.37(a)). However, when any one of the birefringent axes lies out of the plane of the other two then coupling between phase shift and mode mixing occurs (Fig. 8.37(b)). This effect is observable, for instance, when a zero-order quarter-wave waveplate is inserted to either side of the center element. In a communication link, the bulk components such as isolators can make the apparent birefringent axes fall outside of a common plane.

Fig. 8.37. Fourier-component phase shift as decoupled (a) and coupled (b) to mode mixing. a) Locus of DGD spectra for a three-stage system as the center section is rotated. All three birefringent axes lie in the same plane in Stokes space. The frequency location of the maximum DGD is fixed for each spectrum. b) Locus of DGD spectra when one birefringent axis lies out of the plane defined by the other two. Fourier-component phase shift is coupled to the mode mixing.

The following analysis is simplified by assuming all birefringent axes lie in the same plane. This is called the co-planar assumption. Violation of this assumption will not change the location or number of Fourier frequencies, only their phase.
The concatenation of four sections $\vec\tau_1$, $\vec\tau_2$, $\vec\tau_3$, and $\vec\tau_4$ is illustrated in Fig. 8.35(c). The resultant PMD vector is

$$\vec\tau = \vec\tau_4 + R_4\vec\tau_3 + R_4R_3\vec\tau_2 + R_4R_3R_2\vec\tau_1 \tag{8.2.72}$$

The motion of the vector sum is more complicated yet. Vector $\vec\tau_1$ precesses about the $\vec\tau_2$ axis with phase $\phi_2 = \omega\tau_2$; vectors $\vec\tau_1$ and $\vec\tau_2$ combined precess about $\vec\tau_3$ with phase $\phi_3 = \omega\tau_3$; and vectors $\vec\tau_{1,2,3}$ combined precess about $\vec\tau_4$ with phase $\phi_4 = \omega\tau_4$. The magnitude-squared DGD spectrum takes the form

$$\vec\tau\cdot\vec\tau = \sum_{k=1}^{4}\tau_k^2 + 2\sum_{k=1}^{3}\tau_{k+1}\tau_k\,\hat r_{k+1}\cdot\hat r_k + 2\sum_{k=1}^{2}\tau_{k+2}\tau_k\,\hat r_{k+2}\cdot(R_{k+1}\hat r_k) + 2\tau_4\tau_1\,\hat r_4\cdot(R_3R_2\hat r_1) \tag{8.2.73}$$

The first term on the right-hand side of (8.2.73) is scalar; the second term, identified with (8.2.65), generates no frequency-dependent terms; and the third term, identified with (8.2.69), generates the frequency-dependent terms $\cos\phi_2$ and $\cos\phi_3$ (assuming coplanar mode-mixing vectors). The last term generates additional frequency-dependent components. That term expands to

$$\begin{aligned}
\hat r_4\cdot(R_3R_2\hat r_1) ={}& \cos\theta_{43}\cos\theta_{32}\cos\theta_{21} \\
&- \hat r_3\cdot(\hat r_2\times\hat r_2\times\hat r_1)\,\cos\theta_{43}\cos\phi_2 \\
&- \hat r_4\cdot(\hat r_3\times\hat r_3\times\hat r_2)\,\cos\theta_{21}\cos\phi_3 \\
&+ \hat r_4\cdot(\hat r_3\times\hat r_2\times\hat r_1)\,\sin\phi_3\sin\phi_2 \\
&+ \hat r_4\cdot(\hat r_3\times\hat r_3\times\hat r_2\times\hat r_2\times\hat r_1)\,\cos\phi_3\cos\phi_2
\end{aligned} \tag{8.2.74}$$

The mixing products $\sin\phi_3\sin\phi_2$ and their cosine complements resolve themselves into sum and difference terms, e.g.

$$2\sin\phi_3\sin\phi_2 = \cos(\phi_3 - \phi_2) - \cos(\phi_3 + \phi_2)$$

The Fourier content of a four-stage concatenation therefore takes the form

$$\vec\tau\cdot\vec\tau = a_0 + a_1\cos\phi_2 + a_2\cos\phi_3 + a_3\cos(\phi_3 - \phi_2) + a_4\cos(\phi_3 + \phi_2) \tag{8.2.75}$$

where, as before, coefficients $a_k$ are real, frequency-independent numbers that are determined solely by the mode mixing between stages. An exemplar spectrum and its Fourier spectrum are illustrated in Fig. 8.36(c).
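The Fourier content of two-, three-, and four-stage concatenations can be explored numerically. The following is a minimal sketch, not the author's code: it assumes right-handed Rodrigues rotations for each stage, integer stage delays in arbitrary units (so that the spectrum is exactly periodic over the sampled window), and arbitrary non-coplanar axes; all names and numerical values are hypothetical. The FFT bin index plays the role of the "Fourier" frequency in delay units.

```python
import numpy as np

def rotation(axis, angle):
    """Right-handed rotation by `angle` about unit vector `axis` (Rodrigues form)."""
    x, y, z = axis
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])  # cross-product matrix
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def dgd_squared(delays, axes, f):
    """|tau|^2 at normalized frequency f for a concatenation of birefringent stages."""
    total = np.zeros(3)
    for tau, r in zip(delays, axes):
        # Each new stage rotates the accumulated PMD vector and adds its own component.
        total = tau * r + rotation(r, 2.0 * np.pi * f * tau) @ total
    return total @ total

def fourier_bins(delays, axes, n=64, tol=1e-9):
    """Indices of the non-negligible Fourier components of the DGD^2 spectrum."""
    grid = np.arange(n) / n  # one full period when all delays are integers
    spectrum = np.array([dgd_squared(delays, axes, f) for f in grid])
    amplitude = np.abs(np.fft.rfft(spectrum)) / n
    return {int(k) for k in np.flatnonzero(amplitude > tol)}

def unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Hypothetical axes (deliberately not coplanar) and integer delays.
axes = [unit([1, 0, 0]), unit([1, 1, 0]), unit([0, 1, 1]), unit([1, 0, 1])]
delays = [3, 5, 12, 4]

two = fourier_bins(delays[:2], axes[:2])
three = fourier_bins(delays[:3], axes[:3])
four = fourier_bins(delays, axes)
print(two, three, four)
```

With these values the two-stage case should contain only the DC bin, the three-stage case DC plus the center delay (5), and the four-stage case DC plus 5, 12, and their difference and sum (7 and 17). The first and last delays (3 and 4) do not appear as Fourier frequencies, and dropping the co-planar assumption changes only the phases, not the set of bins.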
Several observations about (8.2.75) are made. First, the Fourier content of the magnitude-squared DGD spectrum is determined only by the center two delay stages. In general the first and last section for any number of stages do not contribute to the Fourier content; this observation is proven below. Second, sum and difference Fourier terms are present in addition to the Fourier components associated with the center two stages. Here there are four Fourier components in total. Third, there is a Fourier-component generator function that generates these components. The generator function $G(N)$ is

$$G(N) = \hat r_N\cdot R(N, 2)\,\hat r_1 \tag{8.2.76}$$

where the cumulative operator $R(N, 2)$ is defined by (8.2.38) on page 337. The first four generator functions are

$$\begin{aligned}
G(1) &= 1 \\
G(2) &= g_0 \\
G(3) &= g_0 + g_1\cos\phi_2 \\
G(4) &= g_0 + g_1\cos\phi_2 + g_2\cos\phi_3 + g_3\cos(\phi_3 - \phi_2) + g_4\cos(\phi_3 + \phi_2)
\end{aligned} \tag{8.2.77}$$

where the value $g_k$ for one generator function has no relation to the value $g_k$ of another generator function.
As a last part to this section, the absence of Fourier components generated from the first and last stages is shown [61]. The independence of the first stage has already been demonstrated: it is clear from any of the vector diagrams in Fig. 8.35 that no vector precesses about $\vec\tau_1$, so there is no sensitivity of $\tau$ to $\tau_1$. The insensitivity to $\tau_N$ is proven by separating the last stage from the preceding $N - 1$ stages:

$$\begin{aligned}
\vec\tau\cdot\vec\tau &= \left(\vec\tau_N + R_N\vec\tau_{(N-1)}\right)\cdot\left(\vec\tau_N + R_N\vec\tau_{(N-1)}\right) \\
&= \tau_N^2 + \tau_{(N-1)}^2 + 2\tau_N\tau_{(N-1)}\,\hat r_N\cdot\hat r_{(N-1)}
\end{aligned} \tag{8.2.78}$$

where $\vec\tau_{(N-1)}$ denotes the cumulative PMD vector of the preceding $N - 1$ stages. Since the last term on the right-hand side has the form $G(2)$, there is in fact no $\tau_N$ Fourier component generated by the last stage. Geometrically this makes sense because rotation about $\vec\tau_N$ pirouettes the remaining vector structure, changing its pointing direction but not its length.

8.3 Combined Effects of PMD and PDL


When birefringent elements and partial polarizers are interspersed along a concatenation, the resulting polarization effects are more complicated than PDL or PMD alone. The resulting effects form a superset of PDL and PMD: communication impairments due to combined PDL and PMD can be more severe than either isolated effect. Since there is no separate name for the combined phenomena, "PMD and PDL" here denotes the aggregate effect.
The principal results from combined PMD and PDL are
1. The principal states of polarization are not orthogonal to one another.
2. The output polarization state does not follow simple precessional motion as a function of frequency.
3. The polarization-dependent loss is frequency dependent. A new term, differential-attenuation slope (DAS), is introduced to characterize this effect.
These results conspire to create anomalous distortions. For instance, the differential-group delay of a PMD and PDL combination can be greater than the sum of the individual delays [32]. Or, a pulse can suffer spreading even with zero net DGD [30]. Moreover, neither an optical PDL nor PMD compensator can be perfectly realized.
The original work on PMD and PDL is by Frigo [16], who derived the equation of motion of the output polarization vector as a function of frequency, albeit in the context of coupled-mode equations. Eyal [10] later recast the Frigo equation into unit-vector form to derive the complex motion of the output polarization state with frequency. The authors Gisin, Huttner, and Geiser [18, 31] discovered many anomalous effects due to the interaction of PMD and PDL. R. C. Jones also contributed on this subject: his 1941 paper proves a theorem, theorem 3, that any number of PMD and PDL elements (although he called them retarders and partial polarizers) can be replaced with just four elements: two PMD elements, one PDL element, and one optically active polarization rotator [36]. This theorem has practical use when separating PMD and PDL effects.
Further research on the interaction of PMD and PDL in an optical communications link has been reported by Mollenauer [62, 63], Feced [11], and Eyal [8]. A very interesting measurement method to test for maximum eye-opening excursion has been invented by Kuperman et al. [42].
Three principal equations are derived in the following: the change of output
polarization state with frequency; the change of output polarization state
through propagation; and the cumulative PDL vector equation of motion.
Table 8.3 compares the expressions for pure PMD and those including PDL.
8.3.1 Frequency-Dependence of the Polarization State
In general, the transformation matrix $T$ between input and output polarization states is neither unitary nor Hermitian in the presence of combined PMD and PDL. Due to the birefringence in the medium, $T$ is frequency dependent: $|t\rangle = T(\omega)\,|s\rangle$. As before, for $|s\rangle$ fixed in frequency, the frequency change of the output state is $|t\rangle_\omega = T_\omega|s\rangle$. As long as a perfect polarizer is not placed between input and output, $T^{-1}$ exists and the change of output state is expressed as

$$|t\rangle_\omega = T_\omega T^{-1}\,|t\rangle \tag{8.3.1}$$

Even without any particular reference to PMD and PDL parameters, the matrix $T_\omega T^{-1}$ can always be decomposed into Hermitian, skew-Hermitian, and trace components, as was shown by (2.5.72) on page 61. The decomposition of $T_\omega T^{-1}$ takes the form

$$jT_\omega T^{-1} = \frac{1}{2}\left[\left(\vec\Omega_r + j\vec\Omega_i\right)\cdot\vec\sigma + a_0 I\right] \tag{8.3.2}$$

The factor of $j$ in front of $T_\omega T^{-1}$ is added to keep the notation parallel to the Hermitian operator $jU_\omega U^\dagger$ defined for the description of PMD. The scalar $a_0$ is in general complex; the real component is the common phase through the system and the imaginary part is the common loss. The vectors $\vec\Omega_r$ and $\vec\Omega_i$ are real-valued Stokes vectors. These vectors relate to the system birefringence and differential attenuation, but not in a straightforward way. The decomposition parameters are related to $jT_\omega T^{-1}$ via (2.5.68) on page 61 for the trace and (2.5.73) for the real and imaginary parts.
Since $T$ is not necessarily unitary, the length of the output Stokes vector $\vec t$ is generally not the same as the input vector $\hat s$. One must be careful to distinguish the output vector length $t$ and the unit vector $\hat t$. With this remark in mind, the Stokes vector $\vec t_\omega$ is calculated in the same way as (8.2.15) on page 330. This makes

$$\vec t_\omega = \langle t|\left[\left(T_\omega T^{-1}\right)^\dagger\vec\sigma + \vec\sigma\left(T_\omega T^{-1}\right)\right]|t\rangle \tag{8.3.3}$$

Substitution of $T_\omega T^{-1}$ (8.3.2) into (8.3.3) generates the Frigo equation of motion:

$$\frac{d\vec t}{d\omega} = \vec\Omega_r\times\vec t + a_{i,0}\,\vec t + t\,\vec\Omega_i \tag{8.3.4}$$

where $a_{i,0}$ is the imaginary part of the trace of $T_\omega T^{-1}$. This equation describes a complex behavior of the output Stokes vector $\vec t$. The first term on the right-hand side generates a precessional motion: $\vec t$ precesses about $\vec\Omega_r$ as a function of frequency. In the absence of PDL, $\vec\Omega_r = \vec\tau$, the PMD vector. In the presence of PDL, $\vec\Omega_r$ includes PDL as well as PMD terms. The second term on the right-hand side describes the growth or decay of $\vec t$ along its own axis. The imaginary part of the trace of $T_\omega T^{-1}$ governs this behavior. Also, these first two terms run perpendicular to one another. Finally, the third term on the right-hand side pulls $\vec t$ toward $\vec\Omega_i$. The pulling behavior has been seen before in §8.1.2 in regard to PDL. In sum, there are three distinct axes along which the output state is changed.
There is a competition set up between $\vec\Omega_r$ and $\vec\Omega_i$. If the former is the dominant term, then it acts to retard the growth or decay of $\vec t$ by generating a motion perpendicular to $\vec t$. If the latter term dominates, then $\vec t$ grows or decays without bound.
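The Frigo equation of motion can be checked numerically on a toy system. The sketch below is an illustration under stated assumptions, not the author's procedure: it writes the decomposition as jT_ωT⁻¹ = ½[(Ω_r + jΩ_i)·σ + a₀I] (symbol names assumed here), uses Pauli matrices in Stokes ordering with σ₁ = diag(1, −1), and builds a hypothetical T(ω) from a frequency-dependent common loss, one retarder, and one partial polarizer. The frequency derivative of the output Stokes vector, taken by finite differences, is compared with Ω_r × t + a_{i,0} t + |t| Ω_i.

```python
import numpy as np

# Pauli matrices in Stokes ordering (sigma_1 = diag(1, -1)).
SIGMA = np.array([[[1, 0], [0, -1]],
                  [[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]]], dtype=complex)

def dot_sigma(v):
    """Return the 2x2 matrix v . sigma for a 3-component vector v."""
    return np.einsum('k,kij->ij', np.asarray(v, dtype=complex), SIGMA)

def system(omega):
    """Toy T(omega): common loss times a partial polarizer P acting after a retarder U."""
    tau, kappa = 1.0, 0.05               # hypothetical delay and common-loss slope
    r = np.array([0.0, 0.6, 0.8])        # retarder axis (unit vector)
    a = np.array([0.4, 0.0, 0.0])        # PDL vector
    half = 0.5 * omega * tau
    U = np.cos(half) * np.eye(2) - 1j * np.sin(half) * dot_sigma(r)
    mag = np.linalg.norm(a)
    P = np.cosh(mag / 2) * np.eye(2) + np.sinh(mag / 2) * dot_sigma(a / mag)
    return np.exp(-kappa * omega) * P @ U

def stokes(ket):
    """Unnormalized Stokes vector <t|sigma|t> of a Jones vector."""
    return np.real(np.einsum('i,kij,j->k', ket.conj(), SIGMA, ket))

def frigo_check(omega, s_in, h=1e-5):
    """Return (finite-difference dt/domega, right-hand side of the Frigo equation)."""
    T = system(omega)
    T_w = (system(omega + h) - system(omega - h)) / (2 * h)
    M = 1j * T_w @ np.linalg.inv(T)                   # j T_w T^{-1}
    W = np.array([np.trace(M @ s) for s in SIGMA])    # Omega_r + j Omega_i
    a0 = np.trace(M)
    t = stokes(T @ s_in)
    lhs = (stokes(system(omega + h) @ s_in) - stokes(system(omega - h) @ s_in)) / (2 * h)
    rhs = np.cross(W.real, t) + a0.imag * t + np.linalg.norm(t) * W.imag
    return lhs, rhs

s_in = np.array([1.0, 1.0j]) / np.sqrt(2.0)           # arbitrary input Jones vector
lhs, rhs = frigo_check(0.7, s_in)
print(np.max(np.abs(lhs - rhs)))
```

The two sides should agree to within the finite-difference error. Setting the PDL vector magnitude toward zero in `system` removes the pulling term, leaving pure precession plus the common-loss decay.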
Further insight is found by decomposing (8.3.4) into coupled unit-vector and vector-length equations of motion [15]. The decomposition requires two identifications. First, by definition $\vec t = t\,\hat t$, so the frequency derivative is

$$\frac{d(t\,\hat t)}{d\omega} = t\,\frac{d\hat t}{d\omega} + \hat t\,\frac{dt}{d\omega}$$

Note that the first term is perpendicular to $\hat t$ while the second term is parallel (that the first term is perpendicular is a consequence of $\hat t$ being a unit vector). Second, the vector $\vec\Omega_i$ is decomposed into components parallel and perpendicular to $\hat t$:

$$\begin{aligned}
\vec\Omega_i &= \vec\Omega_{i,\parallel} + \vec\Omega_{i,\perp} \\
&= \left(\vec\Omega_i\cdot\hat t\right)\hat t + \left[\left(\hat t\cdot\hat t\right)\vec\Omega_i - \left(\vec\Omega_i\cdot\hat t\right)\hat t\right] \\
&= \left(\vec\Omega_i\cdot\hat t\right)\hat t + \hat t\times\left(\vec\Omega_i\times\hat t\right)
\end{aligned}$$

Substitution into the equation of motion makes

$$t\,\frac{d\hat t}{d\omega} + \hat t\,\frac{dt}{d\omega} = t\,\vec\Omega_r\times\hat t + (a_{i,0}\,t)\,\hat t + t\left(\vec\Omega_i\cdot\hat t\right)\hat t + t\,\hat t\times\left(\vec\Omega_i\times\hat t\right)$$

The parallel and perpendicular components must separately satisfy this equation. Therefore the decomposition produces the two coupled equations

$$\frac{dt}{d\omega} = a_{i,0}\,t + \left(\vec\Omega_i\cdot\hat t\right)t \tag{8.3.5a}$$

$$\frac{d\hat t}{d\omega} = \vec\Omega_r\times\hat t - \left(\vec\Omega_i\times\hat t\right)\times\hat t \tag{8.3.5b}$$

These equations exhibit interesting properties. First, the unit-vector equation is not coupled to the vector-length equation. Therefore these equations can be solved in sequence: first solve for $\hat t$, then for $t$. Second, the unit-vector equation of motion has both double- and triple-vector products. This result was first published by Eyal [10]. Third, the change-of-length equation is driven explicitly by $a_{i,0}$ and $\vec\Omega_i$, both directly related to PDL. If the PDL were zero, these terms would be absent and the vector length would be invariant. With PDL present, $t$ changes with frequency: this is the origin of the differential-attenuation slope, or DAS. Even when the common loss generated by $a_{i,0}$ is transformed out, the differential attenuation embedded in $\vec\Omega_i$ induces DAS.
8.3.2 Non-Orthogonality of PSPs
The principal states of polarization for PMD and PDL are found in the same way as the PSPs are for pure PMD. Recall from (8.2.9) on page 329 that the PSPs are defined by the eigenvalue equation of the operator $jU_\omega U^\dagger$. When this equation is satisfied, the output polarization state is stationary to first order in frequency.
In an entirely analogous way, the eigenvalue equation for $jT_\omega T^{-1}$ is defined. Substitution of the spin-vector form of $jT_\omega T^{-1}$ into (8.3.1) yields

$$|t\rangle_\omega = -\frac{j}{2}\left[\left(\vec\Omega_r + j\vec\Omega_i\right)\cdot\vec\sigma\right]|t\rangle - j\,(a_0/2)\,|t\rangle \tag{8.3.6}$$

To make the output state stationary, the spin-vector operator must collapse to a complex scalar value:

$$\left(\vec W\cdot\vec\sigma\right)|\tilde p_\pm\rangle = \pm\chi\,|\tilde p_\pm\rangle \tag{8.3.7}$$

where $\vec W = \vec\Omega_r + j\vec\Omega_i$. That the eigenvalues are equal and opposite is a direct consequence of the zero trace of $\vec W\cdot\vec\sigma$. Moreover, since the operator $\vec W\cdot\vec\sigma$ is non-Hermitian, the eigenvalues are in general complex. Gisin et al. identify the real and imaginary parts of the eigenvalues as [31]

$$\chi = \tau + j\eta \tag{8.3.8}$$

The real part is the familiar differential-group delay magnitude; the imaginary part is the differential-attenuation slope (DAS), which is the frequency derivative of the differential attenuation along the two eigenvectors.
The eigenvalue and the operator $\vec W\cdot\vec\sigma$ are related by

$$\chi^2 = \vec W\cdot\vec W \tag{8.3.9}$$

This expression may be confirmed by solving $\det(\vec W\cdot\vec\sigma - \chi I) = 0$ and is analogous to the case for pure PMD where $\vec\tau\cdot\vec\tau = \tau^2$.
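In the presence of PDL, the PSPs are not orthogonal (except, possibly, in transient or pathological cases). This is due to the complex value of the operator $\vec W$. The overlap $|\langle\tilde p_+|\tilde p_-\rangle|^2$ of the two eigenvectors can be computed in Stokes space from the dot product $\hat p_+\cdot\hat p_-$, see (2.5.65) on page 60. The calculation is simplified by rearranging (8.3.7) so that the operator has unit length and the eigenvalues are real:

$$\left(\tilde{\vec w}\cdot\vec\sigma\right)|\tilde p_\pm\rangle = \pm\,|\tilde p_\pm\rangle$$

where $\tilde{\vec w} = \vec W/\chi$, $\tilde{\vec w} = \vec w_r + j\vec w_i$, and $\tilde{\vec w}\cdot\tilde{\vec w} = 1$. From this eigenvalue equation two auxiliary equations are computed, the first by multiplying the equation by $\langle\tilde p_\pm|$ and the second by $\langle\tilde p_\pm|(\tilde{\vec w}\cdot\vec\sigma)^\dagger$:

$$\langle\tilde p_\pm|\left(\tilde{\vec w}\cdot\vec\sigma\right)|\tilde p_\pm\rangle = \pm\langle\tilde p_\pm|\tilde p_\pm\rangle$$

$$\langle\tilde p_\pm|\left(\tilde{\vec w}\cdot\vec\sigma\right)^\dagger\left(\tilde{\vec w}\cdot\vec\sigma\right)|\tilde p_\pm\rangle = \pm\langle\tilde p_\pm|\left(\tilde{\vec w}\cdot\vec\sigma\right)^\dagger|\tilde p_\pm\rangle$$

Conversion to Stokes space gives

$$\vec w\cdot\hat p_\pm = \pm 1, \quad\text{and}\quad \vec w^*\cdot\vec w + j\left(\vec w^*\times\vec w\right)\cdot\hat p_\pm = \pm\,\vec w^*\cdot\hat p_\pm \tag{8.3.10}$$

where the tilde has been removed for brevity. Now, since $\vec w\cdot\vec w = 1$, the imaginary part of the dot product must vanish: this requires $\vec w_r\cdot\vec w_i = 0$. Therefore an orthogonal group of three (unnormalized) axes may be constructed, $(\vec w_r, \vec w_i, \vec w_r\times\vec w_i)$, and $\hat p_\pm$ can be projected onto this basis:

$$\hat p_\pm = c_r\,\vec w_r + c_i\,\vec w_i + c_\times\left(\vec w_r\times\vec w_i\right) \tag{8.3.11}$$

The real-valued coefficients are isolated through the dot products

$$\begin{aligned}
c_r\,\vec w_r\cdot\vec w_r &= \vec w_r\cdot\hat p_\pm \\
c_i\,\vec w_i\cdot\vec w_i &= \vec w_i\cdot\hat p_\pm \\
c_\times\left(\vec w_r\times\vec w_i\right)\cdot\left(\vec w_r\times\vec w_i\right) &= \left(\vec w_r\times\vec w_i\right)\cdot\hat p_\pm
\end{aligned}$$

Given that $\vec w\cdot\hat p_\pm = \pm 1$, the first two coefficients are $c_r = \pm 1/(\vec w_r\cdot\vec w_r)$ and $c_i = 0$. The third coefficient is evaluated from the dot product and the second auxiliary equation in (8.3.10). Since $\vec w^*\times\vec w = 2j\,\vec w_r\times\vec w_i$ and $w_r^2 - w_i^2 = 1$, that equation gives $(\vec w_r\times\vec w_i)\cdot\hat p_\pm = w_i^2$, so that

$$c_\times = \frac{\left(\vec w_r\times\vec w_i\right)\cdot\hat p_\pm}{w_r^2\,w_i^2} = \frac{1}{w_r^2}$$

The normalized eigenvectors in Stokes space are then [31]

$$\hat p_\pm = \frac{\pm\vec w_r + \vec w_r\times\vec w_i}{\vec w_r\cdot\vec w_r} \tag{8.3.12}$$

Using the fact that $\tilde{\vec w}\cdot\tilde{\vec w}^* = w_r^2 + w_i^2$ and $\chi\chi^* = |\chi|^2$, the overlap of the eigenvectors is computed as

$$|\langle\tilde p_+|\tilde p_-\rangle|^2 = \frac{1 + \hat p_+\cdot\hat p_-}{2} = \frac{w_r^2 + w_i^2 - 1}{w_r^2 + w_i^2 + 1} = \frac{\vec W\cdot\vec W^* - |\chi|^2}{\vec W\cdot\vec W^* + |\chi|^2} \tag{8.3.13}$$

Clearly when $\vec W$ is real, the case for pure PMD, the overlap integral vanishes. However, addition of any PDL at all pulls the two PSPs away from an orthogonal orientation.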
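The non-orthogonality of the PSPs can be illustrated numerically. The sketch below assumes the PSPs are the eigenvectors of the operator W·σ, with W a complex three-component vector and σ the Pauli matrices in Stokes ordering, and compares the numerically computed overlap with the closed form (W·W* − |χ|²)/(W·W* + |χ|²), where χ² = W·W. The function name and the sample vectors are hypothetical.

```python
import numpy as np

# Pauli matrices in Stokes ordering (sigma_1 = diag(1, -1)).
SIGMA = np.array([[[1, 0], [0, -1]],
                  [[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]]], dtype=complex)

def psp_overlap(W):
    """Return (numerical overlap, closed-form overlap, eigenvalues) for complex W."""
    W = np.asarray(W, dtype=complex)
    op = np.einsum('k,kij->ij', W, SIGMA)          # the operator W . sigma
    vals, vecs = np.linalg.eig(op)
    p_plus = vecs[:, 0] / np.linalg.norm(vecs[:, 0])
    p_minus = vecs[:, 1] / np.linalg.norm(vecs[:, 1])
    numerical = abs(np.vdot(p_plus, p_minus)) ** 2
    chi_sq = W @ W                                  # chi^2 = W . W (no conjugation)
    ww = np.real(W @ W.conj())                      # W . W*
    closed = (ww - abs(chi_sq)) / (ww + abs(chi_sq))
    return numerical, closed, vals

# Pure PMD: real vector, orthogonal PSPs.
num_pmd, closed_pmd, _ = psp_overlap([1.0, 0.5, -0.3])
# PMD with PDL: complex vector, non-orthogonal PSPs.
num_mix, closed_mix, vals_mix = psp_overlap([1.0, 0.2 + 0.3j, -0.5 + 0.1j])
print(num_pmd, closed_pmd, num_mix, closed_mix)
```

For the real vector both overlaps should vanish to numerical precision, while for the complex vector they should agree with each other and be strictly positive. The two eigenvalues sum to zero in both cases, reflecting the zero trace of the operator.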
8.3.3 PMD and PDL Evolution Equations
There are two evolution equations to derive, both being extensions of the pure PMD and pure PDL case. First, the evolution of the complex operator $\vec W$ as a function of length is derived; the analogue to this equation is (8.2.39) on page 339, although here a different derivation is employed. Second, the evolution of the cumulative PDL $\vec\Gamma$ is derived; the analogue is (8.1.27) on page 310. In both cases, the combined effects of birefringence and PDL are accounted for.
Earlier, the evolution equation for $\vec\tau$ was derived by combining the partial derivatives of the output state $\hat t$ with respect to both length and frequency. This is the Poole method. The present situation is more difficult because both the vector direction and length vary with length and frequency. Instead, the method of Gisin et al. is used [18]. Their method is similar to that used in §8.1.4 except that sections are taken as discrete rather than in the continuum limit.
Given a transformation $T$ such that $|t\rangle = T|s\rangle$, the transformation is partitioned into $N$ homogeneous birefringent and lossy sections: $T = A_N T_N$, where $A_N$ is the common loss and $T_N$ represents the product of transformation matrices through $N$ sections, $T_N = T_n T_{n-1}\cdots T_1$. Capital subscripts denote section products and lower-case subscripts denote particular sections. The terms $A_N$ and $T_N$ are the discrete analogue to the continuous expression (8.1.22) on page 309.
To account for birefringence and loss, the spin-vector operator for each section is written as

$$T_n = \exp\!\left[\frac{\left(-j\omega\vec\tau_n + \vec\alpha_n\right)\cdot\vec\sigma}{2}\right] \tag{8.3.14}$$

This definition of $T_n$ is not totally general because the vector directions of the loss and birefringence are aligned; this is a reasonable model because the origin of differential loss and birefringence (in the perturbation regime) is likely due to the same disturbance. A shorthand variable $\vec g_n$ is defined as $\vec g_n = -j\omega\vec\tau_n + \vec\alpha_n$, with $g_n^2 = \vec g_n\cdot\vec g_n$ and $\hat g_n = \vec g_n/g_n$.
The last element of the concatenation is separated from the remaining by writing $T_N = T_n T_{N-1}$. Given $T_{\omega,N} = T_{\omega,n}T_{N-1} + T_n T_{\omega,N-1}$, the operator $T_{\omega,N}T_N^{-1}$ may be written in incremental form as

$$T_{\omega,N}T_N^{-1} = -\frac{j}{2}\left(\vec\tau_n\cdot\vec\sigma\right) + \exp\!\left(\frac{\vec g_n\cdot\vec\sigma}{2}\right)T_{\omega,N-1}T_{N-1}^{-1}\exp\!\left(-\frac{\vec g_n\cdot\vec\sigma}{2}\right)$$

With the identification $T_{\omega,k}T_k^{-1} = -(j/2)\,(\vec W_k\cdot\vec\sigma)$, the operator is rewritten as

$$\vec W_N\cdot\vec\sigma = \vec\tau_n\cdot\vec\sigma + \exp\!\left(\frac{\vec g_n\cdot\vec\sigma}{2}\right)\left(\vec W_{N-1}\cdot\vec\sigma\right)\exp\!\left(-\frac{\vec g_n\cdot\vec\sigma}{2}\right)$$

Use of the complex spin-vector operator expansion (2.5.8) on page 63, the relevant spin-vector identities, and identification of the embedded equation gives

$$\vec W_N = \vec\tau_n + \left(\vec W_{N-1}\cdot\hat g_n\right)\hat g_n + \left[\vec W_{N-1} - \left(\vec W_{N-1}\cdot\hat g_n\right)\hat g_n\right]\cosh g_n - j\left(\vec W_{N-1}\times\hat g_n\right)\sinh g_n$$

To make the differential operator $\partial\vec W/\partial z$, a small increment of length is characterized by a small magnitude $g_n$, and the birefringence and PDL are written in per-length form as $\vec\beta(z)$ and $\vec\alpha(z)$. Moreover, recognize that $\beta_\omega(z) = \Delta n/c$ is the delay per unit length. The resultant equation of motion for $\vec W$ is

$$\frac{\partial\vec W}{\partial z} = \vec\beta_\omega(z) + \left(\vec\beta + j\vec\alpha\right)\times\vec W \tag{8.3.15}$$

In comparison with the pure PMD evolution equation (8.2.39), PDL adds to the curl of $\vec W$ and drives the vector to a complex quantity. Li and Yariv have worked out the analytic solutions of (8.3.15) in [43].
Regarding the cumulative PDL vector $\vec\Gamma$, the equation of motion is derived in the same way as that shown in §8.1.4 but with the transformation operator (8.3.14) substituted for that in (8.1.22). The equations of motion for $\vec\Gamma$ and the transmission of depolarized light are

$$\frac{d\vec\Gamma}{dz} = \vec\beta\times\vec\Gamma + \vec\alpha - \left(\vec\alpha\cdot\vec\Gamma\right)\vec\Gamma \tag{8.3.16a}$$

$$\frac{dT_{\mathrm{depol}}}{dz} = \left(\vec\alpha\cdot\vec\Gamma\right)T_{\mathrm{depol}} \tag{8.3.16b}$$

While the depolarized transmission equation is the same once PMD is included, the cumulative PDL equation has a new term: the $\vec\beta\times\vec\Gamma$ term generates a rotation of $\vec\Gamma$ about the local birefringence vector $\vec\beta$. This rotation is to be expected since linear birefringence always generates precessional motion in Stokes space.

Table 8.3. Comparison of Pure PMD and Entangled PMD + PDL

PMD | PMD + PDL
$jU_\omega U^\dagger|p_\pm\rangle = \pm(\tau/2)\,|p_\pm\rangle$ | $jT_\omega T^{-1}|\tilde p_\pm\rangle = \pm(\chi/2)\,|\tilde p_\pm\rangle + (a_0/2)\,|\tilde p_\pm\rangle$
$jU_\omega U^\dagger = \frac{1}{2}(\vec\tau\cdot\vec\sigma)$ | $jT_\omega T^{-1} = \frac{1}{2}(\vec W\cdot\vec\sigma) + a_0/2$
$(jU_\omega U^\dagger)^\dagger = jU_\omega U^\dagger$ | $(jT_\omega T^{-1})^\dagger \neq jT_\omega T^{-1}$
$\vec\tau$, real | $\vec W$, complex
$\tau$: DGD | $\chi = \tau + j\eta$: DGD + DAS
$\tau^2 = \vec\tau\cdot\vec\tau$ | $\chi^2 = \vec W\cdot\vec W$
$\hat p_\pm = \langle p_\pm|\vec\sigma|p_\pm\rangle$ (PSP) | $\hat p_\pm = \langle\tilde p_\pm|\vec\sigma|\tilde p_\pm\rangle$ (PSP)
$(1 + \hat p_+\cdot\hat p_-)/2 = 0$ | $(1 + \hat p_+\cdot\hat p_-)/2 \neq 0$
$\partial\vec\tau/\partial z = \vec\beta_\omega + \vec\beta\times\vec\tau$ | $\partial\vec W/\partial z = \vec\beta_\omega + (\vec\beta + j\vec\alpha)\times\vec W$
8.3.4 Separation of PMD and PDL
In 1941 R. C. Jones showed that almost any Jones matrix generated by any number of retarders and partial polarizers can be reconstructed with two retarders and one partial polarizer such that

$$J(\omega) = U(\omega)\,P(\omega)\,V(\omega) \tag{8.3.17}$$

where $P$ represents a partial polarizer and $U$ and $V$ are unitary matrices [36]. In general each matrix is a function of optical frequency. The partial polarizer is a Hermitian matrix; accordingly it has real eigenvalues and perpendicular eigenvectors. Such a matrix can be decomposed as $H = S\Lambda S^\dagger$, where $S$ is a matrix whose columns are the eigenvectors of $H$ and $\Lambda$ is a diagonal matrix whose entries are the corresponding eigenvalues.
The unitary operator to the left or right of $P$ can be absorbed in the following way: decompose $P$ and absorb one of its neighbors into a unitary matrix:

$$\begin{aligned}
J(\omega) &= U S\Lambda S^\dagger V \\
&= UV\,(S^\dagger V)^\dagger\,\Lambda\,(S^\dagger V) \\
&= \tilde U(\omega)\,\tilde P(\omega)
\end{aligned} \tag{8.3.18}$$

where $\tilde U = UV$ and $\tilde P = (S^\dagger V)^\dagger\Lambda(S^\dagger V)$. Likewise, $J = P'V'$.
Consider a link composed of PMD and PDL. The transformation matrix can be written $T(\omega) = P(\omega)U(\omega)$. At any particular frequency all of the differential attenuation is concentrated in matrix $P$: the eigenvectors of $P$ determine the Stokes direction of $\vec\Gamma$ and its eigenvalues are the maximum and minimum transmission. The PDL component can be isolated from $T$ by taking advantage of the Hermitian properties of $P$:

$$TT^\dagger = P^2 = S\Lambda^2 S^\dagger \tag{8.3.19}$$

The eigenvectors of $TT^\dagger$ point in the direction of the cumulative PDL. Also, since the eigenvalues of $P$ are real, the entries in $\Lambda^2$ must be positive. Note that the magnitude and direction of the PDL vector are in general frequency dependent; the dependence is governed by the birefringence of the link, which is concentrated in $U$. This shows that even with the decomposition $PU$, the PDL and PMD remain entangled.
The unitary matrix $U$ is found once the PDL matrix $P$ is calculated from $TT^\dagger$:

$$U(\omega) = P^{-1}(\omega)\,T(\omega) \tag{8.3.20}$$

Given $U$ it is tempting to calculate the PMD properties from $jU_\omega U^\dagger$. For small PDL this form of $U$ provides a correction to $T$ for an investigator who wants to isolate the PMD effects. Both Shtengel and Karlsson have reported using this correction [33, 39]. Although suitable to remove perturbations, one should keep in mind that $\vec\tau$ generated from $jU_\omega U^\dagger$ is not the same as $\vec\tau$ generated from $jT_\omega T^{-1}$. Huttner et al. define an effective PMD $\vec\tau_{\mathrm{eff}}$ for $jU_\omega U^\dagger$ to highlight the fact that $\vec\tau$ and $\vec\tau_{\mathrm{eff}}$ are two different quantities [31].
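The separation T = PU can be sketched numerically at a single frequency. The following is a minimal illustration, assuming T is an invertible 2×2 Jones matrix: P is taken as the Hermitian square root of TT† obtained from its eigen-decomposition, and U = P⁻¹T. The function name `separate_pdl` and the test matrix are hypothetical, and the PDL in decibels is computed here from the eigenvalues of TT†, that is, from power transmissions.

```python
import numpy as np

def separate_pdl(T):
    """Split T into a Hermitian P and a unitary U with T = P U, via T T^dagger = P^2."""
    lam_sq, S = np.linalg.eigh(T @ T.conj().T)   # real eigenvalues, orthonormal columns
    P = S @ np.diag(np.sqrt(lam_sq)) @ S.conj().T
    U = np.linalg.inv(P) @ T
    pdl_db = 10.0 * np.log10(lam_sq.max() / lam_sq.min())
    return P, U, pdl_db

# Hypothetical link: a rotation followed by a partial polarizer with 2:1 amplitude ratio.
c, s = np.cos(0.3), np.sin(0.3)
U0 = np.array([[c, -s], [s, c]], dtype=complex)
T = np.diag([1.0, 0.5]).astype(complex) @ U0

P, U, pdl_db = separate_pdl(T)
print(np.round(pdl_db, 2))
```

Repeating the decomposition on a grid of frequencies would give U(ω), from which the effective PMD can be estimated by finite differences of U; as noted above, that quantity is not the same as the one obtained from the full matrix T.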

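Table 8.4. Table of Important SOP, PDL, and PMD Relations

Evolution Equations

SOP: $\dfrac{d\hat s}{dz} = \vec\beta\times\hat s$

PDL: $\dfrac{d\vec\Gamma}{dz} = \vec\alpha - \left(\vec\alpha\cdot\vec\Gamma\right)\vec\Gamma$, $\quad\dfrac{dT_{\mathrm{depol}}}{dz} = \left(\vec\alpha\cdot\vec\Gamma\right)T_{\mathrm{depol}}$

PMD: $\dfrac{d\vec\tau}{dz} = \vec\beta_\omega + \vec\beta\times\vec\tau$, $\quad\dfrac{d\hat s}{d\omega} = \vec\tau\times\hat s$

PMD+PDL: $\dfrac{d\vec W}{dz} = \vec\beta_\omega + \left(\vec\beta + j\vec\alpha\right)\times\vec W$, $\quad\dfrac{d\vec\Gamma}{dz} = \vec\beta\times\vec\Gamma + \vec\alpha - \left(\vec\alpha\cdot\vec\Gamma\right)\vec\Gamma$, $\quad\dfrac{dT_{\mathrm{depol}}}{dz} = \left(\vec\alpha\cdot\vec\Gamma\right)T_{\mathrm{depol}}$, $\quad\dfrac{d\hat s}{d\omega} = \vec\Omega_r\times\hat s - \left(\vec\Omega_i\times\hat s\right)\times\hat s$

Defining Expressions

SOP: $|t\rangle = U|s\rangle$

PDL: $T_p = \langle t|t\rangle = \langle s|P^\dagger P|s\rangle$, $\quad\rho_{\mathrm{dB}} \equiv 10\log_{10}\!\left(\dfrac{T_{\max}}{T_{\min}}\right)$

PMD: $jU_\omega U^\dagger|p_\pm\rangle = \pm(\tau/2)\,|p_\pm\rangle$, $\quad jU_\omega U^\dagger = \frac{1}{2}\left(\vec\tau\cdot\vec\sigma\right)$, $\quad\vec\tau\times = R_\omega R^\dagger$

PMD+PDL: $\left(\vec W\cdot\vec\sigma\right)|\tilde p_\pm\rangle = \pm(\tau + j\eta)\,|\tilde p_\pm\rangle$

PMD Concatenation

$\vec\tau = \sum_{k=1}^{n} R(n, k+1)\,\vec\tau_k$

$\vec\tau_\omega = \sum_{k=1}^{n} R(n, k+1)\left(\vec\tau_{k\omega} + \vec\tau_k\times\vec\tau(k)\right)$

$R(n, k) = R_n R_{n-1}\cdots R_k$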

References

381

References
1. D. Andresciani, F. Curti, F. Matera, and B. Daino, "Measurement of the group-delay difference between the principal states of polarization on a low-birefringent terrestrial fiber cable," Optics Letters, vol. 12, no. 10, pp. 844–846, 1987.
2. A. J. Barlow, "Birefringence and polarization mode dispersion in spun single mode fibers," Applied Optics, vol. 20, no. 17, p. 2962, 1981.
3. P. Ciprut, B. Gisin, N. Gisin, R. Passy, J. Weid, F. Prieto, and C. W. Zimmer, "Second-order polarization mode dispersion: Impact on analog and digital transmissions," Journal of Lightwave Technology, vol. 16, no. 5, pp. 757–771, May 1998.
4. F. Curti, B. Daino, Q. Mao, F. Matera, and C. G. Someda, "Concatenation of polarization dispersion in single-mode fibres," Electronics Letters, vol. 14, no. 4, pp. 290–291, 1989.
5. J. N. Damask, "Methods to construct programmable PMD sources, Part I: Technology and theory," Journal of Lightwave Technology, vol. 22, no. 4, pp. 997–1005, Apr. 2004.
6. J. N. Damask, P. R. Myers, A. Boschi, and G. J. Simer, "Demonstration of a coherent PMD source," IEEE Photonics Technology Letters, vol. 15, no. 11, pp. 1612–1614, Nov. 2003.
7. E. Desurvire, Erbium-Doped Fiber Amplifiers, Principles and Applications. Hoboken, New Jersey: Wiley-Interscience, 2002.
8. A. Eyal, D. Kuperman, O. Dimenstein, and M. Tur, "Polarization dependence of the intensity modulation transfer function of an optical system with PMD and PDL," IEEE Photonics Technology Letters, vol. 14, no. 11, pp. 1515–1517, Nov. 2002.
9. A. Eyal, W. K. Marshall, M. Tur, and A. Yariv, "Representation of second-order polarization mode dispersion," Electronics Letters, vol. 35, no. 19, pp. 1658–1659, 1999.
10. A. Eyal and M. Tur, "A modified Poincaré sphere technique for the determination of polarization-mode dispersion in the presence of differential gain/loss," in Tech. Dig., Optical Fiber Communications Conference (OFC'98), San Jose, CA, Feb. 1998, paper ThR1, p. 340.
11. R. Feced, S. J. Savory, and A. Hadjifotiou, "Interaction between polarization mode dispersion and polarization-dependent losses in optical communication links," Journal of the Optical Society of America B, vol. 20, no. 3, pp. 424–433, Mar. 2003.
12. E. Forestieri and L. Vincetti, "Exact evaluation of the Jones matrix of a fiber in the presence of polarization mode dispersion of any order," Journal of Lightwave Technology, vol. 19, no. 12, pp. 1898–1909, 2001.
13. C. Francia, F. Bruyere, D. Penninckx, and M. Chbat, "PMD second-order effects on pulse propagation in single-mode optical fibers," IEEE Photonics Technology Letters, vol. 10, no. 12, pp. 1739–1741, Dec. 1998.
14. C. Francia and D. Penninckx, "Polarization mode dispersion in single-mode optical fibers: Time impulse response," IEEE International Conference on Communications, vol. 3, no. 6-10, pp. 1731–1735, June 1999.
15. N. Frigo, private communication, 2003.
16. N. Frigo, "A generalized geometric representation of coupled mode theory," IEEE Journal of Quantum Electronics, vol. QE-22, no. 11, pp. 2131–2140, 1986.

17. N. Gisin, "Statistics of polarization dependent loss," Optics Communications, vol. 114, pp. 399–405, Feb. 1995.
18. N. Gisin and B. Huttner, "Combined effects of polarization mode dispersion and polarization dependent losses in optical fibers," Optics Communications, vol. 142, pp. 119–125, Oct. 1997.
19. N. Gisin and J. P. Pellaux, "Polarization mode dispersion: Time versus frequency domains," Optics Communications, vol. 89, pp. 316–323, May 1992.
20. J. P. Gordon and H. Kogelnik, "PMD fundamentals: Polarization mode dispersion in optical fibers," Proceedings of National Academy of Sciences, vol. 97, no. 9, pp. 4541–4550, Apr. 2000. [Online]. Available: http://www.pnas.org
21. J. P. Gordon and H. Kogelnik, "PMD fundamentals: Polarization mode dispersion in optical fibers; Appendix B: Relation between PMD vectors t and w," Proceedings of National Academy of Sciences, vol. 97, no. 9, pp. 4541–4550, Apr. 2000, supplemental Appendix. [Online]. Available: http://www.pnas.org
22. H. A. Haus, Waves and Fields in Optoelectronics. Englewood Cliffs, New Jersey: Prentice-Hall, 1984.
23. B. L. Heffner, "Single-mode propagation of mutual temporal coherence: Equivalence of time and frequency measurements of polarization-mode dispersion," Optics Letters, vol. 19, no. 15, pp. 1104–1106, Aug. 1994.
24. B. L. Heffner, "Optical pulse distortion measurement limitations in linear time invariant systems, and applications to polarization mode dispersion," Optics Communications, vol. 115, pp. 45–51, Mar. 1995.
25. B. L. Heffner, "Influence of optical source characteristics on the measurement of polarization-mode dispersion of highly mode-coupled fibers," Optics Letters, vol. 21, no. 2, pp. 113–115, Jan. 1996.
26. F. Heismann, "Accurate Jones matrix expansion for all orders of polarization mode dispersion," Optics Letters, vol. 28, no. 11, pp. 2013–2015, Nov. 2003.
27. F. Heismann and M. S. Whalen, "Fast automatic polarization control system," IEEE Photonics Technology Letters, vol. 4, no. 5, pp. 503–505, May 1992.
28. F. Heismann, "Analysis of a reset-free polarization controller for fast automatic polarization stabilization in fiber-optic transmission systems," Journal of Lightwave Technology, vol. 12, no. 4, pp. 690–699, Apr. 1994.
29. F. Heismann, D. A. Fishman, and D. L. Wilson, "Automatic compensation of first order polarization mode dispersion in a 10 Gb/s transmission system," in European Conference on Optical Communication (ECOC'98), vol. 1, Sept. 1998, pp. 529–530.
30. B. Huttner, C. D. Barros, B. Gisin, and N. Gisin, "Polarization-induced pulse spreading in birefringent optical fibers with zero differential group delay," Optics Letters, vol. 24, no. 6, pp. 370–372, Mar. 1999.
31. B. Huttner, C. Geiser, and N. Gisin, "Polarization-induced distortions in optical fiber networks with polarization-mode dispersion and polarization-dependent losses," IEEE Journal of Selected Topics in Quantum Electronics, vol. 6, no. 2, pp. 317–329, Mar. 2000.
32. B. Huttner and N. Gisin, "Anomalous pulse spreading in birefringent optical fibers with polarization-dependent loss," Optics Letters, vol. 22, pp. 504–507, Apr. 1997.
33. E. Ibragimov, G. Shtengel, and S. Suh, "Statistical correlation between first and second order PMD," Journal of Lightwave Technology, vol. 20, no. 4, pp. 586–590, 2002.

34. "Fibre optic interconnecting devices and passive components - Basic test and measurement procedures - Part 3-12: Examinations and measurements - Polarization dependence of attenuation of a single-mode fibre optic component: Matrix calculation method," International Electrotechnical Commission Std. IEC 61 300-3-12, 1997. [Online]. Available: https://www.iec.ch/
35. "Fibre optic interconnecting devices and passive components - Basic test and measurement procedures - Part 3-2: Examinations and measurements - Polarization dependence of attenuation in a single-mode fibre optic device," International Electrotechnical Commission Std. IEC 61 300-3-2, 1999. [Online]. Available: https://www.iec.ch/
36. R. Jones, "A new calculus for the treatment of optical systems, Part II. Proof of three general equivalence theorems," Journal of the Optical Society of America, vol. 31, no. 7, pp. 493–499, July 1941.
37. I. P. Kaminow, "Polarization in optical fibers," IEEE Journal of Quantum Electronics, vol. QE-17, no. 1, pp. 15–22, 1981.
38. M. Karlsson, "Polarization mode dispersion-induced pulse broadening in optical fibers," Optics Letters, vol. 23, no. 9, pp. 688–690, May 1998.
39. M. Karlsson, J. Brentel, and P. A. Andrekson, "Long-term measurement of PMD and polarization drift in installed fibers," Journal of Lightwave Technology, vol. 18, no. 7, pp. 941–951, 2000.
40. H. Kogelnik, L. E. Nelson, and J. P. Gordon, "Emulation and inversion of polarization-mode dispersion," Journal of Lightwave Technology, vol. 21, no. 2, pp. 482–495, 2003.
41. H. Kogelnik, L. Nelson, J. P. Gordon, and R. Jopson, "Jones matrix for second-order polarization mode dispersion," Optics Letters, vol. 25, no. 1, pp. 19–21, 2000.
42. D. Kuperman, A. Eyal, O. Mor, S. Traister, and M. Tur, "Measurement of the input states of polarization that maximize and minimize the eye opening in the presence of PMD and PDL," IEEE Photonics Technology Letters, vol. 15, no. 10, pp. 1425–1427, Oct. 2003.
43. Y. Li and A. Yariv, "Solutions to the dynamical equation of polarization-mode dispersion and polarization-dependent losses," Journal of the Optical Society of America B, vol. 17, no. 11, pp. 1821–1827, Nov. 2000.
44. A. Mecozzi and M. Shtaif, "Signal to noise ratio degradation caused by polarization dependent loss and the effect of dynamic gain equalization," Journal of Lightwave Technology, 2004, accepted for publication.
45. C. Menyuk, D. Wang, and A. Pilipetskii, "Repolarization of polarization-scrambled optical signals due to polarization dependent loss," IEEE Photonics Technology Letters, vol. 9, no. 9, pp. 1247–1249, Sept. 1997.
46. S. M. R. M. Nezam, J. E. McGeehan, and A. E. Willner, "Theoretical and experimental analysis of the dependence of a signal's degree of polarization on the optical data spectrum," Journal of Lightwave Technology, vol. 22, no. 3, pp. 763–772, Mar. 2004.
47. A. Orlandini and L. Vincetti, "A simple and useful model for Jones matrix to evaluate higher order polarization-mode dispersion effects," IEEE Photonics Technology Letters, vol. 13, no. 11, pp. 1176–1178, 2001.
48. A. Orlandini and L. Vincetti, "Comparison of the Jones matrix analytical models applied to optical system affected by high-order PMD," Journal of Lightwave Technology, vol. 21, no. 6, pp. 1456–1464, 2003.

49. D. Penninckx and V. Morenas, "Jones matrix of polarization mode dispersion," Optics Letters, vol. 24, no. 13, pp. 875–877, July 1999.
50. D. L. Peterson, B. C. Ward, K. B. Rochford, P. J. Leo, and G. Simer, "Polarization mode dispersion compensator field trial and field fiber characterization," Optics Express, vol. 10, no. 14, pp. 614–621, July 2002. [Online]. Available: http://www.opticsexpress.org/
51. C. D. Poole and C. R. Giles, "Polarization-dependent pulse compression and broadening due to polarization dispersion in dispersion-shifted fiber," Optics Letters, vol. 13, no. 2, pp. 155–157, 1988.
52. C. D. Poole and R. E. Wagner, "Phenomenological approach to polarization mode dispersion in long single-mode fibers," Electronics Letters, vol. 22, no. 19, pp. 1029–1030, 1986.
53. C. D. Poole, J. H. Winters, and J. A. Nagel, "Dynamical equation for polarization dispersion," Optics Letters, vol. 16, no. 6, pp. 372–374, 1991.
54. C. D. Poole and D. L. Favin, "Polarization-mode dispersion measurements based on transmission spectra through a polarizer," Journal of Lightwave Technology, vol. 12, no. 6, pp. 917–929, 1994.
55. S. C. Rashleigh, "Origins and control of polarization effects in single-mode fiber," Journal of Lightwave Technology, vol. LT-1, no. 2, pp. 312–331, 1983.
56. S. C. Rashleigh and R. Ulrich, "Polarization mode dispersion in single mode fibers," Optics Letters, vol. 3, no. 2, pp. 60–62, 1978.
57. W. Shieh, "Principal states of polarization for an optical pulse," IEEE Photonics Technology Letters, vol. 11, no. 6, pp. 677–679, June 1999.
58. W. Shieh and H. Kogelnik, "Dynamic eigenstates of polarization," IEEE Photonics Technology Letters, vol. 13, pp. 40–42, 2001.
59. M. Shtaif and A. Mecozzi, "Polarization-dependent loss and its effect on the signal-to-noise ratio in fiber-optic systems," IEEE Photonics Technology Letters, vol. 16, no. 2, pp. 671–673, Feb. 2004.
60. "Measurement of Polarization Dependent Loss (PDL) of Single-Mode Fiber Optic Components," Telecommunications Industry Association Std. TIA/EIA-455-157, 2000. [Online]. Available: http://www.tiaonline.org/standards/
61. S.-C. Wang, private communication, 2002. Insensitivity of Fourier phase to first and last section birefringent phase was first identified by Dr. Wang.
62. C. Xie and L. F. Mollenauer, "Performance degradation induced by polarization-dependent loss in optical fiber transmission systems with and without polarization-mode dispersion," Journal of Lightwave Technology, vol. 21, no. 9, pp. 1953–1957, Sept. 2003.
63. C. Xie, L. F. Mollenauer, and L. Moller, "Pulse distortion induced by polarization-mode dispersion and polarization-dependent loss in lightwave transmission systems," IEEE Photonics Technology Letters, vol. 15, no. 8, pp. 1073–1075, Aug. 2003.
64. M. Yoshida-Dierolf and V. Dierolf, "Analytical form of frequency dependence of DGD in concatenated single-mode fiber systems," Journal of Lightwave Technology, vol. 21, no. 10, pp. 2217–2223, Oct. 2003.

9
Statistical Properties of Polarization in Fiber

The topic of this chapter is the statistics of polarization, polarization-mode dispersion, and polarization-dependent loss, all in relation to behavior in single-mode fibers. The origin of these statistical properties is the birefringence within the mode-field diameter of the fiber. If a perfectly isotropic fiber existed, the polarization state at the output would match that at the input for all time, frequency, temperature, and length. This, however, is not the case. As illustrated in Fig. 9.1, there are several causes of fiber birefringence, both intrinsic and extrinsic. While an isotropic fiber is stress-free and has a perfectly concentric core (a), perturbations such as core ovality (b), stress-induced index gradient (c), and micro-bubbles (d) introduce birefringence at any cross-section of the fiber. The existence of fiber birefringence was well known, and well studied, by the 1970s [29, 53, 54]. The length-scale of the birefringence is called the birefringent beat length: $L_B = \lambda_0/\Delta n$, where $\lambda_0$ is the free-space wavelength and $\Delta n$ is the birefringence. Typical birefringence values of early 1990s fiber are $\Delta n/n \sim 10^{-7}$, which corresponds to $L_B \sim 20$ m at 1550 nm.
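The beat-length relation can be checked with a few lines of arithmetic. The sketch below is only illustrative: the common index of 1.45 is an assumed value for silica and does not come from the text, and the helper name `beat_length` is hypothetical. With these assumptions the result is of order ten metres, the same order of magnitude as the value quoted above.

```python
def beat_length(wavelength_m, delta_n):
    """Birefringent beat length L_B = lambda_0 / delta_n, in metres."""
    if delta_n <= 0:
        raise ValueError("delta_n must be positive")
    return wavelength_m / delta_n


# Assumed illustrative numbers (the common index 1.45 is not from the text).
wavelength = 1550e-9             # free-space wavelength, m
common_index = 1.45              # assumed common refractive index of silica
delta_n = 1e-7 * common_index    # birefringence from delta_n / n ~ 1e-7

l_b = beat_length(wavelength, delta_n)
print(f"beat length ~ {l_b:.1f} m")
```

Because $L_B$ scales inversely with $\Delta n$, a factor of two in the assumed birefringence changes the beat length by the same factor.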
Birefringence, however, is only one contributing factor to the statistical behavior. A second factor is the fiber autocorrelation length $L_C$, which is the characteristic length over which the birefringent axis of the fiber changes. While the models used in the following are cast in the continuous limit, at the discrete level one can think of a fiber of length $L$ segmented into $N$ parts $L_C$ long, where $N = L/L_C$. Within a segment the birefringence and its orientation are fixed, and at the junctions between adjacent segments the birefringent axis changes abruptly. In the continuous limit, $L_C$ represents the characteristic length over which the randomly evolving birefringence loses memory of its preceding orientations. A typical value of $L_C$ is 100 m.
The relationship between the fiber autocorrelation length $L_C$ and the birefringent beat length $L_B$, and between $L_C$ and the fiber length $L$, determines the regime in which the polarization ensembles behave. The principal regimes are illustrated in Fig. 9.2. In the first limit $L_C \gg L_B$, Fig. 9.2(a), the slow evolution of the birefringent axis along the fiber allows the optical field of the signal to follow. In the second limit $L_C \ll L_B$, Fig. 9.2(b), the birefringent axis changes too quickly for the optical field to follow; the result is a long-range average over the range of orientations. This effect is exploited to manufacture ultra-low PMD fiber: the fiber preform is spun during the drawing process at a rate designed to ensure $L_C \ll L_B$ [43]. For instance, Chen et al. [6] report a spin period of 1 m and a beat length of 10 m in their fiber. Higher resolution measurements are reported by Pietralunga et al. [46] and Galtarossa et al. [20]. To be sure, this is not a perfect cure, as one must still include some length-scale for variation of the spin profile (this additional factor is illustrated in Fig. 9.2(c)), but the effective birefringence of the fiber is reduced by an order of magnitude.

Fig. 9.1. Sources of birefringence within a cross-section of single-mode fiber. a) Perfect, isotropic fiber (concentric core). b) Ovality of the core. c) Stress-induced index gradient (strain field). d) Physical defects like micro-bubbles or impurity concentrations.
The relationship between the autocorrelation length $L_C$ and the total fiber length determines how the polarization-mode dispersion behaves. There are two extreme regimes, that of low (or weak) mode coupling and that of high (or strong) mode coupling (Fig. 9.2(d)). In the low mode-coupling regime, variation of birefringence orientation is low so the mean PMD increases linearly with length. In the high mode-coupling regime, the birefringence orientation is random beyond a correlation length so the mean PMD increases as the square root of the length. As a practical matter, since square-root growth is slower than linear, one would like to reach this regime as quickly as possible. As shown in the following, the ratio $L/L_C$ determines the regime; a low autocorrelation length $L_C$ pushes a fiber toward high mode-coupling and root-length growth of the mean PMD.
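The two growth laws can be illustrated with a toy calculation. The sketch below is not the continuous model developed in this chapter: it assumes, purely for illustration, that each segment of length $L_C$ contributes a PMD vector of fixed magnitude, and that in the strong-coupling limit these vectors point in independent, uniformly random directions, while in the weak-coupling limit they are all aligned. The function names are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Direction drawn uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def rms_dgd_strong_coupling(num_segments, tau_segment=1.0, trials=2000, seed=1):
    """RMS length of the sum of num_segments randomly oriented vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sx = sy = sz = 0.0
        for _ in range(num_segments):
            x, y, z = random_unit_vector(rng)
            sx += tau_segment * x
            sy += tau_segment * y
            sz += tau_segment * z
        total += sx * sx + sy * sy + sz * sz
    return math.sqrt(total / trials)


def dgd_weak_coupling(num_segments, tau_segment=1.0):
    """Aligned segment vectors simply add: linear growth with length."""
    return num_segments * tau_segment


for n in (25, 100):
    print(n, dgd_weak_coupling(n), round(rms_dgd_strong_coupling(n), 2))
```

Quadrupling the number of segments quadruples the aligned sum but only doubles the rms length of the randomly oriented sum, which is the root-length behavior described above.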
A historical anecdote conveys the importance of the fiber autocorrelation length. Early fiber-transmission and characterization studies were done in the research lab, where fiber is held on spools. When C. D. Poole first measured a fiber spooled and then unspooled, he found that the PMD increased five-fold. This is an instance where the correlation length $L_C$ is small on the spool, due to inhomogeneities of bending strain, and large unspooled. Indeed, de Lignie et al. report spooled and cabled measurements circa 1994 where they showed $L_C \sim 5$ m on the spool and $L_C \sim 500$ m cabled [9]. Their measurements from fiber to fiber show a wide range of values.

Fig. 9.2. Relationship between length scales within a single-mode fiber. a) Adiabatic regime $L_C \gg L_B$: the field follows the changing birefringence. b) Field-average regime $L_C \ll L_B$: the field cannot follow the changing birefringence vector and instead averages over the variation. c) Model for spun fibers where a third length scale $L_S$, the range over which the spin profile changes, is added. d) Low- and high-mode coupled PMD regimes: $L_C$ in comparison with the total fiber length $L$; $\langle\tau\rangle \propto z$ for low mode coupling and $\langle\tau\rangle \propto z^{1/2}$ for high mode coupling.

For practical systems and design, there are three dimensions along which one would like to derive polarization-related statistics: propagation length, optical frequency, and time. In each case a statistical process must be defined to characterize the evolution on a microscopic level. The PMD evolution equation over length is well defined and the probability density converges in the limit of large ensembles. The ergodic nature of PMD lets length be replaced by optical frequency in the density functions, with long length replaced by wide bandwidth. The PMD autocorrelation function connects the length and frequency regimes. There is, however, no definite process for the time evolution. Submarine cable changes at a slow rate while aerial fiber changes in the millisecond range. Moreover, there is likely no spatial homogeneity to the temporal changes (for instance, a train may cross a cable at a particular location), so one cannot expect a neat answer. When the temporal changes are spatially homogeneous, P. J. Leo et al. have developed a Rayleigh-distribution model of SOP change that is useful to define what "speed of change" means [32].

Statistics for polarization, PMD, and PDL are derived in the following using diffusion processes. The physics of a diffusion process is first captured by a stochastic differential equation (SDE) and then translated to its partial-differential equation (PDE) analogue. Exposition of these mathematical tools is beyond the scope of this text and the reader is referred to Arnold and Oksendal for SDEs [1, 44], and Risken for PDEs [55]. Finally, Davenport is an invaluable reference on applied probability [8].

9.1 Polarization Evolution Model


An optical mode confined within a fiber propagates only along one dimension; denote this direction $z$. The longitudinal electric field will propagate according to the time-harmonic factor $\exp\left(-jzk_0\sqrt{\epsilon_r}\right)$, where $k_0$ is the free-space wavenumber and $\epsilon_r$ is the relative permittivity of the fiber. The Helmholtz equation for the evolution of the electric field $\vec{E}$ in the plane perpendicular to $z$ is therefore

$$\left(\frac{d^2}{dz^2} + k_0^2\,\epsilon_r\right)\vec{E} = 0$$

where $\epsilon_r$ is written in tensor form in anticipation of the following. The common permittivity $\bar\epsilon_r$ may be separated from the differential part such that

$$\epsilon_r = \bar\epsilon_r + \tfrac{1}{2}\,\Delta\epsilon_r\left(\hat\beta\cdot\vec\sigma\right) \tag{9.1.1}$$

When the propagation is lossless, the common permittivity is the square of the refractive index: $\bar\epsilon_r = n^2$. The common phase is removed from $\vec{E}$ to isolate the polarization state evolution using the factorization

$$\vec{E} = \exp\left(-j\int_0^z k_0\,n(z')\,dz'\right)|s\rangle$$

where the integral simply accounts for the cumulative change of common index over the path; if the common index is fixed, the integral reduces to the more customary exponential phase factor. Substitution of this factorization and (9.1.1) into the wave equation, and dropping terms that are second-order in $|s\rangle$, gives

$$\left(\frac{d}{dz} + jk_0\,\frac{\Delta\epsilon_r}{4n}\,\hat\beta\cdot\vec\sigma\right)|s\rangle = 0$$
In the absence of polarization-dependent loss, $\Delta\epsilon_r$ is real and its magnitude to first-order in $\Delta n$ is

$$\Delta\epsilon_r = n_+^2 - n_-^2 = (n + \Delta n/2)^2 - (n - \Delta n/2)^2 \simeq 2n\,\Delta n$$

As a matter of notation, the magnitude of the birefringent vector $\vec\beta$ is defined as

$$\beta = |\vec\beta| = k_0\,\Delta n = \frac{\omega\,\Delta n}{c} \tag{9.1.2}$$

where the $\Delta$ on $\beta$ has been dropped for convenience. Moreover, the birefringent beat length $L_B = \lambda_0/\Delta n$ is related to the birefringence as

$$L_B = 2\pi/\beta \tag{9.1.3}$$

With these definitions in hand, the polarization state evolves in the fiber according to [25]

$$\left(\frac{d}{dz} + \frac{j}{2}\,\vec\beta\cdot\vec\sigma\right)|s\rangle = 0 \tag{9.1.4}$$

The birefringence vector $\vec\beta$ is the local birefringence at any position along the fiber. Equation (9.1.4) describes the response of the polarization state due to the local birefringence. Converting to Stokes space, the differential equation of motion is

$$\frac{d\hat{s}}{dz} = \vec\beta\times\hat{s} \tag{9.1.5}$$

As expected, the polarization state precesses about the local birefringent axis at a rate governed by the strength of the birefringence.
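For a birefringence vector that is constant along $z$, the precession equation (9.1.5) has a closed-form solution: a rotation of $\hat{s}$ about $\vec\beta$ through the angle $\beta z$. The sketch below applies the Rodrigues rotation formula to show that the state returns to its starting point after one beat length $L_B = 2\pi/\beta$. The function names and the 20 m beat length are illustrative choices, not taken from the text.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def precess(beta_vec, s0, z):
    """Solve ds/dz = beta x s for constant beta: rotate s0 about beta by |beta| z."""
    beta = math.sqrt(sum(c * c for c in beta_vec))
    if beta == 0.0:
        return tuple(s0)
    k = tuple(c / beta for c in beta_vec)      # unit rotation axis
    angle = beta * z
    c, s = math.cos(angle), math.sin(angle)
    k_cross_s = cross(k, s0)
    k_dot_s = sum(ki * si for ki, si in zip(k, s0))
    # Rodrigues formula: s cos(a) + (k x s) sin(a) + k (k.s)(1 - cos(a))
    return tuple(s0[i] * c + k_cross_s[i] * s + k[i] * k_dot_s * (1.0 - c)
                 for i in range(3))


beat_len = 20.0                                   # illustrative beat length, m
beta_vec = (2.0 * math.pi / beat_len, 0.0, 0.0)   # birefringence along s1
s_start = (0.0, 1.0, 0.0)

s_quarter = precess(beta_vec, s_start, beat_len / 4.0)
s_full = precess(beta_vec, s_start, beat_len)
print(s_quarter, s_full)
```

The length of the Stokes vector is preserved by construction, since the solution is a pure rotation.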
Wai and Menyuk propose two models of how the local birefringence varies along an unspun fiber [40, 60, 62]. In both models the fiber exhibits no chirality:

No circular birefringence: $\beta_3 = 0$

This assertion has been experimentally verified for such fibers [21], while spun fibers show evidence of residual chirality [27]. The calculations that follow use the no-chirality assumption, while models for spun fibers can be found in [47]. Without a chiral factor, the birefringent matrix is

$$\vec\beta\cdot\vec\sigma = \begin{pmatrix}\beta_1 & \beta_2\\ \beta_2 & -\beta_1\end{pmatrix} \tag{9.1.6}$$

In their first model, the birefringence magnitude is fixed and the angle on the Poincaré equator randomly varies. In their second model, the cartesian birefringent components $(\beta_1, \beta_2)$ are independent random variables. Both models give the correct evolution of the mean-square DGD, but the latter model, while a bit more involved, generates aperiodic PMD spectra.
9.1.1 Random Birefringent Orientation

For this first model, the birefringent matrix is

$$\vec\beta\cdot\vec\sigma = \beta\begin{pmatrix}\cos\theta & \sin\theta\\ \sin\theta & -\cos\theta\end{pmatrix} \tag{9.1.7}$$

where $\theta$ is in Stokes space. The angle is modelled as a Brownian motion on $\mathbb{R}^1$ according to

$$\frac{d\theta}{dz} = g_\theta(z), \qquad \langle g_\theta(z)\rangle = 0, \qquad \langle g_\theta(z)\,g_\theta(z')\rangle = \sigma_\theta^2\,\delta(z - z') \tag{9.1.8}$$

where $g_\theta$ is a white-noise stochastic process. Physically, the angle is subject to impulsive kicks as $z$ increases, which in turn drives a random walk in angle. The impulsive nature of the kicks means that there is no memory or correlation from one kick to the next; each is random in its own right.
The Brownian probability density of $\theta$ subject to this motion is

$$\rho_\theta(\theta, z) = \frac{1}{\sqrt{2\pi\sigma_\theta^2 z}}\exp\left(-\frac{\theta^2}{2\sigma_\theta^2 z}\right)$$

where, as is characteristic of this process, the variance increases linearly with length: $\mathrm{var}(\theta) = \sigma_\theta^2 z$. The strength of the white noise $g_\theta$, $\sigma_\theta^2$, is now apparent: the stronger the noise, the shorter the fiber length necessary to reach a nearly uniform angular distribution on $[-\pi/2, \pi/2]$.

As with any random walk, eventually there is complete loss of correlation between some earlier position and the present. In this case, the fiber autocorrelation length $L_C$ is defined as the length over which the angle loses correlation. The autocorrelation of $\theta$ is calculated by the expectation value of $\cos\theta(z)$:

$$E\left[\cos\theta(z)\right] = \int_{\mathbb{R}}\cos(\theta)\,\rho_\theta(\theta, z)\,d\theta = \exp\left(-\frac{\sigma_\theta^2 z}{2}\right) \tag{9.1.9}$$

The autocorrelation length $L_C$ is the length at which the autocorrelation falls to $e^{-1}$; therefore,

$$\sigma_\theta^2 = \frac{2}{L_C} \tag{9.1.10}$$
With this identification, the evolution of the probability density can be written in terms of $L_C$:

$$\rho_\theta(\theta, z) = \frac{1}{\sqrt{4\pi z/L_C}}\exp\left(-\frac{\theta^2}{4z/L_C}\right) \tag{9.1.11}$$

This distribution represents a diffusion of the angle $\theta$ with fiber length (Fig. 9.3). At an initial position the angle is known with certainty; propagating away from this point increases the uncertainty of the angle.
As the fiber length increases, the diffusion of $\theta$ approaches a uniform distribution. In fact, only the angle modulo $\pi$ matters. Consider a starting angle of $\theta = 0$; at some distance $z/L_C$ the density at $\theta = \pi/2$ will be within $1 - \epsilon$ of uniform. Accounting for the wrap-around of the distribution modulo $\pi$, the length needed to fall within this error is $z/L_C = \pi^2/(8\epsilon)$. For instance, it takes about $3L_C$ to realize a 5% deviation from a uniform distribution. Once the distribution is uniform there is no memory of the initial state.
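The decay of the autocorrelation in (9.1.9) is easy to reproduce numerically. The following sketch is a simple Monte Carlo estimate, assuming the random walk is discretized into independent gaussian angle increments with the noise strength fixed by (9.1.10); lengths are measured in units of $L_C$, and the function name, step count, path count, and seed are arbitrary illustrative choices.

```python
import math
import random


def mean_cos_theta(z_over_lc, steps=40, paths=10000, seed=7):
    """Monte Carlo estimate of E[cos(theta(z))] for the Brownian angle model."""
    rng = random.Random(seed)
    sigma2_lc = 2.0                      # sigma_theta^2 * L_C, from (9.1.10)
    dz = z_over_lc / steps               # step length in units of L_C
    step_std = math.sqrt(sigma2_lc * dz) # std of one gaussian angle increment
    acc = 0.0
    for _ in range(paths):
        theta = 0.0
        for _ in range(steps):
            theta += rng.gauss(0.0, step_std)
        acc += math.cos(theta)
    return acc / paths


for z in (0.5, 1.0, 2.0):
    print(z, round(mean_cos_theta(z), 3), round(math.exp(-z), 3))
```

Within statistical error the estimate tracks $\exp(-z/L_C)$, so the correlation has fallen to $e^{-1}$ after one autocorrelation length.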

Fig. 9.3. Spatial evolution of the birefringent-angle probability density $\rho_\theta(\theta, z)$. At any position $z_0$ the density is a gaussian. On a length scale $z > L_C$ the density converges to a uniform distribution on $[-\pi/2, \pi/2]$.

9.1.2 Random Component Birefringence


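For the second model, the entries of the birefringence matrix (9.1.6) are treated as independent Langevin processes:

$$\frac{d\beta_1}{dz} = -L_C^{-1}\,\beta_1 + g_1(z) \tag{9.1.12a}$$

$$\frac{d\beta_2}{dz} = -L_C^{-1}\,\beta_2 + g_2(z) \tag{9.1.12b}$$

The characteristics of the noise sources are

$$\langle g_1(z)\rangle = \langle g_2(z)\rangle = 0, \qquad \langle g_1(z)\,g_2(z')\rangle = 0 \tag{9.1.13a}$$

$$\langle g_1(z)\,g_1(z')\rangle = \langle g_2(z)\,g_2(z')\rangle = \frac{\langle\beta^2\rangle}{L_C}\,\delta(z - z') \tag{9.1.13b}$$

where $\langle\beta^2\rangle^{1/2}$ is the rms value of the birefringence.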
A Langevin equation describes a mean-reverting process driven by noise. Consider (9.1.12a) in the absence of $g_1(z)$: any initial condition decays exponentially over the characteristic length $L_C$. Turning the noise source on disrupts this deterministic motion. The minus sign of the deterministic coefficient $-L_C^{-1}$ makes this term act as a linear spring, pulling the solution back toward the mean with a strength proportional to the deviation. In the steady state, this spring force balances the noise source, resulting in a stationary gaussian distribution.
The solution to (9.1.12) is

$$\beta_i(z) = \beta_i(0)\,e^{-z/L_C} + \int_0^z e^{-(z - z')/L_C}\,g_i(z')\,dz'$$

The initial condition $\beta_i(0)$ decays exponentially on a scale given by the fiber autocorrelation length $L_C$. In the regime $z \gg L_C$ there is no memory of the initial state and a stationary distribution is reached. A two-dimensional sample path of the birefringence is illustrated in Fig. 9.4. The steady-state density of $\beta_i$ is readily determined by solution of the associated Fokker-Planck equation, and is

$$\rho_{\beta_i}(v) = \frac{1}{\sqrt{\pi\langle\beta^2\rangle}}\exp\left(-\frac{v^2}{\langle\beta^2\rangle}\right) \tag{9.1.14}$$

The variance of each component is $\mathrm{var}(\beta_i) = \langle\beta^2\rangle/2$, which is independent of $z$. Moreover, as detailed in Appendix D, the radial and angular distributions of the local birefringence vector $\vec\beta = \hat{x}\beta_1 + \hat{y}\beta_2$ are Rayleigh and uniform distributions, respectively. Finally, the second moment of the Rayleigh distribution is $\langle\beta^2\rangle$, which is what is expected on physical grounds. Therefore one writes

$$\mathrm{var}(\beta_i) = \tfrac{1}{2}\,\mathrm{var}\left(|\vec\beta|\right) \tag{9.1.15}$$

Fig. 9.4. Sample path of the birefringence vector in the steady-state (realization of a birefringence-vector path). This path was calculated using a Karhunen-Loeve expansion of a Wiener process and a numerical integration of the Langevin equation (9.1.12). In the steady-state $\beta_{1,2}$ converge in distribution to i.i.d. stationary gaussian processes.
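The stationary statistics of the Langevin model can be checked with a short simulation. The sketch below is one possible discretization, not the Karhunen-Loeve method used for Fig. 9.4: it advances each component with the exact one-step update of a linear mean-reverting process, with lengths in units of $L_C$ and the mean-square birefringence normalized to one. All names and numerical settings are illustrative assumptions.

```python
import math
import random


def simulate_birefringence(mean_square_beta=1.0, z_over_lc=5.0, steps=50,
                           paths=5000, seed=3):
    """Return (var of beta_1, second moment of |beta|) after z_over_lc lengths."""
    rng = random.Random(seed)
    dz = z_over_lc / steps
    decay = math.exp(-dz)                       # memory kept over one step
    # Exact variance of the integrated noise over one step, for noise
    # strength <beta^2>/L_C as in (9.1.13b).
    kick_std = math.sqrt(0.5 * mean_square_beta * (1.0 - decay * decay))
    sum_b1_sq = 0.0
    sum_mag_sq = 0.0
    for _ in range(paths):
        b1 = b2 = 0.0                           # start with no birefringence
        for _ in range(steps):
            b1 = b1 * decay + kick_std * rng.gauss(0.0, 1.0)
            b2 = b2 * decay + kick_std * rng.gauss(0.0, 1.0)
        sum_b1_sq += b1 * b1
        sum_mag_sq += b1 * b1 + b2 * b2
    return sum_b1_sq / paths, sum_mag_sq / paths


var_b1, second_moment = simulate_birefringence()
print(round(var_b1, 3), round(second_moment, 3))
```

After a few autocorrelation lengths the component variance settles near half the mean-square birefringence and the second moment of the magnitude near the full value, in line with the statements around (9.1.14).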
This and the preceding section have detailed physically reasonable forms of the local fiber birefringence vector $\vec\beta$ that drives the evolution of the polarization state and, consequently, the PMD evolution. A significant further study by Marcuse et al. details how these models are used to analyze pulse propagation, and particularly non-linear propagation, in fibers [36].

9.2 Polarization Diffusion in Single-Mode Fiber


The optical field evolves along a single-mode fiber subject to local birefringence perturbations. The characteristic length $L_E$ is the length-scale over which the electric field loses memory and is called the polarization decorrelation length [60]. This is an additional length scale to those characteristic of the fiber, namely, the birefringent beat length $L_B$ and the fiber autocorrelation length $L_C$.

One naturally expects the polarization decorrelation length to be related to the birefringent properties of the fiber. At one extreme, where $L_C \ll L_B$, the optical field averages over the rapidly varying birefringence, thereby changing slowly with respect to its initial state. At the other extreme, where $L_B \ll L_C$, the field tends to follow the local birefringence and will, accordingly, change slowly with respect to the local state. Based on this physical picture, there are two choices for the polarization decorrelation length: a fixed definition $L_{E,\mathrm{fixed}}$ that measures the local field with respect to the initial birefringence, and a local definition $L_{E,\mathrm{local}}$ that measures the local field with respect to the local birefringence.
To compute the polarization decorrelation length, the Stokes picture of polarization diffusion (9.1.5) is used. Recalling that the fiber model assumes no chirality, the component form of the precession equation reads

$$\frac{d}{dz}\begin{pmatrix}S_1\\ S_2\\ S_3\end{pmatrix} = \begin{pmatrix}0 & 0 & \beta_2\\ 0 & 0 & -\beta_1\\ -\beta_2 & \beta_1 & 0\end{pmatrix}\begin{pmatrix}S_1\\ S_2\\ S_3\end{pmatrix} = \begin{pmatrix}\beta_2 S_3\\ -\beta_1 S_3\\ -\beta_2 S_1 + \beta_1 S_2\end{pmatrix}$$

where capital $S_k$ denotes a random variable. In order to simplify this expression prior to writing the diffusion generator, Wai and Menyuk introduce a rotation operator $R(z)$ to rotate the local birefringence to point along $\hat{s}_1$ [62]. This operator is simple because $\beta_3 = 0$:

$$R(z) = \begin{pmatrix}\cos\theta(z) & \sin\theta(z) & 0\\ -\sin\theta(z) & \cos\theta(z) & 0\\ 0 & 0 & 1\end{pmatrix} \tag{9.2.1}$$

By defining the local Stokes coordinates such that $\tilde{s} = R(z)\hat{s}$, the precession equation (9.1.5) is transformed to

$$\frac{d\tilde{s}}{dz} = R\left(\vec\beta\times\right)R^{-1}\tilde{s} + R_z R^{-1}\tilde{s} \tag{9.2.2}$$

where $R_z$ is the derivative of $R(z)$ with respect to $z$. This precession equation is called the local evolution equation, to distinguish it from (9.1.5) which describes fixed-reference evolution.
There are two derivations that can follow from (9.2.2): the first using the fixed-birefringence fiber model, and the second using the Langevin fiber model. The results of the two calculations are not qualitatively different, so the first, and simpler, model is detailed below.

For the first fiber model, the birefringent variation (9.1.7) and its noise source (9.1.8) are substituted into the local evolution equation. This gives

$$\frac{d}{dz}\begin{pmatrix}S_1\\ S_2\\ S_3\end{pmatrix} = \begin{pmatrix}0\\ -\beta S_3\\ \beta S_2\end{pmatrix} + \begin{pmatrix}S_2\\ -S_1\\ 0\end{pmatrix} g_\theta \tag{9.2.3}$$
This is a stochastic differential equation (SDE) in the Itô form. While it is quite beyond the scope of this text, the discerning reader should be aware that this equation is not quite correct. As Foschini shows [17], the Itô interpretation leads to an immediate departure of the polarization state from the unit sphere once (9.2.3) is integrated, while the associated Stratonovich interpretation does not. One has two choices on how to treat the analytics. Either the infinitesimal probability generator $G$ accounts for the Stratonovich interpretation by adding a correction term, or the SDE is translated from Stratonovich to Itô form and the Itô generator is used. Examples of both treatments are given below.
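The departure from the unit sphere can be seen in a small numerical experiment. The sketch below keeps only the noise term of (9.2.3), that is, it sets $\beta = 0$ as a simplifying assumption, so that the motion is confined to the $(S_1, S_2)$ plane. A naive Euler step, which corresponds to the Itô reading, is compared with the Stratonovich solution, which for this reduced problem is an ordinary rotation through the accumulated Brownian angle. The function name, step count, and seed are illustrative.

```python
import math
import random


def compare_interpretations(sigma2_z_total=1.0, steps=1000, seed=11):
    """Return (Ito-reading norm, Stratonovich-reading norm) after integration."""
    rng = random.Random(seed)
    step_std = math.sqrt(sigma2_z_total / steps)   # std of each increment dB
    s1, s2 = 1.0, 0.0     # Euler-Maruyama state (Ito reading)
    angle = 0.0           # accumulated Brownian angle (Stratonovich reading)
    for _ in range(steps):
        db = rng.gauss(0.0, step_std)
        # dS1 = S2 dB, dS2 = -S1 dB, evaluated at the start of the step
        s1, s2 = s1 + s2 * db, s2 - s1 * db
        angle += db
    ito_norm = math.hypot(s1, s2)
    # Stratonovich calculus obeys the ordinary chain rule, so the solution
    # from (1, 0) is (cos B, -sin B), which stays on the unit circle.
    strat_norm = math.hypot(math.cos(angle), -math.sin(angle))
    return ito_norm, strat_norm


ito_norm, strat_norm = compare_interpretations()
print(round(ito_norm, 3), round(strat_norm, 3))
```

Each Euler step multiplies the squared length by $1 + dB^2$, so the naive integration drifts steadily outward, whereas the rotation preserves the length exactly.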
The infinitesimal probability generator governs the diffusion of the probability density associated with the stochastic differential equation

$$dX_{i,z} = b(z, X_{i,z})\,dz + \sigma(z, X_{i,z})\,dB_{i,z} \tag{9.2.4}$$

where $b$ is the column-vector coefficient of the drift and $\sigma$ is the column-vector coefficient of the Brownian motion; both are functions of length and the random variable. Brownian motion is related to white noise by $dB_z = g_z\,dz$. The expectation of a sufficiently smooth functional $\psi$ on coordinates $X_i$ evolves according to

$$\frac{d\langle\psi\rangle}{dz} = \langle G\psi\rangle \tag{9.2.5}$$

This is Kolmogorov's backward equation (KBE). The generator $G$ is the probability generator of the Itô diffusion (see Foschini [17], Menyuk [62], Oksendal [44] (esp. Theorem 7.3.3, and (6.1.3)), and Risken [55] for more details). The importance of the probability generator is that it transforms a stochastic differential equation into a partial differential equation (PDE). Many powerful analytic and numeric tools are available to solve PDEs, making problems cast in this form more tractable. The polarization and PMD diffusions that follow are prime examples of physical processes developed first in SDE form to capture the differential behavior of the process and then solved in PDE form to determine the global behavior subject to the boundary conditions.
The Ito-sense diusion generator has two components: GI = Gd + Gs .
These components account for, respectively, the deterministic drift of the system and the stochastic uctuation. The It
o generator for (9.2.4) is
GI =


i

bi (xi )

1  T
2
i,j (x)
+
xi
2 i j
xi xj

(9.2.6)

The Stratonovich-sense generator $G_S$ makes a correction for the drift and, while more complicated to express in general, for the present case it may be written as

$$G_S = G_I - \frac{1}{2}\,\sigma^2 \sum_i x_i\,\frac{\partial}{\partial x_i} \tag{9.2.7}$$

The calculations in this section use the Stratonovich-sense generator.
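While the derivation of the above equations is advanced, their application is straightforward. The generator for (9.2.3) is

$$G = -\beta S_3\frac{\partial}{\partial S_2} + \beta S_2\frac{\partial}{\partial S_3} + \frac{\sigma^2}{2}\left( S_2^2\frac{\partial^2}{\partial S_1^2} + S_1^2\frac{\partial^2}{\partial S_2^2} - 2S_1S_2\frac{\partial^2}{\partial S_1\,\partial S_2} - S_1\frac{\partial}{\partial S_1} - S_2\frac{\partial}{\partial S_2}\right) \tag{9.2.8}$$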


where the noise strength $\sigma^2$ is related to the fiber autocorrelation length via (9.1.10).
The evolution of the moments of $\vec{S}$ is calculated using this generator and the KBE. For instance, the functionals $\psi(\vec{S}) = \vec{S}$ and $\psi(\vec{S}) = \vec{S}^2$ give the evolution of the first and second moments of $\vec{S}$. These results are used to associate the polarization decorrelation length $L_E$ with the fiber parameters.
For the evolution of the mean values, the functional is $\psi(S_i) = S_i$. Calculation of $G\psi$ generates the following system of equations:

$$\frac{d}{dz}\langle S_1\rangle = -\frac{1}{L_C}\langle S_1\rangle \tag{9.2.9a}$$
$$\frac{d}{dz}\langle S_2\rangle = -\frac{2\pi}{L_B}\langle S_3\rangle - \frac{1}{L_C}\langle S_2\rangle \tag{9.2.9b}$$
$$\frac{d}{dz}\langle S_3\rangle = \frac{2\pi}{L_B}\langle S_2\rangle \tag{9.2.9c}$$

For a non-zero initial condition, $\langle S_1(z)\rangle$ monotonically decays to zero and the remaining mean values undergo a damped oscillation to zero. The long-range values of the polarimetric means are all zero; the polarization state with respect to the local birefringence ultimately becomes completely uncorrelated. In the particular case when the initial state of the system is $S_1 = 1$, that is, the launch polarization is aligned to the local birefringent axis, the mean polarimetric values of the remaining two coordinates are $\langle S_2(z)\rangle = \langle S_3(z)\rangle = 0$, and the mean along the initial axis decays as

$$\langle S_1(z)\rangle = \exp(-z/L_C) \tag{9.2.10}$$
This simple case is all that is necessary to associate the polarization decorrelation length $L_{E,\mathrm{local}}$ with the fiber autocorrelation length. Since the characteristic length over which the mean polarization is preserved is by definition the polarization decorrelation length, in light of (9.2.10) one makes the association

$$L_{E,\mathrm{local}} = L_C \tag{9.2.11}$$

In the local reference frame the two characteristic lengths are equal.
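The decay in (9.2.10) can be checked by integrating the mean-value system (9.2.9) numerically. The sketch below uses a classical fourth-order Runge-Kutta step; the function names and the example lengths are illustrative, and the signs of the $2\pi/L_B$ terms are an assumption (the decay of $\langle S_1\rangle$ does not depend on them).

```python
import math

def mean_rhs(s, l_b, l_c):
    """Right-hand side of the mean-value system (9.2.9)."""
    s1, s2, s3 = s
    beta = 2.0 * math.pi / l_b  # precession rate set by the beat length L_B
    return (-s1 / l_c,
            -beta * s3 - s2 / l_c,  # sign of the beta terms is an assumption
            beta * s2)

def integrate_means(s0, l_b, l_c, z_end, steps=4000):
    """Fourth-order Runge-Kutta integration from z = 0 to z = z_end."""
    h = z_end / steps
    s = tuple(float(v) for v in s0)
    for _ in range(steps):
        k1 = mean_rhs(s, l_b, l_c)
        k2 = mean_rhs(tuple(a + 0.5 * h * k for a, k in zip(s, k1)), l_b, l_c)
        k3 = mean_rhs(tuple(a + 0.5 * h * k for a, k in zip(s, k2)), l_b, l_c)
        k4 = mean_rhs(tuple(a + h * k for a, k in zip(s, k3)), l_b, l_c)
        s = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + t)
                  for a, p, q, r, t in zip(s, k1, k2, k3, k4))
    return s

# Launch aligned to the local birefringent axis: S1 = 1, S2 = S3 = 0.
s_end = integrate_means((1.0, 0.0, 0.0), l_b=10.0, l_c=50.0, z_end=100.0)
print(s_end[0], math.exp(-100.0 / 50.0))
```

With this launch condition the two printed numbers should agree closely, and the other two means stay at zero.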
Wai and Menyuk detail the transformation to the fixed reference frame from the local frame [62]. While their work may be consulted for the details, the results are

$$\langle S_1(z)\rangle = \begin{cases} S_1(0)\exp\left(-z/L_{E,\mathrm{fixed}}\right) & L_C \ll L_B \\ S_1(0)\exp\left(-z/L_{E,\mathrm{fixed}}\right) & L_C \gg L_B \end{cases} \tag{9.2.12}$$

where

$$L_{E,\mathrm{fixed}} = \frac{L_C}{2\pi^2\,(L_C/L_B)^2} \tag{9.2.13a}$$
$$L_{E,\mathrm{fixed}} = L_C/2 \tag{9.2.13b}$$

These equations reinforce the physical understanding developed in the introduction of this chapter. With respect to the launched polarization state, when $L_C \ll L_B$ the polarization decorrelation length (9.2.13a) is much longer than the fiber autocorrelation length. This is because the birefringence changes too rapidly for the field to follow, which in turn makes the propagated field correlate with the launched field over a longer distance. Conversely, when $L_C \gg L_B$, the field follows the birefringence more faithfully, so the polarization state diffuses on a length scale more closely linked to the fiber autocorrelation length. In particular, notice that $L_{E,\mathrm{fixed}} = L_{E,\mathrm{local}}/2$, which makes sense because the field follows the local birefringence, so the local-frame characteristic length should indeed be longer than that of the fixed reference frame.
These associations between the fiber autocorrelation and polarization decorrelation lengths are useful in a practical sense. While $L_C$ is central to the statistical description of the optical field, the polarization decorrelation length is the measurable quantity. Equation (9.2.12) provides a means by which to determine $L_C$ through the measurement of $L_{E,\mathrm{fixed}}$.
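For the evolution of the polarimetric second moments, the functional is set to $\psi(S_i) = S_i^2$ and the generator (9.2.8) remains the same. Calculation of $G\psi$ generates the following system of equations:

$$\begin{aligned}
\frac{d}{dz}\langle S_1^2\rangle &= -\frac{2}{L_C}\left[\langle S_1^2\rangle - \langle S_2^2\rangle\right] \\
\frac{d}{dz}\langle S_2^2\rangle &= \frac{2}{L_C}\left[\langle S_1^2\rangle - \langle S_2^2\rangle\right] - \frac{4\pi}{L_B}\langle S_2 S_3\rangle \\
\frac{d}{dz}\langle S_3^2\rangle &= \frac{4\pi}{L_B}\langle S_2 S_3\rangle \\
\frac{d}{dz}\langle S_2 S_3\rangle &= \frac{2\pi}{L_B}\left[\langle S_2^2\rangle - \langle S_3^2\rangle\right] - \frac{1}{L_C}\langle S_2 S_3\rangle
\end{aligned} \tag{9.2.14}$$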
As before there are local- and fixed-reference frame solutions, and the solutions differ for the two limits of $L_C/L_B$. The general form for the fixed-frame solution is

$$\langle S_{1,2}^2\rangle \approx \frac{1}{3}\left[1 \pm \frac{3}{2}\exp\left(-z/L_{E,1}\right) + \frac{1}{2}\exp\left(-z/L_{E,2}\right)\right] \tag{9.2.15a}$$
$$\langle S_3^2\rangle \approx \frac{1}{3}\left[1 - \exp\left(-z/L_{E,3}\right)\right] \tag{9.2.15b}$$

where in the first equation the + and − signs refer to $S_1$ and $S_2$, respectively. In the $L_C \ll L_B$ regime, the length scales are

$$L_{E,1} = 2L_{E,\mathrm{fixed}}, \quad\text{and}\quad L_{E,(2,3)} = 6L_{E,\mathrm{fixed}} \tag{9.2.16}$$

Conversely, in the $L_C \gg L_B$ regime, the length scales are

$$L_{E,1} = \tfrac{2}{7}L_{E,\mathrm{fixed}}, \quad\text{and}\quad L_{E,(2,3)} = \tfrac{2}{3}L_{E,\mathrm{fixed}} \tag{9.2.17}$$

In either length regime, the stationary variances of the diffusions converge to $\langle S_k^2\rangle = 1/3$. The convergence rate for all three variances is roughly the same, but the small difference was studied by Wai and Menyuk in [61], where they showed that the absence of chirality in the fiber imparts a short-range anisotropy to the diffusions.
To summarize, polarization decorrelation happens with its own characteristic length scale $L_E$ in a single-mode fiber. That length scale is related to the fiber birefringence parameters in either a local or fixed frame of reference. In the local frame, the polarization decorrelation and fiber autocorrelation lengths are equal. In the fixed frame, the relationship depends on the regime in which the fiber is characterized: for $L_C \ll L_B$ the diffusion occurs at a rate related to $L_B$; for $L_C \gg L_B$ the diffusion occurs at a rate related to $L_C$. The three polarimetric values all reach a mean of zero and a variance of 1/3 beyond the diffusion limit. The diffusions detailed in this section are for the fixed-birefringence fiber model, but the results do not qualitatively change for the Rayleigh-distributed birefringence model.

9.3 RMS Differential-Group Delay Evolution

The equation of motion for polarization evolution (9.1.5) was recast in the preceding section into a stochastic differential equation whose solutions showed the behavior of the statistical moments of the polarization state. In parallel with this procedure, the equation of motion for polarization-mode dispersion evolution is studied. Recall from (8.2.39) on page 339 that the differential equation of motion for the PMD vector $\vec{\tau}$ is

$$\frac{\partial\vec{\tau}}{\partial z} = \vec{\beta}_\omega + \vec{\beta}\times\vec{\tau} \tag{9.3.1}$$

where $\vec{\beta}$ is the local birefringence vector and $\vec{\beta}_\omega$ is its frequency derivative. The solution to (9.3.1) for the mean-square magnitude of $\vec{\tau}$ in the diffusion limit is

$$\langle\tau^2(z)\rangle = 2\langle\tau_c^2\rangle\left(e^{-z/L_C} + z/L_C - 1\right) \tag{9.3.2}$$
where $\langle\tau_c^2\rangle$ is the mean-square DGD for a segment $L_C$ long. The mean-square solution is written at this point in the discussion because it is apparently independent of any reasonable derivation. Poole [48, 49] first derived this equation, shortly followed by Curti [7], Foschini [17], and Gisin [23, 24], and later by Wai and Menyuk [60]. Gisin [24] showed that (9.3.2) is the mean-square deviation of the probability density that solves the Telegrapher's equation (a second-order parabolic PDE). While the details depend on which report is consulted, all of these researchers apply a gaussian-correlated noise to the fiber birefringence.

[Figure 9.5: a) log-log and b) linear plots of $\tau_{rms}(z)$, normalized as $\langle\tau^2(z/L_C)/\tau_c^2\rangle^{1/2}$, versus $z/L_C$, together with the weak mode-coupling asymptote $z/L_C$ ($\langle\tau\rangle \propto z$) and the strong mode-coupling asymptote $\sqrt{2z/L_C}$ ($\langle\tau\rangle \propto z^{1/2}$).]
Fig. 9.5. The rms growth of $\tau$ with length and its asymptotic limits. a) Log-log scale shows long-range behavior. For $z \ll L_C$ the rms growth is linear with length, while for $z \gg L_C$ the rms growth goes as root-length. The crossover is in the range $z \sim L_C$. b) Linear scale of the same. PMD fiber statistics are derived in the strong mode-coupling regime.

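The root-mean-square growth of the DGD, defined by $\tau_{rms} = \sqrt{\langle\tau^2(z)\rangle}$, shows two asymptotic limits:

$$\tau_{rms}(z) = \begin{cases} (z/L_C)\,\tau_c^{rms} & \text{weak coupling: } z \ll L_C \\ \sqrt{2z/L_C}\;\tau_c^{rms} & \text{strong coupling: } z \gg L_C \end{cases} \tag{9.3.3}$$

where $\tau_c^{rms} = \sqrt{\langle\tau_c^2\rangle}$. The cross-over between these two limits is $z \sim L_C$. The two limits of the rms evolution function are shown in Fig. 9.5.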
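The full expression (9.3.2) and its two asymptotes are easy to compare numerically. The sketch below evaluates the rms DGD at a short and a long fiber length; the function names and sample values are illustrative only.

```python
import math

def rms_dgd(z, l_c, tau_c_rms):
    """RMS DGD from (9.3.2): sqrt(2 <tau_c^2> (exp(-x) + x - 1)), x = z / L_C."""
    x = z / l_c
    # expm1(-x) + x equals exp(-x) + x - 1 but avoids cancellation for small x
    return tau_c_rms * math.sqrt(2.0 * (math.expm1(-x) + x))

def weak_limit(z, l_c, tau_c_rms):
    """Weak-coupling asymptote of (9.3.3), valid for z << L_C."""
    return (z / l_c) * tau_c_rms

def strong_limit(z, l_c, tau_c_rms):
    """Strong-coupling asymptote of (9.3.3), valid for z >> L_C."""
    return math.sqrt(2.0 * z / l_c) * tau_c_rms

l_c, tau_c = 100.0, 1.0  # example values in arbitrary units
short_ratio = rms_dgd(1.0, l_c, tau_c) / weak_limit(1.0, l_c, tau_c)
long_ratio = rms_dgd(1.0e5, l_c, tau_c) / strong_limit(1.0e5, l_c, tau_c)
print(short_ratio, long_ratio)
```

Both ratios approach unity, the first as $z/L_C \to 0$ and the second as $z/L_C \to \infty$.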
At one limit, the so-called weak mode-coupling limit, $\tau_{rms}$ grows linearly with length z. In this regime the local birefringence is nearly aligned along the fiber, making the cumulative birefringence additive. At the other limit, the strong mode-coupling limit, $\tau_{rms}$ grows as root-length: $z^{1/2}$. In this regime the fiber length is well beyond the fiber autocorrelation length, so the cumulative birefringence grows statistically. Given that the birefringences of fiber segments $L_C$ long are uncorrelated, the birefringence variances across segments add, resulting in a standard deviation that grows as root-length. This behavior was already seen in the probability density (9.1.11) for the angular distribution of birefringence.
The analytic calculations of the PMD probability densities are made in the strong-coupling regime. The weak-coupling regime poses few novel problems because the statistics will closely match those of the local birefringence. The intermediate region that connects the weak- and strong-coupling regimes can be treated in one of two ways. For the case of fiber the PMD statistics evolve through the intermediate region toward their stationary end-points. Tan et al. have studied these transient statistics and show that the Maxwellian distribution is achieved by $z \approx 30L_C$, but that the distribution tails take longer to mature [58, 63]. Another case is for PMD emulators, where a fixed number (3–20) of sections is used. These statistics have also been investigated and shown to exhibit deviation from the stationary forms [30, 34].
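Equation (9.3.2) is derived here following the diffusion formalism. Use of the fixed-birefringence model of §9.1.1 makes for a simpler calculation; the result is easily extended to Rayleigh-distributed birefringence. As with the polarization calculations, the PMD diffusion equation is simpler when converted to a local reference frame. Define $\tilde{\tau} = R(z)\vec{\tau}$, where $R(z)$ is as in (9.2.1). The resulting stochastic differential equation for (9.3.1) in component form is (cf. (9.2.3))

$$\frac{d}{dz}\begin{pmatrix}\tilde\tau_1 \\ \tilde\tau_2 \\ \tilde\tau_3\end{pmatrix} = \begin{pmatrix}\beta_\omega \\ -\beta\tilde\tau_3 \\ \beta\tilde\tau_2\end{pmatrix} + \begin{pmatrix}\tilde\tau_2 \\ -\tilde\tau_1 \\ 0\end{pmatrix} g \tag{9.3.4}$$

The infinitesimal probability generator is

$$G = \beta_\omega\frac{\partial}{\partial\tilde\tau_1} - \beta\tilde\tau_3\frac{\partial}{\partial\tilde\tau_2} + \beta\tilde\tau_2\frac{\partial}{\partial\tilde\tau_3} + \frac{\sigma^2}{2}\left(\tilde\tau_2^2\frac{\partial^2}{\partial\tilde\tau_1^2} + \tilde\tau_1^2\frac{\partial^2}{\partial\tilde\tau_2^2} - 2\tilde\tau_1\tilde\tau_2\frac{\partial^2}{\partial\tilde\tau_1\,\partial\tilde\tau_2} - \tilde\tau_1\frac{\partial}{\partial\tilde\tau_1} - \tilde\tau_2\frac{\partial}{\partial\tilde\tau_2}\right) \tag{9.3.5}$$

Finally, the functional of interest is $\psi = \tilde\tau^2 = \tilde\tau_1^2 + \tilde\tau_2^2 + \tilde\tau_3^2$. A requisite auxiliary functional $\psi' = \tilde\tau_1$ is also needed. Calculating $G\psi$ and $G\psi'$, and combining the results, yields the equation of motion

$$\frac{d^2\langle\tau^2\rangle}{dz^2} + \frac{1}{L_C}\frac{d\langle\tau^2\rangle}{dz} = 2\beta_\omega^2 \tag{9.3.6}$$

The solution to this equation is (9.3.2). A few details need clarification. The length of the PMD vector is invariant under rotation, so $|\vec{\tau}| = |R(z)\vec{\tau}|$. The product $\beta_\omega L_C$ is the characteristic DGD per segment $L_C$ long. To see this, simply expand the terms: $\tau_c = \beta_\omega L_C = \Delta n_g L_C/c$. Finally, the replacement of $\tau_c^2$ in the solution of (9.3.6) with $\langle\tau_c^2\rangle$ as reported in (9.3.2) comes with the Rayleigh-birefringence derivation detailed by Wai and Menyuk [62].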

9.4 PMD Statistics

The PMD statistics that yield to analytic study are those related to the first- and second-order PMD vectors ($\vec{\tau}$, $\vec{\tau}_\omega$) and the autocorrelation of the PMD vector $\vec{\tau}$ with frequency. The statistics that have been analytically solved are for $\tau$ and its components $\tau_i$; $\tau_\omega$ and its components $\tau_{\omega,i}$; and the perpendicular and parallel components of the second-order vector: depolarization $|\vec{\tau}_{\omega,\perp}|$ and polarization-dependent chromatic dispersion $|\vec{\tau}|_\omega$, respectively. Additionally, the relation between depolarization and PDCD conditional on the DGD has been determined. The joint density of the magnitudes ($\tau$, $\tau_\omega$), however, is not analytic but has been solved using importance sampling (IS) and, separately, special numerical techniques.
A remarkable property of PMD statistics is that they scale with a single scaling factor: the mean fiber DGD $\langle\tau\rangle$. The mean fiber DGD is itself related to the fiber length, fiber autocorrelation length, and birefringence variance. That association is clarified below, but once made the mean fiber DGD becomes its own unit. The mean fiber DGD is so ubiquitous in the statistics and measurement of PMD that a corruption of terms has entered the literature where PMD is defined as the mean fiber DGD. There should be no confusion, however, between PMD as a vector and mean DGD as a statistical unit.
The expression for the rms DGD evolution in the strong coupling limit connects the microscopic scale of the birefringence variation with the macroscopic properties of PMD statistics. This expression is therefore the gateway between micro- and macroscopic views of the same process. While this expression was derived above, the derivation of the PMD probability densities requires analytic tools that are well beyond the scope of this text. Instead, the results, principally of Foschini, will be quoted and the ambitious reader is referred to the cited papers.
The mean fiber DGD is connected to the stochastic model of PMD evolution in the following way. In the $z \gg L_C$ limit, the mean-square DGD grows as

$$\langle\tau^2(z)\rangle = 2\langle\tau_c^2\rangle\, z/L_C \tag{9.4.1}$$

As discussed in the following, the cartesian components of the PMD vector are i.i.d. gaussian random variables. The probability density of the length of the PMD vector is therefore Maxwellian. The first and second moments of this density are related (see Table D.1 on page 507), so (9.4.1) may be rewritten for the mean DGD:

$$\langle\tau(z)\rangle = \sqrt{\frac{8}{3\pi}\,\frac{2\langle\tau_c^2\rangle\, z}{L_C}} \tag{9.4.2}$$

There is some variation in the literature on the interpretation of $2\langle\tau_c^2\rangle/L_C$. Curti, for instance, writes $\langle\tau\rangle = \sqrt{8z/(\pi L_C)}\;\tau_d$ [7]. Since (9.3.2) was derived using a Rayleigh statistic for the birefringence, the mean segment DGD is related to its second moment by $\langle\tau_c^2\rangle = 4\langle\tau_c\rangle^2/\pi$. Association with Curti gives $\tau_d = \sqrt{8/(3\pi)}\,\langle\tau_c\rangle$, which is an 8.5% difference. Separately, Poole and Favin [52] write $\langle\tau\rangle = \sqrt{8N/(3\pi)}\;\tau_p$, where $\tau_p$ is the fixed birefringence of a retardation plate and N is the number of plates. The connection to (9.4.2) requires $N = 2z/L_C$. This interpretation is used below to develop a discrete waveplate model. Other researchers reproduce (9.4.2) in the continuous limit [17, 23, 51, 62].
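The Poole and Favin scaling can be checked with a small Monte Carlo experiment. The sketch below is a simplified stand-in for a waveplate concatenation, not the model developed later in the text: it assumes that each plate adds a PMD contribution of fixed length $\tau_p$ in an independent, isotropically random direction, so that the total is a three-dimensional random walk. The function names, plate count, and trial count are illustrative.

```python
import math
import random

def random_unit_vector(rng):
    """Direction drawn uniformly on the unit sphere."""
    cos_t = rng.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t

def simulate_dgd(n_plates, tau_p, trials, seed=0):
    """Return (mean DGD, mean-square DGD) over random plate orientations."""
    rng = random.Random(seed)
    total, total_sq = 0.0, 0.0
    for _ in range(trials):
        x = y = z = 0.0
        for _ in range(n_plates):
            ux, uy, uz = random_unit_vector(rng)
            x += tau_p * ux
            y += tau_p * uy
            z += tau_p * uz
        r2 = x * x + y * y + z * z
        total += math.sqrt(r2)
        total_sq += r2
    return total / trials, total_sq / trials

n_plates, tau_p = 50, 1.0
mean_dgd, mean_sq_dgd = simulate_dgd(n_plates, tau_p, trials=4000)
predicted_mean = math.sqrt(8.0 * n_plates / (3.0 * math.pi)) * tau_p
print(mean_dgd, predicted_mean, mean_sq_dgd)
```

Under this assumption the mean-square DGD is $N\tau_p^2$, and the simulated mean should sit near $\sqrt{8N/(3\pi)}\,\tau_p$ once N is large enough for the Maxwellian limit to hold.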
Armed with the definition of the mean fiber DGD $\langle\tau\rangle$, the statistics of PMD are presented below. The principal contributors to this field are Curti [7], who first derived the Maxwellian DGD distribution; Foschini [14–17], who derived the remaining PMD distributions; Karlsson [31], and Shtaif and Mecozzi [56, 57], who derived the PMD autocorrelation functions; Ibragimov and Shtengel [28], who derived a conditional expression; and Fogal, Biondini, and Kath, who developed the IS methods [2, 11, 12].
9.4.1 Probability Densities
The expressions for the probability densities of the first- and second-order PMD vector are tabulated in Table 9.1. The vectors $\vec{\tau}$ and $\vec{\tau}_\omega$ are statistically dependent on one another. Plots of these densities are shown in Fig. 9.6 on linear scale, to emphasize the distribution about the mean, and semi-log scale, to emphasize the fall-off of the distribution tails.
The probability density for the DGD $\tau$, which is the magnitude of the PMD vector $\tau = |\vec{\tau}|$, is Maxwellian. The origin of the Maxwellian distribution comes from the distribution of the radius of a sphere, where the cartesian coordinates of the sphere are i.i.d. gaussian random variables (see Appendix D). An important relation between the mean and mean-square for a Maxwellian distribution is

$$\langle\tau\rangle^2 = \frac{8}{3\pi}\langle\tau^2\rangle \tag{9.4.3}$$

The $8/3\pi$ factor appears frequently in the discussion of PMD statistics. The Maxwellian and gaussian distributions scale linearly in $\langle\tau\rangle$. That is, the measured DGD values from any fiber can be normalized by the mean fiber DGD ($\tau/\langle\tau\rangle$) to produce a unit-scaled statistic.
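Relation (9.4.3) follows directly from the i.i.d. gaussian components and can be confirmed by sampling. The sketch below draws three gaussian components per sample and compares $\langle\tau\rangle^2/\langle\tau^2\rangle$ with $8/3\pi$; the function name, the per-component standard deviation, and the sample count are illustrative.

```python
import math
import random

def maxwellian_moments(sigma, samples, seed=0):
    """Mean and mean-square radius for three i.i.d. gaussian components."""
    rng = random.Random(seed)
    total, total_sq = 0.0, 0.0
    for _ in range(samples):
        r2 = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(3))
        total += math.sqrt(r2)
        total_sq += r2
    return total / samples, total_sq / samples

mean_dgd, mean_sq_dgd = maxwellian_moments(sigma=1.0, samples=50000)
ratio = mean_dgd ** 2 / mean_sq_dgd
print(ratio, 8.0 / (3.0 * math.pi))
```

Because the ratio is dimensionless, it does not depend on the component standard deviation, which is the sense in which the distribution scales with the mean DGD alone.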
The probability density for the magnitude SOPMD $\tau_\omega = |\vec{\tau}_\omega|$ is sech-tanh in form. The origin of the sech-tanh distribution comes from the distribution of the radius of a sphere, where the cartesian coordinates of the sphere are i.i.d. hyperbolic secant (sech) random variables. In contrast to the Maxwellian and gaussian distributions, the sech-tanh and sech distributions scale quadratically, as $\langle\tau\rangle^2$: measurements can be normalized to a unit statistic by $\tau_\omega/\langle\tau\rangle^2$. The second moments of the $\tau$ and $\tau_\omega$ magnitudes are related by

$$\langle\tau_\omega^2\rangle = \frac{1}{3}\langle\tau^2\rangle^2 \tag{9.4.4}$$

Moreover, a comparison of the tails of the Maxwellian and sech-tanh distributions (Fig. 9.6) shows that eventually the Maxwellian falls off faster. This is because the underlying gaussian distribution falls off quadratically (on a log scale) while that of the sech falls off linearly.
The cartesian components of the SOPMD vector are i.i.d. random variables, so their variance is one-third that of the vector length. Yet, the cartesian components are not directly or easily related to the distortion PMD imparts on a signal. However, the projections of the SOPMD vector onto the first-order vector are directly related, so one asks about the conditional dependence of these projections.

Table 9.1. Statistical Relations of PMD: $\vec{\tau} = \tau\hat{p}$, $\vec{\tau}_\omega = \tau_\omega\hat{p} + \tau\hat{p}_\omega$


 2
Domain
Density
Symbol
Statistic
 



2
2
 2  () 32
1 8
| |
exp

[0, )
DGD(ad)
2
2 3
2

 

 
2G
4
4
1  2 2
8 4
2
(d)

|
|
tanh

[0, )
sech

SOPMD
2 2

2
2


2
1  2
1 8 2
(ac)

i
0
exp
(, )
component

2
3

2
 
4
4
1  2 2

,i
0
sech
(, )
component(d)
2
9

2
 


4
2
1  2 2
2


(e,f )
, = | |

0
sech
(, )
PDCD
27
2
2

8  2 2
8
| , |
u2
J0 (u ) sech ( tanh )1/2 d, u =
[0, )

Depolarization(g)
2
27

0



5
sinh3/2
4  2
1
Depolarization
2
3u
p |
|
[0, )
, 1; u tanh d

1 F1
9
2
2
vector(g)
cosh5/2
0
 


!v"
! v ""
3
5
ev/2 !
()  2 
(3 + 2v(3 + v)) I0
. G = 0.915965 . . ., Catalans constant.
+ 2v(2 + v)I1
=
2 ; 1 F1
, 1; v =
8
2
3
2
2
(a)

Curti et al. [7],(b) Poole et al. [51],(c) Foschini [17],(d) Gisin [23],(e) Foschini [15],(f ) Foschini [14],(g) Foschini [16].

[Figure 9.6: probability densities plotted on linear (left) and semi-log (right) scales. a) DGD and SOPMD: $|\vec{\tau}|$ versus $\tau/\langle\tau\rangle$ and $|\vec{\tau}_\omega|$ versus $\tau_\omega/\langle\tau\rangle^2$. b) DGD and SOPMD components, PDCD: $\tau_k$ versus $\tau/\langle\tau\rangle$; $\tau_{\omega,k}$ and $|\vec{\tau}|_\omega$ versus $\tau_\omega/\langle\tau\rangle^2$. c) Depolarization magnitude and vector: $|\hat{p}_\omega|$ versus $\tau/\langle\tau\rangle$ and $|\vec{\tau}_{\omega,\perp}|$ versus $\tau_\omega/\langle\tau\rangle^2$.]
Fig. 9.6. Probability densities of first- and second-order PMD statistics, linear and semi-log scales. The log scale is in $\log_{10}$.

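The second-order PMD vector is projected onto the direction of the first-order vector ($\hat{p}$) to produce parallel and perpendicular components. This action conditions the SOPMD vector to $\hat{p}$. Expanding $\vec{\tau}_\omega$ from $\vec{\tau} = \tau\hat{p}$ makes

$$\vec{\tau}_\omega = \tau_\omega\hat{p} + \tau\hat{p}_\omega \tag{9.4.5a}$$
$$\phantom{\vec{\tau}_\omega} = \vec{\tau}_{\omega,\parallel} + \vec{\tau}_{\omega,\perp} \tag{9.4.5b}$$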

The parallel component is the polarization-dependent chromatic dispersion. This component changes the chromatic dispersion of the fiber from $D$ to an effective value $D_e$ and, accordingly, induces pulse compression or expansion [15, 50]. Nelson gives the wavelength dependence of this component [42]. The PDCD magnitude has two synonymous notations: $|\vec{\tau}|_\omega = \tau_{\omega,\parallel}$. The perpendicular component is the depolarization, which is the tendency for the PMD vector to change direction. The effects of this component were treated in §8.2.

[Figure 9.7: DGD (ps) and depolarization $|\vec{\tau}_{\omega,\perp}|$ (ps²) plotted against wavelength (nm) from 1540 to 1545 nm.]
Fig. 9.7. Measurements of DGD and depolarization component $|\vec{\tau}_{\omega,\perp}|$ of SOPMD from high-PMD fiber spool. Courtesy G. Shtengel [28].

There is a strong tendency for $\vec{\tau}_\omega$ to point away from $\vec{\tau}$. The mean-square values of the PDCD and depolarization components in relation to that of the second-order vector are

$$\langle\tau_{\omega,\parallel}^2\rangle = \frac{1}{9}\langle\tau_\omega^2\rangle, \quad\text{and}\quad \langle\tau_{\omega,\perp}^2\rangle = \frac{8}{9}\langle\tau_\omega^2\rangle \tag{9.4.6}$$

(Note that the rms of both components scales as $\langle\tau\rangle^2$, as does the full SOPMD vector.) Even in comparison to a cartesian component, the PDCD component is diminished:

$$\langle\tau_{\omega,\parallel}^2\rangle = \frac{1}{3}\langle\tau_{\omega,i}^2\rangle \tag{9.4.7}$$

Depolarization is clearly the dominant component of SOPMD and is therefore the dominant impairment on an optical signal.
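Because every second-order statistic scales with the mean DGD alone, relations (9.4.3), (9.4.4), and (9.4.6) can be chained to estimate the rms SOPMD and its two projections from a single number. The sketch below does this arithmetic; the function name, dictionary keys, and the 10 ps example are illustrative.

```python
import math

def sopmd_rms_from_mean_dgd(mean_dgd_ps):
    """RMS SOPMD, PDCD, and depolarization (all in ps^2) from the mean DGD (ps)."""
    mean_sq_dgd = (3.0 * math.pi / 8.0) * mean_dgd_ps ** 2  # (9.4.3)
    mean_sq_sopmd = mean_sq_dgd ** 2 / 3.0                  # (9.4.4)
    return {
        "sopmd": math.sqrt(mean_sq_sopmd),
        "pdcd": math.sqrt(mean_sq_sopmd / 9.0),                    # (9.4.6), parallel
        "depolarization": math.sqrt(8.0 * mean_sq_sopmd / 9.0),    # (9.4.6), perpendicular
    }

rms_10ps = sopmd_rms_from_mean_dgd(10.0)
rms_20ps = sopmd_rms_from_mean_dgd(20.0)
print(rms_10ps)
```

Doubling the mean DGD quadruples every entry, and the rms depolarization exceeds the rms PDCD by a factor of $\sqrt{8}$ regardless of the fiber.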
One may ask how the SOPMD-projected components vary with a particular sample value of DGD for a fixed mean DGD. This question comes about when testing PMD compensators: when the DGD value is high, what form of SOPMD in a fiber can one expect? The answer is that the higher the DGD, the higher the expected depolarization. Ibragimov and Shtengel [28] show that the conditional expectations scale as
.
/ 1 
2
,
2
(9.4.8a)
| =
9
!
"
 2
 2 2
 
, | =

(9.4.8b)
3 2  + 2
9
When the sample value $\tau^2$ equals its mean-square value (9.4.4), the conditional expression (9.4.8b) reduces to (9.4.6). These expressions show that the rms PDCD magnitude is determined solely by the mean fiber DGD, while the rms depolarization magnitude scales with sample DGD. These relations are borne out in experiment. Figure 9.7 illustrates DGD and magnitude-depolarization measurements taken on a high-PMD fiber; the correlation is evident. Figure 9.8 shows an analysis of the data superimposed with the conditional distributions.

[Figure 9.8: RMS SOPMD (ps²) versus DGD (ps), 0 to 100 ps, showing experiment and theory for the conditional depolarization $\langle\tau_{\omega,\perp}^2|\tau\rangle$ and PDCD $\langle\tau_{\omega,\parallel}^2|\tau\rangle$.]
Fig. 9.8. Measurements of mean-square PDCD and depolarization as a function of DGD. Data taken on a single fiber with fixed mean DGD. Courtesy G. Shtengel [28].
Finally, the joint probability distribution function (JPDF) of ($\tau$, $\tau_\omega$) is a distribution central to the characterization and validation of receiver performance against PMD. The JPDF is not analytic, but can be calculated via brute force, special numerical techniques, or using importance sampling methods. A brute-force example of the JPDF is shown in Fig. 9.9. The lines indicate contours of constant probability. This calculation by P. J. Leo took a week running on a distributed computer network; the code formalism used the Stokes-based PMD concatenation rules for $\vec{\tau}$ and $\vec{\tau}_\omega$, thereby saving an order of magnitude in time compared to the equivalent Jones-matrix calculation. For comparison, a JPDF from measured field data is shown in Fig. 9.10.
The importance-sampling methods developed by Fogal, Biondini, and Kath [2, 3, 11, 12] also calculate the PMD vector in Stokes space, but with the innovation that a bias is added to the scattering of the polarization state from section to section. The bias can be made to accumulate preferentially large DGD values, large SOPMD values, or both. Importance sampling provides the formalism to rescale the resulting statistics by the probability of the underlying bias. Contours that extend to $10^{-20}$ have been demonstrated. Moreover, IS methods have been adopted to the expedient estimation of system outage probabilities due to PMD [33].
Finally, Forestieri has developed special numerical techniques to evaluate the PDCD component density and, subsequently, the joint first- and second-order density [13]. The author reports an efficient algorithm for fast evaluation of the JPDF, and shows that in comparison to the IS method the tails of the distribution fall more slowly.
The JPDF of ($\tau$, $\tau_\omega$) is universal because of the scaling rules for DGD and SOPMD. In particular, the DGD and SOPMD axes scale as $\langle\tau\rangle$ and $\langle\tau\rangle^2$, respectively. Therefore, once the JPDF is calculated for some $\langle\tau\rangle$, it can be arbitrarily rescaled to associate with any fiber. Such scaling, especially with the range available using IS, can be exploited to make solid predictions of the outage probability due to PMD in lightwave systems.

[Figure 9.9: contour plot of the joint density with abscissa DGD $|\vec{\tau}|/\langle\tau\rangle$ from 0 to 3 and ordinate SOPMD $|\vec{\tau}_\omega|/\langle\tau\rangle^2$ from 0 to 3.5; contours labelled from 5E-1 down to 1E-4.]
Fig. 9.9. Joint first- and second-order PMD density function. This is a universal distribution, scaled on the abscissa by $\langle\tau\rangle$ and on the ordinate by $\langle\tau\rangle^2$. Calculated from $10^9$ realizations of a 2000-section fiber, using Stokes-based concatenation rules. Courtesy P. J. Leo.

Fig. 9.10. Measured joint first- and second-order PMD magnitudes from fiber described in Fig. 9.11.

Fig. 9.11. Field measurements of differential-group delay and magnitude second-order PMD over 10 nm and 160 hrs. Notice the general adiabatic evolution is disrupted at about 145 hr. Courtesy D. Peterson, MCI [45].
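The rescaling amounts to a change of variables. As a minimal sketch, assuming the JPDF is stored as points in the normalized coordinates $(\tau/\langle\tau\rangle,\ \tau_\omega/\langle\tau\rangle^2)$ with a density value at each point, the function below maps one point to physical units for a fiber of given mean DGD. The function and argument names are hypothetical; the factor $1/\langle\tau\rangle^3$ is the Jacobian of the coordinate change and keeps the density normalized.

```python
def rescale_jpdf_point(dgd_norm, sopmd_norm, density_norm, mean_dgd):
    """Map a normalized JPDF point to a fiber with the given mean DGD.

    dgd_norm     : tau / <tau>            (dimensionless)
    sopmd_norm   : tau_omega / <tau>^2    (dimensionless)
    density_norm : joint density in the normalized coordinates
    mean_dgd     : <tau> of the target fiber, e.g. in ps
    """
    dgd = dgd_norm * mean_dgd              # DGD axis scales as <tau>
    sopmd = sopmd_norm * mean_dgd ** 2     # SOPMD axis scales as <tau>^2
    density = density_norm / mean_dgd ** 3  # Jacobian of the change of variables
    return dgd, sopmd, density

# Example: a point at 2.5 mean DGDs on a fiber with <tau> = 8 ps.
point = rescale_jpdf_point(2.5, 1.0, 1e-3, mean_dgd=8.0)
print(point)
```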
The nature of the JPDF also reveals the fallacy of the "PMD is DGD" concept in testing receivers and PMD compensators, whether optical or electrical. At almost any level of DGD there is nearly zero probability that SOPMD is zero, or even small. Yet most receiver testing at the time of this writing is done by introducing only DGD. Such results can significantly underestimate the impairment a receiver faces in real-world environments (but they glorify product performance to the parties that fund the work).
An example of rather adiabatic temporal behavior of first- and second-order PMD is shown in Fig. 9.11. The data was taken over 160 hrs in half-hour increments on installed fiber from a particularly old link [45]. At any time the wavelength variation is high, but the temporal evolution for this buried cable is slow. An exception exists at about 145 hrs where the fiber was disturbed (unintentionally). The disturbance must have been small because the spectra before and after the event are well correlated. A large disturbance will erase all memory of the past and set the fiber in a new state. This data also shows that for channels that fall on high first- and/or second-order states, the high PMD levels may persist for a long time before there is a change. A detailed study of the temperature dependence of PMD in installed cables is reported by Brodsky et al. [5].
9.4.2 Autocorrelation Functions
The autocorrelation function of the PMD vector is an essential descriptor for the characterization of PMD behavior. There are four properties that the autocorrelation function reveals:

• PMD is a wide-sense stationary process in frequency. The autocorrelation function is the only demonstrated method to connect ensemble averages with frequency averages.
• The PMD-vector correlation bandwidth depends only on the mean DGD $\langle\tau\rangle$.
• The variance of an estimated value of $\langle\tau\rangle$ derived from measurement depends inversely on the total measurement bandwidth.
• All moments of the PMD vector depend only on $\langle\tau^2\rangle$ and the moment order.

The autocorrelation function tells how large a frequency separation is necessary for two PMD vectors $\vec{\tau}(\omega)$ and $\vec{\tau}(\omega')$ to become statistically independent. Physically, PMD is generated by the fiber birefringence $\beta(z, \omega)$. The longitudinal evolution of the birefringence $\beta(z)$ is modelled as white noise or a concatenation of waveplates, while the frequency dependence $\beta(\omega)$ is linear: $\beta(\omega) = \omega\,\Delta n/c$. Recall from the PMD concatenation rules of §8.2.4 that component PMD vectors precess about one another with frequency. For small frequency variation the output PMD vector retains its length and pointing

Table 9.2. Autocorrelation Functions for PMD Vector and DGD Squared 2
Parameter
PMD vector(a,b)
DGD squared

Form

Expression

  

R (W) = sinhc (W/6) exp (W/6)

(c)

2 2

2 (1 R (W))
3
W/6


0.9
est (Bf ) 1 
Bf

8
8
1
(Bf ) =

2
3
9 2 2Bf

R 2 (W) =

Estimator uncertainty(a,b,d,e)
Bias error of statistic
moments(f )

3
5

1+


 
3 3
2 2 =
(f )2 , R = R 2 , R = R /R
2

2
PMDACF 
2  4.4, fPMDACF

 
W = 2 2 ,

Note: Terms such as $\tau'$ denote $\tau' = \tau(\omega')$, not a frequency derivative.
(a,c) Shtaif and Mecozzi [39, 56, 57], (b) Karlsson and Brentel [31], (d) Gisin et al. [22], (e) The Poole and Favin coefficient was 0.66 rather than 0.9 [52], (f) Boroditsky et al. [4].

direction. As the frequency is further detuned, the PMD vector takes on a different length and pointing direction. The PMD autocorrelation function tells how large, on average, a frequency separation is necessary for two such PMD vectors to become statistically uncorrelated.
The normalized autocorrelation functions for the PMD vector $R_{\vec{\tau}}(W)$ and the DGD squared $R_{\tau^2}(W)$ are given in Table 9.2¹. Both functions are even functions of the argument, continuous, dependent only on frequency difference $\Delta\omega$ and not absolute frequency, and have a maximum at $\Delta\omega = 0$. These properties are characteristic of a wide-sense stationary process in frequency.
The PMD vector autocorrelation function originates with the dot-product of the PMD vector at two different frequencies, averaged over frequency:

$$\langle\vec{\tau}(\omega')\cdot\vec{\tau}(\omega)\rangle = \langle\tau^2\rangle\, R_{\vec{\tau}}(W) \tag{9.4.9}$$

Figure 9.12(a) plots $R_{\vec{\tau}}(W)$ as a function of cyclic frequency $\Delta f$. The product $\Delta\omega^2\langle\tau^2\rangle$ (or $\Delta f\langle\tau\rangle$) alone determines the form of the ACF. In an extensive study Lin and Agrawal have connected this and higher-order ACFs to pulse broadening and distortion [35].

¹ The function sinhc is a sinh function divided by its argument. In general a trigonometric function followed by the letter c is to be divided by its argument. At the origin, sinhc(0) = 1.
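[Figure 9.12: a) autocorrelation functions $R_{\vec{\tau}}$, $R_\tau$, and $R_{\hat{p}}$ versus $\Delta f\langle\tau\rangle$ from 0 to 2. b) $\log_{10}$ of $\mathrm{var}(\tau^2(B_f))/\langle\tau^2\rangle$ versus $B_f\langle\tau\rangle$ from 0 to 50, with the approximation $0.8/(B_f\langle\tau\rangle)$.]
Fig. 9.12. Autocorrelation function and log variance. a) Autocorrelation function $R_{\vec{\tau}}$ of the PMD vector as function of $\Delta f\langle\tau\rangle$. FWHM is $\Delta f\langle\tau\rangle \approx 2/\pi$. The PMD-vector ACF is dominated by depolarization, $R_{\hat{p}}$ (dashed curve). The root-mean-square DGD ACF $R_\tau$ has a secondary effect. b) Normalized $\log_{10}$ variance of estimated value $\langle\tau^2\rangle(B_f)$ as function of measurement bandwidth (in Hertz) $B_f\langle\tau\rangle$. Variance is well approximated by $16\sqrt{2}/(9\pi B_f\langle\tau\rangle)$ for $B_f\langle\tau\rangle > 5$.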

Especially important is the autocorrelation function bandwidth, defined as the full-width half-maximum (FWHM) of the function. In cyclic frequency the FWHM for the PMD vector is

$$\Delta f_{\mathrm{PMD\,ACF}} \approx \frac{2}{\pi\langle\tau\rangle} \tag{9.4.10}$$

This is a useful relation to know. For instance, with $\langle\tau\rangle = 30$ ps, the correlation bandwidth is $\Delta f \approx 21$ GHz, or $\Delta\lambda \approx 0.16$ nm. Measurements of high-PMD fiber should at least have a 20 GHz resolution to capture the variation, but any higher resolution creates correlated measurements that oversample the PMD. Statistical estimates should be made only after the sample points of a high-resolution measurement are decimated (or are treated with other signal-processing methods) down to the autocorrelation bandwidth.
Another example is the failure of estimating mean DGD on installed fiber when the measurements are taken through a narrowband optical filter such as one port of a multiplexer. A typical channel bandwidth for such a filter is 40 GHz. The mean fiber DGD must be 32 ps or greater to have more than one statistically independent measurement within the filter passband.
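The correlation bandwidth can be recovered numerically from the PMD-vector ACF of Table 9.2. The sketch below assumes the table's argument is $W = \Delta\omega^2\langle\tau^2\rangle$ and uses the Maxwellian relation (9.4.3) to convert to the mean DGD; it finds the half-maximum point by bisection. The function names and the bisection bracket are illustrative.

```python
import math

def pmd_vector_acf(w):
    """Normalized ACF sinhc(W/6) exp(-W/6), rewritten as 3 (1 - exp(-W/3)) / W."""
    if w == 0.0:
        return 1.0
    return -3.0 * math.expm1(-w / 3.0) / w

def acf_fwhm_product():
    """Return the FWHM in cyclic frequency multiplied by the mean DGD."""
    lo, hi = 1e-6, 50.0  # bracket in W; the ACF decreases monotonically
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pmd_vector_acf(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    w_half = 0.5 * (lo + hi)  # W at half maximum (one-sided offset)
    # <tau^2> = (3 pi / 8) <tau>^2, so delta_omega * <tau> = sqrt(W * 8 / (3 pi))
    half_width = math.sqrt(w_half * 8.0 / (3.0 * math.pi))
    return 2.0 * half_width / (2.0 * math.pi)  # full width, converted to Hz

product = acf_fwhm_product()
bandwidth_ghz = product / 30e-12 / 1e9       # correlation bandwidth for <tau> = 30 ps
independent = 40e9 * 32e-12 / product        # correlation bandwidths in a 40 GHz filter
print(product, bandwidth_ghz, independent)
```

The product comes out near $2/\pi$, the 30 ps example reproduces the correlation bandwidth of roughly 21 GHz, and a 40 GHz passband on a 32 ps fiber spans about two correlation bandwidths.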
As $R_{\vec{\tau}}$ is the autocorrelation of the PMD vector, the question is which component of that vector, the DGD or the pointing direction, dominates the decorrelation. In the preceding section it was determined that depolarization dominates PDCD, which is representative of the strong tendency for the PMD vector to change direction. The autocorrelation function also reflects this behavior.
The autocorrelation function of the mean-square DGD was derived by Shtaif and Mecozzi [57] to answer just this question. As the DGD-squared is the dot-product of the PMD vector with itself, $\tau^2(\omega) = \vec{\tau}(\omega)\cdot\vec{\tau}(\omega)$, the correlation between $\tau^2(\omega')$ and $\tau^2(\omega)$ is

$$\langle\tau^2(\omega')\,\tau^2(\omega)\rangle = \frac{5}{3}\langle\tau^2\rangle^2\, R_{\tau^2}(W) \tag{9.4.11}$$

where $R_{\tau^2}$ is listed in Table 9.2. The autocorrelation functions $R_\tau$ for the DGD and $R_{\hat{p}}$ for the pointing direction of the PMD are then extracted as shown in the table. These two functions are also plotted in Fig. 9.12(a). The correlation bandwidths are all about the same (with $\langle\tau\rangle\,\Delta f_{\mathrm{ACF\,DGD}} \approx 4/9$) but the DGD ACF falls from $R_\tau(0) = 1$ to $R_\tau(\infty) = \sqrt{3/5}$, or about 22%. The PMD-vector ACF is dominated by the unit-vector ACF $R_{\hat{p}}$, consistent with the tendency of the PMD vector to change direction, or depolarize. The autocorrelation function for the PSP has been recently reported by Bao et al. [26].
The PMD-vector autocorrelation function can be modified to determine
the weights of all moments of the PMD vector relative to the mean-square
DGD [56]. Unlike the preceding ACFs, the moment relations apply to a single
frequency. The two moment relations are

\[ \left\langle \vec\tau^{(n-k)}\cdot\vec\tau^{(n+k+1)} \right\rangle = 0 \tag{9.4.12a} \]

\[ \left\langle \vec\tau^{(n-k)}\cdot\vec\tau^{(n+k)} \right\rangle = (-1)^k\, \langle\tau^2\rangle^{n+1}\, \frac{(2n)!}{3^n\,(n+1)!} \tag{9.4.12b} \]

where $\vec\tau^{(n)}$ refers to the $(n+1)$th order of $\vec\tau$. While even/odd moments vanish, like
moments ($k = 0$) grow quickly with $n$. This is another reflection of the disorder
and complicated structure of the DGD spectrum. Recall from the Fourier
analysis of the DGD spectrum that an increase in the number of elementary
PMD segments in a concatenation increases the number of Fourier components
in the spectrum. Since $\langle\tau^2\rangle \propto z$, higher moments grow increasingly quickly as
the fiber length increases, consistent with the Fourier picture.
The factorial-function coefficient to $\langle\tau^2\rangle$ in (9.4.12b) grows very quickly
with $n$. The origin of this coefficient is the white noise that underlies the model.
The moments of a Brownian motion are Hermite polynomials evaluated at
the origin. The resulting Hermite coefficients grow as $(2n)!/n!$; this growth
is reflected in the moments of the PMD vector since it is a process derived
from Brownian motion of the birefringent vector. The relative growth of the
coefficient for successive moments is

\[ \frac{\left\langle \vec\tau^{(n)}\cdot\vec\tau^{(n)} \right\rangle}{\left\langle \vec\tau^{(n-1)}\cdot\vec\tau^{(n-1)} \right\rangle} = \langle\tau^2\rangle\,\frac{2n(2n-1)}{3(n+1)} \simeq \frac{4}{3}\,\langle\tau^2\rangle\, n \]

For large $n$ the growth is linear in $n$, again consistent with Hermite polynomial
behavior.
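The growth of the like-moment coefficient can be tabulated directly. The sketch below evaluates the factor $(2n)!/(3^n (n+1)!)$ multiplying $\langle\tau^2\rangle^{n+1}$ in (9.4.12b) for $k = 0$ and compares the ratio of successive coefficients with the closed form $2n(2n-1)/(3(n+1))$ and its large-$n$ limit $4n/3$. The function names are illustrative.

```python
from math import factorial

def moment_coefficient(n):
    """Coefficient of <tau^2>^(n+1) in (9.4.12b) with k = 0."""
    return factorial(2 * n) / (3 ** n * factorial(n + 1))

def successive_ratio(n):
    """Ratio of the n-th coefficient to the (n-1)-th coefficient."""
    return moment_coefficient(n) / moment_coefficient(n - 1)

if __name__ == "__main__":
    print(" n  coefficient   ratio   closed form   4n/3")
    for n in range(1, 9):
        closed_form = 2 * n * (2 * n - 1) / (3 * (n + 1))
        print(f"{n:2d}  {moment_coefficient(n):11.4f}  "
              f"{successive_ratio(n):6.3f}  {closed_form:11.3f}  {4 * n / 3:6.3f}")
```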
Autocorrelation Function Derivations
The PMD-related ACFs can be derived exclusively using a stochastic-calculus
treatment of the birefringence and PMD vector [37]. The PMD evolution
equation (9.3.1) on page 397 may be rewritten in SDE form as

\[ d\vec\tau = d\vec B_z + \omega\, d\vec B_z \times \vec\tau \tag{9.4.13} \]

where

\[ d\vec B_z = \vec\beta_\omega\, dz, \qquad d\vec B_z\cdot d\vec B_z = \gamma^2\, dz, \qquad \text{and} \qquad dB_{z,j}\, dB_{z,k} = 0, \quad j \neq k \tag{9.4.14} \]

The differential $d\vec B_z$ is a three-entry column vector of i.i.d. Brownian motions
that are the driving force in the evolution of $d\vec\tau$. The Brownian motion represents
the local birefringence vector. The subscript $z$ denotes that this motion
is longitudinal in $z$ and defines the domain over which averages will be taken.
Brownian motion grows as $\sqrt{z}$, so according to the rules of stochastic calculus
terms up to order $z$ are included, thus $d\vec B_z\cdot d\vec B_z = \gamma^2 dz$, where $\gamma^2$ is
the strength of the motion. (White noise is the formal derivative of Brownian
motion: $dB_z = g_z\, dz$.)
The SDE (9.4.13) is in the Stratonovich form and must be translated
into the Itô form. That translation generates an additional term, which when
included makes

\[ d\vec\tau = d\vec B_z + \omega\, d\vec B_z \times \vec\tau - \frac{\gamma^2\omega^2}{3}\,\vec\tau\, dz \tag{9.4.15} \]

This correction term shifts the drift of $\vec\tau$ but not the random behavior.
The first calculation is the mean-square of the DGD, or $\langle\tau^2\rangle$. Treating $\vec\tau$ as
a stochastic variable, the differential of $\tau^2$ using Itô's chain rule is

\[ d(\tau^2) = 2\,\vec\tau\cdot d\vec\tau + d\vec\tau\cdot d\vec\tau \tag{9.4.16} \]

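To clarify the following calculations, the Stratonovich form of $d\vec\tau$ (9.4.13) is
first substituted into (9.4.16). Keeping only terms that survive an average,
this partial solution gives

\begin{align*}
d(\tau^2) &= 2\,\vec\tau\cdot d\vec B_z + \gamma^2 dz + \omega^2\,(d\vec B_z \times \vec\tau)^2 \\
          &= 2\,\vec\tau\cdot d\vec B_z + \gamma^2 dz + \omega^2\left( \gamma^2\tau^2\, dz - (d\vec B_z\cdot\vec\tau)^2 \right) \\
          &= 2\,\vec\tau\cdot d\vec B_z + \gamma^2 dz + \frac{2}{3}\,\omega^2\gamma^2\tau^2\, dz
\end{align*}

where

\[ (d\vec B_z\cdot\vec\tau)^2 = \sum_{k=1}^{3} dB_{z,k}\, dB_{z,k}\, \tau_k^2 + \text{cross-terms} \]

and $dB_{z,k}\, dB_{z,k} = \gamma^2 dz/3$. Now, adding the Itô drift correction from (9.4.15)
back into $d\vec\tau$ and keeping terms only up to order $dz$ gives

\[ d(\tau^2) = 2\,\vec\tau\cdot d\vec B_z + \gamma^2 dz \tag{9.4.17} \]

Averaging this expression over $z$ eliminates terms of order $d\vec B_z$. Subsequent
integration over length results in the mean-square evolution of the DGD

\[ \langle\tau^2\rangle(z) = \gamma^2 z \tag{9.4.18} \]

Identification with (9.3.2) on page 397 gives $\gamma^2 = 2\langle\tau_c^2\rangle/L_C$.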
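The linear growth of the mean-square DGD can be checked numerically. The following is a minimal Monte Carlo sketch, assuming the Itô-form SDE reads $d\vec\tau = d\vec B_z + \omega\, d\vec B_z\times\vec\tau - (\gamma^2\omega^2/3)\,\vec\tau\, dz$ with each component of $d\vec B_z$ drawn from $N(0, \gamma^2 dz/3)$. The Euler-Maruyama stepping, the parameter values, and the function name are illustrative choices, not taken from the text.

```python
import math
import random

def simulate_mean_square_dgd(gamma2=1.0, omega=2.0, length=1.0,
                             n_steps=100, n_paths=1000, seed=1):
    """Euler-Maruyama integration of the assumed Ito-form PMD SDE.

    Returns the ensemble average of |tau|^2 at z = length, which should
    approach gamma2 * length independently of omega.
    """
    rng = random.Random(seed)
    dz = length / n_steps
    sigma = math.sqrt(gamma2 * dz / 3.0)   # std of each dB component
    drift = gamma2 * omega ** 2 / 3.0      # Ito correction coefficient
    total = 0.0
    for _ in range(n_paths):
        tx = ty = tz = 0.0                 # tau(z = 0) = 0
        for _ in range(n_steps):
            bx = rng.gauss(0.0, sigma)
            by = rng.gauss(0.0, sigma)
            bz = rng.gauss(0.0, sigma)
            # cross product dB x tau, evaluated with the current tau
            cx = by * tz - bz * ty
            cy = bz * tx - bx * tz
            cz = bx * ty - by * tx
            tx, ty, tz = (tx + bx + omega * cx - drift * tx * dz,
                          ty + by + omega * cy - drift * ty * dz,
                          tz + bz + omega * cz - drift * tz * dz)
        total += tx * tx + ty * ty + tz * tz
    return total / n_paths

if __name__ == "__main__":
    print("simulated <tau^2> at z = 1:", simulate_mean_square_dgd())
    print("expected gamma^2 z        :", 1.0)
```

With a finite number of paths the estimate fluctuates by a few percent around $\gamma^2 z$; more paths or smaller steps tighten the agreement.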
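The ACF for the PMD vector can now be calculated. Denoting $\vec\tau' = \vec\tau(\omega')$
(rather than the frequency derivative of $\vec\tau$), the Itô differential of the dot
product $\vec\tau\cdot\vec\tau'$ is

\[ d(\vec\tau\cdot\vec\tau') = (d\vec\tau)\cdot\vec\tau' + \vec\tau\cdot(d\vec\tau') + d\vec\tau\cdot d\vec\tau' \]

Substitution of (9.4.15) and keeping terms only up to order $dz$ gives

\begin{align*}
d(\vec\tau\cdot\vec\tau') ={}& d\vec B_z\cdot(\vec\tau + \vec\tau') + (\omega - \omega')\, d\vec B_z\cdot(\vec\tau\times\vec\tau') + \gamma^2 dz \\
& - \frac{1}{3}\gamma^2(\omega^2 + \omega'^2)(\vec\tau\cdot\vec\tau')\, dz + \omega\omega'\gamma^2(\vec\tau\cdot\vec\tau')\, dz - \omega\omega'\,(d\vec B_z\cdot\vec\tau)(d\vec B_z\cdot\vec\tau')
\end{align*}

Subsequent averaging over $z$ eliminates many terms. The average over the
product of inner products in particular gives

\[ \left\langle (d\vec B_z\cdot\vec\tau)(d\vec B_z\cdot\vec\tau') \right\rangle = \frac{1}{3}\gamma^2\,(\vec\tau\cdot\vec\tau')\, dz \tag{9.4.19} \]

Completing the average over all terms gives the differential form of the autocorrelation

\[ d\langle\vec\tau\cdot\vec\tau'\rangle = \left( 1 - \frac{1}{3}\Delta\omega^2\,\langle\vec\tau\cdot\vec\tau'\rangle \right)\gamma^2\, dz \]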
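Integration gives

\[ \langle\vec\tau\cdot\vec\tau'\rangle = \frac{3}{\Delta\omega^2}\left[ 1 - \exp\left( -\frac{\gamma^2\Delta\omega^2 z}{3} \right) \right] \tag{9.4.20} \]

Replacement of $\gamma^2 z = \langle\tau^2\rangle$ and normalization by $\langle\tau^2\rangle$ results in the normalized
autocorrelation function listed in Table 9.2:

\[ R_{\vec\tau}(\Delta\omega) = \mathrm{sinhc}\left( \frac{\Delta\omega^2\langle\tau^2\rangle}{6} \right)\exp\left( -\frac{\Delta\omega^2\langle\tau^2\rangle}{6} \right) \tag{9.4.21} \]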
The limits of the ACF are $R_{\vec\tau}(0) = 1$ and $R_{\vec\tau}(\infty) = 0$. It is remarkable that the
only terms that enter the ACF are the frequency difference and the mean-square
DGD $\langle\tau^2\rangle$. The mean-square DGD in turn is directly proportional to
the square of the mean fiber DGD. Once again the mean fiber DGD is the
unit by which a PMD-related statistical quantity is governed.
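The normalized PMD-vector ACF is simple to evaluate numerically. The sketch below assumes the form $R_{\vec\tau}(\Delta\omega) = \mathrm{sinhc}(x)\,e^{-x}$ with $x = \Delta\omega^2\langle\tau^2\rangle/6$, rewritten as $(1 - e^{-2x})/(2x)$ to avoid overflow of $\sinh$ at large arguments, and uses the Maxwellian relation $\langle\tau^2\rangle = (3\pi/8)\langle\tau\rangle^2$. The 10 ps mean DGD and the frequency step $2/(\pi\langle\tau\rangle)$ are illustrative inputs.

```python
import math

def pmd_vector_acf(delta_omega, mean_square_dgd):
    """Normalized PMD-vector ACF, sinhc(x) * exp(-x), x = dw^2 <tau^2> / 6.

    Uses the algebraically equivalent form (1 - exp(-2x)) / (2x).
    """
    x = delta_omega ** 2 * mean_square_dgd / 6.0
    if x == 0.0:
        return 1.0
    return -math.expm1(-2.0 * x) / (2.0 * x)

def mean_square_from_mean(mean_dgd):
    """Maxwellian relation <tau^2> = (3 pi / 8) <tau>^2."""
    return 3.0 * math.pi / 8.0 * mean_dgd ** 2

if __name__ == "__main__":
    mean_dgd = 10e-12                      # 10 ps, illustrative
    t2 = mean_square_from_mean(mean_dgd)
    for df_ghz in (0, 10, 20, 40, 2e3 / (math.pi * 10), 100, 200):
        dw = 2.0 * math.pi * df_ghz * 1e9
        print(f"delta f = {df_ghz:7.2f} GHz   R = {pmd_vector_acf(dw, t2):.4f}")
```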
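Lastly, the autocorrelation of the DGD squared is calculated. The kernel
of the calculation is $\tau^2\tau'^2$ where $\tau'^2 = \vec\tau'\cdot\vec\tau'$. Treating $\tau^2$ as a stochastic
variable, the differential is

\[ d(\tau^2\tau'^2) = d(\tau^2)\,\tau'^2 + \tau^2\, d(\tau'^2) + d(\tau^2)\, d(\tau'^2) \tag{9.4.22} \]

Substitution of (9.4.17) into (9.4.22) and keeping terms only up to order $dz$
gives

\begin{align*}
d(\tau^2\tau'^2) ={}& 2\tau'^2\,(d\vec B_z\cdot\vec\tau) + 2\tau^2\,(d\vec B_z\cdot\vec\tau') \\
& + (\tau^2 + \tau'^2)\,\gamma^2 dz + 4\,(d\vec B_z\cdot\vec\tau)(d\vec B_z\cdot\vec\tau')
\end{align*}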
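Averaging over $z$ and using (9.4.19) leaves

\[ d\langle\tau^2\tau'^2\rangle = 2\gamma^4 z\, dz + \frac{4\gamma^2}{3}\,\langle\vec\tau\cdot\vec\tau'\rangle\, dz \tag{9.4.23} \]

where an Itô isometry removes the random component in $\langle\tau'^2\rangle$:

\[ \langle d(\tau'^2)\rangle = \left\langle 2\,\vec\tau'\cdot d\vec B_z + \gamma^2 dz \right\rangle = \gamma^2 dz \]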
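This average is no longer a function of $\omega'$. Substitution of the PMD-vector
ACF into (9.4.23) makes for a straightforward integration, resulting in

\[ \langle\tau^2\tau'^2\rangle = \langle\tau^2\rangle^2 + \frac{4\langle\tau^2\rangle}{\Delta\omega^2} - \frac{12}{\Delta\omega^4}\left[ 1 - \exp\left( -\frac{\Delta\omega^2\langle\tau^2\rangle}{3} \right) \right] \tag{9.4.24} \]

Subsequent normalization as shown in (9.4.11) and rearrangement of terms
produces the normalized ACF for the DGD squared:

\[ R_{\tau^2}(\Delta\omega) = \frac{3}{5}\left[ 1 + \frac{2}{3}\,\frac{1 - R_{\vec\tau}(\Delta\omega)}{\Delta\omega^2\langle\tau^2\rangle/6} \right] \tag{9.4.25} \]

As with the PMD-vector ACF, the DGD-squared ACF depends only on the
frequency difference and mean fiber DGD. The limits of this autocorrelation
are $R_{\tau^2}(0) = 1$ and $R_{\tau^2}(\infty) = 3/5$.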
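A short numerical check of the DGD-squared ACF and its limits follows. It assumes $R_{\tau^2}(\Delta\omega) = \tfrac{3}{5}\left[1 + \tfrac{2}{3}(1 - R_{\vec\tau})/x\right]$ with $x = \Delta\omega^2\langle\tau^2\rangle/6$ and $R_{\vec\tau} = (1 - e^{-2x})/(2x)$. The small-$x$ series branch is an implementation detail added here to avoid a 0/0 evaluation; it is not part of the text.

```python
import math

def pmd_vector_acf(delta_omega, mean_square_dgd):
    """Normalized PMD-vector ACF in the form (1 - exp(-2x)) / (2x)."""
    x = delta_omega ** 2 * mean_square_dgd / 6.0
    return 1.0 if x == 0.0 else -math.expm1(-2.0 * x) / (2.0 * x)

def dgd_squared_acf(delta_omega, mean_square_dgd):
    """Normalized ACF of the DGD squared, assumed form of (9.4.25)."""
    x = delta_omega ** 2 * mean_square_dgd / 6.0
    if x < 1e-4:
        # (1 - R)/x -> 1 - 2x/3 for small x (series expansion)
        ratio = 1.0 - 2.0 * x / 3.0
    else:
        ratio = (1.0 - pmd_vector_acf(delta_omega, mean_square_dgd)) / x
    return 0.6 * (1.0 + (2.0 / 3.0) * ratio)

if __name__ == "__main__":
    t2 = 1.0  # normalized mean-square DGD
    for dw in (0.0, 0.5, 1.0, 2.0, 5.0, 20.0, 1e3):
        print(f"dw = {dw:8.1f}   R_vec = {pmd_vector_acf(dw, t2):.4f}"
              f"   R_tau2 = {dgd_squared_acf(dw, t2):.4f}")
```

The printed values fall from 1 toward 3/5 for the DGD squared, while the vector ACF falls to zero.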
9.4.3 Mean-DGD Measurement Uncertainty
The PMD ACF gives the minimum bandwidth over which two neighboring
PMD vectors are statistically independent. The PMD ACF can also be used
to determine the uncertainty of an estimator of the mean DGD of a fiber.
This important application has been studied by Gisin et al. [22], Karlsson
and Brentel [31], Shtaif and Mecozzi [57], and Boroditsky et al. [4].
There is a difference in framework between the first three reports and the
most recent. In particular, the relation between the mean-square DGD and
average DGD, $\langle\tau\rangle^2 = (8/3\pi)\,\langle\tau^2\rangle$, is considered exact in the former reports while
Boroditsky et al. explain that equality holds only over infinite bandwidth (or
ensemble averages). In fact, in the limit of zero bandwidth there is an 8%
systematic error between mean-square and average DGD. In the broadband
regime $B\sqrt{\langle\tau^2\rangle} > 30$ (discussed below), the error between the DGD moments
is

8
8 1
=

2
(9.4.26)
3
9 2B
where $B$ is the full measurement bandwidth in radians. Measurements of
low-PMD fibers are susceptible to this error.
Putting aside this systematic error for the moment, there are two ways to
estimate the mean DGD from a measurement: average the DGD values across
frequency, or average the DGD-squared values across frequency and take the
square root. The former is a straight average, while the latter gives the rms
value. The studies show that the rms average gives a slightly better estimate.
The variance of the rms estimate is detailed here, and Shtaif and Mecozzi give
a brief comparison.
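Consider the estimate of the mean-square DGD over a radian bandwidth
$B = \omega_2 - \omega_1$:

\[ \tau^2_{\mathrm{est}}(B) = \frac{1}{B}\int_B \tau^2(\omega)\, d\omega \tag{9.4.27} \]

The variance of $\tau^2_{\mathrm{est}}(B)$ is, by definition,

\begin{align*}
\mathrm{var}\left( \tau^2_{\mathrm{est}}(B) \right) &= E\left[ \tau^2_{\mathrm{est}}(B)\,\tau^2_{\mathrm{est}}(B) \right] - E^2\left[ \tau^2_{\mathrm{est}}(B) \right] \\
&= \frac{1}{B^2}\, E\left[ \int_B d\omega \int_B d\omega'\, \tau^2(\omega)\,\tau^2(\omega') \right] - \langle\tau^2\rangle^2 \\
&= \frac{1}{B}\int_B d\Delta\omega \left( \langle \tau^2(\omega)\,\tau^2(\omega') \rangle - \langle\tau^2\rangle^2 \right)
\end{align*}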
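where the double integral reduces to a single integral since the integrand
depends only on the frequency difference and not absolute value. The last
integrand has already been calculated, see (9.4.24). Additionally, it is more
relevant to look at the normalized variance so comparisons can be made. Thus,
normalizing the variance by $\langle\tau^2\rangle^2$ and computing the integral gives

\begin{align*}
\frac{\mathrm{var}\left( \tau^2_{\mathrm{est}}(B) \right)}{\langle\tau^2\rangle^2} ={}& 16\,\frac{4 - B^2\langle\tau^2\rangle}{B^4\langle\tau^2\rangle^2} + \frac{32}{3}\,\frac{B^2\langle\tau^2\rangle - 6}{B^4\langle\tau^2\rangle^2}\, e^{-B^2\langle\tau^2\rangle/12} \\
& + \frac{16\sqrt{3\pi}}{9}\,\frac{1}{B\sqrt{\langle\tau^2\rangle}}\,\mathrm{erf}\left( \frac{B\sqrt{\langle\tau^2\rangle}}{2\sqrt{3}} \right)
\end{align*}

This function is plotted in Fig. 9.12(b). The asymptotic limit for $B\sqrt{\langle\tau^2\rangle} > 30$
is

\[ \frac{\mathrm{var}\left( \tau^2_{\mathrm{est}}(B) \right)}{\langle\tau^2\rangle^2} \simeq \frac{16\sqrt{3\pi}}{9}\,\frac{1}{B\sqrt{\langle\tau^2\rangle}} \tag{9.4.28} \]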

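Translation to bandwidth in Hertz and average DGD gives

\[ B\sqrt{\langle\tau^2\rangle} = \sqrt{\frac{3\pi^3}{2}}\;\langle\tau\rangle B_f \tag{9.4.29} \]

where $B = 2\pi B_f$. Thus,

\[ \frac{\mathrm{var}\left( \tau^2_{\mathrm{est}}(B_f) \right)}{\langle\tau^2\rangle^2} \simeq \frac{16\sqrt{2}}{9\pi}\,\frac{1}{\langle\tau\rangle B_f}, \qquad \langle\tau\rangle B_f > 5 \tag{9.4.30} \]

Comparison of this approximate variance formula is given in Fig. 9.12(b).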
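The comparison between the full variance expression and its asymptote can be reproduced numerically. The sketch below assumes the normalized variance, written in terms of $x = B\sqrt{\langle\tau^2\rangle}$, is $16(4 - x^2)/x^4 + \tfrac{32}{3}(x^2 - 6)e^{-x^2/12}/x^4 + \tfrac{16\sqrt{3\pi}}{9x}\,\mathrm{erf}(x/2\sqrt{3})$, with asymptote $16\sqrt{3\pi}/(9x)$. The function names are illustrative, and the closed form should not be evaluated at very small $x$, where its terms nearly cancel.

```python
import math

def normalized_variance(x):
    """var(tau2_est) / <tau^2>^2 versus x = B * sqrt(<tau^2>), B in rad/s."""
    x2 = x * x
    x4 = x2 * x2
    return (16.0 * (4.0 - x2) / x4
            + (32.0 / 3.0) * (x2 - 6.0) / x4 * math.exp(-x2 / 12.0)
            + 16.0 * math.sqrt(3.0 * math.pi) / (9.0 * x)
            * math.erf(x / (2.0 * math.sqrt(3.0))))

def asymptotic_variance(x):
    """Large-bandwidth limit of the normalized variance."""
    return 16.0 * math.sqrt(3.0 * math.pi) / (9.0 * x)

def hertz_coefficient():
    """Coefficient multiplying 1 / (<tau> B_f) after converting to Hertz."""
    return 16.0 * math.sqrt(2.0) / (9.0 * math.pi)

if __name__ == "__main__":
    for x in (0.5, 2.0, 10.0, 30.0, 100.0, 300.0):
        print(f"x = {x:6.1f}   full = {normalized_variance(x):.5f}"
              f"   asymptote = {asymptotic_variance(x):.5f}")
    c = hertz_coefficient()
    print(f"Hertz coefficient = {c:.3f}, square root = {math.sqrt(c):.3f}")
```

At small bandwidth the normalized variance approaches 2/3, the single-point variance of a Maxwellian DGD squared, which serves as a sanity check on the expression.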


The expected error is estimated by removing the normalization in expression
(9.4.28). Since the estimated quantity is the mean-square of the
DGD, plus and minus one standard deviation from the true value is
$\tau^2_{\mathrm{est}} \simeq \langle\tau^2\rangle \pm \sqrt{\mathrm{var}(\tau^2_{\mathrm{est}})}$. Substitution of the variance by (9.4.28), translation
to cyclic frequency (9.4.29), and converting both sides to mean DGD gives
the expression for the estimator uncertainty:

\[ \tau_{\mathrm{est}}(B_f) \simeq \langle\tau\rangle\left( 1 \pm \frac{0.9}{\sqrt{\langle\tau\rangle B_f}} \right) \tag{9.4.31} \]

where the coefficient in the numerator comes from $\sqrt{16\sqrt{2}/(9\pi)}$. This coefficient
agrees with Gisin [22]. Moreover, the expression shows that reduction of
the standard deviation of the estimated value of $\langle\tau\rangle$ is a slow function: $1/\sqrt{\langle\tau\rangle B_f}$.
For example, consider an uncertainty of 10%: $\langle\tau\rangle B_f \simeq 110$. For a mean
DGD of 10 ps, the required measurement bandwidth is 11,000 GHz,
or about 90 nm. To halve the uncertainty the bandwidth must be quadrupled.
It is an open question whether an estimator with a faster convergence can
be found. The square-root form for the mean-square estimator suggests an
estimator based on the fourth power of the DGD spectrum. This requires a
higher-order autocorrelation function. Another way to increase the certainty
of the mean DGD is to take multiple uncorrelated measurements over time.
That uncertainty goes as $1/\sqrt{N}$ with $N$ measurements; again a slow function
but useful nonetheless.
Returning to Boroditsky et al., the authors show that average DGD estimated
from the magnitude SOPMD spectrum gives an unbiased estimator and reduces
the measurement uncertainty by 30%. The reduction in measurement
uncertainty is equivalent to effectively doubling the measurement
bandwidth. They further show that average DGD estimated from the PDCD
spectrum alone yields a better estimate of average DGD compared to direct
mean-square DGD spectrum analysis. However, the magnitude SOPMD spectrum
fluctuates roughly twice as fast as the corresponding DGD spectrum,
which in turn requires greater care in measurement. The vector MPS technique
should produce sufficiently accurate measurements. Moreover, the width
of the PDCD density is only 1/9 that of the magnitude SOPMD spectrum,
so again, care must be used in obtaining a sufficiently accurate measurement
to effectively employ these techniques.


9.4.4 Discrete Waveplate Model


The analytic developments of this chapter are derived from a Brownian motion
model of the local birefringence vector. The powerful tools of stochastic
calculus and partial-differential equations are then employed to derive statistical
properties of polarization and PMD. However, the cascaded waveplate
model is very often used instead. The waveplate model concentrates differential
delay into homogeneous segments and then abruptly mode-mixes between
adjacent segments. The waveplate model is a good approximation
in certain regimes as long as it is correctly constructed. While there are several
variations, the model below converges to the correct statistics.
In the regime $L \gg L_C \gg L_B$, where $L$ is the fiber length, the waveplate
model illustrated in Fig. 9.13(a) gives a reasonable approximation for
the PMD. In particular, the rms DGD statistics follow (9.4.1). The model
uses equal-length waveplates where each plate is $L_C/2$ long and there are
$N_c = 2L/L_C$ waveplates in total. The statistics track for $N_c \gtrsim 30$. The physical
waveplate orientation is uniformly distributed on $[-\pi/2, \pi/2]$ and zero
chirality is asserted. The birefringence (magnitude) of each plate is a random
variable selected from a Rayleigh distribution.
There are two aspects to be worked out. One relates to the frequency bandwidth
and step size and the other to the gaussian distributions of the cartesian
components of the birefringence. First the frequency grid. To derive a good
statistic, uncorrelated DGD values over a sufficiently wide bandwidth must
be calculated. At the discrete level, the total bandwidth $B_f$ comes from $N_f$
points of step size $\Delta f$: $B_f = N_f\,\Delta f$. The minimum uncorrelated bandwidth
for an average DGD $\langle\tau\rangle$ is $\Delta f = 2/(\pi\langle\tau\rangle)$, so the bandwidth-mean-DGD product
is $\langle\tau\rangle B_f = 2N_f/\pi$. Substitution into the mean-DGD estimate (9.4.31) gives

\[ \tau_{\mathrm{est}}(N_f) \simeq \langle\tau\rangle\left( 1 \pm \frac{1.13}{\sqrt{N_f}} \right) \tag{9.4.32} \]

This is the basis on which $N_f$ is set. For instance, $N_f = 500$ gives a standard
deviation of 5%.
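The choice of $N_f$ can be automated with a few lines. The sketch below assumes the relative standard deviation $1.13/\sqrt{N_f}$ from (9.4.32); the function names are illustrative.

```python
import math

COEFFICIENT = 1.13  # numerator of (9.4.32)

def relative_uncertainty(n_points):
    """One-sigma relative uncertainty of the mean-DGD estimate for
    n_points uncorrelated frequency samples."""
    return COEFFICIENT / math.sqrt(n_points)

def points_for_uncertainty(target):
    """Smallest number of uncorrelated points giving at most `target`
    relative uncertainty (for example 0.05 for 5%)."""
    return math.ceil((COEFFICIENT / target) ** 2)

if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"N_f = {n:5d}   uncertainty = {100 * relative_uncertainty(n):.1f}%")
    print("points needed for 5%:", points_for_uncertainty(0.05))
```

Because the uncertainty scales as $1/\sqrt{N_f}$, quadrupling the number of uncorrelated points halves it.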
Next the birefringent distribution is determined. The mean-square DGD
as a function of length (9.4.1) is rewritten as

\[ \langle\tau^2\rangle(N_c) = \langle\beta_\omega^2\rangle\, L_C^2\, N_c \tag{9.4.33} \]

where $N_c = 2z/L_C$ and $\tau_c = \beta_\omega L_C$. The PMD distribution is Maxwellian, so
a $3\pi/8$ scale factor relates its first and second moments.
The birefringent distribution is Rayleigh, so the second moment $\langle\beta_\omega^2\rangle$ is a factor of two larger
than the variance of the underlying i.i.d. gaussian cartesian-component
distributions. Putting these together gives the variance of the component gaussian:

\[ \sigma^2_{\beta_\omega,k} = \frac{3\pi\,\langle\tau\rangle^2}{16\, L_C^2\, N_c} \tag{9.4.34} \]

Fig. 9.13. Waveplate model of a fiber, good for the $L \gg L_C \gg L_B$ regime. a) Waveplates
have uniformly distributed e-axis orientations and are each $L_C/2$ long. There
are $N_c = 2L/L_C$ segments in total. The DGD per segment is determined from
a Rayleigh distribution. b) Realization of a DGD spectrum (DGD in ps versus
relative frequency in THz) for $N_c = 512$ and $N_f = 600$, given $\langle\tau\rangle = 10$ ps. The
DGD distribution is shown to the right. The calculation uses large-enough
frequency steps so that DGD values are statistically uncorrelated.

Each cartesian component of the birefringence is randomly picked from a normal
distribution with density $\beta_{\omega,k} = N(0, \sigma^2_{\beta_\omega,k})$. The segment birefringence
magnitude is then $\beta_\omega = \sqrt{\beta_{\omega,1}^2 + \beta_{\omega,2}^2}$, with a corresponding segment DGD of
$\tau_c = \beta_\omega L_C$.
Finally, the average Stokes-vector rotation per frequency step $\varphi_c$ is estimated
by combining (9.4.33) recast in terms of $\langle\tau\rangle$ and $\tau_c$, and the uncorrelated
frequency step $\Delta f$. The result is

\[ \varphi_c = \sqrt{\frac{3\pi^2}{2N_c}} \tag{9.4.35} \]

For instance, with $N_c = 512$, the average Stokes rotation per frequency step is
$\varphi_c \simeq 9.7^\circ$. This is a good check because a single step in excess of $\varphi_c > \pi$
creates an ambiguity as to whether the PMD completed more than a half-revolution
in one direction or less than half in the other direction.
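The parameter choices of the waveplate model can be collected in a small helper. The sketch below assumes the component variance $\sigma^2 = 3\pi\langle\tau\rangle^2/(16 L_C^2 N_c)$, the segment DGD $\tau_c = \beta_\omega L_C$ with $\beta_\omega$ the magnitude of two Gaussian components, and the Stokes rotation per step $\sqrt{3\pi^2/(2N_c)}$. The correlation length value, the seed, and the function names are illustrative assumptions; only segment DGDs are drawn, and the polarization rotation between plates is not simulated.

```python
import math
import random

def component_variance(mean_dgd, corr_length, n_plates):
    """Variance of each Gaussian cartesian birefringence component."""
    return 3.0 * math.pi * mean_dgd ** 2 / (16.0 * corr_length ** 2 * n_plates)

def stokes_rotation_per_step(n_plates):
    """Average Stokes rotation (radians) per uncorrelated frequency step."""
    return math.sqrt(3.0 * math.pi ** 2 / (2.0 * n_plates))

def draw_segment_dgds(mean_dgd, corr_length, n_plates, seed=0):
    """Draw Rayleigh-distributed segment DGDs tau_c = beta * L_C."""
    rng = random.Random(seed)
    sigma = math.sqrt(component_variance(mean_dgd, corr_length, n_plates))
    dgds = []
    for _ in range(n_plates):
        b1 = rng.gauss(0.0, sigma)
        b2 = rng.gauss(0.0, sigma)
        dgds.append(math.hypot(b1, b2) * corr_length)
    return dgds

if __name__ == "__main__":
    mean_dgd_ps, l_c, n_c = 10.0, 1.0, 512   # L_C in arbitrary units
    dgds = draw_segment_dgds(mean_dgd_ps, l_c, n_c)
    target = 3.0 * math.pi / 8.0 * mean_dgd_ps ** 2
    print(f"sum of tau_c^2 = {sum(t * t for t in dgds):.1f} ps^2 "
          f"(target <tau^2> = {target:.1f} ps^2)")
    print(f"Stokes rotation per step = "
          f"{math.degrees(stokes_rotation_per_step(n_c)):.1f} deg")
```

The sum of squared segment DGDs should land near $\langle\tau^2\rangle = (3\pi/8)\langle\tau\rangle^2$, which is a quick way to confirm the scaling before building the full cascade.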
A cropped spectral window of a DGD spectrum constructed in the manner
outlined is illustrated in Fig. 9.13(b). The waveplate cascade was made with
$N_c = 512$ waveplates calculated at $N_f = 600$ uncorrelated frequency points.
The resulting distribution and its Maxwellian fit are plotted on the right. One
instance of the concatenation using this number of waveplates and frequency
points is not sufficient to derive high-quality statistics, especially on the tail of
the Maxwellian. But multiple runs of this cascade using different realizations
of the birefringence vector will build up the statistics at a rate of $\sqrt{N}$. Note
that varying the waveplate length as well as the aforementioned factors does
not improve the statistics and works only to lower the convergence rate.
9.4.5 Karhunen-Loève Expansion of Brownian Motion
Other than the waveplate model above, the derivations in this chapter have
relied on Brownian motion as the driving term for the evolution of various
parameters. Brownian motion is often modelled on a microscopic, step-by-step
level where the displacement for each step comes from choosing a random value
from a gaussian density. This approach works but has no analytic expression.
A useful alternative is the Karhunen-Loève (KL) expansion of Brownian
motion [59]. The KL expansion gives a macroscopic view of the motion on an
interval and guarantees the proper covariance. For Brownian motion the KL
expansion on [0, 1] is

\[ B_z = \sqrt{2}\,\sum_{k=0}^{\infty} \xi_k\, \frac{\sin\left( \left( k + \tfrac{1}{2} \right)\pi z \right)}{\left( k + \tfrac{1}{2} \right)\pi} \tag{9.4.36} \]

where $\xi_k$ are random variables with density $N(0, 1)$. Each term in the summation
spans the entire interval. Higher values of $k$ produce faster oscillations
but with lower amplitudes. In practice the sum is taken large enough to fill
in the necessary spatial resolution and is thereafter truncated. Figure 9.14(a)
shows four sample paths generated by (9.4.36).
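A truncated KL sum is easy to turn into a path generator. The sketch below assumes the expansion $B_z = \sqrt{2}\sum_k \xi_k \sin((k + \tfrac12)\pi z)/((k + \tfrac12)\pi)$ with the mode index starting at $k = 0$, so that the lowest mode has argument $\pi z/2$. The truncation length, grid, and function names are illustrative.

```python
import math
import random

def kl_brownian_path(z_grid, n_terms=500, rng=None):
    """One sample path of Brownian motion on [0, 1] from a truncated
    Karhunen-Loeve sum with n_terms modes."""
    rng = rng or random.Random()
    xi = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]
    path = []
    for z in z_grid:
        s = 0.0
        for k, x in enumerate(xi):
            w = (k + 0.5) * math.pi
            s += x * math.sin(w * z) / w
        path.append(math.sqrt(2.0) * s)
    return path

def truncated_variance(z, n_terms=500):
    """Variance of the truncated sum at position z; approaches z."""
    total = 0.0
    for k in range(n_terms):
        w = (k + 0.5) * math.pi
        total += 2.0 * math.sin(w * z) ** 2 / (w * w)
    return total

if __name__ == "__main__":
    grid = [i / 100.0 for i in range(101)]
    path = kl_brownian_path(grid, rng=random.Random(3))
    print("B(0) =", path[0], "  B(1) =", round(path[-1], 3))
    for z in (0.25, 0.5, 1.0):
        print(f"variance at z = {z}: {truncated_variance(z):.4f}")
```

The variance check shows how the truncation length controls how closely the generated paths reproduce the linear growth of the variance with position.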
The KL expansion is the function-space analogue of a Markov process
at the discrete level. On this level a Markov process is determined purely
by its covariance matrix $A$. The eigenvectors and values are found from the
equation $Ax = \lambda x$. The spectral theorem gives the entries in $A$ in terms of
its eigenvectors and values: $A = \sum_{k=1}^{n} \lambda_k v_k v_k^T$. On the continuous level, the
eigenvalue equation is

\[ \int_0^1 K(z, y)\,\phi(y)\, dy = \lambda\,\phi(z) \tag{9.4.37} \]

where the covariance of the process is $K(z, y) = z \wedge y$. The symbol $\wedge$ means
the minimum of the two quantities. The spectral theorem says that the entries
of the covariance function, entries which are now functions not vectors, are

\[ K(z, y) = \sum_{k=0}^{\infty} \lambda_k\, \phi_k(y)\, \phi_k(z) \tag{9.4.38} \]

where $\lambda_k$ are the eigenvalues of (9.4.37) and $\phi_k(z)$ are its eigenfunctions.
The eigenvalue equation is solved by substituting in the covariance of
Brownian motion. This gives

\[ \int_0^1 (z \wedge y)\,\phi(y)\, dy = \lambda\,\phi(z) \]

The integral is separated into two pieces to make an integral equation:

\[ \int_0^z y\,\phi(y)\, dy + \int_z^1 z\,\phi(y)\, dy = \lambda\,\phi(z) \tag{9.4.39} \]

This is an integral equation which can be solved by taking successive derivatives.
Looking ahead a couple of steps, the differential equation produced by
the above integral equation is second order, and therefore requires two boundary
conditions to be solved uniquely. One boundary condition is found directly
from this integral equation: when $z = 0$ then $\phi(0) = 0$.
Differentiating both sides of (9.4.39) with respect to $z$ makes

\[ z\,\phi(z) + \int_z^1 \phi(y)\, dy - z\,\phi(z) = \lambda\,\phi'(z) \]

where $\phi'(z)$ denotes the first derivative with respect to $z$. After cancelling the
two terms on the left, a second boundary condition is determined: for $z = 1$,
$\phi'(1) = 0$. Differentiating again gives

\[ -\phi(z) = \lambda\,\phi''(z) \tag{9.4.40} \]
This ODE is solved subject to the boundary conditions $\phi(0) = \phi'(1) = 0$. The
general solution is

\[ \phi(z) = A\sin(az) + B\cos(bz) \]

where the derivatives yield

\[ \phi'(z) = aA\cos(az) - bB\sin(bz) \]
\[ \phi''(z) = -a^2 A\sin(az) - b^2 B\cos(bz) \]

The boundary conditions restrict the four unknown coefficients in the following way:

\[ \phi(0) = 0 \;\Rightarrow\; B = 0 \]
\[ \phi'(1) = 0 \;\Rightarrow\; aA\cos(a) = 0 \]

Substitution of $\phi(z)$ into the differential equation (9.4.40) gives the definition
for $a$: $a = 1/\sqrt{\lambda}$. Summarizing these restrictions, the solution thus far is

\[ \phi(z) = A\sin\left( z\,\lambda^{-1/2} \right) \tag{9.4.41} \]

subject to the condition that

\[ A\,\lambda^{-1/2}\cos\left( \lambda^{-1/2} \right) = 0 \]

Fig. 9.14. Sample paths of Brownian motion created by the Karhunen-Loève expansion.
a) Four sample paths of $B_z$ versus position on the interval [0, 1]. b) Density of
210 sample paths on the interval. As expected, the density width increases as
the square root of the length.

This boundary condition is satisfied when

\[ \frac{1}{\sqrt{\lambda_k}} = \left( k + \tfrac{1}{2} \right)\pi, \qquad k = 0, 1, 2, \ldots \]

Rearranging, the eigenvalue is

\[ \lambda_k = \frac{1}{\left( k + \tfrac{1}{2} \right)^2 \pi^2} \tag{9.4.42} \]

Finally, the coefficient $A$ is determined from the orthonormality condition
of $\phi_k(z)$:

\[ \int_0^1 \phi_k^2(y)\, dy = 1 \]

This is satisfied when $A = \sqrt{2}$. The eigenvector function is therefore

\[ \phi_k(z) = \sqrt{2}\,\sin\left( \left( k + \tfrac{1}{2} \right)\pi z \right), \qquad z \in [0, 1], \quad k = 0, 1, 2, \ldots \tag{9.4.43} \]

The KL expansion for a general gaussian process $G_z$ is

\[ G_z = \sum_{k=0}^{\infty} \sqrt{\lambda_k}\,\xi_k\,\phi_k(z) \tag{9.4.44} \]

where the random variable is characterized by $E[\xi_k] = 0$ and $E[\xi_k^2] = 1$.
One can show that $E[G_y G_z]$ reproduces the correct covariance (9.4.38).
Substitution of (9.4.43) into (9.4.44) makes the expression (9.4.36) stated at the
beginning.
Figure 9.14(b) shows the density of $B_z$ on the interval [0, 1] for 210 sample
paths. As expected the density grows as the square root of the length.
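The statement that the expansion reproduces the covariance of Brownian motion can be checked by summing the eigen-expansion directly. The sketch below assumes eigenvalues $\lambda_k = 1/((k + \tfrac12)^2\pi^2)$ and eigenfunctions $\phi_k(z) = \sqrt{2}\sin((k + \tfrac12)\pi z)$ for $k = 0, 1, 2, \ldots$ and compares the truncated sum $\sum_k \lambda_k \phi_k(y)\phi_k(z)$ with $\min(z, y)$. The truncation length is an illustrative choice.

```python
import math

def eigenvalue(k):
    """Eigenvalue of the Brownian-motion covariance kernel on [0, 1]."""
    return 1.0 / (((k + 0.5) * math.pi) ** 2)

def eigenfunction(k, z):
    """Orthonormal eigenfunction sqrt(2) sin((k + 1/2) pi z)."""
    return math.sqrt(2.0) * math.sin((k + 0.5) * math.pi * z)

def kl_covariance(z, y, n_terms=2000):
    """Truncated spectral sum, which should approach min(z, y)."""
    return sum(eigenvalue(k) * eigenfunction(k, z) * eigenfunction(k, y)
               for k in range(n_terms))

if __name__ == "__main__":
    for z, y in ((0.3, 0.7), (0.5, 0.5), (0.9, 0.1), (1.0, 1.0)):
        print(f"z = {z}, y = {y}:  sum = {kl_covariance(z, y):.4f}"
              f"   min = {min(z, y):.4f}")
```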


9.5 PDL Statistics


The statistics of polarization-dependent loss are derived in a manner analogous
to those for polarization and PMD. The local differential loss is modelled
as a white-noise process and the cumulative PDL is determined by the diffusion
of the PDL vector $\vec\Gamma$ along the fiber. Concomitant with the assumption
of no chiral birefringence, circular dichroism is excluded from the PDL model.
When immersed in a random birefringence medium, the axes of minimum
and maximum transmission of a local differential loss element $\vec\alpha$ are scrambled
from one point to another. Recall the evolution of the cumulative PDL vector
in the presence of birefringence from (8.3.16) on page 377:

\[ \frac{d\vec\Gamma}{dz} = \vec\beta\times\vec\Gamma + \vec\alpha - \left( \vec\alpha\cdot\vec\Gamma \right)\vec\Gamma \tag{9.5.1} \]

The cross-product term spins the cumulative PDL vector about the local birefringent
axis $\vec\beta$, scrambling its orientation. Propagation through multiple randomly
oriented birefringent elements drives the PDL vector toward isotropic
coverage of the Poincaré sphere. The second term pulls the cumulative PDL
vector toward the local element, while the last term governs the growth and
decay of $\Gamma$.
The statistics for PDL reported in the literature are based on PDL immersed
in random birefringence [10, 19, 38, 64]. PMD statistics, by contrast,
were derived in the absence of PDL. The reason PMD is included in PDL
statistics is that a long concatenation of pure PDL is not likely in a
telecommunications link. The consequence of PMD inclusion is that the local
differential loss is treated as three-dimensional i.i.d. white noise in Stokes
space. The correlations of the white-noise vector $\vec\alpha$ are

\[ \langle \alpha_j(z) \rangle = 0, \qquad \langle \alpha_j(z)\,\alpha_k(z') \rangle = \frac{\gamma^2}{3}\,\delta_{j,k}\,\delta(z - z') \tag{9.5.2} \]

where $\gamma^2$ is the strength of the disturbance. A differential Brownian vector is
defined as $d\vec B_z = \vec\alpha\, dz$ such that $d\vec B_z\cdot d\vec B_z = \gamma^2 dz$.

The evolution equation (9.5.1) with the cross-product removed (as its effect
averages to zero in the isotropic PDL model) is rewritten in SDE form as

\[ d\vec\Gamma = \left( I - \vec\Gamma\vec\Gamma^T \right) d\vec B_z \]

This diffusion equation is interpreted in the Stratonovich sense and must be
translated to Itô form in order to use Itô calculus. The translation makes [38]

\[ d\vec\Gamma = -\frac{\gamma^2}{3}\left( 2 - \Gamma^2 \right)\vec\Gamma\, dz + \left( I - \vec\Gamma\vec\Gamma^T \right) d\vec B_z \]

The diffusion generator for this equation is

\[ G = -\frac{\gamma^2(2 - \Gamma^2)}{3}\sum_{i=1}^{3} \Gamma_i\,\frac{\partial}{\partial\Gamma_i} - \frac{\gamma^2(2 - \Gamma^2)}{6}\sum_{i=1}^{3}\sum_{j=1}^{3} \Gamma_i\Gamma_j\,\frac{\partial^2}{\partial\Gamma_i\,\partial\Gamma_j} + \frac{\gamma^2}{6}\sum_{i=1}^{3} \frac{\partial^2}{\partial\Gamma_i^2} \tag{9.5.3} \]

where $\sigma\sigma^T = \left( I - \vec\Gamma\vec\Gamma^T \right)\left( I - \vec\Gamma\vec\Gamma^T \right)^T = I - (2 - \Gamma^2)\,\vec\Gamma\vec\Gamma^T$. One can now
calculate expectations of the diffusion using Kolmogorov's backward equation
(9.2.5) on page 394.

Fig. 9.15. PDL probability density (precise) and Maxwellian approximation versus decibel
value $\rho_{\mathrm{dB}}$, for $\langle\rho_{\mathrm{dB}}\rangle = 25$ dB: a) linear scale, b) semi-log scale. The log scale is in $\log_{10}$.
It is an oddity of PDL that diusions of k , n , n and the like are dicult
to solve while those of the logarithm of $T_{\max}/T_{\min}$ make closed solutions. It
would appear that the $(2 - \Gamma^2)$ coefficient is only cleanly removed when a
logarithmic function is used. Fukada does, however, succeed in expressing the
probability densities in linear terms [18]. For the present, the moments of the
PDL magnitude expressed in decibels are used. Recall the definition:

\[ \rho_{\mathrm{dB}} = 10\log_{10}\left( \frac{1 + \Gamma}{1 - \Gamma} \right) \tag{9.5.4} \]

It is particular to PDL that even though the cartesian components of the
local PDL vectors are modelled as isotropic i.i.d. Gaussian random variables
the cumulative distribution is not strictly Maxwellian. Shtaif and Mecozzi
show that for low cumulative PDL (25 dB or less, although indeed this
is extremely high for a lightwave system) the distribution is approximately
Maxwellian [38]. Galtarossa and Palmieri use the diffusion generator to calculate
the PDL distribution exactly and validate the Shtaif and Mecozzi approximation [19].
The details of the Galtarossa and Palmieri calculation are laborious but
indeed elegant. Central to their calculations are the functionals = 2k and
 = 2k+1 /. From this they construct the characteristic function (CF) of
the 2 density and, after proof of convergence, inverse-Fourier transform the
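CF to the density proper. Subsequent conversion to the density of $\rho$ ($= \rho_{\mathrm{dB}}$)
makes

\[ f_\rho(p, \tilde z) = \frac{2p^2}{\bar\gamma^3\sqrt{2\pi\tilde z^3}}\,\exp\left( -\frac{p^2}{2\bar\gamma^2\tilde z} \right)\frac{\sinh(p/\bar\gamma)}{(p/\bar\gamma)}\,\exp\left( -\frac{\tilde z}{2} \right) \tag{9.5.5} \]

for $p \ge 0$, where $\tilde z = z\gamma^2/3$ and $\bar\gamma = 20\log_{10} e \simeq 8.686$. This density is plotted
on linear and semi-log scale in Fig. 9.15. The first and second moments are

\[ \langle\rho\rangle(\tilde z) = \bar\gamma\left[ \sqrt{\frac{2\tilde z}{\pi}}\; e^{-\tilde z/2} + (1 + \tilde z)\,\mathrm{erf}\left( \sqrt{\frac{\tilde z}{2}} \right) \right] \tag{9.5.6a} \]

\[ \langle\rho^2\rangle(\tilde z) = \bar\gamma^2\,(\tilde z + 3)\,\tilde z \tag{9.5.6b} \]

In the limit of large $\tilde z$ the cumulative PDL grows linearly with $z$. The Shtaif
and Mecozzi second moment is, by comparison,

\[ \langle\rho^2\rangle(\tilde z) = \frac{9\bar\gamma^2}{2}\left( e^{2\tilde z/3} - 1 \right) \tag{9.5.7} \]

Both expressions are equal to second order in $\tilde z$.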

The Maxwellian approximation to the PDL density function (9.5.5) written
in terms of the second moment is

\[ f_\rho(p, \tilde z) \simeq \frac{2p^2}{\sqrt{2\pi\left( \langle\rho^2\rangle(\tilde z)/3 \right)^3}}\,\exp\left( -\frac{p^2}{2\left( \langle\rho^2\rangle(\tilde z)/3 \right)} \right), \qquad p \ge 0 \tag{9.5.8} \]

Figure 9.15 shows a comparison between the Maxwellian approximation and
the precise PDL density. The cumulative mean PDL is $\langle\rho\rangle = 25$ dB; for lower
mean PDLs the approximation improves. However, the extreme case plotted
here indicates the divergence of the true distribution from the Maxwellian: the
tails of the precise distribution fall faster for high PDL values. This means
that the cartesian-component distributions of the logarithm PDL vector $\vec\rho_{\mathrm{dB}}$
have slightly shorter tails than the Gaussian distributions of $\vec B_z$.
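The comparison between the precise density and its Maxwellian approximation can be explored numerically. The sketch below assumes the density $f_\rho(p,\tilde z) = \frac{2p^2}{\bar\gamma^3\sqrt{2\pi\tilde z^3}}\exp(-p^2/2\bar\gamma^2\tilde z)\,\frac{\sinh(p/\bar\gamma)}{p/\bar\gamma}\,e^{-\tilde z/2}$ with $\bar\gamma = 20\log_{10}e$, the moments $\langle\rho\rangle = \bar\gamma[\sqrt{2\tilde z/\pi}\,e^{-\tilde z/2} + (1+\tilde z)\,\mathrm{erf}\sqrt{\tilde z/2}]$ and $\langle\rho^2\rangle = \bar\gamma^2(\tilde z + 3)\tilde z$, and a Maxwellian with the same second moment. The value $\tilde z = 1.94$, which gives a mean close to 25 dB, and the trapezoidal integration are illustrative choices.

```python
import math

GAMMA_BAR = 20.0 * math.log10(math.e)   # about 8.686 dB

def pdl_density(p, zt):
    """Assumed precise PDL density versus decibel value p >= 0."""
    x = p / GAMMA_BAR
    if x == 0.0:
        return 0.0
    pref = 2.0 * x * x / (math.sqrt(2.0 * math.pi) * zt ** 1.5)
    return (pref * math.exp(-x * x / (2.0 * zt)) * (math.sinh(x) / x)
            * math.exp(-zt / 2.0) / GAMMA_BAR)

def mean_pdl(zt):
    """First moment in dB."""
    return GAMMA_BAR * (math.sqrt(2.0 * zt / math.pi) * math.exp(-zt / 2.0)
                        + (1.0 + zt) * math.erf(math.sqrt(zt / 2.0)))

def second_moment(zt):
    """Second moment in dB^2."""
    return GAMMA_BAR ** 2 * (zt + 3.0) * zt

def maxwellian_density(p, m2):
    """Maxwellian density with second moment m2."""
    s2 = m2 / 3.0
    return (2.0 * p * p / (math.sqrt(2.0 * math.pi) * s2 ** 1.5)
            * math.exp(-p * p / (2.0 * s2)))

def trapz(f, a, b, n=20000):
    """Plain trapezoidal rule."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

if __name__ == "__main__":
    zt = 1.94
    print(f"mean PDL = {mean_pdl(zt):.2f} dB")
    print("normalization:", round(trapz(lambda p: pdl_density(p, zt), 0.0, 350.0), 5))
    for p in (10.0, 25.0, 40.0, 60.0):
        print(f"p = {p:4.0f} dB  precise = {pdl_density(p, zt):.3e}"
              f"  Maxwellian = {maxwellian_density(p, second_moment(zt)):.3e}")
```

At high decibel values the precise density falls below the Maxwellian, in line with the faster-falling tails described above.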
Finally, PDL and PMD are complementary regarding partial polarization.
PMD tends to depolarize a perfectly polarized input, while PDL tends to
repolarize a perfectly unpolarized input. While not presented here, the statistics of PDL-induced repolarization are derived by Menyuk et al. and have an
approximate Maxwellian form as well [41].

References
1. L. Arnold, Stochastic Differential Equations: Theory and Applications. Malabar, Florida: Krieger Publishing Company, 1992, reprinted from original 1974 edition.
2. G. Biondini, W. L. Kath, and C. R. Menyuk, "Importance sampling for polarization-mode dispersion," IEEE Photonics Technology Letters, vol. 14, no. 2, pp. 310–312, Feb. 2002.
3. G. Biondini, W. Kath, and C. Menyuk, "Importance sampling for polarization-mode dispersion: techniques and applications," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1201–1215, Apr. 2004.
4. M. Boroditsky, M. Brodsky, N. J. Frigo, P. Magill, and M. Shtaif, "Improving the accuracy of mean DGD estimates by analysis of second-order PMD statistics," IEEE Photonics Technology Letters, vol. 16, no. 3, pp. 792–794, Mar. 2004.
5. M. Brodsky, P. Magill, and N. J. Frigo, "Polarization-mode dispersion of installed recent vintage fiber as a parametric function of temperature," IEEE Photonics Technology Letters, vol. 16, no. 1, pp. 209–211, Jan. 2004.
6. X. Chen, M. Li, and D. A. Nolan, "Polarization mode dispersion of spun fibers: An analytical solution," Optics Letters, vol. 27, no. 5, pp. 294–296, Mar. 2002.
7. F. Curti, B. Daino, G. de Marchis, and F. Matera, "Statistical treatment of the evolution of the principal states of polarization in single-mode fibers," Journal of Lightwave Technology, vol. 8, no. 8, pp. 1162–1166, Aug. 1990.
8. W. B. Davenport, Probability and Random Processes. New York: McGraw-Hill, Inc., 1970.
9. M. C. de Lignie, H. Nagel, and M. van Deventer, "Large polarization mode dispersion in fiber optic cables," Journal of Lightwave Technology, vol. 12, no. 8, pp. 1325–1329, Aug. 1994.
10. A. El Amari, N. Gisin, B. Perny, H. Zbinden, and C. W. Zimmer, "Statistical prediction and experimental verification of concatenations of fiber optic components with polarization dependent loss," Journal of Lightwave Technology, vol. 16, no. 3, pp. 332–339, Mar. 1998.
11. S. L. Fogal, G. Biondini, and W. L. Kath, "Correction to: Multiple importance sampling for first- and second-order polarization-mode dispersion," IEEE Photonics Technology Letters, vol. 14, pp. 1487–1489, 2002.
12. S. L. Fogal, G. Biondini, and W. L. Kath, "Multiple importance sampling for first- and second-order polarization-mode dispersion," IEEE Photonics Technology Letters, vol. 14, no. 9, pp. 1273–1275, Sept. 2002.
13. E. Forestieri, "A fast and accurate method for evaluating joint second-order PMD statistics," Journal of Lightwave Technology, vol. 21, no. 11, pp. 2942–2952, Nov. 2003.
14. G. J. Foschini, L. Nelson, R. Jopson, and H. Kogelnik, "Probability densities of the second order polarization mode dispersion including polarization dependent chromatic dispersion," IEEE Photonics Technology Letters, vol. 12, no. 3, pp. 293–295, Mar. 2000.
15. G. J. Foschini, R. M. Jopson, L. E. Nelson, and H. Kogelnik, "The statistics of PMD-induced chromatic fiber dispersion," Journal of Lightwave Technology, vol. 17, no. 9, pp. 1560–1565, Sept. 1999.
16. G. J. Foschini, L. E. Nelson, R. M. Jopson, and H. Kogelnik, "Statistics of second-order PMD depolarization," Journal of Lightwave Technology, vol. 19, no. 12, pp. 1882–1886, Dec. 2001.
17. G. J. Foschini and C. D. Poole, "Statistical theory of polarization dispersion in single mode fibers," Journal of Lightwave Technology, vol. 9, no. 11, p. 1439, Nov. 1991.
18. Y. Fukada, "Probability density function of polarization dependent loss (PDL) in optical transmission system composed of passive devices and connecting fibers," Journal of Lightwave Technology, vol. 20, no. 6, pp. 953–964, June 2002.
19. A. Galtarossa and L. Palmieri, "The exact statistics of polarization-dependent loss in fiber-optic links," IEEE Photonics Technology Letters, vol. 15, no. 1, pp. 57–59, Jan. 2003.
20. A. Galtarossa, L. Palmieri, and D. Sarchi, "Measure of spin period in randomly birefringent low-PMD fibers," IEEE Photonics Technology Letters, vol. 16, no. 4, pp. 1131–1133, Apr. 2004.
21. A. Galtarossa, L. Palmieri, M. Schiano, and T. Tambosso, "Statistical characterization of fiber random birefringence," Optics Letters, vol. 25, no. 18, pp. 1322–1324, Sept. 2000.
22. N. Gisin, B. Gisin, J. Von der Weid, and R. Passy, "How accurately can one measure a statistical quantity like polarization-mode dispersion?" IEEE Photonics Technology Letters, vol. 8, no. 12, pp. 1671–1673, 1996.
23. N. Gisin, "Solutions of the dynamical equation for polarization dispersion," Optics Communications, vol. 86, pp. 371–373, 1991.
24. N. Gisin and J. P. Pellaux, "Polarization mode dispersion: Time versus frequency domains," Optics Communications, vol. 89, pp. 316–323, May 1992.
25. J. P. Gordon and H. Kogelnik, "PMD fundamentals: Polarization mode dispersion in optical fibers," Proceedings of National Academy of Sciences, vol. 97, no. 9, pp. 4541–4550, Apr. 2000. [Online]. Available: http://www.pnas.org
26. S. Hadjifaradji, L. Chen, D. S. Waddy, and X. Bao, "Autocorrelation function of the principal state of polarization vector for systems having PMD," IEEE Photonics Technology Letters, vol. 16, no. 6, pp. 1489–1491, June 2004.
27. B. Huttner, J. Reecht, N. Gisin, R. Passy, and J. Weid, "Distributed beatlength measurement in single-mode fibers with optical frequency-domain reflectometry," Journal of Lightwave Technology, vol. 20, no. 5, pp. 828–835, May 2002.
28. E. Ibragimov, G. Shtengel, and S. Suh, "Statistical correlation between first and second order PMD," Journal of Lightwave Technology, vol. 20, no. 4, pp. 586–590, 2002.
29. I. P. Kaminow, "Polarization in optical fibers," IEEE Journal of Quantum Electronics, vol. QE-17, no. 1, pp. 15–22, 1981.
30. M. Karlsson, "Probability density functions of the differential group delay in optical fiber communication systems," Journal of Lightwave Technology, vol. 19, no. 3, pp. 324–331, Mar. 2001.
31. M. Karlsson and J. Brentel, "Autocorrelation function of the polarization-mode dispersion vector," Optics Letters, vol. 24, no. 14, pp. 939–941, July 1999.
32. P. J. Leo, G. R. Gray, G. J. Simer, and K. B. Rochford, "State of polarization changes: Classification and measurement," Journal of Lightwave Technology, vol. 21, no. 10, pp. 2189–2193, Oct. 2003.
33. I. T. Lima, A. O. Lima, J. Zweek, and C. R. Menyuk, "Efficient computation of outage probabilities due to polarization effects in a WDM system using a reduced Stokes model and importance sampling," IEEE Photonics Technology Letters, vol. 15, no. 1, pp. 45–47, Jan. 2003.
References

427

34. I. T. Lima, R. Khosravani, P. Ebrahimi, E. Ibragimov, C. R. Menyuk, and A. E.


Willner, Comparison of polarization mode dispersion emulators, Journal of
Lightwave Technology, vol. 19, no. 12, pp. 18721881, Dec. 2001.
35. Q. Lin and G. P. Agrawal, Correlation theory of polarization mode dispersion
in optical bers, Journal of the Optical Society of America B, vol. 20, no. 2,
pp. 292301, Feb. 2003.
36. D. Marcuse, C. R. Menyuk, and P. Wai, Application of the Manakov-PMD
equation to studies of signal propagation in optical bers with randomly varying
birefringence, Journal of Lightwave Technology, vol. 15, no. 9, pp. 17351746,
Sept. 1997.
37. A. Mecozzi, private communication, 2004.
38. A. Mecozzi and M. Shtaif, The statistics of polarization-dependent loss in optical communication systems, IEEE Photonics Technology Letters, vol. 14, no. 3,
pp. 313315, Mar. 2002.
39. , Study of the two-frequency moment generating function of the PMD
vector, IEEE Photonics Technology Letters, vol. 15, no. 12, pp. 17131715,
Dec. 2003.
40. C. R. Menyuk and P. Wai, Polarization evolution and dispersion in bers with
spatially varying birefringence, Journal of the Optical Society of America B,
vol. 11, no. 7, pp. 12881296, July 1994.
41. C. Menyuk, D. Wang, and A. Pilipetskii, Repolarization of polarizationscrambled optical signals due to polarization dependent loss, IEEE Photonics
Technology Letters, vol. 9, no. 9, pp. 12471249, Sept. 1997.
42. L. Nelson, R. Jopson, H. Kogelnik, and G. J. Foschini, Measurement of depolarization and scaling associated with second-order polarization mode dispersion
in optical bers, IEEE Photonics Technology Letters, vol. 11, no. 12, pp. 1614
1617, Dec. 1999.
43. D. Nolan, X. Chen, and M.-J. Li, Fibers with low polarization-mode dispersion, Journal of Lightwave Technology, vol. 22, no. 4, pp. 10661077, Apr. 2004.
44. B. Oksendal, Stochastic Dierential Equations: An Introduction with Applications, 5th ed. New York: Springer, 1998.
45. D. L. Peterson, B. C. Ward, K. B. Rochford, P. J. Leo, and G. Simer,
Polarization mode dispersion compensator eld trial and eld ber
characterization, Optics Express, vol. 10, no. 14, pp. 614621, July 2002.
[Online]. Available: http://www.opticsexpress.org/
46. S. M. Pietralunga, M. Ferrario, P. Martelli, and M. Martinelli, Direct observation of local birefringence and axis rotation in spun ber with centimetric
resolution, IEEE Photonics Technology Letters, vol. 16, no. 1, pp. 212214,
Jan. 2004.
47. A. Pizzinat, L. Palmieri, B. S. Marks, C. R. Menyuk, and A. Galtarossa, Analytical treatment of randomly birefringent periodically spun bers, Journal of
Lightwave Technology, vol. 21, no. 12, pp. 33553363, Dec. 2003.
48. C. D. Poole, Statistical treatment of polarization dispersion in single-mode
ber, Optics Letters, vol. 13, no. 8, pp. 687689, Aug. 1988.
49. , Measurement of polarization-mode dispersion in single-mode bers with
random mode coupling, Optics Letters, vol. 14, no. 10, pp. 523525, 1989.
50. C. D. Poole and C. R. Giles, Polarization-dependent pulse compression and
broadening due to polarization dispersion in dispersion-shifted ber, Optics
Letters, vol. 13, no. 2, pp. 155157, 1988.

428

9 Statistical Properties of Polarization in Fiber

51. C. D. Poole, J. H. Winters, and J. A. Nagel, Dynamical equation for polarization dispersion, Optics Letters, vol. 16, no. 6, pp. 372374, 1991.
52. C. D. Poole and D. L. Favin, Polarization-mode dispersion measurements based
on transmission spectra through a polarizer, Journal of Lightwave Technology,
vol. 12, no. 6, pp. 917929, 1994.
53. S. C. Rashleigh, Origins and control and polarization eects in single-mode
ber, Journal of Lightwave Technology, vol. LT-1, no. 2, pp. 312331, 1983.
54. S. C. Rashleigh and R. Ulrich, Polarization mode dispersion in single mode
bers, Optics Letters, vol. 3, no. 2, pp. 6062, 1978.
55. H. Risken, The Fokker-Planck Equation: Methods of Solution and Applications,
2nd ed. New York: Springer, 1989.
56. M. Shtaif and A. Mecozzi, Mean-square magnitude of all orders of polarization
mode dispersion and the relation with the bandwidth of the principal states,
IEEE Photonics Technology Letters, vol. 12, no. 1, pp. 5355, Jan. 2000.
57. , Study of the frequency autocorrelation of the dierential group delay in
bers with polarization mode dispersion, Optics Letters, vol. 25, no. 10, pp.
707709, May 2000.
58. Y. Tan, J. Yang, W. L. Kath, and C. R. Menyuk, Transient evolultion of the
polarization-dispersion vectors probability distribution, Journal of the Optical
Society of America B, vol. 19, no. 5, pp. 9921000, May 2002.
59. E. Vanden-Eijnden, private communication, Courant Institute of Mathematical
Sciences, New York University, N.Y., 2003.
60. P. Wai and C. R. Menyuk, Polarization decorrelation in optical bers with
randomly varying birefringence, Optics Letters, vol. 19, no. 19, pp. 15171519,
Oct. 1994.
61. , Anisotropic diusion of the state of polarization in optical bers with
randomly varying birefringence, Optics Letters, vol. 20, no. 24, pp. 24932495,
Dec. 1995.
62. , Polarization mode dispersion, decorrelation, and diusion in optical bers
with randomly varying birefringence, Journal of Lightwave Technology, vol. 14,
no. 2, pp. 148157, Feb. 1995.
63. J. Yang, W. L. Kath, and C. R. Menyuk, Polarization mode dispersion probability distribution for arbitrary distances, Optics Letters, vol. 26, no. 19, pp.
14721474, Oct. 2001.
64. M. Yu, C. Kan, M. Lewis, and A. Sizmann, Statistics of polarization-dependent
loss, insertion loss, and signal power in optical communication systems, IEEE
Photonics Technology Letters, vol. 14, no. 12, pp. 16951697, Dec. 2002.

10
Review of Polarization Test and Measurement

There are two aspects to test and measurement that are addressed in industry: the measurement and quantification of polarization effects such as SOP, PMD, and PDL; and the calibrated generation of these effects. Most measurement techniques use a polarimeter to measure the Stokes parameters of the light directly. Using predetermined and calibrated launch states of polarization at the input, the resulting Stokes parameters may be analyzed to determine SOP, PMD, and PDL. In order for such equipment to adhere to traceable standards, test artifacts for these effects have to be available. The National Institute of Standards and Technology in the United States fulfills this role, and the Telecommunications Industry Association (TIA), the International Telecommunication Union (ITU), and the International Electrotechnical Commission (IEC) develop standard test methodologies.
Polarization-mode dispersion, PDL, and sometimes SOP fluctuation generally cause impairments in an optical communications link. To quantify the impairment, it is necessary to have test instrumentation that programmatically generates these effects. To date there is no standard way to generate SOP fluctuation calibrated to natural speeds, such as the SOP change in aerial fiber or, at the other extreme, undersea fiber. P. Leo et al. offer one proposal [72]. Artifacts for PMD and PDL are available, as are instruments that make PMD and PDL in a calibrated manner. Since PMD and PDL can interact to make impairments worse than either effect alone, instruments that make PDL interspersed with PMD are necessary; some initial demonstrations have been reported [98].
This chapter gives an overview of the current state of the art in polarization test and measurement. The latter half of the chapter is dedicated to programmable PMD generation, a topic that has not been covered as a whole before.

10.1 SOP Measurement


The starting point for any measurement of polarization state and its fluctuation, polarization-dependent loss and gain, and polarization-mode dispersion is the direct measurement of the Stokes parameters of the light.
A trade-off exists between ease of assembly and calibration versus speed of a polarimeter. The rotating-waveplate polarimeter, recently analyzed by Williams for error sources [106], requires only a quarter-wave waveplate, a linear polarizer, and a detector. Such a simple construction leads to a precision polarimeter, yet the read-out speed is limited by the waveplate rotation rate. In contrast, the staring polarimeter literally stares at the incoming light through a sequence of waveplates and polarizers, and detects the conditioned light on a segmented detector. The read-out rate is limited by the detector speed. A staring polarimeter requires several waveplates, a polarizer, and at least four detectors. Balancing the detector responsivity and calibrating for waveplate misalignment is more arduous than calibrating the rotating-waveplate type, which generally leads to a lower-precision polarimeter. Its solid-state construction and fast read-out, however, offer advantages in live-traffic applications and for field-portable instruments. Hague provides a review of various early polarimeters [43].
Collett may have been the first to build a staring polarimeter, which he used to measure the polarization of nanosecond optical pulses [9]. The conversion of a Jones vector to a Stokes vector was detailed in §1.4 and §2.5.1. Collett directly implements this scheme by making six simultaneous polarimetric measurements and inferring the seventh. His polarimeter is called a differential polarimeter because the intensity of orthogonal state pairs, e.g. $(I_x, I_y)$, is measured directly, as opposed to inferred. Differential polarimetry is sometimes used in astrophysics and biomedical applications because of its high sensitivity.
Siddiqui and Heffner independently improved on the Collett design by reducing from five to three the number of distinct polarizers and waveplates [56, 91]. The logical diagram of their designs is illustrated in Fig. 10.1. There are four intensities that are measured by a quad detector: $I_0$, $I_1$, $I_2$, $I_3$. $I_0$ is the direct beam intensity. $I_1$ and $I_2$ are measured through linear polarizers oriented at 0° and 45°, respectively. $I_3$ is measured after the light transits a quarter-wave plate and polarizer. The orientation of the waveplate and polarizer, as illustrated, is such that the waveplate transforms right-hand circular polarization into the low-loss aperture of the polarizer. The input beam originates from a fiber and expands to one or more millimeters in diameter before collimation. Siddiqui uses a single lens, while Heffner uses a segmented concave mirror to separate and individually focus the four beams on their respective paths. Through separate adjustment of the mirror segments the intensity of each path can be balanced.
Fig. 10.1. Illustrative staring polarimeter for high-speed measurement of Stokes parameters. A four-quadrant detector measures the light transmitted along four parallel paths. The first path is all-pass while the next three condition the light by polarizing it along $S_1$, $S_2$, and $S_3$. The bandwidth of the staring polarimeter depends only on the sensitivity of the photodetectors and back-end electronics. A very high-end speed could be in the GHz range, but most operate at MHz rates.
[Figure labels: lens array; glass or gap; quarter-wave plate ($\lambda_o/4$); polarizers at 0°, 45°, and 90°; detected intensities $I_0$, $I_1$, $I_2$, $I_3$.]

The four Stokes parameters are derived from the measured intensities according to

$$S_0 = I_0 \tag{10.1.1a}$$

$$S_k = 2I_k - I_0, \qquad k = 1, 2, 3 \tag{10.1.1b}$$

The Mueller matrix for the path between the lens and detectors is

$$\begin{pmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ -1 & 0 & 2 & 0 \\ -1 & 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} I_0 \\ I_1 \\ I_2 \\ I_3 \end{pmatrix} \tag{10.1.2}$$

The formal equivalence of these relations to the rigorous transformation from Jones to Stokes is verified using (1.4.14–1.4.17) on page 17, where $I_0 = I_x + I_y$. Moreover, to within a complex constant the associated Jones vector can be reconstructed as detailed in §1.4.1.
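The mapping of (10.1.1) and (10.1.2) can be sketched in a few lines of Python. This is a minimal illustration that assumes ideal, perfectly balanced optical paths; the names `IDEAL_MATRIX`, `stokes_from_intensities`, and `degree_of_polarization` are hypothetical and do not come from any instrument library.

```python
import math

# Ideal intensity-to-Stokes matrix of (10.1.2). Assumption: perfectly aligned
# polarizers and an ideal quarter-wave plate; a real instrument would use a
# calibrated matrix instead.
IDEAL_MATRIX = (
    (1.0, 0.0, 0.0, 0.0),
    (-1.0, 2.0, 0.0, 0.0),
    (-1.0, 0.0, 2.0, 0.0),
    (-1.0, 0.0, 0.0, 2.0),
)


def stokes_from_intensities(intensities, matrix=IDEAL_MATRIX):
    """Map the detector readings (I0, I1, I2, I3) to (S0, S1, S2, S3)."""
    return tuple(sum(m * i for m, i in zip(row, intensities)) for row in matrix)


def degree_of_polarization(stokes):
    """Ratio of polarized power to total power."""
    s0, s1, s2, s3 = stokes
    return math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0


# Horizontally polarized light of unit power: the 0-degree polarizer passes
# everything, while the 45-degree and circular paths each pass half.
horizontal = stokes_from_intensities((1.0, 1.0, 0.5, 0.5))
print(horizontal)
print(degree_of_polarization(horizontal))
```

Keeping the matrix as a parameter reflects the point made in the text that the ideal form is only a starting point; a calibrated matrix can be passed in its place.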
In practice, the Mueller matrix in (10.1.2) is only an idealization. While the matrix can always be constructed as shown up to the first two rows, the realistic form of the third row depends on the relative alignment of the 45° polarizer with respect to the one at 0°. Misalignment mixes in part of $I_1$. The same holds true for the fourth row, but in addition the quarter-wave waveplate has a wavelength dependence. Away from center frequency the waveplate over- or under-rotates the polarization state, allowing a mixing with $I_2$. The wavelength dependence requires a low- or zero-order true wave waveplate and a calibration table over wavelength. Also, the adiabatic expansion from fiber to collimator is sometimes replaced by a cascade of polarization beam splitters (PBSs). The polarization-dependent loss of the PBSs imparts deleterious polarization dependence to the optical path which, ideally, can be calibrated out, but at higher cost and lower accuracy. Finally, the separate articulation of the segmented concave mirror in the Heffner design, or the four lenses as illustrated, provides for power balancing among the four paths during construction.

The staring polarimeter as illustrated is a bulk-optic component requiring several parts. An alternative in-fiber polarimeter is demonstrated by Westbrook et al. [30, 36, 102, 103]. His group imprints multiple blazed gratings into the fiber core to deflect light through the cladding into a detector array. While the printing cost may be higher than the cost of any particular part in the bulk-optic analogue, mass production may be inexpensive, and the small form factor and in-line nature have natural applications both in telecommunications and in fiber sensors.

10.2 PDL Measurement


To assess the polarization-dependent loss of a component or transmission link, one must determine the minimum and maximum transmission through the device under test (DUT). A key difficulty is that the orientation of the PDL axis is unknown. A brute-force approach is the so-called all-states method, which scans the input polarization over all states and looks for transmission extrema at the output. This method is obsolete because of the time required and the possible error from getting close to the extrema without finding them precisely. Furthermore, the all-states method is prone to repeatability problems, which leads to each measurement being somewhat random.
Favin and Nyman pioneered the four-states method, which determines the PDL precisely after measuring the transmission for only four input polarization states [35, 81, 82]. This method has been adopted by international standards bodies [65, 66, 94] and has been incorporated into most commercially available products. Moreover, Craig of the National Institute of Standards and Technology (NIST) has reported several details on the uncertainties present in the four-states method [12–14].
The four-states method has two parts. First there is a functional analysis of the transmission using the Mueller matrix. The result of this analysis
yields expressions for the minimum and maximum transmission in terms of
the Mueller-matrix entries alone. Second, estimates for the relevant matrix
entries are determined by measurement of the DUT using only four input
polarization states. This method works equally well for one wavelength as for
a range of wavelengths. In the latter case, errors due to waveplate retardance
change must be accounted for; such details can be found in the literature.
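The functional analysis starts with the mapping of an arbitrary input Stokes vector to the output Stokes vector:

$$\begin{pmatrix} T_0 \\ T_1 \\ T_2 \\ T_3 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \end{pmatrix} \begin{pmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{pmatrix} \tag{10.2.1}$$

Since the transmission intensity $T_0$ is the only quantity of interest, the polarimetric values of $T_1$, $T_2$, $T_3$ are ignored, as are the respective entries in the Mueller matrix. The ratio of transmitted to incident intensity is

$$T_0/S_0 = m_{11} + m_{12}\left(\frac{S_1}{S_0}\right) + m_{13}\left(\frac{S_2}{S_0}\right) + m_{14}\left(\frac{S_3}{S_0}\right) \tag{10.2.2}$$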

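The question is what values of $\vec{S}$ generate the minimum and maximum values of $T_0$ for fixed albeit unknown values of $m_{1k}$. The analytic problem can be solved using the Lagrange multiplier method [14]. The linear function $f$ is to be maximized under the constraint $g$,

$$f = m_{11} + m_{12}s_1 + m_{13}s_2 + m_{14}s_3$$

$$g = s_1^2 + s_2^2 + s_3^2 - 1 = 0$$

where $s_k = S_k/S_0$. Under this particular constraint, the function $f$ surely has extrema. At any extremum, $df = 0$. Accordingly,

$$df = 0 = \frac{\partial f}{\partial s_1}ds_1 + \frac{\partial f}{\partial s_2}ds_2 + \frac{\partial f}{\partial s_3}ds_3 = m_{12}\,ds_1 + m_{13}\,ds_2 + m_{14}\,ds_3$$

The differential of $g$ can also be taken, and the two expressions are added in linear superposition with the Lagrangian scale factor $\lambda$. Separation of like terms from the expression $df + \lambda\,dg = 0$ yields the set of equations

$$\begin{aligned}
\frac{\partial f}{\partial s_1} + \lambda\frac{\partial g}{\partial s_1} &= m_{12} + 2\lambda s_1 = 0 \\
\frac{\partial f}{\partial s_2} + \lambda\frac{\partial g}{\partial s_2} &= m_{13} + 2\lambda s_2 = 0 \\
\frac{\partial f}{\partial s_3} + \lambda\frac{\partial g}{\partial s_3} &= m_{14} + 2\lambda s_3 = 0
\end{aligned} \tag{10.2.3}$$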

Extraction of $s_k$ from each expression and substitution into $g$ determines the value of the Lagrangian multiplier $\lambda$:

$$4\lambda^2 = m_{12}^2 + m_{13}^2 + m_{14}^2 \tag{10.2.4}$$

Substitution of (10.2.4) into (10.2.3) determines the extrema values for $s_k$ in terms of the Mueller matrix entries. Substitution of the resultant $s_k$ values into (10.2.2) yields the extreme values in transmission:

$$T_{\max} = m_{11} + \sqrt{m_{12}^2 + m_{13}^2 + m_{14}^2} \tag{10.2.5a}$$

$$T_{\min} = m_{11} - \sqrt{m_{12}^2 + m_{13}^2 + m_{14}^2} \tag{10.2.5b}$$

Therefore the entries in the first row of the Mueller matrix completely determine the minimum and maximum transmission ratios. Indeed the PDL is immediately given by [35]

$$\mathrm{PDL}_{\mathrm{dB}} = 10\log_{10}\left(\frac{m_{11} + \sqrt{m_{12}^2 + m_{13}^2 + m_{14}^2}}{m_{11} - \sqrt{m_{12}^2 + m_{13}^2 + m_{14}^2}}\right) \tag{10.2.6}$$
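As a numerical sanity check of the closed-form extrema, the short Python sketch below compares (10.2.5) with a coarse all-states style scan over the Poincaré sphere. The first-row values are illustrative only, not measured data, and the function names are hypothetical.

```python
import math


def transmission_extrema(m11, m12, m13, m14):
    """Return (T_max, T_min) from the first Mueller row, per (10.2.5)."""
    radius = math.sqrt(m12 * m12 + m13 * m13 + m14 * m14)
    return m11 + radius, m11 - radius


def pdl_db(m11, m12, m13, m14):
    """PDL in decibels, per (10.2.6)."""
    t_max, t_min = transmission_extrema(m11, m12, m13, m14)
    return 10.0 * math.log10(t_max / t_min)


def brute_force_extrema(m11, m12, m13, m14, steps=360):
    """Scan fully polarized inputs on a latitude/longitude grid (1-degree steps)."""
    values = []
    for i in range(steps // 2 + 1):
        polar = math.pi * i / (steps // 2)
        for j in range(steps):
            azimuth = 2.0 * math.pi * j / steps
            s1 = math.sin(polar) * math.cos(azimuth)
            s2 = math.sin(polar) * math.sin(azimuth)
            s3 = math.cos(polar)
            values.append(m11 + m12 * s1 + m13 * s2 + m14 * s3)
    return max(values), min(values)


row = (0.95, 0.03, 0.04, 0.0)  # illustrative first-row entries
print(transmission_extrema(*row))
print(brute_force_extrema(*row))
print(pdl_db(*row))
```

The grid scan can only approach the true extrema from inside, which is the weakness of the all-states method noted earlier; the closed form needs only the four first-row entries.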

Measurement is needed to formulate estimators for the values $m_{1k}$. The typical implementation of the four-states method is to probe the DUT with states $\{S_1, -S_1, S_2, S_3\}$, Fig. 10.2(a). There is nothing particular about this choice of states other than the obvious three orthogonal coordinates in Stokes space. However, it is clear that the best choice of states should provide the best estimators for the Mueller entries. The aforementioned states are generated via a polarizer and a removable or rotatable half-wave and quarter-wave waveplate. A first baseline experiment is performed with the DUT bypassed. Denote the transmitted intensities of the four experiments $I_a, I_b, I_c, I_d$. A second experiment is then performed with the DUT inline. The Mueller matrix of the device affects the input polarization states; the transmitted intensities, from (10.2.1), are

$$I_1 = (m_{11} + m_{12})\,I_a, \qquad I_2 = (m_{11} - m_{12})\,I_b$$

$$I_3 = (m_{11} + m_{13})\,I_c, \qquad I_4 = (m_{11} + m_{14})\,I_d$$

These equations are rewritten in matrix form to relate the Mueller entries to the DUT output intensities:

$$\begin{pmatrix} I_1 \\ I_2 \\ I_3 \\ I_4 \end{pmatrix} = \begin{pmatrix} I_a & I_a & 0 & 0 \\ I_b & -I_b & 0 & 0 \\ I_c & 0 & I_c & 0 \\ I_d & 0 & 0 & I_d \end{pmatrix} \begin{pmatrix} m_{11} \\ m_{12} \\ m_{13} \\ m_{14} \end{pmatrix}$$

Inversion of the $4 \times 4$ matrix (which is not a Mueller matrix) yields expressions for $m_{1k}$

$$\begin{pmatrix} m_{11} \\ m_{12} \\ m_{13} \\ m_{14} \end{pmatrix} = \begin{pmatrix} (I_w + I_x)/2 \\ (I_w - I_x)/2 \\ I_y - (I_w + I_x)/2 \\ I_z - (I_w + I_x)/2 \end{pmatrix} \tag{10.2.7}$$

where $I_{w,x,y,z} = I_{1,2,3,4}/I_{a,b,c,d}$. Substitution of these estimates for the Mueller entries into (10.2.6) generates an estimate for the PDL of the DUT.
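The estimator of (10.2.7) followed by (10.2.6) amounts to a few arithmetic steps. The Python sketch below runs them on synthetic data; the baseline intensities and the "true" first row are invented for illustration, the function names are hypothetical, and the probe states are assumed to be $\{S_1, -S_1, S_2, S_3\}$ in that order.

```python
import math


def mueller_first_row(baseline, dut):
    """Estimate (m11, m12, m13, m14) from four-states data, per (10.2.7).

    baseline: (Ia, Ib, Ic, Id), measured with the DUT bypassed
    dut:      (I1, I2, I3, I4), measured with the DUT inline
    """
    iw, ix, iy, iz = (d / b for d, b in zip(dut, baseline))
    m11 = (iw + ix) / 2.0
    m12 = (iw - ix) / 2.0
    m13 = iy - m11
    m14 = iz - m11
    return m11, m12, m13, m14


def pdl_db(m11, m12, m13, m14):
    """PDL in decibels from the first Mueller row, per (10.2.6)."""
    radius = math.sqrt(m12 * m12 + m13 * m13 + m14 * m14)
    return 10.0 * math.log10((m11 + radius) / (m11 - radius))


# Synthetic experiment: unequal baseline powers are normalized out.
true_row = (0.95, 0.03, 0.04, 0.0)
baseline = (1.00, 0.98, 1.02, 0.99)
m11, m12, m13, m14 = true_row
dut = (
    (m11 + m12) * baseline[0],
    (m11 - m12) * baseline[1],
    (m11 + m13) * baseline[2],
    (m11 + m14) * baseline[3],
)

estimate = mueller_first_row(baseline, dut)
print(estimate)
print(pdl_db(*estimate))
```

Note that only $m_{11}$ and $m_{12}$ are built from two measurements; $m_{13}$ and $m_{14}$ each inherit the error of a single additional reading, which is the motivation for the alternative state sets discussed next.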
A critical aspect of the measurement is that the detector which detects all $I$ has minimal PDL. Earlier detectors in fact exhibited PDL larger than 0.01 dB, where the requisite measurement accuracy was 0.001 dB. Nyman et al. were the first to add a depolarizer before the detector to eliminate the PDL from the test set [82]. In their work they added a 23 m length of unpumped erbium-doped fiber (EDF). The input light to the EDF is absorbed by the erbium ions and emitted at a longer wavelength. Repeated absorption and emission can completely depolarize the light. Although the conversion efficiency from polarized to depolarized light was 0.032%, use of this nonlinear diffuser was the first to allow high-precision measurements. In measurement systems commercially available at the time of this writing, improved detectors or detectors preceded by a wedge depolarizer are used.

Fig. 10.2. Four-states measurement method for PDL. Calibration measurements are made without the DUT inline. Those measurement intensities are $I_{a,b,c,d}$. The DUT is subsequently spliced in and intensities $I_{1,2,3,4}$ are measured. a) Standard four-states $\{S_1, -S_1, S_2, S_3\}$ and $T_p$ surface. b) Tetrahedral four-states with same $T_p$ surface. The tetrahedral group has a 120° separation between all states, or maximum discrepancy.
[Figure: panels (a) and (b) show the probe states on the Stokes axes $S_1$, $S_2$, $S_3$, with baseline intensities $I_{a,b,c,d}$, DUT intensities $I_{1,2,3,4}$, and the $T(\hat{s}_{in})$ surface.]
An alternative arrangement to the standard four-state method is proposed here. To achieve the best estimators for the Mueller entries, the probe polarization states should exhibit maximum discrepancy; that is, they should all be as far apart from one another as possible. For four points in a three-dimensional space this is a tetrahedral orientation, where each state is 120° away from the others (Fig. 10.2(b)). A tetrahedrally oriented set of four polarizations can be achieved with a polarizer and at most three waveplates. Repeating the preceding analysis for four such states, where two of the states lie on the equator, the Mueller entries are

Ix + Iy + Iz
m11
m12
Iw + Ix Iy Iz

(10.2.8)
m13 = (Iw + 5Ix 3Iy 3Iz ) / 3

(Iy Iz ) / 3
m14
That the estimators for the Mueller entries all rely on more measurement
information than the standard four-state method will reduce the overall error.
A means currently embraced by industry to improve the accuracy is to extend the four-state method to six states [12], where the probe states are $\{\pm S_1, \pm S_2, \pm S_3\}$. While application of the preceding analysis determines how the Mueller entries relate to the measured intensity ratios, note that the six-state method, in addition to augmenting the measurements, uses maximum discrepancy between probe states. Six states of maximum discrepancy argue for increased sensitivity.

10.3 PMD Measurement


There are three principal PMD features that are of interest to measure, depending on application. One feature is the mean DGD $\langle\tau\rangle$ of a fiber or link. The preceding chapter detailed how $\langle\tau\rangle$ is the sole scaling parameter necessary to specify the statistics of all orders of PMD as well as its autocorrelation function. Another feature is the PMD vector as a function of frequency, $\vec{\tau}(\omega)$. This vector information is necessary to characterize first- and higher-order PMD of a component or fiber directly, and is necessary to correlate receiver performance in the presence of PMD. The third feature is the direct measurement of fiber birefringence as a function of position. As birefringence is the origin of PMD, its characterization has led to important experimental information. For instance, measurement of fiber birefringence has validated the zero-chirality model of the birefringence for unspun fibers.
Table 10.1 classifies the demonstrated PMD measurement methods according to the principal parameter(s) they report. The wavelength scanning (WS) and interferometric (INT) methods are suitable to ascertain quickly the mean DGD of a fiber. These two methods are related via Fourier transform. The PMD vector as a function of wavelength can be measured using four related techniques, Jones Matrix Eigenanalysis (JME), the Mueller Matrix Method (MMM), Poincaré Sphere Analysis (PSA), and the Attractor-Precessor Method (APM); or a different technique here called the Vector Modulation Phase-Shift (V-MPS) method. The basic difference between the first four vector methods and the last is that the former use a CW tunable laser and polarimeter, while the latter uses an RF-modulated tunable laser and network analyzer. Finally, the local birefringence can be measured using polarization-dependent optical time-domain reflectometry (P-OTDR).
An alternative classification is adopted by Williams, where measurement techniques are grouped according to the coherence time of the probe source in relation to the mean DGD of the device under test (DUT) [107]. Frequency-domain classification is for source coherence times $\tau_c$ much longer than the mean DGD: $\tau_c \gg \langle\tau\rangle$. Time-domain classification is the opposite case, where $\tau_c \ll \langle\tau\rangle$. Frequency-domain techniques are the WS, JME, MMM, PSA, and APM methods. Time-domain techniques are the INT and P-OTDR methods. The scalar and vector MPS methods are hybrids of the two, where the phase shift measures the time of flight while the modulation is imparted on a high-coherence carrier that scans wavelength.
Table 10.1. Classification of PMD Measurement Techniques

| Abbrev. | Method | Infer | Measurement | Num. Diff. | PDL Tol. |
|---|---|---|---|---|---|
| Mean DGD: | | | | | |
| WS (a) | Wavelength Scanning | $\langle\tau\rangle$ | $T_p(\omega)$ | no | yes |
| INT (b) | Interferometric | $\langle\tau\rangle$ | $R(t - t_0)$ | no | yes |
| Vector PMD: | | | | | |
| JME (c) | Jones Matrix Eigenanalysis | $\vec{\tau}(\omega)$, … | $\vec{S}_{out}(\omega, \vec{S}_{in})$ | yes | yes |
| MMM (d) | Mueller Matrix Method | $\vec{\tau}(\omega)$, … | $\vec{S}_{out}(\omega, \vec{S}_{in})$ | yes | no |
| PSA (e) | Poincaré Sphere Analysis | $\vec{\tau}(\omega)$, … | $\vec{S}_{out}(\omega, \vec{S}_{in})$ | yes | no |
| APM (f) | Attractor-Precessor Method | $\vec{\tau}(\omega)$, … | $\vec{S}_{out}(\omega, \vec{S}_{in})$ | yes | yes |
| S-MPS (g) | Scalar modulation phase-shift | $\tau(\omega)$, … | $\phi_{RF}(\omega, \vec{S}_{in})$ | no | yes |
| V-MPS (h) | Vector modulation phase-shift | $\vec{\tau}(\omega)$, … | $\phi_{RF}(\omega, \vec{S}_{in})$ | no | yes |
| Local birefringence: | | | | | |
| P-OTDR (i) | Polarization Optical Time-Domain Reflectometry | local birefringence vs. $z$ | | no | n/a |
| P-OFDR (j) | Polarization Optical Frequency-Domain Reflectometry | local birefringence vs. $z$ | | no | n/a |

(a) Poole, Favin [86]; (b) Gisin, Heffner [40, 52]; (c) Heffner [47]; (d) Jopson et al. [68]; (e) Cyr [15]; (f) Eyal et al. [34]; (g) Williams [105]; (h) Nelson et al. [80]; (i) Galtarossa et al. [10]; (j) Huttner et al. [61].

Tied in with most practical PMD measurements is the presence of PDL. As detailed in the preceding section, PDL can be measured, identified, and extracted in the presence of PMD. The PDL information can be used to extract the pure PMD effects (cf. §8.3.4). The three measurement methods adapted to this procedure are the JME, APM, and MPS methods. The JME method converts Stokes measurements into Jones transfer matrices, which are subsequently resolved into Hermitian and unitary components. The unitary matrices are then analyzed for PMD and the Hermitian matrices for PDL. The MPS methods, both scalar and vector, use a four-states measurement at the input. The four-states method can combine PDL and PMD measurement into one overall system characterization. The PSA attempts to eliminate PDL effects before data analysis by driving the measured data into three orthogonal coordinates. The MMM method is the least equipped to handle PDL as it is recommended to measure only two polarization states (which can be fixed) and relies on an equation that is only approximate in the presence of PDL (which is fixed by the APM method). Finally, the WS method is reportedly tolerant to some degree of PDL; even though the measured data changes substantially with the addition of PDL, the number of mean crossings or extrema remains unchanged.
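The resolution of a Jones matrix into unitary and Hermitian parts mentioned above can be illustrated with a polar decomposition. The sketch below is one possible way to perform such a split for a single nonsingular 2×2 matrix, using the ordering $T = UH$ as an assumption; it is not claimed to be the procedure used in the JME literature, and all function names are hypothetical. The square root of the 2×2 Hermitian matrix $T^\dagger T$ is taken in closed form via the Cayley-Hamilton theorem.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a nonsingular 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def polar_decompose(t):
    """Split a nonsingular Jones matrix as T = U H (U unitary, H Hermitian)."""
    a = matmul(dagger(t), t)                      # A = T^dagger T = H^2
    trace = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    root_det = math.sqrt(det)                     # det(H)
    trace_h = math.sqrt(trace + 2.0 * root_det)   # tr(H), from Cayley-Hamilton
    h = [[(a[i][j] + (root_det if i == j else 0.0)) / trace_h
          for j in range(2)] for i in range(2)]
    u = matmul(t, inverse(h))
    return u, h


def pdl_db_from_hermitian(h):
    """PDL in dB; power transmission extrema are the squared eigenvalues of H."""
    trace = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    gap = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    return 20.0 * math.log10((trace / 2.0 + gap) / (trace / 2.0 - gap))


# Illustrative matrix: a unitary (rotation plus phase) times a partial polarizer.
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
phase = cmath.exp(0.4j)
jones = [[complex(c), complex(-0.9 * s)],
         [phase * s, phase * 0.9 * c]]

u, h = polar_decompose(jones)
print(h)
print(pdl_db_from_hermitian(h))
```

In this toy case the Hermitian factor is recovered as diag(1, 0.9), so the PDL follows from the Hermitian part alone while the unitary part carries the polarization rotation.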


10.3.1 Mean DGD Measurement


Wavelength Scanning Method
Perhaps the simplest technique for mean-DGD measurement is the wavelength-scanning (WS) method developed by Poole and Favin [86]. The setup is illustrated in Fig. 10.3. The core of the measurement is the intensity response of a DUT, typically fiber, placed between crossed polarizers. The response can be measured either with a tunable laser swept through wavelength and a detector, or with a broadband source resolved with an optical spectrum analyzer (OSA). In either case the transmission intensity, on a linear scale, is measured over frequency. The intensity is related to the analyzer polarization $\hat{p}$ and the SOP output from the fiber $\hat{s}(\omega)$ by

$$T_p(\omega) = \frac{1}{2}\left(1 + \hat{s}(\omega) \cdot \hat{p}\right)$$

Since $\hat{s}(\omega)$ is uniformly distributed on the Poincaré sphere in the long-length regime, one expects $\langle T_p(\omega)\rangle = 1/2$.
Once $T_p(\omega)$ is measured it can be analyzed either with a mean-level crossing analysis or an extrema-counting analysis. As indicated in the figure, mean-level crossing is the number of times the intensity crosses the mean level of the spectrum. Extrema counting is also related to the mean DGD, but can be problematic in the presence of noise. Williams provides a decision algorithm to distinguish between noise and signal extrema [112]. The TIA standard for the WS method is available in [95].
Poole and Favin show that, in the long-length regime, $\langle\tau\rangle$ is related to $N_m$ and $N_e$, the count of mean-level crossings and extrema, respectively, by

$$\langle\tau\rangle_m = 4\,\frac{N_m}{B}, \quad \text{and} \quad \langle\tau\rangle_e \simeq 0.805\,\pi\,\frac{N_e}{B} \tag{10.3.1}$$

where $B$ is the bandwidth in radial frequency: $B = \omega_2 - \omega_1$. The 0.805 coefficient is reported by Williams [112] and is a correction to the original Poole and Favin factor of 0.824. There are two more details to be addressed: the full measurement bandwidth and the measurement resolution. The uncertainty of mean DGD is inversely related to the measurement bandwidth and is governed according to the PMD autocorrelation function (§9.4.3). Williams shows
that the measurement resolution must meet or exceed   /12 in
order to achieve the asymptotic crossing and extrema density necessary for a
reliable measurement.
An interesting attribute of the WS method is that a measure of long or short regime is possible [86]. In the long-length regime the ratio of extrema to crossings is $N_e/N_m \simeq 1.58$, as follows from (10.3.1), while in the short-length regime the ratio is unity: $N_e/N_m = 1$. With a good measurement of the crossing and extrema count, the count ratio is a measure of the regime in which the DUT resides.
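The counting analysis can be sketched in Python as follows. This is a minimal, noise-free illustration: the sign-change counter is a simple stand-in and not the decision algorithm of Williams, the function names are hypothetical, and the test spectrum is a synthetic single-DGD (short-length regime) response, for which the crossing and extrema counts should coincide.

```python
import math


def count_sign_changes(values):
    """Count sign changes in a sequence, ignoring exact zeros."""
    changes = 0
    previous = 0
    for v in values:
        sign = (v > 0) - (v < 0)
        if sign == 0:
            continue
        if previous != 0 and sign != previous:
            changes += 1
        previous = sign
    return changes


def count_mean_crossings(spectrum):
    """Number of times the spectrum crosses its own mean level."""
    mean_level = sum(spectrum) / len(spectrum)
    return count_sign_changes([t - mean_level for t in spectrum])


def count_extrema(spectrum):
    """Number of interior maxima and minima (slope sign changes)."""
    slopes = [b - a for a, b in zip(spectrum, spectrum[1:])]
    return count_sign_changes(slopes)


def mean_dgd_from_crossings(n_m, bandwidth):
    """Long-length-regime estimate from mean-level crossings, per (10.3.1)."""
    return 4.0 * n_m / bandwidth


def mean_dgd_from_extrema(n_e, bandwidth, coefficient=0.805):
    """Long-length-regime estimate from extrema, per (10.3.1)."""
    return coefficient * math.pi * n_e / bandwidth


# Synthetic single-DGD spectrum between crossed polarizers, 193 to 195 THz.
tau = 10e-12                                       # 10 ps, illustrative
bandwidth = 2.0 * math.pi * (195e12 - 193e12)      # B in rad/s
samples = 4001
spectrum = [0.5 * (1.0 + math.cos(tau * bandwidth * k / (samples - 1) + 0.3))
            for k in range(samples)]

n_m = count_mean_crossings(spectrum)
n_e = count_extrema(spectrum)
print(n_m, n_e, n_e / n_m)
```

Because this synthetic spectrum is in the short-length regime, the count ratio comes out as unity and the long-length estimators of (10.3.1) should not be applied to it; with measured data, noise handling would have to be added before counting extrema.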

Fig. 10.3. Wavelength-scanning method to characterize $\langle\tau\rangle$ [86]. Either a broadband source and optical spectrum analyzer detector, or a tunable laser and broadband detector, are used to view the frequency-dependent intensity variation of the DUT between two crossed polarizers. The WS method associates the number of mean-level crossings or intensity extrema to the mean DGD. The mean-DGD uncertainty is related to the measurement bandwidth.
[Figure: setup of source, polarizer (0°), fiber, analyzer (90°), and detector; example plot of transmission (0.0 to 1.0) versus frequency (193.0 to 195.0 THz) for $\langle\tau\rangle = 10$ ps, with the mean level, a mean crossing, and an extremum marked.]

Interferometric Method
The interferometric measurement method is closely related to the WS technique. The TIA standard for this method is available in [92]. An exemplar setup is illustrated in Fig. 10.4. As with the WS method, the DUT is placed between two crossed polarizers. The source is broadband and an interferometer is added in the optical path; as illustrated it is located at the output. The interferometer is polarization insensitive; the 50/50 beam splitter (BS) is a power splitter. Translation of one arm of the Michelson delays one path from the other to generate an interferogram at the detector, illustrated in the figure. The interferogram is recorded by the photocurrent $I$ as a function of relative delay in the two arms. The variance of the interferogram $I$ is
, 2
,
2
I( )d
I( )d
2
,
,
I =
(10.3.2)

I( )d
I( )d
Hener relates the interferogram standard deviation to the mean DGD  
according to the relation [52]

2
I  0.789 I
(10.3.3)
  =

Moreover, in the same paper he details the role of the source bandwidth.
The 0.789 coefficient is the asymptotic limit for very wide source bandwidth.
Lowering the source bandwidth first increases the coefficient value and then
decreases it. Unlike the WS method, the result from an interferometric measurement depends centrally on the source characteristics. Finally, Heffner
points out an error in the Gisin paper [40] in that the interferometer above is
an electric-field interferometer, not one that measures intensity. An intensity
interferogram is studied in subsection "PMD Impulse Response" starting on
page 358, where the second moment of an intensity interferogram is equal to
the RMS DGD value of the DUT (see Fig. 8.33).
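
The moment calculation in (10.3.2) and the scaling in (10.3.3) can be sketched numerically. The following is a minimal sketch, assuming a uniformly sampled delay axis and a synthetic Gaussian interferogram envelope; the function names and the sample values are hypothetical, and the 0.789 coefficient is simply the wide-bandwidth value quoted above.

```python
import numpy as np

def interferogram_sigma(delay_ps, current):
    """RMS width of an interferogram from the normalized moments of (10.3.2).

    Assumes uniform delay sampling, so the step size cancels in each ratio.
    """
    tau = np.asarray(delay_ps, dtype=float)
    weight = np.asarray(current, dtype=float)
    total = weight.sum()
    first_moment = (tau * weight).sum() / total
    second_moment = (tau**2 * weight).sum() / total
    return np.sqrt(second_moment - first_moment**2)

def mean_dgd_from_sigma(sigma_ps, coefficient=0.789):
    """Mean DGD from the interferogram width, using the coefficient of (10.3.3)."""
    return coefficient * sigma_ps

# Synthetic (hypothetical) Gaussian envelope with a 12 ps standard deviation.
delay = np.linspace(-60.0, 60.0, 2401)
sigma_true = 12.0
envelope = np.exp(-delay**2 / (2.0 * sigma_true**2))

sigma_est = interferogram_sigma(delay, envelope)
mean_dgd = mean_dgd_from_sigma(sigma_est)
print(f"sigma = {sigma_est:.3f} ps, mean DGD = {mean_dgd:.3f} ps")
```

A real measurement would first need the source-induced central peak removed or displaced, as discussed in the text, before the moments are taken.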
Gisin and Heffner both show that the field interferogram and the intensity
spectrum generated by the WS method are Fourier-transform pairs [40, 51–53].
Thus the analysis for the wavelength-scanning method can be done by a
moments calculation in the Fourier domain.
The central problem associated with the interferometric method is the separation or elimination of the source signature from the PMD-induced interferogram. Figure 10.5 illustrates the two demonstrated methods to make this
separation. In the first method, independently proposed by Barlow, Gisin, and
Cyr [1, 16, 17, 41], a known, fixed DGD element is concatenated with the DUT
(Fig. 10.5(a)). The fixed DGD element, which can be a piece of polarization-maintaining fiber, serves to bias the DUT interferogram away from the zero-delay origin and the source-induced signal. The second method, first proposed by Heffner [50] and later improved upon by Martin [78], looks to cancel
the source signal within the interferometer directly. To do so, Martin adds
a quarter-wave waveplate to one arm of the Michelson. Double-pass of the
waveplate imparts a phase shift in one arm with respect to the other. For
every position of the translating arm of the interferometer the common delay $\tau_o$ is nominally cancelled, whereas the differential delays $\pm\tau/2$ due to PMD
remain intact. The bandwidth of the quarter-wave plate plays a vital role in
the source-cancellation method and must be considered.
10.3.2 PMD Vector Measurement
Direct measurement of the PMD vector gives high-resolution information on
the state, frequency, and time evolution of PMD in a DUT. The average of
the vector length, or DGD, over frequency can reproduce the mean DGD as
measured by the WS or INT methods; but this is not the central purpose of
the vector methods. Measurement of the PMD vector requires polarimetric
stability of the DUT at least over the time it takes to launch the plurality
of polarization states. Installed fiber plants often fluctuate quickly, making it
difficult to use vector measurements.
Two sub-categories of measurement techniques are those that measure the
output Stokes vectors as a function of frequency and input polarization state,
and those that use a lock-in amplifier to measure output power as a function
of frequency and input polarization state. In the first case analysis depends
on the Stokes-vector change across frequency steps, while in the second case
the vector is directly measured at each frequency because a narrowband modulation is imprinted on the probe signal.
Comparison across these varied techniques shows that the JME and APM
methods carry the most advantages among the Stokes-based methods, as the effects
of PDL can be stripped, and that the vector-MPS method is the most advantageous of all because it does not require frequency differencing and is largely
immune to PDL.

[Figure 10.4: measurement setup (broadband source, polarizer at 0°, fiber, analyzer at 90°, beam splitter BS, detector) and a plot of detector current (0.0 to 1.0) versus time (-40 to 40 ps) for $\langle\tau\rangle = 10$ ps, showing the interferogram, its Gaussian envelope of width $2\sigma_\epsilon$, and the relation $\langle\tau\rangle = 0.789\,\sigma_\epsilon$.]

Fig. 10.4. Interferometric method to characterize $\langle\tau\rangle$. A broadband source is used
to probe a DUT located between crossed polarizers, and the output is analyzed to
produce an interferogram. The second moment of the interferogram is associated
with the mean DGD of the DUT. The source bandwidth is directly entwined with
the interferogram variance and must be accounted for.

[Figure 10.5: two setups, each with a broadband source (BB Src), polarizer at 0°, fiber, analyzer at 90°, and beam splitter BS. Panel a) adds a bias element; the interferogram versus delay shows the source peak separated from the DUT signal by the bias. Panel b) adds a λ/4 plate in one interferometer arm; the interferogram versus delay shows the source spike nulled and the DUT signal retained.]

Fig. 10.5. The interferogram in the preceding figure is idealized: the source-induced
peak at the origin was numerically removed. There are two ways to separate the
source peak from the PMD-induced interferogram. a) A bias from a fixed, known
DGD element is concatenated with the DUT [1, 41]. b) The source peak is cancelled
by adding a quarter-wave waveplate to one arm of the interferometer [78]. Double-pass of the waveplate imparts a phase shift in that arm for every position of the
other arm.

Jones Matrix Eigenanalysis
Jones matrix eigenanalysis was developed in the early 1990s by Heffner [47,
55, 57]. Heffner discretized Poole and Wagner's PMD eigenvalue equation [85]
to arrive at a solution using measurements of the Jones matrix. The measurement setup is illustrated in Fig. 10.6. A narrow-line tunable laser is the probe
source. Wavelength accuracy is essential, so either the laser must have a built-in wavemeter or an external one must be added. At each frequency three
polarizations are launched in sequence: $P_a$, $P_b$, and $P_c$. The light is transmitted through the DUT and is resolved by a polarimeter. The Stokes parameters
$S_a(\omega)$, $S_b(\omega)$, $S_c(\omega)$ are in this way measured over frequency. Figure 10.6 illustrates the motion of $S_a(\omega)$, $S_b(\omega)$, $S_c(\omega)$ in Stokes space for three frequencies
over a narrow band through an arbitrary DUT. Arcs are traced in frequency,
a different arc for each input state. Over a wide frequency band an arc can
have a complicated shape.
Once the Stokes vectors are measured the data is analyzed to determine
the PMD vector. The Stokes vectors at each frequency are first converted to a
Jones matrix at that frequency: $S_{a,b,c}(\omega) \rightarrow J(\omega)$. The conversion is detailed
in §1.4.1 on page 17. Heffner's prescription at this point is to solve the PMD
eigenvalue equation, but Karlsson and Shtengel introduce an intermediate
step [64, 70]. In order to remove the effects of PDL on the data set, the
Jones matrix is resolved into Hermitian and unitary components. The unitary
component contains the PMD information and is fed into the remainder of the
Heffner method. The details of this matrix decomposition are given in §8.3.4 on
page 378. The decomposition converts the Jones matrices to unitary matrices:
$J(\omega) \rightarrow U(\omega)$.
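Recall from (8.2.10) on page 329 that the eigenvalue equation for PMD is

$$jU_\omega U^\dagger\,|p_\pm\rangle = \pm\frac{\tau}{2}\,|p_\pm\rangle$$

The forward-difference equation analogue is

$$\left[\,U(\omega+\Delta\omega)\,U^\dagger(\omega) - (1 - j\tau_g\Delta\omega)\,\right]|p\rangle = 0 \tag{10.3.4}$$

Since $jU_\omega U^\dagger$ is traceless the eigenvalues of this equation are $\rho_\pm = 1 \mp j\tau\Delta\omega/2$.
Therefore the DGD at $\omega$ is

$$\tau(\omega) = j\,\frac{\rho_+(\omega) - \rho_-(\omega)}{\Delta\omega} \tag{10.3.5}$$

The PSP vectors are the eigenvectors $|p_\pm\rangle$ of (10.3.4). At the time of this
writing, Shtengel has posted a LabView library to drive an Agilent 8509 instrument to measure the full PMD vector as a function of frequency [89].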
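
The eigenvalue step of (10.3.4) and (10.3.5) can be illustrated on a modelled DUT. This is a minimal sketch, assuming a single lossless birefringent element with a fixed PSP, so that $U(\omega) = \exp(-j\omega\tau\,\hat p\cdot\vec\sigma/2)$; the Pauli-matrix ordering, the function names, and the numerical values are assumptions made for illustration. Alongside the first-order difference of (10.3.5), the sketch also evaluates the phase of the eigenvalue ratio, a form often quoted for JME, for comparison.

```python
import numpy as np

# Pauli matrices with sigma_1 diagonal (an assumed spin-vector ordering).
SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def retarder(omega, tau, p_hat):
    """Unitary Jones matrix exp(-j*omega*tau/2 * p.sigma) of one birefringent element."""
    p_sigma = np.tensordot(p_hat, SIGMA, axes=1)
    half_angle = 0.5 * omega * tau
    return np.cos(half_angle) * np.eye(2) - 1j * np.sin(half_angle) * p_sigma

def jme_step(u_lo, u_hi, d_omega):
    """Eigen-analysis of U(w + dw) U^dagger(w) for one frequency pair."""
    rho, vecs = np.linalg.eig(u_hi @ u_lo.conj().T)
    dgd_first_order = abs(rho[0] - rho[1]) / d_omega           # magnitude of (10.3.5)
    dgd_from_phase = abs(np.angle(rho[0] / rho[1])) / d_omega  # phase-based variant
    # Stokes vector of the first eigenvector: one of the two PSPs.
    psp = np.real([vecs[:, 0].conj() @ s @ vecs[:, 0] for s in SIGMA])
    return dgd_first_order, dgd_from_phase, psp

tau = 10e-12                                 # 10 ps DGD (hypothetical DUT)
p_hat = np.array([1.0, 2.0, 2.0]) / 3.0      # unit PSP direction in Stokes space
omega = 2 * np.pi * 193.5e12                 # optical angular frequency
d_omega = 2 * np.pi * 1e9                    # 1 GHz step

u_lo = retarder(omega, tau, p_hat)
u_hi = retarder(omega + d_omega, tau, p_hat)
dgd_first_order, dgd_from_phase, psp = jme_step(u_lo, u_hi, d_omega)
print(dgd_first_order, dgd_from_phase, psp)
```

For this small step both estimates agree closely with the modelled DGD; the two eigenvectors map to antiparallel Stokes vectors, so the recovered PSP is defined up to sign.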

[Figure 10.6: measurement setup (tunable laser source, polarization control with polarizers at 0°, 60°, and 120°, fiber, polarimeter) and a Poincaré-sphere sketch with axes $S_1$, $S_2$, $S_3$ showing the output Stokes evolution $S_{a,k}$, $S_{b,k}$, $S_{c,k}$ at frequencies $\omega_{1,2,3}$ separated by $\Delta\omega$.]

Fig. 10.6. Jones matrix eigenanalysis method to characterize $\vec\tau(\omega)$ [47]. Light from a
tunable laser with built-in wavemeter (or an external wavemeter) is serially polarized
into three different states. The polarized light transits the DUT and is resolved by a
polarimeter. At each frequency the Stokes vectors are measured for the three launch
states; the Stokes vectors are then converted to a Jones matrix. The lower panel shows a
measurement fragment in Stokes space. Arcs are traced out on the sphere, one arc
for each launch.

[Figure 10.7: three panels. PMD-vector components $\tau_1$, $\tau_2$, $\tau_3$ (ps, -10 to 10) versus relative frequency; the PSP $\hat p(\omega)$ traced on the Poincaré sphere ($S_1$, $S_2$, $S_3$); and DGD (ps, 0 to 12) versus relative frequency (-250 to 250 GHz), with arrows marking $\Delta\omega\tau = 0.16\pi$ and $\Delta\omega\tau = 1.6\pi$.]

Fig. 10.7. Exemplar results of JME applied to a modelled fiber. The PMD vector is
resolved into its Cartesian components, plotted as PSPs in Stokes space, and plotted
as DGD as a function of frequency. Data folding occurs if the product of the frequency step size
and the local DGD exceeds 180°.

Separately, Heffner reports validation of this method in [48, 49, 54]. Williams
gives a comparison between the WS, INT, and JME methods in [108].
Figure 10.7 illustrates a calculation of the JME measurement. Solution of
the eigenvalues and vectors allows one to plot the three Stokes components
of $\vec\tau(\omega)$ separately. The unit vector $\hat p(\omega)$ maps the PSP spectrum of the DUT
while the length $\tau(\omega)$ gives the DGD spectrum. Since full vector information
of the PMD is available, second- and higher-order PMD can be estimated,
although higher-order differences are required.
There are two practical issues regarding the JME method. First is that
differences of eigenvalues $\rho$ and of frequencies $\omega$ are used to calculate $\tau$ in (10.3.5).
Noisy data leads to noisy eigenvalues, which will upset the calculated $\tau$ values.
Also, uncertainty of the true frequency difference $\Delta\omega$ will likewise lead to
$\tau$ errors. Karlsson uses a multi-point estimator for the first derivative of the
eigenvalues [70].
Second is the relation between the frequency step and the local DGD
value. To first order, a frequency change generates precession of the output
polarization about the PSP. Assuming a stationary PSP for the moment, the
larger the frequency step the larger the precession angle. However, a step so
large that $\tau\Delta\omega > 2\pi$ is ambiguous. Moreover, a step such that $\tau\Delta\omega > \pi$ is also
ambiguous because the direction of the PMD vector cannot be determined
uniquely (plus or minus). The step size is restricted to $\tau\Delta\omega \le \pi$ to avoid
ambiguity. For a fiber DUT, the step size is related to the mean fiber DGD
via

$$\langle\tau\rangle\,\Delta\omega \le \frac{\pi}{4} \tag{10.3.6}$$

to ensure almost no local DGD value is so great as to lead to a rotation greater
than $\pi$. The effects of increasing step size on the data are illustrated in
the DGD plot in Fig. 10.7. For $\Delta\omega\tau = 0.16\pi$ the calculated DGD values are
close to the actual values. As the step size increases the values fall. At the
location of the lower arrow, indicating $\Delta\omega\tau = 1.6\pi$, the curvature of the DGD
spectrum actually inverts. This is called data folding [68] and leads to errant
measurements.
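
The step-size rule (10.3.6) and the folding effect can be put into numbers. The sketch below is a simplified model, assuming that the precession angle is recovered only within its principal range $[0, \pi]$ (as with an inverse cosine); the function names and the 10 ps example are hypothetical.

```python
import numpy as np

def max_step_ghz(mean_dgd_ps):
    """Largest frequency step (GHz) that satisfies <tau> * d_omega <= pi/4, per (10.3.6)."""
    d_omega_max = (np.pi / 4.0) / (mean_dgd_ps * 1e-12)  # rad/s
    return d_omega_max / (2.0 * np.pi) / 1e9

def apparent_dgd_ps(true_dgd_ps, step_ghz):
    """DGD inferred when the rotation angle is only known within [0, pi]."""
    d_omega = 2.0 * np.pi * step_ghz * 1e9
    angle = true_dgd_ps * 1e-12 * d_omega
    folded_angle = np.arccos(np.cos(angle))   # principal value: folds angles beyond pi
    return folded_angle / d_omega * 1e12

step_limit = max_step_ghz(10.0)          # step limit for a 10 ps mean DGD
small_step = apparent_dgd_ps(10.0, 8.0)  # tau * d_omega = 0.16 pi
large_step = apparent_dgd_ps(10.0, 80.0) # tau * d_omega = 1.6 pi
print(step_limit, small_step, large_step)
```

In this model a 10 ps element probed with an 80 GHz step ($\Delta\omega\tau = 1.6\pi$) is reported as 2.5 ps, which is the kind of under-reading that data folding produces.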
Mueller Matrix Method
The Mueller Matrix Method was developed in the late 1990s by Jopson
et al. [68, 69]. Contrary to the JME method, the measured Stokes vectors are
analyzed directly rather than converted to equivalent Jones matrices. That
the Stokes vectors are not converted to a Jones transfer matrix keeps the
formalism concise but prevents the decomposition of the measured data into
PMD and PDL components.
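Consider a DUT with frequency-dependent transfer matrix $T(\omega)$. The output polarization state, as a Jones vector, is related to the input state as
$|t\rangle = T(\omega)\,|s\rangle$. Under the assumption of zero PDL, $T(\omega) = U(\omega)$; the PMD
vector is identified through $jU_\omega U^\dagger = (\vec\tau\cdot\vec\sigma)/2$. The Stokes-space analogue for
this state transformation is $\hat t = R(\omega)\,\hat s$, where $R(\omega)$ is the rotation operator (2.6.22) on page 68:

$$R = \cos\varphi\, I + (1-\cos\varphi)(\hat r\hat r\cdot) + \sin\varphi\,(\hat r\times)$$

where $\vec\varphi = \varphi\,\hat r$ and is a function of frequency. To identify the PMD vector in
Stokes space, the derivative of the transformation is taken, $\hat t_\omega = R_\omega\hat s$, and the
expression is rearranged in terms of the output polarization only: $\hat t_\omega = R_\omega R^\dagger\,\hat t$.
With no PDL, the output state precesses about $\vec\tau$ according to $\hat t_\omega = \vec\tau\times\hat t$. The
PMD vector is therefore $\vec\tau\times = R_\omega R^\dagger$. The finite-difference equivalent to $R_\omega R^\dagger$
is itself a rotation:

$$R_\Delta(\omega;\Delta\omega) = R(\omega+\Delta\omega)\,R^\dagger(\omega) = \cos\varphi_\Delta\, I + (1-\cos\varphi_\Delta)(\hat r_\Delta\hat r_\Delta\cdot) + \sin\varphi_\Delta\,(\hat r_\Delta\times)$$

where both $\varphi_\Delta$ and $\hat r_\Delta$ are evaluated at $\omega$ and it is assumed that $\Delta\omega$ is sufficiently small so that $\hat r_\omega = \hat r_{\omega+\Delta\omega}$ to first order. Combining the information
that $\mathrm{Tr}(\hat r\hat r\cdot) = 1$, that $(\hat r\times)$ is resolved according to (2.6.29) on page 70, and
denoting $\varphi_\Delta = \tau\Delta\omega$, the PMD vector can be reconstructed from the matrix
elements $R_{ij}$ of $R_\Delta$:

$$\cos\varphi_\Delta = \tfrac{1}{2}\left(\mathrm{Tr}(R_\Delta) - 1\right) \tag{10.3.7}$$

and

$$r_1\sin\varphi_\Delta = \tfrac{1}{2}(R_{23} - R_{32}),\qquad r_2\sin\varphi_\Delta = \tfrac{1}{2}(R_{31} - R_{13}),\qquad r_3\sin\varphi_\Delta = \tfrac{1}{2}(R_{12} - R_{21}) \tag{10.3.8}$$

The DGD is calculated by

$$\tau(\omega) = \frac{1}{\Delta\omega}\cos^{-1}\left(\frac{\mathrm{Tr}(R_\Delta) - 1}{2}\right) \tag{10.3.9}$$

and the PMD vector as $\vec\tau(\omega) = \tau(\omega)\,\hat r(\omega)$.
To implement the MMM method the rotation matrix $R(\omega)$ must be constructed. Consider the three launch states $\hat s_a = (1, 0, 0)^T$, $\hat s_b = (0, 1, 0)^T$, and
$\hat s_c = (0, 0, 1)^T$. Transmission of state $\hat s_a$ through $R$ makes $\hat t_a = R\hat s_a$; the vector $\hat t_a$ is the first column of matrix $R$. The second and third columns of $R$ are
similarly determined. Thus a basic prescription for the measurement of $R$ is
given: measure the DUT with input launches at 0°, 45°, and 90° in physical
angles. The MMM method can use the same experimental setup as the JME
(Fig. 10.6) with different polarizers.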
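
The column-by-column construction of $R$ and the trace relation (10.3.9) can be sketched as follows. The modelled DUT (a fixed rotation followed by a single DGD element), the function names, and the numerical values are hypothetical. The axis is extracted from the antisymmetric part of $R_\Delta$ under the right-handed Rodrigues convention written in the code, which is an assumption; the index order printed in (10.3.8) may reflect a different handedness or index convention.

```python
import numpy as np

def rodrigues(axis, angle):
    """Right-handed rotation matrix: cos*I + (1 - cos)*r r^T + sin*[r x]."""
    r = np.asarray(axis, dtype=float)
    r = r / np.linalg.norm(r)
    cross = np.array([[0.0, -r[2], r[1]],
                      [r[2], 0.0, -r[0]],
                      [-r[1], r[0], 0.0]])
    return (np.cos(angle) * np.eye(3)
            + (1.0 - np.cos(angle)) * np.outer(r, r)
            + np.sin(angle) * cross)

def measure_rotation(dut, launches):
    """Stack the output Stokes vectors of the launch states as columns of R."""
    return np.column_stack([dut @ s for s in launches])

def mmm_pmd_vector(r_lo, r_hi, d_omega):
    """DGD from the trace relation (10.3.9), axis from the antisymmetric part."""
    r_delta = r_hi @ r_lo.T
    cos_phi = np.clip((np.trace(r_delta) - 1.0) / 2.0, -1.0, 1.0)
    phi = np.arccos(cos_phi)
    axis_sin = 0.5 * np.array([r_delta[2, 1] - r_delta[1, 2],
                               r_delta[0, 2] - r_delta[2, 0],
                               r_delta[1, 0] - r_delta[0, 1]])
    return phi / d_omega, axis_sin / np.sin(phi)  # undefined if phi is 0 or pi

tau = 10e-12
p_hat = np.array([1.0, 2.0, 2.0]) / 3.0
omega = 2 * np.pi * 193.5e12
d_omega = 2 * np.pi * 1e9
fixed = rodrigues([0.0, 0.0, 1.0], 0.7)      # frequency-independent part of the DUT

launches = np.eye(3)                         # s_a, s_b, s_c along S1, S2, S3
r_lo = measure_rotation(rodrigues(p_hat, omega * tau) @ fixed, launches)
r_hi = measure_rotation(rodrigues(p_hat, (omega + d_omega) * tau) @ fixed, launches)

dgd, axis = mmm_pmd_vector(r_lo, r_hi, d_omega)
print(dgd, axis)
```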
Jopson shows that only two independent polarization launches are necessary for the MMM method. The rotation matrix $R$ has only three independent
parameters: a precession angle, and azimuth and declination angle of the vector. A first launch alone determines two of the $R$ parameters. A second launch
is the minimum necessary to determine the third parameter. Denote the measured result of two launches as $\hat t_1$ and $\hat t_2$; the constructed columns of $R$ are

$$\hat t_3 = \frac{\hat t_1\times\hat t_2}{|\hat t_1\times\hat t_2|},\qquad \hat t_2' = \hat t_3\times\hat t_1,\qquad \hat t_1' = \hat t_1 \tag{10.3.10}$$

These constructed vectors form an orthonormal basis.
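
The two-launch construction of (10.3.10) amounts to a Gram-Schmidt-like step in Stokes space. A minimal sketch follows, assuming the two launches are along $S_1$ and $S_2$ and that the DUT is a pure rotation; the function name, the test rotation, and the perturbation applied to the second measurement are hypothetical.

```python
import numpy as np

def rotation_from_two_launches(t1, t2):
    """Build the columns of R from two measured output Stokes vectors, per (10.3.10)."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    t1 = t1 / np.linalg.norm(t1)
    t3 = np.cross(t1, t2)
    t3 = t3 / np.linalg.norm(t3)
    t2_fixed = np.cross(t3, t1)       # re-orthogonalized second column
    return np.column_stack([t1, t2_fixed, t3])

# Hypothetical DUT rotation: a rotation about S1 followed by a rotation about S3.
a, b = np.deg2rad(50.0), np.deg2rad(30.0)
rot_x = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(a), -np.sin(a)],
                  [0.0, np.sin(a), np.cos(a)]])
rot_z = np.array([[np.cos(b), -np.sin(b), 0.0],
                  [np.sin(b), np.cos(b), 0.0],
                  [0.0, 0.0, 1.0]])
r_true = rot_z @ rot_x

t1_meas = r_true[:, 0]                        # response to the S1 launch
t2_meas = r_true[:, 1] + 0.05 * r_true[:, 0]  # S2 response, slightly non-orthogonal
r_est = rotation_from_two_launches(t1_meas, t2_meas)
print(np.round(r_est, 4))
```

The cross products force orthonormality, but, as the text goes on to note for PDL, they give no independent check on whether the two measurements were consistent with a pure rotation in the first place.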


In comparison to the JME method, MMM offers only a simplified calculation. Errors from frequency differencing are of the same order (compare (10.3.5) and (10.3.9)) and the step size for MMM carries the same
restriction as JME (10.3.6); see [39]. In the presence of PDL, JME and MMM
differ significantly. JME can directly decompose the data into PMD and PDL
components. MMM cannot and relies on an inexact equation of motion. As
derived in §8.3.1, the equation of motion for the output polarization state in
the presence of PMD and PDL is the non-rigid precession

$$\frac{d\hat t}{d\omega} = \vec\Omega_r\times\hat t - \left(\vec\Omega_i\times\hat t\right)\times\hat t \tag{10.3.11}$$

where $\vec\Omega_r$ and $\vec\Omega_i$ are the real and imaginary components of $jT_\omega T^{-1}$. There
is no simple identification of $\vec\Omega_{r,i}$ to PMD and PDL, other than the limiting
condition that $\vec\Omega_i \rightarrow 0$ and $\vec\Omega_r \rightarrow \vec\tau$ for zero PDL.
The cross-products of (10.3.10) attempt to straighten out the data, but
measurement of only two states provides no ability to cross-check. Shtengel
has shown that in the presence of PDL the two-state MMM gives spurious
results as compared with JME [90]. Surely there is a boundary below which
PDL does not affect the MMM measurements considerably; this boundary as
a function of $\langle\tau\rangle$ has yet to be explored.
The Poincaré sphere analysis (PSA) method is equivalent to the MMM
method and also relies on two or three launch states. The reader is referred
to the work of Cyr of Exfo, Corp., for more information [15, 96].
The Attractor-Precessor Method
The Attractor-Precessor Method (APM) relies on the output-state equation
of motion (10.3.11). This equation is called attractor-precessor because local
PDL pulls the polarization state toward it and the local birefringence induces
precession. APM, which lies between the JME and MMM methods, has been
demonstrated by Eyal and Tur [33, 34] and is simplified here using spin-vector
formalism. In [34], the authors write the Frigo equation of motion, (8.3.4) on
page 373, for the output unit vector; the result is (10.3.11). The vectors $\vec\Omega_{r,i}$
have a total of six unknowns: length, and azimuth and declination angle for
each vector. Eyal and Tur demonstrated the direct estimation of $\vec\Omega_{r,i}$ using (at
least) three input states across two closely spaced frequencies. As each input
state has two known quantities, three inputs are the minimum required.

Given estimates of $\vec\Omega_{r,i}$, Eyal and Tur use a stereographic mapping to
Jones space from Stokes space to determine the PSPs, DGD, and DAS. As an
alternative, the spin-vector nature of $\vec\Omega_{r,i}$ is used as follows. Define the complex
vector $\vec\Omega = \vec\Omega_r + j\vec\Omega_i$. The eigenvectors of the traceless operator $\vec\Omega\cdot\vec\sigma$ are the
PSPs of the system, according to (8.3.7) on page 374, and the corresponding
eigenvalues are related to the DGD and DAS according to (8.3.8) on page 374.
Such a procedure is detailed by Bao et al. [7].
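
The eigen-decomposition of the traceless operator formed from the complex vector can be sketched as below. The Pauli-matrix ordering, the function name, and the sample vectors are assumptions; the precise mapping of the eigenvalues onto DGD and DAS is set by (8.3.8), which is not reproduced here, so the sketch only exposes the eigenvalues and eigenvectors.

```python
import numpy as np

# Pauli matrices with sigma_1 diagonal (an assumed spin-vector ordering).
SIGMA = np.array([
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)

def complex_vector_eigensystem(omega_r, omega_i):
    """Eigenvalues and eigenvectors of (Omega_r + j*Omega_i) . sigma."""
    omega = np.asarray(omega_r, dtype=float) + 1j * np.asarray(omega_i, dtype=float)
    operator = np.tensordot(omega, SIGMA, axes=1)
    eigvals, eigvecs = np.linalg.eig(operator)
    return omega, eigvals, eigvecs

# Hypothetical estimates, in ps.
omega_c, eigvals, eigvecs = complex_vector_eigensystem([3.0, 4.0, 0.0], [0.3, 0.0, 0.4])
print(eigvals)
```

Because the operator is traceless, the two eigenvalues are equal and opposite, and each squares to the unconjugated product of the complex vector with itself. With PDL present the operator is not Hermitian, so the two eigenvectors are in general not orthogonal.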
The APM method has characteristics of the JME method because an eigenvalue equation in Jones space is solved, and has characteristics of the MMM
method because the quantities first derived satisfy a Stokes-based equation of
motion.
Modulation Phase-Shift Method
A supremely elegant PMD measurement method that dovetails directly with
the four-states PDL measurement method is the modulation phase-shift technique (MPS). As with PDL, the first MPS method was all-states, where the
input polarization was scanned in an attempt to align with the PSPs at each
frequency. Independently, Williams [105, 109] and Nelson et al. [42, 80] developed the four-states method similar to that for PDL. That is, by measuring
four known launch states, the orientation of the PSPs can be directly calculated. The only difference between the methods of Williams and Nelson et al.
is that the latter team reconstructs the full vector $\vec\tau(\omega)$, while Williams calculates $\tau(\omega)$. For this reason the methods are herein categorized as scalar MPS
and vector MPS. The TIA standard for the S-MPS method is found in [93].
Williams and Kofler have extended the four-states method to six states, similar to Craig's six-state PDL measurement technique (cf. §10.2), to improve
the accuracy to within a 40 fs single-measurement uncertainty [110].
A suitable measurement setup is illustrated in Fig. 10.8. The output from
a narrow-line tunable laser is modulated sinusoidally at 1–2 GHz. The signal
is then conditioned to lie along one of four polarization states. The modulated,
polarized signal is transmitted through the DUT and detected by a network
analyzer and polarization-insensitive detector. The network analyzer determines the phase difference between the local oscillator, given by the modulator
source, and the received optical field. The phase delay $\phi$ equals the product
of modulation frequency $\omega_m$ and delay through the DUT $\tau_\phi$: $\phi = \omega_m\tau_\phi$.
Recall from the time-domain analysis of PMD in §8.2.6 that for a narrowband signal the output group delay is due to the common and differential
delays in the line ((8.2.47) on page 354):

$$\tau_g = \tau_o + (\tau/2)\,\hat p\cdot\hat s$$

where $\tau_o$ is due to the average group index, and $\hat s$ and $\hat p$ are the launch
state and input PSPs, respectively. The measured group delay $\tau_g$ can lie
anywhere between or at the extrema: $\tau_g = \tau_o \pm \tau/2$. The principal aspect of
the MPS method is to equate the measured delay $\tau_\phi$ with the narrow-band
group delay $\tau_g$ that comes from a moments analysis: $\tau_\phi = \tau_g$.
The optical signal launched into the DUT is split by the birefringence along
the two input PSPs. The projected intensities are $I_\pm = I_o(1 \pm \hat p\cdot\hat s)$. A phasor analysis of the received field, in the absence of appreciable differential-attenuation slope (DAS), sets the relationship between the principal variables:

$$\tan\left[\omega_m(\tau_g - \tau_o)\right] = \hat p\cdot\hat s\,\tan\left(\frac{\omega_m\tau}{2}\right) \tag{10.3.12}$$

The calibrated quantities are $\hat s$ and $\omega_m$, the measured quantities are $\tau_g$ for
each $\hat s$, and the unknown values are $\hat p$, $\tau_o$, and $\tau$. Under the constraint that
$p_1^2 + p_2^2 + p_3^2 = 1$, there are a total of four unknowns. At least four measurements are necessary to solve (10.3.12).
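Equation (10.3.12) can be solved in the following way. Since $\hat p$ is a three-entry column vector, the four input states are separated into a first launch
and a group of three launches. Define a coordinate system $(\hat r_1, \hat r_2, \hat r_3)$ such
that the first launch state $S_0 = \hat r_1$ and the remaining three launch states
are $S_i = s_{i,1}\hat r_1 + s_{i,2}\hat r_2 + s_{i,3}\hat r_3$, $i = 1, 2, 3$. For each launch state there is a
measured group delay $\tau_{g,i}$, $i = 0, 1, 2, 3$. The first launch condition and group
of subsequent launches is written as

$$\tan\left[\omega_m(\tau_{g,0} - \tau_o)\right] = S_0^T\,\hat p\,\tan\left(\frac{\omega_m\tau}{2}\right) \tag{10.3.13a}$$

$$\tan\left[\omega_m(\vec\tau_g - \tau_o)\right] = S\,\hat p\,\tan\left(\frac{\omega_m\tau}{2}\right) \tag{10.3.13b}$$

where

$$S = \begin{pmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{pmatrix} \quad\text{and}\quad S^{-1} = \begin{pmatrix} \alpha_1 & \beta_1 & \gamma_1 \\ \alpha_2 & \beta_2 & \gamma_2 \\ \alpha_3 & \beta_3 & \gamma_3 \end{pmatrix}$$

where $S^{-1}$ is written in anticipation of the following. In order for $S^{-1}$ to exist, all three
launch states cannot lie on the same plane. If two of the states are linearly
polarized, the third must have a circular component.
Solving for $\hat p$ in (10.3.13b) and substitution into (10.3.13a) gives an equation which can be solved for $\tau_o$:

$$\tan\left[\omega_m(\tau_{g,0} - \tau_o)\right] = \alpha_1\tan\left[\omega_m(\tau_{g,1} - \tau_o)\right] + \beta_1\tan\left[\omega_m(\tau_{g,2} - \tau_o)\right] + \gamma_1\tan\left[\omega_m(\tau_{g,3} - \tau_o)\right]$$

Linearization gives an initial solution:

$$\tau_o = \frac{\alpha_1\tau_{g,1} + \beta_1\tau_{g,2} + \gamma_1\tau_{g,3} - \tau_{g,0}}{\alpha_1 + \beta_1 + \gamma_1 - 1} \tag{10.3.14}$$

Once $\tau_o$ is determined, (10.3.13b) can be solved for $\hat p$ and $\tau$ under the constraint that $\hat p^T\hat p = 1$. Nelson et al. report that linearization of the transcendental
equations (10.3.13) is valid to within 6% as long as $\omega_m\tau \le \pi/4$.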
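
The linearized solution can be sketched as a single linear solve. In the small-angle limit, (10.3.12) reduces to $\tau_{g,i} = \tau_o + (\tau/2)\,\hat p\cdot\hat s_i$, which is four linear equations in $\tau_o$ and the three components of $(\tau/2)\hat p$. The sketch below solves that 4-by-4 system directly, which is a rearrangement of the same linearization rather than the $S^{-1}$ elimination written above; the function names, the launch set, and the modelled DUT values are hypothetical.

```python
import numpy as np

def mps_delays(launches, tau_o, tau, p_hat, omega_m):
    """Forward model from (10.3.12): group delay seen for each launch state."""
    proj = np.asarray(launches, dtype=float) @ p_hat
    return tau_o + np.arctan(proj * np.tan(0.5 * omega_m * tau)) / omega_m

def solve_mps_linearized(launches, delays):
    """Solve tau_g,i = tau_o + (tau/2) p.s_i for tau_o, DGD, and the PSP direction."""
    s = np.asarray(launches, dtype=float)
    design = np.hstack([np.ones((4, 1)), 0.5 * s])
    solution = np.linalg.solve(design, np.asarray(delays, dtype=float))
    tau_vec = solution[1:]
    dgd = np.linalg.norm(tau_vec)
    return solution[0], dgd, tau_vec / dgd

# Four launch states; at least one must lie off the plane of the others.
launches = np.array([[1.0, 0.0, 0.0],
                     [-1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

omega_m = 2 * np.pi * 2e9                  # 2 GHz modulation
p_true = np.array([1.0, 2.0, 2.0]) / 3.0
delays = mps_delays(launches, tau_o=5e-9, tau=10e-12, p_hat=p_true, omega_m=omega_m)

tau_o_est, dgd_est, p_est = solve_mps_linearized(launches, delays)
print(tau_o_est, dgd_est, p_est)
```

For this example $\omega_m\tau \approx 0.13$ rad, well inside the range where the linearization is reported to hold, so the recovered values sit very close to the modelled ones.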

[Figure 10.8: measurement setup with a tunable laser source (TLS), modulator (MOD) driven at $\omega_m$, polarization control selecting launch states $S_o$, $S_a$, $S_b$, $S_c$ (polarizers at 0°, 60°, 120°, and a λ/4 plate at 0°), fiber, network analyzer, and computer.]

Fig. 10.8. Modulation phase-shift method to characterize $\vec\tau(\omega)$ [80, 105]. The line
from a tunable laser source is modulated at $\omega_m \approx 1$–$2$ GHz. The field is then
conditioned by one of four launch-state polarizers. At most three of the polarization
states can lie in the same plane; at least one state must lie off the plane. For instance:
$S_o = S_1$, $S_a = -S_1$, $S_b = S_2$, $S_c = S_3$. The signal is transmitted through the DUT
and received by a network analyzer and polarization-independent detector. The
analyzer measures the phase delay of the DUT path with respect to the modulated
signal. Addition of a bypass around the DUT and direct intensity measurements
augments the setup for PDL measurement.

Eyal et al. have combined PMD and PDL measurement into a single four-states MPS method [32]. Their prescription starts with the Mueller representation of the time-domain polarization transfer function (8.2.42) on page 343.
Denoting the transfer function as $H(t)$, the Mueller matrix is constructed
through $M_{ik}(t) = \tfrac{1}{2}\mathrm{Tr}\left(H(t)\,\sigma_k\,H^\dagger(t)\,\sigma_i\right)$ for $i, k = 0, 1, 2, 3$. For an RF input
frequency $\omega_m$, the time-averaged Stokes-based transfer function is then

$$\vec S_{out} = \left\langle e^{j\omega_m t}\,M(t)\right\rangle \vec S_{in} \tag{10.3.15}$$

By observing the amplitude and phase of the response, both PDL and PMD
information can be extracted from the measurement. The advantage of this
analysis is the implicit inclusion of the differential-attenuation slope (DAS).
Expression (10.3.12) assumes a linear transfer function between input and
output modulation amplitude. DAS, however, dilates or compresses the output modulation amplitude, distorting the transfer function. The DAS-induced
amplitude change is accounted for in the Eyal analysis.
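
The trace construction of the Mueller matrix can be tried on a static Jones matrix. This is a minimal sketch for a time-independent $H$, which is a simplification of the time-domain transfer function used by Eyal et al.; the Pauli-matrix ordering (identity as $\sigma_0$, $\sigma_1$ diagonal), the function name, and the example partial polarizer are assumptions.

```python
import numpy as np

# sigma_0 .. sigma_3 with sigma_1 diagonal (an assumed spin-vector ordering).
PAULI = [
    np.array([[1, 0], [0, 1]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
]

def mueller_from_jones(h):
    """M[i, k] = 0.5 * Tr(H sigma_k H^dagger sigma_i) for a static Jones matrix H."""
    h = np.asarray(h, dtype=complex)
    m = np.empty((4, 4))
    for i in range(4):
        for k in range(4):
            # The trace of a product of two Hermitian matrices is real.
            m[i, k] = 0.5 * np.trace(h @ PAULI[k] @ h.conj().T @ PAULI[i]).real
    return m

m_identity = mueller_from_jones(np.eye(2))
m_partial = mueller_from_jones(np.diag([1.0, 0.5]))   # hypothetical partial polarizer
print(np.round(m_partial, 3))
```

For the partial polarizer the first column shows both the average transmission and the pull toward the low-loss axis, which is the intensity information a PDL measurement relies on.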
Finally, the beauty of the MPS technique is that PMD vector information
can be extracted at each frequency. The frequency differencing necessary in the
JME-type methods is replaced with narrow-band sinusoidal modulation. The
modulation bandwidth is narrower than the step size one could achieve with
JME and the phase detection of the modulated source gives a highly accurate
measurement. The MPS technique is well suited for filter component testing
in particular, where transmission windows are substantially less than 100 GHz.


10.3.3 Polarization OTDR


Polarization-adapted optical time-domain reflectometry (P-OTDR) was pioneered in 1981 by Rogers [88] to investigate the local birefringence of communication fibers. That work was dedicated to the weak-coupling regime where
$L \ll L_C$. In the mid-1990s Corsi, Galtarossa, and Palmieri extended the theory of P-OTDR into the strong-coupling regime [10, 11, 37]. Their experimental results of factory and installed fiber show that step-index, dispersion-shifted, and non-dispersion-shifted fiber all exhibit immeasurably low circular
birefringence [39], validating the Wai and Menyuk stochastic model of fiber
birefringence [99] for unspun fibers. They have also measured the PMD change
of installed fiber over a period of several years [38].
One principal result of the Corsi et al. analysis is that the longitudinal
evolution of the backward-travelling polarization state $\hat s_B(z)$ at location $z$
obeys a precession rule about the round-trip birefringence vector $\vec\beta_B(z)$ such
that

$$\frac{d\hat s_B}{dz} = \vec\beta_B(z)\times\hat s_B(z) \tag{10.3.16}$$

The practical significance of this precession rule is that it is formally equivalent to the PMD precession expression with $z \leftrightarrow \omega$ and $\vec\beta_B \leftrightarrow \vec\tau$. Therefore any
of the PMD-vector measurement techniques reviewed above can be adapted
to measure $\vec\beta_B(z)$. Figure 10.9 illustrates the experimental setup as adapted using a commercial OTDR. Recently, Gisin et al. have reported high-resolution
measurements using a photon-counting technique adapted to P-OTDR [101].
A method complementary to P-OTDR is polarization-dependent optical
frequency-domain reflectometry (P-OFDR). P-OFDR features a higher resolution than P-OTDR, but works over shorter distances. Huttner et al. have
developed P-OFDR for birefringence measurements of fiber; their work is reported in [60–62]. The Huttner group has investigated single-mode fiber and,
in particular, spun fiber having very low PMD coefficients. They have found
that spun fiber exhibits a degree of circular birefringence [62].
An important study recently reported by Gisin et al. [71] uses the P-OFDR method to study the difference between phase and group index in
various single-mode fibers. Recall that the birefringent beat length is defined
as $\Lambda = \lambda_o/\Delta n$, where $\Delta n$ is the refractive-index difference between the two
eigenstates, while the DGD is defined as $\tau = \Delta n_g L/c$, where $\Delta n_g$ is the group-index difference. Defining the group beat length as $\Lambda_g = \lambda/(c\,\tau/L)$, the
ratio of beat length to group beat length $\Lambda/\Lambda_g$ is a measure of the ratio between refractive and group birefringences. The authors report measurements
at 1550 nm that show a ratio between 1.1 and 2.6, depending on fiber type. In
particular, erbium-doped fiber exhibits a large ratio. So indeed any assumption that phase and group indices in optical fiber are the same is suspect.
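
The two beat lengths and their ratio follow directly from the definitions above. The sketch below is a plain arithmetic illustration; the function names and the index differences are hypothetical and are chosen only so that the ratio lands inside the reported 1.1 to 2.6 range.

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum

def beat_length_m(wavelength_m, delta_n):
    """Birefringent beat length: wavelength over the refractive-index difference."""
    return wavelength_m / delta_n

def group_beat_length_m(wavelength_m, dgd_s, length_m):
    """Group beat length: wavelength over (c * DGD / L), the group-index difference."""
    return wavelength_m / (C_M_PER_S * dgd_s / length_m)

wavelength = 1550e-9
length = 1000.0                 # 1 km of fiber (hypothetical)
delta_n = 1.0e-7                # hypothetical phase birefringence
delta_n_group = 1.5e-7          # hypothetical group birefringence
dgd = delta_n_group * length / C_M_PER_S

phase_beat = beat_length_m(wavelength, delta_n)
group_beat = group_beat_length_m(wavelength, dgd, length)
ratio = phase_beat / group_beat
print(phase_beat, group_beat, ratio)
```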

[Figure 10.9: measurement setup with an OTDR whose output is detected by a photodiode to trigger an external-cavity laser (ECL), followed by polarization control, EDFA, acousto-optic modulator (AOM), and fiber; the return passes through a polarization analyzer (λ/4 plate and polarizer) back to the OTDR. Electrical and optical paths are distinguished.]

Fig. 10.9. Measurement setup for P-OTDR [39]. A commercial OTDR is
adapted for polarization measurements. In particular, the instrument used by Galtarossa et al. did not have a pulse width as low as 5 ns. The OTDR output is
detected and triggers an external-cavity short-pulse laser. The laser signal is polarized, amplified, and launched to the DUT. An acousto-optic modulator (AOM)
transmits the pulse and blocks the ASE noise outside of the pulse time slot. The
field is ultimately analyzed by a quarter-waveplate polarizer pair and returned to
the OTDR for analysis.

10.4 Programmable PMD Sources


The polarization-mode dispersion present in a communications link must be
accounted for when working out the link budget for a system. To satisfy the
link budget at low cost the amount of PMD present needs to be low and the
active components, especially transmitter/receiver (Tx/Rx) pairs, need to be
tolerant to PMD. In order to validate system or Tx/Rx performance before
deployment, PMD has to be generated and the system tested to demonstrate
operation.
Figure 10.10 shows a testing hierarchy that optimizes the product-development cycle. In the development and validation phase of single-channel
Tx/Rx pairs, a programmable PMD source is used to repeatably address
PMD states that cause trouble. A programmable source is also used to compare different products on an even basis. In the validation and deployment
phase of loaded wavelength-division multiplexed systems, a PMD emulator is
used to increase the confidence that the system will work over its lifetime.
The PMD emulator (PMDE) and PMD source (PMDS) complement one
another. Hauer et al. report that a good PMD emulator should have three key
properties: 1) the DGD should be Maxwellian-distributed over an ensemble
of fiber realizations at any fixed optical frequency; 2) the emulator should
produce accurate higher-order PMD statistics; and 3) when averaged over an
ensemble of fiber realizations, the frequency autocorrelation function of the
PMD emulator should tend toward zero outside a limited frequency range, in
order to provide accurate PMD emulation for WDM channels [46].

[Figure 10.10: three-stage hierarchy. Programmable PMD source: for development cycle and performance validation. PMD emulator: confidence builder for WDM field deployment. Field service: in-service operation.]

Fig. 10.10. Test hierarchy that optimizes the development cycle for Tx/Rx pairs and for
WDM system testing. A programmable PMD source targets difficult PMD states,
allowing the developer to focus on the engineering issues. Product validation is also
performed with the source. A PMD emulator is used to build confidence that a
WDM system will work in specific environments and over lifetime. In-service fiber
carries live traffic and demands PMD tolerance of the system.

A programmable PMD source, on the other hand, produces PMD in a predictable manner and typically spans a subset of all possible PMD space for
a given mean DGD [28]. Additional attributes are the source's long-term stability, its repeatability, and its one-time calibration. When one uses a PMD
source one accepts its limitation of PMD coordinates in exchange for predictability, repeatability, and stability. Moreover, the attribute of repeatability enables one to compare the performance of two or more different systems
under the same PMD stress.
In principle a programmable PMD emulator can be built that meets the
required traits for both categories above. Such an instrument, however, would
be more expensive than two separate instruments. In the future, one might
look for low-cost ways to combine features to create one super instrument
without compromising performance.
A simple PMD emulator is a long spool of high-PMD fiber. Temperature
cycling of the spool exercises a range of PMD states. One would, of course, like
to have better control of the states and mean DGD. An improved PMDE couples many sections of polarization-maintaining (PM) fiber together. As few as
three sections have been reported, but more typically 12, 15, or more sections
are used. Mechanical rotators [63], thermo-optic heaters [45], and fiber squeezers [97] have been demonstrated. Another type is built using an integrated-optic platform and micro-ring resonators [75, 76]. Here light is divided with a
polarization-beam splitter and each light component passes through a series
of evanescently coupled ring resonators. The ring resonators impart DGD and
co-directional couplers control the mode mixing. Still another type uses birefringent crystals and waveplates that rotate [8, 18]. Rotation of the waveplates
changes the accumulated PMD through the crystals.
Since an emulator comprising 12–30 sections has far fewer correlation
lengths than a typical transmission fiber, care must be used in determining
the generated statistics. In the last chapter it was shown that the onset of the
strong mode-coupling regime is for a length greater than 30 fiber-correlation
lengths. Lima et al. and Biondini et al. have studied the difference between
emulator and fiber statistics and report that the tails of the emulator DGD
distributions fall more quickly than fiber distributions [2, 74, 77], which leads
to an underrepresentation of high-PMD states. To overcome these limitations,
Yan et al. as well as Biondini and Kath have included importance-sampling
techniques to push the tails outward for correction [3, 113].
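
The truncated tail of a finite-section emulator can be seen in a simple Monte Carlo model. The sketch below is a deliberately idealized model, assuming equal-DGD sections whose PMD-vector contributions are independent and uniformly distributed in direction at a single optical frequency (ideal mode mixing); the function name, section count, and section DGD are hypothetical, and no importance sampling is included.

```python
import numpy as np

def emulator_dgd_samples(n_sections, section_dgd_ps, n_trials, seed=0):
    """DGD samples from a random-walk sum of equal-length, randomly oriented PMD vectors."""
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(n_trials, n_sections, 3))
    directions /= np.linalg.norm(directions, axis=2, keepdims=True)  # uniform on the sphere
    total = section_dgd_ps * directions.sum(axis=1)
    return np.linalg.norm(total, axis=1)

n_sections, section_dgd = 15, 1.0
samples = emulator_dgd_samples(n_sections, section_dgd, n_trials=20000)

rms_dgd = np.sqrt(np.mean(samples**2))
hard_limit = n_sections * section_dgd    # all sections aligned: the largest possible DGD
print(rms_dgd, samples.max(), hard_limit)
```

The RMS DGD grows as the square root of the number of sections, while no realization can exceed the sum of the section delays. A Maxwellian distribution has no such cutoff, which is one way to see why the emulator tails fall faster than those of fiber.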
A new class of emulator has recently been reported: the combined PMD and PDL
emulator. Such an instrument is important to account for combined effects,
especially as signal impairments can be worse than either effect in isolation.
Waddy et al. and separately Bessa dos Santos et al. offer the first reports on
such instruments [29, 98].
In the absence of PDL, there are three core problems when using a PMDE
to develop and validate Tx/Rx-pair performance: in reference to the JPDF in
Fig. 9.9 on page 406, the high-PMD states have low probability of occurrence
(states as far out as $3\langle\tau\rangle$ and $3\langle\tau_\omega\rangle$ occur less than 0.01% of the time); emulators
are not calibrated, so unless the PMD state is measured as it evolves there
is no record of the states it went through; and emulators cannot reproduce
the same test twice except in the statistical sense. For early development and
validation applications, a programmable PMD source is necessary.
A programmable PMD source overcomes these PMDE limitations but at
the expense of restriction to one- or few-channel use, and of restriction in
addressable PMD space. The most basic of sources is the calibration artifact.
P. Williams at the National Institute of Standards and Technology (NIST)
has developed a PMD standard for the strong mode-coupling regime [104].
The artifact is made from a stack of 35 thick quartz plates, fiber-pigtailed
on either end. PMD measurement instrumentation can be calibrated to the
artifact, setting a traceable standard. In fact, standards for PMD measurement methodologies are plentiful, but other than the Williams artifact no
standards exist for PMD sources. This has impeded the industry regarding
the development and commercialization of PMD compensators, both optical
and electronic.
The programmable PMD source extends the stable, predictable, and repeatable nature of the artifact to a dynamic instrument. The earliest available programmable source is the JDS Uniphase PMD emulator [67]. This
instrument, which generates only DGD, splits input light with a polarization-beam splitter and physically delays one path relative to the other through a Mach-Zehnder-like configuration. This instrument has been a successful product,
but does not generate PMD in a meaningful way because second-order PMD
is nonexistent. Gisin proposes a fix to this by looping back the light after
one mode-mixing point [100]. Such an instrument generates DGD and the depolarization component of SOPMD; these two components are the minimum
necessary for product development. A drawback with both configurations is
that the state of polarization is not stable due to the open environment of the
delay line. In loop-back mode the instability will rotate the input PSP with
time, which in turn changes the coupling of the signal to the generated PMD.


10 Review of Polarization Test and Measurement

In order to build a stable PMD source, four physical attributes must be
stabilized and controlled: the differential group delay for each stage, the birefringent phase per stage, the birefringent axes within a stage, and the polarization mode mixing between stages. In addition, to generate a clean spectrum,
backreflection between components within the instrument and between the
fiber-to-fiber collimation pair must be minimized [111]. Meeting all of these
conditions at once has several practical ramifications that are detailed shortly.
A programmable PMD source is most useful when PMD states are specified as inputs to the instrument; given the input states the software calculates
the required internal settings, e.g. mode-mixing angles, to produce the requested PMD. The alternative is to input physical parameters such as the
waveplate angles and calculate the output PMD states. In fact there is a
mapping between PMD states and physical states, the forward mapping
from physical to PMD being straightforward to calculate and the reverse
mapping from PMD to physical being, generally, multi-valued and difficult.
The JDS Uniphase PMD emulator has a simple mapping between DGD and
the length of the delay arm. The multistage PMD source demonstrated by
Damask [28], when controlled in wavelength-flat mode, maps DGD and
magnitude-SOPMD to physical rotation angles. This mapping is also simple.
The ECHO source, a four-stage source also demonstrated by Damask [26], independently controls first- and second-order PMD and can select the balance
between depolarization and PDCD components of SOPMD. As more stages
are added beyond the four in ECHO, it is increasingly difficult to specify the
PMD state with enough meaningful terms to reverse-map uniquely to physical
parameters.
The following sections detail two successful instruments. The first instrument, simply called PMDS, does not control the birefringent phase within
any section. The result is a severe limitation in the types of states that can be
predictably addressed and an added complexity of the instrument. The second
instrument, called ECHO for enhanced coherent higher-order [PMD source],
explicitly controls the birefringent phase. This instrument is far simpler than
the PMDS and generates a far broader range of predictable states.¹
10.4.1 Sources of DGD and Depolarization
The optical head of a twelve-stage programmable PMD source (PMDS) is
illustrated in Fig. 10.11. The optical head is the heart of the instrument,
while motion control boards, power supplies, and a chassis make it complete.
This PMDS type has been built for both 10 Gb/s and 40 Gb/s applications,
and was first built at Bell Laboratories, Lucent Technologies [18, 20], and
subsequently by Chipman et al. [8]. The following sections detail how to build
and operate the instrument.
¹ The author would like to redouble his acknowledgement of P. Myers, A. Boschi,
R. Shelley, G. Simer, K. Rochford, and P. Marchese, without whose dedication
the PMDS and ECHO sources would never have been realized.

[Figure 10.11 labels: lens, APC fiber, λ/2, motors, YVO4 + LN; stages numbered up to 12.]

Fig. 10.11. Illustration of optical head of a twelve-stage programmable PMD
source. Such a source does not have birefringent-phase control. The motorized rotary stages house fixed temperature-compensated high-birefringent crystals and rotatable true zero-order half-wave waveplates. Light is coupled in and out of the
instrument via APC fibers, which are collimated with aspheric lenses having focal length f = 5 mm and beam diameter of 1.0 mm. Insertion loss and PDL are
typically 1.8 dB and 0.1 dB, respectively.

Build and Calibration


The optical head is built with twelve independent rotary stages that house and
hold temperature-compensated birefringent crystals for DGD generation and
a true zero-order half-wave waveplate for mode mixing. The delay crystals are
loaded into the rotary housing to minimize the optical path. All crystals and
waveplates are anti-reflection (AR) coated to R < 0.25% at 1545 ± 30 nm. To
reduce backreflection from the fibers and collimators, angle-polished (APC)
fiber terminations and AR-coated lenses are used. The free-space optical path
between collimators is 30 mm and has a loss of 2 dB using asphere lenses
that expand the beam to 1.0 mm diameter. Once all the stages are added the
insertion loss, PDL, and rotation-dependent loss (RDL) are typically 1.8 dB,
0.1 dB, and 0.2 dB, respectively.
Figure 10.12(a) illustrates the construction of each stage. Miniature, high-precision rotary stages, such as those from National Aperture [79], are used to
house and hold the optics. These stages have a clear aperture 6 mm round and
a top-plate that rotates. The stages are endlessly rotatable, have a repeatable
resolution of 0.02°, a maximum spin rate of 4 revolutions per minute, and
are driven by a miniature servo-motor. Onto each top plate, which is a separate ring that attaches to the rotary, a true zero-order half-wave waveplate is
mounted. These waveplates are the polarization mode mixers. The waveplates
are made from crystalline quartz with a thickness of 92 µm. True zero-order
waveplates, as opposed to compound zero-order plates, are used to minimize
beam walk during rotation, called RDL. The waveplates are 8 mm rounds
with a polished flat at the bottom aligned to the extraordinary axis of the
crystal. The clear aperture of the top-plate rings is 3 mm, so there is 5 mm
overlap between the waveplate and ring. This increases the resilience to mechanical shock. The waveplates are attached using a compliant UV epoxy that
has minimal outgassing.

[Figure 10.12, panels a) and b); labels: λ/2, flange, YVO4, LN, rotary, motor, table, pin, closure, motor zero, crystal zero.]

Fig. 10.12. Illustration of optics attached to the rotary stage and the absolute
angular reference. a) A half-wave waveplate is mounted to the moving part of the
rotary and the YVO4 and LiNbO3 crystals are fixed to a flange which is loaded
into the body of the rotary. b) A pin and closure scheme is used to give an absolute
angular reference. The calibration point of the stage is the angle between motor zero,
where the pin closes the contact, and crystal zero, the orientation of the waveplate
to maximize extinction on a calibration setup.

High-birefringent crystals that produce the DGD are inserted and fixed
into the center bore of the rotary stage. Section 4.4 details the temperature
dependence of the group index of several birefringent crystals. In particular,
the combination of YVO4 and LiNbO3 gives a high group delay per unit length
and low thermal dependence. Applicable crystal lengths are 14.801 mm of
YVO4 and 2.205 mm of LiNbO3 per 10.0 ps of DGD (cf. Table 4.6). However,
the variation of temperature coefficients from batch to batch likely exceeds
the precision suggested here. For the 10 Gb/s instrument, 10.0 ps of delay is
placed into each stage. The extraordinary axes of the YVO4 and LiNbO3 need
to be aligned to compensate for temperature. The crystals are typically cut
with a slightly rectangular cross-section, and the e-axis is aligned to one side.
The crystal pair is held by a custom flange that is cylindrical on the outside
and rectangular on the inside. After UV epoxy is applied to the non-optical
faces of the crystals, they are inserted into the flange and fixed by UV cure.
To ensure that the crystals are flush, a fringe pattern at the interface between
the crystals (part way into the flange) was checked. The crystals are specified
to have a 0.5° alignment of the crystalline e-axis to the physical aperture.
Each crystal pair is accordingly aligned to within 1.0°. Typically, better
alignment was observed.
Attachment of the crystal-loaded flange and waveplate to the rotary stage
is the key part of the calibration process. The goal is to align the delay crystals across all twelve stages and to align the waveplate to each delay-crystal
pair. Alignment for each stage is done one-by-one on a calibration standard
setup [20]. The calibration standard has input and output fibers that are coupled by collimators. Two Polarcor polarizers (from Corning) are placed in the
optical path in rotary stages and crossed. Using a power meter the polarizers
are crossed so that the extinction ratio exceeds 60 dB. The polarizers are then
permanently fastened into place.

[Figure 10.13: panels a) and b), each plotted against Stokes axes S1, S2, S3.]

Fig. 10.13. Measured output of a PMD source over two 6 hr periods demonstrates
temperature stability. a) Day time with laboratory traffic. b) Overnight.

To have a repeatable calibration point, an absolute angular reference on the
rotary stage is required. For the rotary stages used here, the absolute angular
reference is a mechanical closure, fixed to the housing, that is actuated by a
pin, fixed to the rotary wheel, when the pin physically brushes the closure;
see Fig. 10.12(b). The pin brushes the closure for about 2° of travel but first
closes it with an angular precision of 0.05°. This first closure point is called
motor zero, and is always detected by slow rotation in the same direction.
Once a rotary stage is set to motor zero, the waveplate ring is attached.
The relative orientation of the waveplate e-axis to motor zero is unknown,
although the polished flat gives some indication. The stage is then placed on
the calibration standard and the waveplate is rotated to maximize optical
extinction. Typical quartz waveplates achieve better than 50 dB extinction.
The angular orientation for maximum extinction is called crystal zero. The
difference in angle between motor and crystal zero is the calibration point for
the stage. This calibration point is recorded in the instrument software. Crystal zero is found for any unknown orientation of the rotary by first rotating
to motor zero, then rotating by the calibration angle to crystal zero.
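The bookkeeping behind the calibration point can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the 37.5° offset are hypothetical, angles are taken in degrees measured in the fixed direction used to detect motor zero, and nothing here represents the actual instrument software.

```python
def calibration_point(motor_zero_deg: float, crystal_zero_deg: float) -> float:
    """Offset from motor zero to crystal zero, wrapped to [0, 360) degrees."""
    return (crystal_zero_deg - motor_zero_deg) % 360.0


def command_angle(calibration_deg: float, requested_deg: float = 0.0) -> float:
    """Rotation past motor zero that places the waveplate at `requested_deg`
    relative to crystal zero, wrapped to [0, 360) degrees."""
    return (calibration_deg + requested_deg) % 360.0


# Hypothetical stage: maximum extinction found 37.5 degrees past motor zero.
cal = calibration_point(0.0, 37.5)
print(command_angle(cal))        # rotation from motor zero that reaches crystal zero
print(command_angle(cal, 45.0))  # rotation that reaches 45 degrees past crystal zero
```

Because every move is referred back to motor zero, a stage of unknown orientation needs only one homing rotation before any requested waveplate angle can be reached.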
Once crystal zero is found, the flange is loaded into the body of the rotary.
The flange is manually rotated until maximum extinction is found and is then
fixed in place. Typical extinction ratios at this point are 42–45 dB, but as
low as 34 dB was found on occasion. Once the flange is fixed in place the
rotary assembly is complete.
The optical head of the instrument is assembled using twelve rotaries all
set to crystal zero. One-by-one each rotary is set in position against set pins
and screwed into place. Minor adjustments are made to minimize the insertion
loss, which is monitored throughout, since misalignment of the crystals will
walk the beam away from the input aperture of the second fiber. Once all
motors are in place, final adjustment is made to the collimators to minimize
the loss and these are then locked into place. A dust cover protects the optics.
Two final points remain. There is a factor of four between the physical
angle of a half-wave waveplate and its Stokes angle. One factor of two comes
from the mirror-image about the e-axis of the waveplate, giving an apparent
rotation of $2\theta$, and the other factor comes from conversion to Stokes space
from physical space. Separately, the polarization stability of this instrument


is shown in Fig. 10.13. (A good reference for the polarimetric stability of other
sources is given in [114]). The temperature dependence of YVO4 or LiNbO3
alone is large, but the crystals as a pair greatly stabilize the birefringent phase.
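The factor of four between physical and Stokes angle can be checked numerically. The sketch below assumes an ideal half-wave waveplate, horizontally polarized input light, and the common Jones-matrix form with the global phase dropped; the function names are illustrative only.

```python
import math


def hwp_output_stokes(theta_deg: float):
    """Stokes parameters (S1, S2, S3) of horizontally polarized light after an
    ideal half-wave plate whose e-axis sits at physical angle theta_deg."""
    t = math.radians(theta_deg)
    # Mirror image about the e-axis: the field is rotated to physical angle 2*theta.
    ex, ey = math.cos(2.0 * t), math.sin(2.0 * t)
    # Conversion to Stokes space doubles the angle once more.
    s1 = ex * ex - ey * ey
    s2 = 2.0 * ex * ey
    s3 = 0.0  # the output stays linearly polarized
    return s1, s2, s3


def stokes_azimuth_deg(theta_deg: float) -> float:
    """Angle of the output state in the S1-S2 plane, in degrees."""
    s1, s2, _ = hwp_output_stokes(theta_deg)
    return math.degrees(math.atan2(s2, s1))


for theta in (5.0, 10.0, 22.5):
    print(theta, stokes_azimuth_deg(theta))
```

For each physical angle the printed Stokes azimuth is four times larger, to within floating-point rounding, which is why a 45° waveplate rotation corresponds to a 180° flip in Stokes space.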
Operation
Because the birefringent phase of each stage is not known and is not controlled, the class of sources called PMDS cannot predictably generate PMD
that has more than one Fourier component. That is, only wavelength-flat
states are predictably generated. A predictable, frequency-dependent DGD
spectrum requires phase control of the Fourier components, but this phase control is absent in the PMDS. Even though non-wavelength-flat states are not
fully predictable, they can be repeated due to the instrument's stability.
Wavelength-flat states produce DGD and pure depolarization; no PDCD
or higher-order PMD is generated. For basic tests this actually has several
advantages. The first is that no frequency alignment is necessary between the
PMD generated by the instrument and the laser line of the transmission: the
DGD and magnitude-SOPMD are constant in frequency. Second, depolarization statistically dominates PDCD so it is the more common component of
SOPMD. Experimental evidence shows that depolarization also dominates the
impairment of a signal in many instances. Third, the generated PMD is engineering pessimistic in that it is unlikely a fiber will exhibit high DGD and
magnitude-SOPMD over the entire bandwidth of the signal. When a Tx/Rx
pair can tolerate the PMD generated by the PMDS it will generally have an
easier time of it on a live line.
Figure 10.14 shows the properties of wavelength-flat states. These states
are generated by two PMD vectors $\vec{\tau}_{1,2}$. The first vector precesses about the
tip of the second vector as a function of frequency (Fig. 10.14(1)). The Stokes
angle $4\theta_{21}$ between the vectors is four times the physical angle $\theta_{21}$ of an
intermediate half-wave waveplate. This angle is fixed in frequency. The output
PMD vector is the vector sum of the components. The length is constant in
frequency while the pointing direction traces a circle in Stokes space. The
DGD and SOPMD can easily be determined geometrically: the DGD is the
vector length following the triangle rule, and the magnitude-SOPMD is the
tangential rate at the tip of $\vec{\tau}_1$ with frequency. The tangential rate is the
circle radius $r = \tau_1 \sin 4\theta_{21}$ multiplied by the precession rate $\tau_2$. Putting this together, the
DGD and magnitude-SOPMD are

$\tau^2 = \tau_2^2 + \tau_1^2 + 2\tau_1\tau_2\cos(4\theta_{21})$    (10.4.1a)

$\tau_\omega = \tau_2\,\tau_1\sin(4\theta_{21})$    (10.4.1b)

For fixed $\tau_{1,2}$ the DGD and magnitude-SOPMD are parametric in $\theta_{21}$.


Patscher and Eckhardt investigated a two-stage optical compensator and
demonstrated similar results [83].
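A short numerical sketch of (10.4.1) may help in reading the state-space plots. It assumes the two group delays and the physical waveplate angle are known; the function name is illustrative, and the absolute value of the sine is taken because only the magnitude of the SOPMD is of interest here.

```python
import math


def two_group_state(tau1: float, tau2: float, theta21_deg: float):
    """Return (DGD, magnitude-SOPMD) for two concatenated PMD vectors of
    lengths tau1 and tau2, mixed by a half-wave plate at physical angle
    theta21_deg. Units follow the inputs: ps gives ps and ps^2."""
    stokes_angle = math.radians(4.0 * theta21_deg)
    dgd_squared = tau1**2 + tau2**2 + 2.0 * tau1 * tau2 * math.cos(stokes_angle)
    dgd = math.sqrt(max(0.0, dgd_squared))  # guard against rounding below zero
    sopmd = tau1 * tau2 * abs(math.sin(stokes_angle))
    return dgd, sopmd


# Sweep the 4 : 4 trajectory in units of the stage delay.
for theta in (0.0, 11.25, 22.5, 33.75, 45.0):
    print(theta, two_group_state(4.0, 4.0, theta))
```

With both groups of length 4 the sweep runs from the maximum-DGD state at 0° through the SOPMD peak at 22.5° to the cancelled state at 45°. Scaling both inputs by the stage delay, for instance two 60 ps groups, gives the same trajectory in physical units.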

[Figure 10.14: 1) vector diagram of $\vec{\tau}_1$ and $\vec{\tau}_2$ with angle $4\theta_{21}$ and radius r; 2) SOPMD $\tau_\omega/\tau_s^2$ versus DGD $\tau/\tau_s$ for the 4 : 4 and 2 : 4 trajectories, with states a, b, c marked; 3) vector diagrams for states a, b, c.]

Fig. 10.14. Representations of two-section PMD states. 1) Two concatenated PMD
vectors; the first vector precesses about the axis of the second with frequency. The
angle between vectors is four times the physical angle of the intermediate waveplate,
and is fixed with frequency. The vector sum is the output PMD vector; its length
is constant in frequency and its pointing direction traces a circle in Stokes space.
2) State-space in first- and second-order PMD for 4 : 4 and 2 : 4 groupings. 3) Vector
representations of each corresponding state.

For each angle $\theta_{21}$ a state $(\tau, \tau_\omega)$ is produced. The locus of states for all
angles traces a trajectory in first- and second-order-PMD state space. For
example, when the component vectors are both 4 in length, the trajectory
labelled 4 : 4 is traced (Fig. 10.14(2)). PMD states along a trajectory are
continuous, and the maximum and minimum DGD are 8 and 0, respectively,
and the maximum SOPMD is 16. Alternatively, when the vector lengths are 4
and 2, the 2 : 4 trajectory is traced. In this case the minimum DGD is not
zero but 4 − 2 = 2. The vector diagrams for various states are illustrated in
Fig. 10.14(3). Finally, the state-space scales by $\tau_s$, the delay per stage. For
the PMDS described above, $\tau_s = 10.0$ ps.
The PMDS instrument makes wavelength-flat states by aligning the stages
into two groups (Fig. 10.15). In this figure only eight stages are illustrated,
so there are only ten unique trajectories. An important aspect is that pairs
of stages can be cancelled optically by rotating the intermediate waveplate
by 45°. This flips the fast and slow axes from one stage to the next. As illustrated in Fig. 10.15(b), a 4 : 2 trajectory (the same as 2 : 4) is made by allowing
DGD to accrue through four consecutive stages and then mode-mixing at the
junction to the fifth stage. The waveplate labelled 5 is not rotated so that
DGD accrues between stages five and six. Finally, the waveplate labelled 6 is
rotated by an equal and opposite amount as waveplate 4 to restore the polarimetric axis.

[Figure 10.15: a) SOPMD $\tau_\omega/\tau_s^2$ versus DGD $\tau/\tau_s$ showing trajectories 1:1, 1:3, 1:5, 1:7, 2:2, 2:4, 2:6, 3:3, 3:5, and 4:4; b) 4:2 and c) 5:3 stage configurations showing Group 1, Group 2, and a cancelled pair with the waveplate at 45°.]

Fig. 10.15. Correspondence between PMD state-space and physical realization for
an 8-stage cascade. Groups are formed by setting the intermediate waveplates to
zero angle. Mode mixing happens whenever a waveplate has a non-zero angle. Pairs
of stages can be optically cancelled by setting the intermediate waveplate to 45°.

In a similar manner, a 5 : 3 group is shown. Another important
feature of the PMDS is that it has a true zero PMD state. Without it, the
instrument would have to be bypassed during system setup.
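The count of unique trajectories follows from the grouping rule and can be enumerated directly. The sketch below rests on one assumption inferred from the text rather than stated in it: stages that belong to neither group are cancelled in pairs, so the number of active stages differs from the total by an even number. The function name is illustrative.

```python
def trajectories(n_stages: int):
    """List the unique m : n groupings (m <= n) available to a cascade of
    n_stages equal delays when unused stages are cancelled two at a time."""
    groupings = []
    # Active stage counts: n_stages, n_stages - 2, ... down to at least 2.
    for active in range(n_stages, 1, -2):
        for m in range(1, active // 2 + 1):
            groupings.append((m, active - m))
    return sorted(groupings)


for m, n in trajectories(8):
    print(f"{m} : {n}")
print(len(trajectories(8)), len(trajectories(12)))
```

Under this assumption the eight-stage cascade yields the ten trajectories noted above, and the same rule applied to twelve stages gives 21.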
Experimental validation of a 10 Gb/s source is shown in Fig. 10.16(a,b) [28].
The figures show six measured spectra of two 6-stage groups. The six states of
the PMDS were measured using Heffner's JME method and the Stokes data
were used to calculate DGD and PSP values. The substantially wavelength-flat DGD spectrum labelled A corresponds to no mode mixing and maximum
DGD. Accordingly, output PSP spectrum A points essentially in one direction. The DGD spectra B, C, D, E, and F have corresponding PSP spectra
which are circles of ever increasing radius (PSP spectrum E removed for clarity). When the DGD value is zero, the corresponding PSP spectrum will trace
a great circle through the S3 poles.
Figure 10.17 shows an overlay of 47 wavelength-flat states plotted in $(\tau, \tau_\omega)$
space. The dashed lines are theoretical trajectories derived from (10.4.1). The
points are the measured first- and second-order PMD values averaged over a
free-spectral range. The points in fact display the results from five repeated
tests performed overnight; the tight grouping illustrates the stability of the
instrument.

[Figure 10.16: a) DGD (ps, 20 to 140) versus wavelength (1549.1 to 1549.7 nm); b) output PSP spectra labelled A to D plotted against Stokes axes S1, S2, S3.]

Fig. 10.16. Measured DGD and PSP spectrum from two-group operation of
a 10 Gb/s PMDS. a) Seven measured DGD spectra over a free-spectral range. The
spectra are generated with two 60 ps groups, where the intermediate waveplate controls the mode mixing. These spectra are wavelength-flat, indicating only DGD
and depolarization are present. b) Six measured output PSP spectra over a free-spectral range. Letters A, B, C, D, and F correspond to respective DGD spectra.
That wavelength-flat states generate pure depolarization is evidenced by the circular
PSP spectra.
[Figure 10.17: SOPMD (ps², up to 4000) versus DGD (ps, up to 120); legend: measurement, theory.]

Fig. 10.17. Comparison between experiment and theory for 47 wavelength-flat
PMD states. Dashed lines are theory; boxes, experiment. The experiment was repeated five times in succession overnight, so the overlap of boxes on the same state
indicates the stability of the instrument.

Taken together, Figs. 10.16 and 10.17 demonstrate a central aspect of
the two-group PMDS operation: the resultant PMD spectra are pure, with
negligible wavelength dependence and pure depolarization with no PDCD.
Moreover, the accuracy, stability, and repeatability evident in Fig. 10.17 allow
for comparison of one system to another.


Total Delay of 1.2T


The total delay built into a PMDS instrument depends on the application.
As a validation tool for Tx/Rx performance, the bit-error rate (BER) should
be mapped over an entire bit time T, where T = 100 ps at 10 Gb/s and 25 ps
at 40 Gb/s. This mapping should accurately represent both first- and second-order PMD states based on the JPDF for fiber. An increase from 1.0T to 1.2T
increases the maximum SOPMD by 40%, which gives improved coverage for
a JPDF scaled to a fiber with mean-PMD of 30 ps at 10 Gb/s and 7.5 ps
at 40 Gb/s.
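The quoted increase can be checked against (10.4.1b). The check assumes, as the text does not state explicitly, that the maximum SOPMD is reached on the balanced trajectory, where each group carries half of the total delay $\tau_{\mathrm{tot}}$ and $\sin(4\theta_{21}) = 1$:

```latex
\tau_{\omega,\max} = \left(\frac{\tau_{\mathrm{tot}}}{2}\right)^{2},
\qquad
\frac{\tau_{\omega,\max}(1.2T)}{\tau_{\omega,\max}(1.0T)}
  = \frac{(0.6\,T)^{2}}{(0.5\,T)^{2}} = 1.44 .
```

At 10 Gb/s this is 3600 ps² against 2500 ps², an increase of 44%, in line with the roughly 40% quoted.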
Variations
One variation is to operate the PMDS in PMD emulation mode. In this mode
any and all stages are engaged so that a large amount of mode mixing is introduced. Since the rotary stages are dynamic and endlessly rotatable, rotation
speeds that correspond to prime-number multiples of a unit speed drive the
instrument through a virtually endless number of states. Moreover, since the
instrument is calibrated, a specific path in time can be reproduced. Calculation shows that the average DGD for the 10 Gb/s instrument over all states
is $\langle\tau\rangle \simeq 31.5$ ps, although the distribution tails fall faster than Maxwellian.
Another variation uses binary-weighted delay stages similar to that demonstrated by Yan et al. [114]. Such an instrument fills the first- and second-order
PMD state space with more trajectories, giving it better coverage. One realization is an instrument with fifteen stages: the first eleven stages are as before,
the next two are loaded with two $\tau_s/2$-length crystals, and the last two are loaded
with two $\tau_s/4$-length crystals. In this case, over 120 distinct trajectories are
available and cover the state space well. The problems are the size and cost of
the instrument, its fragility due to the short-length stages, and its difficulty to
program. The two pairs of binary-weighted stages divide all possible trajectories into four categories depending on their alignment or cancellation, making
the instrument cumbersome to calibrate and operate.
Problems with the PMDS
The PMDS instrument was the first to demonstrate stability, predictability,
and repeatability. However, problems remain, problems that ultimately call
for the ECHO instrument.
Optically, the state-space coverage is poor. The 21 trajectories offer continuous PMD tuning along them, but jumping from one trajectory to another
requires several motors to move at once, unless the instrument is first reset to
zero, which is time consuming. It would seem unlikely to happen, but it has
occurred that the instrument passes through high PMD states going between
two low states, which in turn can disrupt an experiment. Even beyond this
annoyance, many regions are simply not accessible, and the state density falls


off for higher PMD values. But high PMD values are precisely where the state
density should be highest. Moreover, the wedge delineated by zero DGD, finite
SOPMD on one side and the 6 : 6 trajectory before its peak on the other side
is an entire range of relevant PMD states that are inaccessible by the instrument. These states represent high SOPMD for low DGD, which has significant
probability of occurrence, as indicated on the JPDF in Fig. 9.9. Finally, the
birefringent phase is not controlled at each stage, limiting the predictability
of the instrument to wavelength-flat states.
Mechanically, the optical head is fragile. The crystals in the motor housings
are not resilient to excessive mechanical shock or temperature variation. The
rotary stages are very high quality, but motor burnout or motor-zero problems do occur. The more rotaries within any one instrument, the higher the
likelihood of an instrument failure. Finally, use of twelve motors is expensive
and makes for a long build and calibration time.
10.4.2 ECHO Sources
The Enhanced Coherent Higher-Order (ECHO) PMD source was developed
in response to the shortcomings of the twelve-stage sources. In addition to
the stabilization and control of the differential-group delay, birefringent axes
within a stage, and mode-mixing between stages, ECHO calibrates and controls the birefringent phase of each stage [24]. This has several optical ramifications: higher-order PMD spectra are predictably generated, the spectrum
is continuously tunable in frequency without changes in shape, the first- and
second-order state-space is continuous, and the state-space faithfully covers
the JPDF. There are also several mechanical ramifications: only five rotaries are needed
and all the delay crystals are mechanically aligned to a single reference.
ECHO looks very much like a birefringent filter, which has been studied
by Lyot, Šolc, Evans [31], and Harris [44]. But the ECHO birefringent filter imposes structure on the PMD spectra, not the intensity spectra. The
birefringent filter was extended by Buhrer [4] to have continuous frequency
tuning of the intensity spectrum by adding Evans phase shifters (cf. §4.6.3),
and likewise, ECHO adopts the phase shifter to continuously tune the PMD
spectrum. Tuning of the PMD spectrum at the source means that the transmission laser can be fixed in frequency, which is the preferable way to set up
an experiment.
Finally, ECHO highlights the fact that the shape of PMD spectra is not
determined by mode mixing alone but also by the birefringent phase. This
is a key point. Structured PMD spectra generated by an unstable source,
such as PM fiber, cannot be predicted even if the mode mixing is completely
controlled. The absence of birefringent-phase stability causes the spectrum to
change shape anyway. ECHO demonstrates this effect.


Opto-Mechanical Layout
The opto-mechanical layout of the ECHO optical head is fundamentally different from that of the PMDS (Fig. 10.18). Regarding the optics, there are only four
delay stages, rather than twelve, and three intermediate mode mixers. Like the
PMDS, the delays are made from temperature-compensated YVO4-LiNbO3
crystal sets. The mode mixers are true zero-order half-wave waveplates. Added
to the second and third stage are Evans phase shifters. Each phase shifter has
a pair of fixed quarter-wave waveplates and a rotatable half-wave waveplate
mounted on a rotary stage. For stages two and three, the total birefringent
phase is that from the delay crystals plus the phase imparted by the phase
shifter. As shown in §8.2.7, the birefringent phases of the first and last stage
do not affect the PMD spectrum, so phase shifters are not used in the two
outer stages in ECHO. There is, however, a polarimetric difference whether
or not end phase shifters are included, but this is immaterial.
The mechanical layout of the optical head puts all delay crystals and
quarter-wave waveplates onto one solid platform (Fig. 10.18(b)). The platform
has sections removed to make room for the rotary stages. Onto the platform
a crystal guide is attached. The crystal guide gives an edge along which
all crystals and waveplates are abutted. In this way the fixed optics have a
single bottom and side mechanical reference. This mechanical structure makes
placement of the fixed optics easy, and optical properties such as extinction
ratio and phase are repeatable. To either end of the platform the collimator
assemblies are attached and aligned. The rotaries are placed in the platform
gaps and fixed from the bottom. The only optics attached to the rotaries are
half-wave waveplates.
Like the PMDS, the delay crystals are cut with a rectangular aperture,
with the e-axis aligned to one side. When aligned to the crystal guide these e-axes are horizontal. The aperture of the quarter-wave waveplates is the same
and the e-axis is inclined by the requisite 45°. A true zero-order quarter-wave
waveplate made from crystalline quartz is 46 µm thick at 1.55 µm. In order
to handle the part and fix it to the stage, the waveplate is best mounted to
a host, such as BK7. To minimize internal reflection at the waveplate/host
interface, the glasses should be optically contacted and anneal-bonded.
The extraordinary axes of all waveplates in the ECHO instrument have to
be aligned and not crossed, an unnecessary requirement for the PMDS. Since
the e-axes of the quarter-wave waveplates are at 45°, placement of the part
onto the platform backwards flips the relative orientation of the plate. This,
in turn, causes unwanted mirror images in the control of the phase shifter and
mode mixers, as is obvious after study of Fig. 4.19 on page 185. There are
at least two ways to ensure proper orientation: visual inspection of the plates
through crossed polarizers on a light table; or by applying to either side two
optically equivalent AR coatings having distinct colors, as proposed by Shirai
for iron garnets (see page 152).

[Figure 10.18: a) optical schematic of Stages 1 to 4 separated by three λ/2 mode mixers, with Evans phase shifters (λ/4 plates at 45° around a rotatable λ/2 plate) on the center stages; b) mechanical layout with labels APC fiber, lens, platform, crystal guide, YVO4, LN, λ/4, λ/2, and Motors 1 to 5.]

Fig. 10.18. Illustration of opto-mechanical layout of ECHO source. Four equal-length crystal delay stages are mode-mixed with three true zero-order half-wave
waveplates. Two Evans phase shifters are added to the center stages to control
the birefringent phase. The five rotatable waveplates fall into two groups: three
waveplates for mode-mixing, two for phase control.

The ECHO instrument is calibrated as it is built [22]. The first step is
to zero-out the residual birefringent phase of the center two delay sections.
This is done at a calibration frequency. At a fixed frequency, a delay section
has an integral number of birefringent beats and a fraction of a beat. This
fractional part is the residual birefringent phase. For instance, a 10.0 ps delay
at 194.1 THz has 1941 birefringent beats. But since the 3 µm manufacturing
tolerance of the crystal length is almost the same as the 7 µm birefringent-beat
length in YVO4, the residual birefringent phase is random. At the calibration
frequency the Evans phase shifters are adjusted to drive the residual phase to
zero. Once this is done, the center mode mixer is added and aligned to the
optical axis of the delay, and then the outer mixers are added one by one and
aligned. Once all of the calibration points are determined and all rotaries are
set to crystal zero, the rotary counters are reset; all subsequent rotations
refer to this zero-angle position.
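The beat-counting argument in the calibration step can be illustrated with a small sketch. The function names are hypothetical; the residual is taken relative to the nearest whole beat so that it lies within half a beat of zero, and the length-error estimate assumes the error simply adds phase in proportion to the beat length.

```python
def birefringent_beats(delay_ps: float, freq_thz: float):
    """Split delay x frequency into the nearest whole number of beats and a
    residual fraction of a beat in [-0.5, 0.5]. ps x THz gives cycles."""
    beats = delay_ps * freq_thz
    whole = round(beats)
    return whole, beats - whole


def residual_from_length_error(length_error_um: float, beat_length_um: float) -> float:
    """Fraction of a beat (0 to 1) added by a crystal length error."""
    return (length_error_um / beat_length_um) % 1.0


whole, residual = birefringent_beats(10.0, 194.1)
print(whole, residual * 360.0)               # whole beats, residual phase in degrees
print(residual_from_length_error(3.0, 7.0))  # fraction of a beat from a 3 um error
```

A nominal 10.0 ps section has essentially no residual at 194.1 THz, but a length error comparable to the beat length shifts the phase by a large fraction of a beat, which is why the residual must be measured and nulled with the phase shifter rather than designed in.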
Coherent PMD
In order to generate large changes in DGD and SOPMD in a small frequency
band, thereby producing strong higher-order PMD states, the component
PMD vectors must add constructively. As with any other optical effect, constructive interference occurs when optical phases align. An excellent example
is the birefringent filter and its prerequisite coherence [44]. In terms of PMD,
constructive interference happens when the phases of the Fourier components
are aligned. This is called coherent polarization mode dispersion [23, 26].

[Figure 10.19: impulse response (versus time) and DGD² response (versus frequency) for cases a), b), and c).]

Fig. 10.19. Progression toward a coherent PMD spectrum for four delay stages.
a) Four-stage incoherent spectrum. Each stage delay is different, making five Fourier
components. b) Four-stage harmonic spectrum. All stage delays are the same, but
the residual birefringent phase is arbitrary. c) Four-stage coherent spectrum: the fundamental and second-harmonic phases are aligned. Maximum contrast is achieved.
It is evident that birefringent phase plays a key role in the shape of the spectrum.
There are four stages in the ECHO instrument. Recall from (8.2.75) on page 370 that the general DGD-squared spectrum for four stages has five Fourier components: a DC component, components that correspond to the delays of the center two stages, and the sum and difference of these delays. This general case is shown in Fig. 8.36(c) on page 367 and redrawn in Fig. 10.19(a). This DGD spectrum is complicated, has a lower contrast ratio, and has a long free-spectral range.
The first step toward coherency is to have all Fourier components be a multiple of a unit component. This is called harmonic PMD and occurs when the stage delays $\tau_k$ are multiples of a unit stage delay $\tau_s$: $\tau_k = n\tau_s$ where $n$ is an integer. When $n = 1$ for each stage but the residual phases remain arbitrary, the general expression (8.2.75) reduces to

$$\vec{\tau}_h \cdot \vec{\tau}_h = b_0 + b_1 \cos(\omega\tau_s - \phi) + b_2 \cos 2\omega\tau_s \qquad (10.4.1)$$

where $h$ denotes harmonic and $\phi$ is the phase offset measured in relation to $2\omega\tau_s$. This offset is non-zero when the residual phases of the center two sections differ. Its effect is shown in Fig. 10.19(b): there are three non-degenerate Fourier components rather than five, but since the residual phases do not match, the fundamental and second-harmonic Fourier components are not phase aligned. This spectrum is harmonic but not coherent.

[Figure 10.20: Log10 amplitude versus Fourier component (ps), from -180 ps to 180 ps. Inset: DGD (ps) versus optical frequency, annotated 7.5 ps (133 GHz) and 15.0 ps (66.5 GHz).]

Fig. 10.20. Fourier transform of magnitude-squared DGD spectrum (inset) measured from a four-stage coherent PMD source having stage delay $\tau_s = 7.5$ ps. The principal Fourier components are DC, $\tau_s$, and $2\tau_s$. Vertical axis is on a logarithmic scale.
To go from a harmonic to a coherent spectrum the residual birefringent phase of the center sections must be controlled. The normalized phase for stage $k$ is defined

$$\hat{\varphi}_k = \varphi_k / n_k$$

Coherency requires $\hat{\varphi}_j = \hat{\varphi}_k$ for all stages $j$ and $k$ save for the first and last stage. When $n = 1$ for all stages, the birefringent phase of each contributing stage must be the same. The Evans phase shifter makes this situation possible. The coherent four-stage magnitude-squared DGD spectrum is then

$$\vec{\tau}_{coh} \cdot \vec{\tau}_{coh} = c_0 + c_1 \cos \omega\tau_s + c_2 \cos 2\omega\tau_s \qquad (10.4.2)$$

where the subscript coh denotes a coherent spectrum. One possible spectrum is illustrated in Fig. 10.19(c): the two non-DC Fourier components have the same phase, so the components of the DGD-squared spectrum align. In this case maximum constructive interference is possible and the PMD excursions will exhibit their highest contrast over the shortest FSR.
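The difference between a harmonic and a coherent spectrum can be illustrated numerically. The sketch below assumes a spectrum of the form $b_0 + b_1\cos(\omega\tau_s - \phi) + b_2\cos 2\omega\tau_s$ with hypothetical positive coefficients (they are not taken from the text) and compares the peak of the DGD-squared spectrum for an aligned and a misaligned fundamental.

```python
import math


def dgd_squared(phase, b0, b1, b2, offset):
    """Harmonic DGD-squared spectrum; phase stands for omega * tau_s."""
    return b0 + b1 * math.cos(phase - offset) + b2 * math.cos(2.0 * phase)


def spectrum_peak(b0, b1, b2, offset, samples=3600):
    """Largest DGD-squared value over one free-spectral range (uniform sampling)."""
    return max(
        dgd_squared(2.0 * math.pi * i / samples, b0, b1, b2, offset)
        for i in range(samples)
    )


# Hypothetical Fourier coefficients, chosen only for illustration.
B0, B1, B2 = 4.0, 2.0, 1.0

coherent_peak = spectrum_peak(B0, B1, B2, offset=0.0)           # phases aligned
harmonic_peak = spectrum_peak(B0, B1, B2, offset=math.pi / 2)   # phases misaligned
print(coherent_peak, harmonic_peak)
```

With aligned phases both cosine terms reach their maxima at the same frequency, so the peak equals the sum of the coefficients; any offset of the fundamental lowers it.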
A harmonic PMD spectrum is demonstrated in Fig. 10.20. An ECHO instrument was set to maximum mode mixing and its DGD spectrum was measured. The spectrum was numerically squared and its Fourier transform taken. The amplitude of that spectrum is shown in the figure. There are strong tones at DC, $\tau_s$, and $2\tau_s$. This spectrum is in fact coherent as well as harmonic; the phases, while not plotted, were equal to within the measurement limit.
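The processing steps described above (square the DGD spectrum, Fourier transform it, look for tones) can be sketched on synthetic data. This is not the measured data of Fig. 10.20: the spectrum is generated from a coherent form with hypothetical coefficients, and a plain discrete Fourier transform is written out so the example needs no external library.

```python
import math


def dft_amplitude(samples):
    """Single-sided amplitude spectrum of a real sequence (direct DFT)."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
        amplitudes.append(math.hypot(re, im) / n)
    return amplitudes


TAU_S_PS = 7.5                   # stage delay quoted for Fig. 10.20
N_SAMPLES = 256
SPAN_THZ = 8.0 / TAU_S_PS        # frequency span covering 8 free-spectral ranges

# Synthetic DGD-squared spectrum; the coefficients are hypothetical.
C0, C1, C2 = 4.0, 2.0, 1.0
freqs_thz = [i * SPAN_THZ / N_SAMPLES for i in range(N_SAMPLES)]
dgd_sq = [
    C0
    + C1 * math.cos(2.0 * math.pi * f * TAU_S_PS)
    + C2 * math.cos(4.0 * math.pi * f * TAU_S_PS)
    for f in freqs_thz
]

amps = dft_amplitude(dgd_sq)
delays_ps = [k / SPAN_THZ for k in range(len(amps))]   # delay axis in ps

# Indices of the three strongest tones, reported on the delay axis.
strongest = sorted(sorted(range(len(amps)), key=lambda k: amps[k])[-3:])
tones_ps = [delays_ps[k] for k in strongest]
print([round(t, 3) for t in tones_ps])
```

Because frequency in THz multiplied by delay in ps is dimensionless, the transform axis is read directly in picoseconds, and the tones fall at DC, $\tau_s$, and $2\tau_s$.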
In general, for a coherent $N$-stage concatenation the magnitude-squared DGD spectrum has the form

$$\vec{\tau} \cdot \vec{\tau} = \sum_{n=0}^{N} c_n \cos n\omega\tau_s \qquad (10.4.3)$$

One can see from Fig. 10.19 the fundamental influence birefringent phase has on the shape of the PMD spectra. Even for the same mode mixing, a change of the phase relationship shifts the position of the component tones, which in turn changes the spectral shape.
Theory of Operation
The following theory of operation imposes some symmetries on the control of the instrument [27]. Referring to Fig. 10.18, there are three mode mixers and two phase shifters. Once the instrument is built and calibrated, the outer two mode mixers, motors 1 and 5, are tied together so that they always register the same angle. Also, the two phase shifters are operated either in common mode or differential mode. Common mode means the phase shifters change phase by the same amount, which is tantamount to frequency tuning the spectrum. Differential mode means that the shifters change phase by equal and opposite amounts, which changes the shape of the spectrum. Control of ECHO principally deals with common-mode control.
Section 8.2.4 derived the PMD concatenation rules for a cascade. ECHO uses half-wave waveplates as mode mixers, so there is a necessary modification to the equations. Given that all delay stages (being equal) are represented by $\vec{\tau}_s$ and the waveplates by $Q_k$, the cumulative PMD vector $\vec{\tau}$ is

$$\vec{\tau} = \vec{\tau}_s + R_s^{(4)} Q_3 \left( \vec{\tau}_s + R_s^{(3)} Q_2 \left( \vec{\tau}_s + R_s^{(2)} Q_1 \vec{\tau}_s \right) \right) \qquad (10.4.4)$$

where the vectors and operators are defined as

$$\vec{\tau}_s = \tau_s \hat{r}_s \qquad (10.4.5)$$

$$Q_n = 2(\hat{q}_n \hat{q}_n \cdot) - 1 \qquad (10.4.6)$$

$$R_s^{(n)} = (\hat{r}_s \hat{r}_s \cdot) + \sin\varphi_n\, (\hat{r}_s \times) - \cos\varphi_n\, (\hat{r}_s \times)(\hat{r}_s \times) \qquad (10.4.7)$$

and where $\tau_s$ is the stage delay, $\varphi_n$ is the birefringent phase of the $n$th segment, and $\hat{q}_n$ is the direction in Stokes space to which the $n$th half-wave waveplate is oriented. In particular, a physical rotation of a half-wave waveplate by angle $\theta/2$ corresponds to a rotation in Stokes space by $2\theta$. Also, Eq. (10.4.4) explicitly separates the first and third mode mixers.
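The operators in (10.4.4) to (10.4.7) act on three-component Stokes vectors and can be applied directly. The following is a minimal sketch, not the author's implementation: it assumes the stage axis $\hat{r}_s$ lies along the first Stokes axis and that each waveplate direction $\hat{q}_n$ lies in the equatorial plane at angle $\theta_n$ from $\hat{r}_s$ (the coplanar case). All function names are hypothetical.

```python
import math

R_HAT = (1.0, 0.0, 0.0)  # assumed common birefringent axis of every delay stage


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(c, a):
    return tuple(c * x for x in a)


def apply_q(theta, v):
    """Half-wave plate operator (10.4.6): 2 q (q . v) - v."""
    q = (math.cos(theta), math.sin(theta), 0.0)
    return add(scale(2.0 * dot(q, v), q), scale(-1.0, v))


def apply_r(phi, v):
    """Stage rotation (10.4.7): r (r . v) + sin(phi) r x v - cos(phi) r x (r x v)."""
    rxv = cross(R_HAT, v)
    rxrxv = cross(R_HAT, rxv)
    out = scale(dot(R_HAT, v), R_HAT)
    out = add(out, scale(math.sin(phi), rxv))
    return add(out, scale(-math.cos(phi), rxrxv))


def pmd_vector(tau_s, thetas, phis):
    """Cumulative PMD vector (10.4.4).

    thetas = (theta_1, theta_2, theta_3); phis = (phi_2, phi_3, phi_4).
    """
    stage = scale(tau_s, R_HAT)
    tau = stage
    for theta, phi in zip(thetas, phis):
        tau = add(stage, apply_r(phi, apply_q(theta, tau)))
    return tau


# Band-center example with the outer mode mixers tied (theta_3 = theta_1).
TH1, TH2 = 0.3, 0.9
tau = pmd_vector(1.0, (TH1, TH2, TH1), (0.0, 0.0, 0.0))
dgd = math.sqrt(dot(tau, tau))
print(dgd, 4.0 * abs(math.cos(TH1) * math.cos(TH2 - TH1)))
```

At zero birefringent phase the rotation operator reduces to the identity, and the length of the resulting vector agrees with the band-center DGD magnitude $4\tau_s|\cos\theta_1\cos(\theta_2-\theta_1)|$ stated in (10.4.16).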
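The magnitude-squared DGD spectrum is

$$\begin{aligned}
\frac{\vec{\tau} \cdot \vec{\tau}}{\tau_s^2} = 4 &+ 2\,\hat{r}_s \cdot Q_3 \hat{r}_s + 2\,\hat{r}_s \cdot Q_2 \hat{r}_s + 2\,\hat{r}_s \cdot Q_1 \hat{r}_s + 2\,\hat{r}_s \cdot Q_3 R_s^{(3)} Q_2 \hat{r}_s \\
&+ 2\,\hat{r}_s \cdot Q_2 R_s^{(2)} Q_1 \hat{r}_s + 2\,\hat{r}_s \cdot Q_3 R_s^{(3)} Q_2 R_s^{(2)} Q_1 \hat{r}_s
\end{aligned} \qquad (10.4.8)$$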

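Under the coplanar assumption, where all birefringent axes lie on the equatorial plane (cf. §8.2.7), the vector products are expanded as

$$\hat{r}_s \cdot Q_j \hat{r}_s = \cos 2\theta_j \qquad (10.4.9)$$

$$\hat{r}_s \cdot Q_k R_s^{(k)} Q_j \hat{r}_s = \cos 2\theta_k \cos 2\theta_j + \sin 2\theta_k \sin 2\theta_j \cos\varphi_k \qquad (10.4.10)$$

and

$$\begin{aligned}
\hat{r}_s \cdot Q_l R_s^{(l)} Q_k R_s^{(k)} Q_j \hat{r}_s = {} & \cos 2\theta_l \cos 2\theta_k \cos 2\theta_j \\
& + \sin 2\theta_l \sin 2\theta_k \cos 2\theta_j \cos\varphi_l \\
& + \cos 2\theta_l \sin 2\theta_k \sin 2\theta_j \cos\varphi_k \\
& - \sin 2\theta_l \cos 2\theta_k \sin 2\theta_j \cos\varphi_l \cos\varphi_k \\
& + \sin 2\theta_l \sin 2\theta_j \sin\varphi_l \sin\varphi_k
\end{aligned} \qquad (10.4.11)$$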

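Two simplifications are now used to reduce (10.4.8) to a tractable expression. The birefringent phases of the second and third stages are split into common and differential parts, with the following definition:

$$\varphi_2 = \varphi_s + \delta\varphi, \quad \text{and} \quad \varphi_3 = \varphi_s - \delta\varphi \qquad (10.4.12)$$

where $\varphi_s = \omega\tau_s$. Also, the first and third mode mixers are tied such that $\theta_3 = \theta_1$. With these conditions, the magnitude-squared DGD spectrum takes the form

$$\begin{aligned}
\vec{\tau} \cdot \vec{\tau} = 16\tau_s^2 \cos^2\theta_1 \Big[ & \cos^2(\theta_2 - \theta_1) \\
& - 2 \sin\theta_1 \cos\theta_1 \sin\theta_2 \cos\theta_2 \,(1 - \cos\varphi_s \cos\delta\varphi) \\
& + \tfrac{1}{2} \sin^2\theta_1 \cos^2\theta_2 \,(1 - \cos 2\varphi_s) \\
& - \tfrac{1}{2} \sin^2\theta_1 \sin^2\theta_2 \,(1 - \cos 2\delta\varphi) \Big]
\end{aligned} \qquad (10.4.13)$$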

Several observations are made about (10.4.13). First, there are only three Fourier components: DC, $\cos\varphi_s$, and $\cos 2\varphi_s$. This spectrum is harmonic. Second, the oscillatory components appear in the expression only when mode mixing angle $\theta_1$ is not zero. When $\theta_1 = 0$ the system reduces to a two-stage concatenation, which is wavelength flat. Third, as a side-effect of tying the first and third stages together, birefringent phase error $\delta\varphi$ does not differentially phase-shift the fundamental and second-order Fourier components but instead diminishes the amplitudes of the constant and fundamental Fourier components.
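These observations can be checked numerically from the closed form. The sketch below is an illustration under stated assumptions: it takes (10.4.13) with the common factor $16\tau_s^2\cos^2\theta_1$ outside a bracket and with the last term written as $1 - \cos 2\delta\varphi$, which is the form obtained by expanding (10.4.8) with (10.4.9) to (10.4.12). The helper names are hypothetical.

```python
import math


def dgd_squared(tau_s, th1, th2, phi_s, dphi):
    """Magnitude-squared DGD from the closed form (10.4.13), as assumed above."""
    bracket = (
        math.cos(th2 - th1) ** 2
        - 2.0 * math.sin(th1) * math.cos(th1) * math.sin(th2) * math.cos(th2)
        * (1.0 - math.cos(phi_s) * math.cos(dphi))
        + 0.5 * math.sin(th1) ** 2 * math.cos(th2) ** 2 * (1.0 - math.cos(2.0 * phi_s))
        - 0.5 * math.sin(th1) ** 2 * math.sin(th2) ** 2 * (1.0 - math.cos(2.0 * dphi))
    )
    return 16.0 * tau_s ** 2 * math.cos(th1) ** 2 * bracket


def cosine_amplitude(spectrum, harmonic, samples=720):
    """Amplitude of cos(harmonic * phi_s) in a spectrum sampled over one period."""
    total = 0.0
    for i in range(samples):
        phi = 2.0 * math.pi * i / samples
        total += spectrum(phi) * math.cos(harmonic * phi)
    return 2.0 * total / samples


QUARTER = math.pi / 4  # example mixing angle in the convention of (10.4.13)

# Observation 1: no third harmonic is present.
third = cosine_amplitude(lambda p: dgd_squared(1.0, QUARTER, QUARTER, p, 0.0), 3)

# Observation 2: with theta_1 = 0 the spectrum does not depend on phi_s.
flat = [dgd_squared(1.0, 0.0, 0.6, p, 0.0) for p in (0.0, 1.0, 2.0, 3.0)]

# Observation 3: a differential phase reduces the fundamental amplitude.
fund_aligned = cosine_amplitude(lambda p: dgd_squared(1.0, QUARTER, QUARTER, p, 0.0), 1)
fund_crossed = cosine_amplitude(lambda p: dgd_squared(1.0, QUARTER, QUARTER, p, math.pi / 2), 1)
print(third, flat, fund_aligned, fund_crossed)
```

In this convention the fundamental disappears entirely when the differential phase reaches 90 degrees, while the position of the remaining tones is unchanged.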
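Explicit calculation of the $\vec{\tau}_\omega \cdot \vec{\tau}_\omega$ spectrum is difficult because of so many higher-order harmonics. However, the vector expression for $\vec{\tau}_\omega$ can be written and is readily evaluated at $\varphi_s = 0$. The recursive $\vec{\tau}_\omega$ sequence out to four stages is

$$\begin{aligned}
\vec{\tau}(1) &= \vec{\tau}_s & \vec{\tau}_\omega(1) &= 0 \\
\vec{\tau}(2) &= \vec{\tau}_s + R_s^{(2)} Q_1 \vec{\tau}(1) & \vec{\tau}_\omega(2) &= \vec{\tau}_s \times \vec{\tau}(2) \\
\vec{\tau}(3) &= \vec{\tau}_s + R_s^{(3)} Q_2 \vec{\tau}(2) & \vec{\tau}_\omega(3) &= \vec{\tau}_s \times \vec{\tau}(3) + R_s^{(3)} Q_2 \vec{\tau}_\omega(2) \\
\vec{\tau}(4) &= \vec{\tau}_s + R_s^{(4)} Q_1 \vec{\tau}(3) & \vec{\tau}_\omega(4) &= \vec{\tau}_s \times \vec{\tau}(4) + R_s^{(4)} Q_1 \vec{\tau}_\omega(3)
\end{aligned} \qquad (10.4.14)$$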

[Figure 10.21: two contour plots over $\theta_1$ (deg) versus $\theta_2$ (deg, 0 to 180). a) DGD contours. b) SOPMD contours.]

Fig. 10.21. At $\varphi_s = 0$, contours of constant $\tau$ and $\tau_\omega$ scaled to $\tau_s = 1$. a) Constant $\tau$ contours, (10.4.16). Unshaded region is single-valued. b) Constant $\tau_\omega$ contours, (10.4.17). Region bound by bold line is single-valued. Note the existence of contour $\tau_\omega = 0$.

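At band-center, $\varphi_s = 0$, so $R_s = I$. The components of $\vec{\tau}_\omega(4)$ at this frequency are

$$\begin{aligned}
\tau_{\omega,1} &= 0 \\
\tau_{\omega,2} &= 0 \\
\tau_{\omega,3} &= 2\tau_s^2 \left( \sin 2(\theta_2 - \theta_1)(1 + \cos 2\theta_1) - \sin 2\theta_1 \right)
\end{aligned} \qquad (10.4.15)$$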

The PMD coordinate $(\tau, \tau_\omega)$ for ECHO is defined at the calibration frequency. Since the instrument is calibrated to zero residual birefringent phase at this frequency, the precession angle is $\varphi_s = 0$. Taking the magnitude of the respective $\vec{\tau}$ and $\vec{\tau}_\omega$ vectors, governed by (10.4.13) and (10.4.15), makes

$$|\vec{\tau}| = 4\tau_s \left| \cos\theta_1 \cos(\theta_2 - \theta_1) \right| \qquad (10.4.16)$$

$$|\vec{\tau}_\omega| = 2\tau_s^2 \left| \sin 2(\theta_2 - \theta_1)(1 + \cos 2\theta_1) - \sin 2\theta_1 \right| \qquad (10.4.17)$$

These two equations map the independent variables to the dependent variables: $(\theta_1, \theta_2) \rightarrow (\tau, \tau_\omega)$. At the calibration frequency the first- and second-order PMD magnitudes are independent.
Figure 10.21 shows contours of constant $\tau$ and $\tau_\omega$. In Fig. 10.21(a) contours of constant $\tau$ are plotted as a function of $(\theta_1, \theta_2)$, where the plot is scaled to $\tau_s = 1$. The magnitude is bound between $0 \le \tau \le 4$. The unshaded area designates a region of monotonic, single-valued mapping of $(\theta_1, \theta_2) \rightarrow \tau$. In Fig. 10.21(b) contours of constant $\tau_\omega$ are plotted as a function of $(\theta_1, \theta_2)$, similar to Fig. 10.21(a). The magnitude is bound between $0 \le \tau_\omega \le 4$. The special contour $\tau_\omega = 0$ exists in the parametric space, and was independently discovered and reported by [84]. The contour delineated by the dark solid line designates a region of monotonic, single-valued mapping of $(\theta_1, \theta_2) \rightarrow \tau_\omega$.

[Figure 10.22: overlaid DGD and SOPMD contours over $\theta_1$ (deg) versus $\theta_2$ (deg), with inset vector diagrams at selected coordinates.]

Fig. 10.22. At $\varphi_s = 0$, overlay of constant $\tau$ and $\tau_\omega$ contours within the single-valued region. There are two degrees of freedom, $\theta_1$ and $\theta_2$, and two dependent variables, $\tau$ and $\tau_\omega$. At $\varphi_s = 0$ first- and second-order PMD are independent quantities. Several vector diagrams indicate interesting coordinates.

Figure 10.22 combines the $(\tau, \tau_\omega)$ contours in an area in which both coordinates are single-valued. Within this area, the mapping $(\tau, \tau_\omega) \rightarrow (\theta_1, \theta_2)$ is unique. Numerical inversion of (10.4.16) and (10.4.17) gives $(\theta_1, \theta_2)$ for a specified $(\tau, \tau_\omega)$.
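One way such a numerical inversion might be carried out is sketched below. It is an assumption-laden illustration, not the method used for the instrument: it restricts the search to the branch $\theta_2 \ge \theta_1 \ge 0$, uses (10.4.16) to express $\theta_2$ in terms of $\theta_1$, and then bisects on $\theta_1$ until the signed form of (10.4.17) matches the requested SOPMD. Function names are hypothetical.

```python
import math


def forward(tau_s, th1, th2):
    """Band-center magnitudes (10.4.16) and (10.4.17)."""
    dgd = 4.0 * tau_s * abs(math.cos(th1) * math.cos(th2 - th1))
    sopmd = 2.0 * tau_s ** 2 * abs(
        math.sin(2.0 * (th2 - th1)) * (1.0 + math.cos(2.0 * th1)) - math.sin(2.0 * th1)
    )
    return dgd, sopmd


def invert(tau_s, dgd, sopmd, iterations=80):
    """Find (theta_1, theta_2) on the branch theta_2 >= theta_1 >= 0 by bisection."""
    ratio = dgd / (4.0 * tau_s)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("requested DGD is outside the range 0 < DGD <= 4 tau_s")

    def th2_of(th1):
        # Enforce (10.4.16); the clamp guards against rounding just above 1.
        return th1 + math.acos(min(1.0, ratio / math.cos(th1)))

    def signed_sopmd(th1):
        th2 = th2_of(th1)
        return 2.0 * tau_s ** 2 * (
            math.sin(2.0 * (th2 - th1)) * (1.0 + math.cos(2.0 * th1)) - math.sin(2.0 * th1)
        )

    lo, hi = 0.0, math.acos(ratio)
    if not signed_sopmd(hi) <= sopmd <= signed_sopmd(lo):
        raise ValueError("requested SOPMD is not reachable on this branch")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if signed_sopmd(mid) > sopmd:
            lo = mid
        else:
            hi = mid
    th1 = 0.5 * (lo + hi)
    return th1, th2_of(th1)


# Example address (35 ps, 111 ps^2) with a 10 ps stage delay.
theta_1, theta_2 = invert(10.0, 35.0, 111.0)
check_dgd, check_sopmd = forward(10.0, theta_1, theta_2)
print(math.degrees(theta_1), math.degrees(theta_2), check_dgd, check_sopmd)
```

Reducing the problem to one dimension keeps the search bracketed, so it cannot wander outside the region in which the requested DGD is satisfied exactly.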
There are interesting special cases on the contour map of Fig. 10.22; these are treated with the assistance of the vector diagrams of Fig. 10.23. Figure 10.23(a) shows the general case of four equal-length component PMD vectors where the mode mixing between the outer two-stage pairs is equal, i.e. $\theta_1 = \theta_3$. When $\theta_1 = \theta_2 = 0$, the vectors are aligned and create the maximum DGD of $\tau = 4\tau_s$ with concurrent $\tau_\omega = 0$ (Fig. 10.23(b)). When $\theta_1 = 0$ the four-stage reduces to a symmetric two-stage. The two-stage maximum SOPMD occurs when $\theta_2 = \pi/4$: $\tau_\omega = (2\tau_s)^2$ with a concurrent $\tau = 4\tau_s/\sqrt{2}$ (Fig. 10.23(c)). The abscissa on Fig. 10.22 shows the locus of possible $(\tau, \tau_\omega)$ coordinates for the two-stage case. $\tau_\omega = 0$ is only possible at $\tau = 0$ and $\tau = 4\tau_s$. The inclusion of $\theta_1$ as a free variable adds a necessary degree of freedom to trace the $\tau_\omega = 0$ contour over the entire range $0 \le \tau \le 4\tau_s$. Outside of the indicated monotonic region lies the point of maximum PDCD; such a point is illustrated in Fig. 10.23(d). When the four vectors form a square in Stokes space, the DGD is zero and the depolarization is also zero. The PDCD, however, is generated by the combined differential motions of the precession of $\vec{\tau}_4$ about $\vec{\tau}_3$ and the precession of these two vectors about $\vec{\tau}_2$.
[Figure 10.23: four vector diagrams, a) to d), each built from four component vectors of length $\tau_s$; panel a) marks the angles $\theta_1$ and $\theta_2$, and panels c) and d) mark $\tau_\omega$.]

Fig. 10.23. Four-component vector diagrams. a) Arbitrary configuration but with first and third mode mixers tied. b) Maximum DGD of $4\tau_s$; all vectors align. c) Maximum two-stage SOPMD of $(2\tau_s)^2$; right angle between first and last pair of stages. d) Maximum PDCD of $2\tau_s^2$; all vectors are at right angles.

Four equations summarize the parameters of an ECHO source. These parameters include the extrema points described above as well as a measure of the source bandwidth:

$$\tau_{max} = 4\tau_s \qquad (10.4.18)$$

$$\tau_{\omega,max} = \eta\,(2\tau_s)^2 \qquad (10.4.19)$$

$$|\tau|_{\omega,max} = 2\tau_s^2 \qquad (10.4.20)$$

$$\mathrm{FSR} = 1/\tau_s \qquad (10.4.21)$$

where $\eta$ is an enhancement factor due to the combined SOPMD effects of depolarization and PDCD. For a two-stage source $\eta = 1$, but the numeric calculation of the four-stage source shows $\eta \approx 1.09$. It is not known if the enhancement factor can be derived analytically.
Equations (10.4.18) to (10.4.21) show the inherent tradeoff for a four-stage source, where maximum PMD values trade off against the free-spectral range. The bandwidth of the PMD spectrum should be larger than the data channel bandwidth, which sets a maximum on $\tau_s$. At the same time the maximum DGD and SOPMD should be representative of what a data channel would likely experience when run on a fiber with a mean PMD of $\langle\tau\rangle$. As with the PMDS-type sources, a maximum delay of $|\vec{\tau}|_{max} = 1.2T$ is a reasonable tradeoff for NRZ transmission formats. However, this bandwidth is not suitable for RZ formats; this is discussed below.
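The tradeoff can be made concrete with a short calculation. The sketch below takes the 1.2T rule and the four summary relations at face value; the function name, the dictionary keys, and the default enhancement factor of 1 (the two-stage value) are illustrative assumptions.

```python
def echo_design(bit_rate_gbps, max_delay_in_bit_periods=1.2, enhancement=1.0):
    """Stage delay and extrema of a four-stage source sized for a given bit rate."""
    bit_period_ps = 1000.0 / bit_rate_gbps
    tau_s_ps = max_delay_in_bit_periods * bit_period_ps / 4.0  # maximum DGD = 4 tau_s
    return {
        "tau_s_ps": tau_s_ps,
        "max_dgd_ps": 4.0 * tau_s_ps,
        "max_sopmd_ps2": enhancement * (2.0 * tau_s_ps) ** 2,
        "max_pdcd_ps2": 2.0 * tau_s_ps ** 2,
        "fsr_ghz": 1000.0 / tau_s_ps,  # 1 / tau_s, with tau_s in ps
    }


design = echo_design(10.0)
for name, value in design.items():
    print(f"{name}: {value:.2f}")
```

For a 10 Gb/s channel the bit period is 100 ps, which gives a 30 ps stage delay, a 120 ps maximum DGD, and a free-spectral range of about 33.3 GHz. Raising the stage delay buys larger PMD values but shrinks the free-spectral range in proportion.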
Finally, by setting $\theta_1 = 0$, ECHO reverts to a symmetric two-stage source:

$$|\vec{\tau}| = 4\tau_s \left| \cos\theta_2 \right|, \qquad |\vec{\tau}_\omega| = 4\tau_s^2 \left| \sin 2\theta_2 \right| \qquad (10.4.22)$$

Performance results of various ECHO implementations are available in the technical literature [24, 26, 27].

[Figure 10.24: DGD (ps) and SOPMD (ps$^2$) versus relative frequency (GHz, -100 to 100) for a two-stage source (a) and a four-stage source (b), with the channel bandwidth marked; the DGD is constant at 35 ps at center, and the SOPMD is marked "reduced" at center and "enhanced" away from center. Lower panels: PSP spectra on the Poincaré sphere ($S_1$, $S_2$, $S_3$) for the two-stage (a) and four-stage (b) cases.]

Fig. 10.24. Three calculated spectra pairs, $\tau(f - f_o)$ and $\tau_\omega(f - f_o)$, from ECHO addresses (35, 338), (35, 248), and (35, 111). The calculation uses $\tau_s = 10$ ps/stg. Observations: 1) $\tau$ is constant at 35 ps for each spectrum; 2) $\tau_\omega$ is depressed at $\varphi_s = 0$ for the latter two spectra; and 3) $\tau_\omega$ is enhanced at $\varphi_s = \pi$ for the latter two spectra.

Independent Control of $\tau$ and $\tau_\omega$

The governing equations (10.4.16) and (10.4.17) show that at the calibration frequency ($\varphi_s = 0$), or any integral multiple of the FSR, $\tau$ and $\tau_\omega$ are independent. Figure 10.24 illustrates calculated $\tau$ and $\tau_\omega$ spectra for three different states: (35, 338), (35, 248), and (35, 111). Each coordinate pair corresponds to $(\tau, \tau_\omega)$ in (ps, ps$^2$). Three observations are apparent. First, at $\varphi_s = 0$ the DGD is constant at $\tau = 35$ ps, as predicted by the state address. Second, at the same relative frequency, the $\tau_\omega$ values are progressively depressed from the two-stage case, that case corresponding to (35, 338). Third, the $\tau_\omega$ value at FSR/2 from center is enhanced with respect to the two-stage case, the value being the combined result of depolarization and non-zero PDCD.
The reason for the diminution of the SOPMD magnitude at center frequency is shown in the lower Poincaré plots. PMD produced by two stages has a PSP spectrum that traces circles. The angular rate of change with frequency is constant across the FSR, so the magnitude-SOPMD is constant. For the four-stage case, the PSP spectrum slows along the small arc and speeds along the large arc. On the small arc the pointing direction changes slowly, even for the same DGD. In the limit of zero SOPMD, the PSP pirouettes about a single point and the radius of the small arc is zero.
Figure 10.25 illustrates exemplar DGD, SOPMD, and PDCD spectra calculated for a 10 Gb/s instrument at two states: (30, 0) and (85, 1400). In Fig. 10.25(a), the (30, 0) state is shown because of the interesting property that at $\varphi_s = 0$ the $\tau_\omega$ is zero while $\tau$ is finite. In a small frequency band about $\varphi_s = 0$, $\vec{\tau}$ pirouettes about a stationary position in Stokes space. Outside of this band $\vec{\tau}$ conducts its depolarizing motion. In Fig. 10.25(b), the (85, 1400) state is shown because the depolarization is zero at $\varphi_s = 0$ while the PDCD is finite. This is an important state where PDCD dominates. In fact, it is the PDCD that dominates the SOPMD as the state is virtually devoid of depolarization; only the length of the DGD vector changes in a small frequency band about zero phase. The component PMD vector orientation is similar to that illustrated in Fig. 10.23(d).

[Figure 10.25: DGD (ps), SOPMD (ps$^2$), and PDCD (ps$^2$) versus frequency (THz, 194.88 to 194.98) for a) state 30/0 and b) state 85/1400.]

Fig. 10.25. Calculated scalar PMD spectra, $\tau_s = 10$ ps. a) State (30, 0). At approximately 194.925 THz one observes $\tau = 30$ ps and $\tau_\omega = 0$ ps$^2$, the state setting. b) State (85, 1400). In contrast to (a), the DGD value touches zero with large simultaneous SOPMD.
The Role of Birefringent Phase
Birefringent phase plays a central role in determining the shape of non-flat
PMD spectra and is central to the construction of programmable PMD sources.

[Figure 10.26: DGD (ps, 0 to 30) versus relative frequency (GHz, -100 to 100); six frequency-shifted spectra.]

Fig. 10.26. Continuous frequency shifting. Six frequency-shifted spectra for $\theta_1 = \theta_2 = \pi/2$, $\tau_s = 10$ ps, and common-mode phase control. The spectral shape remains intact.

[Figure 10.27: DGD (ps, 0 to 30) versus relative frequency (GHz, -100 to 100); five spectra labeled $\delta\varphi$ = 0°, 11.25°, 22.5°, 33.75°, and 45°.]

Fig. 10.27. Birefringent phase changes the DGD shape. Five spectra for $\theta_1 = \theta_2 = \pi/2$, $\tau_s = 10$ ps, and differential-mode phase control. The birefringent phase plays a central role in the spectral shape, which is predicted by (10.4.13).

This section demonstrates the criticality of birefringent phase using two examples: common and differential control of the birefringent phase. The Evans phase shifters in the second and third stages are used primarily to drive the concatenation into coherence. Once achieved, the phase controllers can be rotated simultaneously by the same angle. The result from this common-mode rotation is a frequency shift of the PMD spectrum [21]. Alternatively, the phase controllers can be rotated by equal and opposite amounts. The result from this differential-mode rotation is, for the highly symmetric ECHO, a change in the shape of the spectra but with zero movement of the Fourier phase of the constituent components.

Figure 10.26 shows the frequency shift of a DGD spectrum, calculated using common-mode phase control and with various phase increments. The Stokes angles of the mode mixers were all set to 90°. A detailed analysis of the Fourier components is available in [27].
In comparison to common-mode phase shift, the profound DGD shape change due to differential-mode phase control is shown in Fig. 10.27. Here the differential phase is varied between 0 and 180° in Stokes space; the Stokes angles of the mode mixers were all set to 90°. Unlike common-mode control, there is no frequency shift of the spectrum. Rather, the shape changes in place. The shape change is predicted by (10.4.13). In this equation the amplitude of the fundamental component vanishes when $\delta\varphi = 90°$. Likewise, the amplitude of the DC component varies. In a fiber, differential phase change will change both the shape of the spectrum and its frequency centering.
Addressable PMD Space
ECHO instruments can continuously span first- and second-order PMD space within the envelope dictated by (10.4.22) and delineated by the outer-most contour in Fig. 10.17. However, this statement applies only to the frequency at which $\varphi_s = 0$. If one includes all frequencies across all the possible spectra then a much wider addressable space is available. Figure 10.28 shows how to construct an envelope of the total addressable space. The figure is drawn with respect to a 10 Gb/s instrument but can be scaled to any other data rate. The dotted line shows the single contour for a symmetric two-stage source, derived from (10.4.22). Referring to the scalar spectra for the (30, 0) state in Fig. 10.25, all values for state $(\tau, \tau_\omega)$ are plotted parametrically on Fig. 10.28 along contour (a). Likewise, all values for state (85, 1400) are plotted along contour (b). As another example, the wavelength-flat state (100, 3316) is shown as just one point since there is no frequency dependence of that spectrum. The mapped function is written

$$(\theta_1, \theta_2, \varphi) \rightarrow (\tau, \tau_\omega) \qquad (10.4.23)$$

where, with $\varphi$ being the angle of the Evans phase shifters, the left-hand side is a coordinate of physical parameters and the right-hand side is a coordinate of PMD parameters.

[Figure 10.28: SOPMD (ps$^2$, 0 to 4000) versus DGD (ps, 0 to 120), showing the two-stage contour and the parametric contours (a), (b), and (c) for states 30/0, 85/1400, and 100/3316.]

Fig. 10.28. State space for first- and second-order PMD magnitudes, scaled for $\tau_s = 30.0$ ps. Dashed line delineates the two-stage contour. Contour (a) is a parametric plot of the scalar spectrum at address (30, 0). Likewise, contours (b) and (c) are parametric plots at addresses (85, 1400) and (100, 3316), respectively.

[Figure 10.29: SOPMD (ps$^2$, 0 to 4500) versus DGD (ps, 0 to 120) with the fiber JPDF overlaid. a) ECHO boundary with continuous states. b) PMDS contours.]

Fig. 10.29. Comparison of ECHO and PMDS addressable space in relation to the fiber JPDF for $\langle\tau\rangle = 33$ ps. a) Addressable region for a 10 Gb/s ECHO lies below the boundary line and is continuous on the plane. b) Addressable region for a 10 Gb/s PMDS. The addressable states lie along lines and do not cover the entire space. Also, the low-DGD high-SOPMD wedge is not covered at all.
Following this approach for all combinations of angles $(\theta_1, \theta_2, \varphi)$, where the four-stage concatenation remains coherent ($\varphi_3 = \varphi_2$) and where the first and third mode mixers are tied, all possible PMD addresses can be calculated. Figure 10.29(a) shows the results of this calculation. The region below the boundary shows the addressable space of ECHO. The states are continuous; there are no holes in this two-dimensional surface. As a point of comparison, the addressable space for a 10 Gb/s PMDS is shown in Figure 10.29(b).
A richer mapping of physical to PMD-specific coordinates is

$$(\theta_1, \theta_2, \varphi) \rightarrow (\tau, \tau_\omega, |\tau|_\omega) \qquad (10.4.24)$$

where $|\tau|_\omega$ is the PDCD. Indeed there are three independent input variables, so one should expect three dependent variables. However, the inverse mapping of (10.4.24) is not one-to-one. One important inverse map is

$$(|\tau|_\omega;\, \tau, \tau_\omega) \rightarrow (\theta_1, \theta_2, \varphi) \qquad (10.4.25)$$

where $\tau$ and $\tau_\omega$ remain fixed. This inverse map explores the balance between PDCD and depolarization at a fixed PMD coordinate $(\tau, \tau_\omega)$. It would be very interesting to find how receiver sensitivity changes across the balance of second-order components.


Instrument Bandwidth
While it is significant that the ECHO instrument can smoothly cover a wide region of first- and second-order PMD space, this property alone is only a partial description and can be misleading. What is missing is a statement of the instrument's free-spectral range and its relation to the channel bandwidth. Figure 10.30(a) shows a spectral overlay of a 10 Gb/s ECHO DGD spectrum with a 10 Gb/s non-return-to-zero (NRZ) data channel bandwidth. The FSR of the source is 33.33 GHz, while the first channel null is at 10 GHz. By design the FSR is larger than the channel bandwidth.
Figure 10.30(b) shows a similar overlay with the same instrument but with a 12.7 Gb/s 33% duty-cycle return-to-zero (RZ) pulse bandwidth. The RZ channel bandwidth exceeds the FSR of the instrument. The built-in periodicity of the instrument imparts an artificial aliasing that would likely not exist in a real transmission system. Use of a 10 Gb/s ECHO source to test a 40 Gb/s data link is pointless because the channel bandwidth is many times the FSR of the instrument, even though the 10 Gb/s instrument can reach suitably low first- and second-order PMD values.
While it is uneconomical to build one instrument to test both 10 Gb/s and 40 Gb/s data rates, a single source can be designed to accommodate NRZ and RZ transmission formats. Figure 10.30(c) illustrates one possibility. The center two vectors (all normalized to length 4) are split in a 3 : 1 ratio and the mode mixers between these stages are either aligned or crossed. When aligned, the four equal-length vector concatenation is recovered. When crossed, a 4 : 2 : 2 : 4 vector grouping appears. In this case the FSR is doubled. The FSR of the modified instrument can in this way "breathe" between a tight FSR with a high PMD region and a looser FSR with a lower PMD region.

10.5 Receiver Performance Validation


There are two categories of information an operator of an optical communications link would like to have regarding PMD-induced impairments: what is the total outage probability (TOP) of the system, and what are the mean outage rate ($R_{out}$) and the mean outage duration ($T_{out}$). Total outage probability is a static estimate of the total number of severely-errored seconds (SES) a system will suffer over a period of time. Mean outage rate and duration are estimates of the dynamic behavior of the system under impairment. Since most links are operated in protected configurations, a few, or even one, severely-errored seconds may be enough to switch traffic onto the backup line. For the same number of SES a year, the question is whether the protection switch is thrown once and the full outage seconds occur, or whether the protection switch is thrown frequently for short time intervals. Frequent switching is deleterious to smooth system operation.

[Figure 10.30: DGD spectra versus frequency with vector diagrams. a) 4:4:4:4 DGD spectrum with NRZ bandwidth. b) 4:4:4:4 DGD spectrum with RZ bandwidth. c) 4:2:2:4 DGD spectrum with RZ bandwidth.]

Fig. 10.30. Relation between instrument spectral periodicity and channel bandwidth. a-b) Overlay of a 10 Gb/s DGD spectrum with 10 Gb/s NRZ and 12.7 Gb/s RZ bandwidths. In both cases the four component vector lengths are the same. c) Wide-FSR DGD spectrum with RZ bandwidth. Here the middle two stages are split in a 3 : 1 ratio. When the vector of length 1 is folded back onto the vector of length 3, the net middle vector length is 2.

Static and dynamic estimates of PMD-induced system impairments can be made using a receiver map of a Tx/Rx pair and knowledge of the PMD statistics. Early estimates were derived by considering DGD alone, although there was cognizance of impairments due to SOPMD [58, 59]. Bülow subsequently showed the importance of including second-order PMD effects [5]. This philosophy is consistent with the viewpoint of this text: at least first- and second-order PMD must be considered to derive a meaningful outage estimate.
A Tx/Rx pair has a certain PMD tolerance that is independent of the fiber-optic line on which it operates. The receiver map isolates and quantifies this tolerance. A simple test setup to generate a receiver map is illustrated in Fig. 10.31. A single channel is driven by a bit-error-rate test setup (BERTS) and transmitted through a polarization scrambler, a programmable PMD source, and a fiber spool to introduce chromatic dispersion (CD). The channel is then noise-loaded prior to detection. In this "all-states" method, the bit-error rate (BER) must be averaged over a time interval such that the scrambler covers the entire Poincaré sphere (typically 5 min using an Agilent 11896A polarization controller). A BER is recorded for each coordinate in PMD space and a contour plot of BER versus PMD is generated. Two such contour plots are illustrated in Fig. 10.32 [19, 25]. The contour plot is called the receiver map.
The receiver map provides qualitative and quantitative information about the Tx/Rx pair and its expected performance. Comparison of Figs. 10.32(a) and (b) shows that the latter receiver is more tolerant of PMD than the former. Such behavior is usually found when a PMD compensator is added before the receiver. Other receiver comparisons have shown that some receivers are more tolerant of SOPMD than others. Finally, the receiver maps should be generated parametrically over the range of expected chromatic dispersion.

[Figure 10.31: block diagram. Tx driven by BERTS, followed by polarization scrambling, PMDS, CD fiber spool, and a noise-loading stage (VOA, EDFA, TOF, VOA) ahead of the Rx.]

Fig. 10.31. Simple test configuration to generate a receiver map of the Tx/Rx pair. The bit-error rate is measured across a large number of PMD coordinates addressed by the PMDS. The channel is noise-loaded and chromatic dispersion can be added. For each state of the programmable PMD source, the bit-error rate (BER) is measured as an average over a uniform distribution of input polarization states. This is the so-called all-states method. The channel is noise-loaded using the two variable optical attenuators (VOA), an erbium amplifier (EDFA), and a tunable optical filter (TOF). Chromatic dispersion (CD) can be added parametrically to the receiver map.
Total outage probability of a Tx/Rx pair has to include information on the mean PMD $\langle\tau\rangle$ of the fiber-optic link. The probability density for first- and second-order PMD is determined by the JPDF, which scales as $\langle\tau\rangle$. Therefore, the receiver map is compared to the JPDF as $\langle\tau\rangle$ is varied across the expected range in the physical plant. Two estimates can be generated: the expected error rate (E[BER]) and the total outage probability. The respective expressions are:

$$E[\mathrm{BER}](\langle\tau\rangle) = \sum_{\tau,\tau_\omega} \mathrm{BER}(\tau, \tau_\omega)\, P(\tau, \tau_\omega; \langle\tau\rangle) \qquad (10.5.1)$$

$$\mathrm{TOP}(\langle\tau\rangle) = \sum_{\tau,\tau_\omega} I(\mathrm{BER}(\tau, \tau_\omega) > \mathrm{TOL})\, P(\tau, \tau_\omega; \langle\tau\rangle) \qquad (10.5.2)$$

The expected error rate is simply the weighted average of the receiver map with the JPDF ($P(\tau, \tau_\omega; \langle\tau\rangle)$) scaled to a particular mean PMD. TOP is estimated using the JPDF and the indicator function, where $I = 0$ when the BER is below threshold TOL and $I = 1$ above the threshold.
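The two sums in (10.5.1) and (10.5.2) translate directly into code once the receiver map and the JPDF are sampled on the same grid. The sketch below uses a made-up three-point grid purely for illustration; the numbers are hypothetical and are not measurements or values from the text.

```python
def expected_ber_and_top(ber_map, jpdf, tolerance):
    """Evaluate (10.5.1) and (10.5.2) on a shared grid.

    ber_map and jpdf are dicts keyed by (dgd_ps, sopmd_ps2); the jpdf values
    are the probabilities of each grid cell and are assumed to sum to 1.
    """
    expected_ber = sum(ber_map[cell] * prob for cell, prob in jpdf.items())
    top = sum(prob for cell, prob in jpdf.items() if ber_map[cell] > tolerance)
    return expected_ber, top


# Hypothetical receiver map and JPDF for one value of the mean fiber PMD.
ber_map = {(10, 100): 1e-12, (40, 900): 1e-9, (80, 2500): 1e-3}
jpdf = {(10, 100): 0.90, (40, 900): 0.09, (80, 2500): 0.01}

e_ber, top = expected_ber_and_top(ber_map, jpdf, tolerance=1e-6)
print(f"E[BER] = {e_ber:.3e}, TOP = {top:.3e}")
```

Repeating the calculation with the JPDF rescaled to each mean PMD of interest would trace out curves of E[BER] and TOP against $\langle\tau\rangle$.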
For a particular receiver map, E[BER] and TOP can be estimated over a range of $\langle\tau\rangle$. This is illustrated in Fig. 10.33. Since the JPDF is parametric in $\langle\tau\rangle$, TOP can be calculated parametrically. Important considerations are the extent of the JPDF into low-probability regions and the estimation accuracy. The JPDF calculated by brute-force (see page 406) extends to $10^{-4}$, which is not low enough to generate accurate estimates for $\langle\tau\rangle < 15$ ps. The importance-sampling or direction-integration approaches resolve this problem (see page 405). The estimation accuracy depends on the density of the receiver map and coverage of the 2D PMD space. The receiver maps illustrated in Fig. 10.32 could be extended to low DGD, high SOPMD regions using an ECHO source.

[Figure 10.32: two receiver maps, a) and b), plotted as SOPMD (ps$^2$, 0 to 3600) versus DGD (ps, 0 to 90), with contours labeled BER = -3, -6, -9, and -12.]

Fig. 10.32. Illustration of two receiver maps generated by a PMDS. Bands of constant bit-error rate across first- and second-order PMD space indicate the Tx/Rx tolerance to PMD. Combined with the JPDF of PMD, estimates of the total outage probability are made. a) Poor tolerance to PMD: there is a quick roll-off in BER with both first- and second-order PMD. b) Improved tolerance.
Dynamic outage estimates such as $R_{out}$ and $T_{out}$ require a dynamic model of the PMD evolution and, most critically, a time constant with which the evolution takes place. That undersea fiber changes at a much slower rate than aerial fiber is clear. Caponi et al. made the first estimates based on measurements of installed terrestrial fiber [6]. Their technique uses the DGD evolution alone and the classic level-crossing-rate expression for Brownian motion. That expression requires densities for both the DGD and its temporal derivative. Caponi et al. use measured data to estimate the rate of change, and conjecture, after data analysis, that a particular DGD value and its temporal derivative are statistically independent.
Leo et al. extend the Caponi method with the conjecture that the joint density of first- and second-order PMD and its joint temporal derivative are also independent [73, 87]. Their analysis of first- and second-order data supports the Caponi conjecture regarding DGD alone. Rather than using the one-dimensional level-crossing expression, Leo et al. use a receiver map and a two-dimensional indicator function. Therefore, based on measured fiber fluctuations and measured Tx/Rx performance, a simulation of fiber evolution is made in two PMD dimensions to estimate $R_{out}$.

[Figure 10.33: a) Probability (log10) versus mean fiber PMD $\langle\tau\rangle$ (ps), showing TOP, E[BER], and the error floor. b) Outage versus mean fiber PMD $\langle\tau\rangle$ (ps) for uncompensated and compensated Tx/Rx.]

Fig. 10.33. Illustrative estimates of E[BER] and TOP over a range of $\langle\tau\rangle$. The
error floor relates to the minimum measured BER, and can be reduced with longer
averaging times. Outage and probability are related on the abscissa. a) Comparison
of TOP and E[BER]. b) Exemplar (un)compensated Tx/Rx TOP estimates.

The dynamic model of PMD evolution proposed by Leo relies solely on the
JPDF. An alternative is to use a waveplate model of the fiber and rotate the
sections in a random way. The drawback of such an evolving waveplate model
is that most of the PMD states will be about the mean. Importance-sampling
(IS) methods can be used for this problem as well. Earlier, IS was used to
generate the JPDF for first- and second-order PMD. This is a density; what
is needed is a process. Augmentation of the IS method to mimic the temporal
evolution of fiber in a biased manner would be a powerful tool for robust
estimates of the dynamic outage parameters.

References

1. A. J. Barlow, T. G. Arnold, T. L. Voots, and P. J. Clark, "Method and apparatus for high resolution measurement of very low levels of polarization mode dispersion (PMD) in single mode optical fibers and for calibration of PMD measuring instruments," U.S. Patent 5,654,793, Aug. 5, 1997.
2. G. Biondini, W. L. Kath, and C. R. Menyuk, "Non-maxwellian DGD distributions of PMD emulators," in Tech. Dig., Optical Fiber Communications Conference (OFC'01), Anaheim, CA, Mar. 2001, paper ThA5.
3. G. Biondini and W. L. Kath, "Polarization-mode dispersion emulation with maxwellian lengths and importance sampling," IEEE Photonics Technology Letters, vol. 16, no. 3, pp. 789–791, Mar. 2004.
4. C. F. Buhrer, "Higher-order achromatic quarterwave combination plates and tuners," Applied Optics, vol. 27, no. 15, pp. 3166–3169, 1988.
5. H. Bülow, "System outage probability due to first- and second-order PMD," IEEE Photonics Technology Letters, vol. 10, no. 5, pp. 696–698, 1998.
6. R. Caponi, B. Riposati, A. Rossaro, and M. Schiano, "WDM design issues with highly correlated PMD spectra of buried optical cables," in Tech. Dig., Optical Fiber Communications Conference (OFC'02), Anaheim, CA, Mar. 2002, paper ThI5, pp. 453–454.
7. L. Chen, O. Chen, S. Hadjifaradji, and X. Bao, "Polarization-mode dispersion measurement in a system with polarization-dependent loss or gain," IEEE Photonics Technology Letters, vol. 16, no. 1, pp. 206–208, Jan. 2004.
8. R. Chipman and R. Kinnera, "High-order polarization mode dispersion emulator," Optical Engineering, vol. 41, no. 5, pp. 932–937, May 2002.
9. E. Collett, "Automatic determination of the polarization state of nanosecond laser pulses," U.S. Patent 4,158,506, June 19, 1979.
10. F. Corsi, A. Galtarossa, and L. Palmieri, "Polarization mode dispersion characterization of single-mode optical fiber using backscattering technique," Journal of Lightwave Technology, vol. 16, no. 10, pp. 1832–1843, Oct. 1998.
11. F. Corsi, A. Galtarossa, and L. Palmieri, "Beat length characterization based on backscattering analysis in randomly perturbed single-mode fibers," Journal of Lightwave Technology, vol. 17, no. 7, pp. 1172–1178, July 1999.
12. R. M. Craig, "Visualizing the limitations of four-state measurement of PDL and results of a six-state alternative," in Symposium on Optical Fiber Measurements (SOFM 2002), Boulder, Colorado, Sept. 2002, pp. 121–124.
13. R. M. Craig, "Accurate spectral characterization of polarization-dependent loss," Journal of Lightwave Technology, vol. 21, no. 2, pp. 432–437, Feb. 2003.
14. R. M. Craig, S. L. Gilbert, and P. D. Hale, "High-resolution, nonmechanical approach to polarization-dependent transmission measurements," Journal of Lightwave Technology, vol. 16, no. 7, pp. 1285–1294, July 1998.
15. N. Cyr, "Stokes parameter analysis method, the consolidated test method for PMD measurements," in Proceedings of the National Fiber Optical Engineering Conference, Chicago, IL, 1999.
16. N. Cyr, "Method and apparatus for measuring polarization mode dispersion of optical devices," U.S. Patent 6,204,924, Mar. 20, 2001.
17. N. Cyr, "Polarization-mode dispersion measurement: Generalization of the interferometric method to any coupling regime," Journal of Lightwave Technology, vol. 22, no. 3, pp. 794–805, Mar. 2004.
18. J. N. Damask, "A programmable polarization-mode dispersion emulator for systematic testing of 10 Gb/s PMD compensators," in Tech. Dig., Optical Fiber Communications Conference (OFC'00), Baltimore, MD, Mar. 2000, paper ThB3, pp. 28–30.
19. J. N. Damask, P. R. Myers, and T. R. Boschi, "Programmable polarization-mode-dispersion generation," in Tech. Digest, Symposium on Optical Fiber Measurements (SOFM 2002), Boulder, CO, Sept. 2002.
20. J. N. Damask, "Apparatus and method for controlled generation of polarization mode dispersion," U.S. Patent 6,377,719, Apr. 23, 2002.
21. J. N. Damask, "Methods and apparatus for frequency shifting polarization mode dispersion spectra," U.S. Patent 2002/0 080 467 A1, June 27, 2002.
22. J. N. Damask, "Methods and apparatus for generating polarization mode dispersion," U.S. Patent 2002/0 191 285 A1, Dec. 19, 2002.
23. J. N. Damask, "Methods and apparatus for generation and control of coherent polarization mode dispersion," U.S. Patent 2002/0 118 455 A1, Aug. 29, 2002.
24. J. N. Damask, "Methods to construct programmable PMD sources, Part I: Technology and theory," Journal of Lightwave Technology, vol. 22, no. 4, pp. 997–1005, Apr. 2004.
25. J. N. Damask, G. Gray, P. Leo, G. J. Simer, K. B. Rochford, and D. Veasey, "Method to measure and estimate total outage probability for PMD-impaired systems," IEEE Photonics Technology Letters, vol. 15, no. 1, pp. 48–50, Jan. 2003.
26. J. N. Damask, P. R. Myers, A. Boschi, and G. J. Simer, "Demonstration of a coherent PMD source," IEEE Photonics Technology Letters, vol. 15, no. 11, pp. 1612–1614, Nov. 2003.
27. J. N. Damask, P. R. Myers, G. J. Simer, and A. Boschi, "Methods to construct programmable PMD sources, Part II: Instrument demonstrations," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1006–1013, Apr. 2004.
28. J. N. Damask, G. J. Simer, K. B. Rochford, and P. R. Myers, "Demonstration of a programmable PMD source," IEEE Photonics Technology Letters, vol. 15, no. 2, pp. 296–298, Feb. 2003.
29. A. B. dos Santos and J. P. von der Weid, "PDL effects in PMD emulators made out with HiBi fibers: Building PMD/PDL emulators," IEEE Photonics Technology Letters, vol. 16, no. 2, pp. 452–454, Feb. 2004.
30. T. Erdogan, T. A. Strasser, and P. S. Westbrook, "In-line all-fiber polarimeter," U.S. Patent 6,211,957, Apr. 3, 2001.
31. J. W. Evans, "The birefringent filter," Journal of the Optical Society of America, vol. 39, no. 3, pp. 229–242, 1949.
32. A. Eyal, D. Kuperman, O. Dimenstein, and M. Tur, "Polarization dependence of the intensity modulation transfer function of an optical system with PMD and PDL," IEEE Photonics Technology Letters, vol. 14, no. 11, pp. 1515–1517, Nov. 2002.
33. A. Eyal and M. Tur, "Measurement of polarization mode dispersion in systems having polarization dependent loss or gain," IEEE Photonics Technology Letters, vol. 9, no. 9, pp. 1256–1258, Sept. 1997.
34. A. Eyal and M. Tur, "A modified poincare sphere technique for the determination of polarization-mode dispersion in the presence of differential gain/loss," in Tech. Dig., Optical Fiber Communications Conference (OFC'98), San Jose, CA, Feb. 1998, paper ThR1, p. 340.
35. D. L. Favin, B. M. Nyman, and G. M. Wolter, "System and method for measuring polarization dependent loss," U.S. Patent 5,371,597, Dec. 6, 1994.
36. K. S. Feder, P. S. Westbrook, J. Ging, P. I. Reyes, and G. E. Carver, "In-fiber spectrometer using tilted fiber gratings," IEEE Photonics Technology Letters, vol. 15, no. 7, pp. 933–935, July 2003.
37. A. Galtarossa and L. Palmieri, "Spatially resolved PMD measurements," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1103–1115, Apr. 2004.
38. A. Galtarossa, L. Palmieri, A. Pizzinat, M. Schiano, and T. Tambosso, "Measurement of local beat length and differential group delay in installed single-mode fibers," Journal of Lightwave Technology, vol. 18, no. 10, pp. 1389–1394, Oct. 2000.
39. A. Galtarossa, L. Palmieri, M. Schiano, and T. Tambosso, "Statistical characterization of fiber random birefringence," Optics Letters, vol. 25, no. 18, pp. 1322–1324, Sept. 2000.
40. N. Gisin, J. Von der Weid, and R. Passy, "Definitions and measurements of polarization mode dispersion: Interferometric versus fixed analyzer methods," IEEE Photonics Technology Letters, vol. 6, no. 6, pp. 730–732, 1994.
41. N. Gisin and K. Julliard, "Method and device for measuring polarization mode dispersion of an optical fiber," U.S. Patent 5,852,496, Dec. 22, 1998.
42. J. P. Gordon, R. M. Jopson, H. W. Kogelnik, and L. E. Nelson, "Polarization mode dispersion measurement," U.S. Patent 6,519,027, Feb. 11, 2003.
43. P. S. Hague, Polarized Light: Instruments, Devices, Applications. Bellingham, Washington: SPIE Optical Engineering Press, Jan. 1976, vol. 88, ch. Survey of Methods for the Complete Determination of a State of Polarisation, pp. 3–10.
44. S. E. Harris, E. O. Ammann, and I. C. Chang, "Optical network synthesis using birefringent crystals. I. Synthesis of lossless networks of equal-length crystals," Journal of the Optical Society of America, vol. 54, no. 10, pp. 1267–1279, 1964.
45. M. C. Hauer, Q. Yu, and A. Willner, "Compact, all-fiber PMD emulator using an integrated series of thin-film micro-heaters," in Tech. Dig., Optical Fiber Communications Conference (OFC'02), Anaheim, CA, Mar. 2002, paper ThA3.
46. M. Hauer, Q. Yu, E. Lyons, C. Lin, A. Au, H. Lee, and A. Willner, "Electrically controllable all-fiber PMD emulator using a compact array of thin-film microheaters," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1059–1065, Apr. 2004.
47. B. L. Heffner, "Automated measurement of polarization mode dispersion using Jones matrix eigenanalysis," IEEE Photonics Technology Letters, vol. 4, no. 9, pp. 1066–1068, 1992.
48. B. L. Heffner, "Deterministic, analytically complete measurement of polarization-dependent transmission through optical devices," IEEE Photonics Technology Letters, vol. 4, no. 5, pp. 451–453, 1992.
49. B. L. Heffner, "Accurate, automated measurement of differential group delay dispersion and principal state variation using Jones matrix eigenanalysis," IEEE Photonics Technology Letters, vol. 5, no. 7, pp. 814–816, 1993.
50. B. L. Heffner, "Single-mode propagation of mutual temporal coherence: Equivalence of time and frequency measurements of polarization-mode dispersion," Optics Letters, vol. 19, no. 15, pp. 1104–1106, Aug. 1994.
51. B. L. Heffner, "Optical pulse distortion measurement limitations in linear time invariant systems, and applications to polarization mode dispersion," Optics Communications, vol. 115, pp. 45–51, Mar. 1995.
52. B. L. Heffner, "Influence of optical source characteristics on the measurement of polarization-mode dispersion of highly mode-coupled fibers," Optics Letters, vol. 21, no. 2, pp. 113–115, Jan. 1996.
53. B. L. Heffner, "PMD measurement techniques - a consistent comparison," in Tech. Dig., Optical Fiber Communications Conference (OFC'96), San Jose, CA, Feb. 1996, paper FA1, p. 292.
54. B. L. Heffner and P. R. Hernday, "Measurement of polarization-mode dispersion," Hewlett-Packard Journal, pp. 27–33, Feb. 1995.
55. B. L. Heffner, "Method and apparatus for measuring polarization mode dispersion in optical devices," U.S. Patent 5,227,623, July 13, 1993.
56. B. L. Heffner, "Method and apparatus for measuring polarization sensitivity of optical devices," U.S. Patent 5,298,972, Mar. 24, 1994.
57. B. L. Heffner, "Polarimeter re-calibration method and apparatus," U.S. Patent 5,296,913, Mar. 22, 1994.
58. F. Heismann, "Tutorial: Polarization mode dispersion: Fundamentals and impact on optical communication systems," in European Conference on Optical Communication (ECOC'98), vol. 2, Sept. 1998, pp. 51–79.
59. F. Heismann, D. A. Fishman, and D. L. Wilson, "Automatic compensation of first order polarization mode dispersion in a 10 Gb/s transmission system," in European Conference on Optical Communication (ECOC'98), vol. 1, Sept. 1998, pp. 529–530.
60. B. Huttner, B. Gisin, and N. Gisin, "Distributed PMD measurement with a Polarization-OTDR in optical fibers," Journal of Lightwave Technology, vol. 17, no. 10, pp. 1843–1848, Oct. 1999.
61. B. Huttner, J. Reecht, N. Gisin, R. Passy, and J. Weid, "Local birefringence measurements in single-mode fibers with coherent optical frequency-domain reflectometry," IEEE Photonics Technology Letters, vol. 10, no. 10, pp. 1458–1460, Oct. 1998.
62. B. Huttner, J. Reecht, N. Gisin, R. Passy, and J. Weid, "Distributed beatlength measurement in single-mode fibers with optical frequency-domain reflectometry," Journal of Lightwave Technology, vol. 20, no. 5, pp. 828–835, May 2002.
63. I. T. Lima, Jr., R. Khosravani, P. Ebrahimi, E. Ibragimov, A. E. Willner, and C. R. Menyuk, "Polarization mode dispersion emulator," in Tech. Dig., Optical Fiber Communications Conference (OFC 2000), Baltimore, MD, Mar. 2000, paper ThB4.
64. E. Ibragimov, G. Shtengel, and S. Suh, "Statistical correlation between first and second order PMD," Journal of Lightwave Technology, vol. 20, no. 4, pp. 586–590, 2002.
65. Fibre optic interconnecting devices and passive components - Basic test and measurement procedures - Part 3-12: Examinations and measurements - Polarization dependence of attenuation of a single-mode fibre optic component: Matrix calculation method, International Electrotechnical Commission Std. IEC 61 300-3-12, 1997. [Online]. Available: https://www.iec.ch/
66. Fibre optic interconnecting devices and passive components - Basic test and measurement procedures - Part 3-2: Examinations and measurements - Polarization dependence of attenuation in a single-mode fibre optic device, International Electrotechnical Commission Std. IEC 61 300-3-2, 1999. [Online]. Available: https://www.iec.ch/
67. JDS Uniphase instrumentation catalog 2003, JDS Uniphase, Inc., Canada, 2003, PMD Emulator: PE3 or PE4. [Online]. Available: http://www.jdsu.com/
68. R. Jopson, L. Nelson, and H. Kogelnik, "Measurement of second-order polarization dispersion vectors in optical fibers," IEEE Photonics Technology Letters, vol. 12, no. 3, pp. 293–295, 2000.
69. R. M. Jopson, H. W. Kogelnik, and L. E. Nelson, "Method for measurement of first- and second-order polarization mode dispersion vectors in optical fibers," U.S. Patent 6,380,533, Apr. 30, 2002.
70. M. Karlsson, J. Brentel, and P. A. Andrekson, "Long-term measurement of PMD and polarization drift in installed fibers," Journal of Lightwave Technology, vol. 18, no. 7, pp. 941–951, 2000.
71. M. Legre, M. Wegmuller, and N. Gisin, "Investigation of the ratio between phase and group birefringence in optical single-mode fibers," Journal of Lightwave Technology, vol. 21, no. 12, pp. 3374–3378, Dec. 2003.
72. P. J. Leo, G. R. Gray, G. J. Simer, and K. B. Rochford, "State of polarization changes: Classification and measurement," Journal of Lightwave Technology, vol. 21, no. 10, pp. 2189–2193, Oct. 2003.
73. P. J. Leo, D. L. Peterson, and K. B. Rochford, "Estimation of system outage statistics due to polarization mode dispersion," in Symposium on Optical Fiber Measurements (SOFM 2002), Boulder, CO, Sept. 2002.
74. I. T. Lima, R. Khosravani, P. Ebrahimi, E. Ibragimov, C. R. Menyuk, and A. E. Willner, "Comparison of polarization mode dispersion emulators," Journal of Lightwave Technology, vol. 19, no. 12, pp. 1872–1881, Dec. 2001.
75. C. Madsen, M. Cappuzzo, E. Laskowski, E. Chen, L. Gomez, A. Griffin, A. Wong-Foy, S. Chandrasekhar, L. Stulz, and L. Buhl, "Versatile integrated PMD emulation and compensation elements," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1041–1050, Apr. 2004.
76. C. Madsen, E. Laskowski, M. Cappuzzo, L. Buhl, S. Chandrasekhar, E. Chen, L. Gomez, A. Griffin, L. Stulz, and A. Wong-Foy, "A versatile, integrated emulator for first- and higher-order PMD," in Tech. Dig., European Conference on Optical Communications (ECOC'03), Rimini, Italy, Sept. 2003, paper Th2.2.5.
77. B. S. Marks, I. T. Lima, and C. R. Menyuk, "Autocorrelation function for polarization mode dispersion emulators with rotators," Optics Letters, vol. 27, no. 13, pp. 1150–1152, July 2002.
78. P. Martin, G. Le Boudec, E. Taufflieb, and H. Lefevre, "Appliance for measuring polarization mode dispersion and corresponding measuring process," U.S. Patent 5,712,704, Jan. 27, 1998.
79. National Aperture Catalogue, National Aperture, Inc., Salem, NH, 2004, MM-3M-R Rotary Stage. [Online]. Available: http://www.naimotion.com/
80. L. E. Nelson, R. M. Jopson, H. Kogelnik, and J. P. Gordon, "Measurement of polarization-mode dispersion vectors using the polarization-dependent signal delay method," Optics Express, vol. 6, no. 8, pp. 158–167, Apr. 2000. [Online]. Available: http://www.opticsexpress.org/
81. B. M. Nyman, D. L. Favin, and G. Wolter, "Automated system for measuring polarization dependent loss," in Tech. Dig., Optical Fiber Communications Conference (OFC'94), San Jose, CA, Mar. 1994, pp. 230–231.
82. B. M. Nyman and G. Wolter, "High-resolution measurement of polarization dependent loss," IEEE Photonics Technology Letters, vol. 5, no. 7, pp. 817–818, July 1993.
83. J. Patscher and R. Eckhardt, "Component for second-order compensation of polarization-mode dispersion," Electronics Letters, vol. 33, no. 13, p. 1157, 1997.
84. P. B. Phua and H. A. Haus, "Variable differential-group-delay module without second-order PMD," Journal of Lightwave Technology, vol. 20, no. 9, pp. 1788–1794, 2002.
85. C. D. Poole and R. E. Wagner, "Phenomenological approach to polarization mode dispersion in long single-mode fibers," Electronics Letters, vol. 22, no. 19, pp. 1029–1030, 1986.
86. C. D. Poole and D. L. Favin, "Polarization-mode dispersion measurements based on transmission spectra through a polarizer," Journal of Lightwave Technology, vol. 12, no. 6, pp. 917–929, 1994.
87. K. B. Rochford, P. J. Leo, D. L. Peterson, and P. Williams, "Recent progress in polarization mode dispersion measurement," in Proc. 16th Intl. Conf. on Optical Fiber Sensors, Nara, Japan, Oct. 2003.
88. A. J. Rogers, "Polarization-optical time domain reflectometry: A technique for the measurement of field distributions," Applied Optics, vol. 20, pp. 1060–1074, 1981.
89. G. Shtengel, 2003, Labview library for Agilent 8509 Optical Polarization Analyzer and Agilent 8614 Tunable Laser Source. [Online]. Available: http://www.shtengel.com/gleb/Labview.htm
90. G. Shtengel, private communication, 2003.
91. A. S. Siddiqui, "Optical polarimeter having four channels," U.S. Patent 5,227,623, Jan. 14, 1992.
92. Polarization-Mode Dispersion Measurement for Single-Mode Optical Fibers by Interferometry Method, Telecommunications Industry Association Std. TIA/EIA-455-124, 1999. [Online]. Available: http://www.tiaonline.org/standards/
93. Differential Group Delay Measurement of Single-Mode Components and Devices by the Differential Phase Shift Method, Telecommunications Industry Association Std. TIA/EIA-455-197, 2000. [Online]. Available: http://www.tiaonline.org/standards/
94. Measurement of Polarization Dependent Loss (PDL) of Single-Mode Fiber Optic Components, Telecommunications Industry Association Std. TIA/EIA-455-157, 2000. [Online]. Available: http://www.tiaonline.org/standards/
95. Polarization-Mode Dispersion Measurement for Single-Mode Optical Fibers by the Fixed Analyzer Method, Telecommunications Industry Association Std. TIA/EIA-455-113, 2001. [Online]. Available: http://www.tiaonline.org/standards/
96. Polarization Mode Dispersion Measurement for Single-Mode Optical Fibers by Stokes Parameter Evaluation, Telecommunications Industry Association Std. TIA/EIA-455-122, 2002. [Online]. Available: http://www.tiaonline.org/standards/
97. D. S. Waddy, L. Chen, and X. Bao, "A dynamical polarization mode dispersion emulator," IEEE Photonics Technology Letters, vol. 15, no. 4, pp. 534–536, Apr. 2003.
98. D. S. Waddy, L. Chen, S. Hadjifaradji, X. Bao, R. B. Walker, and S. J. Mihailov, "High-order PMD and PDL emulator," in Tech. Dig., Optical Fiber Communications Conference (OFC'04), Los Angeles, CA, Feb. 2004, paper ThF6.
99. P. Wai and C. R. Menyuk, "Polarization mode dispersion, decorrelation, and diffusion in optical fibers with randomly varying birefringence," Journal of Lightwave Technology, vol. 14, no. 2, pp. 148–157, Feb. 1995.
100. M. Wegmuller, S. Demma, C. Vinegoni, and N. Gisin, "Emulator of first- and second-order polarization-mode dispersion," IEEE Photonics Technology Letters, vol. 14, no. 5, pp. 630–632, May 2002.
101. M. Wegmuller, F. Scholder, and N. Gisin, "Photon-counting OTDR for local birefringence and fault analysis in the metro environment," Journal of Lightwave Technology, vol. 22, no. 2, pp. 390–400, Feb. 2004.
102. P. S. Westbrook, T. A. Strasser, and T. Erdogan, "In-line polarimeter using blazed fiber gratings," IEEE Photonics Technology Letters, vol. 12, no. 10, pp. 1352–1354, Oct. 2000.
103. P. S. Westbrook, "System comprising in-line wavelength sensitive polarimeter," U.S. Patent 6,591,024, July 8, 2003.
104. P. A. Williams, "Mode-coupled artifact standard for polarization-mode dispersion: Design, assembly, and implementation," Applied Optics, vol. 38, no. 31, pp. 6498–6507, 1999.
105. P. A. Williams, "Modulation phase-shift measurement of PMD using only four launched polarisation states: A new algorithm," Electronics Letters, vol. 35, no. 18, pp. 1578–1579, 1999.
106. P. A. Williams, "Rotating-wave-plate Stokes polarimeter for differential group delay measurements of polarization-mode dispersion," Applied Optics, vol. 38, no. 31, pp. 6508–6515, 1999.
107. P. A. Williams, "PMD measurement techniques avoiding measurement pitfalls," in Venice Summer School on Polarization Mode Dispersion, Venice, Italy, June 2002, pp. 24–36.
108. P. A. Williams and A. J. Barlow, "Summary of current agreement among PMD measurement techniques," in Presentation to International Electrotechnical Commission (IEC), Edinburgh, Scotland, Sept. 1997, paper SC86, SG1.
109. P. A. Williams, A. J. Barlow, C. Mackechnie, and J. B. Schlager, "Narrowband measurements of polarization-mode dispersion using the modulation phase shift technique," in Tech. Digest, Symposium on Optical Fiber Measurements (SOFM 1998), Boulder, CO, Sept. 1998, pp. 23–26, NIST Special Publication 930.
110. P. A. Williams and J. D. Kofler, "Narrow-band measurement of differential group delay by a six-state RF phase-shift technique: 40 fs single-measurement uncertainty," Journal of Lightwave Technology, vol. 22, no. 2, pp. 448–456, Feb. 2004.
111. P. A. Williams and J. Kofler, "Measurement and mitigation of multiple-reflection effects on the differential group delay spectrum of optical components," in Tech. Digest, Symposium on Optical Fiber Measurements (SOFM 2002), Boulder, CO, Sept. 2002, pp. 173–176.
112. P. A. Williams and C. M. Wang, "Corrections to fixed analyzer measurements of polarization mode dispersion," Journal of Lightwave Technology, vol. 16, no. 4, pp. 534–554, 1998.
113. L. Yan, M. Hauer, Y. Shi, X. Yao, P. Ebrahimi, Y. Wang, A. Willner, and W. Kath, "Polarization-mode-dispersion emulator using variable differential-group-delay (DGD) elements and its use for experimental importance sampling," Journal of Lightwave Technology, vol. 22, no. 4, pp. 1051–1058, Apr. 2004.
114. L.-S. Yan, M. Hauer, C. Yeh, G. Yang, L. Lin, Z. Chen, Y. Q. Shi, X. S. Yao, A. E. Willner, and W. L. Kath, "High-speed, stable and repeatable PMD emulator with tunable statistics," in Tech. Dig., Optical Fiber Communications Conference (OFC'03), Atlanta, GA, Mar. 2003, paper MF6.

A
Addition of Multiple Coherent Waves

There are many instances throughout the text where the simplification of a
sum of coherent sine and cosine terms is necessary. Sine and cosine terms that
are coherent have the same oscillatory frequency $\omega$ but may have different
amplitudes and phases. These waves can be combined into a single sine, cosine,
or complex exponential expression. This appendix shows how to make the
reductions.
A sum of N coherent exponentials is

$$S = \sum_{n=1}^{N} a_n e^{j(\omega t - \phi_n)} \tag{A.1}$$

Expanding the sum makes

$$\sum_{n=1}^{N} a_n e^{-j\phi_n} = \sum_{n=1}^{N} a_n \cos\phi_n - j \sum_{n=1}^{N} a_n \sin\phi_n = A - jB \tag{A.2}$$

where

$$A = \sum_{n=1}^{N} a_n \cos\phi_n , \qquad B = \sum_{n=1}^{N} a_n \sin\phi_n$$

Converting to polar form, the sum S simplifies to

$$\sum_{n=1}^{N} a_n e^{j(\omega t - \phi_n)} = \sqrt{A^2 + B^2}\, \exp j\left(\omega t - \tan^{-1}(B/A)\right) \tag{A.3}$$

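The simplifications for sine and cosine sums require an additional step of
exponential expansion. Thus, with

$$S' = \sum_{n=1}^{N} a_n \sin(\omega t - \phi_n) , \tag{A.4}$$

exponential expansion of the sine terms yields

$$S' = \frac{1}{2j}\left( e^{j\omega t} \sum_{n=1}^{N} a_n e^{-j\phi_n} - e^{-j\omega t} \sum_{n=1}^{N} a_n e^{j\phi_n} \right) \tag{A.5}$$

Substitution of (A.2) into (A.5) yields

$$S' = \frac{1}{2j}\left[ A\left(e^{j\omega t} - e^{-j\omega t}\right) - jB\left(e^{j\omega t} + e^{-j\omega t}\right) \right] = A \sin\omega t - B \cos\omega t \tag{A.6}$$

Recognizing that (A.6) is similar to the equation for an ellipse, the final simplification produces

$$\sum_{n=1}^{N} a_n \sin(\omega t - \phi_n) = \sqrt{A^2 + B^2}\, \sin\left(\omega t - \tan^{-1}(B/A)\right) \tag{A.7}$$

In an analogous way, a sum of coherent cosine terms simplifies to

$$\sum_{n=1}^{N} a_n \cos(\omega t - \phi_n) = \sqrt{A^2 + B^2}\, \cos\left(\omega t - \tan^{-1}(B/A)\right) \tag{A.8}$$

Table A.1 summarizes the results.

Table A.1. Identities for Coherent Wave Addition

$$\sum_{n=1}^{N} a_n e^{j(\omega t - \phi_n)} = \sqrt{A^2 + B^2}\, \exp j\left(\omega t - \tan^{-1}(B/A)\right)$$

$$\sum_{n=1}^{N} a_n \sin(\omega t - \phi_n) = \sqrt{A^2 + B^2}\, \sin\left(\omega t - \tan^{-1}(B/A)\right)$$

$$\sum_{n=1}^{N} a_n \cos(\omega t - \phi_n) = \sqrt{A^2 + B^2}\, \cos\left(\omega t - \tan^{-1}(B/A)\right)$$

where $A = \sum_{n=1}^{N} a_n \cos\phi_n$ and $B = \sum_{n=1}^{N} a_n \sin\phi_n$.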
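The identities can be checked numerically with a short sketch. The function name `coherent_sum` is hypothetical, and `atan2(B, A)` is used in place of the plain inverse tangent of B/A so that the resultant phase lands in the correct quadrant when A is negative; this is an implementation choice, not part of the derivation above.

```python
import math

def coherent_sum(amplitudes, phases):
    """Return (amplitude, phase) of the single wave equal to the sum.

    The sum over n of a_n*sin(wt - phi_n) equals amp*sin(wt - theta),
    and likewise for cosines and complex exponentials.
    """
    A = sum(a * math.cos(p) for a, p in zip(amplitudes, phases))
    B = sum(a * math.sin(p) for a, p in zip(amplitudes, phases))
    # hypot gives sqrt(A^2 + B^2); atan2 resolves the quadrant of the phase
    return math.hypot(A, B), math.atan2(B, A)

# Compare the term-by-term sine sum with the reduced form at one phase wt.
a_n = [1.0, 0.5, 2.0]
phi_n = [0.3, -1.2, 2.5]
amp, theta = coherent_sum(a_n, phi_n)
wt = 0.7
direct = sum(a * math.sin(wt - p) for a, p in zip(a_n, phi_n))
reduced = amp * math.sin(wt - theta)
```

The values `direct` and `reduced` agree to rounding error for any choice of `wt`.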

B
Select Magnetic Field Profiles

Non-latching iron garnet Faraday rotation elements require the presence of
an external magnetic field to saturate the magnetic domains. To generate
the Faraday effect, the field lines are aligned predominantly in the direction
of optical propagation. A cylindrical magnet with a center bore gives the
required field profile and is also simple to analyze. This appendix gives analytic
and semi-analytic expressions for the field along the centerline of the bore and
in the plane perpendicular to the propagation direction, where the plane is
located half-way along the bore.
A permanent magnet is described by a magnetic dipole distribution within
the material and a magneto-quasi-static magnetic field. The relevant forms of
Maxwell's equations are then
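$$\nabla \times \vec{H} = 0 \tag{B.1a}$$

$$\nabla \cdot \mu_o \vec{H} = -\nabla \cdot \mu_o \vec{M} \tag{B.1b}$$

Since the magnetic field is irrotational, the field can be defined as the gradient
of a scalar potential (cf. 1.2.2b):

$$\vec{H} = -\nabla \Psi \tag{B.2}$$

where $\Psi$ is the scalar magnetic potential. Definition of the magnetic charge
density $\rho_m$ as

$$\rho_m = -\nabla \cdot \mu_o \vec{M} \tag{B.3}$$

and substitution of (B.2) into (B.1b) yields Poisson's equation

$$\nabla^2 \Psi = -\frac{\rho_m}{\mu_o} \tag{B.4}$$

A solution to Poisson's equation is the superposition integral

$$\Psi = \int_{V'} \frac{\rho_m(\vec{r}\,')}{4\pi\mu_o |\vec{r} - \vec{r}\,'|}\, dV' \tag{B.5}$$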
[Figure B.1: labeled dimensions are the inner radius $r_1$, outer radius $r_2$, magnet length $2L$, source point $r'$, offset position $r_o$, separation $|r_o - r'|$, and angle $\phi$.]

Fig. B.1. Geometry of cylindrical magnet. a) Cylindrical magnet with center bore.
Magnetic charges lie on the top and bottom annular surfaces. b) Top view showing
inner and outer radius. c) For in-plane calculation, position $r_o$ is offset from the z-axis and can be related to angle $\phi$ (see text).

where the prime denotes a point on or within the magnetic medium and $\vec{r}$ is
a spatial coordinate. The integral is taken over the volume of the magnetic
solid. Figure B.1 illustrates the magnetic cylinder under consideration. Taking
advantage of the cylindrical symmetry, the superposition integral is evaluated
as

$$\Psi = \int_{0}^{2\pi} d\phi \int_{r_1}^{r_2} r'\, dr' \int_{-L}^{L} dz'\, \frac{\rho_m(\vec{r}\,')}{4\pi\mu_o |\vec{r} - \vec{r}\,'|} \tag{B.6}$$

The first evaluation of (B.6) is done along the z-axis. Integration of (B.6)
along z yields two magnetic sheets, one annulus at $+L$ with positive charges
$\mu_o M$ and the other annulus at $-L$ with negative charges $-\mu_o M$. Moreover, the
distance from any point on the annular sheets to the z-axis is
$|\vec{r} - \vec{r}\,'| = \sqrt{r'^2 + (z \mp L)^2}$, where the minus sign corresponds to the top sheet.
The scalar potential, still in integral form, is

$$\Psi = \int_{r_1}^{r_2} \frac{2\pi\mu_o M\, r'\, dr'}{4\pi\mu_o \sqrt{r'^2 + (z - L)^2}} - \int_{r_1}^{r_2} \frac{2\pi\mu_o M\, r'\, dr'}{4\pi\mu_o \sqrt{r'^2 + (z + L)^2}} \tag{B.7}$$

Integration and subsequently taking the gradient as prescribed by (B.2) yields
the magnetic field strength along the central axis [1, 2]:

$$H_z(z) = -\frac{M}{2} \left\{ (z - L) \left[ \frac{1}{\sqrt{(z - L)^2 + r_2^2}} - \frac{1}{\sqrt{(z - L)^2 + r_1^2}} \right] - (z + L) \left[ \frac{1}{\sqrt{(z + L)^2 + r_2^2}} - \frac{1}{\sqrt{(z + L)^2 + r_1^2}} \right] \right\} \tag{B.8}$$

Figure B.2(a) illustrates an evaluation of (B.8). A design goal would be to
achieve the highest possible magnetic field in the region of z = 0 while minimizing the size of the magnet. The change in sign is due to the fields wrapping
around either magnet end to terminate on the surface charges.
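A minimal sketch of the axial evaluation follows. It assumes the sign convention obtained from the potential of (B.7) with the field taken as the negative gradient, for magnetization along +z; the function name `axial_field` and the default M = 1 (field in units of M) are choices made for illustration.

```python
import math

def axial_field(z, L, r1, r2, M=1.0):
    """On-axis field H_z(z) of a bored cylindrical magnet of length 2L.

    r1 and r2 are the inner and outer radii; the result is in units of M.
    """
    def bracket(u):
        # u * [1/sqrt(u^2 + r2^2) - 1/sqrt(u^2 + r1^2)]
        return u * (1.0 / math.hypot(u, r2) - 1.0 / math.hypot(u, r1))
    return -0.5 * M * (bracket(z - L) - bracket(z + L))

# Geometry quoted for Fig. B.2(a): r2 = L and r1 = L/2, with L = 1.
center = axial_field(0.0, 1.0, 0.5, 1.0)
outside = axial_field(3.0, 1.0, 0.5, 1.0)
```

For this geometry the field at the center of the bore and the field well beyond the magnet end have opposite signs, which is the sign change noted in the text.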

[Figure B.2: a) $H(z, r=0)/B_r$ versus $z/L$. b) $H(r, z=0)/B_r$ versus $r/L$. Inset: magnet of length $2L$ with coordinates r and z.]

Fig. B.2. Axial and transverse magnetic field amplitude $H_z$ of a cylindrical magnet.
a) Axial field strength of $H_z$ under the conditions $r_2 = L$ and $r_1 = L/2$. b) Transverse
field strength of $H_z$ in plane located at center of magnet. Inset shows coordinates.

The purpose of the following transverse field calculation is to explore the
field uniformity within the bore but off-axis. Since an optical beam has finite
width, it is insufficient to saturate the center of an iron garnet but not the
outer edges. In order to keep the analysis quasi-analytic, only the field amplitude in a plane normal to z and located half-way along the bore is calculated.
The key to evaluating the superposition integral in this case is an analytic
expression for $|\vec{r}\,' - \vec{r}|$ away from the z-axis. Referring to Fig. B.1(c), offset
position $r_o$ is related to $r'$ and $\phi$ as

$$|\vec{r}\,' - \vec{r}_o| = \sqrt{R^2 + (z \mp L)^2} \tag{B.9}$$

where the in-plane length is

$$R^2 = (r' \sin\phi)^2 + (r' \cos\phi - r_o)^2 \tag{B.10}$$

With this measure, the superposition integral is

$$\Psi = \int_{0}^{2\pi} d\phi \int_{r_1}^{r_2} r'\, dr'\, \frac{-\mu_o M}{4\pi\mu_o} \left[ \frac{1}{\sqrt{R^2 + (z + L)^2}} - \frac{1}{\sqrt{R^2 + (z - L)^2}} \right] \tag{B.11}$$

Even though (B.11) does not have an analytic form, an expression closer to
$H_z(r, z = 0)$ can still be found. In particular,

$$H_z = -\frac{\partial \Psi}{\partial z} \tag{B.12}$$

Carrying through the derivative with respect to z first and then setting z = 0
yields

$$H_z(r, z = 0) = -\int_{0}^{2\pi} d\phi \int_{r_1}^{r_2} \frac{2\mu_o M}{4\pi\mu_o} \frac{r'\, dr'}{[R^2 + 1]^{3/2}} \tag{B.13}$$

where all lengths are normalized to L. This integral can be evaluated numerically. Applying the parameters from
Fig. B.2(a) to (B.13) generates the curve given in Fig. B.2(b). Note that
the z component of the magnetic field does not change sign but monotonically
decays to zero far away from the magnet. Also, the uniformity of the field, for
these parameters, remains within 10% of the peak within the inner radius.
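One way to carry out the numerical evaluation is a midpoint-rule double sum, sketched below. The grid sizes, the function name `transverse_field`, and the explicit factor of L (which reduces to the normalized form when lengths are expressed in units of L) are assumptions of this sketch rather than the method used to generate Fig. B.2(b).

```python
import math

def transverse_field(r_o, L, r1, r2, M=1.0, n_r=200, n_phi=360):
    """Mid-plane field H_z at offset r_o from the axis, in units of M.

    Midpoint-rule evaluation of the double integral over the annulus
    r1 <= r' <= r2 and 0 <= phi < 2*pi, with the derivative in z taken
    analytically and z then set to 0.
    """
    dr = (r2 - r1) / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = r1 + (i + 0.5) * dr
        for k in range(n_phi):
            phi = (k + 0.5) * dphi
            # In-plane length squared between source point and field point
            R2 = (r * math.sin(phi)) ** 2 + (r * math.cos(phi) - r_o) ** 2
            total += r * dr * dphi / (R2 + L * L) ** 1.5
    return -(2.0 * L * M / (4.0 * math.pi)) * total

# On the axis the result should match the closed-form axial value at z = 0.
on_axis = transverse_field(0.0, 1.0, 0.5, 1.0)
off_axis = transverse_field(0.25, 1.0, 0.5, 1.0)
```

Because the integrand is positive everywhere, the computed field keeps one sign at every offset, consistent with the remark that the z component does not change sign.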
A samarium-cobalt (SmCo) magnet can be an excellent choice for the
permanent magnet around an iron garnet due to its high coercivity in a small size.
Length-diameter products of 1 mm$^2$ can readily achieve the 100–250 Oe magnetic field required for $H_{sat}$ in iron garnets.

References
1. H. A. Haus and J. R. Melcher, Electromagnetic Fields and Energy. Englewood Cliffs, New Jersey: Prentice-Hall, 1989.
2. K. Shiraishi, F. Tajima, and S. Kawakami, "Compact Faraday rotator for an optical isolator using magnets arranged with alternating polarities," Optics Letters, vol. 11, no. 2, pp. 82–84, 1986.

C
Efficient Calculation of PMD Spectra

Calculation of scalar and vector PMD spectra in the Stokes-based PMD representation is straightforward and efficient. The concatenation rules presented
in §8.2.4 starting on page 337 are derived for $\vec{\tau}$ and $\vec{\tau}_\omega$ by taking frequency
derivatives analytically; numerical derivatives are therefore not necessary.
This appendix gives a vectorized code fragment written in Matlab which
can be used as a core calculator for larger programs. Given the particular
vectorization that follows, the code works well when there are more frequency
evaluations than birefringent segments.
The differential-group delay $|\vec{\tau}|$, magnitude second-order PMD $|\vec{\tau}_\omega|$, and
polarization-dependent chromatic dispersion $|\vec{\tau}|_\omega$ scalar spectra are calculated
for each frequency by
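$$|\vec\tau|^2 = \vec\tau\cdot\vec\tau \tag{C.1a}$$
$$|\vec\tau_\omega|^2 = \vec\tau_\omega\cdot\vec\tau_\omega \tag{C.1b}$$
$$|\vec\tau_{\omega\parallel}| = \vec\tau_\omega\cdot\vec\tau\,/\,|\vec\tau| \tag{C.1c}$$

The output and input PSP vector spectra are

$$\hat p_{out} = \vec\tau/\tau \tag{C.2a}$$
$$\hat p_{in} = R^\dagger\,\hat p_{out} \tag{C.2b}$$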

where $R = R_N R_{N-1} \cdots R_1$. Each rotation operator is expanded in vector form
as
$$R_k = I\cos(\omega\tau_k) + \left(1 - \cos(\omega\tau_k)\right)(\hat r_k \hat r_k\cdot) + \sin(\omega\tau_k)\,(\hat r_k\times) \tag{C.3}$$
where $\tau_k$ is the DGD of a single birefringent segment and $\hat r_k$ is the Stokes direction of its birefringent axis. The concatenation equations (8.2.34) on page 337
are used to compute the cumulative first- and second-order PMD vectors.
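The vector form (C.3) can also be exercised outside Matlab. The following is a minimal Python sketch that assumes only the standard library; the helper names `rotation_operator` and `apply_operator` are illustrative and do not appear in the book's listing. It builds the 3 × 3 operator for one segment from the identity, dyadic, and cross-product terms and applies it to a Stokes vector.

```python
import math

def rotation_operator(r_hat, phase):
    """3x3 Stokes-space operator of (C.3): I cos + (1 - cos) r r^T + sin (r x)."""
    c, s = math.cos(phase), math.sin(phase)
    rx, ry, rz = r_hat
    # Cross-product operator: column i holds r_hat x s_i.
    cross_op = ((0.0, -rz, ry),
                (rz, 0.0, -rx),
                (-ry, rx, 0.0))
    return [[c * (1.0 if i == j else 0.0)
             + (1.0 - c) * r_hat[i] * r_hat[j]
             + s * cross_op[i][j]
             for j in range(3)] for i in range(3)]

def apply_operator(mat, vec):
    """Multiply a 3x3 operator into a 3-element Stokes vector."""
    return [sum(mat[i][j] * vec[j] for j in range(3)) for i in range(3)]

# Quarter-turn about S3: S1 is carried onto S2, while S3 itself is unchanged.
R = rotation_operator((0.0, 0.0, 1.0), math.pi / 2)
rotated = apply_operator(R, (1.0, 0.0, 0.0))
axis = apply_operator(R, (0.0, 0.0, 1.0))
```

A rotation about $\hat r_k$ leaves $\hat r_k$ itself unchanged, which gives a quick sanity check on the dyadic and cross-product terms.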


function [tau2, tauw2, pdcd, PSPout, PSPin] = CalcPMDSpec_1(w_vec, r_vec, tau_vec, phz_vec)
%
% Inputs:
%   /w_vec/   (Trad/s)  1 x wlen vector of radial frequency range
%   /r_vec/   (scalar)  3 x Nseg matrix, each column is a unit Stokes vector of tau_k
%   /tau_vec/ (ps)      1 x Nseg vector of DGD for each segment
%                       (not to be confused with the PMD vector tau)
%   /phz_vec/ (rad)     1 x Nseg vector of residual birefringent phase
%                       for each segment.
% Outputs:
%   /tau2/    (ps^2)    1 x wlen vector of DGD^2(w)
%   /tauw2/   (ps^4)    1 x wlen vector of SOPMD^2(w)
%   /pdcd/    (ps^2)    1 x wlen vector of PDCD(w)
%   /PSPout/  (scalar)  3 x wlen matrix of output PSP Stokes vectors
%   /PSPin/   (scalar)  3 x wlen matrix of input PSP Stokes vectors
%

% Defs
DEG2RAD = pi / 180; RAD2DEG = 180 / pi;
I2 = diag([1,1]);
I3 = diag([1,1,1]);

% Input-specific Defs
wlen = length(w_vec);
Nseg = length(tau_vec);

% Calculate rrdot and rcross for each segment up front
% Note: the rrdot_vec and rcross_vec matrices have the following structure:
%
%   rrdot_vec = [ rrdot(1) rrdot(2) ... rrdot(Nseg) ]
%                 |
%                 |
%                 im
%
% where each rrdot is a 3x3 matrix itself. Same with rcross.
%

% Calculate rrdot and rcross for each segment
for k = 1:Nseg
    % column index into rrdot_vec and rcross_vec
    im = 3 * (k - 1) + 1;
    rrdot_vec(:, [0:2]+im) = r_vec(:,k) * r_vec(:,k)';  % rrdot from dyadic (outer product)

    % rcross is built from the cross products of r_vec with S1, S2, and S3
    for i = 1:3
        rcross_vec(:, (i-1)+im) = cross(r_vec(:,k), I3(:,i));
    end
end

% Precalculate the trig tables; row -> segment #, column -> freq
coswt = cos(tau_vec' * w_vec + phz_vec' * ones(size(w_vec)));
sinwt = sin(tau_vec' * w_vec + phz_vec' * ones(size(w_vec)));

% Define the tau vectors
for k = 1:Nseg
    tauvec(:,k) = tau_vec(k) * r_vec(:,k);
end

% Now calculate the frequency response
for iw = 1:wlen

    % Set the frequency for the concat
    w = w_vec(iw);

    % Initialize cumulative tau and tauw vectors:
    % the first segment contributes tau_1 and no second-order PMD
    tau_cat = tauvec(:, 1);                 % tau(1) = tau_1
    tauw_cat = zeros(size(tauvec(:, 1)));   % tauw(1) = 0

    % Initialize cumulative R operator
    R_cat = I3;

    % We need R1 to find PSPin
    Rseg = coswt(1, iw) * I3 + ...
        (1-coswt(1, iw)) * rrdot_vec(:, [0:2]+1) + ...
        sinwt(1, iw) * rcross_vec(:, [0:2]+1);

    % Make first concatenation
    R_cat = Rseg * R_cat;

    % Accumulate tau, tauw, and Rseg through each segment
    for iseg = 2:Nseg

        % column index into rrdot_vec and rcross_vec
        im = 3 * (iseg - 1) + 1;

        % Construct R_iseg(w)
        Rseg = coswt(iseg, iw) * I3 + ...
            (1-coswt(iseg, iw)) * rrdot_vec(:, [0:2]+im) + ...
            sinwt(iseg, iw) * rcross_vec(:, [0:2]+im);

        % Accumulate R
        R_cat = Rseg * R_cat;

        % Accumulate tau_cat
        tau_cat = tauvec(:,iseg) + Rseg * tau_cat;

        % Accumulate tauw_cat
        tauw_cat = cross( tauvec(:,iseg), tau_cat ) + Rseg * tauw_cat;

    end

    % Calculate the input PMD vectors tau and tauw
    Radj = conj( transpose( R_cat ) );
    tau_in = Radj * tau_cat;
    tauw_in = Radj * tauw_cat;

    % Calculate scalar spectra (could do this outside the loop, too)
    tau2(iw) = tau_cat' * tau_cat;
    tauw2(iw) = tauw_cat' * tauw_cat;
    dgd = sqrt(tau2(iw));
    pdcd(iw) = tau_cat' * tauw_cat / dgd;

    % Calculate the vector spectra
    PSPout(:, iw) = tau_cat / dgd;
    PSPin(:, iw) = tau_in / dgd;

end

The point of view of the preceding code is that the operators $(\hat r_k \hat r_k\cdot)$ and $(\hat r_k\times)$,
as well as the $\omega\tau_k$ product, can be evaluated outside the main loop. In this
way the core loop is mainly a multiply-and-accumulate register.


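After an initial setup, the operators $(\hat r_k \hat r_k\cdot)$ and $(\hat r_k\times)$ are evaluated for each
PMD segment in the loop between lines 39-48.

line 42: $(\hat r_k \hat r_k\cdot) = \hat r_k \hat r_k^T$
lines 45-47: $(\hat r_k\times) = \hat r_k\times\hat s_1 + \hat r_k\times\hat s_2 + \hat r_k\times\hat s_3$

In the code, each of the three cross products fills one column of the 3 × 3 operator.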

Matrices rrdot_vec and rcross_vec store the 3 × 3 operator associated with
the kth segment in a 3 × 3k matrix that is indexed as a row vector on k.
The sine and cosine of the $\omega\tau_k$ product are computed before the concatenation loop. These calculations are stored in tables coswt and sinwt on lines
51-52. There is an important point that needs to be highlighted. Strictly
speaking, the birefringent phase of a segment is $\omega\tau_k$. The radial frequency $\omega$
can certainly be used, such as $(2\pi)\,194.1$ THz. As an alternative, the birefringent phase of a segment is written $(\omega - \omega_o)\tau_k + \phi_k$, where $\omega_o$ is an arbitrary
frequency and $\phi_k$ is a measure of the residual birefringent phase at $\omega_o$. This
form is useful when investigating the role of the birefringent phase on a PMD
spectrum. The trigonometric terms on lines 51-52 provide for a vector of
residual birefringent phases that are added to $\omega\tau_k$, which, if the vector is nonzero, should be interpreted as $(\omega - \omega_o)\tau_k + \phi_k$.
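The equivalence of the two phase conventions can be checked numerically. The sketch below is a hedged Python illustration, not part of the book's code: the function names and the 10.3 ps segment DGD are assumptions chosen for the example, with frequencies in Trad/s and delays in ps so that their product is in radians.

```python
import math

def birefringent_phase(omega, tau_k, omega_o=0.0, phi_k=0.0):
    """Phase (omega - omega_o) * tau_k + phi_k in rad; omega in Trad/s, tau_k in ps."""
    return (omega - omega_o) * tau_k + phi_k

def residual_phase(omega_o, tau_k):
    """Residual birefringent phase at omega_o, folded into [0, 2*pi)."""
    return math.fmod(omega_o * tau_k, 2.0 * math.pi)

omega_o = 2.0 * math.pi * 194.1   # (2*pi) 194.1 THz expressed in Trad/s
tau_k = 10.3                      # ps, illustrative value
phi_k = residual_phase(omega_o, tau_k)

# The trig tables only see the phase modulo 2*pi, so both forms give the same cosine.
for offset in (0.0, 0.05, 0.2):   # Trad/s away from omega_o
    absolute = birefringent_phase(omega_o + offset, tau_k)
    relative = birefringent_phase(omega_o + offset, tau_k, omega_o, phi_k)
    assert abs(math.cos(absolute) - math.cos(relative)) < 1e-8
```

Working with the relative form keeps the arguments of the sine and cosine small and makes $\phi_k$ an explicit knob for studying how the residual phase reshapes a spectrum.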
Finally, the segment PMD vectors $\vec\tau_k$ are calculated in advance:

lines 55-57: $\vec\tau_k = \tau_k\,\hat r_k$

The main frequency loop runs from lines 60-118. For each frequency
the respective coswt and sinwt values are recalled, the $R_k$ operators are
constructed, vectors $\vec\tau$ and $\vec\tau_\omega$ are calculated, and the scalar and vector PMD
spectra are computed and stored. The vectors $\vec\tau$ and $\vec\tau_\omega$ are generated by the
nested loop that runs from lines 82-101; this loop runs the concatenation
equations (8.2.34) on page 337. The inner loop is initialized with

lines 67-68: $\vec\tau(1) = \vec\tau_1$, and $\vec\tau_\omega(1) = 0$

and line 71: $R = I$. Each iteration of the accumulation loop generates the
PMD vectors from

line 96: $\vec\tau(k) = \vec\tau_k + R_k\,\vec\tau(k-1)$
line 99: $\vec\tau_\omega(k) = \vec\tau_k \times \vec\tau(k) + R_k\,\vec\tau_\omega(k-1)$

Note that the running product of line 93: $R(k) = R_k R(k-1)$ is recorded.
This operator is used to find the input PSPs from the output PSPs. In
particular,

lines 104-106: $\vec\tau_s = R^\dagger\,\vec\tau_t$, and $\vec\tau_{s\omega} = R^\dagger\,\vec\tau_{t\omega}$


With these preliminary calculations in place, the vector and scalar PMD
spectra are computed on lines 109-116, following (C.1-C.2).
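The concatenation recursion and the scalar spectra can be sketched for a single frequency in a few lines of Python. This is an illustration under stated assumptions rather than a port of the Matlab fragment: the function names are hypothetical, only the standard library is used, and nothing is vectorized. A two-segment example is convenient because its DGD does not depend on frequency.

```python
import math

def rotation_operator(r_hat, phase):
    """3x3 operator of (C.3) for one segment."""
    c, s = math.cos(phase), math.sin(phase)
    rx, ry, rz = r_hat
    cross_op = ((0.0, -rz, ry), (rz, 0.0, -rx), (-ry, rx, 0.0))
    return [[c * (1.0 if i == j else 0.0) + (1.0 - c) * r_hat[i] * r_hat[j]
             + s * cross_op[i][j] for j in range(3)] for i in range(3)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def pmd_vectors(omega, r_list, tau_list, phz_list):
    """Return (tau, tau_omega) at one frequency; omega in Trad/s, DGDs in ps."""
    seg = [[t * x for x in r] for r, t in zip(r_list, tau_list)]
    tau = list(seg[0])              # tau(1) = tau_1
    tau_w = [0.0, 0.0, 0.0]         # tau_omega(1) = 0
    for k in range(1, len(seg)):
        R_k = rotation_operator(r_list[k], omega * tau_list[k] + phz_list[k])
        # tau(k) = tau_k + R_k tau(k-1)
        tau = [a + b for a, b in zip(seg[k], matvec(R_k, tau))]
        # tau_omega(k) = tau_k x tau(k) + R_k tau_omega(k-1)
        tau_w = [a + b for a, b in zip(cross(seg[k], tau), matvec(R_k, tau_w))]
    return tau, tau_w

def scalar_spectra(tau, tau_w):
    """Return (DGD, magnitude SOPMD, PDCD) following (C.1)."""
    dgd = math.sqrt(sum(x * x for x in tau))
    sopmd = math.sqrt(sum(x * x for x in tau_w))
    pdcd = sum(a * b for a, b in zip(tau, tau_w)) / dgd
    return dgd, sopmd, pdcd

# Two 10 ps segments with perpendicular Stokes axes.
tau, tau_w = pmd_vectors(0.37, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
                         [10.0, 10.0], [0.0, 0.0])
dgd, sopmd, pdcd = scalar_spectra(tau, tau_w)
```

For this two-segment case the DGD stays at $\sqrt{200}$ ps and the PDCD vanishes at every frequency, while the PMD vector itself precesses about the second segment's axis.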
Figures C.1-C.2 are calculated for four equal-length stages using the above
code fragment. The input and output PSP vector spectra are shown as are
the DGD, magnitude SOPMD, and PDCD scalar spectra. The DGD spectra
in Fig. 8.33 on page 362 were calculated in the same way.
Not included in the code but easily added is the calculation of $U(\omega)$. Direct calculation of $U(\omega)$ is ideal due to the difficulty of extracting $U$ from $jU_\omega U^\dagger$.
While calculating $U(\omega)$ one should concurrently calculate $U_\omega(\omega)$ so that $jU_\omega U^\dagger$
can be checked against $\vec\tau\cdot\vec\sigma$, the latter being calculated from concatenation
rules on $\vec\tau$ by $R$ as above. The product rule for $U(\omega)$ is trivial:
$$U(N) = U_N U_{N-1} \cdots U_1 \tag{C.4}$$
The frequency derivative is calculated analytically and accumulated using a
recurrence relation. Matrices $U_k$ and $U_{k\omega}$ are expressed as
$$U_k = I\cos(\omega\tau_k/2) - j\,(\hat r_k\cdot\vec\sigma)\sin(\omega\tau_k/2) \tag{C.5a}$$
$$U_{k\omega} = -\frac{\tau_k}{2}\left(I\sin(\omega\tau_k/2) + j\,(\hat r_k\cdot\vec\sigma)\cos(\omega\tau_k/2)\right) \tag{C.5b}$$
As with $R_k$, $\hat r_k\cdot\vec\sigma$, $\sin(\omega\tau_k/2)$, and $\cos(\omega\tau_k/2)$ can be calculated in advance
of the frequency loop. The recurrence relation for $U_\omega(k)$ is
$$U_\omega(k) = U_{k\omega}\,U(k-1) + U_k\,U_\omega(k-1) \tag{C.6}$$
A quick test to verify that $U$ and $U_\omega$ are correctly calculated is to check
that $jU_\omega U^\dagger$ is Hermitian.
Calculation of $U(\omega)$ is useful, for instance, when calculating the time response of a signal that transits a PMD medium. The polarization transfer
matrices in the frequency and time domains are given by (8.2.41) and (8.2.42) on
page 343.
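The recurrence for the frequency derivative and the Hermitian check can be sketched in Python with plain 2 × 2 complex lists. This is an illustration, not the book's code: the function names are hypothetical, and the Stokes-ordered Pauli matrices below (with $\sigma_1$ diagonal) are an assumption, since the book defines its spin-matrix convention outside this appendix. For a single segment, multiplying (C.5b) by the adjoint of (C.5a) gives $jU_{k\omega}U_k^\dagger = (\tau_k/2)\,\hat r_k\cdot\vec\sigma$, so the eigenvalues of the checked operator are expected to be plus and minus half the DGD.

```python
import math

# Assumed Stokes-ordered Pauli matrices (sigma_1 diagonal); treat as an assumption.
SIGMA = (
    ((1, 0), (0, -1)),
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
)
IDENT = ((1, 0), (0, 1))

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def segment(omega, tau_k, r_hat):
    """Return (U_k, U_k_omega) following the forms of (C.5a) and (C.5b)."""
    c = math.cos(omega * tau_k / 2)
    s = math.sin(omega * tau_k / 2)
    rs = [[sum(r_hat[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2)]
          for i in range(2)]
    u = [[c * IDENT[i][j] - 1j * s * rs[i][j] for j in range(2)]
         for i in range(2)]
    uw = [[-(tau_k / 2) * (s * IDENT[i][j] + 1j * c * rs[i][j])
           for j in range(2)] for i in range(2)]
    return u, uw

def cascade(omega, tau_list, r_list):
    """Accumulate U by the product rule (C.4) and U_omega by the recurrence (C.6)."""
    u, uw = segment(omega, tau_list[0], r_list[0])
    for tau_k, r_hat in zip(tau_list[1:], r_list[1:]):
        uk, ukw = segment(omega, tau_k, r_hat)
        uw = add(mul(ukw, u), mul(uk, uw))   # uses U(k-1) before it is updated
        u = mul(uk, u)
    return u, uw

def pmd_operator(u, uw):
    """Return j * U_omega * U^dagger, which should come out Hermitian."""
    prod = mul(uw, dagger(u))
    return [[1j * prod[i][j] for j in range(2)] for i in range(2)]

u, uw = cascade(0.37, [10.0, 10.0], [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
h = pmd_operator(u, uw)
det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
dgd = 2.0 * math.sqrt(-det.real)   # eigenvalues of h are +/- DGD/2
```

The order of the two updates inside the loop matters: the derivative must be advanced with $U(k-1)$ before $U$ itself is overwritten.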

[Figure C.1: Poincaré-sphere plot (axes S1, S2, S3) of the PSPin and PSPout vector spectra with the center frequency marked, above scalar plots of DGD (ps), SOPMD (ps²), and PDCD (ps²) versus relative frequency (GHz).]

Fig. C.1. Vector and scalar spectra for four birefringent sections:
$\tau = \{10, 10, 10, 10\}$ ps, $\phi = \{0, 45, 45, 0\}^\circ$, $\hat r = \{0, 45, 90, 135\}^\circ \times 1.5$ lying
on the equator. The center frequency $\omega_o$ is indicated on both vector and scalar
plots. The period of the scalar spectra is 100 GHz and the spectra have been shifted
by one-eighth period.

[Figure C.2: Poincaré-sphere plot (axes S1, S2, S3) of the PSPin and PSPout vector spectra with the center frequency marked, above scalar plots of DGD (ps), SOPMD (ps²), and PDCD (ps²) versus relative frequency (GHz).]

Fig. C.2. Vector and scalar spectra for four birefringent sections:
$\tau = \{10, 10, 10, 10\}$ ps, $\phi = \{0, 22.5, 67.5, 0\}^\circ$, $\hat r = \{0, 45, 90, 135\}^\circ \times 1.25$
lying on the equator. The center frequency $\omega_o$ is indicated on both vector and
scalar plots. The differential phase shift of 22.5° in the center sections about the
common phase shift 45° distorts the PMD spectra.

D
Multidimensional Gaussian Deviates

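Consider the gaussian random variable X. The probability density is
$$\rho_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x}\exp\left(-\frac{x^2}{2\sigma_x^2}\right) \tag{D.1}$$
The expectation and variance are
$$\mathrm{E}[X] = 0, \quad\text{and}\quad \mathrm{var}(X) = \sigma_x^2 \tag{D.2}$$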

Consider now a two-dimensional distribution composed of two independent
identically distributed (i.i.d.) gaussian random variables (g.r.v.); denote the
two deviates $X_1$ and $X_2$, and a vector defined as $X = (X_1, X_2)$. While we may
be interested in the distribution of these cartesian components, an alternative
is the distribution of the corresponding polar coordinates. A polar deviate is
defined as $P = (R, \Theta)$. A one-to-one map g relates the two coordinates such
that
$$g(x_1, x_2) = (r, \theta) = \left(\sqrt{x_1^2 + x_2^2},\ \tan^{-1}(x_2/x_1)\right)$$
The inverse map $h = g^{-1}$ given in polar form is
$$h(r, \theta) = (x_1, x_2) = (r\cos\theta,\ r\sin\theta)$$
The joint density of the polar coordinates is related to the joint density of the
cartesian coordinates through the Jacobian:
$$\rho_P(r, \theta) = \rho_X\left(h(r, \theta)\right) J_h \tag{D.3}$$
where
$$J_h = \begin{vmatrix} \partial h_1/\partial r & \partial h_1/\partial\theta \\ \partial h_2/\partial r & \partial h_2/\partial\theta \end{vmatrix}$$

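In the present case, $J_h = r$. The polar joint distribution is therefore
$$\rho_P(r, \theta) = \frac{r}{2\pi\sigma_x^2}\exp\left(-\frac{r^2}{2\sigma_x^2}\right) \tag{D.4}$$
where the argument of the exponential is $x_1^2 + x_2^2 = r^2(\cos^2\theta + \sin^2\theta)$. Now,
the random variables $R$ and $\Theta$ are independent, so the joint distribution is
the product of the two individual distributions. The angular distribution is
uniform over $2\pi$, so the product is written as
$$\rho_\Theta(\theta)\,\rho_R(r) = \left(\frac{1}{2\pi}\right)\frac{r}{\sigma_x^2}\exp\left(-\frac{r^2}{2\sigma_x^2}\right)$$
The resultant radial distribution, known as the Rayleigh distribution, is
$$\rho_R(r) = \frac{r}{\sigma_x^2}\exp\left(-\frac{r^2}{2\sigma_x^2}\right), \quad r \ge 0 \tag{D.5}$$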
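The moments of the Rayleigh distribution are
$$\mathrm{E}\left[\rho_R^n(r)\right] = 2^{n/2}\,\sigma_x^n\,\Gamma\left(1 + \frac{n}{2}\right), \quad n \in Z$$
where Z is the set of integers greater than or equal to zero. Denoting the nth moment
as $\langle r^n\rangle$ and $\mathrm{var}(r) = \sigma_r^2$, the basic Rayleigh distribution parameters are
$$\langle r\rangle = \sqrt{\frac{\pi}{2}}\,\sigma_x, \quad \langle r^2\rangle = 2\sigma_x^2 \tag{D.6a}$$
$$\sigma_r^2 = \left(2 - \frac{\pi}{2}\right)\sigma_x^2 \tag{D.6b}$$
Note in particular the relation between the first and second moments:
$$\langle r^2\rangle = \frac{4}{\pi}\,\langle r\rangle^2 \tag{D.7}$$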
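The Rayleigh relations can be checked both from the closed-form moments and by drawing radii from two gaussian deviates. The Python sketch below is a hedged illustration using only the standard library; the function names, sample count, and seed are assumptions made for the example.

```python
import math
import random

def rayleigh_moment(n, sigma_x):
    """Closed-form nth moment: 2^(n/2) * sigma_x^n * Gamma(1 + n/2)."""
    return 2.0 ** (n / 2.0) * sigma_x ** n * math.gamma(1.0 + n / 2.0)

def sample_radii(count, sigma_x, seed=1):
    """Radii of 2-D vectors whose components are i.i.d. zero-mean gaussians."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma_x), rng.gauss(0.0, sigma_x))
            for _ in range(count)]

sigma_x = 1.5
radii = sample_radii(20000, sigma_x)
mean_r = sum(radii) / len(radii)
mean_r2 = sum(r * r for r in radii) / len(radii)

# Sampled moments should track the closed forms to within sampling error.
ratio_closed_form = rayleigh_moment(2, sigma_x) / rayleigh_moment(1, sigma_x) ** 2
ratio_sampled = mean_r2 / mean_r ** 2
```

The closed-form ratio equals $4/\pi$ independent of $\sigma_x$, which is why the mean and the mean-square radius determine one another for this distribution.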

Next consider the three-dimensional distribution of three i.i.d. gaussian
random variables $X = (X_1, X_2, X_3)$, each with variance $\sigma_x^2$, and its polar
equivalent $P = (R, \Theta, \Phi)$. In the polar coordinate system, $\theta \in [0, \pi]$ is the declination angle from $X_3$ and $\phi \in [-\pi, \pi]$ is the azimuth angle. The polar to
cartesian transformation h is
$$h(r, \theta, \phi) = (x_1, x_2, x_3) = (r\cos\phi\sin\theta,\ r\sin\phi\sin\theta,\ r\cos\theta)$$

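Table D.1. Key Relations for Multivariate Gaussian Distributions

| Distribution | $\rho_R(r)$ | $\langle r\rangle$ | $\langle r^2\rangle$ | var(r) | ratio |
|---|---|---|---|---|---|
| Gaussian (a) | $\frac{1}{\sqrt{2\pi}\,\sigma_x}\exp\left(-\frac{r^2}{2\sigma_x^2}\right)$ | 0 | $\sigma_x^2$ | $\sigma_x^2$ | |
| Rayleigh (b) | $\frac{r}{\sigma_x^2}\exp\left(-\frac{r^2}{2\sigma_x^2}\right)$ | $\sqrt{\frac{\pi}{2}}\,\sigma_x$ | $2\sigma_x^2$ | $\left(2 - \frac{\pi}{2}\right)\sigma_x^2$ | $\langle r^2\rangle = \frac{4}{\pi}\langle r\rangle^2$ |
| Maxwellian (b) | $\sqrt{\frac{2}{\pi}}\,\frac{r^2}{\sigma_x^3}\exp\left(-\frac{r^2}{2\sigma_x^2}\right)$ | $\sqrt{\frac{8}{\pi}}\,\sigma_x$ | $3\sigma_x^2$ | $\left(3 - \frac{8}{\pi}\right)\sigma_x^2$ | $\langle r^2\rangle = \frac{3\pi}{8}\langle r\rangle^2$ |

(a) $r \in (-\infty, \infty)$, (b) $r \in [0, \infty)$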

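The corresponding Jacobian is
$$J_h = r^2\sin\theta$$
The polar joint distribution $\rho_P(R, \Theta, \Phi)$, written as the product of three independent polar random variables, is
$$\rho_\Theta(\theta)\,\rho_\Phi(\phi)\,\rho_R(r) = \left(\frac{\sin\theta}{2}\right)\left(\frac{1}{2\pi}\right)\sqrt{\frac{2}{\pi}}\,\frac{r^2}{\sigma_x^3}\exp\left(-\frac{r^2}{2\sigma_x^2}\right)$$
The resultant radial distribution, known as the Maxwellian distribution, is
$$\rho_R(r) = \sqrt{\frac{2}{\pi}}\,\frac{r^2}{\sigma_x^3}\exp\left(-\frac{r^2}{2\sigma_x^2}\right) \tag{D.8}$$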
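The moments of the Maxwellian distribution are
$$\mathrm{E}\left[\rho_R^n(r)\right] = \frac{2}{\sqrt{\pi}}\,2^{n/2}\,\sigma_x^n\,\Gamma\left(\frac{3 + n}{2}\right)$$
Therefore, the basic parameters of the Maxwellian distribution are
$$\langle r\rangle = \sqrt{\frac{8}{\pi}}\,\sigma_x, \quad \langle r^2\rangle = 3\sigma_x^2 \tag{D.9a}$$
$$\sigma_m^2 = \left(3 - \frac{8}{\pi}\right)\sigma_x^2 \tag{D.9b}$$
The relation between the first and second moments is
$$\langle r^2\rangle = \frac{3\pi}{8}\,\langle r\rangle^2 \tag{D.10}$$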
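The Maxwellian moments can be verified by integrating the radial density directly. The following Python sketch is a hedged illustration with hypothetical function names; it uses a simple midpoint rule truncated at twelve standard deviations, which is an assumption about where the tail becomes negligible rather than anything stated in the text.

```python
import math

def maxwellian_pdf(r, sigma_x):
    """Radial density sqrt(2/pi) * r^2 / sigma_x^3 * exp(-r^2 / (2 sigma_x^2))."""
    return (math.sqrt(2.0 / math.pi) * r * r / sigma_x ** 3
            * math.exp(-r * r / (2.0 * sigma_x ** 2)))

def maxwellian_moment(n, sigma_x):
    """Closed-form nth moment: (2/sqrt(pi)) 2^(n/2) sigma_x^n Gamma((3+n)/2)."""
    return (2.0 / math.sqrt(math.pi) * 2.0 ** (n / 2.0) * sigma_x ** n
            * math.gamma((3.0 + n) / 2.0))

def numeric_moment(n, sigma_x, span_sigmas=12.0, steps=20000):
    """Midpoint-rule estimate of the nth moment over [0, span_sigmas * sigma_x]."""
    dr = span_sigmas * sigma_x / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += r ** n * maxwellian_pdf(r, sigma_x)
    return total * dr

sigma_x = 0.8
norm = numeric_moment(0, sigma_x)        # should be close to 1
mean_r = numeric_moment(1, sigma_x)      # compare with sqrt(8/pi) * sigma_x
mean_r2 = numeric_moment(2, sigma_x)     # compare with 3 * sigma_x^2
ratio = mean_r2 / mean_r ** 2            # compare with 3*pi/8
```

As in the Rayleigh case, the ratio of the second moment to the squared mean does not depend on $\sigma_x$, so a single measured mean fixes the whole distribution.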

[Figure D.1: three stacked panels (Gaussian, Rayleigh, Maxwellian) of probability density versus $r/\sigma_x$, with $\langle r\rangle$, $\sqrt{\langle r^2\rangle}$, and the one- and two-standard-deviation ranges marked.]

Fig. D.1. Probability densities for Gaussian, Rayleigh, and Maxwellian distributions. The gaussian distribution is symmetric about the origin while the Rayleigh
and Maxwellian distributions, associated with the radius of a circle and sphere, respectively, are one-sided with $r \ge 0$. All distributions are completely determined by
the component variance $\sigma_x^2$.

Index

Abbe number 94
-BBO 150, 176
group-index temp. co. 170
material properties 148
temperature compensation 173
ABCD matrices
from q-transformation 224
GRIN lens 231
optimal lens coupling 241
plane-wave limit 227
achromats
Koester 184
MgF/quartz 184
Pancharatnam 186
Shirasaki 189
Amp`eres law 2, 82
anisotropic media 85, 136, 139, see
birefringent media
attractor-precessor method (APM)
436, 446
autocorrelation bandwidth
PMD 410
autocorrelation function see DGD
autocorrelation function, see PMD
vector
connection between ensemble and
frequency averages 408
derivations 411
mean-square DGD 409
PMD vector 409
Becquerel formula
bi-circulator 294
bi-isolator 294

133

Bi:RIG see iron garnet


bianisotropic media 85, 136, see
optical activity
biaxial crystal 105
Biots law 141, 149
birefringent beat length 121, 179, 389
of crystalline quartz 150
of SMF ber 385, 450
465
of YVO4
birefringent crystal
eective index 109, 268
high and low birefringence 147
positive and negative uniaxial 106
Poynting vector direction 109
properties 148
refraction 112
susceptibility tensor 106
temperature dependence 163, 170
walko angle 110
waveplate cut 116, 120
birefringent media see ber birefringence
constitutive relation 107
ordinary and extraordinary axes
107
birefringent phase 72, 120, 130, 179,
329, 467, see residual birefringent
phase
control of 185
Evans phase shifter 464
frequency dependence 182
relation to DGD value 122
relation to PMD spectrum 474

510

Index

temperature compensation 172


temperature dependence 166, 171
birefringent walko see walko angle,
see walko block
compensation 174
crystal cut 116, 202
eective index 117
total internal reection 118
birefringent wedge see prism
BK7 glass 159, 161, 184, 464
material properties 147
bra and ket vectors
duality 40
bracket notation 39
Brewsters angle 100, 102
birefringent separation 118, 204, 275
Brownian motion 394
density function 390
Karhunen-Lo`eve expansion 419
sample paths 392, 421
C-band 144
calcite 274, 278
group-index temp. co. 170
material properties 148
temperature compensation 173
Cayley-Klein unitary matrix 51, 332
characteristic admittance 5
characteristic impedance 5
chiral media 80, 135, 150
constitutive relation 138
optic ber 389, 422
wire model 136
chirality parameter 138
circular polarization 15, 35, 53, 430
and Fresnel rhomb 196
eigenstates 129, 139
circulators see bi-circulator
classication 273
deection type 285
Kaifa type 286
performance specs 294
Shirasaki-Cao type 290
Xie-Huang type 286, 292
displacement type
ladder type 282
quartz-free 283
strict-sense type 281
historical examples 277

polarzation dependent 274


coherency matrix 23, 35
coherent PMD 465
collimator see dual-ber collimator,
see ber-to-ber coupling
assemblies 214
air gap 216
epoxy joint 213
fused joint 217
C-lens example 236
C-lens type 212
comparison chart 219
design goals 213
GRIN-lens example 236
GRIN-lens type 212
pointing direction 217, 230, 237
complete gap 284
component DGD see residual
birefringent phase
circulator 283, 292
delay crystals 172, 177, 456
isolator 258, 263, 266, 268, 270
Kaifa prism 204
Rochon prism 201
Wollaston prism 201
component PDL
circulator 277, 282, 294
isolator 259, 263
o-axis delay crystals 174
confocal parameter 222, 225, 262
conservation of energy 6, 138, see
Poyntings theorem
constitutive relations 3, 85, 90
birefringent 107
chiral media 138
Drude-Born-Fedorov model 138
gyrotropic materials 126
isotropic 94
losslessness 86
optically active media 138
coupling coecient 242
critical angle 101, 118, 198
crystal classes 107
crystalline quartz 147, 179, 455, 464
and MgF2 achromat 184
material properties 149
Curie temperature see iron garnet
current density 2, 7

Index
data folding 443
degree of polarization (DOP) 22, see
repolarization
from coherency matrix 23
from intensity 34
from Stokes parameters 23
PDL surfaces 306
depolarization
connection to partial polarization
23, 31, 324
connection to PDL 306
density conditional on DGD 404
eect on pulse 346, 356
entangled states 344
probability density 402
programmable generation 454
relation to mean-square SOPMD
404
relation to PMD autocorrelation
411
relation to second-order PMD 323
depth of focus 223, 239, 262
DGD see component DGD, see PMD
and impulse response 325, 359
anomalous see PMD and PDL
combined
impulse response verses input
polarization state 354
relationship to birefringent phase
122
DGD autocorrelation function 410
DGD component of PMD
length of PMD vector 314, 330, 333
DGD in ber
diusions limits 398
examples 324, 407
DGD measurement 443, 445
DGD spectrum 321, 443, 461, see
coherent PMD
eect of spectrum 322
DGD statistics see mean ber DGD
Maxwellian density 402
mean-square equation of motion
399
mean-square growth 397
diamagnetic media 123
electron equation of motion 124
susceptivility tensor 125

511

dierential attenuation slope (DAS)


371, 374, 447, 449, see PMD and
PDL combined
dierential-group delay see DGD
diraction angle 223
diusion equation
PDL 422
PMD 399
SOP 393
diusion process 388
dispersion relation 4, 92, 95, 108, 128,
156
Drudes equation 141
Drude-Born-Fedorov model 138
dual-ber collimator 198, 201
divergence angle 238
example 238
for circulator 284, 290
for polarization-beam splitter 285
DWDM channel spacing 143
lter tolerancing 145, 159
ECHO source 454, see coherent PMD
birefringent phase control 465
calibration 464
common and dierential phase
control 475
comparison with JPDF and PMDS
477
design criteria 471
frequency shift of PMD spectrum
475
independent control of 1st and 2nd
order PMD 473
instrument bandwidth 478
eigenstates 46
electric charge density 2
electric dipole moment 80
electric eld 2, 12
electric-ux density 81
continuity condition 83
including media interaction 91
electromagnetic dissipation 7
electromagnetic stored energy 6, 87
electron equation of motion 90, 105,
124
elliptical polarization 15, 35, 127
epoxy
heat cure 213

512

Index

UV cure 215
Euler rotations 71
evanescent eld 102, 196
penetration depth 104
Evans phase shifter 184, 464, 475
evolution equation see diusion
equation
PDL 310, 380
PMD 339, 380
PMD and PDL 377, 380
SOP 339, 380
extinction coecient
from permittivity 92
extraordinary axis see birefringent
media
Fabry-Perot interferometer 154, 163,
215
frequency response 157
temperature dependence 161
Faraday angle 132, see specic
rotation
Faraday rotation 129, 132, 197, 251,
493, see nonreciprocal polarization
rotation
comparison to optical activity 141
operator expression 207
Faraday rotator 189, see Shirasaki
achromat
for circulators 273, 277, 286
for isolators 247, 255, 259
garnet 135, 150
linear 197, 207
Faradays law 2, 82
ferrimagnetic garnet 123, 133, 150, see
iron garnet
ferrule 213, 232, 285, 290
tilt angle 215, 234
ber autocorrelation length 385, 390,
395, 398
in relation to birefringence beat
length 396
ber birefringence 385, 392
chirality 450
length scales 387
no chirality 389
origins 386
random birefringence model 391
random orientation model 389

ber-to-ber coupling 239, 261


optimal coupling 240
rst and second order PMD
independent control of 473
rst-order PMD see component DGD,
see DGD
focus error 244
four-states method
combined PMD and PDL measurement 449
PDL measurement 432
PMD measurement 437, 447
free-spectral range
and group index 158
birefringent 121
Fabry-Perot 158, 165
PMD 336, 366, 460, 478
temperature shift 168
Fresnel rhomb 196
Frigo equation 372, 446
fused silica 183, 218
material properties 147
Gauss electric law 2, 80
Gauss magnetic law 2, 82
gaussian distribution 505
gaussian optics 219
beam waist 222
confocal parameter 222
diraction angle 223
generator function 394, 399, 422
Stratonivich translation 394
Gires-Tournois interferometer 154
frequency response 161
Glan-Taylor prism 204, 274, 278, 280
Glan-Thompson prism 274, 278
Goos-H
anchen displacement 104
Goos-H
anchen phase shift 102
GRIN lens 211, 215, 237
ABCD matrix 231
index gradient constant 232
index prole 230
melt point 218
pitch 232
polish angle 217
group delay
GT interferometer 161
PMD 329
group index 93, 121, 158, 263

Index
in ber 450
temperature dependence 163, 170
group velocity 93, 163
birefringent media 112
gyrotropic angle 127
gyrotropic media
constitutive relation 126
eigenvector orientation 127
nonreciprocal polarization rotation
129
permittivity tensor 122
precession angle 130
half-wave waveplate 179, 184, 254, 276,
279, 286, 434, 455, 464
achromat 184
bandwidth 182
operator expression 207
polarization control 193
Helmholtz equation 3, 91, 388
Hermite coecients 411
Hermitian matrix see Mueller matrix
relation to PMD 332
Hermitian operator 47, see skewHermitian operator
relation to PMD 327
relation to unitary 49
spin-operator form 62
spin-vector form 61
impermeability 86
impermittivity 86, 107, 126
importance sampling 405
indicatrix 110, 116
Poynting vector 112
inner product 41
interferometric (INT) method 436,
439
invar 153
iron garnet 150, 493, see Faraday
rotator
Bi:RIG 151, 248, 254, 279
Curie temperature 124, 152
design goals 151
dual 252
hysteresis 134, 152
latching 134, 153, 284, 290
saturation 134
YIG 151, 254, 276

513

isolators see bi-isolator


deection type 254
isolation 257
PMD 258, 263
ray-trace 256, 267
technology comparison 259
displacement type
isolation 262
PMD 263
ray-trace 261, 265
insertion loss 249
isolation denition 250
lens systems 253
PMD compensated 266
polarization-dependent 247
polarization-independent 254, 259
return loss 258
temperature dependence 252
tolerancing 249
two stage 263
wavelength dependence 251
isomorphism 65
isotropic media
electron equation of motion 90
propagation in 95
reection coecient 98, 99
refractive index 94
susceptibility 91
joint probability distribution of PMD
453, 477
scales with mean ber DGD 405
Jones matrix 18, 45, see Hermitian
matrix, see PMD operator, see
unitary matrix
from Stokes parameters 19
on-axis PDL 302
relation to Mueller matrix 19, 38
spin-matrix form 61
Jones matrix eigenanalysis (JME)
436, 442, see data folding
Hener eigenvalue equation for PMD
442
LabView code 442
step size 444
Jones to Stokes 56
Jones vector 13, 52
from Stokes parameters 18

514

Index

Kaifa circulator 286


Kaifa prism 202, 284, 285
Karhunen-Lo`eve expansion 419
kDB system 87
birefringent materials 108
dening coupled equations 89
Faraday rotation 129
gyrotropic materials 126
isotropic materials 95
optically active media 139
Kolmogorovs backward equation 394
kovar 153
/4, /4 combination 192
/2, /4 combination 194
/4, /2, /4 combination 195
L-band 144
Lagrange multiplier method 433
Langevin process 391
lead molybdate (PbMoO4 ) 149
lens classication 211
lens equation, simple 229
linear polarization 15, 35
eigenstate 110
gyrotropic rotation 136, 141
150, 185, 258, 456, 464
LiNbO3
group-index temp. co. 170
material properties 148
temperature compensation 173, 177
local birefringence vector 329, 339,
392, 397
Lorentz force 90, 125
Lorentz gauge 9
Lyot depolarizer 31, 298
magnesium uoride (MgF2 ) 149, 183
magnetic dipole moment 82, 123
magnetic eld 2, 81, 90
magnetic ux density 82
magnetic material types 123
magnetization density vector 2, 82
magnication error 242
magnication, lens 229, 233, 239, 262
Maxwells Equations 137, 493
complete form 2
in terms of D and B 85
in vacuum 8
interaction with media 84
time-harmonic form 11

Maxwellian distribution
derivation of 506
Maxwellian distribution of DGD 400,
417
Maxwellian distribution of PDL 423
mean ber DGD
connection to waveplate model 417
def as statistical unit 400
measurement 436, see interferometric (INT) method, see
wavelength-scanning (WS)
method
measurement uncertainty 409, 414
relation to PMD vector measurement
444
mean outage duration 478
mean outage rate 478
mean-reverting process see Langevin
process
mean-square DGD
autocorrelation function 414
relation to mean ber DGD 401
relation to pulse broadening 363
mean-square SOPMD
relation to mean ber DGD 401
modulation phase-shift (MPS) method
436, 447
for combined PMD and PDL 449
Mueller matrix 23, 37, 449, see
four-states method
and trace of Hermitian 298
comparison between unitary and
Hermitian 19, 38
comparison between unitary and
traceless Hermitian 327
from Jones matrix 18, 66
on-axis PDL 304
PDL measurement 432
polarimeter 431
Mueller matrix method (MMM) 436,
444
PDL tolerance 437
natural light 22
nonreciprocal polarization rotation
122, 131, 150, 247
numerical aperture (N.A.) 211, 223,
292

Index
O(3) group 65, 327
o-axis delay
eective index 177
operators 44, 76
PMD 330
rotation
Jones form 67
Stokes form 68
optical activity see chiral media
bi-isotropic 138
comparison to Faraday rotation 141
polarization rotation 140
reciprocal and nonreciprocal 139
optical power 228, 230
optically active material 85, 135, see
tellurium dioxide (TeO2 )
optically active rotator 189, 278, 282,
372
ordinary axis see birefringent media
orthogonal polarization states 59
and PDL 302
dierential-group delay 326
orthonormal basis 43
outer product 42
P.A.M. Dirac 40
Pancharatnam achromat 186
paraxial wave equation 220
partial dierential equation
connection to SDE 394
partial polarization
coherent light 24
incoherent light 28
natural light 22
pseudo-depolarization 31
Pasteur chirality parameter 139
Pauli spin matrices 54
Pauli spin operators 61
decomposition 61
exponential form 62
Pauli spin vector see spin vector
PDCD component
density conditional on DGD 404
probability density 402
relation to mean-square SOPMD
404
PDL 297, see component PDL, see
cumulative PDL vector, see
repolarization

515

depolarized transmission 309


equation of motion 310
polarization transformation 304
polarization-state pulling 305, 310,
373
separation from PMD 378
symbol denitions 301
transmission coecient 301
transmission surfaces 303
PDL diusion 422
PDL measurement
four-states method 432
maximum discrepancy 435
six-states method 435
PDL operator 300
PDL statistics
probability density
Maxwellian approximation 424
precise 423
stochastic dierential equation 422
PDL value
connection between local and
cumulative 302
in terms of cumulative PDL value
302
in terms of Mueller entries 433
in terms of transmission 299
PDL vector
cumulative 301, 308
equation of motion 310, 377, 380
examples 311
local 300
permeability 86, 91
free-space 2
permittivity 86, 388
free-space 2
from susceptibility 91
phase velocity 4, 93, 101, 120
in kDB 89, 95, 128, 129, 139
phase-matching condition
birefringent media 113, 119
isotropic media 96, 101
plane wave 4, 220, 227
polarization 12
time-harmonic form 11
vector form 12, 92
PMD 297, see DGD, see mean ber
DGD, see PSP

516

Index

comparison to PMD and PDL


combined 377
frequency SOP evolution 319
historical development 312
how hard can it be? 312
in relation to polarization transformation 319
is not DGD 346
physical denition 314
separation from PDL 378
single section 315
spectral decomposition 320
two sections 320, 336, 459
PMD and PDL combined 371, see
dierential attenuation slope
(DAS), see separation of PMD
and PDL
anomalous pulse spreading 371
change in polarization state 373
equation of motion 377
Frigo equation 372
non-orthogonal PSPs 374
non-rigid precession 446
operator eigenvalue equation 374
spin-vector operator 376
PMD concatenation rules
and Fourier analysis 365
rst-order 337
including waveplates 468
second-order 337
PMD diusion
component probability density 402
Maxwellian density 402
PMD emulator 451
PMD evolution
equation of motion 380, 397
examples 336, 338, 341
PMD Fourier content 364, see
coherent PMD
examples 367, 466, 467
generator function 370
phase shift due to mode mixing 369
PMD impulse response 325, 358
connection with rms DGD 361
example 362
pulse broadening 363
PMD measurement see data folding, see mean ber DGD, see
separation of PMD and PDL

classication 437
PDL tolerance 436
PMD operator
eigenvalue equation 329
spin-vector form 330
traceless Hermitian 327
PMD pulse distortion
distortion
rst-order 345
moments analysis 352
second-order 347, 350, 351
vs. launch state 357
eye closure 363
eld verse intensity response 440
inter-impulse interference 349
polarization transfer function 343
PMD source 451
multi-state source
calibration 456
control 459
precision servo motors 455
temperature compensation 456
wavelength-at states 458
PMD spectrum
decomposition 320
ecient calculation of 497
examples 324, 407, 474
frequency shift 475
PMD statistics 402
waveplate model 417
PMD vector see PMD concatenation
rules
as Stokes vector 321
autocorrelation function 409, 413
cartisian components 332
comparison between length and
frequency increment 326
connection to PMD operator 330
governing eigenvalue equation 380
polarization precession in frequency
319
relation to DGD 319
relation to PSP 319
relation to unitary operator 333
statistical moments 402
stochastic dierential equation 399
PMD vector measurement see
attractor-precessor method

Index
(APM), see Jones matrix eigenanalysis (JME), see modulation
phase-shift (MPS) method, see
Mueller matrix method (MMM),
see Poincare sphere analysis
(PSA)
Poincare sphere
from Stokes parameters 20
Poincare sphere analysis (PSA) 436,
446
polarimeter 430
ber-grating type 432
for PDL measurement 432
for PMD measurement 443
polarization beam splitter
prism comparison 285
polarization control 191, see fourstates method
arbitrary-to-arbitrary 192, 195
electro-optic 313
linear-to-arbitrary 194
polarization decorrelation length 392
connection with ber autocorrelation
length 395
local and xed frame 396
polarization density vector 2, 80
birefringent media 105
chiral media 137
gyrotropic media 125
resonance expression 91
polarization diusion
short-range anisotropy 397
stochastic dierential equation 393
polarization ellipse
elliptical equation 13
polarization retarders see Fresnel
rhomb, see waveplate
polarization state 39, see circular
polarization, see elliptical polarization, see linear polarization, see
orthogonal polarization states
convension for this text 13, 53
measurement of 430
polarization transfer function 343
polarization vector see Jones vector,
see Stokes vector
polarization-dependent chromatic dispersion component see PDCD
component

517

polarization-dependent loss see PDL


polarization-dependent optical
frequency-domain reectometry
(P-OFDR) 450
polarization-dependent optical timedomain reectometry (P-OTDR)
450
polarization-mode dispersion see
PMD
polarization-mode dispersion compensator 313
polarization-state evolution see
diusion equation, see evolution
equation
polarization-state measurement see
polarimeter
polarization-state speed change 429
polarizing wedge 117
Poynting vector 6, 140
birefringent media 109, 114
gyrotropic media 128
isotropic media 95
relation to indicatrix 112
time averaged 8, 12
time-harmonic form 11
walk-o compensation 175
Poyntings theorem 6, 86, see Poynting
vector
time-harmonic form 11
precession
about birefringent vector 69, 121,
131, 140, 316
about PMD vector 319, 331
birefringent and PMD comparison
326, 339
equation of motion 70
Evans phase shifter 184
non-rigid precession (PMD+PDL)
446
precession angle see birefringent phase
principal axis
Evans 184
Pancharatnam 186
principal state of polarization see PSP
prism see Kaifa prism, see Rochon
prism, see Shirasaki prism, see
Wollaston prism
birefringent 199
isotropic 198

518

Index

programmable PMD source see


ECHO source, see PMD source
projection matrix 16, 52
projectors 42
spin-vector form 56
pseudo-depolarization 31
PSP see PMD
calculation 497
comparison to one-stage eigen-system
326
effect of spectrum 322
evolution 340
fiber spectrum 324
four-section spectrum 473
non-orthogonal see PMD and PDL
combined
non-orthogonal overlap 375
pointing direction of PMD vector
330
stationary polarization transformation 318
two-section spectrum 320, 336, 443,
461
PSP spectrum 321
q-transformation 224, see ABCD
matrices
quarter-wave waveplate 179, 184, 189,
430, 434, 440, 464
operator expressions 207
polarization control 193
rr matrix 70
r matrix 70
Rayleigh distribution 392, 417
derivation of 505
Rayleigh length see confocal parameter
receiver map 479
reciprocal polarization rotation 141
reflection coefficient 96
refractive index see Sellmeier equation
from permittivity 92
resonant model 93
relative permittivity 91
repolarization 424
surfaces 307
residual birefringent phase 341, 465,
500
Rochon prism 200, 273, 290
modified 201
rotary power
Faraday rotation 133
optical activity 141
rotation matrix
matrix form 67
rotation operator see PMD concatenation rules
connection to PMD vector 332
connection to Stokes rotation 64,
207
rutile (TiO2) 147, 149, 206, 259, 279
samarium-cobalt magnet 150, 496
scalar potential 9, 224, 493
scattering matrix
partially reflecting mirror 154
second-order PMD
calculation 497
concatenation 337
decomposition 324
evolution equation 380
examples 407
generation 454, 459, 470
joint-probability density with DGD
406
probability density 402
pulse distortion 346, 356
pulse spreading 358
refs to Jones matrix form 364
relation between depolarization and
PDCD densities 404
relation to PMD vector 323
simple impulse response 349
Sellmeier equation 94, 141
separation of PMD and PDL 437, 442,
446, 449
Jones theorem 378
Shirasaki achromat 189
Shirasaki circulator 279
Shirasaki prism 204, 279
Shirasaki-Cao circulator 290
similarity transform 50, 327
skew-Hermitian operator 61, 372
SMF-28
birefringent beat length 385
effective index 215
mode-field diameter 223
N.A. 211
Snell's law 96, 115, 233
specific rotation 135, 151
sensitivity 248
spectral coverage 146, 182
speed of light 4
spin vector 55
identities 57
matrix form 61
PMD operator 330
spun fiber 386, 389, 450
stochastic differential equation 388
Ito and Stratonovich forms 393
Stokes parameters
relation to ellipse 17, 430
Stokes to Jones 56
Stokes transformation
unitary 65
Stokes vector 17, 35, 432, 442
from coherency matrix 23
from Jones vector 56
of pseudo-depolarizer 32
orthogonal states 59
strong mode coupling 398
SU(2) group 51
susceptibility 107, 125, 130
linear relation between P and E 91
332
TE wave
reflection and transmission 97
Tellegen parameter 139
tellurium dioxide (TeO2) 149, 150
temperature compensation
crystal combinations 173
of birefringent phase 170, 458
of compound crystal 177
temperature dependence
measurement of group index 163
quadratic model 166
thermally expanded-core fiber 280
tilt error 243
TM wave
reflection and transmission 99
total internal reflection 101
asymmetric 119
differential cutoff (birefringent) 118
retarder 196
Shirasaki prism 204
total outage probability 478
transformation matrix 18, 318, 378
Fabry-Perot 155
from scattering matrix 155
transmission coefficient 96, 155
uniaxial crystal 106
propagation in 109
unitary matrix see Mueller matrix
calculation 501
Cayley-Klein form 51
general form 51
unitary operator 48
connection to Hermitian operator
49
spin-operator form 68
spin-vector form 68
unitary transform see Mueller matrix,
see similarity transform
Jones-Stokes equivalence 64, 331
unspun fiber
fiber autocorrelation length 385
zero chirality 389, 450
vector potential 9
gaussian optics 220
Verdet constant 133
walkoff angle 110, 204, 259, 262, 268
maximum 117
walkoff block 119
for circulators 279, 281, 290
for isolators 259, 266
wavelength 4
in media 93
wavelength-division multiplexed grid
143
wavelength-scanning (WS) method
436, 438
relationship to INT method 440
wavenumber 4, 93, 110, 156, 220, 225
birefringent 120
gyrotropic 129
optically active 140
waveplate 120, 179, 207, see half-wave
waveplate, see quarter-wave
waveplate
combinations 184
extinction ratio 183, 457
frequency dependence 181
polarization control 191
technologies 182
waveplate model of PMD fiber 417
weak mode coupling 398
Wollaston prism 199, 255, 273, 285
modified 201
Xie-Huang circulator 286, 292
YIG see iron garnet
YVO4 147, 165, 176, 185, 201, 206,
258, 262, 456, 464
group-index temp. co. 166, 170
material properties 148
temperature compensation 173

Springer Series in

optical sciences
Volume 1
1 Solid-State Laser Engineering
By W. Koechner, 5th revised and updated ed. 1999, 472 figs., 55 tabs., XII, 746 pages

Published titles since volume 80


80 Optical Properties of Photonic Crystals
By K. Sakoda, 2nd ed., 2004, 107 figs., 29 tabs., XIV, 255 pages
81 Photonic Analog-to-Digital Conversion
By B.L. Shoop, 2001, 259 figs., 11 tabs., XIV, 330 pages
82 Spatial Solitons
By S. Trillo, W.E. Torruellas (Eds.), 2001, 194 figs., 7 tabs., XX, 454 pages
83 Nonimaging Fresnel Lenses
Design and Performance of Solar Concentrators
By R. Leutz, A. Suzuki, 2001, 139 figs., 44 tabs., XII, 272 pages
84 Nano-Optics
By S. Kawata, M. Ohtsu, M. Irie (Eds.), 2002, 258 figs., 2 tabs., XVI, 321 pages
85 Sensing with Terahertz Radiation
By D. Mittleman (Ed.), 2003, 207 figs., 14 tabs., XVI, 337 pages
86 Progress in Nano-Electro-Optics I
Basics and Theory of Near-Field Optics
By M. Ohtsu (Ed.), 2003, 118 figs., XIV, 161 pages
87 Optical Imaging and Microscopy
Techniques and Advanced Systems
By P. Torok, F.-J. Kao (Eds.), 2003, 260 figs., XVII, 395 pages
88 Optical Interference Coatings
By N. Kaiser, H.K. Pulker (Eds.), 2003, 203 figs., 50 tabs., XVI, 504 pages
89 Progress in Nano-Electro-Optics II
Novel Devices and Atom Manipulation
By M. Ohtsu (Ed.), 2003, 115 figs., XIII, 188 pages
90/1 Raman Amplifiers for Telecommunications 1
Physical Principles
By M.N. Islam (Ed.), 2004, 488 figs., XXVIII, 328 pages
90/2 Raman Amplifiers for Telecommunications 2
Sub-Systems and Systems
By M.N. Islam (Ed.), 2004, 278 figs., XXVIII, 420 pages
91 Optical Super Resolution
By Z. Zalevsky, D. Mendlovic, 2004, 164 figs., XVIII, 232 pages
92 UV-Visible Reflection Spectroscopy of Liquids
By J.A. Raty, K.-E. Peiponen, T. Asakura, 2004, 131 figs., XII, 219 pages
93 Fundamentals of Semiconductor Lasers
By T. Numai, 2004, 166 figs., XII, 264 pages

94 Photonic Crystals
Physics, Fabrication and Applications
By K. Inoue, K. Ohtaka (Eds.), 2004, 209 figs., XV, 320 pages
95 Ultrafast Optics IV
Selected Contributions to the 4th International Conference
on Ultrafast Optics, Vienna, Austria
By F. Krausz, G. Korn, P. Corkum, I.A. Walmsley (Eds.), 2004, 281 figs., XIV, 506 pages
96 Progress in Nano-Electro-Optics III
Industrial Applications and Dynamics of the Nano-Optical System
By M. Ohtsu (Ed.), 2004, 186 figs., 8 tabs., XIV, 224 pages
97 Microoptics
From Technology to Applications
By J. Jahns, K.-H. Brenner, 2004, 303 figs., XI, 335 pages
98 X-Ray Optics
High-Energy-Resolution Applications
By Y. Shvydko, 2004, 181 figs., XIV, 404 pages
99 Few-Cycle Photonics and Optical Scanning Tunneling Microscopy
Route to Femtosecond Ångstrom Technology
By M. Yamashita, H. Shigekawa, R. Morita (Eds.), 2005, 241 figs., XX, 393 pages
100 Quantum Interference and Coherence
Theory and Experiments
By Z. Ficek and S. Swain, 2005, 178 figs., approx. 432 pages
101 Polarization Optics in Telecommunications
By J. Damask, 2005, 110 figs., XVI, 528 pages
102 Lidar
Range-Resolved Optical Remote Sensing of the Atmosphere
By C. Weitkamp (Ed.), 161 figs., approx. 416 pages
103 Optical Fiber Fusion Splicing
By A. D. Yablon, 2005, 100 figs., approx. 300 pages
104 Optoelectronics of Molecules and Polymers
By A. Moliton, 2005, 200 figs., approx. 460 pages
105 Solid-State Random Lasers
By M. Noginov, 2005, 149 figs., approx. 380 pages
106 Coherent Sources of XUV Radiation
Soft X-Ray Lasers and High-Order Harmonic Generation
By P. Jaegle, 2005, 150 figs., approx. 264 pages
107 Optical Frequency-Modulated Continuous-Wave (FMCW) Interferometry
By J. Zheng, 2005, 137 figs., approx. 250 pages
108 Laser Resonators and Beam Propagation
Fundamentals, Advanced Concepts and Applications
By N. Hodgson and H. Weber, 2005, 497 figs., approx. 790 pages
