Olink Data Normalization White Paper v2.0


White paper

Data normalization and standardization
Introduction

With Olink's Proximity Extension Assay (PEA) technique, as used in the Olink® Target 96/48 protein biomarker panels, real-time quantitative PCR (qPCR) is used in the readout step to measure relative changes in protein expression. The qPCR detects the unique DNA sequence formed when complementary oligonucleotide-tags attached to pairs of analyte-specific antibodies hybridize and extend in the presence of DNA polymerase.

Olink translates the Ct values from the qPCR into the relative quantification unit, Normalized Protein eXpression (NPX), using a series of computations. These operations are designed to minimize technical variation and improve interpretability of the results.

Olink has also developed a PEA platform with NGS readout (Olink® Explore). While the NPX calculations are different between these two readout methods, the evaluation of the data is performed in the same basic way. In this white paper, however, we will focus on normalization of data generated via qPCR readout only.

The analysis of the data can be affected by a number of technical factors. To account for this, Olink uses a quality control (QC) system that monitors the performance of assays and samples, followed by appropriate normalization that alleviates systematic noise caused by sample processing or technical variation.

Olink's built-in QC system

Olink has developed a built-in QC system, using internal controls, for its multiplex biomarker panels. This system enables full control over the technical performance of assays and samples.

Internal controls

The QC system consists of four internal controls that are spiked into every sample and are designed to monitor the three main steps of the Olink protocol: immunoreaction, extension and amplification/detection (Figure 1).

Incubation controls: Incubation Control 1 and 2 are two different non-human antigens measured with PEA. These controls monitor potential technical variation in all three steps of the reaction.

Extension control: The Extension Control is composed of an antibody coupled to a unique pair of DNA-tags. These DNA-tags are always in proximity, so that this control is expected to give a constant signal independently of the immunoreaction. This control monitors variation in the extension and amplification/detection step and is used to adjust the signal from each sample with respect to extension and amplification.

Detection control: The Detection Control is a complete double stranded DNA amplicon which does not require any proximity binding or extension step to generate a signal. This control monitors the amplification/detection step.

Fig 1. Main steps and controls in a PEA assay with qPCR readout.
[Figure: the three steps of the assay and the controls that monitor them.
1. Immuno reaction: allow the 92 antibody probe pairs to bind to their respective proteins in your samples.
2. Extension and pre-amplification: extend and pre-amplify 92 unique DNA reporter sequences by proximity extension.
3. Amplification and detection: quantify each biomarker's DNA reporter using high throughput real-time qPCR.
Controls shown: immuno/incubation controls, extension controls, detection control.]
Sample controls

There are six required and two recommended external controls that are added to separate wells on the plate.

• Inter-plate controls
The Inter-plate Control (IPC) is included in triplicate on each plate, and these are run as normal samples. The IPC is a pool of 92 antibodies, each with one pair of unique DNA-tags positioned in fixed proximity. It can be seen as a synthetic sample, expected to give a high signal for all assays. The median of the IPC triplicates is used to normalize each assay, to compensate for potential variation between runs and plates.

• Negative controls
The Negative Control is also included in triplicate on each plate and consists of buffer run as a normal sample. These are used to monitor any background noise generated when DNA-tags come in close proximity without prior binding to the appropriate protein. The negative controls set the background levels for each protein assay and are used to calculate the limit of detection.

• Sample controls
When running more than 90 samples, it is recommended to run a pooled plasma sample in duplicate. These are used to assess potential variation between runs and plates, for example to calculate inter-assay and intra-assay CV.

Fig 2. Recommended control setup for a PEA run.

Sample QC

Each of the internal controls is spiked into the samples at the same concentration. The signals for these are therefore expected to be the same over the plate. Sample QC is performed using the Detection Control and Incubation Control 2, on the initially calculated NPX values. Within each plate, the levels of these controls are monitored for each sample and compared against the median of all samples. If either of the controls differs from the plate median by more than the acceptance criterion of ±0.3 NPX, the sample is considered deviant and fails QC. Deviating values for the internal controls can be caused by factors such as errors in pipetting or pre-analytical variations in the samples that affect the performance, for example matrix effects.

TERMINOLOGY
Matrix effects are defined as changes in the analytical readout that can be caused by all other sample components except the specific analyte to be quantified.

Plate QC

The internal controls are also used in plate QC. This assesses the variation over the plate for each of Incubation Control 1 and 2 and the Detection Control. If the variation for one of the controls is too large, the entire plate is considered unreliable.

Normalization using inter-plate controls

Calculation of NPX

Olink uses an arbitrary, relative quantification unit called Normalized Protein eXpression (NPX). In qPCR, the x-axis value of the point where the reaction curve intersects the threshold line is called the Ct, or "threshold cycle." This indicates the number of cycles needed for the signal to surpass the fluorescent signal threshold line. NPX is derived from the Ct values obtained from the qPCR using the following equations:

Extension Control:
$\mathrm{Ct}_{\text{Analyte}} - \mathrm{Ct}_{\text{Extension Control}} = \mathrm{dCt}_{\text{Analyte}}$

Inter-plate Control:
$\mathrm{dCt}_{\text{Analyte}} - \mathrm{dCt}_{\text{Inter-plate Control}} = \mathrm{ddCt}_{\text{Analyte}}$

Adjustment against a correction factor:
$\text{Correction factor} - \mathrm{ddCt}_{\text{Analyte}} = \mathrm{NPX}_{\text{Analyte}}$

TERMINOLOGY
The correction factor is calculated by Olink during the validation of the panels. The value is pre-determined, using Negative Control, and used to invert the scale so that a higher value corresponds to a higher signal. It is also used to ensure that background levels are approximately zero.

NPX is a relative quantification unit logarithmically related to protein concentration. Even if two different proteins have the same NPX values, their absolute concentrations may differ. NPX should be compared for each assay separately between samples within a run. NPX should not be compared between runs without proper inter-plate normalization due to the risk of falsely interpreting shifts in median between runs as a biological difference. However, relative differences in NPX can be compared more easily, often without additional normalization steps.
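
As an illustration of the three NPX equations above, the following Python sketch converts a single analyte Ct value to NPX. The function name, the argument names and all numbers are hypothetical. In particular, the correction factor is determined by Olink per assay during validation and is not given in this paper, and taking the median of the per-well dCt values of the IPC triplicates is an assumption about how the IPC median is applied.

```python
from statistics import median


def npx_from_ct(ct_analyte, ct_ext_sample, ct_analyte_ipc, ct_ext_ipc, correction_factor):
    """Convert one analyte Ct value to NPX (illustrative sketch only).

    ct_analyte_ipc and ct_ext_ipc hold the Ct values of the analyte and of the
    Extension Control in each of the IPC triplicate wells on the same plate.
    """
    # Extension Control adjustment for the sample: Ct_Analyte - Ct_ExtensionControl.
    dct_analyte = ct_analyte - ct_ext_sample
    # The same adjustment for each IPC well, summarized by the median (assumption).
    dct_ipc = median(a - e for a, e in zip(ct_analyte_ipc, ct_ext_ipc))
    # Inter-plate Control normalization: dCt_Analyte - dCt_IPC.
    ddct = dct_analyte - dct_ipc
    # Invert the scale so that a higher NPX corresponds to a higher signal.
    return correction_factor - ddct


# Made-up Ct values and a made-up correction factor of 12.0.
example_npx = npx_from_ct(
    ct_analyte=20.0,
    ct_ext_sample=12.0,
    ct_analyte_ipc=[10.0, 10.2, 9.8],
    ct_ext_ipc=[12.0, 12.1, 11.9],
    correction_factor=12.0,
)
print(round(example_npx, 2))
```

With these invented inputs, dCt is 8.0 for the sample and about -2.0 for the IPC, so ddCt is about 10.0 and the resulting NPX is about 2.0.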

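
The sample QC rule described earlier (the Detection Control and Incubation Control 2 of each sample must lie within ±0.3 NPX of the plate median) can be sketched in Python as follows. The data layout, the function name and the example values are assumptions made for illustration; only the two controls and the ±0.3 NPX limit come from the text.

```python
from statistics import median

QC_LIMIT_NPX = 0.3  # acceptance criterion quoted in the text


def failing_samples(control_npx, limit=QC_LIMIT_NPX):
    """Return the sorted IDs of samples that fail QC on one plate.

    control_npx maps a control name to {sample_id: NPX of that control}.
    """
    failed = set()
    for per_sample in control_npx.values():
        plate_median = median(per_sample.values())
        for sample_id, value in per_sample.items():
            # A sample fails if either control deviates by more than the limit.
            if abs(value - plate_median) > limit:
                failed.add(sample_id)
    return sorted(failed)


# Made-up control NPX values for four samples on one plate.
plate = {
    "Detection Control": {"S1": 5.00, "S2": 5.10, "S3": 4.95, "S4": 5.50},
    "Incubation Control 2": {"S1": 3.00, "S2": 2.60, "S3": 3.05, "S4": 3.10},
}
print(failing_samples(plate))
```

In this invented plate, S4 deviates on the Detection Control and S2 on Incubation Control 2, so both would be flagged.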
Example curves

Raw data

Below is an example of raw data for three runs with a calibrator curve. There are deviating samples in the middle, and one curve has a parallel shift.

Fig 3. Raw data for three runs.
[Plot: Ct versus samples for Experiments 1, 2 and 3. The sample axis shows calibrator concentrations from 0.15 pg/mL to 600 ng/mL, followed by the IPC.]

Adjusted against Extension Control

The raw data is then adjusted against the Extension Control per sample to improve intra-assay repeatability by reducing technical variation introduced in the extension step.

$\mathrm{Ct}_{\text{Analyte}} - \mathrm{Ct}_{\text{Extension Control}} = \mathrm{dCt}_{\text{Analyte}}$

Fig 4. Adjusted against Extension Control.
[Plot: dCt versus samples for Experiments 1, 2 and 3, same sample axis as Fig 3.]

Normalized against IPCs

To improve inter-assay repeatability, the data is then normalized against the IPCs. This is done per assay and plate and may be followed by intensity or reference sample normalization depending on the study characteristics.

$\mathrm{dCt}_{\text{Analyte}} - \mathrm{dCt}_{\text{Inter-plate Control}} = \mathrm{ddCt}_{\text{Analyte}}$

Fig 5. Adjusted against IPCs.
[Plot: ddCt versus samples for Experiments 1, 2 and 3, same sample axis as Fig 3.]

Adjusted against correction factor

The curve is then inverted so that a high NPX value means high protein concentration, which allows more intuitive interpretation. A difference of 1 NPX approximates to a doubling of the protein concentration regardless of protein.

$\text{Correction factor} - \mathrm{ddCt}_{\text{Analyte}} = \mathrm{NPX}_{\text{Analyte}}$

Fig 6. Adjusted against correction factor.
[Plot: NPX versus samples for Experiments 1, 2 and 3, same sample axis as Fig 3.]
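
Since a difference of 1 NPX approximates to a doubling of the protein concentration, an NPX difference can be read as an approximate fold change. The short Python sketch below assumes that NPX behaves as a log2 scale, which is what the doubling rule implies. The function name is hypothetical, and the comparison is only meaningful for the same assay in properly normalized data.

```python
def fold_change(npx_a, npx_b):
    """Approximate concentration ratio of sample A to sample B for one assay."""
    # Each NPX unit corresponds to roughly a factor of two in concentration.
    return 2.0 ** (npx_a - npx_b)


print(fold_change(7.0, 5.0))  # two NPX units higher: roughly a fourfold increase
print(fold_change(5.0, 6.0))  # one NPX unit lower: roughly half
```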

Inter-plate normalization methods

Olink recommends one of two normalization methods depending on the study design. For randomized studies, IPC normalization should be replaced with intensity normalization, which will increase statistical power and reduce technical variation between plates and projects. Intensity normalization can also be used when combining studies from different sources as long as the sample distribution can be considered comparable between the data sets.

For non-randomized studies, Olink recommends reference sample normalization. Running reference samples on all plates is a good strategy to minimize technical variation. When applied correctly, both intensity normalization and reference sample normalization can increase the power in a given study by reducing technical variation, since they are based on real samples in contrast to the IPC samples.

TERMINOLOGY
Randomization in this context applies to the sample placement across the plates. A sample set is randomized if the relevant experimental variables can be considered evenly distributed across plates. The variables to consider must be decided for each study. They can for example include study groups, treatment, time points or demographics. If the randomization is not appropriate with regard to experimental variables, the normalization might remove true biological variation that otherwise could have been identified.

In studies where samples are randomized across plates, a global adjustment is used to centre the values for each assay around its median and across all plates. This is called intensity normalization. If randomization is not feasible or cannot be guaranteed, Olink recommends including reference samples that are representative of the cohort. For example, pooled plasma samples can be included on all plates to ensure maximum control over any systematic biases. This method also allows for improved handling of technical variation when combining studies.

TERMINOLOGY
Bias can be defined as a signal attributed to experimental, biological or technical aspects that causes a systematic error. Measured protein expression levels can be affected by several sources of bias including pre-analytical and technical variation, but also incorrect sampling. Normalization can reduce introduced biases if properly implemented. Advice on sample handling and processing is provided in Olink's white paper Pre-analytical variation in protein biomarker research (www.olink.com/downloads).

Intensity normalization method

The assumption behind intensity normalization is that on average, there is no expected difference between the median signal for an assay on one plate compared to another. If any such difference is seen between plates, it can be construed as technical bias and be safely removed, resulting in more comparable values.

Olink uses intensity normalization, with the median as the normalization factor, when samples are randomized across plates and projects. The intensity normalization adjusts the data so that the median value for an assay on each plate is equal to the overall median across all plates. This method assumes that the actual median of each plate is the same. The way to ensure this is to randomize the samples beforehand. If complete randomization can be assumed, this is a robust and high-performing normalization method. Intensity normalization is performed in the following way:

Step  Description
1     For each assay, calculate the overall median value for all samples and plates.
2     For each plate and assay, calculate the plate specific median value.
3     For each assay, subtract the plate specific median from every value on that plate (equals centralizing to median 0).
4     For each assay, add the overall median value (equals centralizing to the overall median).

Reference sample normalization method

When samples are not randomized across plates, the inclusion of eight or more bridging reference samples on each plate can be used for normalization. Reference sample normalization is performed in the following way:

Step  Description
1     Choose a reference plate to normalize towards.
2     For each assay and plate, calculate the pairwise difference for each of the overlapping samples with the reference plate.
3     Estimate the plate- and assay-specific normalization factor by calculating the median for the pairwise differences calculated in step 2.
4     For each assay and plate, add the plate- and assay-specific normalization factor from step 3 to each value, to normalize it to the reference plate chosen in step 1.
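
The four reference sample normalization steps above can be sketched in Python for a single assay as shown below. The data layout and all names are assumptions made for illustration. The pairwise difference is taken as reference plate minus current plate, which is the direction implied by step 4 adding the factor. The example uses only three bridging samples to stay short, whereas the text recommends eight or more.

```python
from statistics import median


def reference_normalize(plates, reference_plate, bridge_ids):
    """Normalize one assay's NPX values to a chosen reference plate.

    plates maps plate name -> {sample_id: NPX}; every plate must contain
    the bridging reference samples listed in bridge_ids.
    """
    ref_values = plates[reference_plate]  # step 1: the chosen reference plate
    normalized = {}
    for plate, values in plates.items():
        # Step 2: pairwise differences for the overlapping samples,
        # taken as reference plate minus current plate.
        diffs = [ref_values[s] - values[s] for s in bridge_ids]
        # Step 3: plate- and assay-specific normalization factor.
        factor = median(diffs)
        # Step 4: add the factor to every value on the plate.
        normalized[plate] = {s: v + factor for s, v in values.items()}
    return normalized


# Made-up NPX values: plate P2 reads 0.5 NPX higher than P1 on the bridging samples.
plates = {
    "P1": {"R1": 4.0, "R2": 5.0, "R3": 6.0, "A": 4.5},
    "P2": {"R1": 4.5, "R2": 5.5, "R3": 6.5, "B": 7.0},
}
result = reference_normalize(plates, "P1", ["R1", "R2", "R3"])
print(result["P2"])
```

The reference plate gets a factor of zero and is left unchanged, while every value on P2 is shifted down by 0.5 NPX.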

Evaluation
To evaluate how effective a normalization method is, Olink investigates the NPX distributions and
compares the average %CV for different normalization methods.
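
As a rough illustration of a %CV comparison, the Python sketch below computes the coefficient of variation for replicate measurements of one assay and averages it over assays. The paper does not state how %CV is derived from NPX. Converting NPX back to a linear scale with 2^NPX before computing the CV is an assumption made here, and all names and numbers are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(npx_values):
    """%CV of replicate measurements, computed on the linear scale (assumption)."""
    linear = [2.0 ** v for v in npx_values]
    return 100.0 * stdev(linear) / mean(linear)


def average_percent_cv(replicates_by_assay):
    """Mean %CV over assays; replicates_by_assay maps assay -> list of NPX values."""
    return mean(percent_cv(values) for values in replicates_by_assay.values())


# Made-up NPX values of a control sample measured on three plates.
before = {"Assay A": [5.0, 5.4, 4.7], "Assay B": [2.0, 2.5, 1.8]}
after = {"Assay A": [5.0, 5.1, 4.9], "Assay B": [2.0, 2.1, 2.0]}
print(round(average_percent_cv(before), 1), round(average_percent_cv(after), 1))
```

A lower average %CV after normalization would indicate that the method reduced technical variation for these replicates.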

Intensity normalization evaluation


The following illustration depicts the NPX of a protein across three different plates, where the samples
have been randomized. The colors indicate different plates.

Fig 7. NPX across plates for each protein.
[Plot: NPX versus samples (0 to 250), colored by plate: Plate 1, Plate 2, Plate 3.]

After intensity normalization the median NPX is leveled.

Fig 8. NPX across plates for each protein after intensity normalization.
[Plot: NPX versus samples (0 to 250), colored by plate: Plate 1, Plate 2, Plate 3.]

In properly randomized studies, intensity normalization typically improves average %CV by a few percent.
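
The four intensity normalization steps listed earlier can be sketched in Python for a single assay as follows. The data layout and names are assumptions made for illustration, and the numbers are invented.

```python
from statistics import median


def intensity_normalize(plates):
    """Intensity-normalize one assay; plates maps plate -> {sample_id: NPX}."""
    # Step 1: overall median across all samples and plates.
    overall = median(v for values in plates.values() for v in values.values())
    normalized = {}
    for plate, values in plates.items():
        # Step 2: plate-specific median.
        plate_median = median(values.values())
        # Steps 3 and 4: center the plate on 0, then on the overall median.
        normalized[plate] = {s: v - plate_median + overall for s, v in values.items()}
    return normalized


# Made-up NPX values: plate P2 reads 1 NPX higher than P1.
plates = {
    "P1": {"a": 1.0, "b": 2.0, "c": 3.0},
    "P2": {"d": 2.0, "e": 3.0, "f": 4.0},
}
leveled = intensity_normalize(plates)
print({plate: median(values.values()) for plate, values in leveled.items()})
```

After the adjustment both plates share the overall median of 2.5 NPX, which is the leveling shown in Fig 8.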

Reference sample normalization evaluation
In the following example, the samples are poorly randomized, and the plate in the middle contains mostly
samples from one group. This means that technical variation between plates is mixed with the biological
variation that can be investigated by comparing the sample groups. In this case, intensity normalization of
the plates would remove some of the relevant biological variation when correcting for technical variation
between plates. The illustration shows the NPX values of the proteins in poorly randomized samples and
the NPX values of the reference samples. The reference samples are shown in purple.

Fig 9. NPX across plates for each protein including reference samples.
[Plot: NPX versus samples (0 to 250), colored by plate (Plate 1, Plate 2, Plate 3) with the reference samples marked.]

The reference samples are then used for normalization. After reference sample normalization, the median
NPX can still differ between plates, reflecting different compositions of sample groups.

Fig 10. NPX across plates for each protein after reference sample normalization.
[Plot: NPX versus samples (0 to 250), colored by plate (Plate 1, Plate 2, Plate 3) with the reference samples marked.]

Misinterpreted data
To further illustrate issues with improper normalization, the image below gives an example of the bias
introduced into a dataset when normalization factors are driven by variance introduced by biological
factors, and not simply by inappropriate sample preparation or instrument-based variability. The
illustration shows a poorly randomized set of samples where the difference between the two groups is
statistically significant (p<0.05). In this case, intensity normalization would remove that difference.

Fig 11. NPX across plates for each protein, poorly randomized samples.
[Plot: NPX versus samples (0 to 250).]

The second plate contains samples from mainly one group. Here the technical variable (plate) is mixed
with the biological variable (sample group), and intensity normalization cannot separate between the
two, as shown in the following image.

Fig 12. Intensity normalization removed the difference (p = 0.52).
[Plot: NPX for Study group 1 and Study group 2.]

With reference normalization, it is possible to see a significant difference.

Fig 13. Significant difference with reference sample normalization (p<0.05).
[Plot: NPX for Study group 1 and Study group 2.]
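
The effect described in this section can be reproduced with a small, fully invented data set. In the Python sketch below, group 2 is truly 1.0 NPX higher than group 1, plate P2 carries a technical shift of +0.8 NPX, and each plate contains mostly one group. All values, names and the simplified normalization helpers are assumptions for illustration. No significance test is performed; only the mean difference between groups is compared.

```python
from statistics import mean, median

# Hypothetical NPX values for one assay: (plate, group, value).
# True levels: group 1 = 4.0, group 2 = 5.0; plate P2 has a +0.8 technical shift.
samples = [
    ("P1", "g1", 4.0), ("P1", "g1", 4.0), ("P1", "g1", 4.0), ("P1", "g2", 5.0),
    ("P2", "g1", 4.8), ("P2", "g2", 5.8), ("P2", "g2", 5.8), ("P2", "g2", 5.8),
]
# Bridging reference samples measured on both plates: {plate: {ref_id: NPX}}.
references = {"P1": {"R1": 4.2, "R2": 4.8}, "P2": {"R1": 5.0, "R2": 5.6}}


def group_difference(rows):
    """Mean NPX of group 2 minus mean NPX of group 1."""
    g1 = [v for _, g, v in rows if g == "g1"]
    g2 = [v for _, g, v in rows if g == "g2"]
    return mean(g2) - mean(g1)


def intensity_normalized(rows):
    """Shift each plate so that its median equals the overall median."""
    overall = median(v for _, _, v in rows)
    plate_median = {
        p: median(v for q, _, v in rows if q == p) for p in {r[0] for r in rows}
    }
    return [(p, g, v - plate_median[p] + overall) for p, g, v in rows]


def reference_normalized(rows, refs, reference_plate="P1"):
    """Shift each plate by the median difference of its bridging samples."""
    factor = {
        p: median(refs[reference_plate][r] - vals[r] for r in vals)
        for p, vals in refs.items()
    }
    return [(p, g, v + factor[p]) for p, g, v in rows]


raw_diff = group_difference(samples)
intensity_diff = group_difference(intensity_normalized(samples))
reference_diff = group_difference(reference_normalized(samples, references))
print(round(raw_diff, 2), round(intensity_diff, 2), round(reference_diff, 2))
```

With these made-up numbers the true group difference of 1.0 NPX appears as about 1.4 in the raw data, shrinks to about 0.5 after intensity normalization, and is recovered as about 1.0 after reference sample normalization. With plates that each contain only one group, intensity normalization would remove the difference entirely.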

Olink can help
Normalization is important to remove systematic variation, but needs to be applied carefully to minimize
the risk of removing true biological variation. As a service, Olink can discuss normalization approaches
and study design with you before analysis starts.
If you send your samples to Olink's Analysis Service team, the data analysis steps will be taken care of.
They will provide you with a comprehensive NPX data and QC report after your study is completed.
If more extensive data analysis services are required, our biostatisticians in the Olink Data Science team
can help you with customized statistical analysis.

Olink's Data Science offering currently includes


• Olink® Insights Stat Analysis
A free web-based application for basic data visualizations and statistical analyses (Shiny App).
www.olink.com/biostat-apps
• OlinkAnalyze (R package on GitHub)
A versatile toolbox for handling of Olink data including QC plot functions, normalization,
various statistical tests and modelling.
www.olink.com/biostat-apps
• Olink Statistical Services
Performed by experts experienced in handling Olink data.
www.olink.com/biostat-services

www.olink.com
For Research Use Only. Not for Use in Diagnostic Procedures.
This product includes a license for non-commercial use of Olink products. Commercial users may require additional licenses. Please contact Olink Proteomics AB
for details. There are no warranties, expressed or implied, which extend beyond this description. Olink Proteomics AB is not liable for property damage, personal
injury, or economic loss caused by this product.
The following trademark is owned by Olink Proteomics AB: Olink®.
This product is covered by several patents and patent applications available at https://www.olink.com/patents/.
Components in the Olink® Target 96 Probe Kits utilize Thunder-Link technology and are provided under license from Expedeon Ltd.
© Copyright 2018-2021 Olink Proteomics AB. All third party trademarks are the property of their respective owners.
Olink Proteomics, Dag Hammarskjölds väg 52B , SE-752 37 Uppsala, Sweden
1096, v2.0, 2021-04-08
