Chas4.4 Manual
USER GUIDE
DISCLAIMER
TO THE EXTENT ALLOWED BY LAW, LIFE TECHNOLOGIES AND/OR ITS AFFILIATE(S) WILL NOT BE LIABLE FOR SPECIAL, INCIDENTAL,
INDIRECT, PUNITIVE, MULTIPLE, OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH OR ARISING FROM THIS DOCUMENT, INCLUDING
YOUR USE OF IT.
Legal entity
Affymetrix, Inc. | Santa Clara, CA 95051 USA | Toll Free in USA 1 800 955 6288
TRADEMARKS
All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified.
Features in v4.4 .......................... 21
About this user guide ..................... 21
Customer support .......................... 21
Starting ChAS ............................. 30
Logging into the ChAS database ............ 31
Analysis workflow module .................. 32
First time setup .......................... 32
Assigning an Input sample path(s) ......... 33
Assigning an Output results path .......... 34
Assigning a Central QC history path ....... 34
File types and data organization in ChAS .. 34
ChAS file types ........................... 34
IMPORTANT! The results from ChAS are for Research Use Only and not for use in diagnostic
procedures.
Chromosome Analysis Suite is not a secondary analysis package. However, it does create
CYCHP, OSCHP, XNCHP, and tab-delimited text files required for secondary analysis
packages.
Features in v4.4
Seamless integration with Franklin by Genoox for enhanced sample data
interpretation.
Whole Genome Segmentation Visualization for large copy number aberrations on
CytoScan XON arrays.
Flag Segments to bypass Filter settings.
Display LOH segments in different colors based on median copy number.
Ability to filter out LOH segments having a copy number < 2.
Addition of pHaplo and pTriplo scores.
Included the ClinGen Curated Regions in a Recurrent/Curated Regions track.
Download Library Files from the NetAffx Server securely, via https.
Customer support
Visit thermofisher.com/support for the latest in service and support, including:
Worldwide contact telephone numbers
Product support, including:
Product FAQs
Software, patches, and updates
Order and web support
Product documentation, including:
User guides, manuals, and protocols
IMPORTANT! Due to the amount of memory that ChAS requires to operate, Thermo Fisher
Scientific VERY STRONGLY recommends that you DO NOT install the ChAS software on
instrumentation computers being used for scanning and operating fluidics systems.
Table 1 Software
Processor | 3 GHz (or greater) Pentium Quad Core Processor | 3 GHz (or greater) Pentium Dual Core Processor
RAM | 32 GB | 16 GB
Table 2 Server
Processor | 3.1 or 3.3 GHz Quad Core Processor | 2.7 GHz Quad Core Processor
Windows Operating System | Windows Server 2022 Standard 64-bit | Windows Server 2022 Standard 64-bit
RAM | >24 GB | 16 GB
IMPORTANT! A Windows 64-bit Operating System is required for all array types.
IMPORTANT! Chromosome Analysis Suite requires AGCC 4.3/GCC 6.1 or higher to produce
CytoScan/OncoScan CEL files.
IMPORTANT! The larger file sizes associated with the CytoScan HD Array should be taken into
account when calculating the necessary free space requirement. A CytoScan HD Array CYCHP file
is ~155 MB. A CytoScan XON Array XNCHP file is ~174 MB.
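As a rough planning aid, the file sizes quoted above can be turned into a free-space estimate. The following Python sketch is illustrative only and is not part of ChAS. The function name and the sample counts are hypothetical; only the approximate per-file sizes come from this guide. CEL files and other outputs are not counted.

```python
# Approximate result-file sizes quoted in this guide (in MB).
APPROX_FILE_SIZE_MB = {
    "CYCHP": 155,  # CytoScan HD Array result file
    "XNCHP": 174,  # CytoScan XON Array result file
}


def estimate_results_space_gb(file_counts):
    """Return the approximate space (GB, 1 GB = 1024 MB) needed for result files.

    file_counts maps a file type ("CYCHP" or "XNCHP") to a number of files.
    """
    total_mb = 0
    for file_type, count in file_counts.items():
        if file_type not in APPROX_FILE_SIZE_MB:
            raise ValueError(f"No size estimate for file type: {file_type}")
        if count < 0:
            raise ValueError("File counts must not be negative")
        total_mb += APPROX_FILE_SIZE_MB[file_type] * count
    return total_mb / 1024


if __name__ == "__main__":
    # Hypothetical batch: 96 CytoScan HD samples and 24 CytoScan XON samples.
    needed = estimate_results_space_gb({"CYCHP": 96, "XNCHP": 24})
    print(f"Approximate space for result files: {needed:.1f} GB")
```

Treat the result as a lower bound, and leave extra headroom for CEL files, library files, and the ChAS database.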
IMPORTANT! The ChAS software has been verified for use on a Windows 64-bit Operating
System. ChAS may work on other Windows Operating Systems, but only the 64-bit version has
been verified.
Before performing an analysis, you must download the appropriate zip file package(s)
listed in the table below.
Installing ChAS
Note: The installation process also installs additional required components, including
Java components and the Visual C++ runtime.
New Installation
1. Double-click the ChAS4.4_setup.exe file in the “Chromosome Analysis
Suite 4.4” folder.
The InstallShield Wizard for Chromosome Analysis Suite begins.
2. At the Welcome window, click Next.
The License Agreement window appears.
3. Please read the license agreement carefully, click the “I accept the terms of the
license agreement” radio button, then click Next.
The Setup Type window appears.
IMPORTANT! If your Windows Firewall is enabled during the installation of ChAS and you want
to back up the ChAS Database and restore it to your local ChAS DB (see "Using a shared ChAS
database while off-line" on page 457), a message may appear indicating that you cannot connect
to the shared folder.
If this message appears, contact your IT department for help in allowing file sharing through the
Windows Firewall.
Upgrade installation
IMPORTANT! The ChAS 4.4 Installer does NOT support upgrade from previous versions. Due to
an updated version of the ChAS DB, previous versions of ChAS must be uninstalled using
Add/Remove Programs prior to running the ChAS 4.4 installer. Be sure to make a backup of your
ChAS DB prior to uninstalling your previous version of ChAS.
To keep current preferences, see "Exporting and importing preferences" on page 444.
Copying analysis files
The CytoScan Analysis Library Files.zip package download contains the Analysis Files
required to process their respective CytoScan Array CEL files into CYCHP files.
The OncoScan Analysis Library Files.zip package download contains the Analysis
Files required to process their respective OncoScan Array CEL files into OSCHP files.
The CytoScan XON Analysis Library Files.zip package download contains the Analysis
Files required to process their respective CytoScan XON Array CEL files into XNCHP
files.
Also included in the CytoScan HD Analysis Library Files.zip are the files for
GenomeWideSNP_6 (required in ChAS to view GenomeWideSNP_6 CNCHP files).
If you are unable to download library files through the software, you can download the
zipped library files from the ChAS product page at www.thermofisher.com, then
extract (unzip) the files into the following location: C:\Affymetrix\ChAS\Library
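If you prefer to script the extraction step, a minimal Python sketch is shown here. It is not a ChAS utility: the function name and the download location are hypothetical, and only the destination folder (C:\Affymetrix\ChAS\Library) is taken from this guide. Whether the files should sit directly in that folder or in a sub-folder depends on how the downloaded zip is organized, so check the result against a working installation.

```python
import zipfile
from pathlib import Path

# Library folder named in this guide.
DEFAULT_LIBRARY_DIR = Path(r"C:\Affymetrix\ChAS\Library")


def extract_library_zip(zip_path, library_dir=DEFAULT_LIBRARY_DIR):
    """Extract a downloaded library zip into the library folder.

    Returns the names of the archive members that were extracted.
    """
    zip_path = Path(zip_path)
    library_dir = Path(library_dir)
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"Not a zip archive: {zip_path}")
    # Create the destination folder if it does not exist yet.
    library_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(library_dir)
        return archive.namelist()


if __name__ == "__main__":
    # Hypothetical download location; replace with the real path to the zip.
    downloaded = Path.home() / "Downloads" / "library_files.zip"
    if downloaded.exists():
        for name in extract_library_zip(downloaded):
            print(name)
```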
Displaying hidden files and folders in Windows 10
1. At the Windows 10 Desktop, move your mouse to the bottom right of the Task
bar (right of the clock).
Five large icons appear.
2. Click on the Settings icon.
3. Click Control Panel.
The Control Panel window opens.
4. Click Appearance and Personalization in Control Panel. Under Folder Options,
click “Show hidden files and folders”.
5. In the Folder Options window that appears, click the View tab. Under Hidden files
and folders, click Show hidden files and folders.
Hidden files and folders are dimmed to indicate they are not typical items. If you
know the name of a hidden file or folder, you can search for it.
6. Click OK.
7. Close all open windows.
Analysis file download
When you start ChAS for the first time, you will be prompted to:
1. Create a user profile. (See "Creating and using user profiles" on page 440)
Note: To process the CytoScan Arrays in AGCC/GCC, you must install the
appropriate library files for AGCC/GCC on the AGCC/GCC workstation (see the
specific array product page at www.thermofisher.com for details).
You can download the ChAS analysis files from NetAffx using either the ChAS
Browser or the Analysis Workflow. The files will be saved into the same folder whether
downloading through the ChAS Browser or the Analysis Workflow.
Downloading ChAS analysis files from NetAffx for use with the ChAS Browser
1. Start ChAS.
2. Click OK.
The Library File Download Service window opens.
Note: You can also open the Library File Download Service window by selecting
Update Library and Annotation Files from the Help menu.
3. From the Library File Download Service window, click OK to view available Library
Files for download.
4. Select the library and annotation files you want to download.
5. Click Download.
The Download Progress window displays the progress of the downloading and
unpacking of the files.
6. Click OK when the download is complete.
7. The NetAffx Authentication window remains open. Click Close when you have finished
downloading the library files.
Downloading ChAS analysis files from NetAffx using the Analysis Workflow
1. Select Analysis → Perform Analysis Setup.
2. Select Utility Actions → Download Library Files.
3. From the Library File Download Service window, click OK to view available Library
Files for download.
4. Select the array type check box(es) for the analysis files that you want to download,
then click Download.
Access to NetAffx from the analysis workflow
1. Launch the Analysis Workflow by selecting Analysis → Perform Analysis Setup
in the ChAS Browser.
2. Click Utility Actions → Download Library Files.
5. Enter the Host Address, Port (if not listed), User and Password.
IMPORTANT! This proxy user ID and password are NOT the same ID and password used to
connect to NetAffx.
6. Click Save.
Access to NetAffx from the ChAS browser
1. From the Help drop-down menu, click Update Library and Annotation Files…
The Library File Download Service window appears.
2. Click the Proxy Settings tab. (Figure 2)
3. Click the Enable Custom Proxy Server check box.
4. Enter the Host Address and Port information, then enter your user name and
password.
IMPORTANT! This proxy user ID and password are NOT the same ID and password used to
connect to NetAffx.
5. Click Save.
Access to a remote ChAS database server from the ChAS browser
1. From the Preferences drop-down menu, click Edit Application Configuration…
The Configuration window appears. (Figure 3)
Figure 3 Proxy Settings window
Uninstalling
IMPORTANT! It is strongly recommended you backup your database BEFORE uninstalling
ChAS.
1. From the Windows Start Menu, navigate to the Windows Control Panel,
3. Locate the Chromosome Analysis Suite application, then perform the uninstall as
you normally would.
4. Click OK to acknowledge the message box that warns the ChAS application must be
closed (before removing it).
2. Use the drop-down button to select a user or click Create New to create a new
user profile. For more information, see "Creating and using user profiles" on page
440.
3. Optional: Click the Manual connection check box. For information on manual
connections, see "Manual or automatic connection mode" on page 408.
4. Click OK.
The Chromosome Analysis Suite application opens, as shown in Figure 5 on
page 31, after logging into the ChAS DB. To log in, see "Logging into the ChAS
database" (below).
Note: A message may appear indicating a more current version of the NetAffx
Genomic Annotation file is available for download. To download the newer
version of the file, see "Analysis file download" on page 26. If you are unable to
download the files via the NetAffx dialog, please contact Technical Support for
alternative downloading options.
[Figure: ChAS main window with callouts for the Menu Bar, Tool Bar, Files List, Named Settings,
Data Types List, Karyoview in Upper Display Area, and Detail View in Lower Display Area]
First time setup
After installation, you must configure your data paths.
The Analysis Workflow requires the following steps:
1. From the Analysis menu, select Perform Analysis Setup. (Figure 6)
Assigning a Central QC history path
1. Click the Browse button, then navigate to a folder in which to store the QC
history file. (Example: C:\Cytoscan_data\)
2. Click Create New Folder to create a central QC history path folder. (Example:
QC_History)
Array-type specific Library file sets with files for running Copy Number/LOH/Mosaicism
analysis and Reference Creation workflows (Analysis files)
Files for visualizing and exporting data from xxCHP results data files.
Reference Annotation files
Browser Annotation files are named using the following format:
<NetAffxGenomicAnnotations.Homo_sapiens.hgXX.naYYYYMMDD.db>
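The naming format above can be checked programmatically, for example when sorting several annotation files by genome build or release date. The Python sketch below is illustrative and not part of ChAS. It assumes that XX is a numeric genome build and that YYYYMMDD is a calendar date; the function name and the example file name are hypothetical.

```python
import re
from datetime import date, datetime

# Pattern for: NetAffxGenomicAnnotations.Homo_sapiens.hgXX.naYYYYMMDD.db
# Assumption: XX is numeric and YYYYMMDD is a valid calendar date.
ANNOTATION_NAME = re.compile(
    r"^NetAffxGenomicAnnotations\.Homo_sapiens"
    r"\.hg(?P<build>\d+)\.na(?P<stamp>\d{8})\.db$"
)


def parse_annotation_name(file_name):
    """Return (genome_build, release_date) parsed from an annotation file name."""
    match = ANNOTATION_NAME.match(file_name)
    if match is None:
        raise ValueError(f"Unexpected annotation file name: {file_name}")
    # strptime raises ValueError if the stamp is not a real date.
    release = datetime.strptime(match.group("stamp"), "%Y%m%d").date()
    return "hg" + match.group("build"), release


if __name__ == "__main__":
    # Hypothetical file name that follows the documented format.
    example = "NetAffxGenomicAnnotations.Homo_sapiens.hg38.na20240101.db"
    build, released = parse_annotation_name(example)
    print(build, released.isoformat())
```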
Data organization in ChAS
ChAS enables you to keep your CEL and Analysis Results files in any folder on your
computer. As long as you know where the files are, you can load them from anywhere and
move them around at your convenience.
IMPORTANT! It is recommended that you perform analysis operations with all analysis files
stored on a local disk drive.
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Array processing workflow (using instrument control software)
Array processing is performed in AGCC 4.1.2/GCC 5.0 or higher for the CytoScan Arrays,
OncoScan Arrays, and Genome-Wide Human SNP 6.0 Array.
Note: You need to have the appropriate library files installed on the instrument control
workstation to perform these analyses for the different array types.
Probe-level Analysis of CEL file data
Copy number data is handled differently from genome-wide genotyping data in this step.
Note: You need to have the appropriate ChAS library files installed to perform these
analyses for different array types. A 64-bit system is required to analyze CytoScan CEL
files.
For CytoScan arrays, this analysis is performed in ChAS and produces CYCHP or
XNCHP files (depending on array type), which contain the data shown in Table 5. See
“CN/LOH/Mosaicism analysis” on page 44.
For CytoScan HTCMA arrays, the analysis is performed in the RHAS which produces
a RHCHP file for viewing in ChAS.
Genome-Wide Human SNP Array 6.0 Data: The probe level analysis on CEL file data is
performed in GTC and produces the CNCHP file data types shown in Table 5. See the
GTC User Guide for more information.
Table 5
Analysis Results *
* CYCHP for CytoScan, CNCHP for Genome-Wide Human SNP 6.0 Array, XNCHP for CytoScan XON
Array, and OSCHP for OncoScan Array.
1) For more details on CytoScan Array data, see Table 14 on page 175.
2) For more details on Genome-Wide SNP Array 6.0 data, see Table 17 on page 178.
3) For more details on OncoScan FFPE Assay Data, see Table 18 on page 178.
4) For more details on CytoScan XON array data, see Table 16 on page 177.
Note: Segment types drawn with flat ends (Gain and Loss) are the result of algorithms
which can ascertain precise marker-to-marker breakpoints. Segment types drawn with
rounded ends (LOH, GainMosaic, LossMosaic) are the output of algorithms which closely
approximate breakpoints based on the data.
ReproSeq aneuploidy (zip files)
When loading zip files from Ion Reporter into ChAS for viewing, the software displays
the following segments and graph data:
Segment data
Copy Number Gain/Loss
Graph Data
Copy Number State (sequence tiles)
Viewing data and features of interest using the ChAS display controls
ChAS provides the following options for viewing and studying your loaded analysis
results data:
Graphic Displays
See “Displaying data in graphic views” on page 152.
Tables
See “Displaying data in table views” on page 327.
After the data is loaded, you can:
Filter the segments by Segment Parameters to hide segments that do not meet
your requirements for significance. See “Filtering segments” on page 217.
Select a region information file for use as a CytoRegion file and:
Perform differential filtering for segments in CytoRegions and in the rest of the
genome. See “Using CytoRegions” on page 267.
Display only segments that appear in CytoRegions using Restricted Mode.
Query segments from a loaded sample against segments previously uploaded to
a ChAS Database. See "Querying a segment from the segment table" on page 392.
See which samples had segments similar to the current sample.
View the Calls and Interpretations of previous segments to help in the analysis
of the current sample.
Select a region information file for use as an Overlap Map and use the Overlap filter
to identify or conceal segments that appear in the Overlap Map regions. See
“Using the overlap map and filter” on page 279.
Add selected features of the genome to new or existing Region (AED) files, and edit
annotation data on existing annotations. (To open a BED or AED file, click the
button or select File → Open on the menu bar.) See “Creating and editing AED
files” on page 287.
Changing pane sizes
Do one of the following to change the size of the panes in the ChAS window, as shown
in Figure 9 on page 43.
Click and drag the dividers between panes.
Click the arrows in the dividers to hide or maximize an entire pane.
Opening panes in separate windows
You can display a pane in a separate window by clicking the icon on the tab. To
close the window and return the information to the tab panel, click the icon in the
window.
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Note: Reference files are provided as part of the complete Library file packages. You
can also create your own reference file using ChAS.
Load Genome-Wide Human SNP Array 6.0 CNCHP into ChAS to display and detect
Copy Number and Loss of Heterozygosity segments. See "Loading data" on page
118.
Load CytoScan HTCMA Array RHCHP into ChAS to display and detect Copy Number
segments, Loss of Heterozygosity segments, variant data, and SMN data. See
"Loading data" on page 118.
IMPORTANT! It is recommended to perform analysis operations with all associated analysis files
in a locally stored folder(s).
ChAS analysis file compatibility
Table 6 lists the compatibility between ChAS Analysis file versions for the CytoScan
Arrays.
Note: ChAS automatically prevents you from selecting an incompatible analysis file
version for analysis or when viewing analysis results.
NA32.3(hg19) No No No No No No
NA32.1(hg19) No No No No No No
NA32(hg19) No No No No No No
Note: Refer to the ChAS Release Notes for data equivalency information between the
ChAS software and the Library file versions used to create CHP files.
The expected copy number state on the X chromosome in normal males is not
constant over its entire length. This is due to the structure of the sex chromosomes,
and the fact that they share extensive homology with each other only in the Pseudo
Autosomal Regions (PARs) that they each have at either end. PAR1 is at the top of the
p-arm and PAR2 at the bottom of the q-arm.
Markers occurring in the PAR regions are mapped exclusively to the X Chromosome.
Therefore, in normal males the PAR regions of the X are expected to be CN=2 (probes
on the X and Y contribute to the signal), while the rest of the Chr X is expected CN=1
for normal males. As a result, we treat the two X PARs in males as independent units
(CN=2 expected) from the rest of the X chromosome (CN=1 in males) when generating
Copy Number Segments.
Aberrant segments that cross PAR/non-PAR boundaries may be normalized into one
segment if they have equivalent type (Gain or Loss) and CN State. During this
normalization process, ChAS will not combine an aberrant (Gain or Loss) segment
with a normal segment across PAR/non-PAR boundaries, even if they have the same
CN State. If smoothing is subsequently applied, aberrant segments with different copy
number state may be combined. If joining is subsequently applied, aberrant segments
separated by a non-aberrant segment may be combined.
Because only Y-specific probes are mapped to the Y chromosome, the expected
state of the entire Y chromosome is 1 for males and is 0 for females.
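The expected copy number rules described above can be summarized as a small lookup. The Python sketch below is an illustration of those rules only, not the ChAS segmentation algorithm. The function name is hypothetical, and the values for autosomes (CN=2) and for the X chromosome in normal females (CN=2) are assumptions based on standard diploid expectations rather than statements from this section.

```python
def expected_copy_number(chromosome, sex, in_par=False):
    """Return the expected copy number state for a normal sample.

    chromosome: name such as "X", "Y", "1", or "chr7".
    sex: "male" or "female".
    in_par: True if the marker lies in a pseudoautosomal region of X.
    """
    chrom = chromosome.upper()
    if chrom.startswith("CHR"):
        chrom = chrom[3:]
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")

    if chrom == "X":
        if sex == "female":
            return 2  # assumption: two X chromosomes in a normal female
        # Males: PAR markers are mapped to X and see signal from X and Y.
        return 2 if in_par else 1
    if chrom == "Y":
        # Only Y-specific probes are mapped to the Y chromosome.
        return 1 if sex == "male" else 0
    return 2  # assumption: autosomes are diploid


if __name__ == "__main__":
    print(expected_copy_number("X", "male", in_par=True))
    print(expected_copy_number("X", "male"))
    print(expected_copy_number("Y", "female"))
```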
Mosaic Segments whose boundaries start and end entirely in one of the PAR regions
will use CN=2 as normal to determine the type (GainMosaic or LossMosaic) of Mosaic
segment to draw.
Because the Mosaicism algorithm can generate segments which cross the PAR
boundaries, Mosaic Segments that touch the non-PAR region of the X chromosome
use the gender call of the sample to determine the Type of Mosaic segment to draw.
Because only Y-specific probes are mapped to the Y chromosome, the expected
state of the entire Y chromosome is 1 for males and is 0 for females.
Table 7 Expected LOH calls on the X and Y chromosomes for the CytoScan arrays
Normal male sample (XY)
X chromosome: LOH calls that are single copy-based LOH calls (CN = 1).
Male sample with multiple X chromosomes (for example, XXY)
X chromosome: LOH calls are possible, depending on the constitution of the X chromosomes’ origins.
Normal female sample (XX)
X chromosome: LOH calls are possible, depending on the constitution of the X chromosomes’ origins.
Female sample with a single X chromosome (X0)
X chromosome: LOH calls on X regions which have only a single copy. Heterozygous SNP genotypes
are possible, but are due to the low inherent Heterozygote call error rate noise, not the
true presence of two alleles.
Y chromosome (all sample types): No LOH calls are made for the Y chromosome. Genotype calling
is not performed on the Y chromosome.
Table 8 Expected LOH calls on the X and Y chromosomes for the Genome-Wide Human SNP Array 6.0
Normal male sample (XY)
X chromosome: LOH calls on the non-PAR region of the X chromosome resulting from “forced”
homozygote-only calls due to the presence of the Y chromosome.
Normal female sample (XX)
X chromosome: LOH calls are possible, depending on the constitution of the X chromosomes’ origins.
Female sample with a single X chromosome (X0)
X chromosome: LOH calls on X regions with only a single copy. Heterozygous SNP genotypes are
possible, but are due to the low inherent Heterozygote call error rate noise, not the
true presence of two alleles.
Y chromosome (all sample types): LOH analysis is not performed on the Y chromosome since it is
assumed that there is not substantial Y chromosomal material.
Table 9 Expected LOH calls on the X and Y chromosomes for the OncoScan arrays
Normal male sample (XY)
X chromosome: LOH calls that are single copy-based LOH calls (CN = 1).
Male sample with multiple X chromosomes (for example, XXY)
X chromosome: LOH calls are possible where there is either loss or low heterozygosity.
Normal female sample (XX)
X chromosome: LOH calls are possible depending on the constitution of the X origins or in
regions of either loss or low heterozygosity.
Female sample with a single X chromosome (X0)
X chromosome: LOH regions on X which have only a single copy. Will be called LOH where there is
single copy X.
Y chromosome (all sample types): No LOH calls are made for the Y chromosome.
Setting up and running a single sample analysis
Note: If you want to set up and run an OncoScan Analysis, see "Setting up and running
an OncoScan single sample analysis" on page 67. If your samples are cancer samples
and you suspect aberrations for at least 50% of the genome, then running a Normal
Diploid Analysis is recommended. For more information, see "Setting up and running
a normal diploid analysis" on page 65.
1. From the Analysis menu, select Perform Analysis Setup. (Figure 11)
2. From the Select array type drop-down list, click to select CytoScan array type.
(Example: CytoScanHD_Array)
Note: Once you select the array type, analysis workflow, and reference model file,
then the annotation file will be auto selected for you based on your earlier selections.
The Select array type drop-down list includes only the array types for which library
(analysis) files have been downloaded from NetAffx or copied from the Library
package provided with the installation.
8. If your CEL files are located somewhere other than your input path location, navigate
to the desired folder. Select the CEL files by single-clicking, Ctrl-clicking, Shift-clicking,
or pressing Ctrl+A (to select multiple CEL files).
9. Click Open.
The Select the intensity (CEL) file(s) to analyze pane is now populated with your
CEL files, as shown in Figure 14.
Note: You can load several CEL files at a time for a Single Sample Analysis.
To remove a CEL file from this list, click to highlight it, then click Remove.
10. At the Output result information pane, confirm the path shown for your output
file folder. To change the current path/folder, click the ... button to select a different
output path/folder.
Note: To better organize your output results, you can add sub-folders to your
assigned output result path/folder.
Click the ... button to return to your assigned output path and/or folder.
11. If you are using a previously analyzed CEL file(s) to verify new CHP data (against
CHP data generated from previous versions of ChAS and Library files), you may
want to use a suffix to append the new resulting CHP file(s). To do this, click
inside the Select a suffix to append to the analysis results field to enter an
appending file suffix. (Figure 16)
IMPORTANT! If you are saving the same .CYCHP file into the same output file folder that
contains your originally run CYCHP file, a “1” is automatically added into the filename (in
addition to any suffix you may add) to differentiate the two runs of identical CEL file names.
Example: na33(1).cyhd.cychp
12. Optional: If you have a CEL file(s) in which the Y chromosome is partially/fully
deleted and therefore determined to be female by the gender calling algorithm,
go to the Analysis Setup’s Optional pane (Figure 17), click the Set Gender
Manually check box, then click to select the appropriate radio button.
13. Optional: If you want to have an automatic export of the Karyoview, Segments
Table, and Detail View for Copy Number and LOH Segments in the CHP file, click
the check box Generate a Results Summary File, then select the output
format of either PDF or DOCX. (Figure 18)
Note: You can assign a CytoRegion and Overlap Map region file that will
highlight these regions in the export. The export is placed in the same folder as
the CYCHP file. This automatic export feature is only available for CytoScan
arrays.
The Workflow dashboard window appears and your annotation files begin to
load. (Figure 20).
Note: The View Logs button will access the algorithm pipeline logs which may
be useful if you have a Workflow that fails to complete.
15. Click to choose the analysis you want to view, then click View Results List.
The QC Results tab window appears showing the Basic View QC settings. A
Detail View QC setting, which provides more columns of data, is also available in
the QC Settings drop-down list. (Figure 22)
Note: QC parameters can also be viewed in the ChAS Browser; see setting QC
parameters in ChAS Browser.
16. Click each sample's check box or click the 'Select All' button to select all
samples.
At the QC Results window, click the View in Browser button or the View in MSV
button. For more MSV information, see the RHAS User Guide.
If the following warning message appears (Figure 25), acknowledge it, then click
OK.
If the following warning message appears (Figure 26), acknowledge it, then click
OK.
Note: The ChAS Browser allows for loading of xxCHP files analyzed from
different versions of ChAS. However, xxCHP files analyzed from different
genome versions (hg18, hg19, hg38) cannot be loaded at the same time.
After a few moments, the ChAS browser featuring your selected samples
appears. (Figure 29)
Recentering CytoScan HD, 750K and Optima arrays
Due to the complexity and low diploid count in a small fraction of cancer samples,
there may be a need to manually assign the diploid region of the sample or recenter it.
In Figure 30, Chromosome 1 is called as a mosaic copy number loss: the log2 ratio
data is shifted downward and the smooth signal averages 1.75 copies, but the Allele
Difference (AD) and B-Allele Frequency (BAF) graphs display three tracks.
Note: Since it is unlikely to have three tracks in AD/BAF data in a region of loss (unless
the loss is CN=0), this sample needs to be recentered.
If the region that is true diploid is a whole chromosome, use Method 1. If the region
that is true diploid is part of a chromosome, use Method 2.
Method 1 Determining the median Log 2 ratio for the region in the sample that is truly
diploid
Figure 31 Chromosome Summary tab - median log2 ratio value for Chromosome 16
example
Method 2 Determining the median Log 2 ratio for the region in the sample that is truly
diploid
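Both methods rely on the median log2 ratio of the truly diploid region, which ChAS reports itself (see Figure 31). The Python sketch below only illustrates what that quantity is. The marker tuple layout, the function name, and the example values are hypothetical, and the sketch does not perform the recentering.

```python
from statistics import median


def median_log2_ratio(markers, chromosome, start=None, end=None):
    """Return the median log2 ratio of markers in a region.

    markers: iterable of (chromosome, position, log2_ratio) tuples.
    Leave start and end as None to use the whole chromosome (Method 1),
    or give base-pair bounds for part of a chromosome (Method 2).
    """
    values = [
        log2_ratio
        for chrom, position, log2_ratio in markers
        if chrom == chromosome
        and (start is None or position >= start)
        and (end is None or position <= end)
    ]
    if not values:
        raise ValueError("No markers found in the requested region")
    return median(values)


if __name__ == "__main__":
    # Hypothetical marker data: (chromosome, position, log2 ratio).
    example = [
        ("16", 100_000, -0.20),
        ("16", 200_000, -0.30),
        ("16", 300_000, -0.25),
        ("1", 100_000, -0.60),
    ]
    print(median_log2_ratio(example, "16"))
    print(median_log2_ratio(example, "16", start=150_000))
```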
No gender single sample analysis
The No Gender Single Sample Analysis (Figure 36) is the same analysis as
described previously for Single Sample Analysis, with the exception that no
gender information is displayed in ChAS. The gender is not reported and no
segment or probe-level data from the X or Y chromosomes is displayed.
The metric Sex Chromosomes Aberrated can be added to the QC table and
reports either Yes or No.
Yes: Indicates that the sample does have segments meeting the following
default thresholds: 50 Markers/200kb for copy number and 50 Markers/
10,000kb for LOH segments.
No: Indicates no copy number or LOH segments meet the previously defined
thresholds.
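The Yes/No logic of this metric can be sketched as follows. This Python example is an interpretation, not the ChAS implementation: it assumes that a segment meets a threshold when it has at least the stated number of markers and at least the stated size, and that only segments on the X or Y chromosome are considered. The Segment class and function name are hypothetical.

```python
from dataclasses import dataclass

# Default thresholds from this guide: (minimum markers, minimum size in kb).
THRESHOLDS = {
    "CN": (50, 200),       # copy number segments
    "LOH": (50, 10_000),   # LOH segments
}


@dataclass
class Segment:
    chromosome: str     # for example "X", "Y", or "1"
    kind: str           # "CN" or "LOH"
    marker_count: int
    size_kb: float


def sex_chromosomes_aberrated(segments):
    """Return "Yes" if any X or Y segment meets its threshold, else "No"."""
    for segment in segments:
        if segment.chromosome not in ("X", "Y"):
            continue
        if segment.kind not in THRESHOLDS:
            raise ValueError(f"Unknown segment kind: {segment.kind}")
        min_markers, min_size_kb = THRESHOLDS[segment.kind]
        # Assumption: both the marker count and the size must be met.
        if segment.marker_count >= min_markers and segment.size_kb >= min_size_kb:
            return "Yes"
    return "No"


if __name__ == "__main__":
    # Hypothetical segments for illustration.
    example = [
        Segment("1", "CN", 500, 3_000),   # autosomal, ignored
        Segment("X", "LOH", 80, 4_000),   # too small for the LOH threshold
        Segment("X", "CN", 60, 250),      # meets the copy number threshold
    ]
    print(sex_chromosomes_aberrated(example))
```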
Setting up and running a normal diploid analysis
The Normal Diploid Analysis for CytoScan is recommended for cancer samples
in which >50% of the genome is likely to be rearranged. This analysis will
automatically determine the normal diploid regions and normalize the rest of the
sample based on those regions, resulting in properly centered data.
A Normal Diploid Analysis has the identical setup steps as "Setting up and
running a single sample analysis" on page 50. The only difference is you must
select Normal Diploid Analysis from the Select analysis workflow drop-down
menu, as shown in Figure 37.
The data shown above is from a cancer sample in which >50% of the genome was non-diploid. The top
graph (purple) shows the sample run through the traditional single sample analysis. There are no Copy
Number Segments called, the weighted log2 is centered around 0, but there are 4 allele difference tracks
indicating more than two copies of this chromosome. In the bottom graph (pink), this same sample is run
through the Normal Diploid normalization algorithm. The Copy Number Gain segment is called, the weighted
log2 is shifted above the 0 line which is in agreement with the four allele difference tracks.
Note: For samples run through the Normal Diploid Analysis, the following QC metrics
are recommended:
MAPD < 0.25
SNPQC or ndSNPQC >= 15
wavinessSD or ndwavinessSD < 0.12
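The recommended thresholds above can be applied to exported QC values with a short script. The Python sketch below is illustrative and not part of ChAS. The function name and the example values are hypothetical, and it assumes that where two metric names are listed (for example, SNPQC or ndSNPQC), meeting the threshold with whichever value is available is sufficient.

```python
def check_normal_diploid_qc(metrics):
    """Compare QC values with the recommended Normal Diploid thresholds.

    metrics: dict of metric name to value, for example
             {"MAPD": 0.18, "ndSNPQC": 17.2, "ndwavinessSD": 0.08}.
    Returns a dict mapping each check to True (pass) or False (fail).
    A check fails if none of its metrics is present.
    """
    def any_value_passes(names, passes):
        values = [metrics[name] for name in names if name in metrics]
        return any(passes(value) for value in values)

    return {
        "MAPD": any_value_passes(["MAPD"], lambda v: v < 0.25),
        "SNPQC": any_value_passes(["SNPQC", "ndSNPQC"], lambda v: v >= 15),
        "wavinessSD": any_value_passes(
            ["wavinessSD", "ndwavinessSD"], lambda v: v < 0.12
        ),
    }


if __name__ == "__main__":
    # Hypothetical QC values for one sample.
    sample = {"MAPD": 0.18, "ndSNPQC": 17.2, "ndwavinessSD": 0.08}
    for check, passed in check_normal_diploid_qc(sample).items():
        print(check, "pass" if passed else "fail")
```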
1. From the Analysis Workflow, click the QC Results window tab.
2. Click the Settings drop-down menu, then select NDN View.r1, as shown in
Figure 39.
Setting up and running an OncoScan single sample analysis
1. From the Analysis menu, select Perform Analysis Setup. (Figure 40)
Figure 40 Analysis drop-down menu
2. From the Select array type drop-down list, click to select OncoScan.
Note: Once you select the array type, analysis workflow, and reference model
file, then the annotation file will be auto selected for you based on your earlier
selections.
IMPORTANT! The Select array type drop-down list includes only the array types for which
library (analysis) files have been downloaded from NetAffx or copied from the Library package
provided with the installation.
IMPORTANT! For FFPE samples use the FFPE Analysis NAXX workflow. For Control DNA
use the Control DNA Analysis.
5. By default, the Set workflow name is Workflow. Click inside the Workflow’s
(upper right) text box to enter a different workflow name.
IMPORTANT! After loading the CEL files, check that each AT CEL file lines up with its matching GC
CEL file.
12. Click the Result File Names drop-down menu to enable ChAS to automatically
generate Output Names.
Note: Output file names are only auto-generated if the two CEL files have the
same root name. It is recommended to use an “A” or “C” as the last character to
designate the channel in the CEL file naming convention. Example:
“_AS_05A.CEL” is an AT Channel file, while “_AS_05C.CEL” is a GC Channel file.
You can also clear this (populated) column by clicking Clear Column.
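The naming convention in the note above can be verified before loading files. The following is a minimal sketch, not part of ChAS, that pairs AT and GC CEL files by shared root name. It assumes the channel is the last character of the file name before the .CEL extension ("A" for AT, "C" for GC); the function name is hypothetical.

```python
import os


def pair_cel_files(filenames):
    """Group CEL file names into (AT file, GC file) pairs by shared root name.

    Assumes the recommended convention: the last character before the
    .CEL extension is "A" (AT Channel) or "C" (GC Channel). Files that do
    not follow the convention, or that lack a partner, are left out.
    """
    by_root = {}
    for name in filenames:
        stem, ext = os.path.splitext(os.path.basename(name))
        if ext.upper() != ".CEL" or not stem:
            continue
        channel = stem[-1].upper()
        if channel not in ("A", "C"):
            continue
        # The root name is everything before the channel character.
        by_root.setdefault(stem[:-1], {})[channel] = name

    pairs = {}
    for root, channels in by_root.items():
        if "A" in channels and "C" in channels:
            pairs[root] = (channels["A"], channels["C"])
    return pairs
```

A file that appears in the input list but not in the returned pairs is a candidate for renaming before you rely on auto-generated output names.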
13. OPTIONAL: To choose a different output folder from the saved output path that
is displayed, click the Output result information’s Browse button.
An Explorer window appears.
14. Navigate to an output folder location, then click OK.
Note: To better organize your output results, you can add sub-folders to your
assigned output result path/folder.
1. Click the ... button to return to your assigned output path and/or folder.
IMPORTANT! If you are saving the same OSCHP file into the same output file folder that
contains your originally run OSCHP file, a “1” is automatically added into the filename (in
addition to any suffix you may add) to differentiate the two runs of identical CEL file names.
Example: na33(1).oschp
7. Click Submit.
The Workflow dashboard window appears and your annotation files begin to
load. (Figure 45).
The Analysis Workflow Dashboard tracks ongoing analysis tasks for ChAS. It also
delivers the results of the analyses and can restart the Browser (if it was shut
down to free up memory for the analysis).
Note: The View Logs button will access the algorithm pipeline logs which may
be useful if you have a Workflow that fails to complete.
8. Click to choose the analysis you want to view, then click View Results List.
9. Click each sample’s check box or click the Sample File check box to select ALL
samples.
If the following warning message appears (Figure 50), click Yes to acknowledge
it.
Note: The ChAS Browser allows loading files analyzed using different NetAffx
versions at the same time (as long as the versions are all from the same
reference and genome build). If the NetAffx versions are from different builds of the
genome (for example Hg18 and Hg19), the ChAS Browser does not load the files.
After a few moments, the ChAS browser featuring your selected samples
appears.
If the region that is true diploid is a whole chromosome, use Method 1. If the region
that is true diploid is part of a chromosome, use Method 2.
Method 1 Determining the median Log 2 ratio for the region in the sample that is truly
diploid
Figure 53 Chromosome Summary tab - median log2 ratio value for Chromosome 16
example
Method 2 Determining the median Log 2 ratio for the region in the sample that is truly
diploid
7. Load the OSCHP file into the QC Results tab by clicking on the Add Files
button. (Figure 56)
8. From the QC Settings drop-down menu, select Recenter View, then make a
note of the TuScan L2R Adj value. (Figure 56)
9. Click on the Analysis setup tab. (Figure 57)
10. Select the FFPE or non-FFPE analysis based on the sample type.
11. Load in the two CEL files into the appropriate channel.
12. Check the Manual Recenter check box to enable the parameter fields.
(Figure 57)
13. Enter the TuScan Log2 Adjustment value you noted earlier.
14. Enter the value of the median Log2 determined from the browser into the Adjust
this log 2 to 0 field.
15. Enter a suffix if desired.
Note: RC is automatically appended to the name of any OSCHP file that goes
through Manual Recentering, giving it an RC.OSCHP extension.
Figure 58 shows the original OSCHP file (pink data) and the manually recentered
RC.OSCHP (green data).
By inputting both the TuScan Log2 Ratio value (derived from the algorithm) and the
median Log2 Ratio value (for the region you have determined to be diploid,
Chromosome 16q for our example), the Recentering Algorithm has recentered the
log2 ratio data (for the region determined to be diploid) around 0 and there is no longer
a loss segment called in this region.
Figure 58 Example original OSCHP file (pink data) and the manually re-centered RC.OSCHP (green data)
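The effect of recentering on the diploid region can be illustrated with a small calculation. The following is a simplified sketch only, not the Recentering Algorithm used by the software: it ignores the TuScan Log2 Adjustment value and simply subtracts the median log2 ratio of the chosen region from every marker. The marker tuples and function names are hypothetical.

```python
from statistics import median


def region_median_log2(markers, chrom, start, end):
    """Median log2 ratio of the markers inside one region.

    markers is a list of (chromosome, position, log2_ratio) tuples,
    a hypothetical layout chosen for this illustration.
    """
    values = [l2r for c, pos, l2r in markers if c == chrom and start <= pos <= end]
    if not values:
        raise ValueError("no markers found in the requested region")
    return median(values)


def shift_log2(markers, offset):
    """Subtract offset from every log2 ratio (simplified recentering)."""
    return [(c, pos, l2r - offset) for c, pos, l2r in markers]


# Example: the region assumed to be truly diploid sits below 0 before the shift.
markers = [("16", 100, -0.3), ("16", 200, -0.2), ("16", 300, -0.4), ("1", 100, 0.1)]
offset = region_median_log2(markers, "16", 0, 1000)
recentered = shift_log2(markers, offset)
```

After the shift, the median log2 ratio of the chosen region is 0, and every other marker moves by the same amount.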
Setting up and running an OncoScan matched normal analysis
As long as your library file folder contains the necessary analysis files for the array and
your configuration paths are established (Figure 59), your Array Information fields will
auto-populate. (Figure 60)
IMPORTANT! After adding new library files to the library file folder, always close and re-
launch OncoScan Console to ensure the newly added files are recognized by the software.
2. From the Select analysis workflow drop-down list, click to select FFPE
Analysis including Matched Normal NAXX.
Other available Analysis Workflow options are:
Control DNA Analysis NAXX - Use this workflow for the Control DNA in the
OncoScan Kit.
Non-FFPE Analysis NAXX - Use this workflow with cell line DNA.
FFPE Analysis NAXX - Use this workflow for a standard analysis.
3. Enter a Workflow name (optional). By default, the Set workflow name is
Workflow. Click Workflow (upper right) to enter a different workflow name.
IMPORTANT! If the Reference Model File and Somatic mutation Reference Model File were
created independently of each other, a warning message appears after you click Submit (to
start the Workflow Analysis process). Click OK to acknowledge the message.
Adding CEL files to analyze
You can manually add CEL files or import them as a tab-delimited text file.
Manually adding CEL files to analyze
1. At the Select the intensity (CEL) file(s) to analyze pane, click the Add CEL files
drop-down.
2. Click Tumor AT Channel.
The CEL file window appears. (Figure 61)
4. Single click, Ctrl click, or Shift click (to select multiple Tumor AT Channel files).
IMPORTANT! It is recommended to use an “A” or “C” as the last character to designate the
channel in the CEL file naming convention. Example: “_AS_05A.CEL” is an AT Channel file,
while “_AS_05C.CEL” is a GC Channel file. See Figure 61.
5. Click Open.
The Tumor AT Channel fields are now populated. (Figure 63)
The File Name drop-down list (Figure 67) is populated dynamically from the attributes
present in the ARR file.
To use this display option:
1. Provide the appropriate attributes at the time of sample registration in AGCC.
2. Keep the ARR files in the same folder as the CEL files.
To see “channel” (as an option in the drop down), you must use a template (or the
1. Click the File Name drop-down button, then click to select the attribute you want
displayed along with your CEL file names.
The two examples (Figure 68 and Figure 69) show how the table appears with the
display set to Filename, then to Channel.
Importing CEL files using batch import
The batch file must be saved in text (Tab-delimited) format and include the full
directory path for your CEL files, as shown in Figure 70.
Note: The resulting OSCHP files are saved to your output path location, therefore it is
not necessary to include a path under RESULT. Simply enter the desired results
filename in this column.
The format for this tab-delimited file is 5 columns (A, B, C, D, and E) with the headers:
ATCHANNELCEL
GCCHANNELCEL
ATChannelMatchedNormalCel
GCChannelMatchedNormalCel
RESULT
You must provide the full path to the CEL files for each Channel column.
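A batch file in this format can also be generated with a script rather than a spreadsheet. The following is a minimal sketch using the five headers listed above; the function name, the file paths, and the result name in the usage note are hypothetical placeholders.

```python
import csv

# Column headers as listed above, in column order A to E.
HEADERS = [
    "ATCHANNELCEL",
    "GCCHANNELCEL",
    "ATChannelMatchedNormalCel",
    "GCChannelMatchedNormalCel",
    "RESULT",
]


def write_batch_file(path, rows):
    """Write a tab-delimited batch import file.

    rows is a list of 5-item sequences: four full CEL file paths
    (tumor AT, tumor GC, normal AT, normal GC) and a result file name.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(HEADERS)
        for row in rows:
            if len(row) != len(HEADERS):
                raise ValueError("each row needs exactly 5 values")
            writer.writerow(row)
```

For example, write_batch_file("batch.txt", [[r"C:\data\T1_A.CEL", r"C:\data\T1_C.CEL", r"C:\data\N1_A.CEL", r"C:\data\N1_C.CEL", "T1_result"]]) writes one header line and one sample line. The RESULT column holds a file name only, because the output path is taken from your assigned output location.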
3. Click Open.
The Tumor AT, Tumor GC, Normal AT, Normal GC and Result File Name fields
are now populated. (Figure 71)
How it works
The Automatic CEL File Analysis tool continually scans up to five designated input
folders for new CEL files to analyze. Once a CEL file is detected, it is analyzed, resulting
in a CHP file (which is auto-saved to a designated output folder). Once a CEL file is
processed, the ARR and CEL files are moved to the designated Archive folder.
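The scan-and-archive behavior described above can be pictured with a short script. The following is an illustrative sketch only, not the tool's implementation: the function names are hypothetical, the analysis step is omitted, and it assumes an ARR file shares its base name with the CEL file and uses an upper-case .ARR extension.

```python
import shutil
from pathlib import Path

MAX_INPUT_FOLDERS = 5  # the tool scans up to five input folders


def find_cel_files(input_folders):
    """Return the CEL files found directly inside the given input folders."""
    if len(input_folders) > MAX_INPUT_FOLDERS:
        raise ValueError("at most five input folders are supported")
    found = []
    for folder in input_folders:
        for path in sorted(Path(folder).iterdir()):
            if path.is_file() and path.suffix.lower() == ".cel":
                found.append(path)
    return found


def archive_sample(cel_path, archive_folder):
    """Move a processed CEL file and its ARR file (if present) to the archive.

    Assumption for this sketch: the ARR file has the same base name as
    the CEL file. Other files in the input folder are left untouched.
    """
    cel_path = Path(cel_path)
    archive = Path(archive_folder)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for candidate in (cel_path, cel_path.with_suffix(".ARR")):
        if candidate.exists():
            shutil.move(str(candidate), str(archive / candidate.name))
            moved.append(candidate.name)
    return moved
```

In this sketch, a file that is neither a CEL nor an ARR file is never moved, which mirrors the behavior described for non-ARR and non-CEL files in the input folder.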
Supported array types
CytoScan HD (Single Sample Analysis or Normal Diploid Analysis only)
CytoScan XON (Single Sample Analysis only)
CytoScan 750K (Single Sample Analysis or Normal Diploid Analysis only)
CytoScan Optima (Single Sample Analysis only)
OncoScan CNV (FFPE Analysis only)
Launching the tool
1. Click the Utility Actions button (top right of the Analysis Workflow window).
2. Click Automatic CEL File Analysis.
The window opens. (Figure 72)
Any non-ARR and non-CEL files detected in your assigned input folder will remain in this folder.
Any detected ARR and CEL files will be processed and moved into your assigned Archive
folder.
Copies of the processed CEL files are placed in the Archive folder.
1. By default, hg19 is selected. If needed, click the Target Genome Version drop-
down to select hg38, as shown in Figure 73.
1. Click the check box to use the Normal Diploid algorithm. Leave unchecked to run
the default single sample analysis.
Opening the newly generated CHP file(s)
1. From the ChAS Browser window, click File → Open.
An Explorer window appears.
2. Navigate to your assigned Automatic CEL File Analysis output folder(s). See
"Assigning your output folder(s)" on page 86.
3. Click or Ctrl click to highlight the CHP files you want to open in the ChAS
Browser, then click Open.
Reference files
This section explains how to create a reference file which is required to perform single
sample analysis in ChAS. The software analyzes a sample file by comparing it to a
reference file. You can use the reference file provided with ChAS, or you can create a
reference file using your own sample data.
See Figure 75 for an overview of the analyses involved in creating a reference file for
the CytoScan Arrays.
1. From the Analysis menu, select Perform Analysis Setup. (Figure 76)
2. From the Select array type drop-down list, click to select an array type
(Example: CytoScanHD_Array).
Note: The Select array type drop-down list includes only the array types for
which library (analysis) files have been downloaded from NetAffx or copied from
the Library package provided with the installation.
3. Select a Genome Build. (Example: hg19)
IMPORTANT! The same annotation file you used to create a Reference Model File MUST also
be used with future Single Sample Analyses runs that utilize your created Reference Model File.
9. Single click, Ctrl click, or Shift click (to select multiple CEL files), then click Open.
The Select the intensity (CEL) file(s) to analyze pane is now populated with
your CEL files. (Figure 79)
To remove a CEL file from this list, click to highlight it, then click Remove.
10. At the Output result information pane, enter a name for your reference model
file. (Figure 80)
Exporting QC table information
1. Check the check box next to the file(s) you want to export or click the
Select All button (atop the check boxes) to auto-check all the displayed
files.
2. To export probe-level data, click the Generate Report drop-down. (Figure 82)
The following export reporting options appear: (Figure 82)
3. If you want to export all 4 available reports at once, click to select Export All
Probe Level Data. (Figure 83) Otherwise, click to select the specific report(s) you
want to export.
The appropriate (previously assigned) folder file window appears.
Note: The default root filename is Result. Click inside the File Name field to enter
a different root filename.
4. Enter a File Name for your reporting file, then click Save or navigate to a different
save location.
Exporting a gene report (CytoScan or OncoScan arrays)
This report summarizes the copy number segments that overlap user defined
regions of interest (e.g., Genes) as defined in the selected BED file.
1. From the QC Results tab, click the Generate Report drop-down menu and
select Export Gene Report. (Figure 84)
Figure 85 Select the BED file for the Gene Report window
4. Click Yes.
The Results Output folder window appears.
5. Locate the Gene Report text file, then open it in Microsoft Excel.
The following window appears. (Figure 88)
Start Position Start position of gene or region as defined in the bed file.
End Position End position of gene or region as defined in the bed file.
Genes This column is populated from the name column of the bed file. In most cases, it will contain gene
names.
Threshold Test Displays Outside Bounds if any of the QC metrics fail to meet a threshold test. For more information
on thresholds, see "Creating your own custom QC setting" on page 57.
% Aberr.Cells (OSCHP only) If % AC = 100%, we return “homogeneous” because it could be 100% normal or 100% tumor.
If % AC = NA, the percent aberrant cells could not be determined and TuScan returns non-integer CN
calls. This metric is an algorithmically determined estimate of the % of aberrant cells in the sample.
TuScan Ploidy (OSCHP only) TuScan Ploidy is the most likely ploidy state of the tumor before additional aberrations occurred.
TuScan Ploidy is assigned the median CN state of all markers, provided that %AC could be
determined and integer copy numbers are returned. If %AC cannot be determined, NA (Not Available)
is reported for both ploidy and %AC.
Low Diploid Flag (OSCHP only) An essential part of the algorithm is the identification of “normal diploid” markers in the cancer
samples. This is particularly important in highly aberrated samples. The normal diploid markers are
used to calibrate the signals so that “normal diploid” markers result in a log2 ratio of 0 (that is, copy
number 2). The algorithm might later determine that the “normal diploid” markers identified really
correspond to (for example) CN=4. In this case the log2 ratio gets readjusted and TuScan ploidy will
report 4. Occasionally (in about 2% of samples) the algorithm cannot identify a sufficient number of
“normal diploid” markers and no “normal diploid” calibration occurs. This event triggers “low diploid
flag” = YES. In this case the user needs to carefully examine the log2 ratios and verify whether re-centering
is necessary.
Median Log2 Ratio (OSCHP only) Log2 Ratio is the log2 ratio of the normalized intensity of the sample over the normalized intensity of
a reference with further correction for sample specific variation. The Median Log2 Ratio is computed
for each segment.
Median BAF (OSCHP only) B-allele frequency (BAF) is Signal(B) / (Signal(A) + Signal(B)), where Signal(A) is the signal from the AT
chip and Signal(B) is the signal from the G/C chip. Median BAF is reported for each segment and is
the median BAF of the markers identified as heterozygous, after mirroring any marker BAFs above 0.5
to the equivalent value below 0.5. If the number of heterozygous markers in the segment is below 10
or the percent of homozygous markers is above 85%, no value is reported.
State This is a comma separated list of the copy number state of the segments that overlap the gene or
region.
LOH Flag to indicate whether the gene or region is in a Loss of Heterozygosity region (0=No, 1=Yes).
Gene This column is populated from the name column of the bed file. In most cases, it will contain gene
names.
Gene Start Position Start position of the gene or region as defined in the bed file.
Gene End Position End position of the gene or region as defined in the bed file.
XON Region Start Position Start position of the Exon Region segment as defined in the XNCHP file.
XON Region Stop Position End position of the Exon Region segment as defined in the XNCHP file.
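The State column described in the table above lists the copy number state of every segment that overlaps a gene or region. The following is a minimal sketch of that overlap test, not the software's implementation. It assumes all coordinates are on the same chromosome and treats start and end positions as inclusive; the function name and tuple layout are hypothetical.

```python
def overlapping_states(region_start, region_end, segments):
    """Return a comma separated list of CN states for overlapping segments.

    segments is a list of (start, end, cn_state) tuples on the same
    chromosome as the region. Two intervals overlap when each one
    starts before the other one ends.
    """
    states = [
        str(state)
        for start, end, state in segments
        if start <= region_end and end >= region_start
    ]
    return ",".join(states)
```

For example, a region spanning 100 to 200 overlaps the segments (50, 120, 3) and (150, 300, 1) but not (250, 400, 2), so the sketch returns the two states 3 and 1 joined by a comma.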
2. Click the Select the region file Browse button to navigate to and select the
appropriate file.
Note: For the Regions file, you can use the default bed files provided in the
library files for use with the Copy Number Expression Overlap report. You may
also create your own AED/BED file containing your custom regions of interest.
3. Click the Select the AED file (TAC output) Browse button to navigate to and
select the appropriate file. Refer to the Transcriptome Analysis Console (TAC) 4.0
User Manual for analyzing and exporting expression data as an AED file.
4. Click the Select output file name Browse button to navigate to an existing report
location, or click inside the text field to enter a different root filename (other than
the default Results filename), then click OK.
A progress bar appears while your report generates.
5. Locate the cnexoverlapreport.txt file, then double-click on it to open it. It is
recommended to open the tab delimited file with Excel for easier viewing.
Gene This column is populated from the name column of the bed file. In most cases, it will contain gene
names.
Gene Start Position Start position of gene or region as defined in the bed file.
Gene End Position End position of gene or region as defined in the bed file.
CN State This is a comma separated list of the copy number state of the segments that overlap the gene or
region.
Segment Start Position Start position of the overlapping copy number segment in the xxCHP file.
Segment End Position End position of the overlapping copy number segment in the xxCHP file.
Segment Size (bp) The Segment End Position minus the Segment Start Position.
% Aberr.Cells (OncoScan only) If % AC = 100%, we return “homogeneous” because it could be 100% normal or 100% tumor. If %
AC = NA, the percent aberrant cells could not be determined and TuScan returns non-integer CN calls.
This metric is an algorithmically determined estimate of the % of aberrant cells in the sample.
TuScan Ploidy (OncoScan only) TuScan Ploidy is the most likely ploidy state of the tumor before additional aberrations occurred.
Algorithmically it is the CN state of the markers identified by the algorithm as normal diploid before
%AC and ploidy are determined. When a high ploidy is determined the “normal diploid” is deemed to
correspond to a higher CN and the log2 ratio gets adjusted appropriately. If ploidy cannot be
determined, NA (Not Available) is reported.
Low Diploid Flag (OncoScan only) An essential part of the algorithm is the identification of “normal diploid” markers in the cancer
samples. This is particularly important in highly aberrated samples. The normal diploid markers are
used to calibrate the signals so that “normal diploid” markers result in a log2 ratio of 0 (that is, copy
number 2). The algorithm might later determine that the “normal diploid” markers identified really
correspond to (for example) CN=4. In this case the log2 ratio gets readjusted and TuScan ploidy will
report 4. Occasionally (in about 2% of samples) the algorithm cannot identify a sufficient number of
“normal diploid” markers and no “normal diploid” calibration occurs. This event triggers “low diploid
flag” = YES. In this case the user needs to carefully examine the log2 ratios and verify whether re-centering
is necessary.
Median Log2 Ratio (OncoScan only) Log2 Ratio is the log2 ratio of the normalized intensity of the sample over the normalized intensity of
a reference with further correction for sample specific variation. The Median Log2 Ratio is computed
for each segment.
Median BAF (OncoScan only) B-allele frequency (BAF) is Signal(B) / (Signal(A) + Signal(B)), where Signal(A) is the signal from the AT
chip and Signal(B) is the signal from the G/C chip. Median BAF is reported for each segment and is
the median BAF of the markers identified as heterozygous, after mirroring any marker BAFs above 0.5
to the equivalent value below 0.5. If the number of heterozygous markers in the segment is below 10
or the percent of homozygous markers is above 85%, no value is reported.
LOH Flag to indicate whether the gene or region is in a Loss of Heterozygosity region (0=No, 1=Yes).
Fold Change The level of fold change as determined from the TAC software.
TCID The Transcript Cluster ID overlapping the Gene or Region defined in the bed file.
TCID Start Position Start position of the overlapping TCID(s).
TCID GeneSymbol The Gene Symbol assigned to the TCID(s) based on the TAC analysis.
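The Median BAF column described in the table above combines a per-marker formula, a mirroring step, and two reporting conditions. The following is a minimal sketch of those rules as stated in the table, not the software's implementation. How markers are identified as heterozygous is not described here, so the sketch takes that as an input; the function names and list layout are hypothetical.

```python
from statistics import median

MIN_HET_MARKERS = 10            # below this count, no value is reported
MAX_HOMOZYGOUS_FRACTION = 0.85  # above this fraction, no value is reported


def baf(signal_a, signal_b):
    """B-allele frequency for one marker: Signal(B) / (Signal(A) + Signal(B))."""
    return signal_b / (signal_a + signal_b)


def median_baf(bafs, is_het):
    """Median BAF for one segment, or None when no value is reported.

    bafs holds one BAF per marker in the segment. is_het is a parallel
    list of booleans supplied by the caller (an assumption of this sketch).
    """
    if not bafs:
        return None
    het = [b for b, h in zip(bafs, is_het) if h]
    homozygous_fraction = 1 - len(het) / len(bafs)
    if len(het) < MIN_HET_MARKERS or homozygous_fraction > MAX_HOMOZYGOUS_FRACTION:
        return None
    # Mirror values above 0.5 to the equivalent value below 0.5.
    mirrored = [1 - b if b > 0.5 else b for b in het]
    return median(mirrored)
```

Because of the mirroring step, a reported Median BAF never exceeds 0.5; a balanced heterozygous segment gives a value near 0.5, and allelic imbalance pushes the value lower.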
Exporting to Integrative Genomics Viewer (IGV)
The ChAS Analysis Workflow enables you to export a variety of graphs for viewing in
IGV. To access this viewer, go to: http://software.broadinstitute.org/software/igv/
1. In the ChAS Analysis Workflow, click the QC Results tab.
2. Load results files by clicking on Add Files, then navigate to and highlight the
CHP files.
3. Click Open.
4. Select files by either clicking the Select All, or checking each filename’s adjacent
check box.
5. Click Export to IGV.
The IGV Exporter window appears. (Figure 94)
6. Click the Browse button to assign an output folder.
7. From the IGV Exporter window, click the check box(es) adjacent to the data you
want to export, as shown in Figure 94.
8. Click OK.
Note: To include sample attribute information in the IGV Export, click on
Attribute → Save prior to running the IGV Export.
Note: The export process may take several minutes to complete, as it is dependent
on your sample(s) and data type(s).
If the export was successful, an IGV Export Complete message appears. (Figure 95)
Acknowledge the message, then click OK.
Principal component analysis
ChAS Analysis Workflow enables you to perform Principal Component Analysis (PCA)
on signal (CHP) data. PCA identifies a new set of variables (PCA1, PCA2, and PCA3)
that account for a majority of the variance in the original data set. The first principal
component (PCA1) captures as much variability in the data as possible. PCA2
captures as much of the remaining variability (not accounted for by PCA1) as possible.
PCA3 captures as much of the remaining variability (not accounted for by PCA1 and PCA2) as
possible.
3. Points on the plot can also be hidden by right-clicking, then selecting the
appropriate option.
Note: The selected samples are only hidden from the plot.
4. Use the drop-down menus (Figure 98) to select attributes for display by Color
By Attribute and by Shape By Attribute.
Concordance checks
The ChAS Analysis Workflow enables you to perform pairwise
concordance checks on genotype calls for all selected samples.
The concordance for every pairwise comparison of the samples in the results
table is reported.
A reference sample can be selected. Once selected, concordances are displayed.
Compare to reference allows you to compare every sample to a single reference file.
The default values are set at >98%, 95-98%, and below 95%.
The three categories of QC are passing, failing, and marginal.
Each QC category is represented by its own unique color, as shown in Figure 100.
The default values can be changed by either moving the hash marks on the line bar,
or typing in a number in the designated category (Figure 101).
The data can also be viewed in matrix format by selecting Matrix in the drop-down
menu, as shown in Figure 102.
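The pairwise comparison and the default categories described above can be sketched as follows. This is an illustration only, not the software's implementation. It assumes that concordance is the percentage of markers with identical genotype calls, that markers with a no-call in either sample are skipped, and that the default ranges map to passing (>98%), marginal (95-98%), and failing (below 95%). The function names and the "NoCall" label are hypothetical.

```python
from itertools import combinations


def concordance(calls_a, calls_b, no_call="NoCall"):
    """Percent of compared markers with identical genotype calls.

    Markers with a no-call in either sample are skipped (an assumption
    of this sketch). Returns None when nothing can be compared.
    """
    compared = 0
    matches = 0
    for a, b in zip(calls_a, calls_b):
        if a == no_call or b == no_call:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return None
    return 100.0 * matches / compared


def qc_category(percent, pass_above=98.0, fail_below=95.0):
    """Map a concordance percentage to a category using the default values."""
    if percent > pass_above:
        return "passing"
    if percent < fail_below:
        return "failing"
    return "marginal"


def pairwise_concordance(samples):
    """Concordance for every pair of samples in a {name: calls} mapping."""
    return {
        (first, second): concordance(samples[first], samples[second])
        for first, second in combinations(sorted(samples), 2)
    }
```

Comparing every sample to a single reference corresponds to calling concordance once per sample with the reference calls as the second argument.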
3. Click to select the samples you want to filter from the drop-down list.
2. Use the menu selections to customize your sample and reference columns.
Running an error checking analysis
1. From the Analysis menu, select Perform Analysis Setup.
The Analysis Workflow window opens. (Figure 106)
2. From the Select array type drop-down list, click to select an array type
(Example: CytoScanHD_Array).
Note: The Select array type drop-down list includes only the array types for
which library (analysis) files have been downloaded from NetAffx or copied from
the Library package provided with the installation.
3. From the Select analysis workflow drop down, click to select the appropriate
Mendelian Error Check Workflow. (Example: CytoScanHD_Array Mendelian
Error Check)
4. By default, the Set workflow name is Workflow. Click inside the Workflow’s
(upper right) text box to enter a different workflow name.
Interpreting an error checking analysis
The Mendelian Error Check analysis provides two key points of information:
1. Are the input samples related?
Mother-Child
Father-Child
If the samples are related, the Role Validity equals 1. If the samples are not related,
Role Validity equals 0. (Figure 109) The output also indicates which CYCHP/XNCHP
file is assigned as the Mother, Father and Child (Index). The analysis can also be run
as a duo analysis (Mother-Child or Father-Child).
Analysis Type Provides the samples being run through the analysis based on the sample key.
0 = Proband, 1 = Mother, 2 = Father.
Familial Sample Key Tells you the parent for which relatedness is being tested.
1= Mother only used in the analysis.
2= Father used in the analysis (may be father only (duo) or trio).
Role Validity A logical value with 0 being False and 1 being True. If the Role Index Score is > 1000 then the Role
Validity = 1 (likely related). If the Role Index Score is < 1000 then Role Validity = 0 (likely unrelated).
Role Index Score This score is essentially a log odds score that computes the probabilities of the
observed genotype calls for the trio while accounting for potential genotyping error. Assume the
following hypotheses: H1, the alleged father is the true father; H2, the alleged father is a random male.
For each marker, compute a likelihood ratio of H1 vs H2, then sum over all markers. In theory, a value
of zero means both hypotheses are equally likely. A positive value means the samples are more likely
related (paternity). A negative value means they are more likely unrelated.
MIE Trio Mendelian Inheritance Error for the Trio. (Ex. 119 errors of 56482 SNPs).
Percentage MIE Trio The raw error count expressed as a percentage. (Ex. 119/56482 × 100).
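Two of the values in the table above are simple calculations. The following is a minimal sketch using the example numbers and the threshold quoted in the table; the function names are hypothetical. The table does not say what is reported for a Role Index Score of exactly 1000, so this sketch returns 0 in that case as an assumption.

```python
ROLE_INDEX_THRESHOLD = 1000  # threshold quoted in the Role Validity row


def percentage_mie(error_count, snp_count):
    """Mendelian inheritance errors as a percentage of the SNPs tested."""
    return error_count / snp_count * 100


def role_validity(role_index_score):
    """1 (likely related) above the threshold, otherwise 0 (likely unrelated).

    A score of exactly 1000 is not covered by the table; returning 0
    for it is an assumption of this sketch.
    """
    return 1 if role_index_score > ROLE_INDEX_THRESHOLD else 0


# Example from the table: 119 errors of 56482 SNPs, roughly 0.21%.
example_percent = percentage_mie(119, 56482)
```

A low Percentage MIE together with a Role Validity of 1 is the pattern expected for a correctly assigned trio.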
3. Make note of the assigned zip folder filename and its location.
4. Use Windows Explorer to navigate to the location.
Example: C:\ProgramData\Affymetrix\ChAS\log
5. Locate the zip folder you noted earlier, then double-click on it to open it.
The folder opens.
Log rollover
When the software determines that the log file for the Analysis Workflow
(C:\ProgramData\Affymetrix\ChAS\log\AnalysisWorkflow.log) has reached a defined size
(approximately 4MB), the following steps will be completed:
A sub-folder will be created in C:\ProgramData\Affymetrix\ChAS\log called 'Log*'
(the * denotes the current date and time).
A zip file called RolledLogFile*.zip is created in that folder. The '*' is the same date
and time used for the folder name. The files in the
C:\ProgramData\Affymetrix\ChAS\log folder and all files found in the currently
selected QC History Log folder will be included in this zip file.
The Analysis Workflow files that are associated with analysis workflows that are no
longer active on the Dashboard will be deleted from
C:\ProgramData\Affymetrix\ChAS\log
A new AnalysisWorkflow.log file will be created in
C:\ProgramData\Affymetrix\ChAS\log
Note: When referring to steps that apply to both CytoScan CYCHP and SNP 6.0 CNCHP
data files, the CHP files are described as CxCHP files. When referring to steps that apply
to CytoScan CYCHP, CytoScan XNCHP, CytoScan HTCMA RHCHP, SNP 6.0 CNCHP
data, and OncoScan files, the resultant files are described as xxCHP files.
IMPORTANT! In a new user profile, smoothing and joining are turned on by default for
CytoScan 750K and CytoScan HD arrays. Smoothing and joining are disabled for CytoScan
Optima arrays. Smoothing and joining are OFF by default for OncoScan and CytoScan HTCMA
arrays. The smoothing and joining settings are specific for each array type (for more details on
smoothing and joining, see "Copy number segment smoothing and joining (optional)" on page
127).
When loading XNCHP files into ChAS for viewing from CytoScan XON arrays, the
software:
1. Selects the segments in the XNCHP file to display as segments.
2. Displays the segments and graph data:
Segment Data
• Loss of Heterozygosity
• Exon Region Gain/Loss
Graph Data
• Log2 Ratio
• Weighted Log 2 Ratio
• Smooth Signal
• Loss of Heterozygosity (LOH)
• Allele Difference
• B-allele Frequency (BAF)
• Genotype Calls
When loading RHCHP files into ChAS for viewing, the software:
1. Selects the segments in the RHCHP file to display as segments.
2. Displays the segments and graph data:
Segment Data
• Copy Number Gain/Loss
• Loss of Heterozygosity
Graph Data
• Copy Number State
• Log 2 Ratio
• Smooth Signal
• LOH
• Allele Difference
• B-allele Frequency
When loading SNP6 CNCHP files into ChAS for viewing, the software does the
following:
1. Performs segment detection by analyzing the CN and LOH graph data in the
CNCHP file.
Note: When running the Segment Reporting Tool in GTC on SNP 6 data, the software
sets the end coordinate such that the segment ends at the base position of the last
marker in the segment. When loading SNP 6 data into ChAS, the segment detection
sets the end coordinate for a segment such that the segment ends one base after the
last marker in the segment. This may result in a discrepancy between the end position
for segments when comparing data analyzed in both GTC and ChAS.
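The one-base difference described in the note above can be stated as a small rule. The following is a minimal sketch of the two end-coordinate conventions as described; the function name and the convention labels are hypothetical.

```python
def segment_end(last_marker_position, convention):
    """End coordinate of a segment under the two conventions in the note.

    "GTC": the segment ends at the base position of the last marker.
    "ChAS": the segment ends one base after the last marker.
    """
    if convention == "GTC":
        return last_marker_position
    if convention == "ChAS":
        return last_marker_position + 1
    raise ValueError("convention must be 'GTC' or 'ChAS'")
```

When comparing segment tables from the two programs, an end position that differs by exactly one base for the same last marker reflects this convention, not a different segment call.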
IMPORTANT! For CNCHP files from the SNP 6.0 Array, smoothing but not joining is turned
on by default in a new user profile.
When loading OncoScan OSCHP files for viewing, the software does the following:
1. Displays segments in the OSCHP created by the TuScan Copy number
algorithm. For details on this algorithm, please refer to the OncoScan Console
User Guide (P/N 703195) or Appendix G.
IMPORTANT! Smoothing and Joining are OFF by default for OSCHP files.
When loading ReproSeq Aneuploidy data for viewing, the software does the following:
1. Displays segments from the Ion Reporter software. For details, refer to the Ion
Reporter User Guide: https://ionreporter.thermofisher.com/ir/
2. Displays the whole genome sequencing tiles on the Copy Number State graph.
Loading files
Loading xxCHP data for viewing in ChAS involves the following steps:
1. Optional: Before loading: Select Segment Smoothing and Segment Joining
parameters for processing the CN Gain and Loss Segment data for CxCHP files.
IMPORTANT! Smoothing and Joining are ON by default for CytoScan 750K and HD arrays.
Both Smoothing and Joining are OFF by default for OncoScan, CytoScan HTCMA, and
GenomeWide Human SNP 6.0 arrays, and are disabled for CytoScan Optima arrays and
ReproSeq Aneuploidy data.
XON Segment Merging is turned ON by default for CytoScan XON arrays. For details, see "XON
segment merging" on page 132.
3. Select File → Open on the menu bar. Alternatively, click the File Open
button.
The Open window appears. (Figure 112)
4. To view information about results, select one or more files, then click Sample
Info.
The Sample Info window opens. (Figure 113)
Note: If the xxCHP and ARR files are located in the same folder, the Sample Info
window shows information about both the sample and the results. To load files
from the Sample Info window, select the files, then click Open Selected Files.
Figure 114 Enter a file name search term (*text string*) and select a file type
Figure 115 The Open window shows files with names that include your search term
6. Select the files (you can use Shift click or CTRL click to select multiple files)
7. Click Open.
If any of the files fail the QC checks, a warning notice appears (Figure 116).
You can click Yes to continue to load the files.
If the following warning message appears (Figure 119), click Yes to acknowledge
it.
Note: The ChAS Browser allows loading different NetAffx versions at the same
time (as long as the versions are all from the same reference and genome
build). If the NetAffx versions are from different builds of the genome (for example
Hg18 and Hg19), the ChAS Browser does not load the files.
A progress bar appears (Figure 120)
After a few moments, the ChAS browser featuring your selected samples
appears.
The loaded files appear in the Files list pane. (Figure 121).
IMPORTANT! Smoothing and joining are turned on by default in a new user profile for CytoScan
750k and HD Arrays.
Smoothing and Joining are specified per array type. The processes do not affect the
marker data in the CNCHP or CYCHP file. If these settings are turned off, the Copy
Number segment data is displayed without smoothing or joining.
IMPORTANT! Smoothing and Joining affect only data loaded from CNCHP and CYCHP files.
This ONLY applies to copy number data, NOT LOH or Mosaic types. Smoothing and Joining is OFF
by default for CytoScan Optima, OncoScan, and ReproSeq Aneuploidy files.
Segments which have been smoothed and/or joined are indicated by a blue check
mark in the Smoothed/Joined column of the Segments table (Figure 122). The
segment ID name indicates whether smoothing and/or joining has occurred. A red “X”
indicates no smoothing or joining has been applied.
Option: Use default segment data rules configuration
Description: For the CytoScan 750K and HD Arrays, the default smoothing and joining rules are:
– Smooth Gain or Loss CNState runs to the most common marker value, then generate segments.
– Join any "split" CNState runs separated by no more than 50 normal-state markers.
– Join Gain or Loss CNState runs interrupted by normal state data which are separated
from each other by no more than 200 kbp.
– Skip segments from the smoothing operation for all arrays that have a CN < 1.
For SNP 6 arrays, the default smoothing rule is:
– Smooth Gain or Loss CNState runs to the most common marker value, then generate segments.
Option: Smooth Gain or Loss CNState runs to the most common marker value
Description: Smoothing to the most common marker state value is only applied to
contiguous CNState runs of the same type (gain or loss).
Option: Limit smoothing of CNState data to not smooth aberrant segments more distant than
this number of CNStates
Description: If this option is chosen, CNState runs which are farther apart than the
“smoothing maximum jump limit” will not be smoothed. For example, if the smoothing
maximum jump limit is set at 1, then adjacent segments with CNState 3 and 5 will not be
smoothed.
Option: Join Gain or Loss CNState runs separated by no more than this number of markers of
normal state data
Description: If this option is chosen, only Gain or Loss CNState runs which are separated
by no more than the threshold number of markers of normal state data will be joined. For
example, if the marker threshold is set at 50, then CNState runs separated by more than 50
markers of normal state data will not be joined.
Option: Join Gain or Loss CNState runs interrupted by normal state data which are separated
from each other by no more than this distance measured in kbp
Description: For the CytoScan 750K and HD Arrays, the default smoothing and joining rules are:
– Smooth Gain or Loss CNState runs to the most common marker value, then generate segments.
– Join any "split" CNState runs separated by no more than 50 normal-state markers.
– Join Gain or Loss CNState runs interrupted by normal state data which are separated
from each other by no more than 200 kbp.
For SNP 6 arrays, the default smoothing rule is:
– Smooth Gain or Loss CNState runs to the most common marker value, then generate segments.
Option: Limit the joining of CNState data (which flanks normal state data) to not join
aberrant segments more distant than this number of CNStates
Description: Smoothing to the most common marker state value is only applied to
contiguous CNState runs of the same type (gain or loss).
IMPORTANT! If multiple smoothing and/or joining check boxes are selected, all criteria must
be met to smooth and/or join the segments.
About smoothing
Note: The examples shown below are for a case where the expected copy number is
2. Similar calculations take place for the X and Y chromosomes, where the expected
copy number may be 0, 1, or 2, depending on gender and whether the segment is
located within or outside of the PAR region.
Without smoothing, a contiguous set of markers with gain values (for instance, CN State
values of three and four), with no intervening markers of copy number 2 or lower, is
treated as a series of individual gain segments. The same rule applies to a set of
markers with loss values of 0 or 1.
With smoothing, a contiguous set of markers with gain values of three and four, with no
intervening markers of copy number 2 or lower, is consolidated into a single gain
segment. (Figure 124)
Likewise, after smoothing, a contiguous set of markers with loss values of zero and one,
with no intervening markers of copy number 2 or higher, is consolidated into a single
loss segment.
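The difference between smoothed and unsmoothed segments can be illustrated with a short Python sketch. It is not the ChAS implementation: the function names, the marker-index representation of a segment, and the tie-breaking behavior of "most common" are assumptions made for illustration, and the expected copy number is fixed at 2 as in the examples above.

```python
from collections import Counter
from itertools import groupby

EXPECTED_CN = 2  # expected copy number used in the examples above


def run_type(state):
    """Classify one marker's CNState as gain, loss, or normal."""
    if state > EXPECTED_CN:
        return "gain"
    if state < EXPECTED_CN:
        return "loss"
    return "normal"


def segments_without_smoothing(states):
    """Each run of identical aberrant CNStates becomes its own segment."""
    segments, start = [], 0
    for value, run in groupby(states):
        length = len(list(run))
        if value != EXPECTED_CN:
            segments.append((start, start + length - 1, run_type(value), value))
        start += length
    return segments


def segments_with_smoothing(states):
    """Each contiguous gain (or loss) run is consolidated into one segment
    that carries the most common marker value of the run."""
    segments, start = [], 0
    for kind, run in groupby(states, key=run_type):
        values = list(run)
        if kind != "normal":
            most_common = Counter(values).most_common(1)[0][0]
            segments.append((start, start + len(values) - 1, kind, most_common))
        start += len(values)
    return segments


# Hypothetical per-marker CNStates along one chromosome.
states = [2, 2, 3, 3, 4, 3, 2, 2, 1, 0, 1, 2]
print(segments_without_smoothing(states))  # six separate aberrant segments
print(segments_with_smoothing(states))     # one gain segment and one loss segment
```

Each tuple holds the first marker index, the last marker index, the segment type, and the CNState assigned to the segment.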
About joining
The joining options enable you to join aberrant CNState segments of the same type
(gain or loss) that are separated by no more than a specified number of normal-state
markers or by no more than a specified distance of normal-state data (Figure 125).
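The joining criteria can be sketched in Python as follows. This is a hedged illustration rather than ChAS code: the Segment fields, the function name, and the assumption that segments arrive sorted along one chromosome with only normal-state data between neighbors are all hypothetical. When both limits are supplied, both must be met, mirroring the rule that all selected criteria apply.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str          # "gain" or "loss"
    first_marker: int  # index of the first marker in the segment
    last_marker: int   # index of the last marker in the segment
    start_bp: int      # genomic start position in base pairs
    end_bp: int        # genomic end position in base pairs


def join_segments(segments, max_markers=None, max_kbp=None):
    """Join neighboring segments of the same type when every selected
    criterion is met. A limit of None means that option is not selected."""
    if max_markers is None and max_kbp is None:
        return list(segments)
    joined = []
    for seg in segments:
        prev = joined[-1] if joined else None
        if prev is not None and prev.kind == seg.kind:
            gap_markers = seg.first_marker - prev.last_marker - 1
            gap_kbp = (seg.start_bp - prev.end_bp) / 1000
            markers_ok = max_markers is None or gap_markers <= max_markers
            distance_ok = max_kbp is None or gap_kbp <= max_kbp
            if markers_ok and distance_ok:
                joined[-1] = Segment(prev.kind, prev.first_marker,
                                     seg.last_marker, prev.start_bp, seg.end_bp)
                continue
        joined.append(seg)
    return joined


# Hypothetical gain segments: the first gap is 30 markers / 100 kbp,
# the second gap is 199 markers / 1000 kbp.
a = Segment("gain", 0, 99, 1_000_000, 1_500_000)
b = Segment("gain", 130, 200, 1_600_000, 2_000_000)
c = Segment("gain", 400, 450, 3_000_000, 3_200_000)
for seg in join_segments([a, b, c], max_markers=50, max_kbp=200):
    print(seg)
```

With the default limits of 50 markers and 200 kbp used here, the first two segments are joined and the third stays separate.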
Turning off XON merging
1. To turn off XON Merging, go to Preferences → User Configuration.
The User Configuration window appears. (Figure 127)
2. Click the Segment Data tab, then select CytoScan XON from the array drop-down.
3. Uncheck the Use default segment data rules configuration check box.
4. Uncheck the XON Merging check box.
5. Close the ChAS browser, then reopen it.
XON Merging is now disabled.
Note: When using custom QC thresholds for CytoScan HTCMA in RHAS, these
custom thresholds will also need to be updated in ChAS to reflect the desired QC
thresholds.
Option: Property Name
Description:
– SNPQC is a QC metric for SNP probes that is derived from polymorphic (SNP) probes.
– MAPD is a QC metric for all probes used to determine copy number that is derived from
both polymorphic (SNP) and non-polymorphic (CN) probes.
– Waviness SD is a global measure of variation of microarray probes that is insensitive to
short-range variation and focuses on long-range variation.
– nd SNP QC is a QC metric for SNP probes that is derived from polymorphic SNP probes in
normal diploid regions.
– nd Waviness SD is the same measure as Waviness SD, but is calculated only in those
regions that are identified as normal diploid.
– Cel Pair Check is a test that inspects each pair of intensity (*.cel) files to determine
whether the files have been properly paired and assigned to the correct channel. (OSCHP only)
– DQC (DishQC) measures the amount of overlap between two homozygous peaks created by
non-polymorphic probes. DQC of 1 is no overlap, which is good. DQC of 0 is complete
overlap, which is bad.
– QC Call Rate is the percentage of autosomal SNPs with a call other than NoCall (measured
at the Sample QC step).
– SMN MAPD is a QC metric for all probes used to determine copy number that is derived from
both polymorphic (SNP) and non-polymorphic (CN) probes, calculated during SMN analysis.
– SMN WavinessSD is a global measure of variation of microarray probes that is insensitive
to short-range variation and focuses on long-range variation, calculated during SMN analysis.
Note: The property names are from the header information of the xxCHP file. QC thresholds are
not established for ReproSeq Aneuploidy files, but QC metrics are displayed in the QC and
Sample Info tab.
CytoScan (Normal Diploid Analysis) < 0.25 > 15.0 < 0.12 > 15.0 < 0.12
CytoScan Optima Array MAPD < 0.29 > 8.5 < 0.12 - -
Note: The Waviness SD metric is applicable to blood and cell line data. It is not
intended for alternative sample types, such as solid tumor or FFPE samples, in which
the results may vary as a result of the biological complexity. For these sample
types, nd Waviness SD is recommended instead.
Effect of SNPQC value (for a Single Sample Analysis) on the Allele Difference Track.
(Figure 129)
SNPQC is one of the CytoScan within-array QC metrics. It provides insight into
the overall level of data quality from a SNP perspective. When evaluating SNPQC
values, the key consideration is whether the threshold is exceeded. When the SNPQC
value is below the recommended acceptance threshold, the quality of the SNP allele
data is compromised for interpretation, as illustrated by the two leftmost graphs
representing the two-copy and three-copy allelic states.
For the CytoScan HD array, when the SNPQC value is below 15 (as illustrated by the
data in the two graphs above), the noise within the array is higher than expected. This,
in turn, compromises the overall data quality and clarity of the results. When the
SNPQC value is above 15, what matters is that the value exceeds the threshold, not
its absolute magnitude.
As long as the SNPQC value exceeds the threshold, data quality is retained, as
illustrated by the graphs that show clear allelic data across a broad range of SNPQC
values above the recommended threshold. The threshold was determined from thousands
of arrays processed across multiple reagent lots, operators, and sample aberration
types. SNPQC is one of the metrics used to assess array quality and helps determine
which experimental data sets are of satisfactory quality for subsequent
interpretation.
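A QC threshold is a pass or fail test built from a property name, an operator, and a value. The following Python sketch shows one way to express that check. It is illustrative only: the function name and data layout are hypothetical, the SNPQC limit of 15 comes from the text above, and the MAPD and Waviness SD limits are assumed example values that should be replaced with the thresholds configured for your array type.

```python
import operator

# Comparison operators that a threshold row may use.
OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

# (property name, operator, value). Only the SNPQC limit is taken from the text;
# the other two rows are assumed example values.
EXAMPLE_THRESHOLDS = [
    ("SNPQC", ">", 15.0),
    ("MAPD", "<", 0.25),
    ("Waviness SD", "<", 0.12),
]


def qc_failures(metrics, thresholds=EXAMPLE_THRESHOLDS):
    """Return the names of the metrics that do not meet their threshold.

    Only pass or fail is reported; how far a passing value exceeds the
    threshold is deliberately ignored.
    """
    failures = []
    for name, op, limit in thresholds:
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported for this file
        if not OPERATORS[op](value, limit):
            failures.append(name)
    return failures


print(qc_failures({"SNPQC": 18.2, "MAPD": 0.19, "Waviness SD": 0.08}))
print(qc_failures({"SNPQC": 12.0, "MAPD": 0.19}))
```

The first call reports no failures; the second reports SNPQC because 12.0 does not exceed 15.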
Editing an existing QC threshold
1. Click on an existing Property Name, then edit its QC property name.
2. Click the Type field to change it to Decimal Number or Whole Number.
3. Click the Operator field to choose a different operator from the drop-down list.
4. Double-click the Value field to edit the current threshold value.
5. To edit another existing QC property, repeat steps 1-4.
6. Click OK to apply the newly edited QC threshold(s).
Histogram data
Loading histogram data
IMPORTANT! You must be logged into the ChASDB to view Histogram data.
The histograms load by default. If they are not currently displayed, click ChAS DB →
Refresh ChAS DB data to view them in the ChAS Browser.
Note: The histograms are only available for NetAffx Genomic Annotation files for
genome build Hg19. The Browser produces an error message if you try to load
Hg19-based histograms while an Hg18-based or Hg38-based NetAffx Genomic Annotation
file is currently displayed.
Changing the default histogram filters
1. From the File tree (upper left column), locate the Default Histogram entry.
2. Right-click on Default Histogram.
A menu appears.
3. Click View/Edit Properties.
The File Properties window appears. (Figure 131)
Figure 131 File Properties window and Default Histogram Filters tab
4. Click the Histogram Filters window tab, then use its check box(es), selections,
and radio buttons to modify the default Histogram’s factory settings.
5. Click on the Change Filter Parameters button to select new filter settings.
7. Use this window’s check boxes, pre-populated entries, and/or the provided text
fields (to enter the filter parameters you want).
8. Click OK to save your changes.
Note: If using the aDGV containing segments from both HD and XON arrays, use the
Categories filters to limit the histogram data to only HD (check Gain and Loss) or only
XON (check Gain XON Region and Loss XON Region).
Changing the default histogram colors
1. From the File tree (upper left column), locate the Default Histogram entry.
2. Right-click on Default Histogram.
A menu appears.
3. Click View/Edit Properties.
The File Properties window appears.
4. Click on the Histogram Colors tab. (Figure 133)
5. Click on the color square representing the histogram track you want to change.
A change color window appears.
6. Select a new color, then click OK.
7. Repeat steps 5-6 as needed.
8. Click OK to save your color changes or click Reset to Defaults to return to the
default color settings.
Graphic Displays
See “Displaying data in graphic views” on page 152.
Tables
See “Displaying data in table views” on page 327.
After the data is loaded, you can:
– Filter the segments by Segment Parameters to hide segments that do not meet
your requirements for significance. See “Filtering segments” on page 217.
– Select a region information file for use as a CytoRegion file and:
  – Perform differential filtering for segments in CytoRegions and in the rest of the
  genome.
  – Display only segments that appear in CytoRegions using Restricted Mode. See
  “Using CytoRegions” on page 267.
– Select a region information file for use as an Overlap Map and use the Overlap filter
to conceal segments that overlap with the Overlap Map items. This functionality
may be helpful for tracking or filtering out benign copy number change regions.
See “Using the overlap map and filter” on page 279.
– Add selected features of the genome to new or existing Region (AED) files, and edit
annotation data on existing annotations. See “Creating and editing AED files” on
page 287.
– Prepare reports on your findings by exporting graphics and table data in PDF and
other formats. See “Exporting results” on page 415.
– Save setups of ChAS for different tasks in user profiles and named settings. See
“User profiles and named settings” on page 438.
Figure callouts: Menu Bar; Tool Bar; Karyoview in Upper Display Area; Files List;
Selected Chromosome; Named Setting drop-down list; Detail View in Lower Display Area;
Data Types List; Status Bar
Files list
The Files list (Figure 137) displays the different sources of data and annotations that
are loaded in the Chromosome Analysis Suite. Files are grouped by type in the Files
list.
Sample data
*Reference annotations displayed may be different than shown above depending on the NetAffx Genomic
Annotation file loaded.
The Data Types list enables you to select from Segments data and Graph data.
The Segments data is displayed graphically in:
Karyoview
Selected Chromosome View
Detail View
If filtering is applied to a segment type, a funnel icon appears next to the segment
symbol in the list.
Graph data, indicated by the icon, is displayed only in the Detail View. See "Detail view"
on page 170 for more information.
Unselected data is also concealed from the different tables and graphs.
See “Selecting data types for display” on page 192 for information on using the Data
Types list to select different data types for display.
Named settings
The Named Settings drop-down list (Figure 140) enables you to apply a previously
created setting for ChAS. The settings may include things like:
Segment Filter and Overlap Map Filter settings
Types of data to be displayed
Restricted Mode ON/OFF status. See “Using restricted mode” on page 275.
Note: Default Named Settings (indicated by the icon) should not be deleted from
the system because they are shared by all user profiles.
See Chapter 20, "User profiles and named settings" on page 438 for information on
creating and using Named Settings.
Note: The OncoScan Default Named Setting has no filters applied, so all segments
called by the TuScan algorithm can be viewed. Users can create their own appropriate
custom filter Named Setting for their data. See "User profiles and named settings" on
page 438.
Status bar
The status bar (Figure 141) (very bottom of browser) displays information on:
NetAffx Genomic Annotation database and its hg version
Restricted Mode status (See “Using restricted mode” on page 275.)
Edit Mode status (See “Using edit mode” on page 224.)
Cursor (Mouse Over) Position
User Profile ID
Figure callouts: NetAffx Genomic Annotation database and its hg version currently loaded
in the ChAS Browser (the database is not array-specific); Restricted Mode Indicator;
Cursor Position information; User Profile ID
Display area
The Display Area (Figure 142) is divided into three panes:
"Upper panes" on page 150
"Selected chromosome view" on page 150
"Lower panes" on page 150
Figure 142 Display Area showing Segments Table and Detail View
Figure callouts: Selected Chromosome; Karyoview pane; Details View pane;
Chromosome Selection List
The tabs in the upper and lower panes display different types of data, in both graphical
and table formats. Data from the same sample files is displayed in all three panes.
You can display a pane in a separate window by clicking the icon on the tab.
To close the window and return the information to the tab panel, click the icon in
the window.
Selected chromosome view
The Chromosome View displays detected segments in selected sample files for the
chromosome selected in the Karyoview, while the Chromosome Selection List (far
right column) displays its number.
See "Selected chromosome view" on page 168 for more information.
For more information on what types of data are in the database files and how this
information varies in different versions, see "ChAS browser NetAffx Genomic
Annotations" on page 487.
Graphic views
Data can be displayed in a Karyoview, Whole Genome View, Selected Chromosome
View, and a Detailed View, as shown in Figure 144.
Data from the same sample files is displayed in all the views at different scales.
If an item in any of the views is selected, the icon for that item is enlarged or
highlighted in the views.
Karyoview
The Karyoview (Figure 145) displays a genome-wide view of the detected segments
and other data.
In the Karyoview:
– Click a chromosome in the Karyoview to select it.
– Press Ctrl + Left/Right Arrow keys to move between chromosomes.
– To jump to chromosome 1, press Ctrl+Home.
– To jump to chromosome Y (last chromosome in the Karyoview), press Ctrl+End.
Figure callouts: Stretch Slider; Vertical Scroll Bar; Selected Chromosome
Using the mouse, click and drag on a selected chromosome to select an area for
display in the Detail View. This area is highlighted in the Selected Chromosome View.
(Figure 146)
Note: To easily remove the blue highlight surrounding the selected chromosome for
image captures, go to View → Hide Karyoview Highlights.
Note: When opening ReproSeq files for the first time in the Whole Genome View, the graphs
may appear empty. Use Choose Data to select the Copy Number Graph type to view the data.
Note: The Y axis on the left represents the data points in that graph (Log2/Weighted Log2
Ratio). The Y axis on the right represents the line graph data (Smooth Signal/Copy
Number).
2. The Whole Genome View for the selected file(s) opens. (Figure 150)
Note: The Y axis settings for the Whole Genome View are initially determined from the
Y axis settings set in the Detail View Graph Settings. The Y axis on the left of the WGV
pertains to the Row Data. The Y axis on the right of the WGV pertains to the Line data.
Also, not all graphs are available for a given xxCHP file. If a graph type is selected
for which data is not available, the graph appears with no data points.
Changing colors of the data points or line data
Graphs can be viewed in a single color or alternating colors every three chromosomes.
1. Click View → Choose Data. (Figure 153)
Figure 153 Choose Colors pane
2. Select the radio button to change to either 1 color data points or alternating 3
color data points.
Note: Changing Data display and/or Colors affects the current sample only. To apply these
settings whenever a file is opened, save them as the Default WGV State. (See "Setting a
default WGV state display and colors".)
Setting a default WGV state display and colors
1. Select the Data and Colors to be displayed as your Default settings when opening
files in the WGV, as described in "Changing colors of the data points or line data" on
page 160.
2. Select View → Save as default WGV State. (Figure 154)
Note: If changes have been made to colors, data or Y axis values, you can return to this
WGV Default State by clicking View → Apply default WGV State or click Load a WGV
State → Default.
Creating WGV states
Multiple WGV States can be created, then saved for quick selection of different
graph/color settings.
1. Select the Data and Colors to be displayed as your Default settings when opening
files in the WGV.
2. To save these selections as a WGV State, click on View → Save WGV State.
3. Enter a name in the dialog box, then click OK to save this new WGV State.
2. Click on the Line color button to select a color for the Reference Line.
3. In the Coordinate text field, enter an approximate coordinate (based on the Y
axis) where you want the Reference Line placed.
Note: The Reference Line can be dragged and dropped to a different location
once placed onto the graph.
4. Click OK.
5. Repeat steps 2-4 to place additional Reference Lines onto the graph.
Note: Additional Reference Lines can also be added by right-clicking in the
graph and selecting Add Reference Line for... (Figure 158)
6. Delete a single Reference Line by right-clicking on the Line and selecting Delete
Selected Line. Remove all Reference lines by right-clicking in the graph and
selecting Delete All Lines. (Figure 158)
7. To change the color or coordinate of a Reference Line, right-click on the selected
Reference Line, then select Edit Selected Line. (Figure 158)
Adding a comparison file
Use this feature to view two samples in the same WGV window.
Note: Before using this feature, both analysis files MUST BE loaded and available in
the ChAS Browser.
1. Open the first file in the Whole Genome View, as you normally would.
2. Click File → Add Comparison File.
A Comparison File window opens with available analysis results files loaded into
ChAS Browser.
3. Click to highlight the file to be viewed with the first file (already open in the WGV),
then click OK.
The data for the files are loaded in the same window (Figure 159) displaying the
following alternating tracks:
Track 1: Weighted Log 2 and Smooth Signal for File 1
Track 2: Weighted Log 2 and Smooth Signal for File 2
Track 3: Allele Difference for File 1
Track 4: Allele Difference for File 2
Track 5: B-allele Frequency for File 1
Track 6: B-allele Frequency for File 2
Figure callouts: Position Indicator; Chromosome Number
Use the Stretch Slider and Vertical Scroll Bar (Figure 161) or press the Alt key and turn
the mouse wheel to zoom in on a section of the Selected Chromosome View.
If you have selected a CytoRegions file, the CytoRegions are displayed as gray bands
that stretch across the entire chromosome cell (from right to left of the Cytobands).
Figure callouts: Position Indicator; Markers
There is one marker track for every distinct array type that is loaded.
SNP markers are displayed in the light green band nearest the cytobands. The SNP
marker/probe names on CytoScan arrays start with the letter ‘S’.
Copy Number markers are displayed in the dark green band nearest the detected
segments. The non-polymorphic copy number probe names on CytoScan arrays start
with the letter ‘C’.
You can mouse over a marker to learn more about it.
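If you export marker names and want to separate the two marker classes outside of ChAS, the naming convention above is enough to do so. The following Python sketch assumes only that convention; the function name and the example probe names are hypothetical.

```python
def cytoscan_marker_type(probe_name):
    """Classify a CytoScan probe by the first letter of its name."""
    if probe_name.startswith("S"):
        return "SNP"
    if probe_name.startswith("C"):
        return "copy number"
    return "unknown"


# Made-up probe names used only to exercise the function.
for name in ["S-EXAMPLE1", "C-EXAMPLE2", "X-EXAMPLE3"]:
    print(name, "->", cytoscan_marker_type(name))
```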
Segments selected in any view are highlighted in the Selected Chromosome View.
For information on the other features of the Selected Chromosome View, see
"Selected chromosome view" on page 168.
Detail view
The Detail View (Figure 163) enables you to look in detail at the detected segments,
marker data, regions, and reference annotations for the loaded files.
Figure callouts: Position Indicator; Data; Region Files; Histograms; Reference
Annotations; Chromosome Info Details (see below); Chromosome Number; Markers
In the Filters tab, when only Level 1 is selected and Levels 2-4 remain unchecked (as
shown in Figure 164 on page 172):
XON Region segments that overlap regions of the genome designated as Level 1 are
visible in the XON Region Segment Track and in the Segments Table. All remaining
data (log2 ratio, weighted log2 ratio, smooth signal, allele difference, B-allele
Frequency) contained within Level 1 regions is colored the same color as the color nib
of the sample. XON Region segments that overlap regions of the genome designated
as Levels 2-4 are not visible in the XON Region Segment Track or in the Segments
Table. All remaining data (log2 ratio, weighted log2 ratio, smooth signal, allele
difference, B-allele Frequency) contained within Level 2-4 regions is colored gray.
In the Filters tab, when Levels 1 and 2 are selected and Levels 3 and 4 remain
unchecked (as shown in Figure 165 on page 172):
XON Region segments that overlap regions of the genome designated as Level 1 or
Level 2 are visible in the XON Region Segment Track and in the Segments Table.
All remaining data (log2 ratio, weighted log2 ratio, smooth signal, allele difference,
B-allele Frequency) contained within Level 1 or 2 regions is colored the same color
as the color nib of the sample. XON Region segments that overlap regions of the
genome designated as Level 3 or 4 are not visible in the XON Region Segment
Track or in the Segments Table. All remaining data (log2 ratio, weighted log2 ratio,
smooth signal, allele difference, B-allele Frequency) contained within Level 3 and 4
regions is colored gray.
If Level 3 is selected, XON Region segment calls in Level 3 regions are displayed in the
XON Region Segment Track, and the marker-level data is colored the same color as the
color nib of the sample.
When all four Levels are selected in the Filters tab, all XON Region segment calls for
the sample are displayed, and all marker-level data is colored the same color as the
color nib of the sample.
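The Level filter behavior described above reduces to one rule per region: if the region's Level is checked, its segments are shown and its marker data takes the sample color; otherwise the segments are hidden and the marker data is gray. A minimal Python sketch of that rule follows. The function name, the dictionary keys, and the color strings are assumptions made for illustration, not ChAS internals.

```python
def xon_display(region_level, selected_levels, sample_color):
    """Return how XON data in a region of the given Level is displayed.

    region_level: Level (1-4) assigned to the genomic region.
    selected_levels: set of Levels checked in the Filters tab.
    sample_color: color of the sample's color nib.
    """
    selected = region_level in selected_levels
    return {
        "segment_visible": selected,
        "marker_data_color": sample_color if selected else "gray",
    }


# Levels 1 and 2 checked, Levels 3 and 4 unchecked.
checked = {1, 2}
for level in (1, 2, 3, 4):
    print(level, xon_display(level, checked, "blue"))
```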
Annotation OMIM color codes
In Browser annotation files version NA32.3 and higher, the following OMIM colored
gene entries were generated by genome.ucsc.edu and are based on the associated
OMIM phenotype map key. For more information on OMIM display conventions, go
to: www.genome.ucsc.edu
– Lighter Green for phenotype map key 1 OMIM records - the disorder has been
placed on the map based on its association with a gene, but the underlying defect
is not known.
– Light Green for phenotype map key 2 OMIM records - the disorder has been
placed on the map by linkage; no mutation has been found.
– Dark Green for phenotype map key 3 OMIM records - the molecular basis for the
disorder is known; a mutation has been found in the gene.
– Purple for phenotype map key 4 OMIM records - a contiguous gene deletion or
duplication syndrome; multiple genes are deleted or duplicated causing the
phenotype.
– Light Gray for Others - no associated OMIM phenotype map key info available.
2. Specify a color for the selected annotation type using the color controls in the color
palette (Figure 167), then click OK.
The new color is applied to the annotations in the Details View.
3. To return to the default annotation color, right-click the annotation in the Files
windowpane, and select Clear Custom Color on the shortcut menu.
LOH Loss of Heterozygosity (CN <2 LOH = light purple, CN 2 (or higher) LOH = dark purple)
Note: In Dark Scheme (page 195), CN <2 LOH = dark purple, CN 2 (or higher) LOH = light purple.
Log2 Ratio Per marker Log2 Ratio of normalized intensity with respect to a reference, with further correction for
sample specific variation.
Weighted Log2 Ratio Contains the Log2 Ratios processed through a Bayes wavelet shrinkage
estimator. These processed values are input to the CNState algorithm HMM.
Allele Difference Filtered and smoothed values for individual markers. Nonparametric estimation is used to understand
possible regional peak structure towards which the data is smoothed. The amount of filtration and
smoothing is dynamically adapted based on sample quality. Allele difference is computed based on
differencing A signal and B signal, then standardizing based on reference file information.
B-allele Frequency Number of B alleles/number of A+B alleles used to show allelic imbalances.
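The B-allele Frequency definition above is a simple ratio, B / (A + B). The Python sketch below works it through for a few allele combinations. It is illustrative only: the function name is hypothetical, and it uses whole allele counts, whereas the values stored in a results file are estimated from probe signal.

```python
def b_allele_frequency(a_count, b_count):
    """Return B / (A + B), or None when there are no alleles to count."""
    total = a_count + b_count
    if total == 0:
        return None
    return b_count / total


# Genotype -> (number of A alleles, number of B alleles)
examples = {"AA": (2, 0), "AB": (1, 1), "BB": (0, 2), "AAB": (2, 1)}
for genotype, (a, b) in examples.items():
    print(genotype, b_allele_frequency(a, b))
```

A balanced two-copy region therefore clusters at 0, 0.5, and 1, while an imbalance such as AAB moves the middle cluster away from 0.5.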
The Detail View displays the following types of data for a CytoScan HTCMA array:
LOH Loss of Heterozygosity (CN <2 LOH = light purple, CN 2 (or higher) LOH = dark purple)
Note: In Dark Scheme (page 195), CN <2 LOH = dark purple, CN 2 (or higher) LOH = light purple.
Log2 Ratio Per marker Log2 Ratio of normalized intensity with respect to a reference, with further correction
for sample specific variation.
Allele Difference Filtered and smoothed values for individual markers. Nonparametric estimation is used to
understand possible regional peak structure towards which the data is smoothed. The amount
of filtration and smoothing is dynamically adapted based on sample quality. Allele difference is
computed based on differencing A signal and B signal, then standardizing based on reference
file information.
B-allele Frequency Number of B alleles/number of A+B alleles used to show allelic imbalances.
Note: There is a subset of ~55,000 SNP probes which are used for allelic information
analysis but which are not used for Copy Number analysis (on the CytoScan HD
Array).
For these SNP probes, LOH and Allele Peaks data will be displayed, but these SNP
probes will not have Log2 Ratio, Weighted Log2 Ratio, SmoothSignal, or Copy
Number State data displayed, nor will they be used for ascertainment of Mosaicism.
The calculation of Segment data for all the various Segment types takes this into
account. All non-polymorphic (copy number) and the vast majority of SNP probes are
NOT affected by this change, and will continue to display all graphs and their data
points from the CytoScan HD Array CYCHP files.
LOH Loss of Heterozygosity (CN <2 LOH = light purple, CN 2 (or higher) LOH = dark purple)
Note: In Dark Scheme (page 195), CN <2 LOH = dark purple, CN 2 (or higher) LOH = light purple.
Log2 Ratio Per marker Log2 Ratio of normalized intensity with respect to a reference, with further correction for
sample specific variation.
Weighted Log2 Ratio Contains the Log2 Ratios processed through a Bayes wavelet shrinkage
estimator. These processed values are input to the CNState algorithm HMM.
Allele Difference Filtered and smoothed values for individual markers. Nonparametric estimation is used to understand
possible regional peak structure towards which the data is smoothed. The amount of filtration and
smoothing is dynamically adapted based on sample quality. Allele difference is computed based on
differencing A signal and B signal, then standardizing based on reference file information.
B-allele Frequency Number of B alleles/number of A+B alleles used to show allelic imbalances.
The Detail View displays the following types of data for Genome-Wide SNP Array 6.0
data (CNCHP):
LOH Loss of Heterozygosity (CN <2 LOH = light purple, CN 2 (or higher) LOH = dark purple)
Note: In Dark Scheme (page 195), CN <2 LOH = dark purple, CN 2 (or higher) LOH = light purple.
Log2 Ratio Per marker Log2 Ratio of normalized intensity with respect to a reference, with further correction
for sample specific variation.
Allele Difference Difference of A signal and B signal, each standardized with respect to their median values in the
reference.
Log2 Ratio Per marker Log2 Ratio of normalized intensity with respect to a reference, with further correction
for sample specific variation.
Weighted Log2 Ratio Contains the Log2 Ratios processed through a Bayes wavelet shrinkage estimator. These
processed values are input to the CNState algorithm HMM.
Allele Difference Difference of A signal and B signal, each standardized with respect to their median values in the
reference.
Variants Location and detection of Somatic Mutation. (OncoScan FFPE Assay only)
B-allele Frequency Number of B alleles/ number of A+B alleles, used to show allelic imbalances.
Gain Amplifications
Graph Data
See "Changing graph appearance" on page 196 for more information about
controlling the display of graph data.
In addition, the Detail View displays:
Regions: Features in the various region files loaded into ChAS, including
CytoRegions and Overlap Map items.
Annotations: Indicate the known or suspected locations of features, such as
mRNAs, exons, structural variants, and so forth.
You can expand or contract the annotations. See "Expanding and contracting
annotations" on page 195.
Database Display
Default Histograms: displays all segments in the database.
Filtered Histograms: displays all segments meeting the filter criteria set by the
user. For information on how to create a filtered histogram, see "Adding filtered
histogram data" on page 141.
Chromosome info, with:
Coordinate scale
Marker position information
Chromosome number
Cytoband information
Selected segments are displayed with enlarged icons; selected regions or annotations
are outlined and highlighted. (Figure 168)
Figure callouts: Zoom Slider; Stretch Slider; Vertical scroll bar; Position Indicator
Control: Zoom Slider
Function: Controls the horizontal zoom and the area of the chromosome displayed.
Control: 3x Zoom Out
Function: Press Ctrl + Minus to view up to three preset Zoom Out settings.
Control: Zoom in Using Click and Drag
Function: Place your mouse cursor over a point of interest. Press Shift while holding down
the left mouse button. Drag the mouse cursor to frame/zoom in on your point of interest.
Control: Scroll bars
Function: Used to select the area displayed after zooming or stretching the vertical or
horizontal scale.
Control: Position Indicator
Function: Dashed vertical blue line. Click in the view to set the position of the indicator:
– The position that is highlighted in the graphs table.
– The position that is used as the center point when zooming.
Use the Stretch Slider and Vertical Scroll Bar to zoom in on a section of the Detail View. You
can also use the mouse wheel as shown below:
Alt + mouse wheel stretches the display
Ctrl + mouse wheel zooms in on the horizontal scale
Mouse wheel scrolls up and down
Selecting a chromosome section for display
Data from the same sample files is displayed in all 3 views, at different scales.
You can select a particular chromosome, or a section of the chromosome, for detailed
study using:
"Karyoview and selected chromosome view"
"Coordinate range box" on page 182
"Zooming to a selected item" on page 183
"Navigation controls in detail view" on page 180
You can also double-click on an item in a table to zoom to the region of the
chromosome where that item is located.
Figure 170 Areas displayed in Karyoview, Selected Chromosome View, and Detail View
Coordinate range box
The Coordinate Range box (Figure 171) is located in the ChAS main tool bar. It shows
the selected chromosome and the start and stop positions displayed in the Detail
View. You can enter coordinates in the box to update the Detail View.
Accessing the Misc tab
1. Click Preferences → Edit User Configurations or click the corresponding icon on the upper tool
bar.
The User Configuration window appears.
2. Click the Misc tab. (Figure 175)
Autosave
Click to check the Autosave check box to automatically save your files as they are
edited. Uncheck to disable auto-save. (Figure 176)
Coordinate box format
Click the appropriate radio button to choose the format of your displayed data.
(Figure 177)
Zoom buffer
By default, the zoom percentage is set to 15% in new user profiles.
The Zoom Buffer feature offers 3 settings:
No Buffer: Click this radio button to turn off the Zoom Buffer feature.
Number of bases: Click this radio button, then manually enter the number of
bases you want.
Percent of segment length: Click, then drag the slider bar (Figure 179) to the
zoom percentage you want.
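The three Zoom Buffer settings can be pictured as padding added around a segment when zooming to it. The following is a minimal sketch under that assumption: the function name, the mode strings, and the choice to pad both sides equally are illustrative and are not taken from ChAS itself.

```python
def zoom_window(start, end, mode="percent", value=15):
    """Return the (start, end) of the view after applying a zoom buffer.

    mode  -- "none", "bases", or "percent" (hypothetical names for the
             No Buffer / Number of bases / Percent of segment length options)
    value -- number of bases, or percent of the segment length
    Assumption: the buffer is added equally on both sides of the segment.
    """
    length = end - start
    if mode == "none":
        pad = 0
    elif mode == "bases":
        pad = int(value)
    elif mode == "percent":
        pad = int(length * value / 100)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Clamp at zero so the window never starts before the chromosome start.
    return max(0, start - pad), end + pad
```

With the default 15% setting, a 1,000 bp segment would be shown with 150 bp of context on each side in this sketch.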
There are five preset CHP file colors assigned to your CHP data (Figure 180), but each
default color can be changed to a different color.
2. Use the color wheel to locate the specific color you want or click on one of the
several coloring options (including a web-safe color
palette).
3. Click OK.
4. Repeat steps 1-3 to change additional default colors.
At any time, click Reset to Default to return the 5 CHP file colors back to their
default colors. (Figure 182)
To return a single CHP file color back to its default color, click on the CHP file
color, then click the Color Wheel’s Reset button.
5. Click the User Configuration window’s OK button to save your changes and exit.
Note: Samples that are loaded while a color change is made may not reflect
your new color scheme. To remedy this, close, then re-open ChAS to ensure your new
color choices are reflected throughout all your samples.
Microarray nomenclature configuration
Define how the ISCN Microarray nomenclature will be represented in the segments
table and exports. Any format from ISCN 2013, 2016 and 2020 can be configured.
Coordinates: choose the radio button to display genomic coordinates with or
without commas.
Genome names: choose the radio button to display genome versions using either
GRCh or hg.
Range separator: choose the radio button to display either a dash (-) or an
underscore (_) between genomic positions.
Mosaic separator: choose the radio button to display either a dash (-) or a tilde (~)
between copy number values on mosaic segments.
(Optional) Select the check box to automatically have information in the Inheritance
field appended to the Microarray nomenclature field.
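The configuration options above can be pictured as parameters of a string formatter. The sketch below is illustrative only: the function name, parameter names, overall string layout, and the example band and coordinates are assumptions modeled on commonly seen ISCN array strings, and they do not describe how ChAS builds the field internally.

```python
# Mapping between the two genome naming styles (hg19 = GRCh37, hg38 = GRCh38).
BUILD_NAMES = {"hg19": "GRCh37", "hg38": "GRCh38"}

def format_segment(band, start, end, cn, build="hg38", commas=True,
                   genome_style="GRCh", range_sep="_", mosaic_sep="~",
                   inheritance=None):
    """Build an ISCN-style array string from the configurable options.

    cn may be an int, or a (low, high) tuple for a mosaic segment.
    All parameter names here are hypothetical.
    """
    genome = BUILD_NAMES[build] if genome_style == "GRCh" else build
    fmt = "{:,}" if commas else "{}"
    coords = fmt.format(start) + range_sep + fmt.format(end)
    if isinstance(cn, tuple):
        state = mosaic_sep.join(str(c) for c in cn)
    else:
        state = str(cn)
    text = f"arr[{genome}] {band}({coords})x{state}"
    if inheritance:
        # Optional: append the Inheritance field to the nomenclature.
        text += f" {inheritance}"
    return text

# Illustrative values only.
print(format_segment("1p36.33", 1000000, 2000000, 3))
print(format_segment("1p36.33", 1000000, 2000000, (2, 3), commas=False,
                     genome_style="hg", range_sep="-", mosaic_sep="-"))
```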
Closing a file
1. Right-click on the file you want to close.
2. Select Close from the menu. (Figure 186)
The file is removed from the Files list and the data is no longer displayed.
Selecting data types for display
The Data Types list (Figure 187) shows the data types that can be displayed in ChAS.
Figure 187 Data Types list
Changing the grouping of samples and data types
A unique color is assigned to each sample and used for the lanes in the Karyoview,
Selected Chromosome View, and Detail View.
Each segment type is assigned to its own lane and has its own symbol.
Changing the grouping
1. From the View menu, select Group by Sample or Group by Type or in the tool
bar, click the Group by Sample or Group by Type button.
When the lanes are grouped by data type, the lanes for different samples are kept
together for each segment or graph type in the Karyoview and Selected Chromosome
View and in the Detail View.
Note: You can change the order of samples and Data types in the views by clicking
and dragging in the Files and Data Types list.
Expanding and contracting annotations
For tracks containing multiple rows of annotations, collapsing tracks consolidates all
rows within a track into two rows. Any annotations after the first one will be placed in
the second row (Figure 191).
When there are multiple annotations of one type at the same coordinate, the separate
annotations will be shown on separate rows.
Collapsing tracks is useful if you don’t need to see all the details. However, be aware
that in collapsed tracks larger annotations may obscure smaller ones; annotations
with introns may be obscured by annotations that don’t show the intron.
To expand or collapse just a single annotation track, right-click on the track name in
the Files tree and choose Expand (or Collapse) Annotation track. The Annotation track
check box must be checked in order to see this option in the right-click menu.
Note: The maximum number of tracks that can be displayed for any reference
annotation is 25.
Changing graph appearance
You can modify many properties of the graphs in the Detail View. ChAS provides
options for:
"Selecting different graph styles" on page 197
"Changing graph attributes" on page 201
"Changing scale" on page 202
Settings and adjustments that are specific for graphs can be made using the Graph
Settings window.
The Types box displays the graph data types being displayed in the Detail View.
Note: Line, Min/Max/Avg, and Stairstep are not available for CytoScan XON arrays
due to the differential coloring based on Level assignment.
Bar – Individual values are shown as vertical bars that are one base wide for
position graphs. (Figure 194)
Line – Subsequent values are linked with a line. Even if the input file was not sorted,
the values will be connected in order along the genomic coordinate axis.
(Figure 195)
Points – Shows a single dot for each data value. (Figure 196)
Big dots – Shows a single big dot for each data value. (Figure 197)
Min/Max/Avg – This style is especially useful for showing very densely populated
graphs with data points for large numbers of positions. (Figure 198) Note: This
data style is not available for CytoScan XON arrays due to regions annotated by
Levels.
When Detail View is zoomed all the way in, the display is equivalent to the Line
style. When zooming out, ChAS starts to summarize values. When the scale of
the display reaches the point where individual x-values are associated with
multiple score values, ChAS picks the maximum and minimum values and draws
a vertical bar between them. In addition, ChAS draws lines through the average
of all the data points represented at each x value.
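The summarizing behavior described for the Min/Max/Avg style can be sketched as binning data points by display column. This is a simplified illustration, not the ChAS rendering code: the function name and the `bases_per_pixel` parameter are assumptions used to stand in for the current zoom level.

```python
from collections import defaultdict

def summarize_min_max_avg(points, bases_per_pixel):
    """Group (position, value) points by x pixel and summarize each group.

    Returns {pixel: (minimum, maximum, average)}. A vertical bar would be
    drawn from minimum to maximum, and a line through the averages.
    """
    bins = defaultdict(list)
    for pos, val in points:
        bins[pos // bases_per_pixel].append(val)
    return {px: (min(vals), max(vals), sum(vals) / len(vals))
            for px, vals in sorted(bins.items())}
```

When `bases_per_pixel` is 1, every pixel holds at most one value per position, so minimum, maximum, and average coincide, which matches the statement that the fully zoomed-in display is equivalent to the Line style.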
Stairstep – Similar to the bar graph style, except that bar widths along the
horizontal axis are stair-stepped. (Figure 199) Note: This data style is not available
for CytoScan XON arrays due to regions annotated by Levels.
For example, if position 100 has a value of 50 and position 200 has a value of 75
and there are no values in between, then ChAS will draw a bar of height 50 that
starts at position 100 and stops at position 200. Then, at position 200, ChAS will
draw a new bar of height 75 that terminates at the next location with a value.
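The Stairstep example above can be reproduced with a few lines of code. This is a minimal sketch with a hypothetical function name; how the last value is drawn is not stated in the text, so the sketch simply leaves it out.

```python
def stairstep_bars(points):
    """Turn (position, value) points into (start, stop, height) bars.

    Each bar starts at its own position and stops at the next position
    that has a value. The final point is omitted because the text does not
    say where its bar ends (assumption made for this sketch).
    """
    pts = sorted(points)  # connect values in order along the coordinate axis
    return [(pos, next_pos, val)
            for (pos, val), (next_pos, _) in zip(pts, pts[1:])]

# Position 100 has value 50, position 200 has value 75.
print(stairstep_bars([(100, 50), (200, 75), (350, 60)]))
```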
Auto-Size Dots - Transition from Points to Big Dots when zooming in the Detail
View. You can select the window size (in base pairs) in which the transition occurs.
(Figure 200)
Changing scale
Changing the visible bounds involves changing the scale of the graph by setting the
maximum and minimum values to be displayed.
To set these visible bounds, use the Range section of the Graph Properties dialog.
The graph height slider is used to increase or decrease the size of a given graph type.
The size is specified in a relative manner. The final graph size will depend on the
number of other graphs and annotations being displayed.
Pop-ups
You can mouse over a feature in any of the views to display a pop-up box with
information on the feature. The information provided depends on the type of data that
the mouse arrow is on.
Pop-ups are available for:
Cytobands
Detected segments
Graph data
Marker position indicators
Histograms
Reference annotations
Note: You should expand the reference annotations before selecting one to avoid
selecting multiple annotations. See "Expanding and contracting annotations" on page
195.
Column Description
Common
Min Zero-based index position of the first base pair in the sequence.
Max Zero-based index position of the last base pair in the sequence, plus one. Adding one ensures
that the length of any (hypothetical) segment containing a single marker would be one, and
ensures that the coordinates match the coordinate system used in BED files.
For all segments, the segment start coordinates are always lower by one bp from the coordinate
for the starting probe of the segment as reported in the graphs table while the end coordinate
matches the coordinate for the ending probe as reported in the graphs table (see Appendix E,
"Genomic position coordinates" on page 488).
Column Description
Segments
CN State Copy Number State (not displayed for LOH segment types).
The expected Copy Number State on the X chromosome in normal males is not constant over
its entire length. This is due to the structure of the sex chromosomes. See "LOH segments on X
and Y chromosomes" on page 49 for more information.
Mean Marker Distance Length of the segment in base pairs divided by the number of markers in the segment.
Curation By The current computer Operating System login ID and ChAS user profile name at the time that
the Call or Interpretation field was last edited.
Curation Time The time and date when the Call or Interpretation field was last edited.
Materially Modified Segment Indication that the segment was previously merged, deleted, or had its start or end boundary, type,
or state altered by a ChAS user. (ChAS-based processes of Smoothing and Joining are not
“Modifications”, nor are making Calls or Interpretations, in this context.)
Materially Modified By The current computer Operating System login ID and ChAS user profile name at the time that
the segment was last materially modified.
Materially Modified Time The time and date when the Segment was last materially modified.
Max % Overlap The highest percentage by which some item(s) in the Overlap Map overlaps the segment.
Segments completely overlapped by an Overlap Map item are 100% overlapped. This number
is used for Filtering Segments out by “Overlap”.
Overlap Map Items (% of Segment overlapped) Item(s) in the Overlap Map which overlap the segment, followed by the percentage by which the
segment is overlapped by that Item.
CytoRegions Names of the CytoRegions with which the segment shares coordinates.
Use in Report Allows manual selection of Segments for export to a Segments Table PDF, DOCX, or Text rather
than all segments in the table.
Genes List of RefSeq genes from the Genes track that share coordinates with the segment. Identically
named gene isoforms are NOT repeated.
Gene Count A count of the gene names listed in the Genes column
DGV List of DGV variations that share coordinates with the segment.
sno/miRNA List of sno/miRNA features that share coordinates with the segment.
OMIM Genes List of OMIM Genes that share coordinates with the segment.
OMIM Gene Count A count of the OMIM Gene names listed in the OMIM Genes column.
OMIM Phenotype Loci List of OMIM Phenotype Loci that share coordinates with the segment.
OMIM Region Phenotype Loci List of OMIM Phenotypes associated with a genomic region. The OMIM Morbidity information is
displayed when using all three OMIM tracks (OMIM Genes, OMIM Phenotype Loci, and OMIM
Region Phenotype Loci).
Segmental Duplications List of Segmental Duplications that share coordinates with the segment.
Smoothed/Joined Indication that segment was created by smoothing or joining two or more segments in the initial
segment detection.
Segment Label A label comprised of the segment's Type, State, and Filename.
Start Marker The array marker name which marks the beginning of the segment.
End Marker The array marker name which marks the end of the segment.
Preceding Marker The array marker just above the segment in the data track used as input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Preceding Marker Location The coordinate location of the array marker just above the segment in the data track used as
input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Following Marker The array marker just below the segment in the data track used as input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Following Marker Location The coordinate location of the array marker just below the segment in the data track used as
input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Mean Log2 Ratio The mean of all the Log2 Ratio values contained in the segment.
Mean Weighted Log2 Ratio The mean of all the Weighted Log2 Ratio values contained in the segment.
Max % Coverage The highest percentage by which a segment covers some item(s) in the Overlap Map.
Number of Overlap Map Items Number of Overlap Map items which share genomic coordinates with the segment.
% of Overlaps Map Item covered by Segment Overlap Map Item and the percentage by which it is covered by the segment.
Full Location Chromosome Start and Stop in a user-friendly format for use in external databases.
Median log2 The median of all the Log2 Ratio values contained in the segment.
DB Count Both Number of segments in the database meeting both the user defined thresholds of minimum
Percent Overlap Count and Coverage Count.
DB Coverage Count Number of segments in the database meeting the minimum Percent Coverage Count.
DB Overlap Count Number of segments in the database meeting the minimum Percent Overlap Count.
XON Region Level The annotation Level assigned to this region of the genome.
Summarized Log 2 Ratio The median of the LR, after transformation to adjust for individual marker responsiveness.
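Several of the Segments columns above are simple arithmetic on the zero-based Min and Max coordinates. The sketch below illustrates Min/Max, Mean Marker Distance, Max % Overlap, and Max % Coverage as they are described in the table. The function names are hypothetical, segments and Overlap Map items are represented as plain (min, max) tuples, and the code is not taken from ChAS.

```python
def segment_bounds(start_probe_pos, end_probe_pos):
    """Min/Max of a segment from its first and last probe coordinates.

    Per the table: Min is one bp lower than the starting probe coordinate,
    and Max matches the ending probe coordinate (BED-style coordinates).
    """
    return start_probe_pos - 1, end_probe_pos

def mean_marker_distance(seg_min, seg_max, marker_count):
    """Segment length in base pairs divided by the number of markers."""
    return (seg_max - seg_min) / marker_count

def shared_bp(a, b):
    """Number of base pairs shared by two (min, max) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def percent_overlap(segment, item):
    """Percentage of the segment that is overlapped by the item."""
    return 100.0 * shared_bp(segment, item) / (segment[1] - segment[0])

def percent_coverage(segment, item):
    """Percentage of the item that is covered by the segment."""
    return 100.0 * shared_bp(segment, item) / (item[1] - item[0])

def max_percent(segment, items, measure):
    """Highest value of `measure` over all Overlap Map items."""
    return max((measure(segment, it) for it in items), default=0.0)
```

For example, a segment at (100, 200) and an item at (150, 400) share 50 bp, so the segment is 50% overlapped while the item is only 20% covered. The two percentages answer different questions, which is why the table lists them as separate columns.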
Column Description
Genes
Min Zero-based index position of the first base pair in the sequence.
Max Zero-based index position of the last base pair in the sequence, plus one. Adding one ensures
that the length of any (hypothetical) segment containing a single marker would be one, and
ensures that the coordinates match the coordinate system used in BED files.
For all segments, the segment start coordinates are always lower by one bp from the coordinate
for the starting probe of the segment as reported in the graphs table while the end coordinate
matches the coordinate for the ending probe as reported in the graphs table (see Appendix E,
"Genomic position coordinates" on page 488).
%Hi Dosage sensitivity indicator derived from DECIPHER (https://decipher.sanger.ac.uk/). The lower
the percentage, the more likely the gene is to be dosage sensitive.
pLI Probability of loss intolerance. Genes with higher numbers are more likely to be dosage
sensitive. Derived from gnomAD (https://gnomad.broadinstitute.org/).
pHaplo (Probability of Haploinsufficiency): The higher the value (0-1), the more likely the gene is to be dosage
sensitive. Collins et al. A cross-disorder dosage sensitivity map of the human genome. 2022.
PMID: 35917817
pTriplo (Probability of Triplosensitivity): The higher the value (0-1), the more likely the gene is to be dosage sensitive.
Collins et al. A cross-disorder dosage sensitivity map of the human genome. 2022. PMID:
35917817
Ensembl Genes
OMIM
OMIM Gene Title The title of the gene associated with the OMIM entry.
OMIM Gene Symbol List A list of genes associated with the OMIM entry.
OMIM Phenotype Key Indicates how this phenotype was placed on the map.
OMIM Phenotype Map Key Indicates how this phenotype was placed on the map.
OMIM Phenotype Locus Description Describes the phenotype or disorder associated with the OMIM Phenotype Locus.
Gene Title The title of the region associated with the OMIM entry.
Segmental Duplications
Score Score based on the raw BLAST alignment score. The score for segmental duplications is set to
zero in NetAffx annotation 31 and higher.
Note: Thermo Fisher Scientific does not generate or verify the information for genes,
FISH clones, Segmental Duplications, sno/miRNAs, DGV annotations, or OMIM data.
Segmental Duplication and sno/miRNA annotations do not have any unique terms; but
sno/miRNA annotations use the “type” field to indicate subtypes like “cdBOX” and
“HAcaBOX”.
Some information may not be displayed, depending upon the feature type. The
information can include custom properties created by a user (see "Viewing and editing
annotations" on page 297).
You can export data from the table using the standard table export tools (see
"Exporting table data" on page 422).
You can perform multi-column sorts. See "Sorting by columns" on page 328.
UCSC
Ensembl
Toronto DGV
ClinVAR
ClinGen
DECIPHER
Viewing a selected area at a public site
1. In the Detail View, zoom and scroll to the area of interest. (Figure 213)
Figure 213 Selected area in Detail View
2. From the View menu, select View Region at [site name] or click the appropriate
site’s tool bar button.
A browser opens, displaying the selected area of the chromosome.
Viewing and ordering TaqMan assays for CN
TaqMan assays can easily be accessed from within ChAS. These assays can be
used for confirmation of copy number aberrations. TaqMan assays can only be
ordered based on hg38 genome coordinates.
Do the following for the region(s) you would like to view and order TaqMan assays
for Copy Number:
1. Locate the region containing the aberration in the Detail View.
3. Right-click on the TaqMan assays you want to order, then click Order TaqMan
assay to link out to the website, as shown in Figure 215.
Viewing and ordering TaqMan assays for genotyping
TaqMan assays for genotyping can be ordered from VCF files that contain dbSNP IDs.
Note: The VCF file must contain an rsID for the SNP to directly access the TaqMan
website for that SNP. Also, TaqMan assays can only be ordered based on hg38
genome coordinates.
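Before loading a VCF file, it can be useful to check which records actually carry an rsID. The following is a minimal sketch that reads VCF text lines and relies only on the standard VCF column layout (the ID field is the third tab-separated column, with "." for a missing ID and ";" between multiple IDs). The function name is hypothetical, and the check is independent of ChAS.

```python
def records_with_rsid(vcf_lines):
    """Yield (chrom, pos, rsid) for VCF data lines whose ID column has an rsID."""
    for line in vcf_lines:
        # Skip blank lines and header lines (## meta lines and the #CHROM row).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        chrom, pos, ids = fields[0], fields[1], fields[2]
        for ident in ids.split(";"):
            if ident.startswith("rs") and ident[2:].isdigit():
                yield chrom, int(pos), ident

example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "chr1\t12345\trs123\tA\tG\n",   # has an rsID
    "chr1\t22222\t.\tC\tT\n",       # no ID
]
print(list(records_with_rsid(example)))
```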
1. Load a VCF file by clicking File → Open.
2. Right-click on an SNP for which you would like to view and order a TaqMan
assay.
A menu appears. (Figure 216)
Marker Count
Length
XON Level Assignment (XNCHP only)
You can apply these filters to different segment types, using different parameters for
each type. The filtering is done on the fly, with changes to the parameters reflected in
the different views as they are made.
A segment must pass all filter requirements for the segment type to be displayed.
You can apply different filter values for areas inside CytoRegions and areas outside
the CytoRegions (genomewide). See “Using filters with CytoRegions” on page 275.
The Overlap Map filter is described in "Using the overlap map and filter" on page 279.
Filter settings are saved when a Named Setting is created and can be reapplied. See
“User profiles and named settings” on page 438.
IMPORTANT! The Filters set in the browser are NOT linked to the filters for the MSV. The same
filter settings should be set in both the ChAS browser and the MSV separately. The MSV does have
a flag to indicate when filter settings do not match. For more information, see the RHAS User
Guide.
Note: If you use the right-click menu option, only the filter settings for the selected
segment type are displayed.
Genome and CytoRegions tabs are displayed only when a cytoregions file
is selected.
Note: The Overlap Map filtering parameter is set using the same window. The Overlap
Map function is described in "Using the overlap map and filter" on page 279.
For XON, check the Level check box(es) (Figure 220) to reveal any XON region
segment calls in regions assigned to the Level(s) you selected.
Hide All Segments in this Region Hides all the segments. This is particularly useful when using a CytoRegions file for CytoScan XON
arrays. Check the Hide All Segments in this Region check box on the Genome Tab so that ONLY
segments overlapping CytoRegions will be shown. Note: This option is only available when a
CytoRegions file is assigned.
Hide LOH where median CN < 2 Selecting this option will hide LOH segments that are assigned a median copy number less than 2.
LOH with median copy number of 2 or higher will still be displayed.
Marker Count The number of markers the segment encompasses from start to finish. A segment must have at least
as many markers as you specify to be displayed. Each marker represents a probe which represents a
sequence along the genome at a particular spot. Markers are probe sequences of DNA, each sized
from 12-50 base pairs long, depending on the type of array data. The 12-50 bp sequence is unique to
that one spot on the genome it represents.
Size Based on the start and end markers of a segment. Because each segment represents a single place
in the genome, you can measure from start to end, in DNA base pairs, and by filtering, demand a
segment be at least that long to be visualized.
XON Segment Level Based on the Level Assignment to the region in the genome.
Level 1: Medical Research exome and cancer
Level 2: ClinVar genes not covered in Level 1
Level 3: Other OMIM genes
Level 4: Opportunistic regions from RefSeq/UCSC/Ensembl/LOVD.
The XON Segment Filters can be used to narrow down the number of XON segments to review based
on their annotation level assignment. Those regions assigned as Level 1 contain genes/regions that
have been identified as part of the Medical Research Exome along with regions associated with
cancer. By selecting only Level 1 in the filter settings, only XON segment calls in regions assigned as
Level 1 will be displayed. XON segments for all other Levels will be hidden from view as well as hidden
in the Segments Table. To expose XON segment calls in other regions, simply check the box for those
Levels in the Filters window. For more details on CytoScan XON analysis workflow
recommendations, see Appendix H, "Recommended CytoScan XON array workflows" on page 511.
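The filter parameters in the table above combine with a logical AND: a segment is displayed only if it passes every requirement set for its type. The sketch below illustrates that rule. The `Segment` class, its field names, and the function signature are hypothetical and do not reflect ChAS data structures.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    seg_type: str                      # e.g. "Gain", "Loss", "LOH"
    start: int
    end: int
    marker_count: int
    median_cn: Optional[float] = None  # used only for the LOH option
    xon_level: Optional[int] = None    # set only for XON region segments

def passes_filters(seg, min_markers=0, min_size=0,
                   hide_loh_below_cn2=False, xon_levels=None):
    """Return True if the segment passes all filter requirements."""
    if seg.marker_count < min_markers:
        return False
    if seg.end - seg.start < min_size:
        return False
    if (hide_loh_below_cn2 and seg.seg_type == "LOH"
            and seg.median_cn is not None and seg.median_cn < 2):
        return False
    # Show XON segments only in regions assigned to a checked Level.
    if (xon_levels is not None and seg.xon_level is not None
            and seg.xon_level not in xon_levels):
        return False
    return True
```

Raising `min_markers` or `min_size` can only remove segments in this sketch, which is consistent with the note that moving a slider to the right removes more segments.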
2. Use the slider to set the value for the parameter or enter a value in the provided
text field. (Figure 221) Note: As you move the slider from left to right, more
segments are removed.
Your filtered results are displayed instantly in all tables and graphs, as shown in
Figure 222.
For information about using the Overlap setting, see "Using the overlap map and filter"
on page 279. For information on using different filtering settings in CytoRegions, see
"Using filters with CytoRegions" on page 275.
By default, Edit Mode is OFF. Click the Edit Mode icon (located on the Browser’s top icon row or
above the Detail View) to turn Edit Mode ON.
Click the icon to turn Edit Mode OFF and remove all visual indications of your segment
changes.
When Edit Mode is ON, deleted segments are visible, and edited segments
appear distinct from non-edited segments.
When Edit Mode is ON, a track on a dotted axis line will appear showing the
original calls made by the software for comparison with the manual
modifications on the segment track.
When Edit Mode is OFF, deleted segments are invisible, and edited segments
look identical to non-edited segments.
IMPORTANT! Turn Edit Mode OFF before exporting a report of your data. Also, Edit Mode
must be OFF before publishing to the database.
Edit Mode ON
Merging segment groups
Note: Merging segment groups together cancels out any previously assigned Calls.
However, un-doing the group of merged segments (page 232) reinstates their original
Calls.
1. Click File → Open.
Your Sample File data folder window appears.
2. Click to select the file you want to edit, then click Open.
The file appears in the ChAS browser’s Detail view.
3. Left-click, hold, then move the mouse to frame the segments you want to merge
together. (Figure 227)
8. In the State field, enter a Copy Number, then use the drop-down menu to select
a Type (Gain/Loss) for the new segment. (Figure 231)
Note: For CytoScan XON arrays (XNCHP files), it is not required to enter a copy
number state value.
9. Click OK.
In cases where the segment to be merged into a group contains a Call or
Interpretation, the following message appears: (Figure 232)
Segment to segment merge
IMPORTANT! When merging segments during the editing of the start or end of one particular
segment, only the segment whose start or end you are editing has the option of having its Call
or Interpretation saved (or not).
Segments which are being engulfed by the edit start/end procedure being performed will not
have the option of having their Call or Interpretations placed on the resulting segment.
However, un-doing the edit resulting in this type of merging (page 232) will reinstate previous
Call and Interpretation information for all the original segments involved.
6. Click, then drag the shaded area over the segment you want to merge it with.
Make sure you overlap your target segment just slightly, as shown. (Figure 236)
Merged Segments
1. Right-click on the newly merged segments (Figure 240), then click Un-do
Merge.
The following message appears. (Figure 241)
Note: Even though your selected segments are deleted, ChAS preserves their data
for future reference. See "Right-click menu options" on page 206 for information on
how to view and edit segment properties and view segment details.
Note: Segment start or end boundaries can be moved left and/or right. The Adjusting/
Modifying Segment Boundaries example that follows shows how to move a segment
end boundary farther to the right.
1. Using the zoom tools and scroll bars (if needed), identify the segment whose start
or end (or both) you want to modify.
2. Single-click on the segment to highlight it. (Figure 248)
5. Click, then drag the segment’s right-edge boundary to the right, then stop at an
appropriate point. (Figure 250) You will ONLY be allowed to set the Start or End
position to match the position of a marker probeset.
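The restriction that a boundary can only be placed on a marker position can be pictured as snapping the dragged coordinate to a marker. The sketch below assumes a nearest-marker rule, which the text does not specify. The function name and the sorted list of marker positions are illustrative.

```python
from bisect import bisect_left

def snap_to_marker(position, marker_positions):
    """Return the marker position nearest to `position`.

    marker_positions must be sorted in ascending order and non-empty.
    Assumption: ties are resolved toward the lower coordinate.
    """
    i = bisect_left(marker_positions, position)
    if i == 0:
        return marker_positions[0]
    if i == len(marker_positions):
        return marker_positions[-1]
    before, after = marker_positions[i - 1], marker_positions[i]
    return before if position - before <= after - position else after
```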
Un-doing the edited start/end coordinates of a segment
1. Right-click on the previously adjusted/modified segment, then click Un-do Edit.
The segment returns to its previous state, including previous boundaries, calls,
and interpretations.
Figure 253 Selected segment - Change Copy Number - Pick a state window
Figure 255 Newly drawn segment limited in size by the Copy Number State track
IMPORTANT! Newly drawn segments will stop when they encounter another segment,
whether or not that segment is currently drawn or whether it is currently filtered out; see
Figure 255 for an example in the CN State data track which is used to draw Copy Number
Segments of Gain and Loss.
New segments will also stop when they encounter the last appropriate marker on the
chromosome. Because of this, new segments drawn will vary in their initial size.
6. Move the mouse cursor over the newly drawn segment to reveal its properties.
(Figure 256)
Figure 256 Newly drawn segment - Mouse over to see its properties
IMPORTANT! Segments in the Mosaic Segment Track are NOT uploaded to the database.
To capture the information for a mosaic segment in the database, that segment must
be "promoted" to the Copy Number State track. This is done to reduce redundancy in
those regions in which segments were called by both the copy number algorithm and
the mosaic detection algorithm. (Figure 262)
Promoted mosaic segments maintain their non-integer copy number state, marker
count, median log2 ratio, genome coordinates and size when they are promoted to the
copy number state track. (Figure 263)
The mosaic gain segments will have the same blue/red used for integer copy number
Gains/Losses; however, they maintain their non-integer copy number state to indicate
that they are mosaic.
In PDF reports, deleted segments are never shown in graphical views, nor are they listed in the
Segments Table, as Edit Mode is required to be OFF to generate a PDF report.
Editing mode on
When Edit Mode is ON, the Modified segments appear differently within the
Segments table, as shown in Figure 264.
Deleted Segments are represented with a red X and a strike-through line. Deleted
segments do NOT show up in PDF reports because PDF reports cannot be created
while Edit Mode is ON.
Materially Modified Segments (including segments that have been merged,
boundaries edited, or had their Copy Number States changed) are represented
with italicized text.
Note: The three deleted segment rows which were in strike-through text are removed
from the table when Edit Mode is OFF. In addition, when Edit Mode is OFF, the rows
which indicated Modified segments are no longer italicized.
The Discard Changes window appears and displays the following options:
Purge All Edits: Reverts all edits to segments and removes all calls and
interpretations.
Note: The log of the edits that have been performed on the sample can still be
viewed.
Clear Edit Log: Purges all edits and clears the log of any edits performed on the
sample.
Discard all changes: Purges all edits, clears the edit log, and deletes the CHCAR
file.
Sample annotations
Sample file level annotations such as Sample-type, Phenotype, and Sample
Interpretation can be added to each sample.
Adding, removing, and changing the order of sample type text
1. Click Preferences → Edit User Configurations or click the corresponding icon on the upper tool
bar.
The User Configuration window appears.
2. Click the Vocabularies tab, then click the Sample Type tab. (Figure 270)
Adding, removing, and changing the order of phenotype text
1. Click Preferences → Edit User Configurations or click the corresponding icon on the upper tool
bar.
The User Configuration window appears.
2. Click the Vocabularies tab, then click the Phenotype tab.
The following window tab appears: (Figure 271)
Adding annotations at the sample (xxCHP) file level
1. Right-click on a File name you want to add a Sample level annotation to.
A menu appears.
2. Click View/Edit Properties.
The File Properties window appears. (Figure 272)
4. Use this window to Add/Enter Sample ID(s), choose a Sample Type(s), enter
Phenotype(s), and Sample Interpretation(s). Note: The Sample ID defaults to the
File Name, but you can edit/change the Sample ID name if you want.
5. Click OK.
Segment annotations
Segment level annotations such as Call, Interpretation and Inheritance can be added
to segment data.
Setting up the calls feature
Note: If you are using a user profile from a previous version of ChAS, your default set
of Calls will NOT appear in the Calls drop-down list. To restore them, click on Edit User
Configurations → Vocabularies → Calls, then click the Restore Defaults button, as
shown in Figure 274 on page 252.
3. Click the Call drop-down menu, then select your appropriate interpretation Call.
(Figure 278)
3. At the Interpretation window, click inside the field shown (Figure 279), then enter
the snippet you want to use with your interpretation(s).
4. Click Add.
The snippet now appears and is saved in the Interpretations List pane.
5. Repeat steps 3 and 4 to add additional snippets.
6. Click OK.
6. Click the snippet bar’s drop-down to display the available snippets, then click on
the appropriate snippet. (Figure 280)
4. Click to select the appropriate snippet or click Close to exit. (Figure 284)
For information on the Oncomine Reporter Table State, see "Saved table states" on
page 332.
2. Click Smooth and Join for its description and summary. (Figure 290)
3. Click Edits for its description and summary of the file’s past user edits.
(Figure 291)
4. Click Details.
An Edit Details window appears featuring an Edit and Log tab. (Figure 292)
The Edit tab lists only recent edits that were NOT reverted. Use the horizontal
scroll bar to reveal additional edit information. For a complete historic account of
all edits made (including reversions of some edits), click on the Log tab.
5. Click OK to return to the Process Pipeline window or click the Log tab.
The Log tab displays an extensive summary of the file’s edited history to date.
Use the horizontal scroll bar to reveal additional log information. (Figure 293)
Use the Restricted Mode to display only Segments and graph data that appear in
those regions. While in this mode, annotations are not hidden by CytoRegions or
by the application of Restricted Mode.
Use differential filtering options for these regions and for the rest of the genome.
Protect a CytoRegions file. See "Protecting an AED file" on page 308.
Or
From the menu bar, click View → Select as CytoRegions file(s).
Alternatively, in the CytoRegions Tab of the Upper display, click the Select to
include in CytoRegions button.
Select a regions information file(s) to use for CytoRegions, then click OK.
Note: You may select more than one Region file (AED/BED) to include as
CytoRegions. ChAS handles multiple selected files as a single (larger) CytoRegions
file.
The Create New feature is described in "Creating an AED file of annotations" on
page 287.
Viewing CytoRegions
After selecting a CytoRegions file, it can be displayed in either a graphic view or table.
• "CytoRegions in the graphic views" on page 270
• "CytoRegions table" on page 272
Note: CytoRegions that share genomic coordinates with a detected segment are
listed in the “CytoRegions” column of the Segments table. See "Segments table" on
page 338.
CytoRegions in the graphic views
Regions specified in the CytoRegions file are displayed as dark gray stripes in the
Karyoview and Chromosome Display (Figure 298) and Detail View (Figure 299).
Figure 298 Karyoview and Selected Chromosome View with CytoRegions Displayed
(callouts: Area displayed in Detail View; CytoRegions)
You can turn off highlights by un-checking the View menu’s Hide CytoRegion
Background Highlights check box. (Figure 300) This action keeps the CytoRegions in the
Table View; however, they are no longer highlighted in the Graph Views.
Tool bar
Select CytoRegions file (See "Selecting a CytoRegions information file" on page 268)
CytoRegion File: Name of the AED/BED file that contains the region.
CytoRegion Type: Type of file or element from which the CytoRegion is derived. Default User
    Annotations are annotations derived from an AED or BED file.
Segment File: Sample File that the segment was detected in.
CN State: Copy Number State (displayed for Gain, Loss, and Mosaicism segment types).
You can:
Use the common table functions to control the display of data in the table. See
“Common table operations” on page 327.
Export data in various formats. See “Exporting table data” on page 422.
Search CytoRegions to display only regions of interest (see below).
Searching CytoRegions
The Search CytoRegions feature enables you to search the “CytoRegion” column for
text strings that match a search string.
1. Enter the search string in the Search CytoRegions box at the bottom of the
CytoRegions table.
2. Press Enter.
The table displays only annotations that match the search string.
3. To restore the table, click Clear Search Field.
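The search described above behaves like a substring filter on one table column. The following is a minimal Python sketch of that idea, not ChAS code: the row layout, the function name search_cytoregions, and the case-insensitive matching are all assumptions made for illustration.

```python
def search_cytoregions(rows, query, column="CytoRegion"):
    """Return only the rows whose `column` text contains `query`.

    Case-insensitive matching is an assumption of this sketch; an empty query
    returns every row, which mirrors clearing the search field.
    """
    needle = query.strip().lower()
    if not needle:
        return list(rows)
    return [row for row in rows if needle in str(row.get(column, "")).lower()]


# Hypothetical table rows, for illustration only.
rows = [
    {"CytoRegion": "DiGeorge_region", "Chromosome": "22"},
    {"CytoRegion": "Williams_region", "Chromosome": "7"},
]
matches = search_cytoregions(rows, "digeorge")
```

Passing an empty string returns the full table, which corresponds to step 3.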
Figure 306 Create Panel (AED) File from Gene List window
3. Select the NetAffx Genomic Annotation file from the drop down list. The
annotation database files visible are available files from the Library file folder. If
you do not see a NetAffx Genomic Annotation file that you want to use, please
copy the file into C:\Affymetrix\ChAS\Library and restart the ChAS browser.
4. Select an Input Gene List File: Click on Select File button and browse to the
location of your TXT tab-delimited Gene List file.
5. Select an Output AED File: Click on the Select File button, select the location to
save the output file, name the AED file, click Save.
6. Click OK to generate the AED file. (Figure 307)
Note: If entries in the Gene List are not found in the NetAffx Genomic Annotation file,
they will be listed in a message box. Any skipped entries can be added to the AED file.
See "Creating and editing AED files" on page 287 and "Adding annotations to an AED
file" on page 289.
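The note above implies a lookup of each Gene List entry against the annotation file, with unmatched names reported separately. A minimal Python sketch of that split follows. It assumes the gene name is the first column of the tab-delimited TXT file and uses a plain dictionary as a stand-in for the NetAffx Genomic Annotation file; the function name and sample coordinates are hypothetical.

```python
import csv
import io


def split_gene_list(gene_list_text, annotation_index):
    """Split gene names into (found, skipped) against an annotation lookup.

    gene_list_text: contents of a tab-delimited TXT file; the first column of
        each row is taken as the gene name (an assumption of this sketch).
    annotation_index: dict of gene name -> (chromosome, start, end), standing
        in for the NetAffx Genomic Annotation file.
    """
    found, skipped = [], []
    for row in csv.reader(io.StringIO(gene_list_text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # ignore blank lines
        name = row[0].strip()
        if name in annotation_index:
            found.append((name, *annotation_index[name]))
        else:
            skipped.append(name)
    return found, skipped


# Hypothetical coordinates, for illustration only.
index = {"GENE_A": ("chr1", 1000, 5000), "GENE_B": ("chr2", 200, 900)}
found, skipped = split_gene_list("GENE_A\nGENE_X\n\nGENE_B\n", index)
```

Here skipped holds the names that would appear in the message box and could later be added to the AED file by hand.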
OR
1. On the menu bar, click View → Select Overlap Map.
Alternatively, in the Overlap Map tab in the upper display, click the Select Overlap
Map tool bar button.
The Select Overlap Map window appears. (Figure 310)
The window displays a list of the region and annotation files you can select for an
overlap map.
2. Select the file you want to use, then click OK.
The region file loads and its regions are displayed with overlap information in the
Overlap Map tab.
Note: To clear an Overlap map, click Select None. (Figure 310) Alternatively, right-
click the file in the files list, then select Set File as Overlap Map from the shortcut
menu.
Viewing overlap map regions in the graphic displays
In the Karyoview and Selected Chromosome View, regions specified in the Overlap
Map file are displayed as short rectangles to the immediate left of the Cytobands.
(Figure 311)
Figure 311 Karyoview and Selected Chromosome View with Overlap Map Regions Displayed
(callouts: Overlap Map regions in Karyoview; Overlap Map regions in Selected Chromosome View)
Viewing the overlap map table
The Overlap Map table (Figure 313 on page 284) displays a list of overlapping items
from the overlap map file and the Segments table. Each region in the Overlap Map file
will be listed on at least one row of the table, even if it does not overlap any segments.
For those regions which overlap one or more segments, there will be one row for each
overlap. Depending on the columns which have been used to sort the Overlap Map
table, these rows may or may not be near each other. A segment that overlaps more
than one region in the Overlap Map file will appear multiple times in the Overlap Map
table, one row for each overlap.
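The row rules above amount to pairing every region with each segment it overlaps, while keeping a row for regions that overlap nothing. A minimal Python sketch of that pairing follows. The tuple layout, the inclusive-coordinate overlap test, and the function name are assumptions for illustration, not the ChAS implementation.

```python
def overlap_map_rows(regions, segments):
    """Build (region_name, segment_name) rows for an overlap table.

    Every region yields at least one row; a region with no overlapping
    segment gets a single row whose segment is None.
    Items are (name, chromosome, start, end) tuples with inclusive coordinates.
    """
    rows = []
    for r_name, r_chrom, r_start, r_end in regions:
        hits = [
            s_name
            for s_name, s_chrom, s_start, s_end in segments
            if s_chrom == r_chrom and s_start <= r_end and r_start <= s_end
        ]
        if hits:
            rows.extend((r_name, s_name) for s_name in hits)
        else:
            rows.append((r_name, None))
    return rows


# Hypothetical regions and one segment, for illustration only.
regions = [("R1", "1", 100, 200), ("R2", "1", 150, 400), ("R3", "2", 10, 20)]
segments = [("SegA", "1", 180, 300)]
rows = overlap_map_rows(regions, segments)
```

In this example SegA appears on two rows (once for R1 and once for R2), and R3 still gets a row even though nothing overlaps it.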
Highlighting overlap regions in the overlap map table and details view
If the Details View displays the Overlap Map file (the Overlap Map file is check marked
in the Files list), you can conveniently find and view items.
Click a row in the Overlap Map table to select the corresponding annotation from
the Overlap Map file. All of the rows for that region are highlighted in the table. The
Details View zooms to the currently selected region.
Note: If the Details View does not automatically zoom to the selected region, confirm
that the Auto-zoom option is selected (click View → Auto-zoom to table selection
from the menu bar.)
Select Overlap Map file (see "Selecting the overlap map file" on page 280).
Map Item Type: Source of the position information (CN Gain or Loss segment, reference annotation, etc.)
Segment File: Sample File that the segment was detected in.
CN State: Copy Number State (Not displayed for other segment types).
% Overlap: How much of the detected segment is covered by the Overlap Map item. A Segment which
    has 20% of its length within the boundaries of an Overlap Map item has an Overlap value of 20%.
    This percentage value is used to filter segments out of the displays and tables when filtering
    segments by “Overlap” in the filter slider dialogs.
% Coverage: How much of the Overlap Map item is covered by the Segment. Coverage values are not
    presently used in filtration of Segments from the displays or tables.
Shared Size: Size of the overlap between segment and Overlap Map item.
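The three quantities in this table (Shared Size, % Overlap, and % Coverage) can be illustrated with a short calculation. The Python sketch below is only an illustration of the definitions given above: it treats coordinates as half-open intervals so that size equals end minus start, which is an assumption, and the function name is hypothetical.

```python
def overlap_metrics(segment, item):
    """Return (shared_size, pct_overlap, pct_coverage) for two intervals.

    segment, item: (start, end) pairs on the same chromosome, treated as
    half-open [start, end) so that size = end - start (a sketch assumption).
    pct_overlap: share of the segment that lies inside the item.
    pct_coverage: share of the item that lies inside the segment.
    """
    seg_start, seg_end = segment
    item_start, item_end = item
    shared = max(0, min(seg_end, item_end) - max(seg_start, item_start))
    pct_overlap = 100.0 * shared / (seg_end - seg_start)
    pct_coverage = 100.0 * shared / (item_end - item_start)
    return shared, pct_overlap, pct_coverage


# A 1,000 bp segment with 200 bp inside a 400 bp Overlap Map item.
shared, pct_overlap, pct_coverage = overlap_metrics((0, 1000), (800, 1200))
```

With these inputs the segment has an Overlap value of 20%, matching the example in the table, while the Overlap Map item has a Coverage value of 50%.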
3. Select the Overlap check box(es) for the segment types you want to filter.
4. Use the slider to set the parameter’s value or enter a value in the adjacent text
box.
As you move the slider farther to the right (or enter smaller values in the box)
more and more of the Overlapped segments are removed from the display.
The detected segments must share at least the specified percentage of their
length with the Overlap Map region to be filtered out and hidden from display.
A Segment which has 20% of its length somehow encompassed within the
boundaries of an Overlap Map item has an Overlap value of 20%.
The minimum value of a setting is 1%.
The results of filtering are seen instantly in all tables and graphs.
See “Using filters with CytoRegions” on page 275 for information about the
interactions of the Overlap Map filter with the CytoRegions.
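The filter rule in step 4 can be sketched as a simple threshold test. The Python below is a minimal illustration under stated assumptions: segments are represented as hypothetical (name, overlap percentage) pairs, and the function name is invented for this example.

```python
def apply_overlap_filter(segments, threshold_pct):
    """Return the names of segments that stay visible under an Overlap filter.

    segments: list of (name, max_overlap_pct) pairs (hypothetical layout).
    A segment is hidden when its overlap percentage is at least the
    threshold; the threshold is clamped to the 1% minimum noted above.
    """
    threshold = max(1.0, float(threshold_pct))
    return [name for name, pct in segments if pct < threshold]


# Hypothetical segments, for illustration only.
segments = [("SegA", 0.0), ("SegB", 20.0), ("SegC", 100.0)]
visible = apply_overlap_filter(segments, 20)
```

Lowering the threshold hides more segments: at 20% only SegA remains visible, while at 50% SegB is shown again.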
3. In the Create window (Figure 316), click to select a folder, then enter a file name.
4. Click Create.
The Select Destination File window appears and displays the name of your new
AED file. (Figure 317)
5. Click OK.
The Details View shows the new annotation (AED).
Note: The AED file is automatically assigned the same genome assembly version
(for example, “hg18” or “hg19”) as the currently loaded NetAffx annotations.
Adding annotations to an AED file
1. In the Detail View, use one of the following methods to select the annotation(s)
you want to add to an AED file:
Right-click an annotation and select Add to a File on the shortcut menu
OR
Draw a box around multiple annotations, right-click the selection, and select
Add to a File on the shortcut menu.
2. In the Select Destination File window, select an AED file, then click OK.
Note: Adding annotations to an AED file does not modify the genome assembly
version. If the AED file does not specify a genome assembly version, none is
automatically assigned to the AED file when annotations are added.
Creating a new CytoRegions file in AED file format
1. From the View menu, choose Select CytoRegions file or Select Overlap Map.
The appropriate Select File window appears. (Figure 318)
Note: You can also create a new AED file when adding a region to an AED file.
3. Use the window controls to browse to a folder for the AED file.
4. Enter a file name.
5. Click Create in the Create window.
The Select File window appears with the new file selected. (Figure 321)
Note: The AED file is automatically assigned the same genome assembly version
(for example, “hg18” or “hg19”) as the currently loaded NetAffx annotations.
You can select the new file or add regions to it, depending upon what function
you were performing initially.
Note: To open an AED file, click the button or select File → Open on the
menu bar.
The Select Destination File window displays a list of the currently existing AED
files to which you may add the segment.
2. Select the region file you want to use, then click OK.
Note: The Annotation Properties window opens if you have selected a single
item (Figure 324). If you have selected multiple items, the Annotation Properties
window does not open.
Deleting regions from an AED file
1. Right-click on a region in a region file and select Delete Annotation(s)
(Figure 325).
Viewing the genome assembly version
The assigned genome assembly version of a loaded file can be viewed in the
Properties box (Figure 326). The property, if it has been set for the file, is shown as
“ucscGenomeVersion(Affx)”. An AED file that is created within ChAS is automatically
assigned the same genome assembly version as the loaded NetAffx annotations (for
example, “hg19”). If you add annotations to an existing AED file, its genome assembly
version will not be modified; and if no version is specified for the AED file, no version
will be assigned to it.
Note: When you save a file as AED or BED, the current value of the genome assembly
version property, if present, will be saved in the file. If two or more files are merged
into an AED or BED file, the current value of the genome assembly version, if present
in at least one of the files, will be saved in the merged file.
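The merge rule in the note above can be pictured as follows. This Python sketch models each file's properties as a plain dictionary, which is an assumption for illustration. It also takes the first version found when files disagree; this guide does not say how such conflicts are resolved, so treat that choice as hypothetical.

```python
GENOME_KEY = "ucscGenomeVersion(Affx)"


def merged_genome_version(file_properties):
    """Pick the genome assembly version to store in a merged file.

    file_properties: one dict of property name -> value per input file.
    Returns the first version found, or None when no file carries one.
    Taking the first value when files disagree is an assumption of this
    sketch; the guide does not say how conflicts are resolved.
    """
    for props in file_properties:
        version = props.get(GENOME_KEY)
        if version:
            return version
    return None


# One file without the property and one with it.
version = merged_genome_version([{}, {GENOME_KEY: "hg19"}])
```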
If an AED file does not include a genome assembly version, you can manually set it.
To do this, in the Properties window:
1. Click + to add a new property row, as shown in Figure 327 on page 296.
2. Select the Property Name ucscGenomeVersion(Affx) from the drop-down list.
3. Select Text under the Type drop-down list.
4. In the Value column, enter the genome assembly version (for example,
“hg19”).
Note: You can manually set the genome version of an AED file by editing the
“ucscGenomeVersion(Affx)” property. For more details on editing a property
value, see "Editing a property value" on page 297.
5. Enter a value, then click OK. For significantly longer values, click
(Figure 328) to open a Value editing window. Enter your (longer) value in this
window, then click OK.
The new value is entered in the File Properties table.
6. Click OK.
Editing a property value
1. Right-click the file and select View/Edit Properties on the shortcut menu.
The File Properties window appears. (Figure 329)
2. Click on a row to select it.
3. Enter a value, then click OK. For significantly longer values, click to open
a Value editing window. Enter your (longer) value in this window, then click OK.
The new value is entered in the File Properties table.
4. Click OK.
Note: The View/Edit Annotation Properties menu option is not available if you have
more than one feature selected in the Detail View.
The Annotation Properties window has three tabs:
General (See "Entering general information" on page 299)
Additional (See "Adding Properties" on page 300)
Curation (See "Adding a curation (Optional)" on page 303)
You can also create new user annotations if you select an element. This feature
enables you to create a region that is not based on a segment or reference annotation.
See "New user annotations" on page 304.
Annotation: Description
Label: Label given to the AED, CHP, or other annotation. For annotations originating from CHP
    segments, can be the Type, State, and Filename of the CHP file. The Label is not editable.
Name/ID: Name/ID assigned to the annotation in the AED, CHP or other file.
Category: Information on the source of the region. If the region was added by selecting a CHP
    segment, the segment type is saved.
Strand: The Sequence Strand of the item.
Chromosome: Cannot be edited in Annotation Properties box. See "New user annotations" on page 304.
Min: The smallest of the annotation's chromosomal coordinates.
Max: The largest of the annotation's chromosomal coordinates.
Materially Modified Time: Time stamp of when the start or end coordinate, type, or state of the
    segment or BED/AED annotation was last altered.
Materially Modified By: The Operating System user and ChAS User IDs of the Modifier who last
    changed a Material property (start or end coordinate, type, or state) of the segment or annotation.
Note: Information and comments about the region.
    Note: Always use alphanumeric characters and underscores. Avoid the use of odd characters.
    (Examples: & + ( ) [ ] { } ~ ^ and commas.)
Annotation: Description
Reference: URL/web address associated with the annotation.
    Click to link directly to the Reference from the Annotation Properties window. Internet
    connection is required.
Modified: The time stamp of the last modification of the annotation.
Modified By: The Operating System user and ChAS User IDs of the Modifier who last changed the
    annotation.
Counter: Enables you to track the number of times something has been seen.
Color: Allows assignment of a hard-coded color to the Annotation in ChAS's graphical views.
3. Click in the row under the Property Name column and enter a name for the
property. Note that your new entry is followed by the text “(custom)”. (Figure 335)
For more details, see "ChAS properties and types" on page 482.
4. Click in the row under Type and select a property type from the drop-down list
(Figure 336). For more details, see "ChAS properties and types" on page 482.
5. Click in the row under Value and enter the value (Figure 336).
Alternatively, click the Browse button (Figure 336), then in the Value window
(Figure 337) that appears, enter the property value, then click OK.
New user annotations
You can create a new annotation from an AED annotation.
1. In the Details view, right-click an AED annotation and select New User Annotation on
the shortcut menu. (Figure 341)
2. In the window that appears, enter the annotation information (Figure 342). For more
details, see "Entering general information" on page 299 and "Customizing
properties" on page 300.
3. Click OK.
The new user annotation is created and saved in the AED file.
Note: The default New User Annotation information includes only the chromosome
number. It does not include any information or properties associated with the AED
annotation.
Note: Each row in this table represents a specific Annotation, while each
column in this table represents a specific Property.
Note: The AED File Editor table displays ALL properties and tabs of every
annotation contained in an AED file.
Property (Column)
Annotations (Rows)
IMPORTANT! Only editable fields can be edited. If a field is non-editable, the Annotations
Property Window pops up. This window displays which fields are not editable (grayed out)
versus those that are editable. Also, not all user-editable AED file fields may currently be edited
from within the AED Editor. Some basic values (start, stop, type) cannot be edited in the AED
Editor table directly. You MUST use “View/Edit Annotation Properties...” for editing the
particular field in the annotation of interest.
8. Enter an appropriate label to distinguish your new batch annotation entry, then click
OK.
9. Your batch (group) of annotations appears as follows: (Figure 348)
When adding annotations to a Protected AED file, the following warning message
appears. (Figure 350)
3. Click Yes to acknowledge the message, then click OK to add the annotation to
the AED file.
Click No to return to the browser window without adding the annotation to the
AED file.
Note: Protected AED files are noted with an asterisk, as shown in Figure 351.
Selecting a new default color for loaded AED or BED files
1. Open the User Configuration window (click the button or select
Preferences → Edit User Configuration on the menu bar).
2. In the Color Rules tab (Figure 352), click the button.
3. In the window that appears, choose a color swatch or use the color controls to
specify a color.
4. Click OK in the AED/BED Annotation Color window.
Column Description
4. Click the Add button at the top of the table (Figure 354).
6. Click in the row under Type, then select a property type from the drop-down list.
7. Click in the row under Operator, then select an appropriate operator from the
drop-down list. (Figure 356)
Figure 356 Selecting property type and operator for the comparison
8. Click in the row under Value, then enter a value for the property.
9. Click in the row under Color.
The Pick a Color window appears. (Figure 357)
10. Select a color, then click OK.
2. Choose a color in the Swatches tab, or click the HSB or RGB tab to define a
color.
3. Click OK.
Note: The color set in the Annotation Property window overrides the colors
specified by a Color Rule created in the User Configuration window. For more
details, see "AED/BED color rules" on page 310.
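A Color Rule, as set up in the steps above, combines a property, an operator, a value, and a color, and a color set in the Annotation Property window takes priority over any rule. The Python sketch below illustrates that precedence. The operator names, the first-match-wins ordering, and the function name are assumptions for illustration; the choices available in the ChAS drop-down lists may differ.

```python
import operator

# Operator names here are illustrative; the choices in the ChAS drop-down
# list may differ.
OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def annotation_color(properties, rules, default_color, explicit_color=None):
    """Resolve a display color for one annotation.

    properties: dict of property name -> value for the annotation.
    rules: list of (property_name, op, value, color); first match wins
        (rule precedence is an assumption of this sketch).
    explicit_color: color set in the Annotation Property window, which
        overrides any Color Rule.
    """
    if explicit_color is not None:
        return explicit_color
    for name, op, value, color in rules:
        if name in properties and OPERATORS[op](properties[name], value):
            return color
    return default_color


# Hypothetical rules and annotation properties, for illustration only.
rules = [("Counter", ">", 5, "red"), ("Category", "=", "Gain", "blue")]
color = annotation_color({"Category": "Gain", "Counter": 2}, rules, "gray")
```

An annotation that matches no rule falls back to the default AED/BED color, and passing explicit_color bypasses the rules entirely.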
Detected Segments for xxCHP files Regions, names, and properties Regions and names
Annotation Features in Reference Annotation files Regions, names, and properties Regions and names
2. Click the Filter Exported Segments check box to restrict the export to the
contents of the Segments table. If filters have been applied to the data, only the
retained segments will be exported. Graph data and Chromosome Summary
data will not be exported.
Note: If this option is not selected, all segments which were loaded with the
CxCHP file will be exported along with header information, regardless of whether
filters are applied. The export includes all segment data, regardless of check
mark status (ON or OFF) in the Files window pane.
3. Select a folder location for the file, as you normally would.
4. To export to AED file format, enter a name for the file.
To export to BED file format (Figure 362), enter a name for the file, then select
Browser Extensible Data (BED) from the Files of Type drop-down list.
3. Use the navigation controls to select a folder for the merged AED file and enter
a file name for the file.
4. Click Export.
The file with the merged AED region information is created and can be used as a
region information file in ChAS.
IMPORTANT! After two AED files are merged, the original metadata in the header is not
retained. Also, when two or more files are merged into AED/BED format, the current value of
the genome assembly version property, if present in at least one of the files, will be saved in
the merged file.
Expression analysis AED file generation
AED files containing Gene Expression information and data can be created using
Thermo Fisher Scientific Human Gene Expression arrays analyzed in the Transcription
Analysis Console (TAC) 3.0 or higher.
For details on how to create an AED file containing Gene Expression/miRNA data,
please see the Gene Expression Copy Number Analysis Quick Reference Card.
You can simultaneously view fold change from your Gene Expression dataset with
copy number data from xxCHP files. Positive gene expression fold changes are
represented by blue transcript cluster IDs. Negative fold changes are represented by
red transcript cluster IDs. The deeper the color (blue or red) the larger the fold change.
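The color convention above (blue for positive fold changes, red for negative, deeper for larger magnitudes) can be pictured with a simple mapping. The Python sketch below is illustrative only: the linear ramp, the max_abs cap, and the function name are assumptions and do not describe the scale ChAS actually uses.

```python
def fold_change_color(fold_change, max_abs=10.0):
    """Map a signed fold change to an (R, G, B) tuple.

    Positive values shade toward blue and negative values toward red; the
    color deepens as the magnitude grows. The linear ramp and the `max_abs`
    cap are assumptions of this sketch, not ChAS's actual scale.
    """
    strength = min(abs(fold_change), max_abs) / max_abs  # 0.0 .. 1.0
    fade = int(round(255 * (1.0 - strength)))  # lighter when strength is low
    if fold_change >= 0:
        return (fade, fade, 255)
    return (255, fade, fade)
```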
IMPORTANT! If your VCF file(s) do not strictly adhere to the above guidelines, the file(s) will not
be compatible with ChAS and will not load.
1. From your software, export your VCF file (based on the above guidelines).
2. Click File → Open.
An Explorer window appears.
3. Navigate to your VCF file location, click to highlight it, then click Open.
A message appears indicating that the Genome Version could not be detected.
If you are certain that the genome build for the VCF file matches the Genome
Version loaded in the ChAS Browser, click OK to acknowledge the message.
* CNV segments on the X and Y chromosomes may appear gray if the gender of the file cannot
be determined, which in turn prevents the gain/loss status from being determined.
To link to TaqMan assays from the VCF file, see "Viewing and Ordering TaqMan
assays for genotyping" on page 215. Note: Linking feature applies to hg38 analyses
only.
IMPORTANT! Before you can export VCF files, you must install RHAS on the same workstation.
2. Select the files to be exported in VCF format by clicking on the check box next
to the file name.
Left side
Export as TXT file. See "Exporting tables as TXT file" on page 426.
Export as PDF Report. See "Exporting table data into a PDF" on page 422.
Export as MS Word DOCX file. See "Exporting as Word (DOCX) format" on page 420.
Copy selected cells to clipboard. See "Exporting a segments table with modified segments to a TXT
file" on page 428.
Calculate the sum, median, and mean of the selected values from a numeric column.
Far Right
The number of rows in the table.
Opens the Select Columns window that enables you to choose the column headers to show or hide.
The Export functions are described in "Exporting table data" on page 422.
Sorting by columns
You can sort a table by a single column’s values, or by the values in up to three columns.
Note: You may sort on any column except, for reasons of efficiency, the marker name
column in the graphs table.
Sorting on certain columns can cause a noticeable decrease in performance. For example,
it is recommended that you do not sort a table using the columns in the Segments table
that show the overlapping RefSeq, FISHClones, or other items. The data for these table
cells is calculated only on an as-needed basis when it needs to be displayed. Using such
a column for sorting would force the calculation of the data for all such cells. Since the
sorting would be alphabetical, it is unlikely to be useful. Similarly, for reasons of efficiency,
sorting based on the marker name column in the Graphs table is not allowed.
Figure 370 Table sorted by descending order of Segment ID and ascending order of Size
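Sorting on up to three columns, each ascending or descending, can be sketched with repeated stable sorts. The Python below is a minimal illustration, not ChAS code; the row layout and the function name are hypothetical, and the example mirrors the ordering named in the Figure 370 caption.

```python
def sort_rows(rows, sort_keys):
    """Sort table rows by up to three (column, descending) keys.

    Keys are applied from last to first; because Python's sort is stable,
    the first key ends up as the primary sort order.
    """
    if len(sort_keys) > 3:
        raise ValueError("at most three sort columns are supported")
    result = list(rows)
    for column, descending in reversed(sort_keys):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


# Hypothetical rows: descending Segment ID, then ascending Size.
rows = [
    {"Segment ID": 1, "Size": 500},
    {"Segment ID": 2, "Size": 900},
    {"Segment ID": 2, "Size": 300},
]
ordered = sort_rows(rows, [("Segment ID", True), ("Size", False)])
```

Rows that tie on the first column keep the order imposed by the second column, which is why the two rows with Segment ID 2 come out smallest Size first.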
Changing the order of table columns
1. Click and drag in the column header to move the column to a new location.
1. Click the Select Columns tool bar button. The Select Columns window
opens. (Figure 371)
Note: Specific items may vary, as they are dependent upon the type of table and
data being displayed.
The columns in the left side pane are displayed in the table (in their default order).
To hide columns within the table, move the column entry from the left (Chosen) pane
into the right (Available) pane.
To view different columns in the table, click and drag entries from the right pane to the
left pane.
Column order can be determined by clicking onto a column entry, then dragging it into
its desired location (inside the left pane).
Use the Table Sorting drop-down menus to sort your columns.
Note: These choices are auto-saved between sessions.
Calculating multiple numeric values in a column
1. Ctrl click or Shift click to highlight multiple numeric fields (MUST BE in the same
column).
2. Click .
Your multiple numeric values are calculated and summarized, as shown in Figure 373.
3. Click OK.
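The summary produced in step 2 covers the sum, median, and mean of the selected cells. A minimal Python sketch of the same three statistics, using the standard statistics module, is shown below; the function name and the sample values are hypothetical.

```python
import statistics


def summarize(values):
    """Return the sum, median, and mean of selected numeric cells."""
    numbers = [float(v) for v in values]
    if not numbers:
        raise ValueError("select at least one numeric value")
    return {
        "sum": sum(numbers),
        "median": statistics.median(numbers),
        "mean": statistics.mean(numbers),
    }


# Hypothetical values selected from one numeric column.
summary = summarize([100, 250, 400, 1250])
```

For these four values the median (325) and the mean (500) differ noticeably because one value is much larger than the rest.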
IMPORTANT! You can restore your default settings at any time by right-clicking on a table
column header and selecting Restore Defaults from the menu or selecting the Default Table
State under Apply Table State.
3. Type Default.
4. Click OK.
The table is now saved (as Default) to your User Profile for future use and/or
reference.
Adding columns to table states
1. Click the Select Columns icon. (Top right of Segments table)
Removing (hiding) columns in a table (for report use)
1. Right-click on a column header you want to remove (hide) from the table.
The following menu appears: (Figure 376)
Figure 376 Save Table State - Remove (Hide) Column
Figure 377 Saved Table State - Example: Table used for reporting.
Applying previously saved table states
There are six Table States that are created automatically in ChAS.
Default: Commonly used columns in any xxCHP file analysis.
Oncomine Reporter: Simplified Table State for export and use with Oncomine
Reporter Software.
Cytogenetic: Commonly used columns when analyzing constitutional samples.
Oncology: Commonly used columns when analyzing cancer samples.
ReproSeq: Standard columns for use with analyzing ReproSeq samples.
ClinVar: All required columns for using the ClinVar export.
Segments table
The Segments table (Figure 380) displays a list of the detected segments in the loaded
sample data files.
Note: If the Use in Export column is hidden from the Segments table, then all rows
are exported.
Segments table
In the Segments table, “N/A” can mean that the information (for example, FISH Clones
or sno/mRNA) is not available in the NetAffx database because the information has not
yet been mapped. For example, FISH Clones or sno/mRNA files will not appear in the
Files list for the NA31 (hg19) ChAS Browser NetAffx Genomic Annotation file. “N/A”
can also mean that a column carried over from a previous user profile no longer has
data in the currently loaded NetAffx Genomic Annotation file.
Annotations which share genomic coordinates with a segment are listed in order of
start coordinate value, smallest values first (i.e. from left to right in the Details View).
For annotations with the same start coordinate (for example, isoforms of a single
gene), the one with the smallest end coordinate is listed before others with larger stop
coordinates.
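The ordering rule above is a two-level sort: by start coordinate first, then by end coordinate. A minimal Python sketch follows; the tuple layout and the annotation names are hypothetical.

```python
def order_annotations(annotations):
    """Order (name, start, end) tuples by start, then by end coordinate."""
    return sorted(annotations, key=lambda a: (a[1], a[2]))


# Hypothetical isoforms that share a start coordinate.
annotations = [
    ("GENE_B", 500, 900),
    ("GENE_A_long", 100, 800),
    ("GENE_A_short", 100, 400),
]
names = [name for name, _, _ in order_annotations(annotations)]
```

The two entries starting at 100 are listed shortest first, followed by the annotation that starts at 500.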
If a column in the Segments table contains more than 10 items, “…” is displayed after
the 10th item to indicate that some data are not displayed in order to save calculation
time. For example, “…” will follow the 10th name in the Genes column. However, a
complete list of the genes will be included when the information is copied to the
system clipboard or exported to reports. For gene isoforms with identical names, only
one instance of the gene locus will be listed in the Segment table to reduce duplicate
gene names.
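The display rule above (repeated names collapsed, ten items shown, then “…”) can be sketched as follows. This Python is illustrative only: the comma separator, the function names, and collapsing repeats in the exported list as well are assumptions.

```python
def display_cell(names, limit=10):
    """Format a list cell: drop repeated names, show `limit` items, then "…"."""
    unique = list(dict.fromkeys(names))  # keeps first occurrence, in order
    shown = ", ".join(unique[:limit])
    return shown + ", …" if len(unique) > limit else shown


def export_cell(names):
    """Full de-duplicated list, as used for clipboard copies and reports."""
    return ", ".join(dict.fromkeys(names))
```

The table cell is shortened for display, while the exported or copied value keeps every name.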
The table can display each segment with the following information. The default set of
columns in a new user profile may include only a subset of the total columns listed.
For instructions on how to switch quickly between table column sets for a particular
table, see "Saved table states" on page 332.
Materially Modified Segments (merged, created de novo, segments with edited start
or end coordinates) and deleted segments have a different appearance in the
Segments Table and export to TXT differently depending on the software settings. For
more information, see "Exporting a segments table with modified segments to a TXT
file" on page 428.
Column: Description
% of Overlaps Map Item covered by Segment: Overlap Map Item and the percentage by which it is
    covered by the segment.
Call Approval: Can be used as a bookmark for segments that have been reviewed.
Call From Prioritization: The Call term assigned based on Tier or Score Classification. For more
    information, see "Viewing segment prioritization in the segments table" on page 380.
CN State: Copy Number State (not displayed for LOH segment types).
    The expected Copy Number State on the X chromosome in normal males is not constant over its
    entire length. This is due to the structure of the sex chromosomes. See "LOH segments on X
    and Y chromosomes" on page 49 for more information.
Confidence (ReproSeq): The log likelihood that the called copy number state is not the normal
    ploidy (for example, 2 on autosomes). It reflects the likelihood of the region's ploidy number
    being different from the normal ploidy of 2.
Curation By: The current computer Operating System login ID and ChAS user profile name at the
    time that the Curation field was last edited.
Curation Time: The time and date when the Curation field was last edited.
CytoRegions: Names of the CytoRegions with which the segment shares coordinates.
DB Both: The number of segments in the database meeting BOTH the Minimum Percent Overlap and
    the Minimum Percent Coverage. This number can change depending on whether the "match only
    same gain/loss type" box is checked. Right-click on a row in the Segment Table. From the menu,
    click DB Count Both. See "Querying a segment from the segment table" on page 392.
DB Coverage: The number of segments in the database meeting the Minimum Percent Coverage. This
    number can change depending on whether the "match only same gain/loss type" box is checked.
    Right-click on a row in the Segment Table. From the menu, click DB Coverage Count. See
    "Setting up a ChAS DB query" on page 390.
DB Overlap: The number of segments in the database meeting the Minimum Percent Overlap. This
    number can change depending on whether the "match only same gain/loss type" box is checked.
    Right-click on a row in the Segment Table. From the menu, click DB Overlap Count. See
    "Setting up a ChAS DB query" on page 390.
DGV: List of DGV variations that share coordinates with the segment.
DGV-GS: List of curated Database of Genomic Variants considered "Gold Standard" that share
    coordinates with the segment.
End Marker: The array marker name which marks the end of the segment.
Column Description
Evidence Provides information on which annotations the segment overlapped. For more information, see
"Viewing segment prioritization in the segments table" on page 380.
Filtered DB Both The number of segments in the database meeting the Minimum Percent Overlap and Minimum Percent
Coverage and the selected Filter Criteria.
Filtered DB The number of segments in the database meeting the Minimum Percent Coverage and the selected
Coverage Filter Criteria.
Filtered DB The number of segments in the database meeting the Minimum Percent Overlap and the selected Filter
Overlap Criteria.
FISH Clones List of FISH clones that share coordinates with the segment.
Following Marker The array marker just below the segment in the data track used as input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Following Marker Location The coordinate location of the array marker just below the segment in the data track used as input for
the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Full Location Chromosome Start and Stop in a user-friendly format for use in external databases.
Gene Count A count of the gene names listed in the Genes column.
Genes List of RefSeq genes from the Genes track that share coordinates with the segment. Identically named
gene isoforms are NOT repeated.
Materially Modified By The current computer Operating System login ID and ChAS user profile name at the time that the
segment was last materially modified.
Materially Modified Segment Indication that segment was previously merged, deleted, or had its start or end boundary, type, or
state altered by a ChAS user. (ChAS-based processes of Smoothing and Joining are not
“Modifications”, nor are making Calls or Interpretations, in this context.)
Materially Modified Time The time and date when the Segment was last materially modified.
Max % Coverage The highest percentage by which a segment covers some item(s) in the Overlap Map.
Max % Overlap The highest percentage by which some item(s) in the Overlap Map overlaps the segment. Segments
completely overlapped by an Overlap Map item are 100% overlapped. This number is used for Filtering
Segments out by “Overlap”.
Mean Log2 Ratio The mean of all the Log2 Ratio values contained in the segment.
Mean Marker Distance Length of the segment in base pairs divided by the number of markers in the segment.
Mean Weighted Log2 Ratio The mean of all the Weighted Log2 Ratio values contained in the segment.
Median Log2 Ratio The median of all the Log2 Ratio values contained in the segment.
Number of Overlap Map Items Number of Overlap Map items which share genomic coordinates with the segment.
OMIM Gene Count A count of the OMIM Gene names listed in the OMIM Genes column.
OMIM Genes List of OMIM Genes that share coordinates with the segment.
OMIM Phenotype Loci List of OMIM Phenotype Loci that share coordinates with the segment.
OMIM Region Phenotype Lists the Gene Title of any region that overlaps with a copy number or LOH segment.
Oncomine Report A drop-down list of annotations compatible with the Oncomine Reporter Software. For details on this
application, go to: https://www.thermofisher.com/order/catalog/product/A34298
Overlap Map Items (% of Segment overlapped) Item(s) in the Overlap Map which overlap the segment, followed by the percentage by which the
segment is overlapped by that Item.
Preceding Marker The array marker just above the segment in the data track used as input for the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Preceding Marker Location The coordinate location of the array marker just above the segment in the data track used as input for
the segment.
Note: This column is only applicable to CNState Gain and Loss segments.
Precision (ReproSeq) The log likelihood that the called copy number state is different from the next closest copy number state
(reflects the likelihood that the precise ploidy number is correct).
Protein Coding Genes List of protein coding RefSeq genes from the Genes track that share coordinates with the segment.
Identically named gene isoforms are NOT repeated.
Protein Coding Genes Count Number of Protein Coding RefSeq genes that share coordinates with the segment.
Protein Coding Ensembl Genes List of protein coding Ensembl Genes annotations that share coordinates with the segment. Identically
named gene isoforms are NOT repeated.
Protein Coding Ensembl Genes Count Number of Protein Coding Ensembl Genes that share coordinates with the segment.
Segment Label A label comprised of the segment's Type, State, and Filename.
Segmental Duplications List of Segmental Duplications that share coordinates with the segment.
Smoothed/Joined/XON merged Indication that the segment was created by smoothing, joining or merging two or more segments in
the initial segment detection.
sno/miRNA List of sno/miRNA features that share coordinates with the segment.
Start Marker The array marker name which marks the beginning of the segment.
Summarized Log 2 Ratio The median of the LR, after transformation to adjust for individual marker responsiveness.
Tier or Score The assigned Tier or Score value based on the Segment Prioritization method selected. When using
Tier based, the column will display the assigned Tier. When using Score based, the column will display
the score value based on the annotations the segment overlaps. For more information, see "Viewing
segment prioritization in the segments table" on page 380.
Type Type of segment, for example, LOH. When sorting by this column, the segments of a particular sample
are listed in the same order that they appear in the Data Types window pane.
Use in Report Allows manual selection of Segments for export to a Segments Table PDF, DOCX, or Text rather than
all segments in the table.
XON Region Level The annotation Level assigned to this region of the genome.
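The Max % Overlap and Max % Coverage columns in the table above can be illustrated with a short sketch. This is not the ChAS implementation. It assumes half-open (start, end) base-pair coordinates on a single chromosome, and the function names are hypothetical. Overlap is taken as the share of the segment that an Overlap Map item overlaps, and coverage as the share of the item that the segment covers, following the column descriptions.

```python
def shared_bases(seg_start, seg_end, item_start, item_end):
    """Number of bases two intervals have in common (half-open coordinates assumed)."""
    return max(0, min(seg_end, item_end) - max(seg_start, item_start))


def percent_overlap(segment, item):
    """Percentage of the segment that is overlapped by the item."""
    return 100.0 * shared_bases(*segment, *item) / (segment[1] - segment[0])


def percent_coverage(segment, item):
    """Percentage of the item that is covered by the segment."""
    return 100.0 * shared_bases(*segment, *item) / (item[1] - item[0])


def max_overlap_and_coverage(segment, items):
    """Return (Max % Overlap, Max % Coverage) over all items; 0.0 if there are no items."""
    overlaps = [percent_overlap(segment, it) for it in items]
    coverages = [percent_coverage(segment, it) for it in items]
    return max(overlaps, default=0.0), max(coverages, default=0.0)


if __name__ == "__main__":
    segment = (1000, 2000)                     # hypothetical segment
    items = [(500, 1500), (1800, 1900)]        # hypothetical Overlap Map items
    print(max_overlap_and_coverage(segment, items))
```

In this hypothetical example the first item overlaps half of the segment, while the second item lies entirely inside the segment and is therefore 100% covered. This shows why the two maxima can come from different items.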
Figure 382 Detail View zoomed in on the segment selected in the Segments table
Detail View
Figure 383 Detail View, annotations have been expanded and selected
3. Using the mouse, draw a box around all of the genes and annotations of interest.
When you release the mouse button, a blue box outlines the selected items in the
Detail View. (Figure 383)
4. Click the tool bar button.
The Selection Details table appears (Figure 384). It includes all of the items
selected in the Detail View. For more details on the table, see "Selection details
table" on page 208.
Click a button to export the information.
For details on exporting from the Segment table, see "Exporting table data" on page
422.
Graphs table
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
The Graphs table displays the marker data used to create the graphs in the Detail
view. Markers that are not used for the graphs currently displayed do not appear in
this table. As in the Detail View, only markers from a single chromosome are displayed.
The column headings are colored according to the tracks used for the Karyoview,
Selected Chromosome View, and Details View.
Figure 385 Example Graphs table with Genotype Calls for CytoScan HD results
Column Description
Markers Marker ID. Right-click to link to NetAffx information about the marker.
Note: For efficiency reasons, it is not possible to sort the table on this column.
The Graphs Settings button opens the Graph Settings panel, enabling you to
change the style of graph, scale, and other features for the data graphs. See
"Changing graph appearance" on page 196.
Exporting genotype calls
Note: Exporting genotypes is not available for OncoScan or ReproSeq data.
1. Click the tool bar button. Alternatively, select Reports → Export Genotype
Results Text File from the menu bar.
2. In the window that appears (Figure 387 on page 348), select the array type,
results file(s) (CYCHP), and annotation database to use for the export.
4. Enter the path name or click the Browse button to select a folder for the output.
5. Enter a file name prefix. If only one output file is created (see below), this will be
the file name. If multiple files are created, a suffix will be added to this string to
create the file name. Do not include the file extension here.
6. Select a Multiple File Output option which determines if a separate file will be
created for each chromosome and/or CYCHP file.
None One output file will be exported that contains all chromosome and all CYCHP file data.
There will be separate data columns for each CYCHP file in the exported file.
Separate File for each Chromosome Creates a separate file for each chromosome in the output data. If all chromosomes are
selected, 24 files will be created. There will be separate data columns for each CYCHP
file in the exported file.
Separate File for each CYCHP File Creates one text file per CYCHP file. Each file contains genotype calls for all
chromosomes.
Separate File for each Chromosome and Separate File for each CYCHP File Creates a separate file for each CYCHP file and for each
chromosome. If three CYCHP files are selected and all chromosomes are reported on, this will create 72 files.
7. Click OK.
Note: Exporting of Genotypes may take several minutes to complete, as this
process is dependent on the total number of SNPs selected for export.
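The number of output files produced by each Multiple File Output option follows directly from the option descriptions above. The sketch below only reproduces that arithmetic. The function and parameter names are hypothetical and are not part of ChAS.

```python
def expected_file_count(n_chromosomes, n_cychp_files, per_chromosome, per_cychp):
    """Estimate how many genotype text files an export creates.

    per_chromosome: True if "Separate File for each Chromosome" applies.
    per_cychp:      True if "Separate File for each CYCHP File" applies.
    """
    count = 1  # the "None" option writes a single combined file
    if per_chromosome:
        count *= n_chromosomes
    if per_cychp:
        count *= n_cychp_files
    return count


if __name__ == "__main__":
    # Three CYCHP files, all 24 chromosomes, both options selected
    print(expected_file_count(24, 3, per_chromosome=True, per_cychp=True))
```

With three CYCHP files and all 24 chromosomes, selecting both options gives 24 x 3 = 72 files, which matches the example in the option table.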
The exported text file (Figure 389) includes information about the analysis (for
example, array type, NetAffx annotation database, hg version, and chromosome).
Note: If the option “Separate File for each CYCHP File” was not selected, many of the
headers will be repeated for each CYCHP file. The header titles will be appended with
the CYCHP file name to indicate which file the column belongs to.
The column headers report the following information:
Column Description
Forward Strand Base Calls Base calls for the forward strand.
Variants table
For OncoScan FFPE and CytoScan HTCMA arrays only. The Variants table
(Figure 390) displays the somatic mutation information from OncoScan FFPE arrays
and/or the variant information from CytoScan HTCMA arrays.
Variants table The table can display each mutation with the following information (the default set of
columns in a new user profile may include only a subset of the total columns listed
below).
CytoScan HTCMA
Column Description
Affx SNP ID Unique Thermo Fisher Scientific generated identifier for the SNP.
Alt Allele The call for the first alternate allele associated with a non-normal phenotype.
Associated Phenotype Displays the Phenotype that is associated with the variant.
c.name Displays standard variant nomenclature based on coding DNA reference sequences.
Inheritance Pattern Method of Inheritance (Example: AR (Autosomal Recessive), XLR (X-linked Recessive)).
Recommended Probeset A quality control metric determined by SNP Polisher algorithm that chooses the best probesets
querying a SNP.
Ref Allele The call for the reference allele associated with a normal phenotype.
RSID dbSNP ID
Type Undetected/Detected. Detected are those mutations with any call other than the wild-type or
major homozygous genotype.
Variant Status Status of the variant based on the genotype (i.e. Not Detected, Het, Hom, NoCall, NRP).
Variant Status Alt Allele Severity status for the variant mapped to Alt Allele.
Variation ID ClinVar ID
Column Description
Channel CEL file from which the signal is measured. “A” is the AT CEL, “C” is the GC CEL.
Common Name Abbreviated description of the mutations to which this ProbeSet is known to respond. The name
has the form [Gene]:[amino acid change for mutation]:[cDNA change for mutation]. In the event
that the ProbeSet cannot differentiate among multiple mutations to which it can respond, the slash
(/) delimits the multiple known mutations.
COSMIC ID The identifier of the mutation as listed in the COSMIC database, which is a catalogue of somatic
mutations in cancer. More information on these mutations can be found at:
http://cancer.sanger.ac.uk
Event Describes if the probeset is a point mutation, deletion, insertion or sequence variant.
Event Type Describes if the event is missense, frame-shift, in-frame insertion or deletion.
Genes RefSeq gene that shares coordinates with the somatic mutation.
High Threshold High confidence MutScore threshold. Measurements equal to or greater than this threshold are
called “High confidence,” describing the likelihood that the mutation is present.
Low Threshold Lower confidence MutScore threshold. Measurements with a MutScore below this value are called
“Undetected”. Measurements equal to or greater than this threshold but less than the High
Threshold are called “Lower confidence,” describing the likelihood that the mutation is present.
Mutation (aa) Wild type > mutant amino acid change on the coding strand.
Mutation (nt) Wild type > mutant nucleotide change on the coding strand.
Mutation Syntax (aa) Encoding of which amino acid was changed and the location of the codon.
Mutation Syntax (cds) Encoding of which nucleotide was changed and its location in the CDS.
MutScore Measures somatic mutation probeset response. The stronger the response, the more likely it is
that the somatic mutation is present. The MutScore calculation depends on the algorithm version.
The newer MutScore calculation also corrects for sample-specific effects, and thereby reduces
false positive calls, which were sample specific.
For algorithm versions 1.0 - 1.2 (ChAS 3.0 and earlier, OncoScan Console 1.2 and earlier):
MutScore.old = (measured quantile normalized signal - median signal for this marker in the
reference model file) / (95th percentile signal for this marker in the reference model file - median
signal for this marker in the reference model file).
For algorithm versions 1.3 and newer (ChAS 3.1 and newer, releases of OncoScan Console after
1.2):
MutScore.new = (MutScore.old - median MutScore.old for this sample) / standard deviation of
MutScore.old for this sample (where standard deviation is calculated for all but the num-out-std
strongest MutScore.old for this sample, median is calculated for all but the num-out-med
strongest MutScore.old for this sample, and the used median is the maximum of zero and the
measured median).
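The MutScore formulas and the Low/High Threshold rules described in the table above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm's actual code. "Strongest" is taken to mean the largest MutScore.old values, the standard deviation is computed as a population standard deviation, and the names `num_out_med` and `num_out_std` simply mirror the num-out-med and num-out-std terms in the text. All function names are hypothetical.

```python
import statistics


def mutscore_old(signal, ref_median, ref_p95):
    """MutScore.old: normalized signal scaled by the reference model's median and 95th percentile."""
    return (signal - ref_median) / (ref_p95 - ref_median)


def _drop_strongest(scores, n):
    """Return the scores sorted ascending, without the n largest values (assumption: strongest = largest)."""
    ordered = sorted(scores)
    return ordered[:max(0, len(ordered) - n)]


def mutscore_new(score_old, sample_scores_old, num_out_med, num_out_std):
    """MutScore.new: MutScore.old corrected for sample-specific effects."""
    measured_median = statistics.median(_drop_strongest(sample_scores_old, num_out_med))
    used_median = max(0.0, measured_median)  # the used median is never below zero
    spread = statistics.pstdev(_drop_strongest(sample_scores_old, num_out_std))
    return (score_old - used_median) / spread


def classify(mutscore, low_threshold, high_threshold):
    """Apply the Low Threshold and High Threshold rules to a MutScore."""
    if mutscore >= high_threshold:
        return "High confidence"
    if mutscore >= low_threshold:
        return "Lower confidence"
    return "Undetected"


if __name__ == "__main__":
    # Hypothetical values, for illustration only
    print(mutscore_old(signal=150.0, ref_median=100.0, ref_p95=200.0))
    print(classify(0.5, low_threshold=1.0, high_threshold=2.0))
```

Trimming the strongest values before taking the median and standard deviation keeps genuine mutation signals from inflating the sample-specific baseline. This is consistent with the stated aim of reducing sample-specific false positive calls.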
Note: Changes made to an OSCHP file in the Somatic Mutation Viewer Application
require the OSCHP file to be reloaded into the ChAS Browser to reflect the change
made to the sample.
Status
Data Files
Region Files
The top section displays Status for Restricted Mode (see "Using restricted mode" on
page 275).
The tables display information on:
Loaded data files. See "QC and sample information table" on page 355.
Loaded region files. See "Loaded AED/BED files table" on page 361.
QC and sample information table
The QC table has six pre-loaded Table States allowing you to quickly toggle to the
relevant information based on array type. For detailed information, see "Saved table
states" on page 332.
CytoScan QC view
Column Description
Antigenomic Ratio Ratio of median intensity antigenomic control probes/median intensity all copy number probes.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
QC In or Out of QC bounds.
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
SNPQC SNP QC value. Median Absolute Pairwise Difference value. See Appendix G for detailed
information.
Waviness SD A global measure of variation of microarray probes that is insensitive to short-range variation and
focuses on long-range variation. See Appendix G for detailed information.
Default QC view
Column Description
Annotation File Name of the Annotation file used to create the xxCHP file.
Cel Pair Check Inspects each pair of intensity (*.cel) files to determine whether the files have been properly paired
and assigned to the correct channel.
Low Diploid flag An essential part of the algorithm is the identification of "normal diploid" markers in the cancer
samples. This is particularly important in highly aberrated samples. The normal diploid markers
are used to calibrate the signals so that "normal diploid" markers result in a log2 ratio of 0 (e.g.
copy number 2). The algorithm might later determine that the "normal diploid" markers identified
really correspond to (for example) CN=4. In this case the log2 ratio gets readjusted and TuScan
ploidy will report 4. Occasionally (in about 2% of samples) the algorithm cannot identify a
sufficient number of "normal diploid" markers and no "normal diploid" calibration occurs. This
event triggers "low diploid flag" = YES. In this case the user needs to carefully examine the log2
ratios and verify if re-centering is necessary.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
ndSNPQC QC metric for SNP probes that is derived from polymorphic SNP probes in normal diploid
regions.
ndwavinessSD Measure of variation of probes in normal diploid regions that are insensitive to short-range
variation and focus on long-range variation.
Parameter File Name of the chasparam file used to create the xxCHP file.
QC In or Out of QC bounds.
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
SNPQC SNP QC value. Median Absolute Pairwise Difference value. See Appendix G for detailed
information.
TuScan %AC If % AC = 100%, we return "homogeneous" because it could be 100% normal or 100% tumor. If
% AC =NA, the percent aberrant cells could not be determined and TuScan returns non-integer
CN calls. This metric is an algorithmically determined estimate of the % of aberrant cells in the
sample.
TuScan Ploidy The most likely ploidy state of the tumor before additional aberrations occurred. TuScan Ploidy
is assigned the median CN state of all markers, provided that %AC could be determined and
integer copy numbers are returned. If %AC cannot be determined, NA (Not Available) is reported
for both ploidy and %AC.
Waviness SD A global measure of variation of microarray probes that is insensitive to short-range variation and
focuses on long-range variation. See Appendix G for detailed information.
Column Description
DishQC (DQC) Measures the amount of overlap between two homozygous peaks created by non-polymorphic
probes. DQC of 1 is no overlap, which is good. DQC of 0 is complete overlap, which is bad.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
QC In or Out of QC bounds.
QC Call Rate Percentage of autosomal SNPs with a call other than NoCall (measured at the Sample QC step).
QC Het Rate Percentage of SNPs called AB (i.e. the heterozygosity) for autosomal SNPs (measured at the
Sample QC step).
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
SMN MAPD Median Absolute Pairwise Difference value calculated from the CNVMix algorithm for the SMN
pipeline.
SMN WavinessSD A global measure of variation of microarray probes that is insensitive to short-range variation and
focuses on long-range variation from the CNVMix algorithm for the SMN pipeline.
SNPQC SNP QC value. Median Absolute Pairwise Difference value. See Appendix G for detailed
information.
Waviness SD A global measure of variation of microarray probes that is insensitive to short-range variation and
focuses on long-range variation. See Appendix G for detailed information.
OncoScan QC view
Column Description
Cel Pair Check Inspects each pair of intensity (*.cel) files to determine whether the files have been properly paired
and assigned to the correct channel.
Cel Pair Check Concordance Percentage of SNPs that match between the AT and GC arrays.
Low Diploid flag An essential part of the algorithm is the identification of "normal diploid" markers in the cancer
samples. This is particularly important in highly aberrated samples. The normal diploid markers
are used to calibrate the signals so that "normal diploid" markers result in a log2 ratio of 0 (e.g.
copy number 2). The algorithm might later determine that the "normal diploid" markers identified
really correspond to (for example) CN=4. In this case the log2 ratio gets readjusted and TuScan
ploidy will report 4. Occasionally (in about 2% of samples) the algorithm cannot identify a
sufficient number of "normal diploid" markers and no "normal diploid" calibration occurs. This
event triggers "low diploid flag" = YES. In this case the user needs to carefully examine the log2
ratios and verify if re-centering is necessary.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
ndSNPQC QC metric for SNP probes that is derived from polymorphic SNP probes in normal diploid
regions.
ndWaviness SD Measure of variation of probes in normal diploid regions that are insensitive to short-range
variation and focus on long-range variation.
QC In or Out of QC bounds.
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
TuScan %AC If % AC = 100%, we return "homogeneous" because it could be 100% normal or 100% tumor. If
% AC =NA, the percent aberrant cells could not be determined and TuScan returns non-integer
CN calls. This metric is an algorithmically determined estimate of the % of aberrant cells in the
sample.
TuScan Log 2 Ratio adjustment Log 2 ratio determined from TuScan algorithm needed to "center" the diploid region of the
sample (around Log 2 = 0).
ReproSeq QC view
Column Description
Batch File (ReproSeq) Name of the batch file downloaded from Ion Reporter.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
Single File (ReproSeq) Name of the current file from a Batch File downloaded from Ion Reporter.
Column Description
DishQC (DQC) Measures the amount of overlap between two homozygous peaks created by non-polymorphic
probes. DQC of 1 is no overlap, which is good. DQC of 0 is complete overlap, which is bad.
MAPD Median Absolute Pairwise Difference value. See Appendix G for detailed information.
QC In or Out of QC Bounds.
QC Call Rate Percentage of autosomal SNPs with a call other than NoCall (measured at the Sample QC step).
QC Het Rate Percentage of SNPs called AB (i.e. the heterozygosity) for autosomal SNPs (measured at the
Sample QC step).
Sex Gender call for the sample. See "Gender call algorithms" on page 360.
SMN MAPD Median Absolute Pairwise Difference value calculated from the CNVMix algorithm for the SMN
pipeline.
SMN WavinessSD A global measure of variation of microarray probes that is insensitive to short-range variation and
focuses on long-range variation from the CNVMix algorithm for the SMN pipeline.
SMN1 (variant) Genotype call for SMN variant(s) as defined in the SMN.SNP.list
For more details, see the RHAS User Guide.
Other data from the header of the Sample Data file or the ARR file can also be selected
for display in the Select Columns window.
You can only hide or display columns by using the Column Select window, at the right
of the table.
Note: For samples run through the Normal Diploid Analysis for CytoScan, the
ndSNPQC and ndwavinessSD metrics can be viewed in the QC Information Tab, but
will not flag a sample as pass/fail.
Gender call algorithms
The table below explains which algorithm is used to make the gender calls for the
different arrays.
The CytoScan Array uses the call “Y-gender” which gives a male/female call.
Depending on the version they were created under, various GTC 2.x and 3.x SNP6
CNCHP files use other gender calls present in their CNCHP file header.
These calls used from the CNCHP file header are NOT the same gender calls used for
those files in GTC, since the GTC-displayed gender calls were stored in GQC or
CN_SEGMENTS files which are not supported in ChAS.
Note: For more details on how the array-specific algorithms call LOH segments for the X
or Y chromosome, see "LOH segments on X and Y chromosomes" on page 49.
File File Name (with Icons displayed if selected as Overlap File or CytoRegions File).
You can only hide or display columns by using the Column Select window, at the right
of the table.
To choose the data type, make a selection from the drop-down list. (Figure 396)
Figure 396 Selecting the data type for the Chromosome Summary table
IMPORTANT! You must check the LOH Segment Data type to view the sample’s percent LOH.
Searching results
The Search function enables you to search:
Detected Segments
Reference Annotations
Loaded Region Information Files
The search can find:
Names of Reference Annotations
BED and AED file elements, including those in files designated as CytoRegions or
Overlap Maps
Loaded and displayed segments
You can search by:
File (select the files to be searched)
ID Label
Type
1. From the View menu, select Search or click the upper tool bar’s icon.
The Search window opens. (Figure 398)
If the search takes more than a few seconds, a Searching... window appears.
(Figure 400)
If results are found, the Search Results table opens. (Figure 401)
Searching within a selected file
1. Right-click on a selected (checked) file inside the Files list pane.
Figure 402 Searching notice
If the search takes more than a few seconds, an In Progress notice appears.
(Figure 405)
If results are found, the Search Results window/table opens. (Figure 406)
Column Description
Finding intersections
The Find Intersection feature enables you to find segments and regions that overlap
for different:
Detected Segments
Reference Annotations
Loaded Region Information Files
2. Select the first file for the comparison from the File A drop-down list. (Figure 408)
The list shows the available Sample files, Region Information Files, and
Reference Annotations.
Note: Only files that are check marked in the Files List appear in the Match File
drop-down list.
3. Select the second file from the File B drop-down list.
4. Click Find Intersection…
The table displays the names of the A and B files above the tool bar.
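Conceptually, Find Intersection reports pairs of features, one from File A and one from File B, that share genomic coordinates. The sketch below is a simplified illustration only. It assumes each feature is a (name, chromosome, start, end) tuple with half-open coordinates, and the function name and feature names are hypothetical. It does not reflect how ChAS stores or compares features internally.

```python
def find_intersections(features_a, features_b):
    """Return (name_a, name_b) pairs for features that overlap on the same chromosome."""
    results = []
    for name_a, chrom_a, start_a, end_a in features_a:
        for name_b, chrom_b, start_b, end_b in features_b:
            same_chrom = chrom_a == chrom_b
            overlaps = start_a < end_b and start_b < end_a
            if same_chrom and overlaps:
                results.append((name_a, name_b))
    return results


if __name__ == "__main__":
    # Hypothetical features from a sample file (A) and a region file (B)
    file_a = [("seg1", "1", 100, 500), ("seg2", "2", 100, 500)]
    file_b = [("regionX", "1", 400, 900), ("regionY", "2", 600, 700)]
    print(find_intersections(file_a, file_b))
```

The nested loop is adequate for a handful of features. Larger inputs would normally be sorted or indexed first.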
To highlight features in the views or the table:
Double-click in a row of the table to zoom to the feature for File A in the Selected
Chromosome and Detail Views.
Click on a feature in the Selected Chromosome or Detail View to highlight the
feature in the Intersection Results table (the feature must be listed in the table to
be highlighted).
You can perform the common table operations in the Intersection Results table (see
"Common table operations" on page 327).
Column Description
Note: The following NetAffx Genomic Annotation files (or more current versions) are required
for full use of all the parameters in Segment Prioritization and for optimal segment
prioritization results:
NetAffxGenomicAnnotations.Homo_sapiens.hg19.na20221201
NetAffxGenomicAnnotations.Homo_sapiens.hg38.na20221201
IMPORTANT! All data results from the Segment Prioritization process should be manually
reviewed.
ChAS AIR tokens
Contact your local sales representative to purchase AIR Tokens in quantities of 24,
96 or 384 samples.
Your sales representative will provide a link for setting up an account in Franklin
(required for new accounts only).
The ChAS AIR Tokens will be deposited into your Franklin account.
One token is deducted from your AIR token balance for each CHP file you upload
to Franklin (1 CHP file = 1 AIR token).
A window appears confirming what data type and filters are going to be uploaded
to Franklin. (Figure 412)
IMPORTANT! Do not delete the open case in Franklin. Instead, re-upload the edited file(s) to
Franklin to overwrite the existing data.
Returning to an open case in Franklin from ChAS
1. To access a previously uploaded file in Franklin from ChAS, go to the File
window, right-click on the file(s), then click Open Case in Franklin. (Figure 413)
Figure 413 Open Case in Franklin
The Franklin web page opens displaying the details of your open case.
Note: Before you can review a segment from the Franklin website in ChAS, the HTTP
service in ChAS must be enabled. From the ChAS Browser, click Preferences →
Configure HTTP Service. Confirm the Enable box is checked and Port 8348 is
displayed. Also make sure Port 8348 is not blocked by your firewall.
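A quick way to confirm that something is listening on the configured port is to attempt a local TCP connection. The sketch below uses only the Python standard library and assumes the ChAS Browser is running on the same computer. The function name is hypothetical. A successful connection shows only that a local listener exists on that port. It does not prove that a firewall allows the traffic from Franklin.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8348 is the port shown in the Configure HTTP Service window
    print("Port 8348 reachable locally:", is_port_open("127.0.0.1", 8348))
```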
Configuring the tier-based option
1. From the Segment Table tool bar, click the button.
The Segment Prioritization Options window appears. (Figure 414)
3. To prioritize your copy number segments in a tier-based order, assign a tier value
to the rule(s) you want to use. To do this, click the drop-down arrow adjacent to
the rule, then click the tier value (1-5) you want to assign to it. A Tier 1 assignment
denotes the highest rule priority, while Tier 5 is the lowest.
Note: Not every Rule requires an assigned tier. See the table below for Rule
definitions.
IMPORTANT! For copy number segments that meet rules with different tier assignments, the
highest tier (lowest number) is assigned to the segment. Example: If a copy number segment
meets a rule assigned Tier 1 and a rule assigned Tier 3, the segment is assigned Tier 1, the
higher of the two tiers. Two rules are exceptions: DGV-GS and DB-B. If either of these rules is
met, its assigned tier overrides any other tiers. If both of these rules are met, the tier
assigned to DGV-GS is used.
Tier-based rule/evidence Description
DGV-GS Database of Genomic Variants - Gold Standard. The copy number segment is completely contained within
an entry of like type (Gain or Loss) from the Gold Standard DGV track and meeting a defined frequency.
Default frequency is >=1%. This rule WILL override higher ranked tiers based on the tier selected in this
rule.
DB-B ChAS DB Both. Compare the copy number segment to the ChAS DB Both column data. If the number of
entries in this column exceeds the defined threshold, then the copy number segment will be assigned the
tier associated with this rule (unless the DGV-GS rule is also met). This rule WILL override rules with higher
ranked tiers (with the exception of DGV-GS).
DB-F ChAS DB - Filtered. Compare the copy number segment to the Filtered ChAS DB Both column. Example:
Set the Filtered ChAS DB query to filter on segments in the database with the Call 'unknown significance'.
If the copy number segment overlaps enough segments in your ChAS DB called 'unknown significance',
then the selected Tier will be assigned.
OM-3 Any OMIM Genes annotation that is dark green in color. Dark Green is assigned for phenotype map key 3
OMIM records indicating the molecular basis is known; a mutation has been found in the gene.
OM Any OMIM Genes annotation that is NOT dark green in color. See OM-3 (above).
Cyto-R CytoRegions file. The segment overlaps any region in the assigned CytoRegion file. For more information
on CytoRegions, see "Using CytoRegions" on page 267.
TS Triplosensitivity. The copy number segment overlaps an entry in the Triplosensitivity track which has an
assigned TS_score of 3.
HI Haploinsufficiency. The copy number segment overlaps an entry in the Haploinsufficiency track which has
an assigned HI_score of 3.
RS RefSeq. The copy number segment overlaps an entry in the Protein Coding Genes Track.
P-HI The copy number segment overlaps a Protein Coding Gene or Protein Coding Ensembl Gene with
predicted haploinsufficiency values meeting the defined thresholds. pLI derived from gnomAD
(https://gnomad.broadinstitute.org/) and %HI derived from DECIPHER (https://decipher.sanger.ac.uk/).
NoGene The copy number segment does not overlap any known Protein Coding Gene or Protein Coding Ensembl Gene.
4. Once all desired rules have an assigned tier, click OK to save the selections and
return to the Segment Prioritization Options window.
Click Cancel to return to the Segment Prioritization Options window without
saving any tier assignment changes.
Click Restore Defaults to return to the installation settings.
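The tier resolution described in the IMPORTANT note and rule table above can be sketched as follows. This is an illustrative reading of the documented behavior, not ChAS code. The function name, the dictionary layout, and the tier values in the example are hypothetical, and a rule with no assigned tier is represented as missing or None.

```python
# Rules whose tier overrides all others, checked in this order (DGV-GS wins over DB-B)
OVERRIDE_RULES = ("DGV-GS", "DB-B")


def resolve_tier(met_rules, tier_by_rule):
    """Return the tier for a segment, or None if no met rule has an assigned tier.

    met_rules:    set of rule abbreviations the segment satisfies, e.g. {"OM-3", "RS"}
    tier_by_rule: mapping of rule abbreviation -> tier (1-5), or None if unassigned
    """
    for rule in OVERRIDE_RULES:
        if rule in met_rules and tier_by_rule.get(rule) is not None:
            return tier_by_rule[rule]
    tiers = [tier_by_rule[r] for r in met_rules if tier_by_rule.get(r) is not None]
    # Highest tier means the lowest number
    return min(tiers) if tiers else None


if __name__ == "__main__":
    tiers = {"OM-3": 1, "RS": 3, "DGV-GS": 5, "DB-B": 4}  # hypothetical assignments
    print(resolve_tier({"OM-3", "RS"}, tiers))       # ordinary case: best tier wins
    print(resolve_tier({"OM-3", "DGV-GS"}, tiers))   # DGV-GS overrides the higher tier
```

A return value of None corresponds to a segment that meets no rule with an assigned tier.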
Tier to call settings
You can assign a Call to represent each tier. The contents of the drop-down list are
generated from the Calls Vocabulary list. There is a set of default Calls, but this list
can be customized, as detailed in "Using the calls feature" on page 253.
1. Click on the drop-down list adjacent to the Tier(s) you want to assign a Call to,
then click on a selection, as shown in Figure 416.
Note: Tiers are not required to have a Call assigned. Unassigned Tiers will appear
blank in the Segment Table’s Calls From Prioritization column.
2. After your Tier to Call assignments are complete, click OK.
Call From Prioritization: Displays the Call associated with the Tier assigned to the
copy number segment.
Evidence: Displays the abbreviation representing the rules met based on which
annotations the copy number segment overlaps. See the Tier-based rule/
evidence table of definitions above for more details.
Note: A copy number segment that does not overlap any rules with an assigned
tier will display No rules match.
Tier or Score: This will be a number from 1-5 representing the Tier that was
assigned to the copy number segment based on the user-defined Tier-Based rules
selected.
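The relationship between the Evidence, Tier, and Call From Prioritization columns can be sketched as follows. This is a minimal illustration and not the ChAS implementation: the rule-to-tier and tier-to-call assignments are hypothetical examples, the "Level 2" Call name is invented for the example, and the sketch assumes that when several rules match, the lowest-numbered tier takes precedence (the guide does not state how ties between rules are resolved).

```python
# Hypothetical assignments; real values are chosen by the user in the
# Segment Prioritization Options window.
RULE_TIERS = {"Cyto-R": 1, "HI": 1, "TS": 2, "P-HI": 3, "RS": 4, "NoGene": 5}
TIER_CALLS = {1: "Level 1", 2: "Level 2"}  # tiers 3-5 left without a Call


def prioritize(matched_rules):
    """Return (tier, call, evidence text) for the rule abbreviations a segment met."""
    # Only rules that have an assigned tier count as evidence.
    evidence = [rule for rule in matched_rules if rule in RULE_TIERS]
    if not evidence:
        return None, "", "No rules match"
    # ASSUMPTION: the lowest-numbered tier wins when several rules match.
    tier = min(RULE_TIERS[rule] for rule in evidence)
    # Tiers without an assigned Call appear blank.
    call = TIER_CALLS.get(tier, "")
    return tier, call, ", ".join(evidence)


print(prioritize(["TS", "RS"]))
print(prioritize(["RS"]))
print(prioritize([]))
```

The second call shows the case described in the Note above: a tier with no assigned Call yields a blank Call From Prioritization value.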
If the Call from Prioritization assignments are correct, they can be accepted as the
Calls for each segment. To do this:
Configuring the 1. From the Segment Table tool bar, click the button.
score-based The Segment Prioritization Options window appears. (Figure 419)
option
2. Use the text field adjacent to the Rule to enter a new numerical value. Click the
Restore Defaults button to return to the factory values. See the table below for
Score-based rule/evidence, descriptions, and default value information.
1A The Gain copy number segment fully or partially overlaps at least 1 annotation in the Protein 0
Coding Genes track.
1B The Gain copy number segment does not fully or partially overlap any annotation in the -0.6
Protein Coding Genes track.
2A The Gain copy number segment completely overlaps an annotation in either the 1
Triplosensitivity or Recurrent/Curated Regions track with a TS Score = 3.
2B The Gain copy number segment partially overlaps an annotation in either the Triplosensitivity 0
or Recurrent/Curated Regions track with a TS Score = 3. Partial overlap indicates one
breakpoint of the Gain segment is located within the TS_Score = 3 gene/region.
2C/2F The Gain copy number segment contains the same gene content as a Triplosensitivity or -1
Recurrent/Curated Regions annotation with a TS Score = 40. The copy number Gain segment
might be larger than the gene/region, but contains the same gene content as listed in the
Triplosensitivity or Recurrent/Curated Regions tracks.
2D/2E Both breakpoints of the Gain copy number segment are contained within an annotation -1
having a TS Score = 40 in either the Triplosensitivity or Recurrent/Curated Regions.
2H The Gain copy number segment completely overlaps an annotation in either the 0
Haploinsufficiency or Recurrent/Curated Regions track with a HI Score = 3.
2I Both breakpoints of the Gain copy number segment are contained within an annotation 0.3
having a HI_ Score = 3 in either the Haploinsufficiency or Recurrent/Curated Regions. The
copy number Gain segment is smaller than the user defined threshold (Default >=90%).
2I+ Both breakpoints of the Gain copy number segment are contained within an annotation 0.9
having a HI_ Score = 3 in either the Haploinsufficiency or Recurrent/Curated Regions. The
copy number Gain segment is larger than the user defined threshold (Default >=90%).
2J/2K The Gain copy number segment partially overlaps an annotation in either the 0
Haploinsufficiency or Recurrent/Curated Regions track with a HI Score = 3. Partial overlap
indicates one breakpoint of the Gain segment is located within the HI_Score = 3 gene/region.
3A The Gain copy number segment (partially or completely) overlaps at least 1 Protein Coding 0
Gene annotation. Default is 1-34 genes.
3B The Gain copy number segment (partially or completely) overlaps more Protein Coding Gene 0.45
annotations than in 3A. Default is 35-49.
3C The Gain copy number segment (partially or completely) overlaps more Protein Coding Gene 0.9
annotations than in 3A or 3B. Default is >=50.
4O-DB-B The Gain copy number segment overlaps/covers a defined number of segments in your ChAS -1.0
database (DB Count Both column). Default is 400 segments. Configuration of DB Count Both
parameters can be found in "Querying a segment from the segment table" on page 392.
4O-DGV-GS Both breakpoints of the Gain copy number segment are contained within an annotation in the -1.0
DGV-GS Gain (blue) track. The DGV-GS annotation must have an NR frequency greater than the
defined threshold. Default NR frequency is 1%.
CY The Gain copy number segment overlaps an annotation in the customer supplied 0
CytoRegions File(s). For more information on CytoRegion files, see "Using CytoRegions" on
page 267.
2. Use the text field adjacent to the Rule to enter a new value. Click the Restore
Defaults button to return to the factory values. See the table below for Score-
based rule/evidence, descriptions, and default value information.
1A The Loss copy number segment fully or partially overlaps at least 1 annotation in the 0
Protein Coding Genes track.
1B The Loss copy number segment does not fully or partially overlap any annotation in the -0.6
Protein Coding Genes track.
2A The Loss copy number segment completely overlaps an annotation in either the 1
Haploinsufficiency or Recurrent/Curated Regions track with a HI Score = 3.
2B-r The Loss copy number segment partially overlaps an annotation in the Recurrent/Curated 0
Regions track with a HI Score = 3. Partial overlap indicates one breakpoint of the Loss
segment is located within the HI Score = 3 region.
2B-g The Loss copy number segment partially overlaps an annotation in the Haploinsufficiency 0 (static value, further assessment required)
track with a HI Score = 3. Partial overlap indicates one breakpoint of the Loss segment is
located within the HI Score = 3 gene. If 2B-g is met, then move on to 2C - 2E to assess a
value based on location of the partial overlap.
2C-1 The Loss copy number segment overlaps the 5'UTR and some CDS of a gene with HI score 0.9
= 3 in the Haploinsufficiency track.
TIP: Right-click on the transcript, choose View/Edit Annotation Properties, then select the
Structure tab to view the exons and CDS coordinates.
Note: All transcripts for a gene are assessed as long as the transcript is =< 90% of the
length of the gene as defined in the Haploinsufficiency track and have identical gene
symbols.
2C-2 The Loss copy number segment overlaps the 5'UTR, but no CDS of a gene with HI score 0
= 3 in the Haploinsufficiency track.
TIP: Right-click on the transcript, choose View/Edit Annotation Properties, then select the
Structure tab to view the exons and CDS coordinates.
Note: All transcripts for a gene are assessed as long as the transcript is =< 90% of the
length of the gene as defined in the Haploinsufficiency track and have identical gene
symbols.
2D-1 The Loss copy number segment overlaps the 3'UTR only, no CDS is involved for a gene 0
with HI score = 3 in the Haploinsufficiency track.
TIP: Right-click on the transcript, choose View/Edit Annotation Properties, then select the
Structure tab to view the exons and CDS coordinates.
Note: All transcripts for a gene are assessed as long as the transcript is =< 90% of the
length of the gene as defined in the Haploinsufficiency track and have identical gene
symbols.
2D-2/2D-3 The Loss copy number segment overlaps the 3'UTR AND the last exon in the coding region 0.3
for a gene with HI_score = 3 in the Haploinsufficiency track.
TIP: Right-click on the transcript, choose View/Edit Annotation Properties, then select the
Structure tab to view the exons and CDS coordinates.
Note: All transcripts for a gene are assessed as long as the transcript is =< 90% of the
length of the gene as defined in the Haploinsufficiency track and have identical gene
symbols.
2D-4 The Loss copy number segment overlaps the 3'UTR AND multiple exons in the coding 0.9
region for a gene with HI_score = 3 in the Haploinsufficiency track.
TIP: Right-click on the transcript, choose View/Edit Annotation Properties, then select the
Structure tab to view the exons and CDS coordinates.
Note: All transcripts for a gene are assessed as long as the transcript is =< 90% of the
length of the gene as defined in the Haploinsufficiency track and have identical gene
symbols.
2E Both breakpoints of the Loss copy number segment are contained within an annotation 0.3
having a HI_Score = 3 in either the Haploinsufficiency track or Recurrent/Curated Regions
track. The copy number Loss segment is smaller than the user defined threshold
(Default >=90%).
2E+ Both breakpoints of the Loss copy number segment are contained within an annotation 0.9
having a HI_Score = 3 in either the Haploinsufficiency track or Recurrent/Curated Regions
track. The copy number Loss segment is larger than the user defined threshold
(Default >=90%).
2F Both breakpoints of the Loss copy number segment are contained within an annotation -1
having a HI_ Score = 40 in either the Haploinsufficiency track or Recurrent/Curated
Regions track.
2H The Loss copy number segment overlaps a Protein Coding Gene or Protein Coding 0.15
Ensembl Gene with predicted haploinsufficiency values meeting the defined thresholds.
pLI derived from gnomAD (https://gnomad.broadinstitute.org/) and %HI derived from
DECIPHER (https://decipher.sanger.ac.uk/).
3A The Loss copy number segment (partially or completely) overlaps at least 1 Protein Coding 0
Gene annotation. Default is 1-24 genes.
3B The Loss copy number segment (partially or completely) overlaps more Protein Coding 0.45
Gene annotations than in 3A. Default is 25-34.
3C The Loss copy number segment (partially or completely) overlaps more Protein Coding 0.9
Gene annotations than in 3A or 3B. Default is >=35.
4O-DB-B The Loss copy number segment overlaps/covers a defined number of segments in your -0.9
ChAS database (DB Count Both column). Default is 400 segments. Configuration of DB
Count Both parameters can be found in "Querying a segment from the segment table" on
page 392.
4O-DGV-GS Both breakpoints of the Loss copy number segment are contained within an annotation in -0.9
the DGV-GS Loss (red) track. The DGV-GS annotation must have an NR frequency greater than
the defined threshold. Default NR frequency is 1%.
CY The Loss copy number segment overlaps an annotation in the customer supplied 0
CytoRegions File(s). For more information on CytoRegion files, see "Using CytoRegions"
on page 267.
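The gene-count rules (3A, 3B, 3C) in the two tables above differ only in their default cut-offs for Gain and Loss segments. A minimal sketch of that binning is shown below, using the default ranges listed in the tables. The function and dictionary names are hypothetical and are not part of ChAS.

```python
# Default lower bounds for rules 3B and 3C, taken from the tables above.
# Gain: 3A = 1-34 genes, 3B = 35-49, 3C = >=50
# Loss: 3A = 1-24 genes, 3B = 25-34, 3C = >=35
GENE_COUNT_DEFAULTS = {
    "Gain": {"3B": 35, "3C": 50},
    "Loss": {"3B": 25, "3C": 35},
}


def gene_count_rule(segment_type, gene_count):
    """Return the gene-count rule (3A, 3B or 3C) a segment meets, or None."""
    if gene_count < 1:
        return None  # no Protein Coding Gene overlap, so no 3x rule applies
    bounds = GENE_COUNT_DEFAULTS[segment_type]
    if gene_count >= bounds["3C"]:
        return "3C"
    if gene_count >= bounds["3B"]:
        return "3B"
    return "3A"


for kind, count in [("Gain", 34), ("Gain", 35), ("Loss", 34), ("Loss", 35)]:
    print(kind, count, gene_count_rule(kind, count))
```

Note that the same gene count can meet a different rule depending on segment type: with the defaults, 35 genes meets 3B for a Gain but 3C for a Loss.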
Define the Score Thresholds: In the appropriate text field, enter the score threshold for
each Call, based on the segment scores defined above. The Call whose threshold range
contains a segment's score is populated in the Segment Table’s Call from Prioritization column.
Select Calls: Use the drop-downs adjacent to each threshold to assign a Call that
will be associated with a range of scores.
Note: Calls in the drop-down lists can be customized by adding to the Calls
Vocabulary window in the User Configuration.
In the example below (Figure 423), a copy number segment with a Score of 1.3 would
have a Call from Prioritization assignment of "Level 1". A copy number segment with
a score of -0.96 would have a Call from Prioritization assignment of "Probably
nothing".
2. Click OK to accept the Score thresholds and Calls or click Cancel to return to the
ChAS Browser without saving any new assignments. Click the Restore Defaults
button to return to the factory values.
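The way a segment score maps to a Call can be sketched as follows. This is an illustration under stated assumptions, not the ChAS implementation: it assumes the values of all rules met by a segment are added together to form the score, and the threshold values and the "Level 2", "Level 3" and "Nothing" Call names are hypothetical (the actual thresholds appear in Figure 423, which is not reproduced here). The rule values are a subset of the Gain defaults from the table above.

```python
# Subset of the default Gain rule values from the table above.
GAIN_DEFAULTS = {
    "1A": 0.0, "1B": -0.6, "2A": 1.0,
    "3A": 0.0, "3B": 0.45, "3C": 0.9,
    "4O-DB-B": -1.0, "4O-DGV-GS": -1.0,
}

# HYPOTHETICAL thresholds: (minimum score, Call), highest first.
CALL_THRESHOLDS = [
    (0.99, "Level 1"),
    (0.90, "Level 2"),
    (-0.89, "Level 3"),
    (-0.98, "Probably nothing"),
]
LOWEST_CALL = "Nothing"  # hypothetical Call for scores below every threshold


def segment_score(rules_met, values=GAIN_DEFAULTS):
    """ASSUMPTION: the score is the sum of the values of all rules met."""
    return round(sum(values[rule] for rule in rules_met), 2)


def call_for_score(score):
    """Return the Call for the first threshold the score reaches."""
    for minimum, call in CALL_THRESHOLDS:
        if score >= minimum:
            return call
    return LOWEST_CALL


score = segment_score(["1A", "2A", "3B"])
print(score, call_for_score(score))
print(call_for_score(1.3), call_for_score(-0.96))
```

With these illustrative thresholds, the two scores from the example above (1.3 and -0.96) map to "Level 1" and "Probably nothing" respectively.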
Viewing segment Three new segment prioritization columns now appear in the Segment Table.
prioritization in (Figure 417).
the segments
table
Call From Prioritization: Displays the "Call" associated with the score threshold
ranges.
A segment can be queried against the ChAS Database for intersecting segments from
previously published samples. Using both the Overlap Threshold and Coverage Threshold
can focus the query results to segments that are of approximately the same size as the
segment in the current sample.
Note: ReproSeq Aneuploidy data cannot be published into the ChAS DB.
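The effect of combining the two thresholds can be sketched with simple interval arithmetic. The definitions used here are assumptions for illustration: Overlap is taken as the percentage of the current sample's segment that is shared with the database segment, and Coverage as the percentage of the database segment that is shared with the current segment. The function names and the 50% values are hypothetical placeholders, and coordinates are treated as zero-based, end-exclusive.

```python
def shared_bases(a, b):
    """Number of bases two (start, end) intervals have in common."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def overlap_and_coverage(query, db_segment):
    """Return (overlap %, coverage %) under the assumed definitions.

    overlap  = share of the query segment that the database segment overlaps
    coverage = share of the database segment that the query segment covers
    """
    shared = shared_bases(query, db_segment)
    overlap_pct = 100.0 * shared / (query[1] - query[0])
    coverage_pct = 100.0 * shared / (db_segment[1] - db_segment[0])
    return overlap_pct, coverage_pct


def meets_both(query, db_segment, min_overlap=50.0, min_coverage=50.0):
    """True when the database segment passes both thresholds."""
    overlap_pct, coverage_pct = overlap_and_coverage(query, db_segment)
    return overlap_pct >= min_overlap and coverage_pct >= min_coverage


query = (100, 200)
print(overlap_and_coverage(query, (150, 1150)), meets_both(query, (150, 1150)))
print(overlap_and_coverage(query, (120, 220)), meets_both(query, (120, 220)))
```

In the first case the much larger database segment passes the overlap threshold but fails coverage, which is why applying both thresholds narrows results to segments of roughly the same size as the query segment.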
2. Check the Match only same gain/loss type check box if you want to query the
database for only similar copy number types (gains only or losses only). Uncheck this
check box if you want to query all copy number segment types.
3. Check the Include Exon Regions check box if you want to include Exon Region
Segments in your query.
4. Check the Include LOH box to include LOH segments in your query.
5. Click OK to save your changes or click Restore Defaults to return the parameter
settings back to their default settings.
Setting up query Note: When querying on an LOH segment in the Browser, the values set in the LOH Query
parameters for an Parameters section are used.
LOH segment 1. Enter minimum percentage values for both Overlap and Coverage using the text
search boxes or click and drag the appropriate slider bar. (Figure 428) Note: The default
values are set to 50%.
2. Click OK to save your changes or click Restore Defaults to return the parameter
settings back to their default settings.
Setting up query Note: When querying on an XON Region segment in the Browser, the values set in the
parameters for an Exon Region Query Parameters section are used.
XON region 1. Enter minimum percentage values for both Overlap and Coverage using the text
segment search boxes or click and drag the appropriate slider bar. (Figure 429) The default values are
set to 50%.
2. Check the Match only same gain/loss type check box if you want to query the
database for only similar XON Region segment types (gains only or losses only).
Uncheck this check box if you want to query all XON Region segment types.
3. Check the Include Copy Number Segments check box to include Copy
Number Segments in your query.
4. Click OK to save your changes or click Restore Defaults to return the parameter
settings back to their default settings.
Segment The Segment Intersections view appears with the results from the query.
intersections (Figure 431)
The Segment Intersections view shows samples in the database that contain
segments that meet the criteria set in the DB Query.
The middle portion of this view contains table information about the samples in the
database including any Call, Interpretation or Inheritance information assigned to the
segments for the samples shown in the example above. To display or hide columns
within this table, click (upper right corner).
The lower portion of the view provides the same external annotations available in the
Detail View. To display an annotation track, go to the ChAS Browser’s Files Menu and
check the box. The annotation track will then be displayed in both the Detail View and
the Segment Intersections View. (Figure 431)
You can return segments from the database based on either DB Overlap or DB
Coverage. These segments meet either one of the overlap or coverage threshold
criteria.
This segment example is represented as a loss (in the sample currently loaded in the browser).
External Annotations
Click the Side-by-Side icon (upper right) to split the Segment Intersections window, as
shown in Figure 432 on page 395.
Note: Columns with a pad and pencil icon represent a segment field that can be
edited. All edits are stored directly to the database.
Column Description
% of Overlap Item The percentage of the Overlap Map Item covered by the segment.
covered by Segment
Call from The Call term assigned based on Tier or Score Classification at the time the sample was published
Prioritization (Stored) to ChAS DB.
CytoRegions Names of the CytoRegions with which the segment shares coordinates.
DB Coverage Count Number of segments in the database meeting the minimum Percent Coverage.
DB Overlap Count Number of segments in the database meeting the minimum Percent Overlap.
DGV List of DGV variations that share coordinates with the segment.
Evidence (Stored) Provides information on which annotations the segment overlapped at the time the sample was
published to ChAS DB.
Genes List of RefSeq genes from the Genes track that share coordinates with the segment. Identically
named gene isoforms are NOT repeated.
Max Zero-based index position of the last base pair in the sequence, plus one. Adding one ensures that
the length of any (hypothetical) segment containing a single marker would be one, and ensures that
the coordinates match the coordinate system used in BED files.
For all segments, the segment start coordinates are always lower by one bp from the coordinate for
the starting probe of the segment as reported in the graphs table while the end coordinate matches
the coordinate for the ending probe as reported in the graphs table.
Min Zero-based index position of the first base pair in the sequence.
OMIM Genes List of OMIM Genes that share coordinates with the segment.
Overlap Map Item(s) in the Overlap Map which overlap the segment.
Overlap Map Items The percentage of the segment that is overlapped by the Overlap Map Item.
Sample DB ID An xxCHP file ID automatically assigned when the xxCHP file is published to the database.
Sample Type Displays the Sample Type assigned to this xxCHP file.
Segment DB ID An ID automatically assigned to each segment when the xxCHP file is published to the database.
Segmental List of Segmental Duplications that share coordinates with the segment.
Duplications
Tier or Score (Stored) The assigned Tier or Score value based on the Segment Prioritization method selected at the time
the sample was published to ChAS DB. When using Tier based, the column will display the assigned
Tier. When using Score based, the column will display the score value based on the annotations the
segment overlaps.
Column Description
Original location Chr:start-stop genome location of the original genome build for the segment.
Removed markers Number of original markers removed from the segment from remapping.
Original Genome Build Genome Build from the original analysis prior to remapping.
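The Min and Max column definitions in the table above describe a zero-based, end-exclusive coordinate system of the kind used in BED files. A minimal sketch of the conversion from probe positions is shown below. It assumes the probe coordinates reported in the graphs table are one-based, which is inferred from the statement that the segment start is always one bp lower than the starting probe coordinate. The function names and the BED line layout are illustrative only.

```python
def segment_min_max(first_probe_pos, last_probe_pos):
    """Convert probe positions to the segment table's Min and Max values.

    ASSUMPTION: probe positions are one-based, so Min is one lower than the
    first probe position and Max equals the last probe position.
    """
    return first_probe_pos - 1, last_probe_pos


def segment_length(seg_min, seg_max):
    """Length in base pairs of a zero-based, end-exclusive interval."""
    return seg_max - seg_min


def bed_line(chrom, seg_min, seg_max, name):
    """Format a segment as a tab-separated four-column BED line."""
    return f"{chrom}\t{seg_min}\t{seg_max}\t{name}"


# A hypothetical segment containing a single marker at position 1000.
seg_min, seg_max = segment_min_max(1000, 1000)
print(seg_min, seg_max, segment_length(seg_min, seg_max))
print(bed_line("chr1", seg_min, seg_max, "single_marker"))
```

The single-marker case shows why one is added to the last base position: the resulting length is one rather than zero.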
Note: As shown in Figure 436, sample files downloaded from ChAS DB are listed in the
Files Tree with a database symbol. Segments from sample files downloaded from
ChAS DB are listed in the Segments Table with a database symbol in the File column.
2. Refer to "Setting up a ChAS DB query" on page 390 for how to set the Percent
Minimum Overlap/Coverage Thresholds.
5. Use the window's check boxes, radio buttons, and selections to change your
filter parameters.
6. Click OK.
Your new query parameters are saved and displayed at the bottom of the Filtered
DB Query window tab, as shown in Figure 439.
To reset the Filtered DB Query Parameters back to default settings, click Restore
Defaults. For more information, see "Filtering DB count columns" on page 401.
Figure 440 shows the DB Count Both and Filtered DB Count Both columns. It
illustrates that DB Count Both queries the database for all segments matching the
Minimum Percent Overlap/Coverage and gain/loss/LOH parameters. The Filtered DB
Count Both column reflects the additional Filter Criteria.
The results returned when querying a segment in the detail view will contain segments
that meet the DB Coverage filter or the DB Overlap filter set up previously. (Figure 426
on page 390)
2. Use the provided radio buttons, check boxes, and text fields to customize your
search, then click OK.
Note: Altering these parameters only affects the current segment query. The
following fields, Sample Types, Array Types and Calls are populated based on
what has been published by the user into the ChAS database. If the ChAS
Browser is unable to contact the database, these fields are populated based on
the library file and vocabularies entries.
IMPORTANT! You MUST have Manager or Admin Role permissions to publish data to the
database. For details, see "Administration" on page 456.
Method 2
1. Click to highlight the sample name(s), then click the tool bar's Publish to
Database icon .
A summary of uploaded segments/Publish? appears. (Figure 445)
During a ChAS session, go to the upper bar of icons and click on for
automatic connection or click on for manual connection.
Or
Click ChAS DB → Auto-Connect, then click on the check box to toggle between
connection modes, as shown in Figure 447.
2. Use the provided radio buttons, check boxes, and text fields to customize your
query, then click OK.
The Query Samples table populates with your filtered search results. (Figure 449)
The contents for columns in which the headers have an Edit Icon can be
modified. Changes will apply directly to the ChAS database.
Columns with a chip/pencil icon represent sample properties that can be
edited.
Figure 449 Query Samples window tab table - Filtered search results
For instructions on how to use the table’s features, see "Common table operations"
on page 327.
Note: The following fields, Sample Types, Array Types and Calls are populated
based on what has been published by the user into the ChAS database. If the ChAS
Browser is unable to contact the database, these fields are populated based on the
library file and vocabularies entries. Queries are not automatically refreshed when
publishing to or deleting from the ChAS DB. Queries must be re-run to reflect changes
to the database made after the initial query.
WARNING! You must have manager or admin permissions for the ChAS database to delete
samples.
Sample(s) deleted from the ChAS Database are permanently deleted and cannot be retrieved. There
is no undo delete feature.
Deleting a single To remove a single sample in a database, use the Query Samples window to locate
sample the file to be deleted.
1. In the Query Samples window (Figure 448 on page 409), enter the file’s Filename
or Sample ID, then click OK.
The sample appears in the table. (Figure 451)
Deleting multiple 1. Multiple samples can be highlighted for deletion. They can be selected using the
samples following keyboard and mouse combinations: Ctrl click, Shift click or Ctrl a.
(Figure 453)
Figure 453 Query Samples window tab table - Deleting multiple samples
2. Use the provided radio buttons, check boxes, and text fields to customize your
query, then click OK.
The Query Segments table populates with your filtered search results.
(Figure 455)
The contents for columns in which the headers have an Edit Icon can be
modified. Changes will apply directly to the ChAS database. Editable columns
are: Call, Segment Interpretation, Inheritance, Oncomine Reporter.
Figure 455 Query Segments window tab table - Filtered search results
For instructions on how to use the table’s features, see "Common table operations"
on page 327.
Note: To delete sample files from the Query Segments tab, follow the same
instructions outlined in "Deleting sample(s) from the ChAS database" on page 411.
This procedure deletes the entire sample file and all file information associated with it.
Chromosome Analysis Suite includes the following tools for reporting results:
Export the Karyoview, Selected Chromosome View, and Detail View as a DOCX,
PDF or PNG file. See "Exporting graphic views".
Export to a DOCX file. See "Creating signature and background profiles".
Export table data as a DOCX, PDF, TXT file, or copy selected data onto your
clipboard. See "Exporting table data".
Combine PDF reports. See "Combining PDFs into a single PDF".
Use a ClinVar export template. See "Exporting with ClinVar".
Export copy number and variant data as VCF. See "Exporting VCF files".
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Exporting as a ChAS provides a variety of options for exporting graphic views as PDFs. The PDF
PDF Report displays the graphic with basic information about data files and settings.
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
1. From the Exports menu, select the PDF option you want to use.
Export application window PDF - Creates PDF with entire software screen and
information about data files
Export Karyoview PDF - Creates PDF with Karyoview and information about
data files.
Export Selected Chromosome PDF - Creates PDF with Selected Chromosome
View and information about data files.
Export Detail View PDF - Creates PDF with selected Detail View and information
about data files.
Export Segments Table PDF - Creates PDF with Segment Table
Export QC and Sample Info PDF - Creates PDF with the QC and Sample
Information
Export Chromosome Summary Data PDF - Creates PDF with the Chromosome
Summary Data
Signature profiles Signature Profiles, including a logo, address, reviewer name and credentials, can be
added to a DOCX export. Use saved signature profiles for quick recall with any DOCX
export.
1. Click on the Preference Menu, then select Edit User Configuration.
2. Click the Exports Tab.
3. Click the Summary Exports tab, then click the Summary Export tab. (Figure 457)
Deleting a signature
1. Highlight the Signature name you would like to delete.
2. Click the Delete button.
3. Click OK to permanently remove the signature.
Background Background profiles (Figure 458) provide saved text(s) that can be added to each
profiles DOCX export. For example, you can save noteworthy background information
about the assay.
Deleting a background
1. Highlight the Background name you would like to delete.
2. Click the Delete button.
3. Click OK to permanently remove the Background.
Exporting as 1. From the Exports menu, select Export - Word (docx) Format.
Word (DOCX) The Export Details window opens. (Figure 459)
format
Figure 459 Export Details window
6. Optional: Click the Include Row Number check box to add row numbers to the
Segment Table.
7. Optional: Click the Hide Y Chromosome check box to export the Karyoview without
the Y chromosome ideogram for female samples.
Exporting as PNG You can also create a PNG screen shot of the entire software screen.
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Exporting table For information on how to choose and export preset column content (from previously
data into a PDF saved table states), see “Saved table states” on page 332.
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Note: You can export data from the Segment Table by selecting Export Segment
Table PDF from the Exports menu, but you cannot export graph table data in a PDF
format.
In order to track which of your segments have been modified (merged, created de
novo, deleted, had their start or end coordinates edited, or had their type or state
Figure 461 Select Columns and File window for Segment Table PDF
If you have added a Sample Interpretation in the View/Edit Sample Properties window,
the information will be populated into the Interpretation dialog box. You can also type
free text into the Interpretation box. You must check the Include Sample Properties
check box in order to enable the Interpretation field.
4. Select the option for adding page and row numbers, if desired.
5. Select the option for adding to an existing report, if desired. See "Combining
PDFs into a single PDF" on page 430.
6. Click the Select File button.
7. Select a folder location for the file using the navigation tools.
8. Enter a name for the PDF file, or select a file for the information to be appended
to.
9. Click Save in the Save As window.
10. Click OK in the Select Columns and File window.
A PDF file is created with the selected data type saved.
The PDF report displays:
Table type
Information on chromosome and genome region
Interpretation
Data files
Genome or CytoRegion Segment Filters used
Settings for Data Processing
Microarray Nomenclature
Details of the table data
Exporting tables The TXT file format enables you to transfer data to other software for analysis.
as TXT file
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
3. Select a folder location for the file using the navigation tools.
4. Enter a name for the TXT file.
5. Click Save.
The TXT file is saved in the selected location (Figure 466).
It can be opened using a text editing or spreadsheet program, or in other
software designed to use Tab Separated Value TXT format.
Figure 468 is an example of a Segment table that has been exported with the Edit
Mode ON. Note the 4th row of the Use in Report column. In the case of this segment,
it reads FALSE, because this segment was deleted.
Figure 469 is an example of a Segment Table with Edit Mode OFF. The italicized text
representing materially modified segments is no longer present. The deleted segment
and strike-through text showing a deletion (shown in Figure 467 and Figure 468) also
do not appear.
Figure 470 is an example of how an exported TXT table appears with Edit Mode OFF.
Note the deleted segment shown in Figure 467 and Figure 468 is not present.
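Because the exported TXT file is tab-separated, it can be read with standard tools in other software. The sketch below reads an export made with Edit Mode ON and skips segments whose Use in Report value is FALSE, such as the deleted segment described above. Only the Use in Report column name comes from this guide. The other column names, the sample rows, and the function name are hypothetical, and a real export may contain additional header or metadata lines that this sketch does not handle.

```python
import csv
import io

# Hypothetical stand-in for an exported Segment Table TXT file.
SAMPLE_TXT = (
    "Type\tChromosome\tMin\tMax\tUse in Report\n"
    "Gain\t1\t1000\t5000\tTRUE\n"
    "Loss\t7\t200\t900\tFALSE\n"
)


def reportable_rows(handle):
    """Return the rows whose 'Use in Report' value is not FALSE."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row for row in reader
        if row["Use in Report"].strip().upper() != "FALSE"
    ]


rows = reportable_rows(io.StringIO(SAMPLE_TXT))
for row in rows:
    print(row["Type"], row["Chromosome"], row["Min"], row["Max"])
```

To read a real export, replace the `io.StringIO` object with a file opened in text mode, for example `open(path, newline="")`.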
Transfer to You can copy data from selected cells to the clipboard for pasting into a text or
clipboard spreadsheet file.
1. Select the cells you want to copy in the table. (Figure 471)
IMPORTANT! The results from ChAS are for Research Use Only. Not for use in diagnostic
procedures.
Adding a new 1. When exporting a table or graph as a PDF, click the Add to Existing Export
PDF export to an check box in the Export Details window. (Figure 473)
existing PDF file
Figure 473 Export Details window with Add to Existing Export selected
4. Click Yes in the Confirm Rewrite notice to append the data in the selected file.
5. Click OK in the Select Columns and File or Export Details window.
The new report data is combined with the existing report.
3. Select the PDF files to combine, then click Open in the Select PDF Files window.
The selected files are displayed in the Select Input Files list.
You can use the Remove File button to remove a selected input file.
Click and drag on a file in the list to change the order of data in the Combined
PDF.
4. Click the Select Output File button.
5. Enter a file name for the combined PDF file, then click Save in the Save As
window.
You are returned to the Combine PDF Exports window.
6. Click OK.
Your selected PDFs are combined.
IMPORTANT! ChAS is a research use only application and any submission to ClinVar is the
responsibility of the submitter.
All fields exported into the ClinVar submission template are selected and defined by the user.
Standard ClinVar nomenclature is provided for required submission fields and can be customized
as shown in "Adding vocabulary content" on page 435.
3. In ClinVar Submission Info pane, click the New button to create a new
submission profile.
An Edit Profile window appears.
4. Complete all the appropriate fields. Fields with an * are required by ClinVar.
5. Click OK to save the profile.
6. Optional: To create additional ClinVar profiles, repeat steps 3-5.
Deleting a profile 1. Highlight the profile name in the ClinVar Submission Info list box you want to
delete.
2. Click the Delete button.
3. Click OK to confirm the profile deletion, or click Cancel to keep the profile.
Adding By default, recommended ClinVar vocabularies are stored for certain required fields,
vocabulary but additional terms can be added to any field.
content 1. Click Preferences → Edit User Configuration.
The User Configuration window appears.
2. Click the Exports tab, then click the ClinVar tab.
3. Select a category you want to add a term(s) to, then use the text field (Figure 479)
to enter the additional term(s).
4. From the ClinVar vocabularies drop-down, select the category that you want to
add a term(s) to.
5. Use the text field (at the bottom) to enter the additional term(s).
6. Click the Add button to add the term to the category’s list.
Removing 1. From the ClinVar vocabularies drop-down list, select a category that contains the
vocabulary term you want to remove.
content 2. Highlight the term, then click Remove.
Exporting in There are certain fields that are required before uploading to ClinVar. It is
ClinVar format recommended you use the ClinVar Table State in the Segments Table to expose all
required fields. To use the ClinVar Table State, refer to "Saved table states" on page
332.
1. In the Segments Table, apply the ClinVar Table State.
2. Select the segments to be exported using the Use in Export check box.
3. Fill in all columns for the selected segments, as all columns in the ClinVar Table
State are required before you can upload to ClinVar.
4. In the Exports Menu, select ClinVar Export.
A browse window appears.
5. Select a location to save the export, then use the File Name text box to name the
export, then click OK.
A Submission Info/Segment Data window appears.
6. Select the ClinVar Profile you want to use for this export.
Note: You can open the ClinVar export in Excel to add information to the optional
fields. Opening the ClinVar export from ChAS auto-populates all currently required
ClinVar fields.
Table 20 Variant tab: Auto-populated columns into the ClinVar submission template (all other optional columns are blank upon
export).
Date last evaluated Uses date of export unless otherwise specified on the Submission Info window before
exporting.
Table 21 Case Data tab: Auto-populated columns into the ClinVar submission template (all other optional columns are blank
upon export).
Deletion Logs can also be exported from the ChAS DB Tools Maintenance Page.
See "Downloading deletion logs" on page 454.
Types of settings
ChAS provides two ways to store setup information: User profiles and Named
settings. Each works differently and performs different functions.
User profiles A ChAS Browser User Profile stores your selections for various display settings as
they were when the software was last shut down while using that user profile.
A new user profile can be created or selected only when starting the software.
The user profile saves the following display settings:
Screen size, displayed tabs, and sizing of display areas
The views displayed in ChAS, and the size of the display panes
Available Named Settings: Different users can have different lists of named
settings to choose from
Name of the currently selected named setting
Copies of the user’s custom (not shared) named settings
Data Display Configurations
Region information files selected for CytoRegions and Overlap Map
Which types of graph and segment data are turned on or off
Display options for graph data (height, grid, values, etc.)
Chromosome and area displayed.
Selected Reference Annotation database (ChAS Browser NetAffx Genomic
Annotations file)
Loaded AED and BED files
The files and Reference Annotations (Genes, DGV, etc.) that are checked or
unchecked
Custom color rules
OncoScan Defaults 0 0 0
Deleting a user 1. Go to the Windows folder where the user profiles are stored and delete the folder
profile with the profile name you want to delete.
You can see the location of the folder in the About window, as described in
"Analysis file locations in Windows 10" on page 26.
3. Enter a name for the setting you want to create, then click OK.
The setting is saved and appears in the Named Setting drop-down list.
(Figure 486)
Note: The Named Setting saves the settings at the time it was created. Subsequent
changes to the settings will not be saved in the Named Setting.
Selecting a 1. From the Named Setting drop-down list, select the setting. (Figure 487)
named setting
Figure 487 Named Setting drop-
down list
2. Select the Named Setting from the drop-down list, then click OK.
The selected setting is applied. Note: A Named Setting is not modified by any
changes that you make to the settings in ChAS. If you want to keep a copy of your
new settings, you will need to save them as a new Named Setting.
Deleting a named 1. From the Preferences menu, select Delete Named Setting.
setting The Delete Setting window opens. (Figure 489) Note: Shared Named Settings
(the icon in the Named Setting list) do not appear in the Delete Setting drop-
down list. Users cannot delete or modify a shared Named Setting.
2. Select the setting you want to delete from the drop-down list, then click OK.
The setting is deleted.
2. Use the navigation features of the window to select or create a directory for the
preferences. Note: The software creates a folder named “preferences” in the
directory you select or create. If you select a directory that already contains a
“preferences” folder, it will be overwritten. When you want to import the preferences,
select the directory that contains the “preferences” folder that is indicated by the
icon.
4. Click Yes to export the preference files to the directory that you selected.
You can then transfer the preferences to another user profile or system.
2. Use the navigation features of the window to select the directory that the
preferences were exported to (directories that contain a “preferences” folder are
indicated by the icon.)
3. Click Open to import the preference files.
If you selected a directory that doesn’t contain the “preferences” folder, the
following notice appears. (Figure 493)
Note: The imported preferences will not be applied until you restart ChAS.
Importing External websites may update their links from time to time. To remedy this, a feature
hyperlinks has been added to update an outdated link(s) within ChAS.
1. Click Preferences → Import Hyperlinks.
The Load Hyperlinks Configuration window opens.
2. Navigate to your updated hyperlinks (.chaslink) file, click to highlight it, then click
Select File.
3. Restart the ChAS Browser to apply the link update(s).
Note: For .chaslink file help, contact Technical Support.
Configuring the The HTTP service may enable external applications to interact with the ChAS Browser.
HTTP service By default, this service is off. Please contact Technical Support before activating this
feature.
1. Click Preferences → Configure HTTP Service.
The Configure HTTP Service window opens.
2. Click the check box to enable the service, then enter the designated Port
number.
3. Close the window.
The service and Port are now activated.
4. Click OK.
IMPORTANT! The ChAS Server Home Page requires an active Internet connection and a
browser (Chrome and Internet Explorer v11 are recommended). However, if you are using the local
ChAS DB, an active Internet connection is not required.
2. Log in using the installer’s factory default Username: admin and Password:
admin. After logging in, it is recommended new users go to "Administration" on
page 456 to create a New User(s) and/or edit User(s) roles.
Note: Make sure you look in the URL field to identify which ChAS database the
ChAS Database Tools is accessing.
The ChAS DB Home Page appears. (Figure 499)
Status page
Use this page to view how many samples and segments are in the Database.
(Figure 500)
Maintenance
Use this page to perform a backup, restore, and database clean up. (Figure 501)
IMPORTANT! It is strongly recommended that you perform scheduled routine backups of the
database.
Note: The ChAS installer creates a disabled Windows Task that automatically backs up
the ChASDB database on a weekly basis once it is enabled.
Restoring a Note: Restoring a backup file created in ChAS 3.0/3.1/3.2/3.3/4.0 automatically upgrades
database it to ChAS 4.1.
1. Click Restore Database, then click the Choose File button. (Figure 503)
An Explorer window opens.
2. Navigate to the location where your ChAS DB was last backed up, then click Open.
By default, a backup of your current database is stored/resides here:
\\Affymetrix\ChAS\PostgreSQL\Backups
3. Click Restore to start the restore process.
IMPORTANT! After you click the Restore button, do not leave this page until you see the
message that the database has been successfully restored. Also, after the restore process has
successfully completed, you must click ChAS DB → Refresh ChAS DB Data to view the data in the
database using the ChAS Browser.
When merging the segments from two databases, if a duplicate entry is found then the
merge keeps the entry for the database currently active in ChAS. The duplicate from the
backup.db is skipped.
Merging two
ChAS databases
IMPORTANT! The two ChAS databases to be merged must be from the same version of ChAS.
Also, the library files for CytoScan HD and OncoScan CNV Plus must be present in your Library
folder before merging the two databases.
Make sure one of the ChAS databases is restored into ChAS (for details on how to restore
a ChAS DB, see "Restoring a database" on page 451). Since any duplicate segments
between the databases will keep the copy from the actively restored database, make sure
the database with the more complete content is the one that is actively restored in ChAS.
1. From the ChAS browser, go to ChAS DB → Database Tools.
If prompted, log into the ChAS database as you normally would.
Merging an older If you want to merge a database (from an older version of ChAS) with a current ChAS 4.0
ChAS database database, perform these steps:
1. Back up your current ChAS 4.0 database.
2. Restore the database from (e.g.) ChAS 3.1. See "Restoring a database" on page 451.
During the restore process, the older ChAS database is automatically upgraded and
is now compatible with ChAS 4.0.
3. Back up the 3.1 database you just restored.
4. Restore the ChAS 4.0 database you backed up in Step 1.
Note: If the databases to be merged contain duplicate entries, the copy that is in the
currently restored database will be kept. The entry from the backup.db that is being
merged will be skipped.
5. From the ChAS browser, go to ChAS DB → Database Tools.
If prompted, log into the ChAS database as you normally would.
6. Click on the Maintenance link.
7. Click on the Merge database link.
8. Click the Browse button to navigate to the backup.db file you want to merge with
the current database.
9. Click Merge.
Cleaning up a ChAS DB will automatically run re-indexing scripts to maintain optimal performance. You
database can also run these scripts manually if desired.
Note: You must have a Manager or an Admin role to clean up a database.
1. Click the Clean up database button (Figure 504) to run the Vacuum Analyze and
Reindex Database optimization process.
4. Check the box to have a backup of the current ChAS DB before the database is
deleted and an empty DB is created.
5. Click Delete all data and reset database.
Deidentifying Files
Deidentifying files removes potentially sensitive information from the ChAS DB. When you
run Deidentification, the file names stored in the ChAS DB are replaced by an
alphanumeric ID.
Note: If you have included sensitive information in custom database fields, this
process will not remove that information.
1. Click on De-identify Files.
2. (Optional) Create a backup of the database to preserve the original content, then
click the Start De-identification button to replace the file names in your ChAS
Database.
Administration
Note: You must have an Admin role to perform Administration functions. Log files for
the ChAS database can be found in: \ProgramData\Affymetrix\ChAS\Log
1. Click Administration.
The following window appears: (Figure 506)
Using a shared
ChAS database
while off-line
IMPORTANT! If your Windows Firewall is enabled during the installation of ChAS and you want
to Backup the ChAS Database and Restore it to your local ChAS DB, a message may appear
indicating that you cannot connect to the shared folder. If this message appears, contact your IT
department for help in allowing file sharing through the Windows Firewall.
IMPORTANT! You must have a Manager or Admin role and make sure you log back into the local
host before restoring your computer. While performing a restore from a backup of a shared server,
the roles associated with the shared server are displayed, therefore any roles that were created on
the local server are replaced by those used on the shared server (until local host ChAS DB is
restored again).
Publishing data you have analyzed in off-line mode to the shared ChAS DB
server
1. Log back into the shared ChAS DB server. To do this, click Preferences → Edit
Application Configuration, click the ChAS DB server tab, then enter the server name/IP
address.
2. Click OK.
Note: After you are logged into the shared ChAS DB server, publish the samples to
the database as you normally would. See "Publishing data to the database" on page
406. If an xxCHP file has been previously published to the database, you will receive a
warning indicating this sample already exists in the database. You can choose to
overwrite the existing information or cancel to keep the existing information.
3. Click Start.
Note: Depending on the size of the ChAS DB.backup to be remapped, this
process may take several minutes.
The ChAS Remapper window appears. (Figure 509)
The following files are generated in the same folder as the original ChAS DB that you
selected to remap:
ChAS DB.hg38.backup - this backup can be restored as the active ChAS DB for
querying within the Browser.
A TXT file listing all original segments - provides a text file of the original segment
information and the remapped segment information along with success or fail
criteria.
A TXT file listing the segments that failed to remap to hg38 - provides a text file of
the original segments that did not remap to the new genome build.
Starting CDL
1. Click ChASDB → ChAS Database Loader
The ChAS Database Loader window appears. (Figure 510)
By default, the Files of Type is set to All Supported Types. (Figure 511) If you
want to view a specific supported file type, click the drop-down, then select the
file extension you want to display.
2. Single-click a file, or use Ctrl+click, Shift+click, or Ctrl+A to select multiple files.
3. Click Open.
Your selected files now appear in CDL’s main window. (Figure 514)
Repeat steps 1-3 if you want to add files (up to 500) from different saved
locations.
2. Single-click a file, or use Ctrl+click, Shift+click, or Ctrl+A to select multiple files.
3. Click Open Selected Files.
Your selected files now appear in CDL’s main window.
Note: A special icon is used to indicate a CHPCAR or “sidecar” file, as shown in
Figure 514. For more information on sidecar files, see "Editing segment data
overview" on page 223.
IMPORTANT! File level properties are optional fields that are available to CHP files analyzed in
ChAS 3.0 or higher. Any file level properties entered are stored in the CHPCAR file and do not
populate in the CDL window. However, if those properties were entered for a CHP file and
are contained in the CHPCAR file, they will be published to the database. Entries in a CHPCAR file
supersede entries in the text file. File level properties for your xxCHP files can also be added directly to
the database after publishing has completed. For more details, see "Interacting with the ChAS
database" on page 390.
IMPORTANT! Before you use CDL to publish your files, it is highly recommended that you back up
your ChAS database first. For instructions on how to access and back up your database, refer to
Chapter 21, "Database tools" on page 447. Also, xxCHP files can ONLY be published to a ChAS
DB of the same genome version assignment.
Testing your Before publishing, you may want to test your ChAS database connection. To do this:
connection 1. Click ChAS DB → ChAS Database Loader
(optional)
2. Click Tools → Test Connection
A message appears if there is a successful connection to the ChAS database.
Verifying the 1. From the ChAS Browser, click the Preferences drop-down menu, then select
ChAS database Edit Application Configuration…
The Configuration window appears.
2. Click the Server tab.
The Server window/tab appears. (Figure 516)
3. Verify the ChAS DB you are publishing to is correct, then click OK.
Before publishing Before publishing files, you must check the Genome dialog box (Figure 517) to make sure
files your desired filter settings and data types are enabled.
Note: QC thresholds and Smooth/Joining settings in the ChAS Browser will be used when
publishing xxCHP files using CDL. To use different QC thresholds and/or Smoothing and
Joining settings, see "Setting QC parameters in the ChAS browser" on page 133.
Changing 1. Click on the Filter icon (or click Files → Segments Filters).
segment filters The Segments Filters window opens. (Figure 518)
(optional)
2. Update the appropriate filters using the provided check boxes, text boxes and
sliders.
3. Click X to save your changes and close the window.
Managing data 1. Use the Segments Filters window (Figure 518) to click the check box of the data
types (optional) type(s) you want to publish.
2. Click X to save your changes and close the window.
3. Review the Genome dialog window (Figure 517) again to make sure your data
types to be published are displayed.
IMPORTANT! You must have ChAS DB Manager or Admin privileges before you can publish. For
information on setting up ChAS DB role assignments, see "Administration" on page 456.
2. Click .
A Publish? window appears. (Figure 520)
3. Acknowledge the message, click to check its check box, then click OK.
4. An Overwrite? message may appear. (Figure 521) Click the appropriate button
to continue.
To pause the publishing process, click . While in pause mode, you can
add more files to the table, as described in "Adding files to CDL" on page 462.
After the publishing process is complete, each Status column is marked with a result
icon.
= The file was skipped over and not published, because it was already found
in the database or it did not meet the assigned QC thresholds.
Note: Refer to the table’s Status Message column (Figure 522) for details regarding a
skipped or failed file.
Clear Published After publishing, click Clear → Clear Published to remove all files that successfully
published.
After clicking Clear Published, files with a Skipped or Failed status remain in the
table. Click Export properties... to export these files for further investigation.
Clear All Click Clear → Clear All to remove all files from the table.
Closing CDL
1. Click Close CDL or click File → Close CDL.
Figure 524 AED file in Excel with required header fields for properties and metadata
Header Row
Metadata
Annotations
IMPORTANT! AED supports only Unicode, which can be stored in one of various encodings
(charsets such as UTF-8, UTF-16LE, and UTF-16BE). The AED file indicates the charset with an
initial Byte Order mark (BOM). An AED file with no initial BOM is not recommended. An AED file
that does not begin with a BOM will be interpreted as containing only the ASCII subset of Unicode,
resulting in an error if any characters lie outside the range of ASCII. (With no indication of a charset,
it is not possible to determine which non-ASCII characters were intended. File formats such as
BED that make assumptions about non-ASCII characters have the potential to corrupt data when
transported between systems.)
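The BOM rule above can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not the ChAS implementation; the function name decode_aed_bytes is hypothetical.

```python
import codecs

# Charsets named in this guide, paired with their byte order marks.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_aed_bytes(data: bytes) -> str:
    """Decode raw AED file bytes according to the initial BOM (hypothetical helper)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Strip the BOM, then decode with the charset it indicates.
            return data[len(bom):].decode(encoding)
    # No BOM: only the ASCII subset is accepted, so any non-ASCII byte
    # raises UnicodeDecodeError.
    return data.decode("ascii")
```

A file without a BOM that contains only ASCII characters decodes normally; one that contains a non-ASCII character fails, which mirrors the error described above.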
"Header row": Names the properties that can be used in the annotations
"Metadata records" on page 475 Optional: Provides information about the AED
file itself and the group of annotations it contains.
"Annotations Rows" on page 476: The annotation row displays values for the
properties listed in the header rows (for each feature that is annotated).
Header row The header row of an AED file is a tab-delimited list of the properties that can be used
to describe a region of the genome.
Each AED file header represents a property. Normal records in the file represent
annotations, and the record fields represent annotation properties. Special metadata
records represent metadata properties for the file as a whole, rather than for a
particular annotation.
A property name has the following format:
namespacePrefix:propertyIdentifier(namespacePrefix:TypeIdentifier)
namespacePrefix
• A namespacePrefix is optional. It assigns the property or type to a vocabulary
grouping called a namespace. The lack of a namespace prefix indicates that the
property has been created by a user and is not part of the formal AED
specification.
The lack of a namespacePrefix indicates that the property is in the default/
custom namespace; this namespace enables users to add properties to an
annotation just by adding new columns, such as foo(aed:String) or
bar(aed:Integer).
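As an illustration of the property name format, the following Python sketch splits a header field into its parts. The function and class names are hypothetical, and the set of characters allowed in identifiers is an assumption (anything except colons, parentheses, and whitespace), since this guide does not define it.

```python
import re
from typing import NamedTuple, Optional


class PropertyName(NamedTuple):
    prefix: Optional[str]        # None means the default/custom namespace
    identifier: str
    type_prefix: Optional[str]
    type_identifier: str


# Assumed grammar: [prefix:]identifier([prefix:]TypeIdentifier)
_FIELD = re.compile(
    r"^(?:(?P<prefix>[^:()\s]+):)?(?P<identifier>[^:()\s]+)"
    r"\((?:(?P<type_prefix>[^:()\s]+):)?(?P<type_identifier>[^:()\s]+)\)$"
)


def parse_property_name(field: str) -> PropertyName:
    """Split one AED header field into namespace prefix, name, and type."""
    match = _FIELD.match(field.strip())
    if match is None:
        raise ValueError(f"not a valid AED property name: {field!r}")
    return PropertyName(**match.groupdict())
```

For example, bio:start(aed:Integer) yields the prefix bio and the type Integer, while foo(aed:String) has no prefix and therefore falls in the default/custom namespace.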
IMPORTANT! ChAS verifies the property types when importing an AED file. If a file header
specifies a known property, but includes an incorrect data type for the property, the file will not be
loaded. For example, “fish:score” is a known property with “whole number” data type. An AED file
header that specifies “fish:score(aed:String)” would be treated as an error.
Metadata records Some records, instead of providing annotation about a location on a genome assembly,
provide metadata information about the AED file itself (Figure 525). These metadata
records are identified by the presence of an empty string in the bio:sequence field. The
bio:start and bio:end fields must also be left blank for metadata records. If there are
metadata records present, the aed:value field is required.
In a metadata record, the value in the aed:name field is interpreted as the name of the
metadata property, with type identification rules identical to those of the header fields. The
value in the aed:value field is interpreted as the value of the metadata property, and the
characters that make up its string value must follow the lexical and semantic rules
specified by the type indicated in the aed:name field.
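The rules for metadata records can be expressed as a short check. The sketch below assumes each record has already been read into a dictionary keyed by property name (for example "bio:sequence"); that representation and the function names are hypothetical, not part of ChAS.

```python
def is_metadata_record(record):
    """A record is metadata when its bio:sequence field is the empty string."""
    return record.get("bio:sequence", "") == ""


def metadata_problems(record):
    """Return a list of rule violations for a metadata record (empty if none)."""
    problems = []
    if not is_metadata_record(record):
        return problems
    # bio:start and bio:end must be left blank for metadata records.
    for field in ("bio:start", "bio:end"):
        if record.get(field, "") != "":
            problems.append(f"{field} must be blank in a metadata record")
    # The aed:value field is required when metadata records are present.
    if "aed:value" not in record:
        problems.append("aed:value is required when metadata records are present")
    return problems
```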
Namespaces The name of each type and property in AED is considered part of a vocabulary
grouping called a namespace. Namespaces prevent clashes between names defined
by disparate parties, as well as unambiguously identify commonly used types and
properties so that identical semantics may be assured. A namespace is identified by
a Uniform Resource Identifier (URI) as defined in RFC 3986.
A type or property identifies its namespace by a namespace prefix followed by a colon
character. If no namespace prefix is present, the type or property is considered part
of the AED default namespace. The part of the type or property after the namespace
prefix is considered its simple name.
AED has several built-in namespaces, with predefined namespace URIs and prefixes:
Table 22 Namespaces
If any other namespace is used in an AED file, it must be declared in the metadata
section of the file using the special namespace prefix. The simple name of the
metadata header indicates the prefix being declared, and the value (of type aed:URI)
indicates the namespace to be associated with the prefix. For example, to associate
the prefix “example” with the URI http://example.com/namespace/, and the prefix “affx”
with the URI http://affymetrix.com/ontology/, then use the following metadata record:
namespace:example(aed:URI) http://example.com/namespace/
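A reader of AED files could gather these declarations as follows. This is a minimal sketch that assumes the metadata records are available as (aed:name, aed:value) pairs; the function name is hypothetical.

```python
def collect_namespace_declarations(metadata_records):
    """Map each declared prefix to its namespace URI.

    metadata_records is an iterable of (name, value) pairs taken from the
    aed:name and aed:value fields of the metadata records.
    """
    declared = {}
    for name, value in metadata_records:
        if name.startswith("namespace:"):
            # The simple name (before the type in parentheses) is the prefix.
            simple_name = name[len("namespace:"):]
            prefix = simple_name.split("(", 1)[0]
            declared[prefix] = value
    return declared
```

Records whose name does not use the special namespace prefix, such as aed:application(aed:String), are ignored by this helper.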
AFFX properties
This section describes the properties defined by the AED specification. By convention,
property names begin with lowercase letters. A predefined property is only required if
indicated. Some properties are only useful as metadata, and these are so indicated.
AED properties
These properties are parts of the AED namespace.
aed:application aed:String (Metadata) The name of the application that produced the AED file, if metadata, or
the annotation.
aed:category aed:String Identifies the group and optionally subgroups into which the resource is classified.
Subcategories, if any, should be delimited using the forward slash character '/'
(U+002F) with no whitespace. (Example: copynumber/gain)
aed:created aed:DateTime (Metadata) The point in time the data was created; this is not necessarily the time the
file was created.
aed:counter aed:Integer A general purpose field to be incremented when user-defined circumstances occur.
A common use for this field is to indicate the number of samples in which the
condition has been observed.
aed:modified aed:DateTime (Metadata) The point in time the data was modified; this is not necessarily the time
the file was modified.
aed:value aed:String (Required only if metadata records are present.) In a metadata record, this value is
interpreted as the value of the metadata property. An AED processor must ignore
this field for all non-metadata records.
aed:uuid aed:UUID (Metadata) The universally unique identifier of the resource. Although allowed, it is
not always advised to identify user-editable resources such as AED documents with
UUIDs, as copying and manually editing such resources can result in multiple such
resources with identical UUIDs, negating the purpose of UUIDs.
bio:assembly aed:URI (Metadata) A URI indicating the genome assembly used. Currently the DAS
GlobalSeqIDs are recommended.
bio:confidence aed:Rational A value between 0.0 and 1.0, inclusive, indicating the confidence that an annotation
call is accurate.
bio:end aed:Integer (Required) The zero-based ending position, exclusive, of the record along the
sequence.
bio:markerCount aed:Integer The number of markers such as probes that intersect an annotation.
bio:sequence aed:String (Required) The name of the chromosome (e.g. chr3, chrY), contig (e.g. ctg5), or
scaffold (e.g. scaffold90210).
The special value of an empty string (“”) indicates that the record is a metadata
record, giving special meaning to values in other fields in the record.
bio:start aed:Integer (Required) The zero-based starting position, inclusive, of the record along the
sequence.
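Because bio:start is zero-based and inclusive while bio:end is zero-based and exclusive, the length of a record is simply the difference between the two. The following Python sketch shows the arithmetic, along with a conversion to the one-based, inclusive coordinates that some other tools display. The function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Number of bases covered by a zero-based, half-open interval."""
    return end - start


def to_one_based_inclusive(start: int, end: int):
    """Convert zero-based [start, end) to one-based [first, last]."""
    return start + 1, end
```

For example, a record with bio:start 100 and bio:end 200 covers 100 bases, which correspond to positions 101 through 200 in one-based, inclusive numbering.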
Style properties
Style properties are used to control the display of the annotation.
style:color aed:Color The color to be used when visually depicting the annotation.
Table 27 Types
Type Description Lexical Form Examples
Compatibility
UCSC Browser The BED file format, developed at UCSC, is widely used for transfer of simple region
Extensible Data coordinates. However, the format has been interpreted and implemented in multiple
(BED) ways by various software within and outside of UCSC. Some implementations require
a TAB delimited format, others require a space-delimited format, and still others
accept both. Characters outside of the ASCII character set are not well supported. We
created the AED format with very strict and explicit definitions so as to avoid some of
these compatibility issues.
Although the AED format is preferred, ChAS supports both the import and export of
data in BED format. When exporting data in BED format, ChAS exports only the basic
4-column tab-delimited BED format containing the position and name of each item. If
the names of any of your items contain spaces or non-ASCII characters, there is no
guarantee that all programs will be able to interpret those names correctly.
When importing data in BED format, ChAS supports the reading of BED files with
anywhere from 4 to 12 columns.
The file must be TAB delimited
Only ASCII characters should be used
The values for thickStart and thickEnd will be ignored for display purposes
The value for itemRgb will be honored for display purposes
The values for blockCount, blockSizes and blockStarts can be used to import
and display data with intron/exon structure, such as genes.
Formatting rules in the BED header are ignored
BED files containing multiple tracks are not supported; use a separate BED file
for each track.
The UCSC Browser, as well as ChAS, uses the strict definition of BED where
chromStart is not allowed to be greater than chromEnd. ChAS will accept import of
BED files even if this convention is violated, but will auto-correct and export BED files
properly with chromStart < chromEnd.
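The import and export behavior described above can be sketched in Python. The sketch assumes that the auto-correction simply swaps chromStart and chromEnd when they are reversed; this guide does not state how ChAS performs the correction, and the function name is hypothetical.

```python
def to_basic_bed_line(line: str) -> str:
    """Reduce a 4 to 12 column, TAB-delimited BED line to the basic 4 columns."""
    fields = line.rstrip("\r\n").split("\t")
    if not 4 <= len(fields) <= 12:
        raise ValueError(f"expected 4 to 12 TAB-delimited columns, got {len(fields)}")
    chrom, name = fields[0], fields[3]
    start, end = int(fields[1]), int(fields[2])
    if start > end:
        # Assumed correction: swap the reversed coordinates.
        start, end = end, start
    return "\t".join([chrom, str(start), str(end), name])
```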
AED has been structured to facilitate as much as possible migration of data rows to
and from BED. Starting with existing AED and BED files, data records from AED may
be transferred to BED by using:
The “Export” function from inside ChAS (recommended)
A text editor (not recommended) if the AED files are first prepared in the following
manner:
• Remove all fields except for bio:sequence, bio:start, bio:end, and aed:name.
• Ensure that no non-ASCII characters are included. (The treatment of non-ASCII
characters by a BED processor is undefined.)
• Ensure that no name contains whitespace characters.
• Data rows from the first four columns of a BED file can be transferred to AED with
no constraints as long as the columns are delimited by TAB.
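The preparation steps for a manual AED to BED transfer can also be scripted. The Python sketch below is an illustration only (the Export function inside ChAS remains the recommended route). It assumes the annotation records are already available as dictionaries keyed by property name, and the function name is hypothetical.

```python
def aed_records_to_bed_lines(records):
    """Build basic 4-column BED lines from AED annotation records."""
    lines = []
    for record in records:
        sequence = record.get("bio:sequence", "")
        if sequence == "":
            continue  # metadata records have no BED equivalent
        name = record["aed:name"]
        if not (sequence.isascii() and name.isascii()):
            raise ValueError(f"non-ASCII characters in record {name!r}")
        if any(ch.isspace() for ch in name):
            raise ValueError(f"name contains whitespace: {name!r}")
        # Keep only bio:sequence, bio:start, bio:end, and aed:name.
        lines.append("\t".join(
            [sequence, str(record["bio:start"]), str(record["bio:end"]), name]
        ))
    return lines
```

No coordinate conversion is needed in this sketch, because AED positions are zero-based with an exclusive end, which matches the BED convention.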
Microsoft Though not recommended, an AED file may be edited by any text editor that supports
Notepad and Unicode and that uses a byte order mark (BOM) to indicate the charset. The version
other Text editors of Microsoft Notepad in Windows XP, for example, will both correctly read text files
marked with a BOM and save text files using the appropriate BOM if the following rules
are followed:
When saving an AED file from Microsoft Notepad, make sure the encoding is set to
“UTF-8” or “Unicode”.
For other text editors, make sure the correct preferences are set both to recognize and
write BOMs for files.
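When an AED file is written by a script rather than a text editor, the same BOM requirement applies. As a minimal sketch, Python's standard "utf-8-sig" codec writes a UTF-8 BOM at the start of the file. The function names and the sample header text are hypothetical.

```python
import codecs
import os
import tempfile


def save_text_with_bom(path: str, text: str) -> None:
    """Write text as UTF-8 with an initial byte order mark."""
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(text)


def first_bytes_after_save(text: str) -> bytes:
    """Save text to a temporary file and return its first three bytes."""
    fd, path = tempfile.mkstemp(suffix=".aed")
    os.close(fd)
    try:
        save_text_with_bom(path, text)
        with open(path, "rb") as handle:
            return handle.read(3)
    finally:
        os.remove(path)
```

The second helper exists only to confirm that the saved file begins with the UTF-8 BOM (the bytes EF BB BF).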
Text Editors
EmEditor <http://www.emeditor.com/> is a commercial text editor that has extremely
good Unicode and BOM support, and is able to open up gigantic text files.
PSPad <http://www.pspad.com/> is a free text editor that has particularly extensive
Unicode and BOM support and is available in many localizations.
UniPad <http://www.unipad.org/> is a shareware text editor that correctly handles
Unicode and BOM, and provides a wide range of built-in glyphs for representing
Unicode code points that cannot be viewed on most other text editors.
References
ISO 8601: ISO 8601:2004(E): Data elements and interchange formats — Information
interchange — Representation of dates and times. International Organization for
Standardization, 2004-12-01.
Microsoft Byte Order Mark: http://msdn.microsoft.com/en-us/library/ms776429(VS.85).aspx
RFC 3986: RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. T. Berners-
Lee, R. Fielding, and L. Masinter. Internet Engineering Task Force, 2005.
http://tools.ietf.org/html/rfc3986
RFC 4122: RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace. P.
Leach, M. Mealling, and R. Salz. Internet Engineering Task Force, 2005.
http://tools.ietf.org/html/rfc4122
RFC 4180: RFC 4180: Common Format and MIME Type for Comma-Separated Values
(CSV) Files. Y. Shafranovich. Internet Engineering Task Force, 2005.
http://tools.ietf.org/html/rfc4180
Unicode Byte Order Mark FAQ: http://unicode.org/faq/utf_bom.html
Standard AED Because every user-accessible property within ChAS complies with the AED
property style framework, any property may be entered in standard prefix:simpleName style, just
as it would appear in an AED file. For example, the creation date of an entity may be
entered using the aed:created property name.
The predefined AED prefixes defined for AED files may always be used. Unlike AED
files, which allow declaration of arbitrary prefixes with additional namespaces, ChAS
has an additional list of predefined namespace prefix associations valid only within the
context of the ChAS user interface.
These prefixes may be used to refer to the corresponding namespaces with no explicit
namespace declaration. For example, the fish prefix may be used to refer to FISH
namespace properties (e.g. fish:labs) with no need to explicitly associate the FISH
namespace URI with the namespace.
Table 28 lists the namespace prefixes recognized within the ChAS user interface.
Header name ChAS uses the following namespaces when converting CHP header names based on
conversion the header name prefix, as shown in Table 29.
affymetrix-algorithm- http://affymetrix.com/ontology/algorithm/
affymetrix-algorithm-param- http://affymetrix.com/ontology/algorithm/
affymetrix-algorithm-param-option- http://affymetrix.com/ontology/algorithm/option/
affymetrix-algorithm-param-state- http://affymetrix.com/ontology/algorithm/state/
affymetrix-chipsummary- http://affymetrix.com/ontology/chp/summary/
After determining the appropriate namespace to use, ChAS removes the header name
prefix and modifies the remaining characters according to the following rules (simplified):
1. All beginning uppercase letters are converted to lowercase.
2. All separator characters (such as '-' and '_') are removed.
3. The characters immediately following separators are converted to uppercase.
For example, the CHP header
affymetrix-algorithm-param-option-gender-override-file is
converted to optionGenderOverrideFile and placed in the
http://affymetrix.com/ontology/algorithm/ namespace.
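The simplified renaming rules above can be sketched in Python as follows. This is an illustrative sketch only, not the ChAS implementation: the function name convert_header_name is hypothetical, the header name prefix is passed in explicitly rather than looked up in a table, and only '-' and '_' are assumed to be separators.

```python
import re


def convert_header_name(header: str, prefix: str) -> str:
    """Strip a header name prefix and camel-case the remainder (simplified rules)."""
    if not header.startswith(prefix):
        raise ValueError(f"{header!r} does not start with {prefix!r}")
    remainder = header[len(prefix):]

    # Rule 2: separators are removed; split on them and drop empty pieces.
    parts = [p for p in re.split(r"[-_]", remainder) if p]
    if not parts:
        return ""

    # Rule 1: the leading run of uppercase letters is converted to lowercase.
    first = parts[0]
    leading = re.match(r"[A-Z]+", first)
    if leading:
        first = first[:leading.end()].lower() + first[leading.end():]

    # Rule 3: the character following each separator is converted to uppercase.
    return first + "".join(p[0].upper() + p[1:] for p in parts[1:])


if __name__ == "__main__":
    print(convert_header_name(
        "affymetrix-algorithm-param-option-gender-override-file",
        "affymetrix-algorithm-param-",
    ))
```

Run with the prefix affymetrix-algorithm-param-, the sketch reproduces the optionGenderOverrideFile name from the example above.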
Converted properties
ChAS performs further special conversions on the following header parameters for
historical and consistency reasons, as shown in Table 30.
Header parameter                   Converted property
affymetrix-array-type              http://affymetrix.com/ontology/arrayType
affymetrix-chipsummary-snp-qc      http://affymetrix.com/ontology/chp/summary/snpQC
affymetrix-chipsummary-MAPD        http://affymetrix.com/ontology/chp/summary/mapd
Derived properties
The following properties are each assigned to a file property derived from one or more
header parameters, xxCHP file attributes, or other information, in the given order:
http://affymetrix.com/ontology/aed/created (aed:created)
  1. The file creation time in the Calvin generic data header.
  2. The create_date header parameter.
  3. The create-date header parameter.
http://affymetrix.com/ontology/aed/modified (aed:modified)
  1. File system last modified date.
http://affymetrix.com/ontology/algorithm/annotationFile (alg:annotationFile)
  1. The affymetrix-algorithm-param-state-annotation-file header parameter.
  2. The affymetrix-algorithm-param-cn-annotation-file header parameter.
  3. The affymetrix-algorithm-param-mapfile header parameter.
  4. The affymetrix-algorithm-param-option-annotation-file header parameter.
http://affymetrix.com/ontology/algorithm/parameterFile (alg:parameterFile)
  1. The affymetrix-algorithm-param-state-config-file header parameter.
  2. The affymetrix-algorithm-param-config-file header parameter.
IMPORTANT! Starting with ChAS v3.3, the naming convention of the NetAffx Genomic
Annotation database file changed. The file name now includes a date (as opposed to a specific
NetAffx build number), and the file is updated more regularly than once per
software release. The naming convention is as follows:
NetAffxGenomicAnnotations.Homo_sapiens.hg38.naYYYYMMDD.db
For optimal use of the Segment Prioritization methods described in Chapter 17, "Prioritizing
segments" on page 373, you must download the NetAffx Genomic Annotation files released with
ChAS 4.4 or use more current ones when available.
IMPORTANT! It is strongly recommended to confirm findings obtained using the ChAS
Browser's NetAffx Genomic Annotations file contents by linking out to external databases using
the ChAS software coordinates for the most current annotation information. See "Linking to
external websites" on page 213.
Genome assemblies
First, it is important to know which set of DNA sequences is being used as the
reference. For the human genome, the reference assembly is available for download
from public sources such as UCSC and Ensembl. Those two sites currently use
identical genome assemblies, but refer to them by different names. UCSC uses names
such as “hg18”, “hg19” and “hg38”. The identical genome assemblies are known as
“NCBI36”, “GRCh37” and “GRCh38” at Ensembl. Assemblies at NCBI can have a
decimal point as well, for example, “36.3” or “37.1”. For positions on the
chromosomes 1-22, X and Y, there is no difference between assemblies “36.1”, “36.2”
and “36.3” and we expect the same will be true for future “point” releases.
Segment positions
BED and AED file formats are used for storing and sharing region files between software.
The BED format was created by UCSC for use with their genome browser, and is also used
in other software. The AED format was created by Thermo Fisher Scientific for use with
ChAS and possible future software, but used the BED format as a starting point.
The BED file format is explicitly defined to use a 0-based coordinate system where the
second column (chromStart) in the file is the position of the first base-pair and the third
column (chromEnd) is the position of the last base-pair plus one. Another way of saying
this is that the start index is inclusive and the end index is exclusive. As an example, to
refer to the first 100 bases on the chromosome, you would use chromStart=0 and
chromEnd=100. The length of any region is always given simply by chromEnd minus
chromStart.
The UCSC browser strictly requires that chromStart not be larger than chromEnd. In order
to support file outputs from non-conforming programs, ChAS will accept BED files where
chromStart > chromEnd. It will simply switch those two coordinates and act as if the
coordinates were given in the correct order.
Since a SNP has, by definition, a length of one base-pair, the proper way to represent a
SNP position is with chromEnd = chromStart + 1. The UCSC browser does allow
chromStart to be equal to chromEnd. But this is used for representing insertion points, and
is not used to represent SNP positions. Because the AED format was intended to be
compatible with BED format, we use the same coordinate system.
For example, suppose there are three markers with the following positions on a
chromosome given in the CYCHP file: Marker A at 1000, Marker B at 2000, Marker C at
3000. Marker positions in the CYCHP file are 1-based index positions. To represent these
in a BED file, we would need a file like this:
Chr3 999 1000 markerA
Chr3 1999 2000 markerB
Chr3 2999 3000 markerC
If there were a segment starting at markerA and ending at markerC, we would need to
represent it in a BED or AED file as:
Chr3 999 3000 segment_1
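The coordinate conversion described above can be sketched in Python as follows. This is a minimal sketch, not ChAS code: the function names are hypothetical, and the swap helper only mirrors the described behavior of accepting BED records where chromStart > chromEnd.

```python
def marker_to_bed(chrom, position, name):
    """Convert a 1-based marker position to a 0-based, half-open BED record."""
    # chromStart is inclusive and 0-based; chromEnd is exclusive.
    return (chrom, position - 1, position, name)


def segment_to_bed(chrom, first_marker_pos, last_marker_pos, name):
    """Convert a segment bounded by two 1-based marker positions to BED."""
    return (chrom, first_marker_pos - 1, last_marker_pos, name)


def normalize_bed_interval(chrom_start, chrom_end):
    """Swap the coordinates if they were given in the wrong order."""
    return (min(chrom_start, chrom_end), max(chrom_start, chrom_end))


if __name__ == "__main__":
    for name, pos in [("markerA", 1000), ("markerB", 2000), ("markerC", 3000)]:
        print(*marker_to_bed("Chr3", pos, name))
    print(*segment_to_bed("Chr3", 1000, 3000, "segment_1"))
```

With these conventions, the length of any record is chromEnd minus chromStart, so each marker has length 1 and segment_1 has length 2001.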
hg version
Algorithm overview
This section provides a high level overview of how copy number calls are generated
within the software. The copy number workflow starts with the intensities on the array
and includes normalization and scaling, reference set ratios, Log2 transformation, CN state
segmentation, and CN segment calling.
Note: For CytoScan HTCMA algorithm and QC, see the RHAS User Guide.
Feature identification and signal extraction
GeneChip Cartridge Microarrays are scanned on the GeneChip Scanner and
processed by the AGCC/GCC scanner software package. AGCC/GCC aligns a grid
on the DAT file (the original scanned image) to identify each microarray feature and
calculates the signal from each feature. This process uses the DAT file, containing the
raw signal, and creates a CEL file, which contains a single signal intensity for each
feature. The CEL file is used for all downstream analyses.
Single sample CytoScan workflow
Beginning with the raw signal data in the CEL file, the Single Sample CytoScan
Workflow implements a series of steps that perform probe set summarization,
normalization, and removal of variation caused by known properties and residual variation,
and that finish by calling genotypes, copy number segments, and LOH segments.
A brief overview of each step performed by the CytoScan Workflow is shown in
Figure 527. In addition, a rough sketch of the Analysis Pipeline (for Single
Sample Analysis) is shown in Figure 528 on page 494.
Figure 527 CytoScan Workflow Overview. Steps related to probes used for copy
number determination run down the left and those steps used for SNP probes run down
the right side of the diagram
Figure 528 Rough sketch of Analysis Pipeline (for Single Sample Analysis)
Log2 ratio calculation
Log2 Ratios for each marker are calculated relative to the reference signal profile. The
Log2 Ratio is simply Log2(sample_m) – Log2(reference_m), for each marker, “m”.
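As a small worked illustration of this calculation, the following Python sketch computes the Log2 Ratio marker by marker. The function name and the input values are hypothetical, and both signal lists are assumed to be positive and in the same marker order.

```python
import math


def log2_ratios(sample_signals, reference_signals):
    """Return Log2(sample_m) - Log2(reference_m) for each marker m."""
    if len(sample_signals) != len(reference_signals):
        raise ValueError("sample and reference must cover the same markers")
    return [
        math.log2(s) - math.log2(r)
        for s, r in zip(sample_signals, reference_signals)
    ]


if __name__ == "__main__":
    # Hypothetical signals: twice, equal to, and half of the reference.
    print(log2_ratios([8.0, 4.0, 2.0], [4.0, 4.0, 4.0]))
```

A marker whose signal equals the reference signal yields a Log2 Ratio of 0.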
High pass filter image correction
Since most probes map to genomic markers associated with a normal copy number,
most Log2 Ratios should be centered at a value of zero. Also, since markers from any
genomic region are scattered across the surface of the microarray, regions of altered
copy number will not appear as regional changes on the microarray image.
Some samples do reveal gradual spatial trends away from zero, and this
spatial bias, when scattered back across the genome, exhibits itself as added noise in
the Log2 Ratios. The High Pass Filter Image Correction identifies these gradual spatial
trends and adjusts Log2 Ratios to remove the spatial bias and lower the level of
noise.
Log2 Ratio-Level Covariate Adjustors
Median Autosome normalization
This final level of normalization simply shifts the median Log2 Ratio of the autosomes
to a copy-number state equal to 2, i.e. a Log2 Ratio of 0.
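The median shift can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the ChAS implementation: the function name is hypothetical, chromosomes are assumed to be keyed as "1" through "22", "X" and "Y", and the same shift is assumed to be applied to every chromosome.

```python
from statistics import median

AUTOSOMES = {str(n) for n in range(1, 23)}


def median_autosome_normalize(log2_by_chrom):
    """Shift all Log2 Ratios so the autosomal median becomes 0 (CN = 2)."""
    autosomal = [
        value
        for chrom, values in log2_by_chrom.items()
        if chrom in AUTOSOMES
        for value in values
    ]
    shift = median(autosomal)
    return {
        chrom: [value - shift for value in values]
        for chrom, values in log2_by_chrom.items()
    }


if __name__ == "__main__":
    data = {"1": [0.1, 0.2, 0.3], "2": [0.2], "X": [-0.8]}
    print(median_autosome_normalize(data))
```

Only autosomal markers contribute to the median, so a sample with a single X chromosome does not pull the center away from CN = 2.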
Systematic residual variability removal
Even after all of the Covariate Adjustors, there is some residual variation with unknown
origins. During product development we have introduced variation into the protocol in
an attempt to capture other forms of unanticipated variation. The Systematic Residual
Variability Removal step matches sample variability to the residual variability of the
reference set, and when matched, corrects the data to remove the residual.
Segmentation
Copy Number Calls for each Marker based on Log2 Ratios
For CytoScan arrays, markers are individually assigned a copy number call by a
Hidden Markov model (HMM). The sample specific inputs to the HMM are the
Weighted Log2 Ratios generated by the Signal Restoration module.
The weighted Log2 Ratios are centered on copy number (CN) = 2. In theory, when
Log2 Ratio = 0 then CN = 2, when Log2 Ratio = -1 then CN = 1, etc. In truth,
microarrays, or any hybridization-based technology, exhibit Log2 Ratio compression
due to many factors, so the Log2 Ratios never exhibit the amplitude expected by the
math. The following table shows theoretical and actual Log2 Ratios for different Copy
Number States.
Copy Number State    Theoretical Log2 Ratio    Actual Log2 Ratio
1                    -1                        -0.45
2                    0                         0
3                    0.58                      0.3
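The theoretical values in the table follow from taking the base-2 logarithm of the copy number relative to the normal state of 2. A minimal Python sketch of that arithmetic (the function name is hypothetical):

```python
import math


def theoretical_log2_ratio(copy_number, normal_copy_number=2):
    """Expected Log2 Ratio with no compression: log2(CN / normal CN)."""
    return math.log2(copy_number / normal_copy_number)


if __name__ == "__main__":
    for cn in (1, 2, 3):
        print(cn, round(theoretical_log2_ratio(cn), 2))
```

The actual values cannot be computed this way; they are empirical and reflect the Log2 Ratio compression described above.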
The actual Log2 Ratios observed are best derived from a very large data set with well-
characterized copy number changes. To this end, we have analyzed over 1400
samples that have copy number changes across 75% of the genome and have
established stable empirical values for these expected Log2 Ratios. These values, as
well as the dispersion characteristics of the Log2 Ratio data, are used as inputs to the
HMM along with the weighted Log2 Ratios of the sample data.
The HMM uses these inputs to convert observed Log2 Ratios into a CN state for each
marker. It uses a table of transition probabilities that express the probability of
changing from any CN state to another. As can be seen in the following example
(Figure 529), there are many potential paths through the possible CN states of a set of
markers.
The HMM uses the Viterbi algorithm to calculate the most probable path through the
set of markers using the transition probabilities between each pair of CN states.
Essentially, the graph of potential CN states is the “hidden” layer of the HMM, and the
measured Log2 Ratios are the observed layer. The HMM algorithm finds the most
probable CN states given the observed Log2 Ratios.
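To illustrate how the Viterbi algorithm picks the most probable path, the following Python sketch runs a small three-state HMM. It is a toy model, not the ChAS HMM: the state means come from the actual Log2 Ratios in the table above, while the Gaussian emission model, the dispersion SIGMA, the uniform starting prior, and the transition probability STAY_PROB are all hypothetical values chosen for illustration.

```python
import math

STATE_MEANS = {1: -0.45, 2: 0.0, 3: 0.3}  # actual Log2 Ratios from the table
SIGMA = 0.15       # assumed dispersion of Log2 Ratios (hypothetical)
STAY_PROB = 0.98   # assumed probability of staying in a state (hypothetical)


def log_emission(x, mean, sigma=SIGMA):
    """Log density of a Gaussian emission for one observed Log2 Ratio."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi_cn_states(log2_ratios):
    """Return the most probable CN state for each marker, in marker order."""
    if not log2_ratios:
        return []
    states = list(STATE_MEANS)
    switch_prob = (1.0 - STAY_PROB) / (len(states) - 1)
    log_trans = {
        (a, b): math.log(STAY_PROB if a == b else switch_prob)
        for a in states
        for b in states
    }
    # Initialize with a uniform prior over states.
    score = {
        s: math.log(1.0 / len(states)) + log_emission(log2_ratios[0], STATE_MEANS[s])
        for s in states
    }
    back_pointers = []
    for x in log2_ratios[1:]:
        new_score, pointer = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + log_trans[(p, s)])
            pointer[s] = best_prev
            new_score[s] = (
                score[best_prev] + log_trans[(best_prev, s)]
                + log_emission(x, STATE_MEANS[s])
            )
        score = new_score
        back_pointers.append(pointer)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointer in reversed(back_pointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi_cn_states([0.02, -0.05, 0.01, -0.4, -0.5, -0.47, 0.03, 0.0]))
```

Because switching states is penalized by the transition probabilities, a single noisy marker is unlikely to change the call, while a run of markers near -0.45 is assigned CN = 1.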
Segment Formation
Once markers are assigned Copy Number States by the HMM, contiguous stretches
of adjacent markers ordered by chromosome position having the same state are
aggregated into segments. These segments are described in a segment table within
the resulting CYCHP file that provides for each segment, the common Copy Number
State, the number of markers in the segment, the genomic marker position that
initiates the segment and the genomic marker position that terminates the segment.
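The aggregation step can be sketched in Python as follows. This is an illustration only: the function name and the dictionary keys are hypothetical and do not reflect the actual CYCHP segment table layout, and markers are assumed to be already sorted by chromosome and position.

```python
from itertools import groupby


def form_segments(markers):
    """Aggregate adjacent markers with the same CN state into segments.

    markers: list of (chromosome, position, cn_state) tuples, sorted by
    chromosome and then by position.
    """
    segments = []
    # A new segment starts whenever the chromosome or the CN state changes.
    for (chrom, state), run in groupby(markers, key=lambda m: (m[0], m[2])):
        run = list(run)
        segments.append({
            "chromosome": chrom,
            "cn_state": state,
            "marker_count": len(run),
            "start_position": run[0][1],
            "end_position": run[-1][1],
        })
    return segments


if __name__ == "__main__":
    example = [
        ("3", 1000, 2), ("3", 2000, 1), ("3", 3000, 1), ("3", 4000, 2),
    ]
    for segment in form_segments(example):
        print(segment)
```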
Segment table output
The final result of the copy number pipeline is a table of segments identified in the
sample. The table in the CYCHP file includes segments of normal and non-normal
copy number. Segments called on the X- and Y-chromosomes are characterized as
normal or non-normal using gender information and adjusting for the Pseudo
Autosomal Regions (PAR) that are present on the X and Y. In ChAS, the segment table
display only shows segments of non-normal copy number.
Mosaicism segment algorithm
The algorithm for detection of copy number aberrations in the presence of mosaicism
considers single copy deletions and gains. The algorithm is tuned to be most accurate
when the normal/expected Copy Number State is two. The algorithm targets detection
of changes of approximately 3MB or more in size (for CytoScan HD). Copy number
change events less than this size may be detected; however, sensitivity and specificity
will be reduced.
Signal summarization
CytoScan arrays contain six probes for each SNP probe set, three targeting each
allele. The first step of the SNP-specific workflow is to summarize the previously-
normalized probe intensities for the A and B alleles, yielding allelic signal values.
Allelic signal computation
For each marker, the Allelic Difference is calculated as the summarized signal of the
A allele minus that of the B allele, standardized such that an A-allele
genotype is scaled to a positive value and a B-allele genotype is scaled to a negative value.
The standardization is based on median values for this difference under
different genotype configurations, as determined from the reference set. In this way, a
homozygous AA genotype maps to approximately +1 and a homozygous BB genotype maps to
approximately -1, with the heterozygote mapping to approximately 0. Additionally,
single A and B allele signals map to 0.5 and -0.5, respectively. This scaling
provides a useful way of discerning two copies of an A allele from a single copy,
enabling regions of copy-neutral LOH (e.g. IBD) to be distinguished from hemizygous LOH.
Genotyping
Genotyping for CytoScan arrays is accomplished using the BRLMM-P algorithm
described in the White Paper: BRLMM-P: A Genotype Calling Method for the SNP
Array 5.0 (2007).
Allelic difference GC correction
Systematic changes in Allelic Differences can be related to differences in GC content.
For instance, on a given sample Allelic Differences representing AA and BB genotype
markers might get progressively closer or further from each other as the GC content
changes. It is assumed that such changes represent unwanted variability. The Allelic
Difference GC correction determines differences in the structure of the allelic
differences associated with GC and then removes these differences. For CytoScan
HD the super GC covariate is used. For CytoScan 750K the Local GC covariate is
used.
Detection of LOH
The LOH algorithm frames the problem in terms of a statistical hypothesis test. Given
a specific region containing N SNP markers with heterozygous and homozygous
genotype calls, decide between the following two hypotheses:
Null Hypothesis: Region is LOH
Alternative Hypothesis: Region is non-LOH
To decide between the two hypotheses the number of heterozygous calls is compared
with a critical value that is computed for each sample. When the number of
heterozygous calls is above the critical value, then the alternative hypothesis is
favored, i.e. region is not LOH. If there are not a sufficient number of heterozygous
calls then the decision is made in favor of LOH. The algorithm moves the region of N
markers along the genome to determine LOH events. Further details are provided in
the White Paper: The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console
2.0.
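The sliding-window decision described above can be sketched in Python as follows. This is a simplified illustration, not the published algorithm: the function name is hypothetical, the window size and critical value are supplied by the caller because their computation is not described here, and genotype calls are assumed to be given as "AA", "AB" or "BB" in genomic order.

```python
def loh_window_calls(genotypes, window_size, critical_value):
    """Slide a window of N markers and decide LOH vs. non-LOH per window.

    Returns one boolean per window start position: True means the number of
    heterozygous calls did not exceed the critical value (LOH is retained).
    """
    calls = []
    if window_size <= 0 or len(genotypes) < window_size:
        return calls
    het_count = sum(1 for g in genotypes[:window_size] if g == "AB")
    for start in range(len(genotypes) - window_size + 1):
        if start > 0:
            # Update the count incrementally as the window moves by one marker.
            het_count -= genotypes[start - 1] == "AB"
            het_count += genotypes[start + window_size - 1] == "AB"
        calls.append(het_count <= critical_value)
    return calls


if __name__ == "__main__":
    calls = ["AB", "AA", "AB", "AA", "BB", "AA", "BB", "AB", "AB"]
    print(loh_window_calls(calls, window_size=4, critical_value=0))
```

In practice a nonzero critical value tolerates a small number of heterozygous calls caused by genotyping error.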
Median of the Absolute values of all Pairwise Differences (MAPD)
MAPD is a global measure of the variation of all microarray probes across the genome.
It represents the median of the distribution of changes in Log2 Ratio between adjacent
probes. Since it measures differences between adjacent probes, it is a measure of
short-range noise in the microarray data. Based on an empirical testing dataset, we
have determined that array data with MAPD > 0.25 for CytoScan 750K and HD, or MAPD
> 0.29 for CytoScan Optima, has too much noise to provide reliable copy number
calls.
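As the name suggests, MAPD can be computed as the median of the absolute differences between adjacent Log2 Ratios. The following Python sketch shows that calculation under stated assumptions: the function name is hypothetical, the input is a single list already ordered by genomic position, and any special handling of chromosome boundaries is not modeled.

```python
from statistics import median


def mapd(log2_ratios):
    """Median of absolute differences between adjacent Log2 Ratios."""
    if len(log2_ratios) < 2:
        raise ValueError("at least two markers are required")
    differences = [
        abs(current - previous)
        for previous, current in zip(log2_ratios, log2_ratios[1:])
    ]
    return median(differences)


if __name__ == "__main__":
    print(mapd([0.0, 0.1, -0.1, 0.2]))
```

Because only neighboring markers are compared, a long region of altered copy number adds just two large differences (at its edges) and has little effect on the median.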
SNPQC
SNPQC is a measure of how well genotype alleles are resolved in the microarray data.
Based on an empirical testing dataset, we have determined that array data with
SNPQC < 15 for CytoScan 750K and HD, or SNPQC < 8.5 for CytoScan Optima, is of
poorer quality than is required to meet genotyping QC standards.
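The MAPD and SNPQC thresholds quoted above can be collected into a simple check, sketched here in Python. The dictionary layout and function name are hypothetical, only the two metrics discussed in this section are covered, and values exactly at a threshold are treated as passing, which is an assumption.

```python
QC_THRESHOLDS = {
    "CytoScan HD": {"mapd_max": 0.25, "snpqc_min": 15.0},
    "CytoScan 750K": {"mapd_max": 0.25, "snpqc_min": 15.0},
    "CytoScan Optima": {"mapd_max": 0.29, "snpqc_min": 8.5},
}


def passes_qc(array_type, mapd_value, snpqc_value):
    """Return True if both MAPD and SNPQC are within the quoted thresholds."""
    limits = QC_THRESHOLDS[array_type]
    return mapd_value <= limits["mapd_max"] and snpqc_value >= limits["snpqc_min"]


if __name__ == "__main__":
    print(passes_qc("CytoScan HD", 0.18, 22.4))
    print(passes_qc("CytoScan Optima", 0.31, 9.0))
```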
Figure 530
Effect of MAPD on functional performance
As a measure of performance, we measured copy number gain and loss using
samples with large chromosome aberrations that spanned approximately 70% of the
genome. With this dataset of nearly 1500 microarrays we measured the sensitivity for
detecting regions of copy number change across all of these regions. The sensitivity
of detecting an aberration on each array was binned into groups of varying
sensitivities, and plotted versus MAPD for each array in the following graph
(Figure 531).
Figure 531
The bins of detection sensitivity are displayed as coordinates along the x-axis, with
0% detection at the left and 100% at the right. The number of arrays is listed above
each box plot. The majority of the arrays had sensitivities above 90%. Based on this
analysis, we established a QC cutoff for MAPD of 0.25. Arrays with MAPD above 0.25
cannot be reliably used to determine copy number.
Figure 532
SNPQC correlates well with genotype performance, as measured by Call Rate and
Concordance to published HapMap genotypes. To establish this relationship, we
scored 380 microarrays from the Reference Set by calculating SNPQC, Call Rate and
Concordance. The following graphs show the relationships between SNPQC and the
other two metrics (Figure 533).
Figure 533
Call Rates and Concordance also tied closely to SNPQC
The left panel shows that when SNPQC > 15, Call Rate is above 98%. The right panel
shows that when SNPQC > 15, Concordance is above 99%. This functional mapping
of SNPQC has allowed us to set a functional threshold for this QC metric at 15.
Microarrays with SNPQC > 15 are considered of high quality and interpretation of the
data is possible.
Figure 534 Examples of Allele Track data quality at various levels of SNPQC. The lower row of
figures shows data for CN=2 regions and the upper row for CN=3 regions. The panels from left-to-right
represent increasing SNPQC quality. The functional threshold for SNPQC is 15, so all values
above 15 show excellent data quality.
The key consideration is whether the SNPQC value is above or below the threshold
value, not its absolute magnitude. As long as the SNPQC value exceeds the
threshold, data quality is retained, as illustrated by the graphs to the right,
which show clear allelic data across a broad range of SNPQC values that
exceed the recommended threshold. SNPQC is one of the metrics used to assess
array quality and helps determine which experimental data sets
are of satisfactory quality to continue with subsequent interpretation.
Note: For detailed information on algorithms and QC metrics for the OncoScan array,
refer to the OncoScan Console User Manual (P/N 703195).
TuScan algorithm
The TuScan algorithm uses B-allele frequencies (BAFs) and log2 ratios to estimate the
ploidy and percentage of aberrant cells in the sample (%AC) which in turn are used to
calculate copy number calls (CN). The BAFs and log2 ratios contribute equally to CN
determination. TuScan first uses the BAFs and log2 ratio data to identify segments of
equal CN. Next, TuScan uses the BAFs, log2 ratios, and segment data to find the
combination of %AC and ploidy that best fits the data. When TuScan can successfully
determine %AC, the algorithm assigns each aberrant segment an integer copy
number representing the copy number in the tumor portion of the sample. This is
possible because CN is well approximated by an integer when the tumor is nearly
homogeneous. If the tumor is highly heterogeneous (i.e., lacks a dominant clone), or
contains a large proportion of “normal” cells, %AC cannot be determined. In other words,
if the percentage of aberrant cells contributing to the various aberrations in the sample
varies across all aberrations, %AC and ploidy cannot be determined. When %AC
cannot be determined, the segmentation algorithm will still identify segments of equal
CN, but the CN in just the aberrant cells cannot be determined. In this case, TuScan
bins the copy numbers and returns fractional CN values in 1/3 increments (e.g., 2,
2.33, 2.66, 3 etc.). This fractional copy number is derived from the normal
contamination as well as the heterogeneous population of tumor cells; therefore, the
fractional CN calls represent the average CN observed for that segment. Users should
look at the value of %AC to determine whether the CN value represents the CN in the
tumor (%AC= number) or the average CN in the sample (%AC=NA). Tumor
heterogeneity also affects the interpretation of the CN number calls when %AC cannot
be determined. For example, a TuScan call of 2.33 can result from 40% of the aberrant
cells having 3 copies, 10% of aberrant cells having 5 copies, or a more complex
heterogeneous mixture of copy numbers. Since nearly every tumor sample will have
some amount of normal contamination combined with tumor heterogeneity, it is not
possible to predict how often TuScan will be able to determine the %AC; this will vary
depending on the sample.
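The arithmetic behind the fractional calls can be illustrated with a short Python sketch. It is not the TuScan algorithm: the function names are hypothetical, each percentage is assumed to be a fraction of all cells in the sample with the remaining cells at CN = 2, and binning is assumed to round to the nearest 1/3.

```python
from fractions import Fraction


def average_cn(subpopulations, normal_cn=2):
    """Average copy number of a mixed sample.

    subpopulations: list of (fraction_of_cells, copy_number) pairs for the
    aberrant cells; all remaining cells are assumed to carry normal_cn.
    """
    aberrant_fraction = sum(fraction for fraction, _ in subpopulations)
    if aberrant_fraction > 1:
        raise ValueError("cell fractions must not sum to more than 1")
    aberrant_part = sum(fraction * cn for fraction, cn in subpopulations)
    return aberrant_part + (1 - aberrant_fraction) * normal_cn


def bin_to_thirds(copy_number):
    """Round a fractional copy number to the nearest 1/3 increment."""
    return Fraction(round(copy_number * 3), 3)


if __name__ == "__main__":
    for mixture in ([(0.4, 3)], [(0.1, 5)]):
        cn = average_cn(mixture)
        print(mixture, round(cn, 2), bin_to_thirds(cn))
```

Under these assumptions, both mixtures from the example above (40% of cells at 3 copies, or 10% of cells at 5 copies) fall into the same 7/3 bin, which is why a fractional call alone cannot identify the underlying mixture.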
Figure 535
Figure 536
IMPORTANT! The Segment Prioritization feature outlined in Chapter 17, "Prioritizing segments"
on page 373 can be used in conjunction with the recommended steps below to quickly prioritize
XON regions.
4. Add the XON Region Level and Summarized Log 2 Ratio columns to the segment
table, then use the Segments Table to review the XON Region segment calls.
5. Optional: Annotate the XON Region segments as you normally would using the
Call and Interpretation columns, as described in "Adding annotations at the
sample (xxCHP) file level" on page 250.
6. Optional: Exon Region segment calls from other Levels can be displayed by
selecting the check box(es) in the Filters Tab.
7. Optional: Publish the XNCHP file to the ChAS DB, as described in "Publishing
data to the database" on page 406.