Data Analysis of Brain Cancer With Biopython

This document discusses analyzing brain cancer data using Biopython tools. It provides background on Biopython and describes how it can be used for biological computation and data analysis. Specifically, it analyzes nucleotide sequences of brain cancer (glioblastoma) using various Biopython modules and tools. This includes using Bio.Seq to calculate GC-content of sequences and Bio.Motif for sequence motif analysis to better understand genetic alterations involved in brain cancer.

Volume 8, Issue 3, March – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Data Analysis of Brain Cancer with Biopython


Vinita Kukreja1, Uma Kumari2*
1 Trainee, Bioinformatics Project and Research Institute, Noida - 201301, India
2* Senior Bioinformatics Scientist, Bioinformatics Project and Research Institute, Noida - 201301, India

Abstract:- Biopython, an open-source collection of Python tools for biological computation, was first published in 2000 by Brad Chapman and Jeff Chang. Its features include parsers for bioinformatics file formats, access to online bioinformatics databases, interfaces to common programs, a standard sequence class, etc. Glioblastoma is one of the most aggressive (grade IV) types of brain cancer and accounts for 15% of all brain tumors. Genetic alterations in glioblastoma include EGFR and PDGFR amplification, TERT promoter mutation, alterations in TP53, NF1, PTEN and RB, loss of chromosome arm 10q, and aberrations in the RTK/Ras/PI3K signaling pathways. Pathway maps were used to understand the molecular interaction and reaction network in glioma. Multiple sequence alignment tools help us to analyze areas of similarity and evolutionary relationships between the sequences. Using Biopython tools, we perform the analysis of the nucleotide sequences. This study introduces the application of a brain tumor detection algorithm using machine learning techniques.

Keywords:- Brain Tumor, Computer Vision, Structure Analysis, Sequence Alignment, Deep Learning, Biopython.

I. INTRODUCTION

In the late 1980s, Guido van Rossum started working on Python, and he first published it in 1991 as Python 0.9.0.[1] Development of Biopython began in 1999, and it was first published in 2000 by Brad Chapman and Jeff Chang.[2] Python is a high-level programming language used extensively in industry and academia and available on all major operating systems. It offers a simple syntax, object-oriented programming and a wide array of libraries.[3] Biopython is a member project of the Open Bioinformatics Foundation (OBF), which hosts the Biopython web site, source code repository, bug tracking database and email mailing lists. The OBF also supports related projects such as BioPerl[4], BioJava[5], BioRuby and BioSQL.

Biopython is an open-source compilation of Python tools for biological computation, created by an international team of developers.[2, 6] The main reason for developing Biopython is to make work easier for users of the Python programming language by creating high-quality, reusable modules and classes for complex bioinformatics problems. Biopython's features include the ability to parse various bioinformatics file formats (BLAST, Clustalw, FASTA, Genbank, PubMed, ExPASy, SCOP, KEGG, UniGene, and SwissProt); access to online bioinformatics databases (NCBI and ExPASy); interfaces to common programs (the Clustalw alignment program, Standalone Blast from NCBI, and command line tools from EMBOSS); a standard sequence class (dealing with sequences, sequence ids and sequence features); tools for performing common procedures on sequences (translation, transcription, and weight calculations); the Bio.Motif module, which provides analysis of sequence motifs (searching, comparing, and de novo learning);[2, 6] and the Bio.Phylo module, used for the visualization of phylogenetic trees.[7]

A brain tumor is an abnormal growth of cells that has formed in the brain. Tumors can form in the brain or in other parts of the central nervous system (CNS), such as the spine or cranial nerves. The brain controls most bodily functions, including awareness, movement, sensation, thought, speech, and memory. Tumors can affect these functions and alter the brain's ability to operate properly.[8, 9] There are more than 120 different types of brain tumor, classified by the tissue they arise from. Brain tumors can be cancerous or non-cancerous (benign), but even non-cancerous tumors can be harmful because of their size and location.[10] Tumors that arise in the brain are called primary brain tumors, and cancers that metastasize from other parts of the body to the brain are secondary brain tumors. Brain tumors can also be classified by histological grade (I-IV) and molecular markers. Tumor diagnosis should be "layered" as histological classification, WHO grade, and molecular information, and reported as an "integrated diagnosis."[11]

Gliomas are a type of brain tumor whose cells look like glial cells.[12] The most common type of malignant glioma is glioblastoma (grade IV), which accounts for 15% of all brain tumors.[13] Glioblastoma is one of the most aggressive types of brain cancer because it arises from astrocytes, cells that support nerve cells and regulate the amount of blood that reaches them; access to a large number of blood vessels helps the cancer cells to grow and spread rapidly.[14] Another reason behind the aggressiveness of glioblastoma is its high recurrence rate. This is because the tumor contains glioma stem cells (GSCs), a type of self-regenerating cancer stem cell that controls the growth of tumors. In a previous study, Subhas Mukherjee and colleagues found high levels of the enzyme cyclin-dependent kinase 5 (CDK5) in GSCs and showed that blocking this enzyme inhibits the ability of GSCs to self-regenerate.[15] The cause of most glioblastoma cases is unknown. Some uncommon risk factors include genetic disorders, previous radiation therapy[16, 17] and association with viruses (SV40,[18] HHV-6,[19, 20] and cytomegalovirus[21]). Common genetic alterations in glioblastomas include amplification of epidermal growth factor receptor (EGFR) and platelet-derived growth factor receptor (PDGFR); telomerase reverse transcriptase (TERT) promoter mutation; alterations

IJISRT23MAR1974 www.ijisrt.com 2146


in tumor protein 53 (TP53), neurofibromin 1 (NF1), phosphatase and tensin homolog (PTEN) and retinoblastoma (RB); loss of chromosome arm 10q; and aberrations in receptor tyrosine kinase (RTK)/Ras/phosphoinositide 3-kinase (PI3K) signaling pathways.[22, 23] In this paper, we study brain tumors further using Biopython (Bio.Seq) and other bioinformatics tools. The aim of bioinformatics tools and software is to develop new biological techniques and chemical databases that help in understanding the fundamental ADME (absorption, distribution, metabolism and excretion) processes through which metabolism can influence disease and health.[24] Bioinformatics strategies are being developed to significantly improve survival rates in patients and to explore new models that bridge the gap between promising preclinical findings and their clinical translation.[25]

II. MATERIALS AND METHODS

To understand the network of molecular interactions and reactions in glioma, we used KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway maps. The KEGG database is used to understand high-level functions and utilities of the biological system from genomic and molecular-level information. According to its developers, it is a computer representation of the biological system that integrates the molecular building blocks of genes and proteins (genomic information) and chemical substances (chemical information) with wiring diagrams of molecular interaction and reaction networks (systems information); it also contains disease and drug information (health information).

To study the alignment of more than two sequences, we used the Clustal Omega tool. Clustal Omega is a multiple sequence alignment tool that uses seeded guide trees and HMM (Hidden Markov Model) profile-profile techniques to generate alignments between three or more sequences. Aligning multiple sequences highlights areas of similarity, which may be associated with specific traits that are more highly conserved than other regions. Clustal Omega was also used to analyze the evolutionary relationships between sequences through phylogenetic analysis.

Gene prediction identifies the regions of genomic DNA that encode genes, including protein-coding genes, RNA genes and regulatory genes. For this step we used ORF Finder (Open Reading Frame Finder), a graphical analysis tool that finds all open reading frames in a user-selected sequence. The tool identifies all open reading frames using the standard or alternative genetic codes.

For the annotation of protein sequences, we used the Conserved Domain Database (CDD), which consists of annotated multiple sequence alignment models for ancient domains and full-length proteins. CDD uses Reverse Position-Specific BLAST (RPS-BLAST) to compare a query sequence against the position-specific score matrices (PSSMs) of the conserved domain models, reporting several types of RPS-BLAST hits and domain model scopes. CDD includes NCBI-curated domains, which use 3D structure information to define domain boundaries and provide insight into sequence/structure/function relationships; it also imports data from external source databases.

Analysis of the nucleotide sequences was performed using Biopython tools. GC-content (guanine-cytosine content) is the percentage of nitrogenous bases in a DNA or RNA molecule that are guanine (G) or cytosine (C), and it is used to describe genomes. The Biopython modules Bio.Seq and Bio.SeqUtils are used to calculate the GC-content of a nucleotide sequence, and the sequence length is measured on the same Bio.Seq objects. Complementary sequences follow a lock-and-key principle: complementarity is a property shared between two DNA or RNA strands whose bases pair with each other. Complementary base pairing allows cells to copy information from one generation to another. The reverse complement is formed by reversing the complementary sequence. Transcription is the synthesis of an RNA molecule from a DNA sequence with the help of the enzyme RNA polymerase; one of the DNA strands acts as the template for the complementary RNA strand. Translation is the process in which a protein is synthesized from an mRNA template. During translation, the sequence of nucleotides is translated into a sequence of amino acids; these amino acids form a polypeptide chain, which then bends and folds on itself to form a protein.

III. RESULT AND DISCUSSION

The KEGG database uses genomic and molecular-level information to understand the functions and utilities of the biological system. Figure 1 shows the molecular interaction network of the primary and secondary pathways of glioma. The highlighted routes in these two pathways represent: mutation-activated EGFR to the RAS-ERK signaling pathway; EGFR overexpression to the PI3K signaling pathway; amplified EGFR to the PLCG-CAMK signaling pathway; amplified PDGFR to the PLCG-CAMK signaling pathway; and mutation-inactivated TP53 to transcription.


Fig 1 Molecular Interaction Network of Primary and Secondary Pathway of Glioma

Clustal Omega is a multiple sequence alignment tool that generates alignments between three or more sequences using seeded guide trees and HMM profile-profile techniques. In Figure 2, we observe many variations between the nucleotide sequences. A gap marks a position where a sequence has no residue relative to the others (an insertion or deletion), and an asterisk marks a fully conserved column.
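
As a minimal illustration of how the conservation line under an alignment is read, the sketch below marks a column with an asterisk when every sequence carries the same residue and no gap is present. The three aligned strings are invented for the example and are not the sequences of Figure 2; `conservation_line` is a hypothetical helper, not part of Clustal Omega or Biopython.

```python
def conservation_line(aligned):
    """Return a Clustal-style marker line: '*' for fully conserved columns, ' ' otherwise."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        # A column is fully conserved when all residues are identical and none is a gap.
        conserved = len(set(column)) == 1 and "-" not in column
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy alignment (hypothetical data); '-' is a gap introduced by the aligner.
toy_alignment = ["ATGCC-TA", "ATGACGTA", "ATG-CGTA"]
for row in toy_alignment:
    print(row)
print(conservation_line(toy_alignment))
```

Columns 4 and 6 of the toy alignment contain a substitution or a gap, so they are left unmarked, while the remaining columns receive an asterisk.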

In Figure 3, the values shown in the tree are the branch lengths, which indicate the evolutionary distance between the sequences: a larger number represents a larger amount of genetic change.
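
To make the idea of "larger value, more change" concrete, the sketch below computes a simple proportion-of-differences distance (p-distance) between pairs of aligned sequences. This measure is chosen only for illustration and is not necessarily the distance Clustal Omega uses to build the tree in Figure 3; the toy sequences and the `p_distance` helper are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of compared columns at which two aligned sequences differ.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    differences = sum(1 for x, y in compared if x != y)
    return differences / len(compared)


# Toy aligned sequences (hypothetical data).
toy = {"seqA": "ATGCCGTA", "seqB": "ATGCCGTT", "seqC": "ATGAC-TT"}
for (name1, s1), (name2, s2) in combinations(toy.items(), 2):
    print(name1, name2, round(p_distance(s1, s2), 3))
```

Pairs that differ at more aligned positions receive a larger value, which is the same qualitative reading applied to the branch lengths of the tree.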


Fig 2 Multiple Sequence Alignment of Nucleotide Sequences

Fig 3 Cladogram of Nucleotide Sequences

An open reading frame (ORF) search identifies all the possible protein-coding regions in a sequence. There are three possible reading frames in each direction of a DNA sequence, i.e. six possible reading frames in total (the six horizontal bars): +1, +2 and +3 on the forward strand and -1, -2 and -3 on the reverse strand. In the translation, an asterisk (*) represents a stop codon, whereas M (methionine) marks a start codon. In Figures 4 and 5, the result displays all six reading frames of the entered query sequence. The ORFs are listed according to their size, alongside a graphical representation of the sequence. The selected ORF is ORF1, in the +1 reading frame on the forward strand. The nucleotide length of ORF1 is 96 and its protein length is 31; its start codon is at position 169 and its stop codon at position 264. The longest ORF is ORF6, in the -3 reading frame on the reverse strand, with a nucleotide length of 360 and a protein length of 119. For ORF6 the start codon is at position 360 and the stop codon at >1.
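
The six-frame search described above can be sketched in plain Python. This is a simplified illustration under stated assumptions (standard genetic code, ATG as the only start codon, complete ORFs only, unambiguous A/C/G/T input); it is not the algorithm of NCBI ORF Finder, and `six_frame_orfs`, `protein_length` and the toy sequence are hypothetical. The reverse-strand labels here simply count the offset from the start of the reverse complement. The last lines reproduce the length arithmetic quoted for ORF1: coordinates 169 to 264 span 96 nucleotides, i.e. 32 codons, of which the last is the stop codon.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by enumerating codons in TCAG order.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def six_frame_orfs(dna, min_codons=2):
    """Yield (frame, protein) for every ATG...stop ORF in the six reading frames."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
            start = None
            for index, codon in enumerate(codons):
                if codon == "ATG" and start is None:
                    start = index  # first start codon of a candidate ORF
                elif CODON_TABLE[codon] == "*" and start is not None:
                    if index - start >= min_codons:
                        protein = "".join(CODON_TABLE[c] for c in codons[start:index])
                        yield f"{strand}{offset + 1}", protein
                    start = None  # look for the next ORF in this frame


def protein_length(first_nt, last_nt):
    """Amino acids encoded by an ORF whose coordinates include the stop codon."""
    nucleotides = abs(last_nt - first_nt) + 1
    return nucleotides // 3 - 1


# Toy sequence (hypothetical): one ORF on each strand.
for frame, protein in six_frame_orfs("ATGGCTTGGTAATCAGGGTTTCAT"):
    print(frame, protein)
print(protein_length(169, 264))  # ORF1 coordinates reported in the text
```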


Fig 4 Six-Frame Translation of DNA Sequence

Fig 5 Display of Reading-Frame Strand and Length

The CDD result provides various display options and information for the conserved domains that align to the query sequence. In Figure 6, the graphical summary displays the standard result, which shows the best-scoring domain model from each source database. Small triangles in the figure indicate the specific amino acids involved in conserved features such as binding and catalytic sites. A specific hit is an RPS-BLAST hit whose e-value is equal to or lower than the domain-specific threshold e-value. It represents a high-confidence association between the query sequence and a conserved domain, i.e. the query sequence is likely to belong to the same protein family. A superfamily is a set of conserved domain models, possibly from different families, that produce overlapping annotation on the same protein sequences; it provides structural, functional and evolutionary information for proteins.


Fig 6 Graphical Representation of CDD-Based Annotation on Query Sequence
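
The sequence statistics and operations defined in Materials and Methods (GC-content, length, complement, reverse complement and transcription) are simple enough to sketch without any dependency. The functions below are hypothetical plain-Python stand-ins that mirror, for unambiguous DNA, what the Biopython calls used in this study return; they are not Biopython's implementation. The example fragment is the first 19 bases of DNA Seq1.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc_percent(seq):
    """Percentage of bases that are G or C (0 for an empty sequence)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


def complement(seq):
    """Swap each base for its pairing partner; other symbols (e.g. N) are kept."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Complement read in the opposite direction."""
    return complement(seq)[::-1]


def transcribe(seq):
    """Coding-strand DNA to mRNA: replace T with U."""
    return seq.replace("T", "U").replace("t", "u")


fragment = "AGACAGCCTGGGAGGGAGA"
print(len(fragment), round(gc_percent(fragment), 2))
print(complement(fragment))
print(reverse_complement(fragment))
print(transcribe(fragment))
```

Applying `reverse_complement` twice returns the original fragment, which is a convenient sanity check. As far as we are aware, recent Biopython releases provide `gc_fraction` in `Bio.SeqUtils`, which returns a fraction between 0 and 1 rather than a percentage; this should be checked against the installed version.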

The Bio.Seq and Bio.SeqUtils modules are used here to determine the GC content of the nucleotide sequences. GC content is the percentage of nitrogenous bases in a DNA or RNA molecule that are guanine or cytosine. It is calculated by dividing the number of G and C nucleotides by the total number of nucleotides. The length of each sequence is reported alongside it.

Biopython provides many built-in methods for performing basic and advanced operations on sequences. Advanced operations such as complement, reverse complement, transcription and translation were used to analyze the nucleotide sequences of patients with brain tumors.

The complement() method returns the complement of a given DNA sequence, while reverse_complement() returns that complement read in the opposite direction, i.e. the sequence of the opposite strand. The transcribe() method converts a DNA sequence into an RNA sequence; in Biopython, the DNA (coding) strand is converted directly into the mRNA strand by replacing the letter T with U. The translate() method translates a nucleotide sequence into a protein sequence.

DNA Seq1

>>> from Bio.Seq import Seq
>>> # GC returns a percentage; newer Biopython releases may provide gc_fraction instead.
>>> from Bio.SeqUtils import GC
>>> dna_seq1 = Seq(
...     "AGACAGCCTGGGAGGGAGA"
...     "AGGAGTTGGAGCTCAAGTTGGAGACAGCGAGGAGA"
...     "AACCTGCCATAGCCAGGGTGTGTCTTTGATCCTCTT"
...     "CAGGAGGTGAGGACAAGCCAGAGGTCCTTGGTGTG"
...     "CCCTCAGAAATCTGCCTGCAGTTCTCACCAAGCCGC"
...     "TGTGAAAATGGGGATAAACACCCGGGAGCTGTTTCT"
...     "CAACTTCACTATTGTCTTGATTACGGTTATTCTTATG"
...     "TGGCTCCTTGTGAGGTCCTATCAGTACTGAGAGGCC"
...     "ATGCCATGGTCCTGGGATTGACTGAGATGCTCCGGA"
...     "GCTGCCTGCTCTATGCCCTGAGACCCCACTGCTGTC"
...     "ATTGTCACAGGATGCCATTCTCCATCCGAGGGCACC"
...     "TGTGACCTGCACTCACAATATCTGCTATGCTGTAGT"
...     "GCTAGGATTGATTATGTGTTCTCCAAAGATGCTGCT"
...     "CCCAAGGGCTGCCAAGTGTTTGCCAGGGAACGGTA"
...     "GATTTATTCCCCAACTCTTAACTGAAAATGTGTTAG"
...     "ACAAGCCACAAAGTTAAAATTAAACTGGATTCATG"
...     "ATGATGTAGGATTGTTACAAGCCCCTGATCTGTCTC"
...     "ACCACACATCCCTTCAACCCACACGGTCTGCAACCA"
...     "AACTCTAATTCAACCTGCCAGAAGGAATGTTAGAGG"
...     "AAGTCTTTGTCAGCCCTTATAGCTATCATGTGAATA"
...     "AAGTTAAGTCAACTTC"
... )
>>> GC(dna_seq1)
48.46368715083799
>>> len(dna_seq1)
716
>>> dna_seq1.complement()
Seq('TCTGTCGGACCCTCCCTCTTCCTCAACCTCGAGTTCAACCTCTGTCGCTCCTCT...AAG')
>>> dna_seq1.reverse_complement()
Seq('GAAGTTGACTTAACTTTATTCACATGATAGCTATAAGGGCTGACAAAGACTTCC...TCT')
>>> dna_seq1.transcribe()

Seq('AGACAGCCUGGGAGGGAGAAGGAGUUGGAGCUCAAGUUGGAGACAGCGAGGAGA...UUC')
>>> dna_seq1.translate()
Seq('RQPGREKELELKLETARRNLP*PGCVFDPLQEVRTSQRSLVCPQKSACSSHQAA...KST')

DNA Seq2

>>> from Bio.Seq import Seq
>>> from Bio.SeqUtils import GC
>>> dna_seq2 = Seq(
...     "ATGGCTGCCATCCGGAAGA"
...     "AACTGGTGATTGTTGGTGATGGAGCCTGTGGAAAGA"
...     "CATGCTTGCTCATAGTCTTCAGCAAGGACCAGTTCC"
...     "CAGAGGTGTATGTGCCCACAGTGTTTGAGAACTATG"
...     "TGGCAGATATCGAGGTGGATGGAAAGCAGGTAGAG"
...     "TTGGCTTTGTGGGACACAGCTGGGCAGGAAGATTAT"
...     "GATCGCCTGAGGCCCCTCTCCTACCCAGATACCGAT"
...     "GTTATACTGATGTGTTTTTCCATCGACAGCCCTGATA"
...     "GTTTAGAAAACATCCCAGAAAAGTGGACCCCAGAA"
...     "GTCAAGCATTTCTGTCCCAACGTGCCCATCATCCTG"
...     "GTTGGGAATAAGAAGGATCTTCGGAATGATGAGCA"
...     "CACAAGGCGGGAGCTAGCCAAGATGAAGCAGGAGC"
...     "CGGTGAAACCTGAAGAAGGCAGAGATATGGCAAAC"
...     "AGGATTGGCGCTTTTGGGTACATGGAGTGTTCAGCA"
...     "AAGACCAAAGATGGAGTGAGAGAGGTTTTTGAAAT"
...     "GGCTACGAGAGCTGCTCTGCAAGCTAGACGTGGGA"
...     "AGAAAAAATCTGGTTGCCTTGTCTTG"
... )
>>> GC(dna_seq2)
49.22279792746114
>>> len(dna_seq2)
579
>>> dna_seq2.complement()
Seq('TACCGACGGTAGGCCTTCTTTGACCACTAACAACCACTACCTCGGACACCTTTC...AAC')
>>> dna_seq2.reverse_complement()
Seq('CAAGACAAGGCAACCAGATTTTTTCTTCCCACGTCTAGCTTGCAGAGCAGCTCT...CAT')
>>> dna_seq2.transcribe()
Seq('AUGGCUGCCAUCCGGAAGAAACUGGUGAUUGUUGGUGAUGGAGCCUGUGGAAAG...UUG')
>>> dna_seq2.translate()
Seq('MAAIRKKLVIVGDGACGKTCLLIVFSKDQFPEVYVPTVFENYVADIEVDGKQVE...LVL')

DNA Seq3

>>> from Bio.Seq import Seq
>>> from Bio.SeqUtils import GC
>>> dna_seq3 = Seq(
...     "AGCGTGCGCAAAGTGATCC"
...     "AGAACCCGGGCCCCTGGCACCCAGAGGCCGCGAGC"
...     "GCAGCACCTCCCGGCGCCAGTTTGCTGCTGCTGCAG"
...     "CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA"
...     "GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAAGAGA"
...     "CTAGCCCCAGGCAGCAGCAGCAGCAGCAGGGTGAG"
...     "GATGGTTCTCCCCAAGCCCATCGTAGAGGCCCCACA"
...     "GGCTACCTGGTCCTGGATGAGGAACAGCAACCTTCA"
...     "CAGCCGCAGTCGGCCCTGGAGTGCCACCCCGAGAG"
...     "AGGTTGCGTCCCAGAGCCTGGAGCCGCCGTGGCCGC"
...     "CAGCAAGGGGCTGCCGCAGCAGCTGCCAGCACCTC"
...     "CGGACGAGGATGACTCAGCTGCCCCATCCACGTTGT"
...     "CCCTGCTGGGCCCCACTTTCCCCGGCTTAAGCTGCT"
...     "GCTCCGCTGACCTTAAAGACATCCTGAGCGAGGCCA"
...     "GCACCATG"
... )
>>> GC(dna_seq3)
66.25766871165644
>>> len(dna_seq3)
489
>>> dna_seq3.complement()
Seq('TCGCACGCGTTTCACTAGGTCTTGGGCCCGGGGACCGTGGGTCTCCGGCGCTCG...TAC')
>>> dna_seq3.reverse_complement()
Seq('CATGGTGCTGGCCTCGCTCAGGATGTCTTTAAGGTCAGCGGAGCAGCAGCTTAA...GCT')
>>> dna_seq3.transcribe()
Seq('AGCGUGCGCAAAGUGAUCCAGAACCCGGGCCCCUGGCACCCAGAGGCCGCGAGC...AUG')
>>> dna_seq3.translate()
Seq('SVRKVIQNPGPWHPEAASAAPPGASLLLLQQQQQQQQQQQQQQQQQQQQQQQET...STM')

DNA Seq4

>>> from Bio.Seq import Seq
>>> from Bio.SeqUtils import GC
>>> dna_seq4 = Seq(
...     "TGGATTGGTCCATTTTACAG"
...     "AGTGCTGATTGNTCCGTTTTTACAGAGTGCTANTTG"
...     "GTGTGTTTACAAAGCTTTAGCTAGACAGAAAATTTC"
...     "TCCAAGTCCCCACTGGACACAGGAAGTCCAGCTGGC"
...     "TTCACCTCTGAAAACTTTTTAGATTAAAAAAATAGA"
...     "ACAAACTAGTTTTAGTAGACACTTTTAAAATGATAA"
...     "AGCAACTTGCGTTAATTTAATTCCTATCATTATGAC"
...     "ATAAATATCTAAGCAATGAAAGATAATATCTTTTAT"
...     "TATAAAGCTGCATAATGTGAAATCTTGCTGATGGTG"
...     "TCACATCACTGGACATTACTGACACCTTTTGTTAAA"
...     "AAACTAACGTTCTACTGATCAGACCAATCCAAATCA"
...     "CTAGTGAATTCGCG"
... )
>>> GC(dna_seq4)
34.263959390862944
>>> len(dna_seq4)
394
>>> dna_seq4.complement()
Seq('ACCTAACCAGGTAAAATGTCTCACGACTAACNAGGCAAAAATGTCTCACGATNA...CGC')
>>> dna_seq4.reverse_complement()

Seq('CGCGAATTCACTAGTGATTTGGATTGGTCTGATCAGTAGAACGTTAGTTTTTTA...CCA')
>>> dna_seq4.transcribe()
Seq('UGGAUUGGUCCAUUUUACAGAGUGCUGAUUGNUCCGUUUUUACAGAGUGCUANU...GCG')
>>> dna_seq4.translate()
Seq('WIGPFYRVLIXPFLQSAXWCVYKALARQKISPSPHWTQEVQLASPLKTF*IKKI...*IR')

IV. CONCLUSION

Biopython is an open-source programming toolkit used in the development of bioinformatics software and in bioinformatics scripting. Computational biology helps to analyze the mechanisms of various diseases or clinical conditions and accelerates the research process. Bioinformatics introduces the application of a brain tumor detection algorithm using machine learning techniques. For this purpose, a number of tools and software were used to understand the genomic and molecular-level information of the disease, its specific traits and evolutionary relationships, to study gene prediction and protein annotation, and to perform an exploratory data analysis. This study explains the importance of bioinformatics and Biopython using data from patients with brain tumors.

REFERENCES

[1] Rossum, Guido Van (20 January 2009). "The History of Python: A Brief Timeline of Python". The History of Python. Archived from the original on 5 June 2020. Retrieved 5 March 2021.
[2] Chapman, Brad; Chang, Jeff (August 2000). "Biopython: Python tools for computational biology". ACM SIGBIO Newsletter. 20 (2): 15–19. doi:10.1145/360262.360268. S2CID 9417766.
[3] T. E. Oliphant, "Python for Scientific Computing," in Computing in Science & Engineering, vol. 9, no. 3, pp. 10-20, May-June 2007, doi:10.1109/MCSE.2007.58.
[4] Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JG, Korf I, Lapp H, Lehväslaiho H, Matsalla C, Mungall CJ, Osborne BI, Pocock MR, Schattner P, Senger M, Stein LD, Stupka E, Wilkinson MD, Birney E. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 2002 Oct;12(10):1611-8. doi: 10.1101/gr.361602. PMID: 12368254; PMCID: PMC187536.
[5] Holland RC, Down TA, Pocock M, Prlić A, Huen D, James K, Foisy S, Dräger A, Yates A, Heuer M, Schreiber MJ. BioJava: an open-source framework for bioinformatics. Bioinformatics. 2008 Sep 15;24(18):2096-7. doi: 10.1093/bioinformatics/btn397. Epub 2008 Aug 8. PMID: 18689808; PMCID: PMC2530884.
[6] Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, Friedberg I, Hamelryck T, Kauff F, Wilczynski B, de Hoon MJ. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009 Jun 1;25(11):1422-3. doi: 10.1093/bioinformatics/btp163. Epub 2009 Mar 20. PMID: 19304878; PMCID: PMC2682512.
[7] Talevich E, Invergo BM, Cock PJ, Chapman BA. Bio.Phylo: a unified toolkit for processing, analyzing and visualizing phylogenetic trees in Biopython. BMC Bioinformatics. 2012 Aug 21;13:209. doi: 10.1186/1471-2105-13-209. PMID: 22909249; PMCID: PMC3468381.
[8] Dorsey JF, Salinas RD, Dang M, et al. Cancer of the central nervous system. In: Niederhuber JE, Armitage JO, Doroshow JH, Kastan MB, Tepper JE, eds. Abeloff's Clinical Oncology. 6th ed. Philadelphia, PA: Elsevier; 2020:chap 63.
[9] Brain tumor. Cancer.Net. https://www.cancer.net/cancer-types/braintumor/view-all. Accessed Nov. 1, 2022.
[10] Louis DN, Perry A, Wesseling P, Brat DJ, Cree IA, Figarella-Branger D, Hawkins C, Ng HK, Pfister SM, Reifenberger G, Soffietti R, von Deimling A, Ellison DW. The 2021 WHO Classification of Tumors of the Central Nervous System: a summary. Neuro Oncol. 2021 Aug 2;23(8):1231-1251. doi: 10.1093/neuonc/noab106. PMID: 34185076; PMCID: PMC8328013.
[11] Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P, Ellison DW. The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathol. 2016 Jun;131(6):803-20. doi: 10.1007/s00401-016-1545-1. Epub 2016 May 9. PMID: 27157931.
[12] Louis DN, et al. Classification and pathologic diagnosis of gliomas, glioneuronal tumors and neuronal tumors. https://www.uptodate.com/contents/search. Accessed June 10, 2022.
[13] Young RM, Jamshidi A, Davis G, Sherman JH (June 2015). "Current trends in the surgical management and treatment of adult glioblastoma". Annals of Translational Medicine. 3 (9): 121. doi:10.3978/j.issn.2305-5839.2015.05.10. PMC 4481356. PMID 26207249.
[14] Vinita Kukreja, Uma Kumari "Genome Annotation of Brain Cancer and Structure Analysis by applying Drug Designing Technique", International Journal of Emerging Technologies and Innovative Research, Vol.9, Issue 5, page no.k473-k479, May-2022.
[15] Mukherjee S, Tucker-Burden C, Kaissi E, Newsam A, Duggireddy H, Chau M, Zhang C, Diwedi B, Rupji M, Seby S, Kowalski J, Kong J, Read R, Brat DJ. CDK5 Inhibition Resolves PKA/cAMP-Independent Activation of CREB1 Signaling in Glioma Stem Cells. Cell Rep. 2018 May 8;23(6):1651-1664. doi: 10.1016/j.celrep.2018.04.016. PMID: 29742423; PMCID: PMC5987254.

[16] World Cancer Report 2014. World Health Organization. 2014. Chapter 5.16. ISBN 978-9283204299. Archived from the original on September 19, 2016. https://www.who.int/cancer/publications/WRC_2014/en/.
[17] Gallego O (August 2015). "Nonsurgical treatment of
recurrent glioblastoma". Current Oncology. 22 (4):
e273–81. doi:10.3747/co.22.2436. PMC 4530825.
PMID 26300678.
[18] Vilchez RA, Kozinetz CA, Arrington AS, Madden
CR, Butel JS (June 2003). "Simian virus 40 in human
cancers". The American Journal of Medicine. 114 (8):
675–84. doi:10.1016/S0002-9343(03)00087-1. PMID
12798456.
[19] Crawford JR, Santi MR, Thorarinsdottir HK,
Cornelison R, Rushing EJ, Zhang H, et al. (September
2009). "Detection of human herpesvirus-6 variants in
pediatric brain tumors: association of viral antigen in
low grade gliomas". Journal of Clinical Virology. 46
(1): 37–42. doi:10.1016/j.jcv.2009.05.011. PMC
2749001. PMID 19505845.
[20] Chi J, Gu B, Zhang C, Peng G, Zhou F, Chen Y, et al.
(November 2012). "Human herpesvirus 6 latent
infection in patients with glioma". The Journal of
Infectious Diseases. 206 (9): 1394–98.
doi:10.1093/infdis/jis513. PMID 22962688.
[21] McFaline-Figueroa JR, Wen PY (February 2017).
"The Viral Connection to Glioblastoma". Current
Infectious Disease Reports. 19 (2): 5.
doi:10.1007/s11908-017-0563-z. PMID 28233187.
S2CID 30446699.
[22] Reifenberger G, Liu L, Ichimura K, Schmidt EE,
Collins VP. Amplification and overexpression of the
MDM2 gene in a subset of human malignant gliomas
without p53 mutations. Cancer Res. 1993 Jun
15;53(12):2736-9. PMID: 8504413.
[23] Verhaak RG, Hoadley KA, Purdom E, Wang V, Qi Y,
Wilkerson MD, Miller CR, Ding L, Golub T, Mesirov
JP, Alexe G, Lawrence M, O'Kelly M, Tamayo P,
Weir BA, Gabriel S, Winckler W, Gupta S, Jakkula L,
Feiler HS, Hodgson JG, James CD, Sarkaria JN,
Brennan C, Kahn A, Spellman PT, Wilson RK, Speed
TP, Gray JW, Meyerson M, Getz G, Perou CM, Hayes
DN; Cancer Genome Atlas Research Network.
Integrated genomic analysis identifies clinically
relevant subtypes of glioblastoma characterized by
abnormalities in PDGFRA, IDH1, EGFR, and NF1.
Cancer Cell. 2010 Jan 19;17(1):98-110. doi:
10.1016/j.ccr.2009.12.020. PMID: 20129251; PMCID:
PMC2818769.
[24] Uma Kumari “Insilico analysis and computer aided
drug designing approach for mutant cancer gene”
(IJBTR) DEC 2021 (Impact factor 6.6, ICV 61.5,
NASS RATING 3.8).
[25] Shubhi Bindal, Uma Kumari et al “Homology
Modeling and Drug Designing Approach for
Prospective Malignant Brain Cancer” IJIRT 2022
Volume 9, Issue 6, Pages 528-534.
