* Corresponding Author:
Hamed Tabesh
Department of Medical Informatics, Faculty of Medicine,
Mashhad University of Medical Sciences, Mashhad, Iran
Phone: +98 51 38002536;
Fax: +98 51 38002445.
E-mail: [email protected]
Abstract
With changing lifestyles and rising cancer incidence, accurate diagnosis has become a critical medical task. Today, DNA microarrays are widely used in cancer diagnosis and screening because they can measure gene expression levels. Because of the high dimensionality of gene expression data, analyzing it with common statistical methods is unsuitable, so this study aims to apply newer techniques to the diagnosis of acute myeloid leukemia.
In this study, the leukemia microarray gene data, containing 22,283 genes, was extracted from the Gene Expression Omnibus repository. Initial preprocessing was performed in Python using a normality test and principal component analysis (PCA). A deep neural network (DNN) was then designed and applied to the data, and the results were finally cross-validated with classifiers.
The normality test did not reject normality (P>0.05), and the results show the potential of PCA for gene segregation and for separating cancerous and healthy cells. The classification accuracies of a single-layer neural network and of a DNN with three hidden layers were 63.33% and 96.67%, respectively.
Newer methods such as deep learning can improve diagnostic accuracy and performance compared to older methods. We recommend these methods for cancer diagnosis and for selecting effective genes in various types of cancer.
Journal of Medicine and Life Vol. 13, Issue 3, July-September 2020, pp. 382–387
there is no specific way considered the best for analyzing microarrays [2]. Recently, expert systems for diagnosing cancerous gene data have been increasing, and machine learning techniques are used more and more. Machine learning can help automate the process and make it more intelligent, improving development and accuracy while reducing costs [12]. Machine learning, ensemble methods, and deep learning show high performance in classifying biological data [13-16].
Acute myeloid leukemia (AML) is a type of cancer that starts in the bone marrow but, in most cases, moves into the blood very quickly. This type of cancer worsens fast if left untreated [17]. In this study, we used neural networks and deep learning to classify healthy and cancerous cells based on leukemia-related genes.

Artificial neural network

An artificial neural network is a computational and algorithmic model inspired by the structure and functional aspects of biological neural networks and the concept of neurons. It is considered one of the nonlinear statistical data modeling tools and is used for pattern recognition and for modeling complex relations between inputs and outputs. It consists of simple units that work in parallel. Weighting between units is the primary way to store information long-term, and new information is learned by updating the weights.
A neuron of the human nervous system consists of dendrites, a single axon, a soma, and a nucleus, as shown in Figure 1.

Figure 1: Structure of a typical neuron.

A dendrite receives electrochemical impulses from other neurons. The soma processes these signals. The output is transmitted along the axon to the terminal dendrites, where new impulses are sent to the next neuron. An artificial neural network works the same way with three layers: the input layer receives data (dendrite), the hidden layer processes data (soma and axon), and finally the processed data is sent to the output layer (synapse) (Figure 2) [18-21].
The neural network's behavior is shaped by the architecture of that network. Neural network architecture can be defined by:
• the number of neurons;
• the number of layers;
• the types of connections between layers.
The perceptron is one of the simplest neural networks. It is a learning algorithm for a binary classifier, based on a threshold function:

f(x) = 1 if w · x + b > 0, and 0 otherwise

w: vector of real-valued weights; b: bias.

The main configuration of perceptron networks is shown in Figure 3.
Activation functions are used to propagate node outputs from one layer to the next (up to the output layer). An activation function is a function that activates the neuron. There are several types of activation functions, such as Identity, Binary Step, Sigmoid, Tanh, ReLU, Leaky ReLU, and Softmax.
Sigmoid is a widely used activation function that converts unbounded inputs to simple probabilities between 0 and 1; it can compress extreme or outlying values without deleting them. Unlike the sigmoid activation function, Tanh is bound to the (-1, 1) range and deals more easily with negative numbers; it is also a popular and widely used activation function. The Softmax function is a multiclass logistic regression and a generalization of the sigmoid, so it can be applied to multiclass problems (rather than only binary classification). Rectified linear units (ReLU) reflect more recent advances and have proven to work well in many conditions. ReLU is bound to the [0, inf) range, and its characteristics make the network lighter and more efficient. These activation functions also show better performance than Sigmoid on training data. Recent studies show that deep learning networks
using ReLU are able to train well without preprocessing techniques.
Loss functions determine how close a trained neural network is to reality. The measured inconsistency between the real and predicted values is considered the error; the mean error value represents the difference between the real world and the neural network. There are many common loss functions, such as MAE, MAPE, MSLE, and MSE [19].

Learning rate

The learning rate determines how much the neural network's values change with new training data. It is set before the learning process begins. A low learning rate means more time to train, but a high learning rate makes the network more sensitive to new information [20, 21].

Deep learning

Recently, a new machine learning technique known as deep learning has come into frequent use. New studies show that this approach gives better results than classical machine learning in, for example, drug identification and discovery, image processing, and speech [22-27].
Deep learning is defined as a neural network with a large number of parameters and layers. In fact, it is a class of machine learning algorithms that uses a hierarchical nonlinear structure in multiple layers to extract features and transformations [19].
Unlike other machine learning methods, which require an expert to extract features, deep learning can act as an automatic feature extractor that transforms low-level features into higher-level abstractions [28]. In addition, deep learning can incorporate momentary, indirect, and minor changes, which leads to higher accuracy than other machine learning methods [29].
Types of deep learning techniques include deep neural networks (DNNs), autoencoder networks (AEs), generative adversarial networks (GANs), recurrent neural networks (RNNs), convolutional neural networks (CNNs), and more [19].

Quality control

Quality control is one of the steps of microarray data analysis, after which it is possible to test and interpret the method. Any negligence in imposing quality control may distort and alter the results significantly, for many reasons, such as the following:
1. The biologist grows the cell culture without knowing that bacteria may live in the cells.
2. There may be fungal or viral contamination.
3. The RNA treatment may not be done well after RNA extraction.
4. Because RNA is a highly unstable molecule that begins to degrade, its quality decreases at room temperature.
5. The sample size is not sufficient, or there is an error in the complementary DNA (cDNA) generation steps.
6. The results are not reliable if something goes wrong during the scanning or hybridization steps.
Biases that occur in the results of studies of genetic data lead to false-positive and false-negative findings. Genes that can separate cancerous and healthy cells indicate that the experiment was done well, so dimension reduction techniques are used to detect the genes that are important for separating these samples. Alternatively, co-expression or correlation between genes or between samples can be measured. The purpose of dimension reduction is to capture the variation in microarray data [30, 31].

PCA

Principal component analysis (PCA) is an analysis for simplifying high-dimensional complexity, including patterns and trends. High-dimensional data is common in biology, where multiple features arise because the expression of many different genes is measured for each sample [32, 33].
26. Yang Y, et al. Deep learning for in vitro prediction of pharmaceutical formulations. Acta Pharmaceutica Sinica B. 2018. https://doi.org/10.1016/j.apsb.2018.09.010
27. Han R, Yang Y, Li X, Ouyang D. Predicting oral disintegrating tablet formulations by neural network techniques.
28. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117.
29. Rost B, Sander C. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins. 1994;19:55-72.
30. Raman T, O'Connor TP, Hackett NR, Wang W, Harvey BG, Attiyeh MA, Dang DT, Teater M, Crystal RG. Quality control in microarray assessment of gene expression in human airway epithelium. BMC Genomics. 2009 Dec;10(1):493.
31. Anderson CA, Pettersson FH, Clarke GM, Cardon LR, Morris AP, Zondervan KT. Data quality control in genetic case-control association studies. Nature Protocols. 2010 Sep;5(9):1564.
32. Jolliffe I. Principal component analysis. Springer Berlin Heidelberg; 2011.
33. Raychaudhuri S, Stuart JM, Altman RB. Principal components analysis to summarize microarray experiments: application to sporulation time series. In: Biocomputing 2000. 1999. pp. 455-466.
34. Quackenbush J. Microarray data normalization and transformation. Nat Genet. 2002;32:496-501.
35. Min S, Lee B, Yoon S. Deep learning in bioinformatics. Briefings in Bioinformatics. 2017 Sep 1;18(5):851-69.
36. Zou J, Huss M, Abid A, Mohammadi P, Torkamani A, Telenti A. A primer on deep learning in genomics. Nature Genetics. 2018 Nov 26:1.
37. Ma L, Ma C, Liu Y, Wang X. Thyroid diagnosis from SPECT images using convolutional neural network with optimization. Computational Intelligence and Neuroscience. 2019;2019.
38. Tomov NS, Tomov S. On deep neural networks for detecting heart disease. arXiv preprint arXiv:1808.07168. 2018 Aug 22.
39. Mohamed AA, Berg WA, Peng H, Luo Y, Jankowitz RC, Wu S. A deep learning method for classifying mammographic breast density categories. Medical Physics. 2018 Jan;45(1):314-21.
40. Abdel-Zaher AM, Eldeib AM. Breast cancer classification using deep belief networks. Expert Systems with Applications. 2016 Mar 15;46:139-44.
41. Xu Y, Dai Z, Chen F, Gao S, Pei J, Lai L. Deep learning for drug-induced liver injury. Journal of Chemical Information and Modeling. 2015 Oct 13;55(10):2085-93.
42. Han R, Yang Y, Li X, Ouyang D. Predicting oral disintegrating tablet formulations by neural network techniques. Asian Journal of Pharmaceutical Sciences. 2018 Jul 1;13(4):336-42.
43. Lusci A, Pollastri G, Baldi P. Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. Journal of Chemical Information and Modeling. 2013 Jul 2;53(7):1563-75.
44. Zeng H, Edwards MD, Liu G, Gifford DK. Convolutional neural network architectures for predicting DNA–protein binding. Bioinformatics. 2016 Jun 11;32(12):i121-7.
45. Sree PK, Rao PS, Devi NU. CDLGP: A novel unsupervised classifier using deep learning for gene prediction. In: 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI). 2017 Sep 21. pp. 2811-2813. IEEE.
46. Lanchantin J, Singh R, Lin Z, Qi Y. Deep motif: Visualizing genomic sequence classifications. arXiv preprint arXiv:1605.01133. 2016 May 4.
47. Zeng H, Edwards MD, Liu G, Gifford DK. Convolutional neural network architectures for predicting DNA–protein binding. Bioinformatics. 2016 Jun 11;32(12):i121-7.
48. Yue T, Wang H. Deep learning for genomics: A concise overview. arXiv preprint arXiv:1802.00810. 2018 Feb 2.
49. Sree PK, Rao PS, Devi NU. CDLGP: A novel unsupervised classifier using deep learning for gene prediction. In: 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI). 2017 Sep 21. pp. 2811-2813. IEEE.
50. Mamoshina P, Vieira A, Putin E, Zhavoronkov A. Applications of deep learning in biomedicine. Molecular Pharmaceutics. 2016 Mar 29;13(5):1445-54.