Gene Selection and Classification of Microarray Data Using Convolutional Neural Network
Abstract— Gene expression profiles can be generated in large quantities by utilizing microarray techniques. Currently, the task of diagnosing diseases relies on gene expression data. One of the techniques that helps in this task is the use of deep learning algorithms. Such algorithms are effective in the identification and classification of informative genes, which may subsequently be used in predicting the classes of testing samples. In cancer identification, microarray data typically possess a minimal number of samples together with a huge collection of features derived from gene expression data. Lately, applications of deep learning algorithms have been gaining much attention for solving various challenges in the artificial intelligence field. In the present study, we investigated a deep learning algorithm based on the convolutional neural network (CNN) for the classification of microarray data. In comparison to similar techniques, such as multiple Support Vector Machine Recursive Feature Elimination with improved Random Forest (mSVM-RFE-iRF) and varSelRF, CNN did not show superior performance on all the data. However, most of the experimental results on the cancer datasets indicated that CNN is superior in terms of accuracy and of minimizing the number of genes used in classifying cancer, compared with the hybrid mSVM-RFE-iRF.

Keywords: Deep Learning; Convolutional Neural Network (CNN); Microarray Cancer Data; Classification

I. INTRODUCTION

Microarray data are widely used in prognosis, treatment, and disease classification through a variety of gene selection and classification methods [1].

Since cancer diagnosis has seen many applications of microarrays, particularly on gene expression profiles, scholars have also begun exploring data analysis using this technology. This is attributed to its effectiveness in discovering abnormal and normal tissue patterns in a speedier time, as microarrays scale well to large datasets. Microarray analysis is an attractive research avenue, as the technology is typically utilized to investigate datasets with high dimensionality, which demands significant memory and processor requirements [2].

There remains room for improvement in classifying microarray data, as the technology struggles with small sample collections yet a large quantity of features. Selection of suitable features is the key in this field, as numerous research endeavors aim to minimize data dimensionality with improved classification performance [3]. In the case of classifying cancer cells, numerous machine learning algorithms struggle because the number of workable samples is significantly lower than the gene count. This situation affects efficiency and effectiveness owing to the large data dimensionality, which impairs classification performance [4].

Convolutional neural network (CNN) is an instance of a deep learning strategy that mimics brain function in processing information [5]. In this paper, a multilayered CNN, which is a deep learning algorithm, is proposed to classify microarray cancer data in the identification of the type of cancer. CNN is proposed due to its ability to deal with insufficient data and to boost classification performance. In addition, CNN is also powerful in integrating cancer datasets that are strongly linked, which improves performance in classifying data. This is attributed to its ability to detect latent characteristics of cancer from comparable types. The organization of the present paper is as follows. Section 2 elaborates related works and definitions. Section 3 elaborates the methods. Section 4 describes the selected datasets, the proposed architecture, the evaluation techniques, and the benchmark. Section 5 presents the results and discussion. Lastly, the conclusions of this paper are given.

II. RELATED WORKS AND DEFINITIONS

In this section, microarray data, machine learning, and CNN algorithms, along with related works, are reviewed.

A. Microarray data classification

Microarray gene expression data have been utilized in past research to perform cancer type classification using machine learning strategies. Decision tree (DT) was the most primitive machine learning strategy introduced for comparing human proteins to informative genes in proteins containing diseases [6]. Diagnoses of cancer have been largely assisted by exploring gene expression data with the technology available in the microarray technique, which enables genes to be measured simultaneously in large quantities. In assessing significant genes, parametric statistical analysis has typically been employed to establish statistical significance [7].

In the literature, numerous algorithms and mathematical models have been constructed and proposed to interpret and analyze gene expression data. In analyzing gene expression data, the two dominant strategies that have been focused on are clustering and classification [8]. Additionally, numerous techniques have been executed previously in classifying gene expression data, including k-nearest neighbors (k-NN) [9], Support Vector Machines (SVM) [10],
Multilayer perceptron (MLP) [11], and variants of Artificial Neural Networks (ANNs) [12].

The breast cancer and leukemia datasets were used in [13] for performing selection of informative features from gene expression data. The work assessed the accuracy of the proposed selection technique, and the researchers concluded that the k-NN classifier performed better than random forest in terms of classification accuracy [13]. The authors of [14] reported a Random Survival Forest strategy for selecting informative genes by means of eliminating non-informative genes iteratively.

A hybrid of particle swarm optimization and decision tree (PSOC4.5) was proposed for classifying informative genes from cancer datasets. The proposed classification strategy allows non-informative genes to be overlooked, which can successfully lead to cancer identification. The work reported superior accuracy of the proposed classifier [15].

Another work built a hybrid classifier comprising particle swarm optimization (PSO) and an adaptive K-nearest neighbourhood (KNN) technique for selecting informative genes. The proposed work identifies a handful of genes that meet the criteria of classification [16].

B. Deep Learning (DL)

Deep Learning (DL) is concerned with processing information utilizing deep networks and is a part of machine learning approaches. In its earliest appearance in 1943, DL was termed "cybernetics" by McCulloch and Pitts [17]. Researchers have been drawn to DL owing to its capability, as well as its characteristic of mimicking the way the brain processes information prior to making decisions. DL is constructed to process information either via unsupervised or supervised approaches, whereby learning is conducted on multilayered features and representations. Numerous breakthroughs have been reported on DL, relating to improved solutions and solved problems, attributed to highly advanced computation models. Due to its capability of learning multilayered representations, DL is superior in drawing outcomes from complex problems. In this sense, DL is the most advanced approach for capturing and processing abstractions of data in several layers. Such characteristics present DL as a suitable approach to be considered in analyzing and studying gene expression data. The ability to learn multilayered representations makes DL a versatile strategy for producing more accurate results in a much speedier time. Multilayered representation is a component that forms the overall architecture of deep learning [18].

ML and DL differ in terms of performance depending on the quantity of data. In learning datasets with low dimensionality, DL is ineffective, as it requires data with high dimensionality in order for learning to be carried out [19].

C. Deep learning Convolutional Neural Network (CNN)

A type of artificial neural network, the Convolutional Neural Network (CNN) is capable of extracting local features in data. CNN simplifies the network model by sharing weights within a single feature map, which allows the overall number of weights to be reduced. These characteristics have resulted in a widespread utilization of CNN in the pattern recognition field [20, 21].

An early document reading system used a CNN trained jointly with a probabilistic model that implemented language constraints. By the late 1990s this system was reading over 10% of all the cheques in the United States. A number of CNN-based optical character recognition and handwriting recognition systems were later deployed by Microsoft [22]. CNN was also experimented with in the early 1990s for object detection in natural images, including faces and hands [23, 24].

CNN performance in prediction was measured against k-NN in the task of classifying materials. Features fed into the algorithms were processed utilizing Local Binary Patterns in different variants. CNN produced more accurate classification (95%) compared to a hybrid of k-NN and a feature extractor (83%) [25].

A deep learning approach comprising multi-task learning and transfer learning was applied in analyzing images of biological components [6, 14]. On the other hand, a deep learning algorithm based on CNN was proposed by [26], with reported results surpassing existing ML strategies. The proposed work won the researchers accolades in a visual recognition challenge.

The earliest uses of CNN concerned classifying images, particularly segmenting and grouping images [27, 28] in the medical domain with superior accuracy. Apart from that, researchers have also implemented CNN in different domains, including facial recognition [29] and the examination of documents.

A CNN has three main components: 1) the input layer, 2) the latent (hidden) layers, and 3) the output layer. The latent layers may be categorized as fully-connected layers, pooling layers, or convolutional layers. Figure 1 shows these layers, adapted from [30]:

Fig. 1. The pipeline of the general CNN architecture [30].

1. The convolutional layer is essentially the primary layer in the CNN architecture. The process of convolution concerns the iterative execution of a specific function toward the output of a different function [31]. This layer consists of numerous maps of neurons, described as feature maps or filters, which are relatively identical in size to the dimensionality of the input data. Neural reactivity is interpreted through quantifying a discrete convolution over the receptive field; the quantification deals with calculating the total weighted input of a neuron and applying an activation function.
2. The max pooling layer concerns producing several grids by splitting the convolutional layer's output. The maximum values of the grids are sequenced in matrices [31]. Operators are utilized in performing a computation on each matrix in order to obtain the average or maximum value.
3. The fully connected layer constitutes almost the complete CNN, comprising 90% of the overall CNN architectural parameters. The layer enables input to be transmitted through the network with pre-set vector lengths [26]. Dimensional data are transformed by the layer prior to classification. The convolutional layer's output also undergoes transformation, which allows information integrity to be retained.
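To make the convolution and max-pooling operations in items 1 and 2 concrete, the following minimal NumPy sketch (illustrative only, not taken from the paper; the array sizes and the single random filter are arbitrary) computes one feature map with a ReLU activation and its 2×2 max-pooled output:

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Discrete 2-D convolution as used in CNNs (cross-correlation, 'valid' padding):
    each output cell is the weighted sum of the receptive field under the kernel."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling: keep the maximum value of each grid cell."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))       # toy 2-D input (e.g. reshaped expression values)
kernel = rng.standard_normal((3, 3))      # one 3x3 filter
feature_map = np.maximum(conv2d_valid(image, kernel), 0)  # ReLU activation
pooled = max_pool2d(feature_map, size=2)  # 6x6 feature map -> 3x3 grid of maxima
print(feature_map.shape, pooled.shape)    # (6, 6) (3, 3)
```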
Fig. 2. Proposed general methodology.

A. Proposed CNN Architecture

The CNN model is configured in this paper upon completion of data collection and preprocessing. A convolutional CNN is selected, comprising convolutional and fully connected layers. The convolutional layer is chosen as a default, as the architecture is capable of dealing with data of high and multiple dimensionalities, such as gene expression data and 2D images. The principles of Krizhevsky et al. [26] were applied in the construction of the CNN architecture. In this paper, a new system has been proposed with 2-dimensional convolutions. The filter size is 64 kernels of size 3×3, and the ReLU non-linearity activation has been used with the convolutional layer. In the fully connected layer, the filter size is 128 kernels of size 2×2. The system has been trained with a testing and training split of 30% and 70% of the data, respectively.
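A minimal Keras sketch of the configuration described above is given below. This is not the authors' code: the input shape, the number of classes, the placement of the max-pooling layer, the interpretation of the 128-kernel 2×2 layer as a convolution feeding the fully connected output, and the random placeholder data are illustrative assumptions; ADADELTA is used as the optimizer, as reported in the experimental setup of Section V.

```python
# Hypothetical sketch of the described CNN (not the authors' code).
# Assumed: gene expression vectors reshaped into a 2-D grid, 5 output classes.
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

n_samples, grid = 42, (75, 75)                      # e.g. ~5597 genes padded into a 75x75 grid
X = np.random.rand(n_samples, grid[0], grid[1], 1)  # placeholder expression data
y = np.eye(5)[np.random.randint(0, 5, n_samples)]   # placeholder one-hot class labels

# 70% training / 30% testing split, as stated in the paper
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=1)

model = Sequential([
    Conv2D(64, (3, 3), activation='relu',
           input_shape=(grid[0], grid[1], 1)),  # 64 kernels of size 3x3 with ReLU
    MaxPooling2D(pool_size=(2, 2)),             # assumed pooling stage
    Conv2D(128, (2, 2), activation='relu'),     # 128 kernels of size 2x2 feeding the dense part
    Flatten(),
    Dense(y.shape[1], activation='softmax')     # fully connected output layer
])
model.compile(optimizer='adadelta', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=8, validation_data=(X_test, y_test))
```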
B. Evaluation Technique

In assessing the proposed deep learning CNN, ten cancer datasets were tested. These data were used in training the classification. The mean accuracy was obtained by averaging the accuracy scores over the data, which eliminates concerns about redundant tests and optimizes the utilization of the data that have been obtained. In this paper, accuracy is the measure of performance for the proposed CNN; to evaluate the performance, the accuracy of the result is calculated according to [6].
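The accuracy equation itself is not reproduced in the extracted text; the standard definition of classification accuracy, which the description above appears to rely on, can be written as:

\[
\mathrm{Accuracy} = \frac{\text{number of correctly classified test samples}}{\text{total number of test samples}} \times 100\%
\]

For the binary (two-class) datasets this is equivalent to \( \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \), where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.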
C. Benchmarks

In benchmarking the results of the proposed CNN, the accuracy performance of the proposed work is benchmarked against mSVM-RFE-iRF [33] and varSelRF [34, 35] on the selected cancer datasets. Based on the results, the proposed CNN performed with superior accuracy, indicating its ability to improve classification through accurate gene selection. Table II lists the accuracy performances of the aforementioned methods.

IV. DATASETS

In this study, ten cancer datasets are used. The datasets contain gene expression profiles that are extracted utilizing microarray technology. Pre-processing is required prior to the use of the datasets. The files are stored in .RDA format, which can be accessed by utilizing a software suite supporting the R package. All of the gene profiles of tumor-inflicted and normal patients were encoded in binary format, described as different-class datasets. Each dataset is provided with a class file and a data file. The data file stores values in numeric format, arranged in rows and columns: each column indicates a patient number and each row indicates a gene number in the cancer dataset. Table I lists the description of the datasets used in this paper in terms of the number of genes, the number of samples, and the number of classes.
The CNN algorithm is implemented on the selected cancer datasets. Each dataset stores numerous categories of cancer. The data file primarily stores cancer data of multiple classes that are obtained from microarray technology.

TABLE I. THE MAIN CHARACTERISTICS OF THE CANCER DATASETS USED IN THIS RESEARCH

Data set          #genes   #samples   #classes
Brain               5597       42         5
Breast2            13321      286         4
Breast3             1509      264         4
Colon               2000       62         2
Leukemia            3571       72         2
Lymphoma            4026       62         3
Prostate            6033      102         2
Srbct               2308       63         4
Lung (michigan)     5217       86         2
Lung (boston)       5217       62         2
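As an illustration of how the .RDA data files described above can be loaded outside of R, the sketch below uses the pyreadr package; the file name, the stored object layout, and the transposition step are assumptions made for illustration, since the paper only states that rows correspond to genes and columns to patients.

```python
# Hypothetical loading sketch (not from the paper). Assumes the pyreadr package
# and a file name such as "Colon.RDA"; adjust to the actual dataset files.
import pyreadr

result = pyreadr.read_r("Colon.RDA")      # dict-like mapping of R object names to DataFrames
name, data = next(iter(result.items()))   # first R object stored in the file
print(name, data.shape)                   # expected layout: genes as rows, patients as columns

X = data.T.values                         # transpose so each row becomes one patient sample
print(X.shape)                            # (number of patients, number of genes)
```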
V. RESULTS AND DISCUSSION

The proposed CNN was executed in Theano [36], which hosts an environment for constructing deep learning software, with the Keras library [37] built on top of it. Initially, the neuron weights were assigned based on the default settings in Keras. ADADELTA [38], an adaptive learning rate method, was utilized in training the deep network layers. A MacBook Pro with a Core i5-3210M CPU and 8 GB of memory was used to execute the classification training of the proposed CNN. The time taken to train and test the network was 12 days, utilizing Python packages.

Analysis of Variance (ANOVA) was utilized as the statistical analysis for establishing the statistical significance of the accuracy in the classification of the 10 types of cancer datasets. The ANOVA comparison is illustrated in Fig. 4. Based on the classification accuracy obtained, ANOVA indicates that there exists a statistically significant difference among the 10 types of cancer datasets, with p = 3.4 × 10^-22.

TABLE II. COMPARISON OF CLASSIFICATION ACCURACY FOR CNN AND HYBRID MSVM-RFE-IRF.

For the proposed CNN, the Brain dataset yielded 92.14% in mean classification accuracy, scoring the highest accuracy at 97.62%, with 15.65% variance. On the other hand, the Breast2 dataset yielded 34.97% in mean classification accuracy, scoring the highest accuracy at 41.26%, with 8.52% variance. Meanwhile, the Breast3 dataset yielded 92.90% in mean classification accuracy, scoring the highest accuracy at 97.69%, with 10.27% variance. The Colon dataset, meanwhile, yielded 57.34% in mean classification accuracy, scoring the highest accuracy at 64.52%, with 11.61% variance. The Leukemia dataset yielded 95.69% in mean classification accuracy, scoring the highest accuracy at 100.00%, with 13.30% variance. The Lymphoma dataset, on the other hand, yielded 100.00% in mean classification accuracy, scoring the highest accuracy at 100.00%, with 1.73% variance.

Next, the Prostate dataset yielded 76.62% in mean classification accuracy, scoring the highest accuracy at 91.86%, with 16.91% variance. The SRBCT dataset yielded 98.02% in mean classification accuracy, scoring the highest accuracy at 100.00%, with 19.92% variance. Next, the LungMichigan dataset yielded 62.27% in mean classification accuracy, scoring the highest accuracy at 72.09%, with 17.67% variance. Lastly, the LungBoston dataset yielded 50.00% in mean classification accuracy, scoring the highest accuracy at 50.32%, with 0.00% variance.

The classification accuracy results obtained on the 10 cancer datasets for the proposed CNN are subsequently compared against the hybrid mSVM-RFE-iRF [33] and varSelRF [34, 35]. The results are listed in Table II, with the highlighted cells signifying the superior method with the highest classification accuracy. The overall comparison of accuracies is tabulated to represent the overall findings of this study.

Overall, the proposed CNN scored higher accuracies in comparison to mSVM-RFE-iRF and varSelRF on seven cancer datasets: Brain, Breast3, Leukemia, Lymphoma, SRBCT, LungMichigan, and LungBoston.
Fig. 5. ANOVA of classification accuracy for CNN, mSVM-RFE-iRF and varSelRF.

Based on the ANOVA analysis, a statistically significant difference exists between the three methods, as indicated by p = 0.007. Fig. 5 illustrates the results of the ANOVA analysis of CNN against the hybrid mSVM-RFE-iRF and varSelRF. The proposed CNN scored a mean classification accuracy of 94.74%, with a best accuracy performance of 100.00% and a variance of 38.03%. Meanwhile, mSVM-RFE-iRF recorded the second highest accuracy performance, scoring a mean classification accuracy of 85.82%, with a best accuracy performance of 95.55% and a variance of 15.26%. Lastly, varSelRF recorded the third best accuracy performance, scoring a mean classification accuracy of 79.58%, with a best accuracy performance of 93.07% and a variance of 26.89%.
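The reported comparison corresponds to a standard one-way ANOVA over per-dataset accuracy scores. The sketch below illustrates the computation with scipy; the CNN list reuses the per-dataset best accuracies quoted above, while the two benchmark lists are placeholder values, since the per-dataset scores of mSVM-RFE-iRF and varSelRF are not reproduced in the extracted text.

```python
# One-way ANOVA across the per-dataset accuracy scores of the three methods.
from scipy import stats

cnn_acc      = [97.62, 41.26, 97.69, 64.52, 100.00, 100.00, 91.86, 100.00, 72.09, 50.32]
msvm_rfe_irf = [95.0, 60.0, 90.0, 85.0, 92.0, 88.0, 86.0, 90.0, 75.0, 70.0]   # placeholder values
varselrf     = [85.0, 55.0, 82.0, 80.0, 88.0, 84.0, 78.0, 83.0, 70.0, 65.0]   # placeholder values

f_stat, p_value = stats.f_oneway(cnn_acc, msvm_rfe_irf, varselrf)
print(f"F = {f_stat:.3f}, p = {p_value:.4g}")  # the paper reports p = 0.007 for this comparison
```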
REFERENCES

[1] T. W. Shi, K. Moorthy, M. S. Mohamad, S. Deris, S. Omatu and M. Yoshioka. Random Forest and Gene Ontology for functional analysis of microarray data. In Computational Intelligence and Applications (CIA), 2014 IEEE 7th International Workshop on. 2014. IEEE.
[2] Koschmieder, A., Zimmermann, K., Trißl, S., Stoltmann, T., & Leser, U. (2011). Tools for managing and analyzing microarray data. Briefings in Bioinformatics, 13(1), 46-60.
[3] Bolón-Canedo, V., Sánchez-Maroño, N., Alonso-Betanzos, A., Benítez, J. M., & Herrera, F. (2014). A review of microarray datasets and applied feature selection methods. Information Sciences, 282, 111-135.
[4] Tomašev, N., Radovanović, M., Mladenić, D., & Ivanović, M. (2013). The role of hubness in clustering high-dimensional data. IEEE Transactions on Knowledge & Data Engineering, (1), 1.
[5] Zeng, T., & Ji, S. (2015, November). Deep convolutional neural networks for multi-instance multi-task learning. In Data Mining (ICDM), 2015 IEEE International Conference on (pp. 579-588). IEEE.
[6] Qing Liao, Lin Jiang, Xuan Wang, Chunkai Zhang and Ye Ding. Cancer Classification with Multi-task Deep Learning. 2017.
[7] Lee, C. P., Lin, W. S., Chen, Y. M., & Kuo, B. J. (2011). Gene selection and sample classification on microarray data based on adaptive genetic algorithm/k-nearest neighbor method. Expert Systems with Applications, 38(5), 4661-4667.
[8] Wang, H., Meghawat, A., Morency, L. P., & Xing, E. P. (2016). Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis. arXiv preprint arXiv:1609.05244.
[9] Chao Li, Shuheng Zhang, Huan Zhang, Lifang Pang, Kinman Lam, Chun Hui, and Su Zhang. Using the k-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Computational and Mathematical Methods in Medicine, 2012.
[10] Furey, T. S., Cristianini, N., Duffy, N., Bednarski, D. W., Schummer, M., & Haussler, D. (2000). Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16(10), 906-914.
[11] Zuyi Wang, Yue Wang, Jianhua Xuan, Yibin Dong, Marina Bakay, Yuanjian Feng, Robert Clarke, and Eric P. Hoffman. Optimized
multilayer perceptrons for molecular classification and diagnosis using genomic data. Bioinformatics, 2006. 22(6): p. 755-761.
[12] Asyali, M. H., Colak, D., Demirkaya, O., & Inan, M. S. (2006). Gene expression profile classification: a review. Current Bioinformatics, 1(1), 55-73.
[13] Kumar, C.A., M. Sooraj, and S. Ramakrishnan, A Comparative Performance Evaluation of Supervised Feature Selection Algorithms on Microarray Datasets. Procedia Computer Science, 2017. 115: p. 209-217.
[14] Pang, H., George, S. L., Hui, K., & Tong, T. (2012). Gene selection using iterative feature elimination random forests for survival outcomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 9(5), 1422-1431.
[15] Pang, H., George, S. L., Hui, K., & Tong, T. (2012). Gene selection using iterative feature elimination random forests for survival outcomes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 9(5), 1422-1431.
[16] Kar, S., K.D. Sharma, and M. Maitra, Gene selection from microarray gene expression data for classification of cancer subgroups employing PSO and adaptive K-nearest neighborhood technique. Expert Systems with Applications, 2015. 42(1): p. 612-627.
[17] McCulloch, W.S. and W. Pitts, A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 1943. 5(4): p. 115-133.
[18] Bianchini, M. and F. Scarselli, On the complexity of neural network classifiers: A comparison between shallow and deep architectures. IEEE Transactions on Neural Networks and Learning Systems, 2014. 25(8): p. 1553-1565.
[19] Wang, H., Meghawat, A., Morency, L. P., & Xing, E. P. (2016). Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis. arXiv preprint arXiv:1609.05244.
[20] Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P. A. (2008, July). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (pp. 1096-1103). ACM.
[21] Huang, F.J. and Y. LeCun. Large-scale learning with SVM and convolutional nets for generic object categorization. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. 2006. IEEE.
[22] Simard, P.Y., D. Steinkraus, and J.C. Platt. Best practices for convolutional neural networks applied to visual document analysis. In ICDAR. 2003.
[23] Vaillant, R., C. Monrocq, and Y. Le Cun, Original approach for the localisation of objects in images. IEE Proceedings - Vision, Image and Signal Processing, 1994. 141(4): p. 245-250.
[24] Nowlan, S.J. and J.C. Platt, A convolutional neural network hand tracker. Advances in Neural Information Processing Systems, 1995: p. 901-908.
[25] Muja, M. and D.G. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration. VISAPP (1), 2009. 2(331-340): p. 2.
[26] Krizhevsky, A., I. Sutskever, and G.E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. 2012.
[27] Faro, A., Giordano, D., Spampinato, C., & Pennisi, M. (2010). Statistical texture analysis of MRI images to classify patients affected by multiple sclerosis. In XII Mediterranean Conference on Medical and Biological Engineering and Computing 2010 (pp. 272-275). Springer, Berlin, Heidelberg.
[28] Pereira, S., Pinto, A., Alves, V., & Silva, C. A. (2016). Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Medical Imaging, 35(5), 1240-1251.
[29] Tivive, F.H.C. and A. Bouzerdoum. A new class of convolutional neural networks (SICoNNets) and their application of face detection. In Neural Networks, 2003. Proceedings of the International Joint Conference on. 2003. IEEE.
[30] Liu, Y. and X. An. A classification model for the prostate cancer based on deep learning. In Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2017 10th International Congress on. 2017. IEEE.
[31] Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., & Lew, M. S. (2016). Deep learning for visual understanding: A review. Neurocomputing, 187, 27-48.
[32] Willett, P., Wilton, D., Hartzoulakis, B., Tang, R., Ford, J., & Madge, D. (2007). Prediction of ion channel activity using binary kernel discrimination. Journal of Chemical Information and Modeling, 47(5), 1961-1966.
[33] Moorthy, K., Improved Random Forest with Multiple Support Vector Machine for Gene Selection and Classification of Microarray Data. 2015, Universiti Teknologi Malaysia.
[34] Díaz-Uriarte, R. and S.A. De Andres, Gene selection and classification of microarray data using random forest. BMC Bioinformatics, 2006. 7(1): p. 3.
[35] Huerta, E.B., B. Duval, and J.-K. Hao. A hybrid GA/SVM approach for gene selection and classification of microarray data. In Workshops on Applications of Evolutionary Computation. 2006. Springer.
[36] Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I., Bergeron, A., ... & Bengio, Y. (2012). Theano: new features and speed improvements. arXiv preprint arXiv:1211.5590.
[37] Chollet, F. (2015). Keras.
[38] Zeiler, M.D., ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701, 2012.