An Enhanced Monte Carlo Outlier Detection Method

This document summarizes a research paper that proposes an enhanced Monte Carlo outlier detection method. The method establishes cross-prediction models based on determinate normal samples and analyzes the distribution of prediction errors individually for dubious samples. When applied to one simulated and three real datasets, the results indicated this method outperformed traditional Monte Carlo outlier detection in identifying outlier samples. After removing the identified outliers, the predictive performance of the models improved as measured by decreased error terms.

FULL PAPER WWW.C-CHEM.ORG

An Enhanced Monte Carlo Outlier Detection Method

Liangxiao Zhang,*[a,b,c,d] Peiwu Li,*[a,c,d,e] Jin Mao,[a,b,c] Fei Ma,[a,d] Xiaoxia Ding,[a,c]
and Qi Zhang[a,e]

Outlier detection is crucial in building a highly predictive model. In this study, we proposed an enhanced Monte Carlo outlier detection method by establishing cross-prediction models based on determinate normal samples and analyzing the distribution of prediction errors individually for dubious samples. One simulated and three real datasets were used to illustrate and validate the performance of our method, and the results indicated that this method outperformed Monte Carlo outlier detection in outlier diagnosis. After these outliers were removed from the Kovats retention indices dataset used for validation, the root mean square error of prediction decreased from 3.195 to 1.655, and the average cross-validation prediction error decreased from 2.0341 to 1.2780. This method helps establish a good model by eliminating outliers. © 2015 Wiley Periodicals, Inc.

DOI: 10.1002/jcc.24026

[a] L. Zhang, P. Li, J. Mao, F. Ma, X. Ding, Q. Zhang
Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
E-mail: [email protected] (or) [email protected]
[b] L. Zhang, J. Mao
Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, Wuhan 430062, China
[c] L. Zhang, P. Li, J. Mao, X. Ding
Laboratory of Risk Assessment for Oilseeds Products (Wuhan), Ministry of Agriculture, Wuhan 430062, China
[d] P. Li, F. Ma, L. Zhang
Quality Inspection and Test Center for Oilseeds Products, Ministry of Agriculture, Wuhan 430062, China
[e] P. Li, Q. Zhang
Key Laboratory of Detection for Mycotoxins, Ministry of Agriculture, Wuhan 430062, China
Contract grant sponsor: National Key Technologies R&D Program; Contract grant number: 2012BAK08B03; Contract grant sponsor: National Nature Foundation Committee of P.R. China; Contract grant number: 21205118; Contract grant sponsor: Earmarked Fund for China Agriculture Research System; Contract grant number: CARS-13

Journal of Computational Chemistry 2015, 36, 1902–1906 WWW.CHEMISTRYVIEWS.COM

Introduction

Outlier detection is a primary step in data modeling and is important in identifying and subsequently eliminating atypical observations from a given set of data during the establishment of a high-performance model.[1,2] Both univariate methods and multivariate methods can be used for outlier detection.[3] Most early univariate methods for outlier detection were designed on the assumption of underlying identical and independent data distribution, such as the Chauvenet’s criterion, Peirce’s criterion, Grubbs’ test,[4] Tietjen–Moore test,[5] and generalized extreme Studentized deviate test.[6] These methods are unsuitable for high-dimensional datasets and arbitrary datasets without prior knowledge of underlying data distribution.[7]

Multivariate outlier detection methods include statistical and data mining methods. Statistical methods aim at identifying the observations relatively far from the center of data distribution.[8] In this method, the Mahalanobis distance is used as a well-known criterion depending on the estimated multivariate normal distribution parameters.[9] The detection result is satisfactory for datasets with only one calibration outlier but not for those with multiple outliers because outliers could distort the mean value (MV) or covariance matrix to generate a masking effect, which makes the hat matrix leverage unattainable. To mitigate the masking effect, many methods have been proposed to detect outliers, including minimum volume ellipsoid,[10] ellipsoidal multivariate trimming,[11] minimum covariance determinant,[12] resampling by half-means and smallest half volume.[13] The key to these methods is to find out the main body of an observation matrix and identify the outliers significantly different from the majority of the dataset.

Data mining methods are designed to manage large databases from high-dimensional spaces, including distance-based methods, clustering methods, and spatial methods. To effectively use dependent variables and detect more outliers, the Monte Carlo outlier detection (MCOD) method was developed as a feasible way to detect different kinds of outliers by establishing many cross-prediction models.[2,14,15] The core idea of an MC outlier detector is that the predictive results for the X outlier far from the center of the sample space are considerably variable by Monte Carlo sampling subset predictive models while predicting the y outlier is usually difficult. Thus, the distribution of predictive errors could be used for samples in multiple outlier detection. However, due to the masking effect, the boundary between normal and abnormal samples is unclear in the plot of variance of residuals versus mean of residuals.

In this study, we proposed a new strategy to detect outliers using MCOD to identify normal samples from the plot of variance versus mean of residuals and then individually checking the dubious samples. As there is no more than one outlier in the dataset, the masking effect does not exist, and therefore, it is easier to detect outliers. In addition, if the dubious
samples are normal, the prediction errors decrease to an acceptable range with no influence from the outliers.

Theories and Methods

Dataset

Dataset 1, a simulated dataset, was designed (X (100 × 10) and y (100 × 1) samples with normally distributed noise) to illustrate our method. In this example, the independent columns of X represent molecular descriptors, and the dependent column y is related to X by the equation y = f(X). Two types of outliers (y and X) are then added to this dataset as follows: (1) 20 additional y outliers with threefold noise, whose independent variables are derived from the main body of the 100 normal samples, and (2) 20 additional X outliers with a large Mahalanobis distance (twice larger than the average leverage value from the normal samples). The X outliers have the same functional relationship as the 100 normal samples.

Dataset 2, the stack loss plant dataset for the oxidation of ammonia to nitric acid, provides operational data of a plant, which includes 21 observations on three independent variables (cooling air flow, cooling water inlet temperature, and acid concentration) and a dependent variable of stack loss.[16,17] Among all the samples, the outliers are No. 1, 3, 4, and 21, and No. 2 is a good leverage point.

Dataset 3, the Hawkins–Bradu–Kass data, is another classic dataset for outlier detection and robust regression. The first 14 observations out of 75 are outliers of this dataset.[18]

Dataset 4, a dataset of Kovats retention indices (KI), was chosen according to Refs. 19–21. A total of 177 methylalkanes comprised a range of different carbon chain lengths. Only the last two digits of the KI were recorded in Ref. 21, which were obtained by subtracting 100 times the number of carbons in the main chain.

MC outlier detection

Traditional outlier detection methods usually analyze the distribution of samples in the sample space. The purpose of outlier detection is actually to build the best prediction model, and the model-based method is, therefore, more effective. Recently, Monte Carlo cross-validation was developed to detect outliers by studying the distribution of prediction errors of each sample obtained from the original dataset.[2] The hypothesis of this outlier detection method is that the normal samples at the center of the sample space are less influenced by fluctuant model parameters. As the detailed algorithm was described elsewhere,[2] this study only gives the outline. Figure 1 (left) presents a flow chart for the complete algorithm. The number of principal components was first determined by cross-validation in partial least squares or principal component regression modeling. Then, the whole dataset was randomly divided into training and validation sets. The prediction model was built using the training set and used to predict the validation set so as to obtain the prediction error for each validation sample. After N cycles, the prediction error distribution was obtained for each sample. Then, the MV and standard deviation (STD) of the prediction error distribution for the samples were used to detect outliers. The distribution of the predictive errors generated by many models contains more sample information about whether a sample is an outlier or not. The error distribution of a normal sample varies little when normal samples are the main body of the whole dataset. The predictive residuals of a y outlier, however, have a large expectation value, while an X outlier (good leverage point) far from the main body of all the samples possesses a small expectation value of predictive residuals but a large STD. According to the hypothesis of this outlier detection method, normal samples have small MVs and STDs for their prediction errors and, therefore, lie on the lower left of the MV/STD plot; the X outliers, which have small MVs but large STDs, lie on the upper left; and the y outliers or model outliers, which have large MVs but small STDs, lie on the lower right. This MV/STD plot therefore provides a visual diagnosis for direct outlier detection. Validated on several datasets, this method was proven effective for outlier detection.[2]

Figure 1. Flow chart for enhanced Monte Carlo outlier detection.

Enhanced Monte Carlo outlier detection

As the number of outliers in a calibration dataset increases, the probability that a randomly selected training set contains at least one outlier becomes large. In this case, the masking effect arising from the interaction of outlier observations makes it quite unclear how to separate outliers from normal samples. A visual diagnostic for the distribution of prediction errors is, therefore, insufficient and complex. To overcome the masking effect, an enhanced Monte Carlo outlier detection (EMCOD) method was developed to obtain better outlier detection results. This method is designed based on the fact that normal samples with the smallest MV and STD of prediction errors are easily determined. As shown in Figure 1, the EMCOD procedure is similar to MCOD, and the procedure contains the following steps: (1) use MCOD to obtain the prediction error distribution for each sample; (2)
select 40–60% of the samples with the smallest MV and STD of prediction errors, and determine the remaining samples as dubious samples; (3) randomly divide the selected data (Ns) into training and validation sets; (4) after the number of principal components is determined by cross-validation, build the prediction model with the training set and use it to predict the dubious samples and the samples in the validation set to obtain the prediction error; (5) after N cycles, obtain the prediction error distribution for the dubious samples; and (6) use the MV and STD of the error distribution on the dubious samples to test whether the dubious samples are outliers. According to the hypothesis of this outlier detection method, the MVs and STDs of the prediction errors of dubious samples that are in fact normal decrease, while those of the outliers increase to some extent. As the masking effect could be eliminated by EMCOD, it could provide better results than MCOD.

Data processing and analysis

All programs used were coded in MATLAB 2011a for Windows and all calculations were performed on a personal computer. The MATLAB implementation of EMCOD is available from http://www.mathworks.com/matlabcentral/fileexchange/52023-emcod.

Results and Discussion

Enhanced Monte Carlo outlier detection method

Outlier detection is an important step in building a highly predictive model. MCOD was recently developed to provide a feasible means of detecting different kinds of outliers by establishing many predictive models and an MV/STD plot of prediction errors for all samples. This outlier detection method depends on the graphic MV/STD plot, so the key is to determine the visualized boundary between normal and abnormal samples.

Figure 2. Mean/STD plot of prediction errors for Dataset 1: Enhanced Monte Carlo outlier detection (left) and Monte Carlo outlier detection (right).

Figure 3. Mean/STD plot of prediction errors for Dataset 2: Enhanced Monte Carlo outlier detection (left) and Monte Carlo outlier detection (right).

To illustrate our method, a simulated dataset was designed, which contains 100 normal samples, 20 X outliers, and 20 y outliers. MCOD was initially conducted to detect the outliers. As shown in Figure 2, two kinds of outliers have a clear tendency to separate from the normal samples. The y outliers have larger prediction errors than normal samples while X outliers (good leverage points) have larger STD values than normal samples. However, the boundary between the outliers and the normal samples is indistinct, making it difficult to determine whether a sample far from the origin is an outlier or not. In this case, EMCOD was performed to detect the outliers in this simulated dataset. As an enhancement of MCOD, the MVs and STDs of prediction errors acquired from MCOD were used to select out 60 normal samples with the smallest MVs
and STDs. When the number (N) of Monte Carlo models and sampling ratio are, respectively, set to 10,000 and 0.8, the MVs and STDs of prediction errors could be used to determine whether the dubious samples are outliers. As shown in Figure 2, the samples in this simulated dataset were noticeably classified into four groups. The distances between outliers and normal samples significantly increase, and 20 X outliers and 20 y outliers could be easily identified from the MV/STD plot of prediction errors. EMCOD thus gives a clearly better result in correctly detecting outliers.

Method validation

Dataset 2 is the stack loss dataset of a plant. In MCOD, the number (N) of Monte Carlo models and sampling ratio are set to 10,000 and 0.8, respectively. The MV/STD plot of the prediction errors for the 21 samples is shown on the right of Figure 3. Without prior information about this commonly used dataset, it is hard to determine the boundary for outlier detection. To obtain a clearer result, EMCOD was used to detect outliers in this dataset. As shown in Figure 3, samples 20, 5, 16, 18, 19, 13, 14, 8, 15, 10, and 17 were normal samples, which had the smallest mean and STD values. We established MC prediction models using these 11 samples and used these models to observe other samples. The number (N) of Monte Carlo models and sampling ratio are also set to 10,000 and 0.8, respectively. According to the hypothesis that the models built with merely normal samples provide lower prediction errors for normal samples but higher prediction errors for outliers, the distances between normal samples and outliers should be longer. The result is shown in Figure 3 (left), which illustrates that EMCOD has a better result as the outliers have correctly been detected.

With the help of an MC outlier detector, normal samples with the smallest MVs and STDs of prediction errors could be easily detected, even though it was hard to determine the boundary between normal samples and outliers. We selected some normal samples with the smallest MVs and STDs of prediction errors and then determined, one after another, whether the other samples were outliers.

Figure 4. Mean/STD plot of prediction errors for Dataset 3: Enhanced Monte Carlo outlier detection (left) and Monte Carlo outlier detection (right). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Figure 5. Mean/STD plot of prediction errors for Dataset 4. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Dataset 3 represents the Hawkins–Bradu–Kass data. As shown on the right of Figure 4, the MV/STD plot indicates that 14 samples (No. 1–14) are outliers. The 52 samples with the lowest STDs of prediction errors (<0.5) were selected as normal
samples. The other 23 samples were then tested one by one by the MC prediction models established with the 52 samples. As shown on the left of Figure 4, the prediction errors of 9 normal samples decrease and the prediction errors of 14 outliers greatly increase. The distances between normal samples and outliers significantly increase. Moreover, this method is insensitive to the number of normal samples used to build the prediction models (data not shown).

Dataset 4 is the dataset of the Kovats retention indices. The detailed information was provided in the previous studies.[19–22] EMCOD was conducted on Dataset 4. As shown in Figure 5, the MV/STD plot indicates that 18 samples are far from the origin and can be regarded as outliers. The models built by all 177 samples were compared with those built by 159 samples, and the result showed that the root mean square error of prediction decreased from 3.195 to 1.655. After these outliers were removed, the accuracy of the model significantly improved. The average cross-validation prediction error also dropped from 2.0341 to 1.2780, which was better than the values reported in the previous study (4.6 and 4.3, respectively).[21]

Conclusion

In this study, we proposed EMCOD by establishing cross-predictive models using determinate normal samples and individually analyzing the distribution of prediction errors for dubious samples. Four datasets were used to illustrate and validate our method. The results indicated that EMCOD could increase the distances between outliers and normal samples, making it easier to detect outliers.

Keywords: outlier detection, enhanced Monte Carlo outlier detection, validation

Received: 18 June 2015
Accepted: 1 July 2015
Published online on 31 July 2015

How to cite this article: L. Zhang, P. Li, J. Mao, F. Ma, X. Ding, Q. Zhang, J. Comput. Chem. 2015, 36, 1902–1906. DOI: 10.1002/jcc.24026

[1] R. Todeschini, D. Ballabio, V. Consonni, F. Sahigara, Anal. Chim. Acta 2013, 787, 1.
[2] D. S. Cao, Y. Z. Liang, Q. S. Xu, H. D. Li, X. Chen, J. Comput. Chem. 2010, 31, 592.
[3] G. J. Williams, R. A. Baxter, H. X. He, S. Hawkins, L. Gu, In IEEE International Conference on Data-Mining (ICDM’02); Maebashi City, Japan, CSIRO Technical Report CMIS-02/102, 2002.
[4] F. Grubbs, Technometrics 1969, 11, 1.
[5] W. Stefansky, Technometrics 1976, 14, 469.
[6] B. Rosner, Technometrics 1983, 25, 165.
[7] C. Zhu, H. Kitagawa, S. Papadimitriou, C. Faloutsos, J. Intell. Inf. Syst. 2011, 36, 217.
[8] E. M. Knorr, R. T. Ng, In Proceedings of the VLDB Conference; New York, 1998; pp. 392–403.
[9] I. Ben-Gal, Outlier detection, In Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers; O. Maimon and L. Rockach, Eds.; Kluwer Academic Publishers, Dordrecht, the Netherlands, 2005, ISBN 0-387-24435-2.
[10] R. Gnanadesikan, J. R. Kettenring, Biometrics 1972, 28, 81.
[11] D. M. Rocke, D. L. Woodruff, J. Am. Stat. Assoc. 1996, 91, 1047.
[12] P. J. Rousseeuw, V. D. Katrien, Technometrics 1999, 41, 212.
[13] W. J. Egan, S. L. Morgan, Anal. Chem. 1998, 79, 2372.
[14] D. S. Cao, Y. Z. Liang, Q. S. Xu, Y. F. Yun, H. D. Li, J. Comput. Aid Mol. Des. 2011, 25, 67.
[15] H. D. Li, Y. Z. Liang, Q. S. Xu, D. S. Cao, J. Chemometr. 2009, 24, 418.
[16] K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering; Academic: New York, 1965; pp. 491–500.
[17] R. A. Becker, J. M. Chambers, A. R. Wilks, The New S Language; Wadsworth & Brooks/Cole, Pacific Grove, California, 1988.
[18] D. M. Hawkins, D. Bradu, G. V. Kass, Technometrics 1984, 26, 197.
[19] D. A. Carlson, U. R. Bernier, B. D. Sutton, J. Chem. Ecol. 1998, 24, 1845.
[20] Y. V. Kissin, G. P. Feulmer, J. Chromatogr. Sci. 1986, 24, 53.
[21] A. R. Katritzky, K. Chen, U. Maran, D. A. Carlson, Anal. Chem. 2000, 72, 101.
[22] Y. Z. Liang, D. L. Yuan, Q. S. Xu, O. M. Kvalheim, J. Chemometr. 2008, 22, 23.
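
As an illustration of how a dataset like Dataset 1 (100 normal samples, 20 y outliers with threefold noise, 20 X outliers, 10 descriptors) might be simulated, the following Python sketch may help. It is not the authors' MATLAB code. The linear form of f(X), the coefficient vector `beta`, the noise scale, and the factor `x_scale` that pushes the X outliers away from the center are all assumptions made for illustration; only the sample counts and the threefold noise come from the text.

```python
import numpy as np

def simulate_dataset(n_normal=100, n_y_out=20, n_x_out=20,
                     n_features=10, noise=1.0, x_scale=3.0, seed=0):
    """Simulate normal samples plus y outliers and X outliers (illustrative only)."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=n_features)  # assumed linear f(X) = X @ beta

    # Normal samples: y = f(X) + normally distributed noise.
    X_norm = rng.normal(size=(n_normal, n_features))
    y_norm = X_norm @ beta + rng.normal(scale=noise, size=n_normal)

    # y outliers: X drawn like the normal samples, noise three times larger.
    X_yout = rng.normal(size=(n_y_out, n_features))
    y_yout = X_yout @ beta + rng.normal(scale=3.0 * noise, size=n_y_out)

    # X outliers: same functional relationship, X stretched away from the center
    # (x_scale is an assumed way to obtain a large Mahalanobis distance).
    X_xout = x_scale * rng.normal(size=(n_x_out, n_features))
    y_xout = X_xout @ beta + rng.normal(scale=noise, size=n_x_out)

    X = np.vstack([X_norm, X_yout, X_xout])
    y = np.concatenate([y_norm, y_yout, y_xout])
    labels = np.array(["normal"] * n_normal + ["y"] * n_y_out + ["X"] * n_x_out)
    return X, y, labels

X, y, labels = simulate_dataset()
print(X.shape, y.shape)
```

The `labels` array is only a bookkeeping device for checking a detector afterwards; a real calibration set would not carry it.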

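
The MCOD outline given under "MC outlier detection" (repeated random splits, one model per split, then the mean value and standard deviation of each sample's prediction errors) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the published implementation: ordinary least squares via `numpy.linalg.lstsq` stands in for the PLS or principal component regression model, the MV is taken as the absolute value of the mean signed error, and the function name `mcod` and its parameters are hypothetical.

```python
import numpy as np

def mcod(X, y, n_models=1000, ratio=0.8, seed=0):
    """Monte Carlo cross-validation: per-sample |mean| and std of prediction errors."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(ratio * n))
    Xb = np.column_stack([np.ones(n), X])  # add an intercept column

    err_sum = np.zeros(n)   # running sum of errors per sample
    err_sq = np.zeros(n)    # running sum of squared errors per sample
    count = np.zeros(n)     # how often each sample was in a validation set

    for _ in range(n_models):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        # Least squares is an assumed stand-in for the PLS/PCR model.
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        err = y[valid] - Xb[valid] @ coef
        err_sum[valid] += err
        err_sq[valid] += err ** 2
        count[valid] += 1

    mean = err_sum / count
    std = np.sqrt(np.maximum(err_sq / count - mean ** 2, 0.0))
    return np.abs(mean), std

# Small usage example: 60 samples, one artificial y outlier at index 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=60)
y[0] += 10.0
mv, sd = mcod(X, y, n_models=200)
print(int(np.argmax(mv)))  # index of the sample with the largest MV
```

Plotting `sd` against `mv` gives the kind of MV/STD diagnostic plot the text describes, with y outliers expected toward large MV and X outliers toward large STD.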

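
Steps (1) to (6) of the EMCOD procedure can likewise be sketched in Python. Again this is an illustration under assumptions rather than the authors' code: least squares replaces PLS/PCR, the selection rule in step (2) combines MV and STD as a distance from the origin after scaling each by its median (the paper does not state how the two are combined), only the dubious samples are predicted in step (4), and the names `emcod`, `keep`, and `_ols_errors` are hypothetical.

```python
import numpy as np

def _ols_errors(Xb, y, train, test):
    """Fit least squares on the `train` rows; return signed errors on `test` rows."""
    coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
    return y[test] - Xb[test] @ coef

def emcod(X, y, keep=0.5, n_models=500, ratio=0.8, seed=0):
    """Return indices of dubious samples with the |mean| and std of their errors."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])

    # Step 1: plain MCOD on all samples.
    n_train = int(round(ratio * n))
    s1, s2, cnt = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(n_models):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        err = _ols_errors(Xb, y, train, valid)
        s1[valid] += err
        s2[valid] += err ** 2
        cnt[valid] += 1
    mv = np.abs(s1 / cnt)
    sd = np.sqrt(np.maximum(s2 / cnt - (s1 / cnt) ** 2, 0.0))

    # Step 2: keep the fraction `keep` of samples closest to the origin of the
    # MV/STD plane (median scaling is an assumed combination rule).
    score = np.hypot(mv / np.median(mv), sd / np.median(sd))
    order = np.argsort(score)
    n_keep = int(round(keep * n))
    selected, dubious = order[:n_keep], order[n_keep:]

    # Steps 3-5: models trained on random subsets of the selected samples
    # predict every dubious sample, giving an error distribution per sample.
    n_sub = int(round(ratio * n_keep))
    dub_err = np.empty((n_models, len(dubious)))
    for m in range(n_models):
        train = rng.choice(selected, size=n_sub, replace=False)
        dub_err[m] = _ols_errors(Xb, y, train, dubious)

    # Step 6: MV and STD of the dubious samples' error distributions.
    return dubious, np.abs(dub_err.mean(axis=0)), dub_err.std(axis=0)

# Usage example: 80 samples, the first five shifted to act as y outliers.
rng = np.random.default_rng(2)
X = rng.normal(size=(80, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=80)
y[:5] += 8.0
dubious, mv_d, sd_d = emcod(X, y, keep=0.5, n_models=200)
flagged = sorted(dubious[np.argsort(mv_d)[-5:]].tolist())
print(flagged)
```

Because the second-stage models never see the dubious samples, a dubious sample that is actually normal should end up with a small MV, while a true outlier keeps a large one; this is the separation effect the text attributes to removing the masking effect.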