Crowd Monitoring and Safety Analysis
A project report submitted for the degree of
Bachelor of Engineering
By
Name of student Class Roll No.
1. Darshan Pramod Nemade BE-4 25
2. Ulkesh Sharad More BE-3 73
3. Paras Sameer Thakur BE-3 80
Certificate
This is to certify that the report of the project entitled CROWD MONITORING AND SAFETY ANALYSIS has been submitted in partial fulfillment of the requirements for the degree of Bachelor of Engineering in COMPUTER ENGINEERING.
---------------------------------------
(Prof. Vaishali Chavan)
Guide
------------------------------------------ ---------------------------------------
(Prof. Uday Bhave) (Dr. Bhavesh Patel)
I/c Head of Department Principal
Attendance Certificate
Date
To,
The Principal
Shah and Anchor Kutchhi Engineering College,
Chembur, Mumbai-88
Subject: Confirmation of Attendance
Respected Sir,
This is to certify that Final year (BE) students Darshan Pramod Nemade, Ulkesh Sharad More and Paras Sameer Thakur have duly attended the sessions on the days allotted to them during the period from 2020 to 2021 for performing the project titled CROWD MONITORING AND SAFETY ANALYSIS. They were punctual and regular in their attendance. The following is the detailed record of the students' attendance.
Attendance Record:
Date | Student 1 | Student 2 | Student 3
Examiners
1.__________________________
2.__________________________
Guide
1.__________________________
2.__________________________
Date:
Place:
Declaration
I declare that this written submission represents my ideas in my own words and where others'
ideas or words have been included, I have adequately cited and referenced the original sources. I
also declare that I have adhered to all principles of academic honesty and integrity and have not
misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand
that any violation of the above will be cause for disciplinary action by the Institute and can also
evoke penal action from the sources which have thus not been properly cited or from whom
proper permission has not been taken when needed.
Date:
Place:
Abstract
This project differs from other implementations in quite a few respects. First, any video-based data can be used as input for training. The data is preprocessed and converted into a NumPy matrix, which is then used for training; converting the data into a matrix allows it to be stored and accessed more efficiently. We have also tuned the layers and parameters of the model to make it more efficient and accurate, and our changes led to a boost in accuracy compared to the baseline. In addition, we provide four different ways to deploy the model: a user can evaluate accuracy on frames extracted from a video, use a real-time video feed, use a saved video, or pass a NumPy file of the data to be classified directly through the model. All of these changes make our project flexible, adaptable and modular.
Acknowledgement
We wish to express gratitude to our principal Dr. Bhavesh Patel for allowing us to go ahead with
this project and giving us the opportunity to explore this domain. We would also like to thank
our Head of Department Prof. Uday Bhave for his constant encouragement and support towards
achieving this goal. We would also like to thank the Review Committee for their invaluable
suggestions and feedback, without which our work would have been much more difficult. We take this
opportunity to express our profound gratitude and deep regards to our guide Mrs. Vaishali
Chavan for her exemplary guidance, monitoring and constant encouragement throughout the
course of this project. The blessing, help and guidance given by her time to time shall carry us a
long way in the journey of life on which we are about to embark. No project is ever complete
without the guidelines of these experts who have already established a mark on this path before
and have become masters of it. So, we would like to take this opportunity to thank all those who
have helped us in implementing this project.
Table of Contents
List of Figures.............................................................................................................................................10
List of Tables..............................................................................................................................................11
List of Abbreviations..................................................................................................................................12
Chapter 1. Introduction.......................................................................................................................13
1.1 VIDEO SURVEILLANCE................................................................................................................13
1.2 VIDEO SURVEILLANCE ARCHITECTURE.......................................................................................13
1.2.1 Automated Surveillance System........................................................................................14
1.3 INTRODUCTION TO ANOMALY DETECTION...............................................................................16
1.4 MOTIVATION.............................................................................................................................17
1.5 ORGANIZATION OF THESIS.........................................................................................................18
Chapter 2. Literature Survey................................................................................................................20
2.1 Survey Existing system..............................................................................................................20
2.2 Anomaly Detection Approaches In Video Surveillance..............................................................20
2.3 Review on Detecting Anomalies In Video Sequences.................................................................22
2.4 Review on Detecting Anomalies In Crowded Video Sequences.................................................25
2.5 Limitation of existing system or research gap............................................................................27
2.5.1 Challenges In Anomaly Detection......................................................................................27
2.5.2 Gap Analysis on Detecting Anomalies................................................................................27
2.6 Problem Statement and Objective..............................................................................................28
2.7 Scope.........................................................................................................................................28
Chapter 3. Proposed System...............................................................................................................30
3.1 Algorithm..................................................................................................................................30
3.2 Details of Hardware & Software................................................................................................30
3.2.1 Software Required.............................................................................................................30
3.2.2 Hardware required.............................................................................................................31
3.3 Design details............................................................................................................................31
3.3.1 An Efficient Spatio-Temporal Frequent Object Mining Method to Predict Abnormal
Activities 31
3.4 Methodology..............................................................................................................................32
3.4.1 Preprocessing.....................................................................................................................32
3.4.2 Feature Learning................................................................................................................32
3.4.3 Regularity Score.................................................................................................................33
Chapter 4. Implementation Details.....................................................................................................34
4.1 Modules & Description..............................................................................................................34
4.1.1 AutoEncoder......................................................................................................................34
4.2 Snapshot....................................................................................................................................36
4.2.1 DATASET DESCRIPTION......................................................................................................36
Chapter 5. Testing...............................................................................................................................37
5.1 Testing.......................................................................................................................................37
5.1.1 OUTPUT OF ANOMALY DETECTION...................................................................................40
5.2 Results.......................................................................................................................................41
Chapter 6. Results & Analysis..............................................................................................................43
6.1 PERFORMANCE METRICS...........................................................................................................43
6.1.1 True Positive Rate..............................................................................................................43
6.1.2 Accuracy.............................................................................................................................43
6.1.3 Precision............................................................................................................................43
6.1.4 Recall.................................................................................................................................43
6.1.5 Information Gain Ratio.......................................................................................................43
6.1.6 Regularity Score.................................................................................................................43
6.2 Results & Analysis......................................................................................................................44
Chapter 7. Conclusion and Future Scope.............................................................................................47
7.1 APPLICATIONS OF ANOMALY DETECTION..................................................................................47
Chapter 8. References.........................................................................................................................49
List of Figures
Figure 1 Architecture of a simple video surveillance system........................................................13
Figure 2 Architecture of automated surveillance system..............................................................14
Figure 3 Anomaly detection process in real time video sequence.................................................15
Figure 4 Anomalies observed during object or human behavior....................................16
Figure 5 Literature review work done towards anomaly detection.................................19
Figure 6 Different approaches used for anomaly detection...........................................................21
Figure 7 Algorithm of proposed system........................................................................................29
Figure 8 Frame Sequence of a segmented video...........................................................................36
Figure 9 Installing required Python library....................................................................37
Figure 10 Running the Python code.............................................................................37
Figure 11 User interface to open test data.....................................................................................38
Figure 12 User interface define data location................................................................................38
Figure 13 Test data under analysis................................................................................................39
Figure 14 Real time detection result..............................................................................................40
Figure 15 Man throwing bag in air................................................................................................40
Figure 16 Small Boy Jumping.......................................................................................................40
Figure 17 Man Running Man........................................................................................................41
Figure 18 Running In Opposite Direction.....................................................................................41
List of Tables
Table 1 System Accuracy for Different Data Samples Taken.......................................................43
Table 2 System Precision for Different Data Samples Taken..................................................43
Table 3 System Recall for Different Data Samples Taken............................................................44
Table 4 Accuracy Comparison for Different data.........................................................................44
List of Abbreviations
AI : Artificial Intelligence
ANN : Artificial Neural Network
API : Application Programming Interface
BG : Background
BN : Batch Normalization
CDNET : ChangeDetection.net
CNN : Convolutional Neural Network
CPU : Central Processing Unit
FC : Fully connected
FP : False Positive
FN : False Negative
TP : True Positive
TN : True Negative
FG : Foreground
FOV : Field Of View
GPGPU : General-Purpose computing on Graphics Processing Units
ILSVRC : ImageNet Large Scale Visual Recognition Challenge
ML : Machine Learning
MLP : Multilayer Perceptron
MoG : Mixture of Gaussian
NN : Nearest Neighbor
PTZ : Pan Tilt Zoom
RGB : Red Green Blue
ROI : Region Of Interest
SFO : Static Foreground Object
SGD : Stochastic Gradient Descent
VDAO : Video Database of Abandoned Objects
Chapter 1. Introduction
This report is focused on developing an enhancement of the algorithm used to detect different types of anomalies in real-time video.
1.4 Motivation
It has been observed that, in spite of several secure surveillance systems being in place, untoward incidents such as thefts, robberies and terror plots still take place.
The video footage in such cases is mostly used for post-mortem analysis rather than for preventing the untoward incident from happening. This serves as motivation to develop a computer-vision-based smart system that detects and alerts on anomalous activities instantaneously and automatically, without any human intervention. With this motivation, a Computer Vision Based Anomaly Detection System (CVADS) is proposed, which automatically detects anomalies such as the presence of masked faces, anomalous activities that may result in a security threat, and abandoned objects. The system detects these anomalies without any intervention by security personnel and sends instantaneous security alerts.
A detailed literature review was carried out to understand the existing approaches used
towards detecting anomalies in surveillance videos and the details of the same are
presented in this chapter. Also, the literature survey was carried out to identify the
various datasets, tools and classifiers that were used to detect different types of video
anomalies from a surveillance perspective. The review discusses the literature
regarding: detecting masked or partially occluded faces, detecting anomalous object
movements, detecting anomalous activities in crowded environment and detecting
abandoned objects. A detailed analysis and comparative study of various methods used
for detecting different anomalies has also been performed and presented in this chapter.
Figure 5 shows the high level taxonomy of the literature work carried out as part of this
work towards anomaly detection.
constructed and applied to directly learn progressively abstract and global high-level representations from raw data sequences. The D-IncSFA network had the functionality of both feature extractor and anomaly detector, completing anomaly detection (AD) in one step. Ying Zhang et al. (2016) proposed a novel anomaly detection approach based on Locality Sensitive Hashing Filters (LSHF), which hashed normal activities into multiple feature buckets with Locality Sensitive Hashing (LSH) functions to filter out abnormal activities. Emmanu Varghese et al. proposed a new supervised algorithm for detecting abnormal events in confined areas such as ATM rooms and server rooms. Siqi Wang et al. (2018) proposed a novel approach to detect and localize video anomalies automatically; video volumes were jointly represented by two novel local-motion-based video descriptors, SL-HOF and ULGP-OF. Sovan Biswas & Venkatesh Babu (2017) proposed a novel idea of detecting anomalies in a video based on the short history of a region in motion, derived from trajectories. Maying Shen et al. (2018) proposed a Nearest Neighbour (NN) based search with the Locality-Sensitive B-tree (LSB-tree) to detect anomalies, which helped to find the approximate NNs among the normal feature samples for each test sample. Dan Xu et al. (2014) proposed an approach to detect anomalies based on a hierarchical activity-pattern discovery framework, comprehensively considering both global and local spatio-temporal contexts. Tian Wang et al. (2018) proposed an algorithm to solve abandoned object detection efficiently, based on an image descriptor which encodes the movement information, combined with a classification method.
Huorong Ren et al. (2017) proposed an anomaly detection approach based on a dynamic Markov model. This approach segmented sequence data with a sliding window. An anomaly substitution strategy was also proposed to prevent detected anomalies from affecting the building of the models and to keep anomaly detection running continuously. Fan Jiang et al. (2011) proposed a hierarchical data mining approach in which frequency-based analysis was performed at each level to automatically discover regular rules of normal events; events deviating from these rules were identified as anomalies. Shifu Zhou et al. (2016) coupled anomaly detection with spatio-temporal Convolutional Neural Networks (CNNs) to capture features from both the spatial and temporal dimensions.
In another line of work, sub-videos are represented as vectors; a grouping technique and similarity measures are applied to those vectors, and a sub-video's actions are deemed irregular only if the sub-video has very low similarity to normal ones. In real time, it is very complex to recognize unusual actions. To verify irregular behaviour such as burglary, fighting and chasing, Jian-hao & Li (2011) proposed a technique which identifies actions based on the variation of speed and the path of movement; however, the three unusual actions cannot be differentiated by this technique. Cheng et al. (2011) proposed a method which identifies cyclic activities and distinguishes the cyclic motion of a flexibly moving object, for instance detecting the running behaviour of a human; to recognize running, a descriptor derived from the cyclic-action representation is utilized. To satisfy real-time performance requirements in a surveillance setting, a technique has been proposed which identifies unusual running actions in surveillance footage according to spatio-temporal constraints. First, the foreground objects are extracted from video segments using Gaussian Mixture modelling together with frame-subtraction computation, as discussed in Xin et al. (2008) and Chen et al. (2010), and the input images are converted into binary images. Nonlinear structures are handled in the extracted-foreground object detection algorithm, as discussed in the works of Liao et al. (2011), Hu et al. (2011) and Liao et al. (2010).
Although various strategies and object-handling methods are used in practice to support tracking in crowded areas, more difficulties emerge when tracking crowded scenes than small sequences. For instance, it is highly difficult to recognize a targeted object in a crowded area because of the size of the targeted object and other factors such as occlusion and the relative movement of other objects. To overcome these difficulties, several solutions were proposed in Li et al. (2011), where the researchers addressed them by tracking each part of the targeted object. Some researchers have proposed algorithms based on foreground extraction, as suggested by Liao et al. (2011). A scheme for recognizing and observing the temporal behaviour of a crowded area is then applied: initially, various attributes recover the contents of every leading frame involved in the operation, and once every object is identified, the Gaussian Mixture Model (GMM) is used. In this segment, the recognition of unusual behaviour is described in a wider sense, for instance the unexpected actions of a person. Researchers have extended several methods commonly used for video surveillance: unexpected changes in a scene, such as lighting or weather changes, and difficulties in identifying the action are addressed using the GMM. Individual events are identified in this series based on identifying the action of every person. Then the "vision.BlobAnalysis" object is used for analyzing the individual objects. Before performing blob analysis, the objects are segmented from the background using the GMM, and morphological operations are then applied to remove noise and extract the bounding boxes containing the connected components.
the anomaly detection is carried out with minimal errors and a high degree of accuracy. As part of this research, a new spatio-temporal approach towards anomaly detection has been proposed. The salient feature of this approach is that it not only provides a high degree of accuracy in detecting anomalies but also produces very few errors.
2.5.2 Gap Analysis on Detecting Anomalies in Crowded Spaces
Most anomaly-detection-related works have focused on detecting anomalies in video sequences; it remains highly complex, however, to detect anomalies when the surveillance space is crowded. Human behavior is also difficult to track in crowded spaces.
2.6 Problem Statement and Objective
The primary objective of this research work is to design a computer-vision-based anomaly detection system using smart anomaly detection algorithms, to enable a better and smarter surveillance system without any human intervention. The secondary objectives of this work include:
a. To provide high degree of accuracy in anomaly detection.
b. To maintain minimal misclassification rate.
c. To improve response time in terms of both anomaly detection and alerting.
d. To improve anomaly detection in occlusion conditions.
2.7 Scope
The four contributions as part of designing the computer vision based anomaly detection
system are:
1) A pivotal-point-based approach was developed for detecting partially occluded or masked faces in video frames. Its primary advantage over existing approaches is the quick turnaround time in detecting masked or partially occluded faces.
2) A new approach based on spatio-temporal parameters has been designed to detect anomalies in video sequences and to alert on the anomalies detected. The salient aspect of this approach, compared to previous approaches, is that anomaly detection is carried out using spatial segmentation of video frames, which in turn improves the accuracy of detection with minimal errors.
3) An improved block-based strategy using discrete cosine transform coefficients and entropy has been proposed to detect anomalies in video sequences involving crowded spaces. The prominent feature of this approach, compared with existing works, is its efficiency in detecting anomalies in crowded sequences.
4) A new strategy to detect abandoned objects based on blob analysis has been
proposed. The striking feature of this approach is that the abandoned object
classification is consistently carried out even under occlusion scenarios.
3.1 Algorithm
This project differs from other implementations in quite a few respects. First, any video-based data can be used as input for training. The data is preprocessed and converted into a NumPy matrix, which is then used for training; converting the data into a matrix allows it to be stored and accessed more efficiently. We have also tuned the layers and parameters of the model to make it more efficient and accurate, and our changes led to a boost in accuracy compared to the baseline. In addition, we provide four different ways to deploy the model: a user can evaluate accuracy on frames extracted from a video, use a real-time video feed, use a saved video, or pass a NumPy file of the data to be classified directly through the model. All of these changes make our project flexible, adaptable and modular.
abnormal activities. The whole process is named Spatio-Temporal Frequent Object Mining (STFOM).
3.4 Methodology
The method described here is based on the principle that when an abnormal event occurs, the most recent frames of video will differ significantly from the older frames. Inspired by [5], we train an end-to-end model that consists of a spatial feature extractor and a temporal encoder-decoder, which together learn the temporal patterns of the input volume of frames. The model is trained on video volumes consisting of only normal scenes, with the objective of minimizing the reconstruction error between the input video volume and the output video volume reconstructed by the learned model. After the model is properly trained, a normal video volume is expected to have a low reconstruction error, whereas a video volume containing abnormal scenes is expected to have a high reconstruction error. By thresholding the error produced by each test input volume, our system is able to detect when an abnormal event occurs. Our approach consists of three main stages:
3.4.1 Preprocessing
The task of this stage is to convert the raw data into aligned, acceptable input for the model. Each frame is extracted from the raw videos and resized to 227 x 227. To ensure that the input images are all on the same scale, the pixel values are scaled to between 0 and 1, and the global mean image is subtracted from every frame for normalization. The mean image is calculated by averaging the pixel values at each location over every frame in the training dataset. After that, the images are converted to grayscale to reduce dimensionality, and the processed images are normalized to have zero mean and unit variance. The input to the model is video volumes, where each volume consists of 10 consecutive frames sampled with various skipping strides. As the number of parameters in this model is large, a large amount of training data is needed. Following the practice of [5], we perform data augmentation in the temporal dimension to increase the size of the training dataset: we concatenate frames with stride 1, stride 2 and stride 3. For example, the first stride-1 sequence is made up of frames {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, the first stride-2 sequence contains frames {1, 3, 5, 7, 9, 11, 13, 15, 17, 19}, and the first stride-3 sequence contains frames {1, 4, 7, 10, 13, 16, 19, 22, 25, 28}. The input is then ready for model training.
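The stride-based volume construction described above can be sketched as follows (a minimal illustration on dummy frames; the helper name and array shapes are our own, not the project's):

```python
import numpy as np

def make_volumes(frames, length=10, strides=(1, 2, 3)):
    """Build training volumes of `length` frames, sampling the frame
    sequence with each skipping stride in `strides`."""
    frames = np.asarray(frames)
    volumes = []
    for stride in strides:
        span = stride * (length - 1) + 1   # frames covered by one volume
        for start in range(len(frames) - span + 1):
            volumes.append(frames[start:start + span:stride])
    return np.stack(volumes)               # (num_volumes, length, H, W)

# Example: 28 dummy 4x4 "frames" whose pixel values equal the frame index.
frames = np.arange(28, dtype=float)[:, None, None] * np.ones((1, 4, 4))
vols = make_volumes(frames)
```

With 28 frames this yields 19 stride-1 volumes, 10 stride-2 volumes and one stride-3 volume; the stride-3 volume contains frames {1, 4, 7, ..., 28} in the 1-based numbering used in the text.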
where fW is the learned weights by the spatiotemporal model. We then compute the abnormality
score sa(t) by scaling between 0 and 1. Subsequently, regularity score sr(t) can be simply derived
by subtracting abnormality score from 1:
sa(t) = (e(t) - e(t)min) / e(t)max                Equation 3.2
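Given the per-frame reconstruction errors e(t), the two scores can be computed in a few lines. Since the text says the abnormality score is obtained by scaling between 0 and 1, the sketch below uses min-max scaling, which divides by e(t)max - e(t)min rather than e(t)max alone (a slight, commonly used variant of the printed formula):

```python
import numpy as np

def regularity_scores(errors):
    """Min-max scale reconstruction errors e(t) to [0, 1] to obtain the
    abnormality score sa(t); the regularity score is sr(t) = 1 - sa(t)."""
    e = np.asarray(errors, dtype=float)
    sa = (e - e.min()) / (e.max() - e.min())
    sr = 1.0 - sa
    return sa, sr

# e(t) is lowest on the first frame and highest on the last frame.
sa, sr = regularity_scores([2.0, 4.0, 10.0])
```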
4.1.1 AutoEncoder
An autoencoder is an unsupervised artificial neural network that learns to compress and encode data efficiently, and then learns to reconstruct the data from the reduced encoded representation back to a representation as close to the original input as possible. By design, an autoencoder reduces data dimensionality by learning to ignore noise in the data.
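The compress-then-reconstruct idea can be illustrated without any deep-learning framework. The sketch below trains a toy linear autoencoder by gradient descent on synthetic data lying near a 2-D subspace of an 8-D space; all sizes, names and the learning rate are our own illustrative choices, not the project's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data lying on a 2-D subspace of an 8-D feature space.
basis = rng.normal(size=(2, 8))
X = rng.normal(size=(200, 2)) @ basis        # (samples, features)

# Linear autoencoder: encode 8 -> 2 dims, decode 2 -> 8 dims.
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

def recon_loss(X, W_enc, W_dec):
    """Mean squared reconstruction error over all entries."""
    return ((X @ W_enc @ W_dec - X) ** 2).mean()

lr, steps = 0.01, 2000
loss_before = recon_loss(X, W_enc, W_dec)
for _ in range(steps):
    Z = X @ W_enc                            # encoded representation
    R = Z @ W_dec - X                        # reconstruction residual
    # Gradients of the per-sample summed squared error, averaged over samples.
    grad_dec = 2 * Z.T @ R / len(X)
    grad_enc = 2 * X.T @ (R @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
loss_after = recon_loss(X, W_enc, W_dec)
```

Because the data is exactly rank 2, the reconstruction error drops sharply once the encoder finds the subspace, which is the same mechanism that lets the trained autoencoder assign low error to "normal" inputs and high error to inputs it has never seen.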
Step 1: Preparing the Dataset
1. Download the videos (16 training videos and 12 testing videos) and split them into frames.
2. The images contain random objects in the background.
3. Various background conditions are covered, such as dark, light, indoor, outdoor, etc.
4. Save all the images, in .jpg format, in a folder called images.
5. Use an argparse parser to add the file names as arguments.
6. Split each video into frames, save the frames in a directory separated by the type of anomaly or situation, and resize the images to scale.
7. Reshape and normalize the images.
8. Clip negative values and remove the buffer directory.
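The argument parsing mentioned in step 5 above might look like the following sketch (the flag names and defaults are illustrative only, not taken from the project):

```python
import argparse

def build_parser():
    """Command-line arguments for frame preparation; the flag names
    and defaults here are illustrative, not the project's."""
    p = argparse.ArgumentParser(description="Prepare video frames")
    p.add_argument("--train-dir", default="images/train",
                   help="directory holding extracted training frames")
    p.add_argument("--test-dir", default="images/test",
                   help="directory holding extracted testing frames")
    p.add_argument("--size", type=int, default=227,
                   help="side length that frames are resized to")
    return p

args = build_parser().parse_args(["--size", "227"])
```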
Step 2: Loading the Keras Models
1. Import the three layer types given below:
   - Convolutional 3D
   - Convolutional LSTM 2D
   - Convolutional 3D Transpose
2. Using Sequential, define the filters, padding and activation of these layers; we choose ReLU.
3. Let the optimizer be Adam and the loss be Categorical Crossentropy.
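One possible Sequential arrangement of the three layer types named above is sketched below. The filter counts, kernel sizes and strides are our own illustrative choices (the report does not state them), the input is assumed to be a 10-frame 227 x 227 grayscale volume as in Section 3.4.1, and we compile with mean squared error, the usual reconstruction loss, rather than the categorical crossentropy mentioned above:

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Conv3D, Conv3DTranspose, ConvLSTM2D

def build_model():
    """Spatio-temporal autoencoder sketch: Conv3D layers encode each
    frame spatially, ConvLSTM2D layers model the temporal patterns,
    and Conv3DTranspose layers mirror the encoder to reconstruct the
    input volume."""
    model = Sequential([
        Input(shape=(10, 227, 227, 1)),      # 10 grayscale 227x227 frames
        Conv3D(128, (1, 11, 11), strides=(1, 4, 4), activation="relu"),
        Conv3D(64, (1, 5, 5), strides=(1, 2, 2), activation="relu"),
        ConvLSTM2D(64, (3, 3), padding="same", return_sequences=True),
        ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True),
        ConvLSTM2D(64, (3, 3), padding="same", return_sequences=True),
        Conv3DTranspose(128, (1, 5, 5), strides=(1, 2, 2), activation="relu"),
        Conv3DTranspose(1, (1, 11, 11), strides=(1, 4, 4), activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="mse")   # reconstruction objective
    return model
```

Because the transpose layers use the same kernel sizes and strides as the encoder, the output shape matches the input volume, which is what allows the frame-wise reconstruction error to be computed.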
Step 3: Training the Model
1. train.py runs the training process.
2. pipeline_config_path=Path/to/config/file/model.config
3. model_dir=Path/to/training/
4. If the kernel dies, training will resume from the last checkpoint, provided the training/ directory was saved somewhere persistent, e.g. Google Drive.
5. If you change the above paths, make sure there is no space between the equals sign (=) and the path.
6. Use early-stopping callbacks to stop the training if it goes out of hand.
Step 4: Export the Trained Model
1. The model saves a checkpoint every 600 seconds during training, keeping up to 5 checkpoints; as new checkpoint files are created, older ones are deleted.
2. A file called model.h5 is created, which is used later during testing.
3. The number of epochs is passed as args.epoch, and the batch size for training was 32.
4. Another file called training.npy is created; it contains the array form of all the coordinates required during testing, so no frozen inference graph or pbtxt file is created.
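The export step can be sketched as follows. The array shape and contents are placeholders; model.save is the standard Keras call for producing model.h5:

```python
import os
import tempfile

import numpy as np

# Placeholder for the coordinate array produced during training;
# the real training.npy holds the values needed at test time.
coords = np.zeros((100, 4), dtype=np.float32)

out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "training.npy"), coords)
# model.save(os.path.join(out_dir, "model.h5"))  # Keras model from Step 2

# At test time, both artifacts are loaded back.
loaded = np.load(os.path.join(out_dir, "training.npy"))
```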
4.2 Snapshot
Chapter 5. Testing
5.1 Testing
The implemented system was tested on various datasets available on the internet, such as the Avenue dataset, the UCSD Anomaly Detection Dataset, the University of Minnesota crowd activity datasets, the Anomalous Behavior Data Set, the VIRAT video dataset, and the McGill University Dominant and Rare Event Detection dataset.
Figure 13 shows the test data under analysis: a normal frame is analyzed in the screenshot, and the final output is shown in the form of numerical scores in Figure 14.
5.2 Results
6.1.2 Accuracy
It is the fraction of true results of human activity prediction (true positives and true negatives) among the total number of cases analyzed. It is calculated as:

Accuracy = (True Positive (TP) + True Negative (TN)) / (TP + TN + False Positive (FP) + False Negative (FN))    Equation 6.1
where, if the class label is positive and the human abnormal activity prediction outcome is
positive, then it is TP. If the class label is negative and the human abnormal activity prediction
outcome is negative, then it is TN. If the class label is negative and the human abnormal activity
prediction outcome is positive, then it is FP. If the class label is positive and the human abnormal
activity prediction outcome is negative, then it is FN.
6.1.3 Precision
It is the fraction of the number of suspicious faces that are appropriately recognized to the sum of
the count of correctly recognized suspicious faces and the wrongly recognized suspicious faces.
Precision = True Positive (TP) / (True Positive (TP) + False Positive (FP))    Equation 6.2
6.1.4 Recall
It is the fraction of the number of suspicious faces that are appropriately recognized to the sum of
the count of correctly recognized suspicious faces and the wrongly recognized non-suspicious
faces.
Recall = True Positive (TP) / (True Positive (TP) + False Negative (FN))    Equation 6.3
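Equations 6.1 through 6.3 translate directly into code; the counts in the example call below are illustrative, not results from the project:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, and recall from confusion-matrix
    counts, following Equations 6.1-6.3."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Equation 6.1
    precision = tp / (tp + fp)                   # Equation 6.2
    recall = tp / (tp + fn)                      # Equation 6.3
    return accuracy, precision, recall

# Illustrative counts for 200 analyzed frames.
acc, prec, rec = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```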
To better compare with [5], we used the same formula to calculate the regularity score for all frames, the only difference being that the learned model is of a different kind. The reconstruction error e(t) of all pixel values in frame t of the video sequence is taken as the Euclidean distance between the input frame and the reconstructed frame:

e(t) = ||x(t) − fW(x(t))||2    Equation 6.4
where fW denotes the weights learned by the spatiotemporal model. We then compute the abnormality score sa(t) by scaling e(t) between 0 and 1. Subsequently, the regularity score sr(t) is derived simply by subtracting the abnormality score from 1:

sa(t) = (e(t) − e(t)min) / e(t)max    Equation 6.5
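Equation 6.5 and the regularity score can be computed per video as follows; the per-frame reconstruction errors below are illustrative stand-ins for the e(t) values produced by the trained model:

```python
import numpy as np

# Per-frame reconstruction errors e(t) for one video (illustrative
# values; in practice e(t) comes from Equation 6.4).
e = np.array([2.0, 2.5, 9.0, 3.0, 2.2])

# Equation 6.5: abnormality score scaled to [0, 1].
sa = (e - e.min()) / e.max()

# Regularity score: 1 minus the abnormality score.
sr = 1.0 - sa
```

Frames with high reconstruction error (here frame 2) receive a high abnormality score and a correspondingly low regularity score.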
[Bar chart: scores for the events Boy Jumping, Man Running, Man Running Opposite, and Throwing Bag; y-axis from 0.000 to 0.600]
[Bar chart: scores for the events Boy Jumping, Man Running, Man Running Opposite, and Throwing Bag; y-axis from 0.000 to 1.000]
4. Industrial Damage Detection: Such damages need to be detected early to prevent further
escalation and losses.
5. Image Processing: Anomaly detection techniques dealing with images are either interested
in any changes in an image over time (motion detection) or in regions which appear
abnormal on the static image. This domain includes satellite imagery.
6. Anomaly Detection in Text Data: Anomaly detection techniques in this domain primarily
detect novel topics or events or news stories in a collection of documents or news articles.
The anomalies are caused due to a new interesting event or an anomalous topic.
7. Sensor Networks: The sensor data collected from various wireless sensors has several unique characteristics, which makes anomaly detection in this domain particularly challenging.
Chapter 8. References
[1] Zhao Bin, Li Fei, and Xing E. P. “Online detection of unusual events in videos via
dynamic sparse coding,” IEEE Conference on Computer Vision and Pattern Recognition,
pp. 3313–3320, 2011.
[2] Cong Yang, Yuan Junsong, and Liu Ji “Sparse reconstruction cost for abnormal event
detection,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 3449–
3456, 2011.
[3] Chen Zhu, and Saligrama V. “Video anomaly detection based on local statistical
aggregates,” Computer Vision and Pattern Recognition, pp. 2112–2119, 2012.
[4] Zhou Xu Gang, Zhang Li Qing “Abnormal Event Detection Using Recurrent Neural
Network,” International Conference on Computer Science and Applications, pp. 222–226,
2015.
[5] Sabokrou M., Fathy M., and Hoseini M. “Video anomaly detection and localisation based
on the sparsity and reconstruction error of auto-encoder,” Electronics Letters, vol. 52, no.
13, pp. 1122–1124, 2016.
[6] Xu Dan, Ricci Elisa, Yan Yan, Song Jingkuan, and Sebe Nicu “Learning Deep
Representations of Appearance and Motion for Anomalous Event Detection,” BMVC,
2015.
[7] Hasan Mahmudul, Choi Jonghyun, Neumann Jan, Roychowdhury Amit K., and Davis
Larry S. “Learning Temporal Regularity in Video Sequences,” Computer Vision and
Pattern Recognition, pp. 733–742, 2016.
[8] Ravanbakhsh Mahdyar, Nabi Moin, Sangineto Enver, Marcenaro Lucio, Regazzoni Carlo,
and Sebe Nicu “Abnormal Event Detection in Videos using Generative Adversarial Nets,”
International Conference on Image Processing, 2017.
[9] Yong Shean Chong, and Yong Haur Tay “Abnormal Event Detection in Videos Using
Spatiotemporal Autoencoder,” International Symposium on Neural Networks, pp. 189–
196, 2017.
[10] Patraucean Viorica, Handa Ankur, and Cipolla Roberto “Spatio-temporal video
autoencoder with differentiable memory,” Computer Science, vol. 58, no. 11, pp. 2415–
2422, 2015.
[11] Sutskever Ilya, Vinyals Oriol, and Le Quoc V. “Sequence to sequence learning with
neural networks,” In Advances in Neural Information Processing Systems, vol. 4, pp.
3104–3112, 2014.
[12] Srivastava Nitish, Mansimov Elman, and Salakhutdinov Ruslan “Unsupervised Learning
of Video Representations using LSTMs,” International Conference on Machine Learning,
pp. 843–852, 2015.
[13] Ji Yangfeng, Cohn Trevor, Kong Lingpeng, Dyer Chris, and Eisenstein Jacob “Document
Context Language Models,” Computer Science, 2016.
[14] He Kaiming, Zhang Xiangyu, Ren Shaoqing, and Sun Jian “Deep Residual Learning for
Image Recognition,” Computer Vision and Pattern Recognition, pp. 770–778, 2015.
[15] Huang Gao, Liu Zhuang, Weinberger Kilian Q, and Laurens Van Der Maaten “Densely
Connected Convolutional Networks,” Computer Vision and Pattern Recognition, 2016.
[16] Shi Xingjian, Chen Zhourong, Wang Hao, Yeung Dit Yan, Wong Wai Kin, and Woo
Wang Chun “Convolutional LSTM network: A machine learning approach for precipitation
nowcasting,” NIPS, pp. 802–810, 2015.
[17] Graves Alex “Generating Sequences With Recurrent Neural Networks,” Computer
Science, 2013.
[18] Jefferson Ryan Medel, Andreas E. Savakis “Anomaly Detection Using Predictive
Convolutional Long Short-Term Memory Units,” CoRR abs/1612.00390, 2016.
[19] Y. Kozlov, and T. Weinkauf “Persistence 1D: Extracting and filtering minima and maxima
of 1D functions,” http://people.mpi-inf.mpg.de/~weinkauf/notes/persistence1d.html
[20] V. Mahadevan, W. Li, V. Bhalodia, and N. Vasconcelos “Anomaly detection in crowded
scenes,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 1975–1981, 2010.
[21] C. Lu, J. Shi, and J. Jia “Abnormal event detection at 150 FPS in MATLAB,” IEEE International
Conference on Computer Vision, pp. 2720–2727, 2013.
[22] Amit Adam, Ehud Rivlin, Ilan Shimshoni, and David Reinitz “Robust Real-Time Unusual
Event Detection using Multiple Fixed-Location Monitors,” IEEE Trans. Pattern Anal.
Mach. Intell., vol. 30, no. 3, pp. 555–560, 2008.
[23] Wang Tian, and Snoussi H. “Histograms of optical flow orientation for abnormal events
detection,” IEEE International Workshop on Performance Evaluation of Tracking and
Surveillance (PETS 2013), vol. 5, no. 9, pp. 13–18, 2013.