Analysis and Prediction of Industrial Accidents Using Machine Learning

This document proposes a system to analyze and predict industrial accidents using machine learning. It aims to address the high costs and limitations of existing safety systems. The system would collect large datasets from various industries, analyze the data using machine learning algorithms like random forest, and predict future accidents. This would allow for more accurate predictions while handling large amounts of data at low storage costs.

Uploaded by

Sushma Sri

ABSTRACT
Modern industries generate an enormous volume of data from many sources
every day, and this data needs to be analyzed and managed systematically.
The number of accidents has increased as these industries have grown. The
industrial safety and accident prevention systems available today have not
been efficient at managing a wide range of parameters or at predicting
accidents effectively from a large amount of data. Moreover, with the
existing systems, the cost of planning and storing the data is soaring. In
this research, a conceptual system is proposed that uses low-cost storage
and processes data in less time. It also uses Machine Learning, NLP and the
Random Forest algorithm to understand and predict accidents in the
industrial environment. The industrial data is obtained from one of the
largest industries in Brazil and the world, which records the industrial
accidents that took place in every country. The data is analyzed and
processed with a Machine Learning algorithm to understand the reasons for
such incidents and how future accidents can be predicted. As a result, the
framework can consider a variety of parameters and predict future
occurrences accurately.

1. INTRODUCTION
Industries have become so vital a part of today’s world that it would be
difficult to sustain life without them. Industrial growth and development
are significant because they play a big role in the economy and in the
development of the country as a whole, and they earn revenue. The demands
and needs of people have also been rising with the population explosion,
and industries are required to keep up with them. Industries also provide
employment opportunities. Clearly, the more industries there are, the more
people work in them, which means that a single industry is responsible for
a large number of workers as well as for its environment. The wellbeing of
these laborers is of great importance. In our efforts to make a living
through various professions, we have neglected many important aspects of
life and made several mistakes. These include creating unhealthy working
conditions, increasing the risk of disease and the risk of accidents in
factories and other industrial establishments, polluting the environment,
and disregarding safety standards, all of which threaten to cause serious
physical and mental health problems.
Industrial accidents can be fatal and can cause considerable loss. Those
that occur in the workplace can harm employees and the environment and
damage equipment. Data on industrial accidents, injuries and fatalities
demonstrate that continued efforts and effective measures are necessary to
reduce the number of industrial accidents, illnesses and fatalities.
According to the International Labor Organization (ILO), a worker dies of
occupational injury every three minutes and at least four workers are
injured every second. India is one of the countries with the highest
record of such industrial accidents. Indian industrial accident data show
that about 47 factory workers are injured and a handful of them die every
day. Data from the Labor and Employment Ministry reveal that in three
years (2014-2016), 3,562 workers lost their lives while 51,124 were injured in
accidents that occurred in factories across the country. Gujarat, Maharashtra
and Tamil Nadu are the top three states when it comes to fatalities. Yet
neither the government nor the public has held Indian industry adequately
accountable for the thousands of deaths each year. Also, according to the International
Labor Organization, 250 million accidents causing absence from work occur
every year, the equivalent of 685,000 accidents every day, 475 every minute
and 8 every second. In addition, 12 million working children suffer
occupational accidents, of which around 12,000 are fatal. Fatal accidents
in factories are mainly due to lack of management commitment, failure to
develop a safety culture and noncompliance with safety systems. Even with
the various safety measures and systems that have been proposed or are
being followed, none is accurate enough to stop such fatalities from
increasing. Every organization follows a set of rules and regulations that
ensure workers' safety and security in the work they have been assigned.
Organizations are responsible for the accidents that occur, and
compensation is required for them. Around 600,000 lives would be saved
every year if available safety practices and appropriate information were
used. In order to analyze and predict such fatalities, this paper proposes
a system that can collect large datasets from various industries, analyze
them, and predict and reduce future accidents. The proposed system also
focuses on real-time analysis of industrial data, which can help retrieve
information about any kind of accident faster. A large dataset is handled
and examined effectively using Machine Learning.
1.1 MOTIVATION
Industrial accidents can be fatal and can cause considerable loss. Those
that occur in the workplace can harm employees and the environment and
damage equipment. Data on industrial accidents, injuries and fatalities
demonstrate that continued efforts and effective measures are necessary to
reduce the number of industrial accidents, illnesses and fatalities.
According to the International Labor Organization (ILO), a worker dies of
occupational injury every three minutes and at least four workers are
injured every second.
1.2 Existing System
Anuoluwapo et al. [5] proposed introducing a big data framework into the
occupational health system. The aim was to obtain accurate results in
determining production industry accidents using various Big Data
platforms including Hadoop, Spark and MapReduce. The data was drawn from
a leading power infrastructure company in the UK and was analyzed using the
B-DAPP architecture.

1.2.1 Limitations of existing system

• With the existing systems, the cost of planning and storing the data is
soaring.
1.3 Objectives
Industries have become so vital a part of today’s world that it would be
difficult to sustain life without them. Industrial growth and development
are significant because they play a big role in the economy and in the
development of the country as a whole, and they earn revenue. The demands
and needs of people have also been rising with the population explosion,
and industries are required to keep up with them. Industries also provide
employment opportunities. Clearly, the more industries there are, the more
people work in them, which means that a single industry is responsible for
a large number of workers as well as for its environment. The wellbeing of
these laborers is of great importance.
1.4 Outcomes
Using the dataset, the system was able to read and clean the data,
produce various analyses and statistics, and make predictions based
on the model it was trained with. The Random Forest Classifier proved to
be a comparatively better algorithm than single decision trees. The system
can be used for any industry and can help industries better understand the
fatalities that occur.
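
The comparison between a single decision tree and a random forest can be sketched in a few lines. The code below is a minimal illustration written with the Python standard library only; it is not the project's actual implementation. The two features (shift hours and machine age), the labelling rule and all function names are assumptions made for this example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, features):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(X, y, rng, max_features, depth=0, max_depth=4):
    """Grow a tree; leaves are labels (must not be tuples), nodes are tuples."""
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]
    # Each node only looks at a random subset of the features.
    features = rng.sample(range(len(X[0])), max_features)
    split = best_split(X, y, features)
    if split is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left],
                       rng, max_features, depth + 1, max_depth),
            build_tree([r for r, _ in right], [lab for _, lab in right],
                       rng, max_features, depth + 1, max_depth))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def build_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    max_features = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, max_features))
    return forest


def predict_forest(forest, row):
    """Majority vote over the trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


def make_data(n, seed):
    """Synthetic records: [shift_hours, machine_age_years] -> accident 0/1."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 12), rng.uniform(0, 30)] for _ in range(n)]
    y = [1 if hours > 8 and age > 15 else 0 for hours, age in X]
    return X, y


def accuracy(predict, model, X, y):
    return sum(predict(model, r) == lab for r, lab in zip(X, y)) / len(y)


X_train, y_train = make_data(200, seed=1)
X_test, y_test = make_data(100, seed=2)
single = build_tree(X_train, y_train, random.Random(0), len(X_train[0]))
forest = build_forest(X_train, y_train)
print("single tree accuracy:", accuracy(predict_tree, single, X_test, y_test))
print("random forest accuracy:", accuracy(predict_forest, forest, X_test, y_test))
```

Which model scores higher depends on the data. On real accident records the forest's averaging over many bootstrapped trees is what usually makes it less sensitive to noise than one tree.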
1.5 Applications
This application can be used to predict phenotype: studies in yeast, rice, and
wheat.
1.6 STRUCTURE OF PROJECT (SYSTEM ANALYSIS)

Fig: 1 Project SDLC


• Project Requirements Gathering and Analysis
• Application System Design
• Practical Implementation
• Manual Testing of the Application
• Application Deployment of System
• Maintenance of the Project
1.6.1 REQUIREMENTS GATHERING AND ANALYSIS
This is the first and foremost stage of any project. As ours is an academic
project, for requirements gathering we followed IEEE journals, collected
many IEEE-related papers and finally selected a paper titled
“Individual web revisitation by setting and substance importance input”.
For the analysis stage we took references from the paper, did a literature
survey of some papers and gathered all the requirements of the project.
1.6.2 SYSTEM DESIGN
System design is divided into three parts: GUI design, UML design and
database design. UML design helps develop the project in an easy way. The
use case diagram shows the different actors and their use cases, the
sequence diagram shows the flow of the project, and the class diagram gives
information about the different classes in the project and the methods
that have to be used. The third and most important part of system design
is database design, where we design the database based on the number of
modules in our project.
1.6.3 IMPLEMENTATION
Implementation is the phase where we produce the practical output of the
work done in the design stage. Most of the coding in the business logic
layer comes into action in this stage; it is the main and crucial part of
the project.

1.6.4 TESTING
UNIT TESTING
Unit testing is done by the developer at every stage of the project. Bug
fixing and module-level tuning are also done by the developer; here we
resolve all the runtime errors.
MANUAL TESTING
As our project is academic, we cannot do any automated testing, so we
follow manual testing by trial-and-error methods.

1.6.5 DEPLOYMENT OF SYSTEM AND MAINTENANCE

Once the project is completely ready, we come to the deployment of the
client system in the real world. As it is an academic project, we did the
deployment in our college lab only, with all the needed software on
Windows OS.
The maintenance of our project is a one-time process only.

1.7 FUNCTIONAL REQUIREMENTS


1. Data Collection

2. Data Preprocessing

3. Training and Testing

4. Modeling

5. Predicting
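
The five steps listed above can be sketched end to end as follows. This is a minimal illustration using only the Python standard library. The record fields (`sector`, `risk`, `level`), the sample values and the simple frequency-table model are all assumptions made for the example, not the project's real dataset or model.

```python
import random
from collections import Counter, defaultdict

FIELDS = ("sector", "risk", "level")

# 1. Data collection: hypothetical accident records (illustrative values only).
RAW = [
    {"sector": "Mining", "risk": "Pressed", "level": "I"},
    {"sector": "mining ", "risk": "pressed", "level": "I"},
    {"sector": "Mining", "risk": "Fall", "level": "II"},
    {"sector": "Metals", "risk": "Cut", "level": "I"},
    {"sector": "Metals", "risk": "Cut", "level": ""},        # missing label
    {"sector": "Metals", "risk": "Chemical", "level": "III"},
    {"sector": "Mining", "risk": "Fall", "level": "II"},
    {"sector": "Metals", "risk": None, "level": "I"},        # missing feature
]


def preprocess(rows):
    """2. Preprocessing: drop incomplete rows, normalise case and spacing."""
    clean = []
    for row in rows:
        if any(row.get(k) in (None, "") for k in FIELDS):
            continue
        clean.append({k: str(row[k]).strip().lower() for k in FIELDS})
    return clean


def split(rows, test_fraction=0.25, seed=0):
    """3. Training and testing: shuffle with a fixed seed, then cut."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def fit(rows):
    """4. Modeling: most frequent level per (sector, risk), plus a fallback."""
    by_key = defaultdict(Counter)
    overall = Counter()
    for r in rows:
        by_key[(r["sector"], r["risk"])][r["level"]] += 1
        overall[r["level"]] += 1
    table = {key: c.most_common(1)[0][0] for key, c in by_key.items()}
    return table, overall.most_common(1)[0][0]


def predict(model, row):
    """5. Predicting: look up the pair, fall back to the overall majority."""
    table, fallback = model
    return table.get((row["sector"], row["risk"]), fallback)


clean = preprocess(RAW)
train_rows, test_rows = split(clean)
model = fit(train_rows)
for row in test_rows:
    print(row["sector"], row["risk"], "->", predict(model, row))
```

The frequency table stands in for the modeling step only to keep the sketch short; a classifier such as Random Forest would take its place in the proposed system.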

2. LITERATURE SURVEY
[1] Long Wang, Xiaoqing Wang, Aixia Dou, Dongliang Wang,
“Study on construction seismic damage loss assessment using RS
and GIS”, International Symposium on Electromagnetic
Compatibility, 2014.
In this paper, a quick assessment method for earthquake emergency is
introduced. The method contains two different modes to obtain damage
information from remote sensing images, one of which is based on
damage index and the other adopts image classification. The damage
index mode relies on traditional visual interpretation. After the damage
index is given by experts, the ground intensity data can be gained, and
then loss estimate parameters will be acquired from the experiential
vulnerability matrix. The image classification mode is an application of
digital image processing technique. Those loss estimate parameters can
be calculated from the classification result which is sorted by the type
of buildings and ranged by the damage degree. While the assessment
models are introduced, the action of multi-resourced estimate data is
explained to show how to find parameters in various data.
[2] Ramli Adnan, Abd Manan Samad, Zainazlan Md Zain,
Fazlina Ahmat Ruslan, “5 hours flood prediction modeling using
improved NNARX structure: case study Kuala Lumpur”, IEEE
4th International Conference on System Engineering and
Technology, 2014.
Floods are a natural disaster that has become a major threat around
the world. Flood disasters may damage people's lives and property.
Therefore, accurate flood water level prediction is very important in
flood modelling because it can give residents near the flood location
ample time to evacuate. Because the dynamics of flood water level are
highly nonlinear, the Artificial Neural Network (ANN) technique is a good
modelling option, as ANN is widely used to solve nonlinear problems.
NNARX is one type of ANN model. This paper therefore proposes flood
prediction modelling that overcomes the nonlinearity problem and arrives
at an advanced neural network technique for predicting flood water
level 5 hours in advance. The input and output parameters used in this
model are based on real-time data obtained from the Department of
Irrigation and Drainage Malaysia upon special request. Results showed
that the improved NNARX model successfully predicted the flood water
level 5 hours ahead of time, and a significant improvement can be
observed over the original NNARX model.
[3] H. Takata, H. Nakamura, T. Hachino, “On prediction of
electric power damage by typhoons in each district in Kagoshima
Prefecture via LRM and NN”, SICE Annual Conference, 2004.
Kagoshima Prefecture has repeatedly suffered natural disasters caused by
typhoons. They hit power systems very badly and
sometimes cut off electricity. To ensure the rapid restoration of
electricity supply, one needs to predict accurately the amount of damage
caused by a typhoon in every region. This paper considers damage
prediction in each district of Kagoshima Prefecture using a two-stage
predictor. It consists of an LRM (linear regression model) at the
first stage and an NN (neural network) at the second stage. This predictor
makes it possible to predict the number of damaged distribution poles and
lines from typhoon weather forecasts. The effectiveness of the approach
is confirmed by applying it to actual data.
[4] “Industrial Safety and Accident Prevention; A
Managerial Approach”, International Journal of Science,
Engineering and Technology Research, February 2013.

Industries are looking at their production systems from different
directions to gain competitive advantages. But the most important task
is to find the problems of the production system in order to make
improvements. In this paper, a part of the production system of
companies is studied to find the problems of the safety system, to make
improvements, and to recommend some points to the companies for
achieving their goals and avoiding accidents. The increasing number of
accidents involving workers has drawn attention towards safety measures
in the factories.

Index Terms: Accidents, Companies, Production, Safety.

I. INTRODUCTION

With the start of the new century, the competition and pressure to perform
competitively have increased on companies. This new age will be a
challenge for companies to provide new, exciting, innovative and
cost-effective products in the market. Business has now become
globalized and competitive. To survive, a company has to offer the best
prices to its customers with high quality and service, and operate at the
lowest cost. This is only possible if all of its departments
are well managed. In today’s competitive environment, companies
want to get the benefits of the different techniques which are being
used in production processes. They have implemented total quality
management (TQM), just in time (JIT) manufacturing and total
employee involvement (TEI) [1]. Now many companies have shifted
their focus to optimization of their assets. One of the main parts of the
company which has a strong influence on the assets is the safety
department or the employees responsible for maintenance. The different
concepts to meet the requirements of the manufacturing plant are not
successful without the support of the quality and maintenance strategy.
There is no doubt that safety has a vital role in companies. Nowadays
most companies are giving attention to this important
function, which is considered a necessary evil for companies,
i.e. an expense and a non-value-adding function.
Companies cannot survive for long without considering
safety as an important function, because they will be put out of
business by the companies that treat safety as a
competitive weapon.

A. Industrial Accident

Accidents occurring in industries are called industrial accidents. These
are generally due to
faulty equipment and machinery or negligence on the part of the
workers. Proper precautions can reduce accidents. There are always
some causes behind accidents, and there are always some
chances of accidents while working on machinery and equipment.
All industrial operations increase the chances of accidents. Proper
training and knowledge should be given about the dangers of
accidents. Accidents occur in industries due to the faults of workers, who
can be negligent, disinterested in their jobs or under the influence of
intoxicants, resulting in a higher number of accidents.

B. Classification of Accidents

According to length of recovery: This is an
important method of classifying industrial accidents. It is further
divided into three categories.

First Aid Cases: The injuries due to minor
accidents are not serious. The workers are given first aid at the factory
hospital. After getting medical treatment at the factory hospital, the
worker can start work again. In this type of accident no time is lost
except when the worker is receiving first aid treatment. No
compensation is paid to the injured worker.

Home Case Accidents: The
injured worker is given preliminary treatment at the factory hospital
and is allowed to go home. The worker recovers in this period and is
ready to resume his duties. So, the worker loses the day, shift or turn of
work in which the accident has taken place. This type of accident does
not involve any compensation to the workers, as they do not fall
under the purview of the Workmen's Compensation Act.

Lost Time Accidents: For these accidents, the factory has to pay compensation.
The worker has to leave work on account of the accident for more
days in addition to the day, shift or turn in which the mishap has
taken place. The worker is generally admitted to hospital. In this
case temporary or permanent disablement may result. The
accident may lead to enquiry and investigation if a difference of opinion
is found regarding the causes of the accident. For example, the hand,
arm, leg or any other part of the body is injured seriously or cut by the
machine.

According to cause of events: Some examples of machine
accidents are: 1. Catching of fingers, arms, clothing etc. in the
machine. 2. Catching of tools, guides etc. in the machine. 3. Catching of
flying objects or particles. These are common but generally less serious
types of accidents. Some examples are given below: 1. Falling objects. 2.
Objects on floor. 3. Pushes, bumps etc. by other persons or objects.

According to damage caused: This classification is based on the damage
caused. Damage can be that of property, material or building. Some
examples are given below: 1. Damage to the store material. 2. Partial
or complete loss of container or contents. 3. Damage to hand trucks. 4.
Damage to trolleys. 5. Damage to belt conveyors, cranes or machines.

According to nature of injury: This classification is as follows. Fatal
Accidents: In such an accident, one or more persons are killed.
Permanent Disablement: Due to the accident the worker loses earning
capacity. Temporary Disablement: These accidents are less serious
than those of the previous category.

C. Concept of Safety

Industrial safety is
primarily a management activity which is concerned with reducing,
controlling and eliminating hazards from industries or industrial
units. Safety is the opposite of accidents. If accidents are harmful, safety is
beneficial. Man's greatest desire is security. He wants a longer life.
Accidents are one of the major causes of death, so accidents should
be minimized. Safety is beneficial in all respects. Safety has become an
essential feature of all walks of life [2], [3]. The maintenance of safety
has become a major program in industries. Specially trained
persons known as safety engineers are appointed in industries. The
government has also framed rules and regulations towards safety. The
Factories Act has special provisions on safety. Violation of these
provisions is punishable. Some factories conduct special
programmes in first aid treatment. The workers are acquainted with
the preliminary treatment to be given to the injured. Programmes like
extinguishing fires and removing people from a building on fire are
also carried out.

Research methodology is considered as a supporting
subject. It is used to develop a variety of research paradigms. These
paradigms vary in their content and substance but their broader
approach to inquiry is the same. “Although the basic logic of scientific
methodology is the same in all fields, its specific techniques and
approaches will vary, depending on the subject matter.” There are two
basic steps in the research process. The first step is to decide the goal
or research questions. The second step is to find the way to get the
answers to the questions. The path to the answers
is composed of the research methodology. The selection of suitable
methods, procedures and models plays an important role in each
operational step of the research process in achieving the objectives of the
research. Two types of data are collected for
research purposes: primary data and secondary data. Primary data means
new data, while secondary data means data which already exist.
The methods used to collect new data are observations, interviews
and experiments. There are some problems with the use of data. One
problem is compatibility and the other is trustworthiness.

[5] Marc A. Rose, “Engineering Health and Safety Module
and Case Studies”, vol. 1, pp. 1-9, July 2004.

Health and safety issues are important in engineering,
management and other fields. Most professional engineering
associations point out that health and safety are issues of utmost
importance in engineering practice. For example, Professional
Engineers Ontario (http://www.peo.on.ca) states in its Code of Ethics,
“A practitioner shall … regard the practitioner's duty to public welfare
as paramount.” The need for appropriate education and training in
engineering health and safety is also widely recognized, and
engineering programs usually must appropriately address health and
safety to maintain accreditation. For instance, the Canadian
Engineering Accreditation Board (http://www.ccpe.ca) includes in its
curriculum content criteria, “Appropriate exposure to … public and
worker safety and health considerations … must be an integral
component of the engineering curriculum.” This document is an
engineering-oriented module and set of case studies on health and
safety, which helps convey the importance of these issues in a concise
package. The material can be covered in a single lecture, or over an
extended period. The materials herein are intended and structured for
engineering students, but are also useful for others, e.g., students in
other technical programs such as applied sciences and technology,
students in management, business and other programs that interface
with engineering, and students in company training programs. This
package contains case studies since they usually present a useful and
interesting means of delivering education on health and safety to
engineering students. Minerva Canada
(http://www.minervacanada.org) and others have in the past developed
several useful business- and engineering-oriented case studies on health
and safety. The case studies presented here are fictitious, although they
contain ideas based on actual incidents. Although the case studies are
oriented towards engineering, they also incorporate management and
business issues, since health and safety must be dealt with in an
integrated and interdisciplinary manner. For example, criteria for
business success, such as performance and profitability, must be
considered in concert with health and safety. The case studies are not
intended to be judgmental, but rather to provide a basis for discussion.
The author invites feedback and comments from interested parties and
users, so that the module and accompanying case studies can be
enhanced in the future.

Occupational health and safety is concerned with the
identification, evaluation and control of hazards associated with the
workplace. Companies and organizations often have occupational
health and safety programs, the objectives of which are to reduce:
• occupational injuries, which include any harm from a workplace
accident (e.g., fracture, cut, burn), and
• occupational illnesses, which include abnormal conditions caused by
exposure to factors associated with the workplace.
Occupational health and safety are often grouped
together, but they are not the same even though they are closely
related. It is important to understand both. One way we can
differentiate health and safety is as follows:
• Safety usually is concerned with situations that cause injury and deals
with hazards that lead to severe and sudden outcomes.
• Health usually is concerned with situations that cause illness or
disease and deals with adverse reactions to exposure over prolonged
periods to hazards that are usually less severe, but still dangerous.
Of course, some situations can simultaneously lead to safety and health
concerns.

[6] Paivi Hamalainen, Jukka Takala, Kaija Leena Saarela,
“Global estimates of occupational accidents”, vol. 1, pp. 2-3, 2005.

This paper reviews the latest global and country numbers of
occupational injuries and work-related illnesses. Methods: Source
material included that from the ILO, WHO, EU, ASEAN and national
institutions. Where country information was missing, proxy countries
were used. Results: We estimated that 2.3 million deaths occur
annually across countries for reasons attributed to work. The biggest
burden comes from work-related illnesses, accounting for 2 million
deaths; the remainder were due to occupational injuries. Globally,
work-related cardiovascular diseases and cancer were the top illnesses,
followed by occupational injuries and infectious diseases at work.

[7] “Management of Industrial Accident Prevention and
Preparedness”, A Training Resource Package, UNEP, vol. 1, pp. 55-97,
June 1996.

Recent accidents around the world have highlighted the
potential hazards inherent in many industrial operations. Many
accidents, both large and small, are preventable. For accidents that do
occur, much can be done to reduce the seriousness of the
consequences. In particular, potential victims of large-scale accidents
can be informed of the best way to act if an accident should occur so as
to minimize the risks to themselves and to property. APELL is an
effective tool to either prevent or to reduce the seriousness of the
consequences of accidents. APELL stands for Awareness and
Preparedness for Emergencies at the Local Level. It is not a risk
reduction program per se, although effective hazards communication
often stimulates industry to take action to reduce further the degree of
hazard. APELL mainly is a hazard communication process which leads
to collective action to take preparative measures. Comprehensive
management of field programmes for accident prevention and
preparedness depends on systematic integration of technical,
administrative, legal and infrastructure considerations. In addition to
this, it is very important to realize that effective public communication
at an early stage is a prerequisite to effective accident preparedness.
Indeed, an informed community is the one best defended against risk!

The APELL Seminar Workshop is a regular UNEP activity. The
workshop is aimed at informing local communities so that they are
able to start and lead an APELL process themselves. Effective hazard
evaluation, being an essential part of the awareness and preparation
process, can be done by local interested parties if they know where to
find information and how to interpret it. UNEP has published a short
manual on this evaluation process, and national procedures are also
available in some countries. This training resource package is designed
to introduce the concepts and methodologies of APELL to
professionals who may one day need to participate in it. As the APELL
process involves professionals in many sectors of activity as well as
members of the public, a better understanding of APELL by
professionals is important. The package can be used by nonspecialized
trainers for the preparation of a short presentation or a workshop about
the APELL programme and hazard identification, or to develop a
curriculum about APELL and hazard identification for undergraduate
students.

Some noteworthy accidents

In order to illustrate the application of the APELL
process, a table of major technological accidents is shown below. As
well as headline nuclear incidents, there are many shocking
‘petrochemical headlines’. Indeed, the list of headline accidents below
has led to advances in legislation, and is selected from 180 severe
industrial accidents which occurred between 1970 and 1990. Caused
mainly by fires, explosions and escapes of toxic gas, they have killed
about 8 000 people, injured more than 200 000, and led to hundreds of
evacuations involving thousands of people all around the world. It
should be noted that the APELL process is equally applicable, with
suitable modifications, to smaller industrial installations and
nonindustrial hazards.

[8] Y. Wan-Jun, W. Jian and Z. Huai-Lin, "Research on
Risk Management of Gas Safety based on Big Data," 2018
International Conference on Intelligent Informatics and
Biomedical Sciences (ICIIBMS), 2018.

In order to address the shortcomings of coal mine enterprises in
gas safety management, a safety management analysis method is proposed
based on big data theory. First, behavioral safety theory is used to
classify the causes of gas hazards. Then HDFS is used to store the
unsafe behaviors and unsafe physical states found by behavior
observers. Finally, the MapReduce-based parallel FP-growth algorithm is
used to find the repeated and dangerous unsafe behaviors in daily
operations, forming a Hadoop-based gas behavior safety management
model. The experimental results show that the model has certain
practical reference value for the targeted implementation of gas safety
management in coal mine enterprises. The unsafe behaviors and physical
states found reveal the shortcomings in safety management, which helps
enterprises improve their safety management systems. The approach
therefore has certain prospects for improving the safety production
culture of coal mine enterprises and reducing the occurrence of gas
accidents.
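
The idea of mining repeated unsafe behaviors can be illustrated with a much simpler stand-in. The sketch below is not FP-growth and does not use MapReduce or HDFS; it only counts, by brute force, which pairs of behaviors occur together in at least a minimum number of observation records, which is the kind of frequent pattern FP-growth would also report. The behavior names, the records and the function name frequent_pairs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(observations, min_support):
    """Return pairs of behaviours seen together in at least min_support records."""
    counts = Counter()
    for behaviours in observations:
        # sorted(set(...)) drops duplicates and gives every pair one fixed order
        for pair in combinations(sorted(set(behaviours)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical observation records, one list of unsafe behaviours per shift
observations = [
    ["no_gas_check", "open_flame", "no_helmet"],
    ["no_gas_check", "open_flame"],
    ["no_helmet", "blocked_exit"],
    ["no_gas_check", "open_flame", "blocked_exit"],
]

print(frequent_pairs(observations, min_support=3))
```

With these records, only the pair (no_gas_check, open_flame) reaches the support threshold of 3.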

3. PROBLEM ANALYSIS
3.1 EXISTING APPROACH:
Anuoluwapo et al. [5] proposed introducing a big data framework into the
occupational health system. The aim was to obtain accurate results in
determining production-industry accidents using various Big Data
platforms, including Hadoop, Spark, and MapReduce. The data was drawn from
a leading power infrastructure company in the UK and was analyzed using
the B-DAPP architecture.
3.1.1 Drawbacks
• With the existing systems, the cost of planning and storing the data is
high.

3.2 Proposed System
• The aim of the proposed system is to design a way to analyze multiple
accident datasets and the parameters involved, and to determine how to
ensure such fatalities do not occur in the future.
3.2.1 Advantages
• Helps various industries around the world to ensure employee safety within
their environment.
• Uses low-cost storage and processes data in less time.

3.3 Software And Hardware Requirements

SOFTWARE REQUIREMENTS
The functional requirements or the overall description documents
include the product perspective and features, operating system and
operating environment, graphics requirements, design constraints and user
documentation.
The appropriation of requirements and implementation constraints
gives the general overview of the project in regards to what the areas of
strength and deficit are and how to tackle them.

• Python IDLE, Python 3.7 (or)
• Anaconda, Python 3.7 (or)
• Jupyter (or)
• Google Colab

HARDWARE REQUIREMENTS
Minimum hardware requirements are very dependent on the particular
software being developed by a given Enthought Python / Canopy / VS
Code user. Applications that need to store large arrays/objects in
memory will require more RAM, whereas applications that need to
perform numerous calculations or tasks more quickly will require a
faster processor.
• Operating system: Windows, Linux
• Processor: minimum Intel i3
• RAM: minimum 4 GB
• Hard disk: minimum 250 GB

3.4 About Dataset


Industrial accidents:
The dataset used in our paper can be downloaded here:

For industrial accidents we collected data from two files,
IHMStefanini_industrial_safety_and_health_database and
IHMStefanini_industrial_safety_and_health_database_with_accidents_description.
The first dataset consists of nine columns and 439 records; the second
consists of eleven columns and 425 records.

Columns of IHMStefanini_industrial_safety_and_health_database:
Data, countries, local, industry sector, accident Level, potential accident
level, gender, employee ou terceiro, risco Critico
Sample record:
2016-1-1 0:00, Country_01, Local_01, mining, I, IV, Male, Third Party,
pressed

Columns of
IHMStefanini_industrial_safety_and_health_database_with_accidents_description:
s.no, Data, countries, local, industry sector, accident Level, potential
accident level, gender, employee ou terceiro, critical risk, description
Sample record:
1, 2016-1-1 0:00, Country_01, Local_01, mining, I, IV, Male, Third Party,
pressed, While removing the drill rod of the Jumbo 08 for maintenance, the
supervisor proceeds to loosen the support of the intermediate centralizer to
facilitate the removal, seeing this the mechanic supports one end on the drill of
the equipment to pull with both hands the bar and accelerate the removal from
this, at this moment the bar slides from its point of support and tightens the
fingers of the mechanic between the drilling bar and the beam of the jumbo.

In each pair above, the first entry lists the column names and the second
entry is one record from the dataset.
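
As a minimal sketch of how such a record could be read, the example below parses the nine-column sample row with Python's standard csv module. The column names and the sample row are taken from the listing in this section; the function name read_records and the LEVELS mapping are assumptions, and the mapping assumes accident levels are Roman numerals from I to VI, of which only I and IV appear in the sample.

```python
import csv
import io

# Column names as listed for the nine-column dataset
COLUMNS = [
    "Data", "countries", "local", "industry sector", "accident Level",
    "potential accident level", "gender", "employee ou terceiro", "risco Critico",
]

# The sample record shown in this section
SAMPLE = "2016-1-1 0:00, Country_01, Local_01, mining, I, IV, Male, Third Party, pressed\n"

# Assumption: levels are Roman numerals I..VI; only I and IV appear in the sample
LEVELS = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}


def read_records(text):
    """Parse comma-separated rows into dictionaries keyed by column name."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


records = read_records(SAMPLE)
first = records[0]
print(first["industry sector"], LEVELS[first["accident Level"]],
      LEVELS[first["potential accident level"]])
```

Converting the level to an integer is one possible encoding for the classifiers discussed later; the text does not state which encoding the project actually uses.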

3.5 Algorithms
Logistic regression

Random Forest Algorithm

Random Forest is a popular machine learning algorithm that belongs to the supervised
learning technique. It can be used for both Classification and Regression problems in
ML. It is based on the concept of ensemble learning, which is a process of combining
multiple classifiers to solve a complex problem and to improve the performance of the
model.

As the name suggests, "Random Forest is a classifier that contains a number of
decision trees on various subsets of the given dataset and takes the average to
improve the predictive accuracy of that dataset." Instead of relying on one decision
tree, the random forest takes the prediction from each tree and predicts the final
output based on the majority vote of those predictions.
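
The majority-vote idea can be sketched in plain Python. This is a simplified illustration, not the project's implementation: each "tree" is only a one-split tree, trained on a bootstrap sample of the rows, and the random selection of features that a full random forest performs at each split is left out. The feature values, labels and function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows):
    """Fit a one-split tree: choose the (feature, threshold) with fewest errors."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [y for xs, y in rows if xs[f] <= t]
            right = [y for xs, y in rows if xs[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def train_forest(rows, n_trees=15, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict(forest, x):
    """Collect one vote per tree and return the majority label."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical rows: (feature vector, label)
ROWS = [
    ((1.0, 0.2), "low"), ((2.0, 0.1), "low"), ((1.5, 0.3), "low"),
    ((8.0, 0.9), "high"), ((9.0, 0.8), "high"), ((7.5, 0.7), "high"),
]

forest = train_forest(ROWS)
print(predict(forest, (8.5, 0.85)))
```

A single tree trained on an unlucky sample can be wrong, but the vote across many trees is much more stable, which is the motivation for the ensemble.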

Decision tree Algorithm:

Decision trees classify instances by sorting them down the tree from the root to some
leaf node, which provides the classification of the instance. An instance is classified
by starting at the root node of the tree, testing the attribute specified by this node, then
moving down the tree branch corresponding to the value of the attribute as shown in
the above figure. This process is then repeated for the subtree rooted at the new node.
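
The root-to-leaf procedure described above can be sketched as a short loop. The tree below is hand-written for illustration and is not learned from the dataset; the attribute names come from the dataset columns, while the branch values "metals" and "Employee" and the leaf labels are hypothetical.

```python
def classify(node, instance):
    """Walk from the root to a leaf, following the branch for each tested attribute."""
    # An internal node is a dict; a leaf is stored as a plain label
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node


# Hypothetical hand-built tree; leaves are accident levels
TREE = {
    "attribute": "industry sector",
    "branches": {
        "mining": {
            "attribute": "employee ou terceiro",
            "branches": {"Third Party": "IV", "Employee": "II"},
        },
        "metals": "I",
    },
}

instance = {"industry sector": "mining", "employee ou terceiro": "Third Party"}
print(classify(TREE, instance))
```

Each pass through the loop corresponds to one step of "test the attribute, then move down the matching branch".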

1. ANN is rarely used for predictive modelling, because Artificial
Neural Networks (ANN) usually tend to over-fit the relationship. ANN is
generally used in cases where what has happened in the past is repeated almost
exactly in the same way. For example, say we are playing the game of Black Jack
against a computer. An intelligent opponent based on ANN would be a very
good opponent in this case (assuming it can keep the computation time low).
With time, the ANN will train itself on all possible cases of card flow. And
given that we are not shuffling cards with a dealer, the ANN will be able to
memorize every single call. Hence, it is a kind of machine learning technique
which has enormous memory. But it does not work well in cases where the
scoring population is significantly different from the training sample. For
instance, if I plan to target customers for a campaign using their past
responses with an ANN, I will probably be using the wrong technique, as it
might have over-fitted the relationship between the response and the other
predictors.

3.6 Modules
1. Data Collection: Collect sufficient data samples and legitimate
software samples.
2. Data Preprocessing: Perform effective data processing on the samples
and extract the features.
3. Train and Test Modelling: Split the data into train and test data.
Train data will be used for training the model and test data to check the
performance.
4. Modelling: Logistic regression, Naive Bayes, Random Forest, KNN, and
XGBoost. Train these machine learning algorithms on the training data and
establish a classification model.
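
Module 3 can be sketched as follows. This is a standard-library stand-in, assuming an 80/20 split; the text does not state the split ratio or the library used, and the function names train_test_split and accuracy are chosen here for illustration only.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and cut off a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(model, test_rows):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(model(x) == y for x, y in test_rows)
    return correct / len(test_rows)


# Stand-in for the 425 records of the dataset with descriptions
rows = list(range(425))
train, test = train_test_split(rows)
print(len(train), len(test))
```

The model is fitted on the train portion only, and accuracy is then measured on the held-out test portion so that the score reflects unseen records.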

4. SYSTEM DESIGN

UML DIAGRAMS
The System Design Document describes the system requirements,
operating environment, system and subsystem architecture, files and
database design, input formats, output layouts, human-machine interfaces,
detailed design, processing logic, and external interfaces.
Global Use Case Diagrams:
Identification of actors:
Actor: Actor represents the role a user plays with respect to the system.
An actor interacts with, but has no control over the use cases.
Graphical representation:

<<Actor name>>

Actor

An actor is someone or something that:
• Interacts with or uses the system.
• Provides input to and receives information from the system.
• Is external to the system and has no control over the use cases.
Actors are discovered by examining:
• Who directly uses the system?
• Who is responsible for maintaining the system?
• External hardware used by the system.
• Other systems that need to interact with the system.
Questions to identify actors:
o Who is using the system? Or, who is affected by the system? Or, which
groups need help from the system to perform a task?
o Who affects the system? Or, which user groups are needed by the system
to perform its functions? These functions can be both main functions and
secondary functions such as administration.
o Which external hardware or systems (if any) use the system to perform
tasks?
o What problems does this application solve (that is, for whom)?
o And, finally, how do users use the system (use case)? What are they doing
with the system?
The actors identified in this system are:
a. System Administrator
b. Customer
c. Customer Care
Identification of use cases:
Use case: A use case can be described as a specific way of using the
system from a user’s (actor’s) perspective.
Graphical representation:

A more detailed description might characterize a use case as:
• A pattern of behavior the system exhibits
• A sequence of related transactions performed by an actor and the
system
• Delivering something of value to the actor
Use cases provide a means to:
• capture system requirements
• communicate with the end users and domain experts
• test the system
Use cases are best discovered by examining the actors and defining what
the actor will be able to do with the system.
Guidelines for identifying use cases:
• For each actor, find the tasks and functions that the actor should be
able to perform or that the system needs the actor to perform. The use case
should represent a course of events that leads to a clear goal.
• Name the use cases.
• Describe the use cases briefly by applying terms with which the user is
familiar. This makes the description less ambiguous.
Questions to identify use cases:
• What are the tasks of each actor?
• Will any actor create, store, change, remove or read information in the
system?
• What use case will store, change, remove or read this information?
• Will any actor need to inform the system about sudden external
changes?
• Does any actor need to be informed about certain occurrences in the system?
• What use cases will support and maintain the system?
Flow of Events
A flow of events is a sequence of transactions (or events) performed by the
system. They typically contain very detailed information, written in terms
of what the system should do, not how the system accomplishes the task.
Flows of events are created as separate files or documents in your favorite
text editor and then attached or linked to a use case using the Files tab of a
model element.
A flow of events should include:
• When and how the use case starts and ends
• Use case/actor interactions
• Data needed by the use case
• Normal sequence of events for the use case
• Alternate or exceptional flows
Construction of use-case diagrams:
Use-case diagrams graphically depict system behavior (use cases). These
diagrams present a high level view of how the system is used as viewed
from an outsider’s (actor’s) perspective. A use-case diagram may depict all
or some of the use cases of a system.
A use-case diagram can contain:
• actors ("things" outside the system)
• use cases (system boundaries identifying what the system should do)
• interactions or relationships between actors and use cases in the system,
including the associations, dependencies, and generalizations.
Relationships in use cases:
1. Communication:
The communication relationship of an actor in a use case is shown by
connecting the actor symbol to the use case symbol with a solid path. The
actor is said to communicate with the use case.
2. Uses:
A uses relationship between use cases is shown by a generalization
arrow from the use case.
3. Extends:
The extend relationship is used when one use case is similar to
another use case but does a bit more. In essence, it is like a subclass.
SEQUENCE DIAGRAMS
A sequence diagram is a graphical view of a scenario that shows object
interaction in a time-based sequence: what happens first, what happens
next. Sequence diagrams establish the roles of objects and help provide
essential information to determine class responsibilities and interfaces.
The main difference between sequence and collaboration diagrams is that
sequence diagrams show time-based object interaction while
collaboration diagrams show how objects associate with each other. A
sequence diagram has two dimensions: typically, vertical placement
represents time and horizontal placement represents different objects.
Object:
An object has state, behavior, and identity. The structure and behavior of
similar objects are defined in their common class. Each object in a diagram
indicates some instance of a class. An object that is not named is referred
to as a class instance.
The object icon is similar to a class icon except that the name is
underlined. An object's concurrency is defined by the concurrency of its
class.
Message:
A message is the communication carried between two objects that triggers
an event. A message carries information from the source focus of control
to the destination focus of control. The synchronization of a
message can be modified through the message specification.
A synchronous message is one where the sending object pauses to wait
for results.
Link:
A link should exist between two objects, including class utilities, only if
there is a relationship between their corresponding classes. The existence
of a relationship between two classes symbolizes a path of communication
between instances of the classes: one object may send messages to another.
The link is depicted as a straight line between objects or objects and class
instances in a collaboration diagram. If an object links to itself, use the
loop version of the icon.

CLASS DIAGRAM:
Identification of analysis classes:
A class is a set of objects that share a common structure and common
behavior (the same attributes, operations, relationships and semantics). A
class is an abstraction of real-world items. There are 4 approaches for
identifying classes:
a. Noun phrase approach.
b. Common class pattern approach.
c. Use case driven sequence or collaboration approach.
d. Classes, Responsibilities and Collaborators (CRC) approach.
1. Noun Phrase Approach:
The guidelines for identifying the classes:
• Look for nouns and noun phrases in the use cases.
• Some classes are implicit or taken from general knowledge.
• All classes must make sense in the application domain; avoid
computer implementation classes and defer them to the design stage.
• Carefully choose and define the class names.
After identifying the classes we have to eliminate the following types of
classes:
• Adjective classes.
2. Common class pattern approach:
The following are the patterns for finding the candidate classes:
• Concept class.
• Events class.
• Organization class.
• People class.
• Places class.
• Tangible things and devices class.
3. Use case driven approach:
We have to draw the sequence diagram or collaboration diagram. If there
is a need for some classes to represent some functionality, then add new
classes which perform those functionalities.
4. CRC approach:
The process consists of the following steps:
• Identify classes’ responsibilities (and identify the classes)
• Assign the responsibilities
• Identify the collaborators.
Identification of responsibilities of each class:
The questions that should be answered to identify the attributes and
methods of a class respectively are:
a. What information about an object should we keep track of?
b. What services must a class provide?
Identification of relationships among the classes:
Three types of relationships among the objects are:
Association: How are objects associated?
Super-sub structure: How are objects organized into super classes and sub
classes?
Aggregation: What is the composition of the complex classes?
Association:
The questions that will help us to identify the associations are:
a. Is the class capable of fulfilling the required task by itself?
b. If not, what does it need?
c. From what other classes can it acquire what it needs?
Guidelines for identifying the tentative associations:
• A dependency between two or more classes may be an association.
Association often corresponds to a verb or prepositional phrase.
• A reference from one class to another is an association. Some
associations are implicit or taken from general knowledge.
Some common association patterns are:
Location association, like part of, next to, contained in.
Communication association, like talk to, order to.
We have to eliminate unnecessary associations such as implementation
associations, ternary or n-ary associations, and derived associations.
Super-sub class relationships:
Super-sub class hierarchy is a relationship between classes where one class
is the parent class of another class (derived class). This is based on
inheritance.
Guidelines for identifying the super-sub relationship (a generalization) are:
1. Top-down:
Look for noun phrases composed of various adjectives in a class name.
Avoid excessive refinement. Specialize only when the sub classes have
significant behavior.
2. Bottom-up:
Look for classes with similar attributes or methods. Group them by
moving the common attributes and methods to an abstract class. You may
have to alter the definitions a bit.
3. Reusability:
Move the attributes and methods as high as possible in the hierarchy.
4. Multiple inheritance:
Avoid excessive use of multiple inheritance. One way of getting the benefits
of multiple inheritance is to inherit from the most appropriate class and
add an object of another class as an attribute.
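
The last guideline can be sketched in Python. The class names Report, Plotter and AccidentReport are hypothetical and are not classes of this system; the point is only that AccidentReport inherits from one class and holds an object of the other as an attribute, instead of inheriting from both.

```python
class Report:
    """Base class chosen as the most appropriate parent."""

    def __init__(self, title):
        self.title = title

    def summary(self):
        return f"Report: {self.title}"


class Plotter:
    """Second class whose behaviour is wanted without inheriting from it."""

    def render(self, values):
        return " ".join("#" * v for v in values)


class AccidentReport(Report):
    """Inherits from Report and keeps a Plotter object as an attribute."""

    def __init__(self, title, plotter):
        super().__init__(title)
        self.plotter = plotter

    def chart(self, counts):
        # Delegate to the contained object instead of inheriting render()
        return self.plotter.render(counts)


report = AccidentReport("2016 accidents", Plotter())
print(report.summary())
print(report.chart([1, 3]))
```

AccidentReport is still a Report for every purpose, while the plotting behaviour can be swapped by passing in a different object.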
Aggregation or a-part-of relationship:
It represents the situation where a class consists of several component
classes. A class that is composed of other classes doesn’t behave like its
parts. It behaves very differently. The major properties of this relationship
are transitivity and antisymmetry.
The questions whose answers will determine the distinction between the
part and whole relationships are:
• Does the part class belong to the problem domain?
• Is the part class within the system’s responsibilities?
• Does the part class capture more than a single value? (If not, then
simply include it as an attribute of the whole class.)
• Does it provide a useful abstraction in dealing with the problem
domain?
There are three types of aggregation relationships. They are:
Assembly:
It is constructed from its parts and an assembly-part situation physically
exists.
Container:
A physical whole encompasses but is not constructed from physical parts.
Collection member:
A conceptual whole encompasses parts that may be physical or conceptual.
The container and collection are represented by hollow diamonds, but
composition is represented by a solid diamond.

USE CASE DIAGRAM


A use case diagram in the Unified Modeling Language (UML) is a
type of behavioral diagram defined by and created from a Use-case
analysis. Its purpose is to present a graphical overview of the functionality
provided by a system in terms of actors, their goals (represented as use
cases), and any dependencies between those use cases. The main purpose
of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.

[Figure: use case diagram. Actor: User. Use cases: Start, Data Extraction,
Data Describe, Preprocessing, NLP processing, Data Visualisation, NLP
Analysis, Modeling, ML, Featuring & Visualize, Stop.]

Fig 1: Use Case Diagram



CLASS DIAGRAM
In software engineering, a class diagram in the Unified
Modeling Language (UML) is a type of static structure diagram that
describes the structure of a system by showing the system's classes, their
attributes, operations (or methods), and the relationships among the
classes. It explains which class contains information.

[Figure: class diagram with two classes, User and System. User lists
"hospital admission". Operations shown: Start(), Data Extraction(), Data
Describe(), Data Info(), Preprocessing(), NLP processing(), Data
Visualisation(), NLP Analysis(), Modelling(), ML(), Featuring &
Visualize(), end().]

Fig 2: Class Diagram



SEQUENCE DIAGRAM

A sequence diagram in Unified Modeling Language (UML) is a
kind of interaction diagram that shows how processes operate with one
another and in what order. It is a construct of a Message Sequence Chart.
Sequence diagrams are sometimes called event diagrams, event scenarios,
and timing diagrams.

[Figure: sequence diagram between User and System. Messages in order:
Start, Data, Data Extracted, Data Describe, Preprocessing, NLP processing,
Data Visualisation, NLP Analysis, Modeling, ML, Featuring & Visualize,
Get Result.]

Fig 3: Sequence Diagram



5. IMPLEMENTATION

5.1 FLOW CHART:


7. RESULTS AND DISCUSSIONS

8. CONCLUSION
The proposed system aims to analyze and predict industrial accidents from a publicly
available dataset. Using the dataset, the system was able to read the data, clean the
data, and produce various analyses and statistics, along with making predictions based
on the model it was trained with. Using the Random Forest Classifier showed that it is
a comparatively better algorithm than single decision trees. The system can be used for
any industry and can help industries better understand the fatalities that occur. The
system also aids in understanding the data and produces predictions that help keep
employees safer from further incidents.
1. Lack of valuable data: A machine learning algorithm often requires tens of thousands
of data samples [35] to be trained into an effective model. The acquisition of this
basic data often requires manual operations, and the speed cannot be guaranteed.

FUTURE SCOPE
In future enhancements, we will add more algorithms to make predictions more efficiently.

8. BIBLIOGRAPHY
[1] Kang, Ya-Chin, Yun-Fu, “Using Decision Tree to Predict the Occupational
Hazards and Return-to-work”, IEEE International Conference on Applied System
Innovation, 2017.
[2] S.A. Ngubo, C.P. Kruger, G.P. Hancke, B.J. Silva, “An Occupational Health
and Safety Monitoring System”, IEEE 14th International Conference on Industrial
Informatics, Oct. 2016.
[3] Jose Antonio Palazon, Javier Gozalvez, Juan Luis Maestre, Jose Ramon
Gisbert, “Wireless Solutions for Improving Health and Safety Conditions in
Industrial Environments”, IEEE 15th International Conference on e-Health
Networking, Applications and Services, 2013.
[4] “Industrial Safety and Accident Prevention; A Managerial Approach”,
International Journal of Science, Engineering and Technology Research, Feb. 2013.
[5] Tom Jose V, Sijo M T, Praveen, “Big Data platform for health and safety
accident prediction”
[6] C. Vehbi, et al, “Industrial Wireless Sensor Networks: Challenges, Design
Principles, and Technical Approaches”, IEEE Transactions on Industrial
Electronics, vol. 56, no. 10, Oct. 2009.
[7] Chen Chen, “Analysis and Forecast of Traffic Accident Big Data”, ITM Web
of Conferences, Jan. 2017.
[8] David Oswal et al, “Exploring Factors Affecting Unsafe Behaviours In
Construction”, 29th Annual ARCOM Conference, Sep. 2013.
[9] Anuoluwapo Ajayi et al, “Big Data Platform for health and safety Accident
Prediction”, World Journal of Science, Technology and Sustainable Development,
Jan. 2019.
[10] Y. Wan-Jun, W. Jian and Z. Huai-Lin, “Research on Risk Management of
Gas Safety based on Big Data”, 2018 International Conference on Intelligent
Informatics and Biomedical Sciences (ICIIBMS), 2018.
