0% found this document useful (0 votes)
100 views

Insurance Premium Prediction

This document provides a high-level design for an insurance premium prediction system. It will use machine learning models to predict insurance premiums based on individuals' health data. Regression, tree-based, and ensemble models will be tested and compared. The best performing model, GradientBoostingRegressor, will be implemented in an API to predict premiums. Technical requirements, data needs, tools, and the overall system design are described at a high level. The goal is to create a solution that can accurately predict insurance premiums based on factors like BMI, smoking status, age, and more.

Uploaded by

Vinay Singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
100 views

Insurance Premium Prediction

This document provides a high-level design for an insurance premium prediction system. It will use machine learning models to predict insurance premiums based on individuals' health data. Regression, tree-based, and ensemble models will be tested and compared. The best performing model, GradientBoostingRegressor, will be implemented in an API to predict premiums. Technical requirements, data needs, tools, and the overall system design are described at a high level. The goal is to create a solution that can accurately predict insurance premiums based on factors like BMI, smoking status, age, and more.

Uploaded by

Vinay Singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

Insurance Premium Prediction

Insurance Premium Prediction

Document Version Control

Date Issued Version Description Author

11/01/2023 1 Initial HLD — V1.0 Aakash Pal


Insurance Premium Prediction

Document Version Control. ......................................................................................................2


Abstract.................................................................................................................................... 4
1 Introduction ....................................................................................................................... 5
1.1 Why This High-Level Design Document? ................................................................ 5
1.2 Scope. ....................................................................................................................... 5
1.3 Definitions .................................................................................................................. 6
2 General Description. ........................................................................................................ 7
2.1 Product Perspective .................................................................................................. 7
2.2 Problem Statement.................................................................................................... 7
2.3 Proposed Solution ..................................................................................................... 7
2.4 Technical Requirements............................................................................................ 7
2.5 Data Requirements. ...................................................................................................8
2.6 Tools Used................................................................................................................. 8
2.7 Constraints.................................................................................................................9
2.8 Assumptions .............................................................................................................. 9
3 Design Details ................................................................................................................. 10
3.1 Process Flow. .......................................................................................................... 10
3.2 Event Log ................................................................................................................ 10
3.3 Error Handling.......................................................................................................... 11
4 Performance.................................................................................................................... 11
4.1 Reusability. .............................................................................................................. 11
4.2 Application Compatibility. ........................................................................................ 11
4.3 Deployment. ............................................................................................................ 11
5 Dashboards. ....................................................................................................................12
6 Conclusion ...................................................................................................................... 12
Insurance Premium Prediction

In this project, we aim to predict insurance premiums for individuals by analyzing their health
data. We employed models like the Regression model, Tree algorithm, and Bagging and
Boosting Algorithm to accomplish this. We used these models to compare and contrast their
performance.

The training dataset was used to train the model, and the predictions made by the model were
compared with actual data to test and verify the model's accuracy. After reaching the accuracy
of all the models, it was determined that the GradientBoostingRegressor and
RandomForestRegressor algorithms performed better than the remaining models.

Ultimately, we found that the GradientBoostingRegressor algorithm was the best suited for this
task as it provided the best evaluation score compared to the other models.
Insurance Premium Prediction

Introduction
1.1 Why this High-Level Design Document?
The purpose of this High-Level Design (HLD) Document is to add the necessary
detail to the current project description to represent a suitable model for
coding. This document is also intended to help detect contradictions prior to
coding, and can be used as a reference manual for how the modules interact at
a high level.

The HLD will:


• Present all of the design aspects and define them in detail
• Describe the user interface being implemented
• Describe the hardware and software interfaces
• Describe the performance requirements
• Include design features and the architecture of the project
• List and describe the non-functional attributes like:
o Reliability
o Security
o Maintainability
o Portability
o Reusability
o Application compatibility
o Resource utilization
o Serviceability

1.2 Scope
The HLD documentation presents the structure of the system, such as the
database architecture, application architecture (layers), application flow
(Navigation), and technology architecture. The HLD uses non-technical to
mildly-technical terms which should be understandable to the administrators
of the system.
Insurance Premium Prediction

1.3 Definitions

Term Description
UGV Unmanned Ground Vehicle
Database Collection of all the information monitored by this system
IDE AWS Integrated Development Environment
Amazon Web Services
Insurance Premium Prediction

2.1 Product Perspective


The Insurance premium estimation is a machine learning-based predictive model
which will help us to predict the premium of the person for health insurance.

2.2 Problem Statement


To develop an API interface to predict the premium of insurance using
people's individual health data and analyze the following:
• To detect BMI value affects the premium.
• To detect smoking affects the premium of the insurance.
• To create an API interface to predict the premium.

2.3 Proposed Solution


The solution proposed here is to estimate the premium of insurance based on
people's health data. This can be implemented to perform the use cases
mentioned above. In the first case, we will analyze how the BMI value affects
people's health as well as the premium of the insurance. In the second case, if
the model detects that smoking affects the premium, we will inform those
people. And in the last use case, we will create an interface to predict the
premium.

2.4 Technical Requirements

The solution proposed here can be implemented as a cloud-based solution


or as an application hosted on an internal server, or even on a local
machine. To access this application, the following minimum requirements
are necessary:
• A good internet connection.
• A web browser.
Insurance Premium Prediction

For training the model, the following system requirements are preferred:

• 4 GB RAM or more.

• An operating system such as Windows, Linux or Mac.


• Visual Studio Code or Jupyter notebook.

2.5 Data Requirements

The data requirements for this project will depend on the specific problem
statement. A CSV file will be used as the input file, and the feature/field names
and sequence should be followed as decided. It's important to have a clear
understanding of the problem statement and the data that is required to solve
it, to design a suitable data pipeline, and to train the model effectively.

2.6 Tools used

Python programming language and frameworks such as NumPy, Pandas, Scikit-


learn, TensorFlow, Keras, and Roboflow are used to build the whole model.
Insurance Premium Prediction

 Pandas is an open-source Python package that is widely used for data analysis
and machine-learning tasks.
 NumPy is the most commonly used package for scientific computing in
Python.
 Plotly is an open-source data visualization library used to create interactive
and quality charts/graphs.
 Scikit-learn is used for machine learning.
 Flask is used to build API.
 VS Code is used as an IDE (Integrated Development Environment)
 GitHub is used as a version control system.
 Front-end development is done using HTML and CSS.
 Railway is used for the deployment of the model.

2.7 Constraints

It is useful for the user by predicting Insurance prices based on their provided
details for ex: - Bmi, sex, smoker, yes/no, age, etc.

2.8 Assumptions

The main objective of the project is to develop an API to predict the premium for
people based on their health information. A machine learning-based regression
model is used for predicting the above-mentioned cases on the input data.
Insurance Premium Prediction

3 Design Details
3.1 Process Flow

3.2 Event log

The system should log every event so that the user will know what
process is running internally.

Initial Step-By-Step Description:


 The System identifies at what step logging required
 The System should be able to log each and every system flow.
 Developer can choose logging method. You can choose database
logging / File logging as well.
 System should not hang even after using so many loggings.
Insurance Premium Prediction

3.3 Error Handling

Should errors be encountered, an explanation will be displayed as to what


went wrong? An error will be defined as anything that falls outside the normal
and intended usage.

4 Performance.
4.1 Reusability

The entire solution will be done in a modular fashion and will be API oriented.
So, in the case of scaling the application, the components are completely
reusable.

4.2 Application Compatibility

The interaction with the application is done through the designed user interface,
which the end user can access through any web browser.

4.3 Deployment
Insurance Premium Prediction

5 Dashboards
A dashboard is a data visualization and analysis tool that displays on one screen the
status of key performance indicators (KPIs) and other important business metrics.

As a high-level reporting mechanism, dashboards provide fast ‘big-picture’ answers


to critical business questions and assist and benefit decision-making in several
ways:

 Communicating how premium varies with BMI value.


 Visualizing the relationship of gender with premium in easy to understanding
way.

6 Conclusion

This system shows us the different techniques that are used to estimate the how
much amount of premium required based on individual health situations. After
analysis, it shows how a smoker and non-smokers affect the amount of estimate.
Also, a significant difference between male and female expenses. Accuracy plays
a key role in prediction-based systems. From the results, we could see that Gradient
Boosting turned out to be the best working model for this problem in terms of
accuracy. Our predictions help users to know how much amount premium they
need based on their current health situation.

You might also like