Algorithm Current Situation
Context of our clients
1. Two types of clients:
a. Tertiary sector
i. Shopping centers (majority)
ii. Recreation center…
b. Industries
2. Data Source
a. The data come from sensors and meters installed at each customer's site.
b. These sensors and meters communicate with our server every hour or every fifteen minutes, depending on the type of device (mainly every fifteen minutes).
3. Data Structure
a. The data are divided into two categories for each customer:
i. The data to be predicted, composed of multiple meters: electricity, gas, water, cold, heat.
ii. The data that will help with the prediction, referred to as 'explanatory variables'. Some examples: the day, the opening hours, temperature, CO2, lighting, sensors in different areas of the buildings.
b. In general, each customer has several CO2, temperature, etc. sensors placed at different locations in the building.
c. The data structure differs between customers in terms of the number of meters. However, shopping centers share the same types of meters and explanatory variables. In other words, two customers that are both shopping centers will have different numbers of meters and explanatory variables (the number depends on the size of the site), but their consumption habits and the general structure of the data should be quite similar.
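As an illustration of this layout, here is a minimal pandas sketch of one day of 15-minute data; all column names and values are hypothetical, and the hourly resample shows how hourly devices can be aligned to the 15-minute grid:

```python
import numpy as np
import pandas as pd

# 15-minute index over one day (column names and values are hypothetical)
idx = pd.date_range("2022-01-01", periods=96, freq="15min")
rng = np.random.default_rng(0)

df = pd.DataFrame(
    {
        # meters to be predicted (outputs)
        "electricity_kwh": rng.uniform(40, 60, 96),
        "gas_kwh": rng.uniform(10, 20, 96),
        # explanatory variables (inputs)
        "temperature_c": rng.uniform(5, 15, 96),
        "co2_ppm": rng.uniform(400, 800, 96),
        "is_open": ((idx.hour >= 9) & (idx.hour < 20)).astype(int),
    },
    index=idx,
)

# devices reporting hourly can be aligned to the same grid by resampling
hourly = df["electricity_kwh"].resample("h").sum()
```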
Goal of the algorithm
1. Goal
The goal of the algorithm is twofold:
- to detect current abnormalities in consumption based on a reference period (note: the goal is not to predict the current or future consumption exactly);
- to normalize the consumption based on a reference period.
2. Steps
a. Training on a past reference period
i. The customer will choose a past period covering a whole calendar year.
ii. The algorithm will then have to train itself on this period and explain the consumption as well as possible from the explanatory variables. The explanatory variables are the inputs and the meters are the outputs.
b. Apply the model for current consumption:
i. The model will then be applied to real-time explanatory variables in order to calculate a normalized
current consumption (this value depends on the model and therefore will be different according to the
chosen reference period).
ii. The real-time consumption will then be compared to the normalized consumption.
iii. If the difference is too large, the algorithm must conclude that there is an abnormality. To decide this, we would like you to build a confidence interval.
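The steps above can be sketched as follows on synthetic data; scikit-learn's GradientBoostingRegressor stands in for XGBoost, and the empirical residual quantiles are one simple (assumed) way to build the confidence interval:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
coef = np.array([3.0, 1.0, 0.5, 0.0])

# a. reference period: explanatory variables (inputs) and meter readings (outputs)
X_ref = rng.normal(size=(1000, 4))
y_ref = X_ref @ coef + rng.normal(0, 0.3, 1000)
model = GradientBoostingRegressor().fit(X_ref, y_ref)

# residuals on the reference period give an empirical ~95% confidence interval
resid = y_ref - model.predict(X_ref)
lo, hi = np.quantile(resid, [0.025, 0.975])

# b. apply the model to real-time explanatory variables
X_now = rng.normal(size=(10, 4))
y_now = X_now @ coef + rng.normal(0, 0.3, 10)
normalized = model.predict(X_now)  # normalized current consumption
# flag an abnormality when the real-time reading leaves the interval
anomaly = (y_now - normalized < lo) | (y_now - normalized > hi)
```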
Our needs
The difficulties come from what we want to detect and from the reference periods.
What we want to know:
- Savings/losses compared to a reference period (being able to say: in 2023, if I had consumed the same way as in 2022, I would have consumed x kWh; I actually consumed y kWh, so a savings of z%).
To do this, we need a complete calendar year as the reference period, without processing/cleaning it. Do not worry about outliers in the reference period; they are considered normal.
- Detect problems and drifts in consumption (to save energy and warn the customer that there is a problem).
For this, the reference period should be the one that is most optimized in terms of consumption. That period becomes the benchmark, and we must do better than it. The goal is to find everything that deviates from the best period.
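The savings calculation in the first point reduces to comparing the model's counterfactual prediction against the actual total; a sketch with made-up numbers:

```python
# counterfactual from the reference-year model vs. actual consumption (made-up numbers)
predicted_2023_kwh = 1_200_000.0  # "if I had consumed the same way as in 2022"
actual_2023_kwh = 1_080_000.0

savings_pct = 100.0 * (predicted_2023_kwh - actual_2023_kwh) / predicted_2023_kwh
print(f"savings of {savings_pct:.1f}%")  # prints "savings of 10.0%"
```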
What we already had as an idea:
- "Clean a past period" by removing abnormal values (we "clean" 2022, for example)
- Either manually: there is a lot of data and it is sometimes difficult to quickly identify problems ⇒ complicated
- Or find a way to analyze the reference profile, detect what does not seem normal, remove those values, and use the result as a reference ⇒ perhaps it is better to directly analyze the profile and see whether it is normal or not ⇒ an isolation forest, or something else?
- The problem (or not, we don't know) with applying an isolation forest to the entire profile is that it does not remove abnormal past consumption.
- Have a 'rolling' reference period that sticks as closely as possible to today's date: reference period = the last month or the last 6 months,
for example
- The concern with this technique is that if consumption increases, the reference period would become bad, and high consumption would therefore become normal.
The difficulty is therefore: "How to detect overconsumption? Based on what?" We want to see whether a model with a one-year reference (XGBoost, for example) combined with another method (an isolation forest, for example) would meet our needs.
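A minimal sketch of the isolation-forest cleaning idea on synthetic data; the contamination rate is an assumed tuning parameter, not a recommendation:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)
consumption = rng.normal(50, 5, size=(1000, 1))  # synthetic reference profile
consumption[::100] = 200.0                       # injected abnormal readings

# flag suspect readings in the candidate reference period
iso = IsolationForest(contamination=0.02, random_state=0).fit(consumption)
keep = iso.predict(consumption) == 1  # predict returns 1 = inlier, -1 = outlier
cleaned = consumption[keep]           # cleaned candidate reference period
```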
Stages - Plan of action
Already done
1. Exploration of Data and Analysis: We computed correlations between our data and the explanatory variables, as we first wanted an idea of the feasibility of the project.
a. We then decided to create variables (see slide x, "Explanatory Variables")
b. We recomputed the correlations to check whether this made sense, and for most variables it did
c. We aggregated the consumption of the meters per usage to see if there were correlations. There were a few, but not as many as expected
2. Building the ML model
a. We tried 3 different models: XGBoost, SVR, and Elastic Net. We selected the best-performing one using the grid-search score of each.
b. We then focused on XGBoost and ran the code for all the meters of one site.
c. We are getting relatively good results; it depends on the meter.
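The model-selection step in 2.a can be sketched as follows on synthetic data; GradientBoostingRegressor stands in for XGBoost, and the parameter grids are purely illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

# candidate models and illustrative parameter grids
candidates = {
    "gbr": (GradientBoostingRegressor(random_state=0), {"n_estimators": [50, 100]}),
    "svr": (SVR(), {"C": [1.0, 10.0]}),
    "enet": (ElasticNet(), {"alpha": [0.1, 1.0]}),
}

# compare models by their best cross-validated score (R^2 by default)
scores = {}
for name, (est, grid) in candidates.items():
    gs = GridSearchCV(est, grid, cv=3).fit(X, y)
    scores[name] = gs.best_score_
best = max(scores, key=scores.get)
```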
Next to do
3. Come up with new feature-engineering ideas to feed the model. Think about ways of improving the model ⇒ feature selection, ... You might also want to test out different models.
a. We think feature engineering is an important part of our model's performance. To that end, we expect you to come up with ideas.
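As a starting point for step 3, a sketch of simple calendar features derived from the 15-minute timestamp index; the 09:00-20:00, Monday-Saturday opening schedule is hypothetical:

```python
import numpy as np
import pandas as pd

# one week of 15-minute timestamps
idx = pd.date_range("2022-01-01", periods=7 * 96, freq="15min")
feat = pd.DataFrame(index=idx)

# calendar features (the 09:00-20:00, Monday-Saturday schedule is hypothetical)
feat["hour"] = idx.hour
feat["day_of_week"] = idx.dayofweek
feat["is_weekend"] = (idx.dayofweek >= 5).astype(int)
feat["is_open"] = ((idx.hour >= 9) & (idx.hour < 20) & (idx.dayofweek < 6)).astype(int)

# cyclical encoding so 23:00 and 00:00 end up close in feature space
feat["hour_sin"] = np.sin(2 * np.pi * idx.hour / 24)
feat["hour_cos"] = np.cos(2 * np.pi * idx.hour / 24)
```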
4. Develop a statistical model that would flag anomalies based on the prediction of the ML model (XGBoost).
5. Explore and propose other ways to detect anomalies (unsupervised models such as isolation forest)
6. If different solutions are used, find a way to combine them with one another.
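One possible way to nest the two approaches from steps 4-6 would be to run an unsupervised detector on the residuals of the regression model rather than on the raw profile; a sketch on synthetic data, with an assumed contamination rate:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, IsolationForest

rng = np.random.default_rng(3)
X = rng.normal(size=(800, 3))
y = X.sum(axis=1) + rng.normal(0, 0.2, 800)
y[::80] += 5.0  # occasional drifts in consumption

# stage 1: regression model explains consumption from explanatory variables
model = GradientBoostingRegressor().fit(X, y)
resid = (y - model.predict(X)).reshape(-1, 1)

# stage 2: unsupervised detector on the residuals instead of the raw profile
iso = IsolationForest(contamination=0.05, random_state=0).fit(resid)
flags = iso.predict(resid) == -1  # True where the residual looks abnormal
```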
Remarks
General:
- Each customer can have different ways of consuming. We would like the model to be as versatile as possible: it should work for different customers, and the algorithm should not vary between customers. You should anticipate this by testing it on the different datasets you will be provided with and, for example, by automating the fine-tuning process of finding adequate parameters for the model.
- It may be that one single model does not work for every customer. If this is the case, we will discuss alternatives. It is still possible to create one algorithm that tests different models, for example.
Data
- Not all the data should be taken into account in your calculations, as we sometimes have outliers. Most outliers are due to communication problems; this mostly translates into an absence of data during a certain period, followed by a huge peak.
- The list of explanatory variables may change according to the customer. If you have any ideas on what kind of variables we can
add please feel free to share your thoughts. We will decide together if this can be done.
- Some meters are less important than others, as their consumption is much lower. A list of the importance level of each meter will be included with the files you will be given.
- You should analyze the importance of each explanatory variable per customer. The goal is to have a list of important variables.
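The communication-outlier pattern described above (an absence of data followed by a catch-up peak) can be flagged with a simple rule; a sketch on synthetic data, where the 3x-median threshold is an assumption to tune:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2022-01-01", periods=96, freq="15min")
series = pd.Series(50.0, index=idx)
series.iloc[40:48] = np.nan  # communication gap: no data received
series.iloc[48] = 400.0      # catch-up peak once communication resumes

# a reading is suspect if it follows missing data and is far above the typical level
gap_before = series.shift(1).isna()
typical = series.median()           # NaNs are ignored by median
suspect = gap_before & (series > 3 * typical)
```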
Reference period
- By default, the reference period will be the previous calendar year. It could change, or the model could be retrained automatically at regular intervals.
Generating alarms:
- An alarm is created when there is abnormal consumption.
- We would rather create fewer alarms that are all true (true positives) than have too many false alarms (false positives). (Rather false negatives than false positives.)
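Favoring precision over recall can be encoded as a conservative alarm threshold; a sketch on synthetic residuals, where the 99.9th-percentile cut-off is an illustrative choice, not a prescribed value:

```python
import numpy as np

rng = np.random.default_rng(4)
resid = rng.normal(0, 1, 10_000)  # residuals observed under normal operation

# a high percentile trades missed anomalies (false negatives) for fewer false alarms
threshold = np.quantile(np.abs(resid), 0.999)

def alarm(residual: float) -> bool:
    """Raise an alarm only for residuals well outside normal operation."""
    return abs(residual) > threshold
```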