Practical Machine Learning for DBAs

Alex Gorbachev
Las Vegas, NV
April 2014
Alex Gorbachev
• Chief Technology Officer at Pythian
• Blogger
• Cloudera Champion of Big Data
• OakTable Network member
• Oracle ACE Director
• Founder of BattleAgainstAnyGuess.com
• Founder of Sydney Oracle Meetup
• IOUG Director of Communities
• EVP, Ottawa Oracle User Group
Agenda
• What’s Machine Learning?
– Typical Machine Learning applications
• Why use Oracle Database for Machine Learning?
• Practical examples
– Classifying PL/SQL code
– Classifying database schemas into good and bad
– Clustering SQL statements
– Detecting anomalies in database workload
What is Machine Learning?
data magic
scientific data analysis
modern practical AI
building simplified models of the universe using probabilistic models
Tom Mitchell’s definition
• Machine Learning is the study of computer algorithms that improve automatically through experience.
• A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Why is it useful?
Classes of ML algorithms
• Supervised learning
– Input: data + known facts; Output: predictions
• Unsupervised learning
– Input: data; Output: hypothesis
• Other, less common classes such as reinforcement learning and recommenders
Supervised Learning: Linear Regression
Supervised Learning: Classification
Unsupervised Learning: Clustering
Unsupervised Learning: Anomaly Detection
Machine Learning workflow
• Gather
• Clean & transform
• Explore
• Model
• Interpret
• Produce value
} today’s focus
Why Machine Learning in Oracle Database?
Machine Learning in Oracle DB?
• That’s where the data is
• Data in an RDBMS is often clean
• Easy to transform data with SQL
• Powerful algorithms implemented
– Oracle Data Mining option
– Analytic SQL
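As one illustration of what built-in SQL functions can do without any extra tooling, the sketch below fits a simple linear regression with Oracle's REGR_* aggregate functions. The table db_growth and its columns sample_day and used_gb are hypothetical names chosen for this example; they do not come from the talk.

```sql
-- Assumption: db_growth(sample_day NUMBER, used_gb NUMBER) is a hypothetical
-- table with one row per daily space-usage sample.
-- REGR_* functions take the dependent variable first, then the independent one.
SELECT REGR_SLOPE(used_gb, sample_day)     AS gb_per_day,      -- growth rate
       REGR_INTERCEPT(used_gb, sample_day) AS gb_at_day_zero,  -- fitted value at day 0
       REGR_R2(used_gb, sample_day)        AS r_squared        -- goodness of fit
FROM   db_growth;
```

The slope and intercept give a fitted line that can be extrapolated to estimate future usage, and an r_squared close to 1 suggests the linear model is a reasonable fit for the sampled period.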
Machine Learning by example
Applying Machine Learning to the business of DBAs
Problem: Detect bad PL/SQL
• Goal: automated PL/SQL code grading
– Classify as Good or Bad
• Typical classification task
– Assigning labels to a set of unlabeled items based on prior observations
Classification process
• Parse input data
• Extract features
– Manually, automatically, or taken directly from the data when they are already clearly defined (if a row is an item, its columns may be the features)
• Train – calculate model based on labeled input
• Verify – test model on labeled input
• Apply labels to unlabeled input
• Classification is supervised learning
Features definition - easy task?
Kittens vs …
Kittens vs Puppies
PL/SQL code features
• Automatically extract words from the text as features (tokenize)
– EASY TO AUTOMATE
• Assign features intelligently
– Code size
– Author
– Percent of comment lines
– Presence of specific code patterns
– DIFFICULT TO AUTOMATE
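A few of the hand-picked features listed above can be approximated with plain SQL. The query below is only a sketch: it assumes the plsql_build table and its object_id and code columns used elsewhere in this deck, counts only single-line comments that start with --, and treats the WHEN OTHERS THEN NULL check as a simple regular-expression match rather than a real parse of the code.

```sql
-- Sketch of manually defined features, one row per PL/SQL unit.
SELECT object_id,
       LENGTH(code)                                         AS code_size,
       REGEXP_COUNT(code, CHR(10)) + 1                      AS line_count,
       -- Lines whose first non-space characters are "--" (multiline mode 'm'
       -- lets ^ match at the start of every line). /* ... */ blocks are ignored.
       ROUND(100 * REGEXP_COUNT(code, '^ *--', 1, 'm')
                 / (REGEXP_COUNT(code, CHR(10)) + 1), 1)    AS pct_comment_lines,
       -- 1 if the code contains WHEN OTHERS THEN NULL (case-insensitive),
       -- even inside a comment or string literal.
       CASE WHEN REGEXP_LIKE(code, 'when\s+others\s+then\s+null', 'i')
            THEN 1 ELSE 0 END                               AS has_when_others_null
FROM   plsql_build;
```

Features such as author or more subtle code patterns would need metadata or real parsing, which is why the slide marks this route as difficult to automate.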
Classification model workflow
1. Create Oracle Text policy (define lexer)
2. Configure and build the model on training set
3. Apply model to the testing set
4. Assess model performance
5. Adjust model settings/features/size and repeat
Basic probability lesson
• p(A) is the probability that A is true
[Diagram: a square of area 1 divided into the regions "A is true" and "A is false"]
• Axioms of Probability
• Bayes Law
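The formulas on these slides did not survive extraction. For reference, the standard textbook statements of the axioms and of Bayes Law are given below; the notation is an assumption and may differ from what the slides showed.

```latex
\begin{gather*}
0 \le p(A) \le 1 \\
p(\text{true}) = 1, \qquad p(\text{false}) = 0 \\
p(A \lor B) = p(A) + p(B) - p(A \land B) \\[6pt]
p(B \mid A) = \frac{p(A \land B)}{p(A)} = \frac{p(A \mid B)\, p(B)}{p(A)}
\end{gather*}
```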
How can Bayes Law work for us?
• A – presence of a feature like WHEN OTHERS THEN NULL in PL/SQL
• B – bad PL/SQL code
[Diagram: regions A and B inside a square of area 1, with B|A marked]
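A small worked example may make this concrete. All numbers below are hypothetical, chosen only for illustration: suppose 20% of the code is bad, the feature A appears in 50% of bad code, and it appears in 5% of good code. The probability that a unit containing the feature is bad is then:

```latex
\begin{align*}
p(A) &= p(A \mid B)\,p(B) + p(A \mid \lnot B)\,p(\lnot B)
      = 0.5 \cdot 0.2 + 0.05 \cdot 0.8 = 0.14 \\
p(B \mid A) &= \frac{p(A \mid B)\,p(B)}{p(A)}
      = \frac{0.5 \cdot 0.2}{0.14} \approx 0.71
\end{align*}
```

Seeing the feature raises the estimate that the code is bad from 20% to about 71%. A Naive Bayes classifier combines many such features, treating them as independent given the class.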
PL/SQL data source
• OBJECT_ID – case ID
• CODE – text column
• TARGET_VALUE – 0 is good and 1 is bad
• Training set
– where mod(object_id, 10) < 5
• Testing set
– where mod(object_id, 10) >= 5
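The split above could be materialized along the following lines. This is a sketch only: plsql_source is a hypothetical name for the table holding all labeled PL/SQL units, while plsql_build and plsql_test are the table names used by the later slides.

```sql
-- Assumption: plsql_source(object_id, code, target_value) is a hypothetical
-- table with every labeled PL/SQL unit.
CREATE TABLE plsql_build AS
  SELECT object_id, code, target_value
  FROM   plsql_source
  WHERE  MOD(object_id, 10) < 5;     -- training set

CREATE TABLE plsql_test AS
  SELECT object_id, code, target_value
  FROM   plsql_source
  WHERE  MOD(object_id, 10) >= 5;    -- testing set

-- Sanity check: class balance in each set
SELECT 'build' AS dataset, target_value, COUNT(*) AS cases
FROM   plsql_build GROUP BY target_value
UNION ALL
SELECT 'test', target_value, COUNT(*)
FROM   plsql_test GROUP BY target_value
ORDER  BY 1, 2;
```

Splitting on MOD(object_id, 10) is deterministic, so the same rows land in the same set on every run; it only approximates a random 50/50 split if object IDs are unrelated to code quality.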
Oracle Text policy
begin
  -- Drop the policy and the lexer preference if they already exist,
  -- so that the script can be rerun.
  begin
    ctx_ddl.drop_policy('plsql_nb_policy');
  exception when others then null;
  end;
  begin
    ctx_ddl.drop_preference('plsql_nb_lexer');
  exception when others then null;
  end;
  -- BASIC_LEXER splits the text into word tokens
  ctx_ddl.create_preference('plsql_nb_lexer', 'BASIC_LEXER');
  ctx_ddl.create_policy('plsql_nb_policy', lexer => 'plsql_nb_lexer');
end;
/
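To see which tokens the policy produces, one option is CTX_DOC.POLICY_TOKENS, which tokenizes a document in memory using a policy. The sketch below assumes the plsql_nb_policy created above and a made-up one-line PL/SQL snippet; SET SERVEROUTPUT ON is a SQL*Plus command and may differ in other clients.

```sql
SET SERVEROUTPUT ON

DECLARE
  toks ctx_doc.token_tab;   -- collection of (token, offset, length) records
  i    PLS_INTEGER;
BEGIN
  -- The document text is a hypothetical sample, not data from the talk.
  ctx_doc.policy_tokens(
    policy_name => 'plsql_nb_policy',
    document    => 'begin do_work; exception when others then null; end;',
    restab      => toks);

  -- Walk the collection with FIRST/NEXT so an empty result is handled safely.
  i := toks.FIRST;
  WHILE i IS NOT NULL LOOP
    dbms_output.put_line(toks(i).token);
    i := toks.NEXT(i);
  END LOOP;
END;
/
```

The exact tokens depend on the lexer attributes and the stoplist in effect, so the output is worth checking before building a model on top of it.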
Model settings
CREATE TABLE plsql_nb_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

BEGIN
  -- Populate settings table
  -- Algorithm: Naive Bayes
  INSERT INTO plsql_nb_settings VALUES
    (dbms_data_mining.algo_name, dbms_data_mining.algo_naive_bayes);
  -- Automatic data preparation
  INSERT INTO plsql_nb_settings VALUES
    (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
  -- Oracle Text policy used to tokenize the text column
  INSERT INTO plsql_nb_settings VALUES
    (dbms_data_mining.odms_text_policy_name, 'plsql_nb_policy');
  -- INSERT INTO plsql_nb_settings VALUES
  --   (dbms_data_mining.NABS_PAIRWISE_THRESHOLD,0.01);
  -- INSERT INTO plsql_nb_settings VALUES
  --   (dbms_data_mining.NABS_SINGLETON_THRESHOLD,0.01);
  COMMIT;
END;
/
Build model
DECLARE
  xformlist dbms_data_mining_transform.TRANSFORM_LIST;
BEGIN
  -- Drop the model if it already exists, so that the script can be rerun
  BEGIN
    DBMS_DATA_MINING.DROP_MODEL('PLSQL_NB');
  EXCEPTION WHEN OTHERS THEN NULL;
  END;

  -- Treat the CODE column as text to be tokenized
  dbms_data_mining_transform.SET_TRANSFORM(
    xformlist, 'code', null, 'code', null, 'TEXT(TOKEN_TYPE:NORMAL)');

  -- Build the classification model on the training set
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'PLSQL_NB',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'plsql_build',
    case_id_column_name => 'object_id',
    target_column_name  => 'target_value',
    settings_table_name => 'plsql_nb_settings',
    xform_list          => xformlist);
END;
/
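After the build, the data dictionary can confirm that the model exists and show the settings it was built with. The queries below are a sketch that assumes the model was created in the current schema under the name PLSQL_NB, as in the block above.

```sql
-- Was the model created, and with which function and algorithm?
SELECT model_name, mining_function, algorithm, creation_date
FROM   user_mining_models
WHERE  model_name = 'PLSQL_NB';

-- Which settings did the model end up with (supplied and default)?
SELECT setting_name, setting_value, setting_type
FROM   user_mining_model_settings
WHERE  model_name = 'PLSQL_NB'
ORDER  BY setting_name;
```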
Test model
-- Confusion matrix: actual vs. predicted labels on the testing set
SELECT
  target_value AS actual_target,
  PREDICTION(plsql_nb USING *) AS predicted_target,
  COUNT(*) AS cases_count
FROM plsql_test
GROUP BY target_value,
  PREDICTION(plsql_nb USING *)
ORDER BY 1, 2;
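The confusion matrix can be condensed into a single accuracy figure, and the same model can rank code by how likely it is to be bad. The queries below are a sketch that reuses the plsql_nb model and plsql_test table from this deck; the column aliases and the choice of ten rows are arbitrary.

```sql
-- Overall accuracy: share of test cases where the prediction matches the label
SELECT COUNT(*) AS cases,
       ROUND(AVG(CASE WHEN PREDICTION(plsql_nb USING *) = target_value
                      THEN 1 ELSE 0 END), 3) AS accuracy
FROM   plsql_test;

-- Ten test cases the model considers most likely to be bad (class 1)
SELECT object_id,
       PREDICTION_PROBABILITY(plsql_nb, 1 USING *) AS p_bad
FROM   plsql_test
ORDER  BY p_bad DESC
FETCH FIRST 10 ROWS ONLY;
```

Accuracy alone can mislead when one class is rare, so it is best read alongside the per-class counts in the confusion matrix.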
Demo
Skyline and Oculus by Etsy
blackbox anomaly detection
Thanks and Q&A
Contact info
gorbachev@pythian.com
+1-877-PYTHIAN
To follow us
pythian.com/blog
@alexgorbachev

@pythian
linkedin.com/company/pythian

Introduction to Machine Learning for Oracle Database Professionals
