
Profile (2)

Booth Dennis is a Data Scientist/Engineer with 7 years of experience in data processing, machine learning, and AI, currently working at Cabot Financial. He has led significant projects, including developing a charging order identifier and customer segmentation strategies, while also mentoring junior employees. His research interests encompass predictive modeling, deep learning, and graph theory, with a strong focus on customer service and team performance.

Uploaded by

Disney

Contact

[email protected]
www.linkedin.com/in/booth-dennis-b70a88312 (LinkedIn)

Booth Dennis
Data Coder Operator at Cabot Financial
Antioch, Illinois, United States

Top Skills
Stakeholder Engagement
Sequence Analysis
Customer Segmentation Strategy

Summary
An adaptable Data Scientist/Engineer with 7 years of professional
experience delivering a wide array of impactful products for data
processing, machine learning and artificial intelligence. Through
influencing and communicating with stakeholders, many of these
products have been deployed for regular use throughout their
respective organisations. Particular regard is given to customer
service and satisfaction, and to motivating and driving forward team
performance, recently by adopting agile and DevOps practices. Very
open to new ideas and to developing new skills, in order to stay
competitive and to strive for excellence.

My research interests include:


· Predictive Modelling and Quantitative Structure-Property
Relationships (QSAR/QSPR)
· Deep Learning
· Graph Theory
· Similarity Searching (mostly in the context of Virtual Screening
using graph theory and/or fingerprints)
· Compound Deck design and Multi-Parameter Optimisation

Experience
Cabot Financial
Data Coder Operator

Cabot Financial
Senior Data Scientist
January 2022 - March 2023 (1 year 3 months)
Machine learning project lead and developer, building processes to help
recover debt on two projects:
· Built a charging order identifier (debts attached to property) using Spark NLP
to search through historic company notes for references to charging orders,
with a classification accuracy of 84%. It has the potential to recover hundreds
of thousands of pounds of debt, though it is currently pending deployment to
stakeholders.
· As the resident Spark and cloud expert, upskilled other employees via
presentations and support groups, particularly on best practices for optimising
Spark and Databricks code for speed and memory.
· Built a new customer segmentation strategy to improve on a previous version,
using Experian and Open API (Doorda) data alongside internal customer
information to engineer features that optimise cluster separation.
· Managed two junior employees for 5 weeks in the line manager’s absence to
progress the charging order project.

THAMES WATER UTILITIES LIMITED
Senior Data Scientist
November 2018 - December 2021 (3 years 2 months)
Pioneer in Smart Metering and Leakage analytics introducing ground-breaking
products via the Azure platform to drive Thames Water into outstanding levels
of successful data utilisation. Along with managing a data scientist, our team
has:
· Built a critically acclaimed automated model to predict failed mains sensors;
hailed as one of the greatest breakthroughs in data science and modelling in
the business. Built to output daily scheduled predictions to Azure Synapse,
where stakeholders could access model outputs via Microsoft Excel. Model
was continuously used to direct technicians to predicted failed sensors for
immediate replacement.
· Built a shared supply predictor to slash customer complaints and thus
mitigate extortionate fines to the business. Built in Azure Databricks (PySpark),
using recommended features from subject matter experts across the business,
and subsequently used to direct technicians to pre-emptively address shared
supply properties.
· Produced a random forest regressor for predicting the number of occupants in
a house (mean squared error 0.8). PySpark was used to produce the model,
mining terabytes of meter reads from half a million customer accounts.
· Automated several models and projects to give scheduled outputs to
stakeholders. Azure Data Factory ETL was used for scheduling, and
Databricks (PySpark and SQL) for table creation and manipulation.
· Hosted and presented at forums for data science within the organisation.

UCB
Chemoinformatician
October 2015 - November 2018 (3 years 2 months)

Evaluated a set of calculable properties, which led to the production of global
and series-specific pKa and logD models (mean squared error of 0.5 for both).
These models have been utilised in more than 6 therapeutic projects. Other
activities included:
· Released a hERG random forest classifier, which helped bridge our CADD
group with the DMPK department.
· Large-scale similarity searching, both 2D (ChemAxon MadFast and ChemFP)
and 3D (Schrödinger Phase)
· Web form development for chemist access to QSAR models (including
jQuery and JavaScript)
· Demonstration and training of our chemoinformatics software to chemists in
both UK and Belgium sites
· Involved in the supervision and development of two chemoinformatics
placement students (one year each)

University of Sheffield
PhD Student in Chemoinformatics
October 2012 - October 2015 (3 years 1 month)
Explored the maximum common substructure concept and its applications
to 2D similarity searching and virtual screening. My work involved coding
various graph-matching algorithms to find the maximum common substructure
between two molecules. Such algorithms are used in the generation of
chemical hyperstructures (equivalent to supergraphs in graph theory), as well
as for various forms of virtual screening. Of note, we found a particular
topology-based manipulation which drastically increased the search speed of
exact algorithms, allowing small molecules (~500 Daltons) to be compared
in seconds. KNIME (with Java, JUnit and R) was the software platform of
choice, with my thesis completed using LaTeX. Gained additional experience
in web design by creating the website for the Information School postgraduate
research conference. Also did demonstration work for the teaching of
undergraduates and postgraduates in: chemoinformatics; web design (HTML);
content management systems (PHP and open source platforms); JavaScript;
jQuery; database design (Oracle SQL).

EBI
Trainee
June 2012 - September 2012 (4 months)
Cheminformatics project using KNIME and Pipeline Pilot to construct naive
Bayesian classifiers from ChEMBL bioactivity data based on ADME-related
protein targets. Applied the domain of applicability concept to verify model
validity, and developed KNIME nodes/software (in Java) to retrieve data from
ChEMBL.

Xention Discovery
Placement Student
June 2010 - August 2011 (1 year 3 months)
Used Pipeline Pilot, along with Java, to develop software which designed
molecules with novel scaffolds and analogues from existing ligands. Built
Bayesian and activity space models (involving principal components analysis
and multi-dimensional scaling) to help predict compound activities against
targets, and did some work on pharmacophore mapping. Helped
implement database retrieval software using Pipeline Pilot, Oracle and
JavaScript. Learned how to apply information from scientific literature from a
programming perspective, and to present software to company scientists in
presentations and conversation.

Wellcome Trust Sanger Centre
Vacation Student
June 2009 - September 2009 (4 months)
Used Perl to develop a platform that performed text-based statistical analyses
of yeast promoter sequences. Also used R to statistically validate a DNA
binding motif predictor and plot the results.

Mologic Ltd
Vacation Student
June 2008 - September 2008 (4 months)
Researched yeast promoter functionality across many yeast species and
presented the findings to Mologic Ltd. Gained experience in extracting
information from scientific literature (using PubMed and online libraries as main
sources). Learned how to develop and apply personal ideas independently
while working in a field of which I had no prior knowledge.

Education
The University of Sheffield
Doctor of Philosophy, Chemoinformatics · (2012 - 2015)

University of Birmingham School
Bioinformatics · (2008 - 2012)
