Profile (2)
[email protected]
Booth Dennis
Data Coder Operator at Cabot Financial
Antioch, Illinois, United States
www.linkedin.com/in/booth-dennis-b70a88312 (LinkedIn)
Top Skills
Stakeholder Engagement
Sequence Analysis
Customer Segmentation Strategy
Summary
An adaptable Data Scientist/Engineer with 7 years of professional
experience delivering a wide array of impactful products for data
processing, machine learning and artificial intelligence. Through
influence and communication with stakeholders, many of these
products have been deployed for regular use throughout their
respective organisations. Particular regard is given to customer
service and satisfaction, and to motivating and driving forward team
performance, most recently by adopting agile and DevOps practices.
Very open to new ideas and to developing new skills, in order to stay
competitive and strive for excellence.
Experience
Cabot Financial
Data Coder Operator
Cabot Financial
Senior Data Scientist
January 2022 - March 2023 (1 year 3 months)
Machine learning project lead and developer, building processes to help
recover debt on two projects:
· Built a charging order identifier (debts attached to property) using Spark NLP,
to search through historic company notes, to find references of charging orders
with a classification accuracy of 84%. It has the potential to recover hundreds
of thousands of pounds of debt, though it is currently pending deployment to
stakeholders.
· As the resident Spark and cloud expert, upskilled other employees via
presentations and support groups, particularly on best practices to ensure
Spark and Databricks code was better optimised for speed and memory.
· Built a new customer segmentation strategy to replace a previous version, using
Experian and Open API (Doorda) data alongside internal customer information
to engineer features to optimise cluster separation.
· Managed two junior employees for 5 weeks in the line manager’s absence to
progress the charging order project.
UCB
Chemoinformatician
October 2015 - November 2018 (3 years 2 months)
Evaluated a set of calculable properties, which led to the production of global
and series-specific pKa and logD models (mean squared error of 0.5 for both). These
models have been utilised in more than 6 therapeutic projects. Other activities
included:
· Released a hERG random forest classifier, which helped bridge our CADD
group with the DMPK department.
· Large-scale similarity searching, both 2D (ChemAxon MadFast and ChemFP)
and 3D (Schrödinger Phase)
· Web form development for chemist access of QSAR models (including
jQuery and JavaScript)
· Demonstration and training of our chemoinformatics software to chemists in
both UK and Belgium sites
· Involved in the supervision and development of two chemoinformatics
placement students (one year each)
University of Sheffield
PhD Student in Chemoinformatics
October 2012 - October 2015 (3 years 1 month)
Explored the maximum common substructure concept and its applications
to 2D similarity searching and virtual screening. My work involved coding
various graph-matching algorithms to find the maximum common substructure
between two molecules. Such algorithms are used in the generation of
chemical hyperstructures (equivalent to supergraphs in graph theory), as well
as for various forms of virtual screening. Of note, we found a particular
topology-based manipulation that drastically sped up exact algorithms,
allowing small molecules (~500 Daltons) to be compared
in seconds. KNIME (with Java, JUnit and R) was the software platform of
choice, with my thesis completed using LaTeX. I gained additional experience
in web design by creating the website for the Information School postgraduate
research conference. I also did demonstration work for the teaching of
undergraduates and postgraduates in: chemoinformatics; web design (HTML);
content management systems (PHP and open source platforms); JavaScript;
jQuery; database design (Oracle SQL).
EBI
Trainee
June 2012 - September 2012 (4 months)
Cheminformatics project using KNIME and Pipeline Pilot to construct naive
Bayesian classifiers from ChEMBL bioactivity data based on ADME-related
protein targets. Applied the domain of applicability concept to verify model
validity, and developed KNIME nodes (in Java) to retrieve data from ChEMBL.
Xention Discovery
Placement Student
June 2010 - August 2011 (1 year 3 months)
Used Pipeline Pilot, along with Java, to develop software that designed
molecules with novel scaffolds and analogues from existing ligands. Built
Bayesian and activity space models (involving principal components analysis
and multi-dimensional scaling) to help predict compound activities against
targets, as well as doing some work on pharmacophore mapping. Helped
implement database retrieval software using Pipeline Pilot, Oracle and
JavaScript. Learned how to apply information from the scientific literature in a
programming context, and to present software to company scientists through
presentations and conversation.
Mologic Ltd
Vacation Student
June 2008 - September 2008 (4 months)
Researched and presented findings to Mologic Ltd on yeast promoter
functionality across many yeast species. Gained experience in extracting
information from scientific literature (using PubMed and online libraries as main
sources). Learned how to develop and apply personal ideas independently,
working in a field in which I had no prior knowledge.
Education
The University of Sheffield
Doctor of Philosophy, Chemoinformatics · (2012 - 2015)