PRINTX_1
PRINTX_1
AT
BY
PRINT_1
MATRIC_1
CERTIFICATION
This is to certify that this industrial attachment report was written and carried out by
PRINT_1 with matriculation number MATRIC_1 of The Department of Computer Science,
Faculty of Communication and Information Sciences, University of Ilorin, Ilorin at
SOFTRAYS COMPUTER INSTITUTE, Tanke Bubu, Behind Sanrab Filling Station,
Tanke, Ilorin, Kwara State.
Supervisor _
Sign /Date
SIWES Coordinator
Sign /Date
DEDICATION
The most important person to whom I dedicate this report is Almighty Allah, who has seen me
through and provided me with health and strength, particularly during the months I spent
finishing my SIWES program. I am grateful for everything: His care, support, protection, and
the other blessings that are too numerous to list. This report is also dedicated to my
family, the SOFTRAYS COMPUTER INSTITUTE staff and the University of Ilorin for this
incredible opportunity, as well as for their unwavering support and assistance throughout my
program.
ACKNOWLEDGEMENT
I am grateful to both the Industrial Training Fund (ITF) for their vision in creating this
program for students and Almighty God for making it possible for me to successfully finish
my student industrial work experience scheme (SIWES).
My sincere gratitude is extended to the whole SOFTRAYS COMPUTER INSTITUTE team
for their unwavering support and inspiration during my time at the institute. Also deserving of
my gratitude are Mr. ADEMOLA JOHN BUSAYO (CEO) and Mr. OLA, my supervisors, who
took the time to carefully go over everything I have learned at the institute with me.
My sincere appreciation is extended to my parents, Mr. and Mrs. Yusuf. Thank you to my
friends Mr. Habeeb, Mr. Monday and Miss. Tosin for their unwavering financial support,
encouragement, and inspiration during this program.
You have helped me become a better version of myself, and I sincerely appreciate what you
have done.
REPORT OVERVIEW
The experience I had working at SOFTRAYS COMPUTER INSTITUTE, TANKE BUBU,
BEHIND SANRAB FILLING STATION, TANKE, ILORIN, KWARA STATE, during the
Students Industrial Work Experience Scheme (SIWES) is described in detail in this report. The
program ran from June 13, 2023 to December 13, 2023. Various departments of the
Establishment together with their functions are listed in the company description. This report
includes information obtained from the Web Development division of SOFTRAYS
COMPUTER INSTITUTE, where I was employed. The SIWES program gave me a training
ground for Data Science skills. Through it, my knowledge of web building and information
literacy has increased, and I have gained further insight into how to use the computer
system.
This has really helped me to connect the dots between what I learned throughout the SIWES
program and what I was taught in the lecture hall.
I was able to obtain important practical information about web development and how to create
a standard, professional website with useful functionality all through my six-month Industrial
Training program at SOFTRAYS COMPUTER INSTITUTE.
TABLE OF CONTENTS
Contents
CERTIFICATION
DEDICATION
ACKNOWLEDGEMENT
REPORT OVERVIEW
TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER ONE
1.0 INTRODUCTION
1.1 BACKGROUND
1.2 OBJECTIVES OF SIWES
1.3 BENEFITS OF SIWES PROGRAMME
1.4 ROLES OF STUDENT
1.5 THE LOGBOOK
2.0 DESCRIPTION OF THE ESTABLISHMENT OF ATTACHMENT
2.1 LOCATION AND BRIEF HISTORY OF ESTABLISHMENT
2.2 OBJECTIVES AND VISION OF ESTABLISHMENT
2.2.1 COMPANY'S AREA OF SPECIALIZATION
2.4 VARIOUS DEPARTMENT/UNITS AND THEIR FUNCTIONS
CHAPTER THREE
3.0 WORK EXPERIENCE
3.1 INTRODUCTION
3.2 ADMINISTRATIVE EXPERIENCE
3.21 WORKING ETHICS
3.22 FRONT DESK AND HUMAN RELATIONS
3.23 HANDS-ON EXPERIENCE WITH KIPPA ACCOUNT MANAGEMENT SYSTEM
3.3 TECHNICAL EXPERIENCE
3.31 PYTHON PROGRAMMING
3.32 FEATURES OF ADVANCED PYTHON PROGRAMMING (OOP CONCEPTS, ERROR HANDLING, FILE HANDLING)
3.33 EXPLORATORY DATA ANALYSIS (EDA IN PYTHON)
3.34 AIDING ASSIGNMENT SUCCESS: CRAFTING A QUADRATIC EQUATION SOLVER
3.35 MACHINE LEARNING OVERVIEW
LIST OF FIGURES
FIGURE 2.2.1 COMPANY'S ORGANOGRAM
FIGURE 3.32 USER INTERFACE DESIGN (THE GUESS NUMBER GAME)
FIGURE 3.2 ERROR HANDLING
FIGURE 3.3 CONCEPTS OF CSS
CHAPTER ONE
1.0 INTRODUCTION
1.1 BACKGROUND
Before science and technology education became standardized in Nigeria, university graduates
received no formal practical training or occupational experience in their subjects. To
broaden their perspectives and provide them with technical expertise or real-world
experience before graduating from their various institutions, students in different
institutions were required to take science and technology-related courses.
The Industrial Training Fund (ITF) created the Student Industrial Work Experience Scheme
(SIWES) in 1973 to give tertiary students a foundational understanding of industrial work
based on their course of study before they graduate from their respective institutions. It was
founded to address the issue of Nigerian tertiary institution graduates' lack of sufficient
practical skills to prepare them for work in industry.
Through this program, students are exposed to industry-based skills that are essential for a
smooth transition from the classroom to the workplace. It gives postsecondary students the
chance to become acquainted with and exposed to the necessary experience in operating
machinery and equipment, which are typically not available in educational institutions.
Assisting students in incorporating leadership development into the process of experiential
learning is one of the main objectives of the SIWES. Through a mentoring relationship with
creative non-profit leaders, students are expected to acquire and develop fundamental non-
profit leadership abilities.
The skills and competences that students gain from their diligent participation in the Students
Industrial Work Experience Scheme (SIWES) are the main advantages. Those who receive
industrial training retain the necessary manufacturing abilities as enduring assets that cannot
be taken away from them. This is due to the fact that the abilities and information gained from
training are internalized and made applicable when needed to carry out tasks or duties.
The government's education policy has made participation in SIWES a mandatory requirement
for the granting of diplomas and degrees in particular fields at the majority of the nation's higher
education establishments. Employers of labor, institutions, coordinating agencies (NUC,
NCCE, and NBTE), and the ITF are among the operators. It is intended to oversee an
occupational experience program that includes hands-on learning activities carried out in an
actual industrial setting and apart from the usual classroom.
Promoting industrialization in Nigeria and providing a link between academia, industry, and
the workplace for fields of study like microbiology, agriculture, engineering, and other
professional education programs are the objectives of SIWES. With the exception of programs
in engineering and technology, which have a minimum duration of 40 weeks, all programs
have a minimum duration of 24 weeks. The following groups contribute to the operation of
SIWES as a joint venture:
The Federal Government (F.G.): With assistance from the Federal Ministry of
Commerce and Industry (FMC&I), the Federal Government finances the program.
The Industrial Training Fund (ITF): One of the agencies of the Federal Ministry of
Commerce and Industry, the ITF is in charge of the overall administration of the
program in coordination with other stakeholders.
Regulatory/Supervising Agencies: These organizations oversee postsecondary
education on behalf of the federal government, ensuring that all educational institutions
abide by the rules governing SIWES operations. The National Board for Technical
Education (NBTE) and the National Universities Commission (NUC) are two such
organizations.
Employers: These comprise government establishments and members of the Organized
Private Sector (OPS) who offer SIWES participants locations for industrial
attachments.
Tertiary Institutions: The main receivers of SIWES funds are universities, polytechnics,
and colleges of education. It is evident that their primary responsibility is to guarantee
the successful implementation of SIWES.
Students: Since they are the ones receiving the training made possible by this program,
students are the direct beneficiaries of SIWES. Before graduating, it is the responsibility
of the students to fully participate in this program and gain the necessary production
skills.
The CEOs of ITF, NUC, NBTE, NCCE, and OPS make up the Chief Executives Forum,
which is in charge of developing guidelines for the effective operation and execution
of SIWES at the federal level.
The goals of SIWES were specified in the Industrial Training Fund's policy document No. 1 of
1973, which created the program. The following are the goals:
i. Assist students in being ready for the industrial work environments they will probably
encounter after graduation.
ii. Give students enrolled in higher education institutions a way to gain practical
experience and industrial skills while they are studying.
iii. Introduce students to industrial work practices and procedures for operating machinery
and equipment that might not be available at their educational institutions.
iv. Facilitate students' transition from school to the workforce and increase their
connections for potential future employment.
v. Give students the chance to put their academic knowledge to use in authentic work
settings to close the knowledge gap between theory and practice.
vi. Utilize SIWES to enlist and increase employers' participation in the entire educational
process.
vii. Introduce students to the newest advancements and technical discoveries in the
fields they have chosen.
During the attachment time, the student utilized a logbook provided by the institution to
document all daily activities. During supervision, the logbook was reviewed and approved
by supervisors from both the industry and the institution.
CHAPTER TWO
With a team of seasoned computer engineers and programmers with proven track
records in computer and communication business, the company offers the following wide areas
of specialization:
NETWORK CONSULTANT
The Network Consultant is an experienced and educated professional who certifies network
functionality and performance. They are responsible for designing, setting up and maintaining
computer networks at either an organization or client location.
Establishing, defining, and documenting the network environment and systems, including
LAN and VLAN (a virtual LAN is any broadcast domain that is partitioned and isolated
in a computer network at the data link layer, OSI layer 2), as well as local intranet or
extranet servers and resources. Installation and configuration of network equipment.
Maximizing network performance such as network traffic, security, and capacity.
Troubleshooting network problems and outages, overseeing, monitoring and upgrading
network infrastructure and user access to the network.
Remote troubleshooting and fault finding if issues occur upon initial installation.
PROGRAMMING UNIT
Building and designing the interfaces of websites. Designing and implementing security
measures against cyber-attacks and viruses. Writing software packages to handle specific
tasks, such as controlling equipment or storing and retrieving data. Also, modeling, designing,
creating and maintaining the computer databases and tables used by a software solution.
CHAPTER THREE
3.1 INTRODUCTION
Physical appearance is widely recognized and addressed with great sensitivity at SOFTRAYS
COMPUTER INSTITUTE. As a result, any sort of laziness, carelessness, lack of seriousness,
or tardiness to work is not acceptable behavior from either staff members or trainees. Considering
the set of instructions I had gotten from the school prior to the Industrial Training, I found it
simple to accept this. The official start time for work each day is 8:00 a.m., and it is expected
that all employees and trainees will arrive at work by that time or earlier, unless they have
valid, unavoidable excuses. However, the boss is not harsh toward any of the staff or trainees. I
frequently experienced extensions of the closing time, which helped prepare me for the work
that awaited me after graduation.
Working at the front desk heightened my understanding of human relations. I learned practical
skills, such as defusing stressful situations and providing effective customer service. It's like
being the conductor of a smooth orchestra, ensuring everything harmonizes for a positive
experience.
I see the front desk as the control center at a concert, where the conductor manages everything
to create a seamless and enjoyable performance.
During my SIWES stint at SOFTRAYS COMPUTER INSTITUTE, I got to dive into the workings of "Kippa," an
account management system that keeps things organized and secure.
What I Did:
1. Superhero Security: Kippa turned out to be our digital superhero, keeping our data safe and
sound. No bad guys allowed!
2. Speedy Hiring and Firing: The system made hiring and firing (well, kind of) a breeze. Less
paperwork, more action.
3. DIY Profiles: Employees could now do their own thing – update profiles, access info –
without waiting around.
In Summary:
Exploring around with Kippa was a highlight. I saw firsthand how it's like the guardian of our
digital castle, making things run smoother and safer. It's the behind-the-scenes hero that keeps
the digital world spinning.
During my SIWES, I delved into Data Science concepts, building upon my University
knowledge of Python programming and data structures like arrays and dictionaries. Imagine
a data structure as organizing information the way you would arrange items in a toolbox to solve
problems. It's like having a well-organized toolbox for data.
I think of data structuring as organizing your phone contacts by category (friends, family, work)
for easier access and efficiency.
PROGRAM
A program is a set of instructions that tells the computer what to do. It's all about storing
information (in some type(s)), and modifying it.
DATA
Values/information about any 'object' to be stored and acted upon by the computer.
Python is portable and flexible, with a simple syntax, and it is widely used (blockchain,
web, data science, etc.). Its indentation rules are strict. There are many text editors and IDEs
for Python, including the Python-specific software "Anaconda", as well as Spyder, Jupyter, and
VSCode. With VSCode, I just installed the Python extensions: extension pack, indentation,
environment, code runner, and Python extended.
CLASS IN PYTHON
• A class is a blueprint or a recipe that defines how to create objects with their characteristics
(attributes) and behaviors (methods). It's like a building block of your code that helps you
organize and manage different types of objects in a structured way.
• A class serves as a blueprint or template for creating our own datatypes (objects); we know
that the built-in datatypes are also objects of pre-built classes.
• Classes are called user-defined datatypes in most programming languages including Python.
• Imagine a class as a blueprint or a template for creating something. It defines the structure
and behavior.
• The behaviors (methods) operate on the structures (attributes).
• The 'Class' coordinates everything. It instructs each object made from that class to perform a
particular method (action).
INSTANTIATING CLASS
• When you create an object from a class, it's called instantiation. This means you're making a
specific instance of that class, with its own unique data.
OBJECT IN PYTHON
• An object is what you get when you instantiate a class and give it some values.
• i.e, we use class to create objects.
• Values/information to be stored and acted upon by the computer.
• Think of objects as real-world instances or copies made from that blueprint (class).
• An object is an instance of a class.
• Instantiated objects occupy different memory locations because they are distinct objects.
• An object can have more or fewer attributes than other objects of the same class.
• Objects have methods and attributes that you can access using:
.something() for methods
.something for attributes
• The `__init__` method is used for INITIALIZING (setting up) the attributes (characteristics)
of the object. It's like giving the object its initial values.
• Again, the `__init__` method only does one thing. It initializes some attributes to the
parameters.
• So, when you create an object, for example:
• person1 = Person("Alice", 30), it sets person1's name to "Alice" and age to 30.
• In simpler terms, I think of the Person class as a recipe for creating robots. When you
create (instantiate) a person using this recipe, you provide their name and age as ingredients,
and the `__init__` method makes sure those ingredients are properly assigned to the person
being created.
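The points above can be put together in a minimal sketch. The Person class, the name "Alice" and the age 30 come from the example above; the `greet` method and the second person are illustrative additions and not part of the code I wrote during the training.

```python
class Person:
    """Blueprint for person objects (illustrative sketch)."""

    def __init__(self, name, age):
        # __init__ only initializes the attributes from the parameters.
        self.name = name
        self.age = age

    def greet(self):
        # A method (behavior) that works with the object's attributes.
        return f"Hello, my name is {self.name} and I am {self.age} years old."


person1 = Person("Alice", 30)  # instantiation
person2 = Person("Bola", 25)   # a second, independent object (hypothetical)

print(person1.name)        # attribute access: .something
print(person1.greet())     # method call: .something()
print(person1 is person2)  # False: two objects, two memory locations
```

Each object keeps its own data, so changing person2 does not affect person1.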
'membership' ATTRIBUTE
Membership is a class-object attribute. It's like a property shared by all instances
(objects) created from this class.
It has the same value across multiple instances.
It is meant to be constant (not dynamic) and can be used anywhere in the class.
E.g., membership = True. The statement implies that all player characters have
membership.
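A minimal sketch of a class-object attribute is shown here. The class name `PlayerCharacter` and the player names are assumptions made for illustration; only `membership = True` comes from the text above.

```python
class PlayerCharacter:
    # Class-object attribute: defined once on the class and shared by all instances.
    membership = True

    def __init__(self, name):
        # Instance attribute: each object gets its own value.
        self.name = name


player1 = PlayerCharacter("Tunde")  # hypothetical players
player2 = PlayerCharacter("Amina")

print(player1.membership, player2.membership)  # both read the shared value
print(PlayerCharacter.membership)              # also reachable through the class
```

Note that Python does not enforce the constancy: it is a convention that the class attribute is left unchanged.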
Both Object-Oriented Programming (OOP) with classes and procedural programming with
functions have their uses. The choice between the two paradigms depends on the specific
needs of your project and your preferences as a developer, so it is essential to choose the
right approach for your specific problem.
Fore.LIGHTMAGENTA_EX:
• Fore is a class from the colorama library,
• and LIGHTMAGENTA_EX is a constant from that class, representing a light magenta color.
This adds a light magenta color to the text.
In Python, classes act as blueprints for creating objects, while objects are specific instances of
those classes. The language's fantastic handling of loops includes the "else" block, executed
when loops complete without encountering a "break" statement. Additionally, the colorama
library enhances the terminal experience with its delightful text colours, like
"Fore.LIGHTMAGENTA_EX" for a charming light magenta shade. By utilizing these
concepts, Python programmers create captivating and functional code masterpieces.
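The "else" block of a loop can be sketched as follows. The function name and the sample lists are made up for illustration.

```python
def find_first_negative(numbers):
    """Report the first negative number, using the loop's else block."""
    for number in numbers:
        if number < 0:
            result = f"found {number}"
            break  # leaving with break skips the else block
    else:
        # Runs only when the loop finished without hitting "break".
        result = "no negative number"
    return result


print(find_first_negative([3, 7, 1]))
print(find_first_negative([3, -7, 1]))
```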
ERROR HANDLING
VALUE ERROR HANDLING: This is a type of exception which is raised when a function
receives an argument of the correct type, but an inappropriate value.
• I explored other exceptions and we also explored a game program.
• Exception handling is a crucial programming concept that enables graceful management of
errors during program execution, preventing crashes and handling unexpected situations. It
involves the use of "try" to execute code, "except" to handle exceptions, and "finally" (optional)
for CLEANUP operations. With this approach, developers can create more robust and reliable
programs that handle errors effectively.
I handled some exceptions like:
• SyntaxError
• NameError
• TypeError
• IndexError
• KeyError
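As a minimal sketch of "try", "except" and "finally", the function below checks a guess the way a number guessing game might. The function name `parse_guess` and the range 1 to 10 are assumptions; this is not the game program from my training.

```python
def parse_guess(text, low=1, high=10):
    """Convert user text to an integer guess; return None if it is not valid."""
    try:
        guess = int(text)  # raises ValueError for text such as "abc"
        if not low <= guess <= high:
            # Correct type (int) but inappropriate value.
            raise ValueError(f"guess must be between {low} and {high}")
        return guess
    except ValueError as error:
        print(f"Invalid guess: {error}")
        return None
    finally:
        # Runs whether or not an exception occurred (cleanup step).
        print("Guess processed.")


print(parse_guess("5"))
print(parse_guess("abc"))
print(parse_guess("42"))
```

The program keeps running after a bad input instead of crashing, which is the point of handling the exception.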
FILE HANDLING
File handling in Python involves working with files, reading and writing data to them. It allows
you to interact with files on your computer, like drawers containing papers representing data.
Reading from a file is like following a recipe to cook, while writing to a file is like writing a
letter. Understanding file handling is crucial for working with data and performing various
programming tasks. This is a list of what I added to a new file and how.
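A minimal sketch of writing, appending and reading a file is given here. The file name "notes.txt" and its content are hypothetical, and a temporary folder is used so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "notes.txt")

# "w" creates the file (or empties it) and writes to it.
with open(path, "w", encoding="utf-8") as file:
    file.write("First line\n")

# "a" appends to the end of the existing file.
with open(path, "a", encoding="utf-8") as file:
    file.write("Second line\n")

# "r" reads the content back; the with-block closes the file automatically.
with open(path, "r", encoding="utf-8") as file:
    lines = file.read().splitlines()

print(lines)
```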
DATA STRUCTURE
It's like a container around our data. It's a way of organizing our data, with its own pros and
cons for accessing, inserting, and deleting data in the container.
Good code is code that is well-organized, readable, maintainable, and efficient. It follows best
practices, adheres to coding standards, and is easy for other developers to understand and
collaborate on. Good code is also modular, meaning it's divided into logical components that
can be reused and tested independently. Additionally, it is well-documented to explain its
purpose, functionality, and any potential pitfalls.
We focus on readability. Writing code that is easy to understand is what makes a good
programmer. In programming, there are many ways to solve a problem. The key is to
make the solution as simple as possible; that is the idea of readability.
'nonlocal' KEYWORD
I explored a code snippet that showcases nested functions in Python, with the addition of the
nonlocal keyword to modify a variable from an outer function's scope within an inner function.
This technique is useful for updating and maintaining data between different levels of
functions, allowing changes in the inner function to affect the outer function's variables.
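A minimal sketch of the nonlocal keyword, using a counter as a stand-in example (the function names are assumptions, not the snippet I explored):

```python
def make_counter():
    count = 0  # variable in the outer function's scope

    def increment():
        nonlocal count  # rebind the outer variable instead of creating a local one
        count += 1
        return count

    return increment


counter = make_counter()
counter()
counter()
third = counter()
print(third)
```

Without the nonlocal line, `count += 1` would raise UnboundLocalError, because Python would treat `count` as a new local variable of the inner function.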
CLUSTERING
I explored clustering using the k-Means algorithm to identify patterns in datasets and its
application to artificial and real datasets, and I experimented with it to gain a deeper
understanding of data clustering techniques.
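The idea behind k-Means can be sketched in plain Python. This is a simplified illustration, not a library implementation: it works on 2-D points, takes the first k points as starting centroids (a simplifying assumption), and uses a small artificial dataset that I made up.

```python
def k_means(points, k, iterations=10):
    """Tiny k-Means for 2-D points; the first k points are the starting centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters


# Artificial dataset with two obvious groups.
data = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, clusters = k_means(data, k=2)
print(centroids)
print(clusters)
```

The two steps (assign points, then move centroids) repeat until the groups stop changing or the iteration limit is reached.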
I collaborated on a Python assignment, developing a solver for quadratic equations. The code
calculates the roots from the coefficients, using the discriminant to decide between real and
complex results, which enhanced my mathematical problem-solving skills.
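A solver of this kind can be sketched as below. This is a reconstruction under my own assumptions (function name, use of the standard `cmath` module), not the exact assignment code.

```python
import cmath


def solve_quadratic(a, b, c):
    """Return the two roots of a*x**2 + b*x + c = 0 (a must not be zero)."""
    if a == 0:
        raise ValueError("coefficient 'a' must not be zero")
    discriminant = b ** 2 - 4 * a * c
    # cmath.sqrt also handles a negative discriminant, giving complex roots.
    root = cmath.sqrt(discriminant)
    x1 = (-b + root) / (2 * a)
    x2 = (-b - root) / (2 * a)
    if discriminant >= 0:
        return x1.real, x2.real  # real roots
    return x1, x2  # complex conjugate roots


print(solve_quadratic(1, -3, 2))  # discriminant > 0: two real roots
print(solve_quadratic(1, 2, 5))   # discriminant < 0: complex roots
```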
Machine Learning is about teaching computers to learn from data and make decisions or
predictions based on what they've learned, rather than following predefined rules. This is the
idea of TMUD (Training Model Using Data), just like teaching someone using a textbook.
Machine Learning shines in the following aspects:
• Image Recognition
• Natural Language Processing (NLP)
• Recommender Systems
• Predictive Analytics
corresponding output label (target value/variable). It's useful for ML to learn the mapping
function between the inputs and the outputs. Having a well-annotated and diverse labeled
training set is crucial for the machine learning algorithm to generalize well to new, unseen data
and make accurate predictions or classifications.
B. SUPERVISED TASKS
Supervised Tasks in Data Science involve training a computer to either classify data into
categories or predict numerical values based on input data. Most common supervised tasks are:
Classification and Regression.
Spam detection is treated as a supervised learning problem because it requires labeled training
data to build a model capable of classifying emails as either spam or not spam based on learned
patterns from the labeled examples.
• Classification: the goal is to train a computer to automatically sort data points into different
groups or classes based on their characteristics.
• Regression: involves predicting numerical values based on input data.
Note that classification is about sorting data into distinct groups, while regression focuses on
predicting continuous values. These tasks are essential in various real-world applications, from
email spam detection to predicting sales or prices of products from historical data.
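As a minimal sketch of a regression task, the code below fits a straight line to labeled examples with the least-squares formulas. The spend and sales figures are made up for illustration, and the function name is an assumption.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from labeled examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data: advertising spend (input) and sales (output label).
spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(spend, sales)
predicted = slope * 5 + intercept  # predict a continuous value for a new input
print(slope, intercept, predicted)
```

The model learns the mapping from inputs to labels and then predicts a numerical value for an input it has not seen.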
C. UNSUPERVISED LEARNING
This involves tasks like clustering similar data points, detecting anomalies or outliers, reducing
the data's dimensionality, and mining associations between items in transactional data. These
tasks are valuable for understanding data patterns, detecting abnormalities, and simplifying
complex datasets. Common Unsupervised Learning tasks include:
• Clustering: This is about grouping similar data points together based on their characteristics.
Common clustering algorithms include K-means, hierarchical clustering, and DBSCAN,
among others. These algorithms help businesses gain insights into their customer base and
make informed decisions to improve customer satisfaction and overall business performance.
• Anomaly Detection: Anomaly detection aims to identify rare or unusual patterns in the data
that do not conform to the majority.
• Dimensionality Reduction: Dimensionality reduction reduces the number of features or
variables in the data while preserving important information.
D. REINFORCEMENT LEARNING
This is the type of Machine Learning algorithm you would use to teach a robot to walk in
different terrains. It learns from its actions and the feedback received from the environment to
develop a strategy for successful walking. Just like how we learn to walk as babies by trial and
error, the robot learns through repeated attempts in various terrains to become a proficient
walker. I understand what type of algorithm I would use to segment an entity into multiple
groups, which is known as Clustering.
Terrain refers to the physical features and characteristics of the land's surface, including its
elevation, slope, and surface type.
F. OUT-OF-CORE LEARNING
This is like solving a big puzzle with limited space, where you work with smaller chunks of
data at a time, gradually building knowledge about the entire dataset. This approach allows
data scientists to efficiently handle and learn from large datasets without overwhelming their
computer's memory.
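The chunk-by-chunk idea can be sketched with a running mean. Here a small list stands in for a dataset that would normally be too large for memory, and both function names are assumptions.

```python
def read_in_chunks(values, chunk_size):
    """Yield successive chunks; stands in for reading pieces of a large file."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def running_mean(chunks):
    """Compute the mean while looking at only one chunk at a time."""
    total = 0.0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else 0.0


dataset = list(range(1, 101))  # small stand-in for a large dataset
mean = running_mean(read_in_chunks(dataset, chunk_size=10))
print(mean)
```

Only the running total and count are kept between chunks, so memory use does not grow with the size of the dataset.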
G. INSTANCE-BASED LEARNING SYSTEM
This is useful when the underlying distribution of the data is complex and nonlinear, as it can
capture intricate patterns in the data. However, it can be computationally expensive, especially
for large datasets, as it requires calculating the similarity measure for each new data point
against all training instances.
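A nearest-neighbour classifier is one common example of such a system. The sketch below assumes squared Euclidean distance as the similarity measure and uses made-up training instances.

```python
from collections import Counter


def knn_predict(training, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training instances."""
    # The distance is computed against every stored instance, which is why
    # this approach becomes expensive on large datasets.
    by_distance = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], new_point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up training instances: (features, label).
training = [
    ((1, 1), "small"), ((1, 2), "small"), ((2, 1), "small"),
    ((8, 8), "large"), ((9, 8), "large"), ((8, 9), "large"),
]

print(knn_predict(training, (2, 2)))
print(knn_predict(training, (7, 9)))
```

There is no separate training step: the stored instances themselves are the model.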
Now, I understand the meaning and difference between Model parameters and
Hyperparameters.
DIFFERENCE BETWEEN MODEL PARAMETERS AND HYPERPARAMETERS
Model parameters are internal settings of the machine learning model that the algorithm learns
from the data, while the algorithm's hyperparameters are external settings that influence the
algorithm's behavior during training. Model parameters directly impact how the model interprets
the data, while hyperparameters affect how the algorithm learns and optimizes the model. This
means that with poorly chosen hyperparameters the model may still learn, but it can either
overfit or underfit. In short, the settings are the hyperparameters, and what is learned from
the data are the parameters.
ALGORITHM
An algorithm is a set of instructions that guide the learning process, while a model is the result
of applying the algorithm to the data and represents the learned patterns and relationships. The
algorithm drives the learning, while the model encapsulates the knowledge gained from the
data, allowing it to make predictions or decisions for new data points.
Overfitting is the memorization trap. Just like a student who memorizes answers without
understanding the concepts, the model memorizes training examples without generalizing well.
Possible solutions to address overfitting include providing more diverse training data,
selecting relevant features, making sure you interpret accurately, and using regularization
techniques to keep the model from becoming overly complex. By applying these solutions, we
can help the model perform better in real-world scenarios and avoid the memorization trap.
Now, I also grasp the idea of creating validation and test sets and why they are useful.
VALIDATION SET
A validation set is like a practice quiz in data science. It allows us to assess how well the model
is performing on new, unseen data DURING TRAINING and helps fine-tune the model's
settings for better performance. Just like taking a practice quiz before a big test, using a
validation set helps us identify and address potential issues in the model before evaluating its
final performance on the test set.
TEST SET
A test set is like the final exam in data science - a separate set of unseen data used to evaluate
the model's true performance and ability to generalize to new situations. Just like a final exam
helps you understand how well you grasp the subject, the test set helps us measure how well
the model performs on real-world data it hasn't seen before.
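Creating the three sets can be sketched with the standard library alone. The 60/20/20 split, the seed and the function name are assumptions chosen for illustration.

```python
import random


def split_dataset(rows, validation_ratio=0.2, test_ratio=0.2, seed=42):
    """Shuffle rows and split them into training, validation and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_ratio)
    n_validation = int(len(shuffled) * validation_ratio)
    test_set = shuffled[:n_test]
    validation_set = shuffled[n_test:n_test + n_validation]
    training_set = shuffled[n_test + n_validation:]
    return training_set, validation_set, test_set


training_set, validation_set, test_set = split_dataset(range(100))
print(len(training_set), len(validation_set), len(test_set))
```

The validation set is used while tuning the model, and the test set is kept aside until the final evaluation.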
I explored powerful concepts such as Machine Learning and Deep Learning, which essentially
allow computers to learn on their own. To reinforce my Data Science journey, I embraced
modern tools like Jupyter notebooks, Google Colaboratory, RStudio, and VSCode. These
tools are like a digital workshop, where I use them to create visual charts, write code, and
import data seamlessly. Using these tools is akin to having a high-tech kitchen where you
effortlessly prepare various recipes, each tool playing a unique role.
a library but also an integrated development environment (IDE) for Python programming
language. I think of these libraries as specialized tools in your toolbox, like a calculator
(NumPy) and a scientific instrument (SciPy) that help you analyze and understand data.
John Rollins has a framework for data science which is like a recipe book with three main
steps. Now, introducing CRISP-DM: A Data Mining Adventure Guide
• Deployment: Unleashing the Magic ("How can I use the feedback I received during the
visualization/share phase to actually meet the stakeholders' needs?")
Just like a thrilling roller coaster, the CRISP-DM is a continuous loop. One might need to
revisit previous steps as one uncovers new insights. The journey doesn't end – it's a cycle of
learning, refining, and improving. I think of it as a never-ending adventure, where each loop
brings me closer to uncovering the ultimate treasure - valuable insights from my data.
CHAPTER FOUR
b. Analytical Approach:
• We can use Decision Trees, which work like making choices in a game.
• In simple terms, Decision Trees help us make choices by breaking them into smaller
decisions. This recursive process of asking questions and making choices guides us to the
best decision based on our preferences and circumstances.
• The predictive power of a Decision Tree comes from choosing splits that decrease entropy
(disorder), which is the same as gaining information and minimizing impurity. In other
words, to make accurate predictions, the data must be organized so that each split leaves
the resulting groups more uniform than before. This helps me uncover meaningful patterns
and trends, making my predictions more reliable.
• The goal is to make my classification and decision-making process efficient and accurate.
• A tree stops growing when the current node is pure, when there is no remaining variable
to split on, or when a pre-selected limit (such as a maximum depth) is reached.
• Decision Tree models are easy to understand (because of their Visual Representation),
and their decision-making process is readily interpretable (you understand how the decision
is made).
• Decision Trees also have limitations, such as the potential for overfitting, and difficulty
handling complex relationships in the data.
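To make the idea of entropy and information gain concrete, here is a small sketch in plain Python. The function names and the toy "yes"/"no" labels are my own illustrative assumptions; a real decision-tree library performs the same kind of calculation internally when it chooses a split.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Toy labels (hypothetical): 5 "yes" and 5 "no" outcomes before splitting.
parent = ["yes"] * 5 + ["no"] * 5
pure_split = [["yes"] * 5, ["no"] * 5]                 # each child is pure
mixed_split = [["yes", "yes", "yes", "no", "no"],
               ["yes", "yes", "no", "no", "no"]]       # children still mixed

gain_pure = information_gain(parent, pure_split)
gain_mixed = information_gain(parent, mixed_split)
print(round(gain_pure, 3), round(gain_mixed, 3))
```

The split that produces pure children removes all of the disorder, so it has the larger gain and is the one a tree would prefer.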
a. Data Understanding:
Just like knowing the taste of every ingredient, I investigated congestive heart failure
admissions. To truly grasp the data, I performed three key activities:
• Descriptive Statistics
• Pairwise Correlations
• Histogram Plotting
These steps also help me assess data quality. For instance, missing values could indicate "no
data," "zero," or "unknown."
By understanding such nuances, I refined my dataset. I considered a case where the initial
definition of congestive heart failure admission missed certain cases. This realization prompted
me to include secondary diagnoses, refining both my understanding and the data.
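The three data-understanding activities can be sketched with the Python standard library alone. The `ages` and `stays` values below are hypothetical toy numbers, not data from the case study, and `None` is my assumed marker for a missing value.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical toy columns; None marks a missing value.
ages = [54, 61, None, 70, 66, 58]
stays = [3, 5, 4, 9, 7, 4]  # length of stay in days

def describe(values):
    """Descriptive statistics that also report how many values are missing."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": mean(present),
        "median": median(present),
        "stdev": stdev(present),
    }

def pearson(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

def decade_histogram(values):
    """Count present values per 10-year bin, e.g. 54 falls in the 50 bin."""
    return dict(Counter((v // 10) * 10 for v in values if v is not None))

summary = describe(ages)
r = pearson(ages, stays)
hist = decade_histogram(ages)
print(summary["missing"], round(r, 2), hist)
```

Reporting the missing count alongside the statistics is what prompts the question of whether a gap means "no data," "zero," or "unknown."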
b. Data Preparation:
This stage deals with missing or improperly coded data in order to structure the data and get it ready
for analysis. I think of preparing data for machine learning like getting a recipe ready to cook
a delicious meal. Just as we follow certain steps to chop (transform), mix,
and prepare (clean) ingredients before cooking, we also need to make our data readily
available for learning before using it to train machine learning algorithms.
This is the most time-consuming phase, typically taking 70% or more of the project time. However, I
automated the activities in this Data Preparation stage, reducing that share to about 50%,
which left more time to focus on model development. This stage is also iterative
and complicated.
It's also necessary to understand that congestive heart failure was just one type of heart failure.
The next step involved defining readmission criteria for the same condition.
In summary, during this stage, I made data available for learning by:
• aggregating the data and merging it from different sources into a single
record for each patient, enabling me to use clean data in the decision-tree
classification analysis.
• conducting a literature review on congestive heart failure to ensure that
significant data elements, like previously unaccounted-for comorbidities, were not
overlooked or discarded.
My literature review involved returning to the data collection stage to add a few indicators for
conditions and procedures. The data preparation phase for this case study concluded with a
cohort of 2,343 patients meeting all the criteria. The cohort was then divided into training and
test sets for building and validating the model, respectively.
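The aggregation step, which turns several source rows into a single record per patient, might look roughly like the sketch below. The table layouts, field names, and the `has_diabetes` comorbidity flag are hypothetical; the actual case-study data had its own structure.

```python
from collections import defaultdict

# Hypothetical source tables: several admission rows per patient,
# plus a separate demographics lookup keyed by patient id.
admissions = [
    {"patient_id": 1, "diagnosis": "CHF"},
    {"patient_id": 1, "diagnosis": "diabetes"},
    {"patient_id": 2, "diagnosis": "CHF"},
    {"patient_id": 3, "diagnosis": "CHF"},
    {"patient_id": 3, "diagnosis": "hypertension"},
]
demographics = {1: {"age": 67}, 2: {"age": 72}, 3: {"age": 59}}

def build_patient_records(admissions, demographics):
    """Merge both sources into exactly one record per patient."""
    grouped = defaultdict(list)
    for row in admissions:
        grouped[row["patient_id"]].append(row["diagnosis"])
    records = []
    for pid, diagnoses in sorted(grouped.items()):
        records.append({
            "patient_id": pid,
            "age": demographics[pid]["age"],
            "n_admissions": len(diagnoses),
            # Indicator for a comorbidity found among the diagnoses
            "has_diabetes": "diabetes" in diagnoses,
        })
    return records

records = build_patient_records(admissions, demographics)
print(len(records), records[0])
```

Each output row is then ready to be used as one training example in the classification analysis.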
This stage builds on the analytic approach chosen earlier, which may be statistics-driven
or machine-learning-driven. Data Modeling is like fine-tuning a sauce in cooking: sampling
the sauce to see if it needs more seasoning. Data Modeling is about understanding the business
task, choosing the right approach, using historical data, and making sure the model's answer is
relevant in the evolving field of data science. The process of modeling and some of its
characteristics are:
• Descriptive Analysis: "If a customer likes this, he is likely to have this" type of outcomes.
We use a training set and a test set for predictive modeling. The outcome is already known in the
training set, while the test set is used for evaluation. The training set acts like a gauge
to determine if the model needs to be calibrated.
Data scientists experiment with different algorithms to ensure that the variables in
play are actually required, and tune each model by adjusting the relative cost of misclassifying
“yes” and “no” results. The success of data compilation, preparation and modeling depends on the
understanding of the business task and the analytical approach being taken.
Constant refinement, adjustment and tweaking are necessary in each step to ensure that the
algorithm performs well. Modeling may require testing multiple algorithms and parameters.
Before deploying, I asked myself these questions: "Has the result answered the business task?"
and "Is it relevant?" Experimentation is necessary to find the right balance, and in some cases
finding the right balance between "yes" and "no" accuracy of Decision Trees is essential.
Data scientists may iterate by redefining variables and improving data representation in the
data preparation stage.
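The effect of the relative misclassification cost can be illustrated with a short sketch. The two sets of predictions and the 4:1 cost ratio are hypothetical values chosen for illustration; they are not results from the case study.

```python
def error_counts(actual, predicted):
    """Return (false_negatives, false_positives) for yes/no labels."""
    fn = sum(a == "yes" and p == "no" for a, p in zip(actual, predicted))
    fp = sum(a == "no" and p == "yes" for a, p in zip(actual, predicted))
    return fn, fp

def total_cost(actual, predicted, cost_fn=1, cost_fp=1):
    """Weighted cost: a missed 'yes' may be more expensive than a false alarm."""
    fn, fp = error_counts(actual, predicted)
    return fn * cost_fn + fp * cost_fp

# Hypothetical outcomes: "yes" means the patient was readmitted.
actual  = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
model_a = ["yes", "no",  "no",  "no", "no", "no", "no", "no"]   # misses two "yes"
model_b = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no"] # two false alarms

# With equal costs the two models look the same.
print(total_cost(actual, model_a), total_cost(actual, model_b))  # 2 2

# If missing a readmission is assumed to cost 4 times a false alarm,
# model_b becomes the better choice.
print(total_cost(actual, model_a, cost_fn=4), total_cost(actual, model_b, cost_fn=4))  # 8 2
```

Changing the cost ratio changes which model wins, which is why the balance between "yes" and "no" accuracy has to be decided with the business task in mind.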
b. Model Evaluation:
This includes ensuring that the data are properly handled and interpreted. I carried out two (2)
main stages of model evaluation:
• The first is the diagnostic measures stage, to ensure that the model works as intended.
• The second evaluation stage that can be employed is a statistical significance test. This is
applied to ensure that data is being processed and interpreted correctly within the model,
and is designed to avoid unnecessary second-guessing when the answer is revealed. The criterion
is relative misclassification cost.
The optimal model is the one that maximizes the separation between the blue ROC (Receiver
Operating Characteristic) curve and the red baseline. The ROC curve is a useful diagnostic tool
for determining the optimal classification model. It shows how well a binary
classification model performs by plotting the true-positive rate against the false-positive rate
as the threshold criterion changes, which makes the trade-off between 'yes' and 'no' accuracy visible.
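As a rough sketch of how an ROC curve is built, the code below sweeps the threshold over a set of predicted scores and records the false-positive and true-positive rates at each step. The `labels` and `scores` values are hypothetical, and the area is computed with the simple trapezoid rule; a baseline that guesses at random would give an area of 0.5.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) points for each threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted "yes"
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical data: 1 = readmitted ("yes"), 0 = not readmitted ("no").
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = area_under_curve(points)
print(round(auc, 3))
```

The further the curve bows away from the diagonal baseline, the larger this area becomes, which matches the idea of maximizing the separation between the curve and the baseline.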
This step helps my business team figure out what information the intervention team (who are
working to reduce readmission risks) will need to do their job effectively. It's key that
stakeholders are familiar with the tool to make it relevant and useful. During the business
requirements phase, the Intervention Program Director and the team requested an application
that would provide automatic, nearly real-time risk assessments for congestive heart failure. It
was then deduced that the application would be tablet-based and would generate patient data
throughout their hospital stay in the required format, scoring each patient near the time of
discharge. Clinicians would then have the most up-to-date risk assessment for each patient to
help them target interventions after discharge. As part of the solution deployment, the
Intervention team would develop and provide training for clinical staff. Additionally, processes
for monitoring and tracking patients who received interventions needed to be developed in
collaboration with IT developers and database administrators. My model can be deployed
through a Cognos application.
This stage emphasizes the importance of stakeholders' involvement and understanding in the
solution for reducing hospital readmissions, and highlights the collaboration between
business teams and technical experts during the deployment process.
b. Model Feedback:
Once I create a model, it is essential to gather feedback from users (like consulting taste buds) to
refine and improve it. In data science, after creating a model, we put it to the real-world test
to see if it truly helps in the field. Models need updates to make them smarter. The more one
knows about and handles what needs fixing, the better one’s model becomes. In this case, I
measured the impact of the model in reducing readmissions for CHF patients.
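One simple way to express such an impact is to compare readmission rates before and after the model is put to use. The sketch below uses made-up outcome lists purely for illustration; the numbers are not measurements from the case study.

```python
def readmission_rate(outcomes):
    """Share of patients readmitted, where 1 = readmitted and 0 = not readmitted."""
    return sum(outcomes) / len(outcomes)

# Hypothetical outcomes for two groups of ten patients each.
before = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]  # before the model guided interventions
after = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]   # after the model guided interventions

rate_before = readmission_rate(before)
rate_after = readmission_rate(after)
absolute_reduction = rate_before - rate_after
relative_reduction = absolute_reduction / rate_before

print(rate_before, rate_after)
```

Tracking this figure over time is one form of the feedback that tells us when the model needs to be refined.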
CHAPTER FIVE
In my data science journey at Softrays, I've mastered advanced Python programming for Data
Science and also mastered the art of preparing and transforming data, analogous to crafting
the perfect recipe. Through personalized handling of text, categories, and scaling, I've
tailored my data tools, empowering my model with efficient and harmonious attributes. The
culmination involves training, fine-tuning with strategic adjustments, and launching my
model like presenting a perfected dish, emphasizing continuous monitoring for sustained
success.
Among the skills I learned, I am also well grounded in administrative work, an experience
that is invaluable in the industry.
• Schools should provide places of attachment for students to ensure conformity with their course
of study.
• Students and supervisors should receive allowances during the program. This will greatly
assist them in managing financial difficulties that may arise during their training.
• A mass enlightenment campaign should be carried out to enable industries and
establishments to know the importance of SIWES to the future of students and society at
large.
• Pre-SIWES orientation should be made available by the departments before the
commencement of the program.
REFERENCES