GROUP-4-TES1-Written-Report
GROUP-4-TES1-Written-Report
Submitted by:
Castillo, Juliet V.
Flores, Patricia Victoria B.
Gaddi, Trisha Mae C.
Macario, Mckervin M.
Oreta, Princess Denilyn
Pangilinan, Eunice D.
Verdera, Crizaldy Jr. L.
BSBA-IM-3A-M
1st Semester, SY 2024-2025
Submitted to:
January 2025
Table of Contents
I. Abstract ……………………………………………….……………..…………….. 3
II. Objectives and Scope ………………………………...…………………..……… 3
III. Introduction ……………………………………………………………………..... 4
IV. Core Topics ………………………………………………………...……………... 5
a. Overview on Technology
b. Key Technologies and Tools
c. Popular Tools and Platforms for Automating Data Analytics
d. Data Collection and Preparation
e. Automated Data Analysis
f. Automated Analytical Models and Algorithms
g. Visualization and Reporting
h. Application and Use cases
i. Impact of Automation on Business Processes
j. Future Trends and Advancements in the Fields
2
I. Abstract
Automation in data analytics uses technologies like AI, ML, and RPA to streamline
processes, reduce costs, and uncover insights quickly. Tools such as Python, R, Tableau,
and Power BI automate data collection, cleaning, and analysis, enabling continuous data
flow and real-time insights. Automation also improves visualization and reporting, creating
dynamic dashboards and reports. Applications in industries like finance, healthcare, and
marketing demonstrate significant benefits, including cost savings and improved decision-
making. Challenges include data privacy concerns and the need for skilled professionals.
Despite these challenges, automation in data analytics continues to evolve, driving
efficiency and innovation across various sectors.
• Learn automated methods for data collection, such as web scraping and APIs.
• Understand techniques for automating data cleaning and preprocessing.
III. Introduction
Several key technologies underpin automation in data analytics, including AI, machine
learning, and Robotic Process Automation (RPA). These technologies enable the
automation of complex analytical tasks and data workflows, allowing organizations to
handle large datasets with ease and precision. Automated data collection methods, such as
web scraping, APIs, and sensors, gather data from various sources efficiently. Additionally,
automating data cleaning and preprocessing tasks ensures data quality by removing
4
duplicates, correcting errors, and organizing data for analysis. Setting up automated data
pipelines streamlines the process of moving data from collection to analysis, ensuring
efficient and continuous data flow. Analytical models and algorithms, such as machine
learning models for predictive analytics and clustering algorithms, can be automated to
provide faster and more accurate insights. This automation allows organizations to quickly
adapt to changing data and make informed decisions.
Despite its benefits, implementing automated data analytics presents challenges, such as
data privacy concerns, integration with existing systems, and the need for skilled personnel.
However, future trends, including advancements in AI and machine learning, increased
adoption of IoT, and the development of more sophisticated analytical tools, are expected to
drive the evolution of automation in data analytics. In conclusion, automation in data
analytics empowers organizations to leverage their data assets more effectively, driving
growth and fostering a culture of data-driven decision-making. As technology continues to
evolve, the potential applications and benefits of automated data analytics will only expand,
making it an essential component of modern business strategies.
Overview on Technology
Technology encompasses a wide range of tools, systems, and methods developed to solve
problems, improve efficiency, and enhance human capabilities. It includes both tangible and
intangible elements, ranging from physical devices and machinery to software, algorithms,
and processes.
5
Popular Tools and Platforms for Automating Data Analytics
a. Phyton: An incredibly versatile programming language, Python is favored for its ease
of use and extensive libraries.
b. R: A language designed for statistical computing and graphics.
c. Tableau: A powerful tool for creating interactive and shareable dashboards. It
connects to various data sources and offers intuitive drag-and-drop functionality for
visual analysis.
d. Power BI: Microsoft's business analytics tool, Power BI, enables users to create
reports and dashboards from a wide range of data sources, offering robust data
visualization and real-time insights.
Data collection and preparation are critical steps in the data analytics process. They ensure
that the data used for analysis is accurate, complete, and ready for processing.
• Web Scraping
Web scraping is a technique used to automatically extract data from websites. It's a powerful
tool for gathering large amounts of data from the internet, which can then be used for
various purposes such as analysis, research, and machine learning.
An Application Programming Interface (API) is a set of rules and protocols that allows
different software applications to communicate with each other. Think of an API as a bridge
that connects different systems, enabling them to share data and functionality seamlessly.
• Barcode Scanners
Barcode scanners are devices used to read barcodes and convert them into digital data. They
are widely used in various industries for tasks such as inventory management, point-of-sale
(POS) transactions, and asset tracking.
6
• RFID (Radio-Frequency Identification)
RFID is a technology that uses electromagnetic fields to automatically identify and track
tags attached to objects. These tags contain electronically stored information, which can be
read by RFID readers without direct contact or line of sight.
The Internet of Things (IoT) refers to a network of interconnected devices that communicate
and exchange data with each other over the internet. These devices, often equipped with
sensors and software, can collect, send, and receive data, enabling them to perform tasks
and provide insights without human intervention.
Automated surveys and forms are tools that streamline the process of collecting and
analyzing data from respondents. These tools leverage technology to design, distribute, and
manage surveys and forms, making data collection more efficient and accurate.
7
• Outlier Detection: Automatically identify and handle outliers, which are data points
significantly different from the rest of the dataset. Techniques such as z-score analysis
or machine learning algorithms can flag these anomalies for further review or
automatic correction.
• Automated Workflows: Set up workflows that automate the entire data cleaning
process, from data ingestion to preprocessing. Tools like Apache NiFi or Talend can
orchestrate these workflows, ensuring that each step is executed systematically and
consistently.
• Regularly Review and Update Data Quality: Implement automated systems to
regularly review data quality and apply necessary updates. This can include
scheduling periodic checks and updates to maintain high data standards over time.
Continuous monitoring tools can alert you to data quality issues as they arise.
Automated data analysis uses technology and algorithms to analyze data with minimal
human input. It streamlines the analysis process, enhances accuracy, and quickly reveals
insights, helping organizations respond effectively to trends and make informed decisions.
Automated Machine Learning, or AutoML, refers to the process of automating the tasks of
applying machine learning to real-world problems. AutoML aims to make machine learning
more accessible by streamlining the workflow, from data preprocessing to model
deployment, enabling users without extensive knowledge of machine learning to create
effective models.
• Predictive Analytics
Predictive analytics is a branch of advanced analytics that uses historical data, statistical
algorithms, and machine learning techniques to identify the likelihood of future outcomes
based on past data. The main goal of predictive analytics is to provide actionable insights by
forecasting trends, behaviors, and events.
8
• Natural Language Processing (NLP)
Natural Language Processing (NLP) is a subfield of artificial intelligence (AI) that focuses
on the interaction between computers and humans through natural language. The goal of
NLP is to enable computers to understand, interpret, and generate human language in a way
that is both meaningful and useful. NLP combines computational linguistics and machine
learning to achieve these objectives.
Business Intelligence (BI) refers to the technologies, applications, and practices for the
collection, integration, analysis, and presentation of business information. The goal of BI is
to support better business decision-making by providing actionable insights based on data
analysis.
Data ingestion and preparation are crucial steps in the data analytics process, ensuring that
data is collected, transformed, and ready for analysis.
• Anomaly Detection
Anomaly detection, also known as outlier detection, is a technique used to identify unusual
patterns or observations in a dataset that do not conform to expected behavior. These
anomalies can indicate critical incidents, such as fraud, equipment failures, or network
security breaches.
• Define Your Goals: Identify what you want to achieve with your data pipeline. This
could include data integration, transformation, or real-time analytics.
• Identify Data Sources and Destinations: Determine where your data is coming
from and where it needs to go.
2. Choosing the Right Tools and Technologies
9
• ETL Tools: Tools like Apache NiFi, Talend, or AWS Glue can help with Extract,
Transform, Load (ETL) processes.
• Workflow Orchestration: Tools like Apache Airflow or Prefect can manage and
schedule your data workflows.
• Data Storage: Choose appropriate storage solutions like Amazon S3, Google
BigQuery, or Azure Data Lake.
• Connect to Data Sources: Use connectors or APIs to pull data from your sources.
• Configure Destinations: Set up your data destinations to receive and store the
processed data.
• Data Cleaning: Remove duplicates, handle missing values, and standardize formats.
• Data Transformation: Apply business logic to transform data into the desired
format.
• Test Your Pipeline: Run tests to ensure data is correctly processed and loaded.
• Validate Data: Check the accuracy and completeness of the data in the destination.
10
• Share Knowledge: Ensure your team understands how the pipeline works and how
to troubleshoot issues.
• Monitor Performance: Regularly check the performance of your pipeline and make
adjustments as needed.
• Optimize: Continuously improve your pipeline to handle more data and improve
efficiency.
In the context of automation data analytics involves the use of advanced tools and
techniques to automatically generate visual representations of data and compile reports
without manual intervention.
• Automating the Creation of Dashboards and Reports: This means using software
and tools that automatically gather data from various sources, process it, and then
display it in an easy-to-understand format, like a dashboard.
• Tools for Real-Time Data Visualization and Automated Reporting: These are
specific applications or platforms that facilitate the creation of visual data
representations such as graphs, charts, and dashboards.
Automated data analytics uses machine learning and AI to streamline data collection,
processing, and analysis, enabling businesses to derive actionable insights quickly.
• Finance
• Healthcare
o IBM Watson Health: IBM Watson Health uses artificial intelligence (AI) and
data analytics to transform healthcare by improving patient care, operational
efficiency, and clinical decision-making. It provides solutions for imaging,
data management, and workflow optimization.
o Penn Medicine: Penn Medicine's Institute for Biomedical Informatics has
developed "Penn AI," an open-source automated machine learning system
designed for data analysis. It aims to make AI accessible to a wide range of
users, from students to researchers.
o Zebra Medical Vision: Zebra Medical Vision uses AI and deep learning to
analyze medical imaging data, providing diagnostic tools that improve patient
outcomes. The company offers FDA-cleared AI solutions for detecting
various medical conditions.
• Marketing
1. Data Quality and Consistency: Ensuring that data is clean, accurate, and consistent
is crucial. Poor data quality can lead to misleading insights, making it essential to
establish robust data governance practices.
2. Cost of Implementation: Initial costs associated with software, training, and
infrastructure upgrades can be significant. Organizations must carefully consider
their return on investment.
3. Privacy and Security Concerns: Automating data analytics often involves handling
sensitive information. Ensuring compliance with data protection regulations (e.g.,
GDPR) and maintaining data security are paramount challenges.
1. Generative AI
Generative AI, including models like GPT-4, is being used to create synthetic data, generate
content, and automate complex tasks. It's transforming industries by enabling the creation of
new forms of data and content.
2. Quantum Computing
Quantum computing is revolutionizing data processing with its superior speed and
efficiency. It's being explored for solving complex problems that are currently intractable for
classical computers.
NLP is being used to understand and generate human language, enabling applications like
chatbots, sentiment analysis, and language translation.
14
5. FinOps
FinOps is a practice that brings financial accountability to the variable expenses of cloud
computing, helping organizations manage costs effectively.
a. Real-World Examples
1. Netflix:
2. Amazon:
3. Uber:
• Application: Uber uses automated data analytics for dynamic pricing and
route optimization.
• Description: By analyzing real-time data on traffic, demand, and driver
availability, Uber's algorithms dynamically adjust prices and suggest optimal
routes. This ensures efficient ridesharing services, maximizes driver earnings,
15
and improves customer experience by reducing wait times and fares during
low-demand periods.
b. Scenario Analysis
1. Retail Sector:
2. Financial Services
3. Healthcare Industry
1. Data Integration: Integrating data from various sources, such as sensors, machines,
and systems, can be complex due to different formats, structures, and protocols.
2. Real-Time Data Processing: Processing large volumes of data in real-time requires
significant computational power and efficient algorithms.
3. Data Quality and Consistency: Ensuring the quality and consistency of data
collected from various sources is difficult.
4. Scalability: Scaling data analytics systems to handle increasing data volumes and
complexity is a major challenge.
5. Security and Privacy: Protecting sensitive data from cyber threats and ensuring
compliance with data privacy regulations is critical.
6. Skill Gap: There is a shortage of skilled professionals who can effectively manage
and analyze industrial data.
7. Cost: Implementing and maintaining advanced data analytics systems can be
expensive.
8. Change Management: Integrating new data analytics technologies into existing
workflows and systems requires effective change management.
17
VIII. Conclusion
Automation in data analytics is transforming how businesses process and interpret vast
amounts of data, providing significant improvements in efficiency, accuracy, and decision-
making. At its core, automated data analytics involves using advanced technologies such as
artificial intelligence (AI), machine learning (ML), and robotic process automation (RPA) to
handle data collection, processing, and analysis with minimal human intervention. This
approach enables organizations to swiftly gather data from various sources, including web
scraping and APIs, and prepare it for analysis through automated data cleaning and
preprocessing techniques. By automating these initial stages, companies ensure that their
data is of high quality and ready for insightful analysis.
Key technologies and tools play a crucial role in the automation of data analytics. Platforms
like Python, R, Tableau, and Power BI facilitate the creation of automated data pipelines
and analytical models, which continuously learn and improve from new data. This
automation extends to data visualization and reporting, where real-time dashboards and
automated report generation provide stakeholders with up-to-date insights, enhancing
accessibility and usability of data. Real-world applications of automated data analytics are
vast and varied, from fraud detection in finance, improving patient care in healthcare, to
personalized marketing campaigns that drive customer engagement. These examples
demonstrate the tangible benefits that automation brings, including increased operational
efficiency, cost savings, and enhanced strategic decision-making.
However, the implementation of automated data analytics is not without challenges. Data
integration from multiple sources, real-time processing, ensuring data quality, and
maintaining security and privacy are significant hurdles. Additionally, the need for skilled
professionals who can manage and interpret complex data systems remains a critical issue.
Despite these challenges, advancements in AI, quantum computing, and augmented data
management hold promise for overcoming these obstacles. Addressing these challenges is
crucial for fully leveraging the potential of automated data analytics, making it a vital tool
for modern businesses seeking to stay competitive in a rapidly evolving data-driven world.
The future of automated data analytics is bright, with continuous innovations paving the
way for even greater impact on business efficiency and decision-making processes.
18
IX. References
Dataheroes. (n.d.). How to automate data cleaning. Retrieved from
https://dataheroes.ai/blog/how-to-automate-data-cleaning/
Infosys BPM. (n.d.). Manufacturing industry problems and solutions. Retrieved from
https://www.infosysbpm.com/blogs/manufacturing/manufacturing-industry-problems-
solutions.html
Insightsoftware. (n.d.). Top 5 predictive analytics models and algorithms. Retrieved from
https://insightsoftware.com/blog/top-5-predictive-analytics-models-and-algorithms/
Polymersearch. (n.d.). Automated data analysis: Everything you need to know. Retrieved
from https://www.polymersearch.com/data-analysis-guide/automated-data-analysis-
everything-you-need-to-know
SGU. (n.d.). Challenges in industry and emerging technology trends for the future.
Retrieved from https://sgu.ac.id/challenges-in-industry-and-emerging-technology-trends-
for-the-future/
CASTILLO, JULIET V.
Address:318 A Brgy. 193 Zone 20 Pildera II NAIA, Pasay City
Number: 09660429263
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2022 - On going
High School
Parañaque National High School - Main
Parañaque City
2016 – 2022
Elementary School
Camp Claudio Elementary School
Tambo, Parañaque City
2010 – 2016
PERSONAL PARTICULARS
Age : 20
Sex : Female
Date of Birth : November 1, 2004
Place of Birth : Philippine General Hospital
Height : 156 cm
Weight : 43 kg
Religion : Protestant
Languages Spoken : Filipino, English
Nationality : Filipino
20
FLORES, PATRICIA VICTORIA B.
Address: Binauganan, Tarlac City
Number: 09666808293
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2022 - On going
High School
College of the Holy Spirit of Tarlac
San Sebastian, Tarlac City
2016 – 2022
Elementary School
Fairlane Young Achievers’ School
E Fairlane Rd, Tarlac City
2010 – 2013
PERSONAL PARTICULARS
Age : 21
Sex : Female
Date of Birth : October 5, 2003
Place of Birth : Tarlac City
Height : 163 cm
Weight : 63 kg
Religion : Roman Catholic
Languages Spoken : Filipino, English
Nationality : Filipino
21
GADDI, TRISHA MAE C.
Address: 1040 M.Dela Fuente St.Sampaloc, Manila
Number: 09155616870
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2022 - On going
High School
Ramon Magsaysay High School
Sampaloc, Manila
2016 – 2022
Elementary School
Dr.Alejandro Albert Elementary School
Sampaloc, Manila
2013 – 2016
PERSONAL PARTICULARS
Age : 20
Sex : Female
Date of Birth : May 17, 2004
Place of Birth : Manila
Height : 157 cm
Weight : 55 kg
Religion : Roman Catholic
Languages Spoken : Filipino, English
Nationality : Filipino
22
MACARIO, MCKERVIN M.
Address: Brgy. San Roque Antipolo City
Number: 09556879884
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2022 - On going
High School
San Roque National High School
Antipolo City
2016 - 2020
Elementary School
Lores Elementary School
Antipolo City
2010 - 2016
PERSONAL PARTICULARS
Age : 22
Sex : Male
Date of Birth : September 16, 2002
Place of Birth : Las Pinas City
Height : 173 cm
Weight : 76 kg
Religion : Christian
Languages Spoken : Filipino, English
Nationality : Filipino
23
ORETA, PRINCESS DENILYN
Address: 4847 12th St. Sto.Nino, Paranaque City
Number: 09666878498
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2022 - On going
High School
Sto. Nino High School
Paranaque City
2014 - 2016
Elementary School
Sto. Nino Elementary School
Paranaque City
2014 - 2016
PERSONAL PARTICULARS
Age : 21
Sex : Female
Date of Birth : February 28, 2003
Place of Birth : Paranaque City
Height : 158 cm
Weight : 47 kg
Religion : Roman Catholic
Languages Spoken : Filipino, English
Nationality : Filipino
24
PANGILINAN, EUNICE D.
Address: Sta. Maria, Minalin, Pampanga
Number: 09685342855
Email Address: [email protected]@
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2021 - On going
High School
Sta. Maria National High School
2015 – 2020
Elementary School
Sto. Domingo Elementary School
2009– 2014
PERSONAL PARTICULARS
Age : 22
Sex : Female
Date of Birth : September 20, 2002
Place of Birth : San Fernando, Pampanga
Height : 152 cm
Weight : 55kg
Religion : Christian (Born Again)
Languages Spoken : Filipino, English
Nationality : Filipino
25
VERDERA, CRIZALDY JR L.
Address: 1928 Immaculada St. Tayuman Tondo Manila
Number: 09167744210
Email Address: [email protected]
EDUCATIONAL BACKGROUND
College
Technological University of the Philippines
Ermita, Manila
2021 - On going
High School
Arellano University
Legarda, Manila
2016 – 2018
Elementary
Lakandula High School
Tayuman, Tondo Manila
PERSONAL PARTICULARS
Age : 24
Sex : Male
Date of Birth : Aug. 7, 2000
Place of Birth : Quezon Province
Height : 155cm
Weight : 55 kg
Religion : Roman Catholic
Languages Spoken : Filipino, English
Nationality : Filipino
26