analytics and data science
Steps to explore, preprocess, and condition data prior to modeling and analysis.
It requires the presence of an analytic sandbox; the team performs extract, load, and transform (ELT)
steps to get data into the sandbox.
Data preparation tasks are likely to be performed multiple times and not in predefined order.
Tools commonly used for this phase include Hadoop, Alpine Miner, and OpenRefine.
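As a minimal sketch of loading data into a sandbox and transforming it there, the Python example below uses an in-memory SQLite database as a stand-in for the analytic sandbox. The table names, columns, and sample rows are hypothetical and only illustrate the idea; a real sandbox would typically be a much larger environment.

```python
import sqlite3

# Hypothetical raw records "extracted" from a source system (all values as text).
RAW_ORDERS = [
    ("1001", " alice ", "25.50"),
    ("1002", "BOB", "40.00"),
    ("1003", "alice", "14.50"),
]


def load_and_transform(rows):
    """Load raw rows into a sandbox database, then transform them inside it."""
    sandbox = sqlite3.connect(":memory:")  # stand-in for the analytic sandbox
    # Load: keep the raw data untouched so it can be re-processed later.
    sandbox.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    sandbox.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    # Transform: build a cleaned, typed table from the raw one.
    sandbox.execute(
        """
        CREATE TABLE orders AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               LOWER(TRIM(customer))     AS customer,
               CAST(amount AS REAL)      AS amount
        FROM raw_orders
        """
    )
    return sandbox


sandbox = load_and_transform(RAW_ORDERS)
totals = sandbox.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Keeping the raw table alongside the cleaned one reflects the point that preparation tasks are often repeated: the transformation can be rerun without extracting the data again.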
Phase 3: Model Planning –
In this phase, the data science team develops data sets for training, testing, and production
purposes.
The team builds and executes models based on the work done in the model planning phase.
Tools commonly used for this phase include MATLAB and STATISTICA.
Phase 4: Model Building –
The team creates datasets for use in testing, training, and production.
The team also examines if its present tools will serve for running the models or if a more robust
environment is required for model execution.
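One way to create separate data sets for training, testing, and production is a seeded random split. The sketch below is illustrative only: the 60/20/20 proportions, the function name, and the seed are assumptions, not values prescribed by the lifecycle.

```python
import random


def split_dataset(records, train_frac=0.6, test_frac=0.2, seed=42):
    """Shuffle records and split them into training, testing, and production sets."""
    if not 0 < train_frac + test_frac < 1:
        raise ValueError("train_frac + test_frac must be between 0 and 1")
    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    production = shuffled[n_train + n_test:]  # remainder of the records
    return train, test, production


train, test, production = split_dataset(range(100))
print(len(train), len(test), len(production))
```

Shuffling before splitting helps avoid ordering effects in the source data, and the three sets never share a record.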
The four most popular types of business analytics are descriptive, diagnostic, predictive, and
prescriptive. A fifth, cognitive analytics, is a newer type that employs AI, ML, and deep
learning. While each of these business analytics types is effective when used individually, they become
extremely powerful when employed together.
1. Descriptive Analytics
It analyses historical data to determine the response of a unit over a set of given variables. It tracks key
performance indicators (KPIs) for a better understanding of the present state of a business.
Deciding which business metrics will effectively evaluate performance against objectives.
Collecting and preparing data using processes such as deduplication, transformation, and cleansing.
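To make the idea of tracking KPIs from historical data concrete, the following sketch removes duplicate records and then computes two example KPIs per month: total revenue and average order value. The order records and the choice of KPIs are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical orders: (order_id, month, revenue).
ORDERS = [
    (1, "2024-01", 120.0),
    (2, "2024-01", 80.0),
    (2, "2024-01", 80.0),   # duplicate record
    (3, "2024-02", 150.0),
    (4, "2024-02", 50.0),
    (5, "2024-02", 100.0),
]


def monthly_kpis(orders):
    """Return {month: (total_revenue, average_order_value)} after deduplication."""
    # Deduplicate by order_id: later copies overwrite earlier ones.
    unique = {order_id: (month, revenue) for order_id, month, revenue in orders}
    by_month = defaultdict(list)
    for month, revenue in unique.values():
        by_month[month].append(revenue)
    return {
        month: (sum(values), sum(values) / len(values))
        for month, values in sorted(by_month.items())
    }


kpis = monthly_kpis(ORDERS)
for month, (total, average) in kpis.items():
    print(f"{month}: total={total:.2f} average={average:.2f}")
```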
2. Diagnostic Analytics
Diagnostic Analytics is one of those business analytics types that help understand why things happened
in the past. Using drill-downs, data mining, data discovery, and correlations, you can comprehend the
driving factors.
This advanced analytics method is usually employed as a follow-up step to descriptive analytics to find
the reasoning behind certain results in finance, marketing, cybersecurity, and more.
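Correlation is one of the simpler diagnostic techniques mentioned above. The sketch below computes the Pearson correlation coefficient by hand to check which of two candidate factors moves with sales. The weekly figures are invented for illustration, and a strong correlation only suggests a driving factor; it does not prove causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures.
sales = [12, 24, 33, 46, 55]
ad_spend = [10, 20, 30, 40, 50]
support_tickets = [7, 3, 9, 4, 6]

print("ad spend vs sales:", round(pearson(ad_spend, sales), 3))
print("tickets vs sales: ", round(pearson(support_tickets, sales), 3))
```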
3. Predictive Analytics
It considers historical data trends for determining the probability of particular future outcomes. It uses
several techniques like data mining, machine learning algorithms, and statistical modeling to forecast
the likelihood of events.
Predictive analytics helps improve business areas, including customer service, efficiency, fraud detection
and prevention, and risk management. It allows you to grow the most profitable customers, improve the
operations of businesses, and determine customer responses and cross-sell opportunities.
Recommending products
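As a small illustration of statistical modeling on historical trends, the sketch below fits a straight line to past monthly sales with ordinary least squares and extrapolates one month ahead. The sales figures are hypothetical, and a linear trend is only one assumption among many possible models; real predictive work would also estimate the uncertainty of the forecast.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales history.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 125, 130, 145, 150]

slope, intercept = fit_line(months, sales)
forecast = slope * 7 + intercept  # extrapolate to month 7
print(f"trend: {slope:.2f} per month, forecast for month 7: {forecast:.1f}")
```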
4. Prescriptive Analytics
Prescriptive analytics generates recommendations for handling similar future situations based on past
performance. It applies several tools, statistical methods, and ML algorithms to the available internal and
external data.
It gives you insights into what may happen, when, and why.
Price modeling
Identifying testing
5. Cognitive Analytics
Combining Artificial Intelligence and Data Analytics, Cognitive Analytics is one of the newest
types of business analytics. It looks at the available data in the knowledge base and discovers the
best solutions for the questions posed.
Cognitive analytics covers multiple analytical techniques to analyze large data sets and monitor
customer behavior patterns and emerging trends.
Tapping unstructured data sources such as images, text documents, emails, and
social posts.
Business problems
Business problems are obstacles that a business might encounter as it continues to
conduct operations. Learning more about common business problems can help a
company develop plans and react to these obstacles effectively. These types of
problems often relate to certain components of a business, such as:
Strategy
Service or products
People
Processes
Applications
Information
Infrastructure
Data preparation
Data preparation is the process of gathering, combining, structuring and organizing data so
it can be used in business intelligence (BI), analytics and data visualization applications.
The components of data preparation include data preprocessing, profiling, cleansing,
validation and transformation; it often also involves pulling together data from different
internal systems and external sources.
Produce top-quality data — Cleaning and reformatting datasets ensures that all data used in analysis
will be of high quality.
Make better business decisions — Higher-quality data that can be processed and analyzed more quickly
and efficiently leads to more timely, efficient, better-quality business decisions.
Superior scalability — Cloud data preparation can grow at the pace of the business. Enterprises don’t
have to worry about the underlying infrastructure or try to anticipate their evolutions.
Future proof — Cloud data preparation upgrades automatically so that new capabilities or problem fixes
can be turned on as soon as they are released. This allows organizations to stay ahead of the innovation
curve without delays and added costs.
Accelerated data usage and collaboration — Doing data prep in the cloud means it is always on, doesn’t
require any technical installation, and lets teams collaborate on the work for faster results.
Data preparation is done in a series of steps. There's some variation in the data preparation
steps listed by different data professionals and software vendors, but the process typically
involves the following tasks:
1. Data collection. Relevant data is gathered from operational systems, data warehouses,
data lakes and other data sources. During this step, data scientists, members of the BI
team, other data professionals and end users who collect data should confirm that it's a
good fit for the objectives of the planned analytics applications.
2. Data discovery and profiling. The next step is to explore the collected data to better
understand what it contains and what needs to be done to prepare it for the intended
uses. To help with that, data profiling identifies patterns, relationships and other
attributes in the data, as well as inconsistencies, anomalies, missing values and other
issues so they can be addressed.
3. Data cleansing. Next, the identified data errors and issues are corrected to create
complete and accurate data sets. For example, as part of cleansing data sets, faulty data
is removed or fixed, missing values are filled in and inconsistent entries are harmonized.
4. Data structuring. At this point, the data needs to be modeled and organized to meet the
analytics requirements. For example, data stored in comma-separated values (CSV) files
or other file formats has to be converted into tables to make it accessible to BI and
analytics tools.
5. Data transformation and enrichment. In addition to being structured, the data typically
must be transformed into a unified and usable format. For example, data
transformation may involve creating new fields or columns that aggregate values from
existing ones. Data enrichment further enhances and optimizes data sets as needed,
through measures such as augmenting and adding data.
6. Data validation and publishing. In this last step, automated routines are run against
the data to validate its consistency, completeness and accuracy. The prepared data is
then stored in a data warehouse, a data lake or another repository and either used
directly by whoever prepared it or made available for other users to access.
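The profiling, cleansing, transformation, and validation steps above can be sketched as a small pipeline. Everything in this example is hypothetical: the record fields, the country alias mapping, and the choice of filling missing values with the median are assumptions made for illustration, not a prescribed method.

```python
from statistics import median

# Hypothetical collected records; None marks a missing value.
RAW = [
    {"id": 1, "country": "us", "units": 3, "unit_price": 10.0},
    {"id": 2, "country": "USA", "units": None, "unit_price": 12.0},
    {"id": 3, "country": "De", "units": 5, "unit_price": 8.0},
    {"id": 3, "country": "De", "units": 5, "unit_price": 8.0},  # duplicate
]
COUNTRY_ALIASES = {"us": "US", "usa": "US", "de": "DE"}  # assumed mapping


def profile(records):
    """Profiling: count missing values per field."""
    return {f: sum(r[f] is None for r in records) for f in records[0]}


def cleanse(records):
    """Cleansing: drop duplicate ids, fill missing units, harmonize countries."""
    unique = list({r["id"]: r for r in records}.values())
    fill_value = median(r["units"] for r in unique if r["units"] is not None)
    return [
        {
            **r,
            "units": fill_value if r["units"] is None else r["units"],
            "country": COUNTRY_ALIASES[r["country"].lower()],
        }
        for r in unique
    ]


def transform(records):
    """Transformation: add a column derived from existing ones."""
    return [{**r, "revenue": r["units"] * r["unit_price"]} for r in records]


def validate(records):
    """Validation: raise if completeness or consistency checks fail."""
    ids = [r["id"] for r in records]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate ids remain")
    if any(value is None for r in records for value in r.values()):
        raise ValueError("missing values remain")
    if any(r["revenue"] < 0 for r in records):
        raise ValueError("negative revenue found")
    return records


print(profile(RAW))
prepared = validate(transform(cleanse(RAW)))
for row in prepared:
    print(row)
```

Running validation as the last step mirrors the order in the list: the prepared data is only handed on once the automated checks pass.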
Data preparation can also incorporate or feed into data curation work that creates and
oversees ready-to-use data sets for BI and analytics. Data curation involves tasks such as
indexing, cataloging and maintaining data sets and their associated metadata to help users
find and access the data. In some organizations, data curator is a formal role that works
collaboratively with data scientists, business analysts, other users and the IT and data
management teams. In others, data may be curated by data stewards, data engineers,
database administrators or data scientists and business users themselves.
Data collection is the methodological process of gathering information about a specific subject.
Before collecting data, there are several factors you need to define:
Businesses collect data to improve services, understand consumer needs, refine business strategies, grow
and retain customers, and even sell the data as second-party data to other businesses at a profit.
1. Surveys
Surveys are physical or digital questionnaires that gather both qualitative and quantitative data from
subjects. One situation in which you might conduct a survey is gathering attendee feedback after an
event. This can provide a sense of what attendees enjoyed, what they wish was different, and areas
in which you can improve or save money during your next event for a similar audience.
While physical copies of surveys can be sent out to participants, online surveys present the
opportunity for distribution at scale. They can also be inexpensive; running a survey can cost nothing
if you use a free tool. If you wish to target a specific group of people, partnering with a market
research firm to get the survey in front of that demographic may be worth the money.
Something to watch out for when crafting and running surveys is the effect of bias, including:
Collection bias: It can be easy to accidentally write survey questions with a biased lean. Watch
out for this when creating questions to ensure your subjects answer honestly and aren’t swayed
by your wording.
Subject bias: Because your subjects know their responses will be read by you, their answers may
be biased toward what seems socially acceptable. For this reason, consider pairing survey data
with behavioral data from other collection methods to get the full picture.
2. Transactional Tracking
Each time your customers make a purchase, tracking that data can allow you to make decisions
about targeted marketing efforts and understand your customer base better.
Often, e-commerce and point-of-sale platforms allow you to store data as soon as it’s generated,
making this a seamless data collection method that can pay off in the form of customer insights.
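As a rough sketch of how stored transaction data can turn into customer insights, the example below summarizes purchases per customer. The record layout and sample data are hypothetical; a real e-commerce or point-of-sale export would have its own schema.

```python
from collections import Counter, defaultdict

# Hypothetical point-of-sale records: (customer_id, product, amount).
TRANSACTIONS = [
    ("c1", "coffee", 4.0),
    ("c2", "tea", 3.0),
    ("c1", "coffee", 4.0),
    ("c1", "bagel", 2.5),
    ("c2", "tea", 3.0),
    ("c2", "tea", 3.0),
]


def summarize_customers(transactions):
    """Return per-customer purchase count, total spend, and most-bought product."""
    amounts = defaultdict(list)
    products = defaultdict(Counter)
    for customer, product, amount in transactions:
        amounts[customer].append(amount)
        products[customer][product] += 1
    return {
        customer: {
            "purchases": len(values),
            "total_spend": sum(values),
            "top_product": products[customer].most_common(1)[0][0],
        }
        for customer, values in amounts.items()
    }


summary = summarize_customers(TRANSACTIONS)
for customer, stats in summary.items():
    print(customer, stats)
```

A summary like this could feed targeted marketing, for example by promoting each customer's most-bought product.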
3. Interviews and Focus Groups
Interviews and focus groups consist of talking to subjects face-to-face about a specific topic or issue.
Interviews tend to be one-on-one, and focus groups are typically made up of several people. You
can use both to gather qualitative and quantitative data.
Through interviews and focus groups, you can gather feedback from people in your target audience
about new product features. Seeing them interact with your product in real time and recording their
reactions and responses to questions can provide valuable data about which product features to
pursue.
As is the case with surveys, these collection methods allow you to ask subjects anything you want
about their opinions, motivations, and feelings regarding your product or brand. They also introduce the
potential for bias. Aim to craft questions that don’t lead them in one particular direction.
One downside of interviewing and conducting focus groups is they can be time-consuming and
expensive. If you plan to conduct them yourself, it can be a lengthy process. To avoid this, you can
hire a market research facilitator to organize and conduct interviews on your behalf.
4. Observation
Observing people interacting with your website or product can be useful for data collection because
of the candor it offers. If your user experience is confusing or difficult, you can witness it in real time.
Yet, setting up observation sessions can be difficult. You can use a third-party tool to record users’
journeys through your site or observe a user’s interaction with a beta version of your site or product.
While less accessible than other data collection methods, observations enable you to see firsthand
how users interact with your product or site. You can leverage the qualitative and quantitative data
gleaned from this to make improvements and double down on points of success.
5. Online Tracking
To gather behavioral data, you can implement pixels and cookies. These are both tools that track
users’ online behavior across websites and provide insight into what content they’re interested in
and typically engage with.
You can also track users’ behavior on your company’s website, including which parts are of the
highest interest, whether users are confused when using it, and how long they spend on product
pages. This can enable you to improve the website’s design and help users navigate to their
destination.
Inserting a pixel is often free and relatively easy to set up. Implementing cookies may come with a
fee but could be worth it for the quality of data you’ll receive. Once pixels and cookies are set, they
gather data on their own and don’t need much maintenance, if any.
It’s important to note: Tracking online behavior can have legal and ethical privacy implications.
Before tracking users’ online behavior, ensure you’re in compliance with local and industry data
privacy standards.
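To illustrate how tracked behavior can show how long users spend on pages, the sketch below estimates average time on page from page-view events. The event format and sample data are hypothetical, and the user IDs are assumed to be anonymized. The time on a page is taken as the gap until the same user's next page view, so the last page of each visit has no measurable duration and is left out.

```python
from collections import defaultdict

# Hypothetical page-view events: (user_id, timestamp_in_seconds, page).
EVENTS = [
    ("u1", 0, "/home"),
    ("u1", 30, "/product/a"),
    ("u1", 150, "/checkout"),
    ("u2", 10, "/home"),
    ("u2", 20, "/product/a"),
    ("u2", 80, "/home"),
]


def average_time_on_page(events):
    """Return {page: average seconds spent}, based on gaps between page views."""
    views_by_user = defaultdict(list)
    for user, timestamp, page in events:
        views_by_user[user].append((timestamp, page))
    durations = defaultdict(list)
    for views in views_by_user.values():
        views.sort()  # order each user's views by time
        for (start, page), (next_start, _) in zip(views, views[1:]):
            durations[page].append(next_start - start)
    return {page: sum(d) / len(d) for page, d in durations.items()}


averages = average_time_on_page(EVENTS)
print(averages)
```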
6. Forms
Online forms are beneficial for gathering qualitative data about users, specifically demographic data
or contact information. They’re relatively inexpensive and simple to set up, and you can use them to
gate content or registrations, such as webinars and email newsletters.
You can then use this data to contact people who may be interested in your product, build out
demographic profiles of existing customers, and support remarketing efforts, such as email workflows and
content recommendations.
7. Social Media Monitoring
Monitoring your company’s social media channels for follower engagement is an accessible way to
track data about your audience’s interests and motivations. Many social media platforms have
analytics built in, but there are also third-party social platforms that give more detailed, organized
insights pulled from multiple channels.
You can use data collected from social media to determine which issues are most important to your
followers. For instance, you may notice that the number of engagements dramatically increases
when your company posts about its sustainability efforts.
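A simple way to see which topics matter most to followers is to compare average engagement per topic. The sketch below assumes post data has already been exported from a platform's analytics; the field names, topics, and numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical exported post data.
POSTS = [
    {"topic": "sustainability", "likes": 120, "comments": 30, "shares": 25},
    {"topic": "product", "likes": 40, "comments": 5, "shares": 3},
    {"topic": "sustainability", "likes": 150, "comments": 45, "shares": 30},
    {"topic": "hiring", "likes": 20, "comments": 2, "shares": 1},
    {"topic": "product", "likes": 60, "comments": 10, "shares": 2},
]


def engagement_by_topic(posts):
    """Return (topic, average engagements per post) pairs, highest first."""
    per_topic = defaultdict(list)
    for post in posts:
        engagements = post["likes"] + post["comments"] + post["shares"]
        per_topic[post["topic"]].append(engagements)
    averages = {topic: sum(v) / len(v) for topic, v in per_topic.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = engagement_by_topic(POSTS)
for topic, average in ranking:
    print(f"{topic}: {average:.1f} engagements per post")
```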