MDS131 - Research Methods in Data Science - Unit 2 - Part 1
• Davy Cielen and Arno Meysman, Introducing Data Science. Simon and Schuster, 2016
• C. R. Kothari, Research Methodology: Methods and Techniques, 3rd ed. New Delhi: New Age International Publishers, reprint 2014
• Big data is a blanket term for any collection of data sets so large or complex
that they become difficult to process using traditional data management
techniques such as relational database management systems (RDBMS).
• Big data is important because it can be used to gain insights into a wide variety
of areas, including business, healthcare, and government. It can also be used to
improve decision making, predict trends, and identify new opportunities.
• Data science involves using methods to analyze massive amounts of data and
extract the knowledge it contains.
• Data science and big data evolved from statistics and traditional data
management but are now considered to be distinct disciplines.
Current Landscape of Big Data – Characteristics / Framework
Big data is commonly characterized in terms of V's: the original 3 V's (volume, velocity, variety), later extended to 4 V's (adding veracity) and then to 7 V's (adding value, variability, and visibility), as described below.
• Volume: This refers to the sheer scale of data generated and collected. Big data involves massive
amounts of information that exceed the processing capacity of conventional databases and tools. The
volume of data is measured in terms of petabytes, exabytes, and beyond.
o Scenario: Social Media Data Analysis for Marketing. Description: A marketing company is analyzing
social media data to understand consumer sentiment towards a new product launch. They collect and
process millions of tweets, comments, and posts in real-time to gauge public reactions and identify
potential areas for improvement.
• Velocity: Velocity pertains to the speed at which data is generated, processed, and delivered. With the
advent of real-time data streams from sources like social media, sensors, and financial markets,
organizations need to analyze and act upon data in near-real-time to capitalize on opportunities and
respond to challenges.
o Scenario: Stock Market Real-time Analysis. Description: An investment firm is monitoring stock market
data in real time to make informed trading decisions. They process and analyze market data streams,
such as price fluctuations and trading volumes, to identify trends and execute buy/sell orders swiftly.
• Variety: Variety refers to the diverse types of data that big data encompasses. This includes structured
data (such as relational databases), unstructured data (such as text and images), and semi-structured
data (such as XML files). Managing and extracting insights from this varied data requires specialized
techniques and tools.
o Scenario: Healthcare Data Integration. Description: A hospital is integrating various types of patient
data, including structured electronic health records (EHRs), unstructured doctor's notes, and medical
images. They use advanced analytics to correlate different data types to provide personalized treatment
plans.
• Veracity: Veracity refers to the accuracy and reliability of data. In the big data context, data can come
from numerous sources, each with varying levels of accuracy and trustworthiness. Ensuring data quality
and addressing issues like inconsistencies and errors become critical to making reliable decisions.
o Scenario: Fraud Detection in Financial Transactions. Description: A credit card company is analyzing a
massive volume of transaction data to detect fraudulent activities. They use machine learning algorithms
to identify patterns that indicate potentially fraudulent transactions while reducing false positives.
• Value: The value of big data lies in its potential to provide meaningful insights and drive informed decisions.
Extracting value from big data involves analyzing and interpreting the data to uncover patterns, trends,
correlations, and insights that can lead to improved business strategies, innovations, and efficiencies.
o Scenario: Retail Customer Behavior Analysis. Description: An e-commerce company is analyzing customer
behavior data to improve sales and marketing strategies. They analyze browsing history, purchase patterns,
and demographics to personalize recommendations, promotions, and advertisements, ultimately driving
higher conversion rates.
• Variability: Variability refers to the inconsistency of data flows, which can be erratic and unpredictable. Data
can arrive in irregular intervals, and its structure can change over time. Handling variability requires flexible
data processing techniques and tools that can adapt to changing data patterns.
o Scenario: Weather Forecasting and Emergency Response. Description: A meteorological agency is collecting
and processing weather data from various sources, including satellites, sensors, and weather stations. They
handle the variability in data frequency and format to provide accurate and timely weather forecasts for
disaster preparedness.
• Visibility (Visualization): Visibility refers to the ability to access and understand data from
various perspectives. Effective visualization and data presentation techniques are crucial
to making complex data comprehensible and actionable for a wide range of stakeholders.
o Scenario: Supply Chain Analytics for Manufacturing. Description: A manufacturing
company is using big data analytics to gain visibility into its supply chain. They track data
from suppliers, production facilities, transportation, and distribution centers to optimize
inventory levels, reduce lead times, and enhance overall operational efficiency.
These seven V's collectively emphasize the challenges and opportunities presented by big data.
Organizations that can successfully address these characteristics can harness the power of big
data to gain insights, improve decision-making, drive innovation, and enhance their competitive
advantage.
Natural Language Data
• Definition: Natural language data refers to text or speech data
generated by humans in their everyday communication.
Analyzing natural language involves techniques like sentiment
analysis, language translation, and text summarization.
• Examples: Social media comments, customer reviews, email
correspondence.
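Since the techniques named above, such as sentiment analysis, are algorithmic, a minimal sketch can make the idea concrete. The Python sketch below is illustrative only: the word lists and example comments are made-up assumptions, not a real lexicon, and a production system would use a trained model or an NLP library.

```python
# Minimal sketch of keyword-based sentiment scoring on natural language data.
# The word lists and comments are illustrative assumptions, not a real lexicon.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "disappointed"}

def sentiment_score(text: str) -> int:
    """Return (# positive words - # negative words) in the text."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = [
    "I love this product, it is excellent",
    "Terrible experience, very disappointed",
]
for c in comments:
    print(c, "->", sentiment_score(c))
```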
Graph-Based Data
• Definition: Graph-based data represents relationships
between entities using nodes (vertices) and edges. It's
especially useful for modeling complex interactions and
networks.
• Examples: Social networks (nodes as users, edges as
connections), supply chain networks, knowledge graphs.
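To make the nodes-and-edges idea concrete, here is a minimal sketch of a social network stored as a Python adjacency list; the user names and connections are invented for illustration.

```python
# Minimal sketch of graph-based data: a social network as an adjacency list.
# Nodes are users; an edge means two users are connected.
social_network = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice", "dave"},
    "dave": {"carol"},
}

def friends_of_friends(graph, user):
    """Users two hops away from `user`, excluding direct friends and self."""
    direct = graph.get(user, set())
    two_hop = set().union(*(graph.get(f, set()) for f in direct)) if direct else set()
    return two_hop - direct - {user}

print(friends_of_friends(social_network, "alice"))  # {'dave'}
```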
Streaming Data
• Definition: Streaming data refers to real-time data that is generated, processed, and
analyzed as it is produced. It's crucial for applications that require immediate insights and
actions.
• Examples: Stock market tick data, social media live feeds, IoT sensor data.
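A minimal sketch of stream processing: maintaining a running mean over readings as they arrive, without storing the whole stream. The sensor values below are simulated; a real deployment would consume a live feed such as an IoT message queue.

```python
import random

# Minimal sketch of streaming-data processing: keep a running average of
# sensor readings without storing the stream. Readings are simulated.
def sensor_stream(n):
    for _ in range(n):
        yield 20.0 + random.uniform(-2.0, 2.0)  # simulated temperature

count, mean = 0, 0.0
for reading in sensor_stream(1000):
    count += 1
    mean += (reading - mean) / count  # incremental (online) mean update
print(f"processed {count} readings, running mean = {mean:.2f}")
```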
• C. R. Kothari, Research Methodology: Methods and Techniques, 3rd ed. New Delhi: New Age International Publishers, reprint 2014
Sampling Techniques
• Deliberate sampling: Deliberate sampling is also known as purposive or non-probability
sampling. This method involves the deliberate selection of particular units of the universe
to constitute a sample that represents the universe. When population elements are selected
for inclusion in the sample based on ease of access, it is called convenience sampling. In
judgement sampling, on the other hand, the researcher's judgement is used to select items
they consider representative of the population.
Scenario: Market Research for a New Product Launch
In this scenario, deliberate sampling allows the company to focus its market research efforts on a
specific demographic that is crucial for the success of its new product. While deliberate sampling
doesn't provide the statistical representativeness of probability sampling, it offers valuable
qualitative insights that can guide decision-making in product development and marketing
strategies.
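As a minimal sketch with made-up respondent data, the difference between convenience and judgement selection can be expressed in a few lines of Python; the fields and the 18-35 target demographic are illustrative assumptions, not from the text.

```python
# Minimal sketch contrasting convenience and judgement sampling.
# Respondent records are simulated for illustration.
respondents = [
    {"id": 1, "age": 24, "at_mall": True},
    {"id": 2, "age": 67, "at_mall": False},
    {"id": 3, "age": 31, "at_mall": True},
    {"id": 4, "age": 45, "at_mall": True},
]

# Convenience sampling: take whoever is easiest to reach (e.g., shoppers on site).
convenience_sample = [r for r in respondents if r["at_mall"]][:3]

# Judgement sampling: the researcher deliberately picks units judged
# representative, here an assumed 18-35 target demographic.
judgement_sample = [r for r in respondents if 18 <= r["age"] <= 35]

print(convenience_sample)
print(judgement_sample)
```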
• Simple random sampling: This type of sampling is also known as chance sampling or
probability sampling, where every item in the population has an equal chance of
inclusion in the sample and, in the case of a finite universe, each possible sample
has the same probability of being selected.
Imagine a scenario where a research firm wants to conduct a political opinion poll to gauge the
sentiments of the residents of a city regarding upcoming local elections. The firm aims to use
simple random sampling to ensure that each eligible resident has an equal chance of being
included in the survey.
Simple random sampling in this scenario ensures that every registered voter has an equal
chance of being included in the survey, minimizing potential bias and providing a representative
sample of the population's political opinions.
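A minimal Python sketch of the voter-poll scenario, using simulated voter IDs; `random.sample` draws without replacement, so every registered voter has an equal chance of selection.

```python
import random

# Minimal sketch of simple random sampling: every registered voter has an
# equal chance of selection. Voter IDs are simulated for illustration.
registered_voters = [f"voter_{i}" for i in range(100_000)]

sample = random.sample(registered_voters, k=1_000)  # without replacement
print(len(sample), sample[:5])
```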
• Systematic sampling: In some instances the most practical way of sampling is to select
every 15th name on a list, every 10th house on one side of a street, and so on. Sampling of
this type is known as systematic sampling. An element of randomness is usually introduced
into this kind of sampling by using random numbers to pick the unit with which to start.
This procedure is useful when a sampling frame is available in the form of a list.
Consider a scenario where a retail store wants to collect customer feedback to understand their
satisfaction levels and improve their services. The store aims to use systematic sampling to
efficiently gather feedback from customers while maintaining randomness in the selection process.
Systematic sampling in this scenario allows the retail store to collect customer feedback in an
organized and efficient manner while maintaining a level of randomness. It ensures that feedback is
gathered from a variety of customers, providing insights that can lead to improvements in the
store's operations and customer satisfaction.
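A minimal sketch of the store's systematic selection, with a simulated customer list; the sampling interval and the random start are the only parameters.

```python
import random

# Minimal sketch of systematic sampling: pick a random start, then take
# every k-th unit from the sampling frame (here, a simulated customer list).
customers = [f"customer_{i}" for i in range(5_000)]

sample_size = 100
k = len(customers) // sample_size   # sampling interval (every k-th customer)
start = random.randrange(k)         # random start introduces randomness
sample = customers[start::k]
print(k, start, len(sample))        # interval, start, 100 sampled customers
```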
• Stratified sampling: If the population from which a sample is to be drawn does not constitute a
homogeneous group, the stratified sampling technique is applied to obtain a
representative sample. In this technique, the population is divided into a number of non-
overlapping subpopulations, or strata, and sample items are selected from each stratum. If the
items are selected from each stratum by simple random sampling, the entire procedure, first
stratification and then simple random sampling, is known as stratified random sampling.
Imagine a scenario where a district school authority wants to assess the academic performance of
its students across different grade levels and subjects. The district authority aims to use stratified
sampling to ensure representation from each grade level and subject area while maintaining a
manageable sample size.
Stratified sampling in this scenario allows the district authority to obtain a representative sample of
students from each grade level and subject area. This ensures that the assessment results
accurately reflect the academic performance of students across the entire school district, enabling
effective educational planning and targeted interventions.
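A minimal sketch of stratified random sampling with simulated student records: group by grade level (the stratum), then draw a simple random sample within each stratum. The grade levels and sample sizes are illustrative assumptions.

```python
import random
from collections import defaultdict

# Minimal sketch of stratified random sampling: stratify students by grade,
# then draw a simple random sample within each stratum. Data is simulated.
students = [{"id": i, "grade": random.choice([6, 7, 8])} for i in range(3_000)]

strata = defaultdict(list)
for s in students:
    strata[s["grade"]].append(s)   # stratification step

per_stratum = 50
sample = []
for grade, members in strata.items():
    sample.extend(random.sample(members, per_stratum))  # SRS within stratum

print(len(sample))  # 50 students from each of the three grade levels
```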
• Multi-stage sampling: This is a further development of the idea of cluster sampling. This
technique is meant for big inquiries extending over a considerably large geographical area,
such as an entire country. Under multi-stage sampling the first stage may be to select large
primary sampling units such as states, then districts, then towns, and finally certain families
within towns.
Scenario: Environmental Impact Assessment in a Region
Multi-stage sampling in this scenario allows the environmental agency to efficiently gather data
from various ecosystems within the region while maintaining a representative sample. By
considering different geographical zones and sub-areas, they can assess the potential impact of
the construction project on the environment more comprehensively.
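A minimal sketch of the states → districts → households idea, using a simulated nested sampling frame; the counts at each stage are illustrative assumptions.

```python
import random

# Minimal sketch of multi-stage sampling: states -> districts -> households.
# The nested frame and stage sizes are simulated for illustration.
frame = {
    f"state_{s}": {
        f"district_{s}_{d}": [f"household_{s}_{d}_{h}" for h in range(200)]
        for d in range(10)
    }
    for s in range(5)
}

states = random.sample(list(frame), 2)                   # stage 1: states
sample = []
for st in states:
    districts = random.sample(list(frame[st]), 3)        # stage 2: districts
    for di in districts:
        sample.extend(random.sample(frame[st][di], 20))  # stage 3: households
print(len(sample))  # 2 states * 3 districts * 20 households = 120
```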
• Sequential sampling: This is a somewhat complex sample design in which the ultimate size
of the sample is not fixed in advance but is determined according to mathematical decision
rules on the basis of information yielded as the survey progresses. This design is usually
adopted under acceptance sampling plans in the context of statistical quality control.
Consider a scenario where a manufacturing plant produces electronic components and wants to
ensure the quality of its products. The plant implements sequential sampling to monitor the
production process and make real-time decisions about the quality of the components.
In this scenario, sequential sampling is utilized to make ongoing decisions about the quality of
manufactured components. It enables the manufacturing plant to quickly identify and rectify
quality issues, leading to higher product quality and reduced waste.
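A minimal, deliberately simplified sketch of a sequential acceptance rule: components are inspected one at a time and the sample size is not fixed in advance. The thresholds and simulated defect rate are illustrative assumptions, not a real statistical quality control plan (which would use, e.g., a sequential probability ratio test).

```python
import random

# Minimal, simplified sketch of sequential (acceptance) sampling: inspect
# components one at a time, stopping as soon as a decision rule is met.
def inspect():
    return random.random() < 0.03  # True = defective (3% simulated rate)

ACCEPT_AFTER = 100   # accept the lot after 100 inspections with few defects
REJECT_AT = 4        # reject the lot as soon as 4 defects are observed

defects = 0
for n in range(1, ACCEPT_AFTER + 1):
    defects += inspect()
    if defects >= REJECT_AT:
        print(f"reject lot after {n} inspections ({defects} defects)")
        break
else:
    print(f"accept lot after {ACCEPT_AFTER} inspections ({defects} defects)")
```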
• Quota sampling: In stratified sampling, the cost of taking random samples from individual
strata is often so high that interviewers are simply given quotas to be filled from the
different strata, with the actual selection of items for the sample left to the interviewer's
judgement. This is called quota sampling.
Imagine a scenario where a beverage company wants to conduct a consumer preference survey
to understand which flavors of a new drink are most popular among different age groups and
genders. The company decides to use quota sampling to ensure a balanced representation of
participants from various demographic categories.
In this scenario, quota sampling helps the beverage company gather insights about consumer
preferences across different demographic categories without conducting a fully random sample.
While quota sampling doesn't guarantee statistical representativeness like probability sampling, it
allows for a certain level of control over the composition of the sample to ensure diversity and
balance.
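A minimal sketch of quota filling with simulated respondent arrivals; within-cell selection is left to the interviewer, modeled here simply as arrival order. The demographic cells and quota sizes are illustrative assumptions.

```python
import random

# Minimal sketch of quota sampling: fill fixed quotas per demographic cell
# (age group x gender); respondents beyond a full quota are turned away.
quotas = {("18-30", "F"): 25, ("18-30", "M"): 25,
          ("31-50", "F"): 25, ("31-50", "M"): 25}
filled = {cell: [] for cell in quotas}

def next_respondent(i):
    """Simulated arriving respondent with random demographics."""
    return {"id": i,
            "age_group": random.choice(["18-30", "31-50"]),
            "gender": random.choice(["F", "M"])}

i = 0
while not all(len(filled[c]) == q for c, q in quotas.items()):
    i += 1
    person = next_respondent(i)
    cell = (person["age_group"], person["gender"])
    if len(filled[cell]) < quotas[cell]:
        filled[cell].append(person)

print({cell: len(v) for cell, v in filled.items()})
```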
• Cluster sampling and area sampling: Cluster sampling involves grouping the population
and then selecting the groups, or clusters, rather than individual elements for inclusion in
the sample. Suppose a departmental store wishes to sample its credit card holders. It has
issued cards to 15,000 customers, and the sample size is to be kept at, say, 450. For cluster
sampling, the list of 15,000 card holders could be formed into 100 clusters of 150 card
holders each, and three clusters might then be selected for the sample at random.
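The card-holder example translates directly into a minimal Python sketch; the holder names are simulated, but the numbers (15,000 holders, 100 clusters of 150, 3 clusters selected) follow the example above.

```python
import random

# Minimal sketch of the cluster-sampling example: 15,000 card holders formed
# into 100 clusters of 150 each, then 3 whole clusters selected at random.
card_holders = [f"holder_{i}" for i in range(15_000)]

cluster_size = 150
clusters = [card_holders[i:i + cluster_size]
            for i in range(0, len(card_holders), cluster_size)]  # 100 clusters

chosen = random.sample(clusters, 3)            # select whole clusters
sample = [h for cluster in chosen for h in cluster]
print(len(clusters), len(sample))              # 100 clusters, 450 in sample
```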
Imagine a scenario where a government health department wants to assess the quality of
healthcare facilities in a large region with many hospitals and clinics. The department decides to
use cluster sampling to evaluate a representative subset of healthcare facilities.
In this example, cluster sampling allows the health department to evaluate the quality of
healthcare facilities in the region without having to assess each facility individually. By selecting
representative clusters and conducting assessments within them, the department can make
informed decisions to improve healthcare services.
• Area sampling is quite close to cluster sampling and is often used when the total
geographical area of interest is a big one. Under area sampling we first divide
the total area into a number of smaller non-overlapping areas, generally called geographical
clusters; a number of these smaller areas are then randomly selected, and all units in those
areas are included in the sample. Area sampling is especially helpful when we do not
have a list of the population concerned.
Imagine a scenario where an environmental agency wants to assess urban air quality across a
large city. The agency divides the city into neighborhoods, randomly selects a subset of them,
and measures air pollution at every monitoring site within the selected neighborhoods.
In this example, area sampling allows the environmental agency to assess urban air quality
efficiently across a diverse city landscape. By selecting representative neighborhoods and
measuring air pollution within those areas, it can make informed decisions to address
environmental concerns and enhance the overall quality of life for city residents.
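A minimal sketch of the air-quality scenario, with a simulated city map; the neighborhood and site counts are illustrative assumptions. Note that, unlike cluster sampling of a customer list, no list of individual units is needed in advance, only the division of the area.

```python
import random

# Minimal sketch of area sampling: divide the city into non-overlapping
# neighborhoods, randomly select some, and include every monitoring site
# in the selected neighborhoods. The city map below is simulated.
city = {
    f"neighborhood_{n}": [f"site_{n}_{s}" for s in range(8)]
    for n in range(40)
}

selected_areas = random.sample(list(city), 5)   # random geographic clusters
sample = [site for area in selected_areas for site in city[area]]  # all units
print(selected_areas, len(sample))              # 5 areas, 40 sites total
```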