
2a. Sources of Data

The document discusses different types of data sources and categorizations including primary vs secondary, internal vs external, numerical vs categorical, discrete vs continuous, and nominal vs ordinal data. It also covers topics like big data analysis, sampling methods, and probability vs non-probability sampling.

Uploaded by

Anandha Krishnan
Copyright
© All Rights Reserved

Sources of Data

Main topics

Data categorisations:
 Primary vs secondary
 Internal vs external
 Numerical vs categorical
 Discrete vs continuous
 Nominal vs ordinal

Big data analysis

Sampling

Primary vs secondary

Primary
 Data obtained from first hand sources; origin of data
 Surveys, observation, experimentation, interviews etc.
 Original research on the specific topic

Secondary
 Data already obtained by third parties
 Libraries, newspapers, journals, magazines, governments, banks, internet etc.
 Research conducted on the specific topic or another topic
 Considerations: accuracy, reliability, suitability etc.
Internal vs external data
Internal sources:
 Accounting system (records): invoices, timesheets, journal entries, budgets etc.
 Department-wise data: payroll, marketing, production etc.
 Strategic planning system

External sources:
 Libraries, newspapers, journals, magazines, governments, banks, internet, customers etc.
Numerical vs categorical

Numerical:
 Expressed in numbers
 It can be discrete or continuous
 Examples: Marks, attendance, height

Categorical:
 Descriptive data rather than numeric
 It can be nominal or ordinal
 Examples: Eye colour, educational qualification, feedback
Discrete vs continuous

Discrete:
 Takes only separate, distinct values
 Counted rather than measured
 Examples: Class attendance, goals scored by a team

Continuous:
 Can take any value within a range
 Measured rather than counted, using an appropriate unit
 Examples: Distance, height, weight
Nominal or ordinal

Nominal:
 Labels only; cannot be measured or ranked
 No set order or scale
 Examples: Name, eye colour

Ordinal:
 Has a set order or scale
 Examples: Feedback measured through ratings
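
The nominal and ordinal distinction decides which summaries make sense. The following is a minimal Python sketch, assuming invented survey responses and an assumed four-point rating scale (neither comes from the slides): nominal values can only be counted, while ordinal values can also be sorted and given a median.

```python
# Hypothetical survey responses, invented for illustration only.
eye_colours = ["brown", "blue", "green", "brown", "brown"]   # nominal
ratings = ["good", "poor", "excellent", "fair", "good"]      # ordinal

# Assumed rating scale, lowest to highest.
RATING_SCALE = ["poor", "fair", "good", "excellent"]


def most_common(values):
    """Return the most frequent category (the mode)."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    return max(counts, key=counts.get)


def ordinal_median(values, scale):
    """Sort by position on the scale and return the middle value."""
    ranked = sorted(values, key=scale.index)
    return ranked[len(ranked) // 2]


# Counting works for nominal data; ranking does not.
mode_colour = most_common(eye_colours)

# Ordinal data has a set order, so a median is meaningful.
median_rating = ordinal_median(ratings, RATING_SCALE)

print(mode_colour, median_rating)
```

Asking for the median eye colour would have no meaning, because eye colours have no set order.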
Impact of general economic environment:

 General economic environment affects costs and revenues of firms at national and international level.
 The forecast state of the economy will influence the planning process for organisations which operate within it.

Discussion points:

 GDP
 Economic trends
 Inflation
 Interest rates
 Exchange rates
 Tax levels
 Government policy
 Economic cycles: boom, bust or recession
Big data analysis

Data analytics:
The process of collecting and examining data in order to extract meaningful business
insights, which can be used to inform decision making and improve performance.

5 Vs of big data:
 Volume: large quantity of data
 Velocity: speed of generating data
 Variety: different types of data
 Veracity: reliability of data
 Value: usefulness of the data relative to the cost of obtaining it (cost-benefit analysis)
Variety
Structured:
 Well organized
 Example: Government statistics, students' mark list in an Excel spreadsheet

Unstructured:
 Unorganized data
 Example: Social media

Semi-structured:
 Mix of structured & unstructured; not totally unorganized
 Example: Sorted mailbox
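
To make the three kinds of variety concrete, here is a small Python sketch. The sample strings are invented for illustration, and CSV, JSON and free text are assumed stand-ins for structured, semi-structured and unstructured data respectively.

```python
import csv
import io
import json

# Invented examples of each kind of data.
structured = "student,mark\nAsha,72\nBen,65\n"                  # fixed rows and columns
semi_structured = (
    '{"from": "asha@example.com", "subject": "Marks", '
    '"body": "Please find my marks attached."}'                 # tagged fields, free-text body
)
unstructured = "Loved the new course, though the exam felt too long!"

# Structured data can be queried directly by column name.
rows = list(csv.DictReader(io.StringIO(structured)))
average_mark = sum(int(row["mark"]) for row in rows) / len(rows)

# Semi-structured data has labelled fields, but the body is still free text.
email = json.loads(semi_structured)
subject = email["subject"]

# Unstructured data has no fields; only crude measures are available
# without further processing.
word_count = len(unstructured.split())

print(average_mark, subject, word_count)
```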
Uses of big data
 Direct access to customers
 Cheaper marketing
 Quicker decision making
 Better costing and pricing
 New product design & features
Big data analysis

Pros:
 Faster
 Cheaper
 Direct access to customers

Cons:
 Can be unreliable (veracity)
 Lack of technical know-how
 Difficulty in understanding
Sample vs population

 What is sampling?
 When do we use sampling?

Key terms:
 Population
 Sample
 Census
 Sampling frame
Sample vs population
 If all members of a population are examined, the survey is called a census.
 If it is not possible to survey the entire population, a sample is selected.
 The results from the sample are used to estimate the results of the whole.
 A sampling frame is a numbered list of all items in a population.
A sampling frame should have the following characteristics:
 Completeness
 Accuracy
 Adequacy
 Currency (up to date)
 Non-duplication
 Convenience
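
Some of these characteristics can be checked mechanically. The sketch below is illustrative only: `check_frame` is a hypothetical helper and the customer IDs are invented. It tests a frame for non-duplication, completeness and accuracy against a known list of population members.

```python
from collections import Counter


def check_frame(frame_ids, population_ids):
    """Compare a sampling frame with the population it should list."""
    counts = Counter(frame_ids)
    frame_set = set(frame_ids)
    population_set = set(population_ids)
    return {
        # Non-duplication: each item should appear once only.
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        # Completeness: every population member should be listed.
        "missing": sorted(population_set - frame_set),
        # Accuracy / currency: nothing listed that is not in the population.
        "not_in_population": sorted(frame_set - population_set),
    }


# Invented customer IDs for illustration.
population_ids = ["C001", "C002", "C003", "C004", "C005"]
frame_ids = ["C001", "C002", "C002", "C004", "C009"]

report = check_frame(frame_ids, population_ids)
print(report)
```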
Probability sampling vs non-probability sampling

Probability sampling: sampling method in which there is a known chance of each member of the population being selected for the sample
 Random
 Stratified random
 Systematic
 Multistage
 Cluster

Non-probability sampling: sampling method in which the chance of each member of the population appearing in the sample is not known
 Quota sampling
Random sampling

 Samples selected such that every item in the population has an equal chance of being selected
 Sampling frame is required

Benefits and drawbacks:


+ Free from bias
+ Easy to select
‒ May be unrepresentative, especially with small samples
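
A minimal Python sketch of simple random sampling follows. It assumes a made-up population of 500 invoice values held in a numbered sampling frame; the figures, seed and sample size are illustrative. The sample mean is then used to estimate the census mean.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: invoice number -> invoice value.
sampling_frame = {n: round(rng.uniform(50, 500), 2) for n in range(1, 501)}

# random.sample draws without replacement, so every item in the frame
# has the same chance of being selected.
sample_ids = rng.sample(list(sampling_frame), k=50)
sample_values = [sampling_frame[n] for n in sample_ids]

# A census examines every item; the sample only estimates the result.
census_mean = statistics.mean(list(sampling_frame.values()))
sample_mean = statistics.mean(sample_values)

print(f"census mean: {census_mean:.2f}, sample estimate: {sample_mean:.2f}")
```

The two means will usually differ a little; that gap is the sampling error that a larger sample tends to reduce.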

Quasi-random sampling methods: stratified, systematic and multistage

Stratified random sampling
 Method of sampling in which the population is divided into strata or categories. Random samples are then taken from each stratum or category
 Sampling frame is required

Benefits and drawbacks:


+ Free from bias
+ Reflect structure of population; representative
+ Increased precision
‒ Laborious: prior knowledge of strata required
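
The stratified approach can be sketched as below. This is an illustration under stated assumptions: `stratified_sample` is a hypothetical helper, the staff list is invented, and each stratum is sampled in proportion to its size (one common choice; the slides do not specify the allocation).

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, rng):
    """Take a random sample of the same fraction from every stratum."""
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)

    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Invented staff list: 20 payroll, 30 marketing and 50 production employees.
departments = ["payroll"] * 20 + ["marketing"] * 30 + ["production"] * 50
employees = [(f"E{i:03d}", dept) for i, dept in enumerate(departments, start=1)]

rng = random.Random(7)
sample = stratified_sample(employees, lambda e: e[1], 0.1, rng)

for dept in ("payroll", "marketing", "production"):
    print(dept, sum(1 for e in sample if e[1] == dept))
```

A 10% sample gives 2, 3 and 5 employees from the three departments, mirroring the structure of the population.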
Systematic sampling

 Sampling method which selects every Nth item after a random start
 Sampling frame is required

Benefits and drawbacks:


+ Easy to use
+ Cheap
‒ May be unrepresentative if the list has a repeating pattern that matches the interval
‒ Not entirely random
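
As a rough Python illustration, systematic sampling amounts to a random start followed by a fixed step. The frame of 100 numbered items and the interval of 10 are assumptions made for the example.

```python
import random


def systematic_sample(frame, interval, rng):
    """Select every `interval`-th item after a random start."""
    start = rng.randrange(interval)  # random position within the first interval
    return frame[start::interval]


frame = list(range(1, 101))  # hypothetical numbered list of 100 items
rng = random.Random(3)
sample = systematic_sample(frame, 10, rng)

print(sample)
```

Only the start is random; once it is fixed, every other selection is determined, which is why the method is described as not entirely random.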
Multistage sampling
 Sampling method which involves dividing the population into several sub-populations and then selecting a small sample of these sub-populations at random
 Each sub-population is then divided further, and then a small sample is again selected at random. This process is repeated as many times as required

Benefits and drawbacks:


+ Fewer investigators required
+ Cheap
‒ Unrepresentative
‒ Not truly random
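
The repeated divide-and-select process might look like the following sketch. It assumes a three-level hierarchy of regions, branches and customers, all of which are invented names; `multistage_sample` is a hypothetical helper.

```python
import random


def multistage_sample(population, n_regions, n_branches, n_customers, rng):
    """Sample regions, then branches within them, then customers."""
    chosen = []
    for region in rng.sample(sorted(population), n_regions):          # stage 1
        branches = population[region]
        for branch in rng.sample(sorted(branches), n_branches):       # stage 2
            chosen.extend(rng.sample(branches[branch], n_customers))  # stage 3
    return chosen


# Invented population: 4 regions x 5 branches x 20 customers.
population = {
    f"r{r}": {
        f"r{r}-b{b}": [f"r{r}-b{b}-c{c}" for c in range(1, 21)]
        for b in range(1, 6)
    }
    for r in range(1, 5)
}

rng = random.Random(11)
sample = multistage_sample(population, 2, 2, 3, rng)

print(len(sample), sample[:3])
```

Investigators only need to visit the four selected branches rather than all twenty, which is where the cost saving comes from.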
Cluster sampling

 Sampling method that involves selecting one definable subsection of the population as the sample
 Non-random with respect to individual items: a whole subsection is taken rather than items drawn at random

Benefits and drawbacks:


+ Fewer investigators required
+ Cheap
‒ Unrepresentative
‒ Not truly random
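
A short sketch of cluster sampling, under stated assumptions: the towns and members are invented, and the cluster is picked with a random draw only to keep the example runnable. In practice the subsection may be chosen for convenience.

```python
import random


def cluster_sample(clusters, rng):
    """Pick one cluster and take every member of it as the sample."""
    chosen = rng.choice(sorted(clusters))
    return chosen, list(clusters[chosen])


# Invented clusters: customers grouped by town.
clusters = {
    "Town A": ["A1", "A2", "A3"],
    "Town B": ["B1", "B2", "B3", "B4"],
    "Town C": ["C1", "C2"],
}

rng = random.Random(5)
chosen_cluster, sample = cluster_sample(clusters, rng)

print(chosen_cluster, sample)
```

Members of the other towns have no chance of appearing once the cluster is fixed, so the sample is only as representative as the chosen town.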
Quota sampling

 Samples selected until a certain quota is covered
 Non-probability sampling method

Benefits and drawbacks:


+ Fewer investigators required
+ No sampling frame required
+ Cheap
‒ Unrepresentative; very biased
‒ Not truly random
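
Quota sampling can be sketched as taking respondents in the order they arrive until each quota is filled. The passers-by, age groups and quota sizes below are invented, and `quota_sample` is a hypothetical helper.

```python
def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until every quota is met."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled; stop interviewing
    return sample


# Invented stream of passers-by: (id, age group).
passers_by = [
    ("P01", "under 30"), ("P02", "30 and over"), ("P03", "under 30"),
    ("P04", "under 30"), ("P05", "30 and over"), ("P06", "30 and over"),
    ("P07", "under 30"),
]
quotas = {"under 30": 2, "30 and over": 2}

sample = quota_sample(passers_by, quotas, lambda p: p[1])
print([p[0] for p in sample])  # prints ['P01', 'P02', 'P03', 'P05']
```

No sampling frame or random draw is involved: whoever happens to arrive first is chosen, which is the source of the bias noted above.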