Econ1203 Lecture 1 Notes
Week 1
Week 1 topics
Key references
Keller 9th ed. provides detail & basic reference material
Lectures provide
- Overview
- Emphasis of key points
- Some worked examples
Tutorials provide
- Review & discussion opportunities
Learning cycle
- Preview, participate, practice
This is presumed knowledge & there will be no mathematics instruction in BES
But BES is about statistics not mathematics
Assessment
Component              Percentage of total mark
Feedback quizzes         6
In-tutorial test        14
Project                 20
Final examination       60
Total                  100
In-tutorial test
Some tutorial problems have a computing part
Major project requires EXCEL
Final examination
Lecture notes will be sparse
Lecture discussion will be more expansive
There should be benefit in attending lectures
Use textbook to fill any remaining gaps
Textbook provides extensive reference material
Worked examples in lectures will appear with space for an answer
Statistics in action (SIA) will use real case studies
Provides skills for real-world decision making
Provides foundation for all 2nd year econometrics subjects
Underpins quantitative analysis across all ASB schools
Applications of interest to accounting, economics, finance, management & marketing students
"The sexy job in the next 10 years will be statisticians. And I'm not kidding." Hal Varian, Chief economist @ Google
Identify appropriate statistical methods for describing data & making inferences about population parameters
Apply appropriate statistical methods to samples of data
Use statistical reasoning to aid in problem solving
Use EXCEL to apply appropriate statistical methods
Write a basic business report documenting statistical analyses
Critically evaluate statistical work of others
Motivating examples in lectures
Material for tutorial problems
Suggest prototype project problems
Baby bonus
Cars in China
Private health insurance
Petrol prices
Migrant wealth
Sydney housing prices
Crime statistics
Increased again July 1, 2006
Analysis based on Gans & Leigh (2008)
If so by how much?
Note: change announced 7 weeks before implementation
Data?
Daily births in Australia for 1975-2004 from Australian Bureau of Statistics (ABS)
Here's (part of) the data
BUT need descriptive statistics in order to
- Summarize data
- Facilitate interpretation
- Analyse the research question
- Extract information
[Figure: number of births per day (vertical axis, 0 to 1000) by date (horizontal axis, June 10 to July 28)]
What are the key features of these data?
- Data are numbers of newborns per day (variable)
- Data are observed in time (time series data)
Statistical concepts?
Statistical concepts
A population consists of all subjects that are studied.
A parameter is a numerical measure of a population.
- Note: parameters are often unknown
A sample is a subset of the population that is being studied.
A statistic is a numerical measure that describes a characteristic of a sample.
Inferential statistics uses the information from a sample to draw conclusions about the population.
Or, inferential statistics computes/uses a statistic to infer about a parameter of the population.
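The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of heights, its size, the sample size of 500 and the seed are all made-up assumptions, and in practice the population mean would not be observable.

```python
import random
import statistics

random.seed(1203)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 100,000 people (made-up values).
population = [random.gauss(170, 10) for _ in range(100_000)]

# Parameter: a numerical measure of the population (usually unknown in practice).
population_mean = statistics.mean(population)

# Sample: a subset of the population (sample size chosen arbitrarily).
sample = random.sample(population, 500)

# Statistic: a numerical measure of the sample, used to infer about the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values should be close but not identical, which is the gap that inferential statistics tries to quantify.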
Statistical concepts
The data set consists of observations on 442 patients for 4 variables: a quantitative measure of disease progression, height, weight and gender.
What is the population?
What is the sample?
The average height of all diabetes patients is a parameter (unknown)
The average height of all patients in the data set is a statistic (known)
Types of data
A variable is a characteristic of a population or of a sample from a population
We observe values or observations of a variable
A data set contains observations on variables
Variables may be
Discrete or continuous
- Discrete example - football scores, number of newborns per day
- Continuous example - time remaining in football game, height, weight
This course is poor, good, very good
Standard & Poor's ratings AAA > AAA- > AA+ > AA
Types of data
Type of observation can also be used to classify data
Time series data refer to measurements at different points in time
Eg: SIA Baby bonus births per day
Eg: SIA Sydney housing prices by suburb, diabetes data
Data type can influence what is appropriate by way of analysis
Total number of births per day makes sense
Suppose marital status is coded as Single = 1, Married = 2, Divorced = 3, Widowed = 4;
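A short sketch of why data type matters for the marital status coding above. The eight responses are made up for illustration. The point is that arithmetic on the codes runs without error but has no interpretation, whereas counting observations per category does.

```python
import statistics
from collections import Counter

# Coding from the slide: Single = 1, Married = 2, Divorced = 3, Widowed = 4.
labels = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}

# Made-up responses for eight people (hypothetical data).
marital_status = [1, 2, 2, 4, 1, 2, 3, 2]

# The arithmetic runs, but the result has no interpretation:
# the codes are just names for categories.
meaningless_mean = statistics.mean(marital_status)

# Counting observations per category is appropriate for this type of data.
counts = Counter(labels[code] for code in marital_status)
most_common_status = counts.most_common(1)[0][0]

print("mean of codes (not meaningful):", meaningless_mean)
print("counts per category:", dict(counts))
print("most frequent category:", most_common_status)
```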
Descriptive statistics
Need to organize & summarize data in order to extract information
This is the role of descriptive statistics
Descriptive statistics is useful for inferential statistics
Note: descriptive statistics is about organizing and summarizing data while inferential statistics is for inferring about the population parameters
Some are graphical, some numerical
Type of data may impact on which to use
Want to summarize categorical data with associated counts
Lists all possible values for a variable along with the number of observations for each value
UNSW interested in transport issues
How do people travel to campus? (http://www.facilities.unsw.edu.au/getting-uni)
Categories need to be mutually exclusive & exhaustive

Mode of transport to campus    Frequency    Relative Frequency
Resident
Walk
Cycle
Car
Bus
Other
Total                                       100
Provide graphical representation of frequency distributions
2011 UNSW Travel Survey sample of 5881
[Figure: Bar chart of mode of transport to UNSW campus; vertical axis frequency, 0 to 3000]
Resident 47 (0.8%), Walk 628 (10.7%), Bike 210 (3.6%), Car 1032 (17.5%), Bus 1188 (20.2%), Other 2776 (47.2%)
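The relative frequencies on the bar chart can be reproduced from the counts. A minimal sketch using the six counts reported for the 2011 UNSW Travel Survey; the variable names are illustrative only.

```python
# Counts as reported on the bar chart (2011 UNSW Travel Survey).
frequency = {
    "Resident": 47,
    "Walk": 628,
    "Bike": 210,
    "Car": 1032,
    "Bus": 1188,
    "Other": 2776,
}

# Categories are mutually exclusive & exhaustive, so counts sum to the sample size.
total = sum(frequency.values())

# Relative frequency (%) = 100 * count / total.
relative_frequency = {mode: 100 * count / total for mode, count in frequency.items()}

for mode, count in frequency.items():
    print(f"{mode:<10}{count:>6}{relative_frequency[mode]:>8.1f}%")
print(f"{'Total':<10}{total:>6}{sum(relative_frequency.values()):>8.1f}%")
```

The counts sum to the stated sample of 5881 and the relative frequencies sum to 100%.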
Pie charts show relative frequencies
What's in the Other category?
Can we use a bar chart to summarize quantitative data?
[Figure: Pie chart of mode of transport to UNSW campus: Resident 1%, Walk 11%, Bike 4%, Car 17%, Bus 20%, Other 47%]
[Figure: Modified pie chart of mode of transport to UNSW campus: Resident 1%, Walk 11%, Bike 4%, Car 17%, Bus 20%, Bus & train 45%, Other 2%]
Groups the data into classes (intervals) with the corresponding frequencies
Each interval has the same width determined by
width = (largest value - smallest value)/(number of classes)
The intervals need to be mutually exclusive & exhaustive
Too many classes doesn't summarize
Too few classes gives no information
No set rules although more observations means more classes
Usually at least 5 and no more than 15
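The class-width rule can be sketched as follows. The 20 marks and the choice of 6 classes are made-up assumptions for illustration. Each class is taken as [lower, upper), with the largest value placed in the last class so that the intervals are mutually exclusive & exhaustive.

```python
# Made-up marks for 20 students (hypothetical data).
marks = [35, 42, 48, 51, 55, 58, 60, 62, 64, 65,
         67, 68, 70, 72, 75, 78, 81, 85, 90, 95]
number_of_classes = 6  # assumption: within the usual 5 to 15 range

smallest, largest = min(marks), max(marks)

# width = (largest value - smallest value) / (number of classes)
width = (largest - smallest) / number_of_classes

# Class k covers [smallest + k*width, smallest + (k+1)*width);
# the largest value is assigned to the last class.
counts = [0] * number_of_classes
for mark in marks:
    k = min(int((mark - smallest) // width), number_of_classes - 1)
    counts[k] += 1

for k, count in enumerate(counts):
    lower = smallest + k * width
    upper = lower + width
    print(f"{lower:5.1f} to {upper:5.1f}: {count}")
```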
Histograms
A graph of a frequency distribution (for numerical data) is called a histogram
The horizontal axis shows the class limits or class midpoints
The vertical axis shows frequencies
[Figure: "Aussie" Marks histogram; horizontal axis Marks (class upper limits 49, 64, 74, 84, 100), vertical axis Frequency (0 to 35)]
Histograms
incorrect
Should be no gaps between bars for quantitative data
Classes defined by upper limits when class midpoints may be more natural
Bar areas should be proportional to frequencies (refer Ch 8, p 265)
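When classes have unequal widths, making bar areas proportional to frequencies means plotting frequency divided by class width as the bar height. A minimal sketch: the class edges are an assumption loosely based on the upper limits 49, 64, 74, 84 and 100 on the marks histogram axis, and the frequencies are made up.

```python
import math

# Assumed class edges: [0, 50), [50, 65), [65, 75), [75, 85), [85, 100].
edges = [0, 50, 65, 75, 85, 100]

# Made-up frequencies for the five classes (hypothetical data).
frequencies = [10, 30, 25, 20, 15]

# Bar height = frequency / class width, so that area = height * width = frequency.
heights = []
for lower, upper, freq in zip(edges, edges[1:], frequencies):
    heights.append(freq / (upper - lower))

# Check: each bar's area recovers the class frequency.
areas = [h * (upper - lower) for h, lower, upper in zip(heights, edges, edges[1:])]

for lower, upper, freq, h in zip(edges, edges[1:], frequencies, heights):
    print(f"{lower:>3} to {upper:>3}: frequency {freq:>2}, bar height {h:.2f}")
```

Plotting the raw frequencies as heights instead would overstate the wide first class relative to the narrower classes.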
Cumulative frequency or (cumulative) relative frequency distributions
Aussie marks eg - How many students got a credit or better?
Associated cumulative histograms & ogives
Stem-and-leaf displays
May be interesting information lost in histograms
Do examiners avoid marks close to borderline?
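A cumulative frequency distribution answers questions such as "how many students got a credit or better?". The sketch below uses made-up frequencies and assumes that a credit starts at 65, so the first two classes fall below a credit. Both are illustrative assumptions.

```python
from itertools import accumulate

# Assumed mark classes and made-up frequencies (hypothetical data).
classes = ["0-49", "50-64", "65-74", "75-84", "85-100"]
frequencies = [10, 30, 25, 20, 15]

total = sum(frequencies)

# Cumulative frequency: number of observations in this class or any lower class.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: cumulative frequency as a share of the total.
cumulative_relative = [c / total for c in cumulative]

# Credit or better = everyone not in the first two classes (assumes credit starts at 65).
credit_or_better = total - cumulative[1]

for label, c, cr in zip(classes, cumulative, cumulative_relative):
    print(f"{label:>7}: cumulative {c:>3}, cumulative relative {cr:.2f}")
print("credit or better:", credit_or_better)
```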
Symmetry
Left half of histogram is a mirror image of right half
Famous bell-shape is symmetric
Asymmetric histogram
Long tail to the right (positively skewed)
Long tail to the left (negatively skewed)
May be associated with outliers
Modal class is class with highest frequency
Histograms may be unimodal or multimodal, or no mode
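These shape descriptions can be sketched in code for a list of class frequencies. The function name and the tail rule are assumptions for illustration only: the tail direction is judged crudely by counting how many classes lie on each side of the modal class, which is a rough heuristic and not a formal measure of skewness.

```python
def describe_histogram(frequencies):
    """Return (indices of modal classes, rough shape label) for class frequencies."""
    # Modal class: class with the highest frequency (ties give several classes).
    highest = max(frequencies)
    modal_classes = [i for i, f in enumerate(frequencies) if f == highest]

    # Symmetric: left half is a mirror image of right half.
    if frequencies == frequencies[::-1]:
        return modal_classes, "symmetric"

    # Crude tail check (assumption): compare the number of classes
    # to the left and to the right of the modal class(es).
    classes_left = modal_classes[0]
    classes_right = len(frequencies) - 1 - modal_classes[-1]
    if classes_right > classes_left:
        shape = "positively skewed"   # long tail to the right
    elif classes_left > classes_right:
        shape = "negatively skewed"   # long tail to the left
    else:
        shape = "asymmetric"
    return modal_classes, shape


# Made-up frequency lists (hypothetical data).
print(describe_histogram([2, 8, 15, 8, 2]))
print(describe_histogram([5, 20, 12, 6, 3, 1]))
print(describe_histogram([1, 3, 6, 12, 20, 5]))
```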
Skewness
From Keller Ex 3.2
Histogram of returns on Investment A is
Modal class is
No. of modal classes?
Therefore histogram is
Bivariate relations
For relationship between qualitative variables
Scatter plots
For relationship between quantitative variables, e.g. height and weight
If one of the variables is time we get a time series plot (line chart)
Want to understand if there is a relation between commuter type and mode of transport
What does the bar graph highlight?
Mode           Staff    Students    Total
Resident
Walk
Cycle
Car
Bus
Bus & train
Other
Total

[Figure: Mode of transport by commuter type; bar chart of frequency (0 to 3000) by mode (Resident, Walk, Bike, Car, Bus, Bus & train, Other), with separate bars for Staff and Students]
Scatter plots
How does this graph rate in terms of characteristics for graphical excellence?
With time series data order matters
Business cycles defined by extended periods of growth or contraction
Relatively sophisticated time plot
[Figure: panels for SYDNEY METRO, BRISBANE METRO and MELBOURNE METRO]
Do these data inform motorists about the best time to buy petrol?
Pattern representative of daily price movements in Sydney in winter 2006
Consider daily data June 11 2006 (Sunday) - July 10 2006
Notable price variation from day to day
Determine day of weekly peak and trough
Common day for prices to peak was Thursday
No peaks (or troughs) on Monday, Tuesday, Friday, Saturday or Sunday
Common day for prices to trough was Tuesday
No troughs on Wednesday, Thursday, Friday or Saturday
No days when prices both peaked & troughed
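Determining the day of the weekly peak and trough can be sketched as below. The prices are made up (cents per litre) and are not the lecture's data. Only four full Sunday-to-Saturday weeks starting June 11 2006 are used, which is a simplifying assumption.

```python
from collections import Counter
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

start = date(2006, 6, 11)  # a Sunday, as stated in the notes

# Made-up weekly price patterns, Sunday through Saturday (hypothetical data).
pattern_a = [118.0, 116.5, 115.0, 121.0, 125.0, 123.0, 120.5]
pattern_b = [118.5, 116.0, 114.5, 126.0, 124.0, 122.0, 120.0]
weeks = [pattern_a, pattern_a, pattern_b, pattern_a]

# Map each calendar day to its price.
prices = {}
for w, pattern in enumerate(weeks):
    for d, price in enumerate(pattern):
        prices[start + timedelta(days=7 * w + d)] = price

# For each week, record the weekday on which the price peaked and troughed.
peak_days, trough_days = Counter(), Counter()
for w in range(len(weeks)):
    days = [start + timedelta(days=7 * w + d) for d in range(7)]
    peak = max(days, key=prices.get)
    trough = min(days, key=prices.get)
    peak_days[DAY_NAMES[peak.weekday()]] += 1
    trough_days[DAY_NAMES[trough.weekday()]] += 1

print("peaks by day:  ", dict(peak_days))
print("troughs by day:", dict(trough_days))
```

The resulting counts per weekday are categorical data with associated counts, so they form a frequency distribution.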
Would bar or pie charts be useful in displaying the weekly peak & trough data?