Week 1 Summary Notes

The document provides an overview of data types, including discrete and continuous numerical data, as well as categorical data. It discusses methods for presenting data, such as tables and various graphical forms, and summarizes key statistical measures for location, spread, and skewness. Additionally, it outlines formulas for calculating measures of location, spread, and skewness in data sets.

Uploaded by Pride Nkala
Copyright: © All Rights Reserved
Chapter 1 Summary

Data

Data (ie information or facts) can be subdivided as follows:

    numerical (ie numbers): discrete, continuous
    categorical: attribute, nominal, ordinal

Discrete data is numerical data that can only take certain values (eg 0, 1, 2, ...).

Continuous data is numerical data that can take any value within a range.

We can summarise data using tables (or frequency distributions) or diagrams.

Histograms

A histogram is similar to a bar chart but is drawn for continuous data and therefore has no gaps between the bars.

However, for a histogram the area of the bar gives the frequency of the group. The frequency density (the height of the bars) is found using:

$$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$$

where the class width is the difference between the largest and smallest values allowed in the class.

Line plots

We just plot each data value against a number line using a cross or a dot.

© IFE: 2014 Examinations, The Actuarial Education Company

Stem and leaf diagrams

A stem and leaf diagram splits each data value up into 2 parts as follows:

    1 | 7 9 9
    2 | 4 5 7 8
    3 | 0 1

    Key: 2|4 represents 24

This diagram represents the values: 17, 19, 19, 24, 25, 27, 28, 30, 31.

Cumulative frequency diagrams

Cumulative frequency is the running total of the frequencies up to and including each group.
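As a minimal sketch of the frequency density formula and the cumulative frequency described above, the Python below works through a small grouped frequency table. The class boundaries and frequencies are made up for illustration and are not taken from the notes; the function names are likewise hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: (lowest value, highest value, frequency) per class.
groups = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]


def frequency_densities(groups):
    """Height of each histogram bar: frequency divided by class width."""
    return [freq / (upper - lower) for lower, upper, freq in groups]


def cumulative_frequencies(groups):
    """Running total of the frequencies, one entry per class."""
    return list(accumulate(freq for _, _, freq in groups))


densities = frequency_densities(groups)
cumulative = cumulative_frequencies(groups)

for (lower, upper, freq), density, running in zip(groups, densities, cumulative):
    print(f"{lower}-{upper}: frequency {freq}, density {density}, cumulative {running}")
```

The last class is twice as wide as the others, so its bar is drawn lower than its frequency alone would suggest: the area of the bar, not its height, carries the frequency.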
A cumulative frequency diagram plots the largest possible value in each group against the cumulative frequency.

Boxplots

[Figure: boxplot marking the lowest value, lower quartile, median, upper quartile and highest value; each of the four sections between these points contains 25% of the data]

Comparing data sets

When comparing data sets we look at the location, spread and skewness (shape) of each distribution.

The types of skewness are:

[Figure: sketches of distribution shapes illustrating the types of skewness]

Chapter 2 Summary

The location of a data set is a single value that is representative of that data set.

Mode

The mode is the data value that appears most often. To obtain it from a frequency distribution we find the value or group with the greatest frequency.

Mean

The sample mean, $\bar{x}$, is given by:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

To calculate the mean from a frequency distribution we use:

$$\bar{x} = \frac{\sum f x}{\sum f}$$

For a grouped frequency distribution, we use the midpoint for the $x$ values.

Median

The median is the middle value of a data set arranged in order. It is the:

$$\left(\tfrac{1}{2}n + \tfrac{1}{2}\right)\text{th data value}$$

For a grouped frequency distribution the median is the $\tfrac{1}{2}n$th "value". We can then use interpolation to estimate the median within a group.

The sample mean is the first order moment.

Location and skewness

For a symmetrical distribution the mode, mean and median coincide.
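To illustrate the three location measures, and the statement that they coincide for a symmetrical distribution, here is a short Python sketch. The data set is made up for illustration; it uses the standard library `statistics` module alongside the position and frequency formulas given above.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical symmetrical data set (not from the notes).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Mean from a frequency distribution: sum of f*x divided by sum of f.
freq = Counter(data)
mean_from_freq = sum(x * f for x, f in freq.items()) / sum(freq.values())

# Median by position: the (n/2 + 1/2)th value of the ordered data.
# For odd n this position is a whole number, so no interpolation is needed.
ordered = sorted(data)
n = len(ordered)
position = n / 2 + 1 / 2
median_by_position = ordered[int(position) - 1]

print("mode:", mode(data))
print("mean:", mean(data), "from frequencies:", mean_from_freq)
print("median:", median(data), "by position:", median_by_position)
```

For an even number of values the position falls halfway between two data values, and the median would be taken as the midpoint of the pair.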
For skewed distributions their locations are:

[Figure: positions of the mode, median and mean under positive skew and negative skew]

Transforming data

Given a sample mean of $\bar{x}$, if we multiply each of the sample values by $a$ and then add $b$, the mean of these new values is:

$$a\bar{x} + b$$

Chapter 3 Summary

Spread

The spread (or dispersion) of a data set can be measured by the range, IQR or the standard deviation.

Range

The range of a set of data values $x_1, \ldots, x_n$ is defined as:

$$\text{Range} = \max\{x_i\} - \min\{x_i\}$$

IQR

The interquartile range is given by:

$$IQR = Q_3 - Q_1$$

For ungrouped data:

    lower quartile $Q_1 = \left(\tfrac{1}{4}n + \tfrac{1}{4}\right)$th value
    upper quartile $Q_3 = \left(\tfrac{3}{4}n + \tfrac{3}{4}\right)$th value

For grouped data we use $\tfrac{1}{4}n$ and $\tfrac{3}{4}n$ to give the positions of the quartiles.

If the quartile lies between two values we use interpolation to estimate its value. Similarly, we use interpolation when estimating the quartile within a group.

Standard deviation

The sample variance, $s^2$, measures the spread squared and is given by:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

The sample standard deviation, $s$, is the square root of the sample variance and measures the spread of the set of data.

To calculate the variance from a frequency distribution we use:

$$s^2 = \frac{1}{n-1}\sum f (x - \bar{x})^2 = \frac{1}{n-1}\left\{\sum f x^2 - n\bar{x}^2\right\}, \quad \text{where } n = \sum f$$

For a grouped frequency distribution, we use the midpoint for the $x$ values.

Central moments

The $k$th order central sample moment is given by:

$$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k$$

The sample variance is the second order central moment (except we divide by $n-1$).

Skewness

The third order central moment is used to measure the skewness.
[Figure: sketches of positively skewed, symmetrical and negatively skewed distributions]

A positive value indicates that the data is positively skewed. Similarly a negative value indicates that the data is negatively skewed, whereas a value of zero indicates that the data is symmetrical.

Transforming data

Given a sample standard deviation of $s$, if we multiply each of the sample values by $a$ and then add $b$, the standard deviation of these new values is:

$$|a|s$$

(which is simply $as$ when $a$ is positive). The new sample variance would be:

$$a^2 s^2$$

Chapter 1 Summary

Numerical data can be discrete or continuous. Categorical data can be dichotomous (attribute), nominal or ordinal.

Data can be presented either in tabular form (using a frequency table, a cumulative frequency table or a stem and leaf diagram) or in graphical form (using a lineplot, a dotplot, a boxplot, a bar chart or a histogram).

The location of a data set can be summarised using the mean, the median or the mode.

The spread of a data set can be summarised using the standard deviation, the range or the interquartile range. The variance measures the spread squared.

Third moments can be used to summarise the skewness (ie the degree of asymmetry) of a data set.

The Actuarial Education Company, © IFE: 2017 Examinations

Chapter 1 Formulae

Measures of location

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$M = \left(\tfrac{1}{2}n + \tfrac{1}{2}\right)\text{th value, or } \tfrac{1}{2}n \text{ from a grouped frequency table}$$

Measures of spread

$$R = \max(x_i) - \min(x_i)$$

$$IQR = Q_3 - Q_1$$

$$Q_3 = \left(\tfrac{3}{4}n + \tfrac{1}{2}\right)\text{th value}, \qquad Q_1 = \left(\tfrac{1}{4}n + \tfrac{1}{2}\right)\text{th value}$$

alternatively

$$Q_3 = \left(\tfrac{3}{4}n + \tfrac{3}{4}\right)\text{th value}, \qquad Q_1 = \left(\tfrac{1}{4}n + \tfrac{1}{4}\right)\text{th value}$$

or $\tfrac{3}{4}n$ and $\tfrac{1}{4}n$ from a grouped frequency table.

Measures of skewness

$$\text{skewness} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3$$

$$\text{coeff of skew} = \frac{\text{skewness}}{s^3}$$

where $s^2$ is a scaled sum of the squared deviations $(x_i - \bar{x})^2$ (the divisor is not legible in the source).

Sample moments

$$k\text{th moment} = \frac{1}{n}\sum_{i=1}^{n} x_i^k$$

$$k\text{th moment about } \alpha = \frac{1}{n}\sum_{i=1}^{n} (x_i - \alpha)^k$$

$$k\text{th central moment} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k$$
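The spread and skewness formulae can be checked numerically. The Python below is a sketch under stated assumptions rather than a definitive implementation: it reuses the nine values from the stem and leaf example (17 to 31), the helper names are hypothetical, and the quartile offsets are passed as parameters because the notes quote two conventions.

```python
import math

# The nine values from the stem and leaf diagram in the notes.
values = [17, 19, 19, 24, 25, 27, 28, 30, 31]


def value_at(ordered, position):
    """Value at a 1-based, possibly fractional position, by linear interpolation.

    Assumes 1 <= position <= len(ordered).
    """
    lower = math.floor(position)
    frac = position - lower
    if frac == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


def quartiles(data, offset_q1=0.25, offset_q3=0.75):
    """Q1 at the (n/4 + offset_q1)th value, Q3 at the (3n/4 + offset_q3)th value."""
    ordered = sorted(data)
    n = len(ordered)
    return (value_at(ordered, n / 4 + offset_q1),
            value_at(ordered, 3 * n / 4 + offset_q3))


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


def central_moment(data, k):
    """kth central sample moment: mean of (x - xbar)^k, dividing by n."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** k for x in data) / n


data_range = max(values) - min(values)
q1, q3 = quartiles(values)                    # offsets 1/4 and 3/4
q1_alt, q3_alt = quartiles(values, 0.5, 0.5)  # offsets 1/2 and 1/2
s = math.sqrt(sample_variance(values))
skewness = central_moment(values, 3)

# Transforming the data: y = a*x + b scales the standard deviation by |a|.
a, b = -2, 5
transformed = [a * x + b for x in values]
s_new = math.sqrt(sample_variance(transformed))

print("range:", data_range, "IQR:", q3 - q1, "alternative IQR:", q3_alt - q1_alt)
print("s:", s, "skewness:", skewness, "transformed s:", s_new)
```

A negative multiplier is used in the transformation on purpose: it shows that the standard deviation is scaled by the absolute value of `a`, while adding `b` has no effect on the spread.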
