
simulation

The document discusses the process of simulating data using random numbers, highlighting methods for generating random numbers and examples of their application. It also addresses the importance of cleaning collected data to resolve issues such as outliers, missing values, and incorrect formats before analysis. Additionally, it provides examples of data collection methods and the potential problems associated with them.

Uploaded by

padma

Simulation
Instead of going out and collecting your data, you can generate it based on information you know (intriguing).

Simulate Real Data by using Random Numbers

A simulation is a prediction of what might happen in a situation, based on some previous data or known probabilities. This is useful when collecting the real data would be time-consuming, expensive or impractical.
To simulate data, use random numbers, e.g. from random number tables, dice, a calculator or computer.
EXAMPLE: A boxed toy randomly contains one of four characters, A, B, C or D. Use random numbers to simulate how many of the toys need to be bought before all four characters are collected.
1) Choose a suitable method for getting random numbers, e.g. a random number table.
2) Assign numbers to the data. Assign the numbers 1-100 to the different characters. Each character is equally likely to be picked, so use the numbers 01-25 for A, 26-50 for B, 51-75 for C and 76-100 for D.
3) Generate two-digit random numbers, using the random number table, until you have a number matching all four characters. Here you could use the last two digits from each number in the table (use 00 as 100). Start in the top left corner and read across each row.
Example of a random number table:
7387 4736 8000 6376
2821 3128 3553 6383
4) Match the random numbers to the correct character. The numbers 87, 36, 00, 76, 21, 28 and 53 represent D, B, D, D, A, B and C. So this simulation predicts that 7 toys need to be bought before all four characters are collected.
To get a more reliable estimate you could repeat the simulation lots of times (possibly using a computer) and then calculate the mean number of toys bought before all are collected.
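
If you do have a computer handy, the toy example can be sketched in Python. This is only one possible way to do it: the function names (character_for, toys_until_complete, mean_toys) are made up for illustration, and the 25-numbers-per-character split is the one chosen in the example.

```python
import random

def character_for(number):
    """Map a number from 1 to 100 to a character: 01-25 A, 26-50 B, 51-75 C, 76-100 D."""
    if not 1 <= number <= 100:
        raise ValueError("number must be between 1 and 100")
    return "ABCD"[(number - 1) // 25]

def toys_until_complete(numbers):
    """Count how many numbers are used before all four characters have appeared.

    Returns None if the numbers run out first.
    """
    seen = set()
    for count, number in enumerate(numbers, start=1):
        seen.add(character_for(number))
        if len(seen) == 4:
            return count
    return None

# The random number table from the example, read across each row.
TABLE = [7387, 4736, 8000, 6376, 2821, 3128, 3553, 6383]
# Use the last two digits of each entry, counting 00 as 100.
two_digit = [(entry % 100) or 100 for entry in TABLE]
table_result = toys_until_complete(two_digit)

def random_numbers(rng):
    """Endless stream of random whole numbers from 1 to 100."""
    while True:
        yield rng.randint(1, 100)

def mean_toys(trials, seed=None):
    """Repeat the simulation and return the mean number of toys bought."""
    rng = random.Random(seed)
    total = sum(toys_until_complete(random_numbers(rng)) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    print(table_result)       # uses the table, so matches the worked example
    print(mean_toys(10_000))  # changes slightly from run to run
```

With the table's numbers, table_result comes out as 7, the same as the worked example. The repeated version should settle somewhere near 8.3 toys, which is the theoretical mean for four equally likely characters.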

There are Different Ways to Get Random Numbers...


1) Calculators can generate random numbers. Your calculator might have a 'Ran#' function that generates a random decimal between 0 and 1 (e.g. 0.693 or 0.581). It might also have a 'RanInt' function that you can use to set limits, e.g. RanInt(1, 13) chooses a random whole number between 1 and 13. Make sure you know how your calculator works.
2) Computers are useful for generating large amounts of random numbers quickly in the interval you want, e.g. a spreadsheet can generate 100 random numbers between 1 and 100.
3) A trusty dice is good for six random numbers and you can also use a deck of cards.
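
As a rough computer equivalent of the methods above, here is a short Python sketch using the standard random module. The variable names are just for illustration, and the comparisons to calculator functions are loose analogies rather than exact matches.

```python
import random

# Use random.Random(42) (or any fixed seed) if you want repeatable results.
rng = random.Random()

# Similar to 'Ran#': a random decimal from 0 up to (but not including) 1.
decimal = rng.random()

# Similar to 'RanInt(1, 13)': a random whole number from 1 to 13 inclusive.
whole = rng.randint(1, 13)

# Similar to the spreadsheet example: 100 random whole numbers between 1 and 100.
hundred = [rng.randint(1, 100) for _ in range(100)]

# A roll of a fair six-sided dice.
roll = rng.randint(1, 6)
```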

EXAMPLE: Over one month, a shopkeeper monitors the flavour of ice cream her customers buy. She finds that half of her customers choose vanilla, a third choose chocolate and a sixth choose mint. Explain how you could simulate the choice of ice cream for the next five customers.
1) Choose a suitable method for getting random numbers. You could use a fair dice in this case because it's easy to assign the outcomes to the numbers 1 to 6.
2) Assign numbers to the data. 1/2 of 6 is 3 so use three numbers for vanilla, e.g. 1, 2 and 3. 1/3 of 6 is 2 so use two numbers for chocolate, e.g. 4 and 5. 1/6 of 6 is 1 so use one number for mint, e.g. 6.
3) Generate five random numbers by rolling the dice 5 times.
4) Match the random numbers to the flavour of ice cream. E.g. the next five customers could be: 2 = vanilla, 4 = chocolate, 1 = vanilla, 6 = mint, 5 = chocolate.
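
The same four steps can be sketched in Python, with the computer standing in for the dice. The names FLAVOUR_FOR_ROLL and simulate_customers are invented for this sketch, and the number-to-flavour assignment is the one used in the example (1-3 vanilla, 4-5 chocolate, 6 mint).

```python
import random

# Step 2: assign dice numbers to flavours in proportion to their probabilities.
FLAVOUR_FOR_ROLL = {
    1: "vanilla", 2: "vanilla", 3: "vanilla",  # half of the six numbers
    4: "chocolate", 5: "chocolate",            # a third of the six numbers
    6: "mint",                                 # a sixth of the six numbers
}

def simulate_customers(n, rng):
    """Steps 3 and 4: roll a fair dice n times and match each roll to a flavour."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return [(roll, FLAVOUR_FOR_ROLL[roll]) for roll in rolls]

# The rolls from the worked example, matched to flavours.
example_rolls = [2, 4, 1, 6, 5]
example_flavours = [FLAVOUR_FOR_ROLL[roll] for roll in example_rolls]

if __name__ == "__main__":
    print(example_flavours)
    print(simulate_customers(5, random.Random()))  # a fresh simulation each run
```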

You won't get a yellow card for diving into this page...
Make sure you know the four steps for simulating data before having a go at this question.
1) A box contains milk, white and dark chocolates. There is an equal chance of picking either a white or a dark chocolate. The probability of picking a milk chocolate is four times that of picking a white chocolate. Given that a chocolate is selected then replaced, explain how you could simulate the next three selections.

Section Two: Collecting Data

Problems with Collected Data


Collected data may end up in a bit of a mess, especially if it's been recorded by lots of different people.

You Need to Spot Problems with Recorded Data

1) Some values might be outliers: values that seem out of place with the rest of the data. These could be accurate but extreme values, or incorrectly recorded, e.g. the height of 4.72 m is certainly inaccurate.
2) There could be missing data values, e.g. this eye colour is missing.
3) It might be given in the wrong format, e.g. the data for the 4th person is in the wrong order and some eye colours have been written as one letter instead of the word.
4) There might be units or other symbols in the data (particularly in spreadsheets) and different units may have been used (e.g. m and cm).
[Table: height and eye colour recorded for several people. Surviving entries include "Blue", "Brown", "R" and "1.80 m".]
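
To see how these four kinds of problem might be spotted automatically, here is a small Python sketch. Everything in it is an assumption made for illustration: the records are made up (they are not the table from the page), the list of known eye colours is invented, and the "plausible" height range of 0.5 m to 2.5 m is a guess.

```python
import re

# Heights are expected to look like "1.80 m" (or "180 cm" if the wrong unit was used).
HEIGHT_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)\s*(m|cm)$")
KNOWN_COLOURS = {"blue", "brown", "green", "grey", "hazel"}  # assumed list

# Hypothetical records, one per person.
records = [
    {"height": "1.62 m", "eye_colour": "Blue"},
    {"height": "4.72 m", "eye_colour": "Brown"},  # implausible height
    {"height": "1.80 m", "eye_colour": ""},       # missing eye colour
    {"height": "Green", "eye_colour": "1.75 m"},  # values in the wrong order
    {"height": "171 cm", "eye_colour": "B"},      # different unit, one-letter colour
]

def find_problems(record):
    """Return a list of the problems found in one record."""
    problems = []
    height = record["height"].strip()
    colour = record["eye_colour"].strip()

    if not height:
        problems.append("missing height")
    else:
        match = HEIGHT_PATTERN.match(height)
        if match is None:
            problems.append("height in wrong format")
        else:
            value, unit = float(match.group(1)), match.group(2)
            if unit == "cm":
                problems.append("height not in metres")
                value /= 100
            if not 0.5 <= value <= 2.5:  # assumed plausible range in metres
                problems.append("possible outlier")

    if not colour:
        problems.append("missing eye colour")
    elif colour.lower() not in KNOWN_COLOURS:
        problems.append("eye colour in wrong format")
    return problems

if __name__ == "__main__":
    for record in records:
        print(record, find_problems(record))
```

A check like this only flags values for a closer look. Deciding whether a flagged value is a genuine extreme or a recording error still needs a person.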

Data is Cleaned by Fixing Problems


When there are problems with raw data, it should be 'cleaned' before it is processed.
CLEANING DATA means fixing problems by removing/correcting inaccuracies and missing data, dealing with genuine outliers, putting data in the same format and removing units/symbols.
You might decide to remove outliers or use analysis that's unaffected by extreme values, but this could lead to inaccurate conclusions. Another option is to try to record that data again, but this can be difficult and time-consuming.
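
One common example of analysis that is much less affected by extreme values is using the median instead of the mean. The Python sketch below compares the two on a made-up list of heights (the values are hypothetical, with 4.72 m as the suspicious one).

```python
from statistics import mean, median

# Hypothetical heights in metres; 4.72 is an implausible outlier.
heights = [1.62, 1.68, 1.71, 1.75, 1.80, 4.72]
without_outlier = [h for h in heights if h != 4.72]

mean_all = mean(heights)        # pulled upwards by the outlier
median_all = median(heights)    # middle of the sorted data, barely affected
mean_cleaned = mean(without_outlier)
median_cleaned = median(without_outlier)

if __name__ == "__main__":
    print(f"mean with outlier:      {mean_all:.2f}")
    print(f"median with outlier:    {median_all:.2f}")
    print(f"mean without outlier:   {mean_cleaned:.2f}")
    print(f"median without outlier: {median_cleaned:.2f}")
```

Here the mean changes a lot when the outlier is removed, while the median hardly moves.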

EXAMPLE: Kevin thinks that older students spend more money on their lunch than younger students. He carries out a census of the 1000 students at his school by asking them to fill in an online database with how much they spent on their lunch that day. The first few entries are shown in the table.

Age        Money spent on lunch
13         1.80
14         2.05
Eleven     195p
15         (entry not legible)
46         £5.25
Fourteen   168p

a) Explain three ways that Kevin could clean the data.
b) Give two problems with the method Kevin has used to collect the data.

a) 1) Put the data in the same format by showing all ages as a number and all the money spent in pounds.
2) Remove the £ and p symbols from the money spent column.
3) Remove outliers (e.g. get rid of the person aged 46 or correct it to 16, since it could be incorrectly recorded), missing values and any zeros (a zero could distort the data since the student didn't buy lunch).

Cleaning the data gives:

Age   Money spent on lunch (£)
13    1.80
14    2.05
11    1.95
14    1.68

b) 1) Some students might take a packed lunch, leading to missing data. So a census probably isn't a suitable way to collect data.
2) The amount students spent on their lunch that day might not reflect how much they spend on other days.
Other problems include: students not responding, students filling in the database incorrectly/inconsistently or not having access to the internet.
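
The cleaning steps from part a) could be sketched in Python like this. Treat it as an illustration only: the function names are invented, the age range of 11 to 18 used to catch outliers is an assumption about the school, and the two rows marked "assumed" are added here to show missing and zero values being dropped.

```python
# Raw entries as (age, money spent) text, based on the example.
RAW = [
    ("13", "1.80"),
    ("14", "2.05"),
    ("Eleven", "195p"),
    ("15", ""),          # assumed: a missing value
    ("46", "£5.25"),
    ("Fourteen", "168p"),
    ("12", "0"),         # assumed: a student who didn't buy lunch
]

AGE_WORDS = {
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
}

def parse_age(text):
    """Show every age as a number. Returns None if the age can't be read."""
    text = text.strip().lower()
    if text.isdigit():
        return int(text)
    return AGE_WORDS.get(text)

def parse_pounds(text):
    """Remove the £ and p symbols and give the amount in pounds (None if missing)."""
    text = text.strip()
    if not text:
        return None
    if text.endswith("p"):
        return int(text[:-1]) / 100
    return float(text.lstrip("£"))

def clean(entries, min_age=11, max_age=18):
    """Drop outlying ages, missing values and zeros; keep (age, pounds) pairs."""
    cleaned = []
    for age_text, money_text in entries:
        age = parse_age(age_text)
        pounds = parse_pounds(money_text)
        if age is None or not min_age <= age <= max_age:
            continue  # unreadable or outlying age, e.g. 46
        if pounds is None or pounds == 0:
            continue  # missing value, or a student who bought nothing
        cleaned.append((age, round(pounds, 2)))
    return cleaned

if __name__ == "__main__":
    for age, pounds in clean(RAW):
        print(age, f"{pounds:.2f}")
```

This sketch removes the age of 46 rather than correcting it to 16. Either choice is allowed by the answer above, but correcting a value means guessing what was meant.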

Rule of the Stats Kitchen... clean raw data before processing.
Double-check you know how to clean data and then try this question.
A group of students record how long it takes them to finish a short puzzle. The data is shown below. Give reasons why the data should be cleaned.
[Table: Gender and Time taken for each student. The pairing of entries is not recoverable. Gender entries: M, F, Female, Boy, Girl, Male. Time taken entries: 0, 103 s, 1 min 9 s, 53, 137 s, 178 s, 2 mins.]
