Stats_Lecture_-_1

Uploaded by

zeyneperolmez
Probability and Statistics
Lecture 1

Dr. Sumeyye BAKIM
2024

Outline

• Introduction
• Two Branches of Statistical Methods
• Some Basic Concepts
• Frequency Tables

Statistics is a branch of mathematics that focuses on the organization, analysis, and
interpretation of a group of numbers.

Think of statistics as a tool that has evolved from a basic thinking process used by
everyone: you observe something, wonder what it means and what caused it, and form
an intuitive guess. You then observe again, this time in more detail, or you make small
changes in the process to test your intuition. Then you face the big question:
'Was the intuition confirmed?'
What are the chances that what you observed the second time will happen
repeatedly?
Answering this lets you present your intuition to the world as a probable truth.
Statistics is a way of truth-seeking.

Statistics helps you understand when, where, and for whom your predictions hold true.
This kind of truth-seeking is essential for predicting future events, which is at the core
of engineering, science, and human evolution.

Scientists rely on statistical methods to understand the data they collect.

Two Branches of Statistical Methods

Descriptive Statistics
Used to summarize and describe a group of numbers from a research study.

Inferential Statistics
Used to make inferences that go beyond the numbers collected.

To sum up:

For example: Inferential statistics allow for conclusions about a large group of people
to be drawn from a research study that involves fewer individuals.
Some Basic Concepts
Researchers gave students in an
introductory statistics class a survey
during the first week. One of the
questions asked: "On a scale of 0 to 10,
how stressed were you over the last 2½
weeks, where 0 means not at all
stressed and 10 means as stressed as
possible?" (How would you answer? :))

In this example, the level of stress is a variable that can take values from 0 to
10, and the value of any person's answer is their score. If you answered 6,
your score is 6; your score has a value of 6 on the variable called "level of
stress."

[Figure: the variable "stress level" on a scale of 0 1 2 3 4 5 6 7 8 9 10; each number on the scale is a value, and an answer of 6 is a score.]
If it varies from person to person, it's a variable!

Variable: A characteristic that can take on different values.
Examples: temperature, stress level in a material, type of engineering project,
reaction time of a sensor, number of defects in a batch of products.

Value: A possible number or category that a score can have.
It could be a number: 25, -10, 101.5.
It could be a category: Mechanical, Civil.
It could be an engineering outcome: Failure, Success, Needs Improvement.

Score: A particular person's value on a variable.
It is the result of the individual being studied on the variable.
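
As a rough illustration of how the three terms relate, the sketch below stores one variable, its possible values, and a few scores. The student labels and their scores are hypothetical and only there for illustration.

```python
# Variable: a characteristic that can take on different values.
variable = "stress level"

# Values: the possible numbers a score can have (0 through 10).
possible_values = list(range(0, 11))

# Scores: each particular person's value on the variable.
# These labels and numbers are hypothetical, not data from the lecture.
scores = {"student_a": 6, "student_b": 9, "student_c": 3}

for person, score in scores.items():
    # Every score must be one of the variable's possible values.
    assert score in possible_values
    print(f"{person} has a score of {score} on the variable '{variable}'")
```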
Engineering research involves variables, values, and
scores. The definitions may seem a bit abstract, but their
practical meaning is generally clear.

Measurement Levels - Types of Variables
The variables used by engineers are often similar to those used in stress measurement examples.
The values represent how much of something is being measured. In the case of stress
measurement, the higher the number, the greater the stress level. This is an example
of quantitative variables.

Quantitative Variable: A variable whose values are numbers.

There are several types of quantitative variables. The most important distinction in
engineering research is between equal-interval variables and rank-order
variables. An equal-interval variable is a variable where the differences between values
represent approximately equal amounts of what is being measured. For example, grade
point average (GPA) is roughly an equal-interval variable because the difference between a
GPA of 2.5 and 2.8 is about the same as the difference between a GPA of 3.0 and 3.3.
Engineers often treat measurements like stress levels from 0 to 10 as roughly equal-
interval variables. For example, the difference between stress levels 4 and 6 is about the
same as the difference between 7 and 9.
Equal-Interval Variable: A variable in which equal differences between values
represent approximately equal amounts of what is being measured.
Another type of quantitative variable is the rank-order variable, a numerical variable where
values are based on ranks, such as positions in a race or hierarchical levels. It is also called
an ordinal variable.

Ordinal Variable: A variable based on ranking (1st, 2nd, 3rd…)

A rank-order variable provides less information than an equal-interval variable. Nonetheless,
researchers often use rank-order variables.

When people are asked to rate something, ranking is sometimes easier and less
subjective. For example, when asked to rate how much you like each of your
friends, it may be easier to rank them based on how much you like them rather
than assign a specific rating to each one. Another reason researchers often use
rank-order variables is that asking people to rank forces them to differentiate. For
example, if you were asked to rate how much you like each of your friends on a
scale of 1 to 10, you might give many of them the same score. But if asked to rank
them, you would have to place each friend at a different position.
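
The contrast between rating and ranking can be sketched in a few lines. The friend names, ratings, and rank order below are all made up for illustration; the ranking is written out by hand because a ranking is a separate judgment and cannot be derived from tied ratings.

```python
# Hypothetical ratings (scale of 1 to 10) of five friends.
ratings = {"Ada": 8, "Ben": 8, "Cem": 9, "Dee": 8, "Eli": 6}

# Rating allows ties: several friends can receive the same score.
distinct_ratings = set(ratings.values())
print(len(distinct_ratings), "distinct ratings for", len(ratings), "friends")

# Ranking forces differentiation: positions 1st, 2nd, 3rd, ... are each
# used exactly once. This order is a hypothetical, hand-written judgment.
ranking = ["Cem", "Ada", "Dee", "Ben", "Eli"]
ranks = {name: position for position, name in enumerate(ranking, start=1)}
print(ranks)
```

Here three friends share the rating 8, but each of them still gets a different rank.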

In engineering research, another main type of variable, which is not numerical, is a nominal
variable, where values are names or categories.

Nominal Variable: A variable whose values are categories.

For example, the values for the nominal variable "project type" could be "Mechanical"
or "Civil." A person's score on the "project type" variable would be one of these two
categories. Another example could be different engineering fields like "Electrical,"
"Chemical," or "Structural."

These different types of variables represent different levels of measurement.


Researchers sometimes must decide how to measure a particular variable. For
example, they may use an equal-interval scale, a rank-order scale, or a nominal scale
depending on the type of data.

Continuous? Or Discrete?
Another distinction that researchers often make is between continuous variables and
discrete variables.

A discrete variable is one that has specific values and cannot have values between
these specific points. For example, the number of times a server crashed in the last month
is a discrete variable. It could be 0, 1, 2, or more crashes, but you cannot have 1.72 or 2.34
crashes. Categorical variables, such as "operating system" or "programming language," are
also considered discrete variables.

In contrast, a continuous variable can have an infinite number of values between
any two points. For instance, when asked, "How long did your code run?" you can answer
with 19 seconds or 20 seconds, but you can also say it ran for 19.26 seconds. Continuous
variables include things like processing time, data size, and latency.
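
A minimal sketch of the distinction, assuming a hypothetical crash log and an arbitrary piece of work to time: the crash count can only be a whole number, while the measured run time can fall anywhere between two points, limited only by the clock's precision.

```python
import time

# Discrete: a count of events can only be 0, 1, 2, ...
# The crash log below is hypothetical.
crash_log = ["crash", "ok", "ok", "crash", "ok"]
crash_count = crash_log.count("crash")  # always a whole number

# Continuous: elapsed time is not restricted to whole seconds.
start = time.perf_counter()
sum(range(100_000))  # arbitrary work to time
elapsed_seconds = time.perf_counter() - start

print(f"crashes: {crash_count}")
print(f"run time: {elapsed_seconds:.6f} s")
```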

Frequency Tables
A Frequency Table is the listing of the number of individuals that
possess each value for a given variable.

Out of 151 students, the responses of 30 students to the "stress level" question
are as follows:

8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0, 9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8.

The number of students who used each rating is the frequency for that value.

How many people used each rating?

Stress Level Frequency Table

The table on the right is called a frequency table. It shows how frequently (how many
times) each score was used. A frequency table allows you to easily see the pattern of the
numbers. In this example, you can see that the most common stress ratings were 7 and 8.

How to Create a Frequency Table?
1. List all possible values from lowest to highest.
Note: If a value is not used, it must still be included in the table!
2. Go through the scores one by one, making a mark next to the corresponding value in the list.
3. Create a table showing how many times each value was used.
That is, count the marks next to each value.
4. Calculate the percentage for each value.
To do this, take the frequency of the value, divide it by the total number of scores, and
multiply by 100. (You may need to round the percentage.)

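
The four steps above can be sketched in Python using the 30 stress-level responses listed earlier. This is one possible implementation, not the lecture's own: the function name `frequency_table` and the choice to round to one decimal place are assumptions.

```python
from collections import Counter

# The 30 stress-level responses from the lecture.
scores = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

def frequency_table(scores, lowest, highest):
    """Return (value, frequency, percent) rows for every possible value."""
    counts = Counter(scores)              # steps 2-3: tally the scores and count
    total = len(scores)
    rows = []
    for value in range(lowest, highest + 1):   # step 1: list all possible values
        frequency = counts.get(value, 0)       # unused values are kept, with 0
        percent = round(100 * frequency / total, 1)  # step 4: percentage
        rows.append((value, frequency, percent))
    return rows

table = frequency_table(scores, 0, 10)
print(f"{'Stress':>6} {'Freq':>5} {'%':>6}")
for value, frequency, percent in table:
    print(f"{value:>6} {frequency:>5} {percent:>6}")
```

With these data the most frequent value is 7 (7 of 30 scores, about 23.3%), followed by 8 (5 scores, about 16.7%).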
Frequency Table for Nominal Variables
You can also use a frequency table to show the number of scores, or in this case,
the number of individuals, for each category of a nominal variable. For example,
researchers including Aron asked 208 students to choose the closest person in their life. As
shown in the table on the right, 33 students chose a family member, 76 students
chose a friend, 92 students chose a romantic partner, and 7 students chose
someone else. The entries listed on the left side of the frequency
table are the values (categories) of the variable.
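
As a small sketch, the percentages for this nominal variable can be computed from the frequencies stated above. The category labels (for instance "Other" for "someone else") and the rounding to one decimal place are assumptions made for illustration.

```python
# Frequencies as stated in the text; the labels are shortened for display.
closest_person = {
    "Family member": 33,
    "Friend": 76,
    "Romantic partner": 92,
    "Other": 7,
}

total = sum(closest_person.values())  # should equal the 208 students surveyed

# Percentage = frequency / total number of scores * 100.
percentages = {
    category: round(100 * frequency / total, 1)
    for category, frequency in closest_person.items()
}

for category, frequency in closest_person.items():
    print(f"{category:<17} {frequency:>4} {percentages[category]:>6}%")
```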

Example
Let's say a group of 94 engineering students was asked to track their weekly
interactions with technical problems they encountered while working on
projects. Each time a student spent 10 minutes or more working on a technical
issue, they were required to log the incident. The log card contained questions
about the type of problem and the resources used to solve it. Excluding family
and personal life issues, the number of technical interactions students had per
week that lasted 10 minutes or longer is as follows:
48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35,
32, 26, 13, 14, 14, 47, 47, 18, 11, 5, 19, 24, 17, 6,
25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2, 6, 10, 29, 10,
29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8,
11, 3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8,
7, 5, 3, 22, 14, 25, 4, 11, 10, 18, 1, 28, 27, 19, 24,
35, 9, 30, 8, 26.

Now, let’s follow the steps to create a frequency table ☺


Frequency table for interactions

Grouped Frequency Table
Sometimes, as in the previous example, there are so many values that displaying them in a
simple frequency table becomes impractical. The solution is to create groups of values within
a certain range. These combined categories are called intervals. Frequency tables that use
intervals are called grouped frequency tables.
For example, instead of having separate frequency counts for students who rated their stress
as 8 or 9, you can combine them into one category.

The interval 8-9 would have a total frequency of 8: the score 8 was used 5 times and the
score 9 was used 3 times, giving 8 in total.

Interval size: 2
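
A minimal sketch of grouping the 30 stress-level responses with an interval size of 2. The function name `grouped_frequencies` is an assumption, and the approach relies on the scores being non-negative whole numbers.

```python
from collections import Counter

# The 30 stress-level responses from the lecture.
scores = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

def grouped_frequencies(scores, interval_size):
    """Map each interval (start, end) to the number of scores it contains."""
    # Integer division assigns each score to the interval that starts at the
    # largest multiple of interval_size not exceeding the score.
    counts = Counter((s // interval_size) * interval_size for s in scores)
    first, last = min(counts), max(counts)
    # Walk over every interval start so that empty intervals appear with 0.
    return {(start, start + interval_size - 1): counts[start]
            for start in range(first, last + 1, interval_size)}

grouped = grouped_frequencies(scores, interval_size=2)
for (start, end), frequency in grouped.items():
    print(f"{start:>2}-{end:<2} {frequency:>3}")
```

The interval 8-9 comes out with a frequency of 8, in line with the 5 + 3 counts described above.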
Note!
When setting up the table, ensure that the starting point of each interval is a
multiple of the interval size, and that the upper limit of each interval is just
below the starting point of the next interval.

In the table below, 10 intervals with an interval size of 5 have been used. Each interval
starts at a multiple of 5. (In the previous table, the interval size was 2 and each interval
started at a multiple of 2.)
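
The rule in the note can be sketched for the 94 interaction counts from the earlier example. The helper name `make_intervals` is an assumption; it builds intervals whose starting points are multiples of the interval size and whose upper limits sit just below the next starting point.

```python
# The 94 weekly interaction counts from the example.
interactions = [
    48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35,
    32, 26, 13, 14, 14, 47, 47, 18, 11, 5, 19, 24, 17, 6,
    25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2, 6, 10, 29, 10,
    29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8,
    11, 3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8,
    7, 5, 3, 22, 14, 25, 4, 11, 10, 18, 1, 28, 27, 19, 24,
    35, 9, 30, 8, 26,
]
interval_size = 5

def make_intervals(lowest, highest, size):
    """Build (start, end) intervals whose starts are multiples of size."""
    first_start = (lowest // size) * size
    # Each upper limit (start + size - 1) is just below the next start.
    return [(start, start + size - 1)
            for start in range(first_start, highest + 1, size)]

intervals = make_intervals(min(interactions), max(interactions), interval_size)

# Count how many scores fall inside each interval (limits inclusive).
table = [(start, end, sum(start <= x <= end for x in interactions))
         for start, end in intervals]

for start, end, frequency in table:
    print(f"{start:>2}-{end:<2} {frequency:>3}")
```

With these data the sketch yields 10 intervals, from 0-4 up to 45-49.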
