
Statistics LO1 LO4

The document discusses statistics and financial decisions. It defines data as facts or numbers that are collected to be examined and used for decision making. Information is defined as knowledge obtained from study or research. Descriptive statistics describe data through measures like mean, median and mode, while inferential statistics make generalizations from a sample to a population. Primary data is originally collected while secondary data is obtained from other sources. Exploratory analysis summarizes key features in data, descriptive analysis identifies trends, and confirmatory analysis validates hypotheses. Inferential statistics allow for generalizations but results may not be fully accurate, while descriptive statistics simply describe known data.

Uploaded by

Omar El-Tal
Copyright
© All Rights Reserved

Statistics and Financial Decisions

Topic 1: Define the concepts of data and information, and then clarify how each is used.

Statistics is the science dealing with the collection, analysis, interpretation, and presentation of data.

What does data mean?

Data are facts or numbers collected to be examined, considered, and used to support
decision-making. Data can also be information held in electronic form. In addition, data are the
raw material from which statistics are created.

What does information mean?

Information is knowledge obtained from study, investigation, communication, research, or instructions.

Uses of data:

• Describing: using data to delineate what has happened.
• Diagnosing: using data to demonstrate why something happened.
• Predicting: using data to set out what will happen.
• Prescribing: using data to define what will be done.

Uses of information:

• Decision-making.
• Problem solving.
• Developing skills.
• Developing knowledge.

Topic 2: Explain the different sources of data, and then evaluate them by stating their benefits and
limitations.

A. Primary data: this kind of data is original and is collected for the first time. For example:
a population census conducted by the government.

Advantages: Researchers collect the data for the specific purposes of their study. / The data is more
up to date, since previous studies may not answer the questions the researcher needs to address.

Disadvantages: A large enough sample is needed for the results to be authoritative and
generalizable.

B. Secondary data: this kind of data is obtained from existing sources rather than collected first-hand;
it is gathered from earlier studies, surveys, etc.

Advantages: Secondary data tends to be easily available and cheap to obtain. / Secondary data is often
collected over a long period, which allows researchers to uncover changes over time.

Disadvantages: The data may be outdated. / The precision of secondary data is unknown.


Topic 3: Define the different methods of analysis, for example: descriptive, exploratory and confirmatory.
Support your answer with an example of data for each analysis method then state your opinion
regarding the differences in application between them. You need to highlight the usefulness of each.

Analysis methods:

Exploratory data analysis: Exploratory data analysis is an approach used to summarize the key
features of the data, derive relevant variables, and evaluate the underlying hypotheses.

Example: a surge in the number of users canceling their product subscription. You want to find out why
this is happening so that you can tackle the underlying cause and reverse the trend.

Descriptive data analysis: Descriptive data analysis is a mathematical method used to identify trends or
principles by summarizing historical data.

Example: The idea of a GPA is that it takes data points from a wide range of exams, classes and
grades, and averages them together to provide a general understanding of a student's overall
academic performance. A student's personal GPA reflects their mean academic performance.

Confirmatory data analysis: Confirmatory data analysis is used to validate preconceived hypotheses; the
analysis is directed at addressing one or more research questions. It evaluates the evidence with
traditional statistical tools such as inference, significance, and confidence.

Example: Blood units that initially test positive must undergo confirmatory tests to confirm the presence
of a specific virus or disease.


Task 2: LO2

You have been asked by your supervisor to show your ability to analyse and evaluate qualitative and
quantitative raw business data using a number of statistical methods such as central tendency measures
and variation measures. In this task you have to:

• Part 1: Provide a Word document that includes an evaluation (strengths and weaknesses) of the
differences in application between descriptive statistics and inferential statistics.

Descriptive statistics: Descriptive statistics is a discipline that quantitatively describes the
important characteristics of a dataset. To describe these properties, it uses measures of
central tendency (mean, median, and mode) and measures of dispersion (range, standard deviation,
quartile deviation, variance, etc.).
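
As an illustration of these measures, the following minimal Python sketch computes them for the 15 performance scores listed in Survey 2 of this document. The variable names are illustrative, and two choices are assumptions rather than anything the report specifies: the population formulas are used for variance and standard deviation, and the quartiles come from the default method of Python's statistics.quantiles.

```python
import statistics

# Performance scores (out of 100) from Survey 2 in this document.
scores = [23, 35, 53, 60, 63, 65, 70, 70, 78, 80, 80, 85, 90, 95, 98]

# Measures of central tendency.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
modes = statistics.multimode(scores)  # every value tied for most frequent

# Measures of dispersion (assumption: the 15 scores are the whole population).
score_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

# Quartile deviation = half the interquartile range
# (assumption: default "exclusive" quartile method).
q1, _, q3 = statistics.quantiles(scores, n=4)
quartile_deviation = (q3 - q1) / 2

print(f"mean={mean_score:.2f} median={median_score} modes={modes}")
print(f"range={score_range} variance={variance:.2f} std dev={std_dev:.2f}")
print(f"quartile deviation={quartile_deviation}")
```

The data set has two modes (70 and 80), which is why multimode is used instead of a single-valued mode.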

Inferential statistics: Inferential statistics is about generalising from the sample to the population, i.e.
the results of the analysis of the sample can be extended to the larger population from which the sample
is taken. It is a convenient way to draw conclusions about the population when it is not possible to query
each and every member of it. The sample chosen should be representative of the entire population;
therefore, it should contain the important features of the population.

The differences between descriptive statistics and inferential statistics.

Descriptive statistics is a discipline concerned with describing the population under study.
Inferential statistics is a type of statistics that focuses on drawing conclusions about the population
on the basis of sample analysis and observation.

Descriptive statistics collects, organises, analyses and presents data in a meaningful way. In
contrast, inferential statistics compares data, tests hypotheses and makes predictions of future
outcomes.

In descriptive statistics the final result is presented in diagrammatic or tabular form, whereas in
inferential statistics the final result is expressed as a probability.

Descriptive statistics describes a situation, while inferential statistics explains the likelihood of the
occurrence of an event.

Descriptive statistics explains data that is already known in order to summarise the sample. Conversely,
inferential statistics attempts to reach conclusions about the population that extend
beyond the data available.

The strengths and weaknesses of descriptive statistics:

The strengths:

It makes it easier for data users to analyze, understand and study the variables, so that defects in the
data can be identified and the data improved in the future.

It summarizes large volumes of data clearly, with no uncertainty about the values described.

The weaknesses:

They are limited: they only allow you to make summaries about the people or things that you have
actually measured, not about anything else.

No generalizations beyond the data can be made.

The results are not 100% accurate.

The strengths and weaknesses of inferential statistics:

The strengths:

It allows the researcher to generalize from the data set to the wider population in most cases.

The weaknesses:

Inferential statistics provides values for a population that has not been
fully measured, and therefore you cannot be completely sure that the values / statistics you compute are
correct.

Scenario 2:

4- 60% of the employees have salaries above the average and 40% have salaries below the
average, so I should change this to 60% below the average and
40% above the average by reducing the salaries of at least two employees. Example:

The salary of employee "A" changes from 770 to 740

The salary of employee "I" changes from 850 to 740

This results in a difference of 140, and that difference is added to employee F's salary, changing the value
from 350 to 490.
Then we add up the salaries to show that their total, which is 7500, has not changed, and therefore that
the average salary has not changed.
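
The arithmetic of this adjustment can be checked with a short Python sketch. Only the three salaries named above are included, because the full salary list is not reproduced in this section; the dictionary names are illustrative assumptions.

```python
# Salaries of the three employees whose pay changes in the example.
# The remaining employees are unchanged, so they are left out of this check.
before = {"A": 770, "I": 850, "F": 350}
after = {"A": 740, "I": 740, "F": 490}

# Amount taken from employees A and I.
reduction = (before["A"] - after["A"]) + (before["I"] - after["I"])

# Amount given to employee F.
increase = after["F"] - before["F"]

# Net change in the salary bill; zero means the total (and so the mean) is unchanged.
total_change = sum(after.values()) - sum(before.values())

print(reduction, increase, total_change)
```

Because the reduction and the increase are equal, the total salary bill and the average salary stay the same.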

LO3 – TASK 2&3

1) Linear Regression: It is used when we want to predict the value of a variable based on the
value of another variable. The variable we want to predict is called the dependent variable
(or sometimes, the outcome variable).

Advantages:
1. Linear regression is simple to implement, and its output coefficients are easy to interpret.

2. When the relationship between the independent and dependent variables is known to be
linear, this algorithm is the best choice because it is less complex than
other algorithms.

3. In addition, it works in most cases. Even when it doesn't fit the data exactly, we can use it
to find the nature of the relationship between the two variables.

Disadvantages:
1. Outliers can have a huge effect on the regression, and the boundaries in this technique
are linear.

2. Linear regression assumes a linear relationship between the dependent and
independent variables; that is, it assumes a straight-line relationship
between them. It also assumes independence between attributes.

3. Linear regression looks only at the relationship between the mean of the dependent
variable and the independent variables. Just as the mean is not a complete description of a
single variable, linear regression is not a complete description of relationships among
variables.

2) Moving Average: A moving average (MA) is a widely used indicator
in technical analysis that helps smooth out price action by filtering out the "noise" from random
short-term price fluctuations. It is a trend-following, or lagging, indicator because it is based
on past prices.
The formula for the simple moving average (SMA) is:
SMA = (A1 + A2 + ... + An) / n
Ai = average in period i
n = number of time periods
The simple moving average calculates the arithmetic mean of a
security's price over a number (n) of time periods.
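
The formula can be sketched in Python as a rolling calculation. The function name simple_moving_average is a hypothetical helper, and the series used is the production capacity data from Scenario 1 of this document rather than security prices.

```python
def simple_moving_average(values, n):
    """Return the n-period simple moving averages of values, oldest window first."""
    if n <= 0 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    # Each average covers the n most recent values up to position i.
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]


# Production capacity for 2011-2017 (Scenario 1 in this document).
production = [15000, 14000, 14500, 13000, 12000, 11000, 10500]

sma3 = simple_moving_average(production, 3)
for value in sma3:
    print(round(value, 2))
```

Seven observations yield only five 3-period averages, because no average can be formed until three values are available.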

Advantages
1. It is less prone to whipsawing up and down in response to slight, temporary price swings back
and forth.
2. Moving averages can be used for measuring the trend of any series. This method is
applicable to linear as well as non-linear trends.

Disadvantages
1. The trend obtained by moving averages is generally neither a straight line nor a standard
curve; for this reason, the trend cannot be extended to forecast future values. Trend
values are not available for some periods at the start and at the end of the time
series. This method is not applicable to short time series.
2. Some of the data used to compute the moving average might be old or stale.

3) Naïve: A forecasting technique in which the last period's actuals are used as this period's
forecast, without adjusting them or attempting to establish causal factors. It is used only for
comparison with the forecasts generated by better (more sophisticated) techniques.
Advantages:
1. It gives valuable insight.
2. Its efficiency and accuracy have led to its widespread use.
3. It can decrease costs.
Disadvantages:
1. It does not take emergency conditions into account.
2. Forecasts are never 100% accurate.
3. It can be time-consuming and resource-intensive.

4) Correlation: Correlation is used to describe the linear relationship between two continuous variables
(e.g., height and weight). In general, correlation tends to be used when there is no identified
response variable. It measures the strength (qualitatively) and direction of the linear
relationship between two or more variables.

Advantages:
1. It can show the strength of the relationship between two variables.
2. It makes it possible to study behavior that cannot otherwise be studied.
3. It yields quantitative data that can be easily analyzed.

Disadvantages:
1. It cannot show cause and effect (which variables control which).
2. There is no control over a third variable that might affect the correlation.

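Scenario 1:

Naïve: 10500

Moving Average: (10500+11000+12000)/3 = 11166.66667

No.  Year  Production capacity
1    2011  15000
2    2012  14000
3    2013  14500
4    2014  13000
5    2015  12000
6    2016  11000
7    2017  10500
8    2018  9722.93 (linear regression forecast)

Linear Regression: Y = -785.71*2018 + 1,595,285.71 = 9722.93
(This value uses the rounded slope; the unrounded least-squares coefficients give approximately 9714.29.)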
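
The three Scenario 1 forecasts can be reproduced with a minimal Python sketch. It fits the regression line by ordinary least squares on the 2011-2017 data; the variable names are illustrative, and the use of unrounded coefficients is a choice made here, which is why the regression forecast differs slightly from 9722.93.

```python
# Production capacity by year (Scenario 1).
years = [2011, 2012, 2013, 2014, 2015, 2016, 2017]
capacity = [15000, 14000, 14500, 13000, 12000, 11000, 10500]

# Naive forecast: the last actual value is carried forward.
naive_forecast = capacity[-1]

# Moving average forecast: mean of the last three actual values.
moving_average_forecast = sum(capacity[-3:]) / 3

# Least-squares line Y = intercept + slope * year.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(capacity) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, capacity))
sxx = sum((x - mean_x) ** 2 for x in years)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
regression_forecast = intercept + slope * 2018

print(naive_forecast)
print(round(moving_average_forecast, 2))
print(round(slope, 2), round(intercept, 2), round(regression_forecast, 2))
```

Rounding the slope to -785.71 before multiplying by 2018 shifts the result by several units, because the slope is multiplied by a large year value.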
Scenario 2:

Year  Production volume (*1000)  Total quantity of inventory (*1000)
1     100                        20
2     120                        27
3     150                        36
4     200                        50
      250                        65.2267 (linear regression forecast)

Correlation (=CORREL): 0.999044444, a strong positive correlation

Linear Regression (Production = 250): Y = 0.2974*250 - 9.1233 = 65.2267
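
The Scenario 2 figures can be checked with a short Python sketch that computes the Pearson correlation coefficient and the least-squares line directly from their definitions. The variable names are illustrative, and the coefficients are left unrounded here, so the forecast comes out marginally lower than the 65.2267 obtained with the rounded coefficients.

```python
import math

# Scenario 2 data, both in thousands.
production = [100, 120, 150, 200]
inventory = [20, 27, 36, 50]

n = len(production)
mean_x = sum(production) / n
mean_y = sum(inventory) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(production, inventory))
sxx = sum((x - mean_x) ** 2 for x in production)
syy = sum((y - mean_y) ** 2 for y in inventory)

# Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line: inventory = intercept + slope * production.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
forecast = intercept + slope * 250

print(round(r, 6), round(slope, 4), round(intercept, 4), round(forecast, 4))
```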
Scenario 3:

n = 20000
M = 5
σ = 0.1

Upper Boundary: 5 + 2*0.1 = 5.2
Lower Boundary: 5 - 2*0.1 = 4.8

Number of bottles: 95/100 * 20000 = 19000
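
The Scenario 3 calculation applies the rule that about 95% of values lie within two standard deviations of the mean. The sketch below repeats it in Python and, on the assumption that the quantity measured is normally distributed (which the 95% figure implies but the scenario text here does not state), also shows the more precise share given by the normal distribution. Variable names are illustrative.

```python
from statistics import NormalDist

n_bottles = 20000
mean = 5
sigma = 0.1

# Boundaries two standard deviations either side of the mean.
lower = mean - 2 * sigma
upper = mean + 2 * sigma

# Rule-of-thumb count: about 95% of bottles fall inside the boundaries.
approx_count = 0.95 * n_bottles

# Share inside the boundaries under a normal distribution (assumption).
dist = NormalDist(mean, sigma)
exact_share = dist.cdf(upper) - dist.cdf(lower)

print(round(lower, 1), round(upper, 1), round(approx_count), round(exact_share, 4))
```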

LO4

Identify different types of charts / tables available to communicate different categories of variables.

There are many ways to organize data; here are some:

1. Summary table: The summary table is a visualization that summarizes statistical information about
data in table form. Like other visualizations, it can be set up to display only data limited by
one or more markings in other visualizations (details visualizations). It is also possible to restrict
the summary table to one or more filters.

2. Frequency distribution table: A frequency distribution is a representation, in either graphical or
tabular format, that shows the number of observations within a given interval. The size of the
interval depends on the data being evaluated and the analyst's objectives. The intervals must be mutually
exclusive and exhaustive. Frequency distributions are usually used in a statistical context. In
general, a frequency distribution may be associated with the charting of a normal distribution.

3. Contingency table: A data table in which data is tabulated by row entries according to one variable
and by column entries according to another variable, and which is used in particular in the
analysis of the association between variables.

4. Ordered array: The elements of an ordered array are arranged in ascending or descending order.
Generally speaking, an ordered array may contain duplicate elements.
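
As a small illustration of item 2, the following Python sketch builds a frequency distribution table for the performance scores from Survey 2 of this document. The interval width of 10 and the variable names are assumptions chosen for the example; the intervals are mutually exclusive and together cover every score.

```python
from collections import Counter

# Performance scores (out of 100) from Survey 2 in this document.
scores = [23, 35, 53, 60, 63, 65, 70, 70, 78, 80, 80, 85, 90, 95, 98]

width = 10  # assumed interval width

# Map each score to the lower bound of its interval, then count.
counts = Counter((s // width) * width for s in scores)

# One row per interval from 20-29 up to 90-99, including empty intervals.
table = [(low, low + width - 1, counts.get(low, 0)) for low in range(20, 100, width)]

for low, high, freq in table:
    print(f"{low}-{high}: {freq}")
```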

After organizing data, you must visualize it; here are some ways of visualizing data:

1) Pie chart: A pie chart (or circle chart) is a circular statistical graph that is divided into slices to
show numerical proportions. In a pie chart, the arc length of each slice (and thus its central angle and
area) is proportional to the quantity it represents.

2) Stem and leaf: A stem-and-leaf display is a table used for viewing data. On the left is the 'stem', which
indicates the first digit or digits. On the right is the 'leaf', which indicates the last digit.

3) Bar chart: A bar chart or bar graph is a chart that presents categorical data with rectangular bars
whose heights or lengths are proportional to the values they represent. The bars can be plotted
vertically or horizontally. A bar graph shows comparisons between groups.

4) Scatter plot: A scatter plot is a set of points plotted on a horizontal and a vertical axis. In statistics,
scatter plots are important because they can display the degree of association, if any, between the
values of the quantities or phenomena observed.

5) Histogram: A frequency distribution shows how often
each different value in a data set occurs. The most widely used graph for illustrating frequency
distributions is the histogram. It looks very much like a bar chart, but there are important differences
between them.

Use the appropriate tables/charts in order to present and communicate the following variables:

Survey 1:
-One variable: (major field of study)

For one categorical variable, a summary table is the simplest and easiest way to organize the data:

Major field of study   Frequency
HRM                    21
Marketing              16
Accounting             17
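
A summary table like this one can be extended with percentages, and the same proportions give the central angles a pie chart of the data would use. The following Python sketch is illustrative only; the dictionary and variable names are assumptions.

```python
# Frequencies from the summary table for Survey 1.
frequencies = {"HRM": 21, "Marketing": 16, "Accounting": 17}

total = sum(frequencies.values())

# Share of respondents per major, as a percentage.
percentages = {major: count / total * 100 for major, count in frequencies.items()}

# Central angle of each pie slice, proportional to its frequency.
angles = {major: count / total * 360 for major, count in frequencies.items()}

for major in frequencies:
    print(f"{major}: {frequencies[major]} ({percentages[major]:.1f}%), {angles[major]:.1f} degrees")
```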
-Two variables: (major field of study and gender):

The best way to organize two categorical variables is a contingency table:

          Major field of study
Gender    HRM   Marketing   Accounting
Male      11    6           10
Female    10    10          7
Total     21    16          17
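
The totals of a contingency table can be derived from its cells, which is a quick way to check that the table is consistent with the one-variable summary. A minimal Python sketch follows, with illustrative names and a nested dictionary as an assumed representation of the table.

```python
# Cell counts from the contingency table (rows: gender, columns: major).
table = {
    "Male": {"HRM": 11, "Marketing": 6, "Accounting": 10},
    "Female": {"HRM": 10, "Marketing": 10, "Accounting": 7},
}

majors = ["HRM", "Marketing", "Accounting"]

# Row totals: number of respondents of each gender.
row_totals = {gender: sum(cells.values()) for gender, cells in table.items()}

# Column totals: number of respondents in each major.
column_totals = {m: sum(cells[m] for cells in table.values()) for m in majors}

grand_total = sum(row_totals.values())

print(row_totals, column_totals, grand_total)
```

The column totals match the frequencies in the one-variable summary table (21, 16, and 17).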
Survey 2:

Employee ID   Performance out of 100
H             23
G             35
E             53
N             60
O             63
M             65
A             70
L             70
F             78
B             80
J             80
K             85
I             90
C             95
D             98

Performance (stem-and-leaf display)
Stem   Leaf
2      3
3      5
4
5      3
6      0 3 5
7      0 0 8
8      0 0 5
9      0 5 8
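
The stem-and-leaf display above can be generated from the raw scores. The Python sketch below is one possible approach for two-digit values; the function name stem_and_leaf is a hypothetical helper, and it assumes the stem is the tens digit and the leaf is the units digit.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group two-digit values into stems (tens digit) and leaves (units digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Include stems with no leaves so that gaps in the data stay visible.
    stems = range(min(leaves), max(leaves) + 1)
    return {stem: leaves.get(stem, []) for stem in stems}


# Performance scores (out of 100) from Survey 2.
scores = [23, 35, 53, 60, 63, 65, 70, 70, 78, 80, 80, 85, 90, 95, 98]

display = stem_and_leaf(scores)
for stem, leaf_list in display.items():
    print(stem, " ".join(str(leaf) for leaf in leaf_list))
```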

-Two variables: (employee performance and salary)

With two numerical variables, we use an ordinary table to organize the data because it is the easiest
way to read the data once it is organized.

Employee ID   Performance out of 100   Salary in £
A             23                       350
B             35                       500
C             53                       600
D             60                       500
E             63                       650
F             65                       1200
G             70                       1000
H             70                       1200
I             78                       1000
J             80                       1200
K             80                       1400
L             85                       1350
M             90                       1500
N             95                       1400
O             98                       1200
[Figure: scatter plot "Salary Vs Performance", with Performance on the horizontal axis and Salary in £ on the vertical axis. Linear trendline: f(x) = 15.85x − 100.91, R² = 0.76.]
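
The trendline equation and R² reported for the chart can be reproduced from the performance and salary table with an ordinary least-squares fit. The Python sketch below is a minimal check using illustrative variable names; it assumes R² is the squared Pearson correlation, which holds for a straight-line fit with one predictor.

```python
# Performance (out of 100) and salary (in £) for the 15 employees.
performance = [23, 35, 53, 60, 63, 65, 70, 70, 78, 80, 80, 85, 90, 95, 98]
salary = [350, 500, 600, 500, 650, 1200, 1000, 1200, 1000, 1200,
          1400, 1350, 1500, 1400, 1200]

n = len(performance)
mean_x = sum(performance) / n
mean_y = sum(salary) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(performance, salary))
sxx = sum((x - mean_x) ** 2 for x in performance)
syy = sum((y - mean_y) ** 2 for y in salary)

# Least-squares trendline: salary = intercept + slope * performance.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Coefficient of determination for a simple linear fit.
r_squared = sxy ** 2 / (sxx * syy)

print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```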
