2024 - Assignment 01 Frontsheet
P1 P2 P3 M1 M2 D1
Assessor Feedback:
*Please note that constructive and useful feedback should allow students to understand:
a) Strengths of performance
b) Limitations of performance
c) Any improvements needed in future assessments
Feedback should be against the learning outcomes and assessment criteria to help students understand how these
inform the process of judging the overall grade.
Feedback should give full guidance to the students on how they have met the learning outcomes and
assessment criteria.
Resubmission Feedback:
*Please note resubmission feedback is focussed only on the resubmitted work
Signature & Date:
* Please note that grade decisions are provisional. They are only confirmed once internal and
external moderation has taken place and grade decisions have been agreed at the assessment board.
When submitting evidence for assessment, each student must sign a declaration confirming that
the work is their own.
29/7/2024 CMS
Programme: Unit 42 -
Plagiarism
Plagiarism is a particular form of cheating. Plagiarism must be avoided at all costs and students
who break the rules, however innocently, may be penalised. It is your responsibility to ensure that
you understand correct referencing practices. As a university level student, you are expected to
use appropriate references throughout and keep carefully detailed notes of all your sources of
materials for material you have used in your work, including any material downloaded from the
Internet. Please consult the relevant unit lecturer or your course tutor if you need any further
advice.
Student Declaration
I certify that the assignment submission is entirely my own work and I fully understand the
consequences of plagiarism. I declare that the work submitted for assessment has been
carried out without assistance other than that which is acceptable according to the rules of
the specification. I certify I have clearly referenced any sources and any artificial
intelligence (AI) tools used in the work. I understand that making a false declaration is a
form of malpractice.
Table of Contents
I. Introduction: .....................................................................................................................................................5
II. Evaluate the nature and process of business and economic data/information from a range of different
published sources ................................................................................................................................................5
III. Evaluate data from a variety of sources using different methods of analysis ...............................................9
IV. Data statistics ........................................................................................................................................... 16
V. Conclusion: .................................................................................................................................................. 19
References ........................................................................................................................................................ 20
I. Introduction:
As a Research Analyst at SSI Securities Corporation, the author's goal is to demonstrate their
understanding by evaluating and analysing business data and recent microeconomic and macroeconomic
problems, future trends and intentions, and related issues concerning their research topic. With the
Red River Delta as the topic of choice, the author must first evaluate the nature and process of
business and economic data/information from a range of different published sources. Secondly, the
author is required to evaluate data from a variety of sources using different methods of analysis.
Lastly, using the provided data set, the author will state one research question for which the
dataset is a sample, identify the population for this research question, and describe the sampling
strategy. The author will also apply statistical tools to display the mean, mode, median, range and
standard deviation of one variable in a table, and describe one categorical variable with a table or
figure. A relationship between two variables will be shown in a scatter plot with a trend line, and
the correlation between these two variables will be calculated to explain the business meaning of
this relationship.
II. Evaluate the nature and process of business and economic data/information from a
range of different published sources
2.1. Data, Information and Knowledge:
The concept of data as it is used in the syllabus is commonly referred to as 'raw' data: a collection
of text, numbers and symbols with no meaning in itself. Data therefore has to be processed, or
provided with a context, before it can have meaning; it only takes on meaning and becomes
information when it is interpreted. Data consists of raw facts and figures (Cambridge, 2017).
When raw data is processed or arranged into sets according to context, it produces meaningful
output: information. Information is usually the processed outcome of data. Once data is processed
into information, it becomes interpretable and gains significance (Cambridge, 2017).
Knowledge acquired by memorisation is sometimes referred to as 'learning by heart' or 'rote
learning'. A different kind of knowledge develops when an individual comprehends the information
provided to them and uses it to learn how to solve problems. The first sort is commonly called
explicit knowledge: the kind of information that can easily be transmitted to other people, most of
which can be stored in particular media. The material found in publications such as encyclopedias
and textbooks is an excellent instance of explicit knowledge. The second kind is referred to as
tacit knowledge: information that is difficult to convey to another person just by writing it down
with the intention of passing it on. The capacity to speak a foreign language, bake bread, program a
computer, or operate intricate machinery requires pieces of knowledge that are not usually known
explicitly and are difficult to pass on to future users (Cambridge, 2017).
2.2. Data collection
Data gathering, the primary stage of research, can determine the quality of the results that are
achieved by reducing the number of errors that could occur during the course of a research
endeavour. Therefore, in addition to having a strong study design, a significant amount of quality
time should be invested in the collection of data in order to obtain acceptable results, because
insufficient or erroneous data prevents the correctness of the findings from being guaranteed
(Kabir, 2016).
Qualitative data:
Non-numerical facts that cannot be displayed as numbers, whether nominal or descriptive, are
referred to as qualitative data when they are presented in the form of words or phrases. In the
context of a research study, this category of data provides answers to "how" and "why" questions and
mostly encompasses information concerning feelings, perceptions, and emotions. Data collection takes
place using unstructured methods such as interviews, and researchers can gather these data through a
variety of approaches, including audiotapes, sketches, notes, and photographs (Hox & Boeije, 2005).
Although qualitative data are suitable for obtaining further information to explore and determine
new effects and consequences of programmes in research, and ultimately to enhance the quality of
quantitative results, their collection requires a considerable amount of cost and time, and the
results may not be generalizable: the findings of case studies apply only to the same issues and
cannot serve as general patterns for different studies. Qualitative methods encompass three main
categories, namely observations, document reviews, and in-depth interviews, although there are also
less common ways to gather qualitative data (Taherdoost, 2021).
Quantitative data:
It is generally accepted that quantitative data consist of numerical data that are developed and
computed using mathematical methods. Nominal scales, ordinal scales, interval scales, and ratio
scales are among the numerous types of scales that can be used to measure quantitative data (Kabir,
2016). The "what" question type is addressed by a quantitative method in the context of an
investigation. The methodologies utilized in these approaches are based on random sampling and
structured data collection methods. Compared to qualitative methods, these methods are considered
more cost-effective, and the findings can be standardized for comparison across certain parameters,
such as size. Generalizing and summarizing the findings is not difficult, and a straightforward
comparison of the outcomes is also feasible. However, because these approaches have a limited
capacity for in-depth examination, they are also susceptible to unanticipated differences and a few
challenges (Taherdoost, 2021).
The counts of firms by size and by the population of the surrounding area, reconstructed from the
data, are:

Firm size       Pop. over 1,000,000   Pop. 50,000-250,000   Pop. under 50,000   Total
Above average   35                    0                     1                   36
Average         118                   1                     3                   122
Extreme         9                     0                     0                   9
Large           21                    0                     1                   22
Small           105                   0                     7                   112

In areas with a population of over one million people, the Red River Delta contains 35 firms that
are above average in size, 118 firms of average size, 9 firms of extreme size, 21 large firms, and
105 small firms. Only one firm, of average size, is found in areas with a population ranging from
50,000 to 250,000 people. Last but not least, in areas with a population of less than 50,000 people,
there is 1 firm that is larger than average, along with 3 firms of average size, 1 large firm, and 7
small firms. This gives a grand total of 36 firms that are larger than average, 122 firms of average
size, 9 extremely large firms, 22 large firms, and 112 small firms.
III. Evaluate data from a variety of sources using different methods of analysis
3.1. Descriptive statistics
Descriptive statistics are brief informational coefficients that summarize a particular data set,
which can represent either a full population or a sample of a population. Descriptive statistics
fall into two categories: measures of central tendency and measures of variability (spread). The
mean, the median, and the mode are all examples of measures of central tendency, while the standard
deviation, variance, minimum and maximum values, kurtosis, and skewness are examples of measures of
variability (Hayes, 2024).
By providing concise summaries of the sample and of the measurements of the data, descriptive
statistics help describe and explain the characteristics of a particular data set. Of the many types
of descriptive statistics, the most common are measures of center: the mean, the median, and the
mode, which are used at virtually all levels of mathematics and statistics. The mean, also known as
the average, is determined by adding up all of the figures contained within the data set and
dividing the total by the number of figures in the set (Hayes, 2024).
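To make these measures concrete, the following is a minimal sketch using Python's built-in
statistics module; the sample values are invented purely for illustration:

```python
# Central tendency and variability measures on a made-up sample.
import statistics

sales = [9, 21, 35, 105, 118, 21, 9, 21]  # hypothetical values for illustration

mean = statistics.mean(sales)          # sum of values / number of values
median = statistics.median(sales)      # middle value of the sorted data
mode = statistics.mode(sales)          # most frequently occurring value
variance = statistics.variance(sales)  # sample variance
stdev = statistics.stdev(sales)        # sample standard deviation (sqrt of variance)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, stdev={stdev:.2f}")
```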
Descriptive analysis upholds a significant level of objectivity and impartiality among researchers,
allowing for the precise and unbiased representation of data characteristics and observable trends.
By maintaining this neutrality, it ensures that the data is accurately depicted, free from subjective
influence, and reflects the true nature of the variables being studied. This approach enables
researchers to capture the essence of the data in a manner that remains faithful to its inherent
patterns, contributing to the reliability and credibility of their findings (Education, 2024).
Descriptive analysis is not suitable for making predictions or forecasts based on the available data
values. Unlike inferential statistics, which provide tools for predicting future trends or outcomes
through methods such as regression analysis, descriptive statistics are limited to summarizing and
presenting the characteristics of the current data set. While inferential statistics can extend beyond
the data at hand to make educated projections about unknown values, descriptive statistics are solely
concerned with offering a snapshot of the existing data. This method lacks the analytical
mechanisms necessary for drawing conclusions about potential future events, as its primary purpose
is to describe what is already observed rather than to anticipate what might occur (Education,
2024).
The absolute values of discretionary accruals calculated using the DACKO model have a mean (median)
of 0.079 (0.054) and a standard deviation of 0.08, as shown in Panel A of Table 2. This indicates
that the overall volume of earnings management accounts for 7.9% (5.4%) of lagged total assets.
Panel B in Table 2 reveals that Big Four auditors audit 26.7% of the sample companies, while
companies audited by non-Big Four firms comprise around 73.3% of the sample. This information
pertains to the independent variables. In addition, the data shown in panel C of Table 2 reveals that
the percentage of audit fees paid to Big Four firms by their customers has a mean (median) value of
4.3365 (4.0792) and a standard deviation of 0.4605. The proportion of audit fees paid to non-Big
Four firms by their clients has a mean (median) of 3.8804 (3.8891) and a standard deviation of
0.2093. This is in contrast to the Big Four firms, who have a mean of 3 (Almarayeh, et al., 2020).
3.2. Exploratory analysis:
Exploratory data analysis involves evaluating datasets with an open mind; its objectives include the
discovery of patterns, the testing of ideas, and the development of intuition. It is an open-ended
method that places an emphasis on garnering new insights: the goal is to freely examine the data in
order to find intriguing patterns and to come up with new hypotheses, which allows analysts to
acquire a comprehension of the data as well as the potential insights it contains. Exploratory
analysis requires an open and flexible mindset in order to identify trends that were not predicted,
since analysts examine the data without any predetermined notions of what they might find, and the
ability to innovate is crucial. Although it does not involve any prior hypotheses, it is a commonly
used method for identifying potential relationships in the data; in addition to contributing to the
development of basic concepts, its objective is to identify noteworthy patterns, anomalies, and
correlations (DataHeadhunters, 2024).
Exploratory analysis is a versatile approach that does not demand rigid assumptions regarding the
distribution or structure of the data, which distinguishes it from more traditional statistical methods
that rely heavily on such assumptions. By not being constrained by strict distributional requirements,
exploratory analysis enables the investigation of complex and diverse datasets, encouraging a more
open-ended exploration of the data without the need for predefined models or hypotheses. This
makes it an invaluable tool in early-stage research, where the goal is often to uncover patterns,
relationships, or insights that might not be immediately apparent (Techvify, 2023).
The effectiveness of exploratory analysis can be constrained by the size and representativeness of
the data sample being examined. When working with small or non-representative samples, the
conclusions or insights drawn from the analysis may not accurately reflect broader trends or the true
nature of the population being studied. In such cases, the patterns or relationships identified through
exploratory analysis can be misleading, as they may be artifacts of the limited or skewed data set
rather than genuine insights. Consequently, while exploratory analysis offers a valuable framework
for identifying potential patterns and generating hypotheses, its utility is highly dependent on the
quality and adequacy of the data sample (Techvify, 2023).
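As an illustration of this open-ended first pass, the following is a hedged sketch in Python with
pandas; the file name firms.csv and its columns are assumptions, not an actual dataset:

```python
# A first exploratory pass: summaries, missing values, correlations, histograms.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("firms.csv")        # hypothetical survey file

print(df.describe())                 # summary statistics for numeric columns
print(df.isna().sum())               # missing values per column
print(df.corr(numeric_only=True))    # pairwise correlations to scan for patterns

df.hist(figsize=(10, 6))             # quick histograms to inspect distributions
plt.tight_layout()
plt.show()
```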
Figure 1 is a scatter plot that illustrates the overall performance scores of 14 local health
departments. The plot focuses on two aspects: jurisdictional coverage, represented by filled
circles, and health department contribution, shown by open circles. Given that the x-axis is labeled
"Screen" and the y-axis is labeled "Full survey plus 3 questions", it can be deduced that the plot
compares scores obtained from a screening process, probably a shorter examination, with scores
obtained from a fuller survey that includes three additional questions (Jogger, 2023).
There appears to be a strong positive association between the degree of jurisdictional coverage and
the amount of contribution made by the health department, as indicated by the fact that the points,
both filled and open circles, are tightly spaced around the line of equality. The implication is
that health departments that do well on Jurisdictional Coverage are also likely to perform well on
Health Department Contribution, and vice versa. Because the majority of points lie in close
proximity to the line of equality, the performance scores for Jurisdictional Coverage and Health
Department Contribution are closely comparable. It might be inferred that both measures are
assessing the same features of health department performance, or that successful jurisdictional
coverage is typically associated with higher contributions from the health department (Jogger,
2023).
There is a significant amount of variation in the overall performance of the health departments that
were examined, as indicated by the scattering of points around the line of equality throughout a
range of scores (from 0.1 to 0.9). The alignment around the line of equality, on the other hand, lends
support to the notion that advances in Jurisdictional Coverage frequently occur concurrently with
gains in Health Department Contribution (Jogger, 2023).
3.3. Confirmatory analysis:
The purpose of confirmatory data analysis is to assess particular hypotheses and validate
relationships through the application of specific analytical methods. Hypotheses are stated before
looking at the data, and the analysis focuses on determining whether they are correct or incorrect:
based on the results of specific statistical tests, confirmatory analysis validates or rejects the
predictions that were established in advance. The objective is to either validate or invalidate
particular hypotheses concerning anticipated relationships in the data. In confirmatory analysis,
researchers choose a method that is hypothesis-driven and targeted; the analysis is conducted with
specific predictions already in mind, and its focus is validating particular associations through
thorough statistical testing. Specific hypotheses developed during the exploratory phase are put to
the test, with the purpose of statistically validating or rejecting hypothesized associations at a
predetermined significance level (DataHeadhunters, 2024).
Confirmatory analysis provides researchers with the ability to rigorously test well-defined
hypotheses and address specific research questions with precision. By requiring the formulation of
hypotheses prior to the analysis, this approach ensures that the research process is structured,
focused, and guided by predetermined objectives. This pre-planned approach allows for a more
targeted investigation, as researchers can concentrate on testing the validity of their hypotheses and
answering the particular questions they seek to resolve (DataHeadhunters, 2024).
The results of confirmatory analysis may have limitations in their ability to be generalized beyond
the specific hypotheses, data sets, and methodologies employed during the research process. The
findings, therefore, may not readily extend to other contexts or populations outside of the original
scope. This is because the analysis is designed to assess specific relationships or patterns that are
contingent on the unique characteristics of the data set and the methodological choices made by the
researcher. As a result, the insights gained from confirmatory analysis may not hold true when
applied to different datasets, hypotheses, or research frameworks (DataHeadhunters, 2024).
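A minimal sketch of this hypothesis-driven workflow, using scipy; the hypothesis, data values, and
significance level below are invented for illustration:

```python
# Confirmatory analysis: a hypothesis fixed in advance is tested against a
# pre-set significance level. All values here are placeholders.
from scipy import stats

# H0: the two variables are uncorrelated; H1: they are correlated.
hours = [40, 56, 48, 72, 60, 44, 80, 52]
inventory_days = [30, 22, 25, 15, 18, 28, 12, 24]

r, p_value = stats.pearsonr(hours, inventory_days)

alpha = 0.05                      # significance level chosen before the test
if p_value < alpha:
    print(f"r={r:.3f}, p={p_value:.4f}: reject H0")
else:
    print(f"r={r:.3f}, p={p_value:.4f}: fail to reject H0")
```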
Table 5 presents a comparison of the impact of one variable on another: the correlation coefficients
and the significance levels associated with them. It was found that the electrochemical parameters
and the strength parameters all showed a poorer association with Sn. A strong association between
the suitability number and other properties of Nsukka sand is established in Table 5 by the
correlation matrix, with some of these properties showing significance levels of between 0.01 and
0.05. These attributes had independently revealed stronger correlation coefficients, and as a result
they were chosen for multiple linear regression studies. For estimating the suitability number of
Nsukka sand, the chosen equation, number four, offers the strongest association in terms of a higher
coefficient of determination. It should be noted, however, that additional multiple regression
studies were carried out, some of which included more than two independent variables. The equations,
presented in descending order, illustrate some of the multiple regression studies performed and the
coefficients of determination associated with them (Pogu, et al., 2018).
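As a sketch of the multiple linear regression idea discussed here, the following regresses a target
on two predictors and computes the coefficient of determination (R^2); the variable names and values
are hypothetical, not the study's actual data:

```python
# Least-squares fit of y on two predictors, plus R^2, with placeholder data.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.1, 1.9, 3.5, 4.2, 5.5, 6.1])
y = np.array([3.0, 4.1, 6.2, 8.1, 10.3, 12.0])

X = np.column_stack([np.ones_like(x1), x1, x2])  # add intercept column
coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares coefficients

y_hat = X @ coef
ss_res = np.sum((y - y_hat) ** 2)                # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)             # total sum of squares
r_squared = 1 - ss_res / ss_tot                  # coefficient of determination

print("coefficients:", coef)
print(f"R^2 = {r_squared:.3f}")
```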
IV. Data statistics
4.1. Inferential statistics
For this report, the author's research question is: "Would the relationship between inventory and
hours operating per week be positive?" The population for this research question covers all firms
located in the Red River Delta region of Vietnam. The sampling methods the author will use are
stratified random sampling and cluster sampling.
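A hedged sketch of the stratified part of this strategy in Python with pandas; the file name and the
firm_size column are assumptions for illustration. Each stratum is sampled at the same fraction so
the sample mirrors the population's composition:

```python
# Stratified random sampling: draw the same fraction from every stratum.
import pandas as pd

df = pd.read_csv("red_river_delta_firms.csv")   # hypothetical dataset

# Sample 20% of the firms within each firm-size stratum, reproducibly.
sample = df.groupby("firm_size").sample(frac=0.2, random_state=42)

# The sampled proportions per stratum should match the population's.
print(sample["firm_size"].value_counts(normalize=True))
```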
From the table above, the author has calculated key statistics of all firms for the variable
d2-sales. The mean, the arithmetic average of all the data points, takes the value 109. The mode,
the value that appears most frequently in the dataset, is 2. The median is the middle value when the
data are ordered from least to greatest, so that half of the data points lie below it and half
above; after calculation its value is 19. The variance measures how much the data points vary from
the mean; calculated as the average of the squared differences from the mean, it takes the value
182,932.8118. The standard deviation, the square root of the variance, measures the spread of the
data points, representing how much, on average, each data point differs from the mean; its value of
427.7064552 indicates significant variability in the data. The range, the difference between the
maximum and minimum values in the dataset, takes the value 6,000. The count is simply the number of
data points in the given set: there are 293 data points. The coefficient of variation is the ratio
of the standard deviation to the mean, expressed as a percentage, and indicates the relative
variability in the data; here it is 394%. The 70th percentile, with the value 43.6, means that 70%
of the data points are less than or equal to 43.6; it is a measure of the distribution of values
within the dataset. Lastly, the 1st quartile, with the calculated value of 5, indicates that 25% of
the data points are less than or equal to 5. Quartiles divide the data into four equal parts, and
the first quartile represents the lowest quarter of data points.
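The figures above could be reproduced along the following lines; this is a minimal numpy sketch in
which the d2_sales values are placeholders, not the actual 293-observation dataset:

```python
# Summary statistics for a sales variable, with placeholder values only.
import numpy as np

d2_sales = np.array([2.0, 2.0, 5.0, 19.0, 43.6, 109.0, 6000.0])

mean = d2_sales.mean()
median = np.median(d2_sales)
variance = d2_sales.var(ddof=1)               # sample variance
stdev = d2_sales.std(ddof=1)                  # sample standard deviation
value_range = d2_sales.max() - d2_sales.min()
cv = stdev / mean * 100                       # coefficient of variation, %
p70 = np.percentile(d2_sales, 70)             # 70th percentile
q1 = np.percentile(d2_sales, 25)              # first quartile

print(f"mean={mean:.1f}, median={median}, stdev={stdev:.2f}")
print(f"range={value_range}, CV={cv:.0f}%, P70={p70}, Q1={q1}")
```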
4.2. Describing one categorical variable
The distribution table and pie chart above represent the data for firm size. As can be seen from
these two charts, in the Red River Delta region 11.96% of firms are of above average size,
represented by the deep blue section of the pie chart. Secondly, 40.53% of firms in the region are
of average size, shown in the red section of the pie chart. Thirdly, extreme-size firms hold only
2.99%, represented by the green section of the pie chart. The fourth firm size, large, holds 7.31%
in the region and is shown in the purple section of the pie chart. Lastly, the small firm size holds
the second-largest share in the region, at 37.21%, and is displayed in the sky blue section of the
pie chart. From this information, it can be understood that the firms of this region are mostly in
development, since the number of small and average-size firms outweighs the number of larger firms.
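A sketch of how such a distribution table and pie chart could be produced; the percentages are taken
from the text above, while the construction itself is illustrative:

```python
# Frequency distribution of the categorical firm-size variable, as a table
# and a pie chart. Percentages are those reported in the text.
import matplotlib.pyplot as plt
import pandas as pd

firm_size_pct = pd.Series(
    {"Above average": 11.96, "Average": 40.53, "Extreme": 2.99,
     "Large": 7.31, "Small": 37.21}
)

print(firm_size_pct.to_frame("percent"))   # distribution table

ax = firm_size_pct.plot.pie(autopct="%.2f%%")  # pie chart of firm sizes
ax.set_ylabel("")
ax.set_title("Firm size distribution, Red River Delta")
plt.show()
```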
4.3. Measuring association
The scatter plot above displays the relationship between the two variables f2 (hours operating per
week) and d16 (days of inventory). With the scatter plot showing a downward trend, and the
correlation coefficient calculated at -0.05152, it is confirmed that there is a very weak linear
relationship between the two variables. This weak negative relationship between days of inventory
and operating hours indicates that the more hours a firm works per week, the fewer days of inventory
it holds in the business, and vice versa.
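A minimal sketch of this plot and calculation; the f2 and d16 values below are placeholders rather
than the actual survey data:

```python
# Scatter plot with a least-squares trend line and Pearson correlation
# between two variables, using placeholder data.
import matplotlib.pyplot as plt
import numpy as np

f2 = np.array([40, 56, 48, 72, 60, 44, 80, 52])    # hours operating per week
d16 = np.array([30, 22, 25, 15, 18, 28, 12, 24])   # days of inventory

r = np.corrcoef(f2, d16)[0, 1]                     # Pearson correlation
slope, intercept = np.polyfit(f2, d16, 1)          # least-squares trend line

plt.scatter(f2, d16)
xs = np.linspace(f2.min(), f2.max(), 100)
plt.plot(xs, slope * xs + intercept)
plt.xlabel("f2: hours operating per week")
plt.ylabel("d16: days of inventory")
plt.title(f"r = {r:.5f}")
plt.show()
```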
V. Conclusion:
To summarize, the report has addressed all of the significant issues that were raised by the author
at the beginning of the report. First, the nature and process of business and economic data and
information from a variety of published sources have been examined. Second, the report has reviewed
data gathered from a wide range of sources and analyzed using a variety of different methodologies.
Lastly, an additional investigation of the provided data set has been included in order to assess
the relationship between two variables.
References
Almarayeh, T. S., Aibar-Guzmán, B. & Abdullatif, M., 2020. Does audit quality influence earnings
management in emerging markets? Evidence from Jordan. Spanish Accounting Review, 23(1), pp. 64-74.
Cambridge, 2017. Cambridge International AS & A Level Information Technology. 1 ed. England: Cambridge
International Examinations.
DataHeadhunters, 2024. Exploratory vs Confirmatory Data Analysis: Approaches and Mindsets. [Online]
Available at: https://dataheadhunters.com/academy/exploratory-vs-confirmatory-data-analysis-approaches-and-mindsets/
[Accessed 12 September 2024].
Education, J., 2024. What is Descriptive Analysis?- Types and Advantages. [Online]
Available at: https://www.jaroeducation.com/blog/descriptive-analysis-types-and-advantages/
[Accessed 13 September 2024].
Hayes, A., 2024. Descriptive Statistics: Definition, Overview, Types, and Examples. [Online]
Available at: https://www.investopedia.com/terms/d/descriptive_statistics.asp
[Accessed 27 July 2024].
Hox, J. J. & Boeije, H. R., 2005. Data collection, primary versus secondary. Encyclopedia of Social
Measurement, 1, pp. 593-598.
Pogu, J. H., Okafor, C. C. & Ezeokonkwo, J. C., 2018. Suitability of sands from different locations
at Nsukka as backfill material for vibroflotation. Nigerian Journal of Technology, 37(4), pp. 868-874.
Kabir, S. M. S., 2016. Methods of data collection. Basic Guidelines for Research, 1(1), pp. 201-275.
Taherdoost, H., 2021. Data Collection Methods and Tools for Research; A Step-by-Step Guide to Choose
Data Collection Technique for Academic and Business Research. International Journal of Academic
Research in Management, 10(1), pp. 10-38.
Techvify, 2023. Exploratory Data Analysis: Everything you need to know. [Online]
Available at: https://techvify-software.com/exploratory-data-analysis/
[Accessed 12 September 2024].