Econometrics Report - Draft
Year IV – Semester I
2020
By
Group No. 01
Introduction
Methodology
  Data
  Model Specification
  Variable Selection
    Quantitative Data
    Qualitative Data
  Hypothesis
    Based on Z Values
    Based on P-Values
    Confidence Intervals for True Coefficients
Analysis & Findings
  Normality Assumption
    Graphical Tests
    Statistical Tests
  Interpretations of Coefficients
  Statistical Significance
  t- Test
  R2 and adjusted R2
  Root-Mean-Square Error (RMSE)
  Standard Errors
  Overall Significance of the Model
    Probability F Test
  Independent Variables Analysis
  Checking for Errors
    Multicollinearity
    Heteroscedasticity
    Specification Bias
    Testing Normality Assumption
Conclusion
Recommendations
Limitations
Bibliography
Appendix
Table of Tables
Table 1: Data Set
Table 2 : Critical Values and t-Statistics
Table 3 : p-Values
Table 4 : Confidence Intervals for Coefficients
Table 5 : Regression Results
Table 6: Anderson-Darling Test
Table 7: Coefficients
Table 8 : Multicollinearity Test
Table of Charts
Chart 1: Histogram - Data Normality Test
Chart 2: Probability Plot- Data Normality Test
Chart 3 : Residuals and Fitted Values
Introduction
The furniture industry is one of the major industries in Sri Lanka; according to Padmasiri (2012),
it has more than 10,000 furniture plants and woodworking firms across the country. It contributes
to the Sri Lankan economy by creating job opportunities and by adding to the growth of gross
domestic product (GDP). Padmasiri (2012) also notes that, because demand for high-quality wooden
furniture in the local market is increasing with a growing population and rising per capita income,
the furniture industry plays a vital role in the manufacturing sector.
As highlighted by Amarasekara (2012), the majority of furniture firms are in Moratuwa, which is
well known in the country for its carpentry, and most woodworking activity and production takes
place within the city. Most of these firms are small in scale and privately owned. According to
the same report, the furniture industry is the main economic activity of the people in the
Moratuwa area. Furniture activity in the country is concentrated in Moratuwa (Dasanayaka, 2011)
because of the comparative advantage the area has gained from the easy supply of the raw materials
needed to produce furniture, and in marketing owing to its popularity among the
The timber species most commonly used by these firms is teak; others include mahogany and jack.
Designs are based mainly on consumer preference, carpenters' traditional designs, and foreign
catalogues (H.S.C.Perera, 2009); accordingly, Teak Cabinets, Teak Elmira, and Teak Chairs are the
main manufactured items.
The main issues and challenges faced by firms in the Sri Lankan furniture industry are financial
and credit-related issues, skilled-labor mismatch, technological problems, resource management issues and
to concentrate on these problems and identify the factors that can improve and make a significant
Methodology
Data
This research is based on primary data collected through surveys and direct observation.
Structured interviews were conducted using questionnaires provided by the Department of Business
Economics. Our sample consists of 20 observations selected to represent the population of
furniture firms in the Moratumulla area. All interviews were conducted one-to-one.
Model Specification
The research depends on more than one independent variable to analyze the growth of the furniture
Variable Selection
We have taken both quantitative and qualitative data to analyze the growth of the furniture
industry in Moratumulla. As the dependent variable of the model we have taken the production
value, and as independent variables we have taken years of experience in the industry, number of
laborers, family inheritance, quantity of timber used, electricity payment and training
experience. All these variables are measured on a weekly basis, and dummy variables were used to
quantify the qualitative variables.
Quantitative Data
Quantitative data includes all the variables we can identify through a numerical measurement.
Including the dependent variable and four independent variables, the model is based on five
quantitative variables.
• Dependent variable
1. Production value
Production value is obtained by multiplying the number of items produced by their unit
price.
• Independent variables
1. Years of experience
The owner of each firm has experience in the industry, which we measured in years.
2. Number of laborers
We took the number of laborers currently working in the firm.
3. Quantity of timber used
The quantity of timber used is taken on a weekly basis and measured in cubic feet.
4. Electricity payment
The electricity cost that each firm incurs was reported monthly, and we divided it by four to get
Qualitative Data
All the non-numerical data collected for the research are the qualitative data included in the
model. We selected two independent variables, family inheritance and training experience, as our
qualitative variables. These variables are analyzed through a dummy variable regression.
• Independent variables
1. Family inheritance
This variable concerns whether the business was newly acquired or inherited from
Yes - 1
No - 0
2. Training experience
This variable states whether the current workers have any training experience regarding
Yes - 1
No - 0
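The dummy variable regression described above can be written out as a sketch. The notation is not from the original report: the ordering of β1 to β5 is assumed from the coefficients discussed in the later sections (years of experience, number of laborers, family inheritance, timber quantity, training), and electricity payment is left out only because no coefficient for it appears among those results.

```latex
% Assumed notation (not from the report): Y_i is the weekly production value of firm i,
% D denotes a 0/1 dummy variable, and u_i is the error term.
Y_i = \beta_0 + \beta_1\,\text{Experience}_i + \beta_2\,\text{Laborers}_i
      + \beta_3\,D^{\text{inherit}}_i + \beta_4\,\text{Timber}_i
      + \beta_5\,D^{\text{train}}_i + u_i ,
\qquad
D^{\text{inherit}}_i,\; D^{\text{train}}_i \in \{0, 1\}
```

Under this coding, β3 and β5 measure the shift in expected production value for firms coded 1 relative to firms coded 0, with the other variables held fixed.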
Table 1: Data Set
Hypothesis
Based on Z Values
H0: β1= 0
(There is no significant relationship between independent variable and the dependent variable.)
H1: β1≠ 0
(There is a significant relationship between independent variable and the dependent variable.)
Decision rule
Table 2 : Critical Values and t-Statistics
Explanatory variables | Hypothesis | t-statistic | Critical value at the 0.05 level of significance for 14 df (two-tailed test) | Decision
Intercept | - | 0.28 | 2.145 | Economically meaningless
significant
at 95% confidence
level.
significant
at 95% confidence
level.
at 95% confidence
level.
H1: There is a statistically significant
at 95% confidence
level.
confidence level.
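The comparison behind Table 2 can be sketched in a few lines. The critical value 2.145 is the one the report gives for 14 degrees of freedom; the coefficients and standard errors are figures reported elsewhere in this document, and pairing each standard error with a variable is an assumption based on the order in which they are listed. The function names are illustrative.

```python
# Sketch of the two-tailed t test for H0: beta = 0.
# Assumption: 3168.967 is the standard error for years of experience and
# 1482.788 is the standard error for timber quantity (inferred from list order).

T_CRITICAL = 2.145  # two-tailed, alpha = 0.05, 14 degrees of freedom (from the report)


def t_statistic(coefficient, standard_error):
    """Return the t statistic for H0: beta = 0."""
    return coefficient / standard_error


def is_significant(coefficient, standard_error, critical=T_CRITICAL):
    """Reject H0 when |t| exceeds the critical value."""
    return abs(t_statistic(coefficient, standard_error)) > critical


estimates = {
    "years of experience": (18140.35, 3168.967),
    "timber quantity": (1459.454, 1482.788),
}

for name, (b, se) in estimates.items():
    t = t_statistic(b, se)
    verdict = "reject H0" if is_significant(b, se) else "do not reject H0"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

Under these assumptions the years of experience coefficient clears the critical value while timber quantity does not, which agrees with the conclusions drawn in the report.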
Based on P-Values
H0: β1= 0
(There is no significant relationship between independent variable and the dependent variable.)
H1: β1≠ 0
(There is a significant relationship between independent variable and the dependent variable.)
Decision Rule
Table 3 : p-Values
Parameters are
level of significance
Intercept | 0.782 | 0.05 | Economically meaningless
Confidence Intervals for True Coefficients
Explanatory Variables | b ± se × t(0.05/2) | Lower limit | Upper limit
The 95% confidence interval for the coefficient of years of experience runs from 11,343.59 to
24,937.11, which means that, at this confidence level, the true coefficient for years of
experience lies within that range. Since both limits are positive, the interval does not contain
zero. So, we can say that there is a
The 95% confidence interval for the coefficient of number of laborers runs from 8,689.016 to
41,882.06, which means that, at this confidence level, the true coefficient for number of laborers
lies within that range. Since both limits are positive, the interval does not contain zero. So, we
can say that there is a
However, the confidence intervals for the coefficients of family inheritance, timber quantity, and
training have negative lower limits, which means zero lies within them. Here we do not reject
the null hypothesis and thereby conclude that they do not have a significant relationship with the
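The interval b ± se × t(0.05/2) used in this section can be sketched as follows. The coefficients and standard errors are the figures reported in this document; the more precise critical value 2.14479 (instead of the rounded 2.145) and the pairing of the standard error 1482.788 with timber quantity are assumptions, and the function names are illustrative.

```python
# Sketch of a 95% confidence interval for a regression coefficient with 14 df.
# Assumptions (not from the report): the unrounded critical value 2.14479 and
# the pairing of 1482.788 with timber quantity.

T_CRITICAL_14_DF = 2.14479  # two-tailed 5% critical value of Student's t, 14 df


def confidence_interval(coefficient, standard_error, t_critical=T_CRITICAL_14_DF):
    """Return the (lower, upper) limits of the confidence interval."""
    margin = t_critical * standard_error
    return coefficient - margin, coefficient + margin


def excludes_zero(interval):
    """True when zero lies outside the interval, i.e. H0: beta = 0 is rejected."""
    lower, upper = interval
    return lower > 0 or upper < 0


experience_ci = confidence_interval(18140.35, 3168.967)
timber_ci = confidence_interval(1459.454, 1482.788)

print("years of experience:", experience_ci, excludes_zero(experience_ci))
print("timber quantity:", timber_ci, excludes_zero(timber_ci))
```

With these inputs the limits for years of experience land very close to the reported 11,343.59 and 24,937.11, and the timber interval straddles zero, in line with the discussion above.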
Analysis & Findings
The following analysis and interpretations are based on the results obtained through Eviews.
Normality Assumption
We use the sample to make inferences about the population, so the chosen estimators
(β0, β1, β2, β3, β4, β5) should represent the population as well as possible. Assigning a
probability distribution to these estimators gives all their possible values together with their
probabilities. For that, we assume the random variables are normally distributed. To test the normality, we have
Graphical Tests
1. Histogram
[Chart 1: Histogram - Data Normality Test. Horizontal axis: residuals, roughly -200,000 to 200,000; vertical axis: frequency, 0 to 6.]
Series: Residuals
Sample: 1 20
Observations: 20
Mean: -8.44e-11
Median: 7578.646
Maximum: 216873.6
Minimum: -245151.6
Std. Dev.: 117096.3
Skewness: -0.267104
Kurtosis: 2.703284
Jarque-Bera: 0.311182
Probability: 0.855909
Here the vertical axis represents frequency of residuals at each interval. The horizontal axis
The distribution can be considered reasonably symmetrical. (Exhibits a reasonable bell shape)
Here the residuals are plotted on and closer to the line, therefore we can conclude that the residuals
The error terms (residuals) are examined because the estimated parameters are linear functions of
the error terms, so a normality assumption made about the error terms carries over to the
parameters as well. So we can conclude that
Even though these graphical tests are used in determining the normality of residuals, these
interpretations can be highly subjective according to the analysts’ perception about the shape of
[Chart 2: Probability Plot- Data Normality Test. Horizontal axis: Quantiles of RESID; vertical axis: Quantiles of Normal.]
Statistical Tests
Hypothesis Testing:
Decision rule:
0.8559> 0.05
Conclusion:
At the 5% level of significance, we fail to reject the null hypothesis, so there is no evidence
that the residuals depart from a normal distribution.
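The Jarque-Bera figures can be reproduced from the residual summary reported with the histogram (20 observations, skewness -0.267104, kurtosis 2.703284). The sketch below uses the standard formula JB = n/6 × (S² + (K − 3)²/4) and the chi-square distribution with 2 degrees of freedom; the function names are illustrative and not from the report.

```python
import math

# Sketch of the Jarque-Bera normality test from reported summary statistics.


def jarque_bera(n, skewness, kurtosis):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4)."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


jb = jarque_bera(20, -0.267104, 2.703284)
p_value = chi2_sf_2df(jb)

print(f"JB = {jb:.6f}, p-value = {p_value:.6f}")
print("reject normality" if p_value < 0.05 else "do not reject normality")
```

The computed statistic and probability agree with the reported values of 0.311182 and 0.855909 to within rounding.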
2. Anderson-Darling
Table 6: Anderson-Darling Test
Hypothesis Testing:
Decision rule:
Conclusion:
At the 5% level of significance, we fail to reject the null hypothesis, so there is no evidence
that the residuals depart from a normal distribution.
Interpretations of Coefficients
Table 7: Coefficients
Years of Experience (+18,140.35): There is a positive relationship between the years of
experience of furniture manufacturers/sellers and the revenue they earn per week. When experience
increases by one year, weekly revenue increases by approximately Rs. 18,140, other variables held
constant.
Timber Quantity (+1,459.454): There is a positive relationship between the timber quantity (cubic
feet) used by furniture manufacturers/sellers during a week and the average revenue they earn
during a week. Each one cubic foot increase in timber usage is associated with a Rs. 1,459.454
increase in revenue, other variables held constant.
Statistical Significance
t- Test
• Step 01
Building hypothesis:
H0: β1 = 0
(There is no significant relationship between the independent variable and the
dependent variable.)
H1: β1 ≠ 0
(There is a significant relationship between the independent variable and the
dependent variable.)
• Step 02
• Step 03
• Step 04
Decision rule:
• Step 05
Decision:
R2 and adjusted R2
R-squared (R2) is a statistical measure that represents the proportion of the variance of a
dependent variable that is explained by the independent variables in a regression model. Whereas
correlation explains the strength of the relationship between an independent and a dependent
variable, R2 explains to what extent the variance of one variable explains the variance of the
second variable. So, if the R2 of a model is 0.50, then approximately half of the observed
variation can be explained by the model.
We use adjusted R2 to compare the goodness of fit of regression models that contain different
numbers of independent variables. The adjusted R2 adjusts for the number of terms in the model.
Importantly, its value increases only when a new term improves the model fit more than would be
expected by chance alone, and it decreases when the term does not improve the fit by a sufficient
amount. It therefore indicates whether the included variables really help explain the dependent
variable, in our case the revenue. So, the adjusted R2 and R2 should be close to each other. If
they are not, it suggests that the model includes variables that add little explanatory power.
In our study the variation of revenue is explained by the variations of Independent variables. Our
The R2 indicates that 83.67% of the variance in the revenue of carpenters can be explained by the
independent variables mentioned above. The gap between R2 and adjusted R2 is 0.0584, which is
comparatively low. So, we can conclude that the variables we have included contribute to the
model's fit rather than merely inflating R2.
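The relationship between R2 and adjusted R2 can be checked with a short sketch. R2 = 0.8367 and n = 20 are from the report; k = 5 regressors is an assumption inferred from the 14 degrees of freedom used in the t tests (20 − 5 − 1 = 14), and the function name is illustrative.

```python
# Sketch of the adjusted R-squared formula.
# Assumption: the fitted model has k = 5 regressors plus an intercept.


def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


r2 = 0.8367
adj_r2 = adjusted_r_squared(r2, n_obs=20, n_regressors=5)
gap = r2 - adj_r2

print(f"adjusted R2 = {adj_r2:.4f}, gap = {gap:.4f}")
```

With these inputs the gap comes out close to the 0.0584 reported above; the small difference is consistent with rounding of R2.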
Root-Mean-Square Error (RMSE)
The RMSE is the square root of the variance of the residuals. It indicates the absolute fit of the
model to the data, that is, how close the observed data points are to the model's predicted values. Whereas
R-squared is a relative measure of fit, RMSE is an absolute measure of fit. As the square root of a
variance, RMSE can be interpreted as the standard deviation of the unexplained variance and has
the useful property of being in the same units as the response variable. Lower values of RMSE
indicate better fit. RMSE is a good measure of how accurately the model predicts the response,
and it is the most important criterion for fit if the main purpose of the model is prediction.
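As a rough cross-check, the RMSE can be approximated from the residual standard deviation reported with the histogram (117096.3 over 20 observations). The sketch assumes that this standard deviation uses an n − 1 divisor, that the residual mean is zero (it is reported as -8.44e-11), and that the model has k = 5 regressors; none of these is stated in the report, and the function name is illustrative.

```python
import math

# Sketch: RMSE = sqrt(SSR / (n - k - 1)), with SSR recovered from the sample
# standard deviation of the residuals as (n - 1) * sd^2.
# Assumptions: n - 1 divisor for sd, zero-mean residuals, k = 5 regressors.


def rmse_from_residual_sd(residual_sd, n_obs, n_regressors):
    """Recover the regression RMSE from the residual standard deviation."""
    ssr = (n_obs - 1) * residual_sd ** 2
    return math.sqrt(ssr / (n_obs - n_regressors - 1))


rmse = rmse_from_residual_sd(117096.3, n_obs=20, n_regressors=5)

# RMSE relative to the mean of the response, using the figures quoted in the study.
share_of_mean = 140_000 / 587_100

print(f"RMSE = {rmse:,.0f}, reported RMSE / mean production = {share_of_mean:.3f}")
```

Under these assumptions the result is about 136,000, which is consistent with the rounded RMSE of 140,000 quoted in the study.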
In our study the RMSE is 140,000 and the mean of production is 587,100. Compared with the mean
this is a low value, about 24% of it, and we can conclude how close the observed data points are
Standard Errors
The standard error of each coefficient measures how far the estimated coefficient is likely to
deviate from the true population coefficient. The standard errors of the coefficients are as
follows.
se(β̂1) = 3168.967
se(β̂2) = 7738.076
se(β̂3) = 69257.63
se(β̂4) = 1482.788
se(β̂5) = 94653.97
Overall Significance of the Model
Probability F Test
α = 0.05
Decision Criteria:
Decision: Reject H0
Conclusion:
There is enough evidence to say that at least one independent variable included in the model
significantly affects the dependent variable, weekly revenue.
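The overall F test can be cross-checked from R2 alone. In the sketch below, R2 = 0.8367 and n = 20 are from the report; k = 5 regressors is an assumption inferred from the 14 degrees of freedom used elsewhere, and the critical value 2.96 is the approximate 5% point of F(5, 14) from standard tables rather than a figure from the report. The function name is illustrative.

```python
# Sketch of the overall F statistic expressed through R2.
# Assumptions: k = 5 regressors; F critical value of about 2.96 for (5, 14) df.

F_CRITICAL_5_14 = 2.96


def overall_f_statistic(r_squared, n_obs, n_regressors):
    """F = (R2 / k) / ((1 - R2) / (n - k - 1)), testing H0: all slopes are zero."""
    explained = r_squared / n_regressors
    unexplained = (1.0 - r_squared) / (n_obs - n_regressors - 1)
    return explained / unexplained


f_stat = overall_f_statistic(0.8367, n_obs=20, n_regressors=5)
reject_h0 = f_stat > F_CRITICAL_5_14

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

An F statistic of roughly 14 is far above the critical value, which is consistent with the decision to reject H0.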
Independent Variables Analysis
1. Number of laborers
The number of workers employed at these firms has a significant relationship with the output
they produce. Firms with more workers can gain the advantages of division of labor and thereby
increase productivity, resulting in comparatively higher levels of revenue. Therefore, the
analysis includes the number of workers as a key determinant of revenue in these firms. The impact
of labor in terms of cost was not considered because the salaries paid to workers vary; rather,
the number of laborers has been taken to show the relationship
2. Years of experience
Revenue changes with the years of experience in these firms: the p-value of the experience
variable is less than the 0.05 significance level. The level of experience was measured in years,
since this gives a reasonable idea of the firm's exposure to the furniture industry. The reason
behind this relationship is that an owner with more experience has a positive impact on
production. Firms with less experience are new to the industry and therefore lack awareness of
and exposure to it. So, a firm with a higher level of experience can earn higher revenue than a
firm with a lower level of
3. Timber quantity
The analysis shows that the amount of timber used by each firm does not have a statistically
significant relationship with revenue. Therefore, even though a relationship may exist, timber
quantity is not a significant determinant of revenue in these furniture firms according to the analysis.
4. Family inheritance
Since the p-value of this variable is greater than the 0.05 significance level, it does not have a
statistically significant relationship with revenue. So, even though there may be a relationship,
it is statistically insignificant, and the variable has no strong impact on revenue. A possible
reason is that firms inherited from their owners' families are more reluctant to implement changes
in their business or to adopt new strategies. They tend to continue with the same traditional
production methods and technology. They do not feel the need to change their established systems
and therefore become less efficient than firms that use modern technology and new features in
their furniture. On the other hand, avoiding change can in some instances benefit family-inherited
firms. Since they have been repeating the same manufacturing and selling processes, they have more
knowledge and skill to perfect their output than newly established firms, and they have better
brand recognition and customer loyalty. So in this case family inherent
5. Labor training
This variable was included to test the impact of the training programs the workers have received
on the firm's furniture production. The results show that it is statistically insignificant.
Therefore, the output of a worker who has received training does not differ significantly from
that of a worker who has not received any labor training.
The training hours or the training program itself do not affect the variations in the revenue
Checking for Errors
Multicollinearity
In a Classical Linear Regression Model, we assume, along with some other assumptions, that the
model is free from multicollinearity. The multicollinearity problem exists only in multiple
regression models. It occurs when the independent variables that determine the dependent variable
are very highly correlated with each other. When multicollinearity occurs, the standard errors of
the coefficients may be inflated.
If the mean VIF value and each individual VIF value are less than 10, there is no
multicollinearity problem. Since the mean VIF (1.26) and each individual variable's VIF value
(1.45, 1.31, 1.24, 1.23, 1.10) are less than 10, the model is free from the multicollinearity problem.
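The VIF rule of thumb can be sketched as follows. The five VIF values and the threshold of 10 are from the report; the helper names are illustrative. The first helper shows the usual definition VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing regressor j on the other regressors.

```python
# Sketch of the variance inflation factor (VIF) check.

VIF_THRESHOLD = 10.0  # rule-of-thumb cut-off used in the report


def vif_from_r_squared(auxiliary_r_squared):
    """Variance inflation factor implied by an auxiliary-regression R2."""
    return 1.0 / (1.0 - auxiliary_r_squared)


def has_multicollinearity(vifs, threshold=VIF_THRESHOLD):
    """Flag a problem when any individual VIF or the mean VIF reaches the threshold."""
    mean_vif = sum(vifs) / len(vifs)
    return max(vifs) >= threshold or mean_vif >= threshold


reported_vifs = [1.45, 1.31, 1.24, 1.23, 1.10]
mean_vif = sum(reported_vifs) / len(reported_vifs)

print(f"mean VIF = {mean_vif:.3f}, problem: {has_multicollinearity(reported_vifs)}")
```

The mean of the rounded values is about 1.27, slightly above the reported 1.26, presumably because the reported mean was computed from unrounded VIFs. Either way every value is far below 10.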
Heteroscedasticity
Heteroscedasticity means that the variance of the error term (ui) is not constant. If there is no
regression model are that the variance of each disturbance term ui, conditional on the chosen values
To identify a heteroscedasticity problem, we can use "rvfplot", which shows the pattern of the
residual variance graphically.
[Chart 3: Residuals and Fitted Values. Vertical axis: Residuals.]
As the chart shows, there is no observable pattern in the residuals. Therefore, there is no sign
of a heteroscedasticity problem.
Since reading the diagram is subjective, heteroscedasticity can also be tested with the Breusch-Pagan test.
. estat hettest
chi2(1) = 1.59
Prob > chi2 = 0.2075
If the p-value is less than 0.05, we reject the null hypothesis, which means the model has a
heteroscedasticity problem. In our model the p-value (0.2075) is greater than 0.05. Therefore,
we fail to reject the null hypothesis, which means there is no evidence of heteroscedasticity in the model.
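The link between the reported chi2(1) statistic and its p-value can be sketched with the standard library. For one degree of freedom, the upper-tail probability of a chi-square variable equals erfc(sqrt(x / 2)). The function names are illustrative and not part of the Stata output.

```python
import math

# Sketch: p-value of a chi-square statistic with 1 degree of freedom, followed by
# the p-value decision rule used throughout the report.


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def reject_null(p_value, alpha=0.05):
    """Reject H0 when the p-value is below the significance level."""
    return p_value < alpha


p_value = chi2_sf_1df(1.59)  # reported statistic: chi2(1) = 1.59

print(f"p-value = {p_value:.4f}, reject H0: {reject_null(p_value)}")
```

The result is close to the reported 0.2075; the small difference is consistent with the statistic being rounded to 1.59.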
Specification Bias
Specification bias means the specified model is not correct. Among many, omitted variable bias is
one form of specification bias. Testing for omitted variable bias is important for our model since
it is related to the assumption that the error term and the independent variables in the model are
not correlated.
. ovtest
If the p-value is less than 0.05, we reject the null hypothesis, which means the model has omitted
variables. In our model the p-value (0.3443) is greater than 0.05, so we do not reject H0.
Therefore, there is no evidence of omitted variables, and we can conclude that the model is correctly specified.
Testing Normality Assumption
From the normality assumption, we check whether the residuals are normally distributed,
. swilk e
If the p-value is less than 0.05, we reject the null hypothesis, which means the error terms are
not normally distributed. In our model the p-value (0.96) is greater than 0.05. That means we fail to
Conclusion
The furniture industry is one of the major industries in Sri Lanka and plays a significant role in
the Sri Lankan economy. It makes a significant contribution to the economy by creating employment
opportunities and increasing gross domestic product (GDP), mainly through the manufacturing
sector. The majority of furniture manufacturing firms are located in Moratuwa, which dominates
furniture manufacturing. This mini research focused on deriving the best-fitting multiple
regression line for the dependent variable, production value, based on the independent variables:
years of experience in the industry, number of laborers, family inheritance, quantity of timber
used, electricity payment and training experience.
The research is based mainly on primary data, and the survey method was used to collect data from
the Moratuwa area. The survey was based on interviewer-administered questionnaires given by the
Department of Business Economics. Our sample consists of 20 observations selected to represent the
population of furniture firms in the Moratumulla area. All the interviews were conducted as
one-to-one interviews.
We have taken both quantitative and qualitative data to analyze the growth of the furniture
industry in Moratumulla. As the dependent variable of the model we have taken the production
value, and as independent variables we have taken years of experience in the industry, number of
laborers, family inheritance, quantity of timber used, electricity payment and training
experience. All these variables are measured on a weekly basis, and dummy variables were used to
quantify the qualitative variables. We selected two independent variables, family inheritance and
training experience, as our qualitative variables. These variables are analyzed through a dummy
variable regression. To analyze the data collected through the survey, STATA and EVIEWS software
were used, and we were able to derive a best fitted multiple regression line
According to the results of the regression analysis, only years of experience and number of
laborers have a statistically significant relationship with the dependent variable, revenue.
Therefore, even though a relationship may exist, the other independent variables are not
significant determinants of revenue in these furniture firms. As for the explanatory power of the
model, the R2 indicates that 83.67% of the variance in the revenue of carpenters can be explained
by the independent variables mentioned above. The gap between R2 and adjusted R2 is 0.0584, which
is comparatively low, so the variables included contribute to the model's fit rather than merely
inflating R2. In addition, the p-value of the model's F statistic is less than 0.05, so there is
enough evidence to say that at least one independent variable included in the model significantly
affects the dependent variable, weekly revenue. The Jarque-Bera and Anderson-Darling tests were
used to check whether the residuals of the model are normally distributed. The results indicate
that, at the 5% level of significance, there is no evidence that the residuals depart from a
normal distribution. As for multicollinearity, since the mean VIF and each individual variable's
VIF are less than 10, our model is free from the multicollinearity problem. At the same time, the
model is not subject to the heteroscedasticity problem: the p-value from the Breusch-Pagan test is
greater than 0.05, so we fail to reject the null hypothesis. Therefore, as a whole it is concluded
that the regression model is best
Recommendations
• Through our survey we found that most businesses do not use
marketing, business websites, or online buying and selling platforms to increase the
• Most manufacturers face the problem of not having experienced workers. So, it is
• It is identified that firms which have been inherited from their owners' families are more
reluctant to implement changes in their business or to adopt new strategies. They tend to
continue the same traditional production methods and technology. So, in order to attract
new customers while maintaining their long-term loyal customers we recommend them to
Limitations
When carrying out the mini research project, the team faced the following limitations:
• The respondents to the questionnaire sometimes declined to answer certain questions in the
• The respondents tended to give average answers to the questionnaire rather than answering
Bibliography
Amarasekara, H., 2012. A study on the status of furniture manufacturing industry in Moratuwa
Dasanayaka, S. W. S. B., 2011. Identification of barriers for development of the Sri Lankan small
and medium scale furniture and wooden products manufacturing enterprises, a study based on the
H.S.C.Perera, 2009. Manufacturing Strategy and Improvement Activities of Sri Lankan Furniture
Padmasiri, H. N., 2012. The role of human and social capital on the development of wooden
furniture clusters in Sri Lanka. International Journal of Development, Volume 11, pp. 19-36.
Shantha, A., 2013. Resource use efficiency of small scale furniture industry in Sri Lanka. s.l., s.n.
Appendix
Appendix 1: Images of Moratumulla visit
Appendix 2: Questionnaire