Keywords: Best-Fitting Model; Forecasting; Linear Regression; Non-Linear Regression
JEL Classification: M10
1. Introduction and Model Estimation for the Linear Model
Regression analysis derives an equation that connects the value of one dependent variable (Y) to the values of one independent variable (X); this covers the linear model and some non-linear models. It starts with a given bivariate data set and uses the Least Squares Method to assign the best possible values to the unknown multipliers found in the models we wish to estimate. The bivariate data, used to estimate the linear model and some non-linear models, consist of n ordered pairs of values:
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
The linear model we wish to estimate, using the given data, is:
$\hat{y} = a + bx$ (1)
while the non-linear models of interest are given by
$\hat{y} = ke^{cx}$ (Exponential Model) (2)
$\hat{y} = ax^{b}$ (Power Model) (3)
and
$\hat{y} = a + bx + cx^{2}$ (Quadratic Model) (4)
To estimate model (1) we use the Least Squares Methodology, which calls for the formation of the quadratic function:
$Q(a,b) = \sum_{i=1}^{n} (y_i - a - bx_i)^2$ (5)
To derive the “normal” equations for the linear model, from which the values of a and b of the linear model are obtained, we take the partial derivatives of Q(a,b) in equation (5) with respect to a and b, set each equal to zero, and then simplify.
The result is:
$\frac{\partial Q}{\partial a} = -2\sum_{i=1}^{n} (y_i - a - bx_i)$ (6)
and
$\frac{\partial Q}{\partial b} = -2\sum_{i=1}^{n} x_i (y_i - a - bx_i)$ (7)
When (6) and (7) are set equal to zero and simplified, we obtain the “Normal” equations for the linear model:
$na + b\sum x_i = \sum y_i$ (8)
$a\sum x_i + b\sum x_i^2 = \sum x_i y_i$ (9)
The only unknowns in equations (8) and (9) are a and b, and the two equations are solved simultaneously for them, thus deriving (or estimating) the linear model. All the other quantities in equations (8) and (9) come from the given data, where:
n = number of ordered pairs $(x_i, y_i)$
$\sum x_i$ = sum of the x values
$\sum y_i$ = sum of the y values
$\sum x_i^2$ = sum of the given x values, which are first squared
$\sum x_i y_i$ = sum of the products of the x_{i} and y_{i} values in each ordered pair.
Note: The values of (a) and (b) obtained from the Normal equations correspond to a minimum value for the Quadratic function Q (a,b) given by equation (5), as can be easily demonstrated by using the Optimization methodology of Differential Calculus for functions of 2 independent variables.
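The minimum claimed in the note can be checked with the second-derivative test for a function of two variables. The sketch below assumes the usual least-squares criterion $Q(a,b) = \sum (y_i - a - bx_i)^2$ and writes $\bar{x}$ for the mean of the x values (a symbol introduced here for convenience):

```latex
\begin{aligned}
Q_{aa} &= \frac{\partial^2 Q}{\partial a^2} = 2n, \qquad
Q_{bb} = \frac{\partial^2 Q}{\partial b^2} = 2\sum x_i^2, \qquad
Q_{ab} = \frac{\partial^2 Q}{\partial a\,\partial b} = 2\sum x_i, \\[4pt]
D &= Q_{aa}Q_{bb} - Q_{ab}^2
   = 4\Big[n\sum x_i^2 - \Big(\sum x_i\Big)^2\Big]
   = 4n\sum (x_i - \bar{x})^2 .
\end{aligned}
```

Since $D > 0$ whenever the x values are not all identical, and $Q_{aa} = 2n > 0$, the stationary point defined by the normal equations is a minimum of Q.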
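To complete the estimation of the linear model we need to find the standard deviations of a and b, σ (a) and σ (b), which are needed for testing the significance of the model. The standard deviations σ (a) and σ (b) are given by:
$\sigma(a) = \hat{\sigma}\sqrt{\dfrac{\sum x_i^2}{n\sum x_i^2 - \left(\sum x_i\right)^2}}$ (10)
and
$\sigma(b) = \hat{\sigma}\sqrt{\dfrac{n}{n\sum x_i^2 - \left(\sum x_i\right)^2}}$, (11)
where:
$\hat{\sigma}^2 = \dfrac{\sum y_i^2 - a\sum y_i - b\sum x_i y_i}{n - 2}$ (12)
The a and b in equation (12) come from the solution of equations (8) and (9), while $\sum y_i^2$, $\sum y_i$, and $\sum x_i y_i$ come directly from the given bivariate data.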
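The estimation steps of this section can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own procedure: the function name `fit_linear` is hypothetical, and the standard-deviation formulas are the standard least-squares expressions, assumed here to correspond to equations (10) to (12). The data are the five height/weight pairs of the example in Section 2.3.

```python
import math

def fit_linear(xs, ys):
    """Solve the normal equations (8) and (9) for a and b, then estimate
    sigma^2, sigma(a) and sigma(b) with the standard least-squares formulas."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)

    det = n * sxx - sx * sx  # determinant of the 2x2 normal-equation system
    if det == 0:
        raise ValueError("all x values are identical; b cannot be estimated")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n

    sigma2 = (syy - a * sy - b * sxy) / (n - 2)  # residual variance estimate
    se_a = math.sqrt(sigma2 * sxx / det)         # standard deviation of a
    se_b = math.sqrt(sigma2 * n / det)           # standard deviation of b
    return a, b, sigma2, se_a, se_b

heights = [64, 65, 66, 67, 68]
weights = [130, 145, 150, 165, 170]
a, b, sigma2, se_a, se_b = fit_linear(heights, weights)
print(a, b, sigma2, se_a, se_b)
```

Solving the 2×2 system in closed form avoids any dependency on a linear-algebra library; for larger models a general solver would be the more natural choice.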
2. Model Testing
Now that our model of interest has been estimated, we need to test for the significance of the terms found in the estimated model. This is very important because the results of this testing will determine the final equation which will be retained and used for Forecasting purposes.
Testing of the linear model consists of the following steps:
2.1. Testing for the significance of each term separately
Here we test the hypotheses:
1. H_{0}: β = 0 vs H_{1}: β ≠ 0, and
2. H_{0}: α = 0 vs H_{1}: α ≠ 0, based on our knowledge of b, σ (b), a, and σ (a). (In these hypotheses β and α denote the true slope and intercept, which b and a estimate; elsewhere α denotes the significance level.)
If n ≥ 30, we calculate
$Z^{*}_{b} = \dfrac{b}{\sigma(b)}$
and
$Z^{*}_{a} = \dfrac{a}{\sigma(a)}$
and compare each to Z_{α/2} (where Z_{α/2} is a value obtained from the standard Normal Table when the significance level α, or the confidence level 1 – α, is specified).
For example, if α = 0.05, Z_{α/2} = Z_{0.025} = 1.96; if α = 0.10, Z_{α/2} = Z_{0.05} = 1.645; if α = 0.02, Z_{α/2} = Z_{0.01} = 2.33; and if α = 0.01, Z_{α/2} = Z_{0.005} = 2.58.
If Z^{*}_{b} > Z_{α/2} (or Z^{*}_{b} < -Z_{α/2}), the hypothesis H_{0}: β = 0 is rejected and we conclude that β ≠ 0 and the term bx (in the estimated model ŷ = a + bx) is important for the calculation of the value of y. Similarly, if Z^{*}_{a} > Z_{α/2} (or Z^{*}_{a} < -Z_{α/2}), H_{0}: α = 0 is rejected, and we conclude that the linear equation ŷ = a + bx does not go through the origin.
If n < 30, we calculate
$t^{*}_{b} = \dfrac{b}{\sigma(b)}$
and
$t^{*}_{a} = \dfrac{a}{\sigma(a)}$
and compare each to t_{n-2}(α/2), for a given α value, where t_{n-2}(α/2) is obtained from the t-distribution table, with the same interpretation for H_{0}: β = 0 and H_{0}: α = 0 as above.
Instead of hypothesis testing, we can construct Confidence Intervals for β and α. If n ≥ 30, we use the equations:
$b - Z_{\alpha/2}\,\sigma(b) \le \beta \le b + Z_{\alpha/2}\,\sigma(b)$ (13)
and
$a - Z_{\alpha/2}\,\sigma(a) \le \alpha \le a + Z_{\alpha/2}\,\sigma(a)$ (14)
or, if n < 30,
$b - t_{n-2}(\alpha/2)\,\sigma(b) \le \beta \le b + t_{n-2}(\alpha/2)\,\sigma(b)$ (15)
and
$a - t_{n-2}(\alpha/2)\,\sigma(a) \le \alpha \le a + t_{n-2}(\alpha/2)\,\sigma(a)$ (16)
If the hypothesized value β = 0 falls inside the Confidence Interval given by equation (13) or (15), the hypothesis H_{0}: β = 0 is not rejected and we conclude that β = 0 (i.e. the term bx is not important for the calculation of y). Likewise, if the hypothesized value α = 0 falls inside the Confidence Interval given by equation (14) or (16), H_{0}: α = 0 is not rejected and we conclude that α = 0 (i.e. the line goes through the origin).
If, for a given data set, we perform the tests discussed above, we will obtain one of 4 possible conclusions:
A) H_{0}: β = 0 and H_{0}: α = 0 are both rejected. Therefore β ≠ 0 and α ≠ 0, and both the terms a and bx are important to the calculation of y. In this case the final equation is ŷ = a + bx, with both terms staying in the equation.
B) H_{0}: β = 0 is rejected, but H_{0}: α = 0 is not rejected. Therefore β ≠ 0 but α = 0, and the term a is not important to the calculation of y. In this case the final equation is ŷ = bx, with the term a dropping out of the equation.
C) H_{0}: β = 0 is not rejected, but H_{0}: α = 0 is rejected. Therefore β = 0 and the term bx is not important for the calculation of y, while α ≠ 0 and the term a is important to the calculation of y. In this case the final equation is ŷ = a, with the term bx dropping out of the equation.
D) H_{0}: β = 0 and H_{0}: α = 0 are both not rejected. Therefore β = 0 and α = 0, and both terms a and bx are not important to the calculation of y. In this case the final equation will be ŷ = 0, with both terms a and bx dropping out of the equation.
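The four outcomes A to D can be expressed as a small decision routine. The sketch below is illustrative only: the function names are hypothetical, the critical value is passed in by the caller (taken from a Normal or t table, as the text describes), and the sample call uses the values reported for the height/weight example later in the paper (a = -508, σ(a) = 66.02, b = 10, σ(b) = 1, t_{3}(0.025) = 3.1824).

```python
def confidence_interval(estimate, std_dev, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * std_dev."""
    half_width = critical_value * std_dev
    return estimate - half_width, estimate + half_width

def final_equation(a, se_a, b, se_b, critical_value):
    """Classify the fitted line into case A, B, C or D.

    A term is kept when zero lies outside its confidence interval,
    i.e. when the corresponding null hypothesis is rejected.
    """
    lo_a, hi_a = confidence_interval(a, se_a, critical_value)
    lo_b, hi_b = confidence_interval(b, se_b, critical_value)
    keep_a = not (lo_a <= 0.0 <= hi_a)
    keep_b = not (lo_b <= 0.0 <= hi_b)
    if keep_a and keep_b:
        return "A", "y = a + bx"
    if keep_b:
        return "B", "y = bx"
    if keep_a:
        return "C", "y = a"
    return "D", "y = 0"

case, equation = final_equation(-508.0, 66.02, 10.0, 1.0, 3.1824)
print(case, equation)
```

Checking whether zero lies inside the interval is equivalent to comparing the test statistic with the critical value, so one routine covers both the hypothesis-testing and the confidence-interval formulation.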
2.2. Testing for the Significance of the Entire Linear Equation
This test consists of testing the hypothesis:
1. H_{0}: α = β = 0 vs H_{1}: α and β are not both equal to 0, or
2. H_{0}: The Entire Regression equation is not significant vs H_{1}: The Entire Regression equation is significant
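For a given bivariate data set and a given α value, we need to first calculate:
$TSS = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}$ (17)
$RSS_b = \sum (\hat{y}_i - \bar{y})^2$ (18)
$ESS = \sum (y_i - \hat{y}_i)^2 = TSS - RSS_b$ (19)
$SS_a = n\bar{y}^2 = \dfrac{\left(\sum y_i\right)^2}{n}$ (20)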
Then we calculate:
$F^{*}_{Total} = \dfrac{(SS_a + RSS_b)/2}{ESS/(n-2)}$ (21)
and compare F^{*}_{Total} to F^{2}_{n-2} (α), which is a tabulated value for a specified α value. If F^{*}_{Total} > F^{2}_{n-2} (α), we reject H_{0} and conclude that the entire regression equation (i.e. ŷ = a + bx) is significant; that is, the constant term a and the term bx are simultaneously significant to the calculation of the y value.
Note 1:
When TSS, RSS_{b}, and ESS are known, we can also define the coefficient of determination R², where:
$R^2 = \dfrac{RSS_b}{TSS} = 1 - \dfrac{ESS}{TSS}$ (22)
where 0 ≤ R^{2} ≤ 1, which tells us how well the regression equation ŷ = a + bx fits the given bivariate data. A value of R^{2} close to 1 implies a good fit.
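As an illustration of Note 1, the sketch below computes the three sums of squares and R² for a fitted line. It assumes the usual definitions (total, regression, and error sums of squares about the mean of y); the function name `sums_of_squares` is hypothetical, and the data and coefficients are those of the height/weight example used later in the paper.

```python
def sums_of_squares(xs, ys, a, b):
    """Return (TSS, RSS_b, ESS) for the fitted line y_hat = a + b*x."""
    n = len(ys)
    y_bar = sum(ys) / n
    fitted = [a + b * x for x in xs]
    tss = sum((y - y_bar) ** 2 for y in ys)               # total variation
    rss_b = sum((f - y_bar) ** 2 for f in fitted)         # explained by the line
    ess = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # left unexplained
    return tss, rss_b, ess

heights = [64, 65, 66, 67, 68]
weights = [130, 145, 150, 165, 170]
tss, rss_b, ess = sums_of_squares(heights, weights, -508.0, 10.0)
r_squared = rss_b / tss
print(tss, rss_b, ess, r_squared)
```

For a least-squares line with an intercept, TSS = RSS_{b} + ESS, so R² can equally be computed as 1 – ESS/TSS.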
Note 2:
(23)
2.3. A Bivariate Example
A sample of 5 adult men for whom heights and weights are measured gives the following results (Table 1):
Table 1. Given bivariate data set (n =5)
x = H | y = W | x^{2}=H^{2} | y^{2}=W^{2} | xy = HW |
64 | 130 | 64² | 130² | 64 x 130 |
65 | 145 | 65² | 145² | 65 x 145 |
66 | 150 | 66² | 150² | 66 x 150 |
67 | 165 | 67² | 165² | 67 x 165 |
68 | 170 | 68² | 170² | 68 x 170 |
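For this Bivariate Data set we have: n = 5, $\sum x_i = 330$, $\sum y_i = 760$, $\sum x_i^2 = 21{,}790$, $\sum x_i y_i = 50{,}260$, and $\sum y_i^2 = 116{,}550$.
To obtain the linear equation ŷ = a + bx, we substitute the values of n, $\sum x_i$, $\sum y_i$, $\sum x_i^2$, and $\sum x_i y_i$
into equations (8) and (9) and obtain:
$5a + 330b = 760$
$330a + 21{,}790b = 50{,}260$
When these equations are solved simultaneously we obtain: a = -508 and b = 10, and the regression equation is
$\hat{y} = -508 + 10x$.
Then, using the values of a = -508, b = 10, and $\sum y_i^2$, $\sum y_i$, and $\sum x_i y_i$, we obtain from equation (12):
$\hat{\sigma}^2 = \dfrac{116{,}550 - (-508)(760) - (10)(50{,}260)}{5 - 2} = \dfrac{30}{3} = 10$
and from equations (10) and (11):
$\sigma(a) = \sqrt{10}\,\sqrt{\dfrac{21{,}790}{5(21{,}790) - (330)^2}} = \sqrt{4358} \approx 66.02$ and $\sigma(b) = \sqrt{10}\,\sqrt{\dfrac{5}{50}} = 1$
Since n = 5 < 30, a and b are distributed as t variables with n – 2 = 3 degrees of freedom, and when α = 0.05, t_{3} (α/2) = t_{3} (0.025) = ±3.1824.
Then the hypotheses H_{0}: β = 0 vs. H_{1}: β ≠ 0, and H_{0}: α = 0 vs. H_{1}: α ≠ 0 are both rejected because:
$t^{*}_{b} = \dfrac{b}{\sigma(b)} = \dfrac{10}{1} = 10 > 3.1824$
and
$t^{*}_{a} = \dfrac{a}{\sigma(a)} = \dfrac{-508}{66.02} \approx -7.70 < -3.1824$
Therefore, the final equation is
$\hat{y} = -508 + 10x$.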
To test for the significance of the entire equation, and to calculate the coefficient of determination, we first evaluate TSS, RSS_{b}, ESS, and SS_{a} using equations (17) – (20) and obtain:
$TSS = 1030$, $RSS_b = 1000$, $ESS = 30$, $SS_a = (760)^2/5 = 115{,}520$
From equation (22), we obtain R^{2} = 1000/1030 ≈ 0.971, which tells us that 97% of the variation in the values of Y can be explained (or accounted for) by the variable X included in the regression equation, and only 3% is due to other factors. Since R^{2} is close to 1, the fit of the equation to the data is very good.
Note:
The correlation coefficient r, which measures the strength of the linear relationship between Y and X, is related to the coefficient of determination by:
$r = \pm\sqrt{R^2}$ (with the sign of b), so that $r = +\sqrt{0.971} \approx 0.985$
for this example. Clearly X and Y are very strongly linearly related.
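Using equation (21) we obtain:
$F^{*}_{Total} = \dfrac{(115{,}520 + 1000)/2}{30/3} = \dfrac{58{,}260}{10} = 5826$
When F^{*}_{Total} is compared to $F^{2}_{3}(0.05) = 9.55$,
H_{0} (The entire regression equation is not significant) is rejected, and we conclude that the entire regression equation is significant.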
3. MINITAB Solution to the Linear Regression Problem
We enter the given data and issue the regression command as shown in Table 2.
Table 2.Data set in MINITAB
MTB Set C1 |
DATA 64 65 66 67 68 |
DATA end |
MTB set C2 |
DATA 130 145 150 165 170 |
DATA end |
MTB Name C1 'X' C2 'Y' |
MTB REGRESS 'Y' 1 'X' |
and obtain the MINITAB output presented in Table 3, Table 4, and Figure 1.
Table 3. Regression Analysis: Y versus X
Regression equation: | Y = – 508 + 10.0 X | ||||
Predictor | Coef | SE Coef | T | p | |
Constant | -508.000 | 66.020 | -7.700 | 0.005 | |
X | 10.000 | 1.000 | 10.000 | 0.002 | |
Regression fit: | |||||
S | R-Sq | R-Sq (adj) | |||
3.162 | 97.1% | 96.1% | |||
Analysis of Variance: | |||||
Source | DF | SS | MS | F | p |
Regression | 1 | 1000.0 | 1000.0 | 100.0 | 0.002 |
Residual Error | 3 | 30.0 | 10.0 | ||
Total | 4 | 1030.0 |
Table 4. Correlations: Y, X
Pearson correlation of Y and X | 0.985 |
P-Value | 0.002 |
Figure 1. Plot of Y versus X
When we compare the MINITAB and hand solutions, they are identical. We obtain the same equation ŷ = -508 + 10x, the same standard deviations for a and b (under SE Coef), the same t values, the same R^{2}, and the same s = σ̂ = 3.162, with σ̂^{2} = 10. Notice also that the Analysis of Variance table provides the values for RSS_{b}, ESS, and TSS. The only value missing is SS_{a}, which can be easily calculated from
$SS_a = n\bar{y}^2 = \dfrac{\left(\sum y_i\right)^2}{n} = \dfrac{(760)^2}{5} = 115{,}520$.
The MINITAB solution also gives a p-value for each coefficient. The p-value is called the “Observed Level of Significance” and represents the probability, when H_{0} is true, of obtaining a value at least as extreme as the observed value of the test statistic. For example, the p-value for the predictor X is calculated as p = 0.002, and it is given by:
$p = P(t_{3} > 10) + P(t_{3} < -10) = 2P(t_{3} > 10) \approx 0.002$ (24)
The p-value has the following connection to the selected α value:
If p ≥ α, do not reject H_{0}
If p < α, reject H_{0}
Since p = 0.002 < α = 0.05, H_{0}: β = 0 will be rejected.
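The two-sided p-values printed by MINITAB can be reproduced by hand for this example, because Student's t distribution with exactly 3 degrees of freedom has a closed-form cumulative distribution function. The sketch below relies on that closed form, so it is valid only for n – 2 = 3; the function names are hypothetical, and for other degrees of freedom a statistical library or table would be needed.

```python
import math

def t3_cdf(t):
    """CDF of Student's t with exactly 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

def two_sided_p(t_star):
    """Probability of a t_3 value at least as extreme as t_star, in either tail."""
    return 2.0 * (1.0 - t3_cdf(abs(t_star)))

p_slope = two_sided_p(10.0)       # t value reported for the predictor X
p_intercept = two_sided_p(-7.70)  # t value reported for the constant
alpha = 0.05
print(p_slope, p_slope < alpha)
print(p_intercept, p_intercept < alpha)
```

Under these assumptions the two results should round to the 0.002 and 0.005 shown in the MINITAB output, and both fall below α = 0.05.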
4. Introduction and Model Estimation for Some Non-Linear Models of Interest
Sometimes two variables are related but their relationship is not linear, and trying to fit a linear equation to a data set that is inherently non-linear will result in a bad fit. But, because non-linear regression is, in general, much more difficult than linear regression, we explore in this part of the paper estimation methods that allow us to fit non-linear equations to a data set by using the results of linear regression, which is much easier to understand and analyze.
This becomes possible by first performing logarithmic transformations of the non-linear equations, which change the non-linear equations into linear ones, and then using the normal equations of the linear model to generate the normal equations of the “linearized” non-linear equations, from which the values of the unknown model parameters can be obtained. In this paper we show how the exponential model, ŷ = ke^{cx}, and the power model, ŷ = ax^{b} (for b ≠ 1), can be easily estimated by using logarithmic transformations to first derive the linearized versions of the above non-linear equations, namely:
$\ln \hat{y} = \ln k + cx$
and
$\ln \hat{y} = \ln a + b \ln x$,
and then comparing these to the original linear equation, ŷ = a + bx, and its normal equations (see equations (8) and (9)).
Also discussed is the quadratic model, ŷ = a + bx + cx^{2}, which, even though it is a non-linear model, can be handled directly using the linear methodology. But now we have to solve simultaneously a system of 3 equations in 3 unknowns, because the normal equations for the quadratic model become:
$na + b\sum x_i + c\sum x_i^2 = \sum y_i$
$a\sum x_i + b\sum x_i^2 + c\sum x_i^3 = \sum x_i y_i$ (25)
$a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4 = \sum x_i^2 y_i$
A procedure is also discussed which allows us to fit these four models (i.e. linear, exponential, power, quadratic), and possibly others, to the same data set, and then select the equation which fits the data set “best”. These four models are used extensively in forecasting and, because of this, it is important to understand how these models are constructed and how MINITAB can be used to estimate such models efficiently.
4.1. The Linear Model and its Normal Equations
The linear model and the normal equations associated with it, as explained above, are given by:
Linear Model
$\hat{y} = a + bx$ (1)
Normal Equations
$na + b\sum x_i = \sum y_i$ (8)
$a\sum x_i + b\sum x_i^2 = \sum x_i y_i$ (9)
4.2. The Exponential Model
The exponential model is defined by the equation:
$\hat{y} = ke^{cx}$ (26)
Our objective is to use the given data to find the best possible values for k and c, just as our objective in equation (1) was to use the data to find the best (in the least-square sense) values for a and b.
Taking natural logarithms (i.e. logarithms to the base e) of both sides of equation (26) we obtain
$\ln(\hat{y}) = \ln(ke^{cx})$
or
$\ln \hat{y} = \ln\left[(k)(e^{cx})\right]$ (27)
4.2.1. Logarithmic Laws
To simplify equation (27), we have to use some of the following laws of logarithms:
i) log (A∙B) = log A + log B (28)
ii) log (A/B) = log A – log B (29)
iii) log (A^{n}) = n log A (30)
Then, using equation (28) we can re-write equation (27) as:
$\ln \hat{y} = \ln k + \ln(e^{cx})$ (31)
and, by applying equation (30) to the second term on the right-hand side of equation (31), equation (31) can be written finally as:
$\ln \hat{y} = \ln k + cx \ln e$
or
$\ln \hat{y} = \ln k + cx$ (32)
(because ln e = log_{e} e = 1)
Even though equation (26) is non-linear, as can be verified by plotting y against x, equation (32) is linear (i.e. the logarithmic transformation changed equation (26) from non-linear to linear) as can be verified by plotting: ln y against x.
But, if equation (32) is linear, it should be similar to equation (1), and must have a set of normal equations similar to the normal equations of the linear model (see equations (8) and (9)).
Question: How are these normal equations going to be derived?
Answer: We will compare the “transformed linear model”, i.e. equation (32), to the actual linear model (equation (1)), note the differences between these two models, and then make the appropriate changes to the normal equations of the linear model to obtain the normal equations of the “transformed linear model”.
4.2.2. Comparison of the Logarithmic Transformed Exponential Model to the Linear Model
To make the comparison easier, we list below the 2 models under consideration, namely:
a) Original Linear Model:
$\hat{y} = a + bx$ (1)
b) Transformed Linear Model:
$\ln \hat{y} = \ln k + cx$ (32)
Comparing equations (1) and (32), we note the following three differences between the two models:
i. y in equation (1) has been replaced by ln y in equation (32)
ii. a in equation (1) has been replaced by ln k in equation (32)
iii. b in equation (1) has been replaced by c in equation (32)
4.2.3. Normal Equations of Exponential Model
When the three changes listed above are applied to the normal equations of the actual linear model (equations (8) and (9)), we obtain the normal equations of the “transformed linear model”, which are:
$n \ln k + c\sum x_i = \sum \ln y_i$ (33)
$(\ln k)\sum x_i + c\sum x_i^2 = \sum x_i \ln y_i$ (34)
In equations (33) and (34) all the quantities are known numbers, derived from the given data as will be shown later, except for ln k and c, and equations (33) and (34) must be solved simultaneously for ln k and c.
Suppose that for a given data set, the solution to equations (33) and (34) produced the values:
ln k = 0.3 and c = 1.2 (35)
If we examine the exponential model (equation (26)), we observe that the value of c = 1.2 can be substituted directly into equation (26), but we do not yet have the value of k; instead we have the value of ln k = 0.3!
Question: If we know ln k = 0.3, how do we find the value of k?
Answer: If ln k = 0.3, then: k = e^{0.3} ≈ (2.718281828)^{0.3} ≈ 1.349859
Therefore, now that we have both the k and c values, the non-linear model, given by equation (26), has been completely estimated.
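The whole procedure for the exponential model (transform, solve the linear normal equations, back-transform) can be sketched as follows. The function name `fit_exponential` and the synthetic data are illustrative assumptions; the data are generated from k = 2 and c = 0.5 without noise, so the fit should recover those values up to floating-point rounding.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = k * exp(c * x) by least squares on the pairs (x, ln y).

    Every y must be positive, otherwise ln y is undefined.
    """
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sly = sum(log_ys)
    sxly = sum(x * ly for x, ly in zip(xs, log_ys))

    det = n * sxx - sx * sx
    c = (n * sxly - sx * sly) / det   # plays the role of b in the linear model
    ln_k = (sly - c * sx) / n         # plays the role of a in the linear model
    return math.exp(ln_k), c          # back-transform ln k to k

xs = [0, 1, 2, 3]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
k, c = fit_exponential(xs, ys)
print(k, c)
```

Note that this minimizes the squared errors of ln y rather than of y itself, which is exactly what the linearization in this section does.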
4.3. The Power Model
Another non-linear model, which can be analyzed in a similar manner, is the Power Model defined by the equation:
$\hat{y} = ax^{b}$ (36)
which is non-linear if b ≠ 1 and, as before, we must obtain the best possible values for a and b (in the least-square sense) using the given data.
4.3.1. Logarithmic Transformation of Power Model
A logarithmic transformation of equation (36) produces the “transformed linear model”
$\ln \hat{y} = \ln a + b \ln x$ (37)
When equation (37) is compared to equation (1), we note the following 3 changes:
i. y in equation (1) has been replaced by ln y in equation (37)
ii. a in equation (1) has been replaced by ln a in equation (37) (38)
iii. x in equation (1) has been replaced by ln x in equation (37)
When the changes listed in (38) are substituted into equations (8) and (9), we obtain the normal equations for this “transformed linear model”, which are given by equations (39) and (40) below:
4.3.2. Normal Equations of Power Model
$n \ln a + b\sum \ln x_i = \sum \ln y_i$ (39)
$(\ln a)\sum \ln x_i + b\sum (\ln x_i)^2 = \sum (\ln x_i)(\ln y_i)$ (40)
Equations (39) and (40) must be solved simultaneously for (ln a) and b.
If, for example, the solution gives ln a = 0.4, then a = e^{0.4} ≈ (2.718281828)^{0.4} ≈ 1.491825 and, since we have numerical values for both a and b, the non-linear model defined by equation (36) has been completely estimated.
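A corresponding sketch for the power model follows; here both variables are log-transformed before the linear normal equations are solved. The function name `fit_power` and the noise-free synthetic data (generated from a = 2, b = 3) are assumptions made for illustration.

```python
import math

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on the pairs (ln x, ln y).

    Every x and every y must be positive.
    """
    n = len(xs)
    u = [math.log(x) for x in xs]   # ln x replaces x
    v = [math.log(y) for y in ys]   # ln y replaces y
    su, sv = sum(u), sum(v)
    suu = sum(ui * ui for ui in u)
    suv = sum(ui * vi for ui, vi in zip(u, v))

    det = n * suu - su * su
    b = (n * suv - su * sv) / det
    ln_a = (sv - b * su) / n        # ln a replaces a
    return math.exp(ln_a), b

xs = [1, 2, 3, 4, 5]
ys = [2 * x ** 3 for x in xs]
a, b = fit_power(xs, ys)
print(a, b)
```

When the x values span a narrow range, the ln x values are nearly equal and `det` becomes very small, so the sums should be carried at full precision rather than rounded to a few decimals.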
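4.4. Derivation of the normal equations for the Quadratic model, y = a + bx + cx^{2}
To derive the normal equations of the quadratic model, first form the function
$Q(a,b,c) = \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2$ (41)
Then take the partial derivatives $\partial Q/\partial a$, $\partial Q/\partial b$, and $\partial Q/\partial c$, and set each equal to 0, to obtain the 3 equations needed to solve for a, b, c.
We obtain:
$\dfrac{\partial Q}{\partial a} = -2\sum (y_i - a - bx_i - cx_i^2) = 0$
or:
$na + b\sum x_i + c\sum x_i^2 = \sum y_i$ (42)
$\dfrac{\partial Q}{\partial b} = -2\sum x_i (y_i - a - bx_i - cx_i^2) = 0$
or:
$a\sum x_i + b\sum x_i^2 + c\sum x_i^3 = \sum x_i y_i$ (43)
$\dfrac{\partial Q}{\partial c} = -2\sum x_i^2 (y_i - a - bx_i - cx_i^2) = 0$
or:
$a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4 = \sum x_i^2 y_i$ (44)
Equations (42), (43), and (44) are identical to equation (25).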
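The three normal equations can be solved mechanically. The sketch below builds the standard quadratic least-squares system and solves it by Gauss-Jordan elimination; the function name `fit_quadratic` is hypothetical, and the use of exact rational arithmetic (`fractions.Fraction`) is a design choice made here, since sums such as Σx⁴ grow large and floating-point elimination can lose digits. The sample data are the height/weight pairs used in the paper's examples.

```python
from fractions import Fraction

def fit_quadratic(xs, ys):
    """Solve the three normal equations of y = a + b*x + c*x**2 exactly."""
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    n = Fraction(len(xs))
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x ** 2 * y for x, y in zip(xs, ys))

    # Augmented matrix: one row per normal equation, last column = right side.
    m = [[n, s1, s2, t0],
         [s1, s2, s3, t1],
         [s2, s3, s4, t2]]

    # Gauss-Jordan elimination. The pivots are nonzero as long as the data
    # contain at least three distinct x values.
    for i in range(3):
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for j in range(3):
            if j != i:
                factor = m[j][i]
                m[j] = [vj - factor * vi for vj, vi in zip(m[j], m[i])]
    return m[0][3], m[1][3], m[2][3]

a, b, c = fit_quadratic([64, 65, 66, 67, 68], [130, 145, 150, 165, 170])
print(a, b, c)
```

Because the results are returned as exact fractions, they can be compared directly with hand-derived coefficients before being converted to decimals with `float()`.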
4.5. Data Utilization in Estimating the 4 Models
To generate the quantities needed to estimate the 4 models:
a. The Linear Model
b. The Exponential Model
c. The Power Model,
d. The Quadratic Model,
the given (x, y) bivariate data must be “manipulated” as shown in Tables: 5, 6, 7, and 8, respectively.
4.5.1. Given Data to Evaluate the Linear Model
Table 5. Manipulation of Given Data to Evaluate the Linear Model
x | y | xy | x^{2} |
x_{1} | y_{1} | x_{1}y_{1} | x_{1}^{2} |
x_{2} | y_{2} | x_{2}y_{2} | x_{2}^{2} |
x_{3} | y_{3} | x_{3}y_{3} | x_{3}^{2} |
… | … | … | … |
x_{n} | y_{n} | x_{n}y_{n} | x_{n}^{2} |
N_{1} | N_{2} | N_{3} | N_{4} |
To evaluate y = a + bx, substitute: N_{1}, N_{2}, N_{3}, N_{4} into equations (8) and (9) and solve for a and b simultaneously.
4.5.2. Given Data to Evaluate the Exponential Model
Table 6. Manipulation of Given Data to Evaluate the Exponential Model
x | y | x^{2} | ln y | x ln y |
x_{1} | y_{1} | x_{1}^{2} | ln y_{1} | x_{1} ln y_{1} |
x_{2} | y_{2} | x_{2}^{2} | ln y_{2} | x_{2} ln y_{2} |
x_{3} | y_{3} | x_{3}^{2} | ln y_{3} | x_{3} ln y_{3} |
… | … | … | … | … |
x_{n} | y_{n} | x_{n}^{2} | ln y_{n} | x_{n} ln y_{n} |
N_{5} | N_{6} | N_{7} | N_{8} | N_{9} |
To evaluate ŷ = ke^{cx}, substitute N_{5}, N_{7}, N_{8}, N_{9} into equations (33) and (34) and solve for ln k and c simultaneously.
4.5.3. Given Data to Evaluate the Power Model
Table 7. Manipulation of Given Data to Evaluate the Power Model
x | y | ln x | (ln x)^{2} | (ln x) (ln y) | ln y |
x_{1} | y_{1} | ln x_{1} | (ln x_{1})^{2} | (ln x_{1}) (ln y_{1}) | ln y_{1} |
x_{2} | y_{2} | ln x_{2} | (ln x_{2})^{2} | (ln x_{2}) (ln y_{2}) | ln y_{2} |
x_{3} | y_{3} | ln x_{3} | (ln x_{3})^{2} | (ln x_{3}) (ln y_{3}) | ln y_{3} |
… | … | … | … | … | … |
x_{n} | y_{n} | ln x_{n} | (ln x_{n})^{2} | (ln x_{n}) (ln y_{n}) | ln y_{n} |
N_{10} | N_{11} | N_{12} | N_{13} | N_{14} | N_{15} |
To evaluate ŷ=ax^{b}, substitute N_{12}, N_{13}, N_{14}, N_{15} into equations (39) and (40) and solve simultaneously for (ln a) and b.
4.5.4. Given Data to Evaluate the Quadratic Model
Table 8. Manipulation of Given Data to Evaluate the Quadratic Model
x | y | x^{2} | x^{3} | xy | x^{4} | x^{2} y |
x_{1} | y_{1} | x_{1}^{2} | x_{1}^{3} | x_{1}y_{1} | x_{1}^{4} | x_{1}^{2}y_{1} |
x_{2} | y_{2} | x_{2}^{2} | x_{2}^{3} | x_{2}y_{2} | x_{2}^{4} | x_{2}^{2}y_{2} |
x_{3} | y_{3} | x_{3}^{2} | x_{3}^{3} | x_{3}y_{3} | x_{3}^{4} | x_{3}^{2}y_{3} |
… | … | … | … | … | … | … |
x_{n} | y_{n} | x_{n}^{2} | x_{n}^{3} | x_{n}y_{n} | x_{n}^{4} | x_{n}^{2}y_{n} |
N_{16} | N_{17} | N_{18} | N_{19} | N_{20} | N_{21} | N_{22} |
To evaluate y = a + bx + cx^{2}, substitute N_{16}, N_{17}, N_{18}, N_{19}, N_{20}, N_{21}, N_{22} into equations (42), (43), and (44), and solve simultaneously for a, b, and c.
5. Selecting the Best-Fitting Model
5.1. The Four Models Considered
Given a data set (x_{i}, y_{i}), we have shown how to fit four different models to such a data set, namely:
a. Linear:
$\hat{y} = a + bx$ (45)
b. Exponential:
$\hat{y} = ke^{cx}$ (46)
c. Power:
$\hat{y} = ax^{b}$ (47)
d. Quadratic:
$\hat{y} = a + bx + cx^{2}$ (48)
We might decide to fit all four models to the same data set if, after examining the scatter diagram of the given data set, we are unable to decide which of the “4 models appears to fit the data BEST.”
But, after we fit the 4 models, how can we tell which model fits the data best?
To answer this question, we calculate the “variance of the residual values” for each of the models, and then “select as the best model” the one with the smallest variance of the residual values.
5.2. Calculating the Residual Values of Each Model and Their Variance
Use each x_{i} value of the given data set (x_{i}, y_{i}) to calculate the ŷ_{i} value from the appropriate model, and then, for each i, form the residual:
$e_i = y_i - \hat{y}_i$. (49)
Then the variance of the residual values is defined by:
$V(\text{Residual}) = \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{DOF}$, (50)
where DOF = Degrees of Freedom.
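Note: The DOF are DOF = n – 2 for the first three models (Linear, Exponential, Power) because each of these 3 models has 2 unknown quantities that need to be evaluated from the data (a and b, k and c, and a and b, respectively) and, as a consequence, 2 degrees of freedom are lost. For the Quadratic model, DOF = n – 3 because the model has 3 unknown quantities that need to be estimated and, as a consequence, 3 degrees of freedom are lost.
Using equation (50) to calculate the variance of the residuals for each of the 4 models, we obtain:
$e_i = y_i - (a + bx_i)$ (51)
$V(\text{Residual})_{Linear} = \dfrac{\sum (y_i - a - bx_i)^2}{n-2}$ (52)
$e_i = y_i - ke^{cx_i}$ (53)
$V(\text{Residual})_{Exponential} = \dfrac{\sum (y_i - ke^{cx_i})^2}{n-2}$ (54)
$e_i = y_i - ax_i^{b}$ (55)
$V(\text{Residual})_{Power} = \dfrac{\sum (y_i - ax_i^{b})^2}{n-2}$ (56)
$e_i = y_i - (a + bx_i + cx_i^2)$ (57)
$V(\text{Residual})_{Quadratic} = \dfrac{\sum (y_i - a - bx_i - cx_i^2)^2}{n-3}$ (58)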
After the calculation of the 4 variances from equations: (52), (54), (56), and (58), the model with the “smallest” variance is the model which fits the given data set “best”.
We will now illustrate, through an example, how the 4 models we discussed above can be fitted to a given bivariate data set, and then how the “best” model from among them is selected.
5.3. A Considered Example
A sample of 5 adult men for whom heights and weights are measured gives the following results (Table 9).
Table 9. Sample of 5 adult men
# | X = Height | Y = Weight |
1 | 64 | 130 |
2 | 65 | 145 |
3 | 66 | 150 |
4 | 67 | 165 |
5 | 68 | 170 |
Problem: Fit the linear, exponential, power, and quadratic models to this bivariate data set and then select as the “best” the model with the smallest variance of the residual values.
5.3.1. Fitting the Linear Model
To fit the linear model, we must extend the given bivariate data so that we can also calculate $\sum x_i^2$ and $\sum x_i y_i$, as shown below, in Table 10:
Table 10. Calculations for bivariate data of 5 adults for the linear model
x^{2} | x | y | xy |
4096 | 64 | 130 | 8320 |
4225 | 65 | 145 | 9425 |
4356 | 66 | 150 | 9900 |
4489 | 67 | 165 | 11055 |
4624 | 68 | 170 | 11560 |
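Σx^{2} = 21,790 | Σx = 330 | Σy = 760 | Σxy = 50,260 |
We then substitute the generated data into the normal equations for the linear model, namely equations (8) and (9):
$na + b\sum x_i = \sum y_i$, $a\sum x_i + b\sum x_i^2 = \sum x_i y_i$,
and obtain the equations:
$5a + 330b = 760$
$330a + 21{,}790b = 50{,}260$
When these equations are solved simultaneously for a and b we obtain:
$a = -508$ and $b = 10$
Therefore, the linear model is:
$\hat{y} = -508 + 10x$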
The variance of the residual values for the linear model is calculated as shown below, in Table 11:
Table 11. Variance of the residual values for the linear model
Given X | Given Y | Calculated Y | Residual | (Residual)^{2} |
x | y | ŷ = -508 + 10x | y – ŷ | (y – ŷ)^{2} |
64 | 130 | -508 + 10 (64) = 132 | -2 | (-2)^{2 }= 4 |
65 | 145 | -508 + 10 (65) = 142 | +3 | (+3)^{2} = 9 |
66 | 150 | -508 + 10 (66) = 152 | -2 | (-2)^{2} = 4 |
67 | 165 | -508 + 10 (67) = 162 | +3 | (+3)^{2} = 9 |
68 | 170 | -508 + 10 (68) = 172 | -2 | (-2)^{2} = 4 |
Therefore, the variance of the residual values for the linear model is:
$V(\text{Residual})_{Linear} = \dfrac{4 + 9 + 4 + 9 + 4}{5 - 2} = \dfrac{30}{3} = 10$
5.3.2. Fitting the Exponential Model ŷ = ke^{cx}
To fit the exponential model we need to extend the given bivariate data so that we can calculate, in addition to $\sum x_i$ and $\sum x_i^2$, the sums $\sum \ln y_i$ and $\sum x_i \ln y_i$, as shown below, in Table 12:
Table 12. Calculations for bivariate data of 5 adults for the exponential model
x^{2} | x | y | lny | xlny |
4096 | 64 | 130 | 4.8675 | 311.5200 |
4225 | 65 | 145 | 4.9767 | 323.4855 |
4356 | 66 | 150 | 5.011 | 330.726 |
4489 | 67 | 165 | 5.1059 | 342.0953 |
4624 | 68 | 170 | 5.1358 | 349.2344 |
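We then substitute the generated data into the normal equations for the exponential model (i.e. equations (33) and (34)):
$n \ln k + c\sum x_i = \sum \ln y_i$, $(\ln k)\sum x_i + c\sum x_i^2 = \sum x_i \ln y_i$,
and obtain the equations:
$5 \ln k + 330c = 25.0969$
$330 \ln k + 21{,}790c = 1657.0612$
When these equations are solved simultaneously for ln k and c, we obtain: c = 0.06658 and ln k = 0.6251, or: k = e^{0.6251} = 1.868432
Therefore, the exponential model is:
$\hat{y} = 1.868432\,e^{0.06658x}$
or
ln ŷ = ln k + cx = 0.6251 + 0.06658x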
Then, the variance of the residual values, for the exponential model, is calculated as shown below, in Table 13:
Table 13. Variance of the residual values for the exponential model
x | y | ŷ = ke^{cx} = 1.868432e^{0.06658x} | y-ŷ | (y-ŷ)^{2} |
64 | 130 | 1.868432 e^{0.06658 (64)} = 132.4515 | -2.4515 | 6.0099 |
65 | 145 | 1.868432 e^{0.06658 (65)} = 141.5703 | 3.4297 | 11.7628 |
66 | 150 | 1.868432 e^{0.06658 (66)} = 151.3169 | -1.3169 | 1.7342 |
67 | 165 | 1.868432 e^{0.06658 (67)} = 161.7346 | 3.2654 | 10.6630 |
68 | 170 | 1.868432 e^{0.06658 (68)} = 172.8694 | -2.8694 | 8.2336 |
Therefore, the variance of the residual values for the exponential model is:
$V(\text{Residual})_{Exponential} = \dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2} = 12.8017$
5.3.3. Fitting the Power Model, ŷ = ax^{b}
To fit the power model we need to extend the given bivariate data set to generate the quantities $\sum \ln x_i$, $\sum (\ln x_i)^2$, $\sum \ln y_i$, and $\sum (\ln x_i)(\ln y_i)$, and this is accomplished as shown below, in Table 14:
Table 14. Calculations for bivariate data of 5 adults for the power model
x | y | ln x | (ln x)^{2} | ln y | (ln x) (ln y) |
64 | 130 | 4.158883 | 17.2963085 | 4.867553 | 20.2435 |
65 | 145 | 4.1738727 | 17.42550908 | 4.976734 | 20.7723 |
66 | 150 | 4.189654742 | 17.55320686 | 5.010635 | 20.9928 |
67 | 165 | 4.204692619 | 17.67944002 | 5.105945 | 21.4689 |
68 | 170 | 4.219507705 | 17.80424527 | 5.135798 | 21.6705 |
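 | | Σ ln x = 20.9471 | Σ (ln x)^{2} = 87.7581 | Σ ln y = 25.0967 | Σ (ln x) (ln y) = 105.1505 |
We then substitute the generated data into the normal equations of the power model, namely equations (39) and (40):
$n \ln a + b\sum \ln x_i = \sum \ln y_i$, $(\ln a)\sum \ln x_i + b\sum (\ln x_i)^2 = \sum (\ln x_i)(\ln y_i)$,
and obtain the equations:
$5 \ln a + 20.9471b = 25.0967$
$20.9471 \ln a + 87.7581b = 105.1505$
When these equations are solved simultaneously for b and ln a (carrying the sums at full precision, because this system is nearly singular and its solution is very sensitive to rounding) we obtain:
$b = 4.3766$ and $\ln a = -13.316$
Therefore, the “linearized” power model becomes:
$\ln \hat{y} = -13.316 + 4.3766 \ln x$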
Then the variance of the residual values for the power model is obtained as shown below:
Table 15. Variance of the residual values for the power model
x | y | ln x | ln ŷ = ln a + b ln x = -13.316 + 4.3766 ln x | ŷ | y – ŷ | (y – ŷ)^{2} |
64 | 130 | 4.158883 | lnŷ_{1} = 4.885768 | 132.3920 | -2.3920 | 5.721664 |
65 | 145 | 4.173873 | lnŷ_{2} = 4.95623 | 141.6874 | 3.3126 | 10.973319 |
66 | 150 | 4.189655 | lnŷ_{3} = 5.020443 | 151.4784 | -1.47843 | 2.185667 |
67 | 165 | 4.204693 | lnŷ_{4} = 5.086258 | 161.7833 | 3.2167 | 10.347159 |
68 | 170 | 4.219508 | lnŷ_{5} = 5.151097 | 172.6208 | -2.6208 | 6.868592 |
Σ (y – ŷ)^{2} = 36.09640 |
Therefore, the variance of the residual values for the power model is:
$V(\text{Residual})_{Power} = \dfrac{36.0964}{5 - 2} = 12.0321$
5.3.4. Fitting the Quadratic Model, ŷ = a + bx +cx^{2}
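To fit the quadratic model, we need to use the given bivariate data set and extend it to generate the quantities:
$\sum x_i = 330$, $\sum y_i = 760$, $\sum x_i^2 = 21{,}790$, $\sum x_i^3 = 1{,}439{,}460$, $\sum x_i^4 = 95{,}135{,}074$, $\sum x_i y_i = 50{,}260$, $\sum x_i^2 y_i = 3{,}325{,}270$
We then substitute the generated data into the normal equations of the quadratic model (see equation (25)), and obtain:
$5a + 330b + 21{,}790c = 760$
$330a + 21{,}790b + 1{,}439{,}460c = 50{,}260$
$21{,}790a + 1{,}439{,}460b + 95{,}135{,}074c = 3{,}325{,}270$
Solving these 3 equations simultaneously, we obtain a = -25,326/7, b = 730/7, c = -5/7. Therefore, the quadratic function ŷ = f (x) is given by:
$\hat{y} = \dfrac{1}{7}\left(-25{,}326 + 730x - 5x^{2}\right)$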
The variance of the residual values for the quadratic model is calculated as shown below, in Table 16:
Table 16. Variance of the residual values for the quadratic model
x | y | ŷ=(1/7)[-25,326+730x-5x^{2}] | y_{i} – ŷ_{i} | (y_{i} – ŷ_{i})^{2} |
64 | 130 | ŷ_{1} = 130.5714286 | -0.5714286 | 0.326530644 |
65 | 145 | ŷ_{2} = 142.7142857 | 2.2857143 | 5.224489861 |
66 | 150 | ŷ_{3} = 153.4285714 | -3.4285714 | 11.75510184 |
67 | 165 | ŷ_{4} = 162.7142857 | 2.2857143 | 5.224489861 |
68 | 170 | ŷ_{5} = 170.5714286 | -0.5714286 | 0.326530644 |
Σ (y_{i} – ŷ_{i})^{2} = 22.85714286 |
Therefore, the variance of the residual values for the quadratic model is:
$V(\text{Residual})_{Quadratic} = \dfrac{22.85714286}{5 - 3} = 11.4286$
5.3.5. Summary of Results and Selection of the “Best” Model
We have fitted the 4 models (linear, exponential, power, and quadratic), calculated the respective residual variances, and obtained the following results:
a) The linear model is:
$\hat{y} = -508 + 10x$
with V (Residual)_{Linear} = 10
b) The exponential model is:
$\hat{y} = 1.868432\,e^{0.06658x}$
with V (Residual)_{Exponential} = 12.8017
c) The power model is:
$\hat{y} = e^{-13.316}\,x^{4.3766}$ (i.e. $\ln \hat{y} = -13.316 + 4.3766 \ln x$)
with V (Residual)_{Power} = 12.0321
d) The quadratic model is
$\hat{y} = \dfrac{1}{7}\left(-25{,}326 + 730x - 5x^{2}\right)$
with V (Residual)_{ Quadratic} = 11.4286
Since the linear model has the smallest variance of the residual values of the 4 models fitted to the same bivariate data set, the linear model is the “best” model (but the other 3 values are very close). The linear model, therefore, will be selected as the “best” model and used for forecasting purposes.
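The selection rule can be sketched as a short script that recomputes the residual variance of each fitted model and picks the smallest. The coefficients below are the ones reported in this section (the quadratic constant is taken as -25,326/7, which agrees with the MINITAB constant of -3618); the dictionary layout and the helper name `residual_variance` are illustrative assumptions, not part of the paper.

```python
import math

xs = [64, 65, 66, 67, 68]
ys = [130, 145, 150, 165, 170]

# Each entry: (prediction function, number of estimated parameters).
models = {
    "linear": (lambda x: -508 + 10 * x, 2),
    "exponential": (lambda x: 1.868432 * math.exp(0.06658 * x), 2),
    "power": (lambda x: math.exp(-13.316 + 4.3766 * math.log(x)), 2),
    "quadratic": (lambda x: (-25326 + 730 * x - 5 * x * x) / 7, 3),
}

def residual_variance(predict, n_params):
    """Sum of squared residuals divided by DOF = n - (number of parameters)."""
    sse = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    return sse / (len(xs) - n_params)

variances = {name: residual_variance(f, p) for name, (f, p) in models.items()}
best = min(variances, key=variances.get)

for name, value in variances.items():
    print(f"{name:12s} {value:.4f}")
print("best:", best)
```

Because the exponential and power coefficients are rounded, their recomputed variances may differ from the tabulated values in the later decimals, but the ranking of the four models is not affected.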
6. MINITAB Solutions
To obtain the MINITAB solutions of the four models we discussed in this paper we do the following:
6.1. Finding the MINITAB Solution for the Linear Model
The data set used to find the MINITAB solution for the linear model is presented in Table 17.
Table 17. Data set in MINITAB for the linear model
MTB Set C1 |
DATA 64 65 66 67 68 |
DATA end |
MTB set C2 |
DATA 130 145 150 165 170 |
DATA end |
MTB Name C1 'X' C2 'Y' |
MTB REGRESS 'Y' 1 'X' |
The results of the regression analysis for the linear model are presented in Table 18.
Table 18. Regression analysis: Y versus X for the linear model
Regression equation: | Y = – 508 + 10.0 X | ||||
Predictor | Coef | SE Coef | T | p | |
Constant | -508.000 | 66.020 | -7.700 | 0.005 | |
X | 10.000 | 1.000 | 10.000 | 0.002 | |
Regression fit: | S | R-Sq | R-Sq (adj) | ||
3.162 | 97.1% | 96.1% | |||
Analysis of Variance: | |||||
Source | DF | SS | MS | F | p |
Regression | 1 | 1000.0 | 1000.0 | 100.0 | 0.002 |
Residual Error | 3 | 30.0 | 10.0 | ||
Total | 4 | 1030.0 |
6.2. Finding the MINITAB Solution for the Exponential Model
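The data set used to find the MINITAB solution for the exponential model is presented in Table 19. Because the equation being fitted is the linearized form ln ŷ = ln k + cx, column C2 holds the ln y values of Table 12 rather than the original y values.
Table 19. Data set in MINITAB for the exponential model
MTB Set C1 |
DATA 64 65 66 67 68 |
DATA end |
MTB set C2 |
DATA 4.8675 4.9767 5.011 5.1059 5.1358 |
DATA end |
MTB Name C1 'X' C2 'Y' |
MTB REGRESS 'Y' 1 'X' |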
The results of the regression analysis for the exponential model are presented in Table 20.
Table 20. Regression analysis: Y versus X for the exponential model
Regression equation: | Y = 0.625 + 0.0666 X | ||||
Predictor | Coef | SE Coef | T | p | |
Constant | 0.6251 | 0.4925 | 1.27 | 0.294 | |
X | 0.066580 | 0.007460 | 8.92 | 0.003 | |
Regression fit: | S | R-Sq | R-Sq (adj) | ||
0.0235917 | 96.4% | 95.2% | |||
Analysis of Variance: | |||||
Source | DF | SS | MS | F | p |
Regression | 1 | 0.044329 | 0.044329 | 79.65 | 0.003 |
Residual Error | 3 | 0.001670 | 0.000557 | ||
Total | 4 | 0.045999 |
6.3. Finding the MINITAB Solution for the Power Model
The data set used to find the MINITAB solution for the power model is presented in Table 21, where C1 holds the ln x values and C2 holds the ln y values of Table 14.
Table 21. Data set in MINITAB for the power model
MTB Set C1 |
DATA 4.158883; 4.1738727; 4.189654742; 4.204692619; 4.2195077 |
DATA end |
MTB set C2 |
DATA 4.867553; 4.976734; 5.010635; 5.105945; 5.135798 |
DATA end |
MTB Name C1 'X' C2 'Y' |
MTB REGRESS 'Y' 1 'X' |
The results of the regression analysis for the power model are presented in Table 22.
Table 22. Regression analysis: Y versus X for the power model
Regression equation: | Y = – 13.3 + 4.38X | ||||
Predictor | Coef | SE Coef | T | p | |
Constant | -13.316 | 2.069 | -6.44 | 0.008 | |
X | 4.3766 | 0.4939 | 8.86 | 0.003 | |
Regression fit: | S | R-Sq | R-Sq (adj) | ||
0.0237507 | 96.3% | 95.1% | |||
Analysis of Variance: | |||||
Source | DF | SS | MS | F | p |
Regression | 1 | 0.044301 | 0.044301 | 78.53 | 0.003 |
Residual Error | 3 | 0.001692 | 0.000564 | ||
Total | 4 | 0.045993 |
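The Analysis of Variance entries of Table 22 can be checked by hand with the following Python sketch. It is only an illustration under stated assumptions: the transformed values are copied exactly as listed in Table 21, and the variable names are hypothetical.

```python
import math

# Transformed data exactly as listed in Table 21 (X = ln x, Y = ln y).
X = [4.158883, 4.1738727, 4.189654742, 4.204692619, 4.2195077]
Y = [4.867553, 4.976734, 5.010635, 5.105945, 5.135798]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in X)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))

b = s_xy / s_xx              # exponent of the power model
ln_a = y_bar - b * x_bar     # intercept of the linearized model
a = math.exp(ln_a)           # multiplier of the power model y = a * x^b

# Sums of squares for the ANOVA table
ss_total = sum((yi - y_bar) ** 2 for yi in Y)
ss_reg = b * s_xy
ss_err = ss_total - ss_reg

r_sq = ss_reg / ss_total
f_stat = ss_reg / (ss_err / (n - 2))   # 1 and n - 2 degrees of freedom

print(f"ln a = {ln_a:.3f}, b = {b:.4f}")
print(f"SS reg = {ss_reg:.6f}, SS err = {ss_err:.6f}, SS total = {ss_total:.6f}")
print(f"R-Sq = {100 * r_sq:.1f}%, F = {f_stat:.2f}")
```

Because F has 1 numerator degree of freedom here, it equals the square of the T value for X (8.86 squared is about 78.5).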
6.4. Finding the MINITAB Solution for the Quadratic Model
The data set used to find the MINITAB solution for the quadratic model is presented in Table 23.
Table 23. Data set in MINITAB for the quadratic model
MTB Set C1 |
DATA 64 65 66 67 68 |
DATA end |
MTB set C2 |
DATA 4096 4225 4356 4489 4624 |
DATA end |
MTB SET C3 |
DATA 130 145 150 165 170 |
DATA END |
MTB NAME C1 ‘X1’ C2 ‘X2’ C3 ‘Y’ |
MTB REGRESS ‘Y’ 2 ‘X1’ ‘X2’ |
The results of the regression analysis for the quadratic model are presented in Table 24.
Table 24. Regression analysis: Y versus X1, X2 for the quadratic model
Regression equation: | Y = – 3618 + 104 X1 – 0.714 X2 | ||||
Predictor | Coef | SE Coef | T | p | |
Constant | -3618 | 3935 | -0.92 | 0.455 | |
X1 | 104.3 | 119.3 | 0.87 | 0.474 | |
X2 | -0.7143 | 0.9035 | -0.79 | 0.512 | |
Regression fit: | S | R-Sq | R-Sq (adj) | ||
3.38062 | 97.8% | 95.6% | |||
Analysis of Variance: | |||||
Source | DF | SS | MS | F | p |
Regression | 2 | 1007.14 | 503.57 | 44.06 | 0.022 |
Residual Error | 2 | 22.86 | 11.43 | ||
Total | 4 | 1030.00 | |||
Source | DF | Seq SS | |||
X1 | 1 | 1000.00 | |||
X2 | 1 | 7.14 |
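The coefficients of Table 24 can also be obtained by solving the three normal equations of the quadratic model directly. The sketch below is one possible way to do this, not the method MINITAB uses internally: it relies on exact rational arithmetic (Python's fractions module) because the columns 1, x, and x^{2} are nearly collinear for x between 64 and 68, and the helper names solve and power_sum are illustrative.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]

# Data from Table 23 (X1 = x, Y = y; X2 is simply x squared)
x = [Fraction(v) for v in (64, 65, 66, 67, 68)]
y = [Fraction(v) for v in (130, 145, 150, 165, 170)]

def power_sum(p, q=0):
    """Sum of x^p * y^q over all ordered pairs."""
    return sum(xi ** p * yi ** q for xi, yi in zip(x, y))

# Normal equations for y = a + b x + c x^2
normal_matrix = [
    [power_sum(0), power_sum(1), power_sum(2)],
    [power_sum(1), power_sum(2), power_sum(3)],
    [power_sum(2), power_sum(3), power_sum(4)],
]
normal_rhs = [power_sum(0, 1), power_sum(1, 1), power_sum(2, 1)]
a, b, c = solve(normal_matrix, normal_rhs)

# Residual sum of squares, comparable to the Residual Error row of Table 24
sse = sum((yi - (a + b * xi + c * xi ** 2)) ** 2 for xi, yi in zip(x, y))

print(f"a = {float(a):.1f}, b = {float(b):.4f}, c = {float(c):.4f}")
print(f"SSE = {float(sse):.2f}, MSE = {float(sse / 2):.2f}")
```

Exact arithmetic is practical only for small data sets like this one; for larger problems, centering x before fitting is a common alternative.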
7. Conclusions
Reviewing our previous discussion, we come to the following conclusions:
The Linear Regression problem is relatively easy to solve and can be handled using algebraic methods.
The problem can also be solved easily using available statistical software, like MINITAB.
Even though Regression problems can be solved easily using MINITAB (or other statistical software), it is important to understand the hand methodology and how it solves these problems in order to interpret MINITAB’s output properly.
In general, non-linear regression is much more difficult to perform than linear regression.
There are, however, some simple non-linear models that can be evaluated relatively easily by utilizing the results of linear regression.
The non-linear models analyzed in this paper are: Exponential Model, Power Model, Quadratic Model.
A procedure is also discussed which allows us to fit several models (such as linear, exponential, power, and quadratic) to the same bivariate data set and to select as the “best fitting” model the one with the “smallest variance of the residuals”.
In a numerical example, in which all 4 of these models were fitted to the same bivariate data set, we found that the Linear model was the “best fit”, with the Quadratic model “second best”. The Power and Exponential models are “third best” and “fourth best” respectively, but are very close to each other.
The evaluation of these models is facilitated considerably by using the statistical software package MINITAB which, in addition to estimating the unknown parameters of the corresponding models, also generates additional information (such as the p-value, standard deviations of the parameter estimators, and R^{2}).
This additional information allows us to perform hypothesis testing and construct confidence intervals on the parameters, and also to get a measure of the “goodness” of the equation, by using the value of R^{2}. A value of R^{2} close to 1 is an indication of a good fit.
The MINITAB solution for the linear model shows that both a and b (of ŷ = a + bx = -508 + 10x) are significant because the corresponding p-values are smaller than α = 0.05. The value of R^{2} = 97.1% indicates that the regression equation explains 97.1% of the variation in the y-values, while the remaining 2.9% is due to other factors.
The MINITAB solution for the quadratic model shows that a, b, and c (of ŷ = a + bx + cx^{2} = -3,618 + 104.3x - 0.7143x^{2}) are individually not significant (because of the corresponding high p-values of 0.455, 0.474, and 0.512), but b and c jointly are significant because the p-value of the overall F test is p = 0.022 < α = 0.05. The value of R^{2} is 97.8%.
The MINITAB solution for the power model shows that both ln a and b (of ŷ = ax^{b}, or ln ŷ = ln a + b ln x = -13.316 + 4.3766 ln x) are significant because the corresponding p-values (0.008 and 0.003) are smaller than α = 0.05. The value of R^{2} is 96.3%.
The MINITAB solution for the exponential model shows that k (in ŷ = ke^{cx} = 1.868432e^{0.06658x}, or ln ŷ = ln k + cx = 0.6251 + 0.06658x) is not significant because of the corresponding high p-value (0.294), while c is significant because its p-value (0.003) is smaller than α = 0.05. The value of R^{2} is 96.4%.
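The model-selection procedure summarized in these conclusions can be illustrated with a short Python sketch. Several choices in it are assumptions rather than the paper's stated method: the variance of the residuals is taken to be SSE / (n - p), with p the number of estimated parameters, the residuals are measured in the original y units for all four models, and ln x and ln y are computed directly, so the power and exponential coefficients may differ slightly from those in the MINITAB tables. The helper names are illustrative.

```python
import math

x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]

def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small linear system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit on centered xs; returns a prediction function."""
    center = sum(xs) / len(xs)
    ts = [v - center for v in xs]   # centering keeps the normal equations well conditioned
    size = degree + 1
    matrix = [[sum(t ** (i + j) for t in ts) for j in range(size)] for i in range(size)]
    rhs = [sum(t ** i * v for t, v in zip(ts, ys)) for i in range(size)]
    coef = solve(matrix, rhs)
    return lambda v: sum(ck * (v - center) ** k for k, ck in enumerate(coef))

def residual_variance(predict, n_params):
    """SSE / (n - p), with residuals measured in the original y units."""
    sse = sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (len(x) - n_params)

ln_x = [math.log(v) for v in x]
ln_y = [math.log(v) for v in y]

linear = fit_poly(x, y, 1)
quadratic = fit_poly(x, y, 2)
exp_line = fit_poly(x, ln_y, 1)       # ln y = ln k + c x
pow_line = fit_poly(ln_x, ln_y, 1)    # ln y = ln a + b ln x

variances = {
    "linear": residual_variance(linear, 2),
    "quadratic": residual_variance(quadratic, 3),
    "exponential": residual_variance(lambda v: math.exp(exp_line(v)), 2),
    "power": residual_variance(lambda v: math.exp(pow_line(math.log(v))), 2),
}
ranking = sorted(variances, key=variances.get)

for name in ranking:
    print(f"{name:12s} residual variance = {variances[name]:.2f}")
```

Under these assumptions the linear and quadratic variances equal the Residual Error mean squares reported by MINITAB (10.0 and 11.43), and the four models should rank in the order stated above.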