Linear and Non-Linear Regression: Powerful and Very Important Forecasting Methods

Athanasios VASILOPOULOS

Expert Journal of Business and Management

3(2)

pp. 205-228

The Author(s). 2015

Received: November 2, 2015

Accepted: November 23, 2015

Published: December 19, 2015

Regression Analysis is at the center of almost every Forecasting technique, yet few people are comfortable with the Regression methodology. We hope to improve the level of comfort with this article. In this article we briefly discuss the theory behind the methodology and then outline a step-by-step procedure, which will allow almost everyone to construct a Regression Forecasting function for both the linear and some non-linear cases. Also discussed, in addition to the model construction mentioned above, is model testing (to establish significance) and the procedure by which the Final Regression equation is derived and retained to be used as the Forecasting equation. Hand solutions are derived for some small-sample problems (for both the linear and non-linear cases) and their solutions are compared to the MINITAB-derived solutions to establish confidence in the statistical tool, which can be used exclusively for larger problems.

Keywords: Best-Fitting Model, Forecasting, Linear Regression, Non-Linear Regression

JEL Classification: M10

1. Introduction and Model Estimation for the Linear Model

Regression analysis derives an equation that connects the value of one dependent variable (Y) to the values of one independent variable (X). It starts with a given bivariate data set and uses the Least Squares Method to assign the best possible values to the unknown multipliers (parameters) of the model we wish to estimate. The bivariate data, used to estimate the linear model and some non-linear models, consists of n ordered pairs of values:

$$(x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n)$$

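The linear model we wish to estimate, using the given data, is:

$$\hat{y} = a + bx \tag{1}$$

while the non-linear models of interest are given by

$$\hat{y} = k e^{cx} \quad \text{(Exponential Model)} \tag{2}$$

$$\hat{y} = a x^{b} \quad \text{(Power Model)} \tag{3}$$

and

$$\hat{y} = a + bx + cx^{2} \quad \text{(Quadratic Model)} \tag{4}$$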

To estimate model (1) we use the Least Squares Methodology, which calls for the formation of the quadratic function:

$$Q(a, b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2 \tag{5}$$

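To derive the “normal” equations for the linear model, from which the values of a and b are obtained, we take the partial derivatives of Q(a, b) in equation (5) with respect to a and b, set each equal to zero, and then simplify.

The result is:

$$\frac{\partial Q}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a - b x_i) \tag{6}$$

and

$$\frac{\partial Q}{\partial b} = -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) \tag{7}$$

When (6) and (7) are set equal to zero and simplified, we obtain the “Normal” equations for the linear model:

$$n a + b \sum x_i = \sum y_i \tag{8}$$

$$a \sum x_i + b \sum x_i^2 = \sum x_i y_i \tag{9}$$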

The only unknowns in equations (8) and (9) are a and b; solving the two equations simultaneously for a and b estimates the linear model. All the other quantities in equations (8) and (9) come from the given data, where:

n = number of ordered pairs $(x_i, y_i)$

$\sum x_i$ = sum of the x values

$\sum y_i$ = sum of the y values

$\sum x_i^2$ = sum of the squares of the given x values

$\sum x_i y_i$ = sum of the products of the x_i and y_i values in each ordered pair.

Note: The values of (a) and (b) obtained from the Normal equations correspond to a minimum value for the Quadratic function Q (a,b) given by equation (5), as can be easily demonstrated by using the Optimization methodology of Differential Calculus for functions of 2 independent variables.
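As a minimal sketch of how equations (8) and (9) can be solved numerically, the Python fragment below forms the four sums and applies Cramer's rule to the 2×2 system. The function name `solve_linear_normal_equations` is an illustrative choice, not part of the paper, and the data are the five height/weight pairs of the bivariate example in Section 2.3.

```python
# Heights (x) and weights (y) of the 5 men in the bivariate example.
x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]


def solve_linear_normal_equations(x, y):
    """Solve  n*a + b*Sx = Sy  and  a*Sx + b*Sxx = Sxy  for (a, b)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx  # zero only when all x values coincide
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


a, b = solve_linear_normal_equations(x, y)
print(f"y_hat = {a:g} + {b:g} x")
```

For these data the sketch reproduces the hand solution a = -508, b = 10.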

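To complete the estimation of the linear model we need to find the standard deviations of a and b, σ(a) and σ(b), which are needed for testing the significance of the model. The standard deviations σ(a) and σ(b) are given by:

$$\sigma(a) = \hat{\sigma} \sqrt{\frac{\sum x_i^2}{n \sum x_i^2 - \left(\sum x_i\right)^2}} \tag{10}$$

and

$$\sigma(b) = \hat{\sigma} \sqrt{\frac{n}{n \sum x_i^2 - \left(\sum x_i\right)^2}}, \tag{11}$$

where:

$$\hat{\sigma}^2 = \frac{\sum y_i^2 - a \sum y_i - b \sum x_i y_i}{n - 2} \tag{12}$$

The a and b in equation (12) come from the solution of equations (8) and (9), while $\sum y_i^2$, $\sum y_i$, and $\sum x_i y_i$ come directly from the given bivariate data.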
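The calculation of the standard deviations can be sketched in Python as follows. The formulas coded here are the standard simple-regression expressions (residual variance with n - 2 degrees of freedom, then the standard errors of intercept and slope); treating them as the content of equations (10) to (12) is an assumption, and the variable names are illustrative. The data and the estimates a = -508, b = 10 are those of the example in Section 2.3.

```python
import math

x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]
a, b = -508.0, 10.0  # least-squares estimates for these data

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
syy = sum(yi * yi for yi in y)
sxy = sum(xi * yi for xi, yi in zip(x, y))

# Residual variance: (Syy - a*Sy - b*Sxy) / (n - 2)
sigma2 = (syy - a * sy - b * sxy) / (n - 2)

# Standard deviations of the intercept a and the slope b
d = n * sxx - sx ** 2
sigma_a = math.sqrt(sigma2 * sxx / d)
sigma_b = math.sqrt(sigma2 * n / d)

print(sigma2, round(sigma_a, 2), sigma_b)
```

The values agree with the S, SE Coef entries of the MINITAB output in Table 3 (S² = 10, 66.02, and 1.000).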

2. Model Testing

Now that our model of interest has been estimated, we need to test for the significance of the terms found in the estimated model. This is very important because the results of this testing will determine the final equation which will be retained and used for Forecasting purposes.

Testing of the linear model consists of the following steps:

2.1. Testing for the significance of each term separately

Here we test the hypotheses:

1. H₀: b = 0 vs H₁: b ≠ 0, and

2. H₀: a = 0 vs H₁: a ≠ 0, based on our knowledge of b, σ(b), a, and σ(a).

If n ≥ 30, we calculate

$$Z^*_b = \frac{b}{\sigma(b)}$$

and

$$Z^*_a = \frac{a}{\sigma(a)}$$

and compare each to $Z_{\alpha/2}$ (where $Z_{\alpha/2}$ is a value obtained from the standard Normal table when the significance level α, or the confidence level 1 – α, is specified).

For example, if α = 0.05, $Z_{\alpha/2} = Z_{0.025} = 1.96$; if α = 0.10, $Z_{\alpha/2} = Z_{0.05} = 1.645$; if α = 0.02, $Z_{\alpha/2} = Z_{0.01} = 2.33$; and if α = 0.01, $Z_{\alpha/2} = Z_{0.005} = 2.58$.

If $Z^*_b > Z_{\alpha/2}$ (or $Z^*_b < -Z_{\alpha/2}$), the hypothesis H₀: b = 0 is rejected and we conclude that b ≠ 0 and the term bx (in the estimated model ŷ = a + bx) is important for the calculation of the value of y. Similarly, if $Z^*_a > Z_{\alpha/2}$ (or $Z^*_a < -Z_{\alpha/2}$), H₀: a = 0 is rejected, and we conclude that the linear equation ŷ = a + bx does not go through the origin.

If n < 30, we calculate

$$t^*_b = \frac{b}{\sigma(b)}$$

and

$$t^*_a = \frac{a}{\sigma(a)}$$

and compare each to $t_{n-2}(\alpha/2)$, for a given α value, where $t_{n-2}(\alpha/2)$ is obtained from the t-distribution table with n – 2 degrees of freedom, with the same interpretation for H₀: b = 0 and H₀: a = 0 as above.

But, instead of hypothesis testing, we can construct (1 – α) Confidence Intervals for b and a. If n ≥ 30, we use the equations:

$$b \pm Z_{\alpha/2}\, \sigma(b) \tag{13}$$

and

$$a \pm Z_{\alpha/2}\, \sigma(a) \tag{14}$$

or, if n < 30,

$$b \pm t_{n-2}(\alpha/2)\, \sigma(b) \tag{15}$$

and

$$a \pm t_{n-2}(\alpha/2)\, \sigma(a) \tag{16}$$

If the hypothesized value b = 0 falls inside the Confidence Interval given by equation (13) or (15), the hypothesis H₀: b = 0 is not rejected, and we conclude that b = 0 and the term bx is not important for the calculation of y. Similarly, if a = 0 falls inside the Confidence Interval given by equation (14) or (16), H₀: a = 0 is not rejected, and we conclude that a = 0 (i.e. the line goes through the origin).

If, for a given data set, we perform the tests discussed above, we will obtain one of 4 possible conclusions:

A) H₀: b = 0 and H₀: a = 0 are both rejected. Therefore b ≠ 0 and a ≠ 0, and both the terms a and bx are important to the calculation of y. In this case the final equation is ŷ = a + bx, with both terms staying in the equation.

B) H₀: b = 0 is rejected, but H₀: a = 0 is not rejected. Therefore b ≠ 0 but a = 0, and the term a is not important to the calculation of y. In this case the final equation is ŷ = bx, with the term a dropping out of the equation.

C) H₀: b = 0 is not rejected, but H₀: a = 0 is rejected. Therefore b = 0 and the term bx is not important for the calculation of y, while a ≠ 0 and is important to the calculation of y. In this case the final equation is ŷ = a, with the term bx dropping out of the equation.

D) H₀: b = 0 and H₀: a = 0 are both not rejected. Therefore b = 0 and a = 0, and neither of the terms a and bx is important to the calculation of y. In this case the final equation will be ŷ = 0, with both terms a and bx dropping out of the equation.
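The four outcomes A) to D) amount to a simple decision rule, sketched below in Python. The helper names `final_equation` and `confidence_interval` are hypothetical, the critical value is passed in by the caller (taken from a Normal or t table, as described above), and the numbers in the usage lines are the estimates of the example in Section 2.3 with t₃(0.025) = 3.1824.

```python
def final_equation(a, b, se_a, se_b, critical):
    """Return the retained equation as text, following cases A) to D)."""
    keep_a = abs(a / se_a) > critical  # H0: a = 0 rejected
    keep_b = abs(b / se_b) > critical  # H0: b = 0 rejected
    if keep_a and keep_b:
        return f"y = {a:g} + {b:g}x"   # case A
    if keep_b:
        return f"y = {b:g}x"           # case B
    if keep_a:
        return f"y = {a:g}"            # case C
    return "y = 0"                     # case D


def confidence_interval(estimate, se, critical):
    """Two-sided interval: estimate +/- critical * standard deviation."""
    return estimate - critical * se, estimate + critical * se


print(final_equation(-508.0, 10.0, 66.02, 1.0, 3.1824))
print(confidence_interval(10.0, 1.0, 3.1824))
```

Both views agree for the example: the interval for b excludes 0, and the decision rule keeps both terms.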

2.2. Testing for the Significance of the Entire Linear Equation

This test consists of testing the hypothesis:

1. H₀: a = b = 0 vs H₁: a and b are not both equal to 0, or

2. H₀: The Entire Regression equation is not significant vs H₁: The Entire Regression equation is significant

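For a given bivariate data set and a given α value, we need to first calculate:

$$TSS = \sum (y_i - \bar{y})^2 \tag{17}$$

$$RSS_b = \sum (\hat{y}_i - \bar{y})^2 \tag{18}$$

$$ESS = \sum (y_i - \hat{y}_i)^2 \tag{19}$$

$$SS_a = n \bar{y}^2 = \frac{\left(\sum y_i\right)^2}{n} \tag{20}$$

Then we calculate:

$$F^*_{Total} = \frac{(RSS_b + SS_a)/2}{ESS/(n-2)} \tag{21}$$

and compare $F^*_{Total}$ to $F^{2}_{n-2}(\alpha)$, which is a tabulated value of the F distribution with 2 and n – 2 degrees of freedom, for a specified α value. If $F^*_{Total} > F^{2}_{n-2}(\alpha)$, we reject H₀ and conclude that the entire regression equation (i.e. ŷ = a + bx) is significant; that is, the constant term a and the term bx are simultaneously significant to the calculation of the y value.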

Note 1:

When TSS, RSS_b, and ESS are known, we can also define the coefficient of determination R², where:

$$R^2 = \frac{RSS_b}{TSS} \tag{22}$$

where 0 ≤ R² ≤ 1, which tells us how well the regression equation ŷ = a + bx fits the given bivariate data. A value of R² close to 1 implies a good fit.
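A short Python sketch of the sums of squares and of R² follows, using the height/weight data and the fitted line ŷ = -508 + 10x of Section 2.3. The variable names are illustrative; the slope F ratio computed at the end is the one MINITAB reports in its Analysis of Variance table (RSS_b divided by ESS/(n - 2)).

```python
x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]
a, b = -508, 10  # fitted linear model for these data

n = len(y)
y_bar = sum(y) / n
y_hat = [a + b * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
rss_b = sum((fi - y_bar) ** 2 for fi in y_hat)         # regression sum of squares
ess = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares

r_squared = rss_b / tss
f_slope = rss_b / (ess / (n - 2))

print(tss, rss_b, ess, round(r_squared, 3), f_slope)
```

The identity TSS = RSS_b + ESS (1030 = 1000 + 30) provides a quick check on the arithmetic.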

Note 2:

(23)

2.3. A Bivariate Example

A sample of 5 adult men for whom heights and weights are measured gives the following results (Table 1):

Table 1. Given bivariate data set (n =5)

x = H	y = W	x²=H²	y²=W²	xy = HW
64	130	64²	130²	64 x 130
65	145	65²	145²	65 x 145
66	150	66²	150²	66 x 150
67	165	67²	165²	67 x 165
68	170	68²	170²	68 x 170

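For this bivariate data set we have n = 5 and:

$$\sum x_i = 330,\quad \sum y_i = 760,\quad \sum x_i^2 = 21{,}790,\quad \sum y_i^2 = 116{,}550,\quad \sum x_i y_i = 50{,}260$$

To obtain the linear equation ŷ = a + bx, we substitute the values of n, $\sum x_i$, $\sum y_i$, $\sum x_i^2$, and $\sum x_i y_i$ into equations (8) and (9) and obtain:

$$5a + 330b = 760$$

$$330a + 21{,}790b = 50{,}260$$

When these equations are solved simultaneously we obtain a = -508 and b = 10, and the regression equation is

$$\hat{y} = -508 + 10x.$$

Then, using the values a = -508, b = 10, $\sum y_i^2$, $\sum y_i$, and $\sum x_i y_i$, we obtain from equation (12):

$$\hat{\sigma}^2 = \frac{116{,}550 - (-508)(760) - (10)(50{,}260)}{5 - 2} = \frac{30}{3} = 10$$

and from equations (10) and (11):

$$\sigma(a) = \sqrt{10 \cdot \frac{21{,}790}{5(21{,}790) - 330^2}} = \sqrt{4358} \approx 66.02, \qquad \sigma(b) = \sqrt{10 \cdot \frac{5}{5(21{,}790) - 330^2}} = 1$$

Since n = 5 < 30, the ratios a/σ(a) and b/σ(b) are distributed as t variables with n – 2 = 3 degrees of freedom, and when α = 0.05, t₃(α/2) = t₃(0.025) = 3.1824.

Then the hypotheses H₀: b = 0 vs. H₁: b ≠ 0, and H₀: a = 0 vs. H₁: a ≠ 0, are both rejected because:

$$t^*_b = \frac{b}{\sigma(b)} = \frac{10}{1} = 10 > 3.1824$$

and

$$t^*_a = \frac{a}{\sigma(a)} = \frac{-508}{66.02} \approx -7.70 < -3.1824$$

Therefore, the final equation is

$$\hat{y} = -508 + 10x.$$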

To test for the significance of the entire equation, and to calculate the coefficient of determination, we first evaluate TSS, RSS_b, ESS, and SS_a using equations (17) – (20) and obtain:

$$TSS = 1030, \quad RSS_b = 1000, \quad ESS = 30, \quad SS_a = \frac{760^2}{5} = 115{,}520$$

From equation (22), we obtain R²= 1000/1030 ≈ 0.971, which tells us that 97% of the variation in the values of Y can be explained (or are accounted for) by the variable X included in the regression equation and only 3% is due to other factors. Since R² is close to 1, the fit of the equation to the data is very good.

Note:

The correlation coefficient r, which measures the strength of the linear relationship between Y and X, is related to the coefficient of determination by:

$$r = \pm\sqrt{R^2} \ \text{(with the sign of } b\text{)} = +\sqrt{0.971} \approx 0.985$$

for this example. Clearly X and Y are very strongly linearly related.

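Using equation (21) we obtain:

$$F^*_{Total} = \frac{(1000 + 115{,}520)/2}{30/3} = \frac{58{,}260}{10} = 5{,}826$$

When $F^*_{Total}$ is compared to

$$F^{2}_{3}(0.05) = 9.55,$$

H₀ (The entire regression equation is not significant) is rejected, and we conclude that the entire regression equation is significant.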

3. MINITAB Solution to the Linear Regression Problem

We enter the given data and issue the regression command as shown in Table 2.

Table 2. Data set in MINITAB

MTB > SET C1
DATA> 64 65 66 67 68
DATA> END
MTB > SET C2
DATA> 130 145 150 165 170
DATA> END
MTB > NAME C1 'X' C2 'Y'
MTB > REGRESS 'Y' 1 'X'

We obtain the MINITAB output presented in Table 3, Table 4, and Figure 1.

Table 3. Regression Analysis: Y versus X

Regression equation:					Y = – 508 + 10.0 X

Predictor	Coef	SE Coef	T	p
Constant	-508.000	66.020	-7.700	0.005
X	10.000	1.000	10.000	0.002

Regression fit:
	S	R-Sq	R-Sq (adj)
	3.162	97.1%	96.1%

Analysis of Variance:
Source	DF	SS	MS	F	p
Regression	1	1000.0	1000.0	100.0	0.002
Residual Error	3	30.0	10.0
Total	4	1030.0

Table 4. Correlations: Y, X

Pearson correlation of Y and X	0.985
P-Value	0.002

Figure 1. Plot of Y versus X

When we compare the MINITAB and hand solutions, they are identical. We obtain the same equation ŷ = -508 + 10x, the same standard deviations for a and b (under SE Coef), the same t values, the same R², and the same s = σ̂ = 3.162 (so that σ̂² = 10). Notice also that the Analysis of Variance table provides the values for RSS_b, ESS, and TSS. The only value missing is SS_a, which can be easily calculated from

$$SS_a = \sum y_i^2 - TSS = 116{,}550 - 1{,}030 = 115{,}520.$$

The MINITAB solution also gives a p-value for each coefficient. The p-value is called the “Observed Level of Significance” and represents the probability, when H₀ is true, of obtaining a value more extreme than the observed value of the test statistic. For example, the p-value for the predictor X is calculated as p = 0.002, and it is given by:

$$p = P(t_3 > 10) + P(t_3 < -10) = 2\,P(t_3 > 10) \approx 0.002 \tag{24}$$

The p-value has the following connection to the selected α value:

If p ≥ α, do not reject H₀

If p < α, reject H₀

Since p = 0.002 < α = 0.05, H₀: b = 0 will be rejected.
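For the small example used here the p-values can be checked without tables, because the Student t distribution with exactly 3 degrees of freedom has a closed-form distribution function. The Python sketch below relies on that closed form, so it is valid only for 3 degrees of freedom; the function name `t3_two_sided_p` is illustrative and not a MINITAB or library routine.

```python
import math


def t3_two_sided_p(t):
    """Two-sided p-value of a t statistic with exactly 3 degrees of freedom.

    Uses the closed form  F(t) = 1/2 + [u / (1 + u^2) + atan(u)] / pi,
    with u = t / sqrt(3), which holds only for 3 degrees of freedom.
    """
    u = abs(t) / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 2.0 * (1.0 - cdf)


alpha = 0.05
for name, t_value in (("X", 10.0), ("Constant", -7.70)):
    p = t3_two_sided_p(t_value)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{name}: p = {p:.3f}, {decision}")
```

Rounded to three decimals, the two values match the p column of Table 3 (0.002 and 0.005).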

4. Introduction and Model Estimation for Some Non-Linear Models of Interest

Sometimes two variables are related but their relationship is not linear, and trying to fit a linear equation to a data set that is inherently non-linear will result in a bad fit. Because non-linear regression is, in general, much more difficult than linear regression, we explore in this part of the paper estimation methods that allow us to fit non-linear equations to a data set by using the results of linear regression, which is much easier to understand and analyze.

This becomes possible by first performing logarithmic transformations of the non-linear equations, which change the non-linear into linear equations, and then using the normal equations of the linear model to generate the normal equations of the “linearized” non-linear equations, from which the values of the unknown model parameters can be obtained. In this paper we show how the exponential model, ŷ = ke^cx, and the power model, ŷ = ax^b (for b ≠ 1), can be easily estimated by using logarithmic transformations to first derive the linearized versions of these non-linear equations, namely:

$$\ln \hat{y} = \ln k + cx$$

and

$$\ln \hat{y} = \ln a + b \ln x,$$

and then comparing these to the original linear equation, ŷ = a + bx, and its normal equations (see equations (8) and (9)).

Also discussed is the quadratic model, ŷ = a + bx + cx², which, even though it is a non-linear model, can be handled directly using the linear methodology. But now we have to solve simultaneously a system of 3 equations in 3 unknowns, because the normal equations for the quadratic model become:

$$\begin{aligned} n a + b \sum x_i + c \sum x_i^2 &= \sum y_i \\ a \sum x_i + b \sum x_i^2 + c \sum x_i^3 &= \sum x_i y_i \\ a \sum x_i^2 + b \sum x_i^3 + c \sum x_i^4 &= \sum x_i^2 y_i \end{aligned} \tag{25}$$

A procedure is also discussed which allows us to fit these four models (i.e. linear, exponential, power, quadratic), and possibly others, to the same data set, and then select the equation which fits the data set “best”. These four models are used extensively in forecasting and, because of this, it is important to understand how these models are constructed and how MINITAB can be used to estimate such models efficiently.

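4.1. The Linear Model and its Normal Equations

The linear model and the normal equations associated with it, as explained above, are given by:

Linear Model

$$\hat{y} = a + bx \tag{1}$$

Normal Equations

$$n a + b \sum x_i = \sum y_i \tag{8}$$

$$a \sum x_i + b \sum x_i^2 = \sum x_i y_i \tag{9}$$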

4.2. The Exponential Model

The exponential model is defined by the equation:

$$\hat{y} = k e^{cx} \tag{26}$$

Our objective is to use the given data to find the best possible values for k and c, just as our objective in equation (1) was to use the data to find the best (in the least-squares sense) values for a and b.

Taking natural logarithms (i.e. logarithms to the base e) of both sides of equation (26) we obtain

$$\ln \hat{y} = \ln\left(k e^{cx}\right) \tag{27}$$

4.2.1. Logarithmic Laws

To simplify equation (27), we have to use some of the following laws of logarithms:

i) log (A∙B) = log A + log B (28)

ii) log (A/B) = log A – log B (29)

iii) log (Aⁿ) = n log A (30)

Then, using equation (28) we can re-write equation (27) as:

$$\ln \hat{y} = \ln k + \ln\left(e^{cx}\right) \tag{31}$$

and, by applying equation (30) to the second term of the right hand side of equation (31), equation (31) can be written finally as:

$$\ln \hat{y} = \ln k + cx \ln e$$

or

$$\ln \hat{y} = \ln k + cx \tag{32}$$

(because ln e = log_e e = 1)

Even though equation (26) is non-linear, as can be verified by plotting y against x, equation (32) is linear (i.e. the logarithmic transformation changed equation (26) from non-linear to linear) as can be verified by plotting: ln y against x.

But, if equation (32) is linear, it should be similar to equation (1), and must have a set of normal equations similar to the normal equations of the linear model (see equations (8) and (9)).

Question: How are these normal equations going to be derived?

Answer: We will compare the “transformed linear model”, i.e. equation (32), to the actual linear model (equation (1)), note the differences between these two models, and then make the appropriate changes to the normal equations of the linear model to obtain the normal equations of the “transformed linear model”.

4.2.2. Comparison of the Logarithmic Transformed Exponential Model to the Linear Model

To make the comparison easier, we list below the 2 models under consideration, namely:

a) Original Linear Model:

$$\hat{y} = a + bx \tag{1}$$

b) Transformed Linear Model:

$$\ln \hat{y} = \ln k + cx \tag{32}$$

Comparing equations (1) and (32), we note the following three differences between the two models:

i. y in equation (1) has been replaced by ln y in equation (32)

ii. a in equation (1) has been replaced by ln k in equation (32)

iii. b in equation (1) has been replaced by c in equation (32)

4.2.3. Normal Equations of Exponential Model

When the three changes listed above are applied to the normal equations of the actual linear model (equations (8) and (9)), we obtain the normal equations of the “transformed linear model”:

$$n \ln k + c \sum x_i = \sum \ln y_i \tag{33}$$

$$(\ln k) \sum x_i + c \sum x_i^2 = \sum x_i \ln y_i \tag{34}$$

In equations (33) and (34) all the quantities are known numbers, derived from the given data as will be shown later, except for ln k and c, and equations (33) and (34) must be solved simultaneously for ln k and c.

Suppose that for a given data set, the solution to equations (33) and (34) produced the values:

ln k = 0.3 and c = 1.2 (35)

If we examine the exponential model (equation (26)), we observe that the value of c = 1.2 can be substituted directly into equation (26), but we do not yet have the value of k; instead we have the value of ln k = 0.3!

Question: If we know ln k = 0.3, how do we find the value of k?

Answer: If ln k = 0.3, then k = e^0.3 ≈ (2.718281828)^0.3 ≈ 1.349859

Therefore, now that we have both the k and c values, the non-linear model, given by equation (26), has been completely estimated.
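The whole procedure for the exponential model (take ln y, solve the two normal equations, then recover k from ln k) can be sketched in Python as below. The function name `fit_exponential` is illustrative. The data are the height/weight pairs used in the examples of this paper; because the sketch works with unrounded logarithms, its estimates may differ in the fourth decimal from values obtained with logarithms rounded to four places.

```python
import math


def fit_exponential(x, y):
    """Fit y = k * exp(c * x) by least squares on (x, ln y); requires y > 0."""
    ln_y = [math.log(v) for v in y]
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    s_ln = sum(ln_y)
    s_x_ln = sum(xi * li for xi, li in zip(x, ln_y))
    det = n * sxx - sx * sx
    ln_k = (s_ln * sxx - sx * s_x_ln) / det  # plays the role of a
    c = (n * s_x_ln - sx * s_ln) / det       # plays the role of b
    return math.exp(ln_k), c                 # back-transform ln k to k


k, c = fit_exponential([64, 65, 66, 67, 68], [130, 145, 150, 165, 170])
print(f"y_hat = {k:.4f} * exp({c:.5f} * x)")
```

Note that the least-squares criterion is applied to ln y rather than to y itself, which is why the residuals of the fitted curve are evaluated separately when models are compared.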

4.3. The Power Model

Another non-linear model, which can be analyzed in a similar manner, is the Power Model defined by the equation:

$$\hat{y} = a x^{b} \tag{36}$$

which is non-linear if b ≠ 1. As before, we must obtain the best possible values for a and b (in the least-squares sense) using the given data.

4.3.1. Logarithmic Transformation of Power Model

A logarithmic transformation of equation (36) produces the “transformed linear model”

$$\ln \hat{y} = \ln a + b \ln x \tag{37}$$

When equation (37) is compared to equation (1), we note the following 3 changes:

i. y in equation (1) has been replaced by ln y in equation (37)

ii. a in equation (1) has been replaced by ln a in equation (37) (38)

iii. x in equation (1) has been replaced by ln x in equation (37)

When the changes listed in (38) are substituted into equations (8) and (9), we obtain the normal equations for this “transformed linear model”, which are given by equations (39) and (40) below:

4.3.2. Normal Equations of Power Model

$$n \ln a + b \sum \ln x_i = \sum \ln y_i \tag{39}$$

$$(\ln a) \sum \ln x_i + b \sum (\ln x_i)^2 = \sum (\ln x_i)(\ln y_i) \tag{40}$$

Equations (39) and (40) must be solved simultaneously for (ln a) and b.

If, for example, ln a = 0.4, then a = e^0.4 ≈ (2.718281828)^0.4 ≈ 1.491825 and, since we have numerical values for both a and b, the non-linear model defined by equation (36) has been completely estimated.
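A minimal Python sketch of the power-model fit is given below. To keep the check transparent it uses a small synthetic data set generated exactly from y = 2x³ (a hypothetical example, not data from this paper), so the procedure should return a = 2 and b = 3 up to rounding. The function name `fit_power` is illustrative.

```python
import math


def fit_power(x, y):
    """Fit y = a * x**b by least squares on (ln x, ln y); requires x, y > 0."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(x)
    s_lx, s_ly = sum(lx), sum(ly)
    s_lxlx = sum(u * u for u in lx)
    s_lxly = sum(u * v for u, v in zip(lx, ly))
    det = n * s_lxlx - s_lx ** 2
    ln_a = (s_ly * s_lxlx - s_lx * s_lxly) / det
    b = (n * s_lxly - s_lx * s_ly) / det
    return math.exp(ln_a), b  # back-transform ln a to a


# Synthetic, noise-free data from y = 2 * x**3 (hypothetical example).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * v ** 3 for v in xs]
a_hat, b_hat = fit_power(xs, ys)
print(a_hat, b_hat)
```

The only differences from the exponential case are that both variables are log-transformed and that the slope of the linearized model is the exponent b itself.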

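4.4. Derivation of the normal equations for the Quadratic model, ŷ = a + bx + cx²

To derive the normal equations of the quadratic model, first form the function

$$Q(a, b, c) = \sum_{i=1}^{n} \left(y_i - a - b x_i - c x_i^2\right)^2 \tag{41}$$

Then take the partial derivatives $\partial Q/\partial a$, $\partial Q/\partial b$, and $\partial Q/\partial c$, and set each equal to 0, to obtain the 3 equations needed to solve for a, b, c.

We obtain:

$$\frac{\partial Q}{\partial a} = -2 \sum \left(y_i - a - b x_i - c x_i^2\right) = 0$$

or:

$$n a + b \sum x_i + c \sum x_i^2 = \sum y_i \tag{42}$$

$$\frac{\partial Q}{\partial b} = -2 \sum x_i \left(y_i - a - b x_i - c x_i^2\right) = 0$$

or:

$$a \sum x_i + b \sum x_i^2 + c \sum x_i^3 = \sum x_i y_i \tag{43}$$

$$\frac{\partial Q}{\partial c} = -2 \sum x_i^2 \left(y_i - a - b x_i - c x_i^2\right) = 0$$

or:

$$a \sum x_i^2 + b \sum x_i^3 + c \sum x_i^4 = \sum x_i^2 y_i \tag{44}$$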

Equations (42), (43), and (44) are identical to equation (25).
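The 3×3 system of normal equations can be solved exactly with rational arithmetic, which avoids the rounding problems that the very large sums (such as the sum of x⁴) can cause. The Python sketch below builds the augmented matrix of the three normal equations and applies Gauss-Jordan elimination; the function name `fit_quadratic` is illustrative, and the data are the height/weight pairs used in the examples of this paper.

```python
from fractions import Fraction


def fit_quadratic(x, y):
    """Solve the three normal equations of y = a + b*x + c*x**2 exactly."""

    def sx(p):  # sum of x**p
        return sum(Fraction(xi) ** p for xi in x)

    def sxy(p):  # sum of x**p * y
        return sum(Fraction(xi) ** p * yi for xi, yi in zip(x, y))

    # Augmented matrix [coefficients | right-hand side].
    m = [
        [sx(0), sx(1), sx(2), sxy(0)],
        [sx(1), sx(2), sx(3), sxy(1)],
        [sx(2), sx(3), sx(4), sxy(2)],
    ]
    # Gauss-Jordan elimination; pivots are nonzero when there are
    # at least 3 distinct x values.
    for i in range(3):
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(3):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return m[0][3], m[1][3], m[2][3]


a, b, c = fit_quadratic([64, 65, 66, 67, 68], [130, 145, 150, 165, 170])
print(a, b, c)  # exact fractions
```

Converted to decimals, the three fractions are approximately -3618, 104.3, and -0.7143.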

4.5. Data Utilization in Estimating the 4 Models

To generate the quantities needed to estimate the 4 models:

a. The Linear Model

b. The Exponential Model

c. The Power Model,

d. The Quadratic Model,

the given (x, y) bivariate data must be “manipulated” as shown in Tables: 5, 6, 7, and 8, respectively.

4.5.1. Given Data to Evaluate the Linear Model

Table 5. Manipulation of Given Data to Evaluate the Linear Model

x	y	xy	x²
x₁	y₁	x₁y₁	x₁²
x₂	y₂	x₂y₂	x₂²
x₃	y₃	x₃y₃	x₃²
…	…	…	…
x_n	y_n	x_ny_n	x_n²

N₁	N₂	N₃	N₄

To evaluate y = a + bx, substitute: N₁, N₂, N₃, N₄ into equations (8) and (9) and solve for a and b simultaneously.

4.5.2. Given Data to Evaluate the Exponential Model

Table 6. Manipulation of Given Data to Evaluate the Exponential Model

x	y	x²	ln y	x ln y
x₁	y₁	x₁²	ln y₁	x₁ ln y₁
x₂	y₂	x₂²	ln y₂	x₂ ln y₂
x₃	y₃	x₃²	ln y₃	x₃ ln y₃
…	…	…	…	…
x_n	y_n	x_n²	ln y_n	x_n ln y_n

N₅	N₆	N₇	N₈	N₉

To evaluate ŷ = ke^cx, substitute the column totals N₅, N₇, N₈, N₉ into equations (33) and (34) and solve for ln k and c simultaneously.

4.5.3. Given Data to Evaluate the Power Model

Table 7. Manipulation of Given Data to Evaluate the Power Model

x	y	ln x	(ln x)²	(ln x) (ln y)	ln y
x₁	y₁	ln x₁	(ln x₁)²	(ln x₁) (ln y₁)	ln y₁
x₂	y₂	ln x₂	(ln x₂)²	(ln x₂) (ln y₂)	ln y₂
x₃	y₃	ln x₃	(ln x₃)²	(ln x₃) (ln y₃)	ln y₃
…	…	…	…	…	…
x_n	y_n	ln x_n	(ln x_n)²	(ln x_n) (ln y_n)	ln y_n

N₁₀	N₁₁	N₁₂	N₁₃	N₁₄	N₁₅

To evaluate ŷ=ax^b, substitute N₁₂, N₁₃, N₁₄, N₁₅ into equations (39) and (40) and solve simultaneously for (ln a) and b.

4.5.4. Given Data to Evaluate the Quadratic Model

Table 8. Manipulation of Given Data to Evaluate the Quadratic Model

x	y	x²	x³	xy	x⁴	x² y
x₁	y₁	x₁²	x₁³	x₁y₁	x₁⁴	x₁²y₁
x₂	y₂	x₂²	x₂³	x₂y₂	x₂⁴	x₂²y₂
x₃	y₃	x₃²	x₃³	x₃y₃	x₃⁴	x₃²y₃
…	…	…	…	…	…	…
x_n	y_n	x_n²	x_n³	x_ny_n	x_n⁴	x_n²y_n

N₁₆	N₁₇	N₁₈	N₁₉	N₂₀	N₂₁	N₂₂

To evaluate y = a + bx + cx², substitute N₁₆, N₁₇, N₁₈, N₁₉, N₂₀, N₂₁, N₂₂ into equations (42), (43), and (44), and solve simultaneously for a, b, and c.

5. Selecting the Best-Fitting Model

5.1. The Four Models Considered

Given a data set (x_i, y_i), we have shown how to fit to such a data set four different models, namely:

a. Linear:

$$\hat{y} = a + bx \tag{45}$$

b. Exponential:

$$\hat{y} = k e^{cx} \tag{46}$$

c. Power:

$$\hat{y} = a x^{b} \tag{47}$$

d. Quadratic:

$$\hat{y} = a + bx + cx^{2} \tag{48}$$

We might decide to fit all four models to the same data set if, after examining the scatter diagram of the given data set, we are unable to decide which of the 4 models appears to fit the data “best”.

But, after we fit the 4 models, how can we tell which model fits the data best?

To answer this question, we calculate the “variance of the residual values” for each of the models, and then “select as the best model” the one with the smallest variance of the residual values.

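5.2. Calculating the Residual Values of Each Model and Their Variance

Use each x_i value of the given data set (x_i, y_i) to calculate the fitted value ŷ_i from the appropriate model, and then form the residual:

$$e_i = y_i - \hat{y}_i, \tag{49}$$

for each i.

Then the variance of the residual values is defined by:

$$V(\text{Residual}) = \frac{\sum (y_i - \hat{y}_i)^2}{DOF}, \tag{50}$$

where DOF = Degrees of Freedom.

Note: The DOF are DOF = n – 2 for the first three models (Linear, Exponential, Power) due to the fact that each of these 3 models has 2 unknown quantities that need to be evaluated from the data (a and b, k and c, and a and b, respectively) and, as a consequence, 2 degrees of freedom are lost. For the Quadratic model, DOF = n – 3 because the model has 3 unknown quantities that need to be estimated and, as a consequence, 3 degrees of freedom are lost.

Using equation (50) to calculate the variance of the residuals for each of the 4 models, we obtain:

$$e_i = y_i - (a + b x_i) \tag{51}$$

$$V(\text{Residual})_{Linear} = \frac{\sum \left[y_i - (a + b x_i)\right]^2}{n - 2} \tag{52}$$

$$e_i = y_i - k e^{c x_i} \tag{53}$$

$$V(\text{Residual})_{Exponential} = \frac{\sum \left[y_i - k e^{c x_i}\right]^2}{n - 2} \tag{54}$$

$$e_i = y_i - a x_i^{b} \tag{55}$$

$$V(\text{Residual})_{Power} = \frac{\sum \left[y_i - a x_i^{b}\right]^2}{n - 2} \tag{56}$$

$$e_i = y_i - (a + b x_i + c x_i^2) \tag{57}$$

$$V(\text{Residual})_{Quadratic} = \frac{\sum \left[y_i - (a + b x_i + c x_i^2)\right]^2}{n - 3} \tag{58}$$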

After the calculation of the 4 variances from equations: (52), (54), (56), and (58), the model with the “smallest” variance is the model which fits the given data set “best”.
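The selection rule can be sketched in a few lines of Python. The helper `residual_variance` is an illustrative name; it divides the sum of squared residuals by n minus the number of estimated parameters. As a usage example, the sketch compares the linear fit ŷ = -508 + 10x and the quadratic fit ŷ = (-25,326 + 730x - 5x²)/7 obtained for the height/weight data of Section 5.3.

```python
def residual_variance(y, y_hat, n_params):
    """Sum of squared residuals divided by DOF = n - n_params."""
    dof = len(y) - n_params
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / dof


x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]

# Each entry: (fitted function, number of estimated parameters).
models = {
    "linear": (lambda v: -508 + 10 * v, 2),
    "quadratic": (lambda v: (-25326 + 730 * v - 5 * v * v) / 7, 3),
}

variances = {
    name: residual_variance(y, [f(v) for v in x], p)
    for name, (f, p) in models.items()
}
best = min(variances, key=variances.get)
print(variances, "->", best)
```

The quadratic model has the smaller sum of squared residuals, but it loses one more degree of freedom, so its residual variance ends up larger than that of the linear model.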

We will now illustrate, through an example, how the 4 models we discussed above can be fitted to a given bivariate data set, and then how the “best” model from among them is selected.

5.3. A Considered Example

A sample of 5 adult men for whom heights and weights are measured gives the following results (Table 9).

Table 9. Sample of 5 adult men

#	X = Height	Y = Weight
1	64	130
2	65	145
3	66	150
4	67	165
5	68	170

Problem:Fit the linear, exponential, power, and quadratic models to this bivariate data set and then select as the “best” the model with the smallest variance of the residual values.

5.3.1. Fitting the Linear Model

To fit the linear model, we must extend the given bivariate data so that we can also calculate $\sum x^2$ and $\sum xy$, as shown below, in Table 10:

Table 10. Calculations for bivariate data of 5 adults for the linear model

x²	x	y	xy
4096	64	130	8320
4225	65	145	9425
4356	66	150	9900
4489	67	165	11055
4624	68	170	11560
Σx² = 21,790	Σx = 330	Σy = 760	Σxy = 50,260

We then substitute the generated data into the normal equations for the linear model, namely equations (8) and (9):

$$n a + b \sum x_i = \sum y_i, \qquad a \sum x_i + b \sum x_i^2 = \sum x_i y_i,$$

and obtain the equations:

$$5a + 330b = 760$$

$$330a + 21{,}790b = 50{,}260$$

When these equations are solved simultaneously for a and b we obtain:

$$a = -508, \qquad b = 10$$

Therefore, the linear model is:

$$\hat{y} = -508 + 10x$$

The variance of the residual values for the linear model is calculated as shown below, in Table 11:

Table 11. Variance of the residual values for the linear model

Given X	Given Y	Calculated Y	Residual	(Residual)²
x	y	ŷ = -508 + 10x	y – ŷ	(y – ŷ)²
64	130	-508 + 10 (64) = 132	-2	(-2)² = 4
65	145	-508 + 10 (65) = 142	+3	(+3)² = 9
66	150	-508 + 10 (66) = 152	-2	(-2)² = 4
67	165	-508 + 10 (67) = 162	+3	(+3)² = 9
68	170	-508 + 10 (68) = 172	-2	(-2)² = 4
				Σ = 30

Therefore, the variance of the residual values for the linear model is:

$$V(\text{Residual})_{Linear} = \frac{30}{5 - 2} = 10$$

5.3.2. Fitting the Exponential Model ŷ = ke^cx

To fit the exponential model we need to extend the given bivariate data so that we can calculate, in addition to $\sum x$ and $\sum x^2$, the quantities $\sum \ln y$ and $\sum x \ln y$, as shown below, in Table 12:

Table 12. Calculations for bivariate data of 5 adults for the exponential model

x²	x	y	ln y	x ln y
4096	64	130	4.8675	311.5200
4225	65	145	4.9767	323.4855
4356	66	150	5.0110	330.7260
4489	67	165	5.1059	342.0953
4624	68	170	5.1358	349.2344
Σx² = 21,790	Σx = 330	Σy = 760	Σ ln y = 25.0969	Σ x ln y = 1657.0612

We then substitute the generated data into the normal equations for the exponential model (i.e. equations (33) and (34)):

$$n \ln k + c \sum x_i = \sum \ln y_i, \qquad (\ln k) \sum x_i + c \sum x_i^2 = \sum x_i \ln y_i,$$

and obtain the equations:

$$5 \ln k + 330c = 25.0969$$

$$330 \ln k + 21{,}790c = 1657.0612$$

When these equations are solved simultaneously for ln k and c, we obtain c = 0.06658 and ln k = 0.6251, or k = e^0.6251 = 1.868432.

Therefore, the exponential model is:

$$\hat{y} = 1.868432\, e^{0.06658x}$$

or

ln ŷ = ln k + cx = 0.6251 + 0.06658x

Then, the variance of the residual values, for the exponential model, is calculated as shown below, in Table 13:

Table 13. Variance of the residual values for the exponential model

x	y	ŷ = ke^cx = 1.868432e^0.06658x	y-ŷ	(y-ŷ)²
64	130	1.868432 e^{0.06658 (64)} = 132.4515	-2.4515	6.0099
65	145	1.868432 e^{0.06658 (65)} = 141.5703	3.4297	11.7628
66	150	1.868432 e^{0.06658 (66)} = 151.3169	-1.3169	1.7324
67	165	1.868432 e^{0.06658 (67)} = 161.7346	3.2654	10.6630
68	170	1.868432 e^{0.06658 (68)} = 172.8694	-2.8694	8.2336

Therefore, the variance of the residual values for the exponential model is:

$$V(\text{Residual})_{Exponential} = \frac{\sum (y_i - \hat{y}_i)^2}{5 - 2} \approx 12.80$$
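The calculation behind Table 13 can be reproduced with the short Python sketch below. It takes the fitted values k = 1.868432 and c = 0.06658 as given and evaluates the residuals on the original y scale, not on the ln y scale used for the fit; the variable names are illustrative.

```python
import math

x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]
k, c = 1.868432, 0.06658  # fitted exponential model from the text

# Back-transformed predictions and residuals on the original scale.
y_hat = [k * math.exp(c * xi) for xi in x]
residuals = [yi - fi for yi, fi in zip(y, y_hat)]

# Two parameters (k and c) were estimated, so DOF = n - 2.
variance = sum(e * e for e in residuals) / (len(y) - 2)

for xi, fi, e in zip(x, y_hat, residuals):
    print(xi, round(fi, 4), round(e, 4))
print("V(Residual) =", round(variance, 2))
```

Evaluating the residuals on the original scale is what makes this variance directly comparable with the variances of the linear and quadratic models.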

5.3.3. Fitting the Power Model, ŷ = ax^b

To fit the power model we need to extend the given bivariate data set to generate the quantities $\sum \ln x$, $\sum (\ln x)^2$, $\sum \ln y$, and $\sum (\ln x)(\ln y)$, and this is accomplished as shown below, in Table 14:

Table 14. Calculations for bivariate data of 5 adults for the power model

x	y	ln x	(ln x)²	ln y	(ln x) (ln y)
64	130	4.158883	17.2963085	4.867553	20.2435
65	145	4.17438727	17.42550908	4.976734	20.7748
66	150	4.189654742	17.55320686	5.010635	20.9928
67	165	4.204692619	17.67944002	5.105945	21.4689
68	170	4.219507705	17.80424527	5.135798	21.6705
		Σ ln x = 20.9471	Σ (ln x)² = 87.7581	Σ ln y = 25.0967	Σ (ln x)(ln y) = 105.1505

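We then substitute the generated data into the normal equations of the power model, namely equations (39) and (40):

$$n \ln a + b \sum \ln x_i = \sum \ln y_i, \qquad (\ln a) \sum \ln x_i + b \sum (\ln x_i)^2 = \sum (\ln x_i)(\ln y_i),$$

and obtain the equations:

$$5 \ln a + 20.9471\, b = 25.0967$$

$$20.9471 \ln a + 87.7581\, b = 105.1505$$

When these equations are solved simultaneously for b and ln a we obtain:

$$b = 4.3766, \qquad \ln a = -13.316$$

Therefore, the “linearized” power model becomes:

$$\ln \hat{y} = -13.316 + 4.3766 \ln x$$

Then the variance of the residual values for the power model is obtained as shown below: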

Table 15. Variance of the residual values for the power model

x	y	ln x	ln ŷ = ln a + b ln x = -13.316 + 4.3766 ln x	ŷ	y – ŷ	(y – ŷ)²
64	130	4.158883	ln ŷ₁ = 4.885768	132.3920	-2.3920	5.721664
65	145	4.174387	ln ŷ₂ = 4.953622	141.6874	3.3126	10.973319
66	150	4.189655	ln ŷ₃ = 5.020443	151.4784	-1.4784	2.185667
67	165	4.204693	ln ŷ₄ = 5.086258	161.7833	3.2167	10.347159
68	170	4.219508	ln ŷ₅ = 5.151097	172.6208	-2.6208	6.868592
						Σ = 36.09640

Therefore, the variance of the residual values for the power model is:

$$V(\text{Residual})_{Power} = \frac{36.0964}{5 - 2} = 12.0321$$

5.3.4. Fitting the Quadratic Model, ŷ = a + bx +cx²

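To fit the quadratic model, we need to use the given bivariate data set and extend it to generate the quantities:

$$\sum x = 330,\ \ \sum x^2 = 21{,}790,\ \ \sum x^3 = 1{,}439{,}460,\ \ \sum x^4 = 95{,}135{,}074,\ \ \sum y = 760,\ \ \sum xy = 50{,}260,\ \ \sum x^2 y = 3{,}325{,}270$$

We then substitute the generated data into the normal equations of the quadratic model (see equation (25)), and obtain:

$$5a + 330b + 21{,}790c = 760$$

$$330a + 21{,}790b + 1{,}439{,}460c = 50{,}260$$

$$21{,}790a + 1{,}439{,}460b + 95{,}135{,}074c = 3{,}325{,}270$$

Solving these 3 equations simultaneously, we obtain a = -25,326/7, b = 730/7, c = -5/7. Therefore, the quadratic function ŷ = f (x) is given by:

$$\hat{y} = \frac{1}{7}\left[-25{,}326 + 730x - 5x^2\right]$$

The variance of the residual values for the quadratic model is calculated as shown below, in Table 16: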

Table 16. Variance of the residual values for the quadratic model

x	y	ŷ=(1/7)[-25,326+730x-5x²]	y_i – ŷ_i	(y_i – ŷ_i)²
64	130	ŷ₁ = 130.5714286	-0.5714286	0.326530644
65	145	ŷ₂ = 142.7142857	2.2857143	5.224489861
66	150	ŷ₃ = 153.4285714	-3.4285714	11.75510184
67	165	ŷ₄ = 162.7142857	2.2857143	5.224489861
68	170	ŷ₅ = 170.5714286	-0.5714286	0.326530644
				Σ = 22.85714286

Therefore, the variance of the residual values for the quadratic model is:

$$V(\text{Residual})_{Quadratic} = \frac{22.85714286}{5 - 3} = 11.4286$$

5.3.5. Summary of Results and Selection of the “Best” Model

We have fitted the 4 models: linear, exponential, power, and quadratic models, calculated the respective residual variances, and have obtained the following results:

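a) The linear model is:

$$\hat{y} = -508 + 10x$$

with V (Residual)_Linear = 10

b) The exponential model is:

$$\hat{y} = 1.868432\, e^{0.06658x}$$

with V (Residual)_Exponential = 12.8017

c) The power model is:

$$\hat{y} = e^{-13.316}\, x^{4.3766} \approx \left(1.648 \times 10^{-6}\right) x^{4.3766}$$

with V (Residual)_Power = 12.0321

d) The quadratic model is

$$\hat{y} = \frac{1}{7}\left[-25{,}326 + 730x - 5x^2\right]$$

with V (Residual)_Quadratic = 11.4286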

Since the linear model has the smallest variance of the residual values of the 4 models fitted to the same bivariate data set, the linear model is the “best” model (but the other 3 values are very close). The linear model, therefore, will be selected as the “best” model and used for forecasting purposes.

6. MINITAB Solutions

To obtain the MINITAB solutions of the four models we discussed in this paper we do the following:

6.1. Finding the MINITAB Solution for the Linear Model

The data set used to find the MINITAB solution for the linear model is presented in Table 17.

Table 17. Data set in MINITAB for the linear model

MTB > SET C1
DATA> 64 65 66 67 68
DATA> END
MTB > SET C2
DATA> 130 145 150 165 170
DATA> END
MTB > NAME C1 'X' C2 'Y'
MTB > REGRESS 'Y' 1 'X'

The results of the regression analysis for the linear model are presented in Table 18.

Table 18. Regression analysis: Y versus X for the linear model

Regression equation:			Y = – 508 + 10.0 X

Predictor	Coef	SE Coef	T	p
Constant	-508.000	66.020	-7.700	0.005
X	10.000	1.000	10.000	0.002

Regression fit:	S	R-Sq	R-Sq (adj)
	3.162	97.1%	96.1%

Analysis of Variance:
Source	DF	SS	MS	F	p
Regression	1	1000.0	1000.0	100.0	0.002
Residual Error	3	30.0	10.0
Total	4	1030.0

6.2. Finding the MINITAB Solution for the Exponential Model

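The data set used to find the MINITAB solution for the exponential model is presented in Table 19. Because the regression is run on the linearized model (equation (32)), column C2 holds the ln y values of Table 12, so that Y in the MINITAB output stands for ln y.

Table 19. Data set in MINITAB for the exponential model

MTB > SET C1
DATA> 64 65 66 67 68
DATA> END
MTB > SET C2
DATA> 4.8675 4.9767 5.011 5.1059 5.1358
DATA> END
MTB > NAME C1 'X' C2 'Y'
MTB > REGRESS 'Y' 1 'X'

The results of the regression analysis for the exponential model are presented in Table 20.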

Table 20. Regression analysis: Y versus X for the exponential model

Regression equation:			Y = 0.625 + 0.0666 X

Predictor	Coef	SE Coef	T	p
Constant	0.6251	0.4925	1.27	0.294
X	0.066580	0.007460	8.92	0.003

Regression fit:	S	R-Sq	R-Sq (adj)
	0.0235917	96.4%	95.2%

Analysis of Variance:
Source	DF	SS	MS	F	p
Regression	1	0.044329	0.044329	79.65	0.003
Residual Error	3	0.001670	0.000557
Total	4	0.045999

6.3. Finding the MINITAB Solution for the Power Model

The data set used to find the MINITAB solution for the power model is presented in Table 21. Here X stands for ln x and Y stands for ln y.

Table 21. Data set in MINITAB for the power model

MTB > SET C1
DATA> 4.158883 4.1738727 4.189654742 4.204692619 4.2195077
DATA> END
MTB > SET C2
DATA> 4.867553 4.976734 5.010635 5.105945 5.135798
DATA> END
MTB > NAME C1 'X' C2 'Y'
MTB > REGRESS 'Y' 1 'X'

The results of the regression analysis for the power model are presented in Table 22.

Table 22. Regression analysis: Y versus X for the power model

Regression equation:			Y = – 13.3 + 4.38X

Predictor	Coef	SE Coef	T	p
Constant	-13.316	2.069	-6.44	0.008
X	4.3766	0.4939	8.86	0.003

Regression fit:	S	R-Sq	R-Sq (adj)
	0.0237507	96.3%	95.1%

Analysis of Variance:
Source	DF	SS	MS	F	p
Regression	1	0.044301	0.044301	78.53	0.003
Residual Error	3	0.001692	0.000564
Total	4	0.045993

6.4. Finding the MINITAB Solution for the Wuadratic Model

The data set used to find the MINITAB solution for the quadratic model is presented in Table 23.

Table 23. Data set in MINITAB for the quadratic model

MTB Set C1

DATA 64 65 66 67 68

DATA end

MTB set C2

DATA 4096 4225 4356 4489 4624

DATA end

MTB SET C3

DATA 130 145 150 165 170

DATA END

MTB NAME C1 ‘X1′ C2 ‘X2′ C3 ‘Y’

MTB REGRESS ‘Y’ 2 ‘X1′ ‘X2′

The results of the regression analysis for the quadratic model are presented in Table 24.

Table 24. Regression analysis: Y versus X1, X2 for the quadratic model

Regression equation:			Y = – 3618 + 104 X1 – 0.714 X2

Predictor	Coef	SE Coef	T	p
Constant	-3618	3935	-0.92	0.455
X1	104.3	119.3	0.87	0.474
X2	-0.7143	0.9035	-0.79	0.512

Regression fit:	S	R-Sq	R-Sq (adj)
	3.38062	97.8%	95.6%

Analysis of Variance:
Source	DF	SS	MS	F	p
Regression	2	1007.14	503.57	44.06	0.022
Residual Error	2	22.86	11.43
Total	4	1030.00

Source	DF	Seq SS
X1	1	1000.00
X2	1	7.14
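The quadratic fit of Table 24 is a multiple regression of Y on the two predictors X1 = x and X2 = x². The sketch below solves that least-squares problem with NumPy. It is an illustration under these assumptions and not a reproduction of MINITAB's internal algorithm, and the variable names are illustrative.

```python
import numpy as np

# Data from Table 23 (X2 is simply the square of X1).
x = np.array([64, 65, 66, 67, 68], dtype=float)
y = np.array([130, 145, 150, 165, 170], dtype=float)

# Design matrix with columns 1, X1 = x, X2 = x**2.
X = np.column_stack([np.ones_like(x), x, x ** 2])

# Least-squares solution of X @ coef = y.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b, c = coef

# Analysis-of-variance quantities.
fitted = X @ coef
sse = np.sum((y - fitted) ** 2)        # residual sum of squares
sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = sst - sse                        # regression sum of squares
n, p = X.shape                         # p = 3 fitted parameters
f_stat = (ssr / (p - 1)) / (sse / (n - p))
r_squared = ssr / sst

print(f"y-hat = {a:.0f} + {b:.1f} x + ({c:.4f}) x^2")
print(f"R-Sq = {100 * r_squared:.1f}%, F = {f_stat:.2f}")
```

The columns x and x² are almost collinear over the range 64 to 68. This is consistent with the large standard errors and high individual p-values in Table 24, even though the overall F test is significant.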

7. Conclusions

Reviewing our previous discussion we come to the following conclusions:

The Linear Regression problem is relatively easy to solve and can be handled using algebraic methods.

The problem can also be solved easily using available statistical software, like MINITAB.

Even though the solution to Regression problems can be obtained easily using MINITAB (or other statistical software) it is important to know what the hand methodology is and how it solves these problems before you can properly interpret and understand MINITAB’s output.

In general, non-linear regression is much more difficult to perform than linear regression.

There are, however, some simple non-linear models that can be evaluated relatively easily by utilizing the results of linear regression.

The non-linear models analyzed in this paper are: Exponential Model, Power Model, Quadratic Model.

A procedure is also discussed which allows us to fit several models (such as linear, exponential, power, and quadratic) to the same bivariate data set and to select as the “best fitting” model the one with the “smallest variance of the residuals”.

In a numerical example, in which all 4 of these models were fitted to the same bivariate data set, we found that the Linear model was the “best fit”, with the Quadratic model “second best”. The Power and Exponential models are “third best” and “fourth best” respectively, but are very close to each other.
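The selection procedure can be sketched in Python as follows. The sketch assumes that the residual variance of each model is the sum of squared residuals on the original y scale divided by the residual degrees of freedom, n minus the number of fitted parameters. A different divisor may rank the models differently. The logarithms are recomputed from the raw data, so the power and exponential coefficients may differ slightly from the MINITAB tables. All names are illustrative.

```python
import numpy as np

x = np.array([64, 65, 66, 67, 68], dtype=float)
y = np.array([130, 145, 150, 165, 170], dtype=float)
n = len(x)


def residual_variance(y_hat, n_params):
    # Assumed definition: SSE / (n - number of fitted parameters).
    return float(np.sum((y - y_hat) ** 2) / (n - n_params))


# Linear and quadratic models are fitted directly on the y scale.
linear_hat = np.polyval(np.polyfit(x, y, 1), x)
quadratic_hat = np.polyval(np.polyfit(x, y, 2), x)

# Exponential model: fit ln y = ln k + c*x, then back-transform.
c, ln_k = np.polyfit(x, np.log(y), 1)
exponential_hat = np.exp(ln_k + c * x)

# Power model: fit ln y = ln a + b*ln x, then back-transform.
b, ln_a = np.polyfit(np.log(x), np.log(y), 1)
power_hat = np.exp(ln_a + b * np.log(x))

variances = {
    "linear": residual_variance(linear_hat, 2),
    "quadratic": residual_variance(quadratic_hat, 3),
    "exponential": residual_variance(exponential_hat, 2),
    "power": residual_variance(power_hat, 2),
}

# Smallest residual variance first.
ranking = sorted(variances, key=variances.get)
for name in ranking:
    print(f"{name:12s} residual variance = {variances[name]:.2f}")
```

Under this assumed definition the ordering agrees with the one reported above: linear, quadratic, power, exponential.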

The evaluation of these models is facilitated considerably by using the statistical software package MINITAB which, in addition to estimating the unknown parameters of the corresponding models, also generates additional information (such as the p-value, standard deviations of the parameter estimators, and R²).

This additional information allows us to perform hypothesis testing and construct confidence intervals on the parameters, and also to get a measure of the “goodness” of the equation, by using the value of R². A value of R² close to 1 is an indication of a good fit.

The MINITAB solution for the linear model shows that both a and b (of ŷ = a + bx = -508 + 10x) are significant because the corresponding p-values are smaller than α = 0.05. The value of R² = 97.1% indicates that the regression equation explains 97.1% of the variation in the y-values, while the remaining 2.9% is due to other factors.

The MINITAB solution for the quadratic model shows that a, b, and c (of ŷ = a + bx + cx² = -3,618 + 104.3x - 0.7143x²) are individually not significant (because of the corresponding high p-values), but b and c jointly are significant because the p-value of the F test is p = 0.022 < α = 0.05. The value of R² is 97.8%.

The MINITAB solution for the power model shows that both ln a and b (of ŷ = ax^b or ln y = ln a + b ln x = -13.316 + 4.3766 ln x) are significant because the corresponding p-values are smaller than α = 0.05, while the value of R² is 96.3%.

The MINITAB solution for the exponential model shows that the constant ln k (in ŷ = ke^(cx) = 1.868432e^(0.06658x), or ln ŷ = ln k + cx = 0.6251 + 0.06658x) is not significant because of its high p-value (p = 0.294), while c is significant because its p-value (p = 0.003) is smaller than α = 0.05. The value of R² is 96.4%.
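The significance statements above rest on the t statistic of each coefficient, that is, the estimate divided by its standard error. The sketch below computes these statistics for a simple regression using the standard textbook formulas and compares them with 3.182, the two-sided 5% critical value of Student's t with n - 2 = 3 degrees of freedom. The function name `coefficient_t_stats` is hypothetical, and the exponential model is again assumed to be fitted as ln y on x.

```python
import math
import numpy as np


def coefficient_t_stats(x, y):
    """Return (a, b, t_a, t_b) for the least-squares line y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    a = y.mean() - b * x.mean()
    # Mean squared error with n - 2 residual degrees of freedom.
    mse = np.sum((y - (a + b * x)) ** 2) / (n - 2)
    se_b = math.sqrt(mse / sxx)
    se_a = math.sqrt(mse * (1.0 / n + x.mean() ** 2 / sxx))
    return a, b, a / se_a, b / se_b


T_CRIT = 3.182  # two-sided 5% critical value of Student's t, 3 degrees of freedom

x = [64, 65, 66, 67, 68]
y = [130, 145, 150, 165, 170]

models = {
    "linear": coefficient_t_stats(x, y),
    "exponential (ln y on x)": coefficient_t_stats(x, np.log(y)),
}

for name, (a, b, t_a, t_b) in models.items():
    print(f"{name}: constant t = {t_a:.2f} (significant: {abs(t_a) > T_CRIT}), "
          f"slope t = {t_b:.2f} (significant: {abs(t_b) > T_CRIT})")
```

Comparing |t| with the critical value is equivalent to comparing the p-value with α = 0.05, so the sketch should mark both linear coefficients as significant and only the slope of the exponential model as significant.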


Corresponding Author

Athanasios Vasilopoulos, Ph.D., St. John’s University, The Peter J. Tobin College of Business, CIS/DS Department, 8000 Utopia Parkway, Jamaica, N.Y. 11439
