
This note draws researchers' attention to the use of linear regression for hypothesis testing. Even when multiple explanatory variables are included in a regression equation to avoid the hazard of a simple regression omitting other factors, multicollinearity is unfortunately inherent in multiple regression, simply because the included explanatory variables can share a common parameter domain; that is, they co-vary. Here we show that however one transforms the variance-covariance matrix of the least-squares estimation to reduce the estimation errors, the procedure amounts to affine transformations of the explanatory variables, so that despite the transformation they continue to co-vary, rendering the coefficient invalid as a partial derivative. The root of this problem is that one obtains the explanatory variables' values by observation rather than predetermining their values and then collecting the corresponding values of the dependent variable. This situation becomes especially disconcerting when a transformed explanatory variable has an estimated coefficient enjoying an exceptional degree of confidence yet having no mathematical status as a partial derivative, misleading engineering or medical prescriptions and public policies.

Introduction

It is a well-recognized fact that a simple linear regression can suffer from omitted variables or model misspecification. Thus, multiple regression serves as a better methodology; here, however, one must exercise caution about the choice of explanatory variables and the design of the sample. In empirical work there are mainly two approaches: observational (as in astronomy) or experimental (as in a particle collider). In practical applications of regression it is mostly the former: one draws a sample from the underlying population and observes the input data [1]. This presents problems, as the analyst has no control over the observed values, say $\{(x_{i1}, x_{i2}; y_i) \mid i = 1, 2, \ldots, n\}$. This note seeks to draw researchers' attention to a highly likely situation where $x_1$ and $x_2$ are both functions of $t$ and $s$ (connoting time and space), i.e., $x_1(t,s)$ and $x_2(t,s)$, so that the very construct of $E(Y) = \alpha + \beta_1 X_1 + \beta_2 X_2$ is immediately invalid, as $\beta_j = \partial E(Y)/\partial X_j$, $j = 1, 2$, does not exist. In the next section we show that all modifications of the "variance-covariance matrix" $X^{T}X$ of the least-squares estimation for the purpose of alleviating multicollinearity cannot treat this problem of explanatory variables sharing the same parameter domain. Then in Section 3 we conclude with a summary remark.
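To fix ideas, here is a minimal simulation sketch (hypothetical functional forms and coefficients, chosen only for illustration) of two regressors that share the parameter domain $(t,s)$ and therefore co-vary:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Underlying "time" and "space" parameters shared by both regressors.
t = rng.uniform(0, 10, n)
s = rng.uniform(0, 10, n)

# Hypothetical functional forms: x1 and x2 are both functions of (t, s).
x1 = 2.0 * t + 0.5 * s + rng.normal(0, 0.1, n)
x2 = 1.0 * t - 0.3 * s + rng.normal(0, 0.1, n)

# Because x1 and x2 share the parameter domain (t, s), they co-vary.
print(np.corrcoef(x1, x2)[0, 1])   # typically far from 0
```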

Analysis

All remedial procedures against the problem of multicollinearity are based on modifications of the matrix $X^{T}X$ of the ordinary least-squares estimation [2]; in the case of two explanatory variables,

$$X^{T}X = \begin{pmatrix} \sum_{i=1}^{n}(x_{i1}-\bar{x}_1)^2 & \sum_{i=1}^{n}(x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2) \\ \sum_{i=1}^{n}(x_{i1}-\bar{x}_1)(x_{i2}-\bar{x}_2) & \sum_{i=1}^{n}(x_{i2}-\bar{x}_2)^2 \end{pmatrix}. \tag{1}$$
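As a concrete check of (1), a short sketch (hypothetical data; the function name is mine) that forms the centered sums of squares and cross-products directly:

```python
import numpy as np

def centered_cross_product(x1, x2):
    """2x2 matrix of centered sums of squares and cross-products, as in (1)."""
    d1 = x1 - x1.mean()
    d2 = x2 - x2.mean()
    return np.array([[d1 @ d1, d1 @ d2],
                     [d1 @ d2, d2 @ d2]])

x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([2.1, 3.9, 6.2, 7.8])
print(centered_cross_product(x1, x2))
```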

Since $X^{T}X$ is a linear operator, transformations of it are in general linear or affine [3], [4]. Consider first a linear transformation of the $(1,1)$-entry of the above matrix by $k_1 > 1$:

$$k_1^2 \sum_{i}(x_{i1}-\bar{x}_1)^2 = \sum_{i}(k_1 x_{i1} - k_1\bar{x}_1)^2; \tag{2}$$

then each individual observation $i$ of $x_1(t,s)$ undergoes the affine transformation

$$k_1 x_{i1}(t_i, s_i) - (k_1\bar{x}_1), \tag{3}$$

which is still a function of $(t,s)$. Consider now an affine transformation of the $(1,1)$-entry of the above matrix by $M_1 > 0$, i.e.,

$$M_1 + \sum_{i}(x_{i1}-\bar{x}_1)^2; \tag{4}$$

but then (4) can be re-expressed in the form of (2):

$$M_1 + \sum_{i}(x_{i1}-\bar{x}_1)^2 = k_1^2\sum_{i}(x_{i1}-\bar{x}_1)^2 = \sum_{i}(k_1 x_{i1}-k_1\bar{x}_1)^2, \quad \text{with } k_1^2 = \frac{M_1}{\sum_{i}(x_{i1}-\bar{x}_1)^2} + 1. \tag{5}$$
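A quick numerical check of the equivalence in (5) (hypothetical values for $M_1$ and the sample):

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
d1 = x1 - x1.mean()
S11 = d1 @ d1                            # original (1,1)-entry

M1 = 3.0                                 # affine shift of the (1,1)-entry, per (4)
k1 = np.sqrt(M1 / S11 + 1.0)             # equivalent scaling factor, per (5)

lhs = M1 + S11                           # affine transformation (4)
rhs = np.sum((k1 * x1 - k1 * x1.mean()) ** 2)   # scaled observations, per (2)
print(np.isclose(lhs, rhs))              # True: (4) is (2) in disguise
```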

Thus, each diagonal entry of $X^{T}X$ remains a function of $(t,s)$ following any linear/affine transformation, and hence each transformed explanatory variable à la (3) is still a function of $(t,s)$. Next, consider the off-diagonal entries of $X^{T}X$: denote the transformed $x_1$ and $x_2$ by $x_1'$ and $x_2'$ respectively; then one has

$$\sum_{i}(x_{i1}'-\bar{x}_1')(x_{i2}'-\bar{x}_2') = \left\langle k_1 x_{1,\,n\times 1} - k_1\bar{x}_{1,\,n\times 1},\; k_2 x_{2,\,n\times 1} - k_2\bar{x}_{2,\,n\times 1}\right\rangle = k_1 k_2\left(\langle x_1, x_2\rangle - \langle x_1, \bar{x}_2\rangle - \langle \bar{x}_1, x_2\rangle + \langle \bar{x}_1, \bar{x}_2\rangle\right), \tag{6}$$

also still a function of $(t,s)$.
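Again a small numerical check (hypothetical functional forms for $x_1(t,s)$ and $x_2(t,s)$): the transformed off-diagonal entry is simply $k_1 k_2$ times the original one, so the transformed variables co-vary exactly as before:

```python
import numpy as np

rng = np.random.default_rng(2)
t = rng.uniform(0, 10, 50)
s = rng.uniform(0, 10, 50)
x1 = 2.0 * t + 0.5 * s          # hypothetical x1(t, s)
x2 = 1.0 * t - 0.3 * s          # hypothetical x2(t, s)
k1, k2 = 1.7, 2.3

def cross(a, b):
    """Centered cross-product, i.e., an off-diagonal entry of X^T X."""
    return (a - a.mean()) @ (b - b.mean())

# The transformed off-diagonal entry is k1*k2 times the original one,
# so the transformed variables still co-vary through (t, s).
print(np.isclose(cross(k1 * x1, k2 * x2), k1 * k2 * cross(x1, x2)))  # True
```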

As such, any linear or affine transformation of the matrix $X^{T}X$ amounts to a transformation of an explanatory variable $x_j$ into $k_j x_j(t,s) - m_j = f_j(t,s)$. Consequently, the transformed regression equation remains

$$\hat{y} = b_1 x_1(t,s) + b_2 x_2(t,s) + \cdots + b_k x_k(t,s); \tag{7}$$

i.e., the root problem of multicollinearity remains: while the $t$-statistics of $b_1, b_2, \ldots, b_k$ can be driven to high (absolute) values, to the extent that the 99% confidence intervals around $b_1, b_2, \ldots, b_k$ are extremely narrow, one still has the problem that

$$E(b_j) = \beta_j = \frac{\partial E(Y)}{\partial X_j}, \quad j = 1, 2, \ldots, k, \;\text{ do not exist}, \tag{8}$$

by the simple fact that $x_j = f_j(t,s)$, $j = 1, 2, \ldots, k$; i.e., all the explanatory variables co-vary. Therefore, the engineered, highly significant estimates of the coefficients can only lead to false confidence in their values [5]–[7].
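To illustrate numerically (a sketch with hypothetical data; a ridge penalty stands in for the modifications of $X^{T}X$ discussed above): the reported standard errors can be driven down by enlarging the penalty, yet the coefficient estimates keep drifting with it, because the regressors never stop co-varying through $(t,s)$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
t = rng.uniform(0, 10, n)
s = rng.uniform(0, 10, n)
x1 = 2.0 * t + 0.5 * s + rng.normal(0, 0.1, n)
x2 = 1.0 * t - 0.3 * s + rng.normal(0, 0.1, n)
y  = 3.0 * t + 1.0 * s + rng.normal(0, 1.0, n)   # y is also driven by (t, s)

X  = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
yc = y - y.mean()

def ridge(X, y, lam):
    """Ridge estimate and its (conditional) standard errors."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    b = np.linalg.solve(A, X.T @ y)
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(A) @ (X.T @ X) @ np.linalg.inv(A)
    return b, np.sqrt(np.diag(cov))

for lam in [0.0, 10.0, 1000.0]:
    b, se = ridge(X, yc, lam)
    print(lam, b, se)   # standard errors shrink, but b keeps drifting with lam
```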

Summary Remark

From the above analysis, we see that hypothesis testing by a regression equation without an underlying mathematical model grounded in theory can be problematic, unless one fixes the explanatory variables' input values $\{(x_{i1}, x_{i2}, \ldots, x_{ik}) \mid i = 1, 2, \ldots, n\}$ in advance and then collects the dependent variable's values $\{y_i \mid i = 1, 2, \ldots, n\}$. Otherwise, no matter how one transforms the matrix $X^{T}X$, the problem of $x_j(t,s)$ for all $j$ persists; the transformation is then not only futile but can also be dangerous: consider a wrong sign of a coefficient (in the sense of $\partial X_j/\partial t > 0$, $\partial X_j/\partial s > 0$, $\partial Y/\partial t > 0$, $\partial Y/\partial s > 0$, but $b_j < 0$) that nevertheless carries a high degree of confidence in engineering or medical applications.
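For contrast, a minimal sketch of the fixed-design alternative mentioned above (a hypothetical 2×2 factorial layout): when the analyst chooses the regressor values in advance, the off-diagonal entries of $X^{T}X$ can be made exactly zero by design:

```python
import numpy as np

# Hypothetical fixed (experimental) design: the analyst chooses x1 and x2 on a
# grid before observing y, so the regressors need not share a parameter domain.
levels = np.array([-1.0, 1.0])
x1, x2 = np.meshgrid(levels, levels)
x1, x2 = np.repeat(x1.ravel(), 25), np.repeat(x2.ravel(), 25)  # 25 replicates per cell

d1, d2 = x1 - x1.mean(), x2 - x2.mean()
print(d1 @ d2)   # 0.0: the off-diagonal entry vanishes by design
```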

References

  1. Kmenta J. Elements of Econometrics. New York: Macmillan; 1971.
  2. Huffel SV, Vandewalle J. Algebraic Connections between Total Least Squares Estimation and Classical Linear Regression in Multicollinearity Problems. Philadelphia, PA: SIAM; 1991, ch. 9. doi: 10.1137/1.9781611971002.
  3. Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. 1970;12:55–67. doi: 10.2307/1267351.
  4. Liu K. A new class of biased estimate in linear regression. Commun Stat Theory Methods. 1993;22:393–402. doi:10.1080/03610929308831027.
  5. Liu R, Wang H, Wang S. Functional variable selection via Gram-Schmidt orthogonalization for multiple functional linear regression. J Stat Comput Simul. 2018;88:3664–80. doi: 10.1080/00949655.2018.1530776.
  6. Oman SD. A confidence bound approach to choosing the biasing parameter in ridge regression. JASA. 1981;76:452–61. doi: 10.2307/2287849.
  7. Smith G, Campbell F. A critique of some ridge regression methods. JASA. 1980;75:74–81. doi: 10.1080/01621459.1980.10477428.