Alex Towell (atowell@siue.edu)
Problem 1
Regression through the origin: We will consider a special case of simple linear regression where the intercept is assumed to be zero from the outset. Let [\[ Y_i = \beta x_i + \epsilon_i\,, \]]{.math .display} where [\(\operatorname{E}(\epsilon_i) = 0\)]{.math .inline} and [\(\operatorname{Var}(\epsilon_i) = \sigma^2\)]{.math .inline}.
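As a side note (not part of the derivations below), a model of this form can be fit in R by suppressing the intercept in lm(); the data here are simulated purely for illustration.
# illustrative only: fit a regression through the origin on simulated data
set.seed(1)
x <- runif(50, 1, 10)          # arbitrary design points
y <- 2.5 * x + rnorm(50)       # true slope 2.5, standard normal errors
fit0 <- lm(y ~ x + 0)          # "+ 0" (equivalently "- 1") drops the intercept
coef(fit0)                     # single slope estimate, no intercept term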
Part (A)
Part (B)
Part (C)
Part (D)
The equation [\(\mathbf{Y} = \mathbf{X} \mathbf{\beta }+ \mathbf{\epsilon}\)]{.math .inline} can be rewritten as [\[\begin{equation} \begin{pmatrix} Y_1\\ Y_2\\ \vdots\\ Y_n \end{pmatrix} = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} \beta + \begin{pmatrix} \epsilon_1\\ \epsilon_2\\ \vdots\\ \epsilon_n \end{pmatrix}\,. \end{equation}\]]{.math .display}
Part (E)
Part (F)
Observe that [\(\hat\beta\)]{.math .inline} is a linear combination of the [\(Y_i\)]{.math .inline} with constant coefficients, i.e., [\(\hat\beta = \mathbf{c}' \mathbf{Y}\)]{.math .inline} where [\(\mathbf{c}\)]{.math .inline} is a constant vector. Since [\(\mathbf{Y}\)]{.math .inline} is a vector of independent normals, [\(\hat\beta\)]{.math .inline} is normally distributed, [\(\hat\beta \sim \mathcal{N}(\beta, \operatorname{Var}(\hat\beta))\)]{.math .inline}.
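As an informal check (a simulation sketch with arbitrary illustrative values of [\(\beta\)]{.math .inline}, [\(\sigma\)]{.math .inline}, and the design points, using the standard least-squares estimator [\(\hat\beta = \sum x_i Y_i / \sum x_i^2\)]{.math .inline} for this model), the sampling distribution of [\(\hat\beta\)]{.math .inline} can be approximated as follows:
# simulate the sampling distribution of the no-intercept least-squares estimator
set.seed(42)
beta <- 2; sigma <- 1
x <- runif(30, 1, 10)                      # fixed design points
beta.hat <- replicate(5000, {
  y <- beta * x + rnorm(30, sd = sigma)    # Y_i = beta*x_i + eps_i
  sum(x * y) / sum(x^2)                    # least-squares estimate through the origin
})
cat("mean of beta.hat =", mean(beta.hat), "(true beta =", beta, ")\n")
cat("var of beta.hat  =", var(beta.hat), "vs sigma^2/sum(x^2) =", sigma^2/sum(x^2), "\n")
hist(beta.hat)                             # approximately normal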
Problem 2
Burple, Stephens, and Gloopshire (2014) report on a study in the Journal of Questionable Research. Data were collected on the number of minutes [\(Y_i\)]{.math .inline} it took [\(n = 237\)]{.math .inline} Glippers to learn how to drive a small, motorized car. Two predictors of interest are the estimated age of the glipper in months [\(x_{i 1}\)]{.math .inline} and the Glippers Maladaptive Score (GMS) [\(x_{i 2}\)]{.math .inline}, a number from [\(50\)]{.math .inline} to [\(100\)]{.math .inline} that summarizes how poor the glipper’s vision is. Consider the following multiple regression output from R.
| Coefficients | Estimate   | Std. Error |
|--------------|------------|------------|
| (Intercept)  | -182.57923 | 7.41169    |
| Age          | 8.56069    | 0.31150    |
| GMS          | 0.28066    | 0.04621    |

: Regression output from R
| type       | df  | ss     | ms |
|------------|-----|--------|----|
| regression |     |        |    |
| error      |     | 23565  |    |
| total      | 236 | 104682 |    |

: Analysis of Variance Table
Common code
The following R code is used to produce many of the answers to this problem set.
n <- 237
p <- 3
ss.e <- 23565
ss.t <- 104682
# ss.e + ss.r = ss.t => ss.r = ss.t - ss.e
ss.r <- ss.t - ss.e
df.r <- p-1
df.e <- n-p
df.t <- n-1
ms.r <- ss.r / df.r
ms.e <- ss.e / df.e
# sample variance
ms.t <- ss.t / df.t
B0.est <- -182.57923
B1.est <- 8.56069
B2.est <- 0.28066
B0.se <- 7.41169
B1.se <- 0.31150
B2.se <- 0.04621
Part (A)
There are [\(p = 3\)]{.math .inline} parameters in the model, and it is given that there are [\(n = 237\)]{.math .inline} observations with an error sum of squares [\(\rm{SS_E} = 23565\)]{.math .inline} and a total sum of squares [\(\rm{SS_T} = 104682\)]{.math .inline}.
From this given information, we may compute the remaining values in the ANOVA table by observing the following set of relationships: [\[\begin{align*} \rm{SS_R} &= \rm{SS_T} - \rm{SS_E}\\ \rm{MS_R} &= \rm{SS_R} / \rm{df_R}\\ \rm{MS_E} &= \rm{SS_E} / \rm{df_E}\\ \rm{MS_T} &= \rm{SS_T} / \rm{df_T} \end{align*}\]]{.math .display} where [\(\rm{df}_R = p-1\)]{.math .inline}, [\(\rm{df}_E = n-p\)]{.math .inline}, and [\(\rm{df}_T = n-1\)]{.math .inline}.
We output the ANOVA table with the following R code:
anova_table <- data.frame(
type = c("regression","residual","total"),
ss = c(ss.r,ss.e,ss.t),
dof = c(df.r,df.e,df.t),
ms = c(ms.r,ms.e,ms.t)
)
knitr::kable(anova_table, digits=2, caption = "ANOVA table.")
| type       | ss     | dof | ms       |
|------------|--------|-----|----------|
| regression | 81117  | 2   | 40558.50 |
| residual   | 23565  | 234 | 100.71   |
| total      | 104682 | 236 | 443.57   |

: ANOVA table.
Part (B)
Consider the model [\[\begin{equation} Y_i = \beta_0 + \beta_1 x_{i 1} + \beta_2 x_{i 2} + \epsilon_i \end{equation}\]]{.math .display} where [\(x_{i 1}\)]{.math .inline} and [\(x_{i 2}\)]{.math .inline} are the predictor (independent) variables.
If we wish to do an overall test on the model, then we are interested in the hypothesis test [\[\begin{align*} H_0 &\colon \beta_1 = \beta_2 = 0\,\\ H_1 &\colon \beta_1 \neq 0 \lor \beta_2 \neq 0\,. \end{align*}\]]{.math .display}
Rejection of [\(H_0\)]{.math .inline} implies that one or more predictors is significant in the model.
If [\(H_0\)]{.math .inline} is true and the model errors are normal and independent with constant variance, then the test statistic for significance of regression is [\[\begin{equation} F_0 = \frac{\rm{SS_R} / (p-1)}{\rm{SS_E}/(n-p)} = \frac{\rm{MS_R}}{\rm{MS_E}}\,, \end{equation}\]]{.math .display} where [\(F_0 \sim F(p-1,n-p)\)]{.math .inline}.
We compute the statistic using the following R code:
F.ratio=ms.r/ms.e
p_value=1-pf(F.ratio,p-1,n-p) #p-value from the F test with (p-1, n-p) degrees of freedom
cat("test statistic F0 =",F.ratio, "with p-value", p_value, "\n")
## test statistic F0 = 402.7451 with p-value 0
Under [\(H_0\)]{.math .inline}, the [\(0.05\)]{.math .inline}-level critical value of the [\(F(2, 234)\)]{.math .inline} distribution is approximately [\(3.03\)]{.math .inline}, i.e., [\(\Pr(F_0 > 3.03) \approx 0.05\)]{.math .inline}. The observed statistic [\(F_0 \approx 402.7\)]{.math .inline} is far larger, and the [\(p\)]{.math .inline}-value is essentially [\(0\)]{.math .inline}. Thus, we have very strong evidence to reject the null hypothesis, i.e., either [\(\beta_1\)]{.math .inline} or [\(\beta_2\)]{.math .inline} (or both) are not zero, and we conclude that one or more of the predictors in the model is significant.
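For reference, the critical value used in this comparison can be computed directly in R (a one-line sketch):
qf(0.95, df1 = p-1, df2 = n-p)   # 0.05-level critical value of F(2, 234), roughly 3.03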
Part (C)
Observe [\(\hat{\mathbf{\beta}} \sim \mathcal{N}(\mathbf{\beta}, \sigma^2 \mathbf{C})\)]{.math .inline} and therefore [\(\hat\beta_j \sim \mathcal{N}(\beta_j, \sigma^2 C_{j j})\)]{.math .inline}. Thus, [\(\frac{\hat\beta_j - \beta_j}{\sqrt{\sigma^2 C_{j j}}} \sim \mathcal{N}(0,1)\)]{.math .inline}. Since we do not know [\(\sigma^2\)]{.math .inline}, we estimate it with [\(\hat\sigma^2 = \rm{MS_E}\)]{.math .inline}, in which case
[\[ \frac{\hat\beta_j - \beta_j}{\operatorname{SE}(\hat\beta_j)} \sim t(\rm{df} = n-p)\,, \]]{.math .display} where [\(\operatorname{SE}(\hat\beta_j) \coloneqq \sqrt{\hat\sigma^2 C_{j j}}\)]{.math .inline} is denoted the standard error and [\(\mathbf{C} \coloneqq (\mathbf{X}' \mathbf{X})^{-1}\)]{.math .inline}.
We consider the hypothesis test for testing the significance of [\(\beta_j\)]{.math .inline}, [\[\begin{align*} H_0 : \beta_j = 0\,,\\ H_1 : \beta_j \neq 0\,, \end{align*}\]]{.math .display} with test statistic [\[ t_0 = \frac{\hat\beta_j}{\operatorname{SE}(\hat\beta_j)} \sim t(\rm{df}=n-p)\,. \]]{.math .display} We reject [\(H_0\)]{.math .inline} if [\(t_0\)]{.math .inline} has a sufficiently small [\(p\)]{.math .inline}-value, where [\[ \text{$p$-value} = 2 \Pr(t < -|t_0|)\,. \]]{.math .display} We specify that we reject [\(H_0\)]{.math .inline} if [\(p\text{-value} \leq \alpha = 0.05\)]{.math .inline} (or equivalently [\(|t_0| > t_{\alpha/2,n-p}\)]{.math .inline}).
B1.t0 <- B1.est / B1.se
B2.t0 <- B2.est / B2.se
cat("B1.t0 =",B1.t0,"\n")
## B1.t0 = 27.48215
cat("B2.t0 =",B2.t0,"\n")
## B2.t0 = 6.073577
#p-value from the T test
B1.pval=2*pt(-1*abs(B1.t0),df=n-p)
B2.pval=2*pt(-1*abs(B2.t0),df=n-p)
cat("B1 pvalue =",B1.pval,",B2 pvalue =", B2.pval)
## B1 pvalue = 3.321975e-75 ,B2 pvalue = 5.019091e-09
It is given that [\(\hat\beta_1 = 8.56\)]{.math .inline} and [\(\hat\beta_2 = 0.28\)]{.math .inline} with standard errors [\(0.31150\)]{.math .inline} and [\(0.04621\)]{.math .inline}, yielding [\(p\)]{.math .inline}-values of [\(3.32 \times 10^{-75}\)]{.math .inline} and [\(5.02 \times 10^{-9}\)]{.math .inline} respectively. These [\(p\)]{.math .inline}-values are practically zero, and thus, in the presence of the other, neither predictor should be dropped.
Part (D)
The estimated coefficients in the model are [\(\hat\beta_0 = -182.58\)]{.math .inline}, [\(\hat\beta_1 = 8.56\)]{.math .inline}, and [\(\hat\beta_2 = 0.28\)]{.math .inline}, thus the fitted regression equation is
[\[\begin{equation} \hat{Y}_i = -182.58 + 8.56 x_{i 1} + 0.28 x_{i 2}\,, \end{equation}\]]{.math .display} which estimates the mean response [\[\begin{equation} \operatorname{E}(Y_i) = \beta_0 + \beta_1 x_{i 1} + \beta_2 x_{i 2}\,, \end{equation}\]]{.math .display} where [\(x_{i 1}\)]{.math .inline} is the estimated age of the glipper in months, [\(x_{i 2}\)]{.math .inline} is the Glippers Maladaptive Score (GMS), a number between [\(50\)]{.math .inline} and [\(100\)]{.math .inline} that summarizes how poor the glipper’s vision is, and [\(\hat{Y}_i\)]{.math .inline} is the predicted number of minutes it takes a glipper with those predictor values to learn how to drive a small, motorized car.
According to the model, each additional month of age is associated with an increase of [\(8.56\)]{.math .inline} minutes in learning time, and each one-point increase in GMS with an increase of [\(0.28\)]{.math .inline} minutes, holding the other predictor fixed.
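As a concrete illustration (the age and GMS values below are hypothetical, chosen only to show how the fitted equation is used), the prediction can be computed from the estimates in the common code:
# predicted learning time for a hypothetical glipper: age 24 months, GMS 75
age <- 24
gms <- 75
yhat <- B0.est + B1.est*age + B2.est*gms
cat("predicted minutes to learn:", yhat, "\n")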
Part (E)
Recall that [\(R^2 = SS_R / SS_T = 1 - SS_E / SS_T\)]{.math .inline}. Thus, it is the proportion of the total variability that is explained by the regression model. All else being equal, the more of the variability the regression model explains, the better. Since [\(R^2\)]{.math .inline} is a number between [\(0\)]{.math .inline} and [\(1\)]{.math .inline}, the closer to [\(1\)]{.math .inline} the better.
We run the following R code:
#R2 <- 1 - ss.e / ss.t
R2 <- ss.r / ss.t
cat("R2 =",R2,"\n")
## R2 = 0.7748897
We see that [\(R^2 \approx 0.77\)]{.math .inline}, and thus the model explains around [\(77\%\)]{.math .inline} of the variability in the data.
Note that a large value of [\(R^2\)]{.math .inline} does not necessarily imply the model is good, since we can always capture more variability by adding parameters to the model, even if the extra terms merely fit the particulars (noise) of the observed data.
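One standard way to account for this (a brief aside, using the mean squares already computed in the common code) is the adjusted [\(R^2\)]{.math .inline}, which penalizes additional parameters:
# adjusted R^2 = 1 - MSE/MST; unlike R^2, it can decrease when a useless predictor is added
R2.adj <- 1 - ms.e / ms.t
cat("adjusted R2 =", R2.adj, "\n")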
Problem 3
Consider the “mtcars” data in R. You can load and view the dataset by typing “mtcars” in the R console. Consider the response variable miles per gallon ([\(Y=\rm{mpg}\)]{.math .inline}), and two predictors, horsepower and weight [\((X_1 = \rm{hp}, X_2 = \rm{wt})\)]{.math .inline}. Ignore other variables for now.
Part (A)
We print out the scatterplot matrix with the following R code.
pairs(mpg~hp+wt,mtcars, main="Scatterplot Matrix") #construct scatterplots
(Figure: scatterplot matrix of mpg, hp, and wt.)
The scatter plot visually indicates that the miles per gallon (mpg) is negatively correlated with both the horsepower (hp) and the weight (wt).
Part (B)
We use the following R code to fit the model and then report the ANOVA table.
#### Regression Analysis Using lm() ####
# fit a multiple regression model
mpg_hp_wt.model=lm(mpg~hp+wt, data=mtcars)
# get details from the regression output
summary(mpg_hp_wt.model)
##
## Call:
## lm(formula = mpg ~ hp + wt, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.941 -1.600 -0.182 1.050 5.854
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
## hp -0.03177 0.00903 -3.519 0.00145 **
## wt -3.87783 0.63273 -6.129 1.12e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
## F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
# get the anova table
anova(mpg_hp_wt.model)
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## hp 1 678.37 678.37 100.862 5.987e-11 ***
## wt 1 252.63 252.63 37.561 1.120e-06 ***
## Residuals 29 195.05 6.73
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Part (C)
The [\(p\)]{.math .inline}-value of [\(9.109 \times 10^{-12}\)]{.math .inline} for the overall [\(F\)]{.math .inline}-test is extremely small (the [\(F_0\)]{.math .inline} statistic of [\(69.21\)]{.math .inline} is very large), thus we have very strong evidence that either [\(\beta_1\)]{.math .inline} or [\(\beta_2\)]{.math .inline} (or both) are not zero. In other words, we conclude that one or more of the predictors in the model are significant.
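The same overall statistic and its [\(p\)]{.math .inline}-value can be recovered from the fitted object (a short sketch using the model fit in Part (B)):
# recompute the overall F-test p-value from the stored F statistic
fstat <- summary(mpg_hp_wt.model)$fstatistic   # named vector: value, numdf, dendf
pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)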
Part (D)
The hypothesis test for the significance of [\(\beta_j\)]{.math .inline}, [\[\begin{align*} H_0 &\colon \beta_j = 0\,,\\ H_1 &\colon \beta_j \neq 0\,, \end{align*}\]]{.math .display} uses the individual [\(t\)]{.math .inline}-statistic [\[\begin{equation} t_0 = \frac{\hat\beta_j}{\operatorname{SE}(\hat\beta_j)}\,, \end{equation}\]]{.math .display} which is reported in the regression summary in Part (B). Both predictors have [\(p\)]{.math .inline}-values that are approximately zero ([\(0.00145\)]{.math .inline} for hp and [\(1.12 \times 10^{-6}\)]{.math .inline} for wt), so in the presence of the other, neither should be dropped.
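These individual tests (and companion confidence intervals) can also be extracted directly from the fitted model, as a quick sketch:
# coefficient table: estimates, standard errors, t values, and p-values
coef(summary(mpg_hp_wt.model))
# 95% confidence intervals for the coefficients
confint(mpg_hp_wt.model, level = 0.95)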
Part (E)
A QQ-normal probability plot is shown next.
# construct qq plot of the residuals
qqnorm(mpg_hp_wt.model$residual)
(Figure: normal Q-Q plot of the residuals.) The Q-Q plot of the residuals deviates noticeably from a straight 45-degree line, suggesting that the normality assumption on the error term may not be entirely reasonable.
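A reference line through the quartiles makes the departure easier to judge (an optional addition to the plot above):
# redraw the normal Q-Q plot and overlay a reference line through the quartiles
qqnorm(mpg_hp_wt.model$residuals)
qqline(mpg_hp_wt.model$residuals)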
Next, we show the histogram of the residuals.
# construct a histogram plot of the residuals
hist(mpg_hp_wt.model$residual)
(Figure: histogram of the residuals.) The histogram is not especially symmetric, which is consistent with our suspicion from the Q-Q plot.
Part (F)
The following R code computes the regression sums of squares for the full and reduced models.
#### Testing on group of coefficients
result.full=lm(mpg~hp+wt, data=mtcars)
result.red_x1=lm(mpg~hp, data=mtcars)
result.red_x2=lm(mpg~wt, data=mtcars)
summary(result.full) #get details from the regression output
##
## Call:
## lm(formula = mpg ~ hp + wt, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.941 -1.600 -0.182 1.050 5.854
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
## hp -0.03177 0.00903 -3.519 0.00145 **
## wt -3.87783 0.63273 -6.129 1.12e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
## F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
anova(result.full) #ANOVA table for the full model
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## hp 1 678.37 678.37 100.862 5.987e-11 ***
## wt 1 252.63 252.63 37.561 1.120e-06 ***
## Residuals 29 195.05 6.73
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(result.red_x1) #ANOVA table for the reduced model
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## hp 1 678.37 678.37 45.46 1.788e-07 ***
## Residuals 30 447.67 14.92
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(result.red_x2) #ANOVA table for the reduced model
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## wt 1 847.73 847.73 91.375 1.294e-10 ***
## Residuals 30 278.32 9.28
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(result.red_x1,result.full,test = "F") # partial F-test for the effect of wt given that hp is in the model
## Analysis of Variance Table
##
## Model 1: mpg ~ hp
## Model 2: mpg ~ hp + wt
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 30 447.67
## 2 29 195.05 1 252.63 37.561 1.12e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The regression sum of squares for the full model is [\(\operatorname{SSR}(x_1,x_2) = 931.00\)]{.math .inline}. Removing [\(x_1\)]{.math .inline} (keeping only weight) gives [\(\operatorname{SSR}(x_2) = 847.73\)]{.math .inline}, and removing [\(x_2\)]{.math .inline} (keeping only horsepower) gives [\(\operatorname{SSR}(x_1) = 678.37\)]{.math .inline}. The extra sum of squares from adding [\(x_2\)]{.math .inline} when [\(x_1\)]{.math .inline} is already in the model is [\(\operatorname{SSR}(x_2|x_1) = \operatorname{SSR}(x_1,x_2) - \operatorname{SSR}(x_1) = 931.00 - 678.37 = 252.63\)]{.math .inline}, which agrees with the wt row of the full-model ANOVA table and with the partial [\(F\)]{.math .inline}-test above. Thus the decomposition [\(\operatorname{SSR}(x_1,x_2) = \operatorname{SSR}(x_1) + \operatorname{SSR}(x_2|x_1)\)]{.math .inline} checks out: [\(931.00 = 678.37 + 252.63\)]{.math .inline}.
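A quick numerical check of this decomposition, pulling the sums of squares from the ANOVA objects already computed (a small sketch):
# extract sums of squares and verify SSR(x1,x2) = SSR(x1) + SSR(x2|x1)
ssr.x1 <- anova(result.red_x1)["hp", "Sum Sq"]               # SSR(x1)
ssr.full <- sum(anova(result.full)[c("hp","wt"), "Sum Sq"])  # SSR(x1, x2)
ssr.x2.given.x1 <- ssr.full - ssr.x1                         # SSR(x2 | x1)
cat("SSR(x2|x1) =", ssr.x2.given.x1, "\n")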
Part (G)
The confidence interval is computed in the next section, Alternative approach to Problem 3, near the end of the R code block.
The resulting 95% confidence interval for the mean response at [\(\rm{hp} = 100\)]{.math .inline} and [\(\rm{wt} = 4\)]{.math .inline} (weight in 1000 lbs) is [\([16.66102, 20.41629]\)]{.math .inline}.
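This can be cross-checked with the built-in prediction machinery (a brief sketch using the lm fit from Part (B)):
# 95% confidence interval for the mean response at hp = 100, wt = 4
predict(mpg_hp_wt.model, newdata = data.frame(hp = 100, wt = 4),
        interval = "confidence", level = 0.95)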
Alternative approach to Problem 3
We use the matrix approach. Here is the R code.
#### Regression analysis using matrix methods ####
p3.Y <- mtcars[c("mpg")]
rownames(p3.Y) <- NULL
colnames(p3.Y) <- NULL
p3.Y <- data.matrix(p3.Y)
p3.n <- nrow(p3.Y)
p3.X <- mtcars[c("hp","wt")]
rownames(p3.X) <- NULL
colnames(p3.X) <- NULL
p3.X <- cbind(rep(1,p3.n),data.matrix(p3.X))
p3.XtX=t(p3.X)%*%p3.X #get X'X
p3.C=solve(p3.XtX)
p3.coeffs=p3.C%*%t(p3.X)%*%p3.Y #Least Squares result
print(p3.coeffs)
## [,1]
## [1,] 37.22727012
## [2,] -0.03177295
## [3,] -3.87783074
# number of parameters p
p3.p <- 3
# degrees of freedom for SSR,SSE,and SST respectively
p3.df.r <- p3.p-1
p3.df.e <- p3.n-p3.p
p3.df.t <- p3.n-1
# SSE = y'y - B'x'y
p3.ss.e=t(p3.Y)%*%p3.Y-t(p3.coeffs)%*%t(p3.X)%*%p3.Y # calculate SSE
p3.ms.e=p3.ss.e/p3.df.e # calculate MSE
# SSR = B'X'y - n avg(y)^2
p3.ss.r=(t(p3.coeffs)%*%t(p3.X)%*%p3.Y-p3.n*mean(p3.Y)^2) # calculate SSR
p3.ms.r=p3.ss.r/p3.df.r # calculate MSR
# SST = sum((y[i] - avg(y))^2)
#     = y'y - n*avg(y)^2
p3.ss.t=t(p3.Y)%*%p3.Y-p3.n*mean(p3.Y)^2 # calculate SSTO
p3.ms.t=p3.ss.t/p3.df.t
# estimate of sigma^2 is mean squared error (MSE)
p3.sigma2_hat = p3.ms.e
print(p3.sigma2_hat)
## [,1]
## [1,] 6.725785
# covariance matrix for coefficient estimators
p3.cov=p3.sigma2_hat[1]*p3.C
print(p3.C)
## [,1] [,2] [,3]
## [1,] 3.800481e-01 2.207476e-05 -0.1094214557
## [2,] 2.207476e-05 1.212285e-05 -0.0005595912
## [3,] -1.094215e-01 -5.595912e-04 0.0595249024
p3.r.sq=(t(p3.coeffs)%*%t(p3.X)%*%p3.Y-p3.n*mean(p3.Y)^2)/p3.ss.t # calculate R-square
print(p3.r.sq)
## [,1]
## [1,] 0.8267855
p3.F.ratio=p3.ms.r/p3.ms.e #F test for overall model significance
p3.pval=1-pf(p3.F.ratio,2,p3.n-3) #p-value from the F test
p3.anova <- data.frame(
type = c("regression","residual","total"),
ss = c(p3.ss.r,p3.ss.e,p3.ss.t),
dof=c(p3.df.r,p3.df.e,p3.df.t),
ms = c(p3.ms.r,p3.ms.e,p3.ms.t)
)
knitr::kable(p3.anova, digits=2, caption = "ANOVA table.")
| type       | ss      | dof | ms     |
|------------|---------|-----|--------|
| regression | 931.00  | 2   | 465.50 |
| residual   | 195.05  | 29  | 6.73   |
| total      | 1126.05 | 31  | 36.32  |

: ANOVA table.
p3.B0.est = p3.coeffs[1]
p3.B0.se = sqrt(p3.ms.e*p3.C[1,1])
p3.B1.est = p3.coeffs[2]
p3.B1.se = sqrt(p3.ms.e*p3.C[2,2])
p3.B2.est = p3.coeffs[3]
p3.B2.se = sqrt(p3.ms.e*p3.C[3,3])
p3.B0.t0 <- p3.B0.est / p3.B0.se
p3.B1.t0 <- p3.B1.est / p3.B1.se
p3.B2.t0 <- p3.B2.est / p3.B2.se
cat("overall F0 =",p3.F.ratio,"pval =",p3.pval)
## overall F0 = 69.21121 pval = 9.109047e-12
cat("B0 =",p3.B0.est,"se(B0) =",p3.B0.se,"t0 =",p3.B0.t0)
## B0 = 37.22727 se(B0) = 1.598788 t0 = 23.28469
cat("B1 =",p3.B1.est,"se(B1) =",p3.B1.se,"t0 =",p3.B1.t0)
## B1 = -0.03177295 se(B1) = 0.00902971 t0 = -3.518712
cat("B2 =",p3.B2.est,"se(B2) =",p3.B2.se,"t0 =",p3.B2.t0)
## B2 = -3.877831 se(B2) = 0.6327335 t0 = -6.128695
##### code for confidence intervals
x0 <- c(1,100,4)
yhat=t(p3.coeffs)%*%x0
print(yhat)
## [,1]
## [1,] 18.53865
se.yhat=sqrt(p3.ms.e*t(x0)%*%p3.C%*%x0) #std.error for the mean response
CI.l=yhat-qt(0.975,p3.n-3)*se.yhat # lower limit of the 95% CI (error df = p3.n - 3 = 29)
CI.u=yhat+qt(0.975,p3.n-3)*se.yhat # upper limit of the 95% CI
print(CI.l)
## [,1]
## [1,] 16.66102
print(CI.u)
## [,1]
## [1,] 20.41629
We collect and summarize the results in the following table:
|  | [\(\hat\beta_j\)]{.math .inline} | [\(\operatorname{SE}(\hat\beta_j)\)]{.math .inline} | [\(t_0\)]{.math .inline} | [\(F_0\)]{.math .inline} |
|---|---|---|---|---|
| [\(\beta_0\)]{.math .inline} | 37.22727 | 1.598788 | 23.28469 | 69.21121 |
| [\(\beta_1\)]{.math .inline} | -0.03177295 | 0.00902971 | -3.518712 |  |
| [\(\beta_2\)]{.math .inline} | -3.877831 | 0.6327335 | -6.128695 |  |

: Coefficient statistics