Computes the likelihood ratio test (LRT) statistic and p-value for comparing nested models.

lrt(null_loglik, alt_loglik, dof = NULL)

Arguments

null_loglik

The log-likelihood under the null (simpler) model. Either a numeric scalar or a logLik object (as returned by stats::logLik()).

alt_loglik

The log-likelihood under the alternative (more complex) model. Either a numeric scalar or a logLik object.

dof

Positive integer. Degrees of freedom, typically the difference in the number of free parameters between models. When both null_loglik and alt_loglik are logLik objects, dof is computed automatically from their df attributes and may be omitted.

Value

A hypothesis_test object of subclass likelihood_ratio_test containing:

stat

The LRT statistic \(\Lambda = -2(\ell_0 - \ell_1)\)

p.value

P-value from the upper tail of the chi-squared distribution with dof degrees of freedom

dof

The degrees of freedom

null_loglik

The input null model log-likelihood (numeric)

alt_loglik

The input alternative model log-likelihood (numeric)

Details

The likelihood ratio test is a fundamental method for comparing nested statistical models. Given a null model \(M_0\) (simpler, fewer parameters) nested within an alternative model \(M_1\) (more complex), the LRT tests whether the additional complexity of \(M_1\) is justified by the data.

The test statistic is:

$$\Lambda = -2 \left( \ell_0 - \ell_1 \right) = -2 \log \frac{L_0}{L_1}$$

where \(\ell_0\) and \(\ell_1\) are the maximized log-likelihoods under the null and alternative models, respectively.

Under \(H_0\) and regularity conditions, \(\Lambda\) is asymptotically chi-squared distributed with degrees of freedom equal to the difference in the number of free parameters between models.
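The computation can be reproduced by hand with base R; the values here mirror the first example below (log-likelihoods of -150 and -140, three extra parameters):

```r
# Hand computation of the LRT statistic and p-value using base R only
ll0 <- -150  # maximized log-likelihood, null model
ll1 <- -140  # maximized log-likelihood, alternative model
dof <- 3     # difference in number of free parameters

stat <- -2 * (ll0 - ll1)                          # Lambda = -2(l0 - l1)
pval <- pchisq(stat, df = dof, lower.tail = FALSE)  # upper-tail probability
stat
#> [1] 20
pval
#> [1] 0.0001697424
```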

Assumptions

  1. The null model must be nested within the alternative model (i.e., obtainable by constraining parameters of the alternative).

  2. Both likelihoods must be computed from the same dataset.

  3. Standard regularity conditions for asymptotic chi-squared distribution must hold (true parameter not on boundary, etc.).

Relationship to Other Tests

The LRT is one of the "holy trinity" of likelihood-based tests, alongside the Wald test (wald_test()) and the score (Lagrange multiplier) test. All three are asymptotically equivalent under \(H_0\), but the LRT is often preferred because it is invariant to reparameterization.
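For nested glms, the statistic can be cross-checked against base R: the deviance difference reported by stats::anova() equals \(-2(\ell_0 - \ell_1)\), since the saturated-model term cancels. A minimal sketch (the model and data are illustrative only):

```r
# Illustrative logistic regression: null (intercept-only) vs. one predictor
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 * x))
m0 <- glm(y ~ 1, family = binomial)
m1 <- glm(y ~ x, family = binomial)

# LRT statistic from the log-likelihoods...
stat <- as.numeric(-2 * (logLik(m0) - logLik(m1)))

# ...equals the deviance difference in the analysis-of-deviance table
all.equal(stat, anova(m0, m1, test = "LRT")$Deviance[2])
#> [1] TRUE
```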

See also

wald_test() for testing individual parameters, stats::logLik() for extracting log-likelihoods from fitted models

Examples

# Comparing nested regression models with raw log-likelihoods
# Null model: y ~ x1 (log-likelihood = -150)
# Alt model:  y ~ x1 + x2 + x3 (log-likelihood = -140)
# Difference: 3 additional parameters

test <- lrt(null_loglik = -150, alt_loglik = -140, dof = 3)
test
#> Hypothesis test (likelihood_ratio_test)
#> -----------------------------
#> Test statistic: 20
#> P-value: 0.000169742435552826
#> Degrees of freedom: 3
#> Significant at 5% level: TRUE

# Is the more complex model significantly better?
is_significant_at(test, 0.05)
#> [1] TRUE

# Extract the test statistic (should be 20)
test_stat(test)
#> [1] 20

# Using logLik objects (dof computed automatically)
ll_null <- structure(-150, df = 2, nobs = 100, class = "logLik")
ll_alt  <- structure(-140, df = 5, nobs = 100, class = "logLik")
lrt(ll_null, ll_alt)  # dof = 5 - 2 = 3
#> Hypothesis test (likelihood_ratio_test)
#> -----------------------------
#> Test statistic: 20
#> P-value: 0.000169742435552826
#> Degrees of freedom: 3
#> Significant at 5% level: TRUE

# With real models (any model supporting stats::logLik)
set.seed(42)
x <- 1:50
y <- 2 + 3 * x + rnorm(50)
m0 <- lm(y ~ 1)
m1 <- lm(y ~ x)
lrt(logLik(m0), logLik(m1))
#> Hypothesis test (likelihood_ratio_test)
#> -----------------------------
#> Test statistic: 365.2825932939
#> P-value: 1.99222035561072e-81
#> Degrees of freedom: 1
#> Significant at 5% level: TRUE