Likelihood Named Distributions Model: likelihood_name_model

Introduction

The likelihood.model package introduces a flexible framework for working with likelihood models in R. Here, we demonstrate a model based on the naming conventions of distribution functions in R, making it easy to construct likelihood models for a wide range of distributions and handle different types of censoring. This vignette provides a brief guide to using the likelihood_name_model within the likelihood.model package.

These likelihood models can also serve as individual contributions within a larger likelihood contribution model.

Conceptual Foundation

At the heart of many statistical models lies the concept of likelihood, a fundamental tool in estimating parameters and testing hypotheses. The likelihood.model package builds on this foundation, providing a way to generate likelihood models based on named distributions (e.g., normal, exponential) and to handle observations that are exact, left-censored, right-censored, or interval-censored.
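To make the idea concrete, here is a minimal base R sketch of how a per-observation log-likelihood contribution could be derived from a distribution's name using R's d/p prefix convention (dnorm/pnorm, dexp/pexp, and so on). The helper loglik_contribution is hypothetical and is not the package's implementation; the "left" label is an assumption, and interval censoring is omitted because it needs two endpoints.

```r
# Hypothetical helper (not part of likelihood.model): log-likelihood
# contribution of a single observation from a named distribution.
loglik_contribution <- function(dist, x, censoring, ...) {
  d <- get(paste0("d", dist), mode = "function")  # density, e.g. dnorm
  p <- get(paste0("p", dist), mode = "function")  # CDF, e.g. pnorm
  switch(censoring,
    exact = d(x, ..., log = TRUE),                        # log f(x)
    left  = p(x, ..., log.p = TRUE),                      # log F(x)
    right = p(x, ..., lower.tail = FALSE, log.p = TRUE),  # log (1 - F(x))
    stop("unknown censoring type: ", censoring)
  )
}

loglik_contribution("norm", 0.3, "exact", mean = 0, sd = 1)
loglik_contribution("norm", 0.3, "right", mean = 0, sd = 1)
loglik_contribution("exp", 2, "left", rate = 0.5)
```

The log-likelihood of a whole dataset is then the sum of these contributions over its rows.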

Getting Started

Installation

To begin, install the likelihood.model package from GitHub:

# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
}
devtools::install_github("queelius/likelihood.model")

Loading the Package

Load the package along with other necessary libraries:

library(likelihood.model)
library(tidyverse)  # For data manipulation and visualization
library(MASS)       # For additional statistical tools

Creating a Likelihood Model

Start by creating a simple likelihood model for a normal distribution:

model <- likelihood_name("norm", "x", "censoring")
print(model)
summary(model)
print(assumptions(model))

This model is based on the normal distribution, with x as the observation column and censoring as the censoring column that indicates the type of censoring for the corresponding observation.
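As an illustration of that layout, a small data frame might look like the following. The values are made up; the labels "exact" and "right" are the ones used in the simulation later in this vignette, where a right-censored row stores the censoring threshold in x rather than the unobserved value.

```r
# Illustrative values only: two exact observations and two observations
# right-censored at a threshold of 0.
toy <- data.frame(
  x = c(-0.42, 0.00, -1.10, 0.00),
  censoring = c("exact", "right", "exact", "right")
)
print(toy)
table(toy$censoring)  # number of rows per censoring type
```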

Detailed Example

Let’s simulate some data to apply our model:

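generate_data <- function(n, right_censoring_quantile) {
  # Draw n standard normal observations
  x <- rnorm(n, mean = 0, sd = 1)
  # Censoring threshold: the requested quantile of the standard normal
  q <- qnorm(right_censoring_quantile)
  # Observations above the threshold are recorded as right-censored at q
  censored <- x > q
  data.frame(
    x = ifelse(censored, q, x),
    censoring = ifelse(censored, "right", "exact"),
    stringsAsFactors = FALSE
  )
}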

df <- generate_data(n = 100, right_censoring_quantile = .5)

Now we have a dataset df with \(n = 100\) observations, of which about 50% (right_censoring_quantile = 0.5) are expected to be right-censored. Next, compute the log-likelihood, the score, and the Hessian of the log-likelihood for this dataset at given parameter values:

ll <- loglik(model)
print(ll(df, c(mean = 0, sd = 1)))
s <- score(model)
print(s(df, c(mean = 0, sd = 1)))
H <- hess_loglik(model)
print(H(df, c(mean = 0, sd = 1)))
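For intuition about these three quantities, the following base R sketch writes the right-censored normal log-likelihood by hand and approximates its score and Hessian numerically. It is an independent illustration on its own simulated data, not the package's implementation; loglik_manual and score_numeric are hypothetical names.

```r
set.seed(1)
# Simulated data: standard normal values, right-censored at 0
x <- rnorm(100)
cens <- x > 0
x[cens] <- 0

# Log-likelihood of the right-censored normal sample; par = c(mean, sd)
loglik_manual <- function(par) {
  sum(dnorm(x[!cens], par[1], par[2], log = TRUE)) +
    sum(pnorm(x[cens], par[1], par[2], lower.tail = FALSE, log.p = TRUE))
}

# Central finite-difference approximation to the score (gradient)
score_numeric <- function(f, par, h = 1e-5) {
  vapply(seq_along(par), function(i) {
    step <- replace(numeric(length(par)), i, h)
    (f(par + step) - f(par - step)) / (2 * h)
  }, numeric(1))
}

theta <- c(mean = 0, sd = 1)
loglik_manual(theta)
score_numeric(loglik_manual, theta)
optimHess(theta, loglik_manual)  # numerical Hessian from the stats package
```

The score is the gradient of the log-likelihood with respect to the parameters, and the Hessian is its matrix of second derivatives.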

Customizing the Optimization Algorithm

The fit function computes the maximum likelihood estimate (MLE). A basic call supplies only the model, the data, and starting values for the parameters:

mle <- fit(model, df, par = c(mean = 0, sd = 1))
print(mle)

There are a number of control parameters that can be passed to fit to customize the optimization algorithm. For example, you can specify the maximum number of iterations, the convergence tolerance, the optimization method, and any box constraints on the parameters.
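The exact arguments that fit accepts are not shown here, so the sketch below illustrates the same kinds of settings with base R's optim directly, under the assumption that the options are of a similar nature. It fits a right-censored normal sample on its own simulated data; negloglik is a hypothetical helper.

```r
set.seed(42)
x <- rnorm(200)
cens <- x > 0      # right-censor everything above the median of N(0, 1)
x[cens] <- 0

# Negative log-likelihood, since optim minimizes by default; par = c(mean, sd)
negloglik <- function(par) {
  -sum(dnorm(x[!cens], par[1], par[2], log = TRUE)) -
    sum(pnorm(x[cens], par[1], par[2], lower.tail = FALSE, log.p = TRUE))
}

res <- optim(
  par = c(mean = 0.5, sd = 2),   # starting values
  fn = negloglik,
  method = "L-BFGS-B",           # a method that supports box constraints
  lower = c(-Inf, 1e-3),         # keep sd strictly positive
  upper = c(Inf, Inf),
  control = list(maxit = 200,    # maximum number of iterations
                 factr = 1e7)    # convergence tolerance for L-BFGS-B
)
res$par
res$convergence  # 0 indicates successful convergence
```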

We can show the confidence intervals for the parameters:

confint(mle)
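One common way to construct such intervals is the Wald interval, which uses the inverse of the observed information as the variance of the estimate. The sketch below works through it in base R on an uncensored normal sample, chosen for brevity because its MLE has a closed form. Whether confint uses exactly this construction for these models is an assumption.

```r
set.seed(7)
x <- rnorm(200, mean = 0, sd = 1)  # uncensored sample, for brevity

# Negative log-likelihood; par = c(mean, sd)
negloglik <- function(par) -sum(dnorm(x, par[1], par[2], log = TRUE))

# Closed-form MLE for an uncensored normal sample
mle_hat <- c(mean = mean(x), sd = sqrt(mean((x - mean(x))^2)))

# Observed information: Hessian of the negative log-likelihood at the MLE
info <- optimHess(mle_hat, negloglik)
se <- sqrt(diag(solve(info)))      # standard errors

z <- qnorm(0.975)                  # critical value for a 95% interval
ci <- cbind(lower = mle_hat - z * se, upper = mle_hat + z * se)
ci
```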

Many other tasks are possible; see the algebraic.mle package in addition to the functions available in this package. Since the MLE is itself a distribution, you may also use the algebraic.dist package to do other tasks.

Hypothesis Testing

The likelihood.model package also provides tools for hypothesis testing. For example, you can perform a likelihood ratio test to compare two models using the lrt function, which returns an object whose class inherits from hypothesis_test.
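The arithmetic behind a likelihood ratio test can be sketched in base R without relying on the lrt function, whose signature is not shown here. This example, on its own simulated uncensored data, compares a normal model with the mean fixed at 0 against one with a free mean; max_loglik is a hypothetical helper.

```r
set.seed(3)
x <- rnorm(50, mean = 0.3, sd = 1)

# Maximized normal log-likelihood for a given mean; the MLE of sd
# given the mean is available in closed form.
max_loglik <- function(mu) {
  sigma_hat <- sqrt(mean((x - mu)^2))
  sum(dnorm(x, mu, sigma_hat, log = TRUE))
}

ll_null <- max_loglik(0)        # null model: mean fixed at 0
ll_alt  <- max_loglik(mean(x))  # alternative: mean estimated from the data

stat <- -2 * (ll_null - ll_alt)                      # likelihood ratio statistic
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)  # one extra free parameter
c(statistic = stat, p.value = p_value)
```

Under the null model the statistic is approximately chi-squared distributed, with degrees of freedom equal to the difference in the number of free parameters.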