LLM-Priors: Intelligent Initialization for Classical Algorithms

A research project demonstrating how Large Language Models (LLMs) can serve as intelligent priors for classical algorithms, with a focus on Bayesian network structure learning, where we achieve a 49.8% average improvement in F1 score over traditional approaches.

Key Results

  • 49.8% average improvement in F1 score using LLM priors with Hill Climbing
  • Optimal configuration: 40% LLM prior weight, 60% data-driven learning
  • Model scaling hypothesis validated: Network complexity sweet spot scales with model capability

Project Structure

llm-priors/
├── llm_bayes/              # Core implementation
│   ├── hybrid_system.py    # Main hybrid learning system
│   ├── llm_interface.py    # Ollama LLM interface
│   └── benchmarks.py       # Benchmark networks
├── experiments/            # Experiment scripts
├── tests/                  # Test files
├── papers/                 # Academic paper and PDFs
├── docs/                   # Documentation
├── logs/                   # Experiment logs
└── outputs/                # Results and visualizations

Installation

pip install -r requirements.txt
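
If you run the LLM through Ollama (as llm_interface.py and the quick start below assume), you will also need a reachable Ollama server with the chosen model already pulled, for example:

ollama pull qwen2.5:32b-instruct-q4_K_M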

Quick Start

from llm_bayes import HybridBayesianNetwork

# Create hybrid system
model = HybridBayesianNetwork(
    llm_host="192.168.0.225",  # Ollama server
    llm_model="qwen2.5:32b-instruct-q4_K_M"
)

# Learn structure from data
structure = model.learn_structure(data, variable_descriptions)
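
The snippet above leaves data and variable_descriptions undefined. As a rough illustration only (the exact types the library expects may differ), data could be a tabular dataset of observations and variable_descriptions a mapping from variable names to short natural-language descriptions that the LLM uses when proposing edges:

import pandas as pd

# Hypothetical inputs for the quick-start snippet; real column names and
# descriptions come from your own dataset.
data = pd.DataFrame({
    "smoking":     [0, 1, 1, 0, 1],
    "lung_cancer": [0, 1, 0, 0, 1],
    "cough":       [0, 1, 1, 0, 1],
})

variable_descriptions = {
    "smoking":     "Whether the patient smokes",
    "lung_cancer": "Whether the patient has been diagnosed with lung cancer",
    "cough":       "Whether the patient has a persistent cough",
}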

Key Findings

1. LLMs as Intelligent Priors

LLMs can propose sensible initial network structures based on semantic understanding of variable relationships, providing warm starts for hill climbing algorithms.
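
As an illustration of this warm-start idea (a sketch using pgmpy, a common score-based structure-learning library; the project's own hybrid_system.py may work differently), an LLM-proposed edge list can seed the search as the starting DAG:

from pgmpy.base import DAG
from pgmpy.estimators import BicScore, HillClimbSearch

# Edges the LLM proposed from the variable descriptions (hypothetical output)
llm_edges = [("smoking", "lung_cancer"), ("lung_cancer", "cough")]

# The warm-start DAG must contain every variable in the dataset
start = DAG()
start.add_nodes_from(data.columns)   # data: the DataFrame from the example above
start.add_edges_from(llm_edges)

# Hill climbing then refines the LLM-proposed structure against the data
search = HillClimbSearch(data)
best_model = search.estimate(scoring_method=BicScore(data), start_dag=start)
print(best_model.edges())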

2. Model Capability Scaling

The optimal network complexity (9-11 nodes for 32B models) scales with model size:

  • Small models (1-4B): 5-8 nodes optimal
  • Medium models (8-32B): 9-11 nodes optimal
  • Large models (70B+): 12-15+ nodes optimal

3. Hybrid Approach Benefits

Combining LLM priors with data-driven learning achieves better results than either approach alone, and is particularly effective for small-to-medium datasets.
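
One simple way to realize this combination (a sketch under assumed scoring functions, not necessarily how hybrid_system.py implements it) is a convex blend of a data-driven structure score with an LLM-derived prior score, using the 40/60 weighting reported under Key Results:

def llm_prior_score(graph, llm_edges, bonus=1.0, penalty=1.0):
    # Hypothetical prior: reward edges the LLM proposed, penalize edges it did not
    proposed = set(llm_edges)
    return sum(bonus if edge in proposed else -penalty for edge in graph.edges())

def hybrid_score(graph, data_score, prior_score, prior_weight=0.4):
    # prior_weight=0.4 matches the optimal configuration reported above;
    # data_score and prior_score are callables returning scores on comparable scales
    return (1.0 - prior_weight) * data_score(graph) + prior_weight * prior_score(graph)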

Citation

@article{llmpriors2024,
  title={LLMs as Intelligent Priors: Enhancing Classical Algorithms Through Learned Initialization},
  author={...},
  year={2024}
}

License

MIT License - See LICENSE file for details
