active library

longshade

Generate a conversable persona from personal data: conversations, writings, emails, bookmarks, photos, reading notes

Started: 2026. Language: Python.

longshade: Conversable Persona Generation

Status: Specification Only — No Implementation Yet


What is longshade?

longshade generates a conversable persona from personal data. Given conversations and writings, it produces the artifacts needed to set up an LLM that speaks in your voice: a system prompt, a retrieval index, and example exchanges.

This is the “ghost” — your digital echo that can answer questions, share perspectives, and represent your thinking after you’re gone.

“The ghost is not you. But it echoes you.”


Quick Start (Planned)

# Generate persona from input data
longshade generate ./input/ --output ./persona/

# Test the persona interactively
longshade chat ./persona/

# Analyze inputs without generating
longshade analyze ./input/

Input Formats

conversations/*.jsonl

Conversational data — your voice in dialogue.

{"role": "user", "content": "What do you think about...", "timestamp": "2024-01-15T10:30:00Z", "source": "ctk"}
{"role": "assistant", "content": "I think...", "timestamp": "2024-01-15T10:31:00Z", "source": "ctk"}

Required fields:

  • role: “user” (your messages) or “assistant” (AI responses for context)
  • content: Message text

Optional fields:

  • timestamp: ISO 8601 datetime
  • source: Where this came from (for attribution)
  • conversation_id: Group related messages
  • topic: Subject/theme

Note: Your messages (role: "user") are the primary signal for voice. AI responses (role: "assistant") provide context but are not treated as part of the persona.
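
Since longshade is specification-only, the following is a minimal Python sketch of how a reader for this format might validate records, not longshade's implementation. The function names `parse_conversation_lines` and `voice_messages` are hypothetical; only the field names and role values come from the format described above.

```python
import json
from typing import Iterable, Iterator, List

REQUIRED_FIELDS = ("role", "content")
VALID_ROLES = {"user", "assistant"}


def parse_conversation_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield validated message records from JSONL lines, skipping blank lines."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {lineno}: expected a JSON object")
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"line {lineno}: missing required field(s): {missing}")
        if record["role"] not in VALID_ROLES:
            raise ValueError(f"line {lineno}: unexpected role {record['role']!r}")
        yield record


def voice_messages(records: Iterable[dict]) -> List[str]:
    """Keep only role "user" content, the primary signal for voice."""
    return [r["content"] for r in records if r["role"] == "user"]


# The two example lines from the format description above.
sample = [
    '{"role": "user", "content": "What do you think about...", '
    '"timestamp": "2024-01-15T10:30:00Z", "source": "ctk"}',
    '{"role": "assistant", "content": "I think...", '
    '"timestamp": "2024-01-15T10:31:00Z", "source": "ctk"}',
]
records = list(parse_conversation_lines(sample))
print(voice_messages(records))
```

Optional fields such as timestamp or conversation_id pass through untouched, so a later stage can use them when present.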

writings/*.md

Long-form writing — your voice in prose.

---
title: Why I Care About Durability
date: 2024-01-15
tags: [philosophy, archiving]
type: essay
---

When I think about what matters...

Frontmatter (optional but helpful):

  • title: Title of the piece
  • date: When written
  • tags: Topics/themes
  • type: essay, post, note, letter, etc.
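
As an illustration of reading this format, here is a rough sketch that splits optional frontmatter from the body using only the standard library. It is an assumption-laden simplification: it handles flat `key: value` lines and bracketed lists like the example above, not full YAML, and `split_frontmatter` is a hypothetical name rather than part of longshade.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, object], str]:
    """Return (metadata, body). Metadata is empty when there is no frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # frontmatter is optional
    closing = [i for i in range(1, len(lines)) if lines[i].strip() == "---"]
    if not closing:
        return {}, text  # unterminated block: treat everything as body
    end = closing[0]
    meta: Dict[str, object] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key: value line; ignored in this simplified parser
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [philosophy, archiving]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


doc = """---
title: Why I Care About Durability
date: 2024-01-15
tags: [philosophy, archiving]
type: essay
---

When I think about what matters...
"""
meta, body = split_frontmatter(doc)
print(meta["title"], meta["tags"])
```

A real implementation would more likely use a YAML parser; the manual parsing here only keeps the sketch dependency-free.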

Output Format

longshade produces a persona/ directory:

persona/
├── README.md           # How to use this persona
├── system-prompt.txt   # Ready-to-use LLM system prompt
├── rag/                # Embeddings and index for retrieval
│   ├── index.faiss
│   ├── metadata.json
│   └── chunks.jsonl
├── voice-samples.jsonl # Example Q&A pairs
└── fine-tune/          # Optional training data

The system prompt captures voice, values, and style. The RAG index supports semantic search, so responses can be grounded in what you actually wrote. Voice samples demonstrate the intended tone and can be used for few-shot prompting.
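
To make the few-shot use concrete, here is a hedged sketch of turning a persona directory into a chat message list. The schema of voice-samples.jsonl is not given here, so the `question` and `answer` field names are assumptions, as are the helper names `load_persona` and `build_messages`. The role/content message shape mirrors the common chat-API convention rather than any specific provider.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple


def load_persona(persona_dir: str) -> Tuple[str, List[dict]]:
    """Read the system prompt and voice samples from a persona directory."""
    root = Path(persona_dir)
    system_prompt = (root / "system-prompt.txt").read_text(encoding="utf-8")
    samples = []
    with open(root / "voice-samples.jsonl", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                samples.append(json.loads(line))
    return system_prompt, samples


def build_messages(
    system_prompt: str, samples: List[dict], question: str, max_samples: int = 3
) -> List[Dict[str, str]]:
    """Assemble system prompt, few-shot Q&A pairs, and the new question."""
    messages = [{"role": "system", "content": system_prompt.strip()}]
    for sample in samples[:max_samples]:
        # "question" / "answer" are assumed field names, not from the spec.
        messages.append({"role": "user", "content": sample["question"]})
        messages.append({"role": "assistant", "content": sample["answer"]})
    messages.append({"role": "user", "content": question})
    return messages
```

The resulting list could then be passed to whichever LLM client is in use; retrieval from rag/ would add relevant chunks to the prompt before the final question.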


How It Works

Any Source                        longshade                      Output
┌─────────────────┐              ┌─────────────────┐           ┌────────────────┐
│ conversations/  │─────────────→│                 │           │ persona/       │
│   *.jsonl       │              │ Analyze voice   │           │   README.md    │
├─────────────────┤              │ Extract style   │──────────→│   system-prompt│
│ writings/       │─────────────→│ Build RAG index │           │   rag/         │
│   *.md          │              │ Generate prompt │           │   voice-samples│
└─────────────────┘              └─────────────────┘           └────────────────┘
  1. Ingest — Read conversations and writings
  2. Analyze — Extract voice characteristics, values, patterns
  3. Chunk & Embed — Build semantic search index
  4. Generate — Produce system prompt and artifacts
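
The chunking half of step 3 can be sketched as follows. This is one plausible approach (grouping paragraphs up to a size limit), not the method longshade specifies. The record fields `id`, `source`, and `text` are assumed for illustration, and the embedding and FAISS indexing steps are deliberately left out.

```python
from typing import Dict, List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # current chunk is full; start a new one
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def to_chunk_records(source: str, text: str, max_chars: int = 500) -> List[Dict[str, str]]:
    """Wrap chunks in records suitable for writing one JSON object per line."""
    return [
        {"id": f"{source}#{i}", "source": source, "text": chunk}
        for i, chunk in enumerate(chunk_text(text, max_chars))
    ]


records = to_chunk_records("writings/durability.md", "aaaa\n\nbbbb\n\ncccc", max_chars=10)
for record in records:
    print(record["id"], repr(record["text"]))
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which tends to matter when retrieved text is quoted back in a response.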

Standalone Toolkit

longshade is part of the ECHO ecosystem but works independently:

  • longshade defines what it accepts — Input formats are longshade’s specification
  • Any source can provide input — If you can produce JSONL conversations or Markdown writings, longshade accepts them
  • Outputs are self-contained — The persona directory works with any LLM

Compatible data sources:

  • ctk — Conversation export
  • btk — Bookmark annotations
  • Any tool that outputs JSONL or Markdown

Privacy Considerations

longshade processes personal data. Consider:

  • Review inputs before processing
  • Think about what you’re comfortable having in a conversable persona
  • Use filtering options to exclude sensitive content
  • Control who has access to the output

The generated persona can answer questions you never anticipated. Think carefully about what’s included.
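
longshade's filtering options are not described in this document, so the following is only a sketch of the kind of pre-processing you could run on inputs yourself. The helpers `redact_emails` and `filter_records` and the blocked-term approach are hypothetical, and the email pattern is a rough heuristic rather than a complete validator.

```python
import re
from typing import Iterable, List

# Rough heuristic for email-like strings; it will miss unusual addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace email-like substrings with a placeholder."""
    return EMAIL_RE.sub("[redacted email]", text)


def filter_records(records: Iterable[dict], blocked_terms: Iterable[str]) -> List[dict]:
    """Drop records mentioning any blocked term; redact emails in the rest."""
    blocked = [term.lower() for term in blocked_terms]
    kept = []
    for record in records:
        content = record["content"]
        if any(term in content.lower() for term in blocked):
            continue  # exclude the whole message
        kept.append({**record, "content": redact_emails(content)})
    return kept


messages = [
    {"role": "user", "content": "Mail me at a@example.com"},
    {"role": "user", "content": "My diagnosis was unexpected."},
]
print(filter_records(messages, blocked_terms=["diagnosis"]))
```

Automated filters like this catch only what they are told to look for, so they complement a manual review of the inputs rather than replace it.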


Specification

For the complete technical specification, see SPEC.md.


Related Projects

  • longecho — ECHO compliance validator
  • ctk — Conversation toolkit
  • btk — Bookmark toolkit
  • ebk — Ebook toolkit
