BTK (Bookmark Toolkit) is a bookmark manager that treats bookmarks as structured data rather than flat lists. It stores bookmarks in SQLite, adds NLP-powered auto-tagging, and brings database-level querying to personal bookmark management.
Why I Built This
Browser bookmark managers are terrible:
- Flat: No rich metadata, limited organization
- Siloed: Each browser has its own format
- Ephemeral: Pages go offline, links break
Your bookmarks represent years of intellectual curation. Articles that shaped your thinking, tutorials that taught you skills, references you return to over and over. They deserve better than ephemeral browser sync.
BTK is part of the Long Echo toolkit: tools for preserving your digital intellectual life in formats you control.
Quick Start
pip install bookmark-tk
# Start the interactive shell (recommended)
btk shell
# Or use direct CLI commands
btk bookmark add https://example.com --title "Example" --tags tutorial,web
btk bookmark list
btk bookmark search "python"
# Import and export
btk import html bookmarks.html
btk export bookmarks.html html --hierarchical
Interactive Shell
BTK includes a shell with a virtual filesystem interface:
$ btk shell
btk:/$ ls
bookmarks tags starred archived recent domains
btk:/$ cd tags
btk:/tags$ ls
programming/ research/ tutorial/ web/
btk:/tags$ cd programming/python
btk:/tags/programming/python$ ls
3298 4095 5124 5789 (bookmark IDs with this tag)
btk:/tags/programming/python$ cat 4095/title
Advanced Python Techniques
btk:/tags/programming/python$ star 4095
★ Starred bookmark #4095
btk:/tags/programming/python$ cd /bookmarks/4095
btk:/bookmarks/4095$ tag data-science machine-learning
✓ Added tags to bookmark #4095
Shell features:
- Virtual filesystem – Navigate bookmarks like files and directories
- Hierarchical tags – Tags like programming/python/django create navigable folders
- Context-aware commands – Commands adapt based on your current location
- Unix-like interface – Familiar cd, ls, pwd, mv, cp commands
I wanted something that feels like navigating a filesystem because that is the mental model I already have for hierarchical data. If you live in the terminal, this should feel natural.
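To make the filesystem analogy concrete, here is a minimal sketch of how a listing under /tags could be derived from hierarchical tags. It is not BTK's implementation: the `BOOKMARKS` mapping and the `list_tag_dir` function are hypothetical names invented for illustration, and the data lives in memory rather than in SQLite.

```python
from typing import Dict, List, Set

# Hypothetical in-memory data: bookmark ID -> set of hierarchical tags.
BOOKMARKS: Dict[int, Set[str]] = {
    3298: {"programming/python"},
    4095: {"programming/python", "tutorial"},
    5124: {"programming/python/django"},
    5789: {"research"},
}


def list_tag_dir(path: str, bookmarks: Dict[int, Set[str]]) -> List[str]:
    """List a virtual /tags directory: child tag folders first, then bookmark IDs."""
    # Drop the "/tags" mount point so "/tags/programming/python" becomes
    # the tag prefix "programming/python" ("" for the /tags root).
    prefix = path[len("/tags"):] if path.startswith("/tags") else path
    prefix = prefix.strip("/")

    children: Set[str] = set()
    ids: List[int] = []
    for bookmark_id, tags in bookmarks.items():
        for tag in tags:
            if tag == prefix:
                # The bookmark carries exactly this tag: show it as a "file".
                ids.append(bookmark_id)
            elif prefix == "" or tag.startswith(prefix + "/"):
                # The tag is nested deeper: show its next segment as a "folder".
                rest = tag[len(prefix):].strip("/")
                children.add(rest.split("/")[0] + "/")
    return sorted(children) + [str(i) for i in sorted(ids)]


print(list_tag_dir("/tags", BOOKMARKS))
print(list_tag_dir("/tags/programming/python", BOOKMARKS))
```

The point of the sketch is that no directory tree has to be stored: folders fall out of the tag strings themselves, which is why adding a tag like programming/python/django is enough to make a new folder appear.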
Hierarchical Tags
Organize with nested tags:
# Add with hierarchical tags
btk bookmark add https://docs.python.org --tags programming/python/docs
btk bookmark add https://flask.palletsprojects.com --tags programming/python/web
# Query at any level
btk tag filter programming # All programming bookmarks
btk tag filter programming/python # Python subset
# Tag management
btk tag list
btk tag tree # Show hierarchy
btk tag rename old-tag new-tag
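Querying "at any level" amounts to prefix matching on tag path segments. The sketch below shows one way to express that rule; `tag_matches` and `filter_by_tag` are hypothetical helpers, not BTK functions, and the third bookmark is made up to show a near miss.

```python
from typing import Dict, List, Set


def tag_matches(tag: str, query: str) -> bool:
    """True if `tag` equals `query` or is nested somewhere below it."""
    # Matching on "query/" avoids false hits such as
    # "programming-humor" matching the query "programming".
    return tag == query or tag.startswith(query + "/")


def filter_by_tag(bookmarks: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the URLs whose tags fall under `query`, sorted for stable output."""
    return sorted(
        url
        for url, tags in bookmarks.items()
        if any(tag_matches(tag, query) for tag in tags)
    )


# Hypothetical data: URL -> hierarchical tags.
bookmarks = {
    "https://docs.python.org": {"programming/python/docs"},
    "https://flask.palletsprojects.com": {"programming/python/web"},
    "https://example.com": {"programming-humor"},
}

print(filter_by_tag(bookmarks, "programming"))
print(filter_by_tag(bookmarks, "programming/python/web"))
```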
Auto-Tagging with NLP
BTK automatically suggests tags based on content:
# Preview suggested tags for a bookmark
btk content auto-tag --id 42
# Apply suggested tags
btk content auto-tag --id 42 --apply
# Bulk auto-tag with parallel workers
btk content auto-tag --all --workers 100
The auto-tagger looks at the cached page content and suggests tags from your existing taxonomy. It is not perfect, but it handles the 80% case of “this page is clearly about Python web frameworks” well enough to save significant manual effort.
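As a rough illustration of "suggest tags from your existing taxonomy", here is a deliberately simple keyword-overlap scorer. This is a stand-in technique, not the NLP method BTK uses; `suggest_tags`, its `min_hits` parameter, and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import List


def suggest_tags(text: str, taxonomy: List[str], min_hits: int = 2) -> List[str]:
    """Rank existing tags by how often their path segments appear in the text."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    scored = []
    for tag in taxonomy:
        # "programming/python/web" contributes the terms programming, python, web.
        terms = re.findall(r"[a-z0-9]+", tag.lower())
        hits = sum(words[term] for term in terms)
        if hits >= min_hits:
            scored.append((hits, tag))
    # Highest score first; ties are broken alphabetically for stable output.
    return [tag for hits, tag in sorted(scored, key=lambda s: (-s[0], s[1]))]


page = "Flask is a Python web framework. This Python tutorial builds a web app."
taxonomy = ["programming/python/web", "research/ml", "tutorial"]

print(suggest_tags(page, taxonomy))
print(suggest_tags(page, taxonomy, min_hits=1))
```

Even a scorer this naive only ever proposes tags that already exist, which keeps the taxonomy from sprawling; a real tagger would add stemming, stop-word handling, or a trained model on top.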
Content Caching
Store page content for offline access and full-text search:
# Content is cached automatically when adding bookmarks
btk bookmark add https://example.com
# Manually refresh content
btk content refresh --id 42 # Specific bookmark
btk content refresh --all # All bookmarks
btk content refresh --all --workers 50 # Parallel refresh
# View cached content
btk content view 42 # View markdown in terminal
btk content view 42 --html # Open HTML in browser
# Search cached content
btk bookmark search "specific phrase" --in-content
This is the feature that motivated the whole project. I was tired of bookmarking pages that disappeared six months later. BTK caches the content locally so your bookmarks survive link rot.
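The idea of a local content cache can be sketched with nothing but the standard library: compress the page text with zlib and keep it in a SQLite table keyed by bookmark ID. The table name `content_cache`, its columns, and the function names below are hypothetical and do not reflect BTK's actual schema; the search is a naive scan rather than a real full-text index.

```python
import sqlite3
import zlib
from typing import List, Optional


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a cache database with one row per bookmark."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS content_cache ("
        "bookmark_id INTEGER PRIMARY KEY, body BLOB NOT NULL)"
    )
    return conn


def store_content(conn: sqlite3.Connection, bookmark_id: int, text: str) -> None:
    """Compress the page text and insert or overwrite the cached copy."""
    blob = zlib.compress(text.encode("utf-8"))
    conn.execute(
        "INSERT OR REPLACE INTO content_cache (bookmark_id, body) VALUES (?, ?)",
        (bookmark_id, blob),
    )
    conn.commit()


def load_content(conn: sqlite3.Connection, bookmark_id: int) -> Optional[str]:
    """Return the cached text, or None if nothing is cached for this ID."""
    row = conn.execute(
        "SELECT body FROM content_cache WHERE bookmark_id = ?", (bookmark_id,)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0]).decode("utf-8")


def search_content(conn: sqlite3.Connection, phrase: str) -> List[int]:
    """Naive case-insensitive search: decompress every row and look for the phrase."""
    needle = phrase.lower()
    rows = conn.execute(
        "SELECT bookmark_id, body FROM content_cache ORDER BY bookmark_id"
    )
    return [
        bookmark_id
        for bookmark_id, blob in rows
        if needle in zlib.decompress(blob).decode("utf-8").lower()
    ]


conn = open_cache()
store_content(conn, 42, "A specific phrase that outlives the original page.")
store_content(conn, 43, "Unrelated notes about neural networks.")
print(search_content(conn, "Specific Phrase"))
```

Because the text is stored compressed, SQLite cannot index it directly; a production design would likely keep a separate full-text index alongside the compressed blob.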
PDF Support
Extract and index text from PDF bookmarks:
# Add PDF bookmark (auto-extracts text)
btk bookmark add https://arxiv.org/pdf/2301.00001.pdf --tags research,ml
# Search within PDF text
btk bookmark search "neural network" --in-content
# View extracted text
btk content view 42
Browser Integration
Import bookmarks from browsers:
# Import bookmarks
btk import chrome
btk import firefox --profile default
btk import html bookmarks.html
btk import json bookmarks.json
btk import csv bookmarks.csv
Database Operations
# Use specific database
btk --db ~/bookmarks.db bookmark list
# Set default database
btk config set database.path ~/bookmarks.db
# Database management
btk db info # Show statistics
btk db vacuum # Optimize database
# Deduplication
btk db dedupe --strategy merge # Merge duplicate metadata
btk db dedupe --strategy keep_first # Keep oldest
btk db dedupe --preview # Preview changes
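The difference between the two strategies is easier to see in code. The sketch below is an assumption about what "merge" and "keep_first" might mean, not BTK's implementation: the `Bookmark` dataclass, the `added` timestamp field, and the trailing-slash URL normalization are all hypothetical and simplified.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Bookmark:
    id: int
    url: str
    title: str = ""
    tags: Set[str] = field(default_factory=set)
    added: int = 0  # hypothetical timestamp; smaller means older


def normalize(url: str) -> str:
    """Very rough duplicate key: ignore a trailing slash only."""
    return url.rstrip("/")


def dedupe(bookmarks: List[Bookmark], strategy: str = "merge") -> List[Bookmark]:
    """Collapse bookmarks sharing a normalized URL; inputs are not mutated."""
    if strategy not in ("merge", "keep_first"):
        raise ValueError(f"unknown strategy: {strategy}")

    groups: Dict[str, List[Bookmark]] = {}
    for bookmark in bookmarks:
        groups.setdefault(normalize(bookmark.url), []).append(bookmark)

    result: List[Bookmark] = []
    for group in groups.values():
        dupes = sorted(group, key=lambda b: b.added)
        oldest = dupes[0]
        # Both strategies keep the oldest record as the survivor.
        keeper = Bookmark(oldest.id, oldest.url, oldest.title, set(oldest.tags), oldest.added)
        if strategy == "merge":
            # Merge folds metadata from the newer duplicates into the survivor.
            for other in dupes[1:]:
                keeper.tags |= other.tags
                keeper.title = keeper.title or other.title
        result.append(keeper)
    return result


items = [
    Bookmark(1, "https://example.com/", "", {"web"}, added=20),
    Bookmark(2, "https://example.com", "Example", {"tutorial"}, added=10),
    Bookmark(3, "https://docs.python.org", "Docs", {"python"}, added=5),
]
for b in dedupe(items, "merge"):
    print(b.id, b.url, sorted(b.tags))
```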
Export Formats
# Export to various formats
btk export output.html html --hierarchical # HTML with folder structure
btk export output.json json # JSON format
btk export output.csv csv # CSV format
btk export output.md markdown # Markdown with sections
Plugin System
Extend BTK with custom plugins:
from btk.plugins import Plugin, PluginMetadata, PluginPriority


class MyPlugin(Plugin):
    def get_metadata(self) -> PluginMetadata:
        return PluginMetadata(
            name="my-plugin",
            version="1.0.0",
            description="Custom functionality",
            priority=PluginPriority.NORMAL
        )

    def on_bookmark_added(self, bookmark):
        # Custom logic when bookmark is added
        pass
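For readers wondering what a priority field buys you, here is a self-contained sketch of hook dispatch ordered by priority. It deliberately does not import btk: `SketchPriority`, `SketchPlugin`, `SketchRegistry`, and both example plugins are hypothetical stand-ins, and the numeric priority values and dict-shaped bookmark are assumptions rather than BTK's real types.

```python
from enum import IntEnum
from typing import Any, Dict, List
from urllib.parse import urlparse


class SketchPriority(IntEnum):
    # Hypothetical values: lower numbers run first.
    HIGH = 0
    NORMAL = 50
    LOW = 100


class SketchPlugin:
    priority = SketchPriority.NORMAL

    def on_bookmark_added(self, bookmark: Dict[str, Any]) -> None:
        pass


class SketchRegistry:
    def __init__(self) -> None:
        self._plugins: List[SketchPlugin] = []

    def register(self, plugin: SketchPlugin) -> None:
        self._plugins.append(plugin)
        # list.sort is stable, so equal priorities keep registration order.
        self._plugins.sort(key=lambda p: p.priority)

    def bookmark_added(self, bookmark: Dict[str, Any]) -> None:
        for plugin in self._plugins:
            plugin.on_bookmark_added(bookmark)


class DomainTagger(SketchPlugin):
    """Adds a domain/<host> tag before lower-priority plugins run."""
    priority = SketchPriority.HIGH

    def on_bookmark_added(self, bookmark: Dict[str, Any]) -> None:
        host = urlparse(bookmark["url"]).netloc
        if host:
            bookmark.setdefault("tags", set()).add("domain/" + host)


class AuditLog(SketchPlugin):
    """Records what each bookmark looked like after earlier plugins ran."""
    priority = SketchPriority.LOW

    def __init__(self) -> None:
        self.seen: List[tuple] = []

    def on_bookmark_added(self, bookmark: Dict[str, Any]) -> None:
        self.seen.append((bookmark["url"], sorted(bookmark.get("tags", ()))))


registry = SketchRegistry()
audit = AuditLog()
registry.register(audit)           # registered first, but runs last
registry.register(DomainTagger())
registry.bookmark_added({"url": "https://docs.python.org/3/", "tags": {"python"}})
print(audit.seen)
```

Ordering matters whenever one plugin's output is another's input; here the audit log sees the domain tag only because the tagger is guaranteed to run first.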
Architecture
- Database: SQLAlchemy ORM with SQLite backend
- Testing: 515 tests, >80% coverage on core modules
- Content: HTML/Markdown conversion, zlib compression, PDF extraction
Installation
pip install bookmark-tk