
pagevault: Client-Side Encryption for Static Sites

I run a static site on GitHub Pages. No backend, no server, no database. That simplicity is the whole point — until you want to share something semi-private.

Maybe it’s client-confidential notes mixed in with public documentation. Maybe it’s solutions to homework problems that should be behind a password. Maybe it’s personal thoughts in a blog post that you want to share with specific people but not the entire internet.

The usual answer is “add a backend.” But I didn’t want to give up static hosting just because a few paragraphs needed a password.

So I built pagevault.


The Core Idea

pagevault encrypts regions of HTML files — not the whole page, just the parts you mark. Public navigation, styling, and scripts remain untouched. Only the sensitive sections become ciphertext, replaced with a password prompt that decrypts in the browser using the Web Crypto API.

<header>Public navigation stays visible</header>

<pagevault hint="Contact me for the password">
  <h2>Private thoughts on this topic</h2>
  <p>This section is encrypted with AES-256-GCM...</p>
</pagevault>

<footer>Public footer stays visible</footer>

After running pagevault lock, the <pagevault> block becomes an opaque blob of base64. Visitors see a styled password prompt. If they have the password, the content decrypts instantly in their browser. If they don’t, they see nothing — not even the size of what they’re missing (with the --pad flag).

Here’s a real example — a team dashboard where public metrics are visible but financial data is encrypted:

[Screenshot: dashboard with an encrypted region showing a password prompt between public sections]

After entering the password, the protected sections decrypt in-place:

[Screenshot: the same dashboard after decryption, showing Internal Revenue and Infrastructure Costs]

The output is a single self-contained HTML file. No JavaScript CDN, no external CSS, no runtime dependencies. You can email it, upload it to any static host, or open it from a local filesystem. Everything needed for decryption is embedded in the file.


Encryption Details

The crypto is intentionally non-configurable:

  • AES-256-GCM (AEAD — authenticated encryption with associated data)
  • PBKDF2-SHA256 with 310,000 iterations (the figure OWASP recommended in 2021; its 2023 guidance raises this to 600,000)
  • 16-byte random salt, 12-byte random IV per encryption
  • v2 format with content-encryption-key (CEK) wrapping

I chose to fix these parameters rather than expose them as config options. Every pagevault file has the same security properties. No foot-guns, no “I accidentally set iterations to 1.”

The iteration count is stored in the encrypted payload, so future versions can increase it without breaking backward compatibility. Old files decrypt at their original iteration count; new files get the latest.
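
A minimal sketch of how a stored iteration count keeps old files readable, using only Python's standard library. The header layout (a dict with `salt` and `iterations`) and the function names are assumptions for illustration, not pagevault's actual format.

```python
import hashlib
import secrets

CURRENT_ITERATIONS = 310_000  # count applied to newly encrypted files


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte (AES-256 sized) key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def new_kdf_header() -> dict:
    """Hypothetical header stored next to the ciphertext (assumed layout)."""
    return {"salt": secrets.token_bytes(16).hex(), "iterations": CURRENT_ITERATIONS}


def key_for_payload(password: str, header: dict) -> bytes:
    # Read the count from the file itself rather than the current default,
    # so a file written with an older count still derives the same key.
    return derive_key(password, bytes.fromhex(header["salt"]), header["iterations"])


header = new_kdf_header()
key = key_for_payload("correct horse battery staple", header)
```

Raising `CURRENT_ITERATIONS` later only affects headers created afterwards; existing headers carry their own count.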

Why CEK Key-Wrapping?

Early versions derived a key directly from the password and encrypted content with it. This works fine for single-user encryption, but breaks down with multiple users.

If Alice and Bob each need their own password, a naive multi-password scheme encrypts the content once per password, doubling the payload size for two users and making re-encryption expensive whenever a password changes.

The CEK model solves this cleanly. A random content-encryption key encrypts the actual data once. Then the CEK itself is wrapped (encrypted) separately for each user’s password. Adding a new user only wraps one small key blob. Changing a password only re-wraps one key blob. The bulk content stays untouched.

Content  ──encrypt with CEK──>  Ciphertext (one copy)
CEK  ──wrap with Alice's key──>  Key Blob A
CEK  ──wrap with Bob's key────>  Key Blob B

The pagevault sync command re-wraps keys when you add or remove users. For full key rotation (new CEK), there’s pagevault sync --rekey.
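
The wrapping structure can be sketched in standard-library Python. One loud assumption: Python's standard library has no AES, so `seal` and `unseal` below are a stand-in authenticated cipher built from HMAC-SHA256. It is not AES-GCM and not what pagevault uses; it exists only so the data flow runs end to end. All function names and the key-blob layout are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations, dklen=32)


def _subkey(key: bytes, label: bytes) -> bytes:
    # Separate keys for encryption and authentication.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    blocks = [
        hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for AES-GCM (illustration only): nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(12)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext, tag = blob[:12], blob[12:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


def wrap_cek(cek: bytes, password: str, iterations: int = 310_000) -> dict:
    """Encrypt the CEK under a key derived from one user's password."""
    salt = secrets.token_bytes(16)
    blob = seal(derive_key(password, salt, iterations), cek)
    return {"salt": salt, "iterations": iterations, "blob": blob}


def unwrap_cek(entry: dict, password: str) -> bytes:
    key = derive_key(password, entry["salt"], entry["iterations"])
    return unseal(key, entry["blob"])


cek = secrets.token_bytes(32)                          # random content-encryption key
ciphertext = seal(cek, b"<h2>Private thoughts</h2>")   # bulk content, encrypted once
key_blobs = {
    "alice": wrap_cek(cek, "alice-password"),
    "bob": wrap_cek(cek, "bob-password"),
}
# Bob changes his password: only his small key blob is replaced.
key_blobs["bob"] = wrap_cek(cek, "bob-new-password")
```

Note that `ciphertext` is never touched when a key blob changes, which is the property the CEK model is after. Replacing the CEK itself would require re-encrypting the content.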


The Closure Property

This is my favorite design constraint. The output of pagevault lock must be valid input to pagevault lock.

This means you can encrypt a page for Alice, then encrypt different sections of the same page for Bob, and everything just works. Alice sees her content. Bob sees his. The page round-trips cleanly through multiple encryption passes.

Why does this matter? Because it eliminates the need for a first-class “multi-tier access” feature. You don’t need pagevault to understand access levels. You just run it twice with different passwords on different sections. Composability replaces configuration.

In practice, this constraint forces discipline throughout the codebase. The HTML manipulation layer (BeautifulSoup/lxml) must preserve structure perfectly. The injected JavaScript must be idempotent — a page with two decryption runtimes shouldn’t conflict. Attributes must round-trip through lock/unlock cycles without mutation.

Every release runs round-trip tests: lock, then lock again, then unlock twice. If any content is lost or corrupted, the test fails.
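
One piece of that discipline, idempotent runtime injection, can be sketched in a few lines. The marker attribute and function name are hypothetical, and plain string handling stands in for the real HTML layer.

```python
# Hypothetical marker; pagevault's actual runtime tag may look different.
RUNTIME_MARKER = 'id="pagevault-runtime"'
RUNTIME_TAG = f"<script {RUNTIME_MARKER}>/* decryption runtime */</script>"


def inject_runtime(html: str) -> str:
    """Add the runtime script once; a second pass leaves the page unchanged."""
    if RUNTIME_MARKER in html:
        return html  # already injected by an earlier lock pass
    head, closing, tail = html.rpartition("</body>")
    if not closing:
        return html + RUNTIME_TAG  # no </body> found, append at the end
    return head + RUNTIME_TAG + closing + tail


page = "<html><body><p>Hello</p></body></html>"
once = inject_runtime(page)
twice = inject_runtime(once)
assert once == twice  # locking again does not add a second runtime
```

The same check-before-write pattern is what lets the output of one pass serve as input to the next.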


Beyond HTML: Encrypting Arbitrary Files

Once you can encrypt HTML regions, the next question is obvious: what about PDFs? Images? Entire directories?

pagevault handles this by wrapping non-HTML files into encrypted HTML payloads:

pagevault lock report.pdf              # → _locked/report.pdf.html
pagevault lock presentation.pptx       # → _locked/presentation.pptx.html
pagevault lock mysite/ --site          # → _locked/mysite.html

The encrypted HTML includes the file’s contents as a base64 payload. When decrypted in the browser, a viewer plugin renders the content inline — images display as images, PDFs render in an embedded viewer, Markdown gets formatted with headings and code blocks.
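
As a rough illustration of the packaging step that precedes encryption, the sketch below bundles a file's bytes and metadata into a JSON payload. The field names and the function are assumptions, not pagevault's actual payload format.

```python
import base64
import json
import mimetypes
from pathlib import Path


def build_file_payload(path: Path) -> str:
    """Bundle a file into a JSON string that would then be encrypted."""
    data = path.read_bytes()
    mime, _encoding = mimetypes.guess_type(path.name)
    payload = {
        "filename": path.name,
        # Unknown types fall back to a generic binary MIME type.
        "mime": mime or "application/octet-stream",
        "size": len(data),
        "data": base64.b64encode(data).decode("ascii"),
    }
    return json.dumps(payload)
```

In the browser, the MIME type recorded here is what a viewer would use to decide how to render the decrypted bytes. Base64 adds roughly a third to the file size, which is the price of embedding binary data in HTML.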

Here’s what a password-protected file looks like before and after decryption:

[Screenshot: encrypted file showing a password prompt with lock icon and filename]

After entering the password, the viewer renders the content with a toolbar showing the filename, file size, and a download button:

[Screenshot: decrypted Markdown file with rendered headings, lists, and code blocks]

The text viewer renders plain text with line numbers — useful for meeting notes, logs, or config files:

[Screenshot: decrypted text file with line numbers and monospace rendering]

Viewer Plugin System

The viewer architecture follows a MIME-type dispatch model. Each viewer plugin declares which MIME types it handles:

class ImageViewer(ViewerPlugin):
    name = "image"
    mime_types = ["image/*"]

    def js(self) -> str:
        return """async function(container, blob, url, meta, toolbar) {
            const img = document.createElement('img');
            img.src = url;
            container.appendChild(img);
        }"""

Seven viewers ship built-in: Image, PDF, HTML, Text, Markdown, Audio, and Video. Each lives in its own file under viewers/builtins/, and the registry discovers them automatically by scanning the directory — no registration boilerplate required.

Custom viewers work the same way. Point viewers_dir in your .pagevault.yaml at a directory of .py files that define ViewerPlugin subclasses, and they’re picked up automatically. User viewers override builtins on name collision, so you can replace any built-in viewer with your own implementation.

Resolution follows a priority chain: exact MIME match first, then wildcard patterns (e.g., image/*), then a download fallback. The viewer’s JavaScript, CSS, and any dependencies are embedded directly into the encrypted HTML — maintaining the self-contained property even with custom viewers.
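
The override rule and the priority chain can be sketched together. The registry shape (viewer name mapped to MIME patterns), the function names, and the `"download"` fallback name are assumptions; apart from `image/*`, the MIME lists are illustrative.

```python
from fnmatch import fnmatchcase


def build_registry(builtins: dict, user_plugins: dict) -> dict:
    """Map viewer name to MIME patterns; user entries replace builtins by name."""
    registry = dict(builtins)
    registry.update(user_plugins)
    return registry


def resolve_viewer(mime: str, registry: dict) -> str:
    # 1. An exact MIME match wins.
    for name, patterns in registry.items():
        if mime in patterns:
            return name
    # 2. Then wildcard patterns such as image/*.
    for name, patterns in registry.items():
        if any("*" in p and fnmatchcase(mime, p) for p in patterns):
            return name
    # 3. Otherwise offer the file as a download.
    return "download"


builtins = {"image": ["image/*"], "pdf": ["application/pdf"]}
user = {"svg": ["image/svg+xml"]}  # hypothetical custom viewer
registry = build_registry(builtins, user)

assert resolve_viewer("image/svg+xml", registry) == "svg"    # exact beats wildcard
assert resolve_viewer("image/png", registry) == "image"      # wildcard match
assert resolve_viewer("application/zip", registry) == "download"
```

Checking exact matches across all viewers before any wildcard is what lets a narrow custom viewer coexist with a broad built-in one.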

Security is enforced at multiple layers. Plugin names and MIME types are validated via regex at class definition time (__init_subclass__), re-validated at the injection point (defense-in-depth), and all JS/CSS content is escaped to prevent </script> breakout. The HTML viewer renders in a sandboxed iframe (sandbox='allow-same-origin', no allow-scripts).
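
A minimal sketch of the first and last of those layers. The regular expressions and the escaping rule are assumptions chosen for illustration, and this simplified `ViewerPlugin` base class stands in for the real one.

```python
import re

# Assumed patterns; pagevault's actual validation rules may differ.
_NAME_RE = re.compile(r"[a-z][a-z0-9_-]*")
_MIME_RE = re.compile(r"[a-z0-9.+-]+/(\*|[a-z0-9.+-]+)")


class ViewerPlugin:
    name = ""
    mime_types: list = []

    def __init_subclass__(cls, **kwargs):
        # Runs when a subclass is defined, so a bad plugin fails at import time.
        super().__init_subclass__(**kwargs)
        if not _NAME_RE.fullmatch(cls.name):
            raise ValueError(f"invalid viewer name: {cls.name!r}")
        for mime in cls.mime_types:
            if not _MIME_RE.fullmatch(mime):
                raise ValueError(f"invalid MIME type: {mime!r}")


def escape_for_script(js: str) -> str:
    """Keep embedded source from closing the surrounding <script> element.

    Rewrites "</" as "<\\/", which JavaScript string and regex literals
    treat as equivalent. A sketch covering only the </script> case.
    """
    return js.replace("</", "<\\/")


class ImageViewer(ViewerPlugin):
    name = "image"
    mime_types = ["image/*"]
```

Validating at class definition time means a malformed plugin never reaches the registry, and escaping at the injection point covers anything that slips through anyway.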


CLI Design

pagevault follows a mark-then-lock workflow:

# 1. Mark what to encrypt (in-place, content stays plaintext)
pagevault mark index.html -s "#secret" --hint "Ask admin"

# 2. Encrypt marked regions (outputs to _locked/)
pagevault lock index.html

# 3. Deploy _locked/ to your static host

The lock command is unified — it detects whether the input is HTML (encrypt marked regions), a non-HTML file (wrap as encrypted HTML), or a directory with --site (bundle as a single-page encrypted site).

A few commands that turned out to be surprisingly useful:

  • pagevault info — inspect an encrypted file without the password. Shows encryption metadata, number of key blobs, viewer information, content hash.
  • pagevault check -p PASSWORD — fast password verification. Does one PBKDF2 derivation and one key unwrap, returns exit code 0 or 1. Useful in CI scripts.
  • pagevault unlock --stdout — pipe decrypted content to stdout. Combined with shell pipelines, this makes pagevault a building block:
pagevault unlock report.pdf.html --stdout -p "$SECRET" > report.pdf

What It Defends Against (and What It Doesn’t)

pagevault protects against casual snooping (web scrapers, search engines, curious visitors) and raises the cost of determined offline attacks (someone downloads the HTML and runs dictionary attacks against the PBKDF2-wrapped payload). AES-256-GCM with 310k PBKDF2 iterations makes each guess expensive, but the protection is only as strong as the password: a short or common one will still fall to a dictionary attack.
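
To put rough numbers on "expensive", the arithmetic below estimates the worst-case time to try every candidate password against one key blob. The attacker speed is a made-up round figure for illustration, not a benchmark.

```python
ITERATIONS = 310_000


def seconds_to_exhaust(
    candidates: int, iterations_per_second: float, iterations: int = ITERATIONS
) -> float:
    """Worst-case seconds to try every candidate against one key blob."""
    return candidates * iterations / iterations_per_second


# Hypothetical attacker: one billion PBKDF2 iterations per second.
ASSUMED_RATE = 1e9
SECONDS_PER_YEAR = 365 * 24 * 3600

common_list = 10_000_000   # a list of ten million common passwords
random_12 = 62 ** 12       # 12 random characters from a-z, A-Z, 0-9

minutes_for_list = seconds_to_exhaust(common_list, ASSUMED_RATE) / 60
years_for_random = seconds_to_exhaust(random_12, ASSUMED_RATE) / SECONDS_PER_YEAR

print(f"common-password list: about {minutes_for_list:.0f} minutes")
print(f"12 random characters: about {years_for_random:.1e} years")
```

Under these assumptions a password drawn from a common list falls in under an hour, while a long random one stays far out of reach. The iteration count multiplies the attacker's work but cannot rescue a weak password.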

It does not defend against browser-level adversaries (malicious extensions, XSS on the same origin) or hosting providers modifying the served HTML. The threat model is “your static host is honest but the public internet is not.” This is the right tradeoff for static hosting — if you need protection against a malicious host, you need a different architecture entirely.


Try It

pip install pagevault
pagevault config init
pagevault mark page.html -s ".private"
pagevault lock page.html

The project is on GitHub. It’s MIT-licensed, has 530+ tests, and is designed to be a building block — composable, self-contained, no server required.


Static sites shouldn’t need backends just because some content needs a password.
