BETA Work in Progress - Developed with Promptotyping GitHub

Desktop Recommended

coOCR/HTR is optimized for large screens to display document, editor, and validation panels simultaneously.

Document Viewer
Load Document
Supports images (JPG, PNG) and PAGE-XML. Zoom with mouse wheel, pan by dragging.
Load a demo document or upload an image (JPG, PNG) or PAGE-XML.

Drop files here

Drag & drop images or PAGE-XML files

JPG, PNG, TIFF, XML
Editor
How does it work?
1. Transcribe: The LLM creates an initial draft which you correct and validate as an expert.
2. Describe: As configured in the user prompt, the LLM generates a visual description of the document page.
LLM | Gemini Flash
Click "Transcribe" to process the document with AI. Double-click to edit.
Validation / LLM Review
Hybrid Validation
Combines deterministic validation with LLM Review. Results are reported in three categories (confident, uncertain, problematic) rather than as percentages.
Model Thinking
Watch the LLM's reasoning process in real time. This helps you understand how the model approaches your document and refine your prompts accordingly.
Validation checks the transcription for errors and uncertainties.

No Validation Yet

Load a document and run transcription to see validation results here.

LLM Configuration

Security Notice:
  • API keys are not stored; you must enter them on each visit
  • Keys remain only in browser memory while the tab is open
  • Never share your API keys with others
  • For maximum security: use Ollama locally (no API key needed)
Learn more

Browser-based apps cannot fully protect API keys: the key sits in the browser's memory while the tool is in use. We recommend:

  • Create a separate API key only for this tool
  • Set spending limits with your provider
  • For sensitive documents: Use Ollama locally

Tip for local use: When running the tool locally, you can create a config.local.js file (see config.local.example.js). Your keys will then be loaded automatically.

Use Gemini 3 Flash for fast results and Gemini 3 Pro for complex handwriting.
Get an API key from Google AI Studio.

Start Transcription

Document Context (optional): Additional document information

Optional information about the document that will be considered during transcription.

If the source text is known, name it here for better correction proposals.
Describe origin, author, special features or other helpful information.
Prompt Profile (optional): Scenario profile and Stage 1 override
Select a document scenario. You can still override prompts manually.

Describe Illuminated Initials

Using Gemini 3 Pro for image analysis

Start Validation

Validation

Local, fast checks without API calls

LLM Review

One API call per page

Use custom prompt (optional)

Overrides the default LLM Review prompt. Use {text} as placeholder for the transcription.
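
The placeholder mechanism can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `buildReviewPrompt`, the default prompt text, and the fallback of appending the transcription when `{text}` is missing are all assumptions.

```javascript
// Hypothetical default; the tool's real default LLM Review prompt is not shown in the source.
const DEFAULT_REVIEW_PROMPT =
  "Review this transcription and flag doubtful readings:\n\n{text}";

// Returns the prompt sent for LLM Review, with {text} replaced by the transcription.
function buildReviewPrompt(transcription, customPrompt) {
  // An empty or whitespace-only custom prompt falls back to the default.
  const template =
    customPrompt && customPrompt.trim() !== "" ? customPrompt : DEFAULT_REVIEW_PROMPT;

  if (!template.includes("{text}")) {
    // Assumption: without a placeholder, append the transcription at the end.
    return `${template}\n\n${transcription}`;
  }

  // split/join replaces every occurrence and treats the transcription literally,
  // so characters such as "$&" in the text are not interpreted as patterns.
  return template.split("{text}").join(transcription);
}
```

For example, `buildReviewPrompt("Anno domini", "Check this: {text}")` yields `"Check this: Anno domini"`.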

Prompt Profile

Choose a document scenario profile for Stage 1/2/3 prompts.

Advanced stage overrides (optional)

If filled in, an override replaces the profile or default prompt for that stage.

Export Transcription

Settings

Editor

Validation

Display

Storage

Images and transcriptions are stored in browser storage.

Data

Help

Quick Start

  1. Load: Upload an image or load a demo document
  2. Transcribe: Click "Transcribe" to process with AI
  3. Describe: Generate an image description for initials and iconography
  4. Validate: Review Validation and LLM Review results
  5. Export: Download as TXT, JSON, or Markdown
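
As a rough sketch of the export step, the three formats could be produced from one in-memory structure. The data shape (`title`, `lines` with `text` and `category`) and the function name `exportTranscription` are hypothetical; the tool's real export format is not documented here.

```javascript
// Hypothetical document shape: { title: string, lines: [{ text, category }] }
function exportTranscription(doc, format) {
  switch (format) {
    case "txt":
      // Plain text: one transcribed line per output line.
      return doc.lines.map((line) => line.text).join("\n");
    case "json":
      // JSON: the full structure, including the assumed category field.
      return JSON.stringify(doc, null, 2);
    case "markdown":
      // Markdown: title as a heading, lines as a numbered list.
      return [
        `# ${doc.title}`,
        "",
        ...doc.lines.map((line, i) => `${i + 1}. ${line.text}`),
      ].join("\n");
    default:
      throw new Error(`Unsupported format: ${format}`);
  }
}
```

In a browser, the returned string could then be wrapped in a `Blob` and offered as a download.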

Keyboard Shortcuts

Ctrl + Z: Undo
Ctrl + Shift + Z: Redo
Navigate lines
Enter: Edit selected cell
Escape: Cancel editing
Tab: Next cell (grid mode)

Confidence Markers

Confident: High-certainty reading
Uncertain: Needs expert review
Problematic: Likely error or illegible

Use [?] for uncertain readings and [illegible] for unreadable text.
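
A local, rule-based check on these markers might look like the sketch below. The mapping of `[illegible]` to "problematic" and `[?]` to "uncertain" is an assumption made for illustration, as are the function names; the tool's actual validation rules are not described in this text.

```javascript
// Assumed rule, not the tool's documented logic: [illegible] outranks [?].
function classifyLine(line) {
  if (line.includes("[illegible]")) return "problematic";
  if (line.includes("[?]")) return "uncertain";
  return "confident";
}

// Counts how many lines fall into each category.
function summarize(lines) {
  const counts = { confident: 0, uncertain: 0, problematic: 0 };
  for (const line of lines) {
    counts[classifyLine(line)] += 1;
  }
  return counts;
}
```

Because the check is plain string matching, it runs in the browser without any API call.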

Load IIIF Resource

Load images from IIIF-compatible repositories. Supports Presentation API v2 and v3 manifests.
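
The two manifest versions nest their images differently, which the following sketch illustrates. It is a simplified reading of the IIIF Presentation API structures, not the tool's loader: the function names are hypothetical, and cases such as `Choice` bodies or multiple sequences with alternate views are ignored.

```javascript
// True if the manifest declares the Presentation API 3 context or uses the v3 "type" key.
function isV3(manifest) {
  const contexts = [].concat(manifest["@context"] || []);
  return (
    contexts.some((c) => typeof c === "string" && c.includes("/presentation/3/")) ||
    manifest.type === "Manifest"
  );
}

// Collects the painted image URLs from a v2 or v3 manifest, in canvas order.
function imageUrls(manifest) {
  if (isV3(manifest)) {
    // v3: Manifest.items (Canvas) -> items (AnnotationPage) -> items (Annotation) -> body.id
    return (manifest.items || []).flatMap((canvas) =>
      (canvas.items || []).flatMap((page) =>
        (page.items || [])
          .filter((anno) => [].concat(anno.motivation || []).includes("painting"))
          .map((anno) => anno.body && anno.body.id)
          .filter(Boolean)
      )
    );
  }
  // v2: sequences[].canvases[].images[].resource["@id"]
  return (manifest.sequences || []).flatMap((sequence) =>
    (sequence.canvases || []).flatMap((canvas) =>
      (canvas.images || [])
        .map((image) => image.resource && image.resource["@id"])
        .filter(Boolean)
    )
  );
}
```

A manifest would typically be fetched with `fetch(url).then((r) => r.json())` and then passed to `imageUrls`.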

Examples: