About - coOCR/HTR

The Project

coOCR/HTR is a browser-based tool that puts domain experts at the center of OCR (Optical Character Recognition) and HTR (Handwritten Text Recognition) workflows for historical documents. The expert leads; the AI assists.

The tool combines the pattern recognition capabilities of Large Language Models with the critical judgment of human experts. It supports Digital Humanities researchers who work with handwritten sources from the 16th to the 20th century: letters, account books, diaries, and registers.

Methodology

Critical Expert in the Loop

The AI assists; the human decides. Every transcription is a hypothesis that requires expert validation. The interface positions the user as the expert operating a precision instrument, not a consumer of automated output.

Categorical Confidence

Instead of misleading percentage scores (for example, "92.3% confidence"), we use three meaningful categories: confident, uncertain, and problematic. This reduces automation bias, the tendency of users to over-trust output that carries a high percentage score.
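
To make the three categories concrete, here is a minimal Python sketch of how a categorical label could be attached to each transcribed line and used to order the expert's review queue. It is an illustration only: the names `Confidence`, `Line`, `needs_review`, `summarize`, and the sample lines are made up for this example and are not taken from the coOCR/HTR code base, and the sketch does not say how the tool itself assigns a category.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    """The three categories named above; no numeric score is stored."""
    CONFIDENT = "confident"
    UNCERTAIN = "uncertain"
    PROBLEMATIC = "problematic"


@dataclass
class Line:
    """Hypothetical record for one transcribed line."""
    number: int
    text: str
    confidence: Confidence


def needs_review(lines):
    """Return every line not marked confident, problematic lines first."""
    flagged = [ln for ln in lines if ln.confidence is not Confidence.CONFIDENT]
    # sorted() is stable, so lines within a category keep document order.
    return sorted(flagged, key=lambda ln: ln.confidence is not Confidence.PROBLEMATIC)


def summarize(lines):
    """Count lines per category, always reporting all three categories."""
    counts = Counter(ln.confidence for ln in lines)
    return {c.value: counts.get(c, 0) for c in Confidence}


# Invented sample data, for illustration only.
SAMPLE_LINES = [
    Line(1, "Anno 1623 den 4. Martii", Confidence.CONFIDENT),
    Line(2, "empfangen von H. [?] 12 fl.", Confidence.UNCERTAIN),
    Line(3, "[illegible]", Confidence.PROBLEMATIC),
]
```

With this shape, the interface never has to display a number that suggests more precision than the model has; it only has to show which lines the expert should look at, and in which order.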

Hybrid Validation

Validation combines deterministic rules (checks on transcription markers, text statistics, and OCR artifacts) with an LLM review. Advanced users can provide custom validation prompts for domain-specific checks.
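
The combination of rule-based checks and an optional LLM pass can be sketched as follows. This is a hedged illustration, not the project's implementation: the three rules, their thresholds, the message strings, and the `llm_review` callback are all assumptions chosen for the example. In particular, the sketch assumes square brackets delimit transcription markers, and it takes the LLM step as a plain function supplied by the caller rather than calling any real model API.

```python
import re
from typing import Callable, List, Optional

UNBALANCED = "unbalanced marker brackets"
REPEATED = "repeated-character run (possible OCR artifact)"
SYMBOLS = "high share of symbol characters"


def check_brackets(text: str) -> Optional[str]:
    """Transcription markers: flag square brackets that do not pair up."""
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return UNBALANCED
    return UNBALANCED if depth != 0 else None


def check_repeated_chars(text: str) -> Optional[str]:
    """OCR artifacts: flag four or more identical non-space characters in a row."""
    return REPEATED if re.search(r"(\S)\1{3,}", text) else None


def check_symbol_ratio(text: str, limit: float = 0.4) -> Optional[str]:
    """Text statistics: flag lines dominated by non-alphanumeric characters."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return None
    symbols = sum(1 for ch in visible if not ch.isalnum())
    return SYMBOLS if symbols / len(visible) > limit else None


RULES = [check_brackets, check_repeated_chars, check_symbol_ratio]


def validate(text: str,
             llm_review: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Run the deterministic rules, then the optional reviewer callback."""
    issues = [msg for rule in RULES if (msg := rule(text)) is not None]
    if llm_review is not None:
        # Hypothetical hook: a custom validation prompt would live inside it.
        issues.extend(f"llm: {note}" for note in llm_review(text))
    return issues
```

Keeping the deterministic rules separate from the reviewer callback means the rule results are reproducible and cheap to compute, while the LLM findings stay clearly labeled as such for the expert to weigh.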