AIHub
An intelligent document processing platform that combines OCR, PDF manipulation, and AI-powered text extraction to transform how teams handle documents.
Overview
AIHub is a document intelligence platform that runs AI-powered processing in the browser. Teams can extract text from images and PDFs, manipulate documents, and process content at scale, all without uploading sensitive documents to external servers.
The Challenge
Document processing in enterprise environments faces several constraints:
- Sensitive documents can't be uploaded to third-party services
- Manual data extraction is slow and error-prone
- Existing OCR solutions require complex server infrastructure
- Teams need to process documents in various formats
- Integration with existing authentication systems is required
Solution
A browser-based platform that:
- Runs OCR entirely in the browser using Tesseract.js
- Manipulates PDFs client-side with pdf-lib
- Processes images with Sharp for optimal OCR input
- Stores data locally with IndexedDB for privacy
- Authenticates via Azure AD for enterprise deployment
Technical Architecture
Client-Side OCR
Tesseract.js enables accurate text recognition without server round-trips:
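```javascript
import { createWorker } from 'tesseract.js'

// Initialize Tesseract worker
const worker = await createWorker('eng')

// Process image with confidence scoring. Tesseract.js 6 returns only text
// by default, so block-level output is requested explicitly.
const { data } = await worker.recognize(imageData, {}, { blocks: true })

// Extract text with position data
const blocks = data.blocks.map(block => ({
  text: block.text,
  confidence: block.confidence,
  bbox: block.bbox
}))

// Release the worker's memory once processing is finished
await worker.terminate()
```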
PDF Processing Pipeline
Multi-stage document handling:
- Ingestion - Parse PDF structure with pdfjs-dist
- Extraction - Render pages to canvas for OCR
- Enhancement - Pre-process images with Sharp
- Recognition - Run Tesseract on enhanced images
- Assembly - Combine results with pdf-lib
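The five stages above can be read as a per-page loop. The following is a minimal sketch of how they might be wired together, assuming each stage is passed in as a function. The stage names (`loadPages`, `renderPage`, `enhance`, `recognize`, `assemble`) are hypothetical placeholders for the pdfjs-dist, Sharp, Tesseract.js, and pdf-lib calls, not the project's actual code.

```javascript
// Hypothetical pipeline runner: every stage is injected, so the real
// library calls (pdfjs-dist, Sharp, Tesseract.js, pdf-lib) stay outside.
async function processDocument(pdfBytes, stages) {
  const { loadPages, renderPage, enhance, recognize, assemble } = stages
  const pages = await loadPages(pdfBytes)      // Ingestion
  const results = []
  for (const page of pages) {
    const image = await renderPage(page)       // Extraction
    const cleaned = await enhance(image)       // Enhancement
    results.push(await recognize(cleaned))     // Recognition
  }
  return assemble(pdfBytes, results)           // Assembly
}
```

Processing pages one at a time keeps only a single rendered page in memory, at the cost of throughput; a parallel variant would trade memory for speed.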
Image Pre-processing
Sharp optimizes images before OCR:
- Convert to grayscale for better recognition
- Apply adaptive thresholding
- Deskew rotated scans
- Remove noise and artifacts
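To make the first two steps concrete, here is a small sketch of grayscale conversion and mean-based adaptive thresholding on raw RGBA pixels. It is an illustration of the technique in plain JavaScript, not Sharp's API or the project's implementation; the function names and the `radius` and `offset` defaults are assumptions chosen for the example.

```javascript
// Convert RGBA pixels (e.g. canvas ImageData.data) to one gray byte per pixel.
function toGrayscale(rgba) {
  const gray = new Uint8ClampedArray(rgba.length / 4)
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[i * 4], g = rgba[i * 4 + 1], b = rgba[i * 4 + 2]
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b // ITU-R BT.601 luma weights
  }
  return gray
}

// Mean-based adaptive threshold: a pixel becomes black (0) when it is darker
// than the mean of its (2 * radius + 1)^2 neighbourhood minus `offset`,
// otherwise white (255). Neighbours outside the image are ignored.
function adaptiveThreshold(gray, width, height, radius = 7, offset = 10) {
  const out = new Uint8ClampedArray(gray.length)
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0, count = 0
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dx = -radius; dx <= radius; dx++) {
          const nx = x + dx, ny = y + dy
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue
          sum += gray[ny * width + nx]
          count++
        }
      }
      out[y * width + x] = gray[y * width + x] < sum / count - offset ? 0 : 255
    }
  }
  return out
}
```

Comparing each pixel with its local mean rather than a single global cutoff is what lets the threshold cope with uneven lighting in photographed pages. This direct version costs a full neighbourhood scan per pixel; an integral image would be the usual way to speed it up.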
Local Storage Strategy
IndexedDB provides persistent, private storage:
- Processed documents cached locally
- User preferences and settings
- Offline capability for previously processed files
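The caching behaviour described above amounts to a read-through cache: look up a document's result first and only run OCR on a miss. The sketch below assumes a hypothetical `store` object with async `get(key)` and `put(key, value)` methods; in the browser that object would wrap an IndexedDB object store, while the in-memory stand-in here only exists so the example runs anywhere.

```javascript
// Return the cached result for `key`, or compute and store it on a miss.
// `store` is any object with async get(key) and put(key, value) (assumption).
async function getOrProcess(store, key, processFn) {
  const cached = await store.get(key)
  if (cached !== undefined) return { result: cached, fromCache: true }
  const result = await processFn()
  await store.put(key, result)
  return { result, fromCache: false }
}

// In-memory stand-in for the IndexedDB-backed store.
function createMemoryStore() {
  const map = new Map()
  return {
    get: async key => map.get(key),
    put: async (key, value) => { map.set(key, value) },
  }
}
```

The choice of key matters in practice: a file name can collide across different documents, so a content hash is usually the safer identifier.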
Key Features
Smart Text Extraction
AI-powered OCR that handles:
- Scanned documents and photos
- Multi-column layouts
- Tables and structured data
- Handwritten text (with reduced accuracy)
PDF Manipulation
Client-side operations:
- Merge multiple PDFs
- Split documents by page
- Extract specific pages
- Add watermarks and annotations
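Splitting and page extraction both start from a page selection entered by the user. As a sketch, assuming selections are written like `1-3,5`, the hypothetical helper below turns such a string into the zero-based index array that pdf-lib's `copyPages(sourceDoc, indices)` takes.

```javascript
// Parse a 1-based page selection such as "1-3,5" into sorted, unique
// zero-based indices, rejecting pages outside the document.
function parsePageRanges(spec, pageCount) {
  const indices = new Set()
  for (const part of spec.split(',')) {
    const match = part.trim().match(/^(\d+)(?:\s*-\s*(\d+))?$/)
    if (!match) throw new Error(`Invalid page range: "${part}"`)
    const start = Number(match[1])
    const end = match[2] === undefined ? start : Number(match[2])
    if (start < 1 || end < start || end > pageCount) {
      throw new Error(`Page range out of bounds: "${part}"`)
    }
    for (let page = start; page <= end; page++) indices.add(page - 1)
  }
  return [...indices].sort((a, b) => a - b)
}
```

For a six-page document, `parsePageRanges('1-3,5', 6)` yields `[0, 1, 2, 4]`. Validating before any PDF work begins means a typo fails fast instead of partway through a copy.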
Batch Processing
Queue multiple documents for processing:
- Progress tracking per document
- Parallel processing with Web Workers
- Resume interrupted batches
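A batch queue of this kind can be sketched as a small concurrency pool. The example below is an assumption about the general shape rather than the project's code: `task` stands in for dispatching one document to a Web Worker, and `onProgress` is a hypothetical callback for the per-document progress display.

```javascript
// Run `task(item)` for every item with at most `limit` in flight,
// reporting per-item status through `onProgress(index, status)`.
async function runBatch(items, task, { limit = 2, onProgress = () => {} } = {}) {
  const results = new Array(items.length)
  let next = 0
  async function runner() {
    while (next < items.length) {
      const index = next++
      onProgress(index, 'running')
      try {
        results[index] = { status: 'done', value: await task(items[index]) }
      } catch (error) {
        // A failed document is recorded instead of aborting the whole batch.
        results[index] = { status: 'failed', error }
      }
      onProgress(index, results[index].status)
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, runner)
  await Promise.all(runners)
  return results
}
```

Because failures are recorded per item, resuming an interrupted batch can be as simple as re-running only the entries whose status is not `done`.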
Enterprise Authentication
Azure MSAL integration:
- Single sign-on with company credentials
- Token-based session management
- Automatic token refresh
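Automatic refresh with MSAL typically follows a "silent first, interactive on demand" pattern. The sketch below assumes `msal` behaves like a `PublicClientApplication` from `@azure/msal-browser`; the wrapper name `getAccessToken` is hypothetical, and the account and scopes are supplied by the caller.

```javascript
// Try to get a token silently; fall back to a popup only when MSAL
// reports that user interaction is required.
async function getAccessToken(msal, account, scopes) {
  const request = { account, scopes }
  try {
    // Served from cache or refreshed in the background when possible.
    const response = await msal.acquireTokenSilent(request)
    return response.accessToken
  } catch (error) {
    if (error && error.name === 'InteractionRequiredAuthError') {
      const response = await msal.acquireTokenPopup(request)
      return response.accessToken
    }
    throw error // Network and configuration errors are not masked.
  }
}
```

Checking the error name keeps the sketch free of imports; code that already imports the library could use `instanceof InteractionRequiredAuthError` instead.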
Technical Stack
- Nuxt 3 for the application framework
- Tesseract.js 6 for browser-based OCR
- pdf-lib for PDF creation and manipulation
- pdfjs-dist for PDF rendering and parsing
- Sharp for image processing
- Canvas API for image manipulation
- IndexedDB (idb) for local storage
- Azure MSAL for authentication
- Playwright for E2E testing
Performance Optimizations
Document processing is resource-intensive. Key optimizations:
- Web Workers - OCR runs off the main thread
- Progressive loading - Process visible pages first
- Caching - Store intermediate results
- Lazy initialization - Load Tesseract only when needed
- Memory management - Release resources after processing
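Lazy initialization and resource release can share one small wrapper. This is a sketch under the assumption that the worker factory is injected, for example `() => createWorker('eng')`; the name `createLazyWorker` is hypothetical.

```javascript
// Create the OCR worker on first use, share it across callers, and allow
// it to be released so its memory can be reclaimed.
function createLazyWorker(createWorkerFn) {
  let workerPromise = null
  return {
    get() {
      // Caching the promise (not the worker) prevents double creation when
      // two callers ask for the worker at the same time.
      if (!workerPromise) workerPromise = createWorkerFn()
      return workerPromise
    },
    async release() {
      if (!workerPromise) return
      const pending = workerPromise
      workerPromise = null
      const worker = await pending
      await worker.terminate() // Tesseract.js workers expose terminate()
    },
  }
}
```

After `release()`, the next `get()` builds a fresh worker, so the app pays the startup cost only when OCR is actually needed again.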
Privacy by Design
All processing happens in the browser:
- Documents never leave the user's device
- No server-side storage of content
- Authentication tokens stored securely
- Clear data option for sensitive sessions
Results
- Processes 50+ page documents in under a minute
- 95%+ accuracy on clean printed text
- Zero server infrastructure for document processing
- Deployed to enterprise teams with strict data policies
Lessons Learned
- Browser capabilities are impressive - Modern browsers can handle serious workloads
- OCR quality depends on input - Image pre-processing is crucial
- Memory limits are real - Large documents need careful chunking
- UX during processing - Users need feedback for long operations
- Offline-first wins - Local storage makes the app feel instant
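The chunking lesson can be illustrated with a short generator. This is a hypothetical helper, not the project's code: it splits a list of page numbers into fixed-size groups so that a caller can process one group, free its canvases and intermediate results, and then move on.

```javascript
// Yield consecutive slices of `pages`, each at most `chunkSize` long.
function* chunkPages(pages, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  for (let start = 0; start < pages.length; start += chunkSize) {
    yield pages.slice(start, start + chunkSize)
  }
}
```

A suitable chunk size depends on page resolution and the device's available memory, so it is best treated as a tunable setting rather than a constant.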