Skip to content
#

document-processing

Here are 1,133 public repositories matching this topic...

📄 The PDF intelligence layer for AI agents — Agent Document Twin, evidence-first extraction, visual crops, OCR provenance, trust reports, and benchmark-gated releases. MCP server for Claude, Cursor, VS Code, and any MCP client.

  • Updated Jul 1, 2026
  • TypeScript

Open-source batch OCR workbench — a free, local alternative to ABBYY FineReader. Powered by Ollama + GLM-OCR + PP-DocLayoutV3, ~0.5s/page on RTX 4090. Three-panel editor, layout-aware, PDF/image batch processing, Markdown/Word export. 批量OCR工作台,纯本地运行,免费平替ABBYY,适合书籍文档数字化。

  • Updated Jun 18, 2026
  • Python

Python, LlamaIndex, LangChain, Docker Compose: 15 Property Graph, 4 RDF , 10 Vector, OpenSearch, Elasticsearch, Alfresco DBs. 13 data sources (9 auto-sync), KG auto-building, Ontologies, LLMs, Docling or LlamaParse doc processing, GraphRAG, RAG only, Hybrid Search, AI Chat. TypeScript React, Vue, Angular frontends, FastAPI REST backend, MCP Server.

  • Updated Jun 20, 2026
  • Python
formkiq-core

Open-source document management platform leveraging AWS managed services. RESTful API for document storage, processing, full-text search, and metadata management. Multi-tenant serverless architecture with auto-scaling... deployed entirely in your AWS account.

  • Updated Jul 1, 2026
  • Java

Improve this page

Add a description, image, and links to the document-processing topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the document-processing topic, visit your repo's landing page and select "manage topics."

Learn more