dedoc

ispras

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser)

659 Stars
52 Forks
659 Watchers
Python Language
apache-2.0 License
100 SrcLog Score
Cost to Build: $19.62M
Market Value: $82.80M

Growth over time

[Chart: stars, forks, and watchers; 3 data points, 2026-04-10 → 2026-04-24]

How to clone dedoc

Clone via HTTPS

git clone https://github.com/ispras/dedoc.git

Clone via SSH

git clone git@github.com:ispras/dedoc.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the dedoc issue tracker (GitHub Issues).