240 Forks
2816 Stars
2816 Watchers

text-extract-api

Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

How to download and set up text-extract-api

Open a terminal and run the command:
git clone https://github.com/CatchTheTornado/text-extract-api.git
git clone creates a local copy (clone) of the text-extract-api repository. You pass git clone a repository URL.
It supports several network protocols, each with its own URL format.

Alternatively, you can download text-extract-api as a zip file: https://github.com/CatchTheTornado/text-extract-api/archive/master.zip

Or clone text-extract-api over SSH:
git clone git@github.com:CatchTheTornado/text-extract-api.git
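
The download options above differ only in URL format. As a minimal sketch, the Python snippet below builds each form from an owner and repository name. The helper name `repo_urls` is hypothetical (it is not part of text-extract-api or git), and the `master` default is taken from the zip link above; it may not match the repository's actual default branch.

```python
def repo_urls(owner: str, repo: str, branch: str = "master") -> dict[str, str]:
    """Build the common GitHub URL forms for a repository.

    Hypothetical helper for illustration only. The branch default
    ("master") is an assumption copied from the zip link in the text.
    """
    base = f"https://github.com/{owner}/{repo}"
    return {
        "https": f"{base}.git",                       # git clone over HTTPS
        "ssh": f"git@github.com:{owner}/{repo}.git",  # git clone over SSH
        "zip": f"{base}/archive/{branch}.zip",        # zip archive download
    }


urls = repo_urls("CatchTheTornado", "text-extract-api")
for kind, url in urls.items():
    print(f"{kind}: {url}")
```

The HTTPS and SSH forms can be passed directly to git clone; the zip URL is fetched with a browser or any HTTP client instead.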

If you have problems with text-extract-api

You can open an issue in the text-extract-api issue tracker: https://github.com/CatchTheTornado/text-extract-api/issues

Repositories similar to text-extract-api

Alternatives and analogs to text-extract-api:

 sheetjs    cli    prettier    fastjson    grape    learn-anything    SwiftyJSON    just-dashboard    lowdb    rxdb    brain.js    falcon    strapi    php-curl-class    Requester    unqlite    diplomat    neovim    SwiftAI    structured-text-tools    ServiceStack    countries    rest-assured    poco    mimesis    tbox    minify    render    acl    EVReflection