aws-pdf-textract-pipeline

aeksco

Data pipeline for crawling PDFs from the web and transforming their contents into structured data using AWS Textract. Built with AWS CDK and TypeScript.

167 Stars
18 Forks
167 Watchers
TypeScript Language
MIT License
Cost to Build: $90.4K
Market Value: $245.0K

Growth over time

Stars, forks, and watchers: 13 data points, 2021-05-01 → 2025-09-01

How to clone aws-pdf-textract-pipeline

Clone via HTTPS

git clone https://github.com/aeksco/aws-pdf-textract-pipeline.git

Clone via SSH

git clone git@github.com:aeksco/aws-pdf-textract-pipeline.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the aws-pdf-textract-pipeline issue tracker:

Open GitHub Issues