18 Forks
167 Stars
167 Watchers

aws-pdf-textract-pipeline

Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK and TypeScript.

How to download and set up aws-pdf-textract-pipeline

Open a terminal and run:
git clone https://github.com/aeksco/aws-pdf-textract-pipeline.git
git clone creates a local copy of the aws-pdf-textract-pipeline repository. You pass it a repository URL; it supports several network protocols (such as HTTPS and SSH), each with its own URL format.

You can also download aws-pdf-textract-pipeline as a zip file: https://github.com/aeksco/aws-pdf-textract-pipeline/archive/master.zip

Or clone aws-pdf-textract-pipeline over SSH:
git clone git@github.com:aeksco/aws-pdf-textract-pipeline.git
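
The download options described here differ only in the URL format. The following is a minimal sketch that builds each URL from the repository owner and name; the variable names are hypothetical and chosen for illustration, and the fetch commands are left commented out because they need network access (and, for SSH, a key registered with GitHub).

```shell
#!/bin/sh
# Owner and repository name as they appear in the URLs on this page.
owner="aeksco"
repo="aws-pdf-textract-pipeline"

# The three URL formats (variable names are illustrative assumptions).
https_url="https://github.com/${owner}/${repo}.git"
ssh_url="git@github.com:${owner}/${repo}.git"
zip_url="https://github.com/${owner}/${repo}/archive/master.zip"

echo "HTTPS: ${https_url}"
echo "SSH:   ${ssh_url}"
echo "ZIP:   ${zip_url}"

# Uncomment exactly one of these to fetch the code:
# git clone "${https_url}"
# git clone "${ssh_url}"
# curl -L -o "${repo}.zip" "${zip_url}" && unzip "${repo}.zip"
```

HTTPS is usually the simplest choice for a read-only copy; SSH is more convenient if you plan to push changes from a fork.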

If you have problems with aws-pdf-textract-pipeline

You can open an issue on the aws-pdf-textract-pipeline issue tracker: https://github.com/aeksco/aws-pdf-textract-pipeline/issues

Repositories similar to aws-pdf-textract-pipeline

Alternatives and analogs to aws-pdf-textract-pipeline:

 react-starter-kit    data-science-ipython-notebooks    saws    aws-doc-sdk-examples    derek    ice    react-firebase-starter    functions-samples    hydro-serving    LambStatus    cstate    are-you-fake-news    sensu-plugins-aws    apex    es2017-lambda-boilerplate    retinal    kong    faas    up    tidb    prisma1    dev-setup    zentral    lambda-packs    learning-tools    pingbot    amazon-sagemaker-examples    Zappa    awesome-aws    chalice