crawlee
Crawlee is a web scraping and browser automation library for Node.js, used to build reliable crawlers in JavaScript and TypeScript. It can extract data for AI, LLMs, RAG, or GPTs, and download HTML, PDF, JPG, PNG, and other files from websites. It works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
How to download and set up crawlee
Open a terminal and run the following command:
git clone https://github.com/apify/crawlee.git
git clone creates a local copy (clone) of the crawlee repository.
You pass git clone a repository URL. It supports several network protocols, each with its own URL format.
You can also download crawlee as a zip file: https://github.com/apify/crawlee/archive/master.zip
Or clone crawlee over SSH:
git clone git@github.com:apify/crawlee.git
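The download options above differ only in the URL they use. As a minimal sketch, the small shell function below maps a format name to the matching crawlee URL. The function name crawlee_url and the format names https, ssh, and zip are hypothetical, chosen for illustration; they are not part of git or crawlee.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or crawlee): prints the URL for the
# crawlee repository in the requested format.
crawlee_url() {
    case "$1" in
        https) echo "https://github.com/apify/crawlee.git" ;;
        ssh)   echo "git@github.com:apify/crawlee.git" ;;
        zip)   echo "https://github.com/apify/crawlee/archive/master.zip" ;;
        *)     echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Usage (needs network access; the ssh form also needs an SSH key
# registered with your GitHub account):
#   git clone "$(crawlee_url https)"
#   git clone "$(crawlee_url ssh)"
#   curl -L -o crawlee-master.zip "$(crawlee_url zip)"
```

The HTTPS form works without any account setup, which makes it the simplest choice for a read-only copy. The SSH form is usually more convenient if you plan to push changes to a fork.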
If you have problems with crawlee
You can open an issue in the crawlee issue tracker: https://github.com/apify/crawlee/issues
Similar to crawlee repositories
Here are some crawlee alternatives and analogs:
freeCodeCamp bootstrap vue react You-Dont-Know-JS javascript electron node atom axios three.js webpack html5-boilerplate meteor express angular material-ui Chart.js TypeScript free-programming-books-zh_CN ionic-framework nw.js lodash materialize yarn javascript-algorithms element echarts Front-End-Checklist babel