The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
What is the ArchiveTeam/grab-site GitHub project? Description: "The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns". Written in Python. Explain what it does, its main use cases, key features, and who would benefit from using it.
Question is copied to clipboard — paste it after the AI opens.
Clone via HTTPS
Clone via SSH
Download ZIP
Download master.zipReport bugs or request features on the grab-site issue tracker:
Open GitHub Issues