webcrawl

Webcrawl is a Python web crawler that recursively follows links from a starting URL, extracting and printing each unique HTTP link it finds. Built on 'requests' and 'BeautifulSoup', it avoids revisiting pages, handles errors, and supports a configurable crawl depth. It is suited to gathering and analyzing web links.
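
The crawl loop described above can be illustrated with a minimal sketch. This is not webcrawl's actual source. To stay runnable without extra packages, it swaps in the standard library ('urllib' and 'html.parser') in place of 'requests' and 'BeautifulSoup'. The names LinkParser, fetch_html, crawl, and max_depth are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url, timeout=10):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(url, max_depth=2, fetch=fetch_html, visited=None):
    """Recursively follow HTTP(S) links and return the set of URLs seen."""
    if visited is None:
        visited = set()
    url, _fragment = urldefrag(url)  # page#a and page#b are the same page
    if url in visited:
        return visited  # avoid revisits
    visited.add(url)
    print(url)
    if max_depth <= 0:
        return visited  # depth limit reached: record the link, do not fetch
    try:
        html = fetch(url)
    except (OSError, ValueError) as exc:  # network failures, malformed URLs
        print(f"skipped {url}: {exc}")
        return visited
    parser = LinkParser()
    parser.feed(html)
    for href in parser.hrefs:
        link = urljoin(url, href)  # resolve relative links against the page
        if link.startswith(("http://", "https://")):
            crawl(link, max_depth - 1, fetch, visited)
    return visited
```

Calling crawl("https://example.com/", max_depth=1) would print the start page and every HTTP link found on it, without fetching those linked pages. The fetch parameter exists only so the download step can be replaced, for instance with a function that serves canned HTML during testing.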

How to download and set up webcrawl

Open a terminal and run this command:
git clone https://github.com/ls-saurabh/webcrawl.git
git clone creates a local copy of the webcrawl repository from the repository URL you pass it.
It supports several network protocols, each with its own URL format.

You can also download webcrawl as a zip file: https://github.com/ls-saurabh/webcrawl/archive/master.zip

Or clone webcrawl over SSH, using this URL:
git@github.com:ls-saurabh/webcrawl.git

If you have problems with webcrawl

You can open an issue on the webcrawl issue tracker: https://github.com/ls-saurabh/webcrawl/issues