wiki-scraper

This web crawler uses Scrapy to crawl Wikipedia. For each page it writes the title, total word count, and category to an Excel workbook (via openpyxl), in order to analyze the verbosity of articles by category.
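The crawling and spreadsheet output in the project itself rely on Scrapy and openpyxl; the stdlib-only sketch below illustrates just the per-page measurement the description refers to, extracting a page's title and counting the words in its body text. The class and attribute names here are illustrative, not taken from the repository.

```python
from html.parser import HTMLParser

class PageStats(HTMLParser):
    """Collect the <title> text and a body word count from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self.word_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            # Count whitespace-separated tokens outside the title.
            self.word_count += len(data.split())

html = ("<html><head><title>Cat</title></head>"
        "<body><p>Cats are small felines.</p></body></html>")
stats = PageStats()
stats.feed(html)
print(stats.title, stats.word_count)  # → Cat 4
```

In the real spider, numbers like these would be appended as a row of an openpyxl worksheet, one row per crawled article.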

How to download and set up wiki-scraper

Open a terminal and run:
git clone https://github.com/marinakiseleva/wiki-scraper.git
git clone creates a local copy of the wiki-scraper repository. You pass git clone a repository URL; it supports several network protocols and corresponding URL formats.
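Because git clone accepts local paths as well as HTTPS and SSH URLs, its behavior can be demonstrated without network access. The snippet below (paths generated with mktemp, so nothing here is from the repository) clones a freshly created local bare repository with the same command you would point at the GitHub URL:

```shell
set -e
# The same command works with a remote URL, e.g.:
#   git clone https://github.com/marinakiseleva/wiki-scraper.git
tmp=$(mktemp -d)
git init --bare -q "$tmp/origin.git"           # stand-in for the remote
git clone -q "$tmp/origin.git" "$tmp/clone" 2>/dev/null
test -d "$tmp/clone/.git" && echo "clone ok"
rm -rf "$tmp"
```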

Alternatively, you may download wiki-scraper as a ZIP file: https://github.com/marinakiseleva/wiki-scraper/archive/master.zip

Or clone wiki-scraper over SSH:
git clone git@github.com:marinakiseleva/wiki-scraper.git

If you run into problems with wiki-scraper

You may open an issue on the wiki-scraper issue tracker: https://github.com/marinakiseleva/wiki-scraper/issues