
punjabi_news_website_crawlers

This project contains three Python files for building a Punjabi news corpus. Each file crawls one of three Punjabi news websites: punjabitribuneonline.com, punjabijagran.com, and jagbani.punjabkesari.in.
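The description above does not say how the three scripts extract article text, so the following is only a minimal sketch of one step such a crawler typically needs: pulling paragraph text out of a downloaded HTML page. It uses only the Python standard library. The names `ParagraphExtractor` and `extract_paragraphs` are hypothetical and do not come from the repository, whose scripts may use different libraries and page-specific selectors.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False  # True while between <p> and </p>
        self._buffer = []           # text chunks of the current paragraph
        self.paragraphs = []        # finished, whitespace-normalized paragraphs

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(html: str) -> list:
    """Return the non-empty paragraph texts of an HTML document, in order."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

In a real crawler the `html` argument would be the decoded body of a fetched article page, and the returned paragraphs would be appended to the corpus file. Fetching, link discovery, and politeness rules (rate limiting, robots.txt) are left out of this sketch.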

How to download and set up punjabi_news_website_crawlers

Open a terminal and run:
git clone https://github.com/GurjotSinghMahi/punjabi_news_website_crawlers.git
git clone creates a local copy of the punjabi_news_website_crawlers repository from the URL you pass to it. It supports several network protocols, such as HTTPS and SSH, each with its own URL format.

Alternatively, download punjabi_news_website_crawlers as a zip file: https://github.com/GurjotSinghMahi/punjabi_news_website_crawlers/archive/master.zip

Or clone punjabi_news_website_crawlers over SSH:
git@github.com:GurjotSinghMahi/punjabi_news_website_crawlers.git
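The three download options above differ only in the URL format. As a rough illustration, the sketch below builds all three URLs from the owner and repository name and wraps `git clone` in a small Python helper. The function names `clone_urls` and `clone` are hypothetical, and the zip URL assumes the default branch is `master`, as in the link given above.

```python
import subprocess

OWNER = "GurjotSinghMahi"
REPO = "punjabi_news_website_crawlers"


def clone_urls(owner: str, repo: str) -> dict:
    """Return the HTTPS, SSH, and zip-archive URLs for a GitHub repository."""
    return {
        "https": f"https://github.com/{owner}/{repo}.git",
        "ssh": f"git@github.com:{owner}/{repo}.git",
        # Assumes the default branch is named "master".
        "zip": f"https://github.com/{owner}/{repo}/archive/master.zip",
    }


def clone(url: str, target_dir: str) -> None:
    """Run `git clone <url> <target_dir>`.

    Requires git on the PATH; raises subprocess.CalledProcessError
    if the clone fails.
    """
    subprocess.run(["git", "clone", url, target_dir], check=True)
```

For example, `clone(clone_urls(OWNER, REPO)["https"], REPO)` would clone the repository over HTTPS into a directory named after it. The SSH form additionally requires an SSH key registered with your GitHub account.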

If you have problems with punjabi_news_website_crawlers

You can open an issue on the punjabi_news_website_crawlers issue tracker: https://github.com/GurjotSinghMahi/punjabi_news_website_crawlers/issues

Repositories similar to punjabi_news_website_crawlers

The following repositories are alternatives or analogs to punjabi_news_website_crawlers:

scrapy, Sasila, colly, headless-chrome-crawler, Lulu, gopa, newspaper, isp-data-pollution, webster, cdp4j, spidy, stopstalk-deployment, N2H4, memorious, easy-scraping-tutorial, antch, pomp, Harvester, diffbot-php-client, talospider, corpuscrawler, Python-Crawling-Tutorial, learn.scrapinghub.com, crawling-projects, dig-etl-engine, crawlkit, scrapy-selenium, spidyquotes, zcrawl, podcastcrawler