holwech / Newsscraper

NewsScraper - Scrape any newspaper automatically

This is a simple Python script that automatically scrapes the most recent articles from any news site.

Just add the websites you want to scrape to NewsPapers.json and the script will go through and scrape each site listed in the file.

This repository was originally created as part of this tutorial.

Thanks to Pål Grønås Drange for his contributions to the repository.

Installing

Download the contents of this repository, then run:

pip install -r requirements.txt

Usage

Simply run python newsscraper.py NewsPapers.json.

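The NewsPapers.json file should contain one entry per site, each with a link to the site and, where available, an rss feed URL (the breitbart entry below has none):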

{
  "bbc": {
    "rss": "http://feeds.bbci.co.uk/news/rss.xml",
    "link": "http://www.bbc.com/"
  },
  "breitbart": {
    "link": "http://www.breitbart.com/"
  },
  "cnn": {
    "rss": "http://rss.cnn.com/rss/edition.rss",
    "link": "http://edition.cnn.com/"
  },
  "foxnews": {
    "rss": "http://feeds.foxnews.com/foxnews/latest",
    "link": "http://www.foxnews.com/"
  },
  "nytimes_frontpage": {
    "link": "https://nytimes.com/",
    "rss": "https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml"
  },
  "nytimes_international": {
    "link": "https://nytimes.com/",
    "rss": "https://rss.nytimes.com/services/xml/rss/nyt/World.xml"
  },
  "theguardian": {
    "rss": "https://www.theguardian.com/uk/rss",
    "link": "https://www.theguardian.com/international"
  },
  "washingtonpost": {
    "rss": "http://feeds.washingtonpost.com/rss/world",
    "link": "https://www.washingtonpost.com/"
  },
  "wsj": {
    "rss": "https://feeds.a.dj.com/rss/RSSWorldNews.xml",
    "link": "https://www.wsj.com"
  }
}
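
To check a configuration before running the scraper, the sketch below loads a NewsPapers.json-style document and separates entries that have an rss feed from those with only a link. The function name split_sources and the rule that every entry must carry a link key are assumptions made for illustration; they are inferred from the example above, not taken from newsscraper.py.

```python
import json
from typing import Dict, List, Tuple


def split_sources(config_text: str) -> Tuple[List[str], List[str]]:
    """Return (names that have an rss feed, names that only have a link)."""
    config: Dict[str, Dict[str, str]] = json.loads(config_text)
    with_feed: List[str] = []
    link_only: List[str] = []
    for name, site in config.items():
        # Assumption: every entry needs a "link", as in the example file.
        if "link" not in site:
            raise ValueError(f"{name!r} is missing the 'link' key")
        if "rss" in site:
            with_feed.append(name)
        else:
            link_only.append(name)
    return with_feed, link_only


# Two entries copied from the example configuration above.
SAMPLE = """
{
  "bbc": {"rss": "http://feeds.bbci.co.uk/news/rss.xml", "link": "http://www.bbc.com/"},
  "breitbart": {"link": "http://www.breitbart.com/"}
}
"""

with_feed, link_only = split_sources(SAMPLE)
print("rss:", with_feed)
print("link only:", link_only)
```

For the real file, pass the result of open("NewsPapers.json").read() instead of SAMPLE.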

Libraries

This script uses the following libraries:

https://github.com/codelucas/newspaper

https://github.com/kurtmckee/feedparser
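
As a rough illustration of how the two libraries could divide the work, the sketch below picks article URLs from the rss feed when an entry has one and falls back to crawling the link otherwise. This is one plausible arrangement, not necessarily what newsscraper.py does. The function name article_urls and the limit parameter are hypothetical, and the two libraries are passed in as callables so the block itself only needs the standard library.

```python
from typing import Any, Callable, Dict, List


def article_urls(
    site: Dict[str, str],
    parse_feed: Callable[[str], Any],
    build_source: Callable[[str], Any],
    limit: int = 5,
) -> List[str]:
    """Return up to `limit` article URLs for one configuration entry."""
    if "rss" in site:
        # feedparser.parse(url) returns an object whose .entries each have a .link
        feed = parse_feed(site["rss"])
        urls = [entry.link for entry in feed.entries]
    else:
        # newspaper.build(url) returns a source whose .articles each have a .url
        source = build_source(site["link"])
        urls = [article.url for article in source.articles]
    return urls[:limit]
```

With both libraries installed, article_urls(entry, feedparser.parse, newspaper.build) would wire in the real implementations; each returned URL could then be handed to newspaper.Article to download and parse the text.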
