
rg089 / newsemble

Licence: other
API for fetching data from news websites.

Programming Languages

python

Projects that are alternatives of or similar to newsemble

newspaperjs
News extraction and scraping. Article Parsing
Stars: ✭ 59 (+40.48%)
Mutual labels:  scraper, news, webscraping
BoxFeed
News App 📱 built to demonstrate the use of SwiftUI 3 features, Async/Await, CoreData and MVVM architecture pattern.
Stars: ✭ 112 (+166.67%)
Mutual labels:  news, newsapi
MalScraper
Scrape everything you can from MyAnimeList.net
Stars: ✭ 132 (+214.29%)
Mutual labels:  scraper, news
covid19.swift
🌐 Small iOS app to show some COVID-19 health, data, news and tweets
Stars: ✭ 25 (-40.48%)
Mutual labels:  news, newsapi
NewsPin
News app for android using Kotlin, coroutines, MVP architecture
Stars: ✭ 25 (-40.48%)
Mutual labels:  news, newsapi
BookingScraper
🌎 🏨 Scrape Booking.com 🏨 🌎
Stars: ✭ 68 (+61.9%)
Mutual labels:  scraper, webscraping
civic-scraper
Tools for downloading agendas, minutes and other documents produced by local government
Stars: ✭ 21 (-50%)
Mutual labels:  scraper, news
Instagram Proxy Api
CORS compliant API to access Instagram's public data
Stars: ✭ 245 (+483.33%)
Mutual labels:  heroku, scraper
News-API-Kotlin
Access the News API with Kotlin.
Stars: ✭ 35 (-16.67%)
Mutual labels:  news, newsapi
HeadLines
HeadLines is a 📰 news app that delivers you with the latest news. It has interactive UI and easy to use. The app can be scrolled offline to watch your bookmarked news. Give this app a try and let me know.
Stars: ✭ 16 (-61.9%)
Mutual labels:  news, newsapi
Inshorts-News-API
Unofficial API of Inshorts written in Flask
Stars: ✭ 87 (+107.14%)
Mutual labels:  news, newsapi
info-bot
A Versatile Telegram Bot
Stars: ✭ 37 (-11.9%)
Mutual labels:  news, bs4
TradeTheEvent
Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading." In Findings of ACL2021
Stars: ✭ 64 (+52.38%)
Mutual labels:  scraper, news
TrollHunter
Twitter Troll & Fake News Hunter - Crawls news websites and twitter to identify fake news
Stars: ✭ 38 (-9.52%)
Mutual labels:  scraper, news
Heroku ebooks
A script to generate Markov chains and to post to an _ebooks account on Twitter using Heroku
Stars: ✭ 251 (+497.62%)
Mutual labels:  heroku, scraper
robotstxt
robots.txt file parsing and checking for R
Stars: ✭ 65 (+54.76%)
Mutual labels:  scraper, webscraping
Youtube Projects
This repository contains all the code I use in my YouTube tutorials.
Stars: ✭ 144 (+242.86%)
Mutual labels:  scraper, webscraping
Polite
Be nice on the web
Stars: ✭ 253 (+502.38%)
Mutual labels:  scraper, webscraping
extractnet
A Dragnet that also extract author, headline, date, keywords from context
Stars: ✭ 52 (+23.81%)
Mutual labels:  news, webscraping
NewsApp
An app that fetches latest news, headlines
Stars: ✭ 28 (-33.33%)
Mutual labels:  news, newsapi

📰 Newsemble 📰


An API for fetching the current news.

python   Flask   MongoDB  Heroku


🔖 About 🔖


Blog Post

Newsemble is an API that provides easy access to current news for programmatic analysis. It is built with Python, BeautifulSoup and MongoDB.
Data is scraped from the supported news websites every hour and stored in a cloud database. On request, the API serves the most recent articles.
Each article returned by the API has the following fields:
Headline, Content, Source, Link and Time.
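
The scrape, store and serve flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `Article` and `ArticleStore` are hypothetical names, the store is an in-memory stand-in for MongoDB, and keying records by link to avoid duplicates is an assumption rather than something the project documents.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Article:
    """One article with the five fields listed above (hypothetical type)."""
    title: str
    content: str
    source: str
    link: str
    time: datetime


class ArticleStore:
    """In-memory stand-in for the database; not the project's db.py."""

    def __init__(self) -> None:
        # Assumption: key by link so re-scraping an article overwrites it.
        self._by_link: Dict[str, Article] = {}

    def save(self, articles: List[Article]) -> None:
        for article in articles:
            self._by_link[article.link] = article

    def most_recent(self, limit: int = 10) -> List[Article]:
        newest_first = sorted(
            self._by_link.values(), key=lambda a: a.time, reverse=True
        )
        return newest_first[:limit]


store = ArticleStore()
# An hourly scrape run would pass its results to save(); these records are made up.
store.save([
    Article("Older story", "text", "th", "https://example.com/a", datetime(2021, 1, 1, 9)),
    Article("Newer story", "text", "toi", "https://example.com/b", datetime(2021, 1, 1, 10)),
])
latest = store.most_recent(limit=1)
```

Serving a request then amounts to calling `most_recent()` and returning the result as JSON.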



πŸ—’οΈ Table of contents

💻 Technologies

Newsemble is created with:

  • Python 3
  • Flask
  • PyMongo
  • BeautifulSoup

📂 File Structure and Description

  • app.py - Flask code for the API
  • scraper.py - Collection of scrapers for the various news sites.
  • db.py - Connecting and Using MongoDB
  • utils.py - Utility Functions
  • scheduler.py - Scheduler
  • Procfile - For Deployment
  • requirements.txt - Python Requirements

πŸ› οΈ Pipeline

[Figure: Newsemble pipeline]

🚀 Getting started

The API can be accessed through the following links.

Links

Link | Description
--- | ---
http://www.newsemble.ml/news | Fetch data from all sources
http://www.newsemble.ml/news/toi | Fetch data from Times of India
http://www.newsemble.ml/news/th | Fetch data from The Hindu
http://www.newsemble.ml/news/tie | Fetch data from The Indian Express
http://www.newsemble.ml/news/ndtv | Fetch data from NDTV news
http://www.newsemble.ml/news/it | Fetch data from India Today
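
Since every per-source link follows the same pattern, a small helper can build the URL from a source code. The sketch below is illustrative: `BASE_URL`, `SOURCES` and `news_url` are hypothetical names, and the source codes are simply the ones that appear in the links above.

```python
from typing import Optional

BASE_URL = "http://www.newsemble.ml/news"

# Source codes taken from the links listed above.
SOURCES = {
    "toi": "Times of India",
    "th": "The Hindu",
    "tie": "The Indian Express",
    "ndtv": "NDTV",
    "it": "India Today",
}


def news_url(source: Optional[str] = None) -> str:
    """Return the link for one source, or for all sources when source is None."""
    if source is None:
        return BASE_URL
    if source not in SOURCES:
        raise ValueError(f"Unknown source {source!r}; expected one of {sorted(SOURCES)}")
    return f"{BASE_URL}/{source}"
```

For example, `news_url("ndtv")` yields the NDTV link, while an unlisted code raises `ValueError` before any request is made.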

Request format

import requests

url = "http://www.newsemble.ml/news/"
data = requests.get(url).json()  # parsed JSON response
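
A slightly more defensive variant of the request adds a timeout and raises on HTTP error statuses instead of trying to parse an error page as JSON. This is a sketch, not part of the project: `fetch_news` and its `http_get` parameter are hypothetical, and the parameter exists only so the function can be exercised without network access.

```python
def fetch_news(url, http_get=None, timeout=10):
    """Fetch a Newsemble link and return the decoded JSON body."""
    if http_get is None:
        # Imported lazily so the function can be tested with a stub.
        import requests
        http_get = requests.get
    response = http_get(url, timeout=timeout)
    response.raise_for_status()  # raises on 4xx/5xx responses
    return response.json()


# Usage (performs a real HTTP request):
# data = fetch_news("http://www.newsemble.ml/news/")
```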

Response format

{
    "link"    : $source_link$,
    "content" : $content_text$,
    "source"  : $news_source$,
    "title"   : $headline$,
    "time"    : $date_time_of_article$
}
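
As an illustration of working with these fields, the sketch below groups headlines by source. It assumes the API returns a JSON list of objects shaped like the one above, which the format shown here does not confirm; `titles_by_source` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict

REQUIRED_FIELDS = ("link", "content", "source", "title", "time")


def titles_by_source(payload):
    """Group article titles by source, skipping records with missing fields."""
    grouped = defaultdict(list)
    for item in payload:
        if not all(field in item for field in REQUIRED_FIELDS):
            continue  # incomplete record; ignore rather than fail
        grouped[item["source"]].append(item["title"])
    return dict(grouped)


# Made-up payload in the assumed list-of-objects shape.
sample = [
    {"link": "https://example.com/1", "content": "a", "source": "toi",
     "title": "First", "time": "2021-01-01 10:00"},
    {"link": "https://example.com/2", "content": "b", "source": "th",
     "title": "Second", "time": "2021-01-01 11:00"},
    {"link": "https://example.com/3", "source": "th", "title": "No content"},
]
grouped = titles_by_source(sample)
```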

Sample output

[Image: sample output]

βš™οΈ Currently Supported Sites

  • Times of India
  • The Hindu
  • The Indian Express
  • NDTV
  • India Today


πŸ™ Thanks!

All contributions are welcome and appreciated. πŸ‘
If you liked this project, or found it useful in any way, please drop a 🌟!

✍️ Authors ✍️

βœ’οΈ Rishabh Gupta
βœ’οΈ Vishal Singhania
βœ’οΈ Roshan Kumar
