misspink1011 / News-Manager

Licence: other
🗞news scraping and recommendation system

Programming Languages

Python, JavaScript, HTML, CSS, Shell

Projects that are alternatives of or similar to News-Manager

Recommendersystem Dataset
This repository contains some datasets that I have collected in Recommender Systems.
Stars: ✭ 249 (+1678.57%)
Mutual labels:  recommendation-system
imgur downloader
Python script/class to download an entire Imgur album in one go into a folder of your choice.
Stars: ✭ 35 (+150%)
Mutual labels:  webscraper
gcf-packs
Library packs for google cloud functions
Stars: ✭ 48 (+242.86%)
Mutual labels:  webscraper
super-anime-downloader
A program which takes an Anime name or URL and downloads the specified range of episodes.
Stars: ✭ 26 (+85.71%)
Mutual labels:  webscraper
fBrowser
Helpful Selenium functions to make web-scraping easier and faster
Stars: ✭ 16 (+14.29%)
Mutual labels:  webscraper
scraperx
Library for scraping websites or apis at any scale
Stars: ✭ 49 (+250%)
Mutual labels:  webscraper
Recommendationsystem
Book recommender system using collaborative filtering based on Spark
Stars: ✭ 244 (+1642.86%)
Mutual labels:  recommendation-system
compatibility-family-learning
Compatibility Family Learning for Item Recommendation and Generation
Stars: ✭ 21 (+50%)
Mutual labels:  recommendation-system
hipposcraper
A Linux terminal tool for parsing and scraping Holberton project pages to automate repetitive tasks.
Stars: ✭ 32 (+128.57%)
Mutual labels:  webscraper
anime-scraper
[partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+50%)
Mutual labels:  webscraper
CoWin-Vaccine-Notifier
Automated Python Script to retrieve vaccine slots availability and get notified when a slot is available.
Stars: ✭ 102 (+628.57%)
Mutual labels:  webscraper
Wuxiaworld-2-eBook
This Python script downloads chapters from novels available on wuxiaworld.com and saves them in the .epub format
Stars: ✭ 90 (+542.86%)
Mutual labels:  webscraper
Ruby Capstone
A simple web scraper built with Ruby and the Nokogiri gem. It crawls a certain website and gets the prices and other data of cryptocurrencies. Rspec was used for testing.
Stars: ✭ 14 (+0%)
Mutual labels:  webscraper
Deep Learning Interview Book
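Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and more)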
Stars: ✭ 3,677 (+26164.29%)
Mutual labels:  recommendation-system
TrackPurchase
Scrape your payment history from various shopping platforms with just a few lines of code!
Stars: ✭ 19 (+35.71%)
Mutual labels:  webscraper
Recsys core
[Movie recommendation system] A movie recommendation system built on a scraped movie rating dataset, with FM and LR as the core models.
Stars: ✭ 245 (+1650%)
Mutual labels:  recommendation-system
HostPanic
Find host header injections and perform Host Header attacks together with other kinds of bugs, such as web cache poisoning
Stars: ✭ 23 (+64.29%)
Mutual labels:  webscraper
AIML-Projects
Projects I completed as a part of Great Learning's PGP - Artificial Intelligence and Machine Learning
Stars: ✭ 85 (+507.14%)
Mutual labels:  recommendation-system
Soup
Web Scraper in Go, similar to BeautifulSoup
Stars: ✭ 1,685 (+11935.71%)
Mutual labels:  webscraper
makenews
MakeNews is for journalists and newsrooms. It helps you track news from web and social media in real-time.
Stars: ✭ 46 (+228.57%)
Mutual labels:  webscraper

News Manager

Introduction

News Manager is a real-time news scraping and recommendation system. It uses a news pipeline to scrape the latest news from various sources such as CNN, BBC, and Bloomberg. To render the news, the system integrates with a single-page web application built with React. In addition, it generates a customized news list for each user based on news topics. To do this, a click log processor collects users' click logs to update a news preference model for each user, and an offline training pipeline models news topics.
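As a rough illustration of how a customized news list could be derived from topic preferences, the sketch below tags every news item whose topic matches the user's most preferred topic. It is only a sketch under stated assumptions: the function name `tag_recommended` and the field names `topic` and `reason` are hypothetical and are not taken from the project's code.

```python
def tag_recommended(news_list, preferred_topics):
    """Return copies of the news items, tagging those in the user's top topic.

    news_list: list of dicts, each with a hypothetical "topic" key.
    preferred_topics: topic names ordered from most to least preferred.
    """
    top_topic = preferred_topics[0] if preferred_topics else None
    tagged = []
    for news in news_list:
        item = dict(news)  # copy so the caller's data is not mutated
        if top_topic is not None and item.get("topic") == top_topic:
            item["reason"] = "Recommend"  # hypothetical tag field
        tagged.append(item)
    return tagged


if __name__ == "__main__":
    news = [
        {"title": "Markets rally", "topic": "Economy"},
        {"title": "Cup final tonight", "topic": "Sports"},
    ]
    for item in tag_recommended(news, ["Sports", "Economy"]):
        print(item)
```

Tagging only the single top topic keeps the example small; a real service might tag several topics or rank the list instead.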

demo

Architecture

architecture

SOA

  • Client: a single-page web application built with React.
  • Web Server: handles sign-up and login with Node.js and Express.
  • Backend Server: requests news from the database, labels specific news items with a "recommend" tag based on the response of the news recommendation service, and sends the user's click events to the click log processor.
  • News Recommendation Service: responds with a list of preferred news topics for the current user.
  • Click Log Processor: updates a user's preference model using a time decay method.
  • News Topic Modeling Service: predicts news topics using a CNN model generated by an offline training pipeline.
  • News Monitor: uses News API to find the latest news from 20+ source websites. It integrates with Redis to filter out news with duplicate titles.
  • News Fetcher: obtains a scraping task from the task queue and scrapes the news using the Newspaper3k library.
  • News Deduper: uses NLP techniques to compare the content of the scraped news with existing news in MongoDB, calls the news topic modeling service, and then stores the unique news in the database.
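The time decay method used by the Click Log Processor is not spelled out here, so the following is a minimal sketch of one common form of it, not the project's actual implementation. It assumes the preference model is a probability distribution over topics, that every click shrinks all weights by a factor of `1 - alpha`, and that the clicked topic then gains `alpha`. The decay rate `ALPHA = 0.1`, the function names, and the topic names are all assumptions made for illustration.

```python
ALPHA = 0.1  # assumed decay rate; not taken from the project


def init_preference(topics):
    """Start with a uniform preference over all topics."""
    weight = 1.0 / len(topics)
    return {topic: weight for topic in topics}


def update_preference(preference, clicked_topic, alpha=ALPHA):
    """Decay every topic's weight, then boost the clicked topic.

    Older clicks fade geometrically, and the weights keep summing to 1.
    """
    updated = {t: (1 - alpha) * w for t, w in preference.items()}
    updated[clicked_topic] = updated.get(clicked_topic, 0.0) + alpha
    return updated


if __name__ == "__main__":
    model = init_preference(["World", "Sports", "Technology"])
    for topic in ["Sports", "Sports", "Technology"]:
        model = update_preference(model, topic)
    # Topics ordered from most to least preferred
    print(sorted(model, key=model.get, reverse=True))
```

Under these assumptions, sorting the topics by weight yields the kind of preferred-topics list that the News Recommendation Service returns.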

How to run it

To set up this system

./launcher.sh

To scrape more news

cd news_pipeline
./news_pipeline_launcher.sh

To update the user preference model

cd news_recommendation_service
python3 click_log_processor.py