
Yogayu / weibo-summary

Licence: other
Chinese Microblog Automatic Summary System

Programming Languages

javascript
python
HTML
CSS

README

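Sina Weibo Automatic Summarization System

Given a dataset of Weibo posts on a topic, select the top N (N > 0) posts to serve as the topic summary.

Nature of the problem: multi-document automatic summarization of short texts.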
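The task statement above (pick N posts out of a topic's posts) can be sketched as a generic ranking step plus the two simplest selection strategies. This is only an illustration, not the project's code: the post fields `text` and `created_at` and all function names are assumptions made for this example.

```python
import random


def select_top_n(posts, n, score):
    """Rank posts by a score function and keep the N best ones."""
    if n <= 0:
        raise ValueError("N must be greater than 0")
    return sorted(posts, key=score, reverse=True)[:n]


def most_recent(posts, n):
    """Baseline: the N newest posts (hypothetical 'created_at' field)."""
    return select_top_n(posts, n, score=lambda post: post["created_at"])


def random_summary(posts, n, seed=None):
    """Baseline: N posts drawn at random, without replacement."""
    if n <= 0:
        raise ValueError("N must be greater than 0")
    rng = random.Random(seed)
    return rng.sample(list(posts), min(n, len(posts)))


if __name__ == "__main__":
    demo = [
        {"text": "first post", "created_at": 1},
        {"text": "third post", "created_at": 3},
        {"text": "second post", "created_at": 2},
    ]
    print([p["text"] for p in most_recent(demo, 2)])
    print([p["text"] for p in random_summary(demo, 2, seed=0)])
```

Any other summarization method can reuse `select_top_n` by supplying its own `score` function.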

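Pipeline: data - algorithm - evaluation - presentation

  1. Data acquisition and preprocessing: how do you write a crawler to collect data from the website?
  2. Summarization algorithms: how do you implement the algorithms with Python and its toolset? (Read the papers and related material to understand each algorithm and what its formulas mean, then turn the formulas into working code. Many algorithms already have Python implementations that can be used directly.)
  3. Summary evaluation: how do you evaluate the results of the different algorithms? (ROUGE evaluation, and points to watch when evaluating Chinese text.)
  4. System presentation: how do you build a system with a front-end display and back-end administration? (Flask, Flask-Admin; Bootstrap, E-Charts.) How do you deploy the system to a server?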
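As an illustration of step 2 (turning a formula into code), here is a minimal TextRank-style sketch for ranking posts. It assumes each post has already been segmented into a list of words (for Chinese, a segmenter such as jieba would be needed upstream), and it uses the word-overlap similarity from the original TextRank paper. The function names are hypothetical and this is not necessarily how the project's `TextRank.py` works.

```python
import math


def similarity(a, b):
    """Shared words divided by the sum of the log lengths of both posts."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:
        return 0.0  # both posts contain a single word
    return len(set(a) & set(b)) / denom


def textrank(docs, damping=0.85, iterations=50):
    """Score each tokenized post by iterating the weighted PageRank update."""
    n = len(docs)
    weights = [
        [similarity(docs[i], docs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated
    return scores


def top_posts(docs, n):
    """Indices of the N highest-scoring posts."""
    scores = textrank(docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order[:n]
```

A fixed number of iterations keeps the sketch short; a convergence threshold on the score change would be the more usual stopping rule.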

Technical overview:

[Figure: Technology]

Usage

macOS Sierra 10.12.5

Environment setup

Running inside a virtual environment is recommended. If pip is already installed:

sudo pip install virtualenv
virtualenv virtualEnv

Enter the virtual environment:

cd virtualEnv/
source bin/activate

Requirements:

  • Python 2.7
  • Flask
  • Flask extensions

Running

cd virtualEnv/
source bin/activate
cd weiboApplication

Set the application variable (no spaces around `=`):

export FLASK_APP=app.py

Debug mode (optional):

export FLASK_DEBUG=1

Run:

flask run

Default address:

* Serving Flask app "weibo-summary.weiboApplication.app"
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Server-side deployment

Ubuntu (could an automated deployment script be written?)

Installation

  1. Install MySQL

  2. Install Apache

     sudo apt-get install apache2
    
  3. Install mod_wsgi

     sudo apt-get install libapache2-mod-wsgi 
    
  4. Set up the Python virtual environment

  5. Install Flask and its dependencies

     pip install -r requirement.txt
    

Startup

  1. Start MySQL
  2. Import the data tables
  3. Start Apache
  4. Start the project

Log in to MySQL:

mysql -u root -p

Import the data:

mysql -u root -p weibodb < weibodb_summary.sql

Project directory structure

|____app.py (application entry point)
|____config.py
|____manage.py
|____requirement.txt (dependencies to install)
|____weiboModel.py
|____weibo-summary.wsgi (deployment configuration)
|____Algorithms (algorithms)
| |____Hybird-TFIDF.py
| |____TextRank.py
| |____TFIDF.py
| |____utilities.py
| |...
|____Data (data)
| |____rawData (raw data)
| |____weiboData (preprocessed data)
| |____ResultData (generated summary results)
| |____ROUGE (algorithm evaluation results)
|____db (database)
|____lib (third-party libraries used)
|____static
| |____css
| |____echarts (chart plugin)
| |____font-awesome
| |____js
| |...
|____templates
| |____admin (admin backend)
| |____slide (template partials)
|____util (automation and processing scripts)

Summarization algorithms

  • Baseline
    • MostRecent
    • Random
  • TextRank
  • TF-IDF
  • Hybrid TF-IDF
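To make the last item concrete, here is a minimal sketch of one common formulation of Hybrid TF-IDF for microblogs: term frequency is computed over the whole collection treated as a single document, document frequency is counted per post, and each post's summed word weight is divided by `max(min_length, post length)` so that very short posts are not favored. The function name and the `min_length` default are assumptions for illustration, posts are assumed to be pre-segmented word lists, and the project's own `Hybird-TFIDF.py` may differ.

```python
import math
from collections import Counter


def hybrid_tfidf_scores(posts, min_length=5):
    """Return one Hybrid TF-IDF score per tokenized post."""
    total_words = sum(len(post) for post in posts)
    # Term frequency over the whole collection, as if it were one document.
    tf = Counter(word for post in posts for word in post)
    # Document frequency: the number of posts that contain each word.
    df = Counter(word for post in posts for word in set(post))
    n_posts = len(posts)

    scores = []
    for post in posts:
        weight = sum(
            (tf[word] / float(total_words))
            * math.log(n_posts / float(df[word]), 2)
            for word in post
        )
        # Normalize by post length, with a floor of min_length words.
        scores.append(weight / max(min_length, len(post)))
    return scores


if __name__ == "__main__":
    demo = [["weibo", "summary"], ["weibo", "flask"], ["weibo"]]
    print(hybrid_tfidf_scores(demo, min_length=1))
```

A word that occurs in every post gets an inverse document frequency of zero, so it contributes nothing to any post's score.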

ROUGE evaluation
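As a rough sketch of what ROUGE measures, the following computes ROUGE-N recall: the clipped count of reference n-grams that also appear in the candidate summary, divided by the total number of reference n-grams. Because Chinese text has no spaces, this sketch treats every character as a token, which is one common workaround and an assumption here, not necessarily the project's evaluation setup. It assumes Python 3 strings; the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """ROUGE-N recall of a candidate string against reference strings."""
    cand_counts = ngrams(list(candidate), n)  # character-level tokens
    matched = 0
    total = 0
    for reference in references:
        ref_counts = ngrams(list(reference), n)
        total += sum(ref_counts.values())
        # Clip each match at the number of times the candidate has the n-gram.
        matched += sum(
            min(count, cand_counts[gram]) for gram, count in ref_counts.items()
        )
    return matched / float(total) if total else 0.0


if __name__ == "__main__":
    print(rouge_n_recall("今天天气好", ["今天天气很好"], n=1))
    print(rouge_n_recall("今天天气好", ["今天天气很好"], n=2))
```

Word-level scores would require segmenting both the candidate and the references with the same segmenter before counting n-grams.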

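System

Requirements analysis

[Figure: requirements analysis]

Architecture design

[Figure: architecture diagram]

Functional module design

[Figure: functional modules]

Database

The system uses MySQL as its database.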


Front end

[Figure: front-end presentation]

Admin backend

[Figure: admin backend]

TODO

  • Encapsulate the code into classes
  • Split up app.py
