
ryh95 / Pyspider Stock

A project that uses pyspider to collect data and NLP techniques to analyze correlations in the data

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Pyspider Stock

akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! (Open-source financial data interface library)
Stars: ✭ 5,155 (+9105.36%)
Mutual labels:  stock, quant
stock-news-sentiment-analysis
This program uses Vader SentimentIntensityAnalyzer to calculate the news headline overall sentiment for a stock
Stars: ✭ 21 (-62.5%)
Mutual labels:  sentiment-analysis, stock
Steward
A stock portfolio manager that provides neural net based short-term predictions for stocks and natural language processing based analysis on community sentiments.
Stars: ✭ 25 (-55.36%)
Mutual labels:  sentiment-analysis, stock
pystockfilter
Financial technical and fundamental analysis indicator library for pystockdb.
Stars: ✭ 26 (-53.57%)
Mutual labels:  stock, quant
Akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! (Open-source financial data interface library)
Stars: ✭ 4,334 (+7639.29%)
Mutual labels:  stock, quant
Beibo
🤖 Predict the stock market with AI
Stars: ✭ 46 (-17.86%)
Mutual labels:  stock, quant
stocktwits-sentiment
Stocktwits market sentiment analysis in Python with Keras and TensorFlow.
Stars: ✭ 23 (-58.93%)
Mutual labels:  sentiment-analysis, stock
Sphinx Quant
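A quantitative trading system based on vnpy that supports multiple accounts, multiple strategies, live trading, data analysis, distributed online backtesting, risk management, and multiple trading nodes; it supports financial products such as CTP futures, stocks, options, and digital currencies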
Stars: ✭ 217 (+287.5%)
Mutual labels:  stock, quant
Abu
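Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and quantitative investment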
Stars: ✭ 8,589 (+15237.5%)
Mutual labels:  stock, quant
trader
Trading module
Stars: ✭ 20 (-64.29%)
Mutual labels:  stock, quant
mooquant
MooQuant is a quantitative trading framework derived from pyalgotrade that supports python3 and China's A-share market.
Stars: ✭ 24 (-57.14%)
Mutual labels:  stock, quant
Rqalpha
An extendable, replaceable Python algorithmic backtest && trading framework supporting multiple securities
Stars: ✭ 4,425 (+7801.79%)
Mutual labels:  stock, quant
backend-ctp
A wrapper for the CTP interface that uses redis as a message relay
Stars: ✭ 26 (-53.57%)
Mutual labels:  stock, quant
tqk
An R package for Korean stock data
Stars: ✭ 55 (-1.79%)
Mutual labels:  stock, quant
MyTT
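MyTT ports the indicator formulas of Tongdaxin, Tonghuashun, Wenhua MyLanguage, and similar tools to Python in the most minimal way. The core library is a single file of only about a hundred lines of code and a dozen or so core functions, implementing all common technical indicator algorithms in pure Python (with no dependency on the talib library), including conversions of the Tongdaxin MACD, RSI, BOLL, ATR, KDJ, CCI, PSY, and other formulas. Everything is wrapped around pandas functions, so it is concise and high-performance, and it can easily be applied to stock indicator formulas, analysis in stock and futures quantitative frameworks, automated program trading, digital currency quantitative trading, and similar areas. It is your most concise stock market quantitative tool. Python library with most stock market indicators.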
Stars: ✭ 888 (+1485.71%)
Mutual labels:  stock, quant
openctp
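The CTP open platform provides access channels for all product types, including A-shares, Hong Kong stocks, US stocks, futures, and options. By providing CTPAPI interfaces for channels such as Zhongtai Securities XTP, Huaxin Securities Qidian, Orient Securities OST, Eastmoney Securities EMT, and Interactive Brokers TWS, it lets CTP programs connect seamlessly to each stock trading counter. The platform also provides a simulation environment based on the TTS trading system, with a CTPAPI-compatible interface, which can replace Simnow and gives CTP quantitative trading developers a simulation environment available 7x24.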
Stars: ✭ 389 (+594.64%)
Mutual labels:  stock, quant
Friartuck
Live Quant Trading Framework for Robinhood, using IEX Trading and AlphaVantage for Free Prices.
Stars: ✭ 142 (+153.57%)
Mutual labels:  stock, quant
Stock
Master quantitative trading in 30 days (continuously updated)
Stars: ✭ 2,966 (+5196.43%)
Mutual labels:  stock, quant
dipiper
A stock data crawler based on nodejs
Stars: ✭ 83 (+48.21%)
Mutual labels:  stock, quant
Trady
Trady is a handy library for computing technical indicators. It aims to be an automated trading system that provides stock data feeding, indicator computing, strategy building, and automatic trading. It is built on .NET Standard 2.0.
Stars: ✭ 433 (+673.21%)
Mutual labels:  stock, quant

pyspider-stock

Note: This README was written in both Chinese and English, Chinese first because it is for the Chinese stock market.

Update:

  • [x] Added crawling and analysis of stocks in the IT sector

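What does this project do?

This project uses pyspider to crawl posts from the Eastmoney Guba, Xueqiu, and Sina Guba stock forums, then analyzes public opinion with natural language processing (sentiment analysis).

So

it has two parts:

  1. Crawl posts
  2. Sentiment analysis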

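How to run it?

Step 1: Crawl posts

  • Download pyspider, mongoDB, redis, snowNLP, pymongo (2.9), and their dependencies
  • Run set_codes/set_hs300.py and set_IT.py (the former loads the stock codes of the HS300 constituents into mongoDB; the latter loads the codes of the IT stocks)
  • Then put resultdb.py into pyspider's database/mongodb directory (so that crawled data is stored in mongoDB); use the pip show pyspider command to find the pyspider path
  • Start redis
  • Then, in the directory containing config.json, run pyspider -c config.json all & from the command line
  • Next, copy a script from the script folder and paste it into your own project at localhost:5000 (paste the script for whichever site you want to crawl), then save
  • Finally, click run on the localhost:5000 web page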

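Complete the last two steps before the market opens in the morning, and each morning before the open you will get the previous day's public-opinion data for the HS300.

Step 2: Sentiment analysis

This step can be run 30 minutes after Step 1 is complete.

On the first run, create a directory named data in the same directory as main.py.

Then run main.py.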

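What happens?

By default, gubaEast.py is used to crawl the GuYouHui section of Eastmoney, because it is the most stable.

After Step 1, you will find a collection named [date]GuYouHui under a database named [stockcode]eastmoney, where [stockcode] and [date] are the stock code of an HS300 constituent and yesterday's date, respectively.

Next comes the sentiment analysis part.

The core is three pieces of code: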

produceFactor.getSentimentFactor(stockCode, grab_time)

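Obtains the sentiment factor and sentiment value of each post for a given stock on the crawl date (the sentiment value is the sentiment factor multiplied by the read count).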

aggregateFactor.aggregate(stockCode, grab_time)

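Obtains the sentiment value of a given stock on the crawl date (the sum of the sentiment values of all its posts); the result is stored in [date]SentimentFactor under [stockcode]eastmoney.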

dailyResult.setDailyResult(stockCode, grab_time)

Summarizes the sentiment values and post counts of all HS300 stocks on the crawl date; the result is in the DailyResult collection under the [date]database.

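The result is then saved in Excel format in the data directory.

The result is emailed to the recipients you specify, through the sendMail module.

Finally, this task is cleared from taskdb so that tomorrow's crawl is incremental. At the same time, data from 5 days ago is exported from the database, stored locally, and deleted from the database.

If you want to view the results in the app on Android, keep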

os.system('mv data/' + grab_time + 'result.xls' + ' /var/www/html')

English version

What's the aim of this project?

This project uses pyspider to collect posts from eastmoney, xueqiu, and sinaguba, then uses NLP techniques to analyze public sentiment in order to select stocks.

So

it has two parts:

  1. crawl posts
  2. sentiment analysis

How to run this project?

Step 1 Crawl posts

  • Download pyspider, mongoDB, redis, snowNLP, and other dependencies
  • Run set_hs300/setCodes.py (to get all HS300 symbols and load them into mongoDB)
  • Put resultdb.py into the database/mongodb directory of pyspider (to save the crawled data to mongoDB)
  • Start redis
  • From the directory containing config.json, run pyspider -c config.json all & on the command line
  • Copy a script from the script folder, paste the code into your own project at localhost:5000, and save
  • Click the run button at localhost:5000

Complete the last two steps before the market opens, and you will then get sentiment data every day.

Step 2 Sentiment analysis

Run main.py after the posts have been crawled and stored. Remember to create the data directory before the first run.

What happened?

By default, gubaEast.py is used to crawl the GuYouHui section, because it is the most stable.

After Step 1 finishes, you'll find a collection named [date]GuYouHui under a database called [stockcode]eastmoney, where [stockcode] and [date] are an HS300 symbol and yesterday's date.
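
As a minimal sketch of this naming scheme, the helper below builds the database and collection names for one stock and one day. The function name storage_names and the date format in the example are assumptions for illustration; the source does not say how [date] is formatted.

```python
from typing import Tuple


def storage_names(stock_code: str, date: str) -> Tuple[str, str]:
    """Return (database name, collection name) for one stock and one day.

    Hypothetical helper: it only mirrors the [stockcode]eastmoney and
    [date]GuYouHui patterns described in the README.
    """
    database = f"{stock_code}eastmoney"
    collection = f"{date}GuYouHui"
    return database, collection


# The date string format here is an assumption, not taken from the project.
print(storage_names("600000", "2016-01-04"))
```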

The other part is sentiment analysis.

The core is three pieces of code:

produceFactor.getSentimentFactor(stockCode, grab_time)

Obtains the sentiment value and sentiment factor of each post for a specific symbol and crawl date (the sentiment value is computed by snowNLP, and the sentiment factor is the sentiment value times the read count).
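
The relationship described here (sentiment factor = sentiment value times read count) can be sketched as follows. This is not the project's produceFactor code: the field names text and reads, the function score_posts, and the toy scorer are all hypothetical. With snowNLP installed, a scorer such as lambda text: SnowNLP(text).sentiments could be passed in instead.

```python
from typing import Callable, Dict, List


def score_posts(posts: List[Dict], scorer: Callable[[str], float]) -> List[Dict]:
    """Attach a sentiment value and a read-weighted sentiment factor to each post."""
    scored = []
    for post in posts:
        value = scorer(post["text"])      # sentiment value in [0, 1]
        factor = value * post["reads"]    # weight the value by the read count
        scored.append({**post, "sentiment_value": value, "sentiment_factor": factor})
    return scored


def toy_scorer(text: str) -> float:
    """Stand-in scorer for illustration only; it is not a real sentiment model."""
    return 0.75 if "good" in text else 0.25


sample_posts = [
    {"text": "good earnings", "reads": 100},
    {"text": "weak outlook", "reads": 50},
]
for row in score_posts(sample_posts, toy_scorer):
    print(row["text"], row["sentiment_value"], row["sentiment_factor"])
```

Passing the scorer as a parameter keeps the weighting logic independent of any particular NLP library.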

aggregateFactor.aggregate(stockCode, grab_time)

Obtains the sentiment value and sentiment factor for a specific symbol and crawl date (by summing over all posts for that stock on that day). The result is in [date]SentimentFactor under [stockcode]eastmoney.
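
A minimal sketch of this per-stock aggregation, assuming each post has already been scored. The function name aggregate_posts and the dictionary keys are hypothetical and are not taken from the project's aggregateFactor module.

```python
from typing import Dict, List


def aggregate_posts(scored_posts: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum per-post sentiment values and factors for one stock on one day."""
    return {
        "post_count": len(scored_posts),
        "sentiment_value": sum(p["sentiment_value"] for p in scored_posts),
        "sentiment_factor": sum(p["sentiment_factor"] for p in scored_posts),
    }


# Illustrative numbers only.
day_posts = [
    {"sentiment_value": 0.75, "sentiment_factor": 75.0},
    {"sentiment_value": 0.25, "sentiment_factor": 12.5},
]
print(aggregate_posts(day_posts))
```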

dailyResult.setDailyResult(stockCode, grab_time)

Collects the sentiment factors and post counts of all stocks for the crawl date. The result is in the DailyResult collection under the [date]database.
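
The daily summary can be pictured as one row per stock. The sketch below is an assumption-laden illustration, not the project's dailyResult code: build_daily_result, the row keys, and the descending sort order are all hypothetical.

```python
from typing import Dict, List


def build_daily_result(per_stock: Dict[str, Dict[str, float]]) -> List[Dict[str, object]]:
    """Turn per-stock totals into a list of rows, one row per stock."""
    rows = [
        {
            "stock_code": code,
            "sentiment_factor": totals["sentiment_factor"],
            "post_count": totals["post_count"],
        }
        for code, totals in per_stock.items()
    ]
    # Assumption: strongest sentiment first; the source does not specify an order.
    rows.sort(key=lambda row: row["sentiment_factor"], reverse=True)
    return rows


# Illustrative numbers only.
totals_by_stock = {
    "600000": {"sentiment_factor": 87.5, "post_count": 2},
    "000001": {"sentiment_factor": 120.0, "post_count": 3},
}
for row in build_daily_result(totals_by_stock):
    print(row)
```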

An Excel file is then saved under the data directory as the final result.

The result is mailed to the specified users through the sendMail module.

Tasks under taskdb are then deleted so that posts can be crawled again the next day. Meanwhile, data stored 5 days ago is dumped as a backup and deleted from mongoDB.
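
One way to decide which days fall outside such a 5-day window is sketched below. The function name split_by_age is hypothetical, and treating everything 5 or more days old as ready to archive is an assumption; the source only says that data from 5 days ago is dumped and deleted.

```python
from datetime import date, timedelta
from typing import List, Tuple


def split_by_age(
    stored_days: List[date], today: date, keep_days: int = 5
) -> Tuple[List[date], List[date]]:
    """Split stored crawl dates into (keep, archive) lists.

    Assumption: any day that is keep_days or more days old is archived.
    """
    cutoff = today - timedelta(days=keep_days)
    keep = [d for d in stored_days if d > cutoff]
    archive = [d for d in stored_days if d <= cutoff]
    return keep, archive


# Illustrative dates only.
days = [date(2016, 1, 4), date(2016, 1, 5), date(2016, 1, 6), date(2016, 1, 9)]
keep, archive = split_by_age(days, today=date(2016, 1, 10))
print("keep:", keep)
print("archive:", archive)
```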

If you want to use the app to check the result on Android, keep the following code:

os.system('mv data/' + grab_time + 'result.xls' + ' /var/www/html')
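
A more portable equivalent of the mv command above can be sketched with the standard library. The function name publish_result and its parameters are assumptions for illustration; the file name pattern and the default target directory are taken from the line above.

```python
import shutil
from pathlib import Path


def publish_result(grab_time: str, data_dir: str = "data",
                   web_root: str = "/var/www/html") -> Path:
    """Move <grab_time>result.xls from the data directory to the web root."""
    source = Path(data_dir) / f"{grab_time}result.xls"
    target = Path(web_root) / source.name
    # shutil.move raises an exception if the source file is missing,
    # whereas os.system only reports failure through its return code.
    shutil.move(str(source), str(target))
    return target


# Example call (needs an existing result file and write access to the web root):
# publish_result(grab_time)
```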