mattmurray / topic_modelling_financial_news

Licence: other
Topic modelling on financial news with Natural Language Processing

Topic Modelling on Financial News Articles

Summary

This repo contains code for pre-processing and vectorizing raw text collected from 85,000 news articles downloaded from a variety of online broadsheet newspapers and newswires covering finance, business and the economy.

A detailed blog post can be found at http://mattmurray.net/topic-modelling-financial-news-with-natural-language-processing/

Article counts by year

Pre-processing removed stop words, punctuation and numbers, then stemmed the remaining words with the Snowball stemmer.
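As a rough illustration of that pre-processing step, here is a minimal sketch using NLTK's `SnowballStemmer`. The `preprocess` helper, the tiny `STOP_WORDS` set and the sample sentence are assumptions made for this example; they are not taken from the repo's notebooks, which may use a different stop-word list and tokenizer.

```python
import re

from nltk.stem.snowball import SnowballStemmer

# Tiny illustrative stop-word set (an assumption; the repo's actual list is not shown here).
STOP_WORDS = {
    "the", "a", "an", "and", "of", "in", "on", "to",
    "is", "are", "was", "were", "for", "by", "with",
}

stemmer = SnowballStemmer("english")


def preprocess(text):
    """Lowercase, drop punctuation/numbers and stop words, then stem."""
    # Keeping only runs of letters removes punctuation and numbers in one pass.
    words = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(word) for word in words if word not in STOP_WORDS]


tokens = preprocess("Shares in the banks rallied 3% on Tuesday, lifting markets.")
print(tokens)
```

The stemmed tokens can be joined back into a string per article before vectorizing.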

The pre-processed text was vectorized into a TF-IDF matrix, and Latent Semantic Analysis (LSA) was then applied to reduce that matrix to a smaller number of latent features.
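The vectorization and dimensionality reduction might look something like the sketch below, which uses scikit-learn's `TfidfVectorizer` followed by `TruncatedSVD` (a common way to perform LSA on a sparse TF-IDF matrix). The six toy headlines and the choice of three components are assumptions for illustration only; the repo's actual parameters are not stated here.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for the pre-processed articles (hypothetical data).
docs = [
    "central bank raises interest rates",
    "bank cuts rates as inflation falls",
    "oil prices climb on supply fears",
    "crude oil supply tightens and prices rise",
    "tech shares rally after strong earnings",
    "earnings beat lifts tech shares",
]

# Rows are documents, columns are terms weighted by TF-IDF.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

# LSA: truncated SVD projects each document onto a few latent dimensions.
svd = TruncatedSVD(n_components=3, random_state=0)
latent = svd.fit_transform(tfidf)

print(tfidf.shape, latent.shape)
```

On a real corpus the number of components would typically be much larger, and is usually chosen by inspecting the explained variance.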

Finally, the articles were clustered into topics based on their latent features, and the trends in those topics were visualized over time.
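A minimal sketch of the clustering and trend-counting step follows. The summary does not name the clustering algorithm, so `KMeans` is used here purely as an assumption; the `latent` array and `years` list are hypothetical stand-ins for the LSA output and the articles' publication years.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical latent features: 6 articles x 2 LSA components.
latent = np.array([
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.20, 0.80],
    [0.85, 0.15],
    [0.15, 0.85],
])
# Hypothetical publication year of each article.
years = [2015, 2015, 2015, 2016, 2016, 2016]

# Assign each article to one of two topic clusters (KMeans is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(latent)

# Count articles per (year, topic) pair; these counts are what a trend plot would show.
counts = Counter(zip(years, labels.tolist()))
for (year, topic), n in sorted(counts.items()):
    print(f"year={year} topic={topic} articles={n}")
```

Plotting the per-year counts for each topic, for example as one line per topic, gives a simple view of how coverage shifts over time.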

Outcome
