Medha11 / Twitter-Trends

Licence: other

Twitter Trends is a web-based application that automatically detects and analyzes emerging topics in real time through hashtags and user mentions in tweets. As a major microblogging service, Twitter is a reliable source for trend detection. The project involved extracting live streaming tweets, processing them to find top hashtags and user …

Programming Languages

python

139335 projects - #7 most used programming language

javascript

184084 projects - #8 most used programming language

CSS

56736 projects

HTML

75241 projects

Batchfile

5799 projects

Projects that are alternatives of or similar to Twitter-Trends

contextualLSTM

Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning

Stars: ✭ 28 (-65.85%)

Mutual labels: topic-modeling

JoSH

[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Stars: ✭ 55 (-32.93%)

Mutual labels: topic-modeling

KGE-LDA

Knowledge Graph Embedding LDA. AAAI 2017

Stars: ✭ 35 (-57.32%)

Mutual labels: topic-modeling

stmprinter

Print multiple stm model dashboards to a pdf file for inspection

Stars: ✭ 34 (-58.54%)

Mutual labels: topic-modeling

BTM

Biterm Topic Modelling for Short Text with R

Stars: ✭ 78 (-4.88%)

Mutual labels: topic-modeling

TopicNet

Interface for easier topic modelling.

Stars: ✭ 127 (+54.88%)

Mutual labels: topic-modeling

machine learning

Stars: ✭ 29 (-64.63%)

Mutual labels: topic-modeling

tassal

Tree-based Autofolding Software Summarization Algorithm

Stars: ✭ 38 (-53.66%)

Mutual labels: topic-modeling

amazon-reviews

Sentiment Analysis & Topic Modeling with Amazon Reviews

Stars: ✭ 26 (-68.29%)

Mutual labels: topic-modeling

lda2vec

Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019

Stars: ✭ 27 (-67.07%)

Mutual labels: topic-modeling

hlda

Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model

Stars: ✭ 138 (+68.29%)

Mutual labels: topic-modeling

PyLDA

A Latent Dirichlet Allocation implementation in Python.

Stars: ✭ 51 (-37.8%)

Mutual labels: topic-modeling

converse

Conversational text Analysis using various NLP techniques

Stars: ✭ 147 (+79.27%)

Mutual labels: topic-modeling

tomoto-ruby

High performance topic modeling for Ruby

Stars: ✭ 49 (-40.24%)

Mutual labels: topic-modeling

gensimr

📝 Topic Modeling for Humans

Stars: ✭ 35 (-57.32%)

Mutual labels: topic-modeling

ml-nlp-services

Machine learning, deep learning, natural language processing

Stars: ✭ 23 (-71.95%)

Mutual labels: topic-modeling

ctpfrec

Python implementation of "Content-based recommendations with poisson factorization", with some extensions

Stars: ✭ 31 (-62.2%)

Mutual labels: topic-modeling

Product-Categorization-NLP

Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).

Stars: ✭ 30 (-63.41%)

Mutual labels: topic-modeling

learning-stm

Learning structural topic modeling using the stm R package.

Stars: ✭ 103 (+25.61%)

Mutual labels: topic-modeling

twic

Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models

Stars: ✭ 51 (-37.8%)

Mutual labels: topic-modeling


Plan

First, the extractor extracts tweets as usual.
Tweets are cleaned and dumped into MongoDB.
Aggregation is done over the whole day.
Based on the aggregation, the top 100 entities are found and the tweets for each entity are grouped into one collection.
Before the tweets are dumped into the collection, sentiment analysis is run on them.
Using each of the 100 collections as a separate document, LDA is performed. If 100 documents are too few, the big documents can be split into smaller ones.
Each tweet is then examined individually to find the topic to which it belongs.
For each topic, the URLs that seem most relevant are extracted.
Webpages corresponding to the URLs are downloaded and parsed.
A portion of the main content can be displayed after extraction.
The graph is approximated as usual, but the time span still has to be decided.
The graph, related tweets and summaries of the URLs, along with the hyperlinks, are displayed for each topic on the portal.
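
The aggregation and grouping steps of the plan can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes tweets are plain strings, finds entities with two hypothetical regular expressions, and keeps everything in memory instead of MongoDB. The names `extract_entities`, `top_entities` and `group_by_entity` are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical patterns; the project may rely on Twitter's entity metadata instead.
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def extract_entities(text):
    """Return the lowercased hashtags and user mentions found in one tweet."""
    found = HASHTAG_RE.findall(text) + MENTION_RE.findall(text)
    return [entity.lower() for entity in found]


def top_entities(tweets, n=100):
    """Count entities over all tweets and return the n most frequent ones."""
    counts = Counter()
    for text in tweets:
        # Count each entity once per tweet so repeats inside a tweet do not inflate it.
        counts.update(set(extract_entities(text)))
    return [entity for entity, _ in counts.most_common(n)]


def group_by_entity(tweets, entities):
    """Map each top entity to the list of tweets that mention it."""
    wanted = set(entities)
    groups = defaultdict(list)
    for text in tweets:
        for entity in set(extract_entities(text)) & wanted:
            groups[entity].append(text)
    return dict(groups)


# Made-up sample data standing in for one day of cleaned tweets.
sample = [
    "Loving the new release #python",
    "#Python tips from @guido",
    "Match day! #football",
]
top = top_entities(sample, n=2)
groups = group_by_entity(sample, top)
```

Each value in `groups` plays the role of one per-entity collection, which could then be treated as a single document for LDA.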

Workflow

Control of the engine starts with manager.py
manager.py makes use of multiprocess and subprocess to spawn the extractor, preprocessor and postprocessor as separate processes
config.py in the utilities package stores tuning parameters such as 'alarm' times, file limit, etc.
Refer to this .ppt for further information.
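
As a rough illustration of the spawning step, the sketch below starts three stages as separate operating-system processes with the standard `subprocess` module and waits for them. It is an assumption-laden stand-in, not the contents of manager.py: the `STAGES` table and both helper functions are hypothetical, and each stage is a one-line Python program so the example runs on its own.

```python
import subprocess
import sys

# Hypothetical stage commands. In the real engine each entry would point at the
# extractor, preprocessor and postprocessor scripts instead of a one-liner.
STAGES = {
    "extractor": [sys.executable, "-c", "print('extracting')"],
    "preprocessor": [sys.executable, "-c", "print('preprocessing')"],
    "postprocessor": [sys.executable, "-c", "print('postprocessing')"],
}


def spawn_stages(stages):
    """Start every stage as its own process and return the process handles."""
    return {
        name: subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for name, cmd in stages.items()
    }


def wait_for_stages(processes):
    """Wait for each process and collect its exit code and captured output."""
    results = {}
    for name, proc in processes.items():
        output, _ = proc.communicate()
        results[name] = (proc.returncode, output.strip())
    return results


results = wait_for_stages(spawn_stages(STAGES))
```

All three processes are started before any of them is waited on, so the stages run concurrently rather than one after another.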

Dataset

Download dataset(s) from the Drive folder
- The full_dataset.rar archive contains all 2 million tweets
- Optionally, you can download parts of this dataset from the Parts folder; each part (dataset*.rar) contains 200,000 tweets
- Each .json file contains 10,000 tweets
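
Once an archive has been extracted, a single file can be read along these lines. The layout of the .json files is not described here, so this sketch assumes each file holds either one JSON array or one JSON object per line and handles both; `load_tweets` is a hypothetical helper, and the sample records with a `text` field are invented for the demonstration.

```python
import json
import os
import tempfile


def load_tweets(path):
    """Load tweets from a file holding a JSON array or one JSON object per line."""
    with open(path, encoding="utf-8") as handle:
        content = handle.read().strip()
    if not content:
        return []
    if content.startswith("["):
        # Assumed layout 1: the whole file is a single JSON array.
        return json.loads(content)
    # Assumed layout 2: one JSON object per non-empty line.
    return [json.loads(line) for line in content.splitlines() if line.strip()]


# Demonstration on two small temporary files with made-up records.
with tempfile.TemporaryDirectory() as folder:
    array_path = os.path.join(folder, "array.json")
    lines_path = os.path.join(folder, "lines.json")
    with open(array_path, "w", encoding="utf-8") as handle:
        json.dump([{"text": "first #tag"}, {"text": "second @user"}], handle)
    with open(lines_path, "w", encoding="utf-8") as handle:
        handle.write('{"text": "first #tag"}\n{"text": "second @user"}\n')
    from_array = load_tweets(array_path)
    from_lines = load_tweets(lines_path)
```

If the real files use a different layout, only the branch inside `load_tweets` needs to change.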

init

Clone the git repository
Run python_path.bat to add the PYTHONPATH environment variable. This needs to be done only once
Make the necessary changes in the config.py file in *engine\utilities*
Run python init.py in Command Prompt to start the engine
To stop, close all Command Prompt and Python windows

Portal

The portal folder is the Django project for the web portal
Create a database called 'trends'
In the settings file, change the password for the MySQL root user if yours is different
Run createsuperuser to create an admin
Create some top trends using the admin site. I have included a screenshot of the UI after creating some sample topics (with ranks). Clicking a topic redirects to its details page (see screenshots).
The homepage can be opened at 127.0.0.1:8000 or localhost:8000
The TopTrends model has a topic object and a rank object. It will be extended to include graphs and other details once the implementation is done.

