423 open source projects that are alternatives to or similar to Bigartm

coolplayflink

Flink: Stateful Computations over Data Streams

Stars: ✭ 14 (-97.51%)

Mutual labels: bigdata

Mnemonic

Apache Mnemonic - A non-volatile hybrid memory storage oriented library

Stars: ✭ 91 (-83.84%)

Mutual labels: bigdata

Unsupervised Aspect Extraction

Code for the ACL 2017 paper "An unsupervised neural attention model for aspect extraction"

Stars: ✭ 277 (-50.8%)

Mutual labels: topic-modeling

Ignite Book Code Samples

All code samples, scripts and more in-depth examples for the book "High Performance In-Memory Computing with Apache Ignite". Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.

Stars: ✭ 86 (-84.72%)

Mutual labels: bigdata

learning-spark

Tidy up Spark and Hadoop tutorials.

Stars: ✭ 28 (-95.03%)

Mutual labels: bigdata

Mlsql

The Programming Language Designed For Big Data and AI

Stars: ✭ 1,262 (+124.16%)

Mutual labels: bigdata

centurion

Kotlin Bigdata Toolkit

Stars: ✭ 320 (-43.16%)

Mutual labels: bigdata

Hudi Resources

A collection of Apache Hudi-related resources

Stars: ✭ 79 (-85.97%)

Mutual labels: bigdata

enstop

Ensemble topic modelling with pLSA

Stars: ✭ 104 (-81.53%)

Mutual labels: topic-modeling

Cleanframes

type-class based data cleansing library for Apache Spark SQL

Stars: ✭ 75 (-86.68%)

Mutual labels: bigdata

Cds

Data syncing in golang for ClickHouse.

Stars: ✭ 501 (-11.01%)

Mutual labels: bigdata

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (-87.39%)

Mutual labels: bigdata

deduce

Deduce: de-identification method for Dutch medical text

Stars: ✭ 40 (-92.9%)

Mutual labels: text-mining

Reddit sse stream

A Server Side Event stream to deliver Reddit comments and submissions in near real-time to a client.

Stars: ✭ 39 (-93.07%)

Mutual labels: bigdata

TAKG

The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"

Stars: ✭ 127 (-77.44%)

Mutual labels: topic-modeling

Autocrawler

Google, Naver multiprocess image web crawler (Selenium)

Stars: ✭ 957 (+69.98%)

Mutual labels: bigdata

python-api

A Python client for Infermedica API.

Stars: ✭ 53 (-90.59%)

Mutual labels: python-api

Panther

Detect threats with log data and improve cloud security posture

Stars: ✭ 885 (+57.19%)

Mutual labels: bigdata

Ldetool

Code generator for fast log file parsers

Stars: ✭ 273 (-51.51%)

Mutual labels: bigdata

Bigdata Interview

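🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, with the author's own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.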

Stars: ✭ 857 (+52.22%)

Mutual labels: bigdata

hlda

Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model

Stars: ✭ 138 (-75.49%)

Mutual labels: topic-modeling

10 Weeks

10 weeks of technology exploration

Stars: ✭ 22 (-96.09%)

Mutual labels: bigdata

sensim

Sentence Similarity Estimator (SenSim)

Stars: ✭ 15 (-97.34%)

Mutual labels: text-mining

Bigdataguide

Big data learning: learn big data from scratch, including study videos for each learning stage and interview materials

Stars: ✭ 817 (+45.12%)

Mutual labels: bigdata

columnify

Convert record-oriented data to columnar format.

Stars: ✭ 28 (-95.03%)

Mutual labels: bigdata

Coding Now

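Study notes, along with eBooks read, video resources, and a collection of blogs, websites, and tools the author found useful. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.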

Stars: ✭ 750 (+33.21%)

Mutual labels: bigdata

Cudf

cuDF - GPU DataFrame Library

Stars: ✭ 4,370 (+676.2%)

Mutual labels: python-api

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (+32.33%)

Mutual labels: bigdata

tomoto-ruby

High performance topic modeling for Ruby

Stars: ✭ 49 (-91.3%)

Mutual labels: topic-modeling

Running Elasticsearch Fun Profit

A book about running Elasticsearch

Stars: ✭ 664 (+17.94%)

Mutual labels: bigdata

blueprints-text

Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"

Stars: ✭ 103 (-81.71%)

Mutual labels: text-mining

Gwu data mining

Materials for GWU DNSC 6279 and DNSC 6290.

Stars: ✭ 217 (-61.46%)

Mutual labels: text-mining

contextualLSTM

Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning

Stars: ✭ 28 (-95.03%)

Mutual labels: topic-modeling

Qminer

Analytic platform for real-time large-scale streams containing structured and unstructured data.

Stars: ✭ 206 (-63.41%)

Mutual labels: text-mining

Wolframclientforpython

Call Wolfram Language functions from Python

Stars: ✭ 268 (-52.4%)

Mutual labels: python-api

Fake news detection

Fake News Detection in Python

Stars: ✭ 194 (-65.54%)

Mutual labels: text-mining

malay-dataset

Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html

Stars: ✭ 189 (-66.43%)

Mutual labels: text-mining

Hdltex

HDLTex: Hierarchical Deep Learning for Text Classification

Stars: ✭ 191 (-66.07%)

Mutual labels: text-mining

textdigester

TextDigester: document summarization java library

Stars: ✭ 23 (-95.91%)

Mutual labels: text-mining

Texthero

Text preprocessing, representation and visualization from zero to hero.

Stars: ✭ 2,407 (+327.53%)

Mutual labels: text-mining

TopicsExplorer

Explore your own text collection with a topic model – without prior knowledge.

Stars: ✭ 53 (-90.59%)

Mutual labels: topic-modeling

Multi rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

Stars: ✭ 162 (-71.23%)

Mutual labels: text-mining

Bigdataie

A compilation of big data blogs, written test questions, tutorials, projects, and interview experiences

Stars: ✭ 445 (-20.96%)

Mutual labels: bigdata

Lazynlp

Library to scrape and clean web pages to create massive datasets.

Stars: ✭ 1,985 (+252.58%)

Mutual labels: text-mining

api-python

Python client library to access Data Commons

Stars: ✭ 52 (-90.76%)

Mutual labels: python-api

Awesome Text Classification

Awesome-Text-Classification: projects, papers, tutorials.

Stars: ✭ 158 (-71.94%)

Mutual labels: text-mining

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-93.96%)

Mutual labels: bigdata

Chemdataextractor

Automatically extract chemical information from scientific documents

Stars: ✭ 152 (-73%)

Mutual labels: text-mining

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-96.8%)

Mutual labels: bigdata

Xioc

Extract indicators of compromise from text, including "escaped" ones.

Stars: ✭ 148 (-73.71%)

Mutual labels: text-mining

Lda

LDA topic modeling for node.js

Stars: ✭ 262 (-53.46%)

Mutual labels: topic-modeling

support-tickets-classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Stars: ✭ 142 (-74.78%)

Mutual labels: text-mining

thrones2vec

Using Word2Vec to explore semantic similarities between the entities of "A Song of Ice and Fire" ("Game of Thrones").

Stars: ✭ 27 (-95.2%)

Mutual labels: text-mining

Wikipron

Massively multilingual pronunciation mining

Stars: ✭ 99 (-82.42%)

Mutual labels: python-api

Pyseeta

Python API for SeetaFaceEngine (https://github.com/seetaface/SeetaFaceEngine.git)