Lazynlp
Library to scrape and clean web pages to create massive datasets.
Stars: ✭ 1,985 (+470.4%)
Tsfel
An intuitive library to extract features from time series
Stars: ✭ 202 (-41.95%)
R
All Algorithms implemented in R
Stars: ✭ 294 (-15.52%)
Linkedingiveaway
👨🏽🏫 You can learn about anything over here: what giveaways I do and why they are important in today's world. Are you interested in giveaways? 🔋
Stars: ✭ 67 (-80.75%)
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-83.33%)
Tsrepr
TSrepr: R package for time series representations
Stars: ✭ 75 (-78.45%)
Clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Stars: ✭ 887 (+154.89%)
Dex
Dex: The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Stars: ✭ 1,238 (+255.75%)
Mlxtend
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Stars: ✭ 3,729 (+971.55%)
Phormatics
Using A.I. and computer vision to build a virtual personal fitness trainer. (Most Startup-Viable Hack - HackNYU2018)
Stars: ✭ 79 (-77.3%)
Lda Topic Modeling
A PureScript, browser-based implementation of LDA topic modeling.
Stars: ✭ 91 (-73.85%)
Neuroflow
Artificial Neural Networks for Scala
Stars: ✭ 105 (-69.83%)
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+145.4%)
Matrixprofile
A Python 3 library that uses matrix profile algorithms to make time series data mining tasks accessible to everyone.
Stars: ✭ 141 (-59.48%)
Efficient Apriori
An efficient Python implementation of the Apriori algorithm.
Stars: ✭ 145 (-58.33%)
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (-60.63%)
Machine Learning With Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
Stars: ✭ 2,197 (+531.32%)
Fantasy Basketball
Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with a genetic algorithm. Capstone Project for Machine Learning Engineer Nanodegree by Udacity.
Stars: ✭ 146 (-58.05%)
Pycaret
An open-source, low-code machine learning library in Python
Stars: ✭ 4,594 (+1220.11%)
Datasciencer
A curated list of R tutorials for Data Science, NLP and Machine Learning
Stars: ✭ 1,727 (+396.26%)
Chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost w/categorical features support for Python
Stars: ✭ 176 (-49.43%)
Data Science Resources
👨🏽🏫 You can learn about what data science is and why it's important in today's modern world. Are you interested in data science? 🔋
Stars: ✭ 171 (-50.86%)
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (-43.68%)
Instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Stars: ✭ 202 (-41.95%)
Tweetfeels
Real-time sentiment analysis in Python using Twitter's streaming API
Stars: ✭ 249 (-28.45%)
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+776.72%)
Deepgraph
Analyze Data with Pandas-based Networks. Documentation:
Stars: ✭ 232 (-33.33%)
Awesome Datascience
📝 An awesome Data Science repository to learn and apply for real world problems.
Stars: ✭ 17,520 (+4934.48%)
Pm4py Core
Public repository for the PM4Py (Process Mining for Python) project.
Stars: ✭ 313 (-10.06%)
clustext
Easy, fast clustering of texts
Stars: ✭ 18 (-94.83%)
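TextClassification
Text classification of Sina News articles implemented with scikit-learn. The dataset contains 1,000,000 documents in 10 categories, split 1:1 between test and training sets. The classifiers are SVM and Bayes, with Bayes serving as the baseline.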
Stars: ✭ 86 (-75.29%)
estratto
Parsing the content of fixed-width files made easy
Stars: ✭ 12 (-96.55%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-95.4%)
woolly
The Text Mining Elixir
Stars: ✭ 48 (-86.21%)
deduce
Deduce: de-identification method for Dutch medical text
Stars: ✭ 40 (-88.51%)
classy
Super simple text classifier using Naive Bayes. Plug-and-play, no dependencies
Stars: ✭ 12 (-96.55%)
textlearnR
A simple collection of well-working NLP models (Keras, H2O, StarSpace) tuned and benchmarked on a variety of datasets.
Stars: ✭ 16 (-95.4%)
Statistical Learning
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Stars: ✭ 223 (-35.92%)
tf-idf-python
Term frequency–inverse document frequency for Chinese novels/documents, implemented in Python.
Stars: ✭ 98 (-71.84%)
FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
Stars: ✭ 204 (-41.38%)
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-92.24%)
TextDatasetCleaner
🔬 Cleaning junk out of datasets (normalization, preprocessing)
Stars: ✭ 27 (-92.24%)
nlpbuddy
A text analysis application for performing common NLP tasks through a web dashboard interface and an API
Stars: ✭ 115 (-66.95%)
nlp classification
Implementing NLP papers relevant to classification with PyTorch, gluonnlp
Stars: ✭ 224 (-35.63%)
stringx
Drop-in replacements for base R string functions powered by stringi
Stars: ✭ 14 (-95.98%)
converse
Conversational text analysis using various NLP techniques
Stars: ✭ 147 (-57.76%)
SparseLSH
A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
Stars: ✭ 127 (-63.51%)