289 open source projects that are alternatives to or similar to bumblebee

optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+1025.83%)
datatile
A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+249.17%)
Mutual labels:  dask, data-profiling
allie
🤖 A machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers).
Stars: ✭ 93 (-22.5%)
Mutual labels:  datasets, data-cleaning
foofah
Foofah: programming-by-example data transformation program synthesizer
Stars: ✭ 24 (-80%)
Mutual labels:  data-preparation, data-cleaning
reskit
A library for creating and curating reproducible pipelines for scientific and industrial machine learning
Stars: ✭ 27 (-77.5%)
Mutual labels:  data-preparation, prepare-data
covid-19-data-cleanup
Scripts to clean up data from https://github.com/CSSEGISandData/COVID-19
Stars: ✭ 25 (-79.17%)
Mutual labels:  datasets, data-cleaning
xarray-beam
Distributed Xarray with Apache Beam
Stars: ✭ 83 (-30.83%)
Mutual labels:  dask
Google-Playstore-Dataset
Google Play Store app dataset (2.3 million apps, 24 attributes)
Stars: ✭ 27 (-77.5%)
Mutual labels:  datasets
geodaData
Data package for accessing GeoDa datasets using R
Stars: ✭ 15 (-87.5%)
Mutual labels:  datasets
clothing-detection-ecommerce-dataset
Clothing detection dataset
Stars: ✭ 43 (-64.17%)
Mutual labels:  datasets
big-data-exploration
[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-64.17%)
Mutual labels:  datasets
fedora-prime
Simple program to switch between Intel and NVIDIA GPUs
Stars: ✭ 24 (-80%)
Mutual labels:  optimus
firestore-to-bigquery-export
NPM package for copying and converting Cloud Firestore data to BigQuery.
Stars: ✭ 26 (-78.33%)
Mutual labels:  datasets
dagpi
Dagpi is a powerful and fast API that does image manipulation and serves datasets. It is written in Rust and Python. Perfect for Discord bots, social media apps, camera apps, and more.
Stars: ✭ 25 (-79.17%)
Mutual labels:  datasets
DiscEval
Discourse Based Evaluation of Language Understanding
Stars: ✭ 18 (-85%)
Mutual labels:  datasets
humanflow2
Official repository of Learning Multi-Human Optical Flow (IJCV 2019)
Stars: ✭ 37 (-69.17%)
Mutual labels:  datasets
kaggledatasets
Collection of Kaggle Datasets ready to use for Everyone (Looking for contributors)
Stars: ✭ 44 (-63.33%)
Mutual labels:  datasets
qhub
🪴 Nebari - your open source data science platform
Stars: ✭ 175 (+45.83%)
Mutual labels:  dask
scRNAseq cell cluster labeling
Scripts to run and benchmark scRNA-seq cell cluster labeling methods
Stars: ✭ 41 (-65.83%)
Mutual labels:  datasets
Cleaner.jl
A toolbox of simple solutions for common data cleaning problems.
Stars: ✭ 21 (-82.5%)
Mutual labels:  data-cleaning
industrial-ml-datasets
A curated list of publicly available datasets for machine learning research in the area of manufacturing
Stars: ✭ 45 (-62.5%)
Mutual labels:  datasets
FIFA-2019-Analysis
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related entities using data analysis and data visualization
Stars: ✭ 28 (-76.67%)
Mutual labels:  data-cleaning
codex-africanus
Radio Astronomy Algorithms Library
Stars: ✭ 13 (-89.17%)
Mutual labels:  dask
morghulis
No description or website provided.
Stars: ✭ 18 (-85%)
Mutual labels:  datasets
coiled-resources
Notebooks that support blog posts and tech talks on Dask / Coiled.
Stars: ✭ 33 (-72.5%)
Mutual labels:  dask
rs datasets
Tool for autodownloading recommendation systems datasets
Stars: ✭ 22 (-81.67%)
Mutual labels:  datasets
biomechanics dataset
Information on publicly available data sets for biomechanics.
Stars: ✭ 31 (-74.17%)
Mutual labels:  datasets
awesome-dynamic-graphs
A collection of resources on dynamic/streaming/temporal/evolving graph processing systems, databases, data structures, datasets, and related academic and industrial work
Stars: ✭ 89 (-25.83%)
Mutual labels:  datasets
dask-sql
Distributed SQL Engine in Python using Dask
Stars: ✭ 271 (+125.83%)
Mutual labels:  dask
torchgeo
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Stars: ✭ 1,125 (+837.5%)
Mutual labels:  datasets
dh-core
Functional data science
Stars: ✭ 123 (+2.5%)
Mutual labels:  datasets
errorlocate
Find and replace erroneous fields in data using validation rules
Stars: ✭ 19 (-84.17%)
Mutual labels:  data-cleaning
metadat
Meta-analytic datasets for R
Stars: ✭ 21 (-82.5%)
Mutual labels:  datasets
Thirukkural-Tamil-Dataset
Thirukkural by Thiruvalluvar.
Stars: ✭ 44 (-63.33%)
Mutual labels:  datasets
CompBioDatasetsForMachineLearning
A Curated List of Computational Biology Datasets Suitable for Machine Learning
Stars: ✭ 90 (-25%)
Mutual labels:  datasets
data-profiling
A set of scripts to pull metadata and data profiling metrics from relational database systems
Stars: ✭ 57 (-52.5%)
Mutual labels:  data-profiling
data.world-py
Python package for data.world
Stars: ✭ 98 (-18.33%)
Mutual labels:  datasets
exemplary-ml-pipeline
Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-80.83%)
Mutual labels:  data-cleaning
delitos-caba
🚓 Crime dataset for the City of Buenos Aires, Argentina
Stars: ✭ 44 (-63.33%)
Mutual labels:  datasets
cifair
A duplicate-free variant of the CIFAR test set.
Stars: ✭ 13 (-89.17%)
Mutual labels:  datasets
nvidia-auto-installer-for-fedora-linux
A CLI tool which lets you install proprietary NVIDIA drivers and much more easily on Fedora Linux (32 or above and Rawhide)
Stars: ✭ 270 (+125%)
Mutual labels:  optimus
prefect-saturn
Python client for using Prefect Cloud with Saturn Cloud
Stars: ✭ 15 (-87.5%)
Mutual labels:  dask
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-74.17%)
Mutual labels:  data-cleaning
git-rdm
A research data management plugin for the Git version control system.
Stars: ✭ 34 (-71.67%)
Mutual labels:  datasets
machine-learning-data-pipeline
Pipeline module for parallel real-time data processing for machine learning models development and production purposes.
Stars: ✭ 22 (-81.67%)
Mutual labels:  data-preparation
scrapeOP
A python package for scraping oddsportal.com
Stars: ✭ 99 (-17.5%)
Mutual labels:  datasets
open2ch-dialogue-corpus
A dialogue corpus built by crawling Open2ch (おーぷん2ちゃんねる)
Stars: ✭ 65 (-45.83%)
Mutual labels:  datasets
thermostat
Collection of NLP model explanations and accompanying analysis tools
Stars: ✭ 126 (+5%)
Mutual labels:  datasets
auctus
Dataset search engine, discovering data from a variety of sources, profiling it, and allowing advanced queries on the index
Stars: ✭ 34 (-71.67%)
Mutual labels:  data-profiling
transfermarkt-datasets
⚽️ Extract, prepare and publish Transfermarkt datasets.
Stars: ✭ 60 (-50%)
Mutual labels:  datasets
Spatio-Temporal-papers
A collection of recent research in areas such as new infrastructure and urban computing, including white papers, academic papers, AI labs, and datasets.
Stars: ✭ 180 (+50%)
Mutual labels:  datasets
HoloClean-Legacy-deprecated
A Machine Learning System for Data Enrichment.
Stars: ✭ 75 (-37.5%)
Mutual labels:  data-cleaning
daskperiment
Reproducibility for Humans: A lightweight tool to perform reproducible machine learning experiments.
Stars: ✭ 25 (-79.17%)
Mutual labels:  dask
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-76.67%)
Mutual labels:  datasets
awesome-mobile-robotics
Useful links of different content related to AI, Computer Vision, and Robotics.
Stars: ✭ 243 (+102.5%)
Mutual labels:  datasets
bugrepo
A collection of publicly available bug reports
Stars: ✭ 93 (-22.5%)
Mutual labels:  datasets
enmSdm
Faster, better, smarter ecological niche modeling and species distribution modeling
Stars: ✭ 39 (-67.5%)
Mutual labels:  prepare-data
Scene-Text-Recognition-Recommendations
Papers, datasets, algorithms, and SOTA for STR. Maintained long-term.
Stars: ✭ 215 (+79.17%)
Mutual labels:  datasets
mlx
Machine Learning eXchange (MLX). Data and AI Assets Catalog and Execution Engine
Stars: ✭ 132 (+10%)
Mutual labels:  datasets
NLP PEMDC
NLP Pretrained Embeddings, Models and Datasets Collections (NLP_PEMDC). The collection is updated continuously.
Stars: ✭ 58 (-51.67%)
Mutual labels:  datasets