Ml Pyxis: Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
Stars: ✭ 93 (-34.04%)
Retriever: Quickly download, clean up, and install public datasets into a database management system
Stars: ✭ 241 (+70.92%)
Data Science Hacks: Tips and tricks to help you become a better data scientist, for everyone from beginner to advanced. The hacks cover Python, Jupyter Notebook, pandas, and so on.
Stars: ✭ 273 (+93.62%)
Data Science Resources: 👨🏽🏫 Learn what data science is and why it's important in today's world. Are you interested in data science? 🔋
Stars: ✭ 171 (+21.28%)
Dbg Pds: Deutsche Boerse's Financial Trading Public Data Set
Stars: ✭ 124 (-12.06%)
Cartola: Data extraction from the CartolaFC API, exploratory data analysis, and predictive models in R and Python, 2014-20. Data munging, analysis and modeling of CartolaFC, the most popular fantasy football game in Brazil and maybe in the world. Data cover years 2014-19.
Stars: ✭ 304 (+115.6%)
Agile data code 2: Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+192.91%)
Machine Learning Roadmap: A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Stars: ✭ 5,277 (+3642.55%)
Blockchain2graph: Blockchain2graph extracts blockchain data (Bitcoin) and inserts it into a graph database (Neo4j).
Stars: ✭ 134 (-4.96%)
Oie Resources: A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+100.71%)
Datasets: A repository of pretty cool datasets that I collected for network science and machine learning research.
Stars: ✭ 302 (+114.18%)
Datacleaner: The premier open source Data Quality solution
Stars: ✭ 391 (+177.3%)
Akshare: AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
Stars: ✭ 4,334 (+2973.76%)
Pdpipe: Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+318.44%)
Datasheets: Read data from, write data to, and modify the formatting of Google Sheets
Stars: ✭ 593 (+320.57%)
Voice datasets: 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
Stars: ✭ 494 (+250.35%)
Awesome Streamlit: The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Stars: ✭ 769 (+445.39%)
Skdata: Python tools for data analysis
Stars: ✭ 16 (-88.65%)
Dataconfs: A list of conferences connected with data worldwide.
Stars: ✭ 36 (-74.47%)
Datacomparer: dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.
Stars: ✭ 58 (-58.87%)
Legislator: Interface to the Comparative Legislators Database
Stars: ✭ 62 (-56.03%)
Dataspice: 🌶 Create lightweight schema.org descriptions of your datasets
Stars: ✭ 137 (-2.84%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-43.97%)
Graphia: A visualisation tool for the creation and analysis of graphs
Stars: ✭ 67 (-52.48%)
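Gopup: Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; high-growth ("Qianlima") and unicorn companies; Xinwen Lianbo news transcripts; film and TV box office data; university lists; epidemic data…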
Stars: ✭ 1,229 (+771.63%)
Datasets: 🎁 3,000,000+ Unsplash images made available for research and machine learning
Stars: ✭ 1,805 (+1180.14%)
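Datagear: A data visualization and analysis platform, developed in Java with a browser/server architecture, supporting multiple data sources such as SQL, CSV, Excel, HTTP interfaces, and JSON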
Stars: ✭ 266 (+88.65%)
Eseur Code Data: Code and data used to create the examples in "Evidence-based Software Engineering based on the publicly available data"
Stars: ✭ 340 (+141.13%)
Browser Compat Data: This repository contains compatibility data for Web technologies as displayed on MDN
Stars: ✭ 3,710 (+2531.21%)
Rio: A Swiss-Army Knife for Data I/O
Stars: ✭ 467 (+231.21%)
Hub: Dataset format for AI. Build, manage, & visualize datasets for deep learning. Stream data in real time to PyTorch/TensorFlow & version-control it. https://activeloop.ai
Stars: ✭ 4,003 (+2739.01%)
Disk.frame: Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
Stars: ✭ 517 (+266.67%)
Rows: A common, beautiful interface to tabular data, no matter the format
Stars: ✭ 739 (+424.11%)
Knowledge Repo: A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Stars: ✭ 4,956 (+3414.89%)
Datastream.io: An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (+477.3%)
Osint collection: Maintained collection of OSINT related resources. (All Free & Actionable)
Stars: ✭ 809 (+473.76%)
Jschema: A simple, easy to use data modeling framework for JavaScript
Stars: ✭ 261 (+85.11%)
Openrefine: OpenRefine is a free, open source power tool for working with messy data and improving it
Stars: ✭ 8,531 (+5950.35%)
Pycm: Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+663.12%)
Colour: Colour Science for Python
Stars: ✭ 1,131 (+702.13%)
Php Ml: PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+5502.84%)
Magicbox: A platform that uses real-time data to inform life-saving humanitarian responses to emergency situations
Stars: ✭ 73 (-48.23%)
Covid19: JSON time-series of coronavirus cases (confirmed, deaths and recovered) per country - updated daily
Stars: ✭ 1,177 (+734.75%)
Openml R: R package to interface with OpenML
Stars: ✭ 81 (-42.55%)
Qri: you're invited to a data party!
Stars: ✭ 1,003 (+611.35%)
Codesearchnet: Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+877.3%)
Awesome Bigdata: A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 10,478 (+7331.21%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-23.4%)
Just Dashboard: 📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+971.63%)
Gspread Pandas: A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+60.28%)
Data Polygamy: Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.
Stars: ✭ 39 (-72.34%)
Flyte: Accelerate your ML and Data workflows to production. Flyte is a production-grade orchestration system for your Data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome and others and is truly open-source.
Stars: ✭ 1,242 (+780.85%)
Githubrankingsspain: ⬆️ Rankings with the most active GitHub users in Spain (sorted by public contributions) 🇪🇸
Stars: ✭ 127 (-9.93%)