d6t / D6t Python

Licence: MIT

Accelerate data science

Labels

html data-science pandas data-engineering

Projects that are alternatives of or similar to D6t Python

Gspread Pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

Stars: ✭ 226 (+91.53%)

Mutual labels: data-science, pandas, data-engineering

Aws Data Wrangler

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Stars: ✭ 2,385 (+1921.19%)

Mutual labels: data-science, pandas, data-engineering

Ds and ml projects

Data Science & Machine Learning projects and tutorials in python from beginner to advanced level.

Stars: ✭ 56 (-52.54%)

Mutual labels: data-science, pandas

Seaborn

Statistical data visualization in Python

Stars: ✭ 9,007 (+7533.05%)

Mutual labels: data-science, pandas

Seaborn Tutorial

This repository is my attempt to help Data Science aspirants gain necessary Data Visualization skills required to progress in their career. It includes all the types of plot offered by Seaborn, applied on random datasets.

Stars: ✭ 114 (-3.39%)

Mutual labels: data-science, pandas

Just Dashboard

📊 📋 Dashboards using YAML or JSON files

Stars: ✭ 1,511 (+1180.51%)

Mutual labels: data-science, data-engineering

Skoot

A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn friendly interface in an effort to expedite the modeling process.

Stars: ✭ 50 (-57.63%)

Mutual labels: data-science, pandas

Setl

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (-33.05%)

Mutual labels: data-science, data-engineering

Python for ml

brief introduction to Python for machine learning

Stars: ✭ 29 (-75.42%)

Mutual labels: data-science, pandas

Danfojs

danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.

Stars: ✭ 1,304 (+1005.08%)

Mutual labels: data-science, pandas

Applied Ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

Stars: ✭ 17,824 (+15005.08%)

Mutual labels: data-science, data-engineering

Dat8

General Assembly's 2015 Data Science course in Washington, DC

Stars: ✭ 1,516 (+1184.75%)

Mutual labels: data-science, pandas

10 Simple Hacks To Speed Up Your Data Analysis In Python

Some useful Tips and Tricks to speed up the data analysis process in Python.

Stars: ✭ 45 (-61.86%)

Mutual labels: data-science, pandas

Machinelearningcourse

A collection of notebooks of my Machine Learning class written in python 3

Stars: ✭ 35 (-70.34%)

Mutual labels: data-science, pandas

Sweetviz

Visualize and compare datasets, target values and associations, with one line of code.

Stars: ✭ 1,851 (+1468.64%)

Mutual labels: data-science, pandas

Mlcourse.ai

Open Machine Learning Course

Stars: ✭ 7,963 (+6648.31%)

Mutual labels: data-science, pandas

Sayn

Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

Stars: ✭ 79 (-33.05%)

Mutual labels: data-science, data-engineering

Sigmoidal ai

Python, Data Science, Machine Learning and Deep Learning tutorials - Sigmoidal

Stars: ✭ 103 (-12.71%)

Mutual labels: data-science, pandas

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

Stars: ✭ 8,329 (+6958.47%)

Mutual labels: data-science, pandas

Crime Analysis

Association Rule Mining from Spatial Data for Crime Analysis

Stars: ✭ 20 (-83.05%)

Mutual labels: data-science, pandas


Accelerate Data Science

DataBolt Python libraries

DataBolt is a collection of Python-based libraries and products for data scientists and data engineers. It reduces the time it takes to get your data ready for analysis and to collaborate with others.

The majority of time in data science is spent on tedious tasks unrelated to data analysis. DataBolt simplifies those tasks so you can experience up to 10x productivity gains.

manage data workflows: quickly build highly effective data science workflows
push/pull data: quickly get and share data files like code
import data: quickly ingest messy raw CSV and XLS files to pandas, SQL and more
join data: quickly combine multiple datasets using fuzzy joins

The libraries are modular, so you can use them individually, but they also work well together to improve your entire data workflow.

Manage data workflows

Easily manage data workflows including complex dependencies and parameters. With d6tflow you can easily chain together complex data flows and intelligently execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.

What can it do?

Build a data workflow made up of tasks with dependencies and parameters
Intelligently rerun workflow after changing parameters, code or data
Quickly load task input and output data without manual work

Learn more at https://github.com/d6t/d6tflow
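
To make the idea of tasks with dependencies and parameters concrete, here is a minimal standard-library sketch. It is not d6tflow's API: the `Workflow` class, the `task` decorator and the `run_log` attribute are hypothetical names invented for illustration. It assumes that all parameters are passed to every task and that a task only needs to rerun when its parameters change.

```python
class Workflow:
    """Toy task runner (hypothetical, not d6tflow): caches output per parameter set."""

    def __init__(self):
        self._tasks = {}    # task name -> (function, names of upstream tasks)
        self._cache = {}    # (task name, sorted params) -> cached output
        self.run_log = []   # names of tasks that actually executed, in order

    def task(self, name, depends_on=()):
        """Register a function as a task with optional upstream dependencies."""
        def register(func):
            self._tasks[name] = (func, tuple(depends_on))
            return func
        return register

    def run(self, name, **params):
        """Run a task and its dependencies, skipping anything already cached."""
        key = (name, tuple(sorted(params.items())))
        if key in self._cache:
            return self._cache[key]
        func, deps = self._tasks[name]
        # Upstream outputs are loaded automatically and passed in as inputs.
        inputs = [self.run(dep, **params) for dep in deps]
        output = func(*inputs, **params)
        self._cache[key] = output
        self.run_log.append(name)
        return output


flow = Workflow()


@flow.task("load")
def load(**params):
    return [1, 2, 3, 4]


@flow.task("scale", depends_on=["load"])
def scale(rows, factor=1, **params):
    return [row * factor for row in rows]


first = flow.run("scale", factor=2)    # runs load, then scale
second = flow.run("scale", factor=2)   # same parameters: served from the cache
third = flow.run("scale", factor=3)    # parameter changed: tasks run again
```

In this sketch the cache lives in memory and ignores code or data changes; detecting those would need extra bookkeeping that is left out here.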

Push/Pull Data

d6tpipe is a Python library that makes it easier to exchange data. It's like git for data, but better, because you can include it in your data science code.

What can it do?

Quickly create public and private remote file storage on AWS S3 and FTP
Push/pull data to/from remote file storage to sync files and share with others
Add schema information so data can be loaded quickly

Learn more at https://github.com/d6t/d6tpipe
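
One way to picture the pull side of such a sync is to compare file hashes and fetch only what differs. The sketch below is an assumption about how this could work in general, not a description of d6tpipe's internals or API. The helper names `manifest` and `files_to_pull` are hypothetical, and in-memory dictionaries stand in for local and remote storage.

```python
import hashlib


def manifest(files):
    """Map each file name to the SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def files_to_pull(local, remote):
    """Return the remote files that are missing locally or whose contents differ."""
    local_digests = manifest(local)
    remote_digests = manifest(remote)
    return sorted(
        name
        for name, digest in remote_digests.items()
        if local_digests.get(name) != digest
    )


# Hypothetical file contents standing in for real storage.
remote_files = {"a.csv": b"1,2", "b.csv": b"3,4", "c.csv": b"5,6"}
local_files = {"a.csv": b"1,2", "b.csv": b"old"}

pending = files_to_pull(local_files, remote_files)
```

Here `a.csv` is identical on both sides, so only the changed `b.csv` and the new `c.csv` would be transferred.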

Ingest Data

Quickly ingest raw files. d6tstack reads XLS, CSV and TXT files and exports them to CSV, Parquet, SQL and pandas. It solves many performance and other problems typically encountered when ingesting raw files.

What can it do?

Fast pd.to_sql() for PostgreSQL and MySQL
Check and fix schema problems like added/missing/renamed columns
Load and process messy Excel files

Learn more at https://github.com/d6t/d6tstack
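
The schema check mentioned above can be illustrated with a small standard-library sketch that compares the header row of each file against a reference file. This is not d6tstack's API: `read_header` and `compare_columns` are hypothetical helpers, and the CSV contents are made-up sample data.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text)))


def compare_columns(reference, other):
    """Report columns that are missing from or added to `other` relative to `reference`."""
    ref, oth = set(reference), set(other)
    return {"missing": sorted(ref - oth), "added": sorted(oth - ref)}


# Made-up monthly files; February gains a column, March renames one.
jan = "date,sales,cost\n2019-01-01,100,80\n"
feb = "date,sales,cost,profit\n2019-02-01,120,90,30\n"
mar = "date,revenue,cost\n2019-03-01,130,95\n"

reference = read_header(jan)
feb_diff = compare_columns(reference, read_header(feb))
mar_diff = compare_columns(reference, read_header(mar))
```

A column that shows up as both missing and added, as with `sales` and `revenue` in March, is a hint that it may have been renamed.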

Join Data

Easily join different datasets using fuzzy matches, without writing custom code. d6tjoin does similarity joins on strings, dates and numbers. For example, you can quickly join similar but not identical stock tickers, addresses, names and dates without manual processing.

What can it do?

Identify and diagnose join problems
Best match fuzzy joins on strings and dates
Best match substring joins

Learn more at https://github.com/d6t/d6tjoin
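
As a rough illustration of a best-match fuzzy join on strings, the sketch below pairs each key on the left with its closest key on the right using `difflib` from the standard library. It is not d6tjoin's API or its matching algorithm: `best_match_join` is a hypothetical helper, the company names are made-up sample data, and the 0.6 similarity cutoff is an arbitrary assumption.

```python
import difflib


def best_match_join(left_keys, right_keys, cutoff=0.6):
    """Map each left key to its most similar right key, or None if nothing is close enough."""
    matches = {}
    for key in left_keys:
        candidates = difflib.get_close_matches(key, right_keys, n=1, cutoff=cutoff)
        matches[key] = candidates[0] if candidates else None
    return matches


# Made-up keys that are similar but not identical across two datasets.
left = ["Apple Inc", "Microsoft Corp", "Tesla"]
right = ["Apple Inc.", "Microsoft Corporation", "Alphabet"]

matches = best_match_join(left, right)
unmatched = [key for key, match in matches.items() if match is None]
```

Listing the unmatched keys is a simple way to diagnose join problems before merging the actual rows.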

Blog

We encourage you to follow the DataBolt blog for updates, tips and tricks: http://blog.databolt.tech

About

https://www.databolt.tech

For questions or comments contact: support-at-databolt.tech


Stars: ✭ 118
