798 open source projects that are alternatives to or similar to Springboard-Data-Science-Immersive

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+680.77%)

Mutual labels: hadoop, pyspark

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (+113.46%)

Mutual labels: hadoop, pyspark

Sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

Stars: ✭ 51 (-1.92%)

Mutual labels: eda, pyspark

Uc Davis Cs Exams Analysis

📈 Regression and Classification with UC Davis student quiz data and exam data

Stars: ✭ 33 (-36.54%)

Mutual labels: web-scraping, statistical-analysis

awesome-time-series

Resources for working with time series and sequence data

Stars: ✭ 178 (+242.31%)

Mutual labels: time-series-analysis, time-series-prediction

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (-51.92%)

Mutual labels: hadoop, pyspark

fireTS

A Python multivariate time series prediction library that works with sklearn

Stars: ✭ 62 (+19.23%)

Mutual labels: time-series-analysis, time-series-prediction

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+188.46%)

Mutual labels: hadoop, pyspark

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-34.62%)

Mutual labels: hadoop, pyspark

Atsd Use Cases

Axibase Time Series Database: Usage Examples and Research Articles

Stars: ✭ 335 (+544.23%)

Mutual labels: statistical-analysis, time-series-analysis

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (-25%)

Mutual labels: hadoop, pyspark

pyspark-ML-in-Colab

Pyspark in Google Colab: A simple machine learning (Linear Regression) model

Stars: ✭ 32 (-38.46%)

Mutual labels: hadoop, pyspark

MLHadoop

This repository contains machine learning MapReduce code for Hadoop, written from scratch (without using any package or library), e.g. prediction (linear and logistic regression), clustering (K-Means), classification (KNN), etc.

Stars: ✭ 50 (-3.85%)

Mutual labels: hadoop

cobra-policytool

Manage Apache Atlas and Ranger configuration for your Hadoop environment.

Stars: ✭ 16 (-69.23%)

Mutual labels: hadoop

copper

An open source PCB editor in Rust

Stars: ✭ 26 (-50%)

Mutual labels: eda

OLX Scraper

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for the requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-71.15%)

Mutual labels: web-scraping

time series notebooks

My Experiments with Time Series

Stars: ✭ 20 (-61.54%)

Mutual labels: time-series-analysis

Spark-for-data-engineers

Apache Spark for data engineers

Stars: ✭ 22 (-57.69%)

Mutual labels: pyspark

densenet

A PyTorch Implementation of "Densely Connected Convolutional Networks"

Stars: ✭ 50 (-3.85%)

Mutual labels: tensorboard

BasisFunctionExpansions.jl

Basis Function Expansions for Julia

Stars: ✭ 19 (-63.46%)

Mutual labels: time-series-analysis

check-engine

Data validation library for PySpark 3.0.0

Stars: ✭ 29 (-44.23%)

Mutual labels: pyspark

Sequence-to-Sequence-Learning-of-Financial-Time-Series-in-Algorithmic-Trading

My bachelor's thesis—analyzing the application of LSTM-based RNNs on financial markets. 🤓

Stars: ✭ 64 (+23.08%)

Mutual labels: time-series-analysis

deep-scite

🚣 A simple recommendation engine (by way of convolutions and embeddings) written in TensorFlow

Stars: ✭ 20 (-61.54%)

Mutual labels: tensorboard

skimpy

skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.

Stars: ✭ 236 (+353.85%)

Mutual labels: eda

basic-image-eda

A simple image dataset EDA tool (CLI / Code)

Stars: ✭ 51 (-1.92%)

Mutual labels: eda

India-WhatsAppFakeNews-Dataset

News articles on WhatsApp-related deaths, along with other articles from across India during that period

Stars: ✭ 41 (-21.15%)

Mutual labels: web-scraping

autojs-webView

A webView implementation for autojs, supporting initialization script injection and two-way jsBridge calls

Stars: ✭ 38 (-26.92%)

Mutual labels: h5

pomp

R package for statistical inference using partially observed Markov processes

Stars: ✭ 88 (+69.23%)

Mutual labels: statistical-inference

Tensorflow-Wide-Deep-Local-Prediction

This project demonstrates how to run and save predictions locally using an exported TensorFlow Estimator model

Stars: ✭ 28 (-46.15%)

Mutual labels: tensorboard

clickhouse hadoop

Import data from ClickHouse to Hadoop with pure SQL

Stars: ✭ 26 (-50%)

Mutual labels: hadoop

jobAnalytics and search

The JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.

Stars: ✭ 25 (-51.92%)

Mutual labels: pyspark

SynapseML

Simple and Distributed Machine Learning

Stars: ✭ 3,355 (+6351.92%)

Mutual labels: pyspark

tsa-tutorial

Material for the tutorial, "Time series analysis with pandas" at T-Academy

Stars: ✭ 21 (-59.62%)

Mutual labels: time-series-analysis

lstm-electric-load-forecast

Electric load forecast using a Long Short-Term Memory (LSTM) recurrent neural network

Stars: ✭ 56 (+7.69%)

Mutual labels: time-series-prediction

platys-modern-data-platform

Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....

Stars: ✭ 35 (-32.69%)

Mutual labels: hadoop

automation-scripts

Simple scripts that I'm using to automate the boring things.

Stars: ✭ 14 (-73.08%)

Mutual labels: web-scraping

clusterdock

clusterdock is a framework for creating Docker-based container clusters

Stars: ✭ 26 (-50%)

Mutual labels: hadoop

Node-js-functionalities

This repository contains useful RESTful APIs and functionality in Node.js, including tutorial code for mastering Node.js. All tutorials have been published on medium.com.

Stars: ✭ 69 (+32.69%)

Mutual labels: web-scraping

ros hadoop

Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.

Stars: ✭ 92 (+76.92%)

Mutual labels: hadoop

leetcode-compensation

Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.

Stars: ✭ 83 (+59.62%)

Mutual labels: web-scraping

flokkr

Documentation placeholder and utilities for all the other containers.

Stars: ✭ 30 (-42.31%)

Mutual labels: hadoop

DaFlow

Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.

Stars: ✭ 24 (-53.85%)

Mutual labels: hadoop

wink-statistics

Fast & numerically stable statistical analysis

Stars: ✭ 36 (-30.77%)

Mutual labels: statistical-analysis

WaWebSessionHandler

(DISCONTINUED) Save WhatsApp Web Sessions as files and open them everywhere!

Stars: ✭ 27 (-48.08%)

Mutual labels: web-scraping

IMDB-Scraper

Scrapy project for scraping data from IMDB, with a movie dataset covering 58,623 movies.