665 open source projects that are alternatives to or similar to architect_big_data_solutions_with_spark

Od
Czech open data
Stars: ✭ 99 (+147.5%)
Mutual labels:  etl
Datav
📊 https://datav.io is a modern APM that provides observability for your business, applications, and infrastructure. It is also a lightweight alternative to Grafana.
Stars: ✭ 2,757 (+6792.5%)
Mutual labels:  data-analysis
PracticalMachineLearning
A collection of ML-related material, including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not free food) or open source.
Stars: ✭ 60 (+50%)
Mutual labels:  data-analysis
Dabest Python
Data Analysis with Bootstrapped ESTimation
Stars: ✭ 231 (+477.5%)
Mutual labels:  data-analysis
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (+122.5%)
Mutual labels:  etl
Querytree
Data reporting and visualization for your app
Stars: ✭ 230 (+475%)
Mutual labels:  data-analysis
document-understanding-solution
Example of integrating & using Amazon Textract, Amazon Comprehend, Amazon Comprehend Medical, Amazon Kendra to automate the processing of documents for use cases such as enterprise search and discovery, control and compliance, and general business process workflow.
Stars: ✭ 180 (+350%)
Mutual labels:  aws-machine-learning
Streamlit
Streamlit — The fastest way to build data apps in Python
Stars: ✭ 16,906 (+42165%)
Mutual labels:  data-analysis
Hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Stars: ✭ 84 (+110%)
Mutual labels:  etl
dsr
Introduction to Data Science with R (2017)
Stars: ✭ 25 (-37.5%)
Mutual labels:  data-analysis
Bitcoin Etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (+335%)
Mutual labels:  etl
PDAP-Scrapers
Code relating to scraping public police data.
Stars: ✭ 72 (+80%)
Mutual labels:  etl
Runalyze
Create your free account at runalyze.com
Stars: ✭ 219 (+447.5%)
Mutual labels:  data-analysis
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+97.5%)
Mutual labels:  etl
Static Frame
Immutable and grow-only Pandas-like DataFrames with a more explicit and consistent interface.
Stars: ✭ 217 (+442.5%)
Mutual labels:  data-analysis
openrefine-batch
Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
Stars: ✭ 76 (+90%)
Mutual labels:  etl
Awosome Bioinformatics
A curated list of resources for learning bioinformatics.
Stars: ✭ 214 (+435%)
Mutual labels:  data-analysis
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+2887.5%)
Mutual labels:  etl
Toad
ESC Team's scorecard tools
Stars: ✭ 207 (+417.5%)
Mutual labels:  data-analysis
python mozetl
ETL jobs for Firefox Telemetry
Stars: ✭ 25 (-37.5%)
Mutual labels:  etl
Awkward 1.0
Manipulate JSON-like data with NumPy-like idioms.
Stars: ✭ 203 (+407.5%)
Mutual labels:  data-analysis
Luigi Warehouse
A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (+80%)
Mutual labels:  etl
Discovery
Frontend framework for rapid data (JSON) analysis, sharable serverless reports and dashboards
Stars: ✭ 199 (+397.5%)
Mutual labels:  data-analysis
iex-stocks
ETL for the IEX Stocks API
Stars: ✭ 19 (-52.5%)
Mutual labels:  etl
Data Science Notebook
📖 Every great idea and action has a humble beginning
Stars: ✭ 196 (+390%)
Mutual labels:  data-analysis
Target Postgres
A Singer.io Target for Postgres
Stars: ✭ 70 (+75%)
Mutual labels:  etl
spectrochempy
SpectroChemPy is a framework for processing, analyzing and modeling spectroscopic data for chemistry with Python
Stars: ✭ 34 (-15%)
Mutual labels:  data-analysis
Klib
Easy to use Python library of customized functions for cleaning and analyzing data.
Stars: ✭ 192 (+380%)
Mutual labels:  data-analysis
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Stars: ✭ 68 (+70%)
Mutual labels:  etl
Volbx
Graphical tool for data manipulation written in C++/Qt
Stars: ✭ 187 (+367.5%)
Mutual labels:  data-analysis
Spark ALS
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
Stars: ✭ 89 (+122.5%)
Mutual labels:  spark-streaming
Goaccess
GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
Stars: ✭ 14,096 (+35140%)
Mutual labels:  data-analysis
Discreetly
ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (+50%)
Mutual labels:  etl
Collapse
Advanced and Fast Data Transformation in R
Stars: ✭ 184 (+360%)
Mutual labels:  data-analysis
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-22.5%)
Mutual labels:  data-analysis
Matplotlib Doc Zh
📖 [Translation] Matplotlib User Guide
Stars: ✭ 178 (+345%)
Mutual labels:  data-analysis
Bentools Etl
PHP ETL (Extract / Transform / Load) library with SOLID principles + almost no dependencies.
Stars: ✭ 45 (+12.5%)
Mutual labels:  etl
link-move
A model-driven dynamically-configurable framework to acquire data from external sources and save it to your database.
Stars: ✭ 32 (-20%)
Mutual labels:  etl
DataBridge.NET
Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-60%)
Mutual labels:  etl
Covid19 Severity Prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
Stars: ✭ 170 (+325%)
Mutual labels:  data-analysis
Configs
Public, free-to-use repository of digger configs for scraping / extracting data from various e-commerce websites and online stores
Stars: ✭ 37 (-7.5%)
Mutual labels:  etl
Dabestr
Data Analysis with Bootstrap Estimation in R
Stars: ✭ 169 (+322.5%)
Mutual labels:  data-analysis
wasp
WASP is a framework to build complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-52.5%)
Mutual labels:  spark-streaming
Countly Sdk Web
Countly Product Analytics SDK for websites and web applications
Stars: ✭ 165 (+312.5%)
Mutual labels:  data-analysis
Ethereum Etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 956 (+2290%)
Mutual labels:  etl
Report Designer
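🚀 Print design, visualization, large-screen dashboards, editor, designer, data analysis, report design, componentization, form design, H5 pages, questionnaires, PDF generation, flowcharts, exam papers, SVG, graphic elements, IoT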
Stars: ✭ 160 (+300%)
Mutual labels:  data-analysis
Real-time-log-analysis-system
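🐧 Real-time log processing and analysis system based on Spark Streaming + Flume + Kafka + HBase (available as a console version and a Web UI visualization version built on Spring Boot, ECharts, etc.)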
Stars: ✭ 31 (-22.5%)
Mutual labels:  spark-streaming
Visualize ml
Python package for consolidated and extensive univariate and bivariate data analysis and visualization, catering to both categorical and continuous datasets.
Stars: ✭ 160 (+300%)
Mutual labels:  data-analysis
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script built on Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-47.5%)
Mutual labels:  etl
Bovespastockratings
Crawler for a fundamental analysis platform for BOVESPA stocks, generating a score for each share according to the selected criteria on the indicators.
Stars: ✭ 154 (+285%)
Mutual labels:  data-analysis
Bitcoin Analysis-
Python. Bitcoin is a widely used cryptocurrency for the digital market. It is decentralised, which means it is not owned by a government or any company. Transactions are simple and easy as it doesn’t belong to any country. Records are stored in the blockchain. The Bitcoin price is variable and it is widely used, so it is important to predict the price of it f…
Stars: ✭ 42 (+5%)
Mutual labels:  data-analysis
Dswarm Backoffice Web
The backoffice web application of d:swarm (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 11 (-72.5%)
Mutual labels:  etl
singer-runner
A CLI and library to run Singer Taps and Targets
Stars: ✭ 33 (-17.5%)
Mutual labels:  etl
ipychart
The power of Chart.js with Python
Stars: ✭ 48 (+20%)
Mutual labels:  data-analysis
data-analysis
Financial markets and sports lottery markets: data analysis and quantitative trading
Stars: ✭ 73 (+82.5%)
Mutual labels:  data-analysis
hnn
The Human Neocortical Neurosolver (HNN) is a software tool that gives researchers/clinicians the ability to develop/test hypotheses on circuit mechanisms underlying EEG/MEG data.
Stars: ✭ 62 (+55%)
Mutual labels:  data-analysis
covid-19
Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-65%)
Mutual labels:  etl
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+1537.5%)
Mutual labels:  etl
Bender
Bender - Serverless ETL Framework
Stars: ✭ 171 (+327.5%)
Mutual labels:  etl
go-bqloader
bqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
Stars: ✭ 16 (-60%)
Mutual labels:  etl
301-360 of 665 similar projects