928 open source projects that are alternatives to or similar to Bigslice

Bigdata Interview
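🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, together with the author's own answer summaries. Currently covers interview questions for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.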
Stars: ✭ 857 (+82.73%)
Mutual labels:  bigdata, mapreduce
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script built on Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-95.52%)
Mutual labels:  bigdata, etl
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (-47.76%)
Mutual labels:  bigdata, etl
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+98.08%)
Mutual labels:  bigdata, mapreduce
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (-84.86%)
Mutual labels:  bigdata, mapreduce
Dpark
Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+468.87%)
Mutual labels:  bigdata, mapreduce
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-92.75%)
Mutual labels:  bigdata, mapreduce
zdh web
Big data collection and extraction platform
Stars: ✭ 292 (-37.74%)
Mutual labels:  etl, bigdata
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-96.16%)
Mutual labels:  bigdata, mapreduce
Panther
Detect threats with log data and improve cloud security posture
Stars: ✭ 885 (+88.7%)
Mutual labels:  bigdata, etl
Vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Stars: ✭ 6,793 (+1348.4%)
Mutual labels:  bigdata, machinelearning
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-95.95%)
Mutual labels:  bigdata, machinelearning
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-95.74%)
Mutual labels:  bigdata, mapreduce
bigdata-doc
Big data study notes, learning roadmap, and a collection of technical case studies.
Stars: ✭ 37 (-92.11%)
Mutual labels:  bigdata, mapreduce
Bigdata Notes
Beginner's guide to big data ⭐
Stars: ✭ 10,991 (+2243.5%)
Mutual labels:  bigdata, mapreduce
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-97.01%)
Mutual labels:  etl, mapreduce
Arvados
An open source platform for managing and analyzing biomedical big data
Stars: ✭ 274 (-41.58%)
Mutual labels:  bigdata, cluster
Dotnext
Next generation API for .NET
Stars: ✭ 379 (-19.19%)
Mutual labels:  cluster
Openpbs
An HPC workload manager and job scheduler for desktops, clusters, and clouds.
Stars: ✭ 427 (-8.96%)
Mutual labels:  cluster
Abc
Power of appbase.io via CLI, with nifty imports from your favorite data sources
Stars: ✭ 375 (-20.04%)
Mutual labels:  etl
Tensorflowonspark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Stars: ✭ 3,748 (+699.15%)
Mutual labels:  cluster
Simple Ocr Opencv
A simple python OCR engine using opencv
Stars: ✭ 453 (-3.41%)
Mutual labels:  machinelearning
Onepanel
The open and extensible integrated development environment (IDE) for computer vision with built-in modules for model building, automated labeling, data processing, model training, hyperparameter tuning and workflow orchestration.
Stars: ✭ 428 (-8.74%)
Mutual labels:  machinelearning
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (-20.68%)
Mutual labels:  etl
Articles
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (-25.37%)
Mutual labels:  machinelearning
Syn
A global Process Registry and Process Group manager for Erlang and Elixir.
Stars: ✭ 412 (-12.15%)
Mutual labels:  cluster
Text summurization abstractive methods
Multiple implementations of abstractive text summarization, using Google Colab
Stars: ✭ 359 (-23.45%)
Mutual labels:  machinelearning
Nfx
C# Server UNISTACK framework [MOVED]
Stars: ✭ 379 (-19.19%)
Mutual labels:  cluster
Circosjs
d3 library to build circular graphs
Stars: ✭ 436 (-7.04%)
Mutual labels:  bigdata
Swarmlet
A self-hosted, open-source Platform as a Service that enables easy swarm deployments, load balancing, automatic SSL, metrics, analytics and more.
Stars: ✭ 373 (-20.47%)
Mutual labels:  cluster
Pglogical
Logical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (-2.99%)
Mutual labels:  etl
Choetl
ETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (-20.68%)
Mutual labels:  etl
Minimesos
The experimentation and testing tool for Apache Mesos - NO LONGER MAINTAINED!
Stars: ✭ 429 (-8.53%)
Mutual labels:  cluster
Aistore
AIStore: scalable storage for AI applications
Stars: ✭ 367 (-21.75%)
Mutual labels:  etl
Etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
Stars: ✭ 460 (-1.92%)
Mutual labels:  etl
Sidekick
High Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (-21.96%)
Mutual labels:  bigdata
Cortx
CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (-9.17%)
Mutual labels:  bigdata
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-23.03%)
Mutual labels:  etl
Akkadotnet Code Samples
Akka.NET professional reference code samples
Stars: ✭ 451 (-3.84%)
Mutual labels:  cluster
Diplomat
A HTTP Ruby API for Consul
Stars: ✭ 358 (-23.67%)
Mutual labels:  cluster
Victoriametrics
VictoriaMetrics: fast, cost-effective monitoring solution and time series database
Stars: ✭ 5,558 (+1085.07%)
Mutual labels:  cluster
Jigsaw
Jigsaw七巧板 provides a set of web components based on Angular5/8/9+. Its main purpose is to help application developers build complex, interaction-intensive, user-friendly web pages. Jigsaw supports the development of all applications of the Big Data Product of ZTE.
Stars: ✭ 354 (-24.52%)
Mutual labels:  bigdata
Nodejsstarterkit
Starter Kit for Node.js v14.x, minimum dependencies 🚀
Stars: ✭ 348 (-25.8%)
Mutual labels:  cluster
Smartcode
SmartCode = IDataSource -> IBuildTask -> IOutput => Build Everything!!!
Stars: ✭ 464 (-1.07%)
Mutual labels:  etl
Smudge
A lightweight library that provides group member discovery, status dissemination, and failure detection using the SWIM epidemic protocol.
Stars: ✭ 458 (-2.35%)
Mutual labels:  cluster
Minikube
Run Kubernetes locally
Stars: ✭ 22,673 (+4734.33%)
Mutual labels:  cluster
Actionai
Custom human activity recognition modules based on pose estimation and cascaded inference, using the sklearn API
Stars: ✭ 404 (-13.86%)
Mutual labels:  machinelearning
Datawave
DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
Stars: ✭ 347 (-26.01%)
Mutual labels:  bigdata
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (-27.08%)
Mutual labels:  etl
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (-14.71%)
Mutual labels:  bigdata
K8s Multicluster Ingress
kubemci: Command line tool to configure L7 load balancers using multiple kubernetes clusters
Stars: ✭ 345 (-26.44%)
Mutual labels:  cluster
Sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Stars: ✭ 345 (-26.44%)
Mutual labels:  cluster
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+4601.07%)
Mutual labels:  mapreduce
Akka.net
Port of Akka actors for .NET
Stars: ✭ 4,024 (+758%)
Mutual labels:  cluster
Api.rss
RSS as RESTful. This service allows you to transform RSS feed into an awesome API.
Stars: ✭ 340 (-27.51%)
Mutual labels:  bigdata
Webkettle
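A professional-edition, browser/server (B/S) architecture tool built on the web version of Kettle for distributed scheduling, management, and ETL development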
Stars: ✭ 334 (-28.78%)
Mutual labels:  etl
Kube Spawn
A tool for creating multi-node Kubernetes clusters on a Linux machine using kubeadm & systemd-nspawn. Brought to you by the Kinvolk team.
Stars: ✭ 392 (-16.42%)
Mutual labels:  cluster
Ckss Certified Kubernetes Security Specialist
This repository is a collection of resources to prepare for the Certified Kubernetes Security Specialist (CKSS) exam.
Stars: ✭ 333 (-29%)
Mutual labels:  cluster
Kontraktor
distributed Actors for Java 8 / JavaScript
Stars: ✭ 333 (-29%)
Mutual labels:  cluster
Udacity Data Engineering Projects
A few projects related to data engineering, including data modeling, infrastructure setup on the cloud, data warehousing, and data lake development.
Stars: ✭ 458 (-2.35%)
Mutual labels:  cluster