Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-18.42%)
Atsd: Axibase Time Series Database Documentation
Stars: ✭ 68 (-40.35%)
Bigdataguide: Learn big data from scratch; includes learning videos for each stage of big data study, plus interview materials
Stars: ✭ 817 (+616.67%)
Pyhive: Python interface to Hive and Presto. 🐝
Stars: ✭ 1,378 (+1108.77%)
Winutils: winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+476.32%)
Jumbune: Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-43.86%)
Tony: TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+449.12%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4861.4%)
Petastorm: The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+871.93%)
Gather Deployment: Gathers scalable TensorFlow and infrastructure deployment
Stars: ✭ 326 (+185.96%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1528.07%)
Bigdata: 💎🔥 Big data study notes
Stars: ✭ 488 (+328.07%)
Rumble: ⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-49.12%)
School Of Sre: At LinkedIn, we are using this curriculum for onboarding our entry-level talent into the SRE role.
Stars: ✭ 5,141 (+4409.65%)
Bigdata File Viewer: A cross-platform (Windows, macOS, Linux) desktop application to view common big data binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-24.56%)
Presto Ethereum: Presto Ethereum Connector -- SQL on Ethereum
Stars: ✭ 450 (+294.74%)
God Of Bigdata: Focused on big data learning and interviews; the road to becoming a big data god starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+5170.18%)
Hadoop Solr: Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-55.26%)
Akkeeper: An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-73.68%)
Cascading: Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.
Stars: ✭ 318 (+178.95%)
Node Parquet: Node.js module to access Apache Parquet format files
Stars: ✭ 46 (-59.65%)
Skale: High performance distributed data processing engine
Stars: ✭ 390 (+242.11%)
Avro Hadoop Starter: Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (-3.51%)
Bigdl: Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+3244.74%)
Quilt: Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+783.33%)
Sqlpad: Web-based SQL editor run in your own private cloud. Supports MySQL, Postgres, SQL Server, Vertica, Crate, ClickHouse, Trino, Presto, SAP HANA, Cassandra, Snowflake, BigQuery, SQLite, and more with ODBC
Stars: ✭ 4,113 (+3507.89%)
Camus: Mirror of LinkedIn's Camus
Stars: ✭ 81 (-28.95%)
Wedatasphere: WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+226.32%)
Antsdb: AntsDB is a low-latency, high-concurrency, MySQL-compliant SQL layer for HBase
Stars: ✭ 99 (-13.16%)
Ozone: Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (+189.47%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-69.3%)
Pystore: Fast data store for Pandas time-series data
Stars: ✭ 325 (+185.09%)
Pucket: Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-74.56%)
Tez: Apache Tez
Stars: ✭ 313 (+174.56%)
Hadoop Book: Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
Stars: ✭ 3,317 (+2809.65%)
Spline: Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+168.42%)
Kglab: Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.
Stars: ✭ 98 (-14.04%)
Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (-31.58%)
Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+732.46%)
Cloudbreak: A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.
Stars: ✭ 301 (+164.04%)
Elasticsearch loader: A tool for batch loading data files (JSON, Parquet, CSV, TSV) into Elasticsearch
Stars: ✭ 300 (+163.16%)
Storm Camel Example: Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.
Stars: ✭ 28 (-75.44%)
Elasticluster: Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (+161.4%)
Behemoth: Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (+150.88%)
Chukwa: Mirror of Apache Chukwa
Stars: ✭ 77 (-32.46%)
Android Nosql: Lightweight, simple structured NoSQL database for Android
Stars: ✭ 284 (+149.12%)
Ratatool: A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+144.74%)
Parquet Dotnet: 🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+142.11%)
Schemer: Schema registry for CSV, TSV, JSON, AVRO and Parquet schemas. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-14.91%)