LinkisLinkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (-92.65%)
TrinoOfficial repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (-85.51%)
KyuubiKyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (-98.85%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-99.53%)
MetorikkuA simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-98.86%)
GimelBig Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-99.32%)
XsqlUnified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-99.44%)
SparkjniA heterogeneous Apache Spark framework.
Stars: ✭ 11 (-99.97%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-99.65%)
PrestoThe official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (-59.02%)
Ejc SqlEmacs SQL client uses Clojure JDBC.
Stars: ✭ 164 (-99.48%)
CalciteApache Calcite
Stars: ✭ 2,816 (-91.09%)
MinidaoA lightweight Java persistence layer with MyBatis-like usage, built on Spring JDBC for an even lighter footprint
Stars: ✭ 177 (-99.44%)
spark-acidACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-99.71%)
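leaflet heatmapA simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw a heatmap directly, the heatmap-drawing step is moved to offline computation and analysis. Apache Spark computes the data in parallel and then draws the heatmap; leafletjs then loads the OpenStreetMap layer and the heatmap layer to achieve good interactivity. With the drawing currently implemented in Apache Spark, parallel computation is slower than single-machine computation, possibly because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .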
Stars: ✭ 13 (-99.96%)
SqlhelperSQL Tools (Dialect, Pagination, DDL dump, UrlParser, SqlStatementParser, WallFilter, BatchExecutor for Test) based on Java. It is easy to integrate into any ORM framework
Stars: ✭ 242 (-99.23%)
DomaDAO oriented database mapping framework for Java 8+
Stars: ✭ 257 (-99.19%)
bigdata-funA complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-99.96%)
ClojureqlClojureQL is superior SQL integration for Clojure
Stars: ✭ 281 (-99.11%)
CrateCrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.
Stars: ✭ 3,254 (-89.71%)
SparklerSpark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (-98.86%)
HiveApache Hive
Stars: ✭ 4,031 (-87.25%)
Data Science Ipython NotebooksData science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (-30.27%)
NormAccess a database in one line of code.
Stars: ✭ 152 (-99.52%)
BigdlBuilding Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (-87.94%)
JooqjOOQ is the best way to write SQL in Java
Stars: ✭ 4,695 (-85.15%)
PoliAn easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
Stars: ✭ 1,850 (-94.15%)
Db UtilIf you are using JPA and Hibernate, this tool can auto-detect N+1 query issues during testing.
Stars: ✭ 194 (-99.39%)
Presto Go ClientA Presto client for the Go programming language.
Stars: ✭ 183 (-99.42%)
QuickperfQuickPerf is a testing library for Java to quickly evaluate and improve some performance-related properties
Stars: ✭ 231 (-99.27%)
QuicksqlA Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (-94.24%)
awesome-AI-kubernetes❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-99.7%)
ClickhouseClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (-33.3%)
incubator-linkisLinkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (-92.22%)
JaydebeapiJayDeBeApi module allows you to connect from Python code to databases using Java JDBC. It provides a Python DB-API v2.0 to that database.
Stars: ✭ 247 (-99.22%)
SuccinctEnabling queries on compressed data.
Stars: ✭ 257 (-99.19%)
H2databaseH2 is an embeddable RDBMS written in Java.
Stars: ✭ 3,078 (-90.27%)
SylphStream computing platform for bigdata
Stars: ✭ 362 (-98.86%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (-82.56%)
DatafusionDataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (-98.07%)
ScriptisScriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (-97.8%)
IgniteApache Ignite
Stars: ✭ 4,027 (-87.26%)
MagellanGeo Spatial Data Analytics on Spark
Stars: ✭ 507 (-98.4%)
DeltaAn open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (-87.66%)
Hibernate SpringbootCollection of best practices for Java persistence performance in Spring Boot applications
Stars: ✭ 589 (-98.14%)
JailerDatabase Subsetting and Relational Data Browsing Tool.
Stars: ✭ 576 (-98.18%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (-82.11%)
RagtimeDatabase-independent migration library
Stars: ✭ 519 (-98.36%)
Mycat2MySQL Proxy using Java NIO based on Sharding SQL, Calcite, simple and fast
Stars: ✭ 750 (-97.63%)
Parquet IndexSpark SQL index for Parquet tables
Stars: ✭ 109 (-99.66%)
SpecqlAutomatic PostgreSQL CRUD queries
Stars: ✭ 120 (-99.62%)
Requeryrequery - modern SQL based query & persistence for Java / Kotlin / Android
Stars: ✭ 3,071 (-90.29%)
BeamApache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (-83.71%)