Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+538.53%)
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+214.68%)
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-46.79%)
Xsql
Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (+61.47%)
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+231.19%)
Whylogs Java
Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (+50.46%)
Sqlindexmanager
Free GUI Tool for Index Maintenance on SQL Server and Azure
Stars: ✭ 403 (+269.72%)
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+260.55%)
experiments
Code examples for my blog posts
Stars: ✭ 21 (-80.73%)
Mlinterview
A curated awesome list of AI Startups in India & Machine Learning Interview Guide. Feel free to contribute!
Stars: ✭ 410 (+276.15%)
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+460.55%)
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-73.39%)
Quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+1570.64%)
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+1406.42%)
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+28907.34%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+37.61%)
Linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+2031.19%)
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+233.03%)
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (+132.11%)
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+272.48%)
Kamu Cli
Next generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (-36.7%)
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schemas. Supports schema inference and a GraphQL API.
Stars: ✭ 97 (-11.01%)
Php Thrift Sql
A PHP library for connecting to Hive or Impala over Thrift
Stars: ✭ 107 (-1.83%)
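Xorm
xorm is a simple yet powerful ORM library for Go that makes database operations very easy. This library is a customized, enhanced version of the original xorm, adding iBatis-style configuration files and dynamic SQL support, as well as ActiveRecord operations.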
Stars: ✭ 1,394 (+1178.9%)
Ransom0
Ransom0 is an open-source ransomware made with Python, designed to find and encrypt user data.
Stars: ✭ 105 (-3.67%)
Legacy Search
Demo project showing how to add Elasticsearch to a legacy application.
Stars: ✭ 103 (-5.5%)
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-0.92%)
Tennis Crystal Ball
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-1.83%)
Cubes
Light-weight Python OLAP framework for multi-dimensional data analysis
Stars: ✭ 1,393 (+1177.98%)
Minisqlquery
Minimalist SQL Query tool for any .NET DB Provider - SQL, SQLite, SQL CE, Oracle, Access...
Stars: ✭ 103 (-5.5%)
Monetdblite
MonetDB reconfigured as a library
Stars: ✭ 107 (-1.83%)
Your spotify
Self-hosted Spotify tracking dashboard
Stars: ✭ 102 (-6.42%)
Laravel Stats
📈 Get insights about your Laravel or Lumen Project
Stars: ✭ 1,386 (+1171.56%)
Isl Python
Porting the R code in ISL to Python. Labs and exercises
Stars: ✭ 108 (-0.92%)
Hnswlib
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
Stars: ✭ 108 (-0.92%)
Logigsk
A Linux-based software package to control LEDs on Logitech G910, G810, G610 and G410.
Stars: ✭ 107 (-1.83%)
Gitlogg
💾 🧮 🤯 Parse the 'git log' of multiple repos to 'JSON'
Stars: ✭ 102 (-6.42%)
Ml Videos
A collection of video resources for machine learning
Stars: ✭ 1,446 (+1226.61%)
Hackermath
Introduction to Statistics and Basics of Mathematics for Data Science - The Hacker's Way
Stars: ✭ 1,380 (+1166.06%)
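Cslearning
Open-source project "The Road to Self-Taught Computer Programming": self-study guide + interview compendium + resource sharing + technical articles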
Stars: ✭ 107 (-1.83%)
Root
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
Stars: ✭ 1,377 (+1163.3%)
Sqlobject
SQLObject, an object-relational mapper for Python
Stars: ✭ 106 (-2.75%)
F3 Cortex
A multi-engine ORM / ODM for the PHP Fat-Free Framework
Stars: ✭ 101 (-7.34%)
Spark Ffm
FFM (Field-aware Factorization Machine) on Spark
Stars: ✭ 101 (-7.34%)
Sqlfaker
A lightweight, easily extensible open-source Java library for intelligently populating databases
Stars: ✭ 109 (+0%)
Ptstat
Probabilistic Programming and Statistical Inference in PyTorch
Stars: ✭ 108 (-0.92%)
Scikit Learn
scikit-learn: machine learning in Python
Stars: ✭ 48,322 (+44232.11%)
Npm Stats
📈 npm package statistics dashboard built with Vue
Stars: ✭ 106 (-2.75%)
Maha
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-7.34%)
Griddb
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
Stars: ✭ 1,587 (+1355.96%)
Fiflow
flink-sql: a platform for running SQL and building data streams on Flink, based on Apache Flink 1.10.0
Stars: ✭ 100 (-8.26%)