TonYTonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 687 (+4193.75%)
GeomancerAutomated feature engineering for geospatial data
Stars: ✭ 194 (+1112.5%)
ScriptisScriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.
Stars: ✭ 696 (+4250%)
SqlpadWeb-based SQL editor run in your own private cloud. Supports MySQL, Postgres, SQL Server, Vertica, Crate, ClickHouse, Trino, Presto, SAP HANA, Cassandra, Snowflake, BigQuery, SQLite, and more with ODBC
Stars: ✭ 4,113 (+25606.25%)
LuigiLuigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Stars: ✭ 15,226 (+95062.5%)
CalciteApache Calcite
Stars: ✭ 2,816 (+17500%)
Almanac.httparchive.orgHTTP Archive's annual "State of the Web" report made by the web community
Stars: ✭ 310 (+1837.5%)
gcp-dlDeep Learning on GCP
Stars: ✭ 27 (+68.75%)
Issue Label BotCode For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
Stars: ✭ 292 (+1725%)
Bigdata PlaygroundA complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+1006.25%)
spark-on-k8s-gcp-examplesExample Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub
Stars: ✭ 36 (+125%)
Big WhaleScheduling of offline tasks (Spark, Flink, etc.) and monitoring of real-time tasks
Stars: ✭ 163 (+918.75%)
Nodejs BigqueryNode.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
Stars: ✭ 268 (+1575%)
JavaFrameworkSimple Java framework designed for easily developing Spring-based Java programs. Supports big data and metadata management, a common Elasticsearch query tool, and so on.
Stars: ✭ 16 (+0%)
HadoopApache Hadoop
Stars: ✭ 12,177 (+76006.25%)
XlearningAI on Hadoop
Stars: ✭ 1,709 (+10581.25%)
TiBigDataTiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+1100%)
alphasqlAlphaSQL provides Integrated Type and Schema Check and Parallelization for SQL file set mainly for BigQuery
Stars: ✭ 35 (+118.75%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+10162.5%)
beanszooDistributed Java micro-services using ZooKeeper
Stars: ✭ 12 (-25%)
SpydraEphemeral Hadoop clusters using Google Compute Platform
Stars: ✭ 128 (+700%)
dbddbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (+87.5%)
Parquet4sRead and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (+681.25%)
logparserEasy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+768.75%)
Hdfs ShellHDFS Shell is an HDFS manipulation tool that works with the functions integrated in Hadoop DFS
Stars: ✭ 117 (+631.25%)
growthbookOpen Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+14537.5%)
AsakusafwAsakusa Framework
Stars: ✭ 114 (+612.5%)
DeployMachineLearningModelsThis repo contains deployments of machine learning models on various cloud services such as Azure, Heroku, AWS, GCP, etc.
Stars: ✭ 14 (-12.5%)
Parquet GoGo package to read and write Parquet files. Parquet is a file format that stores nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (+612.5%)
bigquery fdwBigQuery Foreign Data Wrapper for PostgreSQL
Stars: ✭ 65 (+306.25%)
grucloudGenerate diagrams and code from cloud infrastructures: AWS, Azure, GCP, Kubernetes
Stars: ✭ 76 (+375%)
pre-commit-dbt🎣 List of `pre-commit` hooks to ensure the quality of your `dbt` projects.
Stars: ✭ 149 (+831.25%)
deploy-appengineA GitHub Action that deploys source code to Google App Engine.
Stars: ✭ 184 (+1050%)
bigquery-geo-vizVisualize Google BigQuery geospatial data using Google Maps Platform APIs
Stars: ✭ 68 (+325%)
gke-anthos-holistic-demoThis repository guides you through deploying a private GKE cluster and provides a base platform for hands-on exploration of several GKE related topics which leverage or integrate with that infrastructure. After completing the exercises in all topic areas, you will have a deeper understanding of several core components of GKE and GCP as configure…
Stars: ✭ 55 (+243.75%)
ChukwaMirror of Apache Chukwa
Stars: ✭ 77 (+381.25%)
dbqCLI tool to easily decorate BigQuery table names
Stars: ✭ 13 (-18.75%)
Docker HadoopApache Hadoop docker image
Stars: ✭ 1,190 (+7337.5%)
echarts-wwwSource of Apache ECharts website
Stars: ✭ 59 (+268.75%)
SrcA light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (+318.75%)
Bigquery GrafanaGoogle BigQuery Datasource Plugin for Grafana.
Stars: ✭ 188 (+1075%)
Drone🍰 The missing library manager for Android Developers
Stars: ✭ 512 (+3100%)
WaimakWaimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (+275%)
YauaaYet Another UserAgent Analyzer
Stars: ✭ 472 (+2850%)
moadsd-ngThe MOADSD-NG project provides a simple way to set up a hybrid cloud security demo, playground, and learning environment in the cloud.
Stars: ✭ 13 (-18.75%)
QuixQuix Notebook Manager
Stars: ✭ 184 (+1050%)
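Bdp DataplatformBig data ecosystem solution data platform: a big data solution built on big data, data platform, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platform, data platform collection, data platform storage, data platform computing, data platform development, and data platform application building.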
Stars: ✭ 456 (+2750%)
ScioA Scala API for Apache Beam and Google Cloud Dataflow.
Stars: ✭ 2,247 (+13943.75%)
YanagishimaWeb UI for Trino, Presto, Hive, Elasticsearch, SparkSQL
Stars: ✭ 424 (+2550%)
astroAstro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+393.75%)
GCPEditorProAmazingly fast and simple ground control points interface. ◎
Stars: ✭ 33 (+106.25%)