All Projects → uber → uberscriptquery

uber / uberscriptquery

Licence: Apache-2.0 and 2 other licenses found Licenses found Apache-2.0 LICENSE Unknown LICENSE.txt GPL-2.0 LICENSE_GPL
UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy

Programming Languages

java
68154 projects - #9 most used programming language
ANTLR
299 projects
scala
5932 projects

Labels

Projects that are alternatives of or similar to uberscriptquery

Arcarmovement
This is navigation example on google map. Here Marker move as vehicles moves with turns as uber does in their app. Using old and new coordinates animating bearing value the markers are moving.
Stars: ✭ 137 (+153.7%)
Mutual labels:  uber
Makisu
Fast and flexible Docker image building tool, works in unprivileged containerized environments like Mesos and Kubernetes.
Stars: ✭ 2,409 (+4361.11%)
Mutual labels:  uber
ridesharing-ios
Ridesharing driver & rider sample apps using HyperTrack SDK
Stars: ✭ 97 (+79.63%)
Mutual labels:  uber
Uber React
Uber-like project in React Native
Stars: ✭ 151 (+179.63%)
Mutual labels:  uber
Expo Uber
Uber UI Clone with React Native & Expo
Stars: ✭ 186 (+244.44%)
Mutual labels:  uber
H3
Hexagonal hierarchical geospatial indexing system
Stars: ✭ 3,167 (+5764.81%)
Mutual labels:  uber
Uber Car Animation Android
An example project to demonstrate how to Add Uber Like Car Animation in Android App
Stars: ✭ 134 (+148.15%)
Mutual labels:  uber
deck.gl-time-series-widget
A React Time Slider implementation for DECK.GL - (non)temporal data - by CPU filtering ⌛
Stars: ✭ 19 (-64.81%)
Mutual labels:  uber
Atari Model Zoo
A binary release of trained deep reinforcement learning models trained in the Atari machine learning benchmark, and a software release that enables easy visualization and analysis of models, and comparison across training algorithms.
Stars: ✭ 198 (+266.67%)
Mutual labels:  uber
UberCarAnimation
This app is for animating a car like uber from one position to another with preserving angle and smooth animation
Stars: ✭ 53 (-1.85%)
Mutual labels:  uber
Luma.gl
High-performance Toolkit for WebGL-based Data Visualization
Stars: ✭ 1,928 (+3470.37%)
Mutual labels:  uber
Ridesharing Android
Ridesharing driver & rider sample apps using HyperTrack SDK
Stars: ✭ 186 (+244.44%)
Mutual labels:  uber
Fastlane
🚗 Book Uber from your menubar, fast using Electron, React and Redux 🍟
Stars: ✭ 253 (+368.52%)
Mutual labels:  uber
H3 Java
Java bindings for H3, a hierarchical hexagonal geospatial indexing system
Stars: ✭ 150 (+177.78%)
Mutual labels:  uber
Uber-Like-Cab-Service
Uber Like Android App with PHP backend
Stars: ✭ 123 (+127.78%)
Mutual labels:  uber
Big Companies Interview Questions
A curated list of previous asked Interview Question at Big Companies and Startups 🤲 🏆
Stars: ✭ 135 (+150%)
Mutual labels:  uber
Uber Mobile Web
This is an attempt to mock the new uber app into mobile web to learn how react can be used to optimize for mobile web.
Stars: ✭ 235 (+335.19%)
Mutual labels:  uber
libretaxi2
Open source Uber PoC #deleteuber
Stars: ✭ 90 (+66.67%)
Mutual labels:  uber
dailycodingproblem
Solutions to Daily Coding Problem questions
Stars: ✭ 26 (-51.85%)
Mutual labels:  uber
Vis Academy
A set of tutorials on how our frameworks make effective data visualization applications.
Stars: ✭ 254 (+370.37%)
Mutual labels:  uber

UberScriptQuery

UberScriptQuery is a script query wrapper to run Spark SQL jobs.

Why did we build this? Apache Spark is a great tool to do data processing, yet people usually end up writing many similar Spark jobs. There is substantial development cost to write and maintain all these jobs. Additionally, Spark is still mostly for developers, and other people such as data analysts or data scientists may still feel that Spark has a steep learning curve.

To make Spark easier, we define a high level SQL-like DSL (Domain Specific Language) on top of Spark. People can use that DSL to write Spark jobs without worrying about Spark internal details. Another benefit to define such a DSL is to break up complicated logic or SQL query to a declarative script, which is easy to review and maintain. Our result? UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy.

DSL Example

The following is a quick example for the UberScriptQuery DSL. It queries data from a MySQL database and Hadoop file, joins them together, and saves the result to another MySQL data table.

-- Define variables
datePath = '2017/01/10';
  
-- Load data from mysql
clients = sql jdbc set connectionString='jdbc:mysql://server:3306/database';
  select clientId, clientName from dim_client;
  
-- Load data from hadoop
orders = file json hdfs:///dir/to/files/${datePath}/*;
  
-- Join data from two tables
result = select clientName, productName, totalCount, orderId, orderDescription 
  from orders
  join clients on clients.clientId = orders.clientId;
  
-- Write result to mysql output table
-- The output table will be auto created if not exist, and will have:
-- Primary Key Columns: clientName,productName,orderId
-- Index Columns: clientName,orderId
-- Text Columns: orderDescription
  
writeJdbc('jdbc:mysql://server:3306/database',
  'jdbcTable',
  'clientName,productName,orderId',
  'clientName,orderId',
  'orderDescription',
  'Append',
  result);
  
-- Send email via MailGun
sendMailGunEmail('https://api.mailgun.net/v3/sandbox549566ecba1d49fab0d7b53d4cfb01a4.mailgun.org/messages',
  'MailGun_ApiKey',
  '[email protected]',
  '[email protected]',
  'Email Title - Job Done',
  'Email Content - Successfully queried data on ${datePath}');

DSL Features

  1. Flexible Input/Output: it supports multiple input/output data sources with different formats, including database and Hadoop. It is also possible to add other data sources like Cassandra and Elasticsearch.

  2. Multiple SQL Statements: it allows executing multiple SQL Statements, storing temporary results in another table, and referencing them in other SQL statements. This avoids a huge complicated single SQL statement, and makes the logic very clear and easy to maintain.

  3. Variable Substitution: it allows defining variables with names/values, and substitute these variable in the script body. It also allows variable overwriting from outside of the script, so people can run the same script with different variable bindings.

  4. Custom Action: it supports Actions like writeJdbc() in the previous DSL example. It also allows users to write their own Actions and plug into the script.

  5. Upsert Result to Database: it implements an "upsert" based JDBC writer, and can insert/update database records in a single operation. This makes it easy to provide "Exactly Once" semantic support for Spark Jobs.

Quick Start

Build this project with Maven with Java 1.8:

mvn package -DskipTests

Run the following command to execute your first UberScriptQuery job:

java -cp target/UberScriptQuery-1.1.01.jar com.uber.uberscriptquery.examples.QueryExampleJob \
  -query "result = select cast(unix_timestamp() as timestamp) as time, 'Hello World' as message; printTable(result);" 

The following is another example to run with variable overwriting (note we use '\${message}' in following command because of escaping $ in bash command, in programming code, it should be like '${message}'):

java -cp target/UberScriptQuery-1.1.01.jar com.uber.uberscriptquery.examples.QueryExampleJob \
  -query "message = 'Hello World'; result = select cast(unix_timestamp() as timestamp) as time, '\${message}' as message; printTable(result);" \
  -queryOverwrite "message = 'Hello New World';"

You could also integrate the UberScriptQuery Engine into your own code, and run the script in your own job:

QueryEngine engine = new QueryEngine();
engine.executeScript(script, sparkSession);

There are more detailed sample codes in this class:

com.uber.uberscriptquery.examples.QueryExecutionExample

Future Work

  1. Support more data sources, e.g. Cassandra and Elasticsearch.

  2. Support "upsert" into more databases like PostgreSQL. Now it only supports MySQL and H2 database.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].