All Projects → ldaniels528 → qwery

ldaniels528 / qwery

Licence: other
A SQL-like language for performing ETL transformations.

Programming Languages

scala
5932 projects

Projects that are alternatives of or similar to qwery

DaFlow
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Stars: ✭ 24 (-14.29%)
Mutual labels:  csv, hive, avro, etl, etl-framework
Pyetl
python ETL framework
Stars: ✭ 33 (+17.86%)
Mutual labels:  csv, hive, etl, etl-framework
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (+107.14%)
Mutual labels:  query, csv, avro, s3
Choetl
ETL Framework for .NET / c# (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Stars: ✭ 372 (+1228.57%)
Mutual labels:  csv, avro, etl, etl-framework
redis-connect-dist
Real-Time Event Streaming & Change Data Capture
Stars: ✭ 21 (-25%)
Mutual labels:  csv, etl, etl-framework
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+728.57%)
Mutual labels:  avro, etl, s3
RecordParser
Zero Allocation Writer/Reader Parser for .NET Core
Stars: ✭ 155 (+453.57%)
Mutual labels:  tsv, csv, delimited
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-25%)
Mutual labels:  csv, etl, etl-framework
dswarm
an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 57 (+103.57%)
Mutual labels:  csv, etl
Rhythm-CB-Scripts
Collection of scripts for use with Carbon Black Cb Response API
Stars: ✭ 14 (-50%)
Mutual labels:  query, csv
ETL-Starter-Kit
📁 Extract, Transform, Load (ETL) 👷 refers to a process in database usage and especially in data warehousing. This repository contains a starter kit featuring ETL related work.
Stars: ✭ 21 (-25%)
Mutual labels:  hive, etl-framework
kafka-encryption
Kafka End to End Encryption
Stars: ✭ 44 (+57.14%)
Mutual labels:  kafka-consumer, kafka-producer
web-click-flow
网站点击流离线日志分析
Stars: ✭ 14 (-50%)
Mutual labels:  hive, etl
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+182.14%)
Mutual labels:  etl, s3
stock-market-scraper
Scraps historical stock market data from Yahoo Finance (https://finance.yahoo.com/)
Stars: ✭ 110 (+292.86%)
Mutual labels:  query, csv
GooglePlay-Web-Crawler
Mapreduce project by Hadoop, Nutch, AWS EMR, Pig, Tez, Hive
Stars: ✭ 18 (-35.71%)
Mutual labels:  hive, s3
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-28.57%)
Mutual labels:  athena, etl
TIL
Today I Learned
Stars: ✭ 43 (+53.57%)
Mutual labels:  query, hive
YouPlot
A command line tool that draw plots on the terminal.
Stars: ✭ 412 (+1371.43%)
Mutual labels:  tsv, csv
Addax
Addax is an open source universal ETL tool that supports most of those RDBMS and NoSQLs on the planet, helping you transfer data from any one place to another.
Stars: ✭ 615 (+2096.43%)
Mutual labels:  hive, etl

Qwery

Table of Contents

What is Qwery?

Qwery is a general-purpose programming language with a SQL-like syntax. Qwery is currently being developed as a SQL-driven tool for Extract Transform and Loading (ETL) workflows, but could be used to write many types of applications, even Games! (See demos/BreakOutDemo.sql in this repository)

Getting Started

Qwery offers 3 database connectivity options to choose from:

  1. JDBC driver (com.qwery.database.jdbc.QweryDriver) (found in ./app/jdbc-driver)
  2. Qwery CLI - commandline interface (com.qwery.database.jdbc.QweryCLI) (packaged with JDBC driver)
  3. QweryScript - SQL script executor (com.qwery.runtime.QweryScript) (found in ./app/core)
Build the Qwery JDBC driver

To build the JDBC driver:

sbt "project jdbc_driver" clean assembly

The generated binary can be found at app/jdbc-driver/target/scala-2.13/jdbc-driver-assembly-0.5.0.jar

Once you've built the JDBC driver, you ready to perform one of the following:

  • Connect to a standalone Qwery server.
  • Startup a local embedded Qwery server and connect to it.
  • Use the CLI tool to connect to any other JDBC database.
Qwery CLI and QweryScript

Start the previously created JDBC driver without arguments to enter interactive mode:

java -jar ./app/jdbc-driver/target/scala-2.13/jdbc-driver-assembly-0.5.0.jar
Entering interactive mode.
Choose one of the following startup options:
1. Start dedicated Qwery Server
2. Start CLI with embedded Qwery Server
3. Start CLI and connect to remote Qwery Server
4. Start CLI and connect to generic JDBC Database
5. Start SQL script in standalone mode
Choice (1..5)> 

From here you can run Qwery as a dedicated server or client-server. Additionally, some of the above options can be executed directly via the commandline:

  1. Start in client-server mode: java -jar app/jdbc-driver/target/scala-2.13/jdbc-driver-assembly-0.5.0.jar <port>
  2. Start in client mode: java -cp app/jdbc-driver/target/scala-2.13/jdbc-driver-assembly-0.5.0.jar <host> <port>

Once you're up and running

In the Qwery CLI, here's a little advice. The following a few commands will save you some headaches:

  1. this - produces a table containing all variables defined within the current scope.
  2. @@help - produces a table containing all available commands.
  3. @@objects - produces a table containing all created objects (tables, views, types, etc.)

Example usage:

@@help

The commands above are actually table variables, and as such, they can also write queries against them:

select * from @@help where name like 'create%'
name type description example syntax
create external function Instruction Creates an external function create external function myFunc from class('com.qwery.udf.MyFunc') using jar('/home/ldaniels/shocktrade/jars/shocktrade-0.8.jar CREATE EXTERNAL FUNCTION ?%IFNE:exists %L:name ?%z:description FROM %jc:class ?USING +?%jj:jar
create external table Instruction Creates an external table create external table customers ( customer_uid uuid, name string, address string, ingestion_date int64) input format is 'json' location is './datasets/customers/json/' CREATE EXTERNAL TABLE ?%IFNE:exists %L:name ( %P:columns ) %O {{ ?%z:description ?FIELD +?TERMINATOR +?IS +?%a:field_delimiter ?HEADERS +?%C(headerVerb
create function Instruction Creates a function create function if not exists calc_add(a int, b int) as select @a + @b CREATE FUNCTION ?%IFNE:exists %L:name ?( +?%P:params +?) ?%z:description ?AS %e:code
create index Instruction Creates a table index create index stocks_symbol on stocks (symbol) comment is 'Index on Stock symbols' CREATE INDEX ?%IFNE:exists %L:name ON %L:table ( %F:columns ) ?%z:description
create macro Instruction Creates a database MACRO create macro DUMP from template("DUMP FROM %L:source ?WHERE +?%c:condition ?LIMIT +?%e:n") comment is 'This is a convenience MACRO for selecting data' as select * from @@source where @condition = true CREATE MACRO ?%IFNE:exists %L:name FROM TEMPLATE ( %e:template ) ?%z:description ?AS %N:code
create procedure Instruction Creates a database procedure create procedure testInserts(industry string) as select symbol, name, sector, industry, summaryQuote from Customers where industry = @industry CREATE PROCEDURE ?%IFNE:exists %L:name ?( +?%P:params +?) ?%z:description ?AS %N:code
create table Instruction Creates a database table create table Stocks ( symbol varchar(8), exchange enum (AMEX, NASDAQ, NYSE, OTCBB, OTHEROTC), lastSale double, lastSaleTime date) CREATE TABLE ?%IFNE:exists %L:name ?( +?%P:columns +?) %O {{ ?%z:description ?FROM +?%V:source ?LIKE +?%L:template ?PARTITIONED +?BY +?( +?%o:partitions +?) }}
create type Instruction Creates a database type create type mood comment is 'mood type' as enum (sad, okay, happy) CREATE TYPE ?%IFNE:exists %L:name ?%z:description AS %T:type
create view Instruction Creates a view create view if not exists OilAndGas as select symbol, name, sector, industry, summaryQuote from Customers where industry = 'Oil/Gas Transmission' CREATE VIEW ?%IFNE:exists %L:ref ?%z:description AS %Q:query
create webservice Instruction Creates a webservice create webservice if not exists getStocks (symbol VARCHAR(80)) { select * from @@stocks where symbol = @symbol } CREATE WEBSERVICE ?%IFNE:exists %L:name ?( +?%P:params +?) ?%z:description ?AS %N:code

The table variable this produces a result containing all user-created variables in the current scope. Consider the following example:

set @x = 7
set @s = 'Hello World'
this
name type value capacity sizeInBytes
s VARCHAR(11) Hello World null 11
x INT32 7 null 4

Features

Qwery supports:

  1. In-memory table variables (e.g. DECLARE @@stocks TABLE(symbol VARCHAR(8), exchange VARCHAR(8), lastSale DOUBLE))
  2. Custom SQL statements via MACRO (e.g. ~ INSERT 10 RANDOM STOCKS INTO @@stocks ~)
  3. Stored Procedures (e.g. CREATE PROCEDURE processTickers(@exchange STRING) { ... })
  4. Stored Functions (e.g. CREATE FUNCTION getTickers(@symbol STRING) { ... })
  5. Stored Types (e.g. CREATE TYPE TransactionType AS TABLE (price DOUBLE, transactionTime DATE))
  6. Common Table Expressions (e.g. WITH tickerLookup (symbol, exchange) { ... })
  7. DO/WHILE statements (e.g. DO { ... } WHILE @cnt < 5)
  8. FOR statements (e.g. FOR @item IN (SELECT symbol, lastSale FROM @@stocks) { ... })
  9. IF expressions (e.g. SELECT IF(@value >= 100, 'Excellent', IF(@value >= 50, 'Satisfactory', 'Unsatisfactory')) AS rating)
  10. IF-ELSE statements (e.g. IF(@value >= 100) SELECT 'Excellent' AS rating ELSE IF(@value >= 50) SELECT 'Satisfactory' AS rating ELSE SELECT 'Unsatisfactory' AS rating)
  11. WHILE/DO statements (e.g. WHILE @cnt < 5 DO { ... })
  12. INTERSECT/SUBTRACT/UNION (e.g. SELECT @@stocks + @@mutatualFunds - @@etfs)
  13. INTERSECT/SUBTRACT/UNION (e.g. SELECT @@stocks UNION @@mutatualFunds SUBTRACT @@etfs)
  14. String Interpolation (e.g. 'name: {{ @name }}, age: {{ @age }}')
  15. PostgreSQL-style casting operator (::) (e.g. '1234'::INTEGER yields 1234)

Aggregate Queries

SELECT
   exchange AS market,
   SUM(lastSale) AS sumLastSale,
   MIN(lastSale) AS minLastSale,
   MAX(lastSale) AS maxLastSale,
   AVG(lastSale) AS avgLastSale,
   COUNT(*) AS total,
   COUNT(DISTINCT(symbol)) AS tickers
FROM `stocks.transaction_history`
GROUP BY exchange
ORDER BY market DESC
market sumLastSale minLastSale maxLastSale avgLastSale total tickers
OTCBB 6229909.28 0.00 983.82 249.39 24980 23177
NYSE 6258087.96 0.00 995.56 250.59 24973 23223
NASDAQ 6295374.46 0.00 994.70 249.98 25183 23426
AMEX 6157848.41 0.00 988.63 247.66 24864 23155

Nested Tables

CREATE TABLE `stocks.transaction_history` (
  symbol STRING(8) NOT NULL,
  exchange STRING(8) NOT NULL,
  transactions STRUCT (
    price DOUBLE,
    transactionTime DATE
  )[100]
)
INSERT INTO `stocks.transaction_history` (symbol, exchange, transactions)
VALUES ('AAPL', 'NASDAQ', '{"price":156.39, "transactionTime":"2021-08-05T19:23:11.000Z"}'),
       ('AMD', 'NASDAQ', '{"price":56.87, "transactionTime":"2021-08-05T19:23:11.000Z"}'),
       ('INTC','NYSE', '{"price":89.44, "transactionTime":"2021-08-05T19:23:11.000Z"}'),
       ('AMZN', 'NASDAQ', '{"price":988.12, "transactionTime":"2021-08-05T19:23:11.000Z"}'),
       ('SHMN', 'OTCBB', '[{"price":0.0010, "transactionTime":"2021-08-05T19:23:11.000Z"},{"price":0.0011, "transactionTime":"2021-08-05T19:23:12.000Z"}]')
SELECT symbol, exchange, TO_JSON(transactions) AS transactions
FROM `stocks.transaction_history`
ORDER BY symbol ASC
symbol exchange transactions
AAPL NASDAQ [{"__id":"0","price":"156.39","transactionTime":"Thu Aug 05 19:23:11 PDT 2021"}]
AMD NASDAQ [{"__id":"0","price":"56.87","transactionTime":"Thu Aug 05 19:23:11 PDT 2021"}]
AMZN NASDAQ [{"__id":"0","price":"988.12","transactionTime":"Thu Aug 05 19:23:11 PDT 2021"}]
INTC NYSE [{"__id":"0","price":"89.44","transactionTime":"Thu Aug 05 19:23:11 PDT 2021"}]
SELECT symbol, exchange, UNNEST(transactions)
FROM `stocks.transaction_history`
WHERE symbol = 'SHMN'
symbol exchange price transactionTime
SHMN OTCBB 0.001 Thu Aug 05 19:23:11 PDT 2021
SHMN OTCBB 0.0011 Thu Aug 05 19:23:11 PDT 2021

String Interpolation

DECLARE @name STRING
DECLARE @age INT8
DECLARE @title STRING
SET @name = 'Larry Jerry'
SET @age = 32
SET @title = 'name: {{ @name }}, age: {{ @age }}'
SELECT @title
@title
name: Larry Jerry, age: 32

Views

CREATE VIEW IF NOT EXISTS securities.stocks_view
COMMENT IS 'NYSE Stock symbols sorted by last sale'
AS
SELECT
   symbol AS ticker,
   exchange AS market,
   lastSale,
   ROUND(lastSale, 1) AS roundedLastSale,
   lastSaleTime
FROM `securities.stocks`
WHERE exchange = 'NYSE'
ORDER BY lastSale DESC
LIMIT 50
market lastSale roundedLastSale maxLastSale avgLastSale total tickers
NYSE 6229909.28 0.00 983.82 249.39 24980 23177
NYSE 6258087.96 0.00 995.56 250.59 24973 23223
NYSE 6295374.46 0.00 994.70 249.98 25183 23426
NYSE 6157848.41 0.00 988.63 247.66 24864 23155

Webservices

First, let's create a webservice.

USE examples

-- create a table
DROP TABLE IF EXISTS Travelers
CREATE TABLE Travelers (id UUID, lastName VARCHAR(12), firstName VARCHAR(12), destAirportCode VARCHAR(3))
INSERT INTO Travelers (id, lastName, firstName, destAirportCode)
VALUES ('7bd0b461-4eb9-400a-9b63-713af85a43d0', 'JONES', 'GARRY', 'SNA'), (UUID(), 'JONES', 'DEBBIE', 'SNA'),
       (UUID(), 'JONES', 'TAMERA', 'SNA'), (UUID(), 'JONES', 'ERIC', 'SNA'),
       (UUID(), 'ADAMS', 'KAREN', 'DTW'), (UUID(), 'ADAMS', 'MIKE', 'DTW'),
       (UUID(), 'JONES', 'SAMANTHA', 'BUR'), (UUID(), 'SHARMA', 'PANKAJ', 'LAX')

-- create the webservice
DROP WEBSERVICE IF EXISTS findTraveler
CREATE WEBSERVICE findTraveler (firstName varchar(80), lastName varchar(80)) {
   SELECT * FROM Travelers WHERE firstName = @firstName AND lastName = @lastName
}

Next, let's call the service from Scala:

val response = scala.io.Source.fromURL(s"http://0.0.0.0:8888/ws/examples/findTraveler?firstName=GARRY&lastName=JONES").use(_.mkString)
assert(response == """[{"__id":0,"destAirportCode":"SNA","lastName":"JONES","__src_id":0,"firstName":"GARRY","id":"7bd0b461-4eb9-400a-9b63-713af85a43d0"}]""")

MACRO Development

Qwery facilitates the creation of custom SQL commands via MACROs. Here's an example of a MACRO that bulk inserts random stock quotes:

drop macro if exists `INSERT`
create macro `INSERT` from template('INSERT %e:total RANDOM STOCKS INTO %L:myQuotes') {
    -- @cnt is a local variable
    declare @cnt: int32 = 0
    println 'Generating {{ @total }} random stock quotes...'
    do {
        set @cnt = @cnt + 1
        insert into @@myQuotes (lastSaleTime, lastSale, exchange, symbol)
        select NOW() as lastSaleTime,
               ROUND(500 * RAND(0.99), 4) as lastSale,
               ['AMEX', 'NASDAQ', 'NYSE', 'OTCBB'][rand(4)] as exchange,
               RandomString(['A' TO 'Z'], 4) as symbol
    } while @cnt < @total
    select 'INSERTED' as action, @cnt as count
}

Next, let's create a new table to store the stock quotes.

DECLARE @@stockQuotes TABLE(symbol VARCHAR(4), exchange VARCHAR(5), lastSale DOUBLE, lastSaleTime DATE)[1000]

Finally, let's execute this MACRO:

~ INSERT 10000 RANDOM STOCKS INTO @@stockQuotes ~

Above we've asked the MACRO to create 10,000 random stock quotes into our @@stockQuotes table.

MACRO Template tags
tag description / function example
? optional attribute tag ?LIMIT +?%e:limit matches LIMIT 100
+? optionally required sequence tag ?ORDER +?BY +?%o:sortFields matches ORDER BY Symbol DESC
%a a single atom %a:name matches Tickers or 'Tickers'
%aa a list of atoms %aa:name matches symbol, exchange, lastSale
%A a list of arguments %A:args matches (1,2,3)
%c a conditional expression %c:condition matches x = 1 and y = 2
%C a combo-box / choice %C(mode,INTO,OVERWRITE) matches INSERT INTO or INSERT OVERWRITE
%e a single expression %e:expression matches 2 * (x + 1)
%E a list of expressions %E:fields matches field1, 'hello', 5 + now(), ..., fieldN
%f a single field name %f:field matches lastSale
%F a list of field names %F:fields matches field1, field2, ..., fieldN
%IFE IF EXISTS clause %IFE:exists matches IF EXISTS
%IFNE IF NOT EXISTS clause %IFNE:exists matches IF NOT EXISTS
%jc a Java Class reference %jc:class matches java.util.Date
%jj a Java Jar reference %jj:jar matches libs/utils.jar
%J a JOIN clause %J:joins matches INNER JOIN stocks ON A.symbol = B.ticker
%L a location or table %L:table matches accounts or @@accounts
%LQ a location or table or sub-query %LQ:query matches accounts or @@accounts or (SELECT ...)
%N an instruction or code block %N:code matches PRINTLN 'Hello World'
%o an ordered field list %o:orderedFields matches field1 DESC, field2 ASC
%P a collection of parameters %P:params matches name STRING, age INTEGER, dob DATE
%p properties ('quoteChar'='~', 'separatorChar'=',')
%q indirect query source (queries, tables and variables) %q:source matches AddressBook or @@addressBook or ( SELECT * FROM AddressBook )
%Q direct query source (queries and variables) %Q:query matches @@addressBook or SELECT * FROM AddressBook)
%r regular expression matching %r``\\d{3,4}\\S+`` matches 123ABC`
%t a table %t:name matches Customers
%T a type %T:myType matches Decimal(20,2)
%U update field assignments %U:assign matches SET lastSale = 25, code = 'R'
%v variable reference %v:variable matches SET @variable = 5
%V insert values (queries, VALUES and variables) %V:data matches @@numbers or (SELECT ...) or VALUES (...)
%z quoted text values %z:comment matches 'This is a comment'

Development

Build Requirements

Building the application

sbt clean assembly

Running the tests

sbt clean test
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].