trinodb / Trino

Licence: apache-2.0
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Trino is a fast distributed SQL query engine for big data analytics.

See the User Manual for deployment instructions and end user documentation.

Development

See DEVELOPMENT for information about code style, development process, and guidelines.

See CONTRIBUTING for contribution requirements.

Security

See the project security policy for information about reporting vulnerabilities.

Build requirements

  • Mac OS X or Linux
  • Java 11.0.11+, 64-bit
  • Docker
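The requirements above can be checked from a terminal before starting a build. The following is a minimal sketch, not part of the Trino repository: the function names version_ge, parse_java_version, and check_build_requirements are hypothetical, and the script does not verify that the JVM is 64-bit.

```shell
#!/usr/bin/env bash
# Sketch: check the build requirements listed above (OS, Java, Docker).
# All function names here are illustrative, not part of Trino.

# Returns 0 when dotted version $1 is greater than or equal to dotted version $2.
# Components are compared numerically, so 11.0.9 is older than 11.0.11.
version_ge() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}

# Extracts the version number from `java -version` output read on stdin.
parse_java_version() {
    sed -n 's/.*version "\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Prints one line per requirement and returns non-zero if any is unmet.
check_build_requirements() {
    local status=0 os java_version
    os=$(uname -s)
    case "$os" in
        Darwin|Linux) echo "OS: $os (ok)" ;;
        *) echo "OS: $os (unsupported)"; status=1 ;;
    esac

    if command -v java >/dev/null 2>&1; then
        java_version=$(java -version 2>&1 | parse_java_version)
        if [ -n "$java_version" ] && version_ge "$java_version" 11.0.11; then
            echo "Java: $java_version (ok)"
        else
            echo "Java: ${java_version:-unknown} (need 11.0.11 or newer)"
            status=1
        fi
    else
        echo "Java: not found"
        status=1
    fi

    if command -v docker >/dev/null 2>&1; then
        echo "Docker: found"
    else
        echo "Docker: not found"
        status=1
    fi
    return "$status"
}
```

Calling check_build_requirements prints a short report and returns a non-zero status when something is missing, so it can gate a build script.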

Building Trino

Trino is a standard Maven project. Simply run the following command from the project root directory:

./mvnw clean install -DskipTests

On the first build, Maven downloads all the dependencies from the internet and caches them in the local repository (~/.m2/repository), which can take a while, depending on your connection speed. Subsequent builds are faster.

Trino has a comprehensive set of tests that take a considerable amount of time to run, and are thus disabled by the above command. These tests are run by the CI system when you submit a pull request. We recommend only running tests locally for the areas of code that you change.
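One way to run tests for only the code you changed is to limit Maven to a single module with the standard -pl option and, optionally, to a single test class with the Surefire -Dtest property. The helper below is a hypothetical sketch that only prints the command, and the module path and test class in the usage comment are examples rather than recommendations. It assumes the full build above has already installed the other modules into the local repository.

```shell
#!/usr/bin/env bash
# Sketch: build the Maven command for testing one module (dry run only).
# module_test_cmd is a hypothetical helper, not part of Trino.
module_test_cmd() {
    local module=$1 test_class=${2:-}
    local cmd="./mvnw test -pl $module"
    # Restrict the run to a single test class when one is given.
    if [ -n "$test_class" ]; then
        cmd="$cmd -Dtest=$test_class"
    fi
    echo "$cmd"
}

# Example usage (module path and class name are illustrative):
#   module_test_cmd plugin/trino-tpch
#   module_test_cmd plugin/trino-tpch TestTpchMetadata
```

Printing the command instead of executing it makes it easy to review before committing to a long test run.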

Running Trino in your IDE

Overview

After building Trino for the first time, you can load the project into your IDE and run the server. We recommend using IntelliJ IDEA. Because Trino is a standard Maven project, you can easily import it into your IDE. In IntelliJ, choose Open Project from the Quick Start box, or choose Open from the File menu and select the root pom.xml file.

After opening the project in IntelliJ, double check that the Java SDK is properly configured for the project:

  • Open the File menu and select Project Structure
  • In the SDKs section, ensure that JDK 11 is selected (create one if none exists)
  • In the Project section, ensure the Project language level is set to 11

Running a testing server

The simplest way to run Trino for development is to run the TpchQueryRunner class. It will start a development version of the server that is configured with the TPCH connector. You can then use the CLI to execute queries against this server. Many other connectors have their own *QueryRunner class that you can use when working on a specific connector.

Running the full server

Trino comes with sample configuration that should work out-of-the-box for development. Use the following options to create a run configuration:

  • Main Class: io.trino.server.DevelopmentServer
  • VM Options: -ea -Dconfig=etc/config.properties -Dlog.levels-file=etc/log.properties -Djdk.attach.allowAttachSelf=true
  • Working directory: $MODULE_DIR$
  • Use classpath of module: trino-server-dev

The working directory should be the trino-server-dev subdirectory. In IntelliJ, using $MODULE_DIR$ accomplishes this automatically.

If the VM options field doesn't appear in the dialog, select Modify options and enable Add VM options.

Running the CLI

Start the CLI to connect to the server and run SQL queries:

client/trino-cli/target/trino-cli-*-executable.jar

Run a query to see the nodes in the cluster:

SELECT * FROM system.runtime.nodes;

Run a query against the TPCH connector:

SELECT * FROM tpch.tiny.region;
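
The CLI invocation and queries above can also be combined into a non-interactive command. The sketch below resolves the wildcard in the jar path and shows, in comments, how the CLI's --server and --execute options could be used. The find_cli_jar helper is hypothetical, and localhost:8080 is an assumption about where the development server listens.

```shell
#!/usr/bin/env bash
# Sketch: locate the CLI jar produced by the build.
# find_cli_jar is a hypothetical helper, not part of Trino.
find_cli_jar() {
    local dir=${1:-client/trino-cli/target}
    local jar
    for jar in "$dir"/trino-cli-*-executable.jar; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    echo "no CLI jar found in $dir; build the project first" >&2
    return 1
}

# Example usage from the project root (server address is an assumption):
#   cli=$(find_cli_jar) || exit 1
#   "$cli" --server localhost:8080 --execute 'SELECT * FROM system.runtime.nodes'
#   "$cli" --server localhost:8080 --execute 'SELECT * FROM tpch.tiny.region'
```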