apache / incubator-tez

License: Apache-2.0
Mirror of Apache Tez (Incubating)

Programming Languages

Java

Apache Tez

Apache Tez is a generic data-processing pipeline engine envisioned as a low-level engine for higher abstractions such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.

At its heart, Tez is very simple and has just two components:

  • The data-processing pipeline engine, into which one can plug input, processing and output implementations to perform arbitrary data-processing. Every 'task' in Tez has the following:
      • Input to consume key/value pairs from.
      • Processor to process them.
      • Output to collect the processed key/value pairs.
  • A master for the data-processing application, with which one can assemble the arbitrary data-processing 'tasks' described above into a task-DAG to process data as desired. The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
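
To make the two components more concrete, here is a minimal, self-contained Java sketch of the idea. It is an illustration only and does not use the actual Tez API: every name in it (PipelineSketch, KeyValue, Input, Processor, Output, Buffer, runChain, wordCount) is hypothetical, and the "master" is reduced to a loop over a linear chain of tasks, which is the simplest possible DAG. It assumes keys are strings and values are integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; this is NOT the Tez API.
final class PipelineSketch {

    /** A key/value pair flowing between tasks. */
    static final class KeyValue {
        final String key;
        final int value;
        KeyValue(String key, int value) { this.key = key; this.value = value; }
        @Override public String toString() { return key + "=" + value; }
    }

    /** Where a task consumes key/value pairs from. */
    interface Input { List<KeyValue> read(); }

    /** Where a task collects its processed key/value pairs. */
    interface Output { void write(KeyValue pair); }

    /** The logic of a task: reads from an Input, writes to an Output. */
    interface Processor { void run(Input input, Output output); }

    /** In-memory buffer: the Output of one task and the Input of the next. */
    static final class Buffer implements Input, Output {
        private final List<KeyValue> pairs = new ArrayList<>();
        @Override public void write(KeyValue pair) { pairs.add(pair); }
        @Override public List<KeyValue> read() { return new ArrayList<>(pairs); }
    }

    /** Stand-in for the master: runs tasks in order, wiring each output to the next input. */
    static List<KeyValue> runChain(Input source, List<Processor> tasks) {
        Input current = source;
        for (Processor task : tasks) {
            Buffer buffer = new Buffer();
            task.run(current, buffer);
            current = buffer;
        }
        return current.read();
    }

    /** Task 1: split each input line (carried in the key) into (word, 1) pairs. */
    static final Processor TOKENIZE = (in, out) -> {
        for (KeyValue line : in.read()) {
            for (String word : line.key.split("\\s+")) {
                if (!word.isEmpty()) {
                    out.write(new KeyValue(word, 1));
                }
            }
        }
    };

    /** Task 2: sum the values per key and emit the totals in key order. */
    static final Processor SUM = (in, out) -> {
        Map<String, Integer> totals = new TreeMap<>();
        for (KeyValue pair : in.read()) {
            totals.merge(pair.key, pair.value, Integer::sum);
        }
        totals.forEach((key, total) -> out.write(new KeyValue(key, total)));
    };

    /** Builds the two-task chain and runs it over the given lines. */
    static List<KeyValue> wordCount(String... lines) {
        Input source = () -> {
            List<KeyValue> pairs = new ArrayList<>();
            for (String line : lines) {
                pairs.add(new KeyValue(line, 0));
            }
            return pairs;
        };
        return runChain(source, Arrays.asList(TOKENIZE, SUM));
    }

    public static void main(String[] args) {
        // Prints [generic=1, is=2, simple=1, tez=2]
        System.out.println(wordCount("tez is simple", "tez is generic"));
    }
}
```

In this sketch the tasks know nothing about each other: each one sees only an Input and an Output, which is what lets a master recombine them freely. A real task-DAG would also allow branching and merging, and the master would schedule tasks across a cluster rather than run them in a loop.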