Licence: bsd-3-clause

hadoop-yarn-api-python-client

Python client for Apache Hadoop® YARN API

Package documentation: yarn-api-client-python.readthedocs.org

REST API documentation: hadoop.apache.org


Compatibility

Library is compatible with Apache Hadoop 3.2.1.

If you run a different version (or a vendored variant such as Hortonworks), certain APIs might not work or might differ in implementation. If you plan to rely on a particular API long-term, make sure the Hadoop documentation does not mark it as Alpha.
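
One way to notice a version mismatch early is to compare the version the cluster reports with the one named above. The sketch below assumes a payload shaped like the Cluster Information response in the Hadoop REST documentation (a `clusterInfo` object with a `hadoopVersion` field). The function name and the suffix rule for vendored builds are illustrative assumptions, not part of this library.

```python
TESTED_HADOOP_VERSION = "3.2.1"  # version named in the Compatibility section


def matches_tested_version(cluster_info_payload, tested=TESTED_HADOOP_VERSION):
    """Return True if the reported Hadoop version is the tested one.

    Assumes a payload like {"clusterInfo": {"hadoopVersion": "..."}}.
    A version that extends the tested one with "." or "-" (as vendored
    builds may do) is also accepted; this rule is an assumption.
    """
    info = (cluster_info_payload or {}).get("clusterInfo") or {}
    reported = info.get("hadoopVersion", "")
    return (
        reported == tested
        or reported.startswith(tested + ".")
        or reported.startswith(tested + "-")
    )
```

A `False` result does not mean the client will fail, only that the APIs you depend on deserve a closer look.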

Installation

From PyPI

pip install yarn-api-client

From Anaconda (conda forge)

conda install -c conda-forge yarn-api-client

From source code

pip install git+https://github.com/CODAIT/hadoop-yarn-api-python-client.git

Enabling support for Kerberos/SPNEGO Security

  1. First option - using requests_kerberos package

To avoid deployment issues in non-Kerberized environments, the requests_kerberos dependency is optional and must be installed explicitly to enable access to a YARN console protected by Kerberos/SPNEGO.

pip install requests_kerberos

From Python code

from yarn_api_client.history_server import HistoryServer
from requests_kerberos import HTTPKerberosAuth
history_server = HistoryServer('https://127.0.0.2:5678', auth=HTTPKerberosAuth())

PS: You need a valid Kerberos ticket in the system-wide Kerberos cache before running your code; otherwise calls to the Kerberized environment will not go through. Run kinit before running the code.
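
Because requests_kerberos is optional, code that must run in both Kerberized and non-Kerberized deployments can check for the package before importing it. This is a minimal sketch; the helper name `optional_kerberos_auth` is hypothetical, and it assumes that passing `auth=None` to a client behaves the same as omitting the argument.

```python
import importlib.util


def optional_kerberos_auth():
    """Return an HTTPKerberosAuth instance if requests_kerberos is installed, else None."""
    if importlib.util.find_spec("requests_kerberos") is None:
        return None  # package not installed: no Kerberos auth object available
    from requests_kerberos import HTTPKerberosAuth
    return HTTPKerberosAuth()


auth = optional_kerberos_auth()
```

The resulting `auth` value can then be passed in the same way as `auth=HTTPKerberosAuth()` in the example above.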

  2. Second option - using gssapi package

If you want to avoid terminal calls such as kinit, you have to perform the SPNEGO handshake and obtain the ticket yourself. Full API documentation: https://pythongssapi.github.io/python-gssapi/latest/
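
As a rough illustration of the first step of such a handshake, the sketch below asks GSSAPI for an initial client token and formats it as a `Negotiate` header. It assumes credentials are already available to GSSAPI (for example through a keytab) and covers only the first round trip. Both function names are hypothetical, and how the header is handed to this library is not shown here.

```python
import base64


def negotiate_header(token):
    """Format a raw SPNEGO token (bytes) as an HTTP Authorization header."""
    encoded = base64.b64encode(token).decode("ascii")
    return {"Authorization": "Negotiate " + encoded}


def initial_spnego_token(hostname):
    """Ask GSSAPI for the first client token for the HTTP service on `hostname`.

    gssapi is imported lazily so this module still loads where the package
    is not installed. Requires usable Kerberos credentials at call time.
    """
    import gssapi

    service = gssapi.Name("HTTP@" + hostname, gssapi.NameType.hostbased_service)
    context = gssapi.SecurityContext(name=service, usage="initiate")
    return context.step()  # bytes of the initial client token
```

With credentials in place, `negotiate_header(initial_spnego_token(host))` would yield the header for the first authenticated request; servers that require mutual authentication need further `step()` calls with the server's reply.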

Usage

CLI interface

  1. First way
bin/yarn_client --help
  2. Alternative way
python -m yarn_api_client --help

Programmatic interface

from yarn_api_client import ApplicationMaster, HistoryServer, NodeManager, ResourceManager
am = ApplicationMaster('https://127.0.0.2:5678')
app_information = am.application_information('application_id')
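
Responses mirror the JSON documented for the Hadoop REST API, so post-processing is ordinary dictionary handling. The sketch below filters a Cluster Applications payload for running applications. The payload shape follows the Hadoop documentation; the sample data is made up. Obtaining the payload from a live cluster, presumably through something like `cluster_applications().data` on a ResourceManager, is an assumption about this library and is not shown.

```python
def running_application_ids(apps_payload):
    """Collect the ids of RUNNING applications from a Cluster Applications payload.

    Assumes the shape {"apps": {"app": [{"id": ..., "state": ...}, ...]}};
    "apps" may be null when the cluster has no applications.
    """
    apps = (apps_payload or {}).get("apps") or {}
    return [
        app["id"]
        for app in apps.get("app", [])
        if app.get("state") == "RUNNING"
    ]


# Hypothetical payload, for illustration only.
sample = {"apps": {"app": [
    {"id": "application_1_0001", "state": "RUNNING"},
    {"id": "application_1_0002", "state": "FINISHED"},
]}}
print(running_application_ids(sample))
```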

Changelog

1.0.2 Release

  • Add support for Python 3.8.x
  • Fix HTTPS url parsing
  • Fix JSON body request APIs
  • Handle YARN response with empty contents
  • Better logging support

1.0.1 Release

  • Passes the authorization instance to the Active RM check
  • Establishes a new (working) documentation site in readthedocs.io: yarn-api-client-python.readthedocs.io
  • Adds more Python versions (3.7 and 3.8) to the test matrix and removes 2.6.

1.0.0 Release

  • Major cleanup of API.
    • Address/port parameters have been replaced with complete endpoints (includes scheme [e.g., http or https]).
    • ResourceManager has been updated to take a list of endpoints for improved HA support.
    • ResourceManager, ApplicationMaster, HistoryServer and NodeManager have been updated with methods corresponding to the latest REST API.
  • pytest support on Windows has been provided.
  • Documentation has been updated.

NOTE: Applications using APIs relative to releases prior to 1.0 should pin their dependency on yarn-api-client to less than 1.0 and are encouraged to update to 1.0 as soon as possible.
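
For code written against the older address/port parameters, the change to complete endpoints mostly amounts to building a URL with an explicit scheme. The helper below is a hypothetical migration aid, not part of the library; the host names are placeholders, and 8088 is the usual Resource Manager web port.

```python
def to_endpoint(address, port, scheme="http"):
    """Combine a pre-1.0 style address/port pair into a full endpoint URL."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    return "{}://{}:{}".format(scheme, address, port)


# Hypothetical hosts; an HA setup would list every Resource Manager endpoint.
endpoints = [
    to_endpoint("rm1.example.com", 8088),
    to_endpoint("rm2.example.com", 8088),
]
```

A list like `endpoints` is the kind of value the updated ResourceManager is described as accepting for HA support.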

0.3.7 Release

  • Honor the configured HTTP policy when no address is provided, enabling the use of HTTPS in these cases.

0.3.6 Release

  • Extend ResourceManager to allow applications to determine resource availability prior to submission.

0.3.5 Release

  • Hotfix release to fix internal signature mismatch

0.3.4 Release

  • More flexible support for discovering Hadoop configuration including multiple Resource Managers when HA is configured
  • Properly support YARN post response codes

0.3.3 Release

  • Properly set Content-Type in PUT requests
  • Check for HADOOP_CONF_DIR env variable

0.3.2 Release

  • Make Kerberos/SPNEGO dependency optional

0.3.1 Release

  • Fix cluster_application_kill API

0.3.0 Release

  • Add support for YARN endpoints protected by Kerberos/SPNEGO
  • Moved to requests package for REST API invocation
  • Remove http_con property, as connections are now managed by requests package

0.2.5 Release

  • Fixed History REST API

0.2.4 Release

  • Added compatibility with HA enabled Resource Manager

Team

YARN API client is developed by an open community, and the current maintainers are listed below in alphabetical order:
