37 open source projects by cloudera

Real-time Query for Hadoop; mirror of Apache Impala

✭ 16

C++ java python javascript c Thrift

Sqoop has moved to Apache!

✭ 174

3. Cloudera Playbook

Cloudera deployment automation with Ansible

✭ 168

Cloudera Manager Extensibility Tools and Documentation.

✭ 146

5. Impala Tpcds Kit

TPC-DS Kit for Impala

✭ 142

The fast and fun way to write YARN applications.

✭ 132

✭ 105

8. Flink Tutorials

✭ 87

9. Python Ngrams

✭ 75

C++ native client for Impala and Hive, with Python / pandas bindings

✭ 69

✭ 61

Access Server

✭ 45

13. Cdh Package

✭ 41

Livy is an open source REST interface for interacting with Apache Spark from anywhere

✭ 942

15. Poisson sampling

✭ 10

We have moved to the Apache Incubator (https://cwiki.apache.org/FLUME/). Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.

✭ 941

17. Lucene Solr

Mirror of Apache Lucene + Solr https://github.com/apache/lucene-solr

✭ 16

Apache Kudu. Mirrored from https://github.com/apache/kudu

✭ 829

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)

✭ 625

Open source SQL Query Assistant service for Databases/Warehouses

✭ 351

python HTML javascript c Mako typescript sql autocomplete databases compose data-warehouse sql-editor query-editor sql-assistant

Crunch is an Apache TLP now, and lives at http://crunch.apache.org/

✭ 312

22. Cdh Twitter Example

Example application for analyzing Twitter data using CDH - Flume, Oozie, Hive

✭ 285

Cloudera Manager API Client

✭ 278

✭ 28

✭ 28

26. clusterdock

✭ 67

27. cdsw-training

Example Python and R code for Cloudera Data Science Workbench training

✭ 22

✭ 14

java shell PigLatin

29. native-toolchain

✭ 23

shell python C++

✭ 15

java shell python

31. thrift sasl

Thrift SASL module that implements TSaslClientTransport

✭ 17

32. seismichadoop

System for performing seismic data processing on a Hadoop cluster.

✭ 32

A collection of Custom Service Descriptors

✭ 50

34. kudu-examples

Example code for Kudu

✭ 79

35. cloudera-scripts-for-log4j

Scripts for addressing the log4j zero-day security issue

✭ 82

36. director-sdk

Cloudera Director API clients

✭ 19

37. impala-udf-samples

Sample UDFs and UDAs for Impala.

✭ 59
