Intel-bigdata / Oap

Licence: apache-2.0

Optimized Analytics Package for Spark* Platform

Programming Languages

scala

Labels

spark parquet

Projects that are alternatives of or similar to Oap

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+378.72%)

Mutual labels: spark, parquet

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+18.37%)

Mutual labels: spark, parquet

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (+14.58%)

Mutual labels: spark, parquet

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-91.55%)

Mutual labels: spark, parquet

Schemer

Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.

Stars: ✭ 97 (-71.72%)

Mutual labels: spark, parquet

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-83.09%)

Mutual labels: spark, parquet

Parquet Generator

Parquet file generator

Stars: ✭ 16 (-95.34%)

Mutual labels: spark, parquet

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-68.22%)

Mutual labels: spark, parquet

experiments

Code examples for my blog posts

Stars: ✭ 21 (-93.88%)

Mutual labels: spark, parquet

Awesome Ada

A curated list of awesome resources related to the Ada and SPARK programming language

Stars: ✭ 299 (-12.83%)

Mutual labels: spark

Cook

Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark

Stars: ✭ 314 (-8.45%)

Mutual labels: spark

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-13.12%)

Mutual labels: spark

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (-11.66%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.

Stars: ✭ 316 (-7.87%)

Mutual labels: spark

Elasticsearch loader

A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch

Stars: ✭ 300 (-12.54%)

Mutual labels: parquet

Parquet Cpp

Apache Parquet

Stars: ✭ 339 (-1.17%)

Mutual labels: parquet

Spark Hbase Connector

Connect Spark to HBase for reading and writing data with ease

Stars: ✭ 299 (-12.83%)

Mutual labels: spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+798.25%)

Mutual labels: spark

Scalnet

A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs

Stars: ✭ 342 (-0.29%)

Mutual labels: spark

Ytk Learn

Ytk-learn is a distributed machine learning library which implements most of popular machine learning algorithms(GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-1.75%)

Mutual labels: spark

Optimized Analytics Package for Spark* Platform (OAP)

* LEGAL NOTICE: Your use of this software and any required dependent software (the "Software Package") is subject to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party or open source software included in or with the Software Package, and your use indicates your acceptance of all such terms. Please refer to the "TPP.txt" or other similarly-named text file included with the Software Package for additional details.

* Optimized Analytics Package for Spark* Platform is licensed under Apache 2.0 (https://www.apache.org/licenses/LICENSE-2.0).

OAP is a project to optimize Spark by providing optimized implementations of packages covering cache, shuffle, a native SQL engine, MLlib, and more. This version of OAP contains optimized implementations of SQL Index and Data Source Cache (supporting DRAM and PMem), RDD Cache PMem Extension, Shuffle Remote PMem Extension, Remote Shuffle, Intel MLlib, Unified Arrow Data Source, and Native SQL Engine.

Installation Guide

Please follow the link below for the guide to compiling and installing OAP on your system.

OAP Installation Guide

User Guide

Please refer to the corresponding documents below for an introduction to using each feature.

Developer Guide

Please follow the link below for the guide for developers.

OAP Developer Guide

*Other names and brands may be claimed as the property of others.

Stars: ✭ 343
