asakusafw / Asakusafw

Licence: apache-2.0

Asakusa Framework

Asakusa is a full-stack framework for distributed/parallel computing. It provides a development platform and runtime libraries that support various distributed/parallel computing environments, such as Hadoop, Spark, and M3 for Batch Processing. Users can get the best performance for their data size by transparently switching the execution engine among MapReduce, Spark RDD, and C++ native.

Unlike query-based languages, Asakusa makes it easier to develop complex data-flow programs efficiently and comprehensively, thanks to the following components.

  • Data-flow oriented DSL

    A data-flow based approach suits the construction of DAGs (directed acyclic graphs), which map well onto distributed/parallel computing. Asakusa offers a Java-based domain-specific language (DSL) with a data-flow design, which is integrated with the compilers.

  • Compilers

    A multi-tier compiler is provided. Java-based source code is first compiled to an intermediate representation and then optimized for each execution environment: Hadoop (MapReduce), Spark (RDD), and M3 for Batch Processing (C++ native).

  • Data-Modeling language

    A data-modeling language is provided, which covers mapping to relational models, CSV files, and other data formats.

  • Test Environment

    JUnit-based unit testing and end-to-end testing are supported, and the tests are portable across execution environments. Source code, test code, and test data are fully compatible across Hadoop, Spark, M3 for Batch Processing, and others.

  • Runtime execution driver

    A transparent job execution driver is provided.

All of these features have been designed and developed with expertise gained from decades of enterprise-scale system development, and they are intended to make large-scale systems in distributed/parallel environments more robust and stable.
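
To make the data-flow and multi-tier compilation ideas above more concrete, here is a minimal sketch in plain Java. It does not use the Asakusa DSL or any Asakusa API: `FlowGraph`, `Backend`, and the operator names (`readSales`, `checkItem`, and so on) are hypothetical, chosen only for illustration. The sketch builds a small operator DAG as a stand-in for an intermediate representation, orders it topologically, and lets two made-up backends render the same ordered graph differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types belong to the Asakusa Framework API.
public class DataFlowSketch {

    /** A tiny stand-in for an intermediate representation: operators and the edges between them. */
    static final class FlowGraph {
        // LinkedHashMap keeps insertion order, so the resulting order is deterministic.
        private final Map<String, List<String>> successors = new LinkedHashMap<>();

        FlowGraph connect(String from, String to) {
            successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
            successors.computeIfAbsent(to, k -> new ArrayList<>());
            return this;
        }

        /** Kahn's algorithm: returns the operators in an order that respects every edge. */
        List<String> topologicalOrder() {
            Map<String, Integer> inDegree = new LinkedHashMap<>();
            for (String node : successors.keySet()) {
                inDegree.put(node, 0);
            }
            for (List<String> targets : successors.values()) {
                for (String target : targets) {
                    inDegree.merge(target, 1, Integer::sum);
                }
            }

            Deque<String> ready = new ArrayDeque<>();
            inDegree.forEach((node, degree) -> {
                if (degree == 0) {
                    ready.add(node);
                }
            });

            List<String> order = new ArrayList<>();
            while (!ready.isEmpty()) {
                String node = ready.remove();
                order.add(node);
                for (String target : successors.get(node)) {
                    if (inDegree.merge(target, -1, Integer::sum) == 0) {
                        ready.add(target);
                    }
                }
            }
            if (order.size() != successors.size()) {
                throw new IllegalStateException("flow contains a cycle");
            }
            return order;
        }
    }

    /** Second tier: each hypothetical backend renders the same ordered graph in its own way. */
    interface Backend {
        String plan(List<String> orderedOperators);
    }

    static final Backend MAPREDUCE = ops -> "mapreduce stages: " + String.join(" | ", ops);
    static final Backend SPARK = ops -> "spark lineage: " + String.join(" -> ", ops);

    /** A made-up flow: two inputs joined by a check, then fanned out to two outputs. */
    static FlowGraph sampleFlow() {
        return new FlowGraph()
            .connect("readSales", "checkItem")
            .connect("readItems", "checkItem")
            .connect("checkItem", "summarize")
            .connect("checkItem", "writeErrors")
            .connect("summarize", "writeSummary");
    }

    public static void main(String[] args) {
        List<String> order = sampleFlow().topologicalOrder();
        System.out.println(MAPREDUCE.plan(order));
        System.out.println(SPARK.plan(order));
    }
}
```

The flow is described once and each backend only decides how to render it, which is the general shape of a "compile once to an intermediate form, then specialize per engine" design. How Asakusa's own compilers represent and optimize flows is not covered by this sketch.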

How to build

Maven artifacts

./mvnw clean install -DskipTests

Gradle plug-ins

cd gradle
./gradlew clean [build] install

How to run tests

Maven artifacts

export HADOOP_CMD=/path/to/bin/hadoop
./mvnw test

Gradle plug-ins

cd gradle
./gradlew [clean] check

How to import projects into Eclipse

Maven artifacts

./mvnw eclipse:eclipse

Then import the existing projects into Eclipse.

If you run tests in Eclipse, enable Preferences > Java > Debug > 'Only include exported classpath entries when launching'.

Gradle plug-ins

cd gradle
./gradlew eclipse

Then import the existing projects into Eclipse.

Sub Projects

Related Projects

Resources

Bug reports, Patch contribution

License
