All Projects → apache → Oodt

apache / Oodt

Licence: apache-2.0
Mirror of Apache OODT

Programming Languages

java
68154 projects - #9 most used programming language

Labels

Projects that are alternatives of or similar to Oodt

Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-78.85%)
Mutual labels:  big-data
Predictionio Template Text Classifier
Text Classification Engine
Stars: ✭ 30 (-42.31%)
Mutual labels:  big-data
Couchdb Couch
Mirror of Apache CouchDB
Stars: ✭ 43 (-17.31%)
Mutual labels:  big-data
Phoenix
Mirror of Apache Phoenix
Stars: ✭ 867 (+1567.31%)
Mutual labels:  big-data
Qcportal
A client interface to the QCArchive Project (read-only image of QCFractal)
Stars: ✭ 29 (-44.23%)
Mutual labels:  big-data
Esper Tv
Esper instance for TV news analysis
Stars: ✭ 37 (-28.85%)
Mutual labels:  big-data
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (+1544.23%)
Mutual labels:  big-data
Trck
Query engine for TrailDB
Stars: ✭ 48 (-7.69%)
Mutual labels:  big-data
Skymap
High-throughput gene to knowledge mapping through massive integration of public sequencing data.
Stars: ✭ 29 (-44.23%)
Mutual labels:  big-data
Attaca
Robust, distributed version control for large files.
Stars: ✭ 41 (-21.15%)
Mutual labels:  big-data
K8s Ingress Claim
An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains.
Stars: ✭ 14 (-73.08%)
Mutual labels:  big-data
Awesome Scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Stars: ✭ 36,688 (+70453.85%)
Mutual labels:  big-data
Analysispreservation.cern.ch
Source code for the CERN Analysis Preservation portal
Stars: ✭ 37 (-28.85%)
Mutual labels:  big-data
Dremio Oss
Dremio - the missing link in modern data
Stars: ✭ 862 (+1557.69%)
Mutual labels:  big-data
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+1871.15%)
Mutual labels:  big-data
Accumulo
Apache Accumulo
Stars: ✭ 857 (+1548.08%)
Mutual labels:  big-data
Metrics
Measure behavior of Java applications
Stars: ✭ 35 (-32.69%)
Mutual labels:  big-data
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (+1944.23%)
Mutual labels:  big-data
Traildb
TrailDB is an efficient tool for storing and querying series of events
Stars: ✭ 1,029 (+1878.85%)
Mutual labels:  big-data
Egads
A Java package to automatically detect anomalies in large scale time-series data
Stars: ✭ 997 (+1817.31%)
Mutual labels:  big-data

Welcome to Apache OODT http://oodt.apache.org/

Apache Object Oriented Data Technology (OODT) is the smart way to integrate and archive your processes, your data, and its metadata. OODT allows you to:

  • Generate Data
  • Process Data
  • Manage Your Data
  • Distribute Your Data
  • Analyze Your Data Allowing for the integration of data, computation, visualization and other components.

OODT also allows for remote execution of jobs on scalable computational infrastructures so that computational and data- intensive processing can be integrated into OODT’s data processing pipelines using cloud computing and high-performance computing environments.

Overview of OODT Main Components

Why OODT?

Traditional processing pipelines are commonly made up of custom UNIX shell scripts and fragile custom written glue code. Apache OODT uses structured XML-based capturing of the processing pipeline that can be understood and modified by non-programmers to create, edit, manage and provision workflow and task execution.

It is being used on a number of successful projects at NASA's Jet Propulsion Laboratory/California Institute of Technology, and many other research institutions and universities, specifically those part of the:

  • National Cancer Institute's (NCI's) Early Detection Research Network (EDRN) project - over 40+ institutions all performing research into discovering biomarkers which are early indicators of disease.
  • NASA's Planetary Data System (PDS) - NASA's planetary data archive, a repository and registry for all planetary data collected over the past 30+ years.
  • Various Earth Science data processing missions, including Seawinds/QuickSCAT, the Orbiting Carbon Observatory, the NPP Sounder PEATE project, and the Soil Moisture Active Passive (SMAP) mission.
  • Apache DRAT - A distributed release audit tool written on top of OODT's capabilities.

OODT is a Top Level project of the Apache Software Foundation http://www.apache.org/.


Getting Started

Useful Resources

Build from scratch

OODT is primarily written in Java, with some components available in Python. It requires Java 8 and uses the Maven 3 http://maven.apache.org/ build system. To build the Java components of OODT, use the following command in this directory:

mvn clean install

For the Python components, see the agility subdirectory.

RADiX Powered By OODT

OODT isn’t an out of the box solution, but we do try to make it as easy as possible. For that, OODT provides the RADIX build system which will compile a fully operational OODT platform ready for development and deployment. Building the RADIX distribution is as simple as running the following commands:

export JAVA_HOME=/usr/lib/jvm… (adjust for your own JAVA_HOME)
curl -s "https://git-wip-us.apache.org/repos/asf?p=oodt.git;a=blob_plain;f=mvn/archetypes/radix/src/main/resources/bin/radix;hb=HEAD" | bash
 
mv oodt oodt-src; cd oodt-src; mvn install
mkdir ../oodt; tar -xvf distribution/target/oodt-distribution-0.1-bin.tar.gz -C ../oodt
cd ../oodt; ./bin/oodt start
./resmgr/bin/batch_stub 2001

Navigate to http://localhost:8080/opsui, you should see the default OPSUI system which provides system oversight and interrogation


Contributing

To contribute a patch, follow these instructions.

  1. File JIRA issue for your fix at https://issues.apache.org/jira/browse/OODT. you will get issue id OODT-xxx where xxx is the issue ID.

  2. Fork the repo on which you're working, clone your forked repo to your local computer, and set up the upstream remote:

    git clone https://github.com/<YourGitHubUserName>/oodt.git
    git remote add upstream https://github.com/apache/oodt.git
    
  3. Go into oodt directory

    cd oodt
    
  4. Checkout out a new local branch based on your master and update it to the latest.The convention is to name the branch after the current JIRA issue, e.g. OODT-xxx where xxx is the issue ID.

    git checkout -b OODT-xxx
    
  5. Do the changes to the relavant files and keep your code clean. If you find another bug, you want to fix while being in a new branch, please fix it in a separated branch instead.

  6. Add relevant files to the staging area.

    git add <files>
    
  7. For every commit please write a short (max 72 characters) summary of the change. Use markdown syntax for simple styling. Please include any JIRA issue numbers in your summary.

    git commit -m “[OODT-xxx] Put change summary here ”
    

    NEVER leave the commit message blank! Provide a detailed, clear, and complete description of your commit!

  8. Before submitting a pull request, update your branch to the latest code.

    git checkout master
    git pull --rebase upstream master
    git checkout OODT-xxx
    git rebase -i master
    
  9. Push the code to your forked repository

    git push origin OODT-xxx
    
  10. In order to make a pull request,

  • Navigate to the OODT repository you just pushed to (e.g. https://github.com/your-user-name/oodt)
  • Click "Pull Request".
  • Write your branch name in the branch field (this is filled with "master" by default)
  • Click "Update Commit Range".
  • Ensure the changesets you introduced are included in the "Commits" tab.
  • Ensure that the "Files Changed" incorporate all of your changes.
  • Fill in some details about your potential patch including a meaningful title.
  • Click "Send pull request".

Issue Tracker

If you encounter errors in OODT or want to suggest an improvement or a new feature, please visit the OODT issue tracker https://issues.apache.org/jira/browse/OODT. There you can also find the latest information on known issues and recent bug fixes and enhancements.

Documentation

You can find an enormous amount of useful documentation/resources related to OODT in OODT Confluence Wiki.

You can build a local copy of the OODT documentation including JavaDocs using the following Maven 2 command in the OODT source directory:

mvn site

You can then open the OODT Documentation in a web browser:

./target/site/index.html

Note: all OODT source files are encoded with UTF-8. You must set your MAVEN_OPTS environment variable to include "-Dfile.encoding=UTF-8" in order to properly generate the web site and other artifacts from source.

Note: generating the documentation requires enormous amounts of memory. More than likely you'll need to add to the MAVEN_OPTS environment variable in order to set the Java heap maximum size with "-Xmx512m" or larger before attempting to run "mvn site".


License (see also LICENSE.txt)

Collective work: Copyright 2010-2012 The Apache Software Foundation.

Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Apache OODT includes a number of subcomponents with separate copyright notices and license terms. Your use of these subcomponents is subject to the terms and conditions of the licenses listed in the LICENSE.txt file.


Export control

This distribution includes cryptographic software. The country in which you currently reside may have restrictions on the import, possession, use, and/or re-export to another country, of encryption software. BEFORE using any encryption software, please check your country's laws, regulations and policies concerning the import, possession, or use, and re-export of encryption software, to see if this is permitted. See http://www.wassenaar.org/ for more information.

The U.S. Government Department of Commerce, Bureau of Industry and Security (BIS), has classified this software as Export Commodity Control Number (ECCN) 5D002.C.1, which includes information security software using or performing cryptographic functions with asymmetric algorithms. The form and manner of this Apache Software Foundation distribution makes it eligible for export under the License Exception ENC Technology Software Unrestricted (TSU) exception (see the BIS Export Administration Regulations, Section 740.13) for both object code and source code.

The following provides more details on the included cryptographic software:

Apache OODT uses Apache Tika which uses the Bouncy Castle generic
encryption libraries for extracting text content and metadata from
encrypted PDF files.  See http://www.bouncycastle.org/ for more details on
Bouncy Castle.

Mailing Lists

Discussion about OODT takes place on the following mailing lists:

[email protected]    - About using OODT and developing OODT

Notification on all code changes are sent to the following mailing list:

[email protected]

The mailing lists are open to anyone and publicly archived.

You can subscribe the mailing lists by sending a message to [email protected] (for example [email protected]). To unsubscribe, send a message to [email protected]. For more instructions, send a message to [email protected].

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].