All Projects → lmarabi → st-hadoop

lmarabi / st-hadoop

Licence: other
ST-Hadoop is an open-source MapReduce extension of Hadoop designed specially to analyze your spatio-temporal data efficiently

Programming Languages

java
68154 projects - #9 most used programming language
HTML
75241 projects
javascript
184084 projects - #8 most used programming language
ruby
36898 projects - #4 most used programming language
PHP
23972 projects - #3 most used programming language
PigLatin
29 projects

Projects that are alternatives of or similar to st-hadoop

rsgislib
Remote Sensing and GIS Software Library; python module tools for processing spatial data.
Stars: ✭ 103 (+505.88%)
Mutual labels:  spatial, spatial-analysis, spatial-data-analysis
recmap
Draw your own Rectangular Statistical Cartogram - CRAN package
Stars: ✭ 18 (+5.88%)
Mutual labels:  spatial, spatial-data-analysis
scanstatistics
An R package for space-time anomaly detection using scan statistics.
Stars: ✭ 41 (+141.18%)
Mutual labels:  spatial, spatio-temporal
Googleway
R Package for accessing and plotting Google Maps
Stars: ✭ 187 (+1000%)
Mutual labels:  spatial, spatial-analysis
mlr3spatiotempcv
Spatiotemporal resampling methods for mlr3
Stars: ✭ 43 (+152.94%)
Mutual labels:  spatial, temporal
sabre
sabre: Spatial Association Between REgionalizations
Stars: ✭ 34 (+100%)
Mutual labels:  spatial, spatial-analysis
Tinspin Indexes
Spatial index library with R*Tree, STR-Tree, Quadtree, CritBit, KD-Tree, CoverTree
Stars: ✭ 64 (+276.47%)
Mutual labels:  spatial, indexing
libzinc
Zinc is a C++ library for spatial processing.
Stars: ✭ 39 (+129.41%)
Mutual labels:  spatial, spatial-index
st dbscan
ST-DBSCAN: Simple and effective tool for spatial-temporal clustering
Stars: ✭ 82 (+382.35%)
Mutual labels:  spatio-temporal, spatio-temporal-analysis
Intro to spatial analysis
Intro to spatial analysis in R
Stars: ✭ 28 (+64.71%)
Mutual labels:  spatial, spatial-analysis
CAST
Developer Version of the R package CAST: Caret Applications for Spatio-Temporal models
Stars: ✭ 65 (+282.35%)
Mutual labels:  spatial, spatio-temporal
couchbase-index-manager
Command-line interface to manage Couchbase indexes, synchronizing them to index definitions.
Stars: ✭ 14 (-17.65%)
Mutual labels:  indexing
Spatial pre2021
This repo has been archived. The latest version of the GIS and Spatial Analysis online book is at https://github.com/mgimond/Spatial
Stars: ✭ 88 (+417.65%)
Mutual labels:  spatial
NLMR
📦 R package to simulate neutral landscape models 🏔
Stars: ✭ 57 (+235.29%)
Mutual labels:  spatial
impact-tools
Simple blueprints for change-makers
Stars: ✭ 34 (+100%)
Mutual labels:  spatial
newerhoods
A Data Clinic project that aggregates NYC Open Data at the tract-level and uses Machine Learning techniques to re-imagine neighborhood boundaries.
Stars: ✭ 36 (+111.76%)
Mutual labels:  spatial-analysis
binaural-audio-editor
This is an audio application that produces 3D binaural audio from 2D mono audio samples and positional information given by the graphical user interface. Listen to 3D audio through stereo headphones. Video Demo:https://www.youtube.com/watch?v=peF9cZSwVGw
Stars: ✭ 37 (+117.65%)
Mutual labels:  spatial
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+100%)
Mutual labels:  mapreduce
libDrive
libDrive is a Google Drive media library manager and indexer, similar to Plex, that organizes Google Drive media to offer an intuitive and user-friendly experience.
Stars: ✭ 14 (-17.65%)
Mutual labels:  indexing
Apache-Directory-Listing
A directory listing theme for Apache
Stars: ✭ 138 (+711.76%)
Mutual labels:  indexing

ST-Hadoop

ST-Hadoop is an extension to SpatialHadoop that provides efficient processing of Spatio-temporal data using MapReduce Hadoop framework. It provides an extendable spatio-temporal data types to be used in MapReduce jobs including STpoint and Interval. ST-Hadoop employs hierarchical multi-resolution spatio-temporal index structure within Hadoop Distributed File System (HDFS) as a mean of efficient retrieval of spatio-temporal data. The fact that time and space are crucial to the query processing influences our design to use two-level indexing of temporal and then spatial as a second level indexing. It also adds a lower-level of spatial indices in HDFS such as Grid file, R-tree, R+-tree, Quad-tree, and other. Some new InputFormats and RecordReaders are also provided to allow reading and processing spatio-temporal indexes efficiently in MapReduce jobs. ST-Hadoop shipped with two basic spatio-temporal operations that are implemented as efficient MapReduce jobs (i.e., Range query and join) which access ST-Hadoop spatio-temporal index. Developers can implement various spatio-temporal operations that run efficiently using ST-Hadoop data types, operations, and indexes.

How it works

ST-Hadoop is used in the same way as Hadoop. Data files are first loaded into HDFS and indexed using the stmanager command which builds the hierarchical spatio-temporal index structure of your choice of a specified resolution and lower level of a spatial index over the input file. Once the file is indexed, you can execute any of the spatio-temporal operations provided in ST-Hadoop such as range query and spatio-temporal similarity join. New operations will be added occasionally.

Install

ST-Hadoop is packaged as a single jar file which contains all the required classes to run. All operations including building the index can be accessed through this jar file and it gets automatically distributed to all slave nodes by the Hadoop framework. In addition the spatial-site.xml configuration file needs to be placed in the conf directory of your Hadoop installation. This allows you to configure the cluster accordingly.

Examples

Please visit ST-Hadoop website there are few examples of how to use ST-Hadoop operations.

Operation operator Example
spatio-temporal index stmanager ./hadoop jar st-hadoop-uber.jar stmanager /geolife_raw_data.txt /index_geolife time:year shape:edu.umn.cs.sthadoop.trajectory.GeolifeTrajectory sindex:str -no-local -overwrite
spatio-temporal range query strangequery ./hadoop jar st-hadoop-uber.jar strangequery /index_geolife /rq_result shape:edu.umn.cs.sthadoop.trajectory.GeolifeTrajectory rect:39.9111716,116.5956883,39.9119983,116.606835 interval:2008-05-01,2008-05-31 time:year -overwrite
Spatio-temporal DTW KNN dtwknn ./hadoop jar st-hadoop-uber.jar dtwknn /index_geolife /knndtw_result shape:edu.umn.cs.sthadoop.trajectory.GeolifeTrajectory interval:2008-05-01,2008-05-31 time:year k:20 traj:39.9119983,116.606835x39.9119783,116.6065483x39.9119599,116.6062649x39.9119416,116.6059899x39.9119233,116.6057282x39.9118999,116.6054783x39.9118849,116.6052366x39.9118666,116.6050099x39.91185,116.604775x39.9118299,116.604525x39.9118049,116.6042649x39.91177,116.6040166x39.9117516,116.6037583x39.9117349,116.6035066x39.9117199,116.6032666x39.9117083,116.6030232x39.9117,116.6027566x39.91128,116.5969383x39.9112583,116.5966766x39.9112383,116.5964232x39.9112149,116.5961699x39.9111933,116.5959249x39.9111716,116.5956883 -overwrite -no-local

Compile

Advanced users and contributors might like to compile ST-Hadoop on their own machines. ST-Hadoop can be compiled via Maven. First, you need to grab your own version of the source code. You can do this through git. The source code resides on github. To clone the repository, run the following command

git clone https://github.com/lmarabi/st-hadoop.git

If you do not want to use git, you can still download it as a single archive provided by github.

Once you download the source code, you need to make sure you have any and Ivy installed on your system. Please check the installation guide of Maven if you do not have it installed.

To compile ST-Hadoop, navigate to the source code and run the command:

mvn compile

This will automatically retrieve all dependencies and compile the source code.

To build a redistribution package, run the command:

mvn assembly:assembly

This Maven command will package all classes of ST-Hadoop along with the dependent jars not included in Hadoop into an archive. This archive can be used to install ST-Hadoop on any existing Hadoop cluster.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].