369 open source projects that are alternatives to or similar to Sqoop

Mockneat
MockNeat is a Java 8+ library that facilitates the generation of arbitrary data for your applications.
Stars: ✭ 410 (-49.82%)
Mutual labels:  big-data
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-55.81%)
Mutual labels:  big-data
Onlinestats.jl
Single-pass algorithms for statistics
Stars: ✭ 507 (-37.94%)
Mutual labels:  big-data
Listenbrainz Server
Server for the ListenBrainz project
Stars: ✭ 420 (-48.59%)
Mutual labels:  big-data
Parquet Cpp
Apache Parquet
Stars: ✭ 339 (-58.51%)
Mutual labels:  big-data
Couchdb
Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+532.31%)
Mutual labels:  big-data
Ignite
Apache Ignite
Stars: ✭ 4,027 (+392.9%)
Mutual labels:  big-data
Kafka Streams
equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (-24.97%)
Mutual labels:  big-data
Vespa
The open big data serving engine. https://vespa.ai
Stars: ✭ 3,747 (+358.63%)
Mutual labels:  big-data
Fit Sne
Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)
Stars: ✭ 485 (-40.64%)
Mutual labels:  big-data
Circosjs
d3 library to build circular graphs
Stars: ✭ 436 (-46.63%)
Mutual labels:  big-data
Tez
Apache Tez
Stars: ✭ 313 (-61.69%)
Mutual labels:  big-data
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+549.33%)
Mutual labels:  big-data
Opendata.cern.ch
Source code for the CERN Open Data portal
Stars: ✭ 411 (-49.69%)
Mutual labels:  big-data
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (-22.89%)
Mutual labels:  big-data
Kafka Connect Hdfs
Kafka Connect HDFS connector
Stars: ✭ 400 (-51.04%)
Mutual labels:  big-data
Arkime
Arkime (formerly Moloch) is an open source, large scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+511.26%)
Mutual labels:  big-data
Hive
Apache Hive
Stars: ✭ 4,031 (+393.39%)
Mutual labels:  big-data
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (-8.81%)
Mutual labels:  big-data
Sylph
Stream computing platform for bigdata
Stars: ✭ 362 (-55.69%)
Mutual labels:  big-data
Pgm Index
🏅State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes
Stars: ✭ 499 (-38.92%)
Mutual labels:  big-data
Attic Apex Core
Mirror of Apache Apex core
Stars: ✭ 346 (-57.65%)
Mutual labels:  big-data
Oozie
Mirror of Apache Oozie
Stars: ✭ 602 (-26.32%)
Mutual labels:  big-data
Grouparoo
🦘 The Grouparoo Monorepo - open source customer data sync framework
Stars: ✭ 334 (-59.12%)
Mutual labels:  big-data
Hazelcast
Open-source distributed computation and storage platform
Stars: ✭ 4,662 (+470.62%)
Mutual labels:  big-data
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+2598.65%)
Mutual labels:  big-data
Uproot3
ROOT I/O in pure Python and NumPy.
Stars: ✭ 312 (-61.81%)
Mutual labels:  big-data
Scanner
Efficient video analysis at scale
Stars: ✭ 569 (-30.35%)
Mutual labels:  big-data
Cortx
CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (-47.86%)
Mutual labels:  big-data
Samza
Mirror of Apache Samza
Stars: ✭ 676 (-17.26%)
Mutual labels:  big-data
Datascience Ai Machinelearning Resources
Alex Castrounis' curated set of resources for artificial intelligence (AI), machine learning, data science, internet of things (IoT), and more.
Stars: ✭ 414 (-49.33%)
Mutual labels:  big-data
Nipype
Workflows and interfaces for neuroimaging packages
Stars: ✭ 557 (-31.82%)
Mutual labels:  big-data
Cogcomp Nlp
CogComp's Natural Language Processing libraries and demos
Stars: ✭ 410 (-49.82%)
Mutual labels:  big-data
Storm
Mirror of Apache Storm
Stars: ✭ 6,297 (+670.75%)
Mutual labels:  big-data
Decentralized Internet
An SDK/library for decentralized web and distributed computing projects
Stars: ✭ 406 (-50.31%)
Mutual labels:  big-data
Thrill
Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
Stars: ✭ 528 (-35.37%)
Mutual labels:  big-data
Orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (-52.39%)
Mutual labels:  big-data
Sdc
Intel® Scalable Dataframe Compiler for Pandas*
Stars: ✭ 623 (-23.75%)
Mutual labels:  big-data
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+366.71%)
Mutual labels:  big-data
Beam
Apache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (+530.23%)
Mutual labels:  big-data
Halodb
A fast, log structured key-value store.
Stars: ✭ 370 (-54.71%)
Mutual labels:  big-data
Titanoboa
Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.
Stars: ✭ 787 (-3.67%)
Mutual labels:  big-data
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (-55.69%)
Mutual labels:  big-data
Magellan
Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (-37.94%)
Mutual labels:  big-data
Bigtop
Mirror of Apache Bigtop
Stars: ✭ 356 (-56.43%)
Mutual labels:  big-data
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+592.29%)
Mutual labels:  big-data
Devops Roadmap
DevOps methodology & roadmap for a devops developer in 2019. Interesting books to learn new technologies.
Stars: ✭ 349 (-57.28%)
Mutual labels:  big-data
Stream Framework
Stream Framework is a Python library, which allows you to build news feed, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:
Stars: ✭ 4,576 (+460.1%)
Mutual labels:  big-data
Stroom
Stroom is a highly scalable data storage, processing and analysis platform.
Stars: ✭ 344 (-57.89%)
Mutual labels:  big-data
Cython
The most widely used Python to C compiler
Stars: ✭ 6,588 (+706.36%)
Mutual labels:  big-data
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (-59.61%)
Mutual labels:  big-data
Redislite
Redis in a python module.
Stars: ✭ 464 (-43.21%)
Mutual labels:  big-data
Beeva Best Practices
Best Practices and Style Guides in BEEVA
Stars: ✭ 335 (-59%)
Mutual labels:  big-data
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+574.79%)
Mutual labels:  big-data
Courses
Quizzes and assignments from Coursera
Stars: ✭ 454 (-44.43%)
Mutual labels:  big-data
Parquet Format
Apache Parquet
Stars: ✭ 800 (-2.08%)
Mutual labels:  big-data
Rakam Api
📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)
Stars: ✭ 772 (-5.51%)
Mutual labels:  big-data
Sciblog support
Support content for my blog
Stars: ✭ 694 (-15.06%)
Mutual labels:  big-data
Giraph
Mirror of Apache Giraph
Stars: ✭ 569 (-30.35%)
Mutual labels:  big-data
Conjure Up
Deploying complex solutions, magically.
Stars: ✭ 454 (-44.43%)
Mutual labels:  big-data