474 open source projects that are alternatives to or similar to spark-records

awesome-tools
curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (-53.73%)
Mutual labels:  big-data, apache-spark
Morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (+352.24%)
Mutual labels:  big-data, apache-spark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-25.37%)
Mutual labels:  big-data, apache-spark
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+114.93%)
Mutual labels:  big-data, apache-spark
Mist
Serverless proxy for Spark cluster
Stars: ✭ 309 (+361.19%)
Mutual labels:  big-data, apache-spark
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-41.79%)
Mutual labels:  big-data, apache-spark
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (+91.04%)
Mutual labels:  big-data, apache-spark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+65.67%)
Mutual labels:  big-data, apache-spark
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (+71.64%)
Mutual labels:  big-data, apache-spark
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+311.94%)
Mutual labels:  big-data, apache-spark
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (-29.85%)
Mutual labels:  big-data, apache-spark
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+220.9%)
Mutual labels:  big-data, apache-spark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+123.88%)
Mutual labels:  big-data, apache-spark
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-71.64%)
Mutual labels:  big-data, apache-spark
Parquetviewer
Simple windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+116.42%)
Mutual labels:  big-data, apache-spark
SparkProgrammingInScala
Apache Spark Course Material
Stars: ✭ 57 (-14.93%)
Mutual labels:  big-data, apache-spark
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+164.18%)
Mutual labels:  big-data, apache-spark
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (+104.48%)
Mutual labels:  big-data, apache-spark
leaflet heatmap
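A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap-drawing step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, the heatmap is also drawn with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for a good interactive experience. The drawing is currently implemented with Apache Spark; perhaps Apache Spark is not well suited to this kind of computation, or I have not designed the algorithm well, but the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git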
Stars: ✭ 13 (-80.6%)
Mutual labels:  big-data, apache-spark
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (-52.24%)
Mutual labels:  big-data, apache-spark
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+4907.46%)
Mutual labels:  big-data, apache-spark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+4226.87%)
Mutual labels:  big-data, apache-spark
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (+80.6%)
Mutual labels:  big-data, apache-spark
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+268.66%)
Mutual labels:  big-data, apache-spark
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-70.15%)
Mutual labels:  big-data, apache-spark
hazelcast-csharp-client
Hazelcast .NET Client
Stars: ✭ 98 (+46.27%)
Mutual labels:  big-data
data-viz-utils
Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (-76.12%)
Mutual labels:  big-data
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+7.46%)
Mutual labels:  big-data
geospark
bring sf to spark in production
Stars: ✭ 53 (-20.9%)
Mutual labels:  apache-spark
spark-root
Apache Spark Data Source for ROOT File Format
Stars: ✭ 28 (-58.21%)
Mutual labels:  big-data
automile-php
Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 28 (-58.21%)
Mutual labels:  big-data
corpusexplorer2.0
Corpus linguistics has never been so easy...
Stars: ✭ 16 (-76.12%)
Mutual labels:  big-data
SANSA-Stack
Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
Stars: ✭ 130 (+94.03%)
Mutual labels:  apache-spark
lidbox
End-to-end spoken language identification out of the box.
Stars: ✭ 39 (-41.79%)
Mutual labels:  big-data
RemoteShuffleService
Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (+291.04%)
Mutual labels:  big-data
dxram
A distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (-62.69%)
Mutual labels:  big-data
FlameStream
Distributed stream processing model and its implementation
Stars: ✭ 14 (-79.1%)
Mutual labels:  big-data
osm-parquetizer
A converter for the OSM PBFs to Parquet files
Stars: ✭ 71 (+5.97%)
Mutual labels:  apache-spark
sparklygraphs
Old repo for R interface for GraphFrames
Stars: ✭ 13 (-80.6%)
Mutual labels:  apache-spark
lubeck
High level linear algebra library for Dlang
Stars: ✭ 57 (-14.93%)
Mutual labels:  big-data
learning-hadoop-and-spark
Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+117.91%)
Mutual labels:  apache-spark
proxima-platform
The Proxima platform.
Stars: ✭ 17 (-74.63%)
Mutual labels:  apache-spark
nebula
A distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+12132.84%)
Mutual labels:  big-data
ngm
swissgeol.ch gives you insight in geoscientific data - above and below the surface.
Stars: ✭ 23 (-65.67%)
Mutual labels:  big-data
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (-61.19%)
Mutual labels:  big-data
nifi
Deploy a secured, clustered, auto-scaling NiFi service in AWS.
Stars: ✭ 37 (-44.78%)
Mutual labels:  big-data
merkle-db
High-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (-34.33%)
Mutual labels:  big-data
javaer-mind
Mind maps for Java programmers' advanced learning
Stars: ✭ 66 (-1.49%)
Mutual labels:  big-data
scarf
Toolkit for highly memory efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas scale datasets with millions of cells on laptop.
Stars: ✭ 54 (-19.4%)
Mutual labels:  big-data
IoT-system-PLC-data-to-InfluxDB
This project aims to provide free software to fetch data from PLCs (Siemens S7-300/400/1200/1500) and store it. The stack used is completely open source. I used InfluxDB as data storage, so the application follows the Big Data paradigm.
Stars: ✭ 26 (-61.19%)
Mutual labels:  big-data
img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Stars: ✭ 1,173 (+1650.75%)
Mutual labels:  big-data
automile-net
Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-64.18%)
Mutual labels:  big-data
metriql
The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+238.81%)
Mutual labels:  big-data
leetspeek
Open and collaborative content from leet hackers!
Stars: ✭ 11 (-83.58%)
Mutual labels:  big-data
big-data-upf
RECSM-UPF Summer School: Social Media and Big Data Research
Stars: ✭ 21 (-68.66%)
Mutual labels:  big-data
dislib
The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.
Stars: ✭ 39 (-41.79%)
Mutual labels:  big-data
awesome-coder-resources
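A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]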
Stars: ✭ 54 (-19.4%)
Mutual labels:  big-data
Real Time Social Media Mining
DevOps pipeline for Real Time Social/Web Mining
Stars: ✭ 22 (-67.16%)
Mutual labels:  big-data
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-76.12%)
Mutual labels:  big-data
phoenix-queryserver
Apache Phoenix Query Server
Stars: ✭ 33 (-50.75%)
Mutual labels:  big-data