Beam: Apache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (+18289.29%)
automile-net: Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-14.29%)
leetspeek: Open and collaborative content from leet hackers!
Stars: ✭ 11 (-60.71%)
logparser: Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+396.43%)
FlameStream: Distributed stream processing model and its implementation
Stars: ✭ 14 (-50%)
awesome-tools: Curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+10.71%)
beamdasm: Erlang/Elixir bytecode viewer. BEAM file disassembler extension for Visual Studio Code.
Stars: ✭ 44 (+57.14%)
iis: Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-42.86%)
couchdb-pkg: Apache CouchDB Packaging support files
Stars: ✭ 24 (-14.29%)
mmtf-spark: Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-28.57%)
HadoopDedup: 🍉 Large-scale deduplication of massive data sets, based on Hadoop and HBase
Stars: ✭ 27 (-3.57%)
bigflow: A Python framework for data processing on GCP.
Stars: ✭ 96 (+242.86%)
pyspark-algorithms: PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+157.14%)
img2dataset: Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Stars: ✭ 1,173 (+4089.29%)
xarray-beam: Distributed Xarray with Apache Beam
Stars: ✭ 83 (+196.43%)
ngm: swissgeol.ch gives you insight into geoscientific data, above and below the surface.
Stars: ✭ 23 (-17.86%)
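awesome-coder-resources: A pit stop on your programming journey! [Continuously updated... stars welcome, come back often...] [Contents: programming/learning/reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]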
Stars: ✭ 54 (+92.86%)
Quantitative-Big-Imaging-2018: (Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (+78.57%)
FIW KRT: Families In the Wild: A Kinship Recognition Toolbox.
Stars: ✭ 18 (-35.71%)
Clustering4Ever: C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+350%)
yildiz: 🦄🌟 Graph Database layer on top of Google Bigtable
Stars: ✭ 24 (-14.29%)
gan deeplearning4j: Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-32.14%)
data-viz-utils: Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (-42.86%)
nebula: A distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+29171.43%)
automile-php: Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 28 (+0%)
spacesuit: API Gateway with URL remapping
Stars: ✭ 19 (-32.14%)
IoT-system-PLC-data-to-InfluxDB: This project aims to provide free software that fetches data from PLCs (Siemens S7-300/400/1200/1500) and stores it. The stack is completely open source. I used InfluxDB as data storage, so the application follows the Big Data paradigm.
Stars: ✭ 26 (-7.14%)
lidbox: End-to-end spoken language identification out of the box.
Stars: ✭ 39 (+39.29%)
lubeck: High level linear algebra library for Dlang
Stars: ✭ 57 (+103.57%)
merkle-db: High-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (+57.14%)
nifi: Deploy a secured, clustered, auto-scaling NiFi service in AWS.
Stars: ✭ 37 (+32.14%)
metriql: The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+710.71%)
scarf: Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.
Stars: ✭ 54 (+92.86%)
dislib: The Distributed Computing library for Python, implemented using the PyCOMPSs programming model for HPC.
Stars: ✭ 39 (+39.29%)
big-data-upf: RECSM-UPF Summer School: Social Media and Big Data Research
Stars: ✭ 21 (-25%)
GDLibrary: Matlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (+78.57%)
cdp-service: CDP data platform that helps businesses fully understand their customers and deliver precisely targeted, personalized marketing.
Stars: ✭ 30 (+7.14%)
jiface: A Clojure-idiomatic wrapper around Erlang's JInterface
Stars: ✭ 27 (-3.57%)
sgd: An R package for large scale estimation with stochastic gradient descent
Stars: ✭ 55 (+96.43%)
spark-root: Apache Spark Data Source for ROOT File Format
Stars: ✭ 28 (+0%)
ytpriv: YT metadata exporter
Stars: ✭ 28 (+0%)
scikit-learn-intelex: Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Stars: ✭ 887 (+3067.86%)
lcbo-api: A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (+442.86%)
spark-records: Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (+139.29%)
RemoteShuffleService: Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (+835.71%)
dxram: A distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (-10.71%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+39.29%)