Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+200.74%)
lzbase62: LZ77 (LZSS) based compression algorithm in base62 for JavaScript.
Stars: ✭ 38 (-71.85%)
Hdfs: A native Go client for HDFS
Stars: ✭ 992 (+634.81%)
Maven Min Plugin: 📦 Latke application JavaScript and CSS files compression.
Stars: ✭ 5 (-96.3%)
ACCV TinyGAN: BigGAN; Knowledge Distillation; Black-Box; Fast Training; 16x compression
Stars: ✭ 62 (-54.07%)
Rumble: ⛈️ Rumble 1.11.0 "Banyan Tree" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-57.04%)
Imageoptim Cli: Make optimisation of images part of your automated build process
Stars: ✭ 3,215 (+2281.48%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-31.85%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-89.63%)
Pucket: Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-78.52%)
Snakebite: A pure Python HDFS client
Stars: ✭ 828 (+513.33%)
py-hdfs-mount: Mount HDFS with FUSE, works with Kerberos!
Stars: ✭ 13 (-90.37%)
Cloud Note: A distributed cloud note-taking application (modeled on a certain existing cloud notes service), with data stored in Redis and HBase
Stars: ✭ 71 (-47.41%)
Bigdata: 💎🔥 Big data study notes
Stars: ✭ 488 (+261.48%)
Html Minifier Terser: actively maintained fork of html-minifier - minify HTML, CSS and JS code using terser - supports ES6 code
Stars: ✭ 106 (-21.48%)
Tiledb: The Universal Storage Engine
Stars: ✭ 1,072 (+694.07%)
Hdfs Shell: HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-13.33%)
EasyCompressor: ⚡ A compression library that implements many compression algorithms such as LZ4, Zstd, LZMA, Snappy, Brotli, GZip, and Deflate. It helps you improve performance by reducing memory usage and network traffic for caching.
Stars: ✭ 167 (+23.7%)
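leaflet heatmap: A simple visualization of phone-call data from Huzhou. Assuming the data volume is too large for the browser to draw the heatmap directly, the heatmap-drawing step is moved offline for computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is also used to draw the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The drawing is currently implemented with Apache Spark; perhaps Apache Spark is not well suited to this kind of computation, or I have not designed the algorithm well, but the parallel computation is slower than single-machine computation. The Apache Spark heatmap-drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .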
Stars: ✭ 13 (-90.37%)
Compress.js: A simple JavaScript based client-side image compression algorithm
Stars: ✭ 86 (-36.3%)
fastdata-cluster: Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-85.19%)
Mcimage: Android Gradle plugin that automatically checks for big images and compresses images during the build.
Stars: ✭ 872 (+545.93%)
react-native-compressor: A lightweight library for compressing images, video, and audio with an awesome experience
Stars: ✭ 157 (+16.3%)
Tiledb Py: Python interface to the TileDB storage manager
Stars: ✭ 78 (-42.22%)
Hadoop For Geoevent: ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-96.3%)
Py7zr: 7zip in python3 with ZStandard, PPMd, LZMA2, LZMA1, Delta, BCJ, BZip2, and Deflate compressions, and AES encryption.
Stars: ✭ 110 (-18.52%)
Sparta: Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+280%)
God Of Bigdata: Focused on big data study and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+4350.37%)
Dynamometer: A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-9.63%)
Juicefs: JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Stars: ✭ 4,262 (+3057.04%)
Mybox: Easy tools for document, image, file, network, location, color, and media.
Stars: ✭ 45 (-66.67%)
Filepicker: 🔥🔥🔥 Android file and image picker; supports browsing by folder and searching by file type, and supports a custom camera
Stars: ✭ 265 (+96.3%)
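Apiproject: [https://www.sofineday.com], a Golang project development scaffold integrating best practices (gin + gorm + go-redis + mongo + cors + jwt + JSON logging library zap (supports log collection to kafka or mongo) + message queue kafka + WeChat/Alipay payment gopay + API encryption + API reverse proxy + go modules dependency management + headless crawler chromedp + makefile + binary compression + livereload hot reloading)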
Stars: ✭ 124 (-8.15%)
bigkube: Minikube for big data with Scala and Spark
Stars: ✭ 16 (-88.15%)
PLzmaSDK: PLzmaSDK is (Portable, Patched, Package, cross-P-latform) Lzma SDK.
Stars: ✭ 28 (-79.26%)
Wifi: A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-31.11%)
pylovepdf: ilovepdf.com Python API library
Stars: ✭ 52 (-61.48%)
Jsr203 Hadoop: A Java NIO file system provider for HDFS
Stars: ✭ 35 (-74.07%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+1107.41%)
csso-webpack-plugin: CSSO full restructuring minification files to serve your webpack bundles
Stars: ✭ 104 (-22.96%)
Gulp Tar: Create tarball from files
Stars: ✭ 28 (-79.26%)
pdf-scripts: 📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs
Stars: ✭ 33 (-75.56%)
Bigdata File Viewer: A cross-platform (Windows, MAC, Linux) desktop application to view common bigdata binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-36.3%)
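Bigdata Interview: 🎯 🌟 [Big data interview questions] Shares big data interview questions I collected online along with my own summarized answers. Currently includes interview question summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks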
Stars: ✭ 857 (+534.81%)
Slim: Surprisingly space efficient trie in Golang (11 bits/key; 100 ns/get).
Stars: ✭ 1,705 (+1162.96%)
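Elasticctr: ElasticCTR, the PaddlePaddle elastic-computing recommendation system, is an open-source, enterprise-grade recommendation system solution based on Kubernetes. It combines high-accuracy CTR models continuously refined in Baidu's business scenarios, the large-scale distributed training capability of the open-source PaddlePaddle framework, and an industrial-grade elastic scheduling service for sparse parameters, helping users deploy a recommendation system in a Kubernetes environment with one click. It offers high performance, industrial-grade deployment, and an end-to-end experience, and as an open-source suite it supports in-depth secondary development.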
Stars: ✭ 123 (-8.89%)
Camus: Mirror of LinkedIn's Camus
Stars: ✭ 81 (-40%)
Cluster Pack: A library on top of either pex or conda-pack to make your Python code easily available on a cluster
Stars: ✭ 23 (-82.96%)