Ballista: Distributed compute platform implemented in Rust and powered by Apache Arrow.
Stars: ✭ 2,274 (+893.01%)
Datacompy: Pandas and Spark DataFrame comparison for humans
Stars: ✭ 147 (-35.81%)
Spark Nlp: State of the Art Natural Language Processing
Stars: ✭ 2,518 (+999.56%)
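Technology Talk: A roundup of knowledge on commonly used technical frameworks in the Java ecosystem, open-source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, and reflections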
Stars: ✭ 12,136 (+5199.56%)
Pisces: ♓️ Fish shell plugin that helps you to work with paired symbols in the command line
Stars: ✭ 210 (-8.3%)
Spark Authorizer: A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-38.43%)
Deeplearning4j: Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learni…
Stars: ✭ 12,277 (+5261.14%)
Data science blogs: A repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-39.3%)
Scanns: A scalable nearest neighbor search library in Apache Spark
Stars: ✭ 190 (-17.03%)
Dotfiles: Config for vim, sublime, awesome, xmonad, etc.
Stars: ✭ 140 (-38.86%)
Dotfiles: 💻 macOS System Configuration with Fish, Package Control, VS Code, Repo management, Hammerspoon
Stars: ✭ 168 (-26.64%)
Sparkling Graph: SparklingGraph provides an easy-to-use set of features that lets you process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-39.3%)
Spark Excel: A Spark plugin for reading Excel files via Apache POI
Stars: ✭ 216 (-5.68%)
Isolation Forest: A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Stars: ✭ 139 (-39.3%)
Hydro: Ultra-pure, lag-free prompt with async Git status. Designed for Fish.
Stars: ✭ 137 (-40.17%)
Azuredatabricksbestpractices: Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs
Stars: ✭ 186 (-18.78%)
Getopts.fish: Parse CLI options in Fish.
Stars: ✭ 166 (-27.51%)
Apache Spark Node: Node.js bindings for Apache Spark DataFrame APIs
Stars: ✭ 136 (-40.61%)
Example Spark: Spark, Spark Streaming and Spark SQL unit testing strategies
Stars: ✭ 205 (-10.48%)
Z.lua: ⚡ A new cd command that helps you navigate faster by learning your habits.
Stars: ✭ 2,164 (+844.98%)
Abris: Avro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-43.23%)
Roaringbitmap: A better compressed bitset in Java
Stars: ✭ 2,460 (+974.24%)
Spylon Kernel: Jupyter kernel for Scala and Spark
Stars: ✭ 129 (-43.67%)
Whylogs Java: Profile and monitor your ML data pipeline end-to-end
Stars: ✭ 164 (-28.38%)
Gaffer: A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+617.03%)
Ruby Spark: Ruby wrapper for Apache Spark
Stars: ✭ 221 (-3.49%)
Feast: Feature Store for Machine Learning
Stars: ✭ 2,576 (+1024.89%)
Dotfiles: My personal dotfiles.
Stars: ✭ 162 (-29.26%)
Openuba: A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-44.54%)
Dotfiles: My [NeoVim + Tmux + Fish Shell] Setup w/ install scripts
Stars: ✭ 180 (-21.4%)
Up: Quickly navigate to a parent directory via tab-completion.
Stars: ✭ 126 (-44.98%)
Vue Info Card: Simple and beautiful card component with an elegant spark line, for VueJS.
Stars: ✭ 159 (-30.57%)
Hadoopcryptoledger: Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-44.98%)
Scala Samples: Pieces of Scala code that explain Scala syntax and related things, such as what you can do with all this
Stars: ✭ 125 (-45.41%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-46.72%)
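Sparkstreaming: 💥 🚀 Wraps sparkstreaming to adjust batch time dynamically (computation runs whenever data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.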
Stars: ✭ 179 (-21.83%)
Zparkio: Boilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-47.16%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-33.62%)
Sparkrdma: RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-6.11%)
Kinesis Sql: Kinesis Connector for Structured Streaming
Stars: ✭ 120 (-47.6%)
Ibis: A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+611.79%)
Powerline Go: A beautiful and useful low-latency prompt for your shell, written in Go
Stars: ✭ 2,299 (+903.93%)
Cube.js: 📊 Cube - Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+5132.75%)
Quill: Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (+772.49%)
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-50.22%)
Spark Practice: Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (-12.66%)
Powderkeg: Live-coding the cluster!
Stars: ✭ 152 (-33.62%)
Zoxide: A smarter cd command. Supports all major shells.
Stars: ✭ 4,422 (+1831%)
Kraps Rpc: An RPC framework leveraging the Spark RPC module
Stars: ✭ 175 (-23.58%)
Benchm Ml: A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
Stars: ✭ 1,835 (+701.31%)