443 open source projects that are alternatives to or similar to Scanns

Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.

Stars: ✭ 530 (+178.95%)

Mutual labels: nearest-neighbor-search, spark

Linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.

Stars: ✭ 2,323 (+1122.63%)

Mutual labels: spark

Aztk

AZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure

Stars: ✭ 152 (-20%)

Mutual labels: spark

Nd4j

Fast, Scientific and Numerical Computing for the JVM (NDArrays)

Stars: ✭ 1,742 (+816.84%)

Mutual labels: spark

Powderkeg

Live-coding the cluster!

Stars: ✭ 152 (-20%)

Mutual labels: spark

Azure Cosmosdb Spark

Apache Spark Connector for Azure Cosmos DB

Stars: ✭ 165 (-13.16%)

Mutual labels: spark

Datacompy

Pandas and Spark DataFrame comparison for humans

Stars: ✭ 147 (-22.63%)

Mutual labels: spark

Spark

Firely's open source FHIR server

Stars: ✭ 174 (-8.42%)

Mutual labels: spark

Scalable Data Science Platform

Content for architecting a data science platform for products using Luigi, Spark & Flask.

Stars: ✭ 158 (-16.84%)

Mutual labels: spark

Azure Event Hubs Spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

Stars: ✭ 140 (-26.32%)

Mutual labels: spark

Elastiknn

Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.

Stars: ✭ 139 (-26.84%)

Mutual labels: nearest-neighbor-search

Quill

Compile-time Language Integrated Queries for Scala

Stars: ✭ 1,998 (+951.58%)

Mutual labels: spark

Spark Iforest

Isolation Forest on Spark

Stars: ✭ 166 (-12.63%)

Mutual labels: spark

Spark Ml Source Analysis

An analysis of the principles behind Spark ML algorithms and of their concrete source-code implementations

Stars: ✭ 1,873 (+885.79%)

Mutual labels: spark

Spark Kafka Writer

Write your Spark data to Kafka seamlessly

Stars: ✭ 175 (-7.89%)

Mutual labels: spark

Cc Pyspark

Process Common Crawl data with Python and Spark

Stars: ✭ 147 (-22.63%)

Mutual labels: spark

Whylogs Java

Profile and monitor your ML data pipeline end-to-end

Stars: ✭ 164 (-13.68%)

Mutual labels: spark

Technology Talk

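A collection of knowledge on commonly used technology frameworks in the Java ecosystem, open source middleware, system architecture, databases, architecture case studies from large companies, commonly used third-party libraries, project management, production troubleshooting, personal growth, and reflections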

Stars: ✭ 12,136 (+6287.37%)

Mutual labels: spark

Spark Streaming With Kafka

Self-contained examples of Apache Spark streaming integrated with Apache Kafka.

Stars: ✭ 180 (-5.26%)

Mutual labels: spark

Rasterframes

Geospatial Raster support for Spark DataFrames

Stars: ✭ 142 (-25.26%)

Mutual labels: spark

Vald

Vald. A Highly Scalable Distributed Vector Search Engine

Stars: ✭ 158 (-16.84%)

Mutual labels: nearest-neighbor-search

Sparkling Graph

SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.

Stars: ✭ 139 (-26.84%)

Mutual labels: spark

Faiss tips

Some useful tips for faiss

Stars: ✭ 170 (-10.53%)

Mutual labels: nearest-neighbor-search

Pgann

Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database.

Stars: ✭ 156 (-17.89%)

Mutual labels: nearest-neighbor-search

Spark On Lambda

Apache Spark on AWS Lambda

Stars: ✭ 137 (-27.89%)

Mutual labels: spark

Horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Stars: ✭ 11,943 (+6185.79%)

Mutual labels: spark

Sparkmonitor

Monitor Apache Spark from Jupyter Notebook

Stars: ✭ 154 (-18.95%)

Mutual labels: spark

Spark Structured Streaming Examples

Spark Structured Streaming / Kafka / Cassandra / Elastic

Stars: ✭ 168 (-11.58%)

Mutual labels: spark

Spark.jl

Julia binding for Apache Spark

Stars: ✭ 153 (-19.47%)

Mutual labels: spark

Xsql

Unified SQL Analytics Engine Based on SparkSQL

Stars: ✭ 176 (-7.37%)

Mutual labels: spark

Spark Tsne

Distributed t-SNE via Apache Spark

Stars: ✭ 151 (-20.53%)

Mutual labels: spark

Geopyspark

GeoTrellis for PySpark

Stars: ✭ 167 (-12.11%)

Mutual labels: spark

Benchm Ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).

Stars: ✭ 1,835 (+865.79%)

Mutual labels: spark

Roaringbitmap

A better compressed bitset in Java

Stars: ✭ 2,460 (+1194.74%)

Mutual labels: spark

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-21.05%)

Mutual labels: spark

Big Whale

Scheduling of offline jobs (Spark, Flink, etc.) and monitoring of real-time jobs

Stars: ✭ 163 (-14.21%)

Mutual labels: spark

Pyspark Learning

Updated repository

Stars: ✭ 147 (-22.63%)

Mutual labels: spark

Kraps Rpc

An RPC framework leveraging the Spark RPC module

Stars: ✭ 175 (-7.89%)

Mutual labels: spark

Spark Cassandra Connector

DataStax Spark Cassandra Connector

Stars: ✭ 1,816 (+855.79%)

Mutual labels: spark

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (-15.26%)

Mutual labels: spark

Nanopq

Pure python implementation of product quantization for nearest neighbor search

Stars: ✭ 145 (-23.68%)

Mutual labels: nearest-neighbor-search

Azuredatabricksbestpractices

Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs

Stars: ✭ 186 (-2.11%)

Mutual labels: spark

Spark Authorizer

A Spark SQL extension which provides SQL Standard Authorization for Apache Spark

Stars: ✭ 141 (-25.79%)

Mutual labels: spark

Vue Info Card

Simple and beautiful card component with an elegant spark line, for VueJS.

Stars: ✭ 159 (-16.32%)

Mutual labels: spark

Data science blogs

A repository to keep track of all the code that I end up writing for my blog posts.

Stars: ✭ 139 (-26.84%)

Mutual labels: spark

Spark Nlp

State of the Art Natural Language Processing

Stars: ✭ 2,518 (+1225.26%)

Mutual labels: spark

Ecommercerecommendsystem

A real-time big data recommendation system for products. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

Stars: ✭ 139 (-26.84%)

Mutual labels: spark

Glow

An open-source toolkit for large-scale genomic analysis

Stars: ✭ 159 (-16.32%)

Mutual labels: spark

Isolation Forest

A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.

Stars: ✭ 139 (-26.84%)

Mutual labels: spark

Tarsoslsh

A Java library implementing practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It implements Locality-sensitive Hashing (LSH) and multi index hashing for hamming space.

Stars: ✭ 179 (-5.79%)

Mutual labels: nearest-neighbor-search

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Stars: ✭ 1,821 (+858.42%)

Mutual labels: spark

Handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes

Stars: ✭ 158 (-16.84%)

Mutual labels: spark

Apache Spark Node

Node.js bindings for Apache Spark DataFrame APIs

Stars: ✭ 136 (-28.42%)

Mutual labels: spark

Deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…

Stars: ✭ 12,277 (+6361.58%)

Mutual labels: spark

Aliyun Emapreduce Datasources

Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.

Stars: ✭ 132 (-30.53%)

Mutual labels: spark

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (-20%)

Mutual labels: spark

Js Spark

Realtime calculation distributed system. AKA distributed lodash

Stars: ✭ 187 (-1.58%)

Mutual labels: spark

Kotlin Spark Api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x

Stars: ✭ 183 (-3.68%)

Mutual labels: spark

Sparkstreaming

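💥 🚀 Wraps sparkstreaming to dynamically adjust the batch time (computation runs as soon as data is available); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.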

Stars: ✭ 179 (-5.79%)

Mutual labels: spark

Transmogrifai

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning

Stars: ✭ 2,084 (+996.84%)

Mutual labels: spark

1-60 of 443 similar projects