Bigslice
A serverless cluster computing system for the Go programming language
Stars: ✭ 469 (+149.47%)
solr-zkutil
Solr Cloud and ZooKeeper CLI
Stars: ✭ 14 (-92.55%)
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1168.62%)
Smartcode
SmartCode = IDataSource -> IBuildTask -> IOutput => Build Everything!!!
Stars: ✭ 464 (+146.81%)
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (-75%)
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-57.98%)
django-calaccess-raw-data
A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
Stars: ✭ 61 (-67.55%)
Pglogical
Logical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (+142.02%)
sync-engine-example
Synchronization Algorithm Exploration: Techniques to synchronize a SQL database with external destinations.
Stars: ✭ 17 (-90.96%)
Xbin Store
A distributed B2C store modeled on well-known Chinese B2C websites, using Spring Boot auto-configuration for Dubbox / MVC / MyBatis / Druid / Solr / Redis and more. For the Spring Cloud version, see
Stars: ✭ 2,140 (+1038.3%)
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (-69.68%)
Rsolr
A Ruby client for Apache Solr
Stars: ✭ 416 (+121.28%)
luke
Please use the Luke bundled with Lucene! This repo is now archived and frozen.
Stars: ✭ 101 (-46.28%)
Data Story
A visual process builder for Laravel
Stars: ✭ 71 (-62.23%)
yasa
Yet Another Solr Admin
Stars: ✭ 48 (-74.47%)
Spark Solr
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Stars: ✭ 411 (+118.62%)
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (-71.81%)
Riko
A Python stream processing engine modeled after Yahoo! Pipes
Stars: ✭ 1,571 (+735.64%)
GeoParser
Extract and visualize locations from any file
Stars: ✭ 48 (-74.47%)
Janusgraph
JanusGraph: an open-source, distributed graph database
Stars: ✭ 4,277 (+2175%)
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+248.4%)
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-61.17%)
csv-cruncher
Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (-82.98%)
Datacleaner
The premier open source Data Quality solution
Stars: ✭ 391 (+107.98%)
x
Commerce Search & Discovery frontend web components
Stars: ✭ 54 (-71.28%)
Mara Example Project 2
An example mini data warehouse for Python project stats, template for new projects
Stars: ✭ 154 (-18.09%)
solr-container
Ansible Container project that manages the lifecycle of Apache Solr on Docker.
Stars: ✭ 17 (-90.96%)
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+97.87%)
solr-vector-scoring
Vector Plugin for Solr: calculate dot product / cosine similarity on documents
Stars: ✭ 28 (-85.11%)
Blog
A clean, responsive blog system
Stars: ✭ 72 (-61.7%)
solr wrapper
Wrap your tests with Solr 5+
Stars: ✭ 22 (-88.3%)
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+97.87%)
Kiba
Data processing & ETL framework for Ruby
Stars: ✭ 1,618 (+760.64%)
eyy-indexer
An image- and video-friendly directory indexer for web directories.
Stars: ✭ 53 (-71.81%)
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+92.55%)
YaEtl
Yet Another ETL in PHP
Stars: ✭ 60 (-68.09%)
Vectorsinsearch
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015.
Stars: ✭ 71 (-62.23%)
argo
The administrative discovery interface for Stanford's Digital Object Registry
Stars: ✭ 19 (-89.89%)
Dataform
Dataform is a framework for managing SQL-based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+81.91%)
zdh server
zdh data collection platform, ETL processing service
Stars: ✭ 53 (-71.81%)
Unnpk
Unpacks NPK files from NetEase's NeoX game engine, such as those of Onmyoji and A Certain Magical Index.
Stars: ✭ 171 (-9.04%)
FlowMaster
ETL flow framework based on YAML configs in Python
Stars: ✭ 19 (-89.89%)
Scout Extended
Scout Extended: The Full Power of Algolia in Laravel
Stars: ✭ 330 (+75.53%)
clojureranker
Tune Solr rankings with Clojure code.
Stars: ✭ 13 (-93.09%)
Awesome Solr
A curated list of Awesome Apache Solr links and resources.
Stars: ✭ 69 (-63.3%)
iex-stocks
ETL for the IEX Stocks API
Stars: ✭ 19 (-89.89%)
Smooks
An extensible Java framework for building XML and non-XML streaming applications
Stars: ✭ 293 (+55.85%)
neo4j-jdbc
JDBC driver for Neo4j
Stars: ✭ 110 (-41.49%)
Guidepages
Guide pages / first-install onboarding pages / gradient guide pages / app introduction pages / feature introduction pages
Stars: ✭ 119 (-36.7%)
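Zheng
A distributed agile development system architecture based on Spring + SpringMVC + MyBatis. It provides a full set of common microservice modules: centralized permission management (single sign-on), content management, payment center, user management (with third-party login support), WeChat platform, storage system, configuration center, log analysis, tasks and notifications, and more. It supports service governance, monitoring, and tracing, and aims to give small and medium-sized enterprises a comprehensive J2EE enterprise development solution.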
Stars: ✭ 16,163 (+8497.34%)
Pilosa
Pilosa is an open source, distributed bitmap index that dramatically accelerates queries across multiple, massive data sets.
Stars: ✭ 2,224 (+1082.98%)
Metl
Metl is a simple, web-based integration platform that supports several styles of data integration, including messaging, file-based Extract/Transform/Load (ETL), and remote procedure invocation via Web Services. Read more at www.jumpmind.com/products/metl/overview
Stars: ✭ 185 (-1.6%)
Bitcoin Etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
Stars: ✭ 174 (-7.45%)
Code4java
Repository for my Java projects.
Stars: ✭ 164 (-12.77%)
Mara Pipelines
A lightweight, opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+879.26%)
Od
Czech open data
Stars: ✭ 99 (-47.34%)
Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+350.53%)