openverse-api
The Openverse API allows programmatic access to search for CC-licensed and public domain digital media.
Stars: ✭ 41 (+51.85%)
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (+374.07%)
Sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Stars: ✭ 362 (+1240.74%)
Goodreads etl pipeline
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Stars: ✭ 793 (+2837.04%)
Agile data code 2
Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+1429.63%)
Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Stars: ✭ 24,101 (+89162.96%)
airflow-code-editor
A plugin for Apache Airflow that allows you to edit DAGs in the browser
Stars: ✭ 195 (+622.22%)
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+4325.93%)
fairflow
Functional Airflow DAG definitions.
Stars: ✭ 38 (+40.74%)
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-40.74%)
Search Ads Web Service
Online search advertisement platform & real-time campaign monitoring [Maybe Deprecated]
Stars: ✭ 30 (+11.11%)
airflow-boilerplate
A complete development environment setup for working with Airflow
Stars: ✭ 94 (+248.15%)
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+307.41%)
vim-www
Toolbox to open & search URLs from vim
Stars: ✭ 32 (+18.52%)
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (+122.22%)
pytest-notebook
A pytest plugin for regression testing and regenerating Jupyter Notebooks
Stars: ✭ 35 (+29.63%)
lazarus-beginners-guide
A book written for new Lazarus users, named "Beginners’ Guide to Lazarus IDE". Moved to: https://gitlab.com/adnan360/lazarus-beginners-guide
Stars: ✭ 26 (-3.7%)
sparkar-volts
An extensive non-reactive TypeScript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-44.44%)
pytest-faulthandler
py.test plugin that activates the fault handler module during testing
Stars: ✭ 27 (+0%)
evildork
Evildork targeting your fiancee👁️
Stars: ✭ 46 (+70.37%)
pytest-localstack
Pytest plugin for local AWS integration tests
Stars: ✭ 66 (+144.44%)
spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (+88.89%)
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (+29.63%)
pytest-pytorch
pytest plugin for a better developer experience when working with the PyTorch test suite
Stars: ✭ 36 (+33.33%)
iresearch
IResearch is a cross-platform, high-performance, document-oriented search engine library written entirely in C++, with a focus on pluggability of different ranking/similarity models
Stars: ✭ 121 (+348.15%)
ml-in-production
Practical use cases showing how to make your Machine Learning pipelines robust and reliable using Apache Airflow.
Stars: ✭ 29 (+7.41%)
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++, etc.
Stars: ✭ 95 (+251.85%)
collector-filesystem
Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.
Stars: ✭ 17 (-37.04%)
hsploit
An advanced command-line search engine for Exploit-DB
Stars: ✭ 16 (-40.74%)
python-page-object
📔 Page object design pattern implementation (Python, POM, Selenium, pytest, Travis CI)
Stars: ✭ 41 (+51.85%)
indexer4j
Simple full-text indexing and searching library for Java
Stars: ✭ 47 (+74.07%)
Spark-Ar
Resources for Spark AR
Stars: ✭ 43 (+59.26%)
airflow-dbt
Apache Airflow integration for dbt
Stars: ✭ 233 (+762.96%)
starter
Create a vertical search web application in minutes with a generator (based on ItemsAPI)
Stars: ✭ 21 (-22.22%)
flow-indexer
Flow-Indexer indexes flows found in chunked log files from bro, nfdump, syslog, or pcap files
Stars: ✭ 43 (+59.26%)
airflow-tutorial
Use Airflow to move data from multiple MySQL databases to BigQuery
Stars: ✭ 96 (+255.56%)
pytest-eth
PyTest plugin for testing smart contracts for the Ethereum blockchain.
Stars: ✭ 23 (-14.81%)
pytest-it
Decorate your pytest suite with RSpec-style pytest markers, then run `pytest --it` to see a plaintext spec of the test structure.
Stars: ✭ 26 (-3.7%)
experiments
Code examples for my blog posts
Stars: ✭ 21 (-22.22%)
pytest-snapshot
A plugin for snapshot testing with pytest.
Stars: ✭ 68 (+151.85%)
lupyne
Pythonic search engine based on PyLucene.
Stars: ✭ 61 (+125.93%)
astro
Astro allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
Stars: ✭ 79 (+192.59%)
pytest-watcher
Rerun pytest when your code changes
Stars: ✭ 60 (+122.22%)
jobAnalytics and search
The JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-7.41%)
myrepo
continuous integration rep
Stars: ✭ 41 (+51.85%)
Free-Internet-Plugin
A free Internet is a better Internet. This Chrome browser plugin removes paywalled content from Google search results.
Stars: ✭ 121 (+348.15%)
Horizon
A ZeroNet search engine
Stars: ✭ 15 (-44.44%)
code-compass
A contextual search engine for software packages built on import2vec embeddings (https://www.code-compass.com)
Stars: ✭ 33 (+22.22%)