Devops Python Tools: 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+680.77%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+113.46%)
Sparkora: Powerful, rapid, automatic EDA and feature engineering library with a very easy-to-use API 🌟
Stars: ✭ 51 (-1.92%)
awesome-time-series: Resources for working with time series and sequence data
Stars: ✭ 178 (+242.31%)
basin: Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
Stars: ✭ 25 (-51.92%)
fireTS: A Python multivariate time series prediction library working with sklearn
Stars: ✭ 62 (+19.23%)
Spark With Python: Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+188.46%)
big data: A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-34.62%)
Atsd Use Cases: Axibase Time Series Database: Usage Examples and Research Articles
Stars: ✭ 335 (+544.23%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-25%)
pyspark-ML-in-Colab: PySpark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (-38.46%)
MLHadoop: This repository contains machine learning MapReduce code for Hadoop, written from scratch (without using any package or library). E.g. Prediction (Linear and Logistic Regression), Clustering (K-Means), Classification (KNN) etc.
Stars: ✭ 50 (-3.85%)
cobra-policytool: Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-69.23%)
copper: An open source PCB editor in Rust
Stars: ✭ 26 (-50%)
OLX Scraper: 📻 An OLX Scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-71.15%)
densenet: A PyTorch Implementation of "Densely Connected Convolutional Networks"
Stars: ✭ 50 (-3.85%)
check-engine: Data validation library for PySpark 3.0.0
Stars: ✭ 29 (-44.23%)
deep-scite: 🚣 A simple recommendation engine (by way of convolutions and embeddings) written in TensorFlow
Stars: ✭ 20 (-61.54%)
skimpy: skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console.
Stars: ✭ 236 (+353.85%)
basic-image-eda: A simple image dataset EDA tool (CLI / Code)
Stars: ✭ 51 (-1.92%)
autojs-webView: A webView implementation for autojs, supporting initialization script injection and two-way jsBridge calls
Stars: ✭ 38 (-26.92%)
pomp: R package for statistical inference using partially observed Markov processes
Stars: ✭ 88 (+69.23%)
clickhouse hadoop: Import data from ClickHouse to Hadoop with pure SQL
Stars: ✭ 26 (-50%)
jobAnalytics and search: The JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (-51.92%)
SynapseML: Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+6351.92%)
tsa-tutorial: Material for the tutorial, "Time series analysis with pandas" at T-Academy
Stars: ✭ 21 (-59.62%)
platys-modern-data-platform: Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
Stars: ✭ 35 (-32.69%)
automation-scripts: Simple scripts that I'm using to automate the boring things.
Stars: ✭ 14 (-73.08%)
clusterdock: clusterdock is a framework for creating Docker-based container clusters
Stars: ✭ 26 (-50%)
Node-js-functionalities: This repository contains useful RESTful APIs and functionality in Node.js, including tutorial code for mastering Node.js. All tutorials have been published on medium.com; the tutorials link is given below
Stars: ✭ 69 (+32.69%)
ros hadoop: Hadoop splittable InputFormat for ROS. Process rosbag with Hadoop, Spark, and other HDFS-compatible systems.
Stars: ✭ 92 (+76.92%)
leetcode-compensation: Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.
Stars: ✭ 83 (+59.62%)
flokkr: Documentation placeholder and utilities for all the other containers.
Stars: ✭ 30 (-42.31%)
DaFlow: Apache Spark-based Data Flow (ETL) framework which supports multiple read and write destinations of different types and also supports multiple categories of transformation rules.
Stars: ✭ 24 (-53.85%)
wink-statistics: Fast & numerically stable statistical analysis
Stars: ✭ 36 (-30.77%)
WaWebSessionHandler: (DISCONTINUED) Save WhatsApp Web Sessions as files and open them everywhere!
Stars: ✭ 27 (-48.08%)
IMDB-Scraper: Scrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.
Stars: ✭ 37 (-28.85%)
Pinion: Generate interactive and nice-looking diagrams for your PCBs!
Stars: ✭ 264 (+407.69%)
scraping-ebay: Scraping eBay's products using Scrapy Web Crawling Framework
Stars: ✭ 79 (+51.92%)
fsbrowser: Fast desktop client for Hadoop Distributed File System
Stars: ✭ 27 (-48.08%)
mloperator: Machine Learning Operator & Controller for Kubernetes
Stars: ✭ 85 (+63.46%)
rreddit: 𝐫⟋ Get Reddit data
Stars: ✭ 49 (-5.77%)
ts4health: Time Series Data Analysis, Visualization and Forecasting with Python for Health and Self
Stars: ✭ 17 (-67.31%)
ice-chips-verilog: IceChips is a library of all common discrete logic devices in Verilog
Stars: ✭ 78 (+50%)
AMC: Asynchronous Memory Compiler
Stars: ✭ 31 (-40.38%)
Texomer: Integrating Analysis of Cancer Genome and Transcriptome Sequencing Data
Stars: ✭ 19 (-63.46%)
aaocp: A big data project that analyzes user behavior logs
Stars: ✭ 53 (+1.92%)
PeakRDL-ipxact: Import and export IP-XACT XML register models
Stars: ✭ 21 (-59.62%)
ztaro: A complete development solution for WeChat mini programs and H5, based on taro and zoro
Stars: ✭ 37 (-28.85%)