PaperScraper: A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
Stars: ✭ 63 (+21.15%)
PRTS: Unofficial Python implementation of "Precision and Recall for Time Series".
Stars: ✭ 36 (-30.77%)
linkextractor: A Docker tutorial using a link extraction application example
Stars: ✭ 41 (-21.15%)
halfstaff: 🇺🇸 Is the US flag at half-staff?
Stars: ✭ 22 (-57.69%)
investigation-amazon-brands: Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Takes the Buy Box, it Doesn’t Give it up"
Stars: ✭ 56 (+7.69%)
Hadoop cookbook: Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (+57.69%)
actor-scraper: House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
Stars: ✭ 83 (+59.62%)
heroshi: Heroshi – open source web crawler.
Stars: ✭ 51 (-1.92%)
Camus: Mirror of Linkedin's Camus
Stars: ✭ 81 (+55.77%)
tableau-scraping: Tableau scraper python library. R and Python scripts to scrape data from Tableau viz
Stars: ✭ 91 (+75%)
Tablesaw: Java dataframe and visualization library
Stars: ✭ 2,785 (+5255.77%)
Docker Spark: 🚢 Docker image for Apache Spark
Stars: ✭ 78 (+50%)
Scikit Posthocs: Multiple Pairwise Comparisons (Post Hoc) Tests in Python
Stars: ✭ 186 (+257.69%)
Master-Thesis: Deep Reinforcement Learning in Autonomous Driving: the A3C algorithm used to make a car learn to drive in TORCS; Python 3.5, Tensorflow, tensorboard, numpy, gym-torcs, ubuntu, latex
Stars: ✭ 33 (-36.54%)
Ee Outliers: Open-source framework to detect outliers in Elasticsearch events
Stars: ✭ 172 (+230.77%)
Tf Yarn: Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (+46.15%)
Gitinspector: 📊 The statistical analysis tool for git repositories
Stars: ✭ 2,058 (+3857.69%)
query2report: Query2Report is a simple open source business intelligence platform that allows users to build report/dashboard for business analytics or enterprise reporting
Stars: ✭ 43 (-17.31%)
Methylkit: R package for DNA methylation analysis
Stars: ✭ 116 (+123.08%)
Docker Hadoop: Apache Hadoop docker image
Stars: ✭ 1,190 (+2188.46%)
Pycm: Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+1969.23%)
Time-Series-Forecasting: Rainfall analysis of Maharashtra - Season/Month wise forecasting. Different methods have been used. The main goal of this project is to increase the performance of forecasted results during rainy seasons.
Stars: ✭ 27 (-48.08%)
Dataframe: C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types, continuous memory storage, and no pointers are involved
Stars: ✭ 828 (+1492.31%)
platys-modern-data-platform: Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
Stars: ✭ 35 (-32.69%)
Devops Bash Tools: 550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (+334.62%)
Scriptis: Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+1238.46%)
Src: A light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (+28.85%)
mitre: The Microbiome Interpretable Temporal Rule Engine
Stars: ✭ 37 (-28.85%)
ARIMA: Simple python example on how to use ARIMA models to analyze and predict time series.
Stars: ✭ 169 (+225%)
treecut: Find nodes in hierarchical clustering that are statistically significant
Stars: ✭ 26 (-50%)
Waimak: Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (+15.38%)
Linkedin-Client: Web scraper for grabbing data from Linkedin profiles or company pages (personal project)
Stars: ✭ 42 (-19.23%)
AppliedStats: A repo with homeworks and labs from a course on applied stats taken by me during my bachelor's degree in MIPT, Ru. Course authors: Andrii Hraboviy, @andriygav and Oleg Bakhteev, @bahleg.
Stars: ✭ 16 (-69.23%)
pTFCE: Probabilistic Threshold-Free Cluster Enhancement of Neuroimages
Stars: ✭ 29 (-44.23%)
Stock-Market-Predictor: Stock Market Predictor with LSTM network. Web scraping and analyzing tools (ohlc, mean)
Stars: ✭ 28 (-46.15%)
Applied Reinforcement Learning: Reinforcement Learning and Decision Making tutorials explained at an intuitive level and with Jupyter Notebooks
Stars: ✭ 229 (+340.38%)
Hadoop Solr: Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-1.92%)
automation-scripts: Simple scripts that I'm using to automate the boring things.
Stars: ✭ 14 (-73.08%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+1871.15%)
Mlogger: a lightweight and simple logger for Machine Learning
Stars: ✭ 122 (+134.62%)
OSCI: Open Source Contributor Index
Stars: ✭ 107 (+105.77%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+1117.31%)
sparklanes: A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-67.31%)
hypothetical: Hypothesis and statistical testing in Python
Stars: ✭ 49 (-5.77%)
rust-vcd: Read and write VCD (Value Change Dump) files in Rust
Stars: ✭ 23 (-55.77%)
hive-jdbc-driver: An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (-40.38%)