A super-fast and scalable Random Forest library based on a fast histogram decision tree algorithm and a distributed bagging framework. It can be used for binary classification, multi-label classification, and regression tasks. The library provides both Python and command-line interfaces.

Stars: ✭ 20 (-45.95%)

Mutual labels: data-mining

cat-message

Finds cat images/videos/gifs on Reddit and sends them to my mom via AppleScript

Stars: ✭ 35 (-5.41%)

Mutual labels: scraper

gHarvester

Proof of concept for a security issue (in my opinion) that I found in accounts.google.com

Stars: ✭ 20 (-45.95%)

Mutual labels: scraper

barclayscrape

A small app to programmatically manipulate Barclays online banking

Stars: ✭ 57 (+54.05%)

Mutual labels: scraper

TextClassification

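Text classification of Sina News implemented with scikit-learn. The dataset contains 1,000,000 documents in 10 categories, split 1:1 between test and training sets. The classification algorithms are SVM and Bayes, with Bayes serving as the baseline.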

Stars: ✭ 86 (+132.43%)

Mutual labels: data-mining

ColegaDondeEstaMiTFM

A Twitter bot that shares a TFM (master's thesis) every hour until Cristina Cifuentes shows hers.

Stars: ✭ 14 (-62.16%)

Mutual labels: scraper

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Stars: ✭ 53 (+43.24%)

Mutual labels: scraper

Federal-Parliament-Scraper

A scraper for obtaining information on the workings of the Belgian Federal Parliament.

Stars: ✭ 18 (-51.35%)

Mutual labels: scraper

buptclass

A Node.js spider that gets information on empty classrooms at BUPT

Stars: ✭ 29 (-21.62%)

Mutual labels: cheerio

gochanges

**[ARCHIVED]** website changes tracker 🔍

Stars: ✭ 12 (-67.57%)

Mutual labels: scraper

trawler

Scraper for Facebook, Gab, Google and TikTok

Stars: ✭ 20 (-45.95%)

Mutual labels: scraper

scrape-github-trending

Tutorial for web scraping / crawling with Node.js.

Stars: ✭ 42 (+13.51%)

Mutual labels: cheerio

wordpress-scraper

Simple, easy-to-use scraper to scrape data from WordPress JSON API

Stars: ✭ 22 (-40.54%)

Mutual labels: scraper

scrapeer

Essential PHP library that scrapes HTTP(S) and UDP trackers for torrent information.

Stars: ✭ 81 (+118.92%)

Mutual labels: scraper

scripts

A collection of random scripts I coded up

Stars: ✭ 17 (-54.05%)

Mutual labels: scraper

teanaps

An open-source Python library for natural language processing and text analysis.

Stars: ✭ 91 (+145.95%)

Mutual labels: data-mining

conferencias matutinas amlo

CSVs of the stenographic transcripts of the morning press conferences of President Andrés Manuel López Obrador (Mañaneras AMLO)

Stars: ✭ 25 (-32.43%)

Mutual labels: data-mining

sciblox

sciblox - Easier Data Science and Machine Learning

Stars: ✭ 48 (+29.73%)

Mutual labels: data-mining

INMET-API-temperature

Crawler for meteorological data from INMET conventional stations (BDMEP)

Stars: ✭ 32 (-13.51%)

Mutual labels: scraper

GChan

Scrape boards and threads from 4chan (8kun WIP). Downloads images, videos and HTML if desired.

Stars: ✭ 31 (-16.22%)

Mutual labels: scraper

dh-core

Functional data science

Stars: ✭ 123 (+232.43%)

Mutual labels: data-mining

twpy

High-level Twitter scraper for humans.

Stars: ✭ 58 (+56.76%)

Mutual labels: scraper

MetQy

Repository for R package MetQy (read related publication here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6247936/)

Stars: ✭ 17 (-54.05%)

Mutual labels: data-mining

xgboost-smote-detect-fraud

Can we predict accurately on skewed data? What sampling techniques can be used? Which models and techniques suit this scenario? Find the answers in this code pattern!

Stars: ✭ 59 (+59.46%)

Mutual labels: data-mining

stweet

Advanced Python library to scrape Twitter (tweets, users) via the unofficial API

Stars: ✭ 287 (+675.68%)

Mutual labels: scraper

pinterest-web-scraper

Scraping Visually Similar Images from Pinterest

Stars: ✭ 26 (-29.73%)

Mutual labels: scraper

doffy

A web automation library based on headless Chrome

Stars: ✭ 13 (-64.86%)

Mutual labels: nightmare

scrapy-LBC

LeBonCoin spider built with Scrapy and ElasticSearch

Stars: ✭ 14 (-62.16%)

Mutual labels: scraper

KaliIntelligenceSuite

Kali Intelligence Suite (KIS) shall aid in the fast, autonomous, central, and comprehensive collection of intelligence by executing standard penetration testing tools. The collected data is internally stored in a structured manner to allow the fast identification and visualisation of the collected information.

Stars: ✭ 58 (+56.76%)

Mutual labels: data-mining

crawler-chrome-extensions

Chrome extensions commonly used by crawler developers

Stars: ✭ 53 (+43.24%)

Mutual labels: scraper

Medium-Stats-Analysis

Exploring data and analyzing metrics for user-specific Medium Stats

Stars: ✭ 27 (-27.03%)

Mutual labels: data-mining

python web scraping

Web scraping using Python, Requests and Selenium

Stars: ✭ 40 (+8.11%)

Mutual labels: scraper

Heart disease prediction

Heart Disease prediction using 5 algorithms

Stars: ✭ 43 (+16.22%)

Mutual labels: data-mining

tripadvisor-scraper

Scrape Tripadvisor restaurants, hotels, and places.

Stars: ✭ 40 (+8.11%)

Mutual labels: scraper

web-crawler

Python Web Crawler with Selenium and PhantomJS

Stars: ✭ 19 (-48.65%)

Mutual labels: scraper

Apriori-and-Eclat-Frequent-Itemset-Mining

A Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.

Stars: ✭ 36 (-2.7%)

Mutual labels: data-mining

iis

Information Inference Service of the OpenAIRE system

Stars: ✭ 16 (-56.76%)

Mutual labels: data-mining

hierarchical-clustering

A Python implementation of divisive and hierarchical clustering algorithms. The algorithms were tested on the Human Gene DNA Sequence dataset and dendrograms were plotted.

Stars: ✭ 62 (+67.57%)

Mutual labels: data-mining

Website-downloader

💡 Download the complete source code of any website (including all assets). [ Javascripts, Stylesheets, Images ] using Node.js

Stars: ✭ 615 (+1562.16%)

Mutual labels: scraper

perke

A keyphrase extractor for Persian

Stars: ✭ 60 (+62.16%)

Mutual labels: data-mining

scikit-cycling

Tools to analyze cycling data