Crawlab: Distributed web crawler admin platform for managing spiders, regardless of language or framework.
Stars: ✭ 8,392 (+13220.63%)
Crawlab Lite: Lite version of Crawlab, a lightweight crawler management platform.
Stars: ✭ 122 (+93.65%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+339.68%)
Finviz: Unofficial API for finviz.com
Stars: ✭ 493 (+682.54%)
Data Describe: data⎰describe: Pythonic EDA Accelerator for Data Science
Stars: ✭ 269 (+326.98%)
Datacleaner: The premier open source Data Quality solution
Stars: ✭ 391 (+520.63%)
Turbodbc: Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
Stars: ✭ 449 (+612.7%)
Pretzel: JavaScript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-58.73%)
PHAT: Pathogen-Host Analysis Tool - A modern Next-Generation Sequencing (NGS) analysis platform
Stars: ✭ 17 (-73.02%)
flink-crawler: Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-23.81%)
Postgui: A React web application to query and share any PostgreSQL database.
Stars: ✭ 260 (+312.7%)
CrawlBox: An easy way to brute-force web directories.
Stars: ✭ 118 (+87.3%)
Dash Cytoscape: Interactive network visualization in Python and Dash, powered by Cytoscape.js
Stars: ✭ 309 (+390.48%)
Vault: Swiss army knife for hackers
Stars: ✭ 346 (+449.21%)
Awesome Crawler: A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+7507.94%)
Trino: Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+7171.43%)
Datasets For Recommender Systems: A repository of topic-centric, high-quality public data sources for Recommender Systems (RS)
Stars: ✭ 564 (+795.24%)
Filemasta: A search application to explore, discover and share online files
Stars: ✭ 571 (+806.35%)
Multiqc: Aggregate results from bioinformatics analyses across many samples into a single report.
Stars: ✭ 708 (+1023.81%)
Icrawler: A multi-threaded crawler framework with many built-in image crawlers.
Stars: ✭ 629 (+898.41%)
Maman: Rust Web Crawler saving pages on Redis
Stars: ✭ 39 (-38.1%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-49.21%)
OLX Scraper: 📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB, a NoSQL database.
Stars: ✭ 15 (-76.19%)
awesome-genetics: A curated list of awesome bioinformatics software.
Stars: ✭ 60 (-4.76%)
Spidy: The simple, easy-to-use command-line web crawler.
Stars: ✭ 257 (+307.94%)
RNAseq titration results: Cross-platform normalization enables machine learning model training on microarray and RNA-seq data simultaneously
Stars: ✭ 22 (-65.08%)
Arachni: Web Application Security Scanner Framework
Stars: ✭ 2,942 (+4569.84%)
Datasets For Good: List of datasets to apply stats/machine learning/technology to the world of social good.
Stars: ✭ 174 (+176.19%)
Supercrawler: A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
Stars: ✭ 306 (+385.71%)
Preql: An interpreted relational query language that compiles to SQL.
Stars: ✭ 257 (+307.94%)
Spider Flow: A new-generation crawler platform that defines crawl flows graphically, so crawlers can be built without writing code.
Stars: ✭ 365 (+479.37%)
Sol Journal: ✎ Simple, personal journaling progressive web app
Stars: ✭ 470 (+646.03%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+636.51%)
Dapy: Easy-to-use data analysis / manipulation framework for humans
Stars: ✭ 523 (+730.16%)
Warp10 Platform: The Most Advanced Time Series Platform
Stars: ✭ 227 (+260.32%)
Wechatsogou: A crawler interface for WeChat official accounts, based on Sogou WeChat search.
Stars: ✭ 5,220 (+8185.71%)
Financedatabase: A database of 180,000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.
Stars: ✭ 554 (+779.37%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (+750.79%)
Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+941.27%)
Scrapyrt: HTTP API for Scrapy spiders
Stars: ✭ 637 (+911.11%)
Tiledb Vcf: Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-58.73%)
Scrapy Redis: Redis-based components for Scrapy.
Stars: ✭ 4,998 (+7833.33%)
Scanpy: Single-Cell Analysis in Python. Scales to >1M cells.
Stars: ✭ 858 (+1261.9%)
Sns: Analysis pipelines for sequencing data
Stars: ✭ 43 (-31.75%)
Scde: R package for analyzing single-cell RNA-seq data
Stars: ✭ 147 (+133.33%)
Haipproxy: 💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+7825.4%)
Taxadb: 🐣 Locally query the NCBI taxonomy
Stars: ✭ 26 (-58.73%)
Avbook: AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+12809.52%)