1. Crawling Infrastructure: Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3), and sophisticated queues.
2. Googlescraper: A Python module for scraping several search engines (such as Google, Yandex, Bing, DuckDuckGo, ...). It includes asynchronous networking support.
4. Se Scraper: JavaScript scraping module based on Puppeteer for many different search engines...
5. zardaxt: Passive TCP/IP fingerprinting tool. Run it on your server to find out which operating systems your clients are *really* using.
6. struktur: Module that extracts structured information from a rendered HTML page and outputs it as JSON (HTML to JSON).