228 open source projects that are alternatives to or similar to Mechaml

Tabula is a tool for liberating data tables trapped inside PDF files

Stars: ✭ 5,420 (+8933.33%)

Mutual labels: scraping

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-75%)

Mutual labels: scraping

Post Tuto Deployment

Build and deploy a machine learning app from scratch 🚀

Stars: ✭ 368 (+513.33%)

Mutual labels: scraping

humanparser

Parse a human name string into salutation, first name, middle name, last name, suffix.

Stars: ✭ 78 (+30%)

Mutual labels: scraping

Scrapy Cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

Stars: ✭ 921 (+1435%)

Mutual labels: scraping

dust

Archive web pages with all relevant assets or save them as a single HTML file

Stars: ✭ 19 (-68.33%)

Mutual labels: scraping

Katana

A Python tool for Google hacking

Stars: ✭ 355 (+491.67%)

Mutual labels: scraping

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (-63.33%)

Mutual labels: scraping

Gazpacho

🥫 The simple, fast, and modern web scraping library

Stars: ✭ 525 (+775%)

Mutual labels: scraping

shup

A POSIX shell script to parse HTML

Stars: ✭ 28 (-53.33%)

Mutual labels: scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+6695%)

Mutual labels: scraping

image-collector

Download images from Google Image Search

Stars: ✭ 38 (-36.67%)

Mutual labels: scraping

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (+1606.67%)

Mutual labels: scraping

naos

📉 Uptime and error monitoring CLI

Stars: ✭ 30 (-50%)

Mutual labels: scraping

Social Media Profiles Regexs

📇 Extract social media profiles and more with regular expressions

Stars: ✭ 324 (+440%)

Mutual labels: scraping

kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…

Stars: ✭ 474 (+690%)

Mutual labels: scraping

Facebook data analyzer

Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friends ranked by messages, vocabulary, contacts, friends-added statistics, and more.

Stars: ✭ 515 (+758.33%)

Mutual labels: scraping

top-github-scraper

Scrape top GitHub repositories and users based on keywords

Stars: ✭ 40 (-33.33%)

Mutual labels: scraping

Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

Stars: ✭ 309 (+415%)

Mutual labels: scraping

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

Stars: ✭ 23 (-61.67%)

Mutual labels: scraping

Webhere

HTML scraping for Objective-C.

Stars: ✭ 16 (-73.33%)

Mutual labels: scraping

ferenda

Transform unstructured document collections to structured Linked Data

Stars: ✭ 22 (-63.33%)

Mutual labels: scraping

Edu Mail Generator

Generate Free Edu Mail(s) within minutes

Stars: ✭ 301 (+401.67%)

Mutual labels: scraping

internet-affordability

🌍 Dataset that shows the Internet affordability by country (a shocking reality!)

Stars: ✭ 13 (-78.33%)

Mutual labels: scraping

Nickjs

Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)

Stars: ✭ 494 (+723.33%)

Mutual labels: scraping

gunaydin

Your good mornings ☀️

Stars: ✭ 16 (-73.33%)

Mutual labels: scraping

Clean Text

🧹 Python package for text cleaning

Stars: ✭ 284 (+373.33%)

Mutual labels: scraping

chesf

CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages

Stars: ✭ 18 (-70%)

Mutual labels: scraping

Mtnt

Code for the collection and analysis of the MTNT dataset

Stars: ✭ 48 (-20%)

Mutual labels: scraping

rubium

Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby

Stars: ✭ 65 (+8.33%)

Mutual labels: scraping

Lambdasoup

Functional HTML scraping and rewriting with CSS in OCaml

Stars: ✭ 280 (+366.67%)

Mutual labels: scraping

ogpParser

Open Graph Protocol Parser for Node.js

Stars: ✭ 43 (-28.33%)

Mutual labels: scraping

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+7961.67%)

Mutual labels: scraping

angel.co-companies-list-scraping

No description or website provided.

Stars: ✭ 54 (-10%)

Mutual labels: scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+5156.67%)

Mutual labels: scraping

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (+18.33%)

Mutual labels: scraping

Imagescraper

✂️ High performance, multi-threaded image scraper

Stars: ✭ 630 (+950%)

Mutual labels: scraping

Scrapping

Mastering the art of scraping 🎓

Stars: ✭ 24 (-60%)

Mutual labels: scraping

instagram explorer

📷 An app to scrape Instagram posts and analyze the data.

Stars: ✭ 17 (-71.67%)

Mutual labels: scraping

copycat

A PHP Scraping Class

Stars: ✭ 70 (+16.67%)

Mutual labels: scraping

Dataflowkit

Extract structured data from web sites. Web sites scraping.

Stars: ✭ 456 (+660%)

Mutual labels: scraping

scrap

Scraping Facebook with JavaScript.

Stars: ✭ 25 (-58.33%)

Mutual labels: scraping

jazz

The Scripting Engine that Combines Speed, Safety, and Simplicity

Stars: ✭ 132 (+120%)

Mutual labels: scraping

Instagram-to-discord

Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!

Stars: ✭ 113 (+88.33%)

Mutual labels: scraping

Configs

Public, free-to-use repository of digger configs for scraping and extracting data from various e-commerce websites and online stores

Stars: ✭ 37 (-38.33%)

Mutual labels: scraping

ha-multiscrape

Home Assistant custom component for scraping multiple values (HTML, XML, or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.

Stars: ✭ 103 (+71.67%)

Mutual labels: scraping

bots-zoo

No description or website provided.

Stars: ✭ 59 (-1.67%)

Mutual labels: scraping

zcrawl

An open source web crawling platform

Stars: ✭ 21 (-65%)

Mutual labels: scraping

Mechanize

Mechanize is a ruby library that makes automated web interaction easy.

Stars: ✭ 4,158 (+6830%)

Mutual labels: scraping

scrapman

Retrieve the real HTML (with JavaScript executed) from a URL. Ultra fast, with support for loading multiple pages in parallel

Stars: ✭ 21 (-65%)

Mutual labels: scraping

scraper

Node.js web scraper. Includes a command-line tool, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.

Stars: ✭ 37 (-38.33%)

Mutual labels: scraping

puppeteer-botcheck

🕵‍♂ Bot detection tests for Puppeteer. Hide and seek!

Stars: ✭ 42 (-30%)

Mutual labels: scraping

Newcrawler

Free Web Scraping Tool with Java

Stars: ✭ 589 (+881.67%)

Mutual labels: scraping

memes-api

API for scraping common meme sites