350 open source projects that are alternatives to or similar to raspagem-de-dados-fatec

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (+1159.09%)

Mutual labels: scraping, web-scraping

Humanoid

Node.js package to bypass CloudFlare's anti-bot JavaScript challenges

Stars: ✭ 88 (+300%)

Mutual labels: scraping, web-scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+14236.36%)

Mutual labels: scraping, web-scraping

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Stars: ✭ 711 (+3131.82%)

Mutual labels: scraping, web-scraping

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-31.82%)

Mutual labels: scraping, web-scraping

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+2009.09%)

Mutual labels: scraping, web-scraping

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (+222.73%)

Mutual labels: scraping, web-scraping

PythonScrapyBasicSetup

Basic setup with random user agents and IP addresses for Python Scrapy Framework.

Stars: ✭ 57 (+159.09%)

Mutual labels: scraping, web-scraping

Sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)

Stars: ✭ 144 (+554.55%)

Mutual labels: scraping, web-scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+18431.82%)

Mutual labels: scraping, web-scraping

top-github-scraper

Scrape top GitHub repositories and users based on keywords

Stars: ✭ 40 (+81.82%)

Mutual labels: scraping, web-scraping

ioweb

Web Scraping Framework

Stars: ✭ 31 (+40.91%)

Mutual labels: scraping, web-scraping

Scrape Linkedin Selenium

`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages, turning the data into structured JSON.

Stars: ✭ 239 (+986.36%)

Mutual labels: scraping, web-scraping

Phpscraper

PHP Scraper - a highly opinionated web-interface for PHP

Stars: ✭ 148 (+572.73%)

Mutual labels: scraping, web-scraping

Detect Cms

PHP Library for detecting CMS

Stars: ✭ 78 (+254.55%)

Mutual labels: scraping, web-scraping

selectorlib

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

Stars: ✭ 53 (+140.91%)

Mutual labels: scraping, web-scraping

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor instances

Stars: ✭ 32 (+45.45%)

Mutual labels: scraping, data-scraping

web-clipper

Easily download the main content of a web page in html, markdown, and/or epub format from command line.

Stars: ✭ 15 (-31.82%)

Mutual labels: scraping

linkextractor

A Docker tutorial using a link extraction application example

Stars: ✭ 41 (+86.36%)

Mutual labels: web-scraping

GSoC-Data-Analyser

Simple search for organisations participating/participated in the GSoC

Stars: ✭ 29 (+31.82%)

Mutual labels: web-scraping

actor-scraper

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.

Stars: ✭ 83 (+277.27%)

Mutual labels: web-scraping

Zeiver

A Scraper, Downloader, & Recorder for static open directories.

Stars: ✭ 14 (-36.36%)

Mutual labels: scraping

humanparser

Parse a human name string into salutation, first name, middle name, last name, suffix.

Stars: ✭ 78 (+254.55%)

Mutual labels: scraping

heroshi

Heroshi – open source web crawler.

Stars: ✭ 51 (+131.82%)

Mutual labels: web-scraping

subscene scraper

Library to download subtitles from subscene.com

Stars: ✭ 14 (-36.36%)

Mutual labels: scraping

dust

Archive web pages with all relevant assets or save as a single file HTML

Stars: ✭ 19 (-13.64%)

Mutual labels: scraping

Springboard-Data-Science-Immersive

No description or website provided.

Stars: ✭ 52 (+136.36%)

Mutual labels: web-scraping

naos

📉 Uptime and error monitoring CLI

Stars: ✭ 30 (+36.36%)

Mutual labels: scraping

whatsapp-tracking

Scraping the status of WhatsApp contacts

Stars: ✭ 49 (+122.73%)

Mutual labels: scraping

kuwala

Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…

Stars: ✭ 474 (+2054.55%)

Mutual labels: scraping

comp thinking social science

Computational Thinking for Social Scientists book project

Stars: ✭ 42 (+90.91%)

Mutual labels: web-scraping

AngleParse

HTML parsing and processing tool for PowerShell.

Stars: ✭ 35 (+59.09%)

Mutual labels: scraping

sp-subway-scraper

🚆This web scraper builds a dataset for São Paulo subway operation status

Stars: ✭ 24 (+9.09%)

Mutual labels: web-scraping

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).

Stars: ✭ 21 (-4.55%)

Mutual labels: web-scraping

webdext

Intelligent Web Data Extractor

Stars: ✭ 75 (+240.91%)

Mutual labels: scraping

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

Stars: ✭ 317 (+1340.91%)

Mutual labels: scraping

tableau-scraping

Tableau scraper python library. R and Python scripts to scrape data from Tableau viz

Stars: ✭ 91 (+313.64%)

Mutual labels: web-scraping

TorScrapper

A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)

Stars: ✭ 24 (+9.09%)

Mutual labels: scraping

pomp

Screen scraping and web crawling framework

Stars: ✭ 61 (+177.27%)

Mutual labels: scraping

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

Stars: ✭ 23 (+4.55%)

Mutual labels: scraping

Captcha-Tools

All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!

Stars: ✭ 23 (+4.55%)

Mutual labels: scraping

text-mining-corona-articles

Text Mining for Indonesian Online News Articles About Corona

Stars: ✭ 15 (-31.82%)

Mutual labels: web-scraping

jseval

Evaluate JavaScript on a URL through headless Chrome browser.

Stars: ✭ 19 (-13.64%)

Mutual labels: data-scraping

api-flight.com

Main API Flight Git Repository

Stars: ✭ 26 (+18.18%)

Mutual labels: scraping

Movie-Recommendation-System-with-Sentiment-Analysis

Content based movie recommendation system with sentiment analysis

Stars: ✭ 44 (+100%)

Mutual labels: web-scraping

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (+0%)

Mutual labels: scraping

ferenda

Transform unstructured document collections to structured Linked Data

Stars: ✭ 22 (+0%)

Mutual labels: scraping

proxi

Proxy pool. Finds and checks proxies with rest api for querying results. Can find over 25k proxies in under 5 minutes.

Stars: ✭ 32 (+45.45%)

Mutual labels: scraping

dmi-instascraper

A GUI for Instaloader to scrape users and hashtags on Instagram

Stars: ✭ 21 (-4.55%)

Mutual labels: scraping

internet-affordability

🌍 Dataset that shows the Internet affordability by country (a shocking reality!)

Stars: ✭ 13 (-40.91%)

Mutual labels: scraping

scraping-ebay

Scraping Ebay's products using Scrapy Web Crawling Framework

Stars: ✭ 79 (+259.09%)

Mutual labels: web-scraping

article-summary-deep-learning

📖 Using deep learning and scraping to analyze/summarize articles! Just drop in any URL!

Stars: ✭ 18 (-18.18%)

Mutual labels: web-scraping

halfstaff

🇺🇸 Is the US flag at half-staff?

Stars: ✭ 22 (+0%)

Mutual labels: web-scraping

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (+72.73%)

Mutual labels: scraping

India-WhatsAppFakeNews-Dataset

News articles on WhatsApp-related deaths, along with other articles from across India during that period

Stars: ✭ 41 (+86.36%)

Mutual labels: web-scraping

codechef-rank-comparator

Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).

Stars: ✭ 23 (+4.55%)

Mutual labels: web-scraping

IMDB-Scraper

Scrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.