apify / actor-scraper

Licence: other

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.

Programming Languages

javascript

184084 projects - #8 most used programming language

Dockerfile

14818 projects

Projects that are alternatives of or similar to actor-scraper

actor-content-checker

You can use this act to monitor any page's content and get a notification when content changes.

Stars: ✭ 16 (-80.72%)

Mutual labels: web-scraping, apify

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+3700%)

Mutual labels: web-scraping, apify

iww

AI based web-wrapper for web-content-extraction

Stars: ✭ 61 (-26.51%)

Mutual labels: web-scraping

Springboard-Data-Science-Immersive

No description or website provided.

Stars: ✭ 52 (-37.35%)

Mutual labels: web-scraping

OLX Scraper

📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.

Stars: ✭ 15 (-81.93%)

Mutual labels: web-scraping

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.

Stars: ✭ 71 (-14.46%)

Mutual labels: web-scraping

India-WhatsAppFakeNews-Dataset

News articles on WhatsApp-related deaths, along with other articles from across India during that period

Stars: ✭ 41 (-50.6%)

Mutual labels: web-scraping

htmlunit

🕸🧰☕️Tools to Scrape Dynamic Web Content via the 'HtmlUnit' Java Library

Stars: ✭ 39 (-53.01%)

Mutual labels: web-scraping

heroshi

Heroshi – open source web crawler.

Stars: ✭ 51 (-38.55%)

Mutual labels: web-scraping

automation-scripts

Simple scripts that I'm using to automate the boring things.

Stars: ✭ 14 (-83.13%)

Mutual labels: web-scraping

actor-amazon-crawler

Amazon crawler - this configuration extracts items for the keywords that you specify in the input, and it automatically extracts all pages for each keyword. You can specify multiple keywords in the input for one run.

Stars: ✭ 59 (-28.92%)

Mutual labels: apify

Node-js-functionalities

This repository contains useful RESTful APIs and functionality in Node.js, including tutorial code for mastering Node.js. All tutorials have been published on medium.com; the tutorial links are given below.

Stars: ✭ 69 (-16.87%)

Mutual labels: web-scraping

rreddit

𝐫⟋ Get Reddit data

Stars: ✭ 49 (-40.96%)

Mutual labels: web-scraping

scraping-ebay

Scraping Ebay's products using Scrapy Web Crawling Framework

Stars: ✭ 79 (-4.82%)

Mutual labels: web-scraping

extractnet

A Dragnet that also extracts author, headline, date, and keywords from context

Stars: ✭ 52 (-37.35%)

Mutual labels: web-scraping

tableau-scraping

Tableau scraper python library. R and Python scripts to scrape data from Tableau viz

Stars: ✭ 91 (+9.64%)

Mutual labels: web-scraping

Linkedin-Client

Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)

Stars: ✭ 42 (-49.4%)

Mutual labels: web-scraping

leetcode-compensation

Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.

Stars: ✭ 83 (+0%)

Mutual labels: web-scraping

IMDB-Scraper

Scrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.

Stars: ✭ 37 (-55.42%)

Mutual labels: web-scraping

restaurant-finder-featureReviews

Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).

Stars: ✭ 21 (-74.7%)

Mutual labels: web-scraping


Apify Scrapers

This repository houses all of Apify's generic actors, which simplify scraping by using a pre-defined, schema-validated UI input instead of the typical JSON input used in other actors.

Web Scraper

Web Scraper (apify/web-scraper) is a ready-made solution for scraping the web using the Chrome browser. It takes away all the work necessary to set up a browser for crawling, controls the browser automatically, and produces machine-readable results in several common formats.

Underneath, it uses the Puppeteer library to control the browser, but you don't need to worry about that. Using a simple web UI and a little basic JavaScript, you can tweak it to serve almost any scraping need.
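As an illustration of the "little basic JavaScript" involved, here is a minimal sketch of a page function such as one might paste into the Web Scraper UI. It assumes the scraper's option to inject jQuery is enabled, so that `context.jQuery` is available; the selectors and returned field names are hypothetical and would need to match the target site.

```javascript
// Hypothetical Web Scraper page function. It runs in the browser page,
// once for every page the scraper loads.
async function pageFunction(context) {
    // Assumption: jQuery injection is enabled in the scraper settings.
    const $ = context.jQuery;

    // The returned object is stored as one result record.
    // Selectors and field names here are illustrative only.
    return {
        url: context.request.url,
        title: $('title').text().trim(),
        heading: $('h1').first().text().trim(),
    };
}
```

Whatever object the function returns becomes one record in the scraper's results, so the shape of the output is entirely up to you.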

Puppeteer Scraper

Puppeteer Scraper (apify/puppeteer-scraper) is the most powerful scraper tool in our arsenal (aside from developing your own actors). It uses the Puppeteer library to programmatically control a headless Chrome browser and can make it do almost anything. If Web Scraper does not cut it, Puppeteer Scraper is what you need.

Puppeteer is a Node.js library, so knowledge of Node.js and its paradigms is expected when working with Puppeteer Scraper.

If you need a faster or simpler tool, see Cheerio Scraper for speed or Web Scraper for simplicity.
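To give a feel for the Node.js-side control that Puppeteer Scraper offers, the sketch below shows a page function that works with the Puppeteer `page` object directly. It assumes the scraper passes the page as `context.page` and the current request as `context.request`; the selector and the output fields are hypothetical.

```javascript
// Hypothetical Puppeteer Scraper page function. Unlike Web Scraper, this code
// runs in Node.js and drives the browser through the Puppeteer Page object.
async function pageFunction(context) {
    const { page, request } = context;

    // page.title() resolves to the document title of the loaded page.
    const title = await page.title();

    // page.$$eval() runs the callback inside the browser on all matching
    // elements and returns its serializable result to Node.js.
    const links = await page.$$eval('a[href]', (anchors) => anchors.map((a) => a.href));

    return {
        url: request.url,
        title,
        linkCount: links.length,
    };
}
```

Because the function has the full Puppeteer API at hand, the same pattern extends to clicking, typing, waiting for selectors, or taking screenshots.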

Cheerio Scraper

Cheerio Scraper (apify/cheerio-scraper) is a ready-made solution for crawling the web using plain HTTP requests to retrieve HTML pages and then parsing and inspecting the HTML using the Cheerio library. It's blazing fast.

Cheerio is a server-side version of the popular jQuery library. It does not run in the browser; instead, it constructs a DOM out of an HTML string and then provides the user with an API to work with that DOM.

Cheerio Scraper is ideal for scraping websites that do not rely on client-side JavaScript to serve their content. It can be as much as 20 times faster than using a full browser solution such as Puppeteer.
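A page function for Cheerio Scraper might look like the following sketch. It assumes the scraper exposes the parsed document as a Cheerio handle `context.$` and the current request as `context.request`; the selectors and output fields are hypothetical. No browser is involved, so only markup present in the raw HTML response can be extracted.

```javascript
// Hypothetical Cheerio Scraper page function. It runs in Node.js against
// the HTML returned by a plain HTTP request; no JavaScript on the page runs.
async function pageFunction(context) {
    const { $, request } = context;

    // Collect the text of every <h2> using Cheerio's jQuery-like API.
    const headings = [];
    $('h2').each((index, element) => {
        headings.push($(element).text().trim());
    });

    return {
        url: request.url,
        title: $('title').text().trim(),
        headings,
    };
}
```

If the fields come back empty on a site that visibly shows the content, the page probably renders it with client-side JavaScript, and a browser-based scraper is the better fit.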

Scraper Tools

A library that houses logic common to all the scrapers.

