Using Selenium, Neha scraped data about 35 top selling sneakers of Nike and Adidas from stockx.com. She used this data to draw insights about sneaker resales.

Stars: ✭ 32 (+166.67%)

Mutual labels: selenium, webscraping

Proxy requests

a class that uses scraped proxies to make http GET/POST requests (Python requests)

Stars: ✭ 357 (+2875%)

Mutual labels: requests, webscraping

weibo topic

Weibo topic keywords, personal Weibo post collection, and one-click deletion of Weibo posts; selenium obtains the cookie, requests handles the rest

Stars: ✭ 28 (+133.33%)

Mutual labels: selenium, requests

python-crawler

A repository for learning web crawling, suitable for people with no prior experience and fairly beginner-friendly

Stars: ✭ 37 (+208.33%)

Mutual labels: selenium, requests

schedule-tweet

Schedules tweets using TweetDeck

Stars: ✭ 14 (+16.67%)

Mutual labels: selenium, webscraping

requestsR

R interface to Python requests module

Stars: ✭ 12 (+0%)

Mutual labels: requests, webscraping

Instagram-Scraper-2021

Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).

Stars: ✭ 57 (+375%)

Mutual labels: selenium, webscraping

Requestium

Integration layer between Requests and Selenium for automation of web actions.

Stars: ✭ 1,618 (+13383.33%)

Mutual labels: selenium, requests

pyscrapper

📷 web scraping in python with multiple libraries: requests, beautifulsoup, mechanize, selenium

Stars: ✭ 50 (+316.67%)

Mutual labels: selenium, requests

selenium-grid-docker-swarm

web scraping in parallel with Selenium Grid and Docker

Stars: ✭ 32 (+166.67%)

Mutual labels: selenium, webscraping

non-api-fb-scraper

Scrape public Facebook posts from any group or user into a .csv file without needing to register for any API access

Stars: ✭ 40 (+233.33%)

Mutual labels: selenium, webscraping

Email-Crawler-Lead-Generator

This email crawler will visit all pages of a provided website and parse and save emails found to a csv file.

Stars: ✭ 47 (+291.67%)

Mutual labels: requests, webscraping

chesf

CHeSF is the Chrome Headless Scraping Framework, a very very alpha code to scrape javascript intensive web pages

Stars: ✭ 18 (+50%)

Mutual labels: selenium, webscraping

newspaperjs

News extraction and scraping. Article Parsing

Stars: ✭ 59 (+391.67%)

Mutual labels: webscraping


Image Crawler for Unsplash.com

A web image scraper that scrapes images from unsplash.com. All images downloaded from Unsplash are free for commercial and noncommercial use.

Prerequisites

Python 3 and pip - Python 3 and the Python package installer pip need to be installed on your system. Check whether both are already installed on your machine using,

  ~$ python3 --version
  Python 3.6.9

  ~$ pip3 --version
  pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)

A web browser - A web browser that supports headless mode is required to run the script properly. My recommendation is Mozilla Firefox or Google Chrome.

Chrome users, see the "For Google Chrome users" section before running the script.

At the time of writing, headless mode is not supported by any other regular browser.
A web driver for the web browser - A web driver matching the chosen browser is required. Firefox, for example, requires geckodriver, which needs to be installed before the script can be run. Download the appropriate web driver for your browser from the following table.

Browser	Driver Link
Firefox	Download
Chrome	Download

After downloading the driver,

Linux/macOS users, make sure to place it in your PATH, e.g., in /usr/bin or /usr/local/bin.

Windows users, add its location to the PATH system environment variable.
A stable internet connection is a must.
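
One way to confirm that the driver was placed correctly is to ask Python whether the executable can be found on PATH. The following is a minimal sketch and not part of the project: the helper name find_driver is hypothetical, and geckodriver and chromedriver are assumed to be the executable names of the downloaded drivers.

```python
import shutil

# Usual executable names for each browser's driver (assumed default names).
DRIVER_NAMES = {"firefox": "geckodriver", "chrome": "chromedriver"}


def find_driver(browser):
    """Return the full path of the driver for `browser`, or None if it is not on PATH."""
    name = DRIVER_NAMES[browser.lower()]
    return shutil.which(name)


if __name__ == "__main__":
    for browser in DRIVER_NAMES:
        path = find_driver(browser)
        print(f"{browser}: {path or 'driver not found on PATH'}")
```

If the driver for your browser is reported as not found, revisit the PATH step above before running the script.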

Getting started

Clone the repository to your local machine using,

~$ git clone https://github.com/Ayan-Kumar-Saha/image-crawler.git

Environment setup

To install all dependencies at once, move into the project directory and run,

Linux/macOS

~$ pip3 install -r dependencies.txt

Windows

~$ pip install -r dependencies.txt

For Google Chrome users

Chrome users, change these lines before running the script,

Line 5 from

  from selenium.webdriver.firefox.options import Options

to

  from selenium.webdriver.chrome.options import Options

Line 26 from

  browser = webdriver.Firefox(options = options)

to

  browser = webdriver.Chrome(options = options)
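
As an alternative to editing both lines by hand, the browser could be chosen at run time. This is a hedged sketch rather than the script's actual code: make_browser and BROWSERS are hypothetical names, and it assumes Selenium is installed and that the browser should run headless, as the prerequisites suggest.

```python
import importlib

# Maps a browser name to (options module, webdriver class name, headless flag).
BROWSERS = {
    "firefox": ("selenium.webdriver.firefox.options", "Firefox", "-headless"),
    "chrome": ("selenium.webdriver.chrome.options", "Chrome", "--headless"),
}


def make_browser(name="firefox"):
    """Create a headless Selenium browser for `name` (hypothetical helper)."""
    try:
        options_module, driver_class, headless_flag = BROWSERS[name.lower()]
    except KeyError:
        raise ValueError(f"unsupported browser: {name!r}") from None

    # Imported lazily so that only the chosen browser's modules are loaded.
    webdriver = importlib.import_module("selenium.webdriver")
    Options = importlib.import_module(options_module).Options

    options = Options()
    options.add_argument(headless_flag)  # run without opening a window
    return getattr(webdriver, driver_class)(options=options)
```

With a helper like this, calling make_browser("chrome") would stand in for both manual edits.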

Run Image Crawler

Run the crawler using,

Linux and macOS

~$ python3 image_crawler.py

Windows

~$ python image_crawler.py

Usage

Once the script starts, enter the type or name of the images you want to download. For example, portraits

~$ Enter the image subject you want to download: portraits

Then enter the number of images you want to download.

~$ Number of images you want to download: 10

After that the script will download images for you. Once completed, an images folder should be created in the project directory, which will contain the downloaded images.
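
The last step, saving each image into an images folder, can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: image_path and save_images are hypothetical names, the file naming scheme and the .jpg extension are made up for the example, and the image URLs are assumed to have been collected already (for instance by Selenium).

```python
from pathlib import Path


def image_path(folder, subject, index):
    """Build a path such as images/portraits_1.jpg (the naming is an assumption)."""
    safe_subject = "_".join(subject.lower().split())
    return Path(folder) / f"{safe_subject}_{index}.jpg"


def save_images(urls, subject, folder="images"):
    """Download each URL with Requests and write it into `folder`."""
    import requests  # imported here so image_path works without Requests installed

    Path(folder).mkdir(exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        response = requests.get(url, timeout=30)
        response.raise_for_status()  # stop on HTTP errors instead of saving junk
        target = image_path(folder, subject, index)
        target.write_bytes(response.content)
        saved.append(target)
    return saved
```

Numbering the files by position keeps names unique within one run; a second run with the same subject would overwrite earlier files under this scheme.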

Output

The output should look similar to the following

Terminal

Images folder:

Built With

Selenium - An automation tool
Requests - HTTP library for Python

Author

Ayan Kumar Saha

License

