Ayan-Kumar-Saha / image-crawler

Licence: other
An image scraper that scrapes images from unsplash.com

Programming Languages

python

Projects that are alternatives to or similar to image-crawler

Autolink
AutoLink is an open-source Web IDE integrated solution for automated testing
Stars: ✭ 129 (+975%)
Mutual labels:  selenium, requests
TeslaPy
A Python module to use the Tesla Motors Owner API
Stars: ✭ 216 (+1700%)
Mutual labels:  selenium, requests
fBrowser
Helpful Selenium functions to make web-scraping easier and faster
Stars: ✭ 16 (+33.33%)
Mutual labels:  selenium, webscraping
Price Monitor
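JD.com product price monitoring: tracks the prices of user-selected products and sends email/WeChat alerts when a price drops. Tech: Python crawler / IP proxy pool / JS API scraping / Selenium page scraping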
Stars: ✭ 634 (+5183.33%)
Mutual labels:  selenium, requests
SJS DROPS
Script using requests module to register accounts to Slam Jam Socialism raffles.
Stars: ✭ 21 (+75%)
Mutual labels:  selenium, requests
Wswp
Code for the second edition Web Scraping with Python book by Packt Publications
Stars: ✭ 112 (+833.33%)
Mutual labels:  selenium, webscraping
Sneakers Project
Using Selenium, Neha scraped data about 35 top selling sneakers of Nike and Adidas from stockx.com. She used this data to draw insights about sneaker resales.
Stars: ✭ 32 (+166.67%)
Mutual labels:  selenium, webscraping
Proxy requests
a class that uses scraped proxies to make http GET/POST requests (Python requests)
Stars: ✭ 357 (+2875%)
Mutual labels:  requests, webscraping
weibo topic
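Collects Weibo posts by topic keyword or from personal accounts, with one-click deletion of Weibo posts; cookies are obtained with selenium and handled with requests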
Stars: ✭ 28 (+133.33%)
Mutual labels:  selenium, requests
python-crawler
A repository for learning web crawling, suitable for people with no prior experience and friendly to beginners
Stars: ✭ 37 (+208.33%)
Mutual labels:  selenium, requests
schedule-tweet
Schedules tweets using TweetDeck
Stars: ✭ 14 (+16.67%)
Mutual labels:  selenium, webscraping
requestsR
R interface to Python requests module
Stars: ✭ 12 (+0%)
Mutual labels:  requests, webscraping
Instagram-Scraper-2021
Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).
Stars: ✭ 57 (+375%)
Mutual labels:  selenium, webscraping
Requestium
Integration layer between Requests and Selenium for automation of web actions.
Stars: ✭ 1,618 (+13383.33%)
Mutual labels:  selenium, requests
pyscrapper
📷 web scrapping in python: multiple libraries -requests, beautifulsoup, mechanize, selenium
Stars: ✭ 50 (+316.67%)
Mutual labels:  selenium, requests
selenium-grid-docker-swarm
web scraping in parallel with Selenium Grid and Docker
Stars: ✭ 32 (+166.67%)
Mutual labels:  selenium, webscraping
non-api-fb-scraper
Scrape public FaceBook posts from any group or user into a .csv file without needing to register for any API access
Stars: ✭ 40 (+233.33%)
Mutual labels:  selenium, webscraping
Email-Crawler-Lead-Generator
This email crawler will visit all pages of a provided website and parse and save emails found to a csv file.
Stars: ✭ 47 (+291.67%)
Mutual labels:  requests, webscraping
chesf
CHeSF is the Chrome Headless Scraping Framework, a very very alpha code to scrape javascript intensive web pages
Stars: ✭ 18 (+50%)
Mutual labels:  selenium, webscraping
newspaperjs
News extraction and scraping. Article Parsing
Stars: ✭ 59 (+391.67%)
Mutual labels:  webscraping

Image Crawler for Unsplash.com

A web image scraper that scrapes images from unsplash.com. All images downloaded from Unsplash are free for commercial and noncommercial use.

Prerequisites

  • Python 3 and pip - Python 3 and the Python package installer pip need to be installed on the system. Check whether python3 and pip are already installed on your machine using:
  ~$ python3 --version
  Python 3.6.9

  ~$ pip3 --version
  pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)
  • A web browser - A web browser that supports headless mode is required to run the script properly. Mozilla Firefox or Google Chrome is recommended.

    Chrome users, see the "For Google Chrome users" section before running the script.

    At the time of writing, headless mode is not supported by any other mainstream browser.

  • A web driver for the web browser - A web driver matching the chosen browser is required. Firefox, for example, requires geckodriver, which must be installed before the script can be run. Download the appropriate web driver for your browser from the following table.

    Browser    Driver link
    Firefox    Download
    Chrome     Download

    After downloading the driver:

    Linux/macOS users, make sure to place it in your PATH, e.g., in /usr/bin or /usr/local/bin.

    Windows users, add its location to the PATH system environment variable.

  • A stable internet connection is required.
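
The prerequisites above can also be checked from Python before running the script. The following is a minimal sketch using only the standard library. The function name check_prerequisites is hypothetical and not part of the repository, and the sketch assumes the drivers are installed under their usual executable names, geckodriver and chromedriver.

```python
import shutil
import sys

# Assumed driver executable names; this check is not shipped with the repository.
DRIVERS = {"firefox": "geckodriver", "chrome": "chromedriver"}


def check_prerequisites(browser="firefox"):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3 is required")
    driver = DRIVERS.get(browser.strip().lower())
    if driver is None:
        problems.append("unsupported browser: {}".format(browser))
    elif shutil.which(driver) is None:
        # shutil.which searches the directories listed in PATH
        problems.append("{} was not found in PATH".format(driver))
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites("firefox"):
        print("Missing prerequisite:", problem)
```

An empty result only means the driver executable was found in PATH; it does not verify that the driver version matches the installed browser.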

Getting started

Clone the repository to your local machine using,

~$ git clone https://github.com/Ayan-Kumar-Saha/image-crawler.git

Environment setup

To install all dependencies at once, move into the project directory and run:

Linux/macOS

~$ pip3 install -r dependencies.txt

Windows

~$ pip install -r dependencies.txt

For Google Chrome users

Chrome users, change these lines before running the script,

  • Line 5 from
      from selenium.webdriver.firefox.options import Options
    
    to
      from selenium.webdriver.chrome.options import Options
    
  • Line 26 from
      browser = webdriver.Firefox(options = options)
    
    to
      browser = webdriver.Chrome(options = options)
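
As an alternative to editing the two lines by hand, the browser choice could be made a parameter. This is only a sketch, not how the repository's script works: the names BROWSERS, browser_spec, and make_headless_browser are hypothetical, and it assumes Selenium is installed and the matching web driver is in PATH.

```python
import importlib

# Maps a browser name to (Selenium options module, webdriver class name).
BROWSERS = {
    "firefox": ("selenium.webdriver.firefox.options", "Firefox"),
    "chrome": ("selenium.webdriver.chrome.options", "Chrome"),
}


def browser_spec(name):
    """Look up the options module path and driver class name for a browser."""
    try:
        return BROWSERS[name.strip().lower()]
    except KeyError:
        raise ValueError("unsupported browser: {!r}".format(name)) from None


def make_headless_browser(name="firefox"):
    """Start the chosen browser in headless mode (requires Selenium)."""
    # Imported lazily so browser_spec can be used without Selenium installed.
    from selenium import webdriver

    options_module, driver_class = browser_spec(name)
    options = importlib.import_module(options_module).Options()
    options.add_argument("--headless")
    return getattr(webdriver, driver_class)(options=options)
```

With a helper like this, make_headless_browser("chrome") would stand in for both manual edits. The returned browser should be closed with quit() when the work is done.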
    

Run Image Crawler

Run the crawler using,

Linux and macOS

~$ python3 image_crawler.py

Windows

~$ python image_crawler.py

Usage

Once the script starts, enter the type or name of the images you want to download, for example, portraits:

~$ Enter the image subject you want to download: portraits

Then enter the number of images you want to download.

~$ Number of images you want to download: 10

The script then downloads the images. Once it finishes, an images folder is created in the project directory, containing the downloaded images.
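
The handling of the count prompt and the images folder could look roughly like the following. This is a hedged sketch rather than the script's actual code: parse_count, save_image, and the <subject>_<index>.jpg file naming are assumptions made for illustration, and the image bytes would come from whatever download step the crawler performs.

```python
from pathlib import Path


def parse_count(text):
    """Convert the 'number of images' answer to a positive integer."""
    count = int(text.strip())  # raises ValueError for non-numeric input
    if count < 1:
        raise ValueError("the number of images must be at least 1")
    return count


def save_image(data, subject, index, folder="images"):
    """Write image bytes to <folder>/<subject>_<index>.jpg and return the path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    # Lowercase the subject and join its words with underscores for the file name.
    safe_subject = "_".join(subject.strip().lower().split())
    path = target_dir / "{}_{}.jpg".format(safe_subject, index)
    path.write_bytes(data)
    return path
```

Validating the count before any download starts avoids launching the browser for input that cannot be used.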

Output

The output should look similar to the following.

Terminal: [screenshot]

Images folder: [screenshot]

Built With

Author

Ayan Kumar Saha

License

Copyright © 2020 Ayan Kumar Saha. Released under the MIT license.
