shivam5992 / pyscrapper

Licence: other
📷 Web scraping in Python with multiple libraries: requests, BeautifulSoup, mechanize, selenium

Projects that are alternatives of or similar to pyscrapper

Requestium
Integration layer between Requests and Selenium for automation of web actions.
Stars: ✭ 1,618 (+3136%)
Mutual labels:  selenium, requests
image-crawler
An image scraper that scrapes images from unsplash.com
Stars: ✭ 12 (-76%)
Mutual labels:  selenium, requests
Price Monitor
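JD.com product price monitoring: tracks the prices of user-specified products and sends email/WeChat alerts on price drops. Tech: Python crawler / IP proxy pool / JS API scraping / Selenium page scraping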
Stars: ✭ 634 (+1168%)
Mutual labels:  selenium, requests
Autolink
AutoLink is an open-source Web IDE integrated solution for automated testing
Stars: ✭ 129 (+158%)
Mutual labels:  selenium, requests
TeslaPy
A Python module to use the Tesla Motors Owner API
Stars: ✭ 216 (+332%)
Mutual labels:  selenium, requests
weibo topic
Weibo topic keyword and personal Weibo post collection, one-click deletion of Weibo posts; Selenium obtains the cookies, requests handles the rest
Stars: ✭ 28 (-44%)
Mutual labels:  selenium, requests
python-crawler
A repository for learning web crawling, suitable for people with no prior experience and friendly to beginners
Stars: ✭ 37 (-26%)
Mutual labels:  selenium, requests
SJS DROPS
Script using requests module to register accounts to Slam Jam Socialism raffles.
Stars: ✭ 21 (-58%)
Mutual labels:  selenium, requests
impf-bot
💉🤖 Bot for the German "ImpfterminService - 116117"
Stars: ✭ 167 (+234%)
Mutual labels:  selenium
NodeKit
surfstudio.github.io/nodekit
Stars: ✭ 27 (-46%)
Mutual labels:  requests
carina-demo
Carina demo project.
Stars: ✭ 40 (-20%)
Mutual labels:  selenium
gcf-packs
Library packs for google cloud functions
Stars: ✭ 48 (-4%)
Mutual labels:  selenium
RESTEasy
REST API calls made easier
Stars: ✭ 12 (-76%)
Mutual labels:  requests
test login
Wenjuanxing (问卷星)
Stars: ✭ 53 (+6%)
Mutual labels:  selenium
Peanuts
Peanuts is a free and open source wifi tracking tool. Based on the SensePosts Snoopy-NG project that is now closed.
Stars: ✭ 34 (-32%)
Mutual labels:  requests
DadosAbertosBrasil
Python package for accessing open data and APIs of the Brazilian government.
Stars: ✭ 28 (-44%)
Mutual labels:  requests
nightwatch-boilerplate
boilerplate for nightwatch.js with selenium
Stars: ✭ 16 (-68%)
Mutual labels:  selenium
hcaptcha-solver-python-selenium
hCaptcha solver and bypasser for Python Selenium. Simple website to try to solve hCaptcha.
Stars: ✭ 32 (-36%)
Mutual labels:  selenium
python-ogren-4-saatte-python-baslangic
(TR) Content document for a 4-hour introductory Python workshop. (EN version is in progress!)
Stars: ✭ 71 (+42%)
Mutual labels:  requests
curly.hpp
Simple cURL C++17 wrapper
Stars: ✭ 48 (-4%)
Mutual labels:  requests

PyScrapper

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

WIP DISCLAIMER

Some of the projects in this repo are broken because the websites they target have changed, so they are being reworked to be fully functional. Contributions are welcome: fork the repo and open a pull request with your updates.

Web scraping series in Python.

Forked and maintained by Ivan Nieto [email protected]

Original work by Shivam Bansal [email protected]

Module dependencies:

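mechanize, BeautifulSoup (for Python 2.x) | bs4 (for Python 3.x), requests, and the standard-library modules json, re, urllib and urlparse (urlparse exists only in Python 2.x; in Python 3.x it is part of urllib.parse)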

    pip install <module_name>
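
Before installing anything, it can help to see which of the listed modules are already importable. The following is a minimal sketch, not part of the repo: the `missing_modules` helper and the two name lists are illustrative assumptions, and `bs4` is used as the Python 3 import name for BeautifulSoup.

```python
import importlib.util

# Import names taken from the dependency list above (Python 3 naming).
# Both lists are illustrative; adjust them to the scripts you plan to run.
THIRD_PARTY = ["mechanize", "bs4", "requests"]
STDLIB = ["json", "re", "urllib"]


def missing_modules(names):
    """Return the module names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY + STDLIB)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Only names reported as missing from the third-party list need `pip install`; the standard-library modules ship with Python.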

Projects

Google Movies

    Script to scrape Google Movies, retrieving a list of theaters with their addresses, movie lists,
    movie genres and showtimes for a given location.
         
    This script outputs a JSON file with the response. 

Zomato Top Restaurants

    Script to scrape the top 25 trending restaurants with their rank, rating, details, etc.
    for the given cities on the zomato.com website.
    
    It outputs a separate JSON response for each city.
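
The "one JSON file per city" output step can be sketched as follows. This is an assumption-laden illustration rather than the repo's code: `write_city_files`, the file-naming scheme, and the sample records are all hypothetical, and the scraping step itself is left out.

```python
import json
import os
import tempfile


def write_city_files(results_by_city, out_dir):
    """Write one JSON file per city and return the list of paths created."""
    paths = []
    for city, restaurants in results_by_city.items():
        # Hypothetical naming scheme: "New Delhi" -> "new_delhi.json".
        filename = city.lower().replace(" ", "_") + ".json"
        path = os.path.join(out_dir, filename)
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(restaurants, fh, indent=2, ensure_ascii=False)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Placeholder records, not real scraped data.
    sample = {
        "New Delhi": [{"rank": 1, "name": "Example Restaurant", "rating": "4.5"}],
        "Mumbai": [{"rank": 1, "name": "Another Example", "rating": "4.2"}],
    }
    with tempfile.TemporaryDirectory() as out_dir:
        for path in write_city_files(sample, out_dir):
            print("wrote", os.path.basename(path))
```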

Finance and Stock

    Scrapes the last closing price for all the quotes from various sites
    like Google, Yahoo, Bloomberg, etc.

Live Weather

    Scrape the weather details for morning, afternoon and night from a particular website.

Daily Horoscope

    Scrapes the daily horoscope details for each sign and writes the output to text files.
    Multiple websites are scraped to get the details.

Train Details

    Scrape the details of a train from IRCTC by entering the train number.

Website Top Keywords

    Create a list of the most frequently occurring words on a website,
    along with their frequencies.
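
Counting the most frequent words on a page can be sketched with the standard library alone. This is a minimal illustration, not the repo's implementation: `TextExtractor` and `top_keywords` are hypothetical names, the HTML is passed in as a string (fetching is left out), and no stop-word filtering is applied.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def top_keywords(html, n=10):
    """Return the n most common lowercase words as (word, count) pairs."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    page = "<html><body><p>Spam spam eggs</p><p>Spam ham</p></body></html>"
    for word, count in top_keywords(page, 3):
        print(word, count)
```

In practice the HTML string would come from a fetched page, and a stop-word list would usually be needed to keep words like "the" and "and" out of the top results.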

News Scraping

    Scrape the news from various news sources.

Alexa Top Websites

    Get the list of the top 25 websites for a country.

Movie Details

    Get the movie details from IMDb and Rotten Tomatoes.

US President State of Union Speech

    Scrape the speech transcripts of all US presidents from 1700 to the present.

Spider Algorithm

    The spider algorithm is a typical web scraping technique that fetches all URLs (etc.) of a web page,
    including URLs that are not on the requested page itself:
    it also fetches the URLs found on each of the collected URLs.
    It is implemented in two ways: a plain one and one using mechanize.
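
The idea of following the URLs found on each collected URL can be sketched as a breadth-first crawl. This is a standard-library illustration under stated assumptions, not either of the repo's two implementations: `LinkExtractor`, `crawl`, the injected `fetch` callable, and the `max_pages` limit are all hypothetical, and the demo uses an in-memory fake site instead of real network requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl from start_url; return the set of URLs discovered.

    fetch(url) must return the page HTML as a string, or None on failure.
    """
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen


if __name__ == "__main__":
    # In-memory stand-in for a website, so the sketch runs offline.
    fake_site = {
        "http://example.test/": '<a href="/a">A</a> <a href="b.html#top">B</a>',
        "http://example.test/a": '<a href="/">home</a>',
        "http://example.test/b.html": "<p>no links here</p>",
    }
    for found in sorted(crawl("http://example.test/", fake_site.get)):
        print(found)
```

For a real crawl, `fetch` could wrap something like `requests.get(url, timeout=10).text`. The `seen` set prevents revisiting pages, and `max_pages` bounds the crawl because following every link is otherwise open-ended.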

Rework ToDo

  • Google Movies
  • Zomato Top Restaurants
  • Finance and Stock
  • Live Weather
  • Daily Horoscope
  • Train Details
  • Website Top Keywords
  • News Scraping
  • Alexa Top Websites
  • Movie Details
  • US President State of Union Speech
  • Spider Algorithm