
Jaime-alv / web_check

License: GPL-3.0
Script for checking changes in webpages

Programming Languages

python

Projects that are alternatives to or similar to web_check

koishi
Python wrapper for the unofficial scraped API of the satori testing system.
Stars: ✭ 13 (-74%)
Mutual labels:  webscraping
allitebooks.com
Download all the ebooks with indexed csv of "allitebooks.com"
Stars: ✭ 24 (-52%)
Mutual labels:  webscraping
anikimiapi
A simple, lightweight, statically-typed Python3 API wrapper for GogoAnime.
Stars: ✭ 15 (-70%)
Mutual labels:  webscraping
Bitcoin-Bar
Physical Bitcoin Stat Ticker
Stars: ✭ 32 (-36%)
Mutual labels:  webscraping
FisherMan
CLI program that collects information from facebook user profiles via Selenium.
Stars: ✭ 117 (+134%)
Mutual labels:  webscraping
costa
The Costa Graphical User Interface for MS-DOS and compatible systems
Stars: ✭ 27 (-46%)
Mutual labels:  graphical-user-interface
AIDeveloper
GUI-based software for training, evaluating and applying deep neural nets for image classification
Stars: ✭ 51 (+2%)
Mutual labels:  graphical-user-interface
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (+36%)
Mutual labels:  webscraping
OkanimeDownloader
Scrape your favorite Anime from Okanime.com without effort
Stars: ✭ 13 (-74%)
Mutual labels:  webscraping
Utlyz-CLI
Lets you access your FB account from the command line and returns various things: the number of unread notifications, messages, or friend requests you have.
Stars: ✭ 30 (-40%)
Mutual labels:  webscraping
NYTimes-iOS
🗽 NY Times is a minimal News 🗞 iOS app 📱 built to demonstrate the use of SwiftSoup and CoreData with SwiftUI 🔥
Stars: ✭ 152 (+204%)
Mutual labels:  webscraping
phomber
Phomber is an information-gathering tool that reverse-searches phone numbers and gets their details, written in python3.
Stars: ✭ 59 (+18%)
Mutual labels:  webscraping
zimit
Make a ZIM file from any Web site and surf offline!
Stars: ✭ 67 (+34%)
Mutual labels:  webscraping
hk0weather
Web scraper project to collect useful Hong Kong weather data from the HKO website
Stars: ✭ 49 (-2%)
Mutual labels:  webscraping
covid19-api
Covid19 Data API (JSON) - LIVE
Stars: ✭ 20 (-60%)
Mutual labels:  webscraping
TrackPurchase
Scrape your payment history from various shopping platforms with just a few lines of code!
Stars: ✭ 19 (-62%)
Mutual labels:  webscraping
Apricot
Desktop Agent for Windows
Stars: ✭ 39 (-22%)
Mutual labels:  graphical-user-interface
Instagram-Scraper-2021
Scrape Instagram content and stories anonymously, using a new technique based on the har file (No Token + No public API).
Stars: ✭ 57 (+14%)
Mutual labels:  webscraping
amelia 2.0
An Artificial Intelligence Chat Bot and Service Provider written in Python and AIML.
Stars: ✭ 19 (-62%)
Mutual labels:  webscraping
Torrents-Api
Torrent Api ✨
Stars: ✭ 82 (+64%)
Mutual labels:  webscraping

web check

A script that warns you, by opening a new browser tab, when there is new content on your favourite websites.

logo

What it does

When run, the script checks whether any of the stored websites have changed. If any changes are found, it opens a new browser tab.

Not every website can be scraped.

How does it work?

After adding a URL, the script saves a copy of the website's content to your hard drive. When run again, it compares the live website against the cached copy line by line, and if there are any differences, a new tab opens. Note: the script doesn't need to open a browser while running; you'll only see the terminal.

A lot of websites have some kind of calendar, which means those websites change every day. To avoid this, you can add a unique CSS selector to each URL. With this unique identifier, the script targets only a specific part of the website and avoids unnecessary browser calls.
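The core of the check can be pictured in a few lines of Python. This is a minimal sketch, not the project's actual checker.py; requests and BeautifulSoup stand in for whatever the script really uses to download and parse pages:

    import difflib
    import webbrowser

    import requests
    from bs4 import BeautifulSoup

    def check_url(url, css_selector, cached_lines):
        # Download the page and keep only the part matched by the CSS
        # selector, or the whole page when no selector was given.
        response = requests.get(url, timeout=30)
        soup = BeautifulSoup(response.text, "html.parser")
        target = soup.select_one(css_selector) if css_selector else soup
        if target is None:  # the selector no longer matches anything
            target = soup
        fresh_lines = target.get_text().splitlines()

        # Compare the fresh content against the cached copy line by line.
        diff = list(difflib.unified_diff(cached_lines, fresh_lines, lineterm=""))
        if diff:
            # Warn the user by opening the site in a new browser tab.
            webbrowser.open_new_tab(url)
        return fresh_lines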

If there is a change, a new backup file is created at storage/url_data/backup.

All URLs are stored in a JSON file together with the information the script needs, including each page's encoding.
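For illustration only, one stored entry could look something like the sketch below; the file path and field names are assumptions, not the project's actual schema:

    import json

    # Hypothetical layout of a stored entry; the real field names may differ.
    url_entry = {
        "url": "https://www.reddit.com/",
        "css_selector": "#SHORTCUT_FOCUSABLE_DIV",
        "encoding": "utf-8",
    }

    # Hypothetical storage path, kept next to the backups mentioned above.
    with open("storage/url_data/url_list.json", "w", encoding="utf-8") as file:
        json.dump([url_entry], file, indent=4)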

How to get the unique css selector

Go to the website and right-click on the zone you want the script to check. Select Inspect. Hover your mouse over the elements until everything you want is highlighted (usually in blue). Right-click the element and choose Copy selector. Paste that into the css field in Add url or Modify url.
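You can also verify a copied selector before adding it. A quick check, assuming requests and BeautifulSoup are installed; the URL and selector come from the examples later in this README:

    import requests
    from bs4 import BeautifulSoup

    url = "https://postal.fsc.ccoo.es/Inicio"
    selector = "#divMainContent"

    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    element = soup.select_one(selector)
    # A correct selector prints the first characters of the targeted section.
    print(element.get_text(strip=True)[:200] if element else "selector not found")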

Set up

Running the script

Once everything is installed, launch the script with web_check/main.pyw.

There are four tabs.

  • Home: the main tab. From here you can launch checker.py with the Run! button. checker.py is in charge of all the logic: it reads each stored URL and compares the cached copy with the actual website.

home

  • Add url: from this tab, you can add a new URL for checking, together with its unique CSS selector.

    Important: URLs have to start with http:// or https://. Hit Submit new url and the script will make all the necessary checks.

add url

There is a second option, Import file, which lets you select a .txt file with several URLs; all of them will be saved.

Each line of the .txt file needs the following structure: url(white space)css selector.

A URL on its own means the script will download the whole website. Only one URL per line. A parsing sketch follows the examples below.

https://github.com/

https://www.reddit.com/ #SHORTCUT_FOCUSABLE_DIV

https://postal.fsc.ccoo.es/Inicio #divMainContent
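A minimal sketch of how such a file can be parsed; the real import code may differ. The line is split once on whitespace, so everything after the URL is treated as the selector:

    def parse_import_file(path):
        # Read "url css-selector" pairs from a .txt file, one per line.
        entries = []
        with open(path, encoding="utf-8") as file:
            for line in file:
                line = line.strip()
                if not line:
                    continue
                # Split only on the first run of whitespace: a URL cannot
                # contain spaces, but a CSS selector might.
                parts = line.split(maxsplit=1)
                url = parts[0]
                css_selector = parts[1] if len(parts) > 1 else ""
                entries.append((url, css_selector))
        return entries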

  • Modify url: if you need to change a URL's CSS selector, you can do it from here. Enter a new CSS selector, or leave the field empty to capture the whole site, and hit submit.

modify url

  • Delete url: there are two options for deleting. Check one or several URLs and hit delete, or use Delete all to remove every stored URL.

delete url

From the Options menu, it's possible to reset url_list.txt if, for some reason, the file can't be read.

Automate the script

There is no need to run web_check/main.pyw every time you want to check your websites; only checker.py is required for that.

You can run checker.py manually whenever you want, but that's tedious and easy to forget: first you would have to activate the virtual environment, and only then run checker.py. With Create batch file you only have to point to the python.exe you want (the virtual environment's one) and to a directory where the file will be created.

It's easier to run web_check.bat directly, and easier still if you add that batch file to Windows Task Scheduler.
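Conceptually, Create batch file only has to write those two paths into a .bat file. A rough Python sketch under that assumption; the generated file's exact contents are a guess, not the script's real output:

    from pathlib import Path

    def create_batch_file(python_exe, target_dir):
        # Write a .bat that runs checker.py with the venv's own python.exe,
        # so the virtual environment never has to be activated by hand.
        checker = Path(__file__).resolve().parent / "checker.py"
        bat_path = Path(target_dir) / "web_check.bat"
        bat_path.write_text(f'"{python_exe}" "{checker}"\n', encoding="utf-8")
        return bat_path

Pointing Task Scheduler at the resulting web_check.bat then runs the check on whatever schedule you choose.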

Create shortcut

Create shortcut, in the Options menu, will create a batch file with all the information about the script itself and the virtual environment. It lets you run main.pyw with a single double click.

Now you don't need to activate the venv each time; web_check.bat will take care of it.

What's new in your favourite websites

what's new

Inside the logs folder there are two files. whats_new.txt displays all the changes in your favourite websites. Each URL starts with a hyphen for easier readability.

If the script is run from main.pyw, there is no need to check this file every time; the script will output those changes into a new window.

Log file

Every time the script runs, it writes a log file. The file's content is cleared automatically on each run for easier reading. Any error or info message will be written down here.

The log is located at storage/logs/log.txt.
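That clear-on-run behaviour matches opening the log in write mode. A sketch with the standard logging module, assuming the path above; the project's actual logging setup may differ:

    import logging

    # filemode="w" truncates the previous run's log, so the file only
    # ever contains messages from the latest run.
    logging.basicConfig(
        filename="storage/logs/log.txt",
        filemode="w",
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    logging.info("script started")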

Copyright (C) 2021 Jaime Álvarez Fernández