yubiuser / Pihole_adlist_tool

A tool to analyse how your pihole adlists cover your browsing behavior

Pihole Adlist Tool

This script tries to provide you with information that helps you decide which adlists you need based on your browsing behavior. It does so by matching your browsing history (FTL's query log) against your current adlist configuration (gravity database), generating a list of domains that you have visited in the past and that would have been blocked if your current adlist configuration had been in place back then. In a second step, the script takes this list and attributes each domain to the adlists it is on (similar to what pihole -q does). The final output is a table of all your adlists with the corresponding number of covered domains (domains that you have visited and that would have been blocked if only this particular adlist had been used).
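As a rough illustration of the two steps described above, here is a minimal Python sketch. It is not the tool's own code (the tool is a shell script). The table layout is a simplified assumption: the real gravity.db and pihole-FTL.db are separate files with more columns, and the sample domains and list addresses are made up.

```python
import sqlite3

# Simplified, hypothetical stand-ins for Pi-hole's databases (assumption:
# the real schemas have more columns and live in two separate files).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE adlist  (id INTEGER PRIMARY KEY, address TEXT, enabled INTEGER);
    CREATE TABLE gravity (domain TEXT, adlist_id INTEGER);
    CREATE TABLE queries (timestamp INTEGER, domain TEXT);

    INSERT INTO adlist VALUES (1, 'https://example.org/list-a.txt', 1),
                              (2, 'https://example.org/list-b.txt', 1);
    INSERT INTO gravity VALUES ('ads.example', 1), ('ads.example', 2),
                               ('tracker.example', 2), ('never-seen.example', 1);
    INSERT INTO queries VALUES (100, 'ads.example'), (101, 'ads.example'),
                               (102, 'tracker.example'), (103, 'news.example');
""")

# Step 1: domains that were queried and are on at least one adlist,
# together with how often each was queried ("hits").
covered = con.execute("""
    SELECT domain, COUNT(*) AS hits
    FROM queries
    WHERE domain IN (SELECT domain FROM gravity)
    GROUP BY domain
    ORDER BY domain
""").fetchall()

# Step 2: attribute the covered domains to every adlist that contains them.
# The inner join drops adlists without any covered domain.
per_adlist = con.execute("""
    SELECT a.id, COUNT(DISTINCT q.domain) AS covered_domains, COUNT(*) AS hits
    FROM adlist a
    JOIN gravity g ON g.adlist_id = a.id
    JOIN queries q ON q.domain = g.domain
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(covered)     # (domain, hits) pairs
print(per_adlist)  # (adlist id, covered domains, hits) rows
```

In this sample, ads.example is counted for both lists because it appears on both, which is why per-list numbers cannot simply be added up.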


The script outputs

  • the number of adlists (and how many are enabled)
  • the number of unique domains in your gravity.db
  • the number of blocked domains as reported by pihole ('blocking status == blocked by gravity' or 'blocking status == blocked by gravity + blocked during CNAME inspection') and how often those domains have been blocked ('hits')
  • the number of covered domains and how often those would have been blocked ('hits')
  • special case: domains on your (personal) blacklist which are also on an adlist and have been visited in the past, including hits (run 'pihole -q' to see on which adlist those domains appear)
  • optional: top blocked domains and number of hits if your current adlist configuration would have been used
  • adlist table with id, status, total domains on adlist, covered domains, hits, unique covered domains, address
  • the sum of unique covered domains
  • optional: list of unique covered domains with adlist_id, address
  • optional: analyse regex blacklist

As domains usually appear on more than one adlist, I introduce the concept of unique covered domains. Those are domains that have been visited, would have been blocked, and appear on just one adlist. This might help you to value your adlists not just by how many domains they cover but also by what you would lose if you disabled a given adlist.
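The idea of unique covered domains can be sketched in a few lines of Python. This is only an illustration under assumed, simplified tables (gravity with domain and adlist_id, queries with timestamp and domain) and invented sample data; it does not claim to be the query the script runs.

```python
import sqlite3

# Hypothetical, simplified tables; not the real Pi-hole schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE gravity (domain TEXT, adlist_id INTEGER);
    CREATE TABLE queries (timestamp INTEGER, domain TEXT);
    INSERT INTO gravity VALUES ('ads.example', 1), ('ads.example', 2),
                               ('tracker.example', 2), ('telemetry.example', 3);
    INSERT INTO queries VALUES (100, 'ads.example'), (101, 'tracker.example'),
                               (102, 'tracker.example');
""")

# A domain is "unique covered" if it was visited and sits on exactly one adlist.
# With a single adlist per group, MIN(adlist_id) is simply that adlist's id.
unique_covered = con.execute("""
    SELECT MIN(g.adlist_id) AS adlist_id, g.domain
    FROM gravity g
    WHERE g.domain IN (SELECT domain FROM queries)
    GROUP BY g.domain
    HAVING COUNT(DISTINCT g.adlist_id) = 1
    ORDER BY g.domain
""").fetchall()

print(unique_covered)
```

Here only tracker.example qualifies: ads.example is on two lists, and telemetry.example is on one list but was never visited.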


Limits

  • Disabled blocklists won't be analyzed, as gravity does not include domains from deactivated adlists. You can enable all adlists from within the script. The script will warn you if there is a mismatch between the enabled adlists and the data found in the gravity database. You then have the choice to run gravity to clear the mismatch or to proceed anyway. In the latter case the tool will analyze all available data, but results must be interpreted with caution. (see 8dab71)

  • Black/Whitelisted domains (including regex, see PR #19) are not considered when calculating the number of covered domains (and hits)

    • Whitelisted domains reduce the number of blocked domains as reported by pihole compared to the calculated numbers
    • Blacklisted domains increase the number of blocked domains as reported by pihole compared to the calculated numbers
  • This tool cannot deal with domains that have been blocked due to CNAME inspection, because pihole doesn't store the actual blocked domain but the CNAME and a corresponding status ("Blocked during deep CNAME inspection"). This CNAME domain will not match a domain from an adlist; if it did, it would have been blocked directly. (see PR #3)

  • Other differences between the number of domains/hits as reported by pihole and the calculated numbers are due to changes in the adlist configuration over time

  • For the limits of the regex analysis see the notes of PR #19
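To make the mismatch warning from the first limit more concrete, the following Python sketch compares the set of enabled adlists with the adlist ids that actually have domains in gravity. How the script itself detects the mismatch is not shown here; the tables are simplified assumptions and the data is invented. Note that in this sketch an enabled but empty adlist would also be reported as having no data.

```python
import sqlite3

# Hypothetical, simplified tables; not the real Pi-hole schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE adlist  (id INTEGER PRIMARY KEY, address TEXT, enabled INTEGER);
    CREATE TABLE gravity (domain TEXT, adlist_id INTEGER);
    INSERT INTO adlist VALUES (1, 'https://example.org/a.txt', 1),
                              (2, 'https://example.org/b.txt', 1),
                              (3, 'https://example.org/c.txt', 0);
    -- List 2 was enabled after the last gravity run, list 3 was disabled after it.
    INSERT INTO gravity VALUES ('ads.example', 1), ('old.example', 3);
""")

enabled = {row[0] for row in con.execute("SELECT id FROM adlist WHERE enabled = 1")}
in_gravity = {row[0] for row in con.execute("SELECT DISTINCT adlist_id FROM gravity")}

# Enabled lists whose domains are not (yet) in gravity.
enabled_without_data = sorted(enabled - in_gravity)
# Lists that still have domains in gravity although they are disabled now.
data_from_disabled = sorted(in_gravity - enabled)
mismatch = bool(enabled_without_data or data_from_disabled)

print(enabled_without_data, data_from_disabled, mismatch)
```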


Caveat

  • Depending on the number of enabled adlists and the number of visited domains in the selected time period, the calculation might take some time, so please be patient. On my NanoPi NeoPlus2 (ARM, Quad-core Cortex A53) it takes ~17-18 seconds to analyse 2.3 million queries from pihole-ftl.db and 347603 domains in gravity.db.

  • Analysis of the regex blacklist can easily take minutes!

  • While lists that have attracted no or only very few hits in the analysis are prime candidates for removal, you should also consider the type of blocklist before you ultimately decide to remove a list, e.g. you may want to keep malware- or telemetry-focused blocklists nonetheless.


Requirements

  • Pi-hole FTL v5.5 (see PR #13)
  • For Docker users see notes below

Notes on Docker

Running Pi-hole on Docker is not officially supported by this script! I don't run Pi-hole on Docker myself and have no ability to test the script. Expect things to break anytime. However, I do try to release a "workaround" script (pihole_adlist_tool_docker) that should also work with Pi-hole on Docker. Don't expect the full functionality or me to invest a lot of time in this. Contributions welcome!

Requirements:

  • Pi-hole v5.5
  • sqlite3 package installed on the host system

Usage

pihole_adlist_tool [options]

options:
    -d [Num]                         Consider the last [Num] days (Default: 30). Enter 0 for all-time analysis.
    -t [Num]                         Show top blocked domains. [Num] defines the number to show.
    -s [total/domains/hits/unique]   Set sorting order to total domains, domains covered, hits covered or unique covered domains DESC. (Default sorting: id ASC)
    -u                               Show covered unique domains
    -a                               Run in 'automatic mode'. No user input is required at all, assuming default choice would be to leave everything untouched.
    -r                               Analyse regex as well. Depending on the amount of domains and regex this might take a while.
    -h                               Show this help dialog
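The -d option limits the analysis to the last [Num] days, with 0 meaning all-time. One plausible way to turn that into a timestamp filter is sketched below in Python. The helper name cutoff_timestamp is hypothetical, and the assumption that queries carry Unix timestamps in seconds is mine; the script may compute its window differently.

```python
import time

SECONDS_PER_DAY = 86400


def cutoff_timestamp(days, now=None):
    """Return the oldest Unix timestamp to include; 0 means no lower bound."""
    if days < 0:
        raise ValueError("days must be 0 (all-time) or a positive number")
    if days == 0:
        return 0  # all-time analysis: every stored query qualifies
    if now is None:
        now = int(time.time())
    return now - days * SECONDS_PER_DAY


# The default of 30 days, evaluated at a fixed "now" so the result is reproducible.
print(cutoff_timestamp(30, now=1_700_000_000))
```

A query would then be included if its timestamp is greater than or equal to the returned value.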


Background

As the adlist configuration might have changed over time (added/removed adlists, enabled/disabled adlists), this script doesn't rely on pihole's blocking status for the analysis but rather determines whether queries from the long-term database would have been blocked with the current adlist configuration. Relying on the blocking status could lead to wrong assumptions about the coverage of your current adlist configuration: some domains might have been blocked in the past but wouldn't be blocked now (removed adlist), and some might be blocked now but weren't in the past (added adlist). If the adlist configuration hasn't changed over time, there should be no huge difference between this approach and using pihole's blocking status.
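The difference between the stored blocking status and a re-analysis against the current configuration can be shown with a small Python sketch. The tables are simplified assumptions and the data is invented. The status codes 1 (blocked by gravity) and 9 (blocked by gravity during CNAME inspection) reflect my understanding of FTL's query database and should be checked against the FTL documentation.

```python
import sqlite3

# Hypothetical, simplified tables; not the real Pi-hole schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE gravity (domain TEXT, adlist_id INTEGER);
    CREATE TABLE queries (timestamp INTEGER, status INTEGER, domain TEXT);
    -- Current configuration: an old list was removed, list 2 was added later.
    INSERT INTO gravity VALUES ('new-ads.example', 2);
    INSERT INTO queries VALUES
        (100, 1, 'old-ads.example'),  -- blocked back then by the removed list
        (101, 2, 'new-ads.example'),  -- forwarded back then, on an adlist now
        (102, 2, 'news.example');     -- never blocked
""")

# What the stored status says (assumed codes: 1 = gravity, 9 = gravity via CNAME).
blocked_then = [row[0] for row in con.execute(
    "SELECT DISTINCT domain FROM queries WHERE status IN (1, 9) ORDER BY domain")]

# What the current adlist configuration would block, regardless of stored status.
would_block_now = [row[0] for row in con.execute("""
    SELECT DISTINCT domain FROM queries
    WHERE domain IN (SELECT domain FROM gravity)
    ORDER BY domain
""")]

print(blocked_then)
print(would_block_now)
```

The two lists do not overlap in this sample, which is the situation the paragraph above warns about when adlists were added or removed during the analysed period.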

The deeper reason for re-analyzing the queries is that this tool should help you to make predictions for the future: assuming your online behavior is rather stable over time and you analyze a long enough dataset from the past this tool will tell you which adlist might be worth keeping (because it contains a lot of covered domains) and which you could safely remove (no covered domains and/or covered domains but no unique covered domains).


Support, Contribute & Todo

I'm not a developer. This script is mostly done by copy-pasting snippets I found online. I know there is no proper error and exception handling. If you are willing to improve the script feel free to submit pull requests. Things on my todo list:

  • Further improve speed of the database handling. The slow steps are
    • Select all domains from pihole-ftl.db that are also found in gravity.db
    • Get the total number of blocked domains from pihole-ftl.db
    • Get the total number of hits from pihole-ftl.db
    • Update adlist with the total number of domains from gravity.db for each adlist (see e0af664)
  • Format SQL output with awk