All Projects → FgForrest → Postfix-Deliverability-Analytics

FgForrest / Postfix-Deliverability-Analytics

Licence: other
[DEPRECATED] A tool that goes throu Posftix logs and builds a statistics of bounces (non-delivered messages). Statistics are provided by REST API to the client.

Programming Languages

scala
5932 projects
java
68154 projects - #9 most used programming language
shell
77523 projects
Roff
2310 projects

Projects that are alternatives of or similar to Postfix-Deliverability-Analytics

Matomo
Liberating Web Analytics. Star us on Github? +1. Matomo is the leading open alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. We love Pull Requests!
Stars: ✭ 15,711 (+74714.29%)
Mutual labels:  log, analytics
keen-sdk-net
A .NET SDK for the Keen IO API
Stars: ✭ 35 (+66.67%)
Mutual labels:  analytics
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+209.52%)
Mutual labels:  analytics
castopod-host
Castopod is an open-source hosting platform made for podcasters who want engage and interact with their audience. Synchronized read-only mirror of https://code.castopod.org/adaures/castopod
Stars: ✭ 81 (+285.71%)
Mutual labels:  analytics
mixpanel-lite
2.9k alternative to mixpanel-js with offline support for PWAs
Stars: ✭ 49 (+133.33%)
Mutual labels:  analytics
LSQ
Linked SPARQL Queries (LSQ): Framework for RDFizing triple store (web) logs and performing SPARQL query extraction, analysis and benchmarking in order to produce datasets of Linked SPARQL Queries
Stars: ✭ 23 (+9.52%)
Mutual labels:  analytics
piker
#nontina, #paperhands,, #pwnzebotz, #tradezbyguille
Stars: ✭ 63 (+200%)
Mutual labels:  analytics
clevertap-ios-sdk
CleverTap iOS SDK
Stars: ✭ 39 (+85.71%)
Mutual labels:  analytics
doom-console-log
🕹️ DOOM rendered via console.log() in a web browser.
Stars: ✭ 19 (-9.52%)
Mutual labels:  log
composition-logger
The most optimal way to visualize/debug functional compositions 🔍
Stars: ✭ 15 (-28.57%)
Mutual labels:  log
aws-web-analytics
Privacy-focused alternative to Google Analytics on AWS Pinpoint
Stars: ✭ 45 (+114.29%)
Mutual labels:  analytics
analog-ce
Analog CE
Stars: ✭ 14 (-33.33%)
Mutual labels:  log
twitter-analytics-wrapper
A simple Python wrapper to download tweets data from the Twitter Analytics platform. Particularly interesting for the impressions metrics that are unavailable on current Twitter API. Also works for the videos data.
Stars: ✭ 44 (+109.52%)
Mutual labels:  analytics
multi-verse
lit-element components for fast and modular multivariate analysis
Stars: ✭ 34 (+61.9%)
Mutual labels:  analytics
tellery
Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
Stars: ✭ 219 (+942.86%)
Mutual labels:  analytics
growthbook
Open Source Feature Flagging and A/B Testing Platform
Stars: ✭ 2,342 (+11052.38%)
Mutual labels:  analytics
mimir
Data-ish exploration through SQL+Uncertainty
Stars: ✭ 26 (+23.81%)
Mutual labels:  analytics
keen-analysis.js
A light JavaScript client for Keen
Stars: ✭ 40 (+90.48%)
Mutual labels:  analytics
smf-spf
It's a lightweight, fast and reliable Sendmail milter that implements the Sender Policy Framework
Stars: ✭ 12 (-42.86%)
Mutual labels:  postfix
workshops-setup cloud analytics machine
Tips and Tricks to setup a cloud machine for Analytics and Data Science with R, RStudio and Shiny Servers, Python and JupyterLab
Stars: ✭ 12 (-42.86%)
Mutual labels:  analytics

Postfix Deliverability Analytics

Due to the smtp protocol imperfection there are very limited possibilities for a mail client to learn much about what happens with a message after being sent. However we can resolve this information from logs of Postfix smtp server that is logging the entire course of message delivery process. PDA runs on the same host and it is parsing postfix log files and indexing data so that mail clients can query PDA over http and restful interface for whatever information it is interested in.

Build instructions

PDA is not meant to be an embeddable application but rather standalone agent monitoring postfix logs and serving client requests. So far it uses maven and you don't need to use anything else then basic maven goals like clean, test and install. Install makes a package with generated shell scripts that are self documented.

Use common.sh for package distribution.

Use control.sh for server lifecycle management

This is a directory structure you're end up with :

├── db
│   └── main
│       ├── queue
│       ├── queue.p
│       ├── queue.t
│       ├── smtpLogDb
│       └── smtpLogDb.p
└── pda
    ├── bin
    │   ├── control.sh
    │   └── start.sh
    ├── conf
    │   ├── application.conf
    │   └── bounce-regex-list.xml
    ├── logs
    │   └── app-2014-03-22_205457.log
    └── repo (jars)

PDA general info

  • Standalone application written in Scala, using Akka and MapDB as it's core dependencies
  • Apart from HttpServer single-threaded pool there are 4 more threads/actors :
    • Superviser
    • Indexer - receives relevant log records and indexes / stores them to MapDB.
    • Tailer - controls another thread that tails log files. It is blocked all the time
  • Memory footprint is from 20 - 400 MG RAM depending on databaze size that affects memory size required for handling restful queries
  • Database is encrypted and http server supports only basic authentication
  • Prefer using dedicated postfix server being shared by a set of PDA client applications. Do not expect PDA to be able of handling tens of GigaBytes of log files. On a dedicated postfix server PDA is able to index and serve around 5GB of uncompressed log files. It is not optimized and tested beyond that.
  • PDA is using bounce-regex-list.xml file to gather regular expressions for categorizing bounces by messages. Once in a while it needs to be updated with regular expression for newly encountered bounce messages that was not categorized by current rules. Beware that amount of regular expressions and their complexity influence indexing speed because it might be executed in order of magnitude of 7
  • PDA's user interface is served by HttpServer running on localhost:1523 by default
  • PDA's initialization has a several steps
    1. regex-bounce-list.xml file is processed
    2. log files that were backed up by logrotate are indexed
    3. tailing of the actual log file that is being written to starts, indexing all its current and future content
    4. HttpServer starts listening to http requests

Diagram

Postfix settings

  • All necessary configuration regarding to postfix can be set in /etc/postfix/main.cf
    • deferred queue tuning - when postfix is unable to deliver a message due to a soft bounce, it attempts to do so again using interval where next value is a double of the previous. By default it is trying to do so too frequently which might produce GigaBytes of data. Desired time serie would rather look like this : 4m, 8m, 16m, 32m, 1h, 2h, 4h, 8h, 16h, 32h, 64h
      • minimal_backoff_time = 240s
      • maximal_backoff_time = 40h
      • maximal_queue_lifetime = 6d
    • enable_long_queue_ids = yes
      • this is an essential setting that makes queue IDs unique which eliminates records duplication
    • everything else can be set in PDA's agent.conf file, especially :
      • logs.dir = "/var/log/"
      • tailed-log-file = "mail.log"
      • rotated-file = "mail.log.1"
      • rotated-file-pattern = "mail\.log\.(\d{1,3}).*"
      • max-file-size-to-index : 1000

Mail client settings

Client applications must identify themselves so that PDA is able to associate message queues with corresponding smtp clients. There are 2 ways how to do that:

  • simple : override Message-ID header with value of this format :
"<" + "cid".hashCode() + '.' + id + '.' + RND.nextInt(1000) + '@' + clientId + ">"
  • complicated : consists in adding a new header "client-id" to a message and setting up postfix to log these headers so that PDA can see that. Header_checks and logging requires creating a configuration file /etc/postfix/maps/header_checks which makes postfix log messages that have a "client-id" MIME header
"/^client-id:.* / INFO"

Syslog settings

Syslog is logging timestamps without miliseconds which is not quite useful. In /etc/rsyslog.conf :

  • define MailLogFormat template
$template MailLogFormat,"%timegenerated:1:4:date-rfc3339% %timegenerated:1:6:date-rfc3164-buggyday% %timegenerated:12:23:date-rfc3339% %source% %syslogtag%%msg%\n"
  • tell syslog to use template MailLogFormat for logging into mail.* log files
mail.*              -/var/log/mail.log;MailLogFormat

Logrotate settings

Logrotate is periodically rotating log files so that the log file will be renamed and compressed as follows :

  • mail.log -> mail.log.1 -> mail.log.1.gz -> mail.log.1.gz
  • these settings makes it easier for PDA to process everything. Property 'size 100MB' doesn't allow log files grow more then 100MB
  • /etc/logrotate.d/rsyslog
/var/log/mail.log {
        rotate 100
	    size 100M
	    missingok
	    notifempty
	    compress
	    delaycompress
	    sharedscripts
	    postrotate
		    invoke-rc.d rsyslog reload > /dev/null
	    endscript
}

PDA setting

All PDA settings are available in application.conf

Bounce classification by regular expressions

Considering the fact that individual email services make up their own error messages and reasons why email can not be delivered, there is no easy way to classify them. bounce-regex-list.xml groups regular expressions into a several categories to match every error message encountered at log file in order to classify it as a soft or hard bounce and to find general reason for it's nondelivery. After some time you can check unclassified bounces in UI http://localhost:1523/ under 'Unclassified bounce messages' or http://localhost:1523/agent-status/unknown-bounces. There is a list of clients, each might contain a list of bounced messages that were not classified, value of 'info' property is the actual error message that was not possible to classify using bounce-regex-list. You need to edit file bounce-regex-list.xml (it's location depends on application.conf settings), add or change some regular expression so that next time this kind of error will be classified. 'Refresh bounce list' command reloads bounce-regex-list.xml that you edited. Significant amount of bounces might mean that there is something wrong with your postfix settings or mx records

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].