SpamScope / Mail Parser

Licence: apache-2.0
Tokenizer for raw mails

SpamScope

mail-parser

mail-parser is more than a wrapper for the email module of the Python Standard Library. It gives you an easy way to turn a raw mail into a Python object that you can use in your code. It's the key module of SpamScope.

mail-parser can parse the Outlook email format (.msg). To use this feature, you need to install the libemail-outlook-message-perl package. For Debian-based systems:

$ apt-get install libemail-outlook-message-perl

For more details:

$ apt-cache show libemail-outlook-message-perl

mail-parser supports Python 3.

Apache 2 Open Source License

mail-parser can be downloaded, used, and modified free of charge. It is available under the Apache 2 license.

Support the project

Dogecoin: DAUbDUttkf8WN1kwP9YYQQKyEJYY2WWtEG

Description

mail-parser takes a raw email as input and generates a parsed object. The properties of this object have the same names as the RFC headers:

  • bcc
  • cc
  • date
  • delivered_to
  • from_ (not from, because from is a Python keyword)
  • message_id
  • received
  • reply_to
  • subject
  • to

There are other properties to get:

  • body
  • body html
  • body plain
  • headers
  • attachments
  • sender IP address
  • to domains
  • timezone

The attachments property is a list of objects. Every object has the following keys:

  • binary: it's true if the attachment is a binary
  • charset
  • content_transfer_encoding
  • content-disposition
  • content-id
  • filename
  • mail_content_type
  • payload: attachment payload in base64

To get a custom header, replace "-" with "_" in its name. For example, the header X-MSMail-Priority is available as:

mail.X_MSMail_Priority
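One way such an attribute-to-header mapping could work is sketched below with the standard library only. The HeaderView class is hypothetical and is not how mail-parser is necessarily implemented. It only shows the idea of translating "_" back to "-" on attribute access.

```python
from email import message_from_string


class HeaderView:
    """Expose headers as attributes, mapping "_" in the attribute name to "-"."""

    def __init__(self, message):
        self._message = message

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        header = name.replace("_", "-")
        value = self._message.get(header)  # header lookup is case-insensitive
        if value is None:
            raise AttributeError(name)
        return value


raw = "From: a@example.com\nX-MSMail-Priority: High\n\nhello\n"
mail = HeaderView(message_from_string(raw))
priority = mail.X_MSMail_Priority
```

A header that is absent from the message raises AttributeError, so hasattr() can be used to probe for it.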

The received header is parsed and split into hops. The supported fields are:

  • by
  • date
  • date_utc
  • delay (between two hops)
  • envelope_from
  • envelope_sender
  • for
  • from
  • hop
  • with

mail-parser can detect defects in a mail:

  • defects: parts of the mail that are not RFC compliant

Every property also has a JSON and a raw variant, which you can get with:

  • name_json
  • name_raw

Example:

mail.to (Python object)
mail.to_json (JSON)
mail.to_raw (raw header)

The command-line tool uses the JSON format.

Defects

These defects can be used to evade antispam filters. One example is a mail with a malformed boundary, which can hide an illegitimate epilogue (often malware). This library can extract these epilogues.
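The kind of defect described here can be reproduced with the standard library email parser, which records defects on the parsed message. The sketch below uses a made-up message that declares a multipart boundary but never uses it. It shows the underlying mechanism only and does not go through mail-parser's own defects properties.

```python
from email import message_from_string

# The Content-Type header declares a boundary, but the body never contains it.
raw = (
    "From: a@example.com\n"
    "Subject: broken\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "This body never contains the declared boundary line.\n"
    "hidden content\n"
)

msg = message_from_string(raw)

# Each defect is an instance of a class from email.errors.
defect_names = [type(defect).__name__ for defect in msg.defects]

# The parser keeps the unparsed text as the payload, so it can still be inspected.
leftover = msg.get_payload()
```

Because the text after the headers is kept as the payload rather than discarded, content that a naive MIME walker would skip remains available for analysis.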

Authors

Main Author

Fedele Mantuano: LinkedIn

Installation

Clone repository

$ git clone https://github.com/SpamScope/mail-parser.git

and install mail-parser with setup.py:

$ cd mail-parser

$ python setup.py install

or use pip:

$ pip install mail-parser

Usage in a project

Import mailparser module:

import mailparser

mail = mailparser.parse_from_bytes(byte_mail)
mail = mailparser.parse_from_file(f)
mail = mailparser.parse_from_file_msg(outlook_mail)
mail = mailparser.parse_from_file_obj(fp)
mail = mailparser.parse_from_string(raw_mail)

Then you can access all the parts:

mail.attachments: list of all attachments
mail.body
mail.date: datetime object in UTC
mail.defects: defects of parts that are not RFC compliant
mail.defects_categories: only defects categories
mail.delivered_to
mail.from_
mail.get_server_ipaddress(trust="my_server_mail_trust")
mail.headers
mail.mail: tokenized mail in an object
mail.message: email.message.Message object
mail.message_as_string: message as string
mail.message_id
mail.received
mail.subject
mail.text_plain: only text plain mail parts in a list
mail.text_html: only text html mail parts in a list
mail.text_not_managed: all not managed text (check the warning logs to find content subtype)
mail.to
mail.to_domains
mail.timezone: returns the timezone, offset from UTC
mail.mail_partial: returns only the main parts of the email
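The relation between a UTC date and a timezone offset, as exposed by mail.date and mail.timezone, can be sketched with the standard library. The helper date_and_timezone is hypothetical, and the exact format in which mail-parser reports the offset is not assumed here.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def date_and_timezone(date_header):
    """Return (datetime in UTC, offset from UTC in hours) for a Date header value."""
    local = parsedate_to_datetime(date_header)
    offset = local.utcoffset()
    if offset is None:  # a "-0000" zone yields a naive datetime; treat it as UTC
        return local.replace(tzinfo=timezone.utc), 0.0
    return local.astimezone(timezone.utc), offset.total_seconds() / 3600


utc_date, offset_hours = date_and_timezone("Mon, 01 Jan 2024 11:00:05 +0100")
```

Here 11:00:05 at +0100 becomes 10:00:05 in UTC, with an offset of one hour.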

It's possible to write the attachments on disk with the method:

mail.write_attachments(base_path)

Usage from command-line

If you installed mailparser with pip or setup.py, you can use it from the command line.

These are all the switches:

usage: mailparser [-h] (-f FILE | -s STRING | -k)
                   [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}] [-j] [-b]
                   [-a] [-r] [-t] [-dt] [-m] [-u] [-c] [-d] [-o]
                   [-i Trust mail server string] [-p] [-z] [-v]

Wrapper for email Python Standard Library

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Raw email file (default: None)
  -s STRING, --string STRING
                        Raw email string (default: None)
  -k, --stdin           Enable parsing from stdin (default: False)
  -l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}, --log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}
                        Set log level (default: WARNING)
  -j, --json            Show the JSON of parsed mail (default: False)
  -b, --body            Print the body of mail (default: False)
  -a, --attachments     Print the attachments of mail (default: False)
  -r, --headers         Print the headers of mail (default: False)
  -t, --to              Print the to of mail (default: False)
  -dt, --delivered-to   Print the delivered-to of mail (default: False)
  -m, --from            Print the from of mail (default: False)
  -u, --subject         Print the subject of mail (default: False)
  -c, --receiveds       Print all receiveds of mail (default: False)
  -d, --defects         Print the defects of mail (default: False)
  -o, --outlook         Analyze Outlook msg (default: False)
  -i Trust mail server string, --senderip Trust mail server string
                        Extract a reliable sender IP address heuristically
                        (default: None)
  -p, --mail-hash       Print mail fingerprints without headers (default:
                        False)
  -z, --attachments-hash
                        Print attachments with fingerprints (default: False)
  -sa, --store-attachments
                        Store attachments on disk (default: False)
  -ap ATTACHMENTS_PATH, --attachments-path ATTACHMENTS_PATH
                        Path where store attachments (default: /tmp)
  -v, --version         show program's version number and exit

It takes as input a raw mail and generates a parsed object.

Example:

$ mailparser -f example_mail -j

This example shows the tokenized mail as pretty-printed JSON.

From raw mail to parsed mail.

Exceptions

Exceptions hierarchy of mail-parser:

MailParserError: Base MailParser Exception
|
\── MailParserOutlookError: Raised with Outlook integration errors
|
\── MailParserEnvironmentError: Raised when the environment is not correct
|
\── MailParserOSError: Raised when there is an OS error
|
\── MailParserReceivedParsingError: Raised when a received header cannot be parsed
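
Since every exception derives from MailParserError, a caller can handle one specific case first and fall back to the base class for the rest. The sketch below uses stand-in classes that mirror the hierarchy above so that it runs on its own. In real code you would import the classes from the mailparser package instead, and the handle function is hypothetical.

```python
# Stand-in classes mirroring the documented hierarchy (not the library's own code).
class MailParserError(Exception):
    """Base MailParser Exception."""


class MailParserOutlookError(MailParserError):
    """Raised with Outlook integration errors."""


class MailParserEnvironmentError(MailParserError):
    """Raised when the environment is not correct."""


class MailParserOSError(MailParserError):
    """Raised when there is an OS error."""


class MailParserReceivedParsingError(MailParserError):
    """Raised when a received header cannot be parsed."""


def handle(exc):
    """Show that a specific handler wins and the base class catches the rest."""
    try:
        raise exc
    except MailParserOutlookError:
        return "outlook"  # e.g. the Outlook conversion tooling is missing
    except MailParserError:
        return "generic"  # any other mail-parser failure


results = [handle(MailParserOutlookError("msg")), handle(MailParserOSError("os"))]
```

The order of the except clauses matters: the base class must come last, otherwise it would shadow the more specific handler.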