
Mincka / Dmarchiver

Licence: gpl-3.0
A tool to archive the direct messages, images and videos from your private conversations on Twitter

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Dmarchiver

Scrape Twitter
🐦 Access Twitter data without an API key. [DEPRECATED]
Stars: ✭ 166 (-18.63%)
Mutual labels:  conversation, twitter, tweets
Hitomi Downloader
🍰 Desktop application to download images/videos/music/text from Hitomi.la and other sites, and more.
Stars: ✭ 1,154 (+465.69%)
Mutual labels:  downloader, twitter
Twitterldatopicmodeling
Uses topic modeling to identify context between follower relationships of Twitter users
Stars: ✭ 48 (-76.47%)
Mutual labels:  twitter, tweets
Twitter media downloader
Twitter media downloader.
Stars: ✭ 75 (-63.24%)
Mutual labels:  downloader, twitter
Guffer
Guffer tweets based on a daily schedule
Stars: ✭ 12 (-94.12%)
Mutual labels:  twitter, tweets
Quip Export
Export all folders and documents from Quip
Stars: ✭ 28 (-86.27%)
Mutual labels:  conversation, backup
Sarcasm Detection
Detecting Sarcasm on Twitter using both traditional machine learning and deep learning techniques.
Stars: ✭ 73 (-64.22%)
Mutual labels:  twitter, tweets
Twitter
Twitter API for Laravel 5.5+, 6.x, 7.x & 8.x
Stars: ✭ 755 (+270.1%)
Mutual labels:  twitter, tweets
Tta Elastic
Official Trump Twitter Archive V2 source
Stars: ✭ 104 (-49.02%)
Mutual labels:  twitter, tweets
Twint
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
Stars: ✭ 12,102 (+5832.35%)
Mutual labels:  twitter, tweets
Real Time Sentiment Tracking On Twitter For Brand Improvement And Trend Recognition
A real-time interactive web app based on data pipelines using streaming Twitter data, automated sentiment analysis, and MySQL&PostgreSQL database (Deployed on Heroku)
Stars: ✭ 127 (-37.75%)
Mutual labels:  twitter, tweets
Tweets
🐦 Tweet every 24 pull request
Stars: ✭ 8 (-96.08%)
Mutual labels:  twitter, tweets
Tumblthree
A Tumblr Blog Backup Application
Stars: ✭ 923 (+352.45%)
Mutual labels:  backup, downloader
Twweet Cli
🐦 Tweet right from your cli without even opening your browser.
Stars: ✭ 47 (-76.96%)
Mutual labels:  twitter, tweets
Twitter Post Fetcher
Fetch your twitter posts without using the new Twitter 1.1 API. Pure JavaScript! By Jason Mayes
Stars: ✭ 886 (+334.31%)
Mutual labels:  twitter, tweets
Bash2mp4
Video Downloader for Termux .
Stars: ✭ 68 (-66.67%)
Mutual labels:  downloader, twitter
Redditdownloader
Scrapes Reddit to download media of your choice.
Stars: ✭ 521 (+155.39%)
Mutual labels:  backup, downloader
Tweetscraper
TweetScraper is a simple crawler/spider for Twitter Search without using API
Stars: ✭ 694 (+240.2%)
Mutual labels:  twitter, tweets
Twitter Sentiment Analysis
This script can tell you the sentiments of people regarding to any events happening in the world by analyzing tweets related to that event
Stars: ✭ 94 (-53.92%)
Mutual labels:  twitter, tweets
Wa Reader
💬 WA Reader is a platform to read WhatsApp conversations from email text backups in an easy-to-read UI.
Stars: ✭ 130 (-36.27%)
Mutual labels:  conversation, archive

DMArchiver is currently broken

2020-08-16

Due to recent changes on Twitter, the method originally used by DMArchiver will no longer work. There won't be a quick fix as it requires a major rewrite.

The issue is tracked here: https://github.com/Mincka/DMArchiver/issues/83


DMArchiver

A tool to archive all the direct messages from your private conversations on Twitter.

Introduction

Have you ever needed to retrieve old information from a chat with your friends on Twitter? Or maybe you would just like to back up all these cheerful moments and keep them safe.

I made this tool to retrieve all the messages from my private conversations and transform them into an IRC-like log for archiving.

Output sample:

[2016-09-07 10:35:55] <Michael> [Media-image] https://ton.twitter.com/1.1/ton/data/dm/773125478562429059/773401254876366208/mfeDmXXj.jpg I am so a Dexter fan...
[2016-09-07 10:36:12] <Michael> [Media-sticker] [Grinning face] https://ton.twimg.com/stickers/stickers/10001_raw.png
[2016-09-07 10:37:12] <Kathy> He is so sexy. 😳 I love him. ❤️
[2016-09-07 10:38:10] <Steve> You guys are ridiculous! 😂
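
If you want to post-process an archive, the log format shown above is easy to read back. The following is a minimal sketch, assuming every message starts with a `[YYYY-MM-DD HH:MM:SS] <Name>` prefix as in the sample. The names `LogLine` and `parse_log_line` are hypothetical and are not part of DMArchiver.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Matches "[2016-09-07 10:38:10] <Steve> message text"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] <([^>]+)> (.*)$")


class LogLine(NamedTuple):
    timestamp: datetime
    author: str
    text: str


def parse_log_line(line: str) -> Optional[LogLine]:
    """Parse one log line; return None if it does not match the assumed format."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    stamp, author, text = match.groups()
    return LogLine(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), author, text)


if __name__ == "__main__":
    print(parse_log_line("[2016-09-07 10:38:10] <Steve> You guys are ridiculous!"))
```

Lines that do not start with a timestamp (for example, the continuation of a multi-line message) return None, so a caller can decide whether to append them to the previous message.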

This tool is also able to download all the uploaded images and videos in their original resolution and, as a bonus, also retrieve the GIFs you used in your conversations as MP4 files (the format used by Twitter to optimize them and save space).

You may have found suggestions to use Twitter's archive feature to do the same, but Direct Messages are not included in the generated archive.

The script does not use the Twitter API because of its very restrictive limits on Direct Messages: through the API, it is currently possible to retrieve only the latest 200 messages of a private conversation.

Because it is still possible to retrieve older messages from a conversation by scrolling up, this script simply simulates this behavior to get the messages automatically.
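
The "scroll up until nothing is left" idea can be sketched as a cursor-based loop. This is only an illustration of the general technique, not DMArchiver's actual code: `fetch_page` is a hypothetical callable that returns one page of messages plus a cursor pointing to the next older page (or None when the top of the conversation is reached).

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: (messages on this page, cursor of the next older page or None)
Page = Tuple[List[str], Optional[str]]


def iter_all_messages(
    fetch_page: Callable[[Optional[str]], Page],
    delay: float = 0.0,
    max_pages: int = 10_000,
) -> Iterator[str]:
    """Yield messages page by page, newest page first, until no cursor remains."""
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        messages, cursor = fetch_page(cursor)
        yield from messages
        if cursor is None:
            return
        if delay > 0:
            time.sleep(delay)  # pause between requests


if __name__ == "__main__":
    # Fake three-page conversation used in place of real HTTP requests.
    pages = {None: (["m3"], "c1"), "c1": (["m2"], "c2"), "c2": (["m1"], None)}
    print(list(iter_all_messages(lambda cursor: pages[cursor])))
```

A pause between pages plays the same role as the tool's -d option: it spreads the requests out over time.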

Warning: possible account lockout

A few users have reported account lockouts because of the use of this tool. Twitter seems to lock accounts more aggressively if a new login context is detected. Even though locking can be reverted, you should be aware of this risk when using this tool. An additional attempt after unlocking can allow the tool to perform better on the second run.

If you need to run the tool multiple times, it is also recommended to use the -s parameter to reuse cookies from a previous session. You will not receive a new login warning by e-mail since the tool will reuse an existing session.

Disclaimer:

This tool behaves just as you would when using the Twitter web site with your browser, so there is nothing illegal in using it to retrieve your own data. However, depending on the length of your conversations, it may trigger a lot of requests to the site, which could look suspicious to Twitter. In this case, Twitter could preemptively lock the account.

Because this script leverages an unsupported method to retrieve the tweets, it may break at any time. Indeed, Twitter may change the output code without warning. If you get errors you did not have previously, please check if new releases of the tool are available.

Installation & Quick start

If you run the tool without any argument, you will only be prompted for your username and your password. The script will retrieve all the messages from all the conversations, without the images or the GIFs.

Windows

Download a Windows build from the project releases.

Unzip the archive in a temporary folder and double-click the executable or run it in a Command Prompt (mandatory if you want to use parameters to download images and videos):

> C:\Temp\DMArchiver.exe

Note: If you run the tool directly from the zip archive window, it may fail when writing the log file. Instead, copy DMArchiver.exe to any directory and run it from there.

Mac OS X / macOS

Download a macOS build from the project releases.

Then click on the executable, or run Terminal and execute the following commands (mandatory if you want to use parameters to download images and videos):

$ cd Downloads
$ ./dmarchiver

Note: If you run the tool by clicking on it, the result files will be available in your /Users/username folder.

Ubuntu

$ pip3 install dmarchiver
$ dmarchiver

Installation & upgrade with pip (any platform)

$ pip3 install dmarchiver
$ dmarchiver
$ pip3 install dmarchiver --upgrade

Advanced usage

Command line tool

$ dmarchiver [-h] [-id CONVERSATION_ID] [-u] [-p] [-di] [-dg] [-dv]

$ dmarchiver --help
	usage: cmdline.py [-h] [-id CONVERSATION_ID] [-u] [-p] [-di] [-dg] [-dv]
	
	optional arguments:
	  -h, --help            show this help message and exit
	  -id CONVERSATION_ID, --conversation_id CONVERSATION_ID
	                        Conversation ID
	  -u,  --username       Username (e-mail or handle)
	  -p,  --password       Password
	  -d,  --delay          Delay between requests (seconds)
	  -s,  --save-session   Save the session locally
	  -di, --download-images
	                        Download images
	  -dg, --download-gifs  Download GIFs (as MP4)
	  -dv, --download-videos
	                        Download videos (as MP4)
	  -th,  --twitter-handle     
	                        Use the Twitter handles instead of the display names						
	  -r, --raw-output      Write the raw HTML to a file
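
For readers who want to wrap or reimplement this interface, here is a sketch of how the options listed above could be declared with argparse. It is reconstructed from the help text only and is not the project's cmdline.py. The value type of -d (float) and the use of `store_true` for the switches are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options shown in the help output (types are assumptions)."""
    parser = argparse.ArgumentParser(prog="dmarchiver")
    parser.add_argument("-id", "--conversation_id", help="Conversation ID")
    parser.add_argument("-u", "--username", help="Username (e-mail or handle)")
    parser.add_argument("-p", "--password", help="Password")
    parser.add_argument("-d", "--delay", type=float,
                        help="Delay between requests (seconds)")
    parser.add_argument("-s", "--save-session", action="store_true",
                        help="Save the session locally")
    parser.add_argument("-di", "--download-images", action="store_true",
                        help="Download images")
    parser.add_argument("-dg", "--download-gifs", action="store_true",
                        help="Download GIFs (as MP4)")
    parser.add_argument("-dv", "--download-videos", action="store_true",
                        help="Download videos (as MP4)")
    parser.add_argument("-th", "--twitter-handle", action="store_true",
                        help="Use the Twitter handles instead of the display names")
    parser.add_argument("-r", "--raw-output", action="store_true",
                        help="Write the raw HTML to a file")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["-id", "645754097571131337", "-di", "-th"]))
```

argparse looks for an exact match on the option string first, so `-d` and `-di` / `-dg` / `-dv` can coexist without ambiguity.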

Examples

Archive all conversations with images and videos:

$ dmarchiver -di -dv

For a conversation with the ID 645754097571131337, the script output will be the 645754097571131337.txt file with the conversation formatted in an IRC-like style.

The image and video files can be found in the 645754097571131337/images and 645754097571131337/mp4-* folders, respectively.

Archive a specific conversation, and use the Twitter handles for the usernames:

To retrieve only one conversation with the ID 645754097571131337:

$ dmarchiver -id "645754097571131337" -th

The script output will be the 645754097571131337.txt file with the conversation formatted in an IRC-like style, using the Twitter handles instead of the display names.

Schedule a task to perform incremental backups of a conversation

You can also specify the username and the password in the options. Because DMArchiver is able to perform incremental updates, you can schedule a task or create a shortcut with the following arguments:

$ dmarchiver -id "conversation_id" -di -dg -dv -u your_username -p your_password -s

Note the usage of the -s flag to use an existing session, instead of creating a new one.
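
If your scheduler runs a Python script rather than a shell command, the same invocation can be wrapped as below. This is a sketch under assumptions: the environment variable names `DM_USERNAME` and `DM_PASSWORD` are hypothetical, and `dmarchiver` must be on the PATH of the scheduled job.

```python
import os
import subprocess
from typing import List


def build_command(conversation_id: str, username: str, password: str) -> List[str]:
    """Build the argument list for the incremental backup shown above."""
    return [
        "dmarchiver",
        "-id", conversation_id,
        "-di", "-dg", "-dv",
        "-u", username,
        "-p", password,
        "-s",  # reuse the saved session instead of creating a new login
    ]


def run_backup(conversation_id: str) -> int:
    """Run dmarchiver once and return its exit code."""
    # Hypothetical variable names; set them in the scheduled job's environment.
    username = os.environ["DM_USERNAME"]
    password = os.environ["DM_PASSWORD"]
    result = subprocess.run(build_command(conversation_id, username, password))
    return result.returncode
```

Reading the credentials from the environment keeps them out of the scheduled task definition itself.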

Development

Ubuntu / Windows

$ git clone https://github.com/Mincka/DMArchiver.git
$ cd DMArchiver
$ virtualenv venv
$ source venv/bin/activate # "venv/Scripts/Activate.bat" on Windows
$ pip install -r requirements.txt
$ python -m dmarchiver.cmdline

Mac OS X / macOS

To build and run the pip3 package, you need to have Xcode (≈ 130 MB), Homebrew and Python 3 (≈ 20 MB):

$ xcode-select --install
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ brew install python3

Binary build with pyinstaller

The Python 3.4 (32-bit) branch is recommended to build the binaries. It will allow the best compatibility with all the platforms.

On Windows

> pip3 install pyinstaller
> pyinstaller --onefile dmarchiver\cmdline.py -n dmarchiver.exe
or, alternatively, in case of an import error:
> pyinstaller --onefile dmarchiver\cmdline.py --paths=dmarchiver -n dmarchiver.exe --hidden-import queue
> cd dist
> dmarchiver.exe

On Mac OS / macOS

$ pip3 install pyinstaller
$ pyinstaller --onefile dmarchiver/cmdline.py -n dmarchiver
or, alternatively, for macOS Sierra with handling of external imports:
$ /Library/Frameworks/Python.framework/Versions/3.4/bin/pyinstaller --onefile dmarchiver/cmdline.py -n dmarchiver --hidden-import cssselect --hidden-import lxml --hidden-import urllib3 --hidden-import requests --hidden-import queue
$ cd dist
$ ./dmarchiver

Package upload to PyPI Live

python setup.py sdist upload -r pypi

Known issues

Missing messages in conversations

Sometimes, generally due to a connection error, the script will write the messages of the conversations before retrieving all the messages. In this case, you should try to run the script again.
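
Running the script again is a manual form of retrying. For reference, the general retry-with-backoff pattern looks like the sketch below. It is not taken from DMArchiver: `retry` is a hypothetical helper, and the built-in `ConnectionError` is only a default, since an HTTP library may raise its own exception types.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    exceptions: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on the given exceptions with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests.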

Error message: "Unknown element type" / "Unknown media type" / "Unknown media"

Twitter may introduce new features or change the HTML output at any time. When it happens, DMArchiver may generate empty, broken logs or even crash. This kind of error message means the tool must be updated to handle the new output. Feel free to create a new issue when you encounter one of these messages.

Troubleshooting

Error building lxml

You may encounter build issues with the lxml library on Windows (error: Unable to find vcvarsall.bat). The simplest fix is to download a precompiled binary from this site and install the package locally:

$ pip install lxml-3.8.0-cp34-cp34m-win32.whl

dmarchiver script not found after pip3 install

If the Python bin path is not in your PATH environment variable, the program will not be found. Just run it with the complete path (location may vary...):

$ /Library/Frameworks/Python.framework/Versions/3.4/bin/dmarchiver
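
To find out where your own interpreter places installed scripts, you can ask Python directly. A small sketch using the standard sysconfig module follows. Note that it reports the default install scheme, so a `pip3 install --user` installation may use a different directory. The function names are hypothetical.

```python
import os
import sysconfig


def scripts_directory() -> str:
    """Return the directory where this interpreter installs console scripts."""
    return sysconfig.get_path("scripts")


def is_on_path(directory: str) -> bool:
    """Check whether the directory is listed in the PATH environment variable."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    return directory in entries


if __name__ == "__main__":
    directory = scripts_directory()
    print(directory, "(on PATH)" if is_on_path(directory) else "(not on PATH)")
```

Run it with the same python3 that pip3 belongs to; otherwise it describes a different installation.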

FAQ

What happens to my password and my messages? Are they sent to a third-party service?

Not at all. Unlike other online backup services, everything happens here on your computer. Your username and your password are only sent once to Twitter using a secured connection. Your messages are downloaded over your connection and written to your computer at the end of the script execution, as are the images and the GIFs if you chose to download them.

I received an e-mail from Twitter saying a suspicious connection occurred on Twitter, should I be worried about it?

Not at all. The tool simulates a Chrome (Windows or Linux) or Safari (macOS) browser on your current operating system. Unless you use the -s parameter to save the session, the tool does not keep any cookie locally, so Twitter will warn you each time you use it. You can safely ignore this message if you received it at the same time the tool was used.

macOS says the application is blocked because it is not from an identified developer, what should I do?

I am not able to sign the macOS executable. You will have to unblock the application if you want to use it. Go to the "Security & Privacy" settings and click on the "Open Anyway" button.

License

Copyright (C) 2016-2017 Julien EHRHART

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
