pbinkley / Tweet-Archive

Licence: other
Python script to archive Tweets

Tweet-Archive

A Python script to archive a user's tweets (including responses,
retweets, tweets responded to, etc.).

The script uses the Twitter API and stores tweets in Twitter's XML
format. A separate script generates an HTML version, divided into
monthly archives. Ultimately it will produce PDFs as well.

Currently it archives these streams:

* user_timeline (your tweets)

* mentions

* direct_messages

* direct_messages_sent

For any tweet that is a response to another tweet (from any source), it
archives the original in a file called "references".
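As an illustration of how the referenced tweets might be identified, here is a minimal sketch using only the standard library. It assumes each status in the downloaded XML carries an `in_reply_to_status_id` element; the sample data and the function name `referenced_ids` are hypothetical and are not taken from the actual script.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like a statuses download.
SAMPLE = """<statuses>
  <status><id>101</id><text>hello</text><in_reply_to_status_id></in_reply_to_status_id></status>
  <status><id>102</id><text>@a yes</text><in_reply_to_status_id>55</in_reply_to_status_id></status>
  <status><id>103</id><text>@b no</text><in_reply_to_status_id>77</in_reply_to_status_id></status>
  <status><id>104</id><text>@a again</text><in_reply_to_status_id>55</in_reply_to_status_id></status>
</statuses>"""


def referenced_ids(xml_text, already_archived=()):
    """Return ids of replied-to tweets that still need fetching, in order."""
    root = ET.fromstring(xml_text)
    seen = set(already_archived)
    wanted = []
    for status in root.iter("status"):
        # An empty or missing element means the tweet is not a reply.
        ref = (status.findtext("in_reply_to_status_id") or "").strip()
        if ref and ref not in seen:
            seen.add(ref)
            wanted.append(ref)
    return wanted


if __name__ == "__main__":
    print(referenced_ids(SAMPLE))
```

Each id returned would then need its own request, which is why the number of requests can grow quickly.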

Dependencies: these Python modules are required:

* oauth2

* lxml

Installation:

First you must authorize the script to access your Twitter account using
OAuth. Follow the instructions in this tutorial:

http://jeffmiller.github.com/2010/05/31/twitter-from-the-command-line-in-python-using-oauth

Note that you'll need to give your application permission to access direct messages.

Copy the properties file secrets.SAMPLE.properties to secrets.properties
and insert the values that were generated by the authentication process,
as well as the Twitter ID you wish to archive. Make sure that
secrets.properties is secure (e.g. "chmod go-r secrets.properties").
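The following sketch shows one way a `key=value` properties file could be read and its permissions checked. It is an assumption about the file layout, not the script's actual parser, and the helper names `load_properties` and `is_private` are hypothetical. The permission check is only meaningful on POSIX systems.

```python
import os
import stat


def load_properties(path):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, sep, value = line.partition("=")
            if sep:  # ignore lines that contain no "="
                props[key.strip()] = value.strip()
    return props


def is_private(path):
    """True if neither group nor others can read the file (cf. chmod go-r)."""
    mode = os.stat(path).st_mode
    return not (mode & (stat.S_IRGRP | stat.S_IROTH))


if __name__ == "__main__":
    if os.path.exists("secrets.properties"):
        if not is_private("secrets.properties"):
            print("warning: secrets.properties is readable by other users")
        print(sorted(load_properties("secrets.properties")))
```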

Running:

Now run the script fetch-oauth2.py from the command line. It will create
a directory "archive" and download an XML file into it. The XML
downloads are given timestamped filenames and placed in dated
directories, e.g. "archive/masters/2011/02/2011-01-16-112031.xml". A new
properties file, ids.properties, is also created, containing the most
recent tweet's id in each stream. The next run uses these ids as
"since_id" parameters, so that it fetches only newer tweets. After the
master download, monthly dump files are created in "archive/xml", e.g.
"archive/xml/2011/02/2011-02-24-135642.xml". These dump files are never
altered; if a subsequent fetch gets more tweets for a given month,
another timestamped dump file will be created alongside the earlier one.
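The incremental update and the dump naming described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes the dump directory reflects the month of the tweets while the filename reflects the time of the fetch, which is one reading of the example path.

```python
import os
from datetime import datetime


def request_params(last_ids, stream):
    """Build query parameters for one stream; add since_id if one is stored."""
    params = {}
    last_id = last_ids.get(stream)
    if last_id:
        params["since_id"] = last_id
    return params


def dump_path(tweet_year, tweet_month, fetched_at, root="archive"):
    """Path of a new monthly dump; a fresh timestamp never overwrites old dumps."""
    filename = fetched_at.strftime("%Y-%m-%d-%H%M%S") + ".xml"
    return os.path.join(
        root, "xml", "%04d" % tweet_year, "%02d" % tweet_month, filename
    )


if __name__ == "__main__":
    ids = {"mentions": "12345"}  # as might be read from ids.properties
    print(request_params(ids, "mentions"))
    print(request_params(ids, "user_timeline"))  # no stored id: full fetch
    print(dump_path(2011, 2, datetime(2011, 2, 24, 13, 56, 42)))
```

With no stored id the parameters are empty, which matches the behaviour of a fresh full fetch.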

(To do a fresh full fetch, simply delete ids.properties and the
"archive" directory.)

A separate script, static-archive.py, can then be run to create the
HTML. It creates yearly directories in "archive/html" and converts the
most recent dump files into monthly HTML files, e.g.
"archive/html/2011/2011_02.html", using statuses2html.xsl. Subsequent
runs of static-archive.py will generate new HTML archives if there are
new monthly dumps, or if new dumps have been added to previously
processed months. All HTML archives will be rebuilt if the XSL
stylesheet changes.
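One plausible way to decide whether a monthly HTML file needs regenerating is to compare modification times. The sketch below assumes that approach; it is not confirmed to be what static-archive.py does, and `needs_rebuild` is a hypothetical name.

```python
import os


def needs_rebuild(html_path, dump_paths, xsl_path):
    """True if the HTML is missing or older than any dump or the stylesheet."""
    if not os.path.exists(html_path):
        return True
    built = os.path.getmtime(html_path)
    sources = list(dump_paths) + [xsl_path]
    return any(os.path.getmtime(src) > built for src in sources)


if __name__ == "__main__":
    html = os.path.join("archive", "html", "2011", "2011_02.html")
    dump = os.path.join("archive", "xml", "2011", "02", "2011-02-24-135642.xml")
    if os.path.exists(dump) and os.path.exists("statuses2html.xsl"):
        print(needs_rebuild(html, [dump], "statuses2html.xsl"))
```

Because the stylesheet is checked against every month, editing it makes every existing HTML file stale at once.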

To do:

*	update to API 1.1 before 1.0 is removed. This will take some work,
since 1.1 only returns JSON.

*	deal with fetches that exceed the 200 requests/hour limit (which
can easily happen, since each responded-to tweet has to be fetched
individually)

*	generate PDFs from HTML using the Firefox Command Line Print plug-in
(http://sites.google.com/site/torisugari/commandlineprint2)

*	incorporate generating of html and pdfs into main script

*	look up shortened urls, store the real url

*	maybe fetch referenced pictures? capture snapshot of referenced
pages?

*	allow for customized external css

*	allow for tracking replies across monthly boundaries: currently
each month is self-contained

*	check whether any of this works on Windows; if not, learn about
cross-platform file separators in Python

*	handle timezones consistently
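
For the rate-limit item above, one simple approach would be to split the individually fetched tweet ids into batches that each fit within an hour's allowance. This is only a sketch of the idea, not part of the script. The name `hourly_batches` and the `reserved` parameter (requests set aside for the stream fetches themselves) are assumptions.

```python
def hourly_batches(tweet_ids, limit=200, reserved=0):
    """Split ids into lists small enough to fetch within one hour's quota."""
    budget = limit - reserved
    if budget <= 0:
        raise ValueError("reserved requests leave no budget for lookups")
    ids = list(tweet_ids)
    return [ids[i:i + budget] for i in range(0, len(ids), budget)]


if __name__ == "__main__":
    # 450 lookups with 4 requests reserved for the four streams.
    batches = hourly_batches(range(450), limit=200, reserved=4)
    print([len(batch) for batch in batches])
```

A caller would fetch one batch, wait until the next hourly window, and then continue with the following batch.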