BlipRanger / bdfr-html

Licence: GPL-3.0
Converts the output of the bulk downloader for reddit to a set of HTML pages.

bdfr-html

BDFR-HTML is a companion script that turns the output of the incredibly useful Bulk Downloader for Reddit (BDFR) into a set of HTML pages, with an index, that can be viewed in a browser. It also provides other handy tools, such as the ability to grab the context for saved comments or pull down deleted posts from Pushshift. The HTML pages are rendered using jinja2 templates, which can be modified to suit your needs.

The script currently requires that you run both the archive and the download portions of BDFR, and that the names of the downloaded files contain the post id (this is the default). This can be automated using the included start.py or the Docker container.
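The file-name requirement can be checked before a run. The following is a minimal sketch and not part of bdfr-html: it assumes you already have the list of post ids from the archive step, and that ids appear in file names as whole tokens separated by non-alphanumeric characters. The function names are hypothetical.

```python
import re
from pathlib import PurePath


def tokens_in_name(file_name):
    """Split a file name (without extension) into alphanumeric tokens."""
    stem = PurePath(file_name).stem
    return set(re.split(r"[^0-9A-Za-z]+", stem))


def group_media_by_post_id(post_ids, file_names):
    """Map each post id to the file names that contain it as a token."""
    matches = {post_id: [] for post_id in post_ids}
    for name in file_names:
        tokens = tokens_in_name(name)
        for post_id in post_ids:
            if post_id in tokens:
                matches[post_id].append(name)
    return matches


def posts_without_media(post_ids, file_names):
    """Return the post ids for which no downloaded file was found."""
    matches = group_media_by_post_id(post_ids, file_names)
    return [post_id for post_id, names in matches.items() if not names]
```

A post id listed by posts_without_media is not necessarily an error: text-only posts may have no media file at all.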

Installation

You can clone this repo and run the script from this folder, or you can install the package with setuptools using the following command: python setup.py install (this is still a work in progress).

Usage

To run the script with defaults (read from the folder 'input' and write to the folder 'output'):

python -m bdfrtohtml

To set the folders explicitly:

python -m bdfrtohtml --input_folder ./location/of/archivedFiles --output_folder /../html/

Options

  --input_folder TEXT                           The folder where the download and archive results have been saved to.
  --output_folder TEXT                          Folder where the HTML results should be created.
  --recover_comments BOOLEAN                    Should we attempt to recover deleted comments?
  --recover_posts BOOLEAN                       Should we attempt to recover deleted posts?
  --generate_thumbnails BOOLEAN                 Generate thumbnails for video posts? (deprecated by index_mode)
  --archive_context BOOLEAN                     Should we attempt to archive the contextual post for saved comments?
  --delete_media BOOLEAN                        Should we delete the input media after creating the output?
  --index_mode [default|lightweight|oldreddit]  What type of templated index page should be generated?
  --write_links_to_file [None|Webpages|All]     Should we write the links from posts to a text file for external consumption?
  --config FILENAME                             Read in a config file
  --help                                        Show this message and exit.
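If you call the script from another Python program, the options above can be assembled into a command line programmatically. This is a minimal sketch, not part of bdfr-html: the helper name build_bdfrtohtml_command is hypothetical, and it assumes that BOOLEAN options accept the literal values True and False.

```python
import sys


def build_bdfrtohtml_command(input_folder, output_folder, **options):
    """Build an argument list for `python -m bdfrtohtml`.

    Extra keyword arguments are turned into `--name value` pairs, so
    they must match option names from the help text above.
    """
    command = [
        sys.executable, "-m", "bdfrtohtml",
        "--input_folder", str(input_folder),
        "--output_folder", str(output_folder),
    ]
    # Sort for a stable, reproducible argument order.
    for name, value in sorted(options.items()):
        command += [f"--{name}", str(value)]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True). Passing a list rather than a single string avoids shell quoting problems with folder names that contain spaces.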

start.py

start.py is what (currently) powers the Docker container's automation: it runs bdfr and then bdfr-html in sequence at timed intervals. Instead of running bdfrtohtml alone, you can run python start.py in a cloned copy of this repo to start the automated process. The configuration for both bdfrtohtml and start.py itself can be found in the config folder. The script also supports multiple users, which is set up in the config file.
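The idea of running two tools in sequence at timed intervals can be sketched as follows. This is an illustration only and not the actual start.py: the function names are hypothetical, the commands are supplied by the caller, and the real script reads its settings from the config folder.

```python
import subprocess
import time


def run_cycle(commands, runner=subprocess.run):
    """Run each command in order; stop at the first non-zero exit code."""
    for command in commands:
        result = runner(command)
        if getattr(result, "returncode", 0) != 0:
            return False
    return True


def run_at_intervals(commands, interval_seconds, runner=subprocess.run,
                     sleep=time.sleep, max_cycles=None):
    """Repeat run_cycle, sleeping between cycles.

    max_cycles=None loops forever; a number makes the loop finite,
    which is mainly useful for testing.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle(commands, runner=runner)
        cycles += 1
        sleep(interval_seconds)
    return cycles
```

Stopping a cycle at the first failure means the HTML step is skipped when the download step fails, so a failed download does not trigger a rebuild from incomplete input.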

Docker

To make it easy to run bdfr and bdfr-html together in an automated fashion, a docker-compose file is included that spins up both an automation container and a web server container. The automation container runs bdfr and then bdfr-html, producing a volume or mounted folder containing the generated HTML files. The web server container shares the output volume and hosts the generated files. Currently this only saves a user's "Saved" content, though this might change in the future.

If you would prefer to populate bdfr-html with your own reddit json/media files from bdfr, you can use a similar docker-compose file, but mount the folder where you have saved your content to the input folder (bdfrtohtml/input by default) and set the config variable RUN_BDFR to false.

Since BDFR 2.1.1 you should be able to complete the OAuth flow from within the Docker container. The port needed for validation is exposed in the docker-compose file. If you are running the Docker container on a different machine, replace localhost in the returned OAuth URL with the address of the Docker host.
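The host replacement can be done by hand in the browser's address bar, or with a few lines of Python. This is a minimal sketch using only the standard library: the helper name replace_localhost is hypothetical, and it assumes the Docker host is given as a hostname or IPv4 address.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_localhost(url, docker_host):
    """Point a localhost URL at docker_host, keeping port, path and query."""
    parts = urlsplit(url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return url  # Nothing to rewrite.
    # Preserve any "user:password@" prefix, although a redirect URL is
    # not expected to carry one.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{docker_host}"
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Only the host changes. The port and the query string are left untouched, because the validation step depends on both.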

To run the compose file, simply clone this repo and run docker-compose up.

The config file of the docker container can be mounted and modified just like the one mentioned above for the start.py script.

Contributing

I am open to any and all help or criticism on this project! Please feel free to create issues as you encounter them and I'll work to get them fixed. I have a set idea of the scope of this project, but I am always open to new feature suggestions or improvements to my code. Also, if you have code you'd like to contribute, just open a PR and I'll take a look!

Planned Features

  • Better documentation, including a lessons learned page
  • The ability to output more data/metrics
  • Docker support for automatically archiving subreddits/users
  • PyPI + Docker Hub package support

Screenshots

  • Example Post
  • Default Index Page
  • Lightweight Index Page
  • Old Reddit-like Index Page
