All Projects → deep-diver → fb-group-post-fetcher

deep-diver / fb-group-post-fetcher

Licence: other
No description, website, or topics provided.

Programming Languages

HTML
75241 projects
python
139335 projects - #7 most used programming language

newsletter_every_fri

Facebook Group Post Fetcher + Emailing

This repository is about scrapping a Facebook's group posts and sending out a email newsletter.

  • you must be an admin of the target facebook group
  • you must create a facebook app with a Group Post permission (app doesn't have be reviewed though)
  • email receivers could be configured via mailinglist.txt

How it works

It works with GitHub action, but you can run locally as well.

  1. GitHub Action dycryptes .env.gpg, and you will get .env
  2. GitHub Action triggers this application running
  • exchange facebook's access token with a long-lived one
  • replace the exchanged token to the existing one in .env
  • grab group posts via Facebook's Graph API
  • sort them
  • inject posts's information into HTML (jinja)
  • HTML/CSS will be nicely formatted for email (premailer)
  • email will be sent to those listed in mailinglist.txt
  1. GitHub Action deletes .env.gpg
  2. GitHub Action encrypts .env into .env.gpg
  3. GitHub Action pushes .env.gpg to your current repository

Look

Usage

CLI

$ python main.py --help
usage: main.py [-h] --since SINCE --until UNTIL --email-title EMAIL_TITLE [--limit LIMIT] 
               [--weight-reactions WEIGHT_REACTIONS] [--weight-shares WEIGHT_SHARES] [--weight-comments WEIGHT_COMMENTS]

Please specify the range of dates and the number of posts to be collected

optional arguments:
  -h, --help            show this help message and exit
  --since SINCE         dates in YYYY-MM-DD
  --until UNTIL         dates in YYYY-MM-DD
  --email-title EMAIL_TITLE
                        title for the email
  --limit LIMIT         number of posts to scrap
  --weight-reactions WEIGHT_REACTIONS
                        from 0 to 1
  --weight-shares WEIGHT_SHARES
                        from 0 to 1
  --weight-comments WEIGHT_COMMENTS
                        from 0 to 1

Example

$ python main.py --since 2020-01-10 \ 
                 --until 2020-01-20 \
                 --email-title="Weekly Newsletter"
$ python main.py --since 2020-01-10 \
                 --until 2020-01-20 \
                 --email-title="Weekly Newsletter" \
                 --limit 50
$ python main.py --since 2020-01-10 \
                 --until 2020-01-20 \
                 --email-title="Weekly Newsletter" \
                 --limit weights-reactions=0.5 \
                 --weights-shares=0.8 \
                 --weights-comments=1

Config files

There are two config files.

  • .env
    • this dotenv file contains sensitive information
  • config.cfg
    • this contains information that you can customize yourself

.env

You should set values described below on your own.

  • APP_ID: your facebook app id
  • APP_SECRET: your facebook secret
  • FB_ACCESS_TOKEN: access token issued via Graph API Explorer. make sure your access token belongs to your app.
APP_ID=XXX
APP_SECRET=XXX
FB_ACCESS_TOKEN=XXX

SMTP_USER=XXX
SMTP_PASS=XXX

.env only needs to be set initially. After you set .env, you have to encrypt it into .env.gpg using the command below. You must set your own password in --passphrase option.

gpg2 --quiet --batch --yes --decrypt --passphrase="SYMMETRIC_KEY" --output=.env .env.gpg

Also you must set the same passphrase in the SECRETS for your GitHub Repo. Here is the steps for doing so.

  1. Go to Settings tab.
  2. Go to SECRETS menu on the left.
  3. Click New secret button on the top left.
  4. Set Name as GPG_KEY.
  5. Set the value of the GPG_KEY to your choice as in --passphrase
  6. Click Add secret button.

config.cfg

The name of each key explains what they are. TOP_K means how many posts you want to grap. Only the TOP 10 posts will be included in the email with image. The rest will be included with only text. FIRST_WORDS means how many words you want to keep for the TOP 10 posts. SUB_FIRST_WORDS means how many words you want to keep for the other posts than TOP 10.

# config.cfg

[config]
TOP_K = XXX
FIRST_WORDS = XXX
SUB_FIRST_WORDS = XXX 

[fb]
FB_GROUP_ID = XXX

[web]
HEAD_IMAGE = image_url_to_appear_at_the_top_of_email
HEAD_ARTICLE = a_sentence_to_appear_at_the_top_of_email

Also the scrapped posts are sorted via the following formula. You can change the weights from the CLI. Please look at the Usage section.

  (# of reactions * WEIGHT_REACTIONS) + (# of shares * WEIGHT_SHARES) + (# of comments * WEIGHT_COMMENTS)

Who is behind this work

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].