
j0k3r / f43.me

License: MIT license
A more readable & cleaner feed

Programming Languages

PHP
23972 projects - #3 most used programming language
Twig
543 projects
JavaScript
184084 projects - #8 most used programming language
SCSS
7915 projects
Shell
77523 projects
Roff
2310 projects
Makefile
30231 projects

Projects that are alternatives of or similar to f43.me

meta-extractor
Super simple and fast html page meta data extractor with low memory footprint
Stars: ✭ 38 (-36.67%)
Mutual labels:  rss, extractor, feed
Feedek
FeedEk jQuery RSS/ATOM Feed Plugin
Stars: ✭ 190 (+216.67%)
Mutual labels:  rss, feed
this-american-life-archive
Unofficial RSS feed for the podcast "This American Life" with episodes 1 to current
Stars: ✭ 19 (-68.33%)
Mutual labels:  rss, feed
laminas-feed
Consume and generate Atom and RSS feeds, and interact with Pubsubhubbub.
Stars: ✭ 97 (+61.67%)
Mutual labels:  rss, feed
Feed Module
Everyone deserves RSS, ATOM and JSON feeds!
Stars: ✭ 182 (+203.33%)
Mutual labels:  rss, feed
Xity Starter
A blog-ready 11ty starter based on PostCSS, with RSS feed and Native Elements!
Stars: ✭ 184 (+206.67%)
Mutual labels:  rss, feed
Rss
Library for serializing the RSS web content syndication format
Stars: ✭ 223 (+271.67%)
Mutual labels:  rss, feed
Feedparser
feedparser gem - (universal) web feed parser and normalizer (XML w/ Atom or RSS, JSON Feed, HTML w/ Microformats e.g. h-entry/h-feed or Feed.HTML, Feed.TXT w/ YAML, JSON or INI & Markdown, etc.)
Stars: ✭ 156 (+160%)
Mutual labels:  rss, feed
V2
Minimalist and opinionated feed reader
Stars: ✭ 3,239 (+5298.33%)
Mutual labels:  rss, feed
reader
A Python feed reader library.
Stars: ✭ 290 (+383.33%)
Mutual labels:  rss, feed
cakephp-feed
CakePHP Plugin with RssView to create RSS feeds.
Stars: ✭ 13 (-78.33%)
Mutual labels:  rss, feed
Pluto
pluto gems - planet feed reader and (static) website generator - auto-build web pages from published web feeds
Stars: ✭ 174 (+190%)
Mutual labels:  rss, feed
Posidonlauncher
a one-page homescreen with a news feed
Stars: ✭ 163 (+171.67%)
Mutual labels:  rss, feed
RSS-to-Telegram-Bot
A Telegram RSS bot that cares about your reading experience
Stars: ✭ 482 (+703.33%)
Mutual labels:  rss, feed
Planetxamarin
We are an aggregator of content from Xamarin Community members. Why subscribe individually when you can subscribe to one convenient RSS feed, to see all the content generated by the community members in your news reader.
Stars: ✭ 158 (+163.33%)
Mutual labels:  rss, feed
Feed Io
A PHP library to read and write feeds in JSONFeed, RSS or Atom format
Stars: ✭ 200 (+233.33%)
Mutual labels:  rss, feed
web-front-end-rss
📙 Fetches the latest front-end technology articles via RSS; sources include 前端早读课, 前端大全, 前端之巅, 淘宝前端, 张鑫旭博客, 凹凸实验室, etc.
Stars: ✭ 24 (-60%)
Mutual labels:  rss, feed
Gofeed
Parse RSS, Atom and JSON feeds in Go
Stars: ✭ 1,762 (+2836.67%)
Mutual labels:  rss, feed
Awesome Rss
Puts an RSS/Atom subscribe button back in URL bar
Stars: ✭ 125 (+108.33%)
Mutual labels:  rss, feed
Spotifeed
A simple service to serve up Spotify podcasts as RSS feeds for use in any podcast app.
Stars: ✭ 238 (+296.67%)
Mutual labels:  rss, feed

f43.me


What's that?

I read a lot of feeds in the subway, mostly on my way to work and back home. We are lucky in Paris because we have a data network in the subway, but sometimes the network is saturated and you can't load the web page behind a feed item. You're stuck with only 3 lines from the feed...

That's why I built f43.me, a kind of proxy for the RSS feeds I read the most.

It's kind of a shortcut for "Feed For Free" (Feed = f, For = 4, Free = 3). Tada

Anyway, it's simple:

  • fetch items from a feed
  • grab the content
  • make it readable
  • store it
  • create a new feed with readable items

f43.me screenshot


Workflow

When f43.me grabs a new item, there are several steps before we can say the item is readable. Let me introduce improvers, extractors, converters and parsers.

They all work as a chain: we go through each of them until we find one that matches.

For curious people, this workflow happens in the Extractor->parseContent method.
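
Purely as an illustration (this is not the actual code, and the names below are made up), the chain boils down to a loop like this:

<?php

// Illustrative only: each candidate (improver, extractor, converter...)
// exposes a match() method and the first one that matches wins.
function findMatching(array $candidates, string $url)
{
    foreach ($candidates as $candidate) {
        if ($candidate->match($url)) {
            return $candidate;
        }
    }

    return null; // nothing matched, fall back to the default behaviour
}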

Improvers

Most social aggregators add great value to a URL. For example, Reddit and HackerNews add a comments link, and Reddit also provides a category, a little preview for videos or images, etc.

These extras are worth keeping, even though we still need to fetch the content of the target URL.

This is where improvers help.

An improver uses 3 methods:

  • match: tells if this improver will work on the given host (the host coming from the main feed URL)
  • updateUrl: can do whatever it wants to update the URL of an item (for Reddit, we extract the URL from the [link])
  • updateContent: adds interesting information before (or after) the readable content (for Reddit we just put the readable content after the item content). This method is called AFTER the parser described below.

You can find some examples in the improver folder (at the moment Reddit & HackerNews).
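
To give an idea of the shape of an improver, here is a minimal, hypothetical one (the method names follow the list above, everything else is made up; look at the real classes in the improver folder):

<?php

// Hypothetical improver, for illustration only.
class DummyImprover
{
    // Tell if this improver handles the given host (host of the main feed URL)
    public function match(string $host): bool
    {
        return false !== stripos($host, 'example.com');
    }

    // Update the item URL, e.g. extract the real target URL from the item link
    public function updateUrl(string $url): string
    {
        return $url;
    }

    // Add extra information around the readable content; called AFTER the parser
    public function updateContent(string $readableContent, array $item): string
    {
        return $item['content'] . $readableContent;
    }
}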

Extractors

Parsers that grab the HTML content from a URL and find the most interesting part for the reader do an important job. But most of the time they fail when it comes to images (like from Imgur or Flickr) or to social networks (like Tumblr, Twitter or Facebook).

These online services provide APIs to retrieve content from their platform. Extractors use them to grab the real content.

An extractor uses 2 methods:

  • match: tells if this extractor needs to work on that item (usually a bunch of regex & host matching)
  • getContent: calls the related API or URL to fetch the content, using the parameters found in the match method (like a Twitter ID, a Flickr ID, etc.), and returns clean HTML

You can find some of them in the extractor folder (Flickr, Twitter, GitHub, etc.).
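
Here is a minimal, hypothetical extractor to show how the two methods play together (the class, the URL pattern and the returned HTML are made up; look at the real classes in the extractor folder):

<?php

// Hypothetical extractor, for illustration only.
class DummyImageExtractor
{
    private ?string $imageId = null;

    // Detect the URL and keep the id found in it for getContent()
    public function match(string $url): bool
    {
        if (preg_match('#example\.com/image/([a-z0-9]+)#i', $url, $matches)) {
            $this->imageId = $matches[1];

            return true;
        }

        return false;
    }

    // A real extractor would call the service API here using $this->imageId
    public function getContent(): string
    {
        return '<img src="https://example.com/image/' . $this->imageId . '.jpg" />';
    }
}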

Parsers

Once we have the (most of the time, short) content from the feed, we use parsers to grab the HTML from the URL and make it readable.

This involves 2 kinds of parsers:

  • the Internal one, which uses a local PHP library called graby (see the sketch below).
  • the External one, which uses the excellent Mercury Parser API from Postlight Labs.
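
If you are curious about the Internal one, graby can also be used standalone; roughly (the exact return shape depends on the graby version):

<?php

use Graby\Graby;

// Rough standalone usage of graby, the library behind the Internal parser.
// Depending on the version, the result holds the readable HTML, the title,
// the language, etc.
$graby = new Graby();
$result = $graby->fetchContent('https://example.com/some-article');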

Converters

Finally, we can use converters to transform HTML code into something different.

For example, the Instagram embed code doesn't include the image itself (that part is usually rendered by JavaScript). The Instagram converter uses the Instagram extractor to retrieve the image behind an embed code and puts it back into the feed item content.

You can find them in the converter folder (only Instagram for the moment).
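
As a rough idea, a converter looks like this (hypothetical class, made-up method name and regex, only meant to show the HTML-in / HTML-out idea):

<?php

// Hypothetical converter, for illustration only: HTML in, transformed HTML out.
class DummyEmbedConverter
{
    public function convert(string $html): string
    {
        // a real converter (like the Instagram one) would rely on an extractor
        // to resolve the embed code into a plain <img> tag
        return preg_replace(
            '#<blockquote class="example-embed"[^>]*>.*?</blockquote>#si',
            '<img src="https://example.com/resolved-image.jpg" />',
            $html
        );
    }
}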

How to use it

Requirements

  • PHP >= 7.4 (with pdo_mysql or pdo_pgsql)
  • Node.js 16 (for assets), use nvm install
  • MySQL >= 5.7 or PostgreSQL
  • RabbitMQ, which is optional (see below)
  • Supervisor (only if you use RabbitMQ)

For each external API that improvers / extractors / parsers use, you will need an API key.

Install

You should generate a password using php bin/console security:hash-password --empty-salt and then create a .env.local with your hashed password:

ADMINPASS="MY_HASHED_PASSWORD"

⚠️ Don't forget to escape anything that could be read as a variable: every $ followed by a letter will be interpreted as a variable. If your hashed password is $2y$13$BvprBNLfp6eKHtqLyN1.w.z214Q5LMEvF9LKJTn44hrMIBt3pzwNW, the $BvprBNLfp6eKHtqLyN1 part will be interpreted as a variable. You must escape it in your .env.local:

ADMINPASS="$2y$13\$BvprBNLfp6eKHtqLyN1.w.z214Q5LMEvF9LKJTn44hrMIBt3pzwNW"

Follow these steps:

git clone git@github.com:j0k3r/f43.me.git
cd f43.me
APP_ENV=prod composer install -o --no-dev
yarn install
php bin/console doctrine:schema:create --env=prod
yarn build

Without RabbitMQ

You just need to define these 3 cronjobs (replace all /path/to/f43.me with the real path):

# fetch content for existing feed
*/2 * * * * php /path/to/f43.me/bin/console feed:fetch-items --env=prod old

# fetch content for fresh created feed
*/5 * * * * php /path/to/f43.me/bin/console feed:fetch-items --env=prod new

# cleanup old items. You can remove this one if you want to keep ALL items
0   3 * * * php /path/to/f43.me/bin/console feed:remove-items --env=prod

You can also run a command to fetch all new items from a given feed, using its slug:

php /path/to/f43.me/bin/console feed:fetch-items --env=prod --slug=reddit -t

With RabbitMQ

  1. You'll need to declare exchanges and queues. Replace guest with the user of your RabbitMQ instance (guest is the default one):

    php bin/console messenger:setup-transports -vvv fetch_items
  2. You now have one queue and one exchange defined, f43.fetch_items, which will receive messages to fetch new items.

  3. Enable these 2 cronjobs, which will periodically push messages to the queue (replace all /path/to/f43.me with the real path):

    # fetch content for existing feed
    */2 * * * * php /path/to/f43.me/bin/console feed:fetch-items --env=prod old --use_queue
    
    # cleanup old items. You can remove this one if you want to keep ALL items
    0   3 * * * php /path/to/f43.me/bin/console feed:remove-items --env=prod
  4. Set up Supervisor using the sample file from the repo. You can copy/paste it into /etc/supervisor/conf.d/ and adjust the paths. The default file will launch 3 workers for fetching items.

    Once you've put the file in the Supervisor conf directory, run supervisorctl update && supervisorctl start all (update reads your conf, start all starts all workers).

Try it

You can use the built-in Docker image using docker-compose:

docker-compose up

You should be able to access the interface at http://localhost:8100/index.php

License

f43.me is released under the MIT License. See the bundled LICENSE file for details.
