tmcw / Notfoundbot

fix & archive outgoing links on your website

Projects that are alternatives of or similar to Notfoundbot

Staticman
💪 User-generated content for Git-powered websites
Stars: ✭ 2,098 (+2735.14%)
Mutual labels:  jekyll, hugo
Awesome Static Hosting And Cms
A collection of awesome static hosting & CMS providers
Stars: ✭ 163 (+120.27%)
Mutual labels:  jekyll, hugo
Create Static Site
Create static websites with no build configuration.
Stars: ✭ 124 (+67.57%)
Mutual labels:  jekyll, hugo
Jamstackthemes
A list of themes and starters for JAMstack sites.
Stars: ✭ 298 (+302.7%)
Mutual labels:  jekyll, hugo
medium-2-md
A CLI tool that converts exported Medium posts (html) to Jekyll/Hugo compatible markdown with front matter.
Stars: ✭ 113 (+52.7%)
Mutual labels:  jekyll, hugo
Vanilla Back To Top
Simple and smooth Back To Top button
Stars: ✭ 179 (+141.89%)
Mutual labels:  jekyll, hugo
Pendulum
A simple markdown editor for static files (Hugo, Nexo, Jekyll, MkDocs, ...)
Stars: ✭ 157 (+112.16%)
Mutual labels:  jekyll, hugo
Post Scheduler
Schedule posts & content updates for static websites (Jekyll, Hugo, Gatsby, Phenomic etc)
Stars: ✭ 184 (+148.65%)
Mutual labels:  jekyll, hugo
awesome-static-digital-libraries
Delightful Static Digital Library projects and resources
Stars: ✭ 23 (-68.92%)
Mutual labels:  jekyll, hugo
Awesome Docs With Static Site Generators
Pointers to all templates and implementations based on static site generators
Stars: ✭ 44 (-40.54%)
Mutual labels:  jekyll, hugo
Hugo Geo
Theme I use for my personal website
Stars: ✭ 65 (-12.16%)
Mutual labels:  hugo
Simple Jekyll Search
A JavaScript library to add search functionality to any Jekyll blog.
Stars: ✭ 1,133 (+1431.08%)
Mutual labels:  jekyll
Hugo Steam Theme
Port of Tommaso Barbato's Ghost theme Steam to Hugo
Stars: ✭ 69 (-6.76%)
Mutual labels:  hugo
Rustycrate.ru
Русскоязычный сайт о языке программирования Rust
Stars: ✭ 72 (-2.7%)
Mutual labels:  jekyll
Znlbwo.github.io
Stars: ✭ 65 (-12.16%)
Mutual labels:  jekyll
Hugo Theme Learn
Porting Grav Learn theme to Hugo
Stars: ✭ 1,155 (+1460.81%)
Mutual labels:  hugo
Jekyll Vue Template
A starter template for Jekyll projects with Vue.js and Vue Single File Components, complete with webpack.
Stars: ✭ 65 (-12.16%)
Mutual labels:  jekyll
Hugofy Sublime
Hugo plugin for Sublime Text 3
Stars: ✭ 64 (-13.51%)
Mutual labels:  hugo
Medium To Hugo
Medium stories exporter to markdown/hugo articles.
Stars: ✭ 64 (-13.51%)
Mutual labels:  hugo
Minimalism
Minimalism is a Jekyll theme for minimalist!
Stars: ✭ 74 (+0%)
Mutual labels:  jekyll

notfoundbot

notfoundbot is a GitHub Action that helps you automatically maintain the correctness of your website's outgoing links. It finds links that need fixing and opens pull requests that fix them.

This action is intended for websites and blogs powered by static site generators.

notfoundbot makes the following fixes:

  • Upgrades outgoing HTTP links to HTTPS
  • Replaces broken outgoing links with links to the Wayback Machine

By using post dates derived from filenames, notfoundbot searches for Wayback Machine archives of linked resources that are contemporary to the post itself: broken links in a 2011 blog post will be linked to archives from around that era.
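
As a rough illustration of the filename-date idea, the sketch below assumes Jekyll-style filenames that begin with `YYYY-MM-DD-`. The names `postDateFromFilename` and `waybackTimestamp` are hypothetical and are not taken from notfoundbot's source. The second helper formats the date as the `YYYYMMDD` timestamp that Wayback Machine lookups accept.

```typescript
// Hypothetical helper: extract a post date from a Jekyll-style filename
// such as "_posts/2011-03-14-hello-world.md".
// Returns null when the filename has no valid date prefix.
function postDateFromFilename(path: string): Date | null {
  const base = path.split("/").pop() ?? "";
  const match = /^(\d{4})-(\d{2})-(\d{2})-/.exec(base);
  if (!match) return null;
  const [, year, month, day] = match;
  const date = new Date(Date.UTC(Number(year), Number(month) - 1, Number(day)));
  // Reject impossible dates such as 2011-02-31, which Date.UTC would
  // silently roll over into the following month.
  if (date.getUTCMonth() !== Number(month) - 1) return null;
  return date;
}

// Hypothetical helper: format a date as a YYYYMMDD Wayback-style timestamp.
function waybackTimestamp(date: Date): string {
  return date.toISOString().slice(0, 10).replace(/-/g, "");
}
```

For a post named `2011-03-14-hello-world.md`, the resulting timestamp would be `20110314`, so an archive lookup can ask for snapshots closest to that day.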

Example YAML

name: notfoundbot
on:
  schedule:
    - cron: "0 5 * * *"
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<version>
      - name: Fix links
        uses: tmcw/notfoundbot@<version>
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Notes:

  • I might forget to update the version on notfoundbot here - make sure that it's the latest!
  • Use crontab.guru to customize the cron schedule line if you want the task to run more or less often.

Features

  • Post date detection: supports filename-based dates, YAML & TOML frontmatter
  • notfoundbot uses magic-string to selectively update links without affecting surrounding markup
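
To show what selective updating means in practice, here is a minimal sketch that rewrites only the character ranges occupied by URLs and leaves every other character untouched. It uses plain string slicing rather than magic-string itself, and the `LinkEdit` type and `applyEdits` function are hypothetical names, not notfoundbot's API.

```typescript
// Hypothetical description of one link rewrite, expressed as offsets
// into the original file contents.
interface LinkEdit {
  start: number; // offset of the first character of the URL
  end: number; // offset just past the last character of the URL
  replacement: string;
}

function applyEdits(source: string, edits: LinkEdit[]): string {
  // Apply edits from the end of the file backwards so that the offsets
  // of edits earlier in the file remain valid.
  const ordered = [...edits].sort((a, b) => b.start - a.start);
  let out = source;
  for (const { start, end, replacement } of ordered) {
    out = out.slice(0, start) + replacement + out.slice(end);
  }
  return out;
}

// Usage: upgrade one link while preserving the surrounding markup,
// including its unusual double space.
const samplePost = 'See <a href="http://example.com/">this  page</a>.';
const oldUrl = "http://example.com/";
const urlStart = samplePost.indexOf(oldUrl);
const updatedPost = applyEdits(samplePost, [
  { start: urlStart, end: urlStart + oldUrl.length, replacement: "https://example.com/" },
]);
```

Because only the URL range is replaced, the file is never re-serialized, so whitespace, quoting style, and frontmatter stay exactly as the author wrote them.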

Workflow

  • If there is an existing PR tagged notfoundbot, exit
  • Gather post files and parse them, and then for each unique outlink URL:
    • If the URL is not http or https, ignore it
    • If the URL is relative, ignore it
    • If the URL has been checked recently and is in the cache, ignore it
    • If the URL is HTTP, check its HTTPS equivalent
      • If the HTTPS equivalent exists, upgrade the link to HTTPS
      • Otherwise, check the HTTP link
        • If the HTTP link resolves, ignore it
        • If the HTTP link fails, mark it as an error
    • If the URL is HTTPS, check to see if it resolves
      • If the link resolves, ignore it
      • If the link fails, mark it as an error
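
The per-URL decision steps above can be sketched as a single function. This is an illustration under assumptions, not notfoundbot's actual code: `classifyLink`, `Outcome`, and `Resolver` are hypothetical names, and the resolver is a synchronous predicate to keep the sketch short, whereas a real check would be an asynchronous network request.

```typescript
type Outcome = "ignore" | "upgrade" | "error";

// Hypothetical: returns true when a request to the URL succeeds.
type Resolver = (url: string) => boolean;

function classifyLink(
  raw: string,
  resolves: Resolver,
  recentlyChecked: Set<string> = new Set()
): Outcome {
  let url: URL;
  try {
    url = new URL(raw); // throws for relative URLs, which have no scheme
  } catch {
    return "ignore";
  }
  // Skip anything that is not http or https (mailto:, ftp:, and so on).
  if (url.protocol !== "http:" && url.protocol !== "https:") return "ignore";
  // Skip URLs that are already in the cache of recent checks.
  if (recentlyChecked.has(raw)) return "ignore";

  if (url.protocol === "http:") {
    // Try the HTTPS equivalent first; upgrade if it resolves.
    const httpsUrl = new URL(raw);
    httpsUrl.protocol = "https:";
    if (resolves(httpsUrl.href)) return "upgrade";
  }
  // Either an HTTPS link, or an HTTP link with no working HTTPS equivalent.
  return resolves(raw) ? "ignore" : "error";
}
```

Passing the resolver in as a parameter keeps the decision logic separate from the networking, which also makes it easy to exercise with a fake resolver.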

Then, for each link marked as an error:

  • Check the Internet Archive to find contemporary archives of each failed URL
    • If an archive exists, replace the link
    • Otherwise, leave the link unchanged
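
One way to look up a contemporary archive is the Internet Archive's public Availability API, which accepts a target URL and a `YYYYMMDD` timestamp and returns the closest snapshot. The README does not say which endpoint notfoundbot calls, so treat the sketch below as an assumption; `availabilityUrl` and `archivedReplacement` are hypothetical names. No network request is made here: the second function only interprets an already-parsed JSON response.

```typescript
// The part of the Availability API response this sketch relies on.
interface AvailabilityResponse {
  archived_snapshots: {
    closest?: { available: boolean; url: string; timestamp: string; status: string };
  };
}

// Build the lookup URL for a failed link and the post's date as YYYYMMDD.
function availabilityUrl(target: string, timestamp: string): string {
  const params = new URLSearchParams({ url: target, timestamp });
  return `https://archive.org/wayback/available?${params.toString()}`;
}

// Return the archive URL to substitute for the broken link,
// or null when no snapshot is available and the link should be left alone.
function archivedReplacement(response: AvailabilityResponse): string | null {
  const closest = response.archived_snapshots.closest;
  return closest && closest.available ? closest.url : null;
}
```

A caller would fetch `availabilityUrl(brokenLink, postTimestamp)`, parse the JSON, and rewrite the link only when `archivedReplacement` returns a non-null URL.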