I’ve wanted to add a general link checker to look for broken links. This isn’t quite the same thing, but it would be an option for remediating link rot once it’s found. Plus it seemed simple to do.
My proof of concept for this also provides an excellent answer to a common question: when have you gone too far for a shell script and should switch to a “real language”? This script has crossed that line, so I thought I’d share it.
I’ve written longer shell scripts that are fine as shell scripts. It’s not length - it’s complexity and brittleness. This script is both.
- Shellcheck will complain that I’m reading from and writing to `.ia-urls` in a single pipeline, but in this case it’s wrong: the `sort -u` after the subshell acts as a barrier, since it won’t output anything until the awk exits. That’s getting pretty deep into the weeds of shell esoterica, though.
- The xargs running `bash -c`. It’s needed to prefix the archive-save output with the url being saved. Making the urls arguments to a `for` loop (or the `while read url` variant) would be a bit less brittle, but those approaches introduce their own problems.
- Getting the list of urls is incomplete: it misses a different markdown url style that can cross over lines.
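To make the points above concrete, here’s a minimal sketch of the pipeline shape being described. It is not the real script: the filenames, the `grep` standing in for the awk extraction, and the placeholder “would save” step are all my assumptions, and like the real thing it only catches inline-style `[text](url)` links.

```shell
#!/bin/sh
# Sketch only: sample input, extraction regex, and file names are
# assumptions, not the actual script.
set -eu
cd "$(mktemp -d)"

cat > post.md <<'EOF'
See [one](https://example.com/a) and [two](https://example.com/b),
plus [one again](https://example.com/a).
EOF

# Extract urls and dedupe. sort -u cannot emit anything until the
# extractor exits, which is why reading and writing .ia-urls in one
# pipeline is safe here even though shellcheck flags the pattern.
grep -Eo 'https?://[^)]*' post.md | sort -u > .ia-urls

# xargs running bash -c: each url lands in $0, so every line of saver
# output can be prefixed with the url being saved.
xargs -n1 bash -c 'printf "%s: would save\n" "$0"' < .ia-urls
```

The `bash -c` trick is the brittle part: the url is smuggled in as `$0` rather than appearing in a readable loop body, which is exactly the kind of cleverness that signals it’s time for a real language.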
It’s a good proof of concept for learning how the IA “api” works, and more generally for thinking through the data structures and the workflow of deployment. Right now I have `./scripts/ia-check` (which also doesn’t work as a shell script), and I can see how to wire it into my deployment pipeline and possibly git hooks.
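For reference, the IA “api” in question is presumably the Wayback Machine’s Save Page Now endpoint, which is just a GET against `https://web.archive.org/save/<url>`. A tiny sketch (the helper name is mine, not from the script):

```shell
# Build the Save Page Now url for a given target url.
ia_save_endpoint() {
  printf 'https://web.archive.org/save/%s\n' "$1"
}

# Usage (not run here, since it hits the network), prefixing the
# output with the url being saved as the script does:
#   printf '%s: ' "$url"
#   curl -s -o /dev/null -w '%{http_code}\n' "$(ia_save_endpoint "$url")"
```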
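One way the git-hook wiring could look, sketched under the assumption that a `pre-push` hook is the right place and that `ia-check` exits nonzero on failure (the hook path is standard git; everything else here is hypothetical):

```shell
#!/bin/sh
# Stand-in repo layout so the sketch is self-contained; a real repo
# already has .git/hooks and scripts/.
set -eu
cd "$(mktemp -d)"
mkdir -p .git/hooks scripts

# Install a pre-push hook that refuses the push if the check fails.
cat > .git/hooks/pre-push <<'EOF'
#!/bin/sh
exec ./scripts/ia-check
EOF
chmod +x .git/hooks/pre-push
```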