One of the nice things about shell is that you can prototype quickly. But eventually those prototypes become too brittle, limited, or complex, and you need to switch to a better language. Pipelines are powerful, but real data structures are better.
The brevity is nice though. From the
these four lines of shell…
…become twelve lines of Python. The data in the Python version ends up in a set (in this case implemented with a dict), which makes it easier to work with than the stream it lives in for the shell version.
In addition, it’s easy to see how to give it a list of URLs not to save.
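To make the dict-as-set idea concrete, here is a minimal sketch. It is not the script discussed above: the function name `collect_urls`, the `skip` parameter, and the rough URL regex are all assumptions made up for illustration.

```python
import re

# Hypothetical pattern: a rough match for http(s) URLs, not a full URL parser.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def collect_urls(lines, skip=()):
    """Return the unique URLs found in `lines`, in first-seen order.

    Any URL listed in `skip` is left out of the result.
    """
    skip = set(skip)
    seen = {}  # dict used as an ordered set: keys are URLs, values are unused
    for line in lines:
        for url in URL_RE.findall(line):
            if url not in skip:
                seen[url] = True
    return list(seen)
```

Calling `collect_urls(open("page.html"), skip=["http://example.com/ignore"])` (both names are placeholders) gives back a plain list you can sort, filter, or count. In a pipeline, the same exclusion would probably mean bolting on another `grep -v` stage.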
Sometimes “more” code is better.