Fetching RSS feeds respectfully with curl
In this article, MacKenzie builds up a config, script and systemd file to respectfully fetch an RSS feed with curl.
It uses the following as the base config for curl:
    fail
    compressed
    max-time = 30
    no-progress-meter
    alt-svc = alt-svc-cache.txt
    etag-compare = tech.CitizenLab.rss.etag
    etag-save = tech.CitizenLab.rss.etag
    output = tech.CitizenLab.rss.xml
    time-cond = "Tue, 05 Nov 2024 15:00:35 GMT"
    write-out = "%output{tech.CitizenLab.rss.lm}%header{last-modified}"
    url = "https://citizenlab.ca/feed/"
    next
It then adds conditional checks for the etag-compare and time-cond directives, so that each one is only added if its corresponding file contains a non-empty value.
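The conditional checks described above could look roughly like the following. This is a minimal sketch, not the script from the article: the function name conditional_directives and the idea of passing the file-name prefix as an argument are assumptions made for illustration. It assumes the .etag and .lm files are the ones written by etag-save and write-out in the base config.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print the conditional curl
# config lines for one feed.
# $1 is the file-name prefix, e.g. tech.CitizenLab.rss
conditional_directives() {
    prefix=$1

    # [ -s FILE ] is true only if FILE exists and is non-empty.
    if [ -s "$prefix.etag" ]; then
        # etag-compare takes the name of the file holding the saved ETag.
        printf 'etag-compare = "%s"\n' "$prefix.etag"
    fi

    if [ -s "$prefix.lm" ]; then
        # time-cond takes the saved Last-Modified date itself.
        printf 'time-cond = "%s"\n' "$(cat "$prefix.lm")"
    fi
}
```

Calling conditional_directives tech.CitizenLab.rss prints zero, one or two config lines, which would go before the url line of that feed's block. On a first run, when neither file exists yet, nothing is printed and curl makes an unconditional request.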
The last part is a systemd timer file with OnUnitInactiveSec=1hour, so that the command runs again one hour after the previous run finished.
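A timer along those lines might look like the sketch below. Only the OnUnitInactiveSec=1hour setting comes from the article. The unit name rss-fetch.timer, the Description text and the OnBootSec line are assumptions. OnBootSec is included because OnUnitInactiveSec counts from the last time the service stopped, so a timer that has never run its service needs some other trigger for the first run.

```ini
# rss-fetch.timer (hypothetical name). By default it activates the
# service with the same name, rss-fetch.service.
[Unit]
Description=Fetch RSS feeds with curl

[Timer]
# Assumption: trigger the first run shortly after boot.
OnBootSec=5min
# From the article: run again one hour after the previous run finished.
OnUnitInactiveSec=1hour

[Install]
WantedBy=timers.target
```

Measuring from the end of the previous run, rather than on a fixed clock schedule, means a slow fetch cannot overlap with the next one.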