Fetching RSS feeds respectfully with curl
In this article, MacKenzie builds up a config, script and systemd file to respectfully fetch an RSS feed with curl.
The article uses the following as the base config for curl:
fail
compressed
max-time = 30
no-progress-meter
alt-svc = alt-svc-cache.txt
etag-compare = tech.CitizenLab.rss.etag
etag-save = tech.CitizenLab.rss.etag
output = tech.CitizenLab.rss.xml
time-cond = "Tue, 05 Nov 2024 15:00:35 GMT"
write-out = "%output{tech.CitizenLab.rss.lm}%header{last-modified}"
url = "https://citizenlab.ca/feed/"
next
The article then adds conditional checks for the etag-compare and time-cond directives, so that each one is only added to the config if its corresponding file contains a non-empty value.
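The conditional checks could be sketched roughly as follows. This is not the article's script, only a minimal illustration under assumptions: the function name build_config and its two arguments are hypothetical, and the state files are assumed to sit in the current directory with the same naming as the config above (.etag for the saved ETag, .lm for the saved Last-Modified value).

```shell
#!/bin/sh
# Sketch: print a curl config for one feed on stdout.
# build_config and its arguments are hypothetical names, not from the article.
build_config() {
    name=$1   # file prefix, e.g. tech.CitizenLab.rss
    url=$2    # feed URL

    # Directives that are always present.
    cat <<EOF
fail
compressed
max-time = 30
no-progress-meter
alt-svc = alt-svc-cache.txt
etag-save = ${name}.etag
output = ${name}.xml
write-out = "%output{${name}.lm}%header{last-modified}"
url = "${url}"
EOF

    # Add etag-compare only if a previous run stored a non-empty ETag.
    etag=$(cat "${name}.etag" 2>/dev/null || true)
    if [ -n "$etag" ]; then
        printf 'etag-compare = %s\n' "${name}.etag"
    fi

    # Add time-cond only if a previous run stored a non-empty Last-Modified value.
    lm=$(cat "${name}.lm" 2>/dev/null || true)
    if [ -n "$lm" ]; then
        printf 'time-cond = "%s"\n' "$lm"
    fi

    # "next" closes this feed's options so another feed could follow.
    printf 'next\n'
}
```

One way to use it would be to pipe the result into curl, which reads a config from stdin when given --config -, for example: build_config tech.CitizenLab.rss https://citizenlab.ca/feed/ | curl --config -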
The last part is a systemd timer unit with OnUnitInactiveSec=1hour, so that the command runs one hour after the previous run finished.
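As a rough illustration of the timer setup, the sketch below writes a oneshot service and a matching timer into a unit directory. Only OnUnitInactiveSec=1hour comes from the article. The function name write_units, the unit names fetch-feed.service and fetch-feed.timer, the script path %h/bin/fetch-feed.sh, and the OnActiveSec=1min line (added here on the assumption that something has to trigger the very first run) are all hypothetical.

```shell
#!/bin/sh
# Sketch: write a oneshot service plus its timer into a unit directory.
# write_units, fetch-feed.* and the script path are hypothetical names.
write_units() {
    dir=$1   # target directory, e.g. ~/.config/systemd/user
    mkdir -p "$dir"

    # The service runs the fetch script once and then becomes inactive.
    cat > "$dir/fetch-feed.service" <<'EOF'
[Unit]
Description=Fetch RSS feed with curl

[Service]
Type=oneshot
ExecStart=%h/bin/fetch-feed.sh
EOF

    # The timer re-runs the service relative to when it last finished.
    cat > "$dir/fetch-feed.timer" <<'EOF'
[Unit]
Description=Re-run fetch-feed.service one hour after it last finished

[Timer]
# Assumption: short delay after timer activation to trigger the first run.
OnActiveSec=1min
# From the article: wait one hour after the service becomes inactive.
OnUnitInactiveSec=1hour

[Install]
WantedBy=timers.target
EOF
}
```

For a user-level setup, this might be followed by write_units ~/.config/systemd/user, then systemctl --user daemon-reload and systemctl --user enable --now fetch-feed.timer. Because the interval is measured from the end of the previous run rather than from a fixed clock time, a slow fetch pushes the next one back instead of overlapping with it.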