https://web.archive.org/web/*/http://www-cdf.fnal.gov/offline/PostScript/*
Get 'em while they're hot. It can still be removed if fnal changes its robots restrictions.
On Monday, April 11, 2022 at 9:57:17 PM UTC-4, luser droog wrote:
> https://web.archive.org/web/*/http://www-cdf.fnal.gov/offline/PostScript/*
> Get 'em while they're hot. It can still be removed if fnal changes its robots restrictions.

Word to the wise: If the "MIME TYPE" column on the listing page luser droog linked to shows "unk", the capture is going to be a 404 page. Mostly this appears to happen when the crawler fetched the wrong URL, e.g.
http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf<p>
instead of
http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf

Some of the incorrect URLs are incredibly long and appear to be multiple URLs run together. Try splitting them into separate URLs. Sometimes the corrected URL is also captured; download that instead.
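If you have a lot of these to go through, the cleanup can be scripted. Here is a rough Python sketch, not anything the archive itself provides: the function name `split_crawled_url` is made up for illustration, and it assumes the only damage is a trailing HTML tag such as `<p>` and/or several URLs glued together with nothing between them.

```python
import re

def split_crawled_url(raw: str) -> list[str]:
    """Split a mis-crawled string into candidate URLs (hypothetical helper).

    Assumption: each real URL starts with http:// or https://, and any
    "<" marks the start of stray HTML rather than part of the URL.
    """
    # Grab each run that starts at a scheme and stops before the next scheme.
    chunks = re.findall(r"https?://(?:(?!https?://).)*", raw)
    cleaned = []
    for chunk in chunks:
        # Drop everything from the first "<" onward, e.g. a trailing "<p>".
        url = chunk.split("<", 1)[0].strip()
        if url:
            cleaned.append(url)
    return cleaned

if __name__ == "__main__":
    bad = "http://www-cdf.fnal.gov/offline/PostScript/PLRM2.pdf<p>"
    for url in split_crawled_url(bad):
        print(url)
```

Each URL it prints is only a candidate; you still have to check the listing page to see whether that corrected URL was actually captured.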