On Sat, 24 Mar 2012 10:26:48 -0000, Dave said:
Doesn't the -e robots=off, --page-requisites and -H wget options enable
one to collect all the necessary files that are called from a page?
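For reference, a sketch of the invocation those options assemble into (the URL is a placeholder, and -H is the short form of --span-hosts):

```shell
# Ignore robots.txt, fetch page requisites (images, CSS, scripts that the
# static HTML references), and allow spanning to other hosts to get them.
wget -e robots=off --page-requisites -H http://www.example.com/page.html
```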
No, not *all* the files, for the same reason that if you visit a page with
NoScript enabled, you may end up with missing content and/or big open spaces on
the page. Plenty of sites build URLs in JavaScript, along the lines of:

  todaysfile = "http://www.news-site.com/" + date_as_string;

because yesterday and tomorrow will get a different URL. So basically,
if you try to pull it down with wget or similar, you will miss *all* the stuff
that gets fetched via JavaScript (and does wget even
know how to follow CSS references?). On many modern web designs,
this ends up being the vast majority of the content.
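As a concrete sketch of the pattern above (the site name, path scheme, and date format are assumptions, not taken from any real site), the page's script might build the day's URL like this:

```javascript
// Hypothetical client-side URL construction: the final URL only exists
// after this code runs, so it never appears in the static HTML that
// wget downloads and scans for links.
function todaysUrl(site, date) {
  // Assumed date format: YYYY-MM-DD
  const dateAsString = date.toISOString().slice(0, 10);
  return site + dateAsString + ".html";
}

// e.g. todaysUrl("http://www.news-site.com/", new Date(Date.UTC(2012, 2, 24)))
// yields "http://www.news-site.com/2012-03-24.html"
```

A crawler that does not execute JavaScript sees only the script text, never the URL it produces, which is why --page-requisites cannot help here.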