From: Tonnerre Lombard <tonnerre.lombard () sygroup ch>
Date: Mon, 21 Jan 2008 08:38:05 +0100
On Thu, 17 Jan 2008 17:58:41 +0000 "me me" <securityoneoone () googlemail com> wrote:
to take manual intervention, so is a number of other possible
technologies), I have never really found a tool that I consider to be
the de facto spidering tool from this perspective. One of the biggest
problems is that a lot of the spiders seem to choke on really big sites,
or go into infinite loops, etc.
Yes, Microsoft Passport is very evil there, for example. My trick for
solving the Microsoft Passport problem is to check whether each link
contains a URL-encoded version of the current URL and, if it does, to
ignore it. That appears to avoid dead loops.
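
A minimal Python sketch of that check, for illustration only. The
function names (embeds_current_url, filter_links) and the example URLs
are hypothetical, not taken from any particular spider, and the
case-insensitive comparison is an assumption made so that %3A and %3a
are treated alike:

```python
from urllib.parse import quote


def embeds_current_url(link: str, current_url: str) -> bool:
    """Return True if `link` contains a URL-encoded copy of `current_url`."""
    # Encode every reserved character, including "/" and ":".
    encoded = quote(current_url, safe="")
    # Simplification: compare case-insensitively so that differences in
    # the hex digits of percent-escapes (%3A vs %3a) do not matter.
    return encoded.lower() in link.lower()


def filter_links(links, current_url):
    """Drop links that point back at the page currently being crawled."""
    return [link for link in links
            if not embeds_current_url(link, current_url)]


if __name__ == "__main__":
    current = "http://example.com/page?a=1"
    found = [
        "http://login.example.net/auth?return=http%3A%2F%2Fexample.com%2Fpage%3Fa%3D1",
        "http://example.com/other",
    ]
    print(filter_links(found, current))
```

This only catches redirect-style links that carry the current URL as a
parameter; it does nothing about other sources of loops such as session
IDs or calendar pages.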
I haven't seen any other dead loops as far as I remember, but then
again I haven't indexed very much yet.
Tel:+41 61 333 80 33 Güterstrasse 86
Fax:+41 61 383 14 67 4053 Basel
Web:www.sygroup.ch tonnerre.lombard () sygroup ch