PaulDotCom mailing list archives

Looking for a good web spider
From: Daniel Holiday <dehaul () gmail com>
Date: Mon, 27 Sep 2010 15:24:34 -0600

I once wrote a multi-threaded spider in C++ using libcurl.
Unfortunately I wrote the code for my present employer and
don't own it. :)

It was very fast - a single dual-core server could pull down at least
50 MB/s or so. We completely saturated the pipe of the small ISP we
were running the spider from, and we left the spider on for an entire
weekend, costing them a bunch of money in bandwidth overages.

It was awesome.

We had to add some bandwidth limiting code the following week.
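
For anyone wondering what bandwidth limiting might look like, here is a rough sketch of one common approach, a token bucket. This is not the code from that spider (which I can't share); the class and method names are made up for illustration. Each worker would call reserve() with the number of bytes it just received and sleep for the returned duration. Note that libcurl also has a built-in per-transfer cap, CURLOPT_MAX_RECV_SPEED_LARGE, which may be enough on its own if a global limit across threads is not needed.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical token-bucket limiter. All names here are illustrative.
// Tokens are bytes; the bucket refills at `bytes_per_sec` up to `burst_bytes`.
class TokenBucket {
public:
    using Clock = std::chrono::steady_clock;

    TokenBucket(double bytes_per_sec, double burst_bytes,
                Clock::time_point now = Clock::now())
        : rate_(bytes_per_sec), capacity_(burst_bytes),
          tokens_(burst_bytes), last_(now) {
        assert(bytes_per_sec > 0.0 && burst_bytes > 0.0);
    }

    // Charges `bytes` against the bucket and returns how long the caller
    // should sleep before continuing. Zero means no wait is needed.
    // The balance may go negative; that debt is what produces the delay.
    std::chrono::duration<double> reserve(std::size_t bytes,
                                          Clock::time_point now = Clock::now()) {
        refill(now);
        tokens_ -= static_cast<double>(bytes);
        if (tokens_ >= 0.0) {
            return std::chrono::duration<double>(0.0);
        }
        return std::chrono::duration<double>(-tokens_ / rate_);
    }

private:
    // Adds tokens for the time elapsed since the last refill, capped at capacity.
    void refill(Clock::time_point now) {
        std::chrono::duration<double> elapsed = now - last_;
        if (elapsed.count() > 0.0) {
            tokens_ = std::min(capacity_, tokens_ + elapsed.count() * rate_);
            last_ = now;
        }
    }

    double rate_;      // refill rate in bytes per second
    double capacity_;  // maximum burst size in bytes
    double tokens_;    // current balance in bytes (may be negative)
    Clock::time_point last_;
};
```

If several threads share one bucket, reserve() would need a mutex around it; the sketch leaves that out to keep it short. Passing the time in as a parameter is only there to make the class easy to test.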

If you roll your own and need extreme performance, libcurl will serve you well.
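
To give a feel for the multi-threaded part, here is a minimal sketch of a worker pool sharing one URL frontier. It is an assumption about how such a spider could be structured, not the design of the one described above. The fetch function is injected and hypothetical: in a real spider it would wrap a libcurl easy handle (curl_easy_init, curl_easy_setopt, curl_easy_perform), with one handle per thread and curl_global_init called once before the threads start, and it would parse links out of the response body.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical callback: downloads `url` and returns the links found in it.
using FetchFn = std::function<std::vector<std::string>(const std::string&)>;

// Crawls everything reachable from `seed` and returns the set of URLs seen.
// `fetch` must be safe to call from several threads and must not throw.
std::unordered_set<std::string> crawl(const std::string& seed,
                                      const FetchFn& fetch,
                                      std::size_t num_threads) {
    assert(num_threads > 0);

    std::mutex mu;
    std::condition_variable cv;
    std::queue<std::string> frontier;        // URLs waiting to be fetched
    std::unordered_set<std::string> seen{seed};
    std::size_t in_flight = 0;               // fetches currently running
    frontier.push(seed);

    auto worker = [&] {
        for (;;) {
            std::string url;
            {
                std::unique_lock<std::mutex> lock(mu);
                // Wake up when there is work, or when no fetch is running
                // and therefore no new work can ever arrive.
                cv.wait(lock, [&] { return !frontier.empty() || in_flight == 0; });
                if (frontier.empty()) {
                    return;
                }
                url = std::move(frontier.front());
                frontier.pop();
                ++in_flight;
            }

            // The slow network I/O runs outside the lock.
            std::vector<std::string> links = fetch(url);

            {
                std::lock_guard<std::mutex> lock(mu);
                for (auto& link : links) {
                    // Queue each URL only the first time it is seen.
                    if (seen.insert(link).second) {
                        frontier.push(std::move(link));
                    }
                }
                --in_flight;
            }
            cv.notify_all();
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < num_threads; ++i) {
        pool.emplace_back(worker);
    }
    for (auto& t : pool) {
        t.join();
    }
    return seen;
}
```

The in_flight counter is what lets the workers shut down cleanly: an empty queue alone does not mean the crawl is finished, because a running fetch may still add links. A real spider would also want politeness delays, robots.txt handling, and a bound on the frontier, none of which are shown here.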

Pauldotcom mailing list
Pauldotcom () mail pauldotcom com
Main Web Site: http://pauldotcom.com
