Nmap Development mailing list archives

Re: [RFC] Mass rDNS performance tweak


From: jah <jah () zadkiel plus com>
Date: Fri, 23 Jan 2009 00:05:42 +0000

On 22/01/2009 19:01, David Fifield wrote:
Thanks for the detailed description. This is the way development should
be done, with documented procedures and measurable, repeatable results.
I created a ptr_list according to your instructions and ran a test three
times, with and without the patch, with and without --system-dns.

nmap parallel
13.31s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 428, CN: 0]
14.58s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 425, CN: 0]
10.29s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 365, CN: 0]

nmap system
15.61s. Mode: System [OK: 250, ??: 0]
14.37s. Mode: System [OK: 250, ??: 0]
14.77s. Mode: System [OK: 250, ??: 0]

nmap_dns.cc.patch parallel
 9.13s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 3, SF: 0, TR: 253, CN: 0]
10.02s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 1, SF: 0, TR: 251, CN: 0]
 9.48s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 2, SF: 0, TR: 252, CN: 0]

nmap_dns.cc.patch system
14.67s. Mode: System [OK: 250, ??: 0]
14.96s. Mode: System [OK: 250, ??: 0]
15.30s. Mode: System [OK: 250, ??: 0]

So as you can see I didn't have a problem with accuracy in any case but
the patched parallel resolver was the fastest and sent the fewest
probes, with the average being very close to 1 probe per host.
  
Thanks for trying this patch out.  These are nice results which
illustrate that sending fewer requests can speed things up - if only a
little.  However, Doug's response prompted me to do much larger tests
and I've found that this patch is good at improving results when the
actual capacity is quite low, but it can really slow things down when
capacity is higher.  More on this shortly.
(By the way this is an easy test that anyone can do. I just made a
separate checkout of nmap in an nmap-temp directory next to my usual
checkout, and applied the patch there. Then I made a shell script with
the contents
      ./nmap/nmap -sL -d -iL ptr_list | grep ^DNS
      ./nmap/nmap -sL -d -iL ptr_list --system-dns | grep ^DNS
      ./nmap-temp/nmap -sL -d -iL ptr_list | grep ^DNS
      ./nmap-temp/nmap -sL -d -iL ptr_list --system-dns | grep ^DNS
and ran it three times. Please do the same and share your results.)
  
Nice.  I usually perform --system-dns tests before async to absorb any
reductions in time to completion caused by caching.  It's worth noting
that these tests are good for measuring accuracy, but a list of purely
random IPs is better for gauging real-world efficacy.
The patch works great for me at least in the case where all hosts have
a PTR record. I have a few questions and observations about the patch.

  
I don't understand the meaning of drop_ratio and how it affects
capacity. It's defined as
      drop_ratio = reqs_timedout / reqs_completed;
Shouldn't it rather be
      drop_ratio = reqs_timedout / (reqs_timedout + reqs_completed);
to make it a ratio strictly between 0 and 1? As it is I don't know how
to interpret it, except that drop_ratio > 1 means that more than 50% of
requests timed out.
  
Specifically, what I was trying to achieve with the perhaps badly named
drop_ratio was that as the ratio increases, the rate at which we reduce
capacity increases (roughly) exponentially and the rate at which
capacity increases goes down.  I must admit though that this wasn't done
in a calculated fashion, but rather as a result of observing lots of
debug output and a fair bit of trial and error.  In practice, with this
patch, the ratio rarely exceeds 0.5, so it ought to be worth trying the
proportion of timed-out requests instead.
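In concrete terms that would just be something like this (a quick
sketch; reqs_timedout and reqs_completed are the counters the patch
already keeps, the function wrapper is only for illustration):

      /* Proportion of finished requests that timed out; always in [0,1]
         and defined as 0 before anything has completed or timed out. */
      double proportional_drop_ratio(int reqs_timedout, int reqs_completed) {
        int total = reqs_timedout + reqs_completed;
        return total > 0 ? (double) reqs_timedout / total : 0.0;
      }
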
Likewise, what does this mean?
      capacity -= (float) drop_ratio
More drops means a greater reduction in capacity, everything else being
equal. But shouldn't drop ratios (defined as in the patch) of 6 / 5 and
600 / 500 cause different absolute changes in capacity? In the first
case it seems we should reduce capacity by about 6; in the second, by
about 600. The code in the patch will change it by -1.2 in both cases.
  
Remember also that we get a capacity decrease for each timed-out request
so the overall decrease in capacity is much greater after 600 timeouts
than after 6.
For the linear growth on capacity increase, you may be able to do
      capacity += 1.0 / capacity;
(Possibly with a different constant in place of the 1.0.) This is what
TCP does during congestion avoidance mode, see below.
  
I'll try this.  Your comments below have really helped to crystallise
some concepts and I can see that some k / capacity might work really well.
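Something along these lines is what I have in mind (a hypothetical
sketch only - the struct and function names are made up, and the halving
on timeout is just the TCP-style option rather than what the patch
currently does):

      // Additive-increase / multiplicative-decrease, in the spirit of TCP
      // congestion avoidance.  Illustrative names, not from nmap_dns.cc.
      struct dns_server { double capacity; };

      static const double CAPACITY_FLOOR = 2.0;  // tune's CAPACITY_MIN is 2

      void on_response(dns_server &s) {
        // Additive increase: roughly +1 per capacity's worth of responses.
        s.capacity += 1.0 / s.capacity;
      }

      void on_timeout(dns_server &s) {
        // Multiplicative decrease: back off sharply, but respect the floor.
        s.capacity *= 0.5;
        if (s.capacity < CAPACITY_FLOOR)
          s.capacity = CAPACITY_FLOOR;
      }
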
Going back to your description of the current DNS code,

  
When a response is received from a server, its capacity is increased by
CAPACITY_UP_STEP (currently 2).
    

This algorithm leads to exponential growth of the capacity, just like in
TCP's slow start mode. Because the DNS code doesn't have anything like
TCP's linear congestion avoidance mode, it's like it's perpetually in
slow start. This does sound like a recipe for big fluctuations.
  
Added to that is the fact that we increase the capacity every time we
get a response, often beyond the point at which the real capacity is
exceeded, and then we eventually get a flurry of timeouts which reduces
capacity right back to where we started - rinse and repeat.
(While incrementing by a fixed amount may look like linear growth, the
rate at which those increments happen is proportional to the current
capacity, so the increase per unit time is proportional to 2 * capacity.
It's exponential growth when the rate of increase is proportional to the
current size.)

  
What this means is that, depending on the drop ratio, we'll increase
capacity by a maximum of 5 during any timeslot, which allows timed-out
requests to balance the capacity with decreases.
    

If I understand correctly, this is linear growth, just like TCP's
congestion avoidance mode. In TCP, the congestion window grows by 1
every RTT; for you the capacity grows by 5 every timeslot.
  
Within each timeslot, the growth is still exponential, and the higher
the capacity, the quicker this growth reaches our imposed limit for that
timeslot and the longer the periods of zero growth.  Overall though, it
is linear growth.
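(To put that in symbols - a rough sketch, assuming each outstanding
request is answered within about one round-trip time T: for the current
code the rate of increase is itself proportional to the capacity,

      \[ \frac{dC}{dt} \approx \frac{C}{T}\cdot\mathrm{CAPACITY\_UP\_STEP}
         = \frac{2C}{T}
         \quad\Longrightarrow\quad
         C(t) \approx C(0)\,e^{2t/T}, \]

whereas capping the total increase at 5 per timeslot of length T_s bounds
it by C(t) <= C(0) + 5t/T_s, i.e. linear growth.)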

Here are some results from some bigger list scans using nameservers
authoritative for the addresses in question - as suggested by Doug.

Results labelled "base" are from the current version of nmap, but also
including the timeout (TO) statistic.
Results labelled "tune" are from nmap with the patch.

These three sets of results each use a separate set of three DNS
servers for 4096 IP addresses, and each test is performed twice with
each version of nmap:
base:
41.91s. Mode: Async [#: 3, OK: 0, NX: 3993, DR: 103, TO: 3269, SF: 0,
TR: 7262, CN: 0]
tune:
33.66s. Mode: Async [#: 3, OK: 0, NX: 4066, DR: 30, TO: 331, SF: 0, TR:
4397, CN: 0]
tune:
32.63s. Mode: Async [#: 3, OK: 0, NX: 4062, DR: 34, TO: 325, SF: 0, TR:
4387, CN: 0]
base:
41.16s. Mode: Async [#: 3, OK: 0, NX: 4016, DR: 80, TO: 3002, SF: 0, TR:
7018, CN: 0]

tune:
37.33s. Mode: Async [#: 2, OK: 4012, NX: 16, DR: 68, TO: 333, SF: 0, TR:
4361, CN: 0]
base:
32.91s. Mode: Async [#: 2, OK: 3959, NX: 15, DR: 122, TO: 1861, SF: 0,
TR: 5835, CN: 0]
base:
34.75s. Mode: Async [#: 2, OK: 3949, NX: 16, DR: 131, TO: 1920, SF: 0,
TR: 5885, CN: 0]
tune:
38.59s. Mode: Async [#: 2, OK: 4007, NX: 16, DR: 73, TO: 391, SF: 0, TR:
4414, CN: 0]

base:
44.80s. Mode: Async [#: 3, OK: 3992, NX: 0, DR: 104, TO: 3903, SF: 0,
TR: 7895, CN: 0]
tune:
44.52s. Mode: Async [#: 3, OK: 4024, NX: 0, DR: 72, TO: 728, SF: 0, TR:
4752, CN: 0]
tune:
40.22s. Mode: Async [#: 3, OK: 4012, NX: 0, DR: 84, TO: 651, SF: 0, TR:
4663, CN: 0]
base:
43.80s. Mode: Async [#: 3, OK: 4011, NX: 0, DR: 85, TO: 3481, SF: 0, TR:
7492, CN: 0]

The timings are very similar, but in each case the tuned nmap is
slightly more accurate, sends considerably fewer requests and thus has
fewer timed-out requests.  I decided to add code to print out the peak,
mean and mode capacity for each server so that I could better see what
was going on and where improvements might be made.
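Roughly speaking, that bookkeeping amounts to no more than this (an
illustrative sketch; the names aren't taken from the actual code, and
the capacity is rounded to an int when sampled):

      #include <map>

      // Sample the server's capacity at each transmit and keep a frequency
      // count; peak, mean and mode all fall out of the map afterwards.
      struct cap_tracker {
        std::map<int, int> freq;   // capacity value -> times observed

        void sample(int capacity) { freq[capacity]++; }

        void report(int &peak, double &mean, int &mode) const {
          peak = 0; mode = 0; mean = 0.0;
          double sum = 0.0, n = 0.0;
          int best = 0;
          for (std::map<int, int>::const_iterator it = freq.begin();
               it != freq.end(); ++it) {
            if (it->first > peak) peak = it->first;
            if (it->second > best) { best = it->second; mode = it->first; }
            sum += (double) it->first * it->second;
            n += it->second;
          }
          if (n > 0) mean = sum / n;
        }
      };
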

More results with 4096 IPs:
tune:
658.48s. Mode: Async [#: 3, OK: 289, NX: 3519, DR: 288, TO: 1998, SF: 0,
TR: 5806, CN: 0]
STATS <X.252.0.72>  TX: 2335 (40%)  RESPONSES: 1660   TIMEDOUT: 675
 CAPACITY Peak: 26 Mean: 9.55 Mode: 2 (24%)
STATS <X.203.0.86>  TX: 2032 (34%)  RESPONSES: 1347   TIMEDOUT: 685
 CAPACITY Peak: 26 Mean: 12.25 Mode: 2 (28%)
STATS <X.252.0.71>  TX: 1439 (24%)  RESPONSES: 801   TIMEDOUT: 638
 CAPACITY Peak: 21 Mean: 9.15 Mode: 2 (35%)

base:
150.80s. Mode: Async [#: 3, OK: 286, NX: 3456, DR: 354, TO: 4709, SF: 0,
TR: 8451, CN: 0]
STATS <X.252.0.72>  TX: 2888 (34%)  RESPONSES: 1315   TIMEDOUT: 1573
 CAPACITY Peak: 200 Mean: 103.67 Mode: 200 (23%)
STATS <X.203.0.86>  TX: 1832 (21%)  RESPONSES: 613   TIMEDOUT: 1219
 CAPACITY Peak: 200 Mean: 74.05 Mode: 10 (20%)
STATS <X.252.0.71>  TX: 3731 (44%)  RESPONSES: 1814   TIMEDOUT: 1917
 CAPACITY Peak: 200 Mean: 106.51 Mode: 200 (29%)
 
Not such a great result: tuned nmap increases accuracy by only a small
amount, but takes a huge hit in speed.  You can see that in the tuned
nmap we reached a maximum capacity of 26 for two of the servers and 21
for the other, but that we sent a lot of requests at a capacity of 2.
It seems that the servers failed to respond to a large number of
requests and, together with the low capacity, this meant we were often
waiting for requests to time out before being able to send more.  This
is a good example of the difficulty of determining capacity when
time-outs happen for reasons other than sending more requests than can
be handled.

More results with 4096 IPs:
tune:
35.45s. Mode: Async [#: 2, OK: 1085, NX: 2959, DR: 52, TO: 357, SF: 0,
TR: 4401, CN: 0]
STATS <X.78.255.20>  TX: 2671 (60%)  RESPONSES: 2518   TIMEDOUT: 153
 CAPACITY Peak: 41 Mean: 27.91 Mode: 41 (17%)
STATS <X.78.100.1>  TX: 1730 (39%)  RESPONSES: 1526   TIMEDOUT: 204
 CAPACITY Peak: 39 Mean: 26.58 Mode: 34 (16%)
 
base:
44.53s. Mode: Async [#: 2, OK: 1042, NX: 2893, DR: 161, TO: 2475, SF: 0,
TR: 6410, CN: 0]
STATS <X.78.255.20>  TX: 3159 (49%)  RESPONSES: 1940   TIMEDOUT: 1219
 CAPACITY Peak: 200 Mean: 138.49 Mode: 200 (43%)
STATS <X.78.100.1>  TX: 3251 (50%)  RESPONSES: 1995   TIMEDOUT: 1256
 CAPACITY Peak: 200 Mean: 139.14 Mode: 200 (43%)
 
tune:
35.69s. Mode: Async [#: 2, OK: 1088, NX: 2960, DR: 48, TO: 352, SF: 0,
TR: 4400, CN: 0]
STATS <X.78.255.20>  TX: 2594 (58%)  RESPONSES: 2439   TIMEDOUT: 155
 CAPACITY Peak: 41 Mean: 27.09 Mode: 40 (18%)
STATS <X.78.100.1>  TX: 1806 (41%)  RESPONSES: 1609   TIMEDOUT: 197
 CAPACITY Peak: 39 Mean: 26.72 Mode: 35 (17%)

base:
39.69s. Mode: Async [#: 2, OK: 1059, NX: 2898, DR: 139, TO: 2136, SF: 0,
TR: 6093, CN: 0]
STATS <X.78.255.20>  TX: 2246 (36%)  RESPONSES: 1281   TIMEDOUT: 965
 CAPACITY Peak: 200 Mean: 128.23 Mode: 200 (37%)
STATS <X.78.100.1>  TX: 3847 (63%)  RESPONSES: 2676   TIMEDOUT: 1171
 CAPACITY Peak: 200 Mean: 151.81 Mode: 200 (55%)
 
Not a significant increase in accuracy (but an increase nonetheless) - a
small increase in speed due to sending far fewer requests.

More results with 4096 IPs:
base:
1500.75s. Mode: Async [#: 2, OK: 0, NX: 0, DR: 4096, TO: 16384, SF:
8962, TR: 16384, CN: 0]
STATS <X.96.72.129>  TX: 8192 (50%)  RESPONSES: 4480   TIMEDOUT: 8192
 CAPACITY Peak: 10 Mean: 10.00 Mode: 10 (100%)
STATS <X.96.147.1>  TX: 8192 (50%)  RESPONSES: 4482   TIMEDOUT: 8192
 CAPACITY Peak: 10 Mean: 10.00 Mode: 10 (100%)

tune: cancelled after 2 hours

This one's interesting.  We sent 2 requests per address per server and
got lots of SERVFAIL back.  I noticed that I was incorrectly
incrementing both the SF and TO stats because a SERVFAIL is first
handled by process_response(), where the request timeout value is
artificially lowered before being handled by
deal_with_timedout_reads().  I fixed this in both base and tune nmap and
also modified them both so that, should a SERVFAIL be received, the
request is not retried at that server, nor does this message impose a
penalty on capacity (at the moment, neither does it cause an increase in
capacity).
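In outline, the change amounts to something like this (a sketch only;
apart from process_response() itself, the names here are invented rather
than taken from nmap_dns.cc):

      #include <set>

      // Minimal stand-ins for the real structures, for illustration only.
      struct dns_server { double capacity; };
      struct request_t  { std::set<dns_server*> servers_tried; };
      struct rdns_stats { int servfail; };

      enum { RCODE_SERVFAIL = 2 };   // DNS RCODE 2 = server failure

      // On SERVFAIL: count it as SF (not TO), never retry this request at
      // this server, and leave the server's capacity alone either way.
      bool handle_servfail(int rcode, request_t &req, dns_server &server,
                           rdns_stats &stats) {
        if (rcode != RCODE_SERVFAIL)
          return false;
        stats.servfail++;
        req.servers_tried.insert(&server);
        return true;
      }
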
I then repeated the tests:

base:
44.70s. Mode: Async [#: 2, OK: 0, NX: 0, DR: 4096, TO: 169, SF: 8176,
TR: 8345, CN: 0]
STATS <X.96.72.129>  TX: 4176 (50%)  RESPONSES: 4087   TIMEDOUT: 89
 CAPACITY TRANSMITS Peak: 10 Mean: 10.0 Mode: 10 (100%)
STATS <X.96.147.1>  TX: 4169 (50%)  RESPONSES: 4089   TIMEDOUT: 80
 CAPACITY TRANSMITS Peak: 10 Mean: 10.0 Mode: 10 (100%)

tune:
83.06s. Mode: Async [#: 2, OK: 0, NX: 0, DR: 4096, TO: 21, SF: 8191, TR:
8212, CN: 0]
STATS <X.96.72.129>  TX: 4109 (50%)  RESPONSES: 4095   TIMEDOUT: 14
 CAPACITY Peak: 2 Mean: 2.00 Mode: 2 (100%)
STATS <X.96.147.1>  TX: 4103 (49%)  RESPONSES: 4096   TIMEDOUT: 7
 CAPACITY Peak: 2 Mean: 2.00 Mode: 2 (100%)

The code change definitely helped in this situation, but it may be
worthwhile to consider increasing capacity in reaction to any response,
including SERVFAIL, since it is a response after all.
You can also see with tune that we got a SERVFAIL from each server for
each address (except for one address, where we only got one SERVFAIL),
which seems to justify the modification.

More results:
base: mass_rdns: 775.50s 1400/4096 [#: 3, OK: 0, NX: 0, DR: 1400, TO:
8460, SF: 0, TR: 8460]
tune: mass_rdns: 2887.50s 1050/4096 [#: 3, OK: 0, NX: 0, DR: 1050, TO:
6300, SF: 0, TR: 6300]

I didn't allow these to complete.  None of the 3 DNS servers responded
to any request and, as you can see, base with its minimum capacity of 10
would have completed hours before tune with its minimum capacity of 2.
This is an extreme case, but it clearly demonstrates how the total time
for resolution may be affected by non-responsive DNS servers.

Up to this point, tuned nmap allows 50 capacity increases (of at most
0.1 each) in a given read timeout period, for a maximum real increase in
capacity of 5 per period, subject to a favourable ratio of requests to
timeouts.  I wanted to see the effect of allowing more increases in a
period.  A sketch of the capping logic and some typical results follow:
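The sketch (illustrative names and bookkeeping only, not lifted from
the patch):

      // Each response may add up to CAP_STEP to the capacity, but only
      // MAX_INCREASES such additions are allowed per read timeout period,
      // so capacity grows by at most MAX_INCREASES * CAP_STEP (= 5.0 here)
      // per period.
      struct dns_server { double capacity; int increases_this_period; };

      static const double CAP_STEP      = 0.1;
      static const int    MAX_INCREASES = 50;   // 75 and 100 tried below

      void on_response(dns_server &s) {
        if (s.increases_this_period < MAX_INCREASES) {
          s.capacity += CAP_STEP;
          s.increases_this_period++;
        }
      }

      void begin_period(dns_server &s) {
        s.increases_this_period = 0;   // reset at the start of each period
      }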

tune at 50:
35.55s. Mode: Async [#: 2, OK: 1090, NX: 2961, DR: 45, TO: 469, SF: 0,
TR: 4520, CN: 0]
STATS <X.78.255.20>  TX: 2205 (48%)  RESPONSES: 1986   TIMEDOUT: 219
 CAPACITY Peak: 39 Mean: 25.78 Mode: 36 (15%)
STATS <X.78.100.1>  TX: 2315 (51%)  RESPONSES: 2065   TIMEDOUT: 250
 CAPACITY Peak: 47 Mean: 30.91 Mode: 34 (11%)

tune at 75:
34.02s. Mode: Async [#: 2, OK: 1090, NX: 2944, DR: 62, TO: 464, SF: 0,
TR: 4498, CN: 0]
STATS <X.78.255.20>  TX: 2418 (53%)  RESPONSES: 2200   TIMEDOUT: 218
 CAPACITY Peak: 55 Mean: 36.48 Mode: 52 (16%)
STATS <X.78.100.1>  TX: 2080 (46%)  RESPONSES: 1834   TIMEDOUT: 246
 CAPACITY Peak: 59 Mean: 36.99 Mode: 38 (13%)
 
tune at 75:
36.16s. Mode: Async [#: 2, OK: 1090, NX: 2962, DR: 44, TO: 473, SF: 0,
TR: 4525, CN: 0]
STATS <X.78.255.20>  TX: 2234 (49.4%)  RESPONSES: 2005   TIMEDOUT: 229
 CAPACITY Peak: 57 Mean: 38.1 Mode: 50 (15%)
STATS <X.78.100.1>  TX: 2291 (50.6%)  RESPONSES: 2047   TIMEDOUT: 244
 CAPACITY Peak: 58 Mean: 38.3 Mode: 51 (14%)

tune at 100:
44.39s. Mode: Async [#: 2, OK: 1088, NX: 2976, DR: 32, TO: 788, SF: 0,
TR: 4852, CN: 0]
 STATS <X.78.255.20>  TX: 1773 (36%)  RESPONSES: 1452   TIMEDOUT: 321
  CAPACITY Peak: 48 Mean: 33.15 Mode: 44 (8%)
 STATS <X.78.100.1>  TX: 3079 (63%)  RESPONSES: 2612   TIMEDOUT: 467
  CAPACITY Peak: 77 Mean: 55.13 Mode: 72 (6%)

tune at 100:
41.13s. Mode: Async [#: 2, OK: 1080, NX: 2970, DR: 46, TO: 775, SF: 0,
TR: 4825, CN: 0]
STATS <X.78.255.20>  TX: 2171 (44%)  RESPONSES: 1828   TIMEDOUT: 343
 CAPACITY Peak: 60 Mean: 40.45 Mode: 53 (7%)
STATS <X.78.100.1>  TX: 2654 (55%)  RESPONSES: 2222   TIMEDOUT: 432
 CAPACITY Peak: 77 Mean: 51.83 Mode: 49 (7%)

There doesn't seem to be a great deal of difference in accuracy between
these results, but it's clear that timed-out requests increase as we
allow more capacity increases - roughly 60% more at 100 increases per
period than at 50.

To summarise my thoughts on this batch of testing, I'd say that these
modifications really help to keep the capacity low and prevent the great
number of timed-out requests seen with the current nmap when the actual
capacity is quite low.  If, however, the actual capacity is high, then
to take advantage of it we need to be resolving a huge number of
addresses so that there is time to reach that capacity.  In addition,
because of the lower CAPACITY_MIN, we're more susceptible to long delays
caused by unresponsive servers or those that simply ignore some
requests.

Notwithstanding David's ideas, which need testing, I've been
experimenting with an altogether different approach.  Briefly, we
attempt to measure congestion by recording the capacity at the time each
request is sent.  If we get a response to a request, we increment a
counter in a std::map where the keys are the capacities at send time and
the values are the frequencies of responses at that send-time capacity.
Similarly, when a request times out, a counter is incremented in a
std::map for timed-out requests.  Using the mean and standard deviation
of the capacity for both responses and timeouts, we estimate a minimum
and a maximum (respectively) capacity for that server and allow the
capacity to bounce around within these limits, which should steadily
converge.  That's the theory anyway.
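As a very rough sketch of the idea (the class and names below are
illustrative, not the code I'm actually testing, and using exactly one
standard deviation either side is just a starting point):

      #include <cmath>
      #include <map>

      // Bucket responses and timeouts by the capacity in force when each
      // request was sent, then derive a working range for the capacity
      // from the two distributions.
      struct capacity_estimator {
        std::map<int, int> responses;  // send-time capacity -> responses
        std::map<int, int> timeouts;   // send-time capacity -> timeouts

        void record_response(int send_cap) { responses[send_cap]++; }
        void record_timeout(int send_cap)  { timeouts[send_cap]++; }

        static void mean_stddev(const std::map<int, int> &m,
                                double &mean, double &sd) {
          double sum = 0.0, sumsq = 0.0, n = 0.0;
          for (std::map<int, int>::const_iterator it = m.begin();
               it != m.end(); ++it) {
            sum   += (double) it->first * it->second;
            sumsq += (double) it->first * it->first * it->second;
            n     += it->second;
          }
          mean = n > 0 ? sum / n : 0.0;
          double var = n > 0 ? sumsq / n - mean * mean : 0.0;
          sd = std::sqrt(var > 0.0 ? var : 0.0);
        }

        // Responses cluster around capacities the server copes with.
        double estimated_min() const {
          double m, sd; mean_stddev(responses, m, sd);
          return m - sd;
        }

        // Timeouts cluster around capacities that were too high.
        double estimated_max() const {
          double m, sd; mean_stddev(timeouts, m, sd);
          return m + sd;
        }
      };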

Thanks for the input folks, I'm sure we'll nail this soon.

jah

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

