Security Incidents mailing list archives

Code Red, Virus Growth, and some misunderstandings

From: Thomas Roessler <roessler () does-not-exist org>
Date: Tue, 7 Aug 2001 20:21:08 +0200

[Note: Nothing in this message is new, but it still seems that there are various common misunderstandings about this, so an explanation may be in order.]



I have seen various fascinating predictions and theories on the growth of the Code Red worm(s), including claims that the infected base of computers could grow by a factor of two when the worm was close to saturation.

Another claim (made by an anonymous Network Associates employee, as quoted by heise online) was that the August outbreak infected more computers than the July incidents. [1]

Finally, on incidents.org, Pastor-Satorras and Vespignani's paper on "Epidemic Spreading in Scale-Free Networks" has been cited as explaining "that [the] scale-free property of the Internet allows even slowly spreading worms to proliferate quickly". [2]

I believe that all three statements are misunderstandings, and misrepresent the actual behaviour of the worms we are currently seeing.

Let's start with the incidents.org claim that worm spread on the Internet can be described by the model Pastor-Satorras and Vespignani describe. This model is based on a number of rather simple assumptions which are indeed plausible for traditional computer viruses (and possibly even some worms such as the 1988 Internet worm), but which do not apply to Code Red v2, which was responsible for the July and August outbreaks (as far as we know).


The assumptions are these (quoted almost verbatim from the paper):

1. The probability that a node of the network has  k  connections
   follows a scale-free distribution   P(k) ~ k^(-gamma) .

2. At each time step, each susceptible (healthy) node is infected
   with a rate nu if it is connected to one or more infected nodes.

3. At the same time, infected nodes are cured and become again
   susceptible with rate delta .
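To make assumptions 2 and 3 concrete, here is a minimal Python sketch of a single time step of such a model. It is an illustration only, not code from the paper: the function name sis_step, the toy star-shaped graph, and the reading of the "rates" nu and delta as per-step probabilities are all assumptions of mine. The scale-free degree distribution of assumption 1 is not generated here; any adjacency mapping can be passed in.

```python
import random

def sis_step(neighbors, infected, nu, delta, rng):
    """One synchronous time step of the susceptible/infected model.

    neighbors: dict mapping each node to a list of adjacent nodes
    infected:  set of currently infected nodes
    nu:        per-step infection probability (assumed reading of "rate nu")
               for a susceptible node with at least one infected neighbour
    delta:     per-step cure probability (assumed reading of "rate delta")
    """
    next_infected = set()
    for node, adjacent in neighbors.items():
        if node in infected:
            # Infected nodes are cured with probability delta.
            if rng.random() >= delta:
                next_infected.add(node)
        elif any(n in infected for n in adjacent):
            # Susceptible nodes next to an infected node catch it
            # with probability nu.
            if rng.random() < nu:
                next_infected.add(node)
    return next_infected

# Hypothetical toy network: hub 0 connected to leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
rng = random.Random(0)
after_one = sis_step(star, {1}, nu=1.0, delta=0.0, rng=rng)
after_two = sis_step(star, after_one, nu=1.0, delta=0.0, rng=rng)
print(sorted(after_one), sorted(after_two))
```

With nu = 1 and delta = 0 the infection has to pass through the hub before it can reach the other leaves, which is the kind of topology dependence this model captures.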

While the Internet may indeed look as described under 1., and while it may even be perceived like this by computer viruses (which is a very interesting finding!), it looks entirely different from Code Red's point of view.

Indeed, the "infection network" as seen by Code Red is extremely simple: Every susceptible node is connected to every other susceptible node. Every instance of IIS may infect every other instance of IIS.

This model for the "infection network" leads to a rather simple mathematical model which was introduced to the Code Red discussions by Stuart Staniford [3], and which is known as the logistic differential equation. (It was initially introduced in the 19th century as a model for population growth, and has been applied to rats' body weight and the growth of sunflowers, among other things. Just look it up in about any textbook on ordinary differential equations.)

The model roughly goes like this: Let the number of infected computers be denoted by I, and let the total number of susceptible computers be denoted by N. Assume that, in the initial state, there is a single infected computer, and let K be the number of computers this single infected machine can infect in a time unit. Then, at some point in time, I infected computers can infect K * I * (N-I)/N computers per time unit which weren't infected before. Thus,


        dI/dt = K * I * (N-I)/N

Rescaling, we can set a = I/N, and arrive at this:

        da/dt = K * a * (1-a)
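
For readers who want to play with the curve: the rescaled equation has the well-known closed-form solution a(t) = a0 * e^(K*t) / (1 - a0 + a0 * e^(K*t)), where a0 = a(0). The following Python sketch evaluates that formula and cross-checks it against a crude forward-Euler integration of the differential equation. The function names and the sample values for K, a0 and t are my own choices for illustration.

```python
import math

def logistic_fraction(t, K, a0):
    """Closed-form solution of da/dt = K*a*(1-a) with a(0) = a0."""
    growth = a0 * math.exp(K * t)
    return growth / (1.0 - a0 + growth)

def euler_fraction(t, K, a0, steps=100000):
    """Integrate da/dt = K*a*(1-a) with forward Euler steps (cross-check)."""
    a = a0
    dt = t / steps
    for _ in range(steps):
        a += dt * K * a * (1.0 - a)
    return a

# Illustrative values only: a0 = 1e-5 corresponds to one infected
# machine among a hypothetical 100,000 susceptible ones.
exact = logistic_fraction(7.2, K=1.6, a0=1e-5)
approx = euler_fraction(7.2, K=1.6, a0=1e-5)
print(exact, approx)
```

The curve grows exponentially (like e^(K*t)) while a is small and flattens out as a approaches 1.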

Now, the important thing to note is that it's entirely sufficient to look at a rather small network's hits by Code Red v2 in order to determine K (which led Stuart to his results, K = 1.6 for July and K = 0.7 for August). You do not need to know N in order to determine K!
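
One way to see why N drops out, sketched in Python: while a is still much smaller than 1, the hits seen by any fixed network behave like c * e^(K*t), where the constant c absorbs N and the size of the observed network. K is then simply the slope of log(hits) against time. This is not necessarily how Stuart did his fit; estimate_k and the synthetic hit counts below are assumptions made purely for illustration.

```python
import math

def estimate_k(times, hits):
    """Least-squares slope of log(hits) against time.

    Only meaningful in the early phase (a << 1), where hits ~ c * exp(K*t).
    The unknown constant c shifts the intercept but not the slope.
    All hit counts must be positive.
    """
    logs = [math.log(h) for h in hits]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var

# Synthetic data: the same worm seen by a small and a large network.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
small_net = [3.0 * math.exp(1.6 * t) for t in times]
large_net = [3000.0 * math.exp(1.6 * t) for t in times]
k_small = estimate_k(times, small_net)
k_large = estimate_k(times, large_net)
print(k_small, k_large)
```

Both networks yield the same slope even though one of them sees a thousand times as many hits.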

However, K can safely be assumed to be proportional to the portion of IIS servers on the net which are susceptible to the worm, which directly leads to the conclusion that the number of susceptible machines was 2.3 times as high in July as it was in August.
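
The arithmetic behind the factor, as a short Python sketch. Writing K = r * p is my own way of spelling out the proportionality: r stands for a hypothetical per-machine probe rate, assumed identical in both outbreaks, and p for the susceptible portion of the probed addresses.

```python
# Growth constants quoted in this message.
K_JULY = 1.6
K_AUGUST = 0.7

# Assumption: K = r * p with the same probe rate r in both outbreaks,
# so r cancels in the ratio and only the susceptible portions remain.
susceptible_ratio = K_JULY / K_AUGUST
print(round(susceptible_ratio, 1))  # prints 2.3
```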

Of course, the question remains how large a grew in July: was it actually close to 1, or would the curve have kept growing for days? The model teaches us that we had reached saturation in July. It also teaches us that we reached saturation again last weekend.
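
To get a feeling for how quickly the logistic model saturates, here is a small Python sketch. It uses the identity logit(a(t)) = logit(a0) + K*t, which follows from the logistic equation. The starting point of one infected machine among 100,000 susceptible ones and the 99% threshold are hypothetical values chosen for illustration, not figures from any measurement.

```python
import math

def logit(a):
    """Log-odds of a fraction a, with 0 < a < 1."""
    return math.log(a / (1.0 - a))

def time_to_fraction(K, a0, target):
    """Time for da/dt = K*a*(1-a) to grow from a0 to target.

    Uses logit(a(t)) = logit(a0) + K*t.
    """
    return (logit(target) - logit(a0)) / K

N_HYPOTHETICAL = 100_000  # assumption, not a figure from this message
a0 = 1.0 / N_HYPOTHETICAL
t_july = time_to_fraction(1.6, a0, 0.99)
t_august = time_to_fraction(0.7, a0, 0.99)
print(t_july, t_august)
```

Because N only enters through a logarithm, even a much larger N changes these times only modestly; the result is expressed in whatever time unit K was measured in.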

Thus, assuming that the worms in July and early August actually were the same, we can derive that the number of infected machines was 2.3 times as high in July as it was in August.

The "contradiction" that the number of infected computers which have been identified in August is considerably higher than in July is readily explained by the fact that the worm is still in its infection phase, while it stopped spreading in July. That is, we are now being hit by a larger portion of infected computers than we were in July, and are able to derive more precise July figures, based on just the timing behaviour of the August infection.

Finally, please note that the August outbreak reached saturation last weekend. Thus, even the more efficient Code Red II which tortures servers right now will NOT infect any computers which weren't infected by one of the other worms before. However, I'd expect Code Red II probes to once again follow a logistic curve.


URLs:

[1] http://www.heise.de/newsticker/data/lab-07.08.01-001/
[2] http://www.incidents.org/diary/diary.php#605
[3] http://www.silicondefense.com/cr/

--
Thomas Roessler                        http://log.does-not-exist.org/

