Dailydave mailing list archives
Re: Intersecting graphs...
From: Thomas Fischbacher <Thomas.Fischbacher () Physik Uni-Muenchen DE>
Date: Wed, 22 Sep 2004 22:28:26 +0200 (CEST)
On Wed, 22 Sep 2004, Halvar Flake wrote:
Hey all, it might sound odd that I am asking a graph-theory question here, but with DD being a list full of smart people with odd hobbies I thought I might find an answer here: Does anyone know of a reasonably fast algorithm which calculates the intersection of two acyclic digraphs with a common "root"-node ?
Like, say, a tree containing filesystem stat information?
(I'm far from being an expert in graph theory, but let me just give a few
thoughts how I would approach such a problem.. Always assuming that the
data in both trees are "mostiy unique", i.e. a given entry does not
re-occur too often within a tree.)
I haven't had a look at the tripwire source, but one way how to do
something like this is to recursively walk through the first tree, bottom
up, starting with the individual leafs, and mincing all the data for
every single leaf tree through a cryptographic hash function. When you
climb up, you take the data within each node, plus the cryptographically
hashed fingerprints from the subtrees, and generate a new cryptographical
hash for this. This way, you will get one fingerprint for every node, in
something like O(nodes) time. All these fingerprints you store in a hash.
Next, you traverse the second subtree, again, bottom up, forming
cryptographic hash values. When you encounter one which you've seen in the
other tree, you might want to assume with reasonable confidence that the
corresponding subtrees are identical. (Otherwise, check. One can employ
clever tricks here as well - memoizing comparisons would be a strategy.)
Two issues:
* Once you identified two subtrees as identical, you still have to find
their relative places w.r.t. the common intersection. This is facilitated
by remembering for every subtree the node of which it is a child. This
way, such identifications provide you with a set of possibilities for
possible parents to be identified. There will be two kinds: such where
comparison of parent nodes immediately gives you a failure, and such where
this is not the case. Thus, you can reduce a few cases immediately, while
backtracking should be used for the others. call-with-current-continuation
would be of great value here.
* If subtrees are not identical, well, it may still be that the subtree
sets of the corresponding node have a nontrivial intersection. I did not
put too much thought into this yet, but to me it seems as if the proper
strategy would be to re-calculate the hash values of the path up to the
root node. But as we may have more options where to identify a subtree of
graph #2 in graph #1, it seems as if we again would have to do some
backtracking. Then, we perhaps should not use a hash mapping cryptographic
hash keys to trees (and their parents), but instead binary trees, and work
with binary tree insert/delete functions that do not destructively modify
trees, but map given trees to new ones with the appropriate modifications
in a functional fashion - this should greatly simplify the backtracking.
There are many more details that depend on the actual problem, like:
* When to consider subtrees as equal ("same object" vs. "equal").
* Are there data in the nodes, or only in the leaves?
* Is the number of childs a node may have fixed?
* Does the order of the childs of a node matter?
* Is it advantageous to be able to recognize transplantations of subtrees?
* What is the problem structure of a typical application?
So, if you could be a little bit more specific, I may be able to give a
more useful answer.
--
regards, tf () cip physik uni-muenchen de (o_
Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU)
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://www.immunitysec.com/mailman/listinfo/dailydave
Current thread:
- Re: Intersecting graphs... Thomas Fischbacher (Oct 08)
