I think the Aldridge statement that Mordechai is talking about is the one
that Undersecretary of Defense for Acquisition, Technology, and Logistics
Pete Aldridge made last month at a Pentagon briefing:
http://www.politechbot.com/p-04186.html
News article:
http://www.cnn.com/2002/US/11/20/terror.tracking/index.html
---
From: "Mordechai" <quality_at_computer.org>
To: Declan McCullagh <declan_at_well.com>
Date: Thu, 12 Dec 2002 11:53:28 +0200
Subject: TIA feasability and costs
Reply-to: quality_at_computer.org
Declan,
My name is Mordechai Ben-Menachem. I am a lecturer at Ben-Gurion
University, Beer-Sheva, Israel. My areas of speciality are software
engineering and project management. Bob Bauman asked me to write to you
to express certain views concerning the DARPA project called TIA.
I have read the Aldridge testimony. Most of the following was written in
reaction to that.
Much of what Aldridge says walks a very narrow line between outright lies
and obfuscation. It is simply not correct. The areas for objection are
too broad to cover here, but I shall try to give a few examples.
1. You cannot talk about "... if they choose to use it." The system ONLY
has value if there is a critical mass of data in it. This means, by
definition, that the database must be massively populated and this must
be constantly maintained. This is not a situation where one can query
and THEN the system will go off to a thousand different databases around
the world to search for transactions you may want. There is a fine line
here between data collection and data retrieval. The "if they choose"
part can relate to data retrieval, but that makes it a very sticky
wicket. Existing legal controls (e.g., search warrants, Miranda) are
designed to control data collection, not use of that data once it has
been collected.
2. Speech recognition / rapid translation:
The statements are very misleading. No such software exists today. The
state of the art in voice recognition / voice response systems is that
of a watch (you can also tell your phone to dial your wife, but only
after rigorous training of the system). The accuracy of the translation
systems in use today is mostly the stuff of Computer Science jokes. The
distance to workable systems is quite profound. Intel has recently
announced a 3 GHz chip. This implies (via Moore's Law) that we shall see
a 6 GHz chip in 18 months. Many authorities have called 6 GHz a
milestone that will allow a new set of applications. In other words,
when those capabilities exist, we may be able to intelligently discuss
rapid, real-time translation. However, by definition, we do not know how
to conceive of those applications now. Perhaps it can be done on a
supercomputer, as cost is not the governing factor -- no, the basic
computational complexity may be solvable on a supercomputer (no proof of
that exists) but there are many other aspects that require a different
type of architecture for real-time usage. He also stated that there will
be voice recognition capabilities to recognise who is speaking. This is
total science fiction; it has never been tried in real life. What exists
is the ability to match "voice prints" via pattern recognition
techniques. This is very time consuming, with a very low level of
accuracy and reliability. I do not recall it being recognized by any
court, for example.
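The clock-speed extrapolation in point 2 can be written out as a small
sketch. It assumes, as the letter does, that clock speed doubles every
18 months; Moore's Law strictly concerns transistor counts, so this
reading is an assumption, and the function name is hypothetical.

```python
# Sketch of the extrapolation in point 2. The 18-month doubling of clock
# speed is the letter's reading of Moore's Law, treated here as an assumption.

def projected_clock_ghz(current_ghz: float, months_ahead: float,
                        doubling_months: float = 18.0) -> float:
    """Project clock speed forward under a fixed doubling period."""
    return current_ghz * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    # Starting point from the letter: a 3 GHz chip announced in 2002.
    for months in (18, 36):
        ghz = projected_clock_ghz(3.0, months)
        print(f"{months} months ahead: about {ghz:g} GHz")
```

Changing doubling_months shows how sensitive the projected milestone
date is to the assumed doubling period.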
3. Connections between transactions:
Echelon gathers data from some 8 billion telephone conversations today.
How successful has this been in the "war on drugs"? The answer is,
almost not at all. Add to that all airline transactions, chemical
purchases, credit card ... How many daily transactions are we talking
about -- 20 billion, more? (Visa alone has some 110 million transactions
per day.) There is no way to even imagine how to query a database of
this size, much less make any sense of the answer. In other words, if
they manage to simulate the data (we do not know how to simulate that),
and if they manage to perform a query, what do we do with the results of
such a query? The data visualization techniques do not exist. The
quantity of false positives will overload any investigative agency (tens
of thousands per day). As a matter of fact, the database technology that
would allow this type of query does not exist, either. I must add that
on small scales, tens of thousands of transactions, this is being
performed. The distance to being able to process five orders of
magnitude more is perhaps a decade.
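The false-positive arithmetic in point 3 can be sketched roughly as
follows. The 20 billion daily transactions is the letter's own guess; the
per-transaction false-positive rates are purely hypothetical
assumptions chosen for illustration, not figures from TIA or any real
system.

```python
# Back-of-the-envelope estimate of daily false positives.
# The error rates below are hypothetical assumptions, not measured values.

def daily_false_positives(transactions_per_day: int,
                          false_positive_rate: float) -> float:
    """Expected number of innocent transactions flagged per day."""
    return transactions_per_day * false_positive_rate


TRANSACTIONS_PER_DAY = 20_000_000_000   # the letter's rough guess of 20 billion
ASSUMED_RATES = [1e-4, 1e-5, 1e-6]      # hypothetical per-transaction error rates

if __name__ == "__main__":
    for rate in ASSUMED_RATES:
        flagged = daily_false_positives(TRANSACTIONS_PER_DAY, rate)
        print(f"false-positive rate {rate:.0e}: "
              f"about {flagged:,.0f} flagged per day")
```

Under these assumptions, even an error rate of one in a million leaves
on the order of tens of thousands of flagged transactions per day, which
is the scale the letter describes.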
4. Collaborative reasoning:
This part is probably practical, though the development is still quite a
way off. I have done a little bit of work in this area. (I have an
article submitted to a major journal that I can send you, but it has not
yet been published.) The major issue here is reliability. We are talking
about using massive webs of hierarchical data (that is, the data has
both hierarchical attributes and network attributes). With this level of
complexity, testing such a system is very far beyond our capabilities --
we simply have no idea how to ensure that the answers we are given are
correct because we do not know how to test it. This is not the only
difficulty. The definition of interrelationships is an open issue --
they are not static.
As I said, space and time do not permit me to do a full analysis, and I
have not read the full specification. The bottom line is composed of two
points. The report by Pete Aldridge cannot simply be taken at face
value. The system / project, as presently defined, reminds me greatly of
Reagan's SDI project: brilliantly thought of, but much too early. Some of
the fruits of that effort are just now coming on line, 20 years later
(e.g., the Arrow anti-ballistic missile and the Nautilus anti-tactical
rocket laser gun). When SDI was conceived, it was not technologically
possible. TIA is not technologically possible today. In 20 years, who
knows, this may be reasonable. Today, the base technologies do not
exist. The complexity is too great, the size is impossible to conceive.
I don't care how passionate Poindexter is. It sounds wrong.
Additionally, I spoke with a colleague of mine whose expertise is in the
area of face recognition and other "bio" technologies. My objective was
to double-check that my initial guess-timates were reasonable. He
confirmed them and even thought me rather optimistic on some things. For
instance, on "rapid translation" based on speech recognition, I said I
thought it a few years off. He says it is AT LEAST 7-10 years off. The
capabilities we see today are very primitive.
In any case, we are talking about a 10-20 year timeframe to demonstrate
capabilities -- similar to SDI. You are talking about spending billions
of dollars for a project to develop a system that has no hope of being
useful in a significant time-frame -- the size of the project is much
larger than what has been reported, and the base technologies do not
exist.
Best regards. I hope this is helpful, and I shall be most pleased to
explain further if you like.
Mordechai Ben-Menachem
Dept. of Industrial Engineering & Management
Ben-Gurion University
P. O. Box 5613; Beer-Sheva; 84156; Israel
Tel. 972-86-433231, mob. 972-57-433231, off. 972-86-479374
quality_at_computer.org
-------------------------------------------------------------------------
POLITECH -- Declan McCullagh's politics and technology mailing list
You may redistribute this message freely if you include this notice.
To subscribe to Politech: http://www.politechbot.com/info/subscribe.html
This message is archived at http://www.politechbot.com/
Declan McCullagh's photographs are at http://www.mccullagh.org/
-------------------------------------------------------------------------
Received on Dec 13 2002