IP: Frequency of top 1,000 USENET words
From: Dave Farber <farber () cis upenn edu>
Date: Sun, 27 Dec 1998 03:23:21 -0500
From: Mike Radow <mradow () inx inx net>
It is hoped that this will be useful to others...
In building "word-to-token" compressed files of technical text, we've had
good experience with this file.
We've used it for several years, and its word frequencies are a good fit
for the word distribution of our own text.
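
For readers unfamiliar with the idea, here is a minimal sketch of "word-to-token" substitution. It is an illustration, not the scheme described above: the function names and the four-word sample list are hypothetical, and a real encoder would pack the tokens into bytes instead of keeping a Python list.

```python
def build_table(ranked_words):
    """Map each word to its rank, so more frequent words get smaller token ids."""
    return {word: rank for rank, word in enumerate(ranked_words)}


def encode(text, table):
    """Replace each known word with its integer token; keep unknown words as strings."""
    return [table.get(word, word) for word in text.split()]


def decode(tokens, ranked_words):
    """Invert encode(): integers are looked up by rank, strings pass through."""
    return " ".join(
        ranked_words[tok] if isinstance(tok, int) else tok for tok in tokens
    )


# Hypothetical sample; a real table would be loaded from the frequency list.
ranked_words = ["the", "to", "of", "a"]
table = build_table(ranked_words)

tokens = encode("the size of a packet", table)
restored = decode(tokens, ranked_words)
```

Words missing from the table are passed through as literals, so the closer the frequency list matches the text being compressed, the more words shrink to tokens.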
Unlike other "general text" frequencies, this list was generated from
My sincere thanks to Lee Maixner, for locating this URL...:
Date: Tue, 19 Jan 1993 20:43:44 GMT
Subject: Re: Top 1000 English words ...
Top 1000 English words
Culled from one year of USENET traffic, here is my list of the top 1000
words, along with percentage of occurrence: (this is from a database of
343945617 total scanned words).
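
As a rough sketch of how such a table could be produced, assuming "percentage of occurrence" means a word's count divided by the total number of scanned words: the function name and the sample sentence below are hypothetical, and the original post does not say how its figures were computed.

```python
from collections import Counter


def word_percentages(words, top_n=1000):
    """Return (word, percentage) pairs for the top_n most frequent words.

    Each percentage is relative to the total number of words scanned,
    not just to the words that make the top_n cut.
    """
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, 100.0 * count / total) for word, count in counts.most_common(top_n)]


# Hypothetical eight-word corpus standing in for a year of traffic.
sample = "the cat saw the dog and the bird".split()
top = word_percentages(sample, top_n=3)
```

In this sample, "the" accounts for 3 of the 8 words, so it ranks first.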
Mike Radow <---> mradow () inx net