Full Disclosure
mailing list archives
Re: Squashing supposed hacker profiling
From: "Steven Adair" <steven () securityzone org>
Date: Tue, 19 Jun 2007 10:30:30 -0400 (EDT)
Amazing, you were able to find multiple instances where a script-based
gender guesser was wrong? This is more profound than the initial research
itself. I suppose I could post a series of 10 writings where it was
correct, but what would that prove? Did you try reading this from the
same page:
-----
A few quick notes:
* The system generates a simple estimate (profiling). While Gender
Guesser may be 60% - 70% accurate, it is not 100% accurate. This is
better than random guessing (50%), but should not be interpreted as
"fact". In particular, men should not be offended if it says you write
like a girl.
* People write differently in different forums. For example, a single
writing sample may appear MALE for informal writing but test as FEMALE
for formal writing. Be sure to interpret the results based on the
appropriate writing style. (These notes, for example, are more
informal/blog than formal/non-fiction.)
* Many factors can impact the interpretation from any single person's
writing. The content, knowledge of the material, age of the author,
nationality, experience, occupation, and education level can all
impact writing styles. For example, a woman who has spent 20 years
working in a male-dominated field may write like her co-workers.
Similarly, professional female writers (and experienced hobbyists)
frequently use male writing styles. Gender Guesser does not take any
of these factors into account.
* Email can blur the lines between formal and informal writing styles.
An informal email from a manager may have traces of formality, and a
formal email from a 12-year-old is likely to be informal compared to a
letter from a 40-year-old. Do not be surprised if email messages sent
to public forums test incorrectly -- when writing for an audience,
people commonly use informal words, phrases, and slang within a formal
writing style.
* Quotations, block quotes, and included text usually carry the
gender from the initial author. Be sure to remove quoted text from any
pasted content. Also, significant changes from a copy-editor can
result in a different gender analysis. (A male editor may make a
female author's news article appear MALE or as a Weak MALE.)
* Lyrics, lists, poems, and prose are special writing styles. This
tool is unlikely to classify these texts correctly.
* The system needs a paragraph or two of text in order to observe word
repetition. A good sample should have 300 words or more. Fewer words
can lead to more variation in accuracy, and a single sentence is
unlikely to generate an accurate result. Pasting the same text
multiple times will not change the results!
* People tend to write with consistent styles. If the system
misclassifies a particular author, then other writings by the same
author will likely be misclassified the same way.
* And most importantly: This is an ESTIMATE. Please do not email me
about instances where it made the wrong determination. (I've seen it
generate incorrect results lots of times already.)
----
I can't tell if you're trolling or you have actually taken the bait. You
do realize the person that you were responding to in earlier posts is not
actually Neal Krawetz, right?
All female authors... Your so-called gender guessing mechanism is
flawed either way you want to cut it. You could try fuzzy math based on
theories to profile anyone on this list, but unless you have something
feasible and PROVEN beyond reasonable doubt, it's all a guessing game,
bottom line. Anyhow, back to security; sociolinguistics is not meant for
this list.
According to Dr. Krawetz's Gender Guesser...
(http://www.hackerfactor.com/GenderGuesser.html#Analyze)
http://girlygeekdom.blogspot.com/
Genre: Informal
Female = 104
Male = 602
Difference = 498; 85.26%
Verdict: MALE
Genre: Formal
Female = 116
Male = 239
Difference = 123; 67.32%
Verdict: MALE
REALITY: WRONG
http://www.darkreading.com/blog.asp?blog_sectionid=342&WT.svl=blogger1_5
Genre: Informal
Female = 442
Male = 555
Difference = 113; 55.66%
Verdict: Weak MALE
Genre: Formal
Female = 364
Male = 570
Difference = 206; 61.02%
Verdict: MALE
REALITY: WRONG
http://invisiblethings.org/papers/joanna-talk_description-CCC04.txt
Genre: Informal
Female = 218
Male = 1186
Difference = 968; 84.47%
Verdict: MALE
Genre: Formal
Female = 414
Male = 576
Difference = 162; 58.18%
Verdict: Weak MALE
REALITY: WRONG
http://www.techsploitation.com/2007/05/31/what-the-hell-was-i-thinking-about-green-libertarians/
(text by Sue Lange)
Genre: Informal
Female = 210
Male = 481
Difference = 271; 69.6%
Verdict: MALE
Genre: Formal
Female = 260
Male = 408
Difference = 148; 61.07%
Verdict: MALE
REALITY: WRONG
http://thelizardqueen.wordpress.com/2005/06/08/a-thoroughly-and-utterly-girly-blog-post-sorry-4/
Genre: Informal
Female = 415
Male = 559
Difference = 144; 57.39%
Verdict: Weak MALE
Genre: Formal
Female = 180
Male = 312
Difference = 132; 63.41%
Verdict: MALE
REALITY: WRONG
To be fair I had to go to the most feminine place I could think of, even
then it was iffy.
http://groups.ivillage.com/motherdaughter/
Genre: Informal
Female = 226
Male = 337
Difference = 111; 59.85%
Verdict: Weak MALE
Genre: Formal
Female = 326
Male = 314
Difference = -12; 49.06%
Verdict: Weak FEMALE
REALITY: MAYBE THE AUTHOR HERE WAS FLAMINGLY GAY
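The Difference and percentage lines in the results above can be reproduced
from the two scores alone. A minimal Python sketch, assuming the percentage is
the male share of the combined score truncated to two decimals, and assuming a
60% cut-off for the "Weak" label. Both assumptions are inferred from the
numbers posted in this thread, not taken from Gender Guesser's own code, and
the names summarize and weak_below are hypothetical.

```python
import math


def summarize(female: int, male: int, weak_below: float = 60.0):
    """Return (difference, male_pct, verdict) for one Genre block.

    Assumptions (inferred from the posted results, not from the tool itself):
    - difference is male minus female
    - the percentage is male / (female + male), truncated to two decimals
    - a verdict is "Weak" when the leading side holds under weak_below percent
    """
    total = female + male
    if total == 0:
        raise ValueError("no scored words; nothing to summarize")

    difference = male - female
    male_pct = 100.0 * male / total
    # Truncate rather than round, to match figures such as 85.26%.
    male_pct_shown = math.floor(male_pct * 100) / 100

    # A tie falls through to FEMALE here; the thread has no tie to check against.
    leading = "MALE" if male > female else "FEMALE"
    leading_pct = male_pct if male > female else 100.0 - male_pct
    verdict = ("Weak " if leading_pct < weak_below else "") + leading
    return difference, male_pct_shown, verdict


if __name__ == "__main__":
    # Scores copied from the first and last samples in the post.
    print(summarize(104, 602))
    print(summarize(326, 314))
```

Run against the posted scores, this matches the Difference, percentage, and
Verdict lines for the samples checked, which suggests the verdict depends only
on the ratio of the two scores.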
--
====================================================
J. Oquendo
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1383A743
echo infiltrated.net|sed 's/^/sil@/g'
"Wise men talk because they have something to say;
fools, because they have to say something." -- Plato
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/