
Full Disclosure mailing list archives

Re: Exploits in websites due to buggy input validation where mozilla is at fault as well as the website.
From: Seth Alan Woolley <seth () tautology org>
Date: Thu, 15 Jul 2004 16:01:24 -0700

Sorry for the gory SGML details to follow...

On Thu, Jul 15, 2004 at 09:13:12PM +0200, Pavel Kankovsky wrote:
On Wed, 14 Jul 2004, Seth Alan Woolley wrote:

If the topic of exploiting browsers to gain unauthorized access to
websites with buggy input validation is back in vogue, here's a strange
situation for you that _only_ works in mozilla-based browsers:


See http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.7
(and "SHORTTAG ON" in http://www.w3.org/TR/html401/sgml/sgmldecl.html)

<div><script src="indexvuln.js"</div>

should be interpreted as

<div><script src="indexvuln.js"></script></div>

W3 HTML validator interprets it this way (complaining about missing

SHORTTAG options are listed in of ISO 8879, and the </div> is
option b of that section, so the implied </script> is correct.  But
there's more to SHORTTAG than start tag completion (section
Mozilla doesn't even respect the <tag/CDATA/ minimization feature of
SGML's SHORTTAG.  Do you blame them?  See
http://www.cs.tut.fi/~jkorpela/qattr.html#shorttag for why they probably
didn't (the example doesn't work). '<' isn't allowed inside a tag
without an entity reference (&lt;), so it can't close the tag (not the
entity) with the '>' in '</div>' like thus: '<script
src="indexvuln.js"</div>'.  '<div><script>
src="indexvuln.js"</script></div>' would be a lot more sane.  That
matches the BNF listed in the book, but it doesn't satisfy SGML's
command to be a greedy matcher.

It's really just a simple request for security.  At least a user pref
for this?

Can they hide behind SHORTTAG when all the rest of SHORTTAG hasn't been
implemented because it would break most unquoted href attributes with
more than one slash?

I'll reference this:

(from one of the people working on the W3 validator)

It suggests that the HTML WG should clarify the 4.01 spec using an
errata to say that now that Annex K to ISO8879 exists and is better
known, the HTML DTD should use more specific options for SHORTTAG
allowing what's commonly used (attribute minimization such as quotes
left off or default values) and not allowing all the rest (which are not
only implemented in just one browser, and only partially, but are
also opening up possible security holes more easily due to an author not
knowing about really obscure and rarely implemented start tag
minimization as I have pointed out).

There are two questions:
1. Should Mozilla support this bizarre esoteric feature of HTML?
   (in fact, this is a bizarre esoteric feature of SGML)
2. Should Mozilla mangle the source when you view it?

I believe the answer is "no" in both cases.
Ad 1. That support should be completely eliminated or at least
      made configurable and disabled by default.
Ad 2. I really hate it. It's like MSIE turning \'s into /'s in URL.

Complete agreement.  They've already resolved to fix #2 and I wasn't
having a problem with that (in fact, it's kind of a cool feature because
I got to see what it really thought of it, although it confused me at

I also agree that they should disable this expansive interpretation of
the standard unless people really want it.  As far as I know, nobody
really wants it except the mozilla developers because they don't want to
have to change their crude heuristics.

Well, the W3C HTML WG wants it because they don't want to have to do any
work, but XML is not really catching on as fast as everybody thought.

If you read the comments on the reported bug, they seemed to fail to
understand the bug and how easy it would be to fix while maintaining
backwards compatibility.  Then they resolved it as a duplicate on me when it
wasn't the same bug as the other bug, essentially keeping it quiet.

Excuse me? As far as I can tell it is the same problem. The only
difference is the fact you demonstrated possible security consequences of 

Probably so.  I still think that's a substantially different point. 
Perhaps it's a pure duplicate, but it warrants reopening the bug.  Since
I didn't file it originally, I can't reopen it (I think).  Maybe I
should post a comment to this thread on f-d?

Lots of perl and php scripts exist out there that filter for the regular
expression '<.*>' matching only whole tags instead of '[<>]' which
matches either end of a tag.
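
To make the difference between the two expressions concrete, here is a
minimal Python sketch (the scripts in question are perl and php; Python is
used here only for illustration).  The function names, the payload string,
and the "<div>" + input + "</div>" template are assumptions made up for the
example, not taken from any particular script.

```python
import re

# The filter described above: only fires when a complete <...> tag is present.
WHOLE_TAG = re.compile(r"<.*>")
# The stricter filter: fires on either end of a tag.
ANY_BRACKET = re.compile(r"[<>]")


def whole_tag_filter_allows(user_input: str) -> bool:
    """Return True if the '<.*>' filter lets the input through."""
    return WHOLE_TAG.search(user_input) is None


def bracket_filter_allows(user_input: str) -> bool:
    """Return True if the '[<>]' filter lets the input through."""
    return ANY_BRACKET.search(user_input) is None


# Hypothetical user input: a start tag with no closing '>'.
payload = '<script src="indexvuln.js"'

# Hypothetical page template that wraps the input in a div.
page = "<div>" + payload + "</div>"

print(whole_tag_filter_allows(payload))  # no whole tag in the input itself
print(bracket_filter_allows(payload))    # the lone '<' is caught
print(page)
```

The input on its own never contains a whole tag, so the first filter passes
it; only once the site's own markup is appended does the unclosed start tag
sit next to a '>' that a SHORTTAG-aware parser can use.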

The mistake made by those scripts is obvious: they attempt to deny bad
things and allow everything else rather than allow known good things
(ie. well-formed documents in some harmless subset of (X)HTML) and deny
everything else.
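
As a rough illustration of the allow-instead-of-deny idea, the simplest
case is to allow no markup at all and escape everything the user supplies.
This is only a sketch under that assumption: render_comment and the div
template are hypothetical names for the example, and a real site that wants
to permit a harmless subset of (X)HTML would need a proper parser-based
allowlist rather than this.

```python
import html


def render_comment(user_input: str) -> str:
    """Hypothetical template: escape the input, then wrap it in a div.

    html.escape turns '<', '>', '&' and quotes into entity references,
    so nothing in the input can open or close a tag.
    """
    return "<div>" + html.escape(user_input) + "</div>"


# Same hypothetical unclosed start tag as an attacker might submit.
payload = '<script src="indexvuln.js"'

print(render_comment(payload))
```

With the '<' emitted as an entity reference, there is no start tag left for
the trailing '</div>' to complete.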

Agreed as well, which is why I went to the script authors first.  I then
went to the mozilla people and informed them of the problem that was in
my opinion quite dangerous.  Then I sat on it because of my busy life
until the subject came up on this list.

Perhaps I'll file bugs on all the unimplemented features of SHORTTAG YES
and reference my bug and say: precedent says you are supporting bugs in
the specification just because of the unchangeability of the HTML WG.


Seth Alan Woolley [seth at positivism.org], SPAM/UCE is unauthorized
Key id EF10E21A = 36AD 8A92 8499 8439 E6A8  3724 D437 AF5D EF10 E21A
Security Team Leader Source Mage GNU/Linux http://www.sourcemage.org

