Susan,
I am the lead OWASP Guide author so I hope I can answer your query.
The gist of that paragraph is that there are many ways to
encode text in web apps, and if you're going to make decisions about
that text, accept it for persistent storage, or re-display it,
it's vital that you make it "canonical" (reduce it to its simplest
form) before you act on it.
For example, if you get:
select%20*%20from%20...
from the user and you write code to tokenize input based upon spaces,
that code will not see any spaces, because each one arrives as %20.
So you must decode the string properly (so it becomes "select *
from ...") and only then can you process it "safely".
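To illustrate, here is a minimal sketch in C#. The class and method names are made up for this example, and Uri.UnescapeDataString is only one possible decoder for percent-encoding; a real application would need to handle whatever encodings its framework and transport can actually produce.

```csharp
using System;

// Hypothetical helper for illustration only; not taken from the OWASP Guide.
public static class CanonicalDemo
{
    // Split input into tokens on spaces, as a naive filter might.
    public static string[] Tokenize(string input)
    {
        return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Decode percent-encoding once so later checks see real spaces.
    public static string Canonicalize(string input)
    {
        return Uri.UnescapeDataString(input);
    }

    public static void Main()
    {
        string raw = "select%20*%20from%20...";

        // The raw string contains no literal spaces, so it is one token.
        Console.WriteLine(Tokenize(raw).Length);               // prints 1

        // After decoding, the tokenizer sees "select", "*", "from", "...".
        Console.WriteLine(Tokenize(Canonicalize(raw)).Length); // prints 4
    }
}
```

Any check written against the tokens only means something when it runs on the decoded form.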
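Be aware of double and n-deep encodings (for example, %2520 decodes to
%20, which decodes again to a space) - they can occur, and
there are certainly encodings you've never seen or considered.
That's why I strongly advocate positive validation: accept only input
that matches a known-good pattern, and reject everything else.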
For example (in C# and .NET, but applicable to most languages):
// Needs: using System.Collections; using System.Text.RegularExpressions;
Hashtable clean = new Hashtable();
// Ensure that if the statement fails for any reason,
// the collection has a safe value for our field
clean.Add("field", "");
// Request.Form returns null when the field was not submitted
string field = Request.Form["field"];
// is the data a single word no more than 20 characters long, using
// only a-z and 0 to 9?
if ( field != null && Regex.IsMatch(field, "^[a-z0-9]{1,20}$",
        RegexOptions.IgnoreCase) )
{
    // it's safe to take the value of the string as there's most likely
    // no nasties
    clean["field"] = field;
    // now ensure that the business rules make sense
    processFieldBusinessRules((string) clean["field"]);
}
else
{
    throw ...
}
// Now it's moderately safe to use or store the data in clean[]
...
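On the double and n-deep encoding point above, one possible approach is to decode repeatedly until the string stops changing, and to give up after a fixed number of passes. This is only a sketch: DeepDecodeDemo, DecodeFully, and the MaxRounds limit are names and values invented for this example, and it handles percent-encoding only.

```csharp
using System;

// Hypothetical helper for illustration only; not taken from the OWASP Guide.
public static class DeepDecodeDemo
{
    // Assumed limit for this sketch; choose a value that suits your application.
    private const int MaxRounds = 5;

    // Decode percent-encoding repeatedly until the string stops changing.
    // Throws if every one of the MaxRounds passes still changed the string.
    public static string DecodeFully(string input)
    {
        string current = input;
        for (int i = 0; i < MaxRounds; i++)
        {
            string decoded = Uri.UnescapeDataString(current);
            if (decoded == current)
            {
                return current; // stable: nothing left to decode
            }
            current = decoded;
        }
        throw new FormatException("Input is encoded too deeply; rejecting it.");
    }

    public static void Main()
    {
        // "%2520" is a doubly encoded space.
        Console.WriteLine(DecodeFully("select%2520*")); // prints "select *"
    }
}
```

The result of DecodeFully would then go through the same positive validation as any other input. Decoding to a fixed point can also alter legitimate data that contains a literal "%20", so some applications may prefer to decode once and reject anything that still looks encoded.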
thanks,
Andrew
On 11/04/2006, at 11:12 PM, susam_pal_at_yahoo.co.in wrote:
I found the following paragraph in owasp.org. Can someone please
elaborate on this?
Parameters must be converted to the simplest form before they are
validated, otherwise, malicious input can be masked and it can slip
past filters. The process of simplifying these encodings is called
“canonicalization.”
Received on Apr 12 2006