It is very difficult to validate rich content submitted by a user.
For more information, please see the cheatsheet on Sanitizing HTML Markup with a Library Designed for the Job.
Input validation should be applied at both the syntactic and the semantic level.
If the data is well structured, like dates, social security numbers, zip codes, e-mail addresses, etc., then the developer should be able to define a very strong validation pattern, usually based on regular expressions, for validating such input.
This does not mean that other users cannot access this mailbox, for example when the user makes use of a service that generates a throwaway email address.
As the local part of an email address is, in fact, case sensitive, it is important to store and compare email addresses correctly.
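One possible way to store and compare addresses is sketched below. It assumes that only the domain part may safely be lowercased (domain names are case insensitive), while the local part is kept exactly as the user entered it. The class and method names (`EmailAddresses`, `canonicalize`, `sameAddress`) are hypothetical and chosen for illustration only.

```java
import java.util.Locale;

final class EmailAddresses {
    private EmailAddresses() {}

    // Lowercases only the domain part; the local part is left untouched
    // because it may be case sensitive.
    static String canonicalize(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("expected local-part@domain");
        }
        String local = address.substring(0, at);
        String domain = address.substring(at + 1).toLowerCase(Locale.ROOT);
        return local + "@" + domain;
    }

    // Compares two addresses after canonicalizing both.
    static boolean sameAddress(String a, String b) {
        return canonicalize(a).equals(canonicalize(b));
    }
}
```

With this sketch, `John.Doe@Example.COM` and `John.Doe@example.com` compare as equal, while `John.Doe@example.com` and `john.doe@example.com` do not. This is deliberately conservative: a real application may choose a looser policy if it knows how the receiving mail system treats case.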
Syntactic validation should enforce correct syntax of structured fields (e.g. SSN, date, currency symbol), while semantic validation should enforce correctness of their values in the specific business context (e.g. start date is before end date, price is within expected range).
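The two levels can be illustrated with a small sketch. Parsing the raw string into a `LocalDate` acts as the syntactic check, and the comparisons afterwards act as the semantic check. The class name `BookingValidator` and the price limits are hypothetical values picked for the example, not rules taken from this cheat sheet.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

final class BookingValidator {
    // Hypothetical business limits, for illustration only.
    private static final BigDecimal MIN_PRICE = new BigDecimal("0.01");
    private static final BigDecimal MAX_PRICE = new BigDecimal("10000.00");

    private BookingValidator() {}

    // Syntactic check: the value must be a real calendar date in yyyy-MM-dd form.
    static LocalDate parseIsoDate(String raw) {
        try {
            return LocalDate.parse(raw);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("date must be in yyyy-MM-dd form", e);
        }
    }

    // Semantic check: the start date must be strictly before the end date.
    static boolean isValidRange(LocalDate start, LocalDate end) {
        return start.isBefore(end);
    }

    // Semantic check: the price must lie within the expected range (inclusive).
    static boolean isValidPrice(BigDecimal price) {
        return price.compareTo(MIN_PRICE) >= 0 && price.compareTo(MAX_PRICE) <= 0;
    }
}
```

Keeping the two steps separate means a value such as `2024-01-31` can be syntactically valid and still be rejected because it falls after the supplied end date.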
In summary, input validation should:
Example: validating the parameter “zip” using a regular expression.
private static final Pattern zipPattern = Pattern.compile("^\\d{5}(-\\d{4})?$");
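A self-contained version of this check might look like the sketch below. It assumes the US ZIP format of five digits, optionally followed by a hyphen and four more digits. The class name `ZipValidator` and the method `isValidZip` are hypothetical.

```java
import java.util.regex.Pattern;

final class ZipValidator {
    // Five digits, optionally followed by a hyphen and four digits.
    private static final Pattern ZIP_PATTERN = Pattern.compile("^\\d{5}(-\\d{4})?$");

    private ZipValidator() {}

    // Returns true only if the whole input matches the pattern.
    static boolean isValidZip(String zip) {
        return zip != null && ZIP_PATTERN.matcher(zip).matches();
    }
}
```

Using `matches()` rather than `find()` requires the entire input to match, so values with trailing characters, including a trailing newline, are rejected.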
It's also free-form text input that highlights the importance of proper context-aware output encoding and quite clearly demonstrates that input validation is not the primary safeguard against Cross-Site Scripting: if your users want to type an apostrophe (') or a less-than sign (<) in their input, they may have a perfectly legitimate reason to do so, and the application has to handle it properly.
References: Input validation of free-form Unicode text in Python
Developing regular expressions can be complicated and is well beyond the scope of this cheat sheet.
There are many resources on the internet about how to write regular expressions, including the OWASP Validation Regex Repository.
GUMP is a standalone PHP data validation and filtering class that makes validating any data easy and painless without the reliance on a framework.