how can I clean and sanitize a url submitted by a user for redisplay in java?

I want a user to be able to submit a url, and then display that url to other users as a link.

If I naively redisplay what the user submitted, I leave myself open to urls like' ><script>[any javacscript in here]</script>

that when I redisplay it to other users will do something nasty, or at least something that makes me look unprofessional for not preventing it.

Is there a library, preferably in java, that will clean a url so that it retains all valid urls but weeds out any exploits/tomfoolery?




You can use apache validator URLValidator

UrlValidator urlValidator = new UrlValidator(schemes);
if (urlValidator.isValid("")) {

URLs having ' in are perfectly valid. If you are outputting them to an HTML document without escaping, then the problem lies in your lack of HTML-escaping, not in the input checking. You need to ensure that you are calling an HTML encoding method every time you output any variable text (including URLs) into an HTML document.

Java does not have a built-in HTML encoder (poor show!) but most web libraries do (take your pick, or write it yourself with a few string replaces). If you use JSTL tags, you get escapeXml to do it for free by default:

<a href="<c:out value="${link}"/>">ok</a>

Whilst your main problem is HTML-escaping, it is still potentially beneficial to validate that an input URL is valid to catch mistakes - you can do that by parsing it with new URL(...) and seeing if you get a MalformedURLException.

You should also check that the URL begins with a known-good protocol such as http:// or https://. This will prevent anyone using dangerous URL protocols like javascript: which can lead to cross-site-scripting as easily as HTML-injection can.


I think what you are looking for is output encoding. Have a look at OWASP ESAPI which is tried and tested way to perform encoding in Java.

Also, just a suggestion, if you want to check if a user is submitting malicious URL, you can check that against Google malware database. You can use SafeBrowing API for that.


Recent Questions

Top Questions

Home Tags Terms of Service Privacy Policy DMCA Contact Us Javascript

©2020 All rights reserved.