Validate URL syntax with JavaScript
Something that I initially thought would be simple turned into hours of googling for solutions. The problem? I wanted to use JavaScript to check if a URL uses valid syntax.
I ended up using js-uri to parse the URLs I want to check, then running a couple of regular expressions against the parts it returns.
Where I’m using this, I’m not particularly interested in checking that the domain exists or verifying that the URL is live. I just want to check that the path, query, and fragment parts use only the characters that RFC 3986 allows.
Here’s the script I use for that:
    // Requires js-uri, which provides the URI constructor used below.
    var isValidURI = function(uri) {
        var path,
            query,
            fragment,
            // Regexp for allowed characters in URL paths
            pathRE = /^([-a-z0-9._~:@!$&'()*+,;=\/]|%[0-9a-f]{2})*$/i,
            // Regexp for allowed characters in querystrings and URL fragments
            // (same as pathRE, plus "?")
            qfRE = /^([-a-z0-9._~:@!$&'()*+,;=?\/]|%[0-9a-f]{2})*$/i;

        // Let js-uri split the URL into its parts
        uri = new URI(uri);
        path = uri.getPath();
        query = uri.getQuery();
        fragment = uri.getFragment();

        // A missing or empty part is fine; a part that is present must match
        if (path && !pathRE.test(path)) {
            return false;
        }
        if (query && !qfRE.test(query)) {
            return false;
        }
        if (fragment && !qfRE.test(fragment)) {
            return false;
        }
        return true;
    };
This may not be bulletproof—parsing URLs is tricky—but as far as I can tell it works with the test cases I’ve thrown at it.
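To try a few cases without loading js-uri, here is a rough standalone sketch. It is not the script above: it swaps js-uri for a splitting regular expression adapted from RFC 3986, Appendix B (I widened the fragment group and anchored the end so that stray line breaks are not silently dropped), and then applies the same two character checks. The names splitRE and isValidURISyntax are made up for this example, and like the original it says nothing about the scheme or authority.

```javascript
// Adapted from RFC 3986, Appendix B. Capture groups used here:
// 5 = path, 7 = query, 9 = fragment.
var splitRE = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#([\s\S]*))?$/;

// Same character checks as in the script above
var pathRE = /^([-a-z0-9._~:@!$&'()*+,;=\/]|%[0-9a-f]{2})*$/i;
var qfRE = /^([-a-z0-9._~:@!$&'()*+,;=?\/]|%[0-9a-f]{2})*$/i;

// Hypothetical helper: true if path, query, and fragment use only
// characters allowed by RFC 3986 (or valid percent-encodings).
function isValidURISyntax(uri) {
    var parts = splitRE.exec(String(uri));
    if (!parts) {
        return false;
    }
    var path = parts[5];      // always a string, possibly empty
    var query = parts[7];     // undefined when there is no "?"
    var fragment = parts[9];  // undefined when there is no "#"

    if (path && !pathRE.test(path)) {
        return false;
    }
    if (query && !qfRE.test(query)) {
        return false;
    }
    if (fragment && !qfRE.test(fragment)) {
        return false;
    }
    return true;
}

// A few cases to try
console.log(isValidURISyntax("http://example.com/foo/bar?x=1&y=2#top")); // true
console.log(isValidURISyntax("http://example.com/foo bar"));             // false (space in path)
console.log(isValidURISyntax("http://example.com/a%2Gb"));               // false (bad percent-encoding)
console.log(isValidURISyntax("http://example.com/?q=a|b"));              // false ("|" in query)
```

Splitting first and validating each part separately keeps the two character-class regexps small and easy to compare against the RFC.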
If you spot any errors with this method or happen to know of a better way of verifying URL syntax with JavaScript I’m all ears.