How to replace plain URLs with links?

I am using the function below to match URLs inside a given text and replace them with HTML links. The regular expression is working great, but currently I am only replacing the first match.

How can I replace all the URLs? I guess I should be using the exec command, but I have not figured out how to do it.

function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
    return text.replace(exp,"<a href='$1'>$1</a>"); 
}

Answers:

Answer

First off, rolling your own regexp to parse URLs is a terrible idea. You must imagine this is a common enough problem that someone has written, debugged and tested a library for it, according to the RFCs. URIs are complex - check out the code for URL parsing in Node.js and the Wikipedia page on URI schemes.

There are a ton of edge cases when it comes to parsing URLs: international domain names, actual (.museum) vs. nonexistent (.etc) TLDs, weird punctuation including parentheses, punctuation at the end of the URL, IPV6 hostnames etc.
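
A quick way to see two of these edge cases is to run the pattern from the question on them. This is only an illustrative sketch: the sample URLs and the helper name firstMatch are made up for the demonstration and are not part of any library.

```javascript
// Pattern copied from the question (first match only, no "g" flag).
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;

// Hypothetical helper: returns the first URL-like match, or null.
function firstMatch(text) {
    var m = exp.exec(text);
    return m ? m[1] : null;
}

// Parentheses are not in the character class, so the match stops early.
var wiki = firstMatch('http://en.wikipedia.org/wiki/URL_(disambiguation)');

// Non-ASCII letters are not in the class either, so an international
// domain name is cut off at the first such letter.
var idn = firstMatch('http://münchen.example/');

console.log(wiki); // http://en.wikipedia.org/wiki/URL_
console.log(idn);  // http://m
```

Both results are valid-looking but wrong links, which is the kind of failure a tested library is meant to avoid.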

I've looked at a ton of libraries, and there are a few worth using despite some downsides:

Libraries that I've disqualified quickly for this task:

If you insist on a regular expression, the most comprehensive is the URL regexp from Component, though, judging from its source, it will falsely detect some non-existent two-letter TLDs.

Answer

Replacing URLs with links (Answer to the General Problem)

The regular expression in the question misses a lot of edge cases. When detecting URLs, it's always better to use a specialized library that handles international domain names, new TLDs like .museum, parentheses and other punctuation within and at the end of the URL, and many other edge cases. See Jeff Atwood's blog post The Problem With URLs for an explanation of some of the other issues.

The best summary of URL matching libraries is in Dan Dascalescu's answer (as of Feb 2014).


"Make a regular expression replace more than one match" (Answer to the specific problem)

Add a "g" to the end of the regular expression to enable global matching:

/ig;

But that only fixes the problem in the question where the regular expression was only replacing the first match. Do not use that code.
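
For completeness, here is a small sketch of what the added flag changes. It reuses the pattern from the question unchanged apart from the "g"; the function name replaceAllUrls and the sample text are made up for the example, and the caveats above about the pattern itself still apply.

```javascript
// Pattern from the question; the only change is the added "g" flag.
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;

// Hypothetical name; with "g", replace() visits every match, not just the first.
function replaceAllUrls(text) {
    return text.replace(exp, "<a href='$1'>$1</a>");
}

var result = replaceAllUrls('See http://a.example/x and https://b.example/y.');
console.log(result);
// See <a href='http://a.example/x'>http://a.example/x</a> and <a href='https://b.example/y'>https://b.example/y</a>.
```

There is no need for exec here: replace() with a global pattern already loops over all matches.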

Answer

I've made some small modifications to Travis's code (just to avoid any unnecessary redeclaration - but it's working great for my needs, so nice job!):

function linkify(inputText) {
    var replacedText, replacePattern1, replacePattern2, replacePattern3;

    //URLs starting with http://, https://, or ftp://
    replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
    replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');

    //URLs starting with "www." (without // before it, or it'd re-link the ones done above).
    replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
    replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');

    //Change email addresses to mailto: links.
    replacePattern3 = /(([a-zA-Z0-9\-\_\.])+@[a-zA-Z\_]+?(\.[a-zA-Z]{2,6})+)/gim;
    replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');

    return replacedText;
}
Answer

Made some optimizations to Travis' Linkify() code above. I also fixed a bug where email addresses with subdomain type formats would not be matched (i.e. [email protected]).

In addition, I changed the implementation to prototype the String class so that items can be matched like so:

var text = '[email protected]';
text.linkify();

'http://stackoverflow.com/'.linkify();

Anyway, here's the script:

if (!String.prototype.linkify) {
    String.prototype.linkify = function() {

        // http://, https://, ftp://
        var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;

        // www. sans http:// or https://
        var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

        // Email addresses
        var emailAddressPattern = /[\w.]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim;

        return this
            .replace(urlPattern, '<a href="$&">$&</a>')
            .replace(pseudoUrlPattern, '$1<a href="http://$2">$2</a>')
            .replace(emailAddressPattern, '<a href="mailto:$&">$&</a>');
    };
}
Answer

Thanks, this was very helpful. I also wanted something that would link things that looked like a URL -- as a basic requirement, it'd link something like www.yahoo.com, even if the http:// protocol prefix was not present. So basically, if "www." is present, it'll link it and assume it's http://. I also wanted emails to turn into mailto: links. EXAMPLE: www.yahoo.com would be converted to a link whose href is http://www.yahoo.com and whose text is www.yahoo.com.

Here's the code I ended up with (combination of code from this page and other stuff I found online, and other stuff I did on my own):

function Linkify(inputText) {
    //URLs starting with http://, https://, or ftp://
    var replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
    var replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');

    //URLs starting with www. (without // before it, or it'd re-link the ones done above)
    var replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
    var replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');

    //Change email addresses to mailto: links
    var replacePattern3 = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/gim;
    var replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');

    return replacedText
}

In the 2nd replace, the (^|[^/]) part is only replacing www.whatever.com if it's not already prefixed by // -- to avoid double-linking if a URL was already linked in the first replace. Also, it's possible that www.whatever.com might be at the beginning of the string, which is the first "or" condition in that part of the regex.
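
To make that concrete, here is a small sketch that runs only the second replace on three inputs. The helper name linkWww and the sample strings are made up for the illustration, and I left out target="_blank" to keep the output short.

```javascript
// Same "www." pattern as in the function above.
var wwwPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

// Hypothetical helper that applies only the second replace.
function linkWww(text) {
    return text.replace(wwwPattern, '$1<a href="http://$2">$2</a>');
}

// "^" branch: www. at the very beginning of the string.
var atStart = linkWww('www.example.com is up');
// "[^\/]" branch: the preceding space is captured as $1 and put back.
var afterSpace = linkWww('go to www.example.com');
// Preceded by "//", so neither branch matches and the text is left alone.
var alreadyPrefixed = linkWww('see http://www.example.com');

console.log(atStart);         // <a href="http://www.example.com">www.example.com</a> is up
console.log(afterSpace);      // go to <a href="http://www.example.com">www.example.com</a>
console.log(alreadyPrefixed); // see http://www.example.com
```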

This could be integrated as a jQuery plugin as Jesse P illustrated above -- but I specifically wanted a regular function that wasn't acting on an existing DOM element, because I'm taking text I have and then adding it to the DOM, and I want the text to be "linkified" before I add it, so I pass the text through this function. Works great.

Answer

Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript:

https://github.com/ljosa/urlize.js

An example:

urlize('Go to SO (stackoverflow.com) and ask. <grin>', 
       {nofollow: true, autoescape: true})
=> "Go to SO (<a href="http://stackoverflow.com" rel="nofollow">stackoverflow.com</a>) and ask. &lt;grin&gt;"

In the options object, nofollow: true causes rel="nofollow" to be inserted, and autoescape: true escapes characters that have special meaning in HTML. See the README file.

Answer

I made a change to the emailAddressPattern in Roshambo's String.linkify() so that it recognizes [email protected] addresses:

if (!String.prototype.linkify) {
    String.prototype.linkify = function() {

        // http://, https://, ftp://
        var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;

        // www. sans http:// or https://
        var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

        // Email addresses *** here I've changed the expression ***
        var emailAddressPattern = /(([a-zA-Z0-9_\-\.]+)@[a-zA-Z_]+?(?:\.[a-zA-Z]{2,6}))+/gim;

        return this
            .replace(urlPattern, '<a target="_blank" href="$&">$&</a>')
            .replace(pseudoUrlPattern, '$1<a target="_blank" href="http://$2">$2</a>')
            .replace(emailAddressPattern, '<a target="_blank" href="mailto:$1">$1</a>');
    };
}
Answer

I searched on google for anything newer and ran across this one:

$('p').each(function(){
   $(this).html( $(this).html().replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '<a href="$1">$1</a> ') );
});

demo: http://jsfiddle.net/kachibito/hEgvc/1/

Works really well for normal links.

Answer

This solution works like many of the others, and in fact uses the same regex as one of them; however, instead of returning an HTML string, it returns a document fragment containing the A element and any applicable text nodes.

 function make_link(string) {
    var words = string.split(' '),
        ret = document.createDocumentFragment();
    for (var i = 0, l = words.length; i < l; i++) {
        if (words[i].match(/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi)) {
            var elm = document.createElement('a');
            elm.href = words[i];
            elm.textContent = words[i];
            if (ret.childNodes.length > 0) {
                ret.lastChild.textContent += ' ';
            }
            ret.appendChild(elm);
        } else {
            if (ret.lastChild && ret.lastChild.nodeType === 3) {
                ret.lastChild.textContent += ' ' + words[i];
            } else {
                ret.appendChild(document.createTextNode(' ' + words[i]));
            }
        }
    }
    return ret;
}

There are some caveats, namely with older IE and textContent support.

here is a demo.

Answer

If you need to show a shorter link (only the domain) that still points to the full URL, you can try my modification of Sam Hasler's code posted above:

function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/([-A-Z0-9+&@#%?=~_|!:,.;]*)([-A-Z0-9+&@#%?\/=~_|!:,.;]*)[-A-Z0-9+&@#\/%=~_|])/ig;
    return text.replace(exp, "<a href='$1' target='_blank'>$3</a>");
}
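
As far as I can tell, for a URL without a path (e.g. http://example.com) the pattern above has to give its last character to the final character class, so $3 comes out as "example.co". A possible alternative, sketched below under the assumption that the standard URL constructor is available (modern browsers and Node.js), is to match the URL first and then ask the parser for the host name. The function name replaceURLWithShortLinks is made up, and I dropped the file scheme because file URLs normally have no host.

```javascript
// Hypothetical variant: link text is the host name, href is the full URL.
function replaceURLWithShortLinks(text) {
    var exp = /\b(?:https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/ig;
    return text.replace(exp, function (url) {
        var label;
        try {
            // Let the URL parser extract the host instead of a capture group.
            label = new URL(url).hostname;
        } catch (e) {
            // Not parseable: fall back to showing the full match.
            label = url;
        }
        return "<a href='" + url + "' target='_blank'>" + label + "</a>";
    });
}

var withPath = replaceURLWithShortLinks('see https://www.example.com/a/b?c=d');
var withoutPath = replaceURLWithShortLinks('see http://example.com');
console.log(withPath);    // see <a href='https://www.example.com/a/b?c=d' target='_blank'>www.example.com</a>
console.log(withoutPath); // see <a href='http://example.com' target='_blank'>example.com</a>
```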
Answer

Reg Ex: /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig

function UriphiMe(text) {
      var exp = /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig; 
      return text.replace(exp,"<a href='$1'>$1</a>");
}

Below are some test strings:

  1. Find me on to www.google.com
  2. www
  3. Find me on to www.http://www.com
  4. Follow me on : http://www.nishantwork.wordpress.com
  5. http://www.nishantwork.wordpress.com
  6. Follow me on : http://www.nishantwork.wordpress.com
  7. https://stackoverflow.com/users/430803/nishant

Note: If you don't want a bare www to count as a match, use the regex below instead: /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig

Answer

The warnings about URI complexity should be noted, but the simple answer to your question is:
To replace every match you need to add the /g flag to the end of the RegEx:
/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi

Answer

Keep it simple! Say what you cannot have, rather than what you can have :)

As mentioned above, URLs can be quite complex, especially after the '?', and not all of them start with a 'www.' e.g. maps.bing.com/something?key=!"£$%^*()&lat=65&lon&lon=20

So, rather than have a complex regex that won't meet all edge cases and will be hard to maintain, how about this much simpler one, which works well for me in practice.

Match

http(s):// (anything but a space)+

www. (anything but a space)+

Where 'anything' is [^'"<>\s] ... basically a greedy match, carrying on until you meet a space, quote, angle bracket, or end of line

Also:

Remember to check that it is not already in URL format, e.g. the text contains href="..." or src="..."

Add rel="nofollow" (if appropriate)

This solution isn't as "good" as the libraries mentioned above, but is much simpler, and works well in practice.

// Body of a function that receives `html` and returns the linkified string.
if (html.match(/(href)|(src)/i)) {
    return html; // text already has a hyperlink in it
}

html = html.replace(
    /\b(https?:\/\/[^\s\(\)\'\"\<\>]+)/ig,
    "<a rel='nofollow' href='$1'>$1</a>"
);

// www. preceded by whitespace; $1 puts the whitespace back
html = html.replace(
    /(\s)(www\.[^\s\(\)\'\"\<\>]+)/ig,
    "$1<a rel='nofollow' href='http://$2'>$2</a>"
);

// www. at the very start of the text
html = html.replace(
    /^(www\.[^\s\(\)\'\"\<\>]+)/ig,
    "<a rel='nofollow' href='http://$1'>$1</a>"
);

return html;
Answer

Correct URL detection with support for international domains and astral characters is not trivial. The linkify-it library builds its regex from many conditions, and the final size is about 6 kilobytes :) . It's more accurate than all the libraries currently referenced in the accepted answer.

See the linkify-it demo to check all edge cases live and test your own.

If you need to linkify HTML source, you should parse it first, and iterate each text token separately.
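
A rough sketch of that idea follows. It is not linkify-it's API and not a real HTML parser: it splits the source into tags and text with a naive regex, which breaks on things like ">" inside attribute values, comments or script blocks. The names linkifyHtml and linkUrls are made up for the illustration, and linkUrls stands in for whatever text linkifier you actually use.

```javascript
// Hypothetical helper: runs `linkifyText` only on text between tags,
// and skips text that is already inside an <a> element.
function linkifyHtml(html, linkifyText) {
    // The capturing group keeps the tag tokens in the split result.
    var tokens = html.split(/(<[^>]*>)/);
    var insideAnchor = false;
    return tokens.map(function (token) {
        if (/^<[^>]*>$/.test(token)) {
            if (/^<a[\s>]/i.test(token)) insideAnchor = true;
            else if (/^<\/a\s*>/i.test(token)) insideAnchor = false;
            return token; // tags (and their attributes) are never touched
        }
        return insideAnchor ? token : linkifyText(token);
    }).join('');
}

// Stand-in text linkifier, deliberately simple.
function linkUrls(text) {
    return text.replace(/\bhttps?:\/\/[^\s<>"']+/gi, function (url) {
        return '<a href="' + url + '">' + url + '</a>';
    });
}

var out = linkifyHtml('<p title="http://x.example">see http://y.example</p>', linkUrls);
console.log(out);
// <p title="http://x.example">see <a href="http://y.example">http://y.example</a></p>
```

The URL in the title attribute is left alone, and only the one in the text token is linked.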

Answer
/**
 * Convert URLs in a string to anchor buttons
 * @param {!string} string
 * @returns {!string}
 */

function URLify(string){
  var urls = string.match(/(((ftp|https?):\/\/)[\-\w@:%_\+.~#?,&\/\/=]+)/g);
  if (urls) {
    urls.forEach(function (url) {
      string = string.replace(url, '<a target="_blank" href="' + url + '">' + url + "</a>");
    });
  }
  return string.replace("(", "<br/>(");
}

simple example

Answer

I've written yet another JavaScript library. It might be better for you since it's very sensitive with the fewest possible false positives, and it's fast and small. I'm currently actively maintaining it, so please do test it on the demo page and see how it works for you.

link: https://github.com/alexcorvi/anchorme.js

Answer

I had to do the opposite, and make html links into just the URL, but I modified your regex and it works like a charm, thanks :)

var exp = /<a\s.*href=['"](\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])['"].*>.*<\/a>/ig;

source = source.replace(exp,"$1");
Answer

The e-mail detection in Travitron's answer above did not work for me, so I extended/replaced it with the following (C# code).

// Change e-mail addresses to mailto: links.
const RegexOptions o = RegexOptions.Multiline | RegexOptions.IgnoreCase;
const string pat3 = @"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,6})";
const string rep3 = @"<a href=""mailto:$1@$2.$3"">$1@$2.$3</a>";
text = Regex.Replace(text, pat3, rep3, o);

This allows for e-mail addresses like "[email protected]".
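
Since the question is about JavaScript, here is roughly how the same pattern might look there. This is my own port, not tested against the C# version; the function name linkEmails and the sample address are made up.

```javascript
// JavaScript port of the C# pattern above ("i" for IgnoreCase, "g" to replace all).
var emailPattern = /([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,6})/gi;

// Hypothetical helper name.
function linkEmails(text) {
    return text.replace(emailPattern, '<a href="mailto:$1@$2.$3">$1@$2.$3</a>');
}

var linkedMail = linkEmails('Write to first.last@mail.example.com today');
console.log(linkedMail);
// Write to <a href="mailto:first.last@mail.example.com">first.last@mail.example.com</a> today
```

The second group allows dots, which is what lets subdomains in the address through.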

Answer

After input from several sources I now have a solution that works well. It came down to writing your own replacement code.

Answer.

Fiddle.

function replaceURLWithHTMLLinks(text) {
    var re = /(\(.*?)?\b((?:https?|ftp|file):\/\/[-a-z0-9+&@#\/%?=~_()|!:,.;]*[-a-z0-9+&@#\/%=~_()|])/ig;
    return text.replace(re, function(match, lParens, url) {
        var rParens = '';
        lParens = lParens || '';

        // Try to strip the same number of right parens from url
        // as there are left parens.  Here, lParenCounter must be
        // a RegExp object.  You cannot use a literal
        //     while (/\(/g.exec(lParens)) { ... }
        // because an object is needed to store the lastIndex state.
        var lParenCounter = /\(/g;
        while (lParenCounter.exec(lParens)) {
            var m;
            // We want m[1] to be greedy, unless a period precedes the
            // right parenthesis.  These tests cannot be simplified as
            //     /(.*)(\.?\).*)/.exec(url)
            // because if (.*) is greedy then \.? never gets a chance.
            if (m = /(.*)(\.\).*)/.exec(url) ||
                    /(.*)(\).*)/.exec(url)) {
                url = m[1];
                rParens = m[2] + rParens;
            }
        }
        return lParens + "<a href='" + url + "'>" + url + "</a>" + rParens;
    });
}
Answer

Replace URLs in text with HTML links, ignore the URLs within a href/pre tag. https://github.com/JimLiu/auto-link

Answer

Here's my solution:

var content = "Visit https://wwww.google.com or watch this video: https://www.youtube.com/watch?v=0T4DQYgsazo and news at http://www.bbc.com";
content = replaceUrlsWithLinks(content, "http://");
content = replaceUrlsWithLinks(content, "https://");

function replaceUrlsWithLinks(content, protocol) {
    var startPos = 0;
    var s = 0;

    while (s < content.length) {
        startPos = content.indexOf(protocol, s);

        if (startPos < 0)
            return content;

        let endPos = content.indexOf(" ", startPos + 1);

        if (endPos < 0)
            endPos = content.length;

        let url = content.substr(startPos, endPos - startPos);

        if (url.endsWith(".") || url.endsWith("?") || url.endsWith(",")) {
            url = url.substr(0, url.length - 1);
            endPos--;
        }

        if (validUrl(url)) {
            let link = "<a href='" + url + "'>" + url + "</a>";
            content = content.substr(0, startPos) + link + content.substr(endPos);
            s = startPos + link.length;
        } else {
            s = endPos + 1;
        }
    }

    return content;
}

function validUrl(url) {
    try {
        new URL(url);
        return true;
    } catch (e) {
        return false;
    }
}
Answer

Try the function below:

function anchorify(text){
  var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  var text1=text.replace(exp, "<a href='$1'>$1</a>");
  var exp2 =/(^|[^\/])(www\.[\S]+(\b|$))/gim;
  return text1.replace(exp2, '$1<a target="_blank" href="http://$2">$2</a>');
}

alert(anchorify("Hola amigo! https://www.sharda.ac.in/academics/"));

Answer

Try Below Solution

function replaceLinkClickableLink(url = '') {
let pattern = new RegExp('^(https?:\\/\\/)?'+
        '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.?)+[a-z]{2,}|'+
        '((\\d{1,3}\\.){3}\\d{1,3}))'+
        '(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+
        '(\\?[;&a-z\\d%_.~+=-]*)?'+
        '(\\#[-a-z\\d_]*)?$','i');

let isUrl = pattern.test(url);
if (isUrl) {
    return `<a href="${url}" target="_blank">${url}</a>`;
}
return url;
}
