Thursday, May 31, 2012

JavaScript/jQuery HTML Encoding


I'm using JavaScript to pull a value out of a hidden field and display it in a textbox.



The value in the hidden field is encoded.



e.g.




<input id='hiddenId' type='hidden' value='chalk &amp; cheese' />



gets pulled into




<input type='text' value='chalk &amp; cheese' />



via some jQuery to get the value from the hidden field (it's at this point that I lose the encoding):




$('#hiddenId').attr('value')



The problem is that when I read chalk &amp; cheese from the hidden field, JavaScript seems to lose the encoding. To escape " and ', I want the encoding to remain.



Is there a JavaScript library or a jQuery method that will HTML-encode a string?


Source: Tips4all

8 comments:

  1. I use these functions:

    function htmlEncode(value){
        return $('<div/>').text(value).html();
    }

    function htmlDecode(value){
        return $('<div/>').html(value).text();
    }


    Basically a div element is created in memory, but it is never appended to the document.

    In the htmlEncode function, I set the innerText of the element and retrieve the encoded innerHTML. In the htmlDecode function, I set the innerHTML of the element and retrieve the innerText.


  2. The jQuery trick doesn't encode quote marks, and in IE it will strip your whitespace.

    Based on the escape templatetag in Django, which I guess is heavily used/tested already, I made this function which does what's needed.

    It's arguably simpler (and possibly faster) than any of the workarounds for the whitespace-stripping issue - and it encodes quote marks, which is essential if you're going to use the result inside an attribute value for example.

    function htmlEscape(str) {
        return String(str)
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
    }


    Update:
    jsPerf tests show this method is fast, and possibly the fastest option if you're on a recent browser version.
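
    As a usage sketch for the question's case, the escaped string can be placed directly into a quoted attribute value. The buildTextInput helper below is hypothetical, introduced only for illustration, and htmlEscape is repeated from above so the snippet runs on its own:

    ```javascript
    // Same replacement chain as htmlEscape above, repeated so this runs standalone.
    function htmlEscape(str) {
      return String(str)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
    }

    // Hypothetical helper (not part of any library): builds markup for a text
    // input whose value attribute is delimited by single quotes.
    function buildTextInput(value) {
      return "<input type='text' value='" + htmlEscape(value) + "' />";
    }

    var markup = buildTextInput("chalk & 'cheese'");
    // markup: <input type='text' value='chalk &amp; &#39;cheese&#39;' />
    ```

    Because the ampersand is replaced first, the ampersands introduced by the later replacements are not encoded a second time.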

  3. I know this is an old one, but I wanted to post a variation of the answer that will work in IE without removing lines. This should really be a comment on the answer, but I'm not allowed to comment yet. So here it is:

    function multiLineHtmlEncode(value) {
        var lines = value.split(/\r\n|\r|\n/);
        for (var i = 0; i < lines.length; i++) {
            lines[i] = htmlEncode(lines[i]);
        }
        return lines.join('\r\n');
    }

    function htmlEncode(value) {
        return $('<div/>').text(value).html();
    }

  4. Good answer. Note that with jQuery 1.4.2, if the value to encode is undefined or null, you might get errors such as:

    jQuery("<div/>").text(value).html is not a function

    OR

    Uncaught TypeError: Object has no method 'html'

    The solution is to modify the function to check for an actual value:

    function htmlEncode(value){
        // Guard only against null and undefined, so 0 and false are still encoded.
        if (value != null) {
            return jQuery('<div/>').text(value).html();
        } else {
            return '';
        }
    }

  5. Do NOT use the innerHTML method (the jQuery .html() method uses innerHTML). In some browsers (I've only tested Chrome), it won't escape quotes, so if you put the result into an attribute value, you end up with an XSS vulnerability.
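
    To make the risk concrete, here is a minimal sketch of the failure mode. It assumes an encoder that escapes &, < and > but leaves quote characters untouched, which is the behaviour described above for .html(). The names escapeTextOnly and unsafeInput are hypothetical and exist only for this illustration:

    ```javascript
    // Hypothetical stand-in for a text-only encoder: quotes are left as-is.
    function escapeTextOnly(str) {
      return String(str)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
    }

    // Hypothetical helper that interpolates the value into a quoted attribute.
    function unsafeInput(value) {
      return "<input type='text' value='" + escapeTextOnly(value) + "' />";
    }

    // The single quote in the payload closes the attribute early, so the
    // rest of the payload is parsed as a new attribute (an event handler).
    var payload = "x' onfocus='alert(1)";
    var markup = unsafeInput(payload);
    // markup: <input type='text' value='x' onfocus='alert(1)' />
    ```

    An encoder that also replaces " and ' keeps the whole payload inside the value attribute.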

  6. Prototype has it built into the String class. So if you are using or plan to use Prototype, you can do something like:

    '<div class="article">This is an article</div>'.escapeHTML();
    // -> "&lt;div class="article"&gt;This is an article&lt;/div&gt;"

  7. FWIW, the encoding is not being lost. The encoding is used by the markup parser (the browser) during page load. Once the source has been read and parsed and the browser has the DOM loaded into memory, the encoding has been parsed into what it represents. So by the time your JS is executed to read anything in memory, the character it gets is what the encoding represented.

    I may be operating strictly on semantics here, but I wanted you to understand the purpose of encoding. The word "lost" makes it sound like something isn't working like it should.
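
    The round trip described above can be sketched without a browser. The snippet assumes a toy decoder that handles only five entities (a real HTML parser handles many more), and decodeBasicEntities and encodeForAttribute are hypothetical names introduced for illustration:

    ```javascript
    // Toy model of what the parser does to an attribute value at page load.
    // Handling only these five entities is an assumption made to keep the
    // sketch short.
    var ENTITY_TO_CHAR = {
      '&amp;': '&',
      '&lt;': '<',
      '&gt;': '>',
      '&quot;': '"',
      '&#39;': "'"
    };

    function decodeBasicEntities(markupValue) {
      // Single pass, so '&amp;lt;' becomes '&lt;' and is not decoded twice.
      return String(markupValue).replace(/&(?:amp|lt|gt|quot|#39);/g, function (entity) {
        return ENTITY_TO_CHAR[entity];
      });
    }

    // Re-encoding is a separate step, needed only when writing markup again.
    function encodeForAttribute(value) {
      return String(value)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
    }

    var inSource = 'chalk &amp; cheese';              // what the page source contains
    var inMemory = decodeBasicEntities(inSource);     // what script code reads
    var backToMarkup = encodeForAttribute(inMemory);  // what to write into new markup
    // inMemory is 'chalk & cheese'; backToMarkup equals inSource again.
    ```

    Seen this way, reading the attribute from script returns the decoded character, and encoding is something you apply again only at the point where you generate markup.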

  8. AFAIK, there aren't any straightforward HTML encode/decode methods in JavaScript.

    However, what you can do is use JS to create an arbitrary element, set its inner text, then read it back using innerHTML.

    Say, with jQuery, this should work:

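    // Create a div, set its text, and attach it (hidden) to the document.
    var helper = $('<div/>').text('chalk & cheese').hide().appendTo('body');
    // Reading the markup back yields the encoded string.
    var htmled = helper.html();
    helper.remove();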


    or something along these lines
