Sunday, June 10, 2012

parseInt(null, 24) === 23… wait, what?


Alright, so I was messing around with parseInt to see how it handles values not yet initialized and I stumbled upon this gem. The below happens for any radix 24 or above.




parseInt(null, 24) === 23 // true



I tested it in IE, Chrome and Firefox and they all alert true, so I'm thinking it must be in the specification somewhere. A quick Google search didn't give me any results so here I am, hoping someone can explain.



I remember listening to a Crockford speech where he was saying typeof null === "object" because of an oversight causing Object and Null to have a near identical type identifier in memory or something along those lines, but I can't find that video now.



Try it: http://jsfiddle.net/robert/txjwP/



Edit: Correction, a higher radix returns different results; 32 returns 785077.

Edit 2: From zzzzBov: [24...30]:23, 31:714695, 32:785077, 33:859935, 34:939407, 35:1023631, 36:1112745
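
The values in Edit 2 can be reproduced with a short loop. This is only a minimal sketch; `results` is an illustrative name, and the loop covers radixes 2 through 36.

```javascript
// Collect parseInt(null, radix) for radixes 2 through 36.
const results = {};
for (let radix = 2; radix <= 36; radix++) {
  results[radix] = parseInt(null, radix);
}

console.log(results[23]); // NaN: "n" is not a base-23 digit
console.log(results[24]); // 23
console.log(results[30]); // 23
console.log(results[31]); // 714695
console.log(results[36]); // 1112745
```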





tl;dr



Explain why parseInt( null, 24 ) === 23 is a true statement.


Source: Tips4all

6 comments:

  1. It's converting null to the string "null" and trying to parse that. For radixes up to 23, none of its characters is a valid numeral, so it returns NaN. At 24, "n", the 14th letter (value 23), is added to the numeral system. At 31, "u", the 21st letter (value 30), is added and the entire string can be decoded. From 37 onward parseInt no longer accepts the radix, so NaN is returned.

    js> parseInt(null, 36)
    1112745

    >>> reduce(lambda x, y: x * 36 + y, [(string.digits + string.lowercase).index(x) for x in 'null'])
    1112745
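
    The same arithmetic can be written in JavaScript. This is a minimal sketch: `DIGITS` and `decode` are illustrative names, not built-ins, and the function assumes every character of the input is a valid digit in the given radix.

    ```javascript
    // Digit alphabet used for radixes up to 36.
    const DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz";

    // Horner's method: multiply the running total by the radix, then add the next digit.
    function decode(text, radix) {
      return [...text].reduce((total, ch) => total * radix + DIGITS.indexOf(ch), 0);
    }

    console.log(decode("null", 36));                        // 1112745
    console.log(decode("null", 36) === parseInt(null, 36)); // true
    ```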

  2. Mozilla tells us:


    function parseInt converts its first
    argument to a string, parses it, and
    returns an integer or NaN. If not NaN,
    the returned value will be the decimal
    integer representation of the first
    argument taken as a number in the
    specified radix (base). For example, a
    radix of 10 indicates to convert from
    a decimal number, 8 octal, 16
    hexadecimal, and so on. For radices
    above 10, the letters of the alphabet
    indicate numerals greater than 9. For
    example, for hexadecimal numbers (base
    16), A through F are used.


    In the spec, 15.1.2.2/1 tells us that the conversion to string is performed using the built-in ToString, which (as per 9.8) yields "null" (not to be confused with toString, which would yield "[object Window]"!).

    So, let's consider parseInt("null", 24).

    Of course, this isn't a base-24 numeric string in entirety, but "n" is: it's decimal 23.

    Now, parsing stops after the decimal 23 is pulled out, because "u" isn't found in the base-24 system:


    If S contains any character that is
    not a radix-R digit, then let Z be the
    substring of S consisting of all
    characters before the first such
    character; otherwise, let Z be S. [15.1.2.2/11]


    (And this is why parseInt(null, 23) (and lower radices) gives you NaN rather than 23: "n" is not in the base-23 system.)
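
    Both the string conversion and the cut-off at the first invalid digit can be checked directly. A small sketch using only built-ins, with `String(null)` standing in for the internal ToString operation:

    ```javascript
    // String(null) applies the same conversion that parseInt performs on its first argument.
    console.log(String(null));                                // prints: null
    console.log(parseInt(null, 24) === parseInt("null", 24)); // true

    // In base 24 only the leading "n" is a digit, so parsing stops before "u".
    console.log(parseInt("null", 24) === parseInt("n", 24));  // true

    // In base 23 even "n" is not a digit, so nothing can be parsed.
    console.log(Number.isNaN(parseInt(null, 23)));            // true
    ```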

  3. Ignacio Vazquez-Abrams is correct, but let's see exactly how it works...

    From 15.1.2.2 parseInt (string , radix):


    When the parseInt function is called, the following steps are taken:

    1. Let inputString be ToString(string).
    2. Let S be a newly created substring of inputString consisting of the first character that is not a StrWhiteSpaceChar and all characters following that character. (In other words, remove leading white space.)
    3. Let sign be 1.
    4. If S is not empty and the first character of S is a minus sign -, let sign be −1.
    5. If S is not empty and the first character of S is a plus sign + or a minus sign -, then remove the first character from S.
    6. Let R = ToInt32(radix).
    7. Let stripPrefix be true.
    8. If R ≠ 0, then
       a. If R < 2 or R > 36, then return NaN.
       b. If R ≠ 16, let stripPrefix be false.
    9. Else, R = 0
       a. Let R = 10.
    10. If stripPrefix is true, then
        a. If the length of S is at least 2 and the first two characters of S are either “0x” or “0X”, then remove the first two characters from S and let R = 16.
    11. If S contains any character that is not a radix-R digit, then let Z be the substring of S consisting of all characters before the first such character; otherwise, let Z be S.
    12. If Z is empty, return NaN.
    13. Let mathInt be the mathematical integer value that is represented by Z in radix-R notation, using the letters A-Z and a-z for digits with values 10 through 35. (However, if R is 10 and Z contains more than 20 significant digits, every significant digit after the 20th may be replaced by a 0 digit, at the option of the implementation; and if R is not 2, 4, 8, 10, 16, or 32, then mathInt may be an implementation-dependent approximation to the mathematical integer value that is represented by Z in radix-R notation.)
    14. Let number be the Number value for mathInt.
    15. Return sign × number.


    NOTE parseInt may interpret only a
    leading portion of string as an
    integer value; it ignores any
    characters that cannot be interpreted
    as part of the notation of an integer,
    and no indication is given that any
    such characters were ignored.


    There are two important parts here: the ToString conversion in the first step, and the rule for characters that are not radix-R digits. So first of all, we have to find out what the ToString representation of null is. Table 13 (ToString Conversions) in section 9.8 has that information.

    So now we know that ToString(null) internally yields the string 'null'. But how exactly does parseInt handle digits (characters) that aren't valid within the radix provided?

    We look above to 15.1.2.2 and we see the following remark:


    If S contains any character that is
    not a radix-R digit, then let Z be the
    substring of S consisting of all
    characters before the first such
    character; otherwise, let Z be S.


    That means that we parse all digits BEFORE the first character that is not a valid digit in the specified radix, and ignore everything from that character on.

    Basically, doing parseInt(null, 24) is the same thing as parseInt('null', 24). The u causes the two l's to be ignored (even though they ARE valid digits in radix 24). Therefore, we can only parse n, making the entire statement synonymous with parseInt('n', 24). :)

    Either way, great question!
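
    The quoted steps can be turned into a simplified sketch. `sketchParseInt` is a hypothetical name, not how any engine implements parseInt; it approximates the white-space and ToInt32 handling with `trimStart` and `| 0`, and ignores the precision caveats in the mathInt step.

    ```javascript
    // Simplified sketch of the quoted parseInt steps (illustrative only).
    function sketchParseInt(string, radix) {
      let S = String(string).trimStart();          // ToString, then drop leading white space
      let sign = 1;
      if (S.startsWith("-")) sign = -1;
      if (S.startsWith("+") || S.startsWith("-")) S = S.slice(1);

      let R = radix | 0;                           // rough stand-in for ToInt32(radix)
      let stripPrefix = true;
      if (R !== 0) {
        if (R < 2 || R > 36) return NaN;
        if (R !== 16) stripPrefix = false;
      } else {
        R = 10;
      }
      if (stripPrefix && /^0x/i.test(S)) {
        S = S.slice(2);
        R = 16;
      }

      // Z is the longest prefix of S made only of radix-R digits.
      const digits = "0123456789abcdefghijklmnopqrstuvwxyz".slice(0, R);
      let Z = "";
      for (const ch of S.toLowerCase()) {
        if (!digits.includes(ch)) break;
        Z += ch;
      }
      if (Z === "") return NaN;

      // Accumulate the value of Z digit by digit.
      let number = 0;
      for (const ch of Z) number = number * R + digits.indexOf(ch);
      return sign * number;
    }

    console.log(sketchParseInt(null, 24)); // 23
    console.log(sketchParseInt(null, 23)); // NaN
    ```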

  4. parseInt( null, 24 ) === 23


    Is equivalent to

    parseInt( String(null), 24 ) === 23


    which is equivalent to

    parseInt( "null", 24 ) === 23


    The digits for base 24 are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, ..., n.

    The language spec says



    If S contains any character that is not a radix-R digit, then let Z be the substring of S consisting of all characters before the first such character; otherwise, let Z be S.



    which is the part that ensures that C-style integer literals like 15L parse properly,
    so the above is equivalent to

    parseInt( "n", 24 ) === 23


    "n" is the digit with value 23 in the list above (the 24th entry, since the list starts at 0).

    Q.E.D.
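
    Both points can be checked in a few lines. A small sketch, where `BASE24_DIGITS` is just an illustrative name for the digit list above:

    ```javascript
    // The 24 digits of base 24, in order of value.
    const BASE24_DIGITS = "0123456789abcdefghijklmn";

    console.log(BASE24_DIGITS.length);       // 24
    console.log(BASE24_DIGITS.indexOf("n")); // 23, the value of the digit "n"

    // Trailing non-digit characters are dropped, as with a C-style suffix.
    console.log(parseInt("15L", 10));        // 15
    console.log(parseInt("n", 24));          // 23
    ```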

  5. I guess null gets converted to the string "null". So n is actually 23 in base 24 (and in every higher base), while u is invalid in base 24, so the rest of the string "null" is ignored. That's why it outputs 23 until u becomes valid in base 31.

  6. parseInt uses an alphanumeric representation. In base 24, "n" is a valid character but "u" is not, so parseInt only parses the value "n":

    parseInt("n",24) -> 23


    as an example, try with this:

    alert(parseInt("3x", 24))


    The result will be "3".
