JavaScript dives - operations on strings

Published Oct 07, 2018. Last updated Oct 19, 2018.
Welcome to the first article in a series of deep dives into JavaScript methods. I would like to kick it off with two rarely used String methods: 'String.fromCharCode' and 'String.fromCodePoint'.

The two methods are very similar, but there is one significant difference between them that might be decisive in your code.

Introduction

Both methods accept any number of numeric arguments: String.fromCharCode takes UTF-16 code units, while String.fromCodePoint takes Unicode code points. Each number is converted to the corresponding symbol or letter, and the results are joined into a single string.

1. String.fromCharCode

As the name of the method suggests, it returns a string created from the provided sequence of UTF-16 code units.

If you'd like to know more about UTF-16, you can open the link below:
Unicode.org - UTF explanation

String.fromCharCode is not difficult to use: its syntax is simple and its behavior is clearly defined. Let me show you a simple example of how to use it.

const chars = String.fromCharCode(56, 77, 45);
console.log(chars); // Result: 8M-

The example above shows the simplest possible correct usage of the method.

The method lets you pass as many arguments as you want. The only limitation is that each value must be in the range from 0 to 65535, which is equal to 0xFFFF.

Remember that if a provided number is higher than 0xFFFF, it is truncated to its lowest 16 bits. Be careful here, because the method performs no validation: it silently truncates the argument and returns the character for the truncated value, which may not be the one you expected.

Take a look at the example below:

String.fromCharCode(0x1200); // Result: ሀ
String.fromCharCode(0x11200); // Result: the same as above, ሀ. Why? Only the lowest 16 bits are kept, so the leading 1 is dropped.

As you can see, the second line gives the same result because we exceeded the 16-bit limit. The value 0x11200 does not fit in 16 bits, so only its lowest 16 bits are kept: the leading 1 is dropped and the method works with 0x1200 instead of 0x11200.

I'd like to show you one more example that uses the map method to convert an array of code units into symbols or letters.
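The truncation can be reproduced by hand with a bit mask. The following is a minimal sketch; the variable names are only illustrative, and the mask 0xFFFF simply keeps the lowest 16 bits of the number, which is what fromCharCode does with each argument.

```javascript
const tooLarge = 0x11200;

// Keep only the lowest 16 bits, as fromCharCode does internally.
const kept = tooLarge & 0xFFFF;

console.log(kept.toString(16)); // 1200
console.log(String.fromCharCode(tooLarge) === String.fromCharCode(kept)); // true

// Another value: 0x10041 loses its leading 1 and becomes 0x41, the letter A.
console.log(String.fromCharCode(0x10041)); // A
```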

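function generateSymbols(array) {
   // Convert each code unit to a one-character string
   const newArray = array.map((value) => {
      return String.fromCharCode(value);
   });
   // Join the characters into a single string
   return newArray.join('');
}

generateSymbols([56, 77, 45]); // 8M-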

Actually, wait! You can do it in a better way by using the spread operator. Yes!

function generateSymbols(array) {
   return String.fromCharCode(...array);
}

generateSymbols([56, 77, 45]) // 8M-

Wasn't it simpler? Oh yes, it was! If you're wondering what the spread operator is, follow this link to find out more: MDN Doc - Spread syntax
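One caveat, which goes beyond what the article itself covers: spreading a very large array passes every element as a separate argument, and engines limit how many arguments a single call can take. The polyfill later in this article avoids this by working in chunks of 0x4000 code units. Below is a minimal sketch of the same idea; stringFromCodeUnits and CHUNK_SIZE are hypothetical names, and the chunk size is borrowed from the polyfill rather than being a documented engine limit.

```javascript
// Assumption: 0x4000 mirrors MAX_SIZE in the polyfill; it is not an official limit.
const CHUNK_SIZE = 0x4000;

// Hypothetical helper: converts an array of UTF-16 code units to a string
// chunk by chunk, so no single call receives too many arguments.
function stringFromCodeUnits(codeUnits) {
  let result = '';
  for (let start = 0; start < codeUnits.length; start += CHUNK_SIZE) {
    const chunk = codeUnits.slice(start, start + CHUNK_SIZE);
    result += String.fromCharCode(...chunk);
  }
  return result;
}

console.log(stringFromCodeUnits([56, 77, 45])); // 8M-
```

For small arrays like the ones in this article, the plain spread version is perfectly fine.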

2. String.fromCodePoint

The syntax of this method is no different from that of String.fromCharCode. Both methods are called in exactly the same way.

const symbols = String.fromCodePoint(56, 77, 45);
console.log(symbols) // Result: 8M-

You may ask why there are two such similar methods. String.fromCodePoint was added to the ECMA-262 specification as part of the ES2015 standard. The first method, fromCharCode, was implemented in JavaScript 1.2, and its initial definition appears in the first edition of ECMA-262.

The first important difference between the two methods is therefore that fromCodePoint is part of the ES2015 standard: it is an improved version and, in many cases, can serve as a replacement for fromCharCode.

Like fromCharCode, this method accepts any number of arguments, but it lets us pass higher numbers. It accepts code points up to 0x10FFFF, which need up to 21 bits, whereas the older method only works with 16-bit numbers. The new method also validates its input: a value outside the valid code point range throws a RangeError, which never happens with fromCharCode.
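The RangeError behavior is easy to check. Here is a minimal sketch; describeCodePoint is a hypothetical helper written only for this demonstration, returning either the resulting string or the name of the thrown error.

```javascript
// Hypothetical helper: returns the string, or the error name if the call throws.
function describeCodePoint(value) {
  try {
    return String.fromCodePoint(value);
  } catch (error) {
    return error.name;
  }
}

console.log(describeCodePoint(0x41));     // A
console.log(describeCodePoint(0x110000)); // RangeError (above the highest code point, 0x10FFFF)
console.log(describeCodePoint(-1));       // RangeError (negative)
console.log(describeCodePoint(3.14));     // RangeError (not an integer)

// fromCharCode never throws for the same input; it truncates instead.
console.log(String.fromCharCode(0x110000).length); // 1
```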

const stringOne = String.fromCharCode(0x11200) // Result: "ሀ"
const stringTwo = String.fromCodePoint(0x11200) // Result: "𑈀"

As you can see in the example above, "stringOne" contains a different result than "stringTwo". Why? The first method truncates the value to 16 bits and returns the symbol for 0x1200, while the second one keeps the full value and returns the symbol for the code point 0x11200.

The saddest news is that fromCodePoint, being an ES2015 addition, is not supported in Internet Explorer. Thankfully, ready-made polyfills exist. And to be clear, if you only need code units up to 16 bits, you can use fromCharCode without any polyfill.

Below you can see a polyfill for the fromCodePoint method:
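A quick way to see the difference is to inspect both strings. The sketch below uses illustrative variable names; it relies on the fact that a code point above 0xFFFF is stored as two UTF-16 code units, so the string's length is 2 even though it holds a single symbol.

```javascript
const fromUnit = String.fromCharCode(0x11200);   // truncated to 0x1200
const fromPoint = String.fromCodePoint(0x11200); // kept as 0x11200

console.log(fromUnit.length);  // 1
console.log(fromPoint.length); // 2 (two UTF-16 code units)

// codePointAt reads the whole code point back, not just the first code unit.
console.log(fromPoint.codePointAt(0).toString(16)); // 11200

// Spreading a string iterates by code point, so it yields a single symbol.
console.log([...fromPoint].length); // 1
```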

if (!String.fromCodePoint) {
  (function() {
    var defineProperty = (function() {
      // IE 8 only supports `Object.defineProperty` on DOM elements
      try {
        var object = {};
        var $defineProperty = Object.defineProperty;
        var result = $defineProperty(object, object, object) && $defineProperty;
      } catch(error) {}
      return result;
    }());
    var stringFromCharCode = String.fromCharCode;
    var floor = Math.floor;
    var fromCodePoint = function(_) {
      var MAX_SIZE = 0x4000;
      var codeUnits = [];
      var highSurrogate;
      var lowSurrogate;
      var index = -1;
      var length = arguments.length;
      if (!length) {
        return "";
      }
      var result = "";
      while (++index < length) {
        var codePoint = Number(arguments[index]);
        if (
          !isFinite(codePoint) || // `NaN`, `+Infinity`, or `-Infinity`
          codePoint < 0 || // not a valid Unicode code point
          codePoint > 0x10FFFF || // not a valid Unicode code point
          floor(codePoint) != codePoint // not an integer
        ) {
          throw RangeError("Invalid code point: " + codePoint);
        }
        if (codePoint <= 0xFFFF) { // BMP code point
          codeUnits.push(codePoint);
        } else { // Astral code point; split in surrogate halves
          // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
          codePoint -= 0x10000;
          highSurrogate = (codePoint >> 10) + 0xD800;
          lowSurrogate = (codePoint % 0x400) + 0xDC00;
          codeUnits.push(highSurrogate, lowSurrogate);
        }
        if (index + 1 == length || codeUnits.length > MAX_SIZE) {
          result += stringFromCharCode.apply(null, codeUnits);
          codeUnits.length = 0;
        }
      }
      return result;
    };
    if (defineProperty) {
      defineProperty(String, "fromCodePoint", {
        "value": fromCodePoint,
        "configurable": true,
        "writable": true
      });
    } else {
      String.fromCodePoint = fromCodePoint;
    }
  }());
}
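The heart of the polyfill is the step that splits a code point above 0xFFFF into two surrogate halves. The sketch below isolates that step using the same formulae as the polyfill; toSurrogatePair is a hypothetical helper name, and it assumes the input is already a valid code point between 0x10000 and 0x10FFFF.

```javascript
// Hypothetical helper: splits an astral code point (0x10000 to 0x10FFFF)
// into its high and low UTF-16 surrogates, as the polyfill does.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const highSurrogate = (offset >> 10) + 0xD800;  // top 10 bits
  const lowSurrogate = (offset % 0x400) + 0xDC00; // bottom 10 bits
  return [highSurrogate, lowSurrogate];
}

const [high, low] = toSurrogatePair(0x11200);
console.log(high.toString(16), low.toString(16)); // d804 de00

// Passing both surrogates to fromCharCode produces the same string.
console.log(String.fromCharCode(high, low) === String.fromCodePoint(0x11200)); // true
```

This is why fromCharCode can still produce such symbols: you just have to pass both surrogates yourself.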

That would be everything for now. If you've got any questions, opinions or anything else, please add your comment below the article.

Thank you for reading the first article - I really appreciate that. Please, share the article if you can and give it a like!
