Generating an sequence of unicode characters

Here is a slightly useless, yet sometimes handy, function if you quickly need to spit out an ordered sequence of unicode characters between a range (e.g., U+007F to U+009F).

/**
 * If you need the string representation, there is a handy
 * 'unicode' property you can query on the returned object.
 * 
 * @param {string} from   is a HEX code you want to start from.
 * @param {string} to   is a HEX code you want to end with.
 * @return {array} the unicode range from - to.
 **/
function unicodeSequence(from, to) {
    'use strict';
    var code,
        index = parseInt(String(from), 16),
        end = parseInt(String(to), 16),
        result = [],
        unicode = '';
    if (isNaN(index) || isNaN(end)) {
      throw new TypeError('Invalid input');
    }
    for (; index <= end; index++) {
      code = index.toString(16).toUpperCase();
      //makes unicode code  like '00AB'
      while (code.length < 4) {
        code = '0' + code;
      }
      //concatenate so we get a unicode string
      result.push('\\u' + code);
      unicode += String.fromCharCode(index);
    }
    Object.defineProperty(result, 'unicode', {
      get: function() {
        return unicode;
      }
    });
    return result;
}

//Example
var r = unicodeSequence('40', '7E');
console.log(String(r.join(', ')));
console.log(r.unicode);

JSFiddle with it

Why?

In a lot of specs (e.g., in HTML), you see unicode ranges that when encountered have particular side effects (e.g., a parse error). In the specs, however, instead of listing out all the values in a range, the editor will just write, for example, “U+007F to U+009F“. So, obviously, when implementing a spec, one needs a way of expanding these ranges.

2 thoughts on “Generating an sequence of unicode characters”

Leave a Reply

Your email address will not be published. Required fields are marked *