Demystifying Asynchronous Programming Part 2: Node.js EventEmitter

Published May 11, 2017. Last updated Nov 06, 2017.

EventEmitter

Node.js is famous for its asynchronous and event-driven nature. I'm sure you’re feeling pretty good about yourself after going through the little analysis of the event loop in our previous post. Now, we’re going to talk about another important topic: the event emitter. Let me give you my summary first:

Node.js gives you event emitters so that you can build tools for the event pattern in user space. They have nothing to do with the event loop!

What?!?!

You may be feeling a bit confused, or maybe even a bit upset. You might even wonder if I know Node.js at all! Well, hold your horses and calm down. You’ll see how the story unfolds.

EventEmitters are synchronous

EventEmitters are, in nature, synchronous. Unfortunately, none of the books or articles I’ve read have pointed this out clearly. Some of them jump to false conclusions while some of them hint at this. One book even believes EventEmitter to be an abstraction of the event loop (which is completely wrong. Just don't ask me which book it was)!

When I was a newbie at Node.js, I believed that, too, until I created an EventEmitter and a timer in the style of Node.js in Lua. That’s when I realized things were not what I had imagined at all. And because I wanted to write it in the style of Node.js, I ended up “copying” how it was done in Node.js (well, paying homage to great developers XD).

The following uses the Node.js v4.5.0 LTS source code as an example. The implementation of EventEmitter in /lib/events.js is less than 450 lines. Almost everything in it revolves around two methods that form the core of the event pattern: .on(event, listener) and .emit(event, ...). .on() lets you register an event listener, while .emit() lets you emit events. When an event is emitted, the callbacks registered for it are executed. I’m sure JavaScript developers are familiar with this pattern (well, not just familiar, it’s burnt into us).

The constructor of EventEmitter looks like this. Its structure is really that simple. Inside, there is just one property that is private by convention, this._events = {}. In this box, the event type acts as the key and the registered listener acts as the value. If an event has several listeners, the value is an array that stores the handlers in order of registration.

function EventEmitter() {
  EventEmitter.init.call(this);
}

EventEmitter.init = function() {
  // ... 
  if (!this._events || this._events === Object.getPrototypeOf(this)._events) {
    this._events = {};      // this object is used to manage the registered listeners
    this._eventsCount = 0;
  }

  this._maxListeners = this._maxListeners || undefined;
};

Let's look at .on() first. Since it's an alias to addListener, we’ll look at the addListener method:

EventEmitter.prototype.addListener = function addListener(type, listener) {
  var events;
  var existing;
  // ... 
  events = this._events;

  // ... 
  existing = events[type];

  // If the event type does not exist yet, store the listener directly under that key
  if (!existing) {
    existing = events[type] = listener;
    ++this._eventsCount;
  } else {
    // If the event type already exists and its value is a single function, convert it to an array;
    // if it’s already an array of two or more listeners, push the new listener onto it
    if (typeof existing === 'function') {
      // Adding the second element, need to change to array.
      existing = events[type] = [existing, listener];
    } else {
      // If we've already got an array, just append.
      existing.push(listener);
    }
    // ... 
  }

  return this;
};

// on() is an alias; the assignment has to come after addListener is defined
EventEmitter.prototype.on = EventEmitter.prototype.addListener;
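The effect of this bookkeeping can be observed through the public API, without poking at the internal _events object. Here is a minimal sketch, assuming a Node.js release where listenerCount() and listeners() are available on the emitter; the emitter and the two empty handlers are made up purely for illustration:

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();

// Two hypothetical handlers, only here so we can compare references.
function first() {}
function second() {}

// One listener registered: per the source above, it is stored as a bare function.
emitter.on('data', first);
const countAfterOne = emitter.listenerCount('data');

// Second listener registered: the stored value becomes an array [first, second].
emitter.on('data', second);
const registered = emitter.listeners('data');

console.log(countAfterOne);                      // 1
console.log(registered.length);                  // 2
console.log(registered[0] === first);            // true
console.log(registered[1] === second);           // true, registration order is kept
console.log(emitter.on === emitter.addListener); // true, on() is only an alias
```

Nothing here is scheduled anywhere. Registering a listener is plain bookkeeping on an object.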

Ain't that easy? Next, let’s take a look at .emit():

EventEmitter.prototype.emit = function emit(type) {
  var er, handler, len, args, i, events, domain;
  // ... 
  events = this._events;
  // ... 

  handler = events[type];  // find the handler

  // if no listener for that type exists, return immediately
  if (!handler)
    return false;

  // ... 
  // the cases below only differ in how the listeners are called, depending on the number of arguments, for the sake of performance
  // let’s take emitOne as an example
  switch (len) {
    // fast cases
    case 1:
      emitNone(handler, isFn, this);
      break;
    case 2:
      emitOne(handler, isFn, this, arguments[1]);
      // ... 
  }

  // ... 
  return true;
};

We’ll explain it using emitOne() as a typical example:

function emitOne(handler, isFn, self, arg1) {
  // if the handler is a function, then run it directly
  if (isFn)
    handler.call(self, arg1);

  // if not, then it’s an array of functions
  else {
    var len = handler.length;
    var listeners = arrayClone(handler, len);

    // run the handlers in the array one by one in order
    // note: This is synchronous code
    for (var i = 0; i < len; ++i)
      listeners[i].call(self, arg1);
  }
}

This tells us that every emit() runs synchronous code. The callbacks are run one by one until they’re all finished. Doesn’t that sound an awful lot like the callback queue explained in the previous post? The mechanics are the same, but the way EventEmitter works has nothing to do with the event loop of Node.js. You can read the entire events.js and you won't find any asynchronous code.
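You can check this for yourself with a few lines. This is only a small sketch; the 'ping' event and the order array are names I made up to record what runs when:

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();
const order = [];

emitter.on('ping', function () { order.push('listener A'); });
emitter.on('ping', function () { order.push('listener B'); });

order.push('before emit');
emitter.emit('ping');      // both listeners run right here, before emit() returns
order.push('after emit');

console.log(order.join(' -> '));
// before emit -> listener A -> listener B -> after emit
```

If emit() handed the listeners over to the event loop, 'after emit' would be recorded before either listener. It never is.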

If you've ever used the Flux architecture with React, its dispatcher uses the same pattern to broadcast payloads to registered callbacks (see the register() and dispatch() methods. Do you see the resemblance to on() and emit(), except with more state control?).

Be extra careful when using EventEmitter

In Node.js, aside from helping with workflow control, we mostly use EventEmitter to announce that something has happened or completed, especially an asynchronous task (e.g. a file has been read, a connection has broken, a socket has closed, etc.). For example:

var fs = require('fs');
var EventEmitter = require('events');

var fooEmitter = new EventEmitter();

fooEmitter.on('data', function (data) {
  console.log(data);
});

fs.readFile('/path/to/file', (err, data) => {
  if (!err)
    fooEmitter.emit('data', data);
});

Because we use it like the above, we create the illusion that “using EventEmitter is writing asynchronous code”.
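To see where the asynchrony really comes from, here is a hedged variation of the same idea. I'm assuming setTimeout() as a stand-in for fs.readFile() so that no real file is needed, and the order array and 'payload' string are made up for the demo:

```javascript
const EventEmitter = require('events');

const fooEmitter = new EventEmitter();
const order = [];

fooEmitter.on('data', function (data) {
  order.push('listener got ' + data);
});

// setTimeout stands in for fs.readFile: this is the asynchronous part.
setTimeout(function () {
  order.push('callback start');
  fooEmitter.emit('data', 'payload');  // synchronous: the listener runs before the next line
  order.push('callback end');
  console.log(order.join(' -> '));
  // script end -> callback start -> listener got payload -> callback end
}, 10);

order.push('script end');
```

The callback is deferred by the timer. Inside that callback, emit() behaves like an ordinary function call.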

Ever Seen a Dog Chase its Own Tail? Let's Write One!

We emit an 'event2' event in the event1 handler, then emit an 'event3' event in the event2 handler, and lastly emit an 'event1' event in the event3 handler.

var EventEmitter = require("events");

var crazy = new EventEmitter();

crazy.on('event1', function () {
    console.log('event1 fired!');
    crazy.emit('event2');
});

crazy.on('event2', function () {
    console.log('event2 fired!');
    crazy.emit('event3');

});

crazy.on('event3', function () {
    console.log('event3 fired!');
    crazy.emit('event1');
});

crazy.emit('event1');

Go ahead and run it! You’ll get a RangeError that basically says the call stack has exploded (“Maximum call stack size exceeded”). The dog has spun to death from dizziness. Why? Because all callbacks are executed in a synchronous manner! Each emit() is nested inside the previous one, so the chain recursively calls itself to infinity and beyond! Don't think this will happen? Never say never!
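If you'd rather not flood your terminal, the same loop can be run with a try/catch around the first emit(). This is a sketch under a couple of assumptions: a counter replaces the console.log() calls, and the error type is what V8-based Node.js throws on stack exhaustion:

```javascript
const EventEmitter = require('events');

const crazy = new EventEmitter();
let calls = 0;   // counts handler invocations instead of logging each one

crazy.on('event1', function () { calls++; crazy.emit('event2'); });
crazy.on('event2', function () { calls++; crazy.emit('event3'); });
crazy.on('event3', function () { calls++; crazy.emit('event1'); });

let caught = null;
try {
  crazy.emit('event1');   // every emit() is nested inside the previous handler
} catch (err) {
  caught = err;
}

console.log(caught instanceof RangeError);  // true
console.log(calls > 0);                     // true, many nested calls happened first
```

The fact that the error surfaces in a try/catch around the very first emit() is itself the proof: the whole chain ran on one call stack.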

The Immortal Dog

What if we use setImmediate() to start it off and send it into the event loop? (That’s as asynchronous as it gets, right?) You’d still get the same result! All you’ve done is send the first emit() into the event loop. Once that event fires, the entire EventEmitter chain is synchronous and blocks the event loop. The recursive calls in the callbacks continue until the process dies with a stack overflow. For the same reason, even if the events didn’t form a closed loop, a long-running task in any event handler would still block the event loop for an extended period of time.
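The long-running case is easy to demonstrate without crashing anything. Below is a rough sketch; the 'job' event, the 50 ms busy-wait, and the timerDelay variable are all assumptions made for the demo:

```javascript
const EventEmitter = require('events');

const worker = new EventEmitter();
const start = Date.now();
let timerDelay = null;

// A hypothetical long-running handler that hogs the CPU for about 50 ms.
worker.on('job', function () {
  const begin = Date.now();
  while (Date.now() - begin < 50) { /* busy-wait */ }
});

// Asked to fire as soon as possible...
setTimeout(function () {
  timerDelay = Date.now() - start;
  console.log('timer fired after ' + timerDelay + ' ms');
}, 0);

// ...but emit() does not return until the handler has finished,
// so the timer callback cannot run before then.
worker.emit('job');
```

The timer asked for 0 ms and had to wait for the handler anyway, because the handler ran inside emit() on the main thread.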

Now, what if we instead use setImmediate() to send every emit() in the tail-chasing code into the event loop? You’ll get an immortal dog that never dies:

var EventEmitter = require('events');

var crazy = new EventEmitter();

crazy.on('event1', function () {
    console.log('event1 fired!');
    setImmediate(function () {
        crazy.emit('event2');
    });
});

crazy.on('event2', function () {
    console.log('event2 fired!');
    setImmediate(function () {
        crazy.emit('event3');
    });

});

crazy.on('event3', function () {
    console.log('event3 fired!');
    setImmediate(function () {
        crazy.emit('event1');
    });
});

crazy.emit('event1');

Go ahead and run it! Now you have a truly asynchronous program! You’ll be happy because the system no longer hangs!
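If you want a dog that eventually lies down, you can give it a lap limit. This is only a sketch of the same pattern; the LAPS constant, the laps counter, and the log array are my own additions so that the demo terminates:

```javascript
const EventEmitter = require('events');

const dog = new EventEmitter();
const LAPS = 3;        // hypothetical limit so the demo terminates
let laps = 0;
const log = [];

dog.on('event1', function () {
  log.push('event1');
  setImmediate(function () { dog.emit('event2'); });
});

dog.on('event2', function () {
  log.push('event2');
  setImmediate(function () { dog.emit('event3'); });
});

dog.on('event3', function () {
  log.push('event3');
  laps++;
  if (laps < LAPS) {
    // Each emit starts from a fresh call stack in a later loop iteration.
    setImmediate(function () { dog.emit('event1'); });
  } else {
    console.log('stopped after ' + laps + ' laps, ' + log.length + ' events');
    // stopped after 3 laps, 9 events
  }
});

dog.emit('event1');
```

Because every handler returns before the next emit() happens, the stack never grows, no matter how large you make the limit.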

What about process.nextTick?

Now that you're familiar enough with process.nextTick, what if we replaced ALL the setImmediate() calls with process.nextTick()? What do you think will happen? (Don’t try this at home!)

// ... 
crazy.on('event1', function () {
    console.log('event1 fired!');
    // swap all setImmediate with process.nextTick
    process.nextTick(function () {
        crazy.emit('event2');
    });
});

// ... 
crazy.emit('event1');

It’ll get stuck! And if you wait long enough, about 30 seconds, it’ll eventually die with a “process out of memory” error. Now the problem is not a stack overflow; it’s the GC not being able to reclaim memory. (Every handler has its own closure to access the crazy emitter in the outer scope, and that cost comes out of the heap.) Though you might not be 100% sure why the GC can't get the memory back, you can probably guess that the program got stuck in one phase because there’s always another process.nextTick callback to be processed. (So the event loop is blocked completely. The heap overflow is just a bonus lol)
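Here is a version that is safe to try at home. It is a sketch that bounds the chain (the TICKS limit and the counters are assumptions of the demo) so you can watch process.nextTick callbacks hold up the event loop without exhausting memory:

```javascript
const TICKS = 1000;                // hypothetical bound so the demo terminates
let ticksRun = 0;
let ticksSeenByImmediate = null;

// Scheduled first, but it has to wait for the event loop to move on.
setImmediate(function () {
  ticksSeenByImmediate = ticksRun;
  console.log('setImmediate ran after ' + ticksSeenByImmediate + ' nextTick callbacks');
  // setImmediate ran after 1000 nextTick callbacks
});

function tick() {
  ticksRun++;
  if (ticksRun < TICKS) {
    process.nextTick(tick);        // re-queued before the event loop can continue
  }
}

process.nextTick(tick);
```

The setImmediate() callback only gets its turn once the nextTick chain stops re-queuing itself. Remove the bound and that turn never comes.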

So, about EventEmitter, going back to what I said earlier:

Node.js gives you event emitters so that you can build tools for the event pattern in user space. They have absolutely nothing to do with the event loop!

So then how on earth do we write an “asynchronous event pattern”? Well, now you know that all you need to do is combine EventEmitter with API functions that can dump tasks into the event loop: setTimeout(), setInterval(), async I/O APIs, setImmediate(), and process.nextTick(). (With process.nextTick(), just be careful to avoid recursive calls and long-running tasks in the callback, and you should be fine.)
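One way to package that idea is a tiny wrapper. To be clear, emitAsync() below is a hypothetical helper of my own, not something the events module provides, and setImmediate() is just one of several reasonable choices for deferring the call:

```javascript
const EventEmitter = require('events');

// Hypothetical helper: defers the emit to a later turn of the event loop.
function emitAsync(emitter, type, payload) {
  setImmediate(function () {
    emitter.emit(type, payload);
  });
}

const notifier = new EventEmitter();
const order = [];

notifier.on('done', function (value) {
  order.push('listener: ' + value);
  console.log(order.join(' -> '));
  // after emitAsync -> listener: 42
});

emitAsync(notifier, 'done', 42);
order.push('after emitAsync');
```

The listeners still run synchronously relative to each other once the deferred emit() fires. Only the moment of emission has moved into the event loop.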

If you're still not convinced, why not try running the following code? Do you think the code will keep going or stop immediately?

var EventEmitter = require('events');
var server = new EventEmitter();

server.on('data', function () {
    console.log('Am I waiting for data incoming?');
});

If you can answer the question without actually running the code, then you really understand what I've been saying. Nothing here was added to the event loop, so the process exits immediately.
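For contrast, here is a sketch where something really is waiting in the event loop. I'm assuming a setInterval() timer as a pretend data source; the chunk counter and the limit of three are made up so the example ends on its own:

```javascript
const EventEmitter = require('events');

const server = new EventEmitter();
let received = 0;

server.on('data', function (n) {
  received++;
  console.log('got chunk ' + n);
});

// The interval, not the listener, is what keeps the process alive.
let n = 0;
const source = setInterval(function () {
  server.emit('data', ++n);
  if (n === 3) {
    clearInterval(source);   // nothing left in the event loop, so the process can exit
  }
}, 10);
```

This time the program keeps going until the timer is cleared. The listener is identical in both versions; the only difference is whether anything was handed to the event loop.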

2016/9/23: A commenter mentioned that if you run the process.nextTick immortal dog example in Node 4.4.7 on Windows, it won't overflow! I find that very interesting!

Conclusion

In these two posts, we have completely separated the Node.js concepts of event loop and EventEmitter and explained them independently. They’re not meant to be together to begin with, and EventEmitter is definitely not an abstraction of the event loop. Once you're crystal clear about the two concepts, you’ll feel very comfortable using them together.

I hope this post can help other developers like me, who are passionate about JavaScript and Node.js, to learn more about the asynchronous behavior of Node.js and inspire them to continue to write better asynchronous code. Everyone is welcome to use and modify this post as teaching material. Meanwhile, if anyone spots a mistake, please tell me so we can make this post better and more accurate!

If you think the posts are written well, please recommend them to your friends. I don’t know if this is useful for front-end or not, so I was only going to post this on Node.js TW. Though, please feel free to repost this.


This post was translated to English by Codementor’s content team. Here’s the original Chinese post by Simen Li.
