A heads up - I know this is in general an old subject. I wanted to throw in my two cents and to show how to use gulp to automatically build complex inline workers.

While developing the new collision system for BabylonJS I was discussing the development stages with David Catuhe, the framework's creator and main developer. At a certain point he asked me to see whether it would be possible to keep serving a single file for the framework. He wanted me to integrate the web worker's code into the framework's main file.

This is actually a pretty clever idea. This way it will be transparent to the developer. Everything is included in the file he just added as a source. No external dependencies needed. No chance to forget to include a file in the bundle, as there is no file. A one stop shop. Sounds great!

This is, however, a harder task than it sounds. A web worker is expected to be loaded from an external file. I have explained my point of view about this in part 1 of collision detection for babylon. Let me elaborate a bit more.

Why is the worker specification too limiting

(Very short disclaimer - this is really an opinion, I have no idea if I am right!)

JavaScript is single threaded. The best thing about a single-threaded language is the fact that there are no race conditions. No deadlocks. There is no possible way for me to accidentally alter a variable that is used in another thread and thereby influence my entire application. That's a huge benefit.

Web workers introduced a form of "multi threading" for JavaScript. Why "a form of"? For two reasons:

  1. They process messages asynchronously.
    Even if the main thread sends two consecutive messages (even from the same function), they will be processed in message-queue style on the worker's thread. The second is processed only after the worker has finished processing the first. The same goes for the main thread receiving the answer from the worker.
  2. They share nothing with the main thread
    A worker has its own context. In a multi-threaded language (Java, C++, etc.), you could pass a reference to a variable between threads. Changing the variable in one thread would also change it in the other. With a JavaScript worker, moving variables between the two threads is done using either data serialization or transferable objects. Data serialization means that when an object is sent to a worker, it is copied (using the structured clone algorithm, which for plain data behaves much like a JSON round trip) and the copy is delivered to the worker:
//Main thread
var myWonderfulObject = { "theAnswer":42 };  
//Object is serialized (copied) and sent.
worker.postMessage(myWonderfulObject);  
//Changing the data won't influence the data already sent to the worker:
myWonderfulObject["theAnswer"] = 45;  
.....

//Worker
self.onmessage = function(event) {  
  var objectSentFromMainThread = event.data;
  console.log(objectSentFromMainThread["theAnswer"]); //Prints 42
  //Changing the object here won't influence the main thread
  objectSentFromMainThread["theAnswer"] = 44;
  postMessage(objectSentFromMainThread); //Will send a serialized copy of this object.
  //Object can (and will) now be garbage collected.
}

The same separation applies when moving transferable objects. Transferable objects (ArrayBuffer objects being the most common example) are created on one thread and moved to a different one. Ownership of the object moves to the new context, and the object is no longer usable in the thread that sent it.
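The ownership move can be observed without a worker at all. The sketch below assumes a runtime that provides the global structuredClone function (modern browsers, Node 17 and later), which accepts a transfer list just like postMessage does. The variable names are made up for illustration.

```javascript
// Assumption: the global structuredClone() exists (modern browsers, Node 17+).
// It accepts a transfer list, like worker.postMessage(buffer, [buffer]).
var buffer = new ArrayBuffer(16);
new Uint8Array(buffer)[0] = 42;

// Move (not copy) the buffer to a new owner.
var moved = structuredClone(buffer, { transfer: [buffer] });

// The original buffer is now detached, so its length drops to 0.
var originalLength = buffer.byteLength;

// The new owner holds all 16 bytes, including the value written above.
var movedLength = moved.byteLength;
var firstByte = new Uint8Array(moved)[0];
```

With a real worker the second argument of postMessage plays the role of the transfer option used here.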

These two reasons are the basis of my assumption: the only way to prevent developers from trying to "hack" the system and move variables between two "threads" is to completely separate them. The best way of separating the two contexts is to put them in two different files.
I say "trying", since it will not be possible. But if a worker could be initialized using a function, inexperienced developers wouldn't understand why this doesn't work:

var foo = "bar";

var workerFunction = function() {  
  console.log(foo); 
}

//If this was possible (it isn't!), the result in the console would be the exception "foo is not defined", as the context changed.
var worker = new Worker(workerFunction);  

This is basic JavaScript usage. Variables declared in a higher scope are available in a deeper scope. But a worker has its own context (again, to prevent foo from being altered in the worker and thereby causing a problem in the main thread). So it wouldn't work.

Initializing a worker using a ("stringed") function

After reading the last passage - is it at all possible?

The worker's constructor accepts a URL as its first argument, which is the location of the worker's file. A URL can also be created for a Blob that we create in the main thread. Combining them both:

var blob = new Blob(["self.onmessage = function(event) { postMessage(event.data); }"], {type: 'application/javascript'});  
var worker = new Worker(URL.createObjectURL(blob));  
worker.onmessage = function(event) {  
  console.log(event.data); //Prints "hello" - the worker echoes the message back
};
worker.postMessage("hello"); // send a message to the worker  

Will work wonderfully.

You might have already noticed the catch - the function must be a string. I have explained why (I think) it cannot be a real function before.
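One way to soften that catch is to write the worker body as a normal function and turn it into a string with Function.prototype.toString(). This is only a sketch, assuming the function is fully self-contained; toWorkerSource, echoWorker and closureDemo are hypothetical names, not part of any library or of BabylonJS.

```javascript
// Hypothetical helper: wraps a function's source in an immediately invoked
// expression, so it runs as soon as the worker script is loaded.
function toWorkerSource(fn) {
  return "(" + fn.toString() + ")();";
}

// The worker body, written as real JavaScript instead of a string literal.
function echoWorker() {
  self.onmessage = function (event) {
    postMessage(event.data);
  };
}

var workerSource = toWorkerSource(echoWorker);

// Only the text of a function survives the conversion. Variables it closed
// over are not carried along, which is the limitation described above.
function closureDemo() {
  var secret = "bar";
  return function () { return typeof secret; };
}
var original = closureDemo();
var recreated = new Function("return (" + original.toString() + ")();");
var seenByOriginal = original();
var seenByRecreated = recreated();

// Browser-only usage, guarded so the sketch also loads outside a browser.
if (typeof window !== "undefined" && window.Worker && window.Blob &&
    window.URL && window.URL.createObjectURL) {
  var blob = new Blob([workerSource], { type: "application/javascript" });
  var inlineWorker = new Worker(URL.createObjectURL(blob));
  inlineWorker.onmessage = function (event) { console.log(event.data); };
  inlineWorker.postMessage("hello");
}
```

The function still runs in the worker's context only, so anything it needs has to be defined inside it or sent to it as a message.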

(Update) Browser compatibility

After I received an interesting comment on Google+, I realized I hadn't talked about browser compatibility of this feature.
The main reason it didn't concern me during the development of the workers for BabylonJS is that BabylonJS is a WebGL-based framework, and thus only works on the newest browsers. All of those browsers (including IE11) support this feature, so there was no need for me to really check for compatibility.

But since it does concern most other developers, I can expand a bit on browser support for this feature.

According to Can I Use, Blob URLs (and of course Blobs) are supported on all modern browsers, excluding Opera Mini and IE < 10. That means that technically it should work on all of those, including Safari for iOS and the Android browser.
IE10 is however an interesting exception. IE10 does support the Blob URL feature, but doesn't support initializing workers from Blob URLs, because a wrong security exception is thrown when doing so. This was reported and fixed for IE11, where it works wonderfully.

So, there you have it. It is already widely supported. Implementing a fallback is very simple (I show how at the end of this post).
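Since IE10 passes the feature checks and only then throws, a fallback has to catch the exception rather than just test for Blob and URL. A minimal sketch, assuming the worker code is also available as a separate file; createWorker and its env parameter (a stand-in for window, which makes the function easy to try out with fake objects) are hypothetical names:

```javascript
// Hypothetical helper: tries the inline (Blob URL) worker first and falls
// back to a regular worker file if that is unsupported or throws.
function createWorker(source, fallbackUrl, env) {
  env = env || (typeof window !== "undefined" ? window : {});
  if (!env.Worker) {
    return null; // No worker support at all.
  }
  var canInline = typeof source === "string" && env.Blob &&
                  env.URL && env.URL.createObjectURL;
  if (canInline) {
    try {
      var blob = new env.Blob([source], { type: "application/javascript" });
      return new env.Worker(env.URL.createObjectURL(blob));
    } catch (e) {
      // For example the security exception IE10 throws for Blob URLs.
    }
  }
  return new env.Worker(fallbackUrl);
}
```

In a page this would be called as createWorker(workerCodeString, "workerFile.js"), where both arguments are placeholders for your own string and file name.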

Automating inline worker build process

We have solved the first problem - we can now convert our code to a string (so far manually) and load it as worker code. We still have a problem - this is not automated. If you want to change the worker's code, and still keep it in the code base as a real JavaScript file, you will have to constantly change two files:

  1. The worker's file (real JavaScript)
  2. The main code file (a "stringed" version of it).

For a truly lazy developer (aren't we all?) this is really bad. Plus we all (hopefully) once read in The Pragmatic Programmer that this is a BIG no-no.

Gulp to the rescue

Gulp is just an example, as we are using gulp for building BabylonJS. Any decent, extensible build system will do.

Let's slowly write the gulp tasks needed.

To automate the worker build process we will have to:

  • Load all files needed for the worker

    This is easily done with gulp:

gulp.task("worker", function() {  
  return gulp.src(["file1.js", "file2.js"]);
});
  • Convert those to a single string with a predefined variable name

    This is done using a simple gulp plugin I wrote that I like to call gulp-srcToVariable. It is not an official gulp plugin, as I am not too sure where else someone would need something like that. But, here is the code:

var through = require('through2');  
var gutil = require('gulp-util');  
var PluginError = gutil.PluginError;  
var path = require('path');  
var File = gutil.File;

// Consts
const PLUGIN_NAME = 'gulp-srcToVariable';

/**
* varName - the name of the string variable
* asMap - this is a special case for babylonjs that will add each file to a map of strings and not to a single string.
* namingCallback - connected to the asMap variable - how to name the keys in the map.
*/
var srcToVariable = function srcToVariable(varName, asMap, namingCallback) {

    var content;
    var firstFile;

    namingCallback = namingCallback || function(filename) { return filename; };

    function bufferContents(file, enc, cb) {
    // ignore empty files
    if (file.isNull()) {
      cb();
      return;
    }

    // no stream support, only files.
    if (file.isStream()) {
      this.emit('error', new PluginError(PLUGIN_NAME,  'Streaming not supported'));
      cb();
      return;
    }

    // set first file if not already set
    if (!firstFile) {
      firstFile = file;
    }

    // construct content instance
    if (!content) {
      content = asMap ? {} : "";
    }
    // add file to content instance
    if(asMap) {
        var name = namingCallback(file.relative);
        //add the file's content as a string to the map
        content[name] = file.contents.toString();
    } else {
        //add the file's content as a string to the files that were added so far.
        content += file.contents.toString();
    }
    cb();
  }

  function endStream(cb) {
    if (!firstFile || !content) {
      cb();
      return;
    }


    var joinedPath = path.join(firstFile.base, varName);
    //The content of the file sent back to gulp is varName = stringify(content).
    var joinedFile = new File({
      cwd: firstFile.cwd,
      base: firstFile.base,
      path: joinedPath,
      contents: Buffer.from(varName + '=' + JSON.stringify(content) + ';')
    });

    this.push(joinedFile);
    cb();
  }
  return through.obj(bufferContents, endStream);
}

module.exports = srcToVariable;  

Adding this to our gulp task:

var srcToVariable = require("./gulp-srcToVariable");

gulp.task("worker", function() {  
  return gulp.src(["file1.js", "file2.js"]).pipe(srcToVariable("foo"));
});
  • Add variable to the main code

Using the wonderful merge2 you can now add the worker task to the build process. This requires a little hack:

//the worker task was altered a bit
var workerStream;  
gulp.task("worker", function(cb) {  
  workerStream = gulp.src(["file1.js", "file2.js"]).pipe(srcToVariable("foo"));
  cb();
});

//Worker is a dependency, so it will be executed before
gulp.task("build", ["worker"], function () {  
    return merge2(
        gulp.src([...main files...]),
        // the workerStream result will be merged together with the main files.
        workerStream
    );
});

The build task first executes the worker task, thus creating the stream stored in workerStream. That stream is later combined with the rest of the framework's files.

A small recommendation would be to uglify the worker's code - this will simply "compress" the string that the worker task generated:

var uglify = require("gulp-uglify");

//the worker task was altered a bit
var workerStream;  
gulp.task("worker", function(cb) {  
  workerStream = gulp.src(["file1.js", "file2.js"]).pipe(uglify()).pipe(srcToVariable("foo"));
  cb();
});

Now we are all set! foo will be declared at the end of the single file generated.
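To make that concrete, here is roughly what the plugin hands back to gulp, reproduced without gulp. This sketch only mirrors the varName + '=' + JSON.stringify(content) + ';' line from the plugin above; buildAssignment and the two sample "files" are made up for illustration.

```javascript
// Hypothetical stand-in for the plugin's final step.
function buildAssignment(varName, content) {
  return varName + "=" + JSON.stringify(content) + ";";
}

// Two tiny "files", containing quotes and line breaks on purpose.
var file1 = 'var greeting = "hello";\n';
var file2 = "self.onmessage = function (e) { postMessage(greeting); };\n";

// Single-string mode: file contents are concatenated in order.
var asString = buildAssignment("foo", file1 + file2);

// Map mode (asMap = true): one entry per file name.
var asMap = buildAssignment("foo", { "file1.js": file1, "file2.js": file2 });

// JSON.stringify escapes quotes and line breaks, so the generated line is a
// valid JavaScript statement that evaluates back to the original text.
var roundTrip = new Function("var foo; " + asString + " return foo;")();
```

This escaping is the reason plain string concatenation with quotes around the file contents would not be enough: the first quote or line break inside the worker code would break the generated statement.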

  • Use the worker's variable in the code
//In one of the main files
var worker;
window.onload = function() {
//foo is defined here only if the built file was loaded. Let's check!
//Feature detection included - make sure everything is supported
  if(!window.Worker) {
    //Oh no! no worker support... 
    return;
  }
  //Make sure the worker string, Blob and createObjectURL are available.
  //typeof is used because referencing an undeclared foo would throw a ReferenceError.
  if(typeof foo !== "undefined" && window.Blob && window.URL && window.URL.createObjectURL) {
    //worker's string was loaded successfully
    var blob = new Blob([foo], {type: 'application/javascript'}); 
    //Continue with the worker blob to url initialization
    worker = new Worker(URL.createObjectURL(blob));
  } else {
    //Fallback! Can be used for debugging purposes.
    worker = new Worker("workerFile.js");
  }
}

This way, if foo is not defined, it means that we are not using the built framework. We are currently debugging the application, using the separate source files.

We are now keeping our code base clean, no duplication or redundancy. This is the way I have implemented it for BabylonJS (the entire gulp code can be found here).

If you have any questions about the entire process, I will be more than happy to answer! Just comment and wait a bit :-)


I'm an IT consultant, full stack developer, husband, and father. In my spare time I contribute to the Babylon.js WebGL game engine and other open source projects.

Berlin, Germany