Thursday, November 28, 2013

Making Sense of Modules in TypeScript, Node.js/CommonJS, and AMD/RequireJS

After a little while in the land of pure JavaScript I've decided that I want to have another go at Node.js in TypeScript. I haven't looked into TypeScript development in about 9 months, and in that time a lot has changed. At version 0.9.1-1, the language has almost matured to that magical version 1.0, Visual Studio has added some new features that make some of the hacks that I described in my early blog posts unnecessary, and IDEs like WebStorm have done a great job of making it easy to work with TypeScript and Node.js, as they demonstrate in this video. I'm hoping to write a full run-down on exactly what's happening in that video, but as I tried to follow along with it, I ran into some confusion about how exactly modules work in TypeScript. Since explaining is a good way of learning, today I'm going to go through what I learnt as I tried to make sense of it all. Fair warning: I'm not an expert, so I can't guarantee that everything I say is absolutely correct, but I can at least promise that it works.

Instead of going straight into the TypeScript implementation, it's best to start with JavaScript, which has two main module systems: the AMD (Asynchronous Module Definition) API, which is implemented by loaders such as RequireJS and supported by libraries such as jQuery, and CommonJS's module specification, which is implemented by Node.js. Both of these systems attempt to solve the problem of compartmentalizing sections of code without polluting the global scope.

In AMD, modules are defined using the define function, which takes an optional name (if you omit it, the module is named after its file, which in general is what you want), an array of dependencies, and a factory function whose parameters receive those dependencies in order. Only the value returned from the factory function is the public interface of the module, so variables and functions can be hidden within the module definition. Modules can be consumed using the require function, which handles dependencies in the same way as define but does not itself define a module, so nothing it returns is exported. The AMD implementation can see a module's dependencies and load them asynchronously on demand, so this form of dependency management is usually used on the client side, where bandwidth and speed are important. An example of some JavaScript code that uses the AMD API is shown below:
// File: subdir/dependency1.js
define(function () {
    return function () {
        console.log("Hi from dep1!");
    };
});
// File: dependency2.js
define('dependency2', ['subdir/dependency1'], function (dep1) {
    var private = "Private";
    return {
        public: "Public",
        func: function () {
            console.log("Hi from dep2! " + private);
            dep1();
        }
    };
});
// File: main.js
require(["subdir/dependency1", "dependency2"], function(dep1, dep2) {
    dep1();
    dep2.func();
    console.log(dep2.public);
    console.log(dep2.private);
});
// Output
Hi from dep1!
Hi from dep2! Private
Hi from dep1! 
Public
undefined 
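
To convince myself of how the pieces fit together, here is a minimal sketch of the define/require contract written in TypeScript. It is not RequireJS: the names miniDefine, amdResolve, amdRequire and amdLog are made up for this illustration, everything runs synchronously in memory with no script loading, and circular dependencies are not handled. I also renamed private/public to hidden/visible because those words are reserved in strict-mode code.

```typescript
// Hypothetical, simplified stand-in for an AMD loader (assumption: synchronous, in-memory only).
type AmdFactory = (...deps: any[]) => any;

interface AmdEntry {
    deps: string[];
    factory: AmdFactory;
    resolved: boolean;
    value?: any;
}

const amdRegistry: { [name: string]: AmdEntry } = {};
const amdLog: string[] = [];

// Register a module: nothing runs yet, the factory is only stored.
function miniDefine(name: string, deps: string[], factory: AmdFactory): void {
    amdRegistry[name] = { deps, factory, resolved: false };
}

// Run a module's factory once, after resolving its dependencies, and cache the result.
function amdResolve(name: string): any {
    const entry = amdRegistry[name];
    if (!entry) {
        throw new Error("Module not defined: " + name);
    }
    if (!entry.resolved) {
        entry.value = entry.factory(...entry.deps.map(amdResolve));
        entry.resolved = true;
    }
    return entry.value;
}

// Resolve the dependencies and hand them to the callback; nothing is exported from here.
function amdRequire(deps: string[], callback: AmdFactory): void {
    callback(...deps.map(amdResolve));
}

miniDefine("subdir/dependency1", [], () => () => {
    amdLog.push("Hi from dep1!");
});

miniDefine("dependency2", ["subdir/dependency1"], (dep1: () => void) => {
    const hidden = "Private"; // stays inside the factory's closure
    return {
        visible: "Public",
        func: () => {
            amdLog.push("Hi from dep2! " + hidden);
            dep1();
        }
    };
});

amdRequire(["subdir/dependency1", "dependency2"], (dep1, dep2) => {
    dep1();
    dep2.func();
    amdLog.push(dep2.visible);
    amdLog.push(String(dep2.hidden)); // "undefined": only the returned object is public
});

console.log(amdLog.join("\n"));
```

The point of the sketch is that the factory's return value is the only thing the rest of the program ever sees, which is why hidden is reachable from func but not from the caller.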

CommonJS is similar but does not use the same scoping system. Each file is a module whose public interface is defined by the properties added to an "exports" variable (or by whatever is assigned to module.exports), and you pull in a module and assign it to a variable by calling require. Besides syntax, the main difference between CommonJS and AMD is that in CommonJS modules are loaded synchronously at the point where require is called, which suits the local file system, so it is mostly used in server-side code. CommonJS code that produces the same output as the code above is shown below:
// File: subdir/dependency1.js
module.exports = function () {
    console.log("Hi from dep1!");
};
// File: dependency2.js
var dep1 = require('./subdir/dependency1');
var private = "Private";
exports.public = "Public";
exports.func = function () {
    console.log("Hi from dep2! " + private);
    dep1();
};
// File: main.js
var dep1 = require('./subdir/dependency1');
var dep2 = require('./dependency2');
dep1();
dep2.func();
console.log(dep2.public);
console.log(dep2.private);
// Output
Hi from dep1!
Hi from dep2! Private
Hi from dep1! 
Public
undefined 
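
The relationship between exports and module.exports tripped me up, so here is a rough sketch of the CommonJS contract in TypeScript. This is not how Node.js is actually implemented: cjsRequire, cjsSources, cjsCache and cjsOutput are hypothetical names, and each "file" is just a function stored in an object. It only shows that a module body receives require, module and exports, that the caller gets back module.exports, and that a module runs once and is then cached.

```typescript
// Hypothetical miniature of the CommonJS contract (assumption: "files" are in-memory functions).
interface CjsModule {
    exports: any;
}
type CjsBody = (require: (id: string) => any, module: CjsModule, exports: any) => void;

const cjsSources: { [id: string]: CjsBody } = {};
const cjsCache: { [id: string]: CjsModule } = {};
const cjsOutput: string[] = [];

// Load a module synchronously at the point of the call, running its body at most once.
function cjsRequire(id: string): any {
    if (cjsCache[id]) {
        return cjsCache[id].exports;
    }
    const body = cjsSources[id];
    if (!body) {
        throw new Error("Cannot find module: " + id);
    }
    const loaded: CjsModule = { exports: {} };
    cjsCache[id] = loaded;
    // "exports" starts out as an alias for "module.exports".
    body(cjsRequire, loaded, loaded.exports);
    return loaded.exports;
}

// Replaces the whole public interface by assigning to module.exports.
cjsSources["./subdir/dependency1"] = (_require, module) => {
    module.exports = () => {
        cjsOutput.push("Hi from dep1!");
    };
};

// Adds properties to the existing exports object instead.
cjsSources["./dependency2"] = (require, _module, exports) => {
    const dep1 = require("./subdir/dependency1");
    const hidden = "Private"; // local to the module body, never exported
    exports.visible = "Public";
    exports.func = () => {
        cjsOutput.push("Hi from dep2! " + hidden);
        dep1();
    };
};

const cjsDep2 = cjsRequire("./dependency2");
cjsDep2.func();
cjsOutput.push(cjsDep2.visible);
cjsOutput.push(String(cjsDep2.hidden)); // "undefined"

console.log(cjsOutput.join("\n"));
```

Because the caller always receives module.exports, reassigning the plain exports variable inside a module body would have no effect, whereas adding properties to it does.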

And now we get to TypeScript, which has its own module syntax. Modules are defined using the "module" keyword, and their public interfaces are defined by exporting things from inside the module. Functions and variables can also be exported in the file, without them needing to be wrapped in a module. Modules and exported variables are imported using syntax similar to CommonJS, but with the import keyword in place of var. An example is shown below:
// File: externalModule.ts
export module ExternalModule {
 export function public () {
  console.log("ExternalModule.public");
 };
 function private () {
  console.log("ExternalModule.private");
 }
}

export function ExportedFunction() {
 console.log("ExportedFunction");
}
// File: main.ts
module InternalModule {
 export function public () {
  console.log("InternalModule.public");
 };
 function private () {
  console.log("InternalModule.private");
 }
}
InternalModule.public();
//InternalModule.private(); // Does not compile

import externalModule = require("./externalModule");
externalModule.ExportedFunction();
externalModule.ExternalModule.public();
//externalModule.ExternalModule.private(); // Does not compile
// Output:
InternalModule.public
ExportedFunction
ExternalModule.public

In my opinion, since this syntax is more explicit, it is cleaner and easier to understand than AMD or CommonJS modules, but here's where things get confusing. Since TypeScript compiles into JavaScript, you can actually compile this code into either AMD or CommonJS by using the --module flag on the tsc command.

For example, when you run tsc externalModule.ts --module "amd", you get:
define(["require", "exports"], function(require, exports) {
    (function (ExternalModule) {
        function public() {
            console.log("ExternalModule.public");
        }
        ExternalModule.public = public;
        ;
        function private() {
            console.log("ExternalModule.private");
        }
    })(exports.ExternalModule || (exports.ExternalModule = {}));
    var ExternalModule = exports.ExternalModule;

    function ExportedFunction() {
        console.log("ExportedFunction");
    }
    exports.ExportedFunction = ExportedFunction;
});

When you run tsc externalModule.ts --module "commonjs" you get:
(function (ExternalModule) {
    function public() {
        console.log("ExternalModule.public");
    }
    ExternalModule.public = public;
    ;
    function private() {
        console.log("ExternalModule.private");
    }
})(exports.ExternalModule || (exports.ExternalModule = {}));
var ExternalModule = exports.ExternalModule;

function ExportedFunction() {
    console.log("ExportedFunction");
}
exports.ExportedFunction = ExportedFunction;
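
The wrapper that appears in both outputs looked odd to me at first, so here is a small sketch that isolates it. The compiledExports object is a hypothetical stand-in for the exports object that the module loader would provide, and I've renamed the functions to visible/hidden/extra; the shape of the wrapper is otherwise the same as in the generated code above.

```typescript
// Stand-in for the loader-provided "exports" object (an assumption for this sketch).
const compiledExports: { [key: string]: any } = {};

// The argument expression reuses exports.ExternalModule if it exists, or creates it.
(function (ExternalModule: any) {
    function visible(): string {
        return "ExternalModule.public";
    }
    ExternalModule.visible = visible; // exported: attached to the shared object

    function hidden(): string {
        return "ExternalModule.private";
    }
    // hidden is never attached, so it is unreachable from outside this function
})(compiledExports.ExternalModule || (compiledExports.ExternalModule = {}));

// Running the same pattern again adds to the same object rather than replacing it.
(function (ExternalModule: any) {
    ExternalModule.extra = () => "ExternalModule.extra";
})(compiledExports.ExternalModule || (compiledExports.ExternalModule = {}));

console.log(compiledExports.ExternalModule.visible());
console.log(typeof compiledExports.ExternalModule.hidden);
console.log(compiledExports.ExternalModule.extra());
```

So "export" inside a module boils down to one extra assignment onto the module object, and anything without it simply stays a local function inside the wrapper.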

To further add to the confusion, since all valid JavaScript is also valid TypeScript, there is nothing stopping you from mixing and matching TypeScript modules, AMD modules, and CommonJS modules (as long as you have an implementation of the module loader e.g. RequireJS for AMD and Node.js for CommonJS). Given the recent confusion I experienced I would recommend you just stick with the TypeScript syntax and have that compile into CommonJS for Node.js server-side code and AMD for client-side code.

One last thing I want to mention is the reference path syntax in TypeScript, because that can add to the module confusion a bit. As an example, let's make a simple class:
// File: MyClass.ts
module MyModule {
    export class MyClass {
        constructor(public str: string) {}
        public func() {
            console.log(this.str);
        }
    } 
}

We can use this class in another file by referencing it:
// File: main.ts
/// <reference path="MyClass.ts" />
var myclass = new MyModule.MyClass("test");
myclass.func();

So why does this work without us needing to import any modules? Because MyModule.MyClass is compiled into a plain variable, and TypeScript doesn't know how you're going to load your JavaScript files. You could easily have an HTML file that includes both of these files in script tags and it would work fine. What the reference tag does is tell the compiler where to find definitions, so it can tell that the MyClass class is within the MyModule module and that it has a func function, so the code in main.ts is valid. When you're writing Node.js code, however, modules need to be loaded using CommonJS, so you need to use import/require commands in addition to referencing the TypeScript files or definitions (which are still required for type checking).


Hope this helps someone, here is some recommended reading/watching:
