Write your first Node.js Express app with EclairJS

If you’re a developer, you have almost certainly come across Node.js. It is known for scaling well, that is, it can handle very large numbers of simultaneous requests. To keep up with all these requests, Node hands off any significant processing to other engines, such as Apache Spark.

Apache Spark is fast because it has parallelized operators which work on data kept in memory whenever possible, and it is massively scalable from your laptop to thousands of nodes in distributed clusters.

To take advantage of Spark from JavaScript, you should simply be able to “npm install” an appropriate module, require that module in your Node application, and then write JavaScript code with Spark operators and objects that act just like other JavaScript constructs. You can’t do this with Spark as it arrives out of the box. Once you install EclairJS, however, you will be able to integrate Apache Spark into your web applications as described.

EclairJS, developed by IBM, is particularly suited to applications that are “chatty”, such as user-facing and interactive applications that use input from users to drive analyses or other operations that are performed in Spark. Of course, the ultimate “chatty” applications are those that handle streams of data, such as applications that continuously update visualizations like maps or graphs. EclairJS uses web sockets to communicate between Node applications and Spark, and these are ideally suited to handling streaming data.

So, let’s start writing our first Node.js Express application with EclairJS. We are going to expand on the Spark “word count” example — Spark’s equivalent of “Hello World” — and turn it into a Node.js Express application using EclairJS. At the end of this post, you should have a basic template you can use for building more complex Node.js web applications using EclairJS.

First, we will add code to the repository to lay the foundation for the Node application, then add the EclairJS-specific piece, and then hook it up to the UI via a web socket layer. Finally, I’ll show you how to run Spark on your local machine and make it work for the application.

Before you start

To get the most out of this exercise, you should familiarize yourself with any of the technologies I mentioned that are new to you. Also, before we get started, please make sure you have both git and Node.js installed on your local machine.

Set things up

Set up a GitHub repo, initialize it with a readme file, clone it to your local machine, and then add a package.json file.

This file helps manage the dependencies for your application. You will need both the latest Express and EclairJS modules. Create a new file in the root directory of your newly cloned repo called “package.json” and copy and paste the following into it:

{
  "name": "myEclairJSApp",
  "version": "0.0.1",
  "private": true,
  "main": "app.js",
  "dependencies": {
    "express": "~4.14.0",
    "eclairjs": "^0.9",
    "pug": "*"
  }
}

We will use the package.json file we just created with npm to pull the module packages we need locally into our repo.

From the root directory of your repository, run the command npm install --user. You should now see a “node_modules” directory which contains the eclairjs, express, and pug directories (along with any dependencies npm installs alongside them).

Now create a few other directories for your repository where you can put various application assets. In the root directory of your repo, create the following sub-directories:

  • routes
  • views
  • data
  • public
  • public/javascript
  • public/stylesheets
  • public/images

The Node.js Controller Code

Next, we need our top-level JS file where the majority of our application controller code will go. We named it app.js in the “main” field of our package.json file, so you should use that name. In the root directory of your repo, create a file called “app.js” and add the following code:

/**
 * app.js - Main Node.js Express Application Controller
 */

// Module dependencies.
var express = require('express'),
    pug = require('pug'),
    path = require('path'),
    home = require('./routes/home');

var app = express();

// Setup the application's environment.
app.set('port',  process.env._EJS_APP_PORT || 3000);
app.set('host',  process.env.EJS_APP_HOST || 'localhost');
app.set('view engine', 'pug');
app.set('views', __dirname + '/views');
app.use(express.static(path.join(__dirname, 'public')));

// Route GET requests for the root URL to our home page handler
app.get('/', home.index);

// Start listening for requests
var server = app.listen(app.get('port'), app.get('host'), function(){
    console.log('Express server listening on port ' + app.get('port'));
});

Fine, so what’s going on in the above code? The first section loads the modules needed for the Node.js application. Notice how they correspond to the dependencies in the package.json file.

We also require the module “path”, which is part of Node.js core, and “home”, which we will create ourselves under our routes directory. We then create an instance of an Express object which comprises our Node.js app and set up our environment for it. We tell the application that our HTML templates will live under the subdirectory “views” that we created in the last section and that we are using Pug for our HTML template engine.

Telling the application that we are using the “public” directory for all of our static content means Express will look there for any CSS, JS, image, or other static files, so only put non-sensitive assets there. Express treats “public” as the root for these files, so when we refer to a public asset we don’t include “public” in the path. Finally, we route GET requests for “/” to our “home” or index page, which we will create in the next section.
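
To make the static path rule concrete, here is a minimal sketch of how a request URL maps to a file under the static root. The helper name resolveStaticPath and the /repo root are hypothetical, and this only illustrates the mapping; it is not how Express itself is implemented.

```javascript
// Sketch only: illustrates how a URL path maps onto the static root directory.
// resolveStaticPath and '/repo' are hypothetical names, not part of Express.
var path = require('path');

function resolveStaticPath(staticRoot, urlPath) {
  // path.posix keeps the output identical on every platform.
  return path.posix.join(staticRoot, urlPath);
}

var staticRoot = path.posix.join('/repo', 'public');
console.log(resolveStaticPath(staticRoot, '/stylesheets/style.css'));
// prints "/repo/public/stylesheets/style.css"
```

The URL never mentions the public directory, which is why the stylesheet link in the templates starts at /stylesheets.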

Create your home page

First, create the HTML template for your home page. Pug, our templating engine, supports inheritance: you can put any HTML that all of your web application’s views should share in one top-level file and extend that file for all subsequent pages.

Create the top-level file first by editing a new file under the “views” directory called “layout.pug”. Cut and paste the following into it:

doctype html
html
  head
    title= title
    link(rel='stylesheet', href='/stylesheets/style.css')
 
  body
    block content

Now, create your home page by editing a new file, again under the “views” directory, called “index.pug”. Cut and paste the following into it:

extends layout.pug

block content
  h1= title
  p.header Welcome to #{title}
  p
    button.count(onclick="clickme()") Count!
  p
    div(id="countResults" class="results")
      ul(id="topTen")
  p

Notice how you are extending the layout template you already created and defining the block content that will comprise the body of the page. This is a nifty aspect of a templated HTML engine: for example, create your header and footer at the top level and they will appear in all your pages. I highly suggest reading up on using Pug with Node.js apps if it’s not already in your toolbox.

Note: We will hook up the button and use the results div and ul elements later when we get results back from Spark via EclairJS. For now, they are just placeholders.

Under your “routes” directory, create a file called home.js and copy and paste the following code into it:

/**
 * home.js - Default route for GET requests to home page.
 */

exports.index = function(req, res){
  res.render("index", {title: "Using EclairJS to Count Words in a File"});
};

As you can see, we set up one export for this file called “index”. Using the response object passed to it, it renders the “index” page from the index.pug file under the views directory and substitutes the “title” variable with the value supplied.
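
If you want to check the route handler before starting Express, one option is to call it with a stand-in response object. The sketch below inlines the same handler and uses a hypothetical fakeRes object that only records the call to render(); it is an illustration, not an Express testing API.

```javascript
// Sketch only: exercises the route handler with a hand-rolled stand-in for
// the Express response object. fakeRes is hypothetical, not an Express API.
var home = {
  index: function (req, res) {
    res.render('index', { title: 'Using EclairJS to Count Words in a File' });
  }
};

var rendered = null;
var fakeRes = {
  // Record what the handler asked Express to render.
  render: function (view, locals) {
    rendered = { view: view, locals: locals };
  }
};

home.index({}, fakeRes);
console.log(rendered.view, '->', rendered.locals.title);
// prints "index -> Using EclairJS to Count Words in a File"
```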

You need to create one more file before you can validate your web application template. In layout.pug, you added a reference to a CSS file in the head. Let’s create that now with a few basic definitions, and you can add to it later if you want.

Create a new file under the sub-directory public/stylesheets/ called “style.css”. Cut and paste the following into it:

body {
  padding: 50px;
  font: 14px "Lucida Grande", Helvetica, Arial, sans-serif;
}

a {
  color: #00B7FF;
}

a.footer {
    font-size: .8em;
    text-decoration: none;
}

p.header {
    margin-bottom: 50px;
}

button.count {
    font-size: 1em;
}

At this point, you can test the skeleton that you just set up to validate your template structure. Then you can add in the application-specific logic.

Run the Node.js Express app

To run your Node.js Express app, simply use the command node --harmony app.js. The --harmony flag tells Node to enable ECMAScript 6 features of JS. You should see the output “Express server listening on port 3000” from the console.log() you added to the callback of the app.listen() function that starts up our server.

So now that you have your Node.js Express application skeleton up and running, let’s add some sweetness to it via EclairJS. If you still have your node server running, kill it for now. We’ll start it up again later.

Add the EclairJS code to talk to Spark

Now the fun really begins. With EclairJS you can enable your web application to make Spark calls in JavaScript! We are going to take the Spark “word count” example and hook it up to our Express application. This will show you how to plug EclairJS into Node.js with a simple front-end.

Create a new file in the root of your repo and call it “count.js”. This is where all of our EclairJS code will go. Cut and paste the following into it:

/**
 * count.js - EclairJS code that talks to Spark to run analytics (word count) on a text file.
 */

var eclairjs = require('eclairjs');
var spark = new eclairjs();

var sparkSession = spark.sql.SparkSession.builder()
  .appName("Word Count")
  .getOrCreate();

function startCount(file, callback) {
    var rdd = sparkSession.read().textFile(file).rdd();

    var rdd2 = rdd.flatMap(function(sentence) {
      return sentence.split(" ");
    });

    var rdd3 = rdd2.filter(function(word) {
      return word.trim().length > 0;
    });

    var rdd4 = rdd3.mapToPair(function(word, Tuple2) {
      return new Tuple2(word.toLowerCase(), 1);
    }, [spark.Tuple2]);

    var rdd5 = rdd4.reduceByKey(function(value1, value2) {
      return value1 + value2;
    });

    var rdd6 = rdd5.mapToPair(function(tuple, Tuple2) {
      return new Tuple2(tuple._2() + 0.0, tuple._1());
    }, [spark.Tuple2]);

    var rdd7 = rdd6.sortByKey(false);

    rdd7.take(10).then(function(val) {
      callback(val);
    }).catch(function(ex){console.log("An error was encountered: ",ex)});
}

// Create a wrapper class so we can interact with this module from Node.js.
function Count() {
}
Count.prototype.start = function(file, callback) {
  startCount(file, callback);
}

Count.prototype.stop = function(callback) {
  if (sparkSession) {
    console.log('stop - SparkSession exists');
    sparkSession.stop().then(callback).catch(callback);
  }
}

module.exports = new Count();

There are a number of things going on in count.js, so let me break them down. The Count class wraps the EclairJS logic and provides a bridge between Node.js and Spark, which lets you call the EclairJS side from the Node.js side. Its start and stop methods give you a way to control when the Spark analytics run. Calling start() invokes the local startCount() function, and that’s where the EclairJS/Spark code gets kicked off. It reads in a text file as lines, splits the lines into words, drops empty words, and lowercases each word before summing a count per distinct word. The (word, count) pairs are then flipped to (count, word) so they can be sorted by count in descending order, and the top 10 most commonly encountered words are passed back to the Node.js side.
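
To see the shape of the pipeline without a Spark cluster, here is a minimal sketch that runs the same steps on an in-memory string with plain JavaScript arrays. The function name topWords is hypothetical, and plain arrays stand in for the RDD operators, so this illustrates the logic only and is not EclairJS code.

```javascript
// Sketch only: the same pipeline steps as count.js, run on an in-memory
// string with plain arrays instead of Spark RDDs. topWords is a hypothetical name.
function topWords(text, limit) {
  var counts = Object.create(null);                  // no inherited keys
  text.split('\n')                                   // one entry per line
    .reduce(function (words, line) {                 // like flatMap
      return words.concat(line.split(' '));
    }, [])
    .filter(function (word) {                        // drop empty words
      return word.trim().length > 0;
    })
    .forEach(function (word) {                       // like mapToPair + reduceByKey
      var key = word.toLowerCase();
      counts[key] = (counts[key] || 0) + 1;
    });

  return Object.keys(counts)
    .map(function (word) { return [counts[word], word]; })  // flip to (count, word)
    .sort(function (a, b) { return b[0] - a[0]; })           // like sortByKey(false)
    .slice(0, limit);                                        // like take(limit)
}

console.log(JSON.stringify(topWords('the cat and the hat\nand the bat', 2)));
// prints [[3,"the"],[2,"and"]]
```

Each result is a (count, word) pair, which matches the order of the tuples that count.js hands back to Node.js.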

The last piece of code you need to add in this section will hook up the call to the EclairJS code to start the counting process and capture the results. Edit app.js and append the following to the end of the file:

var count = require('./count.js');
var file = 'file:/data/dream.txt';
count.start(file, function(results){
	//TODO:  SOMETHING BETTER WITH RESULTS HERE
	console.log('results: ',results);
});

// stop spark  when we stop the node program
process.on('SIGTERM', function () {
  count.stop(function() {
    console.log('SIGTERM - stream has been stopped');
    process.exit(0);
  });
});

process.on('SIGINT', function () {
   count.stop(function() {
    console.log('SIGINT - stream has been stopped');
    process.exit(0);
  });
});
 

As you can see, there is not too much to add on the Node.js side. The code requires the count.js module that you just created and calls start on the “count” object, passing the name of the file you wish to process and a callback function as its arguments. You can use any lengthy text file as an example; just make sure you place it in the “data” sub-directory we created in the beginning. I am using the “I Have a Dream” speech by Martin Luther King, Jr. The callback function processes the results you get back from EclairJS. For simplicity’s sake we are just logging them to the console. In the next section, you will use a web socket to update the UI with the results.

Add a web socket server

In the last section, you printed out your results to the console. We can use a web socket server in our node application to send our results to the UI and have it render them in our main view. There are a few things we need to do, but they are all easily done.

Note: The snippets below show the new code in the context of the existing files.

In the root directory of your repo, edit the package.json file and add the web socket server module by adding one new dependency. Your file should now look like this:

{
  "name": "myEclairJSApp",
  "version": "0.0.1",
  "private": true,
  "main": "app.js",
  "dependencies": {
    "express": "~4.14.0",
    "eclairjs": "^0.9",
    "pug": "*",
    "ws": "*"
  }
}

Run the command `npm install --user` again to pull the web socket server module locally into your node_modules directory.

Also from the root directory of your repo, edit the file app.js and require the new module for the web socket server near the top of the file.

// Module dependencies.
var express = require('express'),
    path = require('path'),
    pug = require('pug'),
    home = require('./routes/home');

var WebSocketServer = require('ws').Server;

var app = express();

In the same file, you need to create an instance of the web socket server and a listener function for messages coming from the UI. You also want to wrap your call to the EclairJS count object as a local function that you can call when you get a message from the UI to start counting.

// Start listening for requests
var server = app.listen(app.get('port'), app.get('host'), function(){
    console.log('Express server listening on port ' + app.get('port'));
  }
);

var wss = new WebSocketServer({
  server: server
});

wss.on('connection', function(ws) {
  ws.on('message', function(message) {
    console.log("*******",message);
    var msg = JSON.parse(message);
    if (msg && msg.startCount) {
        startCount();
    }
  });
});

var count = require('./count.js');
function startCount() {
    var file = 'file:/data/dream.txt';
    count.start(file, function(rawdata){
        // Recall raw data from EclairJS is Tuple2[] with {"0":count, "1":word}.
        // Convert to something the UI can easily use.
        //console.log("rawdata received from ejs: ",JSON.stringify(rawdata));
        var results = [];
        rawdata.forEach(function(result){results.push({count:result[0], word:result[1]})});
        wss.clients.forEach(function(client) {
            try {
                // Send the results to the browser
                client.send(JSON.stringify(results));
            } catch (e) {
                console.log(e);
            }
        });
    });
};

// stop spark  when we stop the node program
process.on('SIGTERM', function () {

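Notice how startCount() gets called only after a UI connection has been established and a “startCount” message has been received. This wrapped version replaces the direct count.start() call you appended to app.js in the previous section; the existing SIGTERM and SIGINT handlers that follow stay unchanged.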
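
The two small pieces of logic in that listing, deciding whether a message is a start request and reshaping the tuples for the UI, can be sketched as standalone functions. The names isStartMessage and toUiResults are hypothetical, and the try/catch around JSON.parse is an addition that the listing above does not have; it keeps a malformed message from throwing inside the listener.

```javascript
// Sketch only: helper names are hypothetical, not part of ws or EclairJS.

// Returns true when the message is valid JSON with a truthy startCount field.
function isStartMessage(message) {
  try {
    var msg = JSON.parse(message);
    return Boolean(msg && msg.startCount);
  } catch (e) {
    return false;   // Ignore anything that is not valid JSON.
  }
}

// Converts Tuple2-style entries ({"0": count, "1": word}) into UI objects.
function toUiResults(rawdata) {
  return rawdata.map(function (tuple) {
    return { count: tuple[0], word: tuple[1] };
  });
}

console.log(isStartMessage('{"startCount":true}'));   // prints "true"
console.log(isStartMessage('not json'));              // prints "false"
console.log(JSON.stringify(toUiResults([{ 0: 3, 1: 'the' }])));
// prints [{"count":3,"word":"the"}]
```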

Finally, you need to add the actual JS for the UI. Under the public/javascript subdirectory you created earlier, create a new file called “showresults.js”. Cut and paste the following into it.

var ws;
window.onload = function() {
    var port = location.port ? location.port : '80';
    ws = new WebSocket("ws://"+location.hostname+":"+port);

    // When a message is received from Node on the web socket display the results as a simple list.
    ws.onmessage = function(e) {
        if (e.data) {
            var list = document.getElementById("topTen");
            var data = JSON.parse(e.data);
            //console.log("data: ",data);
            data.forEach(function(item){
                var li = document.createElement("li");
                var text = document.createTextNode("The word '" +
                    item.word + "' appears " + item.count + " times in the text");
                li.appendChild(text);
                list.appendChild(li);
            });
        }
    };
};

// When the user clicks the button let NodeJS know it can start the EclairJS counting piece.
function clickme() {
    if (ws) {
        // First clear out any old results
        var list = document.getElementById("topTen");
        while(list.firstChild) {
            list.removeChild(list.firstChild);
        }
        ws.send(JSON.stringify({startCount: true}));
    }
};
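
The URL construction in window.onload above can also be pulled out into a small function that is easy to check on its own. In this sketch, buildSocketUrl is a hypothetical helper that takes an object shaped like the browser location, and switching to wss:// on https pages is an assumption that goes beyond the listing, which always uses ws://.

```javascript
// Sketch only: buildSocketUrl is a hypothetical helper, and choosing wss:// for
// https pages is an assumption that goes beyond the article's ws:// example.
function buildSocketUrl(loc) {
  var scheme = loc.protocol === 'https:' ? 'wss://' : 'ws://';
  // Leave the port off when the page is served from the default port.
  var port = loc.port ? ':' + loc.port : '';
  return scheme + loc.hostname + port;
}

console.log(buildSocketUrl({ protocol: 'http:', hostname: 'localhost', port: '3000' }));
// prints "ws://localhost:3000"
```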

The JS for the UI is pretty simple. When the page loads, it connects to the web socket server you created in app.js and listens for messages from it. When it receives new data, it adds a list item to the “topTen” unordered list for each of the top ten results. Recall that the “clickme()” function is hooked up to our button in “index.pug”; this JS code is where we define what happens when the user clicks it: we clear any old results and send a message via the web socket to our Node piece to start the counting process.

Now, edit the file layout.pug under the views subdirectory and add the new JS file you just created to it:

doctype html
html
  head
    title= title
    link(rel='stylesheet', href='/stylesheets/style.css')
    script(src='/javascript/showresults.js')

  body
    block content

If desired, you can append the following to the end of public/stylesheets/style.css to nicely format the results list:

#topTen > li {
    list-style-type: none;
    margin-left: -25px;
    padding-top: 5px;
}

That is all there is to it! You have just added a web socket connection between the Node.js piece and the UI. It’s time to fire it all up and give it a whirl!

Get things running

Up until now, we’ve been working with the EclairJS client. That is what you get with the “npm install” from the intro. You also need the EclairJS server as well as Spark to actually run the analytics on the text file. Luckily, everything you need for this is in a Docker image on Docker Hub. If you don’t already have Docker installed, you must do that first.

Once it’s installed, issue the following commands to pull the image and start it up:

docker pull eclairjs/minimal-gateway:0.9
docker run -p 8888:8888 -v /<fullpath to repo>/eclairjs-examples/wordcount_express/data/:/data eclairjs/minimal-gateway:0.9

Notice the “-v” option on the docker run command. This maps the data subdirectory we created in the beginning from the host system into the Docker container at /data. Since Spark is running in the Docker container, it must have access to the file we want it to process, and the “-v” option does just this. Be sure to use the full path to the data directory in your own repo (the example above uses the eclairjs-examples repo).
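
This mapping is also why app.js refers to the file as file:/data/dream.txt rather than by its location on your machine. As a rough illustration, the sketch below translates a host path into the path Spark sees inside the container; toContainerUri and the /home/me/repo host path are hypothetical.

```javascript
// Sketch only: toContainerUri and the example host path are hypothetical.
var path = require('path');

function toContainerUri(hostFile, hostDir, containerDir) {
  // Path of the file relative to the mounted host directory.
  var relative = path.posix.relative(hostDir, hostFile);
  return 'file:' + path.posix.join(containerDir, relative);
}

console.log(toContainerUri('/home/me/repo/data/dream.txt', '/home/me/repo/data', '/data'));
// prints "file:/data/dream.txt"
```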

Once your Docker container is up and running, start up your Node server again with the command node --harmony app.js. Open a browser, point it to the URL “http://localhost:3000”, and click the “Count!” button. You should see some output scrolling by in the Docker container as the Spark calls are executed. Finally, when your results are processed and received on the Node.js side, you should see the top 10 most encountered words in your text file appear in your browser.

I hope this article was helpful and gives you a good starting point for any future EclairJS/Node.js apps. You now have the building blocks to start building your own full-fledged Node.js/EclairJS web application. Happy coding!