Information Security, Web, Networks and Systems

Thursday, May 19, 2016

Setting up MongoDB Client Authentication in Ubuntu

5:29 AM Posted by Deepal Jayasekara

This article is a slight variant of the official MongoDB guide for enabling client authentication, focused on setting it up in Ubuntu. If you are using any other operating system, please follow the official guide instead.

MongoDB does not enable client authentication by default when you install it via apt-get. To enable it, follow the steps below.
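
In outline, the steps are: create an administrative user, enable the authorization setting, and restart the service. The snippet below is a sketch against the MongoDB 3.x YAML config format; the user name, password and service name are illustrative and may differ on your system.

```
// 1. In the mongo shell, create an administrative user:
//      use admin
//      db.createUser({
//        user: "siteAdmin",
//        pwd: "aStrongPassword",
//        roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
//      })
//
// 2. Enable authorization in /etc/mongod.conf:
//      security:
//        authorization: enabled
//
// 3. Restart MongoDB:
//      sudo service mongod restart
```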

Thursday, February 18, 2016

Developing Secure Node.js Applications - A Broad Guide

Security of Node.js applications is very important, since Node.js has become a widely used platform for developing web applications, web services and many other kinds of software. By putting JavaScript on the backend, Node.js has brought the security risks of JavaScript applications to the server side. And because of its asynchronous nature, many of the traditional mechanisms for web application protection cannot be directly applied to Node.js applications. In this post, I will discuss what I have researched and experienced in writing secure Node.js web applications, along with some best practices.

Prevent XSS! Context-Based Encoding

Cross Site Scripting (XSS) is one of the most common yet ignored types of attack. Since Node.js applications are written in JavaScript, there is a high risk of developers introducing XSS vulnerabilities into the code. Output encoding is one of the best defenses against XSS, and most view engines such as Jade provide built-in encoding mechanisms. The most important thing, however, is to use the appropriate encoding for the context. The following are some situations where you should use context-specific encoding.
  • URL-encode parameters that are appended as URL parameters. URL encoding can be done with the built-in JavaScript methods encodeURI() and encodeURIComponent().
  • HTML-encode parameters that are displayed in HTML. HTML encoding is provided by view engines such as Jade as well as frontend frameworks such as AngularJS. You can also do it explicitly on the server side using the htmlencode npm module.
  • CSS-encode parameters that are used in element styles.
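
As a rough sketch of the first two cases (the htmlEncode helper below is a hand-rolled illustration, not the htmlencode module's API):

```javascript
// URL-encode a value before appending it as a query parameter
const search = "rock & roll";
const url = "/search?q=" + encodeURIComponent(search);
// → "/search?q=rock%20%26%20roll"

// A minimal HTML-encoding helper for values rendered into HTML
function htmlEncode(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

htmlEncode('<script>alert(1)</script>');
// → "&lt;script&gt;alert(1)&lt;/script&gt;"
```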

Prevent CSRF (Cross Site Request Forgery) with Anti-Forgery Tokens

Cross Site Request Forgery (CSRF) allows an attacker to execute a function on a web application on your behalf. To prevent these attacks, we can implement Anti-CSRF tokens so that the server can validate whether a request is coming from the intended sender. Anti-CSRF tokens are one-time tokens which are sent along with the user's request and used by the server to validate the authenticity of the request. Please refer to my previous blog post about what Anti-CSRF tokens are.

Express.js, a web framework for Node.js, has built-in support for CSRF prevention. The following example shows how to initialize CSRF protection with Express.js. When this protection is added, Express.js creates a secure token which is sent back to the server via both the request body and a cookie. The server validates these two tokens against each other, and if the validation fails it returns a 403 Forbidden response to the client.
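
The original snippet is no longer available; the following is a minimal sketch using the csurf middleware (the successor of the CSRF protection bundled with older Express versions). The secret and option values are illustrative:

```javascript
var express = require('express');
var cookieParser = require('cookie-parser');
var session = require('express-session');
var csrf = require('csurf');

var app = express();
app.use(cookieParser());
app.use(session({ secret: 'keyboard cat', resave: false, saveUninitialized: false }));
app.use(csrf());                      // adds req.csrfToken()

app.use(function (req, res, next) {
  // expose the token to views so forms can submit it back
  res.locals.csrfToken = req.csrfToken();
  next();
});
```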

This mechanism prevents an attacker from sending requests to the server on your behalf, since the attacker has no access to the cookie for the domain in your browser. Even if he captures one token, he cannot replay it, because each token can only be used once.

The generated CSRF token should then be placed in your view as follows, so that it gets submitted when the user submits the form.
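
For example, in a Jade template the token can be submitted as a hidden form field (csrfToken here is the local variable exposed from the server, and csurf expects the field to be named _csrf; the form action is illustrative):

```
form(method='post', action='/transfer')
  input(type='hidden', name='_csrf', value=csrfToken)
  input(type='submit', value='Send')
```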

If you are using AngularJS with Express.js, you will need to make some changes in your code for the Express CSRF protection mechanism to work with AngularJS. AngularJS has built-in support for CSRF protection, as mentioned in its documentation:

When performing XHR requests, the $http service reads a token from a cookie (by default, XSRF-TOKEN) and sets it as an HTTP header (X-XSRF-TOKEN). Since only JavaScript that runs on your domain could read the cookie, your server can be assured that the XHR came from JavaScript running on your domain. The header will not be set for cross-domain requests.

For AngularJS to send the CSRF token we create with Express.js, we need to set the value of the CSRF token as a cookie named 'XSRF-TOKEN'. This can be done with a simple middleware, as follows.
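
A sketch of such a middleware, placed after the CSRF middleware so that req.csrfToken() is available:

```javascript
app.use(function (req, res, next) {
  // AngularJS's $http reads this cookie and echoes it back in the
  // X-XSRF-TOKEN header of subsequent XHR requests, which csurf accepts
  res.cookie('XSRF-TOKEN', req.csrfToken());
  next();
});
```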

The AngularJS $http service will then pick this value from the cookie and send it with subsequent requests to the server. When a request completes, Express.js generates a new CSRF token and sends it along with the response in the Set-Cookie header, to be used for the next request.

Secure Express.js Sessions and Cookies

Session cookie name reveals your application's internals

Revealing which technologies your application uses is one of the key things you should avoid. If an attacker knows what kind of technology you are using, he can drastically narrow his search for vulnerable components in your application. There are a couple of things that reveal internal implementation details of your application; one of them is the session cookie name.

Let's look at the session cookie set by my application.

We can see that the session cookie name is connect.sid. This is the default cookie name set by the Express framework, and anyone who sees it can immediately identify the application as a Node.js application, unless the developer has masqueraded the cookie name. So how can we change it so that nobody can tell our application is Node.js based? It's no big deal. Have a look at the following code snippet, which initializes an Express.js session.
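
A sketch of that snippet (the secret value is illustrative):

```javascript
var session = require('express-session');

app.use(session({
  name: 'SESS_ID',                 // overrides the default connect.sid
  secret: 'aS3cr3t$tring',         // used to sign the session ID cookie
  resave: false,
  saveUninitialized: false
}));
```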

You can see that I have specified 'SESS_ID' as the name. Express.js will then use it as the name of the session cookie. Once you make the change, clear all cookies and restart your application, and you will see that the session cookie name has changed to SESS_ID:

In the previous code snippet, you can also see a secret specified in the options object. This secret is used to sign the session ID value, to prevent attackers from injecting malformed session cookies and hijacking sessions. Although this does not guarantee 100% that attackers cannot forge a session ID, providing a hard-to-guess secret gives good protection against session hijacking.

Make cookies more secure

When you use cookies in your application, make sure to add the HttpOnly flag to them. The HttpOnly flag ensures a cookie cannot be read by client-side scripts and is only transmitted as part of HTTP requests. This is a good protection mechanism against cross site scripting attacks in which attackers read your cookies through malicious scripts.

Also, if your application supports HTTPS (it should!), make sure to add the Secure flag as well, to prevent cookies from being transmitted over insecure HTTP connections where a man-in-the-middle attack could compromise them. Imagine a scenario where your application is served via HTTPS but you do not use secure cookies. Although you use HTTPS, there might be some static content (such as images) that is still loaded over a plain HTTP connection. An attacker might be able to intercept such a static content response and grab a cookie that was never supposed to be revealed. In this kind of scenario, even if you think your cookies are safe, they are not.

Consider the previous example where we initialize the express session. We can make the session cookie Secure and HttpOnly by setting these options in a cookie object as follows:

cookie: { secure: true, httpOnly: true }

When you create any other cookies later in the application, you can specify them as Secure and HttpOnly as well.

Signing cookies

Signing cookies prevents cookie forgery. A signed cookie is a value that carries the cookie value together with a digital signature. When the cookie is received back, the server validates its integrity by validating the signature. Cookie signing is provided by the Express.js cookie-parser middleware. Take a look at the following example.
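
A sketch of using cookie-parser with signed cookies (the secret, route paths and cookie value are illustrative):

```javascript
var cookieParser = require('cookie-parser');
app.use(cookieParser('aHardToGuessSecret'));   // secret used to sign cookies

app.get('/set', function (req, res) {
  // signed: true attaches an HMAC signature to the cookie value
  res.cookie('currentUser', 'deepal', { signed: true, httpOnly: true });
  res.send('cookie set');
});

app.get('/get', function (req, res) {
  // cookies with valid signatures appear in req.signedCookies;
  // tampered cookies do not
  res.send('user: ' + req.signedCookies.currentUser);
});
```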

The cookie-parser middleware accepts a secret string as a parameter, which is used to sign the cookies. To create a signed cookie as above, you need to specify signed: true in the cookie options, as the example illustrates. You can access these signed cookies through the req.signedCookies object. If the signature of a cookie is valid, this object will contain the real value of the cookie under a key matching the cookie name. In the above example, if the user has modified the cookie value, currentUser will have no value, as if such a cookie never existed. Make sure you specify a difficult-to-guess secret in the cookie parser, which makes a brute forcer's life harder.

Error Handling

Proper error handling is not as trivial as one might think, and with Node.js it's even trickier. Node.js has multiple ways of handling errors, depending on whether the code is synchronous or asynchronous. Yes, you can use the almighty try...catch in Node.js, but if you wrap asynchronous code with it, you are in trouble: asynchronous parts of your code will not work with the try...catch mechanism. Since an asynchronous function does not block, execution continues to the next line and jumps out of the try...catch block without incident, even if an error occurs when the function eventually executes. The following example shows where a try...catch block can and cannot be used in a Node.js application.
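
A sketch of the difference (the function and input names are illustrative):

```javascript
// try...catch works for synchronous code:
function parseConfig(json) {
  try {
    return JSON.parse(json);         // throws synchronously on bad input
  } catch (err) {
    return { error: err.message };   // the error is caught as expected
  }
}

// ...but it cannot catch errors thrown later from asynchronous callbacks:
try {
  setTimeout(function () {
    // a throw here would crash the process; the catch below never sees it
  }, 10);
} catch (err) {
  // never reached for errors raised inside the setTimeout callback
}
```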

Errors in asynchronous functions should be handled using callbacks. Design your asynchronous functions to accept a callback function as a parameter, which can be used to handle any errors that occur during execution. The following example demonstrates how error handling should be done with asynchronous functions.

Know how your code behaves, and use the proper error handling mechanism based on that behavior.

Protect Database Access

If you are using a database such as MongoDB as your persistent storage, you also need to protect access to it and prevent it from being compromised by attackers. If you are using MongoDB, the following things are important.

Enable client authentication in MongoDB to prevent the situation "Everyone is admin".

By default, MongoDB does not enforce authentication for database access. This is really harmful, since anybody then has direct access to the database content, even people who have no access to your application. So you need to enable client authentication in MongoDB and prevent malicious access to the data.

Sanitize user inputs used in MongoDB queries 

MongoDB's query language is based on JavaScript, and because of this MongoDB is also vulnerable to script injection attacks. When you use user-supplied input values inside MongoDB queries, you should enforce proper type checks and the necessary input validation and sanitization to prevent attackers from executing malicious scripts against your database.

Look at the following vulnerable code, which reads a username from the request body and fetches that user's secret notes from the database.
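
A sketch of such a vulnerable handler (DBSecretNotes is the model used in the query shown further below; the route and setup are illustrative):

```javascript
// VULNERABLE: req.body values are passed into the query unchecked
app.post('/notes', function (req, res) {
  DBSecretNotes.find(
    { username: req.body.username, secret: req.body.secret },
    function (err, notes) {
      res.json(notes);
    }
  );
});
```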

If you do not have proper validation in place, this can be exploited to view the secret notes of other users as well. A request with the following request body will make the above code read other users' secret notes too.
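
Reconstructed from the resulting query below, the body substitutes objects carrying the $gt operator for the expected string values:

```
{
  "username": { "$gt": "" },
  "secret": { "$gt": "" }
}
```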


This request body changes the above query into the following, which returns every user's secret notes:

DBSecretNotes.find({username: {"$gt": ""}, secret: {"$gt":""}})

Therefore, you should never trust user input, and it should never be placed directly inside a database query without proper validation.
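
A minimal type check defeats this particular operator-injection payload (a sketch; production code would use a proper validation library):

```javascript
// Only plain strings may be used where the query expects a string value
function asQueryString(value) {
  if (typeof value !== 'string') {
    throw new Error('Invalid input: expected a string');
  }
  return value;
}

// asQueryString('deepal')        → 'deepal'
// asQueryString({ "$gt": "" })   → throws
```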

Cross Origin Resource Sharing

If your Node.js application is supposed to serve resources to external websites or applications (in the way Google Fonts does, for example), you might need to support Cross Origin Resource Sharing (CORS). Cross-origin requests are blocked by browsers by default, to prevent external script injection and to avoid exposing applications to unnecessary threats. If you need certain components of your application to be accessible to other websites, always make sure you expose only what you really need to expose and nothing more. And if you want your resources to be accessible only to certain external applications, you should whitelist those applications and block all other external requests. CORS can be handled flexibly and safely in Node.js applications with the cors npm module.

CORS is enabled through response headers which instruct the browser to allow specified components of your application to be accessed by external applications. When your application is configured for CORS, it can provide headers such as Access-Control-Allow-Origin, Access-Control-Allow-Methods and Access-Control-Allow-Headers, which tell the browser which requests should be allowed.

When an external application accesses resources in your application, it needs to send the appropriate headers indicating which type of cross-origin request it is making. Based on the CORS configuration on the server, the external request will either be served or rejected.
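
A sketch of whitelisting specific origins with the cors module (the origin and route are illustrative):

```javascript
var cors = require('cors');

var whitelist = ['https://partner.example.com'];

var corsOptions = {
  origin: function (origin, callback) {
    // allow only whitelisted origins; reject everything else
    callback(null, whitelist.indexOf(origin) !== -1);
  }
};

app.get('/fonts/:name', cors(corsOptions), function (req, res) {
  res.send('font data');
});
```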

Preventing HTTP Parameter Pollution (HPP)

HTTP parameter pollution is a web application attack aimed at bypassing server-side URL parameter validation. Imagine you have written a Node.js application which exposes a URL with the parameters firstname and lastname (the host below is illustrative; the original one is lost):

http://example.com/user?firstname='Jack'&lastname='Sparrow'

An attacker could then attack the application by modifying the URL as follows:

http://example.com/user?firstname='Jack'&lastname='Sparrow'&firstname='Jill'

In this case the attacker provides two firstname parameters instead of one. This is a basic example of how HTTP Parameter Pollution (HPP) is performed. In Express.js, the above URL will be parsed by the server, and the firstname and lastname query parameters will look like this:

req.query.firstname = ['Jack', 'Jill']
req.query.lastname = 'Sparrow'

Although you intended to accept a string as the firstname, it has become an array containing two values. This scenario can cause multiple problems:
  • Type errors in your application
  • Unexpected database query execution
  • Modified application logic
  • Bypassed validations
and many more. To prevent these kinds of attacks, you should properly type-check query parameter values, identify potential HPP attacks and act accordingly. You can also use the hpp npm module to protect your application from HPP attacks, as illustrated in the following example.
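
A sketch of wiring up the hpp middleware (the route is illustrative; hpp must run after the query/body parsing):

```javascript
var express = require('express');
var hpp = require('hpp');

var app = express();
app.use(hpp());   // deduplicates polluted query parameters

app.get('/user', function (req, res) {
  console.log('Query Parameters : ' + JSON.stringify(req.query));
  console.log('Polluted Query Parameters : ' + JSON.stringify(req.queryPolluted));
  res.send('ok');
});
```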

The above code logs the following output to the console.

Query Parameters : {"firstname":"Jill","lastname":"Sparrow"} 
Polluted Query Parameters : {"firstname":["Jack","Jill"]}

As shown above, the hpp middleware detects an HPP attack and moves the polluted parameters to a new object called req.queryPolluted, while updating req.query.firstname to the last value of the firstname array, preventing the HPP attack.

Server Headers

Disable X-Powered-By Header

The X-Powered-By header reveals unnecessary details about your application's internal implementation. Looking at its value, an attacker can map your application's internal structure and plan more organized attacks. By default, all Express.js web applications set this header to the value 'Express'.

This information is very valuable to an attacker, because it identifies that your application runs on Node.js, and the attacker can then target your application or server if you are running an outdated Node.js version. This is an unnecessary risk. Removing this header, or masking its value with something else, will keep adversaries away a little longer.

In Node.js with Express.js, you can remove this header by adding the following line to your code after initializing your Express app.
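
For example:

```javascript
var app = express();
app.disable('x-powered-by');   // stop advertising Express in responses
```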


Setting Security Response Headers

HTTP security response headers play a vital role in securing a web application. These headers enable browser built-in security features that protect your web application from client-side attacks. In a previous post, I discussed these headers and how they protect web applications. A few important security headers are:
  • X-Frame-Options - Prevents your application being displayed in iframes
  • X-XSS-Protection - Invokes browser XSS protection mechanisms
  • X-Content-Type-Options - Prevents mime sniffing
Setting these security headers in your application can be done easily by writing your own Express middleware that sets them.
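
A sketch of such a middleware (the header values shown are common defaults; tune them for your application, and register it with app.use):

```javascript
// Express-style middleware: sets security headers on every response
function securityHeaders(req, res, next) {
  res.setHeader('X-Frame-Options', 'DENY');
  res.setHeader('X-XSS-Protection', '1; mode=block');
  res.setHeader('X-Content-Type-Options', 'nosniff');
  next();
}

// Registered with: app.use(securityHeaders);
```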

You can also use the helmet npm module to add these security headers, remove or modify the X-Powered-By header, and apply many more security features to your application.

Other Best Practices

No eval() please

If eval is a new term to you: eval is a JavaScript function that evaluates a given expression supplied as a string. This function has great power, which comes with great responsibility. The following is a basic usage of eval.

You will see that the string supplied to the eval() method is evaluated, and the variable number1 is initialized with the value 100. Although it is not declared anywhere else in the code, it can be used in the rest of the code.

Also have a look at the following script I ran on REPL.

This nature of the eval() function introduces a big risk when user input is used inside it. If you do not encode/sanitize user input before passing it to eval(), an attacker gains the ability to execute arbitrary code on your server, which can be very dangerous. Therefore, the eval() function should not be used unless it is truly necessary, there is no alternative, and you know what you are doing.

require() at the top of the module scope

require() is a call that can be used anywhere in the code to import other Node.js modules. It is a synchronous file inclusion mechanism which blocks the rest of the code until it returns. Therefore it is recommended to place all require() calls for the necessary node modules at the top of the file. This loads all modules when the application starts and improves the performance of the application.

GET should not mutate state

Do not use the HTTP GET method to invoke functions that mutate your application's state. A GET request should only ever retrieve data: no matter how many times it is sent, your application's state should not be modified. GET requests are logged in browser history and many other places, and are easy for an attacker to find and replay through a browser. If you are doing unsafe operations with GET requests, such as database updates or file writes, you are doing it wrong.

May your code not invite DoS and DDoS

Node.js is a single-threaded technology: if you block the thread, your whole application is blocked. Such situations can amount to a denial of service attack on your application. You need to pay attention to several details to avoid being DoSed.

Validate the content length of requests. This prevents your application from accepting very large requests and spending excessive time processing them. It can be achieved with the express-content-length-validator npm module.

Limit the request flow into your application. This can prevent DoS and DDoS attacks. Limiting the request flow can be achieved by throttling requests through a front-end proxy such as nginx and its rate-limiting configuration, or by using an npm module such as ratelimiter inside the application itself.

Use regular expressions carefully in your application. Certain regular expressions, particularly those containing repeating groups with repetition, or alternation with overlap inside the group, can take exponentially increasing time for certain non-matching inputs. Since regular expressions are evaluated in the event loop thread, the whole application is blocked while a regular expression is being evaluated. In general, evil regexes look like:
  • (a+)+
  • ([a-zA-Z]+)*
  • (a|aa)+
  • (a|a?)+
An evil regular expression can cause an exponential increase in evaluation time when certain inputs are provided. Look at the following example.

You can see the start time and the end time of the regex evaluation for each string. After executing the last line, I waited 20 minutes for the end time until I finally decided to give up and close the terminal. Now you have a clear idea of what an evil regex can do to your application.

The following is a quote from OWASP explaining evil regexes very well.

The Regular Expression naïve algorithm builds a Nondeterministic Finite Automaton (NFA), which is a finite state machine where for each pair of state and input symbol there may be several possible next states. Then the engine starts to make transition until the end of the input. Since there may be several possible next states, a deterministic algorithm is used. This algorithm tries one by one all the possible paths (if needed) until a match is found (or all the paths are tried and fail).For example, the Regex ^(a+)+$ is represented by the following NFA: 
For the input aaaaX there are 16 possible paths in the above graph. But for aaaaaaaaaaaaaaaaX there are 65536 possible paths, and the number is double for each additional a. This is an extreme case where the naïve algorithm is problematic, because it must pass on many many paths, and then fail.

Therefore, when you use regular expressions in your application, make sure they cannot freeze your application, and most importantly, never accept user-supplied input as a regex.

Stay Up to Date

No matter how many protection mechanisms you implement at the application level, if you are running an outdated Node.js version, or outdated and vulnerable npm modules, your application is at risk. Outdated components may have publicly known vulnerabilities, and even published exploits that target them. Therefore, always keep your application components up to date. Tools such as nsp and retire (listed at the end of this post) can identify outdated or vulnerable components in your application.
Also keep your Node.js version and npm up to date. Node.js can be updated with the "n" npm module.

sudo npm install -g n
sudo n stable

You can update npm using npm itself (weird, right?):

  sudo npm install -g npm

Use explicit package versions

When you install packages using npm install, you will see that there is a little "~" or "^" symbol in front of the package version numbers.

These symbols mean your packages will be updated to newer versions (patch-level updates for "~", minor versions for "^") when you run npm update. Even though this sounds convenient, drifting package versions can cause a lot of problems:
  • Some package vendors might publish unstable releases which can break your application.
  • Some major updates drastically change the behavior of the module you originally added, breaking your existing code.
  • ... and many more.
Therefore, keep your package versions fixed by removing those characters, so that your application keeps behaving the way you implemented it. You can make npm record a fixed version in package.json by running npm install with additional parameters, as follows:

npm install foobar --save --save-exact

Also you can configure npm to always include a fixed version in your package.json when you do npm install. This can be done using following commands in your terminal.

$ npm config set save=true 
$ npm config set save-exact=true

These settings will be saved in ~/.npmrc file and will be applicable to all npm commands afterwards.

No sensitive data inside your repo !!!

SORRY FOLKS!! I was wrong from the beginning: I put the cookie secret and session secret in the code itself. What if I commit my code to GitHub? Anybody who has access to the repo can see my secret keys. This is a bad practice. Although I put them in the code for demonstration purposes, never store secret strings in the code itself. It forces you to change the source code whenever the keys change, and it reveals sensitive information to third parties.

The best practice is to store this sensitive information in environment variables and access those variables in your code through process.env.
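
For example (EXPRESS_COOKIE_SECRET is the variable name used below; failing fast when it is missing avoids silently running with an empty secret):

```javascript
// Read a secret from the environment instead of hard-coding it
function requireEnv(name) {
  var value = process.env[name];
  if (!value) {
    throw new Error('Missing required environment variable: ' + name);
  }
  return value;
}

// var cookieSecret = requireEnv('EXPRESS_COOKIE_SECRET');
// app.use(cookieParser(cookieSecret));
```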

Then you can run your application after initializing your environment variables.

$ export EXPRESS_COOKIE_SECRET='s8*H*6wvfHc7WN!*nun6'
$ node server.js

Use a linter tool

Use a JavaScript lint tool such as JSLint or ESLint to maintain consistency and coding best practices throughout your code. With custom rules, these tools let you enforce secure coding practices for every developer on your project.

Do not run node as ROOT !!!

Never run Node.js as root. Running as root makes things far worse if an attacker somehow gains control of your application: the attacker would also gain root privileges, which could be catastrophic. If you need to serve your application on port 80 or 443, forward the port using iptables, or place a front-end proxy such as nginx or apache in front that routes requests from port 80 or 443 to your application.

There is also a code-based solution: start the application with sudo, and as soon as it starts listening on the port, de-escalate its privileges to a normal user.
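
A sketch of that approach (the 'nobody' account is illustrative; use a dedicated service user, and note that process.setuid/setgid are POSIX-only):

```javascript
var http = require('http');

var server = http.createServer(app);

server.listen(80, function () {
  // The privileged port is now bound; drop to an unprivileged user.
  // Drop the group first, since setuid removes the right to call setgid.
  process.setgid('nobody');
  process.setuid('nobody');
  console.log('Listening on 80 as uid ' + process.getuid());
});
```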

Useful NPM modules

Here I am listing the npm modules discussed above which can be used to implement various security features in your application, along with some additional modules you can use for testing and debugging.
  • csurf - Implement Anti-CSRF tokens to prevent cross site request forgery
  • cors - Enable Cross origin resource sharing
  • hpp - Protection from HTTP Parameter Pollution
  • express-content-length-validator - Prevent DOS attacks
  • rate-limiter - Prevent DOS attacks
  • helmet - Set custom security headers
  • nsp - Scan for deprecated/vulnerable npm modules used in your app
  • retire - Scan for deprecated/vulnerable npm modules used in your app
  • mocha, should, supertest - Writing node.js tests
  • bunyan - Logging
I think these tips will be useful for you to develop secure Node.js applications. If you have any comments, feel free to write them below.

Saturday, February 14, 2015

Implementation of Smart Cloud Scheduler - Part II

11:34 PM Posted by Deepal Jayasekara
In this post, I will discuss implementation details of the Smart Cloud Scheduler which was described in a previous post. Let's look at the component based architecture of the Smart Cloud Scheduler.

Design and Implementation

Web Frontend

Our resource scheduler provides two access methods for cloud users. One is the web interface: a user can compose a resource allocation request via the web interface and submit it. Internally, the web interface accesses the API provided by the resource scheduler.

API Endpoint

The API endpoint is a REST API for managing the functionality of the resource scheduler. It provides functions to issue resource allocation requests and to perform admin/user functions. The following functions are currently supported by the API.

Create User - POST /admin/createUser
Login - POST /login
Issue a request - POST /request
Read configuration - GET /admin/configuration
Update configuration - POST /admin/configuration

Creating a user - 

Creating a user is only allowed for administrator users. The Create User request should be sent as a POST request to the URL /admin/createUser, with a request body in the following format:

{
    "username": "User 1",
    "password": "passw@rd",
    "userinfo": {
        "firstName": "User",
        "lastName": "One"
    },
    "priority": 3,
    "admin": true
}

The SHA-256 algorithm is used to hash passwords before storing them in the database, with the account-creation timestamp used as the salt. Successful creation of a new user account returns a response similar to the following:

Login - 

The login function must be performed before sending any resource allocation requests to the API. Users send a POST request to /login with a JSON request body containing their credentials. The login function accepts a username and password, and a successful login returns a response including the session ID to be used in subsequent requests.

Request body format:


Response body format:

{
    "message": "Login successful",
    ...
}

The login function creates a session ID, stores it in the database for future session validation, and sends it back to the user in the response as above. The user should include this session ID in future resource allocation requests in order to be authenticated.

Issue a request - 

Once a user is logged in, he or she can issue resource allocation requests to the resource scheduler through this API method. A resource allocation request is composed as an XML document describing the resources the user requires, and is sent as a POST request to the URL /request with the Content-Type header set to application/xml. We currently use XML as the resource description language, but to be consistent with the other API methods we are also implementing support for requests with Content-Type set to application/json. To differentiate the two, the Content-Type header is required in the POST request for resource allocation. Internally, the resource scheduler selects the appropriate parser (XML or JSON) according to the Content-Type header.

Following is a sample Resource Allocation Request:


Read Configuration - 

Configuration information includes the settings of all components in the resource scheduler. To access it, the user must be an administrator. This API method is accessed by sending a GET request to the URL /admin/configuration, and the configuration information is returned as a JSON document.

Write Configuration - 

Writing a new configuration is done by sending a POST request to the URL /admin/configuration with a body containing the configuration as a JSON document. This method is also protected, and only administrator users are allowed to access it.

Authentication and Authorization service

The authentication service provides authentication and authorization for the resource scheduler. API authentication uses a session key validation mechanism: users must log in to the resource scheduler before issuing any request. On login, the authentication service generates and returns a persistent session key which must be re-sent in subsequent requests. The generated session key is stored in MongoDB, and when a user issues a request, the authentication service validates the session key and identifies the user's privileges.
Passwords are stored in the database as salted SHA-256 hashes, with the account-creation timestamp used as the salt. In the first implementation of the authentication service we used MD5 as the hashing method, but later updated to SHA-256 for better security. Only the login() and createUser() methods are exposed by the authentication service via the REST API; the authorizeResourceRequest() function is used internally by the VM scheduler to authenticate and authorize an incoming resource request. The following diagram is a graphical representation of the authentication and authorization services provided by the module.

Core Scheduler

Host Filter

The Host Filter filters hosts according to their statistics monitored through Zabbix. The filtered hosts are then sent to the VM scheduler as candidate hosts for scheduling a particular job.

The complete steps taken by the Host Filter are illustrated below.
  • Host Filter receives the resource request and all host information. This host information is monitored through Zabbix and it contains all host information in the infrastructure. For every host, an ID is created in Zabbix.
  • Statistics are collected for specified data items on each host. Monitored items include memory information, CPU load, and CPU utilization. For each item, history values are collected in addition to the current value. In our application we take only the previous value and the current value, since we calculate the EWMA (exponentially weighted moving average) from the last EWMA, which is stored in the database.
Equation for calculating the EWMA:
EWMA(new) = α × EWMA(last) + (1 − α) × Current value

  • As mentioned above, the EWMA is calculated for each item, and using those values all candidate hosts which fulfill the resource requirements of the request are found.
  • Information about those candidate hosts is then sent to the VM Scheduler so that the scheduling decision can be made there.
  • Besides the candidate hosts, the Host Filter also finds the hosts that fulfill only the memory requirements of the request; these are also passed to the VM Scheduler for use by the Migration and Preemption Schedulers.
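The EWMA update used in the steps above is a one-liner. A sketch in JavaScript, where the smoothing factor α is a configuration value (0.7 below is an arbitrary choice for illustration):

```javascript
// EWMA(new) = alpha * EWMA(last) + (1 - alpha) * currentValue
function ewma(lastEwma, currentValue, alpha) {
  return alpha * lastEwma + (1 - alpha) * currentValue;
}

// Example: smooth a series of free-memory readings (MB).
const readings = [4096, 3800, 3900];
let smoothed = readings[0];
for (let i = 1; i < readings.length; i++) {
  smoothed = ewma(smoothed, readings[i], 0.7);
}
```

A larger α weights the history more heavily and damps short spikes in the monitored value.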

VM Scheduler

VM Scheduler provides orchestration for all components in the Resource Scheduler including Core Scheduler components and Authentication Service. 
Following is the flow of how a request is handled by the VM Scheduler:
  1. The VM Scheduler receives a Resource Allocation Request via the REST API.
  2. The VM Scheduler passes the request to the Authentication service for authentication and authorization.
  3. If the request is authenticated and authorized, the VM Scheduler forwards the authorized request to the Priority Scheduler, which is the coordinator of the Migration Scheduler and Pre-emption Scheduler.
  4. Either the Priority Scheduler or the Migration Scheduler returns a selected host where the request can be allocated. The VM Scheduler then performs the allocation on the selected host via the CloudStack API using the CloudStack Interface.
In addition to resource allocation, resource de-allocation is also performed by the VM Scheduler with the support of the De-allocation Manager.
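The request flow above can be sketched as follows. The component interfaces here are assumptions for illustration, not the scheduler's actual API:

```javascript
// Orchestration sketch of the VM Scheduler's request flow.
function handleResourceRequest(request, authService, priorityScheduler, cloudstack) {
  // Authenticate and authorize the incoming request.
  const authorized = authService.authorizeResourceRequest(request);
  if (!authorized) {
    return { ok: false, error: 'Unauthorized request' };
  }
  // Let the Priority Scheduler (coordinating migration and
  // pre-emption) pick a host for the request.
  const host = priorityScheduler.schedule(authorized);
  if (!host) {
    return { ok: false, error: 'No resources available' };
  }
  // Perform the allocation on the selected host via CloudStack.
  return { ok: true, deployment: cloudstack.deployVirtualMachine(authorized, host) };
}
```

Keeping the VM Scheduler as a thin orchestrator like this leaves the scheduling policy itself inside the Priority, Migration, and Pre-emption Schedulers.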

Priority Scheduler

The Priority Scheduler acts as the coordinator of the Migration Scheduler and Pre-emption Scheduler. It first forwards the authorized resource request to the Migration Scheduler to find host(s) on which the incoming request can be allocated. If the Migration Scheduler does not return any host information, the Priority Scheduler then checks whether there are previous allocations with priority lower than the incoming request's, and sends those allocations along with the request to the Pre-emption Scheduler. If the Pre-emption Scheduler selects hosts, their information is sent back to the VM Scheduler by the Priority Scheduler.
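The coordination logic reads roughly as follows (a sketch; the scheduler interfaces and field names are assumptions):

```javascript
// Priority Scheduler sketch: try migration first, fall back to
// pre-emption only when lower-priority allocations exist.
function prioritySchedule(request, migrationScheduler, preemptionScheduler, currentAllocations) {
  // First ask the Migration Scheduler for a host.
  const migrated = migrationScheduler.findHost(request);
  if (migrated) return migrated;

  // Otherwise, collect allocations with lower priority than the request.
  const lowerPriority = currentAllocations.filter(a => a.priority < request.priority);
  if (lowerPriority.length === 0) return null; // nothing to pre-empt

  return preemptionScheduler.findHost(request, lowerPriority);
}
```

Returning null here corresponds to the case where the request must be queued by the Allocation Queue Manager.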

Migration Scheduler

Migration Scheduling is the second step of the VM Scheduler, used when there appear to be no resources to serve the incoming Resource Request. When the Priority Scheduler forwards an incoming request to the Migration Scheduler, it checks whether enough resources for the request can be made available on any host by migrating some of that host's virtual machines to other hosts.

Preemptive Scheduler

When the Migration Scheduler is unable to find a host for an incoming request by reshuffling VMs on the cloud, the request is passed to the Pre-emption Scheduler. In the pre-emption step, the Pre-emption Scheduler checks the priority of the incoming request, compares it with current allocations, and decides whether to pre-empt a running allocation in order to free resources for the incoming request. When a request is preempted, it is saved in the database so that it can be rescheduled later, when resources become available. When preemption fails, it means that there are not enough resources to allocate the incoming request, and the Resource Scheduler returns an error message to the user via the API stating so. When there is a specific set of VMs to be preempted, the Pre-emption Scheduler calls an additional web service implemented in Java, called 'JVirshService'. In this external call, the Pre-emption Scheduler passes the list of VMs to be preempted via a RESTful API call in JSON format, and JVirshService performs the preemption.
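One way to sketch the pre-emption decision described above (the host/VM field names are assumptions, and the tie-breaking rule of preferring the fewest pre-emptions is one plausible policy):

```javascript
// Find a host where pre-empting lower-priority VMs frees enough memory
// for the incoming request, preferring the host that needs the fewest
// pre-emptions. Returns { host, vms } or null when no host qualifies.
function planPreemption(hosts, request) {
  let best = null;
  for (const host of hosts) {
    const victims = host.vms
      .filter(vm => vm.priority < request.priority)
      .sort((a, b) => b.memMB - a.memMB);      // largest first
    let freed = host.freeMemMB;
    const chosen = [];
    for (const vm of victims) {
      if (freed >= request.memMB) break;
      freed += vm.memMB;
      chosen.push(vm);
    }
    if (freed >= request.memMB && (best === null || chosen.length < best.vms.length)) {
      best = { host: host, vms: chosen };
    }
  }
  return best;
}
```

The list of chosen victim VMs is what would then be sent to JVirshService for snapshotting.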

JVirsh Service RESTful Java Web Service

JVirshService is a Java RESTful web service which runs separately from the main Node.js application. It internally uses the 'virsh' command-line tool to issue VM snapshot commands to the hypervisor via the libvirt API. When the Pre-emption Scheduler sends a set of virtual machines to be preempted to JVirshService in JSON format, the service calls libvirt and takes VM snapshots at the hypervisor level. Once the snapshot command executes successfully, JVirshService returns the IP address of the host on which the VMs were preempted. This IP address is received by the Pre-emption Scheduler, which then informs the VM Scheduler that preemption is complete.

Allocation Queue Manager

The Allocation Queue Manager stores in MongoDB the Resource Requests which cannot be served with the currently available resources. When a request arrives and there are not enough resources to allocate it, the Resource Scheduler tries the Migration Scheduler and the Preemption Scheduler to find enough resources. In the preemptive scheduling phase, if there are no allocations with priority lower than the incoming request, the Preemption Scheduler returns no hosts. At this point, the request is queued by the Allocation Queue Manager and will be allocated when enough resources are available in the cloud.

De-allocation Manager

The De-allocation Manager performs resource release and re-allocates preempted and queued requests held by the Allocation Queue Manager.

Configuration updater

The Configuration Updater is the module which provides methods to change the configuration of the Resource Scheduler. Configuration information for all components of the Resource Scheduler can be changed using this module. Access to this module is protected; only administrator users are allowed to make configuration updates.

Database Interface

The Database Interface is the component we use to store and retrieve information from MongoDB. Access to MongoDB is provided by a third-party Node.js module called Mongoose, which we use to perform queries and updates on the database.

CloudStack Interface

The CloudStack Interface is implemented using a Node.js module called 'csclient'. 'csclient' is a third-party module which provides easy access to the CloudStack API from Node.js, handling complex functionality such as request signing internally.

Zabbix Interface

The Zabbix Interface is implemented as another Node.js module which provides access to the Zabbix Monitoring System via the Zabbix API. It has functions to log in to the monitoring system and issue API requests from Node.js.
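The Zabbix API is JSON-RPC 2.0 over HTTP, posted to the server's api_jsonrpc.php endpoint. The payloads such a module builds look roughly like this (credential and host-ID values are placeholders; note that recent Zabbix versions rename the `user` login parameter to `username`):

```javascript
// Build the JSON-RPC 2.0 payloads a Zabbix client posts to
// /api_jsonrpc.php. Parameter names follow the classic Zabbix API.
function zabbixLogin(user, password) {
  return JSON.stringify({
    jsonrpc: '2.0',
    method: 'user.login',
    params: { user: user, password: password },
    id: 1
  });
}

// After login, the returned auth token accompanies every request,
// e.g. fetching monitored items for a host:
function zabbixGetItems(authToken, hostId) {
  return JSON.stringify({
    jsonrpc: '2.0',
    method: 'item.get',
    params: { hostids: hostId, output: 'extend' },
    auth: authToken,
    id: 2
  });
}
```

The `item.get` response is where the Host Filter's memory and CPU statistics come from.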

MongoDB Database

We are using MongoDB as our database to store information including configuration data, user information, resource allocation requests, etc. Since we are using Node.js as our development language for the resource scheduler and need to store queued Resource Allocation Requests easily, MongoDB was easier to work with than MySQL. With MySQL, we would need either a proper Object Relational Mapping (ORM) or to serialize JSON documents to strings before storing them; retrieval would then add the overhead of building objects from relational data or parsing strings back into JSON. With MongoDB, we can directly store Resource Allocation Requests in JSON format and retrieve them as JavaScript objects.

Results and Evaluation

Resource-Aware Virtual Machine Scheduling

One requirement of our project was to improve the default VM scheduling mechanism of CloudStack in order to improve availability for users. CloudStack ships with several VM allocation algorithms; the algorithm can be changed by updating the Global Configuration parameter named vm.allocation.algorithm. Following are the allocation algorithms supported by CloudStack.

Parameter value: Description
random: Pick a random host available across a zone for allocation.
firstfit: Pick the first available host across a zone for allocation.
userdispersing: The host which has the least number of VMs allocated for the given account is selected. This provides load balancing to a certain extent for a given user, but the running VM count is not compared across different user accounts.
userconcentratedpod_random: Similar to 'random', but considers hosts within a given pod rather than across the zone.
userconcentratedpod_firstfit: Similar to 'firstfit', but only considers hosts within a given pod rather than across the zone.

By default, the random algorithm is selected for VM allocation. None of these algorithms provide resource-aware scheduling. Although the userdispersing algorithm considers the running VM count belonging to a given user, it does not consider the resource utilization of VMs or the resource availability on the host. The following snapshot from the Zabbix Monitoring System shows how VM allocation is performed using the random method in CloudStack. The diagram shows the amount of memory available on all hosts.

Default random VM scheduling algorithm of CloudStack
We can identify a VM deployment as a drop of a line in the graph. We can see that the sequence of host selection is Host 3, Host 4, Host 2, Host 4, Host 2, Host 4, Host 2, Host 1, Host 1, etc. Using resource utilization information fetched from the Zabbix server, we have implemented a mechanism to deploy a VM on the host which has the least available memory that is still sufficient for the VM. This algorithm is known as best fit. The reason for using this algorithm is to keep as much free memory as possible on each host, so that it can be allocated to future requests asking for more memory. An alternative would be a load-balancing algorithm which allocates a VM on the host with the maximum available memory. However, that algorithm could cause frequent shuffling of VMs (Migration Scheduling) when memory-intensive VM requests arrive.
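The best-fit selection described above can be sketched as follows (field names are illustrative):

```javascript
// Best fit: among hosts that can hold the requested VM, pick the one
// with the least free memory, keeping larger gaps free for later requests.
function bestFitHost(hosts, requiredMemMB) {
  const candidates = hosts.filter(h => h.freeMemMB >= requiredMemMB);
  if (candidates.length === 0) return null;
  return candidates.reduce((best, h) => (h.freeMemMB < best.freeMemMB ? h : best));
}
```

Swapping `<` for `>` in the reduce step would turn this into the load-balancing (worst-fit) alternative mentioned above.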

Following snapshot from Zabbix Monitoring System shows how Resource Aware VM scheduling is performed with our scheduling algorithm on CloudStack:

Resource Aware VM scheduling of Smart Cloud Scheduler
The above diagram shows the available amount of memory on each host in CloudStack. Note that the green line in the Zabbix diagram shows a host which was added to CloudStack later. The graph shows the change in available memory on four hosts when a list of VMs is deployed with the following memory requirements:

VM 1 - 2 GB
VM 2 - 2.5 GB
VM 3 - 1 GB
VM 4 - 2 GB
VM 5 - 2 GB
VM 6 - 2 GB

After these six VM deployments, we added the new host to CloudStack, which is represented in the diagram by the green line. We can then see that the next three deployments were performed on the newly added host, since it had the minimum available memory sufficient for the requirements. After these three allocations, its available memory drops below 2 GB, so the final VM in our list cannot be deployed on that host. Therefore the final VM, which requires 2 GB of memory, can only be deployed on Host 4, and it gets allocated there. The next four deployments were:

VM 7 - 2 GB
VM 8 - 1 GB
VM 9 - 1 GB
VM 10 - 2 GB

Preemptive Scheduling

When low-priority requests are allocated on the cloud and there is no space for another resource allocation, we need Preemptive Scheduling to serve incoming high-priority requests. Since high-priority requests need to be allocated immediately, we preempt a suitable number of currently allocated low-priority requests to gain space for the incoming high-priority request. When there is no space on the cloud for an incoming request, we first move into the Migration Scheduling phase, in which we re-shuffle VMs on the cloud using VM live migration to gain space on a specific host. If Migration Scheduling is not possible, we then move into the Preemptive Scheduling phase, in which we check the priority of the incoming request, compare it with currently allocated requests, and take the preemption decision based on whether there are enough low-priority allocations that can be preempted to gain space. The following graph shows memory availability against time on all hosts in the cloud.

Preemptive Scheduling of Smart Cloud Scheduler.
When we consider the section after the red vertical line, the resource scheduler has first created two VMs on Host 2, then three VMs on Host 4, followed by one more VM on Host 2 and another on Host 4. At this point there is insufficient memory capacity for requests asking for more than 2 GB of memory. Then we get a request asking for 2 GB of memory. Here the resource scheduler decided to preempt two VMs on Host 2 which were consuming 2 GB of memory, and the same host was then used to allocate the incoming request. Note that before the preemption there were 3 VMs running on Host 2 and 4 VMs running on Host 4; the Resource Scheduler chose Host 2 for preemption because it required preempting fewer VMs with priority lower than the incoming request. This algorithm can be further improved to consider the resource utilization of each VM. That is currently difficult because Zabbix agent-based resource monitoring would require each VM to run the Zabbix Agent.

Migration Scheduling

Migration Scheduling was added to the Smart Cloud Scheduler later; it can release resources for an incoming request without preemption. The Migration Scheduler is invoked before the Preemption Scheduler. The following graph, taken from Zabbix, shows Migration Scheduling and Preemption Scheduling in action.

The graph above shows the following sequence of VM deployments:

Priority Level 2 (Medium)
        VM1 - 2.5GB
        VM2 - 1.5GB
        VM3 - 2GB
        VM4 - 2GB
        VM5 - 2GB
        VM6 - 2GB
        VM7 - 2GB
Priority Level 3 (High)
        VM8 - 2GB


The current implementation of our Smart Scheduler has several limitations which need to be addressed later. Following are some of those limitations and issues:
  1. VM Group Allocation is not possible yet. The current implementation supports only one VM deployment per Resource Allocation Request. Multiple copies of the same VM (e.g., for lab allocations) and groups of different types of VMs cannot be specified in a single Resource Allocation Request.
  2. Advanced Reservation is not available. Our implementation currently supports Immediate Allocation and Best Effort service, depending on the resource availability on the cloud, existing resource allocations, and the priority of the request. In Advanced Reservation, a user may specify a future time period during which his resource allocation should be available; the current implementation does not support this feature.
  3. Per-VM resource monitoring is not available. We currently monitor the resource utilization of physical hosts. This has a major drawback, because VMs may not be utilizing the full amount of resources they have requested (such as memory). When a VM is idle, the physical host appears to have more resources available for new allocations, which may cause further allocations to happen on the same host. But when the idle VM starts utilizing the entire amount of resources it has been allocated, the host gets overloaded, and we may need migration or preemption to resolve the overload. To prevent this, we need to monitor the resource utilization of each VM separately and identify when idle VMs start utilizing their full allocation.
  4. The VM preemption algorithm needs improvement. It currently selects for preemption the physical host on which the minimum number of low-priority VMs is allocated. Improvement is necessary because, although a host may be running a large number of low-priority VMs, preempting just a few idle VMs among them may be enough to allocate a high-priority incoming request.

Smart Cloud Scheduler | Resource and Policy Aware VM Scheduler for Medium Scale Clouds

11:44 AM Posted by Deepal Jayasekara , , , , No comments
Cloud computing enables providing computing resources to users over an Internet connection. In an IaaS (Infrastructure as a Service) cloud, these computing resources include processing power, memory, storage, and network resources. Cloud computing models including Infrastructure as a Service, Platform as a Service, and Software as a Service have changed the traditional "host on your own data center" strategy and offer a good way to avoid the maintenance overhead of a private data center per organization. Medium-scale IaaS clouds are becoming popular in small and medium-scale organizations such as universities and enterprises. They are useful when organizations need to deploy their own private cloud using the compute, storage, and network resources they already have. There are several IaaS platforms, such as OpenStack, CloudStack, and Eucalyptus, which users can employ to deploy their own medium-scale clouds. These tools handle the sharing and management of compute, storage, and network resources dedicated to the cloud and perform resource allocation for various requirements.

Many universities and enterprises are now setting up their own small-to-medium scale private clouds. Such private clouds are becoming popular, as they provide the ability to multiplex existing computing and storage resources within an organization while supporting diverse applications and platforms, along with better performance, control, and privacy. In a medium-scale cloud such as a university or enterprise cloud, there are different types of users, including students, lecturers, internal/external researchers, and developers, who benefit from the cloud in different ways. They may have varying requirements in terms of resources, priorities, and allocation periods. These requirements may include processing-intensive and memory-intensive applications such as HPC (High Performance Computing) and data mining applications, as well as labs which need to be deployed on a specific set of hosts for a particular period of time. Priority schemes and dynamic VM (Virtual Machine) migration schemes should be used to satisfy all these requirements in an organized manner. However, currently known IaaS cloud platforms have no native capability to perform such dynamic resource allocation and VM preemption. Therefore, it is important to extend existing cloud platforms to provide such policy-, resource-, and deadline-aware VM scheduling.

 What is the solution?

We propose a resource scheduling mechanism which can be used as an extension to an existing IaaS cloud platform to support dynamic resource- and policy-aware VM scheduling for medium-scale clouds. The resource scheduling algorithm schedules VMs while being aware of the capabilities of the cloud hosts and their current resource usage, which is monitored continuously using a resource monitor. The resource scheduler allocates resources according to the predefined priority level of the user who issued the resource request. Time-based scheduling (e.g., deploying labs for a particular period of time) is also performed while considering priority levels and existing resource allocations. To provide these features, we extend an existing IaaS cloud platform with resource- and policy-aware VM creation, migration, and preemption.

Technologies Involved

Apache CloudStack

Apache CloudStack is an open-source IaaS platform widely used for building medium-scale clouds. CloudStack is easier to set up than alternatives such as OpenStack and Eucalyptus because of its monolithic architecture: the CloudStack management server can be installed on a single machine easily, whereas OpenStack has different components which need to be installed and configured separately, which requires expertise. CloudStack provides a rich web UI and a simple RESTful API for third-party tool integration, and it also implements an Amazon EC2-compatible API.

KVM Hypervisor

The KVM hypervisor is an open-source, kernel-based virtualization solution which can be used to create virtual machines on top of a shared physical host. The reason for our interest in using KVM with CloudStack is that KVM provides the capability to take snapshots of a running VM, including its disk and memory, via the libvirt API. Memory snapshots are not provided by the other open-source hypervisors supported by CloudStack, such as XCP (Xen Cloud Platform), although XenServer, the commercial version of Xen, does provide VM memory snapshots. Taking VM snapshots is a requirement in our solution because we need to save the state of a VM and restore it later. This enables preemptive scheduling, in which a high-priority request preempts a low-priority request when there is a lack of resources on the cloud.

Zabbix Resource Monitoring System

Zabbix is a free and open-source resource monitoring system for clouds and grids. It is highly scalable and can be used for resource monitoring of small- to large-scale clouds consisting of up to 100,000 monitored devices. Zabbix provides an HTTP API and a good web UI which can be used to monitor the resource utilization and availability of a set of monitored devices. It provides graphical representations as well as detailed information on monitored devices.

Design of the solution

Our Smart Cloud Scheduler (SCS) communicates with CloudStack and the Zabbix Monitoring System through the APIs they provide. SCS itself provides a REST API and a web frontend with which users can send Resource Allocation Requests to the cloud. A user first composes a Resource Allocation Request in a structured format (discussed later) and sends it to SCS. SCS then queries the Zabbix Resource Monitor to fetch the latest information on resource utilization and availability in the cloud. Based on resource availability, SCS decides how to allocate the request and on which host it should be allocated. Once these decisions are taken, they are executed on CloudStack via the CloudStack API.

Overview of Smart Cloud Scheduler

Implementation of Smart Cloud Scheduler

The source code of the Smart Cloud Scheduler is hosted on GitHub and can be accessed at the following URL:

The following diagram represents the high-level architecture of our system. It clearly shows how our resource scheduler sits between the Zabbix Resource Monitoring System and the CloudStack IaaS framework and coordinates them. A user can submit a request to the scheduler through the web frontend or using the API endpoint provided by the scheduler. When the request is validated, authenticated, and authorized, the authentication service forwards an authorized and prioritized request to the core scheduler.

High Level Architecture of the system

The Smart Cloud Scheduler is a Node.js implementation which uses a MongoDB database for storage. We are using Mongoose as the Node.js driver for MongoDB and the 'csclient' Node.js module for CloudStack API calls. The Authentication service also provides authorization for the Smart Cloud Scheduler and uses MongoDB to store user account information. The Core Scheduler is the main component in the diagram above and encapsulates the core functionality of the resource scheduler.

Component based architecture of the system

The following diagram illustrates the internal component-based architecture of the Core Scheduler.

A user can issue a resource allocation request using either the web frontend or the REST API. The request is authenticated, authorized, and tagged according to the user's priority and the priority requested for the job. This authorized and prioritized request is then sent to the Host Filter, which fetches the latest resource information from Zabbix in order to determine which hosts can be chosen to allocate the incoming request. If there are hosts on which the request can be directly allocated, the request is passed to the VM Scheduler and allocation is performed on the hosts selected by the Host Filter.
        If there are no hosts on which the current request can be allocated, it is sent to the Priority Scheduler, which forwards it to the Migration Scheduler. The Migration Scheduler tries to obtain space for the request on any host in the cloud by re-shuffling VMs. If it can find space, it returns to the Priority Scheduler the hosts on which it created space; the Priority Scheduler passes this information to the VM Scheduler, which performs the allocation on the selected hosts.
        If migration scheduling is not possible, the Priority Scheduler forwards the request to the Preemptive Scheduler. The Preemptive Scheduler checks the priority of the incoming request and whether there are previous allocations on the cloud with lower priority. If there are, it checks whether enough resources for the incoming request can be released by preempting some of those VMs. If so, the Preemptive Scheduler preempts those VMs and returns the host on which the resources have been released; this host is then used to allocate the new request.

In the next post about Smart Cloud Scheduler, I will describe each component of the SCS and will discuss current implementation status, limitations and future improvements.

Tuesday, December 23, 2014

Installing CUDA Toolkit 6.5 in Ubuntu 14.10/14.04

10:44 PM Posted by Deepal Jayasekara , , , 14 comments

Checking system capability

First of all, you need to check whether your GPU is CUDA-capable. You can see whether your GPU is listed in CUDA GPUs at


Download the CUDA Toolkit from:
You need to download the RUN installer for Ubuntu 14.04.

Installing background

Before installing the CUDA toolkit, you first need to install the NVIDIA proprietary driver in Ubuntu. You can install it using Additional Drivers in Ubuntu.

Then you need to exit the Ubuntu graphics session and switch to the CLI. To do this, press Ctrl+Alt+F1 and run the following command to stop the lightdm display manager.

sudo service lightdm stop

Once lightdm is stopped, you can proceed to the next steps.

Installing Prerequisites

If your system does not include the necessary dependencies, you might encounter the following errors when installing the CUDA toolkit.

Missing recommended library:
Missing recommended library:
Missing recommended library:
Missing recommended library:

You need to install the packages which provide the above dependencies as follows:

sudo apt-get install libglu1-mesa libxi-dev libxmu-dev

Installing CUDA toolkit and samples

Once the dependencies are installed, go to where the CUDA toolkit was downloaded and run the following commands to start the installation.

sudo chmod a+x
sudo ./

If you get an error saying,

Unsupported compiler 4.*.*. Use --override to override this check

it means that your gcc compiler version is incompatible with the installer. To bypass this check, change the installation command to:

sudo ./ --override

You can then accept the EULA. At the next step, do not install the NVIDIA accelerated graphics driver (select 'no' when asked to install the driver), since we have already installed the proprietary driver.

If everything went well, your installation will complete successfully. After installation, you can enable the NVIDIA CUDA Compiler (nvcc) by including the following lines at the end of your ~/.bashrc file.

For 32-bit systems, include the following two lines at the end of the .bashrc file:

export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib

For 64-bit systems, include the following two lines at the end of the .bashrc file:

export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH

When that is done, you can check your nvcc installation by running:

nvcc --version

You can then go back to your graphics session by running,

sudo service lightdm start

These steps worked for me. Feel free to leave a comment below if you encounter an error during installation.

Thank you.

Wednesday, December 10, 2014

A Digital Signature Batch Verification Scheme for ElGamal

8:30 AM Posted by Deepal Jayasekara , , , , 4 comments
Batch Screening is a scheme used with the ElGamal Signature Scheme to improve the performance of verifying a large number of signed messages. In batch screening, a batch of messages is verified all at once, rather than verifying each of them individually as in the standard method. Following is an implementation of a batch screening system for the ElGamal signature scheme in Python.
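The original embedded code is not reproduced here; the following is a minimal sketch of the idea in Python. It uses deliberately tiny, insecure toy parameters so the demo runs instantly, and checks the batch identity g^(ΣH(mᵢ)) ≡ Π yʳⁱ·rᵢˢⁱ (mod p) with a single left-hand exponentiation instead of one per message:

```python
import hashlib
import random
from math import gcd

# Toy parameters -- far too small for real security, chosen only so the
# demo runs instantly. p is prime; g is a fixed base.
p = 1000000007
g = 5

def H(m: bytes) -> int:
    """Hash a message into the exponent group Z_{p-1}."""
    return int.from_bytes(hashlib.sha256(m).digest(), 'big') % (p - 1)

def keygen():
    x = random.randrange(2, p - 2)        # private key
    return x, pow(g, x, p)                # (x, y = g^x mod p)

def sign(m: bytes, x: int):
    while True:
        k = random.randrange(2, p - 2)    # per-message nonce, gcd(k, p-1) = 1
        if gcd(k, p - 1) == 1:
            break
    r = pow(g, k, p)
    s = (pow(k, -1, p - 1) * (H(m) - x * r)) % (p - 1)
    return r, s

def verify(m: bytes, r: int, s: int, y: int) -> bool:
    """Standard one-at-a-time check: g^H(m) == y^r * r^s (mod p)."""
    return pow(g, H(m), p) == (pow(y, r, p) * pow(r, s, p)) % p

def batch_screen(messages, signatures, y) -> bool:
    """Screen the whole batch with one exponentiation on the left side."""
    lhs = pow(g, sum(H(m) for m in messages) % (p - 1), p)
    rhs = 1
    for r, s in signatures:
        rhs = (rhs * pow(y, r, p) * pow(r, s, p)) % p
    return lhs == rhs
```

Note that screening a batch this way confirms the batch as a whole; a forged signature could in principle be masked by another crafted signature in the same batch, which is why it is called screening rather than verification.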

Tuesday, July 8, 2014

Thursday, April 17, 2014

Configuring Secure IIS Response Headers in ASP.NET MVC

In a previous post I talked about how to configure secure responses in Apache by adding secure response headers (such as X-Frame-Options, X-XSS-Protection, etc.) and omitting headers that disclose internal implementation and technical details of the web server (such as X-Powered-By). In this post, I will talk about how to do this in an ASP.NET MVC web application. Instead of configuring these settings on the IIS server, this time I'm going to do it in the ASP.NET code itself, since that gives more flexibility and does not affect other applications hosted on the same IIS server.

Using Anti Forgery Tokens with AJAX in ASP.NET

As you know, we can use anti-forgery tokens to prevent Cross-Site Request Forgery. Usually we add an anti-forgery token in a hidden form field. Each time we post the form to the server, the server validates the request using the token in the hidden form field together with the one sent as a cookie. See this post on my blog for more detailed information.
              But when we use AJAX to post data to the server asynchronously, we usually do not use forms. In those cases we explicitly need to attach the anti-forgery token to the data sent via AJAX. We can do this with some simple JavaScript.

Sunday, March 2, 2014

Apache Security - Configuring Secure Response Headers

In this post I talk about how to configure some security options of the Apache Web Server. Proper configuration of the Apache Web Server is extremely important, since it can sometimes prevent certain web application attacks even when the vulnerability exists in the web application itself. I'll describe how to configure Apache to send security-related HTTP headers in its responses and to hide sensitive information from server response headers.