Fabric Engine Server Performance Benchmarks

November 11, 2011, Posted by Paul Doyle

Revision History

Date         Author       Description
11/11/2011   Peter Zion   Describe results of Monte Carlo Value-at-Risk benchmark

Introduction

The purpose of this document is to provide an objective comparison between computational algorithms implemented in JavaScript (alone), single-threaded C++, multithreaded C++ and JavaScript using Fabric Engine.

All tests were performed on an Amazon EC2 High-CPU Extra Large Instance (c1.xlarge) running 64-bit Ubuntu 10.10. No software other than the default system software was running on the machine at the time the tests were run. The specific instance characteristics (from Amazon) are:

  • 7 GB of memory
  • 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
  • 1690 GB of instance storage
  • 64-bit platform
  • I/O Performance: High
  • API name: c1.xlarge

For each benchmark, four different versions of the benchmark were created:

  • JavaScript (Node.js)
  • JavaScript (Node.js) + Fabric Engine
  • Single-threaded C++
  • Multithreaded C++

Exactly the same algorithm is implemented in every case; any algorithmic differences are isolated to containers provided by the language itself (e.g. JavaScript objects vs. C++ std::map). Every reasonable effort has been made to optimize each implementation for the highest performance achievable in its language; for instance, fixed-length arrays (as opposed to variable-length arrays) are used wherever permitted by the algorithm and the language.

The C++ versions were compiled with gcc version 4.4.5 using the compiler flags “-O6 -lpthread”. When possible, fixed-size C-style arrays were used for all data, as this generally increases performance compared to using managed containers (such as std::vector). Pthreads was used for all thread management, with exactly one thread running for each core; the work is divided as evenly as possible among the threads. Threads are created only once and exit as soon as the computation is complete. No shared work queue is used; instead, each thread is told in advance exactly what work it has to do. We believe this is the fastest that the C++ thread usage model can be under the constraint that the threads don’t already exist when the benchmark starts; the same constraint is applied to Fabric Engine.
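The threading model described above can be sketched roughly as follows. This is a minimal illustration and not the benchmark's actual code: the names WorkSlice, sum_slice and parallel_sum are hypothetical, and summing an array stands in for the real per-thread work. Each thread is created once, receives a fixed slice decided in advance, and exits when its slice is done.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Work assigned to one thread: the half-open index range [begin, end).
struct WorkSlice {
    const double* data;
    std::size_t begin;
    std::size_t end;
    double partial_sum;  // written only by the owning thread
};

// Thread entry point: process exactly the slice this thread was given.
static void* sum_slice(void* arg) {
    WorkSlice* slice = static_cast<WorkSlice*>(arg);
    double sum = 0.0;
    for (std::size_t i = slice->begin; i < slice->end; ++i) {
        sum += slice->data[i];
    }
    slice->partial_sum = sum;
    return NULL;
}

// Split `count` items as evenly as possible over `num_threads` threads.
// There is no shared work queue: every slice is fixed before any thread starts.
double parallel_sum(const double* data, std::size_t count, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<WorkSlice> slices(num_threads);
    std::vector<pthread_t> threads(num_threads);

    const std::size_t base = count / num_threads;
    const std::size_t extra = count % num_threads;  // first `extra` slices get one more item
    std::size_t next = 0;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t len = base + (t < extra ? 1 : 0);
        slices[t].data = data;
        slices[t].begin = next;
        slices[t].end = next + len;
        slices[t].partial_sum = 0.0;
        next += len;
    }

    // Create each thread exactly once.
    for (std::size_t t = 0; t < num_threads; ++t) {
        int rc = pthread_create(&threads[t], NULL, sum_slice, &slices[t]);
        assert(rc == 0);
        (void)rc;
    }

    // Wait for every thread to finish, then combine the partial results.
    double total = 0.0;
    for (std::size_t t = 0; t < num_threads; ++t) {
        pthread_join(threads[t], NULL);
        total += slices[t].partial_sum;
    }
    return total;
}
```

On the 8-core test machine this would be called with num_threads set to 8. Because the slices differ in length by at most one item, no thread should sit idle for long as long as every item costs about the same.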

The JavaScript (Node.js) versions are run with Node.js version 0.6.1-pre.

The Fabric-enabled JavaScript (Node.js) versions are timed on their second execution rather than their first, so that the optimization time is not included. This is consistent with typical Fabric usage; Fabric caches the optimized compilation after its first use and does not re-optimize the program.
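The idea of timing only the second execution can be sketched with a small harness like the one below. This is an illustrative sketch in C++, not the harness used for the benchmarks; the name time_second_run is hypothetical, and the first call simply stands in for whatever one-time cost (such as optimization) the first execution pays.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` twice and returns the wall-clock seconds of the second run only.
template <typename Work>
double time_second_run(Work work) {
    work();  // first run: pays any one-time cost and is deliberately not timed

    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();  // second run: this is the one that is measured
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    assert(seconds >= 0.0);  // steady_clock is monotonic
    return seconds;
}
```

Any callable that takes no arguments can be passed in, for example a lambda that wraps one full run of a benchmark.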

All benchmark code has been made available for public review at https://github.com/fabric-engine/Benchmarks/. Note: The source code is provided for historical purposes only and may not work with recent versions of the software.

Monte Carlo Value-at-Risk

Description

The Monte Carlo Value-at-Risk benchmark uses a Monte Carlo simulation to calculate the Value-at-Risk (95%) for a portfolio of 10 stocks whose prices are simulated over 252 trading days as stochastic random walks according to the geometric Brownian motion model; 1048576 (2^20) such random walks are independently simulated. The same pseudo-random number generator is used in all implementations so that they all arrive at the same result.

In both the C++ versions and the JavaScript with Fabric Engine versions the number of stocks is a compile-time constant; this allows both implementations to use fixed-length arrays leading to better performance. This is not possible in JavaScript on its own since JavaScript only supports variable-length arrays.
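To make the description concrete, here is a minimal sketch of a Monte Carlo Value-at-Risk calculation with a compile-time stock count and a fixed-length price array. It is not the benchmark's code. The function name value_at_risk_95, the starting price of 100, the use of uncorrelated stocks with identical drift and volatility, and the choice of std::mt19937 as the generator are all assumptions made for illustration; the benchmark's own generator and parameters are in the repository linked above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

const int kNumStocks = 10;        // compile-time constant, so arrays can be fixed-length
const int kNumTradingDays = 252;  // one simulated year of trading days

// Returns the 95% Value-at-Risk: the loss at the 95th percentile of the
// simulated loss distribution. All stocks share the same (assumed) drift and
// volatility and are simulated independently of one another.
double value_at_risk_95(std::size_t num_paths, double drift, double volatility,
                        unsigned seed) {
    assert(num_paths > 0);
    std::mt19937 rng(seed);  // assumed generator, fixed seed for repeatability
    std::normal_distribution<double> standard_normal(0.0, 1.0);

    const double dt = 1.0 / kNumTradingDays;
    const double step_drift = (drift - 0.5 * volatility * volatility) * dt;
    const double step_vol = volatility * std::sqrt(dt);
    const double initial_price = 100.0;  // hypothetical starting price per stock
    const double initial_value = initial_price * kNumStocks;

    std::vector<double> losses(num_paths);
    for (std::size_t p = 0; p < num_paths; ++p) {
        double prices[kNumStocks];  // fixed-length array, one entry per stock
        for (int s = 0; s < kNumStocks; ++s) prices[s] = initial_price;

        // Geometric Brownian motion: multiply by exp(drift term + random shock) each day.
        for (int day = 0; day < kNumTradingDays; ++day) {
            for (int s = 0; s < kNumStocks; ++s) {
                prices[s] *= std::exp(step_drift + step_vol * standard_normal(rng));
            }
        }

        double final_value = 0.0;
        for (int s = 0; s < kNumStocks; ++s) final_value += prices[s];
        losses[p] = initial_value - final_value;  // positive means money was lost
    }

    // Select the loss at the 95th percentile without fully sorting.
    const std::size_t k = static_cast<std::size_t>(0.95 * (num_paths - 1));
    std::nth_element(losses.begin(), losses.begin() + k, losses.end());
    return losses[k];
}
```

Calling value_at_risk_95(1048576, drift, volatility, seed) would match the number of random walks used in the benchmark; a much smaller path count is enough to try the sketch out. Because each path is independent, the outer loop is the natural unit to divide among threads.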

The following chart shows the actual run time, in seconds, of each implementation of the Monte Carlo Value-at-Risk benchmark:

Monte Carlo Value-at-Risk run time

Note that the difference in run times between the Fabric-enabled JavaScript implementation and the multithreaded C++ implementation is due to the (small) overhead of interfacing with JavaScript. Also note that 413.0/53.4 ≈ 7.73 (single-threaded C++ run time divided by multithreaded C++ run time), so on the 8 cores of the test machine the multithreaded C++ implementation is indeed achieving almost linear performance scaling with the number of cores.
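The scaling arithmetic can be checked with a few lines of code. The sketch below assumes that 413.0 and 53.4 are the run times, in seconds, of the single-threaded and multithreaded C++ versions respectively, and that the machine has 8 cores as listed in the instance characteristics; the function names are illustrative.

```cpp
#include <cassert>

// Speedup: how many times faster the parallel run is than the serial run.
double speedup(double serial_seconds, double parallel_seconds) {
    assert(parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency: the fraction of ideal (linear) scaling actually achieved.
double parallel_efficiency(double serial_seconds, double parallel_seconds, int cores) {
    assert(cores > 0);
    return speedup(serial_seconds, parallel_seconds) / cores;
}
```

With these inputs, speedup(413.0, 53.4) is about 7.73 and parallel_efficiency(413.0, 53.4, 8) is about 0.967, that is, roughly 97% of perfectly linear scaling.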


Comments

  • http://www.victusspiritus.com/ Mark Essel

    This is seriously amazing stuff. It can liberate developers from c++ shackles, if only it was MIT licensed!