Video Processing in node.js
Our forthcoming release (1.0.20) includes a solid, high-performance system for file IO, so Fabric Engine can now be put through its paces on any IO-intensive task.
Aside from client-side applications, this new system allows developers to implement IO-intensive automation using Fabric Engine inside node.js. To illustrate this, we’ve built a sample application that uses all of the power of Fabric Engine for custom video processing. The application uses the Fabric Engine Dependency Graph, operators written in our KL language, JavaScript logic, and multithreading through the Map&Reduce framework. It’s targeted at people who are new to Fabric Engine but familiar with JavaScript as well as node.js. In total, the application is about 300 lines of source code, JavaScript and KL combined. The whole process could be implemented in KL using Map&Reduce, but to show all of the different components working together it made more sense to use JavaScript for the logic and program flow, and KL for the heavy lifting and IO.
I will now walk you through the source and discuss the approaches taken. The full source code of the script is linked at the end of this article.
Goal of the application
To discourage clients from sharing custom video footage online, one approach is to personalize the footage, for example by adding a watermark to it. The watermark has to contain personal information about the client, so that they are uncomfortable with distributing the footage. The watermark should also move slowly through the video frame, which makes it harder to remove later on. The goal of the application is to load a video file, watermark every single frame with a moving, personalized image, and encode the result to another video file.
Dependency Graph Design

The dependency graph used in this sample contains 3 nodes. The green watermark node is separated from the video stream, since it doesn’t change over time. It’s only computed a single time, and performs the loading of a PNG image as well as compositing of procedural text (client data) into the watermark image. The yellow nodes evaluate for every frame of the input video. Before we explain the operators used, let’s create the nodes and introduce the necessary dependencies:
// create a context and store it
var F = require('Fabric').createClient();
// create the dependency graph nodes
var watermarkNode = F.DG.createNode("watermark");
var videoInputNode = F.DG.createNode("videoInput");
var videoOutputNode = F.DG.createNode("videoOutput");
// create dependencies between them
videoOutputNode.setDependency(videoInputNode,'input');
videoOutputNode.setDependency(watermarkNode,'watermark');
Now we can define the data that flows through the graph. Each node holds its own data. Note that the watermark node contains a text member that stores the client data, such as a name or an address. All file paths are fixed, absolute paths in this sample, but they could of course be driven dynamically by JavaScript, a database, and so on. The PNG image delivers RGBA pixels, while the video stream works with RGB pixels (for now). The VideoHandle type is defined in the FabricVIDEO extension. It represents an input or output video stream that can be used with the functions provided by the video extension.
// create the watermark image members
watermarkNode.addMember('filePath','String','/development/watermark/dvd-logo.png');
watermarkNode.addMember('clientData','String','Client Name & Address');
watermarkNode.addMember('width','Size',0);
watermarkNode.addMember('height','Size',0);
watermarkNode.addMember('pixels','RGBA[]');
// create the input video members
videoInputNode.addMember('filePath','String','/development/watermark/earth_short.mp4');
videoInputNode.addMember('video','VideoHandle');
videoInputNode.addMember('pixels','RGB[]');
videoInputNode.addMember('time','Scalar',0.0);
// create the output video members
videoOutputNode.addMember('filePath','String','/development/watermark/earth_short.mpeg');
videoOutputNode.addMember('video','VideoHandle');
videoOutputNode.addMember('offsetWidth','Scalar',0.0);
videoOutputNode.addMember('offsetHeight','Scalar',0.0);
videoOutputNode.addMember('directionWidth','Scalar',1.0);
videoOutputNode.addMember('directionHeight','Scalar',1.0);
videoOutputNode.addMember('pixels','RGB[]');
The offsetWidth and offsetHeight members, together with directionWidth and directionHeight, are used to move the watermark image within the video frame. The offset members hold the current offset, while the direction members hold the current travel direction along the width and the height.
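To make the interplay of offset and direction more concrete, here is a minimal plain-JavaScript sketch of a bouncing offset along one axis. It runs without Fabric. The function name stepOffset and its parameters are hypothetical and only serve as illustration, and the clamping differs slightly from the compositeWatermark operator shown later in this article (this version also guards against a watermark that is larger than the frame).

```javascript
// Plain-JavaScript sketch of a bouncing offset along one axis.
// `stepOffset` and its parameter names are hypothetical, not part of Fabric Engine.
function stepOffset(offset, direction, frameSize, markSize) {
  // Largest offset that still keeps the watermark fully inside the frame.
  var maxOffset = Math.max(0, frameSize - markSize);
  offset += direction;
  if (offset < 0) {
    // Hit the left/top border: clamp and reverse.
    offset = 0;
    direction = -direction;
  } else if (offset > maxOffset) {
    // Hit the right/bottom border: clamp and reverse.
    offset = maxOffset;
    direction = -direction;
  }
  return { offset: offset, direction: direction };
}

// Example: a 100-pixel-wide watermark in a 640-pixel-wide frame.
var state = { offset: 538, direction: 1 };
for (var frame = 0; frame < 4; frame++) {
  state = stepOffset(state.offset, state.direction, 640, 100);
  console.log(frame, state.offset, state.direction);
}
// The offset runs 539, 540, 540, 539 and the direction flips to -1 on the third step.
```

The same function would be called once per frame for the width and once for the height, each with its own offset and direction.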
Operator Utility Function
Since I am not using any element of the JSSceneGraph in this tutorial, there is no ready-made wrapper for creating operators, so we implement our own. For simplicity’s sake, the operator name and the entry function name will match, and both the binding and the source code are provided as string arrays. The function also checks for errors (such as compilation or binding errors).
// define a helper function to construct an operator
var createOperator = function(node,options) {
  // check input arguments
  if(options.name == undefined)
    throw('You need to specify an operator name!');
  if(options.srcCode == undefined)
    throw('You need to specify an operator sourcecode!');
  if(options.binding == undefined)
    throw('You need to specify an operator binding!');
  // create the operator
  var operator = F.DG.createOperator(options.name);
  operator.setSourceCode(options.srcCode.join('\n'));
  operator.setEntryFunctionName(options.name);
  // create a binding between the node and the operator
  var binding = F.DG.createBinding();
  binding.setOperator(operator);
  binding.setParameterLayout(options.binding);
  // append the new binding to the node
  node.bindings.append(binding);
  var errors = node.getErrors();
  if(errors.length) {
    throw(errors);
  }
};
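As a rough illustration of what the helper expects, here is a sketch of an options object for the writeVideoFrame operator that appears later in this article. It runs without Fabric, since it only builds the object and checks it. Two things are assumptions on my side and should be checked against the full source: the layout strings in binding (I am assuming the "self.member" and "dependencyName.member" pattern), and the small declaresEntryFunction check, which is a hypothetical helper and not part of Fabric Engine.

```javascript
// Hypothetical options object for the createOperator helper.
var writeFrameOptions = {
  name: 'writeVideoFrame',
  // The KL source is given as an array of lines, joined with '\n' by the helper.
  srcCode: [
    'use FabricVIDEO;',
    'operator writeVideoFrame(io VideoHandle video, io RGB pixels[], io Scalar time) {',
    '  if(video.pointer) {',
    '    FabricVIDEOWriteAllPixels(video,pixels);',
    '  }',
    '}'
  ],
  // ASSUMPTION: layout strings of the form "self.<member>" and "<dependency>.<member>".
  binding: ['self.video', 'self.pixels', 'input.time']
};

// Hypothetical sanity check: because the helper reuses `name` as the entry
// function name, the KL source must declare an operator with exactly that name.
function declaresEntryFunction(options) {
  var source = options.srcCode.join('\n');
  return source.indexOf('operator ' + options.name + '(') !== -1;
}

console.log(declaresEntryFunction(writeFrameOptions)); // true
```

With the real client in place, the object would be passed along as createOperator(videoOutputNode, writeFrameOptions).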
Operator Implementation
Before we look at video, let’s deal with the watermark image. We have to use both the FabricFILESYSTEM extension, to access the local disk, and the FabricCIMG extension, to load the PNG image. After the image has been loaded, we construct a temporary image (pixel array), fill it with procedural text pixels, and composite it on top of the PNG-based watermark. Since this operator only evaluates a single time, and the watermark doesn’t change over time, I didn’t bother to use Map&Reduce for multithreading. All in all, the operator appended to the watermarkNode’s bindings looks like this:
use FabricCIMG;
use FabricFILESYSTEM;
operator openWaterMark(io String filePath, io String clientText, io Size width, io Size height, io RGBA pixels[]) {
  if(pixels.size() == 0) {
    FabricFileHandleWrapper wrapper;
    wrapper.setAbsolutePath(filePath);
    FabricCIMGOpenFileHandle(wrapper.getHandle(), width, height, pixels);
    report("Loaded Watermark image, resolution "+width+"x"+height+".");
    Size textWidth;
    Size textHeight;
    RGBA textPixels[];
    FabricCIMGCreateFromText(clientText,textWidth,textHeight,textPixels);
    report("Created client text image, resolution "+textWidth+"x"+textHeight+".");
    Size centerWidth = (width - textWidth) * 0.5;
    Size centerHeight = 190;
    for(Size textX=0;textX<textWidth;textX++) {
      Size x = centerWidth + textX;
      for(Size textY=0;textY<textHeight;textY++) {
        Size y = centerHeight + textY;
        Size index = y * width + x;
        Size textIndex = textY * textWidth + textX;
        pixels[index] = textPixels[textIndex];
      }
    }
    report("Copied client text into watermark image.");
  }
}
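The copy loop in the operator above relies on both images being stored as flat, row-major arrays, so a pixel at (x, y) lives at index y * width + x. Here is a small plain-JavaScript sketch of the same idea, with tiny character "images" so the result can be printed. The blit function and its parameters are hypothetical names, and unlike the KL operator this sketch skips pixels that would fall outside the destination.

```javascript
// Copy a small image into a larger one, both stored as flat row-major arrays.
// `blit` and all parameter names are hypothetical.
function blit(dst, dstWidth, dstHeight, src, srcWidth, srcHeight, left, top) {
  for (var sy = 0; sy < srcHeight; sy++) {
    var y = top + sy;
    if (y < 0 || y >= dstHeight) continue; // row is outside the destination
    for (var sx = 0; sx < srcWidth; sx++) {
      var x = left + sx;
      if (x < 0 || x >= dstWidth) continue; // column is outside the destination
      dst[y * dstWidth + x] = src[sy * srcWidth + sx];
    }
  }
  return dst;
}

// 4x3 destination filled with '.', 2x2 source filled with '#'.
var markImage = new Array(12).fill('.');
var textImage = ['#', '#', '#', '#'];
// Center horizontally, like centerWidth in the operator, and start at row 1.
blit(markImage, 4, 3, textImage, 2, 2, Math.floor((4 - 2) * 0.5), 1);
console.log(markImage.join('')); // .....##..##.
```

Read in rows of four, the printed string is "....", ".##.", ".##.", so the text block sits centered in the two lower rows.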
To load the video, we use a similar technique but a different extension. The operator executes on every frame, but we only need to open the video on the first frame, so we have to account for that. We also need a second operator, which seeks within the video and pulls the frame’s pixels. These two operators are appended to the videoInputNode’s bindings:
use FabricVIDEO;
use FabricFILESYSTEM;
operator openVideoFile(io String filePath, io VideoHandle video) {
  if(!video.pointer) {
    FabricFileHandleWrapper wrapper;
    wrapper.setAbsolutePath(filePath);
    FabricVIDEOOpenFileHandle(wrapper.getHandle(), video);
    report("Loaded video handle, resolution "+video.width+"x"+video.height+".");
  }
}
operator seekVideoFile(io VideoHandle video, io RGB pixels[], io Scalar time) {
  if(video.pointer) {
    FabricVIDEOSeekTime(video,time);
    FabricVIDEOGetAllPixels(video,pixels);
  }
}
For the videoOutputNode we first need an operator that creates the video file. For this, we again use the FabricFILESYSTEM extension together with FabricVIDEO. This operator should only do its work if there is no valid VideoHandle yet, so we have to check for that.
use FabricVIDEO;
use FabricFILESYSTEM;
operator createVideoOutput(io VideoHandle video, io String filePath, io VideoHandle input) {
  if(!video.pointer) {
    FabricFileHandleWrapper wrapper;
    wrapper.setAbsolutePath(filePath);
    FabricVIDEOCreateFromFileHandle(wrapper.getHandle(), input.width, input.height, video);
    report("Created output video handle, resolution "+video.width+"x"+video.height+".");
  }
}
Now we can do the heavy lifting of compositing. For this we will use the Map&Reduce framework. We create an ArrayProducer for the input pixels and combine it with a transform operator that runs on each pixel. The transform operator (the actual per-pixel compositing) needs access to the watermark image as well as some additional data (widths, heights etc.), so we set up a custom struct to store this data, and a constant ValueProducer that shares it with the transform operator. This way all of the information necessary to perform the compositing can be accessed by all threads at the same time. The .produce() method of the transform’s ArrayProducer returns the composited image, that is, the output pixel array.
To make the watermark image travel around, we add the directionWidth and directionHeight members to the offsetWidth and offsetHeight members on every frame. When the offset reaches a border of the video frame, we negate directionWidth or directionHeight, respectively, to reverse the direction of travel.
To show the watermark, we take the red component of the watermark image and add it (multiplied by 0.33) to each channel of the input video pixel, so the white areas of the watermark lighten the video. Since we run this through an ArrayProducer, the transformPixel operator is multithreaded automatically.
struct sharedDataType {
  Size width;
  Size height;
  RGBA markPixels[];
  Size markWidth;
  Size markHeight;
  Size offsetWidth;
  Size offsetHeight;
};
function Byte clampByte(Scalar s) {
  if(s < 0.0) return Byte(0);
  if(s > 255.0) return Byte(255);
  return Byte(s);
}
operator transformPixel(io RGB pixel, Size index, Size count, sharedDataType sharedData) {
  Size x = index % sharedData.width;
  Size y = (index - x) / sharedData.width;
  if(x < sharedData.offsetWidth || y < sharedData.offsetHeight)
    return;
  x -= sharedData.offsetWidth;
  y -= sharedData.offsetHeight;
  if(x >= sharedData.markWidth || y >= sharedData.markHeight)
    return;
  Size markIndex = x + y * sharedData.markWidth;
  Scalar alpha = Scalar(sharedData.markPixels[markIndex].r);
  pixel.r = clampByte(Scalar(pixel.r) + alpha * 0.33);
  pixel.g = clampByte(Scalar(pixel.g) + alpha * 0.33);
  pixel.b = clampByte(Scalar(pixel.b) + alpha * 0.33);
}
operator compositeWatermark(
  io VideoHandle outputVideo,
  io VideoHandle inputVideo,
  io RGB inputPixels[],
  io RGB outputPixels[],
  io RGBA waterMarkPixels[],
  io Size waterMarkWidth,
  io Size waterMarkHeight,
  io Scalar offsetWidth,
  io Scalar offsetHeight,
  io Scalar directionWidth,
  io Scalar directionHeight
) {
  if(!inputVideo.pointer || !outputVideo.pointer || !waterMarkPixels.size())
    return;
  offsetWidth += directionWidth;
  offsetHeight += directionHeight;
  if(offsetWidth < 0.0) {
    directionWidth *= -1.0;
    offsetWidth = 0.0;
  } else if(Size(offsetWidth) >= inputVideo.width - waterMarkWidth) {
    directionWidth *= -1.0;
    offsetWidth = Scalar(inputVideo.width - waterMarkWidth - 1);
  }
  if(offsetHeight < 0.0) {
    directionHeight *= -1.0;
    offsetHeight = 0.0;
  } else if(Size(offsetHeight) >= inputVideo.height - waterMarkHeight) {
    directionHeight *= -1.0;
    offsetHeight = Scalar(inputVideo.height - waterMarkHeight - 1);
  }
  sharedDataType sharedData;
  sharedData.width = inputVideo.width;
  sharedData.height = inputVideo.height;
  sharedData.markPixels = waterMarkPixels;
  sharedData.markWidth = waterMarkWidth;
  sharedData.markHeight = waterMarkHeight;
  sharedData.offsetWidth = Size(offsetWidth);
  sharedData.offsetHeight = Size(offsetHeight);
  ValueProducer<sharedDataType> sharedDataProducer = createConstValue(sharedData);
  ArrayProducer<RGB> pixelsProducer = createConstArray(inputPixels);
  ArrayProducer<RGB> compositeProducer = createArrayTransform(pixelsProducer, transformPixel,sharedDataProducer);
  outputPixels = compositeProducer.produce();
}
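If you want to check the per-pixel math without running Fabric, the logic of transformPixel can be mirrored in a few lines of plain JavaScript. This is only a single-threaded sketch of the arithmetic, not the Map&Reduce implementation. The names lightenPixel, markIndexFor and strength are hypothetical, and clampByte is reimplemented here with Math.floor standing in for the Byte conversion in KL.

```javascript
// Clamp a number into the 0..255 byte range, mirroring clampByte in the KL code.
function clampByte(value) {
  return Math.max(0, Math.min(255, Math.floor(value)));
}

// Lighten one RGB pixel by a fraction of the watermark's red channel.
// `lightenPixel` and `strength` are hypothetical names.
function lightenPixel(pixel, markRed, strength) {
  return {
    r: clampByte(pixel.r + markRed * strength),
    g: clampByte(pixel.g + markRed * strength),
    b: clampByte(pixel.b + markRed * strength)
  };
}

// Map a flat frame index to a flat watermark index,
// or return null if the pixel is not covered by the watermark.
function markIndexFor(index, frameWidth, markWidth, markHeight, offsetX, offsetY) {
  var x = index % frameWidth;
  var y = (index - x) / frameWidth;
  if (x < offsetX || y < offsetY) return null;
  x -= offsetX;
  y -= offsetY;
  if (x >= markWidth || y >= markHeight) return null;
  return x + y * markWidth;
}

console.log(lightenPixel({ r: 200, g: 100, b: 0 }, 255, 0.33)); // { r: 255, g: 184, b: 84 }
console.log(markIndexFor(13, 10, 4, 4, 2, 1)); // 1
console.log(markIndexFor(0, 10, 4, 4, 2, 1)); // null
```

In the second call, frame index 13 in a 10-pixel-wide frame is the pixel at (3, 1). With the watermark offset at (2, 1), that pixel maps to watermark coordinate (1, 0), which is index 1.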
Finally, since we have performed the compositing, we can write the image back out to the output video stream.
use FabricVIDEO;
operator writeVideoFrame(io VideoHandle video, io RGB pixels[], io Scalar time) {
  if(video.pointer) {
    FabricVIDEOWriteAllPixels(video,pixels);
  }
}
JavaScript Logic and Time Control
After the evaluation of the first frame, we can pull the input video struct into JavaScript to determine the duration of the video. Then we can loop through the video, change the time on the videoInputNode and fire another evaluation of the graph. This converts the whole video, since we execute the graph for every frame it contains. Since the time logic is done in JavaScript, we can output a nice progress indicator to the console.
// evaluate the nodes!
videoOutputNode.evaluate();
var videoData = videoInputNode.getData('video',0);
var fps = videoData.fps;
var duration = videoData.duration;
var time = 0.0;
// output some standard information
console.log('--------------------------------------');
console.log('Input video: '+videoInputNode.getData('filePath',0));
console.log('WaterMark: '+watermarkNode.getData('filePath',0));
console.log('Output video: '+videoOutputNode.getData('filePath',0));
console.log('Duration: '+duration+' seconds.');
console.log('--------------------------------------');
// output a progress bar
var lastLogTime = -1;
var logProgress = function() {
  // clamp to 100, since the last time step can overshoot the duration
  var logTime = Math.min(100, Math.floor(100.0 * time / duration));
  if(logTime != lastLogTime) {
    var prog = '';
    while(prog.length < logTime * 0.5)
      prog += '#';
    while(prog.length < 50)
      prog += ' ';
    process.stdout.write('\r --> Converting: ['+prog+'] '+logTime+'%.');
    lastLogTime = logTime;
  }
};
logProgress();
// increment the time, to render each frame one after another
var incrementTime = function() {
  if(time >= duration) {
    logProgress();
    F.close();
    console.log('\ndone.');
    return;
  }
  setTimeout(function(){
    time += 1.0 / fps;
    videoInputNode.setData('time',0,time);
    videoOutputNode.evaluate();
    logProgress();
    incrementTime();
  }, 1);
};
incrementTime();
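The loop above advances the time by adding 1.0 / fps on every step. A possible variation, which is not what the sample does, is to derive each frame’s time from an integer frame index so that rounding errors cannot accumulate over long videos. The sketch below shows this together with a progress bar written as a pure function, so both can be tried without Fabric. The names renderBar and frameTimes are hypothetical.

```javascript
// Render a fixed-width text progress bar. `renderBar` is a hypothetical helper.
function renderBar(fraction, width) {
  // Keep the fraction inside 0..1 so the bar never over- or underflows.
  var clamped = Math.max(0, Math.min(1, fraction));
  var filled = Math.round(clamped * width);
  var bar = '';
  for (var i = 0; i < width; i++) {
    bar += i < filled ? '#' : ' ';
  }
  return '[' + bar + '] ' + Math.floor(clamped * 100) + '%';
}

// Derive each frame's time from an integer frame index rather than
// repeatedly adding 1/fps, so rounding errors do not accumulate.
function frameTimes(duration, fps) {
  var frameCount = Math.floor(duration * fps);
  var times = [];
  for (var frame = 0; frame < frameCount; frame++) {
    times.push(frame / fps);
  }
  return times;
}

var sampleTimes = frameTimes(2, 25);
console.log(sampleTimes.length); // 50
console.log(renderBar(0.5, 10)); // [#####     ] 50%
console.log(renderBar(1.2, 10)); // [##########] 100%
```

In the sample, each entry of such a list would be written to the 'time' member of the videoInputNode before the next evaluation.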
Results
Watermarked footage:
Original footage:
Sum-Up
Combining FabricFILESYSTEM with FabricVIDEO makes it possible to perform video processing inside node.js. There are a dozen ways one could implement this, and this particular sample is mainly meant to be educational, so please let me know what you think about this article.
Source Code
https://github.com/fabric-engine/PublicDev/tree/ver-1.1.0-alpha/InProgress