Converting base64 dataURI strings into Blobs or Typed Arrays

Sexy title hey? 🙂

So I’m working on a web-based video creation tool called Content Samurai, and was experimenting with some client-side video rendering.

Long story short, that involved me working out how to grab PNG images from HTML canvas objects in the most efficient way possible.

I searched Google to try to find the fastest way to turn the base64-encoded dataURI from a canvas object into a TypedArray or Blob, and it wasn’t easy to find the best results.

So, I thought I’d document it here for future generations.

What’s a dataURI?

It’s a URL that carries the contents of a file inline, usually base64-encoded, instead of pointing to a resource somewhere else. For example, to get the contents of a canvas object as one large dataURI string, you would do something like:

var canvas = document.getElementById('mycanvas');
// draw something on canvas
var dataURI = canvas.toDataURL('image/png');
console.log(dataURI);
// very long string, but here's a snippet:
// data:image/png;base64,iVBORw0KGgoAAA....QAAAAAElFTkSuQmCC

The data:image/png;base64, prefix breaks down as follows:

  1. data: marks the URI as a data URI, to be treated as an inline resource rather than something fetched over a network protocol such as http.
  2. image/png is the MIME type.
  3. base64 is the string encoding method. Read about what base64 encoding is here.
  4. Everything after the first comma is the base64-encoded data.
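
To make that breakdown concrete, here is a minimal sketch that splits a dataURI into those parts. The parseDataURI name and the shape of the object it returns are made up for illustration, and it only handles the simple mimetype-plus-optional-base64 form shown above:

```javascript
// Hypothetical helper (not part of any library): split a data URI into its parts.
function parseDataURI(dataURI) {
  if (dataURI.slice(0, 5) !== 'data:') {
    throw new Error('Not a data URI');
  }
  var commaIndex = dataURI.indexOf(',');
  if (commaIndex === -1) {
    throw new Error('Malformed data URI: no comma found');
  }
  // Everything between "data:" and the first comma is the header,
  // e.g. "image/png;base64".
  var headerParts = dataURI.slice(5, commaIndex).split(';');
  var isBase64 = headerParts[headerParts.length - 1] === 'base64';
  return {
    // Assumption: fall back to text/plain when no MIME type is given.
    mimeType: headerParts[0] || 'text/plain',
    isBase64: isBase64,
    // Everything after the first comma is the payload.
    data: dataURI.slice(commaIndex + 1)
  };
}

var parts = parseDataURI('data:image/png;base64,iVBORw0KGgo=');
console.log(parts.mimeType); // image/png
console.log(parts.isBase64); // true
console.log(parts.data);     // iVBORw0KGgo=
```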

I thought that the base64 encode and decode would be the limiting factor, and, academically, I wanted to find the most efficient way to get this dataURI into a TypedArray or Blob.

Even if you don’t need to do this, you might discover a few neat tricks:

Method 1: Use browserify Buffer to perform base64 conversion

This uses the npm buffer module to do the base64 decode if you’re using browserify. Under the hood this uses base64-js:

// browserify provides Buffer automatically; the explicit require just makes the dependency visible
var Buffer = require('buffer').Buffer;

// defined before first use so the marker is set when the function runs
var BASE64_MARKER = ';base64,';
function convertDataURIToBinaryBuffer(dataURI) {
  // skip past the "data:image/png;base64," prefix
  var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
  var base64 = dataURI.substring(base64Index);
  // decode the base64 payload into raw bytes
  var buf = Buffer.from(base64, 'base64');
  return buf;
}

var canvas = document.getElementById('mycanvas');
// draw something on canvas
var dataURI = canvas.toDataURL('image/png');
// convert to typed array
var arr = convertDataURIToBinaryBuffer(dataURI);
// convert to blob and show as image
var blob = new Blob([arr], { type: 'image/png' });
var img = document.createElement('img');
img.src = window.URL.createObjectURL(blob);
document.body.appendChild(img);

The results:

  • Operations / Second: 22.12
  • ms / operation: 45.21ms

To hit 60fps, each frame has to be done in well under 1/60th of a second (i.e. about 17ms). So this is way too slow.
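
For anyone who wants to check the arithmetic, here is a small sketch of how the two result figures relate to each other and to the frame budget. The helper names are invented for this example; the 22.12 figure is the measurement above:

```javascript
// Hypothetical helpers, for illustration only.
var TARGET_FPS = 60;
var FRAME_BUDGET_MS = 1000 / TARGET_FPS; // time available per frame

// Convert a throughput figure into a per-operation cost.
function msPerOperation(opsPerSecond) {
  return 1000 / opsPerSecond;
}

// True if one operation fits inside a single frame at the target frame rate.
function fitsFrameBudget(opsPerSecond) {
  return msPerOperation(opsPerSecond) <= FRAME_BUDGET_MS;
}

console.log(FRAME_BUDGET_MS.toFixed(2));       // 16.67
console.log(msPerOperation(22.12).toFixed(2)); // 45.21
console.log(fitsFrameBudget(22.12));           // false
```

Anything slower than 60 operations per second blows the budget before any drawing has even happened.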

Method 2: dataURI copying to Uint8Array using atob:

This is the most widely compatible way, and is quite fast:

var BASE64_MARKER = ';base64,';
function convertDataURIToBinary(dataURI) {
  // skip past the "data:image/png;base64," prefix
  var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
  var base64 = dataURI.substring(base64Index);
  // atob returns a "binary string": one character per decoded byte
  var raw = window.atob(base64);
  var rawLength = raw.length;
  var array = new Uint8Array(rawLength);

  // copy each character code (0-255) into the typed array
  for (var i = 0; i < rawLength; i++) {
    array[i] = raw.charCodeAt(i);
  }
  return array;
}

var canvas = document.getElementById('mycanvas');
// draw something on canvas
var dataURI = canvas.toDataURL('image/png');
// convert to typed array
var arr = convertDataURIToBinary(dataURI);
// convert to blob and show as image
var blob = new Blob([arr], { type: 'image/png' });
var img = document.createElement('img');
img.src = window.URL.createObjectURL(blob);
document.body.appendChild(img);

The results:

  • Operations / Second: 22.40
  • ms / operation: 44.65ms

So, not a big difference. It uses the built-in atob function to do the base64 decode.

But the toDataURL method, which creates the PNG from the canvas, is still the main thing slowing down the computation.

I also tried a few ways of offloading the base64 conversion to a WebWorker, which showed only minor improvements, given that the bulk of the work happens in PNG encoding.

For an easier way to use WebWorkers inline in your source code, check out webworkify.

Method 3: fetch() method

An interesting, and less-documented way to do base64 decoding is to use the fetch() API.

Used normally for fetching data from a remote URL, you can also use it to decode dataURI strings! While we know that PNG encoding is the overhead, I was still interested to see how it fared. And you can return the image data as a whole Blob and skip that conversion step:

var canvas = document.getElementById('mycanvas');
// draw something on canvas
var dataURI = canvas.toDataURL('image/png');
// convert to blob; fetch is asynchronous, so the blob arrives via a Promise
convertDataURIToBinaryFetch(dataURI).then((blob) => {
  var img = document.createElement('img');
  img.src = window.URL.createObjectURL(blob);
  document.body.appendChild(img);
});

function convertDataURIToBinaryFetch(dataURI) {
  return fetch(dataURI)
    .then((res) => res.blob());
}

The results:

  • Operations / Second: 11.14
  • ms / operation: 89.79ms

As you can see, this was by far the worst way to decode a base64 string! So, bastardising the fetch() API to do this kind of work is not that efficient! But at least you might add this technique to your bag of tricks!

Note, you can use the arrayBuffer() method on the Response object if you just want the raw bytes and not a Blob. It resolves to an ArrayBuffer, which you can wrap in a TypedArray such as Uint8Array.

Method 4: toBlob() method

In newer browsers, you can use the new toBlob() method to turn a canvas straight into a Blob, and skip the base64 decode process altogether:

var canvas = document.getElementById('mycanvas');
// draw something on canvas
canvas.toBlob((blob) => {
  var img = document.createElement('img');
  img.src = window.URL.createObjectURL(blob);
  document.body.appendChild(img);
});

The results:

  • Operations / Second: 25.01
  • ms / operation: 39.99ms

A marginal improvement, as we don’t have to do the base64 encode and decode, but the PNG compression is still the limiting factor. It would be good if we could tell the browser to return a PNG that is not as highly compressed, to trade file size for CPU time like we can with node-canvas.

Method 5: getImageData()

As none of the techniques would work at 60 frames per second, the obvious solution is to avoid using PNG conversion at all, and just use the getImageData() method on the CanvasRenderingContext2D object:

var canvas = document.getElementById('mycanvas');
var ctx = canvas.getContext('2d');
// draw something on canvas
var imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
// use `putImageData` to paint the data to the canvas, and use `toBlob()` method to get the data into an image

The results:

  • Operations / Second: 240.96
  • ms / operation: 4.15ms

So, while not an apples to apples comparison, the process of using the pure ImageData functions to get and put the frames is much more efficient. Though, you do need to do a PNG conversion using toBlob() or the other methods listed above if you want to attach the data to an Image, or to efficiently relay that image over a network, for example.
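
If you haven’t worked with ImageData before: its data property is a flat Uint8ClampedArray holding four bytes per pixel (red, green, blue, alpha), row by row. Here is a rough sketch of reading a single pixel out of it. The getPixel and rawFrameBytes helpers are made up for this example, and a plain Uint8ClampedArray stands in for imageData.data so the snippet runs outside a browser:

```javascript
// Hypothetical helper: read one pixel from a flat RGBA buffer such as imageData.data.
function getPixel(data, width, x, y) {
  var offset = (y * width + x) * 4; // 4 bytes per pixel, rows stored one after another
  return {
    r: data[offset],
    g: data[offset + 1],
    b: data[offset + 2],
    a: data[offset + 3]
  };
}

// Hypothetical helper: size of one uncompressed RGBA frame in bytes.
function rawFrameBytes(width, height) {
  return width * height * 4;
}

// Stand-in for ctx.getImageData(0, 0, 2, 2).data: a 2x2 image, initially all zeros.
var width = 2;
var height = 2;
var data = new Uint8ClampedArray(rawFrameBytes(width, height));
// Paint the pixel at (1, 1) opaque red.
data.set([255, 0, 0, 255], (1 * width + 1) * 4);

var pixel = getPixel(data, width, 1, 1);
console.log(pixel.r, pixel.g, pixel.b, pixel.a); // 255 0 0 255
console.log(rawFrameBytes(1920, 1080));          // 8294400
```

That last number is why raw frames are fast to grab but not something you would want to send over a network as-is: a single 1920x1080 frame is roughly 8MB before any compression.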

Other things I tried

I discovered the new TextEncoder and TextDecoder APIs, which convert between strings and byte streams. I thought I could use them to convert the output of atob into a TypedArray more efficiently. I didn’t have much luck, but you can read this article to give you some inspiration.
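
One plausible reason this doesn’t work out (my best guess, not something the benchmarks above tested): TextEncoder always produces UTF-8, so any character above 127 in the atob output turns into two bytes, and binary data such as a PNG gets corrupted. A minimal sketch of the mismatch, using a tiny hand-made base64 string instead of a real image:

```javascript
// 'gP8=' is the base64 encoding of the two bytes 0x80 and 0xFF.
var raw = atob('gP8='); // binary string: one character per byte

// TextEncoder emits UTF-8, so each character above 0x7F becomes two bytes.
var utf8 = new TextEncoder().encode(raw);
console.log(raw.length);  // 2
console.log(utf8.length); // 4

// Copying the character codes directly preserves the original bytes.
var bytes = Uint8Array.from(raw, function (ch) {
  return ch.charCodeAt(0);
});
console.log(Array.from(bytes)); // [ 128, 255 ]
```

So the charCodeAt copy from Method 2 is doing real work: it is the step that keeps each character mapped to exactly one byte.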

Conclusions / Learning Outcomes

  • PNG conversion is expensive. Avoid it where you can.
  • Use getImageData and putImageData for maximum performance when reading and writing large amounts of pixels.
  • You can use the fetch() API to decode dataURI URLs.
  • You can write inline WebWorkers with webworkify.

Know any great tips about base64 encoding or efficiently dealing with images? Leave a comment below!