Mastering Client-Side File Downloads with the Streams API
Typically, servers handle the preparation of data for download. But what if you create the data on the client side, or combine data from multiple endpoints, and end up with a file that is gigabytes in size? The standard method using a Blob may not be suitable.
In this article, we'll explore how to download big files with client-side JavaScript and the Streams API.
After reading, you'll be able to explain:
What a Blob is and what its shortcomings are when dealing with file downloads.
How to create a ReadableStream from scratch.
How to transfer stream ownership to a Service Worker.
How to simulate a server file download with a Service Worker FetchEvent.
Blob file download in the browser
On the client side, a file download can be done with an <a> tag that has a download attribute:
<a href="https://example.com/download" download="archive.zip">Download</a>
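The download attribute tells the browser to download the target instead of navigating to it, even when href points to a server location that would otherwise be displayed inline. (If the server's Content-Disposition header specifies a filename, that filename takes priority over the one in the attribute.) But href doesn't need to point to a server location; we can use a Blob URL instead.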
A Blob (Binary Large Object) is a file-like container for data.
This is useful when you create content on the client side (rendering on a canvas would be one use case) and then need to download it or display it as an image or video.
const blob = new Blob(['fox', 'jumps', 'over', 'the', 'lazy', 'dog']);
const blobURL = URL.createObjectURL(blob);
// "blob:https://example.com/63fd8457-904a-4c5d-985f-f77c9ea6a58b"
Once a Blob URL is no longer needed, release it with URL.revokeObjectURL. See: MDN -> createObjectURL -> Memory management
Then you trigger the download by creating an anchor element in JS and clicking it programmatically:
const anchor = document.createElement('a');
anchor.download = 'archive.zip';
anchor.href = blobURL;
// trigger download
anchor.click();
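The steps above (create a Blob URL, click an anchor, release the URL) can be folded into one helper. This is a minimal sketch, not a definitive implementation: the name downloadBlob, the doc parameter, and the revokeDelayMs parameter are hypothetical, and deferring the revoke to a later task is a cautious assumption rather than a documented requirement.

```javascript
// Hypothetical helper: downloads a Blob under the given filename.
// `doc` defaults to the page's document; it is a parameter only so the
// helper can be exercised with a stand-in object outside a browser.
function downloadBlob(blob, filename, doc = document, revokeDelayMs = 0) {
  const blobURL = URL.createObjectURL(blob);

  const anchor = doc.createElement('a');
  anchor.href = blobURL;
  anchor.download = filename;
  // Trigger the download
  anchor.click();

  // Release the Blob URL after the click has been dispatched.
  // Assumption: revoking on a later task is enough for the download to start.
  setTimeout(() => URL.revokeObjectURL(blobURL), revokeDelayMs);

  return blobURL;
}
```

In a page, a call such as downloadBlob(new Blob(['fox']), 'fox.txt') would be enough. Note that the whole Blob still has to exist in memory before the download starts.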
Streaming files
Every time you create a Blob URL, memory is allocated. Think about videos or archives that can be gigabytes in size. Putting such big files into application memory can quickly result in out-of-memory problems.
For this reason, servers usually stream data from storage. This way, only a chunk of data needs to be loaded into memory at any given time. On the client side, you can do the same using the Streams API.
Introduction to ReadableStream
Even if you have never used the Streams API before, you have most likely used the Fetch API.
Consider this example:
const response = await fetch('https://example.com/resource');
response.body;
//< ReadableStream
It turns out that response.body is a ReadableStream. So instead of awaiting the entire content with response.blob() or response.json(), we can read the data as it's being downloaded:
const response = await fetch('https://example.com/resource');
const stream = response.body;
// Stream implements async iterator:
// https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#async_iteration
for await (const chunk of stream) {
  console.log('chunk:', chunk);
}
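// Alternatively use Reader:
const reader = stream.getReader();
// read() takes no arguments and returns a promise for the next chunk
reader.read().then(function consumeChunk({ done, value }) {
  // Return when done
  if (done) {
    return;
  }
  console.log('chunk:', value);
  // Read next chunk
  return reader.read().then(consumeChunk);
});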
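To illustrate why reading chunk by chunk matters, here is a small sketch that measures the size of a byte stream while holding only one chunk at a time. The helper name countBytes is hypothetical, and the sketch assumes the stream yields Uint8Array chunks, as fetch response bodies do.

```javascript
// Hypothetical helper: counts the bytes of a byte stream without ever
// holding more than one chunk in memory.
async function countBytes(stream) {
  const reader = stream.getReader();
  let total = 0;
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) {
        return total;
      }
      // Assumption: value is a Uint8Array, as with fetch response bodies
      total += value.byteLength;
    }
  } finally {
    // Unlock the stream so that other code can obtain a reader again
    reader.releaseLock();
  }
}
```

Inside an async function it could be used as: const size = await countBytes((await fetch(url)).body);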
If you have ever tried to read a Response using json() and blob() at the same time, you know it's prohibited. You can consume a Response only once... unless you use response.body.tee(), which produces two identical branches of the response stream that you can consume separately.
Let's construct a ReadableStream from scratch.
Before we do that, we need an underlying source. This is where we get our data from. For the purpose of this exercise, let's create an async iterator that yields the integers 0 through 9 in order, waiting 1 second before each one. In real life, this is where you would produce data or call some API to fetch it.
async function* getRandomZeroToTenIntsIterator() {
  for (let i = 0; i < 10; i++) {
    await new Promise(resolve => {
      window.setTimeout(() => {
        resolve();
      }, 1000);
    });
    yield i;
  }
}
const iterator = getRandomZeroToTenIntsIterator();
Whenever we call iterator.next(), we get a promise that resolves to an object of the shape { value: number, done: boolean }.
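The shape of those result objects can be seen by draining an iterator by hand. The following is a sketch under stated assumptions: countUpTo is a hypothetical stand-in for the iterator above without the 1 second delay, and collectResults is a hypothetical helper.

```javascript
// Hypothetical stand-in for the iterator above, without the delay.
async function* countUpTo(limit) {
  for (let i = 0; i < limit; i++) {
    yield i;
  }
}

// Hypothetical helper: collects every result object produced by next(),
// including the final one that signals completion.
async function collectResults(iterator) {
  const results = [];
  while (true) {
    const result = await iterator.next();
    results.push(result);
    if (result.done) {
      return results;
    }
  }
}
```

For countUpTo(2), the collected results are { value: 0, done: false }, { value: 1, done: false } and finally { value: undefined, done: true }.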
Now we can feed that data into a ReadableStream:
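const iterator = getRandomZeroToTenIntsIterator();
const encoder = new TextEncoder();
const underlyingSource = {
  // pull() is called whenever the stream wants more data,
  // and not again until the promise it returned resolves
  async pull(controller) {
    const { value, done } = await iterator.next();
    if (done) {
      controller.close();
      return;
    }
    // Enqueue bytes: a stream used as a Response body must yield Uint8Array chunks
    controller.enqueue(encoder.encode(String(value)));
  }
};
const stream = new ReadableStream(underlyingSource);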
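As a way to check such a stream before wiring it to a download, it can be read back through a Response. The sketch below generalizes the idea to any async iterator and also stops the iterator when the consumer cancels. The names streamFromIterator and threeNumbers are hypothetical, and encoding every value as a UTF-8 line is an assumption made for illustration.

```javascript
// Hypothetical helper: wraps any async iterator in a ReadableStream of bytes.
function streamFromIterator(iterator) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) {
        controller.close();
        return;
      }
      // Assumption: each value is written as UTF-8 text followed by a newline
      controller.enqueue(encoder.encode(`${value}\n`));
    },
    // Called when the consumer cancels the stream; lets the iterator clean up
    async cancel() {
      await iterator.return?.();
    },
  });
}

// Hypothetical data source used only for this sketch.
async function* threeNumbers() {
  yield 1;
  yield 2;
  yield 3;
}
```

Reading it back with await new Response(streamFromIterator(threeNumbers())).text() gives "1\n2\n3\n".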
Now that we have a stream that we can feed and read from, we need a way to download it. But there's no way to create an Object URL from a stream.
Intercepting download request in a service worker
Once we have a stream, we can simulate what the server does. We'll do it in 4 steps:
Registering a Service Worker.
Transferring stream ownership from the main thread to the Service Worker.
Intercepting the FetchEvent inside the Service Worker and responding with data from our stream, using event.respondWith.
Clicking a fake download link to start the download.
First, we'll register a Service Worker and transfer the stream ownership.
To pass a stream object to a Service Worker, use postMessage and the fact that ReadableStream is a Transferable Object. This is great because transferring such objects is a zero-copy operation. We literally transfer ownership from the main thread to a worker thread, meaning after the transfer, we cannot access the stream object from the source thread anymore.
navigator.serviceWorker.register("service-worker.js");
navigator.serviceWorker.ready.then((registration) => {
  // Second argument of postMessage is an Array of Transferables.
  registration.active.postMessage(stream, [stream]);
});
Now let's write worker code which will save the stream and intercept fetch events.
let stream = null;
// Listen for a message containing stream
self.addEventListener('message', (event) => {
  // The posted value arrives as event.data
  if (event.data instanceof ReadableStream) {
    stream = event.data;
  }
});
self.addEventListener('fetch', (event) => {
  // request.url is always absolute, so compare only its path
  const { pathname } = new URL(event.request.url);
  // Ignore if url does not match or stream is not set
  if (pathname !== '/fake-download' || !stream) {
    return;
  }
  // Create Response with a stream
  const response = new Response(stream, {
    headers: {
      // This header hints browser to trigger file download.
      // Normally the server would set it.
      'Content-Disposition': 'attachment; filename="file.txt"'
    }
  });
  // Respond...
  event.respondWith(response);
});
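The two decisions made inside the fetch handler (does the request target our fake endpoint, and what does the download response look like) can be pulled out into plain functions that are easy to check in isolation. This is a sketch, not the article's implementation: matchesPath and createDownloadResponse are hypothetical names, the Content-Type value is an assumption, and the filename is assumed to be plain ASCII without quotes.

```javascript
// Hypothetical helper: true when the request URL targets the given path.
// Request.url is always absolute, so the path has to be parsed out first.
function matchesPath(requestUrl, path) {
  return new URL(requestUrl).pathname === path;
}

// Hypothetical helper: wraps a byte stream in a Response that the browser
// treats as a file download.
function createDownloadResponse(stream, filename) {
  return new Response(stream, {
    headers: {
      // Assumption: a generic binary type; use a specific one if it is known
      'Content-Type': 'application/octet-stream',
      // Assumption: filename is plain ASCII and contains no quotes
      'Content-Disposition': `attachment; filename="${filename}"`,
    },
  });
}
```

Inside the fetch listener, these would be combined as: if (matchesPath(event.request.url, '/fake-download') && stream) { event.respondWith(createDownloadResponse(stream, 'file.txt')); }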
At this point, we have everything ready to actually call the download endpoint we just "created":
const anchor = document.createElement('a');
// We are opening the link in a new tab to avoid navigating away from the current page,
// but the new tab won't actually open when the browser realizes it should download instead of display
anchor.target = '_blank';
anchor.href = '/fake-download';
// Open the link
anchor.click();