Codementor Events

Why I built downldr

Published Apr 15, 2019. Last updated Apr 16, 2019.

The problem I wanted to solve

I wanted a file downloader that verified the downloaded file was really of the type I expected, so I wouldn't save invalid files. The actual file type, not what the Content-Type header says!

In any case, I couldn't find a package that allowed filtering, not even by the Content-Type header, so I decided to build one!

If you know of one, please leave a comment with the name of the package.

What is downldr?

downldr is a file downloader that supports file type filtering and conditional piping, so if the downloaded file is of the wrong type, or the request fails, you don't end up with an empty file.

Where can I get it?

You can get it from npm:

npm install downldr

How does the filtering work?

Using the extension or the Content-Type header to detect the file type is not bulletproof. The header can be set to any value, and it usually defaults to the type inferred from the extension. If the extension is missing, the header may come back as application/octet-stream.
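To make that concrete, here is a minimal sketch of how a server might pick a Content-Type from the file name alone. The lookup table and the guessContentType function are hypothetical and only for illustration. Real servers use much larger tables, but the idea is the same: the bytes of the file are never inspected.

```javascript
const path = require('path');

// Hypothetical, deliberately tiny extension-to-MIME table.
const MIME_BY_EXTENSION = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.mp4': 'video/mp4',
};

// Guess a Content-Type from the file name only, never looking at the content.
function guessContentType(filename) {
  const ext = path.extname(filename).toLowerCase();
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

console.log(guessContentType('image.jpg')); // image/jpeg
console.log(guessContentType('download'));  // application/octet-stream
```

A video renamed to image.jpg would be reported as image/jpeg by this kind of lookup, which is exactly the problem.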

We all know that we can do the following:

mv video.mp4 image.jpg

Now we have a file with a .jpg extension, but it is still a video.

So if we downloaded that file to use it later in our application, we would get unexpected results.

// Saves whatever the server sends, even if it is not really an image
downldr('https://example.com/image.jpg')
  .pipe(fs.createWriteStream(path.join(__dirname, 'images', 'image.jpg')));

Luckily for us, downldr comes with a filter option that allows us to choose which files we want to download, and it uses neither the Content-Type nor the extension to detect the type 😃.

Most file formats have a signature that allows us to identify or verify the content. That signature is often called magic numbers or magic bytes. downldr uses the file-type package under the hood to provide this functionality.

But if you wanted to implement it yourself for a specific file type, it is not that hard to do.

For example, the hex signature for a png file is: 89 50 4E 47 0D 0A 1A 0A

In Node.js, that translates to: Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);

So the code for implementing a png detector is the following:

const fs = require('fs');

function isPNG(buffer) {
    const magicBytes = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
    // The signature must sit at the very start of the data
    return buffer.indexOf(magicBytes) === 0;
}

fs.createReadStream('image.png') // an actual png :)
    .once('data', chunk => {
        console.log(`png: ${isPNG(chunk)}`); // png: true
    });

fs.readFile('image.jpg', (err, content) => {
    if (err) throw err;
    console.log(`png: ${isPNG(content)}`); // png: false
});

Note that we only need a few bytes, so passing the first chunk of a stream is enough!

Now let's see how file filtering is done in downldr.
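The same idea extends to more than one format. The sketch below is not how downldr or file-type work internally. It is a hypothetical, much smaller version that reads only the first bytes of a file and compares them against a short table of well-known signatures. The names SIGNATURES, detectMime and readHead are made up for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A few well-known signatures; real detectors such as file-type cover many more.
const SIGNATURES = [
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A] },
  { mime: 'image/jpeg', bytes: [0xFF, 0xD8, 0xFF] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },       // "GIF8"
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
];

// Returns the MIME type of the first matching signature, or undefined.
function detectMime(buffer) {
  const match = SIGNATURES.find(({ bytes }) =>
    buffer.length >= bytes.length &&
    bytes.every((byte, i) => buffer[i] === byte)
  );
  return match ? match.mime : undefined;
}

// Reads at most `length` bytes from the start of a file.
function readHead(file, length = 16) {
  const fd = fs.openSync(file, 'r');
  try {
    const head = Buffer.alloc(length);
    const bytesRead = fs.readSync(fd, head, 0, length, 0);
    return head.subarray(0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

// Demo: a file that starts with the PNG signature but is named .jpg.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'signature-demo-'));
const disguised = path.join(tmpDir, 'disguised.jpg');
fs.writeFileSync(disguised, Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00]));

console.log(detectMime(readHead(disguised))); // image/png
```

Some formats keep their signature at an offset rather than at byte 0, so a table like this one would need an offset field to handle them.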

downldr('https://example.com/image.jpg', {
    filter: (type, chunk, statusCode) => {
        // type.mime may be undefined for some files
        console.log(type.contentType); // image/jpg
        console.log(type.mime); // video/mp4
        return type.mime === 'image/jpeg';
    },
    // For a file that passes the filter above, this creates out.jpg
    target: (ext) => fs.createWriteStream(`out.${ext}`)
})
// The filter returned false here, so we get:
// Error: Invalid type: video/mp4 - Status Code: 200
.on('error', console.error)
.on('complete', () => console.log('done!'));

When you use the target option instead of .pipe, it works like a conditional pipe.
The file stream is only created once the filter function returns true, so we avoid creating an empty file!
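If you are curious how a conditional pipe could look with plain Node.js streams, here is a rough sketch. It is not downldr's actual implementation. The conditionalPipe helper and its parameters are hypothetical, and the demo uses in-memory streams instead of a real download.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical helper: pipes `source` into the stream returned by
// `createTarget`, but only if `accept(firstChunk)` returns true.
// The target is created lazily, so a rejected source never creates one.
function conditionalPipe(source, accept, createTarget) {
  return new Promise((resolve, reject) => {
    let started = false;
    source.once('error', reject);
    source.once('end', () => {
      if (!started) reject(new Error('Source ended before any data arrived'));
    });
    source.once('data', (firstChunk) => {
      started = true;
      if (!accept(firstChunk)) {
        source.destroy();
        reject(new Error('Rejected by filter'));
        return;
      }
      const target = createTarget();
      target.once('error', reject);
      target.once('finish', resolve);
      target.write(firstChunk); // the chunk we already consumed
      source.pipe(target);      // everything that follows
    });
  });
}

// Demo with in-memory streams and a PNG signature check.
const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);
const looksLikePNG = (chunk) => chunk.subarray(0, 8).equals(PNG_MAGIC);

const received = [];
const memoryTarget = () => new Writable({
  write(chunk, _encoding, done) {
    received.push(chunk);
    done();
  },
});

conditionalPipe(Readable.from([PNG_MAGIC, Buffer.from('rest of the file')]), looksLikePNG, memoryTarget)
  .then(() => console.log(`wrote ${Buffer.concat(received).length} bytes`))
  .catch((err) => console.error(err.message));
```

With a real download, createTarget would return something like fs.createWriteStream, and the file would only appear on disk after the first chunk passes the check.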


You can check the documentation for more options and advanced usage!

Discover and read more posts from Marcos Casagrande