Jordan Nielson

Demystifying webpack - What's a Bundler doing?

Cover Image for Demystifying webpack - What's a Bundler doing?

In my introduction to this series on Demystifying Build Tools, I introduced the core concepts of webpack and babel. I've created a couple other posts on various facets of babel, like @babel/preset-env and babel plugins more generally. If you haven't read those, I'd highly recommend them (obviously!). In this post I'll shift and cover a little more about webpack. In the talk I'm prepping for, I'm intending to spend more time on babel and less time on webpack, which you might have guessed from the blog coverage difference.

Why less on webpack?

I haven't had nearly as much in our projects to manage with webpack since we're using the defaults provided by next.js (thanks Next team!). But, the things that I have found valuable to be aware of include a knowledge of what webpack is at a little more depth than the concepts docs referenced in the introduction post and also how to use and read the webpack-bundle-analyzer plugin. In my opinion, having a knowledge of webpack makes it simpler to work with as the core concepts build together masterfully, and then the bundle-analyzer plugin is super useful to examine what webpack is outputting that I can't imagine doing a project where I don't use it at least once to sanity check that nothing I don't expect is included in the bundle.

So, to learn more about webpack where do you start? First, I'd start with breaking down the description they use for webpack in their docs:

"At its core, webpack is a static module bundler for modern JavaScript applications."

webpack docs

That statement is relatively simple, but can be broken down to emphasize the key features and goals of webpack. I'll talk more to each of the following ideas:

  • Bundler
  • Module
  • Static
  • Modern JavaScript
  • Applications (including libraries)

Bundler

At its core, webpack is a bundler. Not a task runner or a compiler, a bundler. What is a bundler? In the context of webpack, it takes all files referenced from the entry point(s) and spits out at least 1 file called "the bundle". The goal of the bundle is to package code in a way that makes sense for the target environment, in most cases that's the browser. With HTTP 1.1, it tends to be best to serve as much of the application in a single file, to reduce the number of round-trips needed to get the code for the browser to execute. But, with HTTP 2 as well as in environments where you want heavier caching it makes sense to split your "bundle" into multiple files that can be cached and served independently and in parallel.

How does webpack's role as a bundler impact you? Well, for the most part it doesn't. Since it's a bundler it usually does its thing just fine, and once setup in an application it doesn't take much maintenance unless you add a new file type or want to process something differently. More on that later though!

Module

In stating its place as a bundler, the webpack docs clarify that it is a module bundler. In that aspect, it treats everything as a module: JavaScript Code, Images, Raw files, you name it and it is a module in webpack. Modules are loaded into webpack through a variety of loaders, which you can read more about on the loaders concepts page. Essentially in order to support a large variety of file types you'll need to add loaders for them so that webpack can understand them. Out of the box it supports JavaScript and JSON "modules", much like Node itself. In webpack 4 at least, the module type you use greatly impacts the extra features webpack is able to enable, such as Tree Shaking. Modules are key in webpack, since that is how it determines what code to include in the bundle that it creates. It starts from your "entry point" (which is a module) and pulls in everything referenced by that module. In order to pull it in, it needs to be a module! So, anything that you import in that entry module will end up in your bundle that is created. Without module boundaries, webpack wouldn't be able to determine code that can be left out, and we'd be back to including entire directories in what we serve to the browser.

Static

One of the best features of webpack, in my opinion, is the static analysis capabilities that are unlocked by it being a static (in other words, build time) module bundler. A runtime bundler could probably work, but it wouldn't be able to do Tree Shaking or Dead Code Elimination. This would be a pretty large drawback for me, since it is pretty common in my projects to only use part of the aspects that a library or component exposes. In my opinion, the word static in this context also implies that the build output won't change unless the build input does (assuming you have things configured correctly), which gives me some confidence in being able to run builds as many times as needed. Related to that, another benefit of static in this context is that it allows the build process to support plugins that act on those static assets to transform, adjust, or otherwise do something to the code.

There are some downsides to it being a static module bundler. One of the largest I've run into is the inability to dynamically use require.context in storybook to get just the stories that I want with some sort of option string. This led to us re-writing our storybook config file whenever we want a different set of components to work on, which thankfully was relatively easy to implement.

Modern JavaScript

Since the docs statement says "modern JavaScript applications", I decided that there should be a comma in there and broke it down even further. Modern can be made to indicate that it is something up to date, but I think when you combine it with JavaScript you usually get the idea of ESNext or ES2015. In the case of new language features, that job is actually handled by babel, which webpack can run on your code as it bundles it. This interplay is something that I wanted to highlight since it illustrates the capability of the module bundler to take in anything that you can tell it how to handle. Since it runs in node, webpack can be default handle whatever syntax your version of node can. Since you can run it with babel, webpack can optionally handle whatever syntax you throw at it (within the limits of babel of course). These two libraries work together to output your code in a manner that's suitable for browser consumption. In the simplest configuration, babel will take your files and output them, one for one or all to one, transformed according to the plugins you use. Using webpack, it can be a little smarter than that and only run babel on the files that it is bundling, allowing you to have other files in your src directory (or however you organize yourself) that don't need to be processed by babel.

Splitting this up further, Modern is also a good descriptor of webpack itself. The team there does a great job adding new features/plugins, fixing things, and overall keeping the tool modern in the sense of up to date and useful! JavaScript by itself doesn't mean all that much though, it does indicate that webpack is focused on that language (though if I understand correctly it supports web assembly to some extent).

Applications (including libraries)

The core use case for webpack is definitely applications that are served to the browser, but it can also be used for libraries if they have a desire to do so. There is support for libraries in a similar way to applications, and they have an awesome guide on their docs site about how to use webpack to bundle your library code. Since webpack focuses on the application level, there are tons of plugins that support that use providing things like aliasing, loading all the file types you use, and others.

The Bundle Analyzer

After you've got webpack setup and outputting some wonderful files to serve to the browser, you might run into a case where you're curious what is in there. In most cases, your bundle will be minified and uglified so it won't be much good to try and read what's there, though there are some things that don't uglify very well that you can use if you're trying to check to see if something is there quickly. But, outside of that the webpack-bundle-analyzer is a fantastic tool. For use in next.js, it's as simple as installing the Next.js plugin and following the instructions in the readme to add it to your project. Since Next produces two bundles, one for the server and another for the client, it can be pretty intimidating to set up any webpack things from scratch. So, I'm super grateful for the team that added this plugin since it's already setup to create a bundle analyzer for both bundles. Most of the time I just use the client bundle, but the server bundle is also quite helpful. The bundle analyzer looks pretty overwhelming when you first look at it, since it shows in some manner every file that is included in the bundle. There's a number of things to look at when using the bundle analyzer, but there are a few that I want to call out:

  1. Different Size Settings
  2. Hiding chunks
  3. Outputting a JSON file (not currently supported by the next-bundle-analyzer plugin)

Different Size Settings

One of the first things you might wonder is "where does this size information come from?", since in most cases you won't be seeing what your file explorer told you the size was. In the sidebar menu when analyzing your bundle, you can select between stat, parsed, and gzip. These are described in detail on the documentation page linked above, but I think it's useful to point out that stat should be close to your file system output, parsed should be the post-webpack size (minified/uglified) and then gzip is the compressed size of the post-webpack file. By default the parsed size is pulled up, which is why I pointed out that they might look different than you might expect. In most cases I've seen, parsed is the most useful number, since stat doesn't help much as it's pre-webpack and gzip is useful... but I don't want to spend my time optimizing my code for gzip compression since the time the browser spends parsing it is usually longer than the network time a few more bytes off would save. There's more information on this in the documentation.

Hiding Chunks

In most cases, the output from the bundle analyzer will be entirely too much to handle as most projects that care to analyze their bundle will have hundreds of modules. If you haven't used it before, clicking on a module/section will zoom in on it, but that doesn't actually hide the ones that now can't be seen. To do that, you can uncheck them in the sidebar menu, which will actually re-draw the entire page in most cases. There are a number of things that you might want to hide, like a node_module that you're stuck with and can't reduce the size of or a section of your application that you're not working on right now and is distracting from the actual part you are inspecting. There's more information on this in the documentation.

Outputting a JSON file

In a lot of cases, webpack has way more information available then even the bundle analyzer shows, and in that case I find the bundle analyzer's capability to output the stats.json file from webpack for you to be wonderful. Since the bundle analyzer already uses a lot of the stats options (and webpack does slow down a bit when you use a bunch of stats options), it's helpful to be able to re-use those and output them to a file. Sadly the next-bundle-analyzer plugin doesn't currently support passing any options to the bundle analyzer (they'd probably add it, but I haven't cared enough yet since it isn't terribly hard to use for a one-off case). So, if you want to do this in a next context you'd need to manually adjust your next.config.js to use the bundle analyzer (in a similar way to what the plugin does ideally) to pass the generateStatsFile: true option to the bundle analyzer, with the statsFilename changed based off which build is running. The stats file is a bit of a beast to handle, so we're not going to talk about it much here, but it is super useful if you think webpack is doing something weird!

Thanks for reading! Ideally this helps you understand a little bit more about webpack, in combination with going through their core concepts docs. I'd highly recommend spending some time on doing so, since even if you're using an awesome tool like next.js there's still benefits that come from understanding what is happening to bundle your code.

Cover image courtesy of undraw.co