Bloaty McBloatface: a size profiler for binaries

rwmj · on March 20, 2017

I ran it and it gives an awful lot of "[None]":

    $ ~/d/bloaty/bloaty builder/virt-builder -d compileunits
         VM SIZE                            FILE SIZE
     --------------                      --------------
      75.5%  1.96Mi [None]                3.67Mi  85.2%
       8.7%   232Ki guestfs-c-actions.c    232Ki   5.3%
       8.2%   219Ki guestfs.ml             219Ki   5.0%
       2.0%  52.4Ki [Other]               52.4Ki   1.2%
       1.3%  33.7Ki _none_                33.7Ki   0.8%
       0.7%  17.5Ki customize_cmdline.ml  17.5Ki   0.4%
       0.6%  17.3Ki builder.ml            17.3Ki   0.4%
       0.4%  11.8Ki customize_run.ml      11.8Ki   0.3%
       0.4%  10.4Ki cmdline.ml            10.4Ki   0.2%
       0.3%  7.08Ki firstboot.ml          7.08Ki   0.2%
       0.2%  6.21Ki index-scan.c          6.21Ki   0.1%
       0.2%  5.90Ki index_parser.ml       5.90Ki   0.1%
       0.2%  5.15Ki sigchecker.ml         5.15Ki   0.1%
       0.2%  4.87Ki getopt-c.c            4.87Ki   0.1%
    [...]

It's a mixed OCaml/C executable, but I ran it on a build from the local directory and all debug symbols are still available.

Edit: The README mentions this may be caused by an incompletely parsed ".debug_aranges".

haberman · on March 20, 2017

Author here. The quality of the reports that use debug info (like "compileunits") is entirely dependent on how complete said debug information is. There may be ways of mining the debug info more completely, to compensate for this. Theoretically .debug_aranges should be all we need for this report, but we could supplement this with info in other sections.

smnscu · on March 21, 2017

I was curious so I ran bloaty against Go's hello world:

    bloaty main
        VM SIZE                                FILE SIZE
    --------------                          --------------
     33.0%   536Ki __TEXT,__text              536Ki  33.7%
     18.1%   293Ki __TEXT,__gopclntab         293Ki  18.5%
     15.6%   253Ki __DWARF,__debug_info       253Ki  16.0%
     13.1%   213Ki __TEXT,__rodata            213Ki  13.4%
      6.5%   106Ki __DATA,__bss                   0   0.0%
      0.3%  5.62Ki [None]                    93.1Ki   5.9%
      4.4%  71.9Ki __DWARF,__debug_frame     71.9Ki   4.5%
      4.2%  68.3Ki __DWARF,__debug_line      68.3Ki   4.3%
      1.9%  30.5Ki __DWARF,__debug_pubtypes  30.5Ki   1.9%
      1.1%  17.1Ki __DATA,__noptrbss              0   0.0%
      0.6%  9.90Ki __DWARF,__debug_pubnames  9.90Ki   0.6%
      0.6%  9.45Ki __DATA,__noptrdata        9.45Ki   0.6%
      0.4%  6.34Ki __DATA,__data             6.34Ki   0.4%
      0.2%  2.86Ki __TEXT,__typelink         2.86Ki   0.2%
      0.0%     255 __DWARF,__debug_abbrev       255   0.0%
      0.0%      72 __TEXT,__itablink             72   0.0%
      0.0%      61 __DWARF,__debug_gdb_scri      61   0.0%
      0.0%      48 __DWARF,__debug_aranges       48   0.0%
    100.0%  1.59Mi TOTAL                     1.55Mi 100.0%

tempodox · on March 20, 2017

Quite predictably, that name made me laugh out loud. I very much like how cleanly it displays units. Since I routinely produce binary executables, I want to give it a try. It looks really useful.

ndesaulniers · on March 20, 2017

If you have debug symbols. Great tool; I have already used it on the Android platform and love it. But without debug symbols, you're going to have a bad time.

dboreham · on March 20, 2017

Almost spat my latte into my keyboard reading the name.

The first time I remember doing this exercise was when the product I worked on got so large it wouldn't fit on one CD. I think we found 10 copies of one of its shared libraries.

ayuvar · on March 20, 2017

This is a great thing. I am currently tracking down a problem at work where we have a Win32 PE with a lot of DLL bloat.

I've written a few hacky exploratory tools for it but nothing of this quality.

dan353hehe · on March 20, 2017

I'm just a little confused. With terabytes of disk and gigabytes of RAM, why should someone spend time shrinking a binary?

I understand if you are working in an environment where both of those could be limited such a tool would be extremely helpful. But I don't think I have ever decided to not use a program because of the size of the executable. I don't spend my day looking at the size of awk vs sed vs grep and then using the one that is smaller.

Peaker · on March 20, 2017

You might not care to shrink it down from 500KB to 200KB. But shrinking a 200MiB binary to 40MiB is nice.

See also: https://en.wikipedia.org/wiki/Wirth's_law

WalterGR · on March 21, 2017

> But shrinking a 200MiB binary to 40MiB is nice.

Is that the average size savings?

Normal_gaussian · on March 21, 2017

You can make that kind of saving quite easily if you are statically linking a load of stuff together. Often the extra libs have a whole set of features you don't need and didn't realise were so large.

FnuGk · on March 21, 2017

Should the linker not throw away dead code?

Peaker · on March 21, 2017

Normally, if you use archive (.a) files, any .o files within them that are not needed are thrown away.

But that is at .o file granularity. By default there's no per-symbol dead code elimination.

And if you provide a list of .o files (no .a files) then it doesn't throw away anything at all.
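
The archive behaviour described above can be sketched with a toy model. This is only an illustration in Python, not how any real linker is implemented: the function `select_members`, the object names, and the symbol names are all hypothetical. Each object is reduced to the set of symbols it defines and the set it references, and an archive member is pulled in only if it defines a symbol that is still undefined.

```python
# Toy model of archive (.a) member selection. Not a real linker:
# every object file is just (defined symbols, referenced symbols).
from typing import Dict, List, Set, Tuple

ObjectFile = Tuple[Set[str], Set[str]]


def select_members(roots: List[ObjectFile],
                   archive: Dict[str, ObjectFile]) -> Set[str]:
    """Return the names of the archive members that would be pulled in."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    # Objects named directly on the command line are always linked.
    for defs, refs in roots:
        defined |= defs
        undefined |= refs
    undefined -= defined

    pulled: Set[str] = set()
    progress = True
    while progress:  # repeat until no member resolves anything new
        progress = False
        for name, (defs, refs) in archive.items():
            if name in pulled or not (defs & undefined):
                continue
            # The whole member comes along, including symbols nobody asked for.
            pulled.add(name)
            defined |= defs
            undefined |= refs
            undefined -= defined
            progress = True
    return pulled


# Hypothetical inputs: main.o references only parse().
main_o = ({"main"}, {"parse"})
libfoo = {
    "parse.o": ({"parse", "parse_debug_dump"}, {"lex"}),
    "lex.o": ({"lex"}, set()),
    "render.o": ({"render"}, set()),
}
print(sorted(select_members([main_o], libfoo)))  # ['lex.o', 'parse.o']
```

In this model render.o is dropped, but the unused `parse_debug_dump` still ends up in the output because it lives in the same .o file as `parse`.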

phaedrus · on March 21, 2017

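It depends on the linker and on the command-line options passed to the compiler and the linker. I was surprised to find that GCC, for example, has this off by default: you have to pass options to enable it (with GNU ld, typically -ffunction-sections and -fdata-sections when compiling, plus -Wl,--gc-sections when linking).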

legulere · on March 20, 2017

Smaller code/data fits better into lower level caches, which are faster than the disk/ram.

Also think of smartphones. People often complain about app sizes, because their phones often have just 16 or 32 GB of storage.

dan353hehe · on March 20, 2017

Fitting in the cache is important for high performance applications.

I do agree that in resource-constrained environments it is good to examine what is taking up space, but the exe may not be the first place to look, especially for image-heavy smartphone apps.

nitwit005 · on March 20, 2017

It can be an issue even when there is plenty of space. I recall quite a bit of drama around Doom 4: https://steamcommunity.com/app/379720/discussions/0/35728611...

But besides annoyance to customers, remember that bandwidth costs are quite large for popular products. Something like a browser or MS Office update is going to get downloaded billions of times.

misnome · on March 21, 2017

The binary. Presumably it was the assets and not Doom.exe taking most of that 43GB.

Cerium · on March 20, 2017

My team was just asked to do a feasibility study on shrinking a 3MB application into 64KB, since it would give us considerable cost savings on the next generation of hardware. Even things that can be made limitless have associated costs.

dan353hehe · on March 21, 2017

What sort of hardware are you using? And how would shrinking the app to 64KB save your team a considerable amount of money?

ajdlinux · on March 21, 2017

Not GP, but it sounds awfully like they're trying to get rid of a >3MB flash chip on the next iteration of the product they're making by moving it into a more limited amount of flash that's already on another chip, which would mean one less component and could also reduce board size.

Cerium · on March 21, 2017

Yes. Exactly the situation.

barsonme · on March 20, 2017

Some of our services are basically just a Dockerfile and an executable. Shrinking a 14MB binary to 3MB takes 10 seconds with upx and is just another step in our build process.

I'd rather send or receive a 3MB file than a 14MB one.

14113 · on March 20, 2017

One reason to optimise is to try and get it to fit into cache. I don't know if this tool would help with that or not, but it's an excellent way to try and optimise cache residency of your binary.

dan353hehe · on March 20, 2017

That makes sense. If the hot code path in your executable can't all be in the cache at the same time, then there will be a performance penalty.

bigiain · on March 20, 2017

Considering where it's come from, I'd guess "webscale".

Assuming "spare" terabytes of disk and gigabytes of RAM breaks down at Google/Amazon/Facebook/etc scale.

(I've bookmarked this, because I'm trying to work out how to fit enough of OpenCV into the spare resources on an STM32 microcontroller to be useful... I doubt the author will be too concerned about bugs/doco for _my_ use case though...)

kevan · on March 20, 2017

The best example I can think of is mobile apps in severely bandwidth-constrained countries. A 10MB vs 50MB APK could make the difference between success and failure, because not many people are willing to wait 30 minutes for an app to download.
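
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the 30-minute wait refers to the 50MB APK and that the link speed is constant; the helper name `download_minutes` is made up.

```python
# Back-of-the-envelope download times. Assumptions (not from the thread):
# the 50 MB APK is the one taking 30 minutes, the link speed is constant,
# and 1 MB = 1000 KB.

def download_minutes(size_mb: float, rate_kb_per_s: float) -> float:
    """Minutes needed to download size_mb at rate_kb_per_s."""
    return size_mb * 1000 / rate_kb_per_s / 60


# Link speed implied by "50 MB in 30 minutes", in KB/s (roughly 27.8).
implied_rate = 50 * 1000 / (30 * 60)

for size_mb in (10, 50):
    minutes = download_minutes(size_mb, implied_rate)
    print(f"{size_mb} MB: about {minutes:.0f} minutes")
```

Under those assumptions the 10MB APK needs about 6 minutes where the 50MB one needs 30.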

dan353hehe · on March 21, 2017

Right, and I agree. However, I'm sure that most of the APK would be images, etc., and not the actual executable.

toast0 · on March 21, 2017

Images are big, especially if you have multiple versions for different screens, but the Java code can also be pretty large (especially if you bundle in libraries). I think the Play Store can split downloads so people only get the appropriate native code, but if you offer a direct download version, it needs several copies. And if you've been a good developer and localized your strings, you'll be happy to know that the strings file isn't compressed (and it's UTF-16).
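
For a sense of what the encoding alone costs, here is a small Python sketch comparing UTF-8 and UTF-16 byte counts. It says nothing about how any particular APK actually stores its strings (that part is the claim above); the helper `encoded_sizes` and the sample strings are made up for illustration.

```python
# Compare the encoded size of a few UI strings in UTF-8 and UTF-16.
# This only measures the encodings; it does not inspect a real APK.
from typing import Tuple


def encoded_sizes(text: str) -> Tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "English": "Settings",
    "German": "Einstellungen",
    "Japanese": "設定",
}

for language, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{language}: {utf8} bytes as UTF-8, {utf16} bytes as UTF-16")
```

ASCII-heavy strings double in size as UTF-16, while the CJK sample here is actually smaller than in UTF-8, so the cost depends on which languages you ship.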

fnord123 · on March 21, 2017

Organisations that use ball-of-mud releases have all their code in one large executable. This may exceed the standard ELF maximum size of 4GB. While this isn't a blocker, it is time-consuming to link and to copy to release servers and, god forbid you break it out into shared libraries, the startup cost of dynamic linking will be unpleasant.

This is also a reason behind moving from a ball of mud to microservices (née SOA).

pjc50 · on March 21, 2017

I work on a WinCE environment where program download is rather slow; shrinking a 200MB debug executable to 20MB saved several minutes on every edit-rebuild-test cycle.

(The solution was to make sure string deduplication was on in the linker.)

ksk · on March 20, 2017

Hopefully so that Chrome can launch in under a second as opposed to the slow bloated mess that it currently is.

shultays · on March 21, 2017

Uh, what are "Mi" and "Ki"? Are they mebibytes and kibibytes? Wikipedia says the correct abbreviations are MiB and KiB.
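
For reference, the binary prefixes are powers of 1024 (1 KiB = 1024 bytes, 1 MiB = 1024 × 1024 bytes). A minimal Python sketch of formatting a byte count with the same short "Ki"/"Mi" suffixes follows. It is not Bloaty's own formatting code, and the function name `format_binary` and the two-decimal rounding are arbitrary choices.

```python
# Format a byte count with binary (power-of-1024) prefixes, using the
# short "Ki"/"Mi" style seen in the reports above. Illustration only.

def format_binary(n: int) -> str:
    units = ["", "Ki", "Mi", "Gi", "Ti"]
    value = float(n)
    for unit in units:
        # Stop once the value fits the unit, or we run out of units.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "":
        return str(n)  # plain byte counts are printed as-is
    return f"{value:.2f}{unit}"


print(format_binary(255))        # 255
print(format_binary(2048))       # 2.00Ki
print(format_binary(1024 ** 2))  # 1.00Mi
```

Note that 1 MiB is 1,048,576 bytes, about 4.9% more than a decimal megabyte of 1,000,000 bytes.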

shaqbert · on March 20, 2017

Vincent Vanhoucke deserves a lot of credit for his naming skills... Bloaty McBloatface[1] is the best name I've seen in ages.

[1]: https://en.wikipedia.org/wiki/Boaty_McBoatface

tyingq · on March 20, 2017

I suppose Parsey McParseface goes in the same bucket: http://www.telegraph.co.uk/technology/2016/05/13/google-name...

ryan-allen · on March 21, 2017

TIL the submersibles aboard the RRS Sir David Attenborough were actually named Boaty McBoatface after the popular vote, fantastic!

archgoon · on March 20, 2017

Was Vincent Vanhoucke the mind behind Boaty? Or just Parsey?

skinofstars · on March 20, 2017

James Hand, a former BBC Jersey presenter, came up with Boaty McBoatface

http://www.bbc.co.uk/news/world-europe-jersey-35860760

geodel · on March 20, 2017

Are you sure it is not like Judge Judgy McJudgerson to simply call this name best?

21 · on March 20, 2017

Xy McXface was hilarious the first time, fun the second time, but I feel that now it's just lame to name something like this.

AnimalMuppet · on March 20, 2017

Given how close "Bloaty" is to "Boaty", this one gets a pass in my book...

bigiain · on March 20, 2017

Doubly so when you feel the need to explain the name in the footer...

ryan-allen · on March 21, 2017

It got a good snort out of me, so I'm satisfied!