Author here. The quality of the reports that use debug info (like "compileunits") depends entirely on how complete that debug information is. There may be ways of mining the debug info more thoroughly to compensate for this. In theory .debug_aranges should be all we need for this report, but we could supplement it with info from other sections.
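For the curious, here is roughly what that section contains. A minimal sketch using elfutils' libdw (my choice for illustration only, not how the tool itself parses DWARF); gaps in these ranges are exactly what degrades the report:

    /* aranges_dump.c -- list .debug_aranges entries via elfutils libdw.
       Build (assumes elfutils headers/libs): cc aranges_dump.c -ldw */
    #include <elfutils/libdw.h>
    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
      if (argc < 2) { fprintf(stderr, "usage: %s <binary>\n", argv[0]); return 1; }
      int fd = open(argv[1], O_RDONLY);
      Dwarf *dbg = dwarf_begin(fd, DWARF_C_READ);
      if (dbg == NULL) { fprintf(stderr, "no DWARF data found\n"); return 1; }

      Dwarf_Aranges *aranges;
      size_t count;
      if (dwarf_getaranges(dbg, &aranges, &count) == 0) {
        for (size_t i = 0; i < count; i++) {
          Dwarf_Arange *ar = dwarf_onearange(aranges, i);
          Dwarf_Addr start; Dwarf_Word len; Dwarf_Off cu_offset;
          if (dwarf_getarangeinfo(ar, &start, &len, &cu_offset) == 0)
            /* Each entry maps [start, start+len) back to a compile unit;
               address ranges missing here can't be attributed. */
            printf("0x%08" PRIx64 " len %8" PRIu64 "  CU@%" PRIu64 "\n",
                   (uint64_t)start, (uint64_t)len, (uint64_t)cu_offset);
        }
      }
      dwarf_end(dbg);
      return 0;
    }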
Quite predictably, that name made me laugh out loud. I really like how cleanly it displays units. Since I routinely produce binary executables, I want to give it a try. It looks really useful.
If you have debug symbols, that is. Great tool, I've already used it on the Android platform and love it. But without debug symbols, you're going to have a bad time.
Almost spat my latte into my keyboard reading the name.
The first time I remember doing this exercise was when the product I worked on got so large it wouldn't fit on one CD. I think we found 10 copies of one of its shared libraries.
I'm just a little confused. With terabytes of disk and gigabytes of RAM, why should someone spend time shrinking the size of a binary?
I understand that if you are working in an environment where both of those could be limited, such a tool would be extremely helpful. But I don't think I have ever decided not to use a program because of the size of its executable. I don't spend my day comparing the sizes of awk vs sed vs grep and then using the one that is smaller.
You can make that kind of saving quite easily if you are statically linking a load of stuff together. Often the extra libs have a whole set of features you don't need and didn't realise were so large.
It depends on the linker and on the command-line options passed to the compiler and the linker. I was surprised to find out that GCC, for example, has dead-code elimination off by default and you have to pass options to enable it.
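Concretely, a sketch of what I mean (flags as I remember them for GCC + GNU ld; worth double-checking on your toolchain):

    /* demo.c -- unused_helper() only gets dropped if each function
       lives in its own section and the linker collects the garbage:
         gcc -ffunction-sections -fdata-sections -c demo.c
         gcc demo.o -Wl,--gc-sections -o demo
       Compile without those flags and the dead code ships anyway. */
    #include <stdio.h>

    void unused_helper(void) {
      puts("never called, but linked in by default");
    }

    int main(void) {
      puts("hello");
      return 0;
    }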
Fitting in the cache is important for high performance applications.
I do agree that in resource-constrained environments it is good to examine what is taking up space, but the exe may not be the first place to look, especially for image-heavy smartphone apps.
But besides the annoyance to customers, remember that bandwidth costs are quite large for popular products. Something like a browser or MS Office update is going to get downloaded billions of times; a 50MB update fetched a billion times is 50 petabytes of transfer.
My team was just asked to do a feasibility study on shrinking a 3MB application down to 64KB, since that would unlock considerable cost savings on the next generation of hardware. Even things that can be made limitless have associated costs.
Not GP, but it sounds awfully like they're trying to get rid of a >3MB flash chip on the next iteration of the product they're making by moving it into a more limited amount of flash that's already on another chip, which would mean one less component and could also reduce board size.
Some of our services are basically just a Dockerfile and an executable. Shrinking a 14MB binary to 3MB takes 10 seconds with upx and is just another step in our build process.
I'd rather send or receive a 3MB file than a 14MB one.
One reason to optimise is to try to get the binary to fit into cache. I don't know whether this tool would help with that or not, but improving the cache residency of your binary is an excellent reason to shrink it.
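As a sketch of what that can look like in practice (GCC/Clang attributes; the exact section placement is compiler-dependent):

    /* hot_cold.c -- hint the compiler so rarely-run code is moved
       out of line (GCC puts cold code in .text.unlikely), keeping
       the hot loop packed tightly for the instruction cache. */
    #include <stdio.h>

    __attribute__((cold)) static long fail(const char *msg) {
      fprintf(stderr, "error: %s\n", msg);
      return -1;
    }

    __attribute__((hot)) static long sum(const int *v, long n) {
      if (v == NULL) return fail("null input");
      long total = 0;
      for (long i = 0; i < n; i++) total += v[i];
      return total;
    }

    int main(void) {
      int v[] = {1, 2, 3, 4};
      printf("%ld\n", sum(v, 4));
      return 0;
    }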
Considering where it's come from, I'd guess "webscale".
Assuming "spare" terrabytes of disk and gigabytes of ram breaks down at Google/Amazon/Facebook/etc scale.
(I've bookmarked this, because I'm trying to work out how to fit enough of OpenCV into the spare resources on a STM32 microcontroller to be useful... I doubt the author will be too concerned about bugs/doco for _my_ use case though...)
The best example I can think of is mobile apps in severely bandwidth-constrained countries. A 10MB vs 50MB APK could make the difference between success and failure, because not many people are willing to wait 30 minutes for an app to download.
Images are big, especially if you have multiple versions for different screens, but the Java code can also be pretty large (especially if you bundle in libraries). I think the Play Store can split downloads so people only get the appropriate native code, but if you offer a direct download version, it needs several copies. And if you've been a good developer and localized your strings, you'll be happy to know that the strings file isn't compressed (and it's UTF-16).
Organisations that do ball-of-mud releases have all the code in one large executable. This may exceed the 4GB maximum size of a 32-bit ELF binary. While that isn't a blocker, such a binary is time-consuming to link and to copy to release servers, and, god forbid you break it out into shared libraries, the startup cost of dynamic linking will be unpleasant.
This is also a reason behind moving from a ball of mud to microservices (née SOA).
I work on a WinCE environment where program download is rather slow; shrinking a 200MB debug executable to 20MB saved several minutes on every edit-rebuild-test cycle.
(The solution was to make sure string deduplication was on in the linker.)
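For anyone hitting the same thing, this is the pattern that bit us. A sketch (the GCC/ELF mechanism here, -fmerge-constants and mergeable .rodata.str1.1 sections, is my assumption for a typical toolchain; our WinCE linker had its own equivalent switch):

    /* dup_strings.c -- imagine TRACE in a header, stamping the long
       literal into every translation unit that logs anything.
       Without merging, each object file ships its own copy; with it,
       the linker folds identical strings into one.  On GCC/ELF,
       literals go into a mergeable section (.rodata.str1.1) and
       -fmerge-constants (default when optimising) enables folding. */
    #include <stdio.h>

    #define TRACE(fn) \
      printf("[products/device/firmware/debug/verbose] enter %s\n", fn)

    static void module_a(void) { TRACE("module_a"); }
    static void module_b(void) { TRACE("module_b"); }

    int main(void) {
      module_a();
      module_b();
      return 0;
    }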
Edit: The README mentions this may be caused by incompletely parsed `.debug_aranges`.