> we compile the binary with debug symbols and a flag to compress the debug symb...

erebe__ · 2025-01-19T11:42:20 1737286940

Normal build

  cargo build --bin engine-gateway --release
      Finished `release` profile [optimized + debuginfo] target(s) in 1m 00s

  ls -lh target/release/engine-gateway 
  .rwxr-xr-x erebe erebe 198 MB Sun Jan 19 12:37:35 2025    target/release/engine-gateway

what we ship

  export RUSTFLAGS="-C link-arg=-Wl,--compress-debug-sections=zlib -C force-frame-pointers=yes" 
  cargo build --bin engine-gateway --release
      Finished `release` profile [optimized + debuginfo] target(s) in 1m 04s

  ls -lh target/release/engine-gateway
  .rwxr-xr-x erebe erebe 61 MB Sun Jan 19 12:39:13 2025  target/release/engine-gateway

The diff is more impressive on some bigger projects

adastra22 · 2025-01-19T12:01:04 1737288064

The compressed symbols sounds like the likely culprit. Do you really need a small executable? The uncompressed symbols need to be loaded into RAM anyway, and if it is delayed until it is needed then you will have to allocate memory to uncompress them.

erebe__ · 2025-01-19T12:40:39 1737290439

I will give it a shot next week to try out ;P

For this particular service, the size does not matter really. For others, it makes more diff (several hundred of Mb) and as we deploy on customers infra, we want images' size to stay reasonable. For now, we apply the same build rules for all our services to stay consistent.

adastra22 · 2025-01-19T13:56:08 1737294968

Maybe I'm not communicating well. Or maybe I don't understand how the debug symbol compression works at runtime. But my point is that I don't think you are getting the tradeoff you think you are getting. The smaller executable may end up using more RAM. Usually at the deployment stage, that's what matters.

Smaller executables are more for things like reducing distribution sizes, or reducing process launch latency when disk throughput is the issue. When you invoke compression, you are explicitly trading off runtime performance in order to get the benefit of smaller on-disk or network transmission size. For a hosted service, that's usually not a good tradeoff.

erebe__ · 2025-01-19T16:32:40 1737304360

It is most likely me reading too quickly. I was caught off guard by the article gaining traction in a Sunday, and as I have other duties during the weekend, I am reading/responding only when I can sneak in.

For your comment, I think you are right regarding compression of debug symbols that add up to the peak memory, but I think you are misleading when you think the debug symbols are uncompressed when the app/binary is started/loaded. Decompression only happens for me when this section is accessed by debugger or equivalent. It is not the same thing as when the binary is fully compressed, like with upx for example.

I have done a quick sanity check on my desktop, I got.

  [profile.release]
  lto = "thin"
  debug = true
  strip = false

  export RUSTFLAGS="-C link-arg=-Wl,--compress-debug-sections=zlib -C force-frame-pointers=yes"
  cargo build --bin engine-gateway --release

From rss memory at startup I get ~128 MB, and after the panic at peak I get ~627 MB.

When compiled with those flags

  export RUSTFLAGS="-C force-frame-pointers=yes" 
  cargo build --bin engine-gateway --release

From rss memory at startup I get ~128 MB, and after the panic at peak I get ~474 MB.

So the peak is taller indeed when the debug section is compressed, but the binary in memory when started is roughly equivalent. (virtual mem too)

I had some hard time getting a source that may validate my belief regarding when the debug symbol are uncompressed. But based on https://inbox.sourceware.org/binutils/20080622061003.D279F3F... and the help of claude.ai, I would say it is only when those sections are accessed.

for what is worth, the whole answer of claude.ai

  The debug sections compressed with --compress-debug-sections=zlib are decompressed:

  At runtime by the debugger (like GDB) when it needs to access the debug information:

  When setting breakpoints
  When doing backtraces
  When inspecting variables
  During symbol resolution


  When tools need to read debug info:

  During coredump analysis
  When using tools like addr2line
  During source-level debugging
  When using readelf with the -w option


  The compression is transparent to these tools - they automatically handle the decompression when needed.    The sections remain compressed on disk, and are only decompressed in memory when required.
  This helps reduce the binary size on disk while still maintaining full debugging capabilities, with only a small runtime performance cost when the debug info needs to be accessed.
  The decompression is handled by the libelf/DWARF libraries that these tools use to parse the ELF files.

adastra22 · 2025-01-19T20:46:13 1737319573

Thanks for running these checks. I’m learning from this too!

the8472 · 2025-01-19T15:50:58 1737301858

Container images use compression too, so having the debug section uncompressed shouldn't actually make the images any bigger.