Symbolizing Postgres Profiling Data
A journey through undocumented behavior.
We've been working hard to make the Parca Agent's eBPF-based DWARF stack unwinder generally available and enabled by default. As part of that work we've been extensively profiling Postgres on Fedora, and we noticed that some (not all!) functions' memory addresses were not being symbolized successfully.
I began investigating with the obvious questions: Where did the debuginfo come from? Was it reliable? Could we get better debuginfos?
Because this is a standard postgres, installed from Fedora packages, I can conveniently retrieve debuginfos from publicly available packages using their publicly accessible debuginfod server. To troubleshoot further I downloaded the debuginfos that were seemingly not working correctly.
The `debuginfod` protocol is pretty simple: once you have an endpoint that you know serves the `debuginfod` API, you can request `/buildid/<buildid>/debuginfo` to download the debuginfo for that build ID. That request is exactly what we used to troubleshoot this.
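As a minimal sketch of the request path described above, the URL can be assembled like this. The helper name `debuginfoURL`, the server address, and the build ID are assumptions for illustration; substitute the build ID of the binary you are investigating.

```go
package main

import (
	"fmt"
	"strings"
)

// debuginfoURL builds the debuginfod request described in the text:
// <server>/buildid/<buildid>/debuginfo.
func debuginfoURL(server, buildID string) string {
	return strings.TrimRight(server, "/") + "/buildid/" + buildID + "/debuginfo"
}

func main() {
	// Assumed server address and a made-up build ID, for illustration only.
	server := "https://debuginfod.fedoraproject.org/"
	buildID := "0123456789abcdef0123456789abcdef01234567"
	fmt.Println(debuginfoURL(server, buildID))
}
```

The resulting URL can be fetched with any HTTP client; the response body is the debuginfo ELF file.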
A primer on symbolization using DWARF
Before I start digging further into the problem, let's start with a quick primer on symbolization using DWARF. If you're already intimately familiar with it, you can skip to the next section of the post.
The way symbolization works in the first place is through sections in ELF binaries generally referred to as "debuginfo", most commonly in the DWARF format. DWARF is a complex format. Its flexibility goes as far as letting DWARF entries contain expressions, small programs that operate on registers, which makes it Turing complete. Luckily, when dealing with symbolization we have not encountered very complex scenarios (yet)!
Let's look at the DWARF for a tiny C program.
Compile it, enabling DWARF to be emitted (`-g`):
Note: Any C compiler could have been used here, but I'm on macOS and Zig's cross-compile support is very convenient.
And let's use the `dwarfdump` tool to print everything.
Looking at this output, we see the compilation unit, which is the top-level unit, and right underneath it a `DW_TAG_subprogram`, which is our `main` function. It has an attribute called `DW_AT_low_pc` with the form `DW_FORM_addr` (an address, 8 bytes wide on a 64-bit target) that describes the start of our function's memory range. It also has `DW_AT_high_pc` with the form `DW_FORM_data4` (a 4-byte constant). When `DW_AT_high_pc` is a constant rather than an address, it is an offset from `DW_AT_low_pc`, so the function's range ends at `DW_AT_low_pc + DW_AT_high_pc`. And lastly, important for symbolization is the `DW_AT_name` attribute with the form `DW_FORM_strp`, which is an offset into the `.debug_str` string table where the name is stored.
Essentially what this means for symbolization: Thanks to this entry, we know that if we encounter a memory address from `0x0000000000201e20` up to (but not including) `0x0000000000201e36` (that is, `0x201e20 + 0x16`), then it belongs to the `main` function.
Typically, that's the form that we expect to see, a clear range of memory addresses, and associated metadata about the function that covers that address range.
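To make the range lookup concrete, here is a minimal sketch using the values from the `main` entry above. It assumes `DW_AT_high_pc` is in constant form, in which case DWARF defines it as an offset from `DW_AT_low_pc`. The `funcRange` type and `symbolize` function are hypothetical names, not part of any library.

```go
package main

import "fmt"

// funcRange mirrors what a DW_TAG_subprogram entry gives us.
type funcRange struct {
	name       string
	lowPC      uint64 // DW_AT_low_pc
	highOffset uint64 // DW_AT_high_pc in constant form: offset from lowPC
}

// symbolize returns the function whose half-open range
// [lowPC, lowPC+highOffset) contains pc.
func symbolize(funcs []funcRange, pc uint64) (string, bool) {
	for _, f := range funcs {
		if pc >= f.lowPC && pc < f.lowPC+f.highOffset {
			return f.name, true
		}
	}
	return "", false
}

func main() {
	funcs := []funcRange{{name: "main", lowPC: 0x201e20, highOffset: 0x16}}
	for _, pc := range []uint64{0x201e20, 0x201e35, 0x201e36} {
		name, ok := symbolize(funcs, pc)
		fmt.Printf("%#x -> %q (found=%v)\n", pc, name, ok)
	}
}
```

A linear scan is enough for illustration; a real symbolizer would typically keep the ranges sorted and use a binary search.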
Introducing the trouble
The memory address in the postgres binary that I was debugging was `0x00a90dca`, and just like in the above primer on DWARF, I used the `dwarfdump` tool and searched through its output with `vim` and `grep`:
Oh no! This was a first for me with `dwarfdump`: a segfault!
Alright, I rolled up my sleeves and started writing some Go code to see how far I could get:
And run it:
Alright! That worked well enough. But what's that? The class (or "FORM" as DWARF calls it) is `ClassStringAlt`, not the usual `ClassString`. The thing that finally pointed me in the right direction was a code comment in the package:
Once I started reading the proposal to extend the DWARF standard, it started to make some sense. The proposal came from a tool called DWZ, which attempts to deduplicate as much data as possible within DWARF debuginfos, which it turns out is extensively used by Fedora packages. This is making more sense now!
One of the optimizations that DWZ performs is to split the original debuginfo data into two files: a primary piece of debuginfo and a supplementary debuginfo file. When the `DW_AT_name` field in the primary debuginfo is of form `DW_FORM_strp_alt`, it contains an offset into a string table in the supplementary debuginfo file, at which to read a null-terminated string.
Alright! But there is just one problem: we only get a single file from the `debuginfod` server, which appears to be the primary piece of debuginfo, so how do we get the supplementary file? The DWARF extension proposal suggests that there is a `.gnu_debugaltlink` ELF section inserted into the primary debuginfo that will link us to the supplementary debuginfo, but there is no documentation whatsoever on this section's format. So I decided to just dump it and see what I could find:
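Reading a null-terminated string at an offset in a string table can be sketched as follows. The function name `readStringAt` is hypothetical, and the byte slice here is a synthetic stand-in for the supplementary file's string table with made-up names.

```go
package main

import (
	"bytes"
	"fmt"
)

// readStringAt returns the NUL-terminated string starting at off in table.
func readStringAt(table []byte, off int64) (string, error) {
	if off < 0 || off >= int64(len(table)) {
		return "", fmt.Errorf("offset %#x outside string table of %d bytes", off, len(table))
	}
	end := bytes.IndexByte(table[off:], 0)
	if end < 0 {
		return "", fmt.Errorf("no NUL terminator after offset %#x", off)
	}
	return string(table[off : off+int64(end)]), nil
}

func main() {
	// Synthetic string table: two made-up names, each NUL-terminated.
	table := []byte("main\x00pg_example\x00")
	s, err := readStringAt(table, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

With a real supplementary file, the table bytes would come from its `.debug_str` section, for example via `Section(".debug_str").Data()` on a file opened with Go's `debug/elf` package.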
A relative path?! We downloaded this from a `debuginfod` server, how could a relative path possibly work? It does not: turns out this is only useful when installing debug packages, such as `postgres-debug`. Then both files are installed and the relative path is adhered to.
So at this point I was clueless, and once I felt like I was not going to find any further useful data myself, I thought of the one place on the internet that should be able to help me: `#elfutils` on Freenode! Once I got there, and described what I was looking at, within 5 minutes, I spoke to none other than Mark Wielaard (aka mjw), one of the elfutils maintainers. He knew the missing piece to the puzzle:

Screenshot of a conversation on IRC explaining how to correctly interpret the bytes in a `.gnu_debugaltlink` section
Aha! The last 20 bytes of the `.gnu_debugaltlink` are an identifier! And they can be used to request the supplementary debuginfo from the debuginfod server!
Now we finally have all the pieces! Remember that value that we managed to print thanks to the Go code we have written? It gave us the value `467f8`.
Now we can finally take that offset and read the string from the supplementary file:
Finally! We've successfully symbolized it.
Learning
Sometimes the information you need to accomplish your task won't be documented anywhere. Learning where to look, or who to ask, is one of our most useful skills!
Shout out to Mark Wielaard for putting the puzzle pieces together for me!