From 2705f4c4ad10c722c7a2f5118e5865ccb5a20b60 Mon Sep 17 00:00:00 2001 From: mingmingl Date: Thu, 29 Feb 2024 16:04:41 -0800 Subject: [PATCH 1/2] [nfc][docs]Generalize header and ascii art for indexed profiles - Add pointers to code for source of truth. - Move necessary details from doc to code. --- llvm/docs/InstrProfileFormat.rst | 72 +++++++++++------------ llvm/include/llvm/ProfileData/InstrProf.h | 6 ++ 2 files changed, 39 insertions(+), 39 deletions(-) diff --git a/llvm/docs/InstrProfileFormat.rst b/llvm/docs/InstrProfileFormat.rst index 2069b87a245a1..37df6188b62fb 100644 --- a/llvm/docs/InstrProfileFormat.rst +++ b/llvm/docs/InstrProfileFormat.rst @@ -360,6 +360,10 @@ profiles generated by older tools or compilers. General Storage Layout ----------------------- +The ASCII art depicts the general storage layout of indexed profiles. +Specifically, the indexed profile header describes the byte offset of individual +payload sections. + :: +-----------------------+---+ @@ -369,55 +373,45 @@ General Storage Layout +-----------------------+ | | HashType | H +-----------------------+ E - +-------| HashOffset | A - | +-----------------------+ D - +-----------| MemProfOffset | E - | | +-----------------------+ R - | | +--| BinaryIdOffset | | - | | | +-----------------------+ | - +---------------| TemporalProf- | | - | | | | | TracesOffset | | - | | | | +-----------------------+---+ - | | | | | Profile Summary | | - | | | | +-----------------------+ P - | | +------>| Function data | A - | | | +-----------------------+ Y - | +---------->| MemProf profile data | L - | | +-----------------------+ O - | +->| Binary Ids | A + | Byte Offset | A + +------ | of section 2 | D + | +-----------------------+ E + | | Byte Offset | R + +-----------| of section 3 | | + | | +-----------------------+ | + | | | ... | | + | | +-----------------------+ | + | | | Byte Offset | | + +---------------| of section N | | + | | | +-----------------------+---+ + | | | | Section 1 | | + | | | | (Profile Summary) | | + | | | +-----------------------+ P + | | +------>| Section 2 | A + | | +-----------------------+ Y + | +---------->| Section 3 | L + | +-----------------------+ O + | | ... | A | +-----------------------+ D - +-------------->| Temporal profiles | | + +-------------->| Section N | | +-----------------------+---+ Header -------- -``Magic`` - The purpose of the magic number is to be able to tell if the profile is an - indexed profile. - -``Version`` - Similar to raw profile version, the lower 32 bits specify the version of the - indexed profile and the most significant 32 bits are reserved to specify the - variant types of the profile. - -``HashType`` - The hashing scheme for on-disk hash table keys. Only MD5 hashing is used as of - writing. +The `Header struct`_ should be the source of truth. Field names and comments +should explain the meaning of each field. At a high level, `*Offset` fields +records the byte offset of individual payload sections. Readers use the offset +information to locate interesting sections and skip uninteresting ones. -``HashOffset`` - An on-disk hash table stores the per-function profile records. This field records - the offset of this hash table's metadata (i.e., the number of buckets and - entries), which follows right after the payload of the entire hash table. +.. note:: -``MemProfOffset`` - Records the byte offset of MemProf profiling data. + To maintain backward compatibility of the indexed profiles, existing fields + shouldn't be deleted from struct definition; the field order shouldn't be + modified. New fields should be appended. -``BinaryIdOffset`` - Records the byte offset of binary id sections. +.. _`Header struct`: https://github.com/llvm/llvm-project/blob/1a2960bab6381f2b288328e2371829b460ac020c/llvm/include/llvm/ProfileData/InstrProf.h#L1053-L1080 -``TemporalProfTracesOffset`` - Records the byte offset of temporal profiles. Payload Sections ------------------ diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h index 25ec06a739202..833946e4e91c9 100644 --- a/llvm/include/llvm/ProfileData/InstrProf.h +++ b/llvm/include/llvm/ProfileData/InstrProf.h @@ -1055,9 +1055,15 @@ inline uint64_t ComputeHash(StringRef K) { return ComputeHash(HashType, K); } // as appropriate when updating the indexed profile format. struct Header { uint64_t Magic; + // The lower 32 bits specify the version of the indexed profile. + // The most significant 32 bits are reserved to specify the variant types of + // the profile. uint64_t Version; uint64_t Unused; // Becomes unused since version 4 uint64_t HashType; + // This field records the offset of this hash table's metadata (i.e., the + // number of buckets and entries), which follows right after the payload of + // the entire hash table. uint64_t HashOffset; uint64_t MemProfOffset; uint64_t BinaryIdOffset; From 51566215ec72b28f507dbfdb3c6b1c7736027fdf Mon Sep 17 00:00:00 2001 From: mingmingl Date: Thu, 29 Feb 2024 17:56:47 -0800 Subject: [PATCH 2/2] resolve feedback --- llvm/docs/InstrProfileFormat.rst | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/llvm/docs/InstrProfileFormat.rst b/llvm/docs/InstrProfileFormat.rst index 37df6188b62fb..ec1d1e2d7401f 100644 --- a/llvm/docs/InstrProfileFormat.rst +++ b/llvm/docs/InstrProfileFormat.rst @@ -374,35 +374,39 @@ payload sections. | HashType | H +-----------------------+ E | Byte Offset | A - +------ | of section 2 | D + +------ | of section A | D | +-----------------------+ E - | | Byte Offset | R - +-----------| of section 3 | | + | | Byte Of fset | R + +-----------| of section B | | | | +-----------------------+ | | | | ... | | | | +-----------------------+ | | | | Byte Offset | | - +---------------| of section N | | + +---------------| of section Z | | | | | +-----------------------+---+ - | | | | Section 1 | | - | | | | (Profile Summary) | | + | | | | Profile Summary | | | | | +-----------------------+ P - | | +------>| Section 2 | A + | | +------>| Section A | A | | +-----------------------+ Y - | +---------->| Section 3 | L + | +---------->| Section B | L | +-----------------------+ O | | ... | A | +-----------------------+ D - +-------------->| Section N | | + +-------------->| Section Z | | +-----------------------+---+ +.. note:: + + Profile summary section is at the beginning of payload. It's right after the + header so its position is implicitly known after reading the header. + Header -------- -The `Header struct`_ should be the source of truth. Field names and comments -should explain the meaning of each field. At a high level, `*Offset` fields -records the byte offset of individual payload sections. Readers use the offset -information to locate interesting sections and skip uninteresting ones. +The `Header struct`_ is the source of truth and struct fields should explain +what's in the header. At a high level, `*Offset` fields record section byte +offsets, which are used by readers to locate interesting sections and skip +uninteresting ones. .. note::