Skip to content

[lldb][DWARFASTParserClang] Don't overwrite DW_AT_object_pointer of definition with that of a declaration #123089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 17, 2025

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Jan 15, 2025

In #122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions).

Currently when LLDB parses a DW_TAG_subprogram definition, it will parse all the attributes of the declaration as well. If we have DW_AT_object_pointer on both, then we would overwrite the more specific attribute that we got from the defintion with the one from the specification. This is problematic because LLDB relies on getting the DW_AT_name from the DW_AT_object_pointer, which doesn't exist on the specification.

Note GCC does attach DW_AT_object_pointer on declarations and definitions already (see https://godbolt.org/z/G1GvddY48), so there's definitely some expressions that will fail for GCC compiled binaries. This patch will fix those cases (e.g., I would expect TestConstThis.py to fail with GCC).

…efinition with that of a declaration

In llvm#122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions).

Currently when LLDB parses a `DW_TAG_subprogram` definition, it will
parse all the attributes of the declaration as well. If we have
`DW_AT_object_pointer` on both, then we would overwrite the more
specific attribute that we got from the defintion with the one from the
specification. This is problematic because LLDB relies on getting the
`DW_AT_name` from the `DW_AT_object_pointer`, which doesn't exist on the
specification.
@Michael137 Michael137 marked this pull request as ready for review January 15, 2025 17:04
@llvmbot llvmbot added the lldb label Jan 15, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 15, 2025

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

In #122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions).

Currently when LLDB parses a DW_TAG_subprogram definition, it will parse all the attributes of the declaration as well. If we have DW_AT_object_pointer on both, then we would overwrite the more specific attribute that we got from the defintion with the one from the specification. This is problematic because LLDB relies on getting the DW_AT_name from the DW_AT_object_pointer, which doesn't exist on the specification.


Full diff: https://github.com/llvm/llvm-project/pull/123089.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+6-1)
  • (modified) lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp (+161)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index e2f76e88dd6f0f..fb3af44abfa8d4 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -377,7 +377,12 @@ ParsedDWARFTypeAttributes::ParsedDWARFTypeAttributes(const DWARFDIE &die) {
       break;
 
     case DW_AT_object_pointer:
-      object_pointer = form_value.Reference();
+      // GetAttributes follows DW_AT_specification.
+      // DW_TAG_subprogram definitions and declarations may both
+      // have a DW_AT_object_pointer. Don't overwrite the one
+      // we parsed for the definition with the one from the declaration.
+      if (!object_pointer.IsValid())
+        object_pointer = form_value.Reference();
       break;
 
     case DW_AT_signature:
diff --git a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
index f22d76b3973e5f..b31f56aa372d58 100644
--- a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
+++ b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
@@ -741,3 +741,164 @@ TEST_F(DWARFASTParserClangTests, TestUniqueDWARFASTTypeMap_CppInsertMapFind) {
 
   ASSERT_EQ(type_sp, reparsed_type_sp);
 }
+
+TEST_F(DWARFASTParserClangTests, TestParseDWARFAttributes_ObjectPointer) {
+  // This tests the behaviour of ParsedDWARFTypeAttributes
+  // for DW_TAG_subprogram definitions which have a DW_AT_object_pointer
+  // *and* a DW_AT_specification that also has a DW_AT_object_pointer.
+  // We don't want the declaration DW_AT_object_pointer to overwrite the
+  // one from the more specific definition's.
+
+  const char *yamldata = R"(
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_EXEC
+  Machine: EM_AARCH64
+DWARF:
+  debug_str:
+    - Context
+    - func
+    - this
+  debug_abbrev:
+    - ID:              0
+      Table:
+        - Code:            0x1
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_language
+              Form:            DW_FORM_data2
+        - Code:            0x2
+          Tag:             DW_TAG_structure_type
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+        - Code:            0x3
+          Tag:             DW_TAG_subprogram
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_declaration
+              Form:            DW_FORM_flag_present
+            - Attribute:       DW_AT_object_pointer
+              Form:            DW_FORM_ref4
+            - Attribute:       DW_AT_artificial
+              Form:            DW_FORM_flag_present
+            - Attribute:       DW_AT_external
+              Form:            DW_FORM_flag_present
+        - Code:            0x4
+          Tag:             DW_TAG_formal_parameter
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_artificial
+              Form:            DW_FORM_flag_present
+        - Code:            0x5
+          Tag:             DW_TAG_subprogram
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_object_pointer
+              Form:            DW_FORM_ref4
+            - Attribute:       DW_AT_specification
+              Form:            DW_FORM_ref4
+        - Code:            0x6
+          Tag:             DW_TAG_formal_parameter
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_artificial
+              Form:            DW_FORM_flag_present
+  debug_info:
+     - Version:         5
+       UnitType:        DW_UT_compile
+       AddrSize:        8
+       Entries:
+
+# DW_TAG_compile_unit
+#   DW_AT_language [DW_FORM_data2]    (DW_LANG_C_plus_plus)
+
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x04
+
+#   DW_TAG_structure_type
+#     DW_AT_name [DW_FORM_strp] ("Context")
+
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0x0
+
+#     DW_TAG_subprogram
+#       DW_AT_name [DW_FORM_strp] ("func")
+#       DW_AT_object_pointer [DW_FORM_ref4]
+        - AbbrCode:        0x3
+          Values:
+            - Value:           0x8
+            - Value:           0x1
+            - Value:           0x1d
+            - Value:           0x1
+            - Value:           0x1
+
+#       DW_TAG_formal_parameter
+#         DW_AT_artificial
+        - AbbrCode:        0x4
+          Values:
+          - Value: 0x1
+
+        - AbbrCode: 0x0
+        - AbbrCode: 0x0
+
+#     DW_TAG_subprogram
+#       DW_AT_object_pointer [DW_FORM_ref4] ("this")
+#       DW_AT_specification [DW_FORM_ref4] ("func")
+        - AbbrCode:        0x5
+          Values:
+            - Value:           0x29
+            - Value:           0x14
+
+#       DW_TAG_formal_parameter
+#         DW_AT_name [DW_FORM_strp] ("this")
+#         DW_AT_artificial
+        - AbbrCode:        0x6
+          Values:
+            - Value:           0xd
+            - Value:           0x1
+
+        - AbbrCode: 0x0
+        - AbbrCode: 0x0
+...
+)";
+  YAMLModuleTester t(yamldata);
+
+  DWARFUnit *unit = t.GetDwarfUnit();
+  ASSERT_NE(unit, nullptr);
+  const DWARFDebugInfoEntry *cu_entry = unit->DIE().GetDIE();
+  ASSERT_EQ(cu_entry->Tag(), DW_TAG_compile_unit);
+  ASSERT_EQ(unit->GetDWARFLanguageType(), DW_LANG_C_plus_plus);
+  DWARFDIE cu_die(unit, cu_entry);
+
+  auto holder = std::make_unique<clang_utils::TypeSystemClangHolder>("ast");
+  auto &ast_ctx = *holder->GetAST();
+  DWARFASTParserClangStub ast_parser(ast_ctx);
+
+  auto context_die = cu_die.GetFirstChild();
+  ASSERT_TRUE(context_die.IsValid());
+  ASSERT_EQ(context_die.Tag(), DW_TAG_structure_type);
+
+  auto subprogram_definition = context_die.GetSibling();
+  ASSERT_TRUE(subprogram_definition.IsValid());
+  ASSERT_EQ(subprogram_definition.Tag(), DW_TAG_subprogram);
+  ASSERT_FALSE(subprogram_definition.GetAttributeValueAsOptionalUnsigned(
+      DW_AT_external));
+
+  auto param_die = subprogram_definition.GetFirstChild();
+  ASSERT_TRUE(param_die.IsValid());
+
+  ParsedDWARFTypeAttributes attrs(subprogram_definition);
+  EXPECT_TRUE(attrs.object_pointer.IsValid());
+  EXPECT_EQ(attrs.object_pointer, param_die);
+}

Comment on lines +380 to +383
// GetAttributes follows DW_AT_specification.
// DW_TAG_subprogram definitions and declarations may both
// have a DW_AT_object_pointer. Don't overwrite the one
// we parsed for the definition with the one from the declaration.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only be true if the definition die has DW_AT_object_pointer before DW_AT_specification. GetAttributes expands the specification attribute "in place", so if the order is reversed, we will encounter the attribute from the specification first.

I don't know if there's a good reason for the way that GetAttributes works. It might be possible to just change it to visit the main DIE first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. For some reason from an initial glance at GetAttributes I assumed it parsed the specification later. Will see if the iteration order is fixable

Copy link
Member Author

@Michael137 Michael137 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the spec says anything about duplicate attributes found in DW_AT_specifications

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to say that DW_AT_object_pointer doesn't apply if we found it through a specification/abstract_origin (like we do for DW_AT_sibling and DW_AT_declaration already). The object pointer points to a child DIE, so I don't know why we would want to include it in the list of attributes when getting attributes of the definition. That feels like a more consistent/less intrusive approach. Wdyt?

If we ever find that we do for some reason want to include both DW_AT_obejct_pointers in GetAttributes, then we can change it then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can sort of see how that could make sense, though I'm not prepared to call it "more consistent". I'm also unsure how many other attributes could be argued into the list in this way, and I wouldn't want this list to get big.

If you want to go that way, then I think it's fine, though I think I'd prefer to change the iteration order (FWIW, llvm's DWARFDie::findRecursively visits the current element first before recursing, although it doesn't have a direct equivalent to our GetAttributes)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, but you also could be reading too much into it. There are other (better) ways to say that the attribute will appear only once and "take precedence" could have been meant as "it will precede the other attribute in the list" (It's probably not valid English, but I can see someone might say that, and it sort matches what the code does).

That said, I think that omitting the second instance of the attribute should not be a problem functionality-wise, although I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there). This function is called from the manual dwarf index so it is somewhat performance sensitive. If you want to go down that path, I'd like to measure the impact on indexing speed.

Copy link
Member Author

@Michael137 Michael137 Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about omitting the second attribute is that I think we could get rid of the Recurse argument. Because the only caller that sets Recurse to No is to avoid possibly overwriting DW_AT_low_pc with the one from the declaration. Though haven't thought about this aspect too much yet.

// Second DW_AT_low_pc may come from DW_TAG_subprogram referenced by
// DW_TAG_GNU_call_site's DW_AT_abstract_origin overwriting our 'low_pc'.
// So do not inherit attributes from DW_AT_abstract_origin.
DWARFAttributes attributes = child.GetAttributes(DWARFDIE::Recurse::no);

I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there)

I was thinking we just make DWARFAttributes hold a SmallSet (or SetVector) instead of a SmallVector. And Append will only ever insert a single entry (the only caller that Appends is currently GetAttributes, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about omitting the second attribute is that I think we could get rid of the Recurse argument. Because the only caller that sets Recurse to No is to avoid possibly overwriting DW_AT_low_pc with the one from the declaration. Though haven't thought about this aspect too much yet.

That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).

I was thinking we just make DWARFAttributes hold a SmallSet (or SetVector) instead of a SmallVector. And Append will only ever insert a single entry (the only caller that Appends is currently GetAttributes, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)

For small values a SmallSet is a vector with a linear search for matching keys. It might work. I don't know if this will be a problem, I'm just saying it could be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).

Makes sense, I'll table the de-duplication for now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a shot in #123261

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Jan 17, 2025
… recursing

`GetAttributes` returns all attributes on a given DIE, including any attributes in referenced `DW_AT_abstract_origin` or `DW_AT_specification`. However, if an attribute exists on both the referring DIE and the referenced DIE, the first one encountered will be the one that takes precendence when querying the returned `DWARFAttributes`. There is currently no good way of ensuring that an attribute of a definition doesn't get shadowed by one found on the declaration. One use-case where we don't want this to happen is for `DW_AT_object_pointer` (which can exist on both definitions and declarations, see llvm#123089).

This patch makes sure we visit the current DIE's attributes before
following DIE references. I tried keeping as much of the original
`GetAttributes` unchanged and just add an outer `GetAttributes` that
keeps track of the DIEs we need to visit next.

There's precendent for this iteration order in
`llvm::DWARFDie::findRecursively` and also
`lldb_private::ElaboratingDIEIterator`. We could use the latter to
implement `GetAttributes`, though it also follows `DW_AT_signature` so I
decided to leave it for follow-up.
Michael137 added a commit that referenced this pull request Jan 17, 2025
… recursing (#123261)

`GetAttributes` returns all attributes on a given DIE, including any
attributes that the DIE references via `DW_AT_abstract_origin` and
`DW_AT_specification`. However, if an attribute exists on both the
referring DIE and the referenced DIE, the first one encountered will be
the one that takes precendence when querying the returned
`DWARFAttributes`. But there was no guarantee in which order those
attributes get visited. That means there's no convenient way of ensuring
that an attribute of a definition doesn't get shadowed by one found on
the declaration. One use-case where we don't want this to happen is for
`DW_AT_object_pointer` (which can exist on both definitions and
declarations, see #123089).

This patch makes sure we visit the current DIE's attributes before
following DIE references. I tried keeping as much of the original
`GetAttributes` unchanged and just add an outer `GetAttributes` that
keeps track of the DIEs we need to visit next.

There's precendent for this iteration order in
`llvm::DWARFDie::findRecursively` and also
`lldb_private::ElaboratingDIEIterator`. We could use the latter to
implement `GetAttributes`, though it also follows `DW_AT_signature` so I
decided to leave it for follow-up.
@Michael137 Michael137 merged commit dc1ef2c into llvm:main Jan 17, 2025
7 checks passed
@Michael137 Michael137 deleted the lldb/object-pointer-parsing branch January 17, 2025 14:11
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 17, 2025
… DIE before recursing (#123261)

`GetAttributes` returns all attributes on a given DIE, including any
attributes that the DIE references via `DW_AT_abstract_origin` and
`DW_AT_specification`. However, if an attribute exists on both the
referring DIE and the referenced DIE, the first one encountered will be
the one that takes precendence when querying the returned
`DWARFAttributes`. But there was no guarantee in which order those
attributes get visited. That means there's no convenient way of ensuring
that an attribute of a definition doesn't get shadowed by one found on
the declaration. One use-case where we don't want this to happen is for
`DW_AT_object_pointer` (which can exist on both definitions and
declarations, see llvm/llvm-project#123089).

This patch makes sure we visit the current DIE's attributes before
following DIE references. I tried keeping as much of the original
`GetAttributes` unchanged and just add an outer `GetAttributes` that
keeps track of the DIEs we need to visit next.

There's precendent for this iteration order in
`llvm::DWARFDie::findRecursively` and also
`lldb_private::ElaboratingDIEIterator`. We could use the latter to
implement `GetAttributes`, though it also follows `DW_AT_signature` so I
decided to leave it for follow-up.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants