-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[lldb][DWARFASTParserClang] Don't overwrite DW_AT_object_pointer of definition with that of a declaration #123089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…efinition with that of a declaration In llvm#122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions). Currently when LLDB parses a `DW_TAG_subprogram` definition, it will parse all the attributes of the declaration as well. If we have `DW_AT_object_pointer` on both, then we would overwrite the more specific attribute that we got from the defintion with the one from the specification. This is problematic because LLDB relies on getting the `DW_AT_name` from the `DW_AT_object_pointer`, which doesn't exist on the specification.
@llvm/pr-subscribers-lldb Author: Michael Buch (Michael137) ChangesIn #122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions). Currently when LLDB parses a Full diff: https://github.com/llvm/llvm-project/pull/123089.diff 2 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index e2f76e88dd6f0f..fb3af44abfa8d4 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -377,7 +377,12 @@ ParsedDWARFTypeAttributes::ParsedDWARFTypeAttributes(const DWARFDIE &die) {
break;
case DW_AT_object_pointer:
- object_pointer = form_value.Reference();
+ // GetAttributes follows DW_AT_specification.
+ // DW_TAG_subprogram definitions and declarations may both
+ // have a DW_AT_object_pointer. Don't overwrite the one
+ // we parsed for the definition with the one from the declaration.
+ if (!object_pointer.IsValid())
+ object_pointer = form_value.Reference();
break;
case DW_AT_signature:
diff --git a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
index f22d76b3973e5f..b31f56aa372d58 100644
--- a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
+++ b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
@@ -741,3 +741,164 @@ TEST_F(DWARFASTParserClangTests, TestUniqueDWARFASTTypeMap_CppInsertMapFind) {
ASSERT_EQ(type_sp, reparsed_type_sp);
}
+
+TEST_F(DWARFASTParserClangTests, TestParseDWARFAttributes_ObjectPointer) {
+ // This tests the behaviour of ParsedDWARFTypeAttributes
+ // for DW_TAG_subprogram definitions which have a DW_AT_object_pointer
+ // *and* a DW_AT_specification that also has a DW_AT_object_pointer.
+ // We don't want the declaration DW_AT_object_pointer to overwrite the
+ // one from the more specific definition's.
+
+ const char *yamldata = R"(
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_EXEC
+ Machine: EM_AARCH64
+DWARF:
+ debug_str:
+ - Context
+ - func
+ - this
+ debug_abbrev:
+ - ID: 0
+ Table:
+ - Code: 0x1
+ Tag: DW_TAG_compile_unit
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_language
+ Form: DW_FORM_data2
+ - Code: 0x2
+ Tag: DW_TAG_structure_type
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Code: 0x3
+ Tag: DW_TAG_subprogram
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Attribute: DW_AT_declaration
+ Form: DW_FORM_flag_present
+ - Attribute: DW_AT_object_pointer
+ Form: DW_FORM_ref4
+ - Attribute: DW_AT_artificial
+ Form: DW_FORM_flag_present
+ - Attribute: DW_AT_external
+ Form: DW_FORM_flag_present
+ - Code: 0x4
+ Tag: DW_TAG_formal_parameter
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_artificial
+ Form: DW_FORM_flag_present
+ - Code: 0x5
+ Tag: DW_TAG_subprogram
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_object_pointer
+ Form: DW_FORM_ref4
+ - Attribute: DW_AT_specification
+ Form: DW_FORM_ref4
+ - Code: 0x6
+ Tag: DW_TAG_formal_parameter
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Attribute: DW_AT_artificial
+ Form: DW_FORM_flag_present
+ debug_info:
+ - Version: 5
+ UnitType: DW_UT_compile
+ AddrSize: 8
+ Entries:
+
+# DW_TAG_compile_unit
+# DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus)
+
+ - AbbrCode: 0x1
+ Values:
+ - Value: 0x04
+
+# DW_TAG_structure_type
+# DW_AT_name [DW_FORM_strp] ("Context")
+
+ - AbbrCode: 0x2
+ Values:
+ - Value: 0x0
+
+# DW_TAG_subprogram
+# DW_AT_name [DW_FORM_strp] ("func")
+# DW_AT_object_pointer [DW_FORM_ref4]
+ - AbbrCode: 0x3
+ Values:
+ - Value: 0x8
+ - Value: 0x1
+ - Value: 0x1d
+ - Value: 0x1
+ - Value: 0x1
+
+# DW_TAG_formal_parameter
+# DW_AT_artificial
+ - AbbrCode: 0x4
+ Values:
+ - Value: 0x1
+
+ - AbbrCode: 0x0
+ - AbbrCode: 0x0
+
+# DW_TAG_subprogram
+# DW_AT_object_pointer [DW_FORM_ref4] ("this")
+# DW_AT_specification [DW_FORM_ref4] ("func")
+ - AbbrCode: 0x5
+ Values:
+ - Value: 0x29
+ - Value: 0x14
+
+# DW_TAG_formal_parameter
+# DW_AT_name [DW_FORM_strp] ("this")
+# DW_AT_artificial
+ - AbbrCode: 0x6
+ Values:
+ - Value: 0xd
+ - Value: 0x1
+
+ - AbbrCode: 0x0
+ - AbbrCode: 0x0
+...
+)";
+ YAMLModuleTester t(yamldata);
+
+ DWARFUnit *unit = t.GetDwarfUnit();
+ ASSERT_NE(unit, nullptr);
+ const DWARFDebugInfoEntry *cu_entry = unit->DIE().GetDIE();
+ ASSERT_EQ(cu_entry->Tag(), DW_TAG_compile_unit);
+ ASSERT_EQ(unit->GetDWARFLanguageType(), DW_LANG_C_plus_plus);
+ DWARFDIE cu_die(unit, cu_entry);
+
+ auto holder = std::make_unique<clang_utils::TypeSystemClangHolder>("ast");
+ auto &ast_ctx = *holder->GetAST();
+ DWARFASTParserClangStub ast_parser(ast_ctx);
+
+ auto context_die = cu_die.GetFirstChild();
+ ASSERT_TRUE(context_die.IsValid());
+ ASSERT_EQ(context_die.Tag(), DW_TAG_structure_type);
+
+ auto subprogram_definition = context_die.GetSibling();
+ ASSERT_TRUE(subprogram_definition.IsValid());
+ ASSERT_EQ(subprogram_definition.Tag(), DW_TAG_subprogram);
+ ASSERT_FALSE(subprogram_definition.GetAttributeValueAsOptionalUnsigned(
+ DW_AT_external));
+
+ auto param_die = subprogram_definition.GetFirstChild();
+ ASSERT_TRUE(param_die.IsValid());
+
+ ParsedDWARFTypeAttributes attrs(subprogram_definition);
+ EXPECT_TRUE(attrs.object_pointer.IsValid());
+ EXPECT_EQ(attrs.object_pointer, param_die);
+}
|
// GetAttributes follows DW_AT_specification. | ||
// DW_TAG_subprogram definitions and declarations may both | ||
// have a DW_AT_object_pointer. Don't overwrite the one | ||
// we parsed for the definition with the one from the declaration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only be true if the definition die has DW_AT_object_pointer
before DW_AT_specification
. GetAttributes
expands the specification attribute "in place", so if the order is reversed, we will encounter the attribute from the specification first.
I don't know if there's a good reason for the way that GetAttributes
works. It might be possible to just change it to visit the main DIE first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. For some reason from an initial glance at GetAttributes I assumed it parsed the specification later. Will see if the iteration order is fixable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the spec says anything about duplicate attributes found in DW_AT_specifications
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be to say that DW_AT_object_pointer
doesn't apply if we found it through a specification/abstract_origin (like we do for DW_AT_sibling
and DW_AT_declaration
already). The object pointer points to a child DIE, so I don't know why we would want to include it in the list of attributes when getting attributes of the definition. That feels like a more consistent/less intrusive approach. Wdyt?
If we ever find that we do for some reason want to include both DW_AT_obejct_pointer
s in GetAttributes
, then we can change it then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can sort of see how that could make sense, though I'm not prepared to call it "more consistent". I'm also unsure how many other attributes could be argued into the list in this way, and I wouldn't want this list to get big.
If you want to go that way, then I think it's fine, though I think I'd prefer to change the iteration order (FWIW, llvm's DWARFDie::findRecursively
visits the current element first before recursing, although it doesn't have a direct equivalent to our GetAttributes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be, but you also could be reading too much into it. There are other (better) ways to say that the attribute will appear only once and "take precedence" could have been meant as "it will precede the other attribute in the list" (It's probably not valid English, but I can see someone might say that, and it sort matches what the code does).
That said, I think that omitting the second instance of the attribute should not be a problem functionality-wise, although I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there). This function is called from the manual dwarf index so it is somewhat performance sensitive. If you want to go down that path, I'd like to measure the impact on indexing speed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nice thing about omitting the second attribute is that I think we could get rid of the Recurse
argument. Because the only caller that sets Recurse
to No
is to avoid possibly overwriting DW_AT_low_pc
with the one from the declaration. Though haven't thought about this aspect too much yet.
llvm-project/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
Lines 3988 to 3991 in 628976c
// Second DW_AT_low_pc may come from DW_TAG_subprogram referenced by | |
// DW_TAG_GNU_call_site's DW_AT_abstract_origin overwriting our 'low_pc'. | |
// So do not inherit attributes from DW_AT_abstract_origin. | |
DWARFAttributes attributes = child.GetAttributes(DWARFDIE::Recurse::no); |
I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there)
I was thinking we just make DWARFAttributes
hold a SmallSet
(or SetVector
) instead of a SmallVector
. And Append
will only ever insert a single entry (the only caller that Appends
is currently GetAttributes
, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nice thing about omitting the second attribute is that I think we could get rid of the Recurse argument. Because the only caller that sets Recurse to No is to avoid possibly overwriting DW_AT_low_pc with the one from the declaration. Though haven't thought about this aspect too much yet.
That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).
I was thinking we just make DWARFAttributes hold a SmallSet (or SetVector) instead of a SmallVector. And Append will only ever insert a single entry (the only caller that Appends is currently GetAttributes, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)
For small values a SmallSet is a vector with a linear search for matching keys. It might work. I don't know if this will be a problem, I'm just saying it could be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).
Makes sense, I'll table the de-duplication for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave it a shot in #123261
… recursing `GetAttributes` returns all attributes on a given DIE, including any attributes in referenced `DW_AT_abstract_origin` or `DW_AT_specification`. However, if an attribute exists on both the referring DIE and the referenced DIE, the first one encountered will be the one that takes precendence when querying the returned `DWARFAttributes`. There is currently no good way of ensuring that an attribute of a definition doesn't get shadowed by one found on the declaration. One use-case where we don't want this to happen is for `DW_AT_object_pointer` (which can exist on both definitions and declarations, see llvm#123089). This patch makes sure we visit the current DIE's attributes before following DIE references. I tried keeping as much of the original `GetAttributes` unchanged and just add an outer `GetAttributes` that keeps track of the DIEs we need to visit next. There's precendent for this iteration order in `llvm::DWARFDie::findRecursively` and also `lldb_private::ElaboratingDIEIterator`. We could use the latter to implement `GetAttributes`, though it also follows `DW_AT_signature` so I decided to leave it for follow-up.
… recursing (#123261) `GetAttributes` returns all attributes on a given DIE, including any attributes that the DIE references via `DW_AT_abstract_origin` and `DW_AT_specification`. However, if an attribute exists on both the referring DIE and the referenced DIE, the first one encountered will be the one that takes precendence when querying the returned `DWARFAttributes`. But there was no guarantee in which order those attributes get visited. That means there's no convenient way of ensuring that an attribute of a definition doesn't get shadowed by one found on the declaration. One use-case where we don't want this to happen is for `DW_AT_object_pointer` (which can exist on both definitions and declarations, see #123089). This patch makes sure we visit the current DIE's attributes before following DIE references. I tried keeping as much of the original `GetAttributes` unchanged and just add an outer `GetAttributes` that keeps track of the DIEs we need to visit next. There's precendent for this iteration order in `llvm::DWARFDie::findRecursively` and also `lldb_private::ElaboratingDIEIterator`. We could use the latter to implement `GetAttributes`, though it also follows `DW_AT_signature` so I decided to leave it for follow-up.
… DIE before recursing (#123261) `GetAttributes` returns all attributes on a given DIE, including any attributes that the DIE references via `DW_AT_abstract_origin` and `DW_AT_specification`. However, if an attribute exists on both the referring DIE and the referenced DIE, the first one encountered will be the one that takes precendence when querying the returned `DWARFAttributes`. But there was no guarantee in which order those attributes get visited. That means there's no convenient way of ensuring that an attribute of a definition doesn't get shadowed by one found on the declaration. One use-case where we don't want this to happen is for `DW_AT_object_pointer` (which can exist on both definitions and declarations, see llvm/llvm-project#123089). This patch makes sure we visit the current DIE's attributes before following DIE references. I tried keeping as much of the original `GetAttributes` unchanged and just add an outer `GetAttributes` that keeps track of the DIEs we need to visit next. There's precendent for this iteration order in `llvm::DWARFDie::findRecursively` and also `lldb_private::ElaboratingDIEIterator`. We could use the latter to implement `GetAttributes`, though it also follows `DW_AT_signature` so I decided to leave it for follow-up.
In #122742 we will start attaching DW_AT_object_pointer to method declarations (in addition to definitions).
Currently when LLDB parses a
DW_TAG_subprogram
definition, it will parse all the attributes of the declaration as well. If we haveDW_AT_object_pointer
on both, then we would overwrite the more specific attribute that we got from the defintion with the one from the specification. This is problematic because LLDB relies on getting theDW_AT_name
from theDW_AT_object_pointer
, which doesn't exist on the specification.Note GCC does attach
DW_AT_object_pointer
on declarations and definitions already (see https://godbolt.org/z/G1GvddY48), so there's definitely some expressions that will fail for GCC compiled binaries. This patch will fix those cases (e.g., I would expectTestConstThis.py
to fail with GCC).