Skip to content

[lldb][DWARFASTParserClang] Don't overwrite DW_AT_object_pointer of definition with that of a declaration #123089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 17, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -377,7 +377,12 @@ ParsedDWARFTypeAttributes::ParsedDWARFTypeAttributes(const DWARFDIE &die) {
break;

case DW_AT_object_pointer:
object_pointer = form_value.Reference();
// GetAttributes follows DW_AT_specification.
// DW_TAG_subprogram definitions and declarations may both
// have a DW_AT_object_pointer. Don't overwrite the one
// we parsed for the definition with the one from the declaration.
Comment on lines +380 to +383
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only be true if the definition die has DW_AT_object_pointer before DW_AT_specification. GetAttributes expands the specification attribute "in place", so if the order is reversed, we will encounter the attribute from the specification first.

I don't know if there's a good reason for the way that GetAttributes works. It might be possible to just change it to visit the main DIE first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. For some reason from an initial glance at GetAttributes I assumed it parsed the specification later. Will see if the iteration order is fixable

Copy link
Member Author

@Michael137 Michael137 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the spec says anything about duplicate attributes found in DW_AT_specifications

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to say that DW_AT_object_pointer doesn't apply if we found it through a specification/abstract_origin (like we do for DW_AT_sibling and DW_AT_declaration already). The object pointer points to a child DIE, so I don't know why we would want to include it in the list of attributes when getting attributes of the definition. That feels like a more consistent/less intrusive approach. Wdyt?

If we ever find that we do for some reason want to include both DW_AT_obejct_pointers in GetAttributes, then we can change it then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can sort of see how that could make sense, though I'm not prepared to call it "more consistent". I'm also unsure how many other attributes could be argued into the list in this way, and I wouldn't want this list to get big.

If you want to go that way, then I think it's fine, though I think I'd prefer to change the iteration order (FWIW, llvm's DWARFDie::findRecursively visits the current element first before recursing, although it doesn't have a direct equivalent to our GetAttributes)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, but you also could be reading too much into it. There are other (better) ways to say that the attribute will appear only once and "take precedence" could have been meant as "it will precede the other attribute in the list" (It's probably not valid English, but I can see someone might say that, and it sort matches what the code does).

That said, I think that omitting the second instance of the attribute should not be a problem functionality-wise, although I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there). This function is called from the manual dwarf index so it is somewhat performance sensitive. If you want to go down that path, I'd like to measure the impact on indexing speed.

Copy link
Member Author

@Michael137 Michael137 Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about omitting the second attribute is that I think we could get rid of the Recurse argument. Because the only caller that sets Recurse to No is to avoid possibly overwriting DW_AT_low_pc with the one from the declaration. Though haven't thought about this aspect too much yet.

// Second DW_AT_low_pc may come from DW_TAG_subprogram referenced by
// DW_TAG_GNU_call_site's DW_AT_abstract_origin overwriting our 'low_pc'.
// So do not inherit attributes from DW_AT_abstract_origin.
DWARFAttributes attributes = child.GetAttributes(DWARFDIE::Recurse::no);

I am slightly concerned about the performance impact (I guess you'd have to iterate through the list to see if the attribute is not already there)

I was thinking we just make DWARFAttributes hold a SmallSet (or SetVector) instead of a SmallVector. And Append will only ever insert a single entry (the only caller that Appends is currently GetAttributes, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing about omitting the second attribute is that I think we could get rid of the Recurse argument. Because the only caller that sets Recurse to No is to avoid possibly overwriting DW_AT_low_pc with the one from the declaration. Though haven't thought about this aspect too much yet.

That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).

I was thinking we just make DWARFAttributes hold a SmallSet (or SetVector) instead of a SmallVector. And Append will only ever insert a single entry (the only caller that Appends is currently GetAttributes, so if we fix iteration order then making it a set will ensure we only see the attribute of the most specific DIE)

For small values a SmallSet is a vector with a linear search for matching keys. It might work. I don't know if this will be a problem, I'm just saying it could be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also be true with the implementation which puts the most specific attribute first (the code could ignore the second instance, like you do now), although keeping recurse option might still be nice for performance reasons (if we know we don't need to recurse, we can skip doing that).

Makes sense, I'll table the de-duplication for now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a shot in #123261

if (!object_pointer.IsValid())
object_pointer = form_value.Reference();
break;

case DW_AT_signature:
Expand Down
161 changes: 161 additions & 0 deletions lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -741,3 +741,164 @@ TEST_F(DWARFASTParserClangTests, TestUniqueDWARFASTTypeMap_CppInsertMapFind) {

ASSERT_EQ(type_sp, reparsed_type_sp);
}

TEST_F(DWARFASTParserClangTests, TestParseDWARFAttributes_ObjectPointer) {
// This tests the behaviour of ParsedDWARFTypeAttributes
// for DW_TAG_subprogram definitions which have a DW_AT_object_pointer
// *and* a DW_AT_specification that also has a DW_AT_object_pointer.
// We don't want the declaration DW_AT_object_pointer to overwrite the
// one from the more specific definition's.

const char *yamldata = R"(
--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_EXEC
Machine: EM_AARCH64
DWARF:
debug_str:
- Context
- func
- this
debug_abbrev:
- ID: 0
Table:
- Code: 0x1
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_language
Form: DW_FORM_data2
- Code: 0x2
Tag: DW_TAG_structure_type
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_strp
- Code: 0x3
Tag: DW_TAG_subprogram
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_strp
- Attribute: DW_AT_declaration
Form: DW_FORM_flag_present
- Attribute: DW_AT_object_pointer
Form: DW_FORM_ref4
- Attribute: DW_AT_artificial
Form: DW_FORM_flag_present
- Attribute: DW_AT_external
Form: DW_FORM_flag_present
- Code: 0x4
Tag: DW_TAG_formal_parameter
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_artificial
Form: DW_FORM_flag_present
- Code: 0x5
Tag: DW_TAG_subprogram
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_object_pointer
Form: DW_FORM_ref4
- Attribute: DW_AT_specification
Form: DW_FORM_ref4
- Code: 0x6
Tag: DW_TAG_formal_parameter
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_strp
- Attribute: DW_AT_artificial
Form: DW_FORM_flag_present
debug_info:
- Version: 5
UnitType: DW_UT_compile
AddrSize: 8
Entries:

# DW_TAG_compile_unit
# DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus)

- AbbrCode: 0x1
Values:
- Value: 0x04

# DW_TAG_structure_type
# DW_AT_name [DW_FORM_strp] ("Context")

- AbbrCode: 0x2
Values:
- Value: 0x0

# DW_TAG_subprogram
# DW_AT_name [DW_FORM_strp] ("func")
# DW_AT_object_pointer [DW_FORM_ref4]
- AbbrCode: 0x3
Values:
- Value: 0x8
- Value: 0x1
- Value: 0x1d
- Value: 0x1
- Value: 0x1

# DW_TAG_formal_parameter
# DW_AT_artificial
- AbbrCode: 0x4
Values:
- Value: 0x1

- AbbrCode: 0x0
- AbbrCode: 0x0

# DW_TAG_subprogram
# DW_AT_object_pointer [DW_FORM_ref4] ("this")
# DW_AT_specification [DW_FORM_ref4] ("func")
- AbbrCode: 0x5
Values:
- Value: 0x29
- Value: 0x14

# DW_TAG_formal_parameter
# DW_AT_name [DW_FORM_strp] ("this")
# DW_AT_artificial
- AbbrCode: 0x6
Values:
- Value: 0xd
- Value: 0x1

- AbbrCode: 0x0
- AbbrCode: 0x0
...
)";
YAMLModuleTester t(yamldata);

DWARFUnit *unit = t.GetDwarfUnit();
ASSERT_NE(unit, nullptr);
const DWARFDebugInfoEntry *cu_entry = unit->DIE().GetDIE();
ASSERT_EQ(cu_entry->Tag(), DW_TAG_compile_unit);
ASSERT_EQ(unit->GetDWARFLanguageType(), DW_LANG_C_plus_plus);
DWARFDIE cu_die(unit, cu_entry);

auto holder = std::make_unique<clang_utils::TypeSystemClangHolder>("ast");
auto &ast_ctx = *holder->GetAST();
DWARFASTParserClangStub ast_parser(ast_ctx);

auto context_die = cu_die.GetFirstChild();
ASSERT_TRUE(context_die.IsValid());
ASSERT_EQ(context_die.Tag(), DW_TAG_structure_type);

auto subprogram_definition = context_die.GetSibling();
ASSERT_TRUE(subprogram_definition.IsValid());
ASSERT_EQ(subprogram_definition.Tag(), DW_TAG_subprogram);
ASSERT_FALSE(subprogram_definition.GetAttributeValueAsOptionalUnsigned(
DW_AT_external));

auto param_die = subprogram_definition.GetFirstChild();
ASSERT_TRUE(param_die.IsValid());

ParsedDWARFTypeAttributes attrs(subprogram_definition);
EXPECT_TRUE(attrs.object_pointer.IsValid());
EXPECT_EQ(attrs.object_pointer, param_die);
}
Loading