-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[include-cleaner] Handle symbols from system headers. #66089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This is the initial version -- we add a file-path-based mapping to handle sysmbols from system headers. The mapping list is not completed (I just added the most frequent problematic system headers). I'm looking for initial feedback. In the long run, we might replace the clangd's CanonicalInclude with this one.
83c89f0
to
5aabc90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot, and so sorry for taking so long on getting to this
// The first element is the suffix of the physical file path for the system | ||
// header, the second element is the final verbatim header spelling. | ||
const std::pair<llvm::StringRef, llvm::StringRef> IncludeMappings[] = { | ||
{"bits/types/struct_timeval.h", "<sys/time.h>"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about extracting these into a SystemHeaderSuffixes.inc
with entries like SYSTEM_HEADER_DETAIL(bits/types/struct_timeval.h, <sys/time.h>)
. that way we can have an easy way to inject more mappings going forward for other libraries in the future.
assert(llvm::all_of(*SystemHeaderMap, [](const auto &KeyAndValue) { | ||
return std::distance( | ||
llvm::sys::path::begin(KeyAndValue.first, | ||
llvm::sys::path::Style::posix), | ||
llvm::sys::path::end(KeyAndValue.first)) <= | ||
MaxSuffixComponents; | ||
})); | ||
// ... and precise. | ||
assert(llvm::any_of(*SystemHeaderMap, [](const auto &KeyAndValue) { | ||
return std::distance( | ||
llvm::sys::path::begin(KeyAndValue.first, | ||
llvm::sys::path::Style::posix), | ||
llvm::sys::path::end(KeyAndValue.first)) == | ||
MaxSuffixComponents; | ||
})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can move these assertions into the lambda, no need to perform them every time we access the singleton
MaxSuffixComponents; | ||
})); | ||
|
||
Instance.SuffixHeaderMapping = SystemHeaderMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're modifying a global with every call here, in a non-thread safe way.
// A mapping for system headers based on a set of filename rules. | ||
class SystemIncludeMap { | ||
public: | ||
static const SystemIncludeMap &get() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it's important that we also make these mappings different per language. most of them will likely be language independent, but at least libc extensions are likely to have different spellings for c vs c++.
moreover these fell more like private -> public mappings we perform via PragmaIncludes, so what about just moving them there? that way we can also initialize the mapping based on language options and also benefit all the other users of pragma includes with similar mappings?
// Returns the overridden verbatim spellings for HeaderPath that can be | ||
// directly included. | ||
llvm::SmallVector<llvm::StringRef> | ||
mapHeader(llvm::StringRef HeaderPath) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about making this one static instead and getting rid of the extra singleton interface?
for (auto It = FoundRange.first; It != FoundRange.second; ++It) | ||
Headers.push_back(It->second); | ||
} | ||
return Headers;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra ;
static SystemIncludeMap Instance; | ||
static const auto *SystemHeaderMap = [] { | ||
const size_t Size = sizeof(IncludeMappings) / sizeof(IncludeMappings[0]); | ||
auto *HeaderMap = new std::multimap<llvm::StringRef, llvm::StringRef>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we have a multipmap here instead of a just StringMap? do we expect to have suffixes that map to multiple umbrella headers?
@@ -188,6 +290,20 @@ llvm::SmallVector<Hinted<Header>> findHeaders(const SymbolLocation &Loc, | |||
OptionalFileEntryRef FE = SM.getFileEntryRefForID(FID); | |||
if (!FE) | |||
return {}; | |||
|
|||
if (SrcMgr::isSystem( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am afraid this check might be overly strict, maybe drop this initially and see if we have too many false-positives?
Results.emplace_back(Header, Hints::PublicHeader); | ||
assert(!Results.empty()); | ||
Results.front().Hint |= Hints::PreferredHeader; | ||
// FIXME: should we include the original header as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i think we should treat these similar to IWYU private -> public mappings.
// A hand-mainained list of include mappings for system headers. | ||
// The first element is the suffix of the physical file path for the system | ||
// header, the second element is the final verbatim header spelling. | ||
const std::pair<llvm::StringRef, llvm::StringRef> IncludeMappings[] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we discussed offline that this list is mostly for demonstration purposes, but wanted to remind it once again. we probably want to start with at least what we have in clangd, right?
This is an initial version -- we add a file-path-based mapping to handle symbols from system headers.
The mapping list is not completed (I just added the most frequent problematic system headers).
I'm looking for initial feedback. In the long run, we might replace the clangd's CanonicalInclude with this one.