Skip to content

[SYCL] Separate jobs in sycl-post-link tool #854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 29, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
138 changes: 84 additions & 54 deletions llvm/tools/sycl-post-link/sycl-post-link.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -50,14 +50,14 @@ static cl::opt<std::string> BaseOutputFilename{
// with prodced IR modules files names.
static cl::opt<std::string> OutputIRFilesList{
"ir-files-list", cl::desc("Specify output filename for IR files list"),
cl::value_desc("filename"), cl::init("-"), cl::cat(ExtractCat)};
cl::value_desc("filename"), cl::init(""), cl::cat(ExtractCat)};

// Module splitter produces multiple TXT files. These files contain kernel names
// list presented in a produced module. TXT files list is a file list
// with produced TXT files names.
static cl::opt<std::string> OutputTxtFilesList{
"txt-files-list", cl::desc("Specify output filename for txt files list"),
cl::value_desc("filename"), cl::init("-"), cl::cat(ExtractCat)};
cl::value_desc("filename"), cl::init(""), cl::cat(ExtractCat)};

static cl::opt<bool> Force{"f", cl::desc("Enable binary output on terminals"),
cl::cat(ExtractCat)};
Expand Down Expand Up @@ -88,45 +88,62 @@ static void writeToFile(std::string Filename, std::string Content) {
OS.close();
}

static void collectKernelsSet(
Module &M, std::map<std::string, std::vector<Function *>> &ResKernelsSet) {
// Output parameter ResKernelModuleMap is a map containing groups of kernels
// with same values of the sycl-module-id attribute.
// The function fills ResKernelModuleMap using input module M.
static void collectKernelModuleMap(
Module &M,
std::map<std::string, std::vector<Function *>> &ResKernelModuleMap) {
for (auto &F : M.functions()) {
if (F.getCallingConv() == CallingConv::SPIR_KERNEL) {
if (OneKernelPerModule) {
ResKernelsSet[F.getName()].push_back(&F);
ResKernelModuleMap[F.getName()].push_back(&F);
} else if (F.hasFnAttribute("sycl-module-id")) {
auto Id = F.getFnAttribute("sycl-module-id");
auto Val = Id.getValueAsString();
ResKernelsSet[Val].push_back(&F);
ResKernelModuleMap[Val].push_back(&F);
}
}
}
}

// Splits input LLVM IR module M into smaller ones.
// Input parameter KernelsSet is a map containing groups of kernels with same
// values in the sycl-module-id attribute. For each group of kernels a separate
// IR module will be produced.
// ResModules and ResSymbolsLists are output parameters.
// Result modules are stored into
// ResModules vector. For each result module set of kernel names is collected.
// Sets of kernel names are stored into ResSymbolsLists.
// Input parameter KernelModuleMap is a map containing groups of kernels with
// same values of the sycl-module-id attribute. ResSymbolsLists is a vector of
// kernel name lists. Each vector element is a string with kernel names from the
// same module separated by \n.
// The function saves names of kernels from one group to a single std::string
// and stores this string to the ResSymbolsLists vector.
static void collectSymbolsLists(
std::map<std::string, std::vector<Function *>> &KernelModuleMap,
std::vector<std::string> &ResSymbolsLists) {
for (auto &It : KernelModuleMap) {
std::string SymbolsList;
for (auto &F : It.second) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it is a reduction algorithm such as std::accumulate...
Then why not just declaring SymbolsList to be a Twine to avoid the string conversions back-and-forth?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. According to LLVM docs, Twine is a temporary object and shouldn't be stored. I use the SymbolsList next...
https://llvm.org/doxygen/classllvm_1_1Twine.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @Fznamznon. += can't be done on Twines, AFAIK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, it looks like the Twine API is quite lower level. Sorry for the noise. :-(

SymbolsList =
(Twine(SymbolsList) + Twine(F->getName()) + Twine("\n")).str();
}
ResSymbolsLists.push_back(std::move(SymbolsList));
}
}

// Input parameter KernelModuleMap is a map containing groups of kernels with
// same values of the sycl-module-id attribute. For each group of kernels a
// separate IR module will be produced.
// ResModules is a vector of produced modules.
// The function splits input LLVM IR module M into smaller ones and stores them
// to the ResModules vector.
static void
splitModule(Module &M,
std::map<std::string, std::vector<Function *>> &KernelsSet,
std::vector<std::unique_ptr<Module>> &ResModules,
std::vector<std::string> &ResSymbolsLists) {
for (auto &It : KernelsSet) {
std::map<std::string, std::vector<Function *>> &KernelModuleMap,
std::vector<std::unique_ptr<Module>> &ResModules) {
for (auto &It : KernelModuleMap) {
// For each group of kernels collect all dependencies.
SetVector<GlobalValue *> GVs;
std::vector<llvm::Function *> Workqueue;
std::string SymbolsList;

for (auto &F : It.second) {
GVs.insert(F);
Workqueue.push_back(F);
SymbolsList =
(Twine(SymbolsList) + Twine(F->getName()) + Twine("\n")).str();
}

while (!Workqueue.empty()) {
Expand Down Expand Up @@ -184,22 +201,17 @@ splitModule(Module &M,

// Save results.
ResModules.push_back(std::move(MClone));
ResSymbolsLists.push_back(std::move(SymbolsList));
}
}

// Saves specified collections of llvm IR modules and corresponding lists of
// kernel names to files. Saves IR files list and TXT files list if user
// specified corresponding filenames.
static void saveResults(std::vector<std::unique_ptr<Module>> &ResModules,
std::vector<std::string> &ResSymbolsLists) {
int NumOfFile = 0;
// Saves specified collection of llvm IR modules to files.
// Saves file list if user specified corresponding filename.
static void
saveResultModules(std::vector<std::unique_ptr<Module>> &ResModules) {
std::string IRFilesList;
std::string TxtFilesList;
for (size_t I = 0; I < ResModules.size(); ++I) {
std::error_code EC;
std::string CurOutFileName = BaseOutputFilename + "_" +
std::to_string(NumOfFile) +
std::string CurOutFileName = BaseOutputFilename + "_" + std::to_string(I) +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue if the user is compiling toto.cpp and toto_0.cpp at the same time?
I guess it would be in 2 different temporary directories and thus no possible conflict?

((OutputAssembly) ? ".ll" : ".bc");

raw_fd_ostream Out{CurOutFileName, EC, sys::fs::OF_None};
Expand All @@ -214,33 +226,36 @@ static void saveResults(std::vector<std::unique_ptr<Module>> &ResModules,
PrintModule.add(createBitcodeWriterPass(Out));
PrintModule.run(*ResModules[I].get());

IRFilesList =
(Twine(IRFilesList) + Twine(CurOutFileName) + Twine("\n")).str();

CurOutFileName =
BaseOutputFilename + "_" + std::to_string(NumOfFile) + ".txt";
writeToFile(CurOutFileName, ResSymbolsLists[I]);

TxtFilesList =
(Twine(TxtFilesList) + Twine(CurOutFileName) + Twine("\n")).str();

++NumOfFile;
if (!OutputIRFilesList.empty())
IRFilesList =
(Twine(IRFilesList) + Twine(CurOutFileName) + Twine("\n")).str();
}

if (OutputIRFilesList != "-") {
// TODO: Figure out what can be added to the output list if there are no
// kernels in the input module.
if (!OutputIRFilesList.empty()) {
// Just pass input module to next tools if there was nothing to split.
if (IRFilesList.empty())
IRFilesList =
(Twine(InputFilename) + Twine("\n")).str();
IRFilesList = (Twine(InputFilename) + Twine("\n")).str();
writeToFile(OutputIRFilesList, IRFilesList);
}
if (OutputTxtFilesList != "-") {
// TODO: Figure out what can be added to output list if there are no kernels
// in the input module.
}

// Saves specified collection of symbols lists to files.
// Saves file list if user specified corresponding filename.
static void saveResultSymbolsLists(std::vector<std::string> &ResSymbolsLists) {
std::string TxtFilesList;
for (size_t I = 0; I < ResSymbolsLists.size(); ++I) {
std::string CurOutFileName =
BaseOutputFilename + "_" + std::to_string(I) + ".txt";
writeToFile(CurOutFileName, ResSymbolsLists[I]);

if (!OutputTxtFilesList.empty())
TxtFilesList =
(Twine(TxtFilesList) + Twine(CurOutFileName) + Twine("\n")).str();
}

if (!OutputTxtFilesList.empty()) {
if (TxtFilesList.empty()) {
// Just create an empty temporary file if there was nothing to split
// Just create an empty temporary file if there was nothing to split.
std::string TempFileNameBase = sys::path::stem(BaseOutputFilename);
SmallString<128> Path;
std::error_code EC =
Expand Down Expand Up @@ -290,17 +305,32 @@ int main(int argc, char **argv) {

std::map<std::string, std::vector<Function *>> GlobalsSet;

collectKernelsSet(*M.get(), GlobalsSet);
collectKernelModuleMap(*M.get(), GlobalsSet);

std::vector<std::unique_ptr<Module>> ResultModules;
std::vector<std::string> ResultSymbolsLists;

splitModule(*M.get(), GlobalsSet, ResultModules, ResultSymbolsLists);
// Default usage model of that the tool is
// calling it twice with the same input due clang driver limitations.
// It should not bring much extra overhead because
// parseIRFile and collectKernelModuleMap functions are small (would be good
// to estimate) compared to splitModule and saveResultModules.
bool NoLists = OutputIRFilesList.empty() && OutputTxtFilesList.empty();
bool PerformSplit = !OutputIRFilesList.empty() || NoLists;
bool CollectSymbols = !OutputTxtFilesList.empty() || NoLists;

if (BaseOutputFilename == "-")
BaseOutputFilename = "a.out";

saveResults(ResultModules, ResultSymbolsLists);
if (PerformSplit) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment that the default usage model is that the tool is called twice with the same inputs because clang does not support actions with multiple outputs, hence this workaround with ir/txt generation splitting. It should not bring much extra overhead because parseIRFile and collectKernelsSet are small (would be good to estimate) compared to either of

    splitModule(*M.get(), GlobalsSet, ResultModules);
    saveResultModules(ResultModules);

and

   collectSymbolsLists(GlobalsSet, ResultSymbolsLists);
   saveResultSymbolsLists(ResultSymbolsLists);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

splitModule(*M.get(), GlobalsSet, ResultModules);
saveResultModules(ResultModules);
}

if (CollectSymbols) {
collectSymbolsLists(GlobalsSet, ResultSymbolsLists);
saveResultSymbolsLists(ResultSymbolsLists);
}

return 0;
}