Skip to content
This repository was archived by the owner on Sep 30, 2022. It is now read-only.

opal/memory: Move Memory Allocation Hooks usage from openib #967

Merged
merged 3 commits into from
Mar 7, 2016

Conversation

igor-ivanov
Copy link
Member

See related PR in ompi/trunk open-mpi/ompi#1351

:bot:assign: @jsquyres
:bot🏷️bug
:bot:milestone:v2.0.0

ompi-trunk:
open-mpi/ompi@d9eefef
open-mpi/ompi@477991b
open-mpi/ompi@5c685e23

@ompiteam-bot ompiteam-bot added this to the v2.0.0 milestone Feb 15, 2016
@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1339/ for details.

@jsquyres
Copy link
Member

Per open-mpi/ompi#1351 (diff), this PR contains an abstraction violation that needs to be fixed. 👎

@igor-ivanov
Copy link
Member Author

@jsquyres could you please propose correct way.

@jsquyres
Copy link
Member

Typically, you have to do these kinds of things via indirection (e.g., via the framework base). This will also allow you to get rid of the #if in the BTL code.

@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1345/ for details.

@igor-ivanov
Copy link
Member Author

@jsquyres could you check this update

@@ -107,6 +122,16 @@ int mca_btl_base_select(bool enable_progress_threads,
"select: no init function; ignoring component %s",
component->btl_version.mca_component_name);
} else {
#if MEMORY_LINUX_MALLOC_ALIGN_ENABLED
if (strcmp(component->btl_version.mca_component_name, "openib") == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't put a check for a specific component in a framework base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah -- I see the problem: I think I wasn't clear in my comment to you before about doing this "through the framework base." Sorry. It was clear in my head! 😉 Here's what I meant:

I was actually referring to the opal/mca/memory base (not the BTL base). I.e., make an OPAL memory framework function to set a memalign hook. This function would be located in opal/mca/memory/base, and openib (and oshmem, and... whoever else needs it) can just call it directly, without needing to directly include opal/mca/memory/linux/something.h or directly invoke a function in opal/mca/memory/linux. Instead, the memory/base function will lookup and see if the selected memory component supports setting a memalign hook, and if so, set it (e.g., perhaps by calling a new member function on the memory module structure). The linux memory hook component will support this functionality; I don't know if the solaris one will (i.e., it needs to be safe for the solaris memory component to not support this functionality).

Does that make more sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsquyres thank you for sharing details. Actually my initial local version did something you described. Reason I did not push one was in necessity to use #ifdef inside memory/base that relates memory/linux component only. It seems that solaris does not use hooks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Perhaps this could be exposed as a function pointer on the module -- hence, the solaris module can provide a NULL pointer, and the linux one can provide a non-NULL pointer.

@jsquyres
Copy link
Member

@igor-ivanov Can you please revert the master commit? The abstraction violation is causing problems for others: http://www.open-mpi.org/community/lists/devel/2016/02/18598.php. Thanks.

mca_memory_linux_component.use_memalign = -1;
ret = mca_base_component_var_register(&mca_memory_linux_component.super.memoryc_version,
"memalign",
"[64 | 32 | 0] - Enable memory alignment for all malloc calls (default: disabled).",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use an MCA var enumeration here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems @ggouaillardet solved issue in master. Thanks.

only reason is to minimize changes with original code. Moreover using enumeration requires additional variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but this is exactly the case that enumerated MCA vars were intended for.

@hppritcha
Copy link
Member

Any idea why Jenkins/Travis did not catch this?

@igor-ivanov Can you please revert the master commit? The abstraction violation is causing problems for others: http://www.open-mpi.org/community/lists/devel/2016/02/18598.php. Thanks.


Reply to this email directly or view it on GitHub.

@jsquyres
Copy link
Member

I think @ggouaillardet did a workaround on master that's probably good enough for the moment, but the final result should probably be synchronized between master and v2.x.

@igor-ivanov
Copy link
Member Author

@jsquyres Could you advice.
There is memory/linux component ant it looks as suitable for keeping malloc hook operation. Now memory/linux component can be disabled in case --with-memory-manager=no is selected but we would like to have hook operation under Linux despite this (original openib is able to support this capability in both situations). So impementing this stuff inside memory/linux does not help completely.
I see a moving mca variables and hook operations on memory/base layer as a way that could solve this issue (note: there is 'empty' component inside 'base' layer). configure check if memory hook exists should be moved to memory top layer also.
Do you see any abstraction violation issues in this approach. What is your opinion.

@hppritcha
Copy link
Member

Does this also need to go in to 1.10.X?

igor-ivanov referenced this pull request in ggouaillardet/ompi Feb 18, 2016
@ggouaillardet
Copy link
Contributor

once open-mpi/ompi#1376 is merged, my last three commits can be cherry-picked into this PR and it should be good to go

@igor-ivanov
Copy link
Member Author

One point here.
Original code (before open-mpi/ompi#1351) was able to use aligned malloc in case no memory-manager is enabled (actually memory/empty is activated that is hidden in memory/base). This ability was lost after the PR (my apologies). See #967 (comment).
I have no idea how to solve this w/o breaking abstraction. There is 'linux' component that is suitable for this but it can be disabled. I can not add mostly linux specific feature in 'empty' component too.

@ggouaillardet
Copy link
Contributor

if memory alignment is not specific to the Linux component, then it should be moved to base.
an other option is to write an other minimal component that does alignment only.

@igor-ivanov
Copy link
Member Author

minimal component can be disabled as --memory-manager=none up layer option. So it seems not good way. As for the second option that I asked @jsquyres to check.

@ggouaillardet
Copy link
Contributor

well, if the only role of the minimal memory manager is to align memory, then you cannot have memory aligned with --memory-manager=none
that sounds like a feature to me...

@igor-ivanov
Copy link
Member Author

After looking at old openib code more I realized that malloc hook did not work w/o memory manager. Setting such hook was controled by opal_mem_hooks_support_level(). Only memory/linux and memory/malloc_solaris set hooks support level calling opal_mem_hooks_set_support().
So changes from @ggouaillardet should be enough.

@mike-dubman
Copy link
Member

@hppritcha - yes, we would like to port it into v1.10.x once ready.

@mike-dubman
Copy link
Member

👍

@lanl-ompi
Copy link
Contributor

Test FAILed.

igor-ivanov and others added 2 commits February 29, 2016 09:16
These changes fix issue open-mpi/ompi#1336

- improve abstractions: opal/memory/linux component should be single place that opeartes with
Memory Allocation Hooks.
- avoid collisions in case dynamic component open/close: it is safe because it is linked statically.
- does not change original behaivour.
@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1382/ for details.

@mike-dubman
Copy link
Member

👍

@mike-dubman
Copy link
Member

👎

@mike-dubman
Copy link
Member

👍

@mike-dubman
Copy link
Member

coool :)

@igor-ivanov
Copy link
Member Author

@ggouaillardet could you check if you see any issue

@jsquyres
Copy link
Member

jsquyres commented Mar 1, 2016

@igor-ivanov I think this looks good; nice job. I'd still like to see the MCA enumeration, though.

@ggouaillardet Does it look good to you?

@ggouaillardet
Copy link
Contributor

@jsquyres so far so good, i made open-mpi/ompi#1420 for the MCA enumeration

@hppritcha
Copy link
Member

@jsquyres i pronounce ready to go.

@jsquyres
Copy link
Member

jsquyres commented Mar 2, 2016

@hppritcha I'd like to see a v2.0.0 PR corresponding to open-mpi/ompi#1420 first.

Thanks Igor Ivanov for the review.
@mellanox-github
Copy link

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1397/ for details.

@jsquyres
Copy link
Member

jsquyres commented Mar 7, 2016

👍

@ompiteam-bot
Copy link

OMPIBot error: Label "reviewed" is already set on issue 967.

hppritcha added a commit that referenced this pull request Mar 7, 2016
opal/memory: Move Memory Allocation Hooks usage from openib
@hppritcha hppritcha merged commit 43e860c into open-mpi:v2.x Mar 7, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants