-
Notifications
You must be signed in to change notification settings - Fork 684
Add context support for memory allocator #1034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Awesome! I'll take a closer look in a few hours. |
/** \addtogroup mem Memory allocation | ||
* @{ | ||
* | ||
* \addtogroup heap Context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really heap-related (or just a copy-paste issue)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been thinking what is the best place for the context in Jerry, and for me the allocator is, since even if it is statically allocated, it is still "allocated". It could go to jrt, but for me that does not make sense. It cannot go to ecma, because mem cannot depend on ecma. I am open to better suggestions though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think jrt
may be a bit better, but I see what you mean. Your call.
Looks great 👍 |
*/ | ||
mem_heap_t jerry_global_heap __attribute__ ((aligned (MEM_ALIGNMENT))) JERRY_GLOBAL_HEAP_SECTION; | ||
|
||
#endif /* !JERRY_HEAP_SECTION_ATTR */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a is bit overly complex. You could simplify it as:
#ifndef JERRY_HEAP_SECTION_ATTR
#define JERRY_GLOBAL_HEAP_SECTION
#else /* JERRY_HEAP_SECTION_ATTR */
/**
* Jerry global heap section attribute.
*/
#define JERRY_GLOBAL_HEAP_SECTION __attribute__ ((section (JERRY_HEAP_SECTION_ATTR)))
#endif /* !JERRY_HEAP_SECTION_ATTR */
/**
* Global heap.
*/
mem_heap_t jerry_global_heap __attribute__ ((aligned (MEM_ALIGNMENT))) JERRY_GLOBAL_HEAP_SECTION;
This would move the double definiton of the variable outside the macro guards and remove duplication.
I think we should decide where to put this context. After some thinking I would put in the jerry-core/ root directory or a new jerry-core/context/ sub-directory |
I would have favoured "jerry-core/jrt" but certainly not "jerry-core/", which leaves me with voting for a new subdir under jerry-core. |
+1 to a new |
Patch reworked |
Sorting the source directories in CMake is a good idea :) I'd like to see it in the commit message too. In general, a more descriptive commit message would be great. The code is LGTM. |
/** \addtogroup context Jerry context | ||
* @{ | ||
* | ||
* \addtogroup context Context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the two levels deep gruoping intentional? Will the context doc group have multiple sub-groups? Or is a single context group enough for everything here?
LGTM |
/** | ||
* Jerry global heap. | ||
*/ | ||
extern mem_heap_t jerry_global_heap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd strongly suggest merging most information of mem_heap_t
and jerry_global_heap
into jerry_global_context
. Something along the lines:
typedef struct
{
// whatever already is here extended with the fields below:
mem_heap_free_t first; /**< first node in free region list */
uint8_t *area; /**< heap area */
} jerry_context_t;
And then, you could move the relevant code from context.c to mem-heap.c:
/**
+ * Jerry global heap section attribute.
+ */
+#ifndef JERRY_HEAP_SECTION_ATTR
+#define JERRY_GLOBAL_HEAP_SECTION
+#else /* JERRY_HEAP_SECTION_ATTR */
+#define JERRY_GLOBAL_HEAP_SECTION __attribute__ ((section (JERRY_HEAP_SECTION_ATTR)))
+#endif /* !JERRY_HEAP_SECTION_ATTR */
+
+/**
+ * Global heap.
+ */
uint8_t jerry_global_heap_area[MEM_HEAP_AREA_SIZE] __attribute__ ((aligned (MEM_ALIGNMENT))) JERRY_GLOBAL_HEAP_SECTION;
And initialize the area pointer in mem_heap_init
with:
JERRY_CONTEXT (area) = jerry_global_heap_area;
This would remove the need for two context globals and two context macros, and would also make setting the heap pointer with something not statically allocated easier later in the future, while keeping the possibility to have the heap put in a separate section right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep two areas because of the performance reasons discussed before. When somebody has a single context pointer, the area should be organized like this [context][heap], and the pointer should point to [heap]. This would allow fast pointer compression and context could be reached easily, because negative absolute offsets are supported on all popular CPUs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get this concept.
Context is (or, should be) a "thing" that represents all the state of the engine. Engine-managed heap, any caches, or anything else required to execute JS in/by the engine should (must) be part of the context. That should be the only entity that the user of the library has to deal with if s/he wants to change ... well, context. Dealing with the internals -- lile, heap -- should be off limits for the user. Thus, s/he should not pass in a heap pointer or something like that.
My understanding is that we concluded at the workshop that -- at least for now -- we start with the implementation of a single global context, which advances the current state by (1) ensuring that all global state information is in one place, (2) that state is always initialized, and (3) it can be swapped/replaced. (I.e., we don't pass a context object/pointer around the engine all the time but allow a global context struct -- or pointer to such a struct -- to be set by public API calls.)
I don't see this, especially (3), in the above explanation.
Could you please benchmark how the current PR performs (in its current form, without the mentioned context+heap layout optimization) and what performance degradation the suggested approach of merging the two contexts would cause (if any)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the outlined [context][heap] memory layout cannot be ensured as heap can end up in a completely different section to context (because of JERRY_HEAP_SECTION_ATTR
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This concept provides nice optimization opportunities in the future, and I don't think it is worth sacrificing them. I want three contexts: for simple variables, heap, and lookup cache. If you prefer you can put all of them into a super structure (it can be the default), but you can also organize them cleverly. Obviously it is not relevant for static data, but can have nice benefit for other purposes. The point of the current approach is not putting everything into one structure, but to put everything into one file. People can reorganize this data according to their needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they are no context, but the building blocks of a context. The point is, you take these blocks, and build your own context. One opportunity is defining static variables for each block individually, another is putting all (or some) them in a large structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It's still unclear to me who is "you" above. How will it be possible to put these blocks together in different ways? All the code above is strictly jerry internal with noone (no user or port API) having access to it.
- I still don't see why the type
mem_heap_t
is needed. Even with the layout you depicted,mem_heap_free_t first;
could go intojerry_context_t
(perhaps with a slightly different name) as it already has several heap-related fields (e.g.,mem_heap_free_t *mem_heap_list_skip_p;
). That would leave the heap as a byte array, and theMEM_HEAP_CONTEXT
macro would need no argument at all as it would/could only refer to the heap area. - I'm still interested in the benchmark results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
there will be different modes like the floating point representations (e.g. 32/64 bit doubles) (I expect 3-5 modes maximum). Ports will choose the appropriate mode with a compiler flags (one port likely will use only one mode).
-
The mem_heap_t is very much needed, since the two fields are tightly coupled. Actually the "first" is the first 8 bytes of the heap, and provides nicer access to this 8 bytes than casting uint8_t* all the time.
-
What do you expect here? Why the results should be much different? I expect that there is no difference between the current several globals spread in the code and a few global variables. From compiler viewpoint, nothing changed (and that is the plan). But this is just mode 1. For other modes, there will likely be changes, but their purpose is different. In a future mode with a single global context pointer, your primary aim is having one changable pointer, and not being faster than globals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like @zherczeg's idea. This is a good first step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The mem_heap_t is very much needed, since the two fields are tightly coupled. Actually the "first" is the first 8 bytes of the heap, and provides nicer access to this 8 bytes than casting uint8_t* all the time.
I don't think that this is right/correct. I've tried to trace down all uses of the first
field and it is never used as part of the heap. I cannot find a place where it would be referred to via a compact pointer. It is never used in a more tightly coupled way than mem_heap_list_skip_p
, which is in the "regular" context object.
- What do you expect here?
My original suggestion (the first in this thread) was to have a pointer in the context object to a heap area, which is "somewhere"; instead of having a globally allocated byte array for the heap. You've argued against it because of performance reasons. That's why I asked for benchmark results.
LGTM |
9a72874
to
572402e
Compare
I tried the pointer to heap version. Performance is roughly the same (gain is less than the deviation), but the binary size is increased by 600 bytes. I would still prefer the original. |
Updated and rebased to the latest master. |
Still LGTM |
Added a long comment why the "first" is in a good place. |
Patch rebased. |
LGTM |
/** | ||
* Global context. | ||
*/ | ||
jerry_context_t jerry_global_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1)Is the "context" here similar with the v8 context https://developers.google.com/v8/embed#contexts
2)Will the name "global_context" be misleading? People may think it's global environment/scope
What about "native_context"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand. It is a global environment. This is an initial patch, which collects global variables into a structure, and later this structure can be used to support different global management. E.g. multiple Jerries in one process, optional memory allocation for Jerry, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that I didn't put it clearly. Here "context" I think it is similar with V8 context or Node.js's vm.runInXXContext, right? If so, each context will have its own global env, like ecma_global_lex_env_p in jerry-core\ecma\operations\ecma-lex-env.c
The followings are my understanding, please correct me if I am wrong.
jerry_global_context is the default context when we run jerry. in the future, we will support creating new context (let's name it "context2") and leting some js codes run in the context2. In the above case, both jerry_global_context and context2 will have respective ecma_global_lex_env_p. But the name of "jerry_global_context" may let people feel it is the global env.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan here is a bit different. Jerry will support more "context models" in the future and these context models will be selected by an appropriate defines. E.g. the current context model (which will likely be the default context model) supports only a single global context. Other context models (e.g. JERRY_CONTEXT_MODEL_USER_CONTEXT_PTR) will defines other variables and structures. The point is, all context models will reuse the structures above for convenience. Hence creating new context models will be easy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry that I didn't fully understand. In a case which has 2 contexts, do the 2 context have 2 different jerry_global_context? or there is only one jerry_global_context in one jerry process? (I think only one jerry_global_context in one jerry process)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan is to have more contexts. But the other modes has not implemented yet, so we are quite free to do whatever we want. Those modes which support multiple contexts jerry_global_context will obviously not a good name, so they will define something like jerry_current_context, and this will be a pointer type.
I think a pseudo code will makes this easier to understand:
#ifdef JERRY_CONTEXT_MODE_SOMETHING1
// Whatever we want, e.g. a pointer to the current context
#elif defined JERRY_CONTEXT_MODE_SOMETHING2
// Whatever we want, e.g. a callback is defined to get the current context
#else
// Default context, simply one global context. This is the only mode we have now.
jerry_context_t global_context;
#endif
This is not implemented yet, just the general plan. The code will evolve into this direction. So we can define any kind of context modes depending on user requests. Hence we will not be limited to one context model like v8, but we can adapt several different ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, Thanks very much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. I admit the plan was not clear from the patch. Different context modes has different advantages / disadvantages, and we don't want to limit choices here. Instead everybody should choose at compile time the best context model for them.
Other than the minor stylistic comments, LGTM |
Thank you for the review. The patch and its description is updated. |
Still LGTM |
Currently the static variables of the allocator are moved into a global structure. The future plan is supporting multiple context models with different advantages and disadvantages. Users can select the most appropriate context model for their own use case. This context model must be selected at compile time using compiler defines. Currently only one model is implemented, which will be the default context model: the context is stored in global variables, so only a single context is available which always exists. JerryScript-DCO-1.0-Signed-off-by: Zoltan Herczeg [email protected]
This is the first patch which aims to move all jerry global variables into a context. This patch contains the infrastructure part, and the memory manager is updated to use this context. The JERRY_CONTEXT macro should provide the freedom of using contexts in Jerry.
@franc0is if you have some time please check this patch.