-
Notifications
You must be signed in to change notification settings - Fork 684
Enhancement of parser operand handling, compaction of instruction dumper #488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cb8bec7
to
b7f9faa
Compare
Please fix precommit |
/** | ||
* Operand (descriptor of value or reference in context of parser) | ||
*/ | ||
class jsp_operand_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to C++ is still under discussion in #470 with no conclusion yet. This change would already start adding such features (i.e., classes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akiss77, I'll provide measurements for both versions (i.e. before switching to class, and just after) today.
I think, we should perform the measurements on engine code, rather on some external tests / benchmarks. Because of this, I prepared the patch with usage of classes.
So, we could see if usage of classes could be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akiss77, I've performed the measurements on the following versions (if I didn't missed anything, the versions are only different usage of struct / class for parser operand
):
Performance was measured in --parse-only
mode, as the changes involve only parser.
Benchmark, used for the test was generated by the following script:
rm -f bench.js
for i in `seq 1 1 100`
do
echo "a = 1 + 1 + 1 + 1 ? 1 * 1 * 1 : 1 & (1 * a * b + o.p * q - 1 + 'str');"
echo "if (a) {b = 1 + 1 + 1 + 1 ? 1 * 1 * 1 : 1 ^ (1 * a * b + o.p * q - 1 + 'str');}"
echo "else { for (i = 0; i < 1000; i++) { c = 1 + 1 + 1 + 1 ? 1 * 1 * 1 : 1 | (1 * a * b + o.p * q - 1 + 'str'); } }"
done >> bench.js
- release.mcu_stm32f4-cp (gcc version 4.8.2 (4.8.2-14ubuntu1+6)):
- binary size
- struct: 150700 bytes
- class: 151012 bytes
- binary size
- release.linux (gcc version 4.8.2 (Ubuntu/Linaro 4.8.2-16ubuntu4), hard float):
- binary size (after
strip
):- struct: 190152 bytes
- class: 190152 bytes
- performance (every value in the sequence is a mean by 5 consecutive measurements):
- struct:
- 3.332
- 3.328
- 3.334
- 3.34
- 3.334
- class:
- 3.331
- 3.334
- 3.33
- 3.337
- 3.334
- struct:
- binary size (after
On mcu version binary size of build with class-based implementation is greater on 312 bytes.
Seems that this is caused by extra inline of a functions in build with C++ class-based implementation, relatively to build with C struct-based implementation.
In the performance measurements above share of parse_*
functions was very small - about 0.01% or less.
To increase share of parse_*
functions, I've added some performance optimizations (for heap allocator and linked list collection) to both builds.
On the new versions I've got the following (same environment; each value is mean of 20 measurements):
- struct:
- 0.1445
- 0.14475
- 0.14475
- 0.14475
- 0.14475
- class:
- 0.14525
- 0.145
- 0.14475
- 0.145
- 0.14525
So, time of build that uses class implementation is slower on about 0.2%.
However, the following is part of performance profile, consisting of first two parse_*expression
functions in each profile, collected using perf record
utility with frequency argument -F 50000
:
- struct:
- parse_member_expression: 0.36%
- parse_unary_expression: 0.27%
- class:
- parse_member_expression: 0.35% (a bit less)
- parse_unary_expression: 0.27% (same)
Considering the profile, we can see that actually reduce of performance is not relevant to change of parse_*expression
functions' performance.
For example, in the same profile:
- struct:
- serializer_dump_op_meta: 0.22%
- generate_instr: 0.29%
- current_locus: 0.22%
- linked_list_element: 12.86%
- memcpy: 5.23%
- class:
- serializer_dump_op_meta: 0.16%
- generate_instr: 0.37%
- current_locus: 0.49%
- linked_list_element: 13.33%
- memcpy: 5.22%
Considering the above, in the case the C++-feature itself doesn't involve binary size or performance.
For me it seems that, the C++-feature itself doesn't involve binary size of performance in general case too, and so can be used, with accurate consideration of consequences in each case (as for any other way of implementation).
In other words, there definitely can be some cases, in which performance is reduced because of using some feature, but it seems to me that the cases should be considered in the feature-independent way, i.e. by analysing actual origin of the reduce.
Concerning classes, I don't see any "physical" difference between basic classes and structures. For me, it seems to be just different way to declare the same behaviour.
Actual difference can occur, for example, during implementation of some default operations (like default constructor) that would add unnecessary operations in some cases, in which the operations are not optimized out. The cases should definitely be considered very accurately.
@akiss77, @zherczeg, @egavrin, @sand1k, could you, please, share your opinion about this?
b7f9faa
to
438036a
Compare
@egavrin, precommit fixed. |
This change is OK for me. I don't see any changes in binary size, memory consumption or performance. Additionally, classes are already used in other parts of the engine. Anyway, significant binary size reduction should be done other way, not by limiting C++ -> C. |
@akiss77 @zherczeg can we merge this? |
It's hard to object against the current patch because the measurements don't show big differences. However:
So, my vote is still against classes, exceptions, and C++ in general -- unless size is not that important anymore. That being said, that's just my two cents. I have no formal LGTM right, so no formal protest right either. (But I do protest informally :) ) |
@akiss77, I'm not sure that inlining is controlled less or more, depending on C or C++. Is it so? I think that in the case inlining can be evidence of more efficient code generation: usually, inline algorithns use some heuristics, like size of function to inline, etc. So, in the case, maybe, code of some function became a little bit smaller after changing to classes, and so, the function was inlined. Can't it be the cause?
It seems to me, that if we don't use classes with constructors / destructors, then we should emulate the constructors / desctructors mechanism manually, and calling them somehow upon
What do you think about this? |
Well, we have release builds here, which implies
You mean implementing a totally self-invented exception handling solution? How could we make the compiler know about it and emit code that conforms to our mechanism?
Eventhough that's picky, the above 300 bytes code increase does not underpin that completely. (Don't misunderstand me, I use C++ often in desktop environment, and I'm fairly content with its performance and convenience. But that's normally the
That's true, but not the other way around, though. |
I agree that gcc / g++ with
Actually, in the same way as we can do it in C. As C++ compiler has native mechanism of exception handling, and C compiler - doesn't have, I would prefer to use C++ compiler.
As I wrote above, I agree that gcc / g++ with @akiss77, @zherczeg, thank you very much for sharing your opinion. |
Since we don't agree completely on all points, I humbly propagate the chance of giving a +1 LGTM to someone else. |
|
||
return _data.idx_const; | ||
} /* get_idx_const */ | ||
private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put a newline before private.
Looks good to me (we have our concerns about C++ but we don't want to slow you down). We should introduce style guides for C++ code. |
@zherczeg thank you! @ruben-ayrapetyan rebase to master and |
* @return constructed operand | ||
*/ | ||
static jsp_operand_t | ||
make_idx_const_operand (vm_idx_t cnst) /**< integer in vm_idx_t range */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cnst
-> idx_const
c4fbe97
to
f54547a
Compare
8af9143
to
a416184
Compare
e1116a1
to
c80704c
Compare
c80704c
to
9f2c803
Compare
JerryScript-DCO-1.0-Signed-off-by: Ruben Ayrapetyan [email protected]
… types to enum defined in the class. JerryScript-DCO-1.0-Signed-off-by: Ruben Ayrapetyan [email protected]
9f2c803
to
28ffadd
Compare
…ALUE -> VM_IDX_EMPTY / VM_IDX_REWRITE_GENERAL_CASE, LITERAL_TO_REWRITE -> VM_IDX_REWRITE_LITERAL_UID. JerryScript-DCO-1.0-Signed-off-by: Ruben Ayrapetyan [email protected]
JerryScript-DCO-1.0-Signed-off-by: Ruben Ayrapetyan [email protected]
…e dumper, by extracting them to several separate routines that can be used for most cases, remove getop_* routines from vm. JerryScript-DCO-1.0-Signed-off-by: Ruben Ayrapetyan [email protected]
28ffadd
to
a00079e
Compare
Binary size (Linux armv7-hf): 198328 -> 194232