Skip to content

Commit 9ffd39a

Browse files
author
Father Chrysostomos
committed
Allow PADTMPs’ strings to be swiped
While copy-on-write does speed things up, it is not perfect. Take this snippet for example: $a = "$b$c"; $a .= $d; The concatenation operator on the rhs of the first line has its own scalar that it reuses every time that operator is called (its target). When the assignment happens, $a and that target share the same string buffer, which is good, because we didn’t have to copy it. But because it is shared between two scalars, the concatenation on the second line forces it to be copied. While copy-on-write may be fast, string swiping surpasses it, because it has no later bookkeeping overhead. If we allow stealing targets’ strings, then $a = "$b$c" no longer causes $a to share the same string buffer as the target; rather, $a steals that buffer and leaves the tar- get undefined. The result is that neither ‘$a =’ nor ‘$a .= $d’ needs to copy any strings. Only the "$b$c" will copy strings (unavoidably). This commit only applies that to long strings, however. This is why: Simply swiping the string from any swipable TARG (which I tried at first) resulted in a significant slowdown. By swiping the string from a TARG that is going to be reused (as opposed to a TEMP about to be freed, which is where swipe was already happening), we force it to allocate another string next time, greatly increasing the number of malloc calls. malloc overhead exceeds the overhead of copying short strings. I tried swiping TARGs for short strings only when the buffer on the lhs was not big enough for a copy (or there wasn’t one), but simple benchmarks with mktables show that even checking SvLEN(dstr) is enough to slow things down, since the speed-up this provides is minimal where short strings are involved. Then I tried checking just the string length, and saw a consistent speed increase. So that’s what this patch uses. Programs using short strings will not benefit. Programs using long strings may see a 1.5% increase in speed, due to fewer string copies.
1 parent 0fd5eac commit 9ffd39a

File tree

3 files changed

+27
-9
lines changed

3 files changed

+27
-9
lines changed

ext/Devel-Peek/t/Peek.t

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -509,7 +509,7 @@ do_test('string with Unicode',
509509
PV = $ADDR "\\\214\\\101\\\0\\\235\\\101"\\\0 \[UTF8 "\\\x\{100\}\\\x\{0\}\\\x\{200\}"\]
510510
CUR = 5
511511
LEN = \\d+
512-
COW_REFCNT = 1
512+
COW_REFCNT = 1 # $] < 5.019006
513513
');
514514
} else {
515515
do_test('string with Unicode',
@@ -521,7 +521,7 @@ do_test('string with Unicode',
521521
PV = $ADDR "\\\304\\\200\\\0\\\310\\\200"\\\0 \[UTF8 "\\\x\{100\}\\\x\{0\}\\\x\{200\}"\]
522522
CUR = 5
523523
LEN = \\d+
524-
COW_REFCNT = 1
524+
COW_REFCNT = 1 # $] < 5.019006
525525
');
526526
}
527527

@@ -549,7 +549,7 @@ do_test('reference to hash containing Unicode',
549549
PV = $ADDR "\\\235\\\101"\\\0 \[UTF8 "\\\x\{200\}"\]
550550
CUR = 2
551551
LEN = \\d+
552-
COW_REFCNT = 1
552+
COW_REFCNT = 1 # $] < 5.019006
553553
', '',
554554
$] > 5.009
555555
? $] >= 5.015
@@ -580,7 +580,7 @@ do_test('reference to hash containing Unicode',
580580
PV = $ADDR "\\\310\\\200"\\\0 \[UTF8 "\\\x\{200\}"\]
581581
CUR = 2
582582
LEN = \\d+
583-
COW_REFCNT = 1
583+
COW_REFCNT = 1 # $] < 5.019006
584584
', '',
585585
$] > 5.009
586586
? $] >= 5.015

pp_ctl.c

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3810,6 +3810,7 @@ PP(pp_require)
38103810
if (vms_unixname)
38113811
#endif
38123812
{
3813+
SV *nsv = sv;
38133814
namesv = newSV_type(SVt_PV);
38143815
for (i = 0; i <= AvFILL(ar); i++) {
38153816
SV * const dirsv = *av_fetch(ar, i, TRUE);
@@ -3834,11 +3835,15 @@ PP(pp_require)
38343835

38353836
ENTER_with_name("call_INC");
38363837
SAVETMPS;
3838+
if (SvPADTMP(nsv)) {
3839+
nsv = sv_newmortal();
3840+
SvSetSV_nosteal(nsv,sv);
3841+
}
38373842
EXTEND(SP, 2);
38383843

38393844
PUSHMARK(SP);
38403845
PUSHs(dirsv);
3841-
PUSHs(sv);
3846+
PUSHs(nsv);
38423847
PUTBACK;
38433848
if (sv_isobject(loader))
38443849
count = call_method("INC", G_ARRAY);

sv.c

Lines changed: 17 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -4358,12 +4358,20 @@ Perl_sv_setsv_flags(pTHX_ SV *dstr, SV* sstr, const I32 flags)
43584358
)
43594359
&&
43604360
!(isSwipe =
4361+
( /* Either ... */
43614362
#ifdef PERL_NEW_COPY_ON_WRITE
43624363
/* slated for free anyway (and not COW)? */
4363-
(sflags & (SVs_TEMP|SVf_IsCOW)) == SVs_TEMP &&
4364+
(sflags & (SVs_TEMP|SVf_IsCOW)) == SVs_TEMP
43644365
#else
4365-
(sflags & SVs_TEMP) && /* slated for free anyway? */
4366-
#endif
4366+
(sflags & SVs_TEMP) /* slated for free anyway? */
4367+
#endif
4368+
/* or a swipable TARG */
4369+
|| ((sflags & (SVs_PADTMP|SVf_READONLY|SVf_IsCOW))
4370+
== SVs_PADTMP
4371+
/* whose buffer is worth stealing */
4372+
&& GE_COWBUF_THRESHOLD(cur)
4373+
)
4374+
) &&
43674375
!(sflags & SVf_OOK) && /* and not involved in OOK hack? */
43684376
(!(flags & SV_NOSTEAL)) &&
43694377
/* and we're allowed to steal temps */
@@ -14085,14 +14093,19 @@ Perl_sv_recode_to_utf8(pTHX_ SV *sv, SV *encoding)
1408514093
STRLEN len;
1408614094
const char *s;
1408714095
dSP;
14096+
SV *nsv = sv;
1408814097
ENTER;
1408914098
PUSHSTACK;
1409014099
SAVETMPS;
14100+
if (SvPADTMP(nsv)) {
14101+
nsv = sv_newmortal();
14102+
SvSetSV_nosteal(nsv, sv);
14103+
}
1409114104
save_re_context();
1409214105
PUSHMARK(sp);
1409314106
EXTEND(SP, 3);
1409414107
PUSHs(encoding);
14095-
PUSHs(sv);
14108+
PUSHs(nsv);
1409614109
/*
1409714110
NI-S 2002/07/09
1409814111
Passing sv_yes is wrong - it needs to be or'ed set of constants

0 commit comments

Comments
 (0)