Skip to content

Restore 2.11 style desugaring of non-top-level module accessor and module lzycompute. #225

@retronym

Description

@retronym

My theory is that the shapeless slowdown is because of the change to inline the synchronized { if (moduleVar eq null) moduleVar = new NestedModule }; moduleVar into the module accessor method, rather than before when this was in a separate method.

Concretely, trait lazy vals are mixed into subclasses
with the needed synchronization logic in place, as do
lazy vals in classes and methods. Similarly, modules
are initialized using double checked locking.
Since the code to initialize a module is short,
we do not emit compute methods for modules (anymore).

For simplicity, local lazy vals do not get a compute method either.

2.11.8:

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix
  public scala.reflect.internal.Types$NoPrefix$ NoPrefix();
    Code:
       0: aload_0
       1: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       4: ifnonnull     14
       7: aload_0
       8: invokespecial #3100               // Method NoPrefix$lzycompute:()Lscala/reflect/internal/Types$NoPrefix$;
      11: goto          18
      14: aload_0
      15: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      18: areturn
  public scala.reflect.api.Types$TypeApi NoPrefix();
    Code:
       0: aload_0
       1: invokevirtual #5560               // Method NoPrefix:()Lscala/reflect/internal/Types$NoPrefix$;
       4: areturn

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix$lzycompute
  private scala.reflect.internal.Types$NoPrefix$ NoPrefix$lzycompute();
    Code:
       0: aload_0
       1: dup
       2: astore_1
       3: monitorenter
       4: aload_0
       5: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       8: ifnonnull     23
      11: aload_0
      12: new           #3096               // class scala/reflect/internal/Types$NoPrefix$
      15: dup
      16: aload_0
      17: invokespecial #3097               // Method scala/reflect/internal/Types$NoPrefix$."<init>":(Lscala/reflect/internal/SymbolTable;)V
      20: putfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      23: getstatic     #737                // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      26: pop
      27: aload_0
      28: monitorexit
      29: aload_0
      30: getfield      #3094               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      33: areturn
      34: aload_1
      35: monitorexit
      36: athrow
    Exception table:
       from    to  target type
           4    29    34   any

2.12.0-RC1:

scala> :javap -c -private scala.reflect.internal.SymbolTable#NoPrefix
  public scala.reflect.internal.Types$NoPrefix$ NoPrefix();
    Code:
       0: aload_0
       1: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
       4: ifnonnull     36
       7: aload_0
       8: monitorenter
       9: aload_0
      10: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      13: ifnonnull     28
      16: aload_0
      17: new           #867                // class scala/reflect/internal/Types$NoPrefix$
      20: dup
      21: aload_0
      22: invokespecial #2616               // Method scala/reflect/internal/Types$NoPrefix$."<init>":(Lscala/reflect/internal/SymbolTable;)V
      25: putfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      28: aload_0
      29: monitorexit
      30: goto          36
      33: aload_0
      34: monitorexit
      35: athrow
      36: aload_0
      37: getfield      #2615               // Field NoPrefix$module:Lscala/reflect/internal/Types$NoPrefix$;
      40: areturn

This has probably interfered with HotSpot's inlining.

image

I was comparing 2.11.8 and 2.12.0-RC1++ performance on the shapeless test suite, file by file. This was the worst regression, probably due to the relatively high amount of subtype checks that are performed in the implicit search.

⚡ time ((scalac -J-XX:+UnlockCommercialFeatures -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,stackdepth=1024,loglevel=debug,settings=profile,dumponexitpath=/tmp/old.jfr -Xlog-implicits @args-small.txt /Users/jz/code/shapeless/core/src/test/scala/shapeless/hlist.scala 2>&1) > /tmp/hlist-old.log)

real    0m27.659s
user    1m33.629s
sys 0m2.012s

⚡ time ((/code/scala/build/pack/bin/scalac -J-XX:+UnlockCommercialFeatures -J-XX:+UnlockDiagnosticVMOptions -J-XX:+DebugNonSafepoints -J-XX:+FlightRecorder -J-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,stackdepth=1024,loglevel=debug,settings=profile,dumponexitpath=/tmp/new.jfr -Xlog-implicits @args-2.12.0-RC-small.txt /Users/jz/code/shapeless/core/src/test/scala/shapeless/hlist.scala 2>&1) > /tmp/hlist-new.log)

real    1m27.506s
user    2m37.283s
sys 0m2.292s

The actual output of -Xlog-implicits was quite similar, so it seems a similar amount of work was being done.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions