From 6e9b6909979dfb14f4ecd360cfbfb74cbee78ef8 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:37:08 +0800 Subject: [PATCH 01/23] . --- content/flexible-varargs.md | 278 ++++++++++++++++++++++++++++++++++++ 1 file changed, 278 insertions(+) create mode 100644 content/flexible-varargs.md diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md new file mode 100644 index 0000000..3b3c5d7 --- /dev/null +++ b/content/flexible-varargs.md @@ -0,0 +1,278 @@ +--- +layout: sip +permalink: /sips/:title.html +stage: implementation +status: under-review +title: SIP-XX - Flexible Varargs +--- + +**By: Li Haoyi** + +## History + +| Date | Version | +|---------------|--------------------| +| Feb 28th 2024 | Initial Draft | + +## Summary + +This SIP proposes an extension of the Scala Vararg unpacking syntax `*`. Currently, +vararg unpacking can only be used for a single `Seq`, both in expressions and in patterns: + +```scala +def sum(x: Int*) = x.sum + +val numbers = Seq(1, 2, 3) + +val total = sum(numbers*) // 6 + +numbers match{ + case Seq(numbers2*) => ??? +} +``` + +We propose to extend this to allow mixing of `*` unpackings and raw values in expressions and patterns: + + +```scala +val numbers = Seq(1, 2, 3) + +val total = sum(0, numbers*, 4) // 10 + +numbers match { + case Seq(1, numbers2*, 3) => println(numbers2) // Seq(2) +} +``` + +Allow multiple `*`s to be unpacked into the same varargs: + +```scala +val numbers1 = Seq(1, 2, 3) +val numbers2 = Seq(4, 5, 6) + +val total = sum(numbers1*, numbers2*) // 21 +``` + +And allow `Option`s to be unpacked: + +```scala +val numberOpt1: Option[Int] = Some(1) +val numberOpt2: Option[Int] = None + +val total = sum(numberOpt1*, numberOpt2*) // 1 +``` + +## Motivation + +Vararg unpacking with `*` is convenient but very limited. In particular, this proposal +streamlines two scenarios that were previously awkward. 
+ +### Passing Multiple Things to Varargs + +The first scenario that this proposal streamlines is when you want to combine a bunch of +different variables into a single varargs unpacking, e.g. two single values and two sequences. +This can be done as shown below, but is terribly ugly: lots of extra parens, constructing +inline `Seq`s with `++`: + +```scala +val numbers1 = Seq(1, 2, 3) +val numbers2 = Seq(4, 5, 6) + +// BEFORE +val total = sum((Seq(0) ++ numbers1 ++ numbers2 ++ Seq(7))*) // 28 + +val total = sum((0 +: numbers1 ++ numbers2 :+ 7)*) +// (console):1: left- and right-associative operators with same precedence may not be mixed + +// AFTER +val total = sum(0, numbers1*, numbers2*, 7) // 28 +``` + +As you can see, the version using inline `Seq()`s and `++` is super verbose, and the "obvious" +cleanup using `+:` and `:+` doesn't actually work due to weird associativity problems. +With this proposal, you can write what you mean and have it just work. + +### Constucting Sequences + +The second scenarios that this streamlines is constructing `Seq`s and other collections. 
+For example, a common scenario is constructing a collection from values: + + +```scala +val foo = 1 +val bar = 2 + +val coll = Seq(foo, bar) +// Seq(1, 2) +``` + +This works great even as the collection grows: + +```scala +val foo = 1 +val bar = 2 +val qux = 3 +val baz = 4 + +val coll = Seq(foo, bar, qux, baz) +// Seq(1, 2, 3, 4) +``` + +This looks fine, until one of the values is a sequence or an optional: + +```scala +val foo = 1 +val bar = Seq(2) +val qux = 3 +val baz = Some(4) + +val coll = Seq(foo) ++ bar ++ Seq(qux) ++ baz +// Seq(1, 2, 3, 4) +``` + +Even worse, if the first value is optional, it needs to be explicitly turned into a `Seq`, +otherwise you get an inferred type of `Iterable`, which is unexpected: + +```scala +val foo = Some(1) +val bar = Seq(2) +val qux = 3 +val baz = 4 + +val coll = foo.toSeq ++ bar ++ Seq(qux, baz) +val coll = Seq() ++ foo ++ bar ++ Seq(qux, baz) // alternative syntax +// Seq(1, 2, 3, 4) +``` + +As you can see, we end up having to refactor our code significantly for what is +logically a very similar operation: constructing a sequence from values. Depending +on what those values are, the shape of the code varies a lot, and you are liable to +get surprising inferred types if you forget to call `toSeq` on the first entry (sometimes!). 
+ +```scala +val coll = Seq(foo, bar, qux, baz) +val coll = Seq(foo) ++ bar ++ Seq(qux) ++ baz +val coll = foo.toSeq ++ bar ++ Seq(qux, baz) +val coll = Seq() ++ foo ++ bar ++ Seq(qux, baz) +``` + +With those proposal, all three scenarios would look almost the same - reflecting +the fact that they are really doing the same thing - and you are no longer prone to weird +type inference issues depending on the type of the left-most value: + + +```scala +val coll = Seq(foo, bar, qux, baz) +val coll = Seq(foo, bar*, qux, baz*) +val coll = Seq(foo*, bar*, qux, baz) +``` + +## Implementation + +The proposed implementation is to basically desugar the multiple `*`s into the manual +`Seq`-construction code you would have written without it: + +```scala +// User Code +val total = sum(0, numbers1*, numbers2*, 4) // 10 + +// Desugaring +val total = sum((IArray(0) ++ numbers1 ++ numbers2 ++ IArray(4))*) // 10 +``` + +The implementation for patterns could be something like + +```scala +// User Code +numbers match { + case Seq(1, numbers2*, 3) => println(numbers2) // Seq(2) +} + +// Desugaring Helper +class VarargsMatchHelper(beforeCount: Int, afterCount: Int) { + def unapply[T](values: Seq[T]): Option[(Seq[T], Seq[T], Seq[T])] = { + Option.when (values.length >= beforeCount + afterCount){ + ( + values.take(beforeCount), + values.drop(beforeCount).dropRight(afterCount), + values.takeRight(afterCount) + ) + } + } +} + +// Desugaring Helper +val VarargsMatcher = new VarargsMatchHelper(1, 1) +numbers match { + case VarargsMatcher(Seq(1), numbers2, Seq(3)) => println(numbers2) // Seq(2) +} +``` + +## Limitations + + +### Single `*` in pattern matching + +One major limitation is that while expressions support unpacking multiple `*`s in a varargs, +pattern matching can only support a single `*`. 
That is because a varargs pattern with +multiple `*` sub-patterns may have multiple possible ways of assigning the individual +elements to each sub-pattern, and depending on the sub-patterns not all such assignments +may be valid. Thus there is no way to implement it efficiently in general, as it would +require an expensive (`O(2^n)`) backtracking search to try and find a valid assignment +of the elements that satisfies all sub-patterns. + +Python's in [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) +has the same limitation of only allowing one`*` unpacking in its +[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns) + +### No Specific Performance Optimizations + +As proposed, the desugaring just relies on `IArray()` and `++` to construct the final +sequence that will be passed to varargs. We don't want to use `Seq` because it returns +a `List`, and `List#++` is very inefficient. But we do not do any further optimizations, +and anyone who hits performance issues with flexible vararg unpacking can always rewrite +it themselves manually constructing an `mutable.ArrayBuffer` or `mutable.ArrayDeque`. + +### Not supporting unpacking arbitrary `Iterable`s + +Given we propose to allow unpacking `Seq` and `Option`, its worth asking if we should +support unpacking `Iterable` or `IterableOnce`. This proposal avoids those types for now, +but if anyone wants to unpack them it's trivial to call `.toSeq`. For `Option`, we think +that the use case of calling a vararg method with optional values is frequent enough that +it deserves special support, whereas calling a vararg method with a `Set` or a `Map` +doesn't happen enough to be worth special casing. + +## Alternatives + +Apart from the manual workflows described above, one alternative is to use implicit conversions +to a target type, i.e. the "magnet pattern". 
For example: + +```scala +case class Summable(value: Seq[Int]) +implicit def seqSummable(value: Seq[Int]) = Summable(value) +implicit def singleSummable(value: Int) = Summable(Seq(value)) +implicit def optionSummable(value: Option[Int]) = Summable(value.toSeq) + +def sum(x: Int*) = x.sum + +val numbers1 = Seq(1, 2, 3) +val numbers2 = Seq(4, 5, 6) + +val total = sum(0, numbers1, numbers2, 7) +``` + +We can see this done repeatedly in the wild: + +* OS-Lib's `os.call` and `os.spawn` methods, which use + `os.Shellable` as the target type, with [several implicit conversions](https://github.com/com-lihaoyi/os-lib/blob/ff52a8bc4873d9c01e085cc18780845ecea0f8a2/os/src/Model.scala#L217-L237) + both for normal constructors as well as for `Seq[T]` and `Option[T]` cases + +* SBT's `settings` lists, which uses `SettingsDefinition` as the target type, with + [two implicit conversions](https://github.com/sbt/sbt/blob/1d16ca95106a11ad4ef0e3c5a1637c17189600da/internal/util-collection/src/main/scala/sbt/internal/util/Settings.scala#L691-L695) + for single and sequence entries + +This approach works, but relies on you controlling the +target type, and adds considerable boilerplate defining implicit conversions for +every such target type. It thus can sometimes be found in libraries where that +overhead can be amortized over many callsites, but it not a general +replacement for the more flexible `*` proposed in this document. 
\ No newline at end of file From 4d7e7c570df6a706e107703dcf7ac9562c5a56bd Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:39:15 +0800 Subject: [PATCH 02/23] fixdate --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 3b3c5d7..1a6cd69 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -12,7 +12,7 @@ title: SIP-XX - Flexible Varargs | Date | Version | |---------------|--------------------| -| Feb 28th 2024 | Initial Draft | +| Feb 28th 2025 | Initial Draft | ## Summary From 28861a4285a6d76632ef7a8c835ec8ff2aaa4e0c Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:40:25 +0800 Subject: [PATCH 03/23] fixdate --- content/flexible-varargs.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 1a6cd69..d4c0e58 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -56,10 +56,11 @@ val total = sum(numbers1*, numbers2*) // 21 And allow `Option`s to be unpacked: ```scala -val numberOpt1: Option[Int] = Some(1) -val numberOpt2: Option[Int] = None +val number1: Int = 1 +val number2: Option[Int] = Some(2) +val number3: Int = 3 -val total = sum(numberOpt1*, numberOpt2*) // 1 +val total = sum(number1, number2*, number3) // 6 ``` ## Motivation From f14d3ab101fee1eb65841c0e54b00097afd7b384 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:40:53 +0800 Subject: [PATCH 04/23] fixdate --- content/flexible-varargs.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index d4c0e58..26df5c6 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -53,7 +53,8 @@ val numbers2 = Seq(4, 5, 6) val total = sum(numbers1*, numbers2*) // 21 ``` -And allow `Option`s to be unpacked: +And allow `Option`s to be 
unpacked, as it is very common to have some of the values +you want to pass to a varargs be optional: ```scala val number1: Int = 1 From 9a9ad39be6d0e90d9b564794c4ba7fa8c44ba802 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:42:58 +0800 Subject: [PATCH 05/23] fixdate --- content/flexible-varargs.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 26df5c6..eb58ad6 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -273,6 +273,9 @@ We can see this done repeatedly in the wild: [two implicit conversions](https://github.com/sbt/sbt/blob/1d16ca95106a11ad4ef0e3c5a1637c17189600da/internal/util-collection/src/main/scala/sbt/internal/util/Settings.scala#L691-L695) for single and sequence entries +* Scalatags' HTML templates, which use `Frag` as the target type and provide + [an implicit conversion](https://github.com/com-lihaoyi/scalatags/blob/762ab37d0addc614bfd65bbeabeb5f123caf4395/scalatags/src/scalatags/Text.scala#L59-L63) from any `Seq[T]` with an implicit `T => Frag` + This approach works, but relies on you controlling the target type, and adds considerable boilerplate defining implicit conversions for every such target type. It thus can sometimes be found in libraries where that From 2067ed1bf50c10dbd10e89d7ed416188fef69381 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:45:06 +0800 Subject: [PATCH 06/23] fixdate --- content/flexible-varargs.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index eb58ad6..e65c768 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -280,4 +280,7 @@ This approach works, but relies on you controlling the target type, and adds considerable boilerplate defining implicit conversions for every such target type. 
It thus can sometimes be found in libraries where that overhead can be amortized over many callsites, but it not a general -replacement for the more flexible `*` proposed in this document. \ No newline at end of file +replacement for the more flexible `*` proposed in this document. Note that while the +"manual" approach of doing `foo ++ Seq(bar) ++ qux ++ Seq(baz)` could be applied to +any of the three use cases above, all three libraries found it painful enough that +adding implicit conversions was worthwhile. \ No newline at end of file From 1f358faf8d1a4a444e93e9ce413ec317fca0e2c6 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:46:55 +0800 Subject: [PATCH 07/23] fixdate --- content/flexible-varargs.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index e65c768..76ce0dc 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -224,8 +224,10 @@ require an expensive (`O(2^n)`) backtracking search to try and find a valid assi of the elements that satisfies all sub-patterns. Python's in [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) -has the same limitation of only allowing one`*` unpacking in its -[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns) +has the same limitation of only allowing one `*` unpacking in its +[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with +an arbitrary number of non-`*` patterns on the left and right, and follows the same +pattern matching strategy that I sketched above. 
### No Specific Performance Optimizations From b8b295b4815257d1027f2a5392593f0f216d5cd9 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:47:03 +0800 Subject: [PATCH 08/23] fixdate --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 76ce0dc..74a6b2e 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -223,7 +223,7 @@ may be valid. Thus there is no way to implement it efficiently in general, as it require an expensive (`O(2^n)`) backtracking search to try and find a valid assignment of the elements that satisfies all sub-patterns. -Python's in [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) +Python's [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) has the same limitation of only allowing one `*` unpacking in its [Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with an arbitrary number of non-`*` patterns on the left and right, and follows the same From b39a6c55a3571e765a97bad135aa7c81022c9ae9 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:48:19 +0800 Subject: [PATCH 09/23] fixdate --- content/flexible-varargs.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 74a6b2e..f1d9793 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -226,8 +226,9 @@ of the elements that satisfies all sub-patterns. Python's [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) has the same limitation of only allowing one `*` unpacking in its [Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with -an arbitrary number of non-`*` patterns on the left and right, and follows the same -pattern matching strategy that I sketched above. 
+an arbitrary number of non-`*` patterns on the left and right, and follows the +[same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) +that I sketched above. ### No Specific Performance Optimizations From de5078b23d0e5d0240d4a9dceacf34da833877be Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:52:50 +0800 Subject: [PATCH 10/23] fixdate --- content/flexible-varargs.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index f1d9793..cccbe63 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -230,6 +230,19 @@ an arbitrary number of non-`*` patterns on the left and right, and follows the [same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) that I sketched above. +Javascript has a stricter limitation where destructuring an array, it only allows +single values on the _left_ `...rest` pattern. + +```javascript +[a, b, ...rest] = [10, 20, 30, 40, 50]; +// a = 10 +// b = 20 +// rest = [30, 40, 50] + +[a, b, ...rest, c] = [10, 20, 30, 40, 50]; +// Uncaught SyntaxError: Rest element must be last element +``` + ### No Specific Performance Optimizations As proposed, the desugaring just relies on `IArray()` and `++` to construct the final From 2bfd84b5b79a1e21ddad40677f69be2c52bec445 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:56:13 +0800 Subject: [PATCH 11/23] fixdate --- content/flexible-varargs.md | 69 ++++++++++++++++++++++++++----------- 1 file changed, 48 insertions(+), 21 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index cccbe63..390acc6 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -223,26 +223,6 @@ may be valid. 
Thus there is no way to implement it efficiently in general, as it require an expensive (`O(2^n)`) backtracking search to try and find a valid assignment of the elements that satisfies all sub-patterns. -Python's [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) -has the same limitation of only allowing one `*` unpacking in its -[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with -an arbitrary number of non-`*` patterns on the left and right, and follows the -[same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) -that I sketched above. - -Javascript has a stricter limitation where destructuring an array, it only allows -single values on the _left_ `...rest` pattern. - -```javascript -[a, b, ...rest] = [10, 20, 30, 40, 50]; -// a = 10 -// b = 20 -// rest = [30, 40, 50] - -[a, b, ...rest, c] = [10, 20, 30, 40, 50]; -// Uncaught SyntaxError: Rest element must be last element -``` - ### No Specific Performance Optimizations As proposed, the desugaring just relies on `IArray()` and `++` to construct the final @@ -299,4 +279,51 @@ overhead can be amortized over many callsites, but it not a general replacement for the more flexible `*` proposed in this document. Note that while the "manual" approach of doing `foo ++ Seq(bar) ++ qux ++ Seq(baz)` could be applied to any of the three use cases above, all three libraries found it painful enough that -adding implicit conversions was worthwhile. \ No newline at end of file +adding implicit conversions was worthwhile. + +# Prior Art + +### Python + +Python's `*` syntax works identically to this proposal. 
In Python, you can mix +single values with one or more `*` unpackings when calling a function: + +```python +>>> a = [1, 2, 3] +>>> b = [4, 5, 6] + +>>> print(*a, 0, *b) +1 2 3 0 4 5 6 +``` + +Python's [PEP634: Structural Pattern Matching](https://peps.python.org/pep-0634) +has the same limitation of only allowing one `*` unpacking in its +[Sequence Patterns](https://peps.python.org/pep-0634/#sequence-patterns), with +an arbitrary number of non-`*` patterns on the left and right, and follows the +[same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) +that I sketched above. + +### Javascript +Javascript's expression `...` syntax works identically to this proposal. In Python, you can mix +single values with one or more `...` unpackings when calling a function: + +```javascript +a = [1, 2, 3] +b = [4, 5, 6] + +console.log(...a, 0, ...b) +// 1 2 3 0 4 5 6 +``` + +Javascript has a stricter limitation when destructuring an array, as it only allows +single values on the _left_ `...rest` pattern. + +```javascript +[a, b, ...rest] = [10, 20, 30, 40, 50]; +// a = 10 +// b = 20 +// rest = [30, 40, 50] + +[a, b, ...rest, c] = [10, 20, 30, 40, 50]; +// Uncaught SyntaxError: Rest element must be last element +``` From 4a89f650e87ab25c8ab21a6edc03a788f0753b56 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Fri, 28 Feb 2025 23:58:52 +0800 Subject: [PATCH 12/23] fixdate --- content/flexible-varargs.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 390acc6..87a4b48 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -327,3 +327,24 @@ single values on the _left_ `...rest` pattern. 
[a, b, ...rest, c] = [10, 20, 30, 40, 50]; // Uncaught SyntaxError: Rest element must be last element ``` + +### C# + +C# also has structural pattern matching, and supports +[List Patterns](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/patterns#list-patterns), +which allows a single `..` "slice" to be placed anywhere in the list: + +```csharp +Console.WriteLine(new[] { 1, 2, 3, 4, 5 } is [> 0, > 0, ..]); // True +Console.WriteLine(new[] { 1, 1 } is [_, _, ..]); // True +Console.WriteLine(new[] { 0, 1, 2, 3, 4 } is [> 0, > 0, ..]); // False +Console.WriteLine(new[] { 1 } is [1, 2, ..]); // False + +Console.WriteLine(new[] { 1, 2, 3, 4 } is [.., > 0, > 0]); // True +Console.WriteLine(new[] { 2, 4 } is [.., > 0, 2, 4]); // False +Console.WriteLine(new[] { 2, 4 } is [.., 2, 4]); // True + +Console.WriteLine(new[] { 1, 2, 3, 4 } is [>= 0, .., 2 or 4]); // True +Console.WriteLine(new[] { 1, 0, 0, 1 } is [1, 0, .., 0, 1]); // True +Console.WriteLine(new[] { 1, 0, 1 } is [1, 0, .., 0, 1]); // False +``` \ No newline at end of file From 2cb50b5be5702666f40922fa52ebdc0753a7528a Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:03:33 +0800 Subject: [PATCH 13/23] fixdate --- content/flexible-varargs.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 87a4b48..9064304 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -316,7 +316,8 @@ console.log(...a, 0, ...b) ``` Javascript has a stricter limitation when destructuring an array, as it only allows -single values on the _left_ `...rest` pattern. +single values to the _left_ of the `...rest` pattern, and does not allow anything to be +to the right of it. 
```javascript [a, b, ...rest] = [10, 20, 30, 40, 50]; From 662713e683f0fb170f610e4d01d8287ff17cd640 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:07:12 +0800 Subject: [PATCH 14/23] fixdate --- content/flexible-varargs.md | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 9064304..0ccb053 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -348,4 +348,25 @@ Console.WriteLine(new[] { 2, 4 } is [.., 2, 4]); // True Console.WriteLine(new[] { 1, 2, 3, 4 } is [>= 0, .., 2 or 4]); // True Console.WriteLine(new[] { 1, 0, 0, 1 } is [1, 0, .., 0, 1]); // True Console.WriteLine(new[] { 1, 0, 1 } is [1, 0, .., 0, 1]); // False -``` \ No newline at end of file +``` + +## Ruby + +Ruby does something interesting with its pattern matching: rather than only allowing a single +`*` vararg with single values to the left and right (like Python or Javascript) +Ruby's [Find Patterns](https://docs.ruby-lang.org/en/3.0/syntax/pattern_matching_rdoc.html#label-Patterns) +allow the `*` only as the _left-most and right-most_ entry in the sequence, with the +single values in the _middle_ + + +```ruby +case ["a", 1, "b", "c", 2] +in [*, String, String, *] + "matched" +else + "not matched" +end +``` + +This could be supported as part of the desugaring in this proposal if we want to, but for now is +left out \ No newline at end of file From 77cc8dfd0c65f27fdc2d3deaf7e7682766214213 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:08:25 +0800 Subject: [PATCH 15/23] fixdate --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 0ccb053..12f181a 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -251,7 +251,7 @@ implicit def seqSummable(value: Seq[Int]) = Summable(value) implicit def singleSummable(value: 
Int) = Summable(Seq(value)) implicit def optionSummable(value: Option[Int]) = Summable(value.toSeq) -def sum(x: Int*) = x.sum +def sum(x: Summable*) = x.flatMap(_.value).sum val numbers1 = Seq(1, 2, 3) val numbers2 = Seq(4, 5, 6) From b6a198a829a845aa562f9dff2fe01df1d9c130de Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:08:52 +0800 Subject: [PATCH 16/23] fixdate --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 12f181a..5b25f79 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -353,7 +353,7 @@ Console.WriteLine(new[] { 1, 0, 1 } is [1, 0, .., 0, 1]); // False ## Ruby Ruby does something interesting with its pattern matching: rather than only allowing a single -`*` vararg with single values to the left and right (like Python or Javascript) +`*` vararg in the middle with single values to the left and right (like Python or Javascript) Ruby's [Find Patterns](https://docs.ruby-lang.org/en/3.0/syntax/pattern_matching_rdoc.html#label-Patterns) allow the `*` only as the _left-most and right-most_ entry in the sequence, with the single values in the _middle_ From 15d1a02b0df984e7eeebd3598b6eb4ba8d3fbd79 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:13:10 +0800 Subject: [PATCH 17/23] fixdate --- content/flexible-varargs.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 5b25f79..d816754 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -329,6 +329,17 @@ to the right of it. 
// Uncaught SyntaxError: Rest element must be last element ``` +### Dart + +Dart allows pattern matching on lists with a single `...` [Rest Element](https://dart.dev/language/pattern-types#rest-element) +that can be anywhere in the list, similar to Python: + +```dart +var [a, b, ...rest, c, d] = [1, 2, 3, 4, 5, 6, 7]; +// Prints "1 2 [3, 4, 5] 6 7". +print('$a $b $rest $c $d'); +``` + ### C# C# also has structural pattern matching, and supports @@ -353,7 +364,7 @@ Console.WriteLine(new[] { 1, 0, 1 } is [1, 0, .., 0, 1]); // False ## Ruby Ruby does something interesting with its pattern matching: rather than only allowing a single -`*` vararg in the middle with single values to the left and right (like Python or Javascript) +`*` vararg in the middle with single values to the left and right, Ruby's [Find Patterns](https://docs.ruby-lang.org/en/3.0/syntax/pattern_matching_rdoc.html#label-Patterns) allow the `*` only as the _left-most and right-most_ entry in the sequence, with the single values in the _middle_ From 35f94bd6f86044d286189312c40e14d3b9af90ad Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 00:16:27 +0800 Subject: [PATCH 18/23] fixdate --- content/flexible-varargs.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index d816754..ef3780d 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -329,6 +329,20 @@ to the right of it. // Uncaught SyntaxError: Rest element must be last element ``` +### PHP + +PHP's work in progress [Pattern Matching RFC](https://wiki.php.net/rfc/pattern-matching) +allows for a `...` "rest" pattern to be added to a sequence pattern, but does not allow +it to be bound to a local variable: + +```php +// Array sequence patterns +$list is [1, 2, 3, 4]; // Exact match. +$list is [1, 2, 3, ...]; // Begins with 1, 2, 3, but may have other entries. +$list is [1, 2, mixed, 4]; // Allows any value in the 3rd position. 
+$list is [1, 2, 3|4, 5]; // 3rd value may be 3 or 4. +``` + ### Dart Dart allows pattern matching on lists with a single `...` [Rest Element](https://dart.dev/language/pattern-types#rest-element) From 9a81ab4d296b9b2b7f6cb5c7184a2a7daa0110db Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 01:07:40 +0800 Subject: [PATCH 19/23] fixdate --- content/flexible-varargs.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index ef3780d..e951d1a 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -303,6 +303,13 @@ an arbitrary number of non-`*` patterns on the left and right, and follows the [same pattern matching strategy](https://docs.python.org/3/reference/compound_stmts.html#sequence-patterns) that I sketched above. +```python +match command: + case ["drop", *objects]: + for obj in objects: + ... +``` + ### Javascript Javascript's expression `...` syntax works identically to this proposal. In Python, you can mix single values with one or more `...` unpackings when calling a function: @@ -393,5 +400,8 @@ else end ``` -This could be supported as part of the desugaring in this proposal if we want to, but for now is -left out \ No newline at end of file +Implementing this requires an `O(n^2)` scan over the input sequence attempting to +match the pattern starting at every index, hence the name `Find Patterns`. Although better +than the `O(2^n)` exponential backtracking search that would be required for arbitrary +placement of `*` patterns, it is still much worse than the `O(n)` cost of the +single-`*` pattern that most languages support, and so we are leaving it out of this proposal. \ No newline at end of file From 0b3d5c9fcba423bbc5be59adaa74f96fa008889c Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Sat, 1 Mar 2025 08:02:25 +0800 Subject: [PATCH 20/23] . 
--- content/flexible-varargs.md | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index e951d1a..c23f7a7 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -179,9 +179,15 @@ The proposed implementation is to basically desugar the multiple `*`s into the m val total = sum(0, numbers1*, numbers2*, 4) // 10 // Desugaring -val total = sum((IArray(0) ++ numbers1 ++ numbers2 ++ IArray(4))*) // 10 +val total = sum(IArray.newBuilder.addOne(0).addAll(numbers1).addAll(numbers2).addOne(4).result()*) // 10 ``` +We don't want to hard-code too much deep integration with the Scala collections library, +but at the same time do not want to do this too naively, since this may be used in some +performance critical APIs (e.g. Scalatags templates). It seems reasonable to assume that +any possible implementation of `IArray` or `Seq` will have an API like `newBuilder` that +allows you to construct the collection efficiently without excessive copying. + The implementation for patterns could be something like ```scala @@ -194,11 +200,9 @@ numbers match { class VarargsMatchHelper(beforeCount: Int, afterCount: Int) { def unapply[T](values: Seq[T]): Option[(Seq[T], Seq[T], Seq[T])] = { Option.when (values.length >= beforeCount + afterCount){ - ( - values.take(beforeCount), - values.drop(beforeCount).dropRight(afterCount), - values.takeRight(afterCount) - ) + val (first, rest) = values.splitAt(beforeCount) + val (middle, last) = rest.splitAt(rest.length - afterCount) + (first, middle, last) } } } @@ -223,14 +227,6 @@ may be valid. Thus there is no way to implement it efficiently in general, as it require an expensive (`O(2^n)`) backtracking search to try and find a valid assignment of the elements that satisfies all sub-patterns. 
-### No Specific Performance Optimizations - -As proposed, the desugaring just relies on `IArray()` and `++` to construct the final -sequence that will be passed to varargs. We don't want to use `Seq` because it returns -a `List`, and `List#++` is very inefficient. But we do not do any further optimizations, -and anyone who hits performance issues with flexible vararg unpacking can always rewrite -it themselves manually constructing an `mutable.ArrayBuffer` or `mutable.ArrayDeque`. - ### Not supporting unpacking arbitrary `Iterable`s Given we propose to allow unpacking `Seq` and `Option`, its worth asking if we should @@ -311,7 +307,7 @@ match command: ``` ### Javascript -Javascript's expression `...` syntax works identically to this proposal. In Python, you can mix +Javascript's expression `...` syntax works identically to this proposal. In Javascript, you can mix single values with one or more `...` unpackings when calling a function: ```javascript From d6959348df4469dd003114af21442c7ad7223c5c Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Mon, 7 Jul 2025 12:58:36 +0800 Subject: [PATCH 21/23] Update content/flexible-varargs.md Co-authored-by: odersky --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index c23f7a7..6b1d403 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -158,7 +158,7 @@ val coll = foo.toSeq ++ bar ++ Seq(qux, baz) val coll = Seq() ++ foo ++ bar ++ Seq(qux, baz) ``` -With those proposal, all three scenarios would look almost the same - reflecting +With this proposal, all three scenarios would look almost the same - reflecting the fact that they are really doing the same thing - and you are no longer prone to weird type inference issues depending on the type of the left-most value: From 8a9ac174a00306b386fa8805a8f86022bb9800a3 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Mon, 7 Jul 2025 12:58:46 +0800 Subject: 
[PATCH 22/23] Update content/flexible-varargs.md Co-authored-by: odersky --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 6b1d403..0d30e88 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -96,7 +96,7 @@ With this proposal, you can write what you mean and have it just work. ### Constucting Sequences -The second scenarios that this streamlines is constructing `Seq`s and other collections. +The second scenario that this streamlines is constructing `Seq`s and other collections. For example, a common scenario is constructing a collection from values: From 03ad012561251d433ddaecd7a28afa644e41f346 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Mon, 7 Jul 2025 12:58:56 +0800 Subject: [PATCH 23/23] Update content/flexible-varargs.md Co-authored-by: odersky --- content/flexible-varargs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/flexible-varargs.md b/content/flexible-varargs.md index 0d30e88..66857a2 100644 --- a/content/flexible-varargs.md +++ b/content/flexible-varargs.md @@ -94,7 +94,7 @@ As you can see, the version using inline `Seq()`s and `++` is super verbose, and cleanup using `+:` and `:+` doesn't actually work due to weird associativity problems. With this proposal, you can write what you mean and have it just work. -### Constucting Sequences +### Constructing Sequences The second scenario that this streamlines is constructing `Seq`s and other collections. For example, a common scenario is constructing a collection from values:
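
The builder-based desugaring and the `splitAt`-based `VarargsMatchHelper` introduced in PATCH 20/23 can be tried out by hand with the rough sketch below. It is only an illustration under stated assumptions, not the compiler's actual output: `sum` is redefined locally, `Vector.newBuilder` stands in for the `IArray.newBuilder` named in the patch, and `combined`, `helper` and `middleOf123` are hypothetical names chosen for this example.

```scala
// Local stand-in for the varargs method used throughout the proposal.
def sum(xs: Int*): Int = xs.sum

val numbers1 = Seq(1, 2, 3)
val numbers2 = Seq(4, 5, 6)

// Hand-written equivalent of `sum(0, numbers1*, numbers2*, 4)`.
// Assumption: Vector.newBuilder is used here in place of IArray.newBuilder;
// the point is only that a single builder collects every element once,
// instead of allocating an intermediate collection per `++`.
val combined: Seq[Int] =
  Vector.newBuilder[Int]
    .addOne(0)
    .addAll(numbers1)
    .addAll(numbers2)
    .addOne(4)
    .result()

val total = sum(combined*) // 0 + (1+2+3) + (4+5+6) + 4 = 25

// Helper from the patch: splits a sequence into a fixed-size prefix,
// a variable-size middle, and a fixed-size suffix.
class VarargsMatchHelper(beforeCount: Int, afterCount: Int) {
  def unapply[T](values: Seq[T]): Option[(Seq[T], Seq[T], Seq[T])] = {
    Option.when(values.length >= beforeCount + afterCount) {
      val (first, rest) = values.splitAt(beforeCount)
      val (middle, last) = rest.splitAt(rest.length - afterCount)
      (first, middle, last)
    }
  }
}

// Hand-written equivalent of `case Seq(1, numbers2*, 3) => numbers2`.
val helper = new VarargsMatchHelper(1, 1)

val middleOf123: Seq[Int] = Seq(1, 2, 3) match {
  case helper(Seq(1), middle, Seq(3)) => middle // Seq(2)
  case _ => Seq.empty
}
```

Each `splitAt` makes one pass over its input, so the helper stays within the `O(n)` cost that the proposal cites for single-`*` patterns.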