From e41f46c727a3f129988f96be91eaab469e8a2398 Mon Sep 17 00:00:00 2001 From: Dawn Perchik Date: Tue, 18 Feb 2020 17:37:41 -0800 Subject: [PATCH 1/3] =?UTF-8?q?P1868R2=20=F0=9F=A6=84=20width:=20clarifyin?= =?UTF-8?q?g=20units=20of=20width=20and=20precision=20in=20std::format?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Also fixes NB US 228 (C++20 CD) and LWG3290. --- source/intro.tex | 9 +++---- source/utilities.tex | 56 +++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 60 insertions(+), 5 deletions(-) diff --git a/source/intro.tex b/source/intro.tex index ab5ff09bd4..10fc2d29a7 100644 --- a/source/intro.tex +++ b/source/intro.tex @@ -35,10 +35,6 @@ \begin{itemize} \item Ecma International, \doccite{ECMAScript Language Specification}, Standard Ecma-262, third edition, 1999. -% FIXME: Is the following comment refering to the entry above or -% the entry removed in LWG3319? I.e. is it still valid? -%%% Format for this entry is based on that specified at -%%% http://www.iec.ch/standardsdev/resources/draftingpublications/directives/principles/referencing.htm \item ISO/IEC 2382 (all parts), \doccite{Information technology --- Vocabulary} \item ISO 8601:2004, \doccite{Data elements and interchange formats --- @@ -56,6 +52,11 @@ \item ISO 80000-2:2009, \doccite{Quantities and units --- Part 2: Mathematical signs and symbols to be used in the natural sciences and technology} +%%% Format for the following entry is based on that specified at +%%% http://www.iec.ch/standardsdev/resources/draftingpublications/directives/principles/referencing.htm +\item The Unicode Consortium. Unicode Standard Annex, UAX \#29, \doccite{Unicode Text Segmentation} [online]. +Edited by Mark Davis. Revision 35; issued for Unicode 12.0.0. 2019-02-15 [viewed 2020-02-23]. 
+Available at \url{http://www.unicode.org/reports/tr29/tr29-35.html} \end{itemize} \pnum diff --git a/source/utilities.tex b/source/utilities.tex index 58fe9c8f1b..a48b78e1a6 100644 --- a/source/utilities.tex +++ b/source/utilities.tex @@ -19371,6 +19371,54 @@ there is no minimum field width, and the field width is determined based on the content of the field. +\pnum +\indextext{string!width}% +The \defn{width} of a string is defined as +the estimated number of column positions appropriate +for displaying it in a terminal. +\begin{note} +This is similar to the semantics of the POSIX \tcode{wcswidth} function. +\end{note} + +\pnum +For the purposes of width computation, +a string is assumed to be in +a locale-independent, implementation-defined encoding. +Implementations should use a Unicode encoding +on platforms capable of displaying Unicode text in a terminal. +\begin{note} +This is the case for Windows-based and many POSIX-based operating systems. +\end{note} + +\pnum +For a string in a Unicode encoding, +implementations should estimate the width of a string +as the sum of estimated widths of +the first code points in its extended grapheme clusters +as defined by UAX \#29. +The estimated width of the following code points is 2: +\begin{itemize} +\item \tcode{U+1100-U+115F} +\item \tcode{U+2329} +\item \tcode{U+232A} +\item \tcode{U+2E80-U+303E} +\item \tcode{U+3040-U+A4CF} +\item \tcode{U+AC00-U+D7A3} +\item \tcode{U+F900-U+FAFF} +\item \tcode{U+FE10-U+FE19} +\item \tcode{U+FE30-U+FE6F} +\item \tcode{U+FF00-U+FF60} +\item \tcode{U+FFE0-U+FFE6} +\item \tcode{U+20000-U+2FFFD} +\item \tcode{U+30000-U+3FFFD} +\item \tcode{U+1F300-U+1F64F} +\item \tcode{U+1F900-U+1F9FF} +\end{itemize} +The estimated width of other code points is 1. + +\pnum +For a string in a non-Unicode encoding, the width of a string is unspecified. + \pnum A zero (\tcode{0}) character preceding the \fmtgrammarterm{width} field @@ -19398,7 +19446,13 @@ the precision or maximum field size. 
It can only be used with floating-point and string types. For floating-point types this field specifies the formatting precision. -For string types it specifies how many characters will be used from the string. +For string types, this field provides an upper bound +for the estimated width of the prefix of +the input string that is copied into the output. +For a string in a Unicode encoding, +the formatter copies to the output +the longest prefix of whole extended grapheme clusters +whose estimated width is no greater than the precision. \pnum When the \tcode{L} option is used, the form used for the conversion is called From cfd03b579bba7247bc826ed4a00fbf64c603d5bf Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Mon, 2 Mar 2020 16:10:05 -0800 Subject: [PATCH 2/3] [format.string.std] Put code point ranges in numerical order and collapse two adjacent code points into a range. --- source/utilities.tex | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/source/utilities.tex b/source/utilities.tex index a48b78e1a6..12c0ecad1a 100644 --- a/source/utilities.tex +++ b/source/utilities.tex @@ -19399,8 +19399,7 @@ The estimated width of the following code points is 2: \begin{itemize} \item \tcode{U+1100-U+115F} -\item \tcode{U+2329} -\item \tcode{U+232A} +\item \tcode{U+2329-U+232A} \item \tcode{U+2E80-U+303E} \item \tcode{U+3040-U+A4CF} \item \tcode{U+AC00-U+D7A3} @@ -19409,10 +19408,10 @@ \item \tcode{U+FE30-U+FE6F} \item \tcode{U+FF00-U+FF60} \item \tcode{U+FFE0-U+FFE6} -\item \tcode{U+20000-U+2FFFD} -\item \tcode{U+30000-U+3FFFD} \item \tcode{U+1F300-U+1F64F} \item \tcode{U+1F900-U+1F9FF} +\item \tcode{U+20000-U+2FFFD} +\item \tcode{U+30000-U+3FFFD} \end{itemize} The estimated width of other code points is 1. From 9ce9b23148af2f744ef6707713839aa550f44999 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Mon, 2 Mar 2020 16:41:36 -0800 Subject: [PATCH 3/3] [intro.ack] Add Unicode to the list of registered trademarks we mention. 
--- source/intro.tex | 3 +++ 1 file changed, 3 insertions(+) diff --git a/source/intro.tex b/source/intro.tex index 10fc2d29a7..60bee58ce0 100644 --- a/source/intro.tex +++ b/source/intro.tex @@ -682,5 +682,8 @@ \pnum ECMAScript\textregistered\ is a registered trademark of Ecma International. +\pnum +Unicode\textregistered\ is a registered trademark of Unicode, Inc. + \pnum All rights in these originals are reserved.
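Review note on patches 1/3 and 2/3: the width estimation and precision wording added to [format.string.std] can be illustrated with a minimal C++17 sketch. The range table is copied from the list as reordered in patch 2/3. Everything else is an assumption made for illustration: the names `code_point_width`, `string_width`, and `truncate_to_width` are hypothetical and are not part of `<format>`, the input is assumed to be already decoded to UTF-32, and each code point is treated as its own extended grapheme cluster. A conforming implementation would first segment the string per UAX #29 and count only the first code point of each cluster.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper type; not part of the patch or of <format>.
struct CodePointRange {
    char32_t first;
    char32_t last;
};

// Code point ranges whose estimated width is 2, in the numerical order
// used by the list after patch 2/3.
constexpr CodePointRange wide_ranges[] = {
    {0x1100, 0x115F},   {0x2329, 0x232A},   {0x2E80, 0x303E},
    {0x3040, 0xA4CF},   {0xAC00, 0xD7A3},   {0xF900, 0xFAFF},
    {0xFE10, 0xFE19},   {0xFE30, 0xFE6F},   {0xFF00, 0xFF60},
    {0xFFE0, 0xFFE6},   {0x1F300, 0x1F64F}, {0x1F900, 0x1F9FF},
    {0x20000, 0x2FFFD}, {0x30000, 0x3FFFD},
};

// Estimated width of one code point: 2 if it lies in a listed range, else 1.
constexpr int code_point_width(char32_t cp) {
    for (const auto& r : wide_ranges) {
        if (cp >= r.first && cp <= r.last) return 2;
    }
    return 1;
}

// ASSUMPTION: every code point starts its own extended grapheme cluster.
// Real UAX #29 segmentation is deliberately omitted from this sketch.
constexpr std::size_t string_width(std::u32string_view s) {
    std::size_t width = 0;
    for (char32_t cp : s) width += code_point_width(cp);
    return width;
}

// Longest prefix whose estimated width does not exceed `precision`,
// under the same one-code-point-per-cluster assumption.
constexpr std::u32string_view truncate_to_width(std::u32string_view s,
                                                std::size_t precision) {
    std::size_t width = 0;
    std::size_t count = 0;
    for (char32_t cp : s) {
        const std::size_t w = code_point_width(cp);
        if (width + w > precision) break;
        width += w;
        ++count;
    }
    return s.substr(0, count);
}

// Compile-time checks of the rule on a Latin letter and a CJK ideograph.
static_assert(code_point_width(U'a') == 1);
static_assert(code_point_width(char32_t{0x4E2D}) == 2);
static_assert(string_width(U"a\u4E2Db") == 4);
```

With this sketch, a precision of 2 applied to `U"a\u4E2Db"` keeps only `a`, because appending the width-2 ideograph would exceed the bound; a precision of 3 keeps the first two code points. This mirrors the "longest prefix ... whose estimated width is no greater than the precision" wording in patch 1/3.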