Commit 91b1a21

CWG3015 Handling of header-names for #include and #embed
1 parent fae130f commit 91b1a21

2 files changed

+139
-121
lines changed

source/lex.tex

Lines changed: 14 additions & 7 deletions
@@ -588,16 +588,23 @@
588588
the next preprocessing token is the longest sequence of
589589
characters that could constitute a preprocessing token, even if that
590590
would cause further lexical analysis to fail,
591-
except that a \grammarterm{header-name}\iref{lex.header} is only formed
591+
except that
592592
\begin{itemize}
593593
\item
594-
after the \tcode{include} or \tcode{import} preprocessing token in a
595-
\tcode{\#include}\iref{cpp.include} or
596-
\tcode{import}\iref{cpp.import} directive, or
597-
594+
a \grammarterm{header-name}\iref{lex.header} is only formed
595+
\begin{itemize}
598596
\item
599-
within a \grammarterm{has-include-expression}.
600-
597+
immediately after the \tcode{include}, \tcode{embed}, or \tcode{import} preprocessing token in a
598+
\tcode{\#include}\iref{cpp.include}, \tcode{\#embed}\iref{cpp.embed}, or
599+
\tcode{import}\iref{cpp.import} directive, respectively, or
600+
\item
601+
immediately after a preprocessing token sequence of \xname{has_include}
602+
or \xname{has_embed} immediately followed by \tcode{(}
603+
in a \tcode{\#if}, \tcode{\#elif}, or \tcode{\#embed} directive\iref{cpp.cond,cpp.embed} and
604+
\end{itemize}
605+
\item
606+
a \grammarterm{string-literal} token is never formed
607+
when a \grammarterm{header-name} token can be formed.
601608
\end{itemize}
602609
\end{itemize}
603610

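As an illustration of the lexing rule in the hunk above, the following sketch shows the contexts in which a header-name is formed. It assumes a hosted implementation that provides `<vector>` and that the quoted path, which is made up for this example, does not exist. `VECTOR_HEADER`, `has_vector`, and `has_missing` are hypothetical names. The treatment of whitespace when a header-name is formed from macro-expanded tokens is implementation-defined, so the second `#include` is only expected to work on typical implementations.

```cpp
// First form: a header-name is formed immediately after `include`.
#include <vector>

// Second form: the pp-tokens are macro-expanded first, then an attempt is
// made to form a header-name from the resulting tokens.
// VECTOR_HEADER is a hypothetical macro introduced for this sketch.
#define VECTOR_HEADER <vector>
#include VECTOR_HEADER

// In a #if directive, a header-name is formed immediately after
// `__has_include` followed by `(`.
#if __has_include(<vector>)
constexpr bool has_vector = true;
#else
constexpr bool has_vector = false;
#endif

// Assumption: this made-up path does not name an existing file, so the
// search fails and the expression evaluates to 0.
#if __has_include("no/such/header/cwg3015_demo.h")
constexpr bool has_missing = true;
#else
constexpr bool has_missing = false;
#endif
```

Outside these contexts, `<vector>` would be lexed as the separate tokens `<`, `vector`, and `>`, which is why the rule restricts where a header-name can be formed.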
source/preprocessor.tex

Lines changed: 125 additions & 114 deletions
Original file line numberDiff line numberDiff line change
@@ -363,7 +363,8 @@
363363
\indextext{\idxxname{has_embed}}%
364364
\begin{bnf}
365365
\nontermdef{has-embed-expression}\br
366-
\terminal{\xname{has_embed}} \terminal{(} pp-balanced-token-seq \terminal{)}
366+
\terminal{\xname{has_embed}} \terminal{(} header-name \opt{pp-balanced-token-seq} \terminal{)}\br
367+
\terminal{\xname{has_embed}} \terminal{(} header-name-tokens \opt{pp-balanced-token-seq} \terminal{)}
367368
\end{bnf}
368369

369370
\indextext{\idxxname{has_cpp_attribute}}%
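The revised grammar above lets a has-embed-expression start with a header-name, optionally followed by further tokens such as embed parameters. A minimal sketch of that shape follows. It assumes the resource path (made up for this example) does not exist, and it relies on the `__STDC_EMBED_NOT_FOUND__` macro from the C23 and C++26 specifications, which this excerpt does not show. `embed_probe` is a hypothetical name. Implementations without `__has_embed` take the fallback branch.

```cpp
// Guard first: `__has_embed` cannot appear in a #if expression on
// implementations that do not provide it.
#if defined(__has_embed)
   // header-name followed by an optional pp-balanced-token-seq, here limit(0).
   // Assumption: the made-up resource below does not exist.
#  if __has_embed("no/such/resource/cwg3015_demo.bin" limit(0)) == __STDC_EMBED_NOT_FOUND__
constexpr int embed_probe = 0;   // resource was not found
#  else
constexpr int embed_probe = 1;   // resource was found (possibly empty)
#  endif
#else
constexpr int embed_probe = -1;  // __has_embed is not supported here
#endif
```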
@@ -405,17 +406,12 @@
405406
\tcode{\#undef}
406407
directive with the same subject identifier), \tcode{0} if it is not.
407408

408-
\pnum
409-
The second form of \grammarterm{has-include-expression}
410-
is considered only if the first form does not match,
411-
in which case the preprocessing tokens are processed just as in normal text.
412-
413409
\pnum
414410
The header or source file identified by
415411
the parenthesized preprocessing token sequence
416412
in each contained \grammarterm{has-include-expression}
417413
is searched for as if that preprocessing token sequence
418-
were the \grammarterm{pp-tokens} in a \tcode{\#include} directive,
414+
were the \grammarterm{pp-tokens} of a \tcode{\#include} directive,
419415
except that no further macro expansion is performed.
420416
If such a directive would not satisfy the syntactic requirements
421417
of a \tcode{\#include} directive, the program is ill-formed.
@@ -424,10 +420,11 @@
424420
to \tcode{0} if the search fails.
425421

426422
\pnum
427-
The parenthesized \grammarterm{pp-balanced-token-seq} in each contained
423+
The parenthesized preprocessing token sequence of each contained
428424
\grammarterm{has-embed-expression} is processed as if that
429-
\grammarterm{pp-balanced-token-seq} were the \grammarterm{pp-tokens} in the
430-
third form of a \tcode{\#embed} directive\iref{cpp.embed}.
425+
preprocessing token sequence were the \grammarterm{pp-tokens}
426+
of a \tcode{\#embed} directive\iref{cpp.embed},
427+
except that no further macro expansion is performed.
431428
If such a directive would not satisfy the syntactic requirements of a
432429
\tcode{\#embed} directive, the program is ill-formed.
433430
The \grammarterm{has-embed-expression} evaluates to:
@@ -686,83 +683,81 @@
686683
\indextext{\idxcode{\#include}}%
687684

688685
\pnum
689-
A
690-
\tcode{\#include}
691-
directive shall identify a header or source file
692-
that can be processed by the implementation.
686+
A \defnadj{header}{search} for a sequence of characters
687+
searches a sequence of places for a header
688+
identified uniquely by that sequence of characters.
689+
How the places are determined or the header identified
690+
is \impldef{determination of places and identification of headers during header search}.
691+
692+
\pnum
693+
A \defnadj{source file}{search} for a sequence of characters
694+
attempts to identify a source file that is named by the sequence of characters.
695+
The named source file is searched for
696+
in an \impldef{search for source files during source file search} manner.
697+
If the implementation does not support a source file search
698+
for that sequence of characters, or if the search fails,
699+
the result of the source file search
700+
is the result of a header search for the same sequence of characters.
693701

694702
\pnum
695703
A preprocessing directive of the form
696704
\begin{ncsimplebnf}
697-
\terminal{\# include <} h-char-sequence \terminal{>} new-line
705+
\terminal{\# include} header-name new-line
698706
\end{ncsimplebnf}
699-
searches a sequence of
700-
\impldef{sequence of places searched for a header}
701-
places
702-
for a header identified uniquely by the specified sequence
703-
between the
704-
\tcode{<}
705-
and
706-
\tcode{>}
707-
delimiters,
708-
and causes the replacement of that
709-
directive by the entire contents of the header.
710-
How the places are specified
711-
or the header identified
712-
is \impldef{search locations for \tcode{<>} header}.
707+
causes the replacement of that directive by the entire contents
708+
of the header or source file identified by \grammarterm{header-name}.
713709

714710
\pnum
715-
A preprocessing directive of the form
711+
If the \grammarterm{header-name} is of the form
716712
\begin{ncsimplebnf}
717-
\terminal{\# include "} q-char-sequence \terminal{"} new-line
713+
\terminal{<} h-char-sequence \terminal{>}
718714
\end{ncsimplebnf}
719-
causes the replacement of that
720-
directive by the entire contents of the
721-
source file identified by the specified sequence between the
722-
\tcode{"}
723-
delimiters.
724-
The named source file is searched for in an
725-
\impldef{manner of search for included source file}
726-
manner.
727-
If this search is not supported,
728-
or if the search fails,
729-
the directive is reprocessed as if it read
715+
a header is identified by a header search for the sequence of characters
716+
of the \grammarterm{h-char-sequence}.
717+
718+
\pnum
719+
If the \grammarterm{header-name} is of the form
730720
\begin{ncsimplebnf}
731-
\terminal{\# include <} h-char-sequence \terminal{>} new-line
721+
\terminal{"} q-char-sequence \terminal{"}
732722
\end{ncsimplebnf}
733-
with the identical contained sequence (including
734-
\tcode{>}
735-
characters, if any) from the original directive.
723+
the source file or header is identified by a source file search
724+
for the sequence of characters of the \grammarterm{q-char-sequence}.
725+
726+
\pnum
727+
If a header search fails, or if a source file search or header search
728+
identifies a header or source file that cannot be processed by the implementation,
729+
the program is ill-formed.
730+
\begin{note}
731+
If the header or source file cannot be processed,
732+
the program is ill-formed even when evaluating \xname{has_include}.
733+
\end{note}
736734

737735
\pnum
738736
A preprocessing directive of the form
739737
\begin{ncsimplebnf}
740738
\terminal{\# include} pp-tokens new-line
741739
\end{ncsimplebnf}
742-
(that does not match one of the two previous forms) is permitted.
740+
(that does not match the previous form) is permitted.
743741
The preprocessing tokens after
744742
\tcode{include}
745743
in the directive are processed just as in normal text
746744
(i.e., each identifier currently defined as a macro name is replaced by its
747745
replacement list of preprocessing tokens).
748-
If the directive resulting after all replacements does not match
749-
one of the two previous forms, the behavior is
750-
undefined.
746+
Then, an attempt is made to form a \grammarterm{header-name}
747+
preprocessing token\iref{lex.header} from the whitespace and the characters
748+
of the spellings of the resulting sequence of preprocessing tokens;
749+
the treatment of whitespace
750+
is \impldef{treatment of whitespace when processing a \tcode{\#include} directive}.
751+
If the attempt succeeds, the directive with the so-formed \grammarterm{header-name}
752+
is processed as specified for the previous form.
753+
Otherwise, the behavior is undefined.
751754
\begin{note}
752755
Adjacent \grammarterm{string-literal}s are not concatenated into
753756
a single \grammarterm{string-literal}
754757
(see the translation phases in~\ref{lex.phases});
755758
thus, an expansion that results in two \grammarterm{string-literal}s is an
756759
invalid directive.
757760
\end{note}
758-
The method by which a sequence of preprocessing tokens between a
759-
\tcode{<}
760-
and a
761-
\tcode{>}
762-
preprocessing token pair or a pair of
763-
\tcode{"}
764-
characters is combined into a single header name
765-
preprocessing token is \impldef{search locations for \tcode{""""} header}.
766761

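The fallback from a source file search to a header search, as defined in the paragraphs above, can be modeled roughly as follows. This is only a sketch under stated assumptions: the real search is implementation-defined, and the in-memory table `files`, the place lists, and all function names are hypothetical stand-ins invented for this example (C++17 is assumed).

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the file system: (place, name) pairs that exist.
struct Entry {
    std::string_view place;
    std::string_view name;
};
constexpr std::array<Entry, 3> files = {{
    {"/sys/include", "vector"},
    {"/sys/include", "config.h"},
    {"./src", "local.h"},
}};

// Hypothetical place lists; a real implementation determines these itself.
constexpr std::array<std::string_view, 1> header_places = {"/sys/include"};
constexpr std::array<std::string_view, 1> source_places = {"./src"};

constexpr bool exists(std::string_view place, std::string_view name) {
    for (const Entry& e : files)
        if (e.place == place && e.name == name) return true;
    return false;
}

// Header search: look through a sequence of places, in order.
// Returns the place where the name was found, if any.
constexpr std::optional<std::string_view> header_search(std::string_view name) {
    for (std::string_view place : header_places)
        if (exists(place, name)) return place;
    return std::nullopt;
}

// Source file search: if it fails, its result is the result of a header
// search for the same sequence of characters.
constexpr std::optional<std::string_view> source_file_search(std::string_view name) {
    for (std::string_view place : source_places)
        if (exists(place, name)) return place;
    return header_search(name);
}
```

In this model, `source_file_search("config.h")` succeeds through the fallback, while `header_search("local.h")` fails because a header search never consults the source file places. A failed header search corresponds to the ill-formed case in the new wording.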
767762
\pnum
768763
The implementation shall provide unique mappings for
@@ -838,35 +833,58 @@
838833

839834
\rSec2[cpp.embed.gen]{General}
840835

836+
\pnum
837+
A \defnadj{bracket resource}{search} for a sequence of characters
838+
searches a sequence of places for a resource identified uniquely
839+
by that sequence of characters.
840+
How the places are determined or the resource identified
841+
is \impldef{determination of places and identification of resources during bracket resource search}.
842+
843+
\pnum
844+
A \defnadj{quote resource}{search} for a sequence of characters
845+
attempts to identify a resource that is named by the sequence of characters.
846+
The named resource is searched for
847+
in an \impldef{search for resources during quote resource search} manner.
848+
If the implementation does not support a quote resource search
849+
for that sequence of characters, or if the search fails,
850+
the result of the quote resource search
851+
is the result of a bracket resource search for the same sequence of characters.
852+
841853
\pnum
842854
A preprocessing directive of the form
843855
\begin{ncsimplebnf}
844-
\terminal{\# embed <} h-char-sequence \terminal{>} \opt{pp-tokens} new-line
856+
\terminal{\# embed} header-name \opt{pp-tokens} new-line
845857
\end{ncsimplebnf}
846-
searches a sequence of
847-
\impldef{sequence of places searched for an embedded resource}
848-
places for a resource identified uniquely by the specified sequence between
849-
the \tcode{<} and \tcode{>} delimiters.
850-
How the places are specified or the resource identified is
851-
\impldef{search locations for embedded resources specified with \tcode{<>}}.
858+
causes the replacement of that directive
859+
by preprocessing tokens derived from data
860+
in the resource identified by \grammarterm{header-name},
861+
as specified below.
852862

853863
\pnum
854-
A preprocessing directive of the form
864+
If the \grammarterm{header-name} is of the form
855865
\begin{ncsimplebnf}
856-
\terminal{\# embed "} q-char-sequence \terminal{"} \opt{pp-tokens} new-line
866+
\terminal{<} h-char-sequence \terminal{>}
857867
\end{ncsimplebnf}
858-
searches for a resource identified by the specified sequence between the
859-
\tcode{"} delimiters.
860-
The named resource is searched for in an
861-
\impldef{manner of search for named resource}
862-
manner.
863-
If this search is not supported, or if the search fails, the directive is
864-
reprocessed as if it read
868+
the resource is identified by a bracket resource search
869+
for the sequence of characters of the \grammarterm{h-char-sequence}.
870+
871+
\pnum
872+
If the \grammarterm{header-name} is of the form
865873
\begin{ncsimplebnf}
866-
\terminal{\# embed <} h-char-sequence \terminal{>} \opt{pp-tokens} new-line
874+
\terminal{"} q-char-sequence \terminal{"}
867875
\end{ncsimplebnf}
868-
with the identical contained sequence (including \tcode{>} characters, if any)
869-
from the original directive.
876+
the resource is identified by a quote resource search
877+
for the sequence of characters of the \grammarterm{q-char-sequence}.
878+
879+
\pnum
880+
If a bracket resource search fails,
881+
or if a quote or bracket resource search identifies a resource
882+
that cannot be processed by the implementation, the program is ill-formed.
883+
\begin{note}
884+
If the resource cannot be processed, the program is ill-formed
885+
even when processing \tcode{\#embed} with \tcode{limit(0)}\iref{cpp.embed.param.limit}
886+
or evaluating \xname{has_embed}.
887+
\end{note}
870888

871889
\pnum
872890
\recommended A mechanism similar to, but distinct from, the
@@ -987,53 +1005,48 @@
9871005
\begin{ncsimplebnf}
9881006
\terminal{\# embed} pp-tokens new-line
9891007
\end{ncsimplebnf}
990-
(that does not match one of the two previous forms) is permitted.
1008+
(that does not match the previous form) is permitted.
9911009
The preprocessing tokens after \tcode{embed} in the directive are processed
9921010
just as in normal text (i.e., each identifier currently defined as a macro
9931011
name is replaced by its replacement list of preprocessing tokens).
994-
The directive resulting after all replacements of the third form shall match
995-
one of the two previous forms.
1012+
Then, an attempt is made to form a \grammarterm{header-name}
1013+
preprocessing token\iref{lex.header} from the whitespace and the characters
1014+
of the spellings of the resulting sequence of preprocessing tokens immediately after \tcode{embed};
1015+
the treatment of whitespace
1016+
is \impldef{treatment of whitespace when processing a \tcode{\#embed} directive}.
1017+
If the attempt succeeds, the directive with the so-formed \grammarterm{header-name}
1018+
is processed as specified for the previous form.
1019+
Otherwise, the program is ill-formed.
9961020
\begin{note}
9971021
Adjacent \grammarterm{string-literal}{s} are not concatenated into a single
9981022
\grammarterm{string-literal} (see the translation phases in \iref{lex.phases});
9991023
thus, an expansion that results in two \grammarterm{string-literal}{s} is an
10001024
invalid directive.
10011025
\end{note}
1002-
1003-
Any further processing as in normal text described for the two previous
1004-
forms is not performed.
1026+
Any further processing as in normal text described for the previous
1027+
form is not performed.
10051028
\begin{note}
10061029
That is, processing as in normal text happens once and only once for the entire
10071030
directive.
10081031
\end{note}
1009-
10101032
\begin{example}
1011-
If the directive matches the third form, the whole directive is replaced.
1012-
If the directive matches the first two forms, everything after the name is
1013-
replaced.
1014-
1033+
If the directive matches the second form, the whole directive is replaced.
1034+
If the directive matches the first form, everything after the name is replaced.
10151035
\begin{codeblock}
1016-
#define prefix(ARG) suffix(ARG)
1017-
#define THE_ADDITION "teehee"
1018-
#define THE_RESOURCE ":3c"
1019-
#embed ":3c" prefix(THE_ADDITION)
1020-
#embed THE_RESOURCE prefix(THE_ADDITION)
1036+
#define EMPTY
1037+
#define X myfile
1038+
#define Y rsc
1039+
#define Z 42
1040+
#embed <myfile.rsc> prefix(Z)
1041+
#embed EMPTY <X.Y> prefix(Z)
10211042
\end{codeblock}
1022-
10231043
is equivalent to:
1024-
10251044
\begin{codeblock}
1026-
#embed ":3c" suffix("teehee")
1027-
#embed ":3c" suffix("teehee")
1045+
#embed <myfile.rsc> prefix(42)
1046+
#embed <myfile.rsc> prefix(42)
10281047
\end{codeblock}
10291048
\end{example}
10301049

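The second form of `#embed` can also be tried with a macro that expands to a string-literal. The sketch below uses `__FILE__` for that purpose and guards everything behind `__has_embed`, so that implementations without `#embed` support fall back to a placeholder. `first_byte` and `DEMO_EMBEDDED` are hypothetical names, and `__STDC_EMBED_FOUND__` comes from the C23 and C++26 specifications rather than from this excerpt. Whether the quote resource search finds the current file through `__FILE__` depends on the implementation, so the fallback also covers that case.

```cpp
#if defined(__has_embed)
   // __FILE__ is macro-expanded to a string-literal, from which a
   // header-name is then formed (second form of the directive).
#  if __has_embed(__FILE__ limit(1)) == __STDC_EMBED_FOUND__
constexpr unsigned char first_byte[] = {
#    embed __FILE__ limit(1)
};
#    define DEMO_EMBEDDED 1
#  endif
#endif

#ifndef DEMO_EMBEDDED
// Fallback placeholder when #embed is unavailable or the search fails.
constexpr unsigned char first_byte[] = {0};
#endif
```

Checking with `__has_embed` before using `#embed` avoids the ill-formed case in which the resource search fails.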
1031-
\pnum
1032-
The method by which a sequence of preprocessing tokens between a \tcode{<} and
1033-
a \tcode{>} preprocessing token pair or a pair of \tcode{"} characters is
1034-
combined into a single resource name preprocessing token is
1035-
\impldef{search locations for \tcode{""""} resource}.
1036-
10371050
\rSec2[cpp.embed.param]{Embed parameters}
10381051
\rSec3[cpp.embed.param.limit]{limit parameter}
10391052
\pnum
@@ -1783,17 +1796,15 @@
17831796
Otherwise, the original spelling of each preprocessing token in the
17841797
stringizing argument is retained in the character string literal,
17851798
except for special handling for producing the spelling of
1786-
\grammarterm{string-literal}s and \grammarterm{character-literal}s:
1787-
a
1788-
\tcode{\textbackslash}
1789-
character is inserted before each
1790-
\tcode{"}
1791-
and
1792-
\tcode{\textbackslash}
1793-
character of a \grammarterm{character-literal} or \grammarterm{string-literal}
1794-
(including the delimiting
1795-
\tcode{"}
1796-
characters).
1799+
\grammarterm{header-name}s,
1800+
\grammarterm{string-literal}s,
1801+
and \grammarterm{character-literal}s:
1802+
a \tcode{\textbackslash} character is inserted before each
1803+
\tcode{"} and \tcode{\textbackslash} character of a
1804+
\grammarterm{header-name},
1805+
\grammarterm{character-literal},
1806+
or \grammarterm{string-literal}
1807+
(including the delimiting \tcode{"} characters).
17971808
If the replacement that results is not a valid character string literal,
17981809
the behavior is undefined. The character string literal corresponding to
17991810
an empty stringizing argument is \tcode{""}.
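The escaping rule above can be observed for string-literals with a small sketch. `STRINGIZE`, `quoted`, and `empty` are hypothetical names. The header-name case added by this change is not shown, because a header-name is only formed in the directive contexts listed in the lexing rule.

```cpp
// Hypothetical helper macro: applies the # operator to its argument.
#define STRINGIZE(x) #x

// The argument is the token "a\\b", spelled with six characters:
// a quote, a, two backslashes, b, a quote.
// Stringizing inserts a backslash before each " and each \ of that
// spelling, so the resulting literal's value is the spelling itself.
constexpr char quoted[] = STRINGIZE("a\\b");

// An empty stringizing argument yields "".
constexpr char empty[] = STRINGIZE();
```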
