So, isn't InputElementRegExp supposed to also process InputElementXMLContent?
Yes, <
(or XMLInitialiser
) would consume XMLListInitialiser
elements matched by InputElementRegExp
.
The specification tries to reject that.
How so?
They all make InputElementXMLContent available, but then how would one parse other expressions that need InputElementXMLContent? [In comment:] ... it'd cause a conflict for scanning an
InputElementRegExp
as well as anInputElementSomething
, this is because the<
symbol comes first inXMLInitialiser
orXMLInitiliser
.
If you're saying that there's a conflict between XMLListInitialiser
and <
because they both derive '<': No, that's not a conflict, because those two symbols are never used at the same point in the input. See the third paragraph in section 8:
"The <A><B/></A>
is used in those syntactic contexts where the literal contents of an XML element are permitted. The InputElementRegExp
symbol is used in all other syntactic grammar contexts."
That is, in any given syntactic context, there should be exactly one lexical goal symbol (<A>
) to use.
On the other hand, if you're saying that there's a conflict between XMLElementContent
and </
because they both derive sentences beginning with <
: Well, yes, that would be a problem for an LL(1) parser, but an LL(2) or LR(1) parser could handle it, I believe.
I believe the < production in the InputElementXMLContent was a mistake.
I don't think so. Consider the example <
. After XMLElement
, the syntactic parser is expecting <
or InputElementRegExp
, which means that the lexical parser must use <A>
as the goal symbol. So the latter must be able to match <
in order for this parse to succeed.
[from comment:]
InputElementXMLContent
starts with a<
terminal and may nest other elements, thus, it should beInputElementRegExp
IMO.
Again, see section 8. The two sentences I quoted above indicate that, after seeing order = <{x}>{item}</{x}>;
, you must use InputElementXMLContent
to get the next input element.