INTERNET-DRAFT                                                   S. Legg
draft-legg-xed-rxer-ei-04.txt                                    eB2Bcom
Intended Category: Standards Track                     December 22, 2006

                     Encoding Instructions for the
                    Robust XML Encoding Rules (RXER)

               Copyright (C) The IETF Trust (2006).

Status of This Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   Technical discussion of this document should take place on the XED
   developers mailing list.  Please send editorial comments directly to
   the editor.  Further information is available on the XED website:
   www.xmled.info.

   This Internet-Draft expires on 22 June 2007.

Abstract

   This document defines encoding instructions that may be used in an
   Abstract Syntax Notation One (ASN.1) specification to alter how ASN.1
   values are encoded by the Robust XML Encoding Rules (RXER) and
   Canonical Robust XML Encoding Rules (CRXER), for example, to encode a
   component of an ASN.1 value as an Extensible Markup Language (XML)
   attribute rather than as a child element.  Some of these encoding
   instructions also affect how an ASN.1 specification is translated
   into an Abstract Syntax Notation X (ASN.X) specification.  Encoding
   instructions that allow an ASN.1 specification to reference
   definitions in other XML schema languages are also defined.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  Definitions
   4.  Notation for RXER Encoding Instructions
   5.  Component Encoding Instructions
   6.  Reference Encoding Instructions
   7.  Expanded Names of Components
   8.  The ATTRIBUTE Encoding Instruction
   9.  The ATTRIBUTE-REF Encoding Instruction
   10. The COMPONENT-REF Encoding Instruction
   11. The ELEMENT-REF Encoding Instruction
   12. The LIST Encoding Instruction
   13. The NAME Encoding Instruction
   14. The REF-AS-ELEMENT Encoding Instruction
   15. The REF-AS-TYPE Encoding Instruction
   16. The SCHEMA-IDENTITY Encoding Instruction
   17. The SIMPLE-CONTENT Encoding Instruction
   18. The TARGET-NAMESPACE Encoding Instruction
   19. The TYPE-AS-VERSION Encoding Instruction
   20. The TYPE-REF Encoding Instruction
   21. The UNION Encoding Instruction
   22. The VALUES Encoding Instruction
   23. Insertion Encoding Instructions
   24. The VERSION-INDICATOR Encoding Instruction
   25. The GROUP Encoding Instruction
       25.1. Unambiguous Encodings
             25.1.1. Grammar Construction
             25.1.2. Unique Component Attribution
             25.1.3. Deterministic Grammars
             25.1.4. Attributes in Unknown Extensions
   26. Security Considerations
   27. IANA Considerations
   28. References
       28.1. Normative References
       28.2. Informative References
   Appendix A. GROUP Encoding Instruction Examples
   Appendix B. Insertion Encoding Instruction Examples
   Appendix C. Extension and Versioning Examples

1.  Introduction

   This document defines encoding instructions [X.680-1] that may be
   used in an Abstract Syntax Notation One (ASN.1) [X.680] specification
   to alter how ASN.1 values are encoded by the Robust XML Encoding
   Rules (RXER) [RXER] and Canonical Robust XML Encoding Rules (CRXER)
   [RXER], for example, to encode a component of an ASN.1 value as an
   Extensible Markup Language (XML) [XML10] attribute rather than as a
   child element.
   Some of these encoding instructions also affect how an ASN.1
   specification is translated into an Abstract Syntax Notation X
   (ASN.X) specification [ASN.X].

   This document also defines encoding instructions that allow an ASN.1
   specification to incorporate the definitions of types, elements and
   attributes in specifications written in other XML schema languages.
   References to XML Schema [XSD1] types, elements and attributes,
   RELAX NG [RNG] named patterns and elements, and XML document type
   definition (DTD) [XML10] element types are supported.

   In most cases, the effect of an encoding instruction is only briefly
   mentioned in this document.  The precise effects of these encoding
   instructions are described fully in the specifications for RXER
   [RXER] and ASN.X [ASN.X], at the points where they apply.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED" and "MAY" in this document are
   to be interpreted as described in BCP 14, RFC 2119 [BCP14].  The key
   word "OPTIONAL" is exclusively used with its ASN.1 meaning.

   Throughout this document "type" shall be taken to mean an ASN.1 type,
   and "value" shall be taken to mean an ASN.1 abstract value, unless
   qualified otherwise.

   A reference to an ASN.1 production [X.680] (e.g., Type, NamedType) is
   a reference to text in an ASN.1 specification corresponding to that
   production.  Throughout this document, "component" is synonymous with
   NamedType.

   This document uses the namespace prefix "xsi:" to stand for the
   namespace name [XMLNS10] "http://www.w3.org/2001/XMLSchema-instance".

   Example ASN.1 definitions in this document are assumed to be defined
   in an ASN.1 module with a TagDefault of "AUTOMATIC TAGS" and an
   EncodingReferenceDefault [X.680-1] of "RXER INSTRUCTIONS".

3.  Definitions

   The following definition of base type is used in specifying a number
   of encoding instructions.

   Definition (base type):  If a type, T, is a constrained type, then the
   base type of T is the base type of the type that is constrained,
   otherwise if T is a prefixed type, then the base type of T is the
   base type of the type that is prefixed, otherwise if T is a type
   notation that references or denotes another type (i.e., DefinedType,
   ObjectClassFieldType, SelectionType, TypeFromObject or
   ValueSetFromObjects), then the base type of T is the base type of the
   type that is referenced or denoted, otherwise the base type of T is T
   itself.

      Aside: A tagged type is a special case of a prefixed type.

4.  Notation for RXER Encoding Instructions

   The grammar of ASN.1 permits the application of encoding instructions
   [X.680-1], through type prefixes and encoding control sections, that
   modify how abstract values are encoded by nominated encoding rules.

   The generic notation for type prefixes and encoding control sections
   is defined by the ASN.1 basic notation [X.680] [X.680-1], and
   includes an encoding reference to identify the specific encoding
   rules that are affected by the encoding instruction.

   The encoding reference that identifies the Robust XML Encoding Rules
   is literally RXER.  An RXER encoding instruction applies equally to
   both RXER and CRXER encodings.

   The specific notation for an encoding instruction for a specific set
   of encoding rules is left to the specification of those encoding
   rules.  Consequently, this companion document to the RXER
   specification [RXER] defines the notation for RXER encoding
   instructions.  Specifically, it elaborates the EncodingInstruction
   and EncodingInstructionAssignmentList placeholder productions of the
   ASN.1 basic notation.

   In the context of the RXER encoding reference, the
   EncodingInstruction production is defined as follows, using the
   conventions of the ASN.1 basic notation:

      EncodingInstruction ::=
          AttributeInstruction |
          AttributeRefInstruction |
          ComponentRefInstruction |
          ElementRefInstruction |
          GroupInstruction |
          InsertionsInstruction |
          ListInstruction |
          NameInstruction |
          RefAsElementInstruction |
          RefAsTypeInstruction |
          SimpleContentInstruction |
          TypeAsVersionInstruction |
          TypeRefInstruction |
          UnionInstruction |
          ValuesInstruction |
          VersionIndicatorInstruction

   In the context of the RXER encoding reference, the
   EncodingInstructionAssignmentList production (which only appears in
   an encoding control section) is defined as follows:

      EncodingInstructionAssignmentList ::=
          SchemaIdentityInstruction ?
          TargetNamespaceInstruction ?
          TopLevelComponents ?

      TopLevelComponents ::= TopLevelComponent TopLevelComponents ?

      TopLevelComponent ::= "COMPONENT" NamedType

   Definition (top-level NamedType):  A NamedType is a top-level
   NamedType (equivalently, a top-level component) if and only if it is
   the NamedType in a TopLevelComponent.  A NamedType nested within the
   Type of the NamedType of a TopLevelComponent is not itself a
   top-level NamedType.

      Aside: Specification writers should note that non-trivial types
      defined within a top-level NamedType will not be visible to ASN.1
      tools that do not understand RXER.

   Although a top-level NamedType only appears in an RXER encoding
   control section, the default encoding reference for the module
   [X.680-1] still applies when parsing a top-level NamedType.

   Each top-level NamedType within a module SHALL have a distinct
   identifier.

   The NamedType production is defined by the ASN.1 basic notation.
   The other productions are described in subsequent sections and make
   use of the following productions:

      NCNameValue ::= Value

      AnyURIValue ::= Value

      QNameValue ::= Value

      NameValue ::= Value

   The Value production is defined by the ASN.1 basic notation.

   The governing type for the Value in an NCNameValue is the NCName type
   from the AdditionalBasicDefinitions module [RXER].

   The governing type for the Value in an AnyURIValue is the AnyURI type
   from the AdditionalBasicDefinitions module.

   The governing type for the Value in a QNameValue is the QName type
   from the AdditionalBasicDefinitions module.

   The governing type for the Value in a NameValue is the Name type from
   the AdditionalBasicDefinitions module.

   The Value in an NCNameValue, AnyURIValue, QNameValue or NameValue
   SHALL NOT be a DummyReference [X.683] and SHALL NOT textually contain
   a nested DummyReference.

      Aside: Thus encoding instructions are not permitted to be
      parameterized in any way.  This restriction will become important
      if a future specification for ASN.X explicitly represents
      parameterized definitions and parameterized references instead of
      expanding out parameterized references as in the current
      specification.  A parameterized definition could not be directly
      translated into ASN.X if it contained encoding instructions that
      were not fully specified.

5.  Component Encoding Instructions

   Certain of the RXER encoding instructions are categorized as
   component encoding instructions.
   The component encoding instructions are the ATTRIBUTE,
   ATTRIBUTE-REF, COMPONENT-REF, GROUP, ELEMENT-REF, NAME,
   REF-AS-ELEMENT, SIMPLE-CONTENT, TYPE-AS-VERSION and
   VERSION-INDICATOR encoding instructions (whose notations are
   described respectively by AttributeInstruction,
   AttributeRefInstruction, ComponentRefInstruction, GroupInstruction,
   ElementRefInstruction, NameInstruction, RefAsElementInstruction,
   SimpleContentInstruction, TypeAsVersionInstruction and
   VersionIndicatorInstruction).

   The Type in the EncodingPrefixedType for a component encoding
   instruction SHALL be either:

   (1) the Type in a NamedType, or

   (2) the Type in an EncodingPrefixedType in a PrefixedType in a
       BuiltinType in a Type that is one of (1) to (4), or

   (3) the Type in a TaggedType in a PrefixedType in a BuiltinType in a
       Type that is one of (1) to (4), or

   (4) the Type in a ConstrainedType (excluding a TypeWithConstraint) in
       a Type that is one of (1) to (4).

      Aside: The effect of this condition is to force the component
      encoding instructions to be textually within the NamedType to
      which they apply.  Only case (2) can be true on the first
      iteration, as the Type belongs to an EncodingPrefixedType;
      however, any of (1) to (4) can be true on subsequent iterations.

   Case (4) is not permitted when the encoding instruction is the
   ATTRIBUTE-REF, COMPONENT-REF, ELEMENT-REF or REF-AS-ELEMENT encoding
   instruction.

   The NamedType in case (1) is said to be "subject to" the component
   encoding instruction.

   A top-level NamedType SHALL NOT be subject to an ATTRIBUTE-REF,
   COMPONENT-REF, GROUP, ELEMENT-REF, REF-AS-ELEMENT or SIMPLE-CONTENT
   encoding instruction.

      Aside: This condition does not preclude these encoding
      instructions being used on a nested NamedType.

   A NamedType SHALL NOT be subject to two or more component encoding
   instructions of the same kind, e.g., a NamedType is not permitted to
   be subject to two NAME encoding instructions.

   The ATTRIBUTE, ATTRIBUTE-REF, COMPONENT-REF, GROUP, ELEMENT-REF,
   REF-AS-ELEMENT, SIMPLE-CONTENT and TYPE-AS-VERSION encoding
   instructions are mutually exclusive.  The NAME, ATTRIBUTE-REF,
   COMPONENT-REF, ELEMENT-REF and REF-AS-ELEMENT encoding instructions
   are mutually exclusive.  A NamedType SHALL NOT be subject to two or
   more encoding instructions that are mutually exclusive.

   A SelectionType [X.680] SHALL NOT be used to select the Type from a
   NamedType that is subject to an ATTRIBUTE-REF, COMPONENT-REF,
   ELEMENT-REF or REF-AS-ELEMENT encoding instruction.  The other
   component encoding instructions are not inherited by the type denoted
   by a SelectionType.

   Definition (attribute component):  An attribute component is a
   NamedType that is subject to an ATTRIBUTE or ATTRIBUTE-REF encoding
   instruction, or subject to a COMPONENT-REF encoding instruction that
   references a top-level NamedType that is subject to an ATTRIBUTE
   encoding instruction.

   Definition (element component):  An element component is a NamedType
   that is not subject to an ATTRIBUTE, ATTRIBUTE-REF, GROUP or
   SIMPLE-CONTENT encoding instruction, and not subject to a
   COMPONENT-REF encoding instruction that references a top-level
   NamedType that is subject to an ATTRIBUTE encoding instruction.

      Aside: A NamedType subject to a GROUP or SIMPLE-CONTENT encoding
      instruction is neither an attribute component nor an element
      component.

6.  Reference Encoding Instructions

   Certain of the RXER encoding instructions are categorized as
   reference encoding instructions.
   The reference encoding instructions are the ATTRIBUTE-REF,
   COMPONENT-REF, ELEMENT-REF, REF-AS-ELEMENT, REF-AS-TYPE and TYPE-REF
   encoding instructions (whose notations are described respectively by
   AttributeRefInstruction, ComponentRefInstruction,
   ElementRefInstruction, RefAsElementInstruction, RefAsTypeInstruction
   and TypeRefInstruction).  These encoding instructions (except
   COMPONENT-REF) allow an ASN.1 specification to incorporate the
   definitions of types, elements and attributes in specifications
   written in other XML schema languages, through implied constraints on
   the markup that may appear in values of the Markup ASN.1 type from
   the AdditionalBasicDefinitions module [RXER] (for ELEMENT-REF,
   REF-AS-ELEMENT, REF-AS-TYPE and TYPE-REF) or the UTF8String type (for
   ATTRIBUTE-REF).  References to XML Schema [XSD1] types, elements and
   attributes, RELAX NG [RNG] named patterns and elements, and XML
   document type definition (DTD) [XML10] element types are supported.
   References to ASN.1 types and top-level components are also
   permitted.  The COMPONENT-REF encoding instruction provides a more
   direct method of referencing a top-level component.

   The Type in the EncodingPrefixedType for an ELEMENT-REF,
   REF-AS-ELEMENT, REF-AS-TYPE or TYPE-REF encoding instruction SHALL be
   either:

   (1) a ReferencedType that is a DefinedType that is a typereference
       (not a DummyReference) or ExternalTypeReference that references
       the Markup ASN.1 type from the AdditionalBasicDefinitions module
       [RXER], or

   (2) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (3), or

   (3) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (3) and the EncodingPrefix in the
       EncodingPrefixedType does not contain a reference encoding
       instruction.
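
   The recursive conditions (1) to (3) above can be read as a small
   check over the nesting of type notations.  The following Python
   sketch is illustrative only and is not part of the specification: the
   class names (TypeRef, Tagged, EncodingPrefixed) and the function
   is_valid_markup_target are hypothetical, each EncodingPrefix is
   assumed to be reduced to the keyword of the instruction it carries,
   and only this one rule is checked (not the other restrictions in this
   document).

```python
from dataclasses import dataclass
from typing import Union

# Keywords of the reference encoding instructions listed in this section.
REFERENCE_INSTRUCTIONS = frozenset({
    "ATTRIBUTE-REF", "COMPONENT-REF", "ELEMENT-REF",
    "REF-AS-ELEMENT", "REF-AS-TYPE", "TYPE-REF",
})


@dataclass(frozen=True)
class TypeRef:
    """Hypothetical model of a typereference or ExternalTypeReference."""
    referenced_type: str      # name of the type ultimately referenced
    is_dummy: bool = False    # True if the notation is a DummyReference


@dataclass(frozen=True)
class Tagged:
    """Hypothetical model of a TaggedType wrapping another Type."""
    inner: "AsnType"


@dataclass(frozen=True)
class EncodingPrefixed:
    """Hypothetical model of an EncodingPrefixedType."""
    instruction: str          # keyword of the instruction in the prefix
    inner: "AsnType"


AsnType = Union[TypeRef, Tagged, EncodingPrefixed]


def is_valid_markup_target(t: AsnType) -> bool:
    """Apply cases (1) to (3) to the Type that follows the instruction."""
    if isinstance(t, TypeRef):
        # Case (1): a non-dummy reference to the Markup type.
        return t.referenced_type == "Markup" and not t.is_dummy
    if isinstance(t, Tagged):
        # Case (2): look through the tag.
        return is_valid_markup_target(t.inner)
    if isinstance(t, EncodingPrefixed):
        # Case (3): look through the prefix, unless the prefix itself
        # carries a reference encoding instruction.
        return (t.instruction not in REFERENCE_INSTRUCTIONS
                and is_valid_markup_target(t.inner))
    return False
```

   For instance, under these assumptions a tagged Markup reference is
   accepted, whereas a Markup reference already prefixed by TYPE-REF is
   rejected.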

      Aside: Case (3) and similar cases for the ATTRIBUTE-REF and
      COMPONENT-REF encoding instructions have the effect of making the
      reference encoding instructions mutually exclusive as well as
      singly occurring.

   With respect to the REF-AS-TYPE and TYPE-REF encoding instructions,
   the DefinedType in case (1) is said to be "subject to" the encoding
   instruction.

   The restrictions on the Type in the EncodingPrefixedType for an
   ATTRIBUTE-REF encoding instruction are specified in Section 9.  The
   restrictions on the Type in the EncodingPrefixedType for a
   COMPONENT-REF encoding instruction are specified in Section 10.

   The reference encoding instructions make use of a common production
   defined as follows:

      RefParameters ::= ContextParameter ?

      ContextParameter ::= "CONTEXT" AnyURIValue

   A RefParameters instance provides extra information about a reference
   to a definition.  A ContextParameter is used when a reference is
   ambiguous, i.e., refers to definitions in more than one schema
   document or external DTD subset.  This situation would occur, for
   example, when importing types with the same name from independently
   developed XML Schemas defined without a target namespace [XSD1].
   When used in conjunction with a reference to an element type in an
   external DTD subset, the AnyURIValue in the ContextParameter is the
   system identifier (a Uniform Resource Identifier or URI [URI]) of the
   external DTD subset, otherwise the AnyURIValue is a URI that
   indicates the intended schema document, either an XML Schema
   specification, a RELAX NG specification or an ASN.1 or ASN.X
   specification.

7.  Expanded Names of Components

   Each NamedType has an associated expanded name [XMLNS10], determined
   as follows:

   (1) if the NamedType is subject to a NAME encoding instruction, then
       the local name of the expanded name is the character string
       specified by the NCNameValue of the NAME encoding instruction,

   (2) otherwise, if the NamedType is subject to a COMPONENT-REF
       encoding instruction, then the expanded name is the same as the
       expanded name of the referenced top-level NamedType,

   (3) otherwise, if the NamedType is subject to an ATTRIBUTE-REF or
       ELEMENT-REF encoding instruction, then the namespace name of the
       expanded name is equal to the namespace-name component of the
       QNameValue of the encoding instruction and the local name is
       equal to the local-name component of the QNameValue,

   (4) otherwise, if the NamedType is subject to a REF-AS-ELEMENT
       encoding instruction, then the local name of the expanded name is
       the LocalPart [XMLNS10] of the qualified name specified by the
       NameValue of the encoding instruction,

   (5) otherwise, the local name of the expanded name is the identifier
       of the NamedType.

   In cases (1) and (5), if the NamedType is a top-level NamedType and
   the module containing the NamedType has a TARGET-NAMESPACE encoding
   instruction, then the namespace name of the expanded name is the
   character string specified by the AnyURIValue of the TARGET-NAMESPACE
   encoding instruction, otherwise the namespace name has no value.

      Aside: Thus the TARGET-NAMESPACE encoding instruction applies to a
      top-level NamedType but not to any other NamedType.

   In case (4), if the encoding instruction contains a Namespace, then
   the namespace name of the expanded name is the character string
   specified by the AnyURIValue of the Namespace, otherwise the
   namespace name has no value.
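
   The determination of an expanded name described above can be
   sketched in Python as follows.  This is a minimal, non-normative
   illustration: the NamedType record and its field names are
   hypothetical, an expanded name is modelled as a (namespace name,
   local name) pair with None meaning "no value", and the encoding
   instructions are assumed to have been parsed and resolved already.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (namespace name or None for "no value", local name)
ExpandedName = Tuple[Optional[str], str]


@dataclass
class NamedType:
    """Hypothetical record of a NamedType and the instructions on it."""
    identifier: str
    top_level: bool = False
    # AnyURIValue of the containing module's TARGET-NAMESPACE, if any.
    module_target_namespace: Optional[str] = None
    # NCNameValue of a NAME instruction, if any.
    name: Optional[str] = None
    # Top-level NamedType referenced by COMPONENT-REF, if any.
    component_ref: Optional["NamedType"] = None
    # Resolved QNameValue of ATTRIBUTE-REF or ELEMENT-REF, if any.
    qname_ref: Optional[ExpandedName] = None
    # NameValue (a qualified name) of REF-AS-ELEMENT, if any.
    ref_as_element: Optional[str] = None
    # AnyURIValue of the Namespace in REF-AS-ELEMENT, if any.
    ref_as_element_namespace: Optional[str] = None


def expanded_name(nt: NamedType) -> ExpandedName:
    if nt.name is not None:                       # case (1)
        local = nt.name
    elif nt.component_ref is not None:            # case (2)
        return expanded_name(nt.component_ref)
    elif nt.qname_ref is not None:                # case (3)
        return nt.qname_ref
    elif nt.ref_as_element is not None:           # case (4)
        # LocalPart is whatever follows the optional "prefix:".
        local = nt.ref_as_element.rpartition(":")[2]
        return (nt.ref_as_element_namespace, local)
    else:                                         # case (5)
        local = nt.identifier
    # Cases (1) and (5): only a top-level NamedType takes the module's
    # target namespace; any other NamedType has no namespace name.
    namespace = nt.module_target_namespace if nt.top_level else None
    return (namespace, local)
```

   Under these assumptions, a nested component with no encoding
   instructions yields (None, identifier), while a component subject to
   COMPONENT-REF takes the expanded name of the top-level NamedType it
   references, including that module's target namespace.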

   The expanded names for the attribute components of a CHOICE, SEQUENCE
   or SET type MUST be distinct.  The expanded names for the components
   of a CHOICE, SEQUENCE or SET type that are not attribute components
   MUST be distinct.  These tests are applied after the COMPONENTS OF
   transformation specified in X.680, Clause 24.4 [X.680].

      Aside: Two components of the same CHOICE, SEQUENCE or SET type may
      have the same expanded name if one of them is an attribute
      component and the other is not.  Note that the "not" case includes
      components that are subject to a GROUP or SIMPLE-CONTENT encoding
      instruction.

   The expanded name of a top-level NamedType subject to an ATTRIBUTE
   encoding instruction MUST be distinct from the expanded name of every
   other top-level NamedType subject to an ATTRIBUTE encoding
   instruction in the same module.

   The expanded name of a top-level NamedType not subject to an
   ATTRIBUTE encoding instruction MUST be distinct from the expanded
   name of every other top-level NamedType not subject to an ATTRIBUTE
   encoding instruction in the same module.

      Aside: Two top-level components may have the same expanded name if
      one of them is an attribute component and the other is not.

8.  The ATTRIBUTE Encoding Instruction

   The ATTRIBUTE encoding instruction causes an RXER encoder to encode a
   value of the component to which it is applied as an XML attribute
   instead of as a child element.

   The notation for an ATTRIBUTE encoding instruction is defined as
   follows:

      AttributeInstruction ::= "ATTRIBUTE"

   The base type of the type of a NamedType that is subject to an
   ATTRIBUTE encoding instruction SHALL NOT be:

   (1) a CHOICE, SET or SET OF type, or

   (2) a SEQUENCE type other than the one defining the QName type from
       the AdditionalBasicDefinitions module [RXER] (i.e., QName is
       allowed), or

   (3) a SEQUENCE OF type where the SequenceOfType is not subject to a
       LIST encoding instruction, or

   (4) an open type.

   Example

      PersonalDetails ::= SEQUENCE {
          firstName   [ATTRIBUTE] UTF8String,
          middleName  [ATTRIBUTE] UTF8String,
          surname     [ATTRIBUTE] UTF8String
      }

9.  The ATTRIBUTE-REF Encoding Instruction

   The ATTRIBUTE-REF encoding instruction causes an RXER encoder to
   encode a value of the component to which it is applied as an XML
   attribute instead of as a child element, where the attribute's name
   is a qualified name of the attribute declaration referenced by the
   encoding instruction.  In addition, the ATTRIBUTE-REF encoding
   instruction causes values of the UTF8String type to be restricted to
   conform to the type of the attribute declaration.

   The notation for an ATTRIBUTE-REF encoding instruction is defined as
   follows:

      AttributeRefInstruction ::=
          "ATTRIBUTE-REF" QNameValue RefParameters

   Taken together, the QNameValue and the ContextParameter in the
   RefParameters (if present) MUST reference an XML Schema attribute
   declaration or a top-level NamedType that is subject to an ATTRIBUTE
   encoding instruction.

   The type of a referenced XML Schema attribute declaration SHALL NOT
   be, either directly or by derivation, the XML Schema type QName,
   NOTATION, ENTITY, ENTITIES or anySimpleType.

      Aside: Values of these types require information from the context
      of the attribute for interpretation.
      Because an ATTRIBUTE-REF encoding instruction is restricted to
      prefixing the ASN.1 UTF8String type, there is no mechanism to
      capture such context.

   The type of a referenced top-level NamedType SHALL NOT be, either
   directly or by subtyping, the QName type from the
   AdditionalBasicDefinitions module [RXER].

   The Type in the EncodingPrefixedType for an ATTRIBUTE-REF encoding
   instruction SHALL be either:

   (1) the UTF8String type, or

   (2) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (3), or

   (3) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (3) and the EncodingPrefix in the
       EncodingPrefixedType does not contain a reference encoding
       instruction.

   The identifier of a NamedType subject to an ATTRIBUTE-REF encoding
   instruction does not contribute to the name of attributes in an RXER
   encoding.  For the sake of consistency, the identifier SHOULD, where
   possible, be the same as the local name of the referenced attribute
   declaration.

10.  The COMPONENT-REF Encoding Instruction

   The ASN.1 basic notation does not have a concept of a top-level
   NamedType and therefore does not have a mechanism to reference a
   top-level NamedType.  The COMPONENT-REF encoding instruction provides
   a way to specify that a NamedType within a combining type definition
   is equivalent to a referenced top-level NamedType.

   The notation for a COMPONENT-REF encoding instruction is defined as
   follows:

      ComponentRefInstruction ::= "COMPONENT-REF" ComponentReference

      ComponentReference ::=
          InternalComponentReference |
          ExternalComponentReference

      InternalComponentReference ::= identifier FromModule ?

      FromModule ::= "FROM" GlobalModuleReference

      ExternalComponentReference ::= modulereference "." identifier

   The GlobalModuleReference production is defined by the ASN.1 basic
   notation [X.680].  If the GlobalModuleReference is absent from an
   InternalComponentReference, then the identifier MUST be the
   identifier of a top-level NamedType in the same module.  If the
   GlobalModuleReference is present in an InternalComponentReference,
   then the identifier MUST be the identifier of a top-level NamedType
   in the referenced module.

   The modulereference in an ExternalComponentReference is used in the
   same way as a modulereference in an ExternalTypeReference.  The
   identifier in an ExternalComponentReference MUST be the identifier of
   a top-level NamedType in the referenced module.

   The Type in the EncodingPrefixedType for a COMPONENT-REF encoding
   instruction SHALL be either:

   (1) a ReferencedType that is a DefinedType that is a typereference
       (not a DummyReference) or an ExternalTypeReference, or

   (2) a BuiltinType or ReferencedType that is one of the productions in
       Table 1 in Section 5 of the specification for RXER [RXER], or

   (3) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (4), or

   (4) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (4) and the EncodingPrefix in the
       EncodingPrefixedType does not contain a reference encoding
       instruction.

   The restrictions on the use of RXER encoding instructions are such
   that no other RXER encoding instruction is permitted within a
   NamedType if the NamedType is subject to a COMPONENT-REF encoding
   instruction.

   The Type in the top-level NamedType referenced by the COMPONENT-REF
   encoding instruction MUST be either:

   (a) if the preceding case (1) is used, a ReferencedType that is a
       DefinedType that is a typereference or ExternalTypeReference that
       references the same type as the DefinedType in case (1), or

   (b) if the preceding case (2) is used, a BuiltinType or
       ReferencedType that is the same as the BuiltinType or
       ReferencedType in case (2), or

   (c) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (a) to (c) and the EncodingPrefix in the
       EncodingPrefixedType contains an RXER encoding instruction.

   In principle, the COMPONENT-REF encoding instruction creates a
   notional NamedType where the expanded name is that of the referenced
   top-level NamedType and the Type in case (1) or (2) is substituted by
   the Type of the referenced top-level NamedType.

   In practice, it is sufficient for non-RXER encoders and decoders to
   use the original NamedType rather than the notional NamedType because
   the Type in case (1) or (2) can only differ from the Type of the
   referenced top-level NamedType by having fewer RXER encoding
   instructions, and RXER encoding instructions are ignored by non-RXER
   encoders and decoders.

   Although any prefixes for the Type in case (1) or (2) would be
   bypassed, it is sufficient for RXER encoders and decoders to use the
   referenced top-level NamedType instead of the notional NamedType
   because these prefixes cannot be RXER encoding instructions (except,
   of course, for the COMPONENT-REF encoding instruction) and can have
   no effect on an RXER encoding.

   Example

      Modules ::= SEQUENCE OF
          module [COMPONENT-REF module
              FROM AbstractSyntaxNotation-X
                  { 1 3 6 1 4 1 21472 1 0 1 }]
              ModuleDefinition

      Note that the "module" top-level NamedType in the
      AbstractSyntaxNotation-X module is defined like so:

          COMPONENT module ModuleDefinition

      The ASN.X translation of the SEQUENCE OF type definition provides
      a more natural representation:

          [ASN.X markup not preserved in this copy]

      Aside: The <namedType> element in ASN.X corresponds to a
      TypeAssignment, not a NamedType.

   The identifier of a NamedType subject to a COMPONENT-REF encoding
   instruction does not contribute to an RXER encoding.  For the sake of
   consistency with other encoding rules, the identifier SHOULD be the
   same as the identifier in the ComponentRefInstruction.

11.  The ELEMENT-REF Encoding Instruction

   The ELEMENT-REF encoding instruction causes an RXER encoder to encode
   a value of the component to which it is applied as an element where
   the element's name is a qualified name of the element declaration
   referenced by the encoding instruction.  In addition, the ELEMENT-REF
   encoding instruction causes values of the Markup ASN.1 type to be
   restricted to conform to the type of the element declaration.

   The notation for an ELEMENT-REF encoding instruction is defined as
   follows:

      ElementRefInstruction ::= "ELEMENT-REF" QNameValue RefParameters

   Taken together, the QNameValue and the ContextParameter in the
   RefParameters (if present) MUST reference an XML Schema element
   declaration, a RELAX NG element definition, or a top-level NamedType
   that is not subject to an ATTRIBUTE encoding instruction.

   A referenced XML Schema element declaration MUST NOT have a type that
   requires the presence of values for the XML Schema ENTITY or ENTITIES
   types.

      Aside: Entity declarations are not supported by CRXER.

   Example

      AnySchema ::= CHOICE {
          module   [ELEMENT-REF {
                       namespace-name
                           "urn:ietf:params:xml:ns:asnx",
                       local-name "module" }]
                   Markup,
          schema   [ELEMENT-REF {
                       namespace-name
                           "http://www.w3.org/2001/XMLSchema",
                       local-name "schema" }]
                   Markup,
          grammar  [ELEMENT-REF {
                       namespace-name
                           "http://relaxng.org/ns/structure/1.0",
                       local-name "grammar" }]
                   Markup
      }

      The ASN.X translation of the choice type definition provides a
      more natural representation:

          [ASN.X markup not preserved in this copy]

   The identifier of a NamedType subject to an ELEMENT-REF encoding
   instruction does not contribute to the name of an element in an RXER
   encoding.  For the sake of consistency, the identifier SHOULD, where
   possible, be the same as the local name of the referenced element
   declaration.

12.  The LIST Encoding Instruction

   The LIST encoding instruction causes an RXER encoder to encode a
   value of a SEQUENCE OF type as a white space separated list of the
   component values.

   The notation for a LIST encoding instruction is defined as follows:

      ListInstruction ::= "LIST"

   The Type in an EncodingPrefixedType for a LIST encoding instruction
   SHALL be either:

   (1) a BuiltinType that is a SequenceOfType of the
       "SEQUENCE OF NamedType" form, or

   (2) a ConstrainedType that is a TypeWithConstraint of the
       "SEQUENCE Constraint OF NamedType" form or
       "SEQUENCE SizeConstraint OF NamedType" form, or

   (3) a ConstrainedType, other than a TypeWithConstraint, where the
       Type in the ConstrainedType is one of (1) to (5), or

   (4) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (5), or

   (5) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (5).

   The effect of this condition is to force the LIST encoding
   instruction to be textually co-located with the SequenceOfType or
   TypeWithConstraint to which it applies.

      Aside: This makes it clear to a reader that the encoding
      instruction applies to every use of the type no matter how it
      might be referenced.

   The SequenceOfType in case (1) and the TypeWithConstraint in case (2)
   are said to be "subject to" the LIST encoding instruction.

   A SequenceOfType or TypeWithConstraint SHALL NOT be subject to more
   than one LIST encoding instruction.

   The base type of the component type of a SequenceOfType or
   TypeWithConstraint that is subject to a LIST encoding instruction
   MUST be one of the following:

   (1) the BOOLEAN, INTEGER, ENUMERATED, REAL, OBJECT IDENTIFIER,
       RELATIVE-OID, GeneralizedTime or UTCTime type, or

   (2) the NCName, AnyURI, Name or QName type from the
       AdditionalBasicDefinitions module [RXER].

      Aside: While it would be feasible to allow the component type to
      also be any character string type that is constrained such that
      all its abstract values have a length greater than zero and none
      of its abstract values contain any white space characters, testing
      whether this condition is satisfied can be quite involved.  For
      the sake of simplicity, only certain immediately useful
      constrained UTF8String types, which are known to be suitable, are
      permitted (i.e., NCName, AnyURI and Name).

   The NamedType in a SequenceOfType or TypeWithConstraint that is
   subject to a LIST encoding instruction MUST NOT be subject to an
   ATTRIBUTE, ATTRIBUTE-REF, COMPONENT-REF, GROUP, ELEMENT-REF,
   REF-AS-ELEMENT, SIMPLE-CONTENT or TYPE-AS-VERSION encoding
   instruction.

   Example

      UpdateTimes ::= [LIST] SEQUENCE OF updateTime GeneralizedTime

13.  The NAME Encoding Instruction

   The NAME encoding instruction causes an RXER encoder to use a
   nominated character string instead of a component's identifier
   wherever that identifier would otherwise appear in the encoding
   (e.g., as an element or attribute name).

   The notation for a NAME encoding instruction is defined as follows:

      NameInstruction ::= "NAME" "AS"? NCNameValue

   Example

      CHOICE {
          foo-att   [ATTRIBUTE] [NAME AS "Foo"] INTEGER,
          foo-elem  [NAME "Foo"] INTEGER
      }

14.  The REF-AS-ELEMENT Encoding Instruction

   The REF-AS-ELEMENT encoding instruction causes an RXER encoder to
   encode a value of the component to which it is applied as an element
   where the element's name is the name of the external DTD subset
   element type declaration referenced by the encoding instruction.  In
   addition, the REF-AS-ELEMENT encoding instruction causes values of
   the Markup ASN.1 type to be restricted to conform to the content and
   attributes permitted by that element type declaration and its
   associated attribute-list declarations.

   The notation for a REF-AS-ELEMENT encoding instruction is defined as
   follows:

      RefAsElementInstruction ::=
          "REF-AS-ELEMENT" NameValue Namespace ? RefParameters

      Namespace ::= "NAMESPACE" AnyURIValue

   Taken together, the NameValue and the ContextParameter in the
   RefParameters (if present) MUST reference an element type declaration
   in an external DTD subset that is conformant with Namespaces in XML
   1.0 [XMLNS10].

   The Namespace is present if and only if the Name of the referenced
   element type declaration conforms to a PrefixedName (a QName)
   [XMLNS10], in which case the Namespace specifies the namespace name
   to be associated with the Prefix of the PrefixedName.

   The referenced element type declaration MUST NOT require the presence
   of attributes of type ENTITY or ENTITIES.

      Aside: Entity declarations are not supported by CRXER.

   Example

      Suppose that the following external DTD subset has been defined
      with a system identifier of "http://www.example.com/inventory":

          [DTD declarations not preserved in this copy]

      The product element type declaration can be referenced as an
      element in an ASN.1 type definition:

          CHOICE {
              product  [REF-AS-ELEMENT "product"
                           CONTEXT "http://www.example.com/inventory"]
                       Markup
          }

      Here is the ASN.X translation of this ASN.1 type definition:

          [ASN.X markup not preserved in this copy]

   The identifier of a NamedType subject to a REF-AS-ELEMENT encoding
   instruction does not contribute to the name of an element in an RXER
   encoding.  For the sake of consistency, the identifier SHOULD, where
   possible, be the same as the Name of the referenced element type
   declaration (or the LocalPart if the Name conforms to a
   PrefixedName).

15.  The REF-AS-TYPE Encoding Instruction

   The REF-AS-TYPE encoding instruction causes values of the Markup
   ASN.1 type to be restricted to conform to the content and attributes
   permitted by a nominated element type declaration and its associated
   attribute-list declarations in an external DTD subset.

   The notation for a REF-AS-TYPE encoding instruction is defined as
   follows:

      RefAsTypeInstruction ::= "REF-AS-TYPE" NameValue RefParameters

   Taken together, the NameValue and the ContextParameter of the
   RefParameters (if present) MUST reference an element type declaration
   in an external DTD subset that is conformant with Namespaces in XML
   1.0 [XMLNS10].

   The referenced element type declaration MUST NOT require the presence
   of attributes of type ENTITY or ENTITIES.

      Aside: Entity declarations are not supported by CRXER.

   Example

      The product element type declaration can be referenced as a type
      in an ASN.1 definition:

          SEQUENCE OF
              inventoryItem
                  [REF-AS-TYPE "product"
                      CONTEXT "http://www.example.com/inventory"]
                  Markup

      Here is the ASN.X translation of this definition:

          [ASN.X markup not preserved in this copy]

      Note that when an element type declaration is referenced as a
      type, the Name of the element type declaration does not contribute
      to RXER encodings.  For example, child elements in the RXER
      encoding of values of the above SEQUENCE OF type would resemble
      the following:

          [XML markup not preserved in this copy]

16.  The SCHEMA-IDENTITY Encoding Instruction

   The SCHEMA-IDENTITY encoding instruction associates a unique
   identifier, a URI [URI], with the ASN.1 module containing the
   encoding instruction.  This encoding instruction has no effect on an
   RXER encoder but does have an effect on the translation of an ASN.1
   specification into an ASN.X representation.

   The notation for a SCHEMA-IDENTITY encoding instruction is defined as
   follows:

      SchemaIdentityInstruction ::= "SCHEMA-IDENTITY" AnyURIValue

   The character string specified by the AnyURIValue of each
   SCHEMA-IDENTITY encoding instruction MUST be distinct.  In
   particular, successive versions of an ASN.1 module must each have a
   different schema identity URI value.

17.  The SIMPLE-CONTENT Encoding Instruction

   The SIMPLE-CONTENT encoding instruction causes an RXER encoder to
   encode a value of a component of a SEQUENCE or SET type without
   encapsulation in a child element.

   The notation for a SIMPLE-CONTENT encoding instruction is defined as
   follows:

      SimpleContentInstruction ::= "SIMPLE-CONTENT"

   A NamedType subject to a SIMPLE-CONTENT encoding instruction SHALL be
   in a ComponentType in a ComponentTypeList in a RootComponentTypeList.
   At most one such NamedType of a SEQUENCE or SET type is permitted to
   be subject to a SIMPLE-CONTENT encoding instruction.  If any
   component is subject to a SIMPLE-CONTENT encoding instruction, then
   all other components in the same SEQUENCE or SET type definition MUST
   be attribute components.  These tests are applied after the
   COMPONENTS OF transformation specified in X.680, Clause 24.4 [X.680].

      Aside: Child elements and simple content are mutually exclusive.
      Specification writers should note that use of the SIMPLE-CONTENT
      encoding instruction on a component of an extensible SEQUENCE or
      SET type means that all future extensions to the SEQUENCE or SET
      type are restricted to being attribute components with the limited
      set of types that are permitted for attribute components.  Using
      an ATTRIBUTE encoding instruction instead of a SIMPLE-CONTENT
      encoding instruction avoids this limitation.

   The base type of the type of a NamedType that is subject to a
   SIMPLE-CONTENT encoding instruction SHALL NOT be:

   (1) a SET or SET OF type, or

   (2) a CHOICE type where the ChoiceType is not subject to a UNION
       encoding instruction, or

   (3) a SEQUENCE type other than the one defining the QName type from
       the AdditionalBasicDefinitions module [RXER] (i.e., QName is
       allowed), or

   (4) a SEQUENCE OF type where the SequenceOfType is not subject to a
       LIST encoding instruction, or

   (5) an open type.

   If the type of a NamedType subject to a SIMPLE-CONTENT encoding
   instruction has abstract values with an empty character data
   translation [RXER] (i.e., an empty encoding), then the NamedType
   SHALL NOT be marked OPTIONAL or DEFAULT.

   Example

      SEQUENCE {
          units   [ATTRIBUTE] UTF8String,
          amount  [SIMPLE-CONTENT] INTEGER
      }

18.  The TARGET-NAMESPACE Encoding Instruction

   The TARGET-NAMESPACE encoding instruction associates an XML namespace
   name [XMLNS10], a URI [URI], with the type, object class, value,
   object and object set references defined in the ASN.1 module
   containing the encoding instruction.  In addition, it associates the
   namespace name with each top-level NamedType in the RXER encoding
   control section.

   The notation for a TARGET-NAMESPACE encoding instruction is defined
   as follows:

      TargetNamespaceInstruction ::=
          "TARGET-NAMESPACE" AnyURIValue Prefix ?

      Prefix ::= "PREFIX" NCNameValue

   The AnyURIValue SHALL NOT specify an empty string.

   Definition (target namespace):  If an ASN.1 module contains a
   TARGET-NAMESPACE encoding instruction, then the target namespace of
   the module is the character string specified by the AnyURIValue of
   the TARGET-NAMESPACE encoding instruction, otherwise the target
   namespace of the module is said to be absent.

   Two or more ASN.1 modules MAY have the same non-absent target
   namespace if and only if the expanded names of the top-level
   attribute components are distinct across all those modules, the
   expanded names of the top-level element components are distinct
   across all those modules and the defined type, object class, value,
   object and object set references are distinct in their category
   across all those modules.

   The Prefix, if present, suggests an NCName to use as the namespace
   prefix in namespace declarations involving the target namespace.  An
   RXER encoder is not obligated to use the nominated namespace prefix.

   If there are no top-level components, then the RXER encodings
   produced using a module with a TARGET-NAMESPACE encoding instruction
   are backward compatible with the RXER encodings produced by the same
   module without the TARGET-NAMESPACE encoding instruction.

19.  The TYPE-AS-VERSION Encoding Instruction

   The TYPE-AS-VERSION encoding instruction causes an RXER encoder to
   include an xsi:type attribute in the encoding of a value of the
   component to which the encoding instruction is applied.  This
   attribute allows an XML Schema [XSD1] validator to select, if
   available, the appropriate XML Schema translation for the version of
   the ASN.1 specification used to create the encoding.

      Aside: Translations of an ASN.1 specification into a compatible
      XML Schema are expected to be slightly different across versions
      because of progressive extensions to the ASN.1 specification.  Any
      incompatibilities between these translations can be accommodated
      if each version uses a different target namespace.  The target
      namespace will be evident in the value of the xsi:type attribute
      and will cause an XML Schema validator to use the appropriate
      version.  This mechanism also accommodates an ASN.1 type that is
      renamed in a later version of the ASN.1 specification.

   The notation for a TYPE-AS-VERSION encoding instruction is defined as
   follows:

      TypeAsVersionInstruction ::= "TYPE-AS-VERSION"

   The Type in a NamedType that is subject to a TYPE-AS-VERSION encoding
   instruction MUST be a namespace-qualified reference [RXER].

   The addition of a TYPE-AS-VERSION encoding instruction does not
   affect the backward compatibility of RXER encodings.

      Aside: In a translation of an ASN.1 specification into XML Schema,
      any Type in a NamedType that is subject to a TYPE-AS-VERSION
      encoding instruction is expected to be translated into the
      XML Schema anyType so that the xsi:type attribute acts as a switch
      to select the appropriate version.

20.  The TYPE-REF Encoding Instruction

   The TYPE-REF encoding instruction causes values of the Markup ASN.1
   type to be restricted to conform to a specific XML Schema named type,
   RELAX NG named pattern or an ASN.1 defined type.

      Aside: Referencing an ASN.1 type in a TYPE-REF encoding
      instruction does not have the effect of imposing a requirement to
      preserve the Infoset [INFOSET] representation of the RXER encoding
      of an abstract value of the type.  It is still sufficient to
      preserve just the abstract value.

   The notation for a TYPE-REF encoding instruction is defined as
   follows:

      TypeRefInstruction ::= "TYPE-REF" QNameValue RefParameters

   Taken together, the QNameValue and the ContextParameter of the
   RefParameters (if present) MUST reference an XML Schema named type, a
   RELAX NG named pattern, or an ASN.1 defined type.

   A referenced XML Schema type MUST NOT require the presence of values
   for the XML Schema ENTITY or ENTITIES types.

      Aside: Entity declarations are not supported by CRXER.

   The QNameValue SHALL NOT be a direct reference to the XML Schema
   NOTATION type [XSD2] (i.e., the namespace name
   "http://www.w3.org/2001/XMLSchema" and local name "NOTATION"),
   however a reference to an XML Schema type derived from the NOTATION
   type is permitted.

      Aside: This restriction is to ensure that the lexical space [XSD2]
      of the referenced type is actually populated with the names of
      notations [XSD1].

   Example

      MyDecimal ::=
          [TYPE-REF {
              namespace-name "http://www.w3.org/2001/XMLSchema",
              local-name "decimal" }]
          Markup

      Note that the ASN.X translation of this ASN.1 type definition
      provides a more natural way to reference the XML Schema decimal
      type:

          [ASN.X markup not preserved in this copy]

21.  The UNION Encoding Instruction

   The UNION encoding instruction causes an RXER encoder to encode the
   value of an alternative of a CHOICE type without encapsulation in a
   child element.  The chosen alternative is optionally indicated with a
   member attribute.  The optional PrecedenceList also allows a
   specification writer to alter the order in which an RXER decoder will
   consider the alternatives of the CHOICE as it determines which
   alternative has been used (if the actual alternative has not been
   specified through the member attribute).

   The notation for a UNION encoding instruction is defined as follows:

      UnionInstruction ::= "UNION" AlternativesPrecedence ?

      AlternativesPrecedence ::= "PRECEDENCE" PrecedenceList

      PrecedenceList ::= identifier PrecedenceList ?

   The Type in the EncodingPrefixedType for a UNION encoding instruction
   SHALL be either:

   (1) a BuiltinType that is a ChoiceType, or

   (2) a ConstrainedType, other than a TypeWithConstraint, where the
       Type in the ConstrainedType is one of (1) to (4), or

   (3) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (4), or

   (4) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (4).

   The ChoiceType in case (1) is said to be "subject to" the UNION
   encoding instruction.

   The base type of the type of each alternative of a ChoiceType that is
   subject to a UNION encoding instruction SHALL NOT be:

   (1) a CHOICE, SET or SET OF type, or

   (2) a SEQUENCE type other than the one defining the QName type from
       the AdditionalBasicDefinitions module [RXER] (i.e., QName is
       allowed), or

   (3) a SEQUENCE OF type where the SequenceOfType is not subject to a
       LIST encoding instruction, or

   (4) an open type.
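
   As a rough illustration of the role of the PrecedenceList described
   at the start of this section, the following Python sketch computes
   an order in which a decoder might try the alternatives of a CHOICE.
   It assumes that identifiers in the PrecedenceList are tried first,
   in list order, followed by the remaining alternatives in definition
   order; that ordering rule and the function name
   consideration_order are assumptions made for illustration only.

```python
def consideration_order(alternatives, precedence=()):
    """Return an order in which a decoder might try CHOICE alternatives.

    Assumption (not stated in this section): identifiers in the
    PrecedenceList are tried first, in list order, followed by the
    remaining alternatives in their order of definition.
    """
    listed = list(precedence)
    # An identifier may appear at most once in a PrecedenceList.
    if len(set(listed)) != len(listed):
        raise ValueError("identifier repeated in PrecedenceList")
    # Every listed identifier must name an alternative of the CHOICE.
    unknown = [name for name in listed if name not in alternatives]
    if unknown:
        raise ValueError(f"not an alternative of the CHOICE: {unknown}")
    remaining = [name for name in alternatives if name not in listed]
    return listed + remaining


# A CHOICE with alternatives extendedName and basicName, prefixed by
# [UNION PRECEDENCE basicName].
print(consideration_order(["extendedName", "basicName"], ["basicName"]))
```

   Under the stated assumption, a decoder would test whether the
   content is a valid basicName before falling back to extendedName.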

   Each identifier in the PrecedenceList MUST be the identifier of a
   NamedType in the ChoiceType.

   A particular identifier SHALL NOT appear more than once in the same
   PrecedenceList.

   Every NamedType in a ChoiceType that is subject to a UNION encoding
   instruction MUST NOT be subject to an ATTRIBUTE, ATTRIBUTE-REF,
   COMPONENT-REF, GROUP, ELEMENT-REF, REF-AS-ELEMENT, SIMPLE-CONTENT or
   TYPE-AS-VERSION encoding instruction.

   Example

      [UNION PRECEDENCE basicName] CHOICE {
          extendedName  UTF8String,
          basicName     PrintableString
      }

22.  The VALUES Encoding Instruction

   The VALUES encoding instruction causes an RXER encoder to use
   nominated names instead of the identifiers that would otherwise
   appear in the encoding of a value of a BIT STRING, ENUMERATED or
   INTEGER type.

   The notation for a VALUES encoding instruction is defined as follows:

      ValuesInstruction ::=
          "VALUES" AllValuesMapped ? ValueMappingList ?

      AllValuesMapped ::= AllCapitalized | AllUppercased

      AllCapitalized ::= "ALL" "CAPITALIZED"

      AllUppercased ::= "ALL" "UPPERCASED"

      ValueMappingList ::= ValueMapping ValueMappingList ?

      ValueMapping ::= "," identifier "AS" NCNameValue

   The Type in the EncodingPrefixedType for a VALUES encoding
   instruction SHALL be either:

   (1) a BuiltinType that is a BitStringType with a NamedBitList, or

   (2) a BuiltinType that is an EnumeratedType, or

   (3) a BuiltinType that is an IntegerType with a NamedNumberList, or

   (4) a ConstrainedType, other than a TypeWithConstraint, where the
       Type in the ConstrainedType is one of (1) to (6), or

   (5) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (6), or

   (6) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (6).

   The effect of this condition is to force the VALUES encoding
   instruction to be textually co-located with the type definition to
   which it applies.

   The BitStringType, EnumeratedType or IntegerType in case (1), (2) or
   (3), respectively, is said to be "subject to" the VALUES encoding
   instruction.

   A BitStringType, EnumeratedType or IntegerType SHALL NOT be subject
   to more than one VALUES encoding instruction.

   Each identifier in a ValueMapping MUST be an identifier appearing in
   the NamedBitList, Enumerations or NamedNumberList, as the case may
   be.

   The identifier in a ValueMapping SHALL NOT be the same as the
   identifier in any other ValueMapping for the same ValueMappingList.

   Definition (replacement name):  Each identifier in a BitStringType,
   EnumeratedType or IntegerType subject to a VALUES encoding
   instruction has a replacement name.  If there is a ValueMapping for
   the identifier, then the replacement name is the character string
   specified by the NCNameValue in the ValueMapping, otherwise if
   AllCapitalized is used, then the replacement name is the identifier
   with the first character uppercased, otherwise if AllUppercased is
   used, then the replacement name is the identifier with all its
   characters uppercased, otherwise, the replacement name is the
   identifier.

   The replacement names for the identifiers in a BitStringType subject
   to a VALUES encoding instruction MUST be distinct.

   The replacement names for the identifiers in an EnumeratedType
   subject to a VALUES encoding instruction MUST be distinct.

   The replacement names for the identifiers in an IntegerType subject
   to a VALUES encoding instruction MUST be distinct.

   Example

      Traffic-Light ::= [VALUES ALL CAPITALIZED, red AS "RED"]
          ENUMERATED {
              red,     -- Replacement name is RED.
              amber,   -- Replacement name is Amber.
              green    -- Replacement name is Green.
          }

23.  Insertion Encoding Instructions

   Certain of the RXER encoding instructions are categorized as
   insertion encoding instructions.  The insertion encoding instructions
   are the NO-INSERTIONS, HOLLOW-INSERTIONS, SINGULAR-INSERTIONS,
   UNIFORM-INSERTIONS and MULTIFORM-INSERTIONS encoding instructions
   (whose notations are described respectively by
   NoInsertionsInstruction, HollowInsertionsInstruction,
   SingularInsertionsInstruction, UniformInsertionsInstruction and
   MultiformInsertionsInstruction).

   The notation for the insertion encoding instructions is defined as
   follows:

      InsertionsInstruction ::=
          NoInsertionsInstruction |
          HollowInsertionsInstruction |
          SingularInsertionsInstruction |
          UniformInsertionsInstruction |
          MultiformInsertionsInstruction

      NoInsertionsInstruction ::= "NO-INSERTIONS"

      HollowInsertionsInstruction ::= "HOLLOW-INSERTIONS"

      SingularInsertionsInstruction ::= "SINGULAR-INSERTIONS"

      UniformInsertionsInstruction ::= "UNIFORM-INSERTIONS"

      MultiformInsertionsInstruction ::= "MULTIFORM-INSERTIONS"

   Using the GROUP encoding instruction on components with extensible
   types can lead to situations where an unknown extension could be
   associated with more than one extension insertion point.  The
   insertion encoding instructions remove this ambiguity by limiting the
   form that extensions can take.  That is, the insertion encoding
   instructions indicate what extensions can be made to an ASN.1
   specification without breaking forward compatibility for RXER
   encodings.

      Aside: Forward compatibility means the ability for a decoder to
      successfully decode an encoding containing extensions introduced
      into a version of the specification that is more recent than the
      one used by the decoder.

   In the most general case, an extension to a CHOICE, SET or SEQUENCE
   type will generate zero or more attributes and zero or more elements,
   due to the potential use of the GROUP and ATTRIBUTE encoding
   instructions by the extension.

   The MULTIFORM-INSERTIONS encoding instruction indicates that the RXER
   encodings produced by forward compatible extensions to a type will
   always consist of one or more elements and zero or more attributes.
   No restriction is placed on the names of the elements.

      Aside: Of necessity, the names of the attributes will all be
      different in any given encoding.

   The UNIFORM-INSERTIONS encoding instruction indicates that the RXER
   encodings produced by forward compatible extensions to a type will
   always consist of one or more elements having the same expanded name,
   and zero or more attributes.  The expanded name shared by the
   elements in one particular encoding is not required to be the same as
   the expanded name shared by the elements in any other encoding of the
   extension.  For example, in one encoding of the extension the
   elements might all be called "foo", while in another encoding of the
   extension they might all be called "bar".

   The SINGULAR-INSERTIONS encoding instruction indicates that the RXER
   encodings produced by forward compatible extensions to a type will
   always consist of a single element and zero or more attributes.  The
   name of the single element is not required to be the same in every
   possible encoding of the extension.

   The HOLLOW-INSERTIONS encoding instruction indicates that the RXER
   encodings produced by forward compatible extensions to a type will
   always consist of zero elements and zero or more attributes.

   The NO-INSERTIONS encoding instruction indicates that no forward
   compatible extensions can be made to a type.

   Examples of forward compatible extensions are provided in Appendix C.
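
   The five shapes described above can be summarized in a short Python
   sketch.  It is illustrative only: the Extension tuple and the
   satisfies function are hypothetical names, and expanded names are
   modeled as plain strings rather than namespace name and local name
   pairs.

```python
from collections import namedtuple

# Hypothetical model of what one extension contributes to an RXER
# encoding: the expanded names of its elements and of its attributes.
Extension = namedtuple("Extension", ["elements", "attributes"])


def satisfies(instruction, ext):
    """Check an extension's encoded shape against an insertion
    encoding instruction, following the descriptions in this section."""
    # Attribute names are necessarily all different in one encoding.
    if len(set(ext.attributes)) != len(ext.attributes):
        return False
    count = len(ext.elements)
    if instruction == "MULTIFORM-INSERTIONS":
        return count >= 1                      # any element names
    if instruction == "UNIFORM-INSERTIONS":
        return count >= 1 and len(set(ext.elements)) == 1
    if instruction == "SINGULAR-INSERTIONS":
        return count == 1
    if instruction == "HOLLOW-INSERTIONS":
        return count == 0                      # attributes only
    if instruction == "NO-INSERTIONS":
        return False                           # nothing is compatible
    raise ValueError(f"unknown insertion instruction: {instruction}")


ext = Extension(elements=["foo", "foo"], attributes=["id"])
print(satisfies("UNIFORM-INSERTIONS", ext))
print(satisfies("SINGULAR-INSERTIONS", ext))
```

   The sketch checks a single encoding of an extension; the
   instructions themselves constrain every encoding that a forward
   compatible extension can produce.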

   The Type in the EncodingPrefixedType for an insertion encoding
   instruction SHALL be either:

   (1) a BuiltinType that is a ChoiceType where the ChoiceType is not
       subject to a UNION encoding instruction, or

   (2) a BuiltinType that is a SequenceType or SetType, or

   (3) a ConstrainedType, other than a TypeWithConstraint, where the
       Type in the ConstrainedType is one of (1) to (5), or

   (4) a BuiltinType that is a PrefixedType that is a TaggedType where
       the Type in the TaggedType is one of (1) to (5), or

   (5) a BuiltinType that is a PrefixedType that is an
       EncodingPrefixedType where the Type in the EncodingPrefixedType
       is one of (1) to (5).

   Case (2) is not permitted when the insertion encoding instruction is
   the SINGULAR-INSERTIONS, UNIFORM-INSERTIONS or MULTIFORM-INSERTIONS
   encoding instruction.

      Aside: Because extensions to a SET or SEQUENCE type are serial and
      effectively optional, the SINGULAR-INSERTIONS, UNIFORM-INSERTIONS
      and MULTIFORM-INSERTIONS encoding instructions offer no advantage
      over unrestricted extensions (for a SET or SEQUENCE).  For
      example, an optional series of singular insertions generates zero
      or more elements and zero or more attributes, just like an
      unrestricted extension.

   The Type in case (1) or case (2) is said to be "subject to" the
   insertion encoding instruction.

   The Type in case (1) or case (2) MUST be extensible, either
   explicitly or by default.

   A Type SHALL NOT be subject to more than one insertion encoding
   instruction.

   The insertion encoding instructions indicate what kinds of extensions
   can be made to a type without breaking forward compatibility, but
   they do not prohibit extensions that do break forward compatibility.
   That is, it is not an error for a type's base type to contain
   extensions that do not satisfy an insertion encoding instruction
   affecting the type.
However, if any such extensions are made, then a 1476 new value SHOULD be introduced into the extensible set of permitted 1477 values for a version indicator attribute, or attributes (see 1478 Section 24), whose scope encompasses the extensions. An example is 1479 provided in Appendix C. 1481 24. The VERSION-INDICATOR Encoding Instruction 1483 The VERSION-INDICATOR encoding instruction provides a mechanism for 1484 RXER decoders to be alerted that an encoding contains extensions that 1485 break forward compatibility (see the preceding section). 1487 The notation for a VERSION-INDICATOR encoding instruction is defined 1488 as follows: 1490 VersionIndicatorInstruction ::= "VERSION-INDICATOR" 1492 A NamedType that is subject to a VERSION-INDICATOR encoding 1493 instruction MUST also be subject to an ATTRIBUTE encoding 1494 instruction. 1496 The type of the NamedType that is subject to the VERSION-INDICATOR 1497 encoding instruction MUST be directly or indirectly a constrained 1498 type where the set of permitted values is defined to be extensible. 1500 Each value represents a different version of the ASN.1 specification. 1501 Ordinarily, an application will set the value of a version indicator 1502 attribute to be the last of these permitted values. An application 1503 MAY set the value of the version indicator attribute to the value 1504 corresponding to an earlier version of the specification if it has 1505 not used any of the extensions added in a subsequent version. 1507 If an RXER decoder encounters a value of the type that is not one of 1508 the root values or extension additions (but still allowed since the 1509 set of permitted values is extensible), then this indicates that the 1510 decoder is using a version of the ASN.1 specification that is not 1511 compatible with the version used to produce the encoding. In such 1512 cases the decoder SHOULD treat the element containing the attribute 1513 as having an unknown ASN.1 type. 
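Aside: The decoder behaviour described above amounts to a membership test on the version indicator value. The following Python sketch is illustrative only. The function name classify_version, the return strings and the sample set of known values are assumptions made for the example, not part of RXER.

```python
def classify_version(value, known_values):
    """known_values holds the root values and extension additions of the
    version indicator's constraint, as known to this decoder (assumed input).

    A value outside that set is still permitted by the extensible
    constraint, but signals an incompatible newer specification."""
    if value in known_values:
        return "compatible"
    # The decoder SHOULD treat the element containing the attribute as
    # having an unknown ASN.1 type.
    return "unknown-type"


# Hypothetical decoder that knows versions 1, 2 and 3.
KNOWN = {1, 2, 3}
result_known = classify_version(2, KNOWN)
result_newer = classify_version(4, KNOWN)
```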
1515 Aside: A version indicator attribute only indicates an 1516 incompatibility with respect to RXER encodings. Other encodings 1517 are not affected because the GROUP encoding instruction does not 1518 apply to them. 1520 Examples 1522 In this first example, the decoder is using an incompatible older 1523 version if the value of the version attribute in a received RXER 1524 encoding is not 1, 2 or 3. 1526 SEQUENCE { 1527 version [ATTRIBUTE] [VERSION-INDICATOR] 1528 INTEGER (1, ..., 2..3), 1529 message MessageType 1530 } 1532 In this second example, the decoder is using an incompatible older 1533 version if the value of the format attribute in a received RXER 1534 encoding is not "1.0", "1.1" or "2.0". 1536 SEQUENCE { 1537 format [ATTRIBUTE] [VERSION-INDICATOR] 1538 UTF8String ("1.0", ..., "1.1" | "2.0"), 1539 message MessageType 1540 } 1542 An extensive example is provided in Appendix C. 1544 It is not necessary for every extensible type to have its own version 1545 indicator attribute. It would be typical for only the types of 1546 top-level element components to include a version indicator 1547 attribute, which would serve as the version indicator for all of the 1548 nested components. 1550 25. The GROUP Encoding Instruction 1552 The GROUP encoding instruction causes an RXER encoder to encode a 1553 value of the component to which it is applied without encapsulation 1554 as an element. It allows the construction of non-trivial content 1555 models for element content. 1557 The notation for a GROUP encoding instruction is defined as follows: 1559 GroupInstruction ::= "GROUP" 1561 The base type of the type of a NamedType that is subject to a GROUP 1562 encoding instruction SHALL be either: 1564 (1) a SEQUENCE, SET or SET OF type, or 1566 (2) a CHOICE type where the ChoiceType is not subject to a UNION 1567 encoding instruction, or 1569 (3) a SEQUENCE OF type where the SequenceOfType is not subject to a 1570 LIST encoding instruction. 
1572 The SEQUENCE type in case (1) SHALL NOT be the associated type for a 1573 built-in type, SHALL NOT be a type from the 1574 AdditionalBasicDefinitions module [RXER] and SHALL NOT contain a 1575 component that is subject to a SIMPLE-CONTENT encoding instruction. 1577 Aside: Thus the CHARACTER STRING, EMBEDDED PDV, EXTERNAL, REAL and 1578 QName types are excluded. 1580 The CHOICE type in case (2) SHALL NOT be a type from the 1581 AdditionalBasicDefinitions module. 1583 Aside: Thus the Markup type is excluded. 1585 Definition (visible component): Ignoring all type constraints, the 1586 visible components for a type that is directly or indirectly a 1587 combining ASN.1 type (i.e., SEQUENCE, SET, CHOICE, SEQUENCE OF or 1588 SET OF) is the set of components of the combining type definition 1589 plus, for each NamedType (of the combining type definition) that is 1590 subject to a GROUP encoding instruction, the visible components for 1591 the type of the NamedType. The visible components are determined 1592 after the COMPONENTS OF transformation specified in X.680, Clause 1593 24.4 [X.680]. 1595 Aside: The set of visible attribute and element components for a 1596 type is the set of all the components of the type, and any nested 1597 types, that describe attributes and child elements appearing in 1598 the RXER encodings of values of the outer type. 1600 A GROUP encoding instruction MUST NOT be used where it would cause a 1601 NamedType to be a visible component of the type of that same 1602 NamedType (which is only possible if the type definition is 1603 recursive). 1605 Aside: Components subject to a GROUP encoding instruction might be 1606 translated into a compatible XML Schema [XSD1] as group 1607 definitions. A NamedType that is visible to its own type is 1608 analogous to a circular group, which XML Schema disallows. 1610 Section 25.1 imposes additional conditions on the use of the GROUP 1611 encoding instruction. 
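Aside: The definition of visible components, and the prohibition on a NamedType being a visible component of its own type, can be sketched as follows. This Python fragment is a simplified illustration, not a normative procedure. It assumes the COMPONENTS OF transformation has already been applied and models each combining type as a list of (identifier, type name, group flag) triples in a dictionary; these structures and the function names are hypothetical.

```python
def visible_components(type_name, types, _seen=None):
    """Return the visible components of a type as (type, identifier) pairs.

    types maps a combining type's name to its components; any type name
    missing from the mapping is treated as having no components."""
    seen = set() if _seen is None else _seen
    result = []
    for identifier, component_type, group in types.get(type_name, []):
        key = (type_name, identifier)
        if key in seen:          # already collected; also stops infinite recursion
            continue
        seen.add(key)
        result.append(key)
        if group:                # GROUP exposes the nested type's components
            result += visible_components(component_type, types, seen)
    return result


def has_circular_group(types):
    """True if some GROUP component is a visible component of its own type."""
    for type_name, components in types.items():
        for identifier, component_type, group in components:
            if group and (type_name, identifier) in visible_components(
                    component_type, types):
                return True
    return False
```

A recursive definition such as a type R with a component "next [GROUP] R" is reported as circular by this sketch, which mirrors the circular group restriction of XML Schema mentioned in the aside above.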
1613 In any use of the GROUP encoding instruction there is a type, the 1614 including type, that contains the component subject to the GROUP 1615 encoding instruction, and a type, the included type, that is the base 1616 type of that component. Either type can have an extensible content 1617 model, either directly using ASN.1 extensibility, or by including 1618 through another GROUP encoding instruction some other type that is 1619 extensible. 1621 The including and included types may be defined in different ASN.1 1622 modules, in which case the owner of the including type, i.e., the 1623 person or organization having the authority to add extensions to the 1624 including type's definition, may be different from the owner of the 1625 included type. 1627 If the owner of the including type is not using the most recent 1628 version of the included type's definition, then the owner of the 1629 including type might add an extension to the including type which is 1630 valid with respect to the older version of the included type, but is 1631 later found to be invalid when the latest versions of the including 1632 and included type definitions are brought together (perhaps by a 1633 third party). Although the owner of the including type must 1634 necessarily be aware of the existence of the included type, the 1635 reverse is not necessarily true. The owner of the included type 1636 could add an extension to the included type without realizing that it 1637 invalidates someone else's including type. 1639 To avoid these problems, a GROUP encoding instruction MUST NOT be 1640 used if: 1642 (1) the included type is defined in a different module from the 1643 including type, and 1645 (2) the included type has an extensible content model, and 1647 (3) changes to the included type are not coordinated with the owner 1648 of the including type. 
1650 Changes in the included type are coordinated with the owner of the 1651 including type if: 1653 (1) the owner of the included type is also the owner of the including 1654 type, or 1656 (2) the owner of the including type is collaborating with the owner 1657 of the included type, or 1659 (3) all changes will be vetted by a common third party before being 1660 approved and published. 1662 25.1. Unambiguous Encodings 1664 Unregulated use of the GROUP encoding instruction can easily lead to 1665 specifications in which distinct abstract values have 1666 indistinguishable RXER encodings, i.e., ambiguous encodings. This 1667 section imposes restrictions on the use of the GROUP encoding 1668 instruction to ensure that distinct abstract values have distinct 1669 RXER encodings. In addition, these restrictions ensure that an 1670 abstract value can be easily decoded in a single pass without 1671 back-tracking. 1673 An RXER decoder for an ASN.1 type can be abstracted as a recognizer 1674 for a notional language, consisting of element and attribute expanded 1675 names, where the type definition describes the grammar for that 1676 language (in fact it is a context-free grammar). The restrictions on 1677 a type definition to ensure easy, unambiguous decoding are more 1678 conveniently, completely and simply expressed as conditions on this 1679 associated grammar. Implementations are not expected to verify type 1680 definitions exactly in the manner to be described, however the 1681 procedure used MUST produce the same result. 1683 Section 25.1.1 describes the procedure for recasting as a grammar a 1684 type definition containing components subject to the GROUP encoding 1685 instruction. Sections 25.1.2 and 25.1.3 specify conditions that the 1686 grammar must satisfy for the type definition to be valid. Appendices 1687 A and B have extensive examples. 1689 25.1.1. Grammar Construction 1691 A grammar consists of a collection of productions. 
A production has 1692 a left hand side and a right hand side (in this document, separated 1693 by the "::=" symbol). The left hand side (in a context-free grammar) 1694 is a single non-terminal symbol. The right hand side is a sequence 1695 of non-terminal and terminal symbols. The terminal symbols are the 1696 lexical items of the language that the grammar describes. One of the 1697 non-terminals is nominated to be the start symbol. A valid sequence 1698 of terminals for the language can be generated from the grammar by 1699 beginning with the start symbol and repeatedly replacing any 1700 non-terminal with the right hand side of one of the productions where 1701 that non-terminal is on the production's left hand side. The final 1702 sequence of terminals is achieved when there are no remaining 1703 non-terminals to replace. 1705 Aside: X.680 describes the ASN.1 basic notation using a 1706 context-free grammar. 1708 Each NamedType has an associated primary and secondary non-terminal. 1710 Aside: The secondary non-terminal for a NamedType is used when the 1711 base type of the type in the NamedType is a SEQUENCE OF type or 1712 SET OF type. 1714 Each ExtensionAddition and ExtensionAdditionAlternative has an 1715 associated non-terminal. There is a non-terminal associated with the 1716 extension insertion point of each extensible type. There is also a 1717 primary start non-terminal (this is the start symbol) and a secondary 1718 start non-terminal. The exact nature of the non-terminals is not 1719 important, however all the non-terminals MUST be mutually distinct. 
1721 It is adequate for most of the examples in this document (though not 1722 in the most general case) for the primary non-terminal for a 1723 NamedType to be the identifier of the NamedType, for the primary 1724 start non-terminal to be S, for the non-terminals for the instances 1725 of ExtensionAddition and ExtensionAdditionAlternative to be E1, E2, 1726 E3 and so on, and for the non-terminals for the extension insertion 1727 points to be I1, I2, I3 and so on. The secondary non-terminals are 1728 labelled by appending a "'" character to the primary non-terminal 1729 label, e.g., the primary and secondary start non-terminals are S and 1730 S', respectively. 1732 Each NamedType and extension insertion point has an associated 1733 terminal. There exists a terminal called the general extension 1734 terminal that is not associated with any specific notation. The 1735 general extension terminal and the terminals for the extension 1736 insertion points are used to represent elements in unknown 1737 extensions. The exact nature of the terminals is not important, 1738 however the aforementioned terminals MUST be mutually distinct. The 1739 terminals are further categorized as either element terminals or 1740 attribute terminals. A terminal for a NamedType is an attribute 1741 terminal if its associated NamedType is an attribute component, 1742 otherwise it is an element terminal. The general extension terminal 1743 and the terminals for the extension insertion points are categorized 1744 as element terminals. 1746 Terminals for attributes in unknown extensions are not explicitly 1747 provided in the grammar. Certain productions in the grammar are 1748 categorized as insertion point productions and their role in 1749 accepting unknown attributes is described in Section 25.1.4. 
1751 In the examples in this document, the terminal for a component other 1752 than an attribute component will be represented as the local name of 1753 the expanded name of the component enclosed in double quotes, and the 1754 terminal for an attribute component will be represented as the local 1755 name of the expanded name of the component prefixed by the '@' 1756 character and enclosed in double quotes. The general extension 1757 terminal will be represented as "*" and the terminals for the 1758 extension insertion points will be represented as "*1", "*2", "*3" 1759 and so on. 1761 The productions generated from a NamedType depend on the base type of 1762 the type of the NamedType. The productions for the start 1763 non-terminals depend on the combining type definition being tested. 1764 In either case, the procedure for generating productions takes a 1765 primary non-terminal, a secondary non-terminal (sometimes) and a type 1766 definition. 1768 The grammar is constructed beginning with the start non-terminals and 1769 the combining type definition being tested. 1771 A grammar is constructed after the COMPONENTS OF transformation 1772 specified in X.680, Clause 24.4 [X.680]. 1774 Given a primary non-terminal, N, and a type where the base type is a 1775 SEQUENCE or SET type, a production is added to the grammar with N as 1776 the left hand side. The right hand side is constructed from an 1777 initial empty state according to the following cases considered in 1778 order: 1780 (1) If an initial RootComponentTypeList is present in the base type, 1781 then the sequence of primary non-terminals for the components 1782 nested in that RootComponentTypeList are appended to the right 1783 hand side in the order of their definition. 1785 (2) If an ExtensionAdditions instance is present in the base type and 1786 not empty, then the non-terminal for the first ExtensionAddition 1787 nested in the ExtensionAdditions instance is appended to the 1788 right hand side. 
1790 (3) If an ExtensionAdditions instance is empty or not present in the 1791 base type and the base type is extensible (explicitly or by 1792 default) and the base type is not subject to a NO-INSERTIONS or 1793 HOLLOW-INSERTIONS encoding instruction, then the non-terminal for 1794 the extension insertion point of the base type is appended to the 1795 right hand side. 1797 (4) If a final RootComponentTypeList is present in the base type, 1798 then the primary non-terminals for the components nested in that 1799 RootComponentTypeList are appended to the right hand side in the 1800 order of their definition. 1802 The production is an insertion point production if an 1803 ExtensionAdditions instance is empty or not present in the base type 1804 and the base type is extensible (explicitly or by default) and the 1805 base type is not subject to a NO-INSERTIONS encoding instruction. 1807 If a component in a ComponentTypeList (in either a 1808 RootComponentTypeList or an ExtensionAdditionGroup) is marked 1809 OPTIONAL or DEFAULT, then a production with the primary non-terminal 1810 of the component as the left hand side and an empty right hand side 1811 is added to the grammar. 1813 If a component (regardless of the ASN.1 combining type containing it) 1814 is subject to a GROUP encoding instruction, then one or more 1815 productions constructed according to the component's type are added 1816 to the grammar. Each of these productions has the primary 1817 non-terminal of the component as the left hand side. 1819 If a component (regardless of the ASN.1 combining type containing it) 1820 is not subject to a GROUP encoding instruction, then a production is 1821 added to the grammar with the primary non-terminal of the component 1822 as the left hand side and the terminal of the component as the right 1823 hand side. 1825 Example 1827 Consider the following ASN.1 type definition: 1829 SEQUENCE { 1830 -- Start of initial RootComponentTypeList. 
1831 one [ATTRIBUTE] UTF8String, 1832 two BOOLEAN OPTIONAL, 1833 three INTEGER 1834 -- End of initial RootComponentTypeList. 1835 } 1837 Here is the grammar derived from this type: 1839 S ::= one two three 1840 one ::= "@one" 1841 two ::= "two" 1842 two ::= 1843 three ::= "three" 1845 For each ExtensionAddition (of a SEQUENCE or SET base type), a 1846 production is added to the grammar where the left hand side is the 1847 non-terminal for the ExtensionAddition and the right hand side is 1848 initially empty. If the ExtensionAddition is a ComponentType, then 1849 the primary non-terminal for the NamedType in the ComponentType is 1850 appended to the right hand side, otherwise (an 1851 ExtensionAdditionGroup) the sequence of primary non-terminals for the 1852 components nested in the ComponentTypeList in the 1853 ExtensionAdditionGroup are appended to the right hand side in the 1854 order of their definition. If the ExtensionAddition is followed by 1855 another ExtensionAddition, then the non-terminal for the next 1856 ExtensionAddition is appended to the right hand side, otherwise if 1857 the base type is not subject to a NO-INSERTIONS or HOLLOW-INSERTIONS 1858 encoding instruction, then the non-terminal for the extension 1859 insertion point of the base type is appended to the right hand side. 1860 If the ExtensionAddition is not followed by another ExtensionAddition 1861 and the base type is not subject to a NO-INSERTIONS encoding 1862 instruction, then the production is an insertion point production. 1863 If the empty sequence of terminals cannot be generated from the 1864 production (it may be necessary to wait until the grammar is 1865 otherwise complete before making this determination), then another 1866 production is added to the grammar where the left hand side is the 1867 non-terminal for the ExtensionAddition and the right hand side is 1868 empty. 
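Aside: The rules above for root components and ExtensionAdditions of a SEQUENCE or SET type, together with the two insertion point productions specified in the text that follows, can be sketched in Python as below. This is an illustration under stated assumptions, not a reference implementation: components are modelled as (identifier, optional, attribute) triples, no component is subject to a GROUP encoding instruction, the labels S, E1, E2, ... and I1 follow the conventions of the examples, and the insertions flag is False when the type is subject to NO-INSERTIONS or HOLLOW-INSERTIONS. Whether an ExtensionAddition needs an empty production is decided from the last addition backwards, which is one reading of the instruction to wait until the grammar is otherwise complete.

```python
def nullable_symbols(productions):
    """Non-terminals from which the empty sequence of terminals can be generated."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def sequence_grammar(initial, additions, final=(), extensible=True,
                     insertions=True):
    """Return productions as (left hand side, right hand side tuple) pairs.

    additions is a list of ExtensionAdditions, each a list of components."""
    prods = []
    rhs = [c[0] for c in initial]
    if additions:
        rhs.append("E1")                 # first ExtensionAddition
    elif extensible and insertions:
        rhs.append("I1")                 # extension insertion point
    rhs += [c[0] for c in final]
    prods.append(("S", tuple(rhs)))

    for i, group in enumerate(additions, start=1):
        rhs = [c[0] for c in group]
        if i < len(additions):
            rhs.append("E%d" % (i + 1))  # chain to the next ExtensionAddition
        elif insertions:
            rhs.append("I1")
        prods.append(("E%d" % i, tuple(rhs)))

    if extensible and insertions:
        prods.append(("I1", ('"*"', "I1")))
        prods.append(("I1", ()))

    components = list(initial) + [c for g in additions for c in g] + list(final)
    for name, optional, attribute in components:
        prods.append((name, ('"%s%s"' % ("@" if attribute else "", name),)))
        if optional:                     # OPTIONAL or DEFAULT
            prods.append((name, ()))

    # An extension is always effectively optional: add an empty production
    # for each addition that cannot otherwise generate the empty sequence.
    for i in range(len(additions), 0, -1):
        if "E%d" % i not in nullable_symbols(prods):
            prods.append(("E%d" % i, ()))
    return prods
```

Calling sequence_grammar with the three root components of the example above (and extensible=False) yields the five productions listed there.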
1870 Aside: An extension is always effectively optional since a sender 1871 may be using an earlier version of the ASN.1 specification where 1872 none, or only some, of the extensions have been defined. 1874 Aside: The grammar generated for ExtensionAdditions is structured 1875 to take account of the condition that an extension can only be 1876 used if all the earlier extensions are also used [X.680]. 1878 If a SEQUENCE or SET base type is extensible (explicitly or by 1879 default) and is not subject to a NO-INSERTIONS or HOLLOW-INSERTIONS 1880 encoding instruction, then: 1882 (1) a production is added to the grammar where the left hand side is 1883 the non-terminal for the extension insertion point of the base 1884 type and the right hand side is the general extension terminal 1885 followed by the non-terminal for the extension insertion point, 1886 and 1888 (2) a production is added to the grammar where the left hand side is 1889 the non-terminal for the extension insertion point and the right 1890 hand side is empty. 1892 Example 1894 Consider the following ASN.1 type definition: 1896 SEQUENCE { 1897 -- Start of initial RootComponentTypeList. 1898 one BOOLEAN, 1899 two INTEGER OPTIONAL, 1900 -- End of initial RootComponentTypeList. 1901 ..., 1902 -- Start of ExtensionAdditions. 1903 four INTEGER, -- First ExtensionAddition (E1). 1904 five BOOLEAN OPTIONAL, -- Second ExtensionAddition (E2). 1905 [[ -- An ExtensionAdditionGroup. 1906 six UTF8String, 1907 seven INTEGER OPTIONAL 1908 ]], -- Third ExtensionAddition (E3). 1909 -- End of ExtensionAdditions. 1910 -- The extension insertion point is here (I1). 1911 ..., 1912 -- Start of final RootComponentTypeList. 
1913 three INTEGER 1914 } 1916 Here is the grammar derived from this type: 1918 S ::= one two E1 three 1920 E1 ::= four E2 1921 E1 ::= 1922 E2 ::= five E3 1923 E3 ::= six seven I1 1924 E3 ::= 1926 I1 ::= "*" I1 1927 I1 ::= 1929 one ::= "one" 1930 two ::= "two" 1931 two ::= 1932 three ::= "three" 1933 four ::= "four" 1934 five ::= "five" 1935 five ::= 1936 six ::= "six" 1937 seven ::= "seven" 1938 seven ::= 1940 If the SEQUENCE type were subject to a NO-INSERTIONS or 1941 HOLLOW-INSERTIONS encoding instruction, then the productions for 1942 I1 would not appear and the first production for E3 would be: 1944 E3 ::= six seven 1946 Given a primary non-terminal, N, and a type where the base type is a 1947 CHOICE type: 1949 (1) A production is added to the grammar for each NamedType nested in 1950 the RootAlternativeTypeList of the base type, where the left hand 1951 side is N and the right hand side is the primary non-terminal for 1952 the NamedType. 1954 (2) A production is added to the grammar for each 1955 ExtensionAdditionAlternative of the base type, where the left 1956 hand side is N and the right hand side is the non-terminal for 1957 the ExtensionAdditionAlternative. 1959 (3) If the base type is extensible (explicitly or by default) and the 1960 base type is not subject to an insertion encoding instruction, 1961 then: 1963 (a) A production is added to the grammar where the left hand side 1964 is N and the right hand side is the non-terminal for the 1965 extension insertion point of the base type. This production 1966 is an insertion point production. 1968 (b) A production is added to the grammar where the left hand side 1969 is the non-terminal for the extension insertion point of the 1970 base type and the right hand side is the general extension 1971 terminal followed by the non-terminal for the extension 1972 insertion point. 
1974 (c) A production is added to the grammar where the left hand side 1975 is the non-terminal for the extension insertion point of the 1976 base type and the right hand side is empty. 1978 (4) If the base type is subject to a HOLLOW-INSERTIONS encoding 1979 instruction, then a production is added to the grammar where the 1980 left hand side is N and the right hand side is empty. This 1981 production is an insertion point production. 1983 (5) If the base type is subject to a SINGULAR-INSERTIONS encoding 1984 instruction, then a production is added to the grammar where the 1985 left hand side is N and the right hand side is the general 1986 extension terminal. This production is an insertion point 1987 production. 1989 (6) If the base type is subject to a UNIFORM-INSERTIONS encoding 1990 instruction, then: 1992 (a) A production is added to the grammar where the left hand side 1993 is N and the right hand side is the general extension 1994 terminal. 1996 Aside: This production is used to verify the correctness 1997 of an ASN.1 type definition, but would not be used in the 1998 implementation of an RXER decoder. The next production 1999 takes precedence over it for accepting an unknown element. 2001 (b) A production is added to the grammar where the left hand side 2002 is N and the right hand side is the terminal for the 2003 extension insertion point of the base type followed by the 2004 non-terminal for the extension insertion point. This 2005 production is an insertion point production. 2007 (c) A production is added to the grammar where the left hand side 2008 is the non-terminal for the extension insertion point of the 2009 base type and the right hand side is the terminal for the 2010 extension insertion point followed by the non-terminal for 2011 the extension insertion point. 2013 (d) A production is added to the grammar where the left hand side 2014 is the non-terminal for the extension insertion point of the 2015 base type and the right hand side is empty. 
2017 (7) If the base type is subject to a MULTIFORM-INSERTIONS encoding 2018 instruction, then: 2020 (a) A production is added to the grammar where the left hand side 2021 is N and the right hand side is the general extension 2022 terminal followed by the non-terminal for the extension 2023 insertion point of the base type. This production is an 2024 insertion point production. 2026 (b) A production is added to the grammar where the left hand side 2027 is the non-terminal for the extension insertion point of the 2028 base type and the right hand side is the general extension 2029 terminal followed by the non-terminal for the extension 2030 insertion point. 2032 (c) A production is added to the grammar where the left hand side 2033 is the non-terminal for the extension insertion point of the 2034 base type and the right hand side is empty. 2036 If an ExtensionAdditionAlternative is a NamedType, then a production 2037 is added to the grammar where the left hand side is the non-terminal 2038 for the ExtensionAdditionAlternative and the right hand side is the 2039 primary non-terminal for the NamedType. 2041 If an ExtensionAdditionAlternative is an 2042 ExtensionAdditionAlternativesGroup, then a production is added to the 2043 grammar for each NamedType nested in the 2044 ExtensionAdditionAlternativesGroup, where the left hand side is the 2045 non-terminal for the ExtensionAdditionAlternative and the right hand 2046 side is the primary non-terminal for the NamedType. 2048 Example 2050 Consider the following ASN.1 type definition: 2052 CHOICE { 2053 -- Start of RootAlternativeTypeList. 2054 one BOOLEAN, 2055 two INTEGER, 2056 -- End of RootAlternativeTypeList. 2057 ..., 2058 -- Start of ExtensionAdditionAlternatives. 2059 three INTEGER, -- First ExtensionAdditionAlternative (E1). 2060 [[ -- An ExtensionAdditionAlternativesGroup. 2061 four UTF8String, 2062 five INTEGER 2063 ]] -- Second ExtensionAdditionAlternative (E2). 
2064 -- The extension insertion point is here (I1). 2065 } 2067 Here is the grammar derived from this type: 2069 S ::= one 2070 S ::= two 2071 S ::= E1 2072 S ::= E2 2073 S ::= I1 2075 I1 ::= "*" I1 2076 I1 ::= 2078 E1 ::= three 2079 E2 ::= four 2080 E2 ::= five 2082 one ::= "one" 2083 two ::= "two" 2084 three ::= "three" 2085 four ::= "four" 2086 five ::= "five" 2088 If the CHOICE type were subject to a NO-INSERTIONS encoding 2089 instruction, then the fifth, sixth and seventh productions would 2090 be removed. 2092 If the CHOICE type were subject to a HOLLOW-INSERTIONS encoding 2093 instruction, then the fifth, sixth and seventh productions would 2094 be replaced by: 2096 S ::= 2098 If the CHOICE type were subject to a SINGULAR-INSERTIONS encoding 2099 instruction, then the fifth, sixth and seventh productions would 2100 be replaced by: 2102 S ::= "*" 2104 If the CHOICE type were subject to a UNIFORM-INSERTIONS encoding 2105 instruction, then the fifth and sixth productions would be 2106 replaced by: 2108 S ::= "*" 2109 S ::= "*1" I1 2111 I1 ::= "*1" I1 2113 If the CHOICE type were subject to a MULTIFORM-INSERTIONS encoding 2114 instruction, then the fifth production would be replaced by: 2116 S ::= "*" I1 2118 Constraints on a SEQUENCE, SET or CHOICE type are ignored. They do 2119 not affect the grammar being generated. 2121 Aside: This avoids an awkward situation where values of a subtype 2122 have to be decoded differently from values of the parent type. It 2123 also simplifies the verification procedure. 
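Aside: The CHOICE rules (1) to (7) above, and the variants shown in the example, can be sketched in Python as follows. The fragment is illustrative only. It assumes that no alternative is an attribute component or subject to a GROUP encoding instruction, that alternatives are given by identifier, and that the labels S, E1, E2, ..., I1, "*" and "*1" follow the conventions of the examples; the function name choice_grammar is hypothetical.

```python
def choice_grammar(root, additions, extensible=True, instruction=None):
    """Return productions as (left hand side, right hand side tuple) pairs.

    root: identifiers in the RootAlternativeTypeList.
    additions: list of ExtensionAdditionAlternatives, each a list of
    identifiers (a single NamedType is a one-item list).
    instruction: None, or the name of an insertion encoding instruction."""
    star, star1, ip = '"*"', '"*1"', "I1"
    prods = [("S", (name,)) for name in root]
    prods += [("S", ("E%d" % i,)) for i in range(1, len(additions) + 1)]

    if instruction is None:
        if extensible:                                   # rule (3)
            prods += [("S", (ip,)), (ip, (star, ip)), (ip, ())]
    elif instruction == "HOLLOW-INSERTIONS":             # rule (4)
        prods.append(("S", ()))
    elif instruction == "SINGULAR-INSERTIONS":           # rule (5)
        prods.append(("S", (star,)))
    elif instruction == "UNIFORM-INSERTIONS":            # rule (6)
        prods += [("S", (star,)), ("S", (star1, ip)),
                  (ip, (star1, ip)), (ip, ())]
    elif instruction == "MULTIFORM-INSERTIONS":          # rule (7)
        prods += [("S", (star, ip)), (ip, (star, ip)), (ip, ())]
    elif instruction != "NO-INSERTIONS":
        raise ValueError("unknown insertion encoding instruction")

    for i, group in enumerate(additions, start=1):
        prods += [("E%d" % i, (name,)) for name in group]
    for name in list(root) + [n for group in additions for n in group]:
        prods.append((name, ('"%s"' % name,)))           # element terminals
    return prods
```

With root alternatives one and two and additions [three] and [four, five], the default call produces the grammar of the example in the same order.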
2125 Given a primary non-terminal, N, and a type that has a SEQUENCE OF or 2126 SET OF base type and that permits a value of size zero (i.e., an 2127 empty sequence or set): 2129 (1) a production is added to the grammar where the left hand side of 2130 the production is N and the right hand side is the primary 2131 non-terminal for the NamedType of the component of the 2132 SEQUENCE OF or SET OF base type, followed by N, and 2134 (2) a production is added to the grammar where the left hand side of 2135 the production is N and the right hand side is empty. 2137 Given a primary non-terminal, N, a secondary non-terminal, N', and a 2138 type that has a SEQUENCE OF or SET OF base type and that does not 2139 permit a value of size zero: 2141 (1) a production is added to the grammar where the left hand side of 2142 the production is N and the right hand side is the primary 2143 non-terminal for the NamedType of the component of the 2144 SEQUENCE OF or SET OF base type, followed by N', and 2146 (2) a production is added to the grammar where the left hand side of 2147 the production is N' and the right hand side is the primary 2148 non-terminal for the NamedType of the component of the 2149 SEQUENCE OF or SET OF base type, followed by N', and 2151 (3) a production is added to the grammar where the left hand side of 2152 the production is N' and the right hand side is empty. 2154 Example 2156 Consider the following ASN.1 type definition: 2158 SEQUENCE SIZE(1..MAX) OF number INTEGER 2160 Here is the grammar derived from this type: 2162 S ::= number S' 2163 S' ::= number S' 2164 S' ::= 2166 number ::= "number" 2168 All inner subtyping (InnerTypeConstraints) is ignored for the purposes 2169 of deciding whether a value of size zero is permitted by a 2170 SEQUENCE OF or SET OF type. 2172 This completes the description of the transformation of ASN.1 2173 combining type definitions into a grammar. 2175 25.1.2.
Unique Component Attribution 2176 This section describes conditions that the grammar must satisfy so 2177 that each element and attribute in a received RXER encoding can be 2178 uniquely associated with an ASN.1 component definition. 2180 Definition (used by the grammar): A non-terminal, N, is used by the 2181 grammar if: 2183 (1) N is the start symbol or 2185 (2) N appears on the right hand side of a production where the 2186 non-terminal on the left hand side is used by the grammar. 2188 Definition (multiple derivation paths): A non-terminal, N, has 2189 multiple derivation paths if: 2191 (1) N appears on the right hand side of a production where the 2192 non-terminal on the left hand side has multiple derivation paths, 2193 or 2195 (2) N appears on the right hand side of more than one production 2196 where the non-terminal on the left hand side is used by the 2197 grammar, or 2199 (3) N is the start symbol and it appears on the right hand side of a 2200 production where the non-terminal on the left hand side is used 2201 by the grammar. 2203 For every ASN.1 type with a base type containing components that are 2204 subject to a GROUP encoding instruction, the grammar derived by the 2205 method described in this document MUST NOT have: 2207 (1) two or more primary non-terminals that are used by the grammar 2208 and are associated with element components having the same 2209 expanded name, or 2211 (2) two or more primary non-terminals that are used by the grammar 2212 and are associated with attribute components having the same 2213 expanded name, or 2215 (3) a primary non-terminal that has multiple derivation paths and is 2216 associated with an attribute component. 2218 Aside: Case (1) is in response to component referencing notations 2219 that are evaluated with respect to the XML encoding of an abstract 2220 value. 
Case (1) guarantees, without having to do extensive 2221 testing (which would necessarily have to take account of encoding 2222 instructions for all other encoding rules), that all sibling 2223 elements with the same expanded name will be associated with 2224 equivalent type definitions. Such equivalence allows a component 2225 referenced by element name to be re-encoded using a different set 2226 of ASN.1 encoding rules without ambiguity as to which type 2227 definition and encoding instructions apply. 2229 Cases (2) and (3) ensure that an attribute name is always uniquely 2230 associated with one component that can occur at most once and is 2231 always nested in the same part of an abstract value. 2233 Example 2235 The following example types illustrate various uses and misuses of 2236 the GROUP encoding instruction with respect to unique component 2237 attribution: 2239 TA ::= SEQUENCE { 2240 a [GROUP] TB, 2241 b [GROUP] CHOICE { 2242 a [GROUP] TB, 2243 b [NAME AS "c"] [ATTRIBUTE] INTEGER, 2244 c INTEGER, 2245 d TB, 2246 e [GROUP] TD, 2247 f [ATTRIBUTE] UTF8String 2248 }, 2249 c [ATTRIBUTE] INTEGER, 2250 d [GROUP] SEQUENCE OF 2251 a [GROUP] SEQUENCE { 2252 a [ATTRIBUTE] OBJECT IDENTIFIER, 2253 b INTEGER 2254 }, 2255 e [NAME AS "c"] INTEGER, 2256 COMPONENTS OF TD 2257 } 2259 TB ::= SEQUENCE { 2260 a INTEGER, 2261 b [ATTRIBUTE] BOOLEAN, 2262 COMPONENTS OF TC 2263 } 2265 TC ::= SEQUENCE { 2266 f OBJECT IDENTIFIER 2267 } 2269 TD ::= SEQUENCE { 2270 g OBJECT IDENTIFIER 2271 } 2273 The grammar for TA is constructed after performing the 2274 COMPONENTS OF transformation. The result of this transformation 2275 is shown next. This example will depart from the usual convention 2276 of using just the identifier of a NamedType to represent the 2277 primary non-terminal for that NamedType. A label relative to the 2278 outermost type will be used instead to better illustrate unique 2279 component attribution. 
The labels used for the non-terminals are 2280 shown down the right hand side. 2282 TA ::= SEQUENCE { 2283 a [GROUP] TB, -- TA.a 2284 b [GROUP] CHOICE { -- TA.b 2285 a [GROUP] TB, -- TA.b.a 2286 b [NAME AS "c"] [ATTRIBUTE] INTEGER, -- TA.b.b 2287 c INTEGER, -- TA.b.c 2288 d TB, -- TA.b.d 2289 e [GROUP] TD, -- TA.b.e 2290 f [ATTRIBUTE] UTF8String -- TA.b.f 2291 }, 2292 c [ATTRIBUTE] INTEGER, -- TA.c 2293 d [GROUP] SEQUENCE OF -- TA.d 2294 a [GROUP] SEQUENCE { -- TA.d.a 2295 a [ATTRIBUTE] OBJECT IDENTIFIER, -- TA.d.a.a 2296 b INTEGER -- TA.d.a.b 2297 }, 2298 e [NAME AS "c"] INTEGER, -- TA.e 2299 g OBJECT IDENTIFIER -- TA.g 2300 } 2302 TB ::= SEQUENCE { 2303 a INTEGER, -- TB.a 2304 b [ATTRIBUTE] BOOLEAN, -- TB.b 2305 f OBJECT IDENTIFIER -- TB.f 2306 } 2308 -- Type TC is no longer of interest. -- 2310 TD ::= SEQUENCE { 2311 g OBJECT IDENTIFIER -- TD.g 2312 } 2314 The associated grammar is: 2316 S ::= TA.a TA.b TA.c TA.d TA.e TA.g 2318 TA.a ::= TB.a TB.b TB.f 2320 TB.a ::= "a" 2321 TB.b ::= "@b" 2322 TB.f ::= "f" 2324 TA.b ::= TA.b.a 2325 TA.b ::= TA.b.b 2326 TA.b ::= TA.b.c 2327 TA.b ::= TA.b.d 2328 TA.b ::= TA.b.e 2329 TA.b ::= TA.b.f 2331 TA.b.a ::= TB.a TB.b TB.f 2332 TA.b.b ::= "@c" 2333 TA.b.c ::= "c" 2334 TA.b.d ::= "d" 2335 TA.b.e ::= TD.g 2336 TA.b.f ::= "@f" 2338 TD.g ::= "g" 2340 TA.c ::= "@c" 2342 TA.d ::= TA.d.a TA.d 2343 TA.d ::= 2345 TA.d.a ::= TA.d.a.a TA.d.a.b 2347 TA.d.a.a ::= "@a" 2348 TA.d.a.b ::= "b" 2350 TA.e ::= "c" 2352 TA.g ::= "g" 2354 All the non-terminals are used by the grammar. 2356 The type definition for TA is invalid because there are two 2357 instances where two or more primary non-terminals are associated 2358 with element components having the same expanded name: 2360 (1) TA.b.c and TA.e (both generate the terminal "c"), and 2362 (2) TD.g and TA.g (both generate the terminal "g"). 
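Aside: The definitions of "used by the grammar" and "multiple derivation paths" given earlier in this section can be evaluated mechanically. The following Python sketch is illustrative only and is not part of the specification. It transcribes the example grammar for TA by hand and assumes that a non-terminal whose production consists of a single quoted terminal stands for an element or attribute component (a leading "@" marks an attribute). The function names are hypothetical.

```python
from collections import defaultdict

START = "S"
# The example grammar for TA. Terminals keep their quotes, as in the text.
PRODUCTIONS = [
    ("S", ["TA.a", "TA.b", "TA.c", "TA.d", "TA.e", "TA.g"]),
    ("TA.a", ["TB.a", "TB.b", "TB.f"]),
    ("TB.a", ['"a"']), ("TB.b", ['"@b"']), ("TB.f", ['"f"']),
    ("TA.b", ["TA.b.a"]), ("TA.b", ["TA.b.b"]), ("TA.b", ["TA.b.c"]),
    ("TA.b", ["TA.b.d"]), ("TA.b", ["TA.b.e"]), ("TA.b", ["TA.b.f"]),
    ("TA.b.a", ["TB.a", "TB.b", "TB.f"]),
    ("TA.b.b", ['"@c"']), ("TA.b.c", ['"c"']), ("TA.b.d", ['"d"']),
    ("TA.b.e", ["TD.g"]), ("TA.b.f", ['"@f"']), ("TD.g", ['"g"']),
    ("TA.c", ['"@c"']),
    ("TA.d", ["TA.d.a", "TA.d"]), ("TA.d", []),
    ("TA.d.a", ["TA.d.a.a", "TA.d.a.b"]),
    ("TA.d.a.a", ['"@a"']), ("TA.d.a.b", ['"b"']),
    ("TA.e", ['"c"']), ("TA.g", ['"g"']),
]


def is_terminal(sym):
    return sym.startswith('"')


def used_by_grammar():
    """Non-terminals reachable from the start symbol."""
    used = {START}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS:
            if lhs in used:
                for sym in rhs:
                    if not is_terminal(sym) and sym not in used:
                        used.add(sym)
                        changed = True
    return used


def multiple_derivation_paths(used):
    """Apply rules (2) and (3) first, then close the set under rule (1)."""
    count = defaultdict(int)  # productions (with a used left hand side) naming N
    for lhs, rhs in PRODUCTIONS:
        if lhs in used:
            for sym in set(rhs):
                if not is_terminal(sym):
                    count[sym] += 1
    multi = {n for n, c in count.items() if c > 1}      # rule (2)
    if count.get(START, 0) > 0:                         # rule (3)
        multi.add(START)
    changed = True
    while changed:                                      # rule (1)
        changed = False
        for lhs, rhs in PRODUCTIONS:
            if lhs in multi:
                for sym in rhs:
                    if not is_terminal(sym) and sym not in multi:
                        multi.add(sym)
                        changed = True
    return multi


def duplicate_names(used):
    """Terminals generated by two or more used non-terminals."""
    by_terminal = defaultdict(set)
    for lhs, rhs in PRODUCTIONS:
        if lhs in used and len(rhs) == 1 and is_terminal(rhs[0]):
            by_terminal[rhs[0]].add(lhs)
    return {t: sorted(ns) for t, ns in by_terminal.items() if len(ns) > 1}


used = used_by_grammar()
multi = multiple_derivation_paths(used)
print(duplicate_names(used))
print(sorted(multi))
```

Any key reported by duplicate_names() indicates a violation of condition (1) or (2), depending on whether the terminal is an element or attribute terminal. Any attribute non-terminal that also appears in the set returned by multiple_derivation_paths() indicates a violation of condition (3).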
2364 In case (2), TD.g and TA.g are derived from the same instance of 2365 NamedType notation, but become distinct components following the 2366 COMPONENTS OF transformation. AUTOMATIC tagging is applied after 2367 the COMPONENTS OF transformation, which means that the types of 2368 the components corresponding to TD.g and TA.g will end up with 2369 different tags and therefore the types will not be equivalent. 2371 The type definition for TA is also invalid because there is one 2372 instance where two or more primary non-terminals are associated 2373 with attribute components having the same expanded name: TA.b.b 2374 and TA.c (both generate the terminal "@c"). 2376 The non-terminals with multiple derivation paths are: TA.d, 2377 TA.d.a, TA.d.a.a, TA.d.a.b, TB.a, TB.b and TB.f. The type 2378 definition for TA is also invalid because TA.d.a.a and TB.b are 2379 primary non-terminals that are associated with an attribute 2380 component. 2382 25.1.3. Deterministic Grammars 2384 Let the First Set of a production P, denoted First(P), be the set of 2385 all element terminals T where T is the first element terminal in a 2386 sequence of terminals that can be generated from the right hand side 2387 of P. There can be any number of leading attribute terminals before 2388 T. 2390 Let the Follow Set of a non-terminal N, denoted Follow(N), be the set 2391 of all element terminals T where T is the first element terminal 2392 following N in a sequence of non-terminals and terminals that can be 2393 generated from the grammar. There can be any number of attribute 2394 terminals between N and T. If a sequence of non-terminals and 2395 terminals can be generated from the grammar where N is not followed 2396 by any element terminals, then Follow(N) also contains a special end 2397 terminal, denoted by "$". 2399 Aside: If N does not appear on the right hand side of any 2400 production, then Follow(N) will be empty. 
2402 For a production P, let the predicate Empty(P) be true if and only if 2403 the empty sequence of terminals can be generated from P. Otherwise, 2404 Empty(P) is false. 2406 Definition (base grammar): The base grammar is a rewriting of the 2407 grammar in which the non-terminals for every ExtensionAddition and 2408 ExtensionAdditionAlternative are removed from the right hand side of 2409 all productions. 2411 For a production P, let the predicate Preselected(P) be true if and 2412 only if every sequence of terminals that can be generated from the 2413 right hand side of P using only the base grammar contains at least 2414 one attribute terminal. Otherwise, Preselected(P) is false. 2416 The Select Set of a production P, denoted Select(P), is empty if 2417 Preselected(P) is true, otherwise it contains First(P). Let N be the 2418 non-terminal on the left hand side of P. If Empty(P) is true, then 2419 Select(P) also contains Follow(N). 2421 Aside: It may appear somewhat dubious to include the attribute 2422 components in the grammar because, in reality, attributes appear 2423 unordered within the start tag of an element, and not interspersed 2424 with the child elements as the grammar would suggest. This is why 2425 attribute terminals are ignored in composing the First Sets and 2426 Follow Sets. However, the attribute terminals are important in 2427 composing the Select Sets because they can preselect a production 2428 and can prevent a production from being able to generate an empty 2429 sequence of terminals. In real terms, this corresponds to an RXER 2430 decoder using the attributes to determine the presence or absence 2431 of optional components and to select between the alternatives of a 2432 CHOICE, even before considering the child elements. 
2434 An attribute appearing in an extension isn't used to preselect a 2435 production since, in general, a decoder using an earlier version 2436 of the specification would not be able to associate the attribute 2437 with any particular extension insertion point. 2439 Let the Reach Set of a non-terminal N, denoted Reach(N), be the set 2440 of all element terminals T where T appears in a sequence of terminals 2441 that can be generated from N. 2443 Aside: It can be readily shown that all the optional attribute 2444 components and all but one of the mandatory attribute components 2445 of a SEQUENCE or SET type can be ignored in constructing the 2446 grammar because their omission does not alter the First, Follow, 2447 Select or Reach Sets, or the evaluation of the Preselected and 2448 Empty predicates. 2450 A grammar is deterministic (for the purposes of an RXER decoder) if 2451 and only if: 2453 (1) there do not exist two productions P and Q, with the same 2454 non-terminal on the left hand side, where the intersection of 2455 Select(P) and Select(Q) is not empty, and 2457 (2) there does not exist a non-terminal E for an ExtensionAddition or 2458 ExtensionAdditionAlternative where the intersection of Reach(E) 2459 and Follow(E) is not empty. 2461 Aside: In case (1), if the intersection is not empty, then a 2462 decoder would have two or more possible ways to attempt to decode 2463 the input into an abstract value. In case (2), if the 2464 intersection is not empty, then a decoder using an earlier version 2465 of the ASN.1 specification would confuse an element in an unknown 2466 (to that decoder) extension with a known component following the 2467 extension. 2469 Aside: In the absence of any attribute components, case (1) is the 2470 test for an LL(1) grammar. 2472 For every ASN.1 type with a base type containing components that are 2473 subject to a GROUP encoding instruction, the grammar derived by the 2474 method described in this document MUST be deterministic. 
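Aside: Case (1) of the determinism test can be checked mechanically. The following Python sketch is illustrative only and is not part of the specification. It assumes a grammar with no ExtensionAddition or ExtensionAdditionAlternative non-terminals, so the base grammar is the grammar itself and case (2) holds trivially. Productions are written as (left hand side, right hand side) pairs, terminals keep their quotes, and a leading "@" marks an attribute terminal. The function names and the two sample grammars (an optional group with and without a mandatory attribute) are hypothetical.

```python
END = "$"


def is_attr(sym):
    return sym.startswith('"@')


def is_elem(sym):
    return sym.startswith('"') and not is_attr(sym)


def closure(productions, passes):
    """Least set of non-terminals with a production whose symbols all pass."""
    done = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in done and all(s in done or passes(s) for s in rhs):
                done.add(lhs)
                changed = True
    return done


def select_sets(productions, start="S"):
    nts = {lhs for lhs, _ in productions}
    nullable = closure(productions, lambda s: False)  # can derive no terminals
    elem_free = closure(productions, is_attr)         # can derive no elements
    attr_free = closure(productions, is_elem)         # can derive no attributes
    first = {n: set() for n in nts}
    follow = {n: set() for n in nts}
    follow[start].add(END)

    def first_of(seq):
        """First element terminals of seq (attributes are skipped), and
        whether seq can derive a sequence with no element terminal."""
        out = set()
        for s in seq:
            if is_elem(s):
                return out | {s}, False
            if s in nts:
                out |= first[s]
                if s not in elem_free:
                    return out, False
        return out, True

    changed = True
    while changed:  # iterate First and Follow together to a fixed point
        changed = False
        for lhs, rhs in productions:
            new, _ = first_of(rhs)
            if not new.issubset(first[lhs]):
                first[lhs] |= new
                changed = True
            for i, s in enumerate(rhs):
                if s in nts:
                    nxt, transparent = first_of(rhs[i + 1:])
                    if transparent:
                        nxt = nxt | follow[lhs]
                    if not nxt.issubset(follow[s]):
                        follow[s] |= nxt
                        changed = True

    result = []
    for lhs, rhs in productions:
        preselected = not all(s in attr_free or is_elem(s) for s in rhs)
        empty = all(s in nullable for s in rhs)
        sel = set() if preselected else first_of(rhs)[0]
        if empty:
            sel = sel | follow[lhs]
        result.append(sel)
    return result


def conflicts(productions, start="S"):
    """Pairs (i, j), numbered from 1, that violate case (1)."""
    sel = select_sets(productions, start)
    return [(i + 1, j + 1)
            for i in range(len(productions))
            for j in range(i + 1, len(productions))
            if productions[i][0] == productions[j][0] and sel[i] & sel[j]]


OPTIONAL_GROUP = [
    ("S", ["one", "three"]),
    ("one", ["two"]), ("one", []),
    ("two", ['"two"']), ("two", []),
    ("three", ['"three"']),
]
OPTIONAL_GROUP_WITH_ATTRIBUTES = [
    ("S", ["one", "three"]),
    ("one", ["two", "four", "five"]), ("one", []),
    ("two", ['"two"']), ("two", []),
    ("four", ['"@four"']),
    ("five", ['"@five"']), ("five", []),
    ("three", ['"three"']),
]

print(conflicts(OPTIONAL_GROUP))
print(conflicts(OPTIONAL_GROUP_WITH_ATTRIBUTES))
```

An empty result from conflicts() means that no two productions with the same left hand side have intersecting Select Sets. In the first sample grammar the second and third productions both select on "three", whereas in the second the mandatory attribute preselects the production for "one" and removes the overlap.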
2476 25.1.4. Attributes in Unknown Extensions 2478 An insertion point production is able to accept unknown attributes if 2479 the non-terminal on the left hand side of the production does not 2480 have multiple derivation paths. 2482 Aside: If the non-terminal has multiple derivation paths, then any 2483 future extension cannot possibly contain an attribute component 2484 because that would violate the requirements of Section 25.1.2. 2486 For a deterministic grammar, there is only one possible way to 2487 construct a sequence of element terminals matching the element 2488 content of an element in a correctly formed RXER encoding. Any 2489 unknown attributes of the element are accepted if at least one 2490 insertion point production that is able to accept unknown attributes 2491 is used in that construction. 2493 Example 2495 Consider this type definition: 2497 CHOICE { 2498 one UTF8String, 2499 two [GROUP] SEQUENCE { 2500 three INTEGER, 2501 ... 2502 } 2503 } 2505 The associated grammar is: 2507 S ::= one 2508 S ::= two 2510 two ::= three I1 2512 I1 ::= "*" I1 2513 I1 ::= 2515 one ::= "one" 2516 three ::= "three" 2518 The third production is an insertion point production and it is 2519 able to accept unknown attributes. 2521 When decoding a value of this type, if the element content 2522 contains a <one> child element, then any unrecognized attribute 2523 would be illegal as the insertion point production would not be 2524 used to recognize the input (the "one" alternative does not admit 2525 an extension insertion point). If the element content contains a 2526 <three> element, then an unrecognized attribute would be accepted 2527 because the insertion point production would be used to recognize 2528 the input (the "two" alternative that generates the 2529 <three> element has an extensible type). 
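Aside: The decision described in this example can be sketched in a few lines of Python. The sketch is illustrative only and is hand-written for this one grammar. The function name is hypothetical, and treating every element after the first as matching the "*" terminal is a simplifying assumption.

```python
def accepts_unknown_attributes(children):
    """children: names of the child elements, in document order.

    Returns True if the insertion point production (two ::= three I1)
    is used to recognize the element content, False otherwise.
    """
    if children == ["one"]:
        # S ::= one. No insertion point production is used.
        return False
    if children[:1] == ["three"]:
        # S ::= two, then two ::= three I1. The I1 productions absorb
        # any trailing elements of an unknown extension.
        return True
    raise ValueError("element content does not match the grammar")


print(accepts_unknown_attributes(["one"]))
print(accepts_unknown_attributes(["three", "extra"]))
```

A decoder built along these lines would reject an unrecognized attribute when the function returns False and would store it with the unknown extension when the function returns True.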
2531 If the SEQUENCE type were prefixed by a NO-INSERTIONS encoding 2532 instruction, then the third, fourth and fifth productions would be 2533 replaced by: 2535 two ::= three 2537 With this change, any unrecognized attribute would be illegal for 2538 the "two" alternative also, since the replacement production is 2539 not an insertion point production. 2541 If more than one insertion point production that is able to accept 2542 unknown attributes is used in constructing a matching sequence of 2543 element terminals, then a decoder is free to associate an 2544 unrecognized attribute with any one of the extension insertion points 2545 corresponding to those insertion point productions. The 2546 justification for doing so comes from the following two observations: 2548 (1) If the encoding of an abstract value contains an extension where 2549 the type of the extension is unknown to the receiver, then it is 2550 generally impossible to re-encode the value using a different set 2551 of encoding rules, including the canonical variant of the 2552 received encoding. This is true no matter which encoding rules 2553 are being used. It is desirable for a decoder to be able to 2554 accept and store the raw encoding of an extension without raising 2555 an error, and to re-insert the raw encoding of the extension when 2556 re-encoding the abstract value using the same non-canonical 2557 encoding rules. However, there is little more that an 2558 application can do with an unknown extension. 2560 An application using RXER can successfully accept, store and 2561 re-encode an unrecognized attribute regardless of which extension 2562 insertion point it might be ascribed to. 2564 (2) Even if there is a single extension insertion point, an unknown 2565 extension could still be the encoding of a value of any one of an 2566 infinite number of valid type definitions. 
For example, an 2567 attribute or element component could be nested to any arbitrary 2568 depth within CHOICEs whose components are subject to GROUP 2569 encoding instructions. 2571 Aside: A similar series of nested CHOICEs could describe an 2572 unknown extension in a Basic Encoding Rules (BER) encoding 2573 [X.690]. 2575 26. Security Considerations 2577 ASN.1 compiler implementors should take special care to be thorough 2578 in checking that the GROUP encoding instruction has been correctly 2579 used, otherwise ASN.1 specifications with ambiguous RXER encodings 2580 could be deployed. 2582 Ambiguous encodings mean that the abstract value recovered by a 2583 decoder may differ from the original abstract value that was encoded. 2584 If that is the case, then a digital signature generated with respect 2585 to the original abstract value (using a canonical encoding other than 2586 CRXER) will not be successfully verified by a receiver using the 2587 decoded abstract value. Also, an abstract value may have 2588 security-sensitive fields, and in particular fields used to grant or 2589 deny access. If the decoded abstract value differs from the encoded 2590 abstract value, then a receiver using the decoded abstract value will 2591 be applying different security policy to that embodied in the 2592 original abstract value. 2594 27. IANA Considerations 2596 This document has no actions for the Internet Assigned Numbers 2597 Authority (IANA). 2599 28. References 2601 28.1. Normative References 2603 [BCP14] Bradner, S., "Key words for use in RFCs to Indicate 2604 Requirement Levels", BCP 14, RFC 2119, March 1997. 2606 [URI] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform 2607 Resource Identifiers (URI): Generic Syntax", STD 66, RFC 2608 3986, January 2005. 2610 [RXER] Legg, S. and D. Prager, "Robust XML Encoding Rules (RXER) 2611 for Abstract Syntax Notation One (ASN.1)", 2612 draft-legg-xed-rxer-xx.txt, a work in progress, December 2613 2006. 
2615 [ASN.X] Legg, S., "Abstract Syntax Notation X (ASN.X)", 2616 draft-legg-xed-asd-xx.txt, a work in progress, December 2617 2006. 2619 [X.680] ITU-T Recommendation X.680 (07/02) | ISO/IEC 8824-1, 2620 Information technology - Abstract Syntax Notation One 2621 (ASN.1): Specification of basic notation. 2623 [X.680-1] ITU-T Recommendation X.680 (2002) Amendment 1 (10/03) | 2624 ISO/IEC 8824-1:2002/Amd 1:2004, Support for EXTENDED-XER. 2626 [X.683] ITU-T Recommendation X.683 (07/02) | ISO/IEC 8824-4, 2627 Information technology - Abstract Syntax Notation One 2628 (ASN.1): Parameterization of ASN.1 specifications. 2630 [XML10] Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E. and 2631 F. Yergeau, "Extensible Markup Language (XML) 1.0 (Fourth 2632 Edition)", W3C Recommendation, 2633 http://www.w3.org/TR/2006/REC-xml-20060816, August 2006. 2635 [XMLNS10] Bray, T., Hollander, D., Layman, A., and R. Tobin, 2636 "Namespaces in XML 1.0 (Second Edition)", W3C 2637 Recommendation, 2638 http://www.w3.org/TR/2006/REC-xml-names-20060816, August 2639 2006. 2641 [XSD1] Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn, 2642 "XML Schema Part 1: Structures Second Edition", W3C 2643 Recommendation, 2644 http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/, 2645 October 2004. 2647 [XSD2] Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes 2648 Second Edition", W3C Recommendation, 2649 http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/, 2650 October 2004. 2652 [RNG] Clark, J. and M. Makoto, "RELAX NG Tutorial", OASIS 2653 Committee Specification, http://www.oasis- 2654 open.org/committees/relax-ng/tutorial-20011203.html, 2655 December 2001. 2657 28.2. Informative References 2659 [INFOSET] Cowan, J. and R. Tobin, "XML Information Set (Second 2660 Edition)", W3C Recommendation, 2661 http://www.w3.org/TR/2004/REC-xml-infoset-20040204, 2662 February 2004. 
2664 [X.690] ITU-T Recommendation X.690 (07/02) | ISO/IEC 8825-1, 2665 Information technology - ASN.1 encoding rules: 2666 Specification of Basic Encoding Rules (BER), Canonical 2667 Encoding Rules (CER) and Distinguished Encoding Rules 2668 (DER). 2670 Appendix A. GROUP Encoding Instruction Examples 2672 This appendix is non-normative. 2674 This appendix contains examples of both correct and incorrect use of 2675 the GROUP encoding instruction, determined with respect to the 2676 grammars derived from the example type definitions. The productions 2677 of the grammars are labeled for convenience. Sets and predicates for 2678 non-terminals with only one production will be omitted from the 2679 examples since they never indicate non-determinism. 2681 The requirements of Section 25.1.2 (unique component attribution) are 2682 satisfied by all the examples in this appendix and the appendices 2683 that follow it. 2685 A.1. Example 1 2687 Consider this type definition: 2689 SEQUENCE { 2690 one [GROUP] SEQUENCE { 2691 two UTF8String OPTIONAL 2692 } OPTIONAL, 2693 three INTEGER 2694 } 2696 The associated grammar is: 2698 P1: S ::= one three 2699 P2: one ::= two 2700 P3: one ::= 2701 P4: two ::= "two" 2702 P5: two ::= 2703 P6: three ::= "three" 2705 Select Sets have to be evaluated to test the validity of the type 2706 definition. 
The grammar leads to the following sets and predicates: 2708 First(P2) = { "two" } 2709 First(P3) = { } 2710 Preselected(P2) = Preselected(P3) = false 2711 Empty(P2) = Empty(P3) = true 2712 Follow(one) = { "three" } 2713 Select(P2) = First(P2) + Follow(one) = { "two", "three" } 2714 Select(P3) = First(P3) + Follow(one) = { "three" } 2716 First(P4) = { "two" } 2717 First(P5) = { } 2718 Preselected(P4) = Preselected(P5) = Empty(P4) = false 2719 Empty(P5) = true 2720 Follow(two) = { "three" } 2721 Select(P4) = First(P4) = { "two" } 2722 Select(P5) = First(P5) + Follow(two) = { "three" } 2724 The intersection of Select(P2) and Select(P3) is not empty, hence the 2725 grammar is not deterministic and the type definition is not valid. 2726 If the RXER encoding of a value of the type does not have a child 2727 element <two>, then it is not possible to determine whether the "one" 2728 component is present or absent in the value. 2730 Now consider this type definition with attributes in the "one" 2731 component: 2733 SEQUENCE { 2734 one [GROUP] SEQUENCE { 2735 two UTF8String OPTIONAL, 2736 four [ATTRIBUTE] BOOLEAN, 2737 five [ATTRIBUTE] BOOLEAN OPTIONAL 2738 } OPTIONAL, 2739 three INTEGER 2740 } 2742 The associated grammar is: 2744 P1: S ::= one three 2745 P2: one ::= two four five 2746 P3: one ::= 2747 P4: two ::= "two" 2748 P5: two ::= 2749 P6: four ::= "@four" 2750 P7: five ::= "@five" 2751 P8: five ::= 2752 P9: three ::= "three" 2754 This grammar leads to the following sets and predicates: 2756 First(P2) = { "two" } 2757 First(P3) = { } 2758 Preselected(P3) = Empty(P2) = false 2759 Preselected(P2) = Empty(P3) = true 2760 Follow(one) = { "three" } 2761 Select(P2) = { } 2762 Select(P3) = First(P3) + Follow(one) = { "three" } 2764 First(P4) = { "two" } 2765 First(P5) = { } 2766 Preselected(P4) = Preselected(P5) = Empty(P4) = false 2767 Empty(P5) = true 2768 Follow(two) = { "three" } 2769 Select(P4) = First(P4) = { "two" } 2770 Select(P5) = First(P5) + Follow(two) = { "three" } 
2772 First(P7) = { } 2773 First(P8) = { } 2774 Preselected(P8) = Empty(P7) = false 2775 Preselected(P7) = Empty(P8) = true 2776 Follow(five) = { "three" } 2777 Select(P7) = { } 2778 Select(P8) = First(P8) + Follow(five) = { "three" } 2780 The intersection of Select(P2) and Select(P3) is empty, as is the 2781 intersection of Select(P4) and Select(P5) and the intersection of 2782 Select(P7) and Select(P8), hence the grammar is deterministic and the 2783 type definition is valid. In a correct RXER encoding, the "one" 2784 component will be present if and only if the "four" attribute is 2785 present. 2787 A.2. Example 2 2789 Consider this type definition: 2791 CHOICE { 2792 one [GROUP] SEQUENCE { 2793 two [ATTRIBUTE] BOOLEAN OPTIONAL 2794 }, 2795 three INTEGER, 2796 four [GROUP] SEQUENCE { 2797 five BOOLEAN OPTIONAL 2798 } 2799 } 2801 The associated grammar is: 2803 P1: S ::= one 2804 P2: S ::= three 2805 P3: S ::= four 2806 P4: one ::= two 2807 P5: two ::= "@two" 2808 P6: two ::= 2809 P7: three ::= "three" 2810 P8: four ::= five 2811 P9: five ::= "five" 2812 P10: five ::= 2814 This grammar leads to the following sets and predicates: 2816 First(P1) = { } 2817 First(P2) = { "three" } 2818 First(P3) = { "five" } 2819 Preselected(P1) = Preselected(P2) = Preselected(P3) = false 2820 Empty(P2) = false 2821 Empty(P1) = Empty(P3) = true 2822 Follow(S) = { "$" } 2823 Select(P1) = First(P1) + Follow(S) = { "$" } 2824 Select(P2) = First(P2) = { "three" } 2825 Select(P3) = First(P3) + Follow(S) = { "five", "$" } 2827 First(P5) = { } 2828 First(P6) = { } 2829 Preselected(P6) = Empty(P5) = false 2830 Preselected(P5) = Empty(P6) = true 2831 Follow(two) = { "$" } 2832 Select(P5) = { } 2833 Select(P6) = First(P6) + Follow(two) = { "$" } 2835 First(P9) = { "five" } 2836 First(P10) = { } 2837 Preselected(P9) = Preselected(P10) = Empty(P9) = false 2838 Empty(P10) = true 2839 Follow(five) = { "$" } 2840 Select(P9) = First(P9) = { "five" } 2841 Select(P10) = First(P10) + Follow(five) = { 
"$" } 2843 The intersection of Select(P1) and Select(P3) is not empty, hence the 2844 grammar is not deterministic and the type definition is not valid. 2845 If the RXER encoding of a value of the type is empty, then it is not 2846 possible to determine whether the "one" alternative or the "four" 2847 alternative has been chosen. 2849 Now consider this slightly different type definition: 2851 CHOICE { 2852 one [GROUP] SEQUENCE { 2853 two [ATTRIBUTE] BOOLEAN 2854 }, 2855 three INTEGER, 2856 four [GROUP] SEQUENCE { 2857 five BOOLEAN OPTIONAL 2858 } 2859 } 2861 The associated grammar is: 2863 P1: S ::= one 2864 P2: S ::= three 2865 P3: S ::= four 2866 P4: one ::= two 2867 P5: two ::= "@two" 2868 P6: three ::= "three" 2869 P7: four ::= five 2870 P8: five ::= "five" 2871 P9: five ::= 2873 This grammar leads to the following sets and predicates: 2875 First(P1) = { } 2876 First(P2) = { "three" } 2877 First(P3) = { "five" } 2878 Preselected(P2) = Preselected(P3) = false 2879 Empty(P1) = Empty(P2) = false 2880 Preselected(P1) = Empty(P3) = true 2881 Follow(S) = { "$" } 2882 Select(P1) = { } 2883 Select(P2) = First(P2) = { "three" } 2884 Select(P3) = First(P3) + Follow(S) = { "five", "$" } 2886 First(P8) = { "five" } 2887 First(P9) = { } 2888 Preselected(P8) = Preselected(P9) = Empty(P8) = false 2889 Empty(P9) = true 2890 Follow(five) = { "$" } 2891 Select(P8) = First(P8) = { "five" } 2892 Select(P9) = First(P9) + Follow(five) = { "$" } 2894 The intersection of Select(P1) and Select(P2) is empty, the 2895 intersection of Select(P1) and Select(P3) is empty, the intersection 2896 of Select(P2) and Select(P3) is empty and the intersection of 2897 Select(P8) and Select(P9) is empty, hence the grammar is 2898 deterministic and the type definition is valid. The "one" and "four" 2899 alternatives can be distinguished because the "one" alternative has a 2900 mandatory attribute. 2902 A.3. 
Example 3 2904 Consider this type definition: 2906 SEQUENCE { 2907 one [GROUP] CHOICE { 2908 two [ATTRIBUTE] BOOLEAN, 2909 three [GROUP] SEQUENCE OF number INTEGER 2910 } OPTIONAL 2911 } 2913 The associated grammar is: 2915 P1: S ::= one 2916 P2: one ::= two 2917 P3: one ::= three 2918 P4: one ::= 2919 P5: two ::= "@two" 2920 P6: three ::= number three 2921 P7: three ::= 2922 P8: number ::= "number" 2924 This grammar leads to the following sets and predicates: 2926 First(P2) = { } 2927 First(P3) = { "number" } 2928 First(P4) = { } 2929 Preselected(P3) = Preselected(P4) = Empty(P2) = false 2930 Preselected(P2) = Empty(P3) = Empty(P4) = true 2931 Follow(one) = { "$" } 2932 Select(P2) = { } 2933 Select(P3) = First(P3) + Follow(one) = { "number", "$" } 2934 Select(P4) = First(P4) + Follow(one) = { "$" } 2936 First(P6) = { "number" } 2937 First(P7) = { } 2938 Preselected(P6) = Preselected(P7) = Empty(P6) = false 2939 Empty(P7) = true 2940 Follow(three) = { "$" } 2941 Select(P6) = First(P6) = { "number" } 2942 Select(P7) = First(P7) + Follow(three) = { "$" } 2944 The intersection of Select(P3) and Select(P4) is not empty, hence the 2945 grammar is not deterministic and the type definition is not valid. 2946 If the RXER encoding of a value of the type is empty, then it is not 2947 possible to determine whether the "one" component is absent or the 2948 empty "three" alternative has been chosen. 2950 A.4. 
Example 4 2952 Consider this type definition: 2954 SEQUENCE { 2955 one [GROUP] CHOICE { 2956 two [ATTRIBUTE] BOOLEAN, 2957 three [ATTRIBUTE] BOOLEAN 2958 } OPTIONAL 2959 } 2961 The associated grammar is: 2963 P1: S ::= one 2964 P2: one ::= two 2965 P3: one ::= three 2966 P4: one ::= 2967 P5: two ::= "@two" 2968 P6: three ::= "@three" 2970 This grammar leads to the following sets and predicates: 2972 First(P2) = { } 2973 First(P3) = { } 2974 First(P4) = { } 2975 Preselected(P4) = Empty(P2) = Empty(P3) = false 2976 Preselected(P2) = Preselected(P3) = Empty(P4) = true 2977 Follow(one) = { "$" } 2978 Select(P2) = { } 2979 Select(P3) = { } 2980 Select(P4) = First(P4) + Follow(one) = { "$" } 2982 The intersection of Select(P2) and Select(P3) is empty, the 2983 intersection of Select(P2) and Select(P4) is empty and the 2984 intersection of Select(P3) and Select(P4) is empty, hence the grammar 2985 is deterministic and the type definition is valid. 2987 A.5. Example 5 2989 Consider this type definition: 2991 SEQUENCE { 2992 one [GROUP] SEQUENCE OF number INTEGER OPTIONAL 2993 } 2995 The associated grammar is: 2997 P1: S ::= one 2998 P2: one ::= number one 2999 P3: one ::= 3000 P4: one ::= 3001 P5: number ::= "number" 3003 P3 is generated during the processing of the SEQUENCE OF type. P4 is 3004 generated because the "one" component is optional. 3006 This grammar leads to the following sets and predicates: 3008 First(P2) = { "number" } 3009 First(P3) = { } 3010 First(P4) = { } 3011 Preselected(P2) = Preselected(P3) = Preselected(P4) = false 3012 Empty(P2) = false 3013 Empty(P3) = Empty(P4) = true 3014 Follow(one) = { "$" } 3015 Select(P2) = First(P2) = { "number" } 3016 Select(P3) = First(P3) + Follow(one) = { "$" } 3017 Select(P4) = First(P4) + Follow(one) = { "$" } 3019 The intersection of Select(P3) and Select(P4) is not empty, hence the 3020 grammar is not deterministic and the type definition is not valid. 
3021 If the RXER encoding of a value of the type does not have any 3022 <number> child elements, then it is not possible to determine whether 3023 the "one" component is present or absent in the value. 3025 Consider this similar type definition with a SIZE constraint: 3027 SEQUENCE { 3028 one [GROUP] SEQUENCE SIZE(1..MAX) OF number INTEGER OPTIONAL 3029 } 3031 The associated grammar is: 3033 P1: S ::= one 3034 P2: one ::= number one' 3035 P3: one' ::= number one' 3036 P4: one' ::= 3037 P5: one ::= 3038 P6: number ::= "number" 3040 This grammar leads to the following sets and predicates: 3042 First(P2) = { "number" } 3043 First(P5) = { } 3044 Preselected(P2) = Preselected(P5) = Empty(P2) = false 3045 Empty(P5) = true 3046 Follow(one) = { "$" } 3047 Select(P2) = First(P2) = { "number" } 3048 Select(P5) = First(P5) + Follow(one) = { "$" } 3050 First(P3) = { "number" } 3051 First(P4) = { } 3052 Preselected(P3) = Preselected(P4) = Empty(P3) = false 3053 Empty(P4) = true 3054 Follow(one') = { "$" } 3055 Select(P3) = First(P3) = { "number" } 3056 Select(P4) = First(P4) + Follow(one') = { "$" } 3058 The intersection of Select(P2) and Select(P5) is empty, as is the 3059 intersection of Select(P3) and Select(P4), hence the grammar is 3060 deterministic and the type definition is valid. If there are no 3061 <number> child elements, then the "one" component is necessarily 3062 absent and there is no ambiguity. 3064 A.6. 
Example 6 3066 Consider this type definition: 3068 SEQUENCE { 3069 beginning [GROUP] List, 3070 middle UTF8String OPTIONAL, 3071 end [GROUP] List 3072 } 3074 List ::= SEQUENCE OF string UTF8String 3076 The associated grammar is: 3078 P1: S ::= beginning middle end 3079 P2: beginning ::= string beginning 3080 P3: beginning ::= 3081 P4: middle ::= "middle" 3082 P5: middle ::= 3083 P6: end ::= string end 3084 P7: end ::= 3085 P8: string ::= "string" 3087 This grammar leads to the following sets and predicates: 3089 First(P2) = { "string" } 3090 First(P3) = { } 3091 Preselected(P2) = Preselected(P3) = Empty(P2) = false 3092 Empty(P3) = true 3093 Follow(beginning) = { "middle", "string", "$" } 3094 Select(P2) = First(P2) = { "string" } 3095 Select(P3) = First(P3) + Follow(beginning) 3096 = { "middle", "string", "$" } 3098 First(P4) = { "middle" } 3099 First(P5) = { } 3100 Preselected(P4) = Preselected(P5) = Empty(P4) = false 3101 Empty(P5) = true 3102 Follow(middle) = { "string", "$" } 3103 Select(P4) = First(P4) = { "middle" } 3104 Select(P5) = First(P5) + Follow(middle) = { "string", "$" } 3106 First(P6) = { "string" } 3107 First(P7) = { } 3108 Preselected(P6) = Preselected(P7) = Empty(P6) = false 3109 Empty(P7) = true 3110 Follow(end) = { "$" } 3111 Select(P6) = First(P6) = { "string" } 3112 Select(P7) = First(P7) + Follow(end) = { "$" } 3114 The intersection of Select(P2) and Select(P3) is not empty, hence the 3115 grammar is not deterministic and the type definition is not valid. 
3117 Now consider the following type definition: 3119 SEQUENCE { 3120 beginning [GROUP] List, 3121 middleAndEnd [GROUP] SEQUENCE { 3122 middle UTF8String, 3123 end [GROUP] List 3124 } OPTIONAL 3125 } 3127 The associated grammar is: 3129 P1: S ::= beginning middleAndEnd 3130 P2: beginning ::= string beginning 3131 P3: beginning ::= 3132 P4: middleAndEnd ::= middle end 3133 P5: middleAndEnd ::= 3134 P6: middle ::= "middle" 3135 P7: end ::= string end 3136 P8: end ::= 3137 P9: string ::= "string" 3139 This grammar leads to the following sets and predicates: 3141 First(P2) = { "string" } 3142 First(P3) = { } 3143 Preselected(P2) = Preselected(P3) = Empty(P2) = false 3144 Empty(P3) = true 3145 Follow(beginning) = { "middle", "$" } 3146 Select(P2) = First(P2) = { "string" } 3147 Select(P3) = First(P3) + Follow(beginning) = { "middle", "$" } 3149 First(P4) = { "middle" } 3150 First(P5) = { } 3151 Preselected(P4) = Preselected(P5) = Empty(P4) = false 3152 Empty(P5) = true 3153 Follow(middleAndEnd) = { "$" } 3154 Select(P4) = First(P4) = { "middle" } 3155 Select(P5) = First(P5) + Follow(middleAndEnd) = { "$" } 3157 First(P7) = { "string" } 3158 First(P8) = { } 3159 Preselected(P7) = Preselected(P8) = Empty(P7) = false 3160 Empty(P8) = true 3161 Follow(end) = { "$" } 3162 Select(P7) = First(P7) = { "string" } 3163 Select(P8) = First(P8) + Follow(end) = { "$" } 3165 The intersection of Select(P2) and Select(P3) is empty, as is the 3166 intersection of Select(P4) and Select(P5) and the intersection of 3167 Select(P7) and Select(P8), hence the grammar is deterministic and the 3168 type definition is valid. 3170 A.7. 
Example 7

   Consider the following type definition:

      SEQUENCE SIZE(1..MAX) OF
          one [GROUP] SEQUENCE {
              two  INTEGER OPTIONAL
          }

   The associated grammar is:

      P1:  S ::= one S'
      P2:  S' ::= one S'
      P3:  S' ::=
      P4:  one ::= two
      P5:  two ::= "two"
      P6:  two ::=

   This grammar leads to the following sets and predicates:

      First(P2) = { "two" }
      First(P3) = { }
      Preselected(P2) = Preselected(P3) = false
      Empty(P2) = Empty(P3) = true
      Follow(S') = { "$" }
      Select(P2) = First(P2) + Follow(S') = { "two", "$" }
      Select(P3) = First(P3) + Follow(S') = { "$" }

      First(P5) = { "two" }
      First(P6) = { }
      Preselected(P5) = Preselected(P6) = Empty(P5) = false
      Empty(P6) = true
      Follow(two) = { "two", "$" }
      Select(P5) = First(P5) = { "two" }
      Select(P6) = First(P6) + Follow(two) = { "two", "$" }

   The intersection of Select(P2) and Select(P3) is not empty and the
   intersection of Select(P5) and Select(P6) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   The encoding of a value of the type contains an indeterminate number
   of empty instances of the component type.

A.8.  Example 8

   Consider the following type definition:

      SEQUENCE OF
          list [GROUP] SEQUENCE SIZE(1..MAX) OF number INTEGER

   The associated grammar is:

      P1:  S ::= list S
      P2:  S ::=
      P3:  list ::= number list'
      P4:  list' ::= number list'
      P5:  list' ::=
      P6:  number ::= "number"

   This grammar leads to the following sets and predicates:

      First(P1) = { "number" }
      First(P2) = { }
      Preselected(P1) = Preselected(P2) = Empty(P1) = false
      Empty(P2) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) = { "number" }
      Select(P2) = First(P2) + Follow(S) = { "$" }

      First(P4) = { "number" }
      First(P5) = { }
      Preselected(P4) = Preselected(P5) = Empty(P4) = false
      Empty(P5) = true
      Follow(list') = { "number", "$" }
      Select(P4) = First(P4) = { "number" }
      Select(P5) = First(P5) + Follow(list') = { "number", "$" }

   The intersection of Select(P4) and Select(P5) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   The type describes a list of lists but it is not possible for a
   decoder to determine where the outer lists begin and end.

A.9.  Example 9

   Consider the following type definition:

      SEQUENCE OF item [GROUP] SEQUENCE {
          before  [GROUP] OneAndTwo,
          core    UTF8String,
          after   [GROUP] OneAndTwo OPTIONAL
      }

      OneAndTwo ::= SEQUENCE {
          non-core  UTF8String
      }

   The associated grammar is:

      P1:  S ::= item S
      P2:  S ::=
      P3:  item ::= before core after
      P4:  before ::= non-core
      P5:  non-core ::= "non-core"
      P6:  core ::= "core"
      P7:  after ::= non-core
      P8:  after ::=

   This grammar leads to the following sets and predicates:

      First(P1) = { "non-core" }
      First(P2) = { }
      Preselected(P1) = Preselected(P2) = Empty(P1) = false
      Empty(P2) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) = { "non-core" }
      Select(P2) = First(P2) + Follow(S) = { "$" }

      First(P7) = { "non-core" }
      First(P8) = { }
      Preselected(P7) = Preselected(P8) = Empty(P7) = false
      Empty(P8) = true
      Follow(after) = { "non-core", "$" }
      Select(P7) = First(P7) = { "non-core" }
      Select(P8) = First(P8) + Follow(after) = { "non-core", "$" }

   The intersection of Select(P7) and Select(P8) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   There is ambiguity between the end of one item and the start of the
   next.  Without looking ahead in an encoding, it is not possible to
   determine whether a <non-core> element belongs with the preceding or
   following <core> element.

A.10.  Example 10

   Consider the following type definition:

      CHOICE {
          one  [GROUP] List,
          two  [GROUP] SEQUENCE {
              three  [ATTRIBUTE] UTF8String,
              four   [GROUP] List
          }
      }

      List ::= SEQUENCE OF string UTF8String

   The associated grammar is:

      P1:  S ::= one
      P2:  S ::= two
      P3:  one ::= string one
      P4:  one ::=
      P5:  two ::= three four
      P6:  three ::= "@three"
      P7:  four ::= string four
      P8:  four ::=
      P9:  string ::= "string"

   This grammar leads to the following sets and predicates:

      First(P1) = { "string" }
      First(P2) = { "string" }
      Preselected(P1) = Empty(P2) = false
      Preselected(P2) = Empty(P1) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) + Follow(S) = { "string", "$" }
      Select(P2) = { }

      First(P3) = { "string" }
      First(P4) = { }
      Preselected(P3) = Preselected(P4) = Empty(P3) = false
      Empty(P4) = true
      Follow(one) = { "$" }
      Select(P3) = First(P3) = { "string" }
      Select(P4) = First(P4) + Follow(one) = { "$" }

      First(P7) = { "string" }
      First(P8) = { }
      Preselected(P7) = Preselected(P8) = Empty(P7) = false
      Empty(P8) = true
      Follow(four) = { "$" }
      Select(P7) = First(P7) = { "string" }
      Select(P8) = First(P8) + Follow(four) = { "$" }

   The intersection of Select(P1) and Select(P2) is empty, as is the
   intersection of Select(P3) and Select(P4) and the intersection of
   Select(P7) and Select(P8), hence the grammar is deterministic and the
   type definition is valid.  Although both alternatives of the CHOICE
   can begin with a <string> element, an RXER decoder would use the
   presence of a "three" attribute to decide whether to select or
   disregard the "two" alternative.

   However, an attribute in an extension cannot be used to select
   between alternatives.
   Consider the following type definition:

      [SINGULAR-INSERTIONS] CHOICE {
          one  [GROUP] List,
          ...,
          two  [GROUP] SEQUENCE {
              three  [ATTRIBUTE] UTF8String,
              four   [GROUP] List
          } -- ExtensionAdditionAlternative (E1).
          -- The extension insertion point is here (I1).
      }

      List ::= SEQUENCE OF string UTF8String

   The associated grammar is:

      P1:  S ::= one
      P10:  S ::= E1
      P11:  S ::= "*"
      P12:  E1 ::= two
      P3:  one ::= string one
      P4:  one ::=
      P5:  two ::= three four
      P6:  three ::= "@three"
      P7:  four ::= string four
      P8:  four ::=
      P9:  string ::= "string"

   This grammar leads to the following sets and predicates for P1, P10
   and P11:

      First(P1) = { "string" }
      First(P10) = { "string" }
      First(P11) = { "*" }
      Preselected(P1) = Preselected(P10) = Preselected(P11) = false
      Empty(P10) = Empty(P11) = false
      Empty(P1) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) + Follow(S) = { "string", "$" }
      Select(P10) = First(P10) = { "string" }
      Select(P11) = First(P11) = { "*" }

   Preselected(P10) evaluates to false because Preselected(P10) is
   evaluated on the base grammar, wherein P10 is rewritten as:

      P10:  S ::=

   The intersection of Select(P1) and Select(P10) is not empty, hence
   the grammar is not deterministic and the type definition is not
   valid.  An RXER decoder using the original, unextended version of the
   definition would not know that the "three" attribute selects between
   the "one" alternative and the extension.

Appendix B.  Insertion Encoding Instruction Examples

   This appendix is non-normative.

   This appendix contains examples showing the use of insertion encoding
   instructions to remove extension ambiguity arising from use of the
   GROUP encoding instruction.

B.1.  Example 1

   Consider the following type definition:

      SEQUENCE {
          one    [GROUP] SEQUENCE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  INTEGER OPTIONAL,
          ... -- Extension insertion point (I2).
      }

   The associated grammar is:

      P1:  S ::= one three I2
      P2:  one ::= two I1
      P3:  two ::= "two"
      P4:  I1 ::= "*" I1
      P5:  I1 ::=
      P6:  three ::= "three"
      P7:  three ::=
      P8:  I2 ::= "*" I2
      P9:  I2 ::=

   This grammar leads to the following sets and predicates:

      First(P4) = { "*" }
      First(P5) = { }
      Preselected(P4) = Preselected(P5) = Empty(P4) = false
      Empty(P5) = true
      Follow(I1) = { "three", "*", "$" }
      Select(P4) = First(P4) = { "*" }
      Select(P5) = First(P5) + Follow(I1) = { "three", "*", "$" }

      First(P6) = { "three" }
      First(P7) = { }
      Preselected(P6) = Preselected(P7) = Empty(P6) = false
      Empty(P7) = true
      Follow(three) = { "*", "$" }
      Select(P6) = First(P6) = { "three" }
      Select(P7) = First(P7) + Follow(three) = { "*", "$" }

      First(P8) = { "*" }
      First(P9) = { }
      Preselected(P8) = Preselected(P9) = Empty(P8) = false
      Empty(P9) = true
      Follow(I2) = { "$" }
      Select(P8) = First(P8) = { "*" }
      Select(P9) = First(P9) + Follow(I2) = { "$" }

   The intersection of Select(P4) and Select(P5) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   If an RXER decoder encounters an unrecognized element immediately
   after a <two> element, then it will not know whether to associate it
   with extension insertion point I1 or I2.

   The non-determinism can be resolved with either a NO-INSERTIONS or
   HOLLOW-INSERTIONS encoding instruction.  Consider this revised type
   definition:

      SEQUENCE {
          one    [GROUP] [HOLLOW-INSERTIONS] SEQUENCE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  INTEGER OPTIONAL,
          ... -- Extension insertion point (I2).
      }

   The associated grammar is:

      P1:  S ::= one three I2
      P10:  one ::= two
      P3:  two ::= "two"
      P6:  three ::= "three"
      P7:  three ::=
      P8:  I2 ::= "*" I2
      P9:  I2 ::=

   With the addition of the HOLLOW-INSERTIONS encoding instruction, the
   P4 and P5 productions are no longer generated and the conflict
   between Select(P4) and Select(P5) no longer exists.  The Select Sets
   for P6, P7, P8 and P9 are unchanged.  A decoder will now assume that
   an unrecognized element is to be associated with extension insertion
   point I2.  It is still free to associate an unrecognized attribute
   with either extension insertion point.  If a NO-INSERTIONS encoding
   instruction had been used, then an unrecognized attribute could only
   be associated with extension insertion point I2.

   The non-determinism could also be resolved by adding a NO-INSERTIONS
   or HOLLOW-INSERTIONS encoding instruction to the outer SEQUENCE:

      [HOLLOW-INSERTIONS] SEQUENCE {
          one    [GROUP] SEQUENCE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  INTEGER OPTIONAL,
          ... -- Extension insertion point (I2).
      }

   The associated grammar is:

      P11:  S ::= one three
      P2:  one ::= two I1
      P3:  two ::= "two"
      P4:  I1 ::= "*" I1
      P5:  I1 ::=
      P6:  three ::= "three"
      P7:  three ::=

   This grammar leads to the following sets and predicates:

      First(P4) = { "*" }
      First(P5) = { }
      Preselected(P4) = Preselected(P5) = Empty(P4) = false
      Empty(P5) = true
      Follow(I1) = { "three", "$" }
      Select(P4) = First(P4) = { "*" }
      Select(P5) = First(P5) + Follow(I1) = { "three", "$" }

      First(P6) = { "three" }
      First(P7) = { }
      Preselected(P6) = Preselected(P7) = Empty(P6) = false
      Empty(P7) = true
      Follow(three) = { "$" }
      Select(P6) = First(P6) = { "three" }
      Select(P7) = First(P7) + Follow(three) = { "$" }

   The intersection of Select(P4) and Select(P5) is empty, as is the
   intersection of Select(P6) and Select(P7), hence the grammar is
   deterministic and the type definition is valid.  A decoder will now
   assume that an unrecognized element is to be associated with
   extension insertion point I1.  It is still free to associate an
   unrecognized attribute with either extension insertion point.  If a
   NO-INSERTIONS encoding instruction had been used, then an
   unrecognized attribute could only be associated with extension
   insertion point I1.

B.2.  Example 2

   Consider the following type definition:

      SEQUENCE {
          one  [GROUP] CHOICE {
              two  UTF8String,
              ... -- Extension insertion point (I1).
          } OPTIONAL
      }

   The associated grammar is:

      P1:  S ::= one
      P2:  one ::= two
      P3:  one ::= I1
      P4:  one ::=
      P5:  two ::= "two"
      P6:  I1 ::= "*" I1
      P7:  I1 ::=

   This grammar leads to the following sets and predicates:

      First(P2) = { "two" }
      First(P3) = { "*" }
      First(P4) = { }
      Preselected(P2) = Preselected(P3) = Preselected(P4) = false
      Empty(P2) = false
      Empty(P3) = Empty(P4) = true
      Follow(one) = { "$" }
      Select(P2) = First(P2) = { "two" }
      Select(P3) = First(P3) + Follow(one) = { "*", "$" }
      Select(P4) = First(P4) + Follow(one) = { "$" }

      First(P6) = { "*" }
      First(P7) = { }
      Preselected(P6) = Preselected(P7) = Empty(P6) = false
      Empty(P7) = true
      Follow(I1) = { "$" }
      Select(P6) = First(P6) = { "*" }
      Select(P7) = First(P7) + Follow(I1) = { "$" }

   The intersection of Select(P3) and Select(P4) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   If the <two> element is not present, then a decoder cannot determine
   whether the "one" alternative is absent, or present with an unknown
   extension that generates no elements.

   The non-determinism can be resolved with either a
   SINGULAR-INSERTIONS, UNIFORM-INSERTIONS or MULTIFORM-INSERTIONS
   encoding instruction.  The MULTIFORM-INSERTIONS encoding instruction
   is the least restrictive.  Consider this revised type definition:

      SEQUENCE {
          one  [GROUP] [MULTIFORM-INSERTIONS] CHOICE {
              two  UTF8String,
              ... -- Extension insertion point (I1).
          } OPTIONAL
      }

   The associated grammar is:

      P1:  S ::= one
      P2:  one ::= two
      P8:  one ::= "*" I1
      P4:  one ::=
      P5:  two ::= "two"
      P6:  I1 ::= "*" I1
      P7:  I1 ::=

   This grammar leads to the following sets and predicates:

      First(P2) = { "two" }
      First(P8) = { "*" }
      First(P4) = { }
      Preselected(P2) = Preselected(P8) = Preselected(P4) = false
      Empty(P2) = Empty(P8) = false
      Empty(P4) = true
      Follow(one) = { "$" }
      Select(P2) = First(P2) = { "two" }
      Select(P8) = First(P8) = { "*" }
      Select(P4) = First(P4) + Follow(one) = { "$" }

      First(P6) = { "*" }
      First(P7) = { }
      Preselected(P6) = Preselected(P7) = Empty(P6) = false
      Empty(P7) = true
      Follow(I1) = { "$" }
      Select(P6) = First(P6) = { "*" }
      Select(P7) = First(P7) + Follow(I1) = { "$" }

   The intersection of Select(P2) and Select(P8) is empty, as is the
   intersection of Select(P2) and Select(P4), the intersection of
   Select(P8) and Select(P4) and the intersection of Select(P6) and
   Select(P7), hence the grammar is deterministic and the type
   definition is valid.  A decoder will now assume the "one" alternative
   is present if it sees at least one unrecognized element, and absent
   otherwise.

B.3.  Example 3

   Consider the following type definition:

      SEQUENCE {
          one    [GROUP] CHOICE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  [GROUP] CHOICE {
              four   UTF8String,
              ... -- Extension insertion point (I2).
          }
      }

   The associated grammar is:

      P1:  S ::= one three
      P2:  one ::= two
      P3:  one ::= I1
      P4:  two ::= "two"
      P5:  I1 ::= "*" I1
      P6:  I1 ::=
      P7:  three ::= four
      P8:  three ::= I2
      P9:  four ::= "four"
      P10:  I2 ::= "*" I2
      P11:  I2 ::=

   This grammar leads to the following sets and predicates:

      First(P2) = { "two" }
      First(P3) = { "*" }
      Preselected(P2) = Preselected(P3) = Empty(P2) = false
      Empty(P3) = true
      Follow(one) = { "four", "*", "$" }
      Select(P2) = First(P2) = { "two" }
      Select(P3) = First(P3) + Follow(one) = { "*", "four", "$" }

      First(P5) = { "*" }
      First(P6) = { }
      Preselected(P5) = Preselected(P6) = Empty(P5) = false
      Empty(P6) = true
      Follow(I1) = { "four", "*", "$" }
      Select(P5) = First(P5) = { "*" }
      Select(P6) = First(P6) + Follow(I1) = { "four", "*", "$" }

      First(P7) = { "four" }
      First(P8) = { "*" }
      Preselected(P7) = Preselected(P8) = Empty(P7) = false
      Empty(P8) = true
      Follow(three) = { "$" }
      Select(P7) = First(P7) = { "four" }
      Select(P8) = First(P8) + Follow(three) = { "*", "$" }

      First(P10) = { "*" }
      First(P11) = { }
      Preselected(P10) = Preselected(P11) = Empty(P10) = false
      Empty(P11) = true
      Follow(I2) = { "$" }
      Select(P10) = First(P10) = { "*" }
      Select(P11) = First(P11) + Follow(I2) = { "$" }

   The intersection of Select(P5) and Select(P6) is not empty, hence the
   grammar is not deterministic and the type definition is not valid.
   If the first child element is an unrecognized element, then a decoder
   cannot determine whether to associate it with extension insertion
   point I1, or to associate it with extension insertion point I2 by
   assuming that the "one" component has an unknown extension that
   generates no elements.

   The non-determinism can be resolved with either a SINGULAR-INSERTIONS
   or UNIFORM-INSERTIONS encoding instruction.
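
   The Select Set conflict reported for the unrevised B.3 grammar can be
   reproduced mechanically.  The following Python sketch is illustrative
   only and is not part of the specification: the function names
   select_sets and conflicts, and the dictionary representation of the
   grammar, are assumptions made for this example.  It ignores
   attributes and the Preselected predicate, which is adequate here
   because the B.3 grammar has no attribute terminals.

```python
from itertools import combinations

END = "$"  # end-of-input marker, as in the Follow sets above


def select_sets(grammar, start):
    """Return {(nonterminal, rhs): Select set} for every production.

    grammar maps each non-terminal to a list of right-hand sides, each
    a tuple of symbols.  Any symbol that is not a key of grammar is
    treated as a terminal.
    """
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    nullable = set()
    follow[start].add(END)

    def first_of(symbols):
        """First set of a symbol string, and whether it can be empty."""
        result = set()
        for sym in symbols:
            if sym not in grammar:          # terminal
                result.add(sym)
                return result, False
            result |= first[sym]
            if sym not in nullable:
                return result, False
        return result, True

    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for n, alternatives in grammar.items():
            for rhs in alternatives:
                f, empty = first_of(rhs)
                if not f <= first[n] or (empty and n not in nullable):
                    first[n] |= f
                    if empty:
                        nullable.add(n)
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in grammar:
                        f, empty = first_of(rhs[i + 1:])
                        new = f | (follow[n] if empty else set())
                        if not new <= follow[sym]:
                            follow[sym] |= new
                            changed = True

    select = {}
    for n, alternatives in grammar.items():
        for rhs in alternatives:
            f, empty = first_of(rhs)
            select[(n, rhs)] = f | (follow[n] if empty else set())
    return select


def conflicts(grammar, start="S"):
    """List (nonterminal, rhs_a, rhs_b) whose Select sets intersect."""
    select = select_sets(grammar, start)
    return [(n, a, b)
            for n, alternatives in grammar.items()
            for a, b in combinations(alternatives, 2)
            if select[(n, a)] & select[(n, b)]]


# The B.3 grammar before any insertion encoding instruction is applied.
B3 = {
    "S":     [("one", "three")],
    "one":   [("two",), ("I1",)],
    "two":   [('"two"',)],
    "I1":    [("*", "I1"), ()],
    "three": [("four",), ("I2",)],
    "four":  [('"four"',)],
    "I2":    [("*", "I2"), ()],
}

print(conflicts(B3))  # the only conflict reported is on I1 (P5 vs. P6)
```

   Replacing the "one" alternatives with [("two",), ("*",)] and dropping
   the I1 entry, which mirrors the effect of SINGULAR-INSERTIONS on the
   grammar, makes conflicts() return an empty list.
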
   Consider this revised type definition using the SINGULAR-INSERTIONS
   encoding instruction:

      SEQUENCE {
          one    [GROUP] [SINGULAR-INSERTIONS] CHOICE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  [GROUP] CHOICE {
              four   UTF8String,
              ... -- Extension insertion point (I2).
          }
      }

   The associated grammar is:

      P1:  S ::= one three
      P2:  one ::= two
      P12:  one ::= "*"
      P4:  two ::= "two"
      P7:  three ::= four
      P8:  three ::= I2
      P9:  four ::= "four"
      P10:  I2 ::= "*" I2
      P11:  I2 ::=

   With the addition of the SINGULAR-INSERTIONS encoding instruction,
   the P5 and P6 productions are no longer generated.  The grammar leads
   to the following sets and predicates for the P2 and P12 productions:

      First(P2) = { "two" }
      First(P12) = { "*" }
      Preselected(P2) = Preselected(P12) = false
      Empty(P2) = Empty(P12) = false
      Follow(one) = { "four", "*", "$" }
      Select(P2) = First(P2) = { "two" }
      Select(P12) = First(P12) = { "*" }

   The sets for P5 and P6 are no longer generated and the remaining sets
   are unchanged.

   The intersection of Select(P2) and Select(P12) is empty, as is the
   intersection of Select(P7) and Select(P8) and the intersection of
   Select(P10) and Select(P11), hence the grammar is deterministic and
   the type definition is valid.  If the first child element is an
   unrecognized element, then a decoder will now assume that it is
   associated with extension insertion point I1.  Whatever follows,
   possibly including another unrecognized element, will belong to the
   "three" component.

   Now consider the type definition using the UNIFORM-INSERTIONS
   encoding instruction instead:

      SEQUENCE {
          one    [GROUP] [UNIFORM-INSERTIONS] CHOICE {
              two    UTF8String,
              ... -- Extension insertion point (I1).
          },
          three  [GROUP] CHOICE {
              four   UTF8String,
              ... -- Extension insertion point (I2).
          }
      }

   The associated grammar is:

      P1:  S ::= one three
      P2:  one ::= two
      P13:  one ::= "*"
      P14:  one ::= "*1" I1
      P4:  two ::= "two"
      P15:  I1 ::= "*1" I1
      P6:  I1 ::=
      P7:  three ::= four
      P8:  three ::= I2
      P9:  four ::= "four"
      P10:  I2 ::= "*" I2
      P11:  I2 ::=

   This grammar leads to the following sets and predicates for the P2,
   P13, P14, P15 and P6 productions:

      First(P2) = { "two" }
      First(P13) = { "*" }
      First(P14) = { "*1" }
      Preselected(P2) = Preselected(P13) = Preselected(P14) = false
      Empty(P2) = Empty(P13) = Empty(P14) = false
      Follow(one) = { "four", "*", "$" }
      Select(P2) = First(P2) = { "two" }
      Select(P13) = First(P13) = { "*" }
      Select(P14) = First(P14) = { "*1" }

      First(P15) = { "*1" }
      First(P6) = { }
      Preselected(P15) = Preselected(P6) = Empty(P15) = false
      Empty(P6) = true
      Follow(I1) = { "four", "*", "$" }
      Select(P15) = First(P15) = { "*1" }
      Select(P6) = First(P6) + Follow(I1) = { "four", "*", "$" }

   The remaining sets are unchanged.

   The intersection of Select(P2) and Select(P13) is empty, as is the
   intersection of Select(P2) and Select(P14), the intersection of
   Select(P13) and Select(P14) and the intersection of Select(P15) and
   Select(P6), hence the grammar is deterministic and the type
   definition is valid.  If the first child element is an unrecognized
   element, then a decoder will now assume that it and every subsequent
   unrecognized element with the same name are associated with I1.
   Whatever follows, possibly including another unrecognized element
   with a different name, will belong to the "three" component.
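
   The difference between the two instructions, as seen by a decoder,
   can be illustrated with a small Python sketch.  It is not taken from
   any RXER implementation: the function split_children, the constants
   KNOWN_ONE and KNOWN_THREE, and the treatment of a leading <four>
   element as an error are assumptions made for this example.  Only
   element names are modelled; attributes and element content are
   ignored.

```python
KNOWN_ONE = {"two"}     # element names the "one" CHOICE recognizes
KNOWN_THREE = {"four"}  # element names the "three" CHOICE recognizes


def split_children(names, instruction):
    """Split child element names between the "one" and "three" components.

    names       -- element names in document order
    instruction -- "SINGULAR" or "UNIFORM", the insertion encoding
                   instruction assumed to be on the "one" CHOICE
    """
    if instruction not in ("SINGULAR", "UNIFORM"):
        raise ValueError("unsupported instruction: " + instruction)
    if not names or names[0] in KNOWN_THREE:
        # Neither the "two" production nor an unknown-element
        # production for "one" can start here.
        raise ValueError('no content for the "one" component')

    first = names[0]
    taken = 1
    if first not in KNOWN_ONE and instruction == "UNIFORM":
        # Unknown extension: absorb the run of same-named elements.
        while taken < len(names) and names[taken] == first:
            taken += 1
    return names[:taken], names[taken:]


print(split_children(["x", "x", "y"], "SINGULAR"))  # (['x'], ['x', 'y'])
print(split_children(["x", "x", "y"], "UNIFORM"))   # (['x', 'x'], ['y'])
```

   Under SINGULAR-INSERTIONS the second unrecognized <x> element is left
   for the "three" component, whereas under UNIFORM-INSERTIONS it is
   attributed to the "one" component.
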

   A consequence of using the UNIFORM-INSERTIONS encoding instruction is
   that any future extension to the "three" component will be required
   to generate elements with names that are different from the names of
   the elements generated by the "one" component.  With the
   SINGULAR-INSERTIONS encoding instruction, extensions to the "three"
   component are permitted to generate elements with names that are the
   same as the names of the elements generated by the "one" component.

B.4.  Example 4

   Consider the following type definition:

      SEQUENCE OF one [GROUP] CHOICE {
          two  UTF8String,
          ... -- Extension insertion point (I1).
      }

   The associated grammar is:

      P1:  S ::= one S
      P2:  S ::=
      P3:  one ::= two
      P4:  one ::= I1
      P5:  two ::= "two"
      P6:  I1 ::= "*" I1
      P7:  I1 ::=

   This grammar leads to the following sets and predicates:

      First(P1) = { "two", "*" }
      First(P2) = { }
      Preselected(P1) = Preselected(P2) = false
      Empty(P1) = Empty(P2) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) + Follow(S) = { "two", "*", "$" }
      Select(P2) = First(P2) + Follow(S) = { "$" }

      First(P3) = { "two" }
      First(P4) = { "*" }
      Preselected(P3) = Preselected(P4) = Empty(P3) = false
      Empty(P4) = true
      Follow(one) = { "two", "*", "$" }
      Select(P3) = First(P3) = { "two" }
      Select(P4) = First(P4) + Follow(one) = { "*", "two", "$" }

      First(P6) = { "*" }
      First(P7) = { }
      Preselected(P6) = Preselected(P7) = Empty(P6) = false
      Empty(P7) = true
      Follow(I1) = { "two", "*", "$" }
      Select(P6) = First(P6) = { "*" }
      Select(P7) = First(P7) + Follow(I1) = { "two", "*", "$" }

   The intersection of Select(P1) and Select(P2) is not empty, as is the
   intersection of Select(P3) and Select(P4) and the intersection of
   Select(P6) and Select(P7), hence the grammar is not deterministic and
   the type definition is not valid.
   If a decoder encounters two or more unrecognized elements in a row,
   then it cannot determine whether this represents one instance or
   more than one instance of the "one" component.  Even without
   unrecognized elements, there is still a problem that an encoding
   could contain an indeterminate number of "one" components using an
   extension that generates no elements.

   The non-determinism cannot be resolved with a UNIFORM-INSERTIONS
   encoding instruction.  Consider this revised type definition using
   the UNIFORM-INSERTIONS encoding instruction:

      SEQUENCE OF one [GROUP] [UNIFORM-INSERTIONS] CHOICE {
          two  UTF8String,
          ... -- Extension insertion point (I1).
      }

   The associated grammar is:

      P1:  S ::= one S
      P2:  S ::=
      P3:  one ::= two
      P8:  one ::= "*"
      P9:  one ::= "*1" I1
      P5:  two ::= "two"
      P10:  I1 ::= "*1" I1
      P7:  I1 ::=

   This grammar leads to the following sets and predicates:

      First(P1) = { "two", "*", "*1" }
      First(P2) = { }
      Preselected(P1) = Preselected(P2) = Empty(P1) = false
      Empty(P2) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) = { "two", "*", "*1" }
      Select(P2) = First(P2) + Follow(S) = { "$" }

      First(P3) = { "two" }
      First(P8) = { "*" }
      First(P9) = { "*1" }
      Preselected(P3) = Preselected(P8) = Preselected(P9) = false
      Empty(P3) = Empty(P8) = Empty(P9) = false
      Follow(one) = { "two", "*", "*1", "$" }
      Select(P3) = First(P3) = { "two" }
      Select(P8) = First(P8) = { "*" }
      Select(P9) = First(P9) = { "*1" }

      First(P10) = { "*1" }
      First(P7) = { }
      Preselected(P10) = Preselected(P7) = Empty(P10) = false
      Empty(P7) = true
      Follow(I1) = { "two", "*", "*1", "$" }
      Select(P10) = First(P10) = { "*1" }
      Select(P7) = First(P7) + Follow(I1) = { "two", "*", "*1", "$" }

   The intersection of Select(P1) and Select(P2) is now empty, but the
   intersection of Select(P10) and Select(P7) is not, hence the grammar
   is not deterministic and the type definition is not valid.  The
   problem of an indeterminate number of "one" components from an
   extension that generates no elements has been solved; however, if a
   decoder encounters a series of elements with the same name, it cannot
   determine whether this represents one instance or more than one
   instance of the "one" component.

   The non-determinism can be fully resolved with a SINGULAR-INSERTIONS
   encoding instruction.  Consider this revised type definition:

      SEQUENCE OF one [GROUP] [SINGULAR-INSERTIONS] CHOICE {
          two  UTF8String,
          ... -- Extension insertion point (I1).
      }

   The associated grammar is:

      P1:  S ::= one S
      P2:  S ::=
      P3:  one ::= two
      P8:  one ::= "*"
      P5:  two ::= "two"

   This grammar leads to the following sets and predicates:

      First(P1) = { "two", "*" }
      First(P2) = { }
      Preselected(P1) = Preselected(P2) = Empty(P1) = false
      Empty(P2) = true
      Follow(S) = { "$" }
      Select(P1) = First(P1) = { "two", "*" }
      Select(P2) = First(P2) + Follow(S) = { "$" }

      First(P3) = { "two" }
      First(P8) = { "*" }
      Preselected(P3) = Preselected(P8) = false
      Empty(P3) = Empty(P8) = false
      Follow(one) = { "two", "*", "$" }
      Select(P3) = First(P3) = { "two" }
      Select(P8) = First(P8) = { "*" }

   The intersection of Select(P1) and Select(P2) is empty, as is the
   intersection of Select(P3) and Select(P8), hence the grammar is
   deterministic and the type definition is valid.  A decoder now knows
   that every extension to the "one" component will generate a single
   element, so the correct number of "one" components will be decoded.

Appendix C.  Extension and Versioning Examples

   This appendix is non-normative.

C.1.  Valid Extensions for Insertion Encoding Instructions

   The first example shows extensions that satisfy the HOLLOW-INSERTIONS
   encoding instruction.

      [HOLLOW-INSERTIONS] CHOICE {
          one    BOOLEAN,
          ...,
          two    [ATTRIBUTE] INTEGER,
          three  [GROUP] SEQUENCE {
              four  [ATTRIBUTE] UTF8String,
              five  [ATTRIBUTE] INTEGER OPTIONAL,
              ...
          },
          six    [GROUP] CHOICE {
              seven  [ATTRIBUTE] BOOLEAN,
              eight  [ATTRIBUTE] INTEGER
          }
      }

   The "two" and "six" components generate only attributes.

   The "three" component in its current form does not generate elements.
   Any extension to the "three" component will need to do likewise to
   avoid breaking forward compatibility.

   The second example shows extensions that satisfy the
   SINGULAR-INSERTIONS encoding instruction.

      [SINGULAR-INSERTIONS] CHOICE {
          one    BOOLEAN,
          ...,
          two    INTEGER,
          three  [GROUP] SEQUENCE {
              four  [ATTRIBUTE] UTF8String,
              five  INTEGER
          },
          six    [GROUP] CHOICE {
              seven  BOOLEAN,
              eight  INTEGER
          }
      }

   The "two" component will always generate a single <two> element.

   The "three" component will always generate a single <five> element.
   It will also generate a "four" attribute, but any number of
   attributes is allowed by the SINGULAR-INSERTIONS encoding
   instruction.

   The "six" component will either generate a single <seven> element or
   a single <eight> element.  Either case will satisfy the requirement
   that there will be a single element in any given encoding of the
   extension.

   The third example shows extensions that satisfy the
   UNIFORM-INSERTIONS encoding instruction.

      [UNIFORM-INSERTIONS] CHOICE {
          one    BOOLEAN,
          ...,
          two    INTEGER,
          three  [GROUP] SEQUENCE SIZE(1..MAX) OF four INTEGER,
          five   [GROUP] SEQUENCE {
              six    [ATTRIBUTE] UTF8String OPTIONAL,
              seven  INTEGER
          },
          eight  [GROUP] CHOICE {
              nine  BOOLEAN,
              ten   [GROUP] SEQUENCE SIZE(1..MAX) OF eleven INTEGER
          }
      }

   The "two" component will always generate a single <two> element.

   The "three" component will always generate one or more <four>
   elements.

   The "five" component will always generate a single <seven> element.
   It may also generate a "six" attribute, but any number of attributes
   is allowed by the UNIFORM-INSERTIONS encoding instruction.

   The "eight" component will either generate a single <nine> element or
   one or more <eleven> elements.  Either case will satisfy the
   requirement that there must be one or more elements with the same
   name in any given encoding of the extension.

C.2.  Versioning Example

   Making extensions that are not forward compatible is permitted
   provided the incompatibility is signalled with a version indicator
   attribute.

   Suppose that version 1.0 of a specification contains the following
   type definition:

      MyMessageType ::= SEQUENCE {
          version  [ATTRIBUTE] [VERSION-INDICATOR]
                       UTF8String ("1.0", ...) DEFAULT "1.0",
          one      [GROUP] [SINGULAR-INSERTIONS] CHOICE {
              two  BOOLEAN,
              ...
          },
          ...
      }

   An attribute is to be added to the CHOICE for version 1.1.  This
   change is not forward compatible since it does not satisfy the
   SINGULAR-INSERTIONS encoding instruction.  Therefore the version
   indicator attribute must be updated at the same time (or added if it
   wasn't already present).  This results in the following new type
   definition for version 1.1:

      MyMessageType ::= SEQUENCE {
          version  [ATTRIBUTE] [VERSION-INDICATOR]
                       UTF8String ("1.0", ..., "1.1") DEFAULT "1.0",
          one      [GROUP] [SINGULAR-INSERTIONS] CHOICE {
              two    BOOLEAN,
              ...,
              three  [ATTRIBUTE] INTEGER -- Added in Version 1.1
          },
          ...
      }

   If a version 1.1 conformant application hasn't used the version 1.1
   extension in a value of MyMessageType, then it is allowed to set the
   value of the version attribute to "1.0".

   A pair of elements is added to the CHOICE for version 1.2.
   Again the change does not satisfy the SINGULAR-INSERTIONS encoding
   instruction.  The type definition for version 1.2 is:

      MyMessageType ::= SEQUENCE {
          version  [ATTRIBUTE] [VERSION-INDICATOR]
                       UTF8String ("1.0", ..., "1.1" | "1.2")
                           DEFAULT "1.0",
          one      [GROUP] [SINGULAR-INSERTIONS] CHOICE {
              two    BOOLEAN,
              ...,
              three  [ATTRIBUTE] INTEGER, -- Added in Version 1.1
              four   [GROUP] SEQUENCE {
                  five  UTF8String,
                  six   GeneralizedTime
              } -- Added in version 1.2
          },
          ...
      }

   If a version 1.2 conformant application hasn't used the version 1.2
   extension in a value of MyMessageType, then it is allowed to set the
   value of the version attribute to "1.1".  If it hasn't used either of
   the extensions, then it is allowed to set the value of the version
   attribute to "1.0".

Author's Address

   Dr. Steven Legg
   eB2Bcom
   Suite 3, Woodhouse Corporate Centre
   935 Station Street
   Box Hill North, Victoria 3129
   AUSTRALIA

   Phone: +61 3 9896 7830
   Fax:   +61 3 9896 7801
   EMail: steven.legg@eb2bcom.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Note to the RFC Editor: the remainder of this document is to be removed
before final publication.

Changes in Draft 01

   The GROUP encoding instruction is no longer permitted in situations
   that would cause a recursive group definition.

   TopLevelNamedType has been replaced by an unrestricted NamedType.
   This makes manipulation of top-level components easier to both
   specify and implement.

   RefParametersValue (a governed Value) has been replaced by specific
   notation, i.e., the RefParameters production.  The RefParameters
   ASN.1 type is no longer used.

   Parameterized encoding instructions have been disallowed.

   A selection type is not permitted to select the Type from a NamedType
   that is subject to an ATTRIBUTE-REF, ELEMENT-REF or REF-AS-ELEMENT
   encoding instruction.  Also, a selection type does not inherit
   component encoding instructions.

   The ATTRIBUTE encoding instruction is permitted to be applied to the
   QName type and LIST types.

   The descriptions of the SCHEMA-IDENTITY and TARGET-NAMESPACE encoding
   instructions have been expanded.

Changes in Draft 02

   The prefixed type for the ATTRIBUTE-REF encoding instruction has been
   reduced to a UTF8String and restrictions have been placed on the type
   of referenced attribute definitions.  These changes have been made to
   overcome difficulties in producing a canonical encoding for foreign
   attribute definitions.

   References to foreign definitions dependent on the XML Schema ENTITY
   and ENTITIES types have been disallowed.

   CanonicalizationParameter has been removed from the grammar for
   RefParameters.  Preservation of the Infoset representation of a value
   of Markup is sufficient for the purposes of CRXER.

   References to AnySimpleType have been removed.

   The type of an alternative of a ChoiceType that is subject to a UNION
   encoding instruction is not permitted to be an open type.

   The CONTENT encoding instruction has been renamed to GROUP.

   The conditions for unique component attribution have been
   reformulated in terms of the grammar for a type definition, but the
   effects are the same.

   Unknown extensions are now handled explicitly in the grammars
   generated from type definitions.  The insertion encoding instructions
   have been added to resolve non-determinism with respect to extension
   insertion points.  Examples using insertion encoding instructions
   have been added as Appendices B and C.

Changes in Draft 03

   The BIT STRING type is no longer permitted to be the component type
   of a LIST type.

   The SIMPLE-CONTENT and COMPONENT-REF encoding instructions have been
   added.

   An optional Prefix specification has been added to the
   TARGET-NAMESPACE encoding instruction.

   The AS keyword in the NAME encoding instruction has been made
   optional.

   The syntax of the VALUES encoding instruction has been changed
   slightly.

   The VersionIndicator parameter of the ATTRIBUTE encoding instruction
   has been pulled out as a separate VERSION-INDICATOR encoding
   instruction.

   The AnyType ASN.1 type has been renamed to Markup.

   The insertion encoding instructions have been simplified by forcing
   them to be co-located with the type they affect.

   With regard to the TYPE-REF encoding instruction, it is no longer
   necessary to preserve the exact Infoset [INFOSET] representation of
   abstract values of an ASN.1 type embedded in a Markup value.

   The URL for the ASN.1 namespace has been replaced.  A permanent URN
   will be requested from IANA.

Changes in Draft 04

   The effective name definition has been replaced by the expanded name
   definition from Namespaces in XML.

   The Namespace notation has been removed from the REF-AS-TYPE encoding
   instruction.