JSONPath WG                                              S. Gössner, Ed.
Internet-Draft                                   Fachhochschule Dortmund
Intended status: Standards Track                      G. Normington, Ed.
Expires: 28 April 2022
                                                         C. Bormann, Ed.
                                                  Universität Bremen TZI
                                                         25 October 2021


                  JSONPath: Query expressions for JSON
                      draft-ietf-jsonpath-base-02

Abstract

   JSONPath defines a string syntax for identifying values within a
   JavaScript Object Notation (JSON) document.

Contributing

   This document picks up the popular JSONPath specification dated
   2007-02-21 and provides a normative definition for it. In its
   current state, it is a strawman document showing what needs to be
   covered.

   Comments and issues may be directed to this document's GitHub
   repository
   (https://github.com/ietf-wg-jsonpath/draft-ietf-jsonpath-jsonpath).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 28 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Terminology
     1.2. Inspired by XPath
     1.3. Overview of JSONPath Expressions
   2. JSONPath Examples
   3. JSONPath Syntax and Semantics
     3.1. Overview
     3.2. Processing Model
     3.3. Syntax
     3.4. Semantics
     3.5. Selectors
       3.5.1. Root Selector
       3.5.2. Dot Selector
       3.5.3. Dot Wild Card Selector
       3.5.4. Index Selector
       3.5.5. Index Wild Card Selector
       3.5.6. Array Slice Selector
       3.5.7. Descendant Selector
       3.5.8. Union Selector
         3.5.8.1. Syntax
         3.5.8.2. Semantics
       3.5.9. Filter Selector
         3.5.9.1. Syntax
         3.5.9.2. Semantics
   4. Expression Language
   5. IANA Considerations
   6. Security Considerations
   7. References
     7.1. Normative References
     7.2. Informative References
   Acknowledgements
   Contributors
   Authors' Addresses

1. Introduction

   This document picks up the popular JSONPath specification dated
   2007-02-21 [JSONPath-orig] and provides a normative definition for
   it. In its current state, it is a strawman document showing what
   needs to be covered.

   JSON is defined by [RFC8259].

   JSONPath is not intended as a replacement, but as a more powerful
   companion, to JSON Pointer [RFC6901].
   [insert reference to section where the relationship is detailed. The
   purposes of the two syntaxes are different. Pointer is for isolating
   a single location within a document. Path is a query syntax that can
   also be used to pull multiple locations.]

1.1. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The grammatical rules in this document are to be interpreted as ABNF,
   as described in [RFC5234]. ABNF terminal values in this document
   define Unicode code points rather than their UTF-8 encoding. For
   example, the Unicode PLACE OF INTEREST SIGN (U+2318) would be defined
   in ABNF as %x2318.

   The terminology of [RFC8259] applies except where clarified below.
   The terms "Primitive" and "Structured" are used to group the types as
   in Section 1 of [RFC8259]. Definitions for "Object", "Array",
   "Number", and "String" remain unchanged. Importantly, "object" and
   "array" in particular do not take on a generic meaning, such as they
   would in a general programming context.

   Additional terms used in this specification are defined below.

   Value: As per [RFC8259], a structure complying with the generic data
      model of JSON, i.e., composed of components such as containers,
      namely JSON objects and arrays, and atomic data, namely null,
      true, false, numbers, and text strings.

   Member: A name/value pair in an object. (Not itself a value.)

   Name: The name in a name/value pair constituting a member. (Also
      known as "key", "tag", or "label".) This is also used in
      [RFC8259], but that specification does not formally define it. It
      is included here for completeness.

   Element: A value in an array.
      (Also used with a distinct meaning in XML context for XML
      elements.)

   Index: A non-negative integer that identifies a specific element in
      an array.

   Query: Short name for JSONPath expression.

   Argument: Short name for the value a JSONPath expression is applied
      to.

   Node: The pair of a value along with its location within the
      argument.

   Root Node: The unique node whose value is the entire argument.

   Nodelist: A list of nodes. The output of applying a query to an
      argument is manifested as a list of nodes. While this list can be
      represented in JSON, e.g. as an array, the nodelist is an abstract
      concept unrelated to JSON values.

   Normalized Path: A simple form of JSONPath expression that
      identifies a node by providing a query that results in exactly
      that node. Similar to, but syntactically different from, a JSON
      Pointer [RFC6901].

   For the purposes of this specification, a value as defined by
   [RFC8259] is also viewed as a tree of nodes. Each node, in turn,
   holds a value. Further nodes within each value are the elements of
   arrays and the member values of objects and are themselves values.
   (The type of the value held by a node may also be referred to as the
   type of the node.)

   A query is applied to an argument, and the output is a nodelist.

1.2. Inspired by XPath

   A frequently emphasized advantage of XML is the availability of
   powerful tools to analyse, transform and selectively extract data
   from XML documents. [XPath] is one of these tools.

   In 2007, the need for something solving the same class of problems
   for the emerging JSON community became apparent, specifically for:

   * Finding data interactively and extracting them out of [RFC8259]
     JSON values without special scripting.

   * Specifying the relevant parts of the JSON data in a request by a
     client, so the server can reduce the amount of data in its
     response, minimizing bandwidth usage.

   So what does such a tool look like for JSON? When defining a
   JSONPath, how should expressions look?

   The XPath expression

   /store/book[1]/title

   looks like

   x.store.book[0].title

   or

   x['store']['book'][0]['title']

   in popular programming languages such as JavaScript, Python and PHP,
   with a variable x holding the argument. Here we observe that such
   languages already have a fundamentally XPath-like feature built in.

   The JSONPath tool in question should:

   * be naturally based on those language characteristics.

   * cover only essential parts of XPath 1.0.

   * be lightweight in code size and memory consumption.

   * be runtime efficient.

1.3. Overview of JSONPath Expressions

   JSONPath expressions always apply to a value in the same way as XPath
   expressions are used in combination with an XML document. Since a
   value is anonymous, JSONPath uses the abstract name $ to refer to the
   root node of the argument.

   JSONPath expressions can use the _dot notation_

   $.store.book[0].title

   or the _bracket notation_

   $['store']['book'][0]['title']

   for paths input to a JSONPath processor. [1] Where a JSONPath
   processor uses JSONPath expressions as output paths, these will
   always be converted to Output Paths which employ the more general
   _bracket notation_. [2] Bracket notation is more general than dot
   notation and can serve as a canonical form when a JSONPath processor
   uses JSONPath expressions as output paths.

   JSONPath allows the wildcard symbol * for member names and array
   indices. It borrows the descendant operator .. from [E4X] and the
   array slice syntax proposal [start:end:step] [SLICE] from ECMAScript
   4.

   JSONPath was originally designed to employ an _underlying scripting
   language_ for computing expressions. The present specification
   defines a simple expression language that is independent from any
   scripting language in use on the platform.
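
   The relationship between the dot notation and the bracket notation
   described above can be illustrated with a small, non-normative
   Python sketch. The helper name to_bracket_notation is hypothetical
   and not defined by this document. The sketch assumes ASCII-only
   member names and queries without quoted member names that contain
   dots, and it leaves descendant (..) and wildcard (.*) steps
   untouched.

```python
import re

# Hypothetical helper for illustration only; not part of JSONPath.
# Matches a single "." (not preceded by another ".") followed by a
# simple ASCII member name, i.e. a plain dot-notation child step.
_DOT_STEP = re.compile(r"(?<!\.)\.([A-Za-z_][A-Za-z0-9_]*)")


def to_bracket_notation(query: str) -> str:
    """Rewrite plain .name steps as ['name'] steps."""
    return _DOT_STEP.sub(lambda m: "['" + m.group(1) + "']", query)


if __name__ == "__main__":
    print(to_bracket_notation("$.store.book[0].title"))
```

   A complete converter would need a real parser for the grammar in
   Section 3; the regular expression above is only meant to show that
   every dot-notation child step has a bracket-notation equivalent.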

   JSONPath can use expressions, written in parentheses: (), as an
   alternative to explicit names or indices as in:

   $.store.book[(@.length-1)].title

   The symbol @ is used for the current node. Filter expressions are
   supported via the syntax ?() as in

   $.store.book[?(@.price < 10)].title

   Here is a complete overview and a side by side comparison of the
   JSONPath syntax elements with their XPath counterparts.

   +=======+==================+=====================================+
   | XPath | JSONPath         | Description                         |
   +=======+==================+=====================================+
   | /     | $                | the root element/node               |
   +-------+------------------+-------------------------------------+
   | .     | @                | the current element/node            |
   +-------+------------------+-------------------------------------+
   | /     | . or []          | child operator                      |
   +-------+------------------+-------------------------------------+
   | ..    | n/a              | parent operator                     |
   +-------+------------------+-------------------------------------+
   | //    | ..               | nested descendants (JSONPath        |
   |       |                  | borrows this syntax from E4X)       |
   +-------+------------------+-------------------------------------+
   | *     | *                | wildcard: All elements/nodes        |
   |       |                  | regardless of their names           |
   +-------+------------------+-------------------------------------+
   | @     | n/a              | attribute access: JSON values do    |
   |       |                  | not have attributes                 |
   +-------+------------------+-------------------------------------+
   | []    | []               | subscript operator: XPath uses it   |
   |       |                  | to iterate over element collections |
   |       |                  | and for predicates; native array    |
   |       |                  | indexing as in JavaScript here      |
   +-------+------------------+-------------------------------------+
   | |     | [,]              | Union operator in XPath (results in |
   |       |                  | a combination of node sets);        |
   |       |                  | JSONPath allows alternate names or  |
   |       |                  | array indices as a set              |
   +-------+------------------+-------------------------------------+
   | n/a   | [start:end:step] | array slice operator borrowed from  |
   |       |                  | ES4                                 |
   +-------+------------------+-------------------------------------+
   | []    | ?()              | applies a filter (script)           |
   |       |                  | expression                          |
   +-------+------------------+-------------------------------------+
   | n/a   | ()               | expression engine                   |
   +-------+------------------+-------------------------------------+
   | ()    | n/a              | grouping in XPath                   |
   +-------+------------------+-------------------------------------+

            Table 1: Overview of JSONPath, compared to XPath

   XPath has a lot more to offer (location paths in unabbreviated
   syntax, operators and functions) than listed here. Moreover, there
   is a significant difference in how the subscript operator works in
   XPath and JSONPath:

   * Square brackets in XPath expressions always operate on the _node
     set_ resulting from the previous path fragment. Indices always
     start at 1.

   * With JSONPath, square brackets operate on the _object_ or _array_
     addressed by the previous path fragment. Array indices always
     start at 0.

2. JSONPath Examples

   This section provides some more examples for JSONPath expressions.
   The examples are based on the simple JSON value shown in Figure 1,
   which was patterned after a typical XML example representing a
   bookstore (that also has bicycles).

   { "store": {
       "book": [
         { "category": "reference",
           "author": "Nigel Rees",
           "title": "Sayings of the Century",
           "price": 8.95
         },
         { "category": "fiction",
           "author": "Evelyn Waugh",
           "title": "Sword of Honour",
           "price": 12.99
         },
         { "category": "fiction",
           "author": "Herman Melville",
           "title": "Moby Dick",
           "isbn": "0-553-21311-3",
           "price": 8.99
         },
         { "category": "fiction",
           "author": "J. R. R. Tolkien",
           "title": "The Lord of the Rings",
           "isbn": "0-395-19395-8",
           "price": 22.99
         }
       ],
       "bicycle": {
         "color": "red",
         "price": 19.95
       }
     }
   }

                      Figure 1: Example JSON value

   The examples in Table 2 use the expression mechanism to obtain the
   number of elements in an array, to test for the presence of a member
   in an object, and to perform numeric comparisons of member values
   with a constant.
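
   To make the intended results of Table 2 concrete, a few of its
   queries can be mimicked with native Python operations on the value of
   Figure 1. This is a non-normative sketch, not a JSONPath
   implementation: it relies on the assumption that the example value
   contains exactly one "book" member, so that $..book and $.store.book
   address the same array. The variable names are illustrative.

```python
# The JSON value of Figure 1, written as a Python literal.
value = {
    "store": {
        "book": [
            {"category": "reference", "author": "Nigel Rees",
             "title": "Sayings of the Century", "price": 8.95},
            {"category": "fiction", "author": "Evelyn Waugh",
             "title": "Sword of Honour", "price": 12.99},
            {"category": "fiction", "author": "Herman Melville",
             "title": "Moby Dick", "isbn": "0-553-21311-3",
             "price": 8.99},
            {"category": "fiction", "author": "J. R. R. Tolkien",
             "title": "The Lord of the Rings",
             "isbn": "0-395-19395-8", "price": 22.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

books = value["store"]["book"]

# $.store.book[*].author -- the authors of all books in the store
authors = [book["author"] for book in books]

# $..book[2] -- the third book
third_book = books[2]

# $..book[-1] -- the last book in order
last_book = books[-1]

# $..book[:2] -- the first two books
first_two = books[:2]

# $..book[?(@.isbn)] -- all books with an isbn member
with_isbn = [book for book in books if "isbn" in book]

# $..book[?(@.price<10)] -- all books cheaper than 10
cheap = [book for book in books if book["price"] < 10]

if __name__ == "__main__":
    print(authors)
    print([book["title"] for book in cheap])
```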

   +======================+========================+===================+
   | XPath                | JSONPath               | Result            |
   +======================+========================+===================+
   | /store/book/author   | $.store.book[*].author | the authors of    |
   |                      |                        | all books in      |
   |                      |                        | the store         |
   +----------------------+------------------------+-------------------+
   | //author             | $..author              | all authors       |
   +----------------------+------------------------+-------------------+
   | /store/*             | $.store.*              | all things in     |
   |                      |                        | store, which      |
   |                      |                        | are some books    |
   |                      |                        | and a red         |
   |                      |                        | bicycle           |
   +----------------------+------------------------+-------------------+
   | /store//price        | $.store..price         | the prices of     |
   |                      |                        | everything in     |
   |                      |                        | the store         |
   +----------------------+------------------------+-------------------+
   | //book[3]            | $..book[2]             | the third book    |
   +----------------------+------------------------+-------------------+
   | //book[last()]       | $..book[(@.length-1)]  | the last book     |
   |                      | $..book[-1]            | in order          |
   +----------------------+------------------------+-------------------+
   | //book[position()<3] | $..book[0,1]           | the first two     |
   |                      | $..book[:2]            | books             |
   +----------------------+------------------------+-------------------+
   | //book[isbn]         | $..book[?(@.isbn)]     | filter all        |
   |                      |                        | books with isbn   |
   |                      |                        | number            |
   +----------------------+------------------------+-------------------+
   | //book[price<10]     | $..book[?(@.price<10)] | filter all        |
   |                      |                        | books cheaper     |
   |                      |                        | than 10           |
   +----------------------+------------------------+-------------------+
   | //*                  | $..*                   | all elements in   |
   |                      |                        | XML document;     |
   |                      |                        | all member        |
   |                      |                        | values and        |
   |                      |                        | array elements    |
   |                      |                        | contained in      |
   |                      |                        | input value       |
   +----------------------+------------------------+-------------------+

    Table 2: Example JSONPath expressions applied to the example JSON
                                  value

3. JSONPath Syntax and Semantics

3.1. Overview

   A JSONPath query is a string which selects zero or more nodes of a
   piece of JSON. A valid query conforms to the ABNF syntax defined by
   this document.

   A query MUST be encoded using UTF-8. To parse a query according to
   the grammar in this document, its UTF-8 form SHOULD first be decoded
   into Unicode code points as described in [RFC3629].

   A string to be used as a JSONPath query needs to be _well-formed_ and
   _valid_. A string is a well-formed JSONPath query if it conforms to
   the syntax of JSONPath. A well-formed JSONPath query is valid if it
   also fulfills all semantic requirements posed by this document.

   The well-formedness and the validity of JSONPath queries are
   independent of the value the query is applied to; no further errors
   can be raised during application of the query to a value.

   (Obviously, an implementation can still fail when executing a
   JSONPath query, e.g., because of resource depletion, but this is not
   modeled in the present specification.)

3.2. Processing Model

   In this specification, the semantics of a JSONPath query are defined
   in terms of a _processing model_. That model is not prescriptive of
   the internal workings of an implementation: Implementations may wish
   (or need) to design a different process that yields results that
   conform to the model.

   In the processing model, a valid query is executed against a value,
   the _argument_, and produces a list of zero or more nodes of the
   value.

   The query is a sequence of zero or more _selectors_, each of which is
   applied to the result of the previous selector and provides input to
   the next selector. These results and inputs take the form of a
   _nodelist_, i.e., a sequence of zero or more nodes.

   The nodelist going into the first selector contains a single node,
   the argument.
   The nodelist resulting from the last selector is presented as the
   result of the query; depending on the specific API, it might be
   presented as an array of the JSON values at the nodes, an array of
   Output Paths referencing the nodes, or both -- or some other
   representation as desired by the implementation. Note that the API
   must be capable of presenting an empty nodelist as the result of the
   query.

   A selector performs its function on each of the nodes in its input
   nodelist; during such a function execution, the node in question is
   referred to as the "current node". Each of these function executions
   produces a nodelist, and these nodelists are then concatenated into
   the result of the selector.

   The processing within a selector may execute nested queries, which
   are in turn handled with the processing model defined here.
   Typically, the argument to that query will be the current node of the
   selector or a set of nodes subordinate to that current node.

3.3. Syntax

   Syntactically, a JSONPath query consists of a root selector ($),
   which stands for a nodelist that contains the root node of the
   argument, followed by a possibly empty sequence of _selectors_.

   json-path = root-selector *(dot-selector        /
                               dot-wild-selector   /
                               index-selector      /
                               index-wild-selector /
                               union-selector      /
                               slice-selector      /
                               descendant-selector /
                               filter-selector)

   The syntax and semantics of each selector are defined below.

3.4. Semantics

   The root selector $ not only selects the root node of the argument,
   but it also produces as output a list consisting of one node: the
   argument itself.

   A selector may select zero or more nodes for further processing. A
   syntactically valid selector MUST NOT produce errors. This means
   that some operations which might be considered erroneous, such as
   indexing beyond the end of an array, simply result in fewer nodes
   being selected.
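
   The processing model of Section 3.2, together with the rule that a
   selector never produces an error, can be sketched in Python as
   follows. This is a non-normative illustration under simplifying
   assumptions: a node is represented by its value only (its location
   is omitted), a selector is modelled as a function from one node to a
   list of nodes, and the names member, index, and apply_query are
   hypothetical.

```python
from typing import Any, Callable, List

Node = Any                                # value only; location omitted
Selector = Callable[[Node], List[Node]]   # one node in, nodelist out


def member(name: str) -> Selector:
    """Select the member value with the given name from an object."""
    def select(node: Node) -> List[Node]:
        if isinstance(node, dict) and name in node:
            return [node[name]]
        return []                         # not an object, or no such member
    return select


def index(i: int) -> Selector:
    """Select one array element; out-of-range indices select nothing."""
    def select(node: Node) -> List[Node]:
        if isinstance(node, list) and -len(node) <= i < len(node):
            return [node[i]]
        return []                         # fewer nodes, never an error
    return select


def apply_query(selectors: List[Selector], argument: Node) -> List[Node]:
    """Run each selector over every node of the previous nodelist."""
    nodelist = [argument]                 # initial nodelist: the argument
    for selector in selectors:
        # Concatenate the nodelists produced for each current node.
        nodelist = [out for node in nodelist for out in selector(node)]
    return nodelist


if __name__ == "__main__":
    argument = {"a": [10, 20]}
    print(apply_query([member("a"), index(1)], argument))
    print(apply_query([member("a"), index(5)], argument))
```

   In this sketch, indexing beyond the end of the array yields an empty
   nodelist instead of raising an exception, which mirrors the
   requirement stated above.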

   But a selector doesn't just act on a single node: a selector acts on
   each of the nodes in its input nodelist and concatenates the
   resultant nodelists to form the result nodelist of the selector.

   For each node in the list, the selector selects zero or more nodes,
   each of which is a descendant of the node or the node itself.

   For instance, with the argument {"a":[{"b":0},{"b":1},{"c":2}]}, the
   query $.a[*].b selects the following list of nodes: 0, 1 (denoted
   here by their value). Let's walk through this in detail.

   The query consists of $ followed by three selectors: .a, [*], and .b.

   Firstly, $ selects the root node which is the argument. So the
   result is a list consisting of just the root node.

   Next, .a selects from any input node of type object and selects the
   node of any member value of the input node corresponding to the
   member name "a". The result is again a list of one node:
   [{"b":0},{"b":1},{"c":2}].

   Next, [*] selects from any input node which is an array and selects
   all the elements of the input node. The result is a list of three
   nodes: {"b":0}, {"b":1}, and {"c":2}.

   Finally, .b selects from any input node of type object with a member
   name b and selects the node of the member value of the input node
   corresponding to that name. The result is a list containing 0, 1.
   This is the concatenation of three lists, two of length one
   containing 0, 1, respectively, and one of length zero.

   As a consequence of this approach, if any of the selectors selects no
   nodes, then the whole query selects no nodes.

   In what follows, the semantics of each selector are defined for each
   type of node.

3.5. Selectors

   A JSONPath query consists of a sequence of selectors. Valid
   selectors are

   * Root selector $

   * Dot selector .<name>, used with object member names exclusively.

   * Dot wild card selector .*.

   * Index selector [<index>], where <index> is either a (possibly
     negative) array index or an object member name.

   * Index wild card selector [*].

   * Array slice selector [<start>:<end>:<step>], where <start>, <end>,
     and <step> are integer literals.

   * Nested descendants selector ...

   * Union selector [<sel1>,<sel2>,...,<selN>], holding a comma
     delimited list of index, index wild card, array slice, and filter
     selectors.

   * Filter selector [?(<expr>)]

   * Current item selector @

3.5.1. Root Selector

   Syntax

   Every valid JSONPath query MUST begin with the root selector $.

   root-selector = "$"

   Semantics

   The Argument -- the root JSON value -- becomes the root node, which
   is addressed by the root selector $.

3.5.2. Dot Selector

   Syntax

   A dot selector starts with a dot . followed by an object's member
   name.

   dot-selector    = "." dot-member-name
   dot-member-name = name-first *name-char
   name-first      = ALPHA /
                     "_"   /        ; _
                     %x80-10FFFF    ; any non-ASCII Unicode character
   name-char       = DIGIT / name-first

   DIGIT           = %x30-39              ; 0-9
   ALPHA           = %x41-5A / %x61-7A    ; A-Z / a-z

   Member names containing other characters than allowed by dot-selector
   -- such as space ` ` and minus - characters -- MUST NOT be used with
   the dot-selector. (Such member names can be addressed by the
   index-selector instead.)

   Semantics

   The dot-selector selects the node of the member value corresponding
   to the member name from any JSON object. It selects no nodes from
   any other JSON value.

   Note that the dot-selector follows the philosophy of JSON strings and
   is allowed to contain bit sequences that cannot encode Unicode
   characters (a single unpaired UTF-16 surrogate, for example). The
   behaviour of an implementation is undefined for member names which do
   not encode Unicode characters.

3.5.3. Dot Wild Card Selector

   Syntax

   The dot wild card selector has the form .*.

   dot-wild-selector = "." "*"  ; dot followed by asterisk

   Semantics

   A dot-wild-selector acts as a wild card by selecting the nodes of all
   member values of an object as well as all element nodes of an array.
   Applying the dot-wild-selector to a primitive JSON value (number,
   string, or true/false/null) selects no node.

3.5.4. Index Selector

   Syntax

   An index selector [<index>] addresses at most one object member value
   or at most one array element value.

   index-selector = "[" (quoted-member-name / element-index) "]"

   When the index-selector is applied to an object value, a
   quoted-member-name string is required. JSONPath allows it to be
   enclosed in _single_ or _double_ quotes.

   quoted-member-name = string-literal

   string-literal = %x22 *double-quoted %x22 /  ; "string"
                    %x27 *single-quoted %x27    ; 'string'

   double-quoted  = unescaped /
                    %x27      /                 ; '
                    ESC %x22  /                 ; \"
                    ESC escapable

   single-quoted  = unescaped /
                    %x22      /                 ; "
                    ESC %x27  /                 ; \'
                    ESC escapable

   ESC            = %x5C                        ; \ backslash

   unescaped      = %x20-21 /                   ; s. RFC 8259
                    %x23-26 /                   ; omit "
                    %x28-5B /                   ; omit '
                    %x5D-10FFFF                 ; omit \

   escapable      = ( %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
                        ; b /         ; BS backspace U+0008
                        ; t /         ; HT horizontal tab U+0009
                        ; n /         ; LF line feed U+000A
                        ; f /         ; FF form feed U+000C
                        ; r /         ; CR carriage return U+000D
                        "/" /          ; / slash (solidus)
                        "\" /          ; \ backslash (reverse solidus)
                        (%x75 hexchar) ; uXXXX U+XXXX
                    )

   hexchar        = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
   non-surrogate  = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                    ("D" %x30-37 2HEXDIG )
   high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate  = "D" ("C"/"D"/"E"/"F") 2HEXDIG

   HEXDIG         = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

   ; Task from 2021-06-15 interim: update ABNF later

   When the index-selector is applied to an array, a numerical
   element-index is required. JSONPath allows it to be negative.

   element-index = int                            ; decimal integer

   int           = ["-"] ( "0" / (DIGIT1 *DIGIT) ) ; - optional
   DIGIT1        = %x31-39                        ; 1-9 non-zero digit

   Notes:

   1. double-quoted strings follow JSON in [RFC8259]; single-quoted
      strings follow an analogous pattern.

   2. An element-index is an integer (in base 10, as in JSON numbers).

   3. As in JSON numbers, the syntax does not allow octal-like integers
      with leading zeros such as 01 or -01.

   Semantics

   A quoted-member-name string MUST be converted to a member name by
   removing the surrounding quotes and replacing each escape sequence
   with its equivalent Unicode character, as in the table below:

   +=================+===================+=============================+
   | Escape Sequence | Unicode Character | Description                 |
   +=================+===================+=============================+
   | \b              | U+0008            | BS backspace                |
   +-----------------+-------------------+-----------------------------+
   | \t              | U+0009            | HT horizontal tab           |
   +-----------------+-------------------+-----------------------------+
   | \n              | U+000A            | LF line feed                |
   +-----------------+-------------------+-----------------------------+
   | \f              | U+000C            | FF form feed                |
   +-----------------+-------------------+-----------------------------+
   | \r              | U+000D            | CR carriage return          |
   +-----------------+-------------------+-----------------------------+
   | \"              | U+0022            | quotation mark              |
   +-----------------+-------------------+-----------------------------+
   | \'              | U+0027            | apostrophe                  |
   +-----------------+-------------------+-----------------------------+
   | \/              | U+002F            | slash (solidus)             |
   +-----------------+-------------------+-----------------------------+
   | \\              | U+005C            | backslash (reverse          |
   |                 |                   | solidus)                    |
   +-----------------+-------------------+-----------------------------+
   | \uXXXX          | U+XXXX            | unicode character           |
   +-----------------+-------------------+-----------------------------+

                  Table 3: Escape Sequence Replacements

   The index-selector applied with a quoted-member-name to an object
   selects the node of the corresponding member value from it, if and
   only if that object has a member with that name. Nothing is selected
   from a value which is not an object.

   Array indexing via element-index is a way of selecting a particular
   array element using a zero-based index. For example, selector [0]
   selects the first and selector [4] the fifth element of a
   sufficiently long array.

   A negative element-index counts from the array end. For example,
   selector [-1] selects the last and selector [-2] selects the last but
   one element of an array with at least two elements.

3.5.5. Index Wild Card Selector

   Syntax

   The index wild card selector has the form [*].

   index-wild-selector = "[" "*" "]"  ; asterisk enclosed by brackets

   Semantics

   An index-wild-selector selects the nodes of all member values of an
   object as well as of all elements of an array. Applying the
   index-wild-selector to a primitive JSON value (such as a number,
   string, or true/false/null) selects no node.

   The index-wild-selector behaves identically to the dot-wild-selector.

3.5.6. Array Slice Selector

   Syntax

   The array slice selector has the form [<start>:<end>:<step>]. It
   selects elements starting at index <start>, ending at -- but not
   including -- <end>, while incrementing by step.

   slice-selector = "[" slice-index "]"

   slice-index    = ws [start] ws ":" ws [end]
                       [ws ":" ws [step] ws]

   start          = int       ; included in selection
   end            = int       ; not included in selection
   step           = int       ; default: 1

   ws             = *( %x20 / ; Space
                       %x09 / ; Horizontal tab
                       %x0A / ; Line feed or New line
                       %x0D ) ; Carriage return

   The slice-selector consists of three optional decimal integers
   separated by colons.
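
   The slice-selector grammar above can be exercised with a small
   recognizer. The following Python sketch is non-normative: it
   transcribes the slice-selector, slice-index, int, and ws rules into a
   regular expression as literally as possible, and the function name
   parse_slice_selector is hypothetical. Omitted parameters are
   reported as None, leaving the defaults to the semantics.

```python
import re
from typing import Optional, Tuple

# int = ["-"] ( "0" / (DIGIT1 *DIGIT) )
_INT = r"-?(?:0|[1-9][0-9]*)"
# ws = *( %x20 / %x09 / %x0A / %x0D )
_WS = r"[ \t\n\r]*"

# slice-selector = "[" slice-index "]"
# slice-index    = ws [start] ws ":" ws [end] [ws ":" ws [step] ws]
_SLICE = re.compile(
    rf"\[{_WS}({_INT})?{_WS}:{_WS}({_INT})?"
    rf"(?:{_WS}:{_WS}({_INT})?{_WS})?\]"
)

SliceParams = Tuple[Optional[int], Optional[int], Optional[int]]


def parse_slice_selector(text: str) -> Optional[SliceParams]:
    """Return (start, end, step), or None if text is not a slice."""
    match = _SLICE.fullmatch(text)
    if match is None:
        return None
    start, end, step = (
        int(group) if group is not None else None
        for group in match.groups()
    )
    return (start, end, step)


if __name__ == "__main__":
    print(parse_slice_selector("[1:3]"))
    print(parse_slice_selector("[::-1]"))
    print(parse_slice_selector("[01:2]"))
```

   Because the transcription is literal, a selector such as "[1:3 ]"
   (whitespace after end without a second colon) is rejected, exactly as
   the slice-index rule is written above.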
796     Semantics

798     The slice-selector was inspired by the slice operator of ECMAScript 4
799     (ES4), which was deprecated in 2014, and that of Python.

801     Informal Introduction

803     This section is non-normative.

805     Array indexing is a way of selecting a particular element of an array
806     using a 0-based index.  For example, the expression [0] selects the
807     first element of a non-empty array.

809     Negative indices index from the end of an array.  For example, the
810     expression [-2] selects the last but one element of an array with at
811     least two elements.

813     Array slicing is inspired by the behaviour of the
814     Array.prototype.slice method of the JavaScript language as defined by
815     the ECMA-262 standard [ECMA-262], with the addition of the step
816     parameter, which is inspired by the Python slice expression.

818     The array slice expression [start:end:step] selects elements at
819     indices starting at start, incrementing by step, and ending with end
820     (which is itself excluded).  So, for example, the expression [1:3]
821     (where step defaults to 1) selects elements with indices 1 and 2 (in
822     that order) whereas [1:5:2] selects elements with indices 1 and 3.

824     When step is negative, elements are selected in reverse order.  Thus,
825     for example, [5:1:-2] selects elements with indices 5 and 3, in that
826     order and [::-1] selects all the elements of an array in reverse
827     order.

829     When step is 0, no elements are selected.  This is the one case which
830     differs from the behaviour of Python, which raises an error in this
831     case.

833     The following section specifies the behaviour fully, without
834     depending on JavaScript or Python behaviour.

836     Detailed Semantics

838     An array selector is either an array slice or an array index, which
839     is defined in terms of an array slice.

841     A slice expression selects a subset of the elements of the input
842     array, in the same order as the array or the reverse order, depending
843     on the sign of the step parameter.  It selects no nodes from a node
844     which is not an array.

846     A slice is defined by the two slice parameters, start and end, and an
847     iteration delta, step.  Each of these parameters is optional.  len is
848     the length of the input array.

850     The default value for step is 1.  The default values for start and
851     end depend on the sign of step, as follows:

853                     +===========+=========+==========+
854                     | Condition | start   | end      |
855                     +===========+=========+==========+
856                     | step >= 0 | 0       | len      |
857                     +-----------+---------+----------+
858                     | step < 0  | len - 1 | -len - 1 |
859                     +-----------+---------+----------+

861                        Table 4: Default array slice
862                            start and end values

864     Slice expression parameters start and end are not directly usable as
865     slice bounds and must first be normalized.  Normalization for this
866     purpose is defined as:

868     FUNCTION Normalize(i, len):
869       IF i >= 0 THEN
870         RETURN i
871       ELSE
872         RETURN len + i
873       END IF

875     The result of the array indexing expression [i] applied to an array
876     of length len is defined to be the result of the array slicing
877     expression [i:Normalize(i, len)+1:1].

879     Slice expression parameters start and end are used to derive slice
880     bounds lower and upper.  The direction of the iteration, defined by
881     the sign of step, determines which of the parameters is the lower
882     bound and which is the upper bound:

884     FUNCTION Bounds(start, end, step, len):
885       n_start = Normalize(start, len)
886       n_end = Normalize(end, len)

888       IF step >= 0 THEN
889         lower = MIN(MAX(n_start, 0), len)
890         upper = MIN(MAX(n_end, 0), len)
891       ELSE
892         upper = MIN(MAX(n_start, -1), len-1)
893         lower = MIN(MAX(n_end, -1), len-1)
894       END IF

896       RETURN (lower, upper)

898     The slice expression selects elements with indices between the lower
899     and upper bounds.  In the following pseudocode, the a(i) construct
900     expresses the 0-based indexing operation on the underlying array.
902     IF step > 0 THEN

904       i = lower
905       WHILE i < upper:
906         SELECT a(i)
907         i = i + step
908       END WHILE

910     ELSE IF step < 0 THEN

912       i = upper
913       WHILE lower < i:
914         SELECT a(i)
915         i = i + step
916       END WHILE

918     END IF

920     When step = 0, no elements are selected and the result array is
921     empty.

923     An implementation MUST raise an error if any of the slice expression
924     parameters does not fit in the implementation's representation of an
925     integer.  If a successfully parsed slice expression is evaluated
926     against an array whose size doesn't fit in the implementation's
927     representation of an integer, the implementation MUST raise an error.

929  3.5.7.  Descendant Selector

931     Syntax

933     The descendant selector starts with a double dot .. and can be
934     followed by an object member name (similar to the dot-selector), by
935     an index-selector acting on objects or arrays, or by a wild card.

937     descendant-selector = ".." ( dot-member-name      /  ; ..name
938                                  index-selector       /  ; ..[index]
939                                  index-wild-selector  /  ; ..[*]
940                                  "*"                     ; ..*
941                                )

943     Semantics

945     The descendant-selector is inspired by ECMAScript for XML (E4X).  It
946     selects the node and all its descendants.

948  3.5.8.  Union Selector

950  3.5.8.1.  Syntax

952     The union selector is syntactically related to the index-selector.
953     It contains multiple, comma-separated entries.

955     union-selector = "[" ws union-entry 1*(ws "," ws union-entry) ws "]"

957     union-entry    =  ( quoted-member-name /
958                         element-index      /
959                         slice-index
960                       )

962     Task (T1):  This, besides slice-index, is currently one of only two
963        places in the document that mentions whitespace.  Whitespace needs
964        to be handled throughout the ABNF syntax.  Room Consensus at the
965        2021-06-15 interim was that JSONPath generally is generous with
966        allowing insignificant whitespace throughout.  Minimizing the
967        impact of the many whitespace insertion points by choosing a rule
968        name such as "S" was mentioned.  Some conventions will probably
969        help with minimizing the number of places where S needs to be
970        inserted.

972  3.5.8.2.  Semantics

974     A union selects any node which is selected by at least one of the
975     union selectors and selects the concatenation of the lists (in the
976     order of the selectors) of nodes selected by the union elements.
977     Note that any node selected by more than one of the union selectors
978     appears that many times in the node list.

980  3.5.9.  Filter Selector

982  3.5.9.1.  Syntax

984     The filter selector has the form [?expr].  It works by iterating
985     over structured values, i.e. arrays and objects.

987     filter-selector    = "[?" boolean-expr "]"

989     During the iteration process, each array element or object member is
990     visited and its value -- accessible via symbol @ -- or one of its
991     descendants -- uniquely defined by a relative path -- is tested
992     against a boolean expression boolean-expr.

994     The current item is selected if and only if the result is true.

996     boolean-expr     = logical-or-expr
997     logical-or-expr  = logical-and-expr *("||" logical-and-expr)
998                                        ; disjunction
999                                        ; binds less tightly than conjunction
1000  logical-and-expr = basic-expr *("&&" basic-expr) ; conjunction
1001                                       ; binds more tightly than disjunction

1003  basic-expr    = exist-expr / paren-expr / (neg-op paren-expr) / relation-expr
1004  exist-expr    = [neg-op] path          ; path existence or non-existence
1005  path          = rel-path / json-path
1006  rel-path      = "@" *(dot-selector / index-selector)
1007  paren-expr    = "(" boolean-expr ")"   ; parenthesized expression
1008  neg-op        = "!"                    ; not operator

1010  relation-expr = comp-expr /            ; comparison test
1011                  regex-expr /           ; regular expression test
1012                  contain-expr           ; containment test

1014  comp-expr     = comparable comp-op comparable
1015  comparable    = number / string-literal /  ; primitive ...
1016                  true / false / null /      ; values only
1017                  path                       ; path value
1018  comp-op       = "==" / "!=" /              ; comparison ...
1019                  "<"  / ">"  /              ; operators
1020                  "<=" / ">="

1022  regex-expr    = regex-op regex
1023  regex-op      = "=~"                   ; regular expression match
1024  regex         =

1026  contain-expr  = containable in-op container
1027  containable   = rel-path / json-path / ; path to primitive value
1028                  number / string-literal
1029  in-op         = " in "                 ; in operator
1030  container     = rel-path / json-path / array-literal ; resolves to array

1032     Notes:

1034     *  Parentheses can be used with boolean-expr for grouping.  So filter
1035        selection syntax in the original proposal [?(expr)] is naturally
1036        contained in the current lean syntax [?expr] as a special case.

1038     *  Comparisons are restricted to primitive values (such as number,
1039        string, true, false, null).  Comparisons with complex values will
1040        fail, i.e. no selection occurs.

1042     *  Types are not implicitly converted in comparisons.  So "13 ==
1043        '13'" selects no node.

1045     *  A member or element value by itself is _falsy_ only if it does
1046        not exist.  Otherwise it is _truthy_, resulting in its value.  To
1047        be more specific, explicit comparisons are necessary.  This
1048        existence test -- as an exception to the general rule -- also
1049        works with structured values.

1051     *  Regular expression tests can be applied to string values only.

1053     *  The value of the first operand (containable) of a contain-expr is
1054        compared to every single element of the RHS container.  In case of
1055        a match, a selection occurs.  Containment tests -- like comparisons
1056        -- are restricted to primitive values.  So even if a structured
1057        containable value is equal to a certain structured value in
1058        container, no selection is done.

1060     *  The value of the second operand (container) of a contain-expr
1061        needs to be resolved to an array.  Otherwise nothing is selected.

1063     The following table lists filter expression operators in order of
1064     precedence from highest (binds most tightly) to lowest (binds least
1065     tightly).
1067                +============+===========+===========+
1068                | Precedence | Operator  | Syntax    |
1069                |            | type      |           |
1070                +============+===========+===========+
1071                | 5          | Grouping  | (...)     |
1072                +------------+-----------+-----------+
1073                | 4          | Logical   | !         |
1074                |            | NOT       |           |
1075                +------------+-----------+-----------+
1076                | 3          | Relations | == !=     |
1077                |            |           | < <= > >= |
1078                |            |           | =~        |
1079                |            |           | in        |
1080                +------------+-----------+-----------+
1081                | 2          | Logical   | &&        |
1082                |            | AND       |           |
1083                +------------+-----------+-----------+
1084                | 1          | Logical   | ||        |
1085                |            | OR        |           |
1086                +------------+-----------+-----------+

1088                      Table 5: Filter expression
1089                          operator precedence

1091  3.5.9.2.  Semantics

1093     The filter-selector works with arrays and objects exclusively.  Its
1094     result is a list of _zero_, _one_, _multiple_ or _all_ of their
1095     element or member values.  Applied to other value types, it
1096     selects nothing.

1098     The negation operator neg-op allows testing the _falsiness_ of values.
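
The negation results listed in Table 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name negate is hypothetical, JSON values are assumed to be represented as Python None, bool, int or float, str, dict and list, and the sketch follows Table 6 literally rather than any particular implementation.

```python
def negate(value):
    """Result of applying neg-op to a JSON value, following Table 6.

    Assumption: JSON null/true/false map to Python None/True/False,
    numbers to int or float, objects to dict and arrays to list.
    """
    if value is None:
        return True                # !null is true
    if isinstance(value, bool):    # test bool before int: bool subclasses int
        return not value           # !true is false, !false is true
    if isinstance(value, (int, float)):
        return value == 0          # !0 is true, false for non-zero numbers
    if isinstance(value, str):
        return value == ""         # !"" is true, false for non-empty strings
    return False                   # objects and arrays: always false


def truthiness(value):
    """Double negation (!!), as described after Table 6."""
    return not negate(value)


print(negate(0), negate(""), negate(None))   # True True True
print(negate({}), negate([0]))               # False False
```

Checking bool before the numeric types matters in Python, because True and False would otherwise be treated as the numbers 1 and 0.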
1100       +========+==========+========+===========================+
1101       | Type   | Negation | Result | Comment                   |
1102       +========+==========+========+===========================+
1103       | Number | !0       | true   | false for non-zero number |
1104       +--------+----------+--------+---------------------------+
1105       | String | !""      | true   | false for non-empty       |
1106       |        | !''      |        | string                    |
1107       +--------+----------+--------+---------------------------+
1108       | null   | !null    | true   | --                        |
1109       +--------+----------+--------+---------------------------+
1110       | true   | !true    | false  | --                        |
1111       +--------+----------+--------+---------------------------+
1112       | false  | !false   | true   | --                        |
1113       +--------+----------+--------+---------------------------+
1114       | Object | !{}      | false  | always false              |
1115       |        | !{a:0}   |        |                           |
1116       +--------+----------+--------+---------------------------+
1117       | Array  | ![]      | false  | always false              |
1118       |        | ![0]     |        |                           |
1119       +--------+----------+--------+---------------------------+

1121                  Table 6: Test falsiness of JSON values

1123     Applying the negation operator twice (!!) yields the _truthiness_ of
         a value.

1125     Some examples:

1127  +======================+=========================+==================+=============+
1128  | JSON                 | Query                   | Result           | Comment     |
1129  +======================+=========================+==================+=============+
1130  | {"a":1,"b":2}        | $[?@]                   | [1,2]            | Same as $.* |
1131  | [2,3,4]              |                         | [2,3,4]          | or $[*]     |
1132  +----------------------+-------------------------+------------------+-------------+
1133  | (same as above)      | $[?@==2]                | [2]              | Select by   |
1134  |                      |                         | [2]              | value       |
1135  +----------------------+-------------------------+------------------+-------------+
1136  | {"a":{"b":{"c":{}}}} | $[?@.b]                 | [{"b":{"c":{}}}] | Existence   |
1137  |                      | $[?@.b.c]               |                  |             |
1138  +----------------------+-------------------------+------------------+-------------+
1139  | {"key":false}        | $[?index(@)=='key']     | [false]          | Select      |
1140  |                      | $[?index(@)==0]         | []               | object      |
1141  |                      |                         |                  | member      |
1142  +----------------------+-------------------------+------------------+-------------+
1143  | [3,4,5]              | $[?index(@)==2]         | [5]              | Select      |
1144  |                      | $[?index(@)==17]        | []               | array       |
1145  |                      |                         |                  | element     |
1146  +----------------------+-------------------------+------------------+-------------+
1147  | {"col":"red"}        | $[?@ in                 | ["red"]          | Containment |
1148  |                      | ['red','green','blue']] |                  |             |
1149  +----------------------+-------------------------+------------------+-------------+
1150  | {"a":{"b":5,"c":0}}  | $[?@.b==5 && !@.c]      | [{"b":5,"c":0}]  | Existence   |
1151  +----------------------+-------------------------+------------------+-------------+

1153                                   Table 7

1155  4.  Expression Language

1157     Task (T2):  Separate out expression language.  For now, this
1158        section is a repository for ABNF taken from [RFC8259].  This needs
1159        to be deduplicated with definitions above.

1161     number        = [ minus ] jsint [ frac ] [ exp ]
1162     decimal-point = %x2E                ; .
1163     digit1-9      = %x31-39             ; 1-9
1164     e             = %x65 / %x45         ; e E
1165     exp           = e [ minus / plus ] 1*DIGIT
1166     frac          = decimal-point 1*DIGIT
1167     jsint         = zero / ( digit1-9 *DIGIT )
1168     minus         = %x2D                ; -
1169     plus          = %x2B                ; +
1170     zero          = %x30                ; 0

1172     false         = %x66.61.6c.73.65    ; false
1173     null          = %x6e.75.6c.6c       ; null
1174     true          = %x74.72.75.65       ; true

1176  5.  IANA Considerations

1178     TBD: Define a media type for JSONPath expressions.

1180  6.  Security Considerations

1182     This section gives security considerations, as required by [RFC3552].

1184  7.  References

1186  7.1.  Normative References

1188     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1189                Requirement Levels", BCP 14, RFC 2119,
1190                DOI 10.17487/RFC2119, March 1997.
1193     [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
1194                10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
1195                2003.

1197     [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1198                Specifications: ABNF", STD 68, RFC 5234,
1199                DOI 10.17487/RFC5234, January 2008.

1202     [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1203                2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1204                May 2017.

1206     [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
1207                Interchange Format", STD 90, RFC 8259,
1208                DOI 10.17487/RFC8259, December 2017.

1211  7.2.  Informative References

1213     [E4X]      ISO, "Information technology — ECMAScript for XML (E4X)
1214                specification", ISO/IEC 22537:2006, 2006.

1216     [ECMA-262] Ecma International, "ECMAScript Language Specification,
1217                Standard ECMA-262, Third Edition", December 1999.

1222     [JSONPath-orig]
1223                Gössner, S., "JSONPath — XPath for JSON", 21 February
1224                2007.

1226     [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
1227                Text on Security Considerations", BCP 72, RFC 3552,
1228                DOI 10.17487/RFC3552, July 2003.

1231     [RFC6901]  Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed.,
1232                "JavaScript Object Notation (JSON) Pointer", RFC 6901,
1233                DOI 10.17487/RFC6901, April 2013.

1236     [SLICE]    "Slice notation", n.d.

1239     [XPath]    Berglund, A., Boag, S., Chamberlin, D., Fernandez, M.,
1240                Kay, M., Robie, J., and J. Simeon, "XML Path Language
1241                (XPath) 2.0 (Second Edition)", World Wide Web Consortium
1242                Recommendation REC-xpath20-20101214, 14 December 2010.

1245  Acknowledgements

1247     This specification is based on Stefan Gössner's original online
1248     article defining JSONPath [JSONPath-orig].

1250     The books example was taken from http://coli.lili.uni-
1251     bielefeld.de/~andreas/Seminare/sommer02/books.xml -- a dead link now.

1253  Contributors

1255     Marko Mikulicic
1256     InfluxData, Inc.
1257     Pisa
1258     Italy

1260     Email: mmikulicic@gmail.com

1262     Edward Surov
1263     TheSoul Publishing Ltd.
1264     Limassol
1265     Cyprus

1267     Email: esurov.tsp@gmail.com

1269  Authors' Addresses

1271     Stefan Gössner (editor)
1272     Fachhochschule Dortmund
1273     Sonnenstraße 96
1274     D-44139 Dortmund
1275     Germany

1277     Email: stefan.goessner@fh-dortmund.de

1279     Glyn Normington (editor)
1280     Winchester
1281     United Kingdom

1283     Email: glyn.normington@gmail.com

1285     Carsten Bormann (editor)
1286     Universität Bremen TZI
1287     Postfach 330440
1288     D-28359 Bremen
1289     Germany

1291     Phone: +49-421-218-63921
1292     Email: cabo@tzi.org