CLUE Working Group                                             R. Presta
Internet-Draft                                               S P. Romano
Intended status: Standards Track                    University of Napoli
Expires: April 2, 2015                                September 29, 2014

                 An XML Schema for the CLUE data model
                  draft-ietf-clue-data-model-schema-07

Abstract

   This document provides an XML schema file for the definition of CLUE data model types.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 2, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. XML Schema
   4.
   5.
   6.
   7.
   8.
   9.
   10.
     10.1. captureID attribute
     10.2. mediaType attribute
     10.3.
     10.4.
     10.5.
       10.5.1.
       10.5.2.
     10.6.
     10.7.
     10.8.
     10.9.
     10.10.
     10.11.
     10.12.
     10.13.
     10.14.
     10.15.
     10.16.
     10.17.
     10.18.
     10.19.
     10.20.
       10.20.1.
   11. Audio captures
     11.1.
   12. Video captures
     12.1.
   13. Text captures
   14. Other capture types
   15.
     15.1.
     15.2.
     15.3. sceneID attribute
     15.4. scale attribute
   16.
     16.1.
     16.2. sceneViewID attribute
   17.
     17.1.
     17.2.
     17.3. encodingGroupID attribute
   18.
     18.1. setID attribute
     18.2. mediaType attribute
     18.3.
     18.4.
     18.5.
   19.
   20.
     20.1.
       20.1.1. personID attribute
       20.1.2.
       20.1.3.
   21.
     21.1.
     21.2.
     21.3.
   22.
   23. XML Schema extensibility
     23.1. Example of extension
   24. Security considerations
   25. IANA considerations
     25.1. XML Schema registration
     25.2. XML namespace registration
   26. Sample XML file
   27. MCC example
   28. Diff with draft-ietf-clue-data-model-schema-06 version
   29. Diff with draft-ietf-clue-data-model-schema-04 version
   30. Diff with draft-ietf-clue-data-model-schema-03 version
   31. Diff with draft-ietf-clue-data-model-schema-02 version
   32. Informative References

1. Introduction

   This document provides an XML schema file for the definition of CLUE data model types.

   The schema is based on information contained in [I-D.ietf-clue-framework].
   It encodes information and constraints defined in the aforementioned document in order to provide a formal representation of the concepts presented therein. The schema definition is intended to be modified according to changes applied to the above-mentioned CLUE document.

   This document aims to define a coherent structure for all the information associated with the description of a telepresence scenario. Such information is used within the CLUE protocol messages ([I-D.ietf-clue-protocol]) that enable the dialogue between a Media Provider and a Media Consumer. CLUE protocol messages are XML messages that allow (i) a Media Provider to advertise its telepresence capabilities in terms of media captures, capture scenes, and other features envisioned in the CLUE framework, according to the format defined herein, and (ii) a Media Consumer to request the desired telepresence options in the form of capture encodings, represented as described in this document.

2. Terminology

   This document uses the terminology of [I-D.ietf-clue-framework], except for the "CLUE Participant" definition (which is still under discussion). We briefly recall here some of the main terms used in the document.

   Audio Capture: Media Capture for audio. Denoted as ACn in the example cases in this document.

   Camera-Left and Right: For Media Captures, Camera-Left and Camera-Right are from the point of view of a person observing the rendered media. They are the opposite of Stage-Left and Stage-Right.

   Capture: Same as Media Capture.

   Capture Device: A device that converts audio and video input into an electrical signal, in most cases to be fed into a media encoder.

   Capture Encoding: A specific encoding of a Media Capture, to be sent by a Media Provider to a Media Consumer via RTP.

   Capture Scene: An abstraction grouping semantically-coupled Media Captures available at the Media Provider's side, representing a precise portion of the local scene that can be transmitted remotely. A Capture Scene MAY correspond to a part of the telepresence room or MAY focus only on the presentation media. A Capture Scene is characterized by a set of attributes and by a set of Capture Scene Views.

   Capture Scene View: A list of Media Captures of the same media type that constitute a possible representation of a Capture Scene. Media Captures belonging to the same Capture Scene View can be sent simultaneously by the Media Provider.

   CLUE Participant: This term is not imported from the framework terminology and should be considered temporary since it is under review. We introduced it for the sake of simplicity in order to identify a generic entity (either an Endpoint or an MCU) making use of the CLUE protocol.

   Consumer: Same as Media Consumer.

   Encoding or Individual Encoding: The representation of an encoding technology. In the CLUE data model, each encoding is provided with a set of parameters representing the encoding constraints, such as the maximum bandwidth of the Media Provider that the encoding can consume.

   Encoding Group: The representation of a group of encodings. For each group, a set of parameters is provided representing the constraints to be applied to the group as a whole. An example is the maximum bandwidth that can be consumed when using the contained encodings simultaneously.

   Endpoint: The logical point of final termination through receiving, decoding and rendering, and/or initiation through capturing, encoding, and sending of media streams.
      An endpoint consists of one or more physical devices which source and sink media streams, and exactly one SIP Conferencing Framework Participant (which, in turn, includes exactly one SIP User Agent). Endpoints can be anything from multiscreen/multicamera room controllers to handheld devices.

   MCU: Multipoint Control Unit (MCU) - a device that connects two or more endpoints together into one single multimedia conference. An MCU may include a Mixer.

   Media: Any data that, after suitable encoding, can be conveyed over RTP, including audio, video or timed text.

   Media Capture: A "Media Capture", or simply "Capture", is a source of Media of a single type (i.e., audio or video or text).

   Media Stream: The term "Media Stream", or simply "Stream", is used as a synonym for Capture Encoding.

   Media Provider: A CLUE participant (i.e., an Endpoint or an MCU) able to send Media Streams.

   Media Consumer: A CLUE participant (i.e., an Endpoint or an MCU) able to receive Media Streams.

   Scene: Same as Capture Scene.

   Scene View: Same as Capture Scene View.

   Stream: Same as Media Stream.

   Multiple Content Capture: A Capture that can contain different Media Captures of the same media type. It is denoted as MCC in this document. In the Stream resulting from the MCC, the Streams coming from the encoding of the constituent Media Captures can appear simultaneously, if the MCC is the result of a mixing operation, or alternately over time, according to a certain switching policy.

   Plane of Interest: The spatial plane containing the most relevant subject matter.

   Provider: Same as Media Provider.

   Render:

   Simultaneous Transmission Set: A set of Media Captures of the same media type that can be transmitted simultaneously from a Media Provider.

   Single Media Capture: A Capture representing the Media coming from a single-source Capture Device.

   Spatial Information: Data about the spatial position of a Capture Device that generates a Single Media Capture within the context of a Capture Scene representing a physical portion of a Telepresence Room.

   Stream Characteristics: The union of the features used to describe a Stream in the CLUE environment and in the SIP-SDP environment.

   Video Capture: A Media Capture for video.

3. XML Schema

   This section contains the CLUE data model schema definition.

   The element and attribute definitions are formal representations of the concepts needed to describe the capabilities of a Media Provider and the streams that are requested by a Media Consumer given the Media Provider's ADVERTISEMENT ([I-D.ietf-clue-protocol]).

   The main groups of information are:

      the list of media captures available (Section 4)

      the list of encoding groups (Section 5)

      the list of capture scenes (Section 6)

      the list of simultaneous transmission sets (Section 7)

      the list of global view sets (Section 8)

      metadata about the participants represented in the telepresence session (Section 20)

      the list of instantiated capture encodings (Section 9)

   All of the above refer to concepts that have been introduced in [I-D.ietf-clue-framework] and are detailed further in the remainder of this document.
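
   As a rough illustration of the groups listed above, the following sketch shows how a CLUE data model instance might be laid out. Every element name in it (clueInfo, mediaCaptures, encodingGroups, captureScenes, simultaneousSets, globalViews, people, captureEncodings) is an assumption made for illustration only, as are the missing namespace declaration and the empty element bodies; all of them should be checked against the schema definition before being relied upon.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative skeleton only: all element names here are assumed. -->
<clueInfo>
  <!-- list of media captures available (Section 4) -->
  <mediaCaptures/>
  <!-- list of encoding groups (Section 5) -->
  <encodingGroups/>
  <!-- list of capture scenes (Section 6) -->
  <captureScenes/>
  <!-- list of simultaneous transmission sets (Section 7) -->
  <simultaneousSets/>
  <!-- list of global views (Section 8) -->
  <globalViews/>
  <!-- metadata about the represented participants (Section 20) -->
  <people/>
  <!-- list of instantiated capture encodings (Section 9) -->
  <captureEncodings/>
</clueInfo>
```

   The skeleton only mirrors the grouping described in the text: each group is a container whose children are described in the section given in the corresponding comment.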

   [XML Schema definition omitted]

   The following sections describe the XML schema in more detail.

4.

   This element represents the list of one or more media captures available on the Media Provider's side. Each media capture is represented by its own element (Section 10).

5.

   This element represents the list of the encoding groups organized on the Media Provider's side. Each encoding group is represented by its own element (Section 17).

6.

   This element represents the list of the capture scenes organized on the Media Provider's side. Each capture scene is represented by its own element (Section 15).

7.

   This element contains the simultaneous sets indicated by the Media Provider. Each simultaneous set is represented by its own element (Section 18).

8.

   This element contains a set of alternative representations of all the scenes that are offered by a Media Provider to a Media Consumer. Each alternative is called a "global view" and is represented by its own element (Section 19).

9.

   This element is a list of capture encodings. It can represent the list of the desired capture encodings indicated by the Media Consumer or the list of instantiated captures on the provider's side. Each capture encoding is represented by its own element (Section 21).

10.

   According to the CLUE framework, a media capture is the fundamental representation of a media flow that is available on the provider's side. Media captures are characterized (i) by a set of features that are independent of the specific type of medium, and (ii) by a set of features that are media-specific. The features that are common to all media types appear within the media capture type, which has been designed as an abstract complex type. Media-specific captures, such as video captures, audio captures and others, are specializations of that abstract media capture type, as in a typical generalization-specialization hierarchy.

   The following is the XML Schema definition of the media capture type:

   [XML Schema definition omitted]

10.1. captureID attribute

   The "captureID" attribute is a mandatory field containing the identifier of the media capture.

10.2. mediaType attribute

   The "mediaType" attribute is a mandatory attribute specifying the media type of the capture ("audio", "video", "text",...).

10.3.

   This element is a mandatory field containing the identifier of the capture scene the media capture is defined in. Indeed, each media capture must be defined within one and only one capture scene. When a media capture is spatially definable, some spatial information is provided along with it in the form of point coordinates (see Section 10.5). Such coordinates refer to the coordinate space defined for the capture scene containing the capture.

10.4.

   This element is an optional field containing the identifier of the encoding group the media capture is associated with. Media captures that are not associated with any encoding group cannot be instantiated as media streams.

10.5.

   Media captures are divided into two categories: (i) non spatially definable captures and (ii) spatially definable captures.

   Captures are spatially definable when at least one of the following holds: (i) it is possible to provide the coordinates of the device position within the telepresence room of origin (capture point) together with its capturing direction specified by a second point (point on line of capture), or (ii) it is possible to provide the represented area within the telepresence room, by listing the coordinates of the four co-planar points identifying the plane of interest (area of capture). The coordinates of the above-mentioned points must be expressed according to the coordinate space of the capture scene the media capture belongs to.

   Non spatially definable captures cannot be characterized within the physical space of the telepresence room of origin.
   Captures of this kind include, for example, those related to recordings, text captures, DVDs, registered presentations, or external streams that are played in the telepresence room and transmitted to remote sites.

   Spatially definable captures represent a part of the telepresence room. The captured part of the telepresence room is described by means of the spatial information element. By comparing the spatial information elements of different media captures within the same capture scene, a consumer can better determine the spatial relationships between them and render them correctly. Non spatially definable captures do not include such an element in their XML description; they are instead characterized by a dedicated tag set to "true" (see Section 10.6).

   The definition of the spatial information type is the following:

   [XML Schema definition omitted]

   One child element contains the coordinates of the capture device that is taking the capture and, optionally, the pointing direction (see Section 10.5.1).

   Another child element is an optional field containing four points defining the captured area covered by the capture (see Section 10.5.2).

10.5.1.

   This element is used to represent the position and, optionally, the line of capture of a capture device. It MUST be included in spatially definable audio captures, while it is optional for spatially definable video captures.

   The XML Schema definition of the element type is the following:

   [XML Schema definition omitted]

   The point type contains three spatial coordinates (x,y,z) representing a point in the space associated with a certain capture scene.

   The capture point type extends the point type, i.e., it is represented by three coordinates identifying the position of the capture device, but can add further information.
   Such further information is conveyed by another point-type element representing the "point on line of capture", which gives the pointing direction of the capture device.

   The coordinates of the point on line of capture MUST NOT be identical to the capture point coordinates. For a spatially definable video capture, if the point on line of capture is provided, it MUST belong to the region between the point of capture and the capture area. For a spatially definable audio capture, if the point on line of capture is not provided, the sensitivity pattern should be considered omnidirectional.

10.5.2.

   This is an optional element that can be contained within the spatial information associated with a media capture. It represents the spatial area captured by the media capture. It MUST be included in the spatial information of spatially definable video captures, while it MUST NOT be associated with audio captures.

   The XML representation of that area is provided through a set of four point-type elements, as can be seen from the following definition:

   [XML Schema definition omitted]

   The four points MUST be co-planar.

10.6.

   When media captures are non spatially definable, they are marked with a boolean element set to "true" and no spatial information is provided. Indeed, the two are mutually exclusive tags, according to the XML Schema definition of the media capture type.

10.7.

   A media capture can be (i) an individual media capture or (ii) a multiple content capture (MCC). A multiple content capture is made up of different captures that can be arranged spatially (by a composition operation), or temporally (by a switching operation), or that can result from a combination of both techniques. If a media capture is an MCC, then its XML data model representation can include the element described in this section.
   It is composed of a list of media capture identifiers ("captureIDREF") and capture scene view identifiers ("sceneViewIDREF"), where the latter are used as shortcuts to refer to multiple capture identifiers. The referenced captures are used to create the MCC according to a certain strategy. If the element does not appear in an MCC, or it has no child elements, then the MCC is assumed to be made up of multiple sources but no information regarding those sources is given.

   [XML Schema definition omitted]

10.8.

   This is an optional element for multiple content captures that contains a numeric identifier. Multiple content captures marked with the same identifier contain, at any given time, captures coming from the same source. It is the Media Provider (MP) that determines what the source for the captures is. In this way, the MP can choose how to group together single captures for the purpose of keeping them synchronized according to the SynchronisationID attribute.

10.9.

   This is an optional element that can be used only for multiple content captures. It indicates the criteria applied to build the multiple content capture using the media captures referenced by the MCC (Section 10.7). This element can assume one of a list of pre-defined values ([todo]).

10.10.

   This is an optional element that can be used only for multiple content captures. It provides information about the number of media captures that can be represented in the multiple content capture at a time. The type definition is provided below.

   [XML Schema definition omitted]

   When the "exactNumber" attribute is set to "1", the element carries the exact number of media captures appearing at a time. Otherwise, the number of represented media captures MUST be considered less than or equal to ("<=") the element value.

10.11.

   This is a boolean element that MUST be used for single-content captures. Its value is fixed and set to "true". The element indicates that the capture being described is not a multiple content capture. Indeed, this element and the aforementioned tags related to MCC attributes (from Section 10.7 to Section 10.10) are mutually exclusive, according to the XML Schema definition of the media capture type.

10.12.

   This element optionally provides human-readable textual information about a media capture. Besides media captures, the same element is used to describe capture scenes and capture scene views, as it is included in their XML representation. A media capture can be described by using multiple such elements, each one providing information in a different language. The element definition is the following:

   [XML Schema definition omitted]

   As can be seen, it is a string element with an attribute ("lang") indicating the language used in the textual description.

10.13.

   This is an optional unsigned integer field indicating the importance of a media capture from the Media Provider's perspective. It can be used on the receiver's side to automatically identify the most relevant contribution from the Media Provider. The higher the importance, the lower the contained value. Media captures marked with a "0" priority value are "not subject to priority".

10.14.

   This is an optional element containing the language used in the capture, if any.

10.15.

   This is an optional element indicating whether or not the capture device originating the capture may move during the telepresence session. It can assume one of the following three values: (i) static, (ii) dynamic or (iii) highly dynamic.

10.16.

   This optional element contains an unsigned integer indicating the maximum number of capture encodings that can be simultaneously active for the media capture. If absent, this parameter defaults to 1. The minimum value is 1. The number of simultaneous capture encodings is also limited by the restrictions of the encoding group the media capture refers to by means of the element described in Section 10.4.

10.17.

   This optional element contains the value of the ID attribute of the media capture it refers to. A media capture marked with this element can be, for example, the translation of a main media capture into a different language.

10.18.

   This element is an optional tag describing what is represented in the spatial area covered by a media capture. The current possible values are: "table", "lectern", "individual", and "audience", as listed in the enumerated view type of the schema.

10.19.

   This element is an optional tag used for media captures conveying information about presentations within the telepresence session. The current possible values are "slides" and "images", as listed in the enumerated presentation type of the schema.

10.20.

   This optional element is used to indicate which telepresence session participants are represented within the media captures. For each participant, a child element is provided (Section 10.20.1).

10.20.1.

   This element contains the identifier of the represented person. Metadata about the represented participant can be retrieved by accessing the list of participants (Section 20).

11. Audio captures

   Audio captures inherit all the features of a generic media capture and present further audio-specific characteristics.
   The XML Schema definition of the audio capture type is reported below:

   [XML Schema definition omitted]

   An example of audio-specific information that can be included is the element described in Section 11.1.

11.1.

   This element is an optional field describing the characteristics of the nominal sensitivity pattern of the microphone capturing the audio signal.

   The XML Schema definition is provided below:

   [XML Schema definition omitted]

12. Video captures

   Video captures, similarly to audio captures, extend the information of a generic media capture with video-specific features, such as the element described in Section 12.1.

   The XML Schema representation of the video capture type is the following:

   [XML Schema definition omitted]

12.1.

   This is a boolean element indicating that there is text embedded in the video capture. The language used in such embedded text is reported in the element's "lang" attribute.

   The XML Schema definition of the element is:

   [XML Schema definition omitted]

13. Text captures

   Text captures can also be described by extending the generic media capture information, similarly to audio captures and video captures.

   The XML Schema representation of the text capture type currently lacks text-specific information, as can be seen from the definition below:

   [XML Schema definition omitted]

14. Other capture types

   Other media capture types can be described by using the CLUE data model. They can be represented by means of the "otherCaptureType" type. This media capture type is conceived to be filled with elements defined within extensions of the current schema, i.e., with elements defined in other XML schemas (see Section 23 for an example).
   The otherCaptureType inherits all the features envisioned for the
   abstract mediaCaptureType.

   The XML Schema representation of the otherCaptureType is the
   following:

   [XML Schema fragment not preserved in this copy.]

15.

   A Media Provider organizes the available captures into capture
   scenes in order to help the receiver both in the rendering and in
   the selection of the group of captures.  Capture scenes are made of
   media captures and capture scene views, which are sets of media
   captures of the same media type.  Each capture scene view is an
   alternative way to completely represent a capture scene for a fixed
   media type.

   The XML Schema representation of a capture scene element is the
   following:

   [XML Schema fragment not preserved in this copy.]

   Each capture scene is identified by a "sceneID" attribute.  The
   capture scene element can contain zero or more textual description
   elements, defined as in Section 10.12.  Besides these, there is an
   optional element (Section 15.1) that contains structured
   information about the scene in the vCard format, and an optional
   element (Section 15.2) that lists the capture scene views.  When no
   list of scene views is provided, the capture scene is assumed to be
   made of all the media captures that carry the value of its sceneID
   attribute in their mandatory captureSceneIDREF attribute.

15.1.

   This element contains optional information about the capture scene
   according to the vCard format.

15.2.

   This element is a mandatory field of a capture scene containing the
   list of scene views.  Each scene view is represented by its own
   child element (Section 16).

   [XML Schema fragment not preserved in this copy.]

15.3.  sceneID attribute

   The sceneID attribute is a mandatory attribute containing the
   identifier of the capture scene.

15.4.
scale attribute

   The scale attribute is a mandatory attribute that specifies the
   scale of the coordinates provided in the spatial information of the
   media captures belonging to the considered capture scene.  The
   scale attribute can assume three different values:

      "mm" - the scale is in millimeters.  Systems that know their
      physical dimensions (for example, professionally installed
      telepresence room systems) should always provide those real-world
      measurements.

      "unknown" - the scale is not necessarily millimeters, but the
      scale is the same for every media capture in the capture scene.
      Systems that do not know specific physical dimensions but still
      know relative distances should select "unknown" in the scale
      attribute of the capture scene to be described.

      "noscale" - there is no common physical scale among the media
      captures of the capture scene.  That means the scale could be
      different for each media capture.

   [XML Schema fragment not preserved in this copy.]

16.

   This element represents a capture scene view, which contains a set
   of media captures of the same media type describing a capture scene.

   The element is characterized as follows.

   [XML Schema fragment not preserved in this copy.]

   One or more optional description elements provide human-readable
   information about what the scene view contains.  They are defined
   as already seen in Section 10.12.

   The remaining child elements are described in the following
   subsections.

16.1.

   This element is the list of the identifiers of the media captures
   included in the scene view.  It is of the captureIDListType type,
   which is defined as a sequence of reference elements, each one
   containing the identifier of a media capture listed among the
   advertised media captures:

   [XML Schema fragment not preserved in this copy.]

16.2.
sceneViewID attribute

   The sceneViewID attribute is a mandatory attribute containing the
   identifier of the capture scene view represented by the scene view
   element.

17.

   This element represents an encoding group, which is made of a set
   of one or more individual encodings and some parameters that apply
   to the group as a whole.  Encoding groups contain references to
   individual encodings that can be applied to media captures.  The
   definition of the encoding group element is the following:

   [XML Schema fragment not preserved in this copy.]

   In the following, the contained elements are further described.

17.1.

   This optional field contains the maximum bitrate, expressed in
   bits per second, that can be shared by the individual encodings
   included in the encoding group.

17.2.

   This element is the list of the individual encodings grouped
   together in the encoding group.  Each individual encoding is
   represented through its identifier, contained within a dedicated
   child element.

   [XML Schema fragment not preserved in this copy.]

17.3.  encodingGroupID attribute

   The encodingGroupID attribute contains the identifier of the
   encoding group.

18.

   This element represents a simultaneous transmission set, i.e., a
   list of captures of the same media type that can be transmitted at
   the same time by a Media Provider.  There are different
   simultaneous transmission sets for each media type.

   [XML Schema fragment not preserved in this copy.]

   Besides the identifiers of the captures, the identifiers of capture
   scene views and of capture scenes can also be used as shortcuts,
   each carried in its own kind of reference element.

18.1.  setID attribute

   The "setID" attribute is a mandatory field containing the
   identifier of the simultaneous set.
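
   As a minimal sketch of how such a set might be serialized, consider
   the fragment below.  Only the "setID" attribute name is taken from
   this document; the element names simultaneousSet, mediaCaptureIDREF
   and sceneViewIDREF are assumptions used for illustration, and the
   schema remains the authoritative source for the real names.  The
   identifiers mirror the first simultaneous set of the sample in
   Section 26 (capture VC3 plus the scene view SE1).

```xml
<!-- Sketch only: element names are assumed; setID is from the text. -->
<simultaneousSet setID="SS1">
  <!-- reference to a single media capture -->
  <mediaCaptureIDREF>VC3</mediaCaptureIDREF>
  <!-- shortcut: every capture listed in scene view SE1 -->
  <sceneViewIDREF>SE1</sceneViewIDREF>
</simultaneousSet>
```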

   When only capture scene identifiers are listed within a
   simultaneous set, the media type attribute MUST be used in order to
   determine which media captures can be simultaneously sent together.

18.2.  mediaType attribute

   The "mediaType" attribute is an optional attribute containing the
   media type of the captures referenced by the simultaneous set.

   When only capture scene identifiers are listed within a
   simultaneous set, the media type attribute MUST appear in the XML
   description in order to determine which media captures can be
   simultaneously sent together.

18.3.

   This element contains the identifier of a media capture that
   belongs to the simultaneous set.

18.4.

   This element contains the identifier of a scene view containing a
   group of captures that can be sent simultaneously with the other
   captures of the simultaneous set.

18.5.

   This element contains the identifier of a capture scene in which
   all the included captures of a certain media type can be sent
   together with the other captures of the simultaneous set.

19.

   A global view is a set of captures of the same media type
   representing a summary of the complete Media Provider's offer.  The
   content of a global view is expressed by leveraging only scene view
   identifiers, each put within a reference element.  Each global view
   is identified by a unique identifier within the "globalViewID"
   attribute.

   [XML Schema fragment not preserved in this copy.]

20.

   Information about the participants that are represented in the
   media captures is conveyed via a dedicated element.  As can be seen
   from the XML Schema below, one child element is provided for each
   participant.

   [XML Schema fragment not preserved in this copy.]

20.1.

   This element includes all the metadata related to a person
   represented within one or more media captures.
   It provides the vCard of the subject (see Section 20.1.2) and his
   or her conference role(s), each in a dedicated element (see
   Section 20.1.3).  Furthermore, it has a mandatory "personID"
   attribute (Section 20.1.1).

20.1.1.  personID attribute

   The "personID" attribute carries the identifier of a represented
   person.  Such an identifier can be used to refer to the participant,
   as in the references to captured people within the media capture
   representation (Section 10.20).

20.1.2.

   This element is the XML representation of all the fields composing
   a vCard as specified in the xCard RFC [RFC6351].  The vcardType is
   imported from the xCard XML Schema provided by
   [I-D.ietf-ecrit-additional-data].  As that schema specifies, one of
   the elements within the vCard is mandatory.

20.1.3.

   The value of this element determines the role of the represented
   participant within the telepresence session organization.  It can
   be one of the following terms, which are defined in the framework
   document: "presenter", "timekeeper", "attendee", "minute taker",
   "translator", "chairman", "vice-chairman".

   A participant can have more than one conference role.  In that
   case, more than one role element will appear in the participant's
   description.

21.

   A capture encoding results from the association of a media capture
   and an individual encoding, forming a capture stream as defined in
   [I-D.ietf-clue-framework].  The model of such an entity is provided
   in the following.

   [XML Schema fragment not preserved in this copy.]

21.1.

   This mandatory element contains the identifier of the media capture
   that has been encoded to form the capture encoding.

21.2.

   This mandatory element contains the identifier of the applied
   individual encoding.

21.3.

   This optional element is used when configuring MCCs.
   It contains the list of capture identifiers and capture scene view
   identifiers that the Media Consumer wants within the MCC.  That
   element is structured like the element used to describe the content
   of an MCC, i.e., it contains references to captures and to capture
   scene views.  The total number of media captures listed in it must
   be lower than or equal to the value carried within the
   corresponding attribute of the MCC.

22.  <clueInfo>

   The <clueInfo> element has been left within the XML Schema to
   represent a draft version of the body of an ADVERTISEMENT message
   (see the example section).

   [XML Schema fragment not preserved in this copy.]

23.  XML Schema extensibility

   The telepresence data model defined in this document is meant to be
   extensible.  Extensions are accomplished by defining elements or
   attributes qualified by namespaces other than
   "urn:ietf:params:xml:ns:clue-info" and
   "urn:ietf:params:xml:ns:vcard-4.0" for use wherever the schema
   allows such extensions (i.e., where the XML Schema definition
   specifies "anyAttribute" or "anyElement").  Elements or attributes
   from unknown namespaces MUST be ignored.

23.1.  Example of extension

   When extending the CLUE data model, a new schema with a new
   namespace associated with it needs to be specified.

   In the following, an example of extension is provided.  The
   extension defines a new audio capture attribute ("newAudioFeature")
   and an attribute for characterizing the captures belonging to an
   "otherCaptureType" defined by the user.  An XML document compliant
   with the extension is also included.  The XML file validates
   against the current CLUE data model schema.

   [XML Schema extension and example XML document not preserved in
   this copy.  The element values that survive from the example are:
   CS1, true, true, EG1, newAudioFeatureValue; CS1, true, EG1,
   OtherValue; 300000, ENC4, ENC5.]

24.  Security considerations

   This document defines an XML Schema data model for telepresence
   scenarios.  The modeled information is identified in the CLUE
   framework as necessary to enable full-featured media stream
   negotiation and rendering.  Indeed, the XML elements herein defined
   are used within CLUE protocol messages to describe both the media
   streams representing the MP's telepresence offer and the desired
   selection requested by the MC.  Security concerns described in
   [I-D.ietf-clue-framework], Section 15, apply to this document.

   Data model information carried within CLUE messages SHOULD be
   accessed only by authenticated endpoints.  Indeed, some information
   published by the MP might reveal sensitive data about who and what
   is represented in the transmitted streams.  The vCards included in
   the elements describing persons (Section 20.1) always contain the
   identity of the represented person.  Optionally, vCards can also
   carry the person's contact addresses, together with his/her photo
   and other personal data.  Similar privacy-critical information can
   be conveyed by means of the scene information elements
   (Section 15.1) describing the capture scenes.  The description
   elements can also specify details that should be protected about
   the content of media captures (Section 10.12), capture scenes
   (Section 15), and scene views (Section 16).

   Integrity attacks on the data model information encapsulated in
   CLUE messages can compromise the setup of the telepresence session
   by misleading the MC's and MP's interpretation of the offered and
   desired media streams.

   The assurance of authenticated access and of the integrity of the
   data model information is up to the involved transport mechanisms,
   namely the CLUE protocol [I-D.ietf-clue-protocol] and the CLUE data
   channel [I-D.ietf-clue-datachannel].

25.  IANA considerations

   ToDo.

25.1.  XML Schema registration

   ToDo.

25.2.  XML namespace registration

   ToDo.

26.  Sample XML file

   The following XML document represents a schema-compliant example of
   a CLUE telepresence scenario.  Taking inspiration from the examples
   described in the framework draft ([I-D.ietf-clue-framework]), the
   XML representation of an endpoint-style Media Provider's offer is
   provided.

   There are three cameras, where the central one is also capable of
   capturing a zoomed-out view of the overall telepresence room.
   Besides the three video captures coming from such cameras, the MP
   makes available a further multi-content capture of the loudest
   segment of the room, obtained by switching the video source across
   the three cameras.  For the sake of simplicity, only one audio
   capture is advertised for the audio of the whole room.

   The three cameras are placed in front of three participants (Alice,
   Bob and Ciccio), whose vCards and conference roles are also
   provided.

   Media captures are arranged into four capture scene views:

   1.  (VC0, VC1, VC2) - left, center and right camera video captures

   2.  (VC3) - video capture associated with loudest room segment

   3.  (VC4) - video capture zoomed out view of all people in the room

   4.  (AC0) - main audio

   There are two encoding groups: (i) EG0, for video encodings, and
   (ii) EG1, for audio encodings.
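
   To make the two encoding groups concrete, the fragment below
   sketches how they might be written.  The "encodingGroupID"
   attribute name, the bandwidth values and the encoding identifiers
   come from this document; the element names encodingGroups,
   encodingGroup, maxGroupBandwidth, encodingIDList and encodingID are
   assumptions used for illustration, and the pairing of the first
   value set with EG0 is inferred from the order in which the values
   appear in the sample.

```xml
<!-- Sketch only: element names are assumed, not taken from the schema. -->
<encodingGroups>
  <!-- EG0: video encodings -->
  <encodingGroup encodingGroupID="EG0">
    <maxGroupBandwidth>600000</maxGroupBandwidth>
    <encodingIDList>
      <encodingID>ENC1</encodingID>
      <encodingID>ENC2</encodingID>
      <encodingID>ENC3</encodingID>
    </encodingIDList>
  </encodingGroup>
  <!-- EG1: audio encodings -->
  <encodingGroup encodingGroupID="EG1">
    <maxGroupBandwidth>300000</maxGroupBandwidth>
    <encodingIDList>
      <encodingID>ENC4</encodingID>
      <encodingID>ENC5</encodingID>
    </encodingIDList>
  </encodingGroup>
</encodingGroups>
```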
1938 As to the simultaneous sets, only VC1 and VC4 cannot be transmitted 1939 simultaneously since they are captured by the same device, i.e., the 1940 central camera (VC4 is a zoomed-out view while VC1 is a focused view 1941 of the front participant). The simultaneous sets would then be the 1942 following: 1944 SS1 made by VC3 and all the captures in the first capture scene view 1945 (VC0,VC1,VC2); 1947 SS2 made by VC3, VC0, VC2, VC4 1949 1950 1952 1953 1955 CS1 1956 EG1 1957 1958 1959 0.5 1960 1.0 1961 0.5 1962 1963 0.5 1964 0.0 1965 0.5 1966 1967 1968 1969 true 1970 main audio from the room 1971 1 1972 it 1973 static 1974 room 1975 1976 alice 1977 bob 1978 ciccio 1979 1980 1 1981 1982 1984 CS1 1985 EG0 1986 1987 1988 0.5 1989 1.0 1990 0.5 1991 1992 0.5 1993 0.0 1994 0.5 1995 1996 1997 1998 true 1999 left camera video capture 2000 1 2001 it 2002 static 2003 individual 2004 2005 ciccio 2007 2008 2 2009 2010 2012 CS1 2013 EG0 2014 2015 2016 0.5 2017 1.0 2018 0.5 2019 2020 0.5 2021 0.0 2022 0.5 2023 2024 2025 2026 true 2027 central camera video capture 2028 1 2029 it 2030 static 2031 individual 2032 2033 alice 2034 2035 2 2036 2037 2039 CS1 2040 EG0 2041 2042 2043 0.5 2044 1.0 2045 0.5 2046 2047 0.5 2048 0.0 2049 0.5 2050 2051 2052 2053 true 2054 right camera video capture 2055 1 2056 it 2057 static 2058 individual 2059 2060 bob 2061 2062 2 2063 2064 2066 CS1 2067 EG0 2068 true 2069 Soundlevel:0 2070 loudest room segment 2071 1 2072 it 2073 static 2074 individual 2075 1 2076 2077 2079 CS1 2080 EG0 2081 2082 2083 0.5 2084 1.0 2085 0.5 2086 2087 0.5 2088 0.0 2089 0.5 2090 2091 2092 2093 true 2094 zoomed out view of all people in the 2095 room 2096 1 2097 it 2098 static 2099 room 2100 2101 alice 2102 bob 2103 ciccio 2104 2105 1 2106 2107 2108 2109 2110 600000 2111 2112 ENC1 2113 ENC2 2114 ENC3 2115 2116 2117 2118 300000 2119 2120 ENC4 2121 ENC5 2122 2123 2124 2125 2126 2127 2128 2129 2130 VC0 2131 VC1 2132 VC2 2133 2134 2135 2136 2137 VC3 2138 2139 2140 2141 2142 VC4 
2143 2144 2145 2146 2147 VC4 2148 2149 2150 2152 2153 2154 2155 2156 VC3 2157 SE1 2158 2159 2160 VC0 2161 VC2 2162 VC4 2163 VC3 2164 2165 2166 2167 2168 2169 2170 Bob 2171 2172 2173 minute taker 2174 2175 2176 2177 2178 Alice 2179 2180 2181 presenter 2182 2183 2184 2185 2186 Ciccio 2187 2188 2189 chairman 2190 timekeeper 2191 2192 2193 2194 27. MCC example 2196 Enhancing the scenario presented in the previous example, the Media 2197 Provider is able to advertise a composed capture VC7 made by a big 2198 picture representing the current speaker (VC3) and two picture-in- 2199 picture boxes representing the previous speakers (the previous one 2200 -VC5- and the oldest one -VC6). The provider does not want to 2201 instantiate and send VC5 and VC6, so it does not associate any 2202 encoding group with them. Their XML representations are provided for 2203 enabling the description of VC7. 2205 A possible description for that scenario could be the following: 2207 2208 2210 2211 2213 CS1 2214 EG1 2215 2216 2217 0.5 2218 1.0 2219 0.5 2220 2221 0.5 2222 0.0 2223 0.5 2224 2225 2226 2227 true 2228 main audio from the room 2229 1 2230 it 2231 static 2232 room 2233 2234 alice 2235 bob 2236 ciccio 2237 2238 1 2239 2240 2242 CS1 2243 EG0 2244 2245 2246 0.5 2247 1.0 2248 0.5 2249 2250 0.5 2251 0.0 2252 0.5 2253 2254 2255 2256 true 2257 left camera video capture 2258 1 2259 it 2260 static 2261 individual 2262 2263 ciccio 2264 2265 2 2266 2267 2269 CS1 2270 EG0 2271 2272 2273 0.5 2274 1.0 2275 0.5 2276 2277 0.5 2278 0.0 2279 0.5 2280 2281 2282 2283 true 2284 central camera video capture 2285 1 2286 it 2287 static 2288 individual 2289 2290 alice 2291 2292 2 2293 2294 2296 CS1 2297 EG0 2298 2299 2300 0.5 2301 1.0 2302 0.5 2303 2304 0.5 2305 0.0 2306 0.5 2307 2308 2309 2310 true 2311 right camera video capture 2312 1 2313 it 2314 static 2315 individual 2316 2317 bob 2318 2319 2 2320 2321 2323 CS1 2324 EG0 2325 true 2326 2327 SE1 2328 2329 Soundlevel:0 2330 loudest room segment 2331 1 
2332 it 2333 static 2334 individual 2335 1 2337 2338 2340 CS1 2341 EG0 2342 2343 2344 0.5 2345 1.0 2346 0.5 2347 2348 0.5 2349 0.0 2350 0.5 2351 2352 2353 2354 true 2355 zoomed out view of all people in the room 2356 1 2357 it 2358 static 2359 room 2360 2361 alice 2362 bob 2363 ciccio 2364 2365 1 2366 2367 2369 CS1 2370 true 2371 2372 SE1 2373 2374 Soundlevel:1 2375 penultimate loudest room segment 2376 1 2377 it 2378 static 2379 individual 2380 1 2381 2382 2384 CS1 2385 true 2386 2387 SE1 2388 2389 Soundlevel:2 2390 last but two loudest room segment 2391 1 2392 it 2393 static 2394 individual 2395 1 2396 2397 2399 CS1 2400 true 2401 2402 VC3 2403 VC5 2404 VC6 2405 2406 big picture of the current speaker + 2407 pips about previous speakers 2408 1 2409 it 2410 static 2411 individual 2412 1 2413 2414 2415 2416 2417 600000 2418 2419 ENC1 2420 ENC2 2421 ENC3 2422 2423 2424 2425 300000 2426 2427 ENC4 2428 ENC5 2429 2430 2431 2432 2433 2434 2435 2436 participants' individual 2437 videos 2438 2439 VC0 2440 VC1 2441 VC2 2442 2443 2444 2445 loudest segment of the 2446 room 2447 2448 VC3 2449 2450 2451 2452 loudest segment of the 2453 room + pips 2454 2455 VC7 2456 2457 2458 2459 room audio 2460 2461 AC0 2462 2463 2464 2465 room video 2466 2467 VC4 2468 2469 2470 2471 2472 2473 2474 2475 VC7 2476 SE1 2477 2478 2479 VC0 2480 VC2 2481 VC4 2482 VC7 2483 2484 2485 2486 2487 2488 2489 Bob 2490 2491 2492 minute taker 2493 2494 2495 2496 2497 Alice 2498 2499 2500 presenter 2501 2502 2503 2504 2505 Ciccio 2506 2507 2508 chairman 2509 timekeeper 2510 2511 2512 2514 28. Diff with draft-ietf-clue-data-model-schema-06 version 2516 o Capture Scene Entry/Entries renamed as Capture Scene View/Views in 2517 the text, / renamed as / 2518 in the XML schema. 2520 o Global Scene Entry/Entries renamed as Global View/Views in the 2521 text, / renamed as 2522 / 2524 o Security section added. 

   o  Extensibility: a new type is introduced to describe other types
      of media capture (otherCaptureType), text and example added.

   o  Spatial information section updated: capture point optional, text
      now is coherent with the framework one.

   o  Audio capture description: added, removed, disallowed.

   o  Simultaneous set definition: added an element to refer to
      capture scene identifiers as shortcuts, and an optional mediaType
      attribute, which is mandatory when only capture scene identifiers
      are listed.

   o  Encoding groups: removed the constraint of the same media type.

   o  Updated text about media captures without (optional in the XML
      schema).

   o  "mediaType" attribute removed from homogeneous groups of capture
      (scene views and global views)

   o  "mediaType" attribute removed from the global view textual
      description.

   o  "millimeters" scale value changed to "mm"

29.  Diff with draft-ietf-clue-data-model-schema-04 version

   globalCaptureEntries/Entry renamed as globalSceneEntries/Entry;

   sceneInformation added;

   Only capture scene entry identifiers listed within global scene
   entries (media capture identifiers removed);

   renamed as in the <clueInfo> template

   renamed as to sync with the framework terminology

   renamed as to sync with the framework terminology

   renamed as in the media capture type definition to remove ambiguity

   Examples have been updated with the new definitions of and of .

30.  Diff with draft-ietf-clue-data-model-schema-03 version

   encodings section has been removed

   global capture entries have been introduced

   capture scene entry identifiers are used as shortcuts in listing
   the content of MCC (similarly to simultaneous set and global
   capture entries)

   Examples have been updated.  A new example with global capture
   entries has been added.

   has been made optional.

   has been renamed into

   Obsolete comments have been removed.

   participants information has been added.

31.  Diff with draft-ietf-clue-data-model-schema-02 version

   captureParameters and encodingParameters have been removed from
   the captureEncodingType

   data model example has been updated and validated according to the
   new schema.  Further description of the represented scenario has
   been provided.

   A multiple content capture example has been added.

   Obsolete comments and references have been removed.

32.  Informative References

   [I-D.ietf-clue-datachannel]
              Holmberg, C., "CLUE Protocol Data Channel",
              draft-ietf-clue-datachannel-01 (work in progress),
              September 2014.

   [I-D.ietf-clue-framework]
              Duckworth, M., Pepperell, A., and S. Wenger, "Framework
              for Telepresence Multi-Streams",
              draft-ietf-clue-framework-17 (work in progress),
              September 2014.

   [I-D.ietf-clue-protocol]
              Presta, R. and S. Romano, "CLUE protocol",
              draft-ietf-clue-protocol-01 (work in progress),
              June 2014.

   [I-D.ietf-ecrit-additional-data]
              Rosen, B., Tschofenig, H., Marshall, R., Randy, R., and
              J. Winterbottom, "Additional Data related to an
              Emergency Call", draft-ietf-ecrit-additional-data-22
              (work in progress), April 2014.

   [RFC4796]  Hautakorpi, J. and G. Camarillo, "The Session
              Description Protocol (SDP) Content Attribute", RFC 4796,
              February 2007.

   [RFC6351]  Perreault, S., "xCard: vCard XML Representation",
              RFC 6351, August 2011.

Authors' Addresses

   Roberta Presta
   University of Napoli
   Via Claudio 21
   Napoli  80125
   Italy

   EMail: roberta.presta@unina.it

   Simon Pietro Romano
   University of Napoli
   Via Claudio 21
   Napoli  80125
   Italy

   EMail: spromano@unina.it