Internet Engineering Task Force                           E. Zierau, Ed.
Internet-Draft                                      Royal Danish Library
Intended status: Informational                          November 4, 2018
Expires: May 8, 2019

            A Persistent Web IDentifier (PWID) URN Namespace
                    draft-pwid-urn-specification-04

Abstract

   This document specifies a Uniform Resource Name (URN) for Persistent
   Web IDentifiers to web material in web archives using the 'pwid'
   namespace identifier.

   The main purpose of the standard is to support specification of
   references that are not covered by other reference techniques, in
   particular references to material in web archives with restricted
   access. Furthermore, it supports persistent technology agnostic
   references to web archives in general, in a form that can work as an
   algorithmic basis for finding web archive resources. An additional
   important benefit is that it can be used in specifying web
   collections, which then can form a persistent computational basis
   for extracting the archived collection parts. Since the parts can be
   specified generally, this further allows collections to be specified
   with elements from one or more web archives.

   The PWID is designed for researchers, and it is therefore designed
   to give general, global, sustainable, human readable, technology
   agnostic, persistent and precise web references for web materials in
   web archives.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 8, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Namespace Registration Template
   3.  Acknowledgements
   4.  References
     4.1.  Normative References
     4.2.  Informative References
   Author's Address

1.  Introduction

   The URN PWID is a supplement to existing reference standards. It
   supports references to web archives, including a case that is not
   supported today: references to material in web archives with
   restricted access. Furthermore, it enables technology agnostic
   references to web archives in general, which can, for instance, be
   needed for references to web material that is dynamic (e.g. a news
   site) or to a specific version of web material (e.g. a specific
   version of the DOI handbook).

   The URN PWID has a form that can work as an algorithmic basis for
   finding the resource. This also provides a basis for computing the
   archived web parts of a collection from one or more web archives.

   Furthermore, the PWID includes information about the resource which
   makes it possible to find alternative resources, in cases where the
   original precise resource has become unavailable.

   The PWID URN is designed to be a persistent reference that is
   general, global and technology agnostic in order to enhance its
   chances of being sustainable. Furthermore, it is designed to be
   human readable and to be able to state the precision of what the web
   archive reference covers. This design enables a PWID URN to:

   o  be used in technical solutions, e.g. to make references resolvable

   o  cover references to all sorts of materials in web archives

   o  cover references to materials from all sorts of web archives

   The motivation for defining a PWID namespace is the growing challenge
   of references to archived web resources, which the PWID as a URN can
   assist in overcoming. The standard is needed in order to reference
   web materials with a precision and persistency on par with
   traditional references for analogue material. Furthermore, it is
   needed in order to address web archive resources that are not freely
   available online. The PWID URN covers both referencing of web
   resources from research papers and definition of web collections
   (corpora). In detail the challenges are:

   o  Citation guidelines generally do not cover general and persistent
      referencing techniques for web resources that are not registered
      by Persistent Identifier systems (like DOI [DOI]). However, an
      increasing number of references point to resources that only exist
      on the web, e.g. blogs that turned out to have a historical
      impact. In order to obtain persistency for a reference, the
      target needs to be stable.
      As the live web is 'alive' and in constant change, persistency
      can only be obtained by referring to archived snapshots of the
      web. The PWID URN is therefore focused on referencing archived
      web material in a technology agnostic way (research documented in
      [IPRES2016] and [ResawRef]).

   o  There are many new initiatives for web archive referencing. Most
      of them are centralised solutions that offer harvesting and
      referencing, but these cannot be used for existing materials in
      web archives. Other initiatives only cover open web archives,
      which means that they do not cover material in archives with
      restricted access, and that there is a risk of imprecision if
      resolving a reference yields a resource from an alternative
      archive. The PWID URN is needed in order to fill these gaps where
      other techniques are not sufficient.

   o  There are many different requirements for construction of
      collection definitions for web material besides precision and
      persistency. Recent research has found that various legal and
      sustainability issues lead to a need for a collection to be
      defined by references to the web parts in the collection. The
      PWID URN is needed in such definitions in order to fulfil these
      requirements and to enable a collection to cover web materials
      from more than one archive (research documented in [ResawColl]).

   The PWID is especially useful for web material where precision is in
   focus and/or there are references to materials from web archives
   requiring special grants in order to gain access. The precision
   concerns both pointing to the archive where the material was found
   and validated against its purpose (other archived versions in other
   web archives may differ both regarding completeness and contents,
   even within short time periods) and being precise about what the
   reference actually refers to (e.g. the page or the whole website).

   Furthermore, the PWID is very useful in specifying the contents of a
   web collection (also known as a web corpus). Definitions of web
   collections are often needed for extraction of data used in
   production of research results, e.g. for evaluations in the future.
   Current practices are not persistent, as they often use some CDX
   version, which varies between implementations.

   Strict syntax is needed for the PWID reference in order to ensure
   that it can be used for computational purposes. This is especially
   relevant for automatic extraction of parts from web collection
   definitions. Furthermore, readers of research papers today expect to
   be able to access a referenced resource by clicking an actionable
   URI. A similar facility will therefore be expected for references to
   available archived web material, which strict syntax can make
   possible. Examples of technical solutions that are enabled are:

   o  Resolving of references and automatic extraction of a web
      collection defined by PWID URNs [ResawRef] [ResawColl]

   o  Resolving of a PWID reference by resolving services. As a start,
      there is work on a prototype that can work for the Danish web
      archive data and open web archives with standard patterns for the
      current technologies. Different implementations for resolving may
      follow, which may rely on different protocols and applications.

   The purpose of the PWID is also to express a web archive reference as
   simply as possible while at the same time meeting requirements for
   sustainability, usability and scope. Therefore, the PWID URN is
   focused on only having the minimum required information to make a
   precise identification of a resource in an arbitrary web archive.
   Recent research has found that this is obtained by the following
   information [ResawRef]:

   o  Identification of web archive

   o  Identification of source:

      *  Archived URI or identifier

      *  Archival timestamp

   o  Intended precision (page, part, subsite etc.)

   The PWID URN represents this information in a human readable way as
   well as a well-defined way that enables technical solutions to
   interpret the URN.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Namespace Registration Template

   Namespace Identifier:

      PWID

   Version:

      4

   Date:

      2018-11-03

   Registrant:

      Eld Maj-Britt Olmuetz Zierau
      Royal Danish Library
      Soeren Kierkegaards Plads 1
      1219 Copenhagen
      Denmark
      ph: +45 9132 4690
      email: elzi@kb.dk

   Purpose:

      The URN PWID is a supplement to existing reference standards. It
      supports references to web archives, including a case that is not
      supported today: references to material in web archives with
      restricted access. Furthermore, it enables technology agnostic
      references to web archives in general, which can, for instance,
      be needed for references to web material that is dynamic (e.g. a
      news site) or to a specific version of web material (e.g. a
      specific version of the DOI handbook).

      The URN PWID has a form that can work as an algorithmic basis for
      finding the resource. This also provides a basis for computing
      the archived web parts of a collection from one or more web
      archives.

      Furthermore, the PWID includes information about the resource
      which makes it possible to find alternative resources, in cases
      where the original precise resource has become unavailable.

      The PWID URN is designed to be a persistent reference that is
      general, global and technology agnostic in order to enhance its
      chances of being sustainable. Furthermore, it is designed to be
      human readable and to be able to state the precision of what the
      web archive reference covers. This design enables a PWID URN to:

      *  be used in technical solutions, e.g. to make references
         resolvable

      *  cover references to all sorts of materials in web archives

      *  cover references to materials from all sorts of web archives

      The motivation for defining a PWID namespace is the growing
      challenge of references to archived web resources, which the PWID
      as a URN can assist in overcoming. The standard is needed in
      order to reference web materials with a precision and persistency
      on par with traditional references for analogue material.
      Furthermore, it is needed in order to address web archive
      resources that are not freely available online. The PWID URN
      covers both referencing of web resources from research papers and
      definition of web collections (corpora). In detail the challenges
      are:

      *  Citation guidelines generally do not cover general and
         persistent referencing techniques for web resources that are
         not registered by Persistent Identifier systems (like DOI
         [DOI]). However, an increasing number of references point to
         resources that only exist on the web, e.g. blogs that turned
         out to have a historical impact. In order to obtain
         persistency for a reference, the target needs to be stable. As
         the live web is 'alive' and in constant change, persistency can
         only be obtained by referring to archived snapshots of the web.
         The PWID URN is therefore focused on referencing archived web
         material in a technology agnostic way (research documented in
         [IPRES2016] and [ResawRef]).

      *  There are many new initiatives for web archive referencing.
         Most of them are centralised solutions that offer harvesting
         and referencing, but these cannot be used for existing
         materials in web archives. Other initiatives only cover open
         web archives, which means that they do not cover material in
         archives with restricted access, and that there is a risk of
         imprecision if resolving a reference yields a resource from an
         alternative archive. The PWID URN is needed in order to fill
         these gaps where other techniques are not sufficient.

      *  There are many different requirements for construction of
         collection definitions for web material besides precision and
         persistency. Recent research has found that various legal and
         sustainability issues lead to a need for a collection to be
         defined by references to the web parts in the collection. The
         PWID URN is needed in such definitions in order to fulfil these
         requirements and to enable a collection to cover web materials
         from more than one archive (research documented in
         [ResawColl]).

      The PWID is especially useful for web material where precision is
      in focus and/or there are references to materials from web
      archives requiring special grants in order to gain access. The
      precision concerns both a precise reference, where there can be
      no doubt that you have the correct web material, and precision
      about what the reference actually refers to (e.g. the page or the
      whole website).

      Furthermore, the PWID is very useful in specifying the contents
      of a web collection (also known as a web corpus). Definitions of
      web collections are often needed for extraction of data used in
      production of research results, e.g. for evaluations in the
      future. Current practices are not persistent, as they often use
      some CDX version, which varies between implementations.

      Strict syntax is needed for the PWID reference in order to ensure
      that it can be used for computational purposes. This is
      especially relevant for automatic extraction of parts from web
      collection definitions. Furthermore, readers of research papers
      today expect to be able to access a referenced resource by
      clicking an actionable URI. A similar facility will therefore be
      expected for references to available archived web material, which
      strict syntax can make possible. Examples of technical solutions
      that are enabled are:

      *  Resolving of references and automatic extraction of a web
         collection defined by PWID URNs [ResawRef] [ResawColl]

      *  Resolving of a PWID reference by resolving services. As a
         start, there is work on a prototype that can work for the
         Danish web archive data and open web archives with standard
         patterns for the current technologies. Different
         implementations for resolving may follow, which may rely on
         different protocols and applications.

      The purpose of the PWID is also to express a web archive reference
      as simply as possible while at the same time meeting requirements
      for sustainability, usability and scope. Therefore, the PWID URN
      is focused on only having the minimum required information to make
      a precise identification of a resource in an arbitrary web
      archive. Recent research has found that this is obtained by the
      following information [ResawRef]:

      *  Identification of web archive

      *  Identification of source:

         +  Archived URI or identifier

         +  Archival timestamp

      *  Intended precision (page, part, subsite etc.)

      The PWID URN represents this information in a human readable way
      as well as a well-defined way that enables technical solutions to
      interpret the URN.

   Syntax:

      The syntax of the PWID URN is specified below in Augmented
      Backus-Naur Form (ABNF) [RFC5234] and it conforms to URN syntax
      defined in [RFC8141].
      The syntax definition of the PWID URN is:

      pwid-urn         = "urn" ":" pwid-NID ":" pwid-NSS

      pwid-NID         = "pwid"
      pwid-NSS         = archive-id ":" archival-time ":" precision-spec
                         ":" archived-item

      archive-id       = 1*( unreserved )
      precision-spec   = "part" / "page" / "subsite" / "site"
                         / "collection" / "recording" / "snapshot"
                         / "other"

      archived-item    = URI / archived-item-id
      archived-item-id = 1*( unreserved )

      where

      *  'archival-time' is a UTC timestamp as described in the W3C
         profile of [ISO8601] [W3CDTF] (also defined in [RFC3339]), for
         example YYYY-MM-DDThh:mm:ssZ. The 'archival-time' shall
         represent the timestamp that the web archive has recorded for
         the referenced archived URI. The archival-time may be
         specified at any of the levels of granularity described in
         [W3CDTF], as long as it reflects exactly the granularity of the
         timestamp recorded in the web archive, which is in accordance
         with the WARC standard [ISO28500].

      *  'unreserved' is defined as in [RFC3986]

      *  'precision-spec' values are not case sensitive (i.e. "PAGE" /
         "PART" / "PaGe" / ... are valid values as well)

      *  'URI' is defined as in [RFC3986], but occurrences of "[", "]",
         "?" and "#" are %-encoded in order not to clash with URN
         reserved characters [RFC8141]

      The precision specification expresses the intended precision of
      the reference. For example, if the reference is to an html web
      element, this element can be interpreted in several ways:

      *  As just one web part
         Meaning the file containing the html, and precisely this file

      *  As a web page
         Meaning that an application like Wayback shows the result in a
         browser, calculates the referenced web parts (display
         templates, images etc.) and uses these web parts in the result.
         If the full reference only contains the PWID URN for the page,
         this may mean that the archived page can change its look over
         time, e.g.
         if parts referred to by the page did not exist at reference
         time but are harvested at a later stage, or if the web
         archive's algorithm for calculating the referred web parts is
         changed and gives a different result.
         In order to make a precise reference to a picture in the
         context of a web page, the most precise reference will be to
         provide the PWID URN for the page (with page precision) and the
         PWID URN for the image file part which contains the referred
         picture (with part precision)

      *  As a site or subsite
         Meaning that an application like Wayback shows the result in a
         browser showing the web page, and, if there is restricted
         access according to the reference, the application also needs
         to make sure that all parts/pages belonging to the site/subsite
         are available.
         If the full reference only contains the PWID URN for the
         site/subsite, this may mean that the site/subsite can change
         its appearance over time, in the same way as for the web page
         described above.

      The precision specification needs to be part of a PWID URN in
      order to enable the person making the reference to express the
      precision described above. Furthermore, this precision
      specification will make it possible for resolvers to display the
      referred source in a way that corresponds to the precision
      specification.

      Especially for web materials, there can be different ways to
      represent e.g. a web page, which provide different precision of
      the source as well. The above examples with part, page, subsite
      and site address the most common access via browser functionality
      like in Wayback. However, there are also web archives that
      archive snapshots of the web pages for the archived URI. A third
      option can be to produce a collection of archived URIs as the
      basis for browser access instead of letting the web archive
      calculate sub items (which may change over time).
      An example of the production of such a collection is provided in
      the section about assignment. Lastly, a web page may be archived
      via a web recording.

      As a consequence of the above, the following precision-spec values
      are valid:

      *  part
         The single archived web part harvested as a file from the
         specified URI, e.g. a pdf, an html text, an image

      *  page
         The web page represented by the web page file (e.g. html)
         harvested from the specified URI, where this content is
         interpreted as a web page with all referred parts relevant to
         display the web page (but where referred parts must be
         calculated as described above), e.g. an html page with referred
         images

      *  subsite
         The referred web page (as described under 'page') where it is
         possible to browse to all references starting with the same
         path as the archived URI

      *  site
         The referred web page (as described under 'page') where it is
         possible to browse to all references in the domain specified in
         the archived URI

      *  collection
         Representation of a collection specification, where it is up to
         the web archive applications to find out how it is rendered
         (e.g. a collection specification in the XML format enabling
         interpretation as in the example provided in [ResawColl])

      *  snapshot
         A snapshot (image) representation of web material, e.g. a web
         page

      *  recording
         Representation of a web recording specification, where it is up
         to the web archive applications to find out how it is rendered
         (where interpretation could depend on the file suffix for the
         web recording). An example is a web recording coded in a WARC
         file

      *  other
         This is a placeholder to allow reference of a resource of any
         kind with an assigned identifier (by the archive).
         In all cases, it will be up to the application serving the web
         archive to interpret how this item should be rendered

   Assignment:

      PWID URNs do not have to be assigned by an authority, as they are
      based on the information created at the time of archiving. In
      other words: the PWID URNs are created independently, but
      following an algorithm which ensures that the referred item can be
      found if it is still available. It also has the benefit that it
      includes information to look for alternative resources, e.g. via
      Memento for some open web archives [MEMENTO] or via possible
      future web archive infrastructures.

      A PWID URN is created by finding the relevant information for the
      syntax parts of the PWID in the form:

         "urn:pwid:" archive-id ":" archival-time ":" precision-spec
         ":" archived-item

      The PWID URN for an archived item in hand can be constructed by
      replacing the unspecified PWID parts with the relevant
      information, as explained in the following:

      *  archive-id (identification of web archive):
         In this version of the standard, it is recommended to use the
         domain of the web archive as the identifier for the web archive
         (e.g. archive.org for Internet Archive's open web archive).
         This is recommended, since browsing this domain page typically
         will lead to a description of how to access the web archive,
         e.g. online or by applying for access grants. Furthermore, it
         is more precise than e.g. the name of the archive, since there
         may be more than one installation of web archives in the same
         organisation, e.g. archive.org and archive-it.org are both
         covered by Internet Archive. When a registry of web archives
         is established, it will be more precise and persistent to use
         the web archive identifier specified in this registry (e.g.
         DKWA for the Danish web archive with domain netarkivet.dk)

      *  archival-time (archival timestamp):
         The archival time for the archived item in hand may be
         displayed along with the archived item, but implementations
         differ, so it is important to be aware of whether a more
         precise timestamp can be found, and to make sure that the
         correct timestamp is used. For many Wayback implementations
         the precise time can be found as part of the URI used for
         viewing the archived item. For example, in
         https://web.archive.org/web/20160122112029/http://www.dr.dk,
         viewable in the Internet Archive's Wayback installation, the
         number 20160122112029 represents the archival time
         2016-01-22T11:20:29Z. In other installations, the most precise
         time may be found in the URI from a search result leading to
         the resource (which usually redirects on the basis of a call to
         the underlying archive index). Especially for web pages with
         frames, there may be cases where the actual time is not
         displayed with the source, since only the times for the
         contents of the frames are displayed.

      *  precision-spec (precision as represented by page, part, site,
         snapshot etc.):
         The precision specification specifies how the user should view
         the referred item, either as a specific representation (with
         inherited precision) or by use of tools (e.g. browse a web site
         based on calculations, or browse on the basis of a collection
         of specific parts).

         Since the archived URI can have different forms indicated by
         the precision specification, this information may be used in
         resolution and location.
         The most imprecise types are the ones that involve calculation,
         i.e. page, site or subsite. For items like an image, which
         have no references to calculate, the precision is best
         described by part, since it also tells that it is a precise
         reference.

      *  archived-item (archived URI or identifier):
         The archived item will be the URI (or the identifier assigned
         to a resource by the archive) of the displayed archived item in
         hand.

      A much easier way to construct PWID URNs is to use tools that
      construct them. Currently, there is a prototype for a
      SOLR-Wayback tool (source at
      https://github.com/netarchivesuite/solrwayback) [PWIDprovider],
      which can assist in finding the most precise reference to an
      archived web page. This Wayback version can provide all PWID URNs
      belonging to a shown page (with the page PWID URN at the top).
      For example, in netarkivet.dk, the archived URI for the web page
      http://www.susanlegetoej.dk/shop/handskedyr-siameser-killing-8681p.html
      archived 2008-11-29 01:19:16 UTC, has the following parts
      calculated by the SOLR-Wayback tool:

         urn:pwid:netarkivet.dk:2008-11-29T00:41:42Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_Master_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:39:47Z:part:http://www.susanlegetoej.dk/shop/css/print.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:06Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_Basket_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:00Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_TopMenu_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:00Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_SearchPage_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:35Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_Productmenu_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:22Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_SpaceTop_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:24Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_SpaceLeft_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:23Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_SpaceBottom_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:40:25Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_SpaceRight_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:37:23Z:part:http://www.susanlegetoej.dk/images/ddcss/SK113_ProductInfo_NF.css

         urn:pwid:netarkivet.dk:2008-11-29T00:37:24Z:part:http://www.susanlegetoej.dk/Shop/js/Variants.js

         urn:pwid:netarkivet.dk:2009-03-03T11:53:00Z:part:http://www.susanlegetoej.dk/Shop/js/Media.js

         urn:pwid:netarkivet.dk:2009-03-03T11:53:02Z:part:http://www.susanlegetoej.dk/images/design/print.gif

         urn:pwid:netarkivet.dk:2009-03-03T11:54:19Z:part:http://www.susanlegetoej.dk/Shop/js/Scroll.js

         urn:pwid:netarkivet.dk:2009-03-03T11:54:09Z:part:http://www.susanlegetoej.dk/Shop/js/Shop5Common.js

         urn:pwid:netarkivet.dk:2006-11-20T20:16:03Z:part:http://www.susanlegetoej.dk/images/602551.jpg

   Security and Privacy:

      Security and privacy considerations are restricted to accessible
      web resources in web archives. Resolution of PWID URNs will
      usually only be possible using the web archives' access tools. In
      such cases security and privacy will be covered by these tools,
      since the information used for access has no security and privacy
      issues in itself. In cases where resolution is made around the
      archives' access tools, a separate analysis should be made.

   Interoperability:

      This is covered by comments in the Syntax description:

      *  the PWID URN conforms to the URI standard defined in RFC 3986
         [RFC3986] and the URN standard RFC 8141 [RFC8141]

      *  the 'archival-time' of the PWID URN conforms to the UTC
         timestamp format described in the W3C profile of ISO 8601
         [ISO8601] [W3CDTF] and is in accordance with the WARC standard
         ISO 28500 [ISO28500].

      *  the 'archived-item' is either an assigned identifier
         (conforming to the URN standard RFC 8141 [RFC8141]) or a URI
         which conforms to the URI standard defined in RFC 3986
         [RFC3986], with %-encoding of "[", "]", "#", and "?" in order
         to conform to the URN standard RFC 8141 [RFC8141]

   Resolution:

      The information in a PWID URN can be used for locating a web
      archive resource, for any kind of web archive. It includes the
      minimum information for web archive materials, which enables
      resolvability, manually or by a resolver. Resolution of a PWID
      URN is the primary motivation for making a formal URN definition,
      instead of just a textual representation of the four needed parts
      of a PWID.

      Resolution (manual or automatic) is done based on the PWID parts:

      *  Web archive identification for the web archive holding the
         referred resource
         The identifier is either an identifier where the location of
         the web archive can be found by looking up the identifier in a
         registry, or it is the domain name for the web archive, where
         browsing this domain page typically will lead to a description
         of how to access the web archive, e.g. online or by applying
         for access grants

      *  Archived URI or identifier of archived item
         If the resource is an archived URI, this URI must be used in
         the search for, or construction of, the location of the
         resource. If the resource is an identifier assigned to the
         resource (by the archive), it is this identifier that must be
         used in the search for, or construction of, the location of the
         resource

      *  Date and time associated with the archived item
         The archival date and time must be used in the search for, or
         construction of, the location of the resource

      *  Precision of what is referred
         The precision can contribute to guiding which tools to activate
         in order to view the referred item, e.g.
         browse the referred item as a page on the basis of the computed
         closest past, browse the referred item on the basis of parts
         specified in a collection, or view the referred item as a
         snapshot. In the example of the snapshot, it also contains a
         specification of which resource to display

      In the following, the different resolution techniques are
      explained (manual as well as via a service).

      An example of a PWID URN is:

         urn:pwid:archive.org:2016-01-22T11:20:29Z:page:http://www.dr.dk

      which has the information:

      *  archive.org
         Currently known identifier in the form of the Internet Archive
         domain name for their open access web archive. If Internet
         Archive registered their open web archive in an IANA web
         archive registry, this identifier could currently be
         "web.archive.org/web/" for Wayback resolution, or it could be
         "archive.org/pwid/" if a PWID interface was created as
         described below

      *  2016-01-22T11:20:29Z
         UTC date and time associated with the archived URI

      *  page
         Clarification that the reference covers the full web page with
         all its inherited parts selected by the web archive

      *  http://www.dr.dk
         Archived URI of the item

      Based on the current (2018) knowledge of Internet Archive's open
      access web interface, which has the pattern:

         https://web.archive.org/web/