idnits 2.17.1 draft-ietf-ntp-ntpv4-proto-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 19. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 5017. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5028. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5035. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5041. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 162 instances of weird spacing in the document. Is it really formatted ragged-right, rather than justified? == There are 2 instances of lines with multicast IPv4 addresses in the document. If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 == The 'Obsoletes: ' line in the draft header should list only the _numbers_ of the RFCs which will be obsoleted by this document (if approved); it should not include the word 'RFC' in the list. -- The draft header indicates that this document obsoletes RFC1305, but the abstract doesn't seem to directly say this. It does mention RFC1305 though, so this could be OK. -- The draft header indicates that this document obsoletes RFC4330, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 2580 has weird spacing: '...o-><-no y| |...' == Line 3063 has weird spacing: '... ipaddr srcad...' == Line 3064 has weird spacing: '... ipaddr dstad...' == Line 3065 has weird spacing: '... char versio...' == Line 3066 has weird spacing: '... char leap; ...' == (157 more instances...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (January 17, 2007) is 6302 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'NSTAGE' on line 3950 -- Looks like a reference, but probably isn't: 'NMAX' on line 3208 == Missing Reference: '0' is mentioned on line 4440, but not defined -- Obsolete informational reference (is this intentional?): RFC 1305 (ref. '1') (Obsoleted by RFC 5905) -- Obsolete informational reference (is this intentional?): RFC 4330 (ref. '2') (Obsoleted by RFC 5905) -- Obsolete informational reference (is this intentional?): RFC 2434 (ref. '10') (Obsoleted by RFC 5226) Summary: 2 errors (**), 0 flaws (~~), 10 warnings (==), 15 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 NTP WG J. Burbank, Ed. 3 Internet-Draft W. Kasch, Ed. 4 Obsoletes: RFC 4330, RFC 1305 JHU/APL 5 (if approved) J. Martin, Ed. 6 Intended status: Standards Track Netzwert AG 7 Expires: July 21, 2007 D. Mills 8 U. Del. 9 January 17, 2007 11 Network Time Protocol Version 4 Protocol And Algorithms Specification 12 draft-ietf-ntp-ntpv4-proto-04 14 Status of this Memo 16 By submitting this Internet-Draft, each author represents that any 17 applicable patent or other IPR claims of which he or she is aware 18 have been or will be disclosed, and any of which he or she becomes 19 aware will be disclosed, in accordance with Section 6 of BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF), its areas, and its working groups. Note that 23 other groups may also distribute working documents as Internet- 24 Drafts. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 The list of current Internet-Drafts can be accessed at 32 http://www.ietf.org/ietf/1id-abstracts.txt. 34 The list of Internet-Draft Shadow Directories can be accessed at 35 http://www.ietf.org/shadow.html. 37 This Internet-Draft will expire on July 21, 2007. 39 Copyright Notice 41 Copyright (C) The IETF Trust (2007). 43 Abstract 45 The Network Time Protocol (NTP) is widely used to synchronize 46 computer clocks in the Internet. This memorandum describes Version 4 47 of the NTP (NTPv4), introducing several changes from Version 3 of NTP 48 (NTPv3) described in RFC 1305, including the introduction of a 49 modified protocol header to accomodate Internet Protocol Version 6. 50 NTPv4 also includes optional extensions to the NTPv3 51 protocol,including a dynamic server discovery mechanism. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 56 1.1. Requirements Notation . . . . . . . . . . . . . . . . . . 5 57 2. Modes of Operation . . . . . . . . . . . . . . . . . . . . . 5 58 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 7 59 4. Implementation Model . . . . . . . . . . . . . . . . . . . . 10 60 5. Data Types . . . . . . . . . . . . . . . . . . . . . . . . . 13 61 6. Data Structures . . . . . . . . . . . . . . . . . . . . . . . 17 62 6.1. Structure Conventions . . . . . . . . . . . . . . . . . . 17 63 6.2. Global Parameters . . . . . . . . . . . . . . . . . . . . 17 64 6.3. Packet Header Variables . . . . . . . . . . . . . . . . . 19 65 6.3.1. The Kiss-o'-Death Packet . . . . . . . . . . . . . . 25 66 6.3.2. NTP Extension Field Format . . . . . . . . . . . . . 26 67 7. On Wire Protocol . . . . . . . . . . . . . . . . . . . . . . 28 68 8. Peer Process . . . . . . . . . . . . . . . . . . . . . . . . 32 69 8.1. Peer Process Variables . . . . . . . . . . . . . . . . . 32 70 8.2. Peer Process Operations . . . . . . . . . . . . . . . . . 35 71 8.3. Clock Filter Algorithm . . . . . . . . . . . . . . . . . 42 72 9. System Process . . . . . . . . . . . . . . . . . . . . . . . 45 73 9.1. System Process Variables . . . . . . . . . . . . . . . . 45 74 9.2. System Process Operations . . . . . . . . . . . . . . . . 47 75 9.2.1. Selection Algorithm . . . . . . . . . . . . . . . . . 48 76 9.2.2. Clustering Algorithm . . . . . . . . . . . . . . . . 50 77 9.2.3. Combining Algorithm . . . . . . . . . . . . . . . . . 52 78 9.2.4. Clock Discipline Algorithm . . . . . . . . . . . . . 56 79 9.3. Clock Adjust Process . . . . . . . . . . . . . . . . . . 64 80 10. Poll Process . . . . . . . . . . . . . . . . . . . . . . . . 65 81 10.1. Poll Process Variables and Parameters . . . . . . . . . . 65 82 10.2. Poll Process Operations . . . . . . . . . . . . . . . . . 66 83 11. Security Considerations . . . . . . . . . . . . . . . . . . . 67 84 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 67 85 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 68 86 14. Informative References . . . . . . . . . . . . . . . . . . . 68 87 Appendix A. Code Skeleton . . . . . . . . . . . . . . . . . . . 68 88 A.1. Global Definitions . . . . . . . . . . . . . . . . . . . 69 89 A.1.1. Definitions, Constants, Parameters . . . . . . . . . 69 90 A.1.2. Packet Data Structures . . . . . . . . . . . . . . . 72 91 A.1.3. Association Data Structures . . . . . . . . . . . . . 73 92 A.1.4. System Data Structures . . . . . . . . . . . . . . . 76 93 A.1.5. Local Clock Data Structures . . . . . . . . . . . . . 77 94 A.1.6. Function Prototypes . . . . . . . . . . . . . . . . . 77 95 A.2. Main Program and Utility Routines . . . . . . . . . . . . 78 96 A.3. Kernel Input/Output Interface . . . . . . . . . . . . . . 82 97 A.4. Kernel System Clock Interface . . . . . . . . . . . . . . 82 98 A.5. Peer Process . . . . . . . . . . . . . . . . . . . . . . 84 99 A.5.1. receive() . . . . . . . . . . . . . . . . . . . . . . 85 100 A.5.2. packet() . . . . . . . . . . . . . . . . . . . . . . 90 101 A.5.3. clock_filter() . . . . . . . . . . . . . . . . . . . 92 102 A.5.4. fast_xmit() . . . . . . . . . . . . . . . . . . . . . 93 103 A.5.5. access() . . . . . . . . . . . . . . . . . . . . . . 95 104 A.6. System Process . . . . . . . . . . . . . . . . . . . . . 95 105 A.6.1. clock_select() . . . . . . . . . . . . . . . . . . . 95 106 A.6.2. root_dist() . . . . . . . . . . . . . . . . . . . . . 99 107 A.6.3. accept() . . . . . . . . . . . . . . . . . . . . . . 100 108 A.6.4. clock_update() . . . . . . . . . . . . . . . . . . . 100 109 A.6.5. clock_combine() . . . . . . . . . . . . . . . . . . . 103 110 A.6.6. local_clock() . . . . . . . . . . . . . . . . . . . . 103 111 A.6.7. rstclock() . . . . . . . . . . . . . . . . . . . . . 109 112 A.7. Clock Adjust Process . . . . . . . . . . . . . . . . . . 109 113 A.7.1. clock_adjust() . . . . . . . . . . . . . . . . . . . 109 114 A.8. Poll Process . . . . . . . . . . . . . . . . . . . . . . 110 115 A.8.1. poll() . . . . . . . . . . . . . . . . . . . . . . . 110 116 A.8.2. poll_update() . . . . . . . . . . . . . . . . . . . . 112 117 A.8.3. peer_xmit() . . . . . . . . . . . . . . . . . . . . . 114 118 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 115 119 Intellectual Property and Copyright Statements . . . . . . . . . 116 121 1. Introduction 123 This document specifies the Network Time Protocol Version 4 (NTPv4), 124 which is widely used to synchronize the system clocks among a set of 125 distributed time servers and clients. This document defines the core 126 architecture, protocol, state machines, data structures and 127 algorithms. This document describes NTPv4, which introduces new 128 functionality to NTPv3 as described in [1], and functionality 129 expanded from that of SNTPv4 as described in [2] (SNTPv4 is a subset 130 of NTPv4). This document obsoletes RFC 1305 and RFC 4330. While 131 certain minor changes have been made in some protocol header fields, 132 these do not affect the interoperability between NTPv4 and previous 133 versions. 135 The NTP subnet model includes a number of widely accessible primary 136 time servers synchronized by wire or radio to national standards. 137 The purpose of the NTP protocol is to convey timekeeping information 138 from these primary servers to secondary time servers and clients via 139 both private networks and the public Internet. Crafted algorithms 140 mitigate errors that may result from network disruptions, server 141 failures and possible hostile action. Servers and clients are 142 configured such that values flow from the primary servers at the root 143 via branching secondary servers toward clients. 145 The NTPv4 design overcomes significant shortcomings in the NTPv3 146 design, corrects certain bugs and incorporates new features. In 147 particular, expanded NTP timestamp definitions encourage the use of 148 floating double data types throughout any implementation. The time 149 resolution is better than one nanosecond and frequency resolution 150 better than one nanosecond per second. Additional improvements 151 include a new clock discipline algorithm which is more responsive to 152 system clock hardware frequency fluctuations. Typical primary 153 servers using modern machines are precise within a few tens of 154 microseconds. Typical secondary servers and clients on fast LANs are 155 within a few hundred microseconds with poll intervals up to 1024 156 seconds, which was the maximum with NTPv3. With NTPv4, servers and 157 clients are within a few tens of milliseconds with poll intervals up 158 to 36 hours. 160 The main body of this document describes only the core protocol and 161 data structures necessary to interoperate between conforming 162 implementations. Additional detail is provided in the form of a 163 skeleton program included as an appendix. This program includes data 164 structures and code segments for the core algorithms and in addition 165 the mitigation algorithms used to enhance reliability and accuracy. 166 While the skeleton and other descriptions in this document apply to a 167 particular implementation, they are not intended as the only way the 168 required functions can be implemented. While the NTPv3 symmetric key 169 authentication scheme described in this document carries over from 170 NTPv3, the Autokey public key authentication scheme new to NTPv4 is 171 described in [3]. 173 The NTP protocol includes the modes of operation described in 174 Section 2 using the data types described in Section 5 and the data 175 structures in Section 6. The implementation model described in 176 Section 4 is based on a multiple-process, threaded architecture, 177 although other architectures could be used as well. The on-wire 178 protocol described in Section 7 is based on a returnable-time design 179 which depends only on measured clock offsets, but does not require 180 reliable message delivery. The synchronization subnet is a self- 181 organizing, hierarchical, master-slave network with synchronization 182 paths determined by a shortest-path spanning tree and defined metric. 183 While multiple masters (primary servers) may exist, there is no 184 requirement for an election protocol. 186 The remaining sections of this document define the data structures 187 and algorithms suitable for a fully featured NTPv4 implementation. 188 Appendix A contains the code skeleton with definitions, structures 189 and code segments that represent the basic structure of the reference 190 implementation. 192 The remainder of this document contains numerous variables and 193 mathematical expressions. Those variables take the form of Greek 194 characters. Those Greek characters are spelled out by their full 195 name, with the "cap" prefix added to variables referring to the 196 corresponding upper case Greek character. For example capdelta 197 refers to the uppercase Greek character, where delta refers to the 198 lowercase Greek character. Furthermore, subscripts are denoted with 199 a '_' separating the variable name and the subscript. For example 200 'theta_i' refers to the variable lowercase Greek character theta with 201 subscript i, or phonetically 'theta sub i.' 203 1.1. Requirements Notation 205 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 206 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 207 document are to be interpreted as described in RFC 2119 [4]. 209 2. Modes of Operation 211 An NTP implementation operates as a primary server, secondary server 212 or client. A primary server is synchronized directly to a reference 213 clock, such as a GPS receiver or telephone modem service. A client 214 is synchronized to one or more upstream servers, but does not provide 215 synchronization to dependent clients. A secondary server has one or 216 more upstream servers and one or more downstream servers or clients. 217 All servers and clients claiming full NTPv4 compliance must implement 218 the entire suite of algorithms described in this document. In order 219 to maintain stability in large NTP subnets, secondary servers must be 220 fully NTPv4 compliant. 222 Primary servers and clients complying with a subset of NTP, called 223 the Simple Network Time Protocol (SNTPv4) [2], do not need to 224 implement all algorithms. SNTP is intended for primary servers 225 equipped with a single reference clock, as well as clients with a 226 single upstream server and no dependent clients. The fully developed 227 NTPv4 implementation is intended for secondary servers with multiple 228 upstream servers and multiple downstream servers or clients. Other 229 than these considerations, NTP and SNTP servers and clients are 230 completely interoperable and can be mixed and matched in NTP subnets. 232 +-------------------+--------------+-------------+ 233 | Association Mode | Assoc. Mode | Packet Mode | 234 +-------------------+--------------+-------------+ 235 | Symmetric Active | 1 | 1 or 2 | 236 | Symmetric Passive | 2 | 1 | 237 | Client | 3 | 4 | 238 | Server | 4 | 3 | 239 | Broadcast Server | 5 | 5 | 240 | Broadcast Client | 6 | N/A | 241 +-------------------+--------------+-------------+ 243 Table 1: Association and Packet Modes 245 There are three NTP protocol variants, symmetric, client/server and 246 broadcast. Each is associated with an association mode as shown in 247 Table 1. In the client/server variant a client association sends 248 mode-3 (client) packets to a server, which returns mode-4 (server) 249 packets. Servers provide synchronization to one or more clients, but 250 do not accept synchronization from them. A server can also be a 251 reference clock which obtains time directly from a standard source 252 such as a GPS receiver or telephone modem service. We say that 253 clients pull synchronization from servers. 255 In the symmetric variant a peer operates as both a server and client 256 using either a symmetric-active or symmetric-passive association. A 257 symmetric-active association sends mode-1 (symmetric-active) packets 258 to a symmetric-active peer association. Alternatively, a symmetric- 259 passive association can be mobilized upon arrival of a mode-1 packet. 260 That association sends mode-2 (symmetric-passive) packets and 261 persists until error or timeout. Peers both push and pull 262 synchronization to and from each other. For the purposes of this 263 document, a peer operates like a client, so a reference to client 264 implies peer as well. 266 In the broadcast variant a broadcast server association sends 267 periodic mode-5 (broadcast) packets which are received by multiple 268 mode-6 (broadcast client) associations. It is useful to provide an 269 initial volley where the client operating in mode 3 exchanges several 270 packets with the server in order to calibrate the propagation delay 271 and to run the Autokey security protocol, after which the client 272 reverts to mode 6. We say that broadcast servers push 273 synchronization to willing consumers. 275 Following conventions established by the telephone industry, the 276 level of each server in the hierarchy is defined by a number called 277 the stratum, with the primary servers assigned stratum one and the 278 secondary servers at each level assigned one greater than the 279 preceding level. As the stratum increases from one, the accuracies 280 achievable degrade somewhat depending on the particular network path 281 and system clock stability. It is useful to assume that mean errors, 282 and thus a metric called the synchronization distance, increase 283 approximately in proportion to the stratum and measured roundtrip 284 delay. It is important to note that NTP stratum is only loosely 285 modeled after telecommunications stratum. The NTP stratum numbers 286 and telecommunications stratum numbers do not correlate with one 287 another. Telecommunications stratum numbers are rigorously defined 288 by international standards that are not covered within this document. 290 Drawing from the experience of the telephone industry, which learned 291 such lessons at considerable cost, the subnet topology should be 292 organized to produce the lowest synchronization distances, but must 293 never be allowed to form a loop. In NTP the subnet topology is 294 determined using a variant of the Bellman-Ford distributed routing 295 algorithm, which computes the shortest-distance spanning tree rooted 296 on the primary servers. As a result of this design, the algorithm 297 automatically reorganizes the subnet to produce the most accurate and 298 reliable time, even when one or more primary or secondary servers or 299 the network paths fail. 301 3. Definitions 303 A number of terms used throughout this document have a precise 304 technical definition. A timescale is a frame of reference where time 305 is expressed as the value of a monotonic-increasing binary counter 306 with an indefinite number of bits. It counts in seconds and fraction 307 with the decimal point somewhere in the middle. The Coordinated 308 Universal Time (UTC) timescale represents mean solar time as 309 disseminated by national standards laboratories. The system time is 310 represented by the system clock maintained by the operating system 311 kernel. The goal of the NTP algorithms is to minimize both the time 312 difference and frequency difference between UTC and the system clock. 313 When these differences have been reduced below nominal tolerances, 314 the system clock is said to be synchronized to UTC. 316 The date of an event is the UTC time at which it takes place. Dates 317 are ephemeral values which always increase in step with reality and 318 are designated with upper case T in this document. It is convenient 319 to define another timescale coincident with the running time of the 320 NTP program that provides the synchronization function. This is 321 convenient in order to determine intervals for the various repetitive 322 functions like poll events. Running time is usually designated with 323 lower case t. 325 A timestamp T(t) represents either the UTC date or time offset from 326 UTC at running time t. Which meaning is intended should be clear 327 from the context. Let T(t) be the time offset, R(t) the frequency 328 offset, D(t) the ageing rate (first derivative of R(t) with respect 329 to t). Then, if T(t_0) is the UTC time offset determined at t=t_0, 330 the UTC time offset after some interval is: 332 T(t+t_0) = T(t_0) + R(t_0)(t+t_0)+(1/2)*D(t_0)(t+t_0)^2 + e, 334 where e is a stochastic error term discussed later in this document. 335 While the D(t) term is important when characterizing precision 336 oscillators, it is ordinarily neglected for computer oscillators. In 337 this document all time values are in seconds (s) and all frequency 338 values are in seconds-per-second (s/s). It is sometimes convenient 339 to express frequency offsets in parts-per-million (PPM), where 1 PPM 340 is equal to 1*10^(-6) seconds. 342 It is important in computer timekeeping applications to assess the 343 performance of the timekeeping function. The NTP performance model 344 includes four statistics which are updated each time a client makes a 345 measurement with a server. The offset theta represents the maximum- 346 likelihood time offset of the server clock relative to the system 347 clock. The delay del represents the roundtrip delay between the 348 client and server. The dispersion epsilon represents the maximum 349 error inherent in the measurement. It increases at a rate equal to 350 the maximum disciplined system clock frequency tolerance phi, 351 typically 15 PPM. The jitter psi, defined as the root-mean-square 352 (RMS) average of the most recent time offset differences, represents 353 the nominal error in estimating theta. 355 While the theta, del, epsilon, and psi statistics represent 356 measurements of the system clock relative to the each server clock 357 separately, the NTP protocol includes mechanisms to combine the 358 statistics of several servers to more accurately discipline and 359 calibrate the system clock. The system offset captheta represents 360 the maximum-likelihood offset estimate for the server population. 361 The system jitter, defined as vartheta, represents the nominal error 362 in estimating captheta. The del and epsilon statistics are 363 accumulated at each stratum level from the reference clocks to 364 produce the root delay delta and root dispersion capepsilon 365 statistics. The synchronization distance gamma=capepsilon+delta/2 366 represents the maximum error due all causes. The detailed 367 formulations of these statistics are given later in this document. 368 They are available to the dependent applications in order to assess 369 the performance of the synchronization function. 371 0 1 2 3 372 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 373 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 374 |LI | VN |Mode | Strat | Poll | Prec | 375 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 376 | Root Delay | 377 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 378 | Root Dispersion | 379 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 380 | Reference ID | 381 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 382 | | 383 + Reference Timestamp + 384 | | 385 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 386 | | 387 + Origin Timestamp + 388 | | 389 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 390 | | 391 + Receive Timestamp + 392 | | 393 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 394 | | 395 + Transmit Timestamp + 396 | | 397 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 398 | | 399 + Extension Field 1 (Optional) + 400 | | 401 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 402 | | 403 + Extension Field 2 (Optional) + 404 | | 405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 406 . . 407 . Authentication . 408 . (Optional) (160 bits) . 409 . . 410 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 412 Figure 1: NTPv4 Message Format 414 4. Implementation Model 416 Figure 2 shows two processes dedicated to each server, a peer process 417 to receive messages from the server or reference clock and a poll 418 process to transmit messages to the server or reference clock. . 420 .......................................................... 421 . Remote .. Peer/Poll .. System . 422 . Servers .. Processes .. Process . 423 . .. .. . 424 .----------..-------------..-------------- . 425 .| |->| |..| | . 426 .|Server 1|..|Peer/Poll 1|->| | . 427 .| |<-| |..| | ............ 428 .----------..-------------..| | . Clock . 429 .Discipline. 430 . .. ^ ..| | .. Process . 431 . .. | ..| | .. . 432 .----------..-------------..| | |-----------|.. . 433 .| |->| |..| Selection |->| ..-------- . 434 .|Server 2|..|Peer/Poll 2|->| and | | Combining |->| Loop | . 435 .| |<-| |..| Clustering | | Algorithm |..|Filter| . 436 .----------..-------------..| Algorithms |->| |.----------- 437 . .. ^ ..| | |-----------|. | 438 . .. | ..| | . | 439 .----------..-------------..| | . | 440 .| |->| |..| | . | 441 .|Server 3|..|Peer/Poll 3|->| | . | 442 .| |<-| |..| | . | 443 .----------..-------------..|------------| . | 444 ....................^..................................... | 445 | | 446 | \|/ 447 | ............... 448 | . /-----\ . 449 '----------------------------------<-| VFO |-<-. 450 . \-----/ . 451 . Clock Adjust. 452 . Process . 453 ............... 455 Figure 2: NTPv4 Algorithm Interactions 457 These processes operate on a common data structure called an 458 association, which contains the statistics described above along with 459 various other data described later. A client sends an NTP packet to 460 one or more servers and processes the replies as received. The 461 server interchanges addresses and ports, overwrites certain fields in 462 the packet and returns it immediately (client/ server mode) or at 463 some time later (symmetric modes). As each NTP message is received, 464 the offset theta between the peer clock and the system clock is 465 computed along with the associated statistics del, epsilon, and psi. 467 The system process includes the selection, clustering and combining 468 algorithms which mitigate among the various servers and reference 469 clocks to determine the most accurate and reliable candidates to 470 synchronize the system clock. The selection algorithm uses Byzantine 471 principles to discard the falsetickers from the incident population, 472 leaving only truechimers. A 'truechimer' is a clock that maintains 473 timekeeping accuracy to a previously published (and trusted) 474 standard, while a 'falseticker' is a clock that does not maintain 475 that level of timekeeping accuracy. The clustering algorithm uses 476 statistical principles to sift the most accurate truechimers leaving 477 the survivors as result. The combining algorithm develops the final 478 clock offset as a statistical average of the survivors. 480 The clock discipline process, which is actually part of the system 481 process, includes engineered algorithms to control the time and 482 frequency of the system clock, here represented as a variable 483 frequency oscillator (VFO). Timestamps struck from the VFO close the 484 feedback loop which maintains the system clock time. Associated with 485 the clock discipline process is the clock adjust process, which runs 486 once each second to inject a computed time offset and maintain 487 constant frequency. The RMS average of past time offset differences 488 represents the nominal error or system jitter vartheta. The RMS 489 average of past frequency offset differences represents the 490 oscillator frequency stability or frequency wander cappsi. 492 A client sends messages to each server with a poll interval of 2^tau 493 seconds, as determined by the poll exponent tau. In NTPv4 tau ranges 494 from 4 (16 s) through 17 (36 h). The value of tau is determined by 495 the clock discipline algorithm to match the loop time constant 496 T_c=2^tau. A server responds with messages at an update interval of 497 mu seconds. For stateless servers, mu=T_c, since the server responds 498 immediately. However, in symmetric modes each of two peers manages 499 the time constant as a function of current system offset and system 500 jitter, so may not agree with the same tau. It is important that the 501 dynamic behavior of the clock discipline algorithms be carefully 502 controlled in order to maintain stability in the NTP subnet at large. 503 This requires that the peers agree on a common tau equal to the 504 minimum poll exponent of both peers. The NTP protocol includes 505 provisions to properly negotiate this value. 507 While not shown in the figure, the implementation model includes some 508 means to set and adjust the system clock. The operating system is 509 assumed to provide two functions, one to set the time directly, for 510 example the Unix settimeofday() function, and another to adjust the 511 time in small increments advancing or retarding the time by a 512 designated amount, for example the Unix adjtime() function 513 (parentheses following a name indicate reference to a function rather 514 than a simple variable). In the intended design the clock discipline 515 process uses the adjtime() function if the adjustment is less than a 516 designated threshold, and the settimeofday() function if above the 517 threshold. The manner in which this is done and the value of the 518 threshold is described later. 520 5. Data Types 522 All NTP time values are represented in twos-complement format, with 523 bits numbered in big-endian (as described in Appendix A of [5]) 524 fashion from zero starting at the left, or high-order, position. 525 There are three NTP time formats, a 128-bit date format, a 64-bit 526 timestamp format and a 32-bit short format, as shown in Figure 3. 527 The 128-bit date format is used where sufficient storage and word 528 size are available. It includes a 64-bit signed seconds field 529 spanning 584 billion years and a 64-bit fraction field resolving .05 530 attosecond (i.e. 0.5e-18). For convenience in mapping between 531 formats, the seconds field is divided into a 32-bit era field and a 532 32-bit timestamp field. Eras cannot be produced by NTP directly, nor 533 is there need to do so. When necessary, they can be derived from 534 external means, such as the filesystem or dedicated hardware. 536 0 1 2 3 537 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 538 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 539 | Seconds | Fraction | 540 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 541 NTP Short Format 543 0 1 2 3 544 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 545 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 546 | Seconds | 547 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 548 | Fraction | 549 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 550 NTP Timestamp Format 552 0 1 2 3 553 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 554 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 555 | Era Number | 556 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 557 | Era Offset | 558 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 559 | | 560 | Fraction | 561 | | 562 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 NTP Date Format 565 Figure 3: NTP Time Format 567 The 64-bit timestamp format is used in packet headers and other 568 places with limited word size. It includes a 32-bit unsigned seconds 569 field spanning 136 years and a 32-bit fraction field resolving 232 570 picoseconds. The 32-bit short format is used in delay and dispersion 571 header fields where the full resolution and range of the other 572 formats are not justified. It includes a 16-bit unsigned seconds 573 field and a 16-bit fraction field. 575 In the date format the prime epoch, or base date of era 0, is 0 h 1 576 January 1900 UTC, when all bits are zero. It should be noted that 577 strictly speaking, UTC did not exist prior to 1 January 1972, but it 578 is convenient to assume it has existed for all eternity, even if all 579 knowledge of historic leap seconds has been lost. Dates are relative 580 to the prime epoch; values greater than zero represent times after 581 that date; values less than zero represent times before it. 583 Timestamps are unsigned values and operations on them produce a 584 result in the same or adjacent eras. Era 0 includes dates from the 585 prime epoch to some time in 2036, when the timestamp field wraps 586 around and the base date for era 1 is established. In either format 587 a value of zero is a special case representing unknown or 588 unsynchronized time. Table 2 shows a number of historic NTP dates 589 together with their corresponding Modified Julian Day (MJD), NTP era 590 and NTP timestamp. 592 +-------------+------------+-----+---------------+------------------+ 593 | Year | MJD | NTP | NTP Timestamp | Epoch | 594 | | | Era | | | 595 +-------------+------------+-----+---------------+------------------+ 596 | 1 Jan -4712 | -2,400,001 | -49 | 1,795,583,104 | 1st day Julian | 597 | 1 Jan -1 | -679,306 | -14 | 139,775,744 | 2 BCE | 598 | 1 Jan 0 | -678,491 | -14 | 171,311,744 | 1 BCE | 599 | 1 Jan 1 | -678,575 | -14 | 202,939,144 | 1 CE | 600 | 4 Oct 1582 | -100,851 | -3 | 2,873,647,488 | Last day Julian | 601 | 15 Oct 1582 | -100,840 | -3 | 2,874,597,888 | First day | 602 | | | | | Gregorian | 603 | 31 Dec 1899 | 15019 | -1 | 4,294,880,896 | Last day NTP Era | 604 | | | | | -1 | 605 | 1 Jan 1900 | 15020 | 0 | 0 | First day NTP | 606 | | | | | Era 0 | 607 | 1 Jan 1970 | 40,587 | 0 | 2,208,988,800 | First day UNIX | 608 | 1 Jan 1972 | 41,317 | 0 | 2,272,060,800 | First day UTC | 609 | 31 Dec 1999 | 51,543 | 0 | 3,155,587,200 | Last day 20th | 610 | | | | | Century | 611 | 8 Feb 2036 | 64,731 | 1 | 63,104 | First day NTP | 612 | | | | | Era 1 | 613 +-------------+------------+-----+---------------+------------------+ 615 Table 2: Interesting Historic NTP Dates 617 Let p be the number of significant bits in the second fraction. The 618 clock resolution is defined 2^(-p), in seconds. In order to minimize 619 bias and help make timestamps unpredictable to an intruder, the non- 620 significant bits should be set to an unbiased random bit string. The 621 clock precision is defined as the running time to read the system 622 clock, in seconds. Note that the precision defined in this way can 623 be larger or smaller than the resolution. The term rho, representing 624 the precision used in this document, is the larger of the two. 626 The only operation permitted with dates and timestamps is twos- 627 complement subtraction, yielding a 127-bit or 63-bit signed result. 628 It is critical that the first-order differences between two dates 629 preserve the full 128-bit precision and the first-order differences 630 between two timestamps preserve the full 64-bit precision. However, 631 the differences are ordinarily small compared to the seconds span, so 632 they can be converted to floating double format for further 633 processing and without compromising the precision. 635 It is important to note that twos-complement arithmetic does not know 636 the difference between signed and unsigned values; only the 637 conditional branch instructions. Thus, although the distinction is 638 made between signed dates and unsigned timestamps, they are processed 639 the same way. A perceived hazard with 64-bit timestamp calculations 640 spanning an era, such as possible in 2036, might result in incorrect 641 values. In point of fact, if the client is set within 68 years of 642 the server before the protocol is started, correct values are 643 obtained even if the client and server are in adjacent eras. 645 Some time values are represented in exponent format, including the 646 precision, time constant and poll interval values. These are in 647 8-bit signed integer format in log2 (log to the base 2) seconds. 649 The only operations permitted on them are increment and decrement. 650 For the purpose of this document and to simplify the presentation, a 651 reference to one of these state variables by name means the 652 exponentiated value, e.g., the poll interval is 1024 s, while 653 reference by name and exponent means the actual value, e.g., the poll 654 exponent is 10. 656 To convert system time in any format to NTP date and timestamp 657 formats requires that the number of seconds s from the prime epoch to 658 the system time be determined. The era is the integer quotient and 659 the timestamp the integer remainder as in: 661 era = s / 2^(32) and timestamp = s - era*2^(32) 663 which works for positive and negative dates. To convert from NTP era 664 and timestamp to system time requires the calculation 666 s = era*2^(32) + timestamp 668 to determine the number of seconds since the prime epoch. Converting 669 between NTP and system time can be a little messy, but beyond the 670 scope of this document. Note that the number of days in era 0 is one 671 more than the number of days in most other eras and this won't happen 672 again until the year 2400 in era 3. 674 In the description of state variables to follow, explicit reference 675 to integer type implies a 32-bit unsigned integer. This simplifies 676 bounds checks, since only the upper limit needs to be defined. 677 Without explicit reference, the default type is 64-bit floating 678 double. Exceptions will be noted as necessary. 680 6. Data Structures 682 The NTP protocol state machines described in following sections are 683 defined using state variables and flow chart fragments. State 684 variables are separated into classes according to their function in 685 packet headers, peer and poll processes, the system process and the 686 clock discipline process. Packet variables represent the NTP header 687 values in transmitted and received packets. Peer and poll variables 688 represent the contents of the association for each server separately. 689 System variables represent the state of the server as seen by its 690 dependent clients. Clock discipline variables represent the internal 691 workings of the clock discipline algorithm. Additional constant and 692 variable classes are defined in Appendix A. 694 6.1. Structure Conventions 696 In order to distinguish between different variables of the same name 697 but used in different processes, the naming convention summarized in 698 Table 3 is employed. A receive packet variable v is a member of the 699 packet structure r with fully qualified name r.v. In a similar 700 manner x.v is a transmit packet variable, p.v is a peer variable, s.v 701 is a system variable and c.v is a clock discipline variable. There 702 is a set of peer variables for each association; there is only one 703 set of system and clock variables. Most flow chart fragments begin 704 with a statement label and end with a named go-to or exit. A 705 subroutine call includes a dummy () following the name and return at 706 the end to the point following the call. 708 +------+---------------------------------+ 709 | Name | Description | 710 +------+---------------------------------+ 711 | r. | receive packet header variable | 712 | x. | transmit packet header variable | 713 | p. | peer/poll variable | 714 | s. | system variable | 715 | c. | clock discipline variable | 716 +------+---------------------------------+ 718 Table 3: Name Prefix Conventions 720 6.2. Global Parameters 722 In addition to the variable classes a number of global parameters are 723 defined in this document, including those shown with values in 724 Table 4. 726 +-----------+-------+----------------------------------+ 727 | Name | Value | Description | 728 +-----------+-------+----------------------------------+ 729 | PORT | 123 | NTP port number | 730 | VERSION | 4 | version number | 731 | TOLERANCE | 15e-6 | frequency tolerance (s/s) | 732 | MINPOLL | 4 | minimum poll exponent (16 s) | 733 | MAXPOLL | 17 | maximum poll exponent (36 h) | 734 | MAXDISP | 16 | maximum dispersion (s) | 735 | MINDISP | .005 | minimum dispersion increment (s) | 736 | MAXDIST | 1 | distance threshold (s) | 737 | MAXSTRAT | 16 | maximum stratum number | 738 +-----------+-------+----------------------------------+ 740 Table 4: Global Parameters 742 While these are the only parameters needed in this document, a larger 743 collection is necessary in the skeleton and larger still for any 744 implementation. Appendix A.1.1 contains those used by the skeleton 745 for the mitigation algorithms, clock discipline algorithm and related 746 implementation-dependent functions. Some of these parameter values 747 are cast in stone, like the NTP port number assigned by the IANA and 748 the version number assigned NTPv4 itself. Others like the frequency 749 tolerance, involve an assumption about the worst case behavior of a 750 system clock once synchronized and then allowed to drift when its 751 sources have become unreachable. The minimum and maximum parameters 752 define the limits of state variables as described in later sections. 754 While shown with fixed values in this document, some implementations 755 may make them variables adjustable by configuration commands. For 756 instance, the reference implementation computes the value of 757 PRECISION as log2 of the minimum time in several iterations to read 758 the system clock. 760 6.3. Packet Header Variables 762 +-----------+------------+-----------------------+ 763 | Name | Formula | Description | 764 +-----------+------------+-----------------------+ 765 | leap | leap | leap indicator (LI) | 766 | version | version | version number (VN) | 767 | mode | mode | mode | 768 | stratum | stratum | stratum | 769 | poll | poll | poll exponent | 770 | precision | rho | precision exponent | 771 | rootdelay | delta | root delay | 772 | rootdisp | capepsilon | root dispersion | 773 | refid | refid | reference ID | 774 | reftime | reftime | reference timestamp | 775 | org | T1 | origin timestamp | 776 | rec | T2 | receive timestamp | 777 | xmt | T3 | transmit timestamp | 778 | dst | T4 | destination timestamp | 779 | keyid | keyid | key ID | 780 | digest | digest | message digest | 781 +-----------+------------+-----------------------+ 783 Table 5: Packet Header Variables 785 The most important state variables from an external point of view are 786 the packet header variables described below. The NTP packet consists 787 of a number of 32-bit (4 octet) words in network byte order. The 788 packet format consists of three components, the header itself, one or 789 more optional extension fields and an optional message authentication 790 code (MAC). The header component is identical to the NTPv3 header 791 and previous versions. The optional extension fields are used by the 792 Autokey public key cryptographic algorithms described in [3]. The 793 optional MAC is used by both Autokey and the symmetric key 794 cryptographic algorithms described in the main body of this report. 796 The NTP packet header follows the UDP and IP headers and the physical 797 header specific to the underlying transport network. It consists of 798 a number of 32-bit (4-octet) words, although some fields use multiple 799 words and others are packed in smaller fields within a word. The NTP 800 packet header shown in Figure 4 has 12 words followed by optional 801 extension fields and finally an optional message authentication code 802 (MAC) consisting of the key identifier and message digest fields. 804 The optional extension fields described in this section are used by 805 the Autokey security protocol [3], which is not described here. The 806 MAC is used by both Autokey and the symmetric key authentication 807 scheme described in Appendix A. As is the convention in other 808 Internet protocols, all fields are in network byte order, commonly 809 called big-endian. 811 A list of the packet header variables is shown in Table 5 and 812 described in detail below. The packet header fields apply to both 813 transmitted (x prefix) and received packets (r prefix). The NTP 814 header is shown in Figure 4 , where the size of some multiple-word 815 fields is shown in bits if not the default 32 bits. The header 816 extends from the beginning of the packet to the end of the Transmit 817 Timestamp field. When using the IPv4 address family these fields are 818 backwards compatible with NTPv3. When using the IPv6 address family 819 on an NTPv4 server with a NTPv3 client, the Reference Identifier 820 field appears to be a random value and a timing loop might not be 821 detected. The message authentication code (MAC) consists of a 32-bit 822 Key Identifier followed by a 128bit Message Digest. The message 823 digest, or cryptosum, is calculated as in [6] over all header and 824 optional extension fields. 826 0 1 2 3 827 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 828 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 829 |LI | VN |Mode | Strat | Poll | Prec | 830 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 831 | Root Delay | 832 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 833 | Root Dispersion | 834 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 835 | Reference ID | 836 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 837 | | 838 + Reference Timestamp + 839 | | 840 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 841 | | 842 + Origin Timestamp + 843 | | 844 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 845 | | 846 + Receive Timestamp + 847 | | 848 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 849 | | 850 + Transmit Timestamp + 851 | | 852 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 853 | | 854 + Extension Field 1 (Optional) + 855 | | 856 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 857 | | 858 + Extension Field 2 (Optional) + 859 | | 860 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 861 . . 862 . Authentication . 863 . (Optional) (160 bits) . 864 . . 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 Figure 4: NTPv4 Message Format 869 The variables are interpreted as follows: 871 leap: 2-bit integer warning of an impending leap second to be 872 inserted or deleted in the last minute of the current month, coded as 873 follows: 875 +-------+-------------------------------------------------+ 876 | Value | Meaning | 877 +-------+-------------------------------------------------+ 878 | 0 | no warning | 879 | 1 | last minute of the day has 61 seconds | 880 | 2 | last minute of the day has 59 seconds | 881 | 3 | alarm condition (the clock is not synchronized) | 882 +-------+-------------------------------------------------+ 884 Table 6: Leap Indicator 886 version: 3-bit integer representing the NTP version number, currently 887 4. 889 mode: 3-bit integer representing the mode, with values defined as 890 follows: 892 +-------+--------------------------+ 893 | Value | Meaning | 894 +-------+--------------------------+ 895 | 0 | reserved | 896 | 1 | symmetric active | 897 | 2 | symmetric passive | 898 | 3 | client | 899 | 4 | server | 900 | 5 | broadcast | 901 | 6 | NTP control message | 902 | 7 | reserved for private use | 903 +-------+--------------------------+ 905 Table 7: Mode 907 stratum: 8-bit integer representing the stratum, with values defined 908 as follows: 910 +--------+-----------------------------------------------------+ 911 | Value | Meaning | 912 +--------+-----------------------------------------------------+ 913 | 0 | unspecified or invalid | 914 | 1 | primary server (e.g., equipped with a GPS receiver) | 915 | 2-15 | secondary server (via NTP) | 916 | 16 | client-only | 917 | 17-255 | undefined | 918 +--------+-----------------------------------------------------+ 920 Table 8: Stratum 922 It is customary to map the stratum value 0 in received packets to 923 MAXSTRAT (16) in the peer variable p.stratum and to map p.stratum 924 values of MAXSTRAT or greater to 0 in transmitted packets. This 925 allows reference clocks, which normally appear at stratum 0, to be 926 conveniently mitigated using the same algorithms used for external 927 sources. 929 poll: 8-bit signed integer representing the maximum interval between 930 successive messages, in log2 seconds. Suggested default limits for 931 minimum and maximum poll intervals are 6 and 10, respectively. 933 precision: 8-bit signed integer representing the precision of the 934 system clock, in log2 seconds. For instance a value of -18 935 corresponds to a precision of about one microsecond. The precision 936 can be determined when the service first starts up as the minimum 937 time of several iterations to read the system clock. 939 rootdelay: Total roundtrip delay to the reference clock, in NTP short 940 format. 942 rootdisp: Total dispersion to the reference clock, in NTP short 943 format. 945 refid: 32-bit code identifying the particular server or reference 946 clock. The interpretation depends on the value in the stratum field. 947 For packet stratum 0 (unspecified or invalid) this is a four- 948 character ASCII string, called the kiss code, used for debugging and 949 monitoring purposes. For stratum 1 (reference clock) this is a four- 950 octet, left-justified, zero-padded ASCII string assigned to the 951 reference clock. While not specifically enumerated in this document, 952 the following have been used as ASCII identifiers: 954 +------+----------------------------------------------------------+ 955 | ID | Clock Source | 956 +------+----------------------------------------------------------+ 957 | GOES | Geosynchronous Orbit Environment Satellite | 958 | GPS | Global Position System | 959 | GAL | Galileo Positioning System | 960 | PPS | Generic pulse-per-second | 961 | IRIG | Inter-Range Instrumentation Group | 962 | WWVB | LF Radio WWVB Ft. Collins, CO 60 kHz | 963 | DCF | LF Radio DCF77 Mainflingen, DE 77.5 kHz | 964 | HBG | LF Radio HBG Prangins, HB 75 kHz | 965 | MSF | LF Radio MSF Anthorn, UK 60 kHz (Rugby until April 2007) | 966 | JJY | LF Radio JJY Fukushima, JP 40 kHz, Saga, JP 60 kHz | 967 | LORC | MF Radio LORAN C 100 kHz | 968 | TDF | MF Radio Allouis, FR 162 kHz | 969 | CHU | HF Radio CHU Ottawa, Ontario | 970 | WWV | HF Radio WWV Ft. Collins, CO | 971 | WWVH | HF Radio WWVH Kauai, HI | 972 | NIST | NIST telephone modem | 973 | ACTS | NIST telephone modem | 974 | USNO | USNO telephone modem | 975 | PTB | European telephone modem | 976 +------+----------------------------------------------------------+ 978 Table 9: Reference IDs 980 Above stratum 1 (secondary servers and clients) this is the reference 981 identifier of the server. If using the IPv4 address family, the 982 identifier is the four-octet IPv4 address. If using the IPv6 address 983 family, it is the first four octets of the MD5 hash of the IPv6 984 address. 986 reftime: Time when the system clock was last set or corrected, in NTP 987 timestamp format. 989 org: Time at the client when the request departed for the server, in 990 NTP timestamp format. 992 rec: Time at the server when the request arrived from the client, in 993 NTP timestamp format. 995 xmt: Time at the server when the response left for the client, in NTP 996 timestamp format. 998 dst: Time at the client when the reply arrived from the server, in 999 NTP timestamp format. Note: This value is not included in a header 1000 field; it is determined upon arrival of the packet and made available 1001 in the packet buffer data structure. 1003 keyid: 32-bit unsigned integer used by the client and server to 1004 designate a secret 128-bit MD5 key. Together, the keyid and digest 1005 fields collectively are called message authentication code (MAC). 1007 digest: 128-bit bitstring computed by the keyed MD5 message digest 1008 algorithm described in Appendix A. 1010 6.3.1. The Kiss-o'-Death Packet 1012 If the Stratum field is 0, which is an 'unspecified' Stratum field 1013 value, the Reference Identifier field can be used to convey messages 1014 useful for status reporting and access control. In NTPv4 and SNTPv4, 1015 packets of this kind are called Kiss-o'-Death (KoD) packets and the 1016 ASCII messages they convey are called kiss codes. The KoD packets 1017 got their name because an early use was to tell clients to stop 1018 sending packets that violate server access controls. The kiss codes 1019 can provide useful information for an intelligent client. These 1020 codes are encoded in four-character ASCII strings left justified and 1021 zero filled. The strings are designed for character displays and log 1022 files. A list of the currently-defined kiss codes is given below: 1024 +------+------------------------------------------------------------+ 1025 | Code | Meaning | 1026 +------+------------------------------------------------------------+ 1027 | ACST | The association belongs to a unicast server | 1028 | AUTH | Server authentication failed | 1029 | AUTO | Autokey sequence failed | 1030 | BCST | The association belongs to a broadcast server | 1031 | CRYP | Cryptographic authentication or identification failed | 1032 | DENY | Access denied by remote server | 1033 | DROP | Lost peer in symmetric mode | 1034 | RSTR | Access denied due to local policy | 1035 | INIT | The association has not yet synchronized for the first | 1036 | | time | 1037 | MCST | The association belongs to a dynamically discovered server | 1038 | NKEY | No key found. Either the key was never installed or is | 1039 | | not trusted | 1040 | RATE | Rate exceeded. The server has temporarily denied access | 1041 | | because the client exceeded the rate threshold | 1042 | RMOT | Alteration of association from a remote host running | 1043 | | ntpdc. | 1044 | STEP | A step change in system time has occurred, but the | 1045 | | association has not yet resynchronized | 1046 +------+------------------------------------------------------------+ 1048 Table 10: Currently-defined NTP Kiss Codes 1050 6.3.2. NTP Extension Field Format 1052 In NTPv4 one or more extension fields can be inserted after the 1053 header and before the MAC, which is always present when extension 1054 fields are present. The extension fields can occur in any order; 1055 however, in some cases there is a preferred order which improves the 1056 protocol efficiency. 1058 An extension field contains a request or response message in the 1059 format shown in Figure 5. 1060 0 1 2 3 1061 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1062 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1063 | Field Type | Length | 1064 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1065 | Association ID | 1066 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1067 | Timestamp | 1068 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1069 | Filestamp | 1070 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1071 | Value Length | 1072 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1073 . . 1074 . Value . 1075 . . 1076 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1077 | Signature Length | 1078 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1079 . . 1080 . Signature . 1081 . . 1082 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1083 | Padding (as needed) | 1084 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1086 Figure 5: NTP Extension Field Format 1088 All extension fields are zero-padded to a word (4 octets) boundary. 1089 The Length field covers the entire extension field, including the 1090 Length and Padding fields. While the minimum field length is 4 words 1091 (16 octets), a maximum field length remains to be established. 1093 The RE, VN, and Code fields together form a Field Type field, a 16- 1094 bit integer which indicates the type of extension message contained 1095 within the extension field. 1097 The Length field is a 16-bit integer which indicates the length of 1098 the entire extension field in octets, including the Length and 1099 Padding fields. 1101 The 32-bit Association ID field is set by clients to the value 1102 previously received from the server or 0 otherwise. The server sets 1103 the Association ID field when sending a response as a handle for 1104 subsequent exchanges. If the association ID value in a request does 1105 not match the association ID of any association, the server returns 1106 the request with the first two bits of the Field Type field set to 1. 1108 The Timestamp and Filestamp 32-bit fields carry the seconds field of 1109 an NTP timestamp. The Timestamp field establishes the signature 1110 epoch of the data field in the message, while the filestamp 1111 establishes the generation epoch of the file that ultimately produced 1112 the data. 1114 The 32-bit Value Length field indicates the length of the Value field 1115 in octets. The minimum length of the Value field is 0, in which case 1116 the Value field is omitted. 1118 The 32-bit Value Length field indicates the length of the Value field 1119 in octets. The minimum length of the Value field is 0. 1121 Zero padding is applied, as necessary, to extend the extension field 1122 to a word (4-octet) boundary. If multiple extension fields are 1123 present, the last extension field is zero-padded to a double-word (8 1124 octet) boundary. 1126 The presence of the MAC and extension fields in the packet is 1127 determined from the length of the remaining area after the header to 1128 the end of the packet. The parser initializes a pointer just after 1129 the header. If the Length field is not a multiple of 4, a format 1130 error has occurred and the packet is discarded. The following cases 1131 are possible based on the remaining length in words. 1132 0 The packet is not authenticated. 1133 1 The packet is an error report or crypto-NAK. 1134 2, 3, 4 The packet is discarded with a format error. 1135 5 The remainder of the packet is the MAC. 1136 >5 One or more extension fields are present. 1138 If an extension field is present, the parser examines the Length 1139 field. If the length is less than 4 or not a multiple of 4, a format 1140 error has occurred and the packet is discarded; otherwise, the parser 1141 increments the pointer by this value. The parser now uses the same 1142 rules as above to determine whether a MAC is present and/or another 1143 extension field. An additional implementation dependent test is 1144 necessary to ensure the pointer does not stray outside the buffer 1145 space occupied by the packet. 1147 7. On Wire Protocol 1148 t2 t3 t6 t7 1149 +---------+ +---------+ +---------+ +---------+ 1150 T1 | 0 | | t2 | | t4 | | t6 | 1151 +---------+ +---------+ +---------+ +---------+ 1152 T2 | 0 | | t1 | | t3 | | t5 | Packet 1153 +---------+ +---------+ +---------+ +---------+ Variables 1154 T3 |t2=clock | | t2 | |t6=clock | | t6 | 1155 +---------+ +---------+ +---------+ +---------+ 1156 T4 | t1 | |t3=clock | | t5 | |t7=clock | 1157 +---------+ +---------+ +---------+ +---------+ 1158 Peer B 1159 +---------+ +---------+ +---------+ +---------+ 1160 org | t1 | | t1 | | T3<>t1? | | t5 | 1161 +---------+ +---------+ +---------+ +---------+ State 1162 rec | t2 | | t2 | | t6 | | t6 | Variables 1163 +---------+ +---------+ +---------+ +---------+ 1164 xmt | 0 | | t3 | | T1<>t3? | | t7 | 1165 +---------+ +---------+ +---------+ +---------+ 1167 t2 t3 t6 t7 1168 --------------------------------------------------------- 1169 /\ \ /\ \ 1170 / \ / \ 1171 / \ / \ 1172 / \/ / \/ 1173 --------------------------------------------------------- 1174 t1 t4 t5 t8 1176 t1 t4 t5 t8 1177 +---------+ +---------+ +---------+ +---------+ 1178 T1 | 0 | | t2 | | t4 | | t6 | 1179 +---------+ +---------+ +---------+ +---------+ 1180 T2 | 0 | | t1 | | t3 | | t5 | Packet 1181 +---------+ +---------+ +---------+ +---------+ Variables 1182 T3 | 0 | |t4=clock | | t4 | |t8=clock | 1183 +---------+ +---------+ +---------+ +---------+ 1184 T4 |t1=clock | | t3 | |t5=clock | | t7 | 1185 +---------+ +---------+ +---------+ +---------+ 1186 Peer A 1187 +---------+ +---------+ +---------+ +---------+ 1188 org | 0 | | T3<>0? | | t3 | | T3<>t3? | 1189 +---------+ +---------+ +---------+ +---------+ State 1190 rec | 0 | | t4 | | t4 | | t8 | Variables 1191 +---------+ +---------+ +---------+ +---------+ 1192 xmt | t1 | | T1=t1? | | t5 | | T1<>t5? | 1193 +---------+ +---------+ +---------+ +---------+ 1195 Figure 7: On-Wire Protocol 1197 The NTP on-wire protocol is the core mechanism to exchange time 1198 values between servers, peers and clients. It is inherently 1199 resistant to lost or duplicate data packets. Data integrity is 1200 provided by the IP and UDP checksums. No flow-control or 1201 retransmission facilities are provided or necessary. The protocol 1202 uses timestamps, either extracted from packet headers or struck from 1203 the system clock upon the arrival or departure of a packet. 1204 Timestamps are precision data and should be restruck in case of link 1205 level retransmission and corrected for the time to compute a MAC on 1206 transmit. 1208 NTP messages make use of two different communication modes, one to 1209 one and one to many, commonly referred to as unicast and broadcast. 1210 For the purposes of this document, the term broadcast is interpreted 1211 to mean any available one to many mechanism. For IPv4 this equates 1212 to either IPv4 broadcast or IPv4 multicast. For IPv6 this equates to 1213 IPv6 multicast. For this purpose, IANA has allocated the IPv4 1214 multicast address 224.0.1.1 and the IPv6 multicast address ending 1215 :101, with prefix determined by scoping rules. 1217 The on-wire protocol uses four timestamps numbered T1 through T4 and 1218 three state variables org, rec and xmt, as shown in Figure 7. This 1219 figure shows the most general case where each of two peers, A and B, 1220 independently measure the offset and delay relative to the other. 1221 For purposes of illustration the individual timestamp values are 1222 shown in lower case with subscripts indicating the order of 1223 transmission and reception. 1225 In the figure the first packet transmitted by A containing only the 1226 transmit timestamp T3 with value t1. B receives the packet at t2 and 1227 saves the origin timestamp T1 with value t1 in state variable org and 1228 the destination timestamp T4 with value t2 in state variable rec. At 1229 this time or some time later B sends a packet to A containing the org 1230 and rec state variables in T1 and T2, respectively and in addition 1231 the transmit timestamp T3 with value t3, which is saved in the xmt 1232 state variable. When this packet arrives at A the packet header 1233 variables T1, T2, T3 and destination timestamp T4 represent the four 1234 timestamps necessary to compute the offset and delay of B relative to 1235 A, as described later. 1237 Before the A state variables are updated, two sanity checks are 1238 performed in order to protect against duplicate or bogus packets. A 1239 packet is a duplicate if the transmit timestamp T3 in the packet 1240 matches the xmt state variable. A packet is bogus if the origin 1241 timestamp T1 in the packet does not match the org state variable. In 1242 either of these cases the state variables are updated, but the packet 1243 is discarded. 1245 The four most recent timestamps, T1 through T4, are used to compute 1246 the offset of B relative to A 1248 theta = T(B) - T(A) = 1/2*(T2-T1)+(T4-T3) 1250 and the roundtrip delay 1252 delta = T(ABA)- = (T4-T1)-(T3-T2) 1254 Note that the quantities within parentheses are computed from 64-bit 1255 unsigned timestamps and result in signed values with 63 significant 1256 bits plus sign. These values can represent dates from 68 years in 1257 the past to 68 years in the future. However, the offset and delay 1258 are computed as the sum and difference of these values, which contain 1259 62 significant bits and two sign bits, so can represent unambiguous 1260 values from 34 years in the past to 34 years in the future. In other 1261 words, the time of the client must be set within 34 years of the 1262 server before the service is started. This is a fundamental 1263 limitation with 64-bit integer arithmetic. 1265 In implementations where floating double arithmetic is available, the 1266 first-order differences can be converted to floating double and the 1267 second-order sums and differences computed in that arithmetic. Since 1268 the second-order terms are typically very small relative to the 1269 timestamps themselves, there is no loss in significance, yet the 1270 unambiguous range is increased from 34 years to 68 years. 1272 In some scenarios where the frequency offset between the client and 1273 server is relatively large and the actual propagation time small, it 1274 is possible that the delay computation becomes negative. For 1275 instance, if the frequency difference is 100 PPM and the interval 1276 T4-T1 is 64 s, the apparent delay is -6.4 ms. Since negative values 1277 are misleading in subsequent computations, the value of del should be 1278 clamped not less than the system precision defined. 1280 The discussion above assumes the most general case where two 1281 symmetric peers independently measure the offsets and delays between 1282 them. In the case of a stateless server, the protocol can be 1283 simplified. A stateless server copies T3 and T4 from the client 1284 packet to T1 and T2 of the server packet and tacks on the transmit 1285 timestamp T3 before sending it to the client. Additional details for 1286 filling in the remaining protocol fields are given in the next 1287 section and in Appendix A. 1289 A SNTP primary server implementing the on-wire protocol has no 1290 upstream servers except a single reference clock In principle, it is 1291 indistinguishable from an NTP primary server which has the mitigation 1292 algorithms, presumably to mitigate between multiple reference clocks. 1294 Upon receiving a client request, a SNTP primary server constructs and 1295 sends the reply packet as shown in Table 4 below. Note that the 1296 dispersion field in the packet header must be calculated in the same 1297 way as in the NTP case. 1299 A SNTP client using the on-wire protocol has a single server and no 1300 downstream clients. It can operate with any subset of the NTP on- 1301 wire protocol, the simplest using only the transmit timestamp of the 1302 server packet and ignoring all other fields. However, the additional 1303 complexity to implement the full on-wire protocol is minimal and is 1304 encouraged. 1306 8. Peer Process 1308 The peer process is called upon arrival of a server packet. It runs 1309 the on-wire protocol to determine the clock offset and roundtrip 1310 delay and in addition computes statistics used by the system and poll 1311 processes. Peer variables are instantiated in the association data 1312 structure when the structure is initialized and updated by arriving 1313 packets. There is a peer process, poll process and association for 1314 each server. 1316 The discussion in this section covers only the variables and routines 1317 necessary for a conforming NTPv4 implementation. 1319 8.1. Peer Process Variables 1321 Table 11, Table 12, Table 13, and Table 14 summarize the common 1322 names, formula names and a short description of each peer variable, 1323 all of which have prefix p. 1325 +---------+----------+-----------------------+ 1326 | Name | Formula | Description | 1327 +---------+----------+-----------------------+ 1328 | srcaddr | srcaddr | source address | 1329 | srcport | srcport | source port | 1330 | dstaddr | dstaddr | destination address | 1331 | dstport | destport | destination port | 1332 | keyid | keyid | key identifier key ID | 1333 +---------+----------+-----------------------+ 1335 Table 11: Peer Process Configuration Variables 1337 The following configuration variables are normally initialized when 1338 the association is mobilized, either from a configuration file or 1339 upon arrival of the first packet for an ephemeral association. 1341 p.srcadr: IP address of the remote server or reference clock. This 1342 becomes the destination IP address in packets sent from this 1343 association. 1345 p.srcport: UDP port number of the server or reference clock. This 1346 becomes the destination port number in packets sent from this 1347 association. When operating in symmetric modes (1 and 2) this field 1348 must contain the NTP port number PORT (123) assigned by the IANA. In 1349 other modes it can contain any number consistent with local policy. 1351 p.dstadr: IP address of the client. This becomes the source IP 1352 address in packets sent from this association. 1354 p.dstport: UDP port number of the client, ordinarily the NTP port 1355 number PORT (123) assigned by the IANA. This becomes the source port 1356 number in packets sent from this association. 1358 p.keyid: Symmetric key ID for the 128-bit MD5 key used to generate 1359 and verify the MAC. The client and server or peer can use different 1360 values, but they must map to the same key. 1362 +-----------+------------+---------------------+ 1363 | Name | Formula | Description | 1364 +-----------+------------+---------------------+ 1365 | leap | leap | leap indicator | 1366 | version | version | version number | 1367 | mode | mode | mode | 1368 | stratum | stratum | stratum | 1369 | ppoll | ppoll | peer poll exponent | 1370 | rootdelay | delta | root delay | 1371 | rootdisp | capepsilon | root dispersion | 1372 | refid | refid | reference ID | 1373 | reftime | reftime | reference timestamp | 1374 +-----------+------------+---------------------+ 1376 Table 12: Peer Process Packet Variables 1378 The variables defined below are updated from the packet header as 1379 each packet arrives. They are interpreted in the same way as the as 1380 the packet variables of the same names. 1382 ------------------ 1383 | receive | 1384 ------------------ 1385 \| / 1386 ------------------ no------------------ 1387 | format OK? |-->| format error | 1388 ------------------ ------------------ 1389 \| / yes 1390 ------------------ no------------------ 1391 | access OK? |-->| access error | 1392 ------------------ ------------------ 1393 \| / yes 1394 ------------------yes------------------ 1395 | mode = 3? |-->| client_packet | 1396 ------------------ ------------------ 1397 \| / no 1398 ------------------yes------------------ 1399 | auth OK? |-->| auth error | 1400 ------------------ ------------------ 1401 \| / yes 1402 ------------------ 1403 | match_assoc | 1404 ------------------ 1406 Figure 8: Receive Processing 1408 p.leap, p.version, p.mode, p.stratum, p.ppoll, p.rootdelay, 1409 p.rootdisp, p.refid, p.reftime 1411 It is convenient for later processing to convert the NTP short format 1412 packet values p.rootdelay and p.rootdisp to floating doubles as peer 1413 variables. 1415 +------+---------+--------------------+ 1416 | Name | Formula | Description | 1417 +------+---------+--------------------+ 1418 | t | t | epoch | 1419 | org | T1 | origin timestamp | 1420 | rec | T2 | receive timestamp | 1421 | xmt | T3 | transmit timestamp | 1422 +------+---------+--------------------+ 1424 Table 13: Peer Process Timestamp Variables 1425 +--------+---------+-----------------+ 1426 | Name | Formula | Description | 1427 +--------+---------+-----------------+ 1428 | offset | theta | clock offset | 1429 | delay | del | roundtrip delay | 1430 | disp | epsilon | dispersion | 1431 | jitter | psi | jitter | 1432 +--------+---------+-----------------+ 1434 Table 14: Peer Process Statistics Variables 1436 The p.org, p.rec, p.xmt variables represent the timestamps computed 1437 by the on-wire protocol described previously. The p.offset, p.delay, 1438 p.disp, p.jitter variables represent the current time values and 1439 statistics produced by the clock filter algorithm. The offset and 1440 delay are computed by the on-wire protocol; the dispersion and jitter 1441 are calculated as described below. Strictly speaking, the epoch p.t 1442 is not a timestamp; it records the system timer upon arrival of the 1443 latest packet selected by the clock filter algorithm. 1445 8.2. Peer Process Operations 1447 Figure 8 shows the peer process code flow upon the arrival of a 1448 packet. There is no specific method required for access control, 1449 although it is recommended that implementations include a match-and- 1450 mask scheme similar to many others now in widespread use. Format 1451 checks require correct field length and alignment, acceptable version 1452 number (1-4) and correct extension field syntax, if present. There 1453 is no specific requirement for authentication; however, if 1454 authentication is implemented, the symmetric key scheme described in 1455 Section 6 must be included among the supported. This scheme uses the 1456 MD5 keyed hash algorithm described in Appendix A.2. For the most 1457 vulnerable applications the Autokey public key scheme described in 1458 [3] is recommended. 1460 Next, the association table is searched for matching source address 1461 and source port using the find_assoc() routine in Appendix A.5.1. 1462 The dispatch table near the beginning of that section is indexed by 1463 the packet mode and association mode (0 if no matching association) 1464 to determine the dispatch code and thus the case target. The 1465 significant cases are FXMT, NEWPS and NEWBC. 1466 ----------------- 1467 | client_packet | 1468 ----------------- 1469 \ | / 1470 ----------------- 1471 | copy header | 1472 ----------------- 1473 \ | / 1474 ----------------- 1475 | copy T1,T2 | 1476 ----------------- 1477 \ | / 1478 ----------------- 1479 | T3 = clock | 1480 ----------------- 1481 \ | / 1482 ----------------- yes -------------- 1483 | copy header | --> | MD5 digest |-\ 1484 ----------------- -------------- | 1485 | no | 1486 \ | / | 1487 ----------------- | 1488 | NAK digest | | 1489 ----------------- | 1490 |-----------------------------/ 1491 \ | / 1492 ----------------- 1493 | fast_xmit() | 1494 ----------------- 1495 \ | / 1496 ----------------- 1497 | xmt = T3 | 1498 ----------------- 1499 \ | / 1500 ----------------- 1501 | return | 1502 ----------------- 1504 Packet Variable <-- Variable 1505 x.leap <-- s.leap 1506 x.version <-- r.version 1507 x.mode <-- 4 1508 x.stratum <-- s.stratum 1509 x.poll <-- r.poll 1510 x.precision <-- s.precision 1511 x.rootdelay <-- s.rootdelay 1512 x.rootdisp <-- s.rootdisp 1513 x.refid <-- s.refid 1514 x.reftime <-- s.reftime 1515 x.org <-- r.xmt 1516 x.rec <-- r.dst 1517 x.xmt <-- clock 1518 x.keyid <-- r.keyid 1519 x.digest <-- md5 digest 1520 Figure 9: Client Packet Processing 1522 FXMIT. This is a client (mode 3) packet matching no association. 1523 The server constructs a server (mode 4) packet and returns it to the 1524 client without retaining state. The server packet is constructed as 1525 in Figure 9 and the fast_xmit() routine in Appendix A.5.4. If the 1526 s.rootdelay and s.rootdisp system variables are stored in floating 1527 double, they must be converted to NTP short format first. Note that, 1528 if authentication fails, the server returns a special message called 1529 a crypto-NAK. This message MUST include the normal NTP header data 1530 shown in the figure, but with a MAC consisting of four octets of 1531 zeros. The client MAY accept or reject the data in the message. 1533 NEWBC. This is a broadcast (mode 5) packet matching no association. 1534 The client mobilizes a client (mode 3) association as shown in the 1535 mobilize() and clear() routines in Appendix A.2. Implementations 1536 supporting authentication first perform the necessary steps to run 1537 the Autokey or other protocol, and determine the propagation delay, 1538 then continues in listen-only (mode 6) to receive further packets. 1539 Note the distinction between a mode-6 packet, which is reserved for 1540 the NTP monitor and control functions, and a mode-6 association. 1542 NEWPS. This is a symmetric active (1) packet matching no 1543 association. The client mobilizes a symmetric passive (mode 2) 1544 association as shown in the mobilize() and clear() routines in 1545 Appendix A.2. Code flow continues to the match_assoc() fragment 1546 described below. In other cases the packet matches an existing 1547 association and code flows to the match_assoc fragment in Figure 10. 1548 The packet timestamps are carefully checked to avoid invalid, 1549 duplicate or bogus packets, as shown in the figure. Note that a 1550 crypto-NAK is considered valid only if it survives these tests. 1551 Next, the peer variables are copied from the packet header variables 1552 as shown in Figure 11 and the packet() routine in Appendix A.5.2. 1553 Implementations MUST include a number of data range checks as shown 1554 in Table 15 and discard the packet if the ranges are exceeded; 1555 however, the header fields MUST be copied even if errors occur, since 1556 they are necessary in symmetric modes to construct the subsequent 1557 poll message. 1559 --------------- 1560 | match assoc | 1561 --------------- 1562 \ | / 1563 --------------- yes ---------------- 1564 | T3 = 0? | --> | format error | 1565 --------------- ---------------- 1566 \ | / no 1567 --------------- yes ---------------- 1568 | T3 = xmt? | --> | duplicate | 1569 --------------- ---------------- 1570 \ | / no 1571 --------------- no ---------------- yes 1572 | mode = 5? | --> |T1 or T2 = 0? |--\ 1573 --------------- ---------------- | 1574 | yes \ | / no | 1575 \ | /<-----\ ---------------- | 1576 | \-| T1 = xmt? | | 1577 ---------------- ---------------- | 1578 | auth = NAK? | no \ | /<------/ 1579 ---------------- | 1580 yes\|/ no\|/ ---------------- 1581 --------- ------ | org = T3 | 1582 |org=T3| |auth| | rec = T4 | 1583 |rec=T4| |err | ---------------- 1584 --------- ------ \ | / 1585 \|/ ---------------- 1586 --------- | return | 1587 |packet | ---------------- 1588 --------- 1590 Figure 10: Timestamp Processing 1592 ---------------- 1593 | packet | 1594 ---------------- 1595 \ | / 1596 ---------------- 1597 | copy header | 1598 ---------------- 1599 \ | / 1600 ---------------- bad ---------------- 1601 | header? | --> |header error | 1602 ---------------- ---------------- 1603 \ | / 1604 ---------------- 1605 | reach |= 1 | 1606 ---------------- 1607 \ | / 1608 ---------------- 1609 | poll update | 1610 ---------------- 1611 \ | / 1612 ---------------------------------------- 1613 | theta = 1/2*(T2-T1)+(T3-T4) | 1614 | del = (T4-T1)-(T3-T2) | 1615 | epsilon = rho_r+rho+capphi*((T4-T1) | 1616 ---------------------------------------- 1617 \ | / 1618 ---------------- 1619 | clock filter | 1620 ---------------- 1622 Peer Variables <-- Packet Variables 1623 p.leap <-- r.leap 1624 p.mode <-- r.mode 1625 p.stratum <-- r.stratum 1626 p.ppoll <-- r.ppoll 1627 p.rootdelay <-- r.rootdelay 1628 p.rootdisp <-- r.rootdisp 1629 p.refid <-- r.refid 1630 p.reftime <-- r.reftime 1632 Figure 11: Packet Processing 1634 +--------------------------+----------------------------------------+ 1635 | Packet Type | Description | 1636 +--------------------------+----------------------------------------+ 1637 | 1 duplicate packet | The packet is at best an old duplicate | 1638 | | or at worst a replay by a hacker. | 1639 | | This can happen in symmetric modes if | 1640 | | the poll intervals are uneven. | 1641 | 2 bogus packet | | 1642 | 3 invalid | One or more timestamp fields are | 1643 | | invalid. This normally happens in | 1644 | | symmetric modes when one peer sends | 1645 | | the first packet to the other and | 1646 | | before the other has received its | 1647 | | first reply. | 1648 | 4 access denied | The access controls have black | 1649 | 5 authentication failure | The cryptographic message digest does | 1650 | | not match the MAC. | 1651 | 6 unsynchronized | The server is not synchronized to a | 1652 | | valid source. | 1653 | 7 bad header data | One or more header fields are invalid. | 1654 | 8 autokey error | Public key cryptography has failed to | 1655 | | authenticate the packet. | 1656 | 9 crypto error | Mismatched or missing cryptographic | 1657 | | keys or certificates. | 1658 +--------------------------+----------------------------------------+ 1660 Table 15: Packet Error Checks 1662 The 8-bit p.reach shift register in the poll process described later 1663 is used to determine whether the server is reachable or not and 1664 provide information useful to insure the server is reachable and the 1665 data are fresh. The register is shifted left by one bit when a 1666 packet is sent and the rightmost bit is set to zero. As valid 1667 packets arrive, the rightmost bit is set to one. If the register 1668 contains any nonzero bits, the server is considered reachable; 1669 otherwise, it is unreachable. Since the peer poll interval might 1670 have changed since the last packet, the poll_update() routine in 1671 Appendix A.8.2 is called to re-determine the host poll interval. 1673 The on-wire protocol calculates the clock offset theta and roundtrip 1674 delay del from the four most recent timestamps as shown in Figure 7. 1675 While it is in principle possible to do all calculations except the 1676 first-order timestamp differences in fixed-point arithmetic, it is 1677 much easier to convert the first-order differences to floating 1678 doubles and do the remaining calculations in that arithmetic, and 1679 this will be assumed in the following description. The dispersion 1680 statistic epsilon(t) represents the maximum error due to the 1681 frequency tolerance and time since the last measurement. It is 1682 initialized 1684 epsilon(t_o) = rho_r + rho +capphi(T4-T1) 1686 when the measurement is made at t _0. Here rho_r is the peer 1687 precision in the packet header r.precision and rho the system 1688 precision s.precision, both expressed in seconds. These terms are 1689 necessary to account for the uncertainty in reading the system clock 1690 in both the server and the client. The dispersion then grows at 1691 constant rate TOLERANCE (cappsi); in other words, at time t, 1692 epsilon(t)=epsilon(t_0)+cappsi(t-t_0). With the default value 1693 cappsi=15 PPM, this amounts to about 1.3 s per day. With this 1694 understanding, the argument t will be dropped and the dispersion 1695 represented simply as epsilon. The remaining statistics are computed 1696 by the clock filter algorithm described in the next section. 1698 8.3. Clock Filter Algorithm 1699 ----------------------- 1700 | clock filter | 1701 ----------------------- 1702 \ | / 1703 ----------------------- 1704 | shift sample theta, | 1705 | del, epsilon, and t | 1706 | filter shift registr| 1707 ----------------------- 1708 \ | / 1709 ----------------------- 1710 | copy filter to a | 1711 | temporary list. sort| 1712 | list by increasing | 1713 | del. Let theta_i | 1714 | del_i, epsilon_i, | 1715 | t_i be the ith entry| 1716 | on the sorted list. | 1717 ----------------------- 1718 \ | / 1719 ----------------------- no 1720 | t_0 > t? |----\ 1721 ----------------------- | 1722 \ | / yes | 1723 ----------------------- | 1724 | theta = theta_0 | | 1725 | del = del_0 | | 1726 | epsilon | | 1727 | = sum(epsilon_i) | | 1728 | ---------- | | 1729 | 2^(i+1) | | 1730 | psi | | 1731 | = sqrt(1/7* ... | | 1732 | ... sum( ... | | 1733 | (theta_0-theta_i)^2 | | 1734 | t = t_0 | | 1735 ----------------------- | 1736 \ | / | 1737 ----------------------- | 1738 | clock_select() | | 1739 ----------------------- | 1740 \ | /<------------/ 1741 ----------------------- 1742 | return | 1743 ----------------------- 1745 Figure 12: Clock Filter Processing 1747 The clock filter algorithm grooms the stream of on-wire data to 1748 select the samples most likely to represent the correct time. The 1749 algorithm produces the p.offset theta, p.delay del, p.dispersion 1750 epsilon, p.jitter psi, and time of arrival p.t t used by the 1751 mitigation algorithms to determine the best and final offset used to 1752 discipline the system clock. They are also used to determine the 1753 server health and whether it is suitable for synchronization. The 1754 core processing steps of this algorithm are shown in Figure 12 with 1755 more detail in the clock_filter() routine in Appendix A.5.3. 1757 The clock filter algorithm saves the most recent sample tuples 1758 (theta, del, epsilon, t) in an 8-stage shift register in the order 1759 that packets arrive. Here t is the system timer, not the peer 1760 variable of the same name. The following scheme is used to insure 1761 sufficient samples are in the register and that old stale data are 1762 discarded. Initially, the tuples of all stages are set to the dummy 1763 tuple (0,MAXDISP, MAXDISP, 0). As valid packets arrive, the (theta, 1764 del, epsilon, t) tuples are shifted into the register causing old 1765 samples to be discarded, so eventually only valid samples remain. If 1766 the three low order bits of the reach register are zero, indicating 1767 three poll intervals have expired with no valid packets received, the 1768 poll process calls the clock filter algorithm with the dummy tuple 1769 just as if the tuple had arrived from the network. If this persists 1770 for eight poll intervals, the register returns to the initial 1771 condition. 1773 In the next step the shift register stages are copied to a temporary 1774 list and the list sorted by increasing del. Let j index the stages 1775 starting with the lowest del. If the sample epoch t_0 is not later 1776 than the last valid sample epoch p.t, the routine exits without 1777 affecting the current peer variables. Otherwise, let epsilon_j be 1778 the dispersion of the jth entry, then 1779 i=n-1 1780 --- epsilon_i 1781 capepsilon = \ ---------- 1782 / (i+1) 1783 --- 2 1784 i=0 1786 is the peer dispersion p.disp. Note the overload of epsilon, whether 1787 input to the clock filter or output, the meaning should be clear from 1788 context. 1790 The observer should note (a) if all stages contain the dummy tuple 1791 with dispersion MAXDISP, the computed dispersion is a little less 1792 than 16 s, (b) each time a valid tuple is shifted into the register, 1793 the dispersion drops by a little less than half, depending on the 1794 valid tuples dispersion, (c) after the fourth valid packet the 1795 dispersion is usually a little less than 1 s, which is the assumed 1796 value of the MAXDIST parameter used by the selection algorithm to 1797 determine whether the peer variables are acceptable or not. 1799 Let the first stage offset in the sorted list be theta_0; then, for 1800 the other stages in any order, the jitter is the RMS average 1801 +----- -----+ 1802 | 1/2 | 1803 | +----- -----+ | 1804 | | n-1 | | 1805 | | --- | | 1806 | 1 | \ 2 | | 1807 psi = | -------- * | / (theta_0-theta_j) | | 1808 | (n-1) | --- | | 1809 | | j=1 | | 1810 | +----- -----+ | 1811 | | 1812 +----- -----+ 1814 where n is the number of valid tuples in the register. In order to 1815 insure consistency and avoid divide exceptions in other computations, 1816 the psi is bounded from below by the system precision rho expressed 1817 in seconds. While not in general considered a major factor in 1818 ranking server quality, jitter is a valuable indicator of fundamental 1819 timekeeping performance and network congestion state. 1821 Of particular importance to the mitigation algorithms is the peer 1822 synchronization distance, which is computed from the root delay and 1823 root dispersion. The root delay is 1825 del ' = delta_r + del 1827 and the root dispersion is 1829 epsilon ' = capepsilon_r + epsilon + psi 1831 Note that epsilon and therefore increase at rate capphi. The peer 1832 synchronization distance is defined 1834 lambda = (del ' / 2) + epsilon 1836 and recalculated as necessary. The lambda is a component of the root 1837 synchronization distance caplambda used by the mitigation algorithms 1838 as a metric to evaluate the quality of time available from each 1839 server. Note that there is no state variable for lambda, as it 1840 depends on the time since the last update. 1842 9. System Process 1844 As each new sample (theta, delta, epsilon, t) is produced by the 1845 clock filter algorithm, the sample is processed by the mitigation 1846 algorithms consisting of the selection, clustering, combining and 1847 clock discipline algorithms in the system process. The selection 1848 algorithm scans all associations and casts off the falsetickers, 1849 which have demonstrably incorrect time, leaving the truechimers as 1850 result. In a series of rounds the clustering algorithm discards the 1851 association statistically furthest from the centroid until a minimum 1852 number of survivors remain. The combining algorithm produces the 1853 best and final offset on a weighted average basis and selects one of 1854 the associations as the system peer providing the best statistics for 1855 performance evaluation. The final offset is passed to the clock 1856 discipline algorithm to steer the system clock to the correct time. 1857 The statistics (theta, delta, epsilon, t) associated with the system 1858 peer are used to construct the system variables inherited by 1859 dependent servers and clients and made available to other 1860 applications running on the same machine. 1862 The discussion in following sections covers the basic variables and 1863 routines necessary for a conforming NTPv4 implementation. Additional 1864 implementation details are in Appendix A. An interface that might be 1865 considered in a formal specification is represented by the function 1866 prototypes in Appendix A.1.6. 1868 9.1. System Process Variables 1870 The variables and parameters associated with the system process are 1871 summarized in Table 16, which gives the variable name, formula name 1872 and short description. Unless noted otherwise, all variables have 1873 assumed prefix s. 1875 +-----------+------------+---------------------+ 1876 | Name | Formula | Description | 1877 +-----------+------------+---------------------+ 1878 | t | t | epoch | 1879 | leap | leap | leap indicator | 1880 | stratum | stratum | stratum | 1881 | precision | rho | precision | 1882 | p | p | system peer pointer | 1883 | offset | captheta | combined offset | 1884 | jitter | varsigma | combined jitter | 1885 | rootdelay | capdelta | root delay | 1886 | rootdisp | capepsilon | root dispersion | 1887 | refid | refid | reference ID | 1888 | reftime | reftime | reference time | 1889 | NMIN | 3 | minimum survivors | 1890 | CMIN | 1 | minimum candidates | 1891 +-----------+------------+---------------------+ 1893 Table 16: System Process Variables and Parameters 1895 All the variables except s.t and s.p have the same format and 1896 interpretation as the peer variables of the same name. The remaining 1897 variables are defined below. 1899 s.t: Integer representing the value of the system timer at the last 1900 update. 1902 s.p: System peer association pointer. 1904 s.precision: 8-bit signed integer representing the precision of the 1905 system clock, in log2 seconds. 1907 s.offset: Offset computed by the combining algorithm. 1909 s.jitter: Jitter computed by the cluster and combining algorithms. 1911 The variables defined below are updated from the system peer process 1912 as described later. They are interpreted in the same way as the as 1913 the peer variables of the same names. 1915 s.leap, s.stratum, s.rootdelay, s.rootdisp, s.refid, s.reftime 1917 Initially, all variables are cleared to zero, then the s.leap is set 1918 to 3 (unsynchronized) and s.stratum is set to MAXSTRAT (16). The 1919 remaining statistics are determined as described below. 1921 9.2. System Process Operations 1923 The system process implements the selection, clustering, combining 1924 and clock discipline algorithms. The clock_select() routine in 1925 Figure 15 includes the selection algorithm of Section 9.2.1 that 1926 produces a majority clique of truechimers based on agreement 1927 principles. The clustering algorithm of Section 9.2.2 discards the 1928 outliers of the clique to produce the survivors used by the combining 1929 algorithm in Section 9.2.3 , which in turn provides the final offset 1930 for the clock discipline algorithm in Section 9.2.4. If the 1931 selection algorithm cannot produce a majority clique, or if the 1932 clustering algorithm cannot produce at least CMIN survivors, the 1933 system process terminates with no further processing. If successful, 1934 the clustering algorithm selects the statistically best candidate as 1935 the system peer and its variables are inherited as the system 1936 variables. The selection and clustering algorithms are described 1937 below separately, but combined in the code skeleton. 1939 ------------------------- 1940 | clock_select() | 1941 ------------------------- 1942 \|/ 1943 -----------------------------------|--------------- 1944 | ----------- ---------------------- | 1945 | /---| accept? | | scan candidates | | 1946 | | ----------- | | | 1947 | | yes no| | | | 1948 | ----------- | | | | 1949 | | add peer| | | | | 1950 | ----------- | | | | 1951 | | \|/ | | | 1952 | \-------->----->| | | 1953 | | | | 1954 | selection algorithm ---------------------- | 1955 | \|/ | 1956 ------------------------------------|-------------- 1957 no ----------------------- 1958 /--------------| survivors? | 1959 | ----------------------- 1960 | \|/ yes 1961 | ----------------------- 1962 | | clustering algorithm| 1963 | ----------------------- 1964 | \|/ 1965 | ----------------------- 1966 |<---------yes-| n < CMIN? | 1967 \|/ ----------------------- 1968 ------------------------- \|/ no 1969 | s.p = NULL | ----------------------- 1970 ------------------------- | s.p = vo.p | 1971 \|/ ----------------------- 1972 ------------------------- \|/ 1973 | return (UNSYNC) | ----------------------- 1974 ------------------------- | return (SYNC) | 1975 ----------------------- 1977 Figure 15: clock_select() routine 1979 9.2.1. Selection Algorithm 1981 The selection algorithm operates to find the truechimers using 1982 Byzantine agreement principles originally proposed by Marzullo [7], 1983 but modified to improve accuracy. An overview of the algorithm is 1984 listed below and the first half of the clock_select() routine in 1985 Appendix A.6.1. First, those servers which are unusable according to 1986 the rules of the protocol are detected and discarded by the accept() 1987 routine in Figure 16 and Appendix A.6.3. Next, a set of tuples {p, 1988 type, edge} is generated for the remaining servers, where p is an 1989 association pointer, type and edge identifies the upper (+1), middle 1990 (0) and lower (-1) endpoint of a correctness interval [theta- 1991 lambda,theta+lambda], where lambda is the root distance. 1993 1. For each of m associations, construct a correctness interval 1994 [(theta-rootdist()),(theta+rootdist())]. 1996 2. Select the lowpoint, midpoint and highpoint of these intervals. 1997 Sort these values in a list from lowest to highest. Set the 1998 number of falsetickers f=0. 2000 3. Set the number of midpoints d=0. Set c=0. Scan from lowest 2001 endpoint to highest. Add one to c for every lowpoint, subtract 2002 one for every highpoint, add one to d for every midpoint. If 2003 c>=m-f, stop; set l=current lowpoint 2005 4. Set c=0. Scan from highest endpoint to lowest. Add one to c for 2006 every highpoint, subtract one for every lowpoint, add one to d 2007 for every midpoint. If c>=m-f, stop; set u=current highpoint. 2009 5. Is d=f and l= |--any yes---\ server not 2039 | MAXSTRAT? | | synchronized 2040 -------------------- | 2041 \|/ all no | 2042 -------------------- | 2043 | reach = 0? |---yes----->| server not 2044 -------------------- | reachable 2045 \|/ no | 2046 -------------------- | 2047 | root_dist() >= | | 2048 | MAXDIST? |---yes----->| root distance 2049 -------------------- | exceeded 2050 \|/ no | 2051 -------------------- | 2052 | refid = addr? |---yes----->| server/client 2053 -------------------- | sync loop 2054 \|/ no | 2055 -------------------- | 2056 | return (YES) | ----------------------- 2057 -------------------- | return (NO) | 2058 ----------------------- 2060 Figure 16: accept() routine 2062 9.2.2. Clustering Algorithm 2064 The members of the majority clique are placed on the survivor list, 2065 and sorted first by stratum, then by root distance lambda. The 2066 sorted list is processed by the clustering algorithm below and the 2067 second half of the clock_select() algorithm in Appendix A.6.1. 2069 1. Let (theta, phi, Lambda) represent a candidate peer with 2070 offset theta, jitter psi and a weight factor 2071 Lambda=stratum*MAXDIST+rootdist(). 2073 2. Sort the candidates by increasing Lambda. Let n be the number 2074 of candidates and NMIN the minimum number of survivors. 2076 3. For each candidate compute the selection jitter psi_S (RMS 2077 peer offset differences between this and all other candidates). 2079 4. Select psi_max as the candidate with maximum psi_S. 2081 5. Select psi_min as the candidate with minimum psi_S. 2083 6. Is psi_max < psi_min or n <= NMIN? If yes, go to step 6y. If 2084 no, go to step 6n. 2086 6y. Done. The remaining cluster survivors are correct. The 2087 survivors are in the v. structure sorted by Lambda. 2089 6n. Delete the outlyer candidate with psi_max; reduce n by one, 2090 and go back to step 3. 2092 It operates in a series of rounds where each round discards the 2093 furthest statistical outlier until a specified minimum number of 2094 survivors NMIN (3) are left or until no further improvement is 2095 possible. In each round let n be the number of survivors and s index 2096 the survivor list. Assume psi_p is the peer jitter of the s 2097 survivor. Compute 2098 +----- -----+ 2099 | 1/2 | 2100 | +----- -----+ | 2101 | | n-1 | | 2102 | | --- | | 2103 | 1 | \ 2 | | 2104 psi_s = | -------- * | / (theta_s-theta_j) | | 2105 | (n-1) | --- | | 2106 | | j=1 | | 2107 | +----- -----+ | 2108 | | 2109 +----- -----+ 2111 as the selection jitter. Then choose psi_max=max(psi) and 2112 psi_min=min(psi). If psi_max| x = rootdist() | 2130 | | ------------------ 2131 | | \|/ 2132 | | ------------------ 2133 | |<--| y+= 1/x | 2134 | | | z+=theta_i/x | 2135 | | | w+=(theta_i - | 2136 | | | theta_o)^2 | 2137 --------------------- ------------------ 2138 \|/ done 2139 ----------------------- 2140 | captheta = z/y | 2141 | vartheta = sqrt(w/y)| 2142 ----------------------- 2143 \|/ 2144 ----------------------- 2145 | return | 2146 ----------------------- 2148 Variable/Process/Description 2149 captheta/system/combined clock offset 2150 vartheta_p/system/combined jitter 2151 theta_0/survivor list/first survivor offset 2152 theta_i/survivor list/ith survivor offset 2153 x,y,z,w/ /temporaries 2155 Figure 18: clock_combine() routine 2156 -------------------- 2157 | clock_update() | 2158 -------------------- 2159 \|/ 2160 -------------------- 2161 /----no----->| p.t > s.t | 2162 | -------------------- 2163 | \|/ yes 2164 | -------------------- 2165 | | s.t = p.t | 2166 | -------------------- 2167 | \|/ 2168 | -------------------- 2169 | | local_clock() | 2170 | -------------------- 2171 | \|/ 2172 |<--------------------+-----------------\ 2173 | panic\|/ | adj step\|/ 2174 | ------------- | ------------------- 2175 | | panic exit| | | clear all assoc.| 2176 | ------------- | ------------------- 2177 | ----------------- \|/ 2178 | |*update system | ----------------- 2179 | | variables | | leap = 3 | 2180 | ----------------- | quamtum = | 2181 | \|/ | MAXSTRAT | 2182 | | ----------------- 2183 \---------------------+----------------/ 2184 | 2185 --------------- 2186 | return | 2187 --------------- 2189 System Variables <-- System Peer Variables 2190 leap <-- leap 2191 stratum <-- stratum + 1 2192 refid <-- refid 2193 reftime <-- reftime 2194 capdelta <-- capdelta_r + del 2195 capepsilon <-- capepsilon_r+epsilon+cappsi*mu+psi+|captheta| 2196 * update system variables 2198 Figure 19: clock_update() routine 2200 The remaining survivors are processed by the clock_combine() routine 2201 in Figure 18 and Appendix A.6.5 to produce the best and final data 2202 for the clock discipline algorithm. The routine processes the peer 2203 offset theta and jitter psi to produce the system offset captheta and 2204 system peer jitter vartheta_p, where each server statistic is 2205 weighted by the reciprocal of the root distance and the result 2206 normalized. The system peer jitter vartheta_p is a component of the 2207 system jitter described later. 2209 The system statistics are passed to the clock_update() routine in 2210 Figure 19 and Appendix A.6.4. If there is only one survivor, the 2211 offset passed to the clock discipline algorithm is captheta=theta and 2212 the system peer jitter is vartheta=psi. Otherwise, the selection 2213 jitter vartheta_s is computed as in (8), where theta_0 represents the 2214 offset of the system peer and j ranges over the survivors. 2215 Peer Variables Client System Variables 2216 ---------------- ----------------- 2217 | theta = 1/2* |-------------------->| captheta = | 2218 | [(T2 - T1)+ | | (combine | 2219 | (T3 - T4)] | | (theta_j)) | 2220 ---------------- ----------------- 2221 | del = [(T4 - |--sum--------------->| capdelta= | 2222 | T1) - (T3 - | /|\ | capdelta_r + | 2223 | T2)] | | | del | 2224 ---------------- | ----------------- 2225 | epsilon = | | | capepsilon = | 2226 | | | |capepsilon_r + | 2227 | rho_r + rho +| | | epsilon + | 2228 | captheta*( | | | vartheta + | 2229 | T4 - T1) |------------sum----->| absolutevalue(| 2230 ---------------- | /|\ | theta) | 2231 | psi = | | | ----------------- 2232 | sqrt((1/n)-1)*| | | | psi_s = | 2233 | (sum(theta_0)| | | | sqrt(1/(m-1)* | 2234 | -theta_i)^2))|---|---\ | | sum(theta_0- | 2235 ---------------- | | | | theta_j)^2) | 2236 /|\ | | | ----------------- 2237 | | | | \|/ 2238 | | \------------------>sum 2239 server| | | | 2240 ---------------- | | \|/ 2241 | rho_r | | | | 2242 ---------------- | | ----------------- 2243 | capdelta_r |>--/ | | vartheta = | 2244 ---------------- | | sqrt( | 2245 | capepsilon_r |>------------/ | (vartheta_p)^2| 2246 ---------------- | + | 2247 | (vartheta_s)^2| 2248 ----------------- 2250 Figure 20: System Variables Processing 2252 The first survivor on the survivor list is selected as the system 2253 peer, here represented by the statistics (theta, del, epsilon, psi). 2254 By rule, an update is discarded if its time of arrival p.t is not 2255 strictly later than the last update used s.t. Let mu=p.t-s.t be the 2256 time since the last update or update interval. If the update 2257 interval is less than or equal to zero, the update is discarded. 2258 Otherwise, the system variables are updated from the system peer 2259 variables as shown in Figure 19. Note that s.stratum is set to 2260 p.stratum plus one. 2262 The arrows labeled IGNOR, PANIC, ADJ and STEP refer to return codes 2263 from the local_clock() routine described in the next section. IGNORE 2264 means the update has been ignored as an outlier. PANIC means the 2265 offset is greater than the panic threshold PANICT (1000 s) and SHOULD 2266 cause the program to exit with a diagnostic message to the system 2267 log. STEP means the offset is less than the panic threshold, but 2268 greater than the step threshold STEPT (125 ms). Since this means all 2269 peer data have been invalidated, all associations SHOULD be reset and 2270 the client begins as at initial start. ADJ means the offset is less 2271 than the step threshold and thus a valid update for the local_clock() 2272 routine described later. In this case the system variables are 2273 updated as shown in Figure 19. 2275 There is one exception not shown. The dispersion increment is 2276 bounded from below by MINDISP. In subnets with very fast processors 2277 and networks and very small dispersion and delay this forces a 2278 monotone-definite increase in capepsilon, which avoids loops between 2279 peers operating at the same stratum. 2281 Figure 20 shows how the error budget grows from the packet variables, 2282 on-wire protocol and system peer process to produce the system 2283 variables that are passed to dependent applications and clients. The 2284 system jitter is defined 2286 vartheta = sqrt((vartheta_p)^2+(vartheta_s)^2) 2288 where vartheta_s is the selection jitter relative to the system peer. 2289 The system jitter is passed to dependent applications programs as the 2290 nominal error statistic. The root delay capdelta and root dispersion 2291 capepsilon statistics are relative to the primary server reference 2292 clock and thus inherited by each server along the path. The system 2293 synchronization distance is defined 2295 caplambda = capdelta/2 + capepsilon 2297 which is passed to dependent application programs as the maximum 2298 error statistic. 2300 9.2.4. Clock Discipline Algorithm 2301 --------- 2302 theta_r + | \ +----------------+ 2303 NTP --------->| Phase \ V_d | | V_s 2304 theta_c - | Detector ------>| Clock Filter |-----+ 2305 +-------->| / | | | 2306 | | / +----------------+ | 2307 | --------- | 2308 | | 2309 ----- | 2310 / \ | 2311 | VFO | | 2312 \ / | 2313 ----- +-------------------------------------+ | 2314 ^ | Loop Filter | | 2315 | | | | 2316 | | +---------+ x +-------------+ | | 2317 | V_c | | |<-----| | | | 2318 +------|-| Clock | y | Phase/Freq |<---|------+ 2319 | | Adjust |<-----| Prediction | | 2320 | | | | | | 2321 | +---------+ +-------------+ | 2322 | | 2323 +-------------------------------------+ 2325 Figure 21: Clock Discipline Feedback Loop 2327 The NTPv4 clock discipline algorithm, shortened to discipline in the 2328 following, functions as a combination of two philosophically quite 2329 different feedback control systems. In a phase-locked loop (PLL) 2330 design, periodic phase updates at update intervals m are used 2331 directly to minimize the time error and indirectly the frequency 2332 error. In a frequency-locked loop (FLL) design, periodic frequency 2333 updates at intervals mu are used directly to minimize the frequency 2334 error and indirectly the time error. As shown in [8], a PLL usually 2335 works better when network jitter dominates, while a FLL works better 2336 when oscillator wander dominates. This section contains an outline 2337 of how the NTPv4 design works. An in-depth discussion of the design 2338 principles is provided in [8], which also includes a performance 2339 analysis. 2341 The clock discipline and clock adjust processes interact with the 2342 other algorithms in NTPv4. The output of the combining algorithm 2343 represents the best estimate of the system clock offset relative to 2344 the server ensemble. The discipline adjusts the frequency of the VFO 2345 to minimize this offset. Finally, the timestamps of each server are 2346 compared to the timestamps derived from the VFO in order to calculate 2347 the server offsets and close the feedback loop. 2349 The discipline is implemented as the feedback control system shown in 2350 Figure 21. The variable theta_r represents the combining algorithm 2351 offset (reference phase) and theta_c the VFO offset (control phase). 2352 Each update produces a signal V_d representing the instantaneous 2353 phase difference theta_r - theta_c. The clock filter for each server 2354 functions as a tapped delay line, with the output taken at the tap 2355 selected by the clock filter algorithm. The selection, clustering 2356 and combining algorithms combine the data from multiple filters to 2357 produce the signal V_s. The loop filter, with impulse response F(t), 2358 produces the signal V_c which controls the VFO frequency omega_c and 2359 thus its phase theta_c=integral(omega_c,dt) which closes the loop. 2360 The V_c signal is generated by the clock adjust process in 2361 Section 9.3. The characteristic behavior of this model, which is 2362 determined by F(t) and the various gain factors given in 2363 Appendix A.6.6. 2365 The transient behavior of the PLL/FLL feedback loop is determined by 2366 the impulse response of the loop filter F(t). The loop filter shown 2367 in Figure 22 predicts a phase adjustment x as a function of Vs. The 2368 PLL predicts a frequency adjustment yFLL as an integral of Vs*mu with 2369 repsect to t, while the FLL predicts an adjustment yPLL as a function 2370 of Vs /mu. The two adjustments are combined to correct the frequency 2371 y as shown in Figure 22. The x and y are then used by the 2372 clock_adjust() routine to control the VFO frequency. The detailed 2373 equations that implement these functions are best presented in the 2374 routines of Appendix A.6.6 and Appendix A.7.1. 2375 x <------(Phase Correction)<--. 2376 | 2377 y_FLL | 2378 .-(FLL Predict)<-------+<--V_s 2379 | | 2380 \|/ | 2381 y <--(Sum) | 2382 ^ | 2383 | | 2384 '-(PLL Predict)<-------' 2385 y_PLL 2387 Figure 22: Clock Discipline Loop Filter 2389 Ordinarily, the pseudo-linear feedback loop described above operates 2390 to discipline the system clock. However, there are cases where a 2391 nonlinear algorithm offers considerable improvement. One case is 2392 when the discipline starts without knowledge of the intrinsic clock 2393 frequency. The pseudo-linear loop takes several hours to develop an 2394 accurate measurement and during most of that time the poll interval 2395 cannot be increased. The nonlinear loop described below does this in 2396 15 minutes. Another case is when occasional bursts of large jitter 2397 are present due to congested network links. The state machine 2398 described below resists error bursts lasting less than 15 minutes. 2400 The remainder of this section describes how the discipline works. 2401 Table 17 contains a summary of the variables and parameters including 2402 the program name, formula name and short description. Unless noted 2403 otherwisse, all variables have assumed prefix c. The variables c.t, 2404 c.tc, c.state, and c.count are integers; the memainder are floating 2405 doubles. The function of each will be explained in the algorithm 2406 descriptions below. 2408 +--------+------------+-------------------------+ 2409 | Name | Formula | Description | 2410 +--------+------------+-------------------------+ 2411 | t | timer | seconds counter | 2412 | offset | captheta | combined offset | 2413 | resid | captheta_r | residual offset | 2414 | freq | phi | clock frequency | 2415 | jitter | psi | clock jitter | 2416 | wander | cappsi | frequency wander | 2417 | tc | tau | time constant(log2) | 2418 | state | state | state | 2419 | adj | adj | frequency adjustment | 2420 | count | count | hysteresis counter | 2421 | STEPT | 125 | step threshold (.125 s) | 2422 | WATCH | 900 | stepout thresh(s) | 2423 | PANICT | 1000 | panic threshold(1000 s) | 2424 | LIMIT | 30 | hysteresis limit | 2425 | PGATE | 4 | hysteresis gate | 2426 | TC | 16 | time constant scale | 2427 | AVG | 8 | averaging constant | 2428 +--------+------------+-------------------------+ 2430 Table 17: Clock Discipline Variables And Parameters 2432 ===================================================================== 2433 | State | captheta < STEP | captheta > STEP | Comments | 2434 --------------------------------------------------------------------- 2435 | NSET | > FREQ; adjust | > FREQ; step | no frequency | 2436 | | time | time | file | 2437 --------------------------------------------------------------------- 2438 | FSET | > SYNC; adjust | > SYNC; step | frequency file | 2439 | | time | time | | 2440 --------------------------------------------------------------------- 2441 | SPIK | > SYNC; adjust | if (<900 s)>SPIK | outlier detected | 2442 | | freq, adjust time | else SYNC; step | | 2443 | | | freq; step time | | 2444 --------------------------------------------------------------------- 2445 | FREQ | if (<900 s)> FREQ | if (<900 s)>FREQ | initial frequency | 2446 | | else >SYNC; step | else >SYNC; step | | 2447 | | freq, adjust time | freq, adjust time | | 2448 --------------------------------------------------------------------- 2449 | SYNC | >SYNC; adjust freq| if (<900 s)>SPIK | normal operation | 2450 | | adjust time | else >SYNC; step | | 2451 | | | freq; step time | | 2452 --------------------------------------------------------------------- 2454 Figure 23 2456 The discipline is implemented by the local_clock() routine, which is 2457 called from the clock_update() routine. The local_clock() routine 2458 pseudo code in Appendix A.6.6 has two parts; first the state machine 2459 shown in Figure 24 and second the algorithm that determines the time 2460 constant and thus the poll interval in Figure 25. The state 2461 transition function in Figure 24 is implemented by the rst() function 2462 shown at the lower left of the figure. The local_clock() routine 2463 exits immediately if the offset is greater than the panic threshold. 2464 --- 2465 | A | 2466 --- 2467 || 2468 \/ 2469 --- yes --- 2470 | B |-->| C | 2471 --- --- 2472 no || 2473 \/ 2474 --- 2475 | D | 2476 --- 2477 || 2478 \/ 2479 --- no --- yes SYNC SPIK FREQ 2481 | E |<--| F |---------------------------------- 2482 --- --- || || 2483 SYNC || \/ \/ 2484 SPIKE FSET \/ FREQ NSET --- --- 2485 ------------------------- | G | | H | 2486 || || || || --- --- 2487 || || \/ \/ || yes || || no 2488 || || --- --- || || \/ 2489 || --- | H | | I | || || --- 2490 \/ | I | --- --- || || | J | 2491 --- --- no || ||yes || || || --- 2492 | K | || || || \/ || || || || yes 2493 --- || \/ || --- || || || \/ 2494 || || --- || | L | || || || --- 2495 || || | M ||| --- || || || | M | 2496 || || --- || || || || || --- 2497 || || || \/ \/ \/ \/ || || 2498 || || || ------------>\/<----------- \/ \/ 2499 || || || --- --->\/<----- 2500 || || || | N | --- 2501 || || || --- | O | 2502 || || || --- 2503 || || || || 2504 || || || \/ 2505 || || || --- --- --- 2506 ----->-------->----| P |----><--------| Q |<------| R | 2507 --- || --- --- 2508 --- \/ || 2509 | S | --- \/ 2510 --- | T | --- 2511 || --- | U | 2512 \/ --- 2513 --- || 2514 | V | \/ 2515 --- --- 2516 || | W | 2517 \/ --- 2518 --- 2519 | X | 2520 --- 2522 A: local_clock() 2523 B: |captheta|>PANICT? 2524 C: return(PANIC) 2525 D: freq=0 2526 rval=IGNOR 2527 E: 2528 F: |captheta|>STEPT? 2529 G: state=SPIK 2530 H: mu<-no y| | 2581 ---- | ---- | 2582 | L| | | M| | 2583 -------><---------/ 2584 \|/ 2585 ----- 2586 | N | 2587 ----- 2588 \|/ 2589 ----- 2590 | O | 2591 ----- 2592 \|/ 2593 ----- 2594 | P | 2595 ----- 2597 A: tc 2598 B: state=SYNC 2599 C: |captheta_g| > PGATE? 2600 D: count -= 2*tau 2601 E: count += tau 2602 F: count <= -LIMIT? 2603 G: count >= LIMIT? 2604 H: count = 0 2605 I: count = 0 2606 J: tau>MINPOLL 2607 K: tau 2762 0), nothing further is done except call the poll_update() routine to 2763 set the next poll interval. 2765 If not in a burst, the p.reach variable is shifted left by one bit, 2766 with zero replacing the rightmost bit. If the server has not been 2767 heard for the last three poll intervals, the clock_filter() routine 2768 is called to increase the dispersion as described in Section 8.3. If 2769 the BURST flag is lit and the server is reachable and a valid source 2770 of synchronization is available, the client sends a burst of BCOUNT 2771 (8) packets at each poll interval. This is useful to accurately 2772 measure jitter with long poll intervals. If the IBURST flag is lit 2773 and this is the first packet sent when the server becomes 2774 unreachable, the client sends a burst. This is useful to quickly 2775 reduce the synchronization distance below the distance threshold and 2776 synchronize the clock. The figure also shows the mechanism which 2777 backs off the poll interval if the server becomes unreachable. If 2778 p.reach is nonzero, the server is reachable and p.unreach is set to 2779 zero; otherwise, p.unreach is incremented by one for each poll to the 2780 maximum UNREACH (24). Thereafter for each poll p.hpoll is increased 2781 by one, which doubles the poll interval up to the maximum MAXPOLL 2782 determined by the poll_update() routine. When the server again 2783 becomes reachable, p.unreach is set to zero, p.hpoll is reset to tau 2784 and operation resumes normally. 2786 When a packet is sent from an association, some header values are 2787 copied from the peer variables left by a previous packet and others 2788 from the system variables. includes a flow diagram and a table 2789 showing which values are copied to each header field. In those 2790 implementations using floating double data types for root delay and 2791 root dispersion, these must be converted to NTP short format. All 2792 other fields are either copied intact from peer and system variables 2793 or struck as a timestamp from the system clock. 2795 The poll_update() routine shown in Appendix A.8.2 is called when a 2796 valid packet is received and immediately after a poll message is 2797 sent. If in a burst, the poll interval is fixed at 2 s; otherwise, 2798 the host poll exponent is set to the minimum of p.poll from the last 2799 packet received and p.hpoll from the poll() routine, but not less 2800 than MINPOLL nor greater than MAXPOLL. Thus the clock discipline can 2801 be oversampled, but not undersampled. This is necessary to preserve 2802 subnet dynamic behavior and protect against protocol errors. 2803 Finally, the poll exponent is converted to an interval which 2804 establishes the time at the next poll p.next. 2806 11. Security Considerations 2808 NTPv4 provides an optional authentication field that utilizes the MD5 2809 algorithm. MD5, as the case for SHA-1, is derived from MD4, which 2810 has long been known to be weak. In 2004, techniques for efficiently 2811 finding collisions in MD5 were announced. A summary of the weakness 2812 of MD5 can be found in [9]. 2814 In the case of NTP as specified herein, NTP broadcast clients are 2815 vulnerable to disruption by misbehaving or hostile SNTP or NTP 2816 broadcast servers elsewhere in the Internet. Access controls and/or 2817 cryptographic authentication means should be provided for additional 2818 security in such cases. 2820 12. IANA Considerations 2822 UDP/TCP Port 123 was previously assigned by IANA for this protocol. 2823 The IANA has assigned the IPv4 multicast group address 224.0.1.1 and 2824 the IPv6 multicast address ending :101 for NTP. This document 2825 introduces NTP extension fields allowing for the development of 2826 future extensions to the protocol, where a particular extension is to 2827 be identified by the Field Type sub-field within the extension field. 2828 IANA is requested to establish and maintain a registry for Extension 2829 Field Types associated with this protocol, populating this registry 2830 with no initial entries. As future needs arise, new Extension Field 2831 Types may be defined. Following the policies outlined in [10], new 2832 values are to be defined by IETF Consensus. 2834 13. Acknowledgements 2836 This authors would like to thank Karen O'Donoghue, Brian Haberman, 2837 Greg Dowd, Mark Elliot, and Harlan Stenn for technical reviews of 2838 this document. 2840 14. Informative References 2842 [1] Mills, D., "Network Time Protocol (Version 3) Specification, 2843 Implementation", RFC 1305, March 1992. 2845 [2] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 for 2846 IPv4, IPv6 and OSI", RFC 4330, January 2006. 2848 [3] University of Delaware, "The Autokey security architecture, 2849 protocol and algorithms. Electrical and Computer Engineering 2850 Technical Report 06-1-1", NDSS , January 2006. 2852 [4] Bradner, S., "Key words for use in RFCs to Indicate Requirement 2853 Levels", BCP 14, RFC 2119, March 1997. 2855 [5] Postel, J., "Internet Protocol", STD 5, RFC 791, 2856 September 1981. 2858 [6] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, 2859 April 1992. 2861 [7] Marzullo and S. Owicki, "Maintaining the time in a distributed 2862 system.", ACM Operating Systems Review 19 , July 1985. 2864 [8] Mills, D. L., "Computer Network Time Synchronization - the 2865 Network Time Protocol. CRC Press, 304pp.", 2006. 2867 [9] Bellovin, S. and E. Rescorla, Proceedings of the 13th annual 2868 ISOC Network and Distributed System Security Symposium, 2869 "Deploying a new Hash Algorithm", February 2006. 2871 [10] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA 2872 Considerations Section in RFCs", BCP 26, RFC 2434, 2873 October 1998. 2875 Appendix A. Code Skeleton 2877 This appendix is intended to describe the protocol and algorithms of 2878 an implementation in a general way using what is called a code 2879 skeleton program. This consists of a set of definitions, structures 2880 and code segments which illustrate the protocol operations without 2881 the complexities of an actual implementation of the protocol. This 2882 program is not an executable and is not designed to run in the 2883 ordinary sense. It is designed to be compiled only in order to 2884 verify consistent variable and type usage. The program is not 2885 intended to be fast or compact, just to demonstrate the algorithms 2886 with sufficient fidelity to understand how they work. The code 2887 skeleton consists of eight segments, a header segment included by 2888 each of the other segments, plus a code segment for the main program, 2889 kernel I/O and system clock interfaces, and peer, system, 2890 clock_adjust and poll processes. These are presented in order below 2891 along with definitions and variables specific to each process. 2893 A.1. Global Definitions 2895 Following are definitions and other data shared by all programs. 2896 These values are defined in a header file ntp4.h which is included in 2897 all files. 2899 A.1.1. Definitions, Constants, Parameters 2900 #include s/* avoids complaints about sqrt() */ 2901 #include /* for gettimeofday() and friends */ 2902 #include /* for malloc() and friends */ 2904 /* 2905 * Data types 2906 * 2907 * This program assumes the int data type is 32 bits and the long data 2908 * type is 64 bits. The native data type used in most calculations is 2909 * floating double. The data types used in some packet header fields 2910 * require conversion to and from this representation. Some header 2911 * fields involve partitioning an octet, here represented by individual 2912 * octets. 2913 * 2914 * The 64-bit NTP timestamp format used in timestamp calculations is 2915 * unsigned seconds and fraction with the decimal point to the left of 2916 * bit 32. The only operation permitted with these values is 2917 * subtraction, yielding a signed 31-bit difference. The 32-bit NTP 2918 * short format used in delay and dispersion calculations is seconds and 2919 * fraction with the decimal point to the left of bit 16. The only 2920 * operations permitted with these values are addition and 2921 * multiplication by a constant. 2922 * 2923 * The IPv4 address is 32 bits, while the IPv6 address is 128 bits. The 2924 * message digest field is 128 bits as constructed by the MD5 algorithm. 2925 * The precision and poll interval fields are signed log2 seconds. 2926 */ 2927 typedef unsigned long tstamp; 2928 typedef unsigned int tdist; 2929 typedef unsigned long ipaddr; 2930 typedef unsinged int ipport; 2931 typedef unsigned long digest; 2932 typedef signed char s_char; 2934 /* 2935 * Arithmetic conversion macroni 2936 */ 2938 /* NTP timestamp format */ 2939 /* NTP short format */ 2940 /* IPv4 or IPv6 address */ 2941 /* IP port number */ 2942 /* md5 digest */ 2943 /* precision and poll interval (log2) */ 2945 #define LOG2D(a) ((a) < 0 ? 1. / (1L << -(a)) : \ 2947 1L << (a)) /* poll, etc. */ 2949 #define LFP2D(a) ((double)(a) / 0x100000000L) /* NTP timestamp */ 2951 #define D2LFP(a) ((tstamp)((a) * 0x100000000L)) 2953 #define FP2D(a) (double)(a) / 0x10000L) /* NTP short */ 2954 #define D2FP(a) ((tdist)((a) * 0x10000L)) 2955 #define SQUARE(x) (x * x) 2956 #define SQRT(x) (sqrt(x)) 2958 /* 2959 * Global constants. Some of these might be converted to variables 2960 * which can be tinkered by configuration or computed on-fly. For 2961 * instance, PRECISION could be calculated on-fly and 2962 * provide performance tuning for the defines marked with % below. 2963 */ 2964 #define VERSION 4 /* version number */ 2965 #define PORT 123 /* NTP poert number */ 2966 #define MINDISP .01 /* % minimum dispersion (s) */ 2967 #define MAXDISP 16 /* % maximum dispersion (s) */ 2968 #define MAXDIST 1 /* % distance threshold (s) */ 2969 #define NOSYNC 3 /* leap unsync */ 2970 #define MAXSTRAT 16 /* maximum stratum (infinity metric) */ 2971 #define MINPOLL 4 /* % minimum poll interval (16 s)*/ 2972 #define MAXPOLL 17 /* % maximum poll interval (36.4 h) */ 2973 #define PHI 15e-6 /* % frequency tolerance (15 PPM) */ 2974 #define NSTAGE 8 /* clock register stages */ 2975 #define NMAX 50 /* % maximum number of peers */ 2976 #define NSANE 1 /* % minimum intersection survivors */ 2977 #define NMIN 3 /* % minimum cluster survivors */ 2979 /* 2980 * Global return values 2981 */ 2982 #define TRUE 1 /* boolean true */ 2983 #define FALSE 0 /* boolean false */ 2984 #define NULL 0 /* empty pointer */ 2986 /* 2987 * Local clock process return codes 2988 */ 2989 #define IGNORE 0 /* ignore */ 2990 #define SLEW 1 /* slew adjustment */ 2991 #define STEP 2 /* step adjustment */ 2992 #define PANIC 3 /* panic - no adjustment */ 2994 /* 2995 * System flags 2996 */ 2997 #define S_FLAGS 0 /* any system flags */ 2998 #define S_BCSTENAB 0x1 /* enable broadcast client */ 3000 /* 3001 * Peer flags 3002 */ 3003 #define P_FLAGS 0 /* any peer flags */ 3004 #define P_EPHEM 0x01 /* association is ephemeral */ 3005 #define P_BURST 0x02 /* burst enable */ 3006 #define P_IBURST 0x04 /* intial burst enable */ 3007 #define P_NOTRUST 0x08 /* authenticated access */ 3008 #define P_NOPEER 0x10 /* authenticated mobilization */ 3010 /* 3011 * Authentication codes 3012 */ 3013 #define A_NONE 0 /* no authentication */ 3014 #define A_OK 1 /* authentication OK */ 3015 #define A_ERROR 2 /* authentication error */ 3016 #define A_CRYPTO 3 /* crypto-NAK */ 3018 /* 3019 * Association state codes 3020 */ 3021 #define X_INIT 0 /* initialization */ 3022 #define X_STALE 1 /* timeout */ 3023 #define X_STEP 2 /* time step */ 3024 #define X_ERROR 3 /* authentication error */ 3025 #define X_CRYPTO 4 /* crypto-NAK received */ 3026 #define X_NKEY 5 /* untrusted key */ 3028 /* 3029 * Protocol mode definitionss 3030 */ 3031 #define M_RSVD 0 /* reserved */ 3032 #define M_SACT 1 /* symmetric active */ 3033 #define M_PASV 2 /* symmetric passive */ 3034 #define M_CLNT 3 /* client */ 3035 #define M_SERV 4 /* server */ 3036 #define M_BCST 5 /* broadcast server */ 3037 #define M_BCLN 6 /* broadcast client */ 3038 /* 3039 * Clock state definitions 3040 */ 3041 #define NSET 0 /* clock never set */ 3042 #define FSET 1 /* frequency set from file */ 3043 #define SPIK 2 /* spike detected */ 3044 #define FREQ 3 /* frequency mode */ 3045 #define SYNC 4 /* clock synchronized */ 3047 A.1.2. Packet Data Structures 3048 /* 3049 * The receive and transmit packets may contain an optional message 3050 * authentication code (MAC) consisting of a key identifier (keyid) and 3051 * message digest (mac). NTPv4 supports optional extension fields which 3052 * are inserted after the the header and before the MAC, but these are 3053 * not described here. 3054 * 3055 * Receive packet 3056 * 3057 * Note the dst timestamp is not part of the packet itself. It is 3058 * captured upon arrival and returned in the receive buffer along with 3059 * the buffer length and data. Note that some of the char fields are 3060 * packed in the actual header, but the details are omited here. 3061 */ 3062 struct r { 3063 ipaddr srcaddr; /* source (remote) address */ 3064 ipaddr dstaddr; /* destination (local) address */ 3065 char version; /* version number */ 3066 char leap; /* leap indicator */ 3067 char mode; /* mode */ 3068 char stratum; /* stratum */ 3069 char poll; /* poll interval */ 3070 s_char precision; /* precision */ 3071 tdist rootdelay; /* root delay */ 3072 tdist rootdisp; /* root dispersion */ 3073 char refid; /* reference ID */ 3074 tstamp reftime; /* reference time */ 3075 tstamp org; /* origin timestamp */ 3076 tstamp rec; /* receive timestamp */ 3077 tstamp xmt; /* transmit timestamp */ 3078 int keyid; /* key ID */ 3079 digest digest; /* message digest */ 3080 tstamp dst; /* destination timestamp */ 3081 } r; 3083 /* 3084 * Transmit packet 3085 */ 3086 struct x { 3087 ipaddr dstaddr; /* source (local) address */ 3088 ipaddr srcaddr; /* destination (remote) address */ 3089 char version; /* version number */ 3090 char leap; /* leap indicator */ 3091 char mode; /* mode */ 3092 char stratum; /* stratum */ 3093 char poll; /* poll interval */ 3094 s_char precision; /* precision */ 3095 tdist rootdelay; /* root delay */ 3096 tdist rootdisp; /* root dispersion */ 3097 char refid; /* reference ID */ 3098 tstamp reftime; /* reference time */ 3099 tstamp org; /* origin timestamp */ 3100 tstamp rec; /* receive timestamp */ 3101 tstamp xmt; /* transmit timestamp */ 3102 int keyid; /* key ID */ 3103 digest digest; /* message digest */ 3104 } x; 3106 A.1.3. Association Data Structures 3108 /* 3109 * Filter stage structure. Note the t member in this and other 3110 * structures refers to process time, not real time. Process time 3111 * increments by one second for every elapsed second of real time. 3112 */ 3113 struct f { 3114 tstamp t; /* update time */ 3115 double offset; /* clock ofset */ 3116 double delay; /* roundtrip delay */ 3117 double disp; /* dispersion */ 3118 } f; 3120 /* 3121 * Association structure. This is shared between the peer process and 3122 * poll process. 3123 */ 3124 struct p { 3126 /* 3127 * Variables set by configuration 3128 */ 3129 ipaddr srcaddr; /* source (remote) address */ 3130 ipport srcport; /* source port number *. 3131 ipaddr dstaddr; /* destination (local) address */ 3132 ipport dstport; /* destination port number */ 3133 char version; /* version number */ 3134 char mode; /* mode */ 3135 int keyid; /* key identifier */ 3136 int flags; /* option flags */ 3138 /* 3139 * Variables set by received packet 3140 */ 3141 char leap; /* leap indicator */ 3142 char mode; /* mode */ 3143 char stratum; /* stratum */ 3144 char ppoll; /* peer poll interval */ 3145 double rootdelay; /* root delay */ 3146 double rootdisp; /* root dispersion */ 3147 char refid; /* reference ID */ 3148 tstamp reftime; /* reference time */ 3149 #define begin_clear org /* beginning of clear area */ 3150 tstamp org; /* originate timestamp */ 3151 tstamp rec; /* receive timestamp */ 3152 tstamp xmt; /* transmit timestamp */ 3154 /* 3155 * Computed data 3156 */ 3157 double t; /* update time */ 3158 struct f f[NSTAGE]; /* clock filter */ 3159 double offset; /* peer offset */ 3160 double delay; /* peer delay */ 3161 double disp; /* peer dispersion */ 3162 double jitter; /* RMS jitter */ 3164 /* 3165 * Poll process variables 3166 */ 3167 char hpoll; /* host poll interval */ 3168 int burst; /* burst counter */ 3169 int reach; /* reach register */ 3170 #define end_clear unreach /* end of clear area */ 3171 int unreach; /* unreach counter */ 3172 int last; /* last poll time */ 3173 int next; /* next poll time */ 3174 } p; 3176 A.1.4. System Data Structures 3178 /* 3179 * Chime list. This is used by the intersection algorithm. 3180 */ 3181 struct m { /* m is for Marzullo */ 3182 struct p *p; /* peer structure pointer */ 3183 int type; /* high +1, mid 0, low -1 */ 3184 double edge; /* correctness interval edge */ 3185 } m; 3187 /* 3188 * Survivor list. This is used by the clustering algorithm. 3189 */ 3190 struct v { 3191 struct p *p; /* peer structure pointer */ 3192 double metric; /* sort metric */ 3193 } v; 3194 /* 3195 * System structure 3196 */ 3197 struct s { 3198 tstamp t; /* update time */ 3199 char leap; /* leap indicator */ 3200 char stratum; /* stratum */ 3201 char poll; /* poll interval */ 3202 char precision; /* precision */ 3203 double rootdelay; /* root delay */ 3204 double rootdisp; /* root dispersion */ 3205 char refid; /* reference ID */ 3206 tstamp reftime; /* reference time */ 3207 struct m m[NMAX]; /* chime list */ 3208 struct v v[NMAX]; /* survivor list */ 3209 struct p *p; /* association ID */ 3210 double offset; /* combined offset */ 3211 double jitter; /* combined jitter */ 3212 int flags; /* option flags */ 3213 } s; 3215 A.1.5. Local Clock Data Structures 3217 /* 3218 * Local clock structure 3219 */ 3220 struct c { 3221 tstamp t; /* update time */ 3222 int state; /* current state */ 3223 double offset; /* current offset */ 3224 double base; /* base offset */ 3225 double last; /* previous offset */ 3226 int count; /* jiggle counter */ 3227 double freq; /* frequency */ 3228 double jitter; /* RMS jitter */ 3229 double wander; /* RMS wander */ 3230 } c; 3232 A.1.6. Function Prototypes 3234 /* 3235 * Peer process 3236 */ 3237 void receive(struct r *); /* receive packet */ 3238 void fast_xmit(struct r *, int, int); 3239 /* transmit a reply packet */ 3240 struct p *find_assoc(struct r *); 3241 /* search the association table */ 3242 void packet(struct p *, struct r *); 3243 /* process packet */ 3244 void clock_filter(struct p *, double, double, double); 3245 /* filter */ 3246 int accept(struct p *); 3247 /* determine fitness of server */ 3248 int access(struct r *); 3249 /* determine access restrictions */ 3251 /* 3252 * System process 3253 */ 3254 void clock_select(); /* find the best clocks */ 3255 void clock_update(struct p *); /* update the system clock */ 3256 void clock_combine(); /* combine the offsets */ 3257 double root_dist(struct p *); /* calculate root distance */ 3259 /* 3260 * Clock discipline process 3261 */ 3262 int local_clock(struct p *, double); /* clock discipline */ 3263 void rstclock(int, double, double); /* clock state transition */ 3265 /* 3266 * Clock adjust process 3267 */ 3268 void clock_adjust(); /* one-second timer process */ 3270 /* 3271 * Poll process 3272 */ 3273 void poll(struct p *); /* poll process */ 3274 void poll_update(struct p *, int); /* update the poll interval */ 3275 void peer_xmit(struct p *); /* transmit a packet */ 3277 /* 3278 * Main program and utility routines 3279 */ 3280 int main(); /* main program */ 3281 struct p *mobilize(ipaddr, ipaddr, int, int, int, int); 3282 /* mobilize */ 3283 void clear(struct p *, int); /* clear association */ 3284 digest md5(int); /* generate a message digest */ 3286 /* 3287 * Kernel I/O Interface 3288 */ 3289 struct r *recv_packet(); /* wait for packet */ 3290 void xmit_packet(struct x *); /* send packet */ 3292 /* 3293 * Kernel system clock interface 3294 */ 3295 void step_time(double); /* step time */ 3296 void adjust_time(double); /* adjust (slew) time */ 3297 tstamp get_time(); /* read time */ 3299 A.2. Main Program and Utility Routines 3301 #include "ntp4.h" 3303 /* 3304 * Definitions 3305 */ 3306 #define PRECISION -18 /* precision (log2 s) */ 3307 #define IPADDR 0 /* any IP address */ 3308 #define MODE 0 /* any NTP mode */ 3309 #define KEYID 0 /* any key identifier */ 3310 /* 3311 * main() - main program 3312 */ 3313 int 3314 main() 3315 { 3316 struct p *p; /* peer structure pointer */ 3317 struct r *r; /* receive packet pointer */ 3319 /* 3320 * Read command line options and initialize system variables. 3321 * Implementations MAY measure the precision specific 3322 * to each machine by measuring the clock increments to read the 3323 * system clock. 3324 */ 3325 memset(&s, sizeof(s), 0); 3326 s.leap = NOSYNC; 3327 s.stratum = MAXSTRAT; 3328 s.poll = MINPOLL; 3329 s.precision = PRECISION; 3330 s.p = NULL; 3332 /* 3333 * Initialize local clock variables 3334 */ 3335 memset(&c, sizeof(c), 0); 3336 if (/* frequency file */ 0) { 3337 c.freq = /* freq */ 0; 3338 rstclock(FSET, 0, 0); 3339 } else { 3340 rstclock(NSET, 0, 0); 3341 } 3342 c.jitter = LOG2D(s.precision); 3344 /* 3345 * Read the configuration file and mobilize persistent 3346 * associations with spcified addresses, version, mode, key ID 3347 * and flags. 3348 */ 3349 while (/* mobilize configurated associations */ 0) { 3350 p = mobilize(IPADDR, IPADDR, VERSION, MODE, KEYID, 3351 P_FLAGS); 3352 } 3354 /* 3355 * Start the system timer, which ticks once per second. Then 3356 * read packets as they arrive, strike receive timestamp and 3357 * call the receive() routine. 3359 */ 3360 while (0) { 3361 r = recv_packet(); r->dst = get_time(); receive(r); 3362 } 3363 } 3365 /* 3366 * mobilize() - mobilize and initialize an association 3367 */ 3368 struct p 3369 *mobilize( 3370 ipaddr srcaddr, /* IP source address */ 3371 ipaddr dstaddr, /* IP destination address */ 3372 int version, /* version */ 3373 int mode, /* host mode */ 3374 int keyid, /* key identifier */ 3375 int flags /* peer flags */ 3376 ) 3377 { 3378 struct p *p; /* peer process pointer */ 3380 /* 3381 * Allocate and initialize association memory 3382 */ 3383 p = malloc(sizeof(struct p)); 3384 p->srcaddr = srcaddr; 3385 p->srcport = PORT; 3386 p->dstaddr = dstaddr; 3387 p->dstport = PORT; 3388 p->version = version; 3389 p->mode = mode; 3390 p->keyid = keyid; 3391 p->hpoll = MINPOLL; 3392 clear(p, X_INIT); 3393 p->flags == flags; 3394 return (p); 3395 } 3397 /* 3398 * clear() - reinitialize for persistent association, demobilize 3399 * for ephemeral association. 3400 */ 3401 void 3402 clear( 3403 struct p *p, /* peer structure pointer */ 3404 int kiss /* kiss code */ 3405 ) 3406 { 3407 int i; 3409 /* 3410 * The first thing to do is return all resources to the bank. 3411 * Typical resources are not detailed here, but they include 3412 * dynamically allocated structures for keys, certificates, etc. 3413 * If an ephemeral association and not initialization, return 3414 * the association memory as well. 3415 */ 3416 /* return resources */ 3417 if (s.p == p) 3418 s.p = NULL; 3419 if (kiss != X_INIT && (p->flags & P_EPHEM)) { 3420 free(p); 3421 return; 3422 } 3424 /* 3425 * Initialize the association fields for general reset. 3426 */ 3427 memset(BEGIN_CLEAR(p), LEN_CLEAR, 0); p->leap = NOSYNC; 3428 p->stratum = MAXSTRAT; 3429 p->ppoll = MAXPOLL; 3430 p->hpoll = MINPOLL; 3431 p->disp = MAXDISP; 3432 p->jitter = LOG2D(s.precision); p->refid = kiss; 3433 for (i = 0; i < NSTAGE; i++) 3434 p->f[i].disp = MAXDISP; 3436 /* 3437 * Randomize the first poll just in case thousands of broadcast 3438 * clients have just been stirred up after a long absence of the 3439 * broadcast server. 3440 */ 3441 p->last = p->t = c.t; 3442 p->next = p->last + (random() & ((1 << MINPOLL) - 1)); 3443 } 3445 /* 3446 * md5() - compute message digest 3447 */ 3448 digest 3449 md5( 3450 int keyid /* key identifier */ 3451 ) 3452 { 3453 /* 3454 * Compute a keyed cryptographic message digest. The key 3455 * identifier is associated with a key in the local key cache. 3456 * The key is prepended to the packet header and extension fieds 3457 * and the result hashed by the MD5 algorithm as described in 3458 * RFC-1321. Return a MAC consisting of the 32-bit key ID 3459 * concatenated with the 128-bit digest. 3460 */ 3461 return (/* MD5 digest */ 0); 3462 } 3464 A.3. Kernel Input/Output Interface 3466 /* 3467 * Kernel interface to transmit and receive packets. Details are 3468 * deliberately vague and depend on the operating system. 3469 * 3470 * recv_packet - receive packet from network 3471 */ 3472 struct r /* receive packet pointer*/ 3473 *recv_packet() { 3474 return (/* receive packet r */ 0); 3475 } 3477 /* 3478 * xmit_packet - transmit packet to network 3479 */ 3480 void 3481 xmit_packet( 3482 struct x *x /* transmit packet pointer */ 3483 ) 3484 { 3485 /* send packet x */ 3486 } 3488 A.4. Kernel System Clock Interface 3490 /* 3491 * There are three time formats: native (Unix), NTP and floating double. 3492 * The get_time() routine returns the time in NTP long format. The Unix 3493 * routines expect arguments as a structure of two signed 32-bit words 3494 * in seconds and microseconds (timeval) or nanoseconds (timespec). The 3495 * step_time() and adjust_time() routines expect signed arguments in 3496 * floating double. The simplified code shown here is for illustration 3497 * only and has not been verified. 3498 */ 3499 #define JAN_1970 2208988800UL /* 1970 - 1900 in seconds */ 3501 /* 3502 * get_time - read system time and convert to NTP format 3503 */ 3504 tstamp 3505 get_time() 3506 { 3507 struct timeval unix_time; 3509 /* 3510 * There are only two calls on this routine in the program. One 3511 * when a packet arrives from the network and the other when a 3512 * packet is placed on the send queue. Call the kernel time of 3513 * day routine (such as gettimeofday()) and convert to NTP 3514 * format. 3515 */ 3516 gettimeofday(&unix_time, NULL); 3518 return ((unix_time.tv_sec + JAN_1970) * 0x100000000L + 3519 (unix_time.tv_usec * 0x100000000L) / 1000000); 3520 } 3522 /* 3523 * step_time() - step system time to given offset valuet 3524 */ 3525 void 3526 step_time( 3527 double offset /* clock offset */ 3528 ) 3529 { 3530 struct timeval unix_time; 3531 tstamp ntp_time; 3533 /* 3534 * Convert from double to native format (signed) and add to the 3535 * current time. Note the addition is done in native format to 3536 * avoid overflow or loss of precision. 3537 */ 3538 ntp_time = D2LFP(offset); gettimeofday(&unix_time, NULL); 3539 unix_time.tv_sec += ntp_time / 0x100000000L; 3540 unix_time.tv_usec += ntp_time % 0x100000000L; 3541 unix_time.tv_sec += unix_time.tv_usec / 1000000; 3542 unix_time.tv_usec %= 1000000; 3543 settimeofday(&unix_time, NULL); 3544 } 3546 /* 3547 * adjust_time() - slew system clock to given offset value 3548 */ 3549 void 3550 adjust_time( 3551 double offset /* clock offset */ 3552 ) 3553 { 3554 struct timeval unix_time; 3555 tstamp ntp_time; 3557 /* 3558 * Convert from double to native format (signed) and add to the 3559 * current time. 3560 */ 3561 ntp_time = D2LFP(offset); 3562 unix_time.tv_sec = ntp_time / 0x100000000L; 3563 unix_time.tv_usec = ntp_time % 0x100000000L; 3564 unix_time.tv_sec += unix_time.tv_usec / 1000000; 3565 unix_time.tv_usec %= 1000000; 3566 adjtime(&unix_time, NULL); 3567 } 3569 A.5. Peer Process 3571 #include "ntp4.h" 3573 /* 3574 * A crypto-NAK packet includes the NTP header followed by a MAC 3575 * consisting only of the key identifier with value zero. It tells the 3576 * receiver that a prior request could not be properly authenticated, 3577 * but the NTP header fields are correct. 3578 * 3579 * A kiss-o'-death packet has an NTP header with leap 3 (NOSYNC) and 3580 * stratum 0. It tells the receiver that something drastic 3581 * has happened, as revealled by the kiss code in the refid field. The 3582 * NTP header fields may or may not be correct. 3583 */ 3584 /* 3585 * Definitions 3586 */ 3587 #define SGATE 3 /* spike gate (clock filter */ 3588 #define BDELAY .004 /* broadcast delay (s) */ 3590 /* 3591 * Dispatch codes 3592 */ 3593 #define ERR -1 /* error */ 3594 #define DSCRD 0 /* discard packet */ 3595 #define PROC 1 /* process packet */ 3596 #define BCST 2 /* broadcast packet */ 3597 #define FXMIT 3 /* client packet */ 3598 #define NEWPS 4 /* new symmetric passive client */ 3599 #define NEWBC 5 /* new broadcast client */ 3601 /* 3602 * Dispatch matrix 3603 * active passv client server bcast */ 3604 int table[7][5] = { 3605 /* nopeer */{ NEWPS, DSCRD, FXMIT, DSCRD, NEWBC }, 3606 /* active */{ PROC, PROC, DSCRD, DSCRD, DSCRD }, 3607 /* passv */{ PROC, ERR, DSCRD, DSCRD, DSCRD }, 3608 /* client */{ DSCRD, DSCRD, DSCRD, PROC, DSCRD }, 3609 /* server */{ DSCRD, DSCRD, DSCRD, DSCRD, DSCRD }, 3610 /* bcast */{ DSCRD, DSCRD, DSCRD, DSCRD, DSCRD }, 3611 /* bclient */{ DSCRD, DSCRD, DSCRD, DSCRD, PROC} 3612 }; 3614 /* 3615 * Miscellaneous macroni 3616 * 3617 * This macro defines the authentication state. If x is 0, 3618 * authentication is optional, othewise it is required. 3619 */ 3620 #define AUTH(x, y)((x) ? (y) == A_OK : (y) == A_OK || \ 3621 (y) == A_NONE) 3623 /* 3624 * These are used by the clear() routine 3625 */ 3626 #define BEGIN_CLEAR(p) ((char *)&((p)->begin_clear)) 3627 #define END_CLEAR(p) ((char *)&((p)->end_clear)) 3628 #define LEN_CLEAR (END_CLEAR ((struct p *)0) - \ 3629 BEGIN_CLEAR((struct p *)0)) 3631 A.5.1. receive() 3633 /* 3634 * receive() - receive packet and decode modes 3635 */ 3636 void 3637 receive( 3638 struct r *r /* receive packet pointer */ 3639 ) 3640 { 3641 struct p *p; /* peer structure pointer 3642 int auth; /* authentication code */ 3643 int has_mac; /* size of MAC */ 3644 int synch; /* synchronized switch */ 3645 int auth; /* authentication code */ 3646 /* 3647 * Check access control lists. The intent here is to implement a 3648 * whitelist of those IP addresses specifically accepted and/or 3649 * a blacklist of those IP addresses specifically rejected. 3650 * There could be different lists for authenticated clients and 3651 * unauthenticated clients. 3652 */ 3653 if (!access(r)) 3654 return; /* access denied */ 3656 /* 3657 * The version must not be in the future. Format checks include 3658 * packet length, MAC length and extension field lengths, if 3659 * present. 3660 */ 3661 if (r->version > VERSION /* or format error */) 3662 return; /* format error */ 3664 /* 3665 * Authentication is conditioned by two switches which can be 3666 * specified on a per-client basis. 3667 * 3668 * P_NOPEER do not mobilize an association unless 3669 * authenticated 3670 * P_NOTRUST do not allow access unless authenticated 3671 * (implies P_NOPEER)* 3672 * There are four outcomes: 3673 * 3674 * A_NONE the packet has no MAC 3675 * A_OK the packet has a MAC and authentication 3676 * succeeds 3677 * A_ERROR the packet has a MAC and authentication fails 3678 * A_CRYPTO crypto-NAK. the MAC has four octets only. 3679 * 3680 * Note: The AUTH(x, y) macro is used to filter outcomes. If x 3681 * is zero, acceptable outcomes of y are NONE and OK. If x is 3682 * one, the only acceptable outcome of y is OK. 3683 */ 3684 has_mac = /* length of MAC field */ 0; if (has_mac == 0) { 3685 auth = A_NONE; /* not required */ 3686 } else if (has_mac == 4) { 3687 auth == A_CRYPTO; /* crypto-NAK */ 3688 } else { 3689 if (r->mac != md5(r->keyid)) 3690 auth = A_ERROR; /* auth error */ 3691 else 3692 auth = A_OK; /* auth OK */ 3693 } 3694 /* 3695 * Find association and dispatch code. If there is no 3696 * association to match, the value of p->mode is assumed NULL. 3697 */ 3698 p = find_assoc(r); 3699 switch(table[p->mode][r->mode]) { 3701 /* 3702 * Client packet. Send server reply (no association). If 3703 * authentication fails, send a crypto-NAK packet. 3704 */ 3705 case FXMIT: 3706 if (AUTH(p->flags & P_NOTRUST, auth)) 3707 fast_xmit(r, M_SERV, auth); 3708 else if (auth == A_ERROR) 3709 fast_xmit(r, M_SERV, A_CRYPTO); 3710 return; /* M_SERV packet sent */ 3712 /* 3713 * New symmetric passive client (ephemeral association). It is 3714 * mobilized in the same version as in the packet. If 3715 * authentication fails, send a crypto-NAK packet. If restrict 3716 * no-moblize, send a symmetric active packet instead. 3717 */ 3718 case NEWPS: 3719 if (!AUTH(p->flags & P_NOTRUST, auth)) { 3720 if (auth == A_ERROR) 3721 fast_xmit(r, M_SACT, A_CRYPTO); 3722 return; /* crypto-NAK packet sent */ 3723 } 3724 if (!AUTH(p->flags & P_NOPEER, auth)) { 3725 fast_xmit(r, M_SACT, auth); 3726 return; /* M_SACT packet sent */ 3727 } 3728 p = mobilize(r->srcaddr, r->dstaddr, r->version, M_PASV, 3729 r->keyid, P_EPHEM); 3730 break; 3732 /* 3733 * New broadcast client (ephemeral association). It is mobilized 3734 * in the same version as in the packet. If authentication 3735 * error, ignore the packet. 3736 */ 3737 case NEWBC: 3738 if (!AUTH(p->flags & (P_NOTRUST | P_NOPEER), auth)) 3739 return; /* authentication error */ 3741 if (!(s.flags & S_BCSTENAB)) 3742 return; /* broadcast not enabled */ 3744 p = mobilize(r->srcaddr, r->dstaddr, r->version, M_BCLN, 3745 r->keyid, P_EPHEM); 3746 break; /* processing continues */ 3748 /* 3749 * Process packet. Placeholdler only. 3750 */ 3751 case PROC: 3752 break; /* processing continues */ 3754 /* 3755 * Invalid mode combination. We get here only in case of 3756 * ephemeral associations, so the correct action is simply to 3757 * toss it. 3758 */ 3759 case ERR: 3760 clear(p, X_ERROR); 3761 return; /* invalid mode combination */ 3763 /* 3764 * No match; just discard the packet. 3765 */ 3766 case DSCRD: 3767 return; /* orphan abandoned */ 3768 } 3770 /* 3771 * Next comes a rigorous schedule of timestamp checking. If the 3772 * transmit timestamp is zero, the server is horribly broken. 3773 */ 3774 if (r->xmt == 0) 3775 return; /* invalid timestamp */ 3777 /* 3778 * If the transmit timestamp duplicates a previous one, the 3779 * packet is a replay. 3780 */ 3781 if (r->xmt == p->xmt) 3782 return; /* duplicate packet */ 3784 /* 3785 * If this is a broadcast mode packet, skip further checking. 3786 * If the origin timestamp is zero, the sender has not yet heard 3787 * from us. Otherwise, if the origin timestamp does not match 3788 * the transmit timestamp, the packet is bogus. 3789 */ 3790 synch = TRUE; 3791 if (r->mode != M_BCST) { 3792 if (r->org == 0) 3793 synch = FALSE;/* unsynchronized */ 3795 else if (r->org != p->xmt) 3796 synch = FALSE;/* bogus packet */ 3797 } 3799 /* 3800 * Update the origin and destination timestamps. If 3801 * unsynchronized or bogus, abandon ship. 3802 */ 3803 p->org = r->xmt; 3804 p->rec = r->dst; 3805 if (!synch) 3806 return; /* unsynch */ 3808 /* 3809 * The timestamps are valid and the receive packet matches the 3810 * last one sent. If the packet is a crypto-NAK, the server 3811 * might have just changed keys. We demobilize the association 3812 * and wait for better times. 3813 */ 3814 if (auth == A_CRYPTO) { 3815 clear(p, X_CRYPTO); 3816 return; /* crypto-NAK */ 3817 } 3819 /* 3820 * If the association is authenticated, the key ID is nonzero 3821 * and received packets must be authenticated. This is designed 3822 * to avoid a bait-and-switch attack, which was possible in past 3823 * versions. 3824 */ 3825 if (!AUTH(p->keyid || (p->flags & P_NOTRUST), auth)) 3826 return; /* bad auth */ 3828 /* 3829 * Everything possible has been done to validate the timestamps 3830 * and prevent bad guys from disrupting the protocol or 3831 * injecting bogus data. Earn some revenue. 3832 */ 3833 packet(p, r); 3834 } 3836 /* 3837 * find_assoc() - find a matching association 3838 */ 3839 struct p /* peer structure pointer or NULL */ 3840 *find_assoc( 3841 struct r *r /* receive packet pointer */ 3842 ) 3843 { 3844 struct p *p; /* dummy peer structure pointer */ 3846 /* 3847 * Search association table for matching source 3848 * address and source port. 3849 */ 3850 while (/* all associations */ 0) { 3851 if (r->srcaddr == p->srcaddr && r->port == p->port) 3852 return(p); 3853 } 3854 return (NULL); 3855 } 3857 A.5.2. packet() 3859 /* 3860 * packet() - process packet and compute offset, delay and 3861 * dispersion. 3862 */ 3863 void 3864 packet( 3865 struct p *p, /* peer structure pointer */ 3866 struct r *r /* receive packet pointer */ 3867 ) 3868 { 3869 double offset; /* sample offsset */ 3870 double delay; /* sample delay */ 3871 double disp; /* sample dispersion */ 3873 /* 3874 * By golly the packet is valid. Light up the remaining header 3875 * fields. Note that we map stratum 0 (unspecified) to MAXSTRAT 3876 * to make stratum comparisons simpler and to provide a natural 3877 * interface for radio clock drivers that operate for 3878 * convenience at stratum 0. 3879 */ 3880 p->leap = r->leap; 3881 if (r->stratum == 0) 3882 p->stratum = MAXSTRAT; else 3883 p->stratum = r->stratum; p->mode = r->mode; 3884 p->ppoll = r->poll; 3885 p->rootdelay = FP2D(r->rootdelay); p->rootdisp = FP2D(r->rootdisp); 3886 p->refid = r->refid; 3887 p->reftime = r->reftime; 3889 /* 3890 * Verify the server is synchronized with valid stratum and 3891 * reference time not later than the transmit time. 3892 */ 3893 if (p->leap == NOSYNC || p->stratum >= MAXSTRAT) 3894 return; /* unsynchronized */ 3896 /* 3897 * Verify valid root distance. 3898 */ 3899 if (r->rootdelay / 2 + r->rootdisp >= MAXDISP || p->reftime > 3900 r->xmt) 3901 return; /* invalid header values */ 3903 poll_update(p, p->hpoll); 3904 p->reach |= 1; 3906 /* 3907 * Calculate offset, delay and dispersion, then pass to the 3908 * clock filter. Note carefully the implied processing. The 3909 * first-order difference is done directly in 64-bit arithmetic, 3910 * then the result is converted to floating double. All further 3911 * processing is in floating double arithmetic with rounding 3912 * done by the hardware. This is necessary in order to avoid 3913 * overflow and preseve precision. 3914 * 3915 * The delay calculation is a special case. In cases where the 3916 * server and client clocks are running at different rates and 3917 * with very fast networks, the delay can appear negative. In 3918 * order to avoid violating the Principle of Least Astonishment, 3919 * the delay is clamped not less than the system precision. 3920 */ 3921 if (p->mode == M_BCST) { 3922 offset = LFP2D(r->xmt - r->dst); delay = BDELAY; 3923 disp = LOG2D(r->precision) + LOG2D(s.precision) + PHI * 3924 2 * BDELAY; 3925 } else { 3926 offset = (LFP2D(r->rec - r->org) + LFP2D(r->dst 3927 r->xmt)) / 2; 3928 delay = max(LFP2D(r->dst - r->org) - LFP2D(r->rec 3929 r->xmt), LOG2D(s.precision)); 3930 disp = LOG2D(r->precision) + LOG2D(s.precision) + PHI * 3931 LFP2D(r->dst - r->org); 3932 } 3933 clock_filter(p, offset, delay, disp); 3934 } 3936 A.5.3. clock_filter() 3938 /* 3939 * clock_filter(p, offset, delay, dispersion) - select the best from the 3940 * latest eight delay/offset samples. 3941 */ 3942 void 3943 clock_filter( 3944 struct p *p, /* peer structure pointer */ 3945 double offset, /* clock offset */ 3946 double delay, /* roundtrip delay */ 3947 double disp /* dispersion */ 3948 ) 3949 { 3950 struct f f[NSTAGE];/* sorted list */ 3951 double dtemp; 3952 int i; 3954 /* 3955 * The clock filter contents consist of eight tuples (offset, 3956 * delay, dispersion, time). Shift each tuple to the left, 3957 * discarding the leftmost one. As each tuple is shifted, 3958 * increase the dispersion since the last filter update. At the 3959 * same time, copy each tuple to a temporary list. After this, 3960 * place the (offset, delay, disp, time) in the vacated 3961 * rightmost tuple. 3962 */ 3963 for (i = 1; i < NSTAGE; i++) { 3964 p->f[i] = p->f[i - 1]; 3965 p->f[i].disp += PHI * (c.t - p->t); f[i] = p->f[i]; 3966 } 3967 p->f[0].t = c.t; 3968 p->f[0].offset = offset; 3969 p->f[0].delay = delay; 3970 p->f[0].disp = disp; 3971 f[0] = p->f[0]; 3973 /* 3974 * Sort the temporary list of tuples by increasing f[].delay. 3975 * The first entry on the sorted list represents the best 3976 * sample, but it might be old. 3977 */ 3978 dtemp = p->offset; 3979 p->offset = f[0].offset; 3980 p->delay = f[0].delay; 3981 for (i = 0; i < NSTAGE; i++) { 3982 p->disp += f[i].disp / (2 ^ (i + 1)); 3983 p->jitter += SQUARE(f[i].offset - f[0].offset); 3984 } 3985 p->jitter = max(SQRT(p->jitter), LOG2D(s.precision)); 3987 /* 3988 * Prime directive: use a sample only once and never a sample 3989 * older than the latest one, but anything goes before first 3990 * synchronized. 3991 */ 3992 if (f[0].t - p->t <= 0 && s.leap != NOSYNC) 3993 return; 3995 /* 3996 * Popcorn spike suppressor. Compare the difference between the 3997 * last and current offsets to the current jitter. If greater 3998 * than SGATE (3) and if the interval since the last offset is 3999 * less than twice the system poll interval, dump the spike. 4000 * Otherwise, and if not in a burst, shake out the truechimers. 4001 */ 4002 if (fabs(p->offset - dtemp) > SGATE * p->jitter && (f[0].t 4003 p->t) < 2 * s.poll) 4004 return; 4006 p->t = f[0].t; 4007 if (p->burst == 0) 4008 clock_select(); 4009 return; 4010 } 4012 A.5.4. fast_xmit() 4014 /* 4015 * fast_xmit() - transmit a reply packet for receive packet r 4016 */ 4017 void 4018 fast_xmit( 4019 struct r *r, /* receive packet pointer */ 4020 int mode, /* association mode */ 4021 int auth /* authentication code */ 4022 ) 4023 { 4024 struct x x; 4026 /* 4027 * Initialize header and transmit timestamp. Note that the 4028 * transmit version is copied from the receive version. This is 4029 * for backward compatibility. 4031 */ 4032 x.version = r->version; 4033 x.srcaddr = r->dstaddr; 4034 x.dstaddr = r->srcaddr; 4035 x.leap = s.leap; 4036 x.mode = mode; 4037 if (s.stratum == MAXSTRAT) 4038 x.stratum = 0; 4039 else 4040 x.stratum = s.stratum; x.poll = r->poll; 4041 x.precision = s.precision; 4042 x.rootdelay = D2FP(s.rootdelay); x.rootdisp = D2FP(s.rootdisp); 4043 x.refid = s.refid; 4044 x.reftime = s.reftime; 4045 x.org = r->xmt; 4046 x.rec = r->dst; 4047 x.xmt = get_time(); 4049 /* 4050 * If the authentication code is A.NONE, include only the 4051 * header; if A.CRYPTO, send a crypto-NAK; if A.OK, send a valid 4052 * MAC. Use the key ID in the received packet and the key in the 4053 * local key cache. 4054 */ 4055 if (auth != A_NONE) { 4056 if (auth == A_CRYPTO) { 4057 x.keyid = 0; 4058 } else { 4059 x.keyid = r->keyid; 4060 x.digest = md5(x.keyid); 4061 } 4062 } 4063 xmit_packet(&x); 4064 } 4066 A.5.5. access() 4068 /* 4069 * access() - determine access restrictions 4070 */ 4071 int 4072 access( 4073 struct r *r /* receive packet pointer */ 4074 ) 4075 { 4076 /* 4077 * The access control list is an ordered set of tuples 4078 * consisting of an address, mask and restrict word containing 4079 * defined bits. The list is searched for the first match on the 4080 * source address (r->srcaddr) and the associated restrict word 4081 * is returned. 4082 */ 4083 return (/* access bits */ 0); 4084 } 4086 A.6. System Process 4088 #include "ntp4.h" 4090 A.6.1. clock_select() 4092 /* 4093 * clock_select() - find the best clocks 4094 */ 4095 void 4096 clock_select() { 4097 struct p *p, *osys; /* peer structure pointers */ 4098 double low, high; /* correctness interval extents */ 4099 int allow, found, chime; /* used by intersecion algorithm */ 4100 int n, i, j; 4102 /* 4103 * We first cull the falsetickers from the server population, 4104 * leaving only the truechimers. The correctness interval for 4105 * association p is the interval from offset - root_dist() to 4106 * offset + root_dist(). The object of the game is to find a 4107 * majority clique; that is, an intersection of correctness 4108 * intervals numbering more than half the server population. 4109 * 4110 * First construct the chime list of tuples (p, type, edge) as 4111 * shown below, then sort the list by edge from lowest to 4112 * highest. 4113 */ 4114 osys = s.p; 4115 s.p = NULL; 4116 n = 0; 4117 while (accept(p)) { 4118 s.m[n].p = p; 4119 s.m[n].type = +1; 4120 s.m[n].edge = p->offset + root_dist(p); 4121 n++; 4122 s.m[n].p = p; 4123 s.m[n].type = 0; 4124 s.m[n].edge = p->offset; 4125 n++; 4126 s.m[n].p = p; 4127 s.m[n].type = -1; 4128 s.m[n].edge = p->offset - root_dist(p); 4129 n++; 4130 } 4132 /* 4133 * Find the largest contiguous intersection of correctness 4134 * intervals. Allow is the number of allowed falsetickers; found 4135 * is the number of midpoints. Note that the edge values are 4136 * limited to the range +-(2 ^ 30) < +-2e9 by the timestamp 4137 * calculations. 4138 */ 4139 low = 2e9; high = -2e9; 4140 for (allow = 0; 2 * allow < n; allow++) { 4141 /* 4142 * Scan the chime list from lowest to highest to find 4143 * the lower endpoint. 4144 */ 4145 found = 0; 4146 chime = 0; 4147 for (i = 0; i < n; i++) { 4148 chime -= s.m[i].type; 4149 if (chime >= n - found) { 4150 low = s.m[i].edge; 4151 break; 4152 } 4153 if (s.m[i].type == 0) 4154 found++; 4155 } 4157 /* 4158 * Scan the chime list from highest to lowest to find 4159 * the upper endpoint. 4160 */ 4161 chime = 0; 4162 for (i = n - 1; i >= 0; i--) { 4163 chime += s.m[i].type; 4164 if (chime >= n - found) { 4165 high = s.m[i].edge; 4166 break; 4167 } 4168 if (s.m[i].type == 0) 4169 found++; 4170 } 4172 /* 4173 * If the number of midpoints is greater than the number 4174 * of allowed falsetickers, the intersection contains at 4175 * least one truechimer with no midpoint. If so, 4176 * increment the number of allowed falsetickers and go 4177 * around again. If not and the intersection is 4178 * nonempty, declare success. 4179 */ 4180 if (found > allow) 4181 continue; 4183 if (high > low) 4184 break; 4185 } 4187 /* 4188 * Clustering algorithm. Construct a list of survivors (p, 4189 * metric) from the chime list, where metric is dominated first 4190 * by stratum and then by root distance. All other things being 4191 * equal, this is the order of preference. 4192 */ 4193 n = 0; 4194 for (i = 0; i < n; i++) { 4195 if (s.m[i].edge < low || s.m[i].edge > high) 4196 continue; 4198 p = s.m[i].p; 4199 s.v[n].p = p; 4200 s.v[n].metric = MAXDIST * p->stratum + root_dist(p); 4201 n++; 4202 } 4204 /* 4205 * There must be at least NSANE survivors to satisfy the 4206 * correctness assertions. Ordinarily, the Byzantine criteria 4207 * require four, susrvivors, but for the demonstration here, one 4208 * is acceptable. 4209 */ 4210 if (n == NSANE) 4211 return; 4213 /* 4214 * For each association p in turn, calculate the selection 4215 * jitter p->sjitter as the square root of the sum of squares 4216 * (p->offset - q->offset) over all q associations. The idea is 4217 * to repeatedly discard the survivor with maximum selection 4218 * jitter until a termination condition is met. 4219 */ 4220 while (1) { 4221 struct p *p, *q, *qmax;/* peer structure pointers */ 4222 double max, min, dtemp; 4224 max = -2e9; min = 2e9; for (i = 0; i < n; i++) { 4225 p = s.v[i].p; 4226 if (p->jitter < min) 4227 min = p->jitter; 4228 dtemp = 0; 4229 for (j = 0; j < n; j++) { 4230 q = s.v[j].p; 4231 dtemp += SQUARE(p->offset - q->offset); 4232 } 4233 dtemp = SQRT(dtemp); if (dtemp > max) { 4234 max = dtemp; 4235 qmax = q; 4236 } 4237 } 4239 /* 4240 * If the maximum selection jitter is less than the 4241 * minimum peer jitter, then tossing out more survivors 4242 * will not lower the minimum peer jitter, so we might 4243 * as well stop. To make sure a few survivors are left 4244 * for the clustering algorithm to chew on, we also stop 4245 * if the number of survivors is less than or equal to 4246 * NMIN (3). 4247 */ 4248 if (max < min || n <= NMIN) 4249 break; 4251 /* 4252 * Delete survivor qmax from the list and go around * again. 4253 */ 4254 n--; 4255 } 4257 /* 4258 * Pick the best clock. If the old system peer is on the list 4259 * and at the same stratum as the first survivor on the list, 4260 * then don't do a clock hop. Otherwise, select the first 4261 * survivor on the list as the new system peer. 4262 */ 4263 if (osys->stratum == s.v[0].p->stratum) 4264 s.p = osys; 4265 else 4266 s.p = s.v[0].p; 4267 clock_update(s.p); 4268 } 4270 A.6.2. root_dist() 4272 /* 4273 * root_dist() - calculate root distance 4274 */ 4275 double 4276 root_dist( 4277 struct p *p /* peer structure pointer */ 4278 ) 4279 { 4280 /* 4281 * The root synchronization distance is the maximum error due to 4282 * all causes of the local clock relative to the primary server. 4283 * It is defined as half the total delay plus total dispersion 4284 * plus peer jitter. 4285 */ 4286 return (max(MINDISP, p->rootdelay + p->delay) / 2 + 4287 p->rootdisp + p->disp + PHI * (c.t - p->t) + p->jitter); 4288 } 4290 A.6.3. accept() 4292 /* 4293 * accept() - test if association p is acceptable for synchronization 4294 */ 4295 int 4296 accept( 4297 struct p *p /* peer structure pointer */ 4298 ) 4299 { 4300 /* 4301 * A stratum error occurs if (1) the server has never been 4302 * synchronized, (2) the server stratum is invalid. 4303 */ 4304 if (p->leap == NOSYNC || p->stratum >= MAXSTRAT) 4305 return (FALSE); 4307 /* 4308 * A distance error occurs if the root distance exceeds the 4309 * distance threshold plus an increment equal to one poll 4310 * interval. 4311 */ 4312 if (root_dist(p) > MAXDIST + PHI * LOG2D(s.poll)) 4313 return (FALSE); 4315 /* 4316 * A loop error occurs if the remote peer is synchronized to the 4317 * local peer or the remote peer is synchronized to the current 4318 * system peer. Note this is the behavior for IPv4; for IPv6 the 4319 * MD5 hash is used instead. 4320 */ 4321 if (p->refid == p->dstaddr || p->refid == s.refid) 4322 return (FALSE); 4324 /* 4325 * An unreachable error occurs if the server is unreachable. 4326 */ 4327 if (p->reach == 0) 4328 return (FALSE); 4330 return (TRUE); 4331 } 4333 A.6.4. clock_update() 4335 /* 4336 * clock_update() - update the system clock 4337 */ 4338 void 4339 clock_update( 4340 struct p *p /* peer structure pointer */ 4341 ) 4342 { 4343 double dtemp; 4345 /* 4346 * If this is an old update, for instance as the result of a 4347 * system peer change, avoid it. We never use an old sample or 4348 * the same sample twice. 4349 * 4350 if (s.t >= p->t) 4351 return; 4353 /* 4354 * Combine the survivor offsets and update the system clock; the 4355 * local_clock() routine will tell us the good or bad news. 4356 */ 4357 s.t = p->t; 4358 clock_combine(); 4359 switch (local_clock(p, s.offset)) { 4361 /* 4362 * The offset is too large and probably bogus. Complain to the 4363 * system log and order the operator to set the clock manually 4364 * within PANIC range. An implementation MAY include a 4365 * command line option to disable this check and to change the 4366 * panic threshold from the default 1000 s as required. 4367 */ 4368 case PANIC: 4369 exit (0); 4371 /* 4372 * The offset is more than the step threshold (0.125 s by 4373 * default). After a step, all associations now have 4374 * inconsistent time valurs, so they are reset and started 4375 * fresh. The step threshold MAY be changed in an 4376 * implementation in order to lessen the chance the clock might 4377 * be stepped backwards. However, there may be serious 4378 * consequences. 4379 */ 4380 case STEP: 4381 while (/* all associations */ 0) 4382 clear(p, X_STEP); 4383 s.stratum = MAXSTRAT; 4384 s.poll = MINPOLL; 4385 break; 4386 /* 4387 * The offset was less than the step threshold, which is the 4388 * normal case. Update the system variables from the peer 4389 * variables. The lower clamp on the dispersion increase is to 4390 * avoid timing loops and clockhopping when highly precise 4391 * sources are in play. The clamp MAY be changed from the 4392 * suggested default of .01 s. 4393 */ 4394 case SLEW: 4395 s.leap = p->leap; 4396 s.stratum = p->stratum + 1; s.refid = p->refid; 4397 s.reftime = p->reftime; 4398 s.rootdelay = p->rootdelay + p->delay; 4399 dtemp = SQRT(SQUARE(p->jitter) + SQUARE(s.jitter)); 4400 dtemp += max(p->disp + PHI * (c.t - p->t) + 4401 fabs(p->offset), MINDISP); 4402 s.rootdisp = p->rootdisp + dtemp; break; 4404 /* 4405 * Some samples are discarded while, for instance, a direct 4406 * frequency measurement is being made. 4407 */ 4408 case IGNORE: 4409 break; 4410 } 4411 } 4413 A.6.5. clock_combine() 4415 /* 4416 * clock_combine() - combine offsets 4417 */ 4418 void 4419 clock_combine() 4420 { 4421 struct p *p;/* peer structure pointer */ 4422 double x, y, z, w; 4423 int i; 4425 /* 4426 * Combine the offsets of the clustering algorithm survivors 4427 * using a weighted average with weight determined by the root 4428 * distance. Compute the selection jitter as the weighted RMS 4429 * difference between the first survivor and the remaining 4430 * survivors. In some cases the inherent clock jitter can be 4431 * reduced by not using this algorithm, especially when frequent 4432 * clockhopping is involved. 4433 */ 4434 y = z = w = 0; 4435 for (i = 0; s.v[i].p != NULL; i++) { 4436 p = s.v[i].p; 4437 x = root_dist(p); 4438 y += 1 / x; 4439 z += p->offset / x; 4440 w += SQUARE(p->offset - s.v[0].p->offset) / x; 4441 } 4442 s.offset = z / y; 4443 s.jitter = SQRT(w / y); 4444 } 4446 A.6.6. local_clock() 4448 #include "ntp4.h" 4450 /* 4451 * Constants 4452 */ 4453 #define STEPT.128/* step threshold (s) */ 4454 #define WATCH900/* stepout threshold (s) */ 4455 #define PANICT1000/* panic threshold (s) */ 4456 #define PLL65536/* PLL loop gain */ 4457 #define FLLMAXPOLL + 1/* FLL loop gain */ 4458 #define AVG 4/* parameter averaging constant */ 4459 #define ALLAN1500/* compromise Allan intercept (s) */ 4460 #define LIMIT 30 /* poll-adjust threshold */ 4461 #define MAXFREQ 500e-6 4462 /* maximum frequency tolerance (s/s) */ 4463 #define PGATE 4 /* poll-adjust gate */ 4465 /* 4466 * local_clock() - discipline the local clock 4467 */ 4468 int /* return code */ 4469 local_clock( 4470 struct p *p, /* peer structure pointer */ 4471 double offset /* clock offset from combine() */ 4472 ) 4473 { 4474 int state; /* clock discipline state */ 4475 double freq; /* frequency */ 4476 double mu; /* interval since last update */ 4477 int rval; 4478 double etemp, dtemp; 4480 /* 4481 * If the offset is too large, give up and go home. 4482 */ 4483 if (fabs(offset) > PANICT) 4484 return (PANIC); 4486 /* 4487 * Clock state machine transition function. This is where the 4488 * action is and defines how the system reacts to large time 4489 * and frequency errors. There are two main regimes: when the 4490 * offset exceeds the step threshold and when it does not. 4491 */ 4492 rval = SLEW; 4493 mu = p->t - s.t; 4494 freq = 0; 4495 if (fabs(offset) > STEPT) { 4496 switch (c.state) { 4498 /* 4499 * In S_SYNC state we ignore the first outlyer amd 4500 * switch to S_SPIK state. 4501 */ 4502 case SYNC: 4503 state = SPIK; 4504 return (rval); 4505 /* 4506 * In S_FREQ state we ignore outlyers and inlyers. At 4507 * the first outlyer after the stepout threshold, 4508 * compute the apparent frequency correction and step 4509 * the time. 4510 */ 4511 case FREQ: 4512 if (mu < WATCH) 4513 return (IGNORE); 4515 freq = (offset - c.base - c.offset) / mu; 4516 /* fall through to S_SPIK */ 4518 /* 4519 * In S_SPIK state we ignore succeeding outlyers until 4520 * either an inlyer is found or the stepout threshold is 4521 * exceeded. 4522 */ 4523 case SPIK: 4524 if (mu < WATCH) 4525 return (IGNORE); 4527 /* fall through to default */ 4529 /* 4530 * We get here by default in S_NSET and S_FSET states 4531 * and from above in S_FREQ state. Step the time and 4532 * clamp down the poll interval. 4533 * 4534 * In S_NSET state an initial frequency correction is 4535 * not available, usually because the frequency file has 4536 * not yet been written. Since the time is outside the 4537 * capture range, the clock is stepped. The frequency 4538 * will be set directly following the stepout interval. 4539 * 4540 * In S_FSET state the initial frequency has been set 4541 * from the frequency file. Since the time is outside 4542 * the capture range, the clock is stepped immediately, 4543 * rather than after the stepout interval. Guys get 4544 * nervous if it takes 17 minutes to set the clock for 4545 * the first time. 4546 * 4547 * In S_SPIK state the stepout threshold has expired and 4548 * the phase is still above the step threshold. Note 4549 * that a single spike greater than the step threshold 4550 * is always suppressed, even at the longer poll 4551 * intervals. 4552 */ 4553 default: 4555 /* 4556 * This is the kernel set time function, usually 4557 * implemented by the Unix settimeofday() system 4558 * call. 4559 */ 4560 step_time(offset); c.count = 0; 4561 rval = STEP; 4562 if (state == NSET) { 4563 rstclock(FREQ, p->t, 0); 4564 return (rval); 4565 } 4566 break; 4567 } 4568 rstclock(SYNC, p->t, 0); 4569 } else { 4571 /* 4572 * Compute the clock jitter as the RMS of exponentially 4573 * weighted offset differences. This is used by the 4574 * poll-adjust code. 4575 */ 4576 etemp = SQUARE(c.jitter); 4577 dtemp = SQUARE(max(fabs(offset - c.last), 4578 LOG2D(s.precision))); 4579 c.jitter = SQRT(etemp + (dtemp - etemp) / AVG); 4580 switch (c.state) { 4582 /* 4583 * In S_NSET state this is the first update received and 4584 * the frequency has not been initialized. The first 4585 * thing to do is directly measure the oscillator 4586 * frequency. 4587 */ 4588 case NSET: 4589 c.offset = offset; 4590 rstclock(FREQ, p->t, offset); return (IGNORE); 4592 /* 4593 * In S_FSET state this is the first update and the 4594 * frequency has been initialized. Adjust the phase, but 4595 * don't adjust the frequency until the next update. 4596 */ 4597 case FSET: 4598 c.offset = offset; break; 4600 /* 4601 * In S_FREQ state ignore updates until the stepout 4602 * threshold. After that, correct the phase and 4603 * frequency and switch to S_SYNC state. 4604 */ 4605 case FREQ: 4606 if (c.t - s.t < WATCH) 4607 return (IGNORE); 4609 freq = (offset - c.base - c.offset) / mu; 4610 break; 4612 /* 4613 * We get here by default in S_SYNC and S_SPIK states. 4614 * Here we compute the frequency update due to PLL and 4615 * FLL contributions. 4616 */ 4617 default: 4619 /* 4620 * The FLL and PLL frequency gain constants 4621 * depend on the poll interval and Allan 4622 * intercept. The FLL is not used below one-half 4623 * the Allan intercept. Above that the loop gain 4624 * increases in steps to 1 / AVG. 4625 */ 4626 if (LOG2D(s.poll) > ALLAN / 2) { 4627 etemp = FLL - s.poll; 4628 if (etemp < AVG) 4629 etemp = AVG; 4630 freq += (offset - c.offset) / (max(mu, 4631 ALLAN) * etemp); 4632 } 4634 /* 4635 * For the PLL the integration interval 4636 * (numerator) is the minimum of the update 4637 * interval and poll interval. This allows 4638 * oversampling, but not undersampling. 4639 */ 4640 etemp = min(mu, LOG2D(s.poll)); 4641 dtemp = 4 * PLL * LOG2D(s.poll); 4642 freq += offset * etemp / (dtemp * dtemp); 4643 break; 4644 } 4645 rstclock(SYNC, p->t, offset); 4646 } 4648 /* 4649 * Calculate the new frequency and frequency stability (wander). 4650 * Compute the clock wander as the RMS of exponentially weighted 4651 * frequency differences. This is not used directly, but can, 4652 * along withthe jitter, be a highly useful monitoring and 4653 * debugging tool 4654 */ 4655 freq += c.freq; 4656 c.freq = max(min(MAXFREQ, freq), -MAXFREQ); 4657 etemp = SQUARE(c.wander); 4658 dtemp = SQUARE(freq); 4659 c.wander = SQRT(etemp + (dtemp - etemp) / AVG); 4661 /* 4662 * Here we adjust the poll interval by comparing the current 4663 * offset with the clock jitter. If the offset is less than the 4664 * clock jitter times a constant, then the averaging interval is 4665 * increased, otherwise it is decreased. A bit of hysteresis 4666 * helps calm the dance. Works best using burst mode. 4667 */ 4668 if (fabs(c.offset) < PGATE * c.jitter) { 4669 c.count += s.poll; 4670 if (c.count > LIMIT) { 4671 c.count = LIMIT; 4672 if (s.poll < MAXPOLL) { 4673 c.count = 0; 4674 s.poll++; 4675 } 4676 } 4677 } else { 4678 c.count -= s.poll << 1; if (c.count < -LIMIT) { 4679 c.count = -LIMIT; 4680 if (s.poll > MINPOLL) { 4681 c.count = 0; 4682 s.poll--; 4683 } 4684 } 4685 } 4686 return (rval); 4687 } 4689 A.6.7. rstclock() 4691 /* 4692 * rstclock() - clock state machine 4693 */ 4694 void 4695 rstclock( 4696 int state, /* new state */ 4697 double offset, /* new offset */ 4698 double t /* new update time */ 4699 ) 4700 { 4701 /* 4702 * Enter new state and set state variables. Note we use the time 4703 * of the last clock filter sample, which must be earlier than 4704 * the current time. 4705 */ 4706 c.state = state; 4707 c.base = offset - c.offset; 4708 c.last = c.offset = offset; 4709 s.t = t; 4710 } 4712 A.7. Clock Adjust Process 4714 A.7.1. clock_adjust() 4716 /* 4717 * clock_adjust() - runs at one-second intervals 4718 */ 4719 void 4720 clock_adjust() { 4721 double dtemp; 4723 /* 4724 * Update the process time c.t. Also increase the dispersion 4725 * since the last update. In contrast to NTPv3, NTPv4 does not 4726 * declare unsynchronized after one day, since the dispersion 4727 * threshold serves this function. When the dispersion exceeds 4728 * MAXDIST (1 s), the server is considered unaccept for 4729 * synchroniztion. 4730 */ 4731 c.t++; 4732 s.rootdisp += PHI; 4734 /* 4735 * Implement the phase and frequency adjustments. The gain 4736 * factor (denominator) is not allowed to increase beyond the 4737 * Allan intercept. It doesn't make sense to average phase noise 4738 * beyond this point and it helps to damp residual offset at the 4739 * longer poll intervals. 4740 */ 4741 dtemp = c.offset / (PLL * min(LOG2D(s.poll), ALLAN)); 4742 c.offset -= dtemp; 4744 /* 4745 * This is the kernel adjust time function, usually implemented 4746 * by the Unix adjtime() system call. 4747 */ 4748 adjust_time(c.freq + dtemp); 4750 /* 4751 * Peer timer. Call the poll() routine when the poll timer 4752 * expires. 4753 */ 4754 while (/* all associations */ 0) { 4755 struct p *p;/* dummy peer structure pointer */ 4757 if (c.t >= p->next) 4758 poll(p); 4759 } 4761 /* 4762 * Once per hour write the clock frequency to a file 4763 */ 4764 if (c.t % 3600 == 3599) 4765 /* write c.freq to file */ 0; 4766 } 4768 A.8. Poll Process 4770 #include "ntp4.h" 4772 /* 4773 * Constants 4774 */ 4775 #define UNREACH 12 /* unreach counter threshold */ 4776 #define BCOUNT 8 /* packets in a burst */ 4777 #define BTIME 2 /* burst interval (s) */ 4779 A.8.1. poll() 4781 /* 4782 * poll() - determine when to send a packet for association p-> 4783 */ 4784 void 4785 poll( 4786 struct p *p /* peer structure pointer */ 4787 ) 4788 { 4789 int hpoll; 4790 int oreach; 4792 /* 4793 * This routine is called when the current time c.t catches up 4794 * to the next poll time p->next. The value p->last is 4795 * the last time this routine was executed. The poll_update() 4796 * routine determines the next execution time p->next. 4797 * 4798 * If broadcasting, just do it, but only if we are synchronized. 4799 */ 4800 hpoll = p->hpoll; 4801 if (p->mode == M_BCST) { 4802 p->last = c.t; 4803 if (s.p != NULL) 4804 peer_xmit(p); 4805 poll_update(p, hpoll); return; 4806 } 4807 if (p->burst == 0) { 4809 /* 4810 * We are not in a burst. Shift the reachability 4811 * register to the left. Hopefully, some time before the 4812 * next poll a packet will arrive and set the rightmost 4813 * bit. 4814 */ 4815 p->last = c.t; 4816 oreach = p->reach; 4817 p->reach << 1; 4818 if (!p->reach) { 4820 /* 4821 * The server is unreachable, so bump the 4822 * unreach counter. If the unreach threshold has 4823 * been reached, double the poll interval to 4824 * minimize wasted network traffic. 4825 */ 4826 if (p->flags & P_IBURST && p->unreach == 0) { 4827 p->burst = BCOUNT; 4828 } else if (p->unreach < UNREACH) 4829 p->unreach++; 4830 else 4831 hpoll++; 4832 p->unreach++; 4833 } else { 4835 /* 4836 * The server is reachable. However, if has not 4837 * been heard for three consecutive poll 4838 * intervals, stuff the clock register to 4839 * increase the peer dispersion. This makes old 4840 * servers less desirable and eventually boots 4841 * them off the island. 4842 */ 4843 p->unreach = 0; 4844 if (!(p->reach & 0x7)) 4845 clock_filter(p, 0, 0, MAXDISP); hpoll = s.poll; 4846 if (p->flags & P_BURST && accept(p)) 4847 p->burst = BCOUNT; 4848 } 4849 } else { 4851 /* 4852 * If in a burst, count it down. When the reply comes 4853 * back the clock_filter() routine will call 4854 * clock_select() to process the results of the burst. 4855 */ 4856 p->burst--; 4857 } 4859 /* 4860 * Do not transmit if in broadcast client mode. 4861 */ 4862 if (p->mode != M_BCLN) 4863 peer_xmit(p); 4864 poll_update(p, hpoll); 4865 } 4867 A.8.2. poll_update() 4868 /* 4869 * poll_update() - update the poll interval for association p 4870 * 4871 * Note: This routine is called by both the packet() and poll() routine. 4872 * Since the packet() routine is executed when a network packet arrives 4873 * and the poll() routine is executed as the result of timeout, a 4874 * potential race can occur, possibly causing an incorrect interval for 4875 * the next poll. This is considered so unlikely as to be negligible. 4876 */ 4877 void 4878 poll_update( 4879 struct p *p, /* peer structure pointer */ 4880 int hpoll /* poll interval (log2 s) */ 4881 ) 4882 { 4883 int poll; 4885 /* 4886 * This routine is called by both the poll() and packet() 4887 * routines to determine the next poll time. If within a burst 4888 * the poll interval is two seconds. Otherwise, it is the 4889 * minimum of the host poll interval and peer poll interval, but 4890 * not greater than MAXPOLL and not less than MINPOLL. The 4891 * design insures that a longer interval can be preempted by a 4892 * shorter one if required for rapid response. 4893 */ 4894 p->hpoll = min(MAXPOLL, max(MINPOLL, hpoll)); if (p->burst != 0) { 4895 if(c.t != p->next) 4896 return; 4898 p->next += BTIME; 4899 } else { 4900 poll = min(p->hpoll, max(MINPOLL, ppoll)); 4901 } 4902 /* 4903 * While not shown here, an implementation 4904 * SHOULD randomize the poll interval by a small factor. 4905 */ 4906 p->next = p->last + (1 << poll); 4907 } 4909 /* 4910 * It might happen that the due time has already passed. If so, 4911 * make it one second in the future. 4912 */ 4913 if (p->next <= c.t) 4914 p->next = c.t + 1; 4915 } 4917 A.8.3. peer_xmit() 4919 /* 4920 * transmit() - transmit a packet for association p 4921 */ 4922 void 4923 peer_xmit( 4924 struct p *p/* peer structure pointer */ 4925 ) 4926 { 4927 struct x x;/* transmit packet */ 4929 /* 4930 * Initialize header and transmit timestamp 4931 */ 4932 x.srcaddr = p->dstaddr; 4933 x.dstaddr = p->srcaddr; 4934 x.leap = s.leap; 4935 x.version = VERSION; 4936 x.mode = p->mode; 4937 if (s.stratum == MAXSTRAT) 4938 x.stratum = 0; 4939 else 4940 x.stratum = s.stratum; x.poll = p->hpoll; 4941 x.precision = s.precision; 4942 x.rootdelay = D2FP(s.rootdelay); x.rootdisp = D2FP(s.rootdisp); 4943 x.refid = s.refid; 4944 x.reftime = s.reftime; 4945 x.org = p->org; 4946 x.rec = p->rec; 4947 x.xmt = get_time(); 4948 p->xmt = x.xmt; 4950 /* 4951 * If the key ID is nonzero, send a valid MAC using the key ID 4952 * of the association and the key in the local key cache. If 4953 * something breaks, like a missing trusted key, don't send the 4954 * packet; just reset the association and stop until the problem 4955 * is fixed. 4956 */ 4957 if (p->keyid) 4958 if (/* p->keyid invalid */ 0) { 4959 clear(p, X_NKEY); 4960 return; 4961 } 4962 x.digest = md5(p->keyid); 4963 xmit_packet(&x); 4964 } 4966 Authors' Addresses 4968 Jack Burbank (editor) 4969 The Johns Hopkins University Applied Physics Laboratory 4970 11100 Johns Hopkins Road 4971 Laurel, MD 20723-6099 4972 US 4974 Phone: +1 443 778 7127 4975 Email: jack.burbank@jhuapl.edu 4977 William Kasch (editor) 4978 The Johns Hopkins University Applied Physics Laboratory 4979 11100 Johns Hopkins Road 4980 Laurel, MD 20723-6099 4981 US 4983 Phone: +1 443 778 7463 4984 Email: william.kasch@jhuapl.edu 4986 Jim Martin (editor) 4987 Netzwert AG 4988 An den Treptowers 1 4989 Berlin 12435 4990 Germany 4992 Phone: +49.30/5 900 80-1180 4993 Email: jim@netzwert.ag 4995 Dr. David L. Mills 4996 University of Delaware 4997 Newark, DE 19716 4998 US 5000 Phone: +1 302 831 8247 5001 Email: mills@udel.edu 5003 Full Copyright Statement 5005 Copyright (C) The IETF Trust (2007). 5007 This document is subject to the rights, licenses and restrictions 5008 contained in BCP 78, and except as set forth therein, the authors 5009 retain all their rights. 5011 This document and the information contained herein are provided on an 5012 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 5013 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 5014 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 5015 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 5016 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 5017 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 5019 Intellectual Property 5021 The IETF takes no position regarding the validity or scope of any 5022 Intellectual Property Rights or other rights that might be claimed to 5023 pertain to the implementation or use of the technology described in 5024 this document or the extent to which any license under such rights 5025 might or might not be available; nor does it represent that it has 5026 made any independent effort to identify any such rights. Information 5027 on the procedures with respect to rights in RFC documents can be 5028 found in BCP 78 and BCP 79. 5030 Copies of IPR disclosures made to the IETF Secretariat and any 5031 assurances of licenses to be made available, or the result of an 5032 attempt made to obtain a general license or permission for the use of 5033 such proprietary rights by implementers or users of this 5034 specification can be obtained from the IETF on-line IPR repository at 5035 http://www.ietf.org/ipr. 5037 The IETF invites any interested party to bring to its attention any 5038 copyrights, patents or patent applications, or other proprietary 5039 rights that may cover technology that may be required to implement 5040 this standard. Please address the information to the IETF at 5041 ietf-ipr@ietf.org. 5043 Acknowledgment 5045 Funding for the RFC Editor function is provided by the IETF 5046 Administrative Support Activity (IASA).