
Re: [dccp] dccp spec expert review (Minshall, main spec)



Greg,

Thanks very much for your thoughtful review!! We, and I personally, totally
appreciate it. Here's a response to your comments. Maybe we can get a
little back-and-forth going.

> (3.3)
>
>  "DCCP does not support TCP-style simultaneous open..."  Why is this?

To maintain the distinction between "client" and "server", with only
servers able to send CloseReq (and thus not having to keep TimeWait state).
More fundamentally, we decided that TCP-style simultaneous opens are not
important. Why would they be?

> "DCCP does not support half-open connections either.  That is, DCCP
> shuts down both half-connections as a unit. However, DCCP SHOULD allow
> applications to declare they are no longer interested in receiving
> data..."  I think the authors misunderstand TCP's concept of a
> half-open connection.  In TCP, a connection is said to be "half-open"
> if one of the endpoints has knowledge of the connection (in a state, I
> suppose, past SYNSENT and before TIMEWAIT) and the other endpoint has
> no knowledge of the connection.  This is an important concept to get
> right, and I'm not sure if DCCP supports it correctly.  The authors
> here appear to be discussing the effects of shutdown(2) with ``how''
> of 0 or 1.

Fair comment. You're right; the draft is talking about TCP's states
FINWAIT* and CLOSEWAIT, where one side declares that it is done sending
data, and the protocol enforces that choice. We didn't use the right
terminology (perhaps "half-closed"?).

> (4)
>
> "(8) The server sends a DCCP-Reset packet whose reason is set to
> "Closed", and clears its connection state."  I would suggest using
> reset for an "unclean" close, and use another mechanism for a clean
> close.

Why? Our rationale for Reset is that the receiver of Reset keeps TimeWait
state. That's what ties Reset(Closed) and Reset(other) together. Reserving
a separate packet type for this functionality is friendly to middleboxes
too.

> (4.1.1)
>
> "(3) ... and including a Confirm option finalizing the negotiation of
> the client-to-server CCID..."  Why is this?

I'm not sure I understand the question.

> "(5) ... ... The client's Ack Vector echoes the accumulated ECN Nonce
> for the server's packets."  Does this mean that the server doesn't use
> an Ack Vector, or that the server's Ack Vector does not echo the
> accumulated ECN Nonce for the client's packets?

Bug in the spec. We say that the discussion concerns a whole connection,
but really it focuses almost entirely on the server-to-client
half-connection.

> "(7) ... The server responds to lost or marked DCCP-Ack packets by
> modifying the ACK Ratio sent to the client..."  I assume the reason
> for this is to have "flow control for ACKs".  However, I think the
> solution of reducing the number of ACKs per data packet may complicate
> simple DCCP implementations, which would have to deal with one ACK per
> every 20 or 100 data packets, which makes the "clocking" harder.  At
> the least, I would think that an option should be for the server to
> reduce the rate at which it is sending data packets (which would
> implicitly slow down the rate at which ACK packets are being inserted
> into the network).
> 
> In general, I'm uneasy about the "ACK Ratio" feature of DCCP.  There
> is today strong disagreement among experts about whether one ACK per
> data packet, 0.5 ACK per data packet, 0.1 ACK per data packet (in this
> last case, qualified with the "both endpoints on the same local
> LAN").  I think giving control to application designers is likely to
> be a mistake.  The protocol designers and/or relevant standardization
> body should come up with the desired ratio, and system implementors
> should use that number, without any negotiating between the two sides.

Ack Ratio doesn't give the _application_ control of anything; Ack Ratio is
set by the relevant CCID, which is part of the DCCP kernel implementation.
The negotiation here is between DCCPs, not between applications.

But yes, Ack Ratio is not without conceptual risk. Maybe we need to do some
simulations here.

> (5.1)
>
> I will note, though I'm sure this was discussed and dismissed, that
> the source port, destination port, and random ISS could be replaced
> with a random connection ID, and have the sequence number start with
> zero.

Something like this was discussed -- we talked about having Service Name
replace Destination Port for DCCP-Requests. But that would prevent running
multiple instantiations of the same service on the same box (like an
experimental Web server on port 8000 for example). I think it's better to
stay with ports.

> "Despite this, we leave the design of mechanisms to protect against
> wrapped sequence numbers for future work.  In particular, if it is
> decided that very large packet sizes are better than very large
> congestion windows for very-high-bandwidth flows, then 24 bits may be
> enough."  This seems wrong to me.  I think of jumbograms as being a
> mistake that people will always fall back on (just like people fall
> back on only ACKing one out of every ten packets "on a LAN").  I don't
> think we should build a mechanism that even slightly promotes the use
> of jumbograms.  24 bits is probably too few bits for the sequence
> number field.

Reasonable people do disagree on jumbograms [if Mark is reasonable :)]. 

But this may be a problem with the rationale, not the sequence number
length. DCCP is designed for connections with congestion windows of at most
a few hundred thousand packets; that is its design space. Sally, in
particular, strongly believes that we don't know how to do congestion
control with larger cwnds, and that entirely different mechanisms might be
required. If at most 2^18 or 2^19 packets will be in flight during a given
RTT, that still leaves 32-64 RTTs before sequence numbers wrap; and this
might be sufficient for now.
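
As a quick check of that arithmetic, here is a small sketch. The 24-bit
field width and the 2^18 and 2^19 in-flight figures are the ones mentioned
above; the function name is only illustrative.

```python
SEQ_BITS = 24  # sequence number field width under discussion


def rtts_before_wrap(packets_per_rtt, seq_bits=SEQ_BITS):
    """Full round-trip times that fit in the sequence space at this load."""
    return (1 << seq_bits) // packets_per_rtt


print(rtts_before_wrap(1 << 19))  # 32
print(rtts_before_wrap(1 << 18))  # 64
```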

Nevertheless, your comments, and Eric's, make me think that we should
specify some wrap mechanism now: for instance, an option or feature that
announces the wrapping of sequence numbers. (In other words, we explicitly
transmit the lower 24 bits of each sequence number, but the endpoints keep
32 or 64 bits or whatever, with the upper bits negotiated once the
connection approaches sequence number wrap.)
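
To make the "transmit 24 bits, keep more" idea concrete, here is a rough
sketch of how an endpoint could recover a wider sequence number from the
transmitted lower 24 bits. Note that this is NOT the negotiated-upper-bits
scheme suggested above: it assumes the endpoint infers the upper bits by
picking the value closest to the last full sequence number it accepted.
The names (extend_seqno, last_full) are hypothetical.

```python
SEQ_BITS = 24
SEQ_MOD = 1 << SEQ_BITS


def extend_seqno(low24, last_full):
    """Return the full sequence number nearest last_full whose low 24 bits are low24.

    Assumption: the true value lies within half the 24-bit space of last_full.
    """
    base = last_full - (last_full % SEQ_MOD)
    candidates = (base - SEQ_MOD + low24, base + low24, base + SEQ_MOD + low24)
    return min((c for c in candidates if c >= 0),
               key=lambda c: abs(c - last_full))


# Just before a wrap: low bits 5 must belong to the next 2^24 block.
print(extend_seqno(5, 0xFFFFF0) == SEQ_MOD + 5)
# Just after a wrap: a late packet from the previous block is still placed correctly.
print(extend_seqno(0xFFFFFE, SEQ_MOD + 3) == 0xFFFFFE)
```

Implicit inference like this only works while fewer than 2^23 packets
separate the two values, which is one argument for an explicit option.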

> "Values between 1 and 14, inclusive, indicate that the checksum
> additionally covers that number of initial 32-bit words of the
> packet's payload, padded to the right with zeros as necessary."  I
> would reword this: "the checksum covers the DCCP header, DCCP options,
> pseudo-header, plus an initial number of 32-bit words..."

Yep.

> (5.2)
>
> "One good guideline is to set it to about 3 or 4 times the maximum
> number of packets the sender expects to send in any round trip time."
> I don't know how the sender, at the beginning of a connection, is
> supposed to estimate the RTT.  I would suggest simply deciding on and
> stating either one half or one third of the sequence number space.

The feature value can be renegotiated as the connection progresses, when
the sender has a better idea of the RTT -- that's why it's a feature.

Also note that Loss Window is a "non-negotiable" feature, which means the
sender announces Loss Window and the receiver must agree to it.

> "The receiver may center the loss window on GSN, or arrange it
> asymmetrically."  A problem with this flexibility is that network
> management tools (looking at a set of captured packets) will have a
> hard time understanding which packets will, and which packets won't,
> be accepted by the receiver.

Ouch. Should we specify exactly how the Loss Window is to be arranged?

> "(1) No valid packet has been received recently (for instance, within
> at least one round-trip time)."  Going quiet for one round trip time
> is not unusual.  In fact, the network delaying a flight of packets by
> one round trip time is not that unusual.  If the quoted sentence means
> that after one round trip time of quiet, any packet will be accepted,
> no matter what the sequence number, this would be a problem.  The
> concept Maximum Segment Lifetime (MSL) is defined in TCP as the
> longest a segment may exist in the network before being presented to a
> receiving host.  MSL is typically (though conservatively) set at one
> or two minutes.

It doesn't mean that after 1 RTT of quiet any invalid-seqno packet would be
accepted, because the 2nd condition (ID/Challenge) must also hold.

> "(2) The packet includes a correct Identification or Challenge option
> (see Section 6.4.3)."  This should not be required (for example, to
> deal with half-open connections).

I disagree. Furthermore, half-open connections are not relevant here (unless
I'm missing something). The section describes what to do with a packet
_belonging to an existing connection_ whose sequence number is invalid for
that connection. The "closed" side of the half-open connection will
properly ignore all packets but Requests.

> (5.3)
>
> The action table which is included has some entries marked "Old/New",
> but other entries are not so marked.  I think it might be clearer if
> all the entries were so marked.

OK.

> In the "Request" column of the "RESPOND" row, the entry is marked
> "-/OK".  Should this be "RST/OK"?

Probably not, in case the requester sent 2 requests and they were reordered
in the network.

> I didn't understand Note 1 to the action table.

This might require a forward reference to Init Cookie. Init Cookie lets the
server respond to a request *while remaining in the Listen state*. The Init
Cookie wraps up all relevant state about the connection; if the client
repeats the Init Cookie (necessarily using a DCCP-Data or DCCP-DataAck or
DCCP-Ack packet), the server will build the connection using that state.
This is a transition directly from LISTEN to OPEN.

> I think a number of the resets in the state diagram should be
> evaluated once DCCP supports cleaning up half-open connections.  (In
> TCP, the side which is "open" sends an ACK to the "closed" side, which
> elicits a reset.  The intent, I believe, is to prevent spurious
> resets.)

I don't think I understand the problem. If I send packets to a connection
that doesn't exist, should I not get a reset back? Also, note that the
"Rst" action is specified as "the receiver SHOULD either ignore the packet
or respond with a (rate-limited) Reset".

> "The Open state does not signify that a DCCP connection is ready for
> data transfer."  I find this unfortunate.

... Should we fix it? I think it's an advantage that the state diagram
proceeds in parallel with feature negotiation states -- in particular
because different CCIDs might have different rules about when a connection
is ready.

> (5.5)
>
> "All valid packets received by a DCCP stack MUST be acknowledged as
> 'received', even if their payloads were dropped (due to receive buffer
> overflow or payload corruption, for example)."  This seems like a
> mistake.  First off, it assumes the term "received" (in "all valid
> packets received") is well specified, which it isn't.  Second,
> congestion "towards" the application (because the application's
> "socket buffer" is full, say) is congestion.  An old streams bug with
> TCP was to do the TCP processing, then drop the packet because the
> application buffer was full.

Congestion in the receiving node does not always merit the same heavy-duty
response as network congestion. Our intent is to tease apart the two. In
the absence of network congestion, for example, it does the network no harm
to retransmit packets dropped in the receiving node. We provide a mandatory
mechanism, namely Data Dropped, to inform the sending node that packets
were dropped at the receiver node.

Perhaps we need to explicitly specify how the sender should react to Data
Dropped packets. Something like "The sending congestion control mechanism
MUST react to a packet whose payload was dropped, as reported by Data
Dropped, as if the packet was marked in the network, unless otherwise
explicitly specified." And then we could add explicit text for the various
Data Dropped states.

> "Each new DCCP-Response MUST increment the Sequence Number, and
> possibly #NDP, by one."  Which sequence number?  That of the
> responder?  What if seq0 is sent, delayed, then Seq1 is sent and
> arrives.  When seq0 arrives at the Requester, the action table of
> section 5.3 implies that a Rst would occur.

Yes, that of the responder. No -- the table's entry for "OPEN (client) /
Response" says "-/Rst", so old Responses will be ignored. ("client" =
requester.)

> (5.6)
>
> By not having an ACK on every packet, there is less possibility of
> error/validity/timeliness checking than is possible in TCP.

We would agree that there is a performance penalty to having Ack Ratios
greater than 2. For that reason, an Ack Ratio greater than 2 is only used
when necessary as a response to congestion on the reverse path.

> "A DCCP-Data or DCCP-DataAck packet may contain no data bytes if the
> application sends a zero-length datagram.  Such zero-length datagrams
> MUST be reported to the receiving application."  I can see some
> utility in having this facility (framing, sending an EOF, etc.).  But,
> I can imagine that some systems would have trouble implementing this
> requirement (some systems might have to re-work their buffer
> management code, the API between applications and the kernel, etc.).

OK. MUST -> SHOULD?

> "The receiver of a valid DCCP-Close packet SHOULD respond with a
> DCCP-Reset packet, with Reason set to "Closed"..."  Why not respond
> with an ACK?

Because of the semantics of Reset: "you must now hold TimeWait state". The
Reset also signals dumber middleboxes that the connection is over.

> (5.9)
>
> "DCCP B SHOULD respond to the DCCP-Move with a DCCP-Reset (with Reason
> set to "Invalid Move") if any of the following conditions holds..."
> Would "silently ignore" be more secure?  By sending the DCCP-Reset,
> system A is telling system C ("far away" from A and B) something about the
> state of the connection between A and B.

You read the -02 version of the draft :) We adopted exactly this solution
in -03.

> "After receiving such an invalid DCCP-Move, DCCP B MAY ignore
> subsequent DCCP-Move packets, valid or not, for a short period of
> time, such as one round trip time."  Which round trip time?  Also, do
> DCCP *receivers* track round trip time?  (I don't believe TCP
> receivers track round trip time.)

We could amend it to ", such as one round trip time, or one second".

(In CCID 3, there is a 4-bit window counter value incremented
every quarter of a RTT, so the CCID3 receiver has some sense of
the RTT.  In CCID 2, there are ack of acks, so the CCID 2 receiver
could come up with a rough estimate/upper bound on the RTT if
it was absolutely necessary.)

> "If DCCP B accepts the move, it MUST send this acknowledgment to the
> network address/Source Port combination."  I believe the intent of
> this is to say "to the network address and source port contained in
> the received datagram", but I'm not sure.  It should be re-worded to
> be more clear.

Fair.

> "... if it rejects the move, which it MAY do for any reason, it MUST
> send the acknowledgment to the Old Address/Old Port combination."
> Does this mean that if attacker C sends an invalid DCCP-Move to A
> purporting to be from B, A will then reset the connection with B?

No, A sends an *acknowledgement*, not a reset. The "rejection" of the move
is not equivalent to a reset. A valid Mover would detect that the move was
rejected because the ack was delivered to the old address. The draft needs
a bit of clarification here. (The paragraph does begin "DCCP B SHOULD
respond to a valid DCCP-Move packet with a DCCP-Ack or DCCP-DataAck packet
acknowledging the move".)

> "If the acknowledgment is lost, DCCP A might resend the DCCP-Move
> packet (using a new sequence number).  DCCP B will detect this case
> because the network address/Source port combination corresponds to a
> valid connection, for which the ...  It SHOULD respond by sending
> another acknowledgment, as allowed by the congestion control
> mechanism in use."  This seems very baroque to me.

Do you believe it is unnecessarily baroque? The situation is as follows.
Say A and B have a DCCP connection open, and A is trying to move to A'.

             B has state for the A<->B connection

   A' > B:   DCCP-Move(Old Address/Port = A, Seqno = 10)

             B looks up the Old Address/Port (A), checks out the Move,
             accepts it, and switches its state to an A'<->B connection

   B > A':   DCCP-Ack(Ackno = 10)

             The Ack is lost

   A' > B:   DCCP-Move(Old Address/Port = A, Seqno = 11)   [retransmission]

Now what should B do? The Move is not, on its face, valid, because there is
no connection corresponding to Old Address/Port A. So unless we explicitly
defined how to handle this case, B would ignore the retransmitted Move, and
A' might never find out that the Move had succeeded.

Here are a couple of alternative mechanisms.

  1. DCCP B must retransmit its Ack until it hears back from A'.
  2. DCCP B must keep state for the connection under *both* A and A' for
     a while. After five seconds (say) the entry for A will disappear.

I suppose 1. might be acceptable -- although we'd have to specify some
retransmit timer, etc. But is it significantly less baroque in the end?
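
For comparison, alternative 2 could look roughly like the sketch below.
Everything here is hypothetical (the MoveTable class, its method names, the
string return values); the five-second grace period is the "say" figure
from the list above, and validation of the Move itself (sequence number,
Identification, and so on) is deliberately left out.

```python
GRACE_SECONDS = 5.0  # "five seconds (say)" from alternative 2


class MoveTable:
    """Toy connection table for DCCP B that remembers recently moved endpoints."""

    def __init__(self):
        self.connections = {}  # current peer endpoint -> connection record
        self.moved_from = {}   # old endpoint -> (new endpoint, expiry time)

    def open(self, endpoint):
        self.connections[endpoint] = {"peer": endpoint}

    def handle_move(self, src, old, now):
        """Return 'ack' if B should acknowledge to src, otherwise 'ignore'."""
        self._expire(now)
        if old in self.connections:
            # First copy of the Move: rebind the connection to the new endpoint
            # and keep a short-lived entry under the old one.
            conn = self.connections.pop(old)
            conn["peer"] = src
            self.connections[src] = conn
            self.moved_from[old] = (src, now + GRACE_SECONDS)
            return "ack"
        alias = self.moved_from.get(old)
        if alias is not None and alias[0] == src:
            return "ack"  # retransmitted Move: the earlier Ack was probably lost
        return "ignore"

    def _expire(self, now):
        expired = [o for o, (_, t) in self.moved_from.items() if t <= now]
        for old in expired:
            del self.moved_from[old]


table = MoveTable()
table.open("A")
print(table.handle_move("A'", "A", now=0.0))   # ack
print(table.handle_move("A'", "A", now=1.0))   # ack (retransmission)
print(table.handle_move("A'", "A", now=10.0))  # ignore (entry for A expired)
```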

> (6.2)
>
> The problem I have is that Ignored is not guaranteed delivery.  So, if
> A would have RST on seeing an Ignored, a lost Ignored may have A
> continue, with subsequent bad performance/semantics.  My sense is that
> even though data transfer does not need to be reliable in DCCP,
> session management does need to have reliability.

We agree that session management needs to have reliability, which is why we
use the feature negotiation mechanism for most session management. So, for
the options that matter (which are mostly features), A would retransmit the
option until it got some response. Say that A sends "Change(FOO)" to B on
packet 2. B's first ack of packet 2 (say, ack #3) would contain either
"Ignored(Change(FOO))" or "Prefer(FOO)" or "Confirm(FOO)" -- but A cannot
know which, if ack #3 is dropped. Even if ack #4 arrived, confirming that
packet #2 was received, A will still retransmit "Change(FOO)", because it
didn't get the explicit Ignored/Prefer/Confirm response.
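
A minimal sketch of that sender-side rule, assuming options are modeled as
(name, feature) pairs. The ChangeSender class and its methods are made up
for illustration and are not taken from the draft.

```python
class ChangeSender:
    """Keeps attaching Change(feature) until an explicit response arrives."""

    RESPONSES = {"Ignored", "Prefer", "Confirm"}

    def __init__(self, feature):
        self.feature = feature
        self.pending = True

    def options_for_next_packet(self):
        # A retransmitted Change rides on a new packet with a new sequence number.
        return [("Change", self.feature)] if self.pending else []

    def on_packet(self, options):
        # A bare acknowledgment carries no relevant option, so it changes nothing.
        for name, feature in options:
            if name in self.RESPONSES and feature == self.feature:
                self.pending = False


a = ChangeSender("FOO")
a.on_packet([])                        # ack #4 arrives without a response option
print(a.options_for_next_packet())     # [('Change', 'FOO')]
a.on_packet([("Confirm", "FOO")])
print(a.options_for_next_packet())     # []
```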

> (6.3.2)
>
> "DCCP A SHOULD retransmit the Change option until it receives some
> relevant response."  How often should it retransmit?

We should probably specify this a little bit more. I believe it should
retransmit at most as often as allowed by the congestion control mechanism
in use. (Maybe we don't need to specify any more than that.)

> Is it the case that Change (and others) do *not* consume sequence
> number space?

That is true, since sequence numbers are attached to packets, not data
bytes or options or any combination. But a packet containing a
"retransmitted" Change would end up using a new sequence number.

> (6.3.3)
>
> "DCCP B SHOULD respond to a valid Prefer option..."  The last sentence
> of 6.3.2 implies "MUST" rather than "SHOULD".

Fair; we will change 6.3.2.

> (6.3.5)
> 
> The second example ("Here, A and B jointly settle on CC mechanism 5")
> seems contrived, in that I would think that B would have started with
> "Change(CC, 3, 4, 5)" if it understood 5.

Probably, but I think it's important to explicitly demonstrate that this is
an allowed possibility. (Unless we want to disallow it.)

> (6.3.7)
> 
> "... rather, it says which feature option must be sent on the next
> packet generated."  What if the next packet is already as large as
> possible?

"must" should be "SHOULD" here. And there's always the possibility of
sending an extra Ack, as allowed by the CC mechanism in use. We should
mention this.

> (REQUESTER STATE DIAGRAM (DCCP B))
> 
> In the Unknown state, "RECV - Pr | APP", why "-"?

Because the requester may want to escape the Unknown state, independent of
the application or the feature location, in which case it sends a Change.

> I don't see any retransmissions.

They're a little hidden. The "RECV -" transition in "Changing" state allows
for retransmissions.

> I don't know how sequence numbers are interpreted on received packets,
> i.e., what to do with "in the window" and "out of the window" packets.

The diagrams deal with valid packets, so any invalid "out of the window"
packets have already been rejected. There is an issue with old packets.

> (FEATURE LOCATION STATE DIAGRAM (DCCP A))
> 
> In the Unknown state, "RECV - | APP"  What does the "RECV -" mean?

See above.

> In the Known state, "RECV Chg | APP" should be "RECV Chg (X) | APP".

Yep.

> (6.4.2)
> 
> I'm not sure how "Connection Nonce defaults to a random 8-byte string"
> can work.  If B's Connection Nonce feature is set to a different
> random 8-byte string than A's connection nonce, how does that work?

It means the two sides cannot send valid Identification options until they
negotiate the Connection Nonce feature or otherwise trade Connection Nonce
values. This is a feature, not a bug.

> (6.7)
> 
> I believe one still needs to specify "network byte order" when
> discussing quantities larger than a single byte.

OK.

> "Elapsed time is meant to help the Timestamp sender separate the
> network round-trip time from the Timestamp receiver's processing time.
> This may be particularly important for CCIDs where acknowledgments
> are sent infrequently, so that there might be a considerable delay
> between receiving a Timestamp option and sending the corresponding
> Timestamp Echo."  I think asking a system to put the actual elapsed
> time will not work.  I think the best we might be able to do is put an
> *estimated* elapsed time.  Different systems might put in different
> values: half the period at which ACKs are sent, or the maximum, or the
> amount of time left on the ACK timer for this connection).

"A lower bound on the elapsed time"?
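
One reason a lower bound might be enough: if the reported elapsed time
never exceeds the time the receiver really held the timestamp, subtracting
it can only overestimate the network round-trip time, never underestimate
it. A sketch, with all times assumed to be in the same units and the
function name invented here:

```python
def rtt_upper_bound(now, echoed_timestamp, elapsed_lower_bound):
    """Upper bound on the network RTT seen by the Timestamp sender.

    Assumes elapsed_lower_bound <= the receiver's true hold time.
    """
    return now - echoed_timestamp - elapsed_lower_bound


# Timestamp sent at 900, echo received at 1000, receiver reports >= 30 held.
print(rtt_upper_bound(1000, 900, 30))  # 70
```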

> (6.9)
> 
> A diagram of the loss window would be useful.

OK.

> (7)
> 
> "A new connection starts with CCID 2 for both DCCPs.  If this is
> unacceptable for either DCCP, that DCCP will start in the Unknown
> state."  This seems problematic, in that A thinks B is in 2, but B is
> in Unknown and isn't about to tell A that it is in Unknown.

It will tell A that it's in Unknown, by sending a Prefer. This is the "RECV
-" state transition from Unknown that you asked about. Probably need to
find a better way to describe this.

> "A DCCP SHOULD NOT send data when its Congestion Control feature is in
> the Unknown state."  What about cases where the sending of *control*
> information is limited by the CCID?

This is a bit problematic. We should make sure that Ack storms cannot occur
when the CCID is unknown.

> (7.1)
> 
> "For example, say that CCID 98, a new sender-based congestion control
> mechanism using Ack Vector for acknowledgments, has entered the IETF
> standards process, and the IETF has approved the use of CCID 1 as a
> backup for CCID 98.  Now, DCCP A, which understands and would like to
> use CCID 98, is trying to communicate with DCCP B, which doesn't yet
> know about CCID 98.  DCCP A can simply negotiate use of CCID 1 and,
> separately, negotiate Use Ack Vector.  DCCP B will provide the
> feedback DCCP A requires for CCID 98, namely Ack Vector, without
> needing to understand the congestion control mechanism in use."  This
> sounds very good.  But, there are two issues.  First, what control
> information sender and receiver agree to send each other on the wire.

This is separately negotiated, as described; Use Ack Vector, for example.

> Second, how the sender and receiver expect each other to act based on
> the control information exchanged.  It isn't clear to me that if the
> two systems have different expectations about what will happen there
> won't be trouble.  I understand the argument, "in that case, the IETF
> would not have approved the use of ...".  However, my concern is that
> this is a mechanism that makes the protocol more complicated (as all
> mechanisms do) but that may not have any *practical* utility in the
> future.

Hm. Well, the meaning of Ack Vector, for example, is CCID-independent; it
says what arrived and what didn't. Furthermore, the receiver knows that
CCID 1 is in use, and it knows what that means: it cannot expect any
specific congestion-related behavior from the sender (except, for now,
long-term TCP friendliness). We should talk about what the possible
problems are (what might happen if the two systems had "different
expectations about what will happen").

> (7.4)
> 
> Maybe in Gordon Bell's assessment of the PDP-10 (-11?) architecture,
> there is the point that the main mistake an architect can make is not
> having enough address bits.  I worry about the partitioning of
> CCID-specific options into ones that apply to HC-Sender,
> HC-Receiver, etc.  I would suggest instead having a second byte that
> would specify this information.

There are 64 options and 64 features per direction; this really seems like
enough. If a CCID really needed more, then it could define option/feature
191 (say) so that the first byte/two bytes/four bytes of option/feature
data specified the true option/feature number.
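
A sketch of what such an escape could look like on the receiving side,
assuming the one-byte variant and the example number 191 from above. The
function name and the (number, payload) return shape are illustrative
only.

```python
ESCAPE_OPTION = 191  # the "(say)" value from the paragraph above


def decode_ccid_option(option_type, data):
    """Return (true option number, remaining payload) for a CCID-specific option."""
    if option_type == ESCAPE_OPTION:
        if not data:
            raise ValueError("escape option carries no extended number")
        # Assumed layout: first data byte is the true option number.
        return data[0], data[1:]
    return option_type, data


print(decode_ccid_option(130, b"\x07"))                # (130, b'\x07')
print(decode_ccid_option(191, bytes([200, 1, 2])))     # (200, b'\x01\x02')
```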

> (8.1)
> 
> "However, note that acks-of-acks need not be reliable themselves:
> when an ack-of-acks is lost, the HC-Receiver will simply maintain
> [(and retransmit)] old acknowledgment state for a little longer."

Yep.

> "For instance, DCCP A might send a DCCP-DataAck packet every now and
> then, instead of DCCP-Data."  Why not send DCCP-DataAck every time
> there is Ack data to transmit?

4 bytes savings (at least: additional ack options might push it higher).
And there is almost always Ack data to transmit: all packets, including
DCCP B's Acks, take sequence numbers. No big deal though; I agree with you
in practice.

> "DCCP A switches its ack pattern from bidirectional to unidirectional
> when it notices that DCCP B has gone quiescent.  It switches from
> unidirectional to bidirectional when it must acknowledge even a single
> DCCP-Data or DCCP-DataAck packet from DCCP B..."  Isn't it the case
> that if DCCP A always sent DCCP-DataAck (when it had data and ACKs to
> send), it wouldn't have to "notice" these transitions?

I don't think so. For instance, DCCP A might think that the B-to-A
half-connection was quiescent just because DCCP B's user wasn't typing fast
enough to merit sending a DataAck each time. I don't want the connection to
flip back and forth too much between quiescent and nonquiescent, thus the
hysteresis in the switch from bidi to unidi.

> (8.4)
> 
> "... although DCCP B MAY send Ack Vector options even when Use Ack
> Vector is false."  Why is this?  It seems counterintuitive and could
> cause problems.

What problems? DCCP A can just ignore the option.

> (8.5)
> 
> "Packets reported as State 0 or State 1 ...  And data on the packet
> need not have been delivered to the receiving application; in fact,
> the data may have been dropped."  I'll repeat that this seems like a
> mistake to me.  I would say "if the data cannot be delivered to the
> application, the packet should be silently discarded (or, "ECN
> discarded" would be fine, too).

Again, we are trying to allow the protocol to distinguish between network
drops and other drops. This is new, but, I think, valuable. If you continue
to disagree, we should pop this question out into its own thread. At
minimum, we should motivate this more clearly in the protocol documents.

> (8.5.1)
> 
> If the Ack of packet 24 (showing it to be State 0) has been acked,
> then a duplicate of packet 24 is received marked ECN, does a new Ack of
> packet 24 need to be generated?

This is defined in the 2nd paragraph after the table:

"But what state should the HC-Receiver report in Ack Vector if two
duplicates are received for a packet, and only one is ECN marked? We
explicitly allow the HC-Receiver to report the combination as State 0
(received non-marked) or State 1. After all, one duplicate was non-marked,
and depending on how much state the HC-Receiver keeps about packets it
receives, it might be impossible to change a packet from State 0 to State 1
and preserve correct ECN Nonce Echo information."

So, no, it does not need to be generated, but it may be generated. Probably
we should make a new table for the receiver case, partially replacing the
current paragraph, which describes a table in words.

> In the "old state"/"received state" table, I would suggest that the top
> row be set to "0 0 0" (rather than "0 1 0").  This would mean that the
> state is not changed from State 0 or from State 1, but only from State
> 3.  Yes, this means a bit of information may be lost, but simplicity
> probably trumps absolute efficiency in this case.

That would lose the reordering property: "The table is symmetric about the
main diagonal, so it is indifferent to ack reordering." This is perhaps
important for the receiver to verify that the sender's CC mechanism is
acting correctly. If the receiver sends acks stating both State 0 and State
1 for a packet, both of which the sender acknowledged, then the receiver
should know the states the sender uses for every acknowledged packet.
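
To illustrate the reordering point, here is a small sketch. Only the top
row "0 1 0" and the symmetry claim come from the discussion above; the
remaining entries are assumptions filled in so that the table is symmetric
(State 1 wins over State 0, which wins over State 3). The names DRAFT,
PROPOSED, and merge are hypothetical.

```python
STATES = (0, 1, 3)

# table[old_state][received_state] -> new state.
# Top row is the quoted "0 1 0"; the other rows are assumed.
DRAFT = {
    0: {0: 0, 1: 1, 3: 0},
    1: {0: 1, 1: 1, 3: 1},
    3: {0: 0, 1: 1, 3: 3},
}
# Same table with the suggested "0 0 0" top row.
PROPOSED = {**DRAFT, 0: {0: 0, 1: 0, 3: 0}}


def is_symmetric(table):
    """True if the table equals its transpose over the three states."""
    return all(table[a][b] == table[b][a] for a in STATES for b in STATES)


def merge(table, reports):
    """Fold a sequence of reported states for one packet, in arrival order."""
    state = reports[0]
    for received in reports[1:]:
        state = table[state][received]
    return state


print(is_symmetric(DRAFT), is_symmetric(PROPOSED))       # True False
print(merge(DRAFT, [0, 1]), merge(DRAFT, [1, 0]))        # 1 1
print(merge(PROPOSED, [0, 1]), merge(PROPOSED, [1, 0]))  # 0 1
```

With the "0 0 0" row, two acks for the same packet give different results
depending on which one arrives first, which is the property we would lose.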

> (8.5.2)
> 
> "The union of groups 2 and 3 is called the Unacknowledged Window."
> This is the HC-Receiver's point of view.  From the HC-Sender's point
> of view, the union of groups 2, 3, and 4 is what is unacknowledged.

We'll explicitly state this.

> (8.6)
> 
> I don't think the "Slow Receiver Option" is a good idea.  I don't see
> any need to invent a new mechanism to indicate "congestion" in the
> path to the application.  Packets that arrive should be dropped
> silently (or, with ECN).

We disagree, as stated above.

> "Slow Receiver implements a portion of TCP's receive window
> functionality.  We believe receiver operating systems and applications
> will find it much easier to send Slow Receiver when appropriate than
> they currently find it to correctly set a TCP receive window."  To me,
> the receive window portion of TCP is a way of trying to get N bytes in
> flight.  The "guarantee" of being able to, in fact, accept all those N
> bytes is at the level of statistical multiplexing: most likely the
> receiving TCP will accept all N bytes, but it is possible it will
> not.  As long as in a "best effort" sense this breaking of the
> "guarantee" doesn't happen "too often", it is okay.
> 
> (I seem to remember at the SIGCOMM in 1988, Dave Cheriton asking Van
> Jacobson if his work ultimately eliminated the utility of the receive
> window.  I don't know of any subsequent work looking at this issue,
> though it would be interesting.)

Hmm.

> (8.7)
> 
> Again, I do not see a strong case for this option.

Discussed above.

> "Drop State 4 ("application no longer listening") means the
> application running at the endpoint that sent the option is no longer
> listening for data. ..."  Why not simply reset the connection if data
> arrives when the receiving application is not expecting any data?

Because the receiving application might still be sending data.

> (8.8)
> 
> I don't see the checksum option as being particularly useful.  I know
> this is an open issue currently, but I wouldn't want to rely on the
> ones complement checksum in the case where the link level CRC
> indicates a corrupted packet.  I see checksum less than 15 as being a
> CPU optimization (don't waste cycles computing checksum on payload).

We have seen arguments on the list advocating for this option. It is
experimental.

> (8.9)
> 
> This might be a separate document, or perhaps an appendix.

Was it not useful to help explain the Ack Vector?

> (8.9.4)
> 
> "... a single acknowledgment number tells HC-Receiver how much ack
> information has arrived."  I thought HC-Sender needed to maintain a
> vector of received ACKs and send this vector to HC-Receiver every now
> and then.

It doesn't have to be a vector. That's the difference between bidirectional
connections and unidirectional connections. Depending on the CC in use, 

> (9.1)
> 
> "The ECN Capable feature lets a DCCP inform its partner that it cannot
> read ECN bits from received IP headers, so the partner must not set
> ECN-Capable Transport on its packets."  Why support this?  Why not
> just require ECN?

Dumb firewalls. So far, there aren't any dumb firewalls that we know about
that block packets due to the ECN field in the IP header, but the argument
in RFC 3360, Inappropriate TCP Resets Considered Harmful, is that transport
protocols now have to be designed to protect themselves against such
possibilities.

> (9.2)
> 
> I think ECN Nonces are a clever idea, but I'm not sure they are worth
> the complication in the protocol.  Both the sender and the receiver
> are at least somewhat motivated to ignore congestion control, and this
> doesn't help any malfeasance on the part of the sender.  I suspect you
> cannot use technical means to police "good behavior" in this case.

Penalty [middle]boxes could use the ECN nonce to prevent even colluding
endpoints from achieving more than their fair share.

Also, the sender is typically a server with many active connections, and
its motivation is generally to use end-to-end congestion control, while the
receiver often has a much stronger motivation to evade it.
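
To make the receiver-side check concrete, here is a minimal Python sketch of
the nonce idea. It is an illustration only, not the draft's wire format: the
function names and the dictionary of per-packet nonces are hypothetical. The
sender attaches one random bit to each packet, a congestion mark erases that
bit, and the receiver must report the XOR of the bits it saw.

```python
import random

def assign_nonces(seqnos, rng):
    """Sender side: pick one random nonce bit per packet."""
    return {seq: rng.getrandbits(1) for seq in seqnos}

def nonce_echo(bits):
    """One-bit sum (XOR) of the given nonce bits."""
    echo = 0
    for bit in bits:
        echo ^= bit
    return echo

def sender_accepts(sent_nonces, reported_unmarked, reported_echo):
    """Sender side: recompute the echo over the packets the receiver claims
    arrived unmarked and compare it with the echo the receiver reported."""
    expected = nonce_echo(sent_nonces[seq] for seq in reported_unmarked)
    return expected == reported_echo

rng = random.Random(1)
sent = assign_nonces(range(1, 9), rng)
marked = 5  # assumption for the example: a router marked packet 5

# An honest receiver reports only the packets whose nonces it actually saw.
honest_seen = [seq for seq in sent if seq != marked]
known = nonce_echo(sent[seq] for seq in honest_seen)
honest_ok = sender_accepts(sent, honest_seen, known)

# A receiver hiding the mark claims all packets and must guess the erased bit.
cheat_results = [sender_accepts(sent, list(sent), known ^ guess)
                 for guess in (0, 1)]
```

Exactly one of the two guesses passes, so a receiver that hides a mark is
caught with probability 1/2 each time it tries.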

> (10.3)
> 
> "... unless they have high-quality information about actual network
> conditions between the two new endpoints."  I wouldn't allow for any
> exceptions.  The endpoints should start as they would at the beginning
> of a connection.

I could accept that.

> "Normally, the only way to get this information would be by
> instrumenting a DCCP connection between the new addresses."  What does
> "instrumenting" mean in this context?

Running the connection and taking the relevant parameters out of the
internal state, nothing more complicated than that.

> (10.4)
> 
> "The mobile DCCP MUST NOT let loss events on packets from the old
> address/port pair affect the new congestion control state."  Why not?
> Clearly, it wouldn't necessarily be correct (though it might be).
> But, it seems like the conservative (towards overall network health)
> thing to do.

We don't see any need to make this a MUST NOT.  "Should not" would
do just as well, not necessarily even in capital letters.  It
has nothing to do with interoperability, and it doesn't introduce
any danger to the network if the mobile DCCP lets loss events on
old packets affect the new congestion control state.  It just hurts
the connection itself, possibly unnecessarily.

(I don't think we have to be so conservative as to *require* that
the sender respond to loss events from the old address/port pair...)
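
A toy Python sketch of the behavior under discussion, combining this point
with the restart from (10.3). The class, the halving rule, and INITIAL_CWND
are hypothetical stand-ins, not any CCID's actual algorithm. After a move the
state restarts from its initial value, and loss events attributed to the old
address/port pair are ignored.

```python
from dataclasses import dataclass

INITIAL_CWND = 2  # hypothetical starting window, in packets

@dataclass
class CongestionState:
    path: tuple               # current (address, port) pair of the mobile endpoint
    cwnd: int = INITIAL_CWND

    def on_ack(self, path):
        # Only acknowledgments for the current path grow the window.
        if path == self.path:
            self.cwnd += 1

    def on_loss(self, path):
        if path != self.path:
            return            # loss on the old address/port pair: ignored
        self.cwnd = max(1, self.cwnd // 2)

    def move(self, new_path):
        # Start over, as at the beginning of a connection.
        self.path = new_path
        self.cwnd = INITIAL_CWND

old = ("192.0.2.1", 5001)
new = ("198.51.100.7", 5001)

cc = CongestionState(path=old)
for _ in range(6):
    cc.on_ack(old)
cc.move(new)
cc.on_ack(new)
cc.on_ack(new)
cc.on_loss(old)               # late loss report from the old path
cwnd_after_old_loss = cc.cwnd
cc.on_loss(new)               # loss on the new path does count
```

Dropping the early return in on_loss gives the more conservative variant Greg
describes; either way the effect is confined to this connection.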

> (11)
> 
> "A DCCP implementation SHOULD be capable of performing Path MTU (PMTU)
> Discovery..."  Why not say "MUST"?

Eddie: I tried to reserve MUST for cases where it is really required for
interoperability, as RFC 2119 advises; this is not such a case. Furthermore,
what about a super-minimal embedded system implementation?

Sally: I would lean toward a MUST for a generic DCCP implementation, even
though PMTU discovery is sometimes blocked by middleboxes. (Of course,
there is always the minimal DCCP implementation in the small, mobile,
battery-limited device...)

> "However, it is undesirable for MTU discovery to occur on the initial
> connection setup handshake, as the connection setup process may not be
> representative of packet sizes used during the connection, and
> performing MTU discovery on the initial handshake might unnecessarily
> delay connection establishment.  Thus, DF SHOULD NOT be set on
> DCCP-Request and DCCP-Response packets."  Am I right in remembering
> that IPv6 doesn't support fragmentation in routers along the path?  I
> think a new protocol should probably set DF in every single packet.
> For connection setup, assuming a 576 byte PMTU seems conservative, and
> is what I would recommend.

I don't see why a protocol that will work over IPv4 should set DF on every
single packet. In IPv6 it will obviously do the equivalent, since the source
has to fragment when required. Agree with a 576-byte initial PMTU.

> "(We are aware that this may cause problems for DCCP endpoints behind
> certain firewalls.)"  I'm unaware, so it might be good to discuss this
> briefly.

Some firewalls don't let ICMP through. So the ICMP NEEDFRAG message that
PMTU discovery relies on wouldn't get through to the relevant endpoint.

RFC 2923 on "TCP Problems with Path MTU Discovery," September 2000, says
the following: "As was pointed out in [RFC1435], routers don't always do
this correctly -- many routers fail to send the ICMP messages, for a
variety of reasons ranging from kernel bugs to configuration problems.
Firewalls are often misconfigured to suppress all ICMP messages."
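
A rough Python sketch of the sender-side bookkeeping discussed in this
section. The names are hypothetical, and the 68-byte floor is our assumption
based on the IPv4 minimum MTU. DF stays clear on the handshake packets, which
are capped at 576 bytes, while later packets set DF and rely on NEEDFRAG to
lower the estimate.

```python
HANDSHAKE_PMTU = 576   # conservative size for connection setup
MIN_IPV4_MTU = 68      # assumption: ignore reports below the IPv4 minimum MTU

def set_df(packet_type):
    """DF is left clear on the handshake packets and set on everything else."""
    return packet_type not in ("DCCP-Request", "DCCP-Response")

class PathMtu:
    def __init__(self, first_hop_mtu):
        self.pmtu = first_hop_mtu

    def on_needfrag(self, next_hop_mtu):
        # Only a plausible, smaller value lowers the estimate.
        if MIN_IPV4_MTU <= next_hop_mtu < self.pmtu:
            self.pmtu = next_hop_mtu

    def max_packet(self, packet_type):
        if set_df(packet_type):
            return self.pmtu
        return min(self.pmtu, HANDSHAKE_PMTU)

path = PathMtu(first_hop_mtu=1500)
request_limit = path.max_packet("DCCP-Request")
path.on_needfrag(1400)   # a router on the path reports a smaller MTU
path.on_needfrag(9000)   # implausible increase: ignored
path.on_needfrag(0)      # report without a usable MTU: ignored in this sketch
data_limit = path.max_packet("DCCP-Data")
```

If a firewall filters ICMP, on_needfrag is never called and the estimate
stays at the first-hop MTU, which is exactly the failure described above.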

> (12)
> 
> "(In TCP, sequence number modification is required to support legacy
> protocols like FTP that carry variable-length addresses in the data
> stream.  If such an application were deployed over DCCP, middleboxes
> would simply grow or shrink the relevant packets as necessary, without
> changing their sequence numbers.)"  "Legacy" has a somewhat negative
> connotation to it and, in this instance, could be safely left out.
> And, a middlebox might need to inject an extra packet in the
> data stream in the case where the packet that needed to be extended was
> already of a maximum size.

"Legacy" -> fair point.

Injecting an extra packet into the data stream -> A middlebox faced with
that situation should probably drop the offending packet and send a NEEDFRAG
to decrease the PMTU. I really don't want middleboxes to change sequence
numbers. It breaks a huge number of things, including Identification, and
additionally makes the middleboxes' lives much harder.
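
A small Python sketch of that middlebox policy. Everything here is
hypothetical (the Packet type, HEADER_LEN, and the choice of MTU to report);
it only illustrates the decision. The payload is rewritten in place with the
sequence number untouched, and if the result no longer fits, the packet is
dropped and a smaller MTU is reported so that the retransmission leaves room.

```python
from dataclasses import dataclass, replace

HEADER_LEN = 40  # hypothetical combined IP + DCCP header size, in bytes

@dataclass(frozen=True)
class Packet:
    seqno: int
    payload: bytes

def rewrite(packet, old, new, pmtu):
    """Rewrite an embedded address; never touch the sequence number."""
    grown = replace(packet, payload=packet.payload.replace(old, new))
    growth = len(grown.payload) - len(packet.payload)
    if HEADER_LEN + len(grown.payload) > pmtu:
        # Too big after rewriting: drop, and report an MTU that leaves
        # room for the extra bytes (an assumption of this sketch).
        return ("drop-and-needfrag", pmtu - growth)
    return ("forward", grown)

pkt = Packet(seqno=7, payload=b"PORT 10.0.0.1")
fits = rewrite(pkt, b"10.0.0.1", b"203.0.113.200", pmtu=1500)
too_big = rewrite(pkt, b"10.0.0.1", b"203.0.113.200", pmtu=55)
```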

> (14)
> 
> "However, this approach to multiplexing sub-flows above DCCP will not
> work in circumstances such as RTP where the RTP subflows require
> separate port numbers."  I would think that a multiplexing layer above
> DCCP would have to have port numbers in its header.  I don't see why
> this means it couldn't happen.

Good point: it's an API issue, not a protocol issue.

Eddie, with Sally
