
Re: [dhcwg] Comments for draft-ietf-dhc-dhcpv4-leasequery-00.txt



Additional comments - Ted's comments are much more specific whereas I'm
asking some questions re general approach. Sorry for their lateness.

1) The draft is very DSLAM-centric; I suggest we generalize the use case
to any relay agent. The background re fast/slow path in the Motivation
section is unnecessary. I suggest the Motivation can be boiled down to
"extending the current individual Leasequery protocol [RFC 4388] to
support a bulk query".

2) Notice that as a new proposal this has a lot of commonality with the
SAVI protocols for anti-spoofing.

3) Section 7.2.1 assigns new DHCP message type values - these should be
TBA/IANA.

4) Section 7.2.2 proposes dhcp-message formats that are somewhat
confusing: it specifies a constrained 3-character ASCII field in which
each character must be a base-10 digit. As the table appears to be
simply incrementing error codes, I would suggest a simple 16-bit result
code and associated table instead of the left-, middle-, right-number
format in 7.2.2. I do not think we should overload option 56 for error
messages, as option 56 is currently defined for text to display to the
end client. I don't see any harm in creating a new DHCP option for
leasequery to convey server-to-relay errors/results (i.e., as done with
the base-time and start-time-of-state options).
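
To make the 16-bit result code idea in point 4 concrete, here is a
minimal Python sketch of how such an option might look on the wire. The
option code 250 and the function names are placeholders invented for
illustration; a real code point would be TBA by IANA and the result
values would come from the table in the draft.

```python
import struct

# Hypothetical option code, used only for this sketch (real value: TBA by IANA).
OPT_LQ_RESULT = 250


def encode_result_option(result_code: int, opt_code: int = OPT_LQ_RESULT) -> bytes:
    """Encode code(1 byte) | length(1 byte) | 16-bit result, network byte order."""
    if not 0 <= result_code <= 0xFFFF:
        raise ValueError("result code must fit in 16 bits")
    return struct.pack("!BBH", opt_code, 2, result_code)


def decode_result_option(data: bytes) -> int:
    """Return the 16-bit result code from an option encoded as above."""
    _opt_code, length, result_code = struct.unpack("!BBH", data[:4])
    if length != 2:
        raise ValueError("unexpected option length")
    return result_code
```

A fixed-width integer avoids any parsing of ASCII digits on the relay
agent and leaves option 56 free for its existing purpose.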

5) I don't think we should mix absolute and relative time scales in a
single message. As currently proposed, base-time is an absolute time
and all other time values are offsets from this base-time option. Is
there a way we can avoid absolute times altogether, thus avoiding the
need for time sync between relay and DHCP server? Given that the
leasequery returns things like lease times (which are timers), could we
not get away with returning the remaining lease time? I'm not sure I see
the value in the relay agent knowing what the original lease time was.
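
A rough Python sketch of the "remaining lease time" idea in point 5,
assuming the server sends only a relative number of seconds and the
relay agent anchors it to its own clock. Both function names are
hypothetical and nothing here is taken from the draft.

```python
def remaining_lease(expiry_server: float, now_server: float) -> int:
    """Server side: seconds left on the lease, measured on the server's clock."""
    return max(0, int(expiry_server - now_server))


def local_expiry(remaining: int, now_local: float) -> float:
    """Relay side: expiry on the relay's own clock; no shared time base is needed."""
    return now_local + remaining
```

Because only a duration crosses the wire, the two clocks never have to
agree on an absolute time (transit delay aside).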

6) With sections 7.2.5/7.2.6 I feel we are overcomplicating the query
with time ranges. Why would a relay agent not want to recover all
leases? Further, this requires a DHCP server to maintain an explicit
last-modified time for every lease.

7) The states RELEASED, ABANDONED and RESET suggest the server keeps
state for leases that have expired. I can see that this is a function of
the time-based query, but it puts a huge burden on a DHCP server. See
point 6 regarding returning all valid leases. These time-bound queries
could lead to out-of-sync problems between server and relay, and as the
motivation appears to be state recovery it seems pragmatic to return all
state instead of a time-based subset. In reality, how many relay agents
could even implement a time-bound query after a reload (i.e., do they
really know the second they were rebooted or lost power)?

8) This proposal requires a TCP connection between server and relay
agent, thus necessitating an IP address on the relay agent. This seems
to be at odds with the objectives outlined in the L2 Relay and LDRA
drafts. In these drafts the relay agent (often a DSLAM) does -not-
require an IP address, nor is one desirable. This proposal will require
the DSLAM to have an IP address that is routable to the DHCP server.
Also, the draft says in section 5 that "No relaying of Bulk Leasequery
messages is specified". This means a logical topology change in many ISP
networks and is not aligned with the broadband architectures put forward
by other organisations, such as Broadband Forum TR-101.


BR,

-d

-----Original Message-----
From: dhcwg-bounces at ietf.org [mailto:dhcwg-bounces at ietf.org] On Behalf
Of Ted Lemon
Sent: Tuesday, June 23, 2009 8:00 AM
To: dhc WG
Subject: [dhcwg] Comments for draft-ietf-dhc-dhcpv4-leasequery-00.txt

There are a lot of editorial comments here, and a few comments having  
to do with protocol issues.   I should note that we have had no  
independent readings of the document other than mine, and it received  
no support during the working group last call.   As of now I believe  
the document requires revision before we do another working group last  
call, so there's no need to rush in now and state your support for it.

Thanks to Kim, Bernie, Neil and Mark for their hard work in merging  
the two previous bulk leasequery documents.   I'm sorry my comments  
here are so voluminous, but I feel like they were worth the effort it  
took to do them, and I hope the authors take them as constructive and  
not nitpicky.



This:

The Dynamic Host Configuration Protocol for IPv4 (DHCPv4) has been  
extended with a Leasequery capability that allows a requestor to  
request information about DHCPv4 bindings. That mechanism is limited  
to queries for individual bindings. In some situations individual  
binding queries may not be efficient, or even possible. This document  
expands on the DHCPv4 Leasequery protocol to allow for bulk transfer  
of DHCPv4 address binding data via TCP.

Might be better worded this way:

The Dynamic Host Configuration Protocol for IPv4 (DHCPv4) Leasequery  
extension allows a requestor to request information about DHCPv4  
bindings. This mechanism is limited to queries for individual  
bindings. In some situations individual binding queries may not be  
efficient, or even possible. This document extends the DHCPv4  
Leasequery protocol to allow for bulk transfer of DHCPv4 address  
binding data via TCP.

Page 3:

configured DHCPv4 servers (see Figure 1). In this process, some relay  
agents also glean the lease information sent by the server and  
maintain this locally. This information is used for a variety of  
purposes, including prevention of spoofing attempts from the DHCPv4  
clients and to install routes. When a relay agent reboots, this  
information is frequently lost.

maybe change to this:

configured DHCPv4 servers (see Figure 1). In this process, some relay  
agents also glean lease information sent by the server and cache it  
locally. This information is used for a variety of purposes.   Two  
examples are prevention of spoofing attempts from the DHCPv4 clients,  
and installation of routes. When a relay agent reboots, this  
information is frequently lost.

Page 4:

Different query types are needed where a relay agent can query the  
server without waiting for the traffic from or for the clients, as  
well as a different transmission technique more conducive to the  
transmission of large quantities of data.

maybe change to this:

Some applications require the ability to query the server without  
waiting for traffic from or to clients. This query capability in turn  
requires an underlying transport more suitable to the bulk  
transmission of data.

Page 7:

For a DSLAM having multiple DSL ports, multiple IP addresses may be  
assigned using DHCPv4 to a single port and the number of DHCPv4  
clients on a port may be unknown. The DSLAM may also not know the  
network portions of the IP addresses that are assigned to its DHCPv4  
clients.

maybe change to this:

For a DSLAM having multiple DSL ports, multiple IP addresses may be  
assigned to DHCPv4 clients on a single port and the number of DHCPv4  
clients on that port may be unknown. The DSLAM may also not know the  
network portions of the IP addresses that are assigned to its DHCPv4  
clients.

Page 8:

The existing data driven approach required by [RFC4388] means that the  
Leasequeries can only be performed after an Access Concentrator  
receives data. To implement antispoofing, packets need to be dropped  
until it gets the lease information from DHCPv4 server. If an Access  
Concentrator finishes the Leasequeries before it starts receiving  
data, then there is no need to drop legitimate packets. In this way,  
outage time may be reduced.

to:

The existing data driven approach required by [RFC4388] means that the  
Leasequeries can only be performed after an Access Concentrator  
receives data. To implement antispoofing, the concentrator must drop  
packets for each client until it gets lease information from DHCPv4  
server for that client. If an Access Concentrator finishes the  
Leasequeries before it starts receiving data, then there is no need to  
drop legitimate packets. In this way, outage time may be reduced.

Page 15:

The start-time-of-state option allows the receiver to determine the  
time at which the IP address transitioned into its current state.

to:

The start-time-of-state option allows the receiver to determine the  
time at which the IP address made the transition into its current state.

Page 16, section 7.2.5:

You have a lot of admonitions about what the query start time must be  
and must not be, which I don't think actually communicate what you  
want the implementation to do.   You should just tell the  
implementation how to compute this value; otherwise I don't see how an  
interoperable implementation that follows the rules you've given here  
is possible.

maybe something like this:

7.2.5. query-start-time

The query-start-time option specifies a start query time to the DHCPv4  
server. If specified, only bindings that have changed on or after the  
query-start-time should be included in the response to the query.    
The requester MUST compute the query-start-time relative to a lease it  
has recovered from stable storage, and MUST specify that time in terms  
of the DHCPv4 server's clock at the time of the query recovered from  
stable storage.

For example, suppose the requester had previously received information  
on a lease from the server.   At that time, the server had sent a
base-time option containing a time of X.   Subsequently the server sent a
DHCPLEASEACTIVE message with a lease time of Y.   If this is the last  
lease the requester remembers receiving, it would use X+Y for the  
query start time, indicating that it would like to receive all updates  
from that time onward.
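
Worked through in code, the X+Y rule in the example above is a single
addition carried out in the server's time context. This is only a
sketch: the function name and the sample values are illustrative and do
not come from the draft.

```python
def query_start_time(base_time: int, lease_time: int) -> int:
    """Compute query-start-time as X + Y.

    base_time  -- X, the base-time the server sent earlier (server's clock)
    lease_time -- Y, the lease time from the last DHCPLEASEACTIVE remembered
    """
    return base_time + lease_time


# Example: X = 1245758400 and Y = 3600 give a query-start-time of 1245762000.
start = query_start_time(1245758400, 3600)
```

Both inputs originate from the server, so the requester's own clock
never enters the computation.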

The same comment applies to query-end-time.

In section 7.2.7, I don't really understand why the relay agent needs  
to know more than that the lease is still active, or is no longer  
active.   This seems to be an extension on the functionality provided  
by DHCPLEASEQUERY, relating to failover, but (a) it seems unnecessary,  
(b) I have no idea what the DSLAM would do with it, and (c) because of  
that, it seems like it could result in interoperability problems as  
different implementors make different guesses as to how to handle  
it.   You could save a lot of text by just taking this out.   Same  
with data-source.

I can see where it might be helpful to use this for disambiguating in  
cases where one server reports a lease as free and another reports it  
as allocated, but it might be better to simply specify rules for when  
a server should claim to have knowledge about a lease - if a  
particular IP address is in its peer's pool, and is not known to have  
been allocated, perhaps that IP address should simply not be mentioned  
in a DHCPBULKLEASEQUERY reply.   It would be entirely permissible and  
expected in some cases for that to happen.

Page 21:

Messages from the DHCPv4 server come as multiple responses to a single  
DHCPBULKLEASEQUERY message. Thus, each DHCPBULKLEASEQUERY request MUST  
have a xid (transaction-id) unique on the connection on which it is  
sent, and all of the messages which come as a response to it all  
contain the same xid as the request. It is the xid which allows the  
data-streams of two different DHCPBULKLEASEQUERY requests to be  
demultiplexed by the requestor.

to:

Messages from the DHCPv4 server come as multiple responses to a single  
DHCPBULKLEASEQUERY message. Thus, each DHCPBULKLEASEQUERY request MUST  
have a xid (transaction-id) unique on the connection on which it is  
sent.   All of the messages which come as a response to that message  
will contain the same xid as the request. It is the xid which allows  
the data-streams of two different DHCPBULKLEASEQUERY requests to be  
demultiplexed by the requestor.
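
A minimal Python sketch of the demultiplexing described above, assuming
each received message has already been parsed into an (xid, payload)
pair. The function name and the tuple representation are assumptions
made for illustration.

```python
from collections import defaultdict


def demultiplex(messages):
    """Group interleaved responses by xid, preserving arrival order per xid."""
    streams = defaultdict(list)
    for xid, payload in messages:
        streams[xid].append(payload)
    return dict(streams)
```

For instance, responses to two outstanding queries with xids 7 and 9
arriving interleaved on one connection end up in two separate lists,
each in the order received.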

Page 21:

A requestor MAY send a DHCPBULKLEASEQUERY request to a DHCPv4 server  
and immediately close the transmission side of its TCP connection, and  
then read the resulting response messages from the DHCPv4 server. This  
is not required, and the usual approach is to leave both sides of the  
TCP connection up until at least the conclusion of the Bulk Leasequery.

You don't need to say this; I would take it out.

Page 23:

Both of the query-start-time and query-end-time options (if they  
appear) MUST be in the time context of the DHCPv4 server to which the  
Bulk Leasequery is directed. In the absence of information to the  
contrary, the requestor SHOULD assume that the time context on the  
DHCPv4 server is identical to the time context on the requestor.

In the event that previous operations have determined that the time  
context on the DHCPv4 server to which the Bulk Leasequery is addressed  
differs from the time context of the requestor, the time context of  
the DHCPv4 server MUST be used. Use of the query-start-time or the  
query-end-time options or both can serve to reduce the amount of data  
transferred over the TCP connection by a considerable amount.

You really haven't told us what to do here.   I don't see how  
interoperability can be achieved.   See my comments on these options  
earlier.   I would just delete this text, since you already have text  
explaining this earlier in the document.

Page 23:

If the TCP connection becomes blocked or stops being writeable while  
the requestor is sending its query, the requestor SHOULD be prepared  
to terminate the connection after BULK_LQ_DATA_TIMEOUT. We make this  
recommendation to allow requestors to control the period of time they  
are willing to wait before abandoning a connection, independent of  
notifications from the TCP implementations they may be using.

How does an implementation "prepare to terminate" the connection?   I  
think what you mean is something like this:

The TCP connection may become blocked or stop being writeable while  
the requestor is sending its query.   Should this happen, the  
implementation's behavior is controlled by two variables:  
BULK_LQ_DATA_TIMEOUT, and whatever configuration the operator may have  
provided.   When this situation is detected, the requester SHOULD set  
a timer using the lesser of BULK_LQ_DATA_TIMEOUT or the
operator-configured timeout.   If that timer expires, the requester SHOULD
terminate the connection.
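
The timer rule suggested above could be sketched in Python roughly as
follows. The 300-second value for BULK_LQ_DATA_TIMEOUT is a placeholder
(use whatever the draft specifies), and the class and function names are
hypothetical. Times are passed in by the caller so the logic stays
independent of any particular clock or event loop.

```python
from typing import Optional

BULK_LQ_DATA_TIMEOUT = 300.0  # seconds; placeholder, take the real value from the draft


def effective_timeout(operator_timeout: Optional[float] = None) -> float:
    """Return the lesser of BULK_LQ_DATA_TIMEOUT and the operator-configured value."""
    if operator_timeout is None:
        return BULK_LQ_DATA_TIMEOUT
    return min(BULK_LQ_DATA_TIMEOUT, operator_timeout)


class BlockedWriteTimer:
    """Tracks how long a connection has been blocked or unwriteable."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.deadline: Optional[float] = None

    def on_blocked(self, now: float) -> None:
        # Start the timer only when the blocked state is first detected.
        if self.deadline is None:
            self.deadline = now + self.timeout

    def on_progress(self) -> None:
        # Data is moving again, so cancel the pending timer.
        self.deadline = None

    def should_terminate(self, now: float) -> bool:
        # True once the timer has been set and has expired.
        return self.deadline is not None and now >= self.deadline
```

Stated this way, "be prepared to terminate" becomes a concrete
condition that two implementations can agree on.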

This same comment applies to section 8.3, and the prescribed antidote  
is the same.

Also on Page 23:

If a response message does not contain a DHCPv4 server-identifier  
option (option 54), then the server-identifier option from the  
previous message should be used. Thus, the DHCPv4 server MUST send the  
server-identifier option in the first response message, and MAY send  
it in subsequent response message for the same request.

Why aren't you saying this?

The DHCPv4 server MUST send a server-identifier option (option 54) in  
the first response to any DHCPBULKLEASEQUERY message.   The DHCPv4  
server SHOULD NOT send server identifier options in subsequent  
responses to that DHCPBULKLEASEQUERY message.   The requester MUST  
cache the server-identifier option from the first response and apply  
it to any subsequent responses.

(In a large response dataset, the space consumed by repetition of the  
same server-identifier option could be substantial.   Either that, or  
make sending the option mandatory in every message.   If you have to  
have this code in the requester, might as well take advantage of it.)
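
On the requester side, the caching rule proposed above amounts to very
little code. A minimal Python sketch, assuming each response has been
parsed into a dict keyed by option code (that representation and the
function name are assumptions, not something from the draft):

```python
SERVER_IDENTIFIER = 54  # DHCPv4 server-identifier option code


def apply_server_identifier(responses):
    """Return copies of the responses to one DHCPBULKLEASEQUERY, each
    carrying the server-identifier taken from the first response."""
    if not responses or SERVER_IDENTIFIER not in responses[0]:
        raise ValueError("first response must carry a server-identifier option")
    cached = responses[0][SERVER_IDENTIFIER]
    # Build new dicts so the parsed input messages are left untouched.
    return [{**options, SERVER_IDENTIFIER: cached} for options in responses]
```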

Section 8.4 is unimplementable as written.   I'm sure it describes an  
implementation that someone has done, but it provides no concrete  
guidance or math for the implementor.   As such, it should either be  
deleted, or made explicit.

Regarding section 8.6, as I've said previously in this commentary, I  
think it would be better to specify the client behavior in such a way  
that the conflicts proposed in this section simply do not happen.    
This section is not specific enough to foster interoperability.

Page 28:

A server MAY process more than one query at a time. A server that does  
not support more than one query at a time on a single connection MUST  
return a DHCPLEASEQUERYDONE message containing a dhcp-message option  
with a status-code of NotAllowed to the unsupported queries.  
Alternatively, a server that does not support more than one query at a  
time on a single connection MAY chose to simply read one query and  
only read any subsequent queries after processing of the current query  
is complete.

There are two problems with this paragraph.   First, you have given  
implementors an option where no option is necessary.   Just specify  
one or the other.   Second, if you specify that the server needn't  
read each message from the connection until it's ready to process it,  
then you need to change the last paragraph in section 8.2, since a  
conforming implementation could easily set up a situation where the  
connection appears to be blocked, which would trigger the requester to  
drop the connection, even though nothing was wrong.

Section 8.8:

Either the requestor or DHCPv4 server MAY close the TCP connection at  
any time. The requestor MAY choose to retain the connection if it  
intends to issue additional queries or if other queries are currently  
using the connection. Note that this requestor behavior does not  
guarantee that the connection will be available for additional  
queries: the server might decide to close the connection based on its  
own configuration.

This seems broken.   It's easy to imagine situations where one side or  
the other closes the connection in compliance with this statement and  
triggers a timeout on the other end which causes legitimate data to be  
discarded.   I think you need to specify a non-error connection  
termination protocol to make this work (and you should - otherwise  
nobody's allowed to close the connection, which would be worse).

In section 9.1, you seem to be specifying an authorization model based  
on the IP source-address of the requestor.   You do not mention this  
in the security considerations section.   This is likely to trigger  
pushback from the IESG.

You use the same "should be prepared" language in these sections that  
you used in section 8.2 and 8.3.   You need to specify exactly what  
the server does, not just say "should be prepared."   The language I  
suggested for section 8.2, with appropriate modifications, would be  
acceptable here as well, at least to me.

Page 30, 31:

A Bulk Leasequery response MUST contain no more than one message for  
each IP address configured in the DHCPv4 server. In addition, a Bulk  
Leasequery may well take significant time between the beginning and  
end of the processing of all of the messages required to satisfy the  
Bulk Leasequery query. During this time, the state of some of the IP  
addresses sent early in the response may change prior to the  
completion of the entire response to the Bulk Leasequery. This is  
normal and expected -- there is no requirement for the entire response  
to a Bulk Leasequery to represent an instantaneous snapshot of the  
state of the IP address bindings of a DHCPv4 server. Quite the  
contrary -- as the cursor moves through the IP addresses in whatever  
order is convenient to the DHCPv4 server, the state of IP addresses  
already examined can change and a DHCPv4 server MUST NOT try to  
examine IP addresses already scanned in an attempt to "keep up" with  
the ongoing state changes of all of the IP addresses. To do so would  
make it difficult to meet the requirement to send only one message per  
IP address in response to a Bulk Leasequery and would also make it  
difficult to know when to finish the Bulk Leasequery.

to:

When responding to a DHCPBULKLEASEQUERY message, the DHCPv4 server  
MUST NOT send more than one message for each applicable IP address,  
even if the state of some of those IP addresses changes during the  
processing of the message.   Updates to such IP address state are  
already handled by normal protocol processing, so no special effort is  
needed here. (I hope!)

Page 33:

A DHCPv4 server MAY always compare the address binding information for  
an IP address against a time window if it follows the following  
guidelines. If there is no query-start-time, then the DHCPv4 server  
MUST assume the query-start-time is equivalent to a time prior to any  
time that resides in any IP address binding. If there is no
query-end-time, the DHCPv4 server MUST assume that the query-end-time is
equivalent to a time that is later than any time that resides in any  
IP address binding.

You can't use "MAY" like this.   This text is an implementation  
suggestion; I would suggest that you state this simply in terms of  
requirements, without talking about how those requirements might be  
implemented.

Also page 33:

Even if the query-start-time or query-end-time option value is being  
used to limit the amount of data flow from the DHCPv4 server to the  
requestor, there is no requirement placed on the DHCPv4 server to  
return address binding data in any order and certainly not in any  
order based on time.

When the DHCPv4 server has no additional information to send to the  
requestor, it will send a DHCPLEASEQUERYDONE message.

to:

The DHCPv4 server MAY return address binding data in any order, as  
long as binding information for any given IP address is not  
repeated.   When all binding data for a given DHCPBULKLEASEQUERY has  
been sent, the DHCPv4 server MUST send a DHCPBULKLEASEQUERYDONE message.
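
The "any order, no repeats, then a done message" requirement above can
be sketched in Python as a small generator. This assumes bindings arrive
as (ip, info) pairs from whatever scan the server performs; DONE is a
stand-in for the final done message, and all names are hypothetical.

```python
DONE = "DONE"  # stands in for the final "done" message type


def bulk_responses(bindings):
    """Yield at most one (ip, info) per IP address, in arrival order,
    followed by a final DONE marker."""
    seen = set()
    for ip, info in bindings:
        if ip in seen:
            continue  # state changed mid-scan; do not report the address twice
        seen.add(ip)
        yield (ip, info)
    yield (DONE, None)
```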

Page 34:

[RFC2131] and [RFC4388] specify that every response message MUST  
contain the server-identifier option. However, that option will be the  
same for every response from a particular DHCPBULKLEASEQUERY request.  
Thus, the DHCPv4 server MUST include the server-identifier option in  
the first message sent in response to a DHCPBULKLEASEQUERY. It MAY  
include the server-identifier in later messages as well, but there is  
no requirement for it to do so.

I strongly suggest that you change the MAY here to a SHOULD NOT.    
Also, note that you are repeating something you specified earlier, so  
it would be good to delete one or the other paragraph.

Also on page 34:

The message type of DHCPLEASEACTIVE or DHCPLEASEUNASSIGNED is based on  
the value of the dhcp-state option.  If the dhcp-state option value is  
ACTIVE, then the message type is DHCPLEASEACTIVE, otherwise the  
message type is DHCPLEASEUNASSIGNED.

Once again I suggest you get rid of the dhcp-state message and simply  
say what should happen here for IP addresses that are managed by  
failover and those that are managed by a single server.

Specifically, any IP address that is in the FREE state on the primary  
is DHCPLEASEUNASSIGNED.   Any IP address that is in the BACKUP state  
on the primary is not reported on at all.   Any IP address that is in  
the FREE state on the secondary is not reported on at all.   Any IP  
address that is in the BACKUP state on the secondary is  
DHCPLEASEUNASSIGNED.   Likewise for all the other permutations; I  
think that in any of the states where the failover peer would renew  
the lease, it's active, and in states where it wouldn't renew the  
lease, it's either unassigned, or shouldn't be mentioned.
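
The four cases spelled out above fit in a small lookup table. The Python
sketch below covers only those four; the remaining permutations are left
out deliberately because the text above does not pin them down. The
names are hypothetical, and None is used here to mean "not reported".

```python
# (server role, failover state) -> message type, or None for "not reported"
REPORT = {
    ("primary", "FREE"): "DHCPLEASEUNASSIGNED",
    ("primary", "BACKUP"): None,
    ("secondary", "FREE"): None,
    ("secondary", "BACKUP"): "DHCPLEASEUNASSIGNED",
}


def message_type_for(role: str, state: str):
    """Return the message type to send, or None if the address is not reported."""
    try:
        return REPORT[(role, state)]
    except KeyError:
        raise ValueError(f"no rule sketched for {role}/{state}") from None
```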

Also, in point 4, you mention a grace period on the expired state, but  
as far as I know no such thing exists.

Page 35:

As discussed in Section 8.3, requestors may want to leverage an  
existing connection if they need to make multiple queries. Servers MAY  
support reading and processing multiple queries from a single  
connection. A server MUST NOT read more query messages from a  
connection than it is prepared to process simultaneously.

to:

As discussed in Section 8.3, requestors may want to use a connection  
that has already been established when they need to make additional  
queries. Servers MAY support reading and processing multiple queries  
from a single connection. A server MUST NOT read more query messages  
from a connection than it is prepared to process simultaneously.

And:

DHCPv4 server implementations may offer administrative control to  
enable or disable this feature. DHCPv4 server implementations that are  
able to process queries in parallel should offer be configurable to  
limit the number of simultaneous queries permitted from any one  
requester.

[And in total?]

Page 36:

The server MUST close its end of the TCP connection if it encounters  
an error sending data on the connection. The server MUST close its end  
of the TCP connection if it finds that it has to abort an in-process
request. A server aborting an in-process request SHOULD attempt to  
signal that to its requestors by using the QueryTerminated status code  
in the dhcp-message option in a DHCPLEASEQUERYDONE message, including  
a message string indicating details of the reason for the abort. If  
the server detects that the requesting end of the connection has been  
closed, the server MUST close its end of the connection after it has  
finished processing any outstanding requests.

This is in conflict with section 8.8, which says the server can drop  
the connection at any time.   Also, why should the server finish  
processing outstanding requests if the remote end of the connection  
has closed?   Shouldn't it just drop the connection immediately?

Section 10:

As I mentioned earlier, the only authorization model for this protocol  
is based on the requestor's IP address, and there is no  
authentication.   At a minimum this should be mentioned in the  
security considerations section.

I think the appendix belongs in a requirements document, or on the  
mailing list, not in the draft.
