Re: [mile] SHOULD/RECOMMENDED Analysis / Suggestions

"Roman D. Danyliw" <rdd@cert.org> Tue, 30 July 2013 14:31 UTC

From: "Roman D. Danyliw" <rdd@cert.org>
To: Eric Burger <eburger@standardstrack.com>, MILE <mile@ietf.org>
Date: Tue, 30 Jul 2013 14:30:14 +0000
Subject: Re: [mile] SHOULD/RECOMMENDED Analysis / Suggestions

Good afternoon Eric!

Thank you for the careful and extremely complete review of the keywords.  A response to each is inline.  Again, thank you for this painstaking analysis!

> -----Original Message-----
> From: mile-bounces@ietf.org [mailto:mile-bounces@ietf.org] On Behalf Of Eric
> Burger
> Sent: Monday, April 01, 2013 10:27 AM
> To: MILE
> Subject: [mile] SHOULD/RECOMMENDED Analysis / Suggestions
> 
> SHOULD's:
> 
> 3.4.  AlternativeID Class
>    [...] The incident tracking numbers of the CSIRT that
>    generated the IODEF document should never be considered an
>    AlternativeID.
> 
> Two schools of thought.  The first is either one is saying something or one is
> not.  In that case, we should say:
>    [...] The incident tracking numbers of the CSIRT that
>    generated the IODEF document must never be considered an
>    AlternativeID.
> 
> The second is this is not describing protocol behavior, and as such there is no
> enforcement.  In that case, we should say:
>    [...] Do not use the incident tracking numbers of the CSIRT that
>    generated the IODEF document as an AlternativeID.

I'd prefer option #1.

> 3.8.4.  ReportTime
>    The ReportTime class represents the time the incident was reported.
>    This timestamp SHOULD coincide to the time at which the IODEF
>    document is generated.
> 
> What other time can the timestamp refer to?  Why would it be something
> different?  Say so or turn this into a MUST.
>    The ReportTime class represents the time the incident was reported.
>    This timestamp MUST be the time at which the IODEF
>    document is generated.

Concur with proposed text.
 
> 
> 3.8.5.  DateTime
> 
>    The DateTime class is a generic representation of a timestamp.  Its
>    semantics should be inferred from the parent class in which it is
>    aggregated.
> 
> Although not normative, this is very loose.  When would one ever NOT infer
> the DateTime class for a particular timestamp class (start/end/report/detect)?
> Is there a reason not to just say:
>    The DateTime class is a generic representation of a timestamp.
>    Infer its semantics from the parent class in which it is
>    aggregated.

Agreed.  Concur with proposed text. 

> 3.10.4.  Confidence Class
>    [...]
>    The element content expresses a numerical assessment in the
>    confidence of the data when the value of the rating attribute is
>    "numeric".  Otherwise, this element should be empty.
> 
> What else can it be?  I do not think it could be anything:
>    The element content expresses a numerical assessment in the
>    confidence of the data when the value of the rating attribute is
>    "numeric".  Otherwise, this element MUST be empty.

Concur with proposed text.
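
For implementers, the proposed MUST can be checked mechanically.  What follows is only a rough sketch in Python, not text for the draft.  The function name is made up, and treating "numerical assessment" as "parses as a float" is my assumption rather than anything the draft states.

```python
import xml.etree.ElementTree as ET


def confidence_is_valid(elem: ET.Element) -> bool:
    """Check one Confidence element against the proposed wording."""
    content = (elem.text or "").strip()
    if elem.get("rating") == "numeric":
        # Assumption: a numerical assessment is anything float() accepts.
        try:
            float(content)
        except ValueError:
            return False
        return True
    # Any other rating: the element MUST be empty (no text, no children).
    return content == "" and len(elem) == 0
```

A checker like this would flag rating="high" with content of "0.8", which the current "should be empty" wording lets through.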

> 3.13.  Expectation Class
>    [...]
>    StartTime
>       Zero or one.  The time at which the action should be performed.  A
>       timestamp that is earlier than the ReportTime specified in the
>       Incident class denotes that the expectation should be fulfilled as
>       soon as possible.  The absence of this element leaves the
>       execution of the expectation to the discretion of the recipient.
> 
> I would offer we not use the word 'should' here, as it is confused with
> 'SHOULD.'  How about rewording this to say:
>    StartTime
>       Zero or one.  The time at which the sender would like the action
>       performed.  A timestamp that is earlier than the ReportTime
>       specified in the Incident class denotes that the sender would like
>       the action performed as soon as possible.  The absence of this
>       element indicates no expectations of when the recipient would like
>       the action performed.

The use of "would" makes the text clearer.  Concur with proposed text.

> 
>    EndTime
>       Zero or one.  The time by which the action should be completed.
>       If the action is not carried out by this time, it should no longer
>       be performed.
> 
> Or else what?  We should be brutal here.  I would offer:
>    EndTime
>       Zero or one.  The time by which the sender expects the recipient
>       to complete the action.  If the recipient cannot complete the
>       action before EndTime, the recipient MUST NOT carry out the action.

Concur with proposed text.

>       NOTE: Because of transit delays, clock drift, and so on, the sender
>       MUST be prepared for the recipient to have carried out the action,
>       even if it completes past EndTime.

To the broader audience, what do you think?
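
To make the discussion concrete, here is how I read the recipient-side logic implied by the proposed StartTime/EndTime text.  This is a sketch under assumptions: the function name and the returned labels are hypothetical, and "cannot complete the action before EndTime" is simplified to "EndTime has already passed", since estimating how long an action takes is outside the draft.

```python
from datetime import datetime, timezone
from typing import Optional


def expectation_decision(now: datetime,
                         report_time: datetime,
                         start_time: Optional[datetime] = None,
                         end_time: Optional[datetime] = None) -> str:
    """Return a hypothetical label for what the recipient does next."""
    if end_time is not None and now >= end_time:
        return "skip"                  # proposed MUST NOT: EndTime has passed
    if start_time is None:
        return "recipient-discretion"  # no StartTime: recipient decides when
    if start_time < report_time:
        return "asap"                  # StartTime earlier than ReportTime
    if now >= start_time:
        return "perform"
    return "wait"
```

The proposed NOTE is about the gap this cannot close: the sender's and recipient's values of "now" may differ, so a sender could still see an action that finished after its own view of EndTime.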

> 3.16.  Node Class
>    [...]
>    DateTime
>       Zero or one.  A timestamp of when the resolution between the name
>       and address was performed.  This information SHOULD be provided if
>       both an Address and NodeName are specified.
> 
> What is this used for?  Is it important?  Are we worried about DNS changes?
> What if the authoritative server does an update but the resolver's TTL has not
> expired?  What is the use case?

One use case for this element is specifying C2 nodes that use fast flux.

> Assuming it is useful (I am doubtful), I would offer:
>    DateTime
>       Zero or one.  A timestamp of when the resolution between the name
>       and address was performed.  This information MUST be provided if
>       both an Address and NodeName are specified.
> 
> Otherwise, drop the element.

I would argue that the text remain a SHOULD, since the resolution time might not always be useful.

> 3.17.  Service Class
>    [...]
>    For a given source, System@type="source", a corresponding target,
>    System@type="target", maybe defined, or vice versa.  When a Portlist
>    class is defined in the Service class of both the source and target
>    in a given instance of the Flow class, there MUST be symmetry in the
>    enumeration of the ports.  Thus, if n-ports are listed for a source,
>    n-ports should be listed for the target.  Likewise, the ports should
>    be listed in an identical sequence such that the n-th port in the
>    source corresponds to the n-th port of the target.  This symmetry in
>    listing and sequencing of ports applies whether there are 1-to-1,
>    1-to-many, or many-to-many sources-to-targets.  In the 1-to-many or
>    many-to-many, the exact order in which the System classes are
>    enumerated in the Flow class is significant.
> 
> There MUST be symmetry, but it is only a good idea to actually do it?  Either
> there MUST be symmetry, or not.  In being conservative in what one sends, I
> would move everything to MUSTs.  Also note a few typographical errors that
> get fixed:
>    For a given source, System@type="source", a corresponding target,
>    System@type="target", may be defined, or vice versa.  When a Portlist
>    class is defined in the Service class of both the source and target
>    in a given instance of the Flow class, there MUST be symmetry in the
>    enumeration of the ports.  Thus, if N ports are listed for a source,
>    N ports MUST be listed for the target.  Likewise, the ports MUST
>    be listed in an identical sequence such that the n-th port in the
>    source corresponds to the n-th port of the target.  This symmetry in
>    listing and sequencing of ports applies whether there are 1-to-1,
>    1-to-many, or many-to-many sources-to-targets.  In the 1-to-many or
>    many-to-many, the exact order in which the System classes are
>    enumerated in the Flow class is significant.
> 
> Since the document brings it up, what does it mean to have a 1:1
> correspondence of sources-to-targets if there can be more targets than
> sources?

Good point.  I'd recommend punting this issue out to the broader discussion.  1:1 and 1:N would mean the following 

(a) 1-source portlist and 1-target portlist
(b) 1-source portlist
(c) 1-target portlist
(d) many-target portlist
(e) many-source portlist

In what cases would we want to exchange many-to-many Portlists?  I don't see an easy way to represent N:N where the port mapping stays the same.
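
For the simple 1:1 case, the symmetry rule in the proposed text reduces to a length check plus positional pairing.  A minimal sketch, assuming the two Portlist values have already been parsed into lists of integers (the parsing itself and the function name are not from the draft):

```python
from typing import List, Tuple


def pair_ports(source_ports: List[int],
               target_ports: List[int]) -> List[Tuple[int, int]]:
    """Pair the n-th source port with the n-th target port."""
    if len(source_ports) != len(target_ports):
        # Proposed MUST: N source ports require N target ports.
        raise ValueError("Portlist symmetry violated: %d source vs %d target"
                         % (len(source_ports), len(target_ports)))
    return list(zip(source_ports, target_ports))
```

Nothing in this pairing says which System a port belongs to, which is part of why the many-to-many case is hard to express.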

> 3.19.  Record Class
>    The Record class is a container class for log and audit data that
>    provides supportive information about the incident.  The source of
>    this data will often be the output of monitoring tools.  These logs
>    should substantiate the activity described in the document.
> 
> What else can this class do?  Take out one word and it is fine:
>    The Record class is a container class for log and audit data that
>    provides supportive information about the incident.  The source of
>    this data will often be the output of monitoring tools.  These logs
>    substantiate the activity described in the document.

Concur with proposed text.

> 3.19.  Record Class
>    [...]
>    RecordData
>       One or more.  Log or audit data generated by a particular type of
>       sensor.  Separate instances of the RecordData class SHOULD be used
>       for each sensor type.
> 
> If there is ever a reason not to have separate instances of RecordData, then
> enumerate them or explain how they get aggregated.  Otherwise:
>    RecordData
>       One or more.  Log or audit data generated by a particular type of
>       sensor.  Separate instances of the RecordData class MUST be used
>       for each sensor type.

The historic reason for this guidance only being a SHOULD is to support someone dumping a mix of log snippets into a single instance of RecordData/AdditionalData and calling it "supporting evidence".

I'll pose the question to the community as to whether this is still a valid use case.

> 
> 4.2.  IODEF Namespace
>    The IODEF schema declares a namespace of
>    "urn:ietf:params:xml:ns:iodef-1.0" and registers it per [4].  Each
>    IODEF document SHOULD include a valid reference to the IODEF schema
>    using the "xsi:schemaLocation" attribute.  An example of such a
>    declaration would look as follows:
> 
> When would one EVER not include a valid reference on purpose?
>    The IODEF schema declares a namespace of
>    "urn:ietf:params:xml:ns:iodef-1.0" and registers it per [4].  Each
>    IODEF document MUST include a valid reference to the IODEF schema
>    using the "xsi:schemaLocation" attribute.  An example of such a
>    declaration would look as follows:
> 
> Remember, later on we say this whole thing is a joke and you should not even
> try to dereference the URN, so it is safe to include the 32 bytes the URN takes
> up. We are already paying the XML tax, so we might as well really use it.

Agreed. Concur with proposed text.
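
If this becomes a MUST, a recipient can test for it without dereferencing anything.  A sketch under assumptions: it only looks at the root element, and it reduces "valid reference" to "the IODEF namespace appears in one of the namespace/location pairs".  The function name and the "iodef-1.0.xsd" location in the usage note are placeholders of mine.

```python
import xml.etree.ElementTree as ET

IODEF_NS = "urn:ietf:params:xml:ns:iodef-1.0"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def references_iodef_schema(xml_text: str) -> bool:
    """True if the root's xsi:schemaLocation names the IODEF namespace."""
    root = ET.fromstring(xml_text)
    location = root.get("{%s}schemaLocation" % XSI_NS, "")
    # xsi:schemaLocation is a whitespace-separated list of
    # (namespace, location) pairs; the namespaces sit at even positions.
    tokens = location.split()
    return IODEF_NS in tokens[0::2]
```

For example, a root element carrying xsi:schemaLocation="urn:ietf:params:xml:ns:iodef-1.0 iodef-1.0.xsd" passes, and one with no xsi:schemaLocation attribute does not.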

> 4.3.  Validation
>    The IODEF documents MUST be well-formed XML and SHOULD be validated
>    against the schema described in Section 8.
> 
> Here, finally, is a legitimate use of SHOULD.  Well, actually, it should be a
> recommendation:
>    The IODEF documents MUST be well-formed XML.  Recipients are
>    RECOMMENDED to validate the received document against the schema
>    described in Section 8.  

Concur with proposed text.

>   Situations where the recipient may choose
>    not to validate the received document include deployments where
>    there is limited Internet connectivity and as such access to the
>    schemas is not available or timely; implementations where concerns
>    about accessing external files at run time may conflict with local policies;
>    and situations where the integrity of the data is of little importance to the
>    recipient.
> 
> 
> 5.2.  Extending Classes
>    [...]
>    4.  When a parser encounters an IODEF document with an extension it
>        does not understand, this extension MUST be ignored (and not
>        processed), but the remainder of the document MUST be processed.
>        Parsers will be able to identify these extensions for which they
>        have no processing logic through the namespace declaration.
>        Parsers that encounter an unrecognized element in a namespace
>        that they do support SHOULD reject the document as a syntax
>        error.
> 
> When would they not reject the document?  Enumerate likely scenarios, like:
>        Parsers that encounter an unrecognized element in a namespace
>        that they do support SHOULD reject the document as a syntax
>        error unless they can do meaningful processing without the
>        element in question.
> 
> If this is a real situation, I would offer we would be better served with a
> mechanism similar to the machinery described by RFC 3459.  Say what we
> mean, and if a sender sends something the recipient MUST understand, mark
> it as such.  If a sender sends optional, "value added" information that can be
> ignored, mark it as such.  Otherwise:
>        Parsers that encounter an unrecognized element in a namespace
>        that they do support MUST reject the document as a syntax
>        error.

Agreed on this RFC3459 suggested language.  Since all extensions must declare a separate namespace, this stricter language makes sense.
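
Taken together, item 4 with the stricter wording gives a parser two distinct behaviors keyed on the namespace.  A rough Python sketch of that split follows.  The set of known element names is a tiny illustrative subset that I picked, not the real schema, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple

IODEF_NS = "urn:ietf:params:xml:ns:iodef-1.0"
# Illustrative subset only; a real parser would derive this from the schema.
KNOWN_IODEF_ELEMENTS: Set[str] = {
    "IODEF-Document", "Incident", "IncidentID", "ReportTime", "AdditionalData",
}


def split_tag(tag: str) -> Tuple[str, str]:
    """Split ElementTree's '{namespace}local' form into its two parts."""
    if tag.startswith("{"):
        ns, _, local = tag[1:].partition("}")
        return ns, local
    return "", tag


def processable_elements(root: ET.Element) -> List[str]:
    """List the IODEF elements to process, skipping foreign extensions."""
    accepted: List[str] = []

    def visit(elem: ET.Element) -> None:
        ns, local = split_tag(elem.tag)
        if ns != IODEF_NS:
            return  # extension not understood: ignore it and its children
        if local not in KNOWN_IODEF_ELEMENTS:
            # Unrecognized element in a supported namespace: reject.
            raise ValueError("syntax error: unrecognized element " + local)
        accepted.append(local)
        for child in elem:
            visit(child)

    visit(root)
    return accepted
```

The namespace comparison is what makes the MUST workable: an unknown name inside the IODEF namespace cannot be a legitimate extension, because extensions are required to declare their own.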

> 5.2.  Extending Classes
>    [...]
>    5.  Implementations SHOULD NOT download schemas at runtime due to the
>        security implications, and extensions MUST NOT be required to
>        provide a resolvable location of their schema.
> 
> This is incompatible with "The IODEF document ... SHOULD be validated."  Is it
> really validated or is it not?  
> I would offer this gets deployed in the real world:
> it will NOT be validated.

I'd argue that these two statements are compatible as the schema can be cached.

> That said, paragraph 5 does not help the
> implementor.  We need some meat around the strictures.  For example:
>    5.  There are security implications of requiring implementations
>        to dynamically download schemas at run time.  For example,
>        malformed IODEF documents may launch a denial of service attack
>        on a recipient by specifying bogus or numerous schema definition
>        references in the document.  Thus, implementations SHOULD NOT
>        download schemas at runtime, unless implementations take
>        appropriate precautions and are prepared for potentially significant
>        network, processing, and time-out demands.

Concur with the spirit of the suggestion and propose slightly simpler language:

5.  There are security and performance implications in requiring implementations to dynamically download schemas at run time.  Thus, implementations SHOULD NOT download schemas at runtime, unless implementations take appropriate precautions and are prepared for potentially significant network, processing, and time-out demands.

 
>    6.  Some users of the IODEF may have private schema definitions that
>        might not be available on the Internet.  In this situation, if
>        a IODEF document leaks out of the private use space, references
>        to some of those document schemas may not be resolvable.  This
>        has two implications.  First, references to private schemas may
>        never resolve.  As such, in addition to the suggestion that
>        implementations do not download schemas at runtime mentioned
>        above, recipients MUST be prepared for a schema definition in an
>        IODEF document never to resolve.
>
> I think the new #6 above is what we are really saying when we say,
> "extensions MUST NOT be required to provide a resolvable location of their
> schema."  If I am wrong, then the old #5 is not clear and needs to be fixed.

It's a good idea to split up #5 into two sections.  I concur with the new language for #6 (formerly the second part of the sentence of #5).
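
The caching point can be illustrated with a lookup that never touches the network.  This is a sketch only: the cache contents, the file path, and the function name are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

# Hypothetical local cache, populated out of band (for example at install time).
SCHEMA_CACHE: Dict[str, Path] = {
    "urn:ietf:params:xml:ns:iodef-1.0": Path("schemas/iodef-1.0.xsd"),
}


def cached_schema_path(namespace: str,
                       cache: Dict[str, Path] = SCHEMA_CACHE) -> Optional[Path]:
    """Return the locally cached schema for a namespace, or None.

    Nothing is downloaded.  None covers private schemas that never resolve;
    the caller then skips validation for that namespace.
    """
    return cache.get(namespace)
```

With this shape, the "SHOULD be validated" text and the "SHOULD NOT download schemas at runtime" text do not collide, and a reference to a private schema is treated as an ordinary miss.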

> 6.  Internationalization Issues
>    [...]
>    The IODEF parser SHOULD extract the appropriate language relevant to
>    the recipient.
> 
> Two schools of thought.  The first is this is an implementation issue:
>    The IODEF parser can extract the appropriate language relevant to
>    the recipient.
> - or -
>    The IODEF parser extracts the most relevant language for
>    the recipient.
> 
> The second school of thought is this is a protocol issue:
>    The IODEF parser MUST extract the appropriate language relevant to
>    the recipient or return an error to the sender.
> I do not like this approach.

I concur that there is a problem, but recommend that it be deferred until the broader issue of fixing internationalization is tackled (http://trac.tools.ietf.org/wg/mile/trac/ticket/1).

> RECOMMENDED's, all in Section 5.2, Extending Classes:
> 
>    [...] It is RECOMMENDED that the
>    extension be placed in the most closely related class to the new
>    information.
> Unless what?  Will there ever be extensions that are not related to any
> existing class?  Say so if so.

Perhaps various data elements that could otherwise be scattered across a document through extensions are simply dumped into a single extension in /IODEF-Document/Incident/AdditionalData.  While inelegant, it probably shouldn't be precluded.
 
>    2.  The extension schema MUST declare a separate namespace.  It is
>        RECOMMENDED that these extensions have the prefix "iodef-".
> 
> Sensible, but let us at least say why:
>    2.  The extension schema MUST declare a separate namespace.  It is
>        RECOMMENDED that these extensions have the prefix "iodef-".
>        This recommendation makes readability of the document easier
>        by allowing the reader to infer which namespaces relate to IODEF
>        by inspection.

Concur with proposed text.

>    3.  It is RECOMMENDED that extension schemas follow the naming
>        convention of the IODEF data model.  The names of all elements
>        are capitalized.  For composed names, a capital letter is used
>        for each word.  Attribute names are lower case.
> 
> Again, sensible, and again we should say why:
>    3.  It is RECOMMENDED that extension schemas follow the naming
>        convention of the IODEF data model.  This makes reading an
>        extended IODEF document look like any other IODEF document.
>        The names of all elements are capitalized.  For composed names,
>        a capital letter is used for each word.  Attribute names are lower case.

Concur with proposed text.
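
Since this stays a RECOMMENDED, a schema author might want a lint-style check rather than a hard failure.  A loose sketch: the regular expressions are my approximation of the convention (they cannot tell where words in a composed name begin, so they only check the leading capital), and the function names are made up.

```python
import re

# Approximation: an element name starts with a capital letter; hyphens are
# allowed between parts, as in "IODEF-Document".
ELEMENT_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*(?:-[A-Z][A-Za-z0-9]*)*$")
# Approximation: an attribute name is lower case, optionally hyphenated.
ATTRIBUTE_NAME = re.compile(r"^[a-z][a-z0-9]*(?:-[a-z0-9]+)*$")


def element_name_ok(name: str) -> bool:
    return ELEMENT_NAME.match(name) is not None


def attribute_name_ok(name: str) -> bool:
    return ATTRIBUTE_NAME.match(name) is not None
```

Under these patterns "RecordData" and "IODEF-Document" pass as element names and "recordData" does not; "rating" passes as an attribute name and "Rating" does not.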