[Gen-art] Gen-ART review of draft-ietf-marf-redaction-04

<david.black@emc.com> Wed, 11 January 2012 02:44 UTC

From: david.black@emc.com
To: ietf@cybernothing.org, msk@cloudmark.com, gen-art@ietf.org, ietf@ietf.org
Date: Tue, 10 Jan 2012 21:44:16 -0500
Cc: presnick@qualcomm.com, marf@ietf.org

I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please
see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may receive.

Document: draft-ietf-marf-redaction-04
Reviewer: David L. Black
Review Date: January 10, 2012
IETF LC End Date: January 18, 2012
IESG Telechat Date: January 19, 2012

Summary: This draft is on the right track but has open issues, described in the review.

This draft specifies a method for redacting information from email abuse reports
(e.g., hiding the local part [user] of an email address), while still allowing
correlation of the redacted information across related abuse reports from the same
source. The draft is short, clear, and well written.

There are two open issues:

[1] The first open issue is the absence of security guidance to ensure that this
redaction technique effectively hides the redacted information.  The redaction
technique is to concatenate a secret string (called the "redaction key") to the
information to be redacted, apply "any hashing/digest algorithm", convert the output
to base64 and use that base64 string to replace the redacted information.
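
As a rough illustration of the technique as summarized above, here is a minimal
Python sketch. The function name redact, the choice of SHA-256, the UTF-8
encoding, and the key-then-value concatenation order are my assumptions for
illustration only; the draft itself allows "any hashing/digest algorithm" and
its text governs the actual order.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str) -> str:
    """Replace `value` with a base64-encoded digest of key + value.

    Assumptions (not taken from the draft): SHA-256 as the digest,
    UTF-8 encoding, and the key placed before the value.
    """
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# The same input and key always give the same token, which is what allows
# correlation across reports from the same source.
token_a = redact("alice", "potatoes")
token_b = redact("alice", "potatoes")
token_c = redact("bob", "potatoes")
```

Here token_a and token_b match while token_c differs, so a report recipient can
correlate reports by string comparison without seeing the original local part.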

There are two important ways in which this technique could fail to effectively hide
the redacted information:
	- The secret string may inject insufficient entropy.
	- The hashing/digest algorithm may be weak.

To take an extreme example, if the secret string ("redaction key") consists of a
single ASCII character, and a short email local part is being redacted, then the
output is highly vulnerable to dictionary and brute force attacks because only 6 bits
of entropy are added (the result may look secure, but it's not).  Beyond this extreme
example, this is a potentially real concern - e.g., applying the rule of thumb that
ASCII text contains 4-5 bits of entropy per character, the example in Appendix A
uses a "redaction key" of "potatoes" that injects at most 40 bits of entropy -
is that sufficient for email redaction purposes?
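
To make the extreme example concrete, the following sketch shows how little
work a single-character key requires of an attacker. It assumes the same
hypothetical redact helper (SHA-256, key placed before the value), a 64-symbol
key alphabet to match the 6-bit figure, and a made-up list of candidate local
parts; none of these come from the draft.

```python
import base64
import hashlib
import string


def redact(value: str, redaction_key: str) -> str:
    # Hypothetical helper: SHA-256 over key + value, base64-encoded.
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def brute_force(token, candidate_keys, candidate_values):
    """Try every (key, value) pair until one reproduces the token."""
    for key in candidate_keys:
        for value in candidate_values:
            if redact(value, key) == token:
                return key, value
    return None


# 64 possible single-character keys, i.e. 6 bits of entropy.
key_alphabet = string.ascii_letters + string.digits + "+/"
# Illustrative dictionary of likely local parts (an assumption).
local_parts = ["admin", "alice", "bob", "info", "sales"]

observed = redact("bob", "q")
recovered = brute_force(observed, key_alphabet, local_parts)
```

The search space here is only 64 x 5 = 320 hashes, so recovered comes back as
the key "q" and the local part "bob" essentially instantly. A larger
dictionary changes the cost linearly, not fundamentally.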

To take a silly example, if a CRC is used as the hash with that sort of short input,
the result is not particularly difficult to invert.

I suggest a couple of changes:
1) Change "any hashing/digest algorithm" to require use of a secure hash, and
	explain what is meant by "secure hash" in the security considerations section.
2) Require a minimum length of the "redaction key" string, and strongly suggest
	(SHOULD) that it be randomly generated (e.g., by running sufficient output
	of an entropy-rich random number generator through a base64 converter).

For the latter change, figure out the amount of entropy that should be used
for redaction - the recommended string length will be larger because printable
ASCII is not entropy-dense (at best it's good for 6 bits of entropy in each
8-bit character, and human-written text such as this message has significantly
less).
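
The arithmetic behind that suggestion can be sketched as follows. The 128-bit
target is purely an assumption for illustration (the right value is for the
authors to determine), and the function names are hypothetical.

```python
import base64
import math
import secrets


def base64_chars_for(entropy_bits: int) -> int:
    """Minimum count of base64 characters needed, at 6 bits per character."""
    return math.ceil(entropy_bits / 6)


def random_redaction_key(entropy_bits: int) -> str:
    """Generate a printable key by base64-encoding random bytes.

    secrets.token_bytes draws from the operating system's CSPRNG.
    """
    n_bytes = math.ceil(entropy_bits / 8)
    return base64.b64encode(secrets.token_bytes(n_bytes)).decode("ascii")


# Assumed target of 128 bits: 16 random bytes, at least 22 base64 characters
# (24 once the "=" padding is included).
key = random_redaction_key(128)
min_chars = base64_chars_for(128)
```

By the same arithmetic, even a fully random 8-character base64 string carries
at most 48 bits, and a human-chosen word of that length carries fewer.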

From a pure security perspective, use of HMAC with specified secure hashes
(SHA2-family) and an approach of hashing the "redaction key" down to a binary
key for HMAC would be a stronger approach. I suggest that the authors consider
this approach, but there may be practical usage concerns that suggest not adopting it.
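
For reference, the HMAC alternative could look roughly like the sketch below.
The specific choices (SHA-256 both for deriving the binary key and as the HMAC
hash, UTF-8 encoding, and the function names) are my assumptions, not
something the draft specifies.

```python
import base64
import hashlib
import hmac


def derive_hmac_key(redaction_key: str) -> bytes:
    """Hash the textual redaction key down to a 32-byte binary key."""
    return hashlib.sha256(redaction_key.encode("utf-8")).digest()


def redact_hmac(value: str, redaction_key: str) -> str:
    """Replace `value` with base64(HMAC-SHA256(derived_key, value))."""
    key = derive_hmac_key(redaction_key)
    tag = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(tag).decode("ascii")


hmac_a = redact_hmac("alice", "potatoes")
hmac_b = redact_hmac("alice", "potatoes")
hmac_other_key = redact_hmac("alice", "tomatoes")
```

The output is still a base64 string that correlates by string match, so the
report format would not change. Note that HMAC does not add entropy: a weak
redaction key remains guessable however it is hashed.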

[2] The second open issue is the absence of security considerations for the
redaction key.  The security considerations section needs to caution that the
redaction key is a secret key that must be managed and protected as a secret key.
Disclosure of a redaction key removes the redaction from all reports that used
that key.  As part of this, guidance should be provided on when and how to change
the redaction key in order to limit the effects of loss of secrecy for a single
redaction key.

Editorial Nit: I believe that "anonymization" is a better description of what
this draft is doing (as opposed to "redaction"), particularly as the result is
intended to be correlatable via string match across reports from the same source.

idnits 2.12.13 didn't find any nits.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david.black@emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------