[MORG] Review: draft-ietf-morg-fuzzy-search-02

Cyrus Daboo <cyrus@daboo.name> Fri, 30 July 2010 08:44 UTC

Date: Fri, 30 Jul 2010 10:44:54 +0200
From: Cyrus Daboo <cyrus@daboo.name>
To: morg@ietf.org

Hi,
I have looked at this document. Some comments:

1) I would like to see an example of a mixed fuzzy/non-fuzzy search, just 
to emphasize that such a thing can happen. e.g.:

C: A01 SEARCH FUZZY SUBJECT "Mulberry" FROM cyrus@daboo.name

FROM - precise match, SUBJECT - fuzzy match
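As a rough illustration of how a client might assemble such a mixed command, here is a minimal Python sketch. The helper names (quote, build_search) and the (key, value, fuzzy) tuple layout are made up for this example and are not taken from the draft; the only assumption about the extension is that FUZZY is emitted immediately before the search key it applies to. Values are always sent as quoted strings here, which is an equivalent way of writing the same string.

```python
def quote(value: str) -> str:
    """Render a value as an IMAP quoted string, escaping backslash and double quote."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_search(criteria) -> str:
    """Build a SEARCH command body from (key, value, fuzzy) tuples.

    Hypothetical helper: FUZZY is prefixed only to the keys flagged as fuzzy,
    so precise and fuzzy keys can be mixed in one command.
    """
    parts = ["SEARCH"]
    for key, value, fuzzy in criteria:
        if fuzzy:
            parts.append("FUZZY")
        parts.extend([key, quote(value)])
    return " ".join(parts)


command = build_search([
    ("SUBJECT", "Mulberry", True),           # fuzzy match
    ("FROM", "cyrus@daboo.name", False),     # precise match
])
print(command)
```

The point of the per-key flag is simply that fuzziness is a property of each individual search key, not of the command as a whole.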

2) I would like it to be clear that, for a "plain" SEARCH response (i.e. no 
relevancy results), the order of returned results is arbitrary (i.e. 
NOT ordered by relevancy) - I am assuming that is the intent, but I 
would like it to be clear to client implementors.

3) Security: there should be some discussion about how relevancy rankings 
can be "poisoned" by smart attackers using certain keywords or hidden 
markup in their messages to boost the rankings.

4) Given RFC5255's comparator stuff, is the active comparator expected to 
have any influence on the fuzzy behavior? I am guessing not, but it might be 
worth a mention.

5) How are rankings determined when more than one FUZZY is used? I assume 
the server has to use its own logic for aggregating the rankings.
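To show why this might be worth spelling out, here is a small Python sketch of two aggregation policies a server could plausibly pick. Everything in it is hypothetical: the aggregate function, the "mean" and "min" policy names, and the per-key integer scores are assumptions for illustration, not anything the draft defines.

```python
def aggregate(scores, policy="mean"):
    """Combine per-key fuzzy scores for one message into a single score.

    Hypothetical server-side policy, not taken from the draft.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if policy == "mean":
        return round(sum(scores) / len(scores))
    if policy == "min":
        return min(scores)
    raise ValueError(f"unknown policy: {policy}")


# Per-key scores for two messages matched by two FUZZY keys (made-up numbers).
message_a = [90, 10]
message_b = [40, 40]

for policy in ("mean", "min"):
    a = aggregate(message_a, policy)
    b = aggregate(message_b, policy)
    print(policy, "A =", a, "B =", b)
```

With these numbers the "mean" policy ranks message A above B, while "min" ranks B above A, so two servers can legitimately return different orderings for the same search.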

6) Do clients need to care that the FUZZY search key is not distributive, 
e.g.:

FUZZY OR SUBJECT x SUBJECT y != OR FUZZY SUBJECT x FUZZY SUBJECT y

Otherwise OK.

-- 
Cyrus Daboo