
Re: [MORG] fuzzy search



On Mar 25, 2009, at 12:33 AM, <Zoltan.Ordogh at nokia.com> wrote:

"However, the result MUST NOT contain messages that don't have a \Seen flag set or whose size is 5000 bytes or more."

I do not understand why the size limit cannot be fuzzy. What happens when you find a match that is 5016 bytes? Instead of making it "not found", it could get a lower score.
Same for seen and unseen.

Originally I had been thinking about a slightly different syntax, where this text made much more sense. Then Alexey suggested a better way. Afterwards I did start wondering whether the text still made sense, but I decided to just post the draft as it is and see what other people thought about it.

I can think of one potential reason why it might not be good to allow it: if it were allowed, is it a MUST, SHOULD or MAY that a search for messages under 5000 bytes also finds a 5016-byte message? And if so, at what point should a message be dropped from the search results? Or should it always be there, just with a smaller (minimum) score? If there is no recommended limit for servers, is there a recommendation to clients on when they should put such queries inside the FUZZY rules?

I suppose another way to look at it would be from the user's perspective. Anything inside FUZZY should be supplied by the user. The user may want to look at messages of "about" 5000 bytes, and the server should prioritize the results accordingly, without restrictions on the implementation. The same goes for strings.
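
To make the open questions concrete, here is a minimal Python sketch of how a server might score a fuzzy "smaller than 5000 bytes" criterion. Everything in it is an assumption for illustration, not something the draft specifies: the function names, the 0.0 to 1.0 score range, the linear falloff, the 10% slack, and the cutoff at a minimum score are all hypothetical.

```python
def fuzzy_smaller_score(size: int, limit: int, slack: float = 0.1) -> float:
    """Score a message against a fuzzy 'smaller than limit' criterion.

    Hypothetical scoring: messages under the limit score 1.0, and the
    score falls linearly to 0.0 at limit * (1 + slack).
    """
    if size < limit:
        return 1.0
    overshoot = (size - limit) / (limit * slack)
    return max(0.0, 1.0 - overshoot)


def fuzzy_search(sizes: dict[int, int], limit: int,
                 min_score: float = 0.0) -> list[tuple[int, float]]:
    """Return (uid, score) pairs, best match first.

    `sizes` maps a message UID to its size in bytes. Messages whose
    score does not exceed `min_score` are dropped from the results.
    """
    scored = ((uid, fuzzy_smaller_score(size, limit))
              for uid, size in sizes.items())
    kept = [(uid, score) for uid, score in scored if score > min_score]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    mailbox = {1: 4000, 2: 5016, 3: 9000}
    for uid, score in fuzzy_search(mailbox, 5000):
        print(uid, round(score, 3))
```

In this sketch the 5016-byte message is still returned, just with a lower score than the 4000-byte one, while the 9000-byte message is dropped. Choosing `slack` and `min_score` is exactly the part that would need either a recommendation or an explicit statement that it is left to the implementation.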

I am not sure what to say about SORT - could yield funny results - I would need to see it in action.

I'm not sure what you mean by this. SORT is the main reason why the whole relevancy even exists.

BTW: are you using the right boilerplate?

I think so. It got changed somewhat recently, and I think I'm using the same one as e.g. STATUS-IN-LIST (but I'm too lazy to check right now).