[Ltru] #944 in draft -04 (was: Confusing conformance language)



Addison Phillips wrote:

> I have rewritten Section 2.1.1 some more and those changes
> went along for the ride. If you're interested in issue #944,
> you should go read that section again and send comments

Not new, but reading it again it still tickles me:  You say...

| a combined total length of up to six characters, much larger
| registered tags were not only possible but were actually
| registered.

...if this "much larger" stands for "almost 67% longer" it's
true, but OTOH ten characters is not _that_ long, if it's the
worst registered 3066bis tag.

Here's a "normative" typo:

| For example, see [RFC 2231] [23].  This protocol has no
| explicit length limitation:

In fact it is in part about bypassing some length limitations
and introducing I18N in MIME parameters.  In its second part it
is about more I18N (= language tags) in MIME encoded words.

| the language tag's length is limited by the length of other
| header components (such as the charset's name) coupled with
| the 78 character limit in [RFC 2822]

NAK, the relevant limit for encoded words isn't the SHOULD "78"
in RfC 2822, it is the "76" in RfC 2047:

| An 'encoded-word' may not be more than 75 characters long,
| including 'charset', 'encoding', 'encoded-text', and
| delimiters.  If it is desirable to encode more text than will
| fit in an 'encoded-word' of 75 characters, multiple
| 'encoded-word's (separated by CRLF SPACE) may be used.

| While there is no limit to the length of a multiple-line
| header field, each line of a header field that contains one
| or more 'encoded-word's is limited to 76 characters.

You could also say "75" with a reference to [RFC2047].  Please
do not mention RfC 2822 and its SHOULD "78", this is another
story, and it's unrelated to our "most perverse tag" problem.

| Thus the "limit" might be 60 or more characters, but it could
| potentially be quite small.

 =?utf-8*?B?9I+/vw==?=
1234567890123456789012 (22)

The worst case for UTF-8 is 4 bytes F4 8F BF BF (u+10FFFF).
I'm not sure about the B64 rules, but IIRC splitting is evil,
and so we get 8 bytes 9I+/vw==.

Total length for the worst case from SP up to ?= therefore 22,
76-22=54.  For many Q-encoded US-ASCII characters it's only
one byte resulting in an absolute _maximum_ of 76-15 = 61:

 =?utf-8*?Q?3?=
123456789012345 (15)

And for the worst Q-encoded case it is 76-26=50.

 =?utf-8*?Q?=F4=8F=BF=BF?=
12345678901234567890123456 (26)
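
The three subtractions above can be rechecked with a small sketch.
It assumes the 76-character line limit from RfC 2047, a leading SP,
the charset "utf-8", and an empty language tag after the "*".  The
names LINE_LIMIT and room_for_tag are made up for illustration, they
are not from any draft or library.

```python
import base64

# RFC 2047: a header line containing encoded-words is limited to 76 characters.
LINE_LIMIT = 76


def room_for_tag(charset: str, encoding: str, encoded_text: str) -> int:
    """Characters left for a language tag on one header line.

    Builds SP + encoded-word with an empty language tag after "*"
    and subtracts its length from the line limit.
    """
    word = f" =?{charset}*?{encoding}?{encoded_text}?="
    return LINE_LIMIT - len(word)


# Highest code point, U+10FFFF: 4 bytes in UTF-8.
worst = "\U0010FFFF".encode("utf-8")
b64_text = base64.b64encode(worst).decode("ascii")   # 8 characters
q_text = "".join(f"={byte:02X}" for byte in worst)   # 12 characters

print(room_for_tag("utf-8", "B", b64_text))  # 54
print(room_for_tag("utf-8", "Q", "3"))       # 61
print(room_for_tag("utf-8", "Q", q_text))    # 50
```

A longer charset name, or more than one encoded character, shrinks
the room further, which is the "could potentially be quite small"
part of the draft text.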

Now that's a nice number needing no explicit justification, you
could say:                  vv
  Thus the "limit" might be 50 or more characters, but it could
  potentially be quite small
                            Bye, Frank


