INTERNET-DRAFT                                                 N. Elkins
Intended Status: Informational                           Inside Products
                                                            H. Chowdhary
                                                                    NIXI

Expires: April 7, 2017                                   October 4, 2016

              Deployment Issues for Internationalized Email
                      draft-elkchow-iea-deploy-00

Abstract

   International Email Addresses (IEA) are far from a global reality.
   The current de facto language of the Internet is English. Even today,
   many of the users of the Internet do not speak English as their
   primary language. The next billion users of the Internet are likely
   to be even less familiar with English. IEA is probably the first
   application needed in a truly internationalized Internet.
   The Email Address Internationalization (EAI) Working Group defined
   the RFCs to support internationalized email. The time may now finally
   have come to develop best practices and to discuss the deployment
   challenges for IEA.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Punycode
     1.2  Single Language / Multiple Languages
   2  Email Servers
     2.1  Backend Databases
   3  Email Clients
     3.1  Display of Email ID
     3.2  Display of Email Body
     3.3  Messages Routed to SPAM
   4  Multiple Identities / Aliases
   5  Email Address Books
   6  Security Considerations
     6.1  Homographic Attacks
     6.2  Use of Mixed Scripts
     6.3  Right-to-left Issues
   7  IANA Considerations
   8  References
     8.1  Normative References
     8.2  Informative References
   Authors' Addresses

1  Introduction

   The Email Address Internationalization (EAI) Working Group, which has
   concluded, created a structure and framework for internationalized
   email addresses. From the charter:

   "The email address has two parts, local part and domain part. Email
   address internationalization must deal with both. This working
   group's previous experimental efforts investigated the use of UTF-8
   as a general approach to email internationalization. That approach is
   based on the use of an SMTP extension to enable the use of UTF-8 in
   envelope address local-parts, optionally in address domain-parts, and
   in mail headers. The mail header contexts can include both addresses
   and wherever existing protocols (e.g., RFC 2231) permit the use of
   encoded-words."
   [EAICharter]

   Much work was done in this group, including:

   RFC 6530 : Overview and Framework for Internationalized Email
   [RFC6530]
   RFC 6531 : SMTP Extension for Internationalized Email [RFC6531]
   RFC 6532 : Internationalized Email Headers [RFC6532]
   RFC 6533 : Internationalized Delivery Status and Disposition
   Notifications [RFC6533]
   RFC 6783 : Mailing Lists and Non-ASCII Addresses [RFC6783]
   RFC 6855 : IMAP Support for UTF-8 [RFC6855]
   RFC 6856 : Post Office Protocol Version 3 (POP3) Support for UTF-8
   [RFC6856]
   RFC 6857 : Post-Delivery Message Downgrading for Internationalized
   Email Messages [RFC6857]
   RFC 6858 : Simplified POP and IMAP Downgrading for Internationalized
   Email [RFC6858]

   Yet, deployment lags. Global EAI is far from reality.

   The Internet is getting bigger day by day by integrating top-level
   domains using non-ASCII scripts, e.g., Devanagari, Cyrillic, Arabic,
   and Chinese. These new top-level domains need to support sending
   email as well as access to web sites via browsers.

   If a user has an internationalized email address, then it should be
   possible to send to and receive from any email address using any
   email client. This interoperability demands concerted efforts by all
   major email service providers. Even now, there is very limited or no
   support in email servers (SMTP, IMAP, POP), email providers (Gmail,
   Yahoo, Hotmail) and email clients.

   Often, it is not even possible to create an email ID for end-users in
   a non-ASCII based language, even though many Internationalized Domain
   Names (IDN) exist. This is of major concern to many in the parts of
   the world where the primary language is not English.

1.1  Punycode

   Languages not based on the Latin script (A, B, C, etc.) use Unicode
   to represent the letters in their alphabet rather than ASCII.
   Punycode is used to represent Unicode characters in ASCII format.
   It is used in the transport of email.

   For example:

   English: Nehru
   Hindi: ????? [Cannot be displayed]
   Punycode: xn--l2bq0a0bw

   Punycode-encoded labels start with the prefix "xn--".

   An application handling IDN domains has to reference an IDN
   repository to know how to display them. Email adds to the
   complication since many email systems pre-date the introduction of
   IDNs. These systems often simply reject emails that do not work
   within the old domain name model.

1.2  Single Language / Multiple Languages

   Some people ask, "What if I send an email in Russian and want to
   respond in Chinese? What problems will arise?"

   This is certainly an important issue, but it is hard enough to send
   emails back and forth in one non-ASCII based language. This draft
   leaves for the future the issues of multiple languages, with the
   accompanying translation and user interface issues.

2  Email Servers

   For an email server to be ready for EAI, it must implement:

   RFC 6530 : Overview and Framework for Internationalized Email
   [RFC6530]
   RFC 6531 : SMTP Extension for Internationalized Email [RFC6531]
   RFC 6532 : Internationalized Email Headers [RFC6532]

   Here is a partial list of servers and test beds for EAI:

   Postfix 3.0 and above
   Coremail
   Throughway (Thailand)
   OpenMail (Taiwan)
   EAI test environment (Saudi Arabia)
   Xgenplus (India)

2.1  Backend Databases

   Email servers may store data in relational databases such as MySQL
   and MariaDB. These databases must support UTF-8 and be configured
   to use UTF-8. On occasion, both Punycode and UTF-8 fields may need
   to be defined.

3  Email Clients

   Since the email client is the interface to the user, this is where a
   number of issues arise. There are a number of email clients and
   service providers that support EAI to various extents.
   A partial list follows:

   Coremail
   Horde Project
   Microsoft Outlook 2016 for PC
   Gmail - to some extent
   Apple Mail - to some extent
   Throughway (Thailand)
   OpenMail (Taiwan)
   EAI test environment (Saudi Arabia)
   Roundcube

3.1  Display of Email ID

   The email ID is often shown in Punycode. For example, the email ID
   harish@nalini.bharat in Hindi is:

   ???????@??????.?????? [cannot be displayed]

   This email ID will be displayed in many email clients in Punycode.
   That is, the email ID will be shown as:

   xn--t2bmh3a@xn--l2ba3a4cg.xn--h2brj9c

   This is not particularly user-friendly.

3.2  Display of Email Body

   The issue with the email body has to do with the ability to type
   easily in the language of choice. A number of browsers have
   extensions to allow this.

   But when it comes to the display of links containing IDN names, often
   the link does not work.

3.3  Messages Routed to SPAM

   Messages may be routed by email clients to SPAM if they are not in
   English.

4  Multiple Identities / Aliases

   A user may have multiple identities. That is, he or she may have an
   English language email ID, a Hindi email ID, and so on.

5  Email Address Books

   Email address books today have little or no support for addresses in
   non-Latin based languages.

6  Security Considerations

6.1  Homographic Attacks

   A user on the Internet can easily be duped by the Russian letters
   'a, e, p, or y', as they are indistinguishable in writing from their
   English equivalents. A number of the letters (such as "a") look
   alike for etymological reasons, whereas others look similar by sheer
   coincidence. For example, the Russian letter p is really pronounced
   like r; however, the glyphs of both letters are identical. Russian
   is not the only such language; other languages written in Cyrillic
   could cause similar collisions.
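
   The lookalike letters described above are distinct Unicode code
   points even though they render identically. The following is a
   minimal Python sketch of that point, not part of any EAI
   specification. The helper names script_of and scripts_in are
   hypothetical, and taking the first word of the Unicode character
   name as the script is a simplifying assumption rather than the real
   Unicode Script property.

```python
import unicodedata


def script_of(ch):
    """Rough script label for one character (hypothetical helper).

    Assumption: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", ...) stands in for the Script property.
    """
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_in(label):
    """Set of rough script labels used by the letters of a label."""
    return {script_of(ch) for ch in label if ch.isalpha()}


latin = "paypal"
# Second letter replaced by CYRILLIC SMALL LETTER A (U+0430).
spoof = "p\u0430ypal"

# The two strings look the same on screen but compare unequal.
print(latin == spoof)
print(sorted(scripts_in(latin)))
print(sorted(scripts_in(spoof)))
```

   A check of this kind only reports that a label draws letters from
   more than one script; by itself it cannot tell a spoofed name from a
   legitimate mixed-script one.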

   For example, paypal.com and paypal.com look alike; however, the
   first domain name contains the Russian letter "a" while the other
   contains the English letter "a". This can further lead to
   similar-looking e-mail IDs such as Nalini@paypal.com and
   Nalini@paypal.com; both look the same but are different e-mail IDs
   in reality. In this case, the characters used for the fraud are
   perfectly legitimate.

   Therefore, numerous English domain names and e-mail IDs may be
   homographed, that is, maliciously misspelled by substitution of
   non-Latin letters. A number of approaches may be used to protect
   against this sort of attack. The simplest fix would be to
   indiscriminately forbid domain names that combine letters from
   different alphabets, but this would block genuinely helpful names
   like "CNNenEspanol.com". [Note: the 'n' in Espanol has a ~ on the
   top] Alternatively, the browser may highlight international letters
   in domain names with a separate color, although users might find
   this excessively intrusive.

   Browsers may instead highlight only really suspicious names, like
   ones that blend letters from different scripts inside a single word.
   For additional security, the browser may use a map of identical
   letters to look for collisions between the requested domain and
   identically written registered ones. If it finds a collision, it
   would then warn the user of suspected fraud.

6.2  Use of Mixed Scripts

   Legitimate uses for mixed scripts in both Japanese and Chinese are
   also possible, and many people use Latin usernames with IDN domains.

6.3  Right-to-left Issues

   Some languages, for example, Arabic, are written right-to-left.
   Systems created to work with Arabic script typically switch
   direction when the first Arabic character is entered. But with a
   mixed script email address such as 'customer.care@[IDN domain].IDN',
   the system needs to be able to handle both left-to-right and
   right-to-left scripting.
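
   Whether an address mixes left-to-right and right-to-left characters
   can be determined from the Unicode bidirectional class of each
   character. The following is a minimal Python sketch under stated
   assumptions: the function names directions and is_mixed_direction
   are hypothetical, the sample address is purely illustrative, and
   only strong letter classes (L, R, AL) are considered, which is far
   simpler than the full Unicode Bidirectional Algorithm.

```python
import unicodedata

# Strong bidirectional classes only (an assumption of this sketch).
RTL_CLASSES = {"R", "AL"}  # right-to-left letters, e.g. Hebrew, Arabic
LTR_CLASSES = {"L"}        # left-to-right letters, e.g. Latin


def directions(text):
    """Return the set of strong directions ("ltr", "rtl") in text."""
    found = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            found.add("rtl")
        elif bidi in LTR_CLASSES:
            found.add("ltr")
    return found


def is_mixed_direction(address):
    """True if the address contains both LTR and RTL letters."""
    return directions(address) == {"ltr", "rtl"}


# Illustrative address: Latin local part, Arabic-script domain labels.
sample = "customer.care@\u0645\u062b\u0627\u0644.\u0645\u0635\u0631"

print(is_mixed_direction(sample))
print(is_mixed_direction("customer.care@example.com"))
```

   A client could use such a check to decide when an address needs
   explicit handling of both directions instead of switching the whole
   input field on the first right-to-left character.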

   Right-to-left handling could also be an additional potential
   security issue. If someone registers the domain name
   "customer.helpline" in this scenario, they could type in the Arabic
   script first (username), triggering an email system to switch to
   right-to-left, and then put in the domain "customer.helpline". It
   would appear in the input box as though "customer.helpline" were the
   username on the left-hand side of the email address. The only
   difference would be that the whole address would be aligned right.

7  IANA Considerations

   There are no IANA considerations.

8  References

8.1  Normative References

   [RFC6530] Klensin, J. and Y. Ko, "Overview and Framework for
   Internationalized Email", RFC 6530, February 2012.

   [RFC6531] Yao, J. and W. Mao, "SMTP Extension for Internationalized
   Email", RFC 6531, February 2012.

   [RFC6532] Yang, A., Steele, S., and N. Freed, "Internationalized
   Email Headers", RFC 6532, February 2012.

   [RFC6533] Hansen, T., Newman, C., and A. Melnikov,
   "Internationalized Delivery Status and Disposition Notifications",
   RFC 6533, February 2012.

   [RFC6783] Levine, J. and R. Gellens, "Mailing Lists and Non-ASCII
   Addresses", RFC 6783, November 2012.

   [RFC6855] Resnick, P., Ed., Newman, C., Ed., and S. Shen, Ed., "IMAP
   Support for UTF-8", RFC 6855, March 2013.

   [RFC6856] Gellens, R., Newman, C., Yao, J., and K. Fujiwara, "Post
   Office Protocol Version 3 (POP3) Support for UTF-8", RFC 6856, March
   2013.

   [RFC6857] Fujiwara, K., "Post-Delivery Message Downgrading for
   Internationalized Email Messages", RFC 6857, March 2013.

   [RFC6858] Gulbrandsen, A., "Simplified POP and IMAP Downgrading for
   Internationalized Email", RFC 6858, March 2013.

8.2  Informative References

   [EAICharter] https://datatracker.ietf.org/wg/eai/charter/, May 2010

Authors' Addresses

   Nalini Elkins
   Inside Products, Inc.
   Carmel Valley, CA 93924
   USA
   Phone: +1 831 659 8360
   Email: nalini.elkins@insidethestack.com

   Harish Chowdhary
   NIXI
   India
   Email: harish@nixi.in