RFC 2047 Message Header Extensions

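In this tutorial, we’ll be discussing RFC 2047, “MIME Part Three: Message Header Extensions for Non-ASCII Text”. It lets us send non-US-ASCII text in Internet mail header fields (such as Subject or the display name in From) by wrapping it in “encoded-words” of the form =?charset?encoding?encoded-text?=. The encoding is either B (base64) or Q (a variant of quoted-printable), so the header itself stays pure US-ASCII on the wire. Let’s roll with the details!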
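To make the encoded-word idea concrete, here is a minimal sketch using Python’s standard email.header module. The helper names encode_header_value and decode_header_value and the sample text are made up for illustration; the library chooses between the B and Q encodings on its own, so the exact encoded string may differ from what another mail client would produce.

```python
from email.header import Header, decode_header


def encode_header_value(text: str, charset: str = "utf-8") -> str:
    """Wrap non-ASCII text in RFC 2047 encoded-words (hypothetical helper)."""
    return Header(text, charset).encode()


def decode_header_value(value: str) -> str:
    """Turn a header value containing encoded-words back into a Python string."""
    parts = []
    for raw, charset in decode_header(value):
        if isinstance(raw, bytes):
            # Encoded parts carry their charset; unencoded parts report None.
            parts.append(raw.decode(charset or "ascii"))
        else:
            parts.append(raw)
    return "".join(parts)


original = "Grüße aus Köln"
encoded = encode_header_value(original)   # starts with "=?utf-8?"
decoded = decode_header_value(encoded)

print(encoded)
print(decoded)
```

The encoded value contains only US-ASCII characters, which is what lets it travel through mail systems that do not understand anything beyond 7-bit headers.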

First, some background. RFC 2047 builds on two header fields that are defined in RFC 2045 (MIME Part One), not in RFC 2047 itself: Content-Type and Content-Transfer-Encoding. The former specifies the media type (e.g., text/plain or image/jpeg) of the message body, while the latter specifies how that data is encoded for transmission over email.

Let’s take a closer look at these two header fields:

1. Content-Type Header Field
The Content-Type header field specifies the media type, its subtype, and any parameters that go with them. For example, we can use it to declare a plain text message in the UTF-8 character set like this:

Content-Type: text/plain; charset=UTF-8

In this case, text is the media type, plain is the subtype, and charset=UTF-8 is a parameter saying that the body text uses the UTF-8 character set. Pretty straightforward, right?
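As a rough illustration, the same header can be produced and inspected with Python’s standard email.message.EmailMessage class. The subject and body text here are placeholders, and the library picks the Content-Transfer-Encoding for the body by itself, so that value is printed rather than assumed.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Plain text example"
# set_content() on a str creates a text/plain body; charset is a parameter.
msg.set_content("Grüße aus Köln\n", charset="utf-8")

maintype = msg.get_content_maintype()   # media type
subtype = msg.get_content_subtype()     # subtype
charset = msg.get_content_charset()     # value of the charset parameter

print(msg["Content-Type"])
print(maintype, subtype, charset)
print(msg["Content-Transfer-Encoding"])  # chosen by the library
```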

2. Content-Transfer-Encoding Header Field
The Content-Transfer-Encoding header field allows us to specify how the data is encoded for transmission over email. For example:

Content-Type: application/octet-stream; name="myfile.bin"
Content-Transfer-Encoding: base64

In this case, application is the media type and octet-stream is the subtype, which together mark the attachment as arbitrary binary data. The Content-Transfer-Encoding field is separate: it says the body has been encoded with Base64 for transmission. Base64 represents every byte, including non-printable ones, using only printable US-ASCII characters.
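Here is a small sketch of the attachment case with Python’s standard email.message.EmailMessage. The payload bytes and the body text are invented for the example; add_attachment() applies base64 to bytes content by default, which is what the check on the header relies on.

```python
from email.message import EmailMessage

payload = bytes(range(256))  # sample binary data, including non-printable bytes

msg = EmailMessage()
msg.set_content("See the attached file.\n")
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="myfile.bin",
)

attachment = next(msg.iter_attachments())
cte = str(attachment["Content-Transfer-Encoding"])
wire_text = attachment.get_payload()     # base64 text as it appears in the message
round_tripped = attachment.get_content()  # decoded back to the original bytes

print(attachment.get_content_type(), cte)
print(wire_text[:60])
```

Note that the library records the filename in a Content-Disposition header rather than in the name parameter of Content-Type; both forms are seen in real mail.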

However, RFC 2045 attaches some important considerations to Content-Transfer-Encoding:

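1. Encodings cannot be nested: composite media types (multipart and message) may only use 7bit, 8bit, or binary, so base64 or quoted-printable is applied to the individual parts and never to the container. Banning nested encodings may complicate the job of certain mail gateways, but this seems less of a problem than the effect of nested encodings on user agents.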

2. Any entity with an unrecognized Content-Transfer-Encoding must be treated as if it has a Content-Type of “application/octet-stream”, regardless of what the Content-Type header field actually says.

3. The quoted-printable and base64 encodings are designed so that conversion between them is possible, but hard line breaks in quoted-printable encoding output must be converted to a corresponding encoded CRLF sequence when converting from quoted-printable to base64. Similarly, a CRLF sequence in the canonical form of the data obtained after base64 decoding must be converted to a quoted-printable hard line break (but only when converting text data).
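Considerations 2 and 3 can be sketched with Python’s standard base64 and quopri modules. This is only an illustration for text data: the function names, the KNOWN_ENCODINGS set, and the use of bare "\n" as the in-memory line break are assumptions of this example, not something the RFC prescribes.

```python
import base64
import quopri

# Encodings this hypothetical program knows how to decode.
KNOWN_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def effective_content_type(declared_type: str, cte: str) -> str:
    """Fall back to application/octet-stream when the encoding is unrecognized."""
    if cte.strip().lower() not in KNOWN_ENCODINGS:
        return "application/octet-stream"
    return declared_type


def qp_to_base64(qp_bytes: bytes) -> bytes:
    """Convert quoted-printable text to base64, encoding line breaks as CRLF."""
    decoded = quopri.decodestring(qp_bytes)
    # Hard line breaks become CRLF in the canonical form before base64 encoding.
    canonical = decoded.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return base64.encodebytes(canonical)


def base64_to_qp(b64_bytes: bytes) -> bytes:
    """Convert base64 text data to quoted-printable (text only, not binary)."""
    canonical = base64.decodebytes(b64_bytes)
    # CRLF in the canonical form becomes a quoted-printable hard line break.
    local = canonical.replace(b"\r\n", b"\n")
    return quopri.encodestring(local)


qp_sample = b"caf=C3=A9\nline two\n"
as_base64 = qp_to_base64(qp_sample)
back_to_qp = base64_to_qp(as_base64)

print(as_base64)
print(back_to_qp)
print(effective_content_type("text/plain", "x-unknown"))
```

Running base64_to_qp on non-text data would corrupt any byte pair that happens to look like CRLF, which is why the line-break conversion applies only to text.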
