> Viva.com's outgoing verification emails lack a Message-ID header, a requirement that has been part of the Internet Message Format specification (RFC 5322) since 2008
> ...
> `Message-ID` is one of the most basic required headers in email.
+----------------+--------+------------+----------------------------+
| Field | Min | Max number | Notes |
| | number | | |
+----------------+--------+------------+----------------------------+
| | | | |
|/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
... bla bla bla ...
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/|
| message-id | 0* | 1 | SHOULD be present - see |
| | | | 3.6.4 |
|/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
... more bla bla ...
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/|
| optional-field | 0 | unlimited | |
+----------------+--------+------------+----------------------------+
and in section 3.6.4:
... every message SHOULD have a "Message-ID:" field.
That says SHOULD, not MUST, so how is it a requirement?
SHOULD is a requirement. It means that you have to do it unless you know some specific reason that the requirement doesn't apply in your case. "I don't want to" is not a valid excuse, "I don't see a reason to" isn't either.
IIRC this particular rule is a SHOULD because MUAs often send messages without a Message-ID to their submission server, and the submission server adds one if necessary. https://www.rfc-editor.org/rfc/rfc6409.html#section-8.3 The SHOULD lets those messages be valid. Low-entropy devices that can't generate a good random ID are rare these days, but old devices remain in service, so the workaround is IMO justified.
I once had a job where reading standards documents was my bread and butter.
SHOULD is not a requirement. It is a recommendation. For requirements they use SHALL.
My team was writing code that was safety related. Bad bugs could mean lives lost. We happily ignored a lot of SHOULDs and were open about it. We did it not because we had a good reason, but because it was convenient. We never justified it. Before our code could be released, everything was audited by a 3rd party auditor.
Yes, except there seems to be a move on the best words from SHALL to MUST and from SHOULD to MAY.
IANAL but I recall reading this in e.g. legal language guidance sites.
Pulling exact quotes out, SHOULD means "there may exist valid reasons in particular circumstances to ignore a particular item" while MAY means "an item is truly optional."
I don't think this can be interpreted as simply "should is optional".
I think that is a bit to easy. MAY is described ar optional.
SHOULD - Should really be there. It's not MUST, you can ignore it but do not come crying if your email is not delivered to some of your customers !
you should have though about that before.
Maybe the standards documents you are used to differ from RFCs, but here is the official language:
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
SHOULD is effectively REQUIRED unless it conflicts with another standards requirement or you have a very specific edge case.
I just don't understand how you get from the text you pasted to "required". Nowhere does it say that anything is effectively required. Words have meaning.
You should wear sunscreen to the beach. Its recommended as a good way to prevent sunburn. However, the beach police aren't going to come get you for not wearing it. You just might get a sunburn if you don't plan accordingly with other sun countermeasures.
> the full implications must be understood and carefully weighed before choosing a different course.
In this case, the full implication is that your email might be undeliverable. "Should" indicates that the consequences for this fall on the entity that is deviating from the thing they "should" be doing.
> the full implications must be understood and carefully weighed before choosing a different course.
Note the use of the word "must" used twice there. Barring a sufficiently good reason and accepting the consequences, this becomes a very poorly worded "required".
The spec would have been far better starting with SHALL and then carving out the allowance for exceptions.
No, its not a "required"... It means someone may have reasons not to use something, and so spec implementors need to allow for circumstances where it is not present.
Those reasons can be anything. Legal, practical, technological, ideaological. You don't know. All you know is not using it is explicitly permitted.
I don't even know how you got to "used twice" tbh. Both your own comment AND the post you quoted from only have a single "must".
The only thing that text demands is understanding and carefully weighing the implications. If, having done that, you conclude that you don't want to then there is absolutely nothing in the spec stopping you. Maybe the spec would have been better off putting more stuff in SHALL and less in SHOULD, but as written that is definitely not the case.
Any time any document (standards or otherwise) says something is recommended, then of course you should think it through before going against the recommendation. Going from their verbiage to:
> SHOULD is effectively REQUIRED unless it conflicts with another standards requirement or you have a very specific edge case.
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
If the client SHOULD do something and doesn't, and your server does not know why, you SHOULD disconnect and move on.
If the server has considered fully the implications of not having a Message-ID header, then it MAY continue processing.
In general, you will find most of the Internet specifications are labelled MUST if they are required for the protocol's own state-processing (i.e. as documented), while specifications are labelled SHOULD if they are required for application state-processing in some circumstances (i.e. other users of the protocol).
> If the client SHOULD do something and doesn't, and your server does not know why, you SHOULD disconnect and move on.
That is not a rule.
In this situation the server can reject any message if it wants to, and not doing a SHOULD tests the server's patience, but it's still ultimately in the "server wanted to" category, not the "RFC was violated" category.
The RFC is a request for comments. The specific one in question is about Internet Mail.
If server implementers want their mail to be delivered these are things they SHOULD do.
That's it.
It isn't something you can give to your lawyer, and nobody cares about your opinion about what you think "should" means you can make someone else do. This is how it is.
How does Google know whether or not the sender has a valid reason? They cannot know that so for them to reject an email for it means they would reject emails that have valid reasons as well.
How would the sender know the consequences of sending without the header? You shouldn’t assume anything here. As a sender, you should include it unless you’ve already worked out what the recipient is expecting or how it will be handled. Doing this with email is silly because the client is sennding to so many different servers they know nothing about so it’s basically a requirement to include it.
It is not required for the protocol of one SMTP client sending one message to one SMTP server, but it is required for many Internet Mail applications to function properly.
This one for example, is where if you want to send an email to some sites, you are going to need a Message-ID, so you SHOULD add one if you're the originating mail site.
> How does Google know whether or not the sender has a valid reason?
If the Sender has a valid reason, they would have responded to the RFC (Request For Comments) telling implementers what they SHOULD do, rather than do their own thing and hope for the best!
Google knows the meaning of the word SHOULD.
> it means they would reject emails that have valid reasons as well.
No shit! They reject spam for example. And there's more than a few RFC's about that. Here's one about spam that specifically talks about using Message-ID:
> If the server has considered fully the implications
The server "considers" nothing. The considerations are for the human implementers to make when building their software. And they can never presume to know why the software on the other side is working a certain way. Only that the RFC didn't make something mandatory.
The rejection isn't to be compliant with the RFC, it's a choice made by the server implementers.
Either the server must explicitly confirm to servers or the clients must accept everything. Otherwise message delivery is not guaranteed. In the context of an email protocol, this often is a silent failure which causes real-world problems.
I don’t care what the protocol rfc says, the client arbitrarily rejecting an email from the server for some missing unimportant header (for deduction detection?) is silly.
These changes MUST NOT be applied by an SMTP server that
provides an intermediate relay function.
That's Google in this situation.
> Stop hiding behind policy and think for yourself.
Sometimes you should think for yourself, but sometimes, and friend let me tell you this is one of those times, you should take some time to read all of the things that other people have thought about a subject, especially when that subject is as big and old as email.
There is no good reason viva couldn't make a Message-ID, but there's a good reason to believe they can't handle delivery status notifications, and if they can't do that, they are causing bigger problems than just this.
“MAY This word, or the adjective "OPTIONAL", mean that an item is truly optional… An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)”
Note how it explicitly calls out interoperation with implementations that do or do not implement MAY. As a exception that proves the rule, we can reasonably assume that not interoperating with a system ignoring a SHOULD rule is a correct implementation and it is the fault of whoever is not implementing SHOULD.
Hearsay has it that the reason is spam. Spam messages are said to have massively higher chances of minor RFC violations when they arrive at the destination server.
Most of the time, in my experience, when one encounters a situation like this in Internet tech (i.e. "why is this suggestion treated like a hard requirement?"), this is the answer: "because attackers found a way to exploit the lack of the suggestion's implementation in the wild, so it is now a hard requirement."
The standards, to my observation, tend to lag the CVEs.
Side-note: If someone has built a reverse-database that annotates RFCs with overriding CVEs that have invalidated or rendered harmful part of the spec, I'd love to put that in my toolbox. It'd be nice-to-have in the extreme if it hasn't been created yet.
CVE classify a lot of things that have nothing to do with security.
Not having a Message-ID can cause problems for loop-detection (especially on busy netnews and mailing lists), and with reliable delivery status notification.
Dealing with these things for clients who can't read the RFC wastes memory and time which can potentially deny legitimate users access to services
> It seems that Gmail is being pedantic for no reason
So add a message id at the first stop, or hard ban the sender server version until they confirm. A midway point that involves a doom switch is not a good option.
Because in practice it showed up for a period of time as a common thing in spam-senders. They were trying to maximize throughput and minimize software maintenance costs, so they leave out things that the spec says are optional. But that makes "a commonly-implemented optional thing was left out" into a stronger spam signal.
Is it still a strong spam signal? Hard to say. Sources disagree. But as with laws, heuristics, once added, are often sticky.
As someone else who does System Engineering, when dealing with ancient protocols, "shall" is extremely difficult barrier to get over since there is always ancient stuff out there and there might be cases not to do it, esp if it's internal communication.
"SHOULD" is basically, if you control both sides of conversation, you can decide if it's required looking at your requirements. If you are talking between systems where you don't control both sides of conversation, you should do all "SHOULD" requirements with fail back in cases where other side won't understand you. If for reason you don't do "SHOULD" requirement, reason should be a blog article that people understand.
For example, "SHOULD" requirement would be "all deployable artifacts SHOULD be packaged in OCI container". There are cases where "SHOULD" doesn't work but those are well documented.
I’m doing some work with an email company at the moment. The company has been in the email space for decades. Holy moly email is just full of stuff like this. There is an insane amount of institutional knowledge about how email actually works - not what the specs say but what email servers need to actually do to process real emails and deal with real email clients and servers.
The rfcs try to keep up, but they’re missing a lot of details and all important context for why recommendations are as they are, and what you actually need to do to actually process real email you see in the wild. (And talk to the long tail of email software).
This conversation makes me think about cornering some of the engineers with a microphone. It’d be great to talk through the specs with them, to capture some of that missing commentary.
In a completely different field, navigating ships at sea, the Collision Regulations which define how people must conduct ships at sea, they use the words "Shall" and "May" to differentiate legal requirements and what may just be best practice. "Should" intuitively means something more like "May" to me
Note "the full implications must be understood and carefully weighed before choosing a different course". Gmail and the other big hosters have full-time spam teams who spend a lot of time weighing implications, so I assume the implications of this was weighed.
The original email RFC is also completely unaware of how bad spam is. Sure it might mention it but it's not really AWARE of the problem. The truth is, companies like Google, Microsoft and a few others have de-facto adjusted the minimum requirements for an email. Signing, anti-spam-agreements, etc.. are the true standard if you want an email to get from point a to b. (none of which are going to be REQUIRED in the RFC)
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
Not sure how the people at Google interpreted this about the message-id
You can argue that you not obligated to use message-id but if you don't use it you should blame only yourself that your messages are not accepted. In requiring message-id I would side with google (though in general I think they anti-spam is too aggressive and lacks ways to report false positives). Full RFC compliance (as in not only MUST but also SHOULD unless you have a very good reason) is the easiest part of making sure your emails will be delivered.
> if you don't use it you should blame only yourself that your messages are not accepted
I think it's a gray area
- If the receiver declines your message because "Message-id" is required - then I blame the receiver; because that's not true
- If the receiver declines your message because "most systems do include it, and it's lack of presence is highly correlated with spam email", then it's on the sender
I think it's the latter. But, in either case, you're right in that you get the same result.
Now, let's assume that if it is the latter (it's spam related), and Google were to accept the message, but then internally bin the message, it would be worse. At least in this case, they are bouncing the message. Because of this, the sender is at least aware that the message wasn't delivered.
Also, the author was able to get their mail delivered to a personal gmail.com address. The issue was with a Google Workspace custom email domain. This further makes me think of this as a security/spam related issue. Google is clearly capable of processing the message without a Message-id, they are just refusing for business customers.
My takeaway is that I think that Google is doing the least-wrong thing. And by being explicit in how they are handling it, it at least made the debugging for the author possible.
Also note: in a quick reading of RFC5321 (SMTP), rejecting messages for "policy reasons" is an acceptable outcome. I'm not sure if it applies completely here. The author should probably also be taking into account RFC5321 (SMTP) instead of just 5322 (message format).
> Also, the author was able to get their mail delivered to a personal gmail.com address. The issue was with a Google Workspace custom email domain. This further makes me think of this as a security/spam related issue. Google is clearly capable of processing the message without a Message-id, they are just refusing for business customers.
That's the annoying part to me.
An email is an email. By applying different rules for rejection on different mailboxes, gmail creates a system where it's harder for would-be implementers to test compliance.
If tomorrow gmail creates a new type of mailbox, will there be a third set of rules to have your message delivered?
There are dozens of spam and security settings that admins can change in the Google Workspace console, presumably because different businesses have different requirements. So in practice, there's not just two sets of rules in gmail -- there's probably thousands or millions (however many combinations of settings are actually in use).
In my experience, email is an unreliable way to communicate any time-bounded critical information. When I want to be sure an email was transmitted on either side, the only reliable way to ensure this is to use a distinct channel to validate reception and confirm content.
That is, when some hotline tell me that they just sent and email with the information, I ensure they hold the line until I got the actual email and checked it delivers the relevant information to fulfill the intended process. And when I want to make sure an email was received, I call the person and ask to check, waiting until confirmation.
It’s not that much SMTP/IMAP per se as the whole ecosystem. People can legitimately get fatigue of "is it in my junk directory", "it might be relayed but only after the overloaded spam/junk analyzer accept it", or whatever can go wrong in the MUA, MSA, MTA, MX, MDA chain. And of course people can simply lie, and pretend the email was sent/received when they couldn’t bother less with the actual delivery status.
There are of course many cases where emails is fantastic.
Email is an unreliable way to communicate any information, in the strictest sense of the word "reliable." The protocol does not guarantee that any email will be delivered, nor does it guarantee that failure will be detected. It's a good-faith effort. The bits could drop on the floor at any point and you might never know.
Does it even matter when in reality it's more likely that this is intentional anti-competitive behavior by Google?
They once made all emails from my very reputable small German email provider (a company that has existed and provided email services long before Google existed) go into a black whole - not bounce them back or anything like that, mind you, their servers accepted them and made them disappear forever. I was in contact with the technicians then to get the problem fixed and they told me it's very difficult for them to even reach anyone at Google. It took them several days to get the problem fixed.
Of course, no one will ever be able to prove an intention behind these kind of "technical glitches." Nothing of significance ever happened when Google had large optics fiber connections with NSA installed illegally and claimed to have no knowledge of it, so certainly nothing will happen when small issues with interoperability occur and drive more people to Gmail.
At scale, it's very hard to distinguish malicious intent from the simple consequence of being the largest operator in a space so any motion one makes makes waves.
For what it's worth: having seen some of how the sausage is made, Google isn't particularly interested in screwing over a small reputable German provider. But they also aren't particularly interested in supporting such a provider's desire to route messages to their users because the provider is small. At their scale, "I've forgotten how to count that low" is a real effect. And email, as a protocol, has been pretty busted for decades; it's not Google that made the protocol so open-ended and messy to implement in a way that providers will agree is correct.
> Nothing of significance ever happened when Google had large optics fiber connections with NSA installed illegally and claimed to have no knowledge of it
Nothing of significance outside Google. Inside, Google initiated a technical lift that turned their intranet into an untrusted-by-default ecosystem so that data was encrypted on the fiber (as well as between machines within a datacenter, to head off future compromised-employee attacks). That process took at least five years; I suppose there's a scenario where it was all smoke and mirrors, but being on the inside in the middle of the process, I watched several C-suite who are not particularly good actors be bloody pissed at the US government for putting itself into Google's "threat actor" box and making that much work for the system engineering teams.
Long time ago when I was managing ISP email relay and customers asked "Where is the message I've sent?" seeing in the logs message accepted by receiving SMTP server was the end of the debug for me: I just handed the customer the part of the log and suggested talking to the receiving side IT administrator.
Google is rejecting it to ensure incoming messages aren't spam. SHOULD means "you should do this unless you have a really, really good reason not to." Do they have a good reason not to? It doesn't seem so, meaning Viva is in the wrong here.
No, SHOULD is defined in the RFC, not by colloquial usage. Google is on the wrong, regardless of their "safety" intent.
After all, linguistics is full with examples of words that are spelled the same, but have different meaning in different cultures. I'm glad the RFC spelled it out it for everyone.
if Google's choices are protecting users, they can't be in the wrong. That's the reality of a shared communications infrastructure regardless of what the docs say.
When the docs disagree with the reality of threat-actor behavior, reality has to win because reality can't be fooled.
Per RFC2119:
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
So, it's fairly explicit that the sender should use message-id unless there's a good reason to not do so. The spec is quiet about the recipients behavior (unless there's another spec that calls it out).
(No seriously, I’m asking; are there examples of where it’s actually different from a MUST)?
Also this reminds me of something I read somewhere a long time ago: when specifying requirements don’t bother with SHOULD. Either you want it or you don’t. Because if it’s not a requirement, some people won’t implement it.
I guess the one time it’s good is if you want an optional feature or are moving towards requiring it. In this case Google has decided it’s not an optional feature.
"SHOULD generally means: some people might require it."
No it absolutely does not mean that. It means, by explicit definition which is right here, that text is exactly that definition, that no one requires it. They can't require it, and still be conforming to the spec or rfc. That's the entire point of that text is to define that and remove all ambiguity about it.
It's not required by anyone.
The reason it's there at all, and has "should" is that it's useful and helpful and good to include, all else being equal.
But by the very definition itself, no people require it. No people are allowed to require it.
Typically, MUST means that if you don't do that then something will break at the protocol level.
SHOULD means that if you don't that, bad things are likely to happen, but it will not immediately break at the protocol level and during discussion in the IETF some people thought there could be valid reasons to violate the SHOULD.
Typically, IETF standards track RFCs consider the immediate effects of the protocol but often do not consider operational reality very well.
Sometimes operational reality is that a MUST gets violated because the standard is just wrong. Sometimes a SHOULD becomes required, etc.
Certainly for email, there is a lot you have to do to get your email accepted that is not spelled out in the RFCs.
Incorrect. Not required is not required. You do not need to supply rationale or get agreement by anyone else that your reasons are good in their opinion and not just in your opinion.
Should just means the thing is preferred. It's something that is good and useful and helpful to do.
That is not "must unless you can convince me that you should be excused".
I SHOULD have 8 hours of sleep every night. It's RECOMMENDED. However, there are times where it's best I don't (e.g. because of work, or travel, or needing to take someone to the hospital, etc.). It's definitely not that I MUST sleep 8 hours every night.
“When jump getting over a wall, you SHOULD use three points of contact.”
For most cases you should use three points of contact. However, there may be other situations for example if someone is giving you a leg up, or you can pole vault, where another solution is preferred.
You assume that internet standards are prescriptivist; that the document describes how it is to be implemented. In practice it's often descriptivist, with the standards documents playing catch-up with how things are actually going in practice.
Anyway, in general you can expect that doing unusual but technically valid things with email headers will very often get your messages rejected or filtered as spam.
Standards are definitely prescriptive. But just like a medical prescription, it doesn’t ensure that actors in the wild will conform to what’s prescribed. People will not follow prescriptions for whatever reason, willingly or otherwise. It doesn’t mean the document wasn’t prescriptive.
For producers, ignoring a SHOULD is riskier because it shifts the burden to every consumer.
For consumers, ignoring a SHOULD mostly affects their own robustness.
But here Google seems to understand it as a MUST... maybe the scale of spam is enough to justify it. Users are stuck between two parties that expect the other to behave.
I think a mail 2.0 would be notify and pull based.... you notify a recipient's mail server that there's a message from <address> for them, then that server connects to the MX of record for the domain of <address> and retrieves <message-id> message.
Would this make mass emails and spam harder, absolutely. Would it be a huge burden for actual communications with people, not so much. From there actual white/black listing processes would work all that much better.
Is the idea that you could decide from the envelope whether you want to even bother fetching the message? Besides that I'm not sure I see the advantage
You have to have a working mail server attached to a domain to be able to send mail... that's the big part. Right now, email can more or less come to anywhere from anywhere as anyone. There are extensions for signing connections, tls, etc... but in general SMTP at it's core is pretty open and there have been efforts to close this.
It would simply close the loop and push the burden of the messages onto the sender's system mostly.
And yes, you can decide from the envelope, and a higher chance of envelope validity.
jmap is the communication between a mail client and shared directory/mail services on a server. It does not include server to server communications (that I am aware of) for sending mail to other users/servers.
My take, as a postmaster for hosting company, who don't have any sympathy to gmail (that should be visible from my comments history):
Message-ID is absolutely MUST in production e-mails. You can send your test stuff without it, but real messages always have it. Not having Message-ID's causes lot of fun things. All somewhat competent software is capable to add Message-ID's, so lack of it is good indication of poorly made custom (usually spamming) solution.
Rspamd and spamassassin have missing MID check in their default rules, I am sure that most antispam software is same.
Why? If I'm writing a mail receiver, and I'm told there is some unique ID generated by the sender in a loosely specified way, the first thing I'm doing is ignoring that value forever. One lesson surely most everyone learns in CS is that unique identifiers are maybe unique to the system generating them, but to rely on foreign generated IDs being unique globally is a terrible idea that will break within the minute.
So at that point the ID has no value to me except being obliged to carry it around with the message, so maybe the originating system can at some point make sense of it. But then there is obviously no reason to ever reject mail without it, it's an ID valid for the sender and the sender didn't care to include one, great, we save on storage.
>Why? If I'm writing a mail receiver, and I'm told there is some unique ID generated by the sender in a loosely specified way, the first thing I'm doing is ignoring that value forever. [...] So at that point the ID has no value to me
Your framework of analysis is based on someone else's database key ids being irrelevant to you. That's true.
But another framework of analysis is tracking statistical correlations of what spam looks like. Lots of spam often don't have message ids. Therefore it's used as a heuristic in scoring it as potential spam. That's why other postmasters even without SpamAssassin independently arrive at the same answer of trying to block messages without a message id. Example: https://serverfault.com/questions/629923/blocking-messages-w...
MID-s are used by MUA-s for referring earlier messages, tracking answers and so on. So any software expecting dialog (messages coming back) needs to deal with MID-s correctly. Missing MID-s show that said communication is one direction, because broken dialog has not been problem.
An unrelated frustration of mine is that Message-ID really should not be overridden but SES for instance throws away your Message-ID and replaces it with another one :(
It is de-facto required and has been for many years.
Should in most RFCs also mean "do it as long as you don't have a very good technical reason not to do it". Like it's most times a "weak must". And in that case the only reason it isn't must is for backward compatibility with older mail system not used for sending automated mails.
And it is documented if you read any larger mail providers docs about "what to do that my automated mails don't get misclassified as spam". And spam rejection is a whole additional non-standardized layer on top of RFCs anyone working with mail should be aware of. In any decades old non centralized communication system without ever green standards having other "industry standard/de-factor" but not "standardized" requirements is pretty normal btw.
I would read this as a requirement for email to be 'legit' and not classified as spam.
Sure, you can send email with whatever headers you want, use weird combos, IP addresses, reply-to, and it might be still a technically valid email, but not something that should land in people's inboxes.
Also, a payment processor not testing their email on the most popular email provider in the world is quite ridiculous.
As indicated in the RFC, it uses another RFC[0] to define those words. Here's the relevant excerpt from that one:
3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
yeah, but that RFC isn't the only relevant document
Mail RFCs do not cover at all spam detection and malicious mail rejection, but it's a thing every large mail provider has and you really have to care about when producing automated mails all looking similar. And large mail providers like google tend to document what "base line" of additional requirements they have for accepting (automated mail). Having a Message-Id is in there, and in pretty much any larger mail providers documentation about that topic. Tbh. I have worked with mail before a bunch of years ago and the need for Message-Id was back then really no hidden gotcha but pretty well known.
and the design space mail provides is larger then any client could reasonable support (like it's really a huge mess covering docent of standards which allow all kind of nonsense and hypothetical use cases practical unsupported), so you anyway have to look at "what everyone does" and only then make sure it's also RFC compatible, instead of starting with the RFC. That was a painful lessen to learn.
In addition there are some de-facto standards not pinned down in any RFC, like e.g.:
- Message-Id being required for any automated mails by many mail providers (through how bad the consequences are if you don't have it diverges largely).
- You can't punycode encode the local part of an email address (it would be a different email), and there is no standard way (as far as I remember) to convert non us-ascii local parts to us-ascii. This is based on the fact that iff your mail server allows you to have non us-ascii local parts it should also support "internationalized mail" (SMTPUTF8 and co.). But it's a semi industry standard to give the user with an unicode local part also the mail with the punycode encoding of the local part so it often "just works" and dev are frequently surprised when it fails to work...
- You can have quoted text in local part, like whitespaces. But most of the industry decided to not give users such mail addresses so you can see it as soft deprecated and using it is asking for trouble.
- Attachements. The MIME encoding allows a lot of different ways to put mails together and doesn't force a specific semantic interpretation by mail clients. As such you if you naively use it you might run into surprises how/if your attachments or embedding(s) are displayed. Through today embedded resources often are either not done or uses data URLs. Again which ways work well and which don't is somewhat an industry standard and not in any RFC.
- A lot of different ways to encode Unicode to us-ascii. If you produce any mails you probably should by default use the latest revision (where encoding is often just not needed as things are utf8), but might need a fallback with it often being fully unclear if . But if you are a client you probably have to support older versions. And in some parts of the world/business segments usage of very very old mail servers is a thing which is a major pain if you run into it.
so quoting that something isn't strictly required by mail RFCs is kinda pointless as even many things explicitly allowed won't work well in practice
As a rule of thump: If you can afford it test you system will all widely used mail providers as if you where an external customers. And redo the tests yearly.
oh and as a bonus, if you mails looks too similar to known phishing mails it will also just disappear. That seems irrelevant, but e.g. if you use Keycloak with the default mail templates for password reset and co. there is a high chance of your mails ending up in spam or not even being delivered as scammers have used Keycloak for their means, too. And that isn't just a case for Keycloak but any "decently widely used open source software producing mails and having default templates". So you pretty much always need to change the default templates (you will do so anyway for branding, but skipping it in the earliest stages of a startup where branding might still be in flux isn't that uncommon either).
I know you're looking for "pedant points" but the specification generally take a backseat to implementation. If Message-ID is expected out here where the rubber meets the road, then you are the squeaky wheel in this scenario for not including it.
1. SHOULD means, do it if you can/you have to have a really good reason if you don't do it. The only reason it's SHOULD and not MUST is backward compatibility. Mostly in context of "personal send mails", i.e. not automated mails. (For automated mail sending the expectations of you running somewhat up to date software is much higher).
2. You can't really implement mail stuff just based on RFCs:
- There docent overlapping RFCs which can sometimes influence each other and many of them obsolete older versions why others still relevant RFCs reference this older versions. This makes it hard to even know what actually is required/recommendation.
- Then you have a lot of "irrelevant" parts, which where standardized but are hardly supported/if at all. You probably should somewhat support them as recipient but should never produce them as sender today (mostly stuff related to pro-"everything is utf8" days). Like in general the ideas of "how mail should probably work" in old RFCs and "how it is done IRL today" are in some aspects _very_ far away.
- Lastly RFCs are not sufficient by themself. They don't cover large parts of the system for "spam detection/suspicious mail rejection". So it's a must have to go to the support pages of all large mail providers and read through what they expect of mails. And "automated mails need a message id" is a pretty common requirement. In addition you have to e.g. make sure the domain you use isn't black listed (e.g. due to behavior of a previous user), and that your servers IP addresses aren't black listed (they never should be black listed long term, but happens anyway, and e.g. MS has based on very questionable excuses "conveniently" black listed smaller local data center competition while also being one of the most widely used providers for commercial mail in that area).
And the definition of "SHOULD" (from RFC 2119) is "This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course."
Having said that, I regret my original characterization of the Message-ID header as a "requirement" and have updated the blogpost to be fair to all sides.
The reason that European tech sucks is that people in Europe are open to such arguments. If an engineer in the US started talking about SHOULD vs MUST, some PM would just give them that "what the fuck did I just listen to" face, spend the next few minutes gently trying to convince them that the customer experience matters more than the spec, and if they fail, escalate and get the decision they want.
For example, why does Google handle this differently for consumer and enterprise accounts? Well it's Google so the answer could always just be "they are disorganized" but there's a good chance that in both cases, it was the pragmatic choice given the slightly different priorities of these types of customers.
Not my PM (in the US). My PM would try to avoid anything that is not absolutely necessary and therefore ask developers not to develop anything that isn't a MUST. I know that we like making fun of Europe for their alleged lack of innovation but this isn't a Europe thing.
Your PM most definitely would not tell you to skip a feature that is needed for your emails to be delivered to Gmail accounts. What a preposterous thing to lie about.
Do bugs and bad implementations not exist in US software? If an US company did this, nobody would be bloviating about how it is a cultural issue or whatever.
> That says SHOULD, not MUST, so how is it a requirement?
Battle with spam has been for long part just trying to algorithmically fingerprint the scam bots and reject the message if it looks like it wasn't sent by "real" mail server/client.
So a lot of things that are optional like SPF/DKIM are basically "implement this else your mail have good chance of being put into spam automatically".
The reason it’s recommended is that it’s useful for detecting when an email you receive is already in your mailbox, so that you don’t accumulate duplicates. Otherwise one would have to compare the complete email, which probably no MUA does. Another reason is that replies can include a reference to the original message, so that it can be properly threaded by MUAs.
So these are mostly quality-of-life reasons, it’s not a reason to reject an email.
SHALL has been interpreted/clarified by US courts as not being a fancy MUST or REQUIRED that many people were taught it to mean, but SHOULD still has it's purposes, e.g., to provide contractual flexibility in development, i.e., a MUST/REQUIRED requirement was more challenging or complicated and took up more time/resources than anticipated, so SHOULDs can be trimmed due to contingencies.
Another example may be a lightweight implementation of a spec in a limited and/or narrow environment, which remains technically compliant with full implementations of a spec but interaction with such a limited/narrow environment comes with awareness about such limitations.
> ...
> `Message-ID` is one of the most basic required headers in email.
Section 3.6. of the RFC in question (https://www.rfc-editor.org/rfc/rfc5322.html) says:
and in section 3.6.4: That says SHOULD, not MUST, so how is it a requirement?