Data Problems In Email Aliases

First, remember a few basic rules:

  • A Netid is assigned to you forever. Since it doesn't change, it is safe to base anything on a unique identifier based on Netid.
  • An Email Alias is supposed to be yours only while you have an active affiliation. Since Aliases can be reclaimed and assigned to someone else, we have to clean up anything that is based on a Primary AliasName (generically referred to as a "first.last" identifier because it is typically a firstname, period, and a lastname) when that Primary AliasName is changed or deleted.
  • Secondary Email Aliases are also owned by you and are therefore guaranteed to be unique. There are no rules that use a Secondary Alias for anything, but there are data errors created by previous manual name changes where a former AliasName was switched from Primary to Secondary without cleaning up the consequences.

The basic problem is that in previous years various administrators who regarded first.last as more friendly used it, instead of Netid, in a large number of mail and AD fields without building proper controls to prevent conflicts after reassignment. Examples include:

The old (about to be replaced) AD Daily Updater code inserted both first.last@connect.yale.edu and netid@connect.yale.edu into the ProxyAddresses list for Exchange users (to support both the now preferred and deprecated Mailbox name conventions). It does not appear to have ever cleaned up MailNickName, ProxyAddresses, or TargetAddresses for any old and now obsolete entries. This guarantees delivery of mail shortly after new mailboxes are created, but does not protect against conflicts for reassigned names.

The AD Daily Updater also renames UPN whenever the Primary Email Alias entry changes. The UPN becomes netid@yale.edu if the Primary Alias is deleted or reverts to "reserved" (no Mailbox value).
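The UPN rule can be written as a small function. This is only a sketch of the rule as stated here, not the actual AD Daily Updater code; the function name and the convention of passing None or an empty string for a deleted or "reserved" alias are assumptions made for illustration.

```python
from typing import Optional


def derive_upn(netid: str,
               primary_alias: Optional[str],
               primary_mailbox: Optional[str]) -> str:
    """Return the UPN implied by the rule described above (hypothetical helper).

    Assumption: a deleted Primary Alias is passed as None, and a
    "reserved" alias is one whose Mailbox value is None or empty.
    """
    if primary_alias and primary_mailbox:
        # An active Primary Alias supplies the UPN prefix.
        return f"{primary_alias}@yale.edu"
    # Deleted or reserved: fall back to the permanent Netid.
    return f"{netid}@yale.edu"
```

Because the Netid never changes, the fallback value is always safe even when the alias is later reassigned to someone else.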

There are several sources of problems:

  1. O365/Azure AD/Exchange Online enforces rules against duplicate MailNickName and ProxyAddresses values that the old on premise Exchange/AD ignored. The old AD Daily Updater code was originally designed for on premise Exchange, and was only minimally adapted to O365.
  2. If a Mail account is actually deleted, then everything goes away. However, the process of "parking" logically deleted Email accounts temporarily has not completely cleaned up the AD and Azure AD mail routing fields.
  3. Previous Primary Alias changes were implemented up to the point where mail was delivered as expected, but without enforcing all the rules in a consistent manner.
  4. The new "Preferred Name" sensitivities can turn an old "Primary Email Alias" into a form of toxic waste at a time when we do not have the tooling to clean it up.
  5. The future replacement of Mail Relays configured from the Alias table with Exchange Online configured from the Azure AD cannot tolerate some current bad data. Essentially there are misconfigurations in the Alias table that cannot be copied to Azure AD but which are currently required for all mail addresses to be deliverable.

As always, there are three types of problems that we could generate with new code (which, when possible, will be identified in advance and remediated):

  1. If we delete or change MailNickName and ProxyAddresses that are non-standard, then a few manually created odd cases would break mail delivery until they are remediated.
  2. If we don't fix up all the current improper first.last references, then in the future when new people claim the same name there will be incidents to clean up.
  3. If we add complex rules to try and balance 1 and 2, then the code becomes too difficult to run once every hour, on the current AD Updater or Mail Relay schedule.

The proposed solution is a set of simple rules:

  • Any data error that can be cleaned up without any secondary consequences will be fixed in advance. For example, Mailboxes in the Alias table in the form "first.last@connect.yale.edu" can be changed to "netid@connect.yale.edu" because AD Daily Updater currently ensures that both Mailbox values are in the ProxyAddresses list.
  • Deprovisioned users (who cannot log in, cannot view mail, and are not supposed to receive mail) can be changed freely. By definition they "work correctly" only if they do not work at all, so no change can break them. Reactivating such accounts probably will not require any extra work because AD Updater will insert any missing and required data once the Alias table is filled in.
  • Otherwise, the new AD Updater code will do all the things the old AD Updater code did, and will initially not fix problems the old code did not fix (except updating the UPN in AzureAD).
  • A comprehensive set of rules will be developed and enforced on the tools used for manual changes. Deprovisioning will also get rid of now obsolete ProxyAddresses. Name changes will ensure that MailNickName and ProxyAddresses are also fixed. We will define a correct form for all fields for every state an account can be in.
  • New accounts have to be created as quickly as possible. Retiring old accounts after waiting weeks or years can be extended to wait a few more days. Therefore, the new AD Updater code will never be expected to delete the obsolete entries that it will be initially written to ignore. This will be done separately in a process that runs perhaps once a week because it will probably involve extensive cross-checking of Aliases, AD, and Azure AD, and the cloud is a non-responsive data source.

This leaves a small set of changes that are really part of the Mail Relay retirement project, not the AD Updater project. When we move mail sorting to Exchange Online, a few people will see changes to mail delivery that cannot be avoided by the sort of fixes we can automate. For example, five people at Yale have Primary Email Aliases that point to an Eliapps account and a Secondary O365 account. This is not strictly according to Yale policy and, when the Mail Relays go away, it will result in mail being mostly delivered to O365 mailboxes. There are ways to reconfigure, but they will involve some sort of custom service Tickets. Since there are only five people, this is OK.

We expect that more than five people will have some other problem. We will identify each problem and, if it cannot be corrected automatically, we will contact the affected people by mail in advance and suggest that they talk to the Help Desk about alternatives.

The Behavior that Creates the Problems

The Email Alias table tells the Mail Relays where to forward mail addressed to any "@yale.edu" destination. If the prefix before "@" is associated with an "@connect.yale.edu" Mailbox value, it is sent to O365. If the Mailbox ends in "@bulldogs.yale.edu", then it goes to Eliapps.
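The forwarding decision described above can be sketched in a few lines. This is an illustration of the rule only, not the Mail Relay configuration; the function name, the dictionary standing in for the Alias table, and the "jane.doe" entry are hypothetical.

```python
from typing import Optional


def relay_destination(address: str, alias_table: dict[str, str]) -> Optional[str]:
    """Decide where a Mail Relay would forward an "@yale.edu" address.

    alias_table is a hypothetical stand-in for the Email Alias table,
    mapping AliasName -> Mailbox. Returns "O365", "Eliapps", or None
    when this sketch has no rule for the address.
    """
    local, _, domain = address.lower().partition("@")
    if domain != "yale.edu":
        return None  # only "@yale.edu" destinations are looked up
    mailbox = alias_table.get(local)
    if mailbox is None:
        return None  # no such alias
    if mailbox.lower().endswith("@connect.yale.edu"):
        return "O365"
    if mailbox.lower().endswith("@bulldogs.yale.edu"):
        return "Eliapps"
    return None  # other Mailbox values are outside this sketch


# Example table: the first row is from this page, the second is made up.
aliases = {
    "zoe.chance": "zoe.chance@connect.yale.edu",
    "jane.doe": "jane.doe@bulldogs.yale.edu",
}
```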

In addition, O365 needs to know to which mailbox it should send each message. It looks at the address and tries to match it to one of the ProxyAddresses belonging to one of the Azure AD User objects. If it doesn't find a ProxyAddress, but the email address ends in a suffix it owns (like "@connect.yale.edu" or "@yaleedu.mail.onmicrosoft.com") then it also tries to match the part before the "@" to a MailNickName of some AzureAD User.
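The two-step match (ProxyAddresses first, then MailNickName for owned suffixes) can be modeled roughly as follows. This is a simplified sketch of the behavior described here, not Exchange Online's actual algorithm; the AzureUser class and function name are hypothetical, and addresses are stored as bare strings for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional

# Suffixes named in the text as owned by the O365 tenant.
OWNED_SUFFIXES = ("@connect.yale.edu", "@yaleedu.mail.onmicrosoft.com")


@dataclass
class AzureUser:
    """Hypothetical, minimal stand-in for an Azure AD User object."""
    mail_nickname: str
    proxy_addresses: list[str] = field(default_factory=list)


def resolve_recipient(address: str, users: list[AzureUser]) -> Optional[AzureUser]:
    addr = address.lower()
    # Step 1: exact match against any user's ProxyAddresses.
    for user in users:
        if addr in (p.lower() for p in user.proxy_addresses):
            return user
    # Step 2: for owned suffixes only, match the prefix to a MailNickName.
    if addr.endswith(OWNED_SUFFIXES):
        local = addr.split("@", 1)[0]
        for user in users:
            if user.mail_nickname.lower() == local:
                return user
    return None
```

The sketch also shows why duplicates are harmful: if two users carry the same ProxyAddresses entry or MailNickName, the result depends on lookup order, which is presumably why the cloud service rejects such duplicates.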

However, O365 and Exchange Online have a few extra rules that the old mail systems did not have. For one thing, the UPN has to be a deliverable O365 mail address. Since the UPN is the Primary Email Alias (even when the Primary Mailbox is Eliapps), this creates a rule that mail from O365 to morrow.long@yale.edu goes from one O365 account to Morrow's O365 inbox, while external mail to the same address goes to the Mail Relays which look at the Email Alias table and discover that Morrow's Primary Alias points to Eliapps, and so they send the external mail to Google.

Once the Mail Relays go away, Exchange Online is greedy and will deliver both internal and external mail to the O365 account. In fact, the Eliapps account will not get any mail ever again. There are various ways to reconfigure this, but they are all disruptive and Morrow has to figure out what he wants to do.

The AD Updater Logic Design

Up until this point, almost every tool we have for dealing with Aliases starts by getting all the aliases owned by a particular Netid. Generally, tools run in response to a request, and so it makes sense to start with the Netid that made the request.

However, while the Alias database table allows you to look up an AliasName and find the Mailbox to which mail is delivered, the Exchange approach attaches all the possible AliasNames of a Mailbox to the ProxyAddresses list of the AD User object to which the Mailbox is attached. Put another way, the Alias Table is a "goes to" lookup while Exchange keeps a "comes from" list.
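Turning the "goes to" lookup into a "comes from" list is a simple inversion. The sketch below assumes the Alias table can be read as (AliasName, Mailbox) pairs; the function name is hypothetical.

```python
from collections import defaultdict


def comes_from(alias_rows: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Invert (AliasName, Mailbox) rows into Mailbox -> [AliasName, ...].

    Mailbox values are lower-cased so that rows differing only in case
    land in the same group.
    """
    by_mailbox: dict[str, list[str]] = defaultdict(list)
    for alias_name, mailbox in alias_rows:
        by_mailbox[mailbox.lower()].append(alias_name)
    return dict(by_mailbox)
```

Note that the grouping key is the Mailbox, not the Netid that owns the alias, so aliases owned by different people can end up in the same group.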

It turns out that a large number of Aliases point to Mailboxes attached to the AD User object of some other Netid. Sometimes a person owns Aliases himself but points them to mailboxes owned by some of his own Dependent Netids. In a lot of cases, User A owns an Alias that points to User B, and at this point it is not useful to try and explain all the reasons why this may occur. It happened. It works now. We can't break it.

A consequence of this is that we cannot assume that an O365 alias goes in the ProxyAddresses list of the User object of the Netid that owns that alias. Instead, we have to look at the Mailbox the Alias points to. There are several possibilities:

  1. A mailbox of the form netid@connect.yale.edu is easy. The Alias goes to the ProxyAddresses list of the Netid in front of "@" in the Mailbox value. We want to change any other form to this form in advance when possible.
  2. A mailbox of the form first.last@connect.yale.edu is ambiguous. We could add a rule that if "first.last" is an Alias in the Alias table (and maybe if the Mailbox has the same value or is "netid@connect.yale.edu" where the Netid is the owner of the Alias) then it goes in the ProxyAddresses list of the Alias owner. This is really something better handled by a data fix than by code.
  3. Other mailbox values ending in "@connect.yale.edu" are probably Distribution Lists and other "resources", and the owner is arbitrary. Netid kh536 (Kate Hathaway) owns hundreds of distribution lists for all the Yale courses offered each semester going back several years (for example, ALIAS_NAME "chem99004_spring2017" points to MAILBOX "chem99004_spring2017@connect.yale.edu"). There are too many possibilities to enumerate, so the plan is to ignore them. Right now, the Mail Relays will do the right thing, and any AD object associated with one of these is already correctly set up and will not be changed by us. After the Mail Relays go away, email sent to Exchange Online will also be delivered correctly, because these are native Exchange accounts and it knows how to handle them. So AD Updater can do nothing to help and should do nothing to harm these entries, which have nothing to do with Identities, People, Netids, or any other subject matter IAM handles.
  4. Everything else gets a Contact. There is one Contact per Mailbox value. All Aliases with the same Mailbox value (without regard to who owns the alias) get collected by Mailbox and are attached to the same Contact.
  5. However, there is a special rule for Mailboxes of the form first.last@bulldogs.yale.edu when the first.last is the Primary Email Alias of a Netid that does not appear in any mailbox of the form "netid@connect.yale.edu". This describes all the undergrads and a lot of grad students that have only Eliapps accounts. These aliases are attached to the ProxyAddresses list of the AD User object, which is currently idle because there is no O365 account using it. The TargetAddress then points to the Eliapps account.
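The five cases above can be sketched as a single classification function. This is an illustration under stated assumptions, not the AD Updater implementation: the enum, the function name, and the four lookup inputs (known Netids, known AliasNames, a map from Primary AliasName to owning Netid, and the set of Netids that have an O365 mailbox) are all hypothetical and would in practice come from the database.

```python
from enum import Enum


class Target(Enum):
    USER_PROXY = 1     # case 1: ProxyAddresses of the Netid named in the Mailbox
    NEEDS_DATA_FIX = 2 # case 2: ambiguous first.last@connect.yale.edu
    IGNORE = 3         # case 3: Distribution Lists and other resources
    CONTACT = 4        # case 4: one Contact per Mailbox value
    ELIAPPS_USER = 5   # case 5: idle AD User with TargetAddress to Eliapps


def classify(mailbox: str,
             netids: set[str],
             alias_names: set[str],
             primary_owner: dict[str, str],
             o365_netids: set[str]) -> Target:
    local, _, domain = mailbox.lower().partition("@")
    if domain == "connect.yale.edu":
        if local in netids:
            return Target.USER_PROXY
        if local in alias_names:
            return Target.NEEDS_DATA_FIX
        return Target.IGNORE
    if domain == "bulldogs.yale.edu":
        owner = primary_owner.get(local)
        # Special rule: Primary Alias of a Netid with no O365 mailbox.
        if owner is not None and owner not in o365_netids:
            return Target.ELIAPPS_USER
    return Target.CONTACT
```

Case 2 is returned as a separate value rather than resolved in code, following the preference stated above for handling it by a data fix.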

This logic cannot be implemented inside IIQ because of the sequential processing of Aggregation. So we created Views in the database to cross check Aliases and Netids and group by Mailbox value.

There are still stray data problems. For example, netid kzc3 has Email Aliases:

AliasName       isPrimary  Account  Mailbox
zoe.chance      True       O365     zoe.chance@connect.yale.edu
kristin.chance  False      O365     kristin.chance@connect.yale.edu

This looks like a rename. Originally kristin.chance was the primary, but at some point she was renamed "zoe.chance". What is unexpected here is that the two aliases have two different Mailbox values, neither of which is based on the Netid. Each has the form "ALIAS_NAME@connect.yale.edu", but each uses its own alias name instead of both using the current Primary.

There are several accidental reasons why mail to "kristin.chance@yale.edu" will continue to be delivered correctly, but they are not by design and don't need to be explained here. The purpose of the example is to point out the junk that is in the table for historical reasons and cannot be anticipated except by examining every case that doesn't match a simple filter. Finding these issues is not the kind of logic we want in AD Updater, so they will be incrementally detected by some data mining and fixed manually by a sequence of data fix Changes, accompanied where appropriate by Email to "kristin.chance@yale.edu" warning that this account will be changed and how it will change.
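One possible shape for such a "simple filter" is sketched below. It is an assumption about what the data mining might look like, not a description of an existing tool: the AliasRow type and the function name are hypothetical. It flags O365 rows whose Mailbox prefix is neither a known Netid nor the owner's current Primary AliasName, and the result is only a candidate list for manual review.

```python
from typing import NamedTuple


class AliasRow(NamedTuple):
    """Hypothetical row shape for the Email Alias table."""
    netid: str
    alias_name: str
    is_primary: bool
    mailbox: str


def unexpected_o365_rows(rows: list[AliasRow]) -> list[AliasRow]:
    netids = {r.netid.lower() for r in rows}
    # Current Primary AliasName for each owning Netid.
    primary = {r.netid.lower(): r.alias_name.lower() for r in rows if r.is_primary}
    flagged = []
    for r in rows:
        local, _, domain = r.mailbox.lower().partition("@")
        if domain != "connect.yale.edu":
            continue  # only O365 Mailbox values are examined here
        if local in netids or local == primary.get(r.netid.lower()):
            continue  # matches one of the two expected forms
        flagged.append(r)
    return flagged


# The kzc3 rows from the table above.
kzc3_rows = [
    AliasRow("kzc3", "zoe.chance", True, "zoe.chance@connect.yale.edu"),
    AliasRow("kzc3", "kristin.chance", False, "kristin.chance@connect.yale.edu"),
]
```

Applied to the kzc3 rows, only the kristin.chance row is flagged. Distribution Lists and other resources would also be flagged by this filter and would have to be screened out separately.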