Spring Framework
Every application has to be configured with information about the Yale environment (server names, how to access databases, Active Directory, how users login) and to select options. You may also plug in external components either by using the features of the application server (filters, listeners, EJBs) or some interface provided by the specific software package. If there is no explicit configuration language, then you may have to get the source and modify it.
...
Every SAML Response has a Subject field. It has a value and selects one of a list of standardized "Format" name strings.
The value can be the Netid, UPI, Netid@yale.edu, but in most cases it is a reproducible but opaque hash of the Netid or a large random string.
No two users of the same service should get the same Subject value. However, if two individuals lack credentials to actually login to a service, then it is not a problem if two different Responses that the service will reject happen to have the same Subject. Thus if a service is only used by employees, and non-employee students cannot login to it, it is not a problem if all students are given the same dummy Subject value.
Any attribute that might be used to generate the Subject value cannot be NULL. If you have to generate a Subject for some Relying Party that has a value derived from an identity variable that might be null for any person at Yale, then generate a derived attribute with an AttributeDefinition that guarantees it is never NULL even when the input variable is NULL.
For most partners the Subject field is ignored and they get any information they need from the Attributes. Some important partners, however, use the Subject field as their most important source of information. ServiceNow expects the Subject field to be a Netid, while Google expects it to be the Eliapps ID (the part of the Email alias before the "@").
The Subject is supposed to be unique and is commonly obscured. For example, if you hash the Netid and some other secret stuff you can get a value that is reproducible from login to login but does not expose the identity of the user. However, if two users are not expected to be able to login to a service, then giving both of them the same "do not log this person on" Subject value is not a problem. This means that technically we do not have to worry if the same Subject is generated whenever an indispensable identity value is NULL (say when people who do not have Eliapps accounts try to login to Eliapps, or when people who are not Employees try to access the Benefits system).
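The "reproducible but opaque" idea can be sketched in a few lines of Python. This is only an illustration and not the algorithm Shibboleth uses internally. The function name `opaque_subject`, the choice of HMAC-SHA256, and mixing the Relying Party name into the hash are all assumptions made for the example.

```python
import hashlib
import hmac


def opaque_subject(netid: str, relying_party: str, secret: bytes) -> str:
    """Return a reproducible but opaque Subject value (illustrative only).

    The same inputs always give the same output, but the Netid cannot be
    read back from the result without knowing the secret. Including the
    relying party name is an assumption of this sketch, not something the
    page requires.
    """
    message = f"{relying_party}!{netid}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    secret = b"example-secret-not-for-production"
    # Hypothetical Netid and entity ID, used only for demonstration.
    print(opaque_subject("jqs42", "https://sp.example.org", secret))
```

Two logins by the same person produce the same string, while two different Netids produce different strings, which is all the Subject has to guarantee.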
Shibboleth has been known to generate an internal error if any attribute used to generate a Subject value has a NULL value, so generally any query for a value that might be used as a Subject should substitute a dummy value like "unknown" or "-1" for NULL return values.
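The substitution itself is trivial. A minimal sketch in Python, assuming a hypothetical helper named `subject_source` and assuming that an empty string should be treated the same as NULL (the page only talks about NULL):

```python
from typing import Optional


def subject_source(value: Optional[str], placeholder: str = "unknown") -> str:
    """Return a value that is safe to feed into Subject generation.

    A missing (None) or blank value is replaced by a dummy placeholder so
    that nothing downstream ever sees NULL. Treating blank strings as
    missing is an assumption of this sketch.
    """
    if value is None or value.strip() == "":
        return placeholder
    return value
```

In the real configuration the same effect is normally achieved in the query or in a derived AttributeDefinition rather than in application code.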
In Shibboleth 2 a Subject was represented by a special type of SAML Encoder element in AttributeDefinition statements in the attribute-resolver XML file. In Shib 3 you generally derive special attributes with guaranteed non-NULL values that have no Encoder elements at all, then generate the Subject using an entirely new subsystem and a new configuration file named saml-nameid.xml.
The Subject is just "the Subject". It doesn't have a name that indicates what type of value it was generated from. All the documentation suggests that it should be based on a number like Yale UPI, and if we had it to do over again that might be what we use. However, up to this point Subjects are typically generated from Netid. Since you have to have a Netid to login to CAS and Shib, this is guaranteed not to be NULL.
Each Subject value generated by the saml-nameid.xml file has an associated format string and is based on an AttributeDefinition.
If the ID of the AttributeDefinition is not released to the Service Provider to which you are trying to login, then all Subject definitions associated with that AttributeDefinition are not calculated and are not eligible for use in this Response.
If the Metadata for the Service Provider to which you are trying to login has a list of NameIDFormat string values, and the Format string associated with a Subject definition is not in the list, then that Subject is not generated and cannot appear in the Response.
To understand the change, you have to remember that the "best practice" is to generate some obscured meaningless reproducible string of characters as the Subject and to use Attributes exclusively as the source of meaningful information. Shibboleth 3 is designed to emphasize the idea that putting real data in the Subject field is a bad idea. We have to do it because certain partners expect it. We do not have to discuss the configuration of a hash-trash Subject because it is automatic and fairly uninteresting.
If the Subject is meaningful, then it has to be based on some attribute (Netid, UPI, email, ...). That means that there is an AttributeDefinition that provides the value.
In Shibboleth 3 the correct way to do this is to create a special AttributeDefinition with a special ID that is only used to generate subject values:
```xml
<AttributeDefinition id="subjectMail" xsi:type="ad:Simple"
    sourceAttributeID="EmailAddress">
    <Dependency ref="IDRQuery" />
</AttributeDefinition>
```
There are lots of real Attributes based on Email address, but this one special attribute is named "subjectMail" and it has no SAML Encoder elements that can be used to produce an Attribute in the Response. With this Definition, we have a special ID and can release "subjectMail" to certain Relying Party partners through the attribute-filter statements.
Now in the saml-nameid.xml file there can be a statement:
```xml
<bean parent="shibboleth.SAML2AttributeSourcedGenerator"
    p:format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
    p:attributeSourceIds="#{ {'subjectMail'} }" />
```
This statement creates a potential Subject. It has a format string, but we usually refer to Subject formats by just the last piece after the last colon, so this is "emailAddress". The value will be taken from the 'subjectMail' attribute above.
The last piece of the puzzle is provided by the Metadata for a Relying Party, where one or more Format strings can be provided in a NameIDFormat element:
```xml
<NameIDFormat>urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress</NameIDFormat>
```
Now, if you have done this right, there should be for any given Relying Party exactly one Subject generated by saml-nameid.xml that is both based on an attribute released to this RP by attribute-filter and also has an associated format mentioned in a NameIDFormat element of the RP Metadata. Then that one Subject will be used to generate the Response.
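The two eligibility rules (the source attribute must be released to the RP, and the Format must appear in the RP's NameIDFormat list if it has one) can be modeled in a few lines of Python. This is a sketch of the logic as described on this page, not Shibboleth's actual code. The class `SubjectGenerator`, the function `eligible_subjects`, and the attribute ID `subjectNetid` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

EMAIL = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
UNSPECIFIED = "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified"


@dataclass(frozen=True)
class SubjectGenerator:
    """Models one saml-nameid.xml entry: a Format plus a source attribute ID."""
    format: str
    source_attribute: str


def eligible_subjects(
    generators: Iterable[SubjectGenerator],
    released_attribute_ids: Set[str],
    metadata_formats: List[str],
) -> List[SubjectGenerator]:
    """Return the generators that may produce a Subject for one Relying Party."""
    eligible = []
    for gen in generators:
        # Rule 1: the source attribute must be released to this RP.
        if gen.source_attribute not in released_attribute_ids:
            continue
        # Rule 2: if the Metadata lists NameIDFormat values, the Format must match.
        if metadata_formats and gen.format not in metadata_formats:
            continue
        eligible.append(gen)
    return eligible


if __name__ == "__main__":
    generators = [
        SubjectGenerator(EMAIL, "subjectMail"),
        SubjectGenerator(UNSPECIFIED, "subjectNetid"),  # hypothetical attribute ID
    ]
    # An RP that is released only subjectMail and asks for emailAddress.
    print(eligible_subjects(generators, {"subjectMail"}, [EMAIL]))
```

If the configuration is right, the returned list has exactly one entry for every Relying Party. An empty list or a list with several entries means the attribute-filter or the Metadata needs attention.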
The most important thing to understand about Subjects is that the NameIDFormat does not really mean what it says. In normal use, "emailAddress" seems to suggest that this is a real Email address to which you might send mail. NameIDFormat is a suggestion about what the thing looks like, not a request for a particular real attribute value. In this case it suggests that the Subject "look like an Email address" not that it actually be an email address. In reality, there is not even a requirement that the Subject contain an "@" to match the format.
So in practice you can almost ignore the NameIDFormat except for its use in selecting a specific Subject from the list of available Subjects in the saml-nameid.xml file. A value equal to the Netid could, for example, be assigned any format in saml-nameid.xml and then match any NameIDFormat in a Metadata file.
When more than one Subject definition can be released to a Service Provider, Shibboleth chooses one. You can control the preference, but now you are missing the point. Either you should not release two Subject-generating AttributeDefinitions to the same EntityID, or you should delete the unwanted NameIDFormat string in the Metadata. If that is not possible, read the Shibboleth Wiki for information on controlling the selection preference.
...