Spring Framework

Every application has to be configured with information about the Yale environment (server names, how to access databases, Active Directory, how users login) and to select options. You may also plug in external components either by using the features of the application server (filters, listeners, EJBs) or some interface provided by the specific software package. If there is no explicit configuration language, then you may have to get the source and modify it.

Rather than creating a new configuration system, applications are increasingly using the Spring Framework. CAS started using Spring 10 years ago, and Shibboleth 3 now fully embraces it. Spring provides a common configuration mechanism, so once you know how it works you can use it to configure CAS or Shib or any other system written in Spring. If you want to understand Spring in general, go to the website or read a book on Spring.

Some files cannot be replaced by Spring configuration. The SAML standard provides an XML syntax for Metadata. Shibboleth 2 had its own XML configuration file syntax, and some of those files are supported in Shibboleth 3 to simplify migration. Under the covers each XML element will be converted by Shibboleth to an object, and today some of those objects will also be visible to the Spring framework. Most of the less important, more technical configuration files that administrators were less likely to have modified in Shib 2 are now rewritten in Spring syntax and would have to be manually converted if you made changes to the previous version.

In the Spring framework, application Java classes are used to generate objects called Beans. In any Spring configuration file, each <bean> XML element contains the name of a Java class. Spring creates an instance of the class (an object) and passes values from the <bean> element to named properties (or constructor arguments) to initialize the bean with the parameters it needs to perform a specific function. New classes can be made available by adding a library to the WEB-INF/lib directory of the Web application, so we can add third-party code or Yale code without changing anything in Shib itself. The parameters provide database URLs, userids, passwords, Active Directory domain names, or any other string or numeric value needed to adapt generic code to the specific Yale environment. For example, we add the Unicon library to the Web application and then configure the Unicon CAS-Shibboleth integration by providing the URL "https://secure.its.yale.edu/cas" of the CAS server at Yale. For information on the Unicon code and configuration, see https://github.com/Unicon/shib-cas-authn3.

You know you have a Spring configuration file if the first element is <beans>. The file then contains mostly <bean> elements, although Spring has a few aliases for <bean> when you are dealing with standard classes. If you are creating a Java List, for example, then instead of a <bean> element that references a concrete list class such as "java.util.ArrayList", the file can use the defined nickname <util:list>.

Local Disk File Configuration

Many applications are configured in a database. Spring has the built in capability to define a file (which it calls a Resource) based on a path to local disk or a URL to a network file. Even before it used Spring, Shibboleth had written its own custom code to read configuration files from disk, from URLs, or even to check a file out from Subversion source control at system startup. Then periodically Shibboleth can "poll" the source to see if a new version of the same file has become available (based on the last changed date or in Subversion the most recent committed version number) and reload it if there was a change.
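
The polling idea can be sketched in a few lines of JavaScript. This is only an illustration of the technique, not Shibboleth's own code: makeReloader, statMtime, and load are hypothetical names, and the modification time comes from a caller-supplied function so the sketch never touches a real disk.

```javascript
// Hypothetical sketch: reload a resource only when its last-modified time changes.
// statMtime(path) returns a timestamp in ms; load(path) returns the parsed content.
function makeReloader(path, statMtime, load) {
  let lastSeen = null; // timestamp of the version currently loaded
  let current = null;  // content currently in use

  // Call this once per polling interval.
  return function poll() {
    const mtime = statMtime(path);
    if (lastSeen === null || mtime !== lastSeen) {
      current = load(path); // first call, or the file changed: reload
      lastSeen = mtime;
    }
    return current;
  };
}

// Usage with stand-in functions in place of a real file system.
let fakeMtime = 1000;
let loads = 0;
const poll = makeReloader(
  "conf/attribute-filter.xml",
  () => fakeMtime,
  () => "version-" + (++loads)
);
```

Calling poll() repeatedly returns the same content until the timestamp changes, which is why a file copied onto the server by an outside job is picked up without a restart.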

Using Source Control to manage text configuration files is an excellent idea. You have a history of changes and an easy way to back out mistakes. When Shibboleth was used to provide casual access to academic resources at other universities, the direct use of Source Control would have been the right thing to do. However, today many of our critical systems are located in the Cloud and they use SAML to login. Since our disaster recovery plans are stored off site, Shibboleth has to be one of the first services restored after a major outage. Subversion comes up hours later. So Shibboleth has to be able to run with only its own local disk. You can of course bring Shibboleth up on the last checked out version of files when Source Control is not available, but we have discovered that Shibboleth takes an unreasonable amount of time trying and failing to connect to the Source Control server when that server is unavailable.

So Yale accomplishes effectively the same thing with a bit more work. All Shibboleth configuration files are checked into Source Control. However, Shibboleth does not know this and does not go to Source Control itself. Shibboleth is configured to use files on disk, and when appropriate to check periodically to see if the file change dates have been modified and reload the changed files. The files are deposited or updated on the Shibboleth local disk by a Jenkins Install job under the control of Operations. So Shibboleth does not see the files change just because a new version of a configuration file has been committed to Subversion. After the commit there has to be an explicit Jenkins run to move the file to the Shibboleth server, and while Jenkins jobs can be configured to run automatically after a commit, this particular job is started by a person when we make a positive decision to change the running Shibboleth.

Terms

Shibboleth is an Identity Provider (an IdP).

Applications and Cloud vendors are Service Providers (SPs) because they provide a service to users. They rely on Shibboleth to provide login information, so they are also called Relying Parties (RPs).

Each IdP and SP is an Entity and it has an EntityID that is a string. The string has to be globally unique, so it is typically a DNS name (google.com) or a URL, but some partners just use a string they made up.

Metadata is a big block of information about an Entity in an XML <EntityDescriptor> defined by SAML.

So a given partner like google.com is sometimes called an SP and sometimes called an RP. Technically Shibboleth is an Entity, but normally our own EntityID is understood, so most of the time when we discuss Entities and EntityIDs we are talking about an RP/SP. Similarly we have our own Metadata, but it is understood and so most discussions of Metadata refer to an RP/SP/Entity.

Metadata and "Providers"

The Metadata describing an RP/SP/Entity is the content of a rather large block of XML contained in an <EntityDescriptor> element. This is the sort of thing that any other application would store in a database. Shibboleth reads Metadata from a file or URL and uses it to build objects in memory.

Metadata is obtained from Metadata Providers defined in the conf/metadata-providers.xml file. At Yale, each Provider is a file on disk, but the InCommon metadata for thousands of Entities comes from a URL to the InCommon server that is checked at startup and then once every 8 hours. The most recent copy of the InCommon data is stored on disk, so if Shibboleth starts up at a time when it cannot reach InCommon on the net, it uses the stored file as a backup.

At Yale, Metadata files are checked into Source Control as part of the Jenkins Install project. They are copied to Shibboleth server local disk during Jenkins Install processing and are replaced only by another Jenkins Install (except for InCommon, which is the only file that comes dynamically from the Web).

In theory we could create one big file and put all the local Metadata elements in it, but Shibboleth will refuse to read in any file that contains a single syntax error. So instead we tend to use individual files for each SP/RP/Entity although occasionally we will put the DEV/TEST/PROD entities of an application in the same file. That way a screwup is isolated to just the one file and one Entity.

Suppose a user wants to log in to an RP with EntityID "https://example.com/provider". Shibboleth goes to the configured Metadata Providers (the files) and looks for a matching EntityID in the first file, then the second, and so on until a match is made. Since Shibboleth stops when an EntityID is found, the order in which the files are defined in the metadata-providers.xml file determines which of two or more metadata elements will be used.
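
The "first match wins" search can be modeled as follows. This is a minimal sketch, assuming each provider is reduced to a file name plus a map of EntityIDs; the file names and the findMetadata function are hypothetical and only illustrate the ordering rule.

```javascript
// Hypothetical model: each provider is one metadata file, listed in the
// order it appears in metadata-providers.xml.
const providers = [
  { file: "yale-local.xml",
    entities: { "https://example.com/provider": "local copy" } },
  { file: "incommon.xml",
    entities: { "https://example.com/provider": "InCommon copy",
                "https://other.example.org/sp": "InCommon copy" } },
];

// Return the first match in provider order, or null if no file defines the EntityID.
function findMetadata(entityId, orderedProviders) {
  for (const provider of orderedProviders) {
    if (Object.prototype.hasOwnProperty.call(provider.entities, entityId)) {
      return { file: provider.file, metadata: provider.entities[entityId] };
    }
  }
  return null;
}

const hit = findMetadata("https://example.com/provider", providers);
console.log(hit.file); // yale-local.xml
```

Because the local file is listed first, its copy of the entity is used and the later copy is never consulted.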

In general, Yale puts all of its own locally managed Metadata files first, then it searches the InCommon Metadata we don't control. That way if we need some special processing for an InCommon partner, we can extract their standard Metadata, change it, and then store it in a Yale Source Control file. This "first match" rule also suggests an obvious use for one initially empty file at the beginning of the search order and one at the end of the search.

The "emergency-override" dynamic file is searched first. Metadata placed in this file with an EntityID that matches an existing Metadata entry in a later file will logically replace the previous version of production Metadata for any partner. When we have a regularly scheduled formal Release of new Shibboleth configuration (on alternate Thursdays) this file is empty. During the two week period, or when it is too late to schedule a regular update through the CAB committee, a runtype=emergency Jenkins Install of Shibboleth modifies just this one file. So if one partner has a problem (typically because a key/certificate changed and we did not know about it in advance) we can go to the Emergency CAB and get approval to put the updated metadata in the emergency-override file, change just that one file on the disk of the running Shib, and fix the problem with that one metadata file. In the next alternate-Thursday full release the changed metadata will be in its normal file and this file will be empty again.

The "additions" dynamic file comes last in the search. Every existing Metadata file will have already been searched, and all existing EntityID values will have matched, so you do not get to this file unless you have a new EntityID that doesn't match any existing one (including all the InCommon entities). This file can only define new Metadata for new entities. This becomes a relatively safe Standard Change that doesn't have to be approved because anything put into this file cannot adversely affect existing configured services. Of course, a new partner may also need attributes released to them. Fortunately, Shibboleth allows the function of the attribute-filter.xml file to be broken up into multiple files. Existing partners are configured in attribute-filter, and an empty file named "additional-attribute-filter.xml" is deployed with every Shibboleth Release. Therefore, if a new partner has to be defined to production and cannot wait for the every-other-Thursday Release cycle, the Metadata for that partner can be placed in the metadata/additions.xml file and the attributes to be released can be put in the additional-attribute-filter.xml file. A Jenkins install of runtype=additions replaces both of these originally empty files with the data for the newly defined partner while guaranteeing by their search order that they cannot interfere with existing services. When the next regularly scheduled Shibboleth Release is ready, the changes move from the additions files to the normal Shibboleth configuration and the additions files are empty again.
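
One consequence of the search order is that an entry in the additions file is never reached if its EntityID already exists in an earlier file. A pre-deployment check along the following lines could flag such entries. This is a hypothetical sketch: the function name and its inputs are assumptions, and nothing here is part of Shibboleth or of the Yale Jenkins job.

```javascript
// existingIds: EntityIDs already defined in any earlier-searched file.
// additionIds: EntityIDs proposed for the additions file.
// Returns the proposed IDs that would be shadowed and therefore ignored.
function shadowedAdditions(existingIds, additionIds) {
  const existing = new Set(existingIds);
  return additionIds.filter((id) => existing.has(id));
}

const shadowed = shadowedAdditions(
  ["https://example.com/provider", "https://other.example.org/sp"],
  ["https://new-partner.example.net/sp", "https://example.com/provider"]
);
// Only the EntityID that already exists is reported; the new partner is fine.
console.log(shadowed);
```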

Two of our partners (Salesforce and Cvent) regularly add new AssertionConsumerService URL elements to their existing Metadata file. This happens so frequently that we have the option of replacing these specific production Metadata files with updated copies. There has not yet been any urgency to make such changes outside a normal Release cycle, but we have the ability to respond to the special needs of these two cloud partners if "every other week" becomes an unacceptable delay.

Jenkins Runtype

The runtype parameter in the Jenkins Install job determines the specific processing that this run of the Install job will perform.

Runtype "install" stops the JBoss server and loads a complete Shibboleth system, including potentially new code and new configuration files.

Runtype "config" does not stop JBoss or the running Shibboleth server. Instead, it replaces the full set of configuration files. The running Shibboleth process checks the timestamps on these files, and when it sees they have changed it loads a complete new configuration. Shibboleth completely reconfigures itself.

Runtype "update" compares the contents of the deployed running Shibboleth configuration to the contents of each configuration file from Source Control and the installer project. If the files are the same, nothing happens. If they are different, then the new contents from Source Control replace the file on local disk on the Shibboleth server. Since only changed files are reloaded by Shibboleth, this can change only a Metadata file, or only the Attribute Resolver configuration, or only the attribute-filter.

Runtype "additions" modifies only the "additions.xml" Metadata file and the "additional-attribute-filter" file. This can be used to add new Service Providers to production between the every-other-Thursday full Release cycles. Shibboleth isolates these files and appears to guarantee that this type of configuration cannot possibly interfere with existing production services.

Runtype "emergency" will change the "emergency-override.xml" file and allows us to define new Metadata for an existing production partner without affecting anything else. It may require permission from the ECAB, but it is a less dangerous change than a full configuration update. Note that the old Metadata for the partner remains in place, but is not used because the override Metadata is found first in the search order. Before the next Release cycle (the next runtype=install or runtype=config), the old production Metadata should be replaced with the new override data and the emergency-override.xml should be emptied.

Runtype "salesforce" and "cvent" are proposed runtypes that change a single Metadata file for the two partners that require frequent updates.
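
The comparison performed by runtype "update" can be sketched as a function that decides which files to copy. This is an illustration under assumptions, not the actual Jenkins job: filesToReplace is a hypothetical name, the file contents are placeholders, and treating a file that is missing from the server as "changed" is an assumption the text above does not state.

```javascript
// deployed and source each map a file name to its full text content.
// Returns the names of files whose content differs (or that are absent
// from the deployed set, which this sketch assumes should also be copied).
function filesToReplace(deployed, source) {
  return Object.keys(source)
    .filter((name) => deployed[name] !== source[name])
    .sort();
}

const deployed = {
  "conf/attribute-filter.xml": "filter rules, version 1",
  "metadata/salesforce.xml": "salesforce metadata, version 1",
};
const source = {
  "conf/attribute-filter.xml": "filter rules, version 1",
  "metadata/salesforce.xml": "salesforce metadata, version 2",
};

const changed = filesToReplace(deployed, source);
console.log(changed.join(",")); // metadata/salesforce.xml
```

Only the one changed file is rewritten, so only that file gets a new timestamp and only that part of the configuration is reloaded.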

Contents of the Primary Configuration Files

Attribute-Resolver

Normally Shibboleth has a single attribute-resolver.xml file that contains two types of elements. DataConnectors define database or LDAP queries that produce result sets with columns or LDAP User objects with properties. AttributeDefinitions then take the columns and properties returned by the queries, assign a unique identifier that can be referenced in the attribute-filter (release policy), and supply SAML syntax. So two DataConnectors could query the Yale IDR database for basic identity information, and also the Active Directory for the subset of identity information it contains. Then AttributeDefinition statements can take the "FirstName" column from IDR or the "givenName" property from AD and create various SAML Attributes all with the same value of "Howard" but with SAML name and friendlyName attributes that refer to it as "FirstName", "First Name", "givenName", "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", or "urn:oid:2.5.4.42" (informal and formal standards based SAML names for the same thing).

Shibboleth 3 allows the attribute-resolver to be configured with more than one file if you add additional elements to a list in conf/services.xml. Yale uses this to separate all the DataConnector elements which have one syntax and one natural default xml namespace from the AttributeDefinition elements that have a different syntax and a different natural default namespace.

Shibboleth documentation is not particularly clear on the algorithm, so I will try to fill in something that I believe is important to understand.

DataConnectors

There are generally three types of queries that make sense:

A database query can return exactly one row. Then you can think of the row as a user object, and the column names become the properties of the object.
A database query can return more than one row but only one column. Then you have a "multivalued" property for one user.
An LDAP query returns the User object from the directory. LDAP User objects have properties some of which are single valued and some of which are multivalued.

Each column in the result set or property in the LDAP object becomes a Shibboleth IdPAttribute object, but this object has no SAML formatting information and no global name. To find one of these objects, you have to specify the ID of the DataConnector that ran the query and then the column or property name. You need to create one or more AttributeDefinition elements to convert this into a real Shibboleth Attribute with a real ID that you can release as SAML to a partner.
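
The query shapes above can be pictured with a small model that turns a result set into named, possibly multivalued, properties. This is a sketch for intuition only: rowsToAttributes and the sample column values are hypothetical, and skipping NULL values is an assumption of this sketch, so Shibboleth's own handling may differ.

```javascript
// Hypothetical model: turn a result set (an array of row objects) into a map
// of column name -> array of values.
function rowsToAttributes(rows) {
  const attributes = {};
  for (const row of rows) {
    for (const [column, value] of Object.entries(row)) {
      if (value === null || value === undefined) continue; // NULL adds no value here
      if (!attributes[column]) attributes[column] = [];
      attributes[column].push(value);
    }
  }
  return attributes;
}

// One row: each column becomes a single-valued property.
const single = rowsToAttributes([{ FirstName: "Howard", NetId: "abc123" }]);

// Several rows, one column: the column becomes a multivalued property.
const multi = rowsToAttributes([{ Role: "staff" }, { Role: "alum" }]);
```

An LDAP User object would fit the same shape directly, since its properties are already lists of one or more values.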

The DataConnector needs a Java DataSource to provide a pool of database connections. Java DataSource management is complicated because it has to know when a database connection must be discarded because it has timed out or because the database rebooted since it was last used. Shibboleth would prefer to leave this to the database experts. Shibboleth 2 did this by using "container managed" connections provided by Tomcat or JBoss. Shibboleth 3 can still use connections managed by the application server, but now that it is a full Spring Framework application it can use DataSources provided and managed by Spring. Either way the complex database management doesn't have to be done by Shib provided code.

AttributeDefinitions

The DataConnector provides the value. We know that "Howard" is the value of the "FirstName" column of the result set returned by the "IDRQuery" database connector or of the "givenName" property of the user object returned by the "ADQuery" LDAP connector.

However, to use this value in any real Shibboleth logic, it needs a single unique "id=" name. That ID can be referenced in the attribute-filter to release the attribute to an application, or it can be used in a Dependency statement to create a variable in a JavaScript block of code, or to create a variable that can be inserted into the WHERE clause of another Database query.

All you need is an AttributeDefinition statement that has a Dependency element that references the ID of the query (IDRQuery or ADQuery in the previous examples) and which specifies the column name or attribute property name as the value of sourceAttributeID=.

<resolver:AttributeDefinition id="idrFirstName" xsi:type="ad:Simple" sourceAttributeID="FirstName">
    <resolver:Dependency ref="IDRQuery" />
</resolver:AttributeDefinition>

This AttributeDefinition gives a single unique ID "idrFirstName" to the value of the "FirstName" column returned by the "IDRQuery" DataConnector.

With this AttributeDefinition, you can reference "idrFirstName" in other elements. They can create a variable named "idrFirstName" in a block of JavaScript code, or they could create "$idrFirstName" which can be added to the template for another database query. It would be a really bad idea, but you could create a NameId definition that uses this value as input to a hash that generates the value of the Subject of a SAML Response (except that Subject has to be unique and lots of people have the same FirstName, but this would make sense if the attribute was the Yale UPI number).

About the only thing this AttributeDefinition cannot do as shown is create a SAML Attribute that can be released to a partner and sent in a SAML Response. To do that you need to add one additional element, a SAML Encoder that specifies a friendlyName= like "FirstName" or "GivenName" and an (unfriendly) name= like "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname" or "urn:oid:2.5.4.42".

<resolver:AttributeDefinition id="idrFirstName" xsi:type="ad:Simple" sourceAttributeID="FirstName">
    <resolver:Dependency ref="IDRQuery" />
    <resolver:AttributeEncoder xsi:type="enc:SAML2String" name="urn:oid:2.5.4.42"
            friendlyName="givenName" />
</resolver:AttributeDefinition>

Now "idrFirstName" can be released by the attribute-filter.xml file and can be sent in a SAML 2 Response. Of course, now the "idrFirstName" field is bound to the LDAP name and friendlyName conventions, which some partners expect. Other partners want another value. If you are going to do this, then you might change the id from "idrFirstName" to "idrFirstNameWithLDAPSyntax".

So an alternative is to create a pure AttributeDefinition using the first example code (without the AttributeEncoder), and then derive additional attributes with Encoders and more specific IDs based on the original:

<resolver:AttributeDefinition id="idrFirstNameWithLDAPSyntax" xsi:type="ad:Simple" sourceAttributeID="idrFirstName">
    <resolver:Dependency ref="idrFirstName" />
    <resolver:AttributeEncoder xsi:type="enc:SAML2String" name="urn:oid:2.5.4.42"
            friendlyName="givenName" />
</resolver:AttributeDefinition>

This form of AttributeDefinition references a previously defined Attribute (by setting both sourceAttributeID and the Dependency ref to the ID of a previous AttributeDefinition) and then gives it an AttributeEncoder format and a new, more specific ID to use in the attribute-filter, so that you release to a particular partner both a value and a specific name+friendlyName format.

XML and Shibboleth are case sensitive, so it is important to realize that Oracle reports its column names in UPPERCASE (unless they were created with quoted identifiers). To avoid errors you should always use UPPERCASE names for the sourceAttributeID field if the query is to an Oracle database, and you should define an UPPERCASE id for a default static value in the fallback connector if the Oracle query fails. Otherwise you may spend hours trying to debug the failure of the value to show up where you expect it to be.

Undefined, Null, or Empty

When you query a database or an LDAP directory and then try to define an Attribute based on the value of a column or property, several things can go wrong:

The database query can return no rows from the query.
The database query can return a SQL NULL value for a column (unless you use NVL in Oracle or ISNULL in SQL Server to replace the NULL with a default value).
An LDAP query can return no User object.
An LDAP query can return a User object, but in that object the property you are looking for may not be present.
An LDAP query can return a User object, and the property may be present, but it may have no values in the list of values.

First the good news. If all you do is to create an AttributeDefinition with an Encoder and to release it as an ordinary Attribute in a SAML Response, then you don't have to think about it. Shibboleth takes care of all these cases and does the right thing.

However, if you want to use this attribute in the SAML NameID server to generate a Subject, then you have a problem because the value that goes into any Subject calculation can never be NULL or missing. So it is your job to make sure that none of these things can happen (and that duplicate values cannot be returned for different identities) when this attribute is input to a custom element in the saml-nameid.xml file.

Then you can run into problems when this attribute is referenced as a Dependency in an AttributeDefinition of type Script. Each Dependency creates a JavaScript variable that can be used in the block of JavaScript code that calculates a value for the new Attribute. However, this variable reflects the peculiar status of the column or property from which it was derived.

A JavaScript variable name can be "undefined" if it has never been declared or has never been assigned a value.

A JavaScript variable can be null if it has been assigned the value null.

A JavaScript variable can be an array of length 0 or a collection that contains no objects.

You can encounter all three conditions in a JavaScript variable created by a Dependency element in a type "Script" Attribute definition when different results are returned from a database or LDAP query. The only safe thing to do is to check all three in that specific order:

                if (typeof googleEmail!="undefined" && 
                    googleEmail!=null &&
                    googleEmail.getValues().size()>0) {
                        googlemailalias = googleEmail.getValues().get(0);
                }
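
The same three checks can be wrapped in a helper so that each script does not have to repeat them. The sketch below is an illustration under assumptions: firstValueOr is a hypothetical helper, and fakeAttribute only imitates the getValues(), size(), and get() calls used above so that the code can run outside Shibboleth. The typeof test for an undeclared variable has to stay at the call site, because passing an undeclared name to a function throws a ReferenceError.

```javascript
// Return the first value of an attribute, or fallback when the attribute is
// undefined, null, or has no values.
function firstValueOr(attribute, fallback) {
  if (typeof attribute === "undefined" || attribute === null) return fallback;
  const values = attribute.getValues();
  if (values === null || values.size() === 0) return fallback;
  return values.get(0);
}

// Stand-in for the attribute objects that Shibboleth hands to a script.
function fakeAttribute(list) {
  return { getValues: () => ({ size: () => list.length, get: (i) => list[i] }) };
}

// Call site: googleEmail is deliberately not declared in this sketch, so the
// typeof guard is what prevents a ReferenceError.
const googlemailalias = firstValueOr(
  typeof googleEmail === "undefined" ? undefined : googleEmail,
  "unknown"
);
console.log(googlemailalias); // unknown
```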

Suppose the Database is Down

If any exception is thrown during the query, then the Shibboleth code will attempt to execute a secondary query specified in the "failover" attribute of the DataConnector. The failover can point to a different query to a different database that might return the same value. Or it can be a Static element.

A Static DataConnector defines one or more property names and values. It is not necessary to define a default value for every property that you could have obtained from the correct execution of the real query, provided that a null or undefined value is acceptable for the other properties. Given the previous warning about NULL and undefined and empty, you should think twice before leaving column/property names without an explicit default value (0, 1, -1, "", "undefined", etc.). However, it is not an error to omit them if you choose. The Static DataConnector cannot throw an exception.

At Yale:

Every Query must have a Failover DataConnector, which may itself have a Failover, and the chain must end with a Static Connector.

See the examples in the attribute-resolver-connectors.xml file.
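
The shape of such a chain can be modeled in a few lines. This is a conceptual sketch, not Shibboleth code: the resolve function is hypothetical, and apart from IDRQuery the connector IDs are made up for the example.

```javascript
// Hypothetical model of a failover chain. Each connector has a query function
// that may throw, and an optional failover connector to try next.
function resolve(connector) {
  try {
    return connector.query();
  } catch (err) {
    if (!connector.failover) throw err; // chain exhausted: propagate the error
    return resolve(connector.failover);
  }
}

// A Static connector just returns fixed values, so it cannot throw.
const staticDefaults = {
  id: "StaticIDR",
  query: () => ({ FIRSTNAME: "unknown" }),
};
const backupDb = {
  id: "IDRBackupQuery",
  query: () => { throw new Error("backup database down"); },
  failover: staticDefaults,
};
const primaryDb = {
  id: "IDRQuery",
  query: () => { throw new Error("primary database down"); },
  failover: backupDb,
};

const result = resolve(primaryDb);
console.log(result.FIRSTNAME); // unknown
```

Because the last link is Static, resolve always returns something, which is the point of the Yale rule.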

Script bugs

Any JavaScript program can have errors. Usually they only show up when a database is down or some crud gets dumped into new rows or columns, or the AD gets updated badly. Unfortunately, if JavaScript throws an unhandled exception then Shibboleth fails the entire login.

Every Script must be wrapped in a try-catch that catches all errors and does something reasonable. Normally the reasonable thing is to just return, which produces an empty Attribute; that is probably the best you could do anyway.
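
A minimal sketch of that wrapper, assuming a hypothetical helper named runSafely (inside a real Script element the try-catch would simply surround the script body):

```javascript
// Run the body of a script; on any error, log it and return the fallback
// (by default undefined, which stands for "leave the attribute empty").
function runSafely(body, fallback) {
  try {
    return body();
  } catch (err) {
    console.error("attribute script failed: " + err);
    return fallback;
  }
}

// A script body with a bug: it dereferences a null record.
const record = null;
const value = runSafely(() => record.email.split("@")[0]);

// A healthy script body returns its computed value unchanged.
const ok = runSafely(() => "user@example.edu".split("@")[0]);
```

Here value ends up undefined instead of an exception escaping, and ok is the part of the address before the "@".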

Other Errors

Other problems occur inside Shibboleth itself. Unfortunately, if Shibboleth generates an internal exception evaluating any Attribute it aborts login processing and returns no attributes at all. This is not the best solution for Yale, and in Shibboleth 2 we added a try-catch so that exceptions evaluating an Attribute only left that one Attribute undefined. We have not yet decided to migrate that Yale modification to Shibboleth 3.

NameId (Subject)

Every SAML Response has a Subject field. It has a value and one of a list of standardized "Format" name strings.

For most partners the Subject field is ignored and they get any information they need from the Attributes. Some important partners, however, use the Subject field as their most important source of information. ServiceNow expects the Subject field to be a Netid, while Google expects it to be the Eliapps ID (the part of the Email alias before the "@").

The Subject is supposed to be unique and is commonly obscured. For example, if you hash the Netid and some other secret stuff you can get a value that is reproducible from login to login but does not expose the identity of the user. However, if some users are not expected to be able to log in to a service, then giving all of them the same "do not log this person on" Subject value is not a problem. This means that technically we do not have to worry if the same subject is generated whenever an indispensable identity value is NULL (say when people who do not have Eliapps accounts try to login to Eliapps, or when people who are not Employees try to access the Benefits system).

Shibboleth has been known to generate an internal error if any attribute used to generate a Subject value has a NULL value, so generally any query for a value that might be used as a Subject should substitute a dummy value like "unknown" or "-1" for NULL return values.
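
The two ideas above (a reproducible hashed value, and a dummy value in place of NULL) can be illustrated together. This is a sketch under stated assumptions and is not the algorithm Shibboleth itself uses: opaqueSubject is a hypothetical function, SHA-256 and the "!" separator are arbitrary choices, and the Netids and salt are made up. It uses the Node.js crypto module.

```javascript
const crypto = require("crypto");

// Hypothetical sketch: derive an opaque but repeatable Subject from a Netid.
// A missing Netid is replaced by the dummy value "unknown" before hashing.
function opaqueSubject(netid, secretSalt) {
  const input = netid === null || netid === undefined ? "unknown" : String(netid);
  return crypto
    .createHash("sha256")
    .update(input + "!" + secretSalt)
    .digest("hex");
}

const first = opaqueSubject("abc123", "not-the-real-salt");
const again = opaqueSubject("abc123", "not-the-real-salt");
const other = opaqueSubject("xyz789", "not-the-real-salt");
const missing = opaqueSubject(null, "not-the-real-salt");
```

The same Netid always yields the same string, different Netids yield different strings, and every user with no Netid shares the one "unknown" Subject, which is acceptable only because those users are not expected to get in.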

In Shibboleth 2 a Subject was represented by a special type of Encoder element in AttributeDefinition statements in the attribute-resolver XML file.

In Shibboleth 3 there is a new subsystem and a new configuration file called saml-nameid.xml. To understand the change, you have to remember that the "best practice" is to generate some obscured meaningless reproducible string of characters as the Subject and to use Attributes exclusively as the source of meaningful information. Shibboleth 3 is designed to emphasize the idea that putting real data in the Subject field is a bad idea. We have to do it because certain partners expect it. We do not have to discuss the configuration of a hash-trash Subject because it is automatic and fairly uninteresting.

If the Subject is meaningful, then it has to be based on some attribute (Netid, UPI, email, ...). That means that there is an AttributeDefinition that provides the value.

In Shibboleth 3 the correct way to do this is to create a special AttributeDefinition with a special ID that is only used to generate subject values:

    <AttributeDefinition id="subjectMail" xsi:type="ad:Simple" 
        sourceAttributeID="EmailAddress">
        <Dependency ref="IDRQuery" />
    </AttributeDefinition>

There are lots of real Attributes based on Email address, but this one special attribute is named "subjectMail" and it has no SAML Encoder elements that can be used to produce an Attribute in the Response. With this Definition, we have a special ID and can release "subjectMail" to certain Relying Party partners through the attribute-filter statements.

Now in the saml-nameid.xml file there can be a statement:

        <bean parent="shibboleth.SAML2AttributeSourcedGenerator"
            p:format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
            p:attributeSourceIds="#{ {'subjectMail'} }" />

This statement creates a potential Subject. It has a format string, but we usually refer to Subject formats by just the last piece after the last colon, so this is "emailAddress". The value will be taken from the 'subjectMail' attribute above.

The last piece of the puzzle is provided by the Metadata for a Relying Party, where one or more Format strings can be provided in a NameIDFormat element:

        <NameIDFormat>urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress</NameIDFormat>

Now, if you have done this right, there should be for any given Relying Party exactly one Subject generated by saml-nameid.xml that is both based on an attribute released to this RP by attribute-filter and also has an associated format mentioned in a NameIDFormat element of the RP Metadata. Then that one Subject will be used to generate the Response.
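
The three-way match described above can be written as a filter. This is a simplified model for reasoning about a configuration, not Shibboleth's actual selection logic: candidateSubjects is a hypothetical function, and the second generator with its subjectNetid attribute is made up for the example.

```javascript
// generators: entries from saml-nameid.xml (a format plus a source attribute ID).
// releasedAttributeIds: attribute IDs the attribute-filter releases to this RP.
// metadataFormats: NameIDFormat values found in the RP's Metadata.
function candidateSubjects(generators, releasedAttributeIds, metadataFormats) {
  return generators.filter(
    (g) =>
      releasedAttributeIds.includes(g.attributeId) &&
      metadataFormats.includes(g.format)
  );
}

const EMAIL = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress";
const UNSPECIFIED = "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified";

const generators = [
  { format: EMAIL, attributeId: "subjectMail" },
  { format: UNSPECIFIED, attributeId: "subjectNetid" }, // hypothetical second generator
];

// This RP is released only subjectMail and asks only for emailAddress.
const candidates = candidateSubjects(generators, ["subjectMail"], [EMAIL]);
console.log(candidates.length); // 1
```

If the filter returns zero or more than one candidate for a partner, the configuration for that partner deserves a second look.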

The most important thing to understand about Subjects is that the NameIDFormat does not really mean what it says. In normal use, "emailAddress" seems to suggest that this is a real Email address to which you might send mail. NameIDFormat is a suggestion about what the thing looks like, not a request for a particular real attribute value. In this case it suggests that the Subject "look like an Email address" not that it actually be an email address. In reality, there is not even a requirement that the Subject contain an "@" to match the format.

So in practice you can almost ignore the NameIDFormat except for its use in selecting a specific Subject from the list of available subjects in the saml-nameid.xml file. A value equal to the Netid, for example, could be assigned any format in saml-nameid and then could match any NameIDFormat in a Metadata file.

When more than one Subject definition can be released to a Service Provider, Shibboleth chooses one. You can control the preference, but now you are missing the point. Either you should not release two Subject-generating AttributeDefinitions to the same EntityID, or you should delete the unwanted NameIDFormat string in the Metadata. If that is not possible, read the Shibboleth Wiki for information on controlling the selection preference.

Attribute-Filter

The attribute-filter.xml file has a long list of rules listing the Attributes (defined in the previous section) that are released to each partner. For example:

    <afp:AttributeFilterPolicy id="releaseToCommunityForceStaging">
        <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://yalestaging.communityforce.com" />
        <afp:AttributeRule attributeID="givenName"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="sn"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="mail"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
    </afp:AttributeFilterPolicy>

CommunityForce gets the givenName (first name), sn (surname or family name), and E-Mail address (named just "mail" according to the old LDAP standards). In fact, these are all standard old LDAP attributes which are very popular in academic applications. In contrast:

    <afp:AttributeFilterPolicy id="releaseToArcher">
        <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://sso2.archer.rsa.com/adfs/services/trust" />
        <afp:AttributeRule attributeID="scopedNetidAsUPN"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="firstnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="lastnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="emailADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
    </afp:AttributeFilterPolicy>

Archer gets "http://schemas.xmlsoap.org/claims/FirstName" and so on for lastname and email. These are Microsoft URL style names that are more popular these days with everyone except for the old guard in universities who still remember the LDAP names from previous failed attempts to use them.

The Attribute-Filter entries are cumulative. Shibboleth runs through the rules, and whenever a rule applies to an entity, any released attributes are added to the list of values we will send. Most of the time all of the attributes for one entity will be defined in one place; this is a good and sane practice, but it is not a requirement.
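
A minimal sketch of the cumulative behavior, in Python. It models only the permit rules shown in the examples above, and the released_attributes function and the (requester, attribute list) tuples are a hypothetical simplification, not how Shibboleth stores its policies.

```python
def released_attributes(policies, requester):
    """Walk every policy in order and accumulate the attribute IDs of
    each policy whose requester string matches this entity."""
    released = []
    for policy_requester, attribute_ids in policies:
        if policy_requester != requester:
            continue
        for attribute_id in attribute_ids:
            if attribute_id not in released:  # same attribute twice is harmless
                released.append(attribute_id)
    return released


policies = [
    ("https://yalestaging.communityforce.com", ["givenName", "sn"]),
    ("https://sso2.archer.rsa.com/adfs/services/trust", ["scopedNetidAsUPN"]),
    # A second, separate policy for the same partner adds to the first one.
    ("https://yalestaging.communityforce.com", ["mail"]),
]

result = released_attributes(policies, "https://yalestaging.communityforce.com")
```

Here result contains givenName, sn, and mail even though the three attributes come from two different policies.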

Therefore, Shibboleth allows the Attribute-Filter function to be broken up into more than one file. We take advantage of this by creating an attribute-filter.xml file that contains the attributes released to each partner as of an official Shibboleth Release, plus an additional-attribute-filter.xml file that starts out empty and can be changed between formal releases. The additional file can either create a new filter policy for a new partner or add an attribute to an existing partner.

However, you can only release attributes defined by attribute-resolver.xml, and that does not change between releases.

Relying-Party

The relying-party.xml file has three types of definitions:

An Anonymous Relying Party is a partner who sends a SAML Request message to Shibboleth with an EntityID that does not match any configured Metadata. You may decide not to support them at all, but it is probably safe to send back a response with a unique Subject. In a simpler time before modern security and privacy concerns, Yale adopted a lax policy for anonymous services, and at some point both Shibboleth and CAS should be changed to be more strict.

The Default Relying Party configures the behavior of every partner who does not have a specific exception. At Yale we configure the Default Relying Party to receive attributes that are not encrypted. If you want encryption, just use SSL; encrypted attributes, meanwhile, are impossible to debug.

Specific Relying Party configurations could force encryption if we needed to do it, but we have no examples currently at Yale.
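
The way a request ends up in one of the three categories can be sketched as follows. This is a simplified model of the description above, assuming the only inputs are the set of EntityIDs found in Metadata and the set of EntityIDs with a specific configuration; the function name and the sample EntityIDs passed to it are hypothetical.

```python
def relying_party_profile(entity_id, known_entity_ids, specific_overrides):
    """Return which relying-party.xml definition applies to a request."""
    if entity_id not in known_entity_ids:
        return "anonymous"   # no Metadata matches this EntityID
    if entity_id in specific_overrides:
        return "specific"    # a partner with its own exception
    return "default"         # everyone else with Metadata


known = {
    "https://yalestaging.communityforce.com",
    "https://sso2.archer.rsa.com/adfs/services/trust",
}
overrides = set()  # no specific configurations currently at Yale

profile = relying_party_profile(
    "https://yalestaging.communityforce.com", known, overrides
)
```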

More About Metadata

SAML Metadata can have a ton of useless information. There are four things that are actually important:

The entityID value provides a unique string that identifies the partner.
The AssertionConsumerService Location parameter defines the URL to which a SAML Response will be sent. No Response will be sent to any URL that is not a listed ACS Location. One ACS element can be flagged as the Default (which is implied when there is only one ACS element), and it becomes the URL to which SAML Responses are sent when no URL is provided in the Request.
One or more <NameIDFormat> elements can be provided, and they help select the format and value of the Subject.
The Metadata indicates if the partner uses SAML2 or SAML1 protocol.
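
The ACS rule in the list above can be sketched as a small Python function. The choose_acs name and the (location, is_default) tuples are hypothetical; the case of several ACS elements with none flagged as Default is not covered by the text above, so the sketch simply refuses it.

```python
def choose_acs(acs_entries, requested_url=None):
    """Pick the URL that receives the SAML Response.

    acs_entries is a list of (location, is_default) pairs taken from
    the partner's Metadata.
    """
    locations = [location for location, _ in acs_entries]
    if requested_url is not None:
        # Never send a Response to a URL that is not a listed ACS Location.
        if requested_url not in locations:
            raise ValueError("requested URL is not a listed ACS Location")
        return requested_url
    defaults = [location for location, is_default in acs_entries if is_default]
    if defaults:
        return defaults[0]
    if len(acs_entries) == 1:
        return locations[0]  # a single ACS element is the implied Default
    raise ValueError("no default ACS (case not covered by this sketch)")


entries = [
    ("https://yale-finance.my.salesforce.com?so=00Di0000000gP9D", True),
    ("https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D", False),
]

no_url_in_request = choose_acs(entries)
url_in_request = choose_acs(
    entries, "https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D"
)
```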

Shibboleth has a failFastInitialization parameter for each configured Metadata source (for example, failFastInitialization="false"). The default is "true" and causes Shibboleth to fail to start up if the Metadata is invalid. If we put Metadata directly into production, "true" would be a really, really bad idea. However, at Yale Metadata goes through DEV and TEST before it goes to PROD, and the way the Jenkins jobs interact with the Subversion tags should prevent problems from showing up only in production. If we have an issue, it is better that it show up as an initialization problem in DEV and get fixed immediately rather than slip through the cracks. Perhaps this parameter should be "true" in DEV and TEST and "false" in PROD, but that is a change for some later release.

Yale defines four types of Metadata Providers in the following order:

The dynamic "emergency-override.xml" that is initially empty but can be used to replace production Metadata that becomes bad between releases.
The static production partner Metadata XML files provided for archer, hewitt, communityforce, salesforce, and so on.
The InCommon remote source which changes without our knowledge or control.
The dynamic "additions.xml" file where new partners can be defined between releases (also associated with the additional-attribute-filter.xml file).
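
Why the order matters can be shown with a short Python sketch. It assumes the providers are consulted in the listed order and that the first one that knows the EntityID wins, which is what lets emergency-override.xml replace a bad production entry. The resolve_metadata function, the dictionaries, and the "incommon" and "production" labels are hypothetical stand-ins for the real Metadata sources.

```python
def resolve_metadata(providers, entity_id):
    """Return (provider name, metadata) from the first provider in the
    chain that defines this EntityID, or None if no provider does."""
    for name, entities in providers:
        if entity_id in entities:
            return name, entities[entity_id]
    return None


ARCHER = "https://sso2.archer.rsa.com/adfs/services/trust"

providers = [
    ("emergency-override.xml", {}),  # initially empty
    ("production", {ARCHER: "archer metadata as released"}),
    ("incommon", {}),
    ("additions.xml", {}),
]

before = resolve_metadata(providers, ARCHER)

# Between releases, a corrected entry is dropped into the override file.
providers[0][1][ARCHER] = "archer metadata, corrected"
after = resolve_metadata(providers, ARCHER)
```

Before the override is filled in, the production file answers. Afterward, the override answers first and the production entry is never reached.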

This then leaves us with a small number of special cases. Two of our partners (salesforce and cvent) use a technique that we might call the Expanding Metadata File. Every time you define a new application with these systems, instead of getting a new Metadata file you get a one-line change to add to the existing Metadata file. In Salesforce, the file looks like:

      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-finance.my.salesforce.com?so=00Di0000000gP9D" index="12"/>
      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D" index="13"/>
      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-adm.my.salesforce.com?so=00DA0000000ABT0" index="14"/>

The next time someone comes up with a new Salesforce application, it will be index="15" and will have its own unique Location value.
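
Adding the next entry can be sketched with the Python standard library. This is an illustration only: the add_acs function, the trimmed-down SPSSODescriptor wrapper, and the "yale-example" Location are hypothetical, and the new element is simply appended at the end, which is fine here because the sample descriptor holds nothing but ACS elements.

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
POST = "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
ACS_TAG = f"{{{MD_NS}}}AssertionConsumerService"


def add_acs(sp_descriptor, location):
    """Append an ACS element whose index is one more than the highest
    index already present in the descriptor."""
    existing = sp_descriptor.findall(ACS_TAG)
    next_index = max((int(e.get("index")) for e in existing), default=-1) + 1
    return ET.SubElement(
        sp_descriptor,
        ACS_TAG,
        {"Binding": POST, "Location": location, "index": str(next_index)},
    )


# A cut-down descriptor holding the last two entries from the file above.
snippet = f"""<md:SPSSODescriptor xmlns:md="{MD_NS}">
  <md:AssertionConsumerService Binding="{POST}"
      Location="https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D" index="13"/>
  <md:AssertionConsumerService Binding="{POST}"
      Location="https://yale-adm.my.salesforce.com?so=00DA0000000ABT0" index="14"/>
</md:SPSSODescriptor>"""

descriptor = ET.fromstring(snippet)
# Hypothetical Location for a new Salesforce application.
added = add_acs(descriptor, "https://yale-example.my.salesforce.com?so=EXAMPLE")
```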

Shibboleth Configuration