Shibboleth is configured with three primary files (attribute-resolver.xml, attribute-filter.xml, and relying-party.xml) and a number of "metadata" files.
The three primary files can be stored:
- On the local disk of the Shibboleth server VM
- On a Web server of some sort (referenced by a URL)
- On a Subversion server
Shibboleth reads the files and creates in-memory objects at startup time. After that, it can be configured to check each file periodically and to reload the file, replacing that part of the configuration, if the file has changed. For local disk files, the last-modified date is checked. For HTTP files, an "if-modified-since" request is made. For Subversion, modification is detected when a new revision is checked in.
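The check interval is set per configuration service in conf/service.xml. As a rough sketch (the path, polling interval, and attribute values shown are illustrative and should be confirmed against the installed release), the attribute resolver service might be declared like this for the local-disk case, with a FileBackedHttpResource or SVNResource substituted for the FilesystemResource to use the other storage options:

    <Services xmlns="urn:mace:shibboleth:2.0:services"
              xmlns:resource="urn:mace:shibboleth:2.0:resource"
              xmlns:attribute-resolver="urn:mace:shibboleth:2.0:resolver"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

        <!-- Re-check attribute-resolver.xml every 10 minutes and reload it if it changed -->
        <Service id="shibboleth.AttributeResolver"
                 xsi:type="attribute-resolver:ShibbolethAttributeResolver"
                 configurationResourcePollingFrequency="PT10M"
                 configurationResourcePollingRetryAttempts="3">
            <ConfigurationResource xsi:type="resource:FilesystemResource"
                                   file="/opt/shibboleth-idp/conf/attribute-resolver.xml" />
        </Service>
    </Services>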
All three methods of storage work well under normal circumstances, but Shibboleth cannot come up in a disaster recovery situation if the configuration data is not available, and the Web server and Subversion options depend on external servers that may not be up. Although Shibboleth can maintain a local copy of a remote file, it turns out that Shibboleth startup is delayed an unacceptable amount of time if Subversion is configured but the Subversion server is down.
At Yale the configuration of Shibboleth is controlled by Subversion, but not by making Shibboleth a direct Subversion client. Instead, the Jenkins Install job is the Subversion client, and after the Install job runs Shibboleth expects its files to be on local disk. Having tested the alternatives, we find this to be the best option: it allows dynamic update, but under the control of an explicit operation (the Jenkins job), and it leaves Shibboleth able to come up without additional external dependencies.
In addition to the three primary configuration files, Shibboleth has a separate mechanism to load Metadata files. A Metadata source can provide information for a single partner or for a group of partners. The group may be an administered federation like InCommon or just a bunch of related definitions, like the 11 new Tenant instances of Workday.
Each Metadata source can be a local file on disk or an HTTP source (but the Subversion option is not available for Metadata). We define InCommon as an HTTP source (with a local file backup of the last fetched copy of the Metadata), and all the other sources are local files installed by Jenkins from Subversion. Each Metadata source is an independent configuration with its own refresh rules. At Yale, we have decided to use three Metadata source models:
Static - Metadata files stored in Subversion and only changed by a full Jenkins Install. These individual files are not checked for a change in timestamp. Instead, they are all reloaded from disk when the relying-party.xml file that defines them is changed and reloaded into memory.
Remote - The InCommon Metadata is provided from a remote URL on the InCommon Web server. Once every 8 hours Shibboleth checks for a new version and downloads it from the server. Shibboleth maintains a local disk copy of the last file downloaded, so if Shibboleth is restarted while the remote server is unavailable it can still come up with the previous InCommon Metadata.
Dynamic - Individual Metadata files that are checked every 5 minutes for changes and are individually loaded into memory. This is a new option added as part of the 1Q15 general Shibboleth code update and reorganization. At this time, the files designated as dynamic are initially empty after each release, and a new type of Jenkins Install operation changes only these files. Because Shibboleth examines Metadata sources in the order in which they are configured, and stops when it finds Metadata for the entity it is searching for, these dynamic Metadata files are distinguished by their position in the search order.
The "emergency-override" dynamic file comes first in the search, so any Metadata placed in this file overrides an older version configured statically. This file is initially empty and Metadata is placed in it when we have an incident because an existing partner metadata has failed (typically because it has expired or the Certificate and key used by the partner has changed unexpectedly). This provides a safer form of "emergency" fix because only the one Metadata element is replaced instead of the entire Shibboleth configuration.
The "additions" dynamic file comes last in the search, so it cannot be used to change any existing Metadata for any entity. It can only define new Meatadata for new entities. This becomes a relatively safe Standard Change because anything put into this file cannot adversely affect existing configured services.
A new partner may need more than just Metadata; it may also need attributes released to it. Fortunately, Shibboleth allows the function of the attribute-filter.xml file to be broken up into multiple files. Existing partners are configured in attribute-filter.xml, and an empty file named "additional-attribute-filter.xml" is deployed with every Shibboleth Release.
Therefore, if a new partner has to be defined in production and cannot wait for the every-other-Thursday Release cycle, the Metadata for that partner can be placed in the metadata/additions.xml file and the attributes to be released can be put in the additional-attribute-filter.xml file. A Jenkins Install of runtype=additions replaces both of these originally empty files with the data for the newly defined partner, while their search order guarantees that they cannot interfere with existing services. When the next regularly scheduled Shibboleth Release is ready, the changes move from the additions files into the normal Shibboleth configuration and the additions files become empty again.
If a partner requires a new attribute, however, there is no way to define it outside the every other Thursday system (unless the ECAB authorizes an unscheduled Release).
Attribute-Resolver (Queries and Attributes)
Attributes are defined and their values are obtained from the configuration in the attribute-resolver.xml file.
The file starts with DataConnectors. A typical connector identifies a database or LDAP directory as a source, and a query (in SQL or LDAP query language) to present to the source. Currently Shibboleth pulls data from Oracle instances (ACS, IST, IDM, HOP), the IDR SQL Server database, the Public LDAP Directory, and the Windows AD LDAP directory.
There are generally three types of queries that make sense:
- A database query can return exactly one row. Then you can think of the row as a user object, and the column names become the properties of the object. All these properties are Strings (Shibboleth does not have a concept of other data types). NOTE: Oracle always returns column names in UPPERCASE and it is a really, really good idea to always capitalize these names in the configuration file whenever they appear. If you fail to use all caps in one of the later configuration elements, it will fail to match and then you get a null instead of a value.
- A database query can return more than one row but only one column. Then you have a "multivalued" property.
- An LDAP query returns the User object from the directory. LDAP User objects have properties, some of which are single-valued and some of which are multivalued.
One other point. The database query can return no rows, or it can return a row where some of the columns have the value NULL. An LDAP query can return no object, or it can return an object that does not have the property name you are interested in, or the property can exist but have no value. For the most part this doesn't matter unless you have to reference the value of the property in JavaScript. If you write JavaScript, remember to test for the property being "undefined", or null, or empty, which are distinct conditions with distinct tests.
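For example, a scripted attribute definition in attribute-resolver.xml has to make all three tests before it can safely copy a value. The following is only a sketch: the attribute id, the connector it depends on, and the column name are invented, and the usual resolver/ad namespace declarations at the top of attribute-resolver.xml are assumed.

    <resolver:AttributeDefinition id="preferredMail" xsi:type="ad:Script">
        <resolver:Dependency ref="myDatabaseConnector" />
        <ad:Script><![CDATA[
            importPackage(Packages.edu.internet2.middleware.shibboleth.common.attribute.provider);
            preferredMail = new BasicAttribute("preferredMail");
            // The source attribute may be undefined (never created), null, or present but
            // empty; each condition needs its own test before the value can be used.
            if (typeof MAILCOLUMN != "undefined" && MAILCOLUMN != null && MAILCOLUMN.getValues().size() > 0) {
                preferredMail.getValues().add(MAILCOLUMN.getValues().get(0));
            }
        ]]></ad:Script>
    </resolver:AttributeDefinition>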
Returning no rows or objects is a normal response to a query. A query fails if it generates an ORAxxxx SQLException or a NamingException. Typically this happens if the database server or directory is down, but it can also happen if the userid and password used to log in to the server are no longer valid, or if permissions have been revoked or were never granted to that user.
Shibboleth regards any query failure as catastrophic, unless there is a "failover" DataConnector. The Failover can be a different query to a different database, or it can be a Static default value.
A Static DataConnector defines one or more properties with fixed values. It is not necessary to define a default value for every property that the real query would have returned, provided that a null or undefined value is acceptable for the missing ones. Typical Static default values are -1, 0, "", or "undefined", but that is up to you.
Because query failure is catastrophic without a failover, every query must have a Static Failover DataConnector. It is not important what default values, if any, it provides.
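The pattern looks roughly like the following sketch. The connector ids, JDBC settings, query, and column names are all invented for illustration; only the shape (a real query with a resolver:FailoverDataConnector pointing at a dc:Static connector) is the point.

    <resolver:DataConnector id="directoryQuery" xsi:type="dc:RelationalDatabase">
        <resolver:FailoverDataConnector ref="directoryQueryFailover" />
        <dc:ApplicationManagedConnection jdbcDriver="oracle.jdbc.driver.OracleDriver"
                                         jdbcURL="jdbc:oracle:thin:@//dbhost.example.yale.edu:1521/SOMEDB"
                                         jdbcUserName="shibb" jdbcPassword="secret" />
        <dc:QueryTemplate>
            <![CDATA[ SELECT NETID, FIRST_NAME, LAST_NAME FROM DIRECTORY WHERE NETID = '$requestContext.principalName' ]]>
        </dc:QueryTemplate>
    </resolver:DataConnector>

    <!-- Used only if the query above throws an exception; note the UPPERCASE ids matching Oracle column names -->
    <resolver:DataConnector id="directoryQueryFailover" xsi:type="dc:Static">
        <dc:Attribute id="FIRST_NAME"><dc:Value>undefined</dc:Value></dc:Attribute>
        <dc:Attribute id="LAST_NAME"><dc:Value>undefined</dc:Value></dc:Attribute>
    </resolver:DataConnector>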
In the new release, the attribute-resolver file has been reorganized to emphasize the Failover relationship, and as part of the testing of the new release we will verify that Shibboleth survives the loss of access to each data source. However, it becomes an ongoing process to ensure that every time a new query is defined, a static Failover is also created and Shib is tested for that failure.
Because Shibboleth behaves catastrophically if a query fails without a failover, there is no entirely safe way to update this file. Defining new queries or attributes cannot be part of a Standard Change. It is going to require testing as part of a full Release cycle (unless it is an Emergency in the eyes of the ECAB).
The queries provide the basic data. When they are done you know stuff about the user, but different partners have decided to demand that the same piece of information be given different names when sent to them. Take something as simple as "first name". In China, the name that comes first is the family name, and the individual given name comes second, so international standards tend to reject "first" and "last" preferring terms like "familyName" and "givenName". Of course, a lot of our partners are not familiar with international standards. So different partners will ask for "FirstName", "firstName", "first_name", "givenName", "Given Name", and slightly more sophisticated partners will ask for one of three globally unique technical identifiers ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", "http://schemas.xmlsoap.org/claims/FirstName", and "urn:oid:2.5.4.42").
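In practice this means the same underlying value gets several attribute definitions (or at least several encoders), one per naming convention. A sketch, with an invented LDAP connector name and using the Archer-style claim name that appears later in this document:

    <!-- The same givenName value, exposed under the LDAP-style OID and the Microsoft claim URL -->
    <resolver:AttributeDefinition id="givenName" xsi:type="ad:Simple" sourceAttributeID="givenName">
        <resolver:Dependency ref="publicLdap" />
        <resolver:AttributeEncoder xsi:type="enc:SAML2String" name="urn:oid:2.5.4.42" friendlyName="givenName" />
    </resolver:AttributeDefinition>

    <resolver:AttributeDefinition id="firstnameADFS" xsi:type="ad:Simple" sourceAttributeID="givenName">
        <resolver:Dependency ref="publicLdap" />
        <resolver:AttributeEncoder xsi:type="enc:SAML2String" name="http://schemas.xmlsoap.org/claims/FirstName" friendlyName="FirstName" />
    </resolver:AttributeDefinition>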
Then there are a few cases where a single named variable can have different values for different partners. The best example is the E-Mail address. Most systems expect this to be passed to them as an attribute named "mail". However, at Yale you have your primary E-Mail alias (firstname.lastname@yale.edu) but you can also have other aliases. This is complicated by the fact that you can have an Exchange account or an Eliapps Google account, or both. When we send the E-Mail alias to Google, they only want to see the Google mail name, but when we send the E-Mail address to Box, they want to see your primary alias whether it is Exchange or Google.
Attribute-Filter
The attribute-filter.xml file has a long list of rules listing the Attributes (defined in the previous section) that are released to each partner. For example:
<afp:AttributeFilterPolicy id="releaseToCommunityForceStaging"> <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://yalestaging.communityforce.com" /> <afp:AttributeRule attributeID="givenName"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> <afp:AttributeRule attributeID="sn"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> <afp:AttributeRule attributeID="mail"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> </afp:AttributeFilterPolicy>
CommunityForce gets the givenName (firstname), sn (surname or family name), and E-Mail address (named just "mail" according to the old LDAP standards). In fact, these are all standard old LDAP attributes, which are very popular in academic applications. In contrast:
<afp:AttributeFilterPolicy id="releaseToArcher"> <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://sso2.archer.rsa.com/adfs/services/trust" /> <afp:AttributeRule attributeID="scopedNetidAsUPN"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> <afp:AttributeRule attributeID="firstnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> <afp:AttributeRule attributeID="lastnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule> <afp:AttributeRule attributeID="emailADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
Archer gets "http://schemas.xmlsoap.org/claims/FirstName" and so on for lastname and email. These are Microsoft URL style names that are more popular these days with everyone except for the old guard in universities who still remember the LDAP names from previous failed attempts to use them.
The Attribute-Filter entries are cumulative. Shibboleth runs through the rules, and whenever a rule applies to an entity, any released attributes are added to the list of values we will send. Most of the time all of the attributes for one entity will be defined in one place; this is a good and sane practice, but not a requirement.
Therefore, Shibboleth allows the Attribute-Filter function to be broken up into more than one file. We take advantage of this by creating an attribute-filter.xml file that contains the attributes released to each partner as of an official Shibboleth Release, plus an additional-attribute-filter.xml file, initially empty, that can be changed between formal releases. The additional file can either create a new filter policy for a new partner or add an additional attribute to an existing partner.
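For example, if an existing partner suddenly needed one more attribute between releases, a policy like the following could go into additional-attribute-filter.xml and would simply add to whatever attribute-filter.xml already releases to that entity (the attribute shown is just an illustration, and it has to be one that attribute-resolver.xml already defines):

    <afp:AttributeFilterPolicy id="extraAttributeForCommunityForceStaging">
        <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://yalestaging.communityforce.com" />
        <afp:AttributeRule attributeID="eduPersonPrincipalName"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
    </afp:AttributeFilterPolicy>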
However, you can only release attributes defined by attribute-resolver.xml, and that does not change between releases.
Relying-Party and Metadata
Metadata is a SAML standard format for describing the Identity Provider (Shibboleth at Yale) and the Service Provider (example: coursera.org). Shibboleth needs Service Provider Metadata for its partners. Although the Metadata file can be quite large and complex, the important information is the EntityID, a unique identifier for the partner, which is typically either a DNS name (coursera.org) or a URL (https://coursera.org). There is also an "AssertionConsumerService URL" that defines the URL to which Shibboleth sends the SAML message that describes the user.
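Stripped of certificates and optional elements, the Metadata for a hypothetical partner boils down to something like this:

    <EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
                      entityID="https://newpartner.example.com/sp">
        <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
            <!-- The URL to which Shibboleth sends the SAML message describing the user -->
            <AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
                                      Location="https://newpartner.example.com/sp/acs"
                                      index="1" />
        </SPSSODescriptor>
    </EntityDescriptor>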
The relying-party.xml file defines the Metadata sources. Each source is a file that Shibboleth reads in and parses separately from the other sources. Then Shibboleth searches each source for an EntityID and it stops when it finds a match.
Some partners are configured through a Federation. InCommon, for example, distributes Metadata for a large number of Universities and companies that do business with universities. Periodically Shibboleth obtains updated Metadata from the URL "http://md.incommon.org/InCommon/InCommon-metadata.xml".
Our most important partners exchange Metadata with us directly. We store their Metadata files in a directory in Subversion, and we add a reference to the file name to the relying-party.xml file so Shibboleth will read it.
We could have created a single composite Metadata file with all the information provided by all the partners. This is, after all, the way InCommon distributes its Metadata. However, we lack the resources and tools to do any elaborate parsing and validity checking of the file contents. By storing the files separately and creating a new Metadata source for each file, we insulate each file from all the other files and limit the possible damage caused by misconfiguration.
Shibboleth has a failFastInitialization parameter for each configured Metadata source. The default is "true", which causes Shibboleth to fail at startup if the Metadata is invalid. If we put Metadata directly into production, "true" would be a really, really bad idea. However, at Yale Metadata goes through DEV and TEST before it goes to PROD, and the way the Jenkins jobs interact with the Subversion tags should prevent problems from showing up only in production. If we have an issue, it is better that it show up as an initialization problem in DEV and get fixed immediately rather than being something that could slip through the cracks. Perhaps this parameter should be "true" in DEV and TEST and "false" in PROD; that is a change to be made in some later release.
The relying-party.xml file defines four categories of Metadata sources, listed here in search order (a configuration sketch follows the list):
- The dynamic "emergency-override.xml" that is initially empty but can be used to replace production that becomes bad between releases because of something the partner did wrong.
- The static production partner Metadata XML files provided for archer, hewitt, communityforce, salesforce, and so on.
- The InCommon remote source
- The dynamic "additions.xml" file where new partners can be defined between releases (also associated with the additonal-attribute-filter.xml file).
This then leaves us with a small number of special cases. Two of our partners (salesforce and cvent) use a technique that we might call the Expanding Metadata File. Every time you define a new application with these systems, instead of getting a new Metadata file you get a one line change to add to the existing Metadata file. In Salesforce, the file looks like:
    <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://yale-finance.my.salesforce.com?so=00Di0000000gP9D" index="12"/>
    <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D" index="13"/>
    <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://yale-adm.my.salesforce.com?so=00DA0000000ABT0" index="14"/>
The next time someone comes up with a new Salesforce application, it will be index="15" and will have its own unique Location value.
We may add special types of Jenkins Installs (runtype=salesforce and runtype=cvent) that replace just this one file. The bad news is that if the new Metadata is bad it will break existing Salesforce or Cvent applications, but the type of edit here is fairly simple and any mistakes should show up in DEV and TEST. Furthermore, the Shibboleth isolation of Metadata sources and the decision to configure files separately in relying-party.xml ensure that changes to Salesforce only affect Salesforce applications and nothing else.
Elements of a Proposed Strategy
Previously, there were only two "runtype" values for the Jenkins Shibboleth Install job.
Runtype "install" stops the JBoss server, loads a complete Shibboleth system including potentially new code, and new configuration files.
Runtype "config" installs a complete set of new configuration files.
The proposal is to add new runtype values.
Runtype "additions" will change the Metadata "additions.xml" and the "additional-attribute-filter" file. This can be used to add new Service Providers to production between the every-other-Thursday full Release cycles. Shibboleth isolates these files and appears to guarantee that this type of configuration cannot possibly interfere with existing production services.
Runtype "emergency" will change the "emergency-override.xml" file and allows us to define new Metadata for an existing production partner without affecting anything else. It may require permission from the ECAB, but it is a less dangerous change than a full configuration update.
Runtype "salesforce" and "cvent" are proposed (for discussion) jobs that change a single Metadata file for the two partners that require frequent updates. I would like to see them become Standard Changes.
The second element of the strategy is more accurate and complete testing. Currently TEST Shibboleth is connected to the TEST database instances (ACS2, IST2, IDM2, HOP4) and potentially to the TEST AD (yu.yale.net). This provides a service for those who need to use test netids, but it does not actually test what is going to go into production. So a new PREPROD instance has been created that uses the production databases and generates SAML signed with the production credentials, but runs on a separate machine and can be tested before the changes are installed into real PROD.