Configuration Strategies

Shibboleth is configured with three primary files (attribute-resolver.xml, attribute-filter.xml, and relying-party.xml) and a number of "metadata" files.

The three primary files can be stored:

On the local disk of the Shibboleth server VM
On a Web server of some sort (referenced by a URL)
On a Subversion server

If they are local disk files, Shibboleth can check their modification date and load a new configuration when a file is updated. If they are on a Web server, HTTP can be used to check whether the file has changed. Subversion access is driven by an increasing version number instead of a date.

The Shibboleth documentation recommends SVN. However, because Shibboleth is a Tier 0 application, it has to be able to come up with as few prerequisites as possible, and the SVN support does not seem to work if the Subversion server has not yet been restored.

Yale Shibboleth configuration is driven by Subversion, but Shibboleth is not the SVN client and it doesn't download files itself. Instead, we control when the files become active through the Jenkins Install job. Jenkins checks out configuration files from SVN and then copies the files to local disk on the Shibboleth servers. Shibboleth is configured to use local disk files, and we have an administratively controlled and logged mechanism to update them. Once installed, the files remain available should Shibboleth be restarted in any Disaster Recovery scenario.

We have configured Shibboleth to check every 30 seconds for a new change timestamp on any local configuration file. When it sees a new version of the file it reads the contents into memory and runs a minimal XML parse. If there is an XML syntax error in the file, it is discarded and the old configuration remains active. Otherwise, once the file has been successfully read then the new configuration replaces the previous configuration.
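The check-parse-swap cycle described above can be modeled in a few lines. This is only a sketch of the behavior, not Shibboleth's own code: the class name ReloadableConfig and its poll method are made up for illustration, and the 30 second timer that would call poll is left out.

```python
import os
import xml.etree.ElementTree as ET


class ReloadableConfig:
    """Illustrative model: holds the last configuration that parsed cleanly."""

    def __init__(self, path):
        self.path = path
        self.mtime = None   # timestamp of the last file version we examined
        self.tree = None    # last successfully parsed configuration

    def poll(self):
        """Check the timestamp once; return True if a new configuration became active."""
        mtime = os.path.getmtime(self.path)
        if mtime == self.mtime:
            return False            # file unchanged, nothing to do
        self.mtime = mtime          # remember this version even if it turns out to be bad
        try:
            candidate = ET.parse(self.path)
        except ET.ParseError:
            return False            # syntax error: discard it, old configuration stays active
        self.tree = candidate       # the new configuration replaces the previous one
        return True
```

The point of the model is the order of operations: the new file is parsed completely before anything is replaced, so a bad file never displaces a working configuration.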

Metadata is a bit more complicated. Metadata sources are configured in the relying-party.xml file. Each Metadata source is an independent configuration with its own refresh rules. At Yale, we have decided to use three Metadata source models:

Static - Production Metadata for partners that have supplied us with Metadata that we check into SVN and manage directly are handled as individual static files. They are copied from SVN to the local hard drive of the Shibboleth server, but they have no refresh policy. You cannot change a Static Metadata file by itself. You have to change the timestamp on the relying-party.xml file, and when it gets read into memory then Shibboleth automatically reloads all the Metadata files that the relying-party.xml file designates.

Remote - The InCommon Metadata is provided from a remote URL on the InCommon Web server. Once every 8 hours Shibboleth checks for a new version and downloads it from the server. Shibboleth maintains a local disk copy of the last file downloaded, so if Shibboleth is restarted and the remote server is unavailable it will be able to come up with the previous InCommon Metadata.

Dynamic - Specific Metadata files are stored as local files on disk, but they are configured to be examined once every 5 minutes for a changed timestamp and to be reloaded when they change. Because Shibboleth examines Metadata sources in the order in which they are configured, and it stops when it finds Metadata for the entity for which it is searching, these dynamic Metadata files are distinguished by their position in the search order.

The "emergency-override" dynamic file comes first in the search, so any Metadata placed in this file overrides an older version configured statically. This file is initially empty, and Metadata is placed in it when we have an incident because an existing partner's Metadata has failed (typically because it has expired or the Certificate and key used by the partner have changed unexpectedly). This provides a safer form of "emergency" fix because only the one Metadata element is replaced.

The "additions" dynamic file comes last in the search, so it cannot be used to change any existing Metadata for any entity. It can only define new Metadata for new entities. This becomes a relatively safe Standard Change because anything put into this file cannot adversely affect existing configured services.
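The effect of search order can be shown with a small model. This is a sketch under assumptions: the source names mirror the files described above, but the entity IDs and Metadata values are placeholders, and find_metadata is a hypothetical stand-in for whatever Shibboleth does internally.

```python
def find_metadata(sources, entity_id):
    """Return (source_name, metadata) from the first source that defines entity_id."""
    for name, entities in sources:
        if entity_id in entities:
            return name, entities[entity_id]   # stop at the first match
    return None


# Sources in configured order; entity IDs and values are made-up placeholders.
sources = [
    ("emergency-override", {}),
    ("static-partners", {"https://partner.example.org": "static v1"}),
    ("incommon", {}),
    ("additions", {
        "https://partner.example.org": "never reached",
        "https://new.example.org": "new v1",
    }),
]
```

With these sources, a lookup for the existing partner returns the static entry even though "additions" also defines it, while the new entity is found only in "additions". Putting an entry for the existing partner into the "emergency-override" dictionary makes that entry win instead.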

A new partner may need more than just Metadata. They may need attributes released to them. Fortunately, Shibboleth allows the function of the attribute-filter.xml file to be broken up into multiple files. Existing partners are configured in attribute-filter.xml, and an empty file named additional-attribute-filter.xml is initially deployed with every Shibboleth release.

Therefore, if a new partner has to be defined to production and cannot wait for the every-other-Thursday Release cycle, the Metadata for that partner can be placed in the metadata/additions.xml file and the attributes to be released can be put in the additional-attribute-filter.xml file. The two files are updated together. At a normal Release point, information is moved out of the "additions" files and becomes part of the standard configuration files, and the empty additions files are deployed to start the next cycle.

If a partner requires a new attribute, however, there is no way to define it outside the every-other-Thursday Release cycle (unless the ECAB authorizes an unscheduled Release).

Attribute-Resolver (Queries and Attributes)

Attributes are defined and their values are obtained from the configuration in the attribute-resolver.xml file.

The file starts with DataConnectors. A typical connector identifies a database or LDAP directory as a source, and a query (in SQL or LDAP query language) to present to the source. Currently Shibboleth pulls data from Oracle instances (ACS, IST, IDM, HOP), the IDR SQL Server database, the Public LDAP Directory, and the Windows AD LDAP directory.

There are generally three types of queries that make sense:

A database query can return exactly one row. Then you can think of the row as a user object, and the column names become the properties of the object. All these properties are Strings (Shibboleth does not have a concept of other data types). NOTE: Oracle always returns column names in UPPERCASE and it is a really, really good idea to always capitalize these names in the configuration file whenever they appear. If you fail to use all caps in one of the later configuration elements, it will fail to match and then you get a null instead of a value.
A database query can return more than one row but only one column. Then you have a "multivalued" property.
An LDAP query returns the User object from the directory. LDAP User objects have properties some of which are single valued and some of which are multivalued.
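The uppercase warning is easy to picture if you think of a result row as a dictionary keyed by column name. The sketch below is only a model of an exact-match lookup; the column names and values are invented and the column helper is hypothetical.

```python
# One row from a hypothetical Oracle query: the keys come back in UPPERCASE.
row = {"NETID": "abc123", "FIRSTNAME": "Pat"}


def column(row, name):
    """Exact-match lookup: a name in the wrong case simply finds nothing."""
    return row.get(name)


upper = column(row, "FIRSTNAME")   # matches the key as Oracle returned it
lower = column(row, "firstname")   # no match, so the result is None
```

Nothing fails loudly in the second lookup. The attribute just ends up with no value, which is why the mismatch is easy to miss.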

One other point. The database query can return no rows, or it can return a row where some of the columns have the value NULL. An LDAP query can return no object, or it can return an object that does not have the property name you are interested in, or the property can exist but have no value. For the most part this doesn't matter unless you have to reference the value of the property in JavaScript. If you write JavaScript, remember to test for the property being "undefined", or null, or empty, which are distinct conditions with distinct tests.

Returning no rows or objects is a normal response to a query. A query fails if it generates an ORAxxxx SQLException or a NamingException. Typically this happens if the database server or directory is down, but it can also happen if the userid and password you are using to log in to the server are no longer valid, or if permissions have been revoked or were never granted to that user.

Shibboleth regards any query failure as catastrophic, unless there is a "failover" DataConnector. The Failover can be a different query to a different database, or it can be a Static default value.

A Static DataConnector defines one or more properties and values. It is not necessary to define a default value for every property that you could have obtained from the correct execution of the real query, provided that a null or undefined value is acceptable for the other properties. Typically Static default values are -1, 0, "", "undefined" but that is up to you. Because query failure is catastrophic without a Fallback:

Every Query must have a Static Fallback DataConnector. It is not important what default values, if any, are provided.
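The difference between "no rows" and "query failed" is the whole reason for this rule, so here is a sketch of it. The names QueryFailed and resolve are hypothetical and stand in for the real exceptions and the real DataConnector machinery; the property name in the usage note is invented.

```python
class QueryFailed(Exception):
    """Stands in for an SQLException or NamingException from the data source."""


def resolve(primary, fallback=None):
    """Run the primary query; on failure use the static fallback if one exists."""
    try:
        return primary()            # an empty result is a normal answer, not a failure
    except QueryFailed:
        if fallback is None:
            raise                   # no failover configured: the failure is catastrophic
        return dict(fallback)       # static defaults; may define only some properties


def database_down():
    raise QueryFailed("data source unavailable")


def no_rows():
    return {}
```

Calling resolve(database_down, {"DEPT": ""}) returns the static default, resolve(no_rows) returns an empty result without touching any fallback, and resolve(database_down) with no fallback propagates the failure.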

In the new release, the attribute-resolver file has been reorganized to emphasize the Failover relationship, and as part of the testing of the new release we will verify that Shibboleth survives the loss of access to each data source. However, it becomes an ongoing process to ensure that every time a new query is defined, a static Failover is also created and Shib is tested for that failure.

Because Shibboleth behaves catastrophically if a query fails without a failover, there is no entirely safe way to update this file. Defining new queries or attributes cannot be part of a Standard Change. It is going to require testing as part of a full Release cycle (unless it is an Emergency in the eyes of the ECAB).

The queries provide the basic data. When they are done you know stuff about the user, but different partners have decided to demand that the same piece of information be given different names when sent to them. Take something as simple as "first name". In China, the name that comes first is the family name, and the individual given name comes second, so international standards tend to reject "first" and "last" preferring terms like "familyName" and "givenName". Of course, a lot of our partners are not familiar with international standards. So different partners will ask for "FirstName", "firstName", "first_name", "givenName", "Given Name", and slightly more sophisticated partners will ask for one of three globally unique technical identifiers ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", "http://schemas.xmlsoap.org/claims/FirstName", and "urn:oid:2.5.4.42").

Then there are a few cases where a single named variable can have different values for different partners. The best example is E-Mail address. Most systems expect this to be passed to them as an attribute named "mail". However, at Yale you have your primary E-Mail alias (firstname.lastname@yale.edu) but you can also have other aliases. This is complicated by the fact that you can have an Exchange account or an Eliapps Google account, or both. When we are sending the E-Mail alias to Google, they only want to see the Google mail name, but when we send the E-Mail address to Box, they want to see your primary alias whether it is Exchange or Google.

The attribute-filter.xml file has a long list of rules listing the Attributes (defined in the previous section) that are released to each partner. For example:

    <afp:AttributeFilterPolicy id="releaseToCommunityForceStaging">
        <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://yalestaging.communityforce.com" />
        <afp:AttributeRule attributeID="givenName"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="sn"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="mail"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
    </afp:AttributeFilterPolicy>

CommunityForce gets the givenName (firstname), sn (surname or family name), and E-Mail address (named just "mail" according to the old LDAP standards). In fact, these are all standard old LDAP attributes which are very popular in academic applications. In contrast:

    <afp:AttributeFilterPolicy id="releaseToArcher">
        <afp:PolicyRequirementRule xsi:type="basic:AttributeRequesterString" value="https://sso2.archer.rsa.com/adfs/services/trust" />
        <afp:AttributeRule attributeID="scopedNetidAsUPN"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="firstnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="lastnameADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
        <afp:AttributeRule attributeID="emailADFS"><afp:PermitValueRule xsi:type="basic:ANY" /></afp:AttributeRule>
    </afp:AttributeFilterPolicy>

Archer gets "http://schemas.xmlsoap.org/claims/FirstName" and so on for lastname and email. These are Microsoft URL style names that are more popular these days with everyone except for the old guard in universities who still remember the LDAP names from previous failed attempts to use them.

The Attribute-Filter entries are cumulative. Shibboleth runs through the rules, and whenever a rule applies to an entity, any released attributes are added to the list of values we will send. Most of the time all of the attributes for one entity will be defined in one place. That is a good and sane practice, but it is not a requirement.
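A short model makes the cumulative behavior concrete. It is a sketch only: released_attributes is a hypothetical function, each policy is reduced to a requester plus a list of attribute IDs, and the second policy (releasing scopedNetidAsUPN to CommunityForce) is invented purely to show two rules combining.

```python
def released_attributes(policies, requester):
    """Union of the attribute IDs from every policy that applies to the requester."""
    released = set()
    for policy_requester, attribute_ids in policies:
        if policy_requester == requester:
            released.update(attribute_ids)   # rules add to the list, they never replace it
    return released


COMMUNITYFORCE = "https://yalestaging.communityforce.com"

# First policy mirrors the CommunityForce example; the second is hypothetical.
main_policies = [(COMMUNITYFORCE, ["givenName", "sn", "mail"])]
extra_policies = [(COMMUNITYFORCE, ["scopedNetidAsUPN"])]

combined = released_attributes(main_policies + extra_policies, COMMUNITYFORCE)
```

Here combined holds all four attribute IDs, even though they were declared in two separate policies.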

Therefore, Shibboleth allows the Attribute-Filter function to be broken up into more than one file. We take advantage of this by creating an attribute-filter.xml file that contains the attributes released to each partner as of an official Shibboleth Release, plus an additional-attribute-filter.xml file that is initially empty and can be changed between formal releases. The additional file can either create a new filter policy for a new partner, or it can add an additional attribute to an existing partner.

However, you can only release attributes defined by attribute-resolver.xml, and that does not change between releases.

Relying-Party and Metadata

Metadata is a SAML standard format for describing the Identity Provider (Shibboleth at Yale) and the Service Provider (example: coursera.org). Shibboleth needs Service Provider Metadata for its partners. Although the Metadata file can be quite large and complex, the important information is the EntityID, a unique identifier for the partner, which is typically either a DNS name (coursera.org) or a URL (https://coursera.org). There is also an "AssertionConsumerService URL" that defines the URL to which Shibboleth sends the SAML message that describes the user.

The relying-party.xml file defines the Metadata sources. Each source is a file that Shibboleth reads in and parses separately from the other sources. Then Shibboleth searches each source for an EntityID and it stops when it finds a match.

Some partners are configured through a Federation. InCommon, for example, distributes Metadata for a large number of Universities and companies that do business with universities. Periodically Shibboleth obtains updated Metadata from the URL "http://md.incommon.org/InCommon/InCommon-metadata.xml".
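The remote source also keeps a local disk copy of the last download so that a restart can survive InCommon being unreachable. A minimal sketch of that idea follows. It is not Shibboleth's implementation: load_remote_metadata is a hypothetical name, and fetch is any caller-supplied function that returns the downloaded bytes or raises OSError (which is what Python's urllib raises for network errors).

```python
def load_remote_metadata(fetch, backing_file):
    """Return fresh Metadata if the download works, else the last saved copy."""
    try:
        data = fetch()
    except OSError:
        # Remote server unavailable: come up with the previous download instead.
        with open(backing_file, "rb") as f:
            return f.read()
    # Download succeeded: refresh the local copy for the next restart.
    with open(backing_file, "wb") as f:
        f.write(data)
    return data
```

Note that in this sketch a failed download with no backing file on disk still raises an error, so the very first start needs the remote server to be reachable.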

Our most important partners exchange Metadata with us directly. We store their Metadata files in a directory in Subversion, and we add a reference to the file name to the relying-party.xml file so Shibboleth will read it.

We could have created a single composite Metadata file with all the information provided by all the partners. This is, after all, the way InCommon distributes its Metadata. However, we lack the resources and tools to do any elaborate parsing and validity checking of the file contents. By storing the files separately and creating a new Metadata source for each file, we insulate each file from all the other files and limit the possible damage caused by misconfiguration.

Shibboleth has a failFastInitialization parameter for each configured Metadata source, shown here as failFastInitialization="false". The default is "true", which causes Shibboleth to fail to start up if the Metadata is invalid. If we put Metadata directly into production, "true" would be a really, really bad idea. However, at Yale Metadata goes through DEV and TEST before it goes to PROD, and the way the Jenkins jobs interact with the Subversion tags should prevent problems from showing up only in production. If we have an issue, it is better that it show up as an initialization problem in DEV and get fixed immediately than slip through the cracks. Perhaps this parameter should be "true" in DEV and TEST and "false" in PROD; that would be a change to make in some later release.

The relying-party.xml defines four categories of Metadata sources:

The dynamic "emergency-override.xml" that is initially empty but can be used to replace production Metadata that becomes bad between releases because of something the partner did wrong.
The static production partner Metadata XML files provided for archer, hewitt, communityforce, salesforce, and so on.
The InCommon remote source
The dynamic "additions.xml" file where new partners can be defined between releases (also associated with the additional-attribute-filter.xml file).

This then leaves us with a small number of special cases. Two of our partners (salesforce and cvent) use a technique that we might call the Expanding Metadata File. Every time you define a new application with these systems, instead of getting a new Metadata file you get a one line change to add to the existing Metadata file. In Salesforce, the file looks like:

      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-finance.my.salesforce.com?so=00Di0000000gP9D" index="12"/>
      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D" index="13"/>
      <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" 
		Location="https://yale-adm.my.salesforce.com?so=00DA0000000ABT0" index="14"/>

The next time someone comes up with a new Salesforce application, it will be index="15" and will have its own unique Location value.
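Because the edit is so mechanical, the next index can be computed rather than eyeballed. The sketch below assumes the AssertionConsumerService lines sit inside an element that declares the standard SAML metadata namespace for the md: prefix; the wrapper element in the sample and the function name next_acs_index are illustrative, and "highest index plus one" is simply the convention described above.

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"   # standard SAML 2.0 metadata namespace


def next_acs_index(descriptor_xml):
    """Return one more than the highest AssertionConsumerService index in the XML."""
    root = ET.fromstring(descriptor_xml)
    tag = "{%s}AssertionConsumerService" % MD_NS
    indexes = [int(acs.get("index")) for acs in root.iter(tag)]
    return max(indexes) + 1


# Sample built from the lines shown above; the wrapper element is an assumption.
sample = """
<md:SPSSODescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
      Location="https://yale-fbo.my.salesforce.com?so=00Di0000000gP9D" index="13"/>
  <md:AssertionConsumerService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
      Location="https://yale-adm.my.salesforce.com?so=00DA0000000ABT0" index="14"/>
</md:SPSSODescriptor>
"""

next_index = next_acs_index(sample)
```

A side benefit is that the parse itself catches a malformed file before it gets anywhere near a server.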

We may add special types of Jenkins Installs (runtype=salesforce and runtype=cvent) that replace just this one file. The bad news is that if the new Metadata is bad it will break existing Salesforce or Cvent applications, but the type of edit here is fairly simple and any mistakes should show up in DEV and TEST. Furthermore, the Shibboleth isolation of Metadata sources and the decision to configure files separately in relying-party.xml ensure that changes to Salesforce affect only Salesforce applications and nothing else.

Elements of a Proposed Strategy

Previously, there were only two "runtype" values for the Jenkins Shibboleth Install job.

Runtype "install" stops the JBoss server and installs a complete Shibboleth system, potentially including new code, along with new configuration files.

Runtype "config" installs a complete set of new configuration files.

The proposal is to add new runtype values.

Runtype "additions" will change the Metadata "additions.xml" and the "additional-attribute-filter" file. This can be used to add new Service Providers to production between the every-other-Thursday full Release cycles. Shibboleth isolates these files and appears to guarantee that this type of configuration cannot possibly interfere with existing production services.

Runtype "emergency" will change the "emergency-override.xml" file and allows us to define new Metadata for an existing production partner without affecting anything else. It may require permission from the ECAB, but it is a less dangerous change than a full configuration update.

Runtype "salesforce" and "cvent" are proposed (for discussion) jobs that change a single Metadata file for the two partners that require frequent updates. I would like to see them become Standard Changes.
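One way to think about the proposed runtypes is as a whitelist of files each one may replace. The sketch below is for discussion only. The file paths are assumptions (only metadata/additions.xml and additional-attribute-filter.xml are named in this document), and allowed is a hypothetical check, not part of the existing Jenkins job.

```python
# Files each narrow runtype may replace. Paths other than the two "additions"
# files are hypothetical placeholders.
RUNTYPE_FILES = {
    "additions": {"metadata/additions.xml", "additional-attribute-filter.xml"},
    "emergency": {"metadata/emergency-override.xml"},
    "salesforce": {"metadata/salesforce.xml"},
    "cvent": {"metadata/cvent.xml"},
}


def allowed(runtype, changed_files):
    """True if every changed file is one this runtype is permitted to replace."""
    if runtype in ("install", "config"):
        return True     # full runs deploy the complete configuration set
    return set(changed_files) <= RUNTYPE_FILES[runtype]
```

A check like this would let the job refuse, for example, an "additions" run that also tries to slip in a change to attribute-resolver.xml.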

Then the second element of the strategy is to provide a more accurate and complete testing strategy. Currently TEST Shibboleth is connected to the TEST database instances (ACS2, IST2, IDM2, HOP4) and potentially to the TEST AD (yu.yale.net). This provides a service for those who need to use test netids, but it does not actually test what is going to go into production.

It is also true that most partners do not support TEST environments. In fact, the entire InCommon Federation has no concept of TEST and no provision for us to define our TEST Shibboleth.

However, while CAS is bound to a particular well-known URL (secure.its.yale.edu/cas), Shibboleth is not bound to a URL or server. It is known by the Public/Private key pair stored in its /usr/local/shibboleth-idp/credentials folder. Create a second instance of Shibboleth running on any server anywhere in the organization, give it a copy of the same credentials files, and it will generate a SAML message that will be accepted as legitimate by any of our partners.

While applications talk to CAS directly, all communication between Shibboleth and any application goes through the Browser. So if there is a PRE-PROD test environment with a copy of the code we propose to put into production and a copy of the Production credentials, then a Browser can use it with all the standard production apps. The obvious brute force solution is to edit the hosts file on the Browser client machine so that "auth.yale.edu" resolves to the PRE-PROD VIP, which sends the Browser there whenever it is redirected to "auth.yale.edu". The first time it may be necessary to approve the SSL Certificate name mismatch, but after that you have a platform to comprehensively test the exact configuration we intend to put into production.