...
In the new release, the attribute-resolver file has been reorganized to emphasize the Failover relationship, and as part of the testing of the new release we will verify that Shibboleth survives the loss of access to each data source. However, it becomes an ongoing process to ensure that every time a new query is defined, a static Failover is also created and Shib is tested for that failure.
Because Shibboleth behaves catastrophically if a query fails without a failover, there is no entirely safe way to update this file. Defining new queries or attributes is less common, and typically it is not an emergency, but it cannot be part of a Standard Change. It is going to require testing as part of a full Release cycle (unless it is an Emergency in the eyes of the ECAB). With the care that should be used and the testing that should be done, the normal two week release to production cycle seems appropriate.
After the queries are defined, the same file goes on to define SAML attributes. The previous step obtained a value, but different partners want to use different names for the same thing.
The queries provide the basic data. When they are done you know stuff about the user, but different partners have decided to demand that the same piece of information be given different names when sent to them. Take something as simple as "first name". It isn't actually that simple. In China, the name that comes first is the family name, and the individual given name comes second. It is just in the West that the individual given name comes first, so international standards tend to reject "first" and "last", preferring terms like "familyName" and "givenName". Of course, a lot of our partners are not familiar with international standards. So different partners will ask for "FirstName", "firstName", "first_name", "givenName", or "Given Name", and slightly more sophisticated partners will ask for one of three globally unique technical identifiers ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", "http://schemas.xmlsoap.org/claims/FirstName", and the old LDAP value "urn:oid:2.5.4.42").
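The one-value, many-names situation can be pictured with a small Python sketch. This is only an illustrative model, not how attribute-resolver.xml is written; the internal key "given_name" and the function name are invented for the example, while the partner-facing names are the ones listed above.

```python
# Illustrative model only: one resolved value published under many names.
# "given_name" is a hypothetical internal key, not a real Yale variable name.
GIVEN_NAME_LABELS = [
    "FirstName",
    "firstName",
    "first_name",
    "givenName",
    "Given Name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname",
    "http://schemas.xmlsoap.org/claims/FirstName",
    "urn:oid:2.5.4.42",
]

# Map every partner-facing label to the single internal variable behind it.
LABEL_TO_SOURCE = {label: "given_name" for label in GIVEN_NAME_LABELS}


def build_attributes(user):
    """Return {partner-facing label: value} for every label whose source exists."""
    return {
        label: user[source]
        for label, source in LABEL_TO_SOURCE.items()
        if source in user
    }


attributes = build_attributes({"given_name": "Jane"})
```

Adding a ninth label is one more entry in the list; the query that produced the value is untouched.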
There are only a limited number of possible variables that you can extract from the Yale systems about a given user, but there are an unlimited number of names that people can dream up for them. Fortunately, adding a new label for an existing value is simple, and in this part of the file an error adding something new cannot cause Shibboleth to misbehave. Unfortunately, because this is the second section of a single file, and additions to the first section can cause problems if they are not done correctly, there is no quick off the shelf improvement available for the Install process. However, with a bit of Ant programming it might be possible to break the file into separate components and define different levels of testing and approval to change the two different types of configuration elements.
Then there are a few cases where a single named variable can have different values to different partners. The best example is E-Mail address or phone number. Most systems expect the E-Mail address to be passed to them as an attribute named "mail". However, at Yale you have your primary E-Mail alias (firstname.lastname@yale.edu) but you can also have other aliases. This is complicated by the fact that you can have an Exchange account or an Eliapps Google account, or both. When we are sending the E-Mail alias to Google, they only want to see the Google mail name, but when we send the E-Mail address to Box, they want to see your primary alias whether it is Exchange or Google.
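The "same name, different value" case for "mail" can be sketched as a tiny selection function. The field names "primary_alias" and "google_mail" and the partner keys are hypothetical, chosen only for this illustration; the real selection happens in the Shibboleth configuration.

```python
# Hypothetical sketch: the attribute is always called "mail", but the value
# depends on the partner. Field names and partner keys are assumptions.
def mail_for_partner(user, partner):
    """Pick the value to send as "mail" to the given partner."""
    if partner == "google":
        # Google only wants the Google mail name (None if the user has none).
        return user.get("google_mail")
    # Box and most other partners want the primary alias,
    # whether the mailbox is Exchange or Google.
    return user["primary_alias"]


example_user = {
    "primary_alias": "firstname.lastname@yale.edu",
    "google_mail": "google.name@example.org",  # made-up value
}
```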
Attribute-Filter
The attribute-filter.xml file has a long list of rules listing the Attributes (defined in the previous section) that are released to each partner. For example:
...
Archer gets "http://schemas.xmlsoap.org/claims/FirstName" and so on for lastname and email. These are Microsoft URL style names that are more popular these days with everyone except for the old guard in universities who still remember the LDAP names from previous failed attempts to use them.
It is almost impossible to imagine that any additions (or changes) to this file could plausibly cause a problem. However, for good practice it makes sense to arrange the order of the release elements so that the Tier 0, mission critical, or production stuff comes first, and the brand new or testing junk comes at the end. Then there could be a rule that makes the level of approval and testing depend upon where in the file you make the change. Changes to the stuff at the front are important and require signoff, while adding a new partner to the end is routine and can be done at any time. Again it would be nice to create an Ant script that breaks the sections up into separate files that are assembled at install time, and then the level of risk would be determined by which file, representing which section of the configuration, you are working with.
The Attribute-Filter entries are cumulative. Shibboleth runs through the rules and whenever a rule applies to an entity, any released attributes are added to the list of values we will send. Most of the time all of the attributes for one entity will be defined in one place; this is a good and sane practice but not a requirement.
Therefore, Shibboleth allows the Attribute-Filter function to be broken up into more than one file. We take advantage of this by creating an attribute-filter.xml file that contains the attributes released to each partner as of an official Shibboleth Release, and an additional-attribute-filter.xml file that is initially empty and can be changed between formal releases. The additional file can either create a new filter policy for a new partner, or add an additional attribute to an existing partner.
However, you can only release attributes defined by attribute-resolver.xml, and that does not change between releases.
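The cumulative behavior across two filter files can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the Shibboleth implementation: the entity IDs, attribute names, and data layout are invented, and treating an undefined attribute as silently skipped is a simplification of "you can only release attributes defined by attribute-resolver.xml".

```python
# Simplified model of cumulative attribute release across two filter files.
# All entity IDs and attribute names below are made up for illustration.
DEFINED_ATTRIBUTES = {"givenName", "sn", "mail"}  # stands in for the resolver

BASE_POLICIES = [  # stands in for the release-time filter file
    {"entity": "https://partner-a.example.org", "release": ["givenName", "sn"]},
]
ADDITIONAL_POLICIES = [  # stands in for the between-releases file
    {"entity": "https://partner-a.example.org", "release": ["mail"]},
    {"entity": "https://partner-b.example.org", "release": ["givenName", "notDefined"]},
]


def released_attributes(entity_id, *policy_files):
    """Accumulate every attribute released to entity_id by any matching rule."""
    released = []
    for policies in policy_files:
        for policy in policies:
            if policy["entity"] != entity_id:
                continue
            for name in policy["release"]:
                # Only attributes the resolver defines can be released.
                if name in DEFINED_ATTRIBUTES and name not in released:
                    released.append(name)
    return released
```

In this model, the second file adds "mail" to an existing partner and introduces a new partner, without editing the first file.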
Relying-Party and Metadata
The relying-party.xml file is only important now because it defines where Shibboleth finds Metadata. It is unlikely that the file itself will be modified, but if the Ant script triggered by a new form of Jenkins Install job simply "touches" the file (an Ant operation that resets the change date), then Shibboleth notices the new date and reloads all the Metadata files.
So now it is important to explain Metadata. SAML defines a standard format file that a partner should give us to define the two things we need to know: what is the formal name the partner uses to identify itself and where should we send the SAML message after we create it. Metadata is the most complicated possible format imaginable to carry such little information, but SAML defines a lot of extra fluff in the standard.
A partner can expose metadata with a URL, and we can configure Shibboleth to use the URL to fetch new metadata from the partner periodically, but what happens if the partner is down when Shibboleth restarts? Fortunately, Shibboleth can be configured (although it is not the default) to not regard a failure fetching any metadata file as a fatal error that prevents initialization. However, it is safer if we make a copy of the metadata and check it into our own system, especially since it almost never changes.
Shibboleth is actually much smarter and more flexible with Metadata than it is with any of its other configuration elements. In the relying-party.xml file you define a sequence of possible metadata sources. Each source is treated as independent and dynamic. Independent means a failure of any source does not affect the validity of the other sources. Dynamic means that any source can be configured to poll a local file or a remote URL for updates and to load new data when it appears and the loading of new data for one source does not affect the other sources.
When Shibboleth needs metadata for a partner, it runs down the list of configured sources in the order in which they were configured checking each source for configuration data for the unique identity string for that partner. When it finds a match, it uses that metadata.
This creates two obvious special sources. One source we can call "the junk at the end of the list" or just the additions. The additions metadata can be used to add new configured partners, but because it comes at the end and will not be searched if the name is found in an earlier source, anything put in the additions cannot change an already configured metadata source. This file is totally safe. It cannot change any existing service. It can only add brand new configurations for new partners. Since mistakes in the file don't affect other configuration, you can change it at any time.
The other extreme is a typically empty file at the start of the list that is the "emergency-override.xml" source. Add anything to this file and it replaces any metadata in any other source. You use it to respond to an emergency when you just need to fix one piece of metadata and you don't care where it came from (InCommon, a local configuration file, whatever). It will be found first and it will fix a reported problem quickly, and then the long term fix can be handled in the normal repair cycle.
Metadata is a SAML standard format for describing the Identity Provider (Shibboleth at Yale) and the Service Provider (example: coursera.org). Shibboleth needs Service Provider Metadata for its partners. Although the Metadata file can be quite large and complex, the important information is the EntityID, a unique identifier for the partner, which is typically either a DNS name (coursera.org) or a URL (https://coursera.org). There is also an "AssertionConsumerService URL" that defines the URL to which Shibboleth sends the SAML message that describes the user.
The relying-party.xml file defines the Metadata sources. Each source is a file that Shibboleth reads in and parses separately from the other sources. Then Shibboleth searches each source for an EntityID and it stops when it finds a match.
Some partners are configured through a Federation. InCommon, for example, distributes Metadata for a large number of Universities and companies that do business with universities. Periodically Shibboleth obtains updated Metadata from the URL "http://md.incommon.org/InCommon/InCommon-metadata.xml".
Our most important partners exchange Metadata with us directly. We store their Metadata files in a directory in Subversion, and we add a reference to the file name to the relying-party.xml file so Shibboleth will read it.
We could have created a single composite Metadata file with all the information provided by all the partners. This is, after all, the way InCommon distributes its Metadata. However, we lack the resources and tools to do any elaborate parsing and validity checking of the file contents. By storing the files separately and creating a new Metadata source for each file, we insulate each file from all the other files and limit the possible damage caused by misconfiguration.
Shibboleth has a failFastInitialization parameter for each configured Metadata source, which can be set to "false". The default is "true" and causes Shibboleth to fail to start up if the Metadata is invalid. If we put Metadata directly into production, "true" would be a really, really bad idea. However, at Yale Metadata goes through DEV and TEST before it goes to PROD, and the way the Jenkins jobs interact with the Subversion tags should prevent problems from showing up only in production. If we have an issue, it is better that it show up as an initialization problem in DEV and get fixed immediately rather than being something that could just slip through the cracks. Perhaps this parameter should be "true" in DEV and TEST and "false" in PROD, and that will be a change to be made in some later release.
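The trade-off between the two settings can be illustrated with a small Python model. This is not Shibboleth code; the loader, the exception class, and the file names are invented, and the only behavior taken from the text is that a fail-fast source stops startup when its Metadata is invalid while a non-fail-fast source is skipped.

```python
# Conceptual model of per-source fail-fast behavior. Names are hypothetical.
class MetadataError(Exception):
    """Raised when a metadata source cannot be loaded or parsed."""


def demo_loader(name):
    """Stand-in loader: any source whose name starts with 'bad' is invalid."""
    if name.startswith("bad"):
        raise MetadataError(name)
    return {"source": name}


def load_sources(sources, loader):
    """Load (name, fail_fast) pairs; return the sources that loaded.

    A failing source with fail_fast=True aborts startup by re-raising.
    A failing source with fail_fast=False is skipped and the rest still load.
    """
    loaded = {}
    for name, fail_fast in sources:
        try:
            loaded[name] = loader(name)
        except MetadataError:
            if fail_fast:
                raise
    return loaded


# PROD-like setting: the bad source is skipped, everything else still works.
prod_like = load_sources([("good.xml", False), ("bad.xml", False)], demo_loader)
```

With fail-fast turned on for the same list, the call raises instead, which is the behavior you would want to see in DEV rather than in PROD.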
The relying-party.xml defines four categories of Metadata sources:
- The dynamic "emergency-override.xml" that is initially empty but can be used to replace production Metadata that becomes bad between releases because of something the partner did wrong.
- The static production partner Metadata XML files provided for archer, hewitt, communityforce, salesforce, and so on.
- The InCommon remote source.
- The dynamic "additions.xml" file where new partners can be defined between releases (also associated with the additional-attribute-filter.xml file).
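The effect of ordering the four categories can be shown with a first-match lookup in Python. This is a sketch of the search rule described in this section, not Shibboleth code; the entity IDs and the metadata values are placeholders invented for the example.

```python
# First-match-wins lookup over an ordered list of metadata sources.
# Entity IDs and values are placeholders, not real partner metadata.
def find_metadata(entity_id, sources):
    """Return (source name, metadata) from the first source that knows entity_id."""
    for name, entries in sources:
        if entity_id in entries:
            return name, entries[entity_id]
    return None


def build_sources(override=None, additions=None):
    """Assemble the four categories in their configured order."""
    return [
        ("emergency-override.xml", override or {}),
        ("partner files", {"https://partner-a.example.org": "partner file copy"}),
        ("InCommon", {"https://partner-b.example.org": "federation copy"}),
        ("additions.xml", additions or {}),
    ]


# An entry in additions.xml for an existing partner is never reached.
shadowed = build_sources(additions={"https://partner-a.example.org": "new copy"})

# An entry in emergency-override.xml replaces whatever any later source says.
overridden = build_sources(override={"https://partner-b.example.org": "hotfix"})
```

The lookup makes the two safety properties visible: the file at the end can only add partners, and the file at the front can replace anything.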
This then leaves us with a small number of special cases. Two of our partners (salesforce and cvent) use a technique that we might call the Expanding Metadata File. Every time you define a new application with these systems, instead of getting a new Metadata file you get a one line change to add to the existing Metadata file. In Salesforce, the file looks like:
...
The next time someone comes up with a new Salesforce application, it will be index="15" and will have its own unique Location value.
This means that a new type of targeted Jenkins Install job should treat the Salesforce and Cvent metadata files differently from all the other metadata we are managing. Changes to those two files are routine and require less approval than changes to archer or hewitt.
We may add special types of Jenkins Installs (runtype=salesforce and runtype=cvent) that replace just this one file. The bad news is that if the new Metadata is bad it will break existing Salesforce or Cvent applications, but the type of edit here is fairly simple and any mistakes should show up in DEV and TEST. Furthermore, the Shibboleth isolation of Metadata sources and the decision to configure files separately in relying-party.xml ensure that changes to Salesforce only affect Salesforce applications and nothing else.
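The "one more indexed line" style of edit can be sketched with Python's standard XML library. The sample document below is invented for illustration and is not the Salesforce or Cvent file; the namespace, element name, and binding URI are the standard SAML 2.0 metadata ones, while the entityID and Location values are placeholders.

```python
# Sketch: append a new AssertionConsumerService with the next free index.
# SAMPLE is a made-up document, not a real partner metadata file.
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
POST_BINDING = "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
ACS_TAG = f"{{{MD_NS}}}AssertionConsumerService"

SAMPLE = f"""<EntityDescriptor xmlns="{MD_NS}" entityID="https://sp.example.org">
  <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <AssertionConsumerService Binding="{POST_BINDING}" Location="https://sp.example.org/app0" index="0"/>
    <AssertionConsumerService Binding="{POST_BINDING}" Location="https://sp.example.org/app1" index="1"/>
  </SPSSODescriptor>
</EntityDescriptor>"""


def add_consumer_service(xml_text, location):
    """Return (new index, updated XML) after adding one endpoint."""
    root = ET.fromstring(xml_text)
    sp = root.find(f"{{{MD_NS}}}SPSSODescriptor")
    used = [int(acs.get("index")) for acs in sp.findall(ACS_TAG)]
    next_index = max(used, default=-1) + 1
    ET.SubElement(sp, ACS_TAG, Binding=POST_BINDING,
                  Location=location, index=str(next_index))
    return next_index, ET.tostring(root, encoding="unicode")


new_index, updated_xml = add_consumer_service(SAMPLE, "https://sp.example.org/app2")
```

Because the edit only appends an element and never touches the existing ones, a mistake is most likely to show up as a parse error or a duplicate index, both of which are easy to check in DEV.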
Elements of a Proposed Strategy
Currently a "config" run of the Jenkins Install job replaces all the Shibboleth configuration files with new copies checked out from Subversion.
The proposal is to add one or more new soft-config options (to be named later) that perform subsets of the "config" install. Rather than having a large number of new Jenkins options, the soft-config will be driven by the Subversion tag. That is, instead of expecting to copy everything it will expect that only a small subset of the possible files will be updated and tagged and it will only change those files.
...
Previously, there were only two "runtype" values for the Jenkins Shibboleth Install job.
Runtype "install" stops the JBoss server and loads a complete Shibboleth system, including potentially new code and new configuration files.
Runtype "config" installs a complete set of new configuration files.
The proposal is to add new runtype values.
Runtype "additions" will change the Metadata "additions.xml" and the "additional-attribute-filter" file. This can be used to add new Service Providers to production between the every-other-Thursday full Release cycles. Shibboleth isolates these files and appears to guarantee that this type of configuration cannot possibly interfere with existing production services.
Runtype "emergency" will change the "emergency-override.xml" file and allows us to define new Metadata for an existing production partner without affecting anything else. It may require permission from the ECAB, but it is a less dangerous change than a full configuration update.
Runtype "salesforce" and "cvent" are proposed (for discussion) jobs that change a single Metadata file for the two partners that require frequent updates. I would like to see them become Standard Changes.
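One way to picture the proposed runtypes is as a mapping from each runtype to the only files it is allowed to change. The Python sketch below is a thought experiment, not the Jenkins or Ant implementation; the file names "salesforce.xml" and "cvent.xml" are assumptions, since the actual names of those two Metadata files are not given here.

```python
# Hypothetical mapping of proposed runtypes to the files each may change.
# "salesforce.xml" and "cvent.xml" are assumed names for illustration.
RUNTYPE_FILES = {
    "additions": {"additions.xml", "additional-attribute-filter.xml"},
    "emergency": {"emergency-override.xml"},
    "salesforce": {"salesforce.xml"},
    "cvent": {"cvent.xml"},
}


def narrowest_runtype(changed_files):
    """Return the limited runtype that covers all changed files, else "config"."""
    changed = set(changed_files)
    for runtype, allowed in RUNTYPE_FILES.items():
        if changed and changed <= allowed:
            return runtype
    # Anything outside a single limited set needs the full configuration run.
    return "config"
```

A check like this could reject a tagged change that claims to be "additions" but also touches, say, the resolver file, which is the case the limited runtypes are meant to exclude.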
Then the second element of the strategy is to provide a more accurate and complete testing strategy. Currently TEST Shibboleth is connected to the TEST database instances (ACS2, IST2, IDM2, HOP4) and potentially to the TEST AD (yu.yale.net). This provides a service for those who need to use test netids, but it does not actually test what is going to go into production.
...