Spring Framework
In the old days every application programmer had to define a configuration syntax and write code to parse that syntax and create the objects that perform operations. Earlier releases of Shibboleth worked this way. Shibboleth 3 (like CAS, for that matter) is a Spring Framework application. Spring provides a general-purpose configuration file system that can be used by any application. XML files define "beans", which are object instances of classes provided by the application. These objects are parameterized by values from the XML file, and they are linked together by plugging a reference to one bean into a property of another bean. For more information, go to the Spring Framework website.
However, Shibboleth 2 already had a configuration syntax where XML files were parsed to create objects, and SAML has standardized XML configuration files that have nothing to do with Spring. So Shibboleth 3 is a hybrid system where some configuration is done with Spring beans and some uses SAML or Shib 2 syntax. You can tell the difference by looking at the first XML element. If it is a <beans> element, then this is a Spring bean configuration file in the universal Spring syntax.
In general, each Spring-created object is defined by a <bean> element inside the <beans> container. Each <bean> is given a unique id by which it can be located or referenced.
Every application has to be configured with information about the Yale environment (server names, how to access databases, Active Directory, how users log in) and to select options. You may also plug in external components either by using the features of the application server (filters, listeners, EJBs) or some interface provided by the specific software package. If there is no explicit configuration language, then you may have to get the source and modify it.
Rather than creating a new configuration system, applications are increasingly using the Spring Framework. CAS started using Spring 10 years ago, and Shibboleth 3 now fully embraces it. Spring provides a common configuration mechanism, so once you know how it works you can use it to configure CAS or Shib or any other system written in Spring. However, it is not appropriate here for me to introduce Spring just to explain how Shibboleth configuration works. You can go to the Spring website for an introduction to its general syntax.
The SAML standard defines a syntax for certain configuration files, notably the Metadata. Shibboleth 1 and 2 had their own XML syntax for certain Shib only files, and in a few cases that syntax is still supported. However, almost all the new or less important or highly technical configuration files have been converted in Shib 3 to Spring syntax.
Shibboleth is a modular system with subcomponents that perform specific functions. One component reads Metadata files that describe communication with various partners (Salesforce, Comcast, Google). Another component authenticates users (at Yale we use CAS). Another component reads data from Yale databases and directories to build Attributes that will be returned to the partners after the user logs in.
In Spring, a component is configured as a Bean. A Bean is an instance of a Java class in one of the libraries provided by the application. The Bean is parameterized with strings, files, numbers, booleans, and references to other Beans. For example, a Bean that uses the CAS client library to log a user on needs to know the URL of CAS at Yale ("https://secure.its.yale.edu/cas"). To do the configuration, Spring doesn't have to understand what a Bean does. It just needs to know the name of the Java class that creates the right type of object. Each bean also has a unique id= by which it can be located or referenced by other code. Spring contains some utility classes that make common generic objects easier to define with a specialized element. For example, a Java List object can be created with the Spring <util:list> element, which is translated internally to a bean.
Each bean is configured with parameters that can be provided by adding p:parametername="value" attributes to the <bean> element or by adding <property> or <constructor-arg> elements to the content of the <bean> element.
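As a rough illustration of these styles, the fragment below configures a bean first with a p: attribute and then with nested <property> and <constructor-arg> elements. The class names, bean ids, and property names are invented for this sketch and do not come from Shibboleth or the Unicon library; only the CAS URL is taken from the text above. It also assumes the enclosing <beans> element declares the Spring p namespace.

```xml
<!-- Hypothetical bean: a string parameter supplied as a p: attribute -->
<bean id="exampleTicketValidator"
      class="edu.example.idp.ExampleTicketValidator"
      p:casServerUrl="https://secure.its.yale.edu/cas" />

<!-- Hypothetical bean: the same kind of parameters supplied as nested elements -->
<bean id="exampleCasLogin" class="edu.example.idp.ExampleCasLogin">
    <!-- passed to the constructor of the class -->
    <constructor-arg value="https://secure.its.yale.edu/cas" />
    <!-- calls setRenew(false) on the new object -->
    <property name="renew" value="false" />
    <!-- plugs a reference to another bean into a property of this one -->
    <property name="ticketValidator" ref="exampleTicketValidator" />
</bean>
```

The ref= attribute is what links beans together: Spring looks up the bean with that id and passes the object itself, rather than a string, to the property.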
Through Spring, the Shibboleth application is assembled by selecting optional components, configuring them with parameters, and plugging them into the application framework. For example, Yale uses CAS for its SSO, and Shibboleth has a framework for configuring and adding beans that perform authentication. So we use the Unicon-supplied integration between CAS and Shibboleth, whose documentation identifies the Bean and the names and values of the parameters.
You know you have a Spring configuration file if the first element is <beans>. The file then contains mostly <bean> elements, although Spring has a few aliases for <bean> when you are dealing with standard classes. If you are creating a Java List, for example, then instead of a <bean> element that references the "java.util.List" class, you can use the defined nickname of <util:list>.
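A minimal sketch of a complete Spring file may make this concrete. The namespace declarations on <beans> are the standard Spring ones; the bean ids, the class name, and the list contents are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/util
                           http://www.springframework.org/schema/util/spring-util.xsd">

    <!-- Nickname element: creates a Java List without naming a class -->
    <util:list id="examplePartnerNames">
        <value>Salesforce</value>
        <value>Comcast</value>
        <value>Google</value>
    </util:list>

    <!-- Hypothetical bean that receives the list by reference -->
    <bean id="examplePartnerRegistry"
          class="edu.example.idp.ExamplePartnerRegistry"
          p:partners-ref="examplePartnerNames" />

</beans>
```

The -ref suffix on a p: attribute tells Spring that the value is the id of another bean rather than a literal string.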
Once Shibboleth defines the Interface or signature of a component, others can provide their own Java code that implements the Interface. So while Shibboleth provides a generic signature for a component that logs users on, Unicon provides a library with a class that implements that interface and logs users on through CAS. We downloaded and installed the Unicon library and added the Unicon configuration files, parameterized with Yale CAS information. For information on the Unicon code and configuration, see https://github.com/Unicon/shib-cas-authn3. Remember, Shibboleth was not written to expect that Unicon would add a new component. But because it is configured in Spring, as long as Shibboleth provides an Interface that new classes can implement, anyone can write code and plug it into the framework.
Local Disk File Configuration
Spring has the built-in capability to define a file (which it calls a Resource) based on a path to local disk or a URL to a network file. Even before it used Spring, Shibboleth had written its own custom code to read the non-Spring XML configuration files from local hard disk, from a network URL, or even by checking a file out of a Subversion Source Control system at startup. Shibboleth can then periodically "poll" the source to see if a new version of the same file has become available (based on the last changed date or, in Subversion, the most recent committed version number) and reload it if there was a change. If Shibboleth were a normal application, then dynamically obtaining configuration from a Source Control system would be very attractive.
While checking configuration files into Source Control seems like an excellent idea, if the Subversion server is not available when Shibboleth comes up then it takes an unreasonable amount of time for Shibboleth to start. In order to recover from major datacenter failures (as happened when the power was cut to half the servers on campus) Yale stores some of its recovery plans and checklists on cloud services. To log in to cloud providers, you need Shibboleth. So Shib is a "Tier 0" application that has to come up before any database, Web, or Source Control servers. That means that everything Shib needs to come up has to be on its "local hard disk" (put in quotes because datacenter disk can be on a SAN).
So we create a slightly more complicated system. Ultimately, every Shibboleth configuration file is managed by Source Control. Specifically, these files are checked into the yale-shibboleth?-installer project in Source Control (where ? may be replaced with a version reference). However, instead of Shibboleth linking to Source Control at startup or polling Source Control while it is running, a Jenkins Install job is run by operations to check out the current version of all the files from Source Control and then update the configuration of Shibboleth on a running (or temporarily stopped) Shibboleth server. Shibboleth itself is configured to use files on local disk, but Jenkins controls when these files are replaced.
Metadata
The files in the "conf" directory are defined by Shibboleth and are supplied by Yale. One file is metadata-providers.xml. It contains a list of elements that define files or URLs that supply the metadata information defining the applications that support Shibboleth login. While Yale has been forced to create a few metadata files, normally they are supplied by the application vendor.
Each metadata provider element in the metadata-providers file points to a file name in the "metadata" subdirectory of Shib. Optionally it can also point to a URL that Shib can check at configured intervals to look for updates. Yale only uses the URL update facility for the curated InCommon federation aggregated metadata, and we put that metadata source near the end of the list. Because Shibboleth scans all the metadata provider elements in the order they are defined, and it stops when it finds metadata for the entity name it is looking for, Yale configures all the individual metadata files we store on local disk to come first in the search and then configures InCommon. That way if we have a specific metadata file for a partner that is also defined in InCommon, the file we created and store on local disk will be found first and our parameters will be used to talk to that partner.
So Yale accomplishes effectively the same thing with a bit more work. All Shibboleth configuration files are checked into Source Control. However, Shibboleth does not know this and does not go to Source Control itself. Shibboleth is configured to use files on disk, and when appropriate to check periodically to see if the file change dates have been modified and reload the changed files. The files are deposited or updated on the Shibboleth local disk by a Jenkins Install job under the control of Operations. So Shibboleth does not see the files change just because a new version of a configuration file has been committed to Subversion. After the commit there has to be an explicit Jenkins run to move the file to the Shibboleth server, and while Jenkins jobs can be configured to run automatically after a commit, this particular job is started by a person when we make a positive decision to change the running Shibboleth.
Terms
Shibboleth is an Identity Provider (an IdP).
Applications and Cloud vendors are Service Providers (SPs) because they provide a service to users. They rely on Shibboleth to provide login information, so they are also called Relying Parties (RPs).
Each IdP and SP is an Entity and it has an EntityID that is a string. The string has to be globally unique, so it is typically a DNS name (google.com) or a URL, but some partners just use a string they made up.
Metadata is a big block of information about an Entity in an XML <EntityDescriptor> defined by SAML.
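For orientation, here is a stripped-down sketch of what Metadata for a Service Provider can look like. The element names, namespace, and binding URI are from the SAML 2.0 metadata standard; the EntityID and endpoint URL are hypothetical, and real Metadata normally also carries signing and encryption certificates.

```xml
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
                  entityID="https://sp.example.org/shibboleth">
    <!-- Declares that this Entity acts as a SAML 2.0 Service Provider -->
    <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
        <!-- Where the IdP should send the login response -->
        <AssertionConsumerService
            Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
            Location="https://sp.example.org/Shibboleth.sso/SAML2/POST"
            index="0" />
    </SPSSODescriptor>
</EntityDescriptor>
```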
Metadata
The conf/metadata-providers.xml file (Shib 2 format, not Spring) contains a list of <MetadataProvider> elements. Each defines a local disk file or URL that contains or returns Metadata. The source can define a single Entity or it can contain thousands of EntityDescriptors.
There is no requirement, but it is a Yale convention that each <MetadataProvider> element in our configuration points to the location of a file in the "metadata" subdirectory of the Shibboleth directory. Every one of these files is checked out of Source Control and is deposited on the Shibboleth local disk by the Jenkins Install job, except for InCommon.
The InCommon Federation provides a curated collection of thousands of Metadata elements. Shibboleth loads it from the URL supplied by InCommon when it starts up and then checks for updates every 8 hours. Shibboleth keeps the most recent copy of the data from InCommon in a file in the metadata subdirectory, but that one file is downloaded from a URL and managed by Shib itself and does not come from Jenkins or Source Control.
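A sketch of how such a metadata-providers.xml might be laid out follows. It is not Yale's actual file: the ids, file names, and the federation URL are placeholders, and the attribute names (metadataFile, metadataURL, backingFile, maxRefreshDelay) should be checked against the Shibboleth documentation for your release. Local files come first so that they are found before the federation aggregate.

```xml
<MetadataProvider id="ShibbolethMetadata" xsi:type="ChainingMetadataProvider"
                  xmlns="urn:mace:shibboleth:2.0:metadata"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <!-- One local file per partner, deposited on disk by the install job -->
    <MetadataProvider id="ExamplePartnerA"
                      xsi:type="FilesystemMetadataProvider"
                      metadataFile="%{idp.home}/metadata/example-partner-a.xml" />

    <MetadataProvider id="ExamplePartnerB"
                      xsi:type="FilesystemMetadataProvider"
                      metadataFile="%{idp.home}/metadata/example-partner-b.xml" />

    <!-- Federation aggregate: fetched from a URL, cached in a backing file,
         and listed last so the local files above take precedence -->
    <MetadataProvider id="ExampleFederation"
                      xsi:type="FileBackedHTTPMetadataProvider"
                      metadataURL="https://federation.example.org/metadata.xml"
                      backingFile="%{idp.home}/metadata/example-federation.xml"
                      maxRefreshDelay="PT8H" />

</MetadataProvider>
```

The PT8H value is an assumption chosen to match the 8 hour check described above. If one local file has a syntax error, only its provider fails to load and the others are unaffected.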
A metadata provider file can define metadata for one entity, or it can contain as many entities as you want. Yale could have combined all its local disk metadata into one file with one metadata provider element in the metadata-providers file. That seems simpler, but there is a problem. If an XML file has a syntax error, then the entire file is ignored. So if we combine all our metadata in one big file, then a single missing "/" makes the entire file unreadable. Even though it makes the configuration file more complicated, it seems safer to put the metadata for each partner in a separate file, so a mistake is localized to just the one partner with the problem.
...