SAML Messages

Shibboleth generates a SAML message (called a "SAML Response") that it sends to each of our Service Providers (such as box, google, service-now, archer, and so on). There is a tiny amount of important content in the Response buried in a large amount of meaningless XML junk. If the Service Provider accepts the message, then the user logs on to that service.

In order for the SAML Response to work, it has to pass a long list of checks:

  • The EntityID (an arbitrary but unique identifying string) has to be matched in one of the Metadata files.
  • The Metadata file has to provide the correct URL to which Shibboleth sends the SAML Response message. This is called the AssertionConsumerService or ACS URL.
  • The Digital Signature generated by Shibboleth has to be validated by the Public Key or Certificate configured for Yale in the Service Provider application, and the EntityID that Shibboleth uses to identify itself also has to match. In practice, this means that a Service Provider configured with PROD Shib has to get a message from a server configured with the private key credentials and EntityID of PROD Shib, while an SP that expects a message from TEST needs a message signed with the TEST key and sent with the TEST EntityID.
  • The message will contain the EntityID and ACS URL of the Service Provider. SAML messages have an implied "eyes only" rule, so these two strings have to exactly match the EntityID and ACS URL that the Service Provider expects. Since these two values came from a Metadata file the SP sent us, this check is rarely a problem.
  • The message does not contain any information about the network address or server name of the Shibboleth server. So it is not important that Shibboleth run on any particular machine.
  • The SAML Response must contain any attributes that the Service Provider has told us are required. Generally speaking, across any change to Shibboleth the value of each attribute should not change. If the user changes their name or email address, then Shibboleth will start sending the new attributes, but the attributes should not change if Shibboleth switches to a new database or we put in a new version of Shibboleth.
  • Each SAML message contains a Subject. In most cases the Subject is ignored, but in a few special and important cases (Google for EliApps, Service-Now) the Subject is the only field used, and it must have the same value each time.
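
The identity checks in the list above can be sketched as a small function. This is only an illustration under stated assumptions: the class and field names are hypothetical, the SP EntityID and ACS URL in the usage note are made up, and signature validation is represented by a boolean rather than actually performed.

```python
from dataclasses import dataclass


@dataclass
class ResponseSummary:
    """Hypothetical summary of the fields an SP examines in a SAML Response."""
    issuer: str            # EntityID Shibboleth uses to identify itself
    audience: str          # EntityID of the intended Service Provider
    destination: str       # ACS URL the Response was sent to
    signature_valid: bool  # outcome of checking the signature (not computed here)


def check_response(resp, idp_entity_id, sp_entity_id, acs_url):
    """Return the reasons an SP would reject the Response (empty list = accepted)."""
    problems = []
    if not resp.signature_valid:
        problems.append("signature does not validate with the configured certificate")
    if resp.issuer != idp_entity_id:
        problems.append("issuer is not the IdP EntityID this SP is configured for")
    if resp.audience != sp_entity_id:
        problems.append("audience is not this SP's EntityID")
    if resp.destination != acs_url:
        problems.append("destination is not this SP's ACS URL")
    return problems
```

For example, an SP configured for PROD ("https://auth.yale.edu/idp/shibboleth") that receives a Response issued under a different EntityID would get back a one-item list naming the issuer mismatch, which is why a TEST-signed message cannot be used against a PROD-configured SP.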

The most common Shibboleth testing occurs when we have a new Service Provider to support. They will provide us with Metadata and a list of attributes they need us to supply.

If the attributes are already defined for someone else, all we need to do is to add the Metadata and to add or modify an entry in the attribute-filter.xml to release the attributes to this EntityID. Generally we can set this up in Sandbox or DEV testing, use the IdP Initiated Login URL with this EntityID, and then capture the Response. Correct behavior can be determined by an inspection of the Response. Then the code can move to PREPROD and a real login to the application becomes possible.

Occasionally someone asks for something new, or for a new version of something we are already generating. The Law School asks, "send us the list of groups in the Law School OU that contain this user as a member." We may have previously generated a list of all group membership, but now we need to filter it for just Law School groups and that is new. So now you create the new attribute, release it to this partner, and again manually check the generated SAML message to see if the right values were generated.

Attributes have a formal unique name, a friendlyName (short nickname), and a single value or occasionally a list of values. A simple example (removing all the XML boilerplate) would be:

FirstName: John
LastName: Doe
Email: John.Doe@yale.edu
eduPersonPrincipalName: jd345@yale.edu

Different partners can ask for the same attribute with different names. So "John" can be a FirstName, First_Name, givenName, and so on. So you have to generate both the right name and the right value for each item.
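
When checking a generated message by hand, it can help to strip the XML down to the name and value pairs shown above. The following is a minimal sketch using only the Python standard library; the sample XML is made up for illustration, and the function simply prefers the FriendlyName when one is present.

```python
import xml.etree.ElementTree as ET

# Namespace of SAML 2.0 assertions (Attribute and AttributeValue elements).
SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def list_attributes(xml_text):
    """Map each attribute's FriendlyName (or Name) to its list of values."""
    root = ET.fromstring(xml_text)
    result = {}
    for attr in root.iterfind(".//saml:Attribute", SAML_NS):
        label = attr.get("FriendlyName") or attr.get("Name")
        values = [v.text or "" for v in attr.findall("saml:AttributeValue", SAML_NS)]
        result[label] = values
    return result


# Illustrative fragment only; a real Response has far more boilerplate around it.
SAMPLE = """<saml:AttributeStatement xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Attribute Name="urn:oid:2.5.4.42" FriendlyName="givenName">
    <saml:AttributeValue>John</saml:AttributeValue>
  </saml:Attribute>
  <saml:Attribute Name="Email">
    <saml:AttributeValue>John.Doe@yale.edu</saml:AttributeValue>
  </saml:Attribute>
</saml:AttributeStatement>"""

if __name__ == "__main__":
    for name, values in list_attributes(SAMPLE).items():
        print(name, values)
```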

When a new Service Provider appears, it is convenient to initially ask them to configure the EntityID and Public Key Certificate of our TEST Shibboleth. Then we can verify the attribute list, attribute names, and values and verify a login to the application. When it is all working, then we promote the changes to Production Shibboleth and ask the Service Provider to change EntityID and Certificate configuration to point to PROD. After a few days, we lose our contacts with the person at Yale coordinating the installation and the contact at the Service Provider who is making any changes.

Very few Service Providers bother to have a permanent configuration for our TEST Shibboleth. InCommon doesn't even have a concept of configuring TEST Identity Providers for anyone. So if after a few years we need to upgrade Shibboleth and test the new release, we have to do it with all of our Service Providers configured to use only auth.yale.edu and the production Shibboleth credentials. This explains the need for a PREPROD Shibboleth.

PREPROD is a Shibboleth IdP server on a VM that has the same /credentials subdirectory (and therefore the same private key) as PROD Shibboleth. If it generates a SAML Response, all of our Service Providers will accept it as genuine. It uses the production databases. It has been installed passing a property with the production EntityID. The only difference is that it is not actually at the IP address that the DNS server reports to be "auth.yale.edu" and if a message goes through the F5 to get to it, then it will use a separate (but hopefully equivalent) configuration so it is handled the same as production.

If we can arrange that on one computer in one Browser, all traffic addressed to production Shibboleth goes instead to PREPROD, and all messages back from PREPROD are sent on to the Service Provider, then we can verify that the new code running on PREPROD works before the same code is installed on PROD.

PREPROD is necessary for final testing where it is not good enough to simply "eyeball" the SAML XML, but you really need to go all the way to log in to the Service Provider and verify that you get the right account contents. This isn't necessary for ordinary day to day testing of the configuration of Service Providers, but it is necessary if someone makes a change to Shibboleth itself or to a database like IDR that provides a lot of Shibboleth data.

In production, the network setup is controlled by Operations. In testing, the developer can run Shibboleth on a desktop Sandbox, or can "redirect" URLs from one host to another, or can use SSH tunnels to connect to VMs in the machine room. It is fairly easy to route a request to a selected Shibboleth server, but it can be fairly difficult to actually get it working. Shibboleth doesn't simply passively accept requests. Certain things get checked for correctness. Shibboleth can reject requests because of the way they arrived over the network, the contents of the Host HTTP header, and the use of https to transport the request.

Developers do not configure the F5 front end, and they generally just accept the VM that Operations gives them. So these topics are not necessarily part of a developer's technical training. This document explains a few of the complex options for transporting HTTP requests and the peculiar nature of SAML validation that can be affected by the network configuration. It enumerates the problems you may run into if you try to use a configuration other than the ones documented in the other Testing documents on this site.

IdP Initiated Login

An "IdP Initiated" test occurs when you click a URL that points to our Shibboleth server and provides the EntityID of one of our Service Provider partners:

http://localhost:8080/idp/profile/SAML2/Unsolicited/SSO?providerId=nobody.yale.edu

The hostname part ("localhost:8080") could reference a Sandbox test Shibboleth running on Tomcat on your desktop, or it could be a test version of Shibboleth running on a VM in the machine room where you have created an SSH tunnel connecting port 8080 on your desktop to the real Server port on the real Server VM.

At the end, after "providerId=" you add the EntityID of a partner. By Yale testing convention, the examples here use the ID "nobody.yale.edu". This URL is a Shibboleth feature, not part of the SAML standard. The Shibboleth code that responds to a "/profile/SAML2/Unsolicited/SSO" URL path does not care about the hostname, port, or protocol. It makes no assumptions about how the client connected to the server or what the network looks like.
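
Since the only moving parts of this URL are the server prefix and the EntityID, a test URL can be assembled mechanically. A minimal sketch follows; the function name is hypothetical, and the only thing it adds beyond the URL shown above is percent-encoding, which matters when the EntityID is itself a URL.

```python
from urllib.parse import urlencode


def unsolicited_sso_url(base, provider_id):
    """Build an IdP Initiated login URL for the given server prefix and EntityID."""
    query = urlencode({"providerId": provider_id})  # percent-encodes ':' and '/'
    return f"{base.rstrip('/')}/idp/profile/SAML2/Unsolicited/SSO?{query}"


if __name__ == "__main__":
    print(unsolicited_sso_url("http://localhost:8080", "nobody.yale.edu"))
```

An EntityID such as "https://sp.example.org/shibboleth" (a made-up value) would appear in the query string as "https%3A%2F%2Fsp.example.org%2Fshibboleth".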

The only checks are:

  • The providerId ("nobody.yale.edu" in this example) must be the EntityID of a Service Provider configured in some Metadata file.
  • The Metadata file must have a configured default AssertionConsumerService URL.
  • The hostname in the ACS URL must exist on the network.

Internally Shibboleth builds a dummy SAML Request, then triggers normal SAML processing to generate a Response.

If Consent is configured (and it will be by default in most Sandbox environments) the generated Attributes will be displayed. Then the SAML Response will be sent to the ACS URL.

After generating the SAML Response XML, Shibboleth will try to send it to the AssertionConsumerService URL configured in the Metadata for Entity "nobody.yale.edu". This ACS URL is bogus (literally, it is "http://www.yale.edu/bogus") and the message goes nowhere. If you are running on DEV or the Sandbox, before the SAML is sent you will get the "Consent" page where Shibboleth lists all the attributes and values it is about to send, and many problems can be debugged from that page without ever actually sending out the SAML. Otherwise, using the SAML Trace plugin in Firefox you can trap the SAML sent to the bogus address and then inspect it for problems.

This will work and will attempt to deliver the Response no matter what operating system you are running on, no matter what hostname, port number, or protocol your Tomcat uses, no matter what EntityID your Shibboleth server is configured to use for itself. In short, there are no network issues for IdP Initiated Login.

Change the EntityID at the end of the URL and you can verify that the right attributes are sent to any particular Service Provider.

Of course, a real SP will probably discard the Response because it will generally not be digitally signed correctly, and the EntityID that issued it may be unrecognized by the SP, but that is for later testing.

SP Initiated

The hard part occurs when you have to go to the Service Provider first, and it has to send you back to Shibboleth. In this case the Service Provider will have generated a SAML Request message that Shibboleth receives. Shibboleth requires that the Request be valid. It performs essentially the same checks that we described above when the Service Provider was validating the SAML Response generated by Shibboleth. I will now copy that list of checks and note the differences.

In order for the SAML Request to work, it has to pass a shorter list of checks:

  • The EntityID in the Request has to match the EntityID configured to this Shibboleth server through the "idp.entityid" property on the Jenkins Install. Because SPs are almost always configured to talk only to PROD, that means the Shibboleth has to be installed with "idp.entityID=https://auth.yale.edu/idp/shibboleth".
  • The Service Provider will send the message to the URL it has configured for Yale. Unfortunately, this will almost always be a production URL ("https://auth.yale.edu/idp/...") and so the tester has to trap this on their desktop and reroute it to the test or preprod Shibboleth server (more on that below).
  • The Digital Signature generated by the Service Provider has to be validated by the Public Key or Certificate configured in the Shibboleth Metadata that defines that Service Provider. This will be done by the SP and requires nothing on our end.
  • In addition to the EntityID, a Request contains the SSO URL for Shibboleth configured to the SP in Yale's Metadata (typically "https://auth.yale.edu/idp/..."), and this URL has to match the URL that Shibboleth determines for itself.
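
When a Request from a Service Provider is trapped (for example with SAML Trace), it is useful to see which EntityID issued it and which Shibboleth URL it was addressed to. The sketch below assumes the SP used the SAML HTTP-Redirect binding, where the "SAMLRequest" query parameter is raw-DEFLATE compressed and then base64 encoded; an SP that uses the POST binding sends plain base64 in a form field instead, and this function would not apply. The function name is hypothetical.

```python
import base64
import zlib
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def decode_redirect_request(url):
    """Return (issuer, destination) from the SAMLRequest parameter of a redirect URL."""
    encoded = parse_qs(urlsplit(url).query)["SAMLRequest"][0]
    # wbits=-15 selects raw DEFLATE (no zlib header), as the Redirect binding requires.
    xml_bytes = zlib.decompress(base64.b64decode(encoded), -15)
    root = ET.fromstring(xml_bytes)
    issuer = root.findtext("saml:Issuer", default="", namespaces=NS).strip()
    return issuer, root.get("Destination", "")
```

The first value is the EntityID of the Service Provider that must appear in some Metadata file; the second is the URL that the test server will have to believe is its own.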

The Unicon CAS Integration

Unicon provides a Shibboleth Login ("Authn") module that uses CAS. To configure this integration, the properties file must supply the protocol and hostname to use in the Service string sent to CAS. CAS will Redirect the Browser back to this URL after it issues the Service Ticket. Unicon has a specific property name, but Yale used a different name long before we switched to using the Unicon code. We keep the old Yale property name and use it to set the new Unicon property.

So in the Install Project, you must set a property named "cas.target.url" to be the protocol and hostname of the Shibboleth server from the point of view of the Browser. When you are testing a Sandbox running on your desktop, then it makes perfect sense to set the property in install.properties and set the value to "localhost", or more specifically:

cas.target.url=http://localhost:8080

However, this is probably not a good value to use when Shibboleth is installed in DEV or TEST. In these cases the property is set in install-DEV or install-TEST and it will reference the "auth-dev.yale.edu" and "auth-test.yale.edu" machines. This means, however, that if you install Shibboleth into DEV or TEST normally with these parameters, then you are not going to be able to use these VMs through an SSH tunnel (where they appear to be localhost:8080) unless you use some mechanism (Redirectory, Charles Proxy, hosts table) to rename the "https://auth-dev.yale.edu/" in the CAS Redirect to actually go to the SSH Tunnel.

You know you have this problem when Shibboleth sends you to CAS, you login, and then get a

HTTP Status 500 - Error processing ShibCas authentication request

error with a root cause of:

ExternalAuthenticationException: No conversation state found in session for key (e1s1)

What has happened is that Shibboleth received your original request from what it regards as one browser session, and it sent that Browser to CAS, but now it is getting back a response from CAS that appears to be coming in from a different Browser (although it is really the same Browser connecting to Shibboleth through a different network path using a different host name).

This error can also occur when two Shibboleth VMs exist behind a load balancing front end, but the front end has not been configured to route subsequent requests for the same session (based on the "JSESSIONID" parameter) to the same VM. Then one VM sends the Browser to CAS, but the response gets routed to the other VM.
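
One way to picture the "different Browser" symptom is with session cookies. The toy model below assumes that the session is tracked by a JSESSIONID cookie and that the Browser keeps cookies per hostname; both are assumptions made for illustration, not a description of the Shibboleth or Unicon code. The class, function names, and cookie value are hypothetical.

```python
from urllib.parse import urlsplit


class BrowserCookies:
    """Toy cookie jar: cookies are remembered per hostname."""

    def __init__(self):
        self._by_host = {}

    def store(self, url, name, value):
        self._by_host.setdefault(urlsplit(url).hostname, {})[name] = value

    def cookies_for(self, url):
        return dict(self._by_host.get(urlsplit(url).hostname, {}))


def returns_to_same_session(start_url, cas_target_url):
    """True if the Browser would present the original session cookie after CAS."""
    jar = BrowserCookies()
    # Shibboleth sets the session cookie on the host the Browser first used.
    jar.store(start_url, "JSESSIONID", "abc123")
    # CAS redirects the Browser to cas.target.url; is the cookie sent there?
    return "JSESSIONID" in jar.cookies_for(cas_target_url)
```

Starting at "http://localhost:8080/idp/..." through an SSH tunnel while "cas.target.url" names "https://auth-dev.yale.edu" gives False: the Browser comes back under a hostname for which it holds no session, which matches the "No conversation state found" error.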

SP Initiated

In an SP Initiated login, the user goes to the application first and is then sent back to CAS. For example, if you go to "http://mail.bulldogs.yale.edu" to get your mail, then the Google Apps login through Shibboleth is SP Initiated. Other examples include yale.box.com, yale.service-now.com, and www.yale.edu/iptv.

The Service Provider generates a SAML Request message and sends it to Shibboleth. Unfortunately, the SP knows that Yale's Shibboleth server is located at "https://auth.yale.edu/idp/..." and not only does it tell the Browser to go to that network address, but it also includes this URL in the Request message. Shibboleth implements the requirement in the SAML standard that it must determine its own network URL and compare that string with the network URL sent in the SAML Request, and reject the SAML Request if the two don't match.

The problem here is that you cannot configure Shibboleth with a fixed URL string and tell it to use that string in the comparison. Instead, Shibboleth calculates its own URL using information in the HTTP headers, and information provided by Tomcat in the HttpServletRequest object.

Of course, this is not a problem for Production Shibboleth because that machine really is "https://auth.yale.edu". The problem then is to try and trick a test version of Shibboleth running on another network address into believing that it really is on a host named "auth.yale.edu" and the request is coming in over SSL/TLS (https), on port 443, even if that is not the case.

So the Browser, test computer on which the Browser is running, the network path between the Browser and the test VM, and the Tomcat environment on the VM all have to do two things:

...

One set of problems has to be solved to send the request to the test VM in the first place. No ordinary computer inside or outside Yale can communicate with the VM that runs the Shibboleth IdP. Instead, a network address like "https://auth.yale.edu/idp/..." is defined to a special network front end device. At Yale this is the "F5" that stands in front of all network services. The host name "auth.yale.edu" belongs to the F5, not to the VM, so if Shibboleth is to know that name, then either it has to have this name configured as one of its properties provided by its Install Project, or else it has to get the name from the control blocks that Tomcat provides with each incoming request.

If you don't get it right, then Shibboleth discards the Request, generates an error message in the log, and displays an error page. The log message will always say that it was comparing "https://auth.yale.edu/idp/..." to its own network address and the comparison did not match. Looking at the string it was trying to match, you will see what it generated:

  • "http://localhost:8080/" - You gave the Browser the local address and it passed this information on to Shibboleth.
  • "http://auth.yale.edu" - You generated the right hostname, but Tomcat has told Shibboleth this request came in on "http" instead of "https".
  • "https://auth.yale.edu:8080" - Tomcat is willing to say the request was "https", but the real port number got reported.

At one point I thought this problem was solvable. I now realize it is not worth the effort. Do all your basic testing using IdP Initiated URLs. Any final end-to-end testing, where you start with an SP URL and end with an actual login to your EliApps mailbox or your Box files, should be done using the Pre-Production machine behind the F5 by adding a line to the hosts file assigning the "auth.yale.edu" hostname to a test address.
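
The mismatches listed above all come down to one of three parts of the URL being wrong. The sketch below is not how Shibboleth or Tomcat implement the comparison; it is a hypothetical helper that builds a URL from the scheme, host, and port a server believes it has, and reports which parts differ from the URL in the Request.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def computed_url(scheme, host, port, path="/idp"):
    """Build the URL a server would report for itself; default ports are omitted."""
    if DEFAULT_PORTS.get(scheme) == port:
        return f"{scheme}://{host}{path}"
    return f"{scheme}://{host}:{port}{path}"


def mismatches(expected_url, scheme, host, port):
    """List which of scheme, host, and port differ from the expected URL."""
    expected = urlsplit(expected_url)
    expected_port = expected.port or DEFAULT_PORTS[expected.scheme]
    found = []
    if scheme != expected.scheme:
        found.append("scheme")
    if host != expected.hostname:
        found.append("host")
    if port != expected_port:
        found.append("port")
    return found
```

With "https://auth.yale.edu/idp" as the expected URL, a server that sees itself as https on auth.yale.edu port 8080 reports only "port", while an untouched Sandbox at http, localhost, 8080 reports all three.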

When the Browser generates any HTTP request, it creates a Host header. This header contains the hostname from the URL that the Browser is using to get to the application, followed by the port number when it is not the default for the protocol. The protocol itself (http or https) is not part of the header. For a request that thinks it is going to production Shibboleth, this is

Host: auth.yale.edu

This header can go through the network and the F5 unmodified, or it can be changed. There are other network standards for other headers that might be generated by the F5 or any other device that receives, modifies, and forwards the request. The people who configure these network devices know all these conventions, and production Shibboleth receives data through a similar path with a similar set of header modifications.

So if you trap a Service Provider generated SAML Request and decide to forward it to a test Shibboleth server, things will work correctly provided that two conditions hold. The Shibboleth server must be configured with the same EntityID as production. And when the SAML Request arrives at the test server, passes through Tomcat, and is presented to Shibboleth, the Host header (and any other headers commonly generated by machines like the F5) must convince Shibboleth that this request was originally addressed to "https://auth.yale.edu/idp".

Although there will be at any time one or more specific network configurations for accomplishing this result, there are lots of rules and tricks and software that can accomplish the same thing in various ways. The thing to remember is the symptom: you try to log in to a Service Provider with a test Shibboleth, and instead of getting a CAS login you get a message saying that the request did not meet security restrictions. The Shibboleth log then contains an error message saying that the message was addressed to "https://auth.yale.edu/idp" but Shibboleth has decided that its own network address is "http://auth.yale.edu/idp" (http not https), or "https://auth.yale.edu:8080/idp" (a port number got added), or "http://localhost:8080/idp" (you didn't fake anything out at all). In that case you have not set the HTTP headers and the network path right.

Background on Proxies and Tunnels

In modern networks, the Browser almost never talks directly to a production server. Production servers are hidden away in machine rooms behind firewalls, and access to them goes through a network front-end device. At Yale, that device is called the "F5".

So in the simplest case, a Browser going to CAS, or Shibboleth, or any other production service actually connects to the F5. The F5 then forwards the Browser request into the machine room, and it sends back to the Browser the response from the real server.

In HTTP, when a computer stands between the Browser and the Web Server and acts as a silent intermediary, it is called a Proxy. There are actually two configurations. In the old days, the original Web Proxy was a device on the Yale Campus that held local copies of frequently used pages from distant Internet servers. That way hundreds of requests from Yale users for the same front page of the New York Times could all be satisfied by a single copy saved from a recent request. That was useful back when the Internet was slow, but today the Internet is so fast and powerful that this type of proxy has become obsolete.

So mostly today we talk about a Proxy that sits in front of a server, rather than a Proxy that sits behind the Browser. This is often called a "Reverse Proxy" and it receives requests intended for a Web Server and, in many cases, it distributes the requests that come in among a number of identical server computers to spread the work around and quickly recover if one of the servers has a problem.

The Proxy is supposed to be invisible. To the Browser, the proxy appears to be "auth.yale.edu" or whatever hostname the Browser put into the URL. If that is the case, the actual VM in the machine room where a service like Shibboleth is running will never really be named "auth.yale.edu" but will instead be "vm-shibprd-01.web.yale.internal". Furthermore, that real machine name and the IP address of that machine will not be visible outside the machine room and you can only get to it through the F5.

So you can see that if Shibboleth needs to know its own URL from the point of view of the Browser, and it insists on figuring it out for itself through programming, there is a bit of a problem. The "auth.yale.edu" name is known to the F5, but until Shibboleth gets a request from the first user, the name "auth.yale.edu" is configured nowhere on the VM and there is nothing Shibboleth can do to find it.

In HTTP the Browser puts the host name (and the port, when it is not the default) in a Host: header that is part of the data sent to the server. So coming out of your Browser, there will be a line containing "Host: auth.yale.edu". The protocol is not in this header; the server learns it from the way the connection was made. If the Host line gets to Shibboleth unchanged, then Shibboleth can use it to guess the right answer to the "What is my network name" question. Of course, the F5 or any other proxy can change this header, but then there are other network standards that have developed over time to create other Headers that track such changes.
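
To make the header discussion concrete, here is a minimal sketch of how a server might guess the Browser's view of the URL. It assumes the front end adds the de facto "X-Forwarded-Proto", "X-Forwarded-Host", and "X-Forwarded-Port" headers; whether the F5 at Yale sends these particular headers is an assumption, and the function is hypothetical rather than anything taken from Tomcat or Shibboleth. It does not handle IPv6 literals.

```python
def browser_view(headers, connection_scheme):
    """Guess (scheme, host, port) as the Browser saw them.

    Prefers X-Forwarded-* headers (assumed to be set by a proxy), then Host.
    """
    h = {key.lower(): value for key, value in headers.items()}
    scheme = h.get("x-forwarded-proto", connection_scheme)
    host_value = h.get("x-forwarded-host", h.get("host", ""))
    host, _, port_text = host_value.partition(":")
    if "x-forwarded-port" in h:
        port = int(h["x-forwarded-port"])
    elif port_text:
        port = int(port_text)
    else:
        # No explicit port anywhere: use the default for the scheme.
        port = 443 if scheme == "https" else 80
    return scheme, host, port
```

A request arriving over plain http with "Host: auth.yale.edu" and "X-Forwarded-Proto: https" yields ("https", "auth.yale.edu", 443), while the same request without the forwarded header yields ("http", "auth.yale.edu", 80), the "http not https" failure.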

Tomcat was written to work correctly in the modern world of firewalls, and reverse proxies, and F5 devices. It can be configured to report to the application that the hostname is "auth.yale.edu", that the data came in over "https" on port number 443, despite the fact that the data actually came in on a "http://localhost:8080/idp" request.

Feel free to download the Shibboleth source code and read it. You will discover that there is some Shibboleth source that calls some OpenSAML source that calls back to some other Shibboleth source. Some information comes from Tomcat, some from Spring, and unfortunately when you are talking about how the CAS integration does the same thing then Shibboleth calls the Unicon code which in turn calls the CAS Client code.

Somehow under the covers this stuff digs through the HTTP headers and the Tomcat configuration and comes up with what it thinks is "The URL the Browser used or could use to communicate to this Shibboleth server". Most of the time it is right, or when it is wrong it is obvious why it is wrong. If you cannot fix it in 10 minutes, you will never fix it.

When it is wrong, you have to map out all the intermediate boxes and tunnels through which the request is passing to figure out what is wrong or missing. The problem is that some of this path may not be under the control of the developer or tester, and if you try and create a path that is completely under your control you now have to learn more about the configuration of Tomcat and HTTP proxy tools, which are normally a problem for someone else.

Some successful recipes will be provided in the Testing Setup document, but if you decide to deviate from them you need to understand the problem described above and figure out new solutions.

That is why the simplest solution is to put your code in Pre-Production and change the hosts table to point "auth.yale.edu" to the F5 address that routes to Pre-Production.
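
Before starting an end-to-end test it is worth confirming that the hosts table override is really in place. The sketch below parses hosts-file text and reports the address assigned to a name. The address 192.0.2.10 is a documentation-range placeholder, not the real F5 address for Pre-Production, and the function name is hypothetical.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its first assigned address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)  # first entry wins
    return mapping


# Placeholder content; substitute the real hosts file and address.
SAMPLE_HOSTS = """
127.0.0.1   localhost
192.0.2.10  auth.yale.edu   # hypothetical Pre-Production address on the F5
"""

if __name__ == "__main__":
    print(parse_hosts(SAMPLE_HOSTS).get("auth.yale.edu"))
```

Remember to remove the line when testing is finished, since every request to "auth.yale.edu" from that computer goes to Pre-Production while it is present.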