Background
The SAML Single SignOn protocols exchange data between the Application and Shibboleth through the Browser. We do not use the protocols where the Application talks to Shibboleth directly.
IdP Initiated Logon
About 95% of the applications that use Shibboleth support "IdP Initiated" logon, where the Browser starts with a URL that points to Shibboleth and Shibboleth sends a SAML Response unsolicited to the Application. The starting URL is of the form:
https://auth.yale.edu/idp/profile/SAML2/Unsolicited/SSO?providerId=nobody.yale.edu
When the Browser generates a GET to this URL, the handler for profile/SAML2/Unsolicited/SSO searches the Metadata files for an EntityId that matches the value in the providerId at the end of the URL. There has to be a matching Metadata to supply the URL of the AssertionConsumerService to which the login will be sent, and the hostname in the ACS URL has to exist. Shibboleth also searches the attribute-filter.xml for a policy that releases attributes to this EntityId. It then builds a SAML Response message with the released attributes, writes it to a Form on the Browser screen where JavaScript submits the form and sends everything on to the ACS URL.
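As a minimal sketch of how the starting URL is put together, the following Python builds the IdP Initiated link for any Shibboleth base address and EntityId. The function name is an illustration only and is not part of Shibboleth; the path and the providerId parameter are taken from the URL shown above.

```python
from urllib.parse import urlencode


def unsolicited_sso_url(idp_base: str, provider_id: str) -> str:
    """Build an IdP Initiated logon URL (helper name is hypothetical)."""
    # urlencode escapes the EntityId, which matters when it is itself a URL.
    query = urlencode({"providerId": provider_id})
    return f"{idp_base.rstrip('/')}/idp/profile/SAML2/Unsolicited/SSO?{query}"


if __name__ == "__main__":
    # The same application against production and against a local Sandbox.
    for base in ("https://auth.yale.edu", "http://localhost:8080"):
        print(unsolicited_sso_url(base, "nobody.yale.edu"))
```

Changing only the base address is what lets the same bookmark pattern be pointed at auth, auth-test, auth-dev, or the Sandbox.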
The first thing the Application sees is a Form POST to the ACS URL containing the SAML Response as data. This Response will have been generated and digitally signed by the Shibboleth you sent the IdP Initiated URL to in the first place. It could be auth, auth-test, auth-dev, or localhost:8080 in the Sandbox. If the Application has not been configured to trust the Shibboleth that generated the message, it will generate an error message and fail to log you on. However, if you have been tracing the HTTP messages in the Browser, you will have captured the SAML message sent to the application. You can display it and verify that the Attributes and Subject are correct for that particular Application, and that is 99% of the testing you typically need to do.
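If you save the decoded XML of a captured Response, a short script can pull out the Subject and Attributes for review. This is a sketch under assumptions: it uses the standard SAML 2.0 namespaces, it expects an unencrypted Assertion, and the sample message, ACS URL, NameID, and mail value below are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 namespace URIs.
NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}


def summarize_response(xml_text: str) -> dict:
    """Return the Issuer, Destination, NameID and Attributes of a Response."""
    root = ET.fromstring(xml_text)
    name_id = root.find(".//saml:Subject/saml:NameID", NS)
    attributes = {}
    for attr in root.iterfind(".//saml:AttributeStatement/saml:Attribute", NS):
        values = [v.text or "" for v in attr.findall("saml:AttributeValue", NS)]
        attributes[attr.get("FriendlyName") or attr.get("Name")] = values
    return {
        "issuer": root.findtext("saml:Issuer", default=None, namespaces=NS),
        "destination": root.get("Destination"),
        "name_id": name_id.text if name_id is not None else None,
        "attributes": attributes,
    }


# Invented sample message; every value here is a placeholder.
SAMPLE = """<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    Destination="https://nobody.yale.edu/acs">
  <saml:Issuer>https://auth-test.yale.edu/idp/shibboleth</saml:Issuer>
  <saml:Assertion>
    <saml:Subject><saml:NameID>abc123</saml:NameID></saml:Subject>
    <saml:AttributeStatement>
      <saml:Attribute Name="urn:oid:0.9.2342.19200300.100.1.3" FriendlyName="mail">
        <saml:AttributeValue>someone@yale.edu</saml:AttributeValue>
      </saml:Attribute>
    </saml:AttributeStatement>
  </saml:Assertion>
</samlp:Response>"""

if __name__ == "__main__":
    print(summarize_response(SAMPLE))
```

The Issuer field also tells you which Shibboleth generated the message, which is the first thing to check when the Application rejects it.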
SP Initiated Login
The user can start by going to the application. A Browser that starts with "yale.service-now.com", "yale.box.com", "mail.bulldogs.yale.edu", or the Comcast IPTV URL will in some cases be sent immediately to Shibboleth, and in other cases the user must push a button or select Yale University from a dropdown list before being Redirected to Shibboleth.
Some applications that support SP Initiated also support IdP Initiated. You can test all the attributes and subject and the basic login using IdP Initiated, but when something changes that might affect it, you also have to be sure that when the Application sends a Request to Shibboleth then that Request is accepted.
There are two ways an application can send its Request to Shibboleth:
- (SAML Browser POST Protocol) writes a Form to the screen containing a pre-loaded text box containing the SAML Request, then uses JavaScript to Submit the data to https://auth.yale.edu/idp/profile/SAML2/POST/SSO
- (SAML Browser Redirect Protocol) redirects the Browser to "https://auth.yale.edu/idp/profile/SAML2/Redirect/SSO?SAMLRequest=" followed by the character encoded SAML Request
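The "character encoded" value in the Redirect case is, per the SAML HTTP-Redirect binding, the XML compressed with raw DEFLATE, then Base64 encoded, then URL encoded. The sketch below reverses those steps for a captured URL and includes the matching encoder so the round trip can be checked; the function names are illustrative.

```python
import base64
import zlib
from urllib.parse import parse_qs, quote, urlsplit


def encode_redirect(xml_text: str) -> str:
    """Encode XML as a SAMLRequest query value (DEFLATE, Base64, URL encode)."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE
    deflated = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    return quote(base64.b64encode(deflated).decode("ascii"), safe="")


def decode_redirect(url: str) -> str:
    """Recover the XML from a Redirect URL carrying a SAMLRequest parameter."""
    query = parse_qs(urlsplit(url).query)  # parse_qs undoes the URL encoding
    deflated = base64.b64decode(query["SAMLRequest"][0])
    return zlib.decompress(deflated, -15).decode("utf-8")
```

A Request sent with the POST protocol skips the compression step, so only the Base64 decoding applies there.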
The Browser (and therefore your computer) is always in the middle
So the three options are:
- The user clicks on a link sending the Browser to the Shibboleth URL.
- The Application sends a Redirect to the Browser, and the Browser does a GET to the Shibboleth URL.
- The Application writes a form and JavaScript Submits it to the Shibboleth URL.
Shibboleth always uses the Browser POST Protocol to send the Response back to the Application (it writes a Form to the screen containing a pre-loaded text box containing the SAML Response, then uses JavaScript to Submit the data to the AssertionConsumerService URL configured for that EntityId in the Metadata).
So both Shibboleth and the Application talk only to the Browser and the Browser forwards data between them. Therefore, the network address of the Application and the Shibboleth server are always determined by the Browser and the environment of the desktop system on which the Browser is located. There are many mechanisms you can use privately on your computer (or on a VM running on your computer if you don't want to mess up your real OS):
- In 90% of the cases you can use an IdP Initiated Response, where all you need to do is to change the URL of the link (or bookmark in the Browser) to point to http://localhost:8080/idp or any other URL.
- You can temporarily change the network address of "auth.yale.edu" by putting an entry in the "hosts" table on your computer (or you could point to a dummy DNS server, but that is much harder). This captures the traffic, but it does not change the protocol or port number so generally you have to use a test Shibboleth that is running SSL on port 443.
- You can install a Browser plugin ("Redirector") that matches URL patterns and then rewrites them. It can match "https://auth.yale.edu/idp/*" (note the wildcard at the end) and replace it with "http://localhost:8080/idp/$1" (where $1 is a variable that inserts the rest of the original URL after the matched string).
- You can reroute the URL outside the Browser using a Proxy. Charles Web Debugging Proxy is easy to use and setup, while nginx is widely used and very powerful, but requires more reading because there is so much it can do.
The four options have been listed in increasing order of complexity, but they do not include the minute by minute difficulty during testing of turning the test environment on and off. Because changing the hosts table is usually inconvenient and provides limited function, we typically do not use that approach but include it here for completeness.
Generally you use the simplest option that can handle your test case, which means that most of the time you click on a link to an IdP Initiated operation, and you only use the other options when you have to go to the application first and get a real Request.
Preparation
Get Firefox. You can certainly use other Browsers in production, but Firefox is needed to test.
Firefox Add-Ons
SAML messages are typically Base64 encoded and appear as the contents of a Text Box in a form or else as part of the query string. You can line up a set of tools to trace, cut, paste, decode, and format the XML, or you can install the Firefox SAML Tracer add-on, which does all of this work for you.
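For the occasional case where you have only the raw form value (for example, copied out of a log), the manual decode and format steps for a POST message can be sketched in a few lines of Python. This assumes the value is plain Base64 of the XML, as in the Browser POST Protocol; the function name is illustrative.

```python
import base64
from xml.dom import minidom


def pretty_saml(form_value: str) -> str:
    """Decode a Base64 SAMLRequest/SAMLResponse form value and indent the XML."""
    xml_bytes = base64.b64decode(form_value)
    return minidom.parseString(xml_bytes).toprettyxml(indent="  ")
```

This only reformats the message for reading. It does not validate the signature, so it is not a substitute for the checks the Application performs.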
A lot of debugging can be done by using Shibboleth on your desktop (Sandbox) and setting a URL starting with "http://localhost:8080/idp/..." in the Browser address bar. However, final testing may require access to the actual application using a normal login sequence. This may require the use of the PREPROD Shibboleth VM, because it has credentials (the signing key) identical to production Shibboleth and can produce a Response that the application will accept. PREPROD may have a public URL through the F5, or it may have an internal URL that can be accessed from a desktop, or it may require that you SSH login to the host and tunnel port 8080 to your desktop. However that works, you need the Redirector Firefox plugin, which watches for a particular URL pattern "https://auth.yale.edu/idp/*" and then substitutes a replacement for the original URL "http://localhost:8080/idp/$1", where $1 is replaced by the rest of the original URL (after the matched prefix).
If you click the Firefox Menu icon (three horizontal lines in the upper right corner of the toolbar) then Add-On is an option (which looks like a puzzle piece). Click it.
Go to the Add-Ons, Search for a new Add-On with the word "SAML". Install the SAML Tracer. Look for "Redirector" and install the Redirector Add On (it has a logo of a capital R ending in an arrow).
Now SAML Tracer appears in the same menu, and Redirector installs an icon on the toolbar itself.
SAML Tracer requires no configuration. When you turn it on it traces Web activity in a new Window and will highlight, decode, and display the XML in a SAML Request or Response on demand. You turn it off by closing the Trace Window.
VM Access
If you run Shibboleth under Tomcat on your Sandbox desktop, then it is http://localhost:8080/idp. No setup is required.
If you run Shibboleth on the DEV/TEST/PREPROD VMs in the machine room and they have a public URL through the F5, then you can use that URL.
If they are new VMs not defined to the F5, then the firewall will not allow access from your desktop directly to the VM. First, you need to use the standard Cisco AnyConnect client to establish a VPN session to an area of the network from which SSH traffic is permitted to the VMs. You need to be put in an AD Group to use this type of VPN, and you have to download the Profile files for these special VPN targets. Get help from another developer if this is not set up on your machine or for your Netid. Access to the VPN requires MultiFactor Authentication even though your desktop is on campus.
When the VPN is enabled, you may have access to http port 80 and https port 443 using the native VM hostname (vm-something-01.its.yale.internal). That may be good enough for testing.
In other cases, you may need to use an SSH tunnel for ports 8080 or 8443, but it is almost impossible to test or debug anything without SSH access to VMs and log files anyway. SSH requires that Operations create a login for your Netid on the VM and install your SSH public key. In addition to the terminal session on the VM, the SSH client can be configured to "tunnel" one or more port numbers from your desktop computer to the VM. In the general case, tunnel ports 8080, 8443, and 443 to the same port numbers on the VM.
Only one program can use port 8080 on your computer at a time. When you test Shibboleth on your local Sandbox, it also uses 8080. Using the same local port number for both the Sandbox and the SSH tunnel will generate an error message if you accidentally run both at the same time. SSH will generate error messages that it cannot create the tunnel if you have forgotten to shut down the Sandbox Tomcat, and Tomcat will generate error messages that it cannot bind to the port if you forget to shut down SSH before starting the Sandbox. This is a feature, because you really don't want to spend hours trying to figure out what is wrong only to discover that you are debugging the wrong Shibboleth server.
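If you want to check ahead of time whether something is already listening on the port, a small probe like the following can help. It is a sketch that simply tries to open a TCP connection to the loopback address; it tells you that some listener exists, not whether it is the Sandbox Tomcat or an SSH tunnel.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for candidate in (8080, 8443):
        state = "in use" if port_in_use(candidate) else "free"
        print(f"port {candidate}: {state}")
```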
Browser Point of View
We have already explained that all communication between Shibboleth and the application goes through the Browser. That means that the Browser has to find the Shibboleth server, and it has to find the Application. What is not clear is that some parts of Shibboleth have to be configured from the "Browser" point of view.
Suppose Shibboleth is running on a VM named vm-something-01.its.yale.internal. That name, and the local IP address, are restricted to the machine room. The F5 knows how to get to that hostname, but ordinary desktop computers cannot reach it.
The F5 creates a public virtual hostname with an IP address everyone can use. It is the F5 that has the name "auth.yale.edu" or "auth-test.yale.edu". So in normal processing, all traffic to Shibboleth goes through the F5.
The missing piece here is that nothing configured in Linux or Tomcat on the "vm-something-01" host computer will tell it what name the F5 happens to be presenting to the world. There is no way for the machine to know that everyone else in the world thinks it is "auth-test.yale.edu" unless you configure that name using the install-TEST.properties file in the Jenkins Install. If you look at that file, you might be misled because the first couple of lines read:
idp.entityID= https://auth-test.yale.edu/idp/shibboleth
idp.metadata.file=auth-test.yale.edu-metadata.xml
Yes, these lines include the "auth-test.yale.edu" name, but neither of them is configured as anything Shibboleth regards as a URL or network address. The "entityID" is just a string that happens by Yale convention to also be a URL, but it could have been anything. The second is the name of a file on disk that happens to start with the public hostname. There is only one property that is actually a network address of the server VM:
cas.target.url=https://auth-test.yale.edu
This is the Yale property that is used to set a parameter of the Unicon CAS-Shibboleth integration that in turn sets the Service string sent to CAS. CAS will Redirect the Browser to the URL in this string, so it must be a URL that the Browser can use to get back to the same Shibboleth VM ending up at the same Tomcat port that received the request that started the CAS login.
It may seem strange that all the Applications in the world have been given a Metadata file that contains public URLs for Shibboleth, but Shibboleth itself has no way to find its own public URL in any configuration parameter. The reason why it can get along without that information is that Shibboleth never does anything except in response to a network request, and when any request comes in over the network it contains an HTTP Host header. A Host header may be originally generated by the Browser, and it may be modified by the F5 on its way to the VM. The Host header contains the server DNS name and, when it is not the default for the protocol, the port (from the point of view of the Client who sent it). The protocol itself is not part of the header value. So for a request to "https://auth.yale.edu/idp" the Browser generates a header of the form
Host: auth.yale.edu
This can flow through the F5 and arrive at the VM and be passed on to Tomcat and to Shibboleth. It tells Shibboleth a subset of the URL that the Browser started with before the F5 changed the protocol, and the VM name, and the port number, to send the request through possibly more intermediaries before it gets to Shibboleth. Sometimes Shibboleth needs to Redirect the Browser to another URL on its own server. It can use the Host header to send back the right network address, so the Browser will come back to Shibboleth again through the same path.
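To make the mechanics concrete, here is a minimal sketch of how a server could rebuild a URL pointing back to itself from a request. In HTTP the Host header value carries the host name and an optional port; the scheme is known to the server separately (from the connection or from proxy configuration). This is an illustration only, not the code Tomcat or Shibboleth actually runs, and the function names are hypothetical.

```python
def split_host_header(value: str):
    """Split a Host header value into (host, port); port is None if absent."""
    host, sep, port = value.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return value, None


def self_url(scheme: str, host_header: str, path: str) -> str:
    """Rebuild the URL a Browser would use to come back to this server."""
    return f"{scheme}://{host_header}{path}"
```

With these, a request that arrived with "Host: localhost:8080" over http yields redirects back to "http://localhost:8080/...", while one that arrived with "Host: auth.yale.edu" over https yields "https://auth.yale.edu/...".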
You need to be aware of the "cas.target.url" property and the way that the Host header is used, because during testing the Browser may start with a request to "https://auth.yale.edu" but you may trap it and send it to a special IP address or even a different host name. Neither of these is necessarily a problem, but if as a result you change the Host header then certain specific Shibboleth functions may break.
Typically if you get to the right Shibboleth test server in the first place, then all the Redirects will go back to the right machine. A configuration/test problem typically shows up if you try to change ports or protocol (http or https). So if the original URL was "https://auth.yale.edu/idp" and you try to test against "http://localhost:8080/idp" then problems with configuration or Host header mapping will show up when your Browser generates an error page and the address bar shows "https://auth.yale.edu:8080/idp" or "http://localhost:8080:8080/idp".
Configure Redirector
Redirector runs inside your Browser. Every time the Browser is about to go to a network URL, Redirector inspects it. If the destination URL matches a pattern, Redirector replaces the string with a different string. This is essentially the same as the Find and Replace function of every Text editor. You can match text with either a Wildcard or Regular Expression. Wildcard is simpler and is perfectly adequate for Shibboleth testing.
The basic remapping is from "https://auth.yale.edu/idp/*" to "http://localhost:8080/idp/$1". Note that the Match string ends in the "*" wildcard character, so it matches all URLs that begin with the string. The Replace string ends in "$1" which is a variable that represents the data that matched the "*" wildcard. In English, this says, "Match all URLs that begin with https://auth.yale.edu/idp/ and replace those characters with http://localhost:8080/idp/, but leave the end of the URL alone."
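The wildcard rule can be modeled in a few lines of Python, which is a convenient way to check what a pattern will do to a given URL before you type it into the add-on. This is a sketch of the matching behavior as described above, not Redirector's own implementation, and the function name is illustrative.

```python
import re


def wildcard_redirect(url: str, include: str, redirect_to: str) -> str:
    """Apply one wildcard rule; return the URL unchanged if it does not match."""
    # Each "*" becomes a capture group; everything else is matched literally.
    literal_parts = [re.escape(part) for part in include.split("*")]
    pattern = re.compile("^" + "(.*)".join(literal_parts) + "$")
    match = pattern.match(url)
    if match is None:
        return url
    # Replace $1, $2, ... with the text captured by the corresponding "*".
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))), redirect_to)


if __name__ == "__main__":
    print(wildcard_redirect(
        "https://auth.yale.edu/idp/profile/SAML2/Unsolicited/SSO?providerId=nobody.yale.edu",
        "https://auth.yale.edu/idp/*",
        "http://localhost:8080/idp/$1",
    ))
```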
There is a problem because Redirector replaces the hostname before the Host header is generated. Therefore, Shibboleth may have properties that tell it to generate a hostname of "https://auth.yale.edu", but it gets an HTTP header of "Host: localhost:8080".
Initially, this will cause trouble when the Unicon CAS-Shibboleth integration generates a Service string to send to CAS. After login, CAS redirects the Browser back to the URL in the Service string, which you will discover has been set to "service=https://auth.yale.edu:8080/idp". You can get around this in one of two ways.
If you are testing on your desktop Sandbox, nobody else cares how your private Shibboleth instance is configured, and you have to manually edit an install.properties file that overrides all other property configurations. So in this case you put in your install.properties file:
cas.target.url=http://localhost
Now the property value in the properties file and the rewritten Host header are the same, and the Service string becomes "service=http://localhost:8080/idp" and it works. However, this is generally not a viable choice for a VM in the machine room which you probably do not want to configure with a "localhost:8080" property.
So you need to create a second Redirector remapping entry to map "https://auth.yale.edu:8080/idp/*" to "http://localhost:8080/idp/$1".
This second remapping entry rewrites the URL that the Unicon integration generated and sent to CAS to the working network address of the test server you are using.
Unfortunately, this will not solve a mismatch between the "destination" string generated by an application (when you go to the application first and it generates a Request) and the URL Shibboleth generates to validate the destination. The application will generate the official production URL for production Shibboleth ("https://auth.yale.edu/idp"), but the Spring Beans configured by default in the Shibboleth /system directory generate the protocol that they use to compare against this string from the actual protocol (http or https) through which the Request arrived from the Browser. No matter how many tricks you use to rewrite or modify stuff, if the Request arrives as http over 8080, then Shibboleth will generate "http://..." and that will fail to match the "https:" in the Request XML.
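The effect of that comparison can be illustrated with a small model. It reads the Destination attribute from an AuthnRequest and compares it with a URL rebuilt from the protocol, host, and port on which the Request arrived. This is a simplified sketch of the idea only; Shibboleth's real check lives in its own Java code and may differ in detail, and the function names and sample Request are invented.

```python
import xml.etree.ElementTree as ET

DEFAULT_PORTS = {"http": 80, "https": 443}


def expected_endpoint(scheme: str, host: str, port: int, path: str) -> str:
    """URL the server believes it was reached at (default ports are omitted)."""
    authority = host if port == DEFAULT_PORTS[scheme] else f"{host}:{port}"
    return f"{scheme}://{authority}{path}"


def destination_matches(request_xml: str, scheme: str, host: str,
                        port: int, path: str) -> bool:
    """Compare the Request's Destination with the rebuilt endpoint URL."""
    destination = ET.fromstring(request_xml).get("Destination")
    return destination == expected_endpoint(scheme, host, port, path)


# Invented sample Request carrying the production Destination.
SSO_PATH = "/idp/profile/SAML2/Redirect/SSO"
REQUEST = (
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    f'Destination="https://auth.yale.edu{SSO_PATH}"/>'
)

if __name__ == "__main__":
    print(destination_matches(REQUEST, "https", "auth.yale.edu", 443, SSO_PATH))
    print(destination_matches(REQUEST, "http", "localhost", 8080, SSO_PATH))
```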
So if you are doing a final end-to-end login test of one of the applications that require you to go to the application first and get a Request object, then Shibboleth has to be running under a Tomcat configured to use SSL, and the Host header generated should have no port number.
Charles Web Debugging Proxy
A Web Proxy is a program that sits between the Browser and the Web Server. In the old days with a slower Internet, the Yale Proxy server cached frequently used Web pages from other locations to speed up browsing for Yale users. Today the F5 acts as what is called a "reverse proxy", where it appears to the network to be all the important Yale Web servers (including "auth.yale.edu") and then it forwards the request to other computers or VMs in the machine room that do the real work.
You can configure the Apache Web Server to be a proxy, and there is a very useful tool called nginx that specializes in acting as a proxy. However, these are larger solutions used by system administrators in production, and you might have to read a book to learn how to use them. A simpler solution is the Charles Web Debugging Proxy, which can run on your desktop and modify the URL of Browser requests that pass through it.
The primary function of Charles is to intercept Browser traffic and display a log of data passing between the browser and the servers. This would be extremely valuable if we did not already have the SAML Tracer built into the Browser providing a more easily read summary of the important (SAML) data.
Without tracing, Charles is simply an external alternative to the URL rewriting function of Redirector. There are some advantages to an external solution. Without Redirector, the Browser generates exactly the same data and headers that it would use to talk to the real Shibboleth. Because Redirector rewrites the URLs before they are logged and before the Host header is generated, you have to take its actions into consideration when you are debugging.
Charles only intercepts data while you run it. Close Charles and everything behaves normally. Charles only intercepts the traffic you configure, and it runs locally on your desktop and only intercepts traffic from your Browser. Generally you only run it during testing, and when you are running it you only generate test related traffic.
Charles has a lot more functions than we will use, but unlike Apache or nginx, it has a simple GUI configuration that is easy to learn. There is a sequence of steps you need to perform, and since it is easy to forget something, I will explain each step and what it does.
Download and Install Charles. Install the Charles Plug-In for Firefox.
Start Charles. Start Firefox. When Charles is running, the Charles Plug-in for Firefox configures Charles as the Firefox Web proxy. All traffic from Firefox goes to Charles, and Charles forwards it to the network. Stop Charles and the plug-in removes the proxy configuration and now Firefox runs normally.
Now suppose Charles is running and you tell the Browser to go to production Shibboleth ("https://auth.yale.edu/idp/...").
Because the URL starts with "https", Firefox is going to establish an SSL/TLS session and encrypt all the data it sends. It expects to be talking to a host named "auth.yale.edu" and part of the SSL session setup requires that host to prove its identity with an X.509 Certificate that says it is "auth.yale.edu". If Charles has no specific configuration, it will pass data between Firefox and the real auth.yale.edu server. The data will be encrypted and Charles will be unable to read it, but Firefox will run normally as if Charles was not present.
In order to decrypt the data, Charles has to provide a certificate and claim to be the "auth.yale.edu" server. This is exactly what the F5 does in the machine room, but in this case it will be private between your Firefox browser and your Charles Proxy, both running on your desktop. For security, they will use a private Certificate that you create and only your Firefox trusts.
When Charles is installed, it generates a local self-signed Certificate for itself and uses it to create a mini Certificate Authority (CA). In the Charles menu, you select "Proxy" and then "SSL Proxying" from the pulldown list. Click the "Enable SSL Proxying" box and then add an SSL hostname of "auth.yale.edu" to the list box. Charles internally generates a Certificate for "auth.yale.edu" created by its internal Certificate Authority.
In addition to creating the certificate, the SSL Proxy configuration just told Charles to intercept any HTTP traffic issued by your Firefox browser for a URL that begins with "https://auth.yale.edu/..." and to send back to Firefox the dummy Certificate issued by the internal Charles CA for hostname "auth.yale.edu". The Charles CA will not be in the list of real commercial Certificate Authorities that Firefox, as distributed by Mozilla, trusts automatically. So when Firefox gets the dummy Certificate from Charles, it displays a Warning page saying that the Web server certificate is not from a recognized Authority. You can click on the message page and tell Firefox to configure an Exception and trust this Certificate. It is convenient to tell Firefox to trust it from now on, and then you only get the Warning page the first time.
Now there is an SSL session inside your desktop computer between Firefox and Charles (acting as its Web proxy). Firefox encrypts data and Charles decrypts it. If this was all you configured, Charles would establish a second SSL connection between it and the real "auth.yale.edu" endpoint (the F5) and simply forward messages between Firefox and the F5, although now that it can decrypt the data it can log readable information flowing in both directions. The SAML generated by Shibboleth contains no sensitive information and can flow over an http unencrypted session, so this part is nothing special.
However, we want to do something different. We want to take the data that was originally going to production Shibboleth and reroute it so it goes to the PREPROD VM with new code or new configuration. This is a second step where we now tell Charles to send the data for "auth.yale.edu" to a substitute URL address.
As with Redirector, if the VM in the machine room has a public URL provided through the F5 then you can simply use that address. If not, then establish an SSH tunnel and use "localhost:8080".
The Charles version of the Redirector function is configured by selecting Tools from the Charles menu, then "Map Remote" from the pulldown list. Click the "Enable Map Remote" box and then Add a mapping.
The Map From part of the mapping provides data that must match a URL sent from your Firefox browser to Charles. In this case the Protocol is "https" and the Host is "auth.yale.edu". Generally you leave the other fields (port, path, query) blank and they default to matching anything.
The Map To part of the mapping specifies the changes you want to make to the incoming URL. In this case, you want to change the Protocol to "http", the host to "localhost", and the port to "8080". Leaving the other fields blank means that the path (/idp/...) will be copied from the incoming URL to the outgoing connection. It is assumed that you can figure out what target URL you want for the Map To address in other situations. If the F5 has a public URL for the VM you are trying to access, then just configure the Map To with the virtual host name on the F5 for the VM you want to access and Charles will send the data to the F5 instead of localhost:8080.
There is one last step. Click the "Preserve Host Header" box. When Firefox generated its request, it sent a Host header with the "auth.yale.edu" value. Without Redirector, Firefox does not know about the URL mapping, so the Host header is the same as it would send to real production Shibboleth. This turns out to be exactly what we want to get the Unicon CAS-Shibboleth integration to generate the correct Service string without any fudging.
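The combined effect of Map From, Map To, and Preserve Host Header can be modeled as follows. This is only a sketch of the behavior described above, written to show which parts of the request change and which do not; it is not how Charles is implemented, and the dictionary keys and function name are invented for the illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def map_remote(url: str, map_from: dict, map_to: dict, preserve_host: bool = True):
    """Return (target URL, Host header value) after applying one mapping."""
    parts = urlsplit(url)
    # Only Protocol and Host are checked here; blank fields match anything.
    if parts.scheme != map_from["protocol"] or parts.hostname != map_from["host"]:
        return url, parts.netloc  # no match: the request passes through unchanged
    port = map_to.get("port")
    netloc = map_to["host"] + (f":{port}" if port else "")
    # Path and query are left blank in Map To, so they are copied from the request.
    target = urlunsplit((map_to["protocol"], netloc, parts.path,
                         parts.query, parts.fragment))
    host_header = parts.netloc if preserve_host else netloc
    return target, host_header


if __name__ == "__main__":
    print(map_remote(
        "https://auth.yale.edu/idp/profile/SAML2/POST/SSO",
        {"protocol": "https", "host": "auth.yale.edu"},
        {"protocol": "http", "host": "localhost", "port": 8080},
    ))
```

With preserve_host set to False the model returns "localhost:8080" as the Host value, which is the situation Redirector creates.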
Charles is a larger tool and it has a license fee. Redirector is simpler and is free. Because Redirector operates inside the Browser there are changes in the URL and the Host header that are visible to the Browser, to SAML Tracer, and to the Shibboleth server (at least the Unicon CAS-Shib integration). Because Charles operates outside of the Browser and performs the same function that in production is performed by the F5, when we use Charles then everything is exactly the same as it will be in production. However, the differences created by Redirector are generally not important and do not interfere with any normal Shib testing.
The techniques used by Charles are similar to exploits used by some malware. The difference is that Charles only functions when you explicitly run it and it only decodes traffic for hosts you configure it to proxy. If you accidentally leave it running and do some banking, then since bankofamerica.com is not in any of its configuration lists the SSL encrypted data remains secure and no sensitive information is exposed, even to other windows on your desktop. If you use it to debug CAS, then close it when you are done and don't save files that contain your Netid password.