...

Incrementals are trivial (0.1 to 0.2 seconds).

Configuration

In JASIG CAS, the administrator selects one of several optional TicketRegistry implementations and configures it using the Spring bean XML file WEB-INF/spring-configuration/ticketRegistry.xml. With CushyTicketRegistry this file creates the first "Primary" object instance that manages the Tickets created and owned by the local node. That object examines the configuration and creates additional "Secondary" object instances for every other node configured in the cluster.


As noted, CAS ticket handling is configured by the ticketRegistry.xml file. It has two sections.

First, a bean with id="ticketRegistry" is configured by selecting the class name of one of the optional TicketRegistry implementations (JBoss Cache, Ehcache, ...). To use Cushy, you configure the CushyTicketRegistry class here. The rest of the bean definition provides property values that configure that particular type of registry.

Then at the end there are a group of bean definitions that set up periodic timer driven operations using the Spring support for the Quartz timer library. Normally these beans set up the RegistryCleaner to wake up periodically and remove all the expired tickets from the Registry.

Cushy adds a new bean at the beginning of the file. This optional bean, of class CushyClusterConfiguration, uses some static configuration information plus runtime Java logic (finding the IP addresses and hostname of the current computer) to select a specific cluster configuration and generate property values that are passed on to the CushyTicketRegistry bean. If this class does not do what you want, you can alter it, replace it, or just generate a static configuration for the CushyTicketRegistry bean.

The Cluster

We prefer a single "cas.war" artifact that works everywhere. It has to work on standalone or clustered environments, in a desktop sandbox with or without virtual machines, but also in official DEV (development), TEST, and PROD (production) servers.

There are techniques (Ant, Maven) to "filter" a WAR file replacing one string of text with another as it is deployed to a particular host. While that works for individual parameters like "nodeName", the techniques that are available make it hard to substitute a variable number of elements, and some locations have one CAS node in development, two CAS nodes in test, and three CAS nodes in production.

Then when we went to Production Services to actually deploy the code, they said that they did not want to edit configuration files. They wanted a system where the same WAR is deployed anywhere, and when it starts up it looks at the machine it is on, decides that this is a TEST machine (because it has "tst" in the hostname), and automatically generates the configuration of the TEST cluster.

At this point you should have figured out that it would be magical if anyone could write a class that reads your mind and figures out what type of cluster you want. However, it did seem reasonable to write a class that could handle most configurations out of the box and was small enough and simple enough that you could add any custom logic yourself.

The class is CushyClusterConfiguration and it is separate from CushyTicketRegistry to isolate its entirely optional convenience features and make it possible to jiggle the configuration logic without touching the actual TicketRegistry. It has two configuration strategies:

First, you can configure a sequence of clusters (desktop sandbox, and machine room development, test, and production) by providing, for each cluster, a list of the machine-specific raw URLs used to get to CAS on each node (from other machines also behind the machine room firewall). CushyClusterConfiguration looks up all the IP addresses of the current machine, then looks up the addresses associated with the server in each URL of each cluster. It chooses the first cluster that it is in (that contains a URL that resolves to an address of the current machine).

Second, if none of the configured clusters contains the current machine, or if no configuration is provided, then Cushy uses the HOSTNAME and some Java code to automatically configure the cluster. At this point we expect you to provide some programming, unless you can use the Yale solution off the shelf.

At Yale we know that CAS is a relatively small application with limited requirements, and that any modern multi-core server can certainly handle all the CAS activity of the university (or even of a much larger university). So we always create clusters with only two nodes, and the other node is just for recovery from a serious failure (and ideally the other node is in another machine room far enough away to be outside the blast radius).

In any given cluster, the hostnames of the two machines are identical except for a three-character suffix, either "-01" or "-02". So by finding the current HOSTNAME, Cushy can determine that if this machine has "-01" in its name, the other machine in the cluster has "-02", or the reverse.

Sounds easy, but as always the actual code implies some rules you need to know.

...

Configuration By File

You can define the CushyClusterConfiguration bean with or without a "clusterDefinition" property. If you provide the property, it is a List of Lists of Strings:

    <bean id="clusterConfiguration" class="edu.yale.its.tp.cas.util.CushyClusterConfiguration"
        p:md5Suffix="yes" >
      <property name="clusterDefinition">
           <list>
               <!-- Desktop Sandbox cluster -->
               <list>
                   <value>http://foo.yu.yale.edu:8080/cas/</value>
                   <value>http://bar.yu.yale.edu:8080/cas/</value>
               </list>
               <!-- Development cluster -->
               <list>
                   <value>https://casdev1.yale.edu:8443/cas/</value>
                   <value>https://casdev2.yale.edu:8443/cas/</value>
               </list>
...
           </list>
      </property>
    </bean>

In Spring, the <value> tag generates a String, so this is what Java calls a List<List<String>> (a List of Lists of Strings). As shown, the outer List has two elements. The first element is a List with two Strings for the machines foo and bar. The second element is another List with two Strings for casdev1 and casdev2. Only one of these cluster definitions should apply. At run time CushyClusterConfiguration selects the first usable cluster configuration, where a configuration is not usable if the current machine is not in the cluster.
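The first-match selection can be sketched in plain Java. This is a simplified illustration, not the actual CushyClusterConfiguration source; the class and method names are invented for the example:

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.URI;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClusterPicker {

    // Gather every IP address assigned to any interface on this machine.
    static Set<String> localAddresses() throws Exception {
        Set<String> local = new HashSet<>();
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            for (InetAddress a : Collections.list(nic.getInetAddresses())) {
                local.add(a.getHostAddress());
            }
        }
        return local;
    }

    // Return the index of the first cluster containing a URL whose host
    // resolves to an address on this machine, or -1 if none matches.
    static int pickCluster(List<List<String>> clusterDefinition) throws Exception {
        Set<String> local = localAddresses();
        for (int i = 0; i < clusterDefinition.size(); i++) {
            for (String url : clusterDefinition.get(i)) {
                String host = URI.create(url).getHost();
                try {
                    for (InetAddress a : InetAddress.getAllByName(host)) {
                        if (local.contains(a.getHostAddress())) return i;
                    }
                } catch (java.net.UnknownHostException e) {
                    // A name that does not resolve cannot be this machine; keep scanning.
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> clusters = List.of(
            List.of("http://no-such-host.invalid:8080/cas/"),
            List.of("http://localhost:8080/cas/"));
        System.out.println(pickCluster(clusters)); // 1: only the second cluster resolves to this machine
    }
}
```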

There is no good way to determine all the DNS names that may resolve to an address on this server. However, it is relatively easy in Java to find all the IP addresses of all the LAN interfaces on the current machine. This list may be longer than you think. Each LAN adapter can have IPv4 and IPv6 addresses, and then there can be multiple real LANs and a bunch of virtual LAN adapters for VMWare or Virtualbox VMs you host or tunnels to VPN connections. Of course, there is always the loopback address.
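Enumerating those addresses really is only a few lines of standard Java. A minimal sketch using only java.net classes:

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LocalAddresses {

    // Collect every IP address bound to every interface on this machine,
    // including loopback, IPv6, and virtual (VMWare/VirtualBox/VPN) adapters.
    public static List<String> allLocalAddresses() throws Exception {
        List<String> result = new ArrayList<>();
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                result.add(addr.getHostAddress());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The list is often longer than you expect.
        for (String a : allLocalAddresses()) {
            System.out.println(a);
        }
    }
}
```

Run it on a desktop with a VM product installed and you will typically see the loopback address, real LAN addresses, and several private virtual-LAN addresses.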

...

It is not generally possible to determine the port numbers that a J2EE Web Server is using. So it is not possible to make distinctions based only on port number. CushyClusterConfiguration requires a difference in IP addresses. So if you want to emulate a cluster on a single machine, use VirtualBox to create VMs and don't think you can run two Tomcats on different ports.

(This does not apply to unit testing, because unit testing does not use a regular WAR and is not constrained to a single configuration file. The unit tests include examples with two instances of CushyTicketRegistry configured by two instances of CushyClusterConfiguration, each given a cluster configuration that lists the same hostnames in a different order. In fact, it can be a useful trick that the code stops at the first match. If you edit the etc/hosts file to map a pair of dummy hostnames (foo and bar, say) to the loopback address 127.0.0.1, then both names always match the current computer and Cushy stops when it encounters the first one. One configuration can list (foo,bar), and that instance of the Cushy classes will decide it is foo and the other node is bar, while the other configuration can list (bar,foo) and that instance will decide it is bar and the other node is foo. However, you have to be careful and in production enforce rules that prevent the algorithm from screwing up. You may have noticed the potential problem if you run VirtualBox or VMWare virtual machines on a Windows desktop computer at work.)
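Here is a runnable illustration of that order-dependence. The helper is hypothetical, not the Cushy source; it uses "localhost" and "127.0.0.1" as two names that both resolve to the current machine, so no etc/hosts editing is needed:

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.URI;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FirstMatchDemo {

    // Every IP address assigned to this machine.
    static Set<String> localAddresses() throws Exception {
        Set<String> local = new HashSet<>();
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            for (InetAddress a : Collections.list(nic.getInetAddresses())) {
                local.add(a.getHostAddress());
            }
        }
        return local;
    }

    // Return the first URL in the cluster whose host resolves to this machine --
    // the URL a first-match scan would treat as "this node".
    static String firstLocalUrl(List<String> cluster) throws Exception {
        Set<String> local = localAddresses();
        for (String url : cluster) {
            for (InetAddress a : InetAddress.getAllByName(URI.create(url).getHost())) {
                if (local.contains(a.getHostAddress())) return url;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Both names resolve to this machine, so scan order decides identity.
        List<String> configA = List.of("http://localhost:8080/cas/", "http://127.0.0.1:8080/cas/");
        List<String> configB = List.of("http://127.0.0.1:8080/cas/", "http://localhost:8080/cas/");
        System.out.println(firstLocalUrl(configA)); // http://localhost:8080/cas/
        System.out.println(firstLocalUrl(configB)); // http://127.0.0.1:8080/cas/
    }
}
```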

Any automatic configuration mechanism can get screwed up by mistakes made by system administrators. In this case, it is a little easier to mess things up in Windows. You may have already noticed this if your Windows machine hosts VMs or if your home computer is a member of your Active Directory at work (through a VPN, for example). At least you would see it if you do "nslookup" to see what DNS thinks of your machine. Windows has Dynamic DNS support, and it is enabled by default on each new LAN adapter. After a virtual LAN adapter has been configured you can go to its adapter configuration, select IPv4, click Advanced, select the DNS tab, and turn off the checkbox labelled "Register this connection's addresses in DNS". If you don't do this (and how many people even think to do this), then the private IP address assigned to your computer on the virtual LAN (the 192.168.1.1 style address created to be private to this virtual LAN inside your computer, or the home network address assigned to your computer when it has a VPN tunnel to work) gets accidentally registered in the DNS server of the Active Directory along with the real IP address of the real LAN adapter of your machine. So throughout an organization there can be dozens or hundreds of computers whose DNS names all resolve to the same private address. When you look up your machine in DNS you get the IP address you expected, and then an additional address of the form 192.168.1.? which is either the address of your machine on your home LAN or its address on the private virtual LAN that connects it to the VMs it hosts.

Generally the extra address doesn't matter, and most code ignores it. A problem only arises when another computer, which is also on a home or virtual network with its own 192.168.1.* address, looks up the DNS name of this machine, gets back the list of addresses, and for whatever reason decides that this machine is on its own home or virtual LAN instead of using the real public address that can actually get to the machine.

CushyClusterConfiguration is going to notice all the addresses on the machine and all the addresses registered to DNS, and it may misidentify the cluster if these spurious internal private addresses are being used on more than one sandbox or machine room CAS computer. It is unlikely that production or professionally managed machines will have this error, but you should be warned.

On the other hand, you can create this situation intentionally for test purposes by adding names with the loopback or private addresses to the etc/hosts table on your desktop sandbox computer. Just remember the algorithm and you can figure out the testing tricks yourself.

This may seem complex, but in practice you will have a specific set of production and test machines and a sandbox development environment. You build a single configuration that specifies everything, test to make sure you haven't done anything dumb, and then you can create a single WAR file that automatically detects which environment it is running in and doesn't have to be changed when you move it across machines. It is a design objective of continuing Cushy development to refine this configuration process so you cannot get messed up when a USB device you plug into your computer generates a USB LAN with a 192.168.153.4 address for your computer, while preserving your ability to configure a couple of VM guests on your desktop for CAS testing.

Note also that the Unit Test cases sometimes exploit this by defining dummy hostnames that resolve to the loopback address and therefore are immediately matched on any computer.

In practice you will have a sandbox you created and some machine room VMs that were professionally configured and do not have strange or unexpected IP addresses, and you can configure all the hostnames in a configuration file and Cushy will select the right cluster and configure itself the way you expect.

Autoconfigure

At Yale the names of DEV, TEST, and PROD machines follow a predictable pattern, and CAS clusters have only two machines. So production services asked that CAS automatically configure itself based on those conventions. If you have similar conventions and any Java coding expertise you can modify the autoconfiguration logic at the end of CushyClusterConfiguration Java source.

...

At Yale, the two servers in any cluster have DNS names that end in "-01" or "-02". Therefore, Cushy autoconfigure gets the HOSTNAME of the current machine, looks for "-01" or "-02" in the name, and on a match creates a cluster with the current machine and one additional machine with the same name but substituting "-01" for "-02" or the reverse.
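The Yale convention amounts to a few lines of string manipulation. As a sketch (hypothetical class name and hostnames, not the actual Cushy source):

```java
public class NodePair {

    // Given this machine's HOSTNAME, derive the partner node's name by
    // swapping the "-01"/"-02" suffix convention. Returns null if the name
    // doesn't follow the convention, which means standalone mode.
    static String otherNode(String hostname) {
        if (hostname.contains("-01")) return hostname.replace("-01", "-02");
        if (hostname.contains("-02")) return hostname.replace("-02", "-01");
        return null;
    }

    public static void main(String[] args) {
        System.out.println(otherNode("casx-01.example.edu")); // casx-02.example.edu
        System.out.println(otherNode("sandbox.example.com")); // null -> standalone
    }
}
```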

Standalone

If no configured cluster matches the current machine IP addresses and the machine does not autoconfigure (because the HOSTNAME does not have "-01" or "-02"), then Cushy configures a single standalone server with no cluster.

Even without a cluster, Cushy still checkpoints the ticket cache to disk and restores the tickets across a reboot. So it provides a useful function in a single machine configuration that is otherwise only available with JPA and a database.

This is all Optional

Although CushyClusterConfiguration makes most configuration problems simple and automatic, if it does the wrong thing and you don't want to change the code you can ignore it entirely. As will be shown in the next section, there are three properties (a String and two Properties tables) that are input to the CushyTicketRegistry bean. The whole purpose of CushyClusterConfiguration is to generate values for these three parameters. If you don't like it, you can use Spring to generate static values for these parameters, and you don't even have to use the clusterConfiguration bean.

Other Parameters

Typically in the ticketRegistry.xml Spring configuration file you configure CushyClusterConfiguration as a bean with id="clusterConfiguration" first, and then configure the usual id="ticketRegistry" bean using CushyTicketRegistry. The clusterConfiguration bean exports some properties that are used (through Spring EL) to configure the Registry bean.

...