...
The other CAS clustering techniques (JBoss Cache, Ehcache, Memcached) are typically regarded as magic off-the-shelf software products that take care of all your problems automatically so you don't have to worry about them. Again you haven't actually solved the problem; you have transformed it into something you will never understand, so you just have to cross your fingers and hope those guys know what they are doing.

Cushy is simple enough that anyone can figure out how it works and, therefore, the consequences of any type of failure. So now you really can plan for each type of failure and design the system to do the best thing. Cushy was specifically written for the people who feel more comfortable knowing exactly how stuff works, and how it fails.

Even if you do not understand Java programming, CushyTicketRegistry performs a sequence of steps, described here, that you can understand. It writes a file on disk, and from that point on everything is file transfer. You can use the built-in Web support or replace it with something else. From that point on every type of node failure or network failure produces predictable behavior. Since the file transfer is retried periodically, every type of hardware recovery also produces predictable results. This is something you can understand and take into consideration when you plan out the scenarios.
Why another Clustering Mechanism?
One solution is to share all the ticket objects and their associated components in database tables using JPA. JPA is the standard Java mechanism for mapping objects to tables, and it is an enormously powerful tool for ordinary Web applications. It is a possible solution here. You can use JPA, but CAS doesn't really have a database problem:
- CAS tickets all timeout after a number of hours. They have no need for long term persistence.
- There are no meaningful SQL operations in CAS. Nobody will generate reports based on tickets.
- CAS has no transactional structure or need for a conventional commit operation.
Most importantly, having created a cluster for availability, JPA now makes the database a single point of failure. Configuring a database for 24x7x366 availability and guaranteeing that it comes up before CAS places a significant and unnecessary burden on most CAS installations.
The alternative is to use one of several "cache" libraries (Ehcache, JBoss Cache, Memcached). They create the impression of a large pool of ordinary Java objects shared by all the CAS servers. Any change made to objects in the pool is automatically and transparently replicated to all the other servers. These systems are built to solve very large problems, and they can have very complicated configurations with exotic network parameters.
A common problem with both JPA and the generic "cache" solutions is that they integrate into CAS "inline". JPA is driven by annotations that are added to the Java source of the Ticket classes, but under the covers it dynamically generates code that it transparently "weaves" into the classes. The cache systems intercept TicketRegistry operations such as addTicket to make sure that copies of the tickets are moved to some network communications queue. In either case we have observed that when things get bad, when the network is sick or something generates an unexpected error, the problem can "back up" from the replication mechanism back into the TicketRegistry and then into CAS itself.
In Cushy, the only connection between the CAS mainline function (the part of CAS that responds to Web requests and services users and applications) and the replication machinery is that references to objects are occasionally copied from one in-memory collection to another. This is an object copy operation that can't fail unless you run out of memory. Separate from that, on a timer driven basis, collections of objects completely separate from the objects used by CAS request processing are periodically written to disk with a single Java writeObject statement. This is a disk I/O operation that can't fail unless the disk fills up or dies. Separate from that, on a network request driven basis, copies of those files are then sent to other CAS nodes. If there is a network problem, the HTTP GET fails with an I/O error, the operation is cancelled, and it is retried 10 or 15 seconds later.
Each of these three steps (memory to memory, memory to disk, disk to network) is independent and runs under different threads. Network problems do not block the code that writes the file, file problems do not block the code that copies references to objects. Nothing can crash CAS.
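As a concrete illustration of the middle step (the timer-driven checkpoint), here is a minimal sketch of a writeObject-based snapshot of a ticket collection. The class and member names are illustrative, not the actual Cushy source; the real CushyTicketRegistry works on CAS Ticket objects rather than plain Serializable values.

import java.io.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CheckpointSketch {
    private final Map<String, Serializable> tickets = new ConcurrentHashMap<>();

    // Called from a timer thread; CAS request threads never wait on this.
    public void checkpoint(File file) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(file))) {
            // A single writeObject call serializes the whole collection,
            // preserving the object graph (for example, chains of related tickets).
            out.writeObject(new ConcurrentHashMap<>(tickets));
        }
    }

    @SuppressWarnings("unchecked")
    public Map<String, Serializable> restore(File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(file))) {
            return (Map<String, Serializable>) in.readObject();
        }
    }
}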
Comparison of Cushy and previous cluster technologies:
- With Cushy the only thing you have to configure is the URL of the other node. There are no multicast addresses, no ports, no timeout values, no retry counts. There is no library, no drivers, no database.
- If the network has a problem, Cushy on the other nodes uses the last copy of the files it received from each of the other nodes. That is probably good enough for 99.9% of the users. It will update as soon as it can talk to the other nodes. Meanwhile, it continues to generate its own files on disk whether the other nodes are able to read them or not.
- Other systems break down if the communication between nodes or to the database fails while the Front End is still distributing requests round-robin. If the ST did not get replicated, it cannot be validated by any node except the one that issued it. Adding the Cushy routing logic to the Front End protects against this type of problem whether you use Cushy or any other TicketRegistry.
- If anything goes wrong with Cushy, it is just an HTTP request. You know how to diagnose them.
- Cushy keeps the tickets for each node separate from the tickets for every other node. That is also simpler.
- Cushy is probably less efficient than other technologies, but if it uses less than 1% of one core of a modern server then, given the relative importance of CAS in most institutions, reducing that to a quarter of 1% is not worthwhile if you have to give something up to get the efficiency.
JPA also weaves its own generated code into the methods exposed by the objects it manages. This causes the application (CAS) to fail in unpredictable and unavoidable ways if the database goes down or if network access to the database is interrupted.
There are a number of non-database central object server technologies available. There are no existing CAS TicketRegistry implementations for any of them, and the central server remains a problem.
JBoss Cache has proven unreliable, and it is terribly complex to configure with multicast addresses and complex network timeout and other parameters.
Ehcache appears to be the most commonly used CAS replication technology. It is fairly simple to configure, and it uses RMI calls to transmit tickets, a built in Java technology that is about as simple as Cushy HTTP. It can store tickets on local disk. It is the obvious alternative to CushyTicketRegistry and deserves special consideration.
Ehcache Compared to Cushy
First, CushyClusterConfiguration will configure either EhcacheTicketRegistry or CushyTicketRegistry, so neither one is easier to configure than the other.
If you use Cushy configuration and Front End programming (F5 or the Cushy Filter), then you can relax the ticket replication in Ehcache so Service Tickets and TGTs are both asynchronous and casual. This means that both Ehcache and Cushy would replicate tickets every 10 seconds or so given typical configurations.
In that 10 second period Ehcache would queue up all the changes and then send the changes for that 10 second period as an RMI operation to each configured machine in the cluster. Cushy on the other hand adds the changes during the 10 second period onto all the other changes since the last checkpoint. So Cushy sends a lot more data.
However, Ehcache knows nothing about CAS or Ticket structure. When it replicates a ST or proxy ticket, it also generates a copy of the TGT and transmits it to the other machines. Multiple copies of the TGT accumulate and are never cleaned up. Cushy cleans everything up because, once every few minutes, it replicates a copy of the entire ticket registry with what Java serialization guarantees to be an exact duplicate of all the connections and relationships between tickets.
Ehcache is a closed system. The Cushy checkpoint and incremental files can be backed up or copied off site as ordinary files, so you have a lot of flexibility.
Ehcache is designed to be a "cache". That is, it is designed to be a high speed, in-memory or local disk copy of some data whose persistent copy lives on some server. That is why it has a lot of configuration for "LRU" and object eviction, because it assumes that lost objects can be reloaded from persistent storage. You can use it as a replicated in-memory table, but if you read the documentation you will see that this is not its original design.
Ehcache replicates data transparently inside a large black box library. Cushy is a single source file of pure Java written to be easily understood. It is specifically designed to manage Tickets. Furthermore, there is a specific point in the code when files arrive and when they are being processed. These are places in the code where additional CAS specific logic can be added to handle special or future requirements.
Two examples in the form of a fable -
Suppose the Rapture happens and all your users have been good users and they are all transported to Heaven leaving their laptops and tablets behind. Activity ceases on the network, so all the other TicketRegistry systems have nothing to do. Cushy, however, is driven by the number of tickets in the Registry and not as much by the amount of activity. So it continues to generate and exchange checkpoint files until 8 hours after the Rapture when the logins all timeout.
Suppose (and this one we have all seen) someone doesn't really understand how applications are supposed to work with CAS, and they write their code so they get a new Service Ticket for every Web page the user accesses. CAS now sees a stream of requests to create and validate new Service Tickets. The other TicketRegistry systems replicate the Service Ticket and then immediately send a message to all nodes to delete the ticket they just created. Cushy instead just wakes up after 10 seconds and finds that all this ticket activity has mostly cancelled out. The incremental file will contain an increasing number of deleted Ticket IDs, until the next checkpoint resets it to empty and it starts growing again. If you turn on the option to ignore Service Tickets all together (because you don't really need to replicate them if you have programmed your Front End or added the Filter), Cushy can ignore this mess entirely.
...
Although HTTP is a "stateless" protocol, an SSL connection is frequently optimized to be a longer term thing that keeps a session alive between requests. The SSL connects the browser or application to the Front End, and there is probably a separate SSL connection from the Front End to the CAS VM. A common option for Front Ends is to notice any long running SSL connection and use it to route requests to the same backend VM node. You must be sure that you do not select this option with Cushy and CAS. For Service Ticket validation requests to work, the routing decision has to be made separately for each request, because different tickets have to be routed to different CAS VMs even though they came from the same application.

From a practical point of view, the biggest problem with Cushy deployment will probably be that Front End programming is typically the responsibility of one group, and CAS is typically the responsibility of another group, and the CAS group may not be able to get the other group to prioritize its needs, or even convince them that this is a good idea.
If you cannot convince your network administrators to do the programming in the Front End where it belongs, you can get the same result slightly less efficiently using the CushyFrontEndFilter. Just add it as a Servlet Filter to the WEB-INF/web.xml file and it will do the same thing that the F5 is supposed to do.
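To make the routing decision concrete, here is a minimal sketch of the check such a filter has to make on each validation request. It only shows the suffix comparison; the proxying of a request to the node that owns the ticket is omitted, and the class name, the init parameter, and the suffix format are hypothetical examples rather than the actual CushyFrontEndFilter source.

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class TicketRoutingFilterSketch implements Filter {
    // Hypothetical init parameter: the suffix this node puts on the tickets it issues.
    private String localSuffix;

    @Override
    public void init(FilterConfig config) {
        localSuffix = config.getInitParameter("localSuffix");
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        // CAS ticket IDs end in a node-identifying suffix, e.g. "ST-42-xyz-casvm01".
        String ticket = req.getParameter("ticket");
        if (ticket == null || ticket.endsWith(localSuffix)) {
            // No ticket, or a ticket this node issued: handle the request locally.
            chain.doFilter(req, resp);
        } else {
            // A ticket issued by another node: the real filter forwards the request
            // to that node's native URL. The forwarding is omitted from this sketch,
            // so we simply fall through to local processing here.
            chain.doFilter(req, resp);
        }
    }

    @Override
    public void destroy() {
    }
}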
What Cushy Does at Failure
...
To the outside world, the cluster typically shares a common virtual URL simulated by the Front End device. At Yale, CAS is "https://secure.its.yale.edu/cas" to all the users and applications. The "secure.its.yale.edu" DNS name is associated with an IP address managed by the BIG-IP F5 device. It holds the certificate, terminates the SSL, then examines requests and based on programming called iRules it forwards requests to any of the configured CAS virtual machines.
Each virtual machine has a native DNS name and URL. It is these "native" URLs that define the cluster because each CAS VM has to use the native URL to talk to another CAS VM. At Yale those URLs follow a pattern of "https://vm-foodevapp-01.web.yale.internal:8443/cas".
Internally, Cushy configuration takes a list of URLs and generates a cluster definition with three pieces of data for each cluster member: a nodename like "vmfoodevapp01" (the first element of the DNS name with dashes removed), the URL, and the ticket suffix that identifies that node (for the F5, the preferred ticket suffix is an MD5 hash of the IP address of the VM).
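For illustration, deriving the nodename and a ticket suffix from a member URL might look roughly like this. It is a sketch, not the actual CushyClusterConfiguration code, and the assumption that the suffix is a hex-encoded MD5 of the node's IP address string is spelled out in the comments.

import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ClusterMemberSketch {
    // Nodename: first element of the DNS name with the dashes removed.
    static String nodename(URL url) {
        return url.getHost().split("\\.")[0].replace("-", "");
    }

    // Ticket suffix: assumed here to be a hex-encoded MD5 hash of the VM's
    // IP address string (the exact input the F5 hashes is an assumption).
    static String ticketSuffix(String ipAddress) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(ipAddress.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("https://vm-foodevapp-01.web.yale.internal:8443/cas");
        System.out.println(nodename(url));              // vmfoodevapp01
        System.out.println(ticketSuffix("127.0.0.1"));  // stand-in IP for the demo
    }
}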
Configuration
In CAS the TicketRegistry is configured using the WEB-INF/spring-configuration/ticketRegistry.xml file.
In the standard file, a bean with id="ticketRegistry" is configured selecting the class name of one of the optional TicketRegistry implementations (JBoss Cache, Ehcache, ...). To use Cushy you configure the CushyTicketRegistry class and its particular parameters.
Then at the end there is a group of bean definitions that set up periodic timer driven operations using the Spring support for the Quartz timer library. Normally these beans set up the RegistryCleaner to wake up periodically and remove all the expired tickets from the Registry.
Cushy adds a new bean at the beginning. This is an optional bean for class CushyClusterConfiguration that uses some static configuration information and runtime Java logic to find the IP addresses and hostname of the current computer to select a specific cluster configuration and generate property values that can be passed on to the CushyTicketRegistry bean. If this class does not do what you want, you can alter it, replace it, or just generate static configuration for the CushyTicketRegistry bean.
Then add a second timer driven operation to the end of the file to call the "timerDriven" method of the CushyTicketRegistry object on a regular basis (say once every 10 seconds) to trigger writing the checkpoint and incremental files.
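Stripped of the Spring and Quartz wiring, those timer beans boil down to a periodic call to timerDriven(). As a plain-Java illustration of the shape of what the XML sets up (the interface and the 10 second interval below are stand-ins, not CAS or Cushy classes):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TimerSketch {
    // Stand-in for the CushyTicketRegistry bean; only the method name matters here.
    interface TimerDriven {
        void timerDriven();
    }

    public static void main(String[] args) {
        TimerDriven registry =
                () -> System.out.println("write an incremental file or a full checkpoint");
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // Every 10 seconds, give the registry a chance to write either an
        // incremental file or, when its countdown expires, a full checkpoint.
        timer.scheduleAtFixedRate(registry::timerDriven, 10, 10, TimeUnit.SECONDS);
    }
}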
The Cluster
We prefer a single "cas.war" artifact that works everywhere. It has to work on standalone or clustered environments, in a desktop sandbox with or without virtual machines, but also in official DEV (development), TEST, and PROD (production) servers.
There are techniques (Ant, Maven) to "filter" a WAR file replacing one string of text with another as it is deployed to a particular host. While that works for individual parameters like "nodeName", the techniques that are available make it hard to substitute a variable number of elements, and some locations have one CAS node in development, two CAS nodes in test, and three CAS nodes in production.
Then when we went to Production Services to actually deploy the code, they said that they did not want to edit configuration files. They wanted a system where the same WAR is deployed anywhere and when it starts up it looks at the machine it is on, decides that this a TEST machine (because it has "tst" in the hostname), and so it automatically generates the configuration of the TEST cluster.
At this point you should have figured out that it would be magical if anyone could write a class that reads your mind and figures out what type of cluster you want. However, it did seem reasonable to write a class that could handle most configurations out of the box and was small enough and simple enough that you could add any custom logic yourself.
The class is CushyClusterConfiguration and it is separate from CushyTicketRegistry to isolate its entirely optional convenience features and make it possible to jiggle the configuration logic without touching the actual TicketRegistry. It has two configuration strategies:
First, you can configure a sequence of clusters (desktop sandbox, and machine room development, test, and production) by providing, for each cluster, a list of the machine-specific raw URLs used to get to CAS (from other machines also behind the machine room firewall). CushyClusterConfiguration looks up all the IP addresses of the current machine, then looks up the addresses associated with the servers in each URL in each cluster. It chooses the first cluster that it is in (that is, the first cluster that contains a URL that resolves to an address of the current machine).
Second, if none of the configured clusters contains the current machine, or if no configuration is provided, then Cushy uses the HOSTNAME and some Java code to automatically configure the cluster. At this point we expect you to provide some programming, unless you can use the Yale solution off the shelf.
At Yale we know that CAS is a relatively small application with limited requirements, and that any modern multi-core server can certainly handle all the CAS activity of the university (or even of a much larger university). So we always create clusters with only two nodes, and the other node is just for recovery from a serious failure (and ideally the other node is in another machine room far enough away to be outside the blast radius).
In any given cluster, the hostname of both machines is identical except for a suffix that is either the three characters "-01" or "-02". So by finding the current HOSTNAME it can say that if this machine has "-01" in its name, the other machine in the cluster is "-02", or the reverse.
Configuration By File
You can define the CushyClusterConfiguration bean with or without a "clusterDefinition" property. If you provide the property, it is a List of Lists of Strings:
<bean id="clusterConfiguration" class="edu.yale.its.tp.cas.util.CushyClusterConfiguration"
p:md5Suffix="yes" >
<property name="clusterDefinition">
<list>
<!-- Desktop Sandbox cluster -->
<list>
<value>http://foo.yu.yale.edu:8080/cas/</value>
<value>http://bar.yu.yale.edu:8080/cas/</value>
</list>
<!-- Development cluster -->
<list>
<value>https://casdev1.yale.edu:8443/cas/</value>
<value>https://casdev2.yale.edu:8443/cas/</value>
</list>
...
</list>
</property>
</bean>
In Spring, the <value> tag generates a String, so this is what Java calls a List<List<String>> (a List of Lists of Strings). In the example above, the outer List has two elements shown. The first element is a List with two Strings for the machines foo and bar. The second element is another List with two Strings for casdev1 and casdev2.
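Expressed directly in Java, the structure that this XML builds is simply the following (using the same example URLs):

import java.util.Arrays;
import java.util.List;

public class ClusterDefinitionSketch {
    public static void main(String[] args) {
        List<List<String>> clusterDefinition = Arrays.asList(
                Arrays.asList("http://foo.yu.yale.edu:8080/cas/",     // desktop sandbox
                              "http://bar.yu.yale.edu:8080/cas/"),
                Arrays.asList("https://casdev1.yale.edu:8443/cas/",   // development
                              "https://casdev2.yale.edu:8443/cas/"));
        System.out.println(clusterDefinition);
    }
}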
There is no good way to determine all the DNS names that may resolve to an address on this server. However, it is relatively easy in Java to find all the IP addresses of all the LAN interfaces on the current machine. This list may be longer than you think. Each LAN adapter can have IPv4 and IPv6 addresses, and then there can be multiple real LANs and a bunch of virtual LAN adapters for VMWare or Virtualbox VMs you host or tunnels to VPN connections. Of course, there is always the loopback address.
So CushyClusterConfiguration goes to the first cluster (foo and bar). It does a name lookup (in DNS and in the local etc/hosts file) for each server name (foo.yu.yale.edu and bar.yu.yale.edu). Each lookup returns a list of IP addresses associated with that name.
CushyClusterConfiguration selects the first cluster and first host computer whose name resolves to an IP address that is also an address on one of the interfaces of the current computer. The DNS lookup of foo.yu.yale.edu returns a bunch of IP addresses. If any of those addresses is also an address assigned to any real or virtual LAN on the current machine, then that is the cluster host name and that is the cluster to use. If not, then try again in the next cluster.
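A minimal sketch of that matching logic, using only standard Java networking calls; this illustrates the algorithm described above rather than reproducing the actual CushyClusterConfiguration source.

import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.URL;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClusterMatchSketch {
    // Collect every IP address bound to any interface on this machine.
    static Set<InetAddress> localAddresses() throws Exception {
        Set<InetAddress> local = new HashSet<>();
        for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            local.addAll(Collections.list(nic.getInetAddresses()));
        }
        return local;
    }

    // Return the first cluster containing a URL whose host resolves to a local address.
    static List<String> chooseCluster(List<List<String>> clusters) throws Exception {
        Set<InetAddress> local = localAddresses();
        for (List<String> cluster : clusters) {
            for (String member : cluster) {
                String host = new URL(member).getHost();
                InetAddress[] resolved;
                try {
                    resolved = InetAddress.getAllByName(host);
                } catch (UnknownHostException e) {
                    continue; // this name is not known here, try the next member
                }
                for (InetAddress address : resolved) {
                    if (local.contains(address)) {
                        return cluster; // the current machine is a member of this cluster
                    }
                }
            }
        }
        return null; // no match: fall back to autoconfiguration or standalone mode
    }
}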
CushyClusterConfiguration can determine whether it is running in the sandbox on the desktop, or in the development, test, production, disaster recovery, or any other cluster definition. The only requirement is that IP addresses be distinct across servers and clusters.
Restrictions (if you use a single WAR file with a single global configuration):
It is not generally possible to determine the port numbers that a J2EE Web Server is using. So it is not possible to make distinctions based only on port number. CushyClusterConfiguration requires a difference in IP addresses. So if you want to emulate a cluster on a single machine, use VirtualBox to create VMs and don't think you can run two Tomcats on different ports.
(This does not apply to Unit Testing, because Unit Testing does not use a regular WAR and is not constrained to a single configuration file. If you look at the unit tests you can see examples where there are two instances of CushyTicketRegistry configured with two instances of CushyClusterConfiguration with two cluster configuration files. In fact, it can be a useful trick that the code stops at the first match. If you edit the etc/hosts file to create a bunch of dummy hostnames all mapped on this computer to the loopback address (127.0.0.1), then those names will always match the current computer and Cushy will stop when it encounters the first such name. The trick then is to create for the two test instances of Cushy two configuration files (localhost1,localhost2 and localhost2,localhost1). Fed the first configuration, that test instance of Cushy will match the first name (localhost1) and will expect the cluster to also have the other name (localhost2). Fed the second configuration the other test class will stop at localhost2 (which is first in that file) and then assume the cluster also contains localhost1.)
Any automatic configuration mechanism can get screwed up by mistakes made by system administrators. In this case, it is a little easier to mess things up in Windows. You may have already noticed this if your Windows machine hosts VMs or if your home computer is a member of your Active Directory at work (through VPNs, for example). At least you would see it if you do "nslookup" to see what DNS thinks of your machine. Windows has Dynamic DNS support and it is enabled by default on each new LAN adapter. After a virtual LAN adapter has been configured you can go to its adapter configuration, select IPv4, click Advanced, select the DNS tab, and turn off the checkbox labelled "Register this connection's addresses in DNS". If you don't do this (and how many people even think to do this?), then the private IP address assigned to your computer on the virtual LAN (or the home network address assigned to your computer when it has a VPN tunnel to work) gets registered in the AD DNS server. When you look up your machine in DNS you get the IP address you expected, plus an additional address of the form 192.168.1.? which is either the address of your machine on your home LAN or its address on the private virtual LAN that connects it to the VMs it hosts.
Generally the extra address doesn't matter. A problem only arises when another computer that is also on a home or virtual network with its own 192.168.1.* addresses looks up the DNS name of a computer, gets back a list of addresses, and for whatever reason decides that that other computer is also on its home or virtual LAN instead of using the real public address that can actually get to the machine.
CushyClusterConfiguration is going to notice all the addresses on the machine and all the addresses registered to DNS, and it may misidentify the cluster if these spurious internal private addresses are being used on more than one sandbox or machine room CAS computer. It is a design objective of continuing Cushy development to refine this configuration process so you cannot get messed up when a USB device you plug into your computer generates a USB LAN with a 192.168.153.4 address for your computer, but to do this in a way that preserves your ability to configure a couple of VM guests on your desktop for CAS testing.
Note also that the Unit Test cases sometimes exploit this by defining dummy hostnames that resolve to the loopback address and therefore are immediately matched on any computer.
In practice you will have a sandbox you created and some machine room VMs that were professionally configured and do not have strange or unexpected IP addresses, and you can configure all the hostnames in a configuration file and Cushy will select the right cluster and configure itself the way you expect.
Autoconfigure
At Yale the names of DEV, TEST, and PROD machines follow a predictable pattern, and CAS clusters have only two machines. So production services asked that CAS automatically configure itself based on those conventions. If you have similar conventions and any Java coding expertise you can modify the autoconfiguration logic at the end of CushyClusterConfiguration Java source.
CAS is a relatively simple program with low resource utilization that can run on very large servers. There is no need to spread the load across multiple servers, so the only reason for clustering is error recovery. At Yale a single additional machine is regarded as providing enough recovery.
At Yale, the two servers in any cluster have DNS names that end in "-01" or "-02". Therefore, Cushy autoconfigure gets the HOSTNAME of the current machine, looks for "-01" or "-02" in the name, and when it matches creates a cluster with the current machine and one additional machine with the same name but substituting "-01" for "-02" or the reverse.
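A sketch of that convention-based autoconfiguration; the hostnames and the two-node assumption follow the Yale pattern described above, and the method is illustrative rather than the actual source.

public class AutoConfigureSketch {
    // Derive the other member of a two-node cluster by swapping the "-01"/"-02"
    // suffix convention. Returns null if the convention does not apply, in which
    // case Cushy falls back to a standalone (single node) configuration.
    static String otherNode(String hostname) {
        if (hostname.contains("-01")) {
            return hostname.replace("-01", "-02");
        }
        if (hostname.contains("-02")) {
            return hostname.replace("-02", "-01");
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(otherNode("vm-foodevapp-01.web.yale.internal"));
        // prints vm-foodevapp-02.web.yale.internal
        System.out.println(otherNode("mylaptop")); // prints null, meaning standalone
    }
}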
Standalone
If no configured cluster matches the current machine IP addresses and the machine does not autoconfigure (because the HOSTNAME does not have "-01" or "-02"), then Cushy configures a single standalone server with no cluster.
Even without a cluster, Cushy still checkpoints the ticket cache to disk and restores the tickets across a reboot. So it provides a useful function in a single machine configuration that is otherwise only available with JPA and a database.
There is a separate page that describes CushyClusterConfiguration in detail.
You Can Configure Manually
...
CushyTicketRegistryTest.java tests the TicketRegistry interface and the Cushy functions of checkpoint, restore, writeIncremental, and readIncremental. You can create a single ticket or 100,000 TGTs. This verifies that the tickets are handled correctly, but it does not test CAS Business Layer processing. The test case initialization creates a new empty TicketRegistry for each test, so it is possible to check that a sequence of operations produces an expected outcome.
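The style of such a test can be illustrated without the CAS classes at all: snapshot a registry-like map with writeObject, read it back, and assert that the contents survived. This is a self-contained JUnit sketch with plain strings standing in for real tickets, not the actual CushyTicketRegistryTest.

import static org.junit.Assert.assertEquals;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
import org.junit.Test;

public class CheckpointRoundTripTest {
    @Test
    public void checkpointAndRestore() throws Exception {
        // A registry-like map; plain strings stand in for serialized tickets.
        Map<String, String> before = new HashMap<>();
        before.put("TGT-1-abc-casvm01", "ticket contents would go here");

        // Checkpoint: one writeObject call captures the whole collection.
        File file = File.createTempFile("checkpoint", ".ser");
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(before);
        }

        // Restore: read the file back and verify nothing was lost.
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            @SuppressWarnings("unchecked")
            Map<String, String> after = (Map<String, String>) in.readObject();
            assertEquals(before, after);
        }
    }
}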
...
127.0.0.1 casvm01 casvm02
Without this the two CushyClusterConfiguration beans cannot be tricked into regarding the one machine as if it was two nodes.
...
Create credentials on casvm02
Create a TGT with the credentials on casvm02
Simulate a failure of casvm02, from now on everything is casvm01
Create a ST using the TGT ID of the casvm02 TGT.
Use the ST to create a PGT.
Create a new ST using the PGT just created.
Validate the ST. Make sure that the netid that comes back matches the credentials supplied to casvm02.
...