...

The idea of using a cluster for availability made sense ten years ago, when servers were physical machines and recovery involved manual intervention. Today servers run on VMs in a highly managed environment, and backup VMs can be spun up automatically. It may be possible to design a system where the backup comes up automatically and so quickly that you don't need a cluster at all. Cushy supports this profile and strategy.

However, if you still insist on creating a cluster of CAS servers, then you should consider the small number of very specific programming problems that any CAS cluster must solve:

  1. Because CAS as currently written uses Spring Web Flow to store data between the time the browser's initial GET returns the logon form and the time the user submits the userid and password, either the form has to POST back to the same server that wrote it, or the Session object has to be replicated between Application Servers. This is what JBoss calls "clustering", but it has nothing to do with CAS tickets.
  2. After that, the browser has to come back to the CAS server that processed the logon, or else the logon ticket has to be replicated to all servers.
  3. Any application that uses Proxy tickets has to come back to the CAS server that granted the ticket, or proxy tickets have to be replicated to all servers.
  4. Any request from an application to validate a Service Ticket has to go to the CAS server that issued the ticket, or the ST has to be replicated to all servers. (Problems 2 through 4 share one routing requirement; see the sketch after this list.)
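The routing alternative in problems 2 through 4 rests on one trick: CAS can append the name of the issuing node to every ticket ID, and a programmable Front End (or a routing filter in front of CAS) can read that suffix and forward the request to that node. Here is a minimal sketch of the suffix extraction, assuming the trailing "-nodename" convention; the class and method names are illustrative, not part of CAS or Cushy.

```java
// Illustrative sketch: pull the issuing node's name out of a CAS ticket ID
// so the request can be routed to that node. Assumes ticket IDs end with
// a "-nodename" suffix, e.g. "ST-42-aBcDeF-casnode1".
public final class TicketRouting {

    /** Returns the node-name suffix of a ticket ID, or null if none. */
    public static String nodeFor(String ticketId) {
        if (ticketId == null) {
            return null;
        }
        int lastDash = ticketId.lastIndexOf('-');
        return (lastDash < 0) ? null : ticketId.substring(lastDash + 1);
    }

    public static void main(String[] args) {
        // A Service Ticket issued by the node named "casnode1":
        System.out.println(nodeFor("ST-42-aBcDeF-casnode1")); // casnode1
    }
}
```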

Since CAS 3 first came out, the assumption has been that tickets have to be replicated to every server. That may have been necessary with the network Front End options available at the time, but there are some differences between modern technology and the machine room of ten years ago when conventional CAS cluster support was developed.

Multiple CAS nodes will still be run in the VM infrastructure, but with the CAS server divorced from physical hardware, nodes should not remain down as long.

Original CAS clustering assumed that the network Front End was fairly dumb. Typically it would take requests for the common CAS URL and distribute them on a round-robin basis to the available servers, so CAS clustering had to replicate ticket status almost immediately to all the nodes before the next request came in and was randomly assigned to an unpredictable node. Modern Front End devices, such as the BIG-IP F5, are much smarter and are programmable. Every CAS request from a browser or an application carries a ticket ID in a well-defined location, and CAS 3 has always been able to generate ticketids that contain the name of the server that created them. So a modern Front End can be programmed with enough understanding of the CAS protocol to round-robin only the initial login of a new user; after that, it routes every request for a ticket, whether from the browser or from an application validating the ticket, to the node that issued it, unless that node is down. With such a Front End, tickets only have to be replicated to protect against system failure, which allows replication to be measured in seconds instead of milliseconds (hence the term Lazy Replication). So now Cushy with its checkpoint and incremental files makes it possible to recover from a CAS crash without a cluster, and the modern Front End makes it possible to create a cluster without ticket replication; although a cluster is then strictly unnecessary, you may consider building one anyway.
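To make "seconds instead of milliseconds" concrete, here is a minimal sketch of a lazy replication schedule: a full checkpoint of all tickets on a long timer, and small incremental files of recent changes on a short one. The class name, intervals, and empty method bodies are assumptions for illustration, not the actual Cushy implementation.

```java
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: replication runs on background timers measured
// in seconds, never synchronously on the request path. Not the Cushy code.
public final class LazyReplicationSketch {

    private final Map<String, Object> tickets = new ConcurrentHashMap<>();

    public void start() {
        Timer timer = new Timer("ticket-replication", true); // daemon thread

        // Full checkpoint: serialize every ticket, every 5 minutes (assumed).
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() { writeCheckpoint(); }
        }, 0, 5 * 60 * 1000);

        // Incremental file: only the tickets added or deleted since the
        // last file, every 10 seconds (assumed).
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() { writeIncremental(); }
        }, 10_000, 10_000);
    }

    private void writeCheckpoint() {
        // Write the whole ticket map to a file the other nodes can fetch.
    }

    private void writeIncremental() {
        // Write only the changes since the last checkpoint or incremental.
    }
}
```

Because nothing here runs on the request path, a replication pass costs the user nothing; the trade-off is that a crash can lose the last few seconds of ticket changes, which is exactly the bargain lazy replication makes.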

The CushyClusterConfiguration class makes it simple to configure more than one CAS server in a cluster. It makes sure that every server has a unique name, that all members of the cluster know the names and network locations of the other members, and that some version of these names is appended to every ticketid so the Front End can route requests properly. It then feeds this cluster information to the CushyTicketRegistry object.
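As a rough illustration of the information that configuration has to produce for each node, the following sketch shows the shape of the data: a unique name, a network location, and the suffix appended to ticket IDs so the Front End can route on it. The field names here are assumptions for illustration, not the actual CushyClusterConfiguration API.

```java
// Illustrative data shape only; not the CushyClusterConfiguration API.
public final class ClusterNode {
    public final String name;         // unique in the cluster, e.g. "casnode1"
    public final String url;          // e.g. "https://casnode1.example.edu/cas/"
    public final String ticketSuffix; // appended to every ticket ID for routing

    public ClusterNode(String name, String url, String ticketSuffix) {
        this.name = name;
        this.url = url;
        this.ticketSuffix = ticketSuffix;
    }
}
```

Every node would get the same list of these entries, so each server knows its own identity and the network locations of the others.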

...