...
The other mechanisms (JPA or the cache technologies) operate on single tickets. They write individual tickets to the database or replicate them across the network. On its face this is vastly more efficient than periodically copying all the tickets to disk. Except that at Yale (a typical medium sized university), the entire Registry of tickets can be written to a disk file in about 1 second and produces a file about 3 megabytes in size. Those numbers are so small that writing a copy of the entire Registry to disk every few minutes is a trivial use of modern multi-core server hardware, and copying 3 megabytes of data over the network every 5 minutes, or even once a minute, is a trivial use of network bandwidth on a modern server. Given the price of hardware, being more efficient than that is unnecessary.
Once you have a file on disk, it does not take long to figure out how to get a copy of that file from one Web Server to another. An HTTP GET is the obvious solution, though if you had shared disk there are other options.
Going to an intermediate disk file was not the solution that first comes to mind. If the tickets are in memory on one machine and they have to be copied to memory on another machine, some sort of direct network transfer is the first thing you think about. However, the intermediate disk file is useful to restore tickets to memory if you have to restart your CAS server for some reason. More importantly, it means that the network transmission is COMPLETELY separate from the process of creating, validating, and deleting tickets. If the network breaks down you cannot transfer the files, but CAS continues to operate normally and it can even generate new files with newer copies of all the tickets. When the network comes back, the file transfer resumes independently of the main CAS services. So replication problems can never interfere with CAS operation. Cushy is less efficient, but in a way that is predictable and insignificant, in exchange for code that is simple and easy to completely understand.
Once the tickets are a file on disk, the Web server provides an obvious way (an HTTPS GET) to transfer them from one server to another. Instead of multicast sockets with complex error recovery, you are using a simple technology everyone understands to accomplish a trivial function.
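As a rough illustration (this is a sketch, not the Cushy source), fetching another node's file really is just an ordinary Java HTTPS GET. The class name, URL path, and file name below are invented for the example.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CheckpointFetcher {

    /**
     * Fetch the latest checkpoint or incremental file published by another
     * CAS node and save it locally. The URL layout is hypothetical,
     * e.g. https://cas02.example.edu/cas/cluster/tickets-checkpoint.ser
     */
    public static void fetchFile(String nodeBaseUrl, String fileName, Path localDir) throws Exception {
        URL url = new URL(nodeBaseUrl + "/cluster/" + fileName);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(15_000);
        try (InputStream in = conn.getInputStream()) {
            // Overwrite any previous copy of this node's file.
            Files.copy(in, localDir.resolve(fileName), StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }
}
```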
Cache solutions go memory to memory. Adding an intermediate disk file wasn't an obvious step, but once you think of it, it has some added benefits. If you reboot the CAS server, the local disk file allows CAS to immediately restore the tickets, and therefore its state, from before the reboot. Serializing the tickets to disk will work no matter how badly the network or other nodes are damaged, and it is the only step that involves the existing CAS code. The second step, transferring the file from one server to another, is accomplished with new code that runs in the CAS Web application, but it does not touch a single existing CAS object or class. So whatever unexpected problems the network might create, they affect only the independent file transfer logic, leaving normal CAS function untouched. And while the cache solutions require complex logic to reconcile the caches on different machines after communication between nodes is restored, Cushy retransmits the entire set of tickets every few minutes (the interval is configurable), after which every node is guaranteed to be back in synchronization.
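To make the checkpoint idea concrete, here is a minimal sketch (again, not the actual Cushy class) of writing the ticket collection to disk with standard Java serialization and reading it back after a restart. It assumes only that the CAS Ticket interface (org.jasig.cas.ticket.Ticket) is Serializable, as it is in stock CAS; the class and method names are invented.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.jasig.cas.ticket.Ticket;

public class TicketCheckpoint {

    /** Serialize every ticket in the registry into a single checkpoint file. */
    public static void writeCheckpoint(Collection<Ticket> tickets, File file)
            throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            // One writeObject call captures the whole collection and
            // everything the tickets reference.
            out.writeObject(new ArrayList<Ticket>(tickets));
        }
    }

    /** Restore the tickets after a reboot, or load a file fetched from another node. */
    @SuppressWarnings("unchecked")
    public static List<Ticket> readCheckpoint(File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (List<Ticket>) in.readObject();
        }
    }
}
```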
Cushy is based on four basic design principles:
...
Given the small cost of making a complete checkpoint, you could configure Cushy to generate one every 10 seconds and run the cluster on full checkpoints alone. It is slightly inefficient, but using 1 second of one core and transmitting 3 megabytes of data to each node every 10 seconds is not a big deal on modern multi-core servers. This was the first Cushy code milestone, and it lasted for about a day before it was extended with a little extra code.
The next milestone (a day later) was to add an "incremental" file that contains all the tickets added, and the ticket ids of tickets deleted, since the last full checkpoint. Creating multiple increments and transmitting only the changes the other node has not yet seen was considered, but it would require more code and complexity. If you generate checkpoints every few minutes, the incremental file grows as more changes are made, but it never gets really large. It is well known that the overhead of creating and opening a file or establishing a network connection is so great that the difference between reading or writing 5K or 100K is trivial.
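A cumulative incremental can be represented by something as simple as the following sketch; the class name and fields are invented for illustration, not taken from the Cushy source.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import org.jasig.cas.ticket.Ticket;

/**
 * Everything that has changed since the last full checkpoint.
 * The object is rebuilt and rewritten in full each time the timer fires,
 * so applying only the most recent incremental is always sufficient.
 */
public class Incremental implements Serializable {
    private static final long serialVersionUID = 1L;

    public final List<Ticket> addedTickets = new ArrayList<>();
    public final List<String> deletedTicketIds = new ArrayList<>();
}
```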
In Cushy you configure a timer in XML. If you set the timer to 10 seconds, then Cushy writes a new incremental file every 10 seconds. Separately, you configure the time between full checkpoints. When the timer goes off, if enough time has passed since the last checkpoint, then instead of writing an incremental file Cushy writes a new full checkpoint.
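The decision is just an elapsed-time check each time the timer fires, roughly along these lines (a sketch with invented class and method names, not the actual Cushy code):

```java
/**
 * Called by the Spring-configured timer (say every 10 seconds).
 * Writes an incremental file on most ticks, and a full checkpoint
 * once checkpointIntervalMillis has elapsed since the last one.
 */
public class ReplicationTimerTask {

    private final long checkpointIntervalMillis;   // e.g. 5 * 60 * 1000
    private long lastCheckpointMillis = 0;

    public ReplicationTimerTask(long checkpointIntervalMillis) {
        this.checkpointIntervalMillis = checkpointIntervalMillis;
    }

    public void onTimer() {
        long now = System.currentTimeMillis();
        if (now - lastCheckpointMillis >= checkpointIntervalMillis) {
            writeFullCheckpoint();          // also resets the accumulated incremental changes
            lastCheckpointMillis = now;
        } else {
            writeIncremental();             // cumulative changes since the last checkpoint
        }
    }

    private void writeFullCheckpoint() { /* serialize all tickets to disk */ }
    private void writeIncremental()    { /* serialize added tickets + deleted ids */ }
}
```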
Only a small number of tickets are added between checkpoints, but lots of Service Tickets have been created and deleted, and there is no good way to keep the list of expired Service Tickets from making the incremental file larger. So if you tried to separate full checkpoints by an unreasonable amount of time, you would find the incremental file had grown to be larger than the checkpoint file and you would have made things worse rather than better. The expectation is that you do a full checkpoint somewhere between every 1 and 10 minutes and an incremental somewhere between every 5 and 15 seconds, but test it and make your own decisions. Incrementals are designed to grow between full checkpoints; they are cumulative, so you can always apply the last incremental you got without worrying about any previous incrementals. Again, slightly inefficient, but trivially so, and it emphasizes simplicity.
CAS already ran the RegistryCleaner off a timer configured in Spring XML to call it every so often. Cushy adds a second timer to the same configuration file to signal the TicketRegistry frequently. For this example, say it makes the call every 10 seconds. Then every 10 seconds Cushy generates an incremental file and then checks all the other nodes to get their most recent incremental files. Separately, Cushy is configured with the time between checkpoints (say every 5 minutes), so when it has been long enough that a new full checkpoint is due, it creates a full checkpoint instead of an incremental, as sketched below.
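Putting the pieces together, each timer tick does two independent things: publish this node's own file, then pull the latest files from its peers. The sketch below reuses the hypothetical CheckpointFetcher and ReplicationTimerTask classes from the earlier examples; the node list and file name are assumptions, not the actual Cushy configuration.

```java
import java.nio.file.Path;
import java.util.List;

public class ClusterTick {

    private final List<String> otherNodeUrls;       // e.g. https://cas02.example.edu/cas
    private final Path workDirectory;
    private final ReplicationTimerTask localWriter;

    public ClusterTick(List<String> otherNodeUrls, Path workDirectory, ReplicationTimerTask localWriter) {
        this.otherNodeUrls = otherNodeUrls;
        this.workDirectory = workDirectory;
        this.localWriter = localWriter;
    }

    /** Runs every 10 seconds from the Spring-configured timer. */
    public void tick() {
        // 1. Write this node's incremental (or a full checkpoint when one is due).
        localWriter.onTimer();

        // 2. Pull the most recent file published by every other node.
        //    A failure here only delays replication; it never touches ticket processing.
        for (String nodeUrl : otherNodeUrls) {
            try {
                CheckpointFetcher.fetchFile(nodeUrl, "tickets-incremental.ser", workDirectory);
            } catch (Exception e) {
                // Node unreachable: skip it and try again on the next tick.
            }
        }
    }
}
```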
Each incremental has a small number of new Login (TGT) tickets and maybe a few unclaimed Service Tickets. However, because we do not know whether any previous incremental was or was not processed, it is necessary to transmit the list of every ticket deleted since the last full checkpoint, and that list will contain the IDs of lots of Service Tickets that were created, validated, and deleted within a few milliseconds. That list is going to grow, but its size is limited by the fact that we start over again after each full checkpoint.
A Service Ticket is created and then almost immediately validated and deleted. Trying to replicate Service Tickets to the other nodes before the validation request arrives is an enormous problem that drives the configuration and timing parameters of all the other Ticket Registry solutions. Cushy doesn't try to do replication at this speed. Instead, it has CAS configuration elements that ensure each ticket ID contains an identifier of the node that created it, and it depends on a front end smart enough to route any ticket validation request to the node that created the ticket and already has it in memory. Then replication is only needed for crash recovery.
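CAS ticket ids end with a configurable suffix, so if that suffix is set to the node name, the owning node can be read straight out of the id. The sketch below shows the kind of check a smart front end would make; the id format with a trailing node-name suffix (e.g. "ST-1234-aBcDeFg-casnode01") is an assumption for the example, and the class is not part of Cushy.

```java
/**
 * Picks the owning node out of a CAS ticket id, assuming the ticket id
 * suffix has been configured to be the node name. This mirrors what a
 * front end would do when routing a /serviceValidate request.
 */
public class TicketRouter {

    /** Returns the node name encoded in the ticket id, or null if there is no suffix. */
    public static String nodeFor(String ticketId) {
        if (ticketId == null) {
            return null;
        }
        int lastDash = ticketId.lastIndexOf('-');
        if (lastDash < 0 || lastDash == ticketId.length() - 1) {
            return null;   // no suffix present
        }
        return ticketId.substring(lastDash + 1);
    }
}
```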
...