...
There is much to be said for commercial off-the-shelf (COTS) software. After all, if something is widely used and written to handle much more complicated problems, then it should handle CAS. Unfortunately, all these packages are designed to support application-level software, and at Yale CAS is a Tier 0 system component (in Disaster Recovery planning) that has to be back up first with as few dependencies as possible. Application software is not written to system specifications, and it comes up much later, when the databases are back up and the network is stable again.
So CushyTicketRegistry was written to solve the CAS Ticket problem and pretty much nothing else. It does not require a database or any additional complex network configuration with multicast addresses and timeouts. It relies on the observation that CAS is actually a fairly small component with limited hardware demands, so a slightly less "efficient" but rock solid and dead simple approach can be used to solve the problem.
...
"Healthy" is a status of a Secondary object. Without it, when a node goes down the other nodes would try on every timer tick (every 10 seconds or so) to connect to the dead node and fetch the latest incremental file. Instead, when a file request fails, the node is marked "not healthy" and no more incrementals are fetched until a Notify indicates that the node is back up.
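The health check described above can be sketched roughly as follows. This is only an illustration, not the actual Cushy source: the class name SecondaryNodeStatus, its methods, and the use of a BooleanSupplier to stand in for the file request are all assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the "Healthy" status on a Secondary object. */
class SecondaryNodeStatus {
    private boolean healthy = true;   // assume a node is up until a request fails
    private int fetchAttempts = 0;    // counts real fetch attempts, for illustration

    /**
     * Called on every timer tick. The supplier stands in for the request
     * that fetches the latest incremental file; it returns false on failure.
     */
    synchronized void onTimerTick(BooleanSupplier fetchIncremental) {
        if (!healthy) {
            return;                   // do not keep calling a node known to be down
        }
        fetchAttempts++;
        if (!fetchIncremental.getAsBoolean()) {
            healthy = false;          // stop fetching until a Notify arrives
        }
    }

    /** Called when a Notify indicates that the node is back up. */
    synchronized void onNotify() {
        healthy = true;
    }

    synchronized boolean isHealthy() {
        return healthy;
    }

    synchronized int getFetchAttempts() {
        return fetchAttempts;
    }
}
```

In this sketch a failed request costs one attempt, after which every later tick is a cheap no-op until onNotify() is called.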
Originally Cushy was designed to restore tickets to memory as soon as the file was loaded from the other node. However, this meant that CAS spent time deserializing data from files every few seconds, day after day, while nothing was going wrong. It is necessary to get the files from the other nodes immediately because you cannot predict when a computer will crash, but the actual tickets don't need to be deserialized from the file until the node fails. So now Cushy uses Just In Time Deserialization. It holds the file on disk until the Business Logic asks for a ticket that belongs to one of the other nodes, something that should not occur unless the node owning the ticket has failed. Then Cushy deserializes the files from that node in order to find the requested ticket.
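A minimal sketch of Just In Time Deserialization might look like the following. It is not the real Cushy code: the class LazyNodeTickets, its method names, and the simplification to a single checkpoint file holding a serialized Map are assumptions for illustration (the real registry also handles incremental files).

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the tickets of one other node in the cluster. */
class LazyNodeTickets {
    private final Path checkpointFile;
    private Map<String, Serializable> tickets;  // stays null until first needed
    private int loadCount = 0;                  // deserialization count, for illustration

    LazyNodeTickets(Path checkpointFile) {
        this.checkpointFile = checkpointFile;
    }

    /** Writes a ticket map to disk, standing in for a file fetched from another node. */
    static void writeCheckpoint(Path file, HashMap<String, Serializable> data) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(data);
        }
    }

    /** A newer file has arrived: keep it on disk and drop any stale in-memory copy. */
    synchronized void fileReplaced() {
        tickets = null;
    }

    /** Deserializes the file only when a ticket from this node is actually requested. */
    synchronized Serializable getTicket(String id) throws IOException, ClassNotFoundException {
        if (tickets == null) {
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(checkpointFile))) {
                @SuppressWarnings("unchecked")
                Map<String, Serializable> loaded = (Map<String, Serializable>) in.readObject();
                tickets = loaded;
            }
            loadCount++;
        }
        return tickets.get(id);
    }

    synchronized int getLoadCount() {
        return loadCount;
    }
}
```

While the owning node is up, getTicket is never called for its tickets, so the file just sits on disk and no deserialization happens at all. The cost is paid once, on the first request after a failure.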
Security
The collection of tickets contains sensitive data. With access to the TGT ID values, a remote user could impersonate anyone currently logged in to CAS. So when checkpoint and incremental files are transferred between nodes of the cluster, we need to be sure the data is encrypted and goes only to the intended CAS servers.
...