...

The other mechanisms (JPA or the cache technologies) operate on single tickets. They write individual tickets to the database or replicate them across the network. Obviously this is vastly more efficient than periodically copying all the tickets to disk. Except that at Yale (a typical medium-sized university), the entire Registry of tickets can be written to a disk file in 1 second and it produces a file about 3 megabytes in size. That is a trivial use of modern multicore server hardware, and copying 3 megabytes of data over the network every 5 minutes, or even every minute, is a trivial use of network bandwidth. So Cushy is less efficient, but in a way that is predictable and insignificant, in exchange for code that is simple and easy to completely understand.

...

Given the small cost of making a complete checkpoint, you could configure Cushy to generate one every 10 seconds and run the cluster on full checkpoints. It is probably inefficient, but using 1 second of one core and transmitting 3 megabytes of data to each node every 10 seconds is not a big deal on modern multicore servers. This was the first Cushy code milestone and it lasted for about a day.

...

There are a few more consequences to Single Sign Out that will be explained in the next section.

...

Again, the rule that each node owns its own registry, along with all the tickets it created, and that the other nodes cannot successfully change those tickets has certain consequences.

  • If you use Single Sign Off, then the Login Ticket maintains a table of Services to which you have logged in. When you log out, or when your Login Ticket times out in the middle of the night, each Service gets a call from CAS on a published URL with the Service Ticket ID you used to log in, so the application can log you off if it has not already done so (see the sketch after this list). In fail-over mode a backup server can issue Service Tickets for a failed node's TGT, but it cannot successfully update the Service table in the TGT, because when the failed node comes back up it will restore the old Service table along with the old TGT.
  • If the user logs out and the Services are notified by the backup CAS server, and then the node that owned the TGT is restored along with the now undead copy of the obsolete TGT, then in the middle of the night that restored TGT will time out and the Services will all be notified of the logoff a second time. It seems unlikely that anyone would ever write a service logout so badly that a second logoff would be a problem. Mostly it will be ignored.
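A minimal sketch of the bookkeeping described in the first bullet, using invented names and a deliberately simplified logout payload (the real CAS classes and the logout message format differ): the Login Ticket's table maps each Service Ticket ID to the service that received it, and logout walks that table on a best-effort basis.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a Login Ticket remembers which Service Ticket ID was
// issued to which service URL, so that logout (or ticket expiration) can
// notify each service on its published logout URL.
public class SingleSignOffSketch {

    // ST id -> service URL, in the order the services were visited.
    private final Map<String, String> servicesByTicketId = new LinkedHashMap<>();

    public void recordServiceTicket(String serviceTicketId, String serviceUrl) {
        servicesByTicketId.put(serviceTicketId, serviceUrl);
    }

    // Called when the user logs out or the Login Ticket times out.
    public void notifyServicesOfLogout() {
        HttpClient client = HttpClient.newHttpClient();
        for (Map.Entry<String, String> entry : servicesByTicketId.entrySet()) {
            // "Best effort": failure to reach one service does not stop the others.
            try {
                HttpRequest request = HttpRequest.newBuilder(URI.create(entry.getValue()))
                        .POST(HttpRequest.BodyPublishers.ofString(
                                "logoutRequest=" + entry.getKey()))
                        .header("Content-Type", "application/x-www-form-urlencoded")
                        .build();
                client.send(request, HttpResponse.BodyHandlers.discarding());
            } catch (Exception e) {
                // Ignore; the service may be down or may have logged the user off already.
            }
        }
    }
}
```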

You have probably guessed by now that Yale does not use Single Sign Out, and if we ever enabled it we would only indicate that it is supported on a "best effort" basis.

...

An F5 can be configured to have "sticky" connections between a client and a server. The first time the browser connects to a service name it is assigned any available back-end server. For the next few minutes, however, subsequent requests to the same service go back to whichever server the F5 assigned to handle the first request.

Intelligent routing is based on tickets that exist only after you have logged in. CAS was designed (for better or worse) to use Spring Webflow, which keeps information in the Session object during the login process. For Web Flow to work, one of two things must happen:

...

Yale made a minor change to the CAS Web Flow to store extra data in hidden fields of the login form, and an additional check so that if the Form POSTs back to another server, that server can handle the rest of the login without requiring Session data.
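The general idea can be illustrated with a short sketch. The class and method names below are invented, and the real Yale change works inside the Web Flow rather than in standalone code; this only shows how state that would otherwise live in the Session can ride along in the form itself.

```java
import java.util.Map;

// Hypothetical illustration: instead of keeping login-flow state in the Session
// (which lives on one server), carry it across the POST in hidden form fields so
// whichever node receives the POST can finish the login.
public class HiddenFieldFlowState {

    // Render the state that would otherwise live in the Session as hidden inputs.
    public static String toHiddenFields(Map<String, String> flowState) {
        StringBuilder html = new StringBuilder();
        for (Map.Entry<String, String> e : flowState.entrySet()) {
            html.append("<input type=\"hidden\" name=\"")
                .append(e.getKey())
                .append("\" value=\"")
                .append(e.getValue())   // real code must HTML-escape and protect this value
                .append("\"/>\n");
        }
        return html.toString();
    }

    // On POST, any node can rebuild the flow state from the request parameters.
    public static Map<String, String> fromRequestParameters(Map<String, String> params) {
        return Map.copyOf(params);
    }
}
```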

...

Web applications are traditionally defined in three layers. The User Interface generates the Web pages, displays data, and processes user input. The Business Logic validates requests, verifies inventory, approves the credit card, and so on. The back-end "persistence" layer talks to a database. CAS doesn't sell anything, but it has roughly the same three layers.

...

In the simple case of a single CAS server, the tickets remain in memory and do not get written to disk or sent over the network. So CAS doesn't need any back-end services, but it defines the TicketRegistry interface and makes it possible to add database or network functionality for clustering support.
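In the single-server case the registry is little more than a thread-safe map from ticket ID to ticket. The following sketch paraphrases that layer; the method names approximate the CAS interface but are not copied from it, and the Ticket interface here is reduced to what the example needs.

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;

// Paraphrased sketch of the TicketRegistry idea, not the actual CAS source.
interface Ticket {
    String getId();
    boolean isExpired();
}

class InMemoryTicketRegistry {

    private final ConcurrentHashMap<String, Ticket> tickets = new ConcurrentHashMap<>();

    public void addTicket(Ticket ticket) {
        tickets.put(ticket.getId(), ticket);
    }

    public Ticket getTicket(String id) {
        return tickets.get(id);
    }

    public boolean deleteTicket(String id) {
        return tickets.remove(id) != null;
    }

    // A clustered registry (JPA, a cache technology, or Cushy) plugs in here and
    // adds the database or network behavior; the callers do not change.
    public Collection<Ticket> getTickets() {
        return tickets.values();
    }
}
```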

...

The collection returned by ConcurrentHashMap is not serializable, so Cushy has to copy Tickets from it to a more standard collection, and it uses this opportunity to exclude expired tickets. Then it uses a single Java writeObject statement to write the List and a copy of all the Ticket objects to a checkpoint file on disk. Internally Java does all the hard work of figuring out what objects point to other objects so it can write only one copy of each unique object. When it returns, Cushy just has to close the file.
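A simplified sketch of that checkpoint step (not the Cushy source; the Ticket interface and helper names are invented for the example): copy live tickets into a plain serializable List, skip expired ones, and let a single writeObject call serialize the whole object graph.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.concurrent.ConcurrentHashMap;

class CheckpointWriter {

    // Reduced to what the example needs; real tickets carry much more state.
    interface Ticket extends Serializable {
        boolean isExpired();
    }

    static void writeCheckpoint(ConcurrentHashMap<String, Ticket> registry,
                                String fileName) throws IOException {
        ArrayList<Ticket> snapshot = new ArrayList<>();
        for (Ticket ticket : registry.values()) {
            if (!ticket.isExpired()) {
                snapshot.add(ticket);        // copy into a standard, serializable collection
            }
        }
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new FileOutputStream(fileName))) {
            // Java follows the object references and writes each unique object once.
            out.writeObject(snapshot);
        }                                    // closing the stream closes the checkpoint file
    }
}
```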

...

There is a race condition between one node taking a full checkpoint while another node is trying to read an incremental. A new checkpoint deletes the previous incremental file. As each of the other nodes receives a Notify from this node it realizes that there is a new checkpoint and no incremental, so a flag is set and on the next timer cycle no incremental is read. However, after the checkpoint is generated and before the Notify is sent there is an opportunity for the other node to wake up, ask for the incremental file to be sent, and get back an HTTP status of FILE_NOT_FOUND.

"Healthy" is a status of a Secondary object. Without it when a node goes down then the other nodes will try every timer tick (every 10 seconds or so) to connect to the dead node and fetch the latest incremental file. When a file request fails, then the node is marked "not healthy" and no more incrementals will be fetched until a Notify indicates that the node is back up.

Security

The collection of tickets contains sensitive data. With access to the TGT ID values, a remote user could impersonate anyone currently logged in to CAS. So when checkpoint and incremental files are transferred between nodes of the cluster, we need to be sure the data is encrypted and goes only to the intended CAS servers.
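As an illustration only of the kind of protection required, and not a description of Cushy's actual mechanism, the sketch below encrypts a checkpoint or incremental file with a shared secret key before it is sent to another node.

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustration: AES-GCM encryption of checkpoint bytes with a key shared by the
// CAS nodes, so the file is useless to anyone who intercepts it in transit.
class CheckpointEncryption {

    static byte[] encrypt(byte[] checkpointBytes, byte[] sharedKey16or32Bytes) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);                 // fresh nonce per file
        SecretKey key = new SecretKeySpec(sharedKey16or32Bytes, "AES");
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] cipherText = cipher.doFinal(checkpointBytes);

        // Prepend the nonce so the receiving node can decrypt.
        byte[] out = new byte[iv.length + cipherText.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(cipherText, 0, out, iv.length, cipherText.length);
        return out;
    }
}
```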

...