...

In terms of traditional "layered" application design, the user interface of CAS is provided by Spring MVC and Spring Web Flow. This layer obtains the userid and password from the user or accepts and processes the ticket validation request from the application. In the middle, the Business Logic of CAS verifies the userid and password against some back end system, frequently Active Directory, and it generates ticket IDs and creates ticket objects.

The Ticket Registry is the back end component. It fits into the layered architecture where normal applications have a database layer, but although CAS can optionally store tickets in a database it more commonly keeps them in memory. The TicketRegistry interface is a pluggable component into which you can insert (using Spring XML configuration) any of several JASIG modules. The CushyTicketRegistry is one implementation of the interface.
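
The contract itself is small. A sketch approximating the JASIG interface of that era is shown below; exact method signatures vary by release, so treat this as an approximation rather than the definitive API.

```java
// A sketch of the TicketRegistry contract, approximating the JASIG
// interface of the CAS 3.x/4.0 era; exact signatures vary by release.
import java.util.Collection;

import org.jasig.cas.ticket.Ticket;

public interface TicketRegistry {

    // Store a newly created ticket (TGT, ST, or Proxy ticket).
    void addTicket(Ticket ticket);

    // Look a ticket up by its ID string; null if it is not in the registry.
    Ticket getTicket(String ticketId);

    // Remove a ticket, for example after ST validation or logout.
    boolean deleteTicket(String ticketId);

    // Enumerate all tickets; used by the registry cleaner.
    Collection<Ticket> getTickets();
}
```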

The JPA Ticket Registry depends on the Java ticket objects, and the objects they point to, being annotated with references to tables and with a column name and type for every object field. JPA then generates and "weaves" into your application a lot of invisible support code to track these objects and to notice when they have been modified.

The "cache" versions of JASIG TicketRegistry modules use what is informally called Plain Old Java Objects. They are generic systems that work with any object. Unfortunately, Like Cushy, they require these objects to be serializable to a bunch of portable bytes. With that restriction, they work on any type of object, but they work best with objects that are not directly connected to anything, and CAS Tickets have a structure of connections. It isn't necessary to understand this structure, but if you want to understand how things really work and verify that something is properly designed, you need to know a little bit about how Tickets work(by references) to other objects and especially not to Collections of objects.

Unfortunately, CAS tickets were not designed to serialize nicely. The problem requires a short discussion about ticket relationships.

When a CAS user logs in, CAS creates a Login TGT. The TGT has two interesting chains of objects. It has an "Authentication" object that is connected to a "Principal" that points to the Netid string. This structure is designed to support multiple forms of authentication and various forms of principal name, but in practice 99% of CAS use is based on users typing a userid and password into a Web Form and having the password validated by a back end system. To support Single Logout, the TGT also has a Service table with one entry for every Service Ticket the user obtained. The entries have the ST id string that identified the user at service login, and a pointer to a Service object with a URL that can be called to log the user out.
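
As a rough sketch of that first chain, the navigation from TGT to Netid looks like the following; the method names approximate the JASIG ticket and authentication classes, so treat the details as an approximation.

```java
import org.jasig.cas.authentication.Authentication;
import org.jasig.cas.authentication.principal.Principal;
import org.jasig.cas.ticket.TicketGrantingTicket;

// Sketch of the chain the Business Logic follows to get from a Login TGT
// to the Netid. Method names approximate the JASIG classes of that era.
public final class TgtChainSketch {

    static String netidOf(TicketGrantingTicket tgt) {
        Authentication auth = tgt.getAuthentication(); // how the user logged in
        Principal principal = auth.getPrincipal();     // who the user is
        return principal.getId();                      // in practice, the Netid string
    }
}
```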

After login, the user obtains Service Tickets to login to various Web applications. An ST is created and then used and destroyed in a matter of milliseconds, so Service Tickets are not an important long term feature of the ticket structure. However, the user can login to a "middleman" service like the Yale Portal that gets what are called Proxy Tickets so that the middleman application can access other Web applications on the user's behalf. A Proxy Ticket is essentially the same as a TGT, except that it has no Authentication itself. Instead, the Proxy Ticket has a parent TGT that contains the Authentication and therefore the Netid.

Collectively, TGT and Proxy Tickets are called "Granting Tickets" because they are used to create Service Tickets. The TGT and Proxy Tickets both have Service tables and can participate in Single Logout. Generally speaking, when the user logs out manually or when the TGT times out, the Single Logout logic runs through all the Proxy and Service Tickets connected to the TGT and contacts all the services that have registered a logout URL, then all the tickets in the tree are deleted.

This discussion of the structure and use of the tickets in the Registry explains a lot about how any Ticket Registry has to work, and what issues are likely to arise.

TGTs point to no other tickets, but they contain a table of Single Logout Service entries. Each Web application has the same Logout URL for all users, so while it is possible for the CAS Business Logic to optimize things and use a single shared Service object for all users who logged onto the same application, it is just as likely that each user will have their own identical copy of a Service object. If a TGT is lost, then the user has to login again to CAS reentering his userid and password.

Proxy Tickets are chained to the TGT. If logic wants to know the Netid, it has to follow the pointer from the Proxy to the TGT to find the Authentication. Proxy tickets represent an application, like the Yale Portal, to which the user logged in minutes or hours ago. If the Proxy Ticket is lost, this becomes a program logic problem for the middleware application. That application may be written to handle the problem correctly, or it may experience an ugly failure. Proxies are the most important type of ticket to back up in a cluster because their loss may be most visible and disruptive.

Service Tickets point to the Granting Ticket that created them, but they are created and destroyed so quickly that only a few exist in the Registry at any time and they have no long term significance for structure or behavior.

JASIG Supported Ticket Replication

CAS has a number of optional mechanisms to replicate tickets from the node that created them to all the other nodes in a cluster. Generally all these optional solutions use one of two generic technologies.

JPA is the modern J2EE database interface. To support JPA, the Tickets and the objects they point to are marked up with annotations that map each data field to a column in a database table. New tickets become new rows in the database. JPA is a very sophisticated technology, because it tracks every Java object that is mapped to a database and intercepts every change made to the object. It knows which objects have been altered and which are unmodified, and it writes the new and altered objects back to the database when the transaction completes. To do this, objects created for or from the database are automatically modified by JPA to include tracking and intercept code.
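
The kind of mapping involved looks roughly like the following illustrative entity; this is the general JPA annotation style, not the actual CAS entity definitions.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;
import javax.persistence.Table;

// Illustrative only: a JPA-style mapping in the spirit of what the JPA
// Ticket Registry does, not the actual CAS annotations.
@Entity
@Table(name = "TICKETGRANTINGTICKET")
public class TicketGrantingTicketEntity {

    @Id
    @Column(name = "ID")
    private String id;                       // the TGT-... string

    @Column(name = "CREATION_TIME")
    private long creationTime;               // used to decide expiration

    @Lob
    @Column(name = "AUTHENTICATION", length = 1000000)
    private byte[] serializedAuthentication; // nested objects stored as a blob

    protected TicketGrantingTicketEntity() {
        // JPA requires a no-arg constructor
    }
}
```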

As a result, JPA knows about every TGT, Proxy, or ST in the registry. It treats every ticket as a unique object, and it understands every pointer that connects one of these unique objects to another one. Every individual ticket in the Registry of one CAS server becomes an individual ticket in the other CAS servers. The downside is that CAS isn't really a database application and has no need for long term persistent storage of tickets. All tickets expire and are discarded after a certain number of hours, so using database technology is overkill. Furthermore, the intercepts and tracking logic are embedded magically inside the important CAS objects so if the database goes down or there is a problem connecting to the database over the network, CAS simply stops functioning.

The other JASIG TicketRegistry implementations are based on competing off-the-shelf generic object replication libraries. These alternatives (Ehcache, Memcached, JBoss cache) provide what appears to be an in memory storage for plain old Java objects. CAS puts a Ticket (and its associated Authentication and Service objects) into this "cache". Under the covers, the generic library converts the objects to a stream of bytes and transmits those bytes to the other CAS servers in the cluster where they are turned back into a set of identical Java objects.

There are two problems with these "cache replication" libraries. First, because the TicketRegistry is the replicated cache, CAS has no place to store or retrieve ticket objects without going through the code provided by the replication library. As with JPA, if the replication logic hits an internal error or a network problem, the backlog can build until it blocks ticket creation, and then nobody can log in to CAS.

There is a second issue which is not a problem but rather an undocumented feature you should understand. When the user logs into a middleman application and gets a Proxy Ticket, then when the Business Logic tries to add the Proxy Ticket to the TicketRegistry the generic Java object replication library tries to make a copy of the Proxy Ticket (which is the new object) to send to all the other nodes in the cluster. The Proxy Ticket points to the TGT which in turn points to the Authentication which points to the Principal that contains the Netid. However, generic object replication mechanisms do not understand that the TGT that the Proxy Ticket points to is an object that is already in the cache. Unlike JPA, these libraries do not have sophisticated tracking mechanisms that distinguish each individual object in the cache. So what happens is that the cache makes a copy of all the objects in the chain, then transmits this over the network, and at the other end the other CAS nodes create a new copy of the original chain and store it in memory.

At this point the node that created the Proxy Ticket has that object and its parent TGT in the cache as two objects. However, all the other nodes in the cluster have two copies of the TGT. One is the object that the business logic gets if it looks up the TGT by its ID string (the string stored in the CAS cookie on the user's browser). A duplicate object with all the same information is now chained to the copy of the Proxy Ticket. When the Business Logic asks for the Netid, it will find it because the Proxy Ticket points to a TGT object that points to all the other stuff and ends up with a Netid. However, if you look up the TGT in the cache using its ID you get one copy of the TGT, and if you look for the parent TGT by following the pointer from the Proxy Ticket you get a different TGT. The TGT chained to the Proxy Ticket is a point in time snapshot of the login information, so if the real TGT in the cache changed, this change would not be reflected in the copy of the TGT chained to the replicated Proxy Ticket. However, on the node that created the Proxy Ticket, it is chained to the real TGT and on that node any changes to the TGT are going to be visible when you follow the pointer from the Proxy.
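
The duplication is simply how Java serialization copies object graphs, and it can be demonstrated with plain streams and stand-in classes. The minimal classes below are hypothetical, standing in for real tickets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Minimal stand-ins for real tickets, just to show the duplication effect.
class FakeTgt implements Serializable { String id = "TGT-1"; }
class FakePgt implements Serializable { String id = "PGT-1"; FakeTgt parent; }

public class DuplicateTgtDemo {
    public static void main(String[] args) throws Exception {
        Map<String, Object> registry = new HashMap<>();
        FakeTgt tgt = new FakeTgt();
        FakePgt pgt = new FakePgt();
        pgt.parent = tgt;
        registry.put(tgt.id, tgt);                 // the TGT already in the cache

        // Replicate the new PGT by itself, the way a per-ticket cache does.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(pgt);
        out.close();
        FakePgt received = (FakePgt) new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray())).readObject();

        // The received PGT drags along its own private copy of the TGT,
        // distinct from the TGT stored under its ID string.
        System.out.println(received.parent == registry.get("TGT-1")); // false
        System.out.println(received.parent.id);                       // TGT-1 (a detached copy)
    }
}
```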

This uncertainty is not important because in current CAS logic the TGT doesn't change meaningfully after it is created. If it becomes important, the structure of Tickets and the TicketRegistry interface are going to have to change so that Tickets are connected not by direct references from one object to another but by storing the TGT ID string in the Proxy Ticket and then using the string to lookup the parent TGT in the Registry instead of just following a pointer. That would be something for a later version of CAS. Right now, ending up with two copies of the TGT on all the other backup nodes is just a feature that we do not worry about since it does not upset the business logic.

CushyTicketRegistry works differently from either JPA or the standard object replication cache mechanisms. A CAS ticket cache has a moderate number of relatively small objects that are relatively short lived. It turns out to be entirely reasonable to periodically (every 5 to 15 minutes) make a complete snapshot of the entire Ticket Registry and transmit it to the other nodes. In between full snapshots changes to the ticket registry can be captured and transmitted as smaller updates.

Operating on the entire cache instead of trying to track individual tickets makes the logic vastly simpler and makes the system more robust. If the replication system fails entirely, CAS continues to store tickets locally in memory, although during the failure the tickets are not replicated. Replication becomes an operation that either succeeds or fails as a whole: when it fails, it simply times out, cleans up completely after itself, and retries several minutes later. Replication can therefore stop at a network failure and restart when the network is restored.

There is another nice trick. The TicketRegistry is first written to a disk file, and then the disk file is copied across the network. With conventional cache replication, each change has to be queued in memory until it can be transmitted across the network, and it is the backup in those queues that causes the failures. With CushyTicketRegistry, the complex logic ends when the file is written to local disk, which does not depend on any type of networking. Problems communicating between nodes mean that the copy of the data in the disk file becomes stale on the other nodes, but the local data continues to update and replace the local disk copy. Then, when connectivity is restored, the nodes simply exchange the latest copy of the disk files and they are back in synchronization again.
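
The receiving side is just deserialization of the fetched file. A rough sketch, with illustrative names rather than Cushy's actual API, assuming the checkpoint file contains a serialized list of tickets:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.jasig.cas.ticket.Ticket;

// Illustrative restore step: deserialize the ticket list another node wrote
// to its checkpoint file and index it by ticket ID, so those tickets can be
// found if that node fails. Names here are not the real Cushy methods.
public class CheckpointRestoreSketch {

    @SuppressWarnings("unchecked")
    public static Map<String, Ticket> readCheckpoint(String fileName)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(fileName))) {
            List<Ticket> tickets = (List<Ticket>) in.readObject();
            Map<String, Ticket> byId = new HashMap<>();
            for (Ticket t : tickets) {
                byId.put(t.getId(), t);
            }
            return byId;
        }
    }
}
```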

Failure Modes of Conventional Replication

JPA

JPA is an amazing technology. It handles the mapping between annotated Java objects and database tables. It generates tables if they do not already exist. It generates proxy objects with hidden intercept code that detects when an object has been modified so JPA can maintain a managed set of modified and unmodified objects and write back only the changes when a transaction commits.

Of course, a real database application like an order entry system has nothing to do when the database is down. If you cannot update the database, you cannot process orders. So the database is an intrinsic single point of failure, and JPA is not designed to allow the database application to continue to function temporarily without the database.

CAS works fine on a single server with no database or replication at all. The purpose of JPA ticket replication is not to store tickets on disk and certainly not to make tickets available for SQL queries. JPA provides a very precise method of sharing tickets across multiple CAS nodes, particularly in the event of a node failure. Unfortunately, it introduces the database as a new single point of failure which takes CAS down completely if it stops working.

Because JPA weaves itself into the core CAS business objects, there is no way to extract CAS from JPA if it fails. JPA may be too sophisticated and powerful. A better solution keeps the ticket replication separated from the main CAS business logic flow to allow CAS to continue to run if the replication has problems.

Serialize and Send

All the other mechanisms (Ehcache, JBoss Cache, Memcached, and this new CushyTicketRegistry) depend on Java "Serialization". Serialization turns a Java object in memory into a stream of bytes. The bytes can be written to disk, or they can be sent to another JVM where they can be "de-serialized" to create identical Java objects. Serialization has to process not only the object you provide but also all the objects to which it is connected. For example, a TGT would be no good without its Authentication object, which in turn needs its Principal object, because without them the TGT doesn't have the user's Netid.

Ehcache, JBoss Cache, and Memcached all try to serialize and transmit through the network each new or changed ticket. However, because of the ticket structure described above, any attempt to serialize a Service Ticket or a Proxy Ticket also ends up serializing a copy of the TGT to which they are connected by a pointer, and that in turn points to the Authentication and Principal and the Services in the service table. On the receiving end, these cache solutions create new copies of all these objects connected only to the Service Ticket, but it turns out that CAS functions properly whether it is working with the "real" ticket objects or the private copy of objects connected only to the copy of the Service Ticket.

The real problem is that these off the shelf caching mechanisms are designed to replicate objects that are not connected to any other object. That is their original design. CAS Ticket objects do not conform to the design restrictions that guarantee reliable performance when you use one of these cache solutions.

As it converts objects to bytes, Serialization has to iterate through any collection it encounters. Some Java Collection objects are not thread safe. Problems occur when one thread (doing serialization) gets an iterator and while it is processing the collection another thread adds or deletes an object in the same collection. When the iterator gets to the part of the collection that was updated, it can throw an exception. Worse, the serialization can generate bad data without reporting a problem.

Fortunately, the only collection inside a Ticket is the collection of Services in Granting Tickets that is used for Single SignOut. If you do not use Single SignOut you can disable updates to this collection. Even if you do use it updates are so infrequent that the probability of conflict is low, but it is still possible unless you add synchronization logic.

Yale does not use Single SignOut. If you want to solve this problem by adding the required synchronization, then the main advantage of CushyTicketRegistry is that all the code is visible in one place, so you control everything and can add the logic you need. With an off the shelf object cache library, you have to deal with a large body of code and making custom changes is very hard, plus it defeats the purpose of using off the shelf libraries in the first place.

The Requirements

We assume the existence of a front end that maintains sticky sessions between the browser and the CAS instance to which the user logged in, and which routes ST validation requests to the node that generated the Service Ticket.

We are willing to allow a handful of users to experience a temporary glitch in the event of a CAS node failure. In particular, Service Ticket validation may fail if a CAS node generates the ST and then crashes in the milliseconds before the ST is validated, and a login may be lost if the node that logs the user in crashes within seconds of the login.

The primary purpose of Lazy Ticket Replication is to replicate Granting Tickets (Login and Proxy). However, in a typical CAS Ticket Cache 99.9% of tickets at any time are Granting Tickets (Service Tickets are created and then almost immediately deleted) so Granting Tickets are pretty much the entire cache.

Since CAS nodes today exist on Virtual Machines that are fairly quickly moved to another host, nodes should come back up fairly quickly after a failure. The design has to consider rapid node recovery as well as node failure.

Usage Pattern

A user starts logging on to CAS in the morning when he accesses the first CAS-aware application.

In reality, users almost never log out of CAS. However, if the user closes his browser he may abandon a CAS TGT and then when he reopens the browser he may log in again with a new ticket.

At the end of the day the user walks away without logging off.

So what happens is that the number of CAS tickets starts to grow each morning when users get up or arrive at work. They grow throughout the day and exceed the number of actual people because of tickets abandoned when the browser closes.

At some point you time out the first logon ticket of the day. From that point forward, you may lose as many old expired tickets as you get new tickets from new logons.

At some point users go to sleep or have better things to do. Tickets continue to time out and get deleted, so the number of tickets drops overnight to a low point when the first user wakes up and the process starts over again.

As a result the Ticket Registry is typically not a steady state component. During the day most of the tickets that were created earlier remain in the registry and new tickets are added hour by hour. At night old tickets expire hour by hour and the registry shrinks.

This suggests that the Ticket Registry mechanism should be designed to expect that after any configured period of use, most of the tickets that were there at the beginning of the period are probably still there, that the ones that aren't there are much more likely to have expired (which you can determine from the timestamp) than to have been manually deleted (by explicit CAS logoff), and that there is a block of incremental new tickets accumulated during the period.

This is a fairly predictable pattern that could, over time, produce increasingly efficient and highly optimized replication strategies. The current CushyTicketRegistry was written in about 3 days of coding and is good enough to be usable as is. As the following section will show, the problem is not really all that big.

Some Metrics

At Yale there are typically more than 10,000 and fewer than 20,000 Login tickets. Because Service Tickets expire when validated and after a short timeout, there are only several dozen unexpired Service Tickets at any given time.

Java can serialize a collection of 20,000 Login tickets to disk in less than a second (one core of a Sandy Bridge processor). Since this occurs in a separate thread that does not need to synchronize with other threads during the serialization, if CAS runs with more than one CPU core then the serialization does not interfere with the primary CAS function.

For a larger organization, serializing 200,000 tickets takes 9 seconds of a single core. Remember that each object serialized has to be deserialized, so double that CPU figure. Then remember that abandoned objects have to be garbage collected. Still, these numbers suggest that even large organizations could afford to do a complete serialization of Ticket Registries every 5 minutes.

However, CushyTicketRegistry also has an incremental (actually a single differential) operation where tickets added or deleted since the last full backup of the registry are serialized to a much smaller file. Generally speaking, the added tickets are those created for people who have done a new CAS login since the last full serialization, and the deleted ticket ID list covers the Service Tickets that were created, validated, and deleted. The incremental file can generally be serialized in a fifth of a second, but you can measure it yourself for any given workload pattern.
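
One way to measure it is a rough timing harness like the one below. This is illustrative code, not part of Cushy; substitute a copy of your real ticket collection for the dummy objects to measure your own workload.

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Rough timing harness: serialize N dummy serializable objects and report
// the elapsed time and the size of the resulting blob.
public class SerializationTiming {

    static class DummyTicket implements Serializable {
        String id;
        long created = System.currentTimeMillis();
        DummyTicket(String id) { this.id = id; }
    }

    public static void main(String[] args) throws Exception {
        int count = 20_000;
        List<DummyTicket> tickets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            tickets.add(new DummyTicket("TGT-" + i));
        }

        long start = System.nanoTime();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(tickets);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(count + " objects, " + bytes.size()
                + " bytes, " + elapsedMs + " ms");
    }
}
```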

CAS is subject to a type of Denial of Service attack where someone with a userid and password logs in again and again, filling the Ticket Registry with an arbitrary number of TGTs. The solution to this attack is to keep a counter of the number of TGTs in the Registry associated with the same Principal name and stop adding TGTs when a threshold is hit. However, for basic protection CushyTicketRegistry needs a DOSCircuitBraker ticket count, set far higher than anything normal processing will ever reach, that stops replication when memory is filling up with tickets and any additional processing would only add to the problem. Currently the breaker is set at half a million tickets, far more than the 20,000 of a normal day.
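
The guard itself is trivial. The sketch below shows the idea with illustrative names and thresholds; the real Cushy configuration may differ.

```java
// Sketch of the protections described above. The names and the threshold
// are illustrative; only the overall idea matches the text.
public class DosProtectionSketch {

    // Far above anything normal processing will reach (Yale peaks near 20,000).
    private static final int CIRCUIT_BREAKER_TICKET_COUNT = 500_000;

    // Basic protection: stop replicating when the registry has grown to a
    // size that can only mean something is wrong.
    public boolean replicationAllowed(int totalTickets) {
        return totalTickets < CIRCUIT_BREAKER_TICKET_COUNT;
    }

    // Fuller protection would also count TGTs per Principal name and refuse
    // to create more once a per-user threshold is hit.
    public boolean loginAllowed(java.util.Map<String, Integer> tgtCountByNetid,
                                String netid, int perUserLimit) {
        return tgtCountByNetid.getOrDefault(netid, 0) < perUserLimit;
    }
}
```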

CushyTicketRegistry

CushyTicketRegistry is a relatively small class that does all the work. To understand what it has to deal with, start with the Login TGT that CAS creates when a user logs in; the TGT holds the Authentication and Principal that contain the user's Netid. To support Single SignOut, it also contains a table of used Service Ticket IDs and references to the Service object that contains the URL CAS contacts to tell the Service that the user has logged out.

When the user authenticates to other applications, the short lived Service Ticket points to the Login TGT. When the ST is validated, CAS follows the pointer from the ST to the TGT and then to the Authentication to the Principal to the Netid string, which it returns as part of the validation message.

A special kind of Service is allowed to Proxy. Such a service gets its own Proxy Granting Ticket which acts like a TGT in the sense that it generates Service Tickets and the ST points back to it, but a PGT does not have the Netid itself. Rather it points back to the real TGT that contains the Authentication data. So when validating a proxy ST, CAS follows the pointer in the ST to the PGT, then follows the pointer from the PGT to the Login TGT and finds the Netid there.
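
In code, that walk up the chain looks roughly like the sketch below; the method names approximate the JASIG ticket interfaces, and the real validation code differs in detail.

```java
import org.jasig.cas.ticket.ServiceTicket;
import org.jasig.cas.ticket.TicketGrantingTicket;

// Sketch of the chain walk described above: climb from the ST through any
// Proxy Granting Tickets to the root Login TGT, which holds the Netid.
public final class ProxyChainSketch {

    static String netidForValidation(ServiceTicket st) {
        TicketGrantingTicket granter = st.getGrantingTicket(); // PGT or Login TGT
        while (granter.getGrantingTicket() != null) {
            granter = granter.getGrantingTicket();             // climb toward the root Login TGT
        }
        return granter.getAuthentication().getPrincipal().getId(); // the Netid
    }
}
```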

Unfortunately, neither this chain of tickets nor the Single SignOut Service table is the sort of thing you want to have when you use a standard cache mechanism to replicate Plain Old Java Objects. The cache will replicate them, but with consequences that may or may not be important depending on how you use them.

When Cushy does a full checkpoint of all the tickets, it doesn't matter how the tickets are chained together. Under the covers of the writeObject statement, Java does all the work of following the chains and understanding the structure, then it writes out a blob of bytes that will recreate the exact same structure.

The potential for problems arises when you try to serialize a single ticket, as Cushy does during incrementals and as the standard JASIG cache solutions do all the time for all the tickets.

Serialization makes a copy not just of the object you are trying to serialize, but also of all the objects that it points to directly or indirectly.

If you serialize a TGT, then you also make a copy of the Authentication, the Principal, the Netid string. But then you also generate a copy of the Single SignOut service table and, therefore, all the Services.

If you serialize a ST you get all its data, but because it points to a TGT you also get a copy of the TGT and all its stuff.

Service Tickets last for such a short period of time that it really doesn't matter how they serialize. Proxy Granting Tickets, however, also point to a TGT and therefore they serialize and generate a copy of the TGT they point to, and they live for a very long time.

Now it is possible to describe two problems:

When you serialize a collection, Java must internally obtain an iterator and step one by one through the objects in the collection. Unless the collection is inherently thread safe, serialization can break if another thread adds or deletes elements while the iteration is in progress. For example, suppose someone is logged into CAS and presses the browser's "Open All In Tabs" button to create several tabs simultaneously. This can launch two CAS logon requests at the same time for the same user (and therefore the same TGT). One request is handled first: it creates a Service Ticket and adds an entry to the Service table of the TGT. If you use one of the JASIG "cache" solutions, each of them will try to serialize the ST to copy it to the other nodes, and to do that they also have to make a copy of the TGT. Meanwhile, the other tab has generated a request under another thread that is creating a second Service Ticket. Following the same process, it will at some point try to add a new Service to the Service table in the TGT, and every so often this will collide with Java Serialization trying to iterate through the objects in that table to turn them into a blob of bytes. The second thread's addition can, on occasion, invalidate the iterator and cause the serialization operation to throw an exception.

If you do not use Single SignOut you can, like Yale, simply disable use of the Service table in the TGT. Otherwise you might solve the problem by editing TicketGrantingTicketImpl.java and changing the type of "services" from HashMap, which isn't thread safe, to Hashtable, which is.
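
The essential difference is between a map that is unsynchronized and one that locks during updates and during its own serialization. The sketch below shows the choice with an illustrative field; it is not a patch to the real JASIG source.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

import org.jasig.cas.authentication.principal.Service;

// Illustrative only: the kind of field change described above for
// TicketGrantingTicketImpl, not an actual patch to the JASIG source.
public class ServicesFieldSketch {

    // Before: HashMap is not safe when one thread adds a Service while
    // another thread (serialization) iterates over the entries.
    private final Map<String, Service> servicesUnsafe = new HashMap<String, Service>();

    // After: Hashtable synchronizes its operations and serializes under its
    // own lock, so a concurrent add cannot corrupt the serialized snapshot.
    private final Map<String, Service> services = new Hashtable<String, Service>();

    // An equivalent alternative using the Collections wrapper.
    private final Map<String, Service> servicesWrapped =
            Collections.synchronizedMap(new HashMap<String, Service>());
}
```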

The second problem is a "feature" that you should understand. A Ticket Registry stores ticket objects and retrieves them using their ticket Id string as a key. So if you go to the Ticket Registry with the ID of a TGT you get a TGT object. If you create a new Service Ticket using that TGT, then the ST points to that TGT object and both end up stored in the Registry on the node that created them.

Now if you are using one of the JASIG cache mechanisms or during the period of time while Cushy is generating incremental files instead of a full checkpoint, any attempt to serialize the Service Ticket as a single object produces a blob of bytes that when it is turned back into objects on any other node produces an ST pointing to a copy of the original TGT on the other node.

This is not a problem because the CAS business logic only cares about the data in the TGT, it doesn't care whether it has the official TGT object or an exact copy of it. Besides, Service Tickets are validated or time out and they are gone a few seconds after they are created, so this is never going to be a long term problem.

Except for Proxy Granting Tickets that also point to a TGT. They continue to exist for hours, and if the PGT is serialized and replicated to another node it will, on that node, have its own private copy of the TGT that is a point in time snapshot of what was in the TGT at the time the proxy was created.

Cushy solves this problem as soon as the next full checkpoint is written and restored. By serializing the entire collection of tickets on the node that owns them, Cushy creates an exact duplicate object structure on all the other nodes. As for the other cache solutions, the duplicate copy persists, but today the information in a TGT does not change in any way that could affect the processing of a Proxy ticket. Some time down the line, when we start to use multifactor authentication and begin to add additional authentications dynamically to an existing logged on user, it may become important for a PGT to reference the current TGT with all its current information rather than a copy of the TGT with only the information it had when the user first connected to the middleware application.

Usage Pattern

Users start logging into CAS at the start of the business day. The number of TGTs begins to grow.

Users seldom log out of CAS, so TGTs typically time out instead of being explicitly deleted.

Users abandon a TGT when they close the browser. They then get a new TGT and cookie when they open a new browser window.

Therefore, the number of TGTs can be much larger than the number of real CAS users. It is a count of browser windows and not of people or machines.

At Yale around 3 PM a typical set of statistics is:

Unexpired-TGTs: 13821
Unexpired-STs: 12
Expired TGTs: 30
Expired STs: 11

So you see that a Ticket Registry is overwhelmingly a place to keep TGTs.

After work, and then again after the students go to sleep, the TGTs from earlier in the day start to time out and the Registry Cleaner deletes them.

So generally the pattern is a slow growth of TGTs while people are using the network application, followed by a slow reduction of tickets while they are asleep, with a minimum probably reached each morning before 8 AM.

If you display CAS statistics periodically during the day you will see a regular pattern and a typical maximum number of tickets in use "late in the day".

Translated to Cushy, the cost of the full checkpoint and the size of the checkpoint file grow over time along with the number of active tickets, and then the file shrinks over night. During any period of intense login activity the incremental file may be unusually large. The worst possible configuration of Cushy would be to generate a checkpoint at 8 AM when the number of tickets is at a minimum, and then to have a period of hours between checkpoints when you might get a lot of activity from people waking up and arriving at work that loads up the incremental file with more and more stuff.

Some Metrics

Java can serialize a collection of 20,000 Login tickets to disk in less than a second (one core of a Sandy Bridge processor). Cushy has to block normal CAS processing only long enough to get a list of references to all the tickets; all the rest of the work occurs in a separate thread and does not interfere with any CAS operation.

Of course, Cushy also has to deserialize tickets from the other nodes. However, remember that if you are currently using any other Ticket Registry the number of tickets reported out in the statistics page is the total number combined across all nodes, while Cushy serializes only the tickets that the current node owns and it deserializes the tickets for the other nodes. So generally you can apply the 20K tickets = 1 second rule of thumb to estimate the overhead of converting to Cushy and the number does seem to scale. Serializing 200,000 tickets takes 9 seconds.

Incrementals are trivial (.1 to .2 seconds).

CushyTicketRegistry (the code)

CushyTicketRegistry is a medium sized Java class that does all the work. It began with the standard JASIG DefaultTicketRegistry code that stores the tickets in memory (in a ConcurrentHashMap). Then on top of that base, it adds code to serialize tickets to disk and to transfer the disk files between nodes using HTTP.
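
A skeleton of that arrangement is sketched below with illustrative names; the real class also handles incrementals, restores, and the HTTP exchange of files between nodes.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;

import org.jasig.cas.ticket.Ticket;

// Skeleton in the spirit of CushyTicketRegistry: DefaultTicketRegistry-style
// in-memory storage plus a full checkpoint written to a local disk file.
// Names are illustrative and much is omitted.
public class InMemoryCheckpointRegistry {

    private final ConcurrentHashMap<String, Ticket> tickets =
            new ConcurrentHashMap<String, Ticket>();

    public void addTicket(Ticket ticket) {
        tickets.put(ticket.getId(), ticket);
    }

    public Ticket getTicket(String ticketId) {
        return tickets.get(ticketId);
    }

    public boolean deleteTicket(String ticketId) {
        return tickets.remove(ticketId) != null;
    }

    public Collection<Ticket> getTickets() {
        return tickets.values();
    }

    // Grab a point-in-time list of this node's tickets (the only step that
    // touches the live map), then serialize it to a local file that the
    // other nodes can fetch over HTTP.
    public void checkpoint(String fileName) throws IOException {
        ArrayList<Ticket> snapshot = new ArrayList<Ticket>(tickets.values());
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(snapshot);
        }
    }
}
```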

...