Sandbox VM

A Sandbox provides a pre-configured development environment that you can run on your laptop computer. The VM version provides a Linux system on which to test the software you are developing in an environment close to what is used in production. Although the Sandbox was originally created for CAS, it can also be used for Shibboleth, Netid, or any other Java Web application. The default user is, however, named "casdev".

If you do not use the Sandbox, you can eventually stumble by trial and error to a workable development environment of your own. Along the way you will have to solve many problems. Does your Java have the unlimited cryptography files and does it have the Yale Certificate Authority files installed? Do you have both a current Maven and the old Maven 2.2.1 that Jenkins runs? Do you have Oracle and SQL Server database drivers installed in JBoss? How do you get JBoss logs to go to /var/log/jbossas/standalone instead of the log directory inside JBoss itself? None of these is a big problem, but these are just randomly chosen examples of the hundreds of small configuration issues you have to address.

The Sandbox is a friendly environment for developers to edit, compile, build, and debug the application. It is specifically not designed to duplicate the runtime configuration of production. For that you have DEV and TEST.

That's not how I do it!

If you take a strong stand that programmers should all do something in one common way, you almost guarantee that your efforts will be subverted by people who believe that your choice is wrong. The Sandbox is not a single choice but rather a family of almost identically configured mix and match components that almost anyone can live with.

At Yale, the production environments are all "server" configurations with no user interface except the command line. While some diehards believe that the original terminal interface is the best experience, most programmers today expect a graphical desktop environment and an IDE. If you have to add Gnome or KDE, and production doesn't depend on either, then the Sandbox concept doesn't care which desktop you use.

The production systems install JBoss using RPMs that are unavailable to us and in a scattered collection of directories specifically designed for starting processes at boot time and running them continuously and non-interactively. A development environment has to debug JBoss, which means it has to be installed where the JBoss Tools in Eclipse expect it to be and to start and stop under user control. However, once you are no longer wedded to installing things in system directories, then the development environment no longer really cares if the distribution is CentOS, Fedora, or Ubuntu.

So if you have a particular distribution you prefer, or you like to be leading edge, or you prefer a specific desktop (Gnome, KDE, ...), then the Sandbox can be configured the way you like it.

Thanks to Java, you can develop CAS on Windows, then run it on Linux without a problem. Therefore the choice of the OS is only an issue if the application requires a particular library or interface that isn't available on specific systems. GSSAPI, for example, behaves differently on Windows and Linux even though it is a standard Java API. If you are testing things that depend on the system like clustering and failover, then using Linux for that testing is the best policy.

A separate copy of the Sandbox may have been archived by the last person who worked with a particular application. If you pick it up and use it as is, then you can get right into development. If you only have a few changes to make, you can be done before you would have had a chance to make any system configurations.

However, if you feel strongly about KDE or Gnome or the version of Eclipse that was released last week, then an alternative is to keep your own personal optimized system disk. Then you only swap the SandboxData disk that contains the application, workspace, and JBoss. Because every SandboxData disk contains the same directories in the same places, the OS disk can have symlinks to the various components that remain valid even when you change disks. It is conventions like this that make the Sandbox a concept and strategy rather than a single image.

Different Linux distributions put the same component in different directory locations. There is no single correct place to install a program, and if you decide to use your own customized environment to perform the Sandbox development function you may have already installed some of the same programs in the location you prefer. The only safe thing is for the Sandbox to provide all its own copies of every program on which it depends and to put all this stuff in a separate location that cannot possibly conflict with any default "system" version of the same code. So the Sandbox is packaged as a separate small self contained virtual disk drive that you mount as an additional disk on a base operating system.

In the operating system servers are configured in /etc or the registry, but in the Sandbox they are configured in Eclipse. In the system they are started at boot as services, but in the Sandbox they are started and debugged by Eclipse. There would be a conflict if the real operating system starts the standard version of a server up on a standard port and the Sandbox disk is configured to use the same port. Then you have to change either the system or the sandbox to address the conflict. Otherwise the Sandbox does not know or care what version of Java or Eclipse or JBoss you have configured in the system because the Sandbox starts its own copy with the right version from its own directories.

If you need to develop two different Yale applications at the same time (CAS and Netid depend on each other for password change functions) then each may use a different version of Java and different Application Servers (CAS uses JBoss and Netid uses Tomcat) A Sandbox configured for this development may have two Java Runtimes and both JBoss and Tomcat in side by side subdirectories of the same parent directory. However, the basic directory structure of all Sandbox disks is the same and once you learn it you know where JBoss is going to be stored even if, from one application to another, it may be a different version number of JBoss in each Sandbox.

No application ever depends on the Eclipse that built it, so if the Sandbox contains Eclipse Luna (4.4) and you want to upgrade it to Mars (4.5), then that should always work.

In production JBoss runs under its own userid (typically "jboss" or "jbossa") that owns its own directories. In the Sandbox all the files used by Eclipse, the applications, the application servers, or anything else unrelated to the system are owned by the user "casdev" who is Linux user number 1001. This means the Sandbox cannot debug disk permission problems that will show up only in DEV. You can wait until you get to DEV to find and fix them. Since the application runs under JBoss that runs under Eclipse that is running as an interactive application, that whole process tree has to run as the logged in user.

It would be great if there was only one set of Sandbox files for all operating systems. Unfortunately, Eclipse tends to automatically insert a disk letter when it configures the path to JAR libraries on Windows. So an Eclipse Workspace has to be reconfigured when you move it between Linux and Windows. In addition, executable programs like java and javac are different on different systems (and different between 32 bit and 64 bit versions of the same system). So the Sandbox has identical directory structures and equivalent configurations, but there is generally a Windows version separate from the Java version.

Typically applications are debugged without SSL even if they run SSL in production.

Some applications have a custom log4j.xml or Spring configuration XML that references a cache directory or a file path explicitly. In this one type of case it is probably better to arrange the operating system and Sandbox so that the same file path can be used in development and production. Therefore, no matter what version of the OS you use, you will need to make "/var/log/jbossas/standalone" (or R:\var\log\jbossas\standalone in Windows) writable and typically owned by the casdev userid.

Sandbox is not Production

The Sandbox does not have to run at maximum efficiency. It will not be up 24x7 for months, so rock solid reliability and stability are not a requirement. Therefore, you have more flexibility choosing a VM host, Guest OS, Java runtime, and version of JBoss.

For example, while a particular application will be designed to run in a particular version of Java (1.6, 1.7, or 1.8) it doesn't matter for development if you are using OpenJDK or the Sun JVM. At various times in the past, the OpenJDK has a bad reputation for slow memory leaks that accumulate in applications that run for weeks without being restarted, but that is not a problem for code you debug and restart every few minutes. Developers frequently prefer being on the latest and greatest version of Fedora rather than running an older Linux Kernel in Centos just because that is what Red Hat supports for production.

To fully exercise every part of an application you may have to install optional libraries in the system. For example, the Netid application uses the Kerberos 5 administration libraries that would have to be installed with yum or apt-get. However, unless this additional function is an essential part of your debugging, then you open a big can of worms trying to get the Sandbox to support this sort of function. You will never get the Netid application to actually do Kerberos 5 administration in the Sandbox because it turns out that the Java native code that provides that interface only works in 32 bit mode and only on an older version of the library. It is much better to do what Netid does and have a install parameter "kadm5_activation=false" that turns all Kerberos administration calls into noops since you are almost never actually testing that particular function of the application.

Two "Disks"

The core of the Sandbox is a separate disk image (named "SandboxData") that contains the Java, Eclipse, JBoss, Maven, database drivers, and so on. This disk can be cloned or the VM manager (VirtualBox) can create a fixed base disk shared by multiple machines with separate difference files that hold the changes each VM makes.

The last programmer to work on an application will leave a VM with a particular version of a particular OS that was last used in development. Try to live with it. If you can't, then you can switch to another OS Sandbox disk, but then you will have to connect it back to the SandboxData disk that contains the development files.

The SandboxData disk files are owned by user 1001 group 1001. The operating system you boot should have configured this user number to be "casdev" for sanity sake.

This is a three step process in Linux:

You have to configure the VM (the *.vbox XML file) using the Virtualbox Manager to have two SATA disks, one for the OS and one for the SandboxData.
You then have to boot the OS and configure it to mount the SandboxData disk and its file system as a directory in a well known path. Search for the "Disks" GNOME utility which is an easy to use way of changing the /etc/fstab file. The smaller "VBOX HARDDISK" that is device /dev/sdb1 should be configured to mount at boot time. Edit Mount Options and curiously turn "Automatic Mount Options" OFF and then type in a mount point. This will create a line in fstab of the form "/dev/disk/by-uuid/6a61a74b-df34-4038-bc70-ee6cf02d5cf0 /media/SandboxData auto ..."
It is probably not a best practice to choose the system mount point as the explicit path used to configure something in Eclipse. So a simpler directory should be symlinked to the mount point and that should be used in application configuration.

There are several ways to mount disks in Windows and Linux. The choices all derive from two ambiguities:

Linux and Windows both support multiple users, but in practice only one person typically logs on. Still the system has provision for there to be another user and that has an effect on disk mounts.
Some disks are transient (USB flash drives) and some are permanent (a SATA hard disk installed inside the computer).

Since Linux cannot automatically decide if a disk belongs to one user or is supposed to be shared, and whether it can be dismounted and replaced or is mounted permanently, it is up to the users and administrators to configure things appropriately. Generally Linux will default in favor of the single user transient model (the USB flash drive) unless it is told otherwise.

While it is possible to switch from one software project to another by keeping the same OS disk and changing SandboxData disks, it is unreasonable to expect to do this without rebooting. That is, we do not intend to configure the Sandbox VMs to believe that SandboxData is a USB disk that you can eject and replace with another disk on a running system.

In Fedora, if user "casdev" mounts a disk then the mount point defaults to being something in the transient "/run" directory that is tied to the userid that mounted it. Specifically, the default location is /run/media/casdev/SandboxData. Ubuntu, however, uses a somewhat older standard where mounted CD drives were associated with /media/cdrom and so it wants to mount the same disk as /media/SandboxData. It is difficult to change the defaults of the system, and for this purpose it is unnecessary. So we adopt the Ubuntu default as the configured behavior on all Linux systems.

However, having decided that the mount point of the disk will be /media/SandboxData, that does not mean that we should use that path in Eclipse configuration windows. The option is then to symlink something simple like "/SandboxData" or "/home/casdev/SandboxData" (a.k.a. ~/SandboxData) to the mount point.

In Windows, mounting a disk typically generates a disk letter. The convention for the Sandbox is to use letter R: simply because it is unlikely to be used by anyone for something else. However, there are several ways to create a disk letter in Windows. In some cases the mapping is visible only to the user who does it, while in other cases the disk letter is global and is seen by all users on the machine. So Windows has some of the same problems as Linux, but the system details are different.

There is one more problem to mention. When a disk is initialized, it is given a unique identifier (a UUID) written in the disk label. This allows external disks to be uniquely identified no matter which USB port you plug them into. It also helps identify internal hard disks when the SATA cables become disconnected and are then reconnected to different ports. The Linux /etc/fstab file and the Windows Registry identify disks by the UUID and not by their label. So if you reconfigure a VM replacing one SandboxData disk with another, then when you boot it up again the system will know it is a different disk and will not automatically mount it. You have to use the Disks tool to add an additional line in fstab to mount the new UUID.

Obviously, although this discussion has assumed that the SandboxData files are all collected on a mountable hard disk image (a best practice), nothing prevents you from creating a Sandbox configuration where the files that would be on the second hard disk are instead copied or unzipped into a directory tree on disk. This is particularly attractive if you have decided to promote your sandbox development to the native operating system on your laptop, and now that you are not using a VM it is harder to mount a virtual disk file.

Installation

The Sandbox conventions do not depend on the software that creates and manages the virtual machines. Since this is not a production environment, the best choice is the one that is easiest to use. VirtualBox is free, can be installed on Windows, Linux, or Mac, and has convenience features for the developer. If you were creating virtual machines that you would immediately deploy into a IaaS cloud environment, then VMWare Workstation might be the best choice when developing for a VMWare fabric, and Hyper-V might be the best choice for a Microsoft fabric. The Sandbox, however, isn't going anywhere. The end result of all your work is source checked into Subversion that will get built and installed with Jenkins.

However, it is not a good idea to mix hypervisors on the same machine. They get into conflicts over the CPU VM support (VT-x, Nested Paging, etc.). So while you would never install VMWare or Hyper-V just in order to run the Sandbox, if you already have them installed for other projects then it is not necessary to rip them out and replace them with VirtualBox. Running a Sandbox OS disk on VMWare or Microsoft requires changing the hard disk format, creating a VM configuration (number of processors, size of memory, etc.), and then after the OS boots up you have to install the vendor specific Guest Support services and drivers. If you do not know how to do this, ask someone for help.

However, it turns out that VirtualBox is generally regarded as the best tool to convert one virtual hard disk format to another (VDI, VMDK, VHD). So even if you are not going to use it to run virtual machines, you may need to install it somewhere to do the conversion.

Download and install the current version of Virtualbox from virtualbox.org. Run the installer. Launch the VirtualBox Manager.

Now you have to install a Sandbox VM. This turns out to be a bit more complicated than just copying some files from disk.

The virtual machine definition (the *.vbox file) is an XML text file. It is not hard to read, but it has two types of problems:

For some reason, VirtualBox has decided to remember every DVD image you mounted in the virtual DVD drive. Worse, these images are remembered with a fully qualified path name

   <Image uuid="{9d9e4ed4-faaf-4f84-be68-271462e7c756}" location="D:/tpshare/ISO/Fedora-Live-Workstation-x86_64-22-3.iso"/>

and the virtual machine will not start if you move this configuration to another machine in which the DVD file image is not in the exact same location. Since there was really no reason to remember the iso file in the first place, the solution is to delete all these lines and then the machine will start.

Sometimes the virtual machine has to be associated with a particular device on the real machine. Your computer may have more than one Audio adapter, and the virtual audio of the VM has to be associated with one of the real audio drivers on your computer. Similarly, you computer may have multiple LAN adapters (wired, wireless, etc.) and in certain network configurations (like "Bridged") the VM has to be associated with one of these adapters. However, if the VM is moved from one computer to another, then these associations with real devices is broken and you have to work with the VirtualBox Manager to change the Settings of the VM so that each of these virtual hardware features is reattached to an appropriate real device on the new machine.

Each Sandbox OS defines a user named "casdev" with admin privileges who is the owner of SandboxData and some other directories on disk. It is an objective that this user have a specific userid number shared on all Unix hosts, so the disk file ownership transfers when the disk is moved from one OS disk to another. If that did not happen, then even though the user is named "casdev" on both systems, the SandboxData disk moved to a new system will appear to be owned by an unknown user and you have to reassign ownership of the work directories by running the chown command as root.

There is a "standard" portable format for VMs called a *.ova file. It saves space by zipping up the hard disk image, and it gives you an opportunity during installation to reconfigure the audio adapter and virtual LAN adapters as part of the installation rather than having to clean up after the fact. The Sandbox may be distributed in this format to save space and download time.

VirtualBox has a feature called "Shared Directory" that allows the host operating system to share one of its directories with the VM (like a network share, but implemented without the network). It is often convenient to have, but the Sandbox process has no standard use for it.

casdev

There is one user named "casdev" with admin (sudo) privileges. You login, run Eclipse, and do all your development as this user. The /home/casdev directory holds the Eclipse workspace and all the casual files. Because JBoss is started from Eclipse, it also runs as casdev. Therefore, the JBoss and Eclipse directories are owned by casdev even though they are installed elsewhere in the file system.

The password for user casdev is not particularly secure, since it can only be used by the developer on the host machine. However, it will not be published here. Ask someone for it.

Where

Oracle Java comes in a standard Red Hat distribution format called an "RPM" that contains both the files and instructions where to put them. Oracle puts a JVM in /usr/local/java. Red Hat official RPMs for JBoss are not available without a subscription, and RPMs for Eclipse are typically several releases behind the current verison. So JBoss comes from http://jbossas.jboss.org/downloads.html and Eclipse comes from http://www.eclipse.org/downloads/ and they are unzipped into subdirectories of /opt.

Then ownership of the directories is assigned to the "casdev" user instead of "root". JBoss contains configuration files and it writes to work directories that are part of the single distribution directory tree, and Eclipse has to change itself whenever you add new software. Eclipse and JBoss could have been put in the casdev HOME directory, but putting them in /opt seems cleaner. They are not, after all, part of your normal development workfiles. The Eclipse workspace that you are using goes in the casdev home.

There are a few things that have to go in the same place in the Sandbox and production. For example, the log files should be written to /var/log/jbossas on the Sandbox because that is where they go in production and that specific path has to go into log4j.xml. The JBoss Server configuration in Eclipse is modified to add -Djboss.server.log.dir=... onto the end of the JBoss start command.

The VM

The VM is a standard Virtualbox 64 bit Linux configuration. With JBoss running the used memory only gos up to 1.3 GB, so the amount of virtual RAM for the VM could be reduced to 1.5 or 2 GB if you need to run two VMs on an 8G laptop.

The VirtualBox Guest Additions are a set of drivers for the VM operating systems. These mouse, keyboard, video, and filesystem drivers support the integration of the VM interactive environment with the host system. For Linux, the Guest Additions are distributed in source and are compiled and then linked into the system. The source of the Guest Additions changes when you get a new version of VirtualBox, and the Additions have to be recompiled and installed every time the Linux Kernel changes, which happens frequently after you apply normal weekly maintenance to the Linux system. If you have made a change that breaks the Guest Additions then the VM window shrinks to a fixed size and the mouse gets captured when you click in the window and can only be detached from the VM by pressing the right Ctrl key. When this happens, click the outside (VirtualBox) menu item Devices - Insert Guest Additions CD. Inside the VM window you get a popup asking if you want to autorun the software on the CD that was just inserted. Click the "Run" button and let the Guest Additions rebuild.

It is not generally possible to drag and drop files between the Linux and Windows systems. Of course, you can use network file sharing between the machines, but there is a simpler solution. VirtualBox provides a feature called "Shared Folder". In the settings for the VM, there is a section for Shared Folder. You can designate one or more directories on the host computer (D:\sandbox is configured initially for the Sandbox VM). This directory is then given a name ("sandbox" for D:\sandbox). The shared host folder appears to the VM to be a virtual disk or virtual shared disk that can be mounted in Linux or assigned a disk letter (if you have a Windows VM). For Linux VMs, the shared folder is automatically mounted (because of the check box in the VM settings) to the location /media/sf_[name] (that is, /media/sf_sandbox for the name "sandbox"). The casdev user has been added to a group that allows read/write access to the files in the shared folder. This allows easy transfer of files between the VM and the host (Windows?) operating system. Copy files to or from C:\sandbox on the one end, and to or from /media/sf_sandbox on the other end.

The Virtual Network

Each VM has a virtual LAN adapter. VirtualBox can be configured to support this virtual adapter in several different ways. This is the most complicated step in the Sandbox configuration and needs to be understood, at least in basic terms, so the developer knows how to interpret behavior.

First, we need to define a few network terms. Suppose you have several real computers that you are connecting together in a home network. If you wire them to each other through a switch but you do not connect them to any router, then you have a Private Network. The machines can talk to each other but not to the outside world. You can assign each machine a static IP address, and for home networks this is traditionally a 192.168.1.* number. Of course this is the most secure arrangement, but it is not very useful.

So you get a DSL connection from the phone company or a cable connection from you TV provider, and you connect it these days to a Wireless Router box. Home routers add two network functions: DHCP and NAT. DHCP assigns addresses (from the 192.168.1.* subnet) to machines that are not configured to use a specific private address. NAT allows the router to forward client requests from any computer on the private network to the internet, but it changes the IP address on each packet of data so that the outside world thinks the request came from the router itself. This is important because the phone or cable company only assigned one IP address to your home, and the router has to own and manage that address.

NAT works automatically when the home computer is a client and the server is out on the internet. To allow Internet machines to connect back to a computer in your home network, then you have to configure the "Port Mapping" feature of the router to direct all Internet requests for a particular port (example: 8080) to a particular home computer.

Your host computer may be a laptop connected to the Yale network, but the VMs that it runs under VirtualBox are typically unknown to Yale and you probably want them to be unavailable to other machines. So VirtualBox creates various virtual network solutions emulating different elements of the typical home network solution.

When you create a VM and give it a virtual LAN adapter, you can configure the connectivity of that adapter to use specific named options:

NAT - One VM appears to be connected to its own network with a NAT router simulated on the host real machine. Client programs on the VM can access the Yale network and Internet, but neither the host computer nor the other VMs can talk to that VM except through ports mapped from the VM to the host computer. If you map ports, they become visible to the outside world.
NAT Network - Several VMs are connected to a private network with a NAT router simulated on the host real machine. Like the previous configuration, except in this case the VMs can talk to each other as if they were real computers on a real network, but the host computer still can't talk to them.
Bridge - All the VMs appear to be directly connected to the real network to which the host computer is connected. At Yale, that means that every VM has to be assigned its own IP address from Data Network Operations. Since that address is real, no other developer can use the same set of addresses for his Sandbox machines. This also exposes the VMs to the outside world (at least the Yale network). This option is useless for a sandbox.
Host-Only Adapter - First, this creates a virtual LAN adapter on the host computer (you get a dialog box on Windows asking you to install a new device). Then logically it connects this simulated LAN adapter to a Private Network to which all the VMs are connected. Typically you assign a static address like 192.168.137.1 to the host computer and then other static addresses like 192.168.137.10 to each VM. VirtualBox does not provide any NAT router function, so this private network is isolated from the real network.

Now for Sandbox requirements: The VMs have to be able to communicate with each other just like real machines, so they can test various clustering options. The VMs have to access servers in the Yale Network (SVN for example to update or commit source changes). For security, other machines must not be able to access the VMs. It is useful, but not an absolute requirement, for the host computer to be able to connect to port 8080 (JBoss) on the VM.

It is possible to meet all these requirements with a Host-Only adapter and an exotic system configuration or third party software on the host computer. However, the simplest solution is to recognize that while one virtual LAN adapter cannot do all these things, two different adapters can provide complete coverage.

One adapter (that VirtualBox refers to as LAN Adapter 2 and Centos chooses to name "enp0s8") uses a simple "NAT" connection to give VM clients access to the Yale network (SVN) and to the Internet (the Centos software update sites). You map no ports, so this provides only outbound service. It is automatically assigned a meaningless IP address that doesn't matter because no other computer can talk to it.

The other adapter (that VirtualBox refers to as LAN Adapter 1 and Centos chooses to name "enp0s3") is a Host-Only Adapter that creates a simulated Private Network that connects the VMs to each other and to the Host computer. It is not connected, routed, or bridged to Yale or the Internet, so it cannot be used to access other machines and no computer other than the host or the VMs can use it. The host and VMs are assigned static 192.168.*.* addresses to talk to each other just like real computers connected to a regular private home network. Since these addresses are not typically available

NAT is part of VirtualBox and requires no configuration. However, a Host-Only network has to be set up before any VM can use it. In the VirtualBox management console (that lists the installed virtual machines). Click File - Preferences - Network. Select the Host-only Networks tab. If no network is listed, click the Add button to create one. It will be called "VirtualBox Host-Only Ethernet Adapter" and when you create it you have to let your real laptop operating system add a new device. If you double click the now listed adapter, you can set its IPv4 Address to 192.168.137.1 and the Network mask to 255.255.255.0. It does not need a DHCP server because static addresses are configured in the VMs.

If you run Windows as the host computer, there is one additional cleanup step. When VirtualBox create a new simulated LAN Adapter in the Windows system, it left all the default configuration options. Go to Control Panel - Netowrk and Internet - Network Connections. Right click the VirtualBox Host-Only Network connection and choose Properties. DoubleClick the "Internet Protocol Version 4" item in the list box. Click the Advanced... button, choose the DNS tab, and turn off the checkbox at the bottom for "Register this connection's addresses in DNS". If you do not do this, then when you login to the Yale AD, your Windows desktop tries to register the 192.168.137.1 private address on this adapter with the dynamic DNS service that AD maintains. It probably will not cause a problem, but if another computer at Yale (frequently another machine you own) also has a private virtual network mapped to 192.168.137.* then from that DNS name that computer can believe that your computer is a VM on its private network, and then become unable to communicate with your machine because the two private networks are not connected. You can spend hours trying to figure out why you cannot share files or start a remote desktop session before you realize that your network traffic is going into the private network black hole instead of transiting the real network.

The Sandbox VM comes configured with two virtual LAN adapters (NAT and Host-Only). The Sandbox OS is configured with three adapters (NAT and two alternate versions of Host-Only to easily configure two VMs from one system image). It uses the NAT adapter to get to the outside world. You configure one of two Ethernet hardware (MAC) addresses with the VirtualBox console, and which hardware address comes up tells the Sandbox if it is the vm-ssoboxapp-01 host with private IP address 192.168.137.10, or the "-02" host with IP address ending in ".11".

The open format distribution file (the *.ova file you install the Sandbox VM with) sort of knows that there are supposed to be two virtual LAN adapters and it sort of knows that one is to be NAT and one is to be Host-Only. There is an obvious conflict between an "open format" file that can be read by VirtualBox or VMware and configuration options like "Host-Only" that may be a VirtualBox technical term that other vendors name differently. So when you install the *.ova file, VirtualBox displays the proposed hardware configuration and gives you a chance to explicitly connect any dangling configuration items to specific local chocices. For example, if you did not follow the previous instructions and did not create the Host-Only adapter on the host computer, then there would be no Host-Only adapter to connect the Sandbox VM to, and then one of the two LAN adapters is left with nothing it can connect to. If you do not fix the configuration problem at the start of installation, you can always fix it later before you start the VM.

If you need to simulate a second VM, clone the Sandbox computer (as explained below) and then in the clone configuration you leave Adapter 1 attached to the same Host-only Adapter but now you expose the Advanced options and change the MAC Address to be one larger (change AD at the end to AE).

The Centos operating system in the Sandbox VM has two different configurations for two different LAN adapters with different MAC addresses. It selects which IP address it uses based on which MAC address the simulated LAN adapter exposes. The first VM (ending in AD) gets 192.168.137.10 and the second (AE) gets .11. However, it is not possible to automatically change the hostname based simply on the MAC address. You have to do that manually the first time you boot up the cloned second VM. Issue the following command once:

sudo hostnamectl set-hostname vm-ssoboxapp-02.web.yale.internal

to change the hostname permanently on that VM.

However, while each machine has to know its own name, it also has to be able to locate the other machine in the "cluster". In production this is accomplished through the DNS server. Fortunately, all systems support a simple static alternative called a "hosts" file. This file is /etc/hosts on Linux or Mac, and C:\windows\system32\drivers\etc\hosts on Windows. It is a simple text file where each line starts with an IP address and then contains one or more host names associated with that address. The VM has a hosts file with two lines:

192.168.137.10 vm-ssoboxapp-01.web.yale.internal casvm1
192.168.137.11 vm-ssoboxapp-02.web.yale.internal casvm2

This maps the full name and a shorter nickname (casvmx) that is easier to type. You should add these lines to the "hosts" file on your host (Windows or Mac) computer so that you can access the VMs by name through the Host-Only adapter.

CAS currently has only two VMs. If that changes, or you want to use the Sandbox to work with a different product that has a different cluster configuration with more machines, then you can follow the cookie cutter instructions and add additional Network adapter configurations in /etc/sysconfig/network-scripts and add an extra line to all the hosts files.

Centos 7 has a Firewall service (firewalld) that, like the Windows firewall service, provides some protection to a desktop or server machine. It is not used in production machines behind the corporate firewall, and similarly it is not useful on VMs that are hidden on the Host-Only private virtual network, so that service is disabled in the Sandbox. It you want to turn it back on, you have to configure it for JBoss and the clustering.

Clone

The Sandbox VM can be cloned to produce a second CAS VM for testing cluster failover.

Cloning is a VirtualBox operation performed on the VirtualBox control window. Select the machine and choose Clone from the Machine menu. There are two kinds of Clones:

A Full Clone makes a complete copy of the VM configuration and the hard disk. This is the simplest option, but it takes a few minutes to make a second copy of the hard drive file

A Linked Clone is much faster and smaller. It creates a "snapshot" of the current hard drive file (an image of the disk that does not change and can be shared between the original VM and the Clone. Then the two VMs each get a new file that holds only the changes made to the hard disk since the snapshot was taken. Once the snapshot is taken, any files that are changed, for example changes to the source files in the Eclipse workspace, are separate on the original and clone VM and so they have to be copied from one to the other. You can commit changes to SVN and then update the files on the other VM, but after a while that gets to be annoying. If you make a lot of changes, do it on the original VM, discard the old clone and make a new one. It only takes a few seconds to create a Linked Clone.

After you delete a Linked Clone you may want to take a few seconds more and delete the disk snapshot that was created for it. That merges the changes back into a single disk image file.

There are two steps that you must perform after you create the Clone to distinguish it on the network from the original VM. Before you start the Clone VM, edit the Settings - Network - Advanced - MAC address changing the last byte of the address by adding 1 (change the "AD" at the end to "AE"). Then after the clone starts up for the first time, issue the command "sudo hostnamectl set-hostname vm-ssoboxapp-02.web.yale.internal" to change the hostname. After you do this a few times it will become routine and deleting the old Clone, creating a new one, updating the MAC address, and changing the hostname will take only a few minutes and is much simpler and faster than copying files from one machine to the other.

CAS Development

CAS development is the same whether you are working on a Windows host computer (with Eclipse and JBoss) or on the Sandbox VM. You edit in Eclipse, build with Maven, and run JBoss from the Eclipse toolbar. Generic CAS develpment is described elsewhere. This section just describes the Sandbox.

When you get a new copy of the Sanbox VM, the casdev user will have an Eclipse workspace in its home directory. This will have a project for the CAS source and for the CAS installer. However, the project may have old files. So the first step is to synchronize the workspace with the SVN server and update the files with whatever is current.

If you are starting work on a whole new release of CAS, then delete the old project and create a new project using the instructions provided in CAS Development Conventions at Yale.

There are two configured Maven "jobs" in the Eclipse Run - Run Configurations...

If you click the dropdown mark (V) behind the Run icon on the toolbar (a green circle with a right pointing triangle) then there is a CAS Build job (which runs the Maven POM in the CAS parent directory to compile everything and build the CAS WAR) and a CAS Install job (which copies the CAS WAR to the JBoss deploy directory and inserts parameters into the configuration files). Run the Build first, then the Install.

Elsewhere on the toolbar there are JBoss Run and Debug icons (a green arrow pointing right and a bug with an arrow under it). They can be used to start JBoss normally or with interactive debugging using Eclipse.

Changing the Sandbox

The Sandbox is separate from the contents of the projects in the workspace or the SVN. You could use the same Sanbox to develop with CAS 3.5.3 or CAS 4.0, and you could probably use it to develop any JBoss hosted application including Shibboleth.

Updating Centos 7 with newer versions of software or changing system configuration options is just standard Linux system administration.

Therefore, changing the Sandbox means installing a different version of Java, JBoss, or Eclipse.

You can install any version of Java using the Oracle RPMs. They go automatically into /usr/java. Generally you configure different versions of Java into Eclipse (Preferences - Java - Java Runtime) and then you select the verison of Java you want to target as a parameter of the Eclipse project, or as a parameter of the CAS Build and CAS Install Run configurations, or as a parameter of the JBoss Server runtime configuration. If you want to change the default version of Java that you get at the command prompt, Google for information on the Linux "alternatives" command.

There are separate instructions in this set of documentation for starting with a vanilla version of Eclipse and then adding the Subversive and JBoss Tools components, so they will not be repeated here.

To install a new version of JBoss, get it from jboss.org and unzip it into a new directory in /opt/jboss. JBoss Tools will configure it automatically if you ask. In Eclipse, go to Preferences - JBoss Tools, JBoss Runtime Detection. Click Search ... and have it search /opt/jboss. It will notice the new server and configure it. To change the default configuration, you need to display the Servers (Window - Show View - Servers). The Servers tab lists all the configured servers. Double click on the name of the server you want to configure. Click the "Open launch configuration" link for detailed startup configuration. In particular, you may want to allow browsers on other machines (actually only on the host computer) to access http://vm-ssoboxapp-01/cas by binding JBoss to the LAN adapter instead of just the local loopback. In the command parameters, change "-b localhost" to "-b 0.0.0.0". Also add a parameter to change the log directory "-Djboss.server.log.dir=/var/log/jbossas/standalone". If in doubt, compare this to the configuration of another JBoss server.

Disclaimer: configuring modern versions of JBoss AS 7, JBoss EAP 6.x, and Wildfly 8.x are all fairly similar. If you want to configure an older JBoss 5, that changes the entire directory structure and conventions and may require extra research.