Sandbox VM

A Sandbox provides a pre-configured development environment that you can run on your laptop computer. The VM version provides a Linux system on which to test the software you are developing in an environment close to what is used in production. Although the Sandbox was originally created for CAS, it can also be used for Shibboleth, Netid, or any other Java Web application. The default user is, however, named "casdev".

If you do not use the Sandbox, you can eventually stumble by trial and error to a workable development environment of your own. Along the way you will have to solve many problems. Does your Java have the unlimited cryptography files and does it have the Yale Certificate Authority files installed? Do you have both a current Maven and the old Maven 2.2.1 that Jenkins runs? Do you have Oracle and SQL Server database drivers installed in JBoss? How do you get JBoss logs to go to /var/log/jbossas/standalone instead of the log directory inside JBoss itself? None of these is a big problem, but these are just randomly chosen examples of the hundreds of small configuration issues you have to address.

The Sandbox is a friendly environment for developers to edit, compile, build, and debug the application. It is specifically not designed to duplicate the runtime configuration of production. For that you have DEV and TEST.

That's not how I do it!

If you take a strong stand that programmers should all do something in one common way, you almost guarantee that your efforts will be subverted by people who believe that your choice is wrong. The Sandbox is not a single choice but rather a family of almost identically configured mix and match components that almost anyone can live with.

Hypervisor

Oracle VirtualBox is free, powerful, easy to use, and can be installed on Windows, Mac, and Linux computers. Other options (Hyper-V, Xen, VMWare Player) run on single operating systems, so you have to convert VMs from one to the other (and it turns out that VirtualBox is the recommended utility for a lot of the conversions). VirtualBox is a desktop tool friendly to developers rather than a production server deployment tool like Hyper-V. VMWare has been around longer and goes through more extensive testing before its much less frequent releases, but if a feature breaks when there is a new release of Windows or Ubuntu, then you have to wait a long time for the next release to fix it. VirtualBox has frequent releases as an open source project and adapts more quickly.

Although VirtualBox aspires to be something you could use for production servers in a machine room, it is not yet something one would recommend over an alternative like Hyper-V. However, for simple desktop edit, compile, and debug it is simple, reliable, and convenient. It may be that the drivers for the emulated video card don't support modern features, and that the passthrough of USB 3 hardware devices to the VM has problems. You don't need that stuff to run Eclipse.

Desktop

Originally the Production Services group at Yale tried to convince us to use the same VM in development that they use in production. However, that VM only has the character terminal user interface, and most people these days do not want to write code using only the vi editor. So to use that machine you would have to use Eclipse on another computer, then install onto the VM and remotely debug. It is a possible model, but it turns out that Production Services never provided the tools needed to actually create and maintain accurate versions of their VMs.

It is much simpler to upgrade the VM to include a real desktop and run Eclipse on the VM itself instead of on a separate machine. Given that any user interface is an addition, there was no particular reason to prefer Gnome to KDE to Unity or any of the other choices you make from one Linux distribution to another. Eclipse will work under any Linux "desktop" system. If you feel strongly about it, switch to whichever one you want.

Cutting and pasting text with the "clipboard" function and dragging and dropping files are functions actually performed by the user interface and not by the Linux OS. VirtualBox has had a lot of trouble getting drag and drop of files to work across systems, in part because it is different for KDE and Gnome and so on. If you get it working, there is no guarantee that the next release of Linux won't break it again. Even if you accept the default Gnome desktop that comes with your preferred distribution, if the Gnome drag and drop is broken and the KDE drag and drop works, then you may want to install just the KDE File Management utility (Dolphin). It will drag along 100 megabytes of KDE lib files, but then you have two file managers and can drag and drop files to whichever one works.

Virtual USB Stick

Even if Eclipse is not automatically installed in your distribution, you can find it in the Software tool. It is almost always one or two releases (years) out of date and generally you don't want to run something that old. However, if you do choose to install it, there is some directory where the package manager tool (yum or apt-get) wants to install that particular program. Feel free to use the standard tools to install Eclipse or Java or JBoss where the distribution wants to put them.

The Sandbox comes with its own copy of everything, and it puts that stuff somewhere that no distribution would normally install Java or Eclipse. It puts them where a Linux system mounts CDs and USB Sticks (the /media directory). In that sense, the Sandbox puts all its files on a "virtual USB stick" although actually it looks to the VM like a small additional hard disk. This has two advantages. First, the Sandbox cannot conflict with any OS convention for where stuff is supposed to go. Second, this is a more natural way to arrange things if you want to work with the Sandbox files for Shibboleth, then "eject" the disk with those files and mount the disk with Sandbox files for IIQ.

You may want to move the Eclipse workspace to your real Home directory (or the /home/casdev directory since you login as user "casdev") if you are doing something experimental and don't want to mess up the original workspace in the Sandbox.

This way we avoid the eternal argument between /opt and /usr/local and other places where different people believe stuff should go.

One from Column A

Generally speaking the Sandbox VM environment involves choosing two separate components.

You need an OS VM. Typically it will run CentOS, Fedora, or Ubuntu. You can just pick up the OS configuration last used for the software you are planning to develop, or you can choose one specific OS VM to use for all your development and customize it to your preferences.
You need the SandboxData disk/directory-tree for the particular software package you are planning to work on. By separating out the CAS project files on one virtual disk, and the Netid project files on another virtual disk, you can keep the same OS VM with your customizations and switch from one software development project to another by switching mounted disks.

Eclipse, not /etc

In the operating system, servers are configured in /etc or the registry, but in the Sandbox you configure every Java Runtime, every JBoss or Tomcat server, and every Maven instance in Eclipse. In the system, Java and JBoss are started as background processes at boot time, but in the Sandbox you run Maven or start JBoss in debugging mode by clicking the task bar in Eclipse. If you have installed a J2EE server in the OS, then you have to stop it before starting JBoss under Eclipse, or else you have to configure the Sandbox JBoss to use a different port than the 8080 default.

Of course, if you have a version of the Sandbox configured to work on both CAS and Netid (because they interact) and CAS runs on JBoss and Netid runs on Tomcat then you may have to configure two port numbers anyway. You just don't want the OS servers adding to the complexity.
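Before starting JBoss or Tomcat from Eclipse, it can help to confirm that nothing in the OS is already listening on the port you intend to use. The following is a minimal sketch and not part of the Sandbox itself; the function names and the candidate port list are hypothetical and only 8080 comes from the text above.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first candidate port with no listener, or None if all are taken."""
    for port in candidates:
        if not port_in_use(port, host):
            return port
    return None


if __name__ == "__main__":
    # 8080 is the default mentioned above; 8180 and 8280 are illustrative alternatives.
    print("first free port:", first_free_port([8080, 8180, 8280]))
```

If 8080 is reported as busy, either stop the OS-level server or configure the Eclipse-managed server to use one of the free ports.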

1001

There is a generic developer userid named "casdev" (because the Sandbox was first created for CAS, but it can be used for any application). This user must be configured to the OS as Linux user 1001 because it is the userid number that is stored in the SandboxData disk directory as the owner of all the directories and files. The Sandbox OS is typically configured to log this userid in when you boot the VM. Typically "casdev" starts Eclipse and then everything is managed and started from Eclipse as part of this interactive login. So everything runs under the "casdev" user, and that user must have read and write access to the small number of OS directories that may be used during execution (specific subdirectories of /var/log and /media for example). Any directory permission problems will not show up until you run the code in DEV, but that is typically soon enough to deal with them.

Because it is 1001 and not "casdev" that is used in all Linux rules, it would be fairly simple to rename "casdev" to a different name if you object to it. Just keep the 1001 number.
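A quick way to confirm that a directory tree really is owned by user number 1001 is to walk it and compare the numeric owner of each entry. This is a minimal sketch for Linux; the function name is hypothetical and the path in the example is only an assumed mount location.

```python
import os

SANDBOX_UID = 1001  # the userid number the text says owns the SandboxData files


def paths_not_owned_by(root, uid=SANDBOX_UID):
    """Return every directory and file under root whose numeric owner is not uid."""
    wrong = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            # lstat inspects a symlink itself rather than whatever it points to
            if os.lstat(path).st_uid != uid:
                wrong.append(path)
    return wrong


if __name__ == "__main__":
    # /media/SandboxData is an assumed location; pass whatever path you mounted.
    for path in paths_not_owned_by("/media/SandboxData"):
        print("not owned by 1001:", path)
```

An empty result means the numeric ownership is consistent, whatever name the OS displays for user 1001.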

R:\

The Sandbox OS can be Fedora or Ubuntu, but it can also be Windows. Java is system independent and all the tools (Eclipse, JBoss, Maven) are written in Java. You can develop CAS in Windows and then deploy it to production Linux servers and everything works.

Unfortunately, the SandboxData disk cannot be moved between Linux and Windows. While Java can be stored in the same directory on the disk for both systems, the Java executable is either a Linux program or a java.exe Windows program. Eclipse also inserts the disk letter when it configures an external file in Windows, while disk letters do not exist in Linux. So there have to be separate SandboxData images for Linux and Windows even though the component (Java, Eclipse, JBoss, or Maven) is in the same relative directory on each image.

Misc

CAS, Shibboleth, and Netid are examples of programs that run under SSL in production but run normal HTTP on port 8080 during development in the Sandbox.

Some files have to have explicit fully qualified path names to a log, cache, or configuration directory. The Shibboleth WAR needs to know where /usr/local/shibboleth-idp is, and the log4j.xml file has to point to the /var/log/jbossas/standalone directory. You have two choices. You can use the standard OS file (strongly recommended for /var/log/jbossas/standalone) and on Windows you can create the C:\var\log\jbossas\standalone directory. Alternatively, you can create symlinks in the system that redirect the path to somewhere in the SandboxData disk. Remember, the casdev user has to have write access to such directories.

Sandbox is not Production

The Sandbox does not have to run at maximum efficiency. It will not be up 24x7 for months, so rock solid reliability and stability are not a requirement. Therefore, you have more flexibility choosing a VM host, Guest OS, Java runtime, and version of JBoss.

For example, while a particular application will be designed to run in a particular version of Java (1.6, 1.7, or 1.8), it doesn't matter for development if you are using OpenJDK or the Sun JVM. At various times in the past, OpenJDK has had a bad reputation for slow memory leaks that accumulate in applications that run for weeks without being restarted, but that is not a problem for code you debug and restart every few minutes. Developers frequently prefer being on the latest and greatest version of Fedora rather than running an older Linux Kernel in CentOS just because that is what Red Hat supports for production.

To fully exercise every part of an application you may have to install optional libraries in the system. For example, the Netid application uses the Kerberos 5 administration libraries that would have to be installed with yum or apt-get. However, unless this additional function is an essential part of your debugging, you open a big can of worms trying to get the Sandbox to support this sort of function. You will never get the Netid application to actually do Kerberos 5 administration in the Sandbox because it turns out that the Java native code that provides that interface only works in 32 bit mode and only on an older version of the library. It is much better to do what Netid does and have an install parameter "kadm5_activation=false" that turns all Kerberos administration calls into no-ops, since you are almost never actually testing that particular function of the application.

Two "Disks"

The core of the Sandbox is a separate disk image (named "SandboxData") that contains the Java, Eclipse, JBoss, Maven, database drivers, and so on. This disk can be cloned or the VM manager (VirtualBox) can create a fixed base disk shared by multiple machines with separate difference files that hold the changes each VM makes.

The last programmer to work on an application will leave a VM with a particular version of a particular OS that was last used in development. Try to live with it. If you can't, then you can switch to another OS Sandbox disk, but then you will have to connect it back to the SandboxData disk that contains the development files.

The SandboxData disk files are owned by user 1001 group 1001. The operating system you boot should have configured this user number to be "casdev" for sanity's sake.

This is a three step process in Linux:

You have to configure the VM (the *.vbox XML file) using the VirtualBox Manager to have two SATA disks, one for the OS and one for the SandboxData.
You then have to boot the OS and configure it to mount the SandboxData disk and its file system as a directory in a well known path. Search for the "Disks" GNOME utility which is an easy to use way of changing the /etc/fstab file. The smaller "VBOX HARDDISK" that is device /dev/sdb1 should be configured to mount at boot time. Edit Mount Options and curiously turn "Automatic Mount Options" OFF and then type in a mount point. This will create a line in fstab of the form "/dev/disk/by-uuid/6a61a74b-df34-4038-bc70-ee6cf02d5cf0 /media/SandboxData auto ..."
It is probably not a best practice to choose the system mount point as the explicit path used to configure something in Eclipse. So a simpler directory should be symlinked to the mount point and that should be used in application configuration.
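The third step can be sketched in a few lines: check that the mount point is really mounted, then create the simpler symlink if it is not already there. This is a sketch under stated assumptions; the function name is hypothetical, and the link path ~/SandboxData is just one possible choice of "simpler directory".

```python
import os


def ensure_symlink(link_path, target):
    """Create link_path -> target. Return True if created, False if already correct."""
    if os.path.islink(link_path):
        current = os.readlink(link_path)
        if current == target:
            return False
        raise FileExistsError(f"{link_path} already points to {current}")
    # Raises FileExistsError if link_path exists as a regular file or directory
    os.symlink(target, link_path)
    return True


if __name__ == "__main__":
    mount_point = "/media/SandboxData"              # mount point from the fstab step
    link = os.path.expanduser("~/SandboxData")      # assumed simple path for Eclipse
    if not os.path.ismount(mount_point):
        print(mount_point, "is not mounted; check the fstab entry first")
    else:
        print("created" if ensure_symlink(link, mount_point) else "already in place")
```

Eclipse configuration can then refer to the link path, so a later change of mount point only requires recreating one symlink.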

There are several ways to mount disks in Windows and Linux. The choices all derive from two ambiguities:

Linux and Windows both support multiple users, but in practice only one person typically logs on. Still the system has provision for there to be another user and that has an effect on disk mounts.
Some disks are transient (USB flash drives) and some are permanent (a SATA hard disk installed inside the computer).

Since Linux cannot automatically decide if a disk belongs to one user or is supposed to be shared, and whether it can be dismounted and replaced or is mounted permanently, it is up to the users and administrators to configure things appropriately. Generally Linux will default in favor of the single user transient model (the USB flash drive) unless it is told otherwise.

While it is possible to switch from one software project to another by keeping the same OS disk and changing SandboxData disks, it is unreasonable to expect to do this without rebooting. That is, we do not intend to configure the Sandbox VMs to believe that SandboxData is a USB disk that you can eject and replace with another disk on a running system.

In Fedora, if user "casdev" mounts a disk then the mount point defaults to being something in the transient "/run" directory that is tied to the userid that mounted it. Specifically, the default location is /run/media/casdev/SandboxData. Ubuntu, however, uses a somewhat older standard where mounted CD drives were associated with /media/cdrom and so it wants to mount the same disk as /media/SandboxData. It is difficult to change the defaults of the system, and for this purpose it is unnecessary. So we adopt the Ubuntu default as the configured behavior on all Linux systems.

However, having decided that the mount point of the disk will be /media/SandboxData, that does not mean that we should use that path in Eclipse configuration windows. The option is then to symlink something simple like "/SandboxData" or "/home/casdev/SandboxData" (a.k.a. ~/SandboxData) to the mount point.

In Windows, mounting a disk typically generates a disk letter. The convention for the Sandbox is to use letter R: simply because it is unlikely to be used by anyone for something else. However, there are several ways to create a disk letter in Windows. In some cases the mapping is visible only to the user who does it, while in other cases the disk letter is global and is seen by all users on the machine. So Windows has some of the same problems as Linux, but the system details are different.

There is one more problem to mention. When a disk is initialized, it is given a unique identifier (a UUID) written in the disk label. This allows external disks to be uniquely identified no matter which USB port you plug them into. It also helps identify internal hard disks when the SATA cables become disconnected and are then reconnected to different ports. The Linux /etc/fstab file and the Windows Registry identify disks by the UUID and not by their label. So if you reconfigure a VM replacing one SandboxData disk with another, then when you boot it up again the system will know it is a different disk and will not automatically mount it. You have to use the Disks tool to add an additional line in fstab to mount the new UUID.
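To see whether an fstab entry refers to a disk that is no longer attached, you can compare the UUIDs named in /etc/fstab with the ones the running kernel lists under /dev/disk/by-uuid. The sketch below is an illustration only; the function names are hypothetical, and it reads fstab without changing it.

```python
import os

BY_UUID_PREFIX = "/dev/disk/by-uuid/"


def fstab_uuids(fstab_text):
    """Map each UUID referenced in fstab text to its mount point."""
    found = {}
    for line in fstab_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines, comments, and malformed entries
        spec, mount_point = fields[0], fields[1]
        if spec.startswith(BY_UUID_PREFIX):
            found[spec[len(BY_UUID_PREFIX):]] = mount_point
        elif spec.startswith("UUID="):
            found[spec[len("UUID="):]] = mount_point
    return found


def missing_uuids(fstab_text, present_uuids):
    """Return the fstab UUID entries whose disk is not currently attached."""
    return {u: m for u, m in fstab_uuids(fstab_text).items() if u not in present_uuids}


def attached_uuids(by_uuid_dir="/dev/disk/by-uuid"):
    """List the filesystem UUIDs the running system currently sees."""
    return set(os.listdir(by_uuid_dir)) if os.path.isdir(by_uuid_dir) else set()


if __name__ == "__main__":
    with open("/etc/fstab", encoding="utf-8") as f:
        for uuid, mount_point in missing_uuids(f.read(), attached_uuids()).items():
            print(f"{mount_point}: disk {uuid} is not attached")
```

A SandboxData mount point that shows up in this list after a disk swap is the signal to add an entry for the new disk's UUID.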

Obviously, although this discussion has assumed that the SandboxData files are all collected on a mountable hard disk image (a best practice), nothing prevents you from creating a Sandbox configuration where the files that would be on the second hard disk are instead copied or unzipped into a directory tree on disk. This is particularly attractive if you have decided to promote your sandbox development to the native operating system on your laptop, and now that you are not using a VM it is harder to mount a virtual disk file.

Installation

The Sandbox conventions do not depend on the software that creates and manages the virtual machines. Since this is not a production environment, the best choice is the one that is easiest to use. VirtualBox is free, can be installed on Windows, Linux, or Mac, and has convenience features for the developer. If you were creating virtual machines that you would immediately deploy into an IaaS cloud environment, then VMWare Workstation might be the best choice when developing for a VMWare fabric, and Hyper-V might be the best choice for a Microsoft fabric. The Sandbox, however, isn't going anywhere. The end result of all your work is source checked into Subversion that will get built and installed with Jenkins.

However, it is not a good idea to mix hypervisors on the same machine. They get into conflicts over the CPU VM support (VT-x, Nested Paging, etc.). So while you would never install VMWare or Hyper-V just in order to run the Sandbox, if you already have them installed for other projects then it is not necessary to rip them out and replace them with VirtualBox. Running a Sandbox OS disk on VMWare or Microsoft requires changing the hard disk format, creating a VM configuration (number of processors, size of memory, etc.), and then after the OS boots up you have to install the vendor specific Guest Support services and drivers. If you do not know how to do this, ask someone for help.

However, it turns out that VirtualBox is generally regarded as the best tool to convert one virtual hard disk format to another (VDI, VMDK, VHD). So even if you are not going to use it to run virtual machines, you may need to install it somewhere to do the conversion.

Download the current version of VirtualBox from virtualbox.org. Run the installer. Launch the VirtualBox Manager.

Now you have to install a Sandbox VM. This turns out to be a bit more complicated than just copying some files from disk.

The virtual machine definition (the *.vbox file) is an XML text file. It is not hard to read, but it has two types of problems:

For some reason, VirtualBox has decided to remember every DVD image you mounted in the virtual DVD drive. Worse, these images are remembered with a fully qualified path name

   <Image uuid="{9d9e4ed4-faaf-4f84-be68-271462e7c756}" location="D:/tpshare/ISO/Fedora-Live-Workstation-x86_64-22-3.iso"/>

and the virtual machine will not start if you move this configuration to another machine in which the DVD file image is not in the exact same location. Since there was really no reason to remember the iso file in the first place, the solution is to delete all these lines and then the machine will start.

Sometimes the virtual machine has to be associated with a particular device on the real machine. Your computer may have more than one Audio adapter, and the virtual audio of the VM has to be associated with one of the real audio drivers on your computer. Similarly, your computer may have multiple LAN adapters (wired, wireless, etc.) and in certain network configurations (like "Bridged") the VM has to be associated with one of these adapters. However, if the VM is moved from one computer to another, then these associations with real devices are broken and you have to work with the VirtualBox Manager to change the Settings of the VM so that each of these virtual hardware features is reattached to an appropriate real device on the new machine.
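For the first problem (remembered DVD images), the cleanup can be scripted instead of done by hand. The sketch below treats the *.vbox file as text and drops only the Image lines that carry a location ending in ".iso", like the sample line shown earlier. The function names are hypothetical, and it assumes that VirtualBox is not running and that no attached device in the file still refers to one of the removed images.

```python
import re
import shutil

# Matches a self-closing <Image .../> line whose location attribute ends in .iso
ISO_IMAGE_LINE = re.compile(
    r'^\s*<Image\b[^>]*\blocation="[^"]*\.iso"[^>]*/>\s*$', re.IGNORECASE
)


def strip_dvd_images(vbox_text):
    """Return the .vbox text without the remembered DVD image lines."""
    kept = [
        line
        for line in vbox_text.splitlines(keepends=True)
        if not ISO_IMAGE_LINE.match(line)
    ]
    return "".join(kept)


def clean_vbox_file(path):
    """Rewrite the file in place, keeping a .bak copy. Return True if it changed."""
    with open(path, encoding="utf-8") as f:
        original = f.read()
    cleaned = strip_dvd_images(original)
    if cleaned == original:
        return False
    shutil.copy2(path, path + ".bak")  # keep the untouched original next to it
    with open(path, "w", encoding="utf-8") as f:
        f.write(cleaned)
    return True
```

Image lines that have no location attribute are left alone, since those are references from attached devices rather than remembered file paths.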

Each Sandbox OS defines a user named "casdev" with admin privileges who is the owner of SandboxData and some other directories on disk. It is an objective that this user have a specific userid number shared on all Unix hosts, so the disk file ownership transfers when the disk is moved from one OS disk to another. If that did not happen, then even though the user is named "casdev" on both systems, the SandboxData disk moved to a new system will appear to be owned by an unknown user and you have to reassign ownership of the work directories by running the chown command as root.

There is a "standard" portable format for VMs called a *.ova file. It saves space by zipping up the hard disk image, and it gives you an opportunity during installation to reconfigure the audio adapter and virtual LAN adapters as part of the installation rather than having to clean up after the fact. The Sandbox may be distributed in this format to save space and download time.

VirtualBox has a feature called "Shared Folders" that allows the host operating system to share one of its directories with the VM (like a network share, but implemented without the network). It is often convenient to have, but the Sandbox process has no standard use for it.

casdev

There is one user named "casdev" with admin (sudo) privileges. You login, run Eclipse, and do all your development as this user. The /home/casdev directory holds the Eclipse workspace and all the casual files. Because JBoss is started from Eclipse, it also runs as casdev. Therefore, the JBoss and Eclipse directories are owned by casdev even though they are installed elsewhere in the file system.

The password for user casdev is not particularly secure, since it can only be used by the developer on the host machine. However, it will not be published here. Ask someone for it.

Where

Oracle Java comes in a standard Red Hat distribution format called an "RPM" that contains both the files and instructions where to put them. Oracle puts a JVM in /usr/local/java. Red Hat official RPMs for JBoss are not available without a subscription, and RPMs for Eclipse are typically several releases behind the current version. So JBoss comes from http://jbossas.jboss.org/downloads.html and Eclipse comes from http://www.eclipse.org/downloads/ and they are unzipped into subdirectories of /opt.

Then ownership of the directories is assigned to the "casdev" user instead of "root". JBoss contains configuration files and it writes to work directories that are part of the single distribution directory tree, and Eclipse has to change itself whenever you add new software. Eclipse and JBoss could have been put in the casdev HOME directory, but putting them in /opt seems cleaner. They are not, after all, part of your normal development workfiles. The Eclipse workspace that you are using goes in the casdev home.

There are a few things that have to go in the same place in the Sandbox and production. For example, the log files should be written to /var/log/jbossas on the Sandbox because that is where they go in production and that specific path has to go into log4j.xml. The JBoss Server configuration in Eclipse is modified to add -Djboss.server.log.dir=... onto the end of the JBoss start command.

The VM

The VM is a standard VirtualBox 64-bit Linux configuration. With JBoss running, the used memory only goes up to 1.3 GB, so the amount of virtual RAM for the VM could be reduced to 1.5 or 2 GB if you need to run two VMs on an 8 GB laptop.

The VirtualBox Guest Additions are a set of drivers for the VM operating systems. These mouse, keyboard, video, and filesystem drivers support the integration of the VM interactive environment with the host system. For Linux, the Guest Additions are distributed in source and are compiled and then linked into the system. The source of the Guest Additions changes when you get a new version of VirtualBox, and the Additions have to be recompiled and installed every time the Linux Kernel changes, which happens frequently after you apply normal weekly maintenance to the Linux system. If you have made a change that breaks the Guest Additions then the VM window shrinks to a fixed size and the mouse gets captured when you click in the window and can only be detached from the VM by pressing the right Ctrl key. When this happens, click the outside (VirtualBox) menu item Devices - Insert Guest Additions CD. Inside the VM window you get a popup asking if you want to autorun the software on the CD that was just inserted. Click the "Run" button and let the Guest Additions rebuild.

It is not generally possible to drag and drop files between the Linux and Windows systems. Of course, you can use network file sharing between the machines, but there is a simpler solution. VirtualBox provides a feature called "Shared Folder". In the settings for the VM, there is a section for Shared Folder. You can designate one or more directories on the host computer (D:\sandbox is configured initially for the Sandbox VM). This directory is then given a name ("sandbox" for D:\sandbox). The shared host folder appears to the VM to be a virtual disk or virtual shared disk that can be mounted in Linux or assigned a disk letter (if you have a Windows VM). For Linux VMs, the shared folder is automatically mounted (because of the check box in the VM settings) to the location /media/sf_[name] (that is, /media/sf_sandbox for the name "sandbox"). The casdev user has been added to a group that allows read/write access to the files in the shared folder. This allows easy transfer of files between the VM and the host (Windows?) operating system. Copy files to or from D:\sandbox on the one end, and to or from /media/sf_sandbox on the other end.

The Virtual Network

Each VM has a virtual LAN adapter. VirtualBox can be configured to support this virtual adapter in several different ways. This is the most complicated step in the Sandbox configuration and needs to be understood, at least in basic terms, so the developer knows how to interpret behavior.

First, we need to define a few network terms. Suppose you have several real computers that you are connecting together in a home network. If you wire them to each other through a switch but you do not connect them to any router, then you have a Private Network. The machines can talk to each other but not to the outside world. You can assign each machine a static IP address, and for home networks this is traditionally a 192.168.1.* number. Of course this is the most secure arrangement, but it is not very useful.

So you get a DSL connection from the phone company or a cable connection from your TV provider, and you connect it these days to a Wireless Router box. Home routers add two network functions: DHCP and NAT. DHCP assigns addresses (from the 192.168.1.* subnet) to machines that are not configured to use a specific private address. NAT allows the router to forward client requests from any computer on the private network to the internet, but it changes the IP address on each packet of data so that the outside world thinks the request came from the router itself. This is important because the phone or cable company only assigned one IP address to your home, and the router has to own and manage that address.

NAT works automatically when the home computer is a client and the server is out on the internet. To allow Internet machines to connect back to a computer in your home network, you have to configure the "Port Mapping" feature of the router to direct all Internet requests for a particular port (example: 8080) to a particular home computer.

Your host computer may be a laptop connected to the Yale network, but the VMs that it runs under VirtualBox are typically unknown to Yale and you probably want them to be unavailable to other machines. So VirtualBox creates various virtual network solutions emulating different elements of the typical home network solution.

When you create a VM and give it a virtual LAN adapter, you can configure the connectivity of that adapter to use specific named options:

NAT - One VM appears to be connected to its own network with a NAT router simulated on the host real machine. Client programs on the VM can access the Yale network and Internet, but neither the host computer nor the other VMs can talk to that VM except through ports mapped from the VM to the host computer. If you map ports, they become visible to the outside world.
NAT Network - Several VMs are connected to a private network with a NAT router simulated on the host real machine. Like the previous configuration, except in this case the VMs can talk to each other as if they were real computers on a real network, but the host computer still can't talk to them.
Bridge - All the VMs appear to be directly connected to the real network to which the host computer is connected. At Yale, that means that every VM has to be assigned its own IP address from Data Network Operations. Since that address is real, no other developer can use the same set of addresses for his Sandbox machines. This also exposes the VMs to the outside world (at least the Yale network). This option is useless for a sandbox.
Host-Only Adapter - First, this creates a virtual LAN adapter on the host computer (you get a dialog box on Windows asking you to install a new device). Then logically it connects this simulated LAN adapter to a Private Network to which all the VMs are connected. Typically you assign a static address like 192.168.137.1 to the host computer and then other static addresses like 192.168.137.10 to each VM. VirtualBox does not provide any NAT router function, so this private network is isolated from the real network.

Now for Sandbox requirements: The VMs have to be able to communicate with each other just like real machines, so they can test various clustering options. The VMs have to access servers in the Yale Network (SVN for example to update or commit source changes). For security, other machines must not be able to access the VMs. It is useful, but not an absolute requirement, for the host computer to be able to connect to port 8080 (JBoss) on the VM.

It is possible to meet all these requirements with a Host-Only adapter and an exotic system configuration or third party software on the host computer. However, the simplest solution is to recognize that while one virtual LAN adapter cannot do all these things, two different adapters can provide complete coverage.

One adapter (that VirtualBox refers to as LAN Adapter 2 and CentOS chooses to name "enp0s8") uses a simple "NAT" connection to give VM clients access to the Yale network (SVN) and to the Internet (the CentOS software update sites). You map no ports, so this provides only outbound service. It is automatically assigned a meaningless IP address that doesn't matter because no other computer can talk to it.

The other adapter (that VirtualBox refers to as LAN Adapter 1 and CentOS chooses to name "enp0s3") is a Host-Only Adapter that creates a simulated Private Network that connects the VMs to each other and to the Host computer. It is not connected, routed, or bridged to Yale or the Internet, so it cannot be used to access other machines and no computer other than the host or the VMs can use it. The host and VMs are assigned static 192.168.*.* addresses to talk to each other just like real computers connected to a regular private home network. Since these addresses are not typically available

NAT is part of VirtualBox and requires no configuration. However, a Host-Only network has to be set up before any VM can use it. In the VirtualBox management console (the window that lists the installed virtual machines), click File - Preferences - Network. Select the Host-only Networks tab. If no network is listed, click the Add button to create one. It will be called "VirtualBox Host-Only Ethernet Adapter" and when you create it you have to let your real laptop operating system add a new device. If you double click the now listed adapter, you can set its IPv4 Address to 192.168.137.1 and the Network mask to 255.255.255.0. It does not need a DHCP server because static addresses are configured in the VMs.
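
The same address can probably be assigned from the command line with the VBoxManage tool that ships with VirtualBox. This is a minimal sketch rather than a tested procedure. The function name configure_hostonly is made up for this example, and the adapter name is an assumption: VirtualBox typically calls it vboxnet0 on Mac and Linux and "VirtualBox Host-Only Ethernet Adapter" on Windows, and newer VirtualBox releases may handle host-only networks differently.

```shell
# Sketch: give an existing host-only adapter the Sandbox host address.
# configure_hostonly is a hypothetical helper; the adapter name is passed in
# because it differs between host operating systems.
configure_hostonly() {
    # Static host address and mask used throughout this page
    VBoxManage hostonlyif ipconfig "$1" --ip 192.168.137.1 --netmask 255.255.255.0
}

# Usage (adapter name is an assumption):
#   VBoxManage hostonlyif create     # only if no host-only adapter exists yet
#   configure_hostonly vboxnet0
```

The function only sets the address. Creating the adapter is left as a separate manual step because the host operating system may prompt you to approve the new device.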

If you run Windows as the host computer, there is one additional cleanup step. When VirtualBox created the new simulated LAN Adapter in the Windows system, it left all the default configuration options in place. Go to Control Panel - Network and Internet - Network Connections. Right click the VirtualBox Host-Only Network connection and choose Properties. Double click the "Internet Protocol Version 4" item in the list box. Click the Advanced... button, choose the DNS tab, and turn off the checkbox at the bottom for "Register this connection's addresses in DNS". If you do not do this, then when you log in to the Yale AD, your Windows desktop tries to register the 192.168.137.1 private address on this adapter with the dynamic DNS service that AD maintains. It probably will not cause a problem. However, if another computer at Yale (frequently another machine you own) also has a private virtual network mapped to 192.168.137.*, then when it looks up your DNS name that computer can believe that your computer is a VM on its own private network. It then becomes unable to communicate with your machine because the two private networks are not connected. You can spend hours trying to figure out why you cannot share files or start a remote desktop session before you realize that your network traffic is going into the private network black hole instead of transiting the real network.

The Sandbox VM comes configured with two virtual LAN adapters (NAT and Host-Only). The Sandbox OS is configured with three adapters (NAT and two alternate versions of Host-Only to easily configure two VMs from one system image). It uses the NAT adapter to get to the outside world. You configure one of two Ethernet hardware (MAC) addresses with the VirtualBox console, and which hardware address comes up tells the Sandbox if it is the vm-ssoboxapp-01 host with private IP address 192.168.137.10, or the "-02" host with IP address ending in ".11".

The open format distribution file (the *.ova file you install the Sandbox VM with) sort of knows that there are supposed to be two virtual LAN adapters and it sort of knows that one is to be NAT and one is to be Host-Only. There is an obvious conflict between an "open format" file that can be read by VirtualBox or VMware and configuration options like "Host-Only" that may be a VirtualBox technical term that other vendors name differently. So when you install the *.ova file, VirtualBox displays the proposed hardware configuration and gives you a chance to explicitly connect any dangling configuration items to specific local choices. For example, if you did not follow the previous instructions and did not create the Host-Only adapter on the host computer, then there would be no Host-Only adapter to connect the Sandbox VM to, and one of the two LAN adapters would be left with nothing it can connect to. If you do not fix the configuration problem at the start of installation, you can always fix it later before you start the VM.

If you need to simulate a second VM, clone the Sandbox computer (as explained below) and then in the clone configuration you leave Adapter 1 attached to the same Host-only Adapter but now you expose the Advanced options and change the MAC Address to be one larger (change AD at the end to AE).

The Centos operating system in the Sandbox VM has two different configurations for two different LAN adapters with different MAC addresses. It selects which IP address it uses based on which MAC address the simulated LAN adapter exposes. The first VM (ending in AD) gets 192.168.137.10 and the second (AE) gets .11. However, it is not possible to automatically change the hostname based simply on the MAC address. You have to do that manually the first time you boot up the cloned second VM. Issue the following command once:

sudo hostnamectl set-hostname vm-ssoboxapp-02.web.yale.internal

to change the hostname permanently on that VM.

However, while each machine has to know its own name, it also has to be able to locate the other machine in the "cluster". In production this is accomplished through the DNS server. Fortunately, all systems support a simple static alternative called a "hosts" file. This file is /etc/hosts on Linux or Mac, and C:\windows\system32\drivers\etc\hosts on Windows. It is a simple text file where each line starts with an IP address and then contains one or more host names associated with that address. The VM has a hosts file with two lines:

192.168.137.10 vm-ssoboxapp-01.web.yale.internal casvm1
192.168.137.11 vm-ssoboxapp-02.web.yale.internal casvm2

This maps the full name and a shorter nickname (casvmx) that is easier to type. You should add these lines to the "hosts" file on your host (Windows or Mac) computer so that you can access the VMs by name through the Host-Only adapter.
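
On a Mac or Linux host, adding the two lines can be sketched as a small shell function that appends each entry only if it is not already present, so it is safe to run more than once. The function name add_sandbox_hosts is hypothetical. The sketch assumes the hosts file exists and ends with a newline.

```shell
# Sketch: append the two Sandbox entries to a hosts file unless already there.
# add_sandbox_hosts is a hypothetical helper; pass the path of the hosts file.
add_sandbox_hosts() {
    for entry in \
        "192.168.137.10 vm-ssoboxapp-01.web.yale.internal casvm1" \
        "192.168.137.11 vm-ssoboxapp-02.web.yale.internal casvm2"
    do
        # -x matches a whole line, -F treats the entry as a fixed string
        grep -qxF "$entry" "$1" || echo "$entry" >> "$1"
    done
}

# Usage (from a root shell, because /etc/hosts is a system file):
#   add_sandbox_hosts /etc/hosts
```

On Windows the same two lines have to be added to C:\windows\system32\drivers\etc\hosts with an editor that is run as Administrator.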

CAS currently has only two VMs. If that changes, or you want to use the Sandbox to work with a different product that has a different cluster configuration with more machines, then you can follow the cookie cutter instructions and add additional Network adapter configurations in /etc/sysconfig/network-scripts and add an extra line to all the hosts files.
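
As an illustration of what such an additional adapter configuration might look like, here is a sketch of a third Host-Only file. The actual files shipped in the Sandbox are not reproduced on this page, so everything here is an assumption built from commonly used ifcfg keys: the file name, the connection name, the MAC address (a placeholder), and the address 192.168.137.12 as the next one after .10 and .11. Copy one of the existing Sandbox files rather than this sketch if they differ.

```shell
# /etc/sysconfig/network-scripts/ifcfg-hostonly-03   (hypothetical file name)
# Sketch of a third Host-Only configuration. All values below are assumptions
# and must match what you configure for the third VM in VirtualBox.
TYPE=Ethernet
NAME=hostonly-03
# Placeholder MAC address; use the one set on Adapter 1 of the third VM
HWADDR=08:00:27:00:00:AF
# Static address, no DHCP
BOOTPROTO=none
IPADDR=192.168.137.12
NETMASK=255.255.255.0
# Leave the default route on the NAT adapter
DEFROUTE=no
ONBOOT=yes
```

The matching extra line in every hosts file would then pair 192.168.137.12 with whatever name you give the third machine.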

Centos 7 has a Firewall service (firewalld) that, like the Windows firewall service, provides some protection to a desktop or server machine. It is not used in production machines behind the corporate firewall, and similarly it is not useful on VMs that are hidden on the Host-Only private virtual network, so that service is disabled in the Sandbox. If you want to turn it back on, you have to configure it for JBoss and the clustering.
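
If you do decide to turn the firewall back on, the steps might look roughly like the following sketch. The function name enable_sandbox_firewall is made up for this example. Only port 8080 (the JBoss port mentioned earlier) is opened, in the default firewalld zone. The ports used for clustering depend on your JBoss configuration and are not listed on this page, so they are deliberately left out.

```shell
# Sketch: re-enable firewalld and open the JBoss HTTP port.
# enable_sandbox_firewall is a hypothetical helper name.
enable_sandbox_firewall() {
    sudo systemctl enable firewalld
    sudo systemctl start firewalld
    # Port 8080 is the JBoss port the host computer connects to
    sudo firewall-cmd --permanent --add-port=8080/tcp
    # Clustering ports are not known here; add one --add-port line for each
    # port your cluster configuration actually uses before reloading.
    sudo firewall-cmd --reload
}

# Usage:
#   enable_sandbox_firewall
```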

Clone

The Sandbox VM can be cloned to produce a second CAS VM for testing cluster failover.

Cloning is a VirtualBox operation performed on the VirtualBox control window. Select the machine and choose Clone from the Machine menu. There are two kinds of Clones:

A Full Clone makes a complete copy of the VM configuration and the hard disk. This is the simplest option, but it takes a few minutes to make a second copy of the hard drive file.

A Linked Clone is much faster and smaller. It creates a "snapshot" of the current hard drive file (an image of the disk that does not change and can be shared between the original VM and the Clone). Then the two VMs each get a new file that holds only the changes made to the hard disk since the snapshot was taken. Once the snapshot is taken, any files that are changed, for example changes to the source files in the Eclipse workspace, are separate on the original and clone VM and so they have to be copied from one to the other. You can commit changes to SVN and then update the files on the other VM, but after a while that gets to be annoying. If you make a lot of changes, do it on the original VM, discard the old clone and make a new one. It only takes a few seconds to create a Linked Clone.

After you delete a Linked Clone you may want to take a few seconds more and delete the disk snapshot that was created for it. That merges the changes back into a single disk image file.

There are two steps that you must perform after you create the Clone to distinguish it on the network from the original VM. Before you start the Clone VM, edit the Settings - Network - Advanced - MAC address changing the last byte of the address by adding 1 (change the "AD" at the end to "AE"). Then after the clone starts up for the first time, issue the command "sudo hostnamectl set-hostname vm-ssoboxapp-02.web.yale.internal" to change the hostname. After you do this a few times it will become routine and deleting the old Clone, creating a new one, updating the MAC address, and changing the hostname will take only a few minutes and is much simpler and faster than copying files from one machine to the other.
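
Once the routine is familiar, the host side of it can probably be scripted with VBoxManage. The following is a sketch under stated assumptions, not a tested procedure: the helper names next_mac and make_linked_clone, the snapshot name "clone-base", the VM names, and the example MAC address are all placeholders invented here. Only the last byte (AD becoming AE) comes from the text above.

```shell
# next_mac: print a 12-digit hex MAC address with its last byte increased by 1.
# The last byte wraps from FF to 00 without carrying into the previous byte.
next_mac() {
    prefix="${1%??}"          # everything except the last two hex digits
    last="${1#"$prefix"}"     # the last two hex digits
    printf '%s%02X\n' "$prefix" $(( (0x$last + 1) % 256 ))
}

# make_linked_clone ORIGINAL CLONE MAC: hypothetical helper that snapshots
# ORIGINAL, creates a Linked Clone named CLONE from that snapshot, and sets
# the MAC address of the clone's Adapter 1.
make_linked_clone() {
    VBoxManage snapshot "$1" take "clone-base"
    VBoxManage clonevm "$1" --snapshot "clone-base" --options link \
        --name "$2" --register
    VBoxManage modifyvm "$2" --macaddress1 "$3"
}

# Usage (VM names and MAC address are placeholders):
#   make_linked_clone "Sandbox" "Sandbox-02" "$(next_mac 0800270000AD)"
# Then boot the clone and run the hostnamectl command once inside it.
```

The hostname change is still done by hand inside the clone, because it has to run in the guest rather than on the host.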

CAS Development

CAS development is the same whether you are working on a Windows host computer (with Eclipse and JBoss) or on the Sandbox VM. You edit in Eclipse, build with Maven, and run JBoss from the Eclipse toolbar. Generic CAS development is described elsewhere. This section just describes the Sandbox.

When you get a new copy of the Sandbox VM, the casdev user will have an Eclipse workspace in its home directory. This will have a project for the CAS source and for the CAS installer. However, the projects may have old files. So the first step is to synchronize the workspace with the SVN server and update the files with whatever is current.

If you are starting work on a whole new release of CAS, then delete the old project and create a new project using the instructions provided in CAS Development Conventions at Yale.

There are two configured Maven "jobs" in the Eclipse Run - Run Configurations...

If you click the dropdown mark (V) behind the Run icon on the toolbar (a green circle with a right pointing triangle) then there is a CAS Build job (which runs the Maven POM in the CAS parent directory to compile everything and build the CAS WAR) and a CAS Install job (which copies the CAS WAR to the JBoss deploy directory and inserts parameters into the configuration files). Run the Build first, then the Install.

Elsewhere on the toolbar there are JBoss Run and Debug icons (a green arrow pointing right and a bug with an arrow under it). They can be used to start JBoss normally or with interactive debugging using Eclipse.

Changing the Sandbox

The Sandbox is separate from the contents of the projects in the workspace or the SVN. You could use the same Sandbox to develop with CAS 3.5.3 or CAS 4.0, and you could probably use it to develop any JBoss hosted application including Shibboleth.

Updating Centos 7 with newer versions of software or changing system configuration options is just standard Linux system administration.

Therefore, changing the Sandbox means installing a different version of Java, JBoss, or Eclipse.

You can install any version of Java using the Oracle RPMs. They go automatically into /usr/java. Generally you configure different versions of Java into Eclipse (Preferences - Java - Java Runtime) and then you select the version of Java you want to target as a parameter of the Eclipse project, or as a parameter of the CAS Build and CAS Install Run configurations, or as a parameter of the JBoss Server runtime configuration. If you want to change the default version of Java that you get at the command prompt, Google for information on the Linux "alternatives" command.
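
As a starting point for that search, a typical use of "alternatives" looks something like the sketch below. The function name set_default_java is hypothetical, the priority 2000 is an arbitrary value chosen for the example, and the JDK directory name depends on the version you installed, so the path in the usage line is only a placeholder. Some Java RPMs may already register themselves with alternatives, in which case only the --set step is needed.

```shell
# Sketch: make one installed JDK the default "java" at the command prompt.
# set_default_java is a hypothetical helper; pass the JDK directory under
# /usr/java as the only argument.
set_default_java() {
    # Register this JDK's java binary as a candidate for /usr/bin/java
    sudo alternatives --install /usr/bin/java java "$1/bin/java" 2000
    # Select it explicitly, regardless of priority
    sudo alternatives --set java "$1/bin/java"
}

# Usage (directory name is a placeholder):
#   set_default_java /usr/java/jdk1.8.0_XX
#   java -version
```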

There are separate instructions in this set of documentation for starting with a vanilla version of Eclipse and then adding the Subversive and JBoss Tools components, so they will not be repeated here.

To install a new version of JBoss, get it from jboss.org and unzip it into a new directory in /opt/jboss. JBoss Tools will configure it automatically if you ask. In Eclipse, go to Preferences - JBoss Tools, JBoss Runtime Detection. Click Search ... and have it search /opt/jboss. It will notice the new server and configure it. To change the default configuration, you need to display the Servers (Window - Show View - Servers). The Servers tab lists all the configured servers. Double click on the name of the server you want to configure. Click the "Open launch configuration" link for detailed startup configuration. In particular, you may want to allow browsers on other machines (actually only on the host computer) to access http://vm-ssoboxapp-01/cas by binding JBoss to the LAN adapter instead of just the local loopback. In the command parameters, change "-b localhost" to "-b 0.0.0.0". Also add a parameter to change the log directory "-Djboss.server.log.dir=/var/log/jbossas/standalone". If in doubt, compare this to the configuration of another JBoss server.
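
For comparison, starting the same server from a terminal with the two changes described above would look roughly like this. It is a sketch that assumes a JBoss AS 7 style layout with a bin/standalone.sh script. The function name start_sandbox_jboss is made up, and the directory you pass is whatever you unzipped under /opt/jboss.

```shell
# Sketch: start a standalone JBoss bound to all adapters, logging to /var/log.
# start_sandbox_jboss is a hypothetical helper; pass the JBoss install directory.
start_sandbox_jboss() {
    "$1/bin/standalone.sh" -b 0.0.0.0 \
        -Djboss.server.log.dir=/var/log/jbossas/standalone
}

# Usage (directory name is a placeholder):
#   start_sandbox_jboss /opt/jboss/your-jboss-directory
```

Because the VM is only reachable over the Host-Only network, binding to 0.0.0.0 in practice exposes JBoss to the host computer and the other Sandbox VM, not to the Yale network.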

Disclaimer: the modern versions (JBoss AS 7, JBoss EAP 6.x, and Wildfly 8.x) are all configured in a fairly similar way. If you want to configure an older JBoss 5, that changes the entire directory structure and conventions and may require extra research.