The purpose of this Quickstart Guide is to familiarize you with the client side of the Workspace Service. To accomplish this, we will describe the steps necessary to deploy an example workspace, and then explain how you can customize this example to meet your own needs. You should find this guide useful if you have installed the Workspace Service yourself and want to test it, or if you are submitting workspaces to a Workspace Service administered by someone else.
To work through this guide, the Workspace Service's client environment must be installed on the machine from which you will be submitting workspaces. Although installation of the Workspace Service is outside the scope of this guide, we will point you to the specific sections of the Administrator's Guide that explain how to install a client environment on your machine.
This guide is divided into two parts. Sections 2 through 6 describe the deployment of the example workspace, with all necessary files (VM image, configuration files, etc.) provided for you. These first sections are meant to be read sequentially. The next sections, sections 7 and above, can be read in any order, as each section provides instructions on how you can customize a particular aspect of workspace deployment (e.g., using your own VM image, modifying deployment parameters, etc.).
If you are an experienced Globus user and are already familiar with the workspace service, you may want to skip to the Workspace client quicker-start.
To deploy a workspace, you will need to install the Workspace Service's client environment on the client node (the machine from which you will be accessing the Workspace Service). This need not be the same machine as the VWS node (the machine hosting the Workspace Service). In particular:
Another option is to use a VM with the Workspace Service's client environment preinstalled in it. You can find VMs like this in the Workspace Marketplace. However, take into account that these VMs may require additional configuration to work properly. If so, please make sure you follow the configuration instructions included with the VM image you choose from the Marketplace.
Now that you have a working installation of the client environment, you will need a virtual machine to deploy as a workspace. Although you can use any VM image, in this guide we will use a small and simple VM image as an example. This VM image is provided as part of this guide, and is also a part of the Workspace Marketplace.
In particular, we will be using a ttylinux-based image. ttylinux is a very small, yet functional, Linux distribution requiring only around 4 MB. Visit the ttylinux home page for a list of some of its many nice features.
You can download a tarball containing the image here. Untar the file in an empty directory. This directory should contain the following files:
ttylinux-xen: The VM image
ttylinux-xen.conf: A sample Xen configuration for the VM image.
Take into account that the provided ttylinux image is not the exact same image you can download from the ttylinux home page. It is preconfigured to obtain a network address through DHCP, and depends on a different root device than a regular ttylinux image (to make the image Xen-friendlier).
If you have access to a machine running Xen (preferably, a machine that will be used by your Workspace Service to run your workspaces), you can test the image using the provided Xen configuration file. First of all, make sure you replace the values of the kernel and disk parameters inside the file with appropriate values. In particular, you should point kernel to your Xen guest kernel and disk to the location of the ttylinux-xen file. The contents of the configuration file should look something like this:
kernel = "/boot/vmlinuz-2.6-xen"
memory = 64
name = "ttylinux"
disk = ['file:/root/ttylinux-xen,sda1,w']
root = "/dev/sda1 ro"
vif = ['']
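If you test the image on more than one machine, the two values that change (the guest kernel and the image location) can be filled in by a small shell helper. The following is a minimal sketch and not part of the Workspace Service distribution: the function name write_xen_conf and the temporary output directory are hypothetical, and the kernel and image paths are simply the placeholder values shown above.

```shell
#!/bin/sh
# Sketch: generate a Xen configuration for the ttylinux image.
# write_xen_conf is a hypothetical helper, not a Workspace Service tool.
write_xen_conf() {
    kernel="$1"   # path to your Xen guest kernel
    image="$2"    # absolute path to the ttylinux-xen file
    out="$3"      # configuration file to write
    cat > "$out" <<EOF
kernel = "$kernel"
memory = 64
name = "ttylinux"
disk = ['file:$image,sda1,w']
root = "/dev/sda1 ro"
vif = ['']
EOF
}

# Example values only; replace them with the paths on your Xen host.
conf_dir=$(mktemp -d)
write_xen_conf /boot/vmlinuz-2.6-xen /root/ttylinux-xen "$conf_dir/ttylinux-xen.conf"
cat "$conf_dir/ttylinux-xen.conf"
```

The generated file can then be passed to xm create in place of the provided ttylinux-xen.conf.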
To test the image, simply run the following:
xm create ttylinux-xen.conf -c
You should see ttylinux boot messages, followed by a login prompt. You can log in using user 'root' and password 'root'. You can use ifconfig to check that the networking interface (eth0) is correctly set up. If you do not have a DHCP server, you can also use ifconfig to configure eth0 with a static IP address. Once the network is correctly set up, you should be able to ping a machine other than dom0 from the VM, and to ping the VM's address from that machine. Note that all of this is only meant to verify that the image works correctly. Keep the image configured to obtain a network address automatically via DHCP (even if your network doesn't have a DHCP server), as the Workspace Service will use DHCP to dynamically set up networking in your workspace.
Now that you've installed the client environment and have a test VM image, the next step is configuring your workspace. To successfully deploy a workspace, the Workspace Service needs to know certain configuration information such as where the VM image is located, what network devices must be configured on the VM, etc. This information is contained in the workspace metadata file, which is sent to the Workspace Service when deploying a workspace. For now, we will use a sample metadata file which will require only a few modifications to work. Later on, after you have deployed your first test workspace, we will see what other configuration parameters can be modified in the metadata file (in section Customizing the workspace metadata). You can also find a more technical discussion of metadata in this section of the workspace interface documentation.
The metadata file we will be using is included with the Workspace Service client environment. You can find it in $GLOBUS_LOCATION/share/workspace_client/sample-workspace.xml. Copy this file to another directory, as we will need to make some modifications to it. Right now, you will only need to modify one value. In the next section, depending on how you choose to deploy the workspace, you may need to modify an additional value.
Locate the aggregateVirtualWorkspace/vwSet/virtualWorkspace/logistics element. It should look like this:
<log:logistics>
<log:networking>
<log:nics>
<log:number>1</log:number>
<log:nic>
<log:name>eth0</log:name>
<log:ipConfig>
<log:acquisitionMethod>AllocateAndConfigure</log:acquisitionMethod>
</log:ipConfig>
<log:association>public</log:association>
</log:nic>
</log:nics>
</log:networking>
</log:logistics>
You will need to set the value of the association element (public in the sample above) to an association accepted by the Workspace Service you are using. If you have installed the Workspace Service yourself, following the Administrator's Guide, just use any of the associations you configured. If you are using a Workspace Service that is not under your control, you can query the service to find out what associations it supports. To do this, you can use the workspace client:
$GLOBUS_LOCATION/bin/workspace --factoryrp -s https://example.com:8443/wsrf/services/WorkspaceFactoryService
Make sure you use the URI for the Workspace Factory Service. The client will print information about the Workspace Service, including the supported associations. If you do not know which one to use, consult the administrator of the Workspace Service you want to use, to make sure you are using the correct association.
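Once you know which association to use, the edit to your copy of the metadata file can be scripted. The following is a minimal sketch, assuming the association element sits on a single line as in the sample; set_association is a hypothetical helper name and "private" is only an example association, so substitute one that your Workspace Service actually reports.

```shell
#!/bin/sh
# Sketch: set the <log:association> value in a workspace metadata file.
# set_association is a hypothetical helper; "private" is an example name.
set_association() {
    file="$1"
    name="$2"
    # Rewrite via a temporary file, since in-place sed flags differ across systems.
    sed "s|<log:association>[^<]*</log:association>|<log:association>$name</log:association>|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demonstration on a one-line stand-in for the real metadata file.
meta=$(mktemp)
echo '<log:association>public</log:association>' > "$meta"
set_association "$meta" private
cat "$meta"
```

Because the name is substituted into a sed expression, this sketch only works for association names that contain no '|' or '&' characters.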
At this point, you are ready to deploy the test workspace. In this section we will see two ways of deploying the workspace:
Using prepropagated images. This option should only be used to verify that the Workspace Service has been correctly installed. In this setup, the VM image will be copied manually to the VMM nodes (instead of being automatically deployed by the Workspace Service). Of course, this requires access to the VMM nodes (which you will have access to if you are installing the Workspace Service). If you are submitting a workspace to a remote Workspace Service controlled by someone else, you do not need to follow the steps for this setup, but should nonetheless read through them, as they will be relevant later on.
Using image propagation. In this setup, the VM image is propagated to the VMM node from a well-known location, such as a VM image repository in the same site as the Workspace Service or a third-party repository. The Workspace Service manages all image transfers automatically, and provides greater control on what happens when a workspace's lifetime ends.
You can find a more detailed discussion of how propagation works at the end of this guide, in the Propagation Basics section.
As mentioned above, don't forget that this setup is only for test purposes. You will only be able to follow the steps described here if you have installed the Workspace Service yourself, following the instructions described in the Administrator's Guide. If you are using a remote Workspace Service, you should still read this section as much of the information mentioned here will be relevant when submitting workspaces with image propagation.
First of all, we need to manually propagate our test ttylinux-xen image to the VMM nodes. If you have a default Workspace Service installation, you will have to copy the file to /opt/workspace/images in all the VMM nodes. If you are using non-default paths, you will need to copy the file to the directory specified in option localdir in your worksp.conf file.
Next, make sure that security is properly set up:
On the client node, your UNIX user (or whatever user you will be using to submit the workspace) has a valid certificate that can be authenticated by the VWS node. Also, don't forget to generate a proxy certificate. If you have no idea what the two previous sentences are stating, you need to brush up on fundamental Globus security concepts. The Security section of the GT4 Programmer's Tutorial contains a good explanation of certificates and general Globus security topics. The GT4 Quickstart Guide and the System Administrator's Guide provide instructions on how to generate user certificates.
On the VWS node, your user is authorized to access the Workspace Service. The Administrator's Guide describes how to authorize users by adding them to the Workspace Service's gridmap file.
You can now go ahead and run the following command to submit your workspace:
$GLOBUS_LOCATION/bin/workspace --file workspace.epr \
--metadata sample-workspace.xml \
-s https://127.0.0.1:8443/wsrf/services/WorkspaceFactoryService \
--deploy-duration 30 --deploy-mem 64 --deploy-state Running
The meaning of each parameter is explained later on in this guide. For now, you have to make sure that the --metadata parameter refers to the workspace metadata file you have been editing in this guide, and that the -s parameter refers to the URI of the Workspace Factory Service.
If all goes well, you should see output similar to the following:
Reading in metadata ... ok.
Creating deployment request from arguments... ok.
*** Deployment request:
- Node number: 1
- minDuration: 1800 seconds
- State: Running
- Default shutdown mechanism: Normal
- individualPhysicalMemory:
- exact: 64.0
*** Creating workspace "http://example1/localhost/image"... ok.
Resource key: 1
Instantiation time: Thu Apr 12 11:55:07 CDT 2007
Duration: 1800 seconds (roughly 30 minutes)
Shutdown time: Thu Apr 12 12:25:07 CDT 2007
Resource Termination time: Thu Apr 12 12:55:07 CDT 2007
Wrote EPR to 'workspace.epr'
Subscribed to termination notification.
Subscribed to deployment changes. Waiting.
*** Deployment:
- State changed: Unstaged --> Propagated
*** Network configuration:
- NIC #1
- ------------
- Name: eth0
- MAC: ANY
- IP configuration: AllocateAndConfigure
- IP address: 192.168.100.1
- IP gateway: 192.168.0.1
Thu Apr 12 11:55:13 CDT 2007 -- Received notification:
*** Deployment:
- State changed: Propagated --> Running
You can test whether your workspace has been correctly deployed by using the IP address printed by the workspace client. You can try pinging it and, if you are using the provided ttylinux-xen image, logging into it:
ssh root@xxx.xxx.xxx.xxx
Remember that the root password for the ttylinux-xen image is "root".
The previous example assumes that the VM images are prepropagated to the VMM nodes, which is only useful for testing purposes. The Workspace Service is capable of managing the VMM nodes and propagating VM images to those nodes whenever a workspace has to be instantiated on a node. Unpropagating the images and staging them out is also handled by the Workspace Service. We will build on the previous example to have the Workspace Service automatically propagate the VM image, instead of having to copy it manually.
In this example, we will assume that the VM image has already been staged to an image node in the same site as the Workspace Service. If you are using a Workspace Service you installed yourself, make sure that image propagation is correctly set up as described in the Administrator's Guide, and copy the ttylinux-xen image to a well-known location on that node. If you are using a Workspace Service administered by someone else, you must ask the Workspace Service administrator to provide you with the location of a valid image in the image node (or provide them with the ttylinux image and ask them to add it to the image node). In both cases, let's assume that the ttylinux-xen image is located in directory /usr/share/vm_images/ of the image node.
Now, we need to edit our metadata file to indicate that we will be transferring a file from the image node. Locate the aggregateVirtualWorkspace/vwSet/virtualWorkspace/definition element. It should look like this:
<def:definition>
<def:requirements>
<jsdl:CPUArchitecture>x86</jsdl:CPUArchitecture>
<def:VMM>
<def:type>Xen</def:type>
<def:version>3</def:version>
</def:VMM>
</def:requirements>
<def:modules>
<def:disk>
<def:logicalName>sampledisk1</def:logicalName>
<def:uri>file://ttylinux-xen</def:uri>
</def:disk>
</def:modules>
<def:bindings>
<def:binding>
<def:logicalName>http://default</def:logicalName>
<def:diskCollection>
<def:rootVBD>
<def:logicalName>sampledisk1</def:logicalName>
<def:mountAs>sda1</def:mountAs>
<def:permissions>ReadWrite</def:permissions>
</def:rootVBD>
</def:diskCollection>
</def:binding>
</def:bindings>
</def:definition>
You need to replace the value of the uri element (file://ttylinux-xen in the sample above) with the path to the VM image in the image node. For this example we will use SCP propagation (file copying through SSH), although the Workspace Service also supports GridFTP for propagation. So, assuming the hostname of the image node is imagenode.foobar.net and we are using the image path mentioned earlier, the uri element should look like this:
<def:uri>scp://imagenode.foobar.net/usr/share/vm_images/ttylinux-xen</def:uri>
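This change can also be scripted. The following is a minimal sketch, assuming the uri element sits on a single line as in the sample; set_image_uri is a hypothetical helper name, and the scheme check only covers the three URI schemes mentioned in this guide (file://, scp:// and gsiftp://).

```shell
#!/bin/sh
# Sketch: point <def:uri> at an image, accepting only the URI schemes
# mentioned in this guide. set_image_uri is a hypothetical helper.
set_image_uri() {
    file="$1"
    uri="$2"
    case "$uri" in
        file://*|scp://*|gsiftp://*) ;;
        *) echo "unsupported URI scheme: $uri" >&2; return 1 ;;
    esac
    # Rewrite via a temporary file, since in-place sed flags differ across systems.
    sed "s|<def:uri>[^<]*</def:uri>|<def:uri>$uri</def:uri>|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demonstration on a one-line stand-in for the real metadata file.
meta=$(mktemp)
echo '<def:uri>file://ttylinux-xen</def:uri>' > "$meta"
set_image_uri "$meta" scp://imagenode.foobar.net/usr/share/vm_images/ttylinux-xen
cat "$meta"
```

Rejecting unknown schemes before touching the file keeps a typo in the URI from silently ending up in the metadata you submit.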
Now, to submit your workspace, you can run the same command as the one used in the previous example:
$GLOBUS_LOCATION/bin/workspace --file workspace.epr \
--metadata sample-workspace.xml \
-s https://127.0.0.1:8443/wsrf/services/WorkspaceFactoryService \
--deploy-duration 30 --deploy-mem 64 --deploy-state Running
Recall that the last three parameters (--deploy-duration, --deploy-mem and --deploy-state) make up the deployment request. These can also be expressed in an XML file, which can be reused in other deployments. A sample deployment request, expressing these three deployment parameters, can be found in $GLOBUS_LOCATION/share/workspace_client/sample-deployment-request.xml. To use the deployment request XML file, simply run the workspace client like this:
$GLOBUS_LOCATION/bin/workspace --file workspace.epr \
--metadata sample-workspace.xml \
--request sample-deployment-request.xml \
-s https://127.0.0.1:8443/wsrf/services/WorkspaceFactoryService
As before, make sure that the -s parameter refers to the URI of the Workspace Factory Service you will be using.
If all goes well, the output of the workspace client should be very similar to the output shown earlier. There are a few important differences: the deployment request is now read from a file, the duration is 120 seconds rather than 1800, and the workspace passes through the Unpropagated state before it starts running:
Reading in metadata ... ok.
Reading in deployment request ... ok.
*** Deployment request:
- Node number: 1
- minDuration: 120 seconds
- State: Running
- Default shutdown mechanism: Normal
- individualPhysicalMemory:
- exact: 64.0
*** Creating workspace "http://example1/localhost/image"... ok.
Resource key: 6
Instantiation time: Thu Apr 12 20:00:24 CDT 2007
Duration: 120 seconds (roughly 2 minutes)
Shutdown time: Thu Apr 12 20:02:24 CDT 2007
Resource Termination time: Thu Apr 12 20:32:24 CDT 2007
Wrote EPR to 'workspace.epr'
Subscribed to termination notification.
Subscribed to deployment changes. Waiting.
*** Deployment:
- State changed: Unstaged --> Unpropagated
*** Network configuration:
- NIC #1
- ------------
- Name: eth0
- MAC: ANY
- IP configuration: AllocateAndConfigure
- IP address: 192.168.100.1
- IP gateway: 192.168.0.1
Thu Apr 12 20:00:32 CDT 2007 -- Received notification:
*** Deployment:
- State changed: Unpropagated --> Running
If you allow the workspace to run to its full duration, you will see the following messages:
Thu Apr 12 20:02:36 CDT 2007 -- Received notification:
*** Deployment:
- State changed: Running --> Propagated
Thu Apr 12 20:02:41 CDT 2007 -- Received notification:
*** Deployment:
- State changed: Propagated --> TransportReady
Now that you have deployed your workspace, you can perform certain operations on your workspace. Remember that, after creating a workspace, the workspace client will print out notification messages to the console informing you of changes in the state of the workspace. Make sure you keep this client running, so you can verify that the management operations are being correctly carried out. In a separate shell, we will use the workspace client to pause/resume/stop our workspace. To refer back to our workspace in this separate shell, we will use the workspace.epr file, created in the previous section. This file contains the EPR (Endpoint Reference) of the workspace. If you are not familiar with EPRs, you can think of this as a 'pointer' to our workspace.
Pausing the VM: Run the following:
workspace -e workspace.epr --pause
In that shell you should see 'Paused.' printed.
Back in the first shell, where the client is waiting on notifications for state changes, the following is printed along with a timestamp: State changed: Running --> Paused
The VM should no longer respond to pings. If you have access to the VMM node it is running on, you can log in and verify that it is in a paused state.
Resuming the VM: Run this command:
workspace -e workspace.epr --start
The opposite should happen: 'Started.' is printed to this console, and the other client receives another notification: State changed: Paused --> Running
Stopping the VM: There are three ways of stopping the VM: rebooting ('--reboot' parameter), graceful shutdown ('--shutdown' parameter), and destruction ('--destroy' parameter). This last operation will completely destroy the workspace including the WSRF resource associated with the instance. The running VM itself will be immediately powered down (no chance for graceful shutdown).
workspace -e workspace.epr --destroy
Output in the other client:
Wed Mar 14 10:20:09 CDT 2007 -- Resource terminated
Shutting down notification consumer ... ok.
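If you issue these operations often, they can be collected in a small wrapper. The following is a minimal sketch: ws_op and the DRY_RUN variable are hypothetical names introduced here, and only the five operation flags described above are accepted. With DRY_RUN set, the wrapper prints the command instead of contacting the Workspace Service, which is convenient for checking a script before pointing it at a real workspace.

```shell
#!/bin/sh
# Sketch: wrap the management operations shown above.
# ws_op and DRY_RUN are hypothetical names introduced for illustration.
ws_op() {
    epr="$1"
    op="$2"
    case "$op" in
        pause|start|reboot|shutdown|destroy) ;;
        *) echo "unknown operation: $op" >&2; return 1 ;;
    esac
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of contacting the Workspace Service.
        echo "workspace -e $epr --$op"
    else
        workspace -e "$epr" "--$op"
    fi
}

# Dry run: only prints the three commands used in this section.
DRY_RUN=1
ws_op workspace.epr pause
ws_op workspace.epr start
ws_op workspace.epr destroy
```

Unset DRY_RUN to run the real client, which must then be on your PATH.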
Up to this point, you have been using a sample workspace metadata file well suited for the ttylinux workspace (in fact, you can find more sample metadata files in $GLOBUS_LOCATION/share/workspace_client). In this section we will not provide an in-depth description of every configuration parameter that can be specified in the metadata file (you can find this in the interfaces page), but we will point you to the parameters that you will probably have to adjust to write metadata files for new workspaces (e.g., using a different VM image).
First, let's look at the top-level structure of the metadata file (some content omitted for brevity):
<aggregateVirtualWorkspace ... >
<name>http://example1/localhost/image</name>
<vwSet>
<nodeNumber>1</nodeNumber>
<virtualWorkspace>
<name>http://example1/localhost/image</name>
<log:logistics>
...
</log:logistics>
<def:definition>
...
</def:definition>
</virtualWorkspace>
</vwSet>
</aggregateVirtualWorkspace>
Note the following:
The two <name> elements. These are not required to be unique across workspaces, although you can choose to follow that convention. These names can be used by users or higher level tools to manage workspaces in various ways. The unique identifier of each workspace will always be the EPR generated when we create a new workspace.
The <nodeNumber> element. The value of this element must be 1. This is used in an advanced use case of virtual clusters (where you as the client are asking for many instances of these images to be deployed simultaneously in some coordinated fashion). Virtual clusters are not supported in any released code as of yet, but you can read about prototype work with them in this paper.
There are two main sections, <logistics> and <definition>, which are discussed next.
A full description of the <definition> section is given in this section of the interface documentation.
The <definition> section may seem confusing at first but it is organized to allow for potentially complicated scenarios. If you want to use a different VM image, you can safely ignore most of the <definition> section, and focus on the following two elements:
The <uri> element. This element can be found inside the <modules> - <disk> element. It tells the workspace service how to find the VM image so it can instantiate it. In our first example, we used a file:// URI, indicating we were using a prepropagated image. Our second example used an scp:// URI, indicating the VM image had to be propagated (using SCP) from an image node. You can also use gsiftp:// URIs to use GridFTP to propagate the VM image.
The <mountAs> element. This allows you to specify the disk device the VM image will appear under in the VM. In many VMMs, including Xen, the guest kernel can be made to see what it thinks are physical disks and partitions with an assortment of labels. For example, our ttylinux-xen image appeared as the sda1 partition. However, it is important to note that, inside any OS, there must also be directions for mapping partitions as particular volumes or locations that the users see. In Linux this is done with the '/etc/fstab' file, so you must make sure that the partition name specified in the <mountAs> element corresponds to the partition mounted in the fstab file.
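The correspondence between <mountAs> and the fstab file can be checked before you submit a new image. The following is a minimal sketch, assuming you have a copy of the guest's /etc/fstab at hand and that it refers to partitions by /dev path rather than by label; fstab_has_device is a hypothetical helper name and the fstab contents in the demonstration are invented for illustration.

```shell
#!/bin/sh
# Sketch: check that an fstab file mounts the device named in <mountAs>.
# fstab_has_device is a hypothetical helper; it expects a copy of the
# guest's /etc/fstab, obtained however is convenient for you.
fstab_has_device() {
    fstab="$1"
    device="$2"   # the <mountAs> value, for example sda1
    # Succeed if some line has /dev/<device> as its first field
    # (comment lines start with '#', so they never match).
    awk -v dev="/dev/$device" '$1 == dev { found = 1 } END { exit !found }' "$fstab"
}

# Demonstration with an invented two-line fstab.
fstab=$(mktemp)
printf '/dev/sda1 / ext2 defaults 1 1\nproc /proc proc defaults 0 0\n' > "$fstab"
if fstab_has_device "$fstab" sda1; then
    echo "sda1 is mounted by fstab"
else
    echo "sda1 is missing from fstab"
fi
```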
A full description of the <logistics> section is given in this section of the interface documentation.
The <nic> element allows you to configure the VM's network interface. You can assign a different name to the network device (using the <name> element) or change how the VM's IP address is acquired. In our example, we have been using the 'AllocateAndConfigure' acquisition method. This IP configuration acquisition method directs the service to lease an IP address from a pool of available addresses (this is the 'Allocate' part) and then configure the VM with this address on the fly (this is the 'Configure' part). On the fly configuration of the leased address is accomplished via DHCP and therefore the guest VM needs to be set up to make a DHCP request upon booting. There are other acquisition methods, explained in this section of the interface documentation.
Because we are using 'Allocate', you need to specify the association to request an address from. As mentioned earlier, if you are setting up a Workspace Service yourself, you will be able to configure the available associations yourself. Otherwise, you can query the Workspace Service for a list of valid associations. The association name is specified in the <association> element. For further explanation, see this section of the interface documentation.
There are currently three deployment parameters you can modify:
--deploy-duration minutes
This requests a specific deployment duration (in minutes).
--deploy-mem MB_amount
MB of RAM to be allocated to the workspace.
--deploy-state initial_state
This requests that the workspace be moved to state initial_state as soon as possible. For reference, consult the list of states a workspace may go through in its lifecycle at this section of the interfaces documentation.
You can request other states as well. For example, you could request 'Propagated', which moves the workspace to just before the VM is instantiated at the VMM, where it will wait. You could then come back at a later point, invoke 'start' on the Workspace Service, and launch the VM. Requesting 'Paused' is similar. When you are staging from remote grid nodes, requesting 'Unpropagated' means that the staging is carried out and the workspace then stays unpropagated until further notice.
As mentioned in the second example, deployment parameters can also be specified in an XML file.
If you just want to get started as soon as possible and have some Globus experience, this is the section for you.
Prerequisites:
$GLOBUS_LOCATION is set
$GLOBUS_LOCATION/bin is in your $PATH
$ANT_HOME is set
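These prerequisites can be verified in one step before invoking the client. The following is a minimal sketch; check_client_env is a hypothetical helper name, and the PATH test is a plain string match, so an entry written with a trailing slash would not be recognized.

```shell
#!/bin/sh
# Sketch: verify the three prerequisites listed above.
# check_client_env is a hypothetical helper, not part of the client.
check_client_env() {
    rc=0
    [ -n "${GLOBUS_LOCATION:-}" ] || { echo "GLOBUS_LOCATION is not set"; rc=1; }
    [ -n "${ANT_HOME:-}" ] || { echo "ANT_HOME is not set"; rc=1; }
    # Plain string match of $GLOBUS_LOCATION/bin against the PATH entries.
    case ":$PATH:" in
        *":${GLOBUS_LOCATION:-}/bin:"*) ;;
        *) echo "\$GLOBUS_LOCATION/bin is not in PATH"; rc=1 ;;
    esac
    return "$rc"
}

if check_client_env; then
    echo "client environment looks ready"
else
    echo "fix the items listed above before running the client"
fi
```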
Optional (requires post-TP1.2.3 client):
Invocation:
workspace --file eprfile.xml --metadata hello.xml --request req.xml -s <factory URL>
Invocation with pubkey option (requires post-TP1.2.3 client):
workspace --file eprfile.xml --metadata hello.xml --request req.xml -s <factory URL> --ssh-pubkey-file ~/.ssh/id_rsa.pub
The client blocks by default, listening for notifications. Wait for the client to print that it has received a 'Running' notification. Use the printed IP address or hostname to contact the VM (if you sent an SSH key, you can use it to log in to root@hostname).
The workspace client ($GLOBUS_LOCATION/bin/workspace) which we have been using throughout this guide can accept several parameters. You can see the complete list of available parameters by running:
$GLOBUS_LOCATION/bin/workspace --help
The following is a selection of some of the more relevant parameters:
--service url
-s url
We'll use this to specify the URL to the Workspace Factory Service you want to use for deployment.
--metadata metadata_file
This is the location of the metadata file for the workspace we are deploying.
--request deployment_file
If we are not specifying the deployment parameters as command-line parameters, we use this option to specify the file containing the deployment parameters.
--file epr_file
Only when creating a new workspace
The Workspace Factory Service creates a unique EPR for each workspace it is managing. After going through the deploy process, the workspace client will write this EPR to a local file (epr_file). We will use this file later (with the '-e' option) to allow us to contact the Workspace Service concerning this particular workspace instance.
--eprFile epr_file
-e epr_file
Only when accessing an existing workspace
When accessing a workspace (e.g., to shut it down), this parameter is required to specify the EPR of the workspace we want to operate on.
Installing the workspace client adds several files to the GLOBUS_LOCATION directory tree:
The $GLOBUS_LOCATION/bin/workspace program.
The $GLOBUS_LOCATION/share/schema directory now contains the 'workspace' directory of WSDL and XML Schema files.
The $GLOBUS_LOCATION/share directory now contains the workspace_client directory. This contains sample workspace metadata files and resource request files. It also contains several example scripts that run different sample interactions with the Workspace Service.
Before reading about propagation, it helps to review the list of states a workspace may go through in its lifecycle, which can be found in this section of the interfaces documentation.
The resource pool model of the Workspace Service (the most useful and common deployment of the service) allows one grid service to manage a large set of VMMs. There are many ways to instantiate a VM but the base case (and the case we are walking through in our examples) is to do so from a local filesystem that the VMM can see.
To fulfill a workspace request by running a VM at the site, the workspace service will pick a VMM node (based on criteria and an algorithm) and send directives to it to start a VM there. If the user has requested it, the service can also initiate propagation, in which that node transfers a copy of the VM's files to its local filesystem.
After the deployment is over (if the client has requested a shutdown or if the maximum running time has been reached) the files are unpropagated back to where they came from, replacing the originals. Thus, state can be maintained over many deployments.
The resource pool model treats the group of VMMs as a set of available raw resources and these VMMs do not necessarily have grid middleware on them or even routable IP addresses. This prevents direct transfers from the wide area network to a particular VMM. Instead the workspace infrastructure takes over to deliver VM files the 'last mile' on the fast site network.
There is an optional way to include an 'external' transfer first (and/or after deployment) in a workspace deployment request. See the optional parameters section of the interfaces documentation.
You have an option to just destroy (destroy is different than shutdown) the workspace, causing any files on the VMM nodes to be immediately deleted instead of unpropagated. This can be useful for situations where you can always start VMs from the same, 'fresh' state.
This only actually happens if you've called destroy while the workspace is in stages of its lifecycle that are at the VMM. For example, if the workspace has already been unpropagated then this would not bypass unpropagation.
There is also an option to make no-unpropagation the default shutdown option for a particular workspace deployment. When the maximum running time has been reached and the workspace service automatically starts tearing down the VM, there will be no unpropagation stage.
On every VMM node there is a directory created for each workspace instance that is used to store files that have been propagated specifically for that instance.
There is also another directory that allows files to be cached and used by any workspace deployed on that node. You can control who is allowed to use files from this cache by using the Workspace Service authorization callout and defining access rules about each file. But by default this is left as open access.
For simple tests, such as the first example of this guide, this serves as a way to quickly test that the basic mechanisms of the service and workspace-control program are working. By placing your VM in this directory you can point to it in metadata by using a relative 'file://' URL.
Because of the default open access, this is mostly for test purposes. Depending on the nature of your Workspace Service, it could be possible that several people deploy workspaces that use files from this cache simultaneously. If the file in question is mounted as read/write that means that more than one kernel now thinks it has exclusive access which can result in filesystem corruption.
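A quick check on your metadata file can flag the risky combination described above before you submit it. The following is a minimal sketch: warn_shared_rw is a hypothetical helper name, and the check is a line-based heuristic rather than an XML parser, so it assumes the uri and permissions elements are written as in the sample metadata file.

```shell
#!/bin/sh
# Sketch: warn when a metadata file combines a file:// image URI with
# ReadWrite permissions. A line-based heuristic, not an XML parser.
# warn_shared_rw is a hypothetical helper name.
warn_shared_rw() {
    file="$1"
    if grep -q '<def:uri>file://' "$file" &&
       grep -q '<def:permissions>ReadWrite</def:permissions>' "$file"; then
        echo "warning: $file mounts a file:// image read/write"
        return 1
    fi
    return 0
}

# Demonstration on a two-line stand-in for the real metadata file.
meta=$(mktemp)
printf '%s\n' '<def:uri>file://ttylinux-xen</def:uri>' \
    '<def:permissions>ReadWrite</def:permissions>' > "$meta"
warn_shared_rw "$meta" || true
```

A warning here does not mean the deployment will fail; it only indicates that another workspace using the same cached file could corrupt it.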