Virtual Workspaces - VWS


DEPRECATED: The most recent version is TP1.3.3.1


Version TP1.3.1 - VWS Technology Preview 1.3.1

This is a technology preview of the virtual machine based Workspace Service (VWS). The workspace service provides a gateway to a set of resources configured with the Xen (2.0.7 or 3.x) implementation of virtual machines.

Client facing functionality:

  1. The Workspace Factory Service, implemented for the GT4 Java container, allows a Grid client to deploy one or many Xen-based workspaces, each described by workspace metadata, according to a deployment request that specifies the resource allocation and the length of deployment.

  2. The Workspace Service, implemented for the GT4 Java container, allows a Grid client to 1) manage a workspace by restarting, stopping, pausing, or destroying it and 2) discover/monitor information about the deployment including networking assignments, deployment state, etc.

  3. The Workspace Group Service, implemented for the GT4 Java container, allows a Grid client to manage a group of workspaces by starting, stopping, pausing, or destroying them. When deploying a group via the Factory Service, a group EPR will be returned that can be used to operate on the whole group.

  4. The Workspace Ensemble Service, implemented for the GT4 Java container, allows a Grid client to manage "groups of groups." The primary purpose of ensembles is to provide a mechanism for virtual clusters (consisting of diverse workspace definitions, resource allocations and node numbers) to have all members co-scheduled (i.e., either all cluster members are scheduled at the same time or none will run).

  5. The Workspace Status Service, implemented for the GT4 Java container, allows a Grid client to consult the usage statistics that the service has tracked about that client.
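The all-or-nothing co-scheduling rule described for the Ensemble Service can be illustrated with a small sketch. This is not the service's scheduler: the WorkspaceRequest type, the co_schedule function, the node names and the memory-only capacity model are all hypothetical, chosen only to show the idea that a partial placement is never committed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkspaceRequest:
    """One member of a hypothetical ensemble: a name and the memory it needs."""
    name: str
    memory_mb: int


def co_schedule(requests, free_memory_by_node):
    """Place every request or none of them.

    Returns a dict mapping request name -> node name, or None if at least
    one request cannot be placed. The caller's capacity map is never mutated.
    """
    remaining = dict(free_memory_by_node)  # tentative copy, discarded on failure
    placement = {}
    # Largest requests first (a first-fit-decreasing heuristic, an assumption).
    for req in sorted(requests, key=lambda r: r.memory_mb, reverse=True):
        node = next(
            (n for n, free in remaining.items() if free >= req.memory_mb),
            None,
        )
        if node is None:
            return None  # one member does not fit, so nothing is scheduled
        remaining[node] -= req.memory_mb
        placement[req.name] = node
    return placement


if __name__ == "__main__":
    nodes = {"vmm1": 2048, "vmm2": 1024}
    cluster = [
        WorkspaceRequest("head", 1024),
        WorkspaceRequest("worker-1", 1024),
        WorkspaceRequest("worker-2", 1024),
    ]
    print(co_schedule(cluster, nodes))
    # Adding a fourth member exceeds the capacity, so no member is placed.
    print(co_schedule(cluster + [WorkspaceRequest("worker-3", 1024)], nodes))
```

Because the placement is computed on a copy and only returned when every member fits, a failed attempt leaves the capacity map untouched. The first-fit heuristic is simple and can reject an ensemble that a smarter packing would accept; a real resource manager would also account for time, CPU and networking.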

The set of VMM resources may be managed entirely by the workspace service or, alternatively, integrated with a site's resource manager (such as PBS) using the workspace pilot. This makes a dual-use Grid cluster possible: regular Grid jobs can run on a VMM node with no guest VMs, unless the node is at that time allocated to the workspace service. The site resource manager maintains full control over the cluster and does not need to be modified. For more information, see Flying Low: Simple Leases with Workspace Pilot.

See the Features page for general information on VWS functionality. The documentation options below explain the functionality in detail.

You can download the software from the downloads page.

Documentation

Changes in TP1.3.1 (vs. TP1.3)

Summary
  • Added support for workspace pilot resource management. The pilot is a program the service will submit to a local site resource manager in order to obtain time on the VMM nodes. When not allocated to the workspace service, these nodes will be used for jobs as normal (the jobs run in normal system accounts in Xen domain 0 with no guest VMs running). See below.

  • Added functionality to ensure multiple workspaces (including groups of workspaces) are co-scheduled. See below.

  • Various client enhancements including ensemble service support, cleaner output, and new commandline options.

  • Various bug fixes.

  • There was a WSDL update: additions, changes and new namespaces.

Services
  • Added support for workspace pilot resource management. The pilot is a program the service will submit to a local site resource manager in order to obtain time on the VMM nodes. When not allocated to the workspace service, these nodes will be used for jobs as normal (the jobs run in normal system accounts in Xen domain 0 with no guest VMs running).

    Several extra safeguards have been added to make sure the node is returned from VM hosting mode at the proper time, including support for:

    • the workspace service being down or malfunctioning
    • LRM preemption (including deliberate LRM job cancellation)
    • node reboot/shutdown

    Also included is a one-command "kill 9" facility for administrators as a "worst case scenario" contingency.

    Using the pilot is optional. By default the service does not operate with it and instead directly manages the nodes it is configured to manage.

  • Added functionality to ensure multiple workspaces (including groups of workspaces) are co-scheduled. This includes the introduction of the Workspace Ensemble Service. This functionality allows a complex virtual cluster to have all of its component workspaces scheduled to run at once if that is necessary. This works with both the default and pilot-based resource managers.

  • All remote interfaces (WSDLs/schemas) have been updated with at least new namespaces. You can examine them directly online at the WSDL and XSD files page (or read the descriptions in the Interfaces section). The main differences are an extension to the factory create/deploy operation and the addition of the ensemble service.

  • SSH based workspace-control invocations may now be configured with an alternate private key.

  • SSH based workspace-control invocations now use options to ensure easier identification of misconfigurations (no password entry hang is possible now).

  • If using the pilot mechanisms, a new configuration section in the service configuration file needs to be uncommented for pilot specific configurations (see the configuration comments there).

  • If using the pilot mechanisms, a client is no longer permitted to submit a flag to the factory requesting that the workspace be unpropagated after the running time has elapsed. Instead, unpropagation must be triggered manually by a client before this deadline is reached.

  • If using the pilot mechanisms, a shared secret must be configured in etc/workspace_service/pilot/users.properties for HTTP digest access authentication based notifications from the pilot. Use the included shared-secret-suggestion.py script. Alternatively, SSH may be used for notifications, but it is slower.

  • New dependencies (these are distributed with the service):

    • backport-util-concurrent
    • jetty - only necessary if using the pilot with the faster, default HTTP digest access authentication based notifications.

  • Some platform and JVM combinations have buffer size issues that caused some workspace-control invocations to fail. This problem has been addressed.

  • DHCP based network delivery to the VMs now requires unique hostnames for each allocatable address (even if they do not resolve to an IP). This addresses Bug #5738.
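The unique-hostname requirement for DHCP delivery can be checked before the service is started. The following is a minimal sketch, not part of VWS: the find_duplicate_hostnames function is hypothetical, and it assumes the allocatable addresses have already been read into (hostname, IP address) pairs, since the on-disk format of the address pool is not described here.

```python
from collections import defaultdict


def find_duplicate_hostnames(allocations):
    """Return {hostname: [ip, ...]} for hostnames used by more than one address.

    `allocations` is an iterable of (hostname, ip_address) pairs. Hostnames
    are compared case-insensitively, as DNS names are.
    """
    ips_by_hostname = defaultdict(list)
    for hostname, ip in allocations:
        ips_by_hostname[hostname.strip().lower()].append(ip)
    return {h: ips for h, ips in ips_by_hostname.items() if len(ips) > 1}


if __name__ == "__main__":
    # Hypothetical address pool; the hostnames do not need to resolve.
    pool = [
        ("ws-01", "10.0.0.11"),
        ("ws-02", "10.0.0.12"),
        ("WS-01", "10.0.0.13"),
    ]
    for hostname, ips in find_duplicate_hostnames(pool).items():
        print(f"hostname {hostname!r} is shared by: {', '.join(ips)}")
```

An empty result means every allocatable address has its own hostname, which is the condition the release note asks for.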

Reference clients
  • A new client workspace-ensemble allows you to destroy all workspaces in a running ensemble as well as trigger the workspaces in the ensemble to be co-scheduled and (afterwards) allowed to launch. This trigger is also available in the last workspace deployment of the ensemble, if desirable (this will save a web services operation).

  • Enhancement Bug #5795 is addressed: an early unpropagate request can now be sent. The new workspace action is "--shutdown-save" and requires a single or group workspace EPR.

  • The workspace program includes a new flag "--trash-at-shutdown" which allows callers to request that the service simply discard the VM after use (instead of unpropagating it). This is typical behavior for virtual cluster compute nodes, for example. The functionality itself is not new in this release, just this flag. It allows callers to include the flag when using commandline based resource requests as well as to override a given resource request file with a trash-at-shutdown flag.

  • The workspace program has improved output, especially in the cases where you are launching groups and ensembles.

Workspace-control
  • Note: a previously used TP1.2.3 or TP1.3 configuration file for workspace-control will still work because of the nature of these changes. See the migration section of the administrator's guide for details.

  • A bug with failed propagations has been addressed: Bug #5681.

  • Will now support older ISC DHCP versions (v2 servers). See Bug #5470.

  • The default paths for ebtables and the dhcpd.conf file are now the more common occurrences:

    • /sbin/ebtables is now /usr/sbin/ebtables
    • /etc/dhcp/dhcpd.conf is now /etc/dhcpd.conf

Workspace pilot program
  • This is a new tarball on the download page and is only necessary when using pilot based resource management.