Analysis(Publisher)

Analysis(Compute)

Publishing Analysis Applications

A set of analysis applications is published by a Publisher (compute workstation).
A Facility Administrator can add a new publisher from the workflow link of the web client by giving it an appropriate name and description. The publisher name must be unique.
When a publisher is added successfully, it receives a "publisher code". This code must be set in the publisher.properties file and is used on the server side to authenticate legitimate publishers.
Typically, a publisher runs in the following directory structure:
- Publisher_Root_Directory/
  - publisher.properties
  - run-compute.sh
  - icompute.properties
  - lib/
  - apps/
publisher.properties contains the publisher name and publisher code:

publisher.name=Publisher 1
publisher.code=PUBLISHER_KZqs7QGcm92iUAyXAorZePkM55uCNd

icompute.properties contains the various properties used by the publisher:

# directory where binaries, source, libraries and properties for Compute Algorithms are specified
icompute.applications.root.dir=apps
# url for iManage compute web services
iworkers.webservice.url=http://localhost:8080/iManage/services/iWorkers
# interval in milliseconds to ping iManage server
icompute.ping.interval=15000
icompute.threadpool.size=4
# extension for application launcher files; has to be changed to .bat on a Windows machine
icompute.applauncher.extension=.sh
# properties for task manager
iengine.compute.task.manager=com.strandgenomics.imaging.icompute.ComputeTaskManager
iengine.external.task.manager=com.strandgenomics.imaging.icompute.torque.TorqueTaskManager
iengine.compute.task.type=EXTERNAL
# properties for logging
iengine.log.dir=logs
iengine.log.size=1000000
iengine.log.interval=1
iengine.log.level=INFO
iengine.log.scope=com.strandgenomics.imaging
iengine.log.filename=avadis_worker_log.txt

run-compute.sh is a script that runs on the publisher machine.
lib/ contains the libraries the publisher needs to coordinate with the iManage server.
apps/ is the root directory under which the publisher publishes the Compute Algorithms it supports. The name/path of this directory is specified in the icompute.properties file (icompute.applications.root.dir).
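
The layout and the two properties files described above lend themselves to a quick sanity check before a publisher is started. The following is a minimal sketch, not part of the iManage distribution: the function names get_prop and check_publisher_root are hypothetical, and it assumes that icompute.applications.root.dir is given relative to the publisher root, as with the value apps.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with iManage: sanity-check a publisher root.

# get_prop FILE KEY: print the value of KEY from a key=value properties file.
# If KEY is defined more than once, the last definition is printed.
get_prop() {
    # Escape dots so that "publisher.name" does not match "publisherXname".
    gp_key=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${gp_key}[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# check_publisher_root DIR: report missing pieces on stderr; non-zero if any.
check_publisher_root() {
    cpr_root=$1
    cpr_status=0
    for cpr_f in publisher.properties icompute.properties run-compute.sh; do
        if [ ! -f "$cpr_root/$cpr_f" ]; then
            echo "missing file: $cpr_f" >&2
            cpr_status=1
        fi
    done
    if [ ! -d "$cpr_root/lib" ]; then
        echo "missing directory: lib/" >&2
        cpr_status=1
    fi
    # Both the publisher name and the publisher code are expected to be set.
    for cpr_key in publisher.name publisher.code; do
        if [ -z "$(get_prop "$cpr_root/publisher.properties" "$cpr_key" 2>/dev/null)" ]; then
            echo "publisher.properties: $cpr_key is not set" >&2
            cpr_status=1
        fi
    done
    # Assumption: the applications root is relative to the publisher root.
    cpr_apps=$(get_prop "$cpr_root/icompute.properties" icompute.applications.root.dir 2>/dev/null)
    if [ -z "$cpr_apps" ] || [ ! -d "$cpr_root/$cpr_apps" ]; then
        echo "applications root directory not found: ${cpr_apps:-<unset>}" >&2
        cpr_status=1
    fi
    return "$cpr_status"
}
```

Calling check_publisher_root with the path of the publisher root prints one line per problem and returns a non-zero status if anything is missing.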

Writing new Compute Application

A new Compute Algorithm can be written using the APIs and authentication mechanism described [wiki:meetings/Authentication here].
A Compute Algorithm is a client of the iManage system and has a unique client id. This client id is obtained by registering the client using the Manage Client button in the workflow panel on the right side of the web client.
The Compute Algorithm is placed, along with a specification file, under the applications root (icompute.applications.root.dir) on the compute workstation. The directory structure for an application is described in the next section.
The newly added Compute Algorithm becomes available as a workflow link in the web client.

Directory Structure for Application Specification

A publisher has a root directory under which all the applications supported by the publisher are listed.
Every application has its own directory, named after the application (client-name), under the applications root directory. The application directory contains a .sh file, which is executed using the arguments specified in a .gson file. Both files have the same name as the application. There may be other supporting files/directories, e.g. lib.
The typical directory structure for the applications root is as follows:
- Applications_Root_Directory/
  - !AppName1/
    - !AppName1.sh
    - !AppName1.gson
    - lib/
    - misc files/directory, if required
For example, consider an application (registered as "Center" on the iManage server) that draws a circle at the center of a specified image of a specified record. The directory structure will be:
- apps/
  - Center/
    - Center.sh
    - Center.gson
    - lib/
    - !FindCenterTest.class
    - !FindCenterTest.java
The lib/ directory contains the supporting jar files required by the compute application, e.g. ImageJ.jar, client-jar.jar etc.
Center.gson will look like:
{{{
{
  "categoryName":"Demo",
  "clientID":"hYRkPg664WzZsIFzHqfd00WKe0tbp3VCWGCFDANF",
  "description":"Draws Circle at the center of image",
  "name":"Center",
  "parameters":[
    {"defaultValue":0,"name":"SiteNo","type":"INTEGER"},
    {"defaultValue":"","name":"OverlayName","type":"STRING"}
  ],
  "version":"1.0"
}
}}}

- categoryName: the category under which this application appears in the web client workflow link.
- clientID: the client id generated when the application was registered as a client.
- description: the description of the application, as displayed by the web client.
- name: the name of the application, as displayed by the web client.
- version: the version of the application, as displayed by the web client.
- parameters: the parameters required by the application. Each parameter has a default value (defaultValue), a name (name) and a data type (type).

Center.sh will look like (assuming java is in $PATH):

java -DInputFile=$InputFile -cp lib/client-jar.jar:lib/ij.jar:. FindCenterTest

Center.bat will look like (assuming java is in %PATH%):

java -DInputFile=%InputFile% -cp lib\\client-jar.jar;lib\\ij.jar;. FindCenterTest
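
The naming rule for applications (the launcher and the .gson specification share the name of their application directory) can be checked mechanically. The following is a minimal sketch with a hypothetical function name, list_incomplete_apps; the second argument stands for the launcher extension configured in icompute.applauncher.extension.

```shell
#!/bin/sh
# Hypothetical helper: list application directories that lack a launcher or a
# .gson specification named after the directory.

# list_incomplete_apps APPS_ROOT [EXT]
#   EXT is the launcher extension, ".sh" by default (".bat" on Windows).
#   Prints the name of every incomplete application; non-zero status if any.
list_incomplete_apps() {
    lia_root=${1%/}
    lia_ext=${2:-.sh}
    lia_status=0
    for lia_dir in "$lia_root"/*/; do
        [ -d "$lia_dir" ] || continue      # no application directories at all
        lia_name=$(basename "$lia_dir")
        if [ ! -f "$lia_dir$lia_name$lia_ext" ] || [ ! -f "$lia_dir$lia_name.gson" ]; then
            echo "$lia_name"
            lia_status=1
        fi
    done
    return "$lia_status"
}
```

For the Center example, list_incomplete_apps apps prints nothing as long as apps/Center/ contains both Center.sh and Center.gson.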

On the compute workstation, all inputs are received in an input file. The name of the input file is available in the environment variable $!InputFile. Typically, the input file is named <task_id>.in

The structure of the input file will be <param_name>=<param_value_1>,<param_value_2>,...<param_value_n>

Compute code is typically written once and executed many times. Unlike other iManage clients, a compute client therefore receives a one-time authentication code from the server at execution time, passed as the property !AuthCode in the input file. The input file also contains the property !TaskHandle, the server-side reference to the task, which can be used for sending task progress to the server.

The example task input file is:
{{{
SiteNo=0
RecordIds=28
AuthCode=ydT6faDC2O4g7HQaJW8qQavzkheneVuXuZ5RDDi6
TaskHandle=8349537612086336
OverlayName=pqr
}}}
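
A launcher script, or the compute code it starts, has to pull individual parameters out of this file. The sketch below shows one way to do that from the shell. read_param and param_values are hypothetical helper names, and the sketch assumes that the file holds one <param_name>=<values> pair per line.

```shell
#!/bin/sh
# Hypothetical helpers for reading a task input file (<task_id>.in).
# Assumption: the file holds one <param_name>=<values> pair per line.

# read_param FILE NAME: print the raw value of NAME (nothing if NAME is absent).
read_param() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# param_values FILE NAME: print each comma-separated value on its own line.
param_values() {
    read_param "$1" "$2" | tr ',' '\n'
}

# Inside a launcher the file name arrives in the InputFile environment variable:
#   auth_code=$(read_param "$InputFile" AuthCode)
#   task_handle=$(read_param "$InputFile" TaskHandle)
#   param_values "$InputFile" RecordIds | while read -r id; do
#       echo "processing record $id"
#   done
```

Splitting on commas in a separate function keeps single-valued properties such as AuthCode and TaskHandle readable with a plain read_param call.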

The example directory structure at compute workstation can be found here: [https://nandi.strandls.com/trac/curie/attachment/wiki/Analysis/compute.tar.gz]

Invoke Workflow

A compute algorithm can be invoked on selected records by clicking on a workflow link.
This opens a dialog box for specifying view-specific parameters and scheduling information.

Task Monitoring and Inspection

A scheduled instance of an application is called a task.
A task is in one of the following states:
- SCHEDULED: before its scheduled time.
- WAITING: waiting for free resources to execute (after the scheduled time is reached).
- DELETED: task was deleted before being allocated for execution.
- ALLOCATED: allocated for execution.
- EXECUTING: under execution.
- SUCCESS: task completed successfully.
- ERROR: task completed with an error.
- TERMINATING: a termination request has been sent for the task.
- TERMINATED: task is terminated.
The web client shows a task docklet below the navigator for task monitoring and inspection. This docklet has two tabs: My Tasks and Task Inspector.

My Tasks

When submitting a task for execution, the user can select whether the task should be monitored (default=true).
If this option is selected, the task is shown under My Tasks.
Tasks under My Tasks are remembered for a particular user even after logout.
Tasks can be cleared from My Tasks using the clear button available for each task.

Task Inspector

Task Inspector is intended for monitoring tasks submitted by other users.
No task is added to the inspector by default. The user needs to search for a task of interest and explicitly add it to Task Inspector.
The user can add any task belonging to any of his/her projects.
Tasks can be cleared from Task Inspector using the clear button available for each task.
All tasks in Task Inspector are cleared on logout.

Common features for My Tasks / Task Inspector

The states of the tasks under the 'My Tasks' tab are updated every 15 seconds.
The user can explicitly refresh the list to check the latest task states.
Selecting a task shows its input records in the spreadsheet and thumbnail views. The user can view the history of the corresponding records to see the changes made to each record.
The user can check the input parameters used for invoking the task.
The 'Terminate task' button is available if the user is the owner of the task or a Manager / Facility Manager / Administrator for that project.

External Task Manager (Torque)

An analysis task can be managed internally by the compute publisher or delegated to an external task handler, e.g. PBS/Torque. Whether a task is managed by the internal or the external handler is specified by the task manager properties in icompute.properties.

{{{
#properties for task manager
iengine.compute.task.type=EXTERNAL
iengine.compute.task.manager=com.strandgenomics.imaging.icompute.ComputeTaskManager
iengine.external.task.manager=com.strandgenomics.imaging.icompute.torque.TorqueTaskManager
}}}

The property "iengine.compute.task.type" specifies whether the task is managed by the internal or the external task manager. For tasks to be managed by Torque/PBS, it is set to EXTERNAL. The other two properties specify the classes used as managers of the analysis task: the class given by "iengine.compute.task.manager" is used as the internal task manager, whereas the class given by "iengine.external.task.manager" is used as the external task manager, in this case Torque/PBS.

Apart from these properties, the executable file itself changes: it must also contain the properties to be supplied to PBS.

Setting PBS Properties in Task Executable File

The properties required by the PBS system for executing a task on PBS/Torque are specified just as in any other PBS script. In the executable file, the lines starting with #PBS are interpreted by PBS.

#PBS -q cmms-dev
#PBS -N center
#PBS -o /data/cmms/torque_test/test.log
#PBS -e /data/cmms/torque_test/test.err
#PBS -l nodes=1:ppn=4
#PBS -l mem=1gb
#PBS -l walltime=00:00:10
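
Put together, a launcher for Torque/PBS is the ordinary launcher with the #PBS lines placed ahead of the command. PBS stops scanning for directives at the first executable line, so the directives have to come first. The sketch below writes such a file for the Center example. write_pbs_launcher is a hypothetical helper, the #!/bin/sh line is an assumption, and the queue, job name, log paths and resource values are simply the ones from the directives listed above.

```shell
#!/bin/sh
# Hypothetical helper: write a Torque/PBS-ready launcher for the Center example.

# write_pbs_launcher FILE: create FILE with the #PBS directives ahead of the command.
write_pbs_launcher() {
    # The quoted 'EOF' keeps $InputFile literal, so it is expanded when the
    # job runs and not when the launcher is generated.
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -q cmms-dev
#PBS -N center
#PBS -o /data/cmms/torque_test/test.log
#PBS -e /data/cmms/torque_test/test.err
#PBS -l nodes=1:ppn=4
#PBS -l mem=1gb
#PBS -l walltime=00:00:10
java -DInputFile=$InputFile -cp lib/client-jar.jar:lib/ij.jar:. FindCenterTest
EOF
}
```

Because every #PBS line is an ordinary shell comment, the generated file still behaves like the plain launcher when it is run without PBS.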

Apart from these changes, everything else remains the same whether the analysis job runs internally or on Torque/PBS. For the developer of an analysis algorithm, the implementation is not affected by the way it is run as a compute job. For the end user, invocation, user interaction and task management functionality such as monitoring jobs, viewing job parameters and terminating jobs remain exactly the same.

The example directory structure at compute workstation running torque/pbs jobs can be found here: [https://nandi.strandls.com/trac/curie/attachment/wiki/Analysis/torque_compute.tar.gz]

Changes required for running publisher on Windows platform

The following changes have to be made in order to run a publisher on a Windows machine.

- In icompute.properties, set icompute.applauncher.extension=.bat
- In the application's <application_name>.sh file, change every '/' to '!\' and every ':' to ';' (path and classpath separators).
- In the same file, change $<VARIABLE_NAME> to %<VARIABLE_NAME>%, e.g. $!InputFile to %!InputFile%
- In the application directory, rename <application_name>.sh to <application_name>.bat, e.g. center.sh to center.bat
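
For simple launchers, the textual changes listed above can be applied mechanically. The following sketch uses sed. sh_to_bat is a hypothetical helper name, and the substitution is deliberately naive: it rewrites every '/' and ':' on every line, so it only suits launchers that contain nothing but a command line with paths and a classpath (no URLs, no #PBS directives).

```shell
#!/bin/sh
# Hypothetical helper: derive a .bat launcher from a simple .sh launcher.

# Usage: sh_to_bat < launcher.sh > launcher.bat
sh_to_bat() {
    # 1. drop a leading "#!" line, which has no meaning in a batch file
    # 2. '/' becomes '\'  (path separator)
    # 3. ':' becomes ';'  (classpath separator)
    # 4. $NAME becomes %NAME%  (environment variable reference)
    sed -e '1{/^#!/d;}' \
        -e 's|/|\\|g' \
        -e 's|:|;|g' \
        -e 's|\$\([A-Za-z_][A-Za-z0-9_]*\)|%\1%|g'
}
```

The result should still be reviewed by hand, since the sketch cannot tell a classpath ':' from any other colon in the file.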

How to stop a publisher?

Find the PID of your publisher:
jps -l | grep ComputeDaemon
If several publishers are running on the same machine, you can check the directory from which each listed PID was started:
pwdx PID
Kill it:
kill -9 PID
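
The steps above can be wrapped in a small script. This is a sketch under a few assumptions: the function names are hypothetical, pwdx is only available on Linux, and the script sends a plain kill (SIGTERM) first and falls back to kill -9 only if the process is still alive after a grace period. That ordering is a design choice of this sketch, not something the publisher is known to require.

```shell
#!/bin/sh
# Hypothetical helpers for stopping a publisher.

# publisher_pids: read `jps -l` output on stdin, print PIDs of ComputeDaemon JVMs.
publisher_pids() {
    awk '/ComputeDaemon/ { print $1 }'
}

# stop_publisher PID [GRACE_SECONDS]: SIGTERM first, SIGKILL if still alive.
stop_publisher() {
    sp_pid=$1
    sp_grace=${2:-5}
    kill "$sp_pid" 2>/dev/null || return 0     # no such process, or not permitted
    while [ "$sp_grace" -gt 0 ]; do
        kill -0 "$sp_pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        sp_grace=$((sp_grace - 1))
    done
    kill -9 "$sp_pid" 2>/dev/null || true
}

# Typical use (not executed here):
#   jps -l | publisher_pids | while read -r pid; do pwdx "$pid"; done
#   stop_publisher 12345    # the PID whose directory is the publisher to stop
```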
