eu.xtreemos.xati.API
Class XCRJobMng
java.lang.Object
eu.xtreemos.xati.API.XCRJobMng
public class XCRJobMng
- extends java.lang.Object
- Author:
- gregor.pipan@xlab.si
Method Summary |
static void |
checkpointJobInit(java.lang.String __jobId,
java.lang.Integer __resolveJobDependencies,
java.lang.String __modeType,
java.lang.String __options,
java.security.cert.X509Certificate __userCert)
Entry point for single- and multi-job checkpointing; the initial checkpoint command may be typed in on any machine.
In the next step the request is redirected to the grid node of the super job checkpointer.
The first job detects job dependencies: 1. from the dependencies listed in the JSDL [NOW], 2. by querying the dependency monitor service (based on connector) [LATER].
ATTENTION: the job manager/checkpointer of the job referenced by the passed jobId becomes the SUPER JOB CHECKPOINTER and coordinates all other involved job checkpointers. |
static void |
matchUnitsBiggerNodesCB(java.util.ArrayList<CommunicationAddress> __list)
BEST SERVICE! |
static void |
prepResReallocation(java.lang.String __jobId,
java.lang.String __checkpointVersion,
java.lang.Integer __restartDependentJobs,
java.lang.String __desiredRestartDestination,
java.lang.String __mode,
java.security.cert.X509Certificate __userCert)
|
static void |
proceedWithRebuilding(java.lang.String __jobId,
java.lang.String __initialJobId)
|
static void |
proceedWithResumingRST(java.lang.String __jobId,
java.lang.String __initialJobId)
|
static void |
restartJobInit(java.lang.String __jobId,
java.lang.String __checkpointVersion,
java.lang.Integer __restartDependentJobs,
java.util.ArrayList<java.lang.String> __ip,
java.util.ArrayList<java.lang.String> __port,
java.lang.String __mode,
java.security.cert.X509Certificate __userCert)
Restart: puts the IP and port details into the required format for further computation. |
static void |
returnFromJobLocking(java.lang.String __jobId,
java.lang.String __initialJobId,
java.lang.String __jsdlFile,
java.lang.String __executable,
java.util.ArrayList<CommunicationAddress> __jobUnitAddresses,
CommunicationAddress __superJobCpAddr,
java.util.ArrayList<java.lang.String> __dependentJobs,
java.lang.String __strategy,
java.lang.String __options,
java.lang.String __mode,
java.lang.Integer __ret,
java.security.cert.X509Certificate __userCert)
JOB CHECKPOINTER (JC):
knows the job units
mJUC entry (job-unit address and state)
selects the appropriate kernel checkpointer |
static void |
terminateSingleMultiJobRestart(java.lang.String __jobId,
java.lang.String __initialJobId)
|
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
servicename
public static java.lang.String servicename
XCRJobMng
public XCRJobMng()
checkpointJobInit
public static void checkpointJobInit(java.lang.String __jobId,
java.lang.Integer __resolveJobDependencies,
java.lang.String __modeType,
java.lang.String __options,
java.security.cert.X509Certificate __userCert)
throws java.lang.Exception
- Entry point for single- and multi-job checkpointing; the initial checkpoint command may be typed in on any machine.
In the next step the request is redirected to the grid node of the super job checkpointer.
The first job detects job dependencies: 1. from the dependencies listed in the JSDL [NOW], 2. by querying the dependency monitor service (based on connector) [LATER].
ATTENTION: the job manager/checkpointer of the job referenced by the passed jobId becomes the SUPER JOB CHECKPOINTER and coordinates all other involved job checkpointers.
- Parameters:
jobId -
signed - XOSCertificate
- Throws:
java.lang.Exception
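The dependency detection described above (step 1, dependencies listed in the JSDL) can be pictured with a small, self-contained sketch. It is not taken from the XtreemOS sources: the class `DependencySketch`, the method `collectInvolvedJobs`, and the job ids are hypothetical, and the sketch assumes that dependencies are followed transitively and have already been read from the JSDL files into a map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of XCRJobMng: illustrates how a super job
// checkpointer could collect every job it has to coordinate.
final class DependencySketch {

    // jsdlDependencies maps a job id to the job ids listed as its dependencies
    // (assumed to have been extracted from the JSDL files beforehand).
    static Set<String> collectInvolvedJobs(String rootJobId,
                                           Map<String, List<String>> jsdlDependencies) {
        Set<String> involved = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(rootJobId);
        while (!pending.isEmpty()) {
            String jobId = pending.poll();
            // add() returns false for jobs already seen, which also guards against cycles.
            if (involved.add(jobId)) {
                pending.addAll(jsdlDependencies.getOrDefault(jobId, List.of()));
            }
        }
        return involved;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
                "job-A", List.of("job-B", "job-C"),
                "job-B", List.of("job-C"));
        System.out.println(collectInvolvedJobs("job-A", deps)); // [job-A, job-B, job-C]
    }
}
```

The root job comes first in the result, which matches the idea that the job referenced by the passed jobId acts as coordinator for the others.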
returnFromJobLocking
public static void returnFromJobLocking(java.lang.String __jobId,
java.lang.String __initialJobId,
java.lang.String __jsdlFile,
java.lang.String __executable,
java.util.ArrayList<CommunicationAddress> __jobUnitAddresses,
CommunicationAddress __superJobCpAddr,
java.util.ArrayList<java.lang.String> __dependentJobs,
java.lang.String __strategy,
java.lang.String __options,
java.lang.String __mode,
java.lang.Integer __ret,
java.security.cert.X509Certificate __userCert)
throws java.lang.Exception
- JOB CHECKPOINTER (JC):
knows the job units
mJUC entry (job-unit address and state)
selects the appropriate kernel checkpointer
- Throws:
java.lang.Exception
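The notes above mention an entry holding a job-unit address and its state. A minimal sketch of such a table follows; it is purely illustrative. The class `JobUnitTableSketch`, the `UnitState` values, and the use of plain strings in place of `CommunicationAddress` are all assumptions, since this page does not document the real mJUC structure or its state names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not the real mJUC structure.
final class JobUnitTableSketch {

    // Placeholder states; the real state names are not given in this documentation.
    enum UnitState { RUNNING, LOCKED, CHECKPOINTED }

    // Job-unit address (a plain string here) mapped to its current state.
    private final Map<String, UnitState> units = new LinkedHashMap<>();

    void register(String jobUnitAddress) {
        units.put(jobUnitAddress, UnitState.RUNNING);
    }

    void update(String jobUnitAddress, UnitState state) {
        if (!units.containsKey(jobUnitAddress)) {
            throw new IllegalArgumentException("unknown job unit: " + jobUnitAddress);
        }
        units.put(jobUnitAddress, state);
    }

    // True only when at least one unit is registered and every unit is in the given state.
    boolean allIn(UnitState state) {
        return !units.isEmpty() && units.values().stream().allMatch(s -> s == state);
    }

    public static void main(String[] args) {
        JobUnitTableSketch table = new JobUnitTableSketch();
        table.register("unit-1");
        table.register("unit-2");
        table.update("unit-1", UnitState.LOCKED);
        System.out.println(table.allIn(UnitState.LOCKED)); // false
        table.update("unit-2", UnitState.LOCKED);
        System.out.println(table.allIn(UnitState.LOCKED)); // true
    }
}
```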
restartJobInit
public static void restartJobInit(java.lang.String __jobId,
java.lang.String __checkpointVersion,
java.lang.Integer __restartDependentJobs,
java.util.ArrayList<java.lang.String> __ip,
java.util.ArrayList<java.lang.String> __port,
java.lang.String __mode,
java.security.cert.X509Certificate __userCert)
throws java.lang.Exception
- Restart: puts the IP and port details into the required format for further computation.
- Throws:
java.lang.Exception
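The "required format" is not specified on this page. As one possible reading, the sketch below pairs the two parallel lists (`__ip` and `__port`) into `java.net.InetSocketAddress` values. The class `EndpointSketch` and the method `toEndpoints` are hypothetical, and the assumption that the i-th port belongs to the i-th IP is not confirmed by the source.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XCRJobMng.
final class EndpointSketch {

    // Pairs the i-th IP with the i-th port; the two lists are assumed to be parallel.
    static List<InetSocketAddress> toEndpoints(List<String> ips, List<String> ports) {
        if (ips.size() != ports.size()) {
            throw new IllegalArgumentException(
                    "ip and port lists differ in length: " + ips.size() + " vs " + ports.size());
        }
        List<InetSocketAddress> endpoints = new ArrayList<>();
        for (int i = 0; i < ips.size(); i++) {
            // parseInt rejects non-numeric ports; createUnresolved rejects ports
            // outside 0..65535 and performs no DNS lookup.
            int port = Integer.parseInt(ports.get(i).trim());
            endpoints.add(InetSocketAddress.createUnresolved(ips.get(i).trim(), port));
        }
        return endpoints;
    }

    public static void main(String[] args) {
        List<InetSocketAddress> endpoints =
                toEndpoints(List.of("10.0.0.1", "10.0.0.2"), List.of("60000", "60001"));
        for (InetSocketAddress endpoint : endpoints) {
            System.out.println(endpoint.getHostString() + ":" + endpoint.getPort());
        }
    }
}
```

Validating the list lengths up front turns a mismatched call into an immediate, descriptive error instead of an index failure later in the restart sequence.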
prepResReallocation
public static void prepResReallocation(java.lang.String __jobId,
java.lang.String __checkpointVersion,
java.lang.Integer __restartDependentJobs,
java.lang.String __desiredRestartDestination,
java.lang.String __mode,
java.security.cert.X509Certificate __userCert)
throws java.lang.Exception
- Throws:
java.lang.Exception
matchUnitsBiggerNodesCB
public static void matchUnitsBiggerNodesCB(java.util.ArrayList<CommunicationAddress> __list)
throws java.lang.Exception
- BEST SERVICE! - used!
FEATURE: finds matching nodes by taking into account the kernel checkpointer and potential resource (PID, IPC) conflicts.
DRAWBACK: 'getResources' is called repeatedly an unknown number of times (worst case: it never stops => upper limit).
- Throws:
java.lang.Exception
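The drawback noted above (an unbounded number of 'getResources' calls) is commonly handled with an attempt limit. The following sketch shows that pattern in isolation. It does not reproduce the actual implementation: `BoundedMatchSketch`, `matchNodes`, the `resourceQuery` supplier standing in for 'getResources', and the `acceptable` predicate standing in for the checkpointer and PID/IPC conflict checks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of XCRJobMng.
final class BoundedMatchSketch {

    // Asks resourceQuery for candidate nodes until `needed` acceptable nodes are
    // found or maxAttempts queries have been made, whichever comes first.
    static Optional<List<String>> matchNodes(Supplier<List<String>> resourceQuery,
                                             Predicate<String> acceptable,
                                             int needed,
                                             int maxAttempts) {
        List<String> matched = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && matched.size() < needed; attempt++) {
            for (String node : resourceQuery.get()) {
                if (matched.size() < needed && !matched.contains(node) && acceptable.test(node)) {
                    matched.add(node);
                }
            }
        }
        // An empty result means the attempt limit was hit before enough nodes matched.
        return matched.size() >= needed ? Optional.of(matched) : Optional.empty();
    }

    public static void main(String[] args) {
        Supplier<List<String>> query = () -> List.of("node-1", "node-2", "node-3");
        Predicate<String> noConflict = node -> !node.equals("node-2");
        System.out.println(matchNodes(query, noConflict, 2, 5)); // Optional[[node-1, node-3]]
        System.out.println(matchNodes(query, noConflict, 3, 5)); // Optional.empty
    }
}
```

Returning an empty `Optional` when the limit is reached lets the caller decide whether to fail the restart or retry later, rather than looping indefinitely.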
proceedWithRebuilding
public static void proceedWithRebuilding(java.lang.String __jobId,
java.lang.String __initialJobId)
throws java.lang.Exception
- Throws:
java.lang.Exception
proceedWithResumingRST
public static void proceedWithResumingRST(java.lang.String __jobId,
java.lang.String __initialJobId)
throws java.lang.Exception
- Throws:
java.lang.Exception
terminateSingleMultiJobRestart
public static void terminateSingleMultiJobRestart(java.lang.String __jobId,
java.lang.String __initialJobId)
throws java.lang.Exception
- Throws:
java.lang.Exception