Question

ManishMehra on Wed, 13 Jan 2016 10:05:50


Hi,

Can anyone please explain whether it is possible to replace an HPC implementation with the Azure Batch service?

Please find my problem statement below.

I have an exe that behaves differently depending on the arguments passed to it. The exe is triggered from a web thread and posts a job to Azure Batch. It in turn has to create hundreds of tasks, each calling the same exe with a different set of parameters. Once all those tasks have completed, it is supposed to start another set of tasks. We have this kind of implementation in HPC, but I am facing some issues while replacing it with Azure Batch.

Is it possible to achieve this? If so, any sample code that does something similar would be a great help.

Thanks in advance.

Manish



Replies

Ivan Towlson (MS) on Wed, 13 Jan 2016 23:23:20


Hello Manish,

To do this in the current Batch service you would need to use a job manager task.  The job manager would create the first tranche of tasks, then monitor completion of those tasks, in both cases using the Batch API.  When the tasks were complete, the job manager would then create the second tranche of tasks.
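As a rough sketch of the job manager flow described above (not actual Batch API code), the loop below submits one tranche, polls until every task in it is complete, and only then submits the next. The names `run_tranches`, `add_tasks`, `get_states` and `FakeBatch` are hypothetical: in a real job manager the two callables would wrap the Batch API calls that add tasks to the job and list their states, and the state string "completed" is assumed here.

```python
import time
from typing import Callable, Dict, List


def run_tranches(
    tranches: List[Dict[str, str]],
    add_tasks: Callable[[Dict[str, str]], None],
    get_states: Callable[[], Dict[str, str]],
    poll_seconds: float = 5.0,
) -> List[str]:
    """Submit each tranche only after the previous one has completed.

    tranches   -- one dict per tranche, mapping task ID to command line
    add_tasks  -- hypothetical wrapper around the "add tasks to job" call
    get_states -- hypothetical wrapper returning {task ID: state}
    Returns the task IDs in the order they were submitted.
    """
    submitted: List[str] = []
    for tranche in tranches:
        add_tasks(tranche)
        submitted.extend(tranche)  # extends with the task IDs (dict keys)
        # Poll until every task in this tranche reports "completed".
        while True:
            states = get_states()
            if all(states.get(task_id) == "completed" for task_id in tranche):
                break
            time.sleep(poll_seconds)
    return submitted


class FakeBatch:
    """In-memory stand-in for the service, for illustration only."""

    def __init__(self) -> None:
        self.states: Dict[str, str] = {}

    def add_tasks(self, tasks: Dict[str, str]) -> None:
        for task_id in tasks:
            self.states[task_id] = "active"

    def get_states(self) -> Dict[str, str]:
        # Return the current states, then pretend all tasks finish
        # before the next poll.
        snapshot = dict(self.states)
        for task_id in self.states:
            self.states[task_id] = "completed"
        return snapshot
```

To try it without a Batch account, create `fake = FakeBatch()` and call `run_tranches([{"a1": "my.exe 1", "a2": "my.exe 2"}, {"b1": "my.exe 3"}], fake.add_tasks, fake.get_states, poll_seconds=0)`. The command lines are placeholders.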

Depending on your requirements, you *may* be able to use the forthcoming task dependencies feature to handle this more easily.  This would work as follows: Job manager creates *all* tasks up front, but sets the tasks in the second tranche to depend on all the tasks in the first tranche.  The Batch service would then not run any of the tasks in the second tranche until the tasks in the first tranche had completed.  If that would meet your needs, then the task dependencies feature should be available within the next month.
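To make the scheduling rule concrete, here is a small model of it in plain Python. It does not use the Batch API, and the function names `build_tasks` and `runnable` are hypothetical; it only illustrates the idea that every second-tranche task lists all first-tranche tasks as dependencies and is held back until they have all completed.

```python
from typing import Dict, List, Set


def build_tasks(first: List[str], second: List[str]) -> Dict[str, List[str]]:
    """Map each task ID to the list of task IDs it depends on."""
    tasks: Dict[str, List[str]] = {task_id: [] for task_id in first}
    for task_id in second:
        # Every second-tranche task depends on the whole first tranche.
        tasks[task_id] = list(first)
    return tasks


def runnable(tasks: Dict[str, List[str]], completed: Set[str]) -> List[str]:
    """Return unfinished tasks whose dependencies have all completed."""
    return [
        task_id
        for task_id, deps in tasks.items()
        if task_id not in completed and all(d in completed for d in deps)
    ]
```

With `tasks = build_tasks(["a1", "a2"], ["b1"])`, the call `runnable(tasks, {"a1"})` still excludes `b1`; it only becomes runnable once both `a1` and `a2` are in the completed set.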

Something that isn't clear from your message, however, is whether the results of the first tranche feed into the creation of the second tranche -- that is, whether you need to create a different *set* of tasks depending on the results of the first tranche.  If you do, then the naive dependency approach won't suffice, and you're back to the job manager approach.  Or you *could* use task dependencies to implement an approach where you have a single "create the second set of tasks" task depending on the first tranche: that task would be scheduled only when the first tranche was completed and could use the results when creating the second set.  This would avoid needing to have a long-running job manager to monitor the status of tasks.

Mark Scurrell - MSFT on Wed, 27 Jan 2016 01:36:07


Hi Manish

I think the Job Manager task is the way to go, as Ivan states.  A description of the Job Manager can be found here: https://msdn.microsoft.com/en-us/library/azure/mt282178.aspx  There is sample Job Manager code here: https://github.com/Azure/azure-batch-samples/tree/master/CSharp/GettingStarted/03_JobManager

The Job Manager task can use the Batch API to create the hundreds of tasks, monitor them until they complete, add a new set of tasks to the job, detect when those are complete, and then terminate the job.

Currently with Batch you have to monitor your tasks to determine when they are complete and check whether they succeeded or failed.  You can use this monitoring to implement your dependency checking.  In the future we'll make this easier.
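One way to picture that check is the sketch below, which sorts tasks into pending, succeeded and failed before deciding whether the next set can be added. It works on plain `(state, exit_code)` pairs rather than real Batch API objects, the helper names `summarize` and `ready_for_next_set` are hypothetical, and it assumes that a task in the "completed" state with exit code 0 counts as a success.

```python
from typing import Dict, List, Optional, Tuple

TaskInfo = Tuple[str, Optional[int]]  # (state, exit code or None)


def summarize(tasks: Dict[str, TaskInfo]) -> Dict[str, List[str]]:
    """Group task IDs into pending, succeeded and failed."""
    result: Dict[str, List[str]] = {"pending": [], "succeeded": [], "failed": []}
    for task_id, (state, exit_code) in tasks.items():
        if state != "completed":
            result["pending"].append(task_id)
        elif exit_code == 0:
            # Assumption: exit code 0 means the exe succeeded.
            result["succeeded"].append(task_id)
        else:
            result["failed"].append(task_id)
    return result


def ready_for_next_set(tasks: Dict[str, TaskInfo]) -> bool:
    """True only when nothing is pending and nothing has failed."""
    summary = summarize(tasks)
    return not summary["pending"] and not summary["failed"]
```

Whether a failed task should block the second set, be retried, or be ignored depends on the application; this sketch simply blocks.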

Regards, Mark