How to submit an Apache Spark job to Hadoop YARN on HDInsight

Category: sql server hdinsight


Aq123 on Thu, 10 Jul 2014 09:03:29

Hi everyone,

I am very excited that HDInsight switched to Hadoop version 2, which supports Apache Spark through YARN. For the task I want to perform, Apache Spark is a much better fit as a parallel programming paradigm than MapReduce.

However, I was unable to find any documentation on how to submit an Apache Spark job remotely to my HDInsight cluster. For remote submission of standard MapReduce jobs, I know there are several REST endpoints, such as Templeton and Oozie. But as far as I was able to find, running Spark jobs through Templeton is not possible. I did find that Spark jobs can be incorporated into Oozie, but I've read that this is very tedious and that job failure detection does not work in this case.

There must be a more appropriate way to submit Spark jobs. Does anyone know how to submit Apache Spark jobs remotely to HDInsight?

Many thanks in advance!


AmarpreetBassasn on Mon, 31 Aug 2015 12:03:20


Apache Spark is available in preview, so you can submit Spark jobs in HDInsight.