We use the Hortonworks Hadoop distribution, which supports Spark on YARN.
We would like to connect Orbit to this cluster, but I cannot find an example of a scaleout configuration file with YARN parameters.
Has anyone worked with this kind of configuration?
The out-of-the-box Orbit Spark scaleout provider only works with Spark standalone.
But you can easily build your own Spark scaleout provider for YARN.
Building your own provider is the preferred method for scaleout infrastructures anyway, because you can tune it to your infrastructure.
Here is some info on how to use a custom scaleout provider.
Basically, you modify the MapReduceExecutorSpark class so that it works with your infrastructure, put the jars into the lib folder of Orbit, and enable the scaleout implementation in Orbit's config.properties file.
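To give an idea of what the YARN-specific part of a modified MapReduceExecutorSpark might need, here is a minimal sketch of the Spark settings involved. It is not Orbit code: the class YarnSparkSettings and its methods are hypothetical names made up for illustration, and the example deliberately uses no Orbit or Spark classes. Only the property keys (spark.master, spark.submit.deployMode, spark.app.name, spark.executor.instances, spark.executor.memory) and the HADOOP_CONF_DIR / YARN_CONF_DIR environment variables are standard Spark on YARN names. The values (client mode, 4 executors, 2g) are assumptions you would tune for your cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Orbit or Spark.
class YarnSparkSettings {

    // Collects standard Spark properties for running an application on YARN.
    static Map<String, String> build(String appName, int executors, String executorMemory) {
        if (executors < 1) {
            throw new IllegalArgumentException("executors must be >= 1");
        }
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.master", "yarn");              // YARN instead of a spark://host:port standalone URL
        settings.put("spark.submit.deployMode", "client"); // assumption: the driver runs inside the calling process
        settings.put("spark.app.name", appName);
        settings.put("spark.executor.instances", Integer.toString(executors));
        settings.put("spark.executor.memory", executorMemory);
        return settings;
    }

    // Spark locates the YARN ResourceManager through the Hadoop client configuration,
    // so one of these environment variables has to point at the cluster's config directory.
    static boolean hasHadoopConf(Map<String, String> env) {
        return isSet(env.get("HADOOP_CONF_DIR")) || isSet(env.get("YARN_CONF_DIR"));
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        if (!hasHadoopConf(System.getenv())) {
            System.err.println("Warning: neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set");
        }
        // In a real provider, each entry could be applied with SparkConf.set(key, value).
        build("orbit-scaleout", 4, "2g")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Inside the custom provider, these entries would replace whatever standalone master URL the original class configures. How the settings are read from config.properties is up to your implementation.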
Please let me know if you need further assistance with this.
Thank you for the answer. I’m going to try to build a custom
Very cool, please let me know how it works.
By the way, this could also be a nice community contribution!