Why Scala:
Scala is well suited to processing distributed data across a cluster.
In the end, all Scala code is compiled to JVM bytecode and runs on the JVM, so you can use any Java library.
Spark itself is written primarily in Scala.
Open question: what is the difference between how Scala and Java run on the JVM?
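
On the point about using Java libraries: below is a minimal sketch of Scala code calling standard Java classes (java.util and java.time) directly, with no wrapper layer. The object name JavaInteropSketch and the helper names sortWithJava and nextDay are hypothetical, chosen only for this illustration.

```scala
import java.time.LocalDate
import java.util.{ArrayList => JArrayList, Collections}

// Hypothetical object, used only to illustrate Java interop from Scala.
object JavaInteropSketch {

  // Sort integers with java.util.Collections, called directly from Scala.
  def sortWithJava(xs: Seq[Int]): Seq[Int] = {
    val list = new JArrayList[Integer]()
    xs.foreach(x => list.add(Integer.valueOf(x)))
    Collections.sort(list)
    // Copy back into a Scala collection by index, avoiding any
    // version-specific converter imports.
    (0 until list.size).map(i => list.get(i).intValue)
  }

  // Use java.time, a standard Java API, exactly as Java code would.
  def nextDay(year: Int, month: Int, day: Int): String =
    LocalDate.of(year, month, day).plusDays(1).toString

  def main(args: Array[String]): Unit = {
    println(sortWithJava(Seq(3, 1, 2)))
    println(nextDay(2024, 1, 31))
  }
}
```

Because both languages compile to the same bytecode format, the Java classes here are imported and instantiated like any Scala class.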