Tuesday 10 December 2019

Set Up Java, Scala, Spark & IntelliJ on Mac

------------------
To Install JAVA:
------------------
    https://download.oracle.com/otn-pub/java/jdk/8u201-b09/42970487e3af4f5aa5bca3f542482c60/jdk-8u201-macosx-x64.dmg
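    Open the downloaded .dmg and run the installer package inside it
    Create a bash profile in your home directory (macOS Catalina and later default to zsh; there, use .zshrc instead)
    Open terminal -> $ vim ~/.bash_profile
    Add this line and save:
            export JAVA_HOME=$(/usr/libexec/java_home)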

    Open a new terminal (or run: source ~/.bash_profile) and enter -> echo $JAVA_HOME
    Type java -version to confirm the installed version
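
Editing .bash_profile by hand works, but repeating the steps can leave duplicate export lines behind. Below is a minimal sketch of an alternative that is safe to re-run. The helper name append_once is hypothetical and not part of any tool mentioned here.

```shell
# append_once LINE FILE
# Hypothetical helper: appends LINE to FILE only if FILE does not
# already contain that exact line, so re-running setup is harmless.
append_once() {
  local line="$1" file="$2"
  touch "$file"                      # create the profile if it is missing
  # -F fixed string, -x whole-line match, -q quiet
  if ! grep -Fxq -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

For example, to add the JAVA_HOME line from the steps above (single quotes keep the $(...) part unexpanded until the profile is loaded):

    append_once 'export JAVA_HOME=$(/usr/libexec/java_home)' ~/.bash_profile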

------------------
To Install Scala:
------------------
    https://downloads.lightbend.com/scala/2.11.12/scala-2.11.12.tgz
    Extract the downloaded archive (double-click it in Finder)
    Open terminal and enter the commands below:
    cd ~/Downloads/

    sudo cp -R scala-2.11.12 /usr/local/scala
    cd
    vi .bash_profile
            export PATH=/usr/local/scala/bin:$PATH
    source .bash_profile
    Type scala and the Scala REPL should open
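
If typing scala gives "command not found", the usual cause is that the PATH change has not been loaded in the current terminal. Here is a small sketch for checking that; path_has is a hypothetical helper name chosen for illustration.

```shell
# path_has DIR
# Hypothetical helper: succeeds if DIR is one of the entries in $PATH.
path_has() {
  # Wrap both sides in ":" so only whole PATH entries match.
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

For example:

    path_has /usr/local/scala/bin && echo "Scala is on PATH"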

------------------
To Install Spark:
------------------
    https://www.apache.org/dyn/closer.lua/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
    Extract the downloaded archive (double-click it in Finder)
    Copy the extracted spark folder to:  xxxxxxx/dev/apache-spark/      (You can choose any path)
    Open terminal
    vi .bash_profile
      export SPARK_HOME=/Users/lokeshnanda/xxxxxxx/dev/apache-spark/spark-2.4.4-bin-hadoop2.7
      export PATH=$PATH:$SPARK_HOME/bin
    source .bash_profile
    Type spark-shell and the Spark shell should open
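
If spark-shell does not start, SPARK_HOME often points to a folder that does not exist, for example because the version in the folder name differs from the one downloaded. A minimal sketch of a check is shown below; check_spark_home is a hypothetical helper name.

```shell
# check_spark_home DIR
# Hypothetical helper: succeeds only if DIR looks like an extracted
# Spark distribution, i.e. DIR/bin/spark-shell exists and is executable.
check_spark_home() {
  local dir="$1"
  if [ -x "$dir/bin/spark-shell" ]; then
    echo "OK: spark-shell found in $dir/bin"
  else
    echo "Missing: $dir/bin/spark-shell (check the folder name and version)" >&2
    return 1
  fi
}
```

For example, after source .bash_profile:

    check_spark_home "$SPARK_HOME"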

Now install IntelliJ IDEA and add the Scala plugin. Then add the sbt dependencies for Spark to the project; the first import downloads them and can take 10-15 minutes.


Enter the following in build.sbt:

name := "TestSpark"
version := "0.1"
scalaVersion := "2.11.12"
// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.4"

// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"

// https://mvnrepository.com/artifact/org.apache.spark/spark-mllib
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.4.4" % "runtime"

// https://mvnrepository.com/artifact/org.apache.spark/spark-streaming
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.4.4" % "provided"
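
The dependencies above pin Spark 2.4.4, so it helps if the local Spark install is the same version. Below is a small sketch that reads the version out of the SPARK_HOME folder name. It assumes the standard spark-<version>-bin-<hadoop> naming and a path without a trailing slash; spark_version_from_home is a hypothetical helper name.

```shell
# spark_version_from_home DIR
# Hypothetical helper: prints the Spark version encoded in a folder
# name such as .../spark-2.4.4-bin-hadoop2.7 (no trailing slash).
spark_version_from_home() {
  local name="${1##*/}"          # strip leading directories
  name="${name#spark-}"          # drop the "spark-" prefix
  printf '%s\n' "${name%%-*}"    # keep the text before the next "-"
}
```

For example, to compare against the version used in build.sbt:

    [ "$(spark_version_from_home "$SPARK_HOME")" = "2.4.4" ] && echo "versions match"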