Wednesday 20 September 2017

Work with Spark in Windows

A quick and easy guide to setting up Spark on Windows.

  • Download and install the Java 8 JDK. The Windows x64 installer is jdk-8u144-windows-x64.exe (197.78 MB).
  • Open a command prompt and type java -version
  • The command should print the installed Java version, which confirms that the Java installation is complete.
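
If you prefer to script this check, here is a minimal Python sketch. It assumes Python 3 is available on the machine, which this guide does not otherwise require, and the function names are made up for illustration. It runs java -version and extracts the quoted version string from the banner.

```python
import re
import subprocess


def parse_java_version(output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version`; return the reported version, or None if java is not on PATH."""
    try:
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # Raised when the java executable cannot be started at all.
        return None
    # java writes its version banner to stderr, so inspect both streams.
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    version = installed_java_version()
    print("Java version:", version if version else "not found on PATH")
```

If this prints "not found on PATH", the installer probably did not add Java to the PATH, or the command prompt was opened before the installation finished.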


  • Now create the following folder structure in C:\
C:\Hadoop\bin
  • Copy the downloaded winutils.exe into the above path.
  • Create a new system environment variable "HADOOP_HOME" with the value "C:\Hadoop"
  • Now open a command prompt as administrator and enter the following commands:
C:\WINDOWS\system32>cd \
C:\>mkdir tmp
C:\>cd tmp
C:\tmp>mkdir hive
C:\tmp>c:\hadoop\bin\winutils chmod 777 \tmp\hive
C:\tmp>
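
The Hadoop-related steps above can be double-checked with a short Python sketch. It is only an illustration: the function name check_hadoop_setup is hypothetical, and it verifies only that the variable, the winutils.exe file, and the hive folder exist, not that the chmod took effect.

```python
import os


def check_hadoop_setup(env, hive_dir):
    """Return a list of problems found; an empty list means the setup looks right."""
    problems = []
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
    elif not os.path.isfile(os.path.join(hadoop_home, "bin", "winutils.exe")):
        problems.append("winutils.exe not found under HADOOP_HOME bin folder")
    if not os.path.isdir(hive_dir):
        problems.append("hive scratch directory is missing: " + hive_dir)
    return problems


if __name__ == "__main__":
    # Paths follow this guide: HADOOP_HOME=C:\Hadoop and the C:\tmp\hive folder.
    found = check_hadoop_setup(os.environ, r"C:\tmp\hive")
    for line in found or ["no problems found"]:
        print(line)
```

Run it from a command prompt opened after HADOOP_HOME was created, since prompts that were already open do not see new system variables.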

  • Now download Spark from https://spark.apache.org/downloads.html
  • Once the download finishes, extract the file using 7-Zip. (You need to extract twice because it is a tar.gz archive: once for the gzip layer and once for the tar.)
  • Now create a folder "spark" in C:\
  • Copy the contents of "C:\Users\Lokesh\Downloads\spark-2.2.0-bin-hadoop2.7\spark-2.2.0-bin-hadoop2.7\" to C:\spark\
  • Now go to the system environment variables and create a variable "SPARK_HOME" with the value "C:\spark"
  • Now it is time to test Spark. Open a new cmd window as administrator, so that it picks up the new variables, and type:
    • C:\spark\bin\spark-shell
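
If spark-shell does not start, a common cause is that the archive contents were copied one folder level too deep or too shallow. The following Python sketch checks for that. It is an illustration only: find_spark_shell is a hypothetical helper, and it assumes the Windows launcher is named spark-shell.cmd inside the bin folder, as it is in the spark-2.2.0-bin-hadoop2.7 download.

```python
import os


def find_spark_shell(env):
    """Return the path to spark-shell.cmd under SPARK_HOME, or None if it is missing."""
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return None
    launcher = os.path.join(spark_home, "bin", "spark-shell.cmd")
    return launcher if os.path.isfile(launcher) else None


if __name__ == "__main__":
    launcher = find_spark_shell(os.environ)
    if launcher:
        print("Found launcher:", launcher)
    else:
        print("SPARK_HOME is not set, or bin\\spark-shell.cmd is not inside it")
```

When the launcher is reported missing, check that C:\spark directly contains the bin folder rather than another spark-2.2.0-bin-hadoop2.7 folder.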
