RDBMS IMPORT
***** Using Python *******
pyspark --jars /mnt/resource/lokeshtest/guava-12.0.1.jar,/mnt/resource/lokeshtest/hadoop-aws-2.6.0.jar,/mnt/resource/lokeshtest/aws-java-sdk-1.7.3.jar,/mnt/resource/lokeshtest/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar --packages com.databricks:spark-csv_2.10:1.2.0
from pyspark.sql import SQLContext
sqlcontext = SQLContext(sc)
dataframe_mysql = sqlcontext.read.format("jdbc").options(url="jdbc:mysql://YOUR_PUBLIC_IP:3306/DB_NAME", driver="com.mysql.jdbc.Driver", dbtable="TBL_NAME", user="sqluser", password="sqluser").load()
dataframe_mysql.show()
****** Using Scala *******
sudo -u root spark-shell --jars /mnt/resource/lokeshtest/guava-12.0.1.jar,/mnt/resource/lokeshtest/hadoop-aws-2.6.0.jar,/mnt/resource/lokeshtest/aws-java-sdk-1.7.3.jar,/mnt/resource/lokeshtest/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar --packages com.databricks:spark-csv_2.10:1.2.0
import org.apache.spark.sql.SQLContext
val sqlcontext = new SQLContext(sc)
val dataframe_mysql = sqlcontext.read.format("jdbc").option("url", "jdbc:mysql://YOUR_PUBLIC_IP:3306/DB_NAME").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "TBL_NAME").option("user", "sqluser").option("password", "sqluser").load()
dataframe_mysql.show()
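For larger tables, the JDBC source can also read in parallel if you give it a numeric column to split on. A minimal sketch, assuming the table has a numeric "id" column; the column name, bounds, and partition count below are illustrative, while partitionColumn, lowerBound, upperBound, and numPartitions are the standard Spark JDBC options:
// Parallel JDBC read: Spark issues numPartitions queries, one per id range.
val dataframe_mysql_par = sqlcontext.read.format("jdbc")
  .option("url", "jdbc:mysql://YOUR_PUBLIC_IP:3306/DB_NAME")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "TBL_NAME")
  .option("user", "sqluser")
  .option("password", "sqluser")
  .option("partitionColumn", "id")   // numeric column to split on (assumed to exist)
  .option("lowerBound", "1")         // illustrative min value of the split column
  .option("upperBound", "1000000")   // illustrative max value of the split column
  .option("numPartitions", "4")
  .load()
dataframe_mysql_par.show()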
**********************************************************************************************************************************************************
****** Using Scala: Filter and Save to S3 *******
Persist the DataFrame in the in-memory cache:
dataframe_mysql.cache()
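Note that cache is lazy: nothing is stored until an action runs. A small sketch of forcing materialization up front and releasing the memory later; the count() action is just one way to trigger the caching:
dataframe_mysql.count()      // any action materializes the cached data
// ... run your transformations and saves, which now reuse the cache ...
dataframe_mysql.unpersist()  // free the cached blocks when no longer needed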
Apply a transformation or filter to the DataFrame, for example selecting rows for a single date:
val filter_gta = dataframe_mysql.filter(dataframe_mysql("date") === "20151129")
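The same filter can also be written as SQL by registering the DataFrame as a temporary table; a sketch, where the table name "mysql_import" is arbitrary:
// Register the DataFrame so it can be queried with SQL.
dataframe_mysql.registerTempTable("mysql_import")
// Equivalent to the filter above, expressed as a query.
val filter_gta_sql = sqlcontext.sql("SELECT * FROM mysql_import WHERE date = '20151129'")
filter_gta_sql.show()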
Optional: repartition the data into a single partition so the output is written as one part file. Note that repartition returns a new DataFrame rather than modifying the existing one, so capture the result:
val filter_gta_out = filter_gta.repartition(1)
Save to S3 as CSV (substitute your own access key, secret, and bucket in the path):
filter_gta_out.write.format("com.databricks.spark.csv").option("header","true").save("s3n://YOUR_KEY:YOUR_SECRET@BUCKET_NAME/resources/spark-csv/mysqlimport1.csv")
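To check the export, the same spark-csv package can read the files back from S3. A minimal sketch using the same placeholder credentials and path (note that save() writes a directory of part files, not a single file):
// Read the exported CSV back from S3 to verify it.
val verify_csv = sqlcontext.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .load("s3n://YOUR_KEY:YOUR_SECRET@BUCKET_NAME/resources/spark-csv/mysqlimport1.csv")
verify_csv.show()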
************************************************************************************************************************************************************