This repository has been archived by the owner on May 17, 2022. It is now read-only.

Test compatibility with PYSPARK_SUBMIT_ARGS #10

Closed
AbdealiLoKo opened this issue May 27, 2018 · 2 comments

Comments

@AbdealiLoKo
Contributor

Based on the discussion at #6 (comment)

The extension does an `import pyspark` internally. This means that if I, as a Jupyter user, want to do something like:

import os

spark_pkgs=('com.amazonaws:aws-java-sdk:1.7.4',
            'org.apache.hadoop:hadoop-aws:2.7.3',
            'joda-time:joda-time:2.9.3',)

os.environ['PYSPARK_SUBMIT_ARGS'] = (
    '--packages {spark_pkgs} pyspark-shell'.format(spark_pkgs=",".join(spark_pkgs)))

import findspark
findspark.init()
import pyspark

spark = pyspark.sql.SparkSession.builder \
    .getOrCreate()

I cannot, because the PYSPARK_SUBMIT_ARGS environment variable would only be set after pyspark has already been imported by the sparkmonitor module.

@krishnan-r
Owner

Can you confirm that setting the environment variable is not working?

I think the environment variable is read by Spark only when the SparkContext object is created.
The extension only imports pyspark and creates a SparkConf object. If I'm not wrong, you can still add properties to the conf, as well as set environment variables, before starting the context.
(Here again, you must pass the conf when creating the SparkContext for the extension to work.)
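A minimal model of this timing, where `launch_context` is a hypothetical stand-in for Spark's gateway launch (not a real pyspark API): like Spark, it reads PYSPARK_SUBMIT_ARGS when the context is started, not when the module is imported, so setting the variable after the import but before the launch still takes effect.

```python
import os

# Hypothetical stand-in for Spark's gateway launch: the environment
# variable is read here, at context-creation time.
def launch_context():
    return os.environ.get("PYSPARK_SUBMIT_ARGS", "pyspark-shell")

# The "import" has already happened above, but setting the variable
# before starting the context is still in time:
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages joda-time:joda-time:2.9.3 pyspark-shell")
print(launch_context())  # --packages joda-time:joda-time:2.9.3 pyspark-shell
```

With the real pyspark, the equivalent move is to set the variable (or conf properties such as `spark.jars.packages`) any time before `SparkContext(conf=conf)` is called.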

@AbdealiLoKo
Contributor Author

You're right. PYSPARK_SUBMIT_ARGS is ignored only in the case of the PySpark kernel in Jupyter, and that is because the PySpark kernel initializes the SparkContext internally, so by the time the args are set, the context has already been created and they have no effect.

An observation: it does look like sparkmonitor won't work correctly with the PySpark kernel, as that kernel won't use the conf created by sparkmonitor.

Closing as this is not an issue.
