aws-glue-libs:glue_libs_3.0.0_image_01 image issue


I am having problems with the aws-glue-libs:glue_libs_3.0.0_image_01 image. I start the container with:

docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace/jupyter_workspace/ --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh

The container starts locally, but when I try to read a CSV file stored locally I get this error:

An error was encountered:
Path does not exist: file:/home/glue_user/workspace/employees.csv
Traceback (most recent call last):
  File "/home/glue_user/spark/python/pyspark/sql/readwriter.py", line 737, in csv
    return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
  File "/home/glue_user/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/glue_user/spark/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: Path does not exist: file:/home/glue_user/workspace/employees.csv

Alternatively, when I try to start the container with:

docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh

the container does not start, and I get the following error:

Bad config encountered during initialization: No such directory: ''/home/glue_user/workspace/jupyter_workspace''

asked 2 years ago, 733 views
1 Answer
Accepted Answer

Hello,

Assuming employees.csv is in your local directory C:/Docker/jupyter_workspace, the command below mounts that directory at /home/glue_user/workspace/jupyter_workspace/ inside the Docker container:

docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace/jupyter_workspace/ --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh

However, you then try to read the file with something like:

df = spark.read.csv("employees.csv")

As the error message shows, Spark resolves that relative path against /home/glue_user/workspace/, not against the mounted jupyter_workspace subdirectory, so the file is not found.

Can you try the full path of the file, or a relative path like the one below?

df = spark.read.csv("jupyter_workspace/employees.csv")
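To see why the second form works, here is a minimal sketch that reproduces the path resolution by hand. It assumes the notebook's working directory is /home/glue_user/workspace, which is what the error message in the question suggests. The helper names resolve_in_container and as_file_uri are made up for this illustration and are not part of Spark or Glue.

```python
from pathlib import PurePosixPath

# Assumption: the notebook's working directory inside the container, inferred
# from the error message (file:/home/glue_user/workspace/employees.csv).
WORKDIR = PurePosixPath("/home/glue_user/workspace")


def resolve_in_container(path, workdir=WORKDIR):
    """Return the absolute container path that `path` would resolve to."""
    p = PurePosixPath(path)
    # Absolute paths are used as-is; relative paths are joined to the working directory.
    return p if p.is_absolute() else workdir / p


def as_file_uri(path, workdir=WORKDIR):
    """Build an explicit local-filesystem URI for a container path."""
    return "file://" + str(resolve_in_container(path, workdir))


# The bare file name lands directly under the working directory (not mounted).
bare = as_file_uri("employees.csv")
# Prefixing the mounted subdirectory points at the file that actually exists.
fixed = as_file_uri("jupyter_workspace/employees.csv")

print(bare)
print(fixed)
```

Passing the second URI (or the equivalent absolute path) to spark.read.csv removes any dependence on the notebook's working directory.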
AWS
SUPPORT ENGINEER
answered 2 years ago
  • Hello Chiranjeevi, thanks for the reply. Yes, I resolved it the same way you described. It was my mistake: even after mounting my directory into the container's working directory, I was passing the Windows path rather than the path inside the container.
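For anyone hitting the same mix-up, the translation from a Windows host path to the path inside the container can be sketched as below. The two mount points are taken from the -v option in the first docker run command; the function name host_to_container is hypothetical and only illustrates the mapping.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Mount points from the -v option in the question:
#   C:/Docker/jupyter_workspace:/home/glue_user/workspace/jupyter_workspace/
HOST_MOUNT = PureWindowsPath("C:/Docker/jupyter_workspace")
CONTAINER_MOUNT = PurePosixPath("/home/glue_user/workspace/jupyter_workspace")


def host_to_container(host_path, host_mount=HOST_MOUNT,
                      container_mount=CONTAINER_MOUNT):
    """Map a path under the host mount to its location inside the container.

    Raises ValueError if host_path is not under host_mount.
    """
    relative = PureWindowsPath(host_path).relative_to(host_mount)
    # Re-join the relative parts using POSIX separators for the Linux container.
    return container_mount.joinpath(*relative.parts)


container_path = host_to_container("C:/Docker/jupyter_workspace/employees.csv")
print(container_path)
```

The resulting container path is what Spark inside the container should be given; the Windows path only exists on the host.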
