Running PySpark Jobs Locally with the AWS Glue ETL Library on Windows


I followed the steps for developing with the AWS Glue ETL library (Python) on Windows, found here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html

After following the installation instructions, it is still unclear how to run a Spark job locally from PowerShell.

I have a simple PySpark job called my_script.py:

from pyspark.context import SparkContext
from pyspark.sql import SparkSession

sc = SparkContext()
spark = SparkSession(sc)

I have:

  1. navigated to the aws-glue-libs directory
  2. in PowerShell, run .\bin\gluesparksubmit F:\programming\my_script.py
  3. found that the command appears to produce no output

Can you please provide a correct example of how to run an AWS Glue job locally?

The ultimate goal is to develop my Glue jobs locally in PyCharm before deploying them to the AWS Glue service.

Adriano
asked 9 months ago, 370 views
1 Answer

Those scripts are written for Bash on Linux, not PowerShell.
It might be possible to get them to work using a Cygwin shell, but it might be easier to use the Docker option.
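
As a rough illustration of the Docker option, here is a minimal sketch that only builds and prints a `docker run` command for the script in the question. The image tag and the container workspace path are assumptions taken from the AWS Glue Docker documentation as I remember it, not from this thread, and the helper name `build_glue_docker_command` is hypothetical. Check the current Glue documentation for the image that matches your Glue version.

```python
from pathlib import PureWindowsPath

# Assumptions (not from the original post): image tag and container path
# follow the AWS Glue Docker instructions; adjust them to your Glue version.
GLUE_IMAGE = "amazon/aws-glue-libs:glue_libs_4.0.0_image_01"
CONTAINER_WORKSPACE = "/home/glue_user/workspace"


def build_glue_docker_command(script_path, image=GLUE_IMAGE):
    """Return a `docker run` argument list that spark-submits one local script.

    Hypothetical helper: it mounts the script's folder into the container
    and runs spark-submit on the script from inside that mount.
    """
    # PureWindowsPath parses Windows paths the same way on any OS.
    script = PureWindowsPath(script_path)
    host_dir = str(script.parent)  # e.g. F:\programming
    container_script = f"{CONTAINER_WORKSPACE}/{script.name}"
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_dir}:{CONTAINER_WORKSPACE}",
        image,
        "spark-submit", container_script,
    ]


if __name__ == "__main__":
    cmd = build_glue_docker_command(r"F:\programming\my_script.py")
    # Print the command so it can be pasted into PowerShell.
    print(" ".join(cmd))
```

Pasting the printed command into PowerShell requires Docker Desktop to be installed and running. Note that the script in the question only creates a SparkContext and a SparkSession, so even a successful run will show little beyond Spark's own log lines unless the script prints something.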

AWS
Expert
answered 9 months ago
