Hello
I can fully appreciate how useful tab-completion is for writing Python code in Jupyter, and that having the same functionality for PySpark would save a lot of time looking at reference material. As you may know, PySpark and Jupyter are open-source software distributions, so the features available to us are limited by what is developed in their respective communities. On the AWS side, EMR Notebooks use SparkMagic kernels, which in turn use Apache Livy to communicate with the EMR cluster and run Spark jobs there. Apache Livy currently does not expose any API for IntelliSense-style autocompletion, so this functionality is not possible with EMR Notebooks using SparkMagic kernels. For more information, you can refer to the link for this issue 1. There is an internal ticket for the service team to track this feature request, but there is no ETA for its release. You can keep an eye out for the announcement here 2 3.
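To illustrate why completion needs access to the live namespace (which the notebook does not have when code runs remotely through Livy), here is a minimal sketch using Python's standard-library `rlcompleter`, the same kind of mechanism a local kernel relies on. The variable names `spark_df` and `spark_session` are hypothetical stand-ins, not anything from a real EMR session:

```python
import rlcompleter

# A local kernel can complete names because it sees the live namespace.
# Over Livy, the notebook process has no such access, so Tab has nothing to query.
namespace = {"spark_df": object(), "spark_session": object()}
completer = rlcompleter.Completer(namespace=namespace)

# Collect all matches for the prefix "spark_" (state 0, 1, ... until None).
matches = []
state = 0
while True:
    match = completer.complete("spark_", state)
    if match is None:
        break
    matches.append(match)
    state += 1

print(sorted(matches))  # → ['spark_df', 'spark_session']
```

With SparkMagic, the variables live on the cluster, not in the notebook process, so there is no local namespace for a completer like this to inspect.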
As a workaround, EMR supports an on-cluster execution mode 4. You can enable on-cluster execution mode for the notebook, which allows you to install new Spark-native kernels such as Apache Toree on the EMR cluster. Using these native kernels, you can get autocompletion.
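For context, installing a kernel such as Toree ultimately registers a Jupyter kernelspec (a `kernel.json`) on the cluster, and because the kernel runs on the cluster itself it has the live namespace needed for completion. A hypothetical sketch of what such a spec might look like (the paths, display name, and launcher script are illustrative assumptions, not copied from a real installation):

```json
{
  "display_name": "Apache Toree - PySpark",
  "language": "python",
  "argv": [
    "/usr/local/share/jupyter/kernels/apache_toree_pyspark/bin/run.sh",
    "--profile",
    "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/usr/lib/spark"
  }
}
```

You can list the kernelspecs registered on a machine with `jupyter kernelspec list` to confirm the new kernel is available.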
Another useful option I found in a third-party blog post describes how to install your own Jupyter on the EMR cluster; you can refer to it here 5 for more information. There is also some discussion here 6 about how to get autocompletion in Jupyter notebooks without pressing Tab, using extensions such as Hinterland or TabNine. As a general recommendation, since these are third-party tools, please test them thoroughly before deploying them into production.
I hope the above information helps.