Zeppelin Notebook on EMR - how to do pip install

0

I am trying to install the happybase package (or, for that matter, any package) in a Zeppelin notebook. How do I do a pip install from a Zeppelin cell? Neither %pip nor !pip is recognized.

Asked 4 months ago, viewed 213 times
2 answers
3

Hello,

You can follow the steps below in Zeppelin to install packages at runtime. This method works in client deploy mode.

  1. Use a bootstrap script to give other users (here, zeppelin) access to the home directory, so that the Zeppelin service can install the packages in the /home/.local directory:
     sudo chmod 757 /home
  2. Add the settings below to the Spark interpreter from the Zeppelin UI, then restart the interpreter:
     spark.pyspark.virtualenv.enabled     true
     spark.pyspark.virtualenv.bin.path    /usr/bin/virtualenv
     spark.pyspark.virtualenv.type        native
     spark.pyspark.python                 python3
  3. Now try installing a package from the notebook with the command below:
     %spark.pyspark
     sc.install_pypi_package("xgboost")
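
As a rough sketch of how the last step might be wrapped for reuse inside a %spark.pyspark paragraph: the helper below calls the same sc.install_pypi_package method shown above and then imports the module on the driver to confirm it is usable. The names install_and_import and module_name are hypothetical and exist only for this illustration. It assumes sc is the SparkContext that Zeppelin provides and that the interpreter settings above are already in place.

```python
import importlib


def install_and_import(sc, package, module_name=None):
    """Install `package` through the SparkContext, then import it on the driver.

    `sc` is assumed to be the SparkContext available in a %spark.pyspark
    paragraph. `module_name` covers packages whose import name differs
    from their PyPI name. Both helper and parameter names are hypothetical.
    """
    # Same call as in the last step of the answer.
    sc.install_pypi_package(package)
    # Let the import system notice files that were added after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name or package)


# Possible usage in a notebook paragraph (not executed here):
# happybase = install_and_import(sc, "happybase")
```

If the import fails right after the install, that usually points at the interpreter settings rather than at the package, so it may be worth rechecking them before retrying.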
AWS
Support Engineer
Answered 1 month ago
0

Hi,

Have a look at https://medium.com/@techboomph/getting-zeppelin-to-work-with-emr-93e237ac446a

The author proposes a way to do the pip install for a Zeppelin notebook on EMR, which is what you need.

Didier

AWS
Expert
Answered 4 months ago
  • Thanks for sharing, David. It looks like he is suggesting including the pip install in the bootstrap script, which means the cluster would need to be recreated. I could try that; however, I believe something similar to Jupyter, where you can do a pip install in the notebook itself, should be available in Zeppelin. I see that the %conda interpreter is loaded, but I am unable to make it work: if I type %conda install happybase, it just says command (install) not found.
