Zeppelin Notebook on EMR - how to do pip install

0

I am trying to install the happybase package (or, for that matter, any package) in a Zeppelin notebook. How do I do a pip install from a Zeppelin cell? Neither %pip nor !pip is recognized.

Asked 4 months ago, 213 views
2 Answers
3

Hello,

You can follow the steps below in Zeppelin to install packages at runtime. This method works in client deploy mode.

  1. Use a bootstrap script to give other users (such as zeppelin) access to the home directory, so that the Zeppelin service can install packages under the /home/.local directory:
sudo chmod 757 /home
  2. Add the settings below to the Spark interpreter in the Zeppelin UI, then restart the interpreter:
spark.pyspark.virtualenv.enabled     true
spark.pyspark.virtualenv.bin.path    /usr/bin/virtualenv
spark.pyspark.virtualenv.type        native
spark.pyspark.python                 python3
  3. Now install the package from the notebook with the command below:
%spark.pyspark
sc.install_pypi_package("xgboost")
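For the happybase package from the question, the last step could be wrapped in a small helper like the one below. This is only a sketch under some assumptions: `install_package` and `pip_install_command` are hypothetical names, not part of Zeppelin or EMR, and the fallback to `python -m pip` is an extra not described in the steps above. The fallback installs only for the Python interpreter running the paragraph, so it may not reach the executors.

```python
import subprocess
import sys


def pip_install_command(package, user=False):
    """Build a pip command line for the interpreter running this code.

    Hypothetical helper. Pass user=True only outside a virtualenv,
    because pip rejects --user inside one.
    """
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        cmd.append("--user")
    cmd.append(package)
    return cmd


def install_package(package, spark_context=None):
    """Install `package`, preferring the SparkContext helper when present.

    Assumption: on EMR the SparkContext exposes install_pypi_package,
    as used in the steps above. Otherwise fall back to plain pip.
    """
    if spark_context is not None and hasattr(spark_context, "install_pypi_package"):
        spark_context.install_pypi_package(package)
        return "install_pypi_package"
    subprocess.check_call(pip_install_command(package))
    return "pip"


# In a Zeppelin paragraph that starts with %spark.pyspark, `sc` is already defined:
#   install_package("happybase", sc)
```

Checking for the method with `hasattr` keeps the same paragraph usable on a setup where the SparkContext does not offer it, at the cost of a driver-only install in that case.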
AWS
Support Engineer
Answered 1 month ago
0

Hi,

Have a look at https://medium.com/@techboomph/getting-zeppelin-to-work-with-emr-93e237ac446a

The author describes a way to do the pip install you need for a Zeppelin notebook on EMR.

Didier

AWS
Expert
Answered 4 months ago
  • Thanks for sharing, David. So it looks like he is suggesting including the pip install in the bootstrap script, which means the cluster would need to be recreated. I could try that; however, I believe something similar to Jupyter, where you can do a pip install in the notebook itself, should be available in Zeppelin. I see that the %conda interpreter is loaded, but I am unable to make it work: if I type %conda install happybase ... it just says command ( install ) not found
