Why is python connector slower than jdbc driver?

0

To connect to Redshift from our app, we would like to replace existing JDBC driver (Java) with the python sql connector if possible. For that purpose, we measured the performance to fetch a few large datasets by select *, GETDATE(), RAND() from <db.schema.table>. where GETDATE(), RAND() are added to disable caching.

The python sql connector consistently takes about twice the time to get the same table. The elapsed times obtained from our measurement code are about the same as those shown in the "query monitoring" logs of the Redshift cluster.

In summary, the same query when submitted via the python sql connector takes about twice the time than when submitted via the JDBC driver (Java). Any thoughts on why?

Mary
已提問 6 個月前檢視次數 317 次
1 個回答
0

Hello,

The difference in performance that you observed between the Python Connector and JDBC driver could be due to the difference in the way they handle connections, data conversion and queries.

You may consider profiling the code execution and examining the query plans generated by both of them to get more insights.

Please go through the following blog that shares performance tuning techniques for Amazon Redshift : https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-techniques-for-amazon-redshift/

You may also check the below GitHub links for the above drivers :

Reference :

[+] Configuring connections in Amazon Redshift - https://docs.aws.amazon.com/redshift/latest/mgmt/configuring-connections.html

Thank you!

AWS
已回答 6 個月前
profile picture
專家
已審閱 1 個月前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南