Why is the Python connector slower than the JDBC driver?


To connect to Redshift from our app, we would like to replace the existing JDBC driver (Java) with the Python SQL connector if possible. To that end, we measured the time taken to fetch a few large datasets with select *, GETDATE(), RAND() from <db.schema.table>, where GETDATE() and RAND() are added to disable caching.

The Python SQL connector consistently takes about twice as long to fetch the same table. The elapsed times obtained from our measurement code are about the same as those shown in the "query monitoring" logs of the Redshift cluster.

In summary, the same query takes about twice as long when submitted via the Python SQL connector as when submitted via the JDBC driver (Java). Any thoughts on why?

1 Answer

Hello,

The performance difference that you observed between the Python connector and the JDBC driver could be due to differences in how they handle connections, data conversion, and queries.

You may consider profiling the code execution and examining the query plans generated by both of them to get more insights.
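As a rough illustration of the profiling step, here is a minimal sketch using Python's standard cProfile module. The helper name profile_fetch and its return keys are hypothetical, chosen only for this example; it assumes a DB-API 2.0 style cursor (one exposing execute() and fetchall()). It times the query and the fetch together, since where the time lands (execute versus fetch) can depend on the driver.

```python
import cProfile
import io
import pstats
import time


def profile_fetch(cursor, sql, top=10):
    """Run `sql` on a DB-API cursor, fetch all rows, and profile the whole call.

    `profile_fetch` is a hypothetical helper for illustration; `cursor` is
    assumed to expose execute() and fetchall() (DB-API 2.0).
    """
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    cursor.execute(sql)        # send the query
    rows = cursor.fetchall()   # pull every row to the client
    profiler.disable()
    elapsed_s = time.perf_counter() - start

    # Render the most expensive calls, sorted by cumulative time.
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(top)

    return {
        "rows": len(rows),
        "elapsed_s": elapsed_s,
        "profile": report.getvalue(),
    }


if __name__ == "__main__":
    # Stand-in demo with sqlite3 so the sketch runs without a cluster.
    import sqlite3

    cur = sqlite3.connect(":memory:").cursor()
    cur.execute("create table t (x integer)")
    cur.executemany("insert into t values (?)", [(i,) for i in range(1000)])
    result = profile_fetch(cur, "select * from t")
    print(result["rows"], "rows in", round(result["elapsed_s"], 4), "s")
    print(result["profile"])
```

With the Python connector, the same helper could be given a cursor obtained from redshift_connector.connect(...).cursor(). If the report shows most of the cumulative time inside the driver's own row decoding or type conversion functions, that would point to client-side processing rather than to the query itself.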

Please go through the following blog post, which shares performance tuning techniques for Amazon Redshift: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-techniques-for-amazon-redshift/

You may also check the GitHub links below for the above drivers:

Reference:

[+] Configuring connections in Amazon Redshift - https://docs.aws.amazon.com/redshift/latest/mgmt/configuring-connections.html

Thank you!

AWS
answered 6 months ago
EXPERT
reviewed a month ago
