Why is the Python connector slower than the JDBC driver?


To connect to Redshift from our app, we would like to replace the existing JDBC driver (Java) with the Python SQL connector, if possible. To that end, we measured how long it takes to fetch a few large datasets with select *, GETDATE(), RAND() from <db.schema.table>, where GETDATE() and RAND() are added to disable caching.

The Python SQL connector consistently takes about twice as long to fetch the same table. The elapsed times from our measurement code are about the same as those shown in the "query monitoring" logs of the Redshift cluster.

In summary, the same query takes about twice as long when submitted via the Python SQL connector as when submitted via the JDBC driver (Java). Any thoughts on why?

Mary
asked 6 months ago · 317 views
1 Answer

Hello,

The performance difference you observed between the Python connector and the JDBC driver may come from differences in how the two drivers handle connections, data conversion, and queries.

To get more insight, consider profiling the client code and comparing the query plans generated for each driver.
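As a starting point for that profiling, here is a minimal sketch that times the execute phase and the fetch phase separately, so you can see whether the extra time is spent waiting for the query or pulling and converting rows on the client. It relies only on the standard Python DB-API 2.0 cursor methods execute() and fetchmany(); the helper name profile_query and the batch size are hypothetical, and the demo runs against an in-memory SQLite table purely so it is self-contained. With Redshift you would pass a cursor from your Python connector instead, assuming it follows DB-API 2.0.

```python
import sqlite3
import time


def profile_query(cursor, sql, batch_size=10_000):
    """Time the execute and fetch phases of a query separately.

    Works with any DB-API 2.0 cursor. `profile_query` is a hypothetical
    helper for illustration, not part of any driver.
    """
    t0 = time.perf_counter()
    cursor.execute(sql)              # submit the query
    t1 = time.perf_counter()

    rows = 0
    while True:
        batch = cursor.fetchmany(batch_size)  # pull rows in batches
        if not batch:
            break
        rows += len(batch)
    t2 = time.perf_counter()

    return {"execute_s": t1 - t0, "fetch_s": t2 - t1, "rows": rows}


if __name__ == "__main__":
    # Stand-in data source (assumption): in-memory SQLite instead of Redshift.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER, v REAL)")
    cur.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(i, i * 0.5) for i in range(50_000)],
    )
    print(profile_query(cur, "SELECT * FROM t"))
```

If most of the time lands in the fetch phase, the difference is more likely in result transfer and type conversion on the client than in the query itself. Running the script under python -m cProfile can then narrow down which calls dominate.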

Please also see the following blog post, which covers performance tuning techniques for Amazon Redshift: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-techniques-for-amazon-redshift/

You may also check the below GitHub links for the above drivers :

Reference:

[+] Configuring connections in Amazon Redshift - https://docs.aws.amazon.com/redshift/latest/mgmt/configuring-connections.html

Thank you!

AWS
answered 6 months ago
EXPERT
reviewed a month ago
