Why is the Python connector slower than the JDBC driver?

To connect to Redshift from our app, we would like to replace the existing JDBC driver (Java) with the Python SQL connector if possible. To that end, we measured how long it takes to fetch a few large datasets with select *, GETDATE(), RAND() from <db.schema.table>, where GETDATE() and RAND() are added to disable caching.

The Python SQL connector consistently takes about twice as long to fetch the same table. The elapsed times from our measurement code are about the same as those shown in the "query monitoring" logs of the Redshift cluster.

In summary, the same query takes about twice as long when submitted via the Python SQL connector as when submitted via the JDBC driver (Java). Any thoughts on why?

Mary
asked 6 months ago, 317 views
1 Answer

Hello,

The performance difference you observed between the Python connector and the JDBC driver could be due to differences in how the two drivers handle connections, data conversion, and queries.
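One way to see which of these phases accounts for the gap is to time them separately. The following is a minimal sketch, not a definitive benchmark: time_phases and sqlite_stand_in are hypothetical names introduced here, and the in-memory SQLite database only stands in for Redshift so the example runs anywhere. It assumes the connector exposes the standard Python DB-API (connect, cursor, execute, fetchmany). How much work a given driver does inside execute versus fetch is driver-specific, so treat the split as a rough guide.

```python
import sqlite3
import time


def time_phases(connect, sql, batch_size=10_000):
    """Time connect, execute, and fetch separately for one query.

    `connect` is any zero-argument callable returning a DB-API connection.
    """
    t0 = time.perf_counter()
    conn = connect()
    t1 = time.perf_counter()
    try:
        cur = conn.cursor()
        cur.execute(sql)
        t2 = time.perf_counter()
        rows = 0
        # Drain the result set in batches so client-side row conversion
        # is included in the fetch timing.
        while True:
            batch = cur.fetchmany(batch_size)
            if not batch:
                break
            rows += len(batch)
        t3 = time.perf_counter()
    finally:
        conn.close()
    return {
        "connect_s": t1 - t0,
        "execute_s": t2 - t1,
        "fetch_s": t3 - t2,
        "rows": rows,
    }


def sqlite_stand_in(n_rows=1000):
    """Hypothetical stand-in for a Redshift connection (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER, y TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)", [(i, str(i)) for i in range(n_rows)]
    )
    return conn


if __name__ == "__main__":
    print(time_phases(sqlite_stand_in, "SELECT * FROM t", batch_size=100))
```

To run it against the cluster, pass a callable that returns a connection from the Python connector in place of sqlite_stand_in, and use the same query as in the JDBC measurement. If most of the time lands in the fetch phase, client-side row transfer and conversion is the likelier cause than query execution on the cluster.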

You may consider profiling the client code and examining the query plans for the queries submitted through each driver to get more insight.
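As a rough illustration of the profiling suggestion, the sketch below wraps a query and fetch in Python's built-in cProfile and returns the most expensive calls by cumulative time. profile_fetch is a hypothetical helper name, and the example assumes only a DB-API cursor. It is demonstrated on an in-memory SQLite database rather than on Redshift.

```python
import cProfile
import io
import pstats
import sqlite3


def profile_fetch(cursor, sql, top=15):
    """Run `sql` on a DB-API cursor under cProfile.

    Returns (row_count, report), where report lists the `top` entries
    sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        cursor.execute(sql)
        rows = cursor.fetchall()
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return len(rows), out.getvalue()


if __name__ == "__main__":
    # SQLite is used here only so the sketch is runnable without a cluster.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
    count, report = profile_fetch(conn.cursor(), "SELECT * FROM t")
    print(count)
    print(report)
```

With a pure-Python driver, the report would typically show whether time goes to reading from the socket or to converting values into Python objects, which helps narrow down where the extra time is spent.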

The following blog post shares performance tuning techniques for Amazon Redshift: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-techniques-for-amazon-redshift/

You may also check the GitHub repositories for both drivers.

Reference:

[+] Configuring connections in Amazon Redshift - https://docs.aws.amazon.com/redshift/latest/mgmt/configuring-connections.html

Thank you!

AWS
answered 6 months ago
EXPERT
reviewed a month ago
