Educated guess about RDS instance with best performance to cost ratio based on EXPLAIN ANALYZE of PostgreSQL

0

I have a table that stores 1M 384-dimensional vectors in the column called vector of type "char"[]. Consider the query:

EXPLAIN ANALYZE
WITH Vars(key) as (
    VALUES (array_fill(1, ARRAY[384])::vector)
)
SELECT content_id
FROM MyTable, Vars
ORDER BY vector::int[]::vector <#> key
LIMIT 10;

The key is just a toy vector consisting of all ones. <#> is the dot product operator of the pgvector extension, and vector is the type defined by that extension, which to my understanding is similar to real[].

Here is the output when running on db.m7g.large RDS PostgreSQL instance:

 Limit  (cost=83268.20..83269.37 rows=10 width=12) (actual time=3070.513..3070.570 rows=10 loops=1)
   ->  Gather Merge  (cost=83268.20..180496.59 rows=833328 width=12) (actual time=3070.512..3070.567 rows=10 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=82268.18..83309.84 rows=416664 width=12) (actual time=3062.863..3062.864 rows=8 loops=3)
               Sort Key: ((((mytable.vector)::integer[])::vector <#> '[1,1, ... ,1]'::vector))
               Sort Method: top-N heapsort  Memory: 25kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 25kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 25kB
               ->  Parallel Seq Scan on mytable  (cost=0.00..73264.22 rows=416664 width=12) (actual time=0.017..3005.128 rows=333333 loops=3)
 Planning Time: 0.097 ms
 Execution Time: 3070.597 ms
(12 rows)

Given this output, how can I make an educated guess about which of the many instance types will give the best performance to cost ratio for this particular query?

No Answers

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions