Hi Toebs2,
Those aren't actual row counts. Those are plan operator output estimates that can be partly based on statistics resulting from the ANALYZE command, partly from cardinality estimates based on prior filters and joins, and rules and overriding constant values in the query planner/optimizer itself.
I don't know the Redshift code, nor as a customer do I have access to it. However, from having worked on and with other database engines for a long time, I can tell you that quite often there are a lot of edge-condition overriding constants in SQL query planner/optimizer code to avoid absolute edges, like 0's in the stats, which often don't play too nicely with the other operators when computing plan costs. So small overrides at the access-plan layer of a complex query plan can manifest as some unexpected numbers as they propagate up the estimation formulae of the plan tree.
In your case the plan still doesn't have much work to do, so I wouldn't focus on it too much. On the other hand, I would be thankful that there probably are overriding constants in the query planner/optimizer that have been put there to avoid edge conditions where misleading estimates might cause the choice of a query plan that performs horribly.
I've seen this happen before in databases I won't name. Most were fixed with, you guessed it, an overriding constant for an edge-condition stat in the query planner/optimizer code. If you ever trip over one, you'll know it because you'll be looking at a query plan that has something like a Cartesian cross product or a nested loop join over tables with some ungodly number of rows.
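To make the idea concrete, here is a minimal sketch (hypothetical, not Redshift's actual code) of why a planner might clamp estimates away from zero: a raw row estimate of 0 from stale statistics would zero out every cost computed above it in the plan tree, so a small floor constant keeps the math sane at the price of slightly "unexpected" numbers.

```python
# Hypothetical sketch of a planner's edge-case clamp on cardinality
# estimates. MIN_ROWS is an example "overriding constant"; the names
# and formula are illustrative, not from any real engine's source.

MIN_ROWS = 1.0  # floor that guards the zero-rows edge case

def estimate_join_rows(left_rows, right_rows, selectivity):
    """Textbook join cardinality estimate with an edge-case clamp."""
    raw = left_rows * right_rows * selectivity
    # Clamp so downstream operators never see an impossible 0-row input,
    # which would collapse every cost estimate above this node to 0.
    return max(raw, MIN_ROWS)

# Stale ANALYZE stats might report 0 rows for a table; the clamp turns
# that into a small nonzero estimate instead of poisoning the plan tree.
print(estimate_join_rows(0, 1_000_000, 0.001))      # clamped to 1.0
print(estimate_join_rows(5_000, 1_000_000, 0.001))  # 5000000.0
```

That clamped 1.0 is exactly the kind of small, constant-looking number that can surface in EXPLAIN output and then get multiplied through the formulae of parent operators.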
I'll leave it to the Redshift team to comment on the Redshift specifics and details of the question.
Regards,
-Kurt
Hi Toebs2,
Those aren't actual row counts. Those are plan operator output estimates that can be partly based on statistics resulting from the ANALYZE command, partly from cardinality estimates based on prior filters and joins, and rules and overriding constant values in the query planner/optimizer itself.
I don't know the Redshift code, nor as a customer do I have access to it. However, from having worked on and with other database engines for a long time, I can tell you that quite often there are a lot of edge-condition overriding constants in SQL query planner/optimizer code to avoid absolute edges, like 0's in the stats, which often don't play too nicely with the other operators when computing plan costs. So small overrides at the access-plan layer of a complex query plan can manifest as some unexpected numbers as they propagate up the estimation formulae of the plan tree.
In your case the plan still doesn't have much work to do, so I wouldn't focus on it too much. On the other hand, I would be thankful that there probably are overriding constants in the query planner/optimizer that have been put there to avoid edge conditions where misleading estimates might cause the choice of a query plan that performs horribly.
I've seen this happen before in databases I won't name. Most were fixed with, you guessed it, an overriding constant for an edge-condition stat in the query planner/optimizer code. If you ever trip over one, you'll know it because you'll be looking at a query plan that has something like a Cartesian cross product or a nested loop join over tables with some ungodly number of rows.
I'll leave it to the Redshift team to comment on the Redshift specifics and details of the question.
Regards,
-Kurt
Thank you, Klarson.
I will look at the Postgres source code and see what it does.
+5 points for you! :-)
Hmm, interesting.
I think I may have stumbled across a forum bug.
You posted the same reply twice, somehow - and marking one of them as correct has given you ten points, not five.
That reminds me to post multiple replies in the future :-)