Why MUST an index be sparse to be used in a DocumentDB query with an $exists clause?

Question

Given a DocumentDB collection called `test`, with 200 documents of the following shape (i.e. containing a simple key/value pair):

```
{ "num" : 1 }
```
and an index on the key `num`:

```
{ v: 4, key: { num: 1 }, name: 'num-idx', ns: 'db.test2' }
```
the index is not used for the following query:

```
db.test.find({'num': {$exists: true}})
...
winningPlan: { stage: 'COLLSCAN' }
```
This is consistent with the AWS [documentation](https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html#functional-differences.indexes) which states:

> Amazon DocumentDB does not currently support the ability to use indexes with the $ne, $nin, $nor, $not, $exists, $distinct, and $elemMatch operators. As a result, utilizing these operators will result in collection scans

However when I recreate the index, specifying it is sparse:

```
{  v: 4, key: { num: 1 }, name: 'num-idx', ns: 'db.test2', sparse: true }
```
then the index IS used:

```
db.test.find({'num': {$exists: true}})
...
winningPlan: { stage: 'IXSCAN', indexName: 'num-idx', direction: 'forward' } }
```

According to the [MongoDB documentation](https://www.mongodb.com/docs/manual/core/index-sparse/), making an index sparse has the following implications:
- documents missing the indexed keys will not be present in the index 
- this makes the index smaller in memory
- you cannot use it for queries containing `{ $exists: false }`

There is no mention of the sparsity of an index being a requirement in order for it to be used in the case of `{ $exists: true }` queries.

The closest to an answer I can find is https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html#functional-differences.sparse-index piece of AWS documentation which states:

> To use a sparse index that you have created in a query, you must use the $exists clause on the fields that cover the index. If you omit $exists, Amazon DocumentDB does not use the sparse index

However, this clashes with the previous documentation quote.

**Behaviour for MongoDB (db.test.find({'num': {$exists: }})):**

| Head | exists:true | exists:false |
| --- | --- | --- | 
| sparse index | IDX | COLLSCAN | 
| non sparse index | IDX | IDX |

**Behaviour for DocumentDB (db.test.find({'num': {$exists: }})):**

| Head | exists:true | exists:false |
| --- | --- | --- | 
| sparse index | IDX | COLLSCAN | 
| non sparse index | COLLSCAN | COLLSCAN |

Is this an unintended loophole?

Can I rely on this behaviour to guarantee that a query containing an `{ $exists: true }` clause is able to use an index if it is sparse?

Answer

Not sure what you mean by *unintended loophole* in this context, but your tests validate what the DocumentDB documentation states.

> To use a sparse index that you have created in a query, you must use the $exists clause on the fields that cover the index. If you omit $exists, Amazon DocumentDB does not use the sparse index

means that if you have a sparse index and you use `$exist: true` condition the index will be used, which is what happens. Maybe the documentation could explicitly state **$exist: true**.

On the other hand, DocumentDB does not support  using the index with `$exists: true` if the index is not sparse, and your tests demonstrate it, as pointed out by:

> Amazon DocumentDB does not currently support the ability to use indexes with the $ne, $nin, $nor, $not, $exists, $distinct, and $elemMatch operators. As a result, utilizing these operators will result in collection scans

Why MUST an index be sparse to be used in a DocumentDB query with an $exists clause?

関連するコンテンツ