Friday, January 13, 2017

Hashing to partitions within DocumentDB

I'm building a system on top of DocumentDB, using partitioned collections. For the partition key, we're going to use a random number to best ensure equal utilization of the underlying partitions (and hence equal utilization of the RUs, because RUs are split equally among the underlying partitions).

We'll be storing the partition key within an identifier so that we know how to "find" the document later when a request is made by the identifier.
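As a rough illustration of that idea, here is a minimal Python sketch. The identifier layout (zero-padded partition key followed by a UUID) and the names `new_identifier` and `partition_key_of` are assumptions made for the example, not something DocumentDB prescribes.

```python
import secrets
import uuid

# Assumption: 100 gives keys in [0-99]; use 1000 for [0-999].
KEY_RANGE = 100


def new_identifier(key_range: int = KEY_RANGE) -> str:
    """Return an identifier whose leading digits are the partition key."""
    width = len(str(key_range - 1))  # 2 characters for [0-99], 3 for [0-999]
    partition_key = secrets.randbelow(key_range)
    # Zero-pad so the key always occupies the same number of characters.
    return f"{partition_key:0{width}d}{uuid.uuid4().hex}"


def partition_key_of(identifier: str, key_range: int = KEY_RANGE) -> str:
    """Recover the zero-padded partition key from an identifier."""
    width = len(str(key_range - 1))
    return identifier[:width]
```

Given an identifier, `partition_key_of` returns the value to pass as the partition key when reading the document back. The fixed width is what makes the one-character difference between the two ranges visible in the identifier length.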

With that said, we're trying to figure out the trade-offs of the random number range. Our choice is something like [0-999], or [0-99] if we want to save one character (which is important in our use case).

If there are more than 100 underlying partitions, [0-99] would necessarily under-utilize them, because the 100 distinct key values couldn't cover all possible "buckets". I'm trying to reason about the opposite case: with fewer than 100 physical partitions, what is the trade-off of reducing the random distribution range to [0-99]?
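One way to get a feel for the trade-off is to simulate it. The sketch below assumes that each distinct partition key value lands entirely on one physical partition, and uses MD5 modulo the partition count purely as a stand-in; DocumentDB's real hash function and key-to-partition mapping are not reproduced here. The helper names `partition_for` and `skew` are made up for the example.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partitions: int) -> int:
    """Stand-in hash: map a partition key value to a physical partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def skew(key_range: int, partitions: int) -> float:
    """Busiest partition's key count divided by the ideal equal share."""
    counts = Counter(partition_for(str(k), partitions) for k in range(key_range))
    busiest = max(counts.values())
    return busiest / (key_range / partitions)


if __name__ == "__main__":
    # Compare [0-99] against [0-999] for a few hypothetical partition counts.
    for key_range in (100, 1000):
        for partitions in (5, 10, 25, 50):
            print(key_range, partitions, round(skew(key_range, partitions), 2))
```

A skew of 1.0 means the busiest partition holds exactly its fair share of key values; larger numbers mean it would hit its slice of the RUs sooner. With fewer distinct key values there are fewer keys per partition, so the skew will generally tend to be larger for [0-99] than for [0-999], though the exact figures depend on the real hash.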



