Indexing Entities

Single-Partition Indexing

For accounts, at least, the account name is unique for each user regardless of casing (an upper-case letter is treated the same as its lower-case variant). Inserting an entity and indexing it must therefore be done as a single operation: if the operation succeeds, the entity is both inserted and indexed, and there is no way for the operation to stop midway and leave the entity inserted but not indexed, or vice versa.

Batch operations offer this guarantee. The entity ID is a GUID and is stored in the row key of the entity entry, while the name, which needs to be unique, is stored in a separate index entry whose row key holds the name in lower case. Both entries are stored in the same partition. To ensure that a name which happens to have the same value as a GUID cannot collide with an entity entry, the row key of each entry is prefixed: id- for the entity entry and name- for the index entry.

PartitionKey: partitionId
    - RowKey: 'id-' + entity.id
      Other Properties

    - RowKey: 'name-' + toLowerCase(entity.name)
      IndexedEntityId: entity.id

This is the single-partition index structure/pattern.
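The pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: InMemoryTable, insert_batch and insert_indexed_entity are hypothetical names, and the all-or-nothing check stands in for whatever batch operation the real table store provides.

```python
import uuid


class InMemoryTable:
    """Hypothetical stand-in for a table store; rows are keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self.rows = {}

    def insert_batch(self, partition_key, entities):
        """Insert all entities into one partition, or none of them."""
        row_keys = [entity["RowKey"] for entity in entities]
        # Validate everything first so a failure leaves the table untouched.
        if len(set(row_keys)) != len(row_keys):
            raise ValueError("duplicate row key inside batch")
        for row_key in row_keys:
            if (partition_key, row_key) in self.rows:
                raise KeyError(f"row already exists: {row_key}")
        for entity in entities:
            self.rows[(partition_key, entity["RowKey"])] = dict(entity)


def insert_indexed_entity(table, partition_id, name, properties):
    """Insert the entity row and its name index row as one batch."""
    entity_id = str(uuid.uuid4())
    # The 'id-' and 'name-' prefixes keep the two kinds of row keys apart.
    entity_row = {**properties, "RowKey": "id-" + entity_id, "Name": name}
    index_row = {"RowKey": "name-" + name.lower(), "IndexedEntityId": entity_id}
    table.insert_batch(partition_id, [entity_row, index_row])
    return entity_id
```

In this sketch, a second insert with the same name in a different casing fails on the index row, and because the batch is validated before anything is written, the new entity row is not stored either.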

Cross-Partition Indexing

This variation performs the indexing in two separate operations, because the two entries are not stored in the same partition. It is riskier, because the process can stop midway, but it handles large amounts of data better: the index is spread across multiple partitions instead of a single partition that keeps growing. This method is used for storing user logins and user accounts. Each login has its own partition containing one row (or several, if there are multiple login configurations for the same e-mail), where the login information is stored alongside the user ID. The user is stored in a different table, and the user ID is used as the partition key for each entry containing user details.

Here, the order in which the entities are stored is critical, because it must account for the case where only one of the two operations succeeds. When registering users, the user must be stored before their login information. If the login insert then fails, the user has no way to confirm their account or authenticate, so the sign-up fails. The system is not blocked, however, when the user attempts to register again with the same e-mail, because their e-mail address has not been indexed yet. A clean-up job can be run to remove all user entries that have no associated login, which removes the leftover duplicate data.

- PartitionKey: toLowerCase(entity.name)
  RowKey: entity.type
  EntityId: entity.id

- PartitionKey: entity.id
  RowKey: "details"
  Other Properties
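The ordering and the clean-up job described above can be sketched as follows. The two tables are modelled as plain dictionaries keyed by (PartitionKey, RowKey), and every name here (DuplicateRow, insert_row, register_user, remove_orphaned_users) is a hypothetical helper chosen for illustration, not part of any real storage API.

```python
class DuplicateRow(Exception):
    """Raised when a row with the same keys already exists."""


def insert_row(table, partition_key, row_key, properties):
    """Insert one row into a dict-backed table, failing if the keys are taken."""
    if (partition_key, row_key) in table:
        raise DuplicateRow((partition_key, row_key))
    table[(partition_key, row_key)] = dict(properties)


def register_user(users, logins, user_id, email, login_type, details):
    """Two separate inserts: the user row first, the login index row second."""
    # Step 1: the user row. If step 2 fails, this row is left as an orphan,
    # but the e-mail is still free, so the user can sign up again.
    insert_row(users, user_id, "details", details)
    # Step 2: the login row, which is what makes the e-mail unique.
    insert_row(logins, email.lower(), login_type, {"EntityId": user_id})


def remove_orphaned_users(users, logins):
    """Clean-up job: delete user rows that no login row points to."""
    referenced = {row["EntityId"] for row in logins.values()}
    orphans = [key for key in users if key[0] not in referenced]
    for key in orphans:
        del users[key]
    return len(orphans)
```

A real clean-up job would probably also need to skip users whose registration is still in progress, for example by only considering entries older than some threshold; that detail is left out of this sketch.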

Indexed Entities

Accounts

Single-Partition Index

PartitionKey: userId
    - RowKey: 'id-' + account.id
      Other Properties

    - RowKey: 'name-' + toLowerCase(account.name)
      IndexedEntityId: account.id
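Reading through the index takes two point lookups in the same partition: one for the name row, one for the account row it points to. A minimal sketch, assuming the rows are held in a dictionary keyed by (PartitionKey, RowKey); find_account_by_name is a hypothetical helper name.

```python
def find_account_by_name(rows, user_id, account_name):
    """Resolve an account through its name index row; return None if absent."""
    # Lower-casing the name makes the lookup case-insensitive, matching the index.
    index_row = rows.get((user_id, "name-" + account_name.lower()))
    if index_row is None:
        return None
    return rows.get((user_id, "id-" + index_row["IndexedEntityId"]))
```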
