This post is part of a series of posts that aims to capture a number of emerging design patterns for the Windows Azure Platform. This series will begin with emerging design patterns for Windows Azure Table storage.
The Table Name Key Pattern enables the key for an entity to be extended beyond the partition key and row key, to provide faster querying or deletion/archiving.
Example
Data with short lifetime such as log or diagnostic data or temporary working sets, are generally only needed for hours, days or perhaps weeks. As an example, consider a log record defined as:
Challenge
The log data is stored in a Windows Azure Table.
Table name: Log
| PartitionKey (Date) |
RowKey (DateTimeStamp) |
InstanceId |
Log Level |
Message |
Description |
| 20100601 |
0634014616600000000 |
Worker1 |
1 |
… |
… |
| 20100601 |
0634014817200000000 |
Worker2 |
1 |
… |
… |
| 20100602 |
0634014817200000000 |
Worker1 |
1 |
… |
… |
While, the choice of the partition key and row key are extremely important, independent of this choice there is a more fundamental design problem. In order to delete or archive old log records, the entities must first be retrieving from the table with a GET operation and then deleted with a further DELETE operation. Leading to the some suggested approaches (http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/30d221d3-3a15-4f2d-b64b-1072146580a1):
var query =
from e in ctx.Log
where e.PartitionKey "20100601"
select e;
foreach (var entity in query)
{
ctx.DeleteObject(entity);
ctx.SaveChanges();
}
While this can be improved slightly by deleting in batches of 100, it is both a time consuming and potentially expensive undertaking.
Solution
A solution to this problem is to extend the key for the entity to include the table name. Creating, for instance, a different log table for each day:
Table name: Log_20100601
| PartitionKey (InstanceId) |
RowKey (DateTimeStamp) |
Log Level |
Message |
Description |
| Worker1 |
0634014616600000000 |
1 |
… |
… |
| Worker2 |
0634014817200000000 |
1 |
… |
… |
Table name: Log_20100602
| PartitionKey (InstanceId) |
RowKey (DateTimeStamp) |
Log Level |
Message |
Description |
| Worker1 |
0634014817200000000 |
1 |
… |
… |
This now enables simple deletion or archiving of old log records by a single delete table operation for each day; as there is an explicit link between the day and the table name.
Summary
Motivation:
Extend the key for an entity beyond the partition key and row key to provide faster querying or deletion/archiving.
Implementation:
The table name is used as part of the key for an entity. The unique key for an entity is provided by the combination of the table name, partition key and row key. A new table created as and when required for a new entity.
Uses:
- For entities that have different lifetime enabling them to be managed (e.g. for deletion/archiving) based on their lifetime.
- Multi-tenanted scenarios to provide some additional isolation against cross customer queries.
- Where you have more than 2 keys and concatenation does not provide a sensible solution.
Also See
Table Name Key Pattern
Hash Partitioning Pattern
Transactional Master-Item Record Pattern
Chronological Query Pattern
Starts With Query Pattern