Starts With Query Pattern – Windows Azure Table Design Patterns

This post is part of a series of posts that aims to capture a number of emerging design patterns for the Windows Azure Platform. This series will begin by tackling emerging design patterns for Windows Azure Table storage.

The Starts With Query Pattern enables performance optimised begins with partial match operations.

Example

Provide a fast search for articles that begins with the term “Metabolis” from a list of article titles, such as those found on Wikipedia. A subset is shown below:

Metabolife, Metabolise, Metabolised, Metabolism, Metabolite

Challenge

Using LINQ to return the titles that start with e.g. “Metabolis” the following query might be tried:

var titles =
    from titlesearch in
       context.TitleSearch
    where title.StartsWith(“Metabolis”)
    select titles;

However, this is not supported by Windows Azure Table storage (http://blogs.southworks.net/fboerr/2010/04/22/compsition-in-windows-azure-table-storage-choosing-the-row-key-and-simulating-startswith/).

 

Solution

The entities within a partition in a table are returned in the lexicographical order of the row key. To take advantage of this, storing all the article titles in the same partition and capitalising the title, to provide case insensitive search (if required).

Partition Key Row Key Title
M METABOLIFE Metabolife
M METABOLISE Metabolise
M METABOLISED Metabolised
M METABOLISM Metabolism
M METABOLITE Metabolite

Then using a LINQ query with a set of statements to constrain the results in 3 ways. First, constrain to a single partition “M”. Second, constrain to records from and including “METABOLIS” (lexicographically) . Third, by replacing the last character “S” with the next character “T”, constrain to records before “METABOLIT” (lexicographically).

var titles =
   (from t in
   context.TitleSearch
   where t.RowKey.CompareTo(“METABOLIS”) >= 0
      && t.RowKey.CompareTo(“METABOLIT”)  < 0
      && t.PartitionKey == “M”
   select t
   ).Take(2);

Additionally, the Take operator can be used to limit the number of results returned (http://msdn.microsoft.com/en-us/library/dd135725.aspx). This query will return “Metabolise” and “Metabolised” as the top 2 results from the example list shown.

Summary

Motivation:

Enable performance optimised begins with partial match operations.

Implementation:

The value to be queried stored as the row key and lower and upper bound constraints applied to the query.

Uses:

Where multiple row need to be returned from a begins with partial word match search.

Reference

Windows Azure Table (Tips and Tricks Section)

Composition in Windows Azure Table Storage: choosing the row key and simulating StartsWith

Also See

Table Name Key Pattern

Hash Partitioning Pattern

Transactional Master-Item Record Pattern

Chronological Query Pattern