As you have probably heard, SQL Server 2014 features a new In Memory OLTP engine, previously known as Heakton. One of the big features of In Memory OLTP is Memory Optimized Tables; tables which are entirely loaded into memory, rather than only residing on disk and being cached based on use. As you can imagine, storing tables in memory means they are accessed differently, and therefore indexed differently. We’ll take a look at some of the basics of how indexing works for Memory Optimized Tables here.
Our Memory Optimized Tables have two options for types of indexes: Hash Indexes and Nonclustered BW-trees. We’ll take a look at the benefits of each in a moment, but first lets consider the details that apply to both.
Memory Optimized Indexes:
– Exist only in memory. All indexes are rebuilt on database start up.
– Do not contain actual data rows. Instead, data rows for a given table exists only in one place in memory and indexes are made up of memory pointers to those rows.
– Do not duplicate data. All indexes on a table point to the same rows in memory, the data is not duplicated.
– Are all covering indexes. Indexes are made up of pointers to full rows, they automatically cover all columns of a table.
– Are limited to 8 per table. This can be any combination of the types of indexes.
– Must be created along with the table. All memory optimized indexes are created inline in the CREATE TABLE statement.
Types of Memory Optimized Indexes:
Hash indexes are made up of hash buckets with linked lists of rows. They are optimized for point look ups and inserts. They do not support range scans or inequality predicates, as the values are ordered by a hash, and therefore are not necessarily in order by the index key. For the same reason, they can not be used for ordered scans. Let’s look at the DDL for a hash index. Here’s an example of creating a hash index at the column level:
CREATE TABLE [foo]( [bar] INT NOT NULL INDEX IX_bar HASH WITH (BUCKET_COUNT = 100), [boo] VARCHAR(200) NOT NULL ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
You can also create the hash index at the table level:
CREATE TABLE [foo]( [bar] INT NOT NULL, INDEX IX_bar HASH ([bar]) WITH (BUCKET_COUNT = 100)) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
As you can see, this is actually table creation DDL. As mentioned above, all memory optimized indexes must be created along with their table.
You’ll notice in each example that we have WITH (BUCKET_COUNT = #). The bucket count is an important part of defining a memory optimized index. The hash function SQL Server uses for hash indexes hashes each value for the index key and maps it to a hash bucket. Each hash bucket points to the first row of data with the corresponding index key, and that row points to the next row and so on. It follows that for best performance we want each possible index key value to have its own bucket and to point to as few rows as possible. For these reasons, Microsoft recommends that we define our hash indexes with double the number of buckets of our number of unique index key values, and to use a regular nonclustered memory optimized index for tables with many repeating index key values.
It is important to avoid having too few buckets, and so we should always go for a higher bucket count if we’re not sure. If we have too few buckets, the hash function will map multiple index key values to a bucket, which is called a hash collision, and can seriously degrade performance.
The second kind of memory optimized index is a nonclustered index. These special nonclustered indexes are based on the new BW-tree. This architecture was named with tongue firmly placed in cheek, as BW stands for “buzz word”. The name is fitting, however, as these new type of B-trees have some great functionality.
These nonclustered indexes support all operations and are your safer option to try when migrating to the new in memory architecture. They are optimized for range look-ups and ordered scans, support point look-ups (but don’t do them quite as fast as a good Hash index) and also support searching on left-based subsets of index keys.
Let’s take a look at the DDL. Here is how we create a column level nonclustered index:
CREATE TABLE [foo]( [bar] INT NOT NULL INDEX ID) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
And here is how we create a table level nonclustered index:
CREATE TABLE [foo]( [bar] INT NOT NULL, INDEX IX_bar NONCLUSTERED ([bar])) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
Keep in mind, these indexes are unidirectional, meaning that they can only be used if the columns are in the same order as your queries. So, if you are going to use an ORDER BY clause, you need to be sure your corresponding indexes are ordered in the same way.