Index Fragmentation in SQL Server

Ganesh Kamalakannan
3 min readJul 18, 2021

--

An Index is a data structure that arranges data in a specific order. This improves data read performance. Fragmentation is the space wastages in the index pages or the differences in the logical order and physical order of the index pages.

The phone directory is one of the common examples of the index. Let’s use a phone directory to understand an index and the fragmentation of an index. Let's say the name, address, and phone number of all the people living in a city is printed in a phone directory in the order of the full name.

In SQL Server, the phone directory information is stored in a table with name, address, and phone number beings its columns. A clustered index is created on the Full Name column, that being said, the phone directory details are sorted in Full Name order.

Internally, SQL Server stores everything in 8KB pages. Based on the size of the data, table data will be stored in one or more pages. Let's assume, the phone directory data is stored in 9 pages and all the pages are full except the last page as illustrated in Image 1. For simplicity, only the full name is printed in the image but the actual data row will be stored in the leaf level pages of the clustered index.

Image 1: Phone Directory Index

Let's assume Ashlee Wilson newly moved to the city and now we have to add her to the phone directory. Actually, Ashlee Wilson's data should be stored on the first page but currently, it didn’t have room for a new row. In this situation, SQL Server adds a new page and moves 50% of the data to the new page. The first 2 entries will remain on the first page and the last 2 entries on Page 1 will be moved to the newly added Page 10.

The first record on Page 10 is Asley Bernick and Ashlee Wilson should come before Asley Bernick. Since the first page has space now, Ashlee Wilson is stored on the first page itself. Now the index pages look like Image 2.

Image 2: Phone Directory after adding a new row

Logically, Page 10 is the next page of Page 1 based on the index key (Full name) values but physically it is the last page of the index. The logical page order of this index is Page 1 → Page 10 → Page 2 … → Page 9 but the physical order of the index in the disk is Page 1 → Page 2 … → Page 9 → Page 10. This difference is called External Fragmentation.

While reading data, after reading from Page 1 SQL Server will have to go all the way to Page 10 and comes back to Page 2, this increases the IO cost.

Removing the existing row can also cause fragmentation in an index. When Byju Koshal moved to a different city, we have to remove her record from the directory. If we remove Byju Koshal from Page 2, it creates a vacant space on the Page. This empty space in between is called Internal Fragmentation.

Not only does add and remove the data causes fragmentation, but updating the existing data can also cause fragmentation. Let’s say Arthi Rajan gets a second phone. To enter 2nd phone number, there is no space on Page 1 when the index looks like Image 1. SQL Server adds a new page and moves 50% of the data to the new page. Now, there is a room on the first page to add the 2nd phone number of Arthi Rajan. Again the logical and physical order of index pages is different.

Proper Table design, Index design, and regular Index maintenance activities like Index re-organize, Index Rebuild will fix Index fragmentation.

--

--