Archiving is one of the most practical ways to keep your database smaller and, best of all, manageable.
Archived data can still be indexed, backed up, and restored at any time. A common benefit of archiving is that it frees up a lot of storage space, which helps your systems and devices perform better than before.
However, if you are curious about how SQL Server stores data, let us tell you that a Microsoft SQL Server database is stored on disk in at least two files: a data file and a log file.
Data files come in two types: primary and secondary.
The primary data file holds the startup information for the database and points to the other files in the database. User data and objects can also be stored in this file. Every database has exactly one primary data file.
The secondary data files, on the other hand, are optional. They can be used to spread data across multiple files, for example by putting each file on a different disk drive.
A SQL Server database can have many data and log files, but only one of them can be the primary data file. Separate from the files themselves are filegroups, which work as logical containers for the data files.
Because archiving involves partitioning the data and then moving it, the whole process does require some understanding. It starts by exporting the data from the source database to a staging area, which should be a database residing on a different SQL Server instance.
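The export step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the source and staging SQL Server instances, and the table names `orders` and `orders_archive`, their columns, and the helper names are all hypothetical.

```python
import sqlite3


def archive_old_rows(source, staging, cutoff):
    """Copy rows older than `cutoff` to staging, then delete them from source."""
    rows = source.execute(
        "SELECT id, order_date, amount FROM orders WHERE order_date < ?",
        (cutoff,),
    ).fetchall()
    # Write to the staging database first, so a failure here leaves
    # the source data untouched.
    with staging:
        staging.executemany(
            "INSERT INTO orders_archive (id, order_date, amount) VALUES (?, ?, ?)",
            rows,
        )
    # Only after the copy has been committed are the rows removed.
    with source:
        source.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    return len(rows)


def make_demo_databases():
    """Build two in-memory databases with hypothetical sample data."""
    source = sqlite3.connect(":memory:")
    staging = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    staging.execute(
        "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2019-03-01", 10.0), (2, "2020-07-15", 20.0), (3, "2023-01-10", 30.0)],
    )
    source.commit()
    return source, staging
```

Copying before deleting is the design choice that matters here: if the copy fails, nothing has been removed from the source yet. A real SQL Server setup would use its own export tooling, which this sketch does not attempt to reproduce.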
The disk space allocated to a data file is divided into pages, and the page is the fundamental unit of data storage in SQL Server.
Each database page is 8 KB in size. When you insert data into a SQL Server database, the data is saved to a series of pages in the data file.
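To get a feel for the numbers, here is a small back-of-the-envelope sketch in Python. It only uses the 8 KB page size; the function names are made up for illustration, and `rows_per_page` is an assumed input because the real figure depends on row size and page overhead.

```python
PAGE_SIZE_BYTES = 8 * 1024  # one SQL Server page is 8 KB


def pages_in_file(file_size_mb):
    """Number of 8 KB pages that fit in a data file of the given size in MB."""
    return (file_size_mb * 1024 * 1024) // PAGE_SIZE_BYTES


def pages_needed(row_count, rows_per_page):
    """Pages required for `row_count` rows, rounding up to whole pages."""
    return -(-row_count // rows_per_page)  # ceiling division
```

For example, a 1 MB data file holds 128 pages, so a 100 MB file holds 12,800 pages.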
When multiple data files exist within a filegroup, SQL Server allocates pages across all of the data files using a round-robin mechanism.
This means that if we insert data into a table, SQL Server allocates pages first to data file 1, then to data file 2, and then back to data file 1. This allocation is governed by an algorithm called Proportional Fill.
The algorithm is used when allocating pages so that space is consumed across all of the data files at around the same rate. It determines how much data should be written to each data file in a multi-file filegroup based on the amount of free space each file contains.
As a result, files with more free space receive more of the new pages, and all the files become full at around the same time.
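The idea behind Proportional Fill can be illustrated with a toy simulation in Python. This is a simplified model, not SQL Server's actual implementation: each new page goes to the file that still has the largest share of its initial free space unused. The function name and the file names in the example are hypothetical.

```python
from fractions import Fraction


def proportional_fill(free_pages, pages_to_allocate):
    """Distribute new pages across files in proportion to their free space.

    free_pages maps a file name to its number of free pages.
    Returns a dict of file name -> pages allocated to that file.
    """
    remaining = dict(free_pages)
    allocated = {name: 0 for name in free_pages}
    for _ in range(pages_to_allocate):
        candidates = [name for name in remaining if remaining[name] > 0]
        if not candidates:
            raise ValueError("no free space left in the filegroup")
        # Pick the file with the largest fraction of its initial free
        # space still unused; exact fractions avoid rounding surprises.
        target = max(
            candidates,
            key=lambda name: Fraction(remaining[name], free_pages[name]),
        )
        remaining[target] -= 1
        allocated[target] += 1
    return allocated
```

In this model, two files with equal free space simply alternate, which matches the round-robin picture. With 300 free pages in `data1` and 100 in `data2`, 40 new pages are split 30 to 10, so both files head toward full at the same pace.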
For the physical storage of a table, the rows are divided into a series of partitions, and the partition boundaries are defined by the user. By default, however, all the rows are in a single partition.
A table can be split into multiple partitions to spread a database over a cluster, and the rows in each partition are stored in either a heap or a B-tree structure. If the table has an associated clustered index that allows the rows to be retrieved faster, the rows are stored in order of their index values, and a B-tree provides the index.
The leaf nodes of the tree contain the data, whereas the rest of the nodes store the index values that lead to the leaf data reachable from them.
If the index is non-clustered, the rows aren’t sorted as per the index keys, although the index itself has the same B-tree structure as that of a clustered index.
A table that doesn’t have a clustered index is stored in an unordered heap structure. Both heaps and B-trees can span multiple allocation units.
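The difference between scanning a heap and using an index can be sketched in Python. This is a loose analogy rather than a real B-tree: a sorted list of keys searched with `bisect` stands in for the ordered index, and a plain list of tuples stands in for the heap. All function names are hypothetical, and the first element of each row is assumed to be the key.

```python
import bisect


def heap_lookup(heap_rows, key):
    """Scan every row in turn, as an unordered heap requires."""
    for row in heap_rows:
        if row[0] == key:
            return row
    return None


def build_index(heap_rows):
    """Build a sorted key list plus each key's row position in the heap."""
    entries = sorted((row[0], pos) for pos, row in enumerate(heap_rows))
    keys = [key for key, _ in entries]
    positions = [pos for _, pos in entries]
    return keys, positions


def index_lookup(heap_rows, index, key):
    """Binary-search the sorted keys, then fetch the row from the heap."""
    keys, positions = index
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return heap_rows[positions[i]]
    return None
```

The rows in `heap_rows` stay in their original, unsorted order. Only the index is ordered, which mirrors how a non-clustered index sits alongside the table rather than rearranging it.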