Backup Compression

Backup Compression is a feature of SQL Server 2008 and later (2008, 2008 R2, 2012) that offers benefits without any cost to your data or its integrity.

Despite having been through a data compression process, a compressed backup actually has a few advantages over a regular backup.

Benefits of backup compression

Because a compressed backup file is smaller than an uncompressed backup of the same data, compressing a backup usually requires less device I/O, which typically translates into a significantly faster backup.
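As a minimal sketch of requesting compression for a single backup (the database name and target path below are illustrative placeholders), you add the WITH COMPRESSION option to the BACKUP statement:

```sql
-- Back up a database with compression enabled for this backup only.
-- 'AdventureWorks' and the target path are placeholders.
BACKUP DATABASE AdventureWorks
TO DISK = N'X:\Backups\AdventureWorks.bak'
WITH COMPRESSION, STATS = 10;  -- STATS reports progress every 10 percent
```

If you want compression to be the default for all backups on the instance, the server-level setting can be changed via sp_configure 'backup compression default'.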

Restrictions of backup compression

Along with these benefits, there are a few restrictions to keep in mind:

  • A media set cannot contain backups that are both compressed and uncompressed.
  • Versions of SQL Server which are older than 2008 cannot read compressed backups.
  • NTbackups cannot share a tape with compressed SQL Server backups.

Performance implications of backup compression

By its nature, backup compression requires additional CPU, and the extra processing can significantly increase CPU usage. This might in turn have a negative effect on other operations running concurrently with the backup.

In this situation, if you do not want to trade off much of your CPU time for compression, you can create a low-priority compressed backup in a session whose CPU usage is limited and managed by Resource Governor.
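A minimal Resource Governor sketch follows; all names (the pool, workload group, and login) are illustrative placeholders, and the classifier function must be created in the master database. Sessions connecting as the dedicated backup login are routed into a pool capped at 20% CPU:

```sql
-- Cap CPU for a dedicated backup session (all names are placeholders).
CREATE RESOURCE POOL pBackupPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP gBackup USING pBackupPool;
GO
-- Classifier routes sessions of a dedicated backup login into the group.
CREATE FUNCTION dbo.fnBackupClassifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    IF SUSER_SNAME() = N'backup_login'
        RETURN N'gBackup';
    RETURN N'default';
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnBackupClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

Any compressed backup run from a session logged in as that login then competes for at most the capped share of CPU, leaving the rest for concurrent workloads.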

To get a picture of your backup I/O performance, you can isolate the backup I/O to or from the relevant devices and evaluate performance counters such as:

  • Windows I/O performance counters, such as the physical disk counters
  • The Device Throughput Bytes/sec counter of the SQLServer:Backup Device object
  • The Backup/Restore Throughput/sec counter of the SQLServer:Databases object
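As a quick sketch, the SQL Server-side counters above can also be read from T-SQL through the sys.dm_os_performance_counters DMV (note that the object_name values carry an instance-specific prefix, so filtering on counter_name alone is the simplest approach):

```sql
-- Read backup-related throughput counters exposed by SQL Server.
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Backup/Restore Throughput/sec',
                       N'Device Throughput Bytes/sec');
```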

Calculate the Compression Ratio of a Compressed Backup

One metric that shows how efficient a backup is, is its compression ratio. To calculate the compression ratio of a backup, use the values available in the backup_size and compressed_backup_size columns of the backupset history table, as follows:

SELECT backup_size / compressed_backup_size FROM msdb.dbo.backupset

The result of this T-SQL query is the compression ratio of your backup. As a quick example, a compression ratio of about 4:1 indicates that you are saving about 75% of the disk space (1 − 1/4 = 75%).
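A slightly more useful variant of the query above (a sketch; the column list is trimmed to the essentials) reports the ratio per backup alongside the database name and date, and guards against division by zero:

```sql
-- Compression ratio per backup, most recent first.
SELECT database_name,
       backup_finish_date,
       backup_size / NULLIF(compressed_backup_size, 0) AS compression_ratio
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC;
```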

The compression ratio of a compressed backup depends directly on the data being compressed. A variety of factors can impact the compression ratio you get with your backup. The primary factors are:

  • The type of data in the columns/tables of your database. For example, character data compresses more efficiently than other types of data.
  • The consistency of your data among rows on a page. Typically, when a page contains multiple rows in which a field holds the same value, significant compression might occur for that value. By contrast, for a database that contains random data, or one that contains only one large row per page, a compressed backup would be approximately the same size as an uncompressed backup.
  • Whether the data is encrypted. Encrypted data has a lower compression rate than its equivalent unencrypted data. If transparent data encryption is used to encrypt an entire database, compressing its backups might not reduce their size by much, or even at all.
  • Whether the database is compressed. If the database itself is already compressed, compressing the backup is unlikely to reduce its size by much, if at all.

Allocation of space for the Backup file

For backups created with compression, the actual size of the final backup file is very much dependent on how compressible the data in the database is, and this is unknown before the backup operation finishes.

Therefore, by default, when backing up a database with compression, the Database Engine estimates a size for the backup file using a pre-allocation algorithm. The algorithm pre-allocates a predefined percentage of the current size of the database for the backup file.

If more space is needed during the backup, the Database Engine grows the file. If the final size of the compressed backup is less than the allocated space, the Database Engine shrinks the file to the actual final size at the end of the operation, freeing the additional space it does not require.

To allow the backup file to grow only as needed to reach its final size, you can use trace flag 3042. This trace flag instructs the backup operation to bypass the default backup compression pre-allocation algorithm. It is useful only if you need to save space by allocating only the actual size required for the compressed backup.

However, using this trace flag can cause a minor performance penalty, in the form of a possible increase in the duration of the backup operation.
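A minimal sketch of enabling the flag around a backup follows (the database name and path are placeholders; enabling the flag globally with DBCC TRACEON affects all backups until it is turned off):

```sql
-- Enable trace flag 3042 so the backup file grows only as needed,
-- skipping the pre-allocation algorithm.
DBCC TRACEON (3042);

BACKUP DATABASE AdventureWorks  -- placeholder database name
TO DISK = N'X:\Backups\AdventureWorks.bak'
WITH COMPRESSION;

DBCC TRACEOFF (3042);
```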

Helpful tutorials

  1. How To: Configure Backup Compression
