Man, you are simply brilliant
Wow, thanks
A few years ago, I had a customer who was taking uncompressed backups. The databases were quite large, and they had a huge Disk IO issue.
One time we were forced to restore from the last backup, and the restore failed due to timeout.
I had to transfer the backup to my own server which had better performance and no IO issues, where the backup restored fine.
I then proceeded to take a compressed backup of their data from my server, and transfer it back.
The compressed backup restored without any issues.
Ever since then, I always recommend compressed backups to all my customers.
I do recommend compressed backups; however, there is no clear relationship with performance.
@@PinalDaveSQLAuthority I can’t remember the size of the database, but the restore took over 5 hours to complete. How would you explain that the compressed backup of the same database restored fine while the uncompressed one didn’t?
I believe the same will apply to striped backups as well, or is there anything specific about them worth discussing?
Thanks!
Thanks for sharing Pinal.
Welcome
We handle huge databases, so compressed backups are a feature we cannot imagine working without. Storage, network bandwidth, cloud sharing, and more are all directly impacted by backup compression. In a world of SSDs instead of ordinary SATA hard disks, it makes a big difference as well.
Well said
This is a well-produced video, but I have a problem with a blanket recommendation to use backup compression or not just based on these results. The correct decision will depend on your data, your hardware, and storage infrastructure. With the current Microsoft backup compression algorithm, compressing the data before it is written to backup file(s) causes some CPU overhead while the backup is running.
If the logical drive(s) where the backup files are being written have relatively low sequential write performance, then the actual elapsed time for a compressed backup will usually be significantly lower because less data is being written. The sequential write performance is the bottleneck, which is a fairly common scenario.
If the destination for the backup file(s) has very good sequential write performance, then compressing the backup can make the elapsed time significantly higher than an uncompressed backup. This is not as common with real database servers, but it definitely can happen. People should test both ways in their environments to see what kind of results they get.
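A minimal sketch of that test-both-ways approach, assuming a hypothetical database name and backup path; the elapsed time and MB/sec figures printed by STATS make the two runs easy to compare:

-- Uncompressed baseline
BACKUP DATABASE [YourDatabase]
TO DISK = N'D:\Backup\YourDatabase_uncompressed.bak'
WITH NO_COMPRESSION, INIT, STATS = 10;

-- Same destination drive, compressed
BACKUP DATABASE [YourDatabase]
TO DISK = N'D:\Backup\YourDatabase_compressed.bak'
WITH COMPRESSION, INIT, STATS = 10;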
Thanks for the detailed note.
I have a question if you can help: we have an Oracle linked server and run an hourly job that drops and inserts into a temp table from an Oracle OPENQUERY call, which normally completes in 3-4 minutes. But sometimes the job hangs, and the only way to stop the hung job is to close the TCP session and re-run it (I am unable to find the cause). This happens 2-3 times a week. Any thoughts?
I am not an expert in this area.
Try comparing backups of big databases, e.g. 100 GB. My own experience is that on a big database the compressed backup is faster than the uncompressed one.
I will be interested in watching that video.
Is there any flexibility to specify the compression percentage while taking a backup in SQL Server?
Currently no flexibility.
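The ratio itself cannot be specified, but you can check what ratio each backup actually achieved from the msdb history tables; a quick sketch:

-- backup_size and compressed_backup_size are in bytes
SELECT TOP (20)
       database_name,
       backup_start_date,
       backup_size / 1048576.0            AS backup_size_mb,
       compressed_backup_size / 1048576.0 AS compressed_size_mb,
       backup_size * 1.0 / NULLIF(compressed_backup_size, 0) AS compression_ratio
FROM msdb.dbo.backupset
ORDER BY backup_start_date DESC;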
I've been testing compressed backups on TDE databases (2016, defining MAXTRANSFERSIZE > 64 KB), but I'm only seeing a ~20% decrease in file size. These DBs are over 300 GB, some much larger. Is there a point where the original file size is just too much for backup compression?
This is indeed very subjective, and there is no single right answer to this question.
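For reference, a compressed backup of a TDE-enabled database (SQL Server 2016+) needs MAXTRANSFERSIZE above 64 KB, roughly like this (hypothetical database name and path):

BACKUP DATABASE [YourTDEDatabase]
TO DISK = N'D:\Backup\YourTDEDatabase.bak'
WITH COMPRESSION, MAXTRANSFERSIZE = 2097152, STATS = 10;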
Does compression significantly affect the CPU load? Or does the faster write to disk cause more CPU usage? I'm curious about it.
I have done multiple tests; there is no conclusive result.
Hi Pinal!
One thing that improves compressed backup throughput is to create the backup with multiple files to take advantage of parallel compression. At least, I observed that in a (lazy) test I performed (see the sketch after the results below).
Edit to add test results:
Environment:
SQL 2019 Developer
StackOverflow database modified by Brent Ozar (2018, I think...) - 160 GB allocated
Fast Disk (PCI Gen 4)
Backups taken to the same drive as the data files.
Results:
NOT COMPRESSED - SINGLE FILE:
BACKUP DATABASE successfully processed 20932266 pages in 94.620 seconds (1728.316 MB/sec).
160 GB bak file
COMPRESSED - SINGLE FILE:
BACKUP DATABASE successfully processed 20932266 pages in 304.049 seconds (537.851 MB/sec).
47 GB bak file
NOT COMPRESSED - 4 FILES:
BACKUP DATABASE successfully processed 20932266 pages in 119.746 seconds (1365.668 MB/sec).
4 x 40 GB bak files
COMPRESSED - 4 FILES:
BACKUP DATABASE successfully processed 20932266 pages in 116.200 seconds (1407.343 MB/sec).
4 x ~12 GB bak files
So, as Steve Earle said, it all seems to depend on your involved hardware.
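A minimal sketch of that multi-file approach, striping one compressed backup across four files (hypothetical paths):

BACKUP DATABASE [StackOverflow]
TO DISK = N'D:\Backup\StackOverflow_1.bak',
   DISK = N'D:\Backup\StackOverflow_2.bak',
   DISK = N'D:\Backup\StackOverflow_3.bak',
   DISK = N'D:\Backup\StackOverflow_4.bak'
WITH COMPRESSION, INIT, STATS = 10;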
Thanks for sharing.
I'm intrigued by this as I've always thought compressed was quicker.
I've just done some tests on AdventureWorks 2014:
to C:\ no compression: 390MB/s
to C:\ with compression: 214MB/s
to UNC path no compression: 2.4MB/s
to UNC path with compression: 9.5MB/s
I guess it depends what the particular bottlenecks are (the network is the IO bottleneck here, which means the IO saving makes compression faster in that scenario).
Of course. Great point
Hi Sir, will a split backup be faster?
It should
In my opinion, performance does matter with backups.
If we are working on a data warehouse system in the simple recovery model that runs large updates/inserts, and the backup takes longer, those operations can really increase the size of the log file.
Thanks 👍
The log files won’t be bigger; they’ll just stay full for longer. 🤔
@@MiningForPies and if they fill and autosize is enabled, the size will increase
@@steveearle9678 assuming you mean autogrow, I wouldn’t like to think there are people out there still autoshrinking log files on a regular basis 😮
@@MiningForPies yes, I mean autogrow. If the log is sized appropriately and then a backup goes slower than it ever has before, the log may grow.
I know because I have felt the pain
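One way to watch this while a long backup is running is to check log usage and the log reuse wait for the database; a sketch, run in the database in question:

-- How full is the log right now?
SELECT DB_NAME(database_id) AS database_name,
       total_log_size_in_bytes / 1048576.0 AS log_size_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;

-- Why can't the log truncate? (shows ACTIVE_BACKUP_OR_RESTORE while the backup runs)
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();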
Compressed vs Uncompressed becomes really important when you are shipping your backups to the cloud over a network. Backup and restore times I don’t care about 🤔.
Would also be interesting to add verify options (both in terms of during the backup and RESTORE VERIFYONLY) 🤔🤔
Totally interested.
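For the verify side mentioned above, a minimal sketch with hypothetical names and paths: take the backup with page checksums, then validate the file without actually restoring it:

BACKUP DATABASE [YourDatabase]
TO DISK = N'D:\Backup\YourDatabase.bak'
WITH COMPRESSION, CHECKSUM, INIT, STATS = 10;

RESTORE VERIFYONLY
FROM DISK = N'D:\Backup\YourDatabase.bak'
WITH CHECKSUM;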
We are about to do the same; unfortunately we are only seeing about a 20% decrease in file size with compression on our 300+ GB files. I'm wondering if it's because the DB file size is so large.
@@rebekahw8601 A lot of it depends on how variable your data is and on the data types.