I was recently involved in a large SharePoint MOSS project for a major Swedish technical consulting firm. They have offices spread across Europe and participate in global projects. The new SharePoint project involved a process management tool and project collaboration areas, both meant to make their daily work easier. The goal of the entire project, however, was to provide a tool that made it easy to re-use solutions and spread experience throughout the organization, i.e., make it searchable. SharePoint was the perfect platform for them! One of the major challenges with this project is the huge amount of data they produce: CAD drawings and documentation add up to somewhere around 3 TB each year!
What does this have to do with SQL Server 2008 binary data performance, you may ask (or you already know, because you have SharePoint experience). SharePoint stores all its content in a SQL Server database. Normally that's not a problem, since people mostly share documents and presentations well under 50 MB each. The default limit in a standard SharePoint MOSS / WSS 3.0 installation is 50 MB per file. In a service pack, however, Microsoft made it possible to increase this limit to 2 GB per file. 2 GB is a lot, and arguably files that big shouldn't be stored in a database, at least not a SQL Server 2005 database.
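Raising the limit involves more than the SharePoint setting itself. The maximum upload size is configured per web application in Central Administration, but the ASP.NET request limit must also be raised in the web application's web.config. A minimal sketch, assuming a default virtual directory layout (the path and timeout value are illustrative, not from the original project):

```xml
<!-- web.config of the SharePoint web application, typically under
     C:\Inetpub\wwwroot\wss\VirtualDirectories\<port> -->
<configuration>
  <system.web>
    <!-- maxRequestLength is in KB; 2097151 KB is roughly 2 GB.
         executionTimeout (seconds) must also be generous for
         multi-gigabyte uploads over slow WAN links. -->
    <httpRuntime maxRequestLength="2097151" executionTimeout="86400" />
  </system.web>
</configuration>
```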
Even though the production servers are extremely powerful, there are problems uploading 2 GB files to SharePoint. The biggest problem is of course the WAN links. We measured upload times from offices around Sweden, and the results showed that (a) the biggest problem was slow WAN links, some slower than 10 Mbit/s, and (b) a lot of time is lost apart from the network transfer itself. My job was to find out how to minimize the time lost apart from the actual network transfer.
I first did some tests uploading files to my development server running WSS 3.0. From the bandwidth usage on that machine I quickly realized that the problem was the time it took SharePoint to store the file in the database. I whipped together a tool to measure the time it takes to store a binary file in a SQL Server database. The results were surprising.
I set up three virtual machines on our SQL 2008 Hyper-V server, each with an identical amount of resources:
The measurements in the graphs are from my application, which loads the entire file into RAM and then stores it in the SQL database. Only the time for the SQL command to complete is measured, nothing else. The column type for all results except SQL 2008 FILESTREAM is image. The application ran locally on each machine but connected through TCP. "Preallocated" means that the database file size was increased beforehand so the entire file would fit without the database having to grow; for the other results the database has to grow for the file to fit.
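The core of the harness can be sketched in a few lines of C#. This is my reconstruction, not the original tool: the connection string, table, and column names are assumptions, but it shows the essential point — the file is fully in memory before the stopwatch starts, so only the INSERT is timed:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;
using System.IO;

class BlobTimer
{
    static void Main(string[] args)
    {
        // Entire file in RAM first, so disk reads don't pollute the timing.
        byte[] data = File.ReadAllBytes(args[0]);

        using (var conn = new SqlConnection(
            "Server=tcp:sqltest;Database=BlobTest;Integrated Security=SSPI"))
        {
            conn.Open();
            // Assumed schema: CREATE TABLE Blobs (Id INT IDENTITY, Data image)
            var cmd = new SqlCommand("INSERT INTO Blobs (Data) VALUES (@d)", conn);
            cmd.Parameters.Add("@d", SqlDbType.Image).Value = data;
            cmd.CommandTimeout = 0; // multi-gigabyte inserts can take minutes

            var sw = Stopwatch.StartNew(); // time the SQL command only
            cmd.ExecuteNonQuery();
            sw.Stop();

            Console.WriteLine("Insert took {0} ms", sw.ElapsedMilliseconds);
        }
    }
}
```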
I was surprised to see how poorly SQL 2005 performs; even in x64 mode it's pretty bad. SQL 2008 improves performance significantly over SQL 2005 for the same data type (image), and SharePoint MOSS 2007 / WSS 3.0 does run on SQL 2008. However, most people are probably running SQL 2005 on a 64-bit architecture, and by increasing the size of the database up front to prevent constant growth you can increase binary data write speed by an impressive 30%. Remember, however, that a larger database can reduce overall performance. The default setting for SharePoint content databases is to grow by 1 MB; if you plan to store huge amounts of data, consider changing it to grow by a percentage instead.
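The growth setting can be changed with a couple of T-SQL statements. A minimal sketch, where the database and logical file names are placeholders for your actual content database:

```sql
-- Check the current growth settings for the content database's files:
SELECT name, growth, is_percent_growth
FROM sys.database_files;

-- Switch from fixed 1 MB increments to 10% growth, so large uploads
-- trigger far fewer (but larger) grow operations:
ALTER DATABASE WSS_Content
MODIFY FILE (NAME = 'WSS_Content', FILEGROWTH = 10%);
```

The same effect can be had through the file properties dialog in SQL Server Management Studio.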
Another interesting finding is how fast FILESTREAM really is. I'm eagerly awaiting the next version of SharePoint!
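For reference, this is roughly what a FILESTREAM-backed table looks like. The names are placeholders, and it assumes FILESTREAM is enabled on the instance and the database already has a filegroup declared with CONTAINS FILESTREAM:

```sql
-- FILESTREAM requires a UNIQUEIDENTIFIER ROWGUIDCOL column with a
-- unique constraint; the varbinary(max) data itself is stored as
-- files on the NTFS volume rather than in the database pages.
CREATE TABLE dbo.Documents
(
    Id   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    Data VARBINARY(MAX) FILESTREAM NULL
);
```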
This was just the performance tuning part of the solution for storing large files. We came up with a rather ingenious solution to the problem; perhaps I'll write more about that in a later post.
Here’s a screenshot of my performance measuring tool.
With this blog I try to provide useful tips and solutions for programming .NET, Objective-C and more. My name is Björn Sållarp, and I love writing code.