tempdb

NavStudent Member Posts: 399
edited 2009-05-28 in SQL Performance
Hello. I was reading this top 10 best practices document and saw this:

For the TEMPDB database, create 1 data file per CPU.
Lining up the number of data files with CPU’s has scalability advantages for allocation intensive workloads.
It is recommended to have .25 to 1 data files (per filegroup) for each CPU on the host server.
This is especially true for TEMPDB where the recommendation is 1 data file per CPU.
Dual core counts as 2 CPUs; logical procs (hyperthreading) do not.

http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/storage-top-10.mspx


So basically one data file per CPU, each having its own filegroup?
Is there a configuration so that each CPU will utilize its own data file?
my 2 cents

Comments

  • Waldo Member Posts: 3,412
    To be honest, I don't think this is important for a NAV db, because it hardly uses tempdb.

    Eric Wauters
    MVP - Microsoft Dynamics NAV
    My blog
  • NavStudent Member Posts: 399
    That's not what I hear.
    I've been told that in SQL 2005, tempdb is used more, especially for Navision since it uses cursors.
    my 2 cents
  • Marije_Brummel Member, Moderators Design Patterns Posts: 4,262
    There are some cases where TEMPDB is heavily used when SQL decides to use a cached execution plan when it shouldn't. In that case it is used for sorting.

    If you have a correctly working database without this bug, TEMPDB should not be used heavily.

    Right now I am hoping that somebody at Microsoft will finally wake up and solve this issue once and for all; basically it is the last major issue for a good running system on SQL.

    Of course there is the wish list, like being able to use included columns from the NAV key designer and having the possibility to select only a few columns instead of the *, but that has all been put on the list.

    Using different filegroups in SQL can be very dangerous because all indexes are by default put in one filegroup.

    If you are going to split the filegroups you have to decide which index goes where. This can be very difficult to decide because of the characteristics of an ERP system.

    In an ERP system, indexes are used for so many different processes that it is very difficult to make such decisions, and the result can be that the load per filegroup is very uneven.

    My recommendation would be to leave the striping to SQL, which does a pretty good job with it.

    Good luck.
  • NavStudent Member Posts: 399
    I could create separate data files all in the same filegroup.
    That way I don't have to worry about indexes.

    The recommendation goes as low as .25 data files per CPU (per filegroup), so one file can cover up to 4 CPUs.
    my 2 cents
  • bbrown Member Posts: 3,268
    The steps you mention are common practices with very large databases (VLDBs). The typical Navision install does not qualify as a VLDB in the world of SQL. Instead of creating multiple files, just stick a few more disks on the RAID 10 and let SQL handle the striping.

    Also, when creating multiple files there is only an advantage if each can be placed on its own disk array. Placing them on the same array defeats the purpose, since they would be competing for the same resource. Also, never create more files than you have CPUs.
    There are no bugs - only undocumented features.
  • You're mixing up files and filegroups. Best practice for all databases is to create one file per CPU core per filegroup. So on a 4-way dual-core box you'd create 8 files within the primary filegroup for tempdb. Splitting filegroups is something else altogether. Files give you performance gains because you can only have one read/write thread per file. Typically you'd use a script like this:

    USE master
    GO
    -- Resize and/or relocate the existing primary tempdb data file.
    ALTER DATABASE tempdb MODIFY FILE
        (NAME = tempdev, FILENAME = 'L:\MSSQL\data\tempdb_db1.mdf', SIZE = 60MB, FILEGROWTH = 20MB);
    GO
    -- Add seven more equally sized data files: one file per core on an
    -- 8-core box. Equal sizes keep SQL's proportional fill even.
    ALTER DATABASE tempdb ADD FILE
        (NAME = tempdev2, FILENAME = 'L:\MSSQL\data\tempdb_db2.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev3, FILENAME = 'L:\MSSQL\data\tempdb_db3.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev4, FILENAME = 'L:\MSSQL\data\tempdb_db4.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev5, FILENAME = 'L:\MSSQL\data\tempdb_db5.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev6, FILENAME = 'L:\MSSQL\data\tempdb_db6.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev7, FILENAME = 'L:\MSSQL\data\tempdb_db7.mdf', SIZE = 60MB, FILEGROWTH = 20MB),
        (NAME = tempdev8, FILENAME = 'L:\MSSQL\data\tempdb_db8.mdf', SIZE = 60MB, FILEGROWTH = 20MB);
    GO

    You really only need to do this on a busy OLTP system.
  • David_Singleton Member Posts: 5,479
    bbrown wrote:
    The steps you mention are common practices with very large databases (VLDB). The typical Navision install does not qualify as a VLDB in the world of SQL. ...

    I attended a session on SQL 2008 recently. They were discussing VLDBs. At first the data sizes did not seem large, until I realized that they were talking Tera not Giga. :shock:

    So the principles for deploying a 100 TB database are very different from those for a 100 GB DB.
    David Singleton
  • bbrown Member Posts: 3,268
    You're mixing up files and filegroups. Best practice for all databases is to create one file per CPU core per filegroup. ...

    No, I'm not confusing the two. Objects are created within filegroups. Filegroups may contain one or more files. It is true that SQL will assign one disk process per file per CPU, but this is only an advantage if the performance bottleneck is the CPU. If the performance bottleneck is the disk subsystem then this will not help. In fact, it may actually make things worse, since creating more disk processes will increase the I/O demand on the subsystem.

    You cannot make blanket decisions without first understanding where the performance bottlenecks are in a given system.
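
    One quick, hedged way to find that out is to look at the server's top wait types; a minimal sketch using sys.dm_os_wait_stats (SQL 2005 and later):

    -- Aggregated waits since the last restart. Heavy PAGEIOLATCH_* waits
    -- point at the disk subsystem; heavy SOS_SCHEDULER_YIELD points at CPU.
    SELECT TOP 10
        wait_type,
        waiting_tasks_count,
        wait_time_ms,
        signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;
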
    There are no bugs - only undocumented features.
  • The principles apply to the throughput on the database; a small database may be much busier than a large database, and in this instance the multi-file approach can be useful. Typically disk I/O occurs in the background, and on a striped (RAID) data set the files on the disks will aid throughput and help avoid queuing. If you have a quiet database and no RAID stripes then I absolutely agree. Remember that it's only very recently that we have become more blasé about CPUs and database sizes.
    It's always best to put any performance features in at the start of a project; trying to make changes a couple of years down the line can be very, very painful.
  • Nobody Member Posts: 93
    It is my guess and my opinion (for what it is worth) that these recommendations are based on the expectation of parallel execution of queries. Hence I believe it is not all that applicable to NAV installs, as NDBCS.DLL only sends tiny SQL statements to SQL Server, none of which require parallel execution. Actually, on a NAV database most of the time parallel execution should be turned off, as it can cause performance issues.
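
    For reference, capping parallelism instance-wide is done with sp_configure; a minimal sketch (whether MAXDOP 1 is right for a given install is a judgment call, not a given):

    -- 'max degree of parallelism' is an advanced option, so expose it first.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    -- 1 = no parallel plans at all; 0 (the default) = use all CPUs.
    EXEC sp_configure 'max degree of parallelism', 1;
    RECONFIGURE;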

    Only in very rare cases, with a >100 GB database and high transaction volume, have I seen any benefit in breaking up the NAV database into multiple files and filegroups and placing them on their own sets of disks. Normally, if you just add the disks to the primary RAID 10 data partition you were going to use to break up the database, you get the same if not better performance than splitting things up. Granted, there are scenarios that will be different. Breaking up the NAV database into different files and filegroups can be a painful and time-consuming process, as moving objects from filegroup to filegroup can be tricky.

    Just my $.02 worth.
  • No, it has nothing to do with parallelism; it's to do with multiple read/write threads to the disk file.
    If Navision doesn't use parallel scans then you wouldn't turn off parallelism; if the advice is to turn off parallelism, then Navision will be suffering from CXPACKET waits. If you're using the Standard edition of SQL then the parallelism setting isn't too critical; Enterprise benefits most, especially 2005. As I said, it's to do with throughput and decreasing latency on the system, which relates not only to the actual user activity but also to the underlying hardware; smaller systems can benefit too - it's all a matter of scale.
    By the way, if you're going to use multiple files with SQL 2000 you also have to enable a trace flag.
  • NavStudent Member Posts: 399
    So reading all these posts, I am gathering that creating separate data files per CPU core is the way to go for large DBs > 100 GB.
    my 2 cents
  • bbrown Member Posts: 3,268
    NavStudent wrote:
    So reading all these posts, I am gathering that creating separate data files per CPU core is the way to go for large DBs > 100 GB.

    You cannot make that blanket assumption. You must understand the performance bottlenecks of a particular system before deciding on configuration changes like this. I completely disagree that this should be implemented as a generic configuration.
    There are no bugs - only undocumented features.
  • I generally work in controlled environments where build standards are established which apply across the board. I will agree that this approach gives a more complex build; that said, it's much more difficult to modify a 100 GB database into a multi-file / multi-filegroup setup than it is to do this right from the word go. The same would apply to using partitioned tables, for example.
    I see that tempdb can represent up to 33% of my total transactions (dedicated server, SQL 2005), so tempdb is certainly being used. Tempdb is also used more within SQL 2005 than it was within SQL 2000.
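
    If you want to see what that tempdb activity actually consists of, sys.dm_db_file_space_usage (which reports on tempdb in SQL 2005) breaks the space down by usage type; a minimal sketch:

    -- Page counts * 8 = KB. Shows whether tempdb is holding user objects,
    -- internal objects (sorts, hashes, spools) or the version store.
    SELECT SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
           SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
           SUM(version_store_reserved_page_count)   * 8 AS version_store_kb
    FROM tempdb.sys.dm_db_file_space_usage;
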
    I admit that I consider multiple files in the same way as a number of other configuration options. It's quicker to defragment a number of smaller files than one large file, though, so it can enable you to manage NTFS fragmentation much more efficiently; just another point to ponder.
  • NavStudent Member Posts: 399
    Hmm, the tempdb will be on its own RAID 1 physical partition. There are 4 CPUs. It makes sense to set up a separate file for each CPU.
    my 2 cents
  • Nobody Member Posts: 93
    Colin Leversuch-Roberts

    Just about everything you have said is a little off in a NAV/SQL implementation. NAV is a client-side cursor-based app utilizing parameterization, not stored procedures, so parallelism is not necessary for NAV: NDBCS.DLL translates the C/AL code to single-line T-SQL statements, none of which require parallel execution, and it does not use SQL temp tables; the only thing tempdb is really used for in NAV is sorting. I have actually run side-by-side comparisons of separating NAV tables onto their own spindles versus just adding the new spindles to the existing RAID 10 array, and the difference was negligible; it all depends on the I/O patterns of the customer. I have only seen one case where it helped, and that was due to overzealous reporting. Breaking up the DB into the same number of files as CPUs is an old SQL 6.5 thing; I have never seen this anywhere in any MSSQL documentation.
  • NavStudent Member Posts: 399
    Nobody, take a look at my first post on this thread. It's what MS is recommending for SQL 2K5.
    my 2 cents
  • This is nothing to do with SQL 6.5; Microsoft raised the issue of single-threaded read/write to a database in SQL 2000. I suppose as more and more servers used more CPUs and the databases were picking up more users, this became more relevant; it's more of an O/S than a SQL thing. The use of a multi-file tempdb is quite common in SQL 2000, but it requires a trace flag, 1118, to make the round-robin process work correctly. I know that each Navision install is probably almost unique, but I do see up to 33% of transactions in tempdb, with an average of around 12% on my Navision install.
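
    For what it's worth, enabling that flag looks like this (a sketch; adding -T1118 as a startup parameter instead makes it survive a restart):

    -- Trace flag 1118 forces uniform-extent allocations, easing contention
    -- on the shared allocation pages of a multi-file tempdb on SQL 2000.
    DBCC TRACEON (1118, -1);  -- -1 applies the flag globally
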
    Parallel plans and parallelism have nothing to do with stored procedures; ad-hoc T-SQL / application-generated SQL, especially SQL that in the main starts SELECT *, is more likely to generate a parallel plan than a stored proc. There's lots of info on how the optimiser calculates its plans, and at what stage it decides to use a parallel plan, on MSDN and TechNet, as well as in the various "Inside" SQL books.
    I don't see much adverse parallelism on my server, and what there is I expect to cure with optimal indexing - or a plan guide.
  • mailarun2005 Member Posts: 1
    Thanks for the valuable information given above. You will find some technical information on optimizing tempdb in the article below.
    http://www.sqllion.com/2009/05/optimizi ... rver-2005/
  • larrold Member Posts: 1
    The tempdb recommendation has nothing to do with multi-threaded performance. It is actually due to the nature of tempdb's workload: tempdb is constantly creating and dropping temporary objects. This causes contention on the various allocation structures (PFS, IAM, GAM, SGAM pages) as multiple sessions hit them all at once. Typical databases don't have this issue, since you aren't creating that many objects at once. By splitting tempdb into multiple files, you create new allocation structures - one set per file - and can minimize contention IF it was a problem. Even so, one per CPU isn't always necessary. Also, they made some enhancements in SQL 2005 to mitigate this problem, although it can still occur with extreme workloads.

    See Paul Randal's blog post here: http://www.sqlskills.com/BLOGS/PAUL/cat ... stems.aspx
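
    A hedged way to check whether that contention is actually occurring is to catch the latch waits live while the system is busy; a minimal sketch (SQL 2005 and later):

    -- Sessions currently waiting on latches for tempdb pages (database_id 2).
    -- resource_description is 'dbid:fileid:pageid'; per file, page 1 is the
    -- PFS, page 2 the GAM and page 3 the SGAM - the allocation hot spots.
    SELECT session_id, wait_type, resource_description
    FROM sys.dm_os_waiting_tasks
    WHERE wait_type LIKE 'PAGELATCH%'
      AND resource_description LIKE '2:%';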

    Also, regarding tempdb usage in 2005: it is true it went up, but that is dependent on your workload. This is mostly due to the row versioning that was added. For NAV, you will see more tempdb usage with pre-5.0 clients, since the SIFTs were maintained using triggers instead of indexed views. In SQL 2005, the inserted and deleted virtual tables used in triggers were changed to be implemented using row versioning instead of scanning the transaction log, so you'll see I/O patterns change. With NAV 5.0, these are implemented using indexed views instead, so you won't see this additional usage (that is not to say you won't see any, depending on what temporary objects are needed as part of this maintenance; it just won't be significantly greater between versions).

    Aside from that, if you turn the READ COMMITTED SNAPSHOT isolation level option on for the database, you will also see increased tempdb activity.
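
    For completeness, that option is switched on per database like this (MyNavDb is a placeholder name):

    -- Readers under READ COMMITTED then use row versions kept in the
    -- tempdb version store, which is where the extra activity comes from.
    ALTER DATABASE MyNavDb SET READ_COMMITTED_SNAPSHOT ON;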
  • NaviDeveloper_NL Member Posts: 42
    You can easily check if you have an I/O problem on tempdb with the dynamic management view sys.dm_io_virtual_file_stats. This gives I/O statistics for data and log files (SQL Server 2005/2008).

    Example query:

    USE tempdb
    GO

    -- Per-file I/O statistics, including the average stall (latency) per
    -- read, per write and overall. High average stalls suggest an I/O problem.
    SELECT database_id,
           file_id,
           io_stall_read_ms,
           num_of_reads,
           CAST(io_stall_read_ms / (1.0 + num_of_reads) AS numeric(10,1)) AS [avg_read_stall_ms],
           io_stall_write_ms,
           num_of_writes,
           CAST(io_stall_write_ms / (1.0 + num_of_writes) AS numeric(10,1)) AS [avg_write_stall_ms],
           io_stall_read_ms + io_stall_write_ms AS [io_stalls],
           num_of_reads + num_of_writes AS [total_io],
           CAST((io_stall_read_ms + io_stall_write_ms) / (1.0 + num_of_reads + num_of_writes) AS numeric(10,1)) AS [avg_io_stall_ms]
    FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL)
    ORDER BY file_id DESC
    GO

    You can compare the results of this query with the results of the same query run against the user database.

    High usage of tempdb can be caused by bad indexing, for example queries with an ORDER BY for which no usable index exists. Sorting of big result sets can cause high usage of tempdb.
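
    As an illustration (the table and column names below are made up): without a suitable index, the first query has to sort its result set, potentially in tempdb, while a covering index that matches the ORDER BY lets SQL Server read the rows already in order:

    -- No index on [Posting Date]: the ORDER BY forces a sort step.
    SELECT [Entry No_], [Posting Date], [Amount]
    FROM [Cust_ Ledger Entry]
    ORDER BY [Posting Date];

    -- A covering index on the sort column removes the sort (SQL 2005+).
    CREATE INDEX ix_PostingDate
        ON [Cust_ Ledger Entry] ([Posting Date])
        INCLUDE ([Entry No_], [Amount]);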