Chris Haas's Blog

June 25, 2009

Fixing “query processor ran out of stack space”

Filed under: Uncategorized — Tags: , , — chrishaas @ 9:07 am

I’ve been normalizing a table with 98 million rows for a couple of days now and with a recent change I’ve been able to run my batches in groups of 100,000 instead of just 10,000. The program doing the normalization pulls down 100,000 records at a time, analyzes and groups them and performs an UPDATE for each item in the group, passing the primary keys back to the server in an IN () clause. Everything was fine for a while but with the upgrade to 100,000 and with some groupings sending thousands of keys in the IN clause I started to get the following error:

Internal Query Processor Error: The Query Processor Ran Out Of Stack Space During Query Optimization.

Microsoft says to insert the keys into a temp table and join with that but that doesn’t really work for me. Instead I just decided to break up the UPDATE query into ones with more manageable IN clauses.

Originally I had something like:


            Dim mySQL = String.Format("UPDATE Tbl SET Col2=xyz WHERE Col1 IN ({0})", Join(TheList.ToArray(), ","))
            Using Com As New SqlCommand(mySQL, con)
                Com.CommandType = CommandType.Text
                Com.ExecuteNonQuery()
            End Using

The new code breaks things into more manageable sizes:


        Dim AmmountToProcess = 1000
        Do While True
            Dim Y = TheList.Take(AmmountToProcess)
            Dim mySQL = String.Format("UPDATE Tbl SET Col2=xyz WHERE Col1 IN ({0})", Join(Y.ToArray(), ","))
            Using Com As New SqlCommand(mySQL, con)
                Com.CommandType = CommandType.Text
                Com.ExecuteNonQuery()
            End Using
            TheList.RemoveRange(0, Math.Min(TheList.Count, AmmountToProcess))
            If TheList.Count = 0 Then Exit Do
        Loop

The new code uses the wonderful Take extension method to grab 1,000 items. Take luckily doesn’t error if you ask for more than available. The smaller UPDATE query runs and then we remove the first 1,000 items (or however many are left) from the list. Repeat until the list is empty.

June 24, 2009

Make sure your SQL queries are using the proper index

Filed under: Uncategorized — Tags: , — chrishaas @ 11:00 am

I’ve got a live table with 98 million rows that I’m normalizing. Unforuntately all full batch updates tend to lock the system for a very long so I wrote a VB.Net program to perform updates in smaller batches. The program pulls 10,000 records that haven’t been updated by using the standard SELECT TOP 10000… syntax. Unfortunately the WHERE portion of this query takes a really long time to run. I started digging into things and tried just SELECT TOP 1… and that returned results immediately. I then tried SELECT TOP 10… and that returned immediately, too. After playing around I found that SELECT TOP 63… was the most I could go before the query took a long time to execute. I have no idea what’s so special about 63 but it was bugging me. So I ran SELECT TOP 63… and SELECT TOP 64… side-by-side in SQL SMS with the execution plan turned on and found the 63 version was using an Index Seek on the index that was on the column in my WHERE clause. This is a good thing. The 64 version, however, was performing a Clustered Index Scan on the primary key which had nothing to do with my WHERE clause.

The fix for this is to force SQL to use the index of your choice and its so easy to do. In your query, before the WHERE clause just add WITH (INDEX(INDEX_NAME)).

So if this is your original query:

SELECT TOP 10000 Col1, Col2 FROM YourTable WHERE Col3 IS NULL

And you have an index on Col3 called IDX__YourTable__Col3 you’d execute this query instead:

SELECT TOP 10000 Col1, Col2 FROM YourTable WITH (INDEX(IDX__YourTable__Col3)) WHERE Col3 IS NULL

Normally you shouldn’t have to do this but if you’re doing one-off things like I’m doing it comes in handy.

–Update:

The forced index is running so fast that I’ve actually changed my batch size from 10,000 to 100,000. The prior 10,000 was processing about 400 records/sec, most of the time spent trying to get results from the SELECT TOP… but with the forced index and changing to 100,000 I’m now averaging 3,500 records/sec. Nice.

April 8, 2009

SQL Server 2000 w/o the GUI

Filed under: Uncategorized — Tags: — chrishaas @ 9:21 pm

Okay, not totally, but this process starts without the GUI. I had two SQL servers, one without the GUI tools installed. In a domain environment this is fine because you can just snap to the machine using your domain credentials. If you don’t have a domain and forgot to create a local account (or can’t remember the sa password) before disconnecting from the domain, however, you’re in for a little trouble. Okay, its not that bad really. SQL Server 2000 installs a command line tool that you can use to create a new user and elevate that user as needed. In my case, I wanted to manage both SQL Servers from the one SQL Server with the GUI tools installed. Luckily I have mixed-mode authentication installed so I didn’t need to try to get the sa password.

  1. Drop to a command line and navigate to C:\Program Files\Microsoft SQL Server\80\Tools\Binn (or wherever your tools are installed to, obviously).
  2. Type (case-sensitive):
    osql -E
    This will log you in using your current logged in credentials
  3. Type (two lines):
    EXEC sp_addlogin 'Username', 'Password'
    GO
    Replacing Username and Password as needed
  4. In my case, I just wanted the account to have total control since I was going to be managing this remotely so I ran (two lines):
    EXEC sp_addsrvrolemember 'Username', 'sysadmin'
    GO
    Replacing Username as needed
  5. Alternatively you can run this if you want to use a custom role (two lines):
    EXEC sp_addrolemember 'role name', 'Username'
    GO
    For some reason the two role sproc’s parameters are in different orders

With this you should be able to login remotely.

Blog at WordPress.com.