Monday, February 20, 2012

Is my table scan unavoidable?

Hi guys-n-gals...

I have a table that contains the following:
PortfolioID(int), EndDate(smalldatetime), Begdate(smalldatetime), WklyCloseIndex (float)

It has a primary key which is PortfolioID/BegDate/EndDate

I need to create a table that summarizes, by date range, the weeklycloseindex of several portfolios, like this:

BegDate EndDate Portfolio2 Portfolio67 Portfolio11 Portfolio90
05/28/2004 06/04/2004 xx.xx xx.xx xx.xx xx.xx
05/21/2004 05/28/2004 xx.xx xx.xx xx.xx xx.xx

So I wrote a function...

This function results in a table scan *GASP!!!!* (at least that's what the 'splain plan tells me when I run it in SQL Analyzer). Before I made it into the function, when I was testing the code in SQL Analyzer, it resulted NOT in a table scan, but rather a series of nested loops (the joins) and clustered index seeks...resulting in about 1/3 the total cost of the function.

I suspected originally that it was the TOP/ORDER BY that the function insisted upon, but even if I remove those, still get a table scan.

Wassup? Why does the function turn my cool lil' self-join into a table scan? Whut am I missing? Any thoughts? Disgusted Derisions? Hurled Insults? Bring it on!!! (please! ;) )

My predecessor did this in a similar project using a separate cursor for each portfolio by date, then looped through the dates, pulling in the per-portfolio index value and building the output table. I would rather avoid the cursors if I can.it wasn't the logic inside the function that was the cause of the table scan...but rather the code 'Select * from <function> ' that caused it to be reported as a table scan :( *blush*

So...how does one get an execution plan for what goes on inside a function anyway? Just cut out the code and run it in the analyzer? Seems like ya ought to be able to 'splain the internal cost of a function by calling it too...Hmmm...

Anyway, thanks for reading!|||Highlight it in the object browser on Query Analyzer and hit "Display Estimated Execution Plan". ??|||Highlight it in the object browser on Query Analyzer and hit "Display Estimated Execution Plan". ??

Nope, doesn't give me the results I would like...since in order to execute the function, I have to do something like select * from dbo.fn_Get_Performance_Table ('2004-01-01', '2004-06-01') (my function returns a table...). So the execution plan displayed just gives me a table scan due to the "select * from" part of the overall query.

I just used cut-n-paste to copy the internal function code to the QA window, then used "show execution plan" to actually get what I need...just thought (or hoped) there was a way to "see inside" the function in an explain (sorry...Tandem on the brain...OUCH!) ...errr...EXECUTION plan...

Thanks for taking a shot!|||This has been a bit of a problem with the Show Estimated Execution feature in Query Analyzer. It is blind to the costs of UDFs. I have come across a number of postings, and anecdotes about people using functions, and being really happy about how clean their code looks. After they test it, though, they go back to the ugly code without functions, and vow never to drink-er, that is, never to use functions again.

The performance problem in a function is that if you have a query in the function, that goes out and racks up say 20 reads and 3 ticks of CPU, then you put that function into the field list of a query that returns 1000 rows, you have a query that runs up 20,000 reads, and 3 full seconds of CPU time.

As with anything, of course, use caution and test.

No comments:

Post a Comment