Improving Data Analysis Operations :: SQL Server 2012
You are a data analyst in a financial company that uses SQL Server 2012 for its database. The
company has just recently upgraded the system from SQL Server 2000. You often use T-SQL
queries against the company's database to analyze the data. So far, you were limited to code
that was compatible with SQL Server 2000, relying mainly on joins, subqueries, and grouped
queries. Your queries were often complex and slow. You are now evaluating the use of features
available in SQL Server 2012.
- You often need to compute things like running totals, year-to-date calculations, and
moving averages. What will you consider now to handle those? What are the things
you should watch out for in order to get good performance?
- Occasionally, you need to create crosstab reports where you rotate the data from rows
to columns or the other way around. So far, you imported data to Microsoft Excel and
handled such needs there, but you prefer to do it in T-SQL. What will you consider using
for this purpose? What should you be careful about when using the features you're
considering?
- In many of your queries, you need to perform recency computations -- that is, identify
the time passed between a previous event and the current, or between the current
event and the next. So far, you used subqueries for this. What will you consider now
instead?
Answers
- Window aggregate functions are excellent for such computations. As for things to
watch out for, with the current implementation in SQL Server 2012, you should generally
try to avoid using the RANGE window frame unit. And remember that without an
explicit window frame clause, you get RANGE by default, so you want to be explicit and
use the ROWS option.
- The PIVOT and UNPIVOT operators are handy for crosstab queries. One thing to be
careful about when using PIVOT is related to the fact that the grouping element is
determined by elimination -- what's left from the input table that wasn't specified as
either spreading or aggregation elements. Therefore, it is recommended to always define
a table expression returning the grouping, spreading, and aggregation elements,
and use that table as the input to the PIVOT operator.
- The LAG and LEAD functions are natural for this purpose.