June 2008 - Posts

More Kindle

 I'm still on the fence here about a Kindle. I pinged someone that bought one and he said they're returning it. Not that they don't like it, but his wife reads too much now, so she's spending more $$ and they'd like to slow down. Also, he doesn't see enough technical content and the Kindle editions aren't that cheap compared to some full versions.

I worry about reading and spending more myself. I think I'd definitely need some type of budget for the thing. And I worry  that I'll find lots of books I'm interested in not on there.

I searched for a few O'Reilly books today to see and couldn't find any, but they sent me this note, which shows they're moving forward. I got a note from Apress as well that they're planning on more Kindle versions.

I'm still not sure it's for me. I'm back to using the library a bit, though our library doesn't have a great selection. I'm also re-reading some older books, so I'm trying not to spend too much. Depending on my Amazon referrals, it's possible that I'll have a decent spend for books. I seem to get about $30 a quarter, so that's not too bad.

I think about using a PDA or iTouch to read, but they're just too small screens. And I could see my battery life dying too quickly. The question is will someone do a better e-Reader than the Kindle anytime soon? Their wireless connectivity, to me, means they've beaten Sony pretty well. Even at $60 more. The question is now how quickly they'll add more titles and can they work a deal to lower prices.


Posted 26 June 2008 14:55 by Steve Jones

One Note - Part II

 Apparently other people like One Note. Microsoft is using it for learning content, and I think that's a cool idea. I might find some issues with that over time, but at first glance I think it's a good idea.

Posted 25 June 2008 11:26 by Steve Jones

Filed under:

One Note

Lately I've been trying out a few new pieces of software to see how well they work. One is Twitter, and if you're really interested, you can follow me there. A few people do, though I'm not sure why. I follow a few people and might cut my list down at some point. I'm also not sure how much I like it.

The other is One Note, which is a part of Office 2007. When I first saw this a few versions ago, Brian Knight, my former SSC partner, was using it on his tablet PC. It seemed more geared for that form factor, so I hadn't really thought much of it. However at TechEd last week I watched Buck Woody use it and he told me just how valuable it's been for him, so I thought I'd give it a try and see what I think.

And I really like it.

I'm not really someone that likes to change tools that often. It's not that I'm against newer tools, but rather that I've made investments in not only tools, but time in learnign them and building around them and I don't get benefits if I change too often. I also like to have small, lightweight tools I can easily use. Word is cumbersome and causes me issues moving to HTML, so I've stuck with a text editor. Notepad is most of the way there for me (and it's on every machine, but it's not as smooth in HTML, so I've gone with Edit Plus, which has worked well and allows me to have mutliple documents open.

So it's rare I try something and see a chance it will improve things. One Note was one of those things with it's free structured formatting and right away I saw posibilities with it. I have a screen shot attached that shows the basic interface.

There are a few things I like about it, but one of the big things is how easy I can move my notebooks or tabs around. Each tab on the left side is a folder. I like things easily being organized like that. It makes life simple and allows me to easily be sure that things are backed up, but it also allows me to manage sizes of things. Right now Outlook is a pain because everything is in these large files.

The tabs across the top (Editorials, Editorial Polls, ToDo, Meetings, etc) are actually the .one files in the file system. By copying these to my laptop, they appear in my other One Note installation. The right side shows the actual documents. I started putting multiple editorials on one page, but decided to have separate pages, just like a notebook, and it's worked well

I've moved over most of my editorial stuff from a text editor, and this works well since my cut and paste seems to work well into the SSC system. I also have easily been able to move stuff from one machine to another, which is key. It's also a little easier to write in than the HTML text editor I've been using. There are a few things I haven't figured out, like flowing text around pictures, but so far this seems to be working well.

 

Posted 20 June 2008 11:23 by Steve Jones

Asking a SQL Question

I've been a little surprised lately by some of the questions I've seen in our forums at SQLServerCentral.com with questions that weren't questions. I've seen people post things that make it hard for people to help them. Among those items:

  • statements, and never really ask a question.
  • An error message or partial error with no context.
  • Broad questions like "What do I need to watch out for when upgrading to SQL Server 2005?"
  • Posting a question you got asked and nothing else.
  • Advertisements - I delete these, even in response to a question if they don't apply.

Jeff Moden wrote a great article on Forum Etiquette that I've tried to rerun and point out to people so they can better frame what they need to post.

There isn't a set of things I can say you always need to post, but you really should think about your question for a minute and post what's appropriate. Imagine that most of us don't understand your context, so we need to know the version (pay attention to where you're posting), and some information about what is happening. It helps to see your observations of the system or the code you've tried and also what happened that you didn't want.

The one thing that always bugs me is someone giving me a test question. Or interview or some other question they got asked and they haven't done any work. I had someone do this recently, actually it happens regularly, but this person posted a question with no code. I gave a few hints and asked them to try, saying that I didn't want to give answers to questions. They responded with "this is not a test question"

OK, maybe it's not. I'm not assuming that it is, but just that it appears to be and the posted didn't show any work. I tried to explain that and hoped that no one else just posted the code. I can appreciate that you sometimes need a code answer, but if you haven't tried anything, it looks like you're lazy and not intersting in doing work. Or that you want a good grade. If you're willing to post a test question, wouldn't you be willing to lie and say it isn't?

The bottom line is that I enjoy helping people and try to do so, but I'm not interested in feeding people, I'm interested in teaching them to fish.

 

 

Posted 18 June 2008 10:17 by Steve Jones

Filed under:

TechEd - Execution Plans

Randy Dyess of Solid Quality Mentors had a session and since I'd just gotten a copy of Grant Fritchey's book with the same name, I wanted to check it out.

There are 3 types of plans: graphical, text, and XML.

Graphical is cool, looks good, but it doesn't have enough information and once you are in a non-trivial situation, you have to scroll around and it's hard to hold all that information in your mind.

Text plans are what you need to learn to read. Especially in grid mode, we deal with this format for data all the time, so this makes lots of sense for many DBAs, so spend some time working on this. Once you get used to reading text plans, likely this will be your preferred method.

XML - Also hard to read, perhaps need to know XQuery, might be worth learning if you want to know more about XML. Randy doesn't like this, but I think it's worth reading about XML a bit and learning how it works since I think we'll have this data around for a long time. However reading XML isn't easy as things lik whitespace and other characters are encoded. Therefore it's not "human readable" and needs to be consumed by something. However XQuery should help here.

All plans can generate estimated or actual plans. Sometims you can't get actuals since it would be hours to execute the query, or you can't run it on the production system and need to know, or have a good idea of what's happening. Estimated plans help here, but they are not guarenteed to be the same as the actual plan.

One problem with estimated plans is temporary tables. You might need to comment one out to get the plan, and it might not be a good representation of the final plan. If you create a permanent work table and remove it, the same thing happens.

Text Plans

In the text plans, follow the parent column in your plan. It contains numbers, and the highest number happens first. The recommendation is to start tuning with the highest numbers, tuning there first since a change there will flow down to lower numbers. An observation from Randy that SQL Server seems to do the hardest or more resource intensive things first.

Options:

  • STATISTICS PROFILE ON
  • SHOWPLAN_ALL 
  • SHOWPLAN_TEXT
  • STATISTICS IO
  • STATISTICS TIME ON

Statistics IO gives you IO statistics from the query. You get more informaiton in 2008 as this has been enhanced. This is really read and scan information that you can use to determine where you might think about tuning the query. A big thing here is to try and reduce physical reads, but you need to run this a few times and be sure that the physical reads aren't the first ones to populate the cache. Once the data is in the cache, it might perform much better. 

In this case, Randy looks to reduce logical reads. If those can be reduced, even if the data is not in cache, physical reads might be reduced.

Execution plans can also be pulled from memory. You can grab the plan from the cache, which is actually what is being used by SQL Server to execute the query. DMVs are available to help here.

How do you tune?

Randy's Goals

  • Optimize duration / CPU usage
  • Optimize IO usage

Tracking duration meaning physical clock time, helps to determine how busy the system is. It's good to know an average time for procedures to execute as a baseline. Know that this will grow over time (potentially) As you add data.

CPU usage - Track this since it's a limited resource. Knowing the top usage queries can allow you to focus your efforts on those queries that are used often and impact the CPU.

Large, complex queries usually have large batches of code, each of which can be tuned as a mini-execution plan.

Top Slowdowns

  • Table and clustered index scans
  • Index scans v index seeks
  • Bookmark lookups
  • Join methods
  • Sorts
  • Compute Scalar

Table and clustered index scans are the leading cause of bad performing queries. Not always, sometimes you want this, but it is often bad. A CI scan is a table scan really, but could have better performance since you don't have as many random IOs as on a heap. Randy has this is the number one thing to tune when he sees this. If he can remove this, it often helps. Partial scans are sometimes ok, such as scanning a set of data all grouped together.

Index scans are beter than table scans, but still costly. This usually means a lot of rows ned to be read for the query. Investigate to see if you can turn these into seeks. CI seeks are the best things to have since they hit the data, but only specific rows.

Bookmark Lookups - A non-clustered index is used, but then you must use the CI key to go read the CI and get the data. In 2005 the INCLUDE columns can help remove these. In 2005/2008, you don't see this operator. You see RID (row identifier) and then a Lookup to get the data. These are hard to resolve, you might include 3-5 columns, but not more. You want to review the columns used by indexes to see if you can cover more queries. Or perhaps see if you really need to return that column and perhaps remove the columns. Don't SELECT *, that can cause this. You might also try creating aditional indexes  to be used for joins or perhaps, if appropriate for that table, change the NCI to a CI. This shouldn't be done lightly, especially if you've spent time considering what the best CI should be.

Be sure that your NCI doesn't duplicate the CI keys. They are already included.

Join Methods - SQL Server uses 3 types. Nested loop, Merge, Hash.

Nested loops are for smaller inputs, and has low memory usage. When you have a small and large tables, this often occurs. You don't want this between two large tables. Nested loops may resort your data.

If you have two large tables, a merge join in better. It's for larger inputs, and middle in terms of memory being used.

The hash join is for large tables, uses lots of memory. A row from the first table is run through a hash algorithm and then stored. The hash buckets are stored in MemToLeave, which is only 256MB on 32 bit machines. Above that, the hash tables moves to tempdb, which is much slower.

Sorts are expensive. Don't add them unnecessarily. DISTINCT, UNION, ORDER BY, GROUP BY, aren't always needed. Don't add them out of habit. UNION ALL does not produce a sort because duplicates aren't removed. If possible, allow indexes to presort things.

Compute Scalar is a function being used by the optimizer. Try to reveiew for implicit functions and watch if you have this in a WHERE clause 

 

 

Posted 13 June 2008 06:51 by Steve Jones

TechEd - Day 3

I left my laptop at the Red Gate booth, and then managed to wander up to a Data Compression talk from Sunil A. from Microsoft. It was a great presentation, and despite some extensive reading and researching earlier this year, I learned quite a bit about how it works. There are definitely changes from earlier documentation and Sunil had some great slides, so I'll try to write more over time.

A few quick things:

  • Row compression is moving to a variable storage format. This is a low level of compression and it provides some benefits, with low overhead. Nulls and 0s use almost no space, with them being encoded and using less than a byte if there's present.
  • Indexes non-leaf pages are only row compressed. You can set leaf pages to be row or page compressed.
  • Page compression has two parts: prefix and dictionary.
  • Prefix compression actually doesn't use the shortest matching value, but the longest. In the example Sunil gave, if you have three values (Smith, Smith, Smithson), the value stored in the header is (Smithson) and the tokens actually determine how many characters to replace. So the tokens here would be (5, 5, 8) meaning use the first 5 characters (Smith) for the first two values.
  • Prefix compression is only for a column. The values and tokens are not shared across columns.
  • Dictionary compression matches entire values, so only those entire matching values on the page are replaced. Using an example of three more values (Steve, Steve, Steven), only the 2 Steves would be replaced with a dictionary entry.
  • The compression techniques are byte agnostic, so 0x65656565 could be an integer value or character (5 As) and the compression doesn't care. It looks at byte patterns, not values.
  • Compression space savings make sense if the table or index is a relatively large percentage of the database. If you have a small part of the database, or not used that much, the savings might not really pay off.
  • You compress object by object, so you make the decisions for each table, AND EACH index. It's not all indexes that you consider, but each one separately.

More to come later.

I also was on another panel today, talking about the new features of SQL Server. The configuration server, the new central server that can store common registrations for all DBAs also can help deploy PBM. I didn't know that.

The resource governer is cool as well and I saw a few interesting tidbits on that. Like the CPU changes are dynamic, meaning on the fly, and that you might want to keep a small Admin pool setup that gets at least 5% and no more than 20%.

I'm tired. It's been a long week. 

Posted 12 June 2008 16:01 by Steve Jones

TechEd - Query Performance

A program manager asked the question at the start:  Who's had a query performance issue where a query "went bad" on the server?

Lots of hands up, and that was one of the things I wanted from this session.

What's reliability? MS sees this as two things

 - Quality: Given the criteria, you get satisfactory performance from your code at a point in time.

 -  Consistency: the query performance is satisfactory over time. No surprises.

Is there a tradoff? The presenter said so, better performance might come, but this may cost consistency. I'm not sure I agree with this. There was an admission that they tended to err on the side of better performance rather than consistency.

Query performance is diverse. It really depends on  what you want. Throughput, wall clock time, CPU time, Disk I/O, concurrency, etc.

The Query Processor (QP) needs to deal with this. It must make tradeoffs and the big one is compilation time v execution time. The first thing is to pick an execution plan. They could spend more time in picking plans, but it doesn't pay off unless the query will be executed many times. Should they worry about concurrency or assume all queries are executed singly. They don't do a lot of work to consider locking.

The result is that the "optimial" query plan isn't chosen. A "good enough" plan is picked.  However, MS realizes that reliability (consistency) is important.

What does the QP do (features)?

  • Column Statistics - created automatically.
  • Multi-Column Statistics - not created automatically except for indexes that cover those columns. If you have multiple column queries, build an index to help the QP.
  • Sample Rate - defaults to a few perecnt (Depends on table size), but there is an option to do a full scan.Sample works well, but there are a few places where fullscan helps. You want to allow these to be created (keep auto statistics on) and don't delete them.
  • You do need up to date statistics, so if the data changes significantly (size or distribution), you want to be sure that these are updated. This should be automatic, but be aware of this if you have performance problems.
  • Database Constraints - How does this help? If you have implemented DRI, the QP can understand some properties about the design. It knows that if this exists, they can infer there are matching rows in tables with PK/FK constraints, so they can better reason how to join these tables, possibly removing joins.

Query Hints

 There are a number of hints. The OPTION Clause allows hints that influence a particular query. There are other places in the SELECT statement you can use to influence joins for grouping, degree of parallelism, join order, parameterization, sniffing, etc. There is also a way to force a particular plan.

The best practice is to use hints only if you cannot solve an issue any other way.

Why doesn't SQL Server pick a good plan?

- Estimation issues even if statistics are up to date. They may under or over estimate the number of rows.

- Compile time might be too large, so perhaps that they think there is a "good enough" plan.

Some Best Practices

  • parallelism hints work well with highly concurrent, low complexity workloads
  • Paramaterize more aggresively in uniform and stable data distribution loads
  • Recompile often in higher complexity, skewed, and changing data distribution cases
  • Most hints and plan forcing make sense in workloads with few ad hod querys and mostly modules (procedures) and pre-compiled batches. 

Plan Forcing

 when you have a USE PLAN hint when submitting the query.  This provides an XML SHOWPLAN document that encodes the plan and the QP uses this. It's part of the query text and needs to ship with the application. Changing this means changing the app.

Plan Guides
These are not part of the application, applies query hints and are added by the DBA. These are like indexes in terms of guiding the QP in how to choose a plan.  The idea here is to fix performance problems with a particualr application. Usually a third party or custom application. If you understand why the query performs poorly, but you cannot change the application (change the SQL), you can use plan guides to influence the QP to perfrm better.

This was introduced in SS2K5. Useful, but hard to understand. DBAs having a hard time with this. I haven't seen a lot about it, so I probably concur.

This works by the application submitting a query. The query is compared to plan guides and matched up. If the plan guide matches a query, then the QP tries to apply those hints to the  query. The hints are honored if there is a successful match.

Plan guides are exposed through SMO.

USE PLAN

USE PLAN is a big part of SQL Server 2008.

DML is supported in 2008, guides can be created from history after the application has been tuned.

Profiler trace events are added to monitor plan guides

Also perfmon additions. In SSMS as a new folder as well. This includes the ability to script these from SSMS.

The engine has additions to allow you to freeze a plan with sp_create_plan_guide_from_handle.

The plan handle can be gotten from dm_exec_query_Stats.

In 2005, if an index was removed, plans failed. This is fixed, if the plan guide refers to an object that doesn't exist, then the query will build a new plan and execute.

Not sure if XevenTs are plugged in here.

Profiler has new events, plan guide successful (match), and unsuccessful (not matched or not used).

Development support with validation functions (sys.fn_Validate_plan_guide) 

One interesting demo. If you want to remove an index, you can actually script a check to see if the index is being used by any plan guides and then commit or rollback the drop, depending on results.

The scenario here is that you would test and build the plans on a dev or test system, and then script and move these plans to the production environment. This does mean that you need the same database names, same indexes, same data, etc. Even hardware should be the same.

You can also keep a history of good plans in your system. Lock them down by disable the plan guides. Then you can go back if something breaks in the future.

The upgrades are a godo use for disabling and locking those for upgrades beyond SQL Server 2008. If the plan upgrades have issues (new plans for enabled plans, you have copies. There is no guarentee that upgraded servers will perform well or use those plans.

If you mark a procedure with a plan guide, the current plan is evicted from the cache (nice term from the presenter) and recompile the stored procedure. Some plans guides are recompiled, but if you specifically use a plan in cache, there is no compilation.

Resource Govener

Helps manage resources

Can mark groups as low, medium, high, and so workloads get preference depending on how they compare to other workloads.

You can also set pools, such as an admin pool that "gets" 10% of memory, limits CPU, etc.

Question: If there is extra memory available, can a pool go above the limits?

Answer: No, because memory is sticky and can be slow to get moving. CPU can exceed if there are extra resources.

Allocations don't need to add to 100%. If Pool1 has 90% max CPU, Pool 2 has 20% CPU max, they compete for the extra 10% if both are busy. This is a limit, not a guarentee. 

How does DAC fit in? Not sure how it should be configured and should it have extra resources guarenteed?

Posted 12 June 2008 08:17 by Steve Jones

TechEd - Data Corruption

I went to this one, mostly because I don't understand coruption that well and Paul Randal is a great speaker. Worth going to see him if you get the chance. 

Run CheckDB. It's important.

You need to know how long checkdb takes to run. Checkdb is optimized to run for the times where there are no issues. So it runs fast. If there are issues, then it does a deep dive and runs longer.

In general, keep in mind that only the first 200 messages are reported. Run all_msgs to get all data and use no_infomsgs since there are lots of things reported that aren't needed in a DR situation.

Make sure checkdb completes. Story here about a stock firm.

Lots of messages once it's complete, so there are over 100 errors checkdb can report. Many have a variety of states.

The error results aren't too readable, but there are some things you should be aware of.

First determine if checkdb completed itself. Make sure that it has completed. Common errors with the checkdb process itself.

7984-988 - corruption in the critical system tables.

8967 - invalid states within CHECKDB. 

8930 - Metadata corruption, not critical, but Checkdb cannot make sense of the metadata about a table(s)

Many of these are BOL or online documented. 

Demo with pre-corrupted databases. A critical error shown on a system table, 7995, so checkdb cannot run, including repair. This means a restore situation.

Corruption in a system table, checkdb will not run, but you can perhaps run checktable against individual tables to determine which one is corrupt.

If you have corruptions in nonclustered indexes only, then you can run repair_rebuild if it's recommended from the output. Look through the errors and if the errors are all index IDs > 1, then you can manually repair the non-clustered indexes.

The flipside here is that if you must run repair, you must be in single user mode. If the index is large, than you must have 2x space to rebuild as well, which can be a problem. If you repair with checkdb, it might be able to correct some records.

You can also run an online rebuild in 2005 (Enterprise edition), to try and repair. However since this reads the old index, you need to do an offline rebuild.

Unrepairable error

Was one found? A PFS error? (89090, 8938, 8939) - No way to fix these. You can try to extrat data, but the pages where the corruption exists will not extract. Thereforre a backup is needed.

Was it 8970, invalid data in the column. Example, more than 1439 minutes after midnight. If you get this, repair cannot fix this since it does not know what values should be there. You can repair these manually.

8992 - metadata mismatch, checkcatalog error. Cannot be repaired, but depending on what this is, you might be able to hack the system tables.

Demo

DBCC Page is unsupported/undocumented, but it's safe for production use. Only reads data. Does not change anything. It's heavily used by the SQL Server team.

One trick in binding to system tables is that you can do it with the Dedicated Admin Connection. It doesn't work with regular connections.

System tables are cached. If you start in single-user mode, and use the DAC, you can change metadata. Paul had a demo that showed hwo to fix some corruption doing this, but he got an error that the metadata cache wasn't consistent. SQL Server maintains this and if you make the change to system tables, the cache isn't updated. So you need to restart SQL Server.

Look for blog posts on this stuff from Paul in the next few weeks at SQLSkills

Recovery using backups

Not necessarily the best way to deal with corruption.With large databases, it can be quicker to run DBCC to repair things. However most corruptions require ALLOW_DATA_LOSS, which means data gets deleted.

Full backups are a good starting point, necessary for a base.

Log backups give you good recovery to point in time. 

Backups need to be valid, so you should test them. You can restore bad backups with the "Continue after restore" option. This will restore the database, which is corrupt, but you can possibly retrieve data. Even with a corrupt database, you want to still get a backup first in case things get worse. Use CONTINUEAFTERERROR with the backup as well.

Restore or repair?

Do you have a database? If no, you need to restore.

Working backups? If no, repair. Or you can restore a damaged backup.

Log damaged? Restore, if that fails, use emergency mode and extract information.

CheckDB failed? If it's a critical error, restore or extraction.

If you have nonclustered index errors - restore or repair, choose which makes sense.

If you have repairable errors, choose either repair or restore.

If you have a choice, think about SLAs (downtime and data loss). Choose the one that limits the loss.

One thing to always do in a DR situation is ensure you have the tail log backup. This are all the changes since the last log backup. This is a just a current log backup.

Paul showed a single page restore, which is fairly cool.I haven't done that, but it can help with one (or relatively few) page corruptions.

One trick I've seen over and over is that you should ALWAYS restore with NORECOVERY. That's the default, do that, it allows you to continue with restores. When you're sure you're done, do a restore with recovery by itself to brings things online.

REPAIR_ALLOW_DATA_LOSS

Be careful. You're allowing things to be repaired, which often means just deleting data. 

Repair goes the fastest, doing the most provably correct thing. It doesn't try to save data, just get done quickly. Doesn't look at FKs, constraints, replications, or any types of relationships between objects.

CHECKBD repairs aren't replicated, so you need to quiese the system and then re-init the publications.

Damaged Log

If you don't have a backup, use EMERGENCY mode to extract the data. There is an emergency mode repair in 2005 that  you can run. This will rebuild the log (after extracting what it can) and then run checkdb as one operation.

What people do? 

Restart SQL Server - Doesn't really do anything. If there's corruption, it's corrupt and you're wasting time.

Jump to a last resort -  Without determining the error, they often run repair_allow_data_loss.

Detach a suspect database - Doesn't help.

XVI32, freeware hex editor. Paul showed how to corrupt a t-log. Interesting, not a useful skill if you're not a presenter/tester.

One thing about rebuilding the log on the sly with a detach/attach, Service Broker has been given a new GUID, so it's disabled. Watch out for this if you use SOA stuff.

No backup, no database? Worst state. Kimberly says URLT (update resume, leave town).

Summary

Know the signs

Run checkDB, let it complete,

Always take a backup before restore or repair

Try to limit downtime and data loss, but think about it.

 Test this, go through this using demo databases. And using your systems.

 

Posted 11 June 2008 08:23 by Steve Jones

TechEd - IO and Wait Stats

Andrew Kelley, SQL Server MVP, is a good speaker, so I'd recommend his sessions if you want good technical information.

IO Stats are a snapshot in time since the instance was restarted, so you must have at least two reads to compare the data. Stats are a snapshot. They are not reset until the instance is restarted. This means that you must be storing this data between checks to have any meanungful data. A single read from the stats DMVs doesn't give you a reference as to what the server is doing.

NOTE: Auto close does reset the stats (not sample_ms, but the counters). Be aware of this, especially with smaller applications/databases.

Solid State DIsks - Not many people using because of costs, but more and more they are in places, tempdb, pagefile, etc.

sys.dm_io_virtual_file_stats - Stores data from all databases, holds dbid and fileid to differentiate.  Some changes in SQL 2008 that help.Some countere are separated out instead of being totals (wait stats, I think).

FileStats - Reported numbers are physical I/O, not logical. Logical can be many times larger than physical.

Don't forget tempdb. Lots of tempdb usage and it's easy to forget about it and concentrate on your user database.Often tempdb is used more than extensively than people realize and have a huge performance impact on the server.

Use FileStats with WriteLog waits to see how efficient your log writes are. Log writes are sequential writes, so these can slow down your activity. T1 must complete writing before T2, before T3. If T1 is delayed, all other transactions are delayed. Data writes can be deferred until there is more time and the changes held in memory.

Don't use multiple log files. No performance advantage for these. There is an exception in extremely large databases, but not for performance. 

Backups can skew the results. Account for them in watching stats. These are complete physical IO. Transaction log and database operations are both included here. Backups will have multiple reads. Each read can be up to xx (512kb??)  bytes. Accounting for this can be tricky. Not sure about verifying backups. Should account for reads. The account for this, you need a snapshot before and after the backups to get an idea of the reads used by the backups.

Filestats are easy to capture, so there's no excuse to examine these on a regular basis. Frequnecy might vary. Daily, weekly, what you need. Be sure you review the reports. Capturing the data doesn't help if you don't examine it. This data is important for a baseline.

Code:

DBCC DROPCLEANBUFFERS -- cleans out databases. Useful for testing.

Gather stats, build table with same structure of table, then load with snapshot from the DMV. Include the datestamp for the data.

Join with system DMVs for friendly names of files and databases.

----- 

 All these counters point to hardware issues. Not that hardware is bad, but it's a delay with OS, drivers, controllers, disk, cables, etc. Configuration can be an issue here.

 Log writes should be very short, so stalls should be < 2ms. Data writes can be longer, but they can impact things as well if they grow too much above 10ms.

Don't assume a SAN is fast. Make sure that you are checking and verifying there are not delays on those IO paths.

 The counters include aggregates from all threads. So 5s of wait or stall doesn't mean that it was 5 physical stats. If you had 5 threads, then you could have 1 physical sec of wait across each thread. Be aware of parallelism here.

Some common problems:

High stalls on writes

  • - add more write-back cache on the controller. This speeds up writes from the instance. 
  • You can change the ratio to be 100% writeback instead of read. SQL Server doesn't necessarily benefit from read caches on disk.
  • Also be sure that you are R1 or R10 or R01, not R5.
  • If possible add spindles to the array.
  • Maintain physical separation of data and logs on disks.
  • And don't put non-SQL Server traffic on the array.

High Stalls on reads

  •  optimize queries., index, tune, reduce scans or translate into seeks.
  • Same, make the controllers 100% writeback.
  • Use same disk strategies from writes. 
  • Don't forget tempdb 

 Wait Stats

 Anytime SQL Server has to wait for something, it records an event (WaitType). Time is in ms.

In 2005 there were 194 types, and more in 2008. In 2005 there wre some that were hidden, but more are exposed in 2008.

These numbers are cumulative from the last restart or the last clear. Users can reset these counters here, so be aware of this.

DMV is sys.dm_os_wait_Stats. Contains the type, the count, wait time in ms, max wait time, and signal wait time. Be aware that the max wait time is from the last reset, not between samples. Signal wait time is time from when resource is available until it is used. Wait time is from time resource needed until IS IT USED, not untiil it is available. 

 Wait stats can narrow down where bottleneck is. If there is a lot of signal time, you might have CPU pressure. This replaces dbcc sqlperf(waitstats) from SQL 2000.

dbcc sqlperf('sys.dm_os_wait_stats', clear)

Be careful of OLEDB waits.

If you report on these stats you'll see the top types of waits, which can help you diagnose where the slowdowns are in your system.

Common Waits 

CXPacket - From inefficient parallel processing, where one or more threads are waiting on things to finish. Hyperthreading can add to the problem here. Consider adjusting MAXDOP here to reduce this (at server or query level).

Locks (LCK_xxx) - Long running transactions or blocking. Tune, index, reduce contention.

ASync_NETWORKIO - client isn't getting data as quickly as SQL sends. May be network issues, but likely the client is to blame.

IO (PAGEIOLATCH, IO_COMPLETION, WRITELOG) - Physical IO waits. Waiting on disk into or from memory. Storage subsystems issues here. Writelog should be as log as possible.

Latches (pagelatch_xx) - Not IO related, this is contention with internal resources. Heaps and LOBs can cause latching waits. Lots of inserts into one page can cause issues here along with page splits.

 

 

 

 

Posted 11 June 2008 06:50 by Steve Jones

TechED - Building an SOA Data Center

Talking about heterogenous systems, usually legacy systems. How do you move to a new architecture?

You must start with the old systems, often you cannot throw it away and need to mitigate the risks of migration to newer systems. One way of doing this is to rip out the presentation layer/logic and move that out as a web service. Now you can modernize the top layer. The underlying layers are still legacy. In this example, an X-Windows on Solaris system, moving to a Microsoft platform. However with webservices, modern types of applications can be built on top of the legacy systems.

There is a perception that the overhead of web services, SSL, etc., will cause problems. This shouldn't be the case

Digging into more systems, often securiy becomes an issue. Certificates can work, but many companies already have a security infrastructure in place with desktops and Kerberos you don't want to duplicate. A header was added to the web service communication to contain a Kerberos token.

The evolution in moving a legacy system, one approach is to slowly rip out some logic and replace with an SOA piece of logic. So the legacy system was a consumer of a web service not a provider. This allows piecemeal for replacement of the application. Eventually most of the application is replaced and you are in an SOA architecture.

It appears that they've found ways to build quick systems, despite the overhead of XML on your data, even with mainframe and other legacy systems. Original and many early web services are poorly implemented and insecure. More modern WS are better written. No real details here, but it's an architecture talk. It does give some hope that SOA and web services make sense and they're can perform well and be secure. I'd like to see good examples of this. Probably need to look out on MSDN for stuff.

One security concern with insecure web services. In an example they used the KErberos token in the header to go out to AD and authenticate the user. The tokens only live for a short time (30s or less) so this worked.

Lots of WSxx acronyms, new standards and performance guarentees for metrics thrown out. So there is some maturity taking place here with web services.

One issue that occurs with legacy systems is often the newer developres don't necessarily know the older technologies and interfaces (CORBA, J2EE, etc). Two skill sets are needed, so be sure that you have development resources in all areas, even if it's double staff. 

OK, it's getting lss interesting. Moving on. 

Posted 10 June 2008 10:12 by Steve Jones

TechEd - Designing a Global Database

I was going to hit a Powershell lab because the Scripting Guys at TechNet write great columns, but it filled up, so I went to my second choice: Building  a Global Database.

 

What does this mean? If you are building a database that handles all types of languages and cultures, you have to handle not only characters in storage, but also display. The formatting and layout is different, so you need to ensure that you have "cultural integrity" with your data.

The Storage Perspective

What does it mean to be multi-cultural?

Really it means Unicode. There are two types of data (nchar, nvarchar), but also encoding. UCS-2 encoding has2 byte characters, fixed width, 16bit. This is what SQL Server uses. If you were to use char/varchar, these are not Unicode, but the storage is defined by the code page of the table.

SQL Server 2008 added the new data and time types. There's an ISO8601 support with DATEPART to handle global applications. There is an ISO_WEEK parameter added to datepart.Timezoneoffset was also added, this is from UTC and it's globally aware, stored as a ETC date.  

A code example was given to check if a date was UTC compliant. It's a good idea, especially as more and more of us deal with global data.

SET LANGUAGE gives you the error messages and date time formats in the appropriate language on the server. ISO8601 formatting isn't set by this.  The client uses it's own parameters in the connection to handle it's own formatting.

One of the changes in SQL Server 2008 with language is the enhancements to Full-Text search that added many languages (51 out of the box) to the Word Breakers used for searching in various languages. The Word Breakers were developed by the Office team, so there is compatability between products (including Vista). In the older versions, there were 11 lists of word breakers.

Collations

Collations are fundamental to all operations, so you should be aware of how they work. The collation defines the sort order and code pages used. Sorting are based on cultures and liguistics. Codepages specify a specific set of characters that are recognized. 

Windows collations provide more information than SQL collations.

Collations include various aspects such as Case sensitivity/insensitivity, Accent sensitivity/insensitivity, Kana sensitivity/insensitivity, and width sensitivity/insensitivity. Kana sorting is an option for all collations, but basically set for Japanese. Width means that you handle double byte the same as bytes when sorting. So A in one byte (char) equals A in double byte (nchar)

 Case sensitivity can be set at different levels. There is an instant level collation, a database level, and then an object level. This matters since users, passwords, etc. are compared according to the instance level setting.

SQL Server Collations

 There are two types of binary sorting.  _BIN sorts based on binary code point. This has the fastest performance, but works with fixed 2 byte characters, so Unicode 16 (4 byte) might not sort correctly. _BIN2 handles sorting with Unicode code points, so double Unicode (4 byte) characters are sorted correctly.

One big thing is that collations are set in a tab during setup. THIS IS EASY TO MISS, so if collation is important, go slow here. It's not obvious where this is.

tempdb collations is based on model, so change that if you need to do this. There is a problem if you need multiple collations. So temp objects need to be declared with the collation in the columns or table if it is required.

Collations can be changed on databases, columns, but this changes meta data, so interpertation of your data could change. Be very careful of altering collations, but you have the choice of T-SQL or SSMS. You can also change data from non-Unicode to Unicode.

Collation can be chosen on the fly in a select statement. (select * from T order by C COLLATE Japanese_CI_AS). 

There is an interesting chart on how collation conflicts are handled. Two explicit declarations can error out with they conflict, but explicit declarations trump the implicit declarations. There is a chart in BOL (SQL Server 2008) to determine how other situations are handled.

The collations have been aligned with Windows 2008, so 80 new Windows collations. Weights have been added and updates to Japanese, Chinese, etc. in terms of the changes to the languages based on government standards. There have been quite a few minority scripts in Chinese (Yi, Uighur, Mongolian, Tibetan) as well as supplementary characters (4 byte Unicode). Collations are indicated by compatability level (90 and 100).

An impressive list of collations shown on a slide that showed the 2000 collations, about 35 or so, about 6 updates in 2005, and then 6 new ones in 2005, but almost a doubling of collations in 2008. SQL Server is going global.

Demos: create table, change types. There is an SSMS setting that prevents changes that require table creation. That's a great safety net I wasn't aware of in the tool.

Inserting Unicode characters: some display, some show the square box as they're not displayable in SSMS, and using the Hex code. In three inserts, the interesting thing is how non-Unicode data is handled. The insert is smart enough to break up hex values into the correct number of characters, but you can get stange data.

Stroke order is another sorting that matters in some languages, and support is added to match the dictionaries in those cultures where this matters.

There is a tempdb demo thos shows some of the issues with different collations. To me, if SQL Server wants to be a truly global player, this is a reason why there need to be multiple tempdbs.  The CASE function does need an explicit collation in the query since it's not aware of the collations.

Best Practices 

Use Unicode throughout the stack (end to end). If this isn't possible, be sure that conversion points are well known.

- In the ODBC flow, if the client has SQL_C_CHAR and the server has CHAR, different code pages, these need to be converted. With auto-translater, the client converts from Client Code page to Unicode. Next the server code page is used to convert from Unicode to the server page. This is client side, prior to going over the wire, so the client must be aware of the server code page and have it installed.

If the client is char, server is unicode, one conversion. Same is the client nchar, server char. You still have have issues here.

No conversions are made for surrogate characters (4 byte). These are stored as 2 separate 2-byte characters, so all handling must be done with clients. 

The biggest danger here is a Unicode client working with a non-Unicode client. This could result in data loss and hopefully people working with multiple langages are sing Unicode storage only.

There are a few white papers for Best Practices on globalization with changing collations and moding to a Unicode world.. 

Kind of a basic session in some ways. They covered some basics that even a language novice such as me is aware of, but there wasn't a lot of maketing in terms of SQL Server 2008. It does show that there is much more support in 2008, so if you need that support, think about SQL Server 2008.

 

Posted 10 June 2008 08:29 by Steve Jones

Building Software is Like

I still remember hearing an executive in IT at Virginia Power start a sentence like this, and he went on to say it's like a house before chastising the developers for being slow. I was on the ops team and liked the analogy, finding it funny. Course I hadn't done a lot of software development outside my bedroom and was young.

I've had an interesting editorial series this week comparing software development to builders, lawyers, doctors, and engineers. The discussions are worth reading and it's definitely opened my eyes quite a bit.

Posted 06 June 2008 13:01 by Steve Jones

Know Your Equipment

Today I ran to the co-location facility for FourDeuce and End to End Training, to upgrade our database server to SQL Server 2005. We've been wanting to do it for awhile, but since most of our SQL code is fairly simple and generic, we haven't needed it.  The old SQL Server 2000 instance has run fine for years on there (it was the old SQLServerCentral.com database server) with no issues.

However we're starting to do more VS 2005 development and there are a few things in 2005 that it would be nice to tak eadvantage of. Plus we are teaching on 2005 all the time, or at least Andy and Brian are, so it makes some sense to just get 2005 on there.

I logged a ticket yesterday, got a note back today that I was set to go, and so I downloaded the 2005 Enterprise media today to my desktop and burned it on a DVD. I also started it downloading remotely on the database server, or so I thought. I didn't realize until later that it wasn't there.

I got to the colo at lunch, figuring this would be a quick hour. I got in, the guy from our hosting company opened the rack for me (we share with some of his other customers) and I got a crash cart and plugged it in. Now the crash cart had PS2 connections for the mouse and keyboard, which is OK since that's what the server has, but those connections are flaky. It would be better to have USB stuff on there, or at least adapters. I plugged things int he server, but it didn't respond.

So I did a hard shutdown, not able to reach Andy right then. So I started it up, plugged into the front (mistake) and only had one connection, so I went with keyboard. I've been using Windows for years, figure it would be easy to navigate. I got things up, logged on, put the DVD in, and

 "Please insert a CD into the drive"

 Huh? I opened it, put it back, cursed myself for not bringing a laptop, and then checked the drive. It was a "CD" drive and couldn't read DVDs. I haven't had a machine with no DVD drive in years, but this server is an old 2650, about 4-5 years old, and it doesn't have one. Even out "newer" machine, a 1750, is about 3 years old and doesn't have a DVD drive either. They didn't have one I could borrow, so I thought I'd download more media.

No chance. I got to the MSDN page, picked the server, but the "download" button wouldn't respond to keyboard input and I had no mouse.

I called Andy, thinking he might have some other idea, but it ended up being a busted trip.

When I got home, I saw the image was <4GB, so I could have actually put it on my flash drive, which I should have done. I'll try that tomorrow and see if I can get over there again and get a 2K5 instance.

So the moral: pay attention to your equipment. Remember that servers, espeically older ones, might be a generation behind your desktop in terms of interfaces and capabilities.    

Posted 05 June 2008 21:53 by Steve Jones

A New Logo

For SQL Server, it's coming soon. Surprisingly, when I was up in Redmond, quite a few people were asking about this, and wanted to have a new logo. I somewhat agree that the old logo was a little tired and it would be good to see something new.

I saw a copy of the new one, to be released at TechEd, so it's out ther somewhere. I'm going next week, so I thought I'd just get a copy then, but now I started writing this post, and IMAP deletion means gone is gone.

It looks like the old logo, except someone swiss-cheese'd it. Not quite what I was expecting, but I shouldn't have expected a large corporation like Microsoft to deviate much from the current design. If I get a copy from a friend, I'll post it back up here.

 

UPDATE: Here is it:


Posted 04 June 2008 10:22 by Steve Jones

Filed under:

DIY

Last month the LCD screen on our treadmill went out. We bought it last Christmas (06) as a present for my wife and I and with all the snow in 07, it was a nice addition to the household, letting us work out with bad weather outside. However the LCD dropped, nothing at all on it, and so my wife is on me to get it fixed. She could do it, but she's as busy as I am, perhaps more so, and tends to leave the physical stuff to me (she has enough of it with horses), and I like moving things around.

I've been slow to get started, but today I decided to call and see what could be done. The guy on the other end from Reebok was nice, had me disassemble a few parts, which had me sweating at 10am as I had to tip things over work out screws, lift things, etc. Eventually we diagnosed a bad controlled board, ordered a new one for $150n and thing should be working in 2 weeks. Well before the first snow (I hope)!!

To me it's no big deal to replace the board myself, messing with the connections and stuff. It will cost me time and $150 and the trip out from the store for professionals was $120 + parts, and perhaps a charge for the second trip. So I save money. It's no different than me replacing the blades or even the tie rod on our mower, or sharpening the bush hog blades myself. I get to mess with something, learn about it, develop a new skill, and save money.

I consider myself a software guy. I don't necessarily want to do it for a living, but I've enjoyed working on software, tinkering here and there, building my own website, including blog software, and learning about computers. I also replaced my own video card this weekend, not a big deal, but something not everyone does.

I think that many IT people are natually DIY'ers, and curious how things work. They like to mess with things and I'm always amazed how many IT people I know dive in and do things themselves.

This probably should be an editorial by itself. Maybe someday. 

Posted 02 June 2008 17:21 by Steve Jones