June 2007 - Posts

Walking the Line
I have the best job here at SQLServerCentral.com. I work from home, I get to be creative, writing about what I want to write about and work on my own schedule. I've gotten free admissions to the PASS Summit, TechEd, and the MS BI Conference and been offered others.

However it can be aq pain at times. Like today where I've got people banging on me for running an article in today's newsletter. I put the title as "Writing Faster T-SQL" and the description as follows:

All of us would like to code faster and with fewer mistakes. While a third party tool usualy won't help with the latter, there are a few to help you increase the speed at which you build readable code. New author Tom Fisher brings us a short tutorial on using a couple of Red Gate's tools.

Now I thought it made sense that this was a vendor piece, admittedly, but my vendor of employment. Instead in the discussion I had some complaints that this is becoming a "Red Gate" site and we are running too many vendor things.

I can't quite understand that. We have run 3 reviews of Red Gate products, including today's, this year, but we've run 7 or 8 other vendor's products. We rarely run reviews, and so I can't understand what the extrapolation here is for other users. Are these the "squeeky wheels" or are they representative of 1000 other users.

It's a fine line to walk with Red Gate. They paid a good price for this brand/site and they want to be able to promote their products. To be fair, other than asking me to remove advertising, re-do the newsletter look, and to display their logo, they haven't pushed any content on me. I tend to just do the same things I've always done and this site reflects my feelings, interests, and ideas, not really anyone else's. I guess Brian and Andy's from over the years, but for the most part I decide how the site works and what content appears. I didn't think anything of running the article as it was a nice "how to" tutorial showing off their products. I've actually asked other vendors to do this, but haven't gotten any responses.

So I'm not sure what to do here, but I'd love feedback (sjones at sqlservercentral.com) on what you think.  


Posted 28 June 2007 11:44 by Steve Jones | 1 comment(s)

Keep a Spare
I wanted to get working on Katmai a little, but not being brave and seeing the release notes that it can't run side by side with SQL Server 2000 (who decided that?), I wanted to put it in a VM.

So I fired up Virtual PC, created a new VM, and...

Nothing. It started the boot and froze. I'd selected Windows 2003 as the base image and for some weird reason thought I'd get a W2K3 install. It was then that I realized I don't have a blank Windows 2003 image. So I'm installing W2K3 now on a VM and when I'm done, I'll shut it down and copy the file off to another place to hold as a template.

I haven't done much VM work lately, but I used to keep spare images around as templates, which allowed me to quickly and easily test installs.

So a tip: Keep at least a Win XP and a Win 2K3 image handy. Or Vista if your hardware supports it.

Posted 20 June 2007 13:36 by Steve Jones | with no comments

Back to Work (Some days it flows)
My job here at SQLServerCentral.com is mainly as editor. This means that for the most part I edit articles, schedule them for release, work with authors, build questions of the day, correct mistakes in questions of the day, and of course write editorials.

Last week I was on vacation, so no work got done. As a matter of fact, I was completely unwired, and so I didn't really do anything. I did make some notes on vacation and start to write an editorial on Friday (as I was ramping up out of vacation mode), so I was really behind when the week started. Only about 600 emails to go through from the week, of which 400 or so were nonsense, vendors, newsletters, SPAM, etc. that I keep an eye on.

So today was going to be a good day to edit some stuff and do some writing. I was behind getting to the Database Weekly links this week, and they usually provide me with some good things to write about. And in fact I jotted down 5 or 6 things I need to editorialize on or write articles on  at some point. I keep a living document of stuff going and today was going to be a productive day.

But my daughter claimed illness this morning. With my wife suffering Montezuma's Revenge for the last 2 days, I couldn't take a chance, and let her stay home from summer day camp. But after a couple hours, she was "miraculously cured" and wants to play, talk, ask for things. I finally snapped when she walked in the office for the 10th time in an hour with some request. Not making for a good work day.   And I was ready to work.  

Posted 20 June 2007 13:31 by Steve Jones | with no comments

Katmai Manageability
The database manages itself. Not totally, but close to it. It was an analogy made by Dan Jones with cars. Auto shop isn't really offered in high school anymore, people don't "tinker" with their cars much anymore.
If this is true, we have more time for data design, data architecture. I'm not sure I agree and as much as I hear, the more I think a DBA needs to be around because when something goes south, it goes way down.

Some trends affecting managability.

Trend 1 - New Features in the product
A huge chart of new features that were in SQL Server 2005. Many more features were added in SQL Server 2005 than will be added in 2008. However the surface area is still growing in many ways.

One of the manageability things is that no one can understand so completely the product anymore because it's too large.

Trend 2 - Data center and server consolidation
Companies are trying to have fewer centers and less servers. And the amount of staff will shrink, but the work grows. In fact it's expected that more databases per DBA is the trend.
Not too many customers are virtualizing SQL Server. Bigger servers and multiple instances, but not really virtual servers in production.

Trend 3 - Remote and embedded DBs
More and more of these are remote, kiosk, small form factor databases will grow. The data will upload from these remote and embedded systems, but the DBAs aren't on-site. One retail chain, close to 1000 locations, has 3 DBAs managing 5,000 databases. Most are small, but they still need to be managed.

Laptops and mobile workers will often have databases on their laptops that need to be managed.

Katmai Manageability
To combat these trends, Katmai is trying to scale the DBA with better managability. Here are some features:
- resource governmer - I've been asking for this for years. The goal is to differentiate workloads, monitor resources per workload group, and you can limit resources. The goal here is to prevent runaway queries.

You can set applications up into groups and set min/max for memory and CPU usage. You can set priorities between groups as well, so power users, sales, etc. can be placed above regular report runners. You can also put applications into groups as well.

Programmed via T-SQL, SSMS, or SMO

All Actions Audited
A frameowrk to audit the data environment. A new object (audit configuration) will be included to specify a location to store audit information. Can audit to the Windows logs or a file. This object will have a specification that defines which actions to audit. Can define auditing activity on DML, events, on particular objects. You can lump multiple specications together. It's a securable just like any other object. It will have a T-SQL programmability API, SSMS, or SMO access.

Performance Studio
Data collectors will be introduced to collect data in memory. They can be configured via T-sQL or Win32. They run as configured to collect information and post it to a warehouse somewhere. you can configure what and how often they collect data as well as how often they post to a warehouse.

This allows you to analyze performance. Some canned reports will be provided or SSRS can write custom ones. It can be moved to SSAS as well. After Ss2K8, they expect to provide a real-time dashboard  as well as a wizard to watch trends and offer suggestions.

There will be the ability to apply policy, like system must run within parametesr. So like if it goes above 80%CPU for more than 5 minutes, send an alert.

Data Compression
Not backup compression, but compression on tables and indexes.
works on partitioned tables (or non partitioned). You can compress one partition and not another.
It has page or row compression, works on clustered or non-clustered index.
T-SQL extenions available for table and index DDL. You'll have an SP to estimate the savings. Also available via SSMS and SMO
There is a CPU hit. Not every data type is compressable.
I'm not sure if this is a great idea for many people, but for less used data, this can be a big deal. Don't forget that a savings in the database can mean a savings in backup size and time, a tape savings. It can really add up. When I bought litespeed years ago for JD Edwards, when it was $400 a server, we had about a 1 year payback based on tape cost.

Database Mirroring Page Repair
Pages can become corrupt due to physcial data problems, i.e. disk going down.
If you failed to a mirror with a bad page, you could be down. This system watches both the principal and the mirror for bad pages. If found, it can call it from the other side to restore that page. New DMV shows the last 100 page repairs, new trace event, data collector can be run on this, etc.

Powershell Integration - not committed.
Intellisense - WILL BE IN 2008!!!! About time that this will appear.

Declarative Management
also called policy based management. The focus for SS2K8 is on the db engine to allow you to set policies for server conifiguration. This can work across multiple servers, like the master/target by pushing policies to particular groups of servers.
The goal is to lower the TCO and reduce the cost of managing. Should eliminate some scripts to tweak servers. Policy should handle this. You can enforce policy or just monitor.
This will replace the surface area configuration configuration tool.

Why these changes?
All of these changes are designed to combat the trends above.

To handle the additional features, they want to give you less knobs to manage. There aren't really less knobs, but with grouping and policies, you can physically have to work with less knobs as you deploy these to other servers. The first server takes time, but changes or alterations are more easily pushed to other boxes.

These same features help with consolidations because by grouping things together, one DBA can manage more servers. It makes the DBA more scalable
.
Once again with policies, this allows you to manage all databases that are on cell phones, or on laptops, or in any other remote situations. You can control and monitor those devices in an easier fashion.

The conceptual model for management

Policy - The desired state of facets, when to check the policy (on change, on a schedule, etc), actions on check (log, rollback, etc), what to check (name, setting, etc.), and the categories. Group the policies together, like all security policies.
Facet - projection on top of a target type. A groupding of properties, like the security properties of a target type. Facets can have logic, so it can have new properties.
Target Type - object, table, view, sproc, physical objects with their properties.

Kind of confusing, but each of the items above was a box in a stack with the target type at the bottom. I have the feeling we'll see these in BOL, so start to get used to them.

Back up MSDB
I don't know anyone that isn't doing this, but it was mentioned to be sure you back up msdb. As with everything else internal it seems, the policies will be stored in msdb.

They'll also have an internal SQL Agent job to handle policies. This will also use SQLCLR, evne if you have it turned off. Very interesting that they have a back door, but more importantly, it's good to know that it's a potential problem. Not likely, but you never know. I'm assuming they backdoor SQLAgent as well, probably using the subssytem even if the service is off.

To prevent changes, they use the same DDL events that we use.

Demo
policy to prevent creation of tables in the dbo schema. Fails with a level 16 error message.
There's a new Policy Management folder under Management in SSMS. There are policies, conditions, and facets.
If there's something out of compliance, the icon on a database changes and you can find the issues.
Showed how to create policies, categories instances, and set groups up.

Once again with policies, this allows you to manage all databases that are on cell phones, or on laptops, or in any other remote situations. You can control and monitor those devices in an easier fashion.

Posted 06 June 2007 08:32 by Steve Jones | 2 comment(s)

SQL Server 2005 Security
It's a similar theme, off by default if it makes sense, minimize surface area.
Recommendations: for new installs, leave stuff off unless you need it. Minimize connectivity. for upgrades, turn things off if you do not truly use them.
SQLSAC - Surface Area tool, launch from command line or setup. There's a SAC command line utility to script out the configuration that you can apply to other servers.

Service Accounts
BOL says don't use Network Service? Why not? Has more rights than generally needed. Also a shared account. If penetrated somehow from another place, control of SQL Server is gained.

Local or domain account is recommended. You can change the password without a shutdown.
Least desirable: Local system
OK: Network Service or Local Service.

Changing the account: Previous advice was go to SQL Service Manager or Enterprise Manager to change the account. Reason is because large number of files and folders are ACL'd. Same advice for SQL Server 2005. Use the tools, not Services applet.

Authentication Mode
Windows is recommended and is the default.
Exceptions are because an application requires its own login. Also DBAs do not want to have Windows administrators having control over who can access SQL Server. This is why there are additional features in SQL Server 2005 for SQL logins. One thing is that the channel from client to server needs to be encrypted to cover passwords. SQL Server 2005 uses a self-signed certificate to encrypt the password packet to keep it safe.

Network Connectivity
It's recommended to enable the fewest protocols you need, probably only TCP/IP.
Also default ports should be blocked and changed (1433, 1434)
Endpoints are new entries into SQL Server, but you can secure these to limited users/roles. You can limit the types of access as well (no ad hoc, etc.).
DO NOT EXPOSE YOUR SERVER to the Internet. This should be common knowledge for anyone and it's something you want to do if at all possible.

System Procedures:
REVOKE execute from permission from XPs and system stored procedures.
Even though public had rights to run a system proc, the procedures had a security check. In 2005 all were re-examined to be sure rights were needed, for example sp_help had a security check added. Many XPs (XP_dirtree, xp_regread, etc) have security checks added.
They are more secure, but removing XPs results in an unsupported configuration.

Password Policy
In previous versions, no validation of SQL auth passwords.
New features added based on OS capabilities. Recommend that CHECK_POLICY is left on, CHECK_EXPIRATION is left on and MUST_CHANGE for new logins. Note, policy is only checked when passwords are changed, not during an upgrade.
If you set MUST_CHANGE, this only works for 2005 clients (.NET 2.0, Native Client). The reason is they can send a two password packet that old clients cannot when a change password is required.

Admin privileges
Use when needed. Don't use SA because it's convenient. Also minimize the number of administrators and provision them explicitly. Don't use the Windows Administrators group.
Also be sure each administrator has their own account to be sure you can audit and determine who change what.

Vista
User Account Control issues.
Protected Admin - Even if you are an admin, your token does not have the admin token. A check is made to be sure that you have rights and want to perform the action.

DB Ownership and Trust
SQL2K SP3 turned off database cross chaining. This ensures that a cross database query has a security check. If you don't need it, turn it off. If you do need this, you can trust the dbo of the other database or sign a stored procedure with a certificate and have the dbo trust the certificate.
Recommendations are to have distinct owners for each database, not "sa", and only enable trust as needed.

Schemas
It's a namespeace in the container hierarchy
  server -> database -> schema -> object
Can be owned by anyone and can grant permissions at this level. This was done to isolate applications or groups of objects, separate administrative grouping from application usage. You can easily change the user as well.

OK, I'm done. It's too basic a session. I guess if you haven't really worked with SQL Server, this is a good list of stuff, but I think too much of this is common knowledge for DBAs. Giving some of the reasons for changes is good, but for a many of these explanations it sounds like an excuse or a "ok, we give in". Having a somewhat boring speaker doesn't help.

Posted 06 June 2007 07:06 by Steve Jones | with no comments

Southern BBQ
Last night I went out with Tony Davis of Simple Talk and Alan White, a longtime author for both here and Simple Talk. After satisfying their need for microbrewery beer, we ended up at a Southern BBQ. They enjoyed it, despite having to join me in choking down American beers.

We got to talk about a bunch of stuff, mostly non-technical, but it was interesting to hear their perspectives on the conferences. They definitely are enjoying the conference and getting some good development knowledge here at TechEd. There's definitely some, but I think this is more an infrastructure conference overall. Still it was interesting to hear them. They still like the PASS conference for DBAs with lots of networking opportunities and real world presentations.

They did think that Powershell is going to be here a long time and take off. I think it's pretty cool, but I've seen MS push lots of technologies, so I'm kind of adopting a wait-and-see attitude about Powershell. If you get the chance, check it out and see what you think There were definitely lots of Powershell sessions here covering all sorts of admin, Windows, SQL , Exchange, etc.

Posted 06 June 2007 07:02 by Steve Jones | with no comments

DBCC Internals
CHECKDB
- needs consistent view of the db. Needs to do this without locking. In SQL Server 2000 log analysis is used. Almost everything is logged and once the database is read, the log is read and inside checkdb, recovery is performed to reconcile the differences. If there was a lot of activity, this could take a long time. While the data portion is multi-threaded, the t-log analysis is single threaded. There are problems because not everything is logged. So it guesses in some cases. Could cause false positives.
In SQL Server 2005, this was changed and they use COWs (copy on write). An internal snapshot is generated and used for a consistent view. Still with a high load, there are problems (just like a snapshot). Tempdb is not used (nor can you create a snapshot on tempdb).
- no online checks on tempdb. recovery cannot be run and a snapshot cannot be created.
- only works if you have databases on NTFS.
- internal snapshot is an alternate file stream on each db file, but it will not show up in explorer.
- any space used in the alternate stream is returned to the drive after checkdb.
- If you run checkdb on a snapshot (if you're low on space), no alternate stream/snapshot is created.

- What does it do?
  - primitive checks on critical tables (5 tables)
    - works by getting to the leaf level of each of these tables. Read every page, no cycles in the linkages, page audit of every page.
    - if there's a  problem it's done.
  - allocation checks (= checkalloc) - checks consistency of allocation maps. Ensures that the maps are accurately stored. IAM, GAM, etc. ensure no two tables have the same map set, ensure that the GAM and the IAM match up correctly. Check PFS pages - is page allocated, is allocation page, etc.
  - so far this is fast
  - If running repair, it's run on allocation problems.
  - logical system table chceks and repairs
    - eq of checktable on these. Works in parallel
     - check indexes are correct, b-trees are correct, page audit, etc.
     - If page checksums are turned on, this is checked.
  - logical checks of user tables (and repairs)
    - this runs in parallel as well
  - service broker checks
  - metadata checks (check catalog, wasn't included in ss2k)
  - indexed views and XML index checks (and repairs). Determines if the content of the indexed view matches the definition of the view. Can determine if there are extra views or missing rows. Makes new index to check. Similar thing with XML.

You can check percent complete column to figure out which stage and progress in the DMV (dm_exec_requests).

Physical_only?
- only first two checks
- then reads every page to do checksum checks and page audits.

Leaf Index Checks
- Page audit (buffer pool does torn page or checksum checks). Can eat a checksum failure and continue on. Reports the issues.
  - look at page header, check that data is what is expected for that page.
- Record audit - gross audit of structure of all records. Are offsets correctly set.
- B-tree check
  - for index and leaf pages.
  - check linkages
  - are key ranges correct?
  - partitioning - is the record correct for the partitioning function.
  - key ordering - records on the page are not stored in order. They are stored where they fit, so there's a key ordering store for the records on a page.
- per record checks
  - complex columns - checks the pointer to the values. ensures that what's stored in the row is what's stored elsewhere.
  - computed columns - make sure persisted values are correct according to the formula.
  - Data purity - not all types have range checking. Datetime has minutes after midnight. Must be < 1440, but the stored value could be higher in theory. So checks are performed to be sure the stored value is within the range.
  - 1 to 1 mapping - ensures that a nonclustered index maps to one base table row.
- Page counts
  - counts in header are correct.

No blocking of the table on SS2k and beyond.

How often do you run this? It depends.

checks could approach n<sup>2</sup> complexity. Fastest way is N * log(n). Rather than following links, as pages are read, various facts are generated about each page.
 - A (actual) - we read A
 - A (sinbling) b - who does A point to
 - P (parent) A - when A is read.

 Where it's an end page, nulls are generated, For parent pages, a psuedo-parent is generated. A, B, O, are really the page IDs, index IDs, etc. these are sorted and then checked. Every node must have 3 facts (can be null) or there's a missing link. If there are too many items, like 2 parents, then there's a linkage issue. This is done so pages can be read in allocation order, which is the fastest way to read them. Other checks use a similar method with different facts.

 Index matching. It's getting complex, but the idea is that a bitmap is generated. As the table (heap or clustered index), is read, a hash is done and a bit is flipped. As the index is read, it's hashsed and it should map up to the same value and the bit is flipped back. So as long as a bit is flipped twice, it's good. so it digs through and can reuser the bitmap for multiple tables. As long as everything flips twice, it's ok. If at the end they do find a set bit  a "deep drive" is performed, which is very time consuming to find the corruption. This first check is 30% of the resources of the deep dive and is fast. So if checkdb starts to take a long time, it's bad.

 Before SP2, you couldn't tell a deep-dive was being run. Post SP2, an error message is now logged.

Repairs
- purpose is to make the db consistent. NOT protect data. Data is not necessarily deleted, but the repairs are efficient, fast repairs and data can be lost.
- mor eintrusive repairs ar edone first. Like allocation map for an index repaired before a missing row. The more drastic ones often fix less drastic ones.
- Not everything is repaired, or known. You have to check.
- Not feasible to make it online. Too complex.

CHECKDB doesn't run faster after index rebuilds.
If you run CHECKDB and find errors, does it run again with the repair option? Yes, it runs twice. Because something could have changed.

Space required ro run checkdb.
 - facts require more space, so tempdb space is needed to store facts. Run with_estimate_only, to get an idea of the space for facts.

 can internally do single index row inserts and deletes.
 If a value in a row is corrupt, the row can be deleted.
 If a pointer is corrupt, the entire page can be deleted.

 to be safe, you can start a transaction, run repair, and if data is deleted, you can rollback if needed.

 checkdb does not respect key relationships (PK, FK). It is structurally complete, but you could have orphaned rows. Need to make checks.

 Some things are unrepairable
  - system table clustered index problems
  - PFS pages
  - Data purity errors.

  Beware of third party kernel filters not working with CHECKDB that doesn't handle alternate streams.

  Not recommended to have anti-virus on SQL Server because it can interfere with IOs


Posted 05 June 2007 15:36 by Steve Jones | with no comments

Beyond Relational
I missed this yesterday, but I saw it on the schedule again, and I knew I had to make this one. It's titled "Beyond Relational", which I expect to mean it's focused on enhancements not necessarily related to rows and columns. But I didn't see the abstract, so I'm not 100% sure what it will be. I'm hoping it's not too much time on GIS stuff.

Jnue CTP is out and is the first public release. They are definitely looking for feedback on the features that are already baked in.

4 pillars of SQL Server
- enterprise platform - all the "-abilities"
- Beyond relational
- Dynamic development - moving towards model based development.
- Pervuasive Insight - BI everywhere, all data can be used in BI.

Beyond Relational - The explosion of data which means more and more types of data will need to be hosted.

Survey of the audience: how many have more than 1TB at home? 75-80% did.

Applications work with different types of data to be rich.
- relational (rows/colummns), structured data
- XML
- documents and multimedia data.
- spatial data

An example is an insurance claims application. In the past the application was forms mapped to tables in rows. Today we have pictures and videos, relating to other documents like witness reports, moving data in XML, and even mapping those claims to geographical locations.

Idea on new types of data is to ensure that the functionality available for traditional types of data is also available for new data types.

Goals
- Reduce cost of managing all types of data
- simply development of apps using both relational and non-relational data
- Extend services for relational data to non-relational data.

Additions in Katmai
- XML upgrades
- Remote BLOB Store API, FILESTREAM, Integrated FTS
- support geometry and geography types and functions
- Large UDTs, flexible columns, wide tables, filtered indexes, hierarchyID

WinFS? - most of team was folded into SQL Server team. A lot of the work went into SQL Server. The FILESTREAM type and large UDTs came out of this project.

XML Upgrades
- Full support for storing and validating Office 12 document formats.
- Support for lax validation
- Full xs:dateTime support
 - support for no timezone values and timezone preservation
- support for lists and union types
 - There is XQUERY support for the let-clause.
 - Added support for insert sql:variable("@xml") into /a/b

Doucments and Multimedia
Why store in the db?
 - integrated management and data level consistency
 - problems with poor data streaming, size limitations, and high cost/GB
 - right now you can use VARBINARY(MAX)
Alternatives
 - File server storage or dedicated BLOB store server (RMC Centera, Fujitsu Nearline).

None of these is the right way. Each works in different situations.

SS2K8 should deliver a few different platforms.
- FILESTREAM feature. Addresses limitations of SQL BLOBs ni making them work better.
- REMOTE BLOB storage - allows blobs to live in file server or dedicated store with reference in db.
- SQL BLOBs - more support for internal storage.

FILESTREAMS
storage attribute on VARVINARY(MAX)
- unstrcutured data is stored in the file systems (NTFS)
- dual programming model.
  - TSQL (same as blob)
  - WIN32 streaming APIs with T-SQL transactional semantics.
- Gives data consistency and integrated manageability (backup/restore, administraton)
- Size limit is the file system volume size
- uses SQL Server security stack
- demo - Kevin Farlee
  - create db with filegroups, one of which has an attribute as containing filestreams. the location is a directory that will be created. Different tables can have filestreams in different locations. Just like tables.
  - table is a "varbinary(max) filestream" type.
  - insert a row and we see
  - to work with the data, need to get a handle and logical path to get to the data in the filesystem
    "select get_filestream_transaction_context(), field.pathname from MyTable"
  - Use a Win32 call to read and write this data from the file system (fascreat??)
  - Showed transaction rollback as well as trigger fire from update to the file system data.
  - backup, drop db, restore db, and data is there, including filestream data.
- First release in 2008 should have these limitations
 - remote storage of filestream not supported
 - DB snapshot and mirroring are not supported (log shipping will work)
 - encryption and table value parameters are not supported.

My question is what does this do to the log?

REMOTE STORE BLOB API
- applications can link via a provider. So each application provider needs to deliver their own provider for their own system.
- simple api for fetch, create, enumerate, Garbage collection, delete
- returns reference that is stored in the db. By going through the db, there is some maintenance of the references. This should ensure some link consistency and allows the application to be loosely linked to the BLOB store.

Full-Text Indexing
Challenges
 - indexes outside SS have manageability problems
 - mixed query performance suffers from pull over complete full-text set
 - scaling issues on big boxes

SS2K8
 - FTI fully integrated into the engine.
 - Mixed queries should scale better. (query on "contains() and xx = i)

Spatial Data
- New types (geometry, geography)
- New methods (intersects, buffer, more)
- new indexes for similar performance to other types.
- Full set of OGC/SQL MM, ISO 19125 components
- Integration with Virtual Earth

- Demo - michael rys
  - intersection of map with zip code query.
  - added spatial index, performance is improved.
  - showed integration with Virtual Earth as well

Relational Data
- HierarchyID - store arbitrary hierarchies of data and efficiently query them. New UDT to implement hierarchies.
- Large UDTs - no more 8k limit
- Sparse columns - optimized storage for sparsely populated columns.
- Wide tables - support for hundreds of thousands of sparse columns.
- Filtered indexes - define indexes over subsets of data.  Index on rows in a table with a particular value.

Posted 05 June 2007 13:36 by Steve Jones | with no comments

Maintenance in SS2K5
Andrew Kelley from Solid Quality Learning did this one, which was good. I was worried it would be about maintenance plans, but it wasn't. Instead it was on good real life maintenance for your server and how to handle it. He did start out talking at the Maintenance Plans abstract you away from what's happening and can be limiting, so be sure you do understand what's going on, don't just blindly assume the plan wizard knows best.

Hardware
Since the maintenance can stress SQL, it's a good idea to look at hardware. First, memory. It's better to have too much than too little. Most SQL Servers are supposed to be memory bound, and memory is an easy and cheap way to boost performance.

For CPUs, more is better and multi-cores are definitely recommended. As your data size grows and load grows, even in maintenance, having more CPUs or cores helps. One caveat, it's recommended to turn off hyperthreading unless you really test. Since it's one core/CPU, you could end up with threads performing different activities, and slow things down. I still haven't see great metrics on this, but the consensus seems to be leave this off. So if you have an older CPU, be sure that's the case.

MAXDOP - You can set this at the server or statement level. Which can be handy if you are working on a system that is being used. You can reduce the use of CPUs by your maintenance statement by setting this to something other than 0 (use app procs).

Disks
Know your config. RAID 5 has been the standard, but it's gone out of favor for databases. It's write intensive and people seem to be more worried about failures. Andrew recommended R0+1 for most things, data, logs, tempdb. R1 was for the OS, logs, and tempdb. Not sure why not for data, perhaps because large files. R0 is not the ideal.

Beware of SAN or other storage people. Be sure your separate drives are separate physically.

DBCC CheckDB
Checks all rows, text offsets, etc. to be sure that they are correctly linked. 2005 added many more options and checks.
 - data purity - values within range for data type.
 - Row overflow pointers
 - Service Broker checks
 - XML and indexed view checks
 - torn page and checksum - set this for each database for the page level.

Be sure that when you upgrade, you run checks. It is possible for data to exceed type limits and be invalid.

In 2000, you could run CHECKDB and got CHECKALLOC and CHECKTABLE. 2005 adds CHECKCATALOG. SS2K also used a table level schema lock and could block log truncations. In 2005, it uses an internal snapshot to get a transactionally consistent view and doesn't block the log.

Run this on master, model, and msdb. They can get corrupt as well. Not as likely, but it can happen. Run on ALL databases.

When to run? Every night it best, but you have to weigh the risks v the time and resources you have. Run as often as you can.   

How can you speed it up?
 - Run physical_only to skip some checks. Does a good job of checking integrity, so use this if you cannot run the entire check.
 - Run it with TABLOCK. It runs faster, but it blocks other connections.
 - Run it against a recently restored backup. This allows you to run this on another backup.
 - Use no_infomsgs

File-level Fragmentation
 - Normal fragmentation you might see on your workstation.
 - Create your files as large as needed for needs at once. This prevents growth, which often fragments files.
 - Don't let auto-grow kick in. Proactively add space when needed. Auto growths can be expensive and pause your server for the connections doing work. Instant file initialization helps, but requires Enterprise Edition, permissions for the SQL Server service account, etc. NOT TRUE FOR LOG FILES
 - Dedicate the drives to SQL Server, not other apps.
 - Use an OS level defragmenting tool.
 
 Shrinking a Database
 - This should almost never happen. This removes free space, which fragments files, causes a need for growth, etc.
 - costly operation and FULLY LOGGED
 - Keep the free space you need for maintenance activities.
 - If you need to do this, use SHRINKFILE, not SHRINKDB. More control and lets you stop after each file. It is online and it's stoppable.

 Optimizations
 Why? to deal with logical fragmentation and page fill.
 
 Logical Fragmentation - The physical order does not match the logical order. So index on names, alphabetically, might be stored as Allen, George, Henry, Rhonda, Ken, Tom, Billy. So the server jumps around through the pages back and forth to follow the index. Page splits cause this.

 Page fullness - Also known as fillfactor. How much free space on a page. Caused by data changes. Reindexing regularly it to get this to your ideal value, not deal with corruption.

In 2000 we used
 - DBCC REINDEX
 - DBCC INDEXDEFRAG
 - DBCC SHOWCONTIG

 Now SS2K5 uses
  - ALTER INDEX REBUILD
  - ALTER INDEX REORGANIZE
  - sys.dm_db_index_physical_Stats

 The DBCCs are being deprecated and will not be in Katmai, so change your code now!

INDEX REBUILD
  - does all or one index and deals with existing RI and can work online
  - Adheres to fill factors and padding, updates stats, and can uses parallelism.
  - Best chance of getting contiguous index.
 However, there are cons.
  - Locks the entire table for the entire operation if it's offline.
  - One large transaction, so log space can be an issue
  - Can take awhile and uses lots of resouces. Requires 1.2x size of the index.

REORGANIZE
  - minimal locks, fills pages up to the fill factor
  - can be stopped and restarted
  - log can be backed up while this occurs
  - doesn't require extra free space.
HOWEVER, the cons
 - Can log several times the size of the index over time because pages can move more than once.
 - Doesn't update statistics and is single threaded. Rebuilds the leaf level only and works only on one file at a time.

Which is better? It depends. Use the one that works best in each situation. You may have servers that require REORGANIZE, but others that can use REBUILD. Need to consider the size of tables, maintenance window, size of tables, hardware, online requirements, etc. You don't necessarily need to do every index on every table every night. Might have different schedules for different databases. Check under SHOWCONTIG (SS2K) or under the DMV (SS2K5) is an example for reindexing based on thresholds.

Statistics
- Do not need to do this if doing a REBUILD. You don't necessarily need to do this unless your data changes dramatically. If you have auto-update turned on, that probably works.

Backups
What type of backups should you use? It depends.
What are your requirements? The recovery time is a big factor here. How fast you need to recover helps determine what types of backups to take. Full backup every night is a standard practice. Data loss is the other factor. Less tolerance for data loss means more frequent backups. And of course, the database size is a factor (data size, not allocated size). The larger it is, the less choice you may have.

Do we need to verify backups? Verify does not detect corruption (unless you use the checksum option), but verify makes sure that you can read back what you wrote.

Need to test your backups! Ensure that you can restore them or they could be useless.

Disk or Tape? Disk first is preferred, not the same disk as the data resides on, but it's quicker, cleaner, and available if you need a quick restore. Move to tape for long term storage. If you use remote drives, use UNC pathing, not mapped drives. Third party tools are a good idea to compress the backups and save space and time.

Don't use INIT or the same file every day. It loses flexibilty and increases the chances of problems. The maintenance plan does this for you, but lots of code samples here at SQLServerCentral.com or elsewhere on the web.

Watch our for backup history. Every backup gets recorded in msdb. The maintenance plans can clean up the history, but if you don't use them, be sure that you clean up with sp_deletebackuphistory periodically.

Scheduling - Know when things are happening. I agree with this and in most of my environments my servers have been small enough I could arrange things well. But I do see where something like sqlSentry is needed if you have lots of jobs. Since maintenance is resource intensive, be sure that you don't schedule things a the same time, like everything at midnight. You'll make everything take longer.

Things not built in, but need to be done
 - check for free disk space
 - check for free space in files
 - scripting jobs
 - scripting dbs
 - delete old backup files (or log files
 - chcek for log errors
 - check for job failures.


Posted 05 June 2007 12:45 by Steve Jones | with no comments

SQL Server Always On
Everyone wants 5 9's of uptime, 24x7 support, and instant fixes for issues with their database server. Never mind that I'd argue that most DBAs don't really need the "Always On" technology, but it's something that sells. So I thought Kim Tripps "Demo Fest" would be a cool seminar to see for a DBA.
Kim is a great speaker, entertaining and very comfortable up there, so if you get the chance to see her, I think you'll enjoy it. This was a demo session, really a couple slides and then demos. She was looking to cover:
 - Mirroring
 - Snapshot no mirror
 - Online Indexing
 - Peer to peer replication

 Using VPC images, we got to see the various demo items. One of the intersting things is she showed some tips on how to setup your lab or test environment. Kim has 5 instances on her image, but she also has them set to not start by default. Instead there are batch files to start and stop services for specific labs.

 Mirroring was the first step, which is one of the hot new features of SS2K5. An interesting fact that I knew, but you might not, is that the witness server, the third (optional) server in mirroring, can run on SQL Server Express. So it's a very lightweight requirement for this server instance. Plus the cool thing about a witness is it doesn't cost anything with Express.

 I don't think mirroring is that hard to understand as it's simlar to what I've been doing with custom log shipping scripts, but it's something many new people to SQL Server have to dig into a little to understand. A great, simple explanation was given on what mirroring is and the limitations and restrictions. Like driving home the point the primary can only be in full recovery mode, no simple or bulk-logged, and that the mirror cannot have its recovery model changed. So reporting cannot be directly done off the mirror.

 Scripting mirroring isn't something that's easy to do. No wizard buttons for it and while Profiler can be used, it's a hassle. Kim Trip has a script that should go up on SQLSkills.com, using SQLCMD, after TechEd.

 If you haven't looked at SQLCMD, you should. As Kim mentioned, it's got a lot of advantages in its scripts: variables, error handling, loops, labels, etc. She also gives good reasons to be sure that your directory structures are the same on the principal and mirror. Some good tips and tricks to be sure that you don't get yourself into trouble with this technology. Another big one is to be sure that you don't get into trouble with the setup because of the timing of the restore to allow your mirror to "catch up."  

There's also a DMV for mirroring, sys.database_mirroring for those of you that don't want to work with the Mirroring Monitor.

In the Object Explorer, your database will show as eitherthe principal or mirror and whether it's synchronized. The mirror also shows it's in the "Restoring" state.

For monitoring, one of the cool things was seeing a small applet that watches a table on both the principal and mirror and counts rows. It actually connects to the principal, but the attempt to connect to the secondary, it gets redirected to the mirror. Kind of an interesting way to show the implicit redirection. If you put the mirror in your string, then you'll move directly to the principal. If you want to connect to the principal with a failover, then you need both servers in your connection string. One issue with the connecting to the mirror instance, is that it must be available. If it's down, you'll get an error.

Failover can be via SSMS or T-SQl, as well as a real failure of your principal of course :). The connections that exist, when something fails, connections break and transactions rollover. So there isn't a transaction movement from one instance to the other.

Be aware that if your mirror is down for a long time, you might have a "Synchronizing" state for some time to get back in synch. That means that you aren't really protected while things are catching up. Which means you need to get your mirror back up as quickly as possible.

The witness checks both the principal and mirror for "aliveness", but it's forming a quorum with one of the other two servers. This means that it can die and it doesn't affect the log on the principal, or even the transactions moving to the mirror. There is a problem if the witness goes down in that if the principal loses connectivity to the mirror, it thinks it might be down, and the mirror and witness can see each other. So it closes the database down, which means losing the witness and the mirror will bring down the principal. Not a great situation.

A Reporting Mirror
You can't use the mirror to report, but you can create a snapshot on the mirror database. I had someone ask the question at the Colorado Code Camp that we can create a snapshot on the mirror, we can read the snapshot, which goes back to the mirror, why can't we read the mirror? I asked and was told it's coming. Not sure about Katmai, but should be out by at least the release after that. Same for snapshot backups.

My answer at the Code Camp, when no one else had one, was "it's a feature" :)

Some interesting questions. What if you have a clsutered principal that has a mirrored database. If you have a failover on the cluster, you might have a mirror failover as well because of the time to restart the cluster. You can change the mirror timeout to fix this.

Someone also asked if the principal/mirror transfer is a two-phase commit. It's not two-phase commit, but a new algorithm that writes the transaction to the Principal's log, then writes it to the Mirror log, then commits on the Mirror, then commits it on the Principal. If the Mirror is down, it still commits on the principal's log. If the witness is down.

It's not complicated, but there are some moving parts and it can get confusing in some scenarios.

No license required for mirror being used a mirror only. If you report off it, you got to get a new license for that server.

Snapshot files look big, but check the space on disk. They're "sparse files" and are really small.

Online indexing, some blocking that could occur at the beginning and end of the process, but not that much.

A quick look at Peer to Peer
Because of time, this was just a 5 minute session. Different than mirroring. Mirroring seems simpler, and more protective in that it happens quickly.

Repl is more scalable because you can do partial replication, only move certain data, move to multiple places. This means that you can have multiple peers and have data move among them all with bi-directional transactional replication between them all. Requires some app work, but it's pretty cool.

Posted 05 June 2007 08:35 by Steve Jones | with no comments

SQL Server 2005 Tools
A good talk on the tools from the guy that's responsible for SSMS, Configuration Manager, Maintenance plans, Surface Area Configuration Wizard, and a few more.
Initially I thought this might be a dud, with him showing some of the basic features of Configuration Manager and Surface Area tools,but then he showed some interesting tricks. Like how to connect to different version or filter the types of connections in the registered servers pane (hint, check the icon buttons). He also looked at filtering and some of the scripting options available. Changing defaults is nice because I've been frustrated by them in the past, with my scripts requiring lots of search and replace.
He covered maintenance plans as well, showing some of the new options in SP2. Sub plans, target servers, etc. No mention of the bugs that SP2 introduced, but it wasn't mentioned at all. Another great place for an apology for the hassles, but it wasn't shown.
The summary page (RTM, SP1), became the Object Explorer Details. Paul detailed a few other changes, which I'm not sure why they were made as they confuse people, but it was good to see them listed.
One of the things I've worried about is configuration changes over time. Since this has required lots of work to track, it was great to see the standard report of "Configuration Changes History" report. Now it's coming from the default trace that is very lightweight, but it only keeps 5 files, each 20MB, so you could lose a bunch of information.
These reports show up in Object Explorer with SP2, Before that, you'd have to dig through the tool. You can make your own reports in BIDS with a Report Project. Create a query and then format the report. Ignore the report server and deployment folder. Load these into SSMS and you have lots of data available. There are tons of reports, so search around for the dashboard performance reports.
Lots of work went into the Database Turning Advisor, making it its own product.
Profiler allows SSAS traces in SS2K5. One of the cool things is integrating the System Monitor with a trace to correlate events.
The Best Practices Analyzer, gives a good check on your instances. It's come out about 18 months after RTM, which is better than SS2K, when it took 4 years before it was released. Look for this to be updated with the  Katmai RTM. Results are linked to BOL, so this is a nice integration of information. They also include some rational for the recommendations.

Posted 05 June 2007 07:30 by Steve Jones | with no comments

Reunion
We're not partners anymore in SQLServerCentral.com, but Andy, Brian, and myself got together last night for dinner. We haven't seen each other since the PASS Summit and this was a good time to get together. We had a nice dinner, reminiscing about the site a bit, them giving me some feedback on how they think I'm doing and a little strategy talk about our training business at End to End Training. Our first mentoring class is this week, which should be interesting.     

Posted 05 June 2007 06:41 by Steve Jones | with no comments

Working Afternoon
I ended up driving around Orlando this afternoon to get books. We'd printed the next version of our Two Minute SQL Server Stumpers and I'd had it shipped down to a friend's house. He didn't make it down to TechEd, so I drove up and picked up the books. So I spent the afternoon running around Orlando and with some dinner plans. I cut out early from the conference.

I did spend some time talking with some Microsoft developers talking about Katmai and their plans and lined up a few more interviews, so look for those soon. I met up with Tony Davis of Simple Talk and we strategized about some new features that we'd like to bring you soon.        

Tomorrow will be a long day.     

Posted 04 June 2007 14:22 by Steve Jones | with no comments

First Session
I dropped into Brian Knight's "Clustering in 60 Minutes" to say hi. If you've never seen this, it's pretty cool and worth the watch. Brian makes clustering look pretty easy, and he really sets up a cluster in the session from scratch, so check it out. It's updated for 2005, but I've seen it, so heading to the SQL IO Demystified session.

And I can't report on it. There are at least 15 people standing outside trying to listen in to this "informal talk" with the room packed and people sitting all over the floors. Can't really hear, so I talk to the people from PASS and say to to a few Microsoft developers I've met over the years.  

Posted 04 June 2007 09:37 by Steve Jones | with no comments

Tech Ed - Keynote
These will get pretty long as I'm kind of wordy, but these are impressions as things are occurring.
Registration is a little crazy. I'm not sure who desgined this conference, but it's split among the two sides of the conference center in Orlando. I've attended 3 or 4 Tech Eds, a couple PDCs, and half a dozen other conferences. In all of the cases, the entire conference is in one building. A big one, but one building. Here the keynote is in the west center and the show/expo, and the backpack you get loaded with marketing material and swag, is in the South center. If you've never seen the Orlando convention center, that's a big deal. There's a 1/2 mile walk outside in the Florida heat between them. Not a big deal to me, since I like the heat, but I sweat, so it's not so nice for my neighbors.

Keynote
  I arrived late, and it was packed. If you've attended others, you know that it's usually the case you have 4 or so large screens. It seems most conferences have moved towards dual screens, showing the speaker on one and a computer on the other. The recent BI conference had 2 pairs of screens. TechEd has 6 pairs. It's a big show.

  Talking about models for optimization of your IT processes. It's a good idea and I think we need more reference models and also most explanations on how to map our internal IT processes to the model.

  Energizer - The first video was from Envergizer, which was also a featured company with the BI conference. They must be a heavy Microsoft user. Their initial plan was to be 2 years behind any release of software. Sharepoint is a featured, which shows just how much focus is being put on thie particular piece of software.

  The focus on IT this year seems to be working on an agile, rapidly changeably infrastructure, being able to move in real-time.
  Gartner
  - Connections more pervasive
  - reponse time expectations are shrinking.
  - technology can really enable differentiation.
  - relationships are online and short-lived (is this true? Or has business pushed us this way? In some ways it has, in others it hasn't.)
  - Agility is hard. It seems that the gap between agile and not-so-agile companies is growing. Agility is sensing a change and efficiently and rapidly responding.
  - 70% of IT budgets maintaining. But successful companies have moved that to 50-50.
    - Focus on costs, cost based on usage, not more efficiency. A utiltiy model
     - Quality of service with management based on policies. Matching up what IT can do with the business needs. Not everything needs 5 9s or subsecond response.
     - be agile. infrastructure is more dynamic.
  - Need to build other metrics besides cost to allow you to measure this and improve it.
  - Ask business what they need and translate those needs into measures and begin to work with those measures.
  - Gartnew has an IT maturity model
    - Focus on process, technology, and culture. All three are important.
     - Don't use long-termprojects. They usually fail. Work on smaller pieces instead with ROI measured in each stage. Not just cost, but also quality, and agility. Usually it is cost recovery in early stages with later stages needing investment to gain on quality and agility.

  XEN open source software is including Microsoft interoperability in their distrobutions. Licensing agreements are occuring, so it seems thatt Microsoft is trying to work with open source software.

  There was a good docus on the interoperability and the efforts their making to work with other software and embrace stadnards. I'm not sure how much I believe it, but they do seem to be working a bit better with ther systems. Is it Microsoft or the toher companies?

  The Back to the Future car was on stage and Christopher Lloyd showed up on stage a few times.  Microsoft does go all out at TechEd and they usually have some godo guests.

  Windows Server 2008 Demo
   - Server Core - Minimal installation option for specialized installation. This means that a file server canjust have the file server services. Good for virtualization. IIS, file server, AD server can be roles. Only a command line interface for server admin.
    - Virtual Server lets you use multicore cpus.
    - System Center has a virtual server manager. VMWare machines can be converted to MS virtual machines. Built on powershell.
    - Physical servers can be converted to virtual machines.
    - move VMs from one host to another. Quick Migration is the option that you can do today with Virtual Server.
    - Demo was a bit fast, but well delivered.

  - New model driven management tools
    - System Center, Operations Manager, modeling applications for a company. Operations Manager can see errors, like web erors. There is a "Problem Path" section that can highlight all the components showing errors. A lot of design and documentation work to get things set, but it certainly could assist your production team in managing servers. Not sure about security, rights, etc., but it's a good idea in general.
     - Katmai - Policies in Management Studio, and we can set policies for servers. This allows us to set settings for standards and I can now see if a database is not set according to policy. Things like db options, etc. These can rollup to System Center.

SML is a theme. This seems like the next wave of XML use in Microsoft. We had the XML for visual displays in Vista, and now we're seeing SML for system modeling.

Services are important. One of the four pillars that Microsoft is pushing is the need for software services. SOA is definitely something to watch and learn a bit more about. BizTalk has been working with "services in the cloud" where disparate companies can easily set up services that they can use to talk to each other. Not sure how well this will take off, but it's interesting.

Kind of a cool demo showing the Dundas technology Microsoft has licensed for Reporting Services 2008. An easy to build report with a map control in it. At the BI conference, I also saw the Excel and Word integration technologies for building reports. The rest of the demo was a bit drab. Not a great presented, and it seemed that they were trying to push Biztalk, but I didn't really see it as a great technology to use. Maybe I don't get it, but I think if I don't, then most people won't.

It's nice to see that Microsoft hasn't forgotten the user experience. The fourth pillar was absed on the user interface, and featured :Expression, silverlight, and Office with Visual studio. VS2008 was used with the office tools included. An extension to Outlook was built. I can see people getting a little crazy with this, but it's a good idea. Getting more, rich Outlook messages is a good step to ensuring people stick with the platform. And it makes for a better experience as well. But it does tie you tighter to Microsoft. I'm ok with it, but some people may be concerned. I think this is mostly for internal mail, so I'm not sure what would happen when it transitions outside your domain.

Losing my battery, more later.

Posted 04 June 2007 09:31 by Steve Jones | with no comments