A Partial List of Things that Tripped Me up as a New DBA, Part 2.

In an earlier post, I detailed a short list of things that tripped me up starting out as a DBA. Some of these are things that trip up developers, others are things that trip up DBAs. Whether I was the cause or the one that had to fix it, I was involved, somehow.

This continues that list. Again, there are no answers here; only questions.

Why are the pains of using NVarchar(MAX) not shoved down a developer’s throat by SSMS? Developers understand data structures; It’s easy enough to visualize what data can fit on an 8K page (or some representation thereof), so why doesn’t SSMS warn the developer the same way the compiler does? “Hey, using NVarChar(Max) means your pretty little indexes won’t be awesome.”

Why doesn’t NOLOCK generate a similar warning? It’s a bad idea in most cases and doesn’t do what you expect, so why not hit developers over the head with that detail?

Why don’t uses of Table Variables generate warnings about bad statistics?

Why aren’t execution plans and Statistics automatically generated for each query run in “developer” mode? Developers are used to seeing those sorts of metrics; and with a few simple default changes SSMS could get developers to start to pay attention to things that really matter.

Views. Why doesn’t SQL Server complain when you use nested views? Seems like something to tell a developer. I spent two days unraveling nested views just to find out the de-nested view generated a different query execution plan than the nested view.

Shouldn’t SELECT * generate a warning EVERY TIME A DEVELOPER USES IT? It shouldn’t matter if it’s in the query window or not — If we subscribe to the idea that developers need to know this stuff, why aren’t we beating them over the head with it?

Why does SQL Server let you set up a LINKED Server as SA without some sort of “You’re an Idiot” warning? In fact, why does SQL Server let you use SA for anything without some kind of signal that it’s a bad idea?

Why does changing the name of a SQL Server box (to the name of a box that is still in AD/DNS) result in dire delayed pain if you’re using Windows Auth? It works, it works, it works, and then all of a sudden: No one can log in using Windows Auth (I was bitten by this twice before I realized why the renaming did it).

Why doesn’t SQL Server use the Begin/End Transaction pattern by default for SSMS queries? Just put it in the template; don’t give people the option of choosing the implicit transaction settings. Some junior DBA may forget what is implicit and what isn’t, and end up filling the transaction log with uncommitted transactions.

Why does SQL Server throw the principle of least privilege out the window? Why doesn’t it make it hard to do something stupid security-wise? Good software helps its uses fall into the pit of success, not the pit of “We just Got Owned”.

Along the same lines, there should be warnings for unparameterized queries as well; make the DBA turn them off; don’t just let it happen.

For that matter; why doesn’t SQL Server have some generated tables/template scripts for various business needs? I recognize it’s a general purpose Database; but Microsoft doesn’t have some idea about Stored Procedures or tables for User Account Management, or half a dozen other business purposes. You can’t swing a query plan without hitting templates for most Microsoft products, but for SQL Server, they’re conspicuously absent.

Why are our options for importing data into SQL Server to either use SSIS (that you pay for), BCP, or some application code? Why is it so difficult to import a CSV file or XML file into SQL Server? (Unicode will get you every time).

Editor’s Note: Apparently there’s a way to get SSIS for free. It’s also available with the Express version; and the issues I talk about here are available in Express.

Why doesn’t Microsoft provide SQL Server as its own bundle with Windows Server set up to reasonable defaults (like Instant File Initialization; and read/write access to what SQL Server needs)? It’s not like there are people out there that think, “I’d love to use SQL Server, but I’d rather put it on Linux”. Microsoft knows what Windows Server needs to be configured as to use SQL Server; why not bundle it?

Why is it so painful to script SQL Server privileges for Read/Write access to the filesystem? Part of my DR work has been to automatically stand up new servers when needed, and I spent more time on getting privileges to work than any other part of the task (including writing the automated restore code to restore 200+ databases from backup to a single server when the system drive has lost all its data/MSDB is corrupt). I literally spent two days on that code (including unit tests) but a better part of a week trying (and failing) to get user access rights set up through scripting in AWS.

It’s 2015; if it weren’t for AWS, I’d have set up a few dozen more SQL Servers than I’ve had to — why does it take the cloud to make such a thing scriptable?

Why isn’t there a better API for Windows Server + SQL Server administration stuff? Do they still envision a DBA clicking on settings? Really? My options are T-SQL, or PowerShell that emits T-SQL.

Why aren’t there giant warnings on Triggers? Why are they so hard to debug? Sell a bundle; SQL Server + MSMQ or your favorite Queuing service. While you’re at it, give me a better API into SQL Server.

Why did it take until 2014 for Microsoft to have (nearly) sane RAM licenses?

High Availability, Disaster Recovery. As part of a startup, my options are: Write it myself, migrate to Postgres, or pay a lot of money for Microsoft to do it. We’ve punted on HA and focused on DR for now; but that’s a similar scene: Write my own or pay a lot of money.

Why doesn’t Microsoft give some sort of guidance about usage patterns for different databases? In a truly multi-tenant environment, it’s almost impossible to find out who the real users are. If it weren’t for the Brent Ozar scripts and an internet full of small utilities it’d be nearly impossible to find that out easily.

Why are all the important DMVs, procedures, and tables hidden from view? Shouldn’t those be part of the default view for a DBA?

Deleting a database is not always a Severity 16 error. Sometimes it’s a non-event, and sometimes it’s a world stopping event.

Pager? Really? It’s 2015; is that really still a thing?
Why doesn’t SQL Server emit a warning when a database in full recovery mode hasn’t had a log backup in a predetermined amount of time (an hour seems reasonable)?

Why aren’t orphaned users front and center when restoring a database? Why doesn’t SQL Server try to restore users if it finds a match?

Wait Stats. SQL Server knows which are bad and which aren’t; or which are indicators of something untoward; but out of the box there is no notification mechanism. Sane defaults, people. Sane Defaults.

If SQL Server fails to compile a query; why doesn’t it send the query to the error log and tell us exactly which query it couldn’t compile? It has the query text!

Why doesn’t SQL Server give an easy option for a user to go in and out of SA mode? Sometimes I want to drop to SA mode (think: sudo), but most of the time, I don’t want my Windows user to be a SysAdmin; and I defintely don’t want to have to execute a custom command to connect to SSMS to log in as a Windows User that has SA.

Why isn’t there the ability to generate an empty database? If you choose to put tables in Model, those tables get created for every Database you create. If you want an empty database, you’d better keep the backup of an empty database before you mess with Model. Don’t delete it, or you may find yourself setting up a new SQL Server instance just so you can have an empty database.

Author’s Note: These are a few more things that have tripped me up as a DBA. In the coming weeks, I’ll have answers for some of these, but I may not. This blog post is mostly a theraputic look at DBA work; I’m really not angry (even though I may come across as such in this series).  DBA work means detective work. It means visualizing the problem with pieces of evidence scattered across a large surface area.  It means having appropriate tools in place before something goes wrong; and above all, it means being able to take a complex system and making it work.

The only thing I can compare it to is to understand the basis of flight, and then be asked to pilot many different types of aircraft  (including helicopters) by instrumentation alone, with all of those instruments hidden from view until you know exactly which one to ask for.

A partial list of things that tripped me up (as a new DBA)

To the untrained eye, Relational Databases are arcane, magical things.  I remember my first realization (as a programmer, mind you) that Databases do follow normal data structures and algorithms that any other program follows.

Wonder and awe.

I wish I were being facetious.

I’m still fresh enough in my exposure to databases that I haven’t turned into a grizzled old guy yelling at the NoSQL kids with their hats on backwards and their ACID washed jeans bearing phrases on their rears like “BASE” or “fault tolerant” (HA), and I haven’t forgotten my own struggles as a DBA. If anything, the mistakes I’ve made are only amplified now; they’re still new enough that I remember them as if they were yesterday (in some cases, they were only yesterday).

This blog post is a list of things that tripped me up. I don’t provide any answers here, only questions.

What are service accounts? Are they the standard NT SERVICE\MSSQLSERVER account types? If not, are they domain accounts?  How are they managed and created?

If Backups shouldn’t all run at the same time, and we should have our databases in FULL Recovery mode,; how do you set up a FULL, DIFF, and LOG database backup to not collide? Do you stagger the FULL/DIFF? The FULL starts at 12:00am and the DIFF at 12:05?

Why are some basic bad things (like the query processor running out of memory) not better catalogued by SQL Server? Why doesn’t it tell me, “Hey, this is the hash of the plan I failed on?” (or at the very least output the query to the error log)

Are transaction log backups supposed to randomly fail with weird messages? (That’s a SQL Backup error. I think I understand the cause, but I’d love to figure out how to stop it)

Why doesn’t SQL Server come with an automated backup/restore tool? Why must we keep reinventing the wheel ?

Why is RECOVERY for when you want your database up right now and NO RECOVERY for when you want to restore more backups? Shouldn’t that be the other way around?

Why do I need to use RESTORE FILELIST ONLY to see what the data and log logical names are for a database before I restore it from a backup? The .bak file obstensibly contains the name of the database I’m trying to restore. It knows. Does it want to know that I know?

And What crazy person decided that a database’s logical name could be totally different from its filename, and that you really do need both pieces of information to restore it? Why not just say ‘screw it’ and let me overwrite the original file name on restore if I don’t provide a logical file name?

Why does a transaction log backup failing for 1 database in 200 (on a single server) mean that finding out which database that failed take an act of congress? (Or really, a custom backup/restore checker) My options are to either spleunk through the log files, or hope the restore tool catches it before we need it.

Why are basic things like compressing the backup not available in Web edition?  This isn’t an expensive feature (Red Gate charges a few thousand a year for the privilege for 4 servers), but you pay out the rear end if you want it (just check the licensing costs between Web edition and Standard Edition).

Why does using Buffer Pool Extensions blow up SQL Server if you put the BPE file on an ephemeral storage volume? TempDB can handle it, why can’t BPE?

Why, in 2014, is Microsoft Guidance to email a database if it’s small enough?

Why are system databases still stored on C:\ if other options are available? Why is moving them such a giant pain once the database is in production?

For that matter, why aren’t Log files automatically pruned out of the box for everything?  Why aren’t there sensible defaults?

Why would you name two completely separate things (the log file, like the error log) and the transaction log file (sometimes just called the log file) named so similarly?  Why are we lazy with our words?

Why aren’t important things for SQL server screaming at you?  If my database is slow, why isn’t there a giant thing that says, “This is why it’s slow?” We all know you can’t trust the activity monitor.

Why are there Users and Logins? Of course, I’d love to create a user that can’t login to the database. Why that makes perfect sense (again, fledgling DBA).  Why do users get orphaned from logins when you restore a database? The names match; in what world would that be two different people? What insanity is this?

Database Mail? Why isn’t that easily scriptable from one Database Server Instance to the next?  Trips me up more than I care to admit.

Why is it when I kill a process, that process doesn’t show as killed?  Sometimes it still shows as “runnable”, and other times it stays in the “rollback/killed” status for a long time (depending on whether I’m using sp_whoisactive or sp_who2).

Why does the backup and restore wizard perform so badly over a network?  Why can’t I use the UNC Path syntax to access other drives?

Why aren’t necessary database metrics shown in their own view?  The DBA should have a necessarily different view than a database developer. Why is SSMS so generally crappy?  Why doesn’t it have better support (in 2014) for Intellisense, or command completion?  (It seems like starting a command off with RESTORE Database should be enough to let SSMS know to start showing you the options)

Why is it infinitely configurable; yet common sense defaults still aren’t set? Why does High DPI support still suck so bad? With as much money as Microsoft makes from SQL Server, they can’t spend time to improve SSMS? Why s the difference between an operational server and a server in RECOVERY (again, NO RECOVERY?) mode a tiny little icon difference?  Shouldn’t stuff like that be in giant bold letters? YOUR DATABASE IS DOWN.

Why doesn’t SQL Server tell you its ebb and flow for SQL Server agent jobs? It knows when it’s busy, it knows when it’s not; why doesn’t it warn you when you try to schedule a job during peak load?

Why are Reporting Services encrypted by default, yet the actual engine does not?  Microsoft went all “Do whatever you want” for SQL Server, but for Reporting Services suddenly got cold feet?

Why doesn’t SQL Server support a TMUX like connection through SSMS? If it does, why isn’t that the default?

Does anyone really create solutions in SQL Server Management Studio?  If we’re closing SQL_QUERY_1.sql -> SQL_QUERY_7.sql, do you really think we know which one is which? Why not make the default file name to show whatever the first 20 characters were of the query?

Why do I feel like an episode of Sherlock whenever something goes wrong in SQL Server? Server is slow… Well, there’s a hung SQL Backup process, could that be causing it? Let me kill it. Things got better! That must be it. Was it it? I HAVE NO IDEA. I know baseline, and I know what it’s like now, but this is like piloting a helicopter in a snowstorm.  I only know which way is down because gravity and death. Everything else? No idea.

Come to think of it, I’d pay for some Star Trek TNG style alerts and dashboards for SQL server. Think of the Brent Ozar scripts (and stuff like SP_WhoIsActive) in a nice view; catalogued and kept track of, with baselines, charts, and warning klaxons. Yea, I’d buy that.

I’m new to this whole DBA thing, for sure.  Maybe at some point it’ll be like looking at the Matrix; for now though it feels like there are grand opportunities to make things a bit less painful for DBAs. Maybe it’s just not profitable to do so?

My First Talk of 2015

In My Epic Life Quest, I went over what my goals for 2015 were. One of those bullets was:

  • Give 3+ Dev talks at work

Two weeks ago, I gave the first talk; an introduction to Git.  This was to the whole dev team, around 15 people.

Part of the leveling up is in the act of preparing the talk, but most of it is in the delivery and feedback cycle after the talk.  Or, as I said on Twitter:

Yea, that’s how I did.

So let’s talk about ‘why’ that was the case:

Initially, I had 25 slides; (all of which would take around 45 seconds a piece to go through) and then I was going to demo Git and show it in different situations.  Since my audience was full of developers that were primarily used to TFVC (Think SVN for TFS;with visual integrations and no command line component), I wanted to take them through the lingo and show how TFS is different from Git before demoing Git.

In practice, however, there were a lot of questions as soon as the talk started, all from one developer.  They started almost immediately with questions about Git’s branching model; or that it isn’t really different from TFS; or whatever. The developer admitted they didn’t have experience with git, and were asking these questions for understanding (in their words “because they didn’t see how it was different than TFS” or “What would we gain by moving to git”) ; but without meaning to, that developer became the heckler in the room.

Instead of asking them to hold all questions until the end; I answered their questions.  That invited more questions, and because I had already opened the door; it effectively sunk the flow of the talk.

In the end, we barely got through the slides (ug, slides) and into the demo ( My rehearsals had 30% slides with 70% demo).

After the talk I sent out an anonymous feedback form, out of 14 people that attended the talk, I received two responses:

“Very informative! Maybe sticking to the flow of the powerpoint more would have made things make a little more sense from a general view, but I understand that other peoples’ questions were the main cause of going a bit tangent at some points.”

“Content was presented in a thoughtful, meaningful way. Demoing helps to clear things up to those who do not have a full grasp on the subject. Perhaps talk about one thing and then demo it right away.”

Both were great responses; and based on those responses and more reflection, here are some of the things I needed to watch out for (and didn’t):

  •  If your talk relies on a flow; and questions interrupt that flow; then make sure to hold all questions until the end
  • a demo is worth a thousand words
  • Never engage one-on-one; always bring it back to your control; at the end of the day the only person who can let you lose the flow of your talk is you
  • Preview your talk by giving it to a smaller group; if you’re giving a talk to the whole dev team, then preview the talk to 2 or 3 people on that team first
  • If you’re lucky enough to know the personalities in your audience, be sure to practice dealing with those personalities.  If you’re giving a dev talk at work; then you know your audience.

Even though I know I bombed the talk, I find it very hard to look at this as a failure.  If Shipping Things is the goal, then this talk was a success.  It’s not easy (even though I’ve done it dozens of times) to give a talk to a group of people, and every time I feel as scared as I did the first time.

Yea, I failed on the outcome, but I didn’t fail from the standpoint of getting up there and doing it — and that’s what matters.  I’ll get better with each talk; I just have to keep getting up there and doing it.

Let’s be Purpose Driven

One thing I’ve struggled with as a developer is working on projects I believe in, with people I believe in, who share a mission I believe in.

It hasn’t happened often enough.

I worked in companies where I liked the people, or thought the work was technically challenging, but the company didn’t fulfill me.  I felt empty.

These business all had something in common: They were chasing the bottom line instead of a higher purpose.  It wasn’t until I joined the Fool that I was able to put words to those feelings.

When I worked at the Fool, Tom Gardner (the CEO) would always remind us what our purpose was: To Help the World Invest. Better. This wasn’t to make money (although that does keep business afloat), it was to show and help people raise their standard of living. Of all the ways to raise your standard of living, saving and investing are probably the best (playing the lottery: Worst). The Fool advocates a ‘Buy and Hold’ philosophy, but more than that, they advocate investing in companies you believe in. He talks about his views more in this interview with the Elephant Journal.

John Mackey calls the approach companies like the Fool are following Conscious Capitalism, and has even dedicated his company, Whole Foods, to that purpose (he also sits on the board of The Motley Fool).

Conscious Capitalism has four principles:

  • Higher Purpose: Don’t exist just to make money, but to further the world in some way. This may mean making decisions that cause you to lose money just because they don’t meet with your purpose.

  • Stakeholder Orientation: Your business should be orientated to your stakeholders: The people that work with you, for you, give you money to run this business, and the people who you’ve asked to help shape your business. This means that you should treat your employees just as well as you treat your customers and investors. Their success is your success.
  • Conscious Leadership: It’s not traditionally considered a virtue to show your weaknesses; but you should. You should be honest in your dealings; in what you know and what you don’t know. Focus on how to enrich the whole team to accomplish the business’s purpose.
  • Conscious Culture: Your culture is the sum of all of your interactions, processes, and outcomes in your company. With a good culture, you can conquer just about any problem. With a mediocre culture, you’re not competitive with companies that have a good culture.

More and more, we’re seeing companies that aren’t just there to make money, but to have a purpose. These companies have the opportunity not only to make money, but to make the world a better place. We need that. As the interview points out, millennials are looking for that in a company. If you had asked me what I wanted four years ago, I would have told you I wanted to find a home. Four years later, I understand what I want that home to look like.

  • I should be proud of everything my company does. If I’m not, that means bringing up the subject and talking about the issues and working to resolve them.
  • My company should be purpose driven. It has to be a purpose I believe in.
  • The internal culture should support fulfilling that purpose,
  • I should be proud to tell people where I work and what we do.
  • Diversity of thought and deed should be paramount to the the company. This means hiring a diverse group of people; not to match some sort of idea of equality (although that has its own benefits), but because diversity of thought and action makes everything better.
  • The second part to diversity is being able to speak up; if the culture doesn’t support speaking up without fear of retaliation, it’s not a diverse culture.

Silicon Valley tries* to pay more than lip service to this ideal, but given that Silicon Valley has a myopic culture, it’s hard to see if they’ll have any real effect in this area (If you’re looking for more evidence of Silicon Valley’s culture problems, spend a day reading Model View Culture).  Another reason Silicon Valley won’t produce these sorts of companies is that on the whole, SV is worried about Exits, not about building a sustainable, purpose-driven business. That isn’t to say a Silicon Valley VC driven business can’t also be a purpose-driven business, just that it’s going to be hard if the VC aren’t on the same page with the founders about the type of business they want.

That means that these purpose-driven companies that developers want to be a part of are going to emerge from places other than Silicon Valley.

I’m excited about the rise of purpose-driven companies. For my part; I’m going to continue to pursue opportunities to be purpose-driven; and for the future of technology, I hope you’ll do the same.

 

 


*While there are lots of companies that believe in diversity and being purpose-driven (like Mozilla); the good companies don’t get nearly the press the bad companies do (Uber, for instance).  Like it or not, the companies that do bad things (again, Uber) are having a bad effect on the perception of every company in Silicon Valley. It may not be fair; but this is my perception as an outsider.

*

What should a Programmer be able to do in an hour?

There have always been different approaches to hiring in tech.

Most famously, How would you move Mount Fiji type questions. In my first professional interview with a really large bank that you know about, I was asked a similar question by other developers who worked on a trading room floor. The interview consisted of various brain teasers and rapid fire questions designed to put a candidate through what it’d be like on a trading floor.

Having just gotten out of the Army during wartime, this seemed more silly than useful to me. Luckily for me, I did not get the job, and a little over a year later, the finance sector crashed and all of those traders/programmers were looking for other work.

Since then, almost 10 years ago now, I’ve been hired and have done my share of hiring. I’ve dealt with interviewers that wanted me to take a personality exam, an online coding quiz, writing a few small web applications (reproducing search, creating a site that showed the weather, and implementing UI features with various frameworks), and even more of those brain teasers everyone loves to hate. I’ve even done practical coding, coding with unit tests, and even a few interviews that focused especially on writing database queries. My success rate isn’t 100% — I’ve failed as many as I’ve passed. But each time, the interview showed me what the company was looking for (or rather, what they thought they were looking for).

In the case of the Trading Floor exam, they were trying to simulate actual live conditions. Why the software would be so crappy that it would fail during trading is beyond me, but apparently that happens enough to hire for it.

In other cases, they just wanted to see if I could code — there are enough programmers (!) out there that can’t that interviewers have to screen candidates somehow.

I know this because I used Fizzbuzz and determining if a string was a palindrome to weed out candidates. I’m somewhat embarrassed to say that even after letting some candidates work on either of these problems for 30 minutes, they couldn’t produce working code.

So I changed tactics; and started to ask candidates to show me some code they were proud of. One candidate produced an entire ream of paper worth of printed out code. I’m not sure what it did; all I can remember is watching the candidate pull out that ream and go “Here.”

Luckily some time after that I was able to interview at The Motley Fool; and their interview really reflected the sort of culture fit they were looking for.

The interview started out with an online quiz taken before the in-person interview was scheduled; essentially asking about parts of the C# language and .NET framework to weed out those that didn’t know it well. It didn’t just include trivia (is a String a value or reference type?), it included parts that would bite you if you had actually used the language or the .NET framework in production.

The in person interview was a little more chaotic — there was an hour of onsite coding in Notepad++; and then four hours of interviews with various members of the team. One person asked me how I’d model a rubik’s cube in code. That was an interesting hour (and by far the scariest).

At the end of it, however, I understood exactly what the Fool wanted; they wanted someone who was smart, could get things done, and (more than anything else) was a culture fit. So much so that they’d have you sacrifice lunch and spend 6 hours interviewing (I don’t think they hire that way any more, but if you are ever interviewing for the Fool, eat a large breakfast).

Each company I have interviewed at showed a bit of what they were looking for during the interview, and the interview itself became a way to understand whether the company was a good fit.

A more recent trend has been to ask candidates to do *X* in an hour. *X* could be anything, a small application; designing an architecture scheme for a system, whatever. That *X* is important.

Let’s say (hypothetically, of course) you’re looking for a developer and you want them to do something in an hour. What should that thing be?

Should it be some general programming problem where you’ll see what they can do (like implementing a calculator using TDD?)

Or should it be something more proscriptive — like implementing something in a specific framework using specific tools?

The danger with the latter approach is that if you’re too proscriptive, you may actually measure something other than you intend to. Even if my stack is in C#, .NET, Bootstrap, and ASP.NET Webforms (incidentally, that’s the exact system I’m working with now), should I ask a developer to implement Bootstrap for *X*? More over, should I ask them to use specific plugins for the bootstrap framework?

No. If I’m measuring whether a developer can code; I should gear the hour towards problems that can determine if they can write code.

If I’m measuring whether a developer can use Framework X efficiently, then I should interview for that — but be aware that it’s not a fair test of whether or not they’re a good programmer.

So this leaves me with a question for you: What should a programmer be able to do in an hour? Should they be able to take a foreign plugin from a framework they may or may not have intimate knowledge of and implement a trivial/non-trivial feature in it inside an hour?

It’s a question I can’t answer. Some days I can spend an hour on a problem and resolve it, some days I can’t — it just depends on the nature of the problem and its documentation.

With the proliferation of frameworks and problems we attempt to solve, I have a hard time understanding where programming begins and implementing a specific framework ends. Should I hire for the former? Or the latter? Should I hire based on whether a programmer can solve an arbitrary problem in an hour? Are there really times where you only have an hour to solve a problem?

Unsatisfyingly, I have no answers here. I don’t know what a programmer should or shouldn’t be able to do in an hour. Sometimes I feel like I’m a great programmer, other times I’ve wanted to quit programming in my stack completely because of some silly issue.

So I ask you: What do you think a programmer should be able to do in an hour?