Tuesday, 14 April 2020

T-SQL Tuesday #125 - Unit testing databases - How I Found Value in Unit Testing a Complex Database Procedure


I’m writing this post as a response to Hamish Watson’s T-SQL Tuesday #125 invitation. Hamish’s question is this: is unit testing valuable for implementing in databases?

I'd like to write specifically about two places I’ve considered implementing unit testing: a complex data cleanup procedure, and the mass of stored procedures behind SQL Server Reporting Services reports. I believe that unit testing is absolutely required for a large complex process, and encourages several good practices as a side-effect. In the day-to-day development of the stored procedures behind SSRS reports, I’m less sure of the value of a comprehensive suite of unit tests.

When I started working as a junior SSRS developer nine years ago, I didn’t have any notion that unit testing was a thing. The reports ran just fine, and I fixed them where any errors occurred. There was no need to write a separate piece of code after I’d written a report to validate that it worked. Because the queries were hard-coded into the reports, there was also no way to do any sort of database unit testing, so that wasn’t a concern.

Below I’ve tried to document the phases of my relationship with unit testing my database code, and why I don’t write unit tests for the stored procedures behind SSRS reports.

Phase #1. Discovery, and my response

I’ve worked with application developers who chose not to unit test. When I learned about the importance of unit testing in software development from the outside world - a talk by Hamish Watson, YouTube, blog posts etc, 

I believed it was a given that developers should be writing unit tests. I felt like the developers I worked alongside were not fulfilling their jobs properly. I remember one instance where testing would have helped detect that a feature was nonfunctional earlier, instead of the days or weeks it took to identify. In my newfound understanding of unit testing, they were supposed to be in a position to show me how to write unit tests.

I remember one short conversation between myself and that team about unit testing and situational awareness, where one of the developers looked at me perplexed and maybe slightly disgusted. I don't remember what communication happened in that conversation, but looking back I expect the way I communicated in that conversation was not helpful, or on reflection, fair.

At some point, I realised that I needed to stop beating the drum regarding my conviction that they should be doing unit tests, but I, as a database and reports developer, didn’t need to. I suspect that was in 2017, my Google search history shows that I watched TDD for those who don't need it in that year, a talk I highly recommend, which challenges the mentality I had.

Phase #2. Failure managing a complex data cleansing project.

In 2017 I was tasked with a massive data quality improvement cleanup project. There were many duplicate records in a master table for one of our systems, with many dependencies. Several of them were in tables for areas of that system that I wasn’t familiar with, areas which I barely understood. This level of complexity was far outside of my previous experience.

First I started designing a process that merged duplicate records from the bottom-up using a batch-based approach so that it was fast, fixing all rows in each of the tables in the lowest-levels first then working up, but when I realised that sometimes the parent objects wouldn’t exist under the entry I wanted to merge a duplicate into, I stopped, threw-out the bottom-up progress, and started working top-down. 

Because I wrote much of the procedure from a top-down approach, it took a long time to see results and get any outside feedback. If one of those statements was poorly coded, or a nuance had been overlooked, I could be causing some significant damage, and I might not notice it for days, weeks, or months after the problem had been introduced, likely by reports from the end users who knew the data best.

It was seriously painful for those around me who were waiting on that project to complete - the problem was causing duplicate rows to show on the UI, prevented a process change, and required the developers implement some complex workarounds that I didn’t understand to hide the data quality issue during the months that I was working on this. The flip-flop in architecture and the delays were the result of, in part, knowing how serious this code was, and how unlikely it was that all of my code would just work when I eventually hit that execute button. This anxiety was added to by not being able to hold a picture of the whole system in my head due to its size.

During the development of this system, I became a fan of the television show Can’t Pay We’ll Take it Away, as the task that I’d already sunk so much company time into was beginning to be chronically over-schedule. I also put in some serious overtime to try to get that project complete. I need to note that I felt a significant moral obligation to get this done, rather than scared, and the overtime put in was not requested by my employer.

Phase #3. Regaining control using tSQLt.

I remember working on this project on the afternoon of December 25th, 2018, thinking working on this project was the same thing I had been doing the afternoon of that day a year earlier, and it no longer seemed cool, at all. I needed to find a better way to work. Something more sustainable, that would help me make verifiable progress. I recognised that unit testing could provide that. I need to note that this project wasn’t the only thing I did that year, this work was mostly sidelined by other tasks during business hours.

After watching the Pluralsight course Unit Testing T-SQL Code with tSQLt by Dave Green on the SQL tSQLt testing suite, I implemented tSQLt tests for all of the components of the data cleanup procedures. Some of the tests were written using TDD, and other tests written after-the-fact. This pulled up mistakes, and I fixed those, until the tests all passed. It found a good crop of mistakes early on in procedure development. I love how easy tSQLt is to use to build test suites with, and appreciated being able to see actual vs expected variations in the test outcome.

More significantly, writing unit tests gave me the confidence that I needed when it came to clicking  that green triangle on the execute button. Knowing that I had thoroughly tested the individual components gave me some confidence that it would function as a whole.

tSQLt doesn’t support linked servers, so I did need to move databases and remap synonyms to references in those databases, and there were some loopback issues that came up later in dry runs, but those were the limited final pieces on the end of the testing process.

During that project I also learned the importance of modularization, in my case using inline TVFs, to reduce code complexity, and reduce lines of code. This helped speed up the development of both the unit tests and the actual stored procedures. The practice of breaking complex stored procedures into smaller more testable blocks of code is also a good pattern that the work implementing unit testing reinforced to me.

I still chose to run the merge procedures before daylight, and over several days, but none of the code I had written and tested needed redone, and none of the critical failure I was concerned with eventuated. I was glad to see the project complete without significant data damage, If you ignore the serious schedule overrun, and a minor incident around which fields to keep in one table for a merge conflict.

Phase #4. Post-project, returning to business as usual

I work with over 200 SQL Server Reporting Services reports, and the volume of reports and the wide range of other responsibilities my team is tasked with, has resulted in limited capacity for ownership, instead falling back on taking tickets and responding with a report, without a good understanding of the reports that have already been produced. This has led to some unhelpfully specific reports, some essentially duplicate, and some reports that have not fulfilled their need for a long time, but still get enough user hits for them to show as in use and avoid being pruned away. In other terms, the reports have been regarded as somewhat disposable.

A side-effect of this is that the SQL queries behind these reports are hardcoded into the reports. At 2019’s SQL Saturday in Christchurch, a panel with Hamish Watson and Greg Low advocated unanimously for getting SQL out of hardcoded queries in SSRS reports, and into stored procedures, so they can be validated. We’ve invested some work in pulling the queries out into their own stored procedures, and getting those stored procedures under source control. Now we’re down to a very nice 0 Errors, and a few warnings in our source control project. We now get a warning if we reference an object that doesn’t exist. It is an easy way to make a change to where an external synonym points, and get quick notice if that object isn’t recognised. This helps me have some confidence the reports we do have under source control will run when requested.

Writing unit tests for all of the 200+ stored procedures behind these reports, as we interact with them, seems like unnecessary overhead. It would tell us if the output changes unexpectedly, but enough test data to handle the different edge cases and paths in the reports seems like a lot of resource that we don’t have right now. The SSDT database project warns about invalid object references, which I have found to be a great value for detecting broken reports.

We don’t often have report failures causing us to scramble, needing rework due to poorly code quality or invalid references. Users are somewhat happy to tell us when reports aren’t behaving in the way they expected, and these deviations are usually due to a difference in understanding between the user and the person who wrote the report.

I do believe knowing if code is in error is a key element of situational awareness, alongside good monitoring infrastructure. We have our reporting stored procedures in an SSDT project with no errors; this doesn’t test that the procedures are functional and operating correctly, as unit tests would, but it gets most of the way there with little effort needed to maintain. Adding unit tests would tell us if a stored procedure has deviated from known good behavior, given a known input and output, so I do see some value there.

Why I do not have unit tests for the stored procedures behind my SSRS reports

We are not facing an issue regarding the reliability of the existing report code. The issue we do have is that we don’t have coherency over our enterprise reporting solutions. Users don’t know where to find reports, or which ones to trust for their particular need. That is where we need to be investing resources right now - learning what needs each report fulfills, and learning how we can better facilitate our users’ need to find the information they’re looking for, over our various solutions.

Back when I started, when I was starting out as a database and report developer, I had nothing else on my plate. I could focus just on writing reports, maybe I could have implemented unit testing then. Now, I’m learning how to be a somewhat acceptable manager to my direct report, improving workflow, optimising use of the BI/reporting solutions, producing BI work, and managing other database-related priorities. I am also now responsible for influencing the patterns of those who work around me, so maybe I should be doing it to encourage them.


Summary

I’ve seen the wonders of unit testing with a big-bang complex project, but in the slow burn of day-to-day report development, I feel that the competing priorities need my focus more. I’d like to do unit testing of my reporting stored procedures, but I don’t think implementing this is my highest need right now.

No comments:

Post a Comment