Today something wonderful happened (31 May 2013). The Ministerial Inquiry into Novopay has been released. Not so wonderful for Novopay/the Ministry of Education/Talent2, but it is one of the few learning experiences we get in IT to reflect upon what we do.
A little bit of history. Novopay is only the second IT project in New Zealand to be the subject of a Ministerial Inquiry, as far as I am aware. The first was the INCIS project run by the Police in the late 1990s/early 2000s. The difference between the two is that this report was actually supported by all parties involved, and it concerns a project that actually went live.
Anyway, I don’t want to berate MoE or Talent2. I do want to discuss the general issues I see in many projects and my take on what it means and sometimes how it applies to testers or testing.
For those readers not from NZ, you might need to catch up a bit on the topic by Googling for some press reports on Novopay.
I have read quite a bit of the publicised documentation on the Novopay project pre-inquiry. Of course I zeroed in on the testing documentation. What immediately struck me was that there was actually a lot there. Defects were raised and put to stakeholders. I found areas of defects that sounded a lot like the production issues they were having.
I noted that the really interesting defects were coming from an area that was trying to run end-of-day runs to emulate what would happen later in production. Oddly, this testing was stopped by management to prioritise other testing.
I also noted the grim reports from testing and the decisions by stakeholders to ignore them and paint rosy pictures. So I was awaiting the official review with interest.
The report is a resounding slap in the face of the governance and leadership of the whole project. It highlights bad practices that I have seen on so many projects (public and private). It seems to be an industry sickness. So let’s take some of the things said (not necessarily in order):
- This state of affairs and the wider disruptions that were caused were avoidable. It is clear to us that important lessons from the past, in particular those arising from the 1996 education payroll implementation difficulties and the INCIS experience in 2000, should have been learned, but were not.
Well, there’s the answer to my blog topic question. We don’t!
There are things like the INCIS report and other lessons learned from other projects. Do we use them? NO. I have read the summary to the INCIS report and it annoys me to no end to see how many people in stakeholder positions don’t know this report. INCIS and now this report should be mandatory reading for everyone in IT and those connected to IT projects. Repeating these errors is gross negligence when they have been so clearly defined and in the public domain.
From a testing perspective, we could also see ourselves in a Quality Assurance (QA) role (although I fundamentally don’t agree that testers have anything to do with QA, but I’ll skip that rant here). If we do, then we’d be tasked with highlighting breakdowns in processes too. That would mean the testers/QA staff on this project would have been well served to know the INCIS report and to address, with stakeholders, the points that both reports now have in common.
- It is our overall view that weaknesses in project governance and leadership allowed the service to go live with a number of significant risks which the Ministry and its vendors were over-confident of managing.
- …which invited Ministers English, Parata and Foss to approve the continuation of the project following Confidence Point Two, misrepresented its state.
- We reviewed the papers submitted to Ministers and presented to Cabinet, to understand the information, risks and advice that were presented and where decisions were made, and found that Ministers were not always well served. Reporting to Ministers has been inconsistent, at times unduly optimistic and sometimes misrepresented the situation.
So this to me translates to someone knowingly and with intent lying to the approving stakeholders. I don’t know about you but such behaviour to me is unacceptable. Yes, you might support the little white lie to achieve something short term but this is a breach of ethics on a massive scale. People involved in IT projects usually have some sort of education, certification, belong to industry groups or similar. All of these have ethical charters. Even if some of them are very short they all contain something that forbids misrepresentation and lying.
In my career as a tester I have seen lots of behaviour like this. Some incidents are small, others larger. We constantly get faced with these situations in projects. How do we react? Our ethical standards actually mandate us to expose such behaviour and deal with it openly. Anyone who has been on a project knows how impossible that is. You recognise this happening when the voice in your head is saying “think of your mortgage!”.
Interestingly enough, the report includes a recommendation going forward that there be direct feedback from QA/testing to the top stakeholders. This is a good idea, but it only works if the misrepresentation doesn’t happen at that level too.
One of my pet peeves is that testing reports to project management. This is counterproductive, as testing is a direct danger to the project go-live. Good testing, the kind that highlights lots of issues, makes the life of a PM and that of a project very difficult. It has the ability to actually break projects. If the “unfiltered” test reports had reached Cabinet or the Ministers, there would have been a good chance the project would have been delayed or worse. That would have invalidated the goals of the project, though. So a PM would do everything in their power to counteract what testers find. Fixing defects is hard, time consuming and delays a project. Hiding or explaining them away is much easier.
You could thereby deduce that the groundwork for the deceptions was laid by the goals the project set out to achieve and by the actual project structure. It’s easy to set out to do a project on time and on budget. The hard things are scope and quality. The latter two are hard to measure, to define, and to make watertight.
A PM should also act in a wider interest than just the project’s. A project has a specific external context within which it lives. This can include its company, customers, stakeholders, local conditions, competitors, the economy and others. Ideally, if any of these apply, they should be integrated into the project goals. If they are not, or are treated as implicit, they still should be considered.
- Over the course of the project, Talent2 had missed agreed milestones or deadlines, which eroded trust and confidence in its ability to deliver.
I encounter this on many projects. Either milestones get missed or entry/exit criteria are ignored, but everything is still full-steam-ahead. One has to wonder why these milestones are made in the first place. So the lesson learned here is: no milestone or criterion without consequence(s). I am not talking about blame or money. What has happened is that the project context has changed. There needs to be change to re-adapt. The thing I can guarantee with unerring certainty is that not changing the path will not get you where you want to go.
Just imagine that, as a tester, you don’t get the promised environment you stipulated was needed for testing. This will have severe implications for quality, effort, validity of tests and much more. At some point testing will end up in a compromised situation. Stakeholders are happy to have resolved the bothersome issue of testing but don’t realise the added, very significant, amount of risk that they have now taken on board.
- Despite the problems, we observed a strong commitment to delivering a successful project and some significant individual efforts.
This was intended as a positive remark but I am unsure it is. A strong commitment is what gets most projects into trouble. Delivery to a certain date becomes so important that any rationale advising caution is futile.
[Update] All IT projects require skilled staff who use their initiative and commitment to solve (poorly) structured problems. Sometimes, though, heroes emerge. If these heroes are needed just to get the basic project out the door, that points to a dysfunctional project. The project is suffering from bad governance and leadership, caused by a lack of delegation, poor planning, overly optimistic timelines, solution complexity, and/or a lack of sufficiently specialised know-how in staff. [Update]
- Work commenced on the requirements for the schools payroll project in October 2008. This process was lengthy, and was never actually completed. Even after Go Live, new requirements were being discovered.
- Requirements definition, design, development and testing activity were all occurring in parallel, making it very difficult to maintain a known level of quality.
We are talking about big Waterfall-type projects. They prescribe that requirements be finished before anything else begins. Requirements that are never completed are the first sign of danger such projects usually display. Hands up, who has been on a project where this was the case? (Everybody, hands down now, please.)
Overlap and change cannot be prevented, and can only be tolerated to a certain level. Once we get a statement like the above, where all Waterfall phases are running at once, you are kidding yourself if you think you will come out unscathed. This kind of behaviour happens because of unrealistic/overconfident timelines and overconfident vendors/marketing gurus believing she’ll be right (and also, sometimes, a misplaced and forced “all is well” attitude).
Testing in this scenario has become a farce. It is not uncommon for 70% of testing effort to have been completely wasted by the time the project ends. In the testing section of the report there is actually a hint of this:
- …poor requirements definition, early testing was wasted.
In clear text, that means that if everyone had sat around and waited until requirements definition had ended before starting testing, they would have done just as bad a job (i.e. actually a really stellar job under such chaotic circumstances!).
- There was little direct customer (boards of trustees) or user (principals and school administrators) involvement in the definition of the requirements, and Datacom’s involvement was minimal.
- The execution of the change management plans which the Ministry did have was inadequate, and roles were unclear. The engagement with the payroll service’s customers and users was also insufficient.
Hmmm… so basically nobody bothered to find out what or who the oracles were, in order to interview them and find out what this thing should do. Testers can help here! We have learned to find and respect oracles, and we act upon their information. Maybe it is time to spread the concept of oracles a bit wider in the SDLC.
- During the service design and development phase, the intended pilot and phased rollout of the service were removed from the project plan.
- Some important areas of functionality were not fully tested prior to Go Live. Some types of testing were not completed to the original scope, on the basis that testing could be completed after Go Live, or that the risks of not doing the testing had been adequately mitigated. Not all System Integration Testing criteria were met.
Also a thing I see time and time again: scope, especially in testing, is decreased without replacement or risk mitigation. Everybody notices if development effort is cut; functionality will simply be missing. Cutting testing/QA effort impacts output quality, but this is not directly traceable and has a delayed impact. You can actually shift effort and problems from the project (budget) to BAU (budget). From what I can see, 100% of projects that try this seem to get away with it.
Then there is the big myth of after Go-Live.
After Go Live the project will be shut down. There will be no testing or anything else going on. It’s a nice way of saying “we won’t do it” without spooking the steering committee. If you’re lucky there is a project phase 2 that will, begrudgingly, accept this into its scope (don’t fret, phase 2 will never materialise, as phase 1 will have failed).
- …meeting of the Project Board, the Novopay Business Owner indicated that a Go Live date of 14 August 2012 was the “absolutely last preferred date”.
“Last preferred”? Is this an order to comply with or not? The word preferred is what annoys me. What are the factual reasons for this Go-Live? Preference is akin to whim, and that is not enough of a reason. If, on a project like Novopay, I need to go live in a critical state, I need to be able to weigh my assumptions and risks against the reasons why we are doing this. How can I substantiate what risk level something has if I can’t back it up with a tangible goal? This should not be a difficult exercise; actually, it should be easy. Stakeholders have reasons for doing and wanting things. These just need to be communicated. With this transparency, solutions might become viable that were not even thought of before.
- The real Go Live decision was made on 31 May 2012, despite the Confidence Point Two criteria not having been met and schools not being ready. Project governance and leadership allowed a combination of significant risks to be carried into Go Live and overestimated the ability of the Ministry, Talent2 and schools to manage them.
- systems development was continuing through the code freeze right up to Go Live;
- Talent2 missed agreed milestones and deadlines. The Ministry had cause to invoke breaches of the contract for non-delivery from as early as 2010, but did not exercise this option.
- The Ministry was not always willing to take or act on advice, and at times demonstrated misplaced optimism about the state of the project.
Projects are weird beasts. On the one hand they are stressful and ruthless; on the other, when it comes to enforcing consequences, everyone backs off and does things like the above. Again, this is ignoring change and not reacting appropriately. The reasoning behind such decisions has eluded me so far. I cannot fathom that things like PRINCE2 and PMP condone such decision making in their relevant processes (and I don’t think they do). Again, it is not about blame but about dealing with reality in an appropriate manner.
Maybe therein lies the issue. Humanity is averse to dealing with failure. As testers we have accepted that failure is something good, especially if it happens early in a process. You can fix, learn and grow by dealing with failure. From a holistic project view I cannot see the same understanding. Failure is demonised and feared. You have to wonder why, though. There is the omnipresent statistic quoted in nearly every keynote at an IT conference: “90% of projects fail”. So if you can’t deal with failure, there is a 9 in 10 chance you’ll have a serious issue with the project you’re on. If 9 out of 10 projects fail, why don’t we focus on dealing with failure?
Some people will now refer to their risks and issues registers. Theoretically that is a good idea, but I have not really seen them work. I think it is better to name the beast. If you start from the premise that your project will fail (project failure, in my definition: over budget, reduced scope, insufficient quality or not on time), you can then start thinking about how it will fail and begin actively mitigating.
In comparison, I see risks get raised and then ignored until they become issues (and sometimes after that too). By the time a risk becomes an issue, the remediation is complex and expensive. Active and early mitigation might easily have prevented the risk from becoming an issue, or have introduced early adaptation to a changing context.
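To illustrate the point about active mitigation, here is a minimal sketch of a risk register that surfaces stale, unmitigated entries by itself, before they turn into issues. All names, dates and thresholds are hypothetical, invented for illustration; nothing here is taken from the Novopay report:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Risk:
    title: str
    raised: date          # when the risk was first logged
    mitigation: str = ""  # empty string means nobody has acted on it yet

def stale_risks(register, today, max_age_days=14):
    """Return risks that have sat unmitigated for longer than
    max_age_days -- the ones most likely to become issues."""
    return [r for r in register
            if not r.mitigation
            and (today - r.raised) > timedelta(days=max_age_days)]

register = [
    Risk("Test environment delivery slipping", date(2012, 3, 1)),
    Risk("Requirements still changing during SIT", date(2012, 4, 2),
         mitigation="change-control board triages weekly"),
]

overdue = stale_risks(register, today=date(2012, 4, 30))
for r in overdue:
    print(f"Unmitigated for too long: {r.title}")
```

The point is not the code but the behaviour: the register complains on its own instead of waiting for a review meeting to notice that a risk has quietly become an issue.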
- …there was no overall accountability for Independent Quality Assurance,…
I actually wonder whether this accountability needs to be independent. Projects often forget the fourth corner of the balancing square (it’s not a triangle, as most literature claims), which is quality. Quality is hard to define at the best of times, but everyone knows when it isn’t achieved. More important than accountability is the careful management of quality expectations and the clear communication of them. Testers are not the keepers of quality; quality is a whole-project thing. Everyone involved needs to know these expectations. Trying to retrofit quality based on failures found in testing will definitely kill your project. By the time we get to testing, we should only be confirming that quality is present and highlighting where there are omissions.
- The Novopay project has cost materially more than estimated. Benefits have yet to be fully realised, and in some cases may not be. The complete cost of implementing and establishing Novopay, including the real costs to all stakeholders, substantially exceeds the reported overruns to date. Value for money thinking and expenditure control have been weak.
Feasibility studies are something that you never get to see as a tester (if they even exist; I did hear rumours). These studies should list in detail what is expected of a solution and how that will generate a return on the investment you’re making. Companies do IT projects so they can directly or indirectly make money. Nobody is in this game just for the good of mankind. So all goals and requirements should somehow tie into this feasibility study.
So why are testers not privy to this information? I could actually start calculating the monetary risk or monetary consequences of testing actions or the omission thereof. Why are testers (with their excellent critical thinking abilities) not invited to scrutinise such documents and calculations?
When the time comes to cut testing effort (see above), such a document would be priceless. You could evaluate what to test and what not, and prepare for the fallout. If Novopay had done this, I think they would have focussed on different areas of the application/solution. The huge benefit would have been that the actual financial risk or opportunity could have been given to the steering committee/Ministers. That would have made their decision making much easier and better founded.
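As a sketch of what such a calculation could look like, here is a hypothetical example that ranks test areas by the money at stake if their defects escape to production. All areas, costs and probabilities are invented for illustration; none of them come from the Novopay feasibility study or the report:

```python
# Hypothetical figures: cost of a production failure in each area, and
# the probability of such a failure if the area goes untested.
test_areas = {
    "pay calculation":   (2_000_000, 0.30),
    "leave accruals":    (  400_000, 0.20),
    "reporting exports": (   50_000, 0.10),
}

def exposure(cost, probability):
    """Expected monetary loss from skipping the testing of an area."""
    return cost * probability

# Rank areas so that, if scope must be cut, the smallest exposure goes first.
ranked = sorted(test_areas.items(),
                key=lambda item: exposure(*item[1]), reverse=True)

for area, (cost, p) in ranked:
    print(f"{area}: expected exposure ${exposure(cost, p):,.0f}")
```

With numbers like these in front of a steering committee, “cut the reporting tests” becomes an explicit, quantified decision rather than an invisible one.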
- By December 2009 Talent2 was advising that the project would not be delivered on time because of delays with testing…
This sentence can be found in reports from many a project, and I always balk at it. On the face of it, it states that testing caused the delay, i.e. something happened within testing that caused the time-frame to blow out. There is a second interpretation, though, which is more accurate and true but is intentionally hidden in wordplay of this sort: testing itself was delayed, by things outside of testing.
So let’s look at a few(!!) reasons why testing can be “delayed”:
1. Late delivery of code
2. Late delivery of a working environment
3. Late delivery and/or changing requirements
4. Low quality code
5. Low quality requirements
6. External factors (e.g. the Christchurch earthquake and the tons of long-term issues that it caused)
7. Failure to adapt to change, thereby making existing goals/timelines fictional (this is a project management issue)
8. Bad test preparation (although I usually see this as a consequence of 3 & 5)
9. Bad testing methodology
10. Bad test execution
11. Bad testers
As you can clearly see, many of these reasons are not directly connected to anything testers are doing (1 to 7). In most projects (as on this one) it is obvious that a combination of points 1 to 7 is actually in play. Very rarely are things like points 8 to 11 the issue, and when they are (as on this project) they can actually be mitigated quite easily.
So when you state that “delays in testing” are an issue, you are more likely admitting to a complete project lifecycle/management failure. It’s like blaming the tail-lights on a car for a frontal collision.
- Novopay should not, however, be regarded as symptomatic of all public sector ICT projects.
This sentence surprised me. For all the rational and excellent work of the Inquiry in the report, this sentence does not fit the mould.
It is a political sentence, intended to assuage any tendencies to investigate other projects. As much as I can understand that sentiment, I do think it makes the Inquiry guilty of the very thing it criticises (“at times unduly optimistic and sometimes misrepresented the situation”). Yes, surely not all projects are as bad as Novopay, but actually a lot of them are (remember, 90% of projects fail!). This sentence absolves a lot of projects where it shouldn’t. It also lessens the chances of those projects getting the support they need to become successful. I’d rather have applauded an initiative like the one on the Government security breaches, where a stock-take of all projects over a certain dollar figure would be reviewed.
Even the projects listed in the report as successes weren’t successes in the sense of the definition I gave above (of course there is a lot of speculation on my side here, and it depends highly on my definition of success). Don’t get me wrong, I do think they are successes in the wider sense and we can be proud of them, but without the Ministerial report outlining what, to them, defines a success, I think they are on shaky ground making that statement.
So in conclusion…
All in all I think this is a great report, if not a stellar one. Nothing is perfect, and being a tester I do pick up on this more than I probably should. The report, in my opinion, highlights typical issues in large public and private projects. Everyone should know this report and probably re-read it before starting on a venture like this. While at it, add the INCIS report to that reading list too. I really hope these reports will actually make us learn from history, give us a modicum of self-criticism and make us embrace a culture of failure in order to succeed.
I know I am no die-hard PM or manager, but I do know these issues affect me in my testing career every day. They annoy me; they make me uninspired and less effective. I know I am not alone in this. I hope that this report will generate lots of discussion, articles, books… and that we start dealing openly with the issues at hand.
I will write another post later on the actual recommendations of the report and what I think is a solution.
Author: Oliver Erlewein
All relevant Links: