The Computer says “You’re fired”.

There’s been some little kerfuffle online over this story from the wonderful world of corporate HR systems:

https://www.bbc.co.uk/news/technology-44561838

And here’s the guy’s story in his own words:

https://idiallo.com/blog/when-a-machine-fired-me

Although it’s an example from the USA, it has relevance here in the UK even though employment law is rather different on this side of the water. (This post is written specifically from a UK perspective.) The problem was perhaps not so much the procedure itself (though I shall return to that later); it was the fact that various people tried to unwind that procedure but the system wouldn’t permit it. And there was no good reason for the original supervisor not to renew the contract (though it does suggest an area where the company’s HR policies were not aligned with business needs).

The OP makes the point that what eventually happened was that he was dismissed and then re-hired once the termination process had run its course. Nonetheless, that really shouldn’t have happened under those circumstances; it left the OP out of pocket and caused him reputational damage. The system design rested on two assumptions: 1) that any decision to terminate is automatically correct and unchallengeable, and 2) that a termination, once started, consists of a series of steps which must be carried out without any option for challenge at the operational level – i.e. the security team were under a three-line whip from the system to escort the OP off the premises, even though they had spoken to other managers and knew that there was a problem with the process which the management team were trying to reverse.

This, of course, is a real-life illustration of the Milgram experiment, where any instruction from an authority figure is complied with, no matter how unreasonable. It’s funny how often that seems to be borne out in experience.

Someone commented to me that perhaps there was an underlying issue with the OP’s attitude and the possibility of their posting rude blogs online. Personally, I think Ibrahim Diallo’s blog was completely fair and justified, and at no point does he name the company where this occurred (though some who commented accurately identified them based on their own experience). Interestingly, the risk of disparaging blogs doesn’t seem to have been considered in system design as there was no provision made to generate a non-disclosure agreement on contractor exit!

The use of the words “fired” and “job” reflects common usage rather than strict legal definitions, though it also has some bearing on the long-running debate about the status of workers in the gig economy, going all the way back (for us in the UK) to the Inland Revenue’s original implementation of IR35. The thing is that there is little indication the process would have been any different for a contracted employee. The omission of a renewal action by a disgruntled supervisor would never be an admissible reason for dismissal; it certainly wouldn’t come under the heading of “some other substantial reason”, the catch-all ground for dismissal in UK employment law which is automatically assumed to be fair. And if such a system caused a “cascade dismissal” of a contracted employee, it might render the company liable for action not only in an Employment Tribunal but also in a civil court, for consequential loss arising from a system that circumvented accepted procedures (the ACAS code of practice on discipline leading to dismissal isn’t statutory, but I doubt a court would take a favourable view of a system that ignored established custom and practice in such a well-settled area of law).

From a system testing point of view, this entire incident illustrates areas where testers should have had greater involvement at the design stage, challenging questions such as “Is it possible to back out of the process once it has been initiated?” or “Does this actually comply with employment law?” (and hence the underlying question, “What’s the worst that can happen to us – the company – if this goes badly wrong?”). There certainly seems to have been a distinct lack of risk analysis when this system was being defined and designed, and any concerns that testers may have raised seem to have been ignored. And that’s a valid challenge to make, irrespective of variations in legal practice between jurisdictions.
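If I were sketching the kind of test I’d want to run, it would probe exactly that back-out question. Here’s a minimal illustration (in Python, with entirely invented step names and cancellation mechanism – none of us have seen the actual system): each irreversible step checks whether the termination has been reversed before it runs.

```python
# Hypothetical sketch only: a termination workflow with a back-out checkpoint.
# The step names and the cancellation mechanism are invented for illustration.

IRREVERSIBLE_STEPS = [
    "revoke_system_access",
    "disable_door_badge",
    "notify_security_escort",
    "remove_from_payroll",
]

def run_termination(employee, is_cancelled, perform_step):
    """Execute each step only if the termination has not been reversed.

    is_cancelled -- callable returning True if a manager has reversed the decision
    perform_step -- callable that carries out a single named step
    """
    for step in IRREVERSIBLE_STEPS:
        if is_cancelled(employee):
            print(f"Termination of {employee} cancelled; stopping before '{step}'.")
            return False
        perform_step(employee, step)
    return True

# Example with stubbed dependencies: a manager reverses the decision in time.
if __name__ == "__main__":
    reversed_decisions = {"contractor_123"}
    run_termination(
        "contractor_123",
        is_cancelled=lambda emp: emp in reversed_decisions,
        perform_step=lambda emp, step: print(f"{step} for {emp}"),
    )
```

The point is not the code itself; it is that “where is the check for a reversed decision?” is a question a tester can ask at design time, long before anyone gets marched off the premises.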


Thought for the Day

I was looking today at a Stickyminds blog post from last year which asked “Has continuous deployment become a new worst practice?”, and an argument in favour of employing real testers instead of automating all the things suddenly struck me:

When a tester finds a bug, they report a bug. When a customer finds a bug, they may just walk away from the product.

Conducting testing

I was reading a blog post from @GregPaciga on the Modern Testing Principles being developed by Alan Page and Brent Jensen (http://gregorytesting.ca/2018/06/discussing-the-modern-testing-principles/). I saw Alan give a presentation at TestBash Brighton on ‘Modern Testing’ and found his arguments persuasive.

In his post, Gregory was describing a discussion about the seven principles of Modern Testing, and in particular reporting on the issues raised over the fifth and (especially) the seventh principles. Part of that seventh principle caused a few raised eyebrows amongst a discussion group made up of testers:

7. We expand testing abilities and knowhow across the team; understanding that this may reduce (or eliminate) the need for a dedicated testing specialist

Gregory said that he was unsure whether disagreement with the second part of that statement, about reducing/eliminating the need for dedicated test specialists, was a knee-jerk reaction or not. (I got the feeling that he was saying ‘knee-jerk reaction’ as if that were a Bad Thing.) He talks about reducing the need for a testing specialist without eliminating the need for the function itself.

Looking at this, I was reminded of the orchestral conductor André Previn. Many years ago (when he was principal conductor of the London Symphony Orchestra) he made some TV documentaries on the role of the conductor. In one, he set the orchestra playing a piece and then stepped away from the podium and sat down. The orchestra continued quite happily without him to the end of the piece. So why, asked Previn, have a conductor at all?

Well, it was a very standard piece of repertoire, something the orchestra had played many times before, and the individual players were all talented and able musicians; so the conductor wasn’t strictly necessary. But what might happen if the orchestra had to play an unfamiliar or new piece, or had to integrate a new member who did not know the way they worked and played together? Then all the assumptions that each orchestra member might make, the unstated consensus the orchestra would collectively hold, could be thrown into confusion. That is when the conductor acts as a unifying force: setting the tempo, exploring matters of interpretation or meaning in rehearsal, and holding the whole together in public performance.

I think the same is true of expanding testing knowhow across the entire development team. It does not eliminate the need for a dedicated testing specialist; rather, it changes the test specialist’s role. They become the person who directs testing; the person who thinks about testing as a discipline and steers individuals on their own testing journey; the person who helps mould the team into a single unit and, indeed, orchestrates the testing effort towards a successful delivery. The tester can also act as a fresh pair of eyes in the planning and design stages of a project; just as the conductor does, they can consider matters of meaning or interpretation away from the cutting edge of the performance itself and communicate this back to the team during development (which you might think of as being like rehearsal).

And of course, this sees software development as being, like music, an art as well as a science.

I for one welcome our new robot overlords

AI isn’t going away, and the testing blogosphere certainly keeps returning to the subject. And not just the blogosphere: in today’s Daily Telegraph business section (14 May 2018), their business correspondent James Titcomb said “Tech giants need to build ethics into AI from the start”. Looking in particular at the well-publicised demonstration of Google Duplex – a conversation between a caller and a call centre in which one party was an AI system and one was human, and which certainly posed a challenge to the Turing Test – but also at various autonomous driving systems, he added: “Every frontier technology now needs to be built with at least some level of paranoia: some person asking ‘How could this be abused?’”

As a science fiction fan, I’ve been thinking about artificial intelligences of one sort or another since I was a teenager; and in one form or another there has been thinking on the whole subject since the 1940s, long before AI could become a physical reality. Even in the real world of physical things, I remember my father – at that stage in his life a railway signalling system designer and implementation engineer – talking about “automation” in the 1960s. Later, when computers evolved into the first microprocessor-driven machines (first coming to my attention with the Tandy (UK)/Radio Shack (US) TRS-80 and the Commodore PET, both machines whose entry-level models shipped with mind-bendingly huge memories of 4k (!)), a book by Dr. Christopher Evans, The Mighty Micro, and a BBC television documentary of the same name made serious political waves with predictions of a future both terrifying and rich with possibility. That’s the future we’re living in now.

In later years, I talked, exchanged letters, and went on drinking sessions with people whose stock-in-trade as imagineers saw them thinking about AI. A late friend of mine worked with Stanley Kubrick on the development of the film that Steven Spielberg eventually completed as A.I.: Artificial Intelligence, and I later talked with others who had worked with either Kubrick or Spielberg on different aspects of that film.

The soul of a new machine?

Fast-forward, then, to the beginning of the year and a Ministry of Testing Masterclass on the subject. In the wake of this, a thread was started in the MoT Club (Ethics in Machine Learning), and I’ve pointed a couple of bloggers to this in the past few days. With this reawakening of interest, I thought it was time to share the discussion with some more people.

(The other participants were Ministry of Testing test ninjas punkmik, andrewkelly2555, kimberley, ceedubsnz, jacks and vjkumran. Thanks to all for their input.)

The discussion started with the question of what we should do as testers if we spotted a system under test doing, or heading towards doing, something unethical. Should we really expect AIs to have any ethics at all? It then emerged that AI systems are already being encountered in recruitment, and the conversation turned to how they “learn” from their initial training dataset or parameters. Sifting against a list of technical specifications, for instance, might reject good candidates in a role where soft skills were specifically being sought, because those skills are not easy to quantify to the level the intelligent system demands. And if the managers gatekeeping the recruitment process didn’t themselves fully understand the criteria they were selecting for (say, where the budget holder for the post being recruited wasn’t themselves a tester), then having an intelligent system do the sifting might just mean the same flawed sift being done more quickly and economically…
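To make that risk concrete, here is a deliberately naive sketch (the keywords, weights and threshold are all invented; this is not any real recruitment product) of a keyword-driven sift. Anything the system cannot quantify – soft skills, for instance – simply scores nothing, so the automated sift reproduces the flawed human one, only faster.

```python
# Deliberately naive CV-sifting sketch; keywords, weights and threshold are invented.
REQUIRED_KEYWORDS = {"selenium": 2, "jenkins": 1, "java": 2}
THRESHOLD = 3

def score_candidate(cv_text: str) -> int:
    """Score a CV purely on keyword matches; soft skills contribute nothing."""
    text = cv_text.lower()
    return sum(weight for keyword, weight in REQUIRED_KEYWORDS.items() if keyword in text)

candidates = {
    "A": "Ten years of Selenium and Jenkins pipelines in Java shops.",
    "B": "Superb exploratory tester, great communicator, mentors junior staff.",
}

for name, cv in candidates.items():
    verdict = "invite" if score_candidate(cv) >= THRESHOLD else "reject"
    print(name, verdict)   # B is rejected despite being exactly who was wanted
```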

If we’re going to look at the ethics of using AI, we should be certain that our existing, human systems are equally ethically based for a starter.

This question has been exercising the minds of science fiction writers for nearly seventy years! Most people will point to Isaac Asimov and his Three Laws of Robotics. (Asimov himself claimed that there were enough loopholes in the Three Laws to keep him profitably selling stories for the next forty years…) The Will Smith film based on Asimov’s robot stories, I, Robot, posed a perfect example of something unethical happening for ethical reasons. The Will Smith character has a major hang-up about robots because he had been involved in an accident in which two cars ended up sinking in a river. A robot went into the river to save the humans in the cars, as the First Law requires. But the robot weighed up the likelihood of saving both humans and decided that it was not possible. Instead, it prioritised saving one human based on best chance of survival and utility to society: it saved the Will Smith character, a policeman, before attempting to save the other human, a child, despite Smith ordering it to save the child (the Second Law trumped by the First).

Interestingly, the film developed robot motivations to the point where they were prepared to restrict human freedoms because humans do things to themselves that are harmful. This reflected the work of another classic science fiction writer, Jack Williamson, whose “Humanoids” had only one directive: “To serve and obey, and to keep men from harm”. Taking this to its logical conclusion, the “Humanoids” ended up keeping the entire population under chemical lockdown for their own good.

So to the question “Should we really expect AIs to have any ethics at all?”, the answer looks uncomfortably clear: for thirty or forty years we have been told that the role of business is wealth creation, and that ethics has no role – or at best a secondary one – to play in maximising shareholder value. Future “Laws of Robotics” may be closer to the alternative proposed by David Langford:

1. A robot will not harm authorised Government personnel but will terminate intruders with extreme prejudice.

2. A robot will obey the orders of authorised personnel except where such orders conflict with the Third Law.

3. A robot will guard its own existence with lethal antipersonnel weaponry, because a robot is bloody expensive.

So how can we even begin to test for ethics?

Fortunately, we aren’t in Asimov’s scenario, where his robots were multi-purpose machines expected to learn how to deal with new situations almost without limit. The systems we are likely to be testing, in the near term at least, will be designed to do a specific task, and that will help define the range of ethics that designers will have to build into systems and that we will have to test for.

What it will require is for designers, analysts and testers to look to a different set of expert bodies as sources for building an understanding of the ethical issues a system might have to incorporate. An HR system, for example, would need input on equality issues as well as employment law; for these, I would first of all look (in the UK, at least) to the Equality & Human Rights Commission (EHRC) and/or to some of the trades unions, especially those active in the public sector, which have experience of holding to account employers who are supposed to take ethical issues into consideration. For finance and accountancy issues, I’d be looking to take advice from banks and other investment bodies that have identified an ethical dimension to their work, such as (again in the UK) the Co-operative Bank.

I think this is an evolving area and a possible whole new field of expertise which will combine traditional IT skills and a range of soft skills that the IT profession hasn’t necessarily been noted for in the past. Otherwise, we could find ourselves in an “I for one welcome our new robot overlords” situation before we know it!

Others are thinking along these lines, as this article by David Weinberger shows.

A lot of this boils down to very basic questions. Who is responsible for unforeseen outcomes of a new tool? The old saying is that only a bad workman blames their tools, and this seems very pertinent. Ultimately, there is a human at the bottom of all AI learning algorithms. It would be just the same if a human was training an animal, say a dog, to undertake criminal acts such as theft (by retrieving someone else’s property) at a simple command. The dog could not be charged with the criminal act; it would be the owner, as trainer, that was responsible.

(This article on Geoffrey Hinton at the University of Toronto is an interesting read https://torontolife.com/tech/ai-superstars-google-facebook-apple-studied-guy/ )

What we should really be thinking about are complex situations which go beyond simple IF > THEN (action) decisions. It will always be the programmer who is responsible for how ethical we make our systems. I think the debate has to be about the failsafes we build in to enable AIs to spot unethical – or ethically ambiguous – situations and either apply ethical subroutines to decide the correct course of action, or stop and flag the situation to a human (who may or may not take an ethical decision, of course). And this does mean that the more complex systems become, the more they will have to be designed to spot ambiguous situations or unforeseen circumstances.
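As a sketch of what such a failsafe might look like (the thresholds and the idea of an upstream “harm score” are my own assumptions, purely for illustration), the important design feature is the third branch: anything the system cannot classify confidently is stopped and handed to a person.

```python
# Hypothetical ethical failsafe: proceed, refuse, or escalate to a human.
from dataclasses import dataclass

@dataclass
class Decision:
    action: str   # "proceed", "refuse" or "escalate"
    reason: str

def ethical_gate(proposed_action: str, harm_score: float) -> Decision:
    """harm_score is assumed to come from some upstream assessment
    (0.0 = harmless, 1.0 = clearly harmful)."""
    if harm_score < 0.2:
        return Decision("proceed", f"'{proposed_action}': no ethical concern detected")
    if harm_score > 0.8:
        return Decision("refuse", f"'{proposed_action}': assessed as clearly harmful")
    # The ambiguous middle ground: stop and flag to a human reviewer.
    return Decision("escalate", f"'{proposed_action}': ethically ambiguous - refer to a human")

print(ethical_gate("issue automatic termination notice", harm_score=0.55))
```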

To bring the discussion back to testing and testers, there is a role for testers at the requirements gathering and system definition stage – trying to foresee the unforeseeable, designing ethical safeguards into system behaviours, and then testing them once code is written. And that is going to need a different sort of skill set to the ones that are currently fashionable.

 

Zen and the Art of Software Testing

Many years ago, I read Robert M. Pirsig’s Zen and the Art of Motorcycle Maintenance. It was a great late-hippy-era text and over the years it has spoken to a lot of people about a lot of things. But after reading an online review, I think I may have to read it again. The review appeared on LibraryThing (www.librarything.com), a booklovers’ site which I frequent; it’s by a user called jon1lambert, and I’ve copied it into this post in full because the UI is a bit old and fusty (like a proper library, then!) and it’s actually quite hard to find and view a single review online. The site organises content by book title, so to find one named review you have to look through all the reviews for a book – and in the case of Zen and…, there are 203 of them. That’s no problem now, while it’s a new review, but come back to this post in six months’ time and it will have been knocked down the list and be much harder to find.

Anyway, here’s the review:

This is indeed a remarkable book. It is all about the distinction between classical (facts) and romantic (feelings) approaches and continuity. A sentence or two on page 165 of the Bodley Head edition (1974) made me stop and think. ‘Peace of mind isn’t at all superficial, really…It’s the whole thing…The ultimate test’s always your own serenity. If you don’t have this when you start and maintain it while you’re working you’re likely to build your personal problems right into the machine itself’. This is the answer to everything. If you are trying to put together some flat pack furniture you need to have peace of mind at the outset. Some people focus in on the technology – they put everything together perfectly; other people are not interested in the nuts and bolts, they just want the function – put some books on those shelves. For peace of mind, continuity is needed underpinning it all – form, function, beauty like the motorbike, the journey, the sensation of travelling across a beautiful landscape. Experience tells me that if I start out on something rattled rather than serene, then I know it won’t work out.

And what does this mean for software testing? Well, as with any project, being in the right frame of mind is a great help. It’s no good just validating the code if the assumptions made at the specification stage are faulty, or if the UI has never been near a real human being; and a flashy UI is no good if the functionality is broken.

There’s a lot of discussion right now amongst testers about wider issues relating to the workplace. I’ve heard some people complain about this: “Tell me more about automation tools/your favourite heuristics/Selenium/Jenkins…” and so on. That’s fine; we’re all working on highly complex stuff, and the more we collaborate, the easier we’ll find it and the better the products that emerge from our testing. But if you’re wound up over things going wrong in the workplace, or in your life more generally, then you won’t be the most effective tester you can be. As Pirsig said, it’s the whole thing.

Don’t bank on it

IT systems and (quite possibly) testing issues have been at the centre of a lot of news in the UK in recent weeks. TSB (Trustee Savings Bank) suffered a major outage of their online and mobile banking apps following a systems upgrade during April. Systems became inaccessible; some customers could not see transactions reflected in their accounts; others saw other customers’ transactions showing up in their accounts. The bank’s CEO had to make media appearances to apologise and promise action; and the bank drafted in “experts” from IBM to try to put things right. (I’m not casting doubts on the expertise of the consultants here; “experts” was the word used in the media to describe the consultants, but in terms of understanding what actions need to be taken, it’s a pretty meaningless word. We are all “experts” of one sort or another here.)

UK banks are pretty heavily regulated these days, though that regulation is a fairly imperfect thing. I last wrote on regulation and banking in November 2015 in my mainstream blog Steer for the deep waters only (Is He One of Us?); since then, I’ve had a little more insight into High Street banking and testing.

Let’s make one thing clear: I’m expressing my own opinion as a tester. I do not work, and have never worked, for TSB; indeed, if I had, I really wouldn’t be able to comment because of simple issues of employer confidentiality. But I can make some informed guesses as to what happened, based on my knowledge of projects I’ve worked on that involved integrating new applications with legacy systems, on load testing for new Cloud-based applications, and on my experience of one project where a UK High Street bank was the ultimate client.

For non-UK readers, retail banking in Britain is mainly in the hands of four major clearing banks – Barclays, HSBC, Lloyds and NatWest. They operate national networks of branches, hold shares in major credit card companies, and participate in a national network such that British customers enjoy highly available access to their bank accounts from almost anywhere in the country (even before the era of mobile and internet banking). Working together, they underwrite the companies that run inter-bank clearances. But for a long time, they were considered to be unresponsive to customer concerns or needs; with that in mind, previous UK Governments took steps to change the legal landscape to allow new competitors to emerge. The tendency of the big four banks to absorb smaller banks was countered in the 1990s and 2000s by changes which allowed the building societies (long-established mutual bodies who mainly dealt in savings products and mortgages for house loans) to expand their operations to embrace retail banking and (if they wished) to de-mutualise and become banks in all but name. The building society sector itself consolidated, so that what used to be a sector with a large number of local mutual societies is now more a second tier of retail banking, with a number of regional societies giving customers access to a similar level of service that they might expect from a bank.

TSB was created by the amalgamation of local and regional savings banks over a period of years starting in 1967, although some of its constituents date back as far as 1810. By 1975-6, the bank existed in a form recognisable today as a fully-fledged retail bank, offering a similar range of services to the big four clearing banks. It merged with Lloyds in 1995, creating what was then the largest retail bank in the UK. The separate identity of TSB was submerged within the Lloyds operation, although the name remained on the new bank’s masthead. Even that disappeared when Lloyds acquired HBOS (Halifax/Bank of Scotland, itself a merged entity formed from the former Halifax Building Society and the Bank of Scotland) in 2009, when the new business became known simply as Lloyds Banking Group. That acquisition came in the midst of the global banking crisis, in which the UK government bailed out the High Street banks, effectively nationalising those that were most heavily exposed, in order to prevent the catastrophic collapse of the entire retail banking system. To comply with EU rules on state aid, Lloyds announced that it would spin off part of its business as a re-launched TSB. The new bank began operations in 2013, but separating systems and customers has been a longer-drawn-out business. The Spanish bank Sabadell acquired TSB in 2015 and proposed moving the UK customer base from Lloyds legacy systems to a UK-based replica of its own customer platform, Proteo, with a target go-live date of the end of 2017. In fact, the programme seems to have run late, and customer records did not begin to be migrated to the new platform until April this year. Users of the online and mobile banking systems then found their accounts unavailable for at least a week, and some customers reported being able to see other customers’ account details.

-oOo-

The banks were early adopters of IT, and by “early” I mean that their original systems were – and are – prehistoric in terms of the sort of technical churn we now think of as normal in the sector. Cheque clearing was automated in the 1960s, using big mainframes at regional data centres, with applications written in languages such as FORTRAN or COBOL. And this is part of the problem; their systems are so large and extensive that it is extremely difficult for them to be upgraded, both in terms of the development cost and the sheer logistics of swapping out one system for another, seamlessly and across an entire country with millions of concurrent users hitting the systems day and night. Over the years, new developments have been added to the existing systems; so no matter how modern a bank’s latest website might look, drill down through the layers and you will find something designed and first implemented up to fifty years ago.

This, of course, has its own problems; the developers who wrote the original banking systems are now at best retired, so there is a serious knowledge transfer issue across the industry. And adding new applications or functionality to any existing system, let alone one that may be twenty, thirty or more years old, brings problems of its own. Middleware has to talk accurately to both upstream and downstream applications. I once had a contract with a high-profile professional organisation that was doing in-depth end-to-end testing on a membership renewal website upgrade which had to interact with a customer relationship management app that was about five years old and an accountancy package that was probably fifteen years old. I joined a team of testers that had been in place for some time; I spent four months there, and it was only because of a set of business decisions connected with other things happening in the organisation that a line was drawn under the e2e testing as being “good enough”.

In a later role, an application declared by the company’s CEO to be the “best tested ever” fell over when deployed, because the specification hadn’t considered the data items it was required to process from upstream apps or the data formats it was required to hand off downstream. As the downstream app was the invoicing and payments one, failure here meant that transactions didn’t get processed. At one stage, transactions worth about £25k per day were getting logjammed because no-one had considered what format the data actually needed to be in before it was handed off for payment.

The people responsible for that were consultants who had been given a very specific brief when the system specification was drawn up eighteen months earlier; and the person in the company who had engaged them had long since left. But the flak from it came back to me, as the tester, and to the BA who managed the process.
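Much of that pain could have been caught with an explicit contract check at the handoff boundary. A minimal sketch – the field names and formats here are invented for illustration, not taken from any real invoicing system – would validate every outgoing record against the downstream app’s expected format in test, and fail loudly there rather than silently in production:

```python
# Hypothetical handoff contract check; the downstream field names and formats
# are invented for illustration.
import re

DOWNSTREAM_CONTRACT = {
    "account_id": re.compile(r"^\d{8}$"),
    "amount_pence": re.compile(r"^\d+$"),          # whole pence, no decimal point
    "value_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def validate_handoff(record: dict) -> list:
    """Return a list of contract violations for one outgoing record."""
    errors = []
    for field, pattern in DOWNSTREAM_CONTRACT.items():
        value = record.get(field)
        if value is None:
            errors.append(f"missing field: {field}")
        elif not pattern.match(str(value)):
            errors.append(f"bad format for {field}: {value!r}")
    return errors

# A record still in the upstream app's format gets flagged before handoff:
print(validate_handoff({"account_id": "1234567", "amount_pence": "250.00"}))
```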

-oOo-

Coming back to the recent problems, I wonder how much data cleansing TSB did before their system went into test. Was their test dataset properly representative of the range of customers, their identities and the sorts of transactions they wanted to carry out? Or did they just use a vanilla dataset that they knew would work and return acceptable results in a limited timeframe? When you’re up against a deadline, the temptation to use such a vanilla dataset and to check only the happy path is a big one – but it should be resisted as far as possible.
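One way of making that resistance practical is to build the awkwardness into the test data deliberately. A hedged sketch – the customer attributes are invented, and a real migration would have far more of them – of the difference between a vanilla dataset and a representative one:

```python
# Illustrative only: a "vanilla" migration dataset versus one seeded with the
# awkward cases a real migration has to survive. All attributes are invented.
vanilla_customers = [
    {"name": "John Smith", "accounts": 1, "joint": False, "dormant": False},
]

representative_customers = vanilla_customers + [
    {"name": "Siobhán Ní Bhraonáin", "accounts": 1, "joint": False, "dormant": False},  # non-ASCII name
    {"name": "A B", "accounts": 7, "joint": True, "dormant": False},                    # many linked accounts
    {"name": "E Krabappel (deceased)", "accounts": 1, "joint": False, "dormant": True}, # dormant / estate account
    {"name": "J Smith (power of attorney)", "accounts": 2, "joint": True, "dormant": False},
]

def migrate(customer):
    """Stand-in for the real migration routine under test."""
    return {"migrated": True, **customer}

for customer in representative_customers:
    assert migrate(customer)["migrated"], f"migration failed for {customer['name']}"
print(f"{len(representative_customers)} representative records migrated")
```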

Even if you consider all these issues, there is then the problem of scalability. It’s one thing to test with a dataset of perhaps five hundred users hitting the system consecutively; it’s quite another to apply the same test for five million concurrent users.
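Even a crude concurrency probe makes that distinction visible. This is only a sketch – the URL is a placeholder, and a genuine load test would use a dedicated tool such as JMeter, Gatling or Locust with far more realistic user journeys – but it shows the shape of the question:

```python
# Crude concurrency probe using only the Python standard library.
# The URL is a placeholder; a real load test would use a dedicated tool.
import concurrent.futures
import urllib.request

TARGET = "https://example.com/login"   # placeholder endpoint
SIMULATED_USERS = 500                  # easy at 500; the question is what happens at millions

def hit(_):
    try:
        with urllib.request.urlopen(TARGET, timeout=10) as response:
            return response.status
    except Exception as exc:
        return f"error: {type(exc).__name__}"

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(hit, range(SIMULATED_USERS)))

# Summarise how the system coped: status codes versus errors and timeouts.
print({outcome: results.count(outcome) for outcome in set(results)})
```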

I have had one experience of working on a banking system, and it was quite an eye-opener. In 2014 I went as a sub-contractor to a third-party testing provider, placed within an outsourced services company. Their client was a High Street bank for which they provided back-office data and records processing services, mostly using mature proprietary applications from big-label IT companies. The project was to implement a minor change in the way account changes (authorised signatories, official addresses) were notified and actioned on certain accounts. It involved the completion of a paper form in the bank branch, which was then scanned and sent to the back-office company, who processed it as an image, validated it, did the OCR work to implement the change on the bank’s own system, and generated the necessary hard-copy documentation confirming the change.

The outsourcing company I was placed with was a US-based multinational. My first problem was my own status. As a contractor, I found that I was subject to all sorts of restrictions, starting with not being allowed to use the front door. Whilst this might seem quite reasonable in the case of rufty tufty blokes in hard hats and hi-vis jerkins treading cement dust into the carpet in reception, this hardly seemed appropriate for an IT professional. Still: their office, their rules. But then, it took a week for me to be authorised to even have access to the IT network, as my credentials had to be approved by head office IT admins in the States. Even that wasn’t the end of it. As the company was dealing with banking issues, my ID swipe card had to be collected from the security office at the main gate on arrival and returned there when I left each night.  I also had to have appropriate permissions to move around the building, including one set of permissions to visit the developers and a separate set of permissions to access the shop floor where the document processing was actually done.

Even then, the impediments didn’t stop. There was no test environment; all my testing had to be done in the live environment after business hours (transaction processing ceased at 3:30pm). This meant that I ended up pair testing with the lead developer on this change from about 3:30 to 7:30 or 8pm, running tests on documentation that we had arranged to be sent by a nominated bank branch, as an end-to-end check that installation and functionality worked in the live environment. This wouldn’t have been so bad, but I was providing holiday cover, so the test manager who could have authorised variations to the terms of engagement was one of the people I was standing in for; and the terms of the contract were that I had to attend during normal business hours, starting at 9am regardless. I was paid by the day, not by the hour, and there was no-one who could authorise a local variation so that I could have a more reasonable starting time or be paid for the hours I actually put in. Luckily for them, I consider myself a conscientious professional who does what’s needed to get the job done.

And this was a simple addition of one new form to an existing system, actioned at one location. A major migration of millions of customers from one system to another, onto a completely new platform that had to mesh seamlessly with legacy systems on a national basis, was never going to be an easy task. That TSB had to “bring in experts from IBM” a week into the crisis suggests a mindset within the bank that considered testers to be mere functionaries, people who step through a simple test script to ensure that system functionality does what it is supposed to. The wider role of the tester – determining how the system should be tested, what would be required to test it properly, and what resources and how much time those tests would need – does not appear to have been taken into account. No-one appears to have done any sort of risk analysis (which boils down to the simple question: “What’s the worst thing that could happen?”). Testing seems to have been restricted to a pure quality control process, a simple check that the application does what it says on the tin.

More recently, another problem has emerged with the National Health Service (NHS) breast cancer screening programme. Women in England between the ages of 50 and 70 are automatically invited for breast cancer screening every three years, and should receive their final invitation between their 68th and 71st birthdays.

But in 2009, the system was changed to allow trials to take place on extending the age range of those invited for screening. As a consequence, women who had already reached their 70th birthday were excluded from the system, and up to 450,000 women never received an invitation for their final scan. Only when Public Health England put a systems upgrade in place almost ten years later was the error discovered.
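From the outside this looks like a classic boundary-condition fault, exactly the kind of thing a tester would probe with values either side of the limit. A hedged sketch (my reading of the rule as reported in the press; the function names are invented and this is certainly not the actual screening code):

```python
# Boundary-condition sketch of the screening-invitation rule as reported:
# the final invitation should still be possible up to the 71st birthday,
# i.e. a woman aged 68, 69 or 70 still qualifies.

def eligible_as_reported_buggy(age: int) -> bool:
    return 50 <= age < 70     # excludes anyone who has already turned 70

def eligible_as_intended(age: int) -> bool:
    return 50 <= age < 71     # the final invitation can still go out at 70

for age in (49, 50, 68, 69, 70, 71):
    print(age, eligible_as_reported_buggy(age), eligible_as_intended(age))
```

Tests that exercise 69, 70 and 71 rather than a comfortable mid-range value are cheap; a decade of missed invitations is not.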

An official enquiry is now under way to determine how this omission occurred. As testers, our concern should be that the lack of feedback loops at the business end of the process – issuing invitations, booking appointments and keeping patients’ own GPs informed, all of which would act as checks on the whole screening programme – will be overlooked, and the blame laid wholly on the shoulders of testers, whom an ignorant management may very well expect, wholly unreasonably, to detect every single possible bug in a system. Testers do not guarantee 100% bug-free software; and sometimes a system can work exactly the way the code says it should, yet there is a gap or shortfall in the understanding of the person who specified the system, or of the person or body who issued the requirement for it.

Of course, this also raises a question of gender bias. The bulk of developers are men, who are not themselves subject to breast screening; it might be argued that their grasp of the realities and implications of this programme was less immediate than it would have been had more women been involved in the software development. Would there have been a different outcome if the programme had been screening for, say, prostate cancer? Or would other biases, such as age, have taken hold instead? Or should we expect, and even publicise, the fact that anything made by humans can have errors in it? Accepting that would mean that testers would cease being seen as gatekeepers carrying the burden of responsibility for the consequences of undetected bugs; rather, they would be seen as explorers, finding the limits of what a piece of software can do, managing expectations and helping mitigate risk. In a world becoming daily more dependent on the complex, intangible constructs that are software applications, some recognition of the scale and difficulty of the task of building, testing, deploying and maintaining software is very much overdue. But I’m not holding my breath for it anytime soon.