Introduction

Enterprise software delivery for large-scale projects is complex. It demands substantial time and money: a single system can cost millions of dollars and take years to deliver. To maximize the return and justify that initial development outlay, firms will typically expect their software ecosystems to operate for decades before retiring or replacing them.

But what happens when the time eventually comes to retire, replace, or upgrade an old system? Why do upgrade projects often grow to be much larger than initially thought? And what warning signs are worth noticing and correcting to avoid the worst possible outcome?

Planning for success

Once a system is delivered, technology leadership should plan for a mixture of production support, periodic maintenance, and continuous improvement. If business requirements don’t evolve much after launch, the balance between these workstreams will tip towards support and maintenance with little need for new improvements to be made.

Regardless of the exact post-launch work requirements, budget holders should account for some ongoing maintenance to ensure the continued health of the system. Updating a system in small chunks over time helps limit the scope of any given change, making each step far more achievable than attempting a single massive upgrade project.

Beyond implementing any new business functionality, constant upkeep of an enterprise software system also allows policy changes to be actioned promptly before they become major blockers. Responding quickly is crucial for information security issues that may impact customers, or for external regulatory requirements where non-compliance leads to significant penalties. Handling these problems should also have minimal impact on other business-as-usual activities; while such problems may be unforeseen, having the flexible capacity to deal with them is something that can be planned.

Having ongoing capacity also allows tech teams to adopt improvements in tooling and methodology from the broader technology industry over time. This continuous general upkeep spreads the effort and cost of updating and keeps the technology group aligned with the current labor pool it hires from.

But what about the cost?

While the benefits of software maintenance are apparent, many of them are hypothetical. A small(er) amount of work now helps reduce the work required for the next update – but it is often impossible to quantify what that next update will need, which makes the combined effort hard to put in perspective.

The main challenge in allocating sufficient ongoing maintenance is usually budgetary. The year-to-year cost of maintaining a given system is plain to see; the cost of some theoretical future uplift or replacement once the system reaches end-of-life is not. Why budget for a future that may never happen when halting a system’s upkeep gives an immediate cost-saving benefit?

Plus, it’s a big assumption to think businesses properly consider a system’s full lifecycle, including sunsetting – it’s easier to put it out of mind as a future problem. The potential cost of eventually uplifting or replacing a system remains precisely that: a potential future cost. However many orders of magnitude larger the work to replace a system may be compared with all its annual maintenance, it is a ‘hypothetical’ cost that is easily deferred to become some future person’s problem.

Why should we consider the future?

To illustrate why preventative action is good – and for comedic effect – consider an inverse of the problem…

When the time comes to upgrade or replace a system that has been in operation for decades without any maintenance, how would you go about making the experience as painful as possible?

Disclaimer: The following recipe is intended as satire – not to be taken literally!

The Recipe for Disaster Success: Part 1

Change everything all at once

Infrastructure, software runtimes, middleware, data sources – swap them all out. But don’t think this is a new greenfield project starting from a blank slate – you’ll have to try to upgrade all the surrounding technology while simultaneously uplifting your existing codebase and attempting to keep the currently expected application functionality.

Change all the processes of how your teams operate. Better yet, make sure teams change to different processes from each other and their business colleagues. Keep changing things throughout the project – some way of working proving too efficient? Change how teams work! Adding more meetings is always a good idea to help productivity. Or better yet, don’t track how efficient current processes are, so any change you make automatically becomes a “good change.”

Ensure there is no automation

When changing everything except the expected functionality of the system, make sure you don’t have any automated test suites to validate changes made along the way. Testing should be as manual as possible, take up loads of time, require scheduling well in advance, and every test run should be unique in what it exercises.

Every change made to the system is valid when you cannot repeatedly verify it. If a test fails, just perform a new one – perhaps tossing the test coin will land on success next time? Or, if there is an automated suite, ensure it reports success even when it’s disabled. A great way to do this is to have the suite not exercise any application functionality through its assertions. Or perhaps the suite’s assertions always validate as successful? Why would you want to validate your application when you can just say your application has been validated by showing a green “100%” metric on a screen somewhere?
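
(If, despite all this excellent advice, you slipped and wanted a test that actually proves something, the difference is small. Here is a minimal sketch in Python’s built-in unittest – the apply_discount function and its expected values are entirely hypothetical – contrasting an always-green assertion with one that genuinely exercises the code.)

```python
import unittest

# Hypothetical application function, used purely for illustration.
def apply_discount(price, percent):
    return round(price * (1 - percent / 100), 2)

class DiscountTests(unittest.TestCase):
    def test_always_green(self):
        # The satirical anti-pattern: this "test" passes no matter what the
        # application does, because it never calls the application at all.
        self.assertTrue(True)

    def test_actually_exercises_functionality(self):
        # A real check: call the code and compare against an expected value.
        self.assertEqual(apply_discount(100.0, 15), 85.0)

if __name__ == "__main__":
    unittest.main()
```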

Deployment pipelines should also be manual, especially in production. Things get too easy when you can automatically repeat and validate a deployment across environments. Everyone loves missing a deployment step or performing things in random order, so why not test your powers of production issue investigation and resolution on every deployment? Even better, schedule production deployments in the early hours, so the people performing deployment steps and investigating any resulting issues are as tired as possible. This maximizes the potential for human error – perfect!

Or, if it’s not deployment problems you’re looking for but rather inconveniencing your users, perform manual deployments during business hours. Results should include slow responses due to cache misses, A/B oddities depending on which instance a request happens to hit, or even full-service unavailability for extended periods while problems are investigated. In other words, a great user experience!
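
(Should you accidentally wish to avoid all this fun, a deployment can be reduced to a fixed, repeatable sequence with a post-deployment check. The sketch below is purely illustrative – the environments, health-check URLs, and placeholder steps are hypothetical stand-ins for whatever your real pipeline tooling would do.)

```python
import sys
import urllib.request

# Hypothetical environments and health-check endpoints, for illustration only.
ENVIRONMENTS = {
    "staging": "https://staging.example.com/health",
    "production": "https://www.example.com/health",
}

def deploy(environment):
    """Run the same ordered steps every time, so nothing relies on memory."""
    print(f"Deploying to {environment}...")
    # Placeholder steps; a real pipeline would call build/orchestration tooling here.
    for step in ["build artifact", "run database migrations", "release new version"]:
        print(f"  step: {step}")

def verify(environment):
    """Fail loudly if the post-deployment health check does not pass."""
    url = ENVIRONMENTS[environment]
    with urllib.request.urlopen(url, timeout=10) as response:
        if response.status != 200:
            raise RuntimeError(f"Health check failed for {environment}")
    print(f"  {environment} is healthy")

if __name__ == "__main__":
    env = sys.argv[1] if len(sys.argv) > 1 else "staging"
    deploy(env)
    verify(env)
```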

Make sure nothing is documented

A nice tie-in to ensuring no automation; if no one ever documented the system’s existing functionality, who can say if a change is valid or not? After the next deployment, your users may be surprised by a new, unexpected feature.

Or, what user doesn’t love it when something they expect to work one way suddenly starts working another way without notice? Perhaps that was how the system always worked? Who can say! If you somehow want to validate that functionality remains the same, the process should be as painstaking as possible. Require someone to manually perform side-by-side comparisons of the current and new systems.
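
(For those who tire of eyeballing two screens, parity checks can be scripted instead. The sketch below is a rough illustration – the endpoints and request paths are hypothetical – that replays the same requests against the current and replacement systems and flags any differences.)

```python
import json
import urllib.request

# Hypothetical base URLs for the current and replacement systems.
OLD_BASE = "https://legacy.example.com/api"
NEW_BASE = "https://replacement.example.com/api"

# Representative requests to compare; in practice this list could be
# generated from recorded production traffic.
PATHS = ["/customers/42", "/orders/42", "/invoices/42"]

def fetch(base, path):
    with urllib.request.urlopen(base + path, timeout=10) as response:
        return json.load(response)

def compare(path):
    old_body = fetch(OLD_BASE, path)
    new_body = fetch(NEW_BASE, path)
    if old_body == new_body:
        print(f"MATCH    {path}")
    else:
        print(f"MISMATCH {path}")
        print(f"  old: {old_body}")
        print(f"  new: {new_body}")

if __name__ == "__main__":
    for path in PATHS:
        compare(path)
```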

Teams that don’t communicate with each other don’t have to know about or deal with each other’s issues. Make sure any system knowledge you do have stays in the minds of specific individuals. These are The Keepers Of The Secrets, who help maintain order throughout the system’s lifetime (and, by creating their fiefdoms, ensure they have a job for life if so desired). If one of them were suddenly to quit, or worse… you’d have an excellent opportunity to change things more unexpectedly without recourse. Plus, you can probably hire two or three new inexperienced graduates for the same budget the old Secret Keeper was costing you. Everyone knows the keys to flexible software delivery are compartmentalization and secrecy.

Next time…

Join us in the second half of this 2-part series to see what other areas you can work on to ensure you have the worst *ahem* best possible major enterprise systems upgrade project you could ever hope for!