By Richard Eichen
Many people have been surprised by recent failures of crucial government systems, most notably unemployment-application processing. Blame has been laid on system age and COBOL-era technology. But the unemployment systems are only the visible tip of an enormous illness – much older banking, airline, hospital, and government systems were originally architected and built during the IBM S/360 mainframe era, which began circa 1964. COBOL itself dates to the late 1950s and was created with contributions from Admiral Grace Hopper (who popularized the term ‘bug’). In some cases, the original COBOL developers who wrote a mission-critical system have not only retired but have since passed away. This is old technology, so why is it still around?
Let’s size this illness. In 2014, Reuters compiled some troubling statistics on the US financial services industry:
- 43% of banking systems rely on COBOL applications
- 80% of in-person transactions rely on COBOL
- 95% of ATM swipes rely on COBOL
- There are 220 BILLION lines of COBOL code in production systems
- The average age of a COBOL developer was 45-51 in 2014 (51-61 today, though most of us have not met a COBOL coder under 60 in a long, long time; as of Q4 2019, roughly 10% of remaining COBOL coders retire each year)
Why the longevity, when your mobile phone is ‘old’ in a year and ancient in two? First, IBM’s S/360 architecture was revolutionary for its time in separating the underlying computer and operating system from application software, permitting many applications to run on the same hardware. IBM, itself a large company, understood the still-evolving computing needs of other large enterprises: transaction integrity, remote terminal and device support, and a 10-year product roadmap that guaranteed backward compatibility of software on its newest machines (preserving investment). Its technical field force was top-notch, wore white shirts and ties, was trained in customer service, and looked just like its customers’ own employees, bringing peace of mind to the C-Suite.
Given the high level of effort involved in software development, applications were thought through end to end, which took time, and they were not very flexible. But it all worked, however painful it sometimes was to get there. Out of sight, out of mind.
It worked so well that the timesharing industry was born – the ancient ancestor of SaaS, in which applications were sold on a per-transaction basis without your having to install and maintain a system of your own. You could, for example, upload massive amounts of data (for the time) and have the timesharing company run the sort or a statistics package against it. Defense contractors were among the early timesharing providers, selling unused time on their massive systems, much as Amazon initially created AWS to help pay for the growing cost of building out its transaction and fulfillment infrastructure.
From this stability, and intra-company cultural alignment, many essential but unglamorous applications became ‘if it ain’t broke, don’t touch it’ utilities. Another downside was IT’s inward focus and lack of flexibility, which drove business units to distributed computing to escape the regimentation and long backlogs. COBOL applications grew cobwebs while investment went into newer Java-based systems. With documentation lost over the decades, even modifying the COBOL code carried significant risk, especially where technical debt had gone unpaid. We forgot about legacy applications because they worked, day in and day out – until they fell over.
As a result of the new visibility on older systems, the C-Suite should address:
- Which behind-the-scenes applications are mission-critical and COBOL-based, and can replacements be phased in while the originals run in parallel? The good news is that many offshore firms have strong COBOL capabilities at attractive rates, so ongoing maintenance is affordable during the transition.
- Have your SaaS vendors taken old on-premises applications, thrown them into a data center (or onto Azure or AWS), and presented them as new SaaS offerings rather than rewriting them entirely?
- Have Procurement review the Business Continuity and Disaster Recovery sections of all SaaS contracts, focusing on three key concerns:
  - RTO (Recovery Time Objective, i.e., how long before full service is restored)
  - RPO (Recovery Point Objective, i.e., how much data will be lost – for example, only the most current transaction, or a day’s worth)
  - When was the last full-scope lifeboat drill, who conducted it, and what were the results?
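To make RTO and RPO concrete, here is a minimal sketch in Python. The contract figures and the helper function are purely illustrative assumptions, not terms from any actual vendor agreement:

```python
from datetime import timedelta

# Hypothetical contract terms (illustrative only):
rto = timedelta(hours=4)    # full service restored within four hours
rpo = timedelta(hours=24)   # up to one day of transactions may be lost

def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """The effective RPO can never be better than the gap between backups."""
    return backup_interval

# A vendor backing up nightly cannot honestly promise an RPO under 24 hours.
assert worst_case_data_loss(timedelta(hours=24)) <= rpo
```

In other words, when Procurement reads an RPO of ‘one day,’ the practical question for the vendor is whether its backup schedule can actually support that number.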
These considerations are particularly crucial for specialty applications, which often come from relatively small vendors struggling to handle user growth. In the end, this is not an abstract exercise: more than one SaaS EMR vendor has recently suffered multiple complete failures, Azure and AWS have both had significant outages, and who knows when the COBOL unemployment systems will be back up.
Admittedly, there is not much strategic thought here, but these are tactical times.