COBOL Conversion: How to ensure your success

There is a growing need to translate legacy mainframe applications from COBOL to a modern programming language. This legacy transformation is happening in many industries, such as finance and manufacturing, for two main reasons:

  1. Applications need to be adapted for a modern Web-based service-oriented architecture (SOA)
  2. It is becoming increasingly difficult to find skilled COBOL programmers to perform program maintenance

COBOL applications are often decades old, large, monolithic, and complex, produced from continuous program modifications and enhancements extending over long periods of time.

COBOL has been primarily used in business, finance, and administrative systems for companies and governments.

COBOL is still widely used in legacy applications deployed on mainframe computers, such as large-scale batch and transaction processing jobs. But due to its declining popularity and the retirement of experienced COBOL programmers, COBOL programs are being migrated to new platforms, rewritten in modern languages or replaced with software packages.

Most programming in COBOL is now purely to maintain existing applications. However, some large financial institutions are still developing new systems in COBOL because of mainframe processing speed.

COBOL History

COBOL (Common Business-Oriented Language) was designed for business use.

COBOL is an imperative, procedural and, since 2002, object-oriented programming language.

In the late 1950s, computer users and manufacturers were becoming concerned about the rising cost of programming. A 1959 survey had found that in any data processing installation, programming cost $800,000 on average and that translating programs to run on new hardware would cost $600,000.

At a time when new programming languages were proliferating at an ever-increasing rate, the same survey suggested that if a common business-oriented language were used, conversion would be far cheaper and faster.

COBOL was designed in 1959 by CODASYL (Conference/Committee on Data Systems Languages) and was partly based on the programming language FLOW-MATIC designed by Grace Hopper. It was created as part of a US Department of Defense effort to create a portable programming language for data processing. It was originally seen as a stopgap, but the Department of Defense promptly forced computer manufacturers to provide support for it, resulting in its widespread adoption. It was standardized in 1968 and has since been revised four times. Expansions include support for structured and object-oriented programming.

On 8 April 1959, Mary K. Hawes, a computer scientist at Burroughs Corporation, called a meeting of representatives from academia, computer users, and manufacturers at the University of Pennsylvania to organize a formal meeting on common business languages. Representatives included Grace Hopper, inventor of the English-like data processing language FLOW-MATIC, Jean Sammet and Saul Gorn.

At the April meeting, the group asked the Department of Defense (DoD) to sponsor an effort to create a common business language. The delegation impressed Charles A. Phillips, director of the Data System Research Staff at the DoD, who thought that they “thoroughly understood” the DoD's problems. The DoD operated 225 computers, had a further 175 on order and had spent over $200 million on implementing programs to run on them. Portable programs would save time, reduce costs and ease modernization. [^1]

COBOL Legacy Software

The Gartner Group reported that 80% of the world's business ran on COBOL with over 200 billion lines of code and 5 billion lines more being written annually.

In 2016, the Government Accountability Office (GAO) reported that the Department of Homeland Security, the Department of Veterans Affairs, and the Social Security Administration, to name just three, were still using COBOL.

COBOL programs are used globally by businesses and governments and are running on a variety of operating systems such as z/OS, z/VSE, VME, Unix, Linux, OpenVMS and Windows.

The largest number of businesses using COBOL are financial institutions. This includes banking, insurance and wealth management/equities trading.

Reuters reported in 2017 that 43% of banking systems still used COBOL.

COBOL use by business activity

Near the end of the 20th century, the year 2000 problem (Y2K) was the focus of significant COBOL programming effort, sometimes by the same programmers who had designed the systems decades before.

The particular level of effort required to correct COBOL code has been attributed to the large amount of business-oriented COBOL, as COBOL business applications use dates and fixed-length data fields extensively. After the clean-up effort put into these programs for Y2K, a 2003 survey found that many of these applications were still in use.

Survey data suggests “a gradual decline in the importance of COBOL in application development over the next 10 years unless … integration with other languages and technologies can be adopted”.

In 2006 and 2012, Computerworld surveys found that over 60% of organizations used COBOL (more than C++ and Visual Basic .NET) and that for half of those, COBOL was used for the majority of their internal software. 36% of managers said they planned to migrate from COBOL, and 25% said they would like to if it was cheaper. Instead, some businesses have migrated their systems from expensive mainframes to less expensive, more modern systems, while maintaining their core COBOL programs.

During the 2019–20 coronavirus pandemic and the ensuing surge of unemployment, several US states reported a shortage of skilled COBOL programmers to support the legacy systems used for unemployment benefit management. Many of these systems had been in the process of conversion to more modern programming languages prior to the pandemic, but the process had to be put on hold.

COBOL Lack of structure

In the 1970s, adoption of the structured programming paradigm was becoming increasingly widespread. Edsger Dijkstra[^2], a preeminent computer scientist, wrote a letter to the editor of Communications of the Association for Computing Machinery (ACM), published in 1975, entitled “How do we tell truths that might hurt?”[^3], in which he was critical of COBOL and several other contemporary languages, remarking:

“The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offense.”

Elsewhere in the same letter he wrote: “Computing Science seems to suffer severely from this conflict. On the whole, it remains silent and tries to escape this conflict by shifting its attention. (For instance: with respect to COBOL you can really do only one of two things: fight the disease or pretend that it does not exist. Most Computer Science Departments have opted for the latter easy way out.)”

COBOL statements have an English-like syntax, which was designed to be self-documenting and highly readable. However, it's verbose and uses over 300 reserved words.

Where a modern language would write a succinct assignment such as y = x;, COBOL uses a more English-like form (in this case, MOVE x TO y). COBOL code is split into four divisions (identification, environment, data and procedure) containing a rigid hierarchy of sections, paragraphs and sentences.
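The difference is more than cosmetic. A MOVE into a fixed-length alphanumeric field left-justifies the value, pads it with spaces and truncates anything that does not fit, whereas a C++ assignment simply copies the value. As a rough illustration only, the C++ sketch below emulates that behavior. The helper name move_to_fixed is hypothetical and is not part of any COBOL runtime or conversion tool.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: emulates MOVE into a fixed-length alphanumeric field
// of `width` characters. The value is left-justified, truncated on the right
// if it is too long, and padded with spaces if it is too short.
std::string move_to_fixed(const std::string& source, std::size_t width) {
    std::string field = source.substr(0, width);  // truncate if too long
    field.resize(width, ' ');                     // pad with spaces if too short
    return field;
}
```

For example, move_to_fixed("ABC", 5) yields "ABC" followed by two spaces, and move_to_fixed("ABCDEFG", 5) yields "ABCDE". Converted code has to preserve this kind of behavior wherever the original program relied on it.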

A COBOL program is separated into four divisions:

  • Identification Division: identifies (names) the program, class, or interface.
  • Environment Division: defines configuration values.
  • Data Division: has six sections and defines the data model of the program using hierarchical data structures.
  • Procedure Division: the heart of the program, containing the procedures that process data.

COBOL lacks a large standard library: the standard specifies 43 statements, 87 functions and just one class.

COBOL programs were infamous for being monolithic and lacking modularization. COBOL code could only be modularized through procedures, which were found to be inadequate for large systems.

One cause of spaghetti code in COBOL was the GO TO statement. Attempts to remove GO TOs from COBOL code, however, resulted in convoluted programs and reduced code quality. GO TOs were largely replaced by the PERFORM statement and procedures, which promoted modular programming and gave easy access to looping facilities. However, PERFORM could only be used with procedures, so loop bodies were not located where they were used, making programs harder to understand.

It was impossible to restrict access to data, meaning a procedure could access and modify any data item. Also, there was no way to pass parameters to a procedure, an omission Jean Sammet (a short-range COBOL committee member and one of the original developers of COBOL) regarded as the COBOL committee's biggest mistake.
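To illustrate these two points, the following C++ sketch contrasts a COBOL-style routine, which can communicate only through shared data, with a function that takes parameters and returns a result. All names are hypothetical, and the code is a simplified analogy rather than the output of any converter.

```cpp
// COBOL-style: every routine reads and writes the same shared working storage,
// so any routine can change any item and the data flow is invisible at the call site.
namespace legacy_style {
    long ws_price_cents = 0;
    long ws_quantity = 0;
    long ws_total_cents = 0;

    // Takes no parameters: inputs and outputs are implicit.
    void compute_total() {
        ws_total_cents = ws_price_cents * ws_quantity;
    }
}

// Modern style: inputs are parameters, the output is the return value,
// and the function cannot touch anything else.
long compute_total(long price_cents, long quantity) {
    return price_cents * quantity;
}
```

In the first form a caller must know which shared items to set before the call and which to read afterwards. In the second, the signature documents that contract, which is one reason converted code is often restructured after translation.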

Another complication stemmed from the ability to PERFORM THRU a specified sequence of procedures. This meant that control could jump to and return from any procedure, creating convoluted control flow and permitting a programmer to break the single-entry single-exit rule.

A lot of important legacy COBOL software uses unstructured code, which has become unmaintainable. It can be too risky and costly to modify even a simple section of code, since it may be used from unknown places in unknown ways.

COBOL Compatibility issues

COBOL was intended to be a highly portable, “common” language. However, by 2001, around 300 dialects had been created.

COBOL syntax has often been criticized for its verbosity. Proponents say that this was intended to make the code self-documenting, easing program maintenance. COBOL was also intended to be easy for programmers to learn and use, while still being readable to non-technical staff such as managers.

The desire for readability led to the use of English-like syntax and structural elements, such as nouns, verbs, clauses, sentences, sections, and divisions. Yet by 1984, maintainers of COBOL programs were struggling to deal with “incomprehensible” code and the main changes in COBOL-85 were there to help ease maintenance.

Jean Sammet noted that “little attempt was made to cater to the professional programmer, in fact people whose main interest is programming tend to be very unhappy with COBOL” which she attributed to COBOL's verbose syntax.

COBOL's Isolation from the computer science community

The COBOL community has always been isolated from the computer science community. No academic computer scientists participated in the design of COBOL: all of those on the COBOL committee came from commerce or government. Computer scientists at the time were more interested in fields like numerical analysis, physics and system programming than the commercial file-processing problems which COBOL development tackled.

There was significant disdain towards COBOL in the business community from users of other languages, for example FORTRAN or assembler, implying that COBOL could be used only for non-challenging problems.

Jean Sammet attributed COBOL's unpopularity to an initial “snob reaction” due to its inelegance, the lack of influential computer scientists participating in the design process and an aversion to business data processing.

Concerns about the COBOL design process

Doubts have been raised about the competence of the COBOL standards committee. Short-term committee member Howard Bromberg said that there was “little control” over the development process and that it was “plagued by discontinuity of personnel and … a lack of talent.” Jean Sammet and Jerome Garfunkel also noted that changes introduced in one revision of the standard would be reverted in the next, due as much to changes in who was in the standard committee as to objective evidence.

COBOL standards have repeatedly suffered from delays: COBOL-85 arrived five years later than hoped, COBOL 2002 was five years late, and COBOL 2014 was six years late. To combat delays, the standard committee allowed the creation of optional addenda which would add features more quickly than by waiting for the next standard revision. However, some committee members raised concerns about incompatibilities between implementations and frequent modifications of the standard.

The cost of maintaining a COBOL system

“The cost of running COBOL environments have increased over the past five to ten years. Companies that charge license fees for mainframe-related technologies have raised their rates to compensate for the number of people migrating away from these solutions.”

COBOL maintenance costs are only part of the issue. “For many organizations, the legacy COBOL system is a black box - a massive amount of loosely organized COBOL code written by developers who retired or left long ago, leaving behind little documentation or standards.”

“Remaining faithful to COBOL often poses more risks than a migration and old tools ultimately affect the competitiveness of the organization.”

COBOL applications don't integrate with modern software architecture or business intelligence tools.

“Companies collecting valuable data in COBOL and legacy databases do not have the ability to integrate and analyze data from multiple sources to find new trends or customer usage patterns.”

In addition, “new features and functionality can take up to 10x longer to implement in COBOL than they can in a programming language like modern C++.”

“Companies keeping COBOL will be those slowest to change – mostly large finance, insurance or government bodies.”

Massachusetts invested the time and money to make sure it built a system that would actually work—and now it’s reaping the benefits. In 2017, Massachusetts migrated its unemployment technology to the cloud, predicting it would save about $800,000 annually in ongoing maintenance expenses. More importantly, though, Massachusetts’ online system has kept up with demand, state officials told WBUR, a Boston-area NPR news station.

What can organizations do about their COBOL problem?

To stay competitive in today’s fast-paced business world, careful planning of your COBOL conversion is crucial, not only to stay true to the initial design of the software but also to support business growth with modern functionality and integration with other components and systems.

One of the main reasons to migrate your COBOL system or applications is simply to modernize your current environment. Many factors can influence this decision, including but not limited to the following:

  • Support. The availability of support for both legacy platforms and the COBOL language continues to decline.
  • Developer Availability. The pool of programmers familiar with COBOL continues to decline.
  • Cost. Migrating to a newer system platform can reduce operating costs. Converting COBOL to a different programming language can reduce/eliminate licensing costs.
  • Performance/Speed. Migrating from a COBOL system platform to another can dramatically improve the speed at which jobs and operations run.
  • Market Expansion. Porting your COBOL application to a different platform can help you expand your customer base.
  • Feature Availability. Different platforms and languages may offer newer and/or improved functionality (e.g., web integration, hardware availability, etc.).
  • Third-Party Product Availability. Migrating can allow integration with new third-party software solutions that may be available only on other platforms.

Organizations basically have three options when it comes to deciding how to deal with the emerging COBOL crisis.

  • One option is to do nothing, ignore the problem and hope for the best. Software written in COBOL is still good for some functions, but ignoring the problem stifles growth and does nothing to make COBOL more practical for building new customer-centric products. With this option, companies also face impending shortages of COBOL developers to enhance and maintain their systems, and higher labor costs as the supply of those developers shrinks.

  • Option number two is to replace everything, creating a completely new core platform written in a modern programming language. However, many organizations have neither the expertise nor the resources to make this move.

  • The third option, which is the least expensive and easiest, is incremental. Instead of trying to rebuild the entire system at once, you start by looking at your customers’ problems and solve those problems with more agile, light-weight solutions.

You convert specific functionality of your COBOL application into a modern programming language with support for static analysis and improved performance. Then, if necessary, you keep COBOL running, doing only the things it’s good at, and combine it with the new solutions.

COBOL Gradual and safe improvements

The safer and easier way of migrating your COBOL software is to replace specific functionality with light-weight components or add-ons written in a modern programming language, either by directly converting the COBOL or by writing the desired functionality from scratch in the modern language.

If necessary, the new software components may rely on COBOL only for some of the core features of the old system. The key is how the connection to the old system is made.

You can create a thin connection between the new and the old platform, which will not strain the core infrastructure and will make it simple and containable in terms of integration.
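One way to picture such a thin connection is as a single narrow interface that all new code goes through, with an adapter behind it that talks to the old system. The C++ sketch below is only an illustration under that assumption. The names BalanceService, LegacyBalanceAdapter and can_withdraw are hypothetical, and the adapter uses an in-memory table as a stand-in for the real call into COBOL.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical interface: the only surface through which new code
// reaches account data, wherever that data currently lives.
class BalanceService {
public:
    virtual ~BalanceService() = default;
    virtual long balance_cents(const std::string& account_id) const = 0;
};

// Adapter over the legacy system. A real adapter would call the COBOL
// program here; this stub answers from an in-memory table instead.
class LegacyBalanceAdapter : public BalanceService {
public:
    long balance_cents(const std::string& account_id) const override {
        const auto it = stub_.find(account_id);
        return it == stub_.end() ? 0 : it->second;
    }

private:
    std::unordered_map<std::string, long> stub_{{"ACC-001", 125000L}};
};

// New component: depends only on the interface, never on COBOL directly.
bool can_withdraw(const BalanceService& service,
                  const std::string& account_id,
                  long amount_cents) {
    return amount_cents > 0 && service.balance_cents(account_id) >= amount_cents;
}
```

Because can_withdraw sees only BalanceService, the legacy adapter can later be swapped for an implementation backed by a new system without touching the new components.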

Gradually companies will be able to address each and every product need that they have with new software services that will replace the problematic COBOL software with add-ons. This compartmentalizes the organization's COBOL problem and makes it cheaper to fix, as it won’t have to be done all at once.

Then all the important stuff is being handled by modern software, and you only have this thin connection to the COBOL, which is now doing one simple thing: for example, high volume processing. When you’ve reached this point, replacing COBOL is a lot easier.

With this approach, banks and other organizations don’t have to actively kill COBOL to ensure their systems can be modernized (and to avoid costly issues). They’ll just have to minimize the new software’s connections to the old systems so that the old systems can be switched out in a safe and inexpensive manner.

COBOL Conversion

Migrating your legacy COBOL system

Converting COBOL into C++ can be up to five times cheaper than re-writing everything from scratch.

Organizations such as the New York Stock Exchange’s Euronext have migrated from COBOL, stating that the changing environment of their systems, along with the obsolescence of mainframes in the trading industry, drove them to adopt a new processing platform and server system.

Replacing COBOL with a modern programming language involves analyzing business rules embedded in the code, implementing migration, integrating new code with the existing mainframe or switching from a mainframe to a new system, and testing for bugs and performance.

If your organization will continue to use a mainframe system, it's important to ensure the conversion supports the full environment of the mainframe-based COBOL application, including databases (DB2, IDMS, IMS and VSAM), TP monitors (e.g., CICS), JCL, utilities (e.g., IDCAMS), and SORT. The resulting code should have exactly the functionality of the original legacy application and be easy to test and implement in production.

Accurately translating the meaning behind the source language into the syntax of the target language should be approached with care. Although some code conversion can be automated, it's important to ensure that the converted program produces the same output as the original program for the same input.
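A simple way to check this is to run both versions on the same input and compare their output record by record. The C++ sketch below assumes the output of each run has already been captured as a list of text records. The function name first_mismatch is hypothetical, and how the records are captured is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Compares the output records of the legacy run with those of the converted
// run. Returns the index of the first record that differs (a missing or
// extra record also counts as a difference), or -1 if the outputs match.
long first_mismatch(const std::vector<std::string>& legacy,
                    const std::vector<std::string>& converted) {
    const std::size_t common = std::min(legacy.size(), converted.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (legacy[i] != converted[i]) {
            return static_cast<long>(i);
        }
    }
    if (legacy.size() != converted.size()) {
        return static_cast<long>(common);  // one run produced more records
    }
    return -1;
}
```

Reporting the first differing record, rather than a plain pass or fail, makes it easier to trace a discrepancy back to the part of the converted program that produced it.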

Without programmers experienced in the target language to monitor, verify and enhance the conversion, automated conversion of COBOL can produce an inefficient translation, with lost capabilities and poor performance.

If your organization isn't ready for a complete COBOL migration, one benefit of converting your COBOL procedures to a modern programming language like C++ is that COBOL procedures can be called from C++ and C++ functions can be called from COBOL. C++ is also portable and will run on existing mainframe systems, which helps to preserve the investment in the mainframe system and avoids a complete architectural change. This is in line with the gradual and safe approach of migrating a legacy COBOL system by replacing specific COBOL functionality with light-weight components.
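As a sketch of what the C++ side of such a mixed-language call might look like, the function below is declared with C linkage, so that its symbol name is not mangled, and receives its arguments as pointers, since COBOL passes CALL arguments by reference by default. The function name add_interest is hypothetical. The mapping of each argument to a 32-bit binary integer is an assumption that depends on the COBOL compiler and on how the data items are declared, and the COBOL side of the call is not shown.

```cpp
#include <cstdint>

// extern "C" turns off C++ name mangling so the linker sees a plain symbol.
// Arguments arrive as pointers because the caller passes them by reference.
extern "C" void add_interest(const std::int32_t* balance_cents,
                             const std::int32_t* rate_basis_points,
                             std::int32_t* result_cents) {
    // interest = balance * rate / 10000, computed in 64-bit integer
    // arithmetic so the intermediate product does not overflow.
    const std::int64_t interest =
        static_cast<std::int64_t>(*balance_cents) * *rate_basis_points / 10000;
    *result_cents = static_cast<std::int32_t>(*balance_cents + interest);
}
```

With a balance of 100000 cents and a rate of 250 basis points, the function stores 102500 in the result. Keeping the boundary to plain integers and pointers avoids relying on C++ features that the calling language cannot express.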

On IBM mainframes, the process of compiling a C or C++ source program and then link-editing the object deck into a load module is basically the same as it is for COBOL. The relationship between JCL and program files is the same for C and C++ as it is for COBOL and other high-level programming languages.[^4]

Automated COBOL Conversion

Automated COBOL conversion can transform a mainframe-based legacy COBOL application into a modern programming language such as C++, producing object-oriented code that can be easily maintained and efficiently integrated with other modern applications.

Since it is both difficult and impractical to manually port entire large-scale legacy COBOL programs to a modern programming language, in many cases COBOL translators are used first and then corrections and fine adjustments are added by hand to the resulting code.

Among the post-translation refinements, performance tuning is one of the most difficult and painful tasks, especially for large scale applications. It requires collecting and analyzing execution profiles for critical application scenarios, identifying performance bottlenecks, making necessary modifications in the translated programs, and verifying the results.
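Real profiling is normally done with dedicated tools, but the basic measurement step can be sketched in a few lines of C++. The helper below, whose name mean_microseconds is hypothetical, runs a piece of translated code repeatedly and reports the average wall-clock time per run, which is enough to compare a routine before and after a modification.

```cpp
#include <chrono>
#include <functional>

// Runs `work` the given number of times and returns the mean wall-clock
// time per run in microseconds. Returns 0.0 if `iterations` is not positive.
double mean_microseconds(const std::function<void()>& work, int iterations) {
    if (iterations <= 0) {
        return 0.0;
    }
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto elapsed = clock::now() - start;
    return std::chrono::duration<double, std::micro>(elapsed).count() / iterations;
}
```

Averaging over many iterations smooths out one-off delays. The measurement should be repeated after each change, because fixing one bottleneck can expose another.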

Fixing one major problem often reveals other problems hidden behind it, and in many cases the process needs to be repeated a number of times to finally meet the required levels of performance for an application.

There are many COBOL translators available in the market, and there are variations in the quality of the translated code. However, even the best translators produce code that tends not to perform well, without modifications. Some of these performance problems are due to impedance mismatch between COBOL and the target language programs, while others are due to inappropriate conversions into the target programming language. [^5]
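One mismatch of this kind, chosen here purely for illustration and not taken from the cited study, is numeric representation. COBOL programs typically do arithmetic on fixed-point decimal fields, while a naive translation might map those fields to binary floating point. The C++ sketch below, with hypothetical function names, shows how the two can diverge when the same amount is added repeatedly.

```cpp
#include <cstdint>

// Adds `amount` to a running total `times` times using binary floating point.
// Decimal fractions such as 0.10 have no exact binary representation,
// so small errors accumulate.
double repeated_sum_double(double amount, int times) {
    double total = 0.0;
    for (int i = 0; i < times; ++i) {
        total += amount;
    }
    return total;
}

// The same calculation on a whole number of cents (a scaled integer),
// which stays exact, as decimal fixed-point arithmetic does.
std::int64_t repeated_sum_cents(std::int64_t cents, int times) {
    std::int64_t total = 0;
    for (int i = 0; i < times; ++i) {
        total += cents;
    }
    return total;
}
```

Adding 10 cents ten times gives exactly 100 cents in the integer version, whereas adding 0.10 ten times in double precision does not compare equal to 1.0. A translator that emulates decimal arithmetic faithfully avoids the discrepancy, usually at some cost in speed, which is one place where correctness and performance have to be balanced.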

One example of a tool that automates COBOL conversion is GnuCOBOL C++, which translates COBOL programs to C++ code. The C++ program can then be compiled into object code (the actual code executed by the computer) or into a library that other programs can call (link to).

Under UNIX and similar operating systems (such as Linux), the GNU C++ compiler (part of GCC) is used. The two-step compilation is usually performed by a single command, but an option exists to stop compilation after the C++ code has been generated, at which point the C++ code can be analyzed for correctness and then optimized for functionality and performance.

Conclusion

There are a number of options for organizations to transform their legacy COBOL system to a modern system. The simplest and most cost effective way is to gradually replace important functions with improved functionality written in a modern programming language, such as C++, which optimizes and simplifies your system, while improving your technology and reducing costs.

Contact us today to learn more about converting your COBOL system to a modern system with improved functionality and performance.

[^1]: COBOL on Wikipedia
[^2]: Edsger W. Dijkstra
[^3]: Edsger Dijkstra's letter to the ACM: "How do we tell truths that might hurt?"
[^4]: How a load module is created: Application programming on z/OS
[^5]: Performance pitfalls in large-scale Java applications translated from COBOL