
By Patricia Bourrillon · Posted 27 Feb 2025
Software Modernisation to Boost Business Agility
The technology landscape is constantly changing, and as a result some applications don’t stand the test of time. To counter this problem, we can choose to extend, rebuild or re-architect existing applications.
Often, I have seen a full rebuild of the legacy system proposed as the solution. While this is an engineer’s dream, it is usually impractical and costly in both time and money.
One thing to note is that containers are no longer the bleeding edge of technology, and their value is well established. Containerisation can support the modernisation effort by bringing the underlying technology up to date.
In this article, I am going to explore the refactoring/re-architecting approach, which helps us transition to an evolutionary solution that can support both present and future needs.
To start with, let’s briefly look at the advantages and challenges of containers: the problems they solve and those they do not.
Not all apps are right for the cloud, and not all apps are suitable candidates for containerisation.
Without further ado, let’s look at what we need to evaluate in order to containerise existing applications.
Let’s imagine that the application you’d like to run in containers is a legacy monolithic application. This application has some background processes in the form of Linux cron jobs and it uses a single relational database as a back-end. The list below is by no means exhaustive, but it should give you enough to think about when planning your changes.
Firstly, there needs to be a strategy built on the needs of your applications; there are several possible ways to deploy a legacy application in a container.
Regardless of which method you choose, the key thing is to decide whether the application is a suitable candidate for containerisation at all.
When coming up with the plan, keep future evolution in mind as you look at the architecture, performance and security.
We can also look at the application stack diagram. If one doesn't exist, take the time to create it. The diagram helps you identify technologies and libraries shared amongst the applications you have selected for containerisation, and it will remain valuable in the future.
If you have numerous applications that share similar libraries, language version, web server and OS, then these can be combined into a base image that each application can use as a starting point.
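For instance, here is a minimal sketch of such a base image and an application building on it (the image names, packages and extensions are illustrative assumptions, not a prescription):

```dockerfile
# base/Dockerfile -- shared starting point for all the PHP applications
FROM php:7.2-apache
# Extensions common to every application (examples only)
RUN docker-php-ext-install pdo_mysql opcache
```

```dockerfile
# app/Dockerfile -- each application starts from the shared base
FROM registry.example.com/php-base:7.2
COPY . /var/www/html/
```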
Take the time to look through system configurations or constraints.
For example, I once received an error on a newly containerised web application when trying to upload a file larger than the limit allowed in the web-server configuration.
It helps to look at language and environment configurations.
Before you set to work on any refactoring, it is imperative to analyse your application’s and operating system’s dependencies. For example, you may need OS libraries such as libzip installed before your language can provide the corresponding functionality as part of its standard library.
If you don't have the luxury of package management files (e.g. composer.json or package.json), you can always get your hands dirty and 'grep' the codebase to find any includes or imports which may reveal third-party libraries in use.
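A rough starting point for a PHP codebase might look like this (the paths and patterns are assumptions):

```bash
# List include/require statements that may pull in third-party code
grep -rnE "require(_once)?|include(_once)?" --include='*.php' src/

# List imported namespaces, deduplicated
grep -rhE '^use [A-Za-z\\]' --include='*.php' src/ | sort -u
```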
Search for all the places where environment variables are used. I've seen them used to inject various pieces of information such as API keys, endpoints, paths or the application environment.
Credentials and API keys can be better moved to a secure secrets management solution that integrates with your chosen container runtime.
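As an interim step before adopting a full secrets manager, one sketch of injecting values at run time rather than baking them into the image (the file and image names are hypothetical):

```bash
# app.env holds KEY=value pairs; keep it out of version control and the image
docker run --rm --env-file ./app.env myapp:latest
```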
Not all applications are built, from the outset, with a suitable set of options for containers. In these cases, you will need to refactor parts of the application. Common things to look at are configuration, such as per-environment settings.
Also bear in mind that any classes that write data to local OS paths will need to be changed to use a more distributed storage solution.
Build in acceptance tests if there weren't any before. This will ensure that the application continues to function correctly as you start to introduce changes.
If you’d like to keep using the application in your current environment while you perform the refactoring, you may be able to leverage feature flags; these give you some flexibility and safety.
Imagine that you built your application on a Docker image you found somewhere because it had all the features you needed. Some time later, a security exploit in that image causes your application to stop working. Another example: your chosen base image is no longer available, and the next time you release your application, the build process breaks.
To avoid such a mess, a wise place to start is to use official language images where possible and the closest matching version of Linux (or other OS).
For example, if you have a PHP 7.2 application that runs on Ubuntu 21.04, it would be reasonable to use a Debian bullseye base image.
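As a minimal sketch, using an official language image might look like this (verify which tags are still available for your version):

```dockerfile
# Official PHP image with Apache, on a Debian base close to the original OS
FROM php:7.2-apache
COPY . /var/www/html/
```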
You’ve got some idea about the kinds of challenges you can face. However, let’s look at some additional factors that can become roadblocks if not well thought out in your planning.
It is imperative to evaluate where your application stores data, because containerisation breaks bindings to OS-specific paths. The common element across such scenarios is the need to figure out how we are going to architect data, which may lead to further tasks around migrating data to shared storage.
As SSL/TLS is used virtually everywhere, it is critical to figure this one out before you’re too far along in the process. It may not look like a big problem initially, especially while there is only one instance of a container running.
Consider a typical setup in a containerised environment where SSL/TLS terminates at the boundary (e.g. a load balancer) and the traffic then moves inside the trusted internal network unencrypted.
This now becomes a problem: because requests arrive at the application over plain HTTP, the application believes the connection is insecure, so secure cookies (which it should be setting) and any HTTPS-only logic will misbehave. The usual remedy is for the proxy to pass the original scheme along.
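Assuming an nginx reverse proxy at the boundary (the upstream name is hypothetical), a sketch of forwarding the original scheme looks like this:

```nginx
location / {
    proxy_pass http://app:8080;
    # Tell the application what the original protocol was
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Host $host;
}
```

The application (or its framework) then needs to be configured to trust this header when it comes from your proxy.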
Containers should do one thing very well. Cron jobs are very handy for the various tasks a typical application needs to perform in the background, and they too focus on one thing; however, overloading them into a container that is also a web server is not what should happen.
We should only execute one process per container; otherwise, a process that has forked into the background is not monitored and may stop without us knowing.
Let’s compare a few approaches.
This is not a highly scalable option; however, by utilising the host machine’s crontab you can create scheduled tasks. Since a cron job lets you execute arbitrary commands, you can invoke the container engine as you would on the command line, using the same image as your web application but overriding the ENTRYPOINT command.
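A sketch of such an entry in the host’s crontab (the image name and script path are hypothetical):

```
# Every 10 minutes, run the task in a fresh container and remove it afterwards
*/10 * * * * docker run --rm --entrypoint php myapp:latest /app/cron/task.php
```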
In the above example, your container will be recreated every 10 minutes and then removed after execution.
There are some pros and cons here. You won’t have to worry about restarting Docker containers, and you won’t have to manage long-lived logging, thus removing noise from your logs. However, it’s not recommended to rely on a host’s resources, as background tasks can be quite hefty (e.g. report generation). This may also not be an option if your target runtime is a serverless environment where you don’t have access to the underlying host.
Packaging the crontab files directly into your container image is ideal for simple services, as it gives us portability.
We can scale our application more easily this way since it is no longer dependent on the host machine. One caveat here is that scaling the number of containers can have serious implications if your background job does not deal with concurrent processing.
Furthermore, we deviate from the “do one thing well” principle.
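For instance, a sketch of baking cron in alongside the web server, assuming a Debian-based php:7.2-apache image (the file names are hypothetical):

```dockerfile
FROM myapp:latest
# Install cron and register the application's schedule
RUN apt-get update && apt-get install -y --no-install-recommends cron \
 && rm -rf /var/lib/apt/lists/*
COPY app-cron /etc/cron.d/app-cron
RUN chmod 0644 /etc/cron.d/app-cron
# Start cron in the background, then the web server in the foreground --
# exactly the "more than one process" trade-off described above
CMD cron && apache2-foreground
```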
For this approach, your application’s image should be the basis for both containers. Both will need access to the service’s volume and network; the only difference is the foreground process each container runs.
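A hypothetical docker-compose sketch of this setup (the service names, image and volume are assumptions):

```yaml
services:
  web:
    image: myapp:latest
    volumes:
      - app-data:/var/www/html/uploads
  cron-task:
    image: myapp:latest                  # same application image
    volumes:
      - app-data:/var/www/html/uploads   # shared volume
    entrypoint: ["php", "/app/cron/task.php"]
volumes:
  app-data:
```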
In this example, we use the same application image, but override the ENTRYPOINT command. At this point, task scheduling can be done in accordance with your target runtime.
This problem is twofold. On one hand, there is the ownership and permission of files created within running containers; on the other, file permissions that are too open or permissive when you’re building the container images in the first place.
File permissions and ownership get copied from the host machine’s file system while building images.
Therefore, be mindful when building them in environments that implement the least privilege principle on the filesystem. Most CI servers will have restricted permissions.
So when you COPY or ADD files in a Dockerfile, you can specify the ownership explicitly.
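The --chown flag is available on both instructions; the user and group names in this sketch are hypothetical:

```dockerfile
COPY --chown=appuser:appgroup src/ /var/www/html/
```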
Do not use the default root user as it can cause permission problems, and it also poses security risks.
The most effective way to prevent privilege-escalation attacks from within a container is to configure your container’s applications to run as unprivileged users.
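A minimal sketch of creating and switching to an unprivileged user in a Debian-based image (the names are assumptions):

```dockerfile
# Create a system user with no login shell, then drop privileges
RUN groupadd -r appgroup && useradd -r -g appgroup -s /usr/sbin/nologin appuser
USER appuser
```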
Extract any secrets, e.g. API keys, out of the codebase. Initially, you may not need a full secrets management solution; moving these to environment variables will allow you to get on with the job at hand, and they can later move into the secrets management solution offered by your target runtime.
One obsession I’ve observed is trying to keep container images as small as possible. For example, your current application image on Debian might come to 500MB, whereas the same application packaged with Alpine Linux is 50MB.
I feel that pursuing this obsession can be a pitfall, as not everyone on your team may be familiar with Alpine rather than Debian. Furthermore, there are often differences in which libraries are deemed essential for the functionality of your application. Having said all this, containers do allow you to experiment with these kinds of things in a much safer way. Minimising image size should not be the goal at the beginning, but it could be something you pursue once you’ve become comfortable over time.
While easy horizontal scaling is a great benefit of containerisation, it also brings potential challenges. For example, data consistency and task coordination can be difficult across multiple instances; as a result, costs may be higher, and more code may be required to make that coordination happen. Furthermore, underlying servers may still hit hardware limits if host machines are not specced accordingly.
The preferred practice for containers is to write logs to the standard output stream, which will likely differ from writing them to a specific file in the OS. This enables a combined view across services, although you'll want to differentiate logs from different services. As you scale, a centralised logging solution will be needed to aggregate the various logs, which may in turn require refactoring your logging driver’s configuration.
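If an application or its web server insists on writing to files, one common workaround (used by the official nginx image) is to symlink those paths to the container’s standard streams; the paths here are examples:

```dockerfile
RUN ln -sf /dev/stdout /var/log/app/access.log \
 && ln -sf /dev/stderr /var/log/app/error.log
```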
Before even deploying our container, we should be completely aware of what is occurring within it. By following recommended practices, most security risks can be reduced. The list of best practices is lengthy, and several of the most important ones have already come up in this article: unprivileged users, careful file permissions, official base images and proper secrets management.
Aside from the above, you should also consider using a good image scanning tool which can notify you of any security vulnerabilities as they are discovered.
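For example, an open-source scanner such as Trivy can be run locally or in CI (the image name is hypothetical):

```bash
# Report known vulnerabilities in the built image before pushing it
trivy image myapp:latest
```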
To summarise, this article has looked at the refactoring/re-architecting approach to containers, irrespective of the technology chosen. We also took a closer look at both the advantages and difficulties of containerisation, given that containers are ephemeral and stateless.
The list of actions you can take to plan the migration and enhance your stack could be far more extensive and detailed. Hopefully, this article has helped you recognise some of the main issues you can face and provided some helpful advice.
Additionally, you might want to assess the various runtimes you have at your disposal, such as Docker Swarm, Kubernetes, AWS Fargate, etc. Each of these runtimes comes with its own set of challenges.
If you want to know more about refactoring, why not read James Mason's blog Refactor before you rewrite.