The Perils of Duplication

A few weeks ago, the stream that I was working with was tasked with creating two new C# solutions for upcoming work. The task involved creating the two solutions and then deploying them through demo and live, ready for the projects to be worked on. I created the first solution while my colleague worked on speeding up our internal builds. As I had never written a deploy script before, I knew that this task would be a great learning experience for me. There was no documentation outlining how to create a new solution within Codeweavers, so I decided to invest time in writing it whilst creating the solution. The first solution required research, as I had no idea how much work was involved in getting a solution up and running. I decided to create a single solution to start with, along with the documentation, and then follow that documentation when producing the second solution. This would allow me to fine-tune anything that was unclear and add any little steps that I had missed.

Whilst creating the first solution, I found that in a large number of places I was simply copying and pasting existing code into a new file for my project, especially when it came to our build scripts. Rather than being able to add the new solution name to a list and let the build scripts take care of everything else internally, I was changing the solution name multiple times within each build script. It was clear to me that this duplication was unnecessary and that eradicating it would make producing future build scripts much simpler. This is the next waste task that I will take on when the chance arises.

Copying and pasting a file and changing the solution name in ten or so places is not a big issue and does not take long, especially if you are only creating a single solution. Because I was doing this task twice, I tried to think about what would have to be duplicated again when the second solution was created. The fact that these files are all duplicated means that if we ever want to change something across all of the build scripts, we have to make the same change in every one of them, resulting in around 25 changes in 25 separate files. As the content of these files is duplicated, it makes sense to have that content encapsulated in a single place. Once it is, any future change to the build scripts means changing only a single file.
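To make the idea concrete, here is a minimal sketch of what "add the solution name to a list" could look like. It is written in Python purely for illustration and is not how our build scripts are actually written; the solution names and the step strings are hypothetical placeholders, not real build commands.

```python
# Hypothetical solution names. In practice this list would be the only
# thing that changes when a new solution is created.
SOLUTIONS = ["SolutionA", "SolutionB"]


def build_steps(solution_name):
    """Return the build steps for one solution.

    The step strings are illustrative placeholders, not real commands.
    They are defined once here instead of being copied into every script.
    """
    return [
        f"restore {solution_name}.sln",
        f"compile {solution_name}.sln",
        f"test {solution_name}.Tests",
        f"package {solution_name}",
    ]


def all_build_steps(solutions):
    """Map each solution name to its steps using the single shared definition."""
    return {name: build_steps(name) for name in solutions}


if __name__ == "__main__":
    for name, steps in all_build_steps(SOLUTIONS).items():
        print(name)
        for step in steps:
            print("  " + step)
```

With something along these lines, changing a step for every solution means editing build_steps once, and adding a solution means adding one name to the list.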

We are very careful at Codeweavers not to duplicate anything within our core C# code base, as duplication is poor code design and can lead to major problems. However, we have been lax in the past when it comes to duplication in other code that we write, such as the build scripts and stored procedures. This duplication is something that we now want to remove. The deploy scripts and stored procedures were written a long time ago when we did not know any better, but now we do!

Along with the deploy scripts, I also came across a bunch of files that were required for deploying the solution to our demo and live servers. This time the duplication was extreme: there were around 80 duplicated files, each containing the same information for MS Deploy. The files that we use for MS Deploy are identical and are not application dependent, so why have we got 80 different files containing the same information? Moving the information into a single place has a clear benefit: from now on, if we need to change anything for MS Deploy, we only have to change a single file rather than 80.

My goal was to produce documentation that could be used by myself and the rest of the team, and to reduce the number of manual steps involved in creating a new solution to as few as possible. The only way to do this is by reducing duplication. When I had completed the documentation, there were eleven manual steps that had to be taken to get the solution from a local machine through to the demo and live servers. While producing the second solution I worked on cutting that number down. The process is now down to nine manual steps, and I will continue to look into reducing this number further when opportunities arise.

Now don’t get me wrong, duplication can be useful. The work that I am currently doing is creating a new web service for a new client of ours, and this web service is an exact copy of one used by an existing client. We are having to duplicate the web service because the existing client's web service was written around seven years ago, when we did not know better than to hard code the web service to that client. The team made the decision to duplicate it so that the existing web service could continue untouched while we created a generic web service for the new client. Once the new web service has been pushed out to live, we will remove this duplication by moving both clients onto a single web service that could be used by any number of clients. It is important that when duplication like this is added, it is removed as soon as possible, otherwise it will be forgotten and will cause problems later on.
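As a rough illustration of the difference between a hard-coded web service and a generic one, here is a small Python sketch. Everything in it is an assumption made up for the example: the client identifiers, the ClientSettings fields and the handle_request function are hypothetical and do not describe our real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientSettings:
    """Values that would otherwise be hard coded per client (hypothetical)."""
    name: str
    reference_prefix: str


# Hypothetical registry: supporting another client means adding an entry
# here rather than copying the whole web service.
CLIENTS = {
    "existing-client": ClientSettings(name="Existing Client", reference_prefix="EX"),
    "new-client": ClientSettings(name="New Client", reference_prefix="NEW"),
}


def handle_request(client_id, payload, clients=CLIENTS):
    """Handle a request for any registered client with one code path."""
    settings = clients.get(client_id)
    if settings is None:
        raise KeyError(f"Unknown client: {client_id}")
    return {
        "client": settings.name,
        "reference": f"{settings.reference_prefix}-{payload['id']}",
    }


if __name__ == "__main__":
    print(handle_request("existing-client", {"id": 7}))
    print(handle_request("new-client", {"id": 7}))
```

The point of the sketch is only that the per-client differences live in data, so the logic exists once however many clients there are.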

Duplication in your code will come back to haunt you, and it will cost you and your business time and money. Just remember that the longer duplication is left, the harder it becomes to remove, because more and more code gets built on top of it. Pull it out sooner rather than later or you will regret it.

If you enjoyed this post, follow me on Twitter (@agilemooney) for news of future posts.
