Software Anti-Patterns that Impact DevOps
Sometimes it’s hard to be sure that the culture is fully DevOps. After all, this is a path of continuous improvement, and always striving to become better is part of the DNA. There’s also a lot of nuance and opinion on what DevOps means, and a lot of ways to get there.
So, I was thinking: what about an alternative symptom-based checklist to measure how far you are on the journey? I came up with some ideas to use as a gauge. Essentially… How DevOpsey are you?
By DevOpsey, of course, I mean all of the things in the DevOps Loop (plan, create, verify, package, release, configure, monitor). But it also applies to things like the DevOps Superpattern, and probably a bunch of other things too.
I kept adding stuff to the article until I realized that it was getting too big and that I needed to split it up. This third part addresses some of the anti-pattern symptoms of DevOps you might come across when developing software.
Let me know what you think.
You have layers of application code that developers are afraid to touch
A project might start out small. With a sense of urgency and drive, corners may be cut in order to just get something out there. I think that’s perfectly normal.
The issue comes when those behaviors become ingrained and developers are not empowered or encouraged by their management to test and refactor in order to keep code clean and malleable.
Whether the cause is business drivers or culture, it’s essential to avoid the place where developers are scared to the point that they believe their only recourse is to add additional layers of code or abstraction because it’s “safer” and “faster” than touching that old critical code.
You practice resume driven development
So there’s Behavior Driven Development (BDD), which is about how to deliver incremental value through software based upon business requirements. BDD is an evolution of Test Driven Development (TDD). In TDD, when implementing a particular piece of functionality, we first write tests, and then continue to develop code until the tests pass. These practices are great, and are to be encouraged.
Then there’s Resume Driven Development (RDD). RDD is where a development team finds a really cool piece of technology and looks for an excuse to use it.
Whereas this is perhaps fine for green-field development, or where a new unique need is identified, it can also be dangerous.
Case in point: at one company, an architect who was new to AWS designed a system that employed Simple Email Service, Simple Notification Service, Simple Queue Service, Simple Storage Service, an Aurora Database, DynamoDB, and a whole host of other technologies when something far more trivial would have sufficed.
A similar example echoes one that was covered in the Operations article: even though there is an existing library, framework, or service out there that will already kinda/mostly do the job, it’s easy to get tempted into writing our own version because we think we can do it better. I mean… it’s probably more fun, and it’s easy to fall into the trap of cognitive biases and kid ourselves that it won’t be that hard.
The issue with these approaches is that they can quickly drive rapid growth in both the complexity of the systems and the learning required to onboard new developers. Higher complexity will reduce maintainability.
Keeping things simple and coherent might not seem as much fun in the short-term, but it will put you in a stronger position for the long-haul.
You don’t have adequate automated testing
You should be able to know at any point in time whether your product is working or not. You should have high confidence that a pending change is sound.
There are multiple approaches to testing and although I’ll not cover all of them here, there are a few which are really important for fast feedback that I want to mention:
Unit testing is where tests are written to verify the functionality of small bits of software, and these tests are typically executed as part of the software build.
Unit tests are great guides when using approaches like Test Driven Development, but are also crucial in verifying that code refactoring didn’t introduce defects. We might measure unit tests in terms of quantity, or in terms of how much of the code is actually tested.
Integration tests are used to improve confidence that a software component still plays nicely with other software components after a change is introduced.
Static analysis is where you use software to grade software, and you should care about it since it will give you a lot of insights into the health of your code and whether things are getting better or worse.
In all cases, the saying “perfect is the enemy of the good” applies, and I’ve used the word “adequate” deliberately: there is a point of diminishing returns where doing more testing won’t further improve confidence.
Depending on your preference, experience, and use case, you might favor one type of test over another. You might determine a certain degree of testing is sufficient. This doesn’t happen overnight, but automated testing is a critical part of how software is delivered.
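To make the unit testing idea above a little more concrete, here is a minimal sketch using Python’s built-in unittest module. The apply_discount function is hypothetical and exists only for illustration; the point is that each test pins down one small behavior, so a later refactor that breaks it fails fast.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)
```

Saved in a test file, something like python -m unittest would discover and run these as part of the build.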
Changes break downstream services
If a team wastes time debugging a problem that later turns out to be due to another team making a breaking change, then that’s going to be both frustrating and counter-productive.
Whether it’s done through implementing versioned APIs, stage regression testing, ensuring backwards compatibility, or being ruthless about testing changes, it’s essential for teams to be aware of their responsibilities, and that other teams are relying on them.
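As a rough sketch of the versioned API option, assuming a hypothetical user service (the function names and payload fields here are made up for illustration), the new response shape is published alongside the old one rather than replacing it, so existing consumers keep working until they choose to move:

```python
from typing import Callable, Dict


def get_user_v1(user: dict) -> dict:
    """Original response shape; existing consumers depend on 'name'."""
    return {"id": user["id"], "name": f'{user["first"]} {user["last"]}'}


def get_user_v2(user: dict) -> dict:
    """New shape with split name fields, published alongside v1."""
    return {
        "id": user["id"],
        "first_name": user["first"],
        "last_name": user["last"],
    }


# Hypothetical registry mapping a requested API version to its handler.
HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "v1": get_user_v1,
    "v2": get_user_v2,
}


def handle(version: str, user: dict) -> dict:
    """Dispatch on the requested version; unknown versions fail loudly."""
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported API version: {version}")
    return handler(user)
```

The design choice is simply that a breaking change becomes an addition: v1 is only retired once nobody is calling it.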
Your code short-cuts good implementation with Feature Envy
So there’s this piece of code that another team maintains. It’s a pretty decent piece of code and it’s been used for a while on your project as well as several others.
Now suppose that you need to extend the functionality of that piece of code in order to deliver your next piece of work. There are several common approaches; here are four that I’ve listed in order of preference:
- Send the other team a pull request after you’ve updated the code for them. This approach helps stay true to the original intent of the design, and is likely quick since the other team won’t need to expend any effort on implementation. It becomes even easier if the team uses good testing practices;
- Ask the other team to add that functionality for you. Perhaps the organization has silos, and communication between teams is low. It’s not the greatest place to be, and you may have to wait until the other team can schedule time to do the work. On the plus side, the eventual outcome will be architecturally sound;
- Study the implementation of the other piece of code and figure out how you can access its internal workings while actually implementing the functionality in your code. With things like organizational silos, tribe mentality, and introversion, choosing one of the higher options can be scary! The easy and low-effort thing is to just figure out how to bend the inner workings of the other code to your will. Unfortunately, there is a huge problem with this approach: if the other team modifies their implementation, then your code will break;
- Build an abstraction layer on top of the other team’s code so that you can get your job done without having to bother the other team. Sure… it’ll work, but if you needed this change, then perhaps other people did too? It’d be awful if multiple teams went off and wasted time building their own abstractions for something that only had to be done once, and think of how much duplicate code now needs to be maintained across the organization!
The whole point of objects is that they are a technique to package data with the processes used on that data. A classic smell is a method that seems more interested in a class other than the one it actually is in. The most common focus of the envy is the data. We’ve lost count of the times we’ve seen a method that invokes half-a-dozen getting methods on another object to calculate some value.
— Refactoring: Improving the Design of Existing Code, Martin Fowler
This is one of those “what’s the right thing to do?” questions, and oftentimes the easiest answer isn’t the right answer, especially when the pressure is on to deliver.
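The smell that Fowler describes can be sketched in a few lines. The Account class and both functions below are hypothetical, invented purely for illustration: the first function reaches into another object’s data to compute something, while the method keeps the rule next to the data it depends on.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: float
    overdraft_limit: float
    frozen: bool = False

    def can_withdraw(self, amount: float) -> bool:
        """Preferred: the rule lives with the data it depends on."""
        return not self.frozen and amount <= self.balance + self.overdraft_limit


def envious_can_withdraw(account: Account, amount: float) -> bool:
    """Feature envy: caller-side code re-derives Account's rule from its fields.

    If the owning team changes how limits work, this copy silently goes stale.
    """
    return not account.frozen and amount <= account.balance + account.overdraft_limit
```

Both return the same answer today, which is exactly why the envious version is tempting; the difference only shows up when the owning team changes the implementation.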
Your databases contain business logic
I’m thinking about two things in particular, but there may be others:
Stored Procedures are pieces of code that execute inside the database itself under certain circumstances, potentially modifying data.
Triggers are a clever mechanism that enables the database to say “when this thing happens in the database, go do this other thing”.
Whereas perhaps there is a case for using triggers as part of an approach to auditing, I think both of these features should generally be avoided. I say this for two reasons:
- Although most of the database-related code we write as developers is abstracted in a platform independent way, stored procedures and triggers are highly vendor specific, and you’ll have your work cut out for you if you ever want to change which database platform you use;
- It’s great when everything works, but when strange behaviors start to creep in, it can be incredibly hard to track down when a trigger or stored procedure is misbehaving and causing unexpected behaviors.
Not to mention that deploying, testing, and tracking this type of change to databases can be hard.
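To illustrate how a trigger can cause surprising behavior, here is a small sketch using Python’s built-in sqlite3 module. The orders table, the trigger name, and the handling-fee rule are all made up for this example. Nothing in the application code mentions a fee, yet the value read back differs from the value written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL);

    -- Business rule hidden in the database: large orders get a handling fee.
    CREATE TRIGGER add_handling_fee
    AFTER INSERT ON orders
    WHEN NEW.total_cents > 100000
    BEGIN
        UPDATE orders SET total_cents = NEW.total_cents + 500 WHERE id = NEW.id;
    END;
    """
)

# The application only ever writes 200000 cents ...
conn.execute("INSERT INTO orders (id, total_cents) VALUES (1, 200000)")

# ... yet the value read back has been changed by the trigger.
stored_total = conn.execute(
    "SELECT total_cents FROM orders WHERE id = 1"
).fetchone()[0]
print(stored_total)  # prints 200500
```

A developer reading only the application repo would have no way of knowing where the extra 500 came from.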
You have “pinned” library or language versions
Libraries are foundational to the way in which we write software today, and the open source movement has revolutionized our ability to build bigger and better software by providing simple and standard ways of performing common operations.
These libraries are “other” pieces of software that are developed by companies or teams of volunteers, and we generally favor libraries which are actively developed or maintained.
The downside to this is that these libraries (especially those that are actively maintained) can be a “moving target” and as we write code that relies on them, we find that our software breaks if we do not also track and stay compatible with the changes that those libraries undergo.
Oftentimes the changes aren’t huge, but sometimes we get busy and “pin” our code to a specific historical version of a library because we don’t have enough time right now to make our code work with the current version.
This is when you know you’ve just joined the on-ramp to the road to hell.
Once pinned, it’s easy to forget. Once forgotten, you’ll miss out on new features and will not benefit from any bug or security fixes. Over time the rot sets in and it becomes increasingly hard to catch up.
But there’s something even worse than that. At one company, they decided to pin to a specific version of a programming language. This meant that they also had to pin to all of the old library versions that went with that version. When a new critical piece of functionality came up, they had to backport a library from the current version of the language over to the old version. Not only is the effort required to do this huge, it also means that it becomes even harder to catch up with current technology.
The concern here is that as the code becomes more complex, the risk of breaking something with a change compounds, and feature velocity will suffer drastically.
Wouldn’t it have been better to regularly refactor and have the freedom that comes with staying up to date?
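One lightweight way to stop pins from being forgotten is to make them visible. Here is a minimal sketch, assuming a pip-style requirements file with lines such as name==version. The find_pins helper and the example file contents are hypothetical, and the pattern deliberately ignores the more exotic syntax that real requirements files allow.

```python
import re
from typing import List, Tuple

# Matches an exact pin such as "requests==2.19.1"; extras, markers, and
# other specifiers are deliberately out of scope for this sketch.
PIN_PATTERN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def find_pins(requirements_text: str) -> List[Tuple[str, str]]:
    """Return (package, version) for every exactly pinned requirement."""
    pins = []
    for line in requirements_text.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


example = """\
# hypothetical requirements file
requests==2.19.1
flask>=2.0
numpy == 1.15.0  # pinned "temporarily"
"""

for package, version in find_pins(example):
    print(f"{package} is pinned to {version}")
```

Run regularly (for example in the build), a report like this at least turns a forgotten pin into a recurring reminder.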
You go through “integration hell” at the end of a sprint or release cycle
I worked with one company that sold appliances that ran embedded software. The software was fairly monolithic, they used a waterfall approach to development, and — classically — they had six-month release cycles.
More interestingly for this story, they had a complex branching strategy that was sort-of like a Subversion/Gitflow superbaby on steroids:
A major release was branched from mainline; new minor releases were branched from major as a develop branch; feature branches were branched from develop. At the end of a release, develop was renamed to the appropriate version, and changes merged down to the major release branch. Subsequently, changes on the major release branch were merged back to mainline in order to keep mainline synced with the latest version of the code.
This is all horribly complex. Not only is there a risk of merge conflicts where humans have to step in and decide how the code should look (sometimes without context of the intent of changes), but problems were compounded as features and bug fixes were backported to previous versions.
There were a couple of times where we inadvertently introduced fairly major regressions into shipped code.
Ultimately, the more complex, over-engineered, and “cleverer” your branching strategy, the more time and effort it will take to actually deliver code, and the more opportunity there is to introduce defects.
Simply from the perspective of reducing complexity, the simpler the branching strategy and the faster changes can be merged in, the better, even though it’s not always possible (or appropriate) to make all commits directly to master.
You store passwords or run-time configuration in code
Whereas it’s vital that these things are tracked through some form of version and change control, embedding this detail directly inside the application repo is bad for several reasons.
Changing configuration that’s stored as code requires a new artifact to be generated. New artifacts typically need to go through significantly more build and validation effort than configuration.
Conversely, dynamic configuration (where a service reads updated configuration information frequently) can typically be deployed or rolled back in seconds. Even though it may seem easier, never opt for static configuration which requires a restart or redeploy: you’ll regret it at just the wrong time.
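A minimal sketch of the dynamic approach, assuming configuration lives in a JSON file outside the application artifact. The ReloadingConfig class and the timeout_seconds key are hypothetical, and a real system might read from a configuration service instead of a file. The service checks the file’s modification time on each lookup and re-reads it only when it has changed, so no restart or redeploy is needed.

```python
import json
import os
from typing import Any, Dict


class ReloadingConfig:
    """Re-reads a JSON file whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._mtime_ns = -1  # forces a load on first access
        self._values: Dict[str, Any] = {}

    def get(self, key: str, default: Any = None) -> Any:
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as handle:
                self._values = json.load(handle)
            self._mtime_ns = mtime_ns
        return self._values.get(key, default)
```

Usage would look something like config = ReloadingConfig("settings.json") followed by config.get("timeout_seconds", 10) wherever the value is needed; rolling back is just a matter of restoring the previous file.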
Passwords are particularly terrible: should the code leak out into the world, potential bad actors will have access to systems and could inject bad data. Moving passwords to a configuration service is a step in the right direction, but better still to treat secrets in a special way and request them from a third party only when required, perhaps even using short-duration tokens.
In the event that you must store passwords, please make sure that you use service accounts rather than an individual’s login: I’ve seen times where a key member of a team has left, their account has been locked, and suddenly critical systems begin to fail.
Better yet, let the infrastructure or fabric take care of authentication and authorization for you.
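As a small first step away from passwords in code, here is a sketch that reads a secret at run time from the process environment. This assumes the deployment platform or a secrets manager injects the value; the read_secret helper, the MissingSecretError class, and the variable name in the usage note are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not supplied at run time."""


def read_secret(name: str) -> str:
    """Fetch a secret injected by the environment, never from the repo."""
    value = os.environ.get(name)
    if not value:
        # Fail fast rather than falling back to a default baked into code.
        raise MissingSecretError(f"secret {name!r} is not set")
    return value
```

A caller would do something like db_password = read_secret("DB_PASSWORD") at the point of use. Failing fast when the secret is missing is deliberate: a hard-coded fallback would quietly reintroduce the problem.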
Is this all really part of a DevOps conversation though? I truly believe it is: consider DevOps to be about delivering business value faster, and a lot suddenly comes into scope.
There is a near-infinite number of more technical, software-related anti-patterns that I could have talked about here (such as persisting session state), but there isn’t room for this post to be exhaustive, and perhaps some are off-topic. I hope that you’ll forgive me.
The fourth and final part of this series will talk about the organization. It’s a favorite area of mine, and whereas bad process and software practices will inhibit velocity, good organization will accelerate it.
p.s. Thanks to Jan Baez and Chuck Cullison for review, suggestions, and feedback.
p.p.s. In case you missed it, this article follows the one on Process.