
4 Tips to Avoid Downtime Risks and Start the New Year Strong

Prioritize operational readiness, increase test coverage and use this slower time to optimize your systems to avoid that notorious holiday slump.
Dec 19th, 2022 10:57am by

The end-of-year holiday season is particularly challenging for software companies — demand is often at its peak, while at the same time your staff members want to be home with their families.

But what if you could give your team some time off to relax with their loved ones over the holidays while still maintaining productivity and avoiding downtime?

You can avoid downtime risks if you:

  • Prioritize operational readiness
  • Write higher quality tests
  • Strengthen your security
  • Are consistent

Here’s how I suggest doing so:

Prioritize Operational Readiness

In CircleCI’s 2022 State of Software Delivery report, we found that the productivity hit many engineering teams take every year around the holidays is worse than the one they took at the onset of the global pandemic. Productivity drops across the board more sharply at this time each year than it did at the start of COVID-19.

But we’ve also seen that the teams most adept at continuous integration are not slowed down by the challenges of having fewer engineers online. Teams that are good at CI are operationally ready to handle a lower headcount because they are more inclined to have robust testing in place, and they’re able to validate their software with confidence.

Being operationally ready also means that in addition to understanding the system and how to operate it, engineers understand how what they’re doing affects the business. Because software today is built on so many dependencies and microservices, answering the question, “Is our application working?” has changed. Some components of your application might be failing while others are working just fine.

Make sure everyone in your engineering organization has the same understanding of what it means to have your platform working and available to customers. If everyone is on the same page about what constitutes a critical failure, you can implement a mitigation strategy ahead of time and better ensure your platform’s reliability during this slowdown period.
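One way to get everyone on the same page is to write the agreed definition down as code. The sketch below is only an illustration: the component names and the `CRITICAL_COMPONENTS` set are hypothetical placeholders, and your team would substitute whatever it has decided customers cannot do without.

```python
from enum import Enum


class Status(Enum):
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    CRITICAL = "critical"


# Hypothetical: components the team has agreed are essential to customers.
CRITICAL_COMPONENTS = {"checkout", "login"}


def platform_status(component_health: dict[str, bool]) -> Status:
    """Classify overall platform status from per-component health flags."""
    failing = {name for name, healthy in component_health.items() if not healthy}
    if failing & CRITICAL_COMPONENTS:
        # At least one essential component is down: treat as a critical failure.
        return Status.CRITICAL
    if failing:
        # Something is failing, but customers can still do the essentials.
        return Status.DEGRADED
    return Status.OPERATIONAL
```

With these assumed names, `platform_status({"checkout": True, "login": True, "recommendations": False})` returns `Status.DEGRADED`, while a failing `checkout` returns `Status.CRITICAL`.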

Write Higher-Quality Tests that Center the User Experience

Recovery time is affected more by the end-of-year holidays than by any other time of the year. This indicates that annual cycles in the market and the predictable rhythm of the seasons have more powerful effects on engineering productivity than global phenomena like the pandemic.

Recovery time increases across the board toward the end of the year, putting small developer teams at a particular disadvantage. Having fewer team members available to debug and fix problems results in longer recovery times. It’s no wonder that many teams decide to freeze code during this time.

But it is still possible to maintain high productivity by expanding your test coverage. You should be conducting tests from the user point of view, as well as the technology point of view. Test the ways in which different components of your application work together and iterate until you find the best configuration.

The truth is that writing tests and maintaining them over time is really difficult, which is exactly why many engineers find themselves only testing the easy things. But if you take time to replicate a user’s behavior, it will exercise the system in a way that will help you really understand the user and what to expect from them. This is one of the best strategies you can implement to prevent failures.
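As a rough illustration of a test that replicates a user’s behavior, the sketch below walks one customer through a whole purchase rather than checking a single function in isolation. The `Store` class is a hypothetical in-memory stand-in for a real application, and every name in it is an assumption made for the example.

```python
class Store:
    """Tiny in-memory stand-in for a real application (hypothetical)."""

    def __init__(self):
        self.users = {}
        self.carts = {}
        self.orders = []

    def sign_up(self, email, password):
        if email in self.users:
            raise ValueError("account already exists")
        self.users[email] = password
        self.carts[email] = []

    def add_to_cart(self, email, item, price):
        self.carts[email].append((item, price))

    def check_out(self, email):
        cart = self.carts[email]
        if not cart:
            raise ValueError("cart is empty")
        total = sum(price for _, price in cart)
        self.orders.append((email, total))
        self.carts[email] = []  # a completed order empties the cart
        return total


def test_new_user_can_buy_two_items():
    """Follow the full journey the way a customer would."""
    store = Store()
    store.sign_up("pat@example.com", "s3cret!")
    store.add_to_cart("pat@example.com", "mug", 12)
    store.add_to_cart("pat@example.com", "shirt", 20)
    assert store.check_out("pat@example.com") == 32
    assert store.carts["pat@example.com"] == []
    assert store.orders == [("pat@example.com", 32)]
```

A test shaped like this exercises sign-up, cart and checkout together, so a failure in how those pieces interact shows up even when each piece passes its own unit tests.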

Know that increasing your test coverage will lead to longer workflow durations, but it will also empower small dev teams to continue work while their colleagues are offline, reducing the amount of downtime when errors do occur.

Strengthen Your Security

This end-of-year period is also when attackers are most likely to take advantage of perceived weaknesses and attempt entry into your systems. There are phishing attacks, supply chain issues and poor configuration, among other risks that attackers can use to their advantage.

In the past year alone, modern software supply chain attacks increased by 650%. The most common type of attack is “dependency confusion” — when an automated software development tool is updating the dependencies it relies on and the software installer is duped into downloading a malicious package from a public repository. Dependencies allow engineers to move fast, but few people stop to think about whether their dependencies are secure.

Performing automated security scans can help safeguard against this because it means that no single developer is responsible for protecting the entire system. This slower period during the holidays is a good time to apply security patches and upgrades, since these are lower-risk changes than some of the new feature sets released at other times of the year. It’s also wise to make sure no one on your team is accessing applications using a default username and password. Default logins are commonly used by attackers to gain system entry.
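To make the idea of an automated scan more concrete, here is a minimal sketch of one possible check for the dependency confusion risk described above. It assumes you can already produce a mapping from each installed package to the index it was downloaded from; how you obtain that mapping depends on your package manager and is not shown. The package names and the private index URL are hypothetical.

```python
# Hypothetical: packages that should only ever come from the private index.
INTERNAL_PACKAGES = {"acme-billing", "acme-auth"}
# Hypothetical private index URL.
PRIVATE_INDEX = "https://pypi.internal.example/simple"


def find_confused_dependencies(resolved: dict[str, str]) -> list[str]:
    """Return internal packages that were not resolved from the private index.

    `resolved` maps package name -> URL of the index it was downloaded from.
    """
    return sorted(
        name
        for name, index_url in resolved.items()
        if name in INTERNAL_PACKAGES and index_url != PRIVATE_INDEX
    )
```

Run in a pipeline, a non-empty result could fail the build, so the problem is caught by the system rather than by whichever engineer happens to be online.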

This is also a great time to get your team up to date on their security training, like how to spot phishing scams. Phishing attacks have become incredibly sophisticated, making it increasingly important for everyone in your organization to understand how to identify and prevent them.

Consistency Is King

Q4 is typically a big time for architecture work at software companies. This slower period gives your business more opportunities to evaluate what’s working and what’s not. You should look for divergences across your systems that need to be addressed and similarities that are working well. Consistency is king.

If your engineering organization has a platform team or organization, they can perform this type of auditing before everyone is out for the holidays to help safeguard your system. Platform teams take a macro-view — they’re looking to provide large-scale efficiency rather than micro-optimizations, which will have benefits for the entire company year-round.
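As a sketch of what this kind of audit could look like in practice, the function below compares settings across services and reports the ones that diverge. The service names and settings in the usage note are hypothetical, and a real audit would load them from wherever your configuration actually lives.

```python
def find_divergent_settings(configs: dict[str, dict[str, object]]) -> dict[str, dict[str, object]]:
    """Return settings whose values differ between services.

    `configs` maps service name -> that service's settings. A setting that
    a service does not define is reported as None for that service.
    """
    all_keys = set()
    for settings in configs.values():
        all_keys.update(settings)

    divergent = {}
    for key in sorted(all_keys):
        values = {service: settings.get(key) for service, settings in configs.items()}
        first = next(iter(values.values()))
        # Keep the setting only if at least one service disagrees with the first.
        if any(value != first for value in values.values()):
            divergent[key] = values
    return divergent
```

For example, given `{"api": {"python": "3.11", "timeout": 30}, "worker": {"python": "3.9", "timeout": 30}}`, the function returns `{"python": {"api": "3.11", "worker": "3.9"}}`, flagging the runtime mismatch and leaving the matching timeout out of the report.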

By prioritizing operational readiness, increasing your test coverage and using this slower time period to optimize your systems, your engineering organization will be better prepared to maintain productivity and avoid the notorious holiday slump this year.

For more on engineering productivity, check out CircleCI’s 2022 State of Software Delivery report.

TNS owner Insight Partners is an investor in: Pragma.