
Keeping your cool with IT risk management

  • Writer: Steve Thorlby-Coy
  • Aug 3, 2022

Updated: Sep 10, 2022

In July, the UK recorded its hottest-ever temperatures. On the hottest day, Google and Oracle suffered outages due to cooling system failures in their UK data centres (as reported by the BBC). Since May, energy prices have spiralled to the point where governments are intervening to try to help people and businesses. The war in Ukraine, COVID, cyber attacks: the threats are endless. As a leader, I think about risks a lot, particularly IT-related risks. There's always something happening that could impact normal operations or throw your change projects off track. Here are my tips for keeping your cool and not wasting too much energy on IT risk management.


Remember PC RAT


The acronym that I use when thinking about risk management is PC RAT. It puts a picture in my mind of a furry little rodent dressed in a police officer's uniform, complete with helmet and truncheon. PC RAT polices my risk management plans! Sadly, it may never get the Disney series it so clearly deserves! (Random aside: Dutch police have trained rats to help with crime! https://www.wired.co.uk/article/dutch-recruit-rats)


PC RAT stands for the five main responses that can be made to any risk:

  • Prevent

  • Contingency

  • Reduce

  • Accept

  • Transfer
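
If your risk register lives in a spreadsheet or a script, the five responses can be recorded against each risk. The snippet below is only an illustrative sketch: the `Risk` class, the `without_response` helper and the example entries are hypothetical names and data made up for this example, not part of any standard tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Response(Enum):
    """The five PC RAT responses to a risk."""
    PREVENT = "Prevent"
    CONTINGENCY = "Contingency"
    REDUCE = "Reduce"
    ACCEPT = "Accept"
    TRANSFER = "Transfer"


@dataclass
class Risk:
    """One hypothetical risk register entry."""
    description: str
    responses: List[Response] = field(default_factory=list)


def without_response(register: List[Risk]) -> List[str]:
    """Return descriptions of risks that have no PC RAT response recorded."""
    return [risk.description for risk in register if not risk.responses]


# Example entries (assumed for illustration only)
register = [
    Risk("Data centre cooling failure during a heatwave",
         [Response.REDUCE, Response.CONTINGENCY]),
    Risk("Rising energy prices"),
]

print(without_response(register))  # ['Rising energy prices']
```

Listing the risks that have no response against them is a quick way to spot the ones nobody has actively thought about yet.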


Keep a lookout


Before the heatwave, my colleagues and I discussed the potential impact and any responses we might need to take. We decided to close our office so that colleagues didn’t have to commute, since travel disruption was expected. We advised people to take whatever measures they needed to stay safe and sane, such as working at different times or in different locations.


To be honest, I only briefly thought about the potential risks to our IT from the heatwave. Our servers are in temperature-controlled, fully managed data centres. The kit we tend to buy is the same as is available in countries which see higher temperatures more often. But, in light of an organisation like Google deciding to shut systems down, perhaps I should have asked, or have been asked, a few more questions. Because the risk you don’t think about or plan for is going to be the one that catches you out - right, COVID? But how often do you really review your risk register and consider new or emerging threats such as climate change? I have a monthly reminder in my diary. When was the last time the IT risk register was properly scrutinised, and not only by the IT team? Put it on the agenda formally at cross-organisation management meetings, or at least bring it up in your 'IT Business Partnering' catch-ups with managers of other teams.


Do nothing ... sometimes


I have no doubt that Google and Oracle have significant measures in place to prevent an outage and reduce its impact. They will have load balancing, backup data centres that they could transfer workloads to, redundancy to cope with additional demand and more. Yet, during the heatwave, the outage still happened. And to some extent that means that they accepted the risk. It may have been more expensive or taken more time to transfer to another data centre and back than just accepting an outage for a few hours. Taking action may have introduced another risk to whatever function the outage affected. Maybe the functions weren’t particularly critical. They’re probably reviewing those plans again now though. Given the cost of energy, I wonder if there are IT services that we could shut down to reduce costs. This is easier for cloud infrastructure, but there may be elements of your physical infrastructure that could be switched off in a more managed and deliberate way, such as network switches in office buildings in the evenings and at weekends. How many kWh could that save each year?
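
To get a feel for the kWh question, a back-of-the-envelope calculation is enough. The sketch below is a rough estimate only: the function name `annual_kwh_saved`, the 50 W per switch, the 20 switches and the 60 office hours a week are all assumptions chosen for illustration, so swap in figures measured in your own buildings.

```python
# Rough estimate of energy saved by powering devices down outside office hours.
# All figures used here are illustrative assumptions, not measurements.

HOURS_PER_WEEK = 7 * 24  # 168 hours


def annual_kwh_saved(device_watts: float, device_count: int,
                     office_hours_per_week: float,
                     weeks_per_year: int = 52) -> float:
    """kWh saved per year if devices only run during office hours."""
    off_hours_per_week = HOURS_PER_WEEK - office_hours_per_week
    watt_hours = device_watts * device_count * off_hours_per_week * weeks_per_year
    return watt_hours / 1000  # convert Wh to kWh


# Hypothetical office: 20 switches at 50 W each, needed 60 hours a week
saved = annual_kwh_saved(device_watts=50, device_count=20,
                         office_hours_per_week=60)
print(f"Estimated saving: {saved:,.0f} kWh per year")
```

With these assumed figures the switches are off for 108 hours a week, which works out at 5,616 kWh a year. Multiply by your own unit price to see whether the saving justifies the extra risk of switching kit off and on.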


Sometimes the right course of action is to do nothing. Just accept the risk. I’ve experienced outages to business-critical systems in my career. You’re under pressure and it is so tempting to leap into action. You find yourself eager to invoke that business continuity or disaster recovery (DR) plan that you spent time and effort writing and getting agreement for, because you just want to see whether it actually works. When you find yourself in that situation, take a breath and think before you fire the starting gun on your DR plan. Doing nothing is always an option, but it should always be an active choice rather than a passive one. You should be deliberately not taking action, or deliberately waiting to see what unfolds.


Test the plan


Of course, IT teams should be testing DR plans regularly, but there are often more pressing priorities. I’ve noticed this in smaller organisations in particular - and that’s a risk in itself of course. I’d recommend testing at least annually, with slightly different scenarios each year. You might never encounter the exact scenario that you test, but what you will learn will be useful. It’s also worth considering those relatively minor system hiccups, or incidents, along the way as mini DR scenario tests, because you learn from those too. Schedule tests in your annual plans, write up the results and report to your senior managers. If nothing else, senior managers expecting a report will motivate you to carry out tests.


Keep cool


So, with temperatures rising and new risks emerging, ask yourself:

  • When was the last time you properly reviewed your IT risk register?

  • Do you need additional PC RAT risk management actions?

  • Are you carrying out regular DR tests and learning from them?



As a qualified coach, I can help you to be more self-aware and confident in your abilities and address work challenges. Reach out if you'd like to discuss coaching with me. I offer career/work coaching for people in any role. I am able to offer a blend of coaching and mentoring for people in IT and change roles, particularly managers or aspiring managers.


I am currently able to offer coaching sessions free of charge - contact me via LinkedIn or Twitter





©2022 by Steve Thorlby-Coy.