Firstly, let me just say that if you think this problem is limited to CrowdStrike, think again. Any of your 3rd party vendors could have had this issue.
One of the interesting things this global outage has shown is the reach this vendor had into our daily lives. Most of us did not even realise what was going on until we couldn’t pay for something online, take that flight for our holiday or, in my case, buy some laptops for work from Dell.
Critical Infrastructure, Healthcare, Retail, Finance, Travel: pretty much every sector has been impacted by this outage in some form or another.
Interestingly, the outage of Microsoft Azure that happened at the same time caused a lot of confusion and speculation. Bad timing or something more? The rumour mill has already started on social media.
The true impact of this outage will last for weeks, and for some organisations it will cause some very real and massive logistical nightmares. Imagine you’re a 100% online business that’s global. Your staff work all over the world and you issue their IT remotely.
I guess you have 2 choices:
- Issue new laptops to all the staff to get them back up and operational immediately (assuming fastest shipping option)
- Call each staff member by phone, as Teams or Zoom clearly may not be an option, and talk them through booting into recovery to fix the issue.
You will by now have realised that this task alone takes a huge amount of effort. Please do spare a thought for the IT Support teams around the world who are working as hard as they can to get people back up and running.
What can we do
So there are a number of things that we can do to attempt to limit the impact of this happening in the future:
- 3rd Party Risk Register: Organisations should have a register of all the 3rd party solutions they use, including all software vendors. This is not a small task, and the bigger the organisation, the bigger the task. In some cases this list could be 1000s of vendors. The register should then record how critical each piece of software is to your business, which will allow you to plan for the event that something goes wrong with it.
- Patch Deployment Process: What is your patch deployment process? Do you release patches immediately to production systems, or do you test them first in a development or pre-production environment? A lot of people have been saying “un-tick auto updates”. In the majority of cases this is a bad idea, as it will open you up to a lot more pain for a very small short-term gain. Instead, delay patches for a few days on all your systems unless they are deemed Critical. This will give you some breathing room to assess that they are good to go.
- Vendor QA Release Process: Ask your 3rd party vendors for their QA and patch release process. You will see that a lot of vendors are already publicly sharing their processes to get ahead of the questions that are going to be asked after the fallout of this outage.
- Disaster and Business Continuity Plan: Make sure that your BC & DR process covers a 3rd party event and that you have clear processes and procedures to follow in the event of this happening again. Don’t be fooled: this will 100% happen again in the future, and all we can do is plan for the event as best we can.
- Employee Support: The often forgotten part of the business puzzle. Your employees will potentially need additional support during this time and may be a little more stressed out than normal. Don’t forget that something like this impacts both your organisation and the organisations that your employees use in their personal lives. So checking in and making sure people are ok is a must.
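To make the first two points a little more concrete, here is a minimal sketch of what a risk register entry and a patch delay rule could look like in Python. Everything in it is an assumption for illustration: the field names, the `Criticality` levels, the vendor names and the 3-day default delay are all hypothetical, and a real register would normally live in a GRC or asset management tool rather than a script.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Criticality(Enum):
    """How much the business depends on a vendor (hypothetical scale)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Vendor:
    name: str                  # vendor or product name
    criticality: Criticality   # how critical it is to the business
    fallback_plan: str         # what you do if it goes wrong


@dataclass
class Patch:
    vendor: str
    released: date
    is_critical: bool          # deemed Critical, so it should not wait


# Assumed value for "a few days" of breathing room; tune to your own process.
DEFAULT_DELAY_DAYS = 3


def earliest_production_date(patch: Patch, delay_days: int = DEFAULT_DELAY_DAYS) -> date:
    """Critical patches go out on release day; everything else waits."""
    if patch.is_critical:
        return patch.released
    return patch.released + timedelta(days=delay_days)


def vendors_by_criticality(register: list[Vendor]) -> list[Vendor]:
    """Most business-critical vendors first, so planning starts with them."""
    return sorted(register, key=lambda v: v.criticality.value, reverse=True)
```

Sorting the register by criticality tells you which vendors to plan for first, and `earliest_production_date` encodes the "delay unless Critical" rule so it is applied the same way every time rather than decided patch by patch.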
Conclusion
This has once again highlighted how fragile our daily lives are when we rely on so much technology. A lot of businesses were not affected at all by this outage, but their staff would have been, through the collateral damage to the likes of Visa and other big names.
Make sure that you have planned for what you will do when this happens again in the future, and try to limit the risk and damage to your organisation and, more importantly, your staff.
Finally, I hope that people understand that there are a lot of people working around the clock globally right now to help fix the issues caused, so please try to have a little empathy and patience. We are all human after all.
As always, if you need any support or have any questions, please do reach out to me or one of the team at Cyber Security Associates or FluidOne.