Doctor Digital MY BUSINESS STOPPED when [insert crisis] happened. What can I do so that never happens again?
Doctor Digital Says
Introduction
There is nothing more heart-stopping than the moment when all of your business resources, communication, connectivity and capacity stop dead and there is nothing you can do about it. That was the situation many Tasmanian businesses (organisations and people) found themselves in when both the Internet cables servicing the state were uncannily severed in two places on the same day in 2022. It happened again in 2023 for Optus customers, when their services failed and left some without a connection for over 9 hours. No matter how small the risk, major outages happen, and the nature of these failures puts them outside our control as businesses. What isn't out of our control is having a backup plan, a risk assessment, and some well-documented and tested plans for staff to triage the situation when it happens.
While this factsheet is about digital disasters, living in Tasmania and more broadly Australia, the digital disaster might be linked to a natural disaster. Fire and flood are common natural foes that can also significantly disrupt business and digital connectivity - similar issues but different circumstances. As our businesses become more dependent on single points of failure such as Internet connectivity, it is worth remembering that this is a modern evolution of risk and dependency, and it is something for businesses to plan for and manage as you would for natural disasters. You may want to take a look at Business Tasmania's Emergency Management for Natural Disasters Toolkit, which has some helpful templates and guides.
Risk management and business continuity planning, at its simplest, is about fully understanding where your business is vulnerable, and strategically creating contingencies in case one of those vulnerabilities is exposed. You have insurance for things like floods, fires, theft - this is risk planning to help fund your business if it needs to recover from any of these things. If you don't already have one, it's time to make a digital backup plan for your business for the next time there is a significant Internet outage.
Internet or power outages that will really disrupt business seldom come with any lead time to prepare, so a backup plan for this type of event should assume that the outage is already happening and that there may be no guidance from the carriers or utilities involved as to when services will be restored. In the recent Tasmanian outages there was no Internet and, in some places, no phone service, so it was very difficult to find out what was going on: the usual sources of information - ie the Internet, social media, news feeds etc - were all down too.
Let's look at how your Internet can fail, some actions to take, how to create your plan, and mitigate as much of the risk as possible in a situation where there are many variables you are unable to control. But before that, some simple context about Tasmania's Internet so you understand how it all works. If the tech specs don't interest you, jump ahead to the bit about the contingency planning.
Tasmania and Internet connectivity
National, and therefore, international network connectivity to Tasmania is currently available from two optical fibre cable service providers. One is Telstra, which operates two cables across Bass Strait, and the other is Basslink, which operates a single cable across Bass Strait. Bass Strait has been the traditional and only telecommunications route to Tasmania for cable (copper and fibre) which makes Tasmania an end point in the network and not a connection point (network node).
What this means is that we rely on network nodes to provide services that are hosted in a distributed manner across the global Internet network. So when our cables get damaged or cut, Tasmania's connection to the internet is effectively cut off. That is why banks, cloud servers like Amazon and Google, video conferencing, online streaming services and complex powerful network capability are typically not hosted here. When our network went down, so did our access.
For Tasmanian businesses, on-island internet access is either via NBN or via Mobile Carrier broadband that uses 4G and/or 5G services.
- NBN offers fixed, wireless and satellite connections, depending on physical location – ie an address
- NBN fixed services are FTTP, FTTN and FTTC: fibre to the premises, fibre to the node and fibre to the curb
- NBN fixed wireless is a tower-to-premises microwave radio link that also depends on a physical location - ie an address, usually regional or remote
- NBN satellite is a whole-of-Australia satellite coverage primarily for remote/regional premises
All of these offer only a single link, without a separate backup backhaul connection (also known as redundancy).
For business continuity planning, the issue is not the NBN connection to an individual premises; the issue is the whole-of-state capacity to offer an uninterrupted connection to the off-island Internet nodes. As we have seen from the cable-cutting incident, the likelihood of the connection being interrupted is relatively low but the impact is definitely serious.
Some other points of failure
Before we get into the depths of backup Internet plans, let’s walk through the different ways the Internet can disconnect or fail completely. First steps if there is a significant outage are to establish that it is not a point of failure within your system, so check your connections, check for things like dust or an overturned coffee cup, turn devices off and on again, check the power supply, check your service provider for notifications of any issues and if possible contact other businesses in your area to see if they are having an outage. Once you have established it is an external issue, you can narrow things down.
Generally speaking, there are four common faults that lead to broken Internet:
1. Internal ISP issue: These problems occur within the network of your Internet Service Provider (ISP), and are usually quick and easy to fix. You can expect a downtime of a couple of hours.
2. Major ISP issue: Sometimes faults within your ISP’s network aren’t straightforward. Cue the Optus debacle and the many deep layers of investigation before they found what *they think* is the issue. For example, they may experience the shutdown of critical hardware, or a disconnection in their operations centre or a remote exchange, which can take time to find and fix. Downtime can be significant and devastating to business and services.
3. Accounts issue: Your ISP will shut down your Internet connection if they think you haven’t paid your bill (oops). You will usually get lots of notice before things escalate to this point, and all services will resume immediately once the accounts dispute has been resolved.
4. Infrastructure failure: Critical infrastructure may be interrupted by extended power outages and physical disturbance (for example, if someone digging hits a cable, or infrastructure is damaged by a natural disaster). Depending on the extent of the damage and the remoteness of access, there could be a couple of days of downtime or more.
What are your reliances, risks and vulnerabilities?
It’s a good idea to consider how an Internet failure would impact your business. What does your business depend on to deliver your product or service - banking, EFTPOS, online bookings, email, data storage, inventory management, logistics, CRMs, SMS or messaging for clients, website carts and fulfilment, payroll, invoicing?
If any of the above weren’t available for a couple of days, how would your business cope? Could staff still be productive? Could you still bring in revenue? Can you handle, process, store and invoice for cash? Can you realistically afford an Internet outage of more than a couple of hours and, if so, for how long?
Make a list of the dependencies your business has based on the suggestions above, and map the severity of the impact that losing each one would have on your business. This will help you to prioritise how you manage risk, work out what you can financially tolerate, and estimate what the investment to mitigate it might look like.
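If you like to keep this kind of list in a spreadsheet or a small script, the mapping can be sketched as below. This is only an illustration: the 1 to 5 impact scale, the "tolerable hours" figure and the example scores are all assumptions made up for the sketch, so substitute your own dependencies and numbers.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    name: str
    impact: int             # assumed scale: 1 (minor nuisance) to 5 (business stops)
    tolerable_hours: float  # assumed: how long you could cope without it


def prioritise(dependencies):
    """Order by highest impact first, then by shortest tolerable downtime."""
    return sorted(dependencies, key=lambda d: (-d.impact, d.tolerable_hours))


# Hypothetical scores for a small retail business.
dependencies = [
    Dependency("Email", impact=3, tolerable_hours=8),
    Dependency("EFTPOS", impact=5, tolerable_hours=1),
    Dependency("Payroll", impact=4, tolerable_hours=48),
    Dependency("Online bookings", impact=4, tolerable_hours=4),
]

ranked = prioritise(dependencies)
for d in ranked:
    print(f"{d.name}: impact {d.impact}, can cope for {d.tolerable_hours} hours")
```

The items at the top of the ranked list are the ones to look at first when deciding where to spend money on mitigation.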
One of the solutions to consider is having a redundancy or failover system. Redundant internet is a connection that kicks in automatically when your primary internet connection goes down, which means you never lose connectivity. It's like having a backup generator that can keep a home running after a storm knocks out the neighbourhood's power. This is realistically the only thing that could have helped businesses after the Optus outage and many more are now considering it as a viable investment.
There are a few types of redundancy to consider when creating a backup plan. Some you can do yourself; others can be purchased from organisations that specialise in Redundancy as a Service (RaaS).
- Electronic redundancy: This is partial protection achieved by routing your internet traffic through a variety of alternate network paths via standby routers and switches.
- Physical redundancy: Another form of partial protection, it involves deploying two separate paths between two locations, with each one ending in a different piece of equipment - say having both satellite and fibre connections.
- Wireless option: This kind of redundancy employs 4G LTE wireless connections. If an outage happens, the network seamlessly shifts to a battery-backed wireless connection, where computers can connect to a hotspot.
- Using a different backup carrier: Some companies have a second internet service provider, or ISP, at the ready that can come online if the main connection is severed.
The solution you come up with for your business will be specific to where your greatest risks and dependencies are. All options will come at a cost.
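To make the idea of failover concrete, here is a minimal sketch of the decision a failover setup makes: try each connection in priority order and use the first one that is working. The connection names and the simulated outage are hypothetical, and in practice this logic usually lives inside a failover router or a managed service rather than in a script you write yourself.

```python
def choose_connection(connections, is_up):
    """Return the first connection, in priority order, that is_up reports as working."""
    for name in connections:
        if is_up(name):
            return name
    return None  # nothing is available: time for the offline plan


# Hypothetical priority order: primary first, backups after.
connections = ["NBN fixed line", "4G hotspot", "Second ISP"]

# Simulated outage: the primary link is down, the backups are up.
status = {"NBN fixed line": False, "4G hotspot": True, "Second ISP": True}

active = choose_connection(connections, lambda name: status[name])
print(active)
```

Note the final `None` case: if every connection is down, as can happen in a whole-of-state outage, no amount of redundancy helps and your offline procedures take over.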
Creating a business contingency
Every organisation, no matter how large or small, should have a crisis response plan. It doesn’t need to be a massively sophisticated document, but it must meet the needs of the organisation and prepare it for when things go wrong. And it needs to be up to date and practised regularly. Y'all listening Optus and Telstra?
Business Tasmania has created a suite of business contingency planning tools, including a comprehensive template to help you plan for future risks and how to recover from them. One of the main risks to businesses is failing to plan for issues such as significant Internet outages (fewer than 2% have active plans for this), a gap that costs millions of dollars a year in lost revenue. Planning, and communicating that plan to your team, is an easy thing to put off in the daily busyness of business, but it can be a costly omission when the plan is needed.
Let's look at a simple scenario: Many businesses now have all their data in the cloud - because it's safe and secure, until you can't access it of course. For a services business, your database and contacts are critical. If your database is cloud-based, you may want to have a locally stored/desktop version so you can contact customers by phone when a lack of Internet means there is no way to email or message them. Perhaps your new protocol will be to download details for the customers you are dealing with that day so you have a phone contact available, and you will make this policy and communicate it to your staff as part of your business contingency planning.
Low risk doesn't mean no risk, as the cable break and the Optus internet downtime have shown. The strong message is that you need to work out your critical dependencies, plan for them to be unavailable for a period of time, and then invest in the mitigations you feel comfortable with to ensure you are able to continue to support your customers and staff until access and capacity are restored.
Being able to tell staff and customers what is happening is important, as is being able to transact if you don't have access to electronic funds or credit cards for a period of time. Can these be linked to a RaaS product? Focus on your time-critical issues and your revenue, while also remembering that sometimes there isn't an easy or plannable solution, and you simply have to be agile and creative in an ever-changing situation.
Here's the TL;DR
- Plan for future outages: You need to develop a comprehensive outage response plan that outlines clear procedures for communication, task prioritisation, and alternative workflows and processes. Regularly review and update the plan to reflect changes in technology and business operations.
- Employee training and awareness: You should provide training to employees on alternative workflows and communication protocols during downtime. Ensure they understand how to access and use offline productivity tools and alternative communication channels.
- Regular testing and simulations: If you really want to prepare, you could conduct regular tests and simulations to ensure the effectiveness of your outage response plan. This allows you to identify areas for improvement and tweak procedures based on the test results.
Conclusion
Regardless of the details of the Optus outage, and whatever new revelations and actions emerge once the real cause has surfaced, the fundamental lesson here is about having an effective crisis management plan in place.
Not just an off-the-shelf consultant cookie-cutter model, but a dynamic and flexible plan which provides for every specific scenario, for everything which might go wrong for your organisation, and which has been fully tested by probing for possible weaknesses.
As Optus has shown, it’s not enough just to have a plan. It has to work when disaster strikes.