Flight Line Support (FSL) has been growing for several years, and both the number of on-premises servers they run and the importance of those servers have increased greatly. They were concerned that the business would be badly affected if these servers went down, and they wanted to put disaster recovery in place.
They already had backups, but these did not provide quick, flexible disaster recovery. We knew the systems at FSL well, as we already provided their IT support. We had mentioned disaster recovery to them in the past, but their increased reliance on these servers moved it higher up their priority list.
The project was to provide offsite replication of all systems and quick recovery time in the event of site disaster or server failure.
As with all projects, the first stage is to find out the customer's goal and then advise whether that goal can be improved upon. FSL stated that they wanted the quickest possible recovery time in the event of either a physical server failure or a complete site disaster such as fire or theft.
These goals are easily met with our “TOD (Triumph Over Disaster) Extra” backup product. The next step in such a project is the due diligence on the data. We need to find out the following:
- What do we need to back up?
- How much total data is there?
- How frequently should we back this up?
- Where does it need backing up to?
- RPO – What is the recovery point objective?
- RTO – What is the recovery time objective?
The research showed the following:
What do we need to back up?
There were six virtual machines that needed backing up spread across two physical servers.
How much total data is there?
3TB in total
How frequently should we back this up?
We suggested hourly onsite backups 24 hours a day and nightly offsite backups of all data. FSL agreed.
Where does it need backing up to?
All data needs to be stored offsite in case of an onsite disaster such as fire or theft. This is our suggestion to all customers.
The recovery point objective (RPO) is taken here to cover how far back the backups go and how granular they are. In this case, FSL agreed that they needed to be able to go back as far as two months. For granularity, the following was agreed:
- Hourly backups are kept from the current day to 14 days
- Daily backups are kept from 15 – 28 days
- Weekly backups are kept from 29 – 56 days
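The retention schedule above can be sketched as a small lookup. This is only an illustration of the agreed tiers, not how the TOD Extra software applies them internally; the function name and the choice to count age in whole days are assumptions.

```python
from datetime import timedelta
from typing import Optional

# Retention tiers as agreed above: (oldest age in whole days, granularity kept).
RETENTION_TIERS = [
    (14, "hourly"),   # current day to 14 days
    (28, "daily"),    # 15 to 28 days
    (56, "weekly"),   # 29 to 56 days
]


def retention_tier(age: timedelta) -> Optional[str]:
    """Return the granularity kept for a backup of this age, or None once it has expired."""
    if age < timedelta(0):
        raise ValueError("backup age cannot be negative")
    for max_days, granularity in RETENTION_TIERS:
        # age.days counts whole days, so a backup 14 days and 23 hours old is still "hourly".
        if age.days <= max_days:
            return granularity
    return None


print(retention_tier(timedelta(days=20)))  # daily
```

Anything older than 56 days falls outside all three tiers, which matches the agreed two-month limit.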
The recovery time objective states how quickly the systems need to be up and running after they have gone down. The following was agreed:
- Server failure – 4-hour RTO
- Site failure – 24-hour RTO
These objectives are achievable with the TOD Extra product.
With such a backup system it is also important to consider internet speeds. Offsite replication of such a large amount of data is only possible with a good fibre internet connection. FSL have an FTTP (Fibre to the premises) connection with 80MBps down and 40MBps up. This will be more than adequate to replicate the amount and frequency of FSL’s data.
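As a rough sanity check on those figures, the arithmetic below estimates how long the 3TB data set would take to upload in one go. Two things are assumptions: the unit is ambiguous as written (broadband speeds are usually quoted in megabits per second, while "MBps" normally means megabytes), so both readings are calculated, and the link is treated as fully available with no protocol overhead.

```python
def upload_hours(data_bytes: float, uplink_bits_per_second: float) -> float:
    """Hours needed to upload data_bytes at a sustained uplink rate, ignoring overhead."""
    return data_bytes * 8 / uplink_bits_per_second / 3600


TB = 10**12                # decimal terabyte, in bytes
full_backup = 3 * TB       # total data from the due diligence

# The 40 "up" figure read two ways, because the unit in the text is ambiguous.
uplink_if_megabits = 40 * 10**6        # 40 megabits per second
uplink_if_megabytes = 40 * 8 * 10**6   # 40 megabytes per second, expressed in bits

print(round(upload_hours(full_backup, uplink_if_megabits), 1))   # 166.7
print(round(upload_hours(full_backup, uplink_if_megabytes), 1))  # 20.8
```

On either reading a full copy ties up the connection for many hours, roughly a week in the slower case, which is consistent with seeding the initial backups on physical media and using the line only for the much smaller incremental changes.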
Once all the due diligence is completed and the details have been agreed with the customer, it's time to put a price together. TOD Extra is charged at a fixed monthly price per server that includes everything. The TOD Extra service provides:
- Hourly onsite backups 24 hours a day of the entire system.
- Nightly offsite backups 7 days a week of the entire system.
- Backups are monitored daily and test restores are done weekly by Triumph.
- File level restores upon request.
- Access to our redundant physical server in the event of a disaster.
- Cloud boot facility in the event of a disaster – We can boot your server/s in our Cloud and provide you remote access to these resources.
- All hardware, software and support costs included.
FSL agreed the system was exactly what they were looking for and sign off was completed.
The implementation of this backup system involves the following:
- Hardware – Installation of the NAS that will hold all the backups at FSL’s office.
- Software – Installation of the backup software on all servers.
- Configuration – Configuring the software to the required spec and setting up the offsite replication.
- Seeding – The initial data needs to be replicated offsite using physical media.
- Testing – Testing the backups and disaster recovery.
The NAS box will be installed in a secure location, with a static IP ready to receive the backup images. We also ensure that data on the NAS is password protected and encrypted. This protects the images from a locker virus should it make its way into the network.
The backup agent needs to be installed on each server. It is then configured to send backups to the NAS box hourly, 24 hours a day. The software uses VSS (Volume Shadow Copy Service) to capture the entire system, including files and software that are in use. The software automatically notifies us of failures, and we also check the backups manually every day. After each installation, an evening reboot of the server is scheduled.
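The daily check can be pictured as a simple staleness test: with hourly backups, any server whose newest backup is much more than an hour old deserves a look. The sketch below is a stand-alone illustration, not part of the backup software; the server names, timestamps and the 15-minute grace period are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Hourly schedule from the text plus a grace period; the 15 minutes is an assumption.
MAX_BACKUP_AGE = timedelta(hours=1, minutes=15)


def stale_servers(last_backup: Dict[str, datetime], now: datetime,
                  max_age: timedelta = MAX_BACKUP_AGE) -> List[str]:
    """Return the names of servers whose newest backup is older than max_age."""
    return sorted(name for name, taken in last_backup.items() if now - taken > max_age)


# Hypothetical server names and timestamps, for illustration only.
now = datetime(2024, 1, 1, 12, 0)
last_backup = {
    "vm-files": now - timedelta(minutes=30),
    "vm-mail": now - timedelta(hours=3),
}
print(stale_servers(last_backup, now))  # ['vm-mail']
```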
We will also install our replication agent that manages the replication of the backups to our Cloud network.
Because of the size of the data (3TB), the initial full backups need to be copied to removable physical media and then manually copied to our Cloud network. We manage this process for the customer and send an engineer to site once the seed copy is complete. The engineer brings the media back and copies the data to our Cloud, at which point we can start replicating the incremental backups.
Once the backups have fully replicated we will run the following tests:
- File restore – Check that we can restore an individual file from every server.
- Server recovery – Check that we can recover every server and successfully boot it in our Cloud.
- RTO – Check that we meet the customer’s recovery time objective.
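The RTO test comes down to timing a test recovery and comparing it with the agreed target for that scenario. A minimal sketch of that comparison follows, using the 4-hour and 24-hour targets agreed earlier; the scenario keys and the sample durations are hypothetical and are not results from this project.

```python
from datetime import timedelta

# RTO targets agreed with the customer.
RTO_TARGETS = {
    "server_failure": timedelta(hours=4),
    "site_failure": timedelta(hours=24),
}


def meets_rto(scenario: str, measured_recovery: timedelta) -> bool:
    """True if a timed test recovery finished within the agreed RTO for the scenario."""
    return measured_recovery <= RTO_TARGETS[scenario]


# Illustrative durations only, not results from this project.
print(meets_rto("server_failure", timedelta(hours=2, minutes=30)))  # True
print(meets_rto("site_failure", timedelta(hours=30)))               # False
```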
All of our tests passed. The backup system is now implemented and tested.
Project Sign Off
The final stage of any project is to check that the customer is happy. An important aspect of a good IT company is knowing how much technical information the customer wants. Some like to know every detail; others only want to know “Is it finished?”. FSL sat somewhere in the middle, so we discussed the project and gave them the key details.