British Airways cancels all flights from Gatwick and Heathrow due to IT failure

https://www.theguardian.com/world/2017/may/27/british-airways-system-problem-delays-heathrow


British Airways has cancelled all flights from Heathrow and Gatwick on Saturday due to a major IT failure that is causing very severe disruption to its global operations.

The airline said that its terminals at Heathrow and Gatwick had become “extremely congested” due to the computer problems. It decided to cancel all flights from both airports before 6pm UK time on Saturday. “Please do not come to the airports,” BA said.

A later statement said the airline had been forced to cancel all remaining flights scheduled to depart from the UK’s largest two airports on Saturday. “We are extremely sorry for the inconvenience this is causing our customers and we are working to resolve the situation as quickly as possible.”

It is believed hundreds of flights at the two airports have been affected, and more around the world have suffered major delays.

Travellers have been told to check ba.com and its Twitter account for updates about the situation.
 
A very sad day there today; this will take a few days to get back to a normal schedule.
 
Hopefully they will sort it soon as there must be thousands of people affected (and checking their insurance documents).
 
That's what the manager who ordered someone to cut corners said.

Critical systems can fail. But you prepare for just such cases with multiple fallbacks.
Which is great, but it's not beyond possibility that all systems, including their redundancy options, fail simultaneously.
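
To put rough numbers on that: if the fallbacks really are independent, the chance of everything failing at once is tiny, but a shared dependency (one power feed, one network, one support provider) can take the lot out together. A toy calculation with made-up figures, nothing to do with BA's actual systems:

```python
# Toy figures, purely illustrative -- not BA's actual reliability numbers.
p_single = 0.01       # chance any one system fails in a given window
n_systems = 3         # primary plus two redundant fallbacks

# Truly independent systems: all three must fail at the same time.
p_all_independent = p_single ** n_systems
print(f"independent failures: {p_all_independent:.6%}")        # 0.000100%

# Shared dependency (same power feed, same support provider, etc.):
# if that one thing fails, everything fails with it.
p_shared = 0.01
p_all_correlated = p_shared + (1 - p_shared) * p_all_independent
print(f"with a shared dependency: {p_all_correlated:.6%}")     # ~1.0%
```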

Nobody seems to be saying what the issue is yet so let's leave the finger pointing for now.
 
Sure. We don't know the cause. But we can see the effect. Misery for so many people, which I think was avoidable.
 
One thing is for sure: this will provide plenty of fodder down at the Daily Fail, the Sun, etc.
 
The Delta data centre outage in January cost Delta Airlines $100 million. I guess BA are going to get a hefty bill too. And I guess they don't want it to happen again, so they will likely improve their procedures.
 
Well it's either that, or I was stating the obvious, or I'm the evil super hacker behind it all.
 

From what I can gather, it's actually a major power failure in India, where their main offshore support provider (Tata) is based. I don't know any more details than that, but from experience, a lot of offshore providers rely on pretty shaky power grids, so unless Tata can run their support infrastructure on generators for days, there's not a lot BA can do.
 

Is that the Indian IT support provider that BA recently started using, making UK IT staff redundant?
 
Risking their major operations on known unreliable power supplies, without an adequate switch-over to an alternative location, would then be the area where they'll be making big changes, in that case.

I suspect they have lots of contingencies in place, but could have skimped on the testing and quality, like Delta did: their backup site took over smoothly, but not everything could connect to it.
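
For what it's worth, the "backup took over but not everything could connect" failure mode is exactly what a regular end-to-end check against the standby site is meant to catch. A minimal sketch of that kind of check; the service names, hostnames and ports are hypothetical, not anything BA actually runs:

```python
# Minimal sketch of an end-to-end check against a standby site.
# Hostnames and ports are hypothetical placeholders.
import socket

STANDBY_SERVICES = {
    "check-in-db": ("standby-db.example.internal", 5432),
    "booking-api": ("standby-api.example.internal", 443),
    "departure-boards": ("standby-fids.example.internal", 8080),
}

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

failures = [name for name, (host, port) in STANDBY_SERVICES.items()
            if not can_connect(host, port)]

if failures:
    print("Standby site is NOT ready to take over:", ", ".join(failures))
else:
    print("All checked services reachable on the standby site")
```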
 
"I'm sorry sir. Supplies of Champagne have been disrupted by the computer outage. They sent Red Bull instead"
 
A few very short-sighted responses in here. Unfortunately this is likely to be an outcome of the general public's obsession with lower prices. As with most things, when you try to reduce the cost, something has to give. I've no idea what the cause of this was, but as someone who works in the aviation industry I do know that margins are now next to nothing, and I'm afraid those who drive the prices down are those who suffer the consequences. But don't worry, you can claim your money back... but, oh, that will mean someone has to pay... oh no, higher prices... so we need to cut costs, and the cycle continues.
 
I suspect some data centre mega switch has gone pop and the 3rd party that subcontracted the support contract to the 4th party has been told they don't stock the part, so it's on 24-hour order from Cisco.
 
I wonder if they've tried the power cycle routine failsafe?

(i.e. switch it off, switch it back on).
 
That was close to the Australian census computer cock-up:
"Computer giant IBM has conceded the issues surrounding the census website outage could have been avoided if it had turned one of its routers off and on again beforehand"
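
Daft as the IBM line sounds, scheduled or watchdog-driven reboots really are used to clear leaked state in routers and appliances. A rough sketch of the idea; the health URL and the power-cycle command below are placeholders, not a real device API:

```python
# Dumb "turn it off and on again" watchdog -- a sketch only.
import subprocess
import time
import urllib.request

HEALTH_URL = "http://192.0.2.1/health"                   # hypothetical device health page
RESTART_CMD = ["echo", "power-cycle command goes here"]  # placeholder, not a real command
CHECK_INTERVAL = 60                                      # seconds between checks

def healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the health page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

while True:
    if not healthy(HEALTH_URL):
        print("Health check failed; power-cycling the device")
        subprocess.run(RESTART_CMD, check=False)
        time.sleep(300)   # give it time to boot before checking again
    time.sleep(CHECK_INTERVAL)
```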
 
Carnage in T5 this morning, glad I'm not having to re-book. Business class check in is 1hr, normal is at least double that. Be early.
 
Did you mean half rather than double?
Or words to that effect anyway; I've not slept, so it might be my failing logic circuit...
Unless you mean the carnage has somehow made check-in shorter, but then why be early?... Headache, lol.
 
As more and more systems are loaded onto the Internet, more and more systems will fail.

The more complex a system is, the more room there is for failure, whether accidental or deliberate.
Again, it's driven by the consumer: people want express check-in via the Internet, live flight details, the ability to check and change their bookings, and all of the other frilly little features.

However, like I said, with all technology sometimes poop happens, no matter how simple it is or how many redundant failsafes you have.
 
Outsourcing is not an issue - in fact it's normally a good thing. OK, this applies less to a company the size of BA, but by outsourcing, companies can get enterprise-grade IT for an affordable price, and I have seen first hand (admittedly in smaller businesses) how pretty pants a lot of in-house IT is.

A decent data centre should have 72+ hours of diesel generator power available if a power cut happens, and a business of this size should have at least one, if not two, data centres as a failsafe. Outsourcing to India, TBH, would not be my first choice, however.
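
As a back-of-envelope check on that 72-hour figure, runtime is just stored fuel divided by burn rate at load. The numbers below are invented for illustration, not any real site's:

```python
# Back-of-envelope generator runtime -- all figures invented for illustration.
tank_litres = 40_000       # on-site diesel storage
burn_rate_lph = 450        # litres per hour at the data centre's typical load
target_hours = 72

runtime_hours = tank_litres / burn_rate_lph
print(f"Runtime on stored fuel: {runtime_hours:.0f} hours")   # ~89 hours

if runtime_hours >= target_hours:
    print("Meets the 72-hour target")
else:
    print("Needs a refuelling contract kicking in before the tanks run dry")
```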
 
I'd be very surprised if an IT system running two major U.K. airports didn't have failsafes but, as per the post above, sometimes sh1t just happens in the real world.


BA's datacentre is at Heathrow, in their Waterside HQ. IT support was indeed outsourced to Tata, with some smart hands on site. It appears they have power to the site but, due to a major fault, cannot get it to the affected parts of the building housing the datacentre rooms. It's not known whether this is affecting the servers themselves, or whether they are running and it's the network connectivity, as there is a full clampdown on staff speaking to anyone.

Now this smacks of complacency and incompetence, and it really will cost them far more in lost revenue, compensation and, let's face it, embarrassment to the brand than a decent DR site would have. There's no technical barrier to having a DR site; it's just down to cost and planning, and too many times people gamble on this.
 
I'd be amazed if they didn't have a secondary datacentre along with the one at Heathrow. A colo rack and VPLS aren't exactly expensive.

One theory being bandied around is that some displays (departure boards etc.) were apparently showing muddled information before the main issues. It's possible that when one DC went down and rolled over to the second, the data was out of sync/incomplete/corrupt. Alternatively, the failover of the databases had issues and/or didn't complete, or the rollover happened but the primary came back up too fast and the data wasn't synced back, etc.

I'd not be surprised if they're pulling databases back from their backups.
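
That theory is at least consistent with how database failover usually goes wrong: the standby gets promoted while it's still behind, or the old primary comes back and accepts writes before it's been fenced off. A rough sketch of the decision logic; the helper functions are hypothetical stand-ins for whatever the real monitoring exposes, not BA's actual tooling:

```python
# Sketch of a failover decision for a replicated database -- the helpers below
# are hypothetical stand-ins, not a real monitoring or clustering API.
MAX_ACCEPTABLE_LAG_SECONDS = 5.0

def replication_lag_seconds() -> float:
    # Hypothetical: in reality this would query the replication monitoring.
    return 2.3

def fence_old_primary() -> None:
    # Hypothetical: stop the old primary accepting writes if it comes back
    # (pull it from the load balancer, revoke its virtual IP, etc.).
    print("old primary fenced")

def promote_standby() -> None:
    # Hypothetical: promote the standby to be the new primary.
    print("standby promoted")

def fail_over() -> None:
    lag = replication_lag_seconds()
    if lag > MAX_ACCEPTABLE_LAG_SECONDS:
        # Promoting now would serve stale or incomplete data -- the
        # "muddled departure boards" failure mode described above.
        print(f"standby is {lag:.1f}s behind; not promoting automatically")
        return
    fence_old_primary()   # avoids split-brain if the primary comes back too fast
    promote_standby()

fail_over()
```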
 