Google's Cloud Platform suffered a major breakdown on Sunday, taking down several sites

Humza

Posts: 1,026   +171
Staff member
What just happened? Google's cloud infrastructure, which powers many of its own services as well as those of other major companies, suffered an outage that lasted several hours. The issues started Sunday around 3pm ET and spread across multiple regions, particularly the US East Coast and Europe, before the company resolved the downtime for all affected users around 8pm ET.

A number of web services dependent on Google's Cloud Platform were knocked out owing to "high levels of network congestion in the eastern USA" faced by its cloud infrastructure. The outage caused a range of problems, from users' inability to control room temperatures through their Nest devices, to credit card payment processing delays with Shopify, to connectivity problems with Gmail and YouTube, the latter of which was reported to have gone offline in many countries worldwide.

Google Cloud also provides the backend for Discord, Snapchat and Vimeo, among other popular apps, all of which were affected by the network congestion issues faced by Google's Cloud Networking and Google Compute Engine.

Google Cloud outage map, source: Down Detector

At one point, the entire G Suite Dashboard lit up red, with almost every service facing an outage, showing just how severe the issue was.

"We will conduct an internal investigation of this issue and make appropriate improvements to our systems to help prevent or minimize future recurrence. We will provide a detailed report of this incident once we have completed our internal investigation. This detailed report will contain information regarding SLA credits," the company said in a statement after resolving the issue.

YouTube also tweeted to notify users as it identified and fixed the issue.

Service outages like this occasionally reveal how critical the cloud has become for the modern internet and computing architecture. Entire companies often rely on a single cloud service provider for their operations because of the many benefits it provides, such as business scaling, monitoring and resource management. But the service's uptime and availability, generally touted as a plus, can also suffer, either because the cloud infrastructure runs into problems or because of a lack of contingency planning on the third party's end.

These breakdowns lead to huge inconvenience for end users, and the resulting downtime can cost businesses millions of dollars in losses.


 
"Service outages like this occasionally reveal how critical the cloud has become for the modern internet and computing architecture."

Which is why in the beginning, I said I would never rely on these services. I still see no reason to change my mind.
 
"Service outages like this occasionally reveal how critical the cloud has become for the modern internet and computing architecture."

Which is why in the beginning, I said I would never rely on these services. I still see no reason to change my mind.

It's still backed up and your files are safe and secure. Having your storage locally is not the best solution, but neither is having it just in the cloud.

Dropbox + files locally stored on RAID 5 and, if you are really worried, push everything to Glacier for cold storage.

Cloud is good.
 
"Service outages like this occasionally reveal how critical the cloud has become for the modern internet and computing architecture."

Which is why in the beginning, I said I would never rely on these services. I still see no reason to change my mind.

Availability is going to be an issue no matter where you choose to store your data...there is always going to be the risk of outages.
 
Wait a minute...People were unable to use their thermostats because of a Google network issue? That's ridiculous. I actually was considering getting a Nest, but it apparently has no backup mode if Google's service goes down.

Just wait until they EOL the service for older models, which would effectively render the Nest inoperable. Oh, I'm sure Google will offer a discount to people who own them to buy the "New and Improved" model, but they will still profit from planned obsolescence. It's a move worthy of Apple or Intel.
 
Eh, I didn't even notice apparently.

I wonder what it was that caused it...
 
Wait a minute...People were unable to use their thermostats because of a Google network issue? That's ridiculous. I actually was considering getting a Nest, but it apparently has no backup mode if Google's service goes down.

Just wait until they EOL the service for older models, which would effectively render the Nest inoperable. Oh, I'm sure Google will offer a discount to people who own them to buy the "New and Improved" model, but they will still profit from planned obsolescence. It's a move worthy of Apple or Intel.

Nest is dead, baby. Avoid!
 
"Service outages like this occasionally reveal how critical the cloud has become for the modern internet and computing architecture."

Which is why in the beginning, I said I would never rely on these services. I still see no reason to change my mind.

It's still backed up and your files are safe and secure. Having your storage locally is not the best solution, but neither is having it just in the cloud.

Dropbox + files locally stored on RAID 5 and, if you are really worried, push everything to Glacier for cold storage.

Cloud is good.

RAID5 is horrible and I hate seeing it used everywhere, because I've seen it fail so many times (and heard of it happening even more). Scenario: A hard drive fails in a RAID5 array. Someone goes to swap it, and a resync is automatically started, pushing the remaining hard drives in the array to 100% utilization for a number of hours (on average around 8, depending on speed and capacity). During that time, another hard drive that's close to failing dies, and the entire array is lost. Customer is in tears because that was their only copy of all of their company's data.
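
To put a rough number on that resync window, here is a back-of-the-envelope sketch in Python. It assumes the rebuild is limited only by how fast the replacement drive can be written sequentially; the function name, the 4 TB capacity and the 140 MB/s rate are illustrative assumptions, not figures from the post above, and real rebuilds under load usually run slower.

```python
def rebuild_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Estimate hours needed to rewrite one full drive at a sustained rate.

    Assumption: the rebuild is purely sequential and nothing else
    competes for the disks, so this is a best-case figure.
    """
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB
    seconds = capacity_mb / throughput_mb_s
    return seconds / 3600


# Hypothetical 4 TB drive rebuilt at a sustained 140 MB/s.
print(f"{rebuild_hours(4, 140):.1f} hours")  # prints "7.9 hours"
```

Under those assumptions the surviving drives spend about eight hours at full load, which is the window in which a second failure takes out a RAID5 array.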

I hate using the cloud as a backup solution. Have an on-site NAS, RAID6 or 10, and an off-site NAS as geographically far away as you can justify. That way you have three copies of your data (the third is your servers/computers/wherever the data originally came from). Just stagger the backups (I do daily to the on-site NAS, weekly/monthly to the off-site) so that if bad data gets copied to your first NAS, there's a low chance it'll get copied to your second before you find out and break the replication. That's the cheapest solution you can get that's solid; obviously you can spend more money and get more protection.
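
A minimal Python sketch of that stagger, for illustration only. The target names `onsite-nas` and `offsite-nas` and the choice of Sunday for the weekly run are assumptions made up here; the actual copy step (rsync, a NAS vendor tool, etc.) is deliberately left out.

```python
from datetime import date


def backup_targets(day: date) -> list[str]:
    """Return which backup targets should receive a copy on the given day.

    Assumed policy: the on-site NAS is updated daily, the off-site NAS
    only once a week (Sunday here), so bad data has up to a week to be
    noticed before it reaches the second copy.
    """
    targets = ["onsite-nas"]      # hypothetical name, daily target
    if day.weekday() == 6:        # Monday is 0, so 6 is Sunday
        targets.append("offsite-nas")  # hypothetical name, weekly target
    return targets


print(backup_targets(date(2024, 1, 7)))  # a Sunday
print(backup_targets(date(2024, 1, 8)))  # a Monday
```

The longer the gap between the two schedules, the more time there is to catch corruption before it propagates, at the price of an older off-site copy.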
 
RAID5 is horrible and I hate seeing it used everywhere, because I've seen it fail so many times (and heard of it happening even more). Scenario: A hard drive fails in a RAID5 array. Someone goes to swap it, and a resync is automatically started, pushing the remaining hard drives in the array to 100% utilization for a number of hours (on average around 8, depending on speed and capacity). During that time, another hard drive that's close to failing dies, and the entire array is lost. Customer is in tears because that was their only copy of all of their company's data.

I hate using the cloud as a backup solution. Have an on-site NAS, RAID6 or 10, and an off-site NAS as geographically far away as you can justify. That way you have three copies of your data (the third is your servers/computers/wherever the data originally came from). Just stagger the backups (I do daily to the on-site NAS, weekly/monthly to the off-site) so that if bad data gets copied to your first NAS, there's a low chance it'll get copied to your second before you find out and break the replication. That's the cheapest solution you can get that's solid; obviously you can spend more money and get more protection.

I suppose if you are offering this as a service that's probably more ideal. Personally, I never had any problem at all with my setup. I have ALL my client web applications on my web server and S3 / Glacier. That's pretty much enough (with the personal setup as mentioned previously).

Remember, cloud companies have their own redundancies in place, and the whole point is to cut costs and minimize manual work on your end.

Otherwise, I might as well become a host and I will never be able to compete effectively for the time invested / money in.
 