Man Accidentally Deletes His Entire Company With One Wrong Command
If you didn’t know already, / represents
root. And running ‘rm -rf /’ will delete the root directory and all of its
contents. In the Linux file hierarchy, root contains everything. Deleting
root means your system is gone, forever.
No wonder this is compared to drunk driving in the Linux world.
Sh*t happens
But shit happens in the IT world. And
apparently it happened to the hapless SysAdmin Marco Marsala, who runs
a web hosting company serving over 1500 customers.
According to a question posted on Server Fault a few
days back, Marsala ran a Bash script that had the following
command in it: rm -rf {foo}/{bar}. But it turned out to be ‘rm -rf /’
because the variables were undefined, and the inevitable happened.
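To see why undefined variables are so dangerous here, consider a minimal sketch. It assumes the script referenced shell variables roughly as ${foo}/${bar}; the exact syntax of the original script is not known, and foo and bar are just the placeholder names from the post. The sketch only echoes the command, so nothing is deleted.

```shell
#!/usr/bin/env bash
# foo and bar are the placeholder names from the post; they are
# deliberately left unset here to mimic the bug (an assumption about
# how the original script failed).
unset foo bar

# An unset variable expands to an empty string, so only the slash remains.
target="${foo}/${bar}"

# echo is used instead of running the command, so nothing is deleted.
echo "rm -rf ${target}"    # prints: rm -rf /
```

Bash raises no error by default when an unset variable is expanded, which is how a harmless-looking path can silently turn into /.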
In Marsala’s own words:
I run a small hosting provider with more or less 1535 customers and I use Ansible to automate some operations to be run on all servers. Last night I accidentally ran, on all servers, a Bash script with a rm -rf {foo}/{bar} with those variables undefined due to a bug in the code above this line. All servers got deleted and the offsite backups too because the remote storage was mounted just before by the same script (that is a backup maintenance script). How I can recover from a rm -rf / now in a timely manner?
Oh, poor guy!! What did you just do?
What next?
What next? This is what Marsala wanted to know. Is there a way to recover from ‘rm -rf /’?
The chances of recovering all the data
after an rm -rf / are slim. No wonder the post started getting sarcastic
(but honest) comments like:
If you really don’t have any backups I am sorry to say but you just nuked your entire company
Another one went like:
You’re going out of business. You don’t need technical advice, you need to call your lawyer.
A few people suggested shutting
everything down, not overwriting anything, and using data recovery tools to get
at least some data back.
And it seems that this did work to a large extent for Marsala, as he later mentioned: “luckily we recovered almost all data”.
Lessons to learn
Although some people are speculating that it’s a hoax, there are still a few lessons here for all of us.
- Back up everything. If it’s a professional server, have multiple, offline backups
- Don’t take a random tool or script from the internet and run it directly on a production machine
- Have test machines identical to production for testing out new stuff without risking the production system
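In the same spirit, a script that deletes things can also defend itself against an empty path. The following is only a sketch under stated assumptions: safe_rm_rf is a hypothetical helper and DRY_RUN is a hypothetical variable, neither of which comes from Marsala's script or from any standard tool. The helper refuses empty or slash-only paths and defaults to printing what it would remove.

```shell
#!/usr/bin/env bash
# safe_rm_rf is a hypothetical helper for illustration only.
safe_rm_rf() {
    local target="${1:-}"

    # Refuse an empty argument.
    if [ -z "$target" ]; then
        echo "refusing: empty path" >&2
        return 1
    fi

    # Refuse a path made up only of slashes (/, //, ...).
    case "$target" in
        *[!/]*) ;;   # contains at least one non-slash character
        *) echo "refusing: root path" >&2; return 1 ;;
    esac

    # DRY_RUN (hypothetical) defaults to 1, so deletion must be opted into.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would remove: $target"
    else
        rm -rf -- "$target"
    fi
}
```

Calling safe_rm_rf "${foo}/${bar}" with both variables unset returns an error instead of touching the root directory. Bash can also abort on unset variables through set -u or the ${foo:?} expansion, which would stop such a script before the delete line is reached.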
Anything to add to this scary incident?