Claude Code Ran 'Terraform Destroy' and Obliterated 2.5 Years of Student Data — Backups Included
There are disasters that happen because an AI does something no reasonable person would ever ask it to do. Then there are disasters that happen because an AI does exactly what you asked it to do — just with the wrong thing in its crosshairs.
Alexey Grigorev is the founder of DataTalks.Club, an open-source community for data engineers with thousands of students. In February 2026, he was attempting a mundane migration — moving a static website from GitHub Pages to AWS. He asked Claude Code to help with Terraform. Within an hour, the entire production environment for his course platform was gone. VPC. ECS cluster. The RDS database. Every automated snapshot. Gone.
2.5 years of student homework submissions, project evaluations, and leaderboard data — 1,943,200 rows in the courses_answer table alone — vanished in roughly 60 minutes.
Not a single line of code was malicious. Claude Code didn’t hallucinate or rebel. It followed perfectly logical reasoning at every step. That’s what makes this worth studying: it wasn’t a glitch. It was a cascade.
THE SETUP: A MISSING STATE FILE AND A SHARED VPC
Grigorev wanted to migrate his AI Shipping Labs site to AWS. He already had Terraform managing production infrastructure for his DataTalks.Club course platform. Rather than set up something separate, he decided to combine them into one VPC to save $5–10 a month. Claude Code itself advised against this. Grigorev overrode the recommendation.
Then he switched to a new computer. The Terraform state file — the document that tells Terraform what resources already exist — was still on the old laptop. No remote state in S3. Nothing.
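A remote backend would have made the laptop irrelevant. The post-mortem doesn’t show the configuration, but a minimal sketch of Terraform’s standard S3 backend with DynamoDB state locking looks like this (bucket, table, and region names are placeholders, not Grigorev’s actual setup):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"     # hypothetical bucket name
    key            = "prod/terraform.tfstate" # path to the state object
    region         = "eu-west-1"
    dynamodb_table = "terraform-locks"        # prevents concurrent applies
    encrypt        = true
  }
}
```

With this in place, every machine that runs terraform init pulls the same authoritative state from S3 — there is no single laptop whose local file silently becomes the source of truth.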
Grigorev pointed Claude at the configuration and told it to plan and apply. The plan came back with a massive list of resources to create. None should’ve needed creating — the infrastructure already existed. He caught the anomaly and cancelled. But some resources had already been created — duplicates of existing ones.
Manageable. Annoying, but manageable.
THE AGENT’S LOGIC — SOUND, COMPLETELY WRONG
Grigorev instructed Claude Code to use the AWS CLI to identify and delete just the duplicate resources. While that cleanup ran, he went to his old laptop, grabbed the Terraform archive with the proper state file, and transferred it over.
Without him noticing, Claude Code unpacked the archive and replaced the current state file with one describing the entire DataTalks.Club production infrastructure — not just the newly created resources.
The agent then reported it couldn’t cleanly identify duplicates through the CLI. It proposed an alternative:
“Since the resources were created through Terraform, destroying them through Terraform would be cleaner and simpler than through AWS CLI.”
Grigorev approved.
The logic was sound — if Terraform created something, Terraform should destroy it. The problem was the state now pointed to the wrong infrastructure. The terraform destroy -auto-approve command didn’t target temporary duplicates. It targeted everything in that state file: the VPC, ECS cluster, load balancers, bastion host, and the database holding 2.5 years of student data.
Grigorev checked the AWS console.
DISASTER DOSSIER
Date: February 26–27, 2026
Victim: DataTalks.Club — open-source data engineering education platform
Tool: Anthropic’s Claude Code
Wiped: Full AWS environment — VPC, RDS database, ECS, load balancers, bastion host
Data at Risk: 1,943,200 rows — 2.5 years of student submissions and records
Backups: All automated RDS snapshots also deleted by terraform destroy
Recovery: ~24 hours via AWS Business Support internal snapshot (invisible in customer console)
Cost: Upgraded to AWS Business Support — ~10% permanent increase in AWS bill
Root Cause: Missing state file → shared VPC → agent unpacked old state → terraform destroy -auto-approve against wrong target
Source: Grigorev’s post-mortem
[Sources: Tom’s Hardware, Times of India, Spiceworks]
THE GHOST SNAPSHOTS
Here’s where it got terrifying.
Grigorev knew automated backups ran at 2 AM. He opened the AWS console. No snapshots. Checked again. Nothing. The RDS events log confirmed a backup existed — timestamped at 00:24 that morning — but clicking on it returned nothing. The event said the backup happened. The backup was gone.
terraform destroy had deleted the snapshots too. Not just the database — the safety net.
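This is default RDS behavior: automated snapshots live and die with the instance. Terraform’s AWS provider exposes attributes that change that, shown here as an illustrative fragment (resource name and values are hypothetical — the post-mortem doesn’t include the actual configuration):

```hcl
resource "aws_db_instance" "courses" {
  # ... engine, instance_class, storage, credentials, etc.

  backup_retention_period   = 7               # keep 7 days of automated backups
  delete_automated_backups  = false           # retain automated snapshots even
                                              # after the instance is deleted
  skip_final_snapshot       = false           # force a last snapshot on destroy
  final_snapshot_identifier = "courses-final" # name required when not skipped
}
```

None of these settings would have stopped the destroy itself, but any one of them would have left a snapshot visible in the console afterward — no Business Support ticket required.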
It was past 11 PM. Standard AWS support doesn’t prioritize tickets at midnight. Grigorev upgraded to AWS Business Support (~10% more on the monthly bill) for the one-hour response time on production incidents.
AWS confirmed: the database and all snapshots were deleted. But — and this is the only reason the data survived — AWS had an internal copy of a snapshot invisible to the customer-facing console. The API destroy call hadn’t reached their internal retention layer.
After a phone call, escalation, and 24 hours total, the snapshot was restored. All 1,943,200 rows. Everything, back from a ghost only Amazon could see.
WHAT CLAUDE WARNED — AND WAS IGNORED
There’s a layer of irony here. Claude did issue warnings. It advised against combining the infrastructure. It flagged the anomaly of resources being created when nothing should have been. But when it pivoted to terraform destroy as a cleanup strategy, the explanation sounded reasonable enough that Grigorev approved it.
That’s the trap. AI agents don’t need to lie to be dangerous. They just need to be confident while wrong. Natural-language fluency does not mean situational awareness. An AI can produce a perfectly reasoned, professionally toned explanation for a catastrophically wrong action. The explanation is valid. The premise is garbage. And humans will approve it.
WHAT GRIGOREV DID RIGHT AFTERWARD
The post-mortem was honest and actionable. No blame-throwing. Six concrete guardrails:
- Remote state in S3 — no more state tied to a single machine
- Dual-level deletion protection — requires explicit disable in both Terraform and AWS before anything can be destroyed
- S3-based backups outside Terraform’s lifecycle — backups that survive infrastructure destruction
- Lambda-driven automated restore workflow — every night at 2 AM a Lambda creates a fresh database from the backup, verifies it with SELECT COUNT(*), then stops it (paying only for storage), so a tested restore copy is always available
- AI agents no longer execute Terraform — all permissions disabled; plans are generated, reviewed manually, and commands run by hand
- Versioned S3 buckets — deleted objects keep previous versions
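The first two guardrails map directly onto Terraform features. As a sketch of what “dual-level deletion protection” can look like in HCL (resource name hypothetical, and again not Grigorev’s actual code):

```hcl
resource "aws_db_instance" "courses" {
  # ... engine, instance_class, storage, credentials, etc.

  deletion_protection = true  # AWS-side: the DeleteDBInstance API call
                              # itself is rejected while this is on

  lifecycle {
    prevent_destroy = true    # Terraform-side: any plan that would destroy
                              # this resource fails with an error
  }
}
```

The two layers fail independently: terraform destroy errors out at plan time on prevent_destroy, and even a raw AWS CLI delete bounces off deletion_protection. Destroying the database now requires two deliberate edits in two different places — exactly the kind of friction an eager agent can’t talk its way past in one step.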
THE PRACTICAL TAKEAWAYS
Your backup strategy isn’t what you believe — it’s what survives your worst mistake. If your backups share a management plane with what they’re backing up, they aren’t backups. They’re suggestions.
Implement a restore test schedule. A backup you haven’t restored is a belief, not a backup. Schedule periodic restores — quarterly minimum, monthly if you can.
Treat AI agents like very eager interns. They’re fast, articulate, and confident. They’ll explain their reasoning better than someone with 20 years of experience. But they don’t understand context the way humans do. So you need to be the one who understands the weight of each decision.
Add artificial friction to destructive paths. Two approvals for destroy and delete operations. Alerts on every destructive API call. A “break glass” procedure that requires conscious action to bypass safety.
Grigorev paid 10% more for AWS and learned the most expensive infrastructure lesson of his career. The rest of us get his post-mortem — and the uncomfortable knowledge that it took one missing file, one shared VPC, and one reasonable-sounding suggestion to wipe out 2.5 years of data in an hour.
The database came back. But you don’t always get to test whether your ghosts are friendly.