Tech Journal - GCloud, Disk Problem and More
Created on: 09 Jul 23 09:25 +0700 by Son Nguyen Hoang in English
My book server hit a serious bug last night: the server crashed and could not recover automatically. Why did this happen, and how did I resolve it? I wrote this note to summarize the steps I took to fix the issue.
1. Finding the problem: SSH to the VM
– I tried to connect with the default SSH-in-browser tool in Google Cloud. Unfortunately, the connection failed multiple times.
– I tried the built-in troubleshoot button in SSH-in-browser, but the troubleshooter got stuck at the “connectivity” test.
– After a lot of research, I tried connecting with gcloud compute ssh, following this tutorial:
gcloud compute ssh --zone "ZONE_NAME" "VM_NAME" --project "PROJECT_NAME"
– The connection failed. I ran the command again, this time adding the --troubleshoot flag to check for issues.
– It seems the problem was that the disk had reached its limit. The troubleshooter suggested increasing the disk size.
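As a rough sketch, the "try SSH, then troubleshoot on failure" steps above could be wrapped in one small shell function. The function name `ssh_or_troubleshoot` and the example VM, zone, and project values are hypothetical; the only gcloud features assumed are the `--command` and `--troubleshoot` flags of `gcloud compute ssh`.

```shell
# Hypothetical helper: try a plain SSH connection first and run the
# built-in troubleshooter only if that connection fails.
# Usage: ssh_or_troubleshoot VM_NAME ZONE_NAME PROJECT_NAME
ssh_or_troubleshoot() {
  local vm="$1" zone="$2" project="$3"
  # "--command true" runs a no-op on the VM, so this only tests connectivity.
  if ! gcloud compute ssh "$vm" --zone "$zone" --project "$project" --command "true"; then
    # The plain connection failed: ask gcloud to diagnose the problem.
    gcloud compute ssh "$vm" --zone "$zone" --project "$project" --troubleshoot
  fi
}
```

For example, `ssh_or_troubleshoot my-vm us-central1-a my-project` would only reach the troubleshooter when the first connection attempt returns a non-zero exit code.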
2. Confirming the problem
– I needed to confirm that the disk really had reached its limit.
– I checked online, and it turns out this problem is fairly common: a disk that runs out of space can prevent users from connecting via SSH.
– Following one suggestion, I created a snapshot of the current boot disk, both as a backup and to check the current disk usage.
– The disk usage (judging by the snapshot size) was around 15 GB, which did not exceed the 20 GB limit.
– However, it seems that the snapshot size is not equal to the actual disk usage.
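The backup step above might look roughly like the following. This is only a sketch: the function name `backup_disk` and the timestamped snapshot naming scheme are assumptions, not something taken from the original steps. It creates a snapshot and then prints the provisioned size of the disk in GB, which is the limit to compare against.

```shell
# Hypothetical helper: snapshot a disk before touching it, then print
# the disk's provisioned size in GB.
# Usage: backup_disk DISK_NAME ZONE_NAME PROJECT_NAME
backup_disk() {
  local disk="$1" zone="$2" project="$3"
  # Assumed naming scheme: <disk>-backup-<timestamp>, e.g. disk-a-backup-20230709-092500
  local snap="${disk}-backup-$(date +%Y%m%d-%H%M%S)"
  gcloud compute disks snapshot "$disk" \
    --snapshot-names "$snap" --zone "$zone" --project "$project" || return 1
  # sizeGb is the provisioned disk size, not the space actually in use.
  gcloud compute disks describe "$disk" \
    --zone "$zone" --project "$project" --format "value(sizeGb)"
}
```

Neither number tells you how full the filesystem is; that can only be checked from a machine that has the disk mounted.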
3. Creating a debugging VM
– Shut down the current virtual machine.
– Detach the current disk (Disk A); in my case it is also the boot disk.
– Create a new VM.
– The new VM gets its own boot disk; attach Disk A to it as an additional disk.
– Connect to the new VM over SSH and mount Disk A to a new folder to inspect the old disk.
– Seen from the new VM, the old disk was indeed full (19.9 GB). It looks like the operating system and default files take up about 5 GB, which the snapshot size did not reflect.
– From the new VM, we can remove some files, increase the disk size (I increased it from 20 GB to 30 GB), and re-attach the disk to the old VM.
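The gcloud side of these steps can be sketched as two functions: one that moves Disk A to a rescue VM, and one that resizes it and moves it back. The function names and all argument values are hypothetical, and the rescue VM is created with gcloud's default settings, which is an assumption you may want to change. The subcommands used (`instances stop`, `detach-disk`, `create`, `attach-disk`, `start`, and `disks resize`) are standard gcloud commands.

```shell
# Hypothetical helper: move a full boot disk from the broken VM to a rescue VM.
# Usage: move_disk_to_rescue_vm OLD_VM DISK RESCUE_VM ZONE
move_disk_to_rescue_vm() {
  local old_vm="$1" disk="$2" rescue_vm="$3" zone="$4"
  # Stop the broken VM so that its boot disk can be detached.
  gcloud compute instances stop "$old_vm" --zone "$zone" || return 1
  gcloud compute instances detach-disk "$old_vm" --disk "$disk" --zone "$zone" || return 1
  # The rescue VM gets its own default boot disk.
  gcloud compute instances create "$rescue_vm" --zone "$zone" || return 1
  # Attach the old disk as an additional (non-boot) disk.
  gcloud compute instances attach-disk "$rescue_vm" --disk "$disk" --zone "$zone"
}

# Hypothetical helper: resize the disk and give it back to the old VM as boot disk.
# Usage: return_disk_to_old_vm OLD_VM DISK RESCUE_VM ZONE NEW_SIZE   (e.g. 30GB)
return_disk_to_old_vm() {
  local old_vm="$1" disk="$2" rescue_vm="$3" zone="$4" new_size="$5"
  gcloud compute instances detach-disk "$rescue_vm" --disk "$disk" --zone "$zone" || return 1
  # --quiet skips the interactive confirmation prompt.
  gcloud compute disks resize "$disk" --size "$new_size" --zone "$zone" --quiet || return 1
  # --boot re-attaches the disk as the boot disk of the old VM.
  gcloud compute instances attach-disk "$old_vm" --disk "$disk" --boot --zone "$zone" || return 1
  gcloud compute instances start "$old_vm" --zone "$zone"
}
```

Unmount the disk inside the rescue VM before calling `return_disk_to_old_vm`. Each step stops the sequence on failure, so a half-finished move is easier to spot.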
4. A small issue
– The first time, I only increased the disk size, and I still couldn’t SSH to the old VM.
– After I also deleted some files, I could SSH again, and the server ran normally again.
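To decide which files to delete, it helps to see how full the filesystem is and what takes up the most space. Here is a minimal sketch using only `df`, `du`, `sort`, and `head`; the function names `usage_percent` and `largest_entries` are hypothetical, and the mount path in the usage note is just an example.

```shell
# Print the used percentage (an integer, without "%") of the filesystem
# that holds the given path.
# Usage: usage_percent PATH
usage_percent() {
  # df -P prints one header line and one data line; column 5 is "Capacity".
  df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Print the N largest entries directly under DIR (sizes in KB), biggest first.
# Hidden files (names starting with ".") are not matched by the glob.
# Usage: largest_entries DIR [N]
largest_entries() {
  local dir="$1" n="${2:-10}"
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

On the debugging VM this could be run as `usage_percent /mnt/old-disk` and `sudo sh`-style sessions aside, `largest_entries /mnt/old-disk/var 5`, then repeated one level deeper until the large files are found.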
5. Lessons learned
– Use the “Serial Console” to view the system log.
– Some useful Linux commands:
$ lsblk # list all disks and partitions
$ blkid "PARTITION_NAME" # show the filesystem type
$ sudo mount -t ext4 partition_disk disk_folder # mount the partition into a folder
– How to work with disks in a VM.
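The commands above can be combined so that the filesystem type is detected instead of hard-coded. This is a sketch only: the function names `mount_old_disk` and `show_serial_log` are hypothetical, and the device path `/dev/sdb1` and folder `/mnt/old-disk` in the usage note are assumptions that should be checked with `lsblk` first.

```shell
# Hypothetical helper: detect the filesystem type of a partition with blkid
# and mount it into a folder.
# Usage: mount_old_disk PARTITION MOUNT_DIR
mount_old_disk() {
  local part="$1" dir="$2" fstype
  # Ask blkid for the TYPE value only (for example ext4 or xfs).
  fstype="$(sudo blkid -o value -s TYPE "$part")" || return 1
  [ -n "$fstype" ] || { echo "no filesystem found on $part" >&2; return 1; }
  sudo mkdir -p "$dir" || return 1
  sudo mount -t "$fstype" "$part" "$dir"
}

# Hypothetical helper: print the last lines of a VM's serial console log
# without opening the web console.
# Usage: show_serial_log VM_NAME ZONE_NAME PROJECT_NAME
show_serial_log() {
  gcloud compute instances get-serial-port-output "$1" \
    --zone "$2" --project "$3" | tail -n 50
}
```

For example, after `lsblk` shows the attached disk as `sdb` with one partition, `mount_old_disk /dev/sdb1 /mnt/old-disk` would mount it for inspection.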