Migrating a VirtualBox Windows installation

I have been using Linux as my primary OS since 1999ish, except for a brief period early in the history of 67 Bricks when I had an iMac. Whenever I have used Windows it has invariably been in some kind of virtualised form; this was necessary in the iMac days when I was developing .NET applications in Visual Studio, but these days I work solely on Scala / Play projects developed in IntelliJ in Linux. Nevertheless, I have found it convenient to have an installation of Windows available for the rare instances where it’s actually necessary (for example, to connect to someone’s VPN for which no Linux client is available).

My Windows version of choice is the venerable Windows 7. This is the last version of Windows which can be configured to look like “proper Windows” as I see it, by disabling the horrible Aero abomination. I tried running the Windows 10 installer once out of morbid curiosity; it started talking to me, so I stopped running it. I am old and set in my ways, and I feel strongly that an OS does not need to talk to me.

So anyway, I had a VirtualBox installation of Windows 7 and, because I am an extraordinarily kind and generous soul, I had given it a 250G SATA SSD all to itself using VirtualBox’s raw hard disk support.

Skip forward a few years, and I decided I would increase the storage in my laptop by replacing this SSD with a 2TB SSD since such storage is pretty cheap these days. The problem was – what to do with the Windows installation? I didn’t fancy the faff of reinstalling it and the various applications, and in fact I wasn’t even sure this would be possible given that it’s no longer supported. In any case, I didn’t want to give Windows the entire disk this time, I wanted a large partition available to Linux.

It turns out that VirtualBox’s raw disk support will let you expose specific partitions to the guest rather than the whole disk. The problem is that with a full raw disk, the guest sees the boot sector (containing the partition table and probably the bootloader), whereas you presumably don’t want the guest to see the boot sector if you’re only exposing certain partitions. How does this work?

The answer is that when you create a raw disk image and specify specific partitions to be available, in addition to the sda.vmdk file, you also get a sda-pt.vmdk file containing a copy of the boot sector, and this is presented to the guest OS instead of the real thing. Here, then, are the steps I took to clone my Windows installation onto partitions of the new SSD, keeping a partition free for Linux use, and ensuring Windows still boots. Be warned that messing about with this stuff and making a mistake can result in possibly irrecoverable data loss!

Step 1 – list the partitions on the current drive

My drive presented as /dev/sda

$ fdisk -x /dev/sda
Disk /dev/sda: 232.89 GiB, 250059350016 bytes, 488397168 sectors
Disk model: Crucial_CT250MX2
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xc3df4459

Device     Boot  Start       End   Sectors Id Type            Start-C/H/S   End-C/H/S Attrs
/dev/sda1  *      2048    206847    204800  7 HPFS/NTFS/exFAT     0/32/33   12/223/19    80
/dev/sda2       206848 488394751 488187904  7 HPFS/NTFS/exFAT   12/223/20 1023/254/63 

Step 2 – create partitions on the new drive

I put the new SSD into a USB case and plugged it in, whereupon it showed up as /dev/sdb

Then you need to do the following (an example fdisk session is sketched below the list):

  1. Use fdisk /dev/sdb to edit the partition table on the new drive
  2. Create two partitions with the same number of sectors (204800 and 488187904) as the drive you are hoping to replace, and then you might as well turn all of the remaining space into a new partition
  3. Set the types of the partitions correctly – the first two should be type 7 (HPFS/NTFS/exFAT), if you’re creating another partition it should be type 83 (Linux)
  4. Toggle the bootable flag on the first partition
  5. Use the “expert” mode of fdisk to set the disk identifier to match that of the disk you are cloning; in my case this was 0xc3df4459 – Googling suggested that Windows may check for this and some software licenses may be tied to it.
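
For reference, here is a sketch of the interactive fdisk session (the single-letter commands are those of util-linux fdisk; exact prompts and defaults may differ with your version):

$ fdisk /dev/sdb
  n        # new partition 1: first sector 2048, last sector 206847
  n        # new partition 2: first sector 206848, last sector 488394751
  n        # new partition 3: accept the defaults to use the remaining space
  t        # change partition 1's type to 7 (HPFS/NTFS/exFAT)
  t        # change partition 2's type to 7 as well
  t        # change partition 3's type to 83 (Linux)
  a        # toggle the bootable flag on partition 1
  x        # switch to expert mode
  i        # set the disk identifier to 0xc3df4459
  r        # return to the main menu
  w        # write the new partition table and exit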

So now we have /dev/sdb looking like this:

$ fdisk -x /dev/sdb
Disk /dev/sdb: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: 500SSD1         
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xc3df4459

Device     Boot     Start        End    Sectors Id Type            Start-C/H/S End-C/H/S Attrs
/dev/sdb1  *         2048     206847     204800  7 HPFS/NTFS/exFAT     0/32/33 12/223/19    80
/dev/sdb2          206848  488394751  488187904  7 HPFS/NTFS/exFAT   12/223/20 705/42/41 
/dev/sdb3       488394752 3907029167 3418634416 83 Linux             705/42/42 513/80/63

Note that the first two partitions are an exact match in terms of start, end and sector count. I think we can ignore the C/H/S (cylinder / head / sector) values.

Step 3 – clone the data

Taking great care to get this right, we can use the dd command, telling it to report its status from time to time so we know something is happening. There are two partitions to clone:

$ dd if=/dev/sda1 of=/dev/sdb1 bs=1G status=progress
... some output (this is quite quick as it's only 100M)
$ dd if=/dev/sda2 of=/dev/sdb2 bs=1G status=progress
... this took a couple of hours to copy the partition, the bottleneck probably being the USB interface

Step 4 – create (temporarily) a VirtualBox image representing the new disk

There are various orders in which one could do the remaining steps, but I did it like this, while I still had /dev/sda as the old disk and the new one plugged in via USB as /dev/sdb:

$ VBoxManage internalcommands createrawvmdk -filename sdb.vmdk -rawdisk /dev/sdb -partitions 1,2

This gives two files: sdb.vmdk and sdb-pt.vmdk. To be honest, I got to this point not knowing how the boot sector on the disk would show up, but having done it I was able to verify that sdb-pt.vmdk appeared to be a copy of the first sector of the real physical disk (at least the first 512 bytes of it). I did this by comparing the output of xxd sdb-pt.vmdk | head -n 32 with that of xxd /dev/sdb | head -n 32; xxd shows a hex dump, and the first 32 lines happen to correspond to the first 512 bytes. In particular, you can see at offset 0x1be the start of the partition table, which is four 16-byte entries ending at 0x1fd, followed by the magic signature 55aa at 0x1fe. The disk identifier can also be seen as the 4 bytes starting at 0x1b8. Note that everything leading up to the disk identifier is all zeroes.
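
For example, a quick way to spot-check this (assuming GNU cmp and xxd are installed):

$ cmp -n 512 sdb-pt.vmdk /dev/sdb && echo "first 512 bytes match"
$ xxd -s 0x1b8 -l 4 sdb-pt.vmdk      # the 4-byte disk identifier at offset 0x1b8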

Step 5 – dump the boot sector of the existing disk

At this point, we have /dev/sda with the boot sector containing the Windows bootloader, we have a freshly initialised /dev/sdb containing a copy of the Windows partitions and a boot sector that’s empty apart from the partition table and disk identifier, and we have the sdb-pt.vmdk file containing a copy of that empty boot sector. We have also arranged for the new disk to have the same identifier as the old disk.

What is needed now is to create a new sdb-pt.vmdk file containing the bootloader code from /dev/sda followed by the partition table from /dev/sdb. There are 444 bytes that we need from the former: the 440-byte bootstrap code area plus the 4-byte disk identifier, which we have already made identical on both disks. So we can do something like this:

$ head -c 444 /dev/sda > sdb-pt-new.vmdk
$ tail -c +445 sdb-pt.vmdk >> sdb-pt-new.vmdk

We can confirm that the new file has the same size as the original, and we can also use xxd to confirm that we’ve got the bootloader code followed by everything else as it was (including the new partition table).
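
As a sketch, a couple of quick checks (the process substitution assumes bash):

$ ls -l sdb-pt.vmdk sdb-pt-new.vmdk                                # the sizes should be identical
$ cmp <(tail -c +445 sdb-pt.vmdk) <(tail -c +445 sdb-pt-new.vmdk)  # no output means everything after byte 444 is unchanged
$ xxd sdb-pt-new.vmdk | head -n 32                                 # bootloader code, then the identifier and partition table as before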

Step 6 – the switch

We’re almost done. All that’s left to do is:

  1. Replace the old SATA SSD with the new 2TB SSD
  2. Run VirtualBox – do not start Windows but instead remove the existing hard drive attached to it (which is the “full raw disk” version of sda.vmdk), and use the media manager to delete this.
  3. Quit VirtualBox
  4. Recreate the raw disk with partitions file but for /dev/sda (which is what the new drive is): VBoxManage internalcommands createrawvmdk -filename sda.vmdk -rawdisk /dev/sda -partitions 1,2
  5. The previous command will have created the sda-pt.vmdk boot sector image from the drive, whose bootstrap code area will again be full of zeroes. Overwrite this with the sdb-pt-new.vmdk that we made earlier (see the sketch after this list), obviously ensuring we preserve the original sda-pt.vmdk name and the file permissions.
  6. Start VirtualBox, add the newly created sda.vmdk image to the media manager, and then attach it as the hard drive for the Windows VM
  7. Start the Windows VM and hope for the best
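
The overwrite in step 5 can be done with a plain cp onto the existing file, which keeps the destination’s name, owner and permissions (a sketch; paths will vary with your setup):

$ ls -l sda-pt.vmdk                  # note the owner and permissions VirtualBox created it with
$ cp sdb-pt-new.vmdk sda-pt.vmdk     # overwrites the contents in place, preserving the file's metadata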

I was very gratified to find that this worked more-or-less first time. In truth, I originally just tried replacing the old sda.vmdk with the new one, which gave the error “{5c6ebcb7-736f-4888-8c44-3bbc4e757ba7} of the medium ‘/var/daniel/virtualbox/Windows 7/sda.vmdk’ does not match the value {e6085c97-6a18-4789-b862-8cebcd8abbf7} stored in the media registry (‘/home/daniel/.config/VirtualBox/VirtualBox.xml’)”. It didn’t seem to want to let me remove this missing drive from the machine using the GUI, so I edited the Windows 7.vbox file to remove it, then removed it from the media library and added the replacement, which I was then able to add to the VM.

Organising Git Pull Requests

Recently we had an interesting dev forum on Git workflows. Git is the de facto source control tool of choice for software development. It is powerful and flexible, and teams have a wide range of choices to make when deciding exactly how to make use of it. This discussion originally stemmed from a Changelog podcast titled “Git your reset on”, based on the blog post “Git Organized” by Annie Sexton.

To briefly summarise the above: if we imagine a situation where a merge to main leads to a production deployment and then an issue is found in production, how do we roll back? We look in the git history, find the commit that introduced the bug, revert it and then redeploy to production. But oh no! This revert changed a number of source files beyond the scope of the bug and ended up introducing a new bug.

The proposed solution to this is to ensure a sensible, clean history of commits is used within a branch. Annie Sexton suggests using a feature branch where you commit regularly with useless messages like WIP until you are ready to submit a pull request. You then run git reset origin/HEAD, which presents you with all the changes you have made and how they differ from main; you then progressively add files and make commits with sensible messages to build up a more coherent change.
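
As a rough sketch of that workflow (the branch name, file paths and commit messages are illustrative):

$ git checkout feature/my-change
$ git fetch origin
$ git reset origin/HEAD              # keep the working tree, drop the WIP commits
$ git add -p                         # stage one coherent set of changes
$ git commit -m "Extract validation into its own module"
$ git add src/api/Endpoints.scala    # stage the next logical change
$ git commit -m "Use the new validation module in the API layer"
$ git push --force-with-lease        # the branch history has been rewritten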

This approach has a number of advantages we discussed:

  • It provides a neat history of compartmentalised changes
  • Reviewers can then follow a structured story of commits about how the developer arrived at their solution
  • Irrelevant changes are excluded
  • It ensures that all of the included changes are relevant only to the task at hand

Disadvantages of this approach may include:

  • Bad approaches that were tried and abandoned are not in the history
  • Changes won’t always compartmentalise neatly file by file, meaning we would need to commit specific changes within a file; some tools do a bad job of supporting this use case

It turns out that a lot of developers at 67 Bricks use a form of history rewriting when developing code, but no one used the specific approach proposed above. git commit --amend is a very useful command to tweak the previous commit. Some (including myself) use git rebase -i as a way of rebuilding the commit history, though this has a limitation in cases where you may want to split a commit up. Finally, others create new branches and build up clean commits on the new branch.
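
For example (a sketch of the two commands mentioned above; the file name is illustrative):

$ git add forgotten_file.scala       # stage something that belongs in the last commit
$ git commit --amend --no-edit       # fold it into the previous commit, keeping its message
$ git rebase -i HEAD~3               # interactively reword, reorder or squash the last three commits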

Is force pushing a force for good?

This question really split the developers: some see using git push --force as an evil that indicates you’re using git wrongly and that should only be used as a last resort. Others really like the idea of force pushing, but only on branches you’ve been developing on and have yet to create a merge request for.

Squashing?

A workflow some developers have seen before involves using the --squash flag when merging into main. This can create a nice history where each commit neatly maps to a single ticket, but the general view was that this causes the loss of helpful information from the git history.
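
As a sketch of what a squash merge looks like from the command line (the branch name and commit message are illustrative):

$ git checkout main
$ git merge --squash feature/my-change   # stages the combined changes without committing
$ git commit -m "PROJ-123: add the new widget"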

Should the software be valid at every commit?

Some argued that this is a sensible thing to strive for: having confidence that you can revert to any commit and the software will still work is a nice place to be. However, others criticised this as tricky to achieve in practice. Ideally the tests would all be valid and pass too, but this goes against how some developers work; they create a failing test to show the problem or new feature, commit it, and then build out code to make the test green. Others claim that tests should change in the same commit as the code changes, to make it obvious that these changes belong together.

Tools used by Developers

At 67 Bricks, we try to avoid specifying which precise tool a developer should use for their craft. Different developers prefer different approaches when it comes to using git: some are more comfortable with their IDE’s plugins (e.g. IntelliJ has good git integration which can add specific lines to commits) while others prefer using the command line.

To conclude, specific approaches and tools used vary, but everyone agreed on the benefits of striving for a clean, structured git history.

Terraform tricks to cope with conditionally created resources

Terraform is a great tool and we use it extensively to spin up resources in AWS. It’s very easy to get started: the documentation is great, and once you have built yourself a development environment it’s just a matter of changing a few config settings and you have yourself a staging and prod environment too.

It gets more tricky when the different environments are not exact copies of each other. For example, in development you might want to create an SNS topic to use for testing, whereas in staging and production you might want to use an external one.

Creating resources in some environments and not others can be controlled using the “count” parameter on a resource. So in the case of the SNS topic we only want to create in development, we can set an sns_count variable to be 1 in development and 0 elsewhere and use this to only create the topic in development.
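
For example, the variable might be declared with a default of 0 and overridden per environment (a sketch; the variable name matches the one used below, but how you supply per-environment values will depend on your setup):

variable "sns_count" {
  description = "Create the SNS topic in this environment (1) or rely on an external one (0)"
  default     = 0
}

# in the development environment's tfvars file
sns_count = 1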

But suppose we want to use the ARN of the created topic if it exists, or a fixed ARN of an external topic otherwise. Now it becomes even harder. Suppose that we are creating the SNS topic like this:

resource "aws_sns_topic" "sns_topic" {
  name = "our_topic"
  count = "${var.sns_count}"
}

and we have a variable topic_arn that will contain a fixed topic ARN to use if we have not created one. Terraform does not have fully-fledged conditional logic, but it does have a simple ternary operator, and there is a coalesce function which takes the first non-empty string from a list. So you might think that the following string interpolation would work:

"${coalesce(aws_sns_topic.sns_topic.arn, var.topic_arn)}"

Unfortunately it doesn’t, because in the case where we do not create the topic, Terraform complains that there is no arn attribute.

So, to get around this we use this syntax instead:

"${coalesce(join("", aws_sns_topic.sns_topic.*.arn), var.topic_arn)}"

The .*.arn syntax returns a list of the ARNs of all the created topics, of which there will be either one or none. So we flatten this to a string with the join function and finally it works as we want. Phew!
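
For example, the resulting ARN can then be handed on to whatever needs it, such as an output (the output name here is illustrative):

output "effective_topic_arn" {
  value = "${coalesce(join("", aws_sns_topic.sns_topic.*.arn), var.topic_arn)}"
}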