

Xen - VDI is not available




The server was unresponsive and had to be forced to shut down. Once it finally shut down, the error "VDI is not available" appeared while booting.

The VDI is somehow locked by the system, so it needs to be unlocked.
One way to do this is a toolstack restart (however, this may not work):

xe-toolstack-restart

To fix this issue on Nov. 28, 2012, I had to release the VDIs, rescan the SR, and then re-attach them.

First, log in via SSH to the pool master (xen60-R710-A) to get the VDI list:

xe vdi-list

I wrote this to a file as follows:

xe vdi-list > VDIs.txt

When I did this, though, I noticed that the snapshot VDIs had the same names as the live ones, so I renamed the OS and Data VDIs of the live VM so that they stood out. Once that was done, I ran the above command again. In the file I found the two UUIDs of the VDIs for the bad VM; these were the ones I would need to release.
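For reference, the rename can also be done from the CLI with vdi-param-set (the UUID and name below are placeholders, not the actual values from this incident):

xe vdi-param-set uuid=<VDI_UUID> name-label="myvm-OS-LIVE"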

I recorded the names of the VDIs and also their sizes, since I knew that when I restored them only the size would be shown; the names would be lost.
Through all of this I also tracked which SR the VDIs were located on, since once they were released I would need to know where to look for them.
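A quick way to capture all of this for one VDI (name-label, virtual-size, and the sr-uuid it lives on) is vdi-param-list (placeholder UUID):

xe vdi-param-list uuid=<VDI_UUID>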

The next step was to forget the VDIs. This is a nail-biter, as the VDIs essentially disappear, and we will need to re-scan the SR to bring them back.

xe vdi-forget uuid=UUID_number_from_the_list

I did this twice to free up both the OS and Data VDIs from the VM. Each VDI has its own unique UUID.
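As a sanity check, listing a forgotten VDI by its UUID should return nothing once the forget has gone through (placeholder UUID):

xe vdi-list uuid=<VDI_UUID>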

I then went to the storage location (QNAP 2-3 in this case) and did a re-scan of the SR. The two VDIs re-appeared, but without names. Based on the sizes, I was able to determine which was the OS VDI and which was the Data VDI, and I renamed them appropriately.
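The rescan can also be done from the CLI; the SR name-label below is a placeholder for the QNAP SR, and --minimal prints just the UUID:

xe sr-list name-label="QNAP 2-3" --minimal
xe sr-scan uuid=<SR_UUID>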

Then I went to the VM's Storage tab and selected the Attach option. I did this twice to re-attach the two VDIs that were briefly forgotten above.
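From the CLI, attaching a VDI to a halted VM means creating a new VBD for it; the UUIDs and device number below are placeholders, and bootable=true should only be set on the OS disk:

xe vbd-create vm-uuid=<VM_UUID> vdi-uuid=<VDI_UUID> device=0 bootable=true mode=RW type=Disk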

I then booted the VM, and everything was back online without any issue.
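The CLI equivalent (placeholder VM name):

xe vm-start vm=<VM_name>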

Note:
In this case it appeared that HA (High Availability) was the culprit, as there were complaints about HA not being able to fail over to a different server (we will need another R710 to balance the load properly). HA has been disabled for now to avoid further issues.
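Disabling HA for the pool is a single command on the pool master:

xe pool-ha-disable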


