Unable to deploy VMs with additional disks on EGI FedCloud OpenStack sites #578

Closed
alahiff opened this issue Mar 26, 2018 · 11 comments

alahiff commented Mar 26, 2018

If I create a VM with an additional disk by including something like this in the RADL:

disk.1.type='filesystem' and
disk.1.mount_path='/disk' and
disk.1.fstype='ext4' and
disk.1.device='vdb' and
disk.1.size=40G

this works as expected on CESNET-MetaCloud (and also on Azure and Google btw). However, I can't get it to work on FedCloud sites which use OpenStack such as RECAS-BARI and IN2P3-IRES. I get an error like this:

Attempt 1: Error launching the VMs of type node to cloud ID occi6 of type OCCI. Internal Server Error
The server has either erred or is incapable of performing the requested operation.

Looking in /var/log/im/im.log I see this:

2018-03-26 14:51:02,916 - InfrastructureManager - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Launching 1 VMs of type node
2018-03-26 14:51:03,238 - CloudConnector - INFO - Getting Keystone v2 token
2018-03-26 14:51:04,500 - CloudConnector - INFO - Using tenant: EGI_fedcloud
2018-03-26 14:51:07,670 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Creating a 40 GB volume for the disk 1
2018-03-26 14:51:09,780 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Volume id 3360cfbb-3dbd-45e4-a7ce-1a79cc71cbdf sucessfully created.
2018-03-26 14:51:15,759 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Waiting volume 3360cfbb-3dbd-45e4-a7ce-1a79cc71cbdf to be online. Current state: online
2018-03-26 14:51:16,551 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Delete storage: /occi//storage/3360cfbb-3dbd-45e4-a7ce-1a79cc71cbdf
2018-03-26 14:51:17,356 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Storage /occi//storage/3360cfbb-3dbd-45e4-a7ce-1a79cc71cbdf exists. Try to delete it.
2018-03-26 14:51:18,144 - CloudConnector - INFO - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Successfully deleted
2018-03-26 14:51:18,144 - InfrastructureManager - WARNING - Inf ID: 12df8522-3105-11e8-b44b-0242ac110004: Error launching some of the VMs: Internal Server Error
The server has either erred or is incapable of performing the requested operation.

So the volume seems to be successfully created but is almost immediately deleted for some reason, then an Internal Server Error appears.

I am using the grycap/im:1.6.7-1 Docker image.

Member

micafer commented Mar 26, 2018

Hi @alahiff,

Try to remove the line:

disk.1.device='vdb' and

Some connectors do not support it.
In general it is better not to set it. Only some connectors really need it.

Author

alahiff commented Mar 26, 2018

Hi @micafer, thanks for the quick reply. Removing the device line results in a slightly different error:

2018-03-26 16:10:20,112 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Creating a 40 GB volume for the disk 1
2018-03-26 16:10:21,796 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Volume id e4818ad1-915f-4a51-9249-eb2ed5e8ca95 sucessfully created.
2018-03-26 16:10:28,223 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Waiting volume e4818ad1-915f-4a51-9249-eb2ed5e8ca95 to be online. Current state: online
2018-03-26 16:10:29,301 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Delete storage: /occi//storage/e4818ad1-915f-4a51-9249-eb2ed5e8ca95
2018-03-26 16:10:30,031 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Storage /occi//storage/e4818ad1-915f-4a51-9249-eb2ed5e8ca95 exists. Try to delete it.
2018-03-26 16:10:30,948 - CloudConnector - INFO - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Successfully deleted
2018-03-26 16:10:30,948 - InfrastructureManager - WARNING - Inf ID: 08ce9978-3110-11e8-a86b-0242ac110004: Error launching some of the VMs: Bad Request
Expecting http://cloud.recas.ba.infn.it:8787/occi/storage resource

Member

micafer commented Mar 26, 2018

Sometimes the problem is that the device is already in use. Try another one: vdc, vdd, ...

Author

alahiff commented Mar 27, 2018

I had already tried a few other devices without success. I've now tried some more and I still always have the same problem.

BTW if I create a VM using https://dashboard.appdb.egi.eu/vmops on IN2P3-IRES, for example, and specify an additional disk, everything is fine (and vdb is used). I think this webpage uses IM internally (?)

Member

micafer commented Mar 27, 2018

Yes, the AppDB VMOps dashboard uses the IM internally to launch the VMs,
and it is currently using IM 1.6.7.
Does the problem only appear at the RECAS Bari site?

Author

alahiff commented Mar 27, 2018

Not just Bari, also IN2P3-IRES and INFN-PADOVA-STACK.

Member

micafer commented Mar 27, 2018

I have contacted the AppDB VMOps dashboard developers and they tell me that the RADL code they use looks like this:

   disk.1.mount_path = '/mnt/storage01' and
   disk.1.fstype = 'ext4' and
   disk.1.size = 10G 

Author

alahiff commented Mar 27, 2018

I can create a VM using the VMOps dashboard on RECAS-BARI and it works fine, but when using IM directly with the RADL extract above I get:

2018-03-27 14:52:10,599 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Creating a 10 GB volume for the disk 1
2018-03-27 14:52:12,414 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Volume id bbae9341-d8b4-424c-9eda-c557d6422197 sucessfully created.
2018-03-27 14:52:23,772 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Waiting volume bbae9341-d8b4-424c-9eda-c557d6422197 to be online. Current state: online
2018-03-27 14:52:24,730 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Delete storage: /occi//storage/bbae9341-d8b4-424c-9eda-c557d6422197
2018-03-27 14:52:25,732 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Storage /occi//storage/bbae9341-d8b4-424c-9eda-c557d6422197 exists. Try to delete it.
2018-03-27 14:52:25,953 - CloudConnector - INFO - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Successfully deleted
2018-03-27 14:52:25,953 - InfrastructureManager - WARNING - Inf ID: 54662680-31ce-11e8-888c-0242ac110004: Error launching some of the VMs: Bad Request
Expecting http://cloud.recas.ba.infn.it:8787/occi/storage resource

For IN2P3-IRES it doesn't actually work on the VMOps dashboard. Fortunately I get the same error:

Exception: Some deploys did not proceed successfully: 'ascii' codec can't encode character u'\xe9' in position 17: ordinal not in range(128)

In this case I assume there is a message/error containing a French accent which is causing problems.
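As a rough illustration of that kind of failure, here is a Python 3 sketch (not the IM code; the `u'\xe9'` in the error suggests a Python 2 runtime, where the same failure can be triggered implicitly). The helper names `ascii_error` and `to_log_safe` are hypothetical. Encoding text that contains `é` with the `ascii` codec raises `UnicodeEncodeError`, while UTF-8 or an explicit error handler avoids it:

```python
def ascii_error(text):
    """Return the UnicodeEncodeError raised by a strict ASCII encode, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


def to_log_safe(text):
    """Escape non-ASCII characters instead of failing (hypothetical helper)."""
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


word = "demandée"

err = ascii_error(word)
if err is not None:
    # err.start is the index of the first character that could not be encoded.
    print("strict ASCII encode failed at index", err.start)

# UTF-8 can represent the accented character, so this never raises.
utf8_bytes = word.encode("utf-8")

# An ASCII-only rendering that keeps the message readable in logs.
print(to_log_safe(word))
```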

Member

micafer commented Mar 29, 2018

In RECAS Bari I have tried using this RADL code and it worked for me.

   disk.1.mount_path = '/mnt/storage01' and
   disk.1.fstype = 'ext4' and
   disk.1.size = 10G 

But I think I know what causes your error.
Check your IM auth data and make sure that the host is:
http://cloud.recas.ba.infn.it:8787/occi
with NO trailing slash.
(I will try to fix this in the future to avoid these errors.)
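The `/occi//storage/...` paths in the logs earlier in this thread are consistent with a trailing slash on the host being joined to a path that already starts with a slash. A minimal sketch of normalizing the endpoint before use follows. It does not show how the IM builds its URLs; `normalize_endpoint`, `naive_join` and `storage_url` are hypothetical helpers used only for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_endpoint(url):
    """Return the endpoint URL with any trailing slashes removed from its path."""
    parts = urlsplit(url.strip())
    return urlunsplit(parts._replace(path=parts.path.rstrip("/")))


def naive_join(endpoint, volume_id):
    """Plain concatenation: a trailing slash on the endpoint yields '//'."""
    return endpoint + "/storage/" + volume_id


def storage_url(endpoint, volume_id):
    """Join after normalizing, so the result never contains a doubled slash."""
    return normalize_endpoint(endpoint) + "/storage/" + volume_id


host_from_appdb = "http://cloud.recas.ba.infn.it:8787/occi/"
volume = "bbae9341-d8b4-424c-9eda-c557d6422197"

print(naive_join(host_from_appdb, volume))   # contains /occi//storage/
print(storage_url(host_from_appdb, volume))  # contains /occi/storage/
```

Until this is handled on the IM side, stripping the slash in the auth data by hand has the same effect.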

In the case of IN2P3-IRES, as you said, there is an error message containing a French accented character, which is causing problems:

Le disque du gabaris est trop petit pour l'image demandée. Le disque du gabarit fait  1073741824 bytes, l'image fait 1434451968 bytes.

It means that you are selecting a flavour that is too small for the selected image. You have probably used the smallest one (as I did), which gives this error. (I will also have to fix this error with French accents.)
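The two sizes in that message can be checked with a few lines of arithmetic: the flavour disk is exactly 1 GiB, while the image needs 1368 MiB. The sketch below also picks the smallest flavour that fits. The flavour names and the sizes other than 1 GiB are made up for illustration and are not taken from the site:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

FLAVOUR_DISK_BYTES = 1073741824  # from the error message
IMAGE_BYTES = 1434451968         # from the error message


def fits(flavour_disk_bytes, image_bytes):
    """A flavour can hold an image only if its disk is at least as large."""
    return flavour_disk_bytes >= image_bytes


def smallest_fitting(flavours, image_bytes):
    """Return the name of the smallest flavour whose disk fits the image, or None."""
    candidates = [(size, name) for name, size in flavours.items()
                  if fits(size, image_bytes)]
    return min(candidates)[1] if candidates else None


# Hypothetical flavour catalogue (name -> disk size in bytes).
flavours = {"tiny": 1 * GIB, "small": 10 * GIB, "medium": 20 * GIB}

print(FLAVOUR_DISK_BYTES // MIB, "MiB flavour disk")
print(IMAGE_BYTES // MIB, "MiB image")
print("fits:", fits(FLAVOUR_DISK_BYTES, IMAGE_BYTES))
print("smallest usable flavour:", smallest_fitting(flavours, IMAGE_BYTES))
```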

Author

alahiff commented Mar 29, 2018

Yes, I had taken the site endpoint from appdb, which for RECAS-BARI has a slash at the end. I have now tried without the slash and it works. I also just tried IN2P3-IRES with a larger flavour, and it also works.

Thanks!

Member

micafer commented Mar 29, 2018

You are welcome.

micafer closed this as completed Mar 29, 2018