Skip to content

Conversation

@GutoVeronezi
Copy link
Contributor

@GutoVeronezi GutoVeronezi commented Dec 30, 2020

Description

KVM has a limitation to modify instances definitions while they are on running state. Therefore, it is not possible to change volumes backend location easily.

There is a problem in the migrateVolume API. This API command ignores that limitation and causes an inconsistence on the database. ACS processes the migrate command, copies the volume to the destination storage, modifies the database and finishes the process with success. However, the running backend is still using the "old volume file".

This PR intends to prevent KVM to perform volumes migrations while KVM instances are in the running state and inform the user of an alternative API command that enables such operation on running instances.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

How Has This Been Tested?

It has been tested locally.

Copy link
Member

@GabrielBrascher GabrielBrascher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM.

Copy link
Contributor

@DaanHoogland DaanHoogland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm

@GutoVeronezi GutoVeronezi force-pushed the throw-exception-when-live-migrate-volume-in-kvm branch from 66fa777 to 5113863 Compare January 4, 2021 16:38
@yadvr yadvr requested a review from shwstppr January 5, 2021 07:22
@yadvr yadvr changed the base branch from master to 4.15 January 5, 2021 07:36
@yadvr
Copy link
Member

yadvr commented Jan 5, 2021

@blueorangutan package

@blueorangutan
Copy link

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2524

Copy link
Member

@weizhouapache weizhouapache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code makes sense.

could you please create pr for 4.15 instead of master ? (some changes on pom.xml needs to be reverted)

@yadvr
Copy link
Member

yadvr commented Jan 5, 2021

The PR base branch is 4.15 @weizhouapache, I think you mean rebase against 4.15

@weizhouapache
Copy link
Member

The PR base branch is 4.15 @weizhouapache, I think you mean rebase against 4.15

@rhtyd
as I see, this PR is based on 4.15, but it have some commits from latest master (contains version update to 4.16.0.0) which are not needed. should remove or revert these commits.

by the way, is there same issue in 4.14 ?

@yadvr yadvr changed the base branch from 4.15 to master January 5, 2021 09:49
Copy link
Contributor

@RodrigoDLopez RodrigoDLopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code LGTM.
great job @GutoVeronezi

what do you mean with:

How Has This Been Tested?
It has been tested locally.

could you please explain how to test this branch?

I tested this branch using the following behavior.

  1. Create a virtual machine using KVM as a hypervisor
  2. Run the 'migrateVolume' command with the instance running.
  3. I checked the database to see if the information on that disk had been 'changed'
  4. check the storage pool to see if the target volume has actually been moved as requested
  5. at this point we can verify an inconsistency, the database tells us that the volume has been moved to the target storage pool, however it is not possible to identify the volume in this storage pool.

is this correct my test form?

@GabrielBrascher GabrielBrascher modified the milestone: 4.16.0.0 Jan 6, 2021
@GutoVeronezi
Copy link
Contributor Author

@RodrigoDLopez it's a good way to test it.

Before the commit, what you described in step 5 is what was happening.

To test it...

  1. I created two primary storages.

  2. I created one instance (using KVM as hypervisor) in one primary storage.

  3. I updated hypervisor_capabilities to support storage motion of a KVM.

  4. Via GUI, I tried to migrate the volume to the other primary storage.

  5. Before the modifications, it would claim that the volume was migrated, but actually it would create an inconsistency in the database and in the storage pool. Now, it's throwing an exception explaining why it is not possible and how to get around the situation.

@weizhouapache
Copy link
Member

@GutoVeronezi @rhtyd
can we have this in 4.14, 4.15 and master ?

@GutoVeronezi
Copy link
Contributor Author

Thanks for the review guys. To us, it is fine if it only goes to master.

@weizhouapache
Copy link
Member

Thanks for the review guys. To us, it is fine if it only goes to master.

@GutoVeronezi
the issue exists in 4.14 LTS and coming 4.15 LTS, we should fix it in the branches as well.
please update this pr

@GutoVeronezi
Copy link
Contributor Author

Thanks for the review guys. To us, it is fine if it only goes to master.

@GutoVeronezi
the issue exists in 4.14 LTS and coming 4.15 LTS, we should fix it in the branches as well.
please update this pr

@weizhouapache
We would rather not open for 4.14, it would be more work on our side. We already fixed it in our 4.13 internal branch, and now we are proposing the patch to contribute back with the community. Committers could execute a back port (if they think it is interesting) after this one is merged in master.

@weizhouapache
Copy link
Member

Thanks for the review guys. To us, it is fine if it only goes to master.

@GutoVeronezi
the issue exists in 4.14 LTS and coming 4.15 LTS, we should fix it in the branches as well.
please update this pr

@weizhouapache
We would rather not open for 4.14, it would be more work on our side. We already fixed it in our 4.13 internal branch, and now we are proposing the patch to contribute back with the community. Committers could execute a back port (if they think it is interesting) after this one is merged in master.

@GutoVeronezi when you create a pr, it will trigger jenkins test, travis test, trillian test, and also be reviewed and/or manual tested by community. no matter which dest branch is.

@GutoVeronezi
Copy link
Contributor Author

Thanks for the review guys. To us, it is fine if it only goes to master.

@GutoVeronezi
the issue exists in 4.14 LTS and coming 4.15 LTS, we should fix it in the branches as well.
please update this pr

@weizhouapache
We would rather not open for 4.14, it would be more work on our side. We already fixed it in our 4.13 internal branch, and now we are proposing the patch to contribute back with the community. Committers could execute a back port (if they think it is interesting) after this one is merged in master.

@GutoVeronezi when you create a pr, it will trigger jenkins test, travis test, trillian test, and also be reviewed and/or manual tested by community. no matter which dest branch is.

@weizhouapache
Ok, I see how it works. Anyways, to us it is fine it if it goes only to master.

@weizhouapache
Copy link
Member

@GutoVeronezi ok, got it.
@DaanHoogland what's your opinion ? merge it into master then backport to 4.14/4.15 ?

Copy link
Contributor

@shwstppr shwstppr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@weizhouapache @DaanHoogland @rhtyd do we plan to backport this or should this go into 4.15.1.0?

@DaanHoogland
Copy link
Contributor

@weizhouapache @shwstppr @GutoVeronezi @rhtyd I am always for merging in the oldest applicable branch first. If we want this in those branches we should rebase it first (imnsho).

@weizhouapache
Copy link
Member

@weizhouapache @shwstppr @GutoVeronezi @rhtyd I am always for merging in the oldest applicable branch first. If we want this in those branches we should rebase it first (imnsho).

@DaanHoogland I agree. merge it to 4.14 at first, then merge to 4.15/master.

@GutoVeronezi can you create this branch from 4.14, apply the code change and force-push to github ?

KVM has a limitation to modify instances definitions while they are on running state. Therefore, it is not possible to change volumes backend location easily.

There is a problem in the `migrateVolume` API. This API command ignores that limitation and causes an inconsistence on the database. ACS processes the migrate command, copies the volume to the destination storage, modifies the database and finishes the process with success. However, the running backend is still using the "old volume file".

This PR intends to prevent KVM to perform volumes migrations while KVM instances are in the running state and inform the user of an alternative API command that enables such operation on running instances.
@GutoVeronezi GutoVeronezi changed the base branch from master to 4.14 February 12, 2021 14:50
@GutoVeronezi GutoVeronezi force-pushed the throw-exception-when-live-migrate-volume-in-kvm branch from 5113863 to bc760f2 Compare February 12, 2021 14:51
@GutoVeronezi
Copy link
Contributor Author

@weizhouapache done

Copy link
Member

@weizhouapache weizhouapache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm

@weizhouapache
Copy link
Member

@weizhouapache done

@GutoVeronezi good, thanks

@DaanHoogland DaanHoogland added this to the 4.14.2.0 milestone Feb 13, 2021
@yadvr
Copy link
Member

yadvr commented Feb 15, 2021

@blueorangutan package

@blueorangutan
Copy link

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos7 ✖centos8 ✖debian. JID-2708

@DaanHoogland
Copy link
Contributor

@blueorangutan test

@blueorangutan
Copy link

@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian test result (tid-3540)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 30714 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4562-t3540-kvm-centos7.zip
Smoke tests completed. 83 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@DaanHoogland
Copy link
Contributor

not sure if we need more third party testing on this. @rhtyd @weizhouapache, can we ?

@yadvr
Copy link
Member

yadvr commented Feb 19, 2021

Simple fix, I add a space between if (.... No testing needed, we don't support live volume migration with VM running.

@yadvr yadvr merged commit 3b5f99a into apache:4.14 Feb 19, 2021
nlgordon pushed a commit to ippathways/cloudstack that referenced this pull request Aug 2, 2022
…es (apache#4562)

* Prevent KVM from performing volume migrations of running instances

KVM has a limitation to modify instances definitions while they are on running state. Therefore, it is not possible to change volumes backend location easily.

There is a problem in the `migrateVolume` API. This API command ignores that limitation and causes an inconsistence on the database. ACS processes the migrate command, copies the volume to the destination storage, modifies the database and finishes the process with success. However, the running backend is still using the "old volume file".

This PR intends to prevent KVM to perform volumes migrations while KVM instances are in the running state and inform the user of an alternative API command that enables such operation on running instances.

* Update VolumeApiServiceImpl.java

Co-authored-by: Daniel Augusto Veronezi Salvador <[email protected]>
Co-authored-by: Rohit Yadav <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants