@GabrielBrascher
Member

@GabrielBrascher GabrielBrascher commented May 10, 2021

Description

Currently there is no disk IO driver configuration for VMs running on KVM. That's OK for most cases; however, some quite interesting optimizations have recently been added with the io_uring IO driver.

Note that io_uring requires:

  • Qemu >= 5.0, and
  • Libvirt >= 6.3.0.

By using io_uring we can see a massive I/O performance improvement within Virtual Machines running from Local and/or NFS storage.

This implementation enhances the KVM disk configuration by adding a workflow for setting the disk IO driver. Additionally, if the Qemu and Libvirt versions meet the requirements for io_uring, we set it on the VM. If the driver is not supported, we keep the current behavior, with no IO driver configured.
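
The version gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class and method names are hypothetical, and it assumes both versions arrive in libvirt's usual integer encoding (major * 1,000,000 + minor * 1,000 + release).

```java
// Hypothetical sketch of the version gate; names are illustrative only.
final class IoUringVersionGate {
    // Assumption: versions use libvirt's integer encoding,
    // major * 1,000,000 + minor * 1,000 + release.
    static final long LIBVIRT_MIN = encode(6, 3, 0); // Libvirt >= 6.3.0
    static final long QEMU_MIN = encode(5, 0, 0);    // Qemu >= 5.0

    static long encode(int major, int minor, int release) {
        return major * 1_000_000L + minor * 1_000L + release;
    }

    // True only when both components meet the minimum versions.
    static boolean versionsAllowIoUring(long libvirtVersion, long qemuVersion) {
        return libvirtVersion >= LIBVIRT_MIN && qemuVersion >= QEMU_MIN;
    }

    public static void main(String[] args) {
        // Exactly at the minimum versions.
        System.out.println(versionsAllowIoUring(encode(6, 3, 0), encode(5, 0, 0)));
        // Older libvirt and qemu.
        System.out.println(versionsAllowIoUring(encode(6, 0, 0), encode(4, 2, 0)));
    }
}
```

Both conditions must hold; a new libvirt paired with an old qemu still leaves the IO driver unset.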

Fixes: #4883

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

How Has This Been Tested?

Locally debugging with breakpoints on remote debug and asserting the IO driver as well as qemu/libvirt versions.

@GabrielBrascher GabrielBrascher self-assigned this May 10, 2021
@GabrielBrascher GabrielBrascher added this to the 4.16.0.0 milestone May 10, 2021
@rohityadavcloud
Member

@GabrielBrascher interesting PR, I checked the libvirt version. Which distros/version is this targeted against?

@GabrielBrascher
Member Author

@rhtyd as far as I know these Qemu and Libvirt packages can be found in Ubuntu 21.04.
It is quite a fresh thing, but we are already working on having CloudStack supporting it.

@rohityadavcloud
Member

Thanks for explaining, @GabrielBrascher. Good that we're looking ahead.

@wido
Contributor

wido commented May 11, 2021

Especially with NVMe local storage in hypervisors this can vastly improve the overall I/O performance of Virtual Machines.

See this comment from Gabriel with a graph of the performance boost possible: #4883 (comment)

@GabrielBrascher
Member Author

Thanks for the review, @GutoVeronezi. Code has been updated.

@GabrielBrascher GabrielBrascher marked this pull request as ready for review May 28, 2021 00:08
Contributor

@GutoVeronezi GutoVeronezi left a comment

Code LGTM, although I did not test it.

@GabrielBrascher
Member Author

GabrielBrascher commented May 28, 2021

I will check if I can install the required Libvirt/Qemu packages on Ubuntu 20.04 in order to test this implementation. I am not sure they can be installed on 20.04, although they are available on 21.04.

If it works I can run some tests to check if all looks good when the XML is parsed and VM deployed. I will be able to run some tests on Ubuntu+KVM & RBD / NFS / Filesystem.

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ centos7 ✖️ centos8 ✔️ debian. SL-JID 271

@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ centos7 ✔️ centos8 ✔️ debian. SL-JID 357

@sureshanaparti
Contributor

@blueorangutan test

@blueorangutan

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-1126)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33399 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5012-t1126-kvm-centos7.zip
Smoke tests completed. 88 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

@DaanHoogland
Contributor

I see unit tests, @GabrielBrascher 👍. Are integration tests for IO driver types planned or possible? Or do we need specific hardware? (I don't think so; only kernel features and maybe a qemu/libvirt version supporting it.)

     * (ii) Libvirt >= 6.3.0
     */
    protected void setDiskIoDriver(DiskDef disk) {
        if (getHypervisorLibvirtVersion() >= HYPERVISOR_LIBVIRT_VERSION_SUPPORTS_IO_URING && getHypervisorQemuVersion() >= HYPERVISOR_QEMU_VERSION_SUPPORTS_IO_URING) {
Member

LGTM and tests have passed, but this enables io_uring by default for matching versions of qemu and libvirt. @GabrielBrascher, could that cause an issue for different types of guest OS?

Member Author

@rhtyd as far as I know there are no issues with different types of guest OS.
This affects the hypervisor side; thus, what matters is that the virtualization layer supports it.
As far as I know, both Ubuntu and CentOS support it as long as Qemu >= 5.0 and Libvirt >= 6.3.0 are installed.

One issue that I have seen with this proposed implementation is live migrating from a KVM node that supports IO_URING to one that does not. I will check how to update the IO driver when migrating.

@GabrielBrascher
Member Author

GabrielBrascher commented Jul 10, 2021

@DaanHoogland @rhtyd I've executed some manual tests on hardware that has the required Qemu & Libvirt versions to support IO_URING.

The VM has the IO_URING added as expected in its XML when the versions match the requirements.
I ran some simple IO tests on the same VM and host before and after applying the IO driver (same hardware, VM, and OS/kernel).

data in/out  1GB   2GB   4GB   5GB   10GB
default      104   101   102    92    102 MB/s
io-uring      92    95    96    96    100 MB/s

@DaanHoogland Do you mean integration tests that would validate the domain XML according to the host qemu/libvirt version?
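
For reference, the disk driver element that such a test would look for in the domain XML can be sketched as below. This is only an illustration with a hypothetical helper class, not CloudStack's `DiskDef` code; it relies on libvirt's `io` attribute of the disk `<driver>` element, which accepts `io_uring` on the versions listed in the description.

```java
// Illustrative helper; not CloudStack's DiskDef implementation.
final class DiskDriverXml {
    /** Builds a libvirt <driver> element; a null ioDriver leaves the io attribute out. */
    static String driverElement(String format, String ioDriver) {
        StringBuilder xml = new StringBuilder();
        xml.append("<driver name='qemu' type='").append(format).append("'");
        if (ioDriver != null) {
            // Only emitted when an IO driver was selected for this disk.
            xml.append(" io='").append(ioDriver).append("'");
        }
        return xml.append("/>").toString();
    }

    public static void main(String[] args) {
        System.out.println(driverElement("qcow2", "io_uring"));
        System.out.println(driverElement("qcow2", null));
    }
}
```

An integration test along these lines would dump the domain XML on the host and check whether the `io` attribute is present or absent according to the host's qemu/libvirt versions.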

@weizhouapache
Member

@GabrielBrascher it looks like the performance is quite close, right?

@GabrielBrascher
Member Author

I agree, @weizhouapache. I think this is because the tests were too simple; a few "dd" runs are probably not the best way to exercise the IO optimizations. I need to run more elaborate tests.

There are some interesting tests that show a great improvement in performance. For example: #4883 (comment)

@weizhouapache
Member

@GabrielBrascher
The test results in #4883 (comment) are exciting.
Do you know of any settings other than io_uring? (I did not look into the PDF.)

@DaanHoogland
Contributor

@DaanHoogland Do you mean integration tests that would validate the domain XML according to the host qemu/libvirt version?.

Yes, and maybe do a copy/mv/del to verify IO is functioning. I'm not saying we need it, just wondering. It might be excessive.

Member

@rohityadavcloud rohityadavcloud left a comment

Haven't tested, but no regressions so let's merge.

@rohityadavcloud rohityadavcloud merged commit 1d831a3 into apache:main Jul 15, 2021
@xlmnxp

xlmnxp commented May 17, 2022

This PR prevents me from using CloudStack.
I get the following in the CloudStack logs:

May 17 16:21:48 app.sy.sa java[1982]: INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-77:ctx-da4144a8 job-481/job-559) (logid:f2176072) Add job-559 into job monitoring
May 17 16:21:48 app.sy.sa java[1982]: INFO  [c.c.a.m.a.i.FirstFitAllocator] (Work-Job-Executor-77:ctx-da4144a8 job-481/job-559 ctx-a0d804e3 FirstFitRoutingAllocator) (logid:3d720a9a)  Guest VM is requested with Custom[UEFI] Boot Type false
May 17 16:21:48 app.sy.sa java[1982]: INFO  [c.c.s.StorageManagerImpl] (Work-Job-Executor-77:ctx-da4144a8 job-481/job-559 ctx-a0d804e3) (logid:3d720a9a) Storage pool Primary1 (1) does not supply IOPS capacity, assuming enough capacity
May 17 16:21:49 app.sy.sa java[1982]: INFO  [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-77:ctx-da4144a8 job-481/job-559 ctx-a0d804e3) (logid:3d720a9a) Unable to start VM on Host {"id": "1", "name": "app.sy.sa", "uuid": "512fa84c-237d-4cdb-b86b-1bac252eaffc", "type"="Routing"} due to unsupported configuration: io uring is not supported by this QEMU binary

May 17 16:21:51 app.sy.sa java[1982]: INFO  [c.c.v.VirtualMachineManagerImpl] (consoleproxy-1:ctx-82b6a15f) (logid:792d27bd) allocating virtual machine from template:0d91769b-c74f-4614-92a3-759f26af8c62 with hostname:v-182-VM and 3 networks
May 17 16:21:51 app.sy.sa java[1982]: INFO  [o.a.c.e.o.VolumeOrchestrator] (consoleproxy-1:ctx-82b6a15f) (logid:792d27bd) adding disk object ROOT-182 to v-182-VM
May 17 16:21:52 app.sy.sa java[1982]: INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-78:ctx-8d72fa13 job-483/job-560) (logid:3bbd911f) Add job-560 into job monitoring

@xlmnxp

xlmnxp commented May 18, 2022

same thing for router:

May 17 21:21:29 app.sy.sa java[1975]: INFO  [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-22:ctx-9643ce8b job-481/job-1600 ctx-501df6a9) (logid:3d720a9a) Unable to start VM on Host {"id": "1", "name": "app.sy.sa", "uuid": "512fa84c-237d-4cdb-b86b-1bac252eaffc", "type"="Routing"} due to unsupported configuration: io uring is not supported by this QEMU binary
May 17 21:21:29 app.sy.sa java[1975]: INFO  [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-23:ctx-038727fd job-483/job-1601 ctx-cd372811) (logid:e68ba882) Unable to start VM on Host {"id": "1", "name": "app.sy.sa", "uuid": "512fa84c-237d-4cdb-b86b-1bac252eaffc", "type"="Routing"} due to unsupported configuration: io uring is not supported by this QEMU binary

@slavkap
Contributor

slavkap commented May 18, 2022

@xlmnxp, can you share the libvirt/qemu versions and the OS? Also, do you use advanced virtualization? I had the same problem on Rocky with the advanced virtualization, and there is a fix which will be included in CS versions - 4.16.2.0 and 4.17.
Reported issue

@wido
Contributor

wido commented May 18, 2022

I indeed think this is due to the OS. Without knowing the OS and Qemu version we can't tell the root cause.

@weizhouapache
Member

as @slavkap said, #6253 introduced a new config enable.io.uring in agent.properties.
it will be included in upcoming 4.17.0.0.

if you use latest libvirt, #6244 is also required which will be included in 4.17.0.0 as well.
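
A rough sketch of how an agent-side switch like this could be combined with the version-based detection. This is not the code from #6253: the class name and the "unset falls back to detection" behaviour are assumptions for illustration; only the property name `enable.io.uring` comes from this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch; not the implementation from #6253.
final class IoUringSetting {
    static final String KEY = "enable.io.uring";

    // Assumption: an explicit value in agent.properties wins;
    // when the key is missing or blank, fall back to detected support.
    static boolean resolve(Properties agentProperties, boolean detectedSupport) {
        String raw = agentProperties.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return detectedSupport;
        }
        return Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Stand-in for reading /etc/cloudstack/agent/agent.properties.
        props.load(new StringReader("enable.io.uring=false\n"));
        System.out.println(resolve(props, true));
    }
}
```

The key only has an effect on agent builds that actually read it; an agent without that change ignores the line.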

@xlmnxp

xlmnxp commented May 18, 2022

OS: Rocky Linux 8.6 (Green Obsidian) (Inside Virtualbox and I enabled nested virtualization hypervisors)
Libvirt: libvirtd (libvirt) 8.0.0
Cloudstack: 4.16.1.0


I already disabled it in the agent config, but the router still didn't start because of io_uring, which I had disabled.

/etc/cloudstack/agent/agent.properties:

#Storage
#Wed May 18 05:08:47 EDT 2022
cluster=1
pod=1
resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
private.network.device=cloudbr0
domr.scripts.dir=scripts/network/domr/kvm
guest.cpu.mode=host-passthrough
router.aggregation.command.each.timeout=600
guest.network.device=cloudbr0
enable.io.uring=false
keystore.passphrase=EaPsawRuTKRF7km3
hypervisor.type=kvm
port=8250
zone=1
public.network.device=cloudbr0
local.storage.uuid=ca0e32d1-a818-49fb-bf4d-9e73aef1f19e
host=192.168.8.4@static
guid=28e45ca9-ac69-3278-8703-3e81566b0575
LibvirtComputingResource.id=1
workers=5
iscsi.session.cleanup.enabled=false
vm.migrate.wait=3600

@weizhouapache
Member

@xlmnxp
4.16.1.0 does not include the changes from #6253,
so setting enable.io.uring to false has no effect there.

@xlmnxp

xlmnxp commented May 18, 2022

Then how can I upgrade to 4.17?

@weizhouapache
Member

@xlmnxp
4.17.0.0 is not released yet. It will be released in the coming weeks.

@xlmnxp

xlmnxp commented May 19, 2022

Is there a quick fix or patch for it? I want to try CloudStack in the next 2 days.

@slavkap
Contributor

slavkap commented May 19, 2022

@xlmnxp, until a version with the fix is released, I suggest downgrading qemu/libvirt to libvirt 6.0.0 and qemu 4.2, or anything lower than the versions that enable io_uring:
Qemu >= 5.0, and
Libvirt >= 6.3.0

@xlmnxp

xlmnxp commented May 19, 2022

Is there a way other than downgrading libvirtd?

@slavkap
Contributor

slavkap commented May 19, 2022

@xlmnxp, I don't have anything else on my mind :/ ... The most painless solution for me is downgrading the versions of libvirt/qemu. Another option is to switch to another OS (which supports io_uring) on the hypervisor or downgrade the CS version.
Or wait for 4.17

@xlmnxp

xlmnxp commented May 19, 2022

Then the best option is to wait for 4.17.
thank you 👍

@leolleeooleo
Contributor

I have the same problem after updating to Rocky Linux 8.6.
I tried to downgrade, but it is difficult because of dependency issues.

@xlmnxp, if you urgently need a fix, you can rebase #6253, #6399, and #6402 onto tag 4.16.1.0 and then replace the jar files cloud-plugin-hypervisor-kvm-4.16.1.0.jar and cloud-agent-4.16.1.0.jar.

My steps:
Install the required packages (on Rocky Linux)

$ sudo dnf install git java-11-openjdk-devel maven
$ sudo alternatives --config javac
$ export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.15.0.10-2.el8_6.x86_64/

clone the repository

$ git clone https://github.com/apache/cloudstack.git
$ cd cloudstack
$ git remote add shapeblue https://github.com/shapeblue/cloudstack.git
$ git fetch shapeblue

Check out #6399 and rebase onto 4.16.1.0.
Pick only #6253 and #6399 and drop the other commits (note that PR #6399 isn't merged yet).

$ git checkout -b fix-iouring shapeblue/disableiouringbydefault
$ git rebase -i 4.16.1.0
:1,29s/^pick/drop/g
:31,39s/^pick/drop/g

drop bc70535ee5 Updating pom.xml version numbers for release 4.16.2.0-SNAPSHOT
drop c366511294 UI: Missing message on VM import for not found networks (#6055)
drop 2820a36f86 Check the network access when deploying VM in Advanced Security Group. (#6050)
drop 84f19d8f36 UI: update vm with userdata (#6051)
drop e4d70d4214 UI: Fix Autogenview cleared resource (#6066)
drop 6b913a76cf ui: Add user initials as avatar if no icon present (#6070)
drop 7fbfd4c6ea UI: Fix navigation to domains (#6072)
drop c2bcad8571 ui: Set icon to osdisplayname to avoid multiple api calls (#6075)
drop 00c1bdb365 UI: Reload page on closing Bulk Action modal (#6077)
drop 3a456f1b31 server: mark volume snapshots as Destroyed if it does not exist on primary and secondary storage when delete a volume (#6057)
drop 067c1de080 Fix get upload params NPE (#6079)
drop 6ff378f6a3 travis: run nosetests-2.7 (#6113)
drop 75b54171ae Travis - fix test failures observed (#6119)
drop 4be99fe971 api: Allow updating VM settings when custom contrained offering is used (#6136)
drop f8b648b938 Fix migration of VM with volume on Ubuntu (#6116)
drop fb43076f9e Fix linux native bridge for SUSE in cloudutils (#6134)
drop f4b9ab034b UI: fix create l2 network offering with userdata (#6174)
drop bcd1a3274a api: Fix reset configuration (#6168)
drop 908f594f00 configDrive: Fix failure to delete (unstarted) VM (#6146)
drop a69ab3b28f Ensure configdrive path is edited properly during live migration (#6173)
drop ee27708ffb SAML: replace first number with random alphabet if request ID starts with a number (#6165)
drop 66a6671e0b ui,refactor: fix missing label in update network form (#6181)
drop 47454eca7d VR: add '-m <protocol>' for tcp or udp protocol (#6188)
drop c61ea9f96d VR: Do not add iptables rules for the revoked ip addresses (#6189)
drop e7071ec196 server: increment deviceid while importing vm data volumes (#6123)
drop ee2ded8200 potential null pointer in condition; AYAI9l8k5Irk9_td-cXb (#6237)
drop b6072fc826 Allow expunging a VM on a deleted host when using host cache and ConfigDrive userdata service (#6234)
drop 19a7774cab VR: add rules for traffic between static nat and private gateway static routes (#6153)
drop 91a5f0e285 server: honor global setting system.vm.default.hypervisor as first option when deploy VRs (#6160)
pick 42a92dcdd3 Extract the IO_URING configuration into the agent.properties (#6253)
drop fbf77978e1 Fix: Allow disabling the login attempts mechanism for disabling users (#6254)
drop bbb4ffa593 UI: fix dedicate public ip range to domain (#6258)
drop 365966dd0a UI: Fix custom unconstrained for a zone does not show CPU speed (#6285)
drop b2338f7158 Updated reset configuration, to return the updated config value in the response (#6284)
drop 5fa8fa5580 Fix upload volume format (#6297)
drop efb1f2b719 UI: Fix templates page redirection after delete job is finished (#6345)
drop 556f9dac0f ui: Network offerings not listed if listVPCs not available in the account Role (#6354)
drop d373f973ba Update VM name, when the new name provided in updateVirtualMachine API in different case. (#6379)
drop 006473ca19 Log exception on keystore build for custom certificate (#6394)
pick f9aa91c642 [KVM] Disable IOURING by default on agents
pick fad8a5752c Refactor
pick 1d6d1b09fb Remove agent property for iouring
pick c11cd1bf4e Restore property
pick 261e275c38 Refactor suse check and enable on ubuntu by default
pick 0e76ca64d7 Refactor irrespective of guest OS
pick 9e0574683d Improvement
pick b349e75614 Logs and new path
pick a74a4a83df Refactor condition to enable iouring
pick 09c1f44093 Improve condition
pick a3ab1a33e4 Refactor property check
pick 88b3666d8c Improvement
pick 0c8a6aa03a Doc comment
pick 3835e8b8cf Extend comment

# Rebase cad9332082..3835e8b8cf onto 0c8a6aa03a (53 commands)

:wq

You also need the VNC password fix from #6402.
Check out #6402 and rebase onto fix-iouring.
Pick only #6402 and drop the other commits.

$ git checkout -b fix-vncpassword 363a2cff8231a31c3c9def329805910e43358b1d
$ git rebase -i fix-iouring
:1,39s/^pick/drop/g

drop bc70535ee5 Updating pom.xml version numbers for release 4.16.2.0-SNAPSHOT
drop c366511294 UI: Missing message on VM import for not found networks (#6055)
drop 2820a36f86 Check the network access when deploying VM in Advanced Security Group. (#6050)
drop 84f19d8f36 UI: update vm with userdata (#6051)
drop e4d70d4214 UI: Fix Autogenview cleared resource (#6066)
drop 6b913a76cf ui: Add user initials as avatar if no icon present (#6070)
drop 7fbfd4c6ea UI: Fix navigation to domains (#6072)
drop c2bcad8571 ui: Set icon to osdisplayname to avoid multiple api calls (#6075)
drop 00c1bdb365 UI: Reload page on closing Bulk Action modal (#6077)
drop 3a456f1b31 server: mark volume snapshots as Destroyed if it does not exist on primary and secondary storage when delete a volume (#6057)
drop 067c1de080 Fix get upload params NPE (#6079)
drop 6ff378f6a3 travis: run nosetests-2.7 (#6113)
drop 75b54171ae Travis - fix test failures observed (#6119)
drop 4be99fe971 api: Allow updating VM settings when custom contrained offering is used (#6136)
drop f8b648b938 Fix migration of VM with volume on Ubuntu (#6116)
drop fb43076f9e Fix linux native bridge for SUSE in cloudutils (#6134)
drop f4b9ab034b UI: fix create l2 network offering with userdata (#6174)
drop bcd1a3274a api: Fix reset configuration (#6168)
drop 908f594f00 configDrive: Fix failure to delete (unstarted) VM (#6146)
drop a69ab3b28f Ensure configdrive path is edited properly during live migration (#6173)
drop ee27708ffb SAML: replace first number with random alphabet if request ID starts with a number (#6165)
drop 66a6671e0b ui,refactor: fix missing label in update network form (#6181)
drop 47454eca7d VR: add '-m <protocol>' for tcp or udp protocol (#6188)
drop c61ea9f96d VR: Do not add iptables rules for the revoked ip addresses (#6189)
drop e7071ec196 server: increment deviceid while importing vm data volumes (#6123)
drop ee2ded8200 potential null pointer in condition; AYAI9l8k5Irk9_td-cXb (#6237)
drop b6072fc826 Allow expunging a VM on a deleted host when using host cache and ConfigDrive userdata service (#6234)
drop 19a7774cab VR: add rules for traffic between static nat and private gateway static routes (#6153)
drop 91a5f0e285 server: honor global setting system.vm.default.hypervisor as first option when deploy VRs (#6160)
drop fbf77978e1 Fix: Allow disabling the login attempts mechanism for disabling users (#6254)
drop bbb4ffa593 UI: fix dedicate public ip range to domain (#6258)
drop 365966dd0a UI: Fix custom unconstrained for a zone does not show CPU speed (#6285)
drop b2338f7158 Updated reset configuration, to return the updated config value in the response (#6284)
drop 5fa8fa5580 Fix upload volume format (#6297)
drop efb1f2b719 UI: Fix templates page redirection after delete job is finished (#6345)
drop 556f9dac0f ui: Network offerings not listed if listVPCs not available in the account Role (#6354)
drop d373f973ba Update VM name, when the new name provided in updateVirtualMachine API in different case. (#6379)
drop 006473ca19 Log exception on keystore build for custom certificate (#6394)
drop b62b5c96e8 Prevent NPE on reboot stopped VM and startVM output with null displayname (#6397)
pick 363a2cff82 Backport: kvm: truncate vnc password to 8 chars (#6244) (#6402)

# Rebase bab903af16..363a2cff82 onto 363a2cff82 (40 commands)

:wq

Build the jar files and copy them to your agent

$ mvn install
$ scp agent/target/cloud-agent-4.16.1.0.jar leo@kvm-agent:
$ scp plugins/hypervisors/kvm/target/cloud-plugin-hypervisor-kvm-4.16.1.0.jar leo@kvm-agent:

You need to update CloudStack to 4.16.1.0 first.
Then you can replace those two jar files and restart the agent.
Of course, you can back up the original jar files first if you want.

$ sudo mv /usr/share/cloudstack-agent/lib/cloud-agent-4.16.1.0.jar /usr/share/cloudstack-agent/lib/cloud-agent-4.16.1.0.jar.origin
$ sudo mv /usr/share/cloudstack-agent/lib/cloud-plugin-hypervisor-kvm-4.16.1.0.jar /usr/share/cloudstack-agent/lib/cloud-plugin-hypervisor-kvm-4.16.1.0.jar.origin
$ sudo cp ~/cloud-agent-4.16.1.0.jar /usr/share/cloudstack-agent/lib/
$ sudo cp ~/cloud-plugin-hypervisor-kvm-4.16.1.0.jar /usr/share/cloudstack-agent/lib/
$ sudo systemctl restart cloudstack-agent.service

You can use my jar file if you don't mind
CloudStack-4.16.1.0-agent_patch_for_rockylinux8.6.zip

@xlmnxp

xlmnxp commented May 21, 2022

I have same problem after I update to Rocky Linux 8.6 I had tried to downgrade, but it is difficult to downgrade with some dependence issue.

@xlmnxp , You can rebase #6253, #6399, and #6402 to tag 4.16.1.0 then replace jar file cloud-plugin-hypervisor-kvm-4.16.0.0.jar and cloud-agent-4.16.1.0.jar if you urgently need.

My steps: Install requirement packages (on RockyLinux)

$ sudo dnf install git java-11-openjdk-devel maven
$ sudo alternatives --config javac
$ export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.15.0.10-2.el8_6.x86_64/

clone the repository

$ git clone https://github.com/apache/cloudstack.git
$ cd cloudstack
$ git remote add shapeblue https://github.com/shapeblue/cloudstack.git
$ git fetch shapeblue

checkout #6399 and rebase to 4.16.1.0 only pick #6253 and #6399, drop others commit (note that PR #6399 isn't merged yet)

$ git checkout -b fix-iouring shapeblue/disableiouringbydefault
$ git rebase -i 4.16.1.0
:1,29s/^pick/drop/g
:31,39s/^pick/drop/g

drop bc70535ee5 Updating pom.xml version numbers for release 4.16.2.0-SNAPSHOT
drop c366511294 UI: Missing message on VM import for not found networks (#6055)
drop 2820a36f86 Check the network access when deploying VM in Advanced Security Group. (#6050)
drop 84f19d8f36 UI: update vm with userdata (#6051)
drop e4d70d4214 UI: Fix Autogenview cleared resource (#6066)
drop 6b913a76cf ui: Add user initials as avatar if no icon present (#6070)
drop 7fbfd4c6ea UI: Fix navigation to domains (#6072)
drop c2bcad8571 ui: Set icon to osdisplayname to avoid multiple api calls (#6075)
drop 00c1bdb365 UI: Reload page on closing Bulk Action modal (#6077)
drop 3a456f1b31 server: mark volume snapshots as Destroyed if it does not exist on primary and secondary storage when delete a volume (#6057)
drop 067c1de080 Fix get upload params NPE (#6079)
drop 6ff378f6a3 travis: run nosetests-2.7 (#6113)
drop 75b54171ae Travis - fix test failures observed (#6119)
drop 4be99fe971 api: Allow updating VM settings when custom contrained offering is used (#6136)
drop f8b648b938 Fix migration of VM with volume on Ubuntu (#6116)
drop fb43076f9e Fix linux native bridge for SUSE in cloudutils (#6134)
drop f4b9ab034b UI: fix create l2 network offering with userdata (#6174)
drop bcd1a3274a api: Fix reset configuration (#6168)
drop 908f594f00 configDrive: Fix failure to delete (unstarted) VM (#6146)
drop a69ab3b28f Ensure configdrive path is edited properly during live migration (#6173)
drop ee27708ffb SAML: replace first number with random alphabet if request ID starts with a number (#6165)
drop 66a6671e0b ui,refactor: fix missing label in update network form (#6181)
drop 47454eca7d VR: add '-m <protocol>' for tcp or udp protocol (#6188)
drop c61ea9f96d VR: Do not add iptables rules for the revoked ip addresses (#6189)
drop e7071ec196 server: increment deviceid while importing vm data volumes (#6123)
drop ee2ded8200 potential null pointer in condition; AYAI9l8k5Irk9_td-cXb (#6237)
drop b6072fc826 Allow expunging a VM on a deleted host when using host cache and ConfigDrive userdata service (#6234)
drop 19a7774cab VR: add rules for traffic between static nat and private gateway static routes (#6153)
drop 91a5f0e285 server: honor global setting system.vm.default.hypervisor as first option when deploy VRs (#6160)
pick 42a92dcdd3 Extract the IO_URING configuration into the agent.properties (#6253)
drop fbf77978e1 Fix: Allow disabling the login attempts mechanism for disabling users (#6254)
drop bbb4ffa593 UI: fix dedicate public ip range to domain (#6258)
drop 365966dd0a UI: Fix custom unconstrained for a zone does not show CPU speed (#6285)
drop b2338f7158 Updated reset configuration, to return the updated config value in the response (#6284)
drop 5fa8fa5580 Fix upload volume format (#6297)
drop efb1f2b719 UI: Fix templates page redirection after delete job is finished (#6345)
drop 556f9dac0f ui: Network offerings not listed if listVPCs not available in the account Role (#6354)
drop d373f973ba Update VM name, when the new name provided in updateVirtualMachine API in different case. (#6379)
drop 006473ca19 Log exception on keystore build for custom certificate (#6394)
pick f9aa91c642 [KVM] Disable IOURING by default on agents
pick fad8a5752c Refactor
pick 1d6d1b09fb Remove agent property for iouring
pick c11cd1bf4e Restore property
pick 261e275c38 Refactor suse check and enable on ubuntu by default
pick 0e76ca64d7 Refactor irrespective of guest OS
pick 9e0574683d Improvement
pick b349e75614 Logs and new path
pick a74a4a83df Refactor condition to enable iouring
pick 09c1f44093 Improve condition
pick a3ab1a33e4 Refactor property check
pick 88b3666d8c Improvement
pick 0c8a6aa03a Doc comment
pick 3835e8b8cf Extend comment

# Rebase cad9332082..3835e8b8cf onto 0c8a6aa03a (53 commands)

:wq

You also need the VNC password fix from #6402. Check out the #6402 commit, rebase it onto fix-iouring, and pick only #6402, dropping all other commits.

$ git checkout -b fix-vncpassword 363a2cff8231a31c3c9def329805910e43358b1d
$ git rebase -i fix-iouring
:1,39s/^pick/drop/g

drop bc70535ee5 Updating pom.xml version numbers for release 4.16.2.0-SNAPSHOT
drop c366511294 UI: Missing message on VM import for not found networks (#6055)
drop 2820a36f86 Check the network access when deploying VM in Advanced Security Group. (#6050)
drop 84f19d8f36 UI: update vm with userdata (#6051)
drop e4d70d4214 UI: Fix Autogenview cleared resource (#6066)
drop 6b913a76cf ui: Add user initials as avatar if no icon present (#6070)
drop 7fbfd4c6ea UI: Fix navigation to domains (#6072)
drop c2bcad8571 ui: Set icon to osdisplayname to avoid multiple api calls (#6075)
drop 00c1bdb365 UI: Reload page on closing Bulk Action modal (#6077)
drop 3a456f1b31 server: mark volume snapshots as Destroyed if it does not exist on primary and secondary storage when delete a volume (#6057)
drop 067c1de080 Fix get upload params NPE (#6079)
drop 6ff378f6a3 travis: run nosetests-2.7 (#6113)
drop 75b54171ae Travis - fix test failures observed (#6119)
drop 4be99fe971 api: Allow updating VM settings when custom contrained offering is used (#6136)
drop f8b648b938 Fix migration of VM with volume on Ubuntu (#6116)
drop fb43076f9e Fix linux native bridge for SUSE in cloudutils (#6134)
drop f4b9ab034b UI: fix create l2 network offering with userdata (#6174)
drop bcd1a3274a api: Fix reset configuration (#6168)
drop 908f594f00 configDrive: Fix failure to delete (unstarted) VM (#6146)
drop a69ab3b28f Ensure configdrive path is edited properly during live migration (#6173)
drop ee27708ffb SAML: replace first number with random alphabet if request ID starts with a number (#6165)
drop 66a6671e0b ui,refactor: fix missing label in update network form (#6181)
drop 47454eca7d VR: add '-m <protocol>' for tcp or udp protocol (#6188)
drop c61ea9f96d VR: Do not add iptables rules for the revoked ip addresses (#6189)
drop e7071ec196 server: increment deviceid while importing vm data volumes (#6123)
drop ee2ded8200 potential null pointer in condition; AYAI9l8k5Irk9_td-cXb (#6237)
drop b6072fc826 Allow expunging a VM on a deleted host when using host cache and ConfigDrive userdata service (#6234)
drop 19a7774cab VR: add rules for traffic between static nat and private gateway static routes (#6153)
drop 91a5f0e285 server: honor global setting system.vm.default.hypervisor as first option when deploy VRs (#6160)
drop fbf77978e1 Fix: Allow disabling the login attempts mechanism for disabling users (#6254)
drop bbb4ffa593 UI: fix dedicate public ip range to domain (#6258)
drop 365966dd0a UI: Fix custom unconstrained for a zone does not show CPU speed (#6285)
drop b2338f7158 Updated reset configuration, to return the updated config value in the response (#6284)
drop 5fa8fa5580 Fix upload volume format (#6297)
drop efb1f2b719 UI: Fix templates page redirection after delete job is finished (#6345)
drop 556f9dac0f ui: Network offerings not listed if listVPCs not available in the account Role (#6354)
drop d373f973ba Update VM name, when the new name provided in updateVirtualMachine API in different case. (#6379)
drop 006473ca19 Log exception on keystore build for custom certificate (#6394)
drop b62b5c96e8 Prevent NPE on reboot stopped VM and startVM output with null displayname (#6397)
pick 363a2cff82 Backport: kvm: truncate vnc password to 8 chars (#6244) (#6402)

# Rebase bab903af16..363a2cff82 onto 363a2cff82 (40 commands)

:wq
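
The `:1,39s/^pick/drop/g` edit above can also be scripted instead of typed into the editor. This is a minimal sketch, not the method used in this thread: `drop_leading_picks` is a hypothetical helper name, and it assumes the commits to keep come after the first N lines of the todo list.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): turn the first N "pick" lines
# of a rebase todo file into "drop", like :1,Ns/^pick/drop/ in vim.
# Usage: drop_leading_picks N TODO_FILE
drop_leading_picks() {
    count="$1"
    todo="$2"
    # Write to a temporary file first so the helper does not depend on `sed -i`.
    sed "1,${count}s/^pick/drop/" "$todo" > "$todo.tmp" && mv "$todo.tmp" "$todo"
}

# Small demonstration on a throwaway todo list.
demo=$(mktemp)
printf 'pick aaa first\npick bbb second\npick ccc third\n' > "$demo"
drop_leading_picks 2 "$demo"
cat "$demo"
rm -f "$demo"
```

The demonstration prints two `drop` lines followed by one `pick` line. With GNU sed, the same substitution can be handed to git directly through the `GIT_SEQUENCE_EDITOR` environment variable, for example `GIT_SEQUENCE_EDITOR="sed -i '1,39s/^pick/drop/'" git rebase -i fix-iouring`, assuming the todo list has the same 40 entries as shown above.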

Build the jar files and copy them to your agent:

$ mvn install
$ scp agent/target/cloud-agent-4.16.1.0.jar leo@kvm-agent:
$ scp plugins/hypervisors/kvm/target/cloud-plugin-hypervisor-kvm-4.16.1.0.jar leo@kvm-agent:

You need to update CloudStack to 4.16.1.0 first. Then you can replace those two jar files and restart the agent. You can back up the original jar files first if you want:

$ sudo mv /usr/share/cloudstack-agent/lib/cloud-agent-4.16.1.0.jar /usr/share/cloudstack-agent/lib/cloud-agent-4.16.1.0.jar.origin
$ sudo mv /usr/share/cloudstack-agent/lib/cloud-plugin-hypervisor-kvm-4.16.1.0.jar /usr/share/cloudstack-agent/lib/cloud-plugin-hypervisor-kvm-4.16.1.0.jar.origin
$ sudo cp ~/cloud-agent-4.16.1.0.jar /usr/share/cloudstack-agent/lib/
$ sudo cp ~/cloud-plugin-hypervisor-kvm-4.16.1.0.jar /usr/share/cloudstack-agent/lib/
$ sudo systemctl restart cloudstack-agent.service
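
The backup-and-replace steps above can be wrapped in a small function so that running them twice does not overwrite the original backup. This is only a sketch under stated assumptions: `replace_jar` is a hypothetical helper name, and the `.origin` suffix simply follows the commands above.

```shell
#!/bin/sh
# Hypothetical helper: back up an installed jar as <name>.origin,
# then install the patched copy in its place.
# Usage: replace_jar PATCHED_JAR LIB_DIR
replace_jar() {
    patched="$1"
    lib_dir="$2"
    name=$(basename "$patched")
    installed="$lib_dir/$name"

    # Refuse to continue if the patched jar is missing.
    [ -f "$patched" ] || { echo "missing: $patched" >&2; return 1; }

    # Keep only the first backup, so a rerun never overwrites the original jar.
    if [ -f "$installed" ] && [ ! -e "$installed.origin" ]; then
        mv "$installed" "$installed.origin"
    fi
    cp "$patched" "$installed"
}

# Example (run as root on the agent host, paths taken from the steps above):
#   replace_jar ~/cloud-agent-4.16.1.0.jar /usr/share/cloudstack-agent/lib
#   replace_jar ~/cloud-plugin-hypervisor-kvm-4.16.1.0.jar /usr/share/cloudstack-agent/lib
#   systemctl restart cloudstack-agent.service
```

To roll back, move each `.origin` file back to its original name and restart the agent again.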

You can use my jar files if you don't mind: CloudStack-4.16.1.0-agent_patch_for_rockylinux8.6.zip

Great, that fixes the issue 👍🏼

xlmnxp commented May 21, 2022

Thank you. I'm facing other issues, which I will try to fix.

Why is there no Discourse community forum or Discord server to help with troubleshooting and learning CloudStack?

xlmnxp commented May 22, 2022

Finally, everything is fixed and CloudStack works great 👍🏼
Thanks, all.


Development

Successfully merging this pull request may close these issues.

KVM: Add io driver option to libvirt XML for io_uring support