Timeout snapshot #11893

@Lauta11

### problem

When creating snapshots of large volumes (larger than roughly 200 GB), the task fails. The snapshot command hits a timeout of 3600 seconds, which is not enough time to copy the entire disk.

We attempted to modify various timeout-related parameters in CloudStack (including those for snapshot and asynchronous tasks), but none of them appear to extend or affect this timeout limit.

It is unclear whether this is a bug or a missing feature that should allow increasing the timeout duration for snapshot operations.

The corresponding logs are attached for further analysis.

Management:

2025-10-21 02:01:57,863 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] can be executed.
2025-10-21 02:01:59,126 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduling snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] at [2025-10-21 05:00:00 GMT].
2025-10-21 02:01:59,179 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduled snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] as job [d0e089bb-d2cb-4e81-8fb4-5e6a23eee556].
2025-10-21 02:02:01,607 DEBUG [o.a.c.s.s.StorPoolSnapshotStrategy] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) StorpoolSnapshotStrategy.canHandle: snapshot=server18063_ROOT-392_20251021050158, uuid=70decb5c-6f4d-405a-b7e1-f2eb2ace4de0, op=TAKE



2025-10-21 03:02:01,993 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) create snapshot server18063_ROOT-392_20251021050158 failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:59, com.cloud.exception.OperationTimedoutException: Commands 545217029888601412 to Host 59 timed out after 3600
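
As a quick check on the management log, the gap between the job starting and the `OperationTimedoutException` can be computed from the two timestamps quoted above. This is only a sketch: the helper name `elapsed_seconds` is made up for illustration, and it assumes both lines were written by the same clock.

```python
from datetime import datetime

# Timestamp layout used by the CloudStack log lines quoted above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the management log excerpt.
job_start = "2025-10-21 02:02:01,607"   # StorpoolSnapshotStrategy.canHandle line
job_failed = "2025-10-21 03:02:01,993"  # "timed out after 3600" line

print(f"{elapsed_seconds(job_start, job_failed):.3f} s")
```

The result is within half a second of 3600 s, which is consistent with the command being cut off by a fixed one-hour limit rather than failing for some other reason.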


Host:


2025-10-21 03:06:26,176 WARN  [utils.script.Script] (Script-10:null) (logid:) Interrupting script.
2025-10-21 03:06:26,176 WARN  [utils.script.Script] (agentRequest-Handler-2:null) (logid:d0e089bb) Process [3934352] for command [qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/11806790-0ad0-4a4a-b389-2f7ab41b4e87 /mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687 ] timed out. Output is [].
2025-10-21 03:06:43,053 ERROR [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-2:null) (logid:d0e089bb) Failed take snapshot for volume [volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]]], in VM [i-52-392-VM], due to [Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].].
com.cloud.utils.exception.CloudRuntimeException: Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].
        at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.validateConvertResult(KVMStorageProcessor.java:1915)
        at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.createSnapshot(KVMStorageProcessor.java:1790)
        at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:140)
        at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:66)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:36)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:30)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
        at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1930)
        at com.cloud.agent.Agent.processRequest(Agent.java:683)
        at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1106)
        at com.cloud.utils.nio.Task.call(Task.java:83)
        at com.cloud.utils.nio.Task.call(Task.java:29)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
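
To put the one-hour limit in perspective, the `size` and `virtualSize` values from the host log can be turned into the average throughput the `qemu-img convert` step would have needed to finish in time. This is a rough estimate only. How much data the conversion actually has to move is an assumption here (it may lie anywhere between the allocated size and the virtual size), and the function name is hypothetical.

```python
# Values copied from the host log excerpt above.
ALLOCATED_BYTES = 122_363_461_632  # "size" of the qcow2 file on primary storage
VIRTUAL_BYTES = 161_061_273_600    # "virtualSize" of the volume
TIMEOUT_SECONDS = 3600             # limit reported by the management server

MIB = 1024 * 1024


def required_mib_per_second(num_bytes: int, seconds: int) -> float:
    """Average throughput needed to move num_bytes within the given time."""
    return num_bytes / seconds / MIB


allocated_rate = required_mib_per_second(ALLOCATED_BYTES, TIMEOUT_SECONDS)
virtual_rate = required_mib_per_second(VIRTUAL_BYTES, TIMEOUT_SECONDS)

print(f"allocated size: {allocated_rate:.1f} MiB/s needed")
print(f"virtual size:   {virtual_rate:.1f} MiB/s needed")
```

If the NFS-backed pool sustains less than this while reading and writing on the same mount, the copy cannot finish inside 3600 seconds regardless of how healthy the job is.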

### versions

CloudStack 4.20 / KVM
OS: Ubuntu 24


### The steps to reproduce the bug

1. Start a snapshot task for a volume larger than 200 GB.
2. Wait one hour.
3. The snapshot fails with the timeout error shown in the logs above.


### What to do about it?

Provide a configuration setting that allows this timeout to be increased, or fix the bug if the existing timeout settings are supposed to control it.
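
One possible shape for such a setting, sketched here purely as an illustration, is a timeout that scales with volume size instead of a fixed value. Nothing below is CloudStack code or an existing CloudStack option: the function name, the parameters, and the default values are all hypothetical.

```python
import math


def snapshot_timeout_seconds(
    volume_bytes: int,
    assumed_bytes_per_second: float,
    floor_seconds: int = 3600,   # hypothetical default: never go below the current limit
    safety_factor: float = 1.5,  # hypothetical headroom for slow or busy storage
) -> int:
    """Estimate a snapshot timeout from the volume size and an assumed copy rate."""
    if assumed_bytes_per_second <= 0:
        raise ValueError("assumed_bytes_per_second must be positive")
    estimated = volume_bytes / assumed_bytes_per_second * safety_factor
    return max(floor_seconds, math.ceil(estimated))


# Example: a 200 GB volume on storage assumed to sustain 30 MB/s.
timeout = snapshot_timeout_seconds(200 * 10**9, 30 * 10**6)
print(f"suggested timeout: {timeout} s")
```

Small volumes keep the existing one-hour limit, while large ones get proportionally more time. A plain operator-configurable value would address the same problem with less machinery.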
