USHIFT-1083 microshift admin data backup command#1819
USHIFT-1083 microshift admin data backup command#1819openshift-merge-robot merged 27 commits intoopenshift:mainfrom
microshift admin data backup command#1819Conversation
dhellmann
left a comment
There was a problem hiding this comment.
This looks like a good start.
How can we make sure MicroShift is not running before taking the backup? Is that actually a requirement or does cp work ok in that case?
That would be the thing with |
microshift admin backup command
|
/retest |
|
/assign @dhellmann @pacevedom |
|
/retest |
4af7f85 to
8548161
Compare
|
/test e2e-openshift-conformance-reduced-arm |
dhellmann
left a comment
There was a problem hiding this comment.
The overall logic looks good. I have a couple of specific points inline, and I think it would be worth reviewing the logging advice at https://docs.openstack.org/oslo.log/latest/user/guidelines.html#overall-logging-principles and thinking about how to apply that here, both for the use of different log levels and for the "unit of work" approach described for info log messages.
pkg/admin/data/manager.go
Outdated
| Restore(BackupConfig) error | ||
| } | ||
|
|
||
| func NewManager() *manager { |
There was a problem hiding this comment.
Why wouldn't the config be passed in when creating the manager?
There was a problem hiding this comment.
I'm planning to use it like this (src):
a.dataManager.Backup(data.BackupConfig{
Storage: config.BackupsDir,
Name: prevBootHealth.DeploymentID,
})
When I'll need to restore backup for another deployment, I don't want to create another manager.
Should manager take backupsDir and Backup/Restore functions take name of the backup?
There was a problem hiding this comment.
I think you're right that the backup directory could be held in the manager, and then the functions could take arguments that are more dynamic, like the name to restore from. A method like List() would also use the backup directory, but not the name value in BackupConfig.
There was a problem hiding this comment.
Changed it so the manager has "storage" as field and Backup/Restore only take BackupName
pkg/admin/data/manager.go
Outdated
| Restore(BackupConfig) error | ||
| } | ||
|
|
||
| func NewManager() *manager { |
There was a problem hiding this comment.
I think you're right that the backup directory could be held in the manager, and then the functions could take arguments that are more dynamic, like the name to restore from. A method like List() would also use the backup directory, but not the name value in BackupConfig.
pkg/cmd/admin.go
Outdated
| }, | ||
| } | ||
| v := version.Get() | ||
| data.PersistentFlags().String( |
There was a problem hiding this comment.
What's the use case for making the storage location configurable from the command line?
There was a problem hiding this comment.
I'd say nice to have for RPM systems where backups are manual. But i don't have concrete use case other than NFS (or different partition) mount for backups, but then CoW goes out of window
There was a problem hiding this comment.
OK, that makes sense. In that case I think we're going to not want to overwrite an existing backup, so we should think about how to handle that. Maybe just a check in the CLI code, before we get into the low-level code, where we do want old backups overwritten.
There was a problem hiding this comment.
Added BackupExists() and --force
pkg/admin/data/backup.go
Outdated
| } | ||
|
|
||
| if err := renamePath(dest_tmp, dest); err != nil { | ||
| klog.Errorf("Renaming path failed - renaming %s back and deleting %s: %v", dest_old, dest_tmp, err) |
There was a problem hiding this comment.
dest_old will only exist if the original location existed when this function started, so if the rename on line 60 fails for some other reason then the logic in this section is going to produce more errors.
What are the situations that could result in the location already existing? We can definitely write code to produce a unique name, so I think we only have to worry about the collision if the name comes from the user via the command line? What if we just refuse to take the backup in that case? Or don't let the user specify the full backup location (give us a directory and we pick a name in that directory)?
There was a problem hiding this comment.
Same code will be used in automated ostree backing up so 3rd boot into the same healthy deployment will overwrite backup taken during 2nd boot (and since backup name is 'deploymentID' the dir will already exist)
There was a problem hiding this comment.
We might want to move some of that overwrite logic "up" the call stack to a place where the code knows whether it is being invoked by the user or by automation. That will make the edge cases easier to reason about.
There was a problem hiding this comment.
Added function to interface BackupExists(BackupName) and it's called in the CLI. It will refuse to overwrite a backup unless --force is provided
klog includes filename in the logs, but if we add more manager.go then it'll be harder to know which one logged the message
1d04379 to
4bb597f
Compare
microshift admin backup commandmicroshift admin data backup command
dhellmann
left a comment
There was a problem hiding this comment.
/lgtm
/hold for QE verification
pkg/admin/data/data_manager.go
Outdated
| } | ||
|
|
||
| if err := renamePath(tmp, dest); err != nil { | ||
| klog.Errorf("Renaming path failed - renaming %s back and deleting %s: %v", old, tmp, err) |
There was a problem hiding this comment.
It's going to be interesting to set up a test for this error case.
There was a problem hiding this comment.
Since this is so unlikely, do you think that "recovery actions" (renamePath and removePath) are unnecessary? I would suspect that if one rename fails, then other might fail as well but since go might return error I thought this is the way to handle it (try to revert)
There was a problem hiding this comment.
We talked on slack about a way to generate unique names to avoid having to rename anything. Let's land this version and make that change in another PR.
|
/label tide/merge-method-squash |
pkg/admin/data/data_manager.go
Outdated
| } | ||
|
|
||
| if err := renamePath(tmp, dest); err != nil { | ||
| klog.Errorf("Renaming path failed - renaming %s back and deleting %s: %v", old, tmp, err) |
There was a problem hiding this comment.
We talked on slack about a way to generate unique names to avoid having to rename anything. Let's land this version and make that change in another PR.
| } | ||
| klog.InfoS("Backup storage directory created", "path", dm.storage) | ||
| } else { | ||
| klog.InfoS("Backup storage directory already existed", "path", dm.storage) |
There was a problem hiding this comment.
This indicates an attempt to write over an existing backup, right? I would expect that to be an error.
There was a problem hiding this comment.
Not really, that's the "parent directory" where backups are stored. Maybe you have how can we name it better ("backup storage") or maybe just simply delete this log (and keep only one about creating /var/lib/microshift-backups/)?
There was a problem hiding this comment.
Ah, I see what's going on now.
|
@pmtk: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
| } | ||
| klog.InfoS("Backup storage directory created", "path", dm.storage) | ||
| } else { | ||
| klog.InfoS("Backup storage directory already existed", "path", dm.storage) |
There was a problem hiding this comment.
Ah, I see what's going on now.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dhellmann, pmtk The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
I've preformed some early testing on this PR and it behaves as expected. |
No description provided.