Skip to content

ETCD-403: Revert user configuration for etcd and change default max quota#1506

Merged
openshift-merge-robot merged 1 commit intoopenshift:mainfrom
dusk125:revert-etcd-config
Mar 31, 2023
Merged

ETCD-403: Revert user configuration for etcd and change default max quota#1506
openshift-merge-robot merged 1 commit intoopenshift:mainfrom
dusk125:revert-etcd-config

Conversation

@dusk125
Copy link
Contributor

@dusk125 dusk125 commented Mar 15, 2023

To simplify the configuration, this PR reverts the changes made by #1435. Also, this sets the maximum on-disk quota for etcd to be the maximum allowed (and aligned with Openshift) as the default to reduce the likelihood that a customer will hit the quota during runtime.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 15, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 15, 2023

@dusk125: This pull request references ETCD-403 which is a valid jira issue.

Details

In response to this:

To simplify the configuration, this PR reverts the changes made by #1435. Also, this sets the maximum on-disk quota for etcd to be the maximum allowed (and aligned with Openshift) as the default to reduce the likelihood that a customer will hit the quota during runtime.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 15, 2023

/assign @dhellmann @pmtk

@openshift-ci openshift-ci bot requested review from copejon and ggiguash March 15, 2023 15:51
@dusk125
Copy link
Contributor Author

dusk125 commented Mar 15, 2023

/hold We want to see the memory behavior of etcd during a defragment of a large (near 8gb quota) database to ensure that it does not grow in memory usage over time and doesn't heavily spike in usage in the short term. We should expect to see a sawtooth pattern of usage.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2023
@dhellmann
Copy link
Contributor

We are pretty confident that we do want this change, so I'll approve the PR. We should wait for the measurements before merging it, as Allen says.

/approve

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 15, 2023
@dusk125 dusk125 force-pushed the revert-etcd-config branch from 03daad1 to 31cdc11 Compare March 22, 2023 16:28
@dusk125
Copy link
Contributor Author

dusk125 commented Mar 22, 2023

So it looks like etcd has a memory overhead of 30-60% (during times of heavy use) in addition to attempting to hold the database file(s) in memory. Doug and I agreed that keeping the quota at 8GB and allowing the user to opt-in to a memory limit was a good way forward. The most recent version of this PR introduces a memory limit for etcd that a user can add if they want; by default, there's no limit.

I ran a synthetic 'put' test against a fresh etcd server with varying memory limits and found that 50MB is a good minimum before etcd performance degrades significantly. Therefore, Microshift config will fail to parse if a user attempts to set a value lower than 50MB.

Copy link
Contributor

@dhellmann dhellmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update docs/howto_config.md as part of this PR, too?

@dusk125 dusk125 force-pushed the revert-etcd-config branch from 31cdc11 to e74e7a3 Compare March 24, 2023 14:51
@dhellmann
Copy link
Contributor

I had one nit, and there's a hold. We can address the nit separately, so if we're ready to move ahead otherwise please remove the hold.

@dhellmann
Copy link
Contributor

/retest

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 27, 2023

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 27, 2023
@dusk125 dusk125 force-pushed the revert-etcd-config branch from e74e7a3 to dfd0253 Compare March 27, 2023 13:36
@dusk125 dusk125 force-pushed the revert-etcd-config branch from dfd0253 to 8d231d7 Compare March 27, 2023 15:05
Copy link
Contributor

@dhellmann dhellmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 doc nits and then I think this is ready to merge

@dusk125 dusk125 force-pushed the revert-etcd-config branch from 8d231d7 to 042457a Compare March 27, 2023 16:17
@dusk125
Copy link
Contributor Author

dusk125 commented Mar 27, 2023

/retest

@dhellmann
Copy link
Contributor

/lgtm
/cherry-pick release-4.13

@openshift-cherrypick-robot

@dhellmann: once the present PR merges, I will cherry-pick it on top of release-4.13 in a new PR and assign it to you.

Details

In response to this:

/lgtm
/cherry-pick release-4.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 27, 2023
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 28, 2023
@dhellmann
Copy link
Contributor

/test e2e-openshift-conformance-reduced
/test e2e-openshift-conformance-reduced-arm

@dhellmann
Copy link
Contributor

This version looks good. Let's see if these are just test flakes.

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 29, 2023

/test e2e-openshift-conformance-reduced
/test e2e-openshift-conformance-reduced-arm

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 29, 2023

I was able to see that etcd is running in the sos report in the CI failures, so I think these are just flakes as well.

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 29, 2023

/retest

@dusk125 dusk125 force-pushed the revert-etcd-config branch from fb9f2dd to e836402 Compare March 29, 2023 19:33
@copejon
Copy link
Contributor

copejon commented Mar 29, 2023

/test e2e-greenboot-arm

@dusk125 dusk125 force-pushed the revert-etcd-config branch 2 times, most recently from 9f26584 to c664334 Compare March 30, 2023 18:30
@dhellmann
Copy link
Contributor

I think this strikes the right balance. Let's see what CI has to say.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 30, 2023
@dhellmann
Copy link
Contributor

/test e2e-router-smoke-test

@dhellmann
Copy link
Contributor

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2023
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD ea60015 and 2 for PR HEAD c664334184d559b68fac9b7dd69d509dece55341 in total

@dusk125
Copy link
Contributor Author

dusk125 commented Mar 31, 2023

/retest

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD a94f74f and 1 for PR HEAD c664334184d559b68fac9b7dd69d509dece55341 in total

@dusk125 dusk125 force-pushed the revert-etcd-config branch from c664334 to c2f45a4 Compare March 31, 2023 16:46
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2023
@dhellmann
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 31, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 31, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhellmann, dusk125

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD a94f74f and 2 for PR HEAD c2f45a4 in total

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 31, 2023

@dusk125: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-cherrypick-robot

@dhellmann: new pull request created: #1604

Details

In response to this:

/lgtm
/cherry-pick release-4.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dusk125 dusk125 deleted the revert-etcd-config branch April 6, 2023 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants