@weizhouapache
Member

@weizhouapache weizhouapache commented Sep 13, 2021

Description

This PR fixes some issues with OVS/GRE.
Issues with isolated networks/VPCs: #2886 #2996 #3802
Issues with shared networks: #2863 #2885

To test this PR, we need to create a zone with isolation method 'GRE' instead of the default 'VLAN'.
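
As a rough illustration of the setup step above, the sketch below builds the query parameters for a `createPhysicalNetwork` API call with `isolationmethods` set to `GRE`. The zone ID and network name are hypothetical placeholders, and request signing and the HTTP call are left out; this is an assumption about how one might script the step, not part of this PR.

```python
from urllib.parse import urlencode


def physical_network_params(zone_id, name, isolation="GRE"):
    """Build query parameters for a createPhysicalNetwork API call.

    zone_id and name are supplied by the caller; signing the request and
    sending it to the management server are deliberately not shown.
    """
    return {
        "command": "createPhysicalNetwork",
        "zoneid": zone_id,
        "name": name,
        "isolationmethods": isolation,  # 'GRE' instead of the default 'VLAN'
        "response": "json",
    }


if __name__ == "__main__":
    # Hypothetical zone ID and network name, for illustration only.
    print(urlencode(physical_network_params("zone-uuid", "gre-physnet")))
```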

Test results
(1) smoke test test/integration/smoke/test_privategw_acl_ovs_gre.py passed on kvm and xenserver
(2) component test test/integration/component/test_vpc_distributed_routing_offering.py passed on xenserver.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1224

@weizhouapache
Member Author

@blueorangutan test centos7 xenserver-71 keepEnv

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + xenserver-71) has been kicked to run smoke tests

@weizhouapache
Member Author

@blueorangutan test centos7 xenserver-71 keepEnv

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + xenserver-71) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2031)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 50918 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5446-t2031-xenserver-71.zip
Smoke tests completed. 88 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| ContextSuite context=TestRVPCSite2SiteVpn>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVPCSite2SiteVPNMultipleOptions>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVpcRemoteAccessVpn>:setup | Error | 0.00 | test_vpc_vpn.py |
| ContextSuite context=TestVpcSite2SiteVpn>:setup | Error | 0.00 | test_vpc_vpn.py |

@svenvogel
Contributor

@weizhouapache only for interest because we are using OVS/VLAN at the moment with KVM. What are the fixes in detail? 😄 thanks and cheers

(1) VR <-> VM should work
(2) Private GW should work
OVS bridges are deleted by XenServer/OVS automatically:
```
[root@ref-trl-1797-x-M7-wei-zhou-xs2 ~]# grep -r xapi7 /var/log/ |grep del-br
/var/log/xensource.log:Sep 15 07:13:44 ref-trl-1797-x-M7-wei-zhou-xs2 xcp-networkd: [ info|localhost|611 |org.xen.xapi.xenops.classic events D:4a3d931cd89f|network_utils] /usr/bin/ovs-vsctl --timeout=20 -- --if-exists del-br xapi7
/var/log/daemon.log:Sep 15 07:13:45 ref-trl-1797-x-M7-wei-zhou-xs2 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl --timeout=20 -- --if-exists del-br xapi7
```

As a result, the xe network exists but the bridge does not, and the operation is stuck for 20 minutes at:
```
2021-09-15 16:06:56    DEBUG [root] #### VMOPS enter  create_tunnel ####
2021-09-15 16:06:56    DEBUG [root] Creating tunnel from host 2 to host 1 with GRE key 2116
2021-09-15 16:06:56    DEBUG [root] Executing:['/usr/bin/ovs-vsctl', '--timeout=0', 'wait-until', 'bridge', 'xapi7', '--', 'get', 'bridge', 'xapi7', 'name']
2021-09-15 16:26:56    DEBUG [root] bridge xapi7 for creating tunnel - VERIFIED
2021-09-15 16:26:56    DEBUG [root] Executing:['/usr/bin/ovs-vsctl', 'add-port', 'xapi7', 't2116-2-1', '--', 'set', 'interface', 't2116-2-1', 'type=gre', 'options:key=2116', 'options:remote_ip=10.0.34.230']
```
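
The 20-minute stall in the log above comes from `wait-until` blocking on a bridge that has been deleted. One way to avoid such a stall, sketched here under assumptions and not taken from this PR's commits, is to test for the bridge with `ovs-vsctl br-exists` (which exits 0 if the bridge exists and 2 if it does not) under a bounded `--timeout`, and to fail fast otherwise. The function names and the injectable `run` parameter are hypothetical; the `add-port` arguments mirror the logged command.

```python
import subprocess

OVS_VSCTL = "/usr/bin/ovs-vsctl"


def bridge_exists_cmd(bridge, timeout_s=20):
    """Command that exits 0 if the bridge exists and 2 if it does not."""
    return [OVS_VSCTL, "--timeout=%d" % timeout_s, "br-exists", bridge]


def add_gre_port_cmd(bridge, port, key, remote_ip):
    """Build the same add-port invocation that appears in the log."""
    return [OVS_VSCTL, "add-port", bridge, port, "--",
            "set", "interface", port, "type=gre",
            "options:key=%d" % key, "options:remote_ip=%s" % remote_ip]


def create_tunnel(bridge, port, key, remote_ip, run=subprocess.call):
    """Raise immediately when the bridge is missing instead of waiting on it.

    `run` takes a command list and returns its exit code; it is a parameter
    so the logic can be exercised without a real Open vSwitch install.
    """
    if run(bridge_exists_cmd(bridge)) != 0:
        raise RuntimeError("bridge %s does not exist" % bridge)
    return run(add_gre_port_cmd(bridge, port, key, remote_ip))
```

With this shape, a caller could recreate the bridge or report the error to the management server as soon as the check fails, rather than holding the tunnel creation open.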
@weizhouapache
Member Author

> @weizhouapache only for interest because we are using OVS/VLAN at the moment with KVM. What are the fixes in detail? thanks and cheers

@svenvogel
This PR fixes some issues with Open vSwitch/GRE support on KVM and XenServer.
You can find the description of each issue in the corresponding commit.

OVS/VLAN is not impacted.

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1307

@weizhouapache
Member Author

@blueorangutan test

@weizhouapache
Member Author

@blueorangutan test centos7 xenserver-71

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2114)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 38606 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5446-t2114-kvm-centos7.zip
Smoke tests completed. 89 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 75.02 | test_kubernetes_clusters.py |

@yadvr yadvr modified the milestones: 4.17.0.0, 4.16.0.0 Sep 21, 2021
@yadvr
Member

yadvr commented Sep 21, 2021

@weizhouapache is this critical for 4.16, and is this ready for review - how are we testing it?

@blueorangutan

Trillian test result (tid-2198)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47132 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5446-t2198-kvm-centos7.zip
Smoke tests completed. 87 look OK, 3 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Failure | 406.95 | test_internal_lb.py |
| test_08_deploy_and_upgrade_kubernetes_ha_cluster | Failure | 143.34 | test_kubernetes_clusters.py |
| test_02_VPC_default_routes | Failure | 610.40 | test_vpc_router_nics.py |

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1401

@weizhouapache
Member Author

@blueorangutan test

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2212)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34079 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5446-t2212-kvm-centos7.zip
Smoke tests completed. 90 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@DaanHoogland
Contributor

@sureshanaparti did you ping the interested parties for testing? Or do you have any advice on testing?

@weizhouapache
Member Author

> @sureshanaparti did you ping the interested parties for testing? Or do you have any advice on testing?

@sureshanaparti @DaanHoogland @nvazquez
I think nobody is using this setting now (advanced zone with GRE isolation) as the feature was broken in some previous versions.

My opinion is:
(1) If anyone wants to test it, that is fine. Ping me if you need help.
(2) If nobody has a chance to test it, I think we can merge it as long as we make sure it does not break existing functionality (by code review, smoke test, manual test, etc.).

I have tested the feature manually.
The new smoke test (test/integration/smoke/test_privategw_acl_ovs_gre.py) passed on KVM and XenServer. The component test test_vpc_distributed_routing_offering.py passed on XenServer, but not on KVM, as the feature was never implemented on KVM.

If someone finds bugs in their testing, we can fix them in the next minor releases (4.16 will be LTS as far as I know).

Contributor

@nvazquez nvazquez left a comment

Edit: Still testing on OVS-KVM environment

I manually updated the database to set GRE isolation on the existing physical network, then added and enabled the Ovs provider.

[Screenshot: vpc-test2]

@apache apache deleted a comment from blueorangutan Sep 29, 2021
@weizhouapache
Member Author

@blueorangutan package

@nvazquez
Contributor

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1447

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1448

@nvazquez nvazquez assigned borisstoyanov and unassigned nvazquez Sep 30, 2021
@yadvr
Member

yadvr commented Sep 30, 2021

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2251)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33248 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5446-t2251-kvm-centos7.zip
Smoke tests completed. 91 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@yadvr yadvr merged commit 09fce75 into apache:main Oct 3, 2021
Development

Successfully merging this pull request may close these issues.

  • [Bug Report] - GRE Isolation Broken in VPC Tier
  • Feature Request - Create shared network in advanced zone using GRE tunnels / OVS Plugin

9 participants