Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Change for given Route ECMP to fallback on Default Route ECMP #3389

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

abdosi
Copy link
Contributor

@abdosi abdosi commented Nov 24, 2024

What I did:
Added Change for given Route ECMP to fallback on Default Route ECMP. When all the Members of Route are Link Down and if route is eligible for fallback to default route the ECMP Member in SAI Nexthop Goup are updated to the Default Route Nexthop/Nexthop's Members.

This change does not take care of this scenarios:

  1. When the Route which is fallback on Default Route Nexthops if the original nexthop become active [link comes up] it does not move back to original path. Reason is we except this should transient case as the Route which is fallback should get deleted once all the links are down

  2. If Default Routes gets updated [BGP Updates] or if default Route nexthops become link down we do not update ECMP members of Routes that are already fallback to default. Again Reason being Route which is fallback should get deleted once all the links are down and is during this short window getting default routes update is very corner case. We can optimize if needed.

Why I did:
For Faster of Traffic Convergence for Routes where it is ok to send traffic over default route when most specific prefix/route do not have any valid nexthops for transient time before more specific route gets deleted.

How I verified:
UT updated
Ixia based Traffic Convergance.

Reference to full context of this changes
Swss_route_enhancemnts.docx

Signed-off-by: Abhishek Dosi <[email protected]>
Signed-off-by: Abhishek Dosi <[email protected]>
Signed-off-by: Abhishek Dosi <[email protected]>
@abdosi abdosi marked this pull request as ready for review November 25, 2024 03:45
@abdosi abdosi requested a review from prsunny as a code owner November 25, 2024 03:45
@abdosi abdosi changed the title Changes for fallback to default route Added Change for given Route ECMP to fallback on Default Route ECMP Nov 25, 2024
@abdosi abdosi requested a review from arlakshm November 25, 2024 03:56

if (default_nhg_key.getSize() == 1)
{
current_default_route_nhops.insert(*default_nhg_key.getNextHops().begin());
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation #Closed


if (nhopgroup->second.nh_member_install_count == 0 && nhopgroup->second.eligible_for_default_route_nh_swap && !nhopgroup->second.is_default_route_nh_swap)
{
if(nexthop.ip_address.isV4())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if at this time the default route from bgp is not present. will the v4_active_default_route_nhops have the drop port?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arlakshm : if no default route than existing behavior will happen where nexthop group will not have any members which will cause drop as expected.

Comment on lines +1132 to +1133
{
if (ip_prefix.isV4())
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation #Resolved

@@ -673,6 +761,10 @@ void RouteOrch::doTask(Consumer& consumer)
{
ctx.protocol = fvValue(i);
}
if (fvField(i) == "fallback_to_default_route")
{
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation. mix of tabs and spaces #Closed

if (fvField(i) == "fallback_to_default_route")
{
fallback_to_default_route = fvValue(i) == "true";
}
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation. mix of tabs and spaces #Closed

@@ -1067,9 +1174,19 @@ void RouteOrch::doTask(Consumer& consumer)
}
else if (m_syncdNextHopGroups[it_nhg.first].ref_count == 0)
{
removeNextHopGroup(it_nhg.first);
// Pass the flag to indicate if the NextHop Group as Default Route NH Members as swapped.
removeNextHopGroup(it_nhg.first, m_syncdNextHopGroups[it_nhg.first].is_default_route_nh_swap);
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation #Resolved

updateDefaultRouteSwapSet(v4_default_nhg_key, v4_active_default_route_nhops);

if (v6_default_nhg_key.getSize())
updateDefaultRouteSwapSet(v6_default_nhg_key, v6_active_default_route_nhops);
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation #Resolved


std::string key; // Key in database table
std::string protocol; // Protocol string
bool is_set; // True if set operation

RouteBulkContext(const std::string& key, bool is_set)
: key(key), excp_intfs_flag(false), using_temp_nhg(false), is_set(is_set)
: key(key), excp_intfs_flag(false), using_temp_nhg(false), is_set(is_set),
fallback_to_default_route(false)
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mix of tabs and spaces #Closed

@@ -146,6 +152,7 @@ struct RouteBulkContext
using_temp_nhg = false;
key.clear();
protocol.clear();
fallback_to_default_route = false;
Copy link
Contributor

@arlakshm arlakshm Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove tabs #Closed

Copy link
Contributor

@arlakshm arlakshm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/Azp run Azure.sonic-swss

@arlakshm
Copy link
Contributor

/Azp run Azure.sonic-swss

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

arlakshm
arlakshm previously approved these changes Nov 28, 2024
Copy link
Contributor

@arlakshm arlakshm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

tests/test_nhg.py Dismissed Show dismissed Hide dismissed
Copy link
Collaborator

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few questions on approach:

  1. Why don't we simply delete the route if NHGroup is empty for a route and this option is set?
    @prsunny : Th main reason for this
    a. Did not want to change regular flow of Route Delete of getting message from fpmsyncd.
    b. Goal for this changes are to minimize traffic loss and in scale of 70K routes updating few Nexthopgroup members is much fater than deleting 70K routes.

  2. Why not reuse the default route NHgroup instead of creating a new one?
    @prsunny we are not creating new nexthop group . instead we adding default route nexthop's into exisitng group. We do not want to do set operation on Route to update nexthop group which will for scale of 70K routes does not help w.r.t traffic loss

orchagent/routeorch.cpp Show resolved Hide resolved
orchagent/routeorch.cpp Show resolved Hide resolved
orchagent/routeorch.cpp Show resolved Hide resolved
@@ -673,6 +761,10 @@ void RouteOrch::doTask(Consumer& consumer)
{
ctx.protocol = fvValue(i);
}
if (fvField(i) == "fallback_to_default_route")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who sets this attribute?

Copy link
Contributor Author

@abdosi abdosi Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny : it is being done here: #3353

tests/test_nhg.py Dismissed Show dismissed Hide dismissed
tests/test_nhg.py Dismissed Show dismissed Hide dismissed
tests/test_nhg.py Dismissed Show dismissed Hide dismissed
tests/test_nhg.py Dismissed Show dismissed Hide dismissed
tests/test_nhg.py Dismissed Show dismissed Hide dismissed
tests/test_nhg.py Dismissed Show dismissed Hide dismissed
@abdosi
Copy link
Contributor Author

abdosi commented Dec 11, 2024

/azp run Azure.sonic-swss

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@abdosi
Copy link
Contributor Author

abdosi commented Dec 12, 2024

@prsunny : can you help with sign-off/merge of the PR.

@abdosi
Copy link
Contributor Author

abdosi commented Dec 16, 2024

@prsunny : Document attached

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@arlakshm
Copy link
Contributor

arlakshm commented Jan 3, 2025

@abdosi can you fix the conflicts

arlakshm added a commit to Azure/sonic-swss.msft that referenced this pull request Jan 8, 2025
…fault Route ECMP

[202405] Added Change for given Route ECMP to fallback on Default Route ECMP

What I did:

Cherry-pick of PR: sonic-net/sonic-swss#3389
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants