Bug #46496 for Net-BGP: Bug in Update.pm with empty NLRI and attributes

Fri May 29 16:00:12 2009 olivier.montanuy [...] m4x.org - Ticket created

Subject:

Bug in Update.pm with empty NLRI and attributes

BGP Update messages with Withdrawn prefix but empty Attributes and NLRI will cause the Net::BGP to tear down the BGP session. Consequence: the BGP session is restarted whenever some routes are withdrawn without any new route announced, that happen within a few minutes on a typical BGP session with Internet full routing. The initial announce of 300k routes causes CPU load on both server and routers. Cause: Update.pm _decode_path_attributes checks that mandatory BGP attributes are present, and of course that cannot work when the attribute list is empty. So a BGP error is reported even though the BGP message is perfectly valid. Note: the bug is present in all versions of the lib, but may not cause any trouble in older versions that did not check mandatory BGP attributes. Quick fix: check that $length is above zero before calling _decode_path_attributes and _decode_nlri I don't provide a patch because I'm not sure this is the right way to do fix the bug, since I only use a little part of the API. But with that fix, the BGP sessions can last several hours without problems.

Fri May 29 18:08:21 2009 kbrint [...] rufus.net - Taken

Wed Jul 01 09:39:40 2009 guillaume.lambert-geek [...] laposte.net - Correspondence added

From:

guillaume.lambert-geek [...] laposte.net

Hello gentlemen I would like to add some precisions on this topic. I have taken a quick look at all versions of Net::BGP since 0.08 and it seems confirmed that they are all suffering from the bug. However, only versions above 0.10 (included) seem to cause the BGP session to tear down ( provided of course that when instantiating the Net::BGP process, you didn't put in your notification callback function some commands to explicitly do it ) As meant by Olivier, the bug can easily be corrected by adding a little control line in the _decode_message function before calling _decode_path_attributes . See details below. sub _decode_message { … if ( $length > (length($buffer) - $offset) ) { Net::BGP::Notification->throw( ErrorCode => BGP_ERROR_CODE_UPDATE_MESSAGE, ErrorSubCode => BGP_ERROR_SUBCODE_MALFORMED_ATTR_LIST ); } # update message without any NLRI - dont check attributes if( $length <= 0) return; $this->_decode_path_attributes(substr($buffer, $offset, $length)); $offset += $length; … } But in my opinion, the most interesting issue that was raised here is why only versions above 0.10 were actually causing the BGP session to tear down. The answer is maybe in the fact that since version 0.10 a new function called throw was added to the Net::BGP::Notification package: sub throw { my $class = shift; my $notif = $class->new(@_); die $notif; } Before version 0.10, everytime NetBGP detected an error, it was instantiating directly a new Notification and nothing else particular was done: $error = new Net::BGP::Notification( ErrorCode => BGP_ERROR_CODE_UPDATE_MESSAGE, ErrorSubCode => BGP_ERROR_SUBCODE_BAD_ATTR_LENGTH, ErrorData => $error_data ); Since version 0.10, this code has been replaced by a call to this new throw function: Net::BGP::Notification->throw( ErrorCode => BGP_ERROR_CODE_UPDATE_MESSAGE, ErrorSubCode => BGP_ERROR_SUBCODE_BAD_ATTR_LENGTH, ErrorData => $error_data ); Which creates of course a new Notification to be instantiated but also requests the current process to die, causing the BGP session to tear down. I think that it is a little bit too strict for a BGP listener. As far as I am concerned, the behaviour to adopt in that case should be configurable by the user if Net::BGP intends to answer any specifics purposes. For example, a typical use case of a BGP listener developped with Net::BGP could be a BGP implementation validation tool that need to be error-proof. Hope thise helps Best Regards Guillaume Le Ven. Mai. 29 16:00:12 2009, Skewell a écrit : Show quoted text

> BGP Update messages with Withdrawn prefix but empty Attributes and NLRI > will cause the Net::BGP to tear down the BGP session. > > Consequence: the BGP session is restarted whenever some routes are > withdrawn without any new route announced, that happen within a few > minutes on a typical BGP session with Internet full routing. > > The initial announce of 300k routes causes CPU load on both server and > routers. > > Cause: Update.pm _decode_path_attributes checks that mandatory BGP > attributes are present, and of course that cannot work when the > attribute list is empty. So a BGP error is reported even though the > BGP message is perfectly valid. > > Note: the bug is present in all versions of the lib, but may not cause > any trouble in older versions that did not check mandatory BGP attributes. > > > Quick fix: check that $length is above zero before calling > _decode_path_attributes and _decode_nlri > > I don't provide a patch because I'm not sure this is the right way to do > fix the bug, since I only use a little part of the API. > But with that fix, the BGP sessions can last several hours without

problems. Show quoted text

> > > >

Wed Jul 01 09:39:41 2009 The RT System itself - Status changed from 'new' to 'open'

Wed Jul 15 22:53:18 2009 kbrint [...] rufus.net - Correspondence added 60 min

Subject:	Re: Bug in Update.pm with empty NLRI and attributes
RT-Send-CC:	guillaume.lambert-geek [...] laposte.net, elisa.jasinska [...] ams-ix.net

Thank you for the bug reports. Net::BGP v0.13 addresses this problem.. I've just uploaded it to CPAN. Please test and let me know if you find any problems. Gillaume, I plan to add a special callback for the notification error in the next revision. Stay tuned. Thanks for your feedback. kevin On Fri May 29 16:00:12 2009, Skewell wrote: Show quoted text

> BGP Update messages with Withdrawn prefix but empty Attributes and NLRI > will cause the Net::BGP to tear down the BGP session. > > Consequence: the BGP session is restarted whenever some routes are > withdrawn without any new route announced, that happen within a few > minutes on a typical BGP session with Internet full routing. > > The initial announce of 300k routes causes CPU load on both server and > routers. > > Cause: Update.pm _decode_path_attributes checks that mandatory BGP > attributes are present, and of course that cannot work when the > attribute list is empty. So a BGP error is reported even though the > BGP message is perfectly valid. > > Note: the bug is present in all versions of the lib, but may not cause > any trouble in older versions that did not check mandatory BGP attributes. > > > Quick fix: check that $length is above zero before calling > _decode_path_attributes and _decode_nlri > > I don't provide a patch because I'm not sure this is the right way to do > fix the bug, since I only use a little part of the API. > But with that fix, the BGP sessions can last several hours without problems. > > > >

Wed Jul 15 22:53:21 2009 kbrint [...] rufus.net - Status changed from 'open' to 'resolved'

Tue Jul 28 09:20:13 2009 olivier.montanuy [...] m4x.org - Broken in 0.10 added

Tue Jul 28 09:20:14 2009 olivier.montanuy [...] m4x.org - Fixed in 0.13 added

Tue Jul 28 09:34:49 2009 olivier.montanuy [...] orange-ftgroup.com - Correspondence added

Subject:	RE: [rt.cpan.org #46496] Re: Bug in Update.pm with empty NLRI and attributes - FIXED
Date:	Tue, 28 Jul 2009 15:34:19 +0200
To:	<bug-Net-BGP [...] rt.cpan.org>
From:	<olivier.montanuy [...] orange-ftgroup.com>

I just tested that 0.13 works fine with a 7200 and 324k Internet routes. The 7200 announced all, then withdrew all routes, then re-announced them all. The BGP session did not fail, and the prefixes were learnt correctly. Thanks a lot for the bug fix Olivier Montanuy Show quoted text

> -----Message d'origine----- > De : kevin brintnall via RT [mailto:bug-Net-BGP@rt.cpan.org] > Envoyé : jeudi 16 juillet 2009 04:53 > À : olivier.montanuy@m4x.org > Objet : [rt.cpan.org #46496] Re: Bug in Update.pm with empty > NLRI and attributes > > <URL: https://rt.cpan.org/Ticket/Display.html?id=46496 > > > Thank you for the bug reports. Net::BGP v0.13 addresses this > problem.. I've just uploaded it to CPAN. Please test and > let me know if you find any problems. > > Gillaume, I plan to add a special callback for the > notification error in the next revision. Stay tuned. > > Thanks for your feedback. > > kevin > > On Fri May 29 16:00:12 2009, Skewell wrote:

> > BGP Update messages with Withdrawn prefix but empty Attributes and > > NLRI will cause the Net::BGP to tear down the BGP session. > > > > Consequence: the BGP session is restarted whenever some routes are > > withdrawn without any new route announced, that happen within a few > > minutes on a typical BGP session with Internet full routing. > > > > The initial announce of 300k routes causes CPU load on both

> server and

> > routers. > > > > Cause: Update.pm _decode_path_attributes checks that mandatory BGP > > attributes are present, and of course that cannot work when the > > attribute list is empty. So a BGP error is reported even though the > > BGP message is perfectly valid. > > > > Note: the bug is present in all versions of the lib, but

> may not cause

> > any trouble in older versions that did not check mandatory

> BGP attributes.

> > > > > > Quick fix: check that $length is above zero before calling > > _decode_path_attributes and _decode_nlri > > > > I don't provide a patch because I'm not sure this is the

> right way to

> > do fix the bug, since I only use a little part of the API. > > But with that fix, the BGP sessions can last several hours

> without problems.

> > > > > > > >

> > > >

Tue Jul 28 09:34:50 2009 The RT System itself - Status changed from 'resolved' to 'open'

Mon Aug 24 11:46:29 2009 kbrint [...] rufus.net - Status changed from 'open' to 'resolved'

Tue Aug 25 14:13:23 2009 omontanuy [...] orange.fr - Correspondence added

Subject:	Re: [rt.cpan.org #46496] Resolved: Bug in Update.pm with empty NLRI and attributes
Date:	Tue, 25 Aug 2009 20:12:36 +0200
To:	bug-Net-BGP [...] rt.cpan.org
From:	Olivier Montanuy <omontanuy [...] orange.fr>

kevin brintnall via RT wrote: Show quoted text

> <URL: https://rt.cpan.org/Ticket/Display.html?id=46496 > > > According to our records, your request has been resolved. If you have any > further questions or concerns, please respond to this message

I don't understand this message, since I indicated one month ago that the bug was resolved.

Tue Aug 25 14:13:23 2009 The RT System itself - Status changed from 'resolved' to 'open'

Tue Aug 25 15:17:35 2009 kbrint [...] rufus.net - Correspondence added

Every email reply re-opens the ticket. Please do not reply. Marking resolved again.

Tue Aug 25 15:17:36 2009 kbrint [...] rufus.net - Status changed from 'open' to 'resolved'