Lots of problems with the latest firmware, including system-wide auth problems

  • Question
  • Updated 1 year ago
  • (Edited)
We have experienced horrible, campus-wide problems since upgrading our APs to the latest releases. The first issues I noticed were that I could not push a full config to some models, that CPU utilization was high, and that SSH was broken. After downgrading as follows, as directed by support, those issues have disappeared.

HiveOS 6.4r1g.2138 - 121
HiveOS 6.5r4 Honolulu.128121 - 330
HiveOS 7.0r2 Bay.131568 - 250
HiveOS 6.5r4 Honolulu.128121 - 230


However, we continue to have massive authentication / deauth issues. We have the same setup that has been working well for a long time: a primary and a secondary RADIUS proxy set up on two Aerohive APs, with NPS acting as the auth server. Auth will work for a number of hours, and then clients (50%+) get deauthenticated or cannot connect.

Testing the connection between the Radius proxy in Aerohive and the NPS server returns messages such as:
"There is no configuration on the device for the specified RADIUS server".
"The RADIUS server rejected the Access Request message. Check the shared secret in configuration."
"The connection has timed out"

Then maybe 20 or 30 minutes later, everything is OK again, for a bit.

We don't have any more clients than we did a few weeks ago (a good mix of Apple, PC and Chromebooks), and there is nothing on the NPS server to indicate this is a resource issue. As a matter of fact, I get the aforementioned errors even when I am the only one on campus.

I have done the physical-layer troubleshooting (cable, switch port).

I have tried a number of things with support including adding a newer AP to act as the RADIUS proxy.

The NPS logs don't reveal anything helpful.  

I am currently mirroring the primary Aerohive RADIUS proxy AP switch port and capturing in Wireshark.  

I would love to have suggestions on where to go from here.   


edit:  sample from client monitor

01/12/2017 02:06:33 PM  2CBE08F17496  9C5D12DB1364  US-Computer-Lab           

DETAIL  (131)Send message to RADIUS Server(1x.1x.x.x): code=1 (Access-Request) identifier=27 length=146,  User-Name=xxxxxxxxxx NAS-IP-Address=1x.1x.x.x Called-Station-Id=9C-5D-12-DB-13-64:CFS Calling-Station-Id=2C-BE-08-F1-74-96


This will repeat over and over and then out of nowhere the client will complete the auth process.  
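To put a number on "over and over", the client monitor output can be tallied per client. The following is a minimal sketch, assuming the log lines look like the single sample above; the regular expression and the function name `count_access_requests` are illustrative and not part of any Aerohive tooling, and other HiveOS builds may format these lines differently.

```python
import re
from collections import Counter

# Pattern modeled on the one sample line in this post (an assumption).
ACCESS_REQUEST = re.compile(
    r"code=1 \(Access-Request\) identifier=(?P<ident>\d+).*?"
    r"Calling-Station-Id=(?P<client>[0-9A-Fa-f-]{17})"
)


def count_access_requests(lines):
    """Count Access-Request messages seen per client MAC address."""
    counts = Counter()
    for line in lines:
        match = ACCESS_REQUEST.search(line)
        if match:
            counts[match.group("client").upper()] += 1
    return counts


# Two Access-Requests for the same client, plus one unrelated line.
TEMPLATE = (
    "DETAIL  (131)Send message to RADIUS Server(1x.1x.x.x): "
    "code=1 (Access-Request) identifier={ident} length=146,  "
    "User-Name=xxxxxxxxxx NAS-IP-Address=1x.1x.x.x "
    "Called-Station-Id=9C-5D-12-DB-13-64:CFS "
    "Calling-Station-Id=2C-BE-08-F1-74-96"
)
sample_lines = [
    TEMPLATE.format(ident=27),
    TEMPLATE.format(ident=28),
    "INFO  some unrelated client monitor line",
]
request_counts = count_access_requests(sample_lines)
for client, total in request_counts.most_common():
    print(client, total)
```

A client whose count keeps climbing without a matching Access-Accept or Access-Reject is a candidate for the retransmission behaviour described here.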


Dawn Douglass

  • 67 Posts
  • 3 Reply Likes

Posted 2 years ago


Nick Lowe, Official Rep

  • 2491 Posts
  • 451 Reply Likes
Hi Dawn,

There is a known issue with certain forms of Class attributes in HiveOS 6.5r5 and HiveOS 7.1r1, experienced primarily with NPS, and a separate issue with the calculation of the Request Authenticator for RADIUS accounting retransmissions in HiveOS 6.5r5, HiveOS 6.8r1 and HiveOS 7.0rX. These can cause authentication issues.

These will be corrected with our upcoming releases.
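For background on why a retransmission can go wrong: per RFC 2866, the Request Authenticator of an Accounting-Request is an MD5 hash over the packet header, 16 zero octets, the attributes and the shared secret, so it has to be recomputed whenever the attributes of a retransmitted packet change (for example an updated Acct-Delay-Time). The sketch below illustrates that calculation only; it is not HiveOS code, and the function names and the shared secret are made up for the example.

```python
import hashlib
import hmac
import struct

ACCOUNTING_REQUEST = 4  # RADIUS packet code for Accounting-Request
HEADER_LEN = 20         # code (1) + identifier (1) + length (2) + authenticator (16)


def build_accounting_request(identifier, attributes, secret):
    """Build an Accounting-Request with its Request Authenticator filled in."""
    header = struct.pack("!BBH", ACCOUNTING_REQUEST, identifier,
                         HEADER_LEN + len(attributes))
    # MD5(Code + Identifier + Length + 16 zero octets + attributes + secret)
    authenticator = hashlib.md5(header + bytes(16) + attributes + secret).digest()
    return header + authenticator + attributes


def verify_accounting_request(packet, secret):
    """Recompute the authenticator the way a RADIUS server would and compare."""
    _code, _identifier, length = struct.unpack("!BBH", packet[:4])
    attributes = packet[HEADER_LEN:length]
    expected = hashlib.md5(packet[:4] + bytes(16) + attributes + secret).digest()
    return hmac.compare_digest(expected, packet[4:HEADER_LEN])


secret = b"example-shared-secret"                       # hypothetical value
acct_start = bytes([40, 6]) + (1).to_bytes(4, "big")    # Acct-Status-Type = Start
delay_5s = bytes([41, 6]) + (5).to_bytes(4, "big")      # Acct-Delay-Time = 5

original = build_accounting_request(7, acct_start, secret)

# A retransmission that appends Acct-Delay-Time but reuses the old authenticator
# no longer verifies; rebuilding the packet recomputes it correctly.
new_attrs = acct_start + delay_5s
stale = (struct.pack("!BBH", ACCOUNTING_REQUEST, 8, HEADER_LEN + len(new_attrs))
         + original[4:HEADER_LEN] + new_attrs)
fresh = build_accounting_request(8, new_attrs, secret)

print(verify_accounting_request(original, secret))
print(verify_accounting_request(stale, secret))
print(verify_accounting_request(fresh, secret))
```

A server that receives the stale variant has no way to distinguish it from a packet sent with the wrong shared secret, and is expected to discard it.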

My suggestion would be to temporarily disable RADIUS accounting for the RADIUS servers configured via HiveManager, and to ensure that NPS does not generate RADIUS Class attributes. To do the latter, add the Generate-Class-Attribute pseudo-attribute under Vendor Specific RADIUS Attributes in either Connection Request Policies (CRPs) or Network Policies (NPs), and set its value to False.
(Once added, due to a quirk in NPS, it will then appear under Standard RADIUS Attributes.)



Alternatively, use previous versions of HiveOS temporarily if you have a need for RADIUS accounting in the environment.

The SSH issue was an issue with HiveManager 6 and not HiveOS, and is already corrected with 6.8r7. It is noted in the release notes.

I would also strongly suggest not using the RADIUS proxy in our APs for a larger, campus-scoped network; rather, have each AP talk to NPS directly.

Thanks,

Nick
(Edited)

Franco Gobbetti

  • 45 Posts
  • 0 Reply Likes
Nick, I am supporting a customer with these changes and have some questions:
1- Changed RADIUS in HM from auth/acct to authentication only: done.
2- Setting Generate-Class-Attribute to False:
2.1 - Is this the default, and are you suggesting setting it explicitly just to make doubly sure the default value is applied?
2.2 - What is the impact of setting Generate-Class-Attribute to False in a live environment? Does it affect connected users, or can it be done without risk of tearing down current connections or blocking new ones?

Thanks

Franco

Nick Lowe, Official Rep

Hi Franco,

No, the default in NPS is to generate a Class attribute. It has to be explicitly disabled by a Generate-Class-Attribute pseudo-attribute being included and set to False.
(This default changed when IAS became NPS with Server 2008 and newer.)

The purpose of Class attributes is to allow binary blobs of data to be included in an Access-Accept and then to have these included in any subsequent RADIUS accounting that is performed for the client (Start, Interim-Update and Stop). By specification, they are not interpreted by a NAS (AP/switch etc.) in any way.
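As an illustration of that echo behaviour, here is a minimal sketch of what a NAS does with Class (RADIUS attribute type 25): it copies the value bytes from the Access-Accept into later accounting packets without interpreting them. The helper names are hypothetical and the attribute values are invented; this is not HiveOS or NPS code.

```python
CLASS = 25            # RADIUS attribute type for Class
SESSION_TIMEOUT = 27  # used here only as an example of a non-Class attribute


def encode_attribute(attr_type, value):
    """Encode one attribute as type, length, value."""
    if len(value) > 253:
        raise ValueError("attribute value too long")
    return bytes([attr_type, len(value) + 2]) + value


def parse_attributes(data):
    """Split a RADIUS attribute block into (type, value) pairs."""
    attrs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated attribute header")
        attr_type, attr_len = data[offset], data[offset + 1]
        if attr_len < 2 or offset + attr_len > len(data):
            raise ValueError("malformed attribute length")
        attrs.append((attr_type, data[offset + 2:offset + attr_len]))
        offset += attr_len
    return attrs


def class_attributes_to_echo(access_accept_attrs):
    """Return the Class attributes, byte for byte, for use in accounting packets."""
    return b"".join(
        encode_attribute(attr_type, value)
        for attr_type, value in parse_attributes(access_accept_attrs)
        if attr_type == CLASS
    )


# An Access-Accept attribute block carrying two opaque Class blobs.
accept_attrs = (
    encode_attribute(CLASS, b"\x00\x01opaque-blob")
    + encode_attribute(SESSION_TIMEOUT, (3600).to_bytes(4, "big"))
    + encode_attribute(CLASS, b"second-blob")
)
echoed = class_attributes_to_echo(accept_attrs)
print(parse_attributes(echoed))
```

Because the values are opaque, the NAS never needs to understand them; it only has to carry them through unchanged.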

This does not affect existing associations or pertain to auth in any way.

Thanks,

Nick
(Edited)

Franco Gobbetti

Thanks Nick. Just to be triple-sure: the customer has defined in NPS the value for the Attribute Number 1234, used to match the proper user profile for the set of users who authenticate via RADIUS. Setting Generate-Class-Attribute = False will NOT affect the transmission of the Attribute Number back to the AP after the user authenticates, right? Thanks again.

Nick Lowe, Official Rep

Correct

Dawn Douglass

Thanks Nick.  I'll make these changes tomorrow and report back.

Dawn

Nick Lowe, Official Rep

Being mindful of the number of RADIUS clients that you may otherwise have to add to NPS, to migrate away from using the RADIUS proxy in HiveOS, you can add an IP range to NPS with the Enterprise edition of Server 2008/2008 R2. With Server 2012 and later, this is supported with the Standard edition and up.

Microsoft document how to do this here:

https://technet.microsoft.com/en-us/library/cc731824(v=ws.10).aspx
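Before switching from per-AP RADIUS clients to a single range, it may help to check that every AP management address actually falls inside the range you plan to enter. This is a minimal sketch, assuming the range is written in CIDR notation; the addresses, the range and the function name `aps_outside_range` are made up for the example and are not taken from this thread.

```python
import ipaddress


def aps_outside_range(ap_addresses, client_range):
    """Return the AP addresses that are NOT covered by the given CIDR range."""
    network = ipaddress.ip_network(client_range)
    return [
        address
        for address in ap_addresses
        if ipaddress.ip_address(address) not in network
    ]


# Hypothetical AP management addresses and RADIUS client range.
ap_management_ips = ["10.20.0.11", "10.20.0.12", "10.20.0.200", "10.20.1.5"]
planned_range = "10.20.0.0/24"

uncovered = aps_outside_range(ap_management_ips, planned_range)
print("APs not covered by", planned_range, ":", uncovered)
```

Any address reported as uncovered would need either a wider range or its own RADIUS client entry.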
(Edited)

Dawn Douglass

So far so good, but tomorrow, when everyone is back on campus, will be the real test.

Dawn Douglass

Everything is running smoothly this morning.  Thanks Nick!

Franco Gobbetti

Hello all, could this also be related to the issue I have with the AP250: the 4-way handshake intermittently and randomly fails to complete (message 1/4 is sent, nothing is received back, and the client is unable to authenticate) with PSK only? This is 7.1r1 on the AP250; all the other APs (AP130) are fine.

Franco Gobbetti

Dawn, can you please tell me which HiveOS version you are using on the AP250? Did you stay with 7.0r2 or did you upgrade to 7.1r1? Thanks

Dawn Douglass

We only have one of that model; it is currently running HiveOS 7.0r2 Bay.131568 and is working OK.

Franco Gobbetti

Thank you !