Roaming problems with Aerohive

  • Question
  • Updated 2 years ago

Our wireless network uses Aerohive APs, and 90% of our users have Apple laptops. We use AD authentication. We have roaming problems with Wi-Fi.
When a user walks around our building, the laptop or mobile phone is disconnected from the wireless network and cannot reconnect automatically.
Users have to turn Wi-Fi off and then on again to reconnect to the network.

I suspect this problem is related to AD authentication. We use Windows 2008 as the NPS server. I noticed that each time I pass an AP, I have to be reauthenticated.

The NPS logs contain a lot of authentication records, usually more than 100,000 entries per working day.

I am not sure whether that many authentications could cause the roaming problem.

Has anyone had a similar problem, or does anyone have an idea how to troubleshoot it?

(Also, we use Aerohive for authentication, with no ACS server. Our switches are Brocade.)

James Chen

Posted 3 years ago

James Chen

This is an example of the logs collected from the NPS server:

DCSDC02 IAS   10/23/2015      10:28:14  1       DOMAIN\21l.schick    DOMAIN\21l.schick    E0-1C-41-2E-22-AA:DOMAIN         64-9A-BE-17-B6-4B                    JS-2F-17   10.99.0.48        0       0       10.99.0.48        Aerohive AP                        19                       2         11     student      0       311 1 10.0.0.30 08/02/2015 08:26:25 23775245                                Microsoft: Secured password (EAP-MSCHAP v2)                                                                                                                                                                                                                                                                                  Secure Wireless Connections  1
DCSDC02 IAS   10/23/2015      10:28:14  2                DOMAIN\21l.schick                                                                      0       10.99.0.48         Aerohive AP                                                             11     student      0       311 1 10.0.0.30 08/02/2015 08:26:25 23775245                                     Microsoft: Secured password (EAP-MSCHAP v2)                                          
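
To see which accounts account for the bulk of the daily records, the log can be tallied per user. Below is a minimal Python sketch, assuming the whitespace-separated layout pasted above (the file on disk is probably delimited differently, so the split would need adjusting), with field 5 as the RADIUS packet type (1 = Access-Request, 2 = Access-Accept) and field 6 as the user name. The function name and the shortened sample lines are illustrative only.

```python
from collections import Counter

ACCESS_REQUEST = "1"  # RADIUS packet type as it appears in field 5 of the sample


def count_requests_per_user(lines):
    """Count Access-Request records per user name."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Expected leading fields: server, service, date, time, packet type, user name
        if len(fields) < 6:
            continue
        if fields[4] == ACCESS_REQUEST:
            counts[fields[5]] += 1
    return counts


# Shortened versions of the two records above (raw strings keep the backslash literal).
sample = [
    r"DCSDC02 IAS 10/23/2015 10:28:14 1 DOMAIN\21l.schick DOMAIN\21l.schick",
    r"DCSDC02 IAS 10/23/2015 10:28:14 2 DOMAIN\21l.schick",
]

for user, n in count_requests_per_user(sample).most_common(10):
    print(user, n)
```

A few users with very high counts might point at specific clients retrying in a loop, while uniformly high counts might point at a full re-authentication on every roam.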

MST
How many users are you authenticating? Windows 2008 has a limitation; 2012 does not.

James Chen
We have about 1000 users. What is the limit of Windows 2008? We also tried installing a new NPS on Windows 2012, but still have similar problems. By the way, the Windows 2008 machine is a physical server and the Windows 2012 machine is a virtual server.

John Fabry
Please do me a favor. When this happens, check the Client Monitor and see whether DHCP is the issue. We are experiencing similar issues, but find that they are related to the DHCP sequence not completing, or the IP address being lost somehow. Either the NAK or the Offer packets come in to the AP on eth0 but do not leave the AP on wifi0 or wifi1.

MST

With NPS in Windows Server 2008 Standard, you can configure a maximum of 50 RADIUS clients and a maximum of 2 remote RADIUS server groups. You can define a RADIUS client by using a fully qualified domain name or an IP address, but you cannot define groups of RADIUS clients by specifying an IP address range. If the fully qualified domain name of a RADIUS client resolves to multiple IP addresses, the NPS server uses the first IP address returned in the Domain Name System (DNS) query.

50 RADIUS clients is the limit in 2008; I believe you are aware of that. 2012 does not have that limitation.



https://technet.microsoft.com/en-us/library/cc770442(v=ws.10).aspx

Nick Lowe, Official Rep
Hi John,

First off, Aerohive's Client Monitor can help you here to see what's going on.

Once you have excluded things like DHCP being the cause...

Do you have "Enable Proactive PMK ID Response" enabled on your 802.1X SSID?
If not, can you enable this?

Do you have 802.1X pre-authentication enabled?
If not, can you enable this?

Is the version of OS X in use bang up-to-date?

The thing to bear in mind is that if an AP has to refer to a RADIUS server to auth when a roam occurs, the roam is suboptimal. There are, with many clients, better techniques of achieving this.

To understand all the roaming techniques that exist currently, I suggest that you take a read of:

http://www.revolutionwifi.net/revolutionwifi/2012/02/wi-fi-roaming-analysis-part-2-roaming.html
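
For context on why a roam need not touch the RADIUS server: with PMK caching, the client names a cached key by its PMKID when it (re)associates, and an AP that recognises the PMKID can skip the full 802.1X exchange. Below is a minimal Python sketch of the PMKID derivation defined in IEEE 802.11 (HMAC-SHA1 over the string "PMK Name", the AP MAC, and the client MAC, truncated to 128 bits). The cache dictionaries and function names are hypothetical, the PMK is a dummy value, the first AP MAC and the client MAC are borrowed from the log sample in this thread, and the second AP MAC is made up. This is not Aerohive's implementation.

```python
import hashlib
import hmac


def mac_bytes(mac: str) -> bytes:
    """Convert 'E0-1C-41-2E-22-AA' or 'e0:1c:...' to 6 raw bytes."""
    return bytes.fromhex(mac.replace("-", "").replace(":", ""))


def pmkid(pmk: bytes, ap_mac: str, client_mac: str) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AP MAC || client MAC)."""
    data = b"PMK Name" + mac_bytes(ap_mac) + mac_bytes(client_mac)
    return hmac.new(pmk, data, hashlib.sha1).digest()[:16]


def needs_radius(ap_cache: dict, offered_pmkid: bytes) -> bool:
    """Hypothetical AP-side check: a cache miss means a full 802.1X/RADIUS exchange."""
    return offered_pmkid not in ap_cache


pmk = bytes(32)                 # dummy 256-bit PMK, for illustration only
client = "64-9A-BE-17-B6-4B"    # client MAC from the log sample
ap1 = "E0-1C-41-2E-22-AA"       # AP MAC from the log sample
ap2 = "E0-1C-41-2E-33-BB"       # made-up second AP

ap1_cache = {pmkid(pmk, ap1, client): pmk}  # AP 1 authenticated this client earlier
ap2_cache = {}                              # AP 2 has never seen the client

print(needs_radius(ap1_cache, pmkid(pmk, ap1, client)))  # roaming back to AP 1
print(needs_radius(ap2_cache, pmkid(pmk, ap2, client)))  # first visit to AP 2
```

Because the AP MAC is part of the input, the same PMK yields a different PMKID on every AP, so a new AP can only recognise the client if the key has been shared with it or established ahead of the roam.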

(While techniques like TLS session resumption (aka TLS-based EAP fast reconnect) can help keep the process to a minimum at a RADIUS server when a nomadic roam occurs (where supported), it is better to avoid this completely where possible.

And... To be correct, the roam is nomadic from the perspective of the APs here. Not from the perspective of a RADIUS server which maintains TLS session state.)

There is certainly a significant issue with OS X to bear in mind if and where a nomadic roam occurs:

https://support.apple.com/en-gb/HT203841

You may wish to further consider applying the workaround that Apple suggest.

Also be mindful that Band Steering can significantly exacerbate this issue with Apple clients. You may wish to consider disabling this if you have it enabled.

(Sadly, to my knowledge, we can't use better roaming (BSS transition) techniques with OS X. Apple have only implemented 802.11r (Fast BSS Transition) support for iOS at present for their newer devices. 802.11r is part of the Voice-Enterprise features that Aerohive's newer APs support.)

I think it is highly unlikely that your issue is related to RADIUS client limits with the Standard edition of Server 2008/Server 2008 R2. (This, incidentally, does not apply to the Enterprise edition of Server 2008 and the restriction was removed in Server 2012 Standard.)

The RADIUS clients are your APs and not the devices that connect to your APs.

What else... are you sure that you have appropriate overlapping cell coverage so that seamless roaming can take place?

Are you able to enable Voice-Enterprise (802.11k/r/v) on the 802.1X SSID? Perhaps additionally offering a legacy SSID without this for any broken/intolerant clients that you encounter?

What version(s) of HiveOS are you currently using?

With NPS, I encountered a bug where all of its processing threads could easily get blocked attempting lookups for invalid domain names, with just a few incorrectly configured clients retrying in a loop with an invalid domain. Applying a regular expression to check the domain component of the User-Name (EAP outer identity) before processing worked around this adequately.

(This caused huge problems with the initial auth and the auth taking place when a nomadic roam happened.)
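
The kind of check described above can be sketched as follows. This is a Python illustration only: in NPS the pattern would live in the server's own policy configuration (for example, a User Name condition), whose pattern syntax may differ. The domain name `DOMAIN` is taken from the log sample in this thread, `domain.example` is a hypothetical DNS suffix, and the function name is made up.

```python
import re

# Accept only DOMAIN\user or user@domain.example; both names are placeholders.
VALID_OUTER_IDENTITY = re.compile(
    r"(?:DOMAIN\\[^\\@\s]+|[^\\@\s]+@domain\.example)",
    re.IGNORECASE,
)


def should_process(user_name: str) -> bool:
    """Return True only if the outer identity carries an expected domain."""
    return VALID_OUTER_IDENTITY.fullmatch(user_name) is not None


for name in [
    r"DOMAIN\21l.schick",
    "jane@domain.example",
    r"WRONGDOM\bob",
    "bob@typo.example",
]:
    print(name, should_process(name))
```

Requests whose identity fails the check would be rejected or ignored up front, so they never reach the domain lookup that was blocking the processing threads.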

Cheers,

Nick

Andrew Garcia, Official Rep
Great post Nick. Just one point on what you said.

"Enable Proactive PMK ID Response" is not needed to enable PMK caching. To the best of my knowledge that is a legacy knob, and one should not need to enable it if modern Macs and Windows machines are in the network, as is mentioned in the help.

That said, if you have had good results with that knob, I would love to know about it.

Jan Boje
Nick wrote:
Do you have "Enable Proactive PMK ID Response" enabled on your 802.1X SSID?
If not, can you enable this?

When I read the help in HiveManager, the recommendation is to disable this option for Mac and Windows clients, if I am reading it right.

So why would you enable it?

Another thing: your link to https://support.apple.com/en-gb/HT203841 is dead. I would like to read Apple's workaround.

We do have roaming problems with Mac computers, our simple solution for this is: turn your wireless off and then on again.

I don't know if it would help if we enabled the option: 802.1X pre-authentication.

We are using MS RADIUS server 2012, and it is working fine.

Jan Boje

Andrew Garcia, Official Rep
Apple seems to have taken down that KB document. I am going to be optimistic and hope that means the problem was fixed in a recent patch.

To sum up the workaround: go into your OS X keychain and set the certificate used by your RADIUS server to be automatically trusted for SSL.

I found this to be a pretty effective workaround.

Jan Boje

Thanks Andrew

We hope that Apple have fixed the issue.

But what about 802.1X pre-authentication?

Should we enable this feature?

Our radio profile is a mix of your two high-density guidelines: the one written by Andrew von Nagy and the new one, part 4, about the K12 High Density Radio Profile.

It would be nice to have a guideline for "the best way to set up roaming for Mac and PC".