After adding more Aerohive AP230s to our network, our RADIUS servers seem to be having trouble handling the load.

  • Updated 3 years ago
After adding more Aerohive AP230s to our network, our RADIUS servers seem to be having trouble handling the load. Will using a few APs as RADIUS proxies help at all? Or should I look into other RADIUS servers, using APs as RADIUS servers, etc.?

Gary Ossewaarde


Posted 3 years ago


Hans Matthé

Hello Gary, are you using the Aerohive APs as RADIUS servers?

Gary Ossewaarde

We are using FreeRADIUS.

Jonas Dekkers

How many APs do you have? We are using the Aerohive APs as RADIUS servers in schools (>80 access points) without any problems.

Harry Zahlis

I have seen issues with using Aerohive APs as RADIUS servers where the AAA cache fills up on the AP. I have had to SSH into the RADIUS AP and clear the cache (we can have upwards of 10-15K users on campus). We will be moving to RADIUS servers over the summer and moving away from using the APs.

If you want to check the cache, SSH into the AP and use the command "show aaa radius-server cache". If you have more than 512 entries, then the cache is full. To clear the cache, the command is "clear aaa radius-server cache".
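
If there are several APs acting as RADIUS servers, the check above can be scripted rather than typed by hand. The following is only a rough sketch: it assumes the AP accepts a CLI command passed as an argument to the system `ssh` client, that key-based login is already set up, and that the login user is `admin`. None of that is confirmed in this thread, and the function names are made up for illustration. It prints the raw output and does not try to parse it.

```python
import subprocess
from typing import Iterable, List

# CLI command quoted in the post above.
SHOW_CACHE = "show aaa radius-server cache"


def build_ssh_command(host: str, user: str, cli_command: str) -> List[str]:
    """Build the argv for the system ssh client (assumes non-interactive exec works)."""
    return ["ssh", f"{user}@{host}", cli_command]


def show_radius_cache(host: str, user: str = "admin", timeout_s: float = 15.0) -> str:
    """Run the cache listing on one AP and return its raw output."""
    result = subprocess.run(
        build_ssh_command(host, user, SHOW_CACHE),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError if ssh exits non-zero
    )
    return result.stdout


def show_all(hosts: Iterable[str]) -> None:
    """Print the cache listing for each AP that runs as a RADIUS server."""
    for host in hosts:
        print(f"--- {host} ---")
        print(show_radius_cache(host))
```

Calling `show_all(["ap-radius-1.example.edu"])` (a hypothetical hostname) would list the cache on that one AP. Only the APs running as RADIUS servers need to be in the list.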

Gary Ossewaarde

We have 283 AP230s.


Would you SSH into each AP and clear the cache? The AP230s are not acting as RADIUS servers; we have FreeRADIUS running on its own Linux server (with an LDAP backend).

Harry Zahlis

Only on the ones running as a RADIUS server - not all of the APs.  Are you getting anything from the FreeRADIUS logs?  I will be moving towards a full-blown RADIUS server over the summer...

Nick Lowe, Official Rep

When configured correctly, an up-to-date instance of FreeRADIUS can handle considerable load on modern hardware.

The usual cause of bottlenecks is logging that is configured to go to SQL and the like, general I/O issues, or misconfiguration.

Have you reviewed the setup there to check that it is sane and behaving?

A common issue I have seen has been people hosting RADIUS in underpowered VMs that exhibit very poor IOPS, either consistently or intermittently. (It's possible for these to be CPU bound but I've never seen it.)

There can also be performance issues using Samba when talking to a Windows Active Directory Domain when the processing threads all block waiting for completion.

If you are using an older version of FreeRADIUS, ensure that you update to the latest release, currently 2.2.7 or 3.0.8, as this solves many performance issues. Which version are you using?

The most likely cause of issues will be found on the RADIUS server(s) themselves and not with the NASes or having too many of them in a campus setting. 283 NASes with typical loads isn't that significant relative to the resources that should be available.

That said, by default, Interim-Update RADIUS accounting will occur per session every 20 seconds if you have accounting configured in addition to auth. If so, have you considered dialling this back to something more sensible, like every 3 minutes or so (180 seconds)?
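
To put rough numbers on that, the average accounting rate is simply the number of active sessions divided by the interim interval. The sketch below assumes updates are spread evenly over time, and the session count of 5000 is a made-up figure for illustration, not something taken from this thread.

```python
def interim_updates_per_second(sessions: int, interval_s: int) -> float:
    """Average Interim-Update packets per second, assuming evenly spread updates."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return sessions / interval_s


SESSIONS = 5000  # hypothetical number of concurrent client sessions

for interval in (20, 180):
    rate = interim_updates_per_second(SESSIONS, interval)
    print(f"{interval:>4} s interval: {rate:7.2f} accounting packets/s")
```

Under those assumptions, moving from 20 seconds to 180 seconds cuts the interim accounting rate by a factor of nine, which matters most if each accounting packet triggers a database write.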

Gary Ossewaarde

Thanks for the responses. I appreciate all of you helping me get my Aerohive/RADIUS infrastructure in order.

While the FreeRADIUS VM is not a powerhouse, it has decent specs (2 CPUs, 4GB of RAM). Also, the disk is most likely not the bottleneck. It isn't talking to an AD domain at all, either. Its backend is OpenLDAP.

We are using FreeRADIUS 3.0.4 currently.

Interim-Update RADIUS is set to 21600 seconds (I believe this was bumped up earlier when we were having issues before with fewer APs).

Where can I look on my RADIUS server to make sure things are set up well, especially accounting, if that's what could be causing the load issues?

Nick Lowe, Official Rep

Your question would best be placed on the FreeRADIUS mailing list rather than here. It is a RADIUS backend issue, not an Aerohive issue.

3.0.4 was not considered a stable release; the 3.0 series did not get that badge until 3.0.7 shipped. You should therefore consider updating to 3.0.8.

For performance tuning, I suggest that you read through:

https://github.com/FreeRADIUS/www.freeradius.org/blob/master/radiusd/doc/tuning_guide

A good tool for performance testing, along with some scripting, is:

http://www.coova.org/JRadius/Simulator

Or... eapol_test with -r

Or... http://qatesterblog.blogspot.co.uk/2014/06/new-way-of-freeradius-performance-test.html

There are also LDAP performance testing tools available, as well as RadPerf. These won't do EAP, so they won't simulate your clients well, but they will show up many database, directory, or I/O issues.

You can also run FreeRADIUS in debugging mode to get more information if you need it.

My hunch is that you will find that your bottleneck is occurring when the LDAP queries are made or logging is taking place.

Once you know where the bottleneck is occurring, it can be triaged by digging into that particular process.
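
One generic way to narrow this down is to time each backend step on its own and compare the latency figures. The following is a minimal sketch using only the Python standard library. `time_calls` and `fake_ldap_lookup` are hypothetical names, and the stand-in function just sleeps; in practice it would be replaced by a real LDAP search or log write against your own setup.

```python
import math
import statistics
import time
from typing import Callable, Dict


def time_calls(operation: Callable[[], object], repeats: int = 100) -> Dict[str, float]:
    """Call operation() repeatedly and summarise per-call latency in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    latencies_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    # Nearest-rank 95th percentile.
    p95_index = math.ceil(0.95 * len(latencies_ms)) - 1
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "max_ms": latencies_ms[-1],
    }


def fake_ldap_lookup() -> None:
    """Placeholder for a real directory query; it only sleeps for about 2 ms."""
    time.sleep(0.002)


stats = time_calls(fake_ldap_lookup, repeats=50)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

A large gap between the median and the 95th percentile or maximum tends to point at intermittent stalls, such as the poor IOPS mentioned earlier, rather than a uniformly slow backend.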