Why do some of my Aerohive APs reboot every so often by themselves?

  • Question
  • Updated 3 months ago
  • Answered
Every day, seemingly at random, one or two of our Aerohive access points reboot without being asked to. We have managed to gather some logs from APs that do this and sent them to Aerohive, but we've not heard anything back for over two weeks.

Before someone points out the obvious and asks us to upgrade: we are running HiveOS 5.1r5.1089 and cannot upgrade to any version 6 release, since there is a client disconnect problem and/or APs reach 100% CPU and stop responding. This has been reported widely across Colleges with separate LANs within the University where I work.

Ben Bridle

  • 3 Posts
  • 0 Reply Likes

Posted 5 years ago


Ben Bridle

  • 3 Posts
  • 0 Reply Likes
I'd like to add an observation from a colleague: an overwhelming number of the sites facing this issue are using HiveManager Online (HMOL).

Van Jones

  • 75 Posts
  • 4 Reply Likes
We are seeing some access points reboot on their own (running 6.1r1). We sent netdumps to Aerohive and they diagnosed the problem as being a timing problem. They have created an update for us and we are waiting for them to complete testing so that we can try it out. Are you saying the reboots or the client disconnect problem is happening at a large number of sites?

Nick Lowe, Official Rep

  • 2491 Posts
  • 451 Reply Likes
I have had issues with multiple AP 330s, backed by UPS, running a revised 6.1r2 (1365), rebooting, prima facie, on their own and reverting to the previous software version, the secondary image. The reboots by themselves are not particularly troublesome for our use case, as they are relatively infrequent, but the reversion was, as it caused dodgy RADIUS accounting due to an issue in the older image.

I ended up mitigating that by going through all the APs manually, using the save image command via SSH with a TFTP server up, to ensure the older version was gone for good. That way, if and when a reboot of this nature occurred, bad things didn't happen, as the latest version would always be booted.

It would be nice if they didn't reboot seemingly of their own accord, but I have not yet had the motivation or time to look into it to see if I can identify the causal factor, and I like to establish root cause if I can before opening a support case.

I have not noticed a client disconnection problem with 6.x or undue/pegged CPU usage.

Scott M., Sr. Support Engineer

  • 104 Posts
  • 8 Reply Likes
Hello Ben,

I recommend you contact your service provider and open a support case. Troubleshooting will be performed and we will determine the specific root cause and then fix it.

Thank you,

Scott Myron

Van Jones

  • 75 Posts
  • 4 Reply Likes
Ben, I'm curious if you are still experiencing random reboots of your access points.  We are now on 6.1r3 and we still have 2 or 3 access points rebooting per day.  I am providing logs to support on a daily basis.  I would like to know if I am the only one still experiencing this issue.

Ben Bridle

  • 3 Posts
  • 0 Reply Likes
Hi Van, we have upgraded a few APs to v6.1r3 and we have reboots still occurring.  We are in contact with the reseller about the issue.

Ash Gilson

  • 1 Post
  • 0 Reply Likes
We are looking at evaluating Aerohive for our complex. As part of this discussion, would the people using Aerohive recommend the hardware and product as a Wi-Fi solution? If you had the decision again, would you use Aerohive? Has the random rebooting issue been resolved, and how long did it take Aerohive to resolve it?


(Edited)

Van Jones

  • 75 Posts
  • 4 Reply Likes
I would definitely go with Aerohive again. They have way too much to offer over other vendors. The management and troubleshooting tools that Aerohive provides out of the gate are incredible (and that's coming from a previous Meru customer). There is no centralized controller (central point of failure / bottleneck) for all of my traffic to go through. Having access to a semi-local field engineer (Jeff Haydel) who cares about us is invaluable. Am I frustrated that a small subset of my access points are rebooting? Yes, but no matter which vendor you choose, you are going to have issues. Do I wish that they had been able to resolve the issue faster? Of course, but I can say that support (Tim Ruda) has been very meticulous in determining the exact cause of our issue and not just grouping our issue together with everyone else's. We received an update from support yesterday that is actually 6.1r3 plus our specific fix. We are testing that on 2 buildings for a few days. If we conclude that this takes care of our issue, then this fix becomes a part of 6.1r4 (along with other bugs that have been squashed since r3). I hope that answers your question.

Nick Lowe, Official Rep

  • 2491 Posts
  • 451 Reply Likes
It would be a complete fallacy to assume that other vendors don't have these types of issues; they do, and quite regularly. Not that it is ever any consolation, but it is the reality of the market due to the complex, nuanced nature of what we buy. (Cisco, for example, have just had to withdraw the software for some of their access points.) While we all crave better testing when we hit a bug like this, and it's frustrating and annoying, there is nothing like the baptism of fire of real-world deployments to draw out issues. That's exactly why Enterprise-class hardware needs the same class of support...

(Edited)

Andrew MacTaggart, Champ

  • 483 Posts
  • 86 Reply Likes
I have 80 and counting AP 330s and have not noticed any random reboots.
We have HMOP.
RADIUS is external.
Code is 6.1r3.
No WIPS. [As David Coleman once asked me: when an AP is mitigating a rogue, what is it not doing?]




MichaelB

  • 8 Posts
  • 1 Reply Like
I have seen this happen on different access points with loose cables or RJ45 connectors. Sometimes the RJ45 connector can back out on its own and no longer have a secure, tight fit inside the power brick or PoE port.



Brian L'Heureux

  • 2 Posts
  • 0 Reply Likes
Is your CAPWAP traffic running over HTTP or UDP? About a year ago (running HiveOS v4.x), we were seeing these types of reboots regularly. At the time, CAPWAP was running over the HTTP fallback, rather than UDP. Once we changed our firewall settings to ensure proper UDP operation to the HMOL and the APs started using CAPWAP over UDP, our rebooting issues improved significantly.

Kevin Barrett

  • 4 Posts
  • 0 Reply Likes
What firewall changes did you make?   

Crowdie, Champ

  • 972 Posts
  • 272 Reply Likes
CAPWAP connectivity occurs on UDP 12222.  If the access points cannot communicate with the on-premises or cloud-based HiveManager on this port, they fall back to port 80 (HTTP).

To test if UDP 12222 is available use the following CLI command on the access point:
capwap ping [HiveManager IP address or hostname]
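
For anyone who wants to sanity-check the fallback path from a workstation on the same network as the APs, here is a rough Python sketch. It is not an Aerohive tool: the helper names `tcp_port_open` and `expected_transport` are made up for illustration, and the only facts taken from this thread are the two port numbers. A generic script cannot reliably tell whether UDP 12222 is passing through a firewall, so the UDP result should still come from running `capwap ping` on the AP itself; the script only checks TCP 80 and then models the fallback order described above.

```python
import socket

CAPWAP_UDP_PORT = 12222   # port quoted in the post above
HTTP_FALLBACK_PORT = 80   # fallback port quoted in the post above


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def expected_transport(udp_12222_ok: bool, tcp_80_ok: bool) -> str:
    """Hypothetical helper: model the fallback order described in this thread.

    udp_12222_ok should be taken from `capwap ping` on the AP, not guessed.
    """
    if udp_12222_ok:
        return "CAPWAP over UDP 12222"
    if tcp_80_ok:
        return "CAPWAP over HTTP fallback (TCP 80)"
    return "no path to HiveManager"
```

Usage would look something like `expected_transport(udp_12222_ok=False, tcp_80_ok=tcp_port_open("hivemanager.example.com", HTTP_FALLBACK_PORT))`, where the hostname is a placeholder for your own HiveManager address. If the answer comes back as the HTTP fallback, that is the situation Brian described, and opening UDP 12222 outbound on the firewall is the thing to look at.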