Odd AP230 experience with loss of connectivity

  • Question
  • Updated 5 months ago

I've been fighting this on my own for about 2 months now and decided to ask for some help.

Users in one particular portion of my network complain that they can set their watches by the network outage at 10am every morning.

I am obviously a skeptic. But after watching some pings and the connectivity of wifi devices, I noticed that between 10 and 10:05 EVERY morning this particular wing drops, or at least has pings over 3000ms. Some devices drop completely; others just become so slow they can't do anything.

I worked through some other scenarios and found that it only happens to wireless clients, and only to clients connected to AP230s. This leads me to believe there is a wireless scan or some other 'process' running on those devices at that time. I have tried removing every setting I could find that relates to channel scanning while clients are connected. I have also upgraded the AP230s to the most recent OS. Today, Dec 28th, there was only 1 device connected at the school; pings to that device dropped, then came back at 3300ms, leveled off above 500ms, and returned to normal after 10:05. After this I reverted all the AP230s back to 6.5.8r code to see if that will work tomorrow.
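For anyone wanting to check a pattern like this from logged pings rather than by watching a terminal, here is a minimal sketch. It assumes the ping results have already been collected as (timestamp, round-trip time) pairs, with None standing for a lost ping. The function name, the 500ms threshold and the sample values are all made up for illustration; they are not real measurements from this network.

```python
from datetime import datetime

def find_slow_windows(samples, threshold_ms=500.0):
    """Group consecutive bad pings (lost, or slower than threshold_ms) into windows.

    samples: iterable of (datetime, rtt_ms) sorted by time; rtt_ms is None for a lost ping.
    Returns a list of (start, end, worst_rtt_ms, lost_count) tuples.
    """
    windows = []
    current = None  # [start, end, worst_rtt_ms, lost_count]
    for ts, rtt in samples:
        bad = rtt is None or rtt > threshold_ms
        if bad:
            if current is None:
                current = [ts, ts, 0.0, 0]
            current[1] = ts
            if rtt is None:
                current[3] += 1
            else:
                current[2] = max(current[2], rtt)
        elif current is not None:
            # A good sample closes the open window.
            windows.append(tuple(current))
            current = None
    if current is not None:
        windows.append(tuple(current))
    return windows

# Hypothetical samples, one per minute (the date is arbitrary).
SAMPLES = [
    (datetime(2000, 1, 1, 9, 58), 12.0),
    (datetime(2000, 1, 1, 9, 59), 15.0),
    (datetime(2000, 1, 1, 10, 0), None),    # ping lost
    (datetime(2000, 1, 1, 10, 1), 3300.0),
    (datetime(2000, 1, 1, 10, 2), 650.0),
    (datetime(2000, 1, 1, 10, 3), 520.0),
    (datetime(2000, 1, 1, 10, 4), 510.0),
    (datetime(2000, 1, 1, 10, 5), 20.0),
    (datetime(2000, 1, 1, 10, 6), 14.0),
]

windows = find_slow_windows(SAMPLES)
for start, end, worst, lost in windows:
    print(f"{start:%H:%M} to {end:%H:%M}: worst {worst:.0f}ms, {lost} lost")
```

Running the same function over several days of logs for clients on different AP models would show whether the bad windows line up at the same clock time and only on one model.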

Has anyone experienced anything like this? And more importantly, has it been fixed?

Fred Fish

Posted 5 months ago

John Haithwaite

I would load a WiFi scanner onto your smartphone, go to the area at 09:55 and scan for what is around. Watch for anything like a new SSID popping up at the time the performance drops out. You could also take a laptop with Wireshark loaded. Don't forget the WiFi scanning option to see what is happening.

Also, check what out-of-network "rogue APs" are in that area. We find wireless printers are a pain. Check also what channels they are on. You should only see things on 1, 6 and 11 for maximum stability and throughput. I would try forcing that AP230 to use one of those three channels that is not being used by anything else, if possible. Also get everything in the area onto those three channels and use the standard 20 MHz width, not wide 40 MHz channels.

Fred Fish

Thanks John.  I've done most of that already.  During the Christmas break no one was in the building: there was only 1 device connected to any of the AP230s and one connected to one of the AP330s in an adjacent hall, and the problem still only happened to the one in the library on the AP230. I still think this is a configuration issue with Aerohive.

Also, don't forget there are many more channels than 1/6/11, unless you only use 2.4 GHz radios.  And the 20/40 MHz widths are only for the 5 GHz radios.

John Haithwaite

There are more channels, but in the 2.4 GHz band those three are the only ones that can all sit side by side at 20 MHz width without overlapping. Co-channel works much better than adjacent channels due to the way they slice up the throughput. With adjacent channels the interference is there, but there is no way to control it.
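The 1/6/11 rule falls out of the channel arithmetic: 2.4 GHz channel centres are 5 MHz apart (channel n sits at 2407 + 5n MHz for channels 1 to 13), while each transmission occupies roughly 20 to 22 MHz. A minimal sketch, assuming the classic 22 MHz channel mask and the 11 channels available in North America; the function names are made up for illustration:

```python
def center_mhz(channel):
    """Centre frequency in MHz of a 2.4 GHz channel (valid for channels 1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("formula only covers channels 1 to 13")
    return 2407 + 5 * channel

def overlaps(a, b, width_mhz=22):
    """True if two channels of the given width share any spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz

def non_overlapping_set(channels, width_mhz=22):
    """Greedily pick channels, lowest first, that do not overlap any already picked."""
    picked = []
    for ch in sorted(channels):
        if all(not overlaps(ch, p, width_mhz) for p in picked):
            picked.append(ch)
    return picked

clean = non_overlapping_set(range(1, 12))
print(clean)            # channels that can coexist without overlap
print(overlaps(1, 6))   # 25 MHz apart, so no shared spectrum
print(overlaps(1, 3))   # 10 MHz apart, so adjacent-channel interference
```

With the 22 MHz assumption the greedy pick over channels 1 to 11 is 1, 6 and 11. The result depends on the width assumed: passing width_mhz=20 yields 1, 5 and 9 instead, which is still only three usable channels.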

You do have a point though. You could try turning off one of the radios and testing, then the other, to find out where the issue is. I had a problem with the 5 GHz channels interfering with a camera feed on our sound stage. I turned off 5 GHz on those APs and the problem went away. You may have some naughty non-wifi traffic on one of the ranges.

Crowdie, Champ

In your 5 GHz radio profile disable the following options:

  • Channel and Power/Enable DFS (Dynamic Frequency Selection) channels
  • Radio Settings/Enable the Detection of Spoofed BSSIDs
  • Radio Settings/Enable Frame Burst (Applies only to 802.11ac platforms except the AP370 and AP390)
  • Radio Settings/Enable Transmit Beamforming (Applies only to 802.11ac platforms)
  • Radio Settings/Enable MU-MIMO (AP250/AP245X/AP550/AP150W)
In the same 5 GHz radio profile set the channel width to 20 MHz.

Fred Fish

Thanks. Unfortunately, all of these are already the settings I have.

Gary Smith, Official Rep

Fred,

Have you opened a ticket and worked with Aerohive Support on this issue yet? 

Kind Regards,
Gary Smith

Fred Fish

No Gary, I really thought I could find the issue myself.  As of last week I have resolved the problem, but not in the way I wanted: I pulled all of my AP230s down to 6.8 code and the problem has gone away.

Now is the time to open a ticket and see if they know what in the code is causing an issue.

Gary Smith, Official Rep

Hi Fred,

I think Crowdie may be thinking along the same lines as me. It's possible that at around 10am, the AP could be (falsely?) detecting a DFS event and going off channel as per the expected behaviour. This would drop clients for a period of time. 

The other theory is that the AP is becoming overwhelmed with traffic from the wired side which might be causing high CPU and clients not able to pass traffic.

Collecting a "show tech" file at the time of the issue might help identify what the issue is. This will likely be the first data set that Aerohive support will ask for as it will give the initial clues to work with.

Kind Regards,
Gary Smith

Fred Fish

Gary, thanks for the info.  The problems happened during the holiday when there was only 1 device connected and no network traffic to speak of.  It could still be falsely identifying a DFS event, but every day?  And it was not just one of the AP230s; it seemed to be any AP230, in multiple buildings.

I really think there is a code issue, since rolling it back has thus far eliminated the problem this week.  I will have to bring one into my office and test whether I can replicate the problem with just the one AP and the 8 code.