Hive OS 6.5r3 ID Manager Proxy

  • Question
  • Updated 2 years ago
I am currently upgrading my APs to a newer software version. However, I am noticing that, once again, the APs on the newer firmware (6.5r3) are not electing themselves as IDM proxy servers. This has been a problem in previous releases for some time, and I was assured that it would be solved in this "golden release".

Luke Harris

  • 265 Posts
  • 18 Reply Likes

Posted 3 years ago


Gary Smith, Official Rep

  • 299 Posts
  • 61 Reply Likes
Hi Luke,

What version have you come from on both the HM and AP? Are you sure that you have the firewall rules correct to allow proper use of IDM?

http://docs.aerohive.com/330000/docs/help/english/6.1r1/idm/help.htm#ref/services.htm%3FTocPath%3DRe...


Kind Regards,
Gary Smith

Luke Harris

Hi Gary,

We have recently upgraded to 6.6r3 from 6.5r1. I've been having issues with AP firmware versions for months with regard to the election of IDM proxy servers. Some revisions within the 6.x family rendered my service completely unusable. The firewall is not the issue here, as I don't see the problem with earlier versions. The issue now is that I don't think this feature has been fixed properly in the so-called "golden release" of 6.5r3.

Gary Smith, Official Rep

Hi Luke,

There are some specific debugs which may help to determine where the issue lies. Have you raised this issue with your support provider? If yes, do you know the Aerohive case number?

Kind Regards,
Gary Smith

Luke Harris

I have raised this issue with my support provider in the past regarding prior versions. The issue gets passed to Aerohive and then all falls silent until I pursue it again. To my recollection I have never been given a case reference when I have raised any support issues. 
(Edited)

Andrew Garcia, Official Rep

  • 368 Posts
  • 120 Reply Likes
I use 6.5r3 in several networks and the RADSEC proxy election process works fine.
Out of curiosity, what version works for you?

Luke Harris

Hi Andrew, I'm currently using it in areas of my deployment. However I have not seen evidence of devices running the new firmware electing themselves as an ID Manager Proxy Server. 

Andrew Garcia, Official Rep

I don't doubt that you are experiencing an issue, just asking a few questions trying to isolate the reason.

The RADSEC election process changed significantly in 6.2r1 and subsequent versions. Are the versions that work for you 6.1r6 or earlier?

Luke Harris

Sorry Andrew, let me apologise for the 'short' tone of my last post. The versions I have seen it running on are HiveOS 6.1r3b and 6.4r1g... to my direct recollection (obviously I'm aware the latter isn't below 6.1r6) 

EDIT:- I distinctly remember having issues with 6.2r1 as well. 
(Edited)

Andrew Garcia, Official Rep

No need for apology, but thanks. Let's just try to get to the bottom of this.

The election process in 6.1r3b will be different than that in 6.4 or 6.5.  But the fact that 6.4r1g works for you but 6.5r3 does not is surprising to me. Let me have a think on this, and I will post again when I get to the office.

Luke Harris

Yes, I was aware that there were some differences in the election process between the 6.x versions. At this stage I still have APs within a particular subnet running the older version that are elected as proxies. I'm wondering whether upgrading all APs in the area to 6.5r3 will force the election process.

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
Luke, can you describe exactly what it is that you are expecting and exactly what it is that you are seeing?

You should have at any one time no more than two APs per management subnet elected as an IDM Proxy Server. (Only APs whose network policy includes an SSID enabled for ID Manager will be considered for election).

There were bugs in some earlier versions of 6.x which caused the election process to flap frequently, causing the proxy function to flip between different APs, which could appear to show more than two APs per subnet elected at once because of delays in what gets reported and displayed in HiveManager. Also, it can take a short while after upgrading/rebooting APs for HiveManager's view of the world to sort itself out, so again you can for a short time appear to see more than two IDM Proxy APs per subnet in HiveManager...but in general, you should never see more than two in normal operation.
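
The "no more than two per management subnet" rule can be checked against an exported AP list. The following is only a rough sketch: the input format (a management address with prefix length plus a flag saying whether HiveManager shows the AP as an IDM proxy) and the function names are assumptions made for illustration, not anything HiveManager produces directly.

```python
import ipaddress
from collections import defaultdict

# Primary plus backup, as described in the post above.
MAX_PROXIES_PER_SUBNET = 2


def proxies_by_subnet(aps):
    """Group proxy APs by management subnet.

    aps: iterable of (mgmt_interface, is_proxy) tuples, for example
    ("10.0.0.10/24", True). This input shape is hypothetical.
    """
    grouped = defaultdict(list)
    for mgmt_interface, is_proxy in aps:
        if is_proxy:
            iface = ipaddress.ip_interface(mgmt_interface)
            grouped[iface.network].append(str(iface.ip))
    return dict(grouped)


def over_elected(aps):
    """Return only the subnets that show more proxies than expected."""
    return {
        net: ips
        for net, ips in proxies_by_subnet(aps).items()
        if len(ips) > MAX_PROXIES_PER_SUBNET
    }
```

A subnet that still appears in the result a few minutes after an upgrade or reboot is the case worth investigating; a brief appearance may just be HiveManager catching up.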

If you issue the command "show idm" on an AP which is NOT elected as an IDM Proxy, it should show you something like this:
IDM client: Enabled
IDM Proxy IP: 10.0.0.10
IDM proxy: Disabled
This says that this AP is communicating with IDM only via another AP acting as a RADSEC proxy and is not itself directly connected to IDM. In this case, 10.0.0.10 is the elected PRIMARY IDM Proxy Server. If you run the same command on that AP, you should see:

IDM client: Enabled
IDM Proxy IP: 10.0.0.10
IDM proxy: Enabled
IDM server: auth.aerohive.com
IDM server IP: 54.171.185.94
RUN state: Connected securely to the IDM server
...which says that this AP is using itself as its proxy server and that it has an active connection to IDM and is therefore able to proxy RADSEC connections from itself and other APs in the subnet.

And finally, on the elected BACKUP IDM Proxy Server, it will also show:
IDM client: Enabled
IDM Proxy IP: 10.0.0.10
IDM proxy: Enabled
IDM server: auth.aerohive.com
IDM server IP: 54.171.185.94
RUN state: Connected securely to the IDM server
...which says that this AP is using 10.0.0.10 as its current active proxy server, but that it also has an active connection to IDM and is running as a RADSEC proxy server so that it is ready to take over the role of the active IDM proxy for the subnet should the PRIMARY fail.
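
The three cases above can be told apart mechanically from saved "show idm" output. This is a minimal sketch, assuming the output consists of "Key: value" lines exactly as shown in this post; the function names and the role labels are made up for illustration.

```python
def parse_show_idm(text):
    """Turn 'Key: value' lines from `show idm` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def idm_role(show_idm_text, own_ip):
    """Classify an AP from its `show idm` output and its own management IP.

    - proxy not enabled                       -> "client-only"
    - proxy enabled, proxy IP is its own IP   -> "primary"
    - proxy enabled, proxy IP is another AP   -> "backup"
    """
    fields = parse_show_idm(show_idm_text)
    if fields.get("IDM proxy") != "Enabled":
        return "client-only"
    if fields.get("IDM Proxy IP") == own_ip:
        return "primary"
    return "backup"
```

Note that the primary and backup outputs are identical apart from which AP they were run on, which is why the AP's own management IP has to be passed in separately.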

So are you saying that you have a management subnet which contains only APs running 6.5r3 and in that subnet, there are NO APs being elected as IDM proxy (which basically means IDM is non-functional in that subnet)? Or are you saying you have subnets with a mixture of different code versions and in those subnets, you never see APs running 6.5r3 getting elected? Or something else?

Other than issues in earlier releases which I mention above, across my many customers that use IDM, running a variety of HiveOS versions, I don't see any problems in general with the IDM election process - so if you are seeing incorrect behaviour, there should be some specific reason for it in your environment rather than there being a general code issue. Like Andrew though, it's not immediately obvious what that could be - hence if you could describe exactly what you are seeing/not seeing, that would be useful.

Luke Harris

Hi Roberto, thank you for your reply. What I am expecting to see is APs on the new 6.5r3 release, running a network policy that uses an ID Manager-enabled SSID, being elected as IDM proxies.

So are you saying that you have a management subnet which contains only APs running 6.5r3 and in that subnet, there are NO APs being elected as IDM proxy (which basically means IDM is non-functional in that subnet)? At this stage no; my current deployment comprises a mixture of firmware versions as we go through the upgrade process. Would upgrading all APs in the area to 6.5r3 force the election process?

Or are you saying you have subnets with a mixture of different code versions and in those subnets, you never see APs running 6.5r3 getting elected? Or something else? 
Yes, as above. I have yet to see an AP running 6.5r3 become elected; however, there are APs in the same subnets running older versions of HiveOS that are.

Roberto Casula, Champ

Well the elected proxy will only change when the active proxy AP fails (for example because you reboot or upgrade the AP that is the currently active proxy or backup proxy).

I'm not sure what the criteria are for "winning" the election as Aerohive don't detail that. It may be that up-time is a factor for example so the APs running the older release, which have been up for longer, are getting elected each time you upgrade - there would be logic to that as it may indicate an AP which is more "stable". It may also be that APs with lower OS versions are preferred. Aerohive would have to advise on whether there are any such criteria for "scoring" the election.

When I look at a few of my customers that have mixed software versions, I do see generally that APs with older software versions and/or longer uptimes are the elected proxies, but this is just common sense as once an AP is elected, it will remain elected until it is rebooted or otherwise fails.

Andrew Garcia, Official Rep

For an AP to elect itself RADSEC proxy, the time needs to be correct on the AP, the AP needs to be able to resolve auth.aerohive.com and the AP needs to be able to connect outbound to auth.aerohive.com on TCP port 2083.

Once the AP contacts auth.aerohive.com, there is some negotiation back and forth including downloading certificates, and then the AP assumes the RADSEC proxy role. Other APs in the network should learn of the RADSEC proxy via AMRP, which uses UDP 3000 for communication.

So, make sure AP time is correct.
Make sure AP can resolve auth.aerohive.com
Make sure AP can contact auth.aerohive.com on TCP 2083. (Don't bother trying to ping that host, as ICMP is blocked).
Make sure your APs can communicate with each other on UDP 3000.
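
Two of these checks (DNS resolution and outbound TCP 2083) can be approximated from a wired host on the same management subnet as the APs. This is a hedged sketch only: it tests the path from that host, not from the AP itself, the function names are illustrative, and it does not cover AP time or AP-to-AP UDP 3000, which cannot be verified with a simple probe like this.

```python
import socket

# Values taken from the post above.
IDM_HOST = "auth.aerohive.com"
RADSEC_PORT = 2083


def can_resolve(hostname):
    """Return True if the local resolver returns any address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def can_connect_tcp(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

Calling can_resolve(IDM_HOST) and can_connect_tcp(IDM_HOST, RADSEC_PORT) from the AP subnet gives a quick indication of whether a firewall or DNS problem sits between that subnet and IDM. A successful TCP connect only shows the port is open; it says nothing about the certificate exchange that follows.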

The show idm command that Roberto mentioned will also be helpful to resolve issues, as it reports what the current IDM state is for an AP. If it cannot contact the host or the port, that command will let you know.

It is also possible there is some kind of certificate issue, which show idm will also tell you about. You can clear the certificate from your APs. From the Monitor tab, check your APs, click utilities and click Clear IDM credentials. The AP that is RADSEC proxy will automatically grab the certificate again in a few moments.

There are more ports listed here in the IDM section that are worth noting, particularly if you use on-prem HiveManager rather than HMOL.  The contact to the IDM CA from on-prem HM is of particular note.
(Edited)

Luke Harris

Hi Andrew,

The CLI commands for checking the time don't seem to be giving me an actual "time". What command should I use to find this on the AP itself?

Nick Lowe, Official Rep

  • 2491 Posts
  • 451 Reply Likes
Take a look at "show clock".

Luke Harris

How on earth did I not see that!?

Luke Harris

Update: All APs are now on 6.5r3. The network policy/network segment using IDM in one particular area seems to have elected more than two proxy servers.
(Edited)

Roberto Casula, Champ

I have not seen this behaviour at any customer. The only time HiveManager may show more than two IDM proxies for a given management network segment is while APs are being upgraded/rebooted (due to the delay in HiveManager reflecting the "live" status of the network) - after a couple of minutes, everything should settle down and you should, as always, have no more than two IDM proxies per network segment.

Are the APs which are showing in HM as IDM proxy on that segment consistent, or is it changing? How many APs are there on the segment and how many APs have the IDM icon?

As previously noted, if your APs are configured as RADIUS servers, they may show the second icon for "IDM Proxy Authentication Server" (the icon is a head and shoulders with a cylinder next to it - this would be normal), but you should only have two per segment showing the "IDM Proxy Server" icon (the orange circle with a cylinder and blue arrow).

On the APs in that network segment, if you issue the command "show amrp neighbor", does each AP have an entry for every other AP on that same segment?
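
Comparing the neighbor tables by hand gets tedious with more than a few APs. As a rough sketch, assuming the AP names listed by "show amrp neighbor" have already been collected into a dictionary by hand or by script (the dictionary shape and function name are hypothetical, and no parsing of the real command output is attempted here), the gaps can be listed like this:

```python
def missing_neighbors(neighbor_tables):
    """List, per AP, the segment members absent from its AMRP neighbor table.

    neighbor_tables: {ap_name: set of AP names that AP lists as neighbors}.
    Every AP on the segment is expected to appear as a key.
    """
    all_aps = set(neighbor_tables)
    gaps = {}
    for ap, seen in neighbor_tables.items():
        # An AP never lists itself, so exclude it from the expectation.
        missing = all_aps - {ap} - set(seen)
        if missing:
            gaps[ap] = sorted(missing)
    return gaps
```

An empty result means every AP sees every other AP on the segment; anything else points at the APs whose AMRP view is incomplete.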

Can you post the output of "show idm" on the APs which HiveManager shows as being IDM proxies?

Can you post a screenshot of HiveManager showing what you are seeing?

Luke Harris

See the screenshot below. I've got around 8 at this present moment in time, and they do appear to move around. They are definitely IDM proxies as per your accurate description. 




Show IDM:- 
IDM client: Enabled Per SSID
IDM Proxy IP: 172.24.1.82
IDM proxy: Enabled
IDM server: auth.aerohive.com
IDM server IP: 54.171.185.94
RUN state: Connected securely to the IDM server
IDM transport mode: TCP
Server destination Port: 2083
RadSec Certificate state: Valid
RadSec Certificate Issued: 2015-12-04 11:11:12
RadSec Certificate Expires: 2016-12-06 11:11:12

Roberto Casula, Champ

We need to look at the show idm output on all the APs that show as IDM proxies (that are on the same management subnet); also the show amrp output.

If the IDM proxy is moving around, that might suggest the APs are having some difficulty communicating between themselves.

Do all the APs have the same SSIDs configured?

Nick Lowe, Official Rep

I think that Roberto is, as usual, likely to be on the right track with the thought that there may be connectivity issues such as inaccessibility, packet loss or delay between the APs causing this.

Split brain is a common class of problem for distributed systems where election of a master takes place over unreliable networks.
(Edited)
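
To illustrate the split-brain idea: if APs that cannot see each other each run their own election, every isolated group would end up with its own primary and backup. The sketch below groups APs into such partitions from their neighbor tables. It is an illustration under assumptions, not a description of how AMRP actually decides anything; the input dictionary and function name are hypothetical, and a link is counted only when both APs list each other.

```python
def partitions(neighbor_tables):
    """Group APs into sets that can all reach each other via mutual neighbors.

    neighbor_tables: {ap_name: set of AP names that AP lists as neighbors}.
    """
    # Keep a link only when both ends list each other.
    mutual = {
        ap: {n for n in seen if ap in neighbor_tables.get(n, ())}
        for ap, seen in neighbor_tables.items()
    }
    unvisited = set(mutual)
    groups = []
    while unvisited:
        # Start from the lowest name so the result order is deterministic.
        stack = [min(unvisited)]
        group = set()
        while stack:
            ap = stack.pop()
            if ap in group:
                continue
            group.add(ap)
            stack.extend(mutual[ap] - group)
        unvisited -= group
        groups.append(sorted(group))
    return groups
```

With one partition you would expect at most two proxies on the segment; with N partitions, up to 2 x N would not be surprising.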

Luke Harris

Roberto, yes, the APs all broadcast the same SSIDs (same network policy). When I say "moving around" I don't mean sporadically or terribly often, perhaps after a number of days. The AMRP outputs also confirm your suspicion of poor connectivity between APs. I'm currently looking into the potential cause of this. I have only observed this problem at one of our remote sites.

Andrew Garcia, Official Rep

I would agree with Nick and Roberto that this sounds like a connectivity issue between APs. It's tough to know for sure without knowing what your management subnet and building topologies look like, but in the HiveManager AP screen grab, it does appear that there are 2 RADSEC proxies elected per building.  I see 2 in Arden and 2 in Avon.

This is the behavior we would expect to see if there are different management subnets in both buildings.  If there is any routing or NAT in between buildings, this is what it is expected to look like.

Luke Harris

Andrew, all APs are on the same management subnet for this particular site. The APs themselves are physically located in separate accommodation blocks. The switches that provide connectivity to the APs are linked directly back to our core switch. This is the case for our other sites as well; however, this behaviour is not replicated at those locations.

Interestingly, when I run the 'show amrp neighbors' command on certain APs, I only see the APs within that building.

 

Andrew Garcia, Official Rep

The RADSEC election and assignment processes definitely run atop the AMRP protocol. If AMRP is not seeing all the APs in your management VLAN, that would explain the problem.  

You said that certain APs only show the AMRP neighbors within the same building, implying that others show AMRP neighbors in the other building as well. The question is why some APs are having the problem and some are not.

Out of curiosity, is the following command present in the show run of one of the APs that is not showing AMRP neighbors in the other building?

no roaming cache-broadcast neighbor-type backhaul enable