Unable to retrieve Active Directory Information

  • Question
  • Updated 4 years ago
  • Answered
Hello
I have a problem with the AD integration. When I want to retrieve the directory information I get this error:
Unable to retrieve Active Directory Information. The Aerohive RADIUS server is currently disconnected from HiveManager. Please restore the connection and try again.

All settings have been double-checked and should be correct. Does anyone have any idea why this occurs?

Thx!

Kind Regards

Excellentics

  • 3 Posts
  • 0 Reply Likes

Posted 4 years ago


Rob

  • 42 Posts
  • 5 Reply Likes
I'll take a stab at this...
First make sure your AP is connected to HiveManager, then make sure it has a static IP.  If you're going to make an AP a RADIUS server or RADIUS proxy, you need to give it a static IP.

Excellentics

  • 3 Posts
  • 0 Reply Likes
Hello Rob
Thanks for the reply. The AP is connected and has a static IP.


Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
If you are using HiveManager Online, are you allowing the APs to initiate SSH connections out to the cloud?

Excellentics

  • 3 Posts
  • 0 Reply Likes
Hello

I'm not sure, but I don't believe this could be an issue for the AD connection (we had some SSH issues in the past at other setups, but this never caused problems with the Active Directory connection).


TV

  • 1 Post
  • 0 Reply Likes
I'm having this problem as well.
AP has a static IP address, but is unable to fetch directory information.

However! When placed in the same subnet as the DC, the lookup goes fine. Still, gateways have been triple-checked.



Scott M., Sr. Support Engineer

  • 104 Posts
  • 8 Reply Likes
Hello Excellentics,

Where did you see this error/info message?  Maybe you could include a screenshot.  Did you see this on the client device?

The error/info message to which you referred was, "Unable to retrieve Active Directory Information. The Aerohive RADIUS server is currently disconnected from HiveManager. Please restore the connection and try again."

Question:  Is the AP in question CAPWAP down?

Have you tested using the HiveManager test tools located at:
HiveManager > Tools > Server Access Tests > RADIUS Test

Figure 1: HiveManager RADIUS Test


The test (above) will show if attributes are being returned via RADIUS.  The test in Figure 1 indicates that Aerohive RADIUS Client "scube-ap370-006bc0-NPS" can connect to NPS 2008 server at 192.168.95.125 using UN: student001.  Attribute 10 is being returned for "RADIUS_ATTR_TUNNEL_TYPE"

If the attribute is not returned, more detail can be seen in HiveManager's Client Monitor tool:
HiveManager > Tools > Tools > Client Monitor

Figure 2: Successful NPS authentication with attributes as seen from HiveManager's Client Monitor


I recommend you try to authenticate while monitoring in Client Monitor and post the output to this thread for analysis.

Are you sure this is not a routing or ACL issue?

Thank you,

Scott Myron





Michael Ratcliffe

  • 8 Posts
  • 2 Reply Likes
I have the same problem. It occurs when configuring the link from an AP RADIUS server to an AD server, after clicking the “Retrieve Directory Information” button, which, as far as I can see, should fill in the BaseDN value. The AP is connected to HiveManager; in this case it's a local instance. The AD server(s) is/are in a different subnet.

What exactly is happening when clicking on this button? At this time, the AP/HiveManager has not got an AD account configured .....

Thanks for any tips

--michael ratcliffe

Denis Recchia

  • 3 Posts
  • 0 Reply Likes
I'm having the same issue: the AP has a static IP but cannot get the domain information.  I'm trying to set it up as a RADIUS server and connect it to Active Directory.

Denis

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
OK, a few suggestions/answers to questions:

What the AP is doing when you click that button is similar to what a Windows client does when it tries to join a domain.

1. The AP performs a DNS lookup against the DNS server specified in the connection setup settings for the "any DC" SRV record (_ldap._tcp.dc._msdcs.<domain>). This returns ANY domain controller at any site.
2. The AP then performs an anonymous connectionless LDAP (over UDP) search request to that DC, which will return amongst other things the appropriate Active Directory site based on the subnet that the AP is connecting from.
3. Another DNS lookup against the DNS server then occurs for the site-specific SRV record (_ldap._tcp.<Site Name>._sites.dc._msdcs.<domain>).
4. A further CLDAP query is then issued to the domain controller identified by this SRV lookup, the result of which then populates the BaseDN field and the Domain Controller field in HiveManager for the next step in the process (the actual domain join).
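
The two SRV names used in steps 1 and 3 can be built as in the sketch below, which may help when reproducing the lookups by hand. This is only an illustration: the function names are made up for this example, and "corp.example.com" and "Branch-Office" are hypothetical values, not anything taken from an Aerohive AP.

```python
def any_dc_srv_name(domain: str) -> str:
    """SRV name for 'any DC' in the domain (step 1)."""
    return f"_ldap._tcp.dc._msdcs.{domain.strip('.').lower()}"


def site_dc_srv_name(domain: str, site: str) -> str:
    """SRV name for the DCs of one specific AD site (step 3)."""
    return f"_ldap._tcp.{site}._sites.dc._msdcs.{domain.strip('.').lower()}"


if __name__ == "__main__":
    # Hypothetical domain and site name; substitute your own.
    print(any_dc_srv_name("corp.example.com"))
    print(site_dc_srv_name("corp.example.com", "Branch-Office"))
```

The resulting names can then be queried from a host on the AP management subnet, for example with `nslookup -type=SRV <name> <dns-server>`, to see whether the DNS server configured for the AP answers promptly and returns sensible domain controllers.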

As several people have mentioned that everything works as expected if the AP is on the same subnet as the domain controller, I can think of a few things that might be causing a problem (assuming it's not something as simple as the wrong DNS server being specified or a routing issue of course!):

1. (Unlikely but possible!) The firewall policy on the domain controller might be set only to allow connectionless LDAP queries (UDP port 363) from the DC's local subnet. Note that the LDAP requests may not necessarily go to the domain controller you expect, as this will depend on the SRV record lookup, so check all your domain controllers!

2. (More likely!) The AP management subnet is not bound to a site in Active Directory Sites & Services. As well as preventing the AP from knowing which site-specific DC it should subsequently be using for the rest of the configuration, this can have effects that are not immediately obvious. In Windows 2008 and above, a domain controller that receives a request from an IP address which is not site-bound (i.e. the subnet is not bound to any site in AD Sites & Services) will try to determine using several DNS lookups whether that client has another IP interface that IS site-bound (this is primarily to accommodate the fact that a client may have multiple addresses, especially with IPv6, and may be connecting from a different address than the administrator expected). In some circumstances, these DNS lookups can take several seconds to timeout (for example because a reverse lookup zone for the AP management subnet is not available in local DNS, so the request is being forwarded to the root DNS servers and timing out rather than getting an NXDOMAIN response). This can cause a knock-on timeout of the LDAP query.

I would therefore check the firewall policy on the domain controllers just to be sure that connectionless LDAP (UDP 363) is allowed from the AP management subnet.

I would definitely ensure that the AP management subnet is listed in AD Sites & Services under the appropriate site and also that there is a DNS reverse lookup zone configured in your local DNS for this subnet.
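
As a rough way to sanity-check the site binding on paper, the sketch below tests whether an AP management address falls inside any subnet you have listed in AD Sites & Services, and prints the PTR name that a reverse lookup for that address would use. The subnets, site names and address are hypothetical, and picking the most specific matching subnet is an assumption about how the mapping is resolved.

```python
import ipaddress
from typing import Mapping, Optional

# Hypothetical subnet-to-site mappings, copied by hand from AD Sites & Services.
SITE_SUBNETS = {
    "10.20.0.0/16": "HQ",
    "10.20.30.0/24": "Branch-Office",
}


def site_for_address(address: str, site_subnets: Mapping[str, str]) -> Optional[str]:
    """Return the site of the most specific subnet containing address, or None."""
    ip = ipaddress.ip_address(address)
    best = None
    for cidr, site in site_subnets.items():
        net = ipaddress.ip_network(cidr)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, site)
    return best[1] if best else None


def ptr_name(address: str) -> str:
    """Name queried by a reverse (PTR) lookup for this address."""
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    ap_address = "10.20.30.15"  # hypothetical AP management address
    print(site_for_address(ap_address, SITE_SUBNETS))
    print(ptr_name(ap_address))
```

If the function returns `None` for your AP management address, that subnet is not bound to any site in your list, which is the situation described above.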

If you are still having problems after this, the best thing to do is to set the AP to do a remote Wireshark capture and capture the eth0 traffic from the AP. You can then investigate exactly what the AP is doing when you click that button and work out why it is failing in your case.

Denis Recchia

  • 3 Posts
  • 0 Reply Likes
Why the weird RADIUS server error then?  I'm failing even though I have the AP on the same subnet as my DC, and have no issues with clients performing this same action.  Is this process going out through the internet at all or staying local?

Firewall is off on the server, so that's not my issue.  DNS looks correct.

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
Hi Denis,

It is the AP doing all the work here. HiveManager simply asks the AP to carry out the functions and return the results of those actions to HiveManager over the CAPWAP connection so you can proceed to the next step. If HiveManager does not get the response from the AP within a certain timeout period, it displays the message you got. The message is generic - all HiveManager knows is that it didn't get a response from the AP - I guess the developers assumed the most likely reason for this is that the CAPWAP connection between the AP and HiveManager has been lost, but that is probably not the case here. The reference to RADIUS is simply because you are configuring an AP to be a RADIUS server, so HiveManager refers to it as an "Aerohive RADIUS Server" (which it is!) - at this point, there's no actual RADIUS going on anywhere, it's just a confusing reference.

To prove this is the case, I just added a blackhole route to one of my APs on my DNS server (so that DNS lookups from that AP will time out) and tried to do the "retrieve". A capture shows the AP is repeatedly trying to do the DNS lookups I describe above, but is getting no response from DNS. This means the AP never reports the results to HiveManager, so HiveManager times out and displays the exact message you are getting, even though what has actually happened is that the AP is just desperately and repeatedly trying to fulfil HiveManager's request but failing.

This is why I say the root cause of the issue reported in this thread is that the AP is not getting a timely response to the various DNS and/or LDAP queries I describe above. A likely cause is a site binding failure, hence the suggestion to check the subnet to site mappings in AD Sites & Services. As I say, if you still have a problem after checking this, the best thing to do is to capture the network traffic from the AP using the remote sniffer function - it should be obvious from this capture what is failing - look for DNS/LDAP requests that do not get a timely response (or indeed any response).
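
When going through such a capture, the thing to look for can be sketched as below: pair each request with its response and flag anything unanswered or slow. The `Exchange` record and the one-second threshold are assumptions made for this illustration; the actual timeout used by the AP is not stated in this thread, and the timestamps would have to be read out of your own capture.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exchange:
    """One request seen in a capture, with its response time if any."""
    protocol: str                 # e.g. "DNS" or "CLDAP"
    sent_at: float                # seconds since the start of the capture
    answered_at: Optional[float]  # None if no response was captured


def flag_problems(exchanges: List[Exchange], max_delay: float = 1.0) -> List[str]:
    """Describe every request that got no response or a slow one."""
    problems = []
    for ex in exchanges:
        if ex.answered_at is None:
            problems.append(f"{ex.protocol} request at {ex.sent_at:.1f}s: no response")
        elif ex.answered_at - ex.sent_at > max_delay:
            delay = ex.answered_at - ex.sent_at
            problems.append(
                f"{ex.protocol} request at {ex.sent_at:.1f}s: slow response ({delay:.1f}s)"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical timings, not taken from a real capture.
    capture = [
        Exchange("DNS", 0.0, 0.02),
        Exchange("CLDAP", 0.1, 3.4),
        Exchange("DNS", 5.0, None),
    ]
    for line in flag_problems(capture):
        print(line)
```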

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
A couple of other little gotchas. Make sure you enter the AD DNS domain name and not the NetBIOS name in the settings here. Also, for later stages, the time on the AP must be accurately set via NTP (pointing the NTP service settings at the domain controllers is best), as later stages use Kerberos, which is very sensitive to time being out of sync.
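
Both gotchas can be expressed as quick checks, sketched below. The dot test is only a heuristic for telling a DNS domain name from a NetBIOS name, and the five-minute limit is the default Kerberos clock-skew tolerance in Active Directory; your domain policy may be set differently. The function names and sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Default Kerberos clock-skew tolerance in Active Directory (policy may differ).
MAX_SKEW = timedelta(minutes=5)


def looks_like_dns_domain(name: str) -> bool:
    """Heuristic: 'corp.example.com' has several labels, a NetBIOS name like 'CORP' has one."""
    labels = name.strip(".").split(".")
    return len(labels) >= 2 and all(labels)


def within_kerberos_skew(ap_time: datetime, dc_time: datetime,
                         max_skew: timedelta = MAX_SKEW) -> bool:
    """True if the AP clock is close enough to the DC clock for Kerberos."""
    return abs(ap_time - dc_time) <= max_skew


if __name__ == "__main__":
    dc_clock = datetime(2014, 1, 1, 12, 0, tzinfo=timezone.utc)  # hypothetical
    ap_clock = dc_clock + timedelta(minutes=7)                   # hypothetical
    print(looks_like_dns_domain("CORP"))
    print(within_kerberos_skew(ap_clock, dc_clock))
```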

Michael Ratcliffe

  • 8 Posts
  • 2 Reply Likes
Thanks for the response Roberto. I think my problem is none of these. From the firewall on the AP subnet I can see the initial DNS query and response, then the AD query on port 389 and a response packet returning, so I’m pretty certain that, in my case at least, it also wouldn’t work if the AD server were in the same subnet.

I’ve been doing my homework on AD and it looks like anonymous AD queries are not enabled by default and need to be configured. So, I suspect that the AD server is rejecting the query it’s getting.

I already got hold of an AD Admin who was able to confirm that anonymous queries were NOT enabled. As soon as I can persuade her to allow them (security), I’ll test the theory and post the results here.

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
There's a difference between an anonymous connectionless LDAP query and anonymous LDAP binding. Anonymous binding is not allowed by the default security permissions for sure, but what I am referring to is not an LDAP bind, merely a query of the LDAP <ROOT> namespace necessary to obtain information about the domain. Microsoft refer to this process as a "Netlogon ping" - as I mention, it is part of the normal "closest DC" discovery process. It is performed by every Windows client at various times, including when joining a domain and at every logon. It is a standard Windows function.

I would strongly advise NOT to enable anonymous binding to AD - the security implications are pretty dire. It is absolutely not necessary to do so for this to work. I have configured this for dozens of customers and have never once had a problem; also in the lab several times with a brand new Active Directory built from scratch. This works absolutely fine with default Active Directory settings/permissions, I promise.

As per my previous posts, from the symptoms described, the issue is likely caused by either a slow response (by slow I mean a couple of seconds), which you can't easily determine is happening from firewall logs, or a complete lack of response. The response itself may be correct, but a delay of just a second or so can cause a timeout and then a continuous retry cycle.

I've found an MS KB article that describes the behaviour I mentioned with unbound sites just to prove I'm not making stuff up! http://support.microsoft.com/kb/2668820/en-us. Forget about the specific stuff about "under heavy load" and the ATQ thread pool; the key piece of information is this one: "Depending on the Name Resolution mechanism configured on the DC this name resolution may take some seconds." Note that this is not referring to the DNS lookup performed by the AP to find a DC, this is something the DC itself is doing in between receiving the Netlogon ping and responding to it in the specific case where the ping originates from a subnet that has no site mapping in AD. That delay can be enough to cause the AP to timeout.

So I'm sorry to keep labouring the point, but have you confirmed that the AP management subnet is correctly bound to a correct site in AD Sites & Services? Apologies if you have, and I know I'm sounding like a stuck record, but this IS the most likely cause. In the past I've seen some very large Enterprises with whole teams of Microsoft-certified people that have not configured their Sites & Services mappings correctly for every client subnet; it's very easy to miss/forget about, especially as Windows will work without configuring these mappings - just not optimally. Misconfiguring sites & services can lead to all sorts of odd behaviour, including slow logon, intermittent inability to connect to network file shares, poor performance of Outlook and other applications etc. etc.

Again, this is just a very likely cause of the problem, but it may not be this (and after having gone on about it so much, it probably won't be, just to make me look stupid!). If all of this has been checked and looks fine, your next step is to get a packet capture to see what is going on.

[You also mention you have a firewall between the AP and the domain controllers. You need to keep an eye on the logs for denied traffic as there are a lot of protocols that need to be allowed for all this to work properly. It's quite easy to get in a situation where it works for a bit, then stops working (not allowing everything required for Kerberos to work is usually the culprit). Just a word of caution really.]


Michael Ratcliffe

  • 8 Posts
  • 2 Reply Likes
Hi and thanks again for the follow-up. I hadn’t missed your point about the subnet binding and, indeed, the APs are on a newly created subnet, so that could well be the problem. I have already emailed one of the AD admins to check that, but given that it’s 01:31 here, I will have to wait for a response!

I’m allowing all traffic through the firewall for the APs at the moment. 

Thanks again for the inspiration.

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
There is another possibility that I found while doing some research. If IPv6 is disabled on domain controllers (which Microsoft strongly recommend against), one of the many side-effects is a problem with connectionless (UDP) LDAP. Even though you may not actively be using IPv6, as of Windows 2008 it is used internally in Windows for various things, and disabling it can cause a bunch of problems, specifically with LDAP. Worth checking too.

http://gallery.technet.microsoft.com/Disable-IPV6-Domain-69c4ef31


Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
Also, for some reason my brain made me type UDP port 363 TWICE in these posts when of course I meant UDP port 389!! (636 is the port for LDAPS so that's probably what my brain was doing) :)

Denis Recchia

  • 3 Posts
  • 0 Reply Likes
OK!

I ended up running Wireshark and watching what the AP was doing. The TL;DR is that I had a lot of wrong entries in my DNS.

From the Wireshark capture, it seemed that the Aerohive AP was asking the lowest-numbered DNS server for the information every time, which for me was somehow a core switch.

Anyway, thanks for all your help.  I wish that error message was more descriptive.  My issue is solved.

Roberto Casula, Champ

  • 231 Posts
  • 111 Reply Likes
Glad you're sorted now. I've seen lots of instances where, for example, a domain controller is decommissioned, but the DNS SRV records related to it are either not removed or only partially removed. The interesting thing is that Microsoft clients and servers manage to carry on working with a fair amount of incorrect configuration present (albeit sometimes not optimally) while non-Microsoft systems (i.e. Samba-based systems, Linux, Mac OS etc.) struggle. This is the difference between working according to the spec of the protocol vs. the stuff Microsoft do without documenting it for third parties. Often the third party gets blamed, when in fact they are just following the published spec. It is similar to what happens with wireless, where a lot of the bugs are attributable to the drivers provided by the chipset manufacturer (Qualcomm, Broadcom etc.) that are completely outside of the wireless vendor's control. The wonderful world of IT.
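
A simple way to spot leftover records of this kind is to compare the SRV targets DNS returns with the domain controllers you know are still in service, as in the sketch below. Both lists are hypothetical and would have to be filled in by hand (for example from an SRV query and from your own inventory); the function name is made up for this example.

```python
from typing import Iterable, List


def stale_srv_targets(srv_targets: Iterable[str], live_dcs: Iterable[str]) -> List[str]:
    """Return SRV targets that do not match any known live domain controller."""
    live = {name.rstrip(".").lower() for name in live_dcs}
    found = {name.rstrip(".").lower() for name in srv_targets}
    return sorted(found - live)


if __name__ == "__main__":
    # Hypothetical host names for illustration only.
    srv_answers = ["DC1.corp.example.com.", "dc-old.corp.example.com."]
    in_service = ["dc1.corp.example.com"]
    print(stale_srv_targets(srv_answers, in_service))
```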

Michael Ratcliffe

  • 8 Posts
  • 2 Reply Likes
I can now confirm that my problem was indeed due to the AD Sites & Services configuration. THANKS again Roberto ... I’m a “happy camper” again :-)