Users only able to authenticate when in certain locations - brain exploding

  • Question
  • Updated 3 years ago
About a year ago we fired up 2x 2012 R2 Servers to handle the new wireless authentication from our APs (RADIUS clients). Everything was sunny and happy until about the 22nd of last month. This is on or around the time that the Server group and Security groups moved some additional "RADIUS Clients" onto the 2 servers as well as a number of additional Network policies and a new connection request policy. These were moved over from a 2008 server.

I "think" we've fixed a lot of what was broken; however, there's one thing that is driving me batty...

User A is a member of IT. User A can authenticate and get on the wireless in their office with both a domain device (laptop) and a non-domain device (iPhone etc.), but as soon as they go to another location (different building/department/geography/IP range etc.) they suddenly can't get on the wireless with their domain device (laptop).

I had one user type in their credentials from the HMO GUI in the RADIUS Test tool and they got
"RADIUS server is reachable. Get attributes from RADIUS server: None" from both servers, even when using different "RADIUS Clients" (APs) from all over (different places) in our organization. Of course this test doesn't differentiate domain/non-domain devices, but it at least shows me that the APs (RADIUS Clients) are working/hitting the NPS server etc. and that this user's creds are good... though how does the NPS server view this request (through the tool)? As if the user is on a domain device, right? Since it's coming from an AP...

What in the Sam Hill could be going on such that a person is seemingly blocked from getting on the wireless when in certain locations? It's the same config on all the APs across our organization and this was never an issue for the last year or more. Could there be some group policy wonkiness going on? I'm really at a loss.

Server 2012 R2
NPS

Connection Request Policy
Use Windows Authentication for all users
Condition                      Value
Day and time restrictions      set to 24/7 all week

Network Policy
Domain-Computers
Condition                      Value
NAS Port Type                  Wireless - IEEE 802.11
Machine Groups                 XXXX\Domain Computers

Constraints - Authentication = Microsoft Protected EAP (PEAP) with Microsoft Encrypted Authentication version 2 (MS-CHAP-v2) CHECKED

Settings Tab - RADIUS Attributes - Standard = Framed-Protocol PPP and Service-Type Framed

Domain-Users
Condition                      Value
User Groups                    XXXX\Domain Users
NAS Port Type                  Wireless - IEEE 802.11

Constraints - Authentication = Microsoft Protected EAP (PEAP) with Microsoft Encrypted Authentication version 2 (MS-CHAP-v2) CHECKED

Settings Tab - RADIUS Attributes - Standard = Framed-Protocol PPP and Service-Type Framed

I think this is how we had things set before things went sideways... and the issue with folks not being able to authenticate from certain locations may be unrelated to any changes on the NPS server and be due to something else?
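On the question of how NPS "views" a request: NPS walks its network policies in order and applies the first one whose conditions all match. The sketch below models only the two policies listed above, with that first-match rule. The `Request` and `Policy` classes and the sample requests are hypothetical illustrations, not anything exported from NPS, and the sketch says nothing about which attributes the test tool actually sends.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of an incoming access request."""
    nas_port_type: str
    user_groups: frozenset = frozenset()
    machine_groups: frozenset = frozenset()


@dataclass(frozen=True)
class Policy:
    """Hypothetical model of a network policy's conditions."""
    name: str
    nas_port_type: str
    user_group: Optional[str] = None
    machine_group: Optional[str] = None

    def matches(self, req: Request) -> bool:
        # Every configured condition must match for the policy to apply.
        if req.nas_port_type != self.nas_port_type:
            return False
        if self.user_group and self.user_group not in req.user_groups:
            return False
        if self.machine_group and self.machine_group not in req.machine_groups:
            return False
        return True


# The two policies from the post, in the order listed.
POLICIES = [
    Policy("Domain-Computers", "Wireless - IEEE 802.11",
           machine_group=r"XXXX\Domain Computers"),
    Policy("Domain-Users", "Wireless - IEEE 802.11",
           user_group=r"XXXX\Domain Users"),
]


def first_match(policies, req):
    """Return the name of the first policy whose conditions all match."""
    for policy in policies:
        if policy.matches(req):
            return policy.name
    return None


# Illustrative requests (assumed values, not captured traffic).
machine_auth = Request("Wireless - IEEE 802.11",
                       machine_groups=frozenset({r"XXXX\Domain Computers"}))
user_auth = Request("Wireless - IEEE 802.11",
                    user_groups=frozenset({r"XXXX\Domain Users"}))
wired_auth = Request("Ethernet",
                     user_groups=frozenset({r"XXXX\Domain Users"}))

print(first_match(POLICIES, machine_auth))  # Domain-Computers
print(first_match(POLICIES, user_auth))     # Domain-Users
print(first_match(POLICIES, wired_auth))    # None
```

Under these assumptions, a request that authenticates with user credentials is judged against Domain-Users, not Domain-Computers, so a passing credential test would not by itself exercise the machine policy.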

I need a stiff drink and some rest. Any tips or thoughts are greatly appreciated.

intvlan1shut

Posted 3 years ago

thewifigeek, Champ

Simplify the problem!

1) Does the issue occur with all domain users at that site?
2) Test the same client(s) with a new cloned SSID, but with auth changed to open or PSK. Check user profiles, VLANs, DHCP and LAN connectivity.

If all is good, then look closer at the RADIUS settings and user profile assignments.

Ruwan Indika

Hi, I have seen a similar issue: a user could authenticate with AP-location-1 but not with AP-location-2, and it started happening suddenly. For some strange reason, the AP-location-2 IP address had vanished from the RADIUS client list in NPS. The name of AP-location-2 was still there, but the IP address was not. So please check that.

The first step in troubleshooting this is to add your client's MAC address to the client monitor and replicate the issue. That should show whether the RADIUS server is replying or not. If it is not replying at all, either the shared secret is wrong or the RADIUS client has not been added to the client list on the RADIUS server. If it is replying, we can see where it fails.
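The RADIUS client list check can be done by hand, but with many APs a small script may be quicker. This is a minimal sketch that assumes you already have the client list as (name, address) pairs and a list of AP IP addresses; how you export them is up to you, and the function name and sample data are hypothetical.

```python
import ipaddress


def audit_radius_clients(clients, ap_addresses):
    """Compare a RADIUS client list against the APs that should be covered.

    clients: iterable of (name, address) pairs; address may be a single IP,
             a CIDR range, a DNS name, or blank.
    ap_addresses: iterable of AP IP address strings.
    """
    networks = []
    blank = []       # entries whose address field is empty
    unparsed = []    # entries that are not an IP or range (e.g. DNS names)
    for name, address in clients:
        address = (address or "").strip()
        if not address:
            blank.append(name)
            continue
        try:
            networks.append(ipaddress.ip_network(address, strict=False))
        except ValueError:
            unparsed.append(name)

    # APs whose IP is not inside any parsed client address or range.
    uncovered = [
        ap for ap in ap_addresses
        if not any(ipaddress.ip_address(ap) in net for net in networks)
    ]
    return {"blank": blank, "unparsed": unparsed, "uncovered": uncovered}


# Hypothetical sample data mirroring the situation described above.
sample_clients = [
    ("AP-location-1", "10.1.0.0/24"),
    ("AP-location-2", ""),  # name present, address missing
]
sample_aps = ["10.1.0.15", "10.2.0.15"]

report = audit_radius_clients(sample_clients, sample_aps)
print(report)
```

Entries listed under `unparsed` are not necessarily wrong; they simply need checking some other way, for example by resolving the name.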

intvlan1shut

DNS, DHCP, VLANs and LAN connectivity all appear normal, as if nothing has changed (all managed by Networking ;-) ). I do have a backup of my RADIUS server config and I had the server team spin me up 2 more servers. I just have to get the cert part squared away (something I'm not too familiar with but learning) and, if need be, I'll clone my Aerohive config and point it at the new servers, move one of the more troublesome sites' APs over (while I'm on-site) and see how folks do then. That at least might lend some more credence to my thought that something got hosed when other folks started messing around with the servers.

Not a perfect example but... If there's 50 employees in location X, maybe 5 or 6 can't get connected with their domain laptop. The other 44-45 are fine. They're all in the same AD group for their department and they're all under the umbrella of "all domain computers and users" (or so I think) that is in our NPS network policies.

Meanwhile some people connect fine in their office but can't connect with their domain laptop when visiting another site.

And there's yet more who just can't connect anywhere with their domain laptop.

All the sites have the same config on the APs and everyone has the same profile.

It got more weird when I wasn't able to get anything from "Client Monitor" on three different users/laptops today. I was sitting with them and just didn't get anything in client monitor.... It was like their laptops weren't even trying to connect even though the laptop acted like it was trying.

And all three users' AD creds passed the HM RADIUS Server test tool for both RADIUS server IPs.

I have a ticket open with Microsoft to talk to an NPS/AD/GP guru because, sadly, there isn't one in my organization.

I appreciate the tips and suggestions. 

thewifigeek, Champ

Just clarifying: when users move to remote sites, do the issues show up at random sites, or is it one specific site only that has the random laptop connectivity issues?

If it were a RADIUS issue, it would cause problems across all sites.

Sjoerd de Jong, Employee

All right, what does the Event Viewer on your RADIUS servers tell you when someone is unable to connect? What are the differences compared with a working user?

Sometimes you need to enable logging of successful events in order to see them in the Server 2012 Event Viewer:

https://support.microsoft.com/en-us/kb/951005 (works for 2012 as well)
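Once both outcomes are being logged, comparing a failing user with a working one comes down to grouping events per access point. NPS logs event ID 6272 when access is granted and 6273 when it is denied. The sketch below assumes the events have already been exported and parsed into dictionaries; the field names (`event_id`, `account`, `nas_ip`, `reason`) and the sample records are hypothetical placeholders, not real log output.

```python
from collections import defaultdict

GRANTED = 6272  # NPS granted access
DENIED = 6273   # NPS denied access


def outcome_by_nas(events, account):
    """Tally granted/denied events for one account, grouped by NAS address."""
    summary = defaultdict(lambda: {"granted": 0, "denied": 0, "reasons": set()})
    for ev in events:
        if ev["account"].lower() != account.lower():
            continue
        entry = summary[ev["nas_ip"]]
        if ev["event_id"] == GRANTED:
            entry["granted"] += 1
        elif ev["event_id"] == DENIED:
            entry["denied"] += 1
            entry["reasons"].add(ev.get("reason", "unknown"))
    return dict(summary)


# Hypothetical parsed events; the reason text is a placeholder.
sample_events = [
    {"event_id": 6272, "account": r"XXXX\userA", "nas_ip": "10.1.0.15"},
    {"event_id": 6273, "account": r"XXXX\userA", "nas_ip": "10.2.0.15",
     "reason": "placeholder-reason"},
    {"event_id": 6272, "account": r"XXXX\userB", "nas_ip": "10.2.0.15"},
]

summary = outcome_by_nas(sample_events, r"XXXX\userA")
for nas_ip, entry in sorted(summary.items()):
    print(nas_ip, entry["granted"], entry["denied"], sorted(entry["reasons"]))
```

An AP address that shows only denials for one account while other accounts are granted through the same AP points at the account or device rather than at the AP's RADIUS client entry.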

John Kahl

I have seen a similar issue with a client running enterprise WEP. Changing the SSID to WPA2 enterprise seemed to solve it. I assume support and testing for WEP is not great, as everyone should have gone to WPA2 by now.