Hopefully this little tidbit helps some poor administrator and saves them weeks' worth of work and pain. Most recently I've been working with a team to upgrade their environment to McAfee Endpoint Security (ENS). Upon doing so on the LDMS core server, all security scans began to fail, one of the service accounts started reporting lockouts, and the nightmare began. Removing McAfee from the server did not resolve the issue.
After diving into the logs today, what I started to see were 403.16 errors in the IIS logs. These were generated whenever the security scan process (vulscan.exe) attempted to run, immediately causing a communication failure. An IIS 403.16 indicates a client certificate error ("Client certificate is untrusted or invalid"). When I checked the LANDESK-related certificates on the core, everything looked fine, yet scans were still broken.
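If you want to confirm you are hitting the same failure, you can scan the IIS logs for 403.16 entries from PowerShell. This is just a quick sketch, assuming the default log location for the first site (adjust the path and site folder for your environment):

# Find 403.16 (client certificate untrusted or invalid) entries in the IIS logs.
# The path below is the default for the first site; adjust as needed.
Get-ChildItem 'C:\inetpub\logs\LogFiles\W3SVC1' -Filter *.log |
    Select-String -Pattern ' 403 16 ' |
    Select-Object -First 20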
During my research I happened to stumble across the root cause, and later I found the LANDESK and then the McAfee KB articles describing the problem. Long story short: McAfee was breaking the system, and it remained broken because uninstalling McAfee did not also remove the certificates it had installed into the wrong certificate store.
Microsoft article describing the problem in part:
For a LANDESK KB, see these documents:
- https://community.ivanti.com/docs/DOC-41855
- Resolution 1: https://help.landesk.com/ld/help/en_US/LDMS/10.0/Windows/core-h-rootcert-config.htm
McAfee KB article describing the issue and affected versions of their software:
To check your core for the issue, run the following from PowerShell:
Get-ChildItem Cert:\LocalMachine\root -Recurse | Where-Object { $_.Issuer -ne $_.Subject }
This outputs a list of any certificates in the Trusted Root store that are not self-signed (issuer differs from subject) and therefore do not belong there and need to be moved.
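Before moving anything, you may want to export the affected certificates so you can put them back if needed. A minimal sketch, assuming Server 2012 or later (Export-Certificate ships in the built-in PKI module) and an example backup folder:

# Export the misplaced certificates to .cer files before moving them.
New-Item -ItemType Directory -Path 'C:\CertBackup' -Force | Out-Null
Get-ChildItem Cert:\LocalMachine\root -Recurse |
    Where-Object { $_.Issuer -ne $_.Subject } |
    ForEach-Object { Export-Certificate -Cert $_ -FilePath ('C:\CertBackup\' + $_.Thumbprint + '.cer') }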
If any are found, you can use the following command to move them to the Intermediate Certification Authorities (CA) store, where they belong (the certificate provider supports Move-Item in PowerShell 3.0 and later):
Get-ChildItem Cert:\LocalMachine\root -Recurse | Where-Object { $_.Issuer -ne $_.Subject } | Move-Item -Destination Cert:\LocalMachine\CA
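After the move, you can re-run the same check to confirm nothing non-self-signed remains in the Root store; restarting IIS afterwards is my own suggestion rather than something from the KBs:

# Re-run the check; an empty result means the Trusted Root store is clean.
Get-ChildItem Cert:\LocalMachine\root -Recurse | Where-Object { $_.Issuer -ne $_.Subject }
# Restarting IIS is a reasonable (possibly optional) step to pick up the change.
iisreset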
Moving the certificates should get your core back up and running and security scans processing again; it is one of the manual fixes McAfee suggests (although it would be nice if they gave you the commands to check and move the certificates, like the ones above). Be sure to check your other servers as well. LANDESK is not the only system impacted by this problem; it just happens to be the one I was focused on while writing this post.
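To check your other servers quickly, you can run the same query over PowerShell remoting. A minimal sketch, assuming WinRM is enabled and using placeholder server names:

# Run the same Root-store check on other servers (names are placeholders).
Invoke-Command -ComputerName server01, server02 -ScriptBlock {
    Get-ChildItem Cert:\LocalMachine\root -Recurse |
        Where-Object { $_.Issuer -ne $_.Subject } |
        Select-Object Subject, Thumbprint
}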
Our team had escalated the issue and engaged McAfee support, but unfortunately they had not found the root cause, even after Procmon traces, Process Explorer dumps, and runs of McAfee's own proprietary debugging tools. The McAfee team had suggested that LANDESK was doing some sort of driver injection or insecure modification of system files, which stalled progress toward a proper solution before I became more involved. I'm still not sure how they came to that conclusion, and I never saw any evidence of it being the case; it appeared to be theories thrown out to see what would stick.
I guess my gripe (and prayer) is that we system analysts do a better job of wanting to help identify root cause rather than simply ruling ourselves out of the picture. We had also escalated the issue to Ivanti, and the TRM assigned stayed with the issue, working with us constantly and doing everything within his abilities to reproduce the problem on their own systems. While Ivanti didn't see how the system breaking after installing McAfee could be their issue, they stuck with us, remained open to the possibility that it was, and wanted to help us get to the root of what was wrong. I was more disappointed to hear the reports from the McAfee side of things, with an almost outright refusal by the analysts involved to believe it could be their product. This, to me, is the Microsoft mentality I really hate, and it causes so many delays in getting to root cause and resolution. Granted, not every analyst we work with will be the greatest or most knowledgeable, but I really believe that as an industry, we system analysts need to do a better job of supporting our customers and providing exceptional support.
Side Note:
There was another known issue, which was not our problem: if you upgraded from LDMS 9.6 to 2016 and both the old and new .0 certificate files resided on the core, the clients would try to use the 9.6 cert by default. My understanding is that this issue has since been resolved on the Ivanti/LANDESK side of things. See for example this article: https://community.ivanti.com/docs/DOC-34506