Quite recently I’ve had an interesting troubleshoot at a customer. The problem was at first that there was an issue in the newly build Exchange 2019 environment that Outlook clients would open up and ask for credentials in a domain joined environment, so the SSO part of WIA isn’t working and it “seemed” to work after you would put in credentials.
Long story short support case at Microsoft was in play and after some weeks of log troubleshooting no results and a standstill for the customers migration project.
Drilldown to the issue at hand it seemed that the Exchange solution wasn’t working correctly and it was decided to uninstall the software and clean-up the entire setup and do a fresh redeploy. The engineer already repaired some ADDS inconsistencies which perhaps could be an issue but that was not the case.
At this time I got involved because the uninstall wasn’t working correctly on the last server, it kept preventing the uninstall with still some arbitrary mailboxes present and an old hardcoded OAB reference which couldn’t get removed. I’ve also had this kind of behavior with the arbitrary mailboxes on a Exchange 2016 solution and a reboot would resolve the blocking issues, the entities will get populated again and afterwards you could remove them accordingly. That was the solution here as well until the point of the OAB entry we had to delete that one through ADSIEDIT.
Ok.. all is well again and Exchange 2019 is completely removed and a redeploy was the next item. Exchange got redeployed, basic no fuzz and behavior of the client seemed to improve.. well no it came back and the outlook prompts and disconnects were reoccurring after some time again.
Next item was to strip stuff even further, we put on some scenario’s like lets do a complete redeploy of ADDS etc. which would mean a lot of work. So we tried to simulate it in separate lab environments to reproduce the issue and there we found out no issues at all, SSO was working out of the box and things started to point through customization of some sorts or other external factors.
We’ve landed on the point that perhaps the forest trust would be an issue, that was our last item to troubleshoot. There was a full forest trust in place for the migration/transition to the new network which is being used accordingly for resources etc. This was also something we implemented in both lab environments and never had any issues with. Well here comes the kicker… we’ve disabled the trust and the issues were instantly gone!
Ok. We have our little friend and are going to focus on that, it turned out that the name suffix routing enabled of the domain was giving issues, very strange because they are not overlapping of any sorts. It “seemed” all ok from a configuration point of perspective. We focused in on everything that could be an issue and ultimately came to the point that the NETBIOS name and FQDN of ADDS where the same with a dot in it e.g. domain.local for them both.. Right this isn’t something that you can setup anytime soon from Server 2019 through Windows 2000 and is basically not supported or advised to use, this was the case in Windows NT4 but further on no go at all.
Recap of the whole situation we have our workaround for the customer, but the root cause with NETBIOS and a dot in its setup isn’t going away. The migration is focused to move it all as soon as it can and then kill the trust and any connections to the old forest/domain which was giving all this grief.
I’ve had an splendid troubleshoot with the engineer who’s a wizard in ADDS, be sure to connect or follow him on LinkedIn.