So this indicates that:
1) You have 2 SEPM entries in your connection list (the Sylink.xml file)
2) One server, the one with a DNS address, seems to be working fine (assuming the DNS can be resolved by your client, in this case it can)
3) The other server, the one referred to by IP, is not responding.
So half the clients may work because they connect to the 'working' SEPM. Depending on how your connection list is structured, the clients may have the downed SEPM server as Priority 1, so from time to time they try to switch to it. In an error condition, clients will wait up to 34 minutes before trying to reconnect.
So start troubleshooting the 'down' SEPM server. I believe the error code zero implies that the server did not reply. So I would guess one of the following issues:
1) There is a firewall in the network between the remote server and the clients
2) There is a firewall on the SEPM server itself
3) The IIS service is not running.
4) The port number of the IIS server may have been changed somehow (I doubt it, but double check)
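A quick way to tell a blocked or closed port (items 1, 2 and 4) from an application-level problem is a plain TCP connect test, run once on the server and once from a client. Here is a minimal sketch in Python. The function name port_is_open is my own, and 8014 is just the port used in the URLs in this post, so adjust it if yours differs.

```python
import socket


def port_is_open(host: str, port: int = 8014, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    False covers refused connections, timeouts (typical of a firewall
    silently dropping packets) and names that do not resolve.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If port_is_open("localhost") is True on the server but the same call against the server's IP fails from a client, suspect a firewall rather than IIS. If it fails on the server itself, nothing is listening on that port.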
My first step would be to log into the broken server and browse to
http://localhost:8014/secars/secars.dll?hello,secars . Then try
http://<External_IP>:8014/secars/secars.dll?hello,secars
If localhost works but the external address does not, then there may be an IIS permissions issue. Check the IIS logs.
If the external address works from the Server, but not from any external host, then it's likely to be a firewall issue.
The hello request should return SOMETHING regardless of whether the SEPM service is running, because Secars runs inside the IIS process.
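If you would rather script these checks than click through a browser, something like the following could work. It is only a sketch: probe and diagnose are names I made up, the query string is copied from the URLs above, and the decision rules simply restate the reasoning in this post. Any HTTP response at all, even an error status, counts as "IIS answered".

```python
import urllib.error
import urllib.request
from typing import Optional

# Path copied from the test URLs in this post.
HELLO_PATH = "/secars/secars.dll?hello,secars"


def probe(host: str, port: int = 8014, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code if the server answered, else None."""
    url = f"http://{host}:{port}{HELLO_PATH}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server replied, just with an error status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Refused, timed out, or unresolvable: no reply at all.
        return None


def diagnose(local_ok: bool, external_on_server_ok: bool,
             external_from_client_ok: bool) -> str:
    """Map the three test results to the likely cause described above."""
    if not local_ok:
        return "IIS not answering locally: check that it is running and on the expected port"
    if not external_on_server_ok:
        return "possible IIS permissions issue: check the IIS logs"
    if not external_from_client_ok:
        return "likely a firewall issue"
    return "IIS reachable end to end"
```

On the server you would call probe("localhost") and probe with the server's external IP. From a client you would call probe with that same IP, then feed the three results (each compared against None) into diagnose.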
So first step, let's get IIS working.
Another page you can try to hit is
http://<Server_Addr>:8014/Reporting . Clients don't use this address, but it may be useful to verify that basic IIS functionality is working.
Another useful link to test a little bit end-to-end (from Secars to SEPM-Tomcat) is
http://localhost:8014/secars/secars.dll?action=36 . Note, this only works from the localhost.
As soon as you get IIS working and can connect to it using a browser from an external host, run the test again and see whether your clients still fail to connect after half an hour.