The issue happened again just a few minutes ago. To add another bit of information, I noticed this event log warning that roughly correlates with the time of the SSO error:
Event ID 4227 Warning TCPIP
TCP/IP failed to establish an outgoing connection because the selected local endpoint was recently used to connect to the same remote endpoint. This error typically occurs when outgoing connections are opened and closed at a high rate, causing all available local ports to be used and forcing TCP/IP to reuse a local port for an outgoing connection. To minimize the risk of data corruption, the TCP/IP standard requires a minimum time period to elapse between successive connections from a given local endpoint to a given remote endpoint.
We have seen this once before.
I have an hourly scheduled task that counts the number of TCP and UDP connections. My logs show an average of 85 TCP and 30 UDP concurrent connections, and I have not seen a spike yet. To get a finer-grained picture of what's happening with the connections on the system, I changed the logging interval from every hour to every 5 minutes. Hopefully, if there is a spike, we will catch it. The script dumps all netstat process information whenever there are more than 100 TCP or UDP connections, so we should know which process is holding many connections.
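For anyone who wants to try something similar, here is a rough Python sketch of that kind of check. It is not my actual script. It assumes the text comes from Windows `netstat -ano` (columns: Proto, Local Address, Foreign Address, State, PID, with no State column on UDP rows). The names `parse_netstat`, `summarize`, `run_netstat` and `THRESHOLD` are made up for illustration, and the per-state TCP breakdown is an extra I added because TIME_WAIT counts seem relevant to the 4227 warning.

```python
import subprocess
from collections import Counter

THRESHOLD = 100  # assumed from the post: dump details above 100 connections


def run_netstat():
    """Return the raw text of `netstat -ano` (numeric addresses plus owning PID)."""
    result = subprocess.run(["netstat", "-ano"], capture_output=True, text=True)
    return result.stdout


def parse_netstat(text):
    """Return a list of (proto, state, pid) tuples from `netstat -ano` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the banner, and the column header row.
        if not fields or fields[0] not in ("TCP", "UDP"):
            continue
        proto = fields[0]
        pid = fields[-1]
        # TCP rows carry a State column; UDP rows do not.
        state = fields[3] if proto == "TCP" and len(fields) >= 5 else ""
        rows.append((proto, state, pid))
    return rows


def summarize(rows, threshold=THRESHOLD):
    """Count connections per protocol and per TCP state.

    The per-PID breakdown is only filled in when the TCP or UDP count
    exceeds the threshold; otherwise it is an empty Counter.
    """
    proto_counts = Counter(proto for proto, _, _ in rows)
    state_counts = Counter(state for proto, state, _ in rows if proto == "TCP")
    over = any(proto_counts[p] > threshold for p in ("TCP", "UDP"))
    per_pid = Counter(pid for _, _, pid in rows) if over else Counter()
    return proto_counts, state_counts, per_pid
```

A scheduled task could call `summarize(parse_netstat(run_netstat()))` every 5 minutes and append the three counters to a log. If the per-PID counter is non-empty, `per_pid.most_common(5)` gives the processes holding the most connections.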
If anyone has any ideas, please let me know!
- Michal