Wednesday, August 26, 2009

Advanced troubleshooting

I was troubleshooting a third-party application integration.

The agent was installed on our server, configured, and started.
Firewall ports were opened on both the boundary and IPsec firewalls.
The remote host could not establish a connection with the agent.

1. Checked network monitor - it shows the SYN sent to the agent, but no ACK in return.
2. Checked firewall logs - nothing blocked.
3. Checked netstat -n -a - the agent is not listening on the port the remote host is supposed to connect to.
4. Checked netstat -n -o - the agent is not listening on any port.
5. Checked the Application Event log - the agent service reported an error on startup, with a call stack containing the functions OnLoad and CreateIPPerformanceCoutners.
Reason - the agent account, NETWORK_SERVICE, does not have sufficient permissions to access the performance counter related registry keys.
6. Changed the agent's service account to LocalSystem - connection established, network traffic flows.
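
The "is anything listening on that port?" check from steps 3 and 4 can also be done from the remote side. The following is a minimal Python sketch, not part of the original troubleshooting; the function name is_port_open, the host, and the port number are placeholders chosen for illustration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. A False result means the
    connection was refused, timed out, or otherwise failed; it does not
    say whether a firewall or a missing listener is the cause.
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder address and port; substitute the agent's real values.
    print(is_port_open("127.0.0.1", 5000))
```

A False result here only narrows things down; the firewall logs and netstat on the server are still needed to tell a blocked port from a service that never started listening.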

Solution - recommended that the vendor add proper error handling for non-critical services such as performance counter reporting - basically, wrap CreateIPPerformanceCoutners in a try/catch.
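
To illustrate the recommendation, here is a rough Python sketch of the idea. The vendor's actual code is not available, so every function below is a hypothetical stand-in: create_ip_performance_counters simulates the failing call, start_listener simulates the agent's real work, and on_load shows the non-critical step wrapped so that its failure is logged instead of aborting startup.

```python
import logging

logger = logging.getLogger("agent")


def create_ip_performance_counters():
    """Hypothetical stand-in for the vendor's counter setup.

    Simulates the failure seen under a low-privilege service account.
    """
    raise PermissionError("access to performance counter registry keys denied")


def start_listener():
    """Hypothetical stand-in for the agent's core job (opening its port)."""
    return "listening"


def on_load(create_counters=create_ip_performance_counters):
    """Start the agent; treat performance counters as optional."""
    try:
        create_counters()
        counters_enabled = True
    except Exception:
        # Non-critical feature: log the problem and carry on.
        logger.warning("Performance counters unavailable; continuing without them",
                       exc_info=True)
        counters_enabled = False
    # The core service starts regardless of the counter outcome.
    return start_listener(), counters_enabled
```

With this structure the agent would still open its port under the restricted account, and the event log would carry a warning rather than a fatal startup error.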


After this troubleshooting - where is the boundary between a software engineer and a network engineer?