Thursday, December 9, 2010

ANR0482W TSM - How to interpret and troubleshoot it

This article describes TSM warning message ANR0482W:
ANR0482W ANS1005E ANS1809W - Sessions terminating due to timeouts


Problem(Abstract)
ANR0482W, ANS1005E, and ANS1809W messages are encountered during Tivoli Storage Manager backup processing. This document explains the sequence of events that take place between the Tivoli Storage Manager client and server.
 
Symptom
ANR0482W ANS1005E ANS1809W messages are observed during client backup activity
Cause
Below is an example and the explanation for the events in the actlog and dsmerror.log.
From the actlog:
08/19/03 10:21:38 ANR0406I Session 805 started for node TSMNODE.
08/19/03 10:21:39 ANR0406I Session 806 started for node TSMNODE.
08/19/03 10:51:40 ANR0482W Session 805 for node TSMNODE terminated
- idle for more than 30 minutes.
08/19/03 10:51:58 ANR0406I Session 813 started for node TSMNODE.
08/19/03 12:53:25 ANR0482W Session 813 for node TSMNODE terminated
- idle for more than 30 minutes.
08/19/03 14:31:56 ANR0403I Session 806 ended for node TSMNODE.
08/19/03 14:31:59 ANR0406I Session 868 started for node TSMNODE.
08/19/03 14:32:01 ANE4952I (Session: 868, Node: TSMNODE) Total number of objects inspected: 11
08/19/03 14:32:01 ANE4953I (Session: 868, Node: TSMNODE) Total number of objects archived: 11
08/19/03 14:32:01 ANE4958I (Session: 868, Node: TSMNODE) Total number of objects updated: 0
08/19/03 14:32:01 ANE4960I (Session: 868, Node: TSMNODE) Total number of objects rebound: 0
08/19/03 14:32:01 ANE4957I (Session: 868, Node: TSMNODE) Total number of objects deleted: 0
08/19/03 14:32:01 ANE4970I (Session: 868, Node: TSMNODE) Total number of objects expired: 0
08/19/03 14:32:01 ANE4959I (Session: 868, Node: TSMNODE) Total number of objects failed: 0
08/19/03 14:32:01 ANE4961I (Session: 868, Node: TSMNODE) Total number of bytes transferred: 5.65 GB
08/19/03 14:32:01 ANE4963I (Session: 868, Node: TSMNODE) Data transfer time: 91,210.15 sec
08/19/03 14:32:01 ANE4966I (Session: 868, Node: TSMNODE) Network data transfer rate: 65.04 KB/sec
08/19/03 14:32:01 ANE4967I (Session: 868, Node: TSMNODE) Aggregate data transfer rate: 395.01 KB/sec
08/19/03 14:32:01 ANE4968I (Session: 868, Node: TSMNODE) Objects compressed by: 75%
08/19/03 14:32:01 ANE4964I (Session: 868, Node: TSMNODE) Elapsed processing time: 04:10:19
08/19/03 14:32:01 ANR0403I Session 868 ended for node TSMNODE.


Explanation:
Session 805 is a producer session - producer session is responsible for querying the server, establishing communication with the server, and reporting results to the server.
Session 806 is a consumer session - it is responsible for actual sending of data.
Session 805 terminates due to timeout, because there is nothing for the producer session to do. This session will get restarted if/when needed.
Session 813 is a producer session that was restarted, because Tivoli Storage Manager has some work for it to do, such as reporting the status of the backup to the server.
Session 813 terminates. Again, there is nothing for it to do, so it times out.
Session 806 terminates, because the backup was finished.
Session 868 is a producer session. It gets started to send the final results to the server. Session 868 terminates.


From dsmerror.log:
08/19/03 10:51:42 ANS1005E TCP/IP read error on socket = 6, errno = 73, reason : 'A connection with a remote socket was reset by that socket.'.
08/19/03 10:51:42 ANS1809W Session is lost; initializing session reopen procedure.
08/19/03 10:51:42 ANS1809W Session is lost; initializing session reopen procedure.

The ANS1005E is the first session (session 805 in the actlog) that was ended by the server. Observe that the Tivoli Storage Manager Client tries to establish the session again, by querying the server for communication.

08/19/03 10:51:59 ANS1810E Tivoli Storage Manager session has been re-established.
This corresponds to session 813 in the actlog being established at 10:51:58

08/19/03 14:31:58 ANS1005E TCP/IP read error on socket = 6, errno = 73, reason : 'A connection with a remote socket was reset by that socket.'.
Session 806 (consumer session) has finished. Tivoli Storage Manager Client queries the server and finds out that the producer session was ended. So it establishes a new session (868) to send the final results.
 

Resolving the problem
The client default is to use a maximum of two sessions; one to query the server (producer) and one to send file data(consumer). Setting resourceutilization to 1 will eliminate the sessions being terminated due to time out. Tivoli Storage Manager will share one session for both producer and consumer sessions.
Increasing the resourceutilization can make the backups faster, but more sessions may be terminated due to time-outs. For example, when resourceutilization is set to 5 there may be 2 producer sessions and 3 consumer sessions and you may see 2 producer sessions terminating due to timeout.

An alternative to increasing resourceutilization is to increase the idletimeout value on the Tivoli Storage Manager server. This option is not as desirable since it will hold the producer sessions in an open state even if there is nothing for these sessions to process. Cancelling the producer sessions with the timeout value can actually increase the performance because it allows more consumer sessions to start and perform the actual work for the backup. Producer sessions do very little work while backups are running and should not be allowed to hold sessions open if there is nothing for them to do.

Note that the Tivoli Storage Manager server and client are working as designed and intended, backups will not fail due to the timed out sessions, so no action is actually required.

Source:
IBM Support site
http://www-01.ibm.com/support/docview.wss?uid=swg21402687

No comments:

Post a Comment

Post a Comment