What can cause this problem, and how can the root cause be found?
I ran into this problem at one of my customers.
We received the alert on a trunk built from two 8 Gbit ISL ports between two 5100 switches.
Here is the alert message:
Time: Mon Aug 06 2012 20:32:05 CEST
Level: Warning
Message: Severe latency bottleneck detected at slot 0 port 35.
Service: Switch
Number: 1241
Count: 1
Message ID: AN-1010
Switch: XSAN01
The port in the alert message is an ISL port that belongs to one of the trunk groups.
Performance of the trunks:
XSAN01:admin> trunkshow -perf
1: 1-> 7 10:00:00:05:1e:36:38:62 100 deskew 15 MASTER
0-> 6 10:00:00:05:1e:36:38:62 100 deskew 24
Tx: Bandwidth 8.00Gbps, Throughput 37.44Kbps (0.00%)
Rx: Bandwidth 8.00Gbps, Throughput 51.94Kbps (0.00%)
Tx+Rx: Bandwidth 16.00Gbps, Throughput 89.38Kbps (0.00%)
2: 5-> 71 10:00:00:05:1e:36:38:62 100 deskew 16 MASTER
4-> 70 10:00:00:05:1e:36:38:62 100 deskew 15
Tx: Bandwidth 8.00Gbps, Throughput 33.12Kbps (0.00%)
Rx: Bandwidth 8.00Gbps, Throughput 58.08Kbps (0.00%)
Tx+Rx: Bandwidth 16.00Gbps, Throughput 91.20Kbps (0.00%)
3: 35-> 35 10:00:00:05:33:ce:61:f5 203 deskew 15 MASTER => trunk with alerts
39-> 39 10:00:00:05:33:ce:61:f5 203 deskew 16
Tx: Bandwidth 16.00Gbps, Throughput 442.46Kbps (0.00%)
Rx: Bandwidth 16.00Gbps, Throughput 433.73Kbps (0.00%)
Tx+Rx: Bandwidth 32.00Gbps, Throughput 876.19Kbps (0.00%)
porterrshow output for the two ports of this trunk:
frames enc crc crc too too bad enc disc link loss loss frjt fbsy
tx rx in err g_eof shrt long eof out c3 fail sync sig
=========================================================================================================
35: 374.0m 115.7m 0 0 0 0 0 0 0 70 0 1 2 0 0
39: 3.1g 3.8g 0 2 0 0 0 0 0 204 0 1 2 0 0
According to the Brocade documentation, this looks like a buffer credit problem:
Bottleneck Detection can detect ports that are blocked due to lost credits and generate special “stuck VC” and “lost
credit” alerts for the E_Port with the lost credits (available in FOS 6.3.1b and later).
Example of a “stuck VC” alert on an E_Port:
2010/03/16-03:40:48, [AN-1010], 21761, FID 128, WARNING, sw0, Severe latency bottleneck detected at slot 0 port 38.
"timestamp", [AN-1010], "sequence-number",, WARNING, "system-name", Severe latency bottleneck detected at Slot "slot number" port "port number within slot number".
This message identifies the date and time of a credit loss on a link, the platform and port affected, and the number of seconds that triggered the threshold.
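If you collect these alerts from several switches, it can help to pull the slot and port out of the message automatically. The following Python snippet is only a sketch: the function name parse_latency_alert is my own, and the pattern is based solely on the message layouts shown in this post, so other FOS releases may need adjustments.

```python
import re

# Pattern taken from the AN-1010 message text quoted in this post.
# Assumption: the wording is the same on the FOS release you run.
ALERT = re.compile(
    r"Severe latency bottleneck detected at slot (\d+) port (\d+)",
    re.IGNORECASE,
)


def parse_latency_alert(line):
    """Return (slot, port) for an AN-1010 alert line, or None otherwise."""
    if "AN-1010" not in line:
        return None
    match = ALERT.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# RASlog layout, as in the Brocade example above.
raslog_line = (
    "2010/03/16-03:40:48, [AN-1010], 21761, FID 128, WARNING, sw0, "
    "Severe latency bottleneck detected at slot 0 port 38."
)
# Layout of the alert I received (message ID comes after the text).
my_alert = (
    "Mon Aug 06 2012 20:32:05 CEST Warning Severe latency bottleneck "
    "detected at slot 0 port 35. Switch 1241 1 AN-1010 XSAN01"
)

for line in (raslog_line, my_alert):
    print(parse_latency_alert(line))
```

The message ID is checked separately from the message text because the two layouts put it in different positions.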
But what can cause buffer credit loss?
There could be a slow drain device causing the issue.
In my case the root cause was a single faulty port on the second switch (the one with ID 203).
XSAN01:admin> porterrshow
frames enc crc crc too too bad enc disc link loss loss frjt fbsy
tx rx in err g_eof shrt long eof out c3 fail sync sig
=========================================================================================================
28: 0 0 0 0 0 0 0 0 13.9k 0 0 0 0 0 0
The SFP in that port was identified as the failing component in the fabric. After it was replaced, the problem disappeared.
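Finding such a port by eye in a long porterrshow listing is tedious, so a small script can help. The Python sketch below parses porterrshow rows and lists ports whose error counters are high. Several things in it are my assumptions, not Brocade's: the column names are my own labels for the 15 columns in the header above, the k/m/g suffixes are treated as thousands, millions and billions, and the threshold of 1000 is arbitrary. The counters are cumulative, so a high value only tells you where to look first.

```python
# My own labels for the 15 porterrshow columns shown in this post.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
]
# Assumption: k/m/g mean thousands, millions and billions.
SUFFIX = {"k": 1e3, "m": 1e6, "g": 1e9}
# Counters I would look at first; adjust to taste.
WATCHED = ("enc_out", "crc_err", "disc_c3", "link_fail", "loss_sync", "loss_sig")


def to_number(text):
    """Convert a counter such as '13.9k' or '70' to a float."""
    suffix = text[-1].lower()
    if suffix in SUFFIX:
        return float(text[:-1]) * SUFFIX[suffix]
    return float(text)


def parse_porterrshow(output):
    """Return {port: {column: value}} for every data row in the output."""
    ports = {}
    for line in output.splitlines():
        head, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip the prompt, header and separator lines.
        if not sep or not head.strip().isdigit() or len(fields) != len(COLUMNS):
            continue
        ports[int(head)] = dict(zip(COLUMNS, map(to_number, fields)))
    return ports


def suspects(ports, threshold=1000):
    """Return (port, counter, value) for watched counters >= threshold."""
    return [
        (port, name, counters[name])
        for port, counters in sorted(ports.items())
        for name in WATCHED
        if counters[name] >= threshold
    ]


# Rows copied from the outputs in this post. They come from two different
# switches and are combined here only to exercise the parser.
sample = """\
XSAN01:admin> porterrshow
28: 0 0 0 0 0 0 0 0 13.9k 0 0 0 0 0 0
35: 374.0m 115.7m 0 0 0 0 0 0 0 70 0 1 2 0 0
39: 3.1g 3.8g 0 2 0 0 0 0 0 204 0 1 2 0 0
"""

for port, name, value in suspects(parse_porterrshow(sample)):
    print(f"port {port}: {name} = {value:.0f}")
```

With the rows from this post, only port 28 is reported, because of its enc_out counter. The discards on the trunk ports 35 and 39 stay below the example threshold.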
Sources:
Severe latency bottleneck detected on ISL / Trunk port
HP Storageworks B-series SAN Switches - How to Interpret the Brocade porterrshow Output
HP StorageWorks B-Series Switches - Identifying if SFP or the Cable is the Cause for Loss of Link