Friday, July 13, 2012

H205535: Possibility of incorrect data being written on Multi-bit error correcting code (ECC) - IBM System Storage


Source

RETAIN tip: H205535
http://www-947.ibm.com/support/entry/portal/docdisplay?brand=5000008&lndocid=MIGR-5090005

Symptom

When one of the listed IBM Storage Subsystem controllers has detected a Multi-bit Error Correcting Code (ECC) memory error, users will see an informational message in the Major Events Log (event 0x2604) and that individual controller will reboot. If that Multi-bit ECC memory error happens on a dirty cache block that does not begin at offset 0x0, the data may be written incorrectly.
Any listed IBM Storage Subsystem controller running one of the following controller firmware versions is susceptible to this issue:
  • 7.77.33.00 or older
  • 7.70.45.00 or older
  • any 7.60.xx.xx
  • any 7.50.xx.xx
  • any 7.36.xx.xx
  • 7.35.69.00 or older
  • any 7.30.xx.xx
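
The version list above can be expressed as a small check. What follows is only a sketch, not an IBM tool: the function and table names are hypothetical, and it assumes that "or older" applies within the same 7.xx branch and that firmware levels are always written as four dot-separated numeric fields. A level on a branch that is not listed is reported as not listed, which is not the same as confirmed unaffected.

```python
def parse_version(text):
    """Split a firmware level such as '7.77.33.00' into four integers."""
    parts = tuple(int(p) for p in text.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    return parts


# Branches for which the tip lists every level ("any 7.60.xx.xx", etc.).
ALWAYS_LISTED = {(7, 60), (7, 50), (7, 36), (7, 30)}

# Branches for which the tip lists a level "or older".
# Assumption: "older" is compared only within the same branch.
LAST_LISTED = {
    (7, 77): (7, 77, 33, 0),
    (7, 70): (7, 70, 45, 0),
    (7, 35): (7, 35, 69, 0),
}


def is_listed_as_affected(version_text):
    """Return True if the firmware level matches the list in this tip."""
    version = parse_version(version_text)
    branch = version[:2]
    if branch in ALWAYS_LISTED:
        return True
    last = LAST_LISTED.get(branch)
    return last is not None and version <= last


if __name__ == "__main__":
    for level in ("7.77.33.00", "7.77.34.00", "7.60.12.00", "7.35.70.00"):
        print(level, is_listed_as_affected(level))
```

The firmware level actually installed should still be read from the storage management software and compared against the list by hand before planning an upgrade.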

Affected configurations

The system may be any of the following IBM storage subsystems:
  • IBM System Storage DCS3700 Storage Subsystem, type 1818, any model
  • IBM System Storage DS3200, type 1726, any model
  • IBM System Storage DS3300, type 1726, any model
  • IBM System Storage DS3400, type 1726, any model
  • IBM System Storage DS3512, type 1746, any model
  • IBM System Storage DS3524, type 1746, any model
  • IBM System Storage DS3950 Express, type 1814, any model
  • IBM System Storage DS5020 Disk Controller (1814-20A), any model
  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model
This tip is not software specific.
This tip is not option specific.

Solution

This behavior has been corrected in controller firmware version 7.77.34.00 for the DS3500, DCS3700, DS3950, DS5020, DS5100 and DS5300.
This behavior will be corrected in firmware for the DS3200, DS3300 and DS3400 in a version of 7.35 controller firmware in second quarter 2012.
These firmware updates are available by selecting the appropriate Product Group, Product name, Product machine type, and operating system on IBM Support's Fix Central web page.
IBM highly recommends that users upgrade all of their DS3200, DS3300 and DS3400 controllers to the 7.35.xx.xx firmware version when it becomes available.

Additional information

An IBM System Storage Subsystem controller firmware issue caused this behavior. It has been corrected as described above.
Note: When a Multi-bit ECC memory error occurs while running the firmware versions containing the fix, the 0x2604 MEL event will still be logged and the controller will still reboot, but the error recovery will be handled properly.
This issue has not been reported to IBM by any IBM user.

Applicable countries and regions
