QG5000XSL9TH Datasheet, PDF (318/458 Pages) Intel Corporation – Intel 5000X Chipset Memory Controller Hub (MCH)
Functional Description
5.3.10 FB-DIMM Memory Failure Isolation Mechanisms
Since the Intel 5000X chipset MCH does not operate FB-DIMM in fail-over mode, CRC
accompanies Northbound data. Successful transaction completion is signalled by the
absence of alerts within a read round-trip. Bad CRC accompanies alerts. Alerts preempt
read data. Detection of corrupted CRC or corrupted write acknowledge (idle) will
initiate an FB-DIMM fast reset followed by a retry of all commands since completion of
the last successful transaction. A consecutive CRC/ack failure on the same transaction
is fatal.
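The retry policy described above can be sketched as a small model. This is only an illustration: `attempt` and `fast_reset` are hypothetical callables standing in for the hardware replay of outstanding commands and the FB-DIMM fast reset, and the `Outcome` names are not taken from the datasheet.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"                  # no alert or bad CRC within the read round-trip
    RETRIED_OK = "retried_ok"  # first try failed, retry after fast reset succeeded
    FATAL = "fatal"            # consecutive CRC/ack failure on the same transaction


def run_transaction(attempt, fast_reset):
    """Model one transaction under the fast-reset-and-retry policy.

    attempt    -- hypothetical callable; returns True when the transaction
                  completes with good CRC and no alert
    fast_reset -- hypothetical callable standing in for the FB-DIMM fast reset
    """
    if attempt():
        return Outcome.OK
    # Corrupted CRC or corrupted write acknowledge: fast reset, then retry.
    fast_reset()
    if attempt():
        return Outcome.RETRIED_OK
    # A second consecutive failure on the same transaction is fatal.
    return Outcome.FATAL
```

In this model a single failure costs one fast reset and one retry, and only a failure of the retry itself is escalated.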
5.3.10.1 FB-DIMM Configuration Read Error
An erroneous configuration read return will be master aborted and return all 1’s. It will
not be retried.
5.3.10.2 DIMM Failure Isolation
The failing DIMM may be isolated using information contained in several registers. ECC
error flag bits are recorded in register FERR_NF_FBD, Section 3.9.22.3. This register
records various error sources related to FB-DIMM memory transactions. When an error
occurs the channel/branch information is recorded in the FBDChan_indx field.
The FBDChan_indx is a two-bit field that records branch ECC errors. ECC errors are
reported on a per-branch basis (the LSB of this field has no relevance for ECC errors).
For ECC errors the possible values for this field are:
FBDChan_indx = 0 Branch 0 ECC error
FBDChan_indx = 2 Branch 1 ECC error
Once the branch is determined, the rank and the failing DIMM are determined from
the RECMEMA.RANK and RECMEMB.ECC_Locator fields. The
ECC_Locator indicates which x8 SDRAM device (or pair of adjacent x4 devices) caused
the error. If any of the bits [8:0] is set, a DIMM on the even channel caused the error.
If any of the bits [17:9] is set, a DIMM on the odd channel caused the error. See
Table 3-49.
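As a rough illustration of the decoding described above, the following sketch derives the branch from FBDChan_indx and the channel from ECC_Locator. It assumes both field values have already been extracted from FERR_NF_FBD and RECMEMB. The function name and return shape are hypothetical, and the mapping of individual locator bits to SDRAM devices (Table 3-49) is not reproduced here.

```python
EVEN_MASK = 0x1FF       # ECC_Locator bits [8:0]: DIMM on the even channel
ODD_MASK = 0x1FF << 9   # ECC_Locator bits [17:9]: DIMM on the odd channel


def decode_ecc_error(fbdchan_indx, ecc_locator):
    """Return (branch, channels) for a correctable ECC error.

    fbdchan_indx -- two-bit FBDChan_indx field value (assumed already extracted)
    ecc_locator  -- 18-bit ECC_Locator field value (assumed already extracted)
    """
    # The LSB has no relevance for ECC errors, so only the upper bit is used:
    # 0 -> branch 0, 2 -> branch 1.
    branch = (fbdchan_indx >> 1) & 1
    channels = []
    if ecc_locator & EVEN_MASK:
        channels.append("even")
    if ecc_locator & ODD_MASK:
        channels.append("odd")
    return branch, channels
```

For example, `decode_ecc_error(2, 1 << 9)` returns `(1, ["odd"])`, meaning a DIMM on the odd channel of branch 1. The rank still has to be read from RECMEMA.RANK to identify the DIMM itself.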
For uncorrectable errors, the NRECMEMA.RANK field is used to identify the failing
DIMM pair (lockstep channels).
After a mirrored branch is taken offline, BIOS can execute MemBIST routines on the
suspect DIMM pair to reproduce failures. This can be performed out-of-band using the
SPD (SMBus) interface.
5.3.10.3 ECC Code
When branches operate in dual-channel mode, the MCH supports the 18-device DRAM
failure correction code (SDDC, also known as SECC) option for FB-DIMM. As applied by
the Intel 5000X chipset MCH, this code has the following properties:
• Correction of any x4 or x8 DRAM device failure
• Detection of 99.986% of all single-bit failures that occur in addition to a x8 DRAM
failure. The Intel 5000X chipset MCH will detect a series of failures on a specific
DRAM and use this information in addition to the information provided by the code
to achieve 100% detection of these cases.
• Detection of all two-wire faults on the DIMMs. This includes any pair of single-bit
errors.