Tagged: ECC mv_ddr
Version armada-18.09 has been released.
A puzzling problem was resolved in this version (mv_ddr-devel-18.09.1).
I replaced the 4GB non-ECC DIMM with a 16GB ECC DIMM two months ago.
Since then, the Linux kernel had been logging errors like this frequently:
[ 351.562134] Synchronous External Abort: synchronous external abort (0x92000210) at 0x0000007fa4f5e6e0
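The value in parentheses can be decoded by hand. The following is a minimal sketch, assuming that 0x92000210 is the ESR_ELx syndrome value (as the arm64 kernel prints in its fault messages) and using the Armv8-A field layout. The function names are hypothetical and are not part of the kernel or mv_ddr.

```c
#include <stdint.h>

/* ESR_ELx layout (Armv8-A): EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
 * All helper names below are illustrative only. */

/* Exception class: 0x24 means "Data Abort from a lower Exception level". */
unsigned esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3f;
}

/* Instruction length bit: 1 for a 32-bit instruction. */
unsigned esr_il(uint32_t esr)
{
    return (esr >> 25) & 0x1;
}

/* Instruction specific syndrome, interpreted according to EC. */
uint32_t esr_iss(uint32_t esr)
{
    return esr & 0x1ffffff;
}

/* For data aborts, the fault status code is ISS[5:0].
 * 0x10 means "synchronous external abort, not on a translation table walk". */
unsigned esr_dfsc(uint32_t esr)
{
    return esr & 0x3f;
}

/* For data aborts, ISS[9] is the implementation-defined external abort type. */
unsigned esr_ea(uint32_t esr)
{
    return (esr >> 9) & 0x1;
}
```

For 0x92000210 this gives EC = 0x24 and DFSC = 0x10, that is, a data abort taken from user space and reported as a synchronous external abort, which is consistent with the user-space fault address in the log line.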
After an error occurs, the register MC6_CH0_ECC_1BIT_ERR_COUNTER_REG has increased by several hundred or more.
It does not happen if I modify the mv_ddr source to disable ECC:
diff --git a/mv_ddr_topology.c b/mv_ddr_topology.c
index eb1b47e..80e7b18 100644
--- a/mv_ddr_topology.c
+++ b/mv_ddr_topology.c
@@ -295,6 +295,10 @@ unsigned short mv_ddr_bus_bit_mask_get(void)
 			pri_and_ext_bus_width = 0x0;
 		}
 
+#if 1
+		bus_width_ext = MV_DDR_BUS_WIDTH_EXT_0;
+		printf("mv_ddr: ECC disabled.\n");
+#endif
 		if (bus_width_ext == MV_DDR_BUS_WIDTH_EXT_8)
 			pri_and_ext_bus_width |= 1 << (octets_per_if_num - 1);
 	}
Also, it occurs only after a reboot or a reset via the CP_MR# signal.
It does not occur after a power-on reset.
commit fixed it.
Which 16GB ECC RAM do you use? I tried two modules and neither of them booted.
I use a Crucial CT16G4XFD824A (Micron MTA18ADF2G72AZ-2G3B1).
No problems in my environment with the fix above.
$ decode-dimms
# decode-dimms version $Revision$

Memory Serial Presence Detect Decoder
By Philip Edelbrock, Christian Zuckschwerdt, Burkart Lingner,
Jean Delvare, Trent Piepho and others

Decoding EEPROM: /sys/bus/i2c/drivers/ee1004/0-0053
Guessing DIMM is in                              bank 4

---=== SPD EEPROM Information ===---
EEPROM CRC of bytes 0-125                        OK (0x7A64)
# of bytes written to SDRAM EEPROM               384
Total number of bytes in EEPROM                  512
Fundamental Memory type                          DDR4 SDRAM
SPD Revision                                     1.1
Module Type                                      UDIMM
EEPROM CRC of bytes 128-253                      OK (0x05F1)

---=== Memory Characteristics ===---
Maximum module speed                             2400 MHz (PC4-19200)
Size                                             16384 MB
Banks x Rows x Columns x Bits                    16 x 16 x 10 x 64
SDRAM Device Width                               8 bits
Ranks                                            2
Rank Mix                                         Symmetrical
Bus Width Extension                              8 bits
AA-RCD-RP-RAS (cycles)                           17-17-17-39
Supported CAS Latencies                          21T, 20T, 19T, 18T, 17T, 16T, 15T, 14T, 13T, 12T, 11T, 10T

---=== Timings at Standard Speeds ===---
AA-RCD-RP-RAS (cycles) as DDR4-2400              17-17-17-39
AA-RCD-RP-RAS (cycles) as DDR4-2133              15-15-15-35
AA-RCD-RP-RAS (cycles) as DDR4-1866              13-13-13-30
AA-RCD-RP-RAS (cycles) as DDR4-1600              11-11-11-26

---=== Timing Parameters ===---
Minimum Cycle Time (tCKmin)                      0.833 ns
Maximum Cycle Time (tCKmax)                      1.600 ns
Minimum CAS Latency Time (tAA)                   13.750 ns
Minimum RAS to CAS Delay (tRCD)                  13.750 ns
Minimum Row Precharge Delay (tRP)                13.750 ns
Minimum Active to Precharge Delay (tRAS)         32.000 ns
Minimum Active to Auto-Refresh Delay (tRC)       45.750 ns
Minimum Recovery Delay (tRFC1)                   350.000 ns
Minimum Recovery Delay (tRFC2)                   260.000 ns
Minimum Recovery Delay (tRFC4)                   160.000 ns
Minimum Four Activate Window Delay (tFAW)        21.000 ns
Minimum Row Active to Row Active Delay (tRRD_S)  3.300 ns
Minimum Row Active to Row Active Delay (tRRD_L)  4.900 ns
Minimum CAS to CAS Delay (tCCD_L)                5.000 ns
Minimum Write Recovery Time (tWR)                15.000 ns
Minimum Write to Read Time (tWTR_S)              2.500 ns
Minimum Write to Read Time (tWTR_L)              7.500 ns

---=== Other Information ===---
Package Type                                     Monolithic
Maximum Activate Count                           Unlimited
Post Package Repair                              One row per bank group
Soft PPR                                         Supported
Module Nominal Voltage                           1.2 V
Thermal Sensor                                   TSE2004 compliant

---=== Physical Characteristics ===---
Module Height                                    19 mm
Module Thickness                                 2 mm front, 2 mm back
Module Reference Card                            ZZ

---=== Manufacturer Data ===---
Module Manufacturer                              Micron Technology
DRAM Manufacturer                                Micron Technology
Manufacturing Location Code                      0x0F
Manufacturing Date                               2018-W13
Assembly Serial Number                           0x********
Part Number                                      18ADF2G72AZ-2G3B1
Revision Code                                    0x31

Number of SDRAM DIMMs detected and decoded: 1
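As a cross-check of the SPD dump above, the reported size and cycle counts can be recomputed from the other fields. This is only a sketch: it assumes that the rows and columns figures in "16 x 16 x 10 x 64" are address-bit counts, that the 64-bit width excludes the 8-bit ECC extension, and that plain round-up is close enough for the cycle counts (real SPD decoders apply their own rounding rules). The function names are hypothetical.

```c
#include <stdint.h>

/* Capacity in bytes from the "Banks x Rows x Columns x Bits" line.
 * Assumption: banks is a plain count, row_bits and col_bits are address-bit
 * counts, and bus_bits is the data width without the ECC extension. */
uint64_t dimm_capacity_bytes(unsigned banks, unsigned row_bits,
                             unsigned col_bits, unsigned bus_bits,
                             unsigned ranks)
{
    /* Number of addressable locations in one rank. */
    uint64_t locations = (uint64_t)banks << (row_bits + col_bits);

    /* Each location is bus_bits wide; the module has `ranks` such ranks. */
    return locations * (bus_bits / 8) * ranks;
}

/* Smallest whole number of clock cycles that covers a delay.
 * Both arguments are in picoseconds to avoid floating point. */
unsigned cycles_for(unsigned delay_ps, unsigned tck_ps)
{
    return (delay_ps + tck_ps - 1) / tck_ps;
}
```

With the values above, dimm_capacity_bytes(16, 16, 10, 64, 2) is 2^34 bytes, which is the 16384 MB that decode-dimms reports, and cycles_for(13750, 833) and cycles_for(32000, 833) give the 17 and 39 cycles listed for tAA and tRAS at DDR4-2400.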
I’ve bought MTA18ADF2G72AZ-2G3B1ZI – CT16G4XFD82A.18FB1
The boot gets into UEFI, and then, when GRUB is loaded, it complains that it can’t find the boot partition (which is located on the internal eMMC). If I use the 4GB non-ECC stick that came with the board, GRUB and Linux load normally. It is definitely better than the other 16GB memory sticks (they didn’t even pass DDR training), but it still doesn’t boot.
I compiled the firmware with mv_ddr-18.12.0 and with mv_ddr-18.09.0, but it makes no difference.
What am I doing wrong?
BootROM - 2.03
Starting CP-0 IOROM 1.07
Booting from SD 0 (0x29)
SD - wait_for_sd_interrupt: Error interrupt - 00008000
SD - wait_for_sd_interrupt: Error interrupt status 00000003
SD - sd_get_cmd_response: Get command response failed.
SD - sd_init: Failed - ret = 00000081
Error: Failed initializing interface
Found valid image at boot postion 0x002
lNOTICE: Starting binary extension
NOTICE: SVC: SW Revision 0x0. SVC is not supported
mv_ddr: mv_ddr-devel-18.12.0-g618dadd (Mar 13 2019 - 03:20:41 PM)
mv_ddr: scrubbing memory...
mv_ddr: completed successfully
NOTICE: Cold boot
NOTICE: Booting Trusted Firmware
NOTICE: BL1: v1.5(release):711ecd32 (Marvell-armada-18.09.4)
NOTICE: BL1: Built : 15:20:43, Mar 13 2019
NOTICE: BL1: Booting BL2
NOTICE: BL2: v1.5(release):711ecd32 (Marvell-armada-18.09.4)
NOTICE: BL2: Built : 15:20:43, Mar 13 2019
BL2: Initiating SCP_BL2 transfer to SCP
NOTICE: SCP_BL2 contains 2 concatenated images
NOTICE: Load image to CP1 MSS AP0
NOTICE: Loading MSS image from addr. 0x4023020 Size 0x135c to MSS at 0xf4280000
NOTICE: Load image to AP0 MSS
NOTICE: Loading MSS image from addr. 0x402437c Size 0x1f6c to MSS at 0xf0580000
FreeRTOS 7.3.0 - Marvell cm3 - A8K release armada-18.05.1
NOTICE: SCP Image doesn't contain PM firmware
NOTICE: BL1: Booting BL31
lNOTICE: MSS PM is not supported in this build
NOTICE: BL31: v1.5(release):711ecd32 (Marvell-armada-18.09.4)
NOTICE: BL31: Built : 15:20:43, Mar 13 2019
Armada 8040 MachiatoBin Platform Init
Comphy0-0: PCIE0 5 Gbps
Comphy0-1: PCIE0 5 Gbps
Comphy0-2: PCIE0 5 Gbps
Comphy0-3: PCIE0 5 Gbps
Comphy0-4: SFI 10.31 Gbps
Comphy0-5: SATA1 5 Gbps
Comphy1-0: SGMII1 1.25 Gbps
Comphy1-1: SATA2 5 Gbps
Comphy1-2: USB3_HOST0 5 Gbps
Comphy1-3: SATA3 5 Gbps
Comphy1-4: SFI 10.31 Gbps
Comphy1-5: SGMII2 3.125 Gbps
UTMI PHY 0 initialized to USB Host0
UTMI PHY 1 initialized to USB Host1
UTMI PHY 2 initialized to USB Host0
Succesfully installed protocol interfaces
Detected w25q32bv SPI NOR flash with page size 256 B, erase size 4 KB, total 4 MB
ramdisk:blckio install. Status=Success
3h3h3hTianocore/EDK2 firmware version MARVELL UEFI 18.11.0
Press ESCAPE for boot options ...error: no such device: 34e635ff-15bb-4006-8f5f-
error: failure reading sector 0x2 from `hd3'.
Entering rescue mode…
I’ve found out that this Crucial stick works with U-Boot (although I have to restart the board by power-cycling it, to avoid the ECC bug). I have stressed the machine with gcc compilation and it is stable.
But the Crucial memory stick doesn’t work with UEFI+GRUB. With 16GB RAM, grub is incapable of recognizing any filesystem and falls back to rescue mode. Is it a bug in grub? Does anyone have UEFI+GRUB working with 16GB ECC?
Shell> fs1:
FS1:\> cd EFI\EFI\debian
FS1:\EFI\EFI\debian\> ls
Directory of: FS1:\EFI\EFI\debian\
04/21/2019  21:46 <DIR>         2,048  .
04/21/2019  21:46 <DIR>         2,048  ..
04/21/2019  21:46             120,832  grubaa64.efi
          1 File(s)     120,832 bytes
          2 Dir(s)
FS1:\EFI\EFI\debian\> grubaa64.efi
error: no such partition.
Entering rescue mode...
grub rescue>
With a modification to fake the DRAM size.