For testing macb stall (mainline) #7472
…rn_ratelimited" This reverts commit b2f7eec. Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
…_tx_poll" This reverts commit 60fc80b. Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
This reverts commit 79dc190. Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
This reverts commit ff6914e. Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
…write

The MACB found in the Raspberry Pi RP1 suffers from sporadic stalls on the TX queue. While the exact root cause is not yet fully understood, it is likely related to a hardware issue where a TSTART write to the NCR register is missed, preventing the transmission from being kicked off.

Implement a timeout callback to handle TX queue stalls, triggering the existing restart mechanism to recover.

Link: https://lore.kernel.org/all/20260514215459.36109-1-lukasz@raczylo.com/
Fixes: dc110d1 ("net: cadence: macb: Add support for Raspberry Pi RP1 ethernet controller")
Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>
Co-developed-by: Steffen Jaeckel <sjaeckel@suse.de>
Signed-off-by: Steffen Jaeckel <sjaeckel@suse.de>
Co-developed-by: Andrea della Porta <andrea.porta@suse.com>
Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
Link: https://patch.msgid.link/468f480454a314303bac6a54780b153f689f2267.1781598350.git.andrea.porta@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit e438ec3)
|
I'm running with this now...
|
Two hours later and I've not hit a timeout (there's a pr_err in my build), but that doesn't surprise me - we never saw the stall - and there are no more error messages because the special stall detection code has gone.
|
Is there actually a stall happening that was previously masked? |
|
I have seen this stall once or twice in my lab, so I can confirm that it exists. Andrea pointed me to a potential reproducer, but I was never able to provoke the stall with it.
|
I think it's more likely to be a false positive |
|
I don't think either of us is questioning whether stalls are a problem - the question is whether the messages that have started appearing indicate that a real fault has occurred, or an accidental triggering due to a fault in the detection logic.
|
Let's give this some broader testing. As you already know, I'm always in favor of aligning with mainline (where possible). @satmandu, any chance that you can test this too? You can download the kernel with rpi-update pulls/7472 (the usual precautions apply).
|
I'm not seeing the messages any more with this kernel. |