PCI SIO devices hog interrupts, cause lock order problems
Andrew H. Derbyshire
2004-08-30 11:41:12 UTC
Attached are two boot logs with sources as of today (Sunday 8/28/2004) with
both the standard GENERIC kernel configuration and the generic kernel
modified to include a single additional device, the PUC device. The system
is a Dell GX300 Dual PIII/733 with PCI 3COM Ethernet, a 3COM internal modem,
and 29160 SCSI.

Basically, any PCI SIO device hogs its interrupt if the PUC device is not
also in the kernel, and this causes real problems for any environment like
mine where pulling the modem is not trivial. Does the distributed GENERIC
kernel have room for the PUC device? Are there side effects that PUC should
be excluded from GENERIC?

As a bonus, there appears to be a bug with kernel locking exposed by the
problem. With the stock generic kernel, the XL device reports it couldn't
map the interrupt, and then a lock order reversal is reported. (See the
attached log for the gory details).

Suggestions?

The machine is pure test mode now. I can test either CVS updated source or
private patches.
M. Warner Losh
2004-08-30 18:49:49 UTC
In message: <012301c48e25$14924180$***@hh.kew.com>
"Andrew H. Derbyshire" <***@kew.com> writes:
: Basically, any PCI SIO device hogs its interrupt if the PUC device is not
: also in the kernel, and this causes real problems for any environment like
: mine where pulling the modem is not trivial. Does the distributed GENERIC
: kernel have room for the PUC device? Are there side effects that PUC should
: be excluded from GENERIC?

puc should be in GENERIC, imho.

: As a bonus, there appears to be a bug with kernel locking exposed by the
: problem. With the stock generic kernel, the XL device reports it couldn't
: map the interrupt, and then a lock order reversal is reported. (See the
: attached log for the gory details).

This is a known problem.

Warner
Drew Derbyshire
2004-08-30 19:30:56 UTC
Post by M. Warner Losh
: Basically, any PCI SIO device hogs its interrupt if the PUC device is not
: also in the kernel, and this causes real problems for any environment like
: mine where pulling the modem is not trivial. Does the distributed GENERIC
: kernel have room for the PUC device? Are there side effects that PUC should
: be excluded from GENERIC?
puc should be in GENERIC, imho.
Who makes the call (or the commit)? The cost is ~ 55K on disk
(which seems excessive) with current build, I assume that's bloated
by the current kernel options.
Post by M. Warner Losh
: As a bonus, there appears to be a bug with kernel locking exposed by the
: problem. With the stock generic kernel, the XL device reports it couldn't
: map the interrupt, and then a lock order reversal is reported. (See the
: attached log for the gory details).
This is a known problem.
Well, at least it didn't panic on me, which previous experiments
(months ago) were prone to do.

-ahd-

p.s. Sorry about the original mail being ugly MS HTML. I needed the MIME, not the HTML.
Poul-Henning Kamp
2004-08-30 19:38:41 UTC
Post by Drew Derbyshire
Post by M. Warner Losh
puc should be in GENERIC, imho.
I agree.
Post by Drew Derbyshire
Who makes the call (or the commit)? The cost is ~ 55K on disk
(which seems excessive) with current build, I assume that's bloated
by the current kernel options.
This could be vastly improved if the data structure puc uses were
more intelligent. Many cards could be described simply by their
PCI ID and "fill resource #1 with sio ports" rather than the very
space-consuming and error-prone stuff we do now.
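
To make the "PCI ID plus one fill rule" idea concrete, here is a minimal sketch of what such a table could look like. This is not the structure puc actually uses: every name below (model_card, model_card_lookup, model_port_offset), the stride field, and the PCI IDs are made up for illustration, and "rid" is just an abstract resource index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port types; not the names used by the real driver. */
enum model_port_type { MODEL_PORT_SIO, MODEL_PORT_LPT };

/* One rule per card: "fill resource #rid with nports ports of this type". */
struct model_card {
	uint16_t vendor;	/* PCI vendor ID */
	uint16_t device;	/* PCI device ID */
	uint8_t  rid;		/* which resource holds the ports */
	uint8_t  nports;	/* how many ports live in that resource */
	uint8_t  stride;	/* bytes between consecutive ports (assumed) */
	uint8_t  type;		/* enum model_port_type */
};

/* Made-up IDs, for illustration only. */
static const struct model_card model_cards[] = {
	{ 0x1234, 0x0001, 1, 1, 8, MODEL_PORT_SIO },
	{ 0x1234, 0x0004, 1, 4, 8, MODEL_PORT_SIO },
};

/* Linear scan of the table; returns NULL for an unknown card. */
static const struct model_card *
model_card_lookup(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(model_cards) / sizeof(model_cards[0]); i++) {
		if (model_cards[i].vendor == vendor &&
		    model_cards[i].device == device)
			return (&model_cards[i]);
	}
	return (NULL);
}

/* Offset of port n inside the card's resource, or -1 if out of range. */
static long
model_port_offset(const struct model_card *card, unsigned int n)
{
	assert(card != NULL);
	if (n >= card->nports)
		return (-1);
	return ((long)n * card->stride);
}
```

Each card costs eight bytes here because the per-port layout is computed from the rule instead of being spelled out port by port. Cards whose ports are spread irregularly over several resources would not fit this shape and would still need a longer description.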
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
M. Warner Losh
2004-08-31 04:36:48 UTC
In message: <***@critter.freebsd.dk>
"Poul-Henning Kamp" <***@phk.freebsd.dk> writes:
: In message <***@shub-internet.kew.com>, Drew Derbyshire writes:
:
: >> puc should be in GENERIC, imho.
:
: I agree.
:
: >Who makes the call (or the commit)? The cost is ~ 55K on disk
: >(which seems excessive) with current build, I assume that's bloated
: >by the current kernel options.
:
: This could be vastly improved if the data structure puc uses were
: more intelligent. Many cards could be described simply by their
: PCI ID and "fill resource #1 with sio ports" rather than the very
: space-consuming and error-prone stuff we do now.

The stuff we do now is trying to be too smart. Most single port cards
are like phk says, but multiport is where things really go wonkies...

Warner
Bruce Evans
2004-09-10 12:11:48 UTC
Post by M. Warner Losh
: Basically, any PCI SIO device hogs its interrupt if the PUC device is not
: also in the kernel, and this causes real problems for any environment like
: mine where pulling the modem is not trivial.
This seems to be just the old bug that interrupt attributes are wired
at bus_setup_intr() time, but that time is too early for at least the
INTR_FAST attribute because the best possible wiring depends on the
set of devices that is actively using the interrupt. (This set may
grow as more devices are attached; it should also shrink as devices
are detached, and it should be fully dynamic so that inactive devices
don't pessimize the interrupt wiring of active ones.)

sio just attempts to set up the interrupt using INTR_FAST because that
is best for it. If this fails, then sio tries again without INTR_FAST.
The fallback only helps under this condition: if some (non-sio) device
on the interrupt can't handle INTR_FAST, then at least one such device
is wired to the interrupt before any (not necessarily sio) device asks
for the interrupt to be wired as INTR_FAST.
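
The order dependence can be shown with a small userspace model. This is only a sketch, not kernel code: model_setup_intr() is a hypothetical stand-in for bus_setup_intr(), and it assumes that the first caller fixes the line's attributes, that a fast line cannot be shared, and that a line with ordinary handlers cannot become fast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel flag; the value is illustrative. */
enum { MODEL_INTR_FAST = 0x1 };

/* One interrupt line in the toy model. */
struct model_irq {
	int  nhandlers;	/* handlers currently wired to the line */
	bool fast;	/* attributes fixed by the first handler */
};

/*
 * Toy interrupt setup (assumed rules, see the text above).
 * Returns 0 on success and -1 on failure.
 */
static int
model_setup_intr(struct model_irq *irq, int flags)
{
	bool want_fast = (flags & MODEL_INTR_FAST) != 0;

	assert(irq != NULL);
	/* A line in use is refused if it is fast or if fast is requested. */
	if (irq->nhandlers > 0 && (irq->fast || want_fast))
		return (-1);
	if (irq->nhandlers == 0)
		irq->fast = want_fast;
	irq->nhandlers++;
	return (0);
}

/* sio-style attach: try fast first, then retry without it. */
static int
model_sio_attach(struct model_irq *irq)
{
	if (model_setup_intr(irq, MODEL_INTR_FAST) == 0)
		return (0);
	return (model_setup_intr(irq, 0));
}

/* A driver that cannot use a fast handler, e.g. a network driver. */
static int
model_plain_attach(struct model_irq *irq)
{
	return (model_setup_intr(irq, 0));
}
```

With a zeroed line, calling model_plain_attach() and then model_sio_attach() succeeds both times and the line ends up shared. In the opposite order the sio-style attach gets the line as fast and the later model_plain_attach() fails, which is the "couldn't map the interrupt" case in this model.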

Using puc works around the problem by making setup of the interrupt
with INTR_FAST fail more deterministically, provided PUC_FASTINTR is
not used. This depends on the magic and arguably broken ordering of puc
and pci-sio attachment -- it depends on puc being attached first, but
perhaps pci-sio should be first since it is less generic and more
efficient for the small set of hardware that it handles.

The cy driver works around the problem in a different way: INTR_FAST
is not tried by default, but the CY_PCI_FASTINTR option forces it to
be tried first with a fallback to !INTR_FAST in the same way as in
sio. Thus the default is fail-safe but pessimal for cy. The default
is fail-unsafe for sio mainly for historical reasons.
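
The two defaults differ only in which attempts a driver makes and in what order. A rough sketch, with a hypothetical helper model_attempts() and a made-up flag value: passing true models sio's default (or cy built with CY_PCI_FASTINTR), passing false models cy's default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel flag; the value is illustrative. */
enum { MODEL_INTR_FAST = 0x1 };

/*
 * Fill out[] with the flag sets a driver would pass to interrupt setup,
 * in the order it would try them, and return the number of attempts.
 */
static size_t
model_attempts(bool try_fast_first, int out[2])
{
	size_t n = 0;

	assert(out != NULL);
	if (try_fast_first)
		out[n++] = MODEL_INTR_FAST;	/* optimistic attempt */
	out[n++] = 0;				/* ordinary, shareable attempt */
	return (n);
}
```

In the fail-safe case the fast attempt is never made, so the driver cannot end up holding a line that other devices also need; the price is that it never gets the faster handler unless the option is set.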
Post by M. Warner Losh
: Does the distributed GENERIC
: kernel have room for the PUC device? Are there side effects that PUC should
: be excluded from GENERIC?
puc should be in GENERIC, imho.
I agree. It is too large due to its sparse data structures, but since the
sparse data compresses very well, it doesn't take any more space on
boot media than most drivers in GENERIC.

Bruce
