Page MenuHomeFreeBSD

dmar: Don't try to reserve PCI regions for non-existing devices
ClosedPublic

Authored by kd on Oct 21 2021, 1:14 PM.
Tags
None
Referenced Files
F81891384: D32589.id97648.diff
Fri, Nov 29, 1:51 AM
F81891383: D32589.id97648.diff
Fri, Nov 29, 1:51 AM
F81891382: D32589.id97648.diff
Fri, Nov 29, 1:51 AM
F81891381: D32589.id97648.diff
Fri, Nov 29, 1:51 AM
Unknown Object (File)
Sat, Nov 9, 4:22 AM
Unknown Object (File)
Fri, Nov 8, 4:28 AM
Unknown Object (File)
Thu, Nov 7, 9:02 PM
Unknown Object (File)
Thu, Nov 7, 6:43 PM
Subscribers

Details

Summary

In some cases we might have to create DMAR context before the corresponding device has been enumerated by the PCI bus.
In that case we get called with NULL dev, because of that trying to reserve PCI regions causes a NULL pointer dereference in pci_find_pcie_root_port.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kd requested review of this revision.Oct 21 2021, 1:14 PM
kd created this revision.

I wonder if it makes more sense to check for dev != NULL at the beginning of pci_find_pcie_root_port()

This revision is now accepted and ready to land.Oct 21 2021, 1:42 PM

A NULL check seems like a clear improvement over a crash.

So, does that mean we need to make the PCIe root port reservation at a later time in that case? Or do we know the PCIe root port from the passed in BSF? If so we could add a different lookup method here. We need to ensure somehow that those IOMMU virtual addresses are not allocated.

In D32589#735603, @kib wrote:

I wonder if it makes more sense to check for dev != NULL at the beginning of pci_find_pcie_root_port()

Hmm, I looked at the panic backtrace again and technically the NULL pointer dereference happens in device_get_parent called form pci_find_pcie_root_port.
Now the question is whether we should do sanity checks inside functions exposed as KAPI, or is it the caller responsibility to make sure that the arguments they provide are sane.
I'm not sure what the convention is here.

A NULL check seems like a clear improvement over a crash.

So, does that mean we need to make the PCIe root port reservation at a later time in that case? Or do we know the PCIe root port from the passed in BSF? If so we could add a different lookup method here. We need to ensure somehow that those IOMMU virtual addresses are not allocated.

I'm not sure if this is not a chicken-and-egg kind of problem, hence this patch.
Finding the root port with DBSF only is one thing, but can we even assume that it has been enumerated at this point?

In D32589#735813, @mindal_semihalf.com wrote:
In D32589#735603, @kib wrote:

I wonder if it makes more sense to check for dev != NULL at the beginning of pci_find_pcie_root_port()

Hmm, I looked at the panic backtrace again and technically the NULL pointer dereference happens in device_get_parent called form pci_find_pcie_root_port.
Now the question is whether we should do sanity checks inside functions exposed as KAPI, or is it the caller responsibility to make sure that the arguments they provide are sane.
I'm not sure what the convention is here.

It is about making the pci_find_pcie_root_port() more useful, perhaps.

A NULL check seems like a clear improvement over a crash.

So, does that mean we need to make the PCIe root port reservation at a later time in that case? Or do we know the PCIe root port from the passed in BSF? If so we could add a different lookup method here. We need to ensure somehow that those IOMMU virtual addresses are not allocated.

I'm not sure if this is not a chicken-and-egg kind of problem, hence this patch.
Finding the root port with DBSF only is one thing, but can we even assume that it has been enumerated at this point?

Note that for the newbus initialization, we usually come from root down to the devices. DMARs do not participate in the parent/children relations there, but logically they are located before the root ports. So in principle DMARs should be initialized before anything in PCIe domain below it, and this is logical, because device attach might already need to start some DMA transfer. So probably regions reservation indeed should be moved later to the moment where ports are enumerated, at least.