This retrieves Xen's recommended address range for grant table and
interrupt for event channel. The interrupt is forwarded to the Xen
interrupt handling code.
Originally the device was "xen0", but "xen-dt0" should hopefully be
clearer in boot messages. The address range is for grant table and
the interrupt is used for the event channel.
This was originally implemented by Julien Grall in 2014. This is was
broken off of a much larger commit.
I'm not convinced this should be tied to the usage of device-tree (as it's in the same file). An Arm guest booting off ACPI would also require the registration of the shared_info page, and hence this routine should be in a separate file.
Same here, the reigstration of the vcpu_info shouldn't be done here, the more that this will only be executed for the BSP, and you need the vcpu_info to be registered for each vCPU on the system
That likely means merging this portion with D29874. I noticed xen_early_init() didn't fit in xen-dt.c, but hadn't noticed xen_init() might also need to be broken off. Hmm.
Maybe. Depends upon whether this can reasonably be shared by multiple architectures. This would likely share with implementation attempts for aarch32 or other ARM on aarch64 Xen. I'm less sure about how well it will share with RISC-V.
The device-tree node is pretty generic. So I would hope that any architecture using Device-Tree would end up to use/improve it rather than reinventing the wheel.
This doesn't seem to be used by this file. Software contexts are not typically tracked in this way, but instead obtained from the device_t structure via device_get_softc(9).
xen_init() is too general a name for this function. This looks to be equivalent to xen_hvm_init_shared_page_info() from x86/xen/hvm.c. Since arm64 will be HVM only (yes?), I guess maybe xen_init_shared_page().
I don't see anything in this file that couldn't be shared, it's all pretty standard fdt/newbus stuff. I'm excluding xen_init(), which I think we agree will go elsewhere under sys/arm64/.
So yes, this should go in the generic code under sys/dev/xen. Please also make this line optional on fdt.
I suppose it could be renamed. I can't help noting this is a static function so as long as the name doesn't collide with another function in the file it should be okay.
I'm kind of doubtful Xen is likely to support anything besides device-tree on ARM. The boot method I've been using has been to use a Tianocore build as something akin to a first-stage bootloader (domain configuration file: kernel = "/usr/share/xen/XEN_EFI.fd"). In this situation Tianocore preserves the device-tree as an ACPI table, so this does function in presence of ACPI/UEFI support.
That though is not an argument against breaking this function off and into a separate file.
Other issue is why isn't HYPERVISOR_shared_info in a common file somewhere...
I don't see anything in this file that couldn't be shared, it's all pretty standard fdt/newbus stuff. I'm excluding xen_init(), which I think we agree will go elsewhere under sys/arm64/.
So yes, this should go in the generic code under sys/dev/xen. Please also make this line optional on fdt.
Question for @royger: What would your view of changing the return type of xen_intr_handle_upcall() to int be?
On x86 xen_intr_handle_upcall() is called from sys/{i386|amd64}/{i386|amd64}/apic_vector.[sS] which don't need a return. The result is on ARM the function xen_intr_filter() had to be created, which calls xen_intr_handle_upcall() then returns FILTER_HANDLED. If the return of xen_intr_handle_upcall() was changed, this code could directly pass it to bus_setup_intr() instead of needing the wrapper.
(this should be taken care of before this is committed)
Xen *does* provide both ACPI and Device-Tree. Although, the support for ACPI in Dom0 is less mature than the guest support.
If an admin wants to use Xen + FreeBSD on server, then he/she will have to use ACPI because quite a few servers don't provide DT (or doesn't guarantee to be fully functional).
Therefore, I think we should avoid to rely too much on DT in port for FreeBSD. Note this is not a request to add support for ACPI as part of this work.
I'm thinking what is xen_init() here is going to be merged into D28982. While D30006 would make the interrupt bit cleaner (assuming that actually works for x86).
Thing is, this really is highly similar to xen_hvm_init_shared_info_page() in sys/x86/xen/hvm.c. So similar they could fairly readily be merged.
@royger does xen_hvm_init_shared_info_page() actually need to be called on resume on x86? Can it instead simply be called once during sysinit? (if yes, then this could be a static function in common.c; if no, then this could be a global in common.c)
@royger and @julien_xen.org opinion on the differences between xen_hvm_init_shared_info_page() versus what is currently xen_init()? xen_hvm_init_shared_info_page() instead does a malloc(PAGE_SIZE, M_XENHVM, M_NOWAIT), versus the vm_page_alloc() here.
This version has xen_early_init() replaced with xen_early_probe(). The
prototype for xen_early_probe() matches xen_domain(). As such xen_early_probe() could be used by any driver to test the presence of Xen
during driver probe.
I'm starting to think this needs to come after the interrupt code is
taken care of. Presently this takes care of feeding the interrupt to the
Xen interrupt handling code (note xen_intr_filter() and D30006).
Several of the comments have been addressed, yet the issue pointed to during upload now looms large. This looks like it should come after the interrupt code situation is resolved.
Now part of D28982 and the function is xen_handle_shared_info(). This implementation appears compatible with xen_hvm_init_shared_info_page(), so the two are combined in D28982.
As a revision, I think this is mostly ready. Problem is due to taking the role of feeding interrupts to the rest of the Xen interface, it really needs the interrupt work done first.
@julien_xen.org Can we drop All rights reserved.? We've been removing it from other files (with copyright holders permission) as it's no longer needed.
Updating D29875 to what I currently have, though I do not expect this to be final either. Mostly this is updating others on what I have. General idea is the call to xen_domain() at the top of xencons_cnprobe() could be replaced with a call to xen_early_probe(). On x86 xen_early_probe() would simply be a #define to xen_domain(), whereas on other architectures the console probe is when Xen's presence or absence needs to be known.
Failing that, if the console is completely disabled, SI_SUB_HYPERVISOR/SI_ORDER_FIRST should be before devices are probed and thus be soon enough to ascertain Xen's presence or absence.
The update had been aimed at keeping what was in Phabricator up to date. Unfortunately I still don't think this is quite ready and should still be in the "changed planned" state.
I really don't expect this to be final. There remain quirks which I expect need addressing.
There is a crucial ordering issue between D29498 and D29875. D30816 introduces a hook point. The theory is xen_dt_probe() gets declared in sys/arm64/include/xen/xen-os.h, then xen_early_probe() is defined to xen_dt_probe(). The theory is if it becomes possible to probe Xen's presence via EFI, then a xen_efi_probe() could be introduced and xen_early_probe() could call both.
Two other improvements I see are: Likely the interrupt resource should be remarked as belonging to xenpv0, and the address resource needs to be returned to the nexus. Current implementation is using x86's approach of asking the nexus for an address range, and the address resource is simply unused.
I'll be looking through the source trying to figure out what you mean by "FDT specific name". My first guess is FDT files are supposed to include some sort of setting to tell OFW code to load them if device X is seen. Mainly searching for a "/hypervisor" which is compatible with "xen,xen" isn't the way it is supposed to be done (this is based on something originally written in 2014, so I wouldn't be surprised at yet more changes).
Good thing I'm regularly doing builds and analyzing failures.
What is presently in Phabricator is broken due to mistaken handling of the xen_handle_shared_info() call. Two solutions come to mind.
The console init and SI_SUB_HYPERVISOR invocations could be distinguished by passing numbers as an argument, then the xen_handle_shared_info() could be called during SI_SUB_HYPERVISOR. This makes the code more complex as this conditionally calls xen_handle_shared_info() during the main portion, but also conditionally calls it during the short-circuit near the top
Return to an approach similar to what had been part of D28982. Code-wise this is distinctly simpler. I dislike the invocation of xen_handle_shared_info() being so far from D28982, though this does have the advantage of ensuring xen_handle_shared_info() is only called once during boot on x86.
Knowing what the implementation of option 1 is going to look like, the implementation of option 2 is sufficiently simpler for me to consider that compelling. As such I would opt for option 2 unless there is a strong opinion against (@mhorne has indicated dislike of this in D28982).
Another issue is telling the ofw code to release the range suggested for the grant-table. That range is valuable for Linux's approach to interacting with Xen, not FreeBSD's approach. I've got a working implementation which does a bus_alloc_resource_any() followed by a bus_release_resource(), but this leaves behind the report during probe of xen-dt0 having a 16MB memory range. I would really like to nuke that during probe...
There is actually a bug here. xen_handle_shared_info() can be called during SI_SUB_HYPERVISOR, but cannot be called during console init. Since xen_dt_probe() will be called during console init unless the system console is completely disabled, this doesn't work.
Actually, presently the ABI assumes 4KB pages ("frames"). There are plans to address this, but such isn't yet available in Xen. Due to this, the update I've got staged includes switches to using PAGE_*_4K.
I'm marking this done as next update will include this change.
Updating. This is missing two hunks from the most recent successful build, but I believe things should work without those. The extra bits are trying to do something about the grant-table range.
Fixing goof introduced during nitpicking. Got the cast slightly wrong and it had been a while since last check. I'm wondering whether additional flags are appropriate for the interrupt allocation...