Address Eric's style(9) nits and remove the space in front of parens in sizeof
Aug 5 2025
Update diff to address Gleb's feedback
Jul 26 2025
Jul 12 2025
Jun 29 2025
Jun 24 2025
Thank you for moving to this solution, away from a global atomic.
Jun 9 2025
May 28 2025
In D50546#1154747, @markj wrote: This generally seems fine to me, but I'm not familiar with the original change and wonder why the compat shims are so coupled to the one application.
In D50546#1154136, @ziaee wrote: Should we put the sysctl in the manual?
May 27 2025
Apr 22 2025
Apr 15 2025
Update to use hlen, which already holds the size of the ip6 hdr, as suggested by glebius
Apr 14 2025
Mar 22 2025
Mar 20 2025
Mar 14 2025
Feb 25 2025
Feb 24 2025
Feb 7 2025
Feb 5 2025
Jan 31 2025
Thank you!
Jan 29 2025
I'd very much like for this to be backed out. I feel like users will be chasing the breakage caused by this for years. It broke some tests, iocage, and nstat that I know of in the 5 minutes I've been aware of the commit.
Jan 15 2025
Jan 13 2025
Nov 25 2024
Nov 15 2024
Nov 14 2024
Super helpful review, John. I just opened a new review (https://reviews.freebsd.org/D47583) for the simplest suggested change. Will work on your other suggestions.
Nov 13 2024
Nov 12 2024
Nov 11 2024
Address Kib's feedback
Nov 8 2024
Nov 6 2024
Nov 4 2024
Oct 28 2024
In D47294#1078779, @markj wrote: In D47294#1078776, @gallatin wrote: Why do we want or need a hardcoded list? Why can't this function be more like lagg_capabilities()? If we do want a hardcoded list, what about IFCAP_TXTLS*
That's a good question. if_bridge could probably be smarter, indeed.
Why exactly does if_bridge need to care about IFCAP_TXTLS*?
Oct 25 2024
I'd personally want to keep these messages under bootverbose. I can imagine it might be handy to see them at times.
Oct 23 2024
Fix style issue pointed out by Mark
Oct 22 2024
Why is this re-surfacing?
Oct 16 2024
The A/B results were not surprising (boring, as David likes to say): just slightly higher CPU on the canary (due to the increased irq rate), but no clear streaming quality changes.
All in all, it seems to work and do no real harm, but we'll not use it due to the increased CPU.
Yeah, my ideal irq rate per queue is < 1000. We mostly use Chelsio and Mellanox NICs that can do super aggressive irq coalescing without freaking out TCP, due to using RX timestamps. Super aggressive coalescing like this lets us build packet trains in excess of 1000 packets to feed to LRO via RSS-assisted LRO, and we actually have useful LRO on internet workloads with tens of thousands of TCP connections per queue. That reminds me that I should port RSS-assisted LRO to iflib (e.g., lro_queue_mbuf()).
In D30155#1074005, @kbowling wrote: In D30155#1073987, @gallatin wrote: In D30155#1073639, @kbowling wrote: @imp @gallatin if you are able to test your workload, setting this to 1 and 2 would be new behavior versus where you are currently:
I can pull this into our tree and make an image for @dhw to run on the A/B cluster. However, we're not using this hardware very much any more, and there is only 1 pair of machines using it in the A/B cluster. Lmk if you're still interested, and I'll try to build the image tomorrow so that David can test it at his leisure.
Sure. It sounds like that is only enough for one experiment, so I would focus on the default algorithm the patch will boot with: sysctl dev.ix.<N>.enable_aim=1
Oct 15 2024
In D30155#1073639, @kbowling wrote:
Oct 14 2024
In D45950#1073880, @jhb wrote: I don't think we need the taskqueue. It's probably just a design copied from the Intel drivers, and I don't think it makes much sense for those either. The other thing that can be nice to do when making this change is to instead build a temporary list of packets linked via m_nextpkt (mbufq works for this) and pass an entire batch to if_input. This lets you avoid dropping the lock as often.
Oct 7 2024
In D46761#1070797, @kib wrote: No, the lock cannot be sleepable because the processing occurs in the context of the interrupt thread.
I would have implemented something with blockcount_t or even epoch, but then I realized that it would not help. blockcount cannot be used because rx memory is not type-stable. A driver-private epoch might work, but note that the second reported backtrace in PR 281368 shows the ip stack acquiring a sleepable lock. So even if I fix the driver, the stack still tries to sleep in ip_input().
I suspect you (Netflix) did not see the deadlock because you either do not use ipv6 or use it in a situation with static network configuration. The problems are visible when multicast group membership is changed; at least that is what I see in the PR.
Oct 1 2024
Sep 26 2024
Would it be better to call mb_unmapped_to_ext() here?
Ah, OK, I understand now.
Sep 25 2024
I'm very afraid there will be performance implications here, due to new cache misses from queueing mbufs twice. On tens of thousands of interfaces running over 8 years, we've never hit a deadlock from this lock, and I don't think fixing it is important enough to justify hurting performance.
I'm confused: if we are marking non-writable M_EXTPG mbufs as M_RDONLY, why can't we simply remove the M_EXTPG check from M_WRITABLE? Why do we need a new macro?
Sep 9 2024
Sep 5 2024
This passes basic sanity testing at Netflix. Sorry for the delayed approval; we had a few integration issues between this and a local Netflix feature that made it look like splice was not working. It only just now became obvious that the problem was due to our local feature, and how to fix it.
Aug 16 2024
Aug 5 2024
Aug 4 2024
Jul 18 2024
Jul 15 2024
In D45950#1048214, @markj wrote: In D45950#1048085, @gallatin wrote: Is this safe? I think so, but I confess that I don't know the low level details in this driver very well.
I believe so, from what I see, the lock exists to synchronize with the taskqueue and to protect some non-atomic counters.
When I read a network driver, I view a lock around rx processing as an indicator that there is room for improvement in the design. The reason the lock seems to exist is to serialize rx ring servicing between the ithread (the normal path) and the taskqueue (which is woken if we continually find more packets to process, or maybe if interrupts don't work?). I don't really understand the code at the bottom of vtnet_rx_vq_process(). It seems like interrupts should be disabled when switching to the taskqueue, and enabled when returning from it. It probably has something to do with the "race" mentioned in that code.
I don't quite understand that either. The comment above the definition of VTNET_INTR_DISABLE_RETRIES suggests to me that the idea is:
- vtnet_rxq_eof() returns 0, so more == 0, i.e., there were no more descriptors to process.
- vtnet_rxq_enable_intr() returned 1, meaning that we found some completed descriptors on the queue when enabling interrupts.
- We should call vtnet_rxq_eof() again to collect those newly completed descriptors instead of deferring.
I'm not sure I understand the purpose of the taskqueue at all though. Why can't we handle all packets in the ithread context?
Jul 8 2024
Jul 1 2024
Jun 21 2024
I was concerned at first about isal, but then I remembered that @jhb had moved it from plugging in at the ktls layer to plugging in at the OCF layer.