Looks fine to me, just a couple of nits.
Jul 3 2024
Jul 2 2024
Have you looked at the similar Linux code? It would be good to be consistent, or at least similar. I haven't looked deeply, but foreach_nfs_host_cb() seems to support multiple hosts.
Jun 27 2024
Jun 24 2024
Looks odd to me, but OK.
In D45660#1042742, @ken wrote: So here is what the debugging log message in isp_getpdb() shows. isp0 and isp1 are connected to LTO-6 tape drives via an 8Gb switch. isp2 is directly connected to an LTO-6 in loop mode:
isp0: Chan 0 handle 0x0 Port 0xfffc01 flags 0x0 curstate 77 laststate 77
isp0: Chan 0 handle 0x1 Port 0x011b26 flags 0x40a0 curstate 46 laststate 46
isp0: Chan 0 handle 0x7fe Port 0xfffffe flags 0x0 curstate 44 laststate 44
isp0: Chan 0 handle 0x7fe Port 0xfffffe flags 0x0 curstate 44 laststate 44
isp1: Chan 0 handle 0x0 Port 0xfffc01 flags 0x0 curstate 77 laststate 77
isp1: Chan 0 handle 0x1 Port 0x011a26 flags 0x40a0 curstate 46 laststate 46
isp1: Chan 0 handle 0x7fe Port 0xfffffe flags 0x0 curstate 44 laststate 44
isp1: Chan 0 handle 0x7fe Port 0xfffffe flags 0x0 curstate 44 laststate 44
isp2: Chan 0 handle 0x0 Port 0x000026 flags 0x40a0 curstate 46 laststate 46
It seems like a good tunable, except that I don't get the meaning of "only" there. Why not "always", "force", or something like that?
None of the QLogic documents I have mention anything about NVMe, and this state field is declared as a byte there. I have no objections to this patch, but I am a bit curious what NVMe status we see there for non-NVMe devices.
Jun 14 2024
Jun 7 2024
Jun 6 2024
May 29 2024
Differences of less than 4 (RQ_PPQ) are insignificant and are simply removed. No functional change (intended).
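For context, a minimal sketch of why, assuming the usual FreeBSD runqueue bucketing (names mirror sys/runq.h, but this is illustrative, not the exact source): priorities are mapped to run queues in buckets of RQ_PPQ, so threads whose priorities differ by less than the bucket width land in the same queue anyway.

#define RQ_NQS  64      /* number of run queues */
#define RQ_PPQ  4       /* priorities per queue (bucket width) */

/* Priorities that differ by less than RQ_PPQ select the same run queue. */
static inline int
runq_bucket(int pri)
{
        return ((pri / RQ_PPQ) % RQ_NQS);
}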
I suspect that the first thread was skipped to avoid stealing a thread that was just scheduled to a CPU but has not been able to run yet.
I am not fully sure about the motivation of this change, but it feels wrong to me to have per-namespace zones. On a big system under heavy load UMA does a lot of work for per-CPU and per-domain caching, and doing that also per-namespace would multiply the resource waste. Also, the last time I touched it, I remember it was difficult for UMA to operate in severely constrained environments, since eviction of per-CPU caches is quite expensive. I don't remember how reservation works in that context, but I suppose that having dozens of small zones with small reservations but huge per-CPU caches is not a very viable configuration.
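To make the concern concrete, a hedged sketch (the structure and zone names below are invented for illustration, not taken from the patch under review): every uma_zcreate() carries its own per-CPU bucket caches, so creating a zone per namespace multiplies that caching overhead by the namespace count, whereas a single shared zone pays for it once.

#include <sys/param.h>
#include <vm/uma.h>

struct ns_req {                 /* hypothetical per-request structure */
        uint64_t        lba;
        uint32_t        count;
};

static uma_zone_t shared_req_zone;

static void
req_zones_init(void)
{
        /* One zone shared by all namespaces: per-CPU caches exist once. */
        shared_req_zone = uma_zcreate("ns_req", sizeof(struct ns_req),
            NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);

        /*
         * The questioned alternative would call uma_zcreate() once per
         * namespace instead, duplicating the per-CPU and per-domain
         * caches for every namespace.
         */
}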
May 23 2024
May 14 2024
I see no problems, but I have difficulty believing that timeout handlers running 1-2 times per second per queue pair can have any visible effect. Also, I am not happy to see a second place where timeouts are calculated. And 99/100 also looks quite arbitrary.
Mechanically it seems to make sense. I missed when the original transition happened, but if you say it is right, so be it.
May 7 2024
I wonder if there is any real architecture where a pointer load/store is non-atomic. For things that are going to be executed somewhere between once and never, it feels like you are over-engineering it. :)
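For reference, a hedged sketch of what is at stake (the variable and function names are hypothetical, not the patch): on the architectures FreeBSD supports, an aligned pointer assignment is already a single machine store, so the explicit accessor mostly documents intent and keeps the compiler from tearing or caching the access.

#include <sys/types.h>
#include <machine/atomic.h>

static void *active_handler;

/* Variant A: plain assignment, a single aligned store on supported CPUs. */
static void
set_handler_plain(void *h)
{
        active_handler = h;
}

/*
 * Variant B: explicit accessor (assumed here to be atomic_store_ptr() from
 * <machine/atomic.h>); same store, but documents intent and forbids
 * compiler tearing.
 */
static void
set_handler_atomic(void *h)
{
        atomic_store_ptr(&active_handler, h);
}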
I have no objections, if it is useful.
May 3 2024
Apr 27 2024
Apr 26 2024
In D44961#1025280, @asomers wrote: What is an "OOA queue"?
I wonder what your queue depth is, such that one message per request per 90 seconds causes a noticeable storm. Also, per-system limiting makes the output not very useful, since by selecting the first message out of many it does not say anything specific about LUNs, ports, commands, etc., only that something is wrong. Thinking even more broadly, I find these messages, printed on actual completion, not very useful: if it is not just a delay but something is really wrong, the commands may never complete and so the messages may never get printed. I wonder whether removing all of this and instead checking the OOA queues once per second for stuck requests and printing some digest would be more useful.
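A hedged sketch of that last idea, with hypothetical structures (struct io_entry, struct stuck_scan) standing in for CTL's real OOA queue types and with locking omitted: a callout fires once per second, walks the order-of-arrival queue, counts requests older than a threshold, and prints one digest line instead of one line per command.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/callout.h>
#include <sys/queue.h>
#include <sys/time.h>

struct io_entry {                       /* hypothetical queued request */
        TAILQ_ENTRY(io_entry)   links;
        time_t                  start;  /* time_uptime at submission */
};

struct stuck_scan {
        struct callout          co;     /* callout_init()ed at setup */
        TAILQ_HEAD(, io_entry)  ooa;    /* assumed ordered oldest-first */
        int                     threshold;      /* seconds */
};

static void
stuck_scan_cb(void *arg)
{
        struct stuck_scan *ss = arg;
        struct io_entry *io;
        int stuck = 0;

        TAILQ_FOREACH(io, &ss->ooa, links) {
                if (time_uptime - io->start < ss->threshold)
                        break;          /* the rest arrived even later */
                stuck++;
        }
        if (stuck != 0)
                printf("OOA scan: %d request(s) outstanding for more than "
                    "%d seconds\n", stuck, ss->threshold);
        callout_reset(&ss->co, hz, stuck_scan_cb, ss);
}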
Apr 20 2024
Looks good to me, but if you wish, a couple of cosmetic thoughts.
Looks good to me, though it seems only cosmetic.
Apr 17 2024
Apr 10 2024
Mar 25 2024
Mar 21 2024
I don't have any chip documentation to know what is right here, so I just wonder whether unconditionally printing a bunch of raw hex numbers is expected. It feels like mpi3mr_print_fault_info() is another candidate for mpi3mr_dprint().
I am not a big fan of the kernel printing something in response to arbitrary user requests; it makes logs messy. Is the error reported to the user not enough here?
Mar 18 2024
Why not backport 506fe78c48 instead?
Mar 15 2024
My only complaint is that it puts this queue into the same cache line as the main queue, which may be modified by writers. But if you really need it for debugging, that is understandable.
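To illustrate the cache-line point with a hedged, made-up structure (not the code under review): if the debug queue head shares a line with the frequently written main queue head, its readers take false-sharing misses on every enqueue; aligning each head to its own line avoids that at the cost of a little memory.

#include <sys/param.h>
#include <sys/queue.h>

struct entry;

struct queues {
        /* Hot head, touched by writers on every enqueue/dequeue. */
        STAILQ_HEAD(, entry)    main_q __aligned(CACHE_LINE_SIZE);

        /*
         * Debug-only head on its own cache line, so reading it for
         * diagnostics does not false-share with the writers above.
         */
        STAILQ_HEAD(, entry)    debug_q __aligned(CACHE_LINE_SIZE);
};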
Mar 6 2024
Mar 5 2024
On failure we've already notified consumers that the controller has failed. What will report that it is back? And is there even a device to send the request IOCTL to?
If you say it helps, I have no objections, but I see nvme_sim_controller_fail() destroying the SIM, so I am not sure you actually get here.
I wonder whether there are any namespace-specific events. I remember the NVMe specs allow per-namespace SMART, but I don't remember many details now.
In D39620#1008905, @sean_rogue-research.com wrote: stable/13 has this patch;
releng/13.2 doesn't have this patch (yet). I'm not very familiar with FreeBSD's branching system... I see FreeBSD 13.3-RELEASE was released today; is this bug fix included?