UMA uses a return value of 0 to indicate that an allocation failed, so a UMA zone cannot be used to cache a resource whose value is 0.
Details
- Reviewers: kib, alc
- Commits: rS339599: Don't import 0 into vmem quantum caches.

Diff Detail
- Repository: rS FreeBSD src repository (subversion)
Event Timeline
I believe this change means that the range from 0 to QUANTUM_SIZE - 1 cannot be allocated by vmem_alloc() when quantum caching is enabled (except as part of an allocation request that is too large to be satisfied by a quantum cache). Should the qc_import comment state this explicitly?
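For reference, a minimal sketch of the shape qc_import() takes with this patch (types as in sys/kern/subr_vmem.c; the constant name VMEM_ADDR_QCACHE_MIN is an assumption here, and any nonzero minimum would make the same point):

```c
#define	VMEM_ADDR_QCACHE_MIN	1	/* assumed name; any nonzero minimum */

static int
qc_import(void *arg, void **store, int cnt, int domain, int flags)
{
	qcache_t *qc = arg;
	vmem_addr_t addr;
	int i;

	for (i = 0; i < cnt; i++) {
		/*
		 * Imported spans are quantum-sized and quantum-aligned, so
		 * a nonzero minimum address excludes exactly the range
		 * [0, QUANTUM_SIZE - 1] from the cache.  uma_zalloc()'s
		 * NULL-on-failure return therefore stays unambiguous.
		 */
		if (vmem_xalloc(qc->qc_vmem, qc->qc_size, 0, 0, 0,
		    VMEM_ADDR_QCACHE_MIN, VMEM_ADDR_MAX, flags, &addr) != 0)
			break;
		store[i] = (void *)addr;
	}
	return (i);
}
```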
That's true.
> Should the qc_import comment state this explicitly?
I think we could side-step the problem entirely by having vmem_alloc() perform an M_NOWAIT allocation from the quantum cache before falling back to vmem_xalloc(). We would still attempt to import resources into the cache upon a cache miss, but we would only block in vmem_xalloc(). Does that seem reasonable?
As I understand it, there are now two qualitatively different vmem_xalloc() calls that could be made when a user calls vmem_alloc(). The first is the vmem_xalloc() call used in qc_import() to fill the quantum caches when needed, and the change to this call's parameters means that a quantum cache will never contain the range from 0 to QUANTUM_SIZE - 1. The second is called directly by vmem_alloc() if the allocation size exceeds those supported by the quantum caches, and this call could still allocate the range from 0 to QUANTUM_SIZE - 1.
With quantum caching enabled, this change means that in practice vmem_alloc() will almost never allocate the range from 0 to QUANTUM_SIZE - 1. (If the allocation request hits the second vmem_xalloc() call described above, it is necessarily asking for a range whose size is greater than the quantum size. Unless this is the very first allocation request, it's unlikely that the range from 0 to QUANTUM_SIZE - 1 would be used to satisfy it.) I'm not sure whether this behavior is a big deal, but vmem consumers may not expect it.
> I think we could side-step the problem entirely by having vmem_alloc() perform an M_NOWAIT allocation from the quantum cache before falling back to vmem_xalloc(). We would still attempt to import resources into the cache upon a cache miss, but we would only block in vmem_xalloc(). Does that seem reasonable?
I think the behavior of never allocating the range from 0 to QUANTUM_SIZE - 1 would still apply, because resources would still be imported into the cache by the first vmem_xalloc() call described above and used to satisfy the allocation request on a cache miss. Please let me know whether this makes sense.
One potential solution to the issue could be to store ranges that are offset by some constant in the quantum caches and to subtract that constant before returning them to the consumer. Not sure whether that is worth it, though.
It does. To summarize the issue with the current patch: vmem_alloc() will return 0 only if the consumer specifies an allocation size larger than qcache_max. In particular, if the arena is otherwise completely depleted, vmem_alloc(M_WAITOK) may block waiting for resources to be freed even when 0 is available. I think the issue can be fixed by ensuring that qc_import() only ever performs M_NOWAIT allocations and by falling back to vmem_xalloc(VMEM_ADDR_MIN, VMEM_ADDR_MAX) if an attempt to allocate from the quantum cache fails and M_WAITOK is specified. The downside of this approach is that vmem_alloc(M_WAITOK) becomes marginally more expensive if the arena is depleted, since it will result in two vmem_xalloc() calls: one from qc_import(), and one fallback call. In that case performance will already be very degraded, so I think this is an acceptable compromise.
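For concreteness, a sketch of the shape such a fix could take in vmem_alloc() (field names as in sys/kern/subr_vmem.c; an illustration of the proposal above, not the committed diff):

```c
int
vmem_alloc(vmem_t *vm, vmem_size_t size, int flags, vmem_addr_t *addrp)
{
	qcache_t *qc;

	MPASS(size > 0);

	if (size <= vm->vm_qcache_max) {
		/*
		 * Try the quantum cache without blocking; qc_import()
		 * likewise only ever performs M_NOWAIT allocations.
		 */
		qc = &vm->vm_qcache[(size - 1) >> vm->vm_quantum_shift];
		*addrp = (vmem_addr_t)uma_zalloc(qc->qc_cache,
		    (flags & ~M_WAITOK) | M_NOWAIT);
		if (__predict_true(*addrp != 0))
			return (0);
		/* This check is the one reconsidered below. */
		if ((flags & M_WAITOK) == 0)
			return (ENOMEM);
	}

	/* Fall back to the arena; only here may M_WAITOK block. */
	return (vmem_xalloc(vm, size, 0, 0, 0, VMEM_ADDR_MIN, VMEM_ADDR_MAX,
	    flags, addrp));
}
```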
I realized that I made another mistake: nothing prevents vmem_free() from putting 0 into one of the quantum caches.
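A matching guard on the free path might look like the following sketch (VMEM_ADDR_QCACHE_MIN is the same assumed constant as above):

```c
void
vmem_free(vmem_t *vm, vmem_addr_t addr, vmem_size_t size)
{
	qcache_t *qc;

	MPASS(size > 0);

	if (size <= vm->vm_qcache_max &&
	    __predict_true(addr >= VMEM_ADDR_QCACHE_MIN)) {
		/* Never let 0 (or anything below the minimum) be cached. */
		qc = &vm->vm_qcache[(size - 1) >> vm->vm_quantum_shift];
		uma_zfree(qc->qc_cache, (void *)addr);
	} else
		vmem_xfree(vm, addr, size);
}
```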
> One potential solution to the issue could be to store ranges that are offset by some constant in the quantum caches and to subtract that constant before returning them to the consumer. Not sure whether that is worth it, though.
Wouldn't that potentially cause the same problem if you wanted to use vmem to manage resources at the other end of the range?
Regarding the fallback solution, I believe it is okay to fall back on vmem_xalloc() regardless of whether vmem_alloc() is called with M_WAITOK or M_NOWAIT (i.e. the if statement on line 1299 is not necessary). If vmem_alloc() is called with M_NOWAIT, then the fallback vmem_xalloc() call will also have M_NOWAIT and could also allocate the range starting at 0.
However, while this fallback solution prevents waiting when there are actually resources available, it still has the behavior I described in my previous comment. I think the most elegant solution to that behavior would be to change the signature of uma_zalloc() so that the status and the return value are distinct. However, this isn't practical, since uma_zalloc() is used in many places.
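For illustration, the kind of signature change meant here might read as follows (purely hypothetical; uma_zalloc_status() does not exist):

```c
/*
 * Hypothetical signature (illustration only): report success/failure out
 * of band, so that any value, including 0, is a valid cached item.  This
 * cannot be built as a wrapper around the existing uma_zalloc(), because
 * the NULL-means-failure convention is baked into the allocator itself;
 * it would have to be threaded through all of UMA and its many callers.
 */
int	uma_zalloc_status(uma_zone_t zone, int flags, void **itemp);
```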
> Wouldn't that potentially cause the same problem if you wanted to use vmem to manage resources at the other end of the range?
The offset or "slide" solution would also fix the behavior, and it would not cause the same problem for the other end of the range as long as values in the quantum caches are correctly slid back when they leave the caches. I believe we can achieve this with very slight modifications to qc_import(), qc_release(), vmem_alloc(), and vmem_free() (on the order of one line in each function).
Edit: When writing the above, I was under the assumption that vmem would only be allocating non-negative integers. In that case a slide offset of 1 would solve the problem. But now I'm thinking there's no reason why vmem couldn't allocate negative integers as well, so the slide may be a bit more tricky.
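To spell the slide idea out, here are the per-function one-line deltas it would amount to (a sketch; QC_SLIDE is a hypothetical constant):

```c
#define	QC_SLIDE	1	/* hypothetical; applied inside the caches */

/* qc_import(): slide addresses up as they enter the cache. */
		store[i] = (void *)(addr + QC_SLIDE);

/* vmem_alloc(): slide back down on a quantum-cache hit. */
		*addrp = (vmem_addr_t)uma_zalloc(qc->qc_cache, flags);
		if (*addrp != 0) {
			*addrp -= QC_SLIDE;
			return (0);
		}

/* vmem_free(): slide up before caching a freed range. */
		uma_zfree(qc->qc_cache, (void *)(addr + QC_SLIDE));

/* qc_release(): slide back down before returning spans to the arena. */
		vmem_xfree(qc->qc_vmem, (vmem_addr_t)store[i] - QC_SLIDE,
		    qc->qc_size);

/*
 * Caveat (anticipating the next comment): with any constant slide, some
 * address at the top of the range wraps around to 0, so the ambiguity is
 * moved rather than removed.
 */
```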
Right, I'll drop that check.
> However, while this fallback solution prevents waiting when there are actually resources available, it still has the behavior I described in my previous comment. I think the most elegant solution to that behavior would be to change the signature of uma_zalloc() so that the status and the return value are distinct. However, this isn't practical, since uma_zalloc() is used in many places.
Indeed. I don't think the behaviour you described is fixable per se, but I also don't see it as a problem.
>> Wouldn't that potentially cause the same problem if you wanted to use vmem to manage resources at the other end of the range?
> The offset or "slide" solution would also fix the behavior, and it would not cause the same problem for the other end of the range as long as values in the quantum caches are correctly slid back when they leave the caches. I believe we can achieve this with very slight modifications to qc_import(), qc_release(), vmem_alloc(), and vmem_free() (on the order of one line in each function).
> Edit: When writing the above, I was under the assumption that vmem would only be allocating non-negative integers. In that case a slide offset of 1 would solve the problem. But now I'm thinking there's no reason why vmem couldn't allocate negative integers as well, so the slide may be a bit more tricky.
As far as I know, a vmem arena can manage any subset of [0, ULONG_MAX] (or [LONG_MIN, LONG_MAX] if you prefer), and that interval contains one more element than the set of possible return values from uma_zalloc() upon success. So I'm still having trouble seeing how a slide can be made to work in all cases.
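To make the counting explicit (assuming a 64-bit vmem_addr_t): the arena can hand out any of the 2^64 addresses in [0, ULONG_MAX], but a successful uma_zalloc() can return only the 2^64 - 1 nonzero values, so by the pigeonhole principle no slide, or any other injective remapping into nonzero values, can represent every address.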
> Right, I'll drop that check.
Looks reasonable to me.
> Indeed. I don't think the behaviour you described is fixable per se, but I also don't see it as a problem.
> As far as I know, a vmem arena can manage any subset of [0, ULONG_MAX] (or [LONG_MIN, LONG_MAX] if you prefer), and that interval contains one more element than the set of possible return values from uma_zalloc() upon success. So I'm still having trouble seeing how a slide can be made to work in all cases.
Fair enough, and I agree with your assessment. Thanks.
Inline Comments

sys/kern/subr_vmem.c
- Line 512 (On Diff #49150): Whether "it" refers to UMA or the value zero is ambiguous until you've understood the end of the sentence.
- Lines 524–525 (On Diff #49150): Doesn't the MPASS(strat == ...) in vmem_alloc() make this redundant?
- Line 532 (On Diff #49150): You might as well fix the style bug here, since you fixed the same style bug in vmem_alloc().
- Line 1425 (On Diff #49150): __predict_true(addr >= ...) is as likely to be correct here as on *addrp != 0 in vmem_alloc().