Context reported by notafet:
14.3-RELEASE-p2 GENERIC on amd64 with ZFS in use
RAM: looks to be 24 GiBytes, not explicitly mentioned
SWAP: 8192 MiBytes
(The figures below are taken from the posted image of top's output.)
Wired: 17 GiBytes
ARC Total: 1942 MiBytes
SWAP used: 1102 MiBytes
The link to the storage channel's message is:
https://discord.com/channels/727023752348434432/757305697527398481/1404367777904463914
On Mon, 11 Aug 2025 13:18:47 -0700
Mark Millard <marklmi@yahoo.com> wrote:
[ … ]
Since updating a couple of weeks ago to the latest -CURRENT at the time
(from a previous build a couple of months old), I've been experiencing a
similar but worse case of this. The machine has 64G of RAM; given 2 days it
will run out of memory and start killing off processes, with Wired memory
growing the whole time (at a rate of about 1G every 5s at some points).
The ZFS ARC never passes more than 20G the whole time.
For example, right now top reports:
Mem: 1031M Active, 14G Inact, 241M Laundry, 45G Wired, 404M Buf, 2157M Free
ARC: 17G Total, 2883M MFU, 11G MRU, 256K Anon, 167M Header, 3025M
(Time for another reboot or it'll have killed things off when I get
home from work in 8 hours)
I'm still trying to pin down what exactly it's related to. I've pretty much
eliminated ZFS as the issue, and a large poudriere ports build or a full
buildworld doesn't seem to push it up. It seems to be more closely related
to how many graphical things are open at once.
I've tried drm-61-kmod and drm-66-kmod for my amdgpu Polaris 10 video card,
but that seems to make no difference. The next thing I was planning to try
was switching to the vesa driver to see if that reduces things, but these
things take time to fit in around everything else.
Darrin
On Tue, 12 Aug 2025 16:10:32 +0930
Darrin Smith <beldin@beldin.org> wrote:
[ … ] with Wired Memory growing the whole time (at the rate of
about 1G every 5s at some points). [ … ]
Correction here: 1*M* every 5s... It lasts a day or two at least :D
[ … ]
Well, removing amdgpu altogether made no difference; Wired is still climbing
(I only suspected amdgpu because Wired climbed sharpest when I was logged in).
However, I have noticed that there is no noticeable growth when using a local
login (on ZFS); it's only the NFS-based users that seem to be causing Wired
to climb sharply.
This might be a hint. NFS uses metadata heavily. I'm not a ZFS guy, [ … ]
Start looking at differences in periodic shots of vmstat -z and
vmstat -m. It would not catch direct page allocators.
On Tue, 12 Aug 2025 12:57:39 +0300
Konstantin Belousov <kostikbel@gmail.com> wrote:
[ … ]
Ok, I hope I'm reading these outputs correctly...
Looking at vmstat -z, I am assuming the SIZE column shows the size of each
item in a zone and USED indicates the number of items in use? (A quick look
at vmstat.c, which pointed me to memstat_get_*, suggests I'm on the right
track.) Multiplying the two gives numbers of around the right order of
magnitude to match my memory.
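For reference, the used * size arithmetic described above can be reproduced
with libmemstat(3), the library that vmstat(8) itself is built on. The
following is a minimal sketch, assuming only the stock libmemstat interface
(nothing in it comes from this thread); build it with
cc -o zonebytes zonebytes.c -lmemstat and diff successive runs to spot zones
whose in-use bytes only ever grow:

/*
 * zonebytes.c - print "zone-name bytes-in-use" for every UMA zone,
 * i.e. the USED count times the per-item SIZE that vmstat -z reports.
 * Sketch only; build: cc -o zonebytes zonebytes.c -lmemstat
 */
#include <sys/types.h>
#include <sys/queue.h>

#include <err.h>
#include <memstat.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
        struct memory_type_list *mtlp;
        struct memory_type *mtp;

        mtlp = memstat_mtl_alloc();
        if (mtlp == NULL)
                err(1, "memstat_mtl_alloc");
        /* Fetch UMA zone statistics, the data behind "vmstat -z". */
        if (memstat_sysctl_uma(mtlp, 0) < 0)
                errx(1, "memstat_sysctl_uma failed");
        for (mtp = memstat_mtl_first(mtlp); mtp != NULL;
            mtp = memstat_mtl_next(mtp)) {
                /* items in use times per-item size = bytes currently in use */
                printf("%s %ju\n", memstat_get_name(mtp),
                    (uintmax_t)(memstat_get_count(mtp) *
                    memstat_get_size(mtp)));
        }
        memstat_mtl_free(mtlp);
        return (0);
}

The same list can also be populated with malloc(9) type statistics (what
vmstat -m shows) via memstat_sysctl_malloc().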
I have taken 3 samples over the last 18 hours, in which time it looks like
about half of my memory has become wired, which seems a little excessive,
especially considering ZFS is only using about 6.5G according to top:
Mem: 1568M Active, 12G Inact, 656M Laundry, 36G Wired, 994M Buf, 12G Free
ARC: 6645M Total, 3099M MFU, 2617M MRU, 768K Anon, 49M Header, 877M
4995M Compressed, 5803M Uncompressed, 1.16:1 Ratio
Swap: 8192M Total, 198M Used, 7993M Free, 2% Inuse
In the middle of this range I was building about 1000 packages in poudriere,
so it's been busy.
Interestingly, the ZFS ARC size has actually dropped since 9 hours ago when I
took the 2nd measurement (it was about 15G then), but that was at the height
of the build, which suggests the ARC is expiring older stuff happily.
So, assuming used * size is the right calculation, I saw the following big
changes in vmstat -z:
vm_page:
18 hours ago (before build): 18159063040, 25473990656
9 hours ago (during build) : 27994304512, 29363249152
delta : +9835241472, +3889258496
recent sample : 14337658880, 35773743104
delta : -13656645632, +6410493952
NAMEI:
18 hours ago: 2 267 478 016
9 hours ago : 13 991 848 960
delta : +11 724 370 944
recent sample: 24 441 244 672
delta : +10 449 395 712
zfs_znode_cache:
18 hours ago: 370777296
9 hours ago : 975800816
delta : +605023520
recent sample: 156404656
delta : -819396160
VNODE:
18 hours ago: 440384120
9 hours ago : 952734200
delta : +512350080
recent sample: 159528160
delta : -793206040
Everything else comes out to smaller numbers, so I assume it's probably
not them.
If I'm getting the numbers right, I'm seeing various caches expiring after
the poudriere build finished. But that NAMEI figure seems to be growing quite
extensively still; I don't know if that's expected or not :)
I will keep watching these, and hopefully get a sample after the machine has
started killing processes.
If any gurus would like .xml dumps of the vmstat -z & -m outputs, I have them
available (XML is easier for me to import into a spreadsheet); I can email
them or upload them somewhere suitable.
Darrin
--
=b
On Wed, Aug 13, 2025 at 12:47 AM Darrin Smith <beldin@beldin.org> wrote:
[ … ]
Are you running the nfsd?
I ask because there might be a pretty basic blunder in the NFS server.
There are several places where the NFS server code calls namei() and
they don't do a NDFREE_PNBUF() after the call.
All but one of them is related to the pNFS server, so it would not
affect anyone (no one uses it), but one of them is used to update the
V4 export list (a function called nfsrv_v4rootexport()).
So Kostik, should there be a NDFREE_PNBUF() after a successful
namei() call to get rid of the buffer?
rick
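As a sketch of the pattern being described here (this is an illustrative
namei() caller, not the actual NFS server or nfsrv_v4rootexport() code, and
the helper name is made up):

/*
 * Illustrative only: a namei() caller that releases the pathname buffer.
 * Omitting the NDFREE_PNBUF() call on the success path leaks one NAMEI
 * zone allocation per lookup.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/namei.h>
#include <sys/vnode.h>

static int
example_lookup(const char *path)        /* hypothetical helper */
{
        struct nameidata nd;
        int error;

        NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, path);
        error = namei(&nd);
        if (error != 0)
                return (error);         /* namei() frees the buffer on error */

        /* ... use the locked, referenced vnode nd.ni_vp here ... */

        NDFREE_PNBUF(&nd);              /* release the saved pathname buffer */
        vput(nd.ni_vp);                 /* drop the LOCKLEAF lock and reference */
        return (0);
}

On an error return namei() has already released the buffer, which is why the
NDFREE_PNBUF() belongs on the success path only.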
On Wed, Aug 13, 2025 at 05:42:55PM -0700, Rick Macklem wrote:
[ … ]
So, I basically answered the question myself. After mjg@'s commit
on Sep. 17, 2022 (5b5b7e2 in main), the buffer is always saved
unless there is an error return.
The "vmstat -z | fgrep NAMEI" count does increase by one each
time I send a SIGHUP to mountd.
This is fixed by adding a NDFREE_PNBUF().
However, one buffer each time exports are reloaded probably is
not the leak you guys are looking for.
Yes.
Definitely.
I am not sure what they reported (instead of raw output, some interpretation
was provided), but so far it seems to be just normal vnode caching. Perhaps
they can compare the number of vnodes allocated against the cap
kern.maxvnodes. The allocation count should not exceed maxvnodes
significantly.
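The comparison suggested here boils down to reading two sysctls. A minimal
userland sketch, assuming only the standard vfs.numvnodes and kern.maxvnodes
OIDs (the width check is a precaution because the kernel-side integer type
has varied across releases; little-endian is assumed):

/*
 * vnodecap.c - compare the number of vnodes currently allocated with the
 * kern.maxvnodes cap.  Sketch only; build: cc -o vnodecap vnodecap.c
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint64_t
read_num(const char *name)
{
        uint64_t v = 0;
        size_t len = sizeof(v);

        if (sysctlbyname(name, &v, &len, NULL, 0) == -1)
                err(1, "sysctlbyname(%s)", name);
        if (len == sizeof(uint32_t)) {  /* 32-bit sysctl type, little-endian */
                uint32_t v32;

                memcpy(&v32, &v, sizeof(v32));
                v = v32;
        }
        return (v);
}

int
main(void)
{
        uint64_t numvnodes, maxvnodes;

        numvnodes = read_num("vfs.numvnodes");
        maxvnodes = read_num("kern.maxvnodes");
        printf("vfs.numvnodes = %ju, kern.maxvnodes = %ju (%.1f%% of the cap)\n",
            (uintmax_t)numvnodes, (uintmax_t)maxvnodes,
            maxvnodes != 0 ? 100.0 * (double)numvnodes / (double)maxvnodes : 0.0);
        return (0);
}

If vfs.numvnodes stays at or below the cap while Wired keeps growing, the
growth is coming from something other than vnode caching.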
On Thu, 14 Aug 2025 05:38:00 +0300
Konstantin Belousov <kostikbel@gmail.com> wrote:
[ … ]
I apologise for not just pasting the direct dumps, but I didn't think
multiple 246-line files would be appreciated.
All the raw data is at http://files.beldin.org/logs.
Unfortunately a power outage occurred here before I was able to reach the
memory-exhaustion level, so I will have to wait approximately another 2 days
to hit the problem again.
Darrin
On Wed, Aug 13, 2025 at 7:39 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
[ … ]
Hi Peter,
Peter Eriksson posted this to me a little while ago...
I wish I could upgrade our front-end servers from FreeBSD 13.5 btw -
but there is a very troublesome issue with ZFS on FreeBSD 14+ -
sometimes it runs amok and basically uses up all available RAM - and
then the system load goes thru the roof and the machine basically
grinds to a halt for _long_ periods - happens when we run our backup
rsync jobs.
https://github.com/openzfs/zfs/issues/17052
You referred to this, which is obviously Linux specific.
rick
On Thu, Aug 14, 2025 at 12:37 AM Darrin Smith <beldin@beldin.org> wrote:
[ … ]
So, in your case it looks like a NAMEI leak (buffers for file paths being
looked up).
A fix for a NAMEI leak was just committed to main.
Maybe you can update your kernel and see if this helps for
your problem?
rick
On Thu, 14 Aug 2025 09:16:27 -0700
Rick Macklem <rick.macklem@gmail.com> wrote:
[ … ]
Sounds like a possibility. I'll set the wheels in motion and see how
we go.
On 14 Aug 2025, at 06:35, Rick Macklem <rick.macklem@gmail.com> wrote:
[ … ]
I was seeing a problem just like this in the 14.3 betas during heavy workloads.
The problem was fixed at around BETA4, I believe by this commit. https://cgit.freebsd.org/src/commit/?h=releng/14.3&id=7a9ea03e4bbfee1b2192d9a5b4da89a53d3a2c14
Have you seen a similar problem on 14.3?
Regards,
Jan M.
On Fri, Aug 15, 2025 at 4:24 AM Jan Martin Mikkelsen <janm@transactionware.com> wrote:
. . .
The original report was for 14.3. Of course, they may have experienced something different than what you did.
(I tried to look at whatever it was on the discord channel, but got
nothing useful,
just some sort of "welcome to discord..." message.)
I do hope that Peter can try 14.3 and determine if he still sees the problem he reported.
Rick Macklem <rick.macklem_at_gmail.com> wrote on
Date: Sat, 16 Aug 2025 23:43:11 UTC :
[ … ]
Yea, discord is a need-to-login type of context. If
one is unlikely to want to establish such a login,
such discord URLs should likely be ignored. (I
originally established a login in order to get to
Solid Run's support for their HoneyComb [aarch64
based].)
As for the specific report, some of the information
was presented with images, making for a not great
fit for the mail-list. Thus the URL usage for those
that could readily use it.
So far, it has turned out that the message to the
mail-list mostly has prompted other leaks to be
found and fixed. Not bad for unintended
consequences.
===
Mark Millard
marklmi at yahoo.com
On Sat, Aug 16, 2025 at 9:22 PM Mark Millard <marklmi@yahoo.com> wrote:
[ … ]
I do have a discord login (for the NFSv4 bakeathons), but when I click on
what is in your original post, it just says something about downloading an
app and no text channels.
rick
Ahh, so, ignoring my URL copy/paste, go to the [ … ]
My normal desktop environment is macOS. In that environment, [ … ]