From Newsgroup: news.software.nntp
On 2025-12-28, Adam W. <aw@somewhere.invalid> wrote:
Hi,
Hello,
I'll share some thoughts based on my experience administering the NNTP
server I'm posting from.
I want to set up a newsserver (inn) with the archive of Polish Usenet.
I already have most of the archives downloaded (in some weird format that I'll convert, sort by date, and feed to inn). It's around 58 million articles (around 100 GB + overview). Seems manageable.
What would be the best spooling and overview method for this?
Right now I'm thinking about creating a file, formatting it as some filesystem (which filesystem? I use ext4 for my everyday needs, but maybe something else is better for this?), tuning its parameters, and using tradspool.
The file would be extended if needed, so the chosen filesystem has to have the capability to do it (ext4 can be extended if the underlying storage is extended, but I don't know about other filesystems).
We use ZFS and I think it has a number of advantages over ext4. Maybe
the largest one for this use case is that inodes are dynamically
allocated so you don't have a static limit like with ext4. Other
than that, you get nice features like snapshotting, scrubbing, easy
creation of datasets and transparent compression. Another good one
when dealing with millions of small files is zfs send/receive, which
lets you transfer the file system at the block level instead of
iterating over millions of files and sending them one by one. Granted,
that is not as much of an issue if you store everything in a file.
ZFS does support growing the file system, but not shrinking it.
In terms of tuning, you want to make sure the ashift setting matches
your underlying storage's sector size. That's usually 4k on modern
HDDs so ashift=12, i.e. 2^12 = 4096, even if they report a smaller
logical sector size. The record size setting is 128k by default and
probably doesn't need adjustment for your use-case since you do not
plan on rewriting data. It defines the max size for individual file
blocks and mostly affects sequential reading of large files and
rewrite performance. You could experiment with setting it to a lower
value if you want. It can be changed on-the-fly, but will only affect
newly written files.
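To illustrate the ashift arithmetic above: the value is simply the
base-2 logarithm of the sector size. A minimal Python sketch, where
ashift_for is a made-up helper name and not part of any ZFS tooling:

```python
def ashift_for(sector_size: int) -> int:
    """Return n such that 2**n == sector_size (hypothetical helper)."""
    # Sector sizes are powers of two; reject anything else.
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_size.bit_length() - 1


if __name__ == "__main__":
    for size in (512, 4096, 8192):
        print(f"{size}-byte sectors -> ashift={ashift_for(size)}")
```

Feed it the physical sector size of the drive, not the logical one it
may report.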
This rich feature-set does come with the penalty of worse IO/s
performance, but we store about 28 million articles in a tradspool on
spinning rust and it works.
Regardless of what file system you end up using, I would recommend
against using a file for this since you will impose double file system
overhead that will likely tank your IO/s performance.
Another option is XFS, which gives great IO/s performance, but I
haven't used it in a server setting so I can't vouch for what it's
like stability-wise. It also doesn't suffer the static inode problem
of ext4 and can be grown.
If it's ext4, I also have a couple of questions on how to best tune it. This is what I want to do. Is it a good idea?
1. Set the bytes per inode ratio to 1536 (very low, but it will give me 69 million inodes per 100 GB)
2. Set the block size to 1024 (not too low?)
3. Set the inode size to 128
4. Set uid16 to disable 32-bit UIDs
5. Disable large_file
6. Set dir_index
7. Set reserved blocks percentage to some low value (is 0% OK?)
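As a rough sanity check of the numbers in the list above, here is a
small Python sketch. It assumes "100 GB" means 100 GiB and that every
article needs at least one inode, and it ignores how mke2fs rounds
inode counts per block group, so treat the results as estimates. The
1800-byte article is a made-up example size, and the helper names are
hypothetical.

```python
GIB = 1024 ** 3


def inode_count(fs_bytes: int, bytes_per_inode: int) -> int:
    """Approximate number of inodes for a given bytes-per-inode ratio."""
    return fs_bytes // bytes_per_inode


def inode_table_bytes(inodes: int, inode_size: int) -> int:
    """Space taken by the inode tables themselves."""
    return inodes * inode_size


def allocated_bytes(file_size: int, block_size: int) -> int:
    """Disk space a file occupies once rounded up to whole blocks."""
    return -(-file_size // block_size) * block_size


if __name__ == "__main__":
    fs_bytes = 100 * GIB          # assumption: 100 GB read as 100 GiB
    articles = 58_000_000         # figure quoted in the post

    inodes = inode_count(fs_bytes, 1536)
    table = inode_table_bytes(inodes, 128)
    print(f"inodes available: {inodes:,} (articles: {articles:,})")
    print(f"inode tables: {table / GIB:.1f} GiB of {fs_bytes / GIB:.0f} GiB")

    example_article = 1800        # assumption: illustrative size only
    for block in (1024, 4096):
        used = allocated_bytes(example_article, block)
        print(f"block size {block}: a {example_article}-byte article uses {used} bytes")
```

The point of the second half is that a smaller block size wastes less
space per small file, which matters when nearly every file is a single
short article.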
Overview would be tradindexed, I think it will suffice.
We were running tradindexed and ran into pretty serious performance issues. Switching to ovsqlite fixed that problem.
CNFS would be better if there was a way to throttle the server when it's
about to rotate the buffer (I don't want to lose articles, ever; even if there's some massive flood that would overwhelm my storage, I want to add new buffers and unthrottle the server then), but is it even possible?
Plus in case of a flood I'd have a problem with deleting articles from
CNFS that I wouldn't have with tradspool...
Some idea would be to use timecaf, but:
1. It doesn't seem to be widely used, so it's also not very well tested.
Or is it? How stable is it?
2. Is there a way to rotate a .CF file when it's full (262144 articles), instead of relying on arrival time? I want to feed new articles as fast as I can.
3. Maybe there are some tools to initially write the .CF files directly, instead of letting inn handle it? Then I'd just have to build the rest (history, overview)
Unfortunately, I don't have a good answer to any of the questions above.
The server won't accept new articles from readers -- after the initial prefeeding from my archives there will be only a single feed from my main server.
Suggestions are welcome.