Forum: Too Lazy BBS

Best spool for an archive server

From aw@aw@somewhere.invalid (Adam W.) to news.software.nntp on Sun Dec 28 00:42:58 2025

From Newsgroup: news.software.nntp

Hi,

I want to set up a newsserver (inn) with the archive of Polish Usenet.
I already have most of archives downloaded (in some weird format I'll
convert, sort by date, and feed to inn), it's around 58 million articles (around 100 GB + overview). Seems manageable.

What would be the best spooling and overview method for this?

Right now I'm thinking about creating a file, formatting it as some
filesystem (which filesystem? I use ext4 for my everyday needs, but maybe something else is better for this?), tuning its parameters, and using tradspool.

The file would be extended if needed, so the chosen filesystem has to have
the capability to do it (ext4 can be extended if the underlying storage is extended, but I don't know about other filesystems).

If it's ext4, I also have a couple of questions on how to best tune it.
That's what I want to do -- is it a good idea?

1. Set the bytes per inode ratio to 1536 (very low, but it will give me 69 million inodes per 100 GB)
2. Set the block size to 1024 (not too low?)
3. Set the inode size to 128
4. Set uid16 to disable 32-bit UIDs
5. Disable large_file
6. Set dir_index
7. Set reserved blocks percentage to some low value (is 0% OK?)

Overview would be tradindexed, I think it will suffice.

CNFS would be better if there was a way to throttle a server when it's
about to rotate the buffer (I don't want to lose articles, ever, even if there's some massive flood that would overwhelm my storage, I want to add
new buffers and unthrotle server then), but is it even possible?

Plus in case of a flood I'd have a problem with deleting articles from
CNFS that I wouldn't have with tradspool...

Some idea would be to use timecaf, but:

1. It doesn't seem to be widely used, so it's also not very well tested.
Or is it? How stable it is?

2. Is there a way to rotate a .CF file when it's full (262144 articles), instead of relying on arrival time? I want to feed new articles as fast as
I can

3. Maybe there are some tools to initially write the .CF files directly, instead of letting inn handle it? Then I'd just have to build the rest (history, overview)

The server won't accept new articles from readers -- after the initial prefeeding from my archives there will be only a single feed from my main server.

Suggestions are welcome.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Andreas Kempe@kempe@lysator.liu.se to news.software.nntp on Wed Dec 31 17:22:36 2025

From Newsgroup: news.software.nntp

Den 2025-12-28 skrev Adam W. <aw@somewhere.invalid>:

Hi,

Hello,

I'll share some thoughts on my experience with administrating the NNTP
server I'm posting from.

I want to set up a newsserver (inn) with the archive of Polish Usenet.
I already have most of archives downloaded (in some weird format I'll convert, sort by date, and feed to inn), it's around 58 million articles (around 100 GB + overview). Seems manageable.

What would be the best spooling and overview method for this?

Right now I'm thinking about creating a file, formatting it as some filesystem (which filesystem? I use ext4 for my everyday needs, but maybe something else is better for this?), tuning its parameters, and using tradspool.

The file would be extended if needed, so the chosen filesystem has to have the capability to do it (ext4 can be extended if the underlying storage is extended, but I don't know about other filesystems).

We use ZFS and I think it has a number of advantages over ext4. Maybe
the largest one for this use case is that inodes are dynamically
allocated so you don't have a static limit like with ext4. Other
than that, you get nice features like snapshotting, scrubbing, easy
creation of datasets and transparent compression. Another good one
when dealing with millions of small files is zfs-{send,receive} that
allow you to transfer the file system on a block level, bypassing the
need of iterating over millions of files and sending them one by one,
granted not as much of an issue if you store everything in a file.

ZFS does support growing the file system, but not shrinking it.

In terms of tuning, you want to make sure the ashift setting matches
your underlying storage's sector size. That's usually 4k on modern
HDDs so ashift=12, i.e. 2^12 = 4096, even if they report a smaller
logical sector size. The record size setting is 128k by default and
probably doesn't need adjustment for your use-case since you do not
plan on rewriting data. It defines the max size for individual file
blocks and mostly affects sequential reading of large files and
rewrite performance. You could experiment with setting it to a lower
value if you want. It can be changed on-the-fly, but will only affect
newly written files.

This rich feature-set does come wit the penalty of worse IO/s
performance, but we store about 28 million articles in a tradspool on
spinning rust and it works.

Regardless of what file system you end up using, I would recommend
against using a file for this since you will impose double file system
overhead that will likely tank your IO/s performance.

Another option is XFS, which gives great IO/s performance, but I
haven't used it in a server setting so I can't vouch for what it's
like stability-wise. It also doesn't suffer the static inode problem
of ext4 and can be grown.

If it's ext4, I also have a couple of questions on how to best tune it. That's what I want to do -- is it a good idea?

1. Set the bytes per inode ratio to 1536 (very low, but it will give me 69 million inodes per 100 GB)
2. Set the block size to 1024 (not too low?)
3. Set the inode size to 128
4. Set uid16 to disable 32-bit UIDs
5. Disable large_file
6. Set dir_index
7. Set reserved blocks percentage to some low value (is 0% OK?)

Overview would be tradindexed, I think it will suffice.

We were running a tradindex and got pretty serious performance issues. Switching to ovsqlite fixed that problem.

CNFS would be better if there was a way to throttle a server when it's
about to rotate the buffer (I don't want to lose articles, ever, even if there's some massive flood that would overwhelm my storage, I want to add new buffers and unthrotle server then), but is it even possible?

Plus in case of a flood I'd have a problem with deleting articles from
CNFS that I wouldn't have with tradspool...

Some idea would be to use timecaf, but:

1. It doesn't seem to be widely used, so it's also not very well tested.
Or is it? How stable it is?

2. Is there a way to rotate a .CF file when it's full (262144 articles), instead of relying on arrival time? I want to feed new articles as fast as
I can

3. Maybe there are some tools to initially write the .CF files directly, instead of letting inn handle it? Then I'd just have to build the rest (history, overview)

Unfortunately I don't have good answer to any of the questions above.

The server won't accept new articles from readers -- after the initial prefeeding from my archives there will be only a single feed from my main server.

Suggestions are welcome.

--- Synchronet 3.21a-Linux NewsLink 1.2

From =?UTF-8?Q?Julien_=C3=89LIE?=@iulius@nom-de-mon-site.com.invalid to news.software.nntp on Sun Jan 4 21:34:43 2026

From Newsgroup: news.software.nntp

Hi Adam,

I want to set up a newsserver (inn) with the archive of Polish Usenet.
I already have most of archives downloaded (in some weird format I'll convert, sort by date, and feed to inn), it's around 58 million articles (around 100 GB + overview). Seems manageable.

Overview would be tradindexed, I think it will suffice.

I would use ovsqlite because it may perform a bit faster with millions
of articles in a single newsgroup.

CNFS would be better if there was a way to throttle a server when it's
about to rotate the buffer (I don't want to lose articles, ever, even if there's some massive flood that would overwhelm my storage, I want to add
new buffers and unthrotle server then), but is it even possible?

There is no such feature. Maybe a new keyword in cycbuff.conf could be worthwhile having for such a use, like "nowrap:<buffer>[,<buffer>,...]"
to list cyclic buffers that should not wrap. Unfortunately, it does not currently exist.

Some idea would be to use timecaf, but:

1. It doesn't seem to be widely used, so it's also not very well tested.
Or is it? How stable it is?

I have been using timecaf for more than a decade with a few hierarchies
and never noticed any issue.

2. Is there a way to rotate a .CF file when it's full (262144 articles), instead of relying on arrival time? I want to feed new articles as fast as
I can

Unfortunately no. This is the only drawback I see with this storage method.

3. Maybe there are some tools to initially write the .CF files directly, instead of letting inn handle it? Then I'd just have to build the rest (history, overview)

No tools exist for that.
--
Julien |eLIE

-2-aJe ne cherche pas |a conna|<tre les r|-ponses, je cherche |a comprendre
les questions.-a-+

--- Synchronet 3.21a-Linux NewsLink 1.2

From aw@aw@somewhere.invalid (Adam W.) to news.software.nntp on Wed Jan 7 00:57:11 2026

From Newsgroup: news.software.nntp

Julien +LIE <iulius@nom-de-mon-site.com.invalid> wrote:

I would use ovsqlite because it may perform a bit faster with millions
of articles in a single newsgroup.

Thanks, I'll use that.

Thanks for responses, Julien and Andreas. I'll be doing some experiments
with ZFS and ext4 on a smaller article set and we'll see how it works.
--- Synchronet 3.21a-Linux NewsLink 1.2

From =?UTF-8?Q?Julien_=C3=89LIE?=@iulius@nom-de-mon-site.com.invalid to news.software.nntp on Wed Jan 21 21:31:25 2026

From Newsgroup: news.software.nntp

Hi Adam,

I'll be doing some experiments
with ZFS and ext4 on a smaller article set and we'll see how it works.

Did you have the time to do the experiment? I believe the community
will be happy to hear what you found out.

FWIW, I've added your wish for a non-wrapping CNFS buffer in our feature tracker: https://github.com/InterNetNews/inn/issues/330
--
Julien |eLIE

-2-aCampagne |-lectorale-a: c'est l'art de gagner les voix des pauvres avec
l'argent des riches en promettant |a chacun des deux de les prot|-ger
contre l'autre.-a-+ (Oscar Ameringer)

--- Synchronet 3.21a-Linux NewsLink 1.2

From aw@aw@somewhere.invalid (Adam W.) to news.software.nntp on Tue Jan 27 14:35:57 2026

From Newsgroup: news.software.nntp

Julien +LIE <iulius@nom-de-mon-site.com.invalid> wrote:

I'll be doing some experiments
with ZFS and ext4 on a smaller article set and we'll see how it works.

Did you have the time to do the experiment? I believe the community
will be happy to hear what you found out.

No, not with ZFS. I created ext4 buffer, but I didn't get to the stage of actually injecting articles yet (I'm keeping them in a large file with a simple, custom format now, and I'm still trying to source some archives to merge into it). It's a side project that's been postponed now due to other things... But I didn't forget about it.

FWIW, I've added your wish for a non-wrapping CNFS buffer in our feature tracker: https://github.com/InterNetNews/inn/issues/330

It would be great to see it in the future. Thanks :) It would solve the problem for archiving servers. Even if I'll probably go with tradspool due
to its ability to reclaim space used by deleted articles (I expect to find some spam in the archives, or someone might find it later and report).
--- Synchronet 3.21b-Linux NewsLink 1.2

From aw@aw@somewhere.invalid (Adam W.) to news.software.nntp on Mon Apr 13 14:03:19 2026

From Newsgroup: news.software.nntp

Adam W. <aw@somewhere.invalid> wrote:

No, not with ZFS. I created ext4 buffer, but I didn't get to the stage of actually injecting articles yet (I'm keeping them in a large file with a simple, custom format now, and I'm still trying to source some archives to merge into it). It's a side project that's been postponed now due to other things... But I didn't forget about it.

Some time has passed, and the server is online now
(news-archive.chmurka.net, port 121 -- port is important, as the IP is currently shared with news.chmurka.net, so there's a normal news server on port 119).

I tried with tradspool first, but after several millions of articles I
started getting some weird errors regarding symlinking files ("could not create symlink", I don't remember the issue exactly now). I switched to timehash and it seems to have fixed the issue.

It's just ext4 made with:

mkfs.ext4 -b 1024 -i 1024 -I 128 -m 0 -O dir_index,large_dir /dev/nbd0

[newsarch@fna ~]$ df -h /mnt/fna
Filesystem Size Used Avail Use% Mounted on
/dev/nbd0 171G 120G 52G 70% /mnt/fna

[newsarch@fna ~]$ df -ih /mnt/fna
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/nbd0 196M 38M 158M 20% /mnt/fna

Perhaps I could've used a larger inode ratio, but it is how it is now.

Seems to work fine.
--- Synchronet 3.21f-Linux NewsLink 1.2

Who's Online

System Info

Sysop:	Amessyroom
Location:	Fayetteville, NC
Users:	70
Nodes:	6 (0 / 6)
Uptime:	39:53:24
Calls:	948
Files:	1,325
Messages:	280,770

Best spool for an archive server

Who's Online

System Info