• inn scalability improvements

    From Kevin Bowling@kevin.bowling@kev009.com to news.software.nntp on Sat May 9 16:26:11 2026
    From Newsgroup: news.software.nntp

    Hi,

    I've been looking at inn as I run it: tradspool/ovsqlite on FreeBSD+ZFS
    with unlimited retention and trying to fix some scaling issues.

    ovsqlite: Add direct reader mode for nnrpd with WAL https://github.com/InterNetNews/inn/pull/338

    expireover: Add bloom filter for fast history existence checks https://github.com/InterNetNews/inn/pull/339

    expire: skip per-article SMretrieve via cancel tombstone log https://github.com/InterNetNews/inn/pull/340

    The first two have been running on csiph.com for a few days without
    issue. I haven't deployed the last one yet but will publish results
    in the PR when I do.

    The general theme of all three is to reduce random or sync I/O. In the
    case of expireover and expire, the general problem is somewhat similar
    but the root cause and fix are fairly different.

    expireover checks history for every article, and this turns into very
    slow, serial random I/O across the history DB. The new option is a
    bloom filter with a configurable false positive rate, which the admin
    chooses as an acceptable RAM vs accuracy tradeoff for their spool
    size. The bloom filter is built with a much more efficient streaming
    read of the entire history file. In the case of a false positive
    (bloom claims the article is in history, but expire removed it), the
    entry gets cleaned up on the next expireover run (entropy introduced by
    hash churn makes this a probabilistic reality). On my system this is a
    1100x improvement, 2 weeks down to 18 minutes.
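
    The bloom filter idea described above can be sketched roughly as
    follows. This is a minimal illustration, not the code from the PR: the
    class and function names are hypothetical, the sizing uses the textbook
    formulas, and the history line layout (key in the first tab-separated
    field) is an assumption made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Minimal bloom filter sized from an item count and a target false positive rate."""

    def __init__(self, expected_items: int, fp_rate: float) -> None:
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        n = max(1, expected_items)
        self.num_bits = max(8, math.ceil(-n * math.log(fp_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(key))


def build_from_history(lines, expected_items: int, fp_rate: float) -> BloomFilter:
    """Stream the history lines once, adding the first tab-separated field of each."""
    bloom = BloomFilter(expected_items, fp_rate)
    for line in lines:
        bloom.add(line.split("\t", 1)[0].encode())
    return bloom
```

    With these formulas a 1% target costs roughly 9.6 bits per history
    entry, which is the RAM vs accuracy knob the admin would be turning.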

    expire is a bit more involved. We don't want the same bloom tradeoff
    here: exactness costs a bit of RAM (pending deletes) but totally
    eliminates unnecessary disk I/O for verification. If we keep a
    tombstone log of the articles expireover deletes, plus any cancels,
    and we trust the state of the spool, we can eliminate all speculative
    work and only deal with the actual history cleanup needed. On my
    system this will net a 2800x improvement, 2 days down to under a
    minute.

    With the expire tombstone, I can make 'nnrpdcheckart' an extremely
    efficient hash lookup, so OVER and friends don't need to check every
    single article's existence and only clean any pending tombstone entries.
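
    A rough sketch of the tombstone idea, under stated assumptions: the
    log is modelled as a plain text file with one storage token per line,
    and history as an in-memory dict. Every name here is hypothetical and
    none of this mirrors the actual on-disk formats or the code in the PR.

```python
import os


def append_tombstone(log_path: str, token: str) -> None:
    """Append one removed article's storage token to the log."""
    with open(log_path, "a", encoding="ascii") as log:
        log.write(token + "\n")


def load_tombstones(log_path: str) -> set:
    """Read the log sequentially into an exact in-memory set."""
    if not os.path.exists(log_path):
        return set()
    with open(log_path, encoding="ascii") as log:
        return {line.strip() for line in log if line.strip()}


def expire_history(history: dict, tombstones: set) -> dict:
    """Keep every history entry whose token is not tombstoned; no storage lookups."""
    return {msgid: tok for msgid, tok in history.items() if tok not in tombstones}


def article_available(token: str, tombstones: set) -> bool:
    """Reader-side check: one hash lookup instead of retrieving the article."""
    return token not in tombstones
```

    Unlike the bloom filter, the set is exact, so the only cost is the
    memory held by the pending deletes.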

    Both of these suppose integrity of the spool; if you 'rm' articles
    instead of using 'sm -r' you'll need to do a patrol expire to clean up
    your spool; expireover builds from history so it would see the new
    history state the next time it runs. The first time expire runs it
    won't have a tombstone, so it will 'upgrade' to a known state, cleaning
    up any issues. A gap can only appear if you manually botch the spool
    after that.

    You'd avoid the worst of some of this with CNFS, but the unlimited
    retention case gets a little awkward, and tradspool should scale to a
    text feed with over a billion articles with these changes on modern
    hardware.

    The ovsqlite direct reader mode allows each nnrpd to open the sqlite DB
    directly, in read-only mode. This eliminates costly IPC and exclusive
    locking to the ovsqlite-server, which is now only concerned with innd.
    The WAL file also helps with writes, as they turn into streaming I/O and
    the main DB does not see the same level of synchronous write load.
    This is conceptually similar to how BDB used to work.
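
    The general pattern (one writer in WAL mode, many read-only readers on
    the same file) can be illustrated with Python's sqlite3 module. This is
    only a sketch of the SQLite mechanism; the function names are
    hypothetical and it says nothing about ovsqlite's real schema or code.

```python
import pathlib
import sqlite3


def open_writer(path: str) -> sqlite3.Connection:
    """Open the database for writing and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"could not enable WAL, journal mode is {mode}")
    return conn


def open_reader(path: str) -> sqlite3.Connection:
    """Open the same database read-only through a file: URI."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

    A reader opened this way can run SELECTs while the writer keeps
    committing, and any attempt to write through it fails with
    sqlite3.OperationalError.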

    I have these gated on the FreeBSD news/inn-current port behind an
    EXPERIMENTAL flag (expire tombstones will come once I'm satisfied).

    I'd be curious to hear any feedback or reviews from other inn admins
    that can test the patch set.

    Regards,
    Kevin
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From Julien ÉLIE@iulius@nom-de-mon-site.com.invalid to news.software.nntp on Sun May 10 10:38:22 2026
    From Newsgroup: news.software.nntp

    Hi Kevin,

    ovsqlite: Add direct reader mode for nnrpd with WAL
    expireover: Add bloom filter for fast history existence checks
    expire: skip per-article SMretrieve via cancel tombstone log

    The general theme of all three is to reduce random or sync I/O. In the
    case of expireover and expire, the general problem is somewhat similar
    but the root cause and fix are fairly different.

    Many thanks for these very useful improvements to expiration and
    ovsqlite concurrent reading.
    expire and expireover run way faster (x1100 for expireover and x2800
    for expire on your system!!) thanks to your optimizations. Some people
    have been complaining about their slowness for a long time, and
    fortunately you managed to find out how to improve that a lot and
    contributed a patch. It will be integrated into the next release.


    On my system this is a 1100x improvement, 2 weeks down to 18 minutes.

    Gosh!


    On my system this will net a 2800x improvement, 2 days down to sub minute.

    Re-gosh!


    Both of these suppose integrity of the spool; if you 'rm' articles
    instead of using 'sm -r' you'll need to do a patrol expire to clean up
    your spool; expireover builds from history so it would see the new
    history state the next time it runs. The first time expire runs it
    won't have a tombstone, so it will 'upgrade' to a known state, cleaning
    up any issues. A gap can only appear if you manually botch the spool
    after that.

    Worth mentioning.


    I'd be curious to hear any feedback or reviews from other inn admins
    that can test the patch set.

    If anyone could test the provided patches, please do not hesitate to do
    so, in order to catch any nits before the next release (in June).
    --
    Julien ÉLIE

    « Étre øu ne pås étre, telle est lå questiøn… » (Kerøzen)

  • From Kevin Bowling@kevin.bowling@kev009.com to news.software.nntp on Sun May 10 17:59:50 2026
    From Newsgroup: news.software.nntp

    On 5/10/26 01:38, Julien ÉLIE wrote:
    Hi Kevin,

    ovsqlite: Add direct reader mode for nnrpd with WAL
    expireover: Add bloom filter for fast history existence checks
    expire: skip per-article SMretrieve via cancel tombstone log

    The general theme of all three is to reduce random or sync I/O. In
    the case of expireover and expire, the general problem is somewhat
    similar but the root cause and fix are fairly different.

    Many thanks for these very useful improvements to expiration and
    ovsqlite concurrent reading.
    expire and expireover run way faster (x1100 for expireover and x2800
    for expire on your system!!) thanks to your optimizations. Some people
    have been complaining about their slowness for a long time, and
    fortunately you managed to find out how to improve that a lot and
    contributed a patch. It will be integrated into the next release.


    On my system this is a 1100x improvement, 2 weeks down to 18 minutes.

    Gosh!


    On my system this will net a 2800x improvement, 2 days down to sub
    minute.

    Re-gosh!


    Both of these suppose integrity of the spool; if you 'rm' articles
    instead of using 'sm -r' you'll need to do a patrol expire to clean up
    your spool; expireover builds from history so it would see the new
    history state the next time it runs. The first time expire runs it
    won't have a tombstone, so it will 'upgrade' to a known state, cleaning
    up any issues. A gap can only appear if you manually botch the spool
    after that.

    Worth mentioning.


    I'd be curious to hear any feedback or reviews from other inn admins
    that can test the patch set.

    If anyone could test the provided patches, please do not hesitate to do
    so, in order to catch any nits before the next release (in June).


    I'll pause to save some energy to address any feedback and fix any
    problems that arise.

    And these fix my most notable issues for csiph.com. But I've wondered a
    bit about what might be next. It seems like some sites use Diablo
    primarily for transit. And commercial sites might use Tornado or some
    home grown system (maybe we should do a NNTP Server and Capabilities
    survey of top1000?).

    I'm unfamiliar with Diablo but took a brief tour of the code to see what
    it does. It's (also :)) archaic but would have been advanced at the
    time it was conceived: select() and aio, more efficient feeders, DNS,
    readers, clever file formats that are mostly mmapped. aio is not a great
    fit for Linux, since glibc implements POSIX aio with user-space threads.
    There's nothing in particular that would be a good model or forklift for
    inn because network programming changed a lot once kqueue and epoll
    entered a couple years later.

    nnrpd would be somewhat easy to make more efficient across a few
    dimensions. The least code churn might be integrating poll() directly
    in the per process forks and adding some eventing to deal with NNTP
    pipelining and I/O and DNS. The main benefit would be building up some
    I/O parallelism, all storage devices benefit from that and you smooth
    out the latency of hot vs cold I/Os through the page cache (or
    ARC/L2ARC) and actual storage devices. Data movement could optionally
    use sendfile and even KTLS sendfile on Linux and FreeBSD.

    A step further would be to actually multiplex clients. Instead of
    forking per client, one or more nnrpds multiplex clients across an event
    loop, usually kqueue/epoll/devpoll but with a fallback to poll for wide
    compatibility. The decision tree would be somewhat influenced by which
    set of OSes modern inn still compiles on. Did any cleanup eject
    1990s-era unix?
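
    As a toy illustration of multiplexing several clients in one process,
    Python's selectors module picks the best available mechanism (epoll,
    kqueue, devpoll, then poll or select), which matches the fallback chain
    described above. The handler and function names are hypothetical, and
    writes are done with a blocking sendall for brevity; a real server
    would also buffer output and watch for write readiness.

```python
import selectors


def handle_command(line: bytes) -> bytes:
    """Toy command handler: only QUIT is recognised."""
    if line.strip().upper() == b"QUIT":
        return b"205 closing connection\r\n"
    return b"500 unknown command\r\n"


def serve_once(sel: selectors.BaseSelector, buffers: dict) -> None:
    """Run one event-loop iteration over every registered client socket."""
    for key, _events in sel.select(timeout=1.0):
        conn = key.fileobj
        data = conn.recv(4096)
        if not data:
            # Peer closed the connection: forget it.
            sel.unregister(conn)
            conn.close()
            buffers.pop(conn, None)
            continue
        pending = buffers.get(conn, b"") + data
        # Pipelined clients may deliver several complete lines in one read.
        while b"\r\n" in pending:
            line, pending = pending.split(b"\r\n", 1)
            conn.sendall(handle_command(line))
        buffers[conn] = pending
```

    Each client socket is registered with
    sel.register(conn, selectors.EVENT_READ), and serve_once is called in a
    loop; incomplete lines stay in the per-connection buffer until the rest
    arrives.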

    Similar opportunities are likely present in innd/innfeed but I am not
    familiar with this code so far.

    File formats would be another consideration, but I am not familiar
    enough to evaluate them right now, and the network and disk I/O stuff
    interests me more.
  • From Julien ÉLIE@iulius@nom-de-mon-site.com.invalid to news.software.nntp on Mon May 11 16:57:28 2026
    From Newsgroup: news.software.nntp

    Hi Kevin,

    I've wondered a bit about what might be next.

    Thanks for asking and being motivated to enhance and improve INN!


    network programming changed a lot once kqueue and epoll entered a couple years later.

    Yes, it would be worth using libevent with INN. It would make it
    possible to fix the long-standing bug that innd does not honour DNS
    TTLs (https://github.com/InterNetNews/inn/issues/89), as libevent
    provides asynchronous non-blocking DNS resolution.
    Also, implementing TLS and zlib (COMPRESS) support in innd and innfeed
    would be very useful. People often ask for TLS (for privacy reasons)
    and find it complicated to correctly make stunnel or tcpwrappers work
    with innd.


    nnrpd would be somewhat easy to make more efficient across a few
    dimensions. The least code churn might be integrating poll() directly
    in the per process forks and adding some eventing to deal with NNTP
    pipelining and I/O and DNS. The main benefit would be building up some
    I/O parallelism, all storage devices benefit from that and you smooth
    out the latency of hot vs cold I/Os through the page cache (or
    ARC/L2ARC) and actual storage devices.

    I agree.


    Similar opportunities are likely present in innd/innfeed but I am not
    familiar with this code so far.

    Sure, there are also improvements to make there!


    File formats would be another consideration, but I am not familiar
    enough to evaluate them right now, and the network and disk I/O stuff
    interests me more.

    You can find some tickets about storage and file formats in the GitHub
    tracker. The most "urgent" would be the one about the Y2038 issue for
    timecaf:
    timecaf disk format uses time_t (Y2038 issue for 64-bit time_t
    transition on 32-bit archs)
    https://github.com/InterNetNews/inn/issues/292

    As far as I know, the rest of INN is not affected by the Y2038 issue
    (though I am not 100% sure; at least, I did not find other suspicious
    uses of time_t).
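
    For readers unfamiliar with the Y2038 problem, the arithmetic behind it
    is short. This is a generic illustration of a signed 32-bit seconds
    counter, not a description of timecaf's actual on-disk layout; the
    helper names are hypothetical.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1


def last_32bit_time() -> datetime:
    """Latest UTC instant a signed 32-bit count of seconds since 1970 can hold."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def wrap_int32(seconds: int) -> int:
    """Value a signed 32-bit field ends up holding after storing `seconds`."""
    return (seconds + 2**31) % 2**32 - 2**31


print(last_32bit_time().isoformat())  # 2038-01-19T03:14:07+00:00
print(wrap_int32(INT32_MAX + 1))      # -2147483648
```

    One second past that instant the stored value wraps to a large negative
    number, which reads back as a date in 1901.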


    Another useful thing to work on is making it easier to keep both the
    active file and the newsgroups descriptions synchronized and up to date.
    It is somewhat of a pain for newcomers to understand what to do and how
    to achieve it, and naturally also how to keep things up to date in the
    long term. We have Perl utilities to clean and merge newsgroups files
    in https://github.com/InterNetNews/inn/issues/39 but they do not cover
    all the cases news admins usually want. Ideally, actsync would take
    care of everything, or we would have a wrapper which calls several
    tools.
    More generally, anything that eases the installation of INN for
    newcomers would be helpful.

    Maybe revisiting the paths where all INN stuff is installed could
    interest you as a package maintainer :)
    Support installation in FHS paths
    https://github.com/InterNetNews/inn/issues/43

    Another suggestion as you seem to like expiration/orphan stuff:
    tradspool and timehash articles not deleted after rmgroup
    https://github.com/InterNetNews/inn/issues/61


    You will also find some interesting reworks Russ had in mind in the TODO
    file at the root of the INN tree.
    Naturally, feel free to pick up anything you are interested in and would
    have fun working on.
    --
    Julien ÉLIE

    « Ce vieux forban d'Asthmatix, il ne manquait pas d'air ! » (Astérix)

  • From InterLinked@nntp@phreaknet.org to news.software.nntp on Tue May 12 22:46:01 2026
    From Newsgroup: news.software.nntp

    On 5/10/2026 8:59 PM, Kevin Bowling wrote:
    Data movement could optionally use sendfile and even KTLS sendfile on Linux and FreeBSD.

    So, this gave me an interesting thought earlier this evening, as I
    actually do use sendfile() a lot (it's not very portable, so I have
    wrappers around sendfile, copy_file_range, splice, and other fun functions).

    I was reading the requirement for :bytes earlier today and noted that it specifically does not include dot-stuffing characters, and I recalled
    then that in SMTP and NNTP, I strip out any dot-stuffed lines as I
    thought was convention. However, when *sending* the articles to a
    client, I have just been using ~sendfile, for efficiency. I realize now
    this is illegal, since I'm not adding the dot-stuffing back.

    Taking a look at INN, I noticed it counts the number of dot-stuffed
    lines and subtracts it from the value it provides for :bytes, which is
    what the RFC says to do. But it looks like INN doesn't remove the
    dot-stuffing when receiving an article. This actually seems like the
    more elegant way to do things; if you never remove the dot-stuffing, you
    don't need to add it back, and then you can do efficient things like use sendfile rather than going line by line. So in my case, I think I don't
    need to care about receiving dot-stuffed lines since I can store them in
    the spool and just have it be transparent.
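
    To make the :bytes bookkeeping concrete, here is a small sketch. It
    assumes articles use CRLF line endings and that the spool copy keeps
    the dot-stuffing but not the terminating ".CRLF"; both function names
    are hypothetical and this is not INN's code.

```python
def dot_stuff(article: bytes) -> bytes:
    """Wire-encode a CRLF article: double any leading '.' and append the terminator."""
    lines = article.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final CRLF
    stuffed = [b"." + ln if ln.startswith(b".") else ln for ln in lines]
    return b"\r\n".join(stuffed) + b"\r\n.\r\n"


def bytes_metadata(stored_stuffed: bytes) -> int:
    """Compute :bytes from a stored copy that still carries its dot-stuffing."""
    # In stuffed form, every line that starts with '.' carries exactly one extra dot.
    extra_dots = sum(1 for ln in stored_stuffed.split(b"\r\n") if ln.startswith(b"."))
    return len(stored_stuffed) - extra_dots
```

    Storing the stuffed form means the body can be sent as-is, and the
    :bytes value is just the stored length minus the number of stuffed
    lines.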

    On the other hand, this wouldn't work for SMTP, since with
    BDAT/CHUNKING, you may not need to do dot-stuffing.

    I guess what I'm trying to say is that for SMTP, because there are
    multiple ways to send/receive messages, it makes sense to store the
    messages in the "queue" normalized (without dot-stuffing), but since
    there's only one way in NNTP, it makes more sense to just store the
    stuffed lines in the spool directly. Is this about the gist of it?

    (And implicitly, I'm assuming that in the future, it would be hard to
    support other extensions like SMTP has since I would assume most news
    servers are storing the dots in the spool; of course, such extensions are
    probably of little value to NNTP considering most articles are small).
  • From Kevin Bowling@kevin.bowling@kev009.com to news.software.nntp on Tue May 12 20:30:14 2026
    From Newsgroup: news.software.nntp

    On 5/12/26 19:46, InterLinked wrote:
    On 5/10/2026 8:59 PM, Kevin Bowling wrote:
    Data movement could optionally use sendfile and even KTLS sendfile on
    Linux and FreeBSD.

    So, this gave me an interesting thought earlier this evening, as I
    actually do use sendfile() a lot (it's not very portable, so I have
    wrappers around sendfile, copy_file_range, splice, and other fun
    functions).

    I was reading the requirement for :bytes earlier today and noted that it specifically does not include dot-stuffing characters, and I recalled
    then that in SMTP and NNTP, I strip out any dot-stuffed lines as I
    thought was convention. However, when *sending* the articles to a
    client, I have just been using ~sendfile, for efficiency. I realize now
    this is illegal, since I'm not adding the dot-stuffing back.

    Taking a look at INN, I noticed it counts the number of dot-stuffed
    lines and subtracts it from the value it provides for :bytes, which is
    what the RFC says to do. But it looks like INN doesn't remove the
    dot-stuffing when receiving an article. This actually seems like the
    more elegant way to do things; if you never remove the dot-stuffing, you
    don't need to add it back, and then you can do efficient things like use
    sendfile rather than going line by line. So in my case, I think I don't
    need to care about receiving dot-stuffed lines since I can store them in
    the spool and just have it be transparent.

    On the other hand, this wouldn't work for SMTP, since with
    BDAT/CHUNKING, you may not need to do dot-stuffing.

    I guess what I'm trying to say is that for SMTP, because there are
    multiple ways to send/receive messages, it makes sense to store the
    messages in the "queue" normalized (without dot-stuffing), but since
    there's only one way in NNTP, it makes more sense to just store the
    stuffed lines in the spool directly. Is this about the gist of it?

    (And implicitly, I'm assuming that in the future, it would be hard to
    support other extensions like SMTP has since I would assume most news
    servers are storing the dots in the spool; of course, such extensions are
    probably of little value to NNTP considering most articles are small).

    Can you post some example transactions showing your thoughts?

    One thing that comes to mind is to synthesize the header(s) and then
    sendfile the body starting at its offset. The synthesized part could be
    stack allocated. Likewise any trailer if 'wireformat' is off.
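
    A minimal sketch of that idea, assuming the stored article is already
    in wire format so the body can go out untouched. Python's
    socket.sendfile() uses os.sendfile() where the platform supports it and
    falls back to ordinary send() otherwise; the function name and
    parameters below are hypothetical.

```python
import socket


def send_article(sock: socket.socket, spool_file, body_offset: int, headers: bytes) -> int:
    """Send synthesized headers from memory, then the stored body straight from the file."""
    sock.sendall(headers)
    # offset skips the stored headers; the kernel copies the rest where possible.
    sent = sock.sendfile(spool_file, offset=body_offset)
    return len(headers) + sent
```

    Here body_offset is the position just past the blank line that
    separates the stored headers from the body, and headers must end with
    that blank line itself.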
  • From InterLinked@nntp@phreaknet.org to news.software.nntp on Tue May 12 23:59:40 2026
    From Newsgroup: news.software.nntp

    On 5/12/2026 11:30 PM, Kevin Bowling wrote:
    On 5/12/26 19:46, InterLinked wrote:
    I guess what I'm trying to say is that for SMTP, because there are
    multiple ways to send/receive messages, it makes sense to store the
    messages in the "queue" normalized (without dot-stuffing), but since
    there's only one way in NNTP, it makes more sense to just store the
    stuffed lines in the spool directly. Is this about the gist of it?

    (And implicitly, I'm assuming that in the future, it would be hard to
    support other extensions like SMTP has since I would assume most news
    servers are storing the dots in the spool; of course, such extensions
    are probably of little value to NNTP considering most articles are
    small).

    Can you post some example transactions showing your thoughts?

    One thing that comes to mind is to synthesize the header(s) and then
    sendfile the body starting at its offset. The synthesized part could be
    stack allocated. Likewise any trailer if 'wireformat' is off.

    That is similar to what I'm talking about, though I think wireformat in
    INN is just line endings (whether to store the CR or not). I was talking
    about storing the leading dot in dot-stuffed lines in the spool (same
    idea really). I believe that INN does this unconditionally without any
    way to disable it... because why would you want to? Unlike SMTP, it does
    not appear to me there are any use cases in NNTP (currently) where it
    would be useful to normalize them out of the message.

    In SMTP, you may or may not need to dot-stuff depending on how the
    message gets sent, so there, it seems logical to store messages with the
    dot removed.

    But suppose NNTP ever had an extension like CHUNKING with a BDAT
    command. (I don't really see such a thing being useful, outside of
    binary groups maybe, but let's suppose.) Then there could be cases where
    you wouldn't send articles dot-stuffed, but that would vary by
    transmission and in that case, maybe it would make sense to normalize
    the message, so you could frame it in different ways. And yeah, you
    could use sendfile for parts of the message, but not all of it and there
    would be more parsing involved.

    There are a lot of potentially useful extensions in SMTP and IMAP that
    could be useful in NNTP, but admittedly I don't see this being a likely scenario.
  • From Russ Allbery@eagle@eyrie.org to news.software.nntp on Tue May 12 21:50:49 2026
    From Newsgroup: news.software.nntp

    InterLinked <nntp@phreaknet.org> writes:

    That is similar to what I'm talking about, though I think wireformat in
    INN is just line endings (whether to store the CR or not). I was talking
    about storing the leading dot in dot-stuffed lines in the spool (same
    idea really).

    wireformat controls both in INN. If you set wireformat to false with
    tradspool, INN will remove the dot stuffing when storing articles as well
    as changing the line endings.

    I believe that INN does this unconditionally without any way to disable
    it... because why would you want to?

    For compatibility with external programs that want to look at the articles
    in a tradspool spool directly without using INN's storage library. Back in
    the day, there used to be some of those. Less common now!
    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.