From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to alt.html on Mon Oct 28 21:41:05 2024
From Newsgroup: alt.html
I'm accessing some public Wiki information (~2100 Wiki pages).
Due to the extreme time and space demands I started to extract
that information (from the original MD files) to either create
a huge text file or to generate an HTML file.[*]
My post's intention is to understand whether the time and space
bloat that I observed with that Wiki data is typical or just an
effect of the underlying tool used.[**]
For example, a typical MD file has
10 header lines and 27 information lines (including links)
in 48 (non-empty) lines and requiring 2'354 bytes.
From this MD file they create HTML information with
56'551 lines(!) requiring 3'744'427 bytes(!)
and this file also loads 63(!) JS files with another
4'104'887 bytes(!) of storage requirements.
So the net storage demand for a *single* HTML page is about 8 MB,
and there are also the runtime costs of the JS code to consider. All
that for 48 lines of information! - Is that typical for Wikis? -
And *every* click on some link adds to those storage/time demands.
(I'm regularly astonished how badly software is written nowadays,
but these numbers appear to me to be beyond all hope.)
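To put the figures above in proportion, here is a minimal sketch of the
arithmetic in Python. The byte and line counts are the ones quoted above;
the function name `bloat_summary` and the projection over all ~2100 pages
are illustrative assumptions (the projection assumes every page resembles
this one and that no JS file is shared or cached between pages, which is
probably pessimistic).

```python
# Figures quoted above for one typical Wiki page.
MD_BYTES = 2_354        # size of the source MD file
MD_LINES = 48           # non-empty lines in the MD file
HTML_BYTES = 3_744_427  # size of the generated HTML file
HTML_LINES = 56_551     # lines in the generated HTML file
JS_BYTES = 4_104_887    # total size of the 63 JS files it loads
PAGES = 2_100           # approximate number of Wiki pages


def bloat_summary():
    """Return size ratios between the generated page and its MD source."""
    total = HTML_BYTES + JS_BYTES
    return {
        "total_bytes": total,
        "total_mb": total / 1_000_000,
        "byte_ratio": total / MD_BYTES,
        "html_only_ratio": HTML_BYTES / MD_BYTES,
        "line_ratio": HTML_LINES / MD_LINES,
        # Assumption: all pages look like this one and nothing is cached.
        "all_pages_gb": total * PAGES / 1_000_000_000,
    }


if __name__ == "__main__":
    for name, value in bloat_summary().items():
        print(f"{name:16} {value:,.1f}")
```

With these inputs the HTML plus JS comes to 7'849'314 bytes, i.e. more
than 3000 times the size of the MD source.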
BTW, I'd also be interested in hints on whether there are free tools
to do such an MD-files -> HTML-file (or -> PDF-file) conversion so
that I don't need to (unnecessarily) re-invent the wheel.
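One free candidate for such a conversion is pandoc, which can merge
several Markdown inputs into one standalone HTML file. The following is
only a sketch in Python of how it might be driven: the helper names
`pandoc_command` and `convert`, the directory layout, the title and the
choice of `--from markdown` (rather than a Wiki-specific dialect) are all
assumptions, not something known about the Wiki in question.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(sources, output, title="Wiki export"):
    """Build a pandoc command line merging the given MD files into one file."""
    return [
        "pandoc",
        "--standalone",            # emit a complete document, not a fragment
        "--toc",                   # add a table of contents
        "--from", "markdown",      # assumption: plain pandoc Markdown input
        "--metadata", f"title={title}",
        "--output", str(output),
        *sorted(str(s) for s in sources),  # deterministic page order
    ]


def convert(md_dir, output):
    """Convert every *.md file in md_dir (hypothetical layout) in one run."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    sources = Path(md_dir).glob("*.md")
    subprocess.run(pandoc_command(sources, output), check=True)


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever the MD files actually live.
    convert("wiki-md", "wiki.html")
```

Passing an output name ending in `.pdf` makes pandoc produce PDF instead,
but that additionally needs a PDF engine (by default a LaTeX installation).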
Janis
[*] I wish I didn't have to do that, though; I had hoped
the Wiki authors would have a way to provide some PDF or an
all-in-one-page HTML with the tool they're using to create the
HTML structure. Alas, they (or the tool) seem to not be able to
provide that.
[**] I have no information about that tool (and I anyway don't
intend any "tool blaming" with my post, so that information is
of minor importance to me).