• Fastest way to run two external processes

    From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Wed Apr 29 07:38:23 2026
    From Newsgroup: comp.lang.tcl

    I need to run two external processes (on Linux):

    pdftotext -tsv one.pdf
    pdftotext -tsv two.pdf

    For each one I need to acquire the output and post-process it.
    Both are completely independent.
    (However, once I've finished post-processing I then do some work on
    both sets of post-processed data together.)

    Each external process takes about 3 secs so it takes just over 6 secs
    to acquire the data from both processes.

    When I've done something similar in Python I've used the multiprocessing
    module and this has got my runtime close to the 3 secs.

    In my experiments with Tcl's threading I've found the threading startup overhead to be rather large.

    What is the fastest way to run two independent processes concurrently
    and acquire their outputs using Tcl?
    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Wed Apr 29 08:51:17 2026
    From Newsgroup: comp.lang.tcl

    On Wed, 29 Apr 2026 07:38:23 -0000 (UTC), Mark Summerfield wrote:

    > I need to run two external processes (on Linux):
    >
    > pdftotext -tsv one.pdf
    > pdftotext -tsv two.pdf
    >
    > For each one I need to acquire the output and post-process it.
    > Both are completely independent.
    > (However, once I've finished post-processing I then do some work on
    > both sets of post-processed data together.)
    >
    > Each external process takes about 3 secs so it takes just over 6 secs
    > to acquire the data from both processes.
    >
    > When I've done something similar in Python I've used the multiprocessing module and this has got my runtime close to the 3 secs.
    >
    > In my experiments with Tcl's threading I've found the threading startup overhead to be rather large.
    >
    > What is the fastest way to run two independent processes concurrently
    > and acquire their outputs using Tcl?

    Here's my serial version:

    proc app::serial {pdftotext pdf1 pdf2} {
        puts serial
        set pdf1tsv [exec $pdftotext -tsv $pdf1 -]
        set pdf2tsv [exec $pdftotext -tsv $pdf2 -]
        list $pdf1tsv $pdf2tsv
    }

    This takes ~2 sec for two ~650 page PDFs.

    With some help from Gemini (after I got past non-working and slow
    solutions) I did a multiprocessing version:

    proc app::multiprocess {pdftotext pdf1 pdf2} {
        set p1 [open "|$pdftotext -tsv $pdf1 - 2>@1" r]
        try {
            set p2 [open "|$pdftotext -tsv $pdf2 - 2>@1" r]
            try {
                fconfigure $p1 -blocking 0
                fconfigure $p2 -blocking 0
                set pdf1tsv ""
                set pdf2tsv ""
                while {![eof $p1] || ![eof $p2]} {
                    append pdf1tsv [read $p1]
                    append pdf2tsv [read $p2]
                    after 1
                }
            } finally {
                close $p2
            }
        } finally {
            close $p1
        }
        list $pdf1tsv $pdf2tsv
    }

    This takes ~1 sec.
  • From meshparts@alexandru.dadalau@meshparts.de to comp.lang.tcl on Wed Apr 29 11:24:34 2026
    From Newsgroup: comp.lang.tcl

    On 29.04.2026 at 10:51, Mark Summerfield wrote:
    > This takes ~1 sec.

    So it's 2x faster, as expected.
    What's the issue?
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Wed Apr 29 09:46:56 2026
    From Newsgroup: comp.lang.tcl

    On Wed, 29 Apr 2026 11:24:34 +0200, meshparts wrote:

    > On 29.04.2026 at 10:51, Mark Summerfield wrote:
    >> This takes ~1 sec.
    > So it's 2x faster, as expected.
    > What's the issue?

    When I originally asked I only had the serial approach.
    I replied to myself once I had the multiprocessing approach which
    solved the problem so that people could see it was solved.
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Wed Apr 29 12:30:12 2026
    From Newsgroup: comp.lang.tcl

    * Mark Summerfield <m.n.summerfield@gmail.com>
    | With some help from Gemini (after I got past non-working and slow
    | solutions) I did a multiprocessing version:

    | proc app::multiprocess {pdftotext pdf1 pdf2} {
    | set p1 [open "|$pdftotext -tsv $pdf1 - 2>@1" r]
    | try {
    | set p2 [open "|$pdftotext -tsv $pdf2 - 2>@1" r]
    | try {
    | fconfigure $p1 -blocking 0
    | fconfigure $p2 -blocking 0

    Depending on the output of $pdftotext, some -encoding option might be necessary, too.

    | set pdf1tsv ""
    | set pdf2tsv ""
    | while {![eof $p1] || ![eof $p2]} {
    | append pdf1tsv [read $p1]
    | append pdf2tsv [read $p2]
    | after 1
    | }

    I don't like the busy-waiting loop for eof, but a solution using
    fileevents would require namespace vars or globals to collect the output
    and to signal 'done', so ymmv.

    R'
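
A minimal sketch of the fileevent approach described above, using namespace variables to collect the output and to signal completion. The `collect` namespace and its procs are hypothetical names, the `sh -c` commands are stand-ins for the pdftotext calls, and `-encoding utf-8` is an assumption about the tool's output.

```tcl
# Sketch: collect the output of several commands via the event loop.
namespace eval collect {
    variable output     ;# array: job name -> accumulated output
    variable pending 0  ;# number of pipelines that have not reached EOF
}

# Start a command in the background and register a readable handler.
proc collect::start {name args} {
    variable output
    variable pending
    set output($name) ""
    # "|" followed by a proper list opens a command pipeline for reading
    set chan [open |[list {*}$args 2>@1] r]
    chan configure $chan -blocking 0 -encoding utf-8
    chan event $chan readable [list collect::on_readable $name $chan]
    incr pending
}

# Called by the event loop whenever data (or EOF) is available.
proc collect::on_readable {name chan} {
    variable output
    variable pending
    append output($name) [chan read $chan]
    if {[chan eof $chan]} {
        # Back to blocking mode so that close waits for the child;
        # catch swallows a non-zero exit status in this sketch.
        chan configure $chan -blocking 1
        catch {close $chan}
        incr pending -1
    }
}

# Run the event loop until every started job has finished.
proc collect::wait {} {
    variable pending
    while {$pending > 0} {
        vwait [namespace current]::pending
    }
}

proc collect::get {name} {
    variable output
    return $output($name)
}

# Stand-in commands; replace with e.g. pdftotext -tsv one.pdf -
collect::start a sh -c {echo one}
collect::start b sh -c {echo two}
collect::wait
puts [list [string trim [collect::get a]] [string trim [collect::get b]]]
```

Unlike the polling loop, nothing runs here between events, so there is no `after 1` delay to tune; the cost is the extra namespace state.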
  • From abu@user13892@newsgrouper.org.invalid to comp.lang.tcl on Thu Apr 30 00:51:16 2026
    From Newsgroup: comp.lang.tcl


    I don't understand why Threads are not used (in particular Thread Pools)

    Here's my solution. Please tell me if there's a significant speed penalty.

    # ===============================

    package require Thread

    # run up to 3 parallel workers; extra jobs are queued
    set mytpool [tpool::create -minworkers 3 -maxworkers 3]

    # ..
    # Build each job as a list so that $pdftotext and the file names are
    # substituted here, in the main thread; the workers do not see these variables.
    set jobs [list \
        [list exec $pdftotext -tsv $pdf1 - 2>@1] \
        [list exec $pdftotext -tsv $pdf2 - 2>@1] \
        [list exec $pdftotext -tsv $pdf3 - 2>@1] \
    ]

    set T0 [clock milliseconds]
    set myjobIDs {}
    # schedule all jobs
    foreach job $jobs {
        lappend myjobIDs [tpool::post -nowait $mytpool $job]
    }
    # -nocomplain: RESULT may not exist yet on the first run
    unset -nocomplain RESULT
    puts "waiting for RESULT..."
    while { [llength $myjobIDs] > 0 } {
        # get the completed jobs; myjobIDs is updated with the list of the still pending jobs
        set completedJobs [tpool::wait $mytpool $myjobIDs myjobIDs]
        foreach job $completedJobs {
            puts "== Job $job completed at [expr {[clock milliseconds]-$T0}] msec"
            set RESULT($job) [tpool::get $mytpool $job]
        }
    }

    puts "Result saved in the RESULT() array"
    puts "Total processing time: [expr {[clock milliseconds]-$T0}] msec"
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Thu Apr 30 14:23:58 2026
    From Newsgroup: comp.lang.tcl

    * abu <user13892@newsgrouper.org.invalid>
    | I don't understand why Threads are not used (in particular Thread Pools)

    Most probably because Mark stated in Message-ID: <10sschf$3nvs2$1@dont-email.me>

    > In my experiments with Tcl's threading I've found the threading
    > startup overhead to be rather large.

    | Here's my solution. Please tell me if there's a significant speed penalty.

    Did you compare your version to Mark's solution? This would be the best comparison when running on the same hardware...

    R'
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Fri May 1 07:04:08 2026
    From Newsgroup: comp.lang.tcl

    I created a tiny test program (65 LOC; shown at the end) to
    compare timings. I did multiple timings and here are the averages:

    serial       (2 LOC)   2.020 sec
    multiprocess (19 LOC)  1.055 sec
    threaded     (13 LOC)  1.061 sec

    Since the difference between the multiprocess and threaded
    approaches is so small, and the threaded code is simpler
    and more appealing, I'm going to use the threaded version in
    my programs (which only ever work with two PDFs at a time),
    so thank you "abu"!

    #!/usr/bin/env tclsh9
    # usage: time ./concurrent.tcl <s|m|t> <file1.pdf> <file2.pdf>

    package require thread

    proc main {} {
        set pdftotext [auto_execok pdftotext]
        set pdf1 [lindex $::argv 1]
        set pdf2 [lindex $::argv 2]
        switch [lindex $::argv 0] {
            s { serial $pdftotext $pdf1 $pdf2 }
            m { multiprocess $pdftotext $pdf1 $pdf2 }
            t { threaded $pdftotext $pdf1 $pdf2 }
        }
    }

    proc serial {pdftotext pdf1 pdf2} {
        puts -nonewline "serial "
        set tsv1 [exec $pdftotext -tsv $pdf1 - 2>@1]
        set tsv2 [exec $pdftotext -tsv $pdf2 - 2>@1]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc multiprocess {pdftotext pdf1 pdf2} {
        puts -nonewline multiprocess
        set p1 [open "|$pdftotext -tsv $pdf1 - 2>@1" r]
        try {
            set p2 [open "|$pdftotext -tsv $pdf2 - 2>@1" r]
            try {
                fconfigure $p1 -blocking 0
                fconfigure $p2 -blocking 0
                set tsv1 ""
                set tsv2 ""
                while {![eof $p1] || ![eof $p2]} {
                    append tsv1 [read $p1]
                    append tsv2 [read $p2]
                    after 1
                }
            } finally {
                close $p2
            }
        } finally {
            close $p1
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc threaded {pdftotext pdf1 pdf2} {
        puts -nonewline "threaded "
        set pool [tpool::create -minworkers 2]
        set job1 [tpool::post -nowait $pool "exec $pdftotext -tsv $pdf1 - 2>@1"]
        set job2 [tpool::post -nowait $pool "exec $pdftotext -tsv $pdf2 - 2>@1"]
        set job_ids [list $job1 $job2]
        while {[llength $job_ids] > 0} {
            foreach job_id [tpool::wait $pool $job_ids job_ids] {
                if {$job_id eq $job1} {
                    set tsv1 [tpool::get $pool $job_id]
                } else {
                    set tsv2 [tpool::get $pool $job_id]
                }
            }
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    main
  • From Olivier@user1108@newsgrouper.org.invalid to comp.lang.tcl on Fri May 1 10:06:35 2026
    From Newsgroup: comp.lang.tcl


    Mark Summerfield <m.n.summerfield@gmail.com> posted:

    > I need to run two external processes (on Linux):
    >
    > pdftotext -tsv one.pdf
    > pdftotext -tsv two.pdf


    I am not an expert, but the following construction (with Tcl 9.x):

    1) launch both processes in the background

    2) check their status with ::tcl::process

    3) post-process the output of each process as soon as it has ended (*)

    seems doable, yet no one has mentioned anything similar. Is this a
    construction to avoid?

    (*) in a single monolithic script if that is fast enough, i.e. without
    threads or separate interpreters
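
The three steps above can be sketched roughly as follows. This is an outline under assumptions, not a tested recipe: it needs Tcl 8.7 or 9.x for `tcl::process`, `start_bg` is a hypothetical helper, the `sh -c` commands stand in for the pdftotext calls, and the output is sent to temporary files because `tcl::process` reports only the exit status, not the output.

```tcl
# Sketch only: needs Tcl 8.7 or 9.x, where tcl::process is available.

# Keep exit statuses until they are purged explicitly below; with the
# default autopurge a later exec may discard the status of a finished child.
tcl::process autopurge false

# Hypothetical helper: run a command in the background with its stdout
# redirected to outfile, and return its process id.
proc start_bg {outfile args} {
    return [lindex [exec {*}$args > $outfile 2>@ stderr &] end]
}

# Step 1: launch both processes in the background.
set pids {}
set files {}
foreach cmd {{sh -c {echo one}} {sh -c {echo two}}} {
    close [file tempfile path]   ;# creates an empty temporary file
    lappend files $path
    lappend pids [start_bg $path {*}$cmd]
}

# Step 2: -wait blocks until every listed process has exited; without
# -wait the call returns at once and running processes have an empty status.
set status [tcl::process status -wait $pids]

# Step 3: check each exit status, then post-process the captured output.
set results {}
foreach pid $pids path $files {
    if {[lindex [dict get $status $pid] 0] != 0} {
        error "process $pid failed: [dict get $status $pid]"
    }
    set f [open $path r]
    lappend results [string trim [read $f]]
    close $f
    file delete $path
}
tcl::process purge $pids
puts $results
```

To handle each process as soon as it ends rather than waiting for both, the status call could be polled without `-wait`, for example from an `after` timer.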
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Fri May 1 22:54:54 2026
    From Newsgroup: comp.lang.tcl

    * Mark Summerfield <m.n.summerfield@gmail.com>
    | I created a tiny test program (65 LOC; shown at the end) to
    | compare timings. I did multiple timings and here're the averages:

    | serial (2 LOC) 2.020 sec
    | multiprocess (19 LOC) 1.055 sec
    | threaded (13 LOC) 1.061 sec

    | Since the difference between the multiprocess and threaded
    | approaches is so small and that the threaded code is simpler
    | and more appealing, I'm going to use the threaded version in
    | my programs (which only ever work with two PDFs at a time)
    | - so thank you "abu"!

    I wonder: you stated in your initial message

    Message-ID: <10sschf$3nvs2$1@dont-email.me>
    > In my experiments with Tcl's threading I've found the threading
    > startup overhead to be rather large.

    Can you tell what is/was the difference to the current solution which
    obviously has no "startup overhead"?

    R'
  • From Emiliano@emiliano@example.invalid to comp.lang.tcl on Sat May 2 00:34:54 2026
    From Newsgroup: comp.lang.tcl

    On Wed, 29 Apr 2026 07:38:23 -0000 (UTC)
    Mark Summerfield <m.n.summerfield@gmail.com> wrote:

    > I need to run two external processes (on Linux):
    >
    > pdftotext -tsv one.pdf
    > pdftotext -tsv two.pdf

    You can use pipes and run the processes in the background, collecting output with the event loop. Here's a rough draft:

    proc runit {var file} {
        lassign [chan pipe] cr cw
        exec pdftotext -tsv $file - >@ $cw &
        chan close $cw
        chan configure $cr -blocking 0
        chan event $cr readable [list handle $var $cr]
    }
    proc handle {var fd} {
        global $var
        append $var [chan read $fd]
        if {[chan eof $fd]} {
            chan close $fd
            set ::done 1
        }
    }
    puts "sequential: [time {
        set out1 [exec pdftotext -tsv one.pdf -]
        set out2 [exec pdftotext -tsv two.pdf -]
        puts "one.pdf [string length $out1]"
        puts "two.pdf [string length $out2]"
    }]"
    puts "parallel: [time {
        runit out1 one.pdf
        runit out2 two.pdf
        vwait done
        vwait done
        puts "one.pdf [string length $out1]"
        puts "two.pdf [string length $out2]"
    }]"


    Regards
    --
    Emiliano
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Sat May 2 06:57:06 2026
    From Newsgroup: comp.lang.tcl

    On Fri, 01 May 2026 22:54:54 +0200, Ralf Fassel wrote:

    > * Mark Summerfield <m.n.summerfield@gmail.com>
    > | I created a tiny test program (65 LOC; shown at the end) to
    > | compare timings. I did multiple timings and here're the averages:
    >
    > | serial (2 LOC) 2.020 sec
    > | multiprocess (19 LOC) 1.055 sec
    > | threaded (13 LOC) 1.061 sec
    >
    > | Since the difference between the multiprocess and threaded
    > | approaches is so small and that the threaded code is simpler
    > | and more appealing, I'm going to use the threaded version in
    > | my programs (which only ever work with two PDFs at a time)
    > | - so thank you "abu"!
    >
    > I wonder: you stated in your initial message
    >
    > Message-ID: <10sschf$3nvs2$1@dont-email.me>
    >> In my experiments with Tcl's threading I've found the threading
    >> startup overhead to be rather large.
    >
    > Can you tell what is/was the difference to the current solution which obviously has no "startup overhead"?
    >
    > R'

    Yes, the difference was that I started out using thread::create etc.,
    rather than using tpool. I've put a new version that compares them
    all at the end. Anyone can compare timings for themselves if they
    have one or two big PDF files (the program needs two but for tests
    it is fine if it is the same one).

    On an old laptop:

    serial       (2 LOC)   6.37 sec
    multiprocess (19 LOC)  3.33 sec
    thread pool  (15 LOC)  3.60 sec
    threaded     (22 LOC)  3.66 sec

    I've now gone back to using the multiprocess version.
    Here's the full test code.

    #!/usr/bin/env tclsh9
    # usage: time ./concurrent.tcl <s|m|p|t> <file1.pdf> <file2.pdf>

    package require thread 3

    const OPT -tsv ;# OR if not supported by older pdftotext use: -bbox
    const PDFTOTEXT [auto_execok pdftotext]

    proc main {} {
        set pdf1 [lindex $::argv 1]
        set pdf2 [lindex $::argv 2]
        switch [lindex $::argv 0] {
            s { serial $pdf1 $pdf2 }
            m { multiprocess $pdf1 $pdf2 }
            p { thread_pool $pdf1 $pdf2 }
            t { threaded $pdf1 $pdf2 }
        }
    }

    proc serial {pdf1 pdf2} {
        puts -nonewline "serial "
        set tsv1 [exec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
        set tsv2 [exec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc multiprocess {pdf1 pdf2} {
        puts -nonewline multiprocess
        set p1 [open "|$::PDFTOTEXT $::OPT $pdf1 - 2>@1" r]
        try {
            set p2 [open "|$::PDFTOTEXT $::OPT $pdf2 - 2>@1" r]
            try {
                fconfigure $p1 -blocking 0
                fconfigure $p2 -blocking 0
                set tsv1 ""
                set tsv2 ""
                while {![eof $p1] || ![eof $p2]} {
                    append tsv1 [read $p1]
                    append tsv2 [read $p2]
                    after 1
                }
            } finally {
                close $p2
            }
        } finally {
            close $p1
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc thread_pool {pdf1 pdf2} {
        puts -nonewline "thread pool "
        set pool [tpool::create -minworkers 2]
        set job1 [tpool::post -nowait $pool \
            "exec $::PDFTOTEXT $::OPT $pdf1 - 2>@1"]
        set job2 [tpool::post -nowait $pool \
            "exec $::PDFTOTEXT $::OPT $pdf2 - 2>@1"]
        set job_ids [list $job1 $job2]
        while {[llength $job_ids] > 0} {
            foreach job_id [tpool::wait $pool $job_ids job_ids] {
                if {$job_id eq $job1} {
                    set tsv1 [tpool::get $pool $job_id]
                } else {
                    set tsv2 [tpool::get $pool $job_id]
                }
            }
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc threaded {pdf1 pdf2} {
        puts -nonewline "threaded "
        set tid1 [thread::create -joinable]
        set tid2 [thread::create -joinable]
        tsv::set shared pdf1 $pdf1
        tsv::set shared pdf2 $pdf2
        tsv::set shared pdftotext $::PDFTOTEXT
        tsv::set shared opt $::OPT
        thread::send -async $tid1 {
            tsv::set shared tsv1 \
                [exec -encoding utf-8 {*}[tsv::get shared pdftotext] \
                    [tsv::get shared opt] [tsv::get shared pdf1] - 2>@1]
        }
        thread::send -async $tid2 {
            tsv::set shared tsv2 \
                [exec -encoding utf-8 {*}[tsv::get shared pdftotext] \
                    [tsv::get shared opt] [tsv::get shared pdf2] - 2>@1]
        }
        thread::release $tid1
        thread::join $tid1
        thread::release $tid2
        thread::join $tid2
        set tsv1 [tsv::get shared tsv1]
        set tsv2 [tsv::get shared tsv2]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    main
  • From Ashok@apnmbx-public@yahoo.com to comp.lang.tcl on Sat May 9 16:03:16 2026
    From Newsgroup: comp.lang.tcl

    Shameless plug...

    Bit late to the topic, but the simplest way to parallelize multiple
    processes or threads and wait for completion is promises, if you do not
    mind an external package. Bit of a learning curve however.

    lappend promises [promise::pexec pdftotext pdf1.pdf pdf1.txt]
    lappend promises [promise::pexec pdftotext pdf2.pdf pdf2.txt]
    set waiter [promise::all $promises]
    # Assumes eventloop not running!
    promise::eventloop $waiter

    Timing:

    % time {demo} <- using promises
    2606403 microseconds per iteration
    % time {demo2} <- sequential exec's
    4762417 microseconds per iteration

    https://wiki.tcl-lang.org/page/promise
    https://tcl-promise.magicsplat.com/
    https://www.magicsplat.com/blog/tags/promises/

  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Mon May 11 11:08:06 2026
    From Newsgroup: comp.lang.tcl

    * Ashok <apnmbx-public@yahoo.com>
    | Shameless plug...

    | Bit late to the topic, but the simplest way to parallelize multiple
    | processes or threads and wait for completion is promises, if you do
    | not mind an external package. Bit of a learning curve however. --<snip-snip>--
    | https://tcl-promise.magicsplat.com/

    Ashok,
    since coroutines are already part of TCL, any chance of getting promises
    into the core? It would seem to me as a 'natural' addition for async
    features in TCL, and the package looks quite mature...

    R'
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Tue May 12 08:40:29 2026
    From Newsgroup: comp.lang.tcl

    On Sat, 9 May 2026 16:03:16 +0530, Ashok wrote:

    > Shameless plug...
    >
    > Bit late to the topic, but the simplest way to parallelize multiple processes or threads and wait for completion is promises, if you do not
    > mind an external package. Bit of a learning curve however.
    >
    > lappend promises [promise::pexec pdftotext pdf1.pdf pdf1.txt]
    > lappend promises [promise::pexec pdftotext pdf2.pdf pdf2.txt]
    > set waiter [promise::all $promises]
    > # Assumes eventloop not running!
    > promise::eventloop $waiter
    >
    > Timing:
    >
    > % time {demo} <- using promises
    > 2606403 microseconds per iteration
    > % time {demo2} <- sequential exec's
    > 4762417 microseconds per iteration
    >
    > https://wiki.tcl-lang.org/page/promise
    > https://tcl-promise.magicsplat.com/
    > https://www.magicsplat.com/blog/tags/promises/

    I tried it but hit a problem. Here's the code I used:

    proc promised {pdf1 pdf2} {
        set p1 [promise::pexec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
        set p2 [promise::pexec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
        set waiter [promise::all [list $p1 $p2]]
        # Assumes eventloop not running!
        promise::eventloop $waiter
        set tsv1 [$p1 getdata]
        set tsv2 [$p2 getdata]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    I used promise-1.2.0.tm. Here's the error:

    $ ./concurrent.tcl P ~/commercial/pdfs/boson[12].pdf
    invalid command name "::oo::Obj22"
        while executing
    "$p1 getdata"
        (procedure "promised" line 7)
        invoked from within
    "promised $pdf1 $pdf2 "
        (procedure "main" line 9)
        invoked from within
    "main"
        (file "./concurrent.tcl" line 112)

  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Tue May 12 16:58:55 2026
    From Newsgroup: comp.lang.tcl

    * Mark Summerfield <m.n.summerfield@gmail.com>
    | > https://wiki.tcl-lang.org/page/promise
    | > https://tcl-promise.magicsplat.com/
    | > https://www.magicsplat.com/blog/tags/promises/

    | I tried it but hit a problem. Here's the code I used:

    | proc promised {pdf1 pdf2} {
    | set p1 [promise::pexec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
    | set p2 [promise::pexec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
    | set waiter [promise::all [list $p1 $p2]]
    | # Assumes eventloop not running!
    | promise::eventloop $waiter
    | set tsv1 [$p1 getdata]
    | set tsv2 [$p2 getdata]

    promise::eventloop already returns the result of the 'waiter' promise
    (i.e. the results of the promises registered with promise::all).

    So change those two 'getdata' calls to

    lassign [promise::eventloop $waiter] tsv1 tsv2

    | puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"

    HTH
    R'
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Wed May 13 08:54:56 2026
    From Newsgroup: comp.lang.tcl

    On Tue, 12 May 2026 16:58:55 +0200, Ralf Fassel wrote:

    > * Mark Summerfield <m.n.summerfield@gmail.com>
    > | > https://wiki.tcl-lang.org/page/promise
    > | > https://tcl-promise.magicsplat.com/
    > | > https://www.magicsplat.com/blog/tags/promises/
    >
    > | I tried it but hit a problem. Here's the code I used:
    >
    > | proc promised {pdf1 pdf2} {
    > | set p1 [promise::pexec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
    > | set p2 [promise::pexec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
    > | set waiter [promise::all [list $p1 $p2]]
    > | # Assumes eventloop not running!
    > | promise::eventloop $waiter
    > | set tsv1 [$p1 getdata]
    > | set tsv2 [$p2 getdata]
    >
    > promise::eventloop already returns the result of the 'waiter' promise
    > (i.e. those registered in promise::all).
    >
    > So change those two 'getdata' calls to
    >
    > lassign [promise::eventloop $waiter] tsv1 tsv2
    >
    > | puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    >
    > HTH
    > R'

    Thanks, I've now done that. Here are the new timings (each is the best
    of several):

    sec method
    2.010 serial
    1.052 multiprocess
    1.065 thread_pool
    1.067 threaded
    8.366 promised
  • From meshparts@alexandru.dadalau@meshparts.de to comp.lang.tcl on Thu May 14 10:00:10 2026
    From Newsgroup: comp.lang.tcl

    On 13.05.2026 at 10:54, Mark Summerfield wrote:
    > Thanks, I've now done that. Here are the new timings (each is the best
    > of several):
    >
    > sec method
    > 2.010 serial
    > 1.052 multiprocess
    > 1.065 thread_pool
    > 1.067 threaded
    > 8.366 promised

    So with promises it's 4x slower than serial?
  • From Ashok@apnmbx-public@yahoo.com to comp.lang.tcl on Thu May 14 17:12:29 2026
    From Newsgroup: comp.lang.tcl

    I would not be in support of this myself. As it is I'm skeptical of
    adding packages to the core because there simply are not enough folks to maintain the packages already there.

    Additionally, promises are still a "fringe" idiom in Tcl land and not
    widely used or adopted.

    /Ashok

    On 5/11/2026 2:38 PM, Ralf Fassel wrote:

    > Ashok,
    > since coroutines are already part of TCL, any chance of getting promises
    > into the core? It would seem to me as a 'natural' addition for async features in TCL, and the package looks quite mature...
    >
    > R'

  • From Ashok@apnmbx-public@yahoo.com to comp.lang.tcl on Thu May 14 17:20:51 2026
    From Newsgroup: comp.lang.tcl

    I'm surprised by the promises result below (not that I doubt it). I'll
    have to take a look when I have some time.

    In my tests that I posted earlier, the promise version took about the
    same time as the multiprocess one.

    The difference between my example and yours is that in my example,
    pdftotext was writing to a file and not to its stdout. In your example,
    it is writing back to the pipe and read directly in Tcl.

    I wonder if the difference stems from your code essentially doing a busy
    loop reading data while the promise version goes through the event loop,
    though I cannot explain why that would make that much difference.

    Worth investigating further when I have time...

    /Ashok

    On 5/13/2026 2:24 PM, Mark Summerfield wrote:

    > Thanks, I've now done that. Here are the new timings (each is the best
    > of several):
    >
    > sec method
    > 2.010 serial
    > 1.052 multiprocess
    > 1.065 thread_pool
    > 1.067 threaded
    > 8.366 promised

  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Fri May 15 10:38:01 2026
    From Newsgroup: comp.lang.tcl

    On Thu, 14 May 2026 17:20:51 +0530, Ashok wrote:

    > I'm surprised by the promises result below (not that I doubt it). I'll
    > have to take a look when I have some time.
    >
    > In my tests that I posted earlier, the promise version took about the
    > same time as the multiprocess one.
    >
    > The difference between my example and yours is that in my example,
    > pdftotext was writing to a file and not to its stdout. In your example,
    > it is writing back to the pipe and read directly in Tcl.
    >
    > I wonder if the difference stems from your code essentially doing a busy loop reading data while the promise version goes through the event loop though I cannot explain why that would make that much difference.
    >
    > Worth investigating further when I have time...
    >
    > /Ashok
    >
    > On 5/13/2026 2:24 PM, Mark Summerfield wrote:
    >
    >> Thanks, I've now done that. Here are the new timings (each is the best
    >> of several):
    >>
    >> sec method
    >> 2.010 serial
    >> 1.052 multiprocess
    >> 1.065 thread_pool
    >> 1.067 threaded
    >> 8.366 promised

    In the hope it helps, below is the full source for the example I used.
    I ran it on Tcl/Tk 9.0.3 (64-bit), Debian GNU/Linux 12 (bookworm)
    Linux 6.1.0-44-amd64 (x86_64), 12th Gen Intel Core i7-12700 20 cores.
    I used two PDF files both of 647 pages and did several runs of each
    method to find the best time.

    #!/usr/bin/env tclsh9
    # usage: time ./concurrent.tcl <s|m|p|t|P> <file1.pdf> <file2.pdf>

    package require thread 3
    tcl::tm::path add .
    package require promise

    const OPT -tsv ;# If older pdftotext doesn't support -tsv use -bbox
    const PDFTOTEXT [auto_execok pdftotext]

    proc main {} {
        set pdf1 [lindex $::argv 1]
        set pdf2 [lindex $::argv 2]
        switch [lindex $::argv 0] {
            h - -h - --help {
                puts "usage: <s|m|p|t|P> <file1.pdf> <file2.pdf>"
                exit
            }
            s { serial $pdf1 $pdf2 }
            m { multiprocess $pdf1 $pdf2 }
            p { thread_pool $pdf1 $pdf2 }
            t { threaded $pdf1 $pdf2 }
            P { promised $pdf1 $pdf2 }
        }
    }

    proc serial {pdf1 pdf2} {
        puts -nonewline "serial "
        set tsv1 [exec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
        set tsv2 [exec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc multiprocess {pdf1 pdf2} {
        puts -nonewline multiprocess
        set p1 [open "|$::PDFTOTEXT $::OPT $pdf1 - 2>@1" r]
        try {
            set p2 [open "|$::PDFTOTEXT $::OPT $pdf2 - 2>@1" r]
            try {
                fconfigure $p1 -blocking 0
                fconfigure $p2 -blocking 0
                set tsv1 ""
                set tsv2 ""
                while {![eof $p1] || ![eof $p2]} {
                    append tsv1 [read $p1]
                    append tsv2 [read $p2]
                    after 1
                }
            } finally {
                close $p2
            }
        } finally {
            close $p1
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc thread_pool {pdf1 pdf2} {
        puts -nonewline "thread pool "
        set pool [tpool::create -minworkers 2]
        set job1 [tpool::post -nowait $pool \
            "exec $::PDFTOTEXT $::OPT $pdf1 - 2>@1"]
        set job2 [tpool::post -nowait $pool \
            "exec $::PDFTOTEXT $::OPT $pdf2 - 2>@1"]
        set job_ids [list $job1 $job2]
        while {[llength $job_ids] > 0} {
            foreach job_id [tpool::wait $pool $job_ids job_ids] {
                if {$job_id eq $job1} {
                    set tsv1 [tpool::get $pool $job_id]
                } else {
                    set tsv2 [tpool::get $pool $job_id]
                }
            }
        }
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc threaded {pdf1 pdf2} {
        puts -nonewline "threaded "
        set tid1 [thread::create -joinable]
        set tid2 [thread::create -joinable]
        tsv::set shared pdf1 $pdf1
        tsv::set shared pdf2 $pdf2
        tsv::set shared pdftotext $::PDFTOTEXT
        tsv::set shared opt $::OPT
        thread::send -async $tid1 {
            tsv::set shared tsv1 \
                [exec -encoding utf-8 {*}[tsv::get shared pdftotext] \
                    [tsv::get shared opt] [tsv::get shared pdf1] - 2>@1]
        }
        thread::send -async $tid2 {
            tsv::set shared tsv2 \
                [exec -encoding utf-8 {*}[tsv::get shared pdftotext] \
                    [tsv::get shared opt] [tsv::get shared pdf2] - 2>@1]
        }
        thread::release $tid1
        thread::join $tid1
        thread::release $tid2
        thread::join $tid2
        set tsv1 [tsv::get shared tsv1]
        set tsv2 [tsv::get shared tsv2]
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    proc promised {pdf1 pdf2} {
        set p1 [promise::pexec $::PDFTOTEXT $::OPT $pdf1 - 2>@1]
        set p2 [promise::pexec $::PDFTOTEXT $::OPT $pdf2 - 2>@1]
        set waiter [promise::all [list $p1 $p2]]
        # Assumes eventloop not running!
        lassign [promise::eventloop $waiter] tsv1 tsv2
        puts " tsv1=[string length $tsv1] tsv2=[string length $tsv2]"
    }

    main
  • From Ashok@apnmbx-public@yahoo.com to comp.lang.tcl on Sun May 17 12:36:55 2026
    From Newsgroup: comp.lang.tcl

    On 5/15/2026 4:08 PM, Mark Summerfield wrote:

    > In the hope it helps, below is the full source for the example I used.
    > I ran it on Tcl/Tk 9.0.3 (64-bit), Debian GNU/Linux 12 (bookworm)
    > Linux 6.1.0-44-amd64 (x86_64), 12th Gen Intel Core i7-12700 20 cores.
    > I used two PDF files both of 647 pages and did several runs of each
    > method to find the best time.

    Thanks, having a benchmark source helped. However I cannot reproduce
    your results. The promise version is as fast as any other. My laptop is
    long in the tooth but that should not make a difference in comparative
    terms I think.

    Below is what I get using the following shell script:

    ------
    #!/bin/sh

    for method in s m p t P; do
        for i in $(seq 1 5); do
            time -p ~/tcl/9.0.3/x64/bin/tclsh9.0 bench.tcl $method x.pdf y.pdf 2>&1 | tr '\n' ' '
            # Appends to previous line!
            echo "Method $method, Run $i"
        done
        echo "---------------------------------------------------------"
    done
    -----

    I used -bbox instead of -tsv as my pdftotext does not support the
    latter. The tests ran on my Ubuntu 22 WSL. All versions, including
    promises, are about twice as fast as the serial one. On every run of the
    script, there seem to be one or two exceptionally fast anomalies,
    independent of the method. Not sure why that is; some fortunate cache or
    memory effect?
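
One way to make such outliers visible is to time each method several times inside one interpreter and report the minimum, median and maximum instead of a single figure. A minimal sketch follows; `bench` is a hypothetical helper and `after 20` is only a stand-in for the code being measured. Timing in-process also leaves out interpreter startup and package loading, which the external `time` command includes.

```tcl
# Sketch: run a script several times and summarise the wall-clock times.
# Returns a dict with the min, median and max elapsed time in microseconds.
proc bench {runs script} {
    set times {}
    for {set i 0} {$i < $runs} {incr i} {
        set t0 [clock microseconds]
        uplevel 1 $script
        lappend times [expr {[clock microseconds] - $t0}]
    }
    set sorted [lsort -integer $times]
    # For an even number of runs this picks the upper of the two middle values
    set median [lindex $sorted [expr {[llength $sorted] / 2}]]
    return [dict create min [lindex $sorted 0] median $median \
        max [lindex $sorted end]]
}

# Stand-in workload; replace with e.g. {multiprocess $pdf1 $pdf2}
set stats [bench 5 {after 20}]
puts "min=[dict get $stats min] median=[dict get $stats median] max=[dict get $stats max] (microseconds)"
```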

    Here are the results, more or less as expected.

    serial tsv1=3559848 tsv2=3559848 real 5.59 user 0.82 sys 0.27 Method s, Run 1
    serial tsv1=3559848 tsv2=3559848 real 5.50 user 0.82 sys 0.29 Method s, Run 2
    serial tsv1=3559848 tsv2=3559848 real 5.61 user 0.82 sys 0.26 Method s, Run 3
    serial tsv1=3559848 tsv2=3559848 real 4.24 user 0.83 sys 0.25 Method s, Run 4
    serial tsv1=3559848 tsv2=3559848 real 5.59 user 0.81 sys 0.27 Method s, Run 5
    ---------------------------------------------------------
    multiprocess tsv1=3559849 tsv2=3559849 real 3.13 user 0.95 sys 0.33 Method m, Run 1
    multiprocess tsv1=3559849 tsv2=3559849 real 3.05 user 0.89 sys 0.36 Method m, Run 2
    multiprocess tsv1=3559849 tsv2=3559849 real 3.12 user 0.95 sys 0.31 Method m, Run 3
    multiprocess tsv1=3559849 tsv2=3559849 real 3.13 user 0.96 sys 0.30 Method m, Run 4
    multiprocess tsv1=3559849 tsv2=3559849 real 3.13 user 0.97 sys 0.30 Method m, Run 5
    ---------------------------------------------------------
    thread pool tsv1=3559848 tsv2=3559848 real 3.21 user 0.93 sys 0.40 Method p, Run 1
    thread pool tsv1=3559848 tsv2=3559848 real 3.15 user 0.90 sys 0.39 Method p, Run 2
    thread pool tsv1=3559848 tsv2=3559848 real 1.79 user 0.94 sys 0.37 Method p, Run 3
    thread pool tsv1=3559848 tsv2=3559848 real 3.14 user 0.97 sys 0.31 Method p, Run 4
    thread pool tsv1=3559848 tsv2=3559848 real 3.10 user 0.90 sys 0.37 Method p, Run 5
    ---------------------------------------------------------
    threaded tsv1=3559848 tsv2=3559848 real 3.17 user 0.90 sys 0.41 Method t, Run 1
    threaded tsv1=3559848 tsv2=3559848 real 3.14 user 0.90 sys 0.39 Method t, Run 2
    threaded tsv1=3559848 tsv2=3559848 real 3.14 user 0.90 sys 0.37 Method t, Run 3
    threaded tsv1=3559848 tsv2=3559848 real 3.14 user 0.94 sys 0.31 Method t, Run 4
    threaded tsv1=3559848 tsv2=3559848 real 3.14 user 0.92 sys 0.35 Method t, Run 5
    ---------------------------------------------------------
    promise tsv1=3559849 tsv2=3559849 real 3.33 user 2.68 sys 0.42 Method P, Run 1
    promise tsv1=3559849 tsv2=3559849 real 3.30 user 2.48 sys 0.46 Method P, Run 2
    promise tsv1=3559849 tsv2=3559849 real 1.94 user 2.48 sys 0.46 Method P, Run 3
    promise tsv1=3559849 tsv2=3559849 real 3.35 user 2.69 sys 0.39 Method P, Run 4
    promise tsv1=3559849 tsv2=3559849 real 3.31 user 2.64 sys 0.41 Method P, Run 5
  • From Mark Summerfield@m.n.summerfield@gmail.com to comp.lang.tcl on Mon May 18 07:11:14 2026
    From Newsgroup: comp.lang.tcl

    The most obvious difference is that I am running Linux on the hardware
    and (I think) you are running Linux on Windows. All I can suggest is
    trying the same test on Linux that's running directly on the hardware?