• Google's TPU muscle in 2017 [Prolog Community is Sleepy Joe] (Was: Zeus: A Language for Expressing Algorithms in Hardware)

    From Mild Shock@janburse@fastmail.fm to sci.physics on Thu Nov 27 15:22:31 2025
    From Newsgroup: sci.physics

    Hi,

    Well, I am currently looking into Local AI when I
    consider NPUs, which have very small floats.
    An example of a bigger AI accelerator is the TPU,

    which can deal with larger floats. Consequently
    it can also deal with a larger integer range.
    I only read this anecdote yesterday:

    "In December 2017, Stockfish 8 was used as a
    benchmark to test Google division DeepMind's
    AlphaZero, with Stockfish running on CPU and
    AlphaZero running on Google's proprietary
    Tensor Processing Units (TPUs).

    AlphaZero was trained through self-play for
    a total of nine hours, and reached Stockfish's
    level after just four. AlphaZero also played
    twelve 100-game matches against Stockfish starting
    from twelve popular openings for a final score
    of 290 wins, 886 draws and 24 losses, for a
    point score of 733:467." https://en.wikipedia.org/wiki/Stockfish_(chess)#Stockfish_8_versus_AlphaZero

    And then:

    "AlphaZero's victory over Stockfish sparked a
    flurry of activity in the computer chess community,
    leading to a new open-source engine aimed at
    replicating AlphaZero, known as Leela Chess Zero.
    The two engines remained close in strength for a
    while, but Stockfish has pulled away since the
    introduction of NNUE, winning every TCEC
    season since Season 18."

    Meanwhile the Prolog community: Sleepy Joe

    LoL

    Bye

    Mild Shock wrote:
    Hi,

    What mindset is needed to program an NPU? Most likely
    a mindset based on fork/join parallelism is nonsense.
    What could be more fruitful is to view the AI accelerator

    as a black box that runs a neural network, whereby
    a neural network can effectively be viewed as a form
    of hardware, although under the hood it is open weights

    and matrix operations. So the mindset needs:

    Zeus: A Language for Expressing Algorithms in Hardware
    K. J. Lieberherr, 01 February 1985, https://dl.acm.org/doi/10.1109/MC.1985.1662799

    What has changed since then?

    - 80's Field Programmable Gate Array (FPGA)

    - 20's AI Boom: NPUs, Unified Memory and Routing Fabric

    Bye

    Mild Shock wrote:
    Hi,

    I already posted how to do SAT and Clark Completion
    with ReLU. This was a post from 15.03.2025, 16:13,
    see also below. But can we do CLP as well? Here

    is a take on the dif/2 constraint, or more precisely
    a very primitive (#\=)/2 from CLP(FD), going towards
    analog computing. It might work for domains that

    fit into the quantization size of an NPU:

    1) First note that we can model abs() via ReLU:

    abs(x) = ReLU(x) + ReLU(- x)

    2) Then note that for nonnegative integer values
    (enough here, since we only apply it to abs(), which
    is nonnegative), we can model chi(x>0), the
    characteristic function of the predicate x > 0:

    chi(x>0) = 1 - ReLU(1 - x)

    3) Now chi(x=\=y) is simply:

    chi(x=\=y) = chi(abs(x - y) > 0)

    Now insert the formula for chi(x>0) based on ReLU
    and the formula for abs() based on ReLU. Et voila, you
    get a manually created neural network for the

    (#\=)/2 constraint of CLP(FD), constraint logic
    programming over finite domains.
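
    Here is a minimal Python sketch of this construction
    (the function names are mine, for illustration only),
    checking the encoding over a small finite domain:

    def relu(x):
        return max(0, x)

    def abs_relu(x):
        # abs(x) = ReLU(x) + ReLU(-x)
        return relu(x) + relu(-x)

    def chi_pos(x):
        # chi(x>0) = 1 - ReLU(1 - x), for integer x >= 0
        return 1 - relu(1 - x)

    def chi_neq(x, y):
        # chi(x=\=y) = chi(abs(x - y) > 0)
        return chi_pos(abs_relu(x - y))

    # sanity check over the domain -5..5
    assert all(chi_neq(x, y) == (1 if x != y else 0)
               for x in range(-5, 6) for y in range(-5, 6))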

    Have Fun!

    Bye

    Mild Shock wrote:
    A storm of symbolic differentiation libraries
    was posted. But what can these Prolog code
    fossils do?

    Does one of these libraries support Python symbolic
    Piecewise? For example, one can define the rectified
    linear unit (ReLU) with it:

               / x    if x >= 0
    ReLU(x) := <
               \ 0    otherwise
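
    A minimal sympy sketch of this definition (assuming
    the sympy package, see the how-to-data link below):

    from sympy import Piecewise, symbols

    x = symbols('x')
    relu = Piecewise((x, x >= 0), (0, True))

    print(relu.subs(x, -2))  # 0
    print(relu.subs(x, 3))   # 3
    print(relu.diff(x))      # derivative is again piecewise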

    With the above one can already translate a
    propositional logic program that uses negation
    as failure into a neural network:

    NOT    \+ p            1 - x
    AND    p1, ..., pn     ReLU(x1 + ... + xn - (n-1))
    OR     p1; ...; pn     1 - ReLU(-x1 - ... - xn + 1)

    For clauses just use the Clark Completion: it makes
    the defined predicate a new neuron, dependent on
    other predicate neurons,

    through a network of intermediate neurons. Because
    of the constant shift in AND and OR, the neurons
    will have a bias b.

    So rule-based zero-order logic is a subset of
    neural networks.
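
    A minimal Python sketch of these three gates (names
    are mine, for illustration only), checked against
    their truth tables:

    from itertools import product

    def relu(x):
        return max(0, x)

    def nn_not(x):
        # NOT: 1 - x
        return 1 - x

    def nn_and(*xs):
        # AND: ReLU(x1 + ... + xn - (n-1))
        return relu(sum(xs) - (len(xs) - 1))

    def nn_or(*xs):
        # OR: 1 - ReLU(-x1 - ... - xn + 1)
        return 1 - relu(1 - sum(xs))

    for a, b in product([0, 1], repeat=2):
        assert nn_and(a, b) == (a and b)
        assert nn_or(a, b) == (a or b)
        assert nn_not(a) == 1 - a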

    Python symbolic Piecewise
    https://how-to-data.org/how-to-write-a-piecewise-defined-function-in-python-using-sympy/

    rectified linear unit (ReLU)
    https://en.wikipedia.org/wiki/Rectifier_(neural_networks)

    Clark Completion
    https://www.cs.utexas.edu/~vl/teaching/lbai/completion.pdf

    Mild Shock wrote:
    Hi,

    I am speculating that an NPU could give 1000x more
    LIPS for certain combinatorial search problems. It
    all boils down to implementing this thingy:

    In June 2020, Stockfish introduced the efficiently
    updatable neural network (NNUE) approach, based
    on earlier work by computer shogi programmers
    https://en.wikipedia.org/wiki/Stockfish_%28chess%29

    There are varying degrees of what gets updated in
    a neural network. But the specs of an NPU tell
    me very simply the following:

    - An NPU can do 40 TFLOPS; all my AI laptops
      from 2025 can do that right now. The brands
      are Intel Ultra, AMD Ryzen and Snapdragon X,

      but I guess there might be more brands around
      which can do that, with a price tag of less
      than 1000.- USD.

    - SWI-Prolog can do 30 MLIPS, Dogelog Player
      runs similarly, and some Prolog systems are faster.

    Now that is 10^12 versus 10^6. If some of the
    LIPS can be delegated to an NPU, and if we assume
    for example less locality or more primitive

    operations that require layering, we could assume
    that from the NPU's 10^12 a factor of 1000 goes
    away. So we might still see 10^9 LIPS emerge.

    Now make the calculation:

    - Without NPU: MLIPS
    - With NPU: GLIPS
    - Ratio: 1000x faster
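
    A back-of-envelope sketch of this estimate in Python
    (the factor-1000 layering penalty is the assumption
    from above, not a measured number):

    npu_flops   = 40e12  # 40 TFLOPS NPU
    prolog_lips = 30e6   # 30 MLIPS, e.g. SWI-Prolog
    layering    = 1000   # assumed NPU mapping overhead

    npu_lips = npu_flops / layering   # ~4e10, tens of GLIPS
    print(npu_lips / prolog_lips)     # roughly 1000x speed-up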

    Have fun!

    Bye

    Mild Shock wrote:
    Hi,

    So Boris the Loris and Nazi Retard Julio are
    not alone. There is now a mobilization of the
    kind of rage against the machine,

    fighting for methods without randomness. It is
    almost as if Albert Einstein ascended from his
    grave and is now preaching,

    "God does not play dice"

    So how it started:

    PIVOT was an interactive program verifier designed by
    L. Peter Deutsch for his Ph.D. dissertation.
    Posted here by permission of L. Peter Deutsch.
    https://softwarepreservation.computerhistory.org/pivot/

    How its going:

    Formal Methods: Whence and Whither?
    The text also highlights the evolving role of formal
    methods amidst technological advancements, such as
    AI, and explores educational and standardization issues
    related to their adoption.
    https://de.slideshare.net/slideshow/formal-methods-whence-and-whither-keynote/273708245


    Can the Don Quijotes win and fight the AI windmills?

    LoL

    Bye

    Mild Shock wrote:
    Hi,

    Boris the Loris and Julio Di Egidio the Nazi Retard
    are going for an after-work beer. They are still
    highly confused by Fuzzy Testing:

    Star Trek - The 70's Disco Generation
    https://www.youtube.com/watch?v=505zvAvnreg

    The favorite hangout is Spock's Logic Dancefloor,
    which is known for its sharp unfuzzy wit. They
    have a chat with Data about Disco Math,

    the only Math which has no Fuzzy Logic in it.

    Bye

    Mild Shock wrote:
    Hi,

    Candidate Recommendation Draft - 30 September 2025
    https://www.w3.org/TR/webnn

    WebNN samples by Ningxin Hu, Intel, Shanghai
    https://github.com/webmachinelearning/webnn-samples

    Bye

    Mild Shock wrote:
    Hi,

    It seems I am having problems keeping pace with
    all the new fancy toys. I wasn't able to really
    benchmark my NPU on a Desktop AI machine, since I

    picked the wrong driver. I need to try again.
    What worked was benchmarking Mobile AI machines.
    I just grabbed Geekbench AI and some devices:

    USA Fab, M4:

                sANN    hANN     qANN
    iPad CPU    4848    7947     6353
    iPad GPU    9752    11383    10051
    iPad NPU    4873    36544    *51634*

    China Fab, Snapdragon:

                   sANN    hANN    qANN
    Redmi CPU      1044    950     1723
    Redmi GPU      480     905     737
    Redmi NNAPI    205     205     469
    Redmi QNN      226     226     *10221*

    The speed-up via NPU is a factor of 10x. See the
    column qANN, which means quantized artificial neural
    networks, when NPU or QNN is picked.

    The mobile AI NPUs are optimized to use minimal
    amounts of energy and minimal amounts of space,
    squeezing (distilling) everything

    into INT8 and INT4.
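
    A minimal Python sketch of what such INT8 quantization
    does to weights (plain symmetric per-tensor quantization,
    for illustration only):

    import numpy as np

    def quantize_int8(w):
        # map float weights to INT8 with one scale per tensor
        scale = np.abs(w).max() / 127.0
        q = np.round(w / scale).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    print(np.abs(w - dequantize(q, s)).max())  # small error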

    Bye

    --- Synchronet 3.21a-Linux NewsLink 1.2