From Newsgroup: comp.misc
Mike Spencer <mds@bogus.nodomain.nowhere> wrote or quoted:
> Can someone suggest books at more or less the same level of
> technicality that I might look at to catch up a bit on how neural
> nets are now constructed, trained, connected, etc. to produce what is
> being called "large language models"?
From short to long:
"The Little Book of Deep Learning" - Fran|oois Fleuret
(GPT in section 5.3 "Attention Models")
"Understanding Deep Learning" - Simon J.D. Prince (section 12.5
"Transformers for natural language processing")
"Dive into Deep Learning" - Aston Zhang, Zachary C. Lipton,
Mu Li, And Alexander J. Smola (chapter 11 "Attention
Mechanisms and Transformers")
Hands-on:
"The Complete Guide to Deep Learning with Python Keras, Tensorflow,
And Pytorch" - Joseph G. Derek (section 15.4 "NLP Applications:
Chatbots, Sentiment Analysis Tools") - only conditionally
recommended due to deficiencies in source code formatting
--- Synchronet 3.21a-Linux NewsLink 1.2