re-search-forward to translate rawbytes to appropriate unicode point
From Btraven@caudex2@gmail.com to gnu.emacs.help on Wed Jul 27 18:13:01 2022
From Newsgroup: gnu.emacs.help
Emacs gurus:
I can manually search for and replace escaped octal sequences (e.g. \202),
but I can't figure out how to do that programmatically. I thought something like this might work:
(defun esc-oct2unichar-in-region (start end) ;; alias M-x otu
  "Replace rawbytes in region with utf-8 chars."
  (interactive "r")
  (goto-char start)
  (save-excursion
    (setq curs (point))
    (while (< curs end)
      (progn
        (setq curs (re-search-forward "[\204 \221 \222 \223 \226 \234 \273 \311 \337 \342 \344 \346 \351 \253 \352 \356 \363 \364 \366 \373 \374]"))
        (if (< curs end)
            (replace-match (cdr (assoc (match-string-no-properties 0)
                                       '((\204 . "\"") (\221 . T") (\222 .nm) (\223 . T") (\226 .")
                                         (\234 . "ce") (\273 . T") (\311 . "E") (\337 . "&")
                                         (\342 . "a") (\344 . "a") (\346 . "ae") (\351 . "e")
                                         (\253 . 1 "") (\352 . "e") (\356 . T) (\363 . "5")
                                         (\364 . "6") (\366 . "6") (\373 . "u") (\374 . "u"))))))))))
(There are many typos after '((\204 ... because I had to OCR the Emacs screen, but my intent should be obvious.)
The characters in the argument to re-search-forward are actual byte sequences that are (mostly) illegal in UTF-8, and they show up in red in the Emacs window. Is there some string representation of these escaped octal numbers? Is it even possible to automate this chore? Apparently rawbytes can't be used in regexes. It's fairly easy to globally replace each of these raw bytes by hand, but rather tedious.
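One possible direction, as a minimal sketch rather than a definitive answer: in a multibyte buffer Emacs represents each raw byte as a character with a code in the range #x3FFF80..#x3FFFFF, so the region can be walked character by character without any regexp. The function name my/raw-bytes-to-unicode-in-region is hypothetical, and decoding the bytes as Windows-1252 is an assumption based on the octal values listed above; it is not something confirmed for this file.

```elisp
;; Sketch only. Assumption: the stray bytes are Windows-1252 text.
;; The function name is made up for illustration.
(defun my/raw-bytes-to-unicode-in-region (start end)
  "Replace raw bytes between START and END with Unicode characters.
Each raw byte is decoded as Windows-1252 (an assumption)."
  (interactive "r")
  (save-excursion
    ;; A marker keeps tracking the end of the region while text changes.
    (let ((end-marker (copy-marker end)))
      (goto-char start)
      (while (< (point) end-marker)
        (let ((ch (char-after)))
          (if (and (>= ch #x3FFF80) (<= ch #x3FFFFF))
              ;; Raw byte: recover the byte value (#x80..#xFF),
              ;; then replace it with its decoded character.
              (let ((byte (- ch #x3FFF00)))
                (delete-char 1)
                (insert (decode-coding-string (unibyte-string byte)
                                              'windows-1252)))
            ;; Ordinary character: leave it alone.
            (forward-char 1))))
      (set-marker end-marker nil))))
```

With a region selected, M-x my/raw-bytes-to-unicode-in-region would then convert, for example, \351 to é and \204 to „. If the whole file is really in one legacy encoding, re-reading it with C-x RET r (revert-buffer-with-coding-system) may be simpler than fixing bytes one by one.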
Thanks for any comments, especially helpful ones.
Ed