• Please ignore my previous post - Re: Automating an atypical search & re

    From Janis Papanagnou@21:1/5 to Janis Papanagnou on Sat Jul 13 19:55:35 2024
    Please ignore my previous post - it would delete the whole span'ed
    section!


    It just occurred to me you'd probably want something like

    /<span class='add'>
    df>
    /<\/span>
    df>

    And if you record the commands (I'll provide code on demand),
    just replay the recording. You can also press the arrow keys
    after typing / to recall previous search patterns if you like.
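
    To see why the earlier version removed too much, here is a small
    Python sketch that models what Vim's df> and d2f> do to a line.
    The helper name delete_through and the sample line are made up
    for illustration; this only imitates the motions, it is not how
    Vim itself works.

```python
def delete_through(text, pos, ch, count=1):
    """Imitate Vim's d{count}f{ch}: delete from pos through the
    count-th occurrence of ch to the right of pos."""
    end = pos
    for _ in range(count):
        # f{ch} never matches the character under the cursor itself
        end = text.index(ch, end + 1)
    return text[:pos] + text[end + 1:]


# Sample line, made up for illustration.
line = "And God saw the light, that <span class='add'>it was</span> good"
start = line.index("<span class='add'>")

# Earlier suggestion (d2f>): removes both tags AND the text between them.
too_much = delete_through(line, start, ">", count=2)

# Corrected sequence: df> on the opening tag, then search for the
# next </span> and apply df> there as well.
step1 = delete_through(line, start, ">")
close = step1.index("</span>", start)
kept = delete_through(step1, close, ">")

print(repr(too_much))
print(repr(kept))
```

    In this model the d2f> variant also swallows the words inside
    the span, while the corrected sequence keeps them.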


    On 13.07.2024 19:48, Janis Papanagnou wrote:
    On 13.07.2024 18:08, Richard Owlett wrote:
    I'm reformatting some HTML files containing chapters of the KJV Bible.
    My source follows the practice of italicizing some words.
    I find italics distracting.

    These occurrences are consistently of the form
    <span class='add'>arbitrary_text</span>

    I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
    Obviously it would not be wise to fully automate the action.
    I wish to find all occurrences of <span
    class='add'>arbitrary_text</span> and manually confirm the edit.

    In general, is it feasible?

    Yes, sure.

    Some remarks...
    I would use Regular Expressions (RE) for that task.
    If <span> sections can be nested in your HTML source then you
    cannot do that with plain RE processors.
    Since you want to inspect each <span> pattern individually it's
    not clear what you mean by "automate" (which I'd interpret as
    running a batch job to do the process).
    Actually you seem to want a sequential find + replace-or-skip.

    In Vim I'd search for the "<span ..." pattern and then delete
    to the next "</span>" pattern. (Assuming no nested <span>.)
    Rinse repeat.
    That could be (for example) the commands [case 1]

    /<span class='add'>
    d/<\/span>
    df>

    If there's no other <...> inside the span-sections you could
    simplify that to [case 2]

    /<span class='add'>
    d2f>

    with the opportunity to repeat those search+delete commands
    by simply typing n. for every match, like n.n.n.n. or if
    you want to skip some like, e.g., n.nnnn.n.nnn.n

    With n you get to the next span pattern and . repeats the
    last command.

    In [case 1] the repeat isn't possible since we have two delete
    operations d/<\/span> and df> , but here you can define
    macros to trigger the command by a keystroke or just use the
    recording function to repeat the once recorded commands.

    Sounds complicated? - Maybe. - But if we know your exact data
    format we can provide the Vim command sequence that is easiest
    to use.


    Can KDE's Kate do it?

    Don't know.

    Janis


    TIA


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Owlett@21:1/5 to Janis Papanagnou on Sun Jul 14 02:33:25 2024
    On 07/13/2024 12:55 PM, Janis Papanagnou wrote:
    Please ignore my previous post - it would delete the whole span'ed
    section!


    It just occurred to me you'd probably want something like

    /<span class='add'>
    df>
    /<\/span>
    df>

    And if you're using recording of the commands (I'll provide code
    on demand) just repeat the recordings. You can also just use the
    arrow keys after typing / to get the previous search patterns
    if you like.

    I don't know how to parse your answer.
    But I suspect following some leads from Lawrence and Stan in this thread
    will be illuminating. I have just started reading https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html .

    Part of my motive for this project is self education.



  • From Janis Papanagnou@21:1/5 to Richard Owlett on Sun Jul 14 10:43:48 2024
    On 14.07.2024 09:33, Richard Owlett wrote:
    On 07/13/2024 12:55 PM, Janis Papanagnou wrote:
    Please ignore my previous post - it would delete the whole span'ed
    section!


    It just occurred to me you'd probably want something like

    /<span class='add'>
    df>
    /<\/span>
    df>

    And if you're using recording of the commands (I'll provide code
    on demand) just repeat the recordings. You can also just use the
    arrow keys after typing / to get the previous search patterns
    if you like.

    I don't know how to parse your answer.

    What I meant is that if you perform the same editing tasks or
    editing commands repeatedly you certainly want to avoid typing
    them over and over again. There are a couple of methods to achieve
    that in the Vim editor. One method is to use the editor's history
    functions, which give you access to (for example) previous search
    patterns. Another one in Vim is to record the commands so that you
    can replay them whenever you want with simple keystrokes.

    The perhaps cryptic-looking commands I gave are the Vim commands
    for the task you had described:

    /    searches for the regular expression pattern that follows it
    df>  deletes the text from the cursor up to and including the
         next '>' symbol, i.e. the one that terminates the tag


    But I suspect following some leads from Lawrence and Stan in this thread
    will be illuminating. I have just started reading https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html .

    Part of my motive for this project is self education.

    Fair enough. It's not clear to me what exactly you want to learn.
    Using the Kate editor, learning how to write Regular Expressions,
    how to efficiently edit texts, or how to handle/edit HTML files
    to make them readable for your purposes?

    If it's the latter then the right way to do that is (as already
    said in my [OT] reply and as Stan also suggested) to just fix the
    CSS definition, if that's the place where the 'italic' property
    has been defined. (If, OTOH, your HTML code contains, e.g., lots
    of <i> tags then you'd have to handle/edit them individually.)

    It has also been mentioned already that HTML structures cannot
    sensibly be handled by regular expressions. - So you learned that
    already. - But for non-nested HTML sub-structures it could be
    achievable anyway.
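
    For the non-nested case, the find + replace-or-skip idea could
    be sketched outside the editor as well, for example in Python.
    This is only a minimal sketch: the names SPAN_RE and unwrap_spans
    and the confirm callback are my own invention, and it assumes
    that the spans are never nested and are always written exactly
    as <span class='add'>.

```python
import re

# One non-nested span; the non-greedy group stops at the first </span>.
# re.DOTALL lets the enclosed text run across a line break.
SPAN_RE = re.compile(r"<span class='add'>(.*?)</span>", re.DOTALL)


def unwrap_spans(html, confirm):
    """Remove the span tags but keep the enclosed text, for every
    match that the confirm(match) callback accepts."""
    def replace(match):
        # Keep only the inner text if confirmed, else leave the match as is.
        return match.group(1) if confirm(match) else match.group(0)
    return SPAN_RE.sub(replace, html)


# Non-interactive demonstration: accept every match.
sample = "that <span class='add'>it was</span> good"
print(unwrap_spans(sample, lambda m: True))
```

    For manual confirmation the callback could ask on the terminal,
    e.g. lambda m: input(m.group(0) + " unwrap? [y/n] ") == "y".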

    To learn the Kate editor I'd suppose there's a description or
    manual available.

    Janis
