I'm reformatting some HTML files containing chapters of the KJV Bible.
My source follows the practice of italicizing some words.
I find italics distracting.
These occurrences are consistently of the form
<span class='add'>arbitrary_text</span>
I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
Obviously it would not be wise to fully automate the action.
I wish to find all occurrences of <span
class='add'>arbitrary_text</span> and manually confirm the edit.
In general, is it feasible?
Can KDE's Kate do it?
TIA
On 13.07.2024 18:08, Richard Owlett wrote:
I'm reformatting some HTML files containing chapters of the KJV Bible.
My source follows the practice of italicizing some words.
I find italics distracting.
These occurrences are consistently of the form
<span class='add'>arbitrary_text</span>
I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
Obviously it would not be wise to fully automate the action.
I wish to find all occurrences of <span
class='add'>arbitrary_text</span> and manually confirm the edit.
In general, is it feasible?
Yes, sure.
Some remarks...
I would use Regular Expressions (RE) for that task.
If <span> sections can be nested in your HTML source then you
cannot do that with plain RE processors.
Since you want to inspect each <span> pattern individually it's
not clear what you mean by "automate" (which I'd interpret as
running a batch job to do the process).
Actually you seem to want a sequential find + replace-or-skip.
In Vim I'd search for the "<span ..." pattern and then delete
to the next "</span>" pattern. (Assuming no nested <span>.)
Rinse repeat.
That could be (for example) the commands [case 1]
/<span class='add'>
d/<\/span>
df>
If there's no other <...> inside the span-sections you could
simplify that to [case 2]
/<span class='add'>
d2f>
with the opportunity to repeat those search+delete commands
by simply typing n. for every match, like n.n.n.n. or if
you want to skip some like, e.g., n.nnnn.n.nnn.n
With n you get to the next span pattern and . repeats the
last command.
In [case 1] the repeat isn't possible since we have two delete
operations d/<\/span> and df> , but here you can define
macros to trigger the command by a keystroke or just use the
recording function to repeat the once recorded commands.
Sounds complicated? - Maybe. - But if we know your exact data
format we can provide the best command sequence for Vim for
most easy use.
Can KDE's Kate do it?
Don't know.
Janis
TIA
On 13.07.2024 19:48, Janis Papanagnou wrote:
[...]
Please ignore my previous post - it would delete the whole span'ed
section!
It just occurred to me you'd probably want something like
/<span class='add'>
df>
/<\/span>
df>
And if you're using recording of the commands (I'll provide code
on demand) just repeat the recordings. You can also just use the
arrow keys after typing / to get the previous search patterns
if you like.
On Sat, 13 Jul 2024 11:08:48 -0500, Richard Owlett wrote:
These occurrences are consistently of the form
<span class='add'>arbitrary_text</span>
I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
This is beyond the abilities of regular expressions. This is the point
where you need to use an actual HTML/XML-parsing library.
See also <https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags>.
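For anyone who wants to try the parser route, here is a minimal sketch in Python using the standard library's html.parser module. The names AddSpanStripper and strip_add_spans are made up for illustration, and it assumes the class attribute is exactly 'add'. It tracks open spans on a stack so that only the </span> belonging to a dropped <span class='add'> is removed, even when spans are nested. Treat it as a starting point rather than a finished tool: end tags are re-emitted in lowercase, and CDATA sections are not handled.

```python
from html.parser import HTMLParser


class AddSpanStripper(HTMLParser):
    """Re-emit HTML as-is, minus <span class='add'> and its matching </span>."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; pass through untouched
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # one entry per open <span>; True means "dropped"

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            drop = dict(attrs).get("class") == "add"
            self.span_stack.append(drop)
            if drop:
                return  # swallow the opening tag
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # self-closing tags such as <br/> are copied verbatim
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # pop the stack only for spans; drop the close tag if its open was dropped
        if tag == "span" and self.span_stack and self.span_stack.pop():
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_add_spans(html):
    """Return html with every <span class='add'> wrapper removed (hypothetical helper)."""
    parser = AddSpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling strip_add_spans() on the text of one chapter file returns the cleaned text; this does the whole file in one go, so it does not cover the "confirm each edit" requirement by itself.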
On Sat, 13 Jul 2024 23:39:14 -0000 (UTC), Lawrence D'Oliveiro wrote:
On Sat, 13 Jul 2024 11:08:48 -0500, Richard Owlett wrote:
These occurrences are consistently of the form
<span class='add'>arbitrary_text</span>
I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
This is beyond the abilities of regular expressions. This is the point
where you need to use an actual HTML/XML-parsing library.
In general I'd agree with you. But the OP made a big deal -- in a
different thread, for some reason -- about wanting to use minimal
HTML, so I doubt very much there will be nested <span> ... </span>
sequences.
Also, the OP quite rightly wanted to confirm each change before it is
made, so presumably if there are any nested sequences he will say no
to that particular edit and fix it manually.
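The confirm-each-change idea is not tied to any one editor. As a rough sketch only, the same loop could look like this in Python. The function name unwrap_with_confirmation and its confirm parameter are invented for this example; confirm is any callable that is shown the matched text and returns True to unwrap it or False to leave it alone.

```python
import re

# Non-greedy match from the opening tag to the NEXT </span>.
# With nested spans this pairs the wrong close tag, which is exactly
# the case where the person confirming should answer "no".
ADD_SPAN = re.compile(r"<span class='add'>(.*?)</span>", re.DOTALL)


def unwrap_with_confirmation(text, confirm):
    """Offer each <span class='add'>...</span> to confirm(); unwrap only on True."""

    def replace(match):
        if confirm(match.group(0)):
            return match.group(1)  # keep only the text between the tags
        return match.group(0)      # declined: leave this occurrence untouched

    return ADD_SPAN.sub(replace, text)
```

For interactive use, confirm could be something like lambda s: input(f"Unwrap {s!r}? [y/N] ").lower() == "y". Declined matches stay in the text exactly as they were, so they can be fixed by hand afterwards.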
On Sat, 13 Jul 2024 11:08:48 -0500, Richard Owlett wrote:
I'm reformatting some HTML files containing chapters of the KJV Bible.
My source follows the practice of italicizing some words.
I find italics distracting.
These occurrences are consistently of the form
<span class='add'>arbitrary_text</span>
I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
Obviously it would not be wise to fully automate the action.
I wish to find all occurrences of <span
class='add'>arbitrary_text</span> and manually confirm the edit.
In general, is it feasible?
Yes, of course. Any editor above the level of Notepad ought to be
able to do this. (Sadly, a lot of editors are not above the level of Notepad.)
For instance, in Vim you would use this command after opening the
file:
:%s;<span class='add'>\([^<]*\)</span>;\1;gc
% = process every line of the file
\( ... \) makes that part of the pattern match addressable
[^<]* matches a string of characters not including a <. If there is other
HTML between span and /span, it will not match.
\1 = the text found between span and /span
gc = do every occurrence on each line, but confirm each one
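The same pattern can be tried outside Vim to see what it will and will not touch. Below is a small Python sketch of it; the function name unwrap_simple_spans is made up for illustration, and Python's re syntax writes the group as ( ... ) rather than Vim's \( ... \). There is no confirm step here, it only shows the matching behaviour of [^<]*.

```python
import re

# Same idea as the Vim pattern: capture text that contains no "<"
# between the opening tag and </span>.
SIMPLE_ADD_SPAN = re.compile(r"<span class='add'>([^<]*)</span>")


def unwrap_simple_spans(text):
    """Replace each simple <span class='add'>text</span> with just text."""
    return SIMPLE_ADD_SPAN.sub(r"\1", text)
```

A span whose content contains any other tag is left exactly as it was, because [^<]* stops at the first < and the pattern then fails to find </span>.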
Can KDE's Kate do it?
I've no idea.
But there's an easier solution. Change the definition of class add in
your style sheet:
span.add { font-style:normal; }
Then you won't have to edit the HTML at all.
On 07/13/2024 12:55 PM, Janis Papanagnou wrote:
Please ignore my previous post - it would delete the whole span'ed
section!
It just occurred to me you'd probably want something like
/<span class='add'>
df>
/<\/span>
df>
And if you're using recording of the commands (I'll provide code
on demand) just repeat the recordings. You can also just use the
arrow keys after typing / to get the previous search patterns
if you like.
I don't know how to parse your answer.
But I suspect following some leads from Lawrence and Stan in this thread
will be illuminating. I have just started reading https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html .
Part of my motive for this project is self education.
Learning CSS is beyond my current goals.
On Sun, 14 Jul 2024 03:02:12 -0500, Richard Owlett wrote:
Learning CSS is beyond my current goals.
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
On 07/14/2024 04:15 PM, Lawrence D'Oliveiro wrote:
On Sun, 14 Jul 2024 03:02:12 -0500, Richard Owlett wrote:
Learning CSS is beyond my current goals.
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
At 80 I pursue what's interesting ;}
When I set personal goals for the spec of my project I decided on
doing it in as small as possible a sub-set of HTML 2.0.
On Sun, 14 Jul 2024 16:48:26 -0500, Richard Owlett wrote:
On 07/14/2024 04:15 PM, Lawrence D'Oliveiro wrote:
On Sun, 14 Jul 2024 03:02:12 -0500, Richard Owlett wrote:
Learning CSS is beyond my current goals.
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
At 80 I pursue what's interesting ;}
When I set personal goals for the spec of my project I decided on
doing it in as small as possible a sub-set of HTML 2.0.
To me, that's like spending your weekends rebuilding a Morris Minor.
Lawrence D'Oliveiro <ldo@nz.invalid> wrote at 21:15 this Sunday (GMT):
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
It is kinda hard for me to get a good looking website up..
On Mon, 15 Jul 2024 15:30:06 -0000 (UTC), candycanearter07 wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> wrote at 21:15 this Sunday (GMT):
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
It is kinda hard for me to get a good looking website up..
MDN is a good resource on all things Web, including CSS.
<https://developer.mozilla.org/en-US/docs/Web>
On 07/15/2024 04:59 PM, Lawrence D'Oliveiro wrote:
MDN is a good resource on all things Web, including CSS.
<https://developer.mozilla.org/en-US/docs/Web>
Appears to have useful content.
Needs at least a "Table of Contents".
On Mon, 15 Jul 2024 20:35:02 -0500, Richard Owlett wrote:
On 07/15/2024 04:59 PM, Lawrence D'Oliveiro wrote:
MDN is a good resource on all things Web, including CSS.
<https://developer.mozilla.org/en-US/docs/Web>
Appears to have useful content.
Needs at least a "Table of Contents".
That page has the links to the various contents.
On Mon, 15 Jul 2024 15:30:06 -0000 (UTC), candycanearter07 wrote:
Lawrence D'Oliveiro <ldo@nz.invalid> wrote at 21:15 this Sunday (GMT):
CSS is essentially an indispensable part of HTML at this point. If it
saves you effort, why not use it?
It is kinda hard for me to get a good looking website up..
MDN is a good resource on all things Web, including CSS.
<https://developer.mozilla.org/en-US/docs/Web>
Those do not make a "Table of Contents"!