Cut (change in question)
Sorry, I have posted this before, but I have a slight change in the
question.
I have an HTML file. The entire script is on one line. The
following is the script.
<table><tbody><tr><td class="r">Chapter 1: ................................................. </td></tr></tbody></table><p Hare Krishna ......................</p> <p Hare Rama ......................</p>
where .............. is variable text.
In the above script I want to delete the text
<table><tbody><tr><td class="r">Chapter 1: ................................................. </td></tr></tbody></table>
where ........ represents variable content.
I have 100 files with names 1.htm to 100.htm.
How can I do this using Unix commands rather than selecting the text
and deleting it?
Thanks
Santhosh
Re: Cut (change in question)
On 11/28/2007 12:23 AM, sant527 [at] gmail.com wrote:
> How can i do this using unix commands rather than selecting the text
> and deleting.
>
Depending on whether or not "<table>" or "</table>" can occur multiple times on a
line, this may be all you need:
for file in *.htm
do
    sed 's:<table>.*</table>::' "$file" > tmp &&
    mv tmp "$file"
done
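To see why repeated tags matter, here is a small sketch on a made-up line (the sample text is an assumption, not taken from the real files). sed's ".*" is greedy, so it runs from the first "<table>" to the last "</table>" on the line and takes everything in between with it:

```shell
# Hypothetical sample line containing two tables; not from the real files.
line='<table>A</table><p>keep</p><table>B</table><p>tail</p>'

# Greedy match: ".*" extends to the LAST </table> on the line,
# so the paragraph between the two tables is deleted as well.
greedy=$(printf '%s\n' "$line" | sed 's:<table>.*</table>::')
printf '%s\n' "$greedy"
```

If each line holds only one table, as in the sample posted, this greediness does no harm.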
Regards,
Ed.
Re: Cut (change in question)
On Nov 29, 8:49 am, Ed Morton <mor... [at] lsupcaemnt.com> wrote:
> Depending on whether or not "<table>" or "</table>" can occur multiple times on a
> line, this may be all you need:
>
> for file in *.htm
> do
> sed 's:<table>.*</table>::' "$file" > tmp &&
> mv tmp "$file"
> done
>
> Regards,
>
> Ed.
If "<table>" or "</table>" can occur multiple times on a line, then what
can be done?
Re: Cut (change in question)
On 11/29/2007 10:44 PM, sant527 [at] gmail.com wrote:
> "<table>" or "</table>" can occur multiple times on a line then what
> can be done
Use all of the unique text you mentioned and replace the chain of periods with ".*":
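sed 's|<table><tbody><tr><td class="r">Chapter 1:.*</td></tr></tbody></table>||'
(The delimiter is "|" rather than ":" because the pattern itself contains a colon
after "Chapter 1", which would otherwise end the pattern early.)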
If the text on either side of the ".*" can appear elsewhere on the same line,
then it's a harder problem that needs a different approach.
Ed.
Re: Cut (change in question)
On Nov 30, 9:57 am, Ed Morton <mor... [at] lsupcaemnt.com> wrote:
> Use all of the unique text you mentioned and replace the chain of periods with ".*":
>
> sed 's|<table><tbody><tr><td class="r">Chapter 1:.*</td></tr></tbody></table>||'
>
> If the text on either side of the ".*" can appear elsewhere on the same line,
> then it's a harder problem that needs a different approach.
>
> Ed.
Thank you.