Parse an XML File using Shell script
on 20.10.2005 19:13:05 by karthik.prabaharan
Hello,
Could someone help me in this regard?
I have an Input file having more than 1000 entries as below in xml
format:
I would like to have the Output file in the following format for all
the entries as shown below:
Item ID Item Revision Owning Site Release
1L2T-5K762-B 11 Own Pun
Any help is highly appreciated.
Thanks, Karthik
Re: Parse an XML File using Shell script
on 20.10.2005 20:30:31 by James
Using a Perl snippet in a shell script:
#!/bin/bash
# Usage: script xml.file
perl -e '
# Collect value/title attribute pairs and print a row
# as soon as all four titles have been seen.
@L = ("Item ID","Item Revision","Owning Site","Release");
print join("\t", @L), "\n";
while (<>) {
    $h{$2} = $1 if /value="(.+?)"\s+title="(.+?)"/;
    if (@L == grep { exists $h{$_} } @L) {
        print join("\t", @h{@L}), "\n";
        %h = ();
    }
}
' "$1"
$ script xml.file
Item ID Item Revision Owning Site Release
1L2T-5K762-B 11 Own Pub
James
Re: Parse an XML File using Shell script
on 20.10.2005 20:40:10 by Steffen Schuler
karthik.prabaharan@gmail.com wrote:
> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun
>
>
> Any help is highly appreciated.
>
> Thanks, Karthik
>
An AWK script:
#!/usr/bin/awk -f
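# Assumes lines of the form: <UserValue value="..." title="..."/>
function check(a1, a3) {
    return a1 ~ /^[ \t]*<UserValue[ \t]+value=$/ && a3 ~ /^[ \t]+title=$/
}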
BEGIN {
    FS="\""
    print "Item ID Item Revision Owning Site Release"
}
check($1, $3) && $4 ~ /^Item ID$/ {
    itemId = $2
}
check($1, $3) && $4 ~ /^Item Revision$/ {
    itemRev = $2
}
check($1, $3) && $4 ~ /^Owning Site$/ {
    ownSite = $2
}
check($1, $3) && $4 ~ /^Release$/ {
    release = $2
}
itemId != "" && itemRev != "" && ownSite != "" && release != "" {
    format = "%-14s%6s %-14s%-10s\n"
    printf(format, itemId, itemRev, ownSite, release)
    itemId = itemRev = ownSite = release = ""
}
Regards,
Steffen
Re: Parse an XML File using Shell script
on 20.10.2005 21:02:23 by William Park
karthik.prabaharan@gmail.com wrote:
> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun
>
>
> Any help is highly appreciated.
If your input data is nicely line-oriented like the above, then you can
extract 'value' and 'title' attributes using Sed, Python, Perl, or even
ordinary shell.
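For line-oriented input, that extraction can be sketched in Python roughly as follows. This is only an illustration: the layout (one element per line, with the value attribute written before the title attribute), the sample line, and the names extract_rows, TITLES and PAIR are assumptions, not something taken from the poster's file.

```python
import re

TITLES = ["Item ID", "Item Revision", "Owning Site", "Release"]

# Assumed layout: value="..." followed by title="..." on the same line.
PAIR = re.compile(r'value="([^"]*)"\s+title="([^"]*)"')


def extract_rows(lines):
    """Return one list of values per record, in TITLES order."""
    rows, row = [], {}
    for line in lines:
        m = PAIR.search(line)
        if m and m.group(2) in TITLES:
            row[m.group(2)] = m.group(1)
        # A record counts as complete once all four titles have been seen.
        if len(row) == len(TITLES):
            rows.append([row[t] for t in TITLES])
            row = {}
    return rows


# Hypothetical one-line sample; a real file would carry all four titles.
print(extract_rows(['<UserValue value="11" title="Item Revision"/>']))
```

Like the regex one-liners, this breaks as soon as an element spans several lines or the attributes appear in a different order.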
However, to handle more general syntax, you need to use an XML parser.
Expat is the first thing that comes to mind. An Expat interface is
available for Gawk and the Bash shell.
For Bash shell extension to Expat XML parser, see
http://home.eol.ca/~parkw/index.html#expat
Eg.
start()    # Usage: start tag att=value...
{
    case $1 in
        UserData)
            unset Item_ID Item_Revision Owning_Site Release
            ;;
        UserValue)
            declare "${@:2}"
            case $title in
                __ITEM_ID) Item_ID=$value ;;
                __REVISION_ID) Item_Revision=$value ;;
                Owning\ Site) Owning_Site=$value ;;
                Release) Release=$value ;;
            esac
            ;;
    esac
}
end()    # Usage: end tag
{
    case $1 in
        UserData)
            echo "$Item_ID $Item_Revision $Owning_Site $Release"
            ;;
    esac
}
echo "Item_ID Item_Revision Owning_Site Release"
expat -s start -e end < file.xml
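The same start/end callback structure can also be sketched with the Expat binding in Python's standard library (xml.parsers.expat). Treat this as a rough equivalent under assumptions: the UserData/UserValue nesting follows the shell functions above, while the root element name, the sample document, and the function name parse_records are made up for illustration.

```python
import xml.parsers.expat

TITLES = ["Item ID", "Item Revision", "Owning Site", "Release"]


def parse_records(xml_text, record_tag="UserData", field_tag="UserValue"):
    """Return one dict per record_tag element, keyed by the title attribute."""
    records = []
    current = {}

    def start(tag, attrs):
        # A new record begins: forget fields left over from the previous one.
        if tag == record_tag:
            current.clear()
        elif tag == field_tag and "title" in attrs:
            current[attrs["title"]] = attrs.get("value", "")

    def end(tag):
        # The record is closed: keep a copy of what was collected.
        if tag == record_tag:
            records.append(dict(current))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return records


# Hypothetical sample document; the <root> element name is an assumption.
sample = """<root>
  <UserData>
    <UserValue value="1L2T-5K762-B" title="Item ID"/>
    <UserValue value="11" title="Item Revision"/>
    <UserValue value="Own" title="Owning Site"/>
    <UserValue value="Pub" title="Release"/>
  </UserData>
</root>"""

print("\t".join(TITLES))
for rec in parse_records(sample):
    print("\t".join(rec.get(t, "") for t in TITLES))
```

Because a real parser does the tokenizing, attribute order and line breaks inside an element no longer matter.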
--
William Park, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
Re: Parse an XML File using Shell script
on 20.10.2005 22:14:07 by Enrique Perez-Terron
On Thu, 20 Oct 2005 19:13:05 +0200, wrote:
> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun
#! /usr/bin/perl
$format = "%-20s %15s %-14s %-10s\n";
@titles = ("Item ID", "Item Revision", "Owning Site", "Release");
printf $format, @titles;
%row = ();
while (<>) {
    chomp;
    # Assumes each field is a UserValue element on a single line.
    if (/\<UserValue\s/i) {
        ( $value ) = /\svalue\=\"(.*?)\"/i or die "UserValue with no value at line $.: $_\n";
        ( $title ) = /\stitle\=\"(.*?)\"/i or die "UserValue with no title at line $.: $_\n";
        # Compare strings with eq; == would compare them numerically.
        $row{$title} = $value if grep { $_ eq $title } @titles;
    }
    if (/\<\/part\s*\>/i) {
        printf $format, map { $row{$_} } @titles;
        %row = ();
    }
}
XML tag and attribute names are case sensitive, so you can remove the "i"
after the closing "/" of each pattern.
-Enrique
Re: Parse an XML File using Shell script
on 21.10.2005 05:23:03 by brian_hiles
karthik.prabaharan@gmail.com wrote:
> I have an Input file ... in xml format:
> ...
> Thanks, Karthik
To round out the previous excellent suggestions, here is
an XML parser already written -- and debugged! -- by
Steve Coile and Aharon Robbins:
XMLparse.awk 1.1
ftp://ftp.freefriends.org/arnold/Awkstuff/xmlparser.awk
http://groups-beta.google.com/group/comp.lang.awk/browse_frm/thread/88bd947ea7e97a6a/de2fb62778b31600#de2fb62778b31600
=Brian
Re: Parse an XML File using Shell script
on 23.10.2005 01:36:45 by brian_hiles
bsh wrote:
> karthik.prabaharan@gmail.com wrote:
> > I have an Input file ... in xml format:
Or maybe even the XMLgawk patch to gawk(1):
https://sourceforge.net/projects/xmlgawk/
http://home.vrweb.de/~juergen.kahrs/gawk/XML/
=Brian