Parse an XML File using Shell script

on 20.10.2005 19:13:05 by karthik.prabaharan

Hello,

Could someone help me with this?

I have an input file with more than 1000 entries in XML format, as
below:













I would like the output file to list all the entries in the following
format:

Item ID        Item Revision   Owning Site   Release
1L2T-5K762-B   11              Own           Pun


Any help is highly appreciated.

Thanks, Karthik

Re: Parse an XML File using Shell script

on 20.10.2005 20:30:31 by James

Using a Perl snippet in a shell script:

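#!/bin/bash
# Collect value/title pairs and print one tab-separated row per record.
perl -e '
@L = ("Item ID","Item Revision","Owning Site","Release");
print join("\t",@L), "\n";
while (<>) {
$h{$2} = $1 if (/value\=\"(.+?)\"\s+title\=\"(.+?)\"/);
# a record is complete once all four titles have been seen
if (@L == grep { exists $h{$_} } @L) {
print join("\t", @h{@L}), "\n";
%h = ();
}
}
' "$1"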

$ script xml.file
Item ID Item Revision Owning Site Release
1L2T-5K762-B 11 Own Pub

James

Re: Parse an XML File using Shell script

on 20.10.2005 20:40:10 by Steffen Schuler

karthik.prabaharan@gmail.com wrote:
> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun
>
>
> Any help is highly appreciated.
>
> Thanks, Karthik
>

An AWK script:

#!/usr/bin/awk -f
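# NOTE: the original regex was lost in archiving. This is a reconstruction that
# assumes lines of the form <UserValue value="..." title="..."/>.
function check(a1, a3) {
return a1 ~ /^[ \t]*<UserValue[ \t]+value=$/ && a3 ~ /^[ \t]+title=$/
}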
BEGIN {
FS="\""
print "Item ID Item Revision Owning Site Release"
}
check($1, $3) && $4 ~ /^Item ID$/ {
itemId = $2
}
check($1, $3) && $4 ~ /^Item Revision$/ {
itemRev = $2
}
check($1, $3) && $4 ~ /^Owning Site$/ {
ownSite = $2
}
check($1, $3) && $4 ~ /^Release$/ {
release = $2
}
itemId != "" && itemRev != "" && ownSite != "" && release != "" {
format = "%-14s%6s %-14s%-10s\n"
printf( format, itemId, itemRev, ownSite, release)
itemId = itemRev = ownSite = release = ""
}

Regards,

Steffen

Re: Parse an XML File using Shell script

on 20.10.2005 21:02:23 by William Park

karthik.prabaharan@gmail.com wrote:
> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun
>
>
> Any help is highly appreciated.

If your input data is nicely line-oriented like the above, then you can
extract 'value' and 'title' attributes using Sed, Python, Perl, or even
ordinary shell.
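
For line-oriented input of that kind, a sed-plus-shell version might look like the sketch below. The sample XML did not survive in this archive, so the layout is an assumption: one `<UserValue value="..." title="..."/>` element per line, with each record closed by a line containing `</UserData>`. The function name `extract_rows` and the `|` separator are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumed (not confirmed by the thread) input layout:
#   <UserData>
#     <UserValue value="11" title="Item Revision"/>
#     ...
#   </UserData>

extract_rows() {
    # Step 1: sed rewrites each UserValue line as "title|value" and
    #         each closing </UserData> line as the marker "END".
    # Step 2: the shell loop collects the four wanted titles and
    #         prints one tab-separated row per END marker.
    sed -n \
        -e 's/.*<UserValue[^>]*value="\([^"]*\)"[^>]*title="\([^"]*\)".*/\2|\1/p' \
        -e 's/.*<\/UserData>.*/END/p' |
    while IFS='|' read -r title value; do
        case $title in
            'Item ID')       id=$value ;;
            'Item Revision') rev=$value ;;
            'Owning Site')   site=$value ;;
            'Release')       rel=$value ;;
            END)
                printf '%s\t%s\t%s\t%s\n' "$id" "$rev" "$site" "$rel"
                id= rev= site= rel=
                ;;
        esac
    done
}
```

Call it as `extract_rows < file.xml` after printing a header line. It breaks as soon as an element spans several lines, a value contains `|` or `>`, or the attributes appear in the other order.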

However, to handle more general syntax, you need to use an XML parser.
Expat is the first thing that comes to mind. An Expat interface is
available for Gawk and for the Bash shell.

For the Bash shell extension to the Expat XML parser, see
http://home.eol.ca/~parkw/index.html#expat

Eg.

start() # Usage: start tag att=value...
{
case $1 in
UserData)
unset Item_ID Item_Revision Owning_Site Release
;;
UserValue)
declare "${@:2}"
case $title in
__ITEM_ID) Item_ID=$value ;;
__REVISION_ID) Item_Revision=$value ;;
Owning\ Site) Owning_Site=$value ;;
Release) Release=$value ;;
esac
;;
esac
}
end() # Usage: end tag
{
case $1 in
UserData)
echo "$Item_ID $Item_Revision $Owning_Site $Release"
;;
esac
}

echo "Item_ID Item_Revision Owning_Site Release"
expat -s start -e end < file.xml

--
William Park, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/

Re: Parse an XML File using Shell script

on 20.10.2005 22:14:07 by Enrique Perez-Terron

On Thu, 20 Oct 2005 19:13:05 +0200, karthik.prabaharan@gmail.com wrote:

> Hello,
>
> Could someone help me in this regard?
>
> I have an Input file having more than 1000 entries as below in xml
> format:
>
>
>
>
>
>
>
>
>
>
>
>
>
> I would like to have the Output file in the following format for all
> the entries as shown below:
>
> Item ID Item Revision Owning Site Release
> 1L2T-5K762-B 11 Own Pun

#! /usr/bin/perl

$format="%-20s %15s %-14s %-10s\n";
@titles=("Item ID", "Item Revision", "Owning Site", "Release");

printf $format, @titles;

%row = ();

while (<>) {
chomp;
if (/\<UserValue\s/i) { # original test was lost; tag name taken from the error messages below
( $value ) = /\svalue\=\"(.*?)\"/i or die "UserValue with no value at line $.: $_\n";
( $title ) = /\stitle\=\"(.*?)\"/i or die "UserValue with no title at line $.: $_\n";
$row{$title} = $value if (grep {$_ eq $title} @titles);
}
if (/\<\/part\s*\>/i) {
printf $format, map {$row{$_}} @titles;
%row = ();
}
}


XML element and attribute names are case sensitive, so you can
remove the "i" that follows the closing "/" of each pattern.

-Enrique

Re: Parse an XML File using Shell script

on 21.10.2005 05:23:03 by brian_hiles

karthik.prabaharan@gmail.com wrote:
> I have an Input file ... in xml format:
> ...
> Thanks, Karthik

To round out the previous excellent suggestions, here is an
XML parser already written -- and debugged! -- by Steve Coile
and Aharon Robbins:

XMLparse.awk 1.1
ftp://ftp.freefriends.org/arnold/Awkstuff/xmlparser.awk
http://groups-beta.google.com/group/comp.lang.awk/browse_frm/thread/88bd947ea7e97a6a/de2fb62778b31600#de2fb62778b31600

=Brian

Re: Parse an XML File using Shell script

on 23.10.2005 01:36:45 by brian_hiles

bsh wrote:
> karthik.prabaharan@gmail.com wrote:
> > I have an Input file ... in xml format:

Or maybe even the XMLgawk patch to gawk(1):

https://sourceforge.net/projects/xmlgawk/
http://home.vrweb.de/~juergen.kahrs/gawk/XML/

=Brian