Mix of English and Cyrillic Characters

Hi,

I am working on a script where I have strings that contain an English
string followed by the Cyrillic translation. For now, I am looking for a
way to strip out the Cyrillic characters and leave the English ones.
I have tried a simple regular expression such as:

$text =~ s/Surname.+/Surname/g;

which doesn't seem to match.

Any help is appreciated.

Barry


--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Barry-Home [ Sat, 02 April 2011 15:56 ] [ ID #2057492 ]

Re: Mix of English and Cyrillic Characters

I don't really know the first thing about Cyrillic, so you'll probably have
to play around with this before making it work like you want it to. It makes
use of Unicode character properties, which you can start learning from
perluniprops[0]:

$text =~ s/[\p{Cyrillic}\p{Block: Cyrillic}\p{Block: Cyrillic_Extended_A}\p{Block: Cyrillic_Extended_B}\p{Block: Cyrillic_Supplement}]+//g;
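
For anyone who wants to try the idea end to end, here is a minimal sketch rather than a definitive solution. It assumes the string has already been decoded into Perl characters (here via `use utf8` on a literal), and it uses only the script property `\p{Cyrillic}` for brevity. The sample string and the helper name `strip_cyrillic` are made up for illustration and do not come from the original question.

```perl
use strict;
use warnings;
use utf8;                               # this source file contains UTF-8 literals
binmode(STDOUT, ':encoding(UTF-8)');    # encode characters when printing

# Hypothetical sample: an English label followed by its Cyrillic translation.
my $text = "Surname Фамилия";

# Hypothetical helper, not from the thread: remove Cyrillic characters
# and the whitespace left dangling at the end of the string.
sub strip_cyrillic {
    my ($s) = @_;
    $s =~ s/\p{Cyrillic}+//g;    # drop every run of Cyrillic-script characters
    $s =~ s/\s+$//;              # trim trailing whitespace
    return $s;
}

print strip_cyrillic($text), "\n";
```

If the Cyrillic text sits between English words rather than at the end, the substitution will leave the surrounding spaces in place, so some extra whitespace cleanup may be needed.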

Brian.

[0] http://perldoc.perl.org/perluniprops.html

Brian Fraser [ Sun, 03 April 2011 05:27 ] [ ID #2057569 ]

Re: Mix of English and Cyrillic Characters

>>>>> "Barry-Home" == Barry-Home <rumpole6 [at] comcast.net> writes:

Barry-Home> I am working on a script where I have strings that contain an English string
Barry-Home> followed by the Cyrillic translation.

"perldoc perluniintro" would be a good start, since you're gonna be
knee-deep in Unicode issues. And if you have it, "perlunitut" and
"perlunifaq".
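
One issue those documents cover is the difference between raw bytes and decoded characters. The sketch below is only an illustration under assumptions: it supposes the input arrives as UTF-8 encoded bytes (the actual encoding of the data in question is not stated in this thread), and the sample string is invented. It shows that a Unicode property such as `\p{Cyrillic}` does not match until the bytes are decoded with the core `Encode` module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed input: the raw UTF-8 bytes of "Surname " followed by a Cyrillic
# word, as they might arrive from a filehandle opened without an encoding layer.
my $bytes = "Surname \xD0\xA4\xD0\xB0\xD0\xBC\xD0\xB8\xD0\xBB\xD0\xB8\xD1\x8F";

# Against undecoded bytes, each byte is treated as its own code point
# (0xD0, 0xA4, ...), none of which is a Cyrillic character.
my $matched_raw = ($bytes =~ /\p{Cyrillic}/) ? 1 : 0;

# After decoding, the string holds real characters and the property matches.
my $chars           = decode('UTF-8', $bytes);
my $matched_decoded = ($chars =~ /\p{Cyrillic}/) ? 1 : 0;

printf "raw:     length %d, Cyrillic match: %d\n", length($bytes), $matched_raw;
printf "decoded: length %d, Cyrillic match: %d\n", length($chars), $matched_decoded;
```

When reading from a file, opening it with an encoding layer such as `open(my $fh, '<:encoding(UTF-8)', $path)` performs the same decoding on the way in.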

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn [at] stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion

merlyn [ Mon, 04 April 2011 17:31 ] [ ID #2057592 ]