    my @folder = ('s,c%','c__pp_p','Monday_øå_Tuesday, Wednesday','Monday & Tuesday','Monday_Tuesday___Wednesday');
    if ($folder =~ s/[^\w_*-]/_/g ) {
        $folder =~ s/_+/_/g;
        print "$folder : Got %\n" ;
    }
Using the above code, I am not able to handle this: “Monday_øå_Tuesday_Wednesday”
The output should be:
    s_c
    c_pp_p
    Monday_øå_Tuesday_Wednesday
    Monday_Tuesday
    Monday_Tuesday_Wednesday
Answer
You can use \W to negate the \w character class, but the problem you’ve got is that \w doesn’t match your non-ASCII letters.
So you need to do something like this instead:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @folder = ('s,c%','c__pp_p','Monday_øå_Tuesday, Wednesday','Monday & Tuesday','Monday_Tuesday___Wednesday');

    s/[^\p{Alpha}]+/_/g for @folder;

    print Dumper \@folder;
Outputs:
    $VAR1 = [
              's_c_',
              'c_pp_p',
              'Monday_øå_Tuesday_Wednesday',
              'Monday_Tuesday',
              'Monday_Tuesday_Wednesday'
            ];
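This uses a Unicode property (these are documented in perldoc perluniprop), but the long and short of it is that \p{Alpha} is the Unicode alphabetic set, so much like \w but internationalised. Unlike \w, it does not include digits or the underscore; \p{Alnum} is the variant that adds digits.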
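A rough sketch of why \w appears to fail in the question, assuming the script is saved as UTF-8 and run without "use utf8", so the folder names reach the regex as undecoded bytes (the post does not say, so this is a guess). It also compares \p{Alpha}, \p{Alnum} and \w on a small sample. The count_matches helper is hypothetical and exists only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;    # decode the literals in this file as UTF-8 characters

my $chars = 'øå';        # two characters
my $bytes = $chars;
utf8::encode($bytes);    # the same text as four undecoded UTF-8 bytes

# Hypothetical helper: how many times does $re match in $str?
sub count_matches {
    my ($str, $re) = @_;
    my $n = () = $str =~ /$re/g;
    return $n;
}

my $w_chars = count_matches($chars, qr/\w/);    # decoded text: Unicode rules apply
my $w_bytes = count_matches($bytes, qr/\w/);    # raw bytes: only ASCII counts as \w

my $sample = 'a1_ø';
my $alpha  = count_matches($sample, qr/\p{Alpha}/);    # letters only
my $alnum  = count_matches($sample, qr/\p{Alnum}/);    # letters and digits
my $word   = count_matches($sample, qr/\w/);           # letters, digits, underscore

print "\\w on characters: $w_chars, \\w on bytes: $w_bytes\n";
print "Alpha: $alpha, Alnum: $alnum, \\w: $word\n";
```

On decoded text \w does match ø and å, so "use utf8" (or decoding input with the Encode module) may be the more direct fix, depending on where the names actually come from.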
Although, the output does have a trailing _ on the first line. From your description, that seems to be what you wanted. If not, then it’s probably easier to:
    s/_$// for @folder;
than make a more complicated pattern.
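Putting the two substitutions together, a minimal sketch that should reproduce the output asked for in the question. It assumes the script is saved as UTF-8 (hence "use utf8") and that output goes to a UTF-8 terminal. The clean_name helper is a hypothetical name, not something from the original post, and it trims leading underscores as well as trailing ones.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                              # the literals below contain UTF-8 text
binmode STDOUT, ':encoding(UTF-8)';    # encode characters when printing

my @folder = ('s,c%', 'c__pp_p', 'Monday_øå_Tuesday, Wednesday',
              'Monday & Tuesday', 'Monday_Tuesday___Wednesday');

# Hypothetical helper, not from the original post.
sub clean_name {
    my ($name) = @_;
    $name =~ s/[^\p{Alpha}]+/_/g;    # each run of non-letters becomes one underscore
    $name =~ s/^_+|_+$//g;           # drop any leading or trailing underscores
    return $name;
}

my @clean = map { clean_name($_) } @folder;
print "$_\n" for @clean;
```

Because \p{Alpha} excludes digits, a name such as "folder2" would lose its digit here; swapping in \p{Alnum} is one option if digits should survive.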