I am trying to replace part of each filename based on a matching string from another file. The filenames are in the following format:
36872_20190806_00.csv 40800_20190806_00.csv 41883_20190806_00.csv 38064_20190806_00.csv 40848_20190806_00.csv 41891_20190806_00.csv 38341_20190806_00.csv 40856_20190806_00.csv 41923_20190806_00.csv 40417_20190806_00.csv 40948_20190806_00.csv 44373_20190806_00.csv 40745_20190806_00.csv 41217_20190806_00.csv 45004_20190806_00.csv 40754_20190806_00.csv 41256_20190806_00.csv
where the digits before the first _ represent the station code, which I want to replace with the station name from another file named radiosonde.csv. For example, I want to
change 36872_20190806_00.csv to ALMATY_20190806_00.csv
change 38064_20190806_00.csv to KYZYLORDA_20190806_00.csv
The data in radiosonde.csv is as given below:
CODE,LAT,LON,Elevation,STN_NAME
41620,31.35,69.467,1407,ZHOB
41600,32.5,74.5333,255,SIALKOT
41598,32.9333,73.7167,232,JHELUM
41594,32.05,72.667,188,SARGODHA
41571,33.6167,73.1,507,ISLAMABAD_AIRPORT
41560,33.8667,70.0833,1725,PARACHINAR
41529,34.0333,71.9333,329,PESHAWAR
41516,35.9167,74.3333,1453,GILGIT
41515,35.5667,71.7833,1464,DROSH
41506,35.9217,71.8,1499,CHITRAL
41316,17.0439,54.1022,23,SALALAH_AIRPORT
41288,20.667,58.9,19,MASIRAH
41256,23.5953,58.2983,8.4,MUSCAT_INTL_AIRPORT
41217,24.4333,54.65,16,ABU_DHABI_INTL_AIRPOR
41169,25.2731,51.6081,4,HAMAD_INTL_AIRPORT
40990,31.5,65.85,1010,KANDAHAR_AIRPORT
40948,34.55,69.2167,1791,KABUL_AIRPORT
40938,34.217,62.217,977,HERAT
40913,36.6667,68.9167,433,KUNDUZ
40911,36.7,67.2,378,MAZAR-I-SHARIF
40875,27.2167,56.3667,10,BANDARABBASS
40856,29.4667,60.8833,1370,ZAHEDAN
40848,29.5333,52.6,1484,SHIRAZ
40841,30.25,56.9667,1748,KERMAN
40821,31.9,54.2833,1238,YAZD
40811,31.3333,48.6667,20,AHWAZ
40809,32.8667,59.2,1491,BIRJAND
40800,32.5175,51.7061,1550.4,ESFAHAN
40754,35.6833,51.3167,1204,TEHRAN-MEHRABAD
40745,36.2667,59.6333,999,MASHHAD
40427,26.267,50.617,2,BAHRAIN
40417,26.45,49.8167,22,KING_FAHD_INTL_AIRPORT
40416,26.267,50.167,19,DHAHRAN
3992,10.83,106.97,11,AN_LOC
38989,35.9,62.9667,375,TAGTABAZAR
38954,37.5,71.5,2077,KHOROG
38927,37.233,67.267,310,TERMEZ
38880,37.987,58.361,211,ASHGABAT_KESHI
38836,38.55,68.783,800,DUSHANBE
38750,37.467,53.967,-22,ESENGYLY
38687,39.083,63.6,190,CHARDZHEV
38613,40.917,72.95,765,DZHALAL-ABAD
38606,40.55,70.95,499,KOKAND
38599,40.217,69.733,427,KHUDJAND
38507,40.0333,52.9833,90,TURKMENBASHI
38457,41.267,69.267,493,TASHKENT
38413,41.733,64.617,237,TAMDY
38392,41.833,59.983,87,DASHKHOVUZ
38353,42.833,74.583,760,BISHKEK
38341,42.85,71.3,652,TARAZ
38064,44.7667,65.5167,133.4,KYZYLORDA
38001,44.55,50.25,-25,FORT SHEVCHENKO
37985,38.733,48.833,-11,LANKARAN
37860,40.5333,50,27,MASHTAGA
36974,41.433,76,2041,NARYN
36872,43.3633,77.0042,662.7,ALMATY
36859,44.167,80.067,645,ZHARKENT
3369,22.77,88.37,0,BARAKPUR
3368,25.88,89.43,0,LALMANIR_HAT
I looked into this question. As suggested there, I tried:
sort -r radiosonde.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
It worked partly: it renamed some files, left a few as they were, and gave these errors:
bash: line 25: unexpected EOF while looking for matching `''
bash: line 113: syntax error: unexpected end of file
I don't understand why it behaves so strangely with some files. If I take those filenames and put them into another file, say test.csv, and use the above command again, i.e.
sort -r test.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
then it renames all the files that were left earlier. Is there any way to do this using a shell script? I tried the following script, but it didn't work:
for file in *00.csv ; do mv $files ${files/" $1 "/" $5 "}; done < radiosonde.csv
Answer
What about this:
Make sure that the radiosonde.csv file and all the csv files that you want to rename are in the same directory.
$ cd <directory of radiosonde.csv, 36872_20190806_00.csv, 38064_20190806_00.csv and so on...>
$ ls *.csv > .tmp; awk -F ',' '{name[$1]=$5}END{for(;(getline filename < ".tmp")>0;){ori=filename;sub(/_.+$/,"",filename);pre=filename;sub(/^[0-9]+/,"",ori);post=ori;if(name[pre]!="")system("mv " pre post " " name[pre] post)}} ' 'radiosonde.csv'
$ rm -f '.tmp'
Explanation:
ls *.csv > .tmp
-> List all csv files in the current directory and write the list into .tmp.
awk -F ','
-> Set , (comma) as the field separator for awk, because we want to split lines like 41620,31.35,69.467,1407,ZHOB into separate fields. Then we can get them via $1, $2, $3 and so on.
'{ ... }END{}'
-> These are awk's blocks. The first block runs for each line read from the input file, and the END block is executed once, after all input has been read and before the awk program exits.
'radiosonde.csv'
-> Set this as the input file for awk to read.
{name[$1]=$5}
-> $1 is the first field and $5 is the fifth. In this case $1 would be 41620, 41600 and so on, and $5 would be ZHOB, SIALKOT and so on. name is an array. When we read the first line (the header), we set name[CODE]=STN_NAME, and for the second line name[41620]=ZHOB.
END{}
-> After we have set all the variables we need, we rename the files, and END{} is the block we can use for that purpose.
for(;(getline filename < ".tmp")>0;) {}
-> This reads the .tmp file, which contains the list of files that we want to rename, one line at a time.
ori=filename;
-> Copy the variable filename to another variable. We do this because sub() alters the variable it is given, and we still need the original value to get the remaining part of the filename.
sub(/_.+$/,"",filename);
-> Remove the characters we don't want, in this case everything from the first _ to the end. For example, if filename is 41620_20190806_00.csv, _20190806_00.csv will be removed and filename will become 41620.
pre=filename;
-> Copy filename to another variable called pre for clarity.
sub(/^[0-9]+/,"",ori);
-> This removes the leading digits, so ori becomes _20190806_00.csv.
post=ori;
-> Copy ori to another variable, in this case post.
if(name[pre]!="")
-> Because radiosonde.csv will be listed inside .tmp and is not one of the files that we want to rename, we need this if statement so that the next command does not produce an error. name[radiosonde.csv] will be empty.
system("mv " pre post " " name[pre] post)
-> This statement renames the file. If pre is 41620 and post is _20190806_00.csv, it translates into the command "mv 41620_20190806_00.csv ZHOB_20190806_00.csv".
rm -f '.tmp'
-> Delete the .tmp file because we don't need it anymore.
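The steps above can be previewed before any file is touched. The following is a minimal sketch, not the exact command from this answer: it feeds the file list to awk on standard input instead of through .tmp, and it prints each mv command rather than running it. The scratch directory, the two-row radiosonde.csv and the file 99999_20190806_00.csv are made up for the demonstration.

```shell
# Demonstration setup (hypothetical): a scratch directory with sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
  '36872,43.3633,77.0042,662.7,ALMATY' \
  '38064,44.7667,65.5167,133.4,KYZYLORDA' > radiosonde.csv
touch 36872_20190806_00.csv 38064_20190806_00.csv 99999_20190806_00.csv

# Dry run: build the code -> name table, then print one mv command per match.
plan=$(ls *.csv | awk -F ',' '
  NR == FNR { name[$1] = $5; next }      # first input (radiosonde.csv): fill the table
  {
    pre = $0;  sub(/_.+$/, "", pre)      # station code before the first _
    post = $0; sub(/^[0-9]+/, "", post)  # the rest, starting at the first _
    if (name[pre] != "") print "mv " pre post " " name[pre] post
  }' radiosonde.csv -)
printf '%s\n' "$plan"
```

Files whose code is not in the table, and radiosonde.csv itself, produce no line, which is the same job the if statement does in the original command.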
Ignore my comment below. We do need the if statement.
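Since the question asks for a shell script, here is a rough sketch of the same idea as a plain shell loop, without awk. It is an alternative under assumptions, not a tested replacement: it assumes radiosonde.csv has exactly the five columns shown in the question and Unix line endings. The scratch directory and sample files are made up for the demonstration.

```shell
# Demonstration setup (hypothetical): a scratch directory with sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
  '38064,44.7667,65.5167,133.4,KYZYLORDA' \
  '38001,44.55,50.25,-25,FORT SHEVCHENKO' > radiosonde.csv
touch 38064_20190806_00.csv 38001_20190806_00.csv

# Read one CSV row at a time; IFS=, splits it into the five columns.
# The "|| [ -n "$code" ]" part keeps a last row that lacks a trailing newline.
while IFS=, read -r code lat lon elev name || [ -n "$code" ]; do
  [ "$code" = "CODE" ] && continue        # skip the header row
  for f in "${code}"_*.csv; do
    [ -e "$f" ] || continue               # no file exists for this station
    mv -- "$f" "${name}${f#"$code"}"      # swap the code prefix for the name
  done
done < radiosonde.csv
ls
```

The glob requires an underscore directly after the code, so a short code such as 3992 cannot match a file that belongs to a longer code. Quoting the variables lets a name that contains a space, such as FORT SHEVCHENKO, pass through mv as a single argument.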