I am trying to replace part of each filename based on a matching string from another file. The filenames are in the following format:
36872_20190806_00.csv  40800_20190806_00.csv  41883_20190806_00.csv
38064_20190806_00.csv  40848_20190806_00.csv  41891_20190806_00.csv
38341_20190806_00.csv  40856_20190806_00.csv  41923_20190806_00.csv
40417_20190806_00.csv  40948_20190806_00.csv  44373_20190806_00.csv
40745_20190806_00.csv  41217_20190806_00.csv  45004_20190806_00.csv
40754_20190806_00.csv  41256_20190806_00.csv
where the digits before the first _ represent the station code, which I want to replace with the station name from another file named radiosonde.csv. For example, I want to:
change 36872_20190806_00.csv to ALMATY_20190806_00.csv
change 38064_20190806_00.csv to KYZYLORDA_20190806_00.csv
The contents of radiosonde.csv are given below:
CODE,LAT,LON,Elevation,STN_NAME
41620,31.35,69.467,1407,ZHOB
41600,32.5,74.5333,255,SIALKOT
41598,32.9333,73.7167,232,JHELUM
41594,32.05,72.667,188,SARGODHA
41571,33.6167,73.1,507,ISLAMABAD_AIRPORT
41560,33.8667,70.0833,1725,PARACHINAR
41529,34.0333,71.9333,329,PESHAWAR
41516,35.9167,74.3333,1453,GILGIT
41515,35.5667,71.7833,1464,DROSH
41506,35.9217,71.8,1499,CHITRAL
41316,17.0439,54.1022,23,SALALAH_AIRPORT
41288,20.667,58.9,19,MASIRAH
41256,23.5953,58.2983,8.4,MUSCAT_INTL_AIRPORT
41217,24.4333,54.65,16,ABU_DHABI_INTL_AIRPOR
41169,25.2731,51.6081,4,HAMAD_INTL_AIRPORT
40990,31.5,65.85,1010,KANDAHAR_AIRPORT
40948,34.55,69.2167,1791,KABUL_AIRPORT
40938,34.217,62.217,977,HERAT
40913,36.6667,68.9167,433,KUNDUZ
40911,36.7,67.2,378,MAZAR-I-SHARIF
40875,27.2167,56.3667,10,BANDARABBASS
40856,29.4667,60.8833,1370,ZAHEDAN
40848,29.5333,52.6,1484,SHIRAZ
40841,30.25,56.9667,1748,KERMAN
40821,31.9,54.2833,1238,YAZD
40811,31.3333,48.6667,20,AHWAZ
40809,32.8667,59.2,1491,BIRJAND
40800,32.5175,51.7061,1550.4,ESFAHAN
40754,35.6833,51.3167,1204,TEHRAN-MEHRABAD
40745,36.2667,59.6333,999,MASHHAD
40427,26.267,50.617,2,BAHRAIN
40417,26.45,49.8167,22,KING_FAHD_INTL_AIRPORT
40416,26.267,50.167,19,DHAHRAN
3992,10.83,106.97,11,AN_LOC
38989,35.9,62.9667,375,TAGTABAZAR
38954,37.5,71.5,2077,KHOROG
38927,37.233,67.267,310,TERMEZ
38880,37.987,58.361,211,ASHGABAT_KESHI
38836,38.55,68.783,800,DUSHANBE
38750,37.467,53.967,-22,ESENGYLY
38687,39.083,63.6,190,CHARDZHEV
38613,40.917,72.95,765,DZHALAL-ABAD
38606,40.55,70.95,499,KOKAND
38599,40.217,69.733,427,KHUDJAND
38507,40.0333,52.9833,90,TURKMENBASHI
38457,41.267,69.267,493,TASHKENT
38413,41.733,64.617,237,TAMDY
38392,41.833,59.983,87,DASHKHOVUZ
38353,42.833,74.583,760,BISHKEK
38341,42.85,71.3,652,TARAZ
38064,44.7667,65.5167,133.4,KYZYLORDA
38001,44.55,50.25,-25,FORT SHEVCHENKO
37985,38.733,48.833,-11,LANKARAN
37860,40.5333,50,27,MASHTAGA
36974,41.433,76,2041,NARYN
36872,43.3633,77.0042,662.7,ALMATY
36859,44.167,80.067,645,ZHARKENT
3369,22.77,88.37,0,BARAKPUR
3368,25.88,89.43,0,LALMANIR_HAT
I looked into this question. As suggested there, I tried :
sort -r radiosonde.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
It did work in some sense. It renamed some files, left a few as they were, and gave these errors:
bash: line 25: unexpected EOF while looking for matching `''
bash: line 113: syntax error: unexpected end of file
I don’t understand why it behaves so strangely with some files. If I take those filenames, put them into another file, say test.csv, and run the above command again, i.e.
sort -r test.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
then it renames all the files that were left earlier. Is there any way to do this with a shell script? I tried the following script, but it didn’t work:
for file in *00.csv ; do
mv $files ${files/" $1 "/" $5 "};
done < radiosonde.csv
Answer
What about this:
Make sure that radiosonde.csv is in the same directory as all the csv files that you want to rename.
$ cd <directory of radiosonde.csv, 36872_20190806_00.csv, 38064_20190806_00.csv and so on...>
$ ls *.csv > .tmp; awk -F ',' '{name[$1]=$5}END{for(;(getline filename < ".tmp")>0;){ori=filename;sub(/_.+$/,"",filename);pre=filename;sub(/^[0-9]+/,"",ori);post=ori;if(name[pre]!="")system("mv " pre post " " name[pre] post)}} ' 'radiosonde.csv'
$ rm -f '.tmp'
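If you want to see what would be renamed before touching anything, a dry run along these lines may help. It is only a sketch of the same awk approach with `system(...)` swapped for `print`; the scratch directory, the three sample files (including the made-up code 99999) and the variable name `planned` are assumptions added for the demo, not part of the original command.

```bash
#!/usr/bin/env bash
# Dry-run sketch: build the mv commands but only print them.
# The scratch directory and sample files below are demo assumptions.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
  '36872,43.3633,77.0042,662.7,ALMATY' \
  '38064,44.7667,65.5167,133.4,KYZYLORDA' > radiosonde.csv
touch 36872_20190806_00.csv 38064_20190806_00.csv 99999_20190806_00.csv

ls *.csv > .tmp
planned=$(awk -F ',' '
  { name[$1] = $5 }                               # station code -> station name
  END {
    while ((getline filename < ".tmp") > 0) {
      pre = filename;  sub(/_.+$/, "", pre)       # part before the first "_"
      post = filename; sub(/^[0-9]+/, "", post)   # remainder of the file name
      if (name[pre] != "")                        # skip files with no known code
        print "mv " pre post " " name[pre] post
    }
    close(".tmp")
  }' radiosonde.csv)
rm -f .tmp
echo "$planned"
```

Files whose code is not in radiosonde.csv, and radiosonde.csv itself, produce no line. Once the printed commands look right, you can go back to the `system(...)` version above.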
Explanation:
ls *.csv > .tmp -> Lists all csv files in the current directory and writes the names into .tmp.
awk -F ',' -> Sets , (comma) as the field separator for awk, because we want to split lines like 41620,31.35,69.467,1407,ZHOB into separate fields. We can then get them via $1, $2, $3 and so on.
'{ ... }END{}' -> These are awk’s blocks. The first block runs for every line of the input file; the END block runs once after all input has been read, just before awk exits.
'radiosonde.csv' -> Sets this as the input file for awk to read.
'{name[$1]=$5}' -> $1 is the first field and $5 is the fifth. In this case $1 would be 41620, 41600 and so on, and $5 would be ZHOB, SIALKOT, etc. name is an array. When we read the first line (the header) we set name[CODE]=STN_NAME, and for the second line name[41620]=ZHOB.
END{} -> After we have set all the variables we need, we have to rename the files, and END{} is one of the blocks we can use for that purpose.
for(;(getline filename < ".tmp")>0;) {} -> Reads the .tmp file, which contains the list of files that we want to rename, one name per iteration.
ori=filename; -> Copies filename into another variable. We are about to call sub(), which alters the variable it is given, but we still need the original name to get the remaining part of the filename.
sub(/_.+$/,"",filename); -> Removes the characters that we don’t want, in this case everything from the first _ to the end. For example, if filename is 41620_20190806_00.csv, _20190806_00.csv is removed and filename becomes 41620.
pre=filename; -> Copies filename into another variable called pre, for clarity.
sub(/^[0-9]+/,"",ori); -> Removes the leading digits, so ori becomes _20190806_00.csv.
post=ori; -> Copies ori into another variable, in this case post.
if(name[pre]!="") -> Because radiosonde.csv is also listed in .tmp and is not one of the files that we want to rename, we need this if statement so that the next command does not produce an error. radiosonde.csv contains no _, so pre stays radiosonde.csv and name[pre] is empty.
system("mv " pre post " " name[pre] post) -> This statement renames the file. If pre is 41620 and post is _20190806_00.csv, it translates into "mv 41620_20190806_00.csv ZHOB_20190806_00.csv".
rm -f '.tmp' -> Deletes the .tmp file because we don’t need it anymore.
Ignore my comment below. We do need the if statement.
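Since the question also asks for a plain shell script, here is a minimal sketch of the same lookup done in bash alone. It assumes bash 4 or newer (for associative arrays) and a radiosonde.csv with Unix line endings; the scratch directory and sample files are demo assumptions, and this is a different technique from the awk one-liner above, not a rewrite of it.

```bash
#!/usr/bin/env bash
# Pure-bash sketch (assumes bash 4+ for "declare -A").
# The scratch directory and sample files below are demo assumptions.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'CODE,LAT,LON,Elevation,STN_NAME' \
  '36872,43.3633,77.0042,662.7,ALMATY' \
  '38001,44.55,50.25,-25,FORT SHEVCHENKO' > radiosonde.csv
touch 36872_20190806_00.csv 38001_20190806_00.csv 99999_20190806_00.csv

# Build the lookup table: station code -> station name.
declare -A name
while IFS=, read -r code _ _ _ stn; do
  [[ -n $code ]] || continue          # ignore blank lines
  name[$code]=$stn
done < radiosonde.csv

# Rename every file whose leading code is in the table.
for file in *_*.csv; do
  code=${file%%_*}                    # text before the first "_"
  rest=${file#"$code"}                # "_20190806_00.csv"
  if [[ -n ${name[$code]:-} ]]; then
    mv -- "$file" "${name[$code]}$rest"
  fi
done
```

Because every expansion is quoted and no command line is generated as text, station names containing spaces (such as FORT SHEVCHENKO in the data above) are passed to mv intact. The glob `*_*.csv` skips radiosonde.csv because it has no underscore.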