Skip to content
Advertisement

Use awk to replace one column value with another column value if the first letter is not a alphabet letter

I have the following csv file:

ID                Name          Gender        NID
Y12               Jim              M          C12
Y23               David            M          C23
234               Bob              M          C22
2b1               Lucy             F          C24
2bb               Lily             F          C25

What I want to do is to replace ID colunm’s value with NID value if the first letter in ID is not a alphabet letter. So the output is expected to be

ID                Name          Gender        NID
Y12               Jim              M          C12
Y23               David            M          C23
C22               Bob              M          C22
C24               Lucy             F          C24
C25               Lily             F          C25

This is just a sample. My original data size is much bigger so I have to use awk. The key issue is how to specify the condition. I am not clear how to deal with. The code I was thinking of is like awk -F, 'if $1 != ..., $1=$4'. I also googled but failed to find an appropriate solution. Here is a link which I think may be helpful. Filter lines that have only alphabets in first column. Thank you.

Advertisement

Answer

Use a regular expression.

awk '$1 ~ /^[^A-Z]/ { $1 = $4 }1' filename

DEMO

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement