cat file1.txt
abc bcd abc
abcd bcde cdef
abcd bcde cdef
abcd bcde cdef
efg fgh
efg fgh
hig
My expected result is as follows:
abc bcd abc
abcd bcde cdef
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh
<!!! pay attention, above sentence has repeated 2 times !!!>
hig
I have found a way to do this, but my code is rather noisy:
cat file1.txt | uniq -c | sed -e 's/ \+/ /g' -e 's/^.//g' | awk '{print $0," ",$1}' | sed -e 's/^[2-9] /\n/g' -e 's/^[1] //g' | sed -e 's/[^1]$/\n<!!! pay attention, above sentence has repeated & times !!!> \n/g' -e 's/[1]$//g'
abc bcd abc
abcd bcde cdef
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh
<!!! pay attention, above sentence has repeated 2 times !!!>
hig
I was wondering if you could show me a more efficient way to achieve this. Thanks a lot.
Answer
If your lines are not already grouped, then you could use:
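awk '
    # First pass over the file: count how often each line occurs.
    NR == FNR {count[$0]++; next}
    # Second pass: print each distinct line once, plus a notice if it repeats.
    !seen[$0]++ {
        print
        if (count[$0] > 1)
            print "... repeated", count[$0], "times"
    }
' file1.txt file1.txt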
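This keeps every distinct line in memory, so it will consume a lot of memory if your file is very large. In that case you might want to sort the file first, so that identical lines become adjacent.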
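If the duplicates are already adjacent, as they are in the sample file1.txt (or after sorting), a single pass that only remembers the previous line may be enough. The following is a minimal sketch under that assumption, not a tested drop-in: the function names annotate_runs and report are hypothetical, and the notice text is copied from the question.

```shell
# Hypothetical helper: annotate runs of identical adjacent lines.
# Assumes duplicates are adjacent; it does NOT detect repeats that are
# separated by other lines.
annotate_runs() {
    awk '
        # Print the notice for the run that just ended, if it had repeats.
        function report() {
            if (n > 1)
                print "<!!! pay attention, above sentence has repeated " n " times !!!>"
        }
        # Same line as the previous one: only count it, do not print it again.
        NR > 1 && $0 == prev { n++; next }
        # A new run starts: report on the old run, then print the new line once.
        { report(); print; prev = $0; n = 1 }
        # The last run still needs its notice at end of input.
        END { report() }
    ' "$@"
}
```

Calling annotate_runs file1.txt on the sample input should give the six lines of output shown in the question. With no file argument the function reads standard input, so sort file1.txt | annotate_runs is one option for ungrouped input, at the cost of losing the original line order.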