
How to remove certain lines from a large file (>5G) using Linux commands

I have very large files (> 5G), and I want to remove some lines by line number without making a copy of the file.

I know this command works for a small file (my sed does not recognize the -i option):

sed "${line}d" file.txt > file.tmp && mv file.tmp file.txt

This command takes a relatively long time because of the file size. I just need to remove the first line and the last line, but I would also like to know how to remove line number n, for example.


Answer

Because of the way files are stored on standard filesystems (NTFS, ext4, …), you generally cannot remove data from the beginning or the middle of a file in place.

The only things you can do in place are:

  • append to the end of a file (append mode)
  • modify data in a file (read-write mode)
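
To make the two in-place operations concrete, here is a minimal sketch. The helper name `overwrite_at` is hypothetical, and it assumes a POSIX-style `dd` that supports `conv=notrunc`. It overwrites bytes at a given offset without changing the file size; appending is just the shell's `>>` redirection.

```shell
# overwrite_at FILE OFFSET DATA
# Hypothetical helper: writes DATA over the bytes starting at byte OFFSET.
# conv=notrunc keeps dd from truncating the file, so its size is unchanged
# as long as DATA does not run past the current end of the file.
overwrite_at() {
    file=$1
    offset=$2
    data=$3
    printf '%s' "$data" | dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Appending needs no helper: '>>' opens the file in append mode.
# printf 'one more line\n' >> file.txt
```

Neither operation can make earlier lines disappear; they only replace bytes or add new ones at the end.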

Other operations must use a temporary file, or temporary memory to read the file fully and write it back modified.
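
The temporary-file route could be sketched as follows. The function name `delete_lines` is hypothetical, and the sketch assumes `mktemp` accepts a template argument (as the GNU and BSD versions do). The whole file is still read and rewritten once, so this does not avoid the cost; it only does all deletions in a single pass.

```shell
# delete_lines FILE SED_SCRIPT
# Hypothetical helper: streams FILE through sed into a temporary file created
# next to it, then renames the temporary file over the original.
# Note: the result takes the temporary file's permissions (mktemp uses 0600).
delete_lines() {
    file=$1
    script=$2
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if sed "$script" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        # Leave the original untouched if sed failed.
        rm -f "$tmp"
        return 1
    fi
}

# delete_lines file.txt '1d;$d'    # drop the first and the last line in one pass
# delete_lines file.txt "${n}d"    # drop line number n
```

Combining `1d` and `$d` in one sed script means the file is copied once instead of twice.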

EDIT: you can also “shrink” (truncate) a file, as described here, using a C program (this works on Linux and Windows). That means you can remove the last line in place, but still not the first line or any line in between.
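
As an alternative to writing a C program, the same shrinking idea can be sketched in shell. This assumes the `truncate` utility from GNU coreutils is available (it is not part of POSIX), and the function name `remove_last_line` is hypothetical. Only the end of the file is read, and nothing before the cut is rewritten.

```shell
# remove_last_line FILE
# Hypothetical helper: shrinks FILE by the byte length of its last line.
remove_last_line() {
    file=$1
    # Byte length of the last line, including its trailing newline if present.
    # The arithmetic expansion strips any padding that wc may print.
    last_len=$(( $(tail -n 1 "$file" | wc -c) ))
    # A leading '-' tells GNU truncate to reduce the size by that many bytes.
    truncate -s "-$last_len" "$file"
}

# remove_last_line file.txt
```

On a regular file, `tail -n 1` typically seeks from the end rather than scanning the whole file, so this should stay fast even for multi-gigabyte inputs.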
