Skip to content
Advertisement

If I want to empty a directory, is there any reason I shouldn’t remove it and recreate It?

I’ve looked at these two threads for emptying a directory:

However, the functions for emptying a directory given as answers to those threads are very complex, and one answer can encounter a stack overflow.

When I think of removing and adding the directory, I’m thinking of doing it in C, and on a Linux platform. However, if you want to also explain the reasoning behind this on Windows, I’ll appreciate that as well. This is the code I would implement:

system( "rm -rf /some/directory" )
mkdir("/some/directory", 0700);

This seems so much simpler. What reason would I not use it? Is it far less efficient than it appears?

Advertisement

Answer

The rm -fr /some/directory command has to do the recursive work for you. Using the command is a lot simpler than writing your own code to do the same job — it embodies the virtue of laziness and exploits code reuse on the program scale. (You can’t use the rmdir() system call on a directory unless it is already empty.). So that’s not wholly unreasonable.

One issue is security: can someone place an alternative to the system’s rm command on your path? On Unix, are /bin and /usr/bin first in your path, or do other directories come first?

If you decide that using /bin/rm (or /usr/bin/rm) is safe enough, that might be a better choice than an unadorned rm, but overall, that isn’t too bad.

Another issue is a different aspect of security — can you actually remove and create the directory? Can you write in /some or not? And should you keep the current owner, group, permissions of /some/directory if you remove and recreate it? After those operations, the directory will be owned by the effective UID of the process; if will belong to group with the effective GID of the process (unless there’s a sticky bit set on the /some directory — or unless you’re on macOS); and the permissions will be 0777 as modified by the current setting of umask().

If these issues are not important, or are circumventable, then remove and recreate is plausible.


Expanding on the comments above:

When you mentioned the system() call in your comment, I had no idea what you were talking about.

The system() function executes the string passed as an argument via a command interpreter. When you write code, you have to think about how it could go wrong when someone malicious is trying to get you run it. When you write "rm -fr /some/directory", you are relying on the shell finding the normal rm command and working properly. However, if your PATH has a value such as $HOME/bin:/bin:/usr/bin (so that commands in your own private bin directory are used in preference to those provided by the system), then if the user can install their own script as $HOME/bin/rm, they can execute arbitrary code with your privileges — which could allow them to leave a way of breaking into your system later keeping all the privileges you have. They might even clean up (most of) the evidence that there was once a $HOME/bin/rm script.

One way to avoid such a problem is to request "/bin/rm -fr /some/directory" (unless rm is /usr/bin and not in /bin, of course). This is arguably safer. There used to be attacks available via the IFS environment variable; these are neutered by modern shells which do not use any inherited value for IFS.

Note that one problem will be interpreting whether the rm command was successful. The -fr option means it will report success under almost all circumstances — but if the directory didn’t vanish, your mkdir("/some/directory", 0777) call will fail.

As for the security of the /some/directory, my understanding of what you’re trying to say, is that the wrong directory could be selected. I just can’t imagine how that would occur.

Assuming that you did have permission to modify the /some directory (you need that to be able to remove /some/directory), and that you could modify all the sub-directories of the old version of /some/directory, then you might start off with /some/directory owned by user victim, group witless and with permission 775, whereas after the command and system call (note that system() is a function rather than a system call; mkdir() is a system call) are successful, the directory might be owned by user victor, group mischief and with permission 0777. This might be less than ideal. If you don’t want to break such permission settings, you probably don’t want to use the remove and recreate technique — or, not such a simple-minded one as this example shows. You would then have to work a bit harder to remove the contents of the directory without modifying those attributes. You might scan the directory (opendir(), readdir(), closedir() and invoke rm on each name — or sets of names — to clean up thoroughly without removing the directory itself. This is of hybrid complexity. It is more fiddly than simply removing and recreating, but far less fiddly than dealing with full recursive delete over multiple levels.

As I said before, you have to decide whether these issues matter or not. It is important that you’re aware that the issues exist, and that you have made a conscious (and informed) decision about whether to handle the issues, and how to handle the issues.

Remember that if your program may be run with root (administrator) privileges, it is a lot more important to be careful — but it still matters even if only ordinary mortal users will run the program.

Advertisement