The directory is being deleted too early

I’m trying to create a tar file from a directory’s contents using Codec.Archive.Tar, but I also want to clean up the directory once the tar file has been created. Here’s a small example which demonstrates my issue:

import System.Directory
import qualified Codec.Archive.Tar as T

listFile = do createDirectory "dir"
              createDirectory "dir/dir2"
              tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
              removeDirectoryRecursive "dir"
              return tarfile

When I call the function listFile e.g. from within ghci, I get back

"*** Exception: dir/dir2: getModificationTime:getFileTimes:getFileStatus: does not exist (No such file or directory)

which I’m guessing is caused by the tar file being generated lazily and the directory being cleaned up strictly. Thus the directory is deleted before the tar file is actually created.

First, am I correct in my analysis of why this is failing? If so, what can I do to fix this? I don’t want to generate the tar file strictly because it could be fairly large and I don’t want to store it all in memory. What’s the “idiomatic” way to delay deleting the directory until the tar file has been generated?

Answer

The simplest solution is to invert control of your listFile function. Instead of having it return a lazy ByteString (which will be useless once the directory is removed), have it take an IO action to consume the ByteString and actually do something with it before removing the directory. For example:

import System.Directory
import qualified Codec.Archive.Tar as T
import qualified Data.ByteString.Lazy as LB
import System.IO

listFileTo :: (LB.ByteString -> IO ()) -> IO ()
listFileTo sink = do createDirectory "dir"
                     createDirectory "dir/dir2"
                     tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
                     sink tarfile
                     removeDirectoryRecursive "dir"

main :: IO ()
main = listFileTo (\tarcontents -> withBinaryFile "my.tar" WriteMode
                    (\h -> LB.hPut h tarcontents))

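Here, listFileTo takes a “sink”, a function that takes a lazy ByteString and performs an IO action with it. The sink runs to completion before the directory is removed, so the archive is still generated lazily but is fully consumed while the files exist. For example, the above version of main writes it to the file my.tar.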
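
One thing the version above does not handle is a sink that throws: in that case removeDirectoryRecursive is never reached. A minimal sketch of one way to address that, assuming you want "dir" removed whether or not the sink succeeds, is to wrap the work in bracket_ from Control.Exception. The use of bracket_ and LB.writeFile here is my own choice for illustration, not something from the code above:

```haskell
import Control.Exception (bracket_)
import System.Directory (createDirectory, removeDirectoryRecursive)
import qualified Codec.Archive.Tar as T
import qualified Data.ByteString.Lazy as LB

-- Same idea as listFileTo above, but the cleanup action runs even if
-- the body (including the sink) throws an exception.
listFileTo :: (LB.ByteString -> IO ()) -> IO ()
listFileTo sink =
  bracket_ (createDirectory "dir") (removeDirectoryRecursive "dir") $ do
    createDirectory "dir/dir2"
    tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
    -- The sink must fully consume the ByteString before it returns.
    sink tarfile

-- LB.writeFile writes the whole lazy ByteString out before returning,
-- so it is a suitable sink.
main :: IO ()
main = listFileTo (LB.writeFile "my.tar")
```

The sink still has to consume the whole ByteString before returning; bracket_ only guarantees that the cleanup runs, not that the archive has been forced.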

You could also generalize this to something that can return a value from the sink:

listFileTo :: (LB.ByteString -> IO a) -> IO a
listFileTo sink = do createDirectory "dir"
                     createDirectory "dir/dir2"
                     tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
                     result <- sink tarfile
                     removeDirectoryRecursive "dir"
                     return result

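This would allow you to, for example, compute the size of the resulting tar archive without writing it anywhere. You do have to take care to evaluate the result strictly inside the sink; otherwise the length is an unevaluated thunk that is only forced when it is printed, after the directory has already been removed: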

{-# LANGUAGE BangPatterns #-}

main :: IO ()
main = do size <- listFileTo (\tarcontents ->
                                let !size = LB.length tarcontents in return size)
          print size
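
If you would rather not enable BangPatterns, a possible alternative is evaluate from Control.Exception, which forces its argument to weak head normal form as part of the IO action. For an Int64 that means the length is fully computed inside the sink. This is a sketch under that assumption; the helper name tarSize is hypothetical, and listFileTo is repeated only so the example is self-contained:

```haskell
import Control.Exception (evaluate)
import Data.Int (Int64)
import System.Directory (createDirectory, removeDirectoryRecursive)
import qualified Codec.Archive.Tar as T
import qualified Data.ByteString.Lazy as LB

-- The generalized listFileTo from above, repeated for completeness.
listFileTo :: (LB.ByteString -> IO a) -> IO a
listFileTo sink = do
  createDirectory "dir"
  createDirectory "dir/dir2"
  tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
  result <- sink tarfile
  removeDirectoryRecursive "dir"
  return result

-- Hypothetical helper: evaluate forces LB.length (and therefore the
-- whole lazy ByteString) while the directory still exists.
tarSize :: IO Int64
tarSize = listFileTo (evaluate . LB.length)

main :: IO ()
main = tarSize >>= print
```

evaluate only forces to weak head normal form, so for a sink that returns a structured value (a list, a record with lazy fields) you would need a deeper form of forcing.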
User contributions licensed under: CC BY-SA