I’m trying to create a tar file from a directory’s contents using Codec.Archive.Tar, but I also want to clean up the directory once the tar file has been created. Here’s a small example which demonstrates my issue:
    import System.Directory
    import qualified Codec.Archive.Tar as T

    listFile = do
      createDirectory "dir"
      createDirectory "dir/dir2"
      tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
      removeDirectoryRecursive "dir"
      return tarfile
When I call the function listFile, e.g. from within ghci, I get back
"*** Exception: dir/dir2: getModificationTime:getFileTimes:getFileStatus: does not exist (No such file or directory)
which I’m guessing is caused by the tar file being generated lazily and the directory being cleaned up strictly. Thus the directory is deleted before the tar file is actually created.
First, am I correct in my analysis of why this is failing? If so, what can I do to fix this? I don’t want to generate the tar file strictly because it could be fairly large and I don’t want to store it all in memory. What’s the “idiomatic” way to delay deleting the directory until the tar file has been generated?
Answer
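On the first question: as far as I can tell, the diagnosis is right. T.pack gathers its entries lazily, so nothing is read from disk until the resulting ByteString is consumed, and by then the directory is gone. Here is a minimal sketch of the same failure mode using only base and directory. The names lazyModTime, demo and the lazy-demo directory are hypothetical, and unsafeInterleaveIO is only a stand-in for whatever deferral pack does internally, not a quote of its source.

```haskell
import Control.Exception (IOException, evaluate, try)
import System.Directory
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for a lazy metadata lookup: the action does not
-- run here, only when the returned String is first demanded.
lazyModTime :: FilePath -> IO String
lazyModTime path = unsafeInterleaveIO (show <$> getModificationTime path)

-- Take a deferred reading, delete the directory, and only then force
-- the reading. Assumes "lazy-demo" does not already exist.
demo :: IO (Either IOException String)
demo = do
  createDirectory "lazy-demo"
  createDirectory "lazy-demo/dir2"
  stamp <- lazyModTime "lazy-demo/dir2"  -- nothing is read yet
  removeDirectoryRecursive "lazy-demo"   -- runs immediately
  try (evaluate stamp)                   -- the deferred lookup fires here

main :: IO ()
main = do
  result <- demo
  putStrLn (either (\e -> "deferred read failed: " ++ show e)
                   ("modification time: " ++)
                   result)
```

The point of the sketch is the ordering: the delete is an ordinary IO step and happens where it is written, while the lookup happens wherever the value is first forced.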
The simplest solution is to invert control of your listFile function. Instead of having it return a lazy ByteString (which will be useless once the directory is removed), have it take an IO action to consume the ByteString and actually do something with it before removing the directory. For example:
    import System.Directory
    import qualified Codec.Archive.Tar as T
    import qualified Data.ByteString.Lazy as LB
    import System.IO

    listFileTo :: (LB.ByteString -> IO ()) -> IO ()
    listFileTo sink = do
      createDirectory "dir"
      createDirectory "dir/dir2"
      tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
      -- The sink consumes the lazy ByteString while "dir" still exists.
      sink tarfile
      removeDirectoryRecursive "dir"

    main :: IO ()
    main = listFileTo (\tarcontents ->
      withBinaryFile "my.tar" WriteMode (\h -> LB.hPut h tarcontents))
Here, listFileTo takes a “sink”, a function that takes a lazy ByteString and performs an IO action with it. For example, the above version of main writes it to the file my.tar.
You could also generalize this to something that can return a value from the sink:
    listFileTo :: (LB.ByteString -> IO a) -> IO a
    listFileTo sink = do
      createDirectory "dir"
      createDirectory "dir/dir2"
      tarfile <- fmap T.write $ T.pack "dir" ["dir2"]
      result <- sink tarfile
      removeDirectoryRecursive "dir"
      return result
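One caveat with the version above: if the sink throws, the directory is left behind. A sketch of a variant that cleans up either way, assuming that is what you want, is below. It uses bracket_ from Control.Exception; the name listFileToSafe is hypothetical, and the rest mirrors the code above.

```haskell
import Control.Exception (bracket_)
import qualified Codec.Archive.Tar as T
import qualified Data.ByteString.Lazy as LB
import System.Directory

-- Hypothetical variant of listFileTo: "dir" is removed whether the sink
-- returns normally or throws.
listFileToSafe :: (LB.ByteString -> IO a) -> IO a
listFileToSafe sink =
  bracket_
    (createDirectory "dir")           -- acquire
    (removeDirectoryRecursive "dir")  -- release, always runs
    (do createDirectory "dir/dir2"
        tarfile <- T.write <$> T.pack "dir" ["dir2"]
        sink tarfile)

main :: IO ()
main = listFileToSafe (LB.writeFile "my.tar")
```

The same rule applies as before: the sink has to finish consuming the ByteString before it returns, because the release step runs right after it.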
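This would allow you to, for example, determine the size of the resulting tarfile without actually doing anything with it, though you have to take care to strictly evaluate the result in the sink. Otherwise the sink returns an unevaluated thunk, and the tar data is only read when that thunk is forced, after the directory has already been removed: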
    {-# LANGUAGE BangPatterns #-}

    main :: IO ()
    main = do
      size <- listFileTo (\tarcontents ->
        -- The bang forces the length before the sink returns.
        let !size = LB.length tarcontents in return size)
      print size
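If you would rather not turn on BangPatterns, evaluate from Control.Exception should do the same job, since it forces its argument to weak head normal form as part of the IO action. A minimal sketch, with the generalized listFileTo repeated so it stands alone; the name tarSize is hypothetical.

```haskell
import Control.Exception (evaluate)
import qualified Codec.Archive.Tar as T
import qualified Data.ByteString.Lazy as LB
import Data.Int (Int64)
import System.Directory

-- Same generalized helper as above, repeated so the sketch is self-contained.
listFileTo :: (LB.ByteString -> IO a) -> IO a
listFileTo sink = do
  createDirectory "dir"
  createDirectory "dir/dir2"
  tarfile <- T.write <$> T.pack "dir" ["dir2"]
  result <- sink tarfile
  removeDirectoryRecursive "dir"
  return result

-- evaluate forces the length inside the sink, so the whole ByteString is
-- consumed while "dir" still exists.
tarSize :: IO Int64
tarSize = listFileTo (evaluate . LB.length)

main :: IO ()
main = tarSize >>= print
```

For an Int64, weak head normal form is the fully computed number, so this is strict enough here. For a sink that returns a structure with lazy fields you would need something deeper.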