optimization - How to list directories faster?


I have a few situations where I need to list files recursively, but my implementations have been slow. I have a directory structure with 92784 files. `find` lists the files in less than 0.5 seconds, but my Haskell implementation is a lot slower.

My first implementation took a bit over 9 seconds to complete, the next version a bit over 5 seconds, and now I'm down to a bit less than 2 seconds.

```haskell
import Control.Monad (forM)
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath ((</>))

listFilesR :: FilePath -> IO [FilePath]
listFilesR path =
  let isDODD "." = False
      isDODD ".." = False
      isDODD _ = True
  in do
    allfiles <- getDirectoryContents path
    dirs <- forM allfiles $ \d ->
      if isDODD d
        then do
          let p = path </> d
          isDir <- doesDirectoryExist p
          if isDir then listFilesR p else return [d]
        else return []
    return (concat dirs)
```

The test takes 100 megabytes of memory (`+RTS -s`), and the program spends around 40% of its time in GC.

I was thinking of doing the listing in a WriterT monad with a Sequence as the monoid to prevent the concats and list creation. Does this help? What else should I do?
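For what it's worth, a minimal sketch of that idea could look like the following. This is just an illustration under my own naming (`listFilesW` and `go` are made up here): it accumulates paths with `tell` into a `Seq`, where appending is cheap, and converts to a list once at the end.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Writer.Strict (WriterT, execWriterT, tell)
import Data.Foldable (toList)
import Data.Sequence (Seq, singleton)
import System.Directory
import System.FilePath ((</>))

-- Accumulate file paths in a Seq via tell; Seq appends are cheap
-- compared to repeated list concatenation.
listFilesW :: FilePath -> IO [FilePath]
listFilesW root = toList <$> execWriterT (go root)
  where
    go :: FilePath -> WriterT (Seq FilePath) IO ()
    go path = do
      entries <- liftIO (getDirectoryContents path)
      forM_ entries $ \d ->
        when (d /= "." && d /= "..") $ do
          let p = path </> d
          isDir <- liftIO (doesDirectoryExist p)
          if isDir then go p else tell (singleton p)
```

Note that this still materializes the full result list at the end, so it addresses the concat cost rather than the peak memory use.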

Edit: I have edited the function to use readDirStream, and it helps keep memory down. There's still some allocation happening, but the productivity rate is >95% and it runs in less than a second.

This is the current version:

```haskell
import System.Directory (doesDirectoryExist)
import System.FilePath ((</>))
import System.Posix.Directory (closeDirStream, openDirStream, readDirStream)

list path = do
  de <- openDirStream path
  readDirStream de >>= go de
  closeDirStream de
  where
    -- readDirStream returns "" when the stream is exhausted
    go d [] = return ()
    go d "." = readDirStream d >>= go d
    go d ".." = readDirStream d >>= go d
    go d x = do
      let newpath = path </> x
      e <- doesDirectoryExist newpath
      if e
        then list newpath >> readDirStream d >>= go d
        else putStrLn newpath >> readDirStream d >>= go d
```

I think System.Directory.getDirectoryContents constructs the whole list at once and therefore uses a lot of memory. How about using System.Posix.Directory? System.Posix.Directory.readDirStream returns entries one by one.
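As a rough sketch of that streaming style (`withDirEntries` is a made-up helper name, not part of any library; `readDirStream` from the unix package yields `""` once the directory stream is exhausted):

```haskell
import Control.Exception (bracket)
import System.Posix.Directory (closeDirStream, openDirStream, readDirStream)

-- Visit the entries of a single directory one at a time, without
-- ever building the whole list in memory. bracket makes sure the
-- DirStream is closed even if the action throws.
withDirEntries :: FilePath -> (FilePath -> IO ()) -> IO ()
withDirEntries path act = bracket (openDirStream path) closeDirStream loop
  where
    loop ds = do
      entry <- readDirStream ds
      if null entry
        then return ()
        else act entry >> loop ds
```

Because each entry is handed to the callback and then dropped, memory use stays flat no matter how large the directory is.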

Also, the FileManip library might be useful, although I have never used it.

