Using a Python code coverage tool to understand and prune back the source code of a large library
My project targets a low-cost, low-resource embedded device. It depends on a relatively large, sprawling Python code base, but its use of the library's APIs is quite specific.
I am keen to prune the code of the library down to the bare minimum, by executing the test suite within a coverage tool such as Ned Batchelder's coverage.py or figleaf, then scripting the removal of unused code within the various modules/files. Beyond shrinking the footprint, this should help me understand the library's internals and make writing any patches easier. Ned actually refers to using coverage tools to "reverse engineer" complex code in one of his online talks.
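As a sketch of the kind of data such tools expose, the standard library's `trace` module records executed lines in a similar way. This is an illustrative stand-alone example, not coverage.py's API; the `demo` function is a made-up stand-in for the test suite:

```python
# Minimal sketch: record which lines execute using the stdlib `trace` module.
import trace

def used_lines(func, *args):
    """Run func under the tracer and return {filename: set of executed line numbers}."""
    tracer = trace.Trace(count=1, trace=0)
    tracer.runfunc(func, *args)
    hits = {}
    # results().counts maps (filename, lineno) -> execution count
    for (filename, lineno), count in tracer.results().counts.items():
        if count:
            hits.setdefault(filename, set()).add(lineno)
    return hits

def demo():
    # Stand-in for running the real test suite against the library.
    return sum(range(10))

if __name__ == "__main__":
    for filename, lines in used_lines(demo).items():
        print(filename, sorted(lines))
```

The per-file line sets produced here are the raw material for any later pruning step.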
My question to the community is whether anyone has experience of using coverage tools in this way that they wouldn't mind sharing. What are the pitfalls, if any? Is coverage.py a good choice of tool, or would I be better off investing my time in figleaf?
The end-game is to be able to automatically generate a new source tree for the library, based on the original tree, but including only the code that is actually used when I run nosetests.
If anyone has developed a tool that does a similar job for their Python applications and libraries, it would be terrific to have a baseline from which to start development.

Hopefully the description makes sense to readers...
What you want isn't "test coverage"; it is the transitive closure of "can call" from the root of the computation. (In threaded applications, you have to include "can fork".)

You want to designate some small set (perhaps just one) of functions that make up the entry points of your application, and you want to trace through all possible callees (conditional or unconditional) of that small set. That is the set of functions you must have.
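The closure computation itself is straightforward once you have a call graph. A minimal sketch over an adjacency dict (the graph and function names below are invented for illustration):

```python
from collections import deque

def reachable(call_graph, roots):
    """Transitive closure of "can call" starting from the root functions."""
    live = set(roots)
    work = deque(roots)
    while work:
        fn = work.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in live:
                live.add(callee)
                work.append(callee)
    return live

# Illustrative graph: main can call parse and render; parse can call tokenize.
graph = {
    "main": ["parse", "render"],
    "parse": ["tokenize"],
    "unused": ["also_unused"],
}
print(sorted(reachable(graph, ["main"])))
# -> ['main', 'parse', 'render', 'tokenize']
```

The hard part is not this traversal but building an accurate `call_graph` in the first place.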
Python makes this hard in general (IIRC, I'm not a deep Python expert) because of dynamic dispatch and especially because of "eval". Reasoning about which function can get called can be pretty tricky for static analyzers applied to highly dynamic languages.

One might use test coverage as a way to seed the "can call" relation with specific "did call" facts; that would catch a lot of dynamic dispatches (depending on the test suite's coverage). Then the result you want is the transitive closure of "can or did call". This can still be erroneous, but is likely to be less so.
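As a concrete illustration of gathering "did call" facts, observed caller-to-callee edges can be recorded dynamically. A minimal sketch using the standard library's `sys.settrace`; the `entry` and `helper` functions are made up, and a real run would exercise the test suite instead:

```python
import sys

def record_did_call(func, *args):
    """Run func and record observed (caller, callee) edges via sys.settrace."""
    edges = set()

    def tracer(frame, event, arg):
        if event == "call" and frame.f_back is not None:
            edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))
        return None  # no per-line tracing needed, just call events

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return edges

def helper():
    return 42

def entry():
    return helper()

print(sorted(record_did_call(entry)))
```

Edges collected this way can be unioned with statically derived "can call" edges before taking the closure.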
Once you have the set of "necessary" functions, the next problem is removing the unnecessary functions from the source files you have. If the number of files you start with is large, the manual effort to remove the dead stuff may be pretty high. Worse, you're likely to revise the application, and then the answer as to what to keep changes. So for every change (release), you need to reliably recompute this answer.

My company builds a tool that does this analysis for Java packages (with the appropriate caveats regarding dynamic loads and reflection). It takes as input a set of Java files and (as above) a designated set of root functions, computes the call graph, additionally finds dead member variables, and produces two outputs: a) the list of purportedly dead methods and members, and b) a revised set of files with the "dead" stuff removed. If you believe a), then you use b). If you think a) is wrong, you add the elements listed in a) to the set of roots and repeat the analysis until you think a) is right. To do this, you need a static analysis tool that parses Java, computes the call graph, and then revises the code modules to remove the dead entries. The basic idea applies to any language.
You'd need a similar tool for Python, I'd expect.
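A starting point in Python might compare the live set against a parse of each module. This is a deliberately naive sketch using the stdlib `ast` module; a real tool would also have to handle methods, nested functions, name collisions across modules, and dynamically constructed names:

```python
import ast

def dead_functions(source, live):
    """Report function definitions in `source` whose names are not in `live`."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return sorted(defined - set(live))

# Illustrative module with one live and one dead function.
code = """
def used(): pass
def unused(): pass
"""
print(dead_functions(code, {"used"}))
# -> ['unused']
```

The output corresponds to list a) in the workflow above; rewriting the files to drop those definitions would produce b).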
Maybe you can stick to dropping files that are entirely unused, although that may still be a lot of work.
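File-level pruning is the easiest version to automate: copy across only the files that appeared in the coverage data. A minimal sketch, assuming `used_files` is a set of absolute paths reported by whatever coverage run was performed (the function name and layout here are hypothetical):

```python
import shutil
from pathlib import Path

def prune_tree(src, dst, used_files):
    """Copy only the .py files whose paths appear in the set of executed files."""
    src, dst = Path(src), Path(dst)
    kept = []
    for path in src.rglob("*.py"):
        if str(path.resolve()) in used_files:
            target = dst / path.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # preserve timestamps/permissions
            kept.append(target)
    return kept
```

This leaves each kept file intact, so imports between surviving modules keep working; function-level pruning within files is the harder follow-up step described above.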