Ruby on Rails - How to edit docx with nokogiri and rubyzip


I'm using a combination of rubyzip and nokogiri to edit a .docx file. I use rubyzip to unzip the .docx file, then nokogiri to parse and change the body of the word/document.xml file, but every time I close rubyzip at the end it corrupts the file and I can't open or repair it. When I unzip the .docx file on the desktop and check word/document.xml, the content is updated to what I changed it to, but all the other files are messed up. Can someone help me with this issue? Here is my code:

    require 'rubygems'
    require 'zip/zip'
    require 'nokogiri'

    zip = Zip::ZipFile.open("test.docx")
    doc = zip.find_entry("word/document.xml")
    xml = Nokogiri::XML.parse(doc.get_input_stream)
    wt = xml.root.xpath("//w:t", {"w" => "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}).first
    wt.content = "new text"
    zip.get_output_stream("word/document.xml") {|f| f << xml.to_s}
    zip.close

I ran into the same corruption problem with rubyzip last night. I solved it by copying everything to a new zip file, replacing files as necessary.
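The copy-and-replace idea can be sketched without any zip library at all. This is only an illustration, not rubyzip code: the helper name `entries_for_new_archive` and the sample entry contents are made up, and the archive is modelled as a plain Hash of entry name to data.

```ruby
# Hypothetical helper: build the entry list for the NEW archive.
# Every entry from the source is copied through unchanged unless a
# replacement body was supplied under the same entry name.
def entries_for_new_archive(source_entries, replacements)
  source_entries.map do |name, data|
    [name, replacements.fetch(name, data)]
  end
end

# Stand-in for the contents of a .docx (illustrative data only).
source = {
  "[Content_Types].xml" => "<Types/>",
  "word/document.xml"   => "<w:document>old</w:document>",
  "word/styles.xml"     => "<w:styles/>"
}
edited = { "word/document.xml" => "<w:document>new</w:document>" }

entries_for_new_archive(source, edited).each do |name, data|
  puts "#{name}: #{data}"
end
```

The point is that the source archive is only ever read. All writing goes to a second file, so untouched entries are copied across as-is.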

Here's my working proof of concept:

    #!/usr/bin/env ruby

    require 'rubygems'
    require 'zip/zip' # rubyzip gem
    require 'nokogiri'

    class WordXmlFile
      def self.open(path, &block)
        self.new(path, &block)
      end

      def initialize(path, &block)
        @replace = {} # entry name => new contents
        if block_given?
          @zip = Zip::ZipFile.open(path)
          yield(self)
          @zip.close
        else
          @zip = Zip::ZipFile.open(path)
        end
      end

      # Fill each MERGEFIELD in the document body with the matching value from rec.
      def merge(rec)
        xml = @zip.read("word/document.xml")
        doc = Nokogiri::XML(xml) {|x| x.noent}
        (doc/"//w:fldSimple").each do |field|
          if field.attributes['instr'].value =~ /MERGEFIELD (\S+)/
            text_node = (field/".//w:t").first
            if text_node
              text_node.inner_html = rec[$1].to_s
            else
              puts "No text node for #{$1}"
            end
          end
        end
        @replace["word/document.xml"] = doc.serialize :save_with => 0
      end

      # Write a new archive: copy every entry, swapping in replaced contents.
      def save(path)
        Zip::ZipFile.open(path, Zip::ZipFile::CREATE) do |out|
          @zip.each do |entry|
            out.get_output_stream(entry.name) do |o|
              if @replace[entry.name]
                o.write(@replace[entry.name])
              else
                o.write(@zip.read(entry.name))
              end
            end
          end
        end
        @zip.close
      end
    end

    if __FILE__ == $0
      file = ARGV[0]
      out_file = ARGV[1] || file.sub(/\.docx/, ' merged.docx')
      w = WordXmlFile.open(file)
      # The original snippet called w.force_settings here, but no such method
      # is defined in this class, so the call is left out.
      w.merge('first_name' => 'Eric', 'last_name' => 'Mason')
      w.save(out_file)
    end
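To see what the merge step is matching on, here is a small standalone sketch of pulling the field name out of a w:fldSimple instr value. The helper name `merge_field_name` and the sample instr strings are assumptions for illustration, and the pattern is slightly looser about whitespace than the one in the class above. It does not handle quoted field names that contain spaces.

```ruby
# Hypothetical helper: return the merge field name from an instr attribute
# value, or nil when the field is not a MERGEFIELD.
def merge_field_name(instr)
  instr[/MERGEFIELD\s+(\S+)/, 1]
end

# Sample instr values (illustrative; real documents may differ).
samples = [
  ' MERGEFIELD First_Name \* MERGEFORMAT ',
  ' MERGEFIELD Last_Name ',
  ' PAGE '
]

samples.each do |instr|
  puts "#{instr.inspect} -> #{merge_field_name(instr).inspect}"
end
```

Whatever this returns is used as the lookup key into the hash passed to merge, so the keys in that hash have to match the field names in the template exactly, including case.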
