regex - Extract values from a text body in Ruby


I need to extract some values from a multi-line string (which I read from the text body of emails). I want to be able to feed patterns to my parser so that I can customize it for different emails later. I came up with the following:

    #!/usr/bin/env ruby

    text1 = <<-eos
    lorem ipsum dolor sit amet,

    name: pepe manuel periquita

    email: pepe@manuel.net

    sisters: 1
    brothers: 3
    children: 2

    lorem ipsum dolor sit amet
    eos

    # A pattern pairs a regular expression with a block that builds the result hash.
    pattern1 = {
      :exp => /name:[\s]*(.*?)$\s*
              email:[\s]*(.*?)$\s*
              sisters:[\s]*(.*?)$\s*
              brothers:[\s]*(.*?)$\s*
              children:[\s]*(.*?)$/mx,
      :blk => lambda do |m|
        m.flatten!
        {:name => m[0],
         :email => m[1],
         :total => m.drop(2).inject(0){|sum,item| sum + item.to_i}}
      end
    }

    # scan on text returns
    #[["pepe manuel periquita", "pepe@manuel.net", "1", "3", "2"]]

    def do_parse text, pattern
      data = pattern[:blk].call(text.scan(pattern[:exp]))

      puts data.inspect
    end

    do_parse text1, pattern1

    # ./text_parser.rb
    # {:email=>"pepe@manuel.net", :total=>6, :name=>"pepe manuel periquita"}

So, I define a pattern as a regular expression paired with a block that builds a hash from the matches. The "parser" just takes the text and applies the rules by executing the block on the result of matching the regular expression against the text with scan.
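To make the scan-then-block step concrete, here is a minimal sketch on a two-line input. The `text`, `pattern`, `matches` and `result` names are only for illustration, and the block here simply joins each pair into a string; it is not the pattern used above.

```ruby
# Illustrative input, not taken from a real email.
text = "name: pepe\nemail: pepe@manuel.net\n"

# Hypothetical pattern: one regex with two capture groups plus a block.
pattern = {
  :exp => /^(\w+):\s*(.+)$/,
  :blk => lambda { |matches| matches.map { |key, value| "#{key}=#{value}" } }
}

# With capture groups, String#scan returns one array of captures per match.
matches = text.scan(pattern[:exp])

# The block receives that array of arrays and decides what to build from it.
result = pattern[:blk].call(matches)

puts matches.inspect
puts result.inspect
```

Because scan returns an array of arrays when the regex has groups, the block has to know whether it gets one big match (as in pattern1) or one small match per line.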

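At the moment I only have to parse emails in the format shown in text1, but later I would like to add more patterns so that data can be extracted from different emails (the format is fixed for each type of email). I would therefore like to simplify the pattern by moving as much as possible into the "parser". The code above works and extracts the data, but most of the work is located in the pattern...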
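One possible way to move the work into the parser is sketched below, assuming every email type uses "key: value" lines. The `extract_fields` and `parse_email` helpers and the `:copy` / `:sum` spec format are hypothetical names made up for this sketch, not part of any library.

```ruby
# Hypothetical helper: collect every "key: value" line into a hash.
def extract_fields(text)
  pairs = text.scan(/^(\w+):[ \t]*(.+)$/).map do |key, value|
    [key.downcase.to_sym, value.strip]
  end
  Hash[pairs]
end

# Hypothetical parser: the spec only says which fields to copy and which to sum.
def parse_email(text, spec)
  fields = extract_fields(text)
  result = {}
  spec[:copy].each { |key| result[key] = fields[key] }
  spec[:sum].each do |target, keys|
    # A missing field is nil, and nil.to_i is 0, so it adds nothing to the sum.
    result[target] = keys.inject(0) { |sum, key| sum + fields[key].to_i }
  end
  result
end

# Assumed spec format for the email type shown in text1.
family_spec = {
  :copy => [:name, :email],
  :sum  => { :total => [:sisters, :brothers, :children] }
}

sample = <<-EOS
lorem ipsum dolor sit amet,

name: pepe manuel periquita
email: pepe@manuel.net

sisters: 1
brothers: 3
children: 2
EOS

data = parse_email(sample, family_spec)
puts data.inspect
```

With this split, a new email type would only need a new spec hash, as long as its body still follows the "key: value" layout; anything less regular would still need its own regular expression.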

Is this the right way to go?

Could this be simplified, or can you think of a different / better solution for this problem?

Update

I updated the parser following tonttu's solution, so the pattern hash is now:

    pattern2 = {
      :exp => /^(.+?):\s*(.+)$/,
      :blk => lambda do |m|
        # m holds one [key, value] pair per "key: value" line
        r = Hash[m.map{|x| [x[0].downcase.to_sym, x[1]]}]

        {:name => r[:name],
         :email => r[:email],
         :total => r[:children].to_i + r[:brothers].to_i + r[:sisters].to_i}
      end
    }

Maybe this is generic enough?

    require 'pp'

    pp Hash[*text1.scan(/^(.+?):\s(.+)$/).map{|x|
      [x[0].downcase.to_sym, x[1]]
    }.flatten]

    # => {:sisters=>"1",
    #     :brothers=>"3",
    #     :children=>"2",
    #     :name=>"pepe manuel periquita",
    #     :email=>"pepe@manuel.net"}
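On newer Rubies the same idea can probably be written without the splat and flatten. This is a small sketch assuming Ruby 2.1 or later, where Array#to_h is available; the `text` and `fields` names are illustrative.

```ruby
# Illustrative input in the same "key: value" layout as text1.
text = "name: pepe manuel periquita\nemail: pepe@manuel.net\nsisters: 1\n"

# scan yields [key, value] pairs; to_h turns the array of pairs into a hash.
fields = text.scan(/^(.+?):\s(.+)$/)
             .map { |key, value| [key.downcase.to_sym, value] }
             .to_h

puts fields.inspect
```

Avoiding flatten also means a value can never be mistaken for a key if the pairs ever contain nested arrays.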
