SQL Server 2008 - Indexing a large number of XML files
I have a difficult problem lying before me, and I thought it best to seek the guidance of the community before formulating a plan of attack on my own.
I have a couple thousand XML files that need to be searchable from a SQL Server 2008 database. The XML files reside on disk and are not part of a repository. By "searchable" I mean that I need to be able to do something like this (pseudo-code here):
SELECT * FROM tbl_xmldata WHERE CONTAINS(xmldata, '"some search word"')
Here tbl_xmldata is the table in which the XML files are stored, and xmldata is the column that holds the actual XML data.
The last requirement (and it's a tough one) is that when a hit is found (by 'hit' I mean an XML file found to contain the term being searched upon), I need to have access to the wording that surrounds the search term where it was found. For instance, if I had an XML file that had the following in it:
<root>We hold these truths to be self-evident, that all men are created equal</root>
and I searched upon the word "self-evident", I need to be able to return around 20 characters before and after the search term. I bring up this last point because, in my experience anyway, SQL Server's full-text indexing is limited in that it can tell you whether a term/word/phrase is located in a particular document (assuming the document is stored in a SQL Server 2008 FILESTREAM), but it can't tell you the context in which the term/word/phrase is located.
Any help would be appreciated! Thanks!
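To make the context requirement concrete, here is a minimal Python sketch of the "20 characters before and after" extraction, done outside the database. It is only an illustration under stated assumptions, not a SQL Server feature: the function name `find_snippets` and its `width` parameter are hypothetical, each XML file is assumed small enough to parse in memory, and a plain case-insensitive substring match is assumed to be acceptable.

```python
import re
import xml.etree.ElementTree as ET


def find_snippets(xml_text, term, width=20):
    """Return every occurrence of `term` with up to `width` characters
    of surrounding text on each side (hypothetical helper)."""
    # Flatten the document to its text content so markup is not
    # counted as context, and collapse runs of whitespace.
    root = ET.fromstring(xml_text)
    text = " ".join("".join(root.itertext()).split())

    snippets = []
    # re.escape treats the search term literally, not as a pattern.
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)       # clamp at start of text
        end = min(len(text), match.end() + width)   # clamp at end of text
        snippets.append(text[start:end])
    return snippets


sample = ("<root>We hold these truths to be self-evident, "
          "that all men are created equal</root>")
snippets = find_snippets(sample, "self-evident")
for snippet in snippets:
    print(repr(snippet))
```

A full-text query could be used first to narrow down which documents contain the term, with a step like this run only on the matching documents to pull out the surrounding wording.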
Take a look at the Solr project. A less mature but promising alternative is Elasticsearch.