algorithm - How are efficient consecutive word searches implemented?
Search engines and databases allow you to use consecutive string search (such as "this is a test"), which matches "this is a test that will match" but won't match "test this is a".
I know some databases have built-in features that allow you to use the same functionality without writing a single line of code (e.g. MySQL's full-text search). That's not the kind of answer I'm looking for.
What I want to know is what kind of algorithm and database structures are used to enable this fast searching of strings.
What would the indexed table look like for the above example? Would it be something similar to this?
// indexeditemid | position | word
1 | 0 | this
1 | 1 | is
1 | 2 | a
1 | 3 | test
1 | 4 | that
1 | 5 | will
1 | 6 | match
2 | 0 | test
2 | 1 | this
2 | 2 | is
2 | 3 | a
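For illustration, rows of this shape could be produced by splitting each item on whitespace and recording each word's offset. This is only a minimal Python sketch: the `index_rows` helper and the two sample items are assumptions made for the example, and a real indexer would also handle punctuation and normalization.

```python
def index_rows(items):
    """Yield (indexeditemid, position, word) rows for each item's text."""
    for item_id, text in items.items():
        # Naive tokenization: lowercase, then split on whitespace.
        for position, word in enumerate(text.lower().split()):
            yield (item_id, position, word)


# Hypothetical sample items mirroring the example in the question.
items = {1: "this is a test that will match", 2: "test this is a"}
rows = list(index_rows(items))
```

Each tuple in `rows` corresponds to one row of an `indexedwords`-style table, so the list could be bulk-inserted as is.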
Now that there are indexed items, how do you efficiently create a SQL statement that matches those items?
Here's one example I can think of:
select indexeditemid
from (select indexeditemid, position from indexedwords where word = 'this') as word1position
where exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'is' and position = word1position.position + 1)
  and exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'a' and position = word1position.position + 2)
  and exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'test' and position = word1position.position + 3)
I'm sure there is a more standardized way that's more efficient.
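One widely used structure for this kind of query is a positional inverted index: each word maps to the items it occurs in together with its positions, and a phrase matches wherever word k of the phrase sits at position start + k. The following is a minimal Python sketch of that idea, not a description of how any particular database implements it. The names `build_index` and `phrase_search` are hypothetical, and in-memory dictionaries stand in for the index table.

```python
from collections import defaultdict


def build_index(items):
    """Map word -> {item_id: set of positions where the word occurs}."""
    index = defaultdict(lambda: defaultdict(set))
    for item_id, text in items.items():
        for position, word in enumerate(text.lower().split()):
            index[word][item_id].add(position)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of items that contain the phrase's words consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return []
    # Only items that contain every word of the phrase can match.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    matches = []
    for item_id in candidates:
        # A start position matches if word k sits at start + k for every k.
        if any(
            all(start + k in index[w][item_id] for k, w in enumerate(words))
            for start in index[words[0]][item_id]
        ):
            matches.append(item_id)
    return sorted(matches)


# Hypothetical sample items mirroring the example in the question.
items = {1: "this is a test that will match", 2: "test this is a"}
index = build_index(items)
result = phrase_search(index, "this is a test")  # only item 1 matches
```

The candidate intersection plays the same role as the `exists` subqueries in the SQL: it discards items missing any word before positions are compared.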
You may want to look at a trie. Tries are very efficient in scenarios like this, but they consume a lot of memory to store the whole structure.
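The answer does not say how the trie would be built. One possible reading, sketched here in Python under that assumption, is a word-level trie into which every suffix of each item's word sequence is inserted, so that any consecutive run of words is a path from the root. The names `TrieNode`, `build_suffix_trie`, and `phrase_lookup` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # word -> TrieNode
        self.item_ids = set()   # items whose text contains the path to this node


def build_suffix_trie(items):
    """Insert every word-level suffix of every item into one trie."""
    root = TrieNode()
    for item_id, text in items.items():
        words = text.lower().split()
        for start in range(len(words)):
            node = root
            for word in words[start:]:
                node = node.children.setdefault(word, TrieNode())
                node.item_ids.add(item_id)
    return root


def phrase_lookup(root, phrase):
    """Walk the trie one word at a time; return ids of items containing the phrase."""
    node = root
    for word in phrase.lower().split():
        node = node.children.get(word)
        if node is None:
            return set()
    return set(node.item_ids)


# Hypothetical sample items mirroring the example in the question.
items = {1: "this is a test that will match", 2: "test this is a"}
root = build_suffix_trie(items)
trie_result = phrase_lookup(root, "this is a test")  # only item 1 matches
```

A lookup costs one dictionary step per word in the phrase, but storing every suffix can require a number of nodes that grows quadratically with the length of each item, which is the memory cost the answer warns about.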