algorithm - How are efficient consecutive word searches implemented?

April 15, 2014

Search engines and databases allow you to use a consecutive string search (such as "this is a test"), which matches "this is a test that will match" but won't match "test this is a".

I know some databases have built-in features that allow you to use the same functionality without writing a single line of code (e.g. MySQL's full-text search). That's not the kind of answer I'm looking for.

What I want to know is what kind of algorithm and database structures are used to enable this fast searching of strings.

What would an indexed table look like given the above example? Would it be something similar to this?

    indexeditemid | position | word
    1             | 0        | this
    1             | 1        | is
    1             | 2        | a
    1             | 3        | test
    1             | 4        | that
    1             | 5        | will
    1             | 6        | match
    2             | 0        | test
    2             | 1        | this
    2             | 2        | is
    2             | 3        | a
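As a rough illustration of how such a table could be populated, here is a minimal Python sketch. It assumes the two sample texts are "this is a test that will match" and "test this is a", and that splitting on whitespace is good enough for tokenizing; the function name `build_index_rows` is made up for this example.

```python
def build_index_rows(items):
    """Return (indexeditemid, position, word) rows, one per word of each item."""
    rows = []
    for item_id, text in items.items():
        # Naive tokenization (an assumption): lowercase, split on whitespace.
        for position, word in enumerate(text.lower().split()):
            rows.append((item_id, position, word))
    return rows


# Sample texts assumed from the question's example.
items = {
    1: "this is a test that will match",
    2: "test this is a",
}
rows = build_index_rows(items)
```

Each row records where a word sits inside its item, which is the information a phrase query needs in order to check that words are adjacent.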

Now that there are those indexed items, how do you efficiently create an SQL statement that matches those items?

Here's one example I can think of:

    select indexeditemid
    from (select indexeditemid, position from indexedwords where word = 'this') as word1position
    where exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'is' and position = word1position.position + 1)
      and exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'a' and position = word1position.position + 2)
      and exists (select * from indexedwords where indexeditemid = word1position.indexeditemid and word = 'test' and position = word1position.position + 3)
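To try a query of this shape without setting up a database server, the following sketch runs it against an in-memory SQLite database from Python. The table and column names come from the question; the helper `phrase_search`, the index `idx_word`, and the sample rows (in particular the words "that" and "will") are assumptions made for this example. It illustrates the approach and makes no claim about how any search engine does it.

```python
import sqlite3

# Sample rows assumed from the question's example table.
ROWS = [
    (1, 0, "this"), (1, 1, "is"), (1, 2, "a"), (1, 3, "test"),
    (1, 4, "that"), (1, 5, "will"), (1, 6, "match"),
    (2, 0, "test"), (2, 1, "this"), (2, 2, "is"), (2, 3, "a"),
]


def phrase_search(conn, words):
    """Return ids of items that contain `words` at consecutive positions.

    `words` must be a non-empty list.
    """
    # Anchor on the first word, then require word k at anchor position + k.
    sql = ("SELECT DISTINCT w0.indexeditemid FROM indexedwords AS w0 "
           "WHERE w0.word = ?")
    params = [words[0]]
    for offset, word in enumerate(words[1:], start=1):
        sql += (" AND EXISTS (SELECT 1 FROM indexedwords AS w "
                "WHERE w.indexeditemid = w0.indexeditemid "
                "AND w.word = ? AND w.position = w0.position + ?)")
        params += [word, offset]
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE indexedwords (indexeditemid INTEGER, position INTEGER, word TEXT)"
)
# A composite index lets each EXISTS probe be answered by an index lookup.
conn.execute("CREATE INDEX idx_word ON indexedwords (word, indexeditemid, position)")
conn.executemany("INSERT INTO indexedwords VALUES (?, ?, ?)", ROWS)

result = phrase_search(conn, ["this", "is", "a", "test"])
```

With the sample rows, only item 1 has "this", "is", "a", "test" at four consecutive positions, so item 2 is not returned even though it contains all four words.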

I'm sure there is probably a more standardized way that's more efficient.

You'll want to look at a trie. It is efficient in scenarios like this, but it consumes a lot of memory to store the whole structure.
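The answer does not say how the trie would be keyed, so the following is only one possible reading: a word-level trie in which every suffix of every item is inserted, so that a phrase lookup is a single walk from the root. The names `TrieNode`, `insert_suffixes` and `find_phrase` are hypothetical, and the sample texts are assumed from the question's example.

```python
class TrieNode:
    """One node per word; the path from the root spells a word sequence."""

    def __init__(self):
        self.children = {}     # word -> TrieNode
        self.item_ids = set()  # items in which this word sequence occurs


def insert_suffixes(root, item_id, text):
    """Insert every suffix of `text` so any consecutive run of words is a path."""
    words = text.lower().split()
    for start in range(len(words)):
        node = root
        for word in words[start:]:
            node = node.children.setdefault(word, TrieNode())
            node.item_ids.add(item_id)


def find_phrase(root, phrase):
    """Return ids of items that contain `phrase` as consecutive words."""
    node = root
    for word in phrase.lower().split():
        node = node.children.get(word)
        if node is None:
            return set()
    return set(node.item_ids)


root = TrieNode()
insert_suffixes(root, 1, "this is a test that will match")
insert_suffixes(root, 2, "test this is a")
```

A lookup costs one dictionary access per query word, but inserting every suffix can create a number of nodes that grows quadratically with the length of an item, which is consistent with the memory concern raised in the answer.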
