On Sep 25, 4:11 pm, "exhuma.twn" <exh...@gmail.com> wrote:
> Is it possible to calculate a distance between two chunks of text? I
> suppose one could simply do a simple word-count on the chunks
> (removing common noise words of course). And then go from there. Maybe
> even assigning different weighting to words. But maybe there is a well-
> tested and useful algorithm already available?

A good distance between two chunks of text is the number of changes
you have to make to one to transform it into the other. Have a look
at 'difflib' in the standard library; you should be able to code up
this sort of distance with it (although the details will depend on
what your text looks like).
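
As a rough sketch of that idea (not a definitive implementation), the
following uses difflib.SequenceMatcher on word lists. The function names
edit_distance and dissimilarity are made up for illustration, and splitting
on whitespace is an assumption about the text; you may want to lowercase,
strip punctuation, or drop noise words first. Note that difflib looks for
long matching blocks rather than a guaranteed minimal edit script, so the
count is an approximation of the true edit distance.

```python
import difflib


def edit_distance(a, b):
    """Approximate number of word-level changes needed to turn a into b.

    Hypothetical helper: counts the words touched by every non-equal
    opcode (replace, delete, insert) reported by difflib.
    """
    words_a, words_b = a.split(), b.split()
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent items in long sequences.
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    changes = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replace of n words by m words is counted as max(n, m) changes.
            changes += max(i2 - i1, j2 - j1)
    return changes


def dissimilarity(a, b):
    """Normalised distance in [0, 1]: 0.0 means identical word sequences."""
    matcher = difflib.SequenceMatcher(None, a.split(), b.split(), autojunk=False)
    return 1.0 - matcher.ratio()


if __name__ == "__main__":
    first = "the quick brown fox"
    second = "the quick red fox"
    print(edit_distance(first, second))
    print(dissimilarity(first, second))
```

The normalised variant is handy when the chunks differ a lot in length,
since a raw change count grows with the size of the text.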

--
Paul Hankin
