Overview

This module provides a small collection of functions for diffing text.

Installation

The Diff library consists of a single file (diff.lua) plus a test script (tests/basic.lua). To install it manually, put diff.lua somewhere in your Lua path.

You can also install it using LuaRocks with

luarocks install diff

or:

luarocks --from=http://sputnik.freewisdom.org/rocks/earth install diff

Using Diff

A simple diff of two strings:

> require("diff")
> for _, token in ipairs(diff.diff("This is a test", "This was a test!")) do 
>> print(token[1], token[2])
>> end
This    same
is      out
was     in
        same
a       same
        same
test    out
test!   in
        same

That is, diff.diff(old, new) returns a table of pairs, each consisting of a string and its status: "same" (the string is present in both), "in" (the string appeared in new), or "out" (the string was present in old but was removed).
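To make the pair format concrete, here is a minimal sketch that rebuilds both input texts from an annotated token list. The token table is written by hand in the documented {token, status} shape, so the exact whitespace tokens the library emits may differ, and reconstruct() is a hypothetical helper that is not part of diff.lua.

```lua
-- Hand-written sample in the documented {token, status} format.
-- Assumption: the real library may tokenize whitespace differently.
local tokens = {
  {"This", "same"}, {" ", "same"},
  {"is", "out"}, {"was", "in"},
  {" ", "same"}, {"a", "same"}, {" ", "same"},
  {"test", "out"}, {"test!", "in"},
}

-- Hypothetical helper (not part of diff.lua): rebuilds the old and new
-- texts from an annotated token list.
local function reconstruct(token_list)
  local old_parts, new_parts = {}, {}
  for _, pair in ipairs(token_list) do
    local text, status = pair[1], pair[2]
    -- "same" and "out" tokens were present in the old text.
    if status == "same" or status == "out" then
      old_parts[#old_parts + 1] = text
    end
    -- "same" and "in" tokens are present in the new text.
    if status == "same" or status == "in" then
      new_parts[#new_parts + 1] = text
    end
  end
  return table.concat(old_parts), table.concat(new_parts)
end

local old_text, new_text = reconstruct(tokens)
print(old_text)  --> This is a test
print(new_text)  --> This was a test!
```

Reading the list this way shows why every token carries a status: the old text is the "same" plus "out" tokens, and the new text is the "same" plus "in" tokens.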

Alternatively, you can generate HTML for this diff directly:

> = diff.diff("This is a test", "This was a test!"):to_html()
This <del>is</del><ins>was</ins> a <del>test</del><ins>test!</ins>
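The mapping from token statuses to markup can be sketched as follows. This is an illustration only, not the code in diff.lua: tokens_to_html() and escape() are hypothetical names, the token table is hand-written, and the set of characters escaped here is an assumption.

```lua
-- Minimal HTML escaping. Assumption: the characters handled by the
-- library's own escape_html() may differ from this set.
local function escape(text)
  text = text:gsub("&", "&amp;")
  text = text:gsub("<", "&lt;")
  text = text:gsub(">", "&gt;")
  return text
end

-- Hypothetical helper: wraps "in" tokens in <ins>, "out" tokens in <del>,
-- and passes "same" tokens through unchanged.
local function tokens_to_html(token_list)
  local parts = {}
  for _, pair in ipairs(token_list) do
    local text, status = escape(pair[1]), pair[2]
    if status == "in" then
      parts[#parts + 1] = "<ins>" .. text .. "</ins>"
    elseif status == "out" then
      parts[#parts + 1] = "<del>" .. text .. "</del>"
    else
      parts[#parts + 1] = text
    end
  end
  return table.concat(parts)
end

-- Hand-written token list in the documented {token, status} format.
local html = tokens_to_html({
  {"This", "same"}, {" ", "same"},
  {"is", "out"}, {"was", "in"},
  {" ", "same"}, {"a", "same"}, {" ", "same"},
  {"test", "out"}, {"test!", "in"},
})
print(html)
--> This <del>is</del><ins>was</ins> a <del>test</del><ins>test!</ins>
```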

Contact

Please contact Yuri Takhteyev (yuri -at- freewisdom.org) with any questions.

LuaDoc

diff

Provides functions for diffing text.
(c) 2007, 2008 Yuri Takhteyev (yuri@freewisdom.org)
(c) 2007 Hisham Muhammad
License: MIT/X, see http://sputnik.freewisdom.org/en/License

diff() Returns a diff of two strings as a list of pairs, where the first value represents a token and the second the token's status ("same", "in", "out").
old:
The "old" text string
new:
The "new" text string
separator:
[optional] the separator pattern (defaults to any white space).
Returns: A list of annotated tokens.
escape_html() Escapes an HTML string.
text:
The string to be escaped.
Returns: Escaped string.
format_as_html() Formats an inline diff as HTML, with <ins> and <del> tags.
tokens:
a table of {token, status} pairs.
Returns: an HTML string.
quick_LCS() Derives the longest common subsequence of two strings.
t1:
the first string.
t2:
the second string.
Returns: the longest common subsequence as a matrix.
split() Split a string into tokens.
text:
A string to be split.
separator:
[optional] the separator pattern (defaults to any white space - %s+).
skip_separator:
[optional] don't include the separator in the results.
Returns: A list of tokens.
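
The two building blocks documented above, tokenizing and finding a longest common subsequence, can be sketched in plain Lua as follows. This is a textbook dynamic-programming version written for illustration, not the code in diff.lua. The names split_tokens(), lcs_matrix() and lcs_tokens() are hypothetical, and the sketch assumes the LCS step works on token arrays rather than raw strings.

```lua
-- Hypothetical tokenizer: splits text on a Lua pattern and keeps the
-- separators as tokens unless skip_separator is true.
local function split_tokens(text, separator, skip_separator)
  separator = separator or "%s+"
  local tokens, position = {}, 1
  while position <= #text do
    local first, last = text:find(separator, position)
    -- No match (or an empty match): the rest of the text is one token.
    if not first or last < first then
      tokens[#tokens + 1] = text:sub(position)
      break
    end
    if first > position then
      tokens[#tokens + 1] = text:sub(position, first - 1)
    end
    if not skip_separator then
      tokens[#tokens + 1] = text:sub(first, last)
    end
    position = last + 1
  end
  return tokens
end

-- Builds the LCS length table: c[i][j] is the LCS length of the first
-- i tokens of a and the first j tokens of b.
local function lcs_matrix(a, b)
  local c = {}
  for i = 0, #a do
    c[i] = {}
    for j = 0, #b do
      c[i][j] = 0
    end
  end
  for i = 1, #a do
    for j = 1, #b do
      if a[i] == b[j] then
        c[i][j] = c[i - 1][j - 1] + 1
      else
        c[i][j] = math.max(c[i - 1][j], c[i][j - 1])
      end
    end
  end
  return c
end

-- Walks the table backwards to recover one longest common subsequence.
local function lcs_tokens(a, b)
  local c = lcs_matrix(a, b)
  local i, j, result = #a, #b, {}
  while i > 0 and j > 0 do
    if a[i] == b[j] then
      table.insert(result, 1, a[i])
      i, j = i - 1, j - 1
    elseif c[i - 1][j] >= c[i][j - 1] then
      i = i - 1
    else
      j = j - 1
    end
  end
  return result
end

local old_tokens = split_tokens("This is a test", "%s+", true)
local new_tokens = split_tokens("This was a test!", "%s+", true)
local common = table.concat(lcs_tokens(old_tokens, new_tokens), " ")
print(#split_tokens("This is a test"))  --> 7
print(common)                           --> This a
```

Tokens outside the common subsequence are the ones a diff reports as "in" or "out". The table costs time and memory proportional to the product of the two token counts, which is worth keeping in mind for long inputs.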

License

Copyright (c) 2007, 2008 Yuri Takhteyev, Hisham Muhammad

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.