Nilsimsa is an anti-spam focused locality-sensitive hashing algorithm originally proposed by the cmeclax remailer operator in 2001 [1] and then reviewed by Ernesto Damiani et al. in their 2004 paper titled "An Open Digest-based Technique for Spam Detection". [2] The goal of Nilsimsa is to generate a hash digest of an email message such that the digests of two similar messages are similar to each other. In contrast to cryptographic hash functions such as SHA-1 or MD5, a small modification to a document does not substantially change the resulting hash of the document. The paper suggests that Nilsimsa satisfies three requirements:
Subsequent testing on a range of file types found that the Nilsimsa hash has a significantly higher false positive rate than other similarity digest schemes such as TLSH, ssdeep and sdhash. [3]
Jesse Kornblum took Nilsimsa similarity matching into consideration when developing fuzzy hashing in 2006, [4] which used the algorithms of spamsum by Andrew Tridgell (2002). [5]
Several implementations of Nilsimsa exist as open-source software. [6] [7] [8] [9] [10]
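The contrast between a locality-sensitive digest and a cryptographic hash can be illustrated with a minimal Python sketch. It is not the Nilsimsa algorithm itself: the function `toy_digest`, its CRC32-based bucketing, and the 256-bit digest width are assumptions chosen for illustration only. The sketch counts byte trigrams into buckets, sets a bit for each bucket whose count exceeds the mean, and compares digests by the number of differing bits.

```python
import hashlib
import zlib


def toy_digest(text: str, bits: int = 256) -> int:
    """Toy locality-sensitive digest (illustrative only, NOT real Nilsimsa)."""
    data = text.encode("utf-8")
    counts = [0] * bits
    # Count every overlapping 3-byte window into one of `bits` buckets.
    for i in range(len(data) - 2):
        bucket = zlib.crc32(data[i:i + 3]) % bits
        counts[bucket] += 1
    # Set a digest bit for each bucket whose count exceeds the mean count.
    threshold = sum(counts) / bits
    digest = 0
    for position, count in enumerate(counts):
        if count > threshold:
            digest |= 1 << position
    return digest


def sha1_int(text: str) -> int:
    """SHA-1 digest of the text as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def bit_difference(a: int, b: int) -> int:
    """Number of bit positions in which two digests differ."""
    return bin(a ^ b).count("1")


original = "Buy cheap watches now, limited offer, click the link below today"
modified = original.replace("watches", "witches")  # one-character change

print("toy digest bits changed:", bit_difference(toy_digest(original), toy_digest(modified)))
print("SHA-1 bits changed:     ", bit_difference(sha1_int(original), sha1_int(modified)))
```

Replacing a single character with another of the same byte length alters at most three trigrams, so at most six buckets (and therefore at most six bits of the toy digest) can change, whereas a cryptographic hash such as SHA-1 is designed so that roughly half of its output bits change.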