Charles Reid ba8f4b39f9 Update 'minhash.go' 2 weeks ago
cmd/minhash-lsh-all-pair remove use of concurrent lsh query, update key to use interface{} 1 year ago
.gitignore Initial commit 1 year ago
.travis.yml fix value count string parse 1 year ago
LICENSE Initial commit 1 year ago
README.md Update 'README.md' 2 weeks ago
lsh.go remove unnecessary code 1 year ago
lsh_benchmark_test.go update signature interface 1 year ago
lsh_test.go update signature interface 1 year ago
minhash.go Update 'minhash.go' 2 weeks ago
minhash_test.go first commit 1 year ago

README.md

FORK: This fork uses a vendored version of go-minhash (see gominhash).

Minhash LSH in Golang


Documentation

Install: go get github.com/ekzhu/minhash-lsh

Run Benchmark

Set file format

  1. One set per line
  2. Within each set, items are separated by whitespace
  3. If the parameter firstItemIsID is set to true, the first item is the unique ID of the set.
  4. The remaining items have the following format: <value>____<frequency>

    • value is a unique element of the set
    • frequency is an integer count of the occurrences of value
    • ____ (4 underscores) is the separator
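
As an illustration of the format above, here is a minimal sketch of how one line of a set file could be parsed. It is not the parser used by minhash-lsh-all-pair; the names SetItem and parseSetLine are hypothetical, and splitting on the last occurrence of the separator is an assumption made so that values containing underscores survive.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SetItem is a hypothetical type holding one parsed <value>____<frequency> token.
type SetItem struct {
	Value     string
	Frequency int
}

// separator is the 4-underscore delimiter described in the set file format.
const separator = "____"

// parseSetLine parses a single line of a set file (hypothetical helper).
// If firstItemIsID is true, the first whitespace-separated token is returned
// as the set ID and the remaining tokens are parsed as items.
func parseSetLine(line string, firstItemIsID bool) (string, []SetItem, error) {
	fields := strings.Fields(line) // splits on any run of whitespace
	id := ""
	if firstItemIsID {
		if len(fields) == 0 {
			return "", nil, fmt.Errorf("empty line: missing set ID")
		}
		id, fields = fields[0], fields[1:]
	}
	items := make([]SetItem, 0, len(fields))
	for _, f := range fields {
		// Assumption: split on the LAST separator so that values which
		// themselves contain underscores are kept intact.
		i := strings.LastIndex(f, separator)
		if i < 0 {
			return "", nil, fmt.Errorf("item %q has no %q separator", f, separator)
		}
		freq, err := strconv.Atoi(f[i+len(separator):])
		if err != nil {
			return "", nil, fmt.Errorf("item %q: invalid frequency: %w", f, err)
		}
		items = append(items, SetItem{Value: f[:i], Frequency: freq})
	}
	return id, items, nil
}

func main() {
	// Example line: set ID "set1" followed by two items.
	id, items, err := parseSetLine("set1 apple____3 pear____1", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, items)
}
```

With firstItemIsID set to false, the same line would fail to parse, because the token set1 carries no ____ separator.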

All Pair Benchmark

minhash-lsh-all-pair -input <set file name>