
FORK: This fork uses a vendored version of go-minhash (see gominhash).

Minhash LSH in Golang



Install: go get

Run Benchmark

Set file format

  1. One set per line.
  2. Within each set, items are separated by whitespace.
  3. If the parameter firstItemIsID is set to true, the first item is the unique ID of the set.
  4. The remaining items have the following format: <value>____<frequency>

    • value is a unique element of the set
    • frequency is an integer count of the occurrences of value
    • ____ (4 underscores) is the separator
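
The format above can be illustrated with a small parser. This is only a sketch, not the code used by the benchmark: parseSetLine and the item type are hypothetical names introduced here, and splitting on the last occurrence of the separator (so that a value may itself contain underscores) is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// item is one parsed <value>____<frequency> entry (hypothetical type).
type item struct {
	Value string
	Freq  int
}

// parseSetLine parses one line of a set file. It is a hypothetical helper,
// not part of this package's API.
func parseSetLine(line string, firstItemIsID bool) (string, []item, error) {
	fields := strings.Fields(line) // items are separated by whitespace
	id := ""
	if firstItemIsID {
		if len(fields) == 0 {
			return "", nil, fmt.Errorf("empty line: missing set ID")
		}
		id, fields = fields[0], fields[1:]
	}
	items := make([]item, 0, len(fields))
	for _, f := range fields {
		// Assumption: split on the LAST separator, so values may contain underscores.
		i := strings.LastIndex(f, "____")
		if i < 0 {
			return "", nil, fmt.Errorf("item %q has no ____ separator", f)
		}
		freq, convErr := strconv.Atoi(f[i+4:])
		if convErr != nil {
			return "", nil, fmt.Errorf("item %q: bad frequency: %v", f, convErr)
		}
		items = append(items, item{Value: f[:i], Freq: freq})
	}
	return id, items, nil
}

func main() {
	id, items, err := parseSetLine("set1 apple____3 pear____1", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, items)
}
```

Here the line "set1 apple____3 pear____1" with firstItemIsID set to true is read as the set with ID set1 containing apple (frequency 3) and pear (frequency 1).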

All Pair Benchmark

minhash-lsh-all-pair -input <set file name>