README.md

LFW dataset, converted to fuel

Labeled Faces in the Wild is a database of face photographs designed for studying the problem of unconstrained face recognition.

This project currently packages the pairsDevTrain / pairsDevTest image sets into a fuel-compatible dataset, along with targets that indicate whether each pair shows the same person or different people. In addition to the original LFW images, conversion is supported for both the funneled and deepfunneled versions.

This project uses kerosene to produce a fuel-compatible HDF5 file that is usable by blocks or keras.

Show me

From the included example

from keras.models import Sequential
from lfw_fuel import lfw

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = lfw.load_data(format="deepfunneled")

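# (build the perfect model here, compiled with metrics=["accuracy"])

# recent Keras versions no longer accept show_accuracy; accuracy is
# requested through the metrics argument of model.compile() instead
model.fit(X_train, y_train, validation_data=(X_test, y_test))
score = model.evaluate(X_test, y_test, verbose=0)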

The features are currently stored in six channels: three for each of the two RGB images to be compared.

Note that the images are 250x250 pixels, which is quite large by most CNN standards. They can be cropped and scaled before being passed to the network, as shown in the example.
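
As a rough illustration of that cropping and scaling step, the sketch below splits a batch into its two RGB images, center-crops them, and downsamples by striding. It assumes a channels-first array of shape (N, 6, 250, 250) in which the first three channels belong to the first image. That layout, the crop size, and the helper names `split_pair` and `center_crop_and_stride` are assumptions for illustration and are not taken from lfw_fuel, so check `X_train.shape` before relying on it.

```python
import numpy as np


def split_pair(batch):
    """Split (N, 6, H, W) into two (N, 3, H, W) RGB batches (assumed layout)."""
    return batch[:, :3], batch[:, 3:]


def center_crop_and_stride(batch, crop=128, step=2):
    """Center-crop to crop x crop pixels, then keep every `step`-th pixel."""
    _, _, height, width = batch.shape
    top = (height - crop) // 2
    left = (width - crop) // 2
    cropped = batch[:, :, top:top + crop, left:left + crop]
    return cropped[:, :, ::step, ::step]


# Stand-in for X_train: four random "pairs" at the 250x250 size noted above.
fake_batch = np.random.randint(0, 256, size=(4, 6, 250, 250), dtype=np.uint8)
small = center_crop_and_stride(fake_batch)
first, second = split_pair(small)
print(small.shape, first.shape, second.shape)
```

Striding is the crudest possible way to downscale; an interpolating resize would usually give better inputs, but it keeps the sketch free of extra dependencies.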

What's this dataset all about again?

The primary task of Labeled Faces in the Wild is to learn whether the faces in two pictures belong to the same person or to two different people. There are 2200 training pairs and 1000 test pairs in the predefined split.
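
Since the task reduces to a same/different decision per pair, a model can be scored by the fraction of pairs it gets right. The sketch below shows that calculation on toy data. The encoding (1 for same person, 0 for different people) and the function name `pair_accuracy` are assumptions for illustration; check the targets returned by `lfw.load_data` for the actual encoding.

```python
def pair_accuracy(predicted_same, target_same):
    """Fraction of pairs whose same/different prediction matches the target."""
    if len(predicted_same) != len(target_same):
        raise ValueError("predictions and targets must have the same length")
    if len(target_same) == 0:
        raise ValueError("need at least one pair to score")
    hits = sum(
        1 for p, t in zip(predicted_same, target_same) if bool(p) == bool(t)
    )
    return hits / float(len(target_same))


# Hypothetical toy targets: 1 = same person, 0 = different people.
targets = [1, 1, 1, 0, 0, 0]
predictions = [1, 0, 1, 0, 0, 1]
print(pair_accuracy(predictions, targets))
```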

Here are three matching training pairs:

Image 1	Image 2	Status
		MATCH
		MATCH
		MATCH

And here are three non-matching training pairs:

Image 1	Image 2	Status
		DIFFERENT
		DIFFERENT
		DIFFERENT

In addition to this raw format, the dataset is provided in at least two "preprocessed" versions called funneled and deepfunneled. The versions are often very similar, but here is an example of how they can differ.

Original	Funneled	Deep Funneled

On the LFW page you can browse the complete training set or the complete test set and see all three versions of all images.

Example

An example of how to train a network for this task using keras is included. To run it from the repo:

$ python example/run-lfw.py

This should run the example, downloading the dataset if necessary.

Note that the example currently runs, but its performance is poor. Suggestions or merge requests that improve it are certainly welcome.

Installation

Installation is optional: if kerosene is installed, simply clone the repo and run the example script. Installing puts lfw_fuel on the Python path, which can be useful if you'd like to use this dataset in your own blocks or keras project.

python setup.py install

You can also rebuild the hdf5 files from scratch by running fuel-download and fuel-convert with the extra_downloaders and extra_converters settings pointing at lfw_fuel, for example through environment variables:

FUEL_EXTRA_DOWNLOADERS="lfw_fuel" fuel-download lfw
FUEL_EXTRA_CONVERTERS="lfw_fuel" fuel-convert lfw

These commands convert the original version of LFW, but the funneled and deepfunneled formats are also supported:

FUEL_EXTRA_DOWNLOADERS="lfw_fuel" fuel-download lfw --format deepfunneled
FUEL_EXTRA_CONVERTERS="lfw_fuel" fuel-convert lfw --format deepfunneled

These settings can also be set in the ~/.fuelrc file:

extra_downloaders: ['lfw_fuel']
extra_converters: ['lfw_fuel']
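
If you would rather drive the rebuild from Python than from the shell, the commands above can be assembled programmatically. The sketch below only builds the argument list and environment for each step and does not run anything. The helper name `fuel_step` is hypothetical; the command names, the `--format` flag, and the environment variable names are copied from the shell examples above.

```python
import os


def fuel_step(step, fmt=None, base_env=None):
    """Return (argv, env) for `fuel-download lfw` or `fuel-convert lfw`."""
    if step not in ("download", "convert"):
        raise ValueError("step must be 'download' or 'convert'")
    env = dict(os.environ if base_env is None else base_env)
    # Mirrors the shell prefix used above, e.g. FUEL_EXTRA_DOWNLOADERS="lfw_fuel".
    env["FUEL_EXTRA_{}ERS".format(step.upper())] = "lfw_fuel"
    argv = ["fuel-" + step, "lfw"]
    if fmt is not None:
        argv += ["--format", fmt]
    return argv, env


argv, env = fuel_step("convert", fmt="deepfunneled", base_env={})
print(argv, env)
# To actually run a step (requires fuel and lfw_fuel to be installed):
# import subprocess; subprocess.check_call(argv, env=env)
```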

License

MIT