Fast cheminformatics fingerprint search, anywhere you use Python

Chemfp is a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. Its market-leading performance and comprehensive API make it easy for you to add fast similarity search anywhere you use Python.

NEW! Chemfp 3.5.1 was released on 4 February 2021. See the documentation for the full list of notable changes or go to the download page.

Why chemfp?

If that sounds interesting

You can get started by downloading the most recent version, chemfp 3.5.1, using the following:

python -m pip install chemfp -i

A few features are either limited or disabled. Visit the licensing page to see the licensing terms, to request an evaluation key to unlock those features, and to learn about some of the available licensing options.

You do not need to request a license key for Tanimoto searches of the licensed FPB files available from the datasets page, so long as you follow the terms of the Chemfp Base License Agreement.

More information

Chemfp includes extensive documentation. For a more scholarly description, see: Dalke, A. The chemfp project. J. Cheminformatics 11, 76 (2019). doi: 10.1186/s13321-019-0398-8

Open source reference baseline for benchmarking

Chemfp 1.6.1 is the latest version of the no-cost/open source chemfp development track. It only supports Python 2.7. It is maintained to provide a good reference baseline for evaluating similarity search performance, and to support the dwindling number of legacy users who haven't moved to Python 3. See the download page for download details.

Some of the many improvements in chemfp 3.x are: higher performance, support for the FPB binary format for fast loading times, support for more than 4GB of fingerprint data, sublinear Tversky search in addition to sublinear Tanimoto search, API improvements for web development, and support for both Python 2.7 and Python 3.6+.
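
To make the Tanimoto and Tversky search terms concrete, here is a minimal pure-Python sketch of the two similarity measures and of a threshold search that uses a popcount bound to skip targets that cannot reach the threshold. It does not use chemfp and is not chemfp's implementation. Fingerprints are modeled as Python integers, and the names popcount, tanimoto, tversky, and threshold_search are hypothetical helpers chosen for illustration.

```python
# Illustrative only: fingerprints are modeled as non-negative Python ints,
# one bit per fingerprint position. None of these names come from chemfp.

def popcount(fp: int) -> int:
    """Number of bits set in the fingerprint."""
    return bin(fp).count("1")

def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity: |A & B| / |A | B| (0.0 if both are empty)."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0

def tversky(a: int, b: int, alpha: float, beta: float) -> float:
    """Tversky index: c / (c + alpha*|A only| + beta*|B only|)."""
    common = popcount(a & b)
    only_a = popcount(a & ~b)
    only_b = popcount(b & ~a)
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0

def threshold_search(query: int, targets, threshold: float):
    """Return (id, score) pairs with Tanimoto >= threshold.

    The Tanimoto score can never exceed min(q, t) / max(q, t), where q and t
    are the popcounts of the query and target, so targets that fail this
    bound are skipped without computing the full score.
    """
    q = popcount(query)
    hits = []
    for target_id, fp in targets:
        t = popcount(fp)
        if max(q, t) == 0 or min(q, t) < threshold * max(q, t):
            continue  # cannot reach the threshold
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((target_id, score))
    return hits

example_targets = [("a", 0b1111), ("b", 0b0111), ("c", 0b0001)]
example_hits = threshold_search(0b1111, example_targets, 0.7)
```

With alpha = beta = 1 the Tversky index reduces to the Tanimoto similarity. This sketch still visits every target; an implementation that keeps the targets sorted by popcount could skip whole ranges at once, which is the usual route to sublinear search.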