📅  Last modified: 2023-12-03 14:47:51.986000             🧑  Author: Mango

Tanimoto Coefficient in RDKit using Python

The Tanimoto coefficient is a similarity metric that is often used in cheminformatics, especially for the comparison of molecular structures. In RDKit, the Tanimoto coefficient can be calculated using the DataStructs.TanimotoSimilarity function.
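
For bit-vector fingerprints, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in at least one of them. The following is a minimal pure-Python sketch of that definition. The function name tanimoto, the example bit sets, and the choice to return 0.0 for two empty fingerprints are assumptions made for illustration and are not part of RDKit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints score 0.0
        return 0.0
    return len(a & b) / len(union)


# Bits 1 and 4 are shared; bits 1, 4, 7, 8 and 9 are set in at least one set
print(tanimoto({1, 4, 7, 9}, {1, 4, 8}))  # 2 / 5 = 0.4
```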

Requirements

RDKit must be installed in order to use the DataStructs.TanimotoSimilarity function. RDKit can be installed via pip:

pip install rdkit

Usage

To calculate the Tanimoto coefficient between two molecules, we first need a binary fingerprint for each one. The Chem module parses the molecules, and the AllChem module generates the fingerprints.

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit import DataStructs

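# Parse the two molecules from their SMILES strings
mol1 = Chem.MolFromSmiles('CC(=O)C1=CC=CC=C1C(=O)O')
mol2 = Chem.MolFromSmiles('CC(C)C1=CC=CC=C1C(=O)O')

# Morgan fingerprints with radius 2, folded into 1024-bit vectors
fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 2, nBits=1024)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 2, nBits=1024)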

Here, we have created two molecules, mol1 and mol2, from their SMILES notation, and converted them into binary fingerprints using the GetMorganFingerprintAsBitVect function. The second argument (2 here) is the radius of the Morgan fingerprint, that is, how many bonds away from each atom the algorithm looks, while nBits sets the length of the resulting bit vector.
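
The two steps above can be wrapped in a small helper. This is a sketch only: smiles_to_fp and its parameter names are hypothetical and not part of RDKit. It also adds a check that the snippet above leaves out, because Chem.MolFromSmiles returns None when it cannot parse the input.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_fp(smiles, radius=2, n_bits=1024):
    """Parse a SMILES string and return its Morgan bit-vector fingerprint.

    Hypothetical helper for illustration; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # MolFromSmiles returns None when parsing fails
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)


fp = smiles_to_fp('CC(=O)C1=CC=CC=C1C(=O)O')
print(fp.GetNumBits())    # length of the bit vector, set by n_bits
print(fp.GetNumOnBits())  # number of bits that are switched on
```

Failing early on an unparsable SMILES string keeps the error close to its cause instead of surfacing later inside the fingerprint call.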

Once we have two fingerprints, we can calculate their Tanimoto coefficient using the TanimotoSimilarity function.

similarity = DataStructs.TanimotoSimilarity(fp1, fp2)

Here, the TanimotoSimilarity function takes two fingerprints as input and returns their Tanimoto coefficient as a float between 0 and 1. A value of 1 means the two fingerprints are identical, and 0 means they share no set bits.
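
As a sanity check, the same value can be recomputed from the bit counts. The sketch below assumes the two example molecules from this article and uses the & and | operators of RDKit bit vectors together with GetNumOnBits. The variable names common, total and manual are illustrative.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

mol1 = Chem.MolFromSmiles('CC(=O)C1=CC=CC=C1C(=O)O')
mol2 = Chem.MolFromSmiles('CC(C)C1=CC=CC=C1C(=O)O')
fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 2, nBits=1024)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 2, nBits=1024)

similarity = DataStructs.TanimotoSimilarity(fp1, fp2)

# Recompute by hand: bits set in both / bits set in either
common = (fp1 & fp2).GetNumOnBits()
total = (fp1 | fp2).GetNumOnBits()
manual = common / total

print(similarity, manual)  # the two values should agree

# A fingerprint compared with itself has similarity 1.0
print(DataStructs.TanimotoSimilarity(fp1, fp1))
```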

Conclusion

The Tanimoto coefficient is a useful metric for comparing molecular structures. In RDKit, it can be calculated with the DataStructs.TanimotoSimilarity function, which takes two binary fingerprints as input and returns their Tanimoto coefficient. Combined with the molecule parsing and fingerprint functions in the Chem module, this lets cheminformaticians compare and analyze molecular structures with only a few lines of code.