{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Cheminformatics\n", "## 10/31/2023 🎃\n" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "*Cheminformatics (also known as chemoinformatics, chemioinformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry.* \n", "--Wikipedia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Open Source Cheminformatics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* rdkit [http://www.rdkit.org](http://www.rdkit.org)\n", " * BSD License\n", " * Relatively new, very nicely architected C++ backend\n", " * Actively developed\n", " * Native Python interface\n", "\n", "* OpenBabel [http://openbabel.org](http://openbabel.org)\n", " * GPL (GNU General Public License)\n", " * Older (forked from OpenEye's OELib in 2001), a bit crufty and complicated\n", " * Lots of functionality (e.g., support for more than 100 file formats)\n", " * Python interface is through SWIG (auto-generated) bindings to C/C++\n", " * Includes standalone programs: babel, obabel, etc.\n", " \n", "* Pybel\n", " * A native, user-friendly Python interface to OpenBabel\n", " * Limited functionality (but can always fall back to OpenBabel)\n", " * Simplest to use\n", " * **Note:** Pybel is installed as part of openbabel. 
There is a completely unrelated Python package called PyBEL that is *not* what you want." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# File Formats" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

2D

\n", "\n", " SMILES\n", "\n", "

3D

\n", " pdb, sdf, mol2\n", "
\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Simplified Molecular Input Line Entry System (SMILES)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Atoms

\n", "\n", "Specified by their atomic symbols inside brackets\n", "\n", "* [Au], [Fe], [Zn], etc\n", "\n", "No brackets needed for organic subset: B, C, N, O, P, S, F, Cl, Br, and I\n", "\n", "Aromatic atoms are lower case: c1ccccc1\n", "\n", "

Bonds

\n", "\n", "* Single -\n", "* Double =\n", "* Triple #\n", "* Aromatic :\n", "\n", "Single and aromatic can be omitted.\n" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "
\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# SMILES, cont." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "## Branches\n", "\n", "Parentheses denote branches and can be nested.\n", "\n", "Example: SC(N)CO\n", "\n", "## Cycles\n", "\n", "Break a bond in the cycle and use a digit to label the break.\n", "\n", "\n", "\n", "As long as rings are separate, digits can be reused.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# SMILES, cont.\n", "\n", "## Disconnections\n", "\n", "A period `.` separates nonbonded molecules.\n", "\n", "[Na+].[Cl-]\n", "\n", "## Isomeric Smiles\n", "Slashes (`/ \\`) denote configuration around double bonds.\n", "\n", "At (`@`) denotes configuration around chiral centers.\n", "\n" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "
\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Drawing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All but the simplest smiles can be challenging to interpret (especially if chirality is included). Fortunately, you can use pybel (or molecular viewers like [MarvinView](https://www.chemaxon.com/products/marvin/marvinview/)) to convert them to their 2D representation.\n", "\n", "Example: CC(NC1=CC=C(O)C=C1)=O" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "CH\n", "3\n", "NH\n", "OH\n", "O\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from openbabel import pybel\n", "mol = pybel.readstring('smi','CC(NC1=CC=C(O)C=C1)=O')\n", "mol" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "mol.draw(filename=\"imgs/accet.png\",show=False) " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# SMARTS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Regular expressions for molecules.\n", "\n", "All SMILES are SMARTS (exact matches). 
Additionally, SMARTS support\n", "\n", "* wild cards \n", " * `C~*~C` any atom can be between two carbons using any (~) bond\n", " * `a1aaaaa1` any aromatic 6 atom ring\n", "* property testing \n", " * `[R]` atom in a ring\n", " * `[#6]` atomic number is 6 (matches aromatic or aliphatic)\n", " * `[D3]` atom with three explicit bonds (degree)\n", "* logical operators (not - !, and - & ;, or - ,)\n", " * `[!C&R]` not aliphatic carbon and in ring\n", " * `[F,Cl,Br,I]` one of the first four halogens\n", "* matching an atomic environment ('recursive' SMARTS)\n", " * `[$(*O);$(*C)]` this matches one atom that is bound to both C and O" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "
\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Pybel Input/Output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`pybel.readstring`\n", "\n", "Takes a format and string with molecular data in it and returns a single molecule." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mol = pybel.readstring('smi','CCCC')\n", "len(mol.atoms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For simple output, use the molecule's `write` method, which takes the format" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " OpenBabel10302320482D\n", "\n", " 4 3 0 0 0 0 0 0 0 0999 V2000\n", " 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n", " 1 2 1 0 0 0 0\n", " 2 3 1 0 0 0 0\n", " 3 4 1 0 0 0 0\n", "M END\n", "$$$$\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "==============================\n", "*** Open Babel Warning in WriteMolecule\n", " No 2D or 3D coordinates exist. Stereochemical information will be stored using an Open Babel extension. To generate 2D or 3D coordinates instead use --gen2D or --gen3D.\n" ] } ], "source": [ "mol.write('sdf','output.sdf',overwrite=True) #write to file\n", "print(mol.write('sdf')) #no filename - return string" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# `pybel.readfile`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`pybel.readfile`\n", "\n", "Takes a format and file name and returns an *iterator* over all the molecules in the file." 
] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols = list(pybel.readfile('smi','../files/results.smi')) #expand the iterator into a list\n", "len(mols)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "OH\n", "N\n", "N\n", "OH\n", "O\n", "O\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mol = next(pybel.readfile('smi','../files/results.smi')) #get just first molecule\n", "mol" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "N#Cc1c(O)c2C(=O)c3ccccc3C(=O)c2c(c1C#N)O\tNSC27034\n", "N#Cc1cc2SCCSCCCSCCSc2cc1C#N\tNSC680721\n", "N#Cc1cc2CN(CCN(CCN(CCN(Cc2cc1C#N)S(=O)(=O)c1ccc(cc1)C)S(=O)(=O)c1ccc(cc1)C)S(=O)(=O)c1ccc(cc1)C)S(=O)(=O)c1ccc(cc1)C\tNSC673657\n", "N#Cc1cc2/C(=N\\c3cccc(n3)N)/N=C(c2cc1C#N)Nc1cccc(n1)N\tNSC666078\n", "N#Cc1c(OC)ccc(c1C#N)O.COc1ccc(c(c1C#N)C#N)OC\tNSC618324\n", "N#Cc1c(C#N)c(O)c2c(c1O)c(N)ccc2\tNSC320651\n", "N#Cc1cc(ccc1C#N)NC(=O)CCCC(=O)Nc1ccc(c(c1)C#N)C#N\tNSC309816\n", "N#Cc1c(C#N)c(O)c(c(c1O)Cl)Cl\tNSC172566\n", "N#Cc1c(C#N)c(O)c2c(c1O)cccc2\tNSC128281\n", "N#Cc1cc(ccc1C#N)[N+](=O)[O-]\tNSC123374\n", "N#Cc1cc(ccc1C#N)Oc1ccc(c(c1)C#N)C#N\tNSC94808\n", "N#Cc1c2c(cc(c1C#N)[N+](=O)[O-])n(c1c2cccc1)C\tNSC92934\n", "N#Cc1c(O)ccc(c1C#N)O\tNSC43554\n", "N#Cc1ccccc1C#N\tNSC17562\n" ] } ], "source": [ "#of course, this is the most efficient way to read all\n", "for mol in 
pybel.readfile('smi','../files/results.smi'): \n", " print(mol.write('can').rstrip()) #canonical smiles" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# `pybel.Outputfile`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To output many molecules to the same file, use `pybel.Outputfile`" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "output = pybel.Outputfile('sdf','output.sdf',overwrite=True)\n", "for m in mols:\n", " output.write(m)\n", "output.close() #flush and close the file" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Molecules\n", "\n", "The molecule object provides a number of methods and access to the molecule's atoms and bonds." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6 6 6 6 6 6 6 7 6 7 " ] } ], "source": [ "for atom in mol:\n", " print(atom.atomicnum,end=' ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Atom properties include `atomicmass`, `atomicnum`, `coords`, `formalcharge`, `hyb`, `isotope`, `partialcharge`, `degree`, `explicitvalence` and `totalvalence`.\n", "\n", "Atoms can also be accessed in `mol.atoms`." ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "
\n", "" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(1, 1, 1, 4)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = pybel.readstring('smi','CC')\n", "a1 = m.atoms[0]\n", "a1.degree, a1.heavydegree, a1.explicitvalence, a1.totalvalence" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# SMARTS Matching" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SMARTS matching is done by initializing a `pybel.Smarts` object with a SMARTS expression. This can then be applied to any molecule to identify the matching atoms." ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "N\n", "N\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mol" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(1, 6, 5, 4, 3, 2)]" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "aromatic_ring = pybel.Smarts('a1aaaaa1')\n", "aromatic_ring.findall(mol) #returns all _unique_ matches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The returned matches are tuples of atom indices. The indices are 1-based, so index `i` refers to `mol.atoms[i-1]`." ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5\n", "8\n" ] } ], "source": [ "double_ring = pybel.Smarts('a1aaaa2a1aaaa2')\n", "for (i,m) in enumerate(mols):\n", " if double_ring.findall(m):\n", " print(i)\n", " m.draw(filename=\"r%d.png\"%i,show=False) \n" ] }, { "cell_type": "code", "execution_count": 62, 
"metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "OH\n", "N\n", "N\n", "OH\n", "NH\n", "2\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols[5]" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "OH\n", "N\n", "N\n", "OH\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols[8]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Molecular Properties" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "128.13076\n" ] } ], "source": [ "print(mol.molwt) #molecular weight" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'abonds': 6.0, 'atoms': 10.0, 'bonds': 10.0, 'cansmi': nan, 'cansmiNS': nan, 'dbonds': 0.0, 'formula': nan, 'HBA1': 2.0, 'HBA2': 2.0, 'HBD': 0.0, 'InChI': nan, 'InChIKey': nan, 'L5': nan, 'logP': 1.42996, 'MP': 79.79220000000001, 'MR': 35.872, 'MW': 128.13076, 'nF': 0.0, 'rotors': 0.0, 's': nan, 'sbonds': 2.0, 'smarts': nan, 'tbonds': 2.0, 'title': nan, 'TPSA': 47.58}\n" ] } ], "source": [ "desc = mol.calcdesc()\n", "print(desc)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.42996" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "desc['logP'] #calculated partition coefficient 
between octanol/water" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Lipinski's Rule of Five" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In 1997 Christopher Lipinski analyzed existing drugs and came up with a set of molecular property rules for classifying a small molecule as *drug-like*.\n", "\n", "* No more than 5 hydrogen bond donors\n", "* No more than 10 hydrogen bond acceptors\n", "* Molecular weight no more than 500 daltons\n", "* Partition coefficient logP no more than 5\n", "* There is no fifth rule" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "def lipinski(mol):\n", " desc = mol.calcdesc()\n", " return desc['HBD'] <= 5 and desc['HBA1'] <= 10 and desc['MW'] <= 500 and desc['logP'] <= 5" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "False\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n" ] } ], "source": [ "for m in mols:\n", " print(lipinski(m))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Fingerprints" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A *molecular fingerprint* reduces the chemical features of a molecule into a *bit vector*. Each feature of the fingerprint corresponds to a bit in the vector. This bit is set if the compound has that feature.\n", "\n", "The most common type of fingerprint is a Daylight-style fingerprint, where all the paths (up to a given length) are enumerated and *hashed* to their bit positions.\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Fingerprints, cont." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bit vectors can easily be compared, most commonly with the Tanimoto coefficient, the number of bits set in both fingerprints divided by the number set in either:\n", "$$\\frac{|A \\cap B|}{|A \\cup B|}$$\n", "\n", "This provides a quantitative measure of *chemical similarity*.\n", "\n", "Similarity search is a surprisingly effective mechanism of virtual screening (given enough data)." ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "openbabel.pybel.Fingerprint" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fp = mol.calcfp()\n", "type(fp)" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[75, 82, 224, 279, 296, 299, 348, 440, 442, 474, 503, 598, 656, 671, 711, 716, 728, 870, 906, 913, 937]\n" ] } ], "source": [ "print(fp.bits)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Chemical Similarity\n", "\n", "### Tanimoto coefficient\n", "$\\Large \\frac{|A \\cap B|}{|A \\cup B|}$ (1.0 means identical fingerprints)\n", "\n", "To calculate the Tanimoto similarity between two fingerprints, use the **|** operator" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.28\n", "0.3\n", "0.19626168224299065\n", "0.12138728323699421\n", "0.5\n", "0.4666666666666667\n", "0.29577464788732394\n", "0.42857142857142855\n", "0.6176470588235294\n", "0.4117647058823529\n", "0.4772727272727273\n", "0.22826086956521738\n", "0.6363636363636364\n", "1.0\n" ] } ], "source": [ "fp = mol.calcfp()\n", "for m in mols:\n", " f = m.calcfp()\n", " print(f | fp)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "
\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 2D -> 3D" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2.468272539514969, 0.6349880847110817, -0.018730053230247772)\n", "(2.46371187360455, 0.6341729709888883, -0.010376048284164541)\n" ] } ], "source": [ "mol.make3D() #this makes a reasonable 3D structure\n", "print(mol.atoms[0].coords)\n", "mol.localopt() #this further optimizes the structure\n", "print(mol.atoms[0].coords)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "sdf = mol.write('sdf')" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "application/3dmoljs_load.v0": "
\n
\n", "text/html": [ "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import py3Dmol\n", "view = py3Dmol.view()\n", "view.addModel(sdf)\n", "view.setStyle({'stick':{}})\n", "view.zoomTo()\n", "view.show()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# `sdf` Molecules" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols = list(pybel.readfile('sdf','../files/best.sdf'))\n", "len(mols)" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(-0.5939, -56.8911, 14.3139)\n" ] } ], "source": [ "atom = mols[0].atoms[0]\n", "print(atom.coords)" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ZINC78996542\r\n", "\r\n", "\r\n", " 39 44 0 0 0 0 0 0 0 0999 V2000\r\n", " -0.5939 -56.8911 14.3139 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3154 -57.8883 15.8741 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.3628 -55.5394 14.9296 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.0440 -55.7357 15.4805 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.3058 -57.7869 14.5684 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2724 -57.1748 15.3144 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.1864 -57.3893 16.5881 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.5650 -58.0576 12.9536 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.4112 -58.0403 11.5707 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.4635 -57.8375 13.7859 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.1560 -57.8031 11.0185 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.1883 -57.5962 13.2480 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.0573 -57.5833 11.8565 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9942 -57.3574 14.1090 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.7971 -57.1312 13.5121 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6648 -57.1197 12.0139 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 
-2.8049 -57.3464 11.2822 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.5742 -56.9136 11.4820 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7364 -57.3419 10.3001 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7198 -58.5030 16.2901 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5010 -56.2171 16.2506 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7946 -58.5045 17.6831 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5759 -56.2186 17.6435 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0729 -57.3592 15.5739 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2227 -57.3625 18.3596 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3046 -57.3655 19.8487 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.3111 -54.6466 15.3638 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.2563 -53.8710 14.6948 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.8121 -54.3645 13.5073 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.6178 -52.1004 11.5954 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.4057 -52.3452 11.0066 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.0836 -54.8943 14.7675 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.9949 -53.3368 13.4353 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.7496 -53.5881 12.8303 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.9229 -52.5915 12.8100 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4640 -53.0881 11.6153 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.6585 -59.9707 15.9992 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.1564 -60.0645 16.2196 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3350 -59.3635 15.5577 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 8 9 1 0 0 0\r\n", " 10 12 1 0 0 0\r\n", " 20 24 1 0 0 0\r\n", " 21 23 1 0 0 0\r\n", " 27 32 1 0 0 0\r\n", " 22 25 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 29 34 1 0 0 0\r\n", " 24 14 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 33 34 1 0 0 0\r\n", " 13 17 1 0 0 0\r\n", " 15 1 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 17 1 0 0 0\r\n", " 2 6 1 0 0 0\r\n", " 3 1 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 4 32 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 26 25 1 0 0 0\r\n", " 37 39 1 0 0 0\r\n", " 38 39 1 0 0 0\r\n", " 39 2 1 0 0 0\r\n", " 6 5 1 0 0 0\r\n", " 8 10 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 20 22 2 0 0 0\r\n", " 21 24 2 0 0 0\r\n", " 27 28 2 0 0 
0\r\n", " 23 25 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 30 31 2 0 0 0\r\n", " 30 35 2 0 0 0\r\n", " 31 36 2 0 0 0\r\n", " 12 13 2 0 0 0\r\n", " 33 35 2 0 0 0\r\n", " 34 36 2 0 0 0\r\n", " 14 15 2 0 0 0\r\n", " 1 5 2 0 0 0\r\n", " 16 18 2 0 0 0\r\n", " 2 7 2 0 0 0\r\n", " 17 19 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.83433\r\n", "\r\n", "> \r\n", "1.45522\r\n", "\r\n", "> \r\n", "475.372\r\n", "\r\n", "$$$$\r\n", "ZINC78996542\r\n", "\r\n", "\r\n", " 39 44 0 0 0 0 0 0 0 0999 V2000\r\n", " -0.5722 -56.8468 14.3132 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3170 -57.8869 15.8829 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.3244 -55.4995 14.9316 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.0775 -55.7161 15.4874 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.3140 -57.7556 14.5698 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2862 -57.1582 15.3202 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.1923 -57.4012 16.6007 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.5452 -57.9747 12.9290 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.3911 -57.9310 11.5468 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.4434 -57.7729 13.7658 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.1352 -57.6858 10.9997 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.1676 -57.5240 13.2330 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.0362 -57.4846 11.8422 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9733 -57.3042 14.0987 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.7756 -57.0691 13.5066 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6429 -57.0290 12.0089 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7832 -57.2393 11.2727 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.5516 -56.8149 11.4815 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7144 -57.2159 10.2909 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7038 -58.4927 16.2574 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4766 -56.2038 16.2618 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7789 -58.5209 17.6500 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5521 -56.2319 17.6545 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0525 -57.3342 15.5634 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2031 -57.3905 18.3484 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2855 -57.4221 19.8372 C 0 0 0 0 0 
0 0 0 0 0 0 0\r\n", " 3.3565 -54.6510 15.3845 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.3151 -53.8880 14.7200 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.8755 -54.3614 13.5148 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.7196 -52.1345 11.6161 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.5100 -52.3693 11.0185 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.1314 -54.8885 14.7794 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.0693 -53.3564 13.4560 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.8264 -53.5977 12.8421 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 5.0099 -52.6234 12.8352 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.5559 -53.0999 11.6229 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.6348 -59.9861 16.0001 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.1325 -60.0490 16.2302 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3170 -59.3619 15.5646 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 8 9 1 0 0 0\r\n", " 10 12 1 0 0 0\r\n", " 20 24 1 0 0 0\r\n", " 21 23 1 0 0 0\r\n", " 27 32 1 0 0 0\r\n", " 22 25 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 29 34 1 0 0 0\r\n", " 24 14 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 33 34 1 0 0 0\r\n", " 13 17 1 0 0 0\r\n", " 15 1 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 17 1 0 0 0\r\n", " 2 6 1 0 0 0\r\n", " 3 1 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 4 32 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 26 25 1 0 0 0\r\n", " 37 39 1 0 0 0\r\n", " 38 39 1 0 0 0\r\n", " 39 2 1 0 0 0\r\n", " 6 5 1 0 0 0\r\n", " 8 10 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 20 22 2 0 0 0\r\n", " 21 24 2 0 0 0\r\n", " 27 28 2 0 0 0\r\n", " 23 25 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 30 31 2 0 0 0\r\n", " 30 35 2 0 0 0\r\n", " 31 36 2 0 0 0\r\n", " 12 13 2 0 0 0\r\n", " 33 35 2 0 0 0\r\n", " 34 36 2 0 0 0\r\n", " 14 15 2 0 0 0\r\n", " 1 5 2 0 0 0\r\n", " 16 18 2 0 0 0\r\n", " 2 7 2 0 0 0\r\n", " 17 19 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.7915\r\n", "\r\n", "> \r\n", "1.18555\r\n", "\r\n", "> \r\n", "475.372\r\n", "\r\n", "$$$$\r\n", "ZINC78996534\r\n", "\r\n", "\r\n", " 39 44 0 0 0 0 0 0 0 0999 V2000\r\n", " -0.6060 -58.4259 14.4308 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.2622 -57.0761 
15.7885 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.3848 -59.6010 15.3414 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.0076 -59.2726 15.8653 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.2852 -57.4884 14.4946 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2358 -57.9070 15.3815 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.1167 -57.3919 16.6176 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.5490 -57.6468 12.7169 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.3640 -57.9721 11.3766 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.4660 -57.6655 13.6007 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.0959 -58.3170 10.9188 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.1782 -58.0108 13.1580 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.0156 -58.3341 11.8081 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0030 -58.0395 14.0758 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.7918 -58.3829 13.5703 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6256 -58.7300 12.1163 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7497 -58.6825 11.3283 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.5225 -59.0404 11.6676 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.6589 -58.9060 10.3738 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5198 -58.6843 16.4130 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.8163 -56.4211 15.9438 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6259 -58.3704 17.7678 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9225 -56.1072 17.2987 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1149 -57.7096 15.5009 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3273 -57.0819 18.2108 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4433 -56.7463 19.6592 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.8030 -59.9702 14.2440 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.7737 -60.8749 13.8173 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3083 -61.4219 16.0936 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 5.1631 -64.0341 14.7974 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.4366 -64.3052 15.9262 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.0669 -60.2450 15.3869 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.0225 -62.0537 14.5162 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.2759 -62.3324 15.6757 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.9642 -62.9109 14.0844 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.4901 -63.4609 16.3744 
N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2882 -54.7944 15.8358 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.6925 -55.1298 15.1839 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.2860 -55.7119 15.1441 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 8 9 1 0 0 0\r\n", " 10 12 1 0 0 0\r\n", " 20 24 1 0 0 0\r\n", " 21 23 1 0 0 0\r\n", " 27 32 1 0 0 0\r\n", " 22 25 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 29 34 1 0 0 0\r\n", " 24 14 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 33 34 1 0 0 0\r\n", " 13 17 1 0 0 0\r\n", " 15 1 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 17 1 0 0 0\r\n", " 2 6 1 0 0 0\r\n", " 3 1 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 4 32 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 26 25 1 0 0 0\r\n", " 37 39 1 0 0 0\r\n", " 38 39 1 0 0 0\r\n", " 39 2 1 0 0 0\r\n", " 6 5 1 0 0 0\r\n", " 8 10 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 20 22 2 0 0 0\r\n", " 21 24 2 0 0 0\r\n", " 27 28 2 0 0 0\r\n", " 23 25 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 30 31 2 0 0 0\r\n", " 30 35 2 0 0 0\r\n", " 31 36 2 0 0 0\r\n", " 12 13 2 0 0 0\r\n", " 33 35 2 0 0 0\r\n", " 34 36 2 0 0 0\r\n", " 14 15 2 0 0 0\r\n", " 1 5 2 0 0 0\r\n", " 16 18 2 0 0 0\r\n", " 2 7 2 0 0 0\r\n", " 17 19 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.60183\r\n", "\r\n", "> \r\n", "2.26383\r\n", "\r\n", "> \r\n", "475.372\r\n", "\r\n", "$$$$\r\n", "ZINC78996542\r\n", "\r\n", "\r\n", " 39 44 0 0 0 0 0 0 0 0999 V2000\r\n", " -1.1562 -57.7105 14.6555 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.0859 -57.6546 15.8298 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.1390 -56.2126 14.7797 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.3392 -55.9873 15.0720 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.0584 -58.3149 14.9816 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.8469 -57.3439 15.3020 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.9188 -56.8145 16.1731 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.8917 -60.1405 14.9063 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.9219 -60.5932 13.5908 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.7606 -59.4813 15.3968 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.8212 -60.3875 12.7645 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 
-4.6396 -59.2637 14.5786 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.6919 -59.7269 13.2610 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4188 -58.5636 15.0722 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.3780 -58.3922 14.2190 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.4425 -58.8944 12.8026 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5968 -59.5297 12.4148 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.4927 -58.7341 12.0364 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6562 -59.8644 11.4908 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7756 -58.8669 17.4469 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.7367 -56.7639 16.7466 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.6706 -58.3845 18.7515 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6320 -56.2815 18.0513 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3085 -58.0565 16.4443 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0988 -57.0919 19.0536 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9885 -56.5777 20.4491 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.6434 -55.4303 12.6355 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.3219 -54.7691 11.6131 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.1612 -54.4616 14.2263 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.1096 -52.5500 11.1925 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.5256 -52.3975 12.4883 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.0650 -55.2760 13.9477 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4186 -53.9532 11.8808 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.8462 -53.7965 13.2121 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.0567 -53.3258 10.8773 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.9009 -53.0163 13.5062 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4158 -59.7838 14.5992 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.6872 -59.3395 16.7216 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3789 -59.1280 15.9719 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 8 9 1 0 0 0\r\n", " 10 12 1 0 0 0\r\n", " 20 24 1 0 0 0\r\n", " 21 23 1 0 0 0\r\n", " 27 32 1 0 0 0\r\n", " 22 25 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 29 34 1 0 0 0\r\n", " 24 14 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 33 34 1 0 0 0\r\n", " 13 17 1 0 0 0\r\n", " 15 1 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 17 1 0 0 0\r\n", " 2 6 1 0 0 
0\r\n", " 3 1 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 4 32 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 26 25 1 0 0 0\r\n", " 37 39 1 0 0 0\r\n", " 38 39 1 0 0 0\r\n", " 39 2 1 0 0 0\r\n", " 6 5 1 0 0 0\r\n", " 8 10 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 20 22 2 0 0 0\r\n", " 21 24 2 0 0 0\r\n", " 27 28 2 0 0 0\r\n", " 23 25 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 30 31 2 0 0 0\r\n", " 30 35 2 0 0 0\r\n", " 31 36 2 0 0 0\r\n", " 12 13 2 0 0 0\r\n", " 33 35 2 0 0 0\r\n", " 34 36 2 0 0 0\r\n", " 14 15 2 0 0 0\r\n", " 1 5 2 0 0 0\r\n", " 16 18 2 0 0 0\r\n", " 2 7 2 0 0 0\r\n", " 17 19 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.58798\r\n", "\r\n", "> \r\n", "1.876\r\n", "\r\n", "> \r\n", "475.372\r\n", "\r\n", "$$$$\r\n", "ZINC35448294\r\n", "\r\n", "\r\n", " 33 38 0 0 0 0 0 0 0 0999 V2000\r\n", " 6.2193 -51.5392 13.7822 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 6.5893 -51.9773 15.0518 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 5.1156 -52.0958 13.1259 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 5.8703 -52.9821 15.7052 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.3763 -53.1129 13.7626 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.2347 -53.8835 13.4110 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.7696 -53.5311 15.0379 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.9678 -54.7425 14.4550 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.2954 -56.1404 13.1784 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3853 -53.8636 12.1948 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.6865 -55.2195 12.0267 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.8500 -55.7376 14.5366 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.8912 -54.5146 15.4387 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.0317 -55.6866 13.2732 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.9662 -56.1848 12.1474 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.9233 -54.9834 16.3038 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9363 -58.4538 18.1389 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3508 -57.1319 18.2899 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.1432 -58.8360 17.0510 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9875 -56.1521 17.3616 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.9885 -57.8949 14.9095 C 0 0 0 0 0 0 0 0 0 0 
0 0\r\n", " -1.7643 -57.8655 16.1021 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.1944 -56.5486 16.2790 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.9633 -56.6191 14.3950 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6932 -55.8143 15.2279 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.8386 -54.8497 15.0945 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4658 -57.5272 16.2097 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.6382 -58.0380 13.8546 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.9083 -58.8130 16.5209 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.0807 -59.3238 14.1659 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3307 -57.1398 14.8765 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.2157 -59.7112 15.4991 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.7617 -61.2970 15.8834 Cl 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 17 18 1 0 0 0\r\n", " 1 2 1 0 0 0\r\n", " 19 22 1 0 0 0\r\n", " 3 5 1 0 0 0\r\n", " 27 31 1 0 0 0\r\n", " 28 30 1 0 0 0\r\n", " 20 23 1 0 0 0\r\n", " 4 7 1 0 0 0\r\n", " 29 32 1 0 0 0\r\n", " 21 22 1 0 0 0\r\n", " 5 6 1 0 0 0\r\n", " 23 25 1 0 0 0\r\n", " 7 13 1 0 0 0\r\n", " 32 33 1 0 0 0\r\n", " 24 9 1 0 0 0\r\n", " 24 25 1 0 0 0\r\n", " 8 13 1 0 0 0\r\n", " 9 14 1 0 0 0\r\n", " 10 6 1 0 0 0\r\n", " 10 11 1 0 0 0\r\n", " 11 14 1 0 0 0\r\n", " 12 31 1 0 0 0\r\n", " 12 8 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 17 19 2 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 18 20 2 0 0 0\r\n", " 2 4 2 0 0 0\r\n", " 27 29 2 0 0 0\r\n", " 28 31 2 0 0 0\r\n", " 30 32 2 0 0 0\r\n", " 21 24 2 0 0 0\r\n", " 22 23 2 0 0 0\r\n", " 5 7 2 0 0 0\r\n", " 6 8 2 0 0 0\r\n", " 9 15 2 0 0 0\r\n", " 25 26 1 0 0 0\r\n", " 13 16 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.52352\r\n", "\r\n", "> \r\n", "6.72818\r\n", "\r\n", "> \r\n", "407.767\r\n", "\r\n", "$$$$\r\n", "ZINC72314638\r\n", "\r\n", "\r\n", " 34 38 0 0 0 0 0 0 0 0999 V2000\r\n", " -6.9192 -60.0249 14.8267 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.9224 -60.5532 13.5394 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.7591 -59.4408 15.3438 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.7655 -60.4981 12.7677 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.5814 -59.3754 14.5808 C 0 0 0 0 0 
0 0 0 0 0 0 0\r\n", " -4.6074 -59.9121 13.2905 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3284 -58.7584 15.1036 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.2330 -58.7339 14.3037 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.2694 -59.3144 12.9167 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4559 -59.8663 12.4993 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.2700 -59.2872 12.1989 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4977 -60.2504 11.5937 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.9776 -58.1386 14.7709 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.1138 -55.4726 15.4117 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.0692 -59.0069 15.4111 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2024 -57.9981 15.5499 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4884 -55.4636 16.0338 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.6963 -56.8766 14.7008 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.5616 -56.7232 15.2099 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.5532 -54.4163 15.1165 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.3449 -58.4107 12.4428 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.5462 -58.8702 12.9824 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.2658 -58.1304 13.2810 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.6683 -59.0498 14.3602 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3878 -58.3101 14.6587 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.5892 -58.7698 15.1985 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.7434 -58.9704 16.6705 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.8700 -58.9787 17.5299 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5363 -56.8305 16.6476 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7893 -58.4284 18.8090 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4557 -56.2801 17.9268 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2435 -58.1798 16.4491 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0821 -57.0791 19.0075 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9979 -56.4920 20.3757 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1 2 1 0 0 0\r\n", " 21 22 1 0 0 0\r\n", " 3 5 1 0 0 0\r\n", " 28 32 1 0 0 0\r\n", " 29 31 1 0 0 0\r\n", " 23 25 1 0 0 0\r\n", " 24 26 1 0 0 0\r\n", " 30 33 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 32 7 1 0 0 0\r\n", " 5 7 1 0 0 0\r\n", " 6 10 1 0 0 0\r\n", " 8 13 1 0 0 0\r\n", " 8 9 
1 0 0 0\r\n", " 9 10 1 0 0 0\r\n", " 14 19 1 0 0 0\r\n", " 15 13 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 25 1 0 0 0\r\n", " 16 19 1 0 0 0\r\n", " 34 33 1 0 0 0\r\n", " 27 26 1 0 0 0\r\n", " 17 14 1 0 0 0\r\n", " 19 18 1 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 21 23 2 0 0 0\r\n", " 22 24 2 0 0 0\r\n", " 2 4 2 0 0 0\r\n", " 28 30 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 31 33 2 0 0 0\r\n", " 5 6 2 0 0 0\r\n", " 25 26 2 0 0 0\r\n", " 7 8 2 0 0 0\r\n", " 13 18 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 14 20 2 0 0 0\r\n", " 10 12 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.51168\r\n", "\r\n", "> \r\n", "1.81673\r\n", "\r\n", "> \r\n", "411.326\r\n", "\r\n", "$$$$\r\n", "ZINC72314638\r\n", "\r\n", "\r\n", " 34 38 0 0 0 0 0 0 0 0999 V2000\r\n", " -6.9192 -60.0250 14.8266 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.9223 -60.5532 13.5393 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.7590 -59.4409 15.3438 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.7655 -60.4980 12.7677 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.5813 -59.3753 14.5807 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.6073 -59.9120 13.2905 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3283 -58.7583 15.1036 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.2330 -58.7336 14.3037 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.2693 -59.3140 12.9166 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4559 -59.8659 12.4992 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.2700 -59.2867 12.1988 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4977 -60.2499 11.5936 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.9776 -58.1384 14.7708 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.1137 -55.4723 15.4114 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.0691 -59.0065 15.4111 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2024 -57.9977 15.5499 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4883 -55.4632 16.0337 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.6963 -56.8762 14.7006 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.5616 -56.7229 15.2098 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.5532 -54.4160 15.1161 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.3452 -58.4103 12.4429 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.5464 -58.8703 12.9827 C 0 0 0 0 
0 0 0 0 0 0 0 0\r\n", " 2.2660 -58.1300 13.2811 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 4.6682 -59.0500 14.3606 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.3879 -58.3098 14.6589 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.5890 -58.7698 15.1986 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 3.7431 -58.9706 16.6707 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.8702 -58.9787 17.5299 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5360 -56.8304 16.6476 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.7895 -58.4286 18.8091 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4554 -56.2801 17.9268 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2434 -58.1797 16.4491 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0819 -57.0792 19.0075 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9978 -56.4922 20.3758 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1 2 1 0 0 0\r\n", " 21 22 1 0 0 0\r\n", " 3 5 1 0 0 0\r\n", " 28 32 1 0 0 0\r\n", " 29 31 1 0 0 0\r\n", " 23 25 1 0 0 0\r\n", " 24 26 1 0 0 0\r\n", " 30 33 1 0 0 0\r\n", " 4 6 1 0 0 0\r\n", " 32 7 1 0 0 0\r\n", " 5 7 1 0 0 0\r\n", " 6 10 1 0 0 0\r\n", " 8 13 1 0 0 0\r\n", " 8 9 1 0 0 0\r\n", " 9 10 1 0 0 0\r\n", " 14 19 1 0 0 0\r\n", " 15 13 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 25 1 0 0 0\r\n", " 16 19 1 0 0 0\r\n", " 34 33 1 0 0 0\r\n", " 27 26 1 0 0 0\r\n", " 17 14 1 0 0 0\r\n", " 19 18 1 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 21 23 2 0 0 0\r\n", " 22 24 2 0 0 0\r\n", " 2 4 2 0 0 0\r\n", " 28 30 2 0 0 0\r\n", " 29 32 2 0 0 0\r\n", " 31 33 2 0 0 0\r\n", " 5 6 2 0 0 0\r\n", " 25 26 2 0 0 0\r\n", " 7 8 2 0 0 0\r\n", " 13 18 2 0 0 0\r\n", " 9 11 2 0 0 0\r\n", " 14 20 2 0 0 0\r\n", " 10 12 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.51156\r\n", "\r\n", "> \r\n", "2.07052\r\n", "\r\n", "> \r\n", "411.326\r\n", "\r\n", "$$$$\r\n", "ZINC39912421\r\n", "\r\n", "\r\n", " 35 39 0 0 0 0 0 0 0 0999 V2000\r\n", " -2.8637 -58.0485 14.3831 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6591 -57.4567 14.0111 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.7718 -57.6110 13.4624 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.0742 -58.1435 13.6832 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4965 -58.9601 
15.3604 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.8048 -56.6783 12.9233 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1200 -56.7955 12.6090 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.8956 -58.9553 14.8275 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.0866 -57.9453 13.0303 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5472 -56.3394 11.8484 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.0881 -58.8830 14.9464 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.1072 -57.9326 15.8719 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.3763 -57.6028 14.6449 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.3300 -59.0479 15.5599 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.6430 -56.6523 15.5704 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.4011 -56.4874 14.9569 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.8249 -60.4168 15.8846 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4900 -55.4712 15.9138 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.0441 -55.2299 14.6663 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.4836 -54.4853 14.8788 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9761 -58.9905 17.8000 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6595 -56.9124 16.7746 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.8625 -58.3455 19.0315 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5458 -56.2672 18.0061 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3746 -58.2738 16.6715 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1474 -56.9839 19.1346 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0409 -56.3650 20.3177 F 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.9715 -59.7155 15.4305 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -8.5076 -61.3824 13.1820 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.7831 -62.4617 12.6762 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.9064 -60.4967 14.0761 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.4575 -62.6553 13.0647 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.5808 -60.6902 14.4647 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.8563 -61.7695 13.9589 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.2144 -62.0410 14.4164 Cl 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 29 30 1 0 0 0\r\n", " 21 25 1 0 0 0\r\n", " 22 24 1 0 0 0\r\n", " 31 33 1 0 0 0\r\n", " 23 26 1 0 0 0\r\n", " 32 34 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 13 2 1 0 0 0\r\n", 
" 1 2 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 19 1 0 0 0\r\n", " 26 27 1 0 0 0\r\n", " 34 35 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 3 7 1 0 0 0\r\n", " 4 8 1 0 0 0\r\n", " 5 25 1 0 0 0\r\n", " 5 1 1 0 0 0\r\n", " 5 8 1 0 0 0\r\n", " 17 14 1 0 0 0\r\n", " 18 15 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 28 8 1 0 0 0\r\n", " 7 6 1 0 0 0\r\n", " 29 31 2 0 0 0\r\n", " 30 32 2 0 0 0\r\n", " 21 23 2 0 0 0\r\n", " 22 25 2 0 0 0\r\n", " 24 26 2 0 0 0\r\n", " 11 14 2 0 0 0\r\n", " 12 15 2 0 0 0\r\n", " 13 16 2 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 33 34 2 0 0 0\r\n", " 2 6 2 0 0 0\r\n", " 4 9 2 0 0 0\r\n", " 7 10 1 0 0 0\r\n", " 19 20 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.49363\r\n", "\r\n", "> \r\n", "1.94002\r\n", "\r\n", "> \r\n", "442.764\r\n", "\r\n", "$$$$\r\n", "ZINC39912421\r\n", "\r\n", "\r\n", " 35 39 0 0 0 0 0 0 0 0999 V2000\r\n", " -2.8637 -58.0484 14.3832 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.6591 -57.4566 14.0111 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.7718 -57.6110 13.4624 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.0743 -58.1434 13.6832 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4964 -58.9600 15.3605 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.8048 -56.6783 12.9233 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1200 -56.7954 12.6090 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.8956 -58.9552 14.8276 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.0865 -57.9453 13.0303 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5473 -56.3392 11.8484 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.0881 -58.8830 14.9464 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.1074 -57.9325 15.8720 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.3763 -57.6028 14.6449 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.3299 -59.0479 15.5600 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.6428 -56.6523 15.5704 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.4011 -56.4874 14.9569 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.8249 -60.4168 15.8846 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4900 -55.4712 15.9137 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.0441 -55.2299 14.6664 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.4836 -54.4854 14.8789 H 0 0 0 0 0 0 0 0 0 0 
0 0\r\n", " -2.9763 -58.9904 17.8002 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6592 -56.9122 16.7746 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.8627 -58.3454 19.0317 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5455 -56.2671 18.0061 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.3746 -58.2738 16.6716 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1473 -56.9838 19.1347 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0408 -56.3649 20.3178 F 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.9714 -59.7154 15.4306 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -8.5075 -61.3823 13.1820 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.7831 -62.4617 12.6762 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.9063 -60.4966 14.0762 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.4575 -62.6552 13.0648 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.5808 -60.6901 14.4648 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.8563 -61.7695 13.9591 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -4.2143 -62.0411 14.4166 Cl 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 29 30 1 0 0 0\r\n", " 21 25 1 0 0 0\r\n", " 22 24 1 0 0 0\r\n", " 31 33 1 0 0 0\r\n", " 23 26 1 0 0 0\r\n", " 32 34 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 13 2 1 0 0 0\r\n", " 1 2 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 19 1 0 0 0\r\n", " 26 27 1 0 0 0\r\n", " 34 35 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 3 7 1 0 0 0\r\n", " 4 8 1 0 0 0\r\n", " 5 25 1 0 0 0\r\n", " 5 1 1 0 0 0\r\n", " 5 8 1 0 0 0\r\n", " 17 14 1 0 0 0\r\n", " 18 15 1 0 0 0\r\n", " 28 33 1 0 0 0\r\n", " 28 8 1 0 0 0\r\n", " 7 6 1 0 0 0\r\n", " 29 31 2 0 0 0\r\n", " 30 32 2 0 0 0\r\n", " 21 23 2 0 0 0\r\n", " 22 25 2 0 0 0\r\n", " 24 26 2 0 0 0\r\n", " 11 14 2 0 0 0\r\n", " 12 15 2 0 0 0\r\n", " 13 16 2 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 33 34 2 0 0 0\r\n", " 2 6 2 0 0 0\r\n", " 4 9 2 0 0 0\r\n", " 7 10 1 0 0 0\r\n", " 19 20 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.49359\r\n", "\r\n", "> \r\n", "2.12367\r\n", "\r\n", "> \r\n", "442.764\r\n", "\r\n", "$$$$\r\n", "ZINC39912344\r\n", "\r\n", "\r\n", " 35 39 0 0 0 0 0 0 0 0999 V2000\r\n", " -2.9655 -58.0579 14.4075 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.7487 
-57.5053 14.0153 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.8718 -57.6016 13.4941 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.1866 -58.0933 13.7357 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6126 -58.9419 15.4006 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -1.8851 -56.7323 12.9225 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.2070 -56.8131 12.6256 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.0176 -58.9002 14.8849 N 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.2006 -57.8708 13.0937 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6301 -56.3508 11.8663 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.0185 -58.9767 14.9117 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.0258 -58.0767 15.8325 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.4630 -57.6840 14.6346 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.2258 -59.1731 15.5107 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.5813 -56.7838 15.5554 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.3369 -56.5875 14.9564 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 1.6995 -60.5554 15.8092 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 2.4524 -55.6234 15.9088 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -0.0884 -55.3180 14.6896 O 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 0.4544 -54.5863 14.9086 H 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0661 -58.9675 17.8345 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.6938 -56.8776 16.7976 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -2.9180 -58.3157 19.0588 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.5456 -56.2257 18.0218 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.4539 -58.2484 16.7039 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.1577 -56.9448 19.1524 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -3.0181 -56.3191 20.3285 F 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.1077 -59.6230 15.5080 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -8.0602 -62.0396 13.3356 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.9663 -63.0767 13.9497 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.7088 -60.9008 14.0603 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -5.6148 -61.9379 14.6743 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.1891 -63.1276 13.2804 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -6.4861 -60.8499 14.7295 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " -7.5641 -64.3456 12.5058 C 0 0 0 0 0 0 0 0 0 0 0 0\r\n", " 21 25 1 
0 0 0\r\n", " 22 24 1 0 0 0\r\n", " 29 33 1 0 0 0\r\n", " 30 32 1 0 0 0\r\n", " 31 34 1 0 0 0\r\n", " 23 26 1 0 0 0\r\n", " 11 13 1 0 0 0\r\n", " 12 14 1 0 0 0\r\n", " 13 2 1 0 0 0\r\n", " 1 2 1 0 0 0\r\n", " 15 16 1 0 0 0\r\n", " 16 19 1 0 0 0\r\n", " 26 27 1 0 0 0\r\n", " 3 4 1 0 0 0\r\n", " 3 7 1 0 0 0\r\n", " 4 8 1 0 0 0\r\n", " 5 25 1 0 0 0\r\n", " 5 1 1 0 0 0\r\n", " 5 8 1 0 0 0\r\n", " 35 33 1 0 0 0\r\n", " 17 14 1 0 0 0\r\n", " 18 15 1 0 0 0\r\n", " 28 34 1 0 0 0\r\n", " 28 8 1 0 0 0\r\n", " 7 6 1 0 0 0\r\n", " 21 23 2 0 0 0\r\n", " 22 25 2 0 0 0\r\n", " 29 31 2 0 0 0\r\n", " 30 33 2 0 0 0\r\n", " 32 34 2 0 0 0\r\n", " 24 26 2 0 0 0\r\n", " 11 14 2 0 0 0\r\n", " 12 15 2 0 0 0\r\n", " 13 16 2 0 0 0\r\n", " 1 3 2 0 0 0\r\n", " 2 6 2 0 0 0\r\n", " 4 9 2 0 0 0\r\n", " 7 10 1 0 0 0\r\n", " 19 20 1 0 0 0\r\n", "M END\r\n", "> \r\n", "-7.4906\r\n", "\r\n", "> \r\n", "1.79745\r\n", "\r\n", "> \r\n", "419.322\r\n", "\r\n", "$$$$\r\n" ] } ], "source": [ "!cat ../files/best.sdf" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "`sdf` files can have arbitrary data embedded in them:\n", "\n", "    M END\n", "    >  <minimizedAffinity>\n", "    -7.83433\n", "\n", "    >  <minimizedRMSD>\n", "    1.45522\n", "\n", "    >  <molecular weight>\n", "    475.372\n", "\n", "    $$$$" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'MOL Chiral Flag': '0', 'minimizedAffinity': '-7.83433', 'minimizedRMSD': '1.45522', 'molecular weight': '475.372', 'OpenBabel Symmetry Classes': '27 24 14 23 12 26 2 6 7 16 15 31 35 32 29 34 30 4 5 13 13 11 11 28 21 1 10 17 20 9 8 25 36 33 18 19 3 3 22'}" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols[0].data" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Beyond Pybel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall that Pybel is a Python-native wrapper around the OpenBabel SWIG bindings. 
The underlying OpenBabel objects are always accessible if you need the additional functionality provided by OpenBabel (this may be necessary if you are modifying or creating molecule objects)." ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " >\n" ] } ], "source": [ "obmol = mol.OBMol\n", "vec = obmol.Center(0)\n", "print(vec)" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.7085902947538542 -0.016861359909934506 -0.0005521894831451903\n" ] } ], "source": [ "print(vec.GetX(),vec.GetY(),vec.GetZ())" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['AddAtom',\n", " 'AddBond',\n", " 'AddConformer',\n", " 'AddHydrogens',\n", " 'AddNewHydrogens',\n", " 'AddNonPolarHydrogens',\n", " 'AddPolarHydrogens',\n", " 'AddResidue',\n", " 'Align',\n", " 'AreInSameRing',\n", " 'AssignSpinMultiplicity',\n", " 'AssignTotalChargeToAtoms',\n", " 'AutomaticFormalCharge',\n", " 'AutomaticPartialCharge',\n", " 'BeginAtom',\n", " 'BeginAtoms',\n", " 'BeginBond',\n", " 'BeginBonds',\n", " 'BeginConformer',\n", " 'BeginData',\n", " 'BeginInternalCoord',\n", " 'BeginModify',\n", " 'BeginResidue',\n", " 'BeginResidues',\n", " 'CBeginAtoms',\n", " 'CEndAtoms',\n", " 'Center',\n", " 'ClassDescription',\n", " 'Clear',\n", " 'CloneData',\n", " 'ConnectTheDots',\n", " 'ContigFragList',\n", " 'ConvertDativeBonds',\n", " 'ConvertZeroBonds',\n", " 'CopyConformer',\n", " 'CopySubstructure',\n", " 'CorrectForPH',\n", " 'DataSize',\n", " 'DecrementMod',\n", " 'DeleteAtom',\n", " 'DeleteBond',\n", " 'DeleteConformer',\n", " 'DeleteData',\n", " 'DeleteHydrogen',\n", " 'DeleteHydrogens',\n", " 'DeleteNonPolarHydrogens',\n", " 'DeletePolarHydrogens',\n", " 'DeleteResidue',\n", " 'DestroyAtom',\n", " 'DestroyBond',\n", " 'DestroyResidue',\n", " 'DoTransformations',\n", " 
'Empty',\n", " 'EndAtom',\n", " 'EndAtoms',\n", " 'EndBond',\n", " 'EndBonds',\n", " 'EndData',\n", " 'EndModify',\n", " 'EndResidue',\n", " 'EndResidues',\n", " 'FindAngles',\n", " 'FindChildren',\n", " 'FindLSSR',\n", " 'FindLargestFragment',\n", " 'FindRingAtomsAndBonds',\n", " 'FindSSSR',\n", " 'FindTorsions',\n", " 'GetAllData',\n", " 'GetAngle',\n", " 'GetAtom',\n", " 'GetAtomById',\n", " 'GetBond',\n", " 'GetBondById',\n", " 'GetConformer',\n", " 'GetConformers',\n", " 'GetCoordinates',\n", " 'GetData',\n", " 'GetDimension',\n", " 'GetEnergies',\n", " 'GetEnergy',\n", " 'GetExactMass',\n", " 'GetFirstAtom',\n", " 'GetFlags',\n", " 'GetFormula',\n", " 'GetGIDVector',\n", " 'GetGIVector',\n", " 'GetGTDVector',\n", " 'GetInternalCoord',\n", " 'GetLSSR',\n", " 'GetMod',\n", " 'GetMolWt',\n", " 'GetNextFragment',\n", " 'GetResidue',\n", " 'GetSSSR',\n", " 'GetSpacedFormula',\n", " 'GetTitle',\n", " 'GetTorsion',\n", " 'GetTotalCharge',\n", " 'GetTotalSpinMultiplicity',\n", " 'Has2D',\n", " 'Has3D',\n", " 'HasAromaticPerceived',\n", " 'HasAtomTypesPerceived',\n", " 'HasChainsPerceived',\n", " 'HasChiralityPerceived',\n", " 'HasClosureBondsPerceived',\n", " 'HasData',\n", " 'HasFlag',\n", " 'HasHybridizationPerceived',\n", " 'HasHydrogensAdded',\n", " 'HasLSSRPerceived',\n", " 'HasNonZeroCoords',\n", " 'HasPartialChargesPerceived',\n", " 'HasRingAtomsAndBondsPerceived',\n", " 'HasRingTypesPerceived',\n", " 'HasSSSRPerceived',\n", " 'HasSpinMultiplicityAssigned',\n", " 'IncrementMod',\n", " 'InsertAtom',\n", " 'IsCorrectedForPH',\n", " 'IsPeriodic',\n", " 'IsReaction',\n", " 'MakeDativeBonds',\n", " 'NewAtom',\n", " 'NewBond',\n", " 'NewResidue',\n", " 'NextConformer',\n", " 'NextInternalCoord',\n", " 'NumAtoms',\n", " 'NumBonds',\n", " 'NumConformers',\n", " 'NumHvyAtoms',\n", " 'NumResidues',\n", " 'NumRotors',\n", " 'PerceiveBondOrders',\n", " 'RenumberAtoms',\n", " 'ReserveAtoms',\n", " 'Rotate',\n", " 'Separate',\n", " 'SetAromaticPerceived',\n", " 
'SetAtomTypesPerceived',\n", " 'SetAutomaticFormalCharge',\n", " 'SetAutomaticPartialCharge',\n", " 'SetChainsPerceived',\n", " 'SetChiralityPerceived',\n", " 'SetClosureBondsPerceived',\n", " 'SetConformer',\n", " 'SetConformers',\n", " 'SetCoordinates',\n", " 'SetCorrectedForPH',\n", " 'SetData',\n", " 'SetDimension',\n", " 'SetEnergies',\n", " 'SetEnergy',\n", " 'SetFlag',\n", " 'SetFlags',\n", " 'SetFormula',\n", " 'SetHybridizationPerceived',\n", " 'SetHydrogensAdded',\n", " 'SetInternalCoord',\n", " 'SetIsPatternStructure',\n", " 'SetIsReaction',\n", " 'SetLSSRPerceived',\n", " 'SetPartialChargesPerceived',\n", " 'SetPeriodicMol',\n", " 'SetRingAtomsAndBondsPerceived',\n", " 'SetRingTypesPerceived',\n", " 'SetSSSRPerceived',\n", " 'SetSpinMultiplicityAssigned',\n", " 'SetTitle',\n", " 'SetTorsion',\n", " 'SetTotalCharge',\n", " 'SetTotalSpinMultiplicity',\n", " 'StripSalts',\n", " 'ToInertialFrame',\n", " 'Translate',\n", " 'UnsetFlag',\n", " '__class__',\n", " '__delattr__',\n", " '__dict__',\n", " '__dir__',\n", " '__doc__',\n", " '__eq__',\n", " '__format__',\n", " '__ge__',\n", " '__getattribute__',\n", " '__gt__',\n", " '__hash__',\n", " '__iadd__',\n", " '__init__',\n", " '__init_subclass__',\n", " '__le__',\n", " '__lt__',\n", " '__module__',\n", " '__ne__',\n", " '__new__',\n", " '__reduce__',\n", " '__reduce_ex__',\n", " '__repr__',\n", " '__setattr__',\n", " '__sizeof__',\n", " '__str__',\n", " '__subclasshook__',\n", " '__swig_destroy__',\n", " '__weakref__',\n", " 'this',\n", " 'thisown']" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(obmol)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Project: Dimensionality-Reduced Molecules\n", "\n", "Given a SMILES file where each molecule's name is a property value (e.g. 
binding affinity), map the molecules into 2D space using PCA and visualize the data colored by the property.\n", "\n", " * Read SMILES\n", " * Save each title as the property value to color by\n", " * Compute fingerprint\n", " * Convert the fingerprint bits into a length-1024 array of zeros and ones\n", " * Use `sklearn.decomposition.PCA` to transform the fingerprints into 2D coordinates\n", " * Plot the coordinates and color by the specified property" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.decomposition import PCA\n", "import matplotlib.pyplot as plt\n", "\n", "# fill in fps with fingerprint bits (one row of 0/1 values per molecule), yvals with title values\n", "pca = PCA(n_components=2)\n", "res = pca.fit_transform(fps)\n", "\n", "# scatter the two principal components, colored by the property\n", "plt.scatter(res[:,0],res[:,1],c=yvals)\n", "plt.gca().set_aspect('equal', adjustable='box');" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2023-10-30 20:19:10-- http://mscbio2025.csb.pitt.edu/files/er.smi\r\n", "Resolving mscbio2025.csb.pitt.edu (mscbio2025.csb.pitt.edu)... 136.142.4.139\r\n", "Connecting to mscbio2025.csb.pitt.edu (mscbio2025.csb.pitt.edu)|136.142.4.139|:80... connected.\r\n", "HTTP request sent, awaiting response... 
200 OK\r\n", "Length: 20022 (20K) [application/smil+xml]\r\n", "Saving to: ‘er.smi’\r\n", "\r\n", "\r", "er.smi 0%[ ] 0 --.-KB/s \r", "er.smi 100%[===================>] 19.55K --.-KB/s in 0s \r\n", "\r\n", "2023-10-30 20:19:10 (38.3 MB/s) - ‘er.smi’ saved [20022/20022]\r\n", "\r\n" ] } ], "source": [ "!wget http://mscbio2025.csb.pitt.edu/files/er.smi" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "notes" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAATgAAAD4CAYAAAB44PpqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/MnkTPAAAACXBIWXMAAAsTAAALEwEAmpwYAAA/+UlEQVR4nO3dd5wddbn48c8zM6du32TTNlnSIKQRAiGUUKVLFQREUS9yxYLYr/Xqtfuz94ZiBUUFVFBEQJrUNBIgPYT0tpvte+rMPL8/5mw/gZAtZ8v3zWtfbObMmXlmy7Pf/hVVxTAMYySyCh2AYRjGQDEJzjCMEcskOMMwRiyT4AzDGLFMgjMMY8RyCnHTsWPH6tSpUwtxa8MwRqAVK1bUqWpVz+MFSXBTp05l+fLlhbi1YRgjkIhsy3fcVFENwxixTIIzDGPEMgnOMIwRyyQ4wzBGrGGd4FSV55a/zLNPbSKb9QodjmEYQ0xBelH7w9NPbeILn/gTbsYDAcsS3vPhC7j8yhPynu+6Hnv3NlFaGqO0NDbI0RqGUQjDLsGt3bSbb/7gfnYs2wUdK6EIvqf86Fv/ZNL0MSxeOL3be+6//3l++JOHaBMl4yjFsQjvvu40Lj9vASIy+A9hGMagGFZV1EeeWMf7334rOx9/GaspidWaRlwfRIIPFT76pT91e8/KlVv57vcfoN7xSYcUtYSWdIZv/fJhvv3LhzvO21bXwL3L17Jh1/7BfizDMAbIsCjBuZ6PbQnf/MI9OA0J8BUB1APJevjxMBoLgypWvctzm3dy7IxqWlpT3P77p2kTH8QKkmCOr8rfHnqe/XXNPL36ZTLqkym28GJCRUmM33/kWiZVlhXuoQ3D6DMpxIKXixYt0kOZyfDHx1bx498+htuQxQIiO1twmtL0rFQq4FUWgQhqwetuOJFVj73EgfpWXNfHjQga6lFYVUV6PLoKZIqFbIlFWVGUR7/8bizLVGENY6gTkRWquqjn8X6poopIuYjcKSLrRWSdiJzc12v+felafvDDh/APZLAUULAyXq/k1hGDG/SiqsBD9z7Pvv3NuK4fvKh0aa+j81jPayiEWxV8pSmRYvnmnX19DMMwCqi/qqjfA+5X1TeKSBiI9/WC3//9Y5DV7tVKx8KCvEnOi9ioLfglDqS7Dxmxs4ob6v6uVyqXWR74Fuxvaj38BzAMo+D6XIITkTLgdOBWAFXNqGpjX6/bfCDRq4czWxnrlZkU0JCFWxzCizvYlt2rsCYKVsoP2ugAx7YoKY7mv7GCWmCJMK9mfF8fwzCMAuqPEtw0oBb4lYgsAFYAH1DVtq4niciNwI0ANTU1r3rRkvIYqeY26NIG5heFSI+LE9mX6Eh0fsgiM7EUrCBXu67fUftUOjO45QGucsdPbiAeDbNtVz0f/cpdpNJux/UV8CKgtnDa3KlMHV/5Wr8WhmEMIf3RBucAxwE/UdWFQBvwiZ4nqeotqrpIVRdVVfVatqmXD113VpDEuhbH
VHHLoiSOKKJ+QSmp6lLSU8pRp/tjZEqFA8c41C522H+8Q8tEC7UAESZPqKCyvIiFc6fw8XedR2lxlHDYBgE/JkQmx3nfRafw3f++tC9fE8MwhoD+KMHtBHaq6rO5f99JngT3Wr3+xNk8dfZmHv/XOjQXpWR8Qi0Z6uZHCbeCH9FebWnZIqHxKBvs4BV1IFFt4YeFaVLS7dzzTp/D65Yczf66ZkqLYxQXRfoatmEYQ0ifS3CquhfYISKzcofOBtb29boAX7jpYsLHleJaLpLKkI371C6KY2csQslgSEjPztCWyVbvp7KF1Djhcx+8qNc9HNti0vhyk9wMYwTqr17Um4Hbcz2oW4Dr++OiliV8/OIz+UjbfXh20FmAr0Sag95VL2rhJP1utVgvJt16XtuFbJuyMX3u3DUMYxjplwSnqquAXoPs+kNrXZKx2yCTdVEb7JTixu2glGYJbtxC3M5Bu+GMkAprryQnljC+tHggQjQMY4ga8nNRJ00oJxSysNNKqFURT7sPFZFgloIfEsJhm+pUpHe9VeH8o2cSC4UGM3TDMApsyCe4RQuOoLwsjsSCnlBRCZJcnilmFy6ZzTvfejqhjIBHkOh8kAw89eJWXM8f9PgNwyicIZ/gbNvix199M4sWTsMqcqDIZnJlKVHLzo3yBVFlxoRKZpw8hd89tQr1wEmBkwAnCbYLrqes3r670I9jGMYgGhariYypKOabn30j6XQWz1fisTCqysNPruNfqzaxKd3I6gMNrP3bw6RTHtj5r+P5g7+wgGEYhTMsEly7SKSzDW1HfRP/9+jjJNIZ0q4XTLHKeIhPblBv9/eqKv94+AU+/817KS6K8o43nsz5J80e3AcwDGNQDfkq6sF85q4HaUqkguQGICBeMCVL2tvfunzI/iyP3LuGlp1t7NlwgC9/5e986Wf3F+4BDMMYcMMywWVcj5Vbd+GrYmWUUIviJDQ3UE6xsmBnQFywXAinwTngIrkOWCE49YF/vEB9Y9ur3M0wjOFqWCY4kSBJxfb7FO9SYrVKfJ8SrQ/a2AQQP+hcsHNJzsnkuxDc9/iawQzdMIxBNCwTXMi2mROvJNRGZ6lMwU5DuDlYsBI/GEoiGSW2z8u3viUAZUUHWTbJMIxhb1h1MnQVau695LiQGx6SVNQOSnGWp6jfe1I+BGu+nX/6nMEI1zCMAhi2Ca7uwMFW2w2madnty7zl9mkQS7vuMoglwuc/eSnh0LD9EhiG8SqG5W93MpmhPpXEQnqVzESDfRk6aFBVtUM23/rCG3li+RbGlMe5/NxjicfCgxm2YRiDbFgluC076/jObY+ybP12EpVCvD63Uq/SMe7NSvhoPDcQLldkEx8mTCjl2NlTWDjn1VcT7klV+cMLz/OzZctpSCVZOHEinzjtdGYfwsKdhmEUzrBJcHvqmvnvz91BIpUhVQqELBJVPtF6OqujgF9sQSZXF1UlbFmEog6fev+Fh72L/beefJJfPbeSpBvc6D/btrFi9x3c8+a3ML3SLGtuGEPVsElwv79vOZlskGA01/frpAXb7dGBIIIfFbxiOG/aTGomj+Gy8xcwvqr0sO7bmslw68qVpD232/G06/Kjpc/yrQsuPKzrGoYx8IZNglv70t6O1UDsZG6Ab+rg2/9pWYir3nwiC2om9um+2xsbCdlWz50I8VRZvXdvn65tGMbAGjbj4GZMGRtssOUp4QREWoIBvAcjIuxr7vu+phNKSsh6Xq/jAkyvMNVTwxjKhk2Cu+i0ubQP1+063epgO9S7LVnaWtJ9vm9lLMaFRx5F1Ole2I04Du9dvLjP1zcMY+AMmyrqV3/3ULCIZR5KsCZcMDxEyBYJqPC12//NgcY2zlw0kx889jSrd+1lSkUZ7z3tRE6edui9qV8991yKI2HuXLMG1/cZX1TEF84+m2Mn9q36axjGwBLNszLuQFu0aJEuX778kM/furueaz79a0grVr5wVfFV8YqCpcu7bhYdciwyEy1Svoef
e9ZoyOHLF5/LxfOOfk1xZz2PpOtSEg4fdo+sYRj9T0RWqGqvfWGGRRX1QFNbUHjLs01gO7fYwo9Y3ZIbgOf6pBNuR3IDSGVdvvLAY92OHYqQbVMaiZjkZhjDxLBIcEfVVAUzFNrH7+aOtzfB+bmqab59GnwUP89TtqTSNCSSAxazYRiFNywSXElRlJnzx6OW4DsEH1aQ8HwbsAUrm6c0pooTsdE8m2mJCMURM1XLMEayYZHgAH7+vqsYt6ASNx4sBqe2oJYE1UURLD/YM7V97imqxG2H6y5cRKzHhPqo43DFgjlEnGHTx2IYxmEYNgmuOBLm7x98O0tOnI5K5zpwHUSwssF6cE6rYqeV1qjL7Q+u5JKZs4iHQ8TDISKOzcXzjuZT559ZoCcxDGOwDKsiTCrrsm7HfiBYMcS3CbKcgnjaka29GLRWC1ZW0FaXJ1e9zFOffRd7W1oZWxSnJBop1CMYhjGIhlWC+81jK2hoSyA2aG7dNyCosorQOg6ypYAdHNfcPs+JdIbapjamVVUUJnDDMApi2FRRAf6+Yh0ZT/Ft6Uxu7QTCrXQkN3ywcvsweL5SFDUdCoYx2gyrBGdZQbhykA2crWzuk9z4EScFjmWxYNokxpTEBydIwzCGjGGV4K5YPJdoyOm+Ym8XagN+sMrImESIWMhhxsQxfO3trx/UOA3DGBqGVRvcW05dyJMbtvJcchfa6iJdMp2KkpwAp8+dzlfPPpd1O/czrqyYWdVVZuaBYYxS/ZbgRMQGlgO7VPXi/rpuVyHH5pYbr+S5l3fx638u46nVL6O+ohYkx8Hxx0zhlssvw7IsqsqKByIEwzCGkf4swX0AWAcc3tK5h0hEOG76ZI67aTK+r+ysb2R3opW546soiZg9Tg3D6NQvbXAiMhm4CPhFf1zvUFmWUDO2gpNqppjkZhhGL/3VyfBd4GOAf7ATRORGEVkuIstra2v76baGYRgH1+cEJyIXA/tVdcUrnaeqt6jqIlVdVGW22zMMYxD0RwluCXCpiGwF7gBeJyK39cN1DcMw+qTPCU5VP6mqk1V1KvAm4GFVva7PkRmGYfTRsBroaxiG8Vr060BfVX0UeLQ/r2kYhnG4TAnOMIwRyyQ4wzBGLJPgDMMYsUyCMwxjxDIJzjCMEcskOMMwRiyT4AzDGLFMgjMMY8QyCc4wjBHLJDjDMEYsk+AMwxixTIIzDGPEMgnOMIwRyyQ4wzBGLJPgDMMYsUyCMwxjxDIJzjCMEcskOMMwRiyT4AzDGLFMgjMMY8QyCc4wjBHLJDjDMEYsk+AMwxixTIIzDGPEMgnOMIwRyyQ4wzBGLJPgDMMYsUyCMwxjxDIJzjCMEcskOMMwRiyT4AzDGLGcvl5ARKYAvwXGAwrcoqrf6+t1DcMYWVY1vMCfd95LbbqOmvhk3jTlcmaWTB/Qe/ZHCc4FPqKqc4CTgJtEZE4/XNcwjGFKVUm6SXz1AXiybinf2fQztrRtpcVtZU3zer647ttsbHlpQOPocwlOVfcAe3Kft4jIOqAaWNvXaxuGMfwsPbCc27f/kaZsE46EOGfcmTxWt5SMn+l2XsbPcPv2O/n83I8PWCx9TnBdichUYCHwbH9e1zCM4WFN0zp+tuWXHcnM0zQP7HuYlO/nPX9b284BjaffOhlEpBi4C/igqjbnef1GEVkuIstra2v767aGYQwhd+78a56SWhZF855fES4b0Hj6JcGJSIggud2uqnfnO0dVb1HVRaq6qKqqqj9uaxjGELKpZQebWrb0Oi4CIRHCEup2PGyFubL6kgGNqc8JTkQEuBVYp6rf7ntIxnCgqYfxa8/D33s0/v7T8BN/RDX/X2lj5PPV5/Nrfk7Gt8j3YxAWh3MnnEnEChOSEHE7xrVTruDUqhMHNK7+aINbArwVeEFEVuWOfUpV7+uHaxtDkKYfQxs/CKRwVfG9PYSavgyaQYreiu+naEg+SMbbS3F4IcWR4xERXN9jb2o/RU58wKsmxuDa0rab
Ni9F1g8RtrxurwnCGyZfwkWTLuCaKZfT6rZRGirBFnvA4+qPXtQnAOmHWIxhQlu+TVaT7PTSJHPDAGzSVDd/Aye0mHX7rsXXDL5msCREceQ4auW9/GLrXXi+h6ceR5fO4CNH3UBJqLjAT2P0B199BPDUpikbpcjJ4IiPr0JZaCKvn3g+ACErREW4nHVb93Hno6tpaElwxsKZXHDi0URC/drnCfRzL6oxOqi7lW1uinSXhmMXZbtbj7f/nbh+I+Re8zXLhuaN3FF3G9kudZe1zZv56vqf8pX5Hx3k6I2BMKN4MiFxSJLGVZumbAyAiBXmuiPOJ2jJCvz1Py/wjT88QibroaosW7eDPz28ils/cQ3RcOhgtzgsJsEZr1nSGkuGA72OewroFmICGbXxck28S1snkVWfrgV9Tz1ebtvB7uR+JsXGDVLkxkCxxeJTc67n/168BVWfjLrYvk1mv8NvXnyCP86/B9/JMqtoBg/+I0E601k9TWaybNtbz71PruGqs47t17jMXFTjNXMjF+Q9bgk4okTFp9TKEpcsoDS7UfK1YniapS7dO1Eaw9OC8iP51eLPcPXE82BtJS3/Gou7DzKzt9NCCw2ZLE8e2Ejx+TtwirsPJUllXP69fGO/x2QS3DCWybp4BxlAOZDsyIkHGdWkCIpIMDQgIj6eb1MdacbB63W2r0p9uvewAmP4qgiXkn6hlLZnysjscyie3wwO1CXjNKZiNGciNLpRQuc0gNP9Z7ekKNrv8Zgq6jC0evNuvvq7h9iy+wCOY3HxyXP48JvO7Pf2i4NpTvzloK+1/0B5vvBk2zS2ZscCML2olqZsjH2ZUoJ+NaU8lGBDyzLOHH/RwAdtDJrla7aTyXqExmdRT2hyw3hq0bUU74cV+5hWvJWlAETDDlf3c/UUTAlu2Nm2t4Gbvn0nm3fV4auSyXr8/em1fOpn/xjwe6t6NNXfSLLtro5jbV6YtB+0p0Rypbc2z+GJRJDcfCx8LCyBslCSqnAzESvLuEgzleEEETs24HEbg6t6XDmWJXgtDlhK2nPo1UQhgjM1RVE0TDhk846LTuSE2TX9HospwQ0ztz2wnKzbvbqXyXo8s3Ybu+uamDR24MaXJdtuJ5N6mKjlkfKDCmlKQzzbehRnFG+mXh0ikiWNxfbMGPwefz8tgcpQktJQ0P4StiKcPOa8AYvXKIw3XXg8Dzy9nnRCSW6Pw4T854XCFp9+x1lsf76OO776EL//4gOoKOOnVPKjn11PRUVRn2MxJbhhZvPOOjy/dwtY2LHZWds0oPdOJW4DkoSBMABKqZ2kxEnxZHI6z6Um81RyOpvT4/AOMjRSBCJWFEdCnDr29cwqXTigMRuDb8aUsXz1A5cwpryIpn+Px04K9Gq1VeKhFL9e8xN+/6sncEOCVxLCLwmzp7GVK67+PuvW7e5zLCbBDTNzp0/AsXt/2zJZj2kTKwf47mkgSFLlVvCxLjWJpIZQBBcbH4tGP06pncx7hbGRyVwx+V18fPYPuGDimwc4XqNQTjl2Or/52ttoOyJE82OVaNpC3eA1wccWnzHRJvbfNh7fEjRkBbXY9h4qEd7/sd+RSmX7FIdJcMPMdeceTyTkdCsfRcIO5y2eRVX5wM4KiMQuByJA8DPoIOzzStAeP0aKRcTysPCBoKdM8HHE4qopN7Ow4lTKQmMGNFaj8FJuFnFA2hwanx5L4qViSu0kVbFWxjcmqbtzIn7GRqzcT3OXwcCI4KvHQ4++0KcYTBvcMPLg8o386r6lOCGHyohDSyJNSSzCNWcv5O0XnkBDa5J7n17Dtr0NHDNjEucvmkU0fHjf4l2JA6xv3sWEaDnVsTH8a89qdrRNZbp9AieVPEdI2vCJogepiipwUtFLvJweS4sfo8KJ8IYjvsn42NTD/wIYw0p1WSnxNoumcgFfiDwTYfziFg48PJ4DKyshCbGiFFbcI9XUe6yk71rcv/RxLr7guMOOwSS4YeIbv3+EPz+2uqP9zRIh
Fglx6yffxOSqcjburOWGb/4J1/NJZ13uX76Bn//jGX73yTdTUXzoPZWt2RQfXPEbXmzcQcS2QJS06xOybNK+S8yeze/Dc/nB3CTlkWoqUys5kNnV4ypKhZ2gxE6xIL6b6ZX/S3XpW/vxq2EMByLC0RVVPNuwm+gBn1CbkHipiKYVlZSMaebUd6+g1Y+hvuBlLDbfOZ36NZ3NLJbjUzK29/jJ18JUUYeB39z7LHc8vKpb54KvSmsqw9u+eQcPPr+J//3V/bSlMqSzQUNHMp1lf2MrP/7bk4d8n/VNuzn7oa+x/MBOUh40ZXya0z6eKmk/d10vw55Ulj/WnkpRyfu4YNLHCEkUO/e30kII4TMzfIAJ8dNZPPlhk9xGscVHTgGBaIMiCnUPTkAVTr1pJUk7jBVS7IhPuMRl1ls2U1TdBgQ1gKJxbbznqjf16f6mBDfEHWhq45a7nwZb6VqEVwsyJbCPJB+8/e8A2CFwurTJup7Pv5/bzKffcs6r3kdV+cDy33cksvZ7KeAr2NKZXLPq8eDe5/ng7IuYGDua62f8nFUN93IgvZ3q2FzmV1xAzC7t66Mbw4SqsubFndTVtlA9uYIVK7aydWsds2ZN5OKT5/DTJ5Z2dKJm6yNMPX47KUK92m4tx2fKWTtZ9/tZlM1sZO6JDjVV1X2KzSS4IW75mu04jgXaOa1FCZKb2nRrtvDiYLVC1+W4ws6hrbm1qWUfzdl8PZ8SJLge3fxOl7W8SkPjOH3cDYd0H2N48X2fO7b/k7/sfBTf9zh93ELeNfMaok4wUKiutoWPfuh2DtS1oqqkUlksW/B85T+Pb+D3tz/F5977Or7ymwewWkAUKic3I3km+4kF1fP3U3dqBfW1pcwf1/fVfk2CG+Ki4RCiAj6orcEkJ7t3cmvnRcBKBJ9HQjaXL5nb6xzf99nQfB9N2V3MLHkdY6NHdqznlV/3H8aI5XDJ5OP78ljGMHHDs59lfX2GpBf0nu9oXsPj+97PL0/6NmWROF/83N3s3tWA36X5xPOCz1OpLCnX5bu/eRg/ZJMqBzvps3vDGMoWNfa6l+CjWdiwr5pIC7z40p4+x28S3BB3/JwpJDNZLAU3V2hSiyDn9MxIApYtREM2CBwzfSLvuGBx8B5VyDxLsvXXbGldxpp0MVvdGI/W/pVpsRquqPkRUTtEwsv0uKhiCUSsEIpiiTC3bApvm37GAD+5UWjP1K1mdZ2S9mO0/7BlPYf1jeO56JHP8K2j3svGDXu7JTdyZyqgAi3VIXxbwRLUArfYYuvuak6W52nSoi6zXRQbn0f+swAvCrTAkTV937vFJLghbtmGHThFDtlWFywQHyyX/GsoK7zhhLnMnzie2TXjmHNE5xwZbfkqJP5IRJPMDsPMcCvr0sXcnxjPy8ntLK37CV8/7hpuXnYbnvpkfBdbhKpoKT9cdB27U43sSTYwu3Qy88undFvA0BiZfrrhDyj5qwopL8KHnvkdFXYRZPP3dKZLbHxbgjl67UTww/CHr5/BmW9/gfJxbfgi2Fmfxx8/hn37x4IokbDD5afP7/MzmAQ3xO1vaMUHWidBuAkURVSwE0GbW8fPngbJ7wMXLaGiKN7tGupuhsQfgHTHWMowyuxIK6vSZez1ojzX+CDvm3Uz/zjrQ9y3azW16VYWj5nOyVUzsMRiZunEQXxqYyhw/RSun7+zyFULqzxNLF5Ousdsg/bynBu3uie3LuxQCc8/cx67W1vxu+5So0qxhPjtZ99MeUnfF2Iww0SGuLnTJiAWeHEhUy6oBM2zoRQ4LSAZEBfsFEyPlvVKbgCkH6d9RkFXDsqMUNAt72rwV7gyUsx105fwodnns2TckVhifkRGq1PHLcSW/OsNOpaPWPCOD59FJBoKOsIIJiNYApFYiLDKwWYkUxIN863rLqY0FiGa24shGnKoKIpz98fexvTq/pnpYkpwQ9zcqeOZP3MSjzbvwItBthhCCcADOwt2rroaDTt86tqz819E4gTf
6u5/aX0gi4Wg1MSOHNgHMYadG498K/dufz/bEmO6zFhRbFEs8VG1ufDUYzj2lzXc89cV7NnTyHHHT2XS5EoO1LUSr4jyyd/9i1TG7Xbd0qIIf/7if1FeGue+j17PX1esYcOeOmZPGsfli+ZQEo302zNIIfayXLRokS5fvnzQ7zscqSprNu/hxtv+ygFJgSVBiS0BThqK7BAnzJzMOy86iXnT8q9Lo349uv9MINXteFaFnzfWkCbEjTNupSRsqqFGd43pRm58+lOsbw6SnGP5hG0XRPjwrIu5ZtqSV3z/M2u38X+//BetyTS+r8yqqeJr77qY8ZUl/RqniKxQ1UW9jpsEN3S9vOsAH/763TS2JlGFZDZLeoJFsgQijk1xJMLd73wzE0pf/YdFU4+gTR8iGNfm4WuGx5KT8SJLOH38+4k7ZvK7cXB/3/kM391wHwnXpSIc5zPzruKkqlmH9F7fV3bWNhKLhAZsQQiT4IYZ1/O57OZbqG9q6zYKzbYtjjytmiVzpnH1cfMpix36OvbqJyDzJOBDeAlimT1JjZHhYAnOtMH10c4d9by0cS8Tqss5atbEfhs+sXLdDpKZbO/x3qocExnLO5ec8JqvKVYcouf2S3yGMRyYBHeYXNfjK5/7K88+tRnHsfB9ZXLNGL723TdTWtr37u2mlmTvRVABz1fqGtv6fH3DGA3MGIDD9Oc/PMPSpzeTybgkEhlSqSxbt+znW1/5e79c/9ijJ+N6vQdQRiMhTls4o1/uYRgjnUlwr5HvK57n85c/LyOd7t797bo+S5/Z3OdllgGqKop504XHE410bgUYCTtMnVTJOScfWuOuYYx2pop6iNra0vzwBw/wyCPrcF0P9YIt8lBQO9fupooC2YxLNNr3PUrfe81pHDurmjsfXEVbMsO5J83i4jPnETrEFUIMY7QzCe4QfeITf2TTxr1k2+fdCajTfY4dCupYWHk2hTlcpxw7nVOOnd5v1zOM0cRUUQ/Bpk172fLS/s7kBp0bZLTvAtTl87vvNkNgDGMoMAnuEOzcWd+5809XeYaEuK7H44+tG4SoDMN4Nf2S4ETkAhHZICKbReQT/XHNoWTa1Cp8r/ek44MNkY4X5Z9L15xI8ciqzTy7fjtunusZhtG/+twGJyI28CPgXGAnsExE7lHVtX299lAxdVoVxxwzhdWrt5PJBNXU9uTWc93JaDTEG97Qa0A1dzy2iu/e/TiOHXQQhB2bH998BUdPGTewwRvGKNYfJbjFwGZV3aKqGeAO4LJ+uO6Q8oUvvpEFp0wPlgsXyJRaNM4I4YeDlUqj0RBiCW65w59XreW5TZ1b6a3Ztpfv3f0f0lmPtlSGtlSGhtYk7/n+XWTzjHUzDKN/9EcvajWwo8u/dwIn9jxJRG4EbgSoqanph9v2jarSlsogItz7+Is8uHQD8UiYK1+3gDOOm9FrylU47NBWHaJufve5n/WzLYoyFskWD9cJ4ztK3drtrNq8my/dcCFnLZzJ3U++SMbtnciyns/yDTs5ec4RA/qshjFaDdowEVW9BbgFgsn2g3XffP71zHq+d8dj1DcnUFUssfD8oE3s+c27ufzM+Xzo2jM7zldVXnhpD02teXadEqEtrNhxC8tVxFPUglTG5Rt3PMKZx86guS3VfdXSzivTmkoPzEMahtEvCW4XMKXLvyfnjg1JT67ewpd/+UC3Rfi8LlvyJdNZ7np4NdeedxwTxpSyc38j7/3WnTS2JHFtDb5i+TpUXcVOKZFmH7UhXWZzoDlBazLN2QuP5Mk1W0lmus9wyLo+i46a0vtihmH0i/5og1sGHCki00QkDLwJuKcfrjsgbvnL071WGO3Jtiye27ATVeXm79zNngPNJFIZ3IYsdsIH1dy2QcGHnQz2Q3CSLk7CxUl4ROtdbF+IhkOcs/BIZteMIxYOZjdIbgXemy45hYrivk/MNwwjvz6X4FTVFZH3Af8CbOCXqrqmz5ENkN21Ta96jgiUF8fYsL2WuqY2
JKPEGn1QiDaD50Cq3MKPWMHquhmf0q0pxA82tlXASXpUHzmuY1rVTz9wJQ+t3MSDKzdSHItw5anzWTB90gA/rWGMbv3SBqeq9wH39ce1BtrMKVWsWJ/rE3E1SMldOhQECDsOC2dN5sWX92IJRBt9pMu4ECcLRbU+qQrwIhZFezKI11lzFYKN6DM7WzuuG7JtLjzhaC484ehBeErDMGAUzmR47xuXEA07WCmfUEKRlHZWOQk+TbdmuOSmn/HyjgN4yYMP43ASwXudRO9d4QVoqm3N9zbDMAbJqEtw82dO4jPXn0coC74DGu2ysagquD7pTJYD0RRfffhhJk7Ovy+kAJar4B+8QzgcNmsZGEYhjcrfwKb6BLZjkY5q7/mkDtTP8chUgIpPQ7aOKUIwurcrVUItHtF6v2NWQ9cznJDNOef3fWduwzAO36hMcNFICN/JLebWgwo4LRbpqmDoiG+BGwInHUzK6niXdL67Y1243L8dx2La9CpueNdZA/0or4nrZ/jD1v9ja2INihKz4lxcfTOzy04tdGiGMSBGXRUV4PQTZx50ojzQrSjmtIHldR7ULif5UQsVcKMWmQqHbImDG7fxysKkSmwsZ2h9eX+6+SZeTqzJPYOQ9BP8ecfX2db2YoEjM4yBMbR+AwdJaUmMT954HnlH7FqQHts58Ffa+xgkz4ctZMtC+DEbLAsNW/gxmzQ+O3c38uBjQ2e9gb3JLdRn9vU4GhQ979/9k0KEZBgDblQmuEdeeIk7Vr5AvDKSq15qx0eqUnGLOs/NloL2XCG8I8lJ7za8nFQ6y+NPbxygJ8gv67exev//8sT2E3ly+5m81Pjbjte2tT1/kHcJDZn9gxOgYQyyUdcG97tHVvDD+57qnM1QBHZGcJKK5UG0HlpcwAJ1IOY4tCxUypf6vddGapcnyYlAeVl8IB+lG9dPsnTHEsI0EhcfVahv+gz1icc4YdKtTInPPsg7lbKQ2dXeGJlGVYJLZrL8qGtyA7AFL6qgQqRVcZLC+CcsEhOVmqPH8NaTF3HptKNpbkjykc/9mR27GnpdN193RTjscPmFCwf0ebpaf+DbhGnElqB6LQI2ip99iIbUelq9NmJWMUm/59g84byJ7xy0OA1jMI2qKuq2/Q1Y0vuRVYRsMbRNENrGCX5IKN5p8T/HnMabjjqGeCjMhHFlvOutp+fdLSsSCTFubAmxWJiieJhI2OGm689izlETB+OxAGhMPtCR3LryEf667Sb+tP1bJH0fv2O4ixKSEJdOei8zSo4ftDgNYzCNqhLcmJJ4rwUmFXDjuXY2S0CVbIlQ3upw5glHdjv31BOPZMniGTz57GbSGZeQYyMifP5jl3DicdNZt2kPbYk082ZVE4+HB+/BAEvK8P3gEXpK+D5pv32pJwuLMG854tPMKDlmUGM0jME2qhJcVVkxi4+awtKNOzoWoFQn+Oi+MxYkKmB/SyvjSoo73m9Zwmc/fDFrNuxm2XNbKSqKcPZpRzOmIjhn7qzCTZ6vKb+JPfU30rWyrAq+Cvvc7rMxsprlucaHTYIzRrxRVUUF+PrbL2LJ7KmEHZt4JIQdtfJ2HDi2xTNbdvQ6LiLMO7qa669dwtWXLupIboW0P7mKzS0PsiM7Fk8FVwVfIas2S5Nz0F7fZiXtJQoSq2EMplFVggMoiob57n9fSkNrksa2JLcvX80flq7uteKuiBAP9313+oG2ruF2nq+/hYyfBhwyalNqBasER8Tj9Pjz/KN5Pik6d/oKS4R5ZUsKFLFhDJ5RV4JrV1EcY9r4Sq5eNJ+w03OgW1CoO23m1EGP67VIe008X/9TXD8FKNV2E6V2unN8ngi25fP60heR3Lc6LFEmxWcyr3zoJrjGdJKfv7iU/3niPm5bv5K2bKbQIRnD1KgrwfV01PixfOKCM/jqPx/FsWwQsEX40qXnks66REJD90tUm1qNRQgIEsBYp5Xe9W3BsTxOKl9Iq8aYU3YSs0tPwpbeSb0Q9ida+daK
J9jV2syp1VN53eRpvPH+28l4HinP5R9b1/P91U9x7yVvZ3y8pNDhGsOMaN7NUAbWokWLdPny5YN+3658X1n23FY2bN7L+KpSFi6sYdWevWzaVcedj64mkcriqbLk6CP44nXnUxqPvvpFB9lLTX/lsb3fJosSkywLozvzD0RGqa74AZXFVwx2iK/o/pc38q5//63bMdtRgv+6HBPhoqlH8/0zLh3cAI1hQ0RWqGqvDYmHbvFkACVTGT7wyTvYvrOeVDpLNBIiHHb48IfO57f3LyeV7RwI/OT6rXz41nv5xc1XFTDi3moTK3lo73fwsAAhpZBVi5B45MtyRd5WVJOIDJ09IN73yL10nx6ieKq9JoZ4qvx7x0uDHJ0xEozKBHfbn57h5W11ZLLBUJFkKksqneXr37+fTEX3cXJZ1+f5rXvZUdvIlKryAkSb32P7Po/XMSkWXByWJqbS6sdI+Q7jQ83Mi+6i2E5TjkDi52QzjxAa+xdECv9tX7FvF1nf5yBFzl5C9qhtLjb6YFT+1PzrkbUdya2dKrQ1plC392yAkG2xr/GVlx9XVQaruq+q1GUa6Zocmt0IO7NjaPCKSGqErZmxPNAyj6hGGeeEgTTqbcVP/3tQYjw80rlbWRcR2+aNM8ziocZrV/g/5YVwkERkieDYFtkev2EZ12PmpPwT0uv2N/PDr93H0ic3ISKccuYsbvrY6ymvKMp7fn/Y1XI3Nj5+7u+Tr9DixeheGhJctViTmkRNKLdNrbbhZ5ZhR88fsNgO1fHjqwlZFlm/R5XaF+IRBxHwUQRh7pjxfOS40woWqzF8jcoEd95Zc/nz35Z3K8WJwLQjxvJSUZKWZBovt9dCNOxw7WnHUl7Uu+0qk3Z53/W/YGtZhrYlcRQ4sGML6995K7++433YA7Tg5damXzPZqWebOxYfC1dtBEV7VPd8LLZluw5EjiLW4M22SHkZnmvYgiAsrJhOxO4+rvB7Z13Ee/99b+5fQVtcUSjMk2+8kQ1NdWxtbuDoiioWjJ2IHGRZKsN4JaMywV139Ukse24rO3bVk0pliUaDTob/+59LeCnVwG8eXMHWnQ1UlRZx4bFH0dqU4v/99iHOPG4mJ849ouOX7T//XsPG6UqqKAx2cKx5cogXUlmeeWIDS8482BJFfeNrHZOdRlIaZq9XSkh6J7eAUm53GUMmNnbssgGJqacnatfyuRf+gNVlJeQvLbiOE8cc1XHORdOOZum11Xxt+X/Y3drMadVH8K5jTsSxLE6K1XDShJpBidUYuUb1MJGlK19m/aa9TBhfygmLp3HjA39jXV1tx/pHk1piJNcnyboeqoqNMGdcJf91ziKqa8bwu/uW8ud9L6FO1yqWEmmASFqwRDhp/lT+562vY+LY/LtzHY7n970fK30XlkBabdr8CCsTNdS6JR3VVoAQylvKt1ETSoM9hlD597HCx/ZbHAdTl27m6ie+TtrPdjseEpt5FVHSmuS48nlcU3MRleHyAY/HGPkONkxk1CY4VeWFVdt54skNOPEQG8cnuGfnRjK51UbEhTGrBGlfXkiV4i2thJuyhBwbx7GwSsNsWBBCI3bHObH9YGU7W5UsSygvjnHX199BUaxzhZGXDzSws6GJp7ds45E1W2isT+B5ytzJ4/jQRadxzBEHX2op7daybvfxhPA6hlRk1eLptunszFYAQtQu5pJJ7+KYkmlAFuwjBq2ad8fWx/nx5n/iafcOG0EpCaeJh7LYWBSHivjusZ+hNFT4+bzG8GbGwXWRzXp87ObfseaFHainweq9AuGTY2SmB0kg3Aziacf6Q9E9KcJNWUTBzXq4WQ8r6zHm+Qx1JwSlMysDltu9qd/3lWQ6w/1PreXKs49lR0Mj77njHrY3NJJ2Paw02OnO9yzfsosbfnonv3rPVcyrmZA3/ohTRfXYX7Gn7nosDRKyjc+C+F6Oci7kyPKLqY7PxirQbIV/7l6F5/u9RoAonf07Hj4JN8n9ex/j6ikXDXqM
xugwKhPc3/78LC8+vx3xQS3IlIXwwkLFOpdUpYNbLsE2qF0Kt9G6FNKjsOu7PrH9PrgKjmBl6TXEASCZdtm4vRZflet++Sfq97YhWXAcuoxk65TKunz//ie55cYrD/oMVfGzqZy8kV1NPyflbaUidi7zii44zK9I/9neVsemlqCan6+8GHY6O3ay6vJi0yaunjJ48Rmjy6hMcLfe8gjigxcWWqYFvZ/YAp5SuV7xw2CnghKZG1PUFvIslguAhTCrvJItbY1IRLAsRXvsdh8NOxxZU8U9K9bSsqkNR4MFNj0b/HD3Km279btqX/U5bCtKTcXNh/U1GCgr6l9GsHBVsOmclaAKlighq/MLaWExIVpVoEiN0WDUDfTdsb2WtAa70bdOiaEWHT2glquEWyB6AMJt4GSUcItip3zcIifvXqrZsHDg5WZK91q8c/HxzJg0ptuoe5Fgo+kLTpnNr/++DBQyJdBWDZlKyBZDuhK8Hn9qqiv7r1NiMFWEi7DFxvUtsr6F5wuuL7i+RcTq/lciZDlcPGlobY5tjCyjrgS3dOlG0qU2liv44S7b/qkSSnYvSalIUMLyITU+jtPWjPqKFexRgwq0Ti3GTwdzV//4wHN85C1n8Y+n17JqXTC4VsNCa7HHul372bG3AS8MmQp6/WnJloJVH9w/GnJ4z3knH/IzJbIZ/t/Sx7lr0xoynsep1TV87pSzOaK0grXb9vLki1uJR8Oct+goqsoGtkF/SdVRhG2bNk/wcwtvAkStECdVTWBz20tYIhQ7cd474zpq4oVbBdkY+UZdL+oLz7/M9d+8i+KdKdpqYp0ZzYdIc/fRZJ7duZy5b0GmSInvTRNqdnHjNomJEfyITait8zJTJ1WyRVtIZ92gXpZLoPGwgzb5tJb6ZEvpXSf1IdQKUd/i45edyTWnLDjkZ3rT3+9g5f7dHT3AFkJpJMKFOoPHVrxEJuvhOBaC8P/++/WcsWDGYXzlDt3mln18aMXvaMi0YSFYYvGlBVdx6rhZtLoJEm6SqkilGbxr9JsB6UUVkW8AlxAsSPYScL2qNvblmgNt/jHTKM8qfsYn3Nw52dvLt3ivRUeCciOChoS2mh57naqiNkiu7XxffQt+Wa4q1vUXWIQ508azrHHPK84vV0+5++kXDjnBvVi3j9W1ezqSGwRTnBKZDP/YvQEnA1ZW8ds8fEf41C/u46FvvptYZOBWK55ZMp57zvgIm1v2kfazzCqdRMgKenSLnTjFzuDtF2uMbn1tg3sQmKeqxwAbgU/2PaSB94n3vJ7M2BiCdPRi2tlctbNrS5tP57iGLsmuJ+1yOF4SIevl2b7PV846YSbTisqD6/YkYGfA85WttY2s33Vou81vbjzQMVugq4z6pCIu0QMe8f0ekUafeJ2HvTvNo89tOqRr94WIcGTpBOaVT+lIboYx2PqU4FT1AVVtXzztGWBy30MaeKvX7+m1v54AooqVcoMkp4rlasfgLSurB5+k7+VOE9hnJfPOQVWFs+bN5J5PvJ2Tj5hCuP2XPrd6htNKxzAU2xL2NrYc0rNML6vMmy9thHADOKmg2m1p7voZ5W93rzykaxvGcNefvajvAP55sBdF5EYRWS4iy2trX30IxEDasu0g91fFizp4lmClfZyET7jJQ7KKnVbEBboMAVEU3GAsnRuF1BjI4OOpT7TLUuexsMM1S46hZmw5lmXxm7dcyU+vupQFVeMJpyHSAKFUZxgZ12N29fhDepb5Y8czu7KqM2ESJOuI7VC2Jd8C5rBp0z7S6SyGMdK9aoITkYdE5MU8H5d1OefTgAvcfrDrqOotqrpIVRdVVRV27NPsIyfg5CtlWVYwr9QCyQ3Et3yItPjE6j3KtmaJ1HsoihcCyYKo4NuCFxM0N9xEVTl11hGcNW8G5y84iu/+16V89JLTO+4jIpw2fSq/fPMVTJAiwl2+DbGwwxsWz2V8+aH1dooIv73wKi6fOYeIbWOJcOLEKdzzhuuIegepGiq9
1sMzjJHoVTsZVPWcV3pdRP4LuBg4WwvRJXsYrr54Efc++AKu27nSRnsVUy1BfUWFXjMXIDhmZXNTsqRzuL6ThLQF4gQXe2lLLfd88YZXjKM0HuVPH3kLP33gWR5bs4XiaJi3nLaQK06a95qepzgc5utnXMDXTj8fJVjXDuB1S2bx0GPr8HsMPJ4yuZKS4qG3x4Rh9Lc+DRMRkQuAbwNnqOoh1zuHwmT7rTsP8P1fPsLSVVuD5JYbEqIiQYOZp0Sbcq1buW0DvLCQqLLxw7nuiW6D5sAXcIuDdrvxmQj//v5NBXiyTrUHWrjxg7+lLZEhnXEJhWwc2+I7X76G2UcdfDK/YQw3AzXZ/odABHgwN6bpGVV9dx+vOSimTh7DlZccx6qtu0mmOtuj2nOW0+rTNtbCyQaT7t2IhRcV1Ap6XvM1blkalPwiTTDryENrQxtIVWNK+O1PbuCfD73Ii+t2UTOlkssuPJaqMWb7PWN06FOCU9WZ/RVIIaxcu71bcuvKj1lE2sCNCdkiC7EtysqjpNwsqZSb9z0K+FbQq/rxt75uACM/dCXFUa6+fBFXX97rj5thjHijbi5qV2MrigmHDj5GS0QIpSDWqIxNOvz4fVeQbcvfOK+qQSkuC20ToLjk4G1c9W0J1u7eT21zK1nPNPYbxkAZdXNRuzp/yWx+/uengJ5JpvfqIa7ncetfnw5qph5g59ou29vsAC8kWL7ih4X/bNnK5fPndLtGOuvy6b88wIPPb0JaffCCBTEvWnQ0n7r67AGdXWAYo9GoLsFVlhXxrY9dQWVZnFgkRDQSoiQeCdrdepybSru0tqVRVSwfaF/7TRX83GyGXGEw5FvY0vtL+5X7HuWhFzZDk9+RU31fuW/5Bj5y6729zjcMo29GdQkOYOHsydz7o3fz0o5aHNumoaGVj3/lr6S87m1zsWiIcxfPYvnW3cHMBghG/pFre8ttuiW5mQlnzJza7f0Z1+Vvq9biJjxsQFwN5q8qqHgs27h9yG0ubRjD3aguwbWzLOHII8YxbfIYFs6rYfbMCUTCnbk/EnaYMbWKC06bg4Y0N2e188OLAJZ0zGM9bdxkSqPd2+ASGRdfFcsjmBWhBF/93MBiL+GzfOOOQXpiwxgdTILrQUT45mev5IZrl3DE5DEcMbmS/7rmZL77+at5cuUWYp5NYgykKyBdFixY6UWDaVvig5OAfft6zyMti0UYU1yEb2nnxP32yfsSJLnv3P7ooD6rYYx0o76Kmk845HDt5Sdw7eUndDte29AKCZ+YKIkKIdIKXm7JcSfVOb1rXG6alev73LZsFXeseJ5U1qWqNE7tzub8NxWh1TXzQw2jP5kE9xrMnTGBkGMTafYJtyhuFEIJui2jFA07vPX8YMzZR//yTx7ZuIWUGzTW7W9pRcJAKs/FIe+GNYZhHD5TRX0NjplVzZyZE4mEHUSD+aeWBjMbiqJhYpEQN195GifPm8pLdfU83CW5AXiquHG675/XTpUS65WHifScU2oYxiszJbjXQET4zsev4I//XMnfH30BT5ULT53D2afMIpHKMm1iJdHcWLYXdu/FzrdApi1kY8H+D12LbJKBlO2y+HM/JBYLc80J83nnGYsJ2Tb/WLaO7/3jCfY1tjK2JM57LjyZN55yzOA8tGEMYybBvUYhx+a6S07guktOeMXzJpaW5F2aPGRZWKWgzS52+8rmPvhOsL2qW5ehtTzLLx5fzro9tVw0exZf+NNDpLJBSbCuJcE3/voYqnDVEpPkDOOVmCrqADnhiMlUFRX1KsVlfZ+U7ZGoEVJVgjqChgXJTeJ30sF5KdflkfVb+N9b7yPT7Hbs+QCQyrj85P6nB+9hDGOYMglugFgi/O5tV7FwyiTCWITbINwA4UawU8EQETcC6XK6dS74dlBdtZKKtPh4XrB0upVQJNN5Yn1LAjfP3g+GYXQyCW4AjS8t5qdXX0ZlJoJk2vd9CPZJiB5QindBtJ7c5ja59eTiwfJMgoAt
uPFgKEr7HNj2vomxZUU4tvn2GcYrMb8hA+yeZWtJZnqMbxMJFti0g4QnqoirZHOzJKRr450IXm4amO2CbyvRkMPNFy0ZvIcwjGHKdDIMIFXln8vWdXQQdH8R1FKsZLD8OUBxCjLFSqaMXlsUam6fCATef8kSLls8d8DjN4zhzpTgBtDP73yKTZv35d9uUMDK5PZ2oPMj3Aah1t7nt+8PEXFspoyrGMiwDWPEMAlugCRSGW6/bzm0tO/r0CVpabBvg5PKs/K5QrjrVFZVrNzSTH4Ysq7PlDFlAxy9YYwMJsENkF37m7BtK0hYDdpt+RHfgUyR5t21C3JVUQ02mpYs2IlgxRIvEmxVWBKJDOKTGMbwZRLcABlXUYyb23s02LgGcBVVxY0BjoUXUiTrY2V8JOt3lvIEiiORYBVNB9xS8CPB0JKIZfHgyo0Fey7DGE5MghsgZSUxzj5pFrYPkbosVtLHjQfLK4Fgp5Rwq2K5wSY1lgt2KkiAH7zhLN566kIsDc4FAT8ovflK/k4LwzB6MQluAH307a8jUpelbm6I5GQLPyYdC1yWbPc6lleCzv9bAleds5Bzjj2SmDjBJtPZIAkKYFsWp8+bXqAnMozhxSS4AfTjn/2btlLwSnOLW7Z3laKE2vJurYqVVJpTaY6qruLKU+cTCzvB2yRYiuma0xcwY+KYwX4UwxiWzDi4AbRi1TZapx5kCSThoOu/FUfCAHz0ijM459gjuW/ZegBef8LRLJxRPQCRGsbIZBLcACotj+OHE71fECFdJkQatVspToGiqhiOZeVOExbOqDZJzTAOk6miDqC3v+00tLNe2k1LjYUXpvsGNgI//cKbBztMwxixTIIbQKccNy3YQ9Wl+zZcCorQPMWircrCCwleWCipijO1urKgMRvGSGIS3AASEWYWlYMoVhtBonODgbtOQrE8IdoCOEI0HuKGN55c2IANY4QxCW6A3fWpt1OiITSsOIngQ9xg+lWsVrEsoaQown9fdQpXnr+w0OEaxohiOhkGWMixWfalm3lkzUvc8cxq0skskYQwc9IY3vfG0/F8n3g0jGXlWd/cMIw+MQlukJw1dwZnzZ1R6DAMY1TplyqqiHxERFRExvbH9QzDMPpDnxOciEwBzgO29z0cwzCM/tMfJbjvAB/D7MtuGMYQ06cEJyKXAbtUdXU/xWMYhtFvXrWTQUQeAibkeenTwKcIqqevSkRuBG4EqKmpeQ0hGoZhHB7RfPsFHMobReYD/wbaJ1tOBnYDi1V176u8txbYdpCXxwJ1hxXU4BnqMZr4+m6ox2ji6+4IVa3qefCwE1yvC4lsBRapap8eSkSWq+qifglqgAz1GE18fTfUYzTxHRozk8EwjBGr3wb6qurU/rqWYRhGfxiKJbhbCh3AIRjqMZr4+m6ox2jiOwT91gZnGIYx1AzFEpxhGEa/MAnOMIwRa0gnuKE6iV9EviEi60XkeRH5i4iUFzomABG5QEQ2iMhmEflEoePpSUSmiMgjIrJWRNaIyAcKHVM+ImKLyHMi8vdCx9KTiJSLyJ25n791IjLkVkkVkQ/lvr8visgfRCRaqFiGbIIb4pP4HwTmqeoxwEbgkwWOBxGxgR8BFwJzgGtFZE5ho+rFBT6iqnOAk4CbhmCMAB8A1hU6iIP4HnC/qh4NLGCIxSki1cD7CcbEzgNs4E2FimfIJjiG8CR+VX1AVdu3l3+GYBZHoS0GNqvqFlXNAHcAlxU4pm5UdY+qrsx93kLwyzmktgwTkcnARcAvCh1LTyJSBpwO3AqgqhlVbSxoUPk5QExEHCBOMMOpIIZkghtmk/jfAfyz0EEQJIodXf69kyGWPLoSkanAQuDZAofS03cJ/rD6BY4jn2lALfCrXBX6FyJSVOigulLVXcA3CWpee4AmVX2gUPEULMGJyEO5OnrPj8sIJvF/tlCxHUJ87ed8mqDadXvhIh1+RKQYuAv4oKo2FzqediJyMbBfVVcUOpaDcIDjgJ+o6kKg
DRhSba0iUkFQc5gGTAKKROS6QsVTsCXLVfWcfMdzk/inAatFBILq30oRedVJ/IMRXzsR+S/gYuBsHRqDCXcBU7r8e3Lu2JAiIiGC5Ha7qt5d6Hh6WAJcKiKvB6JAqYjcpqoF+wXtYSewU1XbS713MsQSHHAO8LKq1gKIyN3AKcBthQhmyFVRVfUFVR2nqlNz0792AscNZnJ7NSJyAUE15lJVzbN1fUEsA44UkWkiEiZo2L2nwDF1I8FfrFuBdar67ULH05OqflJVJ+d+7t4EPDyEkhu534EdIjIrd+hsYG0BQ8pnO3CSiMRz3++zKWBHiNl05vD8EIgAD+ZKmc+o6rsLGZCquiLyPuBfBD1Xv1TVNYWMKY8lwFuBF0RkVe7Yp1T1vsKFNOzcDNye+yO2Bbi+wPF0o6rPisidwEqC5pvnKOC0LTNVyzCMEWvIVVENwzD6i0lwhmGMWCbBGYYxYpkEZxjGiGUSnGEYI5ZJcIZhjFgmwRmGMWL9fyzDBX1QXoobAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "yvals = []\n", "fps = []\n", "for mol in pybel.readfile('smi','er.smi'):\n", " yvals.append(float(mol.title))\n", " fpbits = mol.calcfp().bits\n", " fp = np.zeros(1024)\n", " fp[fpbits] = 1\n", " fps.append(fp)\n", " \n", "pca = PCA(n_components=2)\n", "res = pca.fit_transform(fps) \n", "\n", "plt.scatter(res[:,0],res[:,1],c=yvals)\n", "plt.gca().set_aspect('equal', adjustable='box');" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "notes" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 0.97673441]\n", " [0.97673441 1. ]]\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAg+UlEQVR4nO3df5RcZZ3n8fe3iwK7GaWDxBloEpJxmCAxkKw9kt3MeiQiYUQwIogRdtd11+ieHX8dJmMYGH44eMhs1h97dGdGBHfGA2T4EWzB4AQ1mUFyDEOHJoZAMiO6JBTMGDd0BNMcis53/6iqTnX1vVW3qm7Vrar7eZ3Th/StX0+RnOd7n+/zPN/H3B0REUmfvqQbICIiyVAAEBFJKQUAEZGUUgAQEUkpBQARkZQ6JukG1OOkk07yefPmJd0MEZGusmPHjl+6++zK610VAObNm8fo6GjSzRAR6Spm9mzQdaWARERSSgFARCSlFABERFJKAUBEJKUUAEREUqqrVgGJiPSikbEc6zfv5fnxCU4Z7GfNigWsXDJU87FmKQCIiCRoZCzH1fftYiI/CUBufIKr79s19XjYY3EEAQUAEZEElO7sc+MTMx6byE+yfvPeqT8HPaYAICLShSrv+oM8HxAYojxWD00Ci4i02frNe6t2/gCnDPZzymB/6GNxUAAQEWmzWnfw/dkMa1YsYM2KBfRnM4GPxUEpIBGRNjtlsD8w9w8wFLDSR6uARES6SLXlm2tWLJgxB9CfzXDzJYtmdO4rlwzF1uFXUgAQEYlZtaWd5R16q+7so1IAEBGJWdAkb+XyzVbe2UelSWARkZiFTfLGtXwzLhoBiIhEFJbXr7x+Qn+W8Yn8jNfHtXwzLgoAIiIRhOX1R589yMYduWnXsxkj22fkj/jU6+NcvhkXBQARSbWoxdbC8vobHt3PpPu06/lJZ9ZAloFjj0l0krcWBQARSa2wu/p7Rvex/WcvzujYg4Q9Z/xwnrHrzo+1vXFTABCRVCm/4+8zm9GBT+Qn2fbMwcjvZ0BQCOi0fH8QBQARSY3KO/4od/i1DByb4YgzY1NXp+X7g2gZqIh0jZGxHMvWbWH+2k0sW7eFkbFcXa+PUoStXodfneTmSxYxNNiPUSjlELSjtxNpBCAiXaHW7toowurvNOOUwf6O2NTVCI0ARKQrVNtdG8XIWA6LuU3Zj
HVFqieMAoCIdIVmd9eu37w3cLK2KbG/YXspAIhIx6iW42/2cJRqgaKy5n5U+SMeeQTSiRQARKQjlHL8ufEJnKM5/lIQWLNiAdm+6UmcbF/0FExYoChN2g41uGyz0+r71COxAGBmc8xsq5k9ZWa7zezTSbVFRJIXKcdfmcSvI6lf7XStlUuG2LZ2eUNBoBvW+4dJcgTwGnCVu58JLAX+u5mdmWB7RCRBtXL86zfvJT85s+RCZQomLI20cslQzeWatUYTrTyeMQmJLQN19xeAF4p/fsnMngaGgKeSapOIJCfsmMTSHXaUSeAoB7FUW665cskQNz6wmxcPz6zkWTqqMelDXOLUEfsAzGwesAR4NOCx1cBqgLlz57a3YSLSNmHHJJbusGsFCIh2EEulymJwF5518rTqnuXt6Nb1/mESDwBm9hvARuAz7v6rysfd/RbgFoDh4eEuX3QlImFqHZNYK0BA9KWipU4/Nz4xrZZPbnyCjTtyfOBtQ2zdc6Bn7vTDJBoAzCxLofO/w93vS7ItIpK8anfYUc7RDTuI5YT+7NSfK9NElXeVE/lJtu45wLa1y5v8Np0vsQBgZgbcBjzt7l9Kqh0i0j1qpWAsZFVQ+fUo9YC6eWlnPZJcBbQM+A/AcjN7ovjzngTbIyJdbjxg8rbyepTOvZuXdtYjyVVAj1DXKl6R3hT1RCqpLWyi2IEln3+IC886OfAMgHLdvrSzHolPAoukWRwVLtPk2pFdU0cwZsxYdc4chk87cSqAntCfJZuxGfsFAF48nOf27fsC37c0ETyUsgBsHsOBCO0yPDzso6OjSTdDJDbL1m0JvGMdGuxPxSRkPa4d2RXYgWf6jMmyw9ezfcZr7kTt2jJmfPGDZ/d0p29mO9x9uPK6RgAiEbUiVdNshcs02fDo/sDr5Z0/FAq01WPSvac7/2pUDE4kgpGxHGvu3TmtUNmae3fWfSJVpWYrXKZJHMc3BjFo+u+xWykAiERw4wO7A+vQ3PjA7qbet1qBMjmqlR20Q1eXdG6GUkAiEQTVhql2Paoom5skng56INvH4fyRwMfSmnJTABBJWK/Vl2mFWmf5Dg3283wxPRfGMWYNZAODdlpTbgoAIhEMhpQYGCwrMSDxKp90r6W0Ymre2k2hz5nIT3LcMX30ZzNV6wmlieYARCK44eKFgadR3XDxwoRa1BvCavdfO7KLz971xNSke1S1DnQ5NJGveSZAmmgEIBKBcvXxC9sEN/rsQe7Yvq+h89aDKoaWO2WwXym3MgoAIhGp44hXWO3+DY/ub6jzh6OBOuhQlzSnesIoBSQiiQjL7de73r8y7bNyyRBj153PVy5frFRPDRoBiEgiwgq3ZWoUa6sUdlevEVttGgGISCLCNsGtOmfOjOvVjD57MO6mpYYCgIgkYuWSocAVOTetXDTj+pVLw88DD6sRJLUpBSQiiQlL0wRdDyvl3KoaQWmgEYCIdIVMyHmPYdelNgUAEWmpsM1e9Vp1zpy6rkttSgGJSEuMjOW44f7d00poNHPi2fBpJ3Ln9n2Ul3PrK16XxmgEICKxK+3yDaqfNJGfbKi65/rNe6ms5XmE9JZyjoNGACIyTTMnn5VeW6t6ZyPll3V6WvwUAERkSjOH1Fe+tppGyi+HbRxLaynnOCgAiHSQVpw7XI+w+jylNEtl20afPciGR/fXtRSz0Zo8QYXeVN+nOeZdtIZ2eHjYR0dHk26GSEsE3UH3ZzNtrWEzf+2m0EJslXX0Deou2jZrIMv1Fy1s+PskHSC7lZntcPfhyusaAYh0iGp33+3q5KrV56lsW711+uPorFXfJ15aBSTSITphkjOsPk+ju237sxm+cvlitq1dro67AyUaAMzsm2b2CzN7Msl2iHSCsMnMdk5ylurzzBo4etTlccf0Tfs9CpVg7g5Jp4D+Gvga8K2E2yGSuE6a5Hwlf3TF/fhEnmyf0WdwJMJA4Mqlc7lp5aIWtk7ikmgAcPeHzWxekm0Q6RSdcuxk0
FxE/ogTpeTOsRlT599Fkh4B1GRmq4HVAHPnhpeEFekFnTDJGTbnUGsaIJsx/selZzf12Vrl014dPwns7re4+7C7D8+ePTvp5oj0vLA5h2pVN4cG+1l/6dlNddalZbC58Qmco5vQGi0eJ7V1fAAQkfaq56SuOFf51NqEJvHr+BSQiLReZerlA28bYuueAzNSMcOnndiyFE29y2CVLmpeogHAzDYA7wROMrPngOvd/bYk2ySSJmElm+96bD/HHzuze2jlHEU9tX6aqVkkRyWaAnL3Ve5+srtn3f1Udf4i7VOtZHN+0hmfyLc1Fx+WegpaBqt0UTw0ByCSUkGdaJh2dK5hh8QH3dF3wq7pXqA5AJGUqrezbEfnGjXFpNLQ8dAIQKRH1TqLt97OspM613rSRRJOAUCkB0VZU79mxQIibO4FOq9zrSddJOGUAhLpQVFKS69cMsToswe5ffu+0Pcx6Ngllp2wa7rbaQQg0oOiTpLetHJR1Uqfndr5SzwUAER6SCnvH1a2JyiPf/1FC2fk00tUjqG3KQCI9IjyvH+QsDx+eT49iNbX9y4FAJEeUW1df61J0pVLhti2dnnopLDW1/cmTQKL9IiwTtqAbWuXR3oPra9PF40ARHpEHEdKan19uigAiPSIODpvra9PF6WARHpEXEdKan19eigAiPQQdd5SD6WARERSSgFARCSlFABERFJKAUBEJKUUAEREUkoBQEQkpbQMVKRDjIzlml7DL1IPBQCRDlCq5Fkq5lYqwwwoCEjLKABI6rT7TjvK50U5wUskbqEBwMy+CqHnSuDun2pJi0RaqN132iNjOdbcs5P8EZ/6vDX37JzxeVFP8BKJU7VJ4FFgR5Ufka5T7U67FW64f/dU51+SP+LccP/uadfiqOQpUq/QEYC7/007GyLSDu2+0x6fyEe6vmbFgmkjE1AZZmm9mnMAZjYb+BxwJvC60nV3j3bCRPX3vgD4X0AGuNXd1zX7niLVdOqBJ3FV8hSpR5RJ4DuAu4ALgU8A/wk40OwHm1kG+N/Au4HngMfM7H53f6rZ9xYJ08l32qrkKe0WJQC80d1vM7NPu/s/AP9gZo/F8NlvB37q7j8DMLO/Bd4HKABIy7TyTjtotc+sgSwvHp6ZBpo1kG3680SaFWUncOlf7wtmdqGZLQFOjOGzh4D9Zb8/V7w2jZmtNrNRMxs9cKDpgYd0uZGxHMvWbWH+2k0sW7eFkbFc3a9vVed/9X27yI1P4BxdXXThWSeTzUw/aj2bMa6/aGHTnynSrCgB4CYzOwG4Cvgj4Fbgsy1tVRl3v8Xdh919ePbs2e36WOlAYZ1s1CDQ7OurCVtdtHXPAdZfeva0IxbXX3q2Uj3SEWqmgNz9u8U/HgLOjfGzc8Ccst9PLV4TCdTsZqlWbraqtrpIuX3pVFFWAf0fAjaEuftHm/zsx4DTzWw+hY7/Q8CHm3xP6WHNLOEcGcsFrv6J+vpaOnV1kUg1UVJA3wU2FX9+CLwBeLnZD3b314A/BDYDTwN3u/vu6q+SNGt0s1Qp9VPv+9ZjzYoF9Gcz0651yuoikTBRUkAby383sw3AI3F8uLs/CDwYx3tJ72t0CWdQ6qee10PtyWOt45du1EgxuNOBN8XdEJFaGu1kq6V4br5kUc3XR60fpFy/dJsocwAvMX0O4F8o7AwWabtGOtn+bB+H80dmXJ81kE188lgkSVFSQK9vR0NEWmFkLBfY+QN4aK3b6VSpU3pVzUlgM/thlGsinahalc9DIYXaKqlSp/SqaucBvA4YAE4ys1lAaTvjGwjYsSvSiardpVd24NeO7GLDo/uZdCdjxqpz5nDTykUdXT9IpBnVUkAfBz4DnEKh/n8pAPwK+FprmyUSj7D1+QZTHfjIWI5rvr2LX796tIOfdOf27fsAuGnlIkArfKT3mNdIhJrZJ939q21qT1XDw8M+OjqadDOki1Su4IFC53/F0rnctHJR4OPlMmY8c/N72tRakdYwsx3uPlx5P
coy0CNmNuju48U3mgWscve/iLmNIrEpX7d/Qn+W12X7GD+c55TBfs49YzZb9xxg3tpNNd9nMupMsUgXirIT+GOlzh/A3V8EPtayFok0qbLo2/hEnlfyR/jy5YuZ98Z+bt++L7QsRKWMWe0niXSpKCOAjJmZF3NFxYNcjm1ts0TqV7rrD+rcJ/KT/Ml9PwldEhpm1Tlzaj9JpEtFCQB/B9xlZl8v/v5x4Huta5L0umplFRqt118rlw/U1fmXzxOI9KooAeBzwGoKx0EC/AT4rZa1SHrayFiONffuJD9ZyK3nxidYc+/OqcejlFwIUq3eTz0yZnzxg6rXL+kQZSfwETN7FHgz8EHgJGBj9VeJBLvxgd1TnX9JftK58YHdDBx7TMMlF+LYldufzUSqDSTSK6ptBPtdYFXx55cUDobH3eM8FEZSJuh83NL18ZDHonTuYev9oxrS2n5JoWqrgPYAy4H3uvvvF/cCND/GFgnRTMmFNSsWzDh7N6qhwX62rV2uzl9Sp1oAuAR4AdhqZt8ws3dxdDewSEMG+7Oh15s5VGX02YMzUktRqaibpFVoAHD3EXf/EHAGsJVCWYg3mdlfmtn5bWqf9JgbLl5Itm/6fUS2z7jh4oWsXDLEzZcsmnaA+gfeNsT6zXuZv3YTy9ZtmXGA+8hYjjP/9HtTZRsaoaJuklZRJoF/DdwJ3FncBXwZhZVBD7W4bdKDah3qUl7vv9ZBLFd848dse+ZgU+3JZkxF3SS1atYC6iSqBZQuy9ZtCZzYnTWQ5dDhPPVt6QqW7TPWX6Zln9LbwmoBRSkFIZKIsNz8izF1/gD5I171zACRXtbImcAiTYm62/eE/izjEQ9taYYmgSWtFACkraIesA7QrjpsfWbMX7tJdf4ldZQCkraqdsB6pbBNY43K9lngXoFJd5yjwahypZFIr9IIQGJTLbVTftxikMrJ3jOueTCWNhngHN3pC0dXIPWZzWhP1NITIr1AAUBiUS21M/rswZrr9Et19+NY2lkSVt6h9Pv8kANhNCcgaaEAILGoltr5l0Ov1Hz9pHukE7qiKnX+6zfv5bN3PRGY3w+rH6SNYZIWicwBmNllZrbbzI6Y2Yy1qdJ9wu6anx+fSORYxdIIpHQqWFB+v5nSEyK9IKlJ4Ccp1Bp6OKHPl4hGxnIsW7cltBRDSbVCbkkcq5gxqznZHFR6QuWgJU0SSQG5+9MApvNWO1o9SzbXrFgw40Su0t30PaP7YsvrR9GfzYQeDlM5UikvPSGSNh2/DNTMVpvZqJmNHjhwIOnmpEo9SzbD7qYBHt93KJb2LHvziVXL0ZZ/7lATpaVF0qJlIwAz+wHBR0de4+7fifo+7n4LcAsUagHF1DyJoFpeP0hlIbewA9obtf1nL4ZO3JZq+pcLG5GISEHLAoC7n9eq95b2aHSVTJQD2hsx6V411VSuVtVREdEyUKkirLM994zZvOVPv8dEvlCSrc/gw+fM5aaVhZRPXAe0V8qY1dWxK78vUl0i5aDN7P3AV4HZwDjwhLuvqPU6lYNun/IUTqa4Y3ZosJ9zz5jNndv3hVbjHGrybN5qrlx6NMiISHQdVQ7a3b/t7qe6+3Hu/ptROn9pn1IKp9SRT7pPpVm27jlQtRRzqzr/44/NqPMXiZlSQDLNyFiOq+7eGVojJ6kyCYdfjT+lJJJ2CgAypXTnH7Zzt5Rzb9VdfjWdsHwz6jkGIt1CAUCm1Jq8HTg2wwuH2t/5d8LyzXo2xYl0i47fCCbtUy29Y8CvX53kSBvWDCx784kdV56hnk1xIt1CI4CUK09rBNXHL2nHWrGMGavOmVN1sjepNEy9m+JEuoECQIpVpjWSqNoJhdHFz9ddWPN5I2M51ty7k/xkoZ258QnW3LsTaH0aRqWjpRcpBZRirdqwVa+oneiND+ye6vxL8pPOjQ/sbkWzplHpaOlFGgGkWCekLww494zZkZ4bdkZw3GcHB1FpCelFCgApNjiQbUvnWY0DG3fkGD7tx
I7vTFVaQnqNUkAp9koHpH8g+mqawf5sXddFpDoFgJQaGctNFXPrBOXpqLBTyG64eCHZvuknAmT7jBsuXtjWtor0CqWAukxcyyA7bf16aSI4yoYr5eFF4qEA0EXi3I3a6nIORvDeATM4xox82Y6y8tU01TZclXLwvdbhq8SEJEUpoC4S127UkbFc1aMVG2EUcvGl3buhOwoc1l92duhO37RtuCqvvOocDeqltJdIK2kE0EXi6hzXb94b685eA66oqNW/bN2W0I1T1e7i07bhqtaIR6SVNALoIoMDIatgQq6HiTv948DWPQemXWt041Qnb7gKm5xuRtpGPNJZNALoImGVGuqp4HDOF77f0Gf3GZx8Qngp6MoOq9EJ206d6G1VNdC0jXiksygAdJFDE8GbtsYn8sxfu6lmZ/nuL/09//rSqw199hGHbWuXV03tVGp0wrYTJ3pblaqJesi9SCsoAHSRaoexlE8gwtG70vIVJs3k/TNWmDZOa4fVqlRNp454JB0UALpIUOdbqfyu9NqRXdyxfV8sE75Lf3sWy9Zt4fnxCQYHshx3TB+HJvKp6bBamarpxBGPpIMCQBepvFsM69ifH59gZCwXS+efMWPpb8/i8X2HpgLPi4fz9GczfPnyxanpuNI68pHeZp5QDfhGDA8P++joaNLN6Bhh+fhmzRrIcuFZJ7N1z4GqB8UMDfazbe3y2D+/U2nDlnQrM9vh7sOV1zUC6GJRUkL1Gip2bFEOiknbUkWlaqTXaB9AF1u5ZIibL1nEQDaev8ZSbf6oB8VoqaJId9MIoMvdM7qPw01U9Syv2VOqzR+l81f+W6T7KQB0sSu+8WO2PXOwqfeoTO5M5CfJhOT8M2YccVf+W6RHJBIAzGw9cBHwKvAM8J/dfTyJtnSL0gRkq6t4QiHn35/NzFjxUl60TUS6X1JzAN8H3uruZwH/BFydUDs6SlitmfKKkY047pjgv2YLKQlaqtAZVrFTRHpDIiMAd3+o7NftwKVJtKOThNWaGX32YMPr+UsregDW3LuT/OTRd8lmjMt/b86MnH8pt68VLyK9rxPmAD4K3BX2oJmtBlYDzJ07t11taruwWjO3b99X93ste/OJ3PGxfxv4GaU17OeeMZutew5My/kPKbcvkiot2whmZj8AfivgoWvc/TvF51wDDAOXeISG9PJGsPlrN8VSsiGs8y9XOdoA5fhFelnbN4K5+3k1GvQR4L3Au6J0/r2uWqG3KGYNZLn+ooWROvCw0cZVd+8EmitvLCLdI5FJYDO7APhj4GJ3P5xEGzpN0EEoUV25dC5j150fueMO28E76a7jCEVSJKlVQF8DXg9838yeMLO/SqgdHaO0q7ceBnzl8sXTjmKMotoO3kbOGBaR7pRIAHD333H3Oe6+uPjziSTa0WnqSb1k+6zhapy1Rhtpq/EjkladsApIyoTtwi0XZbVOtcqVpf9edffOwM9SjR+RdFAASEC1znnVOXNCl35euXRupHRPlPNrS/9VjXuR9FIAaLNanXOpg9/w6H4m3cmYseqcOXXl+aOeX6vjCEXSTQfCtFnYIS5xHq5SbU+BgTp6kZTRgTAJinIwe5wTr40cHi8i6aMDYVpoZCzH4hsf4jN3PUGuSucP8U68RtlToOWeIqIRQIsElVsIE/fEaz2Hx4tIeikAtMDIWC50iWW5Vubjy6t5hs07aLmnSLopAMSsdOcfZS1/XJO+tQQdHq/lniKiABCzKAeqt7vz1XJPEQmiABCzWnn1eqp2xkkHvIhIJQWAmIUtwcyY8cUPnq1OWEQ6hpaBxixoCWZ/NqPOX0Q6jkYAMVO+XUS6hQJACyjfLiLdQCkgEZGUUgAQEUkpBQARkZRSABARSSkFABGRlFIAEBFJKQUAEZGUUgAQEUkpBQARkZRSABARSSkFABGRlEqkFpCZ/RnwPuAI8AvgI+7+fCs+a2Qsp8JsIiIBkhoBrHf3s9x9MfBd4LpWfEjpeMZc8WD03PgEV9+3i
5GxXCs+TkSkqyQSANz9V2W/Hg9UP0C3QUHHM07kJ1m/eW8rPk5EpKskVg7azL4A/EfgEHBuleetBlYDzJ07t67PCDuesdaxjSIiadCyEYCZ/cDMngz4eR+Au1/j7nOAO4A/DHsfd7/F3YfdfXj27Nl1teGUwf66rouIpEnLAoC7n+fubw34+U7FU+8APtCKNoQdz7hmxYJWfJyISFdJahXQ6e7+z8Vf3wfsacXn6HhGEZFwSc0BrDOzBRSWgT4LfKJVH6TjGUVEgiUSANy9JSkfERGJTjuBRURSSgFARCSlFABERFJKAUBEJKXMvSVVGFrCzA5QWDXUiJOAX8bYnE6g79T5eu37gL5TN6j8Pqe5+4ydtF0VAJphZqPuPpx0O+Kk79T5eu37gL5TN4j6fZQCEhFJKQUAEZGUSlMAuCXpBrSAvlPn67XvA/pO3SDS90nNHICIiEyXphGAiIiUUQAQEUmpVAUAM/szM/uJmT1hZg+Z2SlJt6lZZrbezPYUv9e3zWww6TY1w8wuM7PdZnbEzLp6WZ6ZXWBme83sp2a2Nun2NMvMvmlmvzCzJ5NuSxzMbI6ZbTWzp4r/5j6ddJuaZWavM7N/NLOdxe90Y9Xnp2kOwMzeUDqP2Mw+BZzp7i0rRd0OZnY+sMXdXzOzPwdw988l3KyGmdlbKJQJ/zrwR+4+mnCTGmJmGeCfgHcDzwGPAavc/alEG9YEM3sH8DLwLXd/a9LtaZaZnQyc7O6Pm9nrgR3Ayi7/OzLgeHd/2cyywCPAp919e9DzUzUCaNdh9O3k7g+5+2vFX7cDpybZnma5+9PuvjfpdsTg7cBP3f1n7v4q8LcUDj/qWu7+MHAw6XbExd1fcPfHi39+CXga6OrDQ7zg5eKv2eJPaD+XqgAAhcPozWw/cAVwXdLtidlHge8l3QgBCh3J/rLfn6PLO5deZmbzgCXAowk3pWlmljGzJ4BfAN9399Dv1HMBIK7D6DtJre9UfM41wGsUvldHi/J9RNrFzH4D2Ah8piJL0JXcfdLdF1PIBrzdzELTdUkdCdky7n5exKfeATwIXN/C5sSi1ncys48A7wXe5V0wqVPH31E3ywFzyn4/tXhNOkgxT74RuMPd70u6PXFy93Ez2wpcAARO3PfcCKAaMzu97NeWHUbfTmZ2AfDHwMXufjjp9siUx4DTzWy+mR0LfAi4P+E2SZnihOltwNPu/qWk2xMHM5tdWgloZv0UFiGE9nNpWwW0EZh2GL27d/VdmZn9FDgO+H/FS9u7eWWTmb0f+CowGxgHnnD3FYk2qkFm9h7gK0AG+Ka7fyHZFjXHzDYA76RQavhfgevd/bZEG9UEM/t94EfALgp9AsCfuPuDybWqOWZ2FvA3FP7N9QF3u/vnQ5+fpgAgIiJHpSoFJCIiRykAiIiklAKAiEhKKQCIiKSUAoCISEopAEiqmNlksRrsk2Z2j5kNNPFef21mlxb/fKuZnVnlue80s3/XwGf8XzM7qdE2ilSjACBpM+Hui4vVLF8Fpu2ZMLOGdse7+3+tUUXynUDdAUCklRQAJM1+BPxO8e78R2Z2P/BUsZjWejN7rHjOwsehsHPUzL5WrPH/A+BNpTcys78vnV9QPAfg8WJN9h8WC419AvhscfTx74s7NjcWP+MxM1tWfO0brXBWxW4zuxWwNv8/kRTpuVpAIlEU7/T/APi74qV/A7zV3X9uZquBQ+7+e2Z2HLDNzB6iUC1yAXAm8JvAU8A3K953NvAN4B3F9zrR3Q+a2V8BL7v7/yw+707gy+7+iJnNBTYDb6FQm+oRd/+8mV0I/JeW/o+QVFMAkLTpL5bKhcII4DYKqZl/dPefF6+fD5xVyu8DJwCnA+8ANrj7JPC8mW0JeP+lwMOl93L3sPr55wFnFsrRAPCGYlXKdwCXFF+7ycxebOxritSmACBpM1EslTul2An/uvwS8El331zxvPfE2I4+YKm7vxLQFpG20ByAyEybgf9WLBWMmf2umR0PP
AxcXpwjOBk4N+C124F3mNn84mtPLF5/CXh92fMeAj5Z+sXMFhf/+DDw4eK1PwBmxfWlRCopAIjMdCuF/P7jVjgA/esURsvfBv65+Ni3gB9XvtDdDwCrgfvMbCdwV/GhB4D3lyaBgU8Bw8VJ5qc4uhrpRgoBZDeFVNC+Fn1HEVUDFRFJK40ARERSSgFARCSlFABERFJKAUBEJKUUAEREUkoBQEQkpRQARERS6v8DFNHsdHkJUooAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "from openbabel import pybel\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.linear_model import LassoCV\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "yvals = []\n", "fps = []\n", "for mol in pybel.readfile('smi','er.smi'):\n", " yvals.append(float(mol.title))\n", " fpbits = mol.calcfp().bits\n", " fp = np.zeros(1024)\n", " fp[fpbits] = 1\n", " fps.append(fp)\n", " \n", "fps = np.array(fps)\n", "yvals = np.array(yvals)\n", "lin = LinearRegression()\n", "lin.fit(fps,yvals)\n", "pred = lin.predict(fps)\n", "print(np.corrcoef(pred,yvals))\n", "plt.plot(pred,yvals,'o')\n", "plt.xlabel(\"Predicted\")\n", "plt.ylabel(\"Actual\")\n", "plt.show()\n", "\n" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }