Assemble 2.0 Tutorial
Brief Description | First Example | Fundamental difference between non-overlapping fragments and substructure constraints | Atom Tags | Assemble as a tool in structure elucidation | Ranking | Postprocessing
Fundamental difference between non-overlapping fragments and substructure constraints.
When using Assemble, it is most important to recognize the fundamental difference between non-overlapping fragments and substructure constraints. Because the molecular formula must be given, the number and kind of atoms in the molecule are known. What is unknown is the pattern of bonds between them. Assemble generates all possible constitutions by tentatively forming bonds in a systematic way. The algorithm follows the strategy of a depth-first tree search. Bonds are formed one after the other until a constraint is violated. When this happens, the last bond is deleted and a different bond is formed. Once none of the atoms has any free valence left, a complete molecule has been formed.
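The search strategy described above can be illustrated with a minimal Python sketch. It is not Assemble's actual implementation: the function name `assemble`, the `VALENCE` table, and the pair-by-pair ordering of bond decisions are assumptions made purely for illustration. The sketch also makes no attempt to remove duplicates that differ only in atom numbering.

```python
from itertools import combinations

# Assumed standard valences (no charges, no hypervalent atoms).
VALENCE = {"C": 4, "N": 3, "O": 2, "H": 1}


def assemble(atoms, max_order=3):
    """Depth-first search over bond orders for every atom pair.

    Returns a list of bond dictionaries {(i, j): order} in which every
    atom is saturated and the molecule is connected.
    """
    n = len(atoms)
    free = [VALENCE[a] for a in atoms]        # free valence per atom
    pairs = list(combinations(range(n), 2))   # fixed order of decisions
    bonds = {}
    results = []

    def connected():
        # Simple graph traversal starting from atom 0.
        seen, stack = {0}, [0]
        while stack:
            i = stack.pop()
            for a, b in bonds:
                if i in (a, b):
                    j = b if a == i else a
                    if j not in seen:
                        seen.add(j)
                        stack.append(j)
        return len(seen) == n

    def search(k):
        if k == len(pairs):
            # Leaf: complete only if no free valence is left anywhere.
            if not any(free) and connected():
                results.append(dict(bonds))
            return
        i, j = pairs[k]
        # Constraint check: atoms below i can no longer receive bonds,
        # so any free valence left on them makes this branch a dead end.
        if any(free[:i]):
            return
        for order in range(min(free[i], free[j], max_order) + 1):
            if order:                          # tentatively form the bond
                bonds[(i, j)] = order
                free[i] -= order
                free[j] -= order
            search(k + 1)
            if order:                          # backtrack: delete the bond
                del bonds[(i, j)]
                free[i] += order
                free[j] += order

    search(0)
    return results


solutions = assemble(["C", "O", "H", "H", "H", "H"])  # formula CH4O
print(len(solutions))
```

For CH4O the sketch finds four labelled solutions. All of them are methanol and differ only in which hydrogen atom is attached to the oxygen, which shows why a real generator needs a duplicate check on top of the bare tree search.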
If there is knowledge about bonds in the molecule, these bonds can be formed definitively at the outset, thereby reducing the size of the search tree. Not all substructural information can be handled like this. To form bonds initially, it must be known that the structural fragments in question are non-overlapping. Sometimes only a single fragment can be handled like this. Potentially overlapping substructures are mere constraints. Their presence, or absence, in complete candidate structures must be guaranteed by a substructure search process. Initial bond formation is highly effective because it reduces the size of the search tree, whereas the retrospective substructure search is time-consuming. In addition, most of the bonds must be formed before a substructure constraint can be violated. This happens at the periphery of the search tree, so almost the entire tree has to be searched. This behavior is an intrinsic shortcoming of the assembly process. It is therefore most important to choose carefully which substructural information is used as non-overlapping fragments. The following example illustrates the difference in performance when a large substructure is used as a fragment rather than as a substructure constraint.
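The effect can be made visible with a toy Python sketch that counts the nodes visited by a depth-first bond search in both modes. Everything here is an assumption for illustration and not part of Assemble: the function `search`, its parameters `preformed` and `required`, and above all the simplification that a "substructure" is a bond between two specific numbered atoms. A real substructure constraint requires a subgraph search over all atoms, which is more expensive still. Connectivity is not checked, for brevity.

```python
from itertools import combinations

VALENCE = {"C": 4, "O": 2, "H": 1}  # assumed standard valences


def search(atoms, preformed=None, required=None):
    """Return (complete structures found, tree nodes visited).

    preformed: bonds fixed before the search starts (fragment mode).
    required:  bonds checked only on complete structures (constraint mode).
    Both map an atom-index pair (i, j) with i < j to a bond order.
    """
    preformed, required = preformed or {}, required or {}
    free = [VALENCE[a] for a in atoms]
    bonds = dict(preformed)
    for (i, j), order in preformed.items():
        free[i] -= order
        free[j] -= order
    # Pairs that are already bonded are no longer part of the search tree.
    pairs = [p for p in combinations(range(len(atoms)), 2)
             if p not in preformed]
    counts = {"structures": 0, "nodes": 0}

    def visit(k):
        counts["nodes"] += 1
        if k == len(pairs):
            saturated = not any(free)
            matches = all(bonds.get(p) == o for p, o in required.items())
            counts["structures"] += saturated and matches
            return
        i, j = pairs[k]
        if any(free[:i]):          # an earlier atom can never be saturated
            return
        for order in range(min(free[i], free[j], 3) + 1):
            if order:
                bonds[(i, j)] = order
                free[i] -= order
                free[j] -= order
            visit(k + 1)
            if order:
                del bonds[(i, j)]
                free[i] += order
                free[j] += order

    visit(0)
    return counts["structures"], counts["nodes"]


atoms = ["C", "O", "H", "H", "H", "H"]
c_o_single = {(0, 1): 1}
as_fragment = search(atoms, preformed=c_o_single)
as_constraint = search(atoms, required=c_o_single)
print("fragment:  ", as_fragment)
print("constraint:", as_constraint)
```

Both runs report the same number of structures, but the fragment run visits fewer nodes because the C-O decision is removed from the tree, while the constraint run only rejects unwanted candidates after they have been completed.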
Enter the molecular formula C8H10O2. This formula corresponds to 4 double bond equivalents. Assume that the presence of a benzene ring is known. Open the JUME editor and select the "Templates" menu item:

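As a quick check of the value quoted above, the double bond equivalents can be computed from the molecular formula with the usual formula (2C + 2 + N - H - X) / 2, in which oxygen does not appear. The helper name `double_bond_equivalents` is hypothetical and the sketch assumes standard valences only.

```python
def double_bond_equivalents(c, h, n=0, x=0):
    """Rings plus multiple bonds implied by a molecular formula.

    c, h, n, x: counts of carbon, hydrogen, nitrogen and halogen atoms.
    Divalent atoms such as oxygen do not change the result.
    """
    return (2 * c + 2 + n - h - x) / 2


# C8H10O2: the two oxygen atoms are ignored.
print(double_bond_equivalents(c=8, h=10))  # 4.0
```

A benzene ring alone accounts for all 4 double bond equivalents (one ring and three double bonds), so the remaining atoms must be attached without further rings or multiple bonds.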
Some large, frequently used substructures are predefined. The second template is a benzene ring with all valences free. Move the cursor into the field and press the right mouse button. Select the menu item "Copy This Structure To JUME".

Include the substructure in the Assemble input. By default, it is treated as a non-overlapping fragment. Run the generator; it produces 91 structures within a few seconds. Now repeat the calculation, but first declare the fragment as a substructure constraint by ticking the box. You may not have the patience to wait for the assembly process to terminate, as it may take more than an hour, depending on your hardware.
When the ring is used as a fragment, 6 bonds between carbon atoms can be formed once and for all. The search process starts with these bonds already in place. The task takes only a small amount of time, as the remainder of the search tree is so small. In addition, no repeated substructure search is needed. The tree size increases roughly exponentially with the number of levels. Without the fragment there are 6 more levels. Moreover, whenever a bond is formed, a substructure search is performed to check the validity of the tree node. This slows down the assembly process tremendously.
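To get a feeling for what 6 additional levels mean, the following Python sketch compares the sizes of two idealized search trees. The branching factor and the depth are hypothetical numbers chosen for illustration, not values measured in Assemble, and a real search tree is pruned and far less regular. Only the 6 extra levels are taken from the text.

```python
def tree_nodes(branching, depth):
    """Number of nodes in a complete tree with the given branching factor."""
    return sum(branching ** level for level in range(depth + 1))


BRANCHING = 3             # hypothetical average number of choices per bond
DEPTH_WITH_FRAGMENT = 10  # hypothetical depth once the ring bonds are formed
EXTRA_LEVELS = 6          # the six ring bonds that must now be searched

small = tree_nodes(BRANCHING, DEPTH_WITH_FRAGMENT)
large = tree_nodes(BRANCHING, DEPTH_WITH_FRAGMENT + EXTRA_LEVELS)
print(f"tree grows by a factor of about {large / small:.0f}")
```

With these assumed numbers the tree grows by a factor of roughly 3**6 = 729, before the cost of the substructure search at every node is even taken into account.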