Abhishek Tiwari:Chemical Informatics Toolkits
Chemical Informatics or Cheminformatics Toolkits
Ligand Interaction Diagrams From MOE 2006 http://www.chemcomp.com/journal/ligintdia.htm
"Any idiot can stand up and say that virtual screening doesn't work. It takes real brains to show how to improve it!" - Mark McGann
A large number of toolkits (Daylight Toolkit, Chemaxon Toolkit, OpenEye Toolkit, MOE SVL, Accord, CDK, JOELib, etc.) are currently available from different vendors and organizations. Most of them are equally good, but the best choice depends on the user's perspective. I will try to give a summarized overview of some commonly used Chemical Informatics Toolkits. For academic users, Chemaxon JChem (which is free for academic users) and open source toolkits like CDK and JOELib will be a better choice. If your budget permits, you can use Daylight, Accord, OpenEye, MOE SVL or any other toolkit depending on your needs, but Chemaxon and MOE are low-budget, high-quality options.
Common Chemical Informatics Toolkits
(in alphabetical order)
Accelrys Software (Free Evaluation License) http://www.accelrys.com
Accelrys provides its cheminformatics solutions through the Accord product family. The product range is long enough to confuse newcomers, and it is mostly useful for enterprise-level development. Embedded within every Accord cheminformatics product is the Accord Chemistry Engine, the technology that allows software applications to understand chemistry. The main programmable development components of Accord are:
- Accord SDK
- Accord Chemistry Java Object
- Accord Enterprise Webkit™
Accord SDK enables power users and developers to rapidly develop advanced chemical applications, such as Combinatorial Chemistry, Property Prediction and CD-ROM database publishing.
Accord Enterprise Webkit™ is a development toolkit for creating focused web-based applications that help customers perform specific tasks within Accord Enterprise Informatics (AEI).
The Accord Chemistry Java Object enables you to build effective chemistry-enabled solutions and applications in Java rapidly and efficiently.
AccuSoft (Free Evaluation License) http://www.accusoft.com/
VisiQuest is an innovative tool from AccuSoft. With the visual programming environment of VisiQuest there is no need to write code. Simply drag and drop functions, called glyphs, onto the workspace to create your programs, choosing from over 500 built-in glyphs. Some major features of VisiQuest are:
- Innovative Visual Programming Environment
- Powerful Image and Data Analysis
- Advanced Visualization
- Built-in Extensibility Tools
- Collaborate Across the Enterprise
- Easy MATLAB M-file Integration
Advanced Chemistry Development (Free Evaluation License) http://www.acdlabs.com
ACD/ChemCoder SDK provides program interfaces as standard C, Microsoft COM (Component Object Model), and ActiveX controls. It uses OPOS standard drivers to support a variety of barcode scanners.
Chemaxon (Free for Academic Users) http://www.chemaxon.com/
JChem Base and JChem Cartridge are the main tools from Chemaxon. JChem Base is a Java tool for developing applications that search mixed structural and non-structural data. JChem Base integrates with a variety of database systems (Oracle, MS SQL Server, DB2, Access, etc.) with web interfaces, and offers a fast substructure, similarity, exact and superstructure search engine using 2D hashed fingerprints. Structures are stored in database tables. Structural and non-structural data can be combined. SDF, SMILES and other formats can be imported and exported. JChem Base also supports ChemAxon's Chemical Terms language to enable complex chemical queries, expressions and rules. The system includes Marvin, a Java-based chemical editor and viewer. Using the JChem Cartridge for Oracle, the user can access many JChem functions, such as structure storage and search, property prediction, structure canonicalization and Chemical Terms, directly from Oracle's SQL.
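The fingerprint-based similarity search mentioned above can be illustrated with a small Python sketch. It is not JChem code and does not use the JChem API. As a loud assumption, the fingerprint below hashes overlapping character fragments of a SMILES string, a crude stand-in for the bond paths that a real 2D hashed fingerprint enumerates on the molecular graph. The names `hashed_fingerprint`, `tanimoto`, `similarity_search`, `n_bits` and `max_len` are illustrative only; the Tanimoto coefficient itself is the standard measure.

```python
import zlib


def hashed_fingerprint(smiles, n_bits=512, max_len=4):
    """Return the set of bit positions switched on for a SMILES string.

    Assumption: character fragments stand in for the bond paths that a
    real chemical fingerprint would enumerate on the molecular graph.
    """
    bits = set()
    for length in range(1, max_len + 1):
        for start in range(len(smiles) - length + 1):
            fragment = smiles[start:start + length]
            # Hash each fragment and fold it into a fixed-width bit space.
            bits.add(zlib.crc32(fragment.encode("utf-8")) % n_bits)
    return frozenset(bits)


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0


def similarity_search(query, library, threshold=0.5):
    """Return (smiles, score) pairs at or above the threshold, best first."""
    query_fp = hashed_fingerprint(query)
    hits = [(s, tanimoto(query_fp, hashed_fingerprint(s))) for s in library]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])
```

Calling `similarity_search("CCO", ["CCO", "CCN", "c1ccccc1"])` ranks the library entries by their Tanimoto score against the query. In a database setting the fingerprints would be computed once and stored next to the structures, so that only the cheap bit comparison runs at query time.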
Chemical Computing Group (Free Evaluation License) http://www.chemcomp.com
Scientific Vector Language (SVL) is the built-in command language, scripting language and application development language of MOE (Molecular Operating Environment). SVL is a "chemistry aware" computer programming language with over 1,000 specific functions for analyzing and manipulating chemical structures and related molecular objects. SVL is a concise, high-level language, and SVL programs are typically 10 times smaller than equivalent programs written in C or Fortran. SVL source code is compiled to a "byte code" representation, which is then interpreted by the base run-time environment, making SVL programs inherently portable across different computer hardware and operating systems.
The Chemistry Development Kit (Free & Open Source) http://cdk.sf.net/
The Chemistry Development Kit (CDK) is a Java library for structural chemo- and bioinformatics. It is now developed by more than 40 developers all over the world and used in more than 10 different academic and industrial projects worldwide. In the past few years, the CDK library has evolved into a full-blown chemoinformatics package, with code ranging from QSAR descriptor calculations to 2D and 3D model building. Programs like the 2D structure editor JChemPaint and NMRShiftDB, a database of organic molecules and their NMR spectra, are based on the CDK.
Unfortunately, this open source toolkit still has many problems and needs improvement, and the existing modules are not very reliable. For developers, this is a great opportunity to join hands with the CDK team and contribute new modules. For novice developers and academic users, it is a good option for exploring toolkit programming.
Daylight Chemical Information Systems (Free Evaluation License) http://www.daylight.com/
Daylight was the first company to introduce concepts such as SMILES™ and SMARTS®, and it can be considered the parent of chemical informatics systems. Daylight provides cheminformatics toolkits, applications, and database systems that are infinitely customizable to accommodate the needs of a wide variety of specialized scientific applications. Daylight toolkits provide an object-oriented programming library used to deliver cheminformatics capabilities to in-house applications. Daylight currently provides two toolkits:
- Daylight Toolkit Package
- THOR-Merlin Toolkit
The Daylight Toolkit is a programming library that provides all functions needed for chemical information processing and substructure pattern searching, along with fingerprinting and similarity capabilities. The Daylight Toolkit combines the following individual tools into a single package: SMILES™, SMARTS®, Reaction, Depict, Fingerprint and Program Object. The Daylight Toolkit is written to be system- and language-independent, and makes full use of dynamic memory allocation. Wrappers are provided for the C, C++, and Fortran compilers supplied with most Unix platforms. Wrappers for other languages are available.
The THOR-Merlin Toolkit is a C-language interface for chemical database processing and searching. It provides a comprehensive programmatic interface to THOR and Merlin Servers.
Digital Chemistry/Barnard Chemical Information (BCI) http://www.bci.gb.com
Digital Chemistry offers a rich set of components which allow users to develop custom applications exploiting a wide range of chemical informatics techniques. BCI toolkits also support third-party toolkits and standards, which makes them unique. The toolkits available from Digital Chemistry are:
- Markush Toolkit Components
- Clustering Toolkit Component
- Diversity Toolkit Component
- Fingerprint and Dictionary Generation Toolkit Component
- MOLSMART Toolkit Component
Individually, each Toolkit Component allows developers to quickly create applications in specific areas of chemical informatics such as clustering or fingerprint generation. However, the Components are also highly integrated with the output of each being freely interchangeable with others. For example, the Fingerprint Component can be used to generate structure fingerprints for sets of molecules, or the Markush Components to calculate physicochemical properties or fingerprints for the library members. These data may then be passed to the Clustering or Diversity Components for further analysis.
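The hand-off described above, where fingerprints produced by one component are passed to a clustering step, can be sketched in Python. This is not the BCI API: the function names, the example fingerprints and the single-pass leader algorithm are all assumptions chosen for illustration, and the clustering methods actually offered by the Clustering Toolkit Component are not specified here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two sets of 'on' bit positions."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0


def leader_cluster(fingerprints, threshold=0.6):
    """Group molecules with a single-pass leader algorithm.

    fingerprints maps a molecule name to its set of 'on' bit positions.
    Each molecule joins the first cluster whose leader it resembles at
    or above the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # list of (leader fingerprint, member names)
    for name, fp in fingerprints.items():
        for leader_fp, members in clusters:
            if tanimoto(fp, leader_fp) >= threshold:
                members.append(name)
                break
        else:
            # No existing leader was similar enough.
            clusters.append((fp, [name]))
    return [members for _, members in clusters]


# Hypothetical fingerprints, as might come from a fingerprint generator.
example = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 5},
    "mol_c": {10, 11, 12},
    "mol_d": {10, 11, 13},
}
print(leader_cluster(example, threshold=0.5))
# prints [['mol_a', 'mol_b'], ['mol_c', 'mol_d']]
```

The only contract between the two steps is the data format (here a plain mapping from names to bit sets), which is what makes the output of one step freely usable as the input of another.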
A number of language wrappers are available, allowing the development of applications which may subsequently be deployed in any single- or multi-tiered architecture. Each language wrapper follows the same implementation pattern, making the use of different languages as easy as possible. Language support is currently available for C/C++, Java, Visual Basic (6 and .NET), Perl and Python.
JOELib2 (Free & Open Source) http://www-ra.informatik.uni-tuebingen.de/software/joelib/
JOELib2 is the redesigned Java successor of the OELib library. The C++ successor is OpenBabel (see below), so JOELib and OpenBabel use the same chemical expert system (with some marginal differences). The commercial successor of OELib is OEChem.
JOELib2 is a cheminformatics algorithm library designed for prototyping, data mining, graph mining and, of course, algorithm development. It does not really have a Graphical User Interface (GUI) and should not be used for high-throughput tasks, although primitive 2D rendering and Java3D rendering are part of the library. Several programming interfaces are provided.
JOELib2 is Open Source and competitive.
Open Babel (Free & Open Source) http://openbabel.sf.net
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It's an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.
At its heart, Open Babel is a C++ library. It has bindings in Python, Perl, Ruby and Java, and can be used on Windows, MacOSX and Linux (in fact, it is included in several major Linux distributions). It also provides a command-line program babel for file conversion and compound filtering.
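The file conversion mentioned above can also be driven from a script. The sketch below builds a command line of the form `babel -i<format> <infile> -o<format> <outfile>` and runs it only if the program is found on the PATH. The helper names and file names are hypothetical. Newer Open Babel releases ship the program as `obabel` with a slightly different syntax, so treat the exact invocation as an assumption to check against the installed version.

```python
import shutil
import subprocess


def babel_command(infile, in_format, outfile, out_format, program="babel"):
    """Build the argument list for a file conversion with babel."""
    return [program, f"-i{in_format}", infile, f"-o{out_format}", outfile]


def convert(infile, in_format, outfile, out_format, program="babel"):
    """Run the conversion; return False if the program is not installed."""
    cmd = babel_command(infile, in_format, outfile, out_format, program)
    if shutil.which(program) is None:
        return False
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(cmd, check=True)
    return True


# Hypothetical file names: convert a SMILES file to an SD file.
print(babel_command("mols.smi", "smi", "mols.sdf", "sdf"))
```

Keeping the command construction separate from its execution makes it easy to log or inspect the exact command before anything is run.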
OpenEye Scientific Software (Free Evaluation License) http://www.eyesopen.com/
OpenEye provides many programming libraries, such as Case, Lexichem and OEChem, but only OEChem is capable of cheminformatics and 3D molecular data handling. OEChem is a programming library for Chemistry and Chemical Informatics that is fast and has a stable, documented API. OEChem has many simple yet powerful functions, which handle the details of working with molecules. For routine tasks, OEChem offers clear and efficient scripting in Python. For more advanced software development, OEChem offers C++. High-level functions provide simplicity while low-level functions provide flexibility. OEChem now also supports free licenses for non-commercial products and projects.