Identifying Biomolecular Targets of the Anticancer Vitamin-E-δ-Tocotrienol Using a Computational Approach: Virtual Target Screening

therapeutic use poses a series of new challenges, with arguably the most important being the elucidation of the precise mechanism of action responsible for the anticancer activity of δ-tocotrienol. As an initial step to address this question, we have used a computational tool, Virtual Target Screening (a molecular docking-based tool that identifies potential binding partners for small molecules), to identify potential biomolecular targets of δ-tocotrienol. Then, to gain a consensus as to the type of biomolecular entity that could be a target for δ-tocotrienol, we utilized PharmMapper and PASS (a ligand-based chemoinformatic approach), and ProBiS (a tool that analyses binding site similarities across known proteins). The results of our multipronged computational consensus-seeking approach showed that such a strategy can identify potential cellular targets of small molecules. This is evidenced by our identification of estrogen receptor-beta, a protein that has been previously shown to bind δ-tocotrienol, which elicited a cellular response. This study supports the use of such a computational approach as an initial step in target identification to avoid time-consuming, costly large-scale experimental screening, greatly reducing the experimental work to just one or a few candidate proteins.


Introduction
Vitamins, a subclass of natural products, have long been thought to play an important role in the physiology of living organisms. Many essential processes in the human body rely on the availability of vitamins. While there is little dispute over the importance of vitamins, the precise mode of action that translates into health benefits often remains elusive. This is not surprising, as understanding the underpinnings of a biological effect of a small molecule requires knowledge of its biomolecular targets. Zeroing in on the precise mode of action of many vitamins is made more complicated by the fact that they can come in various forms and induce a range of physiological responses. Vitamin E is a good example of such complexity. Vitamin E is an essential lipid-soluble vitamin and an important micronutrient that has long been thought to have strong antioxidant effects without causing major toxicity in humans [1-3]. Its primary activity has been attributed to its ability to reduce free radicals to prevent lipid peroxidation of polyunsaturated fatty acids [4].
Vitamin E consists of 8 naturally occurring isomers: d-alpha-, d-beta-, d-gamma-, and d-delta-tocopherols and d-alpha-, d-beta-, d-gamma-, and d-delta-tocotrienols [5]. Among the constituents of vitamin E, tocopherols initially received a great deal of attention from the scientific community due to their potent antioxidant properties and their abundance in common food sources. However, the focus has now shifted toward tocotrienols, which, although rare in nature, have exhibited a number of physiological effects that include neuroprotective, antioxidant, anticancer, cholesterol-lowering, and other therapeutic activities [6][7][8][9][10]. The diversity and, at the same time, specificity of physiological responses to tocotrienols may mean that their modes of action differ from the broad antioxidant activity of tocopherols. A quick look at the structures of tocopherols and tocotrienols reveals three double bonds in the farnesyl isoprenoid tail of the tocotrienols, making them the less flexible of the two classes of compounds (Figure 1). This additional restraint on the conformational freedom of tocotrienols may be the reason for their more specific range of activities. The significantly reduced number of allowable conformations of tocotrienols compared with tocopherols may result in an increased affinity of tocotrienols toward a specific group of binding partners. If true, the activity of tocotrienols may go beyond that of a broad antioxidant and into the signaling realm. Studies of vitamin E delta-tocotrienol (VEDT) have already proposed that at least VEDT's anticancer activity may be attributed not just to its antioxidant properties, but also to its involvement in the signaling pathways of cancer [11][12][13]. Furthermore, VEDT was proposed to act as a mediating substance in antiproliferative and apoptotic mechanisms in carcinogenic tumor cells [5,6,13-17].
In this study, we attempt to gain insight into the possible modes of action of VEDT by using molecular modeling, cheminformatics and bioinformatics, with the goal of proposing likely binding partners of VEDT.
As an initial step directed at pinpointing molecular target(s) of VEDT, we used in silico approaches to interrogate protein space for potential targets. With the growing prominence of in silico target discovery as a viable tool in research campaigns, a number of methodologies have been developed to take advantage of computational efficiency and storage capabilities of computer systems to search for potential protein targets, while greatly reducing the scale and scope of experimental work required for such exploration [18][19][20][21][22][23][24].
Here, we used a number of different approaches to computationally probe for potential target proteins of VEDT. These approaches vary significantly yet aim to solve the same problem. This was done in part to decrease the effect of errors resulting from the use of any one individual tool. Consensus reached across different approaches has also previously produced reliable results [25][26][27][28]. One method, called Virtual Target Screening (VTS), is based on the explicit modeling of intermolecular interactions between VEDT and its potential targets via a docking protocol designed specifically to search for targets of small drug-like or natural product-like compounds [29,30]. The other methods, PharmMapper and PASS (Prediction of Activity Spectra of Substances), take chemoinformatic approaches to proposing potential targets of interest [31][32][33][34][35]. Here, the structure of VEDT and its pharmacophore were compared to a database of structures and pharmacophores of small molecules with known biomolecular targets. Target proteins were then proposed based on the structural similarities between database compounds and VEDT. Finally, we used the binding site analysis tool Protein Binding Sites Detection (ProBiS) to mine the Protein Data Bank (PDB) for proteins containing binding pockets similar to those that are likely to accommodate VEDT [36][37][38][39][40][41].

I VEDT Preparation
Three-dimensional coordinates of VEDT and δ-tocopherol were downloaded from PubChem (Link 1) in a structure data file format and were prepared using the LigPrep (LigPrep, Schrödinger, New York, NY) module of the suite of molecular modeling software (Suite 2012: Maestro, Schrödinger, New York, NY) with default settings corresponding to physiological pH.

II Virtual Target Screening
VTS is a Web-based software application deployed on a dedicated computer cluster in the Chemistry Department at the University of South Florida [29,30]. VTS works by comparing the docking performance of the molecule of interest (MOI) with a given protein against the performance of a calibration set of small molecules docked into the same protein. The National Cancer Institute Diversity Set (Set 1 provided by the Developmental Therapeutics Program, Cancer Treatment and Diagnosis, National Cancer Institute, Rockville, MD; collections are available at Link 2), which consists of 1990 drug-like compound structures, served as a calibration set and was docked into each of the 1451 protein structures currently in the VTS library. Docking and scoring were performed using Glide (Glide, Schrödinger, New York, NY). The resulting docking score, called the GScore, is based on Schrödinger's proprietary calculation of ligand/protein energetics. GScores are typically negative values since mutual accommodation of the ligand and protein reflects less disruptive energy in the complex versus the state when the entities are not combined. MOIs (VEDT and δ-tocopherol as two separate VTS runs) were then docked into each of the proteins in the VTS library.
In addition to human protein structures, which comprise the majority of the VTS library (~1000), structures from other organisms ranging from bacteria to mammals were also included in the VTS library to allow future utility in other project types. Once docked, the docking performance of each MOI, as measured by its GScore, was compared to the average GScores of the calibration set compounds for each protein.
A protein was considered a hit if the MOI's docking score was better than the average docking score of the top 200 calibration compounds docked into the protein and was of particular significance if the MOI's docking score was better than the top 20 average.
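The hit criterion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the VTS implementation: the function name `classify_vts_hit`, the label strings, and the toy calibration scores are hypothetical, and "better" is taken to mean a more negative GScore, as is conventional for Glide.

```python
from statistics import mean


def classify_vts_hit(moi_score, calibration_scores, top_n=200, top_k=20):
    """Classify one protein as a hit for the molecule of interest (MOI).

    Assumption: a more negative GScore means a better docking result.
    """
    # Sort ascending so that the best (most negative) scores come first.
    ranked = sorted(calibration_scores)
    top_n_avg = mean(ranked[:top_n])  # average of the top 200 by default
    top_k_avg = mean(ranked[:top_k])  # average of the top 20 by default
    if moi_score < top_k_avg:
        return "significant hit"
    if moi_score < top_n_avg:
        return "hit"
    return "no hit"


# Toy calibration set of 1990 made-up scores (illustrative values only).
calibration = [-10.0 + 0.005 * i for i in range(1990)]
print(classify_vts_hit(-9.7, calibration))
```

Because the top 20 average is never worse than the top 200 average, any protein labeled a significant hit in this sketch also satisfies the basic hit criterion.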

III PharmMapper
PharmMapper uses pharmacophore mapping to identify potential targets for the MOI. Six pharmacophore features (hydrophobic center, positively charged center, negatively charged center, hydrogen bond acceptor vector, hydrogen bond donor vector, and aromatic plane) of an ensemble of MOI conformations are mapped onto a library of pharmacophores extracted from publicly available crystal structures of protein-ligand complexes. A fit score between the MOI's pharmacophores and those in the library is then calculated, and the desired number of best-fitting targets, N, is suggested. The structure data file of VEDT downloaded from PubChem was converted to a Tripos mol2 file using OpenBabel software [42]. The resulting mol2 file was submitted to the PharmMapper server (Link 3).

IV Prediction of Activity Spectra of Substances
Prediction of Activity Spectra of Substances (PASS) predicts the biological activity of the MOI based on the "multilevel neighborhoods of atoms" structural descriptors of compounds and a training set of structure-activity relationship (SAR) data for over 60,000 chemical substances (SAR Base) [43]. For each biological activity, based on the similarity of the multilevel neighborhoods of atoms descriptors of the MOI and the substances in the SAR Base, PASS outputs two probabilities: the probability that the MOI exhibits the activity and the probability that the MOI does not exhibit the activity. To execute the PASS prediction, a SMILES chemical identifier representing VEDT [CC1=C2C(=CC(=C1)O)CCC(O2)(C)CCC=C(C)CCC=C(C)CCC=C(C)C] was extracted from the PubChem entry for VEDT and submitted to (Link 4).

V Protein Binding Sites Detection
Protein Binding Sites Detection (ProBiS) identifies proteins in the PDB that are structurally similar to the user-supplied protein of interest. The algorithm behind the ProBiS server represents entire proteins as graphs in which vertices correspond to functional groups of surface amino acids and edges represent distances between these functional groups. Pairwise alignment of structural features of the protein of interest and the proteins in the PDB is then performed, with the displayed results sorted by statistical Z-score so that the proteins most similar to the protein of interest appear at the top. The PDB code for estrogen receptor β (ERβ), 1NDE, was used as the input for the ProBiS search [44]. The choice of this PDB code was based on the results of the VTS screen.
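To make the graph representation concrete, the following Python sketch builds a toy graph in which vertices are labeled points standing in for surface functional groups and edges store the distances between them. It is only an illustration of the data structure: the function name `build_surface_graph`, the distance cutoff `max_distance`, and the coordinates are assumptions, and none of this reproduces the ProBiS alignment or Z-score calculation.

```python
from itertools import combinations
from math import dist


def build_surface_graph(groups, max_distance=15.0):
    """Return edges {(label_a, label_b): distance} between functional groups.

    groups: dict mapping a vertex label to its (x, y, z) coordinates.
    max_distance: hypothetical cutoff; pairs farther apart get no edge.
    """
    edges = {}
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(groups.items(), 2):
        d = dist(xyz_a, xyz_b)  # Euclidean distance between the two groups
        if d <= max_distance:
            edges[(label_a, label_b)] = d
    return edges


# Three made-up functional groups; the third lies far from the other two.
toy_groups = {
    "donor_1": (0.0, 0.0, 0.0),
    "acceptor_1": (3.0, 4.0, 0.0),
    "aromatic_1": (100.0, 0.0, 0.0),
}
print(build_surface_graph(toy_groups))
```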

Results
The VTS screens for both VEDT and δ-tocopherol resulted in a number of proteins that were hits for VEDT but not for δ-tocopherol (Table 1). According to VTS, ERβ (PDB code 1NDE) was a top hit for VEDT, although not for δ-tocopherol (Figure 2), which is consistent with experimental findings that support binding of VEDT to ERβ [8,12]. Thirty-three other proteins were also predicted as potential hits for VEDT but not for δ-tocopherol. Moreover, some patterns, or at least consistencies, could be observed in the VTS hit list. In particular, a number of hormone and nuclear receptors appeared in the proposed target list. This is not surprising, as a ProBiS search for binding sites similar to that of ERβ revealed that the binding sites of a number of proteins, such as Rxr-like protein, retinoic acid receptor Rxr-α, progesterone receptor, and others, are indeed quite similar (Table 2).
PharmMapper prediction resulted in 300 proposed pharmacophores, with those best fitting the flexible alignment with VEDT ranked on top. These results were examined for consistency with the best VEDT hits from the VTS screen. Based on the pharmacophore derived from the PDB structure 1NDE, ERβ, which was the best hit in the VTS screen, was ranked 51.
PASS prediction resulted in 500 proposed activities, which VEDT statistically is more likely to exhibit than not. The results were ranked by the probability that VEDT exhibits a given activity. Similarly, these results were examined for consistency with the VTS screen and PharmMapper prediction.
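The filtering and ranking applied to the PASS output can be sketched as follows. This is a minimal illustration, assuming the output has already been parsed into (name, Pa, Pi) records: the `Activity` type, the function name `likely_activities`, and the activity names and probabilities shown are hypothetical placeholders, not values from this study.

```python
from typing import NamedTuple


class Activity(NamedTuple):
    name: str
    pa: float  # probability that the compound exhibits the activity
    pi: float  # probability that the compound does not exhibit the activity


def likely_activities(predictions):
    """Keep activities with Pa > Pi and rank them by decreasing Pa."""
    kept = [a for a in predictions if a.pa > a.pi]
    return sorted(kept, key=lambda a: a.pa, reverse=True)


# Placeholder records (made-up names and numbers for illustration only).
toy_predictions = [
    Activity("activity A", 0.40, 0.10),
    Activity("activity B", 0.05, 0.30),
    Activity("activity C", 0.75, 0.02),
]
for rank, activity in enumerate(likely_activities(toy_predictions), start=1):
    print(rank, activity.name, activity.pa, activity.pi)
```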
According to the PASS prediction, there are several activities related either to estrogen modulation or to ERβ specifically (Table 3). Among the estrogen-related activities, VEDT is most likely to serve as an estrogen antagonist, followed by ERβ antagonist, estrogen agonist, and ERβ agonist. Therefore, VEDT is more likely to serve as an antagonist of estrogen and ERβ activity than as their agonist.
Table 3. Predicted activities related to estrogen modulation or the action of ERβ. Rank: overall PASS rank of this activity for VEDT; Pa: probability that VEDT exhibits the given activity; Pi: probability that VEDT does not exhibit the given activity.

I Consensus Between Computational Studies
We identified potential molecular targets of VEDT via a combination of molecular modeling (docking), cheminformatics (SAR) techniques, and bioinformatics in the form of binding site analyses. Our results were consistent in identifying ERβ as a potential target of VEDT. All three approaches (VTS, PharmMapper, and PASS) identified ERβ modulation as one of the potential activities of VEDT. In particular, with VTS, ERβ was identified by PDB ID 1NDE as a top target from a list of 1451 protein structures. Although PharmMapper and PASS algorithms operate on a substantially larger data space, both identified ERβ as one of the top potential targets, though neither ranked it as high as VTS did. In addition, PharmMapper exhibited consistency in preference toward the conformation of ERβ represented by 1NDE, suggesting a possible antagonistic mechanism of action [12,44]. The hypothesis of an antagonistic mode of action is also supported by the PASS results, which favored antagonistic over agonistic activity of VEDT toward ERβ.
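The consensus step can be illustrated with a short Python sketch that counts how many methods propose each target. This is only a sketch under stated assumptions: the function name `consensus_targets`, the `min_methods` parameter, and the placeholder targets `target_A` and `target_B` are hypothetical, and the PASS activities are assumed to have been mapped to target names beforehand.

```python
from collections import Counter


def consensus_targets(predictions, min_methods=None):
    """Return targets proposed by at least `min_methods` methods.

    predictions: dict mapping a method name to an iterable of target names.
    min_methods: defaults to the number of methods (full consensus).
    """
    if min_methods is None:
        min_methods = len(predictions)
    # Count each target once per method, even if a method lists it twice.
    counts = Counter(
        target for targets in predictions.values() for target in set(targets)
    )
    return sorted(t for t, n in counts.items() if n >= min_methods)


# Illustrative hit lists; only ERbeta reflects a result reported in the text.
toy_hits = {
    "VTS": ["ERbeta", "target_A"],
    "PharmMapper": ["ERbeta", "target_B"],
    "PASS": ["ERbeta", "target_A"],
}
print(consensus_targets(toy_hits))
print(consensus_targets(toy_hits, min_methods=2))
```

Requiring agreement from all methods is the strictest choice; lowering `min_methods` trades specificity for a longer candidate list.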

II Consistency with Experimental Data
Although no experimental data were considered during the initial VTS screening of VEDT, a subsequent literature search revealed that Comitato and colleagues had performed both in vitro binding analyses and molecular docking studies to identify a high-affinity interaction between VEDT and ERβ, which was the first reported evidence of such an association [11,12]. The consistency of experimental data with the computational approaches reported in this work is encouraging and gives validity to the use of tools such as VTS as an initial step to probe for potential molecular targets of compounds with an unknown mode of action. In addition, the techniques described in this work allow for additional inferences regarding the precise modulating effects of the MOI on its target. Comitato et al., based on their docking studies, favored an agonistic mode of action, which appears to contradict the antagonistic activity favored by our results. Even so, the combination of different approaches reported here can be beneficial in facilitating the investigation of the true mode of action of the MOI.
Overall, the use of VTS as a first step, particularly in combination with other computational methods for consensus, is a viable strategy in helping to identify potential targets of natural products and other chemical substances before proceeding with extensive experimental work. Moreover, utilization of bioinformatics approaches like ProBiS can help gain additional insight into potential targets and help identify structures for further inclusion into the VTS and cheminformatics screens. This multipronged consensus approach may prove especially valuable in cases where it is not feasible or is otherwise prohibitive to conduct experimental studies to address this question.
Computational approaches in drug discovery and development have faced skepticism in the past since, in some cases, needed virtual protein structures were not available or of poor resolution. In addition, there have been difficulties in developing accurate scoring algorithms for compound docking in virtual screening and difficulties accounting for protein flexibility (e.g. active site dynamics). Experimental screening has been considered more reliable since it is performed in a relevant in vitro or even in vivo setting. But virtual protein availability has greatly improved, protein conformational dynamics can be modeled with some accuracy, HPC (parallel processing on clusters) is more readily available and available virtual compound libraries provide greater breadth of chemical space to test (e.g. the ZINC databases).
When we consider virtual screening and/or VTS as a first pass at screening, compared to experimental screening, the accuracy is about the same or even better with the virtual approaches. Virtual is certainly the cheaper and faster approach to find "hit" compounds in virtual screening or to find "hit" proteins in VTS. VTS attained 72% accuracy in a test case of kinases and known kinase inhibitors with our prototype VTS system [Santiago, et al., 2012] and we believe VTS can be improved to 80-90% accuracy with planned improvements, such as incorporating machine learning to track protein promiscuity to down-weight proteins that tend to bind many compounds or incorporating low-mode vibrational analysis that allows the virtual protein/ligand complex more flexibility and in vivo-like dynamics. Experimental approaches can be costly and are hampered sometimes by poor availability of physical protein and viable assays. And experimental work can have issues with individual compounds such that we consider the accuracy of experimental approaches, as a first pass at screening, to be around 70%.
Experimental assaying can be affected by: 1) cell permeability; 2) compound solubility; 3) compound concentration; 4) proper solvents & buffers; 5) protein availability; 6) protein degradation; 7) potential modifications, degradation or sequestration of the MOI; 8) technical competency (e.g. pipetting errors) and availability of validated assays, reagents and equipment; 9) proper analysis and interpretation of results. Commercial services that test compounds against a panel of related protein targets (e.g. kinase panels) can be used but they are expensive and services that have comprehensive protein collections are an exception rather than the rule.
Typically, in experimental screening, 14% of compounds may be insoluble and 30-40% have poor solubility [45,46]. So, when computational and experimental screening are compared as a first pass, the accuracy of computational screening is most encouraging. With VTS, the concentration, purity, and solubility of the virtual MOI and protein structures are irrelevant, as are degradation and aberrant modifications of the virtual structures. Also, the virtual MOI has 100% solubility, zero toxicity, and zero photosensitivity, and it maintains absolute stereochemistry (i.e. no racemic drift). Of course, VTS results need experimental confirmation, but the experimental work is greatly reduced by VTS filtering out insignificant proteins and by virtual screening filtering out irrelevant compounds. Later in drug development, solubility and permeability issues of a compound, which can be interrelated depending on the intended therapeutic context, can be addressed in compound optimization with experimental assays and computational modeling, including additional VTS runs.