‘Query-by Committee’ – An Efficient Method to Select Information-Rich Data for the Development of Peptide–HLA-Binding Predictors

Lamberth, K. ; Nielsen, M. ; Lundegaard, C. ; Worning, P. ; Laurmøller, S. L. ; Lund, O. ; Brunak, S. ; Buus, S.

Oxford, UK; Malden, USA : Blackwell Science Ltd/Inc.
Published 2004

ISSN:	1365-3083
Source:	Blackwell Publishing Journal Backfiles 1879-2005
Topics:	Medicine
Notes:	Rationale: We have previously demonstrated that bioinformatics tools such as artificial neural networks (ANNs) are capable of performing pathogen-, genome- and HLA-wide predictions of peptide–HLA interactions. These tools may therefore enable a fast and rational approach to epitope identification and thereby assist in the development of vaccines and immunotherapy. A crucial step in the generation of such bioinformatics tools is the selection of data representing the event in question (in casu peptide–HLA interaction). This is particularly important when data are difficult and expensive to obtain. Herein, we demonstrate the importance of selecting information-rich data, and we develop a computational method, query-by-committee (QBC), which can perform a global identification of such information-rich data in an unbiased and automated manner. Furthermore, we demonstrate how this method can be applied to an efficient iterative development strategy for these bioinformatics tools.
Methods: A large panel of binding affinities of peptides binding to HLA A*0204 was measured by a radioimmunoassay (RIA). These data were used to develop multiple first-generation ANNs, which formed a virtual committee. This committee was ‘queried’ to screen for peptides on whose HLA-binding potential the ANNs agreed (‘low-QBC’) or disagreed (‘high-QBC’). Seventeen low-QBC peptides and 17 high-QBC peptides were synthesized and tested. The high- or low-QBC data were added to the original data, and new high- or low-QBC second-generation ANNs were developed, respectively. This procedure was repeated 40 times.
Results: The high-QBC-enriched ANN performed significantly better than the low-QBC-enriched ANN in 37 of the 40 tests.
Conclusion: These results demonstrate that high-QBC-enriched networks perform better than low-QBC-enriched networks in selecting informative data for developing peptide–MHC-binding predictors. This improvement in selecting data is not due to differences in network training performance but to the difference in information content between the high-QBC experiment and the low-QBC experiment. Finally, it should be noted that this strategy could be used in many contexts where generation of data is difficult and costly.
Type of Medium:	Electronic Resource
URL:	http://dx.doi.org/10.1111/j.0300-9475.2004.01423bf.x
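
Illustrative sketch (not part of the catalogued record): the committee-screening step described in the Methods note can be pictured roughly as follows. This is a minimal sketch only, assuming that committee disagreement is scored as the standard deviation of the members' predicted binding scores; the abstract does not state which disagreement measure the authors used. The function names, peptide labels and prediction values are all hypothetical.

```python
import statistics


def qbc_scores(committee_predictions):
    """Score each peptide by committee disagreement.

    Assumption: disagreement = population standard deviation of the
    predicted binding scores across committee members.
    """
    return {
        peptide: statistics.pstdev(preds)
        for peptide, preds in committee_predictions.items()
    }


def select_by_disagreement(committee_predictions, k):
    """Return (high_qbc, low_qbc): the k peptides the committee disagrees
    on most, and the k peptides it agrees on most."""
    if 2 * k > len(committee_predictions):
        raise ValueError("need at least 2*k candidate peptides")
    scores = qbc_scores(committee_predictions)
    # Sort ascending by disagreement; the peptide label breaks exact ties.
    ranked = sorted(scores, key=lambda p: (scores[p], p))
    low_qbc = ranked[:k]
    high_qbc = ranked[-k:][::-1]  # most contested peptide first
    return high_qbc, low_qbc


# Hypothetical predictions from a three-member committee (values invented
# for illustration; they are not data from the paper).
predictions = {
    "pep_A": [0.90, 0.88, 0.91],
    "pep_B": [0.10, 0.80, 0.45],
    "pep_C": [0.20, 0.25, 0.18],
    "pep_D": [0.05, 0.95, 0.50],
}

high_qbc, low_qbc = select_by_disagreement(predictions, k=2)
print("high-QBC:", high_qbc)
print("low-QBC:", low_qbc)
```

In the iterative strategy the note describes, the selected peptides would then be synthesized and measured, the new measurements added to the training data, and a next-generation committee trained on the enlarged set.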

Staff View

_version_	1798290232051761153
book_url	http://dx.doi.org/10.1111/j.0300-9475.2004.01423bf.x
datenlieferant	nat_lic_papers
hauptsatz	hsatz_simple
identnr	NLZ243697511
insertion_date	2012-04-27
issn	1365-3083
journal_name	Scandinavian journal of immunology
materialart	1
package_name	Blackwell Publishing
publikationsjahr_anzeige	2004
publikationsjahr_facette	2004
publikationsjahr_intervall	7999:2000-2004
publikationsjahr_sort	2004
publikationsort	Oxford, UK; Malden, USA
publisher	Blackwell Science Ltd/Inc.
reference	59 (2004), S. 0
search_space	articles
sigel_instance_filter	dkfz geomar wilbert ipn albert
source_archive	Blackwell Publishing Journal Backfiles 1879-2005
timestamp	2024-05-06T08:13:27.202Z
titel	‘Query-by Committee’— An Efficient Method to Select Information-Rich Data for the Development of Peptide—HLA-Binding Predictors
titel_suche	‘Query-by Committee’— An Efficient Method to Select Information-Rich Data for the Development of Peptide—HLA-Binding Predictors
topic	WW-YZ
uid	nat_lic_papers_NLZ243697511