A comparative dataset: Bridging COVID-19 and other diseases through Epistemonikos and CORD-19 evidence
Journal: Data in Brief
Volume: 51
Publication type: ISI
Abstract
The COVID-19 pandemic has underlined the need for reliable information for clinical decision-making and public health policies. As such, evidence-based medicine (EBM) is essential in identifying and evaluating scientific documents pertinent to novel diseases, and the accurate classification of biomedical text is integral to this process. Given this context, we introduce a comprehensive, curated dataset composed of COVID-19-related documents. This dataset includes 20,047 labeled documents that were meticulously classified into five distinct categories: systematic reviews (SR), primary study randomized controlled trials (PS-RCT), primary study non-randomized controlled trials (PS-NRCT), broad synthesis (BS), and excluded (EXC). The documents, labeled by collaborators from the Epistemonikos Foundation, incorporate information such as document type, title, abstract, and metadata, including PubMed id, authors, journal, and publication date. Uniquely, this dataset has been curated by the Epistemonikos Foundation and is not readily accessible through conventional web-scraping methods, thereby attesting to its distinctive value in this field of research. In addition to this, the dataset also includes a vast evidence repository comprising 427,870 non-COVID-19 documents, also categorized into SR, PS-RCT, PS-NRCT, BS, and EXC. This additional collection can serve as a valuable benchmark for subsequent research. The comprehensive nature of this open-access dataset and its accompanying resources is poised to significantly advance evidence-based medicine and facilitate further research in the domain. © 2023 The Author(s). Published by Elsevier Inc.