Johanna Berg

Teaching staff

OpenChart-SE: A corpus of artificial Swedish electronic health records for imagined emergency care patients written by physicians in a crowd-sourcing project

Author

  • Johanna Berg
  • Carl Aasa
  • Björn Appelgren Thorell
  • Sonja Aits

Summary, in English

Electronic health records (EHRs) are a rich source of information for medical research and public health monitoring. Information systems based on EHR data could also assist in patient care and hospital management. However, much of the data in EHRs is in the form of unstructured text, which is difficult to process for analysis. Natural language processing (NLP), a form of artificial intelligence, has the potential to enable automatic extraction of information from EHRs, and several NLP tools adapted to the style of clinical writing have been developed for English and other major languages. In contrast, the development of NLP tools for less widely spoken languages such as Swedish has lagged behind. A major bottleneck in the development of NLP tools is the restricted access to EHRs due to legitimate patient privacy concerns. To overcome this issue, we have created a citizen science platform for collecting artificial Swedish EHRs with the help of Swedish physicians and medical students. These artificial EHRs describe imagined but plausible emergency care patients in a style that closely resembles EHRs used in emergency departments in Sweden. In the pilot phase, we collected a first batch of 50 artificial EHRs, which has passed review by an experienced Swedish emergency care physician. We make this dataset publicly available as the OpenChart-SE corpus (version 1) under an open-source license for the NLP research community. The project is now open for general participation, and Swedish physicians and medical students are invited to submit EHRs on the project website (https://github.com/Aitslab/openchart-se). Additional batches of quality-controlled EHRs will be released periodically.

Department/s

  • Cell Death, Lysosomes and Artificial Intelligence
  • LU Profile Area: Natural and Artificial Cognition
  • LTH Profile Area: Engineering Health
  • EpiHealth: Epidemiology for Health
  • LTH Profile Area: AI and Digitalization
  • eSSENCE: The e-Science Collaboration

Publishing date

2023-01-05

Language

English

Document type

Preprint

Publisher

medRxiv

Topic

  • Public Health, Global Health, Social Medicine and Epidemiology

Keywords

  • electronic health records
  • natural language processing
  • artificial intelligence
  • citizen science

Status

Published

Project

  • Lund University AI Research
  • Artificial intelligence-based text mining for COVID-19 and other areas of medicine

Research group

  • Cell Death, Lysosomes and Artificial Intelligence