Datasets

Here are some datasets we published in recent years:

  • MIMIC-III IPI: Dataset of indirect identifiers in English discharge summaries of MIMIC-III. [Data]

  • ADE Dataset [Dataset #1]

  • Ex4CDS: Textual Explanations for Clinical Decision Support [Data]

  • Subjective Text Complexity Corpus for German [Data]

  • Technical-Laymen Corpus (TLC): Annotated German medical forum data with medical lay expressions and their ‘translations’, and vice versa. [Data]

  • Dataset of German synthetic clinical notes [Data]

Tools

Here are some tools we published in recent years:

  • Factual-med-bert-de: German BERT model to detect factuality (Affirmed, Negated and Possible) of medical conditions in German clinical texts [Model]

  • Pynegex: A Python library to use NegEx for German (factuality detection) [Model]

  • A German Dependency Tree Parser for Medical Text, with a small gold standard of clinical notes [Model]

  • mEx Workbench: A toolbox of NER and RE models for German clinical text [Model]





Dr. Roland Roller

roland.roller@dfki.de

Senior Researcher
German Research Center for Artificial Intelligence (DFKI)

 