CARE-SM CSV Glossary Documentation

CARE-SM implements a CSV template structure to populate patient registry information. This template consists of a defined set of columns used across different data model modules. Depending on the specific module, a subset of these columns may be required, optional, or not applicable.

CSV template columns

The following table is the complete list of available columns:

| Name | Datatype | Description |
| --- | --- | --- |
| model | Literal (string) | Tag name used to identify the specific model to generate. |
| pid | Literal (integer) | Patient unique identifier. |
| event_id | Literal (integer) | Event-specific identifier used to contextualize several data elements under the same medical visit or encounter. |
| value | Literal (string, number, or date) | Lexical value of this data element. |
| value_datatype | Literal (string) | Additional metadata of the value column, using the XML Schema Definition Language (XSD). Expected datatype: xsd:date, xsd:string, xsd:float or xsd:integer. |
| valueIRI | IRI | Full IRI-based value of the data element. |
| activity | IRI | Full conceptual IRI describing a specific process associated with this data element, such as a clinical method or route of administration. |
| unit | IRI | Full conceptual IRI describing the data element’s unit of measurement. |
| input | IRI | Full conceptual IRI defining any input associated with the clinical procedure, such as a medication used or a related tissue sample. |
| target | IRI | Full conceptual IRI defining any target associated with the clinical procedure, such as an anatomical structure or a molecule. |
| specification | IRI | Full conceptual IRI specifying any protocol or document that contextualizes the reported data element. |
| duration_value | Literal (ISO 8601 duration) | Duration value, for instance P10Y. It is typed as xsd:duration. |
| duration_startdate | Literal (ISO 8601 date) | Start date of the duration interval. |
| duration_enddate | Literal (ISO 8601 date) | End date of the duration interval. |
| frequency_type | IRI | Full conceptual IRI defining the frequency type of the associated clinical procedure. |
| frequency_value | Literal (string) | Numerical frequency value; combined with frequency_type, it defines the full frequency (e.g., “2 times per month”). |
| agent | IRI | Full conceptual IRI defining an additional agent participating in the data element definition. |
| startdate | Literal (ISO 8601 date) | Start date of the data registration or time instant when the observation was taken. If identical to value, it is not required. |
| enddate | Literal (ISO 8601 date) | End date of the data registration, if it differs from startdate. Not needed when the data element is a single time point; in that case only startdate is required. |
| age | Literal (integer) | Age of the patient at the time of observation. |
| comments | Literal (string) | Human-readable comments or descriptions related to this data element. |
| organisation | IRI | IRI used to define the organisation identifier. |
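
The datatypes expected in value_datatype can be checked before a file is submitted. The following is a minimal sketch, assuming each value is read from the CSV as text; the function name `matches_xsd_datatype` is hypothetical and not part of CARE-SM, and the checks only approximate the XSD lexical rules.

```python
from datetime import date


def matches_xsd_datatype(value: str, value_datatype: str) -> bool:
    """Return True if `value` parses as the declared XSD datatype.

    Only the four datatypes named in the column table are handled;
    anything else is reported as a mismatch.
    """
    try:
        if value_datatype == "xsd:date":
            date.fromisoformat(value)  # e.g. 2006-01-19
        elif value_datatype == "xsd:integer":
            int(value)
        elif value_datatype == "xsd:float":
            float(value)
        elif value_datatype == "xsd:string":
            pass  # any text is acceptable
        else:
            return False
    except ValueError:
        return False
    return True


print(matches_xsd_datatype("2006-01-19", "xsd:date"))
print(matches_xsd_datatype("19/01/2006", "xsd:date"))
```

A check like this could run over every row whose value_datatype is filled in, flagging rows where value and value_datatype disagree.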

CSV template creation

This guide explains how to structure, populate, and utilize CSV files for patient data in CARE-SM.

  1. Create the CSV File

    The CSV file accepts only the following column names:

    • model, pid, event_id, value, age, value_datatype, valueIRI, activity, unit, input, target, specification, frequency_type, frequency_value, agent, startdate, enddate, comments, organisation.

    Example:

    model,pid,event_id,value,age,value_datatype,valueIRI,activity,unit,input,target,specification,frequency_type,frequency_value,agent,startdate,enddate,comments,organisation
    ,,,,,,,,,,,,,,,,,,
    
  2. Populate Data
    Each data entry needs specific fields (not all fields are mandatory). See the glossary below for details.

    Example:

    model,pid,event_id,value,age,value_datatype,valueIRI,activity,unit,input,target,specification,frequency_type,frequency_value,agent,startdate,enddate,comments,organisation
    Diagnosis,30056,,,,,http://www.orpha.net/ORDO/Orphanet_93552,,,,,,,,,2006-01-19,,,
    

    For more examples, refer to the exemplar_data folder.
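
The two steps above can also be scripted. The sketch below assumes Python's standard csv module; the helper names `unexpected_columns` and `write_rows` are hypothetical and not part of CARE-SM. It checks a header against the accepted column names and writes the Diagnosis entry with its date in the startdate column, leaving every other field empty.

```python
import csv
import io

# Column names accepted by the template, in header order (from step 1).
COLUMNS = [
    "model", "pid", "event_id", "value", "age", "value_datatype", "valueIRI",
    "activity", "unit", "input", "target", "specification", "frequency_type",
    "frequency_value", "agent", "startdate", "enddate", "comments",
    "organisation",
]


def unexpected_columns(header):
    """Return the header names that are not in the accepted column list."""
    return [name for name in header if name not in COLUMNS]


def write_rows(handle, rows):
    """Write the header and one CSV line per row dict.

    Fields missing from a row are written as empty strings; a key that is
    not an accepted column makes DictWriter raise ValueError.
    """
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# The Diagnosis entry from step 2, keyed by column name.
diagnosis = {
    "model": "Diagnosis",
    "pid": "30056",
    "valueIRI": "http://www.orpha.net/ORDO/Orphanet_93552",
    "startdate": "2006-01-19",
}

buffer = io.StringIO()
write_rows(buffer, [diagnosis])
header_line, data_line = buffer.getvalue().splitlines()
print(header_line)
print(data_line)
```

Building rows as dictionaries keyed by column name avoids counting commas by hand, which is the easiest way to shift a value into the wrong column.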

Data Element Glossary

Legend:

  • This column is MANDATORY for this case.

  • This column is OPTIONAL for this case.

  • This column is NOT APPLICABLE for this case.


Birthdate

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Birthdate

  • pid: Patient Unique Identifier.

  • value: ISO 8601-formatted birthdate.

  • value_datatype:

  • valueIRI:

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration; it may be the same as value.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age:

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Birthyear

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Birthyear

  • pid: Patient Unique Identifier.

  • value: The year (YYYY) in which a person was born.

  • value_datatype:

  • valueIRI:

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate:

  • enddate:

  • age:

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Birthplace

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Birthplace

  • pid: Patient Unique Identifier.

  • value: Human-readable label of the country.

  • value_datatype:

  • valueIRI: Full IRI used as annotation code for the country at birth.

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age:

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Deathdate

  • model: Deathdate

  • pid: Patient Unique Identifier.

  • value: ISO 8601-formatted date of death (a date, not a date-time).

  • value_datatype:

  • valueIRI: Full IRI used as annotation code for the cause of the patient’s death.

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration; it may be the same as value.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Sex

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Sex

  • pid: Patient Unique Identifier.

  • value: Human-readable label of the concept IRI used for valueIRI.

  • value_datatype:

  • valueIRI: Full concept IRI for the patient sex annotation.**

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age:

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:

** If you are representing CARE-SM with the OBO Foundry, use one of the following NCIT terms:

  • http://purl.obolibrary.org/obo/NCIT_C16576 (Female)

  • http://purl.obolibrary.org/obo/NCIT_C20197 (Male)

  • http://purl.obolibrary.org/obo/NCIT_C124294 (Undetermined)

  • http://purl.obolibrary.org/obo/NCIT_C17998 (Unknown)
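
A small lookup can keep value and valueIRI consistent for Sex entries. The sketch below uses only the NCIT terms listed above; the helper name `sex_row` and the practice of building rows as dictionaries keyed by column name are illustrative assumptions, not part of CARE-SM.

```python
# NCIT terms for the Sex model, keyed by their human-readable label.
NCIT_SEX = {
    "Female": "http://purl.obolibrary.org/obo/NCIT_C16576",
    "Male": "http://purl.obolibrary.org/obo/NCIT_C20197",
    "Undetermined": "http://purl.obolibrary.org/obo/NCIT_C124294",
    "Unknown": "http://purl.obolibrary.org/obo/NCIT_C17998",
}


def sex_row(pid: str, label: str, startdate: str) -> dict:
    """Build a Sex entry; raises KeyError for a label outside the list."""
    return {
        "model": "Sex",
        "pid": pid,
        "value": label,
        "valueIRI": NCIT_SEX[label],
        "startdate": startdate,
    }


row = sex_row("30056", "Female", "2006-01-19")
print(row["valueIRI"])
```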


First Confirmed Visit

  • model: First_visit

  • pid: Patient Unique Identifier.

  • value: ISO 8601-formatted date of first confirmed visit.

  • value_datatype:

  • valueIRI:

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration; it may be the same as value.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Participation status

  • model: Status

  • pid: Patient Unique Identifier.

  • value: Human-readable label of the concept IRI used for valueIRI.

  • value_datatype:

  • valueIRI: Full concept IRI for the patient participation status annotation.**

  • activity:

  • unit:

  • input:

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:

** If you are representing CARE-SM with the OBO Foundry, use one of the following NCIT terms:

  • http://purl.obolibrary.org/obo/NCIT_C37987 (Alive)

  • http://purl.obolibrary.org/obo/NCIT_C90387 (Found Dead)

  • http://purl.obolibrary.org/obo/NCIT_C70740 (Subject Lost to Follow Up)

  • http://purl.obolibrary.org/obo/NCIT_C124784 (Refusal To Participate)


Symptoms onset

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Symptoms_onset

  • pid: Patient Unique Identifier.

  • value: One of the following:

    • Age of the symptoms onset.

    • ISO 8601-formatted date of the symptoms occurrence.

  • value_datatype: XSD datatype that defines value column.

  • valueIRI:

  • activity:

  • unit:

  • input:

  • target: Full concept IRI for the particular symptom annotation.

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:
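
Because value in a Symptoms_onset entry can hold either an age or a date, value_datatype has to say which one it is. The following is a minimal sketch of how that choice might be automated; the function name `symptoms_onset_datatype` is hypothetical, and treating a whole number as an age typed xsd:integer is an assumption rather than a CARE-SM rule.

```python
from datetime import date
from typing import Optional


def symptoms_onset_datatype(value: str) -> Optional[str]:
    """Suggest a value_datatype for a Symptoms_onset value.

    Whole numbers are treated as an age (xsd:integer), other numbers as
    xsd:float, and ISO 8601 calendar dates as xsd:date. Returns None when
    the value fits none of these.
    """
    for parser, datatype in (
        (int, "xsd:integer"),
        (float, "xsd:float"),
        (date.fromisoformat, "xsd:date"),
    ):
        try:
            parser(value)
        except ValueError:
            continue
        return datatype
    return None


print(symptoms_onset_datatype("35"))
print(symptoms_onset_datatype("2006-01-19"))
```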


Phenotype

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Phenotype

  • pid: Patient Unique Identifier.

  • value: Human-readable label of the concept IRI used for valueIRI.

  • value_datatype:

  • valueIRI: Full concept IRI for the phenotype, symptom or sign annotation.

  • unit:

  • input:

  • target: Full concept IRI for the anatomic region where the observation was diagnosed, including the cardinality if possible.

  • specification: IRI reference to any associated protocol.

  • duration_value: ISO 8601 duration value of the phenotype duration interval, for instance, P10Y.

  • duration_startdate: ISO 8601 start date of the phenotype duration interval.

  • duration_enddate: ISO 8601 end date of the phenotype duration interval.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:
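
Since duration_value is typed as xsd:duration, its lexical form can be checked with a regular expression. The sketch below is an approximation written for illustration; the names `XSD_DURATION` and `is_xsd_duration` are hypothetical, and the pattern follows the general xsd:duration shape (an optional sign, the letter P, then year, month, and day parts, with an optional time part after T).

```python
import re

# Approximate lexical pattern for xsd:duration, e.g. P10Y or P1Y2M3DT4H30M.
XSD_DURATION = re.compile(
    r"-?P(?=\d|T\d)"                            # 'P' must be followed by a component
    r"(\d+Y)?(\d+M)?(\d+D)?"                    # years, months, days
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"  # optional time part after 'T'
)


def is_xsd_duration(text: str) -> bool:
    """Return True if the whole string looks like an xsd:duration value."""
    return XSD_DURATION.fullmatch(text) is not None


print(is_xsd_duration("P10Y"))
print(is_xsd_duration("10 years"))
```

The same check applies to any model that uses duration_value, not only Phenotype.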


Diagnosis

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Diagnosis

  • pid: Patient Unique Identifier.

  • value: Human-readable label of the concept IRI used for valueIRI.

  • value_datatype:

  • valueIRI: Full concept IRI for the disease or disorder diagnosed.

  • activity:

  • unit:

  • input:

  • target: Full concept IRI for the anatomic region where the diagnosis was performed.

  • specification:

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Examination

  • model: Examination

  • pid: Patient Unique Identifier.

  • value: Resulting value from this examination.

  • value_datatype: XSD datatype that defines value column.

  • valueIRI: Full concept IRI for the patient attribute/finding examined annotation.**

  • activity: Full concept IRI for the specific method activity.

  • unit: Full concept IRI for the unit of measurement.

  • input:

  • target: Full concept IRI for the anatomic structure examined.

  • specification: URL reference to any protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:

** If you are representing CARE-SM with the OBO Foundry, these are some example NCIT terms:

  • http://purl.obolibrary.org/obo/NCIT_C25208 (Weight)

  • http://purl.obolibrary.org/obo/NCIT_C25347 (Height)

  • http://purl.obolibrary.org/obo/NCIT_C99524 (Left Ventricular Ejection Fraction)

  • http://purl.obolibrary.org/obo/NCIT_C16358 (Body Mass Index)


Laboratory

  • model: Laboratory

  • pid: Patient Unique Identifier.

  • value: Value of the laboratory measurement.

  • value_datatype: XSD datatype that defines value column.

  • valueIRI:

  • activity: Full concept IRI for the specific method annotation.

  • unit: Full concept IRI for the unit of measurement.

  • input: Full concept IRI for the substance sample analysed.

  • target: Full concept IRI for the compound measured in the sample.

  • specification: URL reference to a protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Genetic

This data element can be queried (in an aggregated and anonymized manner) through the Beacon API developed for CARE-SM.

  • model: Genetic

  • pid: Patient Unique Identifier.

  • value: Lexical annotation code for the genetic variant, e.g. NM_004006.2:c.4375C>T p.(Arg1459*).

  • value_datatype:

  • valueIRI: Full concept IRI for the genetic variant annotation.

  • activity: Full concept IRI for the specific method.

  • unit:

  • input: Full concept IRI for the type of substance sample analysed.

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent: Full concept IRI for the associated zygosity annotation.

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Medication

  • model: Medication

  • pid: Patient Unique Identifier.

  • value: Prescribed or administered dose value.

  • value_datatype: XSD datatype that defines value column type.

  • valueIRI: Full concept IRI for the type of medication record (drug prescription or administration).**

  • activity: Full concept IRI for the route of administration.

  • unit: Full concept IRI for the unit of measurement.

  • input:

  • target:

  • specification: URL reference to a protocol.

  • frequency_type: Full concept IRI for the administered/prescribed frequency annotation.

  • frequency_value: Frequency value prescribed to the patient.

  • agent: Full concept IRI for the drugs annotation.

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:

** If you are representing CARE-SM with the OBO Foundry, use one of the following NCIT terms:

  • http://purl.obolibrary.org/obo/NCIT_C167190 (Dose Administered)

  • http://purl.obolibrary.org/obo/NCIT_C198143 (Prescribed Dose)


Hospitalization

  • model: Hospitalization

  • pid: Patient Unique Identifier.

  • value:

  • value_datatype:

  • valueIRI:

  • activity: Full concept IRI for the specific activity performed during the hospitalization.

  • input:

  • target:

  • specification: URL reference to a protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Surgery

  • model: Surgery

  • pid: Patient Unique Identifier.

  • value:

  • value_datatype:

  • valueIRI:

  • activity: Full concept IRI for the specific activity performed during the surgery.

  • unit:

  • input:

  • target: Full concept IRI for the anatomic structure that participates in the surgical intervention.

  • specification: URL reference to a protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Questionnaire

  • model: Questionnaire

  • pid: Patient Unique Identifier.

  • value: Score/value of the patient reported response.

  • value_datatype: XSD datatype that defines value column.

  • valueIRI:

  • activity:

  • unit: Full concept IRI for the unit of measurement.

  • input: Full concept IRI for the clinical question or statement to report.

  • target:

  • specification: Full concept IRI for the assessment tool or Patient Reported Outcome Measures specification.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Disability

  • model: Disability

  • pid: Patient Unique Identifier.

  • value: Score/value of the patient reported response.

  • value_datatype: XSD datatype that defines value column.

  • valueIRI:

  • activity:

  • unit: Full concept IRI for the unit of measurement.

  • input: Full concept IRI for the clinical question performed.

  • target:

  • specification: Full concept IRI for the assessment tool or questionnaire specification.

  • duration_value: ISO 8601 duration value of the disability duration interval, for instance, P10Y.

  • duration_startdate: ISO 8601 start date of the disability duration interval.

  • duration_enddate: ISO 8601 end date of the disability duration interval.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation:


Biobank

  • model: Biobank

  • pid: Patient Unique Identifier.

  • value: Sample accession number.

  • value_datatype:

  • valueIRI:

  • activity:

  • unit:

  • input: Full concept IRI for the type of tissue/sample collected during the sampling process.

  • target:

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation: Associated biobank identifier.


Clinical trial

  • model: Clinical_trial

  • pid: Patient Unique Identifier.

  • value: Study report name/identifier.

  • value_datatype:

  • valueIRI:

  • unit:

  • input:

  • target: Full concept IRI for the genetic or disease condition.

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation: Medical/research center identifier associated with the clinical trial.


Cohort

  • model: Cohort

  • pid: Patient Unique Identifier.

  • value: Study report name/identifier.

  • value_datatype:

  • valueIRI:

  • unit:

  • input:

  • target: Full concept IRI for the genetic or disease condition.

  • specification: IRI reference to any associated protocol.

  • frequency_type:

  • frequency_value:

  • agent:

  • startdate: ISO 8601-formatted start date of the data registration.

  • enddate: ISO 8601-formatted end date of the data registration in case it is different from startdate.

  • age: The patient’s age at the time the observation was taken.

  • comments: Additional human-readable comments.

  • event_id: Visit occurrence identifier.

  • organisation: Medical/research center identifier associated with the cohort.