staging

staging Package

Image processing preparation.

The staging package defines the functions used to prepare the study image files for import into XNAT, submission to the TCIA QIN collections and pipeline processing.

ctp_config

qipipe.staging.ctp_config.ctp_collection_for_name(name)
Parameters:name – the QIN collection name
Returns:the CTP collection name

fix_dicom

qipipe.staging.fix_dicom.COMMENT_PREFIX = <_sre.SRE_Pattern object>

OHSU - the Image Comments tag value prefix.

qipipe.staging.fix_dicom.DATE_FMT = '%Y%m%d'

The DICOM date format is YYYYMMDD.
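
As a brief illustration, the DATE_FMT value shown above can be used with the standard datetime module to parse and format DICOM date values. The sample date is made up for this sketch.

```python
from datetime import datetime

# The format documented above.
DATE_FMT = '%Y%m%d'

# Parse a DICOM date string into a date object (sample value).
study_date = datetime.strptime('20130415', DATE_FMT).date()

# Format the date back into the DICOM representation.
dicom_value = study_date.strftime(DATE_FMT)
```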

qipipe.staging.fix_dicom.fix_dicom_headers(collection, subject, *in_files, **opts)

Fix the given input DICOM files as follows:

  • Replace the Patient ID value with the subject number, e.g.
    Sarcoma001
  • Add the Body Part Examined tag
  • Anonymize the Patient's Birth Date tag
  • Standardize the file name

OHSU - The Body Part Examined tag is set as follows:

OHSU - Remove extraneous Image Comments tag value content which might contain PHI.

The output file name is standardized as follows:

  • The file name is lower-case
  • The file extension is .dcm
  • Each non-word character is replaced by an underscore
Parameters:
  • collection – the collection name
  • subject – the input subject name
  • opts – the following keyword arguments:
  • dest – the location in which to write the modified files (default is the current directory)
Returns:

the files which were created

Raises:

StagingError – if the collection is not supported
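
The output file name rules listed for fix_dicom_headers can be sketched as follows. This is not the qipipe implementation: standardize_name is a hypothetical helper, and the handling of an existing extension other than .dcm is an assumption.

```python
import os
import re


def standardize_name(file_name):
    """Hypothetical helper which applies the three file name rules."""
    base, ext = os.path.splitext(os.path.basename(file_name))
    # Assumption: an extension other than .dcm is kept as part of the name.
    if ext.lower() != '.dcm':
        base += ext
    # Lower-case the name, replace each non-word character by an
    # underscore and append the .dcm extension.
    return re.sub(r'\W', '_', base.lower()) + '.dcm'
```

For example, standardize_name('IMG 0001.DCM') gives 'img_0001.dcm'.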

image_collection

class qipipe.staging.image_collection.Collection(name, **opts)

Bases: object

The image collection.

Parameters:
  • name – the name
  • opts – the following keyword options:
  • subject – the subject directory name match regular expression
  • session – the session directory name match regular expression
  • scan_types – the scan {number: type} dictionary
  • scan – the {scan number: {dicom, roi}} dictionary
  • volume – the DICOM tag which identifies a scan volume
  • crop_posterior – the flag indicating whether to crop the image posterior in the mask

crop_posterior = None

A flag indicating whether to crop the image posterior in the mask, e.g. for a breast tumor (default False).

instances = {'sarcoma': <qipipe.staging.image_collection.Collection object>, 'breast': <qipipe.staging.image_collection.Collection object>}

The collection {name: object} dictionary.

name = None

The capitalized collection name.

patterns = None

The DICOM and ROI meta-data patterns. This patterns attribute consists of the entries dicom and roi. Each of these entries has a mandatory glob entry and an optional regex entry. The glob entry matches the scan subdirectory containing the DICOM or ROI files. The regex entry matches the DICOM or ROI files in the subdirectory. The default in the absence of a regex entry is to include all files in the subdirectory.
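
The glob and regex roles can be sketched as follows. The sketch assumes that a patterns entry behaves like a dictionary with glob and regex keys; the actual container type is not stated here, and matching_files is a hypothetical helper rather than part of qipipe.

```python
import glob
import os
import re


def matching_files(session_dir, entry):
    """Hypothetical helper which lists the files selected by an entry."""
    regex = entry.get('regex')
    files = []
    # The mandatory glob selects the scan subdirectories.
    pattern = os.path.join(session_dir, entry['glob'])
    for subdir in sorted(glob.glob(pattern)):
        if not os.path.isdir(subdir):
            continue
        for name in sorted(os.listdir(subdir)):
            # Without a regex, every file in the subdirectory is included.
            if regex is None or regex.match(name):
                files.append(os.path.join(subdir, name))
    return files


# An example entry with an optional regex (assumed layout).
dicom_entry = {'glob': '*concat*', 'regex': re.compile(r'.*\.dcm$')}
```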

scan_types = None

The scan {number: type} dictionary.

qipipe.staging.image_collection.with_name(name)
Returns:the Collection whose name is a case-insensitive match for the given name, or None if no match is found
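
A case-insensitive lookup of this kind might look like the following sketch. The Collection class here is a stand-in for illustration only, and the lookup shown is an assumption about how the instances dictionary could be consulted, not the actual qipipe code.

```python
class Collection(object):
    """Stand-in for the documented class, for illustration only."""

    def __init__(self, name):
        # The documented name attribute is the capitalized name.
        self.name = name.capitalize()


# Keyed by lower-case name, like the documented instances dictionary.
instances = {name: Collection(name) for name in ('sarcoma', 'breast')}


def with_name(name):
    """Returns the matching collection, or None if there is no match."""
    return instances.get(name.lower())
```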

iterator

class qipipe.staging.iterator.VisitIterator(project, collection, *session_dirs, **opts)

Bases: object

Scan DICOM generator class.

Parameters:
  • project – the XNAT project name
  • collection – the image collection name
  • session_dirs – the session directories over which to iterate
  • opts – the iter_stage() options
collection = None

The iter_stage() collection name parameter.

project = None

The iter_stage() project name parameter.

scan = None

The iter_stage() scan number option.

session_dirs = None

The input directories.

skip_existing = None

The iter_stage() skip_existing flag option.

qipipe.staging.iterator.iter_stage(project, collection, *inputs, **opts)

Iterates over the scans in the given input directories. This method is a staging generator which yields the {subject, session, scan, dicom, roi} tuple for each scan.

The input directories conform to the qipipe.staging.image_collection.Collection.patterns subject regular expression.

Each iteration {subject, session, scan, dicom, roi} object is formed as follows:

  • The subject is the XNAT subject name formatted by SUBJECT_FMT.
  • The session is the XNAT experiment name formatted by SESSION_FMT.
  • The scan is the XNAT scan number.
  • dicom is the DICOM directory.
  • roi is the ROI directory.
Parameters:
  • project – the XNAT project name
  • collection – the qipipe.staging.image_collection.Collection.name
  • inputs – the source subject directories to stage
  • opts – the following keyword options:
  • scan – the scan number to stage (default stage all detected scans)
  • skip_existing – flag indicating whether to ignore each existing session, or scan if the scan option is set (default True)
Yield:

the {subject, session, scan, dicom, roi} objects

map_ctp

TCIA CTP preparation utilities.

class qipipe.staging.map_ctp.CTPPatientIdMap

Bases: dict

CTPPatientIdMap is a dictionary augmented with an add_subjects() input method to build the map and a write() output method to print the CTP map properties.

CTP_FMT = '%s-%04d'

The CTP Patient ID format with arguments (CTP collection name, input Patient ID number).

MAP_FMT = 'ptid/%s=%s'

The ID lookup entry format with arguments (input Patient ID, CTP Patient ID).

MSG_FMT = 'Mapped the QIN patient id %s to the CTP subject id %s.'

The log message format with arguments (input Patient ID, CTP Patient ID).

SOURCE_PAT = <_sre.SRE_Pattern object>

The input Patient ID pattern is the study name followed by a number, e.g. Breast10.

add_subjects(collection, *patient_ids)

Adds the input => CTP Patient ID association for the given input DICOM patient ids.

Parameters:
  • collection – the image collection name
  • patient_ids – the DICOM Patient IDs to map
Raises:

StagingError – if an input patient id format is not the study followed by the patient number
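
The CTP_FMT and MAP_FMT formats combine as in the sketch below. The two format strings are copied from this page; the regular expression is only a hypothetical stand-in for SOURCE_PAT, map_entry is a hypothetical helper, the collection name is a placeholder, and ValueError is used in place of StagingError to keep the example self-contained.

```python
import re

CTP_FMT = '%s-%04d'
MAP_FMT = 'ptid/%s=%s'
# Hypothetical stand-in for SOURCE_PAT: a study name followed by a number.
SOURCE_PAT = re.compile(r'([a-zA-Z]+)(\d+)$')


def map_entry(ctp_collection, patient_id):
    """Hypothetical helper which builds one ID lookup entry."""
    match = SOURCE_PAT.match(patient_id)
    if not match:
        raise ValueError('Unsupported patient id format: %s' % patient_id)
    # The patient number is zero-padded to four digits by CTP_FMT.
    ctp_id = CTP_FMT % (ctp_collection, int(match.group(2)))
    return MAP_FMT % (patient_id, ctp_id)
```

With the placeholder collection name EXAMPLE-COLLECTION, the input Breast10 yields the entry ptid/Breast10=EXAMPLE-COLLECTION-0010.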

write(dest=<open file '<stdout>', mode 'w'>)

Writes this id map in the standard CTP format.

Parameters:dest – the IO stream on which to write this map (default stdout)
qipipe.staging.map_ctp.PROP_FMT = 'QIN-%s-OHSU.ID-LOOKUP.properties'

The format for the Patient ID map file name specified by CTP.

qipipe.staging.map_ctp.map_ctp(collection, *subjects, **opts)

Creates the TCIA patient id map. The map is written to a property file in the destination directory. The property file name is given by property_filename().

Parameters:
  • collection – the image collection
  • subjects – the subject names
  • opts – the following keyword option:
  • dest – the destination directory
Returns:

the subject map file path

qipipe.staging.map_ctp.property_filename(collection)

Returns the CTP id map property file name for the given collection. The Sarcoma collection is capitalized in the file name; Breast is not.

ohsu

This module contains the OHSU-specific image collections.

The following OHSU QIN scan numbers are captured:
  • 1: T1
  • 2: T2
  • 4: DW
  • 6: PD

These scans have DICOM files specified by the qipipe.staging.image_collection.Collection.patterns dicom attribute. The T1 scan has ROI files as well, specified by the patterns roi.glob and roi.regex attributes.

qipipe.staging.ohsu.BREAST_DW_PAT = '*sorted/*Diffusion'

The Breast DW DICOM directory match pattern.

qipipe.staging.ohsu.BREAST_PD_PAT = '*sorted/*PD*'

The Breast pseudo-proton density DICOM directory match pattern.

qipipe.staging.ohsu.BREAST_ROI_PAT = 'processing/R10_0.[456]*/slice*'

The Breast ROI glob filter. The .bqf ROI files are in the following session subdirectory:

processing/<R10 directory>/slice<slice index>/
qipipe.staging.ohsu.BREAST_ROI_REGEX = <_sre.SRE_Pattern object>

The Breast ROI .bqf ROI file match pattern.

qipipe.staging.ohsu.BREAST_SESSION_REGEX = <_sre.SRE_Pattern object>

The Breast session directory match pattern. The variations Visit_3, Visit3, visit3, BC4V3, BC4_V3 and B4V3 all match Breast Session03.

qipipe.staging.ohsu.BREAST_SUBJECT_REGEX = <_sre.SRE_Pattern object>

The Breast subject directory match pattern.

qipipe.staging.ohsu.BREAST_T2_PAT = '*sorted/2_tirm_tra_bilat'

The Breast T2 DICOM directory match pattern.

qipipe.staging.ohsu.MULTI_VOLUME_SCAN_NUMBERS = [1]

Only T1 scans can have more than one volume.

qipipe.staging.ohsu.SARCOMA_DW_PAT = '*Diffusion'

The Sarcoma DW DICOM directory match pattern.

qipipe.staging.ohsu.SARCOMA_ROI_PAT = 'Breast processing results/multi_slice/slice*'

The Sarcoma ROI glob filter. The .bqf ROI files are in the session subdirectory:

Breast processing results/<ROI directory>/slice<slice index>/

(Yes, the Sarcoma processing results are in the “Breast processing results” subdirectory!)

qipipe.staging.ohsu.SARCOMA_ROI_REGEX = <_sre.SRE_Pattern object>

The Sarcoma ROI .bqf ROI file match pattern.

Note

The Sarcoma ROI directories are inconsistently named, with several alternatives and duplicates.

TODO - clarify which of the Sarcoma ROI naming variations should be used.

Note

There are no apparent lesion number indicators in the Sarcoma ROI input.

TODO - confirm that there is no Sarcoma lesion indicator.

qipipe.staging.ohsu.SARCOMA_SESSION_REGEX = <_sre.SRE_Pattern object>

The Sarcoma session directory match pattern. The variations Visit_3, Visit3, visit3, S4V3 and S4_V3 all match Sarcoma Session03.

qipipe.staging.ohsu.SARCOMA_SUBJECT_REGEX = <_sre.SRE_Pattern object>

The Sarcoma subject directory match pattern.

qipipe.staging.ohsu.SARCOMA_T2_PAT = '*T2*'

The Sarcoma T2 DICOM directory match pattern.

qipipe.staging.ohsu.SESSION_REGEX_PAT = "\n (?: # Don't capture the prefix\n [vV]isit # The Visit or visit prefix form\n _? # with an optional underscore delimiter\n | # ...or...\n %s\\d+_?V # The alternate prefix form, beginning with\n # a leading collection abbreviation\n # substituted into the pattern below\n ) # End of the prefix\n (\\d+)$ # The visit number\n"

The session directory match pattern. This pattern must be specialized for each collection by replacing the %s place-holder with a string.
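
The specialization step can be sketched as follows. The pattern text is the SESSION_REGEX_PAT value shown above, laid out as a verbose regular expression. The 'BC?' abbreviation is an assumption chosen so that the Breast variations listed on this page match; the value actually used by qipipe is not shown here. visit_number is a hypothetical helper.

```python
import re

SESSION_REGEX_PAT = r"""
    (?:             # Don't capture the prefix
        [vV]isit    # The Visit or visit prefix form
        _?          # with an optional underscore delimiter
        |           # ...or...
        %s\d+_?V    # The alternate prefix form, beginning with
                    # a leading collection abbreviation
                    # substituted into the pattern below
    )               # End of the prefix
    (\d+)$          # The visit number
"""

# Assumed Breast abbreviation: B followed by an optional C.
BREAST_SESSION_REGEX = re.compile(SESSION_REGEX_PAT % 'BC?', re.VERBOSE)


def visit_number(dir_name):
    """Hypothetical helper which extracts the visit number, if any."""
    match = BREAST_SESSION_REGEX.match(dir_name)
    return int(match.group(1)) if match else None
```

The pattern must be compiled with re.VERBOSE, since it relies on embedded whitespace and comments.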

qipipe.staging.ohsu.T1_PAT = '*concat*'

The T1 DICOM directory match pattern.

qipipe.staging.ohsu.VOLUME_TAG = 'AcquisitionNumber'

The DICOM tag which identifies the volume. The OHSU QIN collections are unusual in that the DICOM images which comprise a 3D volume share both the same DICOM Series Number tag value and the same Acquisition Number tag value. The series numbers are increasing but non-consecutive integers, e.g. 9, 11, 13, ..., whereas the acquisition numbers are consecutive integers starting at 1. The Acquisition Number tag is therefore selected as the volume number identifier.

roi

OHSU - ROI utility functions.

TODO - move this to ohsu-qipipe.

class qipipe.staging.roi.LesionROI(lesion, volume_number, slice_sequence_number, location)

Bases: object

Aggregate with attributes lesion, volume, slice and location.

Parameters:
  • lesion – the lesion value
  • volume_number – the volume value
  • slice_sequence_number – the slice value
  • location – the location value
lesion = None

The lesion number.

location = None

The absolute BOLERO ROI .bqf file path.

slice = None

The one-based slice sequence number.

volume = None

The one-based volume number.

qipipe.staging.roi.PARAM_REGEX = <_sre.SRE_Pattern object>

The regex to parse a parameter file.

qipipe.staging.roi.iter_roi(regex, *in_dirs)

Iterates over the OHSU ROI .bqf mask files in the given input directories. This method is a LesionROI generator, e.g.:

>>> # Find .bqf files anywhere under /path/to/session/processing.
>>> next(iter_roi(r'.*\.bqf', '/path/to/session'))
{lesion: 1, slice: 12, path: '/path/to/session/processing/rois/roi.bqf'}

Parameters:
  • regex – the file name match regular expression
  • in_dirs – the ROI directories to search
Yield:the LesionROI objects

sort

qipipe.staging.sort.sort(collection, scan, *in_dirs)

Groups the DICOM files in the given location by volume.

Parameters:
  • collection – the collection name
  • scan – the scan number
  • in_dirs – the input DICOM directories
Returns:

the {volume: files} dictionary
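
The grouping itself can be sketched as below. To stay self-contained, the sketch takes (file name, volume number) pairs directly; reading the volume tag from each DICOM header is left out. group_by_volume is a hypothetical helper, not the qipipe implementation.

```python
from collections import defaultdict


def group_by_volume(tagged_files):
    """Hypothetical helper which groups (file, volume number) pairs."""
    volumes = defaultdict(list)
    for file_name, volume_number in tagged_files:
        volumes[volume_number].append(file_name)
    # Return a plain {volume: files} dictionary.
    return dict(volumes)
```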

sarcoma_config

qipipe.staging.sarcoma_config.CFG_FILE = '/home/docs/checkouts/readthedocs.org/user_builds/qipipe/checkouts/latest/qipipe/conf/sarcoma.cfg'

The Sarcoma Tumor Location configuration file. This file contains properties that associate the subject name with the location, e.g.:

Sarcoma004 = SHOULDER

The value is the SNOMED anatomy term.

qipipe.staging.sarcoma_config.sarcoma_config()
Returns:the sarcoma configuration
Return type:ConfigParser
qipipe.staging.sarcoma_config.sarcoma_location(subject)
Parameters:subject – the XNAT Subject ID
Returns:the subject tumor location
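
A lookup of this kind can be sketched with the standard library parser. The sketch uses the Python 3 configparser module, and the section name is an assumption, since the layout of sarcoma.cfg beyond the sample property is not shown here. location_for is a hypothetical helper.

```python
from configparser import ConfigParser

# Sample content; the section name is an assumption.
SAMPLE_CFG = """
[Tumor Location]
Sarcoma004 = SHOULDER
"""


def location_for(cfg, subject, section='Tumor Location'):
    """Hypothetical helper which returns the location, or None."""
    if cfg.has_option(section, subject):
        return cfg.get(section, subject)
    return None


cfg = ConfigParser()
cfg.read_string(SAMPLE_CFG)
```

Note that ConfigParser treats option names case-insensitively by default, so the subject name lookup does not depend on capitalization.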

staging_error