Snakemake workflow: clinical DICOMs to BIDS


Description

A Snakemake workflow to convert a clinical DICOM directory into a BIDS-structured dataset.

Requirements

  • dcm2niix (v1.0.20200427)

  • python requirements (defined in workflow/envs/mapping.yaml):

    • dcmstack>=0.7.0

    • dicognito>=0.11.0

    • heudiconv>=0.8.0

    • pandas>=0.24.2

    • pydicom>=1.0.2

    • setuptools>=39.2.0

    • snakemake>=5.23.0

Input directory structure

The input DICOM directory should be set up as follows:

data/
├── dicoms/
│   └── <subject>/
│       ├── <sequence>/<dicom_files.dcm>
│       ├── <sequence>/<dicom_files.dcm>
│       └── <sequence>/<dicom_files.dcm>
└── output/
  • data: the main directory, which stores the input subject directories

  • dicoms: the directory that stores the source DICOM files

    • <subject>: the identifier for the subject, in the form sub-001, sub-002, etc.

    • <sequence>: the directory for a specific imaging sequence; it can be given any name

  • output: the directory that will store all outputs from the pipeline
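
If you want to verify this layout before running the pipeline, a minimal Python sketch such as the one below can help. It is not part of the workflow, and the "data" path passed at the bottom is an illustrative assumption:

import os

def check_input_layout(data_dir):
    # Walk data/dicoms/<subject>/<sequence>/ and report what was found.
    dicom_root = os.path.join(data_dir, "dicoms")
    for subject in sorted(os.listdir(dicom_root)):
        subject_dir = os.path.join(dicom_root, subject)
        if not (os.path.isdir(subject_dir) and subject.startswith("sub-")):
            print(f"skipping unexpected entry: {subject}")
            continue
        for sequence in sorted(os.listdir(subject_dir)):
            seq_dir = os.path.join(subject_dir, sequence)
            if os.path.isdir(seq_dir):
                n_files = len(os.listdir(seq_dir))
                print(f"{subject}/{sequence}: {n_files} files")

check_input_layout("data")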

Clinical event dates

One main feature of this clinical pipeline is that the final output can be organized around a clinical event. For instance, if the clinical event is an operation with preop, periop, and postop imaging, the output directory will contain three session folders:

  • ses-pre: for imaging data acquired prior to the clinical event

  • ses-peri: for imaging data acquired on the same day as the clinical event (i.e. intraoperative imaging)

  • ses-post: for imaging data acquired after the clinical event

To divide the imaging data based on a clinical event, the event date is required for each subject. The date should be defined in a tab-separated text file named clinical_events.tsv, which defines the clinical event date for all subjects.

Here is an example of what this file should look like:

subject event_date
sub-001 2014_09_28
sub-002 2018_03_26
sub-003 n/a

If the event date for a subject is unknown or not applicable, place n/a under event_date for that subject. In this case, the output BIDS directory will contain only the pre session for that subject (with all the imaging data stored there):

output/bids/sub-P001/
└── ses-pre/anat/...
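
As an illustration of how this file is consumed, the sketch below reads clinical_events.tsv with pandas and skips n/a entries. The helper name is hypothetical; the pipeline's real parsing lives in workflow/scripts/post_tar2bids/clean_sessions.py:

import datetime
import pandas as pd

def event_dates_for(subject, events_file="clinical_events.tsv"):
    # Read the tab-separated event file; pandas maps "n/a" to NaN by default.
    events = pd.read_csv(events_file, sep="\t", dtype=str)
    dates = events.loc[events["subject"] == subject, "event_date"]
    # Keep only real dates, parsed with the YYYY_MM_DD convention shown above.
    return [datetime.datetime.strptime(d, "%Y_%m_%d")
            for d in dates if isinstance(d, str)]

print(event_dates_for("sub-001"))  # [datetime.datetime(2014, 9, 28, 0, 0)]
print(event_dates_for("sub-003"))  # [] -> subject gets a single ses-pre session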

Configuration

Modify the configuration file

File paths

If you are running this pipeline locally, edit the config/config.yaml file to include the paths to the following:

| Variable | Description |
| -------- | ----------- |
| `dicom_dir` | full path to where the input DICOM directory is stored |
| `out_dir` | full path to where the pipeline should output the data |
| `clinical_event_file` **[optional]** | path to the `clinical_events.tsv` file with subject clinical event dates |
| `heuristic` | HeuDiConv template file to sort/name DICOMs according to the BIDS standard |
| `dcm_config` | used to modify the input parameters for dcm2niix |
| `anonymize` | whether to anonymize the DICOM files prior to storing them in tar archives (default = True) |
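
As a quick sanity check, the sketch below loads config/config.yaml with PyYAML and confirms the documented keys are present. Treating clinical_event_file as the only optional key is an assumption based on the table above:

import yaml  # PyYAML

REQUIRED = ["dicom_dir", "out_dir", "heuristic", "dcm_config", "anonymize"]
OPTIONAL = ["clinical_event_file"]

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

missing = [k for k in REQUIRED if k not in config]
if missing:
    raise KeyError(f"config/config.yaml is missing required keys: {missing}")
print({k: config.get(k) for k in REQUIRED + OPTIONAL})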

Clinical event session determination

How one medical centre acquires imaging around a clinical event will differ from another centre. Additionally, patients may experience multiple clinical events, which makes it harder to ensure that imaging studies are stored in the appropriate session directory for each event. Within the dicom2bids pipeline, the user can modify how the pre, peri, and post sessions are defined using the following settings in config/config.yaml:

| Variable | Description |
| -------- | ----------- |
| `peri` [default: 0] | number of days ± around the clinical event that will be deemed `peri` |
| `dur_multi_event` [default: -30] | in the event of multiple clinical events (i.e. repeat surgery), the maximum number of days before the subsequent clinical event that imaging studies will be deemed `pre` ¹ |
| `override_peri` [default: True] | in the event an imaging study is deemed `peri` but the heuristic file detects a post-clinical-event flag, it will be moved to `post` ² |

¹ The default of -30 means any imaging acquired in the 30 days before the subsequent clinical event will be stored in the `pre` session.

² This may occur if the clinical event and post-event imaging occur on the same day.
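
To make the date arithmetic concrete, here is a simplified single-event sketch of the labelling rule. It is illustrative only; the pipeline's actual implementation (including dur_multi_event and override_peri handling across multiple events) is in workflow/scripts/post_tar2bids/clean_sessions.py:

import datetime

def classify_session(scan_date, event_date, peri=0):
    # Days between acquisition and the clinical event (positive = after).
    delta = (scan_date - event_date).days
    if abs(delta) <= peri:
        return "peri"  # within +/- `peri` days of the event
    return "post" if delta > 0 else "pre"

event = datetime.datetime(2018, 3, 26)
print(classify_session(datetime.datetime(2017, 9, 15), event))  # pre
print(classify_session(datetime.datetime(2018, 3, 26), event))  # peri
print(classify_session(datetime.datetime(2018, 3, 27), event))  # post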

DICOM sort rules

DICOM files will first be sorted based on modality and acquisition date. The default DICOM sorting heuristic should work on most datasets but may need some adjustment. The tar archives will then be converted to NIfTI format and sorted based on the image sequence name. This heuristic will most likely need to be modified for your dataset.

dicom2tar sort rule

The default sort rule can be found in workflow/scripts/dicom2tar/sort_rules.py and is the function sort_rule_clinical. This sort rule separates the DICOM files based on image type (MRI/CT/fluoro) as well as acquisition date. Initially, image acquisitions occurring on different days are stored in separate tar archives. The tar archives are then given sequential numbering based on acquisition date.
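
Conceptually, the sort rule derives a grouping key from each file's DICOM header; the pydicom sketch below shows the kind of (date, modality) key involved. It is illustrative only; the real sort_rule_clinical is more involved and also distinguishes fluoroscopy:

import pydicom

def clinical_sort_key(dcm_path):
    # Read the header only; pixel data is not needed for sorting.
    ds = pydicom.dcmread(dcm_path, stop_before_pixels=True, force=True)
    study_date = getattr(ds, "StudyDate", "unknown")  # e.g. '20180326'
    modality = getattr(ds, "Modality", "unknown")     # e.g. 'MR' or 'CT'
    return study_date, modality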

tar2bids sort rule

The sort rule for HeuDiConv is called a heuristic. Refer to the detailed description in the HeuDiConv documentation of what a heuristic is and how to write your own. The default heuristic file within the dicom2bids workflow can be found in workflow/scripts/heudiconv/clinical_imaging.py.
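
For orientation, a heuristic is a Python file defining infotodict(seqinfo), conventionally with a create_key helper. The sketch below follows that shape; its series-description matching and output template are illustrative, not the contents of clinical_imaging.py:

def create_key(template, outtype=('nii.gz',), annotation_classes=None):
    if not template:
        raise ValueError('Template must be a valid format string')
    return template, outtype, annotation_classes

def infotodict(seqinfo):
    # Map each DICOM series onto a BIDS naming template.
    t1w = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_T1w')
    info = {t1w: []}
    for s in seqinfo:
        if 't1' in s.series_description.lower():
            info[t1w].append(s.series_id)
    return info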

Running locally

  1. Install Snakemake via pip:

    python -m pip install snakemake
    

    For installation details, see the instructions in the Snakemake documentation.

  2. Clone a copy of this repository to your system:

    git clone https://github.com/greydongilmore/clinical_dicom2bids_smk.git
    
  3. Install the Python dependencies by opening a terminal, changing to the project directory and running:

    python -m pip install -r requirements.txt
    

Local run

All the following commands should be run in the root of the project directory.

  1. Prior to running, you can perform a dry run:

    snakemake -n
    
  2. To locally run the pipeline, run the following:

    snakemake -j $N
    

    where $N specifies the number of cores to use.

Description of the pipeline

Repository structure

The repository has the following scheme:

├── README.md
├── workflow
│   ├── rules
│   │   └── dicom2bids.smk
│   ├── envs
│   │   └── mapping.yaml
│   ├── scripts
│   │   ├── dicom2tar            # sorts and stores the DICOMs into tar archives
│   │   │   ├── clinical_helpers.py
│   │   │   ├── DicomSorter.py
│   │   │   ├── main.py
│   │   │   └── sort_rules.py
│   │   ├── heudiconv            # heuristic file for clinical imaging
│   │   │   └── clinical_imaging.py
│   │   └── post_tar2bids        # refactors heudiconv output into final BIDS structure
│   │       └── clean_sessions.py
│   └── Snakefile
├── config
│   └── config.yaml
└── data                         # contains test input DICOMs

Rule 01: dicom2tar

| Variable | Description |
| -------- | ----------- |
| Overview | Sorts and stores the DICOM files into tar archives |
| Input    | MRI/CT DICOMs |
| Output   | MRI/CT tar archives |

The dicom2tar pipeline is modified from the [dicom2tar](https://github.com/khanlab/dicom2tar) master branch (version date: 16/07/2020). The code has been modified to fit the dicom2bids workflow.

First, the dicom2tar pipeline will create tar archives with the acquisition date embedded in the archive name:

output/
└── tars/
    ├── P185_2017_09_15_20170915_I5U57IAQF6Q0.46F51E3C_MR.tar
    ├── P185_2017_11_10_20171110_NW1NTJVCBOJH.F06BAD6C_MR.tar
    ├── P185_2018_03_26_20180326_76EGBNLGE0PA.06FE3A9D_CT.tar
    ├── P185_2018_03_26_20180326_K6S2I5FI1MB9.92C5EAF3_CT.tar
    └── P185_2018_03_27_20180327_9WK33LJUKNJP.F61DC193_CT.tar

The final dicom2tar output will sort the tar archives based on date and assign sequential session numbers, which are embedded in the archive name:

output/
└── tars/
    ├── P185_001_2017_09_15_20170915_I5U57IAQF6Q0.46F51E3C_MR.tar
    ├── P185_002_2017_11_10_20171110_NW1NTJVCBOJH.F06BAD6C_MR.tar
    ├── P185_003_2018_03_26_20180326_76EGBNLGE0PA.06FE3A9D_CT.tar
    ├── P185_004_2018_03_26_20180326_K6S2I5FI1MB9.92C5EAF3_CT.tar
    └── P185_005_2018_03_27_20180327_9WK33LJUKNJP.F61DC193_CT.tar
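
For downstream scripting, the fields embedded in these names can be recovered with a regular expression. The pattern below is a hypothetical example covering the MR/CT archives shown above:

import re

pattern = re.compile(
    r'(?P<subject>[^_]+)_'           # P185
    r'(?P<session>\d{3})_'           # 001 (sequential session number)
    r'(?P<date>\d{4}_\d{2}_\d{2})_'  # 2017_09_15 (acquisition date)
    r'.*_(?P<modality>MR|CT)\.tar$')

m = pattern.match('P185_001_2017_09_15_20170915_I5U57IAQF6Q0.46F51E3C_MR.tar')
print(m.groupdict())
# {'subject': 'P185', 'session': '001', 'date': '2017_09_15', 'modality': 'MR'}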

Rule 02: tar2bids

| Variable | Description |
| -------- | ----------- |
| Overview | Converts the tar archives into BIDS-compliant format |
| Input    | MRI/CT DICOM tar archives |
| Output   | MRI/CT NIfTI files stored in BIDS format |

The output from the rule tar2bids will be:

output/
└── bids_tmp/
    └── sub-P185/
        ├── ses-001/anat/...
        ├── ses-002/anat/...
        ├── ses-003/anat/...
        ├── ses-004/anat/...
        └── ses-005/anat/...

Rule 03: cleanSessions

| Variable | Description |
| -------- | ----------- |
| Overview | Reorganizes the HeuDiConv BIDS output into the final session(s) |
| Input    | BIDS directory with default session numbering |
| Output   | BIDS directory with clinically relevant session naming |

The output from the rule cleanSessions will be:

output/
└── bids/
    └── sub-P185/
        ├── ses-peri/anat/...
        ├── ses-post/anat/...
        └── ses-pre/anat/...

Code Snippets

script:
	"../scripts/dicom2tar/main.py"
shell:
	'heudiconv --files {input.tar} -o {params.bids} -f {params.heuristic_file} -c dcm2niix --dcmconfig {params.dcm_config} -b'
script:
	"../scripts/post_tar2bids/clean_sessions.py"
shell:
    "export FASTSURFER_HOME={params.fastsurfer_run} &&PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4096 {params.fastsurfer_run}/run_fastsurfer.sh \
    --t1 {input.t1} --sd {params.fastsurfer_out} --threads {params.threads} --vox_size {params.vox_size} --sid {params.subjid} --py {params.py} --viewagg_device cpu --fsaparc --parallel --allow_root"
shell:
    "export FASTSURFER_HOME={params.fastsurfer_run} &&PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4096 {params.fastsurfer_run}/run_fastsurfer.sh \
    --t1 {input.t1} --sd {params.fastsurfer_out} --sid {params.subjid} --py {params.py} --vox_size {params.vox_size} --viewagg_device cpu --fsaparc --no_cereb --parallel --ignore_fs_version --allow_root"
shell:
    "export FASTSURFER_HOME={params.fastsurfer_run} &&PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4096 {params.fastsurfer_run}/run_fastsurfer.sh \
    --t1 {input.t1} --sd {params.fastsurfer_out} --sid {params.subjid} --py {params.py} --run_viewagg_on cpu --fsaparc --parallel"
    shell:
        'export SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt && \
singularity run --cleanenv \
--bind {params.bids_dir}:/tmp/input \
--bind {params.out_dir}:/tmp/output \
--bind {params.license}:/tmp/{params.license_name} \
{params.fmriprep_img} /tmp/input  /tmp/output participant --skip_bids_validation \
--participant_label {params.sub} --anat-only \
--fs-license-file /tmp/{params.license_name} \
--bids-filter-file {params.bids_filter}'
workflow/scripts/dicom2tar/main.py:

import os
import logging
from dicognito.anonymizer import Anonymizer
import sort_rules
import DicomSorter
import pydicom
import clinical_helpers as ch

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s -%(message)s')

class Namespace:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def main():
    '''
    Use DicomSorter to sort or tar CFMM's DICOM data.

    input:
        dicom_dir: folder containing DICOM files (and/or compressed files: .zip/.tgz/.tar.gz/.tar.bz2)
        output_dir: output sorted or tar files to this folder
    '''

    args = Namespace(dicom_dir=snakemake.input.dicom, output_dir=snakemake.output.tar, clinical_scans=True, clinical_events = snakemake.params.clinical_events, log_dir=snakemake.params.log_dir, prefix=snakemake.params.prefix)

    logger = logging.getLogger(__name__)

    if not os.path.exists(args.dicom_dir):
        logger.error("{} does not exist!".format(args.dicom_dir))
        return False

    if not os.path.exists(args.output_dir):
        os.makedirs(args.output_dir)

    if snakemake.config['anonymize']:
        print('De-identifying imaging data for {}\n'.format(os.path.split(args.dicom_dir)[-1]))
        anonymizer = Anonymizer()
        for root, folders, files in os.walk(os.path.join(args.dicom_dir)):
            for file in files:
                fullpath = os.path.abspath(os.path.join(root,file))
                old_time={}
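                # dicognito replaces dates/times during anonymization; cache the
                # original acquisition timing tags so they can be restored below,
                # because the downstream session sorting depends on real scan dates.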
                with pydicom.dcmread(fullpath, force=True) as dataset:
                    old_time={'StudyDate': dataset.StudyDate if 'StudyDate' in dataset else None,
                              'SeriesDate': dataset.SeriesDate if 'SeriesDate' in dataset else None,
                              'StudyTime': dataset.StudyTime if 'StudyTime' in dataset else None,
                              'SeriesTime': dataset.SeriesTime if 'SeriesTime' in dataset else None,
                              'ContentDate': dataset.ContentDate if 'ContentDate' in dataset else None,
                              'ContentTime': dataset.ContentTime if 'ContentTime' in dataset else None,
                              'AcquisitionDate': dataset.AcquisitionDate if 'AcquisitionDate' in dataset else None
                              }
                    anonymizer.anonymize(dataset)
                    for key,val in old_time.items():
                        if val is not None:
                            dataset[key].value=val
                    dataset.save_as(fullpath)


    ######
    # CFMM sort rule
    ######
    try:
        if not args.clinical_scans:

            with DicomSorter.DicomSorter(sort_rules.sort_rule_CFMM, args) as d:
                # #######
                # # sort
                # #######
                # sorted_dirs = d.sort()
                # # logging
                # for item in sorted_dirs:
                #     logger.info("sorted directory created: {}".format(item))

                #######
                # tar
                #######
                # pi/project/study_date/patient/studyID_and_hash_studyInstanceUID
                tar_full_filenames = d.tar(5)
                # logging
                for item in tar_full_filenames:
                    logger.info("tar file created: {}".format(item))

            # ######
            # # demo sort rule
            # ######
            # with DicomSorter.DicomSorter(args.dicom_dir, sort_rules.sort_rule_demo, output_dir) as d:
            #     # sort
            #     sorted_dirs = d.sort()
            #     #logging
            #     for item in sorted_dirs:
            #         logger.info("sorted directory created: {}".format(item))

            #     # tar
            #     # patient_name/study_date/series_number/new_filename.dcm
            #     tar_full_filenames = d.tar(2)
            #     # logging
            #     for item in tar_full_filenames:
            #         logger.info("tar file created: {}".format(item))

        else:
            ######
            # Clinical sort rule
            ######
            logger.info("These are clinical scans.")

            with DicomSorter.DicomSorter(sort_rules.sort_rule_clinical, args) as d:
                # if os.path.exists(os.path.join(args.output_dir, 'errorInfo.tsv')):
                #     os.remove(os.path.join(args.output_dir, 'errorInfo.tsv'))
                # if os.path.exists(os.path.join(args.output_dir, 'or_dates.tsv')):
                #     os.remove(os.path.join(args.output_dir, 'or_dates.tsv'))
                # tar
                # study_date/patient/modality/series_number/new_filename.dcm
                tar_full_filenames = d.tar(4)

                # logging
                for item in tar_full_filenames:
                    logger.info("tar file created: {}".format(item))

            ch.tarSession(args)

    except Exception as e:
        logger.exception(e)

if __name__ == "__main__":

    main()
workflow/scripts/post_tar2bids/clean_sessions.py:

import os
import datetime
import pandas as pd
import numpy as np
import re
from collections import OrderedDict
import shutil


class Namespace:
	def __init__(self, **kwargs):
		self.__dict__.update(kwargs)

def sorted_nicely( l ):
	convert = lambda text: int(text) if text.isdigit() else text
	alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
	return sorted(l, key = alphanum_key)

def copytree(src, dst, symlinks=False, ignore=None):
	for item in os.listdir(src):
		s = os.path.join(src, item)
		d = os.path.join(dst, item)
		if os.path.isdir(s):
			shutil.copytree(s, d, symlinks, ignore)
		else:
			shutil.copy2(s, d)

def make_bids_filename(subject_id, session_id, suffix, prefix, task=None, acq=None, ce=None, rec=None, 
	dir=None, mod=None, echo=None, hemi=None, space=None,res=None,den=None,label=None,part=None,desc=None,run=None):
	if isinstance(session_id, str):
		if 'ses' in session_id:
			session_id = session_id.split('-')[1]

	order = OrderedDict([('ses', session_id),
						('task', task),
						('acq', acq),
						('ce', ce),
						('rec', rec),
						('dir', dir),
						('mod', mod),
						('echo', echo),
						('hemi', hemi),
						('space', space),
						('res', res),
						('den', den),
						('label', label),
						('part', part),
						('desc', desc),
						('run', run)])
	filename = []
	if subject_id is not None:
		filename.append(subject_id)
	for key, val in order.items():
		if val is not None:
			filename.append('%s-%s' % (key, val))

	if isinstance(suffix, str):
		filename.append(suffix)

	filename = '_'.join(filename)
	if isinstance(prefix, str):
		filename = os.path.join(prefix, filename)

	return filename

def make_bids_folders(subject_id, session_id, kind, output_path, make_dir, overwrite):
	path = []
	path.append(subject_id)

	if isinstance(session_id, str):
		if 'ses' not in session_id:
			path.append('ses-%s' % session_id)
		else:
			path.append(session_id)

	if isinstance(kind, str):
		path.append(kind)

	path = os.path.join(*path)  
	path = os.path.join(output_path, path)

	if make_dir == True:
		if not os.path.exists(path):
			os.makedirs(path)
		elif overwrite:
			shutil.rmtree(path)
			os.makedirs(path)

	return path

def main():
	output_dir=os.path.dirname(os.path.dirname(snakemake.input.touch_tar2bids))

	final_dir = os.path.join(output_dir, 'bids')
	if not os.path.exists(final_dir):
		os.mkdir(final_dir)

	#os.remove(snakemake.input.touch_tar2bids)
	sub_num=''.join([x for x in os.path.basename(snakemake.params.bids_fold) if x.isnumeric()])

	print('Converting subject {} ...'.format(os.path.basename(snakemake.params.bids_fold)))
	subject_event = []
	if os.path.exists(snakemake.params.clinical_events):
		event_dates = pd.read_csv(snakemake.params.clinical_events, sep='\t',dtype = str)
		subject_event = [datetime.datetime.strptime(x, '%Y_%m_%d') for x in [y for y in event_dates[event_dates['subject']==sub_num]['event_date'].values] if x is not np.nan]

	orig_sessions = sorted_nicely([x for x in os.listdir(snakemake.params.bids_fold) if os.path.isdir(os.path.join(snakemake.params.bids_fold, x)) and 'ses' in x])

	sessionDates = {'ses_num':[],'session':[]}
	for ises in orig_sessions:
		if not subject_event:
			# no clinical event date for this subject: everything is labelled 'pre'
			sessionDates['ses_num'].append(ises)
			sessionDates['session'].append('pre')
		else:
			scans_tsv = [x for x in os.listdir(os.path.join(snakemake.params.bids_fold, ises)) if x.endswith('scans.tsv')]
			scans_data = pd.read_table(os.path.join(snakemake.params.bids_fold, ises, scans_tsv[0]))
			idate = datetime.datetime.strptime(scans_data['acq_time'].values[0].split('T')[0], '%Y-%m-%d')
			dateAdded = False
			for ievent in subject_event:
				if dateAdded:
					if 'post' in sessionDates['session'][-1]:
						if abs((idate-ievent).days) <= snakemake.params.ses_calc['peri']:
							post_scan=False
							if snakemake.params.ses_calc['override_peri']:
								for root, folders, files in os.walk(os.path.join(snakemake.params.bids_fold, ises)):
									for file in files:
										if 'electrode' in file.lower():
											post_scan = True

							if post_scan:
								sessionDates['session'][-1]='post'
							else:
								sessionDates['session'][-1]='peri'

						elif snakemake.params.ses_calc['dur_multi_event'] < (idate-ievent).days < 0:
							sessionDates['session'][-1]='pre'
				else:
					sessionDates['ses_num'].append(ises)
					if (idate-ievent).days > 0:
						sessionDates['session'].append('post')
					elif abs((idate-ievent).days) <= snakemake.params.ses_calc['peri']:
						post_scan=False
						if snakemake.params.ses_calc['override_peri']:
							for root, folders, files in os.walk(os.path.join(snakemake.params.bids_fold, ises)):
								for file in files:
									if 'electrode' in file.lower():
										post_scan = True

						if post_scan:
							sessionDates['session'].append('post')
						else:
							sessionDates['session'].append('peri')

					elif (idate-ievent).days < 0:
						sessionDates['session'].append('pre')

					dateAdded=True

	sessionDates = pd.DataFrame.from_dict(sessionDates)
	isub = os.path.basename(snakemake.params.bids_fold)
	for ilabel in sessionDates.session.unique():
		sessions = sessionDates[sessionDates['session']==ilabel]['ses_num'].values
		scans_tsv_new = []
		for ises in sessions:
			scans_tsv = [x for x in os.listdir(os.path.join(snakemake.params.bids_fold, ises)) if x.endswith('scans.tsv')]
			scans_data = pd.read_table(os.path.join(snakemake.params.bids_fold, ises, scans_tsv[0]))
			scan_type = [x for x in os.listdir(os.path.join(snakemake.params.bids_fold, ises)) if os.path.isdir(os.path.join(snakemake.params.bids_fold, ises, x))]
			for iscan in scan_type:
				sub_path = make_bids_folders(isub, ilabel, iscan, final_dir, True, False)
				files = [x for x in os.listdir(os.path.join(snakemake.params.bids_fold, ises, iscan)) if os.path.isfile(os.path.join(snakemake.params.bids_fold, ises, iscan, x))]

				for ifile in files:

					key_dict={
						'task': [],
						'acq': [],
						'ce':[],
						'rec':[],
						'dir':[],
						'mod':[],
						'echo':[],
						'hemi':[],
						'space':[],
						'res':[],
						'den':[],
						'desc': [],
						'label':[],
						'part':[],
						'run':[],
					}

					key_dict['suffix']=ifile.split('_')[-1]

					for ikey in key_dict.keys():
						key_dict[ikey]=ifile.split(f'{ikey}-')[1].split('_')[0] if f'{ikey}-' in ifile else None

					key_dict['suffix']=ifile.split('_')[-1]
					key_dict['prefix']=sub_path

					new_file = make_bids_filename(isub, 'ses-'+ilabel, **key_dict)

					shutil.copy(os.path.join(snakemake.params.bids_fold, ises, iscan, ifile), new_file)
					os.chmod(new_file, 0o777)

					if iscan+'/'+ifile in scans_data['filename'].values:
						name_idx = [i for i,x in enumerate(scans_data['filename'].values) if x == iscan+'/'+ifile][0]
						data_temp = scans_data.iloc[name_idx,:].to_dict()
						data_temp['filename']=iscan+'/'+os.path.basename(new_file)

						scans_tsv_new.append(data_temp)

			sub_code_path = make_bids_folders(isub.split('-')[1], ilabel, 'info', os.path.join(final_dir,'.heudiconv'), True, False)
			copytree(os.path.join(os.path.dirname(snakemake.params.bids_fold), '.heudiconv', isub.split('-')[1], ises,'info'), sub_code_path)

		scans_file = make_bids_filename(isub, 'ses-' + ilabel, 'scans.json', os.path.dirname(sub_path))
		scans_json = [x for x in os.listdir(os.path.join(snakemake.params.bids_fold, ises)) if x.endswith('scans.json')]
		if scans_json:
			shutil.copy(os.path.join(snakemake.params.bids_fold, ises, scans_json[0]), scans_file)

		scans_file = make_bids_filename(isub, 'ses-'+ilabel, 'scans.tsv', os.path.dirname(sub_path))
		scans_tsv_new = pd.DataFrame(scans_tsv_new)
		scans_tsv_new.to_csv(scans_file, sep='\t', index=False, na_rep='n/a', lineterminator="\n")

	# Check to see if this is the last subject complete, copy main BIDS files if so
	check_status = [x for x in os.listdir(os.path.join(output_dir,'bids_tmp')) if os.path.isdir(os.path.join(output_dir, 'bids_tmp', x)) and not x.startswith('.')]
	if len(check_status)==1:
		bids_files = [x for x in os.listdir(os.path.join(output_dir, 'bids_tmp')) if os.path.isfile(os.path.join(output_dir, 'bids_tmp', x))]
		for ifile in bids_files:
			if ifile == 'participants.tsv':
				if os.path.exists(os.path.join(final_dir, ifile)):
					patient_tsv_old = pd.read_csv(os.path.join(output_dir, 'bids_tmp', 'participants.tsv'), sep='\t')
					patient_tsv = pd.read_csv(os.path.join(final_dir, 'participants.tsv'), sep='\t')
					# pd.concat is version-safe (DataFrame.append was removed in pandas 2.0)
					patient_tsv = pd.concat([patient_tsv, patient_tsv_old]).reset_index(drop=True)
				else:
					patient_tsv = pd.read_csv(os.path.join(output_dir, 'bids_tmp', 'participants.tsv'), sep='\t')

				patient_tsv = patient_tsv.sort_values(by=['participant_id']).reset_index(drop=True)
				patient_tsv['group'] = patient_tsv['group'].replace('control',snakemake.params.sub_group)
				patient_tsv.to_csv(os.path.join(final_dir, ifile), sep='\t', index=False, na_rep='n/a', lineterminator="\n")
			else:
				if not os.path.exists(os.path.join(final_dir, ifile)):
					shutil.copy(os.path.join(output_dir, 'bids_tmp', ifile), os.path.join(final_dir, ifile))

		shutil.rmtree(os.path.join(output_dir, 'bids_tmp'))
	else:
		shutil.rmtree(os.path.join(output_dir, 'bids_tmp',isub))

if __name__ == "__main__":

	main()
