Hello,
I created a panel_meta.csv file in order to construct a cross-sample panel for calling indels. I did this using the format outlined in the scan2 wiki.
However, scan2 config, when run with the makepanel analysis, failed to read the donor information held in the panel_meta.csv file. To double-check that the BAM files were being read successfully, I ran scan2 config with the gatk call_mutations analysis instead, and this ran successfully. So I am inclined to think the problem lies in how the panel_meta.csv file is formatted.
FYI, the panel_meta.csv file looks like this:
cat ../01_Input/panel_metadata/PD64547c_panel_meta.csv
donor,sample,amp
PD64547,PD64547c_lo0007,PTA
PD64547,PD64547c_lo0045,PTA
PD64547,PD64547c_lo0088,PTA
PD45517,PD45517k,MDA
PD64547,PD64547a_lo0001,bulk
...and here is the error report from the verbose output:
Checking reference genome..
Checking dbSNP VCF..
Checking BAMs..
adding bulk
adding SC
adding bam
Checking panel metadata..
ERROR: some samples listed in --makepanel-metadata have no associated BAM file (specified by --bam, --sc-bam, or --bulk-bam). Please ensure that every BAM in the metadata table is present. Note that samples IDs are automatically determined from BAM files by the SM: tag in the first read group (@RG record).
Could you possibly share an example of the panel_meta.csv file used in your analyses? I was unable to locate it in the current scan2 repo.
Where might I find the base script responsible for reading the panel_meta.csv file to run the makepanel analyses? I understand that SM tags from the BAM files might be involved, and these are correctly identified in my data. I would like to know the specific parameters that the scan2 algorithm anticipates in order to format the panel_meta.csv appropriately.
FYI, the commands are run as follows:
# scan2 -d panel init
scan2 -d "$OUT_DIR" init
sc_args=()
for bam in "${sc_bams[@]}"; do
    sc_args+=(--sc-bam "$bam")
done
scan2 -d "$OUT_DIR" config \
    --verbose \
    --analysis makepanel \
    --regions-file "$REGIONS" \
    --ref "$REF" \
    --dbsnp "$DBSNP" \
    --phaser eagle \
    --eagle-refpanel "$EAGLE_REF" \
    --eagle-genmap "$EAGLE_GENMAP" \
    --makepanel-metadata "$PANEL_META/${sample}_panel_meta.csv" \
    --bulk-bam "$BULK" \
    "${sc_args[@]}" \
    --bam "$DONOR2"
scan2 -d "$OUT_DIR" validate
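For anyone trying to reproduce this, the comparison that the error message describes could be sketched roughly as below: take the SM: value from the first @RG record of each BAM header and list any entry in the sample column of the metadata CSV that has no match. The helper names first_rg_sm and missing_samples are hypothetical and not part of scan2, the sketch assumes samtools is available for printing headers, and it may not mirror how scan2 performs the check internally.

```shell
# Hypothetical helper: print the SM: value of the first @RG line
# of a SAM/BAM header supplied on stdin (prints nothing if absent).
first_rg_sm() {
    awk -F'\t' '
        /^@RG/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) { print substr($i, 4); exit }
            exit    # only the first @RG record is inspected
        }'
}

# Hypothetical helper: print every sample ID (column 2 of the metadata CSV,
# header row skipped) that is not among the SM values given as arguments.
missing_samples() {
    _meta=$1
    shift
    printf '%s\n' "$@" | awk -F, '
        NR == FNR { seen[$0] = 1; next }        # first input: SM values from stdin
        FNR > 1 {
            s = $2
            sub(/\r$/, "", s)                   # tolerate Windows line endings
            if (s != "" && !(s in seen)) print s
        }' - "$_meta"
}

# Example use with the variables from the commands above (needs bash and samtools):
#   sms=()
#   for bam in "$BULK" "${sc_bams[@]}" "$DONOR2"; do
#       sms+=("$(samtools view -H "$bam" | first_rg_sm)")
#   done
#   missing_samples "$PANEL_META/${sample}_panel_meta.csv" "${sms[@]}"
```

Any sample ID this prints is listed in the metadata table but would not be matched by an SM tag from the supplied BAMs; empty output means every listed sample has a match.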
Many thanks for sharing your experience with this.