Dear Flow Team,
I am trying to run the prepare CLIP-seq genome workflow for the Human GRCh37 genome and am facing errors at the CLIPSEQ_FILTER_GTF step.
I am running this with the latest version of the pipeline. Is that OK, or do I need to select an older version of the workflow?
Thank you,
Fiona
Hi Fiona,
Which annotation version are you using? I think we have found v105 of the Ensembl annotation to be problematic.
Can you please post the log of the process and an error trace?
Otherwise, we are working on a new release, due in the coming weeks, that either streamlines filtering so it works for all annotations or makes filtering optional.
Klara
Hi Klara,
Thank you for your reply.
I am not sure which Ensembl annotation it is, as it’s just the default pipeline for the Human GRCh37 genome. I can’t find any details of this in the data parameters.
It is failing at the filter GTF stage though, so maybe this is the problem.
Here is the log:
The run is called “jovial_gauss” if that is helpful too.
Let me know if you need any more info!
Best wishes,
Fiona
Hi Fiona!
I reproduced your error. The filtering is failing because this version of the annotation GTF (Homo_sapiens.GRCh37.87.gtf) does not contain the “transcript_support_level” flag.
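For anyone who wants to check their own annotation for this flag, here is a minimal Python sketch. It assumes a standard nine-column, tab-separated GTF in which the flag, if present, appears in the attributes column as transcript_support_level "value". The function names open_gtf and count_tsl are made up for illustration and are not part of the pipeline.

```python
import gzip
import re
from collections import Counter

# Assumption: the attribute is written as: transcript_support_level "<value>";
TSL_PATTERN = re.compile(r'transcript_support_level "([^"]*)"')


def open_gtf(path):
    """Open a plain or gzip-compressed GTF file in text mode (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_tsl(lines):
    """Count transcript records with and without a transcript_support_level attribute."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # Only look at well-formed 'transcript' feature rows.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        key = "with_tsl" if TSL_PATTERN.search(fields[8]) else "without_tsl"
        counts[key] += 1
    return counts
```

To use it, open the file with open_gtf("Homo_sapiens.GRCh37.87.gtf") and pass the handle to count_tsl. If the "with_tsl" count is zero, the annotation has no transcript support level information for the filtering step to use.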
I will coordinate with the developer team to implement a fix for this.
Nevertheless, you can still run a clipseq pipeline, even without the filtered annotation. You just need to substitute the files based on the unfiltered GTF for all files based on the “filtered gtf”.
To run a clipseq pipeline this way, select your “prepare genome” execution, even though its FILTER_GTF process failed. The files that exist will be filled in automatically, and you can specify the missing files manually, like so:
- Filtered GTF: Homo_sapiens.GRCh37.87_bracketsremoved.cmd.gtf (this is the unfiltered annotation file)
- Segmented filtered GTF: Homo_sapiens_seg.gtf (same file as is entered automatically for Segmented GTF)
- Segmented resolved filtered GTF: Homo_sapiens_seg.gtf (same file as is entered automatically for Segmented GTF)
- Segmented resolved genic filtered GTF: Homo_sapiens_seg.gtf (same file as is entered automatically for Segmented GTF)
- Filtered regions GTF: Homo_sapiens_regions.gtf.gz (same file as is entered automatically for Regions GTF)
- Filtered resolved regions GTF: Homo_sapiens_regions.gtf.gz (same file as is entered automatically for Regions GTF)
- Filtered resolved regions genic GTF: Homo_sapiens_regions.gtf.gz (same file as is entered automatically for Regions GTF)
Hi Klara,
Great, thank you for taking a look. I can certainly re-run the analysis as you suggest, and hopefully that will sort the problem in the meantime.
Best wishes,
Fiona
Hi Klara,
I have tried to re-run as you suggested, but for the last three parameters I could not see the Homo_sapiens_regions.gtf.gz files in the dropdown list, and there is no way to type a file name in, so I have used the following:
Do you think this is sensible?
Thank you, Fiona
Hi Klara - do you know if any GRCh37 GTFs have the “transcript_support_level” flag? If so, I could upload one to use.