CLIP-seq genome error

Dear Flow Team,

I am trying to run the prepare CLIP-seq genome workflow for the Human GRCh37 genome and am facing errors at the CLIPSEQ_FILTER_GTF step.
I am running this with the latest version of the pipeline, so I was wondering whether this is expected or whether I need to select an older version of the workflow?

Thank you,
Fiona

Hi Fiona,

Which annotation version are you using? I think we have found v105 of Ensembl annotation to be problematic.

Can you please post the log of the process and an error trace?

Otherwise, we are working on a new release, due in the coming weeks, that will either streamline filtering so that it works for all annotations or make filtering optional.
Klara

Hi Klara,

Thank you for your reply.
I am not sure which Ensembl annotation it is, as it’s just the default pipeline for the Human GRCh37 genome. I can’t find any details of this in the data parameters.
It is failing at the filter GTF stage though, so maybe this is the problem.
Here is the log:

The run is called “jovial_gauss” if that is helpful too.
Let me know if you need any more info!
Best wishes,
Fiona

Hi Fiona!

I reproduced your error: the filtering is failing because this version of the annotation GTF (Homo_sapiens.GRCh37.87.gtf) does not contain the “transcript_support_level” flag.
I will coordinate with the developer team to implement a fix for this.
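
If it helps to check other annotation files before uploading them, here is a minimal sketch in Python that scans the attributes column (column 9) of a GTF for a given key. The function name has_gtf_attribute is hypothetical and not part of the pipeline, and the simple split on ";" assumes attribute values do not themselves contain semicolons.

```python
import gzip


def has_gtf_attribute(gtf_path, attribute="transcript_support_level"):
    """Return True if any record in the GTF carries the given attribute key.

    Hypothetical helper for illustration; not part of the CLIP-seq pipeline.
    """
    # Handle both plain and gzip-compressed GTF files.
    opener = gzip.open if str(gtf_path).endswith(".gz") else open
    with opener(gtf_path, "rt") as handle:
        for line in handle:
            # Skip header/comment lines.
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            # A GTF record has 9 tab-separated columns; skip anything else.
            if len(fields) < 9:
                continue
            # Column 9 holds 'key "value";' pairs (assumes no ';' inside values).
            for pair in fields[8].split(";"):
                key = pair.strip().split(" ", 1)[0]
                if key == attribute:
                    return True
    return False
```

For example, has_gtf_attribute("Homo_sapiens.GRCh37.87.gtf") reads the file until the first record carrying the key, or to the end if there is none.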

Nevertheless, you can still run a clipseq pipeline, even without the filtered annotation. You just need to substitute all files based on “filtered gtf” with the files based on unfiltered GTF.

To run a clipseq pipeline this way, select your “prepare genome” execution (the one with the failed FILTER_GTF process). The files that exist will be auto-filled, and you can specify the missing files manually, like so:

  • Filtered GTF: Homo_sapiens.GRCh37.87_bracketsremoved.cmd.gtf (this is the unfiltered annotation file)

  • Segmented filtered GTF: Homo_sapiens_seg.gtf (Same file as is entered automatically for Segmented GTF)

  • Segmented resolved filtered GTF: Homo_sapiens_seg.gtf (Same file as is entered automatically for Segmented GTF)

  • Segmented resolved genic filtered GTF: Homo_sapiens_seg.gtf (Same file as is entered automatically for Segmented GTF)

  • Filtered regions GTF: Homo_sapiens_regions.gtf.gz (Same file as is entered automatically for Regions GTF)

  • Filtered resolved regions GTF: Homo_sapiens_regions.gtf.gz (Same file as is entered automatically for Regions GTF)

  • Filtered resolved regions genic GTF: Homo_sapiens_regions.gtf.gz (Same file as is entered automatically for Regions GTF)

Hi Klara,
Great, thank you for taking a look. I can certainly re-run the analysis as you suggest, and hopefully that will sort the problem in the meantime.
Best wishes,
Fiona

Hi Klara,
I have tried to re-run as you suggested, but for the last three parameters I could not see the Homo_sapiens_regions.gtf.gz files in the dropdown list, and there is no way to type a filename in, so I have used the following:


Do you think this is sensible?
Thank you, Fiona

Hi Klara - do you know if any hg37 GTFs have the “transcript_support_level” flag? If so, I could upload one for use.