Hi Flow team,
Could you add the “Arabidopsis thaliana” genome to Flow?
It’d be useful to be able to import or upload any new genome. Thanks.
Kind regards,
Paulo
Hi Paulo
We have added Arabidopsis thaliana to Flow. As you can see, the RNA-Seq genome worked well: Flow
However, there are some errors with preparing the genome for CLIP that will need to be resolved, because the GTF file is non-standard and contains no “biotype” tags for genes: Flow
If you have a working GTF file, please do share it with us, as that will speed things up for you.
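For a quick check before sharing a file, a minimal Python sketch along these lines could count how many gene records lack a biotype attribute. The attribute names `gene_biotype` and `biotype` are assumptions based on common Ensembl-style GTFs, and the function name is made up for illustration; this is not how Flow or iCount perform the check.

```python
import re

# Assumed attribute names; Ensembl-style GTFs usually use gene_biotype.
BIOTYPE_KEYS = ("gene_biotype", "biotype")


def count_genes_missing_biotype(lines, keys=BIOTYPE_KEYS):
    """Return (total_genes, genes_missing_biotype) for an iterable of GTF lines."""
    # Match a key only at the start of the attribute column or right after ';',
    # so that e.g. transcript_biotype is not mistaken for biotype.
    names = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r"(?:^|;)\s*(?:" + names + r")\s+")
    total = 0
    missing = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # only gene records are inspected
        total += 1
        if not pattern.search(cols[8]):
            missing += 1
    return total, missing
```

It accepts any iterable of lines, so an open file handle for the GTF can be passed in directly.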
Best,
Charlotte
Hi Charlotte,
Thank you for adding this genome, and for already trying to prepare it for CLIP.
I will do as suggested and obtain a GTF file containing the “biotype” tag for genes.
Best,
Paulo
Hi Charlotte,
I’ve uploaded a GTF file for Arabidopsis, and I can see it has the “biotype” tag for genes.
Could you check if this file can be used to generate a genome for CLIP? Thank you.
Kind regards,
Paulo
Hi Paulo
Thanks for this! It is better, but it is still causing an issue with iCount segment that I can’t figure out: Flow
The exact error is:
Executing the following command: iCount segment Arabidopsis_thaliana.TAIR10.59_bracketsremoved.cmd.gtf Arabidopsis_thaliana_seg.gtf Arabidopsis_thaliana.TAIR10.dna_sm.toplevel.fa.fai
Input parameters for function 'get_segments' in iCount.genomes.segment
annotation: Arabidopsis_thaliana.TAIR10.59_bracketsremoved.cmd.gtf
segmentation: Arabidopsis_thaliana_seg.gtf
fai: Arabidopsis_thaliana.TAIR10.dna_sm.toplevel.fa.fai
report_progress: False
[ValueError] need more than 1 value to unpack
File "/usr/local/lib/python3.9/site-packages/iCount/cli.py", line 448, in main
result_object = func(**args)
File "/usr/local/lib/python3.9/site-packages/iCount/genomes/segment.py", line 1015, in get_segments
for gene_content in _get_gene_content(annotation, chromosomes, report_progress):
File "/usr/local/lib/python3.9/site-packages/iCount/genomes/segment.py", line 906, in _get_gene_content
if interval.attrs['gene_id'] == current_gene:
File "pybedtools/cbedtools.pyx", line 392, in pybedtools.cbedtools.Interval.attrs.__get__
File "pybedtools/cbedtools.pyx", line 180, in pybedtools.cbedtools.Attributes.__init__
I checked: every GTF line has a gene_id, and no gene_id appears on only one line.
This will require some deeper investigation, I’m afraid.
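Since the traceback ends inside pybedtools' attribute parsing rather than in iCount's gene_id comparison, one thing that might be worth checking is whether any attribute column contains a fragment that does not split into a key and a value, for example a semicolon inside a quoted value or a key with no value. The sketch below is a minimal Python check under that assumption. It mimics a naive split on `;` and whitespace, it is not pybedtools' actual code, and the function name is hypothetical.

```python
def find_unsplittable_attributes(lines):
    """Yield (line_number, fragment) for attribute fragments lacking 'key value'."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            yield number, "<fewer than 9 columns>"
            continue
        for fragment in cols[8].split(";"):
            fragment = fragment.strip()
            if not fragment:
                continue  # empty piece after a trailing semicolon
            # A well-formed GTF attribute is a key, whitespace, then a value.
            if len(fragment.split(None, 1)) < 2:
                yield number, fragment
```

Any line it reports would be a candidate for the "need more than 1 value to unpack" error, though other causes are possible.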
Best,
Charlotte
Hi Charlotte,
Thank you for the troubleshooting. I was wondering if you could try this GFF3 file instead?
Best,
Paulo