Demultiplexing error - Ultraplex

@sam can you check the hash of the fastqs used in this execution?

Cheers
Charlotte

Thank you, Charlotte.

I am trying the alternative of uploading the files via the command line, as described in the documentation.

How can one upload the multiplexed R2 file, linking it to the R1 file?

Kind regards,
p

The MD5 is different for the files in the most recent execution you linked above:

6859a4ee513176e693c84985723dff57 CLIP-seq_R1_001.fastq.gz
11d9d9d454eefdc4f2c30e0b3a3ce9e1 CLIP-seq_R2_001.fastq.gz

This is very bizarre to me. For reference, the upload is done in chunks, and the chunk numbers are always checked to ensure none has been skipped or sent in the wrong order, so the alteration must be happening within a chunk. But transmitting a chunk correctly is handled by the browser itself, not Flow, so it is very strange that this happens so consistently.
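To illustrate the point, here is a minimal sketch of the kind of chunk-index check described above (hypothetical code, not Flow's actual implementation): it catches skipped or reordered chunks, but says nothing about the bytes inside each chunk.

```python
def receive_chunks(chunks):
    """Reassemble an upload from (index, data) pairs, rejecting any skipped
    or out-of-order chunk. Note this cannot detect corruption *within* a
    chunk - consistent with the behaviour described in this thread."""
    expected = 0
    parts = []
    for index, data in chunks:
        if index != expected:
            raise ValueError(f"chunk {index} arrived, expected chunk {expected}")
        parts.append(data)
        expected += 1
    return b"".join(parts)
```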

I checked the entire database to see if this has occurred in any previous uploads by looking for the CRC error message, and found 19 such instances - all of them in data owned by you. For some reason it's only your data this happens to, which I appreciate will be of little consolation, but it may help us work out what is happening.

Is the data all from the same source? Or all uploaded from the same machine, or via the same browser? These all seem unlikely sources of the issue, but if we can isolate what they all had in common we may be able to work out what has happened. I am not trying to evade responsibility by checking this - it is still possible that Flow is doing something wrong when it encounters something specific to your data, we just need to figure out what.

To answer the question about the command-line upload, the flowbio library doesn’t currently let you pair data like the frontend does, but if you upload the first file with it, I can check the MD5 and we can see if uploading it this way does solve the problem - that would tell us a lot.

I appreciate your patience as we figure this out.

Hi Sam,

Thanks for the help.
I have uploaded the R1 and R2 files via the command line, and was able to run the Flow demultiplexing pipeline successfully. The R1 and R2 files were not linked as paired-end, but Ultraplex ran without errors on either R1 alone or on R1/R2 concatenated as single-end files.

Out of curiosity, could you check the MD5 for these most recent R1 and R2 files?

Could the issue be with ‘linking the R2 to the R1 file (as paired-end)’ on the Flow interface? Or perhaps some issue with the browser I am using, as you mention.

Thanks,
Paulo

Hi Paulo,

I checked the files uploaded via the command line, and the MD5s match what you have - they are uncorrupted. I downloaded one of them myself and re-uploaded it via the browser, and it also arrived without any corruption, which does suggest something specific to your browser. The pattern of corruption is very odd: in about 0.5% of uploaded chunks, the byte at position 65389 (always that byte) is spliced out and reinserted at a random point later in the chunk, meaning the chunk size stays the same but the middle section is misaligned. I’ve never encountered that before.
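To make the symptom concrete, here is a small sketch that reproduces the described splice on a chunk (illustrative code only, not anything from Flow): the length and byte counts are unchanged, which is why size-based chunk checks pass while the MD5 differs.

```python
import random

def splice_corrupt(chunk, pos=65389, rng=None):
    """Remove the byte at `pos` and reinsert it at a random later offset,
    as described in this thread. The chunk length is preserved, so
    size-based integrity checks cannot catch it."""
    rng = rng or random.Random()
    if len(chunk) <= pos:
        return chunk  # chunk too short to be affected
    data = bytearray(chunk)
    byte = data.pop(pos)
    insert_at = rng.randrange(pos, len(data) + 1)
    data.insert(insert_at, byte)
    return bytes(data)
```

Running this over ~0.5% of chunks of an otherwise intact file would produce exactly the observed behaviour: a same-sized upload with a different MD5.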

Could you tell me what browser you are using to upload the data? Any browser extension you can think of which might impact this?

In any case, uploading the files via the command line should resolve the issue for you. You can link the two files after they are uploaded by clicking ‘edit data’ on the first file’s page, and choosing the second file to link to.

Sam

Hi Sam,

Thank you for checking this thoroughly.
The browser is Google Chrome. I don’t believe I have any extension that could have an impact of this sort, and I also used Chrome when uploading data to the iMaps platform, where it went fine. I don’t always get an error when uploading to Flow either - it happened with these files, and with another fastq called CAP…fastq.gz, so it is very odd.

We now understand the issue. I will upload via the command line and link the files as you describe, thank you.

Kind regards,
Paulo