<p dir="ltr"><i>From</i> “A Reference-Guided Iterative Approach to Polish the Nanopore Sequencing Basecalling for Therapeutic RNA Quality Control”</p><p dir="ltr">We provide a comprehensive summary of basecalling performance across different models used in this study. For each read processed by the basecalling pipeline, we collected detailed metadata and alignment metrics to facilitate downstream benchmarking and quality assessment.</p><p dir="ltr"><i>ReDATA Curator's Note: the list of fields given below is incomplete and may not apply to all files.</i></p><p dir="ltr">Each record in the summary table includes the following information:</p><ul><li><b>File and Run Metadata</b>:</li><li><ul><li><code>filename</code>, <code>read_id</code>, <code>run_id</code>, <code>batch_id</code>, <code>channel</code>, <code>mux</code>, <code>start_time</code>, <code>duration</code>, <code>num_events</code></li></ul></li><li><b>Filtering and Signal Statistics</b>:</li><li><ul><li><code>passes_filtering</code> (whether the read passed the quality threshold),</li><li><code>template_start</code>, <code>num_events_template</code>, <code>template_duration</code>,</li><li><code>sequence_length_template</code>, <code>mean_qscore_template</code>, <code>strand_score_template</code>,</li><li><code>median_template</code>, <code>mad_template</code> (median absolute deviation),</li><li><code>scaling_median_template</code>, <code>scaling_mad_template</code></li></ul></li><li><b>Alignment Information</b> (reference-guided):</li><li><ul><li><code>alignment_genome</code>, <code>alignment_direction</code>,</li><li><code>alignment_genome_start</code>, <code>alignment_genome_end</code>,</li><li><code>alignment_strand_start</code>, <code>alignment_strand_end</code>,</li><li><code>alignment_num_insertions</code>, <code>alignment_num_deletions</code>,</li><li><code>alignment_num_aligned</code>, <code>alignment_num_correct</code>,</li><li><code>alignment_identity</code> (number of matched bases / 
aligned length),</li><li><code>alignment_accuracy</code> (matched bases / read length),</li><li><code>alignment_score</code> (alignment quality score reported by the aligner),</li><li><code>alignment_coverage</code> (aligned length / reference length),</li><li><code>alignment_mapping_quality</code> (MAPQ),</li><li><code>alignment_num_alignments</code>, <code>alignment_num_secondary_alignments</code>, <code>alignment_num_supplementary_alignments</code></li></ul></li></ul><p dir="ltr">This basecalling summary enables a side-by-side comparison of multiple models (e.g., Guppy, Bonito, Dorado, and our iterative framework), showing how read quality, alignment accuracy, and coverage change after iterative refinement. All basecalling metadata are included as supplementary tables, and each read can be traced across iterations by its <code>read_id</code>.</p><hr><p dir="ltr"><i>For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests or trouble downloading) can be directed to data-management@arizona.edu.</i></p>
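<p dir="ltr">As an illustration of the per-read comparison described above, the following is a minimal sketch and not the authors' pipeline. It assumes that each summary table is a tab-delimited text file with a header row and that it contains the <code>read_id</code>, <code>alignment_num_aligned</code>, and <code>alignment_num_correct</code> columns; the file layout is an assumption, and some files may lack these fields. The helper names <code>load_identity</code> and <code>compare_models</code> are hypothetical, and the toy rows are invented for illustration only.</p>

```python
import csv
import io
from statistics import mean


def load_identity(handle, delimiter="\t"):
    """Map read_id to alignment_num_correct / alignment_num_aligned.

    Assumption: the table is delimited text with a header row.
    Rows with missing, non-integer, or non-positive counts are skipped.
    """
    identity = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        try:
            read_id = row["read_id"]
            aligned = int(row["alignment_num_aligned"])
            correct = int(row["alignment_num_correct"])
        except (KeyError, TypeError, ValueError):
            continue  # field absent or not an integer in this file
        if aligned > 0:
            identity[read_id] = correct / aligned
    return identity


def compare_models(baseline, refined):
    """Return (number of shared reads, mean identity change) over shared read_ids."""
    shared = sorted(set(baseline) & set(refined))
    if not shared:
        return 0, 0.0
    deltas = [refined[r] - baseline[r] for r in shared]
    return len(shared), mean(deltas)


# Toy tables with invented values, standing in for two summary files.
HEADER = "read_id\talignment_num_aligned\talignment_num_correct\n"
toy_baseline = io.StringIO(HEADER + "r1\t100\t90\nr2\t100\t80\nr3\t-1\t-1\n")
toy_refined = io.StringIO(HEADER + "r1\t100\t95\nr2\t100\t90\n")

n_shared, mean_delta = compare_models(
    load_identity(toy_baseline), load_identity(toy_refined)
)
print(n_shared, round(mean_delta, 3))
```

<p dir="ltr">With real files, the <code>io.StringIO</code> objects would be replaced by handles from <code>open(path, newline="")</code>. Joining on <code>read_id</code> restricts the comparison to reads present in both tables, so reads that align under only one model are not counted in the mean change.</p>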