Renaming elements of an assembly graph
agtools can rename segment IDs, path IDs, and walk IDs by prepending a given prefix. You can use the rename subcommand provided through the command-line interface. Please refer to the CLI reference for further details on the rename subcommand.
Here is an example GFA file.
# GFA file: 10 segments (20 bp) and 1 short segment (10 bp: seqX)
H VN:Z:1.0
# Segments
S seq1 ATGCGTATGCGTATGCGTAA
S seq2 CGTACTGACTGACTGACTGA
S seq3 TGCATGCATGCATGCATGCA
S seq4 ACGTACGTACGTACGTACGT
S seq5 GTACGTACGTACGTACGTAC
S seq6 CGTACGTACGTACGTACGTA
S seq7 TACGTACGTACGTACGTACG
S seq8 ATCGATCGATCGATCGATCG
S seq9 GCGTGCGTGCGTGCGTGCGT
S seq10 TTGCTTGCTTGCTTGCTTGC
S seqX ACGTACGTAC
L seqX + seq2 + 5M
L seq4 + seq5 + 10M
L seq6 - seq7 + 7M
L seq9 + seq10 - 5M
J seq5 + seqX - *
J seq3 - seq8 + *
J seq7 + seq2 + *
C seq1 + seqX - 5 10M
C seq2 + seqX + 2 10M
P seqpath1 seq1+,seqX+,seq3- 20M,10M,20M
P seqpath2 seq4+,seq5+,seq6+ 20M,20M,20M
W seqread1 0 * <seq6<seqX>seq9
W seqread2 0 * <seq1>seq2<seq3
You can run the following command to rename the elements using the prefix TEST.
agtools rename -g test_graph.gfa -p TEST -o results/renamed_graph.gfa
The renamed graph will look as follows.
# GFA file: 10 segments (20 bp) and 1 short segment (10 bp: seqX)
H VN:Z:1.0
# Segments
S TEST_seq1 ATGCGTATGCGTATGCGTAA
S TEST_seq2 CGTACTGACTGACTGACTGA
S TEST_seq3 TGCATGCATGCATGCATGCA
S TEST_seq4 ACGTACGTACGTACGTACGT
S TEST_seq5 GTACGTACGTACGTACGTAC
S TEST_seq6 CGTACGTACGTACGTACGTA
S TEST_seq7 TACGTACGTACGTACGTACG
S TEST_seq8 ATCGATCGATCGATCGATCG
S TEST_seq9 GCGTGCGTGCGTGCGTGCGT
S TEST_seq10 TTGCTTGCTTGCTTGCTTGC
S TEST_seqX ACGTACGTAC
L TEST_seqX + TEST_seq2 + 5M
L TEST_seq4 + TEST_seq5 + 10M
L TEST_seq6 - TEST_seq7 + 7M
L TEST_seq9 + TEST_seq10 - 5M
J TEST_seq5 + TEST_seqX - *
J TEST_seq3 - TEST_seq8 + *
J TEST_seq7 + TEST_seq2 + *
C TEST_seq1 + TEST_seqX - 5 10M
C TEST_seq2 + TEST_seqX + 2 10M
P TEST_seqpath1 TEST_seq1+,TEST_seqX+,TEST_seq3- 20M,10M,20M
P TEST_seqpath2 TEST_seq4+,TEST_seq5+,TEST_seq6+ 20M,20M,20M
W TEST_seqread1 0 * <TEST_seq6<TEST_seqX>TEST_seq9
W TEST_seqread2 0 * <TEST_seq1>TEST_seq2<TEST_seq3