Quick Start Guide
For those of you who are eager to get started, this quick start guide covers the following basic operations on assembly graphs.
- Load an assembly graph
- Get basic graph-based statistics
- Get basic sequence-based statistics
- Get paths
- Query neighbors of a segment
- Get the adjacency matrix
- Query segment sequences
- Plot the graph
We will use the following example assembly graph.
H VN:Z:1.0
S 1 ACGTACGT
S 2 CGTACGTA
S 3 GTACGTAG
S 4 GTACGTAC
S 5 TACGTACC
L 1 + 2 + 7M
L 2 + 3 + 7M
L 2 + 4 + 7M
L 3 + 5 + 7M
L 4 + 5 + 7M
P path1 1+,2+,3+,5+ 7M,7M,7M
P path2 1+,2+,4+,5+ 7M,7M,7M
If you visualise this graph using Bandage, it will look as follows.

from agtools.core.unitig_graph import UnitigGraph
import igraph as ig
# Load the unitig-level assembly graph from the GFA file
ug = UnitigGraph.from_gfa("assembly_graph.gfa")
# Print basic graph-based statistics
print(f"Number of vertices (segments): {ug.vcount}")
print(f"Number of edges (links): {ug.ecount}")
print(f"Number of self-loops (repeats): {len(ug.self_loops)}")
print(f"Average node degree: {ug.calculate_average_node_degree()}")
print(f"Number of connected components: {len(ug.get_connected_components())}")
# Print basic sequence-based statistics
print(f"Total length of segments: {ug.calculate_total_length()}")
print(f"Average length of segments: {ug.calculate_average_segment_length()}")
print(f"GC content of segments: {ug.get_gc_content()}")
print(f"N50 and L50: {ug.calculate_n50_l50()}")
# Print oriented links
for from_link in ug.oriented_links:
for to_link in ug.oriented_links[from_link]:
for orient in ug.oriented_links[from_link][to_link]:
print(ug.segment_names[from_link], orient[0], "->", ug.segment_names[to_link], orient[1])
# Print paths
for path in ug.path_index:
# Path name: path string, path overlaps
print(path, ":", ug.get_path(path))
# Get neighbours of a segment
print(f"Neighbours of segment 5: {ug.get_neighbors('5')}")
# Print adjacency matrix
print("Adjacency matrix:")
print(ug.get_adjacency_matrix())
# Print segments
for seg_id in range(len(ug.segment_names)):
# internal segment ID, segment name, segment sequence, segment length
print(seg_id, ug.segment_names[seg_id], ug.get_segment_sequence(ug.segment_names[seg_id]), ug.segment_lengths[ug.segment_names[seg_id]])
# Plot the graph using igraph
ig.plot(
ug.graph, # graph object
"graph_plot.png", # file name
vertex_label=ug.graph.vs["name"], # label names
vertex_size=40, # vertex size
vertex_frame_width=2.0, #vertex frame width
vertex_label_size=20.0, # vertex label size
)
The plotted graph is shown below.
