GO3

GO3 is a high-performance Python library (Rust backend) for Gene Ontology semantic similarity.

Why GO3?

Existing tools — GOATOOLS (Python), FastSemSim (Python), GOSemSim (R), simona (R), and TaxaGO (Rust CLI) — cover term-level semantic similarity, but many common operations in GO-based analyses (comparing sets of terms, gene-level similarity, distance matrices, embeddings) require writing glue code or switching between languages. GO3 brings all of these into a single Python library:

  • Term-level similarity — 8 methods (IC-based, topological, and hybrid) in one place.

  • Term-set and gene-level similarity — compare two sets of GO terms or two genes directly, with 5 groupwise strategies.

  • Batch operations — compute thousands of term or gene pairs in a single call, parallelized automatically.

  • All-vs-all distance matrices — one function call for a full symmetric distance matrix over any gene list.

  • Embeddings and visualization — built-in t-SNE, UMAP, and plotting helpers, no external pipeline needed.

  • Speed — the fastest library in our benchmark: 3.6–12.5× faster initialization and 2–25× faster gene-level similarity than other Python/R libraries. See Benchmarks.

  • Minimal setup — install with pip install go3, load an OBO file (auto-downloadable) and a GAF file, and start computing.
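To make the term-set and distance-matrix bullets concrete, here is a minimal pure-Python sketch of one widely used groupwise strategy, Best-Match Average (BMA), and of an all-vs-all distance matrix built from it. This page does not name GO3's five groupwise strategies, so BMA is only an illustrative assumption. None of the names below (best_match_average, distance_matrix, toy_sim, TOY_SIM, GENES) are part of the GO3 API, and the similarity values are made up.

```python
from itertools import combinations

# Hypothetical pairwise term similarities (made-up values, not real GO data).
TOY_SIM = {frozenset({"T1", "T2"}): 0.5, frozenset({"T2", "T3"}): 0.2}

# Hypothetical gene -> annotated terms mapping.
GENES = {"geneX": ["T1", "T2"], "geneY": ["T2"], "geneZ": ["T3"]}


def toy_sim(a, b):
    """Stand-in for any term-level similarity with values in [0, 1]."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def best_match_average(terms_a, terms_b, term_sim):
    """Pooled BMA: mean of every term's best match in the other set."""
    if not terms_a or not terms_b:
        return 0.0
    best_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


def distance_matrix(gene_terms, term_sim):
    """Symmetric matrix of 1 - BMA similarity over all gene pairs."""
    genes = sorted(gene_terms)
    n = len(genes)
    dist = [[0.0] * n for _ in range(n)]
    # Only the upper triangle is computed; the lower one is mirrored.
    for i, j in combinations(range(n), 2):
        sim = best_match_average(gene_terms[genes[i]], gene_terms[genes[j]], term_sim)
        dist[i][j] = dist[j][i] = 1.0 - sim
    return genes, dist


genes, dist = distance_matrix(GENES, toy_sim)
for gene, row in zip(genes, dist):
    print(gene, [round(d, 3) for d in row])
```

Computing only the upper triangle halves the number of groupwise comparisons, which is the usual reason a dedicated all-vs-all call beats a naive double loop over gene pairs.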

Highlights (v0.3)

  • TopoICSim and GraphIC hybrid methods

  • t-SNE / UMAP embedding helpers with plotting utilities

  • Thread-pool control via set_num_threads

  • Pre-built wheels for Linux, macOS, and Windows

Start Here

Install

pip install go3

Optional visualization extras:

pip install "go3[viz]"

Quick example

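import go3

# Load the ontology, then the gene annotations.
go3.load_go_terms("go-basic.obo")
annots = go3.load_gaf("goa_human.gaf")

# Build per-term annotation counts from the GAF annotations.
counter = go3.build_term_counter(annots)

# Lin similarity between mRNA processing (GO:0006397) and RNA splicing (GO:0008380).
sim = go3.semantic_similarity("GO:0006397", "GO:0008380", "lin", counter)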
print(sim)
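For readers unfamiliar with the "lin" method used above, the following self-contained sketch shows the standard Lin formula, sim(t1, t2) = 2 · IC(MICA) / (IC(t1) + IC(t2)), where IC(t) = -log p(t) and MICA is the most informative common ancestor. It runs on a tiny hypothetical ontology with made-up annotation counts. It does not use GO3 and is not a description of GO3's internals; every name in it (PARENTS, DIRECT_COUNTS, ancestors, information_content, lin) is illustrative.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents (not real GO terms).
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"A"},
    "C": {"A"},
    "D": {"root"},
}

# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 3, "C": 5, "D": 10}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(total / count(t)); counts propagate up to every ancestor."""
    cumulative = {t: 0 for t in PARENTS}
    for term, n in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += n
    total = cumulative["root"]
    return {t: math.log(total / c) for t, c in cumulative.items() if c > 0}


def lin(t1, t2, ic):
    """Lin similarity: 2 * IC(MICA) / (IC(t1) + IC(t2))."""
    denom = ic[t1] + ic[t2]
    if denom == 0:
        return 0.0  # both terms carry no information (e.g. the root)
    common = ancestors(t1) & ancestors(t2)
    mica_ic = max(ic[a] for a in common)
    return 2 * mica_ic / denom


ic = information_content()
print(lin("B", "C", ic))  # siblings under A share some information
print(lin("B", "D", ic))  # only the root in common
```

Terms whose only shared ancestor is the root score 0, because the root has zero information content, and a term compared with itself scores 1.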