Skip to content
ant

Crawl a slice of the graph to disk

Follow typed links across sites and materialize the URI tree.

get fetches one record. export fetches a record, follows its links, and writes the whole reachable slice to disk as the URI tree, so a record's file path is its URI. That tree is plain files you can grep, diff, or check into git.

Start from one URI

Pick a seed and see its outbound edges first:

$ ant links goodreads://author/153394
goodreads://book/2767052
goodreads://book/6148028

Materialize it

--follow N walks links to depth N; --to chooses where to write (otherwise --data / $HOME/data); --md drops a Markdown companion next to records that carry prose:

$ ant export goodreads://author/153394 --follow 1 --to ./data --md
{
  "root": "./data",
  "written": [
    "./data/goodreads/author/153394.json",
    "./data/goodreads/book/2767052.json",
    "./data/goodreads/book/2767052.md"
  ],
  "skipped": [],
  "errors": {}
}

The report is honest: a URI a site refuses (a sign-in wall, a WAF) lands under errors with the reason, and everything reachable is still written.

Read the tree back

$ ant ll goodreads:// --data ./data
goodreads://author/153394
goodreads://book/2767052
$ ant import ./data/goodreads/book/2767052.json | jq '.["@type"]'
"goodreads/book"

Draw it

graph walks the same edges and prints a subgraph as JSON or Graphviz dot:

$ ant graph goodreads://author/153394 --depth 1 --format dot | dot -Tsvg > graph.svg

Because links are typed and cross-site, a seed on one site can pull in records from another wherever the data points across — the URI tree holds them all under one root.