Mapping JSON to spreadsheet

This section demonstrates how to use the orcus_json class to map the contents of a JSON document onto a spreadsheet.

Consider the following JSON document:

{
  "meta": {
    "title": "My Book Collection",
    "owner": "John Doe",
    "last-updated": "2025-01-15"
  },
  "books": [
    {
      "title": "Crime and Punishment",
      "author": "Fyodor Dostoevsky",
      "year": 1866,
      "genre": "Novel",
      "rating": 4.8
    },
    {
      "title": "The Great Gatsby",
      "author": "F. Scott Fitzgerald",
      "year": 1925,
      "genre": "Novel",
      "rating": 4.0
    },
    {
      "title": "One Hundred Years of Solitude",
      "author": "Gabriel Garcia Marquez",
      "year": 1967,
      "genre": "Magic Realism",
      "rating": 4.7
    },
    {
      "title": "To Kill a Mockingbird",
      "author": "Harper Lee",
      "year": 1960,
      "genre": "Novel",
      "rating": 4.9
    },
    {
      "title": "Brave New World",
      "author": "Aldous Huxley",
      "year": 1932,
      "genre": "Dystopian",
      "rating": 4.2
    }
  ]
}

The document has a meta object at the top level containing three scalar fields, followed by a books array where each element describes a book.

JSON path expressions

Paths used by orcus_json use a JSONPath-inspired syntax, but are not a strict subset of the JSONPath standard. A path always starts with $ representing the document root. Object keys are addressed with bracket-and-quote notation (['key']), and array elements are addressed with the wildcard notation ([*]), or the equivalent empty bracket notation ([]).

For example, in the document above:

  • $['meta']['title'] points to the string "My Book Collection"

  • $['books'][*]['author'] points to the author field of every element in the books array

Range mapping

While set_cell_link() maps individual values to individual cells, orcus_json also supports mapping an array of objects to a range of rows — similar to a database table. Use start_range() to begin defining a range, append_field_link() to map each JSON field to a named column, set_range_row_group() to identify the array that determines row boundaries, and commit_range() to finalize the mapping:

filter.start_range("Books", 0, 0, true);
filter.append_field_link("$['books'][*]['title']", "Title");
filter.append_field_link("$['books'][*]['author']", "Author");
filter.append_field_link("$['books'][*]['year']", "Year");
filter.append_field_link("$['books'][*]['genre']", "Genre");
filter.append_field_link("$['books'][*]['rating']", "Rating");
filter.set_range_row_group("$['books']");
filter.commit_range();

The first three arguments to start_range() are the target sheet name and the row and column of the top-left corner of the range. The fourth argument is a boolean flag that, when true, reserves the first row for column headers. Each append_field_link() call takes a JSONPath expression and a column label. The path $['books'][*]['title'] selects the title field from every element of the books array. The call to set_range_row_group() with $['books'] tells orcus that each element of the books array represents one row in the output.

Running this code produces the following output:

rows: 6  cols: 5
+-------------------------------+------------------------+----------+---------------+---------+
| Title                         | Author                 | Year     | Genre         | Rating  |
+-------------------------------+------------------------+----------+---------------+---------+
| Crime and Punishment          | Fyodor Dostoevsky      | 1866 [v] | Novel         | 4.8 [v] |
+-------------------------------+------------------------+----------+---------------+---------+
| The Great Gatsby              | F. Scott Fitzgerald    | 1925 [v] | Novel         | 4 [v]   |
+-------------------------------+------------------------+----------+---------------+---------+
| One Hundred Years of Solitude | Gabriel Garcia Marquez | 1967 [v] | Magic Realism | 4.7 [v] |
+-------------------------------+------------------------+----------+---------------+---------+
| To Kill a Mockingbird         | Harper Lee             | 1960 [v] | Novel         | 4.9 [v] |
+-------------------------------+------------------------+----------+---------------+---------+
| Brave New World               | Aldous Huxley          | 1932 [v] | Dystopian     | 4.2 [v] |
+-------------------------------+------------------------+----------+---------------+---------+

Note that the [v] marker in the output indicates that the value is stored as a numeric type rather than a string.

Cell links and range mappings can be freely mixed within the same session. When combined, both become active when read_stream() is called, and orcus populates all of them in a single pass over the document. Multiple ranges can also be defined in the same session by repeating the start_range()commit_range() sequence before calling read_stream().