Generate JSON | Substrate

Learn how to generate structured JSON output using ComputeJSON

Generating JSON with LLMs is a broadly applicable technique, useful for tasks ranging from data extraction to data wrangling to function calling. It is also very useful in chained workflows, where one node consumes the structured output of another. We believe JSON generation is a core primitive for AI engineering, so we've optimized the heck out of it. Learn more on our blog.

1. Define a schema

First, we define a schema and convert it to JSON Schema format.

In TypeScript, we recommend using zod and the zod-to-json-schema package.
In Python, we recommend using Pydantic and the model_json_schema() method.

TypeScript


import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
const bookInfo = z.object({
  author: z.string().describe("The book's author."),
  characters: z.array(z.string()).describe("List of main characters."),
});
const jsonSchema = zodToJsonSchema(bookInfo);
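
For orientation, the JSON schema produced for bookInfo should look roughly like the object below. This is a hand-written sketch, not captured output from zod-to-json-schema: the exact keys (for example `$schema` or `additionalProperties`) depend on the package version and options. The `matchesBookInfo` helper and the `BookInfo` type are hypothetical names, included only to show what a conforming object looks like.

```typescript
// Hand-written approximation of the JSON schema for `bookInfo`.
// Assumption: the real zodToJsonSchema output may include extra keys.
const approxBookInfoSchema = {
  type: "object",
  properties: {
    author: { type: "string", description: "The book's author." },
    characters: {
      type: "array",
      items: { type: "string" },
      description: "List of main characters.",
    },
  },
  required: ["author", "characters"],
} as const;

type BookInfo = { author: string; characters: string[] };

// Hypothetical helper: a minimal structural check for this one schema.
// It is not a general JSON Schema validator.
function matchesBookInfo(value: unknown): value is BookInfo {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.author === "string" &&
    Array.isArray(v.characters) &&
    v.characters.every((c) => typeof c === "string")
  );
}
```

An object such as `{ author: "...", characters: ["...", "..."] }` passes this check, while one missing `characters` does not.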

2. Generate JSON

To generate JSON, pass a JSON schema to ComputeJSON along with a prompt.

To access the output of ComputeJSON in subsequent nodes, use future.json_object.

TypeScript


import { Substrate, ComputeJSON, ComputeText, sb } from "substrate";
// ...
// Generate a JSON object that conforms to jsonSchema.
const a = new ComputeJSON({
  prompt: "Tell me the author and two main characters of Don Quixote",
  json_schema: jsonSchema,
});
// Chain on the result: a.future.json_object is resolved when the graph runs.
const b = new ComputeText({
  prompt: sb.concat("Tell me about ", a.future.json_object.get("author")),
});
const c = new ComputeText({
  prompt: sb.concat("Tell me about ", a.future.json_object.get("characters").at(0)),
});
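
To make the data flow concrete, here is a local sketch of what the two chained prompts would evaluate to once `a` has resolved. The `example` object is a hypothetical stand-in, not real model output, and the futures' `get` and `at` calls are mimicked with plain property and index access. In the real graph these values are resolved by Substrate at run time.

```typescript
// Hypothetical stand-in for the resolved `json_object` of node `a`.
const example = {
  author: "Miguel de Cervantes",
  characters: ["Don Quixote", "Sancho Panza"],
};

// Local equivalents of the sb.concat(...) expressions used for `b` and `c`.
const promptB = "Tell me about " + example.author;
const promptC = "Tell me about " + example.characters[0];

console.log(promptB);
console.log(promptC);
```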

Use case: Data wrangling

JSON generation can be used to extract structured data from unstructured sources, or transform data from one shape to another. In the following example, we use ComputeJSON to transform a JSON object to a semantically equivalent object in a different shape.

TypeScript


import { ComputeJSON, Substrate } from "substrate";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
const original = {
  personalInfo: {
    name: "John Doe",
    age: 30,
  },
  occupation: "Software Engineer",
  fullAddress: "123 Main St, Anytown",
  address: {
    street: "123 Main St",
    city: "Anytown",
  },
};
const substrate = new Substrate({ apiKey: "YOUR_API_KEY" });
const TargetSchema = z.object({
  fullName: z.string(),
  yearsOld: z.number().int().nonnegative(),
  profession: z.string(),
  location: z.object({
    streetAddress: z.string(),
    cityName: z.string(),
  }),
});
const json = new ComputeJSON({
  prompt: `Translate the following JSON object to the target schema.
  ${JSON.stringify(original)}`,
  json_schema: zodToJsonSchema(TargetSchema),
});
const res = await substrate.run(json);
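
For comparison, the transformation requested in the prompt corresponds to the deterministic mapping below. This is a hand-written sketch for sanity-checking results, not part of the Substrate SDK; `toTarget` and the `Original` and `Target` types are hypothetical names. The model's output should be semantically equivalent, although it is not guaranteed to match field for field.

```typescript
// Shape of the input object used in the example above.
type Original = {
  personalInfo: { name: string; age: number };
  occupation: string;
  fullAddress: string;
  address: { street: string; city: string };
};

// Shape described by TargetSchema.
type Target = {
  fullName: string;
  yearsOld: number;
  profession: string;
  location: { streetAddress: string; cityName: string };
};

// Hypothetical reference mapping from the source shape to the target shape.
function toTarget(o: Original): Target {
  return {
    fullName: o.personalInfo.name,
    yearsOld: o.personalInfo.age,
    profession: o.occupation,
    // The redundant `fullAddress` string is dropped in favor of the structured address.
    location: {
      streetAddress: o.address.street,
      cityName: o.address.city,
    },
  };
}
```

Comparing the generated object against such a mapping on a handful of known inputs is one way to spot-check the prompt before relying on it for larger batches.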
