May 28, 2026

How to build a custom SDK generator with oagen

How to build a custom, language-aware SDK generator from an OpenAPI spec using oagen's typed intermediate representation.

Maria Paktiti


Here is a scenario you have probably lived: a backend engineer merges a new endpoint on a Friday. By Monday the Node SDK has it. The Python SDK gets it the following Wednesday after someone remembers to sync. The Go SDK gets a bug report three weeks later because a query parameter was quietly dropped during the manual port. Nobody did anything wrong. The system is just designed to produce drift.

oagen treats this as a compiler problem. You write a single OpenAPI spec; oagen parses it into a typed intermediate representation (IR); language-specific emitters consume that IR and produce files. The spec is the source of truth. The generated SDKs are output artifacts, not maintained codebases. This tutorial walks through the full pipeline, from a fresh OpenAPI spec to a working Python SDK generator, with enough detail to understand what is happening at each stage.

What makes oagen different from other generators

OpenAPI generators are not new. Tools like openapi-generator and Swagger Codegen have existed for years, and newer commercial platforms offer turnkey generation pipelines. If you need a working SDK in 20 minutes with no opinions, those tools are reasonable choices.

oagen is different in scope and philosophy. It is a framework for building generators, not a generator itself. The trade-off is explicit: you write emitter code, and in return you get complete control over the output. The generated SDK can look exactly like your team would have written it by hand, because your team writes the emitter.

The other key design decision is the IR. Rather than passing raw YAML to a template engine, oagen resolves all $ref references, normalizes schemas, groups operations into services, and derives method names. Every emitter works with the same resolved data model, so a bug in how the parser handles oneOf gets fixed once rather than in every emitter independently. The IR is the stable contract between the parser and everything downstream. The contract is enforced by types, not documentation.
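
To make the "resolve once, upstream of every emitter" idea concrete, here is a minimal sketch of same-document $ref resolution. It is not oagen's parser: resolvePointer and the Json type are hypothetical names used only for illustration, and only references that start with #/ are handled.

```typescript
// Hypothetical sketch: resolve a same-document $ref (a JSON Pointer) against a parsed spec.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function resolvePointer(root: Json, ref: string): Json {
  if (!ref.startsWith("#/")) {
    throw new Error(`Only local references are supported: ${ref}`);
  }
  let node: Json = root;
  for (const raw of ref.slice(2).split("/")) {
    // JSON Pointer escapes: "~1" means "/", "~0" means "~" (decode in that order)
    const key = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (node === null || typeof node !== "object" || Array.isArray(node)) {
      throw new Error(`Cannot descend into a non-object at "${key}" in ${ref}`);
    }
    const next = node[key];
    if (next === undefined) {
      throw new Error(`Unresolved reference: ${ref}`);
    }
    node = next;
  }
  return node;
}

const exampleSpec: Json = {
  components: { schemas: { TaskStatus: { type: "string", enum: ["pending", "done"] } } },
};

console.log(JSON.stringify(resolvePointer(exampleSpec, "#/components/schemas/TaskStatus")));
```

An emitter that receives the IR never needs code like this, which is the point: the lookup, and any bug in it, lives in one place.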

The IR in detail

Before writing any emitter code, it is worth understanding the data structures you will be working with. Running oagen parse --spec openapi.yml prints the full IR as JSON. Here is the top-level shape:

  
interface ApiSpec {
  name: string;        // from info.title
  version: string;     // from info.version
  baseUrl: string;     // from servers[0].url
  services: Service[]; // operations grouped by tag
  models: Model[];     // resolved schema objects
  enums: Enum[];       // string/numeric enums
  auth?: AuthScheme[]; // bearer, apiKey, oauth2
  sdk: SdkBehavior;   // retry, pagination, timeout, etc.
}

Services are the most important concept to internalize. oagen groups operations by their first OpenAPI tag and converts the tag to PascalCase. An operation tagged organizations becomes a method on the Organizations service. When there is no tag, the parser falls back to the first path segment.
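
The grouping rule can be sketched in a few lines. This is an illustration of the behavior described above, not oagen's implementation: RawOperation, toPascalCase, serviceNameFor, and groupByService are hypothetical names, and skipping {param} segments in the fallback is an assumption.

```typescript
// Hypothetical sketch of tag-based service grouping.
interface RawOperation {
  path: string;    // e.g. "/organizations/{id}"
  tags?: string[]; // OpenAPI tags, possibly absent
}

function toPascalCase(input: string): string {
  return input
    .split(/[^A-Za-z0-9]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

function serviceNameFor(op: RawOperation): string {
  const firstTag = op.tags?.[0];
  if (firstTag) return toPascalCase(firstTag);
  // No tag: fall back to the first literal (non-parameter) path segment
  const firstSegment = op.path
    .split("/")
    .find((segment) => segment.length > 0 && !segment.startsWith("{"));
  return toPascalCase(firstSegment ?? "Default");
}

function groupByService(ops: RawOperation[]): Map<string, RawOperation[]> {
  const groups = new Map<string, RawOperation[]>();
  for (const op of ops) {
    const name = serviceNameFor(op);
    groups.set(name, [...(groups.get(name) ?? []), op]);
  }
  return groups;
}

const grouped = groupByService([
  { path: "/organizations", tags: ["organizations"] },
  { path: "/organizations/{id}", tags: ["organizations", "admin"] },
  { path: "/tasks/{id}" }, // untagged, so the path decides
]);
console.log(Array.from(grouped.keys())); // [ 'Organizations', 'Tasks' ]
```

Only the first tag counts in this sketch, so the second operation lands in Organizations even though it also carries an admin tag.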

Each operation is resolved into:

  
interface Operation {
  name: string;              // e.g. "listUsers", "createOrganization"
  httpMethod: HttpMethod;
  path: string;              // e.g. "/users/{id}"
  pathParams: Parameter[];
  queryParams: Parameter[];
  requestBody?: TypeRef;
  requestBodyEncoding?: "json" | "form-data" | "form-urlencoded" | "binary" | "text";
  response: TypeRef;
  pagination?: PaginationMeta;
  injectIdempotencyKey: boolean;
  errors: ErrorResponse[];
}

The name is derived algorithmically. For a GET /users operation, oagen produces listUsers. For GET /users/{id}, it produces getUser. For POST /users/{id}/verify, it extracts the action verb and produces verifyUser. You can override any of these with operation hints in oagen.config.ts, which we will cover later.
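
A rough heuristic that reproduces the three examples above looks like the following. It is a sketch under stated assumptions, not oagen's algorithm: deriveOperationName is a hypothetical name, singularization is a naive trailing-"s" strip, and the verb table for POST, PUT, PATCH, and DELETE is a guess at a sensible default.

```typescript
// Hypothetical sketch of deriving a method name from an HTTP method and path.
type HttpMethod = "get" | "post" | "put" | "patch" | "delete";

const capitalize = (word: string): string => word.charAt(0).toUpperCase() + word.slice(1);
// Naive singularization: fine for "users" or "tasks", wrong for "categories"
const singular = (word: string): string => (word.endsWith("s") ? word.slice(0, -1) : word);
const isParam = (segment: string): boolean => segment.startsWith("{") && segment.endsWith("}");

function deriveOperationName(method: HttpMethod, path: string): string {
  const segments = path.split("/").filter((s) => s.length > 0);
  const literals = segments.filter((s) => !isParam(s));
  const resource = literals[0] ?? "resource";
  const last = segments[segments.length - 1] ?? "";
  const endsWithParam = isParam(last);

  // A literal segment that follows a path parameter is treated as an action verb
  const hasParam = segments.some(isParam);
  if (hasParam && !endsWithParam && literals.length > 1) {
    const action = literals[literals.length - 1] ?? "";
    return action + capitalize(singular(resource));
  }

  if (method === "get") {
    return endsWithParam
      ? "get" + capitalize(singular(resource))
      : "list" + capitalize(resource);
  }
  const verbs: Record<Exclude<HttpMethod, "get">, string> = {
    post: "create",
    put: "replace",
    patch: "update",
    delete: "delete",
  };
  return verbs[method] + capitalize(singular(resource));
}

console.log(deriveOperationName("get", "/users"));              // listUsers
console.log(deriveOperationName("get", "/users/{id}"));         // getUser
console.log(deriveOperationName("post", "/users/{id}/verify")); // verifyUser
```

Heuristics like this break on irregular plurals and nested resources, which is why an explicit override mechanism matters.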

TypeRef: the discriminated union at the heart of the IR

All types in the IR are expressed as TypeRef, a discriminated union keyed on kind:

  
type TypeRef =
  | { kind: "primitive"; type: "string" | "integer" | "number" | "boolean" | "unknown"; format?: string }
  | { kind: "array"; items: TypeRef }
  | { kind: "model"; name: string }
  | { kind: "enum"; name: string }
  | { kind: "nullable"; inner: TypeRef }
  | { kind: "union"; variants: TypeRef[]; compositionKind?: "allOf" | "oneOf" | "anyOf" }
  | { kind: "map"; valueType: TypeRef; keyType?: TypeRef }
  | { kind: "literal"; value: string | number | boolean | null };

This design has a critical implication for emitter authors. When you write an exhaustive switch over ref.kind and use a helper like assertNever on the default branch, TypeScript's type narrowing guarantees at compile time that you have handled every variant. If oagen adds a new TypeRef kind in a future release, your emitter will fail to build until you add a case for it. No runtime surprises, no silently missing types in generated output.

The nullable variant is particularly important to handle correctly. In Python, { kind: "nullable", inner: { kind: "primitive", type: "string" } } should render as Optional[str]. In TypeScript, it should render as string | null. The emitter makes that decision; the IR just says the type is nullable.
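
For comparison with the Python rendering, here is a sketch of how a TypeScript-targeting emitter might render the same nodes. The TypeRef definition is copied from above so the block stands alone; renderTsType is a hypothetical name, and mapping allOf to an intersection and every map to Record<string, ...> are simplifying assumptions.

```typescript
// Hypothetical sketch: rendering TypeRef nodes as TypeScript type expressions.
type TypeRef =
  | { kind: "primitive"; type: "string" | "integer" | "number" | "boolean" | "unknown"; format?: string }
  | { kind: "array"; items: TypeRef }
  | { kind: "model"; name: string }
  | { kind: "enum"; name: string }
  | { kind: "nullable"; inner: TypeRef }
  | { kind: "union"; variants: TypeRef[]; compositionKind?: "allOf" | "oneOf" | "anyOf" }
  | { kind: "map"; valueType: TypeRef; keyType?: TypeRef }
  | { kind: "literal"; value: string | number | boolean | null };

function assertNever(x: never): never {
  throw new Error(`Unhandled TypeRef: ${JSON.stringify(x)}`);
}

function renderTsType(ref: TypeRef): string {
  switch (ref.kind) {
    case "primitive":
      if (ref.type === "integer" || ref.type === "number") return "number";
      return ref.type; // "string", "boolean", and "unknown" are TypeScript keywords
    case "array": {
      const items = renderTsType(ref.items);
      // Parenthesize compound types so "(A | null)[]" is not misread as "A | null[]"
      return /[|&]/.test(items) ? `(${items})[]` : `${items}[]`;
    }
    case "model":
    case "enum":
      return ref.name;
    case "nullable":
      return `${renderTsType(ref.inner)} | null`;
    case "union":
      return ref.variants
        .map(renderTsType)
        .join(ref.compositionKind === "allOf" ? " & " : " | ");
    case "map":
      return `Record<string, ${renderTsType(ref.valueType)}>`;
    case "literal":
      return JSON.stringify(ref.value);
    default:
      return assertNever(ref);
  }
}

const nullableString: TypeRef = { kind: "nullable", inner: { kind: "primitive", type: "string" } };
console.log(renderTsType(nullableString)); // string | null
```

The IR node is identical for both targets; only the renderer differs.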

A spec to work with

For this tutorial, we will use a simple task management API. The key schemas are a Task object, a TaskStatus enum, and a paginated TaskList wrapper. Save this as tasks-api.yml:

  
openapi: "3.1.0"
info:
  title: Tasks API
  version: "1.0.0"
servers:
  - url: https://api.tasks.example.com

components:
  schemas:
    Task:
      type: object
      required: [id, title, status]
      properties:
        id:
          type: string
          format: uuid
        title:
          type: string
        status:
          $ref: "#/components/schemas/TaskStatus"
        assignee_id:
          type: string
          nullable: true
        created_at:
          type: string
          format: date-time

    TaskStatus:
      type: string
      enum: [pending, in_progress, done, cancelled]

    TaskList:
      type: object
      required: [data]
      properties:
        data:
          type: array
          items:
            $ref: "#/components/schemas/Task"
        after:
          type: string
          nullable: true
        before:
          type: string
          nullable: true

    CreateTaskInput:
      type: object
      required: [title]
      properties:
        title: { type: string }
        assignee_id:
          type: string
          nullable: true

    UpdateTaskInput:
      type: object
      properties:
        title: { type: string }
        status:
          $ref: "#/components/schemas/TaskStatus"
        assignee_id:
          type: string
          nullable: true

paths:
  /tasks:
    get:
      operationId: listTasks
      summary: List tasks
      tags: [Tasks]
      parameters:
        - name: after
          in: query
          schema: { type: string }
        - name: limit
          in: query
          schema: { type: integer }
        - name: status
          in: query
          schema:
            $ref: "#/components/schemas/TaskStatus"
      responses:
        "200":
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/TaskList"

    post:
      operationId: createTask
      summary: Create a task
      tags: [Tasks]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateTaskInput"
      responses:
        "201":
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Task"

  /tasks/{id}:
    get:
      operationId: getTask
      summary: Get a task
      tags: [Tasks]
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string, format: uuid }
      responses:
        "200":
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Task"

    patch:
      operationId: updateTask
      summary: Update a task
      tags: [Tasks]
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string, format: uuid }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/UpdateTaskInput"
      responses:
        "200":
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Task"

    delete:
      operationId: deleteTask
      summary: Delete a task
      tags: [Tasks]
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string, format: uuid }
      responses:
        "204": {}

A note on the request body schemas: they are defined as named components (CreateTaskInput, UpdateTaskInput) rather than inline anonymous objects. This is deliberate. When a request body is an inline anonymous object, oagen represents it as a { kind: "model" } ref with a synthesized name, and the name it picks depends on the surrounding spec. Naming your input schemas explicitly keeps the IR predictable and gives you clean generated types.

Install oagen and parse the spec to confirm the IR looks correct:

  
npm install @workos/oagen
npx oagen parse --spec tasks-api.yml

The output should show a single Tasks service with five operations, four models (Task, CreateTaskInput, UpdateTaskInput, and the paginated wrapper TaskList), and one enum (TaskStatus).

Also run the resolution table before writing any emitter code:

  
npx oagen resolve --spec tasks-api.yml --format table

This prints a markdown table mapping each operation to its derived method name and target service. For this spec, names come from explicit operationId fields. In specs without clean operationId values, the resolution table is where you discover what the algorithm produced and where you need to add hints.

Scaffolding the emitter project

  
npx oagen init --lang python --project ./tasks-python-emitter
cd ./tasks-python-emitter
npm install

The scaffold creates:

  
tasks-python-emitter/
  src/
    python/
      index.ts   ← stub emitter
    plugin.ts    ← registers the emitter
  oagen.config.ts
  package.json

The stub emitter implements the Emitter interface with all methods returning empty arrays. Open src/python/index.ts and you will see the shape you need to fill in:

  
import type { Emitter } from "@workos/oagen";

export const pythonEmitter: Emitter = {
  language: "python",
  generateModels: (models, ctx) => [],
  generateEnums: (enums, ctx) => [],
  generateResources: (services, ctx) => [],
  generateClient: (spec, ctx) => [],
  generateErrors: () => [],
  generateTests: () => [],
  fileHeader: () => "# Auto-generated by oagen. Do not edit.",
};

Every method receives IR nodes and an EmitterContext. The context carries ctx.spec (the full ApiSpec), ctx.resolvedOperations (the pre-computed operation name map), and ctx.namespace (the value passed via --namespace on the CLI).

Writing the Python type emitter

Start with the helper that converts IR types to Python type annotation strings. This is the foundation everything else depends on, so it is also where you want the exhaustiveness guarantee enforced properly.

  
// src/python/types.ts
import type { TypeRef } from "@workos/oagen";

function assertNever(x: never): never {
  throw new Error(`Unhandled TypeRef kind: ${(x as TypeRef).kind}`);
}

export function renderTypeRef(ref: TypeRef): string {
  switch (ref.kind) {
    case "primitive":
      return renderPrimitive(ref.type, ref.format);
    case "array":
      return `List[${renderTypeRef(ref.items)}]`;
    case "nullable":
      return `Optional[${renderTypeRef(ref.inner)}]`;
    case "model":
      return ref.name;
    case "enum":
      return ref.name;
    case "union":
      return `Union[${ref.variants.map(renderTypeRef).join(", ")}]`;
    case "map": {
      const keyType = ref.keyType ?? { kind: "primitive" as const, type: "string" as const };
      return `Dict[${renderTypeRef(keyType)}, ${renderTypeRef(ref.valueType)}]`;
    }
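    case "literal":
      // Python spells booleans True/False, and literal types need typing.Literal
      if (ref.value === null) return "None";
      if (typeof ref.value === "boolean") return `Literal[${ref.value ? "True" : "False"}]`;
      return `Literal[${JSON.stringify(ref.value)}]`;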
    default:
      return assertNever(ref);
  }
}

function renderPrimitive(type: string, format?: string): string {
  if (type === "string" && format === "date-time") return "datetime";
  if (type === "string" && format === "date") return "date";
  if (type === "string") return "str";
  if (type === "integer") return "int";
  if (type === "number") return "float";
  if (type === "boolean") return "bool";
  return "Any";
}

The assertNever call in the default branch is the mechanism that enforces exhaustiveness. Because TypeRef is a discriminated union and each case narrows the type, after all known variants are handled the default branch receives type never. TypeScript then rejects the call to assertNever(ref) at compile time if any variant is unhandled. When oagen adds a new TypeRef kind in a future release, your build breaks before your generator produces bad output.

Generating models

Models map to Python dataclasses. Generating them means walking model.fields, rendering each field's type, and deciding whether to emit a default value for optional fields.

  
// src/python/models.ts
import type { Model, GeneratedFile } from "@workos/oagen";
import { renderTypeRef } from "./types.js";

// Fields that may be omitted default to None, so their annotation must allow None
function renderFieldType(f: Model["fields"][number]): string {
  const rendered = renderTypeRef(f.type);
  return f.required || rendered.startsWith("Optional[") ? rendered : `Optional[${rendered}]`;
}

function collectImports(model: Model): string[] {
  const imports = new Set<string>(["from __future__ import annotations"]);
  imports.add("from dataclasses import dataclass");

  for (const f of model.fields) {
    const rendered = renderFieldType(f);
    // Match "Name[" on a word boundary so a model called TaskList does not pull in List
    for (const name of ["Dict", "List", "Literal", "Optional", "Union"]) {
      if (new RegExp(`\\b${name}\\[`).test(rendered)) imports.add(`from typing import ${name}`);
    }
    if (/\bAny\b/.test(rendered)) imports.add("from typing import Any");
    if (/\bdatetime\b/.test(rendered)) imports.add("from datetime import datetime");
    if (/\bdate\b/.test(rendered)) imports.add("from datetime import date");
  }

  return [...imports].sort();
}

export function generateModels(models: Model[]): GeneratedFile[] {
  return models.map((model) => {
    const imports = collectImports(model);

    // Required fields must come before optional fields in a dataclass
    const required = model.fields.filter((f) => f.required);
    const optional = model.fields.filter((f) => !f.required);
    const ordered = [...required, ...optional];

    const fields = ordered.map((f) => {
      const typeStr = renderFieldType(f);
      const comment = f.description ? `  # ${f.description}\n` : "";
      return f.required
        ? `${comment}  ${f.name}: ${typeStr}`
        : `${comment}  ${f.name}: ${typeStr} = None`;
    });

    const content = [
      imports.join("\n"),
      "",
      "",
      `@dataclass`,
      `class ${model.name}:`,
      ...(model.description ? [`  """${model.description}"""`] : []),
      ...fields,
    ].join("\n");

    return {
      path: `models/${model.name.toLowerCase()}.py`,
      content,
    };
  });
}

One non-obvious ordering decision: Python dataclasses require fields with defaults to come after fields without defaults. The ordered array above enforces this by separating required from optional fields before rendering.

The generated file for Task will look like this:

  
# Auto-generated by oagen. Do not edit.
from __future__ import annotations
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Task:
  id: str
  title: str
  status: TaskStatus
  assignee_id: Optional[str] = None
  created_at: Optional[datetime] = None

Generating enums

Python enums are straightforward. Using str, Enum as the base class means instances serialize cleanly to their string value when passed to json.dumps, which is what you want for HTTP request bodies.

  
// src/python/enums.ts
import type { Enum, GeneratedFile } from "@workos/oagen";

export function generateEnums(enums: Enum[]): GeneratedFile[] {
  return enums.map((entry) => {
    const members = entry.values
      .map((v) => `  ${v.name} = ${JSON.stringify(v.value)}`)
      .join("\n");

    const content = [
      "from enum import Enum",
      "",
      "",
      `class ${entry.name}(str, Enum):`,
      members,
    ].join("\n");

    return {
      path: `models/${entry.name.toLowerCase()}.py`,
      content,
    };
  });
}
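
The generator above assumes every enum value already carries a name that is a valid Python identifier. If you cannot rely on that for your spec, a fallback along these lines can derive one from the raw value. toEnumMemberName is a hypothetical helper, and the VALUE_ prefix for names that would start with a digit is an arbitrary convention chosen for this sketch.

```typescript
// Hypothetical sketch: derive an UPPER_SNAKE_CASE Python enum member name from a raw value.
function toEnumMemberName(value: string | number): string {
  const name = String(value)
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // split camelCase boundaries
    .replace(/[^A-Za-z0-9]+/g, "_")         // collapse punctuation and spaces
    .replace(/^_+|_+$/g, "")                // trim leading and trailing underscores
    .toUpperCase();
  // Python identifiers cannot be empty or start with a digit
  return name === "" || /^[0-9]/.test(name) ? `VALUE_${name}` : name;
}

console.log(toEnumMemberName("in_progress")); // IN_PROGRESS
console.log(toEnumMemberName("inProgress"));  // IN_PROGRESS
console.log(toEnumMemberName("not-started")); // NOT_STARTED
console.log(toEnumMemberName(404));           // VALUE_404
```

Two distinct values can map to the same member name (in_progress and inProgress here), so a real emitter would also need to detect collisions.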

Generating the resource client

Each service becomes a Python class. Each operation becomes a method. The method signature is derived from the operation's path params, query params, and request body.

  
// src/python/resources.ts
import type { Service, Operation, GeneratedFile, EmitterContext } from "@workos/oagen";
import { renderTypeRef } from "./types.js";

function renderParams(op: Operation): string[] {
  const params: string[] = ["self"];

  // Path params first, always required
  for (const p of op.pathParams) {
    params.push(`${p.name}: ${renderTypeRef(p.type)}`);
  }

  // Typed request body, if present
  if (op.requestBody) {
    params.push(`body: ${renderTypeRef(op.requestBody)}`);
  }

  // Required query params before optional ones
  const required = op.queryParams.filter((p) => p.required);
  const optional = op.queryParams.filter((p) => !p.required);

  for (const p of required) {
    params.push(`${p.name}: ${renderTypeRef(p.type)}`);
  }
  for (const p of optional) {
    params.push(`${p.name}: Optional[${renderTypeRef(p.type)}] = None`);
  }

  return params;
}

function renderMethod(op: Operation, ctx: EmitterContext): string {
  // Use the pre-computed resolved name rather than deriving it independently
  const resolved = ctx.resolvedOperations.find(
    (r) => r.operation.name === op.name
  );
  const methodName = resolved?.methodName ?? op.name;

  const params = renderParams(op);
  const returnType = op.response.kind === "primitive" && op.response.type === "unknown"
    ? "None"
    : renderTypeRef(op.response);
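  // OpenAPI's {param} placeholders are already valid Python f-string fields,
  // so the path can be embedded in an f-string unchanged.
  const pythonPath = op.path;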

  const lines: string[] = [
    `  def ${methodName}(${params.join(", ")}) -> ${returnType}:`,
  ];

  if (op.description) {
    lines.push(`    """${op.description}"""`);
  }

  if (op.queryParams.length > 0) {
    const pairs = op.queryParams.map((p) => `"${p.name}": ${p.name}`).join(", ");
    lines.push(`    params = {${pairs}}`);
    lines.push(`    params = {k: v for k, v in params.items() if v is not None}`);
  }

  const callArgs = [`f"${pythonPath}"`];
  if (op.requestBody) callArgs.push("json=body");
  if (op.queryParams.length > 0) callArgs.push("params=params");

  lines.push(`    return self._client.request("${op.httpMethod.toUpperCase()}", ${callArgs.join(", ")})`);
  lines.push("");

  return lines.join("\n");
}

export function generateResources(
  services: Service[],
  ctx: EmitterContext
): GeneratedFile[] {
  return services.map((service) => {
    const methods = service.operations.map((op) => renderMethod(op, ctx));

    const content = [
      "from __future__ import annotations",
      "from typing import Optional",
      "from ..http_client import HttpClient",
      "",
      "",
      `class ${service.name}Client:`,
      `  def __init__(self, client: HttpClient) -> None:`,
      `    self._client = client`,
      "",
      ...methods,
    ].join("\n");

    return {
      path: `resources/${service.name.toLowerCase()}_client.py`,
      content,
    };
  });
}

The key line is ctx.resolvedOperations.find(r => r.operation.name === op.name). This is where the emitter consults the pre-computed resolution table to get the correct methodName in snake_case. Using ctx.resolvedOperations guarantees that all SDK languages produce the same logical method name, differing only in casing convention.
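
If the resolved name arrives in a casing your target language does not use, the emitter can normalize it before rendering. The helper below is a hypothetical sketch; whether ctx.resolvedOperations already hands you snake_case depends on your hints and configuration, so treat it as a safety net and not as part of oagen.

```typescript
// Hypothetical sketch: convert one logical method name into a language's casing convention.
type Casing = "snake" | "camel" | "pascal";

function splitWords(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase boundaries
    .split(/[^A-Za-z0-9]+/)
    .filter((word) => word.length > 0)
    .map((word) => word.toLowerCase());
}

function applyCasing(name: string, casing: Casing): string {
  const words = splitWords(name);
  const cap = (word: string): string => word.charAt(0).toUpperCase() + word.slice(1);
  if (casing === "snake") return words.join("_");
  if (casing === "camel") return words.map((w, i) => (i === 0 ? w : cap(w))).join("");
  return words.map(cap).join("");
}

console.log(applyCasing("listTasks", "snake"));   // list_tasks
console.log(applyCasing("list_tasks", "camel"));  // listTasks
console.log(applyCasing("list_tasks", "pascal")); // ListTasks
```

A Python emitter would call applyCasing(methodName, "snake"), a Go emitter "pascal", and both would still agree on the underlying words.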

Generating the top-level client

  
// src/python/client.ts
import type { ApiSpec, GeneratedFile, EmitterContext } from "@workos/oagen";

export function generateClient(spec: ApiSpec, ctx: EmitterContext): GeneratedFile[] {
  const imports = spec.services.map(
    (s) => `from .resources.${s.name.toLowerCase()}_client import ${s.name}Client`
  );

  const props = spec.services.map(
    (s) => `    self.${s.name.toLowerCase()} = ${s.name}Client(self._http)`
  );

  const content = [
    "from __future__ import annotations",
    "from .http_client import HttpClient",
    ...imports,
    "",
    "",
    `class ${ctx.namespace}:`,
    `  """Auto-generated client for ${spec.name} ${spec.version}"""`,
    "",
    `  def __init__(self, api_key: str, base_url: str = "${spec.baseUrl}") -> None:`,
    `    self._http = HttpClient(api_key=api_key, base_url=base_url)`,
    ...props,
  ].join("\n");

  return [{ path: "client.py", content }];
}

Assembling the emitter

Wire everything together in src/python/index.ts:

  
import type { Emitter } from "@workos/oagen";
import { generateModels } from "./models.js";
import { generateEnums } from "./enums.js";
import { generateResources } from "./resources.js";
import { generateClient } from "./client.js";

export const pythonEmitter: Emitter = {
  language: "python",
  generateModels: (models) => generateModels(models),
  generateEnums: (enums) => generateEnums(enums),
  generateResources: (services, ctx) => generateResources(services, ctx),
  generateClient: (spec, ctx) => generateClient(spec, ctx),
  generateErrors: () => [],
  generateTests: () => [],
  fileHeader: () => "# Auto-generated by oagen. Do not edit.\n",
};

Then register it in src/plugin.ts:

  
import { registerEmitter } from "@workos/oagen";
import { pythonEmitter } from "./python/index.js";

registerEmitter(pythonEmitter);

export const myEmittersPlugin = {};

Configuring oagen.config.ts

Edit the scaffolded config to import the plugin and set any operation-level overrides:

  
// oagen.config.ts
import { myEmittersPlugin } from "./src/plugin.js";

export default {
  ...myEmittersPlugin,

  // Override derived operation names where needed.
  // Key format is "METHOD /path".
  // These are redundant here because the spec has explicit operationId values,
  // but in specs without clean operationIds this is how you enforce naming
  // conventions across a large surface without touching the spec itself.
  operationHints: {
    "GET /tasks": { name: "list_tasks" },
    "POST /tasks": { name: "create_task" },
    "GET /tasks/{id}": { name: "get_task" },
    "PATCH /tasks/{id}": { name: "update_task" },
    "DELETE /tasks/{id}": { name: "delete_task" },
  },

  // SDK runtime policy overrides
  sdkBehavior: {
    retry: {
      maxRetries: 2,
      backoff: { initialDelay: 0.5, maxDelay: 8.0, multiplier: 2, jitterFactor: 0.25 },
    },
    timeout: {
      defaultTimeoutSeconds: 30,
      timeoutEnvVar: "TASKS_REQUEST_TIMEOUT",
    },
  },
};
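
To see what the backoff numbers in this config imply, the sketch below computes the delay before each retry. It assumes the delays are in seconds (as the timeout is) and that jitterFactor spreads each delay by up to plus or minus that fraction; both are assumptions about the config's semantics, and backoffDelay is a hypothetical helper.

```typescript
// Hypothetical sketch: exponential backoff with a cap and proportional jitter.
interface BackoffConfig {
  initialDelay: number; // delay before the first retry
  maxDelay: number;     // upper bound on any single delay
  multiplier: number;   // growth factor per attempt
  jitterFactor: number; // assumed: fraction of the delay to randomize, in both directions
}

function backoffDelay(
  attempt: number,
  cfg: BackoffConfig,
  random: () => number = Math.random,
): number {
  const base = Math.min(cfg.initialDelay * Math.pow(cfg.multiplier, attempt), cfg.maxDelay);
  const jitter = base * cfg.jitterFactor * (random() * 2 - 1);
  return Math.max(0, base + jitter);
}

const backoff: BackoffConfig = { initialDelay: 0.5, maxDelay: 8.0, multiplier: 2, jitterFactor: 0.25 };

// A random source fixed at 0.5 yields zero jitter, which exposes the underlying schedule
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => backoffDelay(attempt, backoff, () => 0.5));
console.log(schedule); // [ 0.5, 1, 2, 4, 8, 8 ]
```

With maxRetries set to 2, only the first two entries of that schedule are ever used.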

Running the generator

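Build the emitter and generate. The --namespace value becomes the name of the top-level client class, so pick one that differs from every generated resource class. The Tasks service already produces a class called TasksClient, and reusing that name would shadow the import inside client.py, so this run uses TasksApi:

  
npm run build
npm run sdk:generate -- --spec ../tasks-api.yml --namespace TasksApi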

The output directory will contain:

  
sdk/
  client.py
  models/
    task.py
    createtaskinput.py
    updatetaskinput.py
    tasklist.py
    taskstatus.py
  resources/
    tasks_client.py

And sdk/resources/tasks_client.py will look like:

  
# Auto-generated by oagen. Do not edit.

from __future__ import annotations
from typing import Optional
from ..http_client import HttpClient


class TasksClient:
  def __init__(self, client: HttpClient) -> None:
    self._client = client

  def list_tasks(self, after: Optional[str] = None, limit: Optional[int] = None, status: Optional[TaskStatus] = None) -> TaskList:
    """List tasks"""
    params = {"after": after, "limit": limit, "status": status}
    params = {k: v for k, v in params.items() if v is not None}
    return self._client.request("GET", f"/tasks", params=params)

  def create_task(self, body: CreateTaskInput) -> Task:
    """Create a task"""
    return self._client.request("POST", f"/tasks", json=body)

  def get_task(self, id: str) -> Task:
    """Get a task"""
    return self._client.request("GET", f"/tasks/{id}")

  def update_task(self, id: str, body: UpdateTaskInput) -> Task:
    """Update a task"""
    return self._client.request("PATCH", f"/tasks/{id}", json=body)

  def delete_task(self, id: str) -> None:
    """Delete a task"""
    return self._client.request("DELETE", f"/tasks/{id}")

Reading SDK behavior from the IR

A production emitter reads retry and timeout policy from ctx.spec.sdk and generates that configuration into the HTTP client file rather than hardcoding it. The values come from oagen's defaults merged with whatever overrides you specified in oagen.config.ts.

  
// src/python/http_client.ts
import type { ApiSpec, GeneratedFile, EmitterContext } from "@workos/oagen";

export function generateHttpClient(spec: ApiSpec, ctx: EmitterContext): GeneratedFile {
  const sdk = ctx.spec.sdk;
  const retryable = sdk.retry.retryableStatusCodes.join(", ");
  const maxRetries = sdk.retry.maxRetries;
  const initialDelay = sdk.retry.backoff.initialDelay;
  const maxDelay = sdk.retry.backoff.maxDelay;
  // Read the growth factor from config too, so no part of the policy is hardcoded
  const multiplier = sdk.retry.backoff.multiplier;
  const defaultTimeout = sdk.timeout.defaultTimeoutSeconds;
  const timeoutEnvVar = sdk.timeout.timeoutEnvVar ?? "REQUEST_TIMEOUT";

  const content = `
import os
import time
import httpx

RETRYABLE_STATUS_CODES = {${retryable}}
MAX_RETRIES = ${maxRetries}
INITIAL_DELAY = ${initialDelay}
MAX_DELAY = ${maxDelay}
BACKOFF_MULTIPLIER = ${multiplier}
DEFAULT_TIMEOUT = float(os.environ.get("${timeoutEnvVar}", ${defaultTimeout}))


class HttpClient:
  def __init__(self, api_key: str, base_url: str) -> None:
    self._base_url = base_url.rstrip("/")
    self._client = httpx.Client(
      headers={"Authorization": f"Bearer {api_key}"},
      timeout=DEFAULT_TIMEOUT,
    )

  def request(self, method: str, path: str, **kwargs):
    url = f"{self._base_url}{path}"
    delay = INITIAL_DELAY
    for attempt in range(MAX_RETRIES + 1):
      response = self._client.request(method, url, **kwargs)
      if response.status_code not in RETRYABLE_STATUS_CODES:
        response.raise_for_status()
        return response.json() if response.content else None
      if attempt < MAX_RETRIES:
        time.sleep(min(delay, MAX_DELAY))
        delay *= BACKOFF_MULTIPLIER
    response.raise_for_status()
  `.trim();

  return { path: "http_client.py", content };
}

If you later change sdkBehavior.retry.maxRetries in oagen.config.ts, the generated HTTP client file changes on the next generation run. Every SDK language gets the updated policy at the same time, from the same config source.
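
The Emitter object assembled earlier never calls generateHttpClient, so http_client.py would not be written as things stand. One way to close that gap, assuming the Emitter interface has no dedicated hook for extra files (an assumption worth checking against your oagen version), is to return it from generateClient alongside client.py. The sketch uses local stand-in types and stub generators so it runs on its own; combineGenerators is a hypothetical helper.

```typescript
// Local stand-in for the GeneratedFile type used throughout this tutorial
interface GeneratedFile {
  path: string;
  content: string;
}
type FileGenerator<S, C> = (spec: S, ctx: C) => GeneratedFile[];

// Hypothetical helper: merge several generators into one hook and reject duplicate paths
function combineGenerators<S, C>(...generators: FileGenerator<S, C>[]): FileGenerator<S, C> {
  return (spec, ctx) => {
    const files: GeneratedFile[] = [];
    const seen = new Set<string>();
    for (const generate of generators) {
      for (const file of generate(spec, ctx)) {
        if (seen.has(file.path)) throw new Error(`Duplicate output path: ${file.path}`);
        seen.add(file.path);
        files.push(file);
      }
    }
    return files;
  };
}

// Stubs standing in for generateClient and generateHttpClient
interface StubSpec { name: string; }
interface StubContext { namespace: string; }

const stubClient: FileGenerator<StubSpec, StubContext> = (_spec, ctx) => [
  { path: "client.py", content: `class ${ctx.namespace}: ...` },
];
const stubHttpClient = (_spec: StubSpec, _ctx: StubContext): GeneratedFile => ({
  path: "http_client.py",
  content: "class HttpClient: ...",
});

const generateClientFiles = combineGenerators<StubSpec, StubContext>(
  stubClient,
  (spec, ctx) => [stubHttpClient(spec, ctx)],
);

const emitted = generateClientFiles({ name: "Tasks API" }, { namespace: "TasksApi" });
console.log(emitted.map((f) => f.path)); // [ 'client.py', 'http_client.py' ]
```

In the real emitter the equivalent wiring would be generateClient: combineGenerators(generateClient, (spec, ctx) => [generateHttpClient(spec, ctx)]).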

Testing emitters with fixture specs

Emitter tests follow a golden-file pattern. You commit a fixture spec alongside expected output for each case, then assert that the emitter produces exactly those files. The oagen repository includes a reference emitter with tests at examples/reference-emitter.

A minimal test for the model generator:

  
// test/models.test.ts
import { describe, it, expect } from "vitest";
import { parseSpec } from "@workos/oagen";
import { generateModels } from "../src/python/models.js";

const FIXTURE = `
openapi: "3.1.0"
info:
  title: Test API
  version: "1.0.0"
servers:
  - url: https://api.example.com
components:
  schemas:
    Widget:
      type: object
      required: [id, name]
      properties:
        id:
          type: string
          format: uuid
        name:
          type: string
        color:
          type: string
          nullable: true
paths: {}
`;

describe("generateModels", () => {
  it("generates a Python dataclass with optional fields after required ones", async () => {
    const spec = await parseSpec({ content: FIXTURE, format: "yaml" });
    const files = generateModels(spec.models);

    expect(files).toHaveLength(1);
    expect(files[0].path).toBe("models/widget.py");
    expect(files[0].content).toContain("@dataclass");
    expect(files[0].content).toContain("class Widget:");
    expect(files[0].content).toContain("id: str");
    expect(files[0].content).toContain("name: str");
    expect(files[0].content).toContain("color: Optional[str] = None");

    // Verify field ordering: required fields must precede optional ones
    const content = files[0].content;
    expect(content.indexOf("id: str")).toBeLessThan(content.indexOf("color: Optional[str]"));
  });
});

Run with npm test. This pattern scales well: add a fixture YAML string for each edge case (union types, pagination, deprecated fields, enums as field types) and assert on the generated content. A few dozen such tests catch most regressions before the emitter ever runs against a real production spec.
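
The substring assertions above can grow into a full golden-file comparison. The sketch below keeps the expected output in an in-memory map so it has no file I/O; in a real suite that map would be loaded from committed files. diffAgainstGolden and GoldenDiff are hypothetical names.

```typescript
// Hypothetical sketch: compare generated files against expected ("golden") content.
interface GeneratedFile {
  path: string;
  content: string;
}
interface GoldenDiff {
  missing: string[];    // expected but not generated
  unexpected: string[]; // generated but not expected
  changed: string[];    // generated with different content
}

function diffAgainstGolden(actual: GeneratedFile[], golden: Record<string, string>): GoldenDiff {
  const actualByPath = new Map<string, string>();
  for (const file of actual) actualByPath.set(file.path, file.content);

  const diff: GoldenDiff = { missing: [], unexpected: [], changed: [] };
  for (const path of Object.keys(golden)) {
    const content = actualByPath.get(path);
    if (content === undefined) diff.missing.push(path);
    else if (content !== golden[path]) diff.changed.push(path);
  }
  actualByPath.forEach((_content, path) => {
    if (!Object.prototype.hasOwnProperty.call(golden, path)) diff.unexpected.push(path);
  });
  return diff;
}

const goldenDiff = diffAgainstGolden(
  [
    { path: "models/widget.py", content: "class Widget:\n  id: str" },
    { path: "models/extra.py", content: "class Extra:\n  pass" },
  ],
  {
    "models/widget.py": "class Widget:\n  id: str\n  name: str",
    "models/gadget.py": "class Gadget:\n  pass",
  },
);
console.log(goldenDiff);
// { missing: [ 'models/gadget.py' ], unexpected: [ 'models/extra.py' ], changed: [ 'models/widget.py' ] }
```

A test then reduces to asserting that all three arrays are empty, and a failure message can list exactly which files drifted.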

What the diff command tells you

When your API evolves, oagen diff compares two spec versions and outputs a structured report of what changed:

  
npx oagen diff --old tasks-api-v1.yml --new tasks-api-v2.yml

The report lists added operations, removed operations, parameter changes, and schema changes. Emitters that implement generateTypeSignatures can hook into this workflow to produce compatibility overlays, which preserve the public API surface during generation even when the underlying spec changes operation names or reorganizes schemas. That is an advanced topic beyond this tutorial, but knowing the diff machinery exists is useful when planning how to ship breaking changes without forcing SDK users to update immediately.
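
To give a feel for what such a report contains, here is a simplified sketch of an operation-level diff. It is not oagen's diff implementation or output format: OperationSummary, SpecDiff, diffOperations, and isBreaking are hypothetical names, and treating removed operations and newly required parameters as the only breaking changes is a deliberate simplification.

```typescript
// Hypothetical sketch: classify operation-level changes between two spec versions.
interface OperationSummary {
  method: string; // "GET", "POST", ...
  path: string;
  requiredParams: string[];
}
interface SpecDiff {
  added: string[];
  removed: string[];
  newlyRequired: { operation: string; params: string[] }[];
}

// Same "METHOD /path" key format that operationHints uses
const keyOf = (op: OperationSummary): string => `${op.method.toUpperCase()} ${op.path}`;

function diffOperations(oldOps: OperationSummary[], newOps: OperationSummary[]): SpecDiff {
  const oldByKey = new Map<string, OperationSummary>();
  for (const op of oldOps) oldByKey.set(keyOf(op), op);
  const newKeys = new Set<string>(newOps.map((op) => keyOf(op)));

  const diff: SpecDiff = { added: [], removed: [], newlyRequired: [] };
  for (const op of newOps) {
    const previous = oldByKey.get(keyOf(op));
    if (!previous) {
      diff.added.push(keyOf(op));
      continue;
    }
    const params = op.requiredParams.filter((p) => previous.requiredParams.indexOf(p) === -1);
    if (params.length > 0) diff.newlyRequired.push({ operation: keyOf(op), params });
  }
  for (const op of oldOps) {
    if (!newKeys.has(keyOf(op))) diff.removed.push(keyOf(op));
  }
  return diff;
}

const isBreaking = (d: SpecDiff): boolean => d.removed.length > 0 || d.newlyRequired.length > 0;

const operationDiff = diffOperations(
  [
    { method: "get", path: "/tasks", requiredParams: [] },
    { method: "delete", path: "/tasks/{id}", requiredParams: ["id"] },
  ],
  [
    { method: "get", path: "/tasks", requiredParams: ["workspace_id"] },
    { method: "post", path: "/tasks/{id}/archive", requiredParams: ["id"] },
  ],
);
console.log(operationDiff.added);       // [ 'POST /tasks/{id}/archive' ]
console.log(operationDiff.removed);     // [ 'DELETE /tasks/{id}' ]
console.log(isBreaking(operationDiff)); // true
```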

Where to go from here

The emitter built here generates working Python code from a real OpenAPI spec in roughly 250 lines of TypeScript. It handles models, enums, path and query parameters, typed request bodies, optional fields, and method name resolution from the IR.

A production-ready Python emitter would add several things: a pagination helper that yields pages automatically without the caller handling cursors, __init__.py files that re-export the public surface cleanly, inline docstrings for every parameter sourced from the IR's description fields, and a py.typed marker file for mypy. None of those require changes to oagen itself. They are all emitter decisions, which is exactly the point.

The same architecture applies to any target language. Write the type renderer for Go or Ruby, wire it up to the same IR, and the method names, retry policies, and operation groupings are already decided. The work you put into operation hints and SDK behavior config in oagen.config.ts pays dividends across every language you add.

Both oagen and oagen-emitters are open source and MIT licensed at github.com/workos/oagen and github.com/workos/oagen-emitters. The WorkOS OpenAPI spec is public at github.com/workos/openapi-spec. Studying how the production emitters handle edge cases in a large, real-world API surface is the most useful next step after getting the basics working.
