.docx files headlessly. Install the SDK, open a document, and run an agentic tool loop. Full working code below.
If you need real-time sync between the agent and a frontend editor, add collaboration. The SDK client joins the same Yjs room as the frontend: edits appear live.
Prerequisites
- Node.js 18+
- @superdoc-dev/sdk
- An LLM provider API key (e.g., OPENAI_API_KEY)
Step 1: Install
- OpenAI
- Anthropic
- Vercel AI
Step 2: Open a document
Create an SDK client and open a .docx file. client.open() returns a document handle you’ll pass to the dispatcher.
Step 3: Load tools and system prompt
Load the tool definitions for your provider and the default system prompt. Both can be cached: they don’t change between requests.
- OpenAI
- Anthropic
- Vercel AI
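Because the tool definitions and system prompt do not change between requests, one way to cache them is to wrap each loader so it runs at most once. The sketch below is not part of the SDK: cacheOnce and the Loader type are hypothetical names, and the loader you pass in stands in for whichever SDK call returns the tools or the prompt.

```typescript
// Hypothetical helper, not an SDK function. `Loader` stands in for any
// async call that loads tool definitions or the system prompt.
type Loader<T> = () => Promise<T>;

// Wrap a loader so it runs at most once; later calls reuse the same promise.
function cacheOnce<T>(load: Loader<T>): Loader<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load().catch((err) => {
        cached = undefined; // do not cache failures, so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}
```

Create the cached loaders once at module scope and call them inside each request handler; concurrent requests then share a single in-flight load.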
Step 4: Run the agent loop
The agent loop sends messages to the LLM, dispatches tool calls, feeds results back, and repeats until the model is done.
- OpenAI
- Anthropic
- Vercel AI
- The system prompt teaches the model how to use SuperDoc tools.
- The while(true) loop calls OpenAI, checks for tool calls, dispatches them via dispatchSuperDocTool, and feeds results back.
- When the model returns finish_reason: 'stop' (no more tool calls), the loop ends.
- Errors are caught and returned as tool results so the model can see what went wrong and retry.
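The control flow described in these points can be sketched without any provider SDK. Everything below is an illustration under assumptions: the message and tool-call shapes are simplified, hypothetical types rather than OpenAI's or SuperDoc's real ones, callModel stands in for the provider request, and dispatch stands in for a call to dispatchSuperDocTool. The sketch also swaps while(true) for a bounded loop so a misbehaving model cannot spin forever.

```typescript
// Simplified, hypothetical shapes; real provider and SDK types differ.
interface ToolCall { id: string; name: string; args: Record<string, unknown>; }
interface ModelTurn {
  finishReason: "stop" | "tool_calls";
  text: string;
  toolCalls: ToolCall[];
}
type Message =
  | { role: "system" | "user" | "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };

type CallModel = (messages: Message[]) => Promise<ModelTurn>; // stand-in for the LLM request
type Dispatch = (call: ToolCall) => Promise<unknown>;         // stand-in for the tool dispatcher

async function runAgentLoop(
  systemPrompt: string,
  userPrompt: string,
  callModel: CallModel,
  dispatch: Dispatch,
  maxTurns = 20,
): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userPrompt },
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.text, toolCalls: reply.toolCalls });
    // No more tool calls: the model is done.
    if (reply.finishReason === "stop" || reply.toolCalls.length === 0) {
      return reply.text;
    }
    for (const call of reply.toolCalls) {
      let content: string;
      try {
        const result = await dispatch(call);
        content = JSON.stringify(result ?? null);
      } catch (err) {
        // Return the failure as a tool result so the model can see it and retry.
        content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
      }
      messages.push({ role: "tool", toolCallId: call.id, content });
    }
  }
  throw new Error(`agent did not finish within ${maxTurns} turns`);
}
```

Injecting callModel and dispatch keeps the loop testable with a scripted fake model, and the same skeleton works for any provider once its response is mapped into the ModelTurn shape.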
Step 5: Save and clean up
Full example
A complete, copy-pasteable script that opens a document, runs an agent, saves, and exits:
- OpenAI
- Anthropic
- Vercel AI
Other providers
AWS Bedrock
Use chooseTools({ provider: 'anthropic' }) and convert to Bedrock’s toolSpec shape:
- Node.js
- Python
Bedrock authenticates with AWS credentials from aws configure, environment variables, or an IAM role. No API key is needed.
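The toolSpec conversion mentioned above could look roughly like the sketch below. It assumes the tools arrive in Anthropic's standard shape (name, description, input_schema); whether chooseTools returns exactly that shape is an assumption here, and toBedrockTools is a hypothetical helper name, not an SDK function.

```typescript
// Anthropic-style tool definition. Assumed input shape, see the note above.
interface AnthropicTool {
  name: string;
  description?: string;
  input_schema: Record<string, unknown>;
}

// Tool entry as expected by the Bedrock Converse API's toolConfig.tools.
interface BedrockTool {
  toolSpec: {
    name: string;
    description?: string;
    inputSchema: { json: Record<string, unknown> };
  };
}

// Hypothetical helper: wrap each tool in a toolSpec and move the JSON
// schema under inputSchema.json.
function toBedrockTools(tools: AnthropicTool[]): BedrockTool[] {
  return tools.map((tool) => ({
    toolSpec: {
      name: tool.name,
      // Omit the key entirely when there is no description.
      ...(tool.description ? { description: tool.description } : {}),
      inputSchema: { json: tool.input_schema },
    },
  }));
}
```

The result would then be passed as toolConfig: { tools: toBedrockTools(tools) } in a Converse request.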
Streaming generated text into a visible editor
Sometimes you don’t need a full agent loop. You just want the model to write into the document while the user watches. Stream the output through a small backend proxy and append each delta to the editor.
editor.doc.insert is the public Document API. With no target, content appends at the end. Newlines from the model become real paragraph breaks.
A few things to get right:
- Keep the model key on the server. A small Node proxy that forwards Server-Sent Events keeps the key out of client bundles.
- Buffer deltas. Inserting on every token causes one document mutation per token, which floods the layout engine and undo stack. Flush on a timer (~150ms) or whenever a newline arrives.
- Abort on unmount and Stop. Tie an AbortController to the fetch and abort it from your cleanup. The server should also abort upstream when the client disconnects so neither side burns tokens.
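The buffering advice above can be sketched as a small class that collects deltas and writes them in batches. This is an illustration, not SuperDoc code: DeltaBuffer is a hypothetical name, and the insert callback is a stand-in that, in a real editor, might wrap editor.doc.insert.

```typescript
// Hypothetical callback type; in the editor it might wrap editor.doc.insert.
type Insert = (text: string) => void;

class DeltaBuffer {
  private pending = "";
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly insert: Insert;
  private readonly flushMs: number;

  constructor(insert: Insert, flushMs = 150) {
    this.insert = insert;
    this.flushMs = flushMs;
  }

  // Queue a streamed delta; flush at once on a newline, otherwise after flushMs.
  push(delta: string): void {
    this.pending += delta;
    if (delta.includes("\n")) {
      this.flush();
    } else if (this.timer === undefined) {
      this.timer = setTimeout(() => this.flush(), this.flushMs);
    }
  }

  // Write everything buffered so far as a single insert call.
  flush(): void {
    this.clearTimer();
    if (this.pending === "") return;
    const text = this.pending;
    this.pending = "";
    this.insert(text);
  }

  // Call on unmount or Stop: cancel the timer and drop unwritten text.
  cancel(): void {
    this.clearTimer();
    this.pending = "";
  }

  private clearTimer(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }
}
```

In a component, the cleanup function would call both controller.abort() on the AbortController passed to fetch and buffer.cancel(), so neither the network stream nor a pending timer outlives the editor. Call flush() once when the stream ends normally so the final partial line is written.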
Related
- LLM tools: tool catalog and SDK functions
- Best practices: prompting, workflow tips, and tested prompt examples
- Debugging: troubleshoot tool call failures
- Collaboration: add real-time sync between agent and frontend
- SDKs: typed Node.js and Python wrappers

