RuneFile: An Open Format for AI Prompts
We kept finding the same problem: AI prompts living in Google Docs, Slack threads, and sticky notes. So we built RuneFile, an open format that turns prompts into typed, versioned, portable functions. Just Markdown. Any model. No lock-in.
We built RuneFile because we got tired of watching smart teams lose their best prompts in Slack threads. Here's the story, the format, and why we think this matters.
Every team working with AI today has the same dirty secret: their prompts are a mess.
We know because we've been in the room. Over the past year, Woza Labs has been helping public sector organizations adopt AI for real operational work: procurement analysis, citizen service automation, policy drafting, internal communications. These aren't experiments. These are workflows that thousands of people depend on every day.
And in every single engagement, we found the same thing: the prompts powering these workflows lived in Google Docs, Notion pages, Slack messages, email chains titled "final prompt v3 FINAL (2)", and, yes, literal sticky notes on monitors.
This is not a tooling problem unique to government. It's everywhere. Startups, enterprises, agencies. The entire industry treats prompts like disposable text when they've quietly become critical infrastructure.
We decided to do something about it.
The problem, stated plainly
If you're building software, you have established patterns for nearly everything: version control for code, package managers for dependencies, schemas for data, environment configs for deployment. Each of these patterns emerged because someone recognized that a certain class of artifact had become important enough to deserve structure.
AI prompts have reached that threshold. Consider what a well-written prompt actually contains: it defines a persona, sets behavioral constraints, declares expected inputs, specifies output format, and often includes few-shot examples to guide the model. That's not a sticky note. That's a function signature with documentation.
Yet the current state of prompt management is roughly where software was before version control. Teams copy prompts manually. There's no way to parameterize them without string concatenation. There's no validation until runtime (when the model hallucinates because someone forgot a variable). There's no standard way to share a prompt between two people, let alone two organizations.
The consequences are predictable: duplicated work, inconsistent outputs, zero auditability, and a complete inability to improve prompts systematically over time.
What RuneFile is
RuneFile is an open format specification for AI prompts. A .rune.md file is a Markdown document, readable by anyone, writable by anyone, that encodes a prompt as a typed, parameterized, executable function.
The format has three layers that work together:
YAML frontmatter at the top declares the prompt's metadata and configuration: its name, target model, temperature, and most importantly, its inputs and output schema. Inputs are typed (string, text, number, boolean, enum, array, file) and can have defaults, constraints, and human-readable descriptions.
Markdown sections in the body define the prompt's content using ## headings that map directly to API message roles. A ## System section becomes a system message. A ## User section becomes a user message. A ## Prompt section is the primary parameterized user message. The order of sections in the document defines the message sequence, which means multi-turn conversations and few-shot patterns are represented naturally.
Template variables like {{company_name}} or {{industry}} are interpolated at execution time. Variables must match declared inputs. Missing required variables with no default raise errors before anything reaches the model.
Here's what this looks like in practice:
```markdown
---
name: policy_summarizer
version: 1.0.0
model: claude-sonnet-4-20250514
temperature: 0.3
inputs:
  - name: document
    type: text
    required: true
  - name: audience
    type: enum
    options: [executive, technical, citizen]
  - name: max_words
    type: number
    default: 300
output:
  format: json
  schema:
    type: object
    properties:
      title: { type: string }
      summary: { type: string }
      key_points:
        type: array
        items: { type: string }
    required: [title, summary, key_points]
---

## System

You are a policy analyst who makes complex government
documents accessible. Tailor language to the audience:

- **executive**: strategic implications, budget, risks
- **technical**: implementation, systems, compliance
- **citizen**: plain language, daily impact, next steps

## Prompt

Summarize the following policy document for a
**{{audience}}** audience.

Keep under approximately **{{max_words}}** words.

---

{{document}}
```
That's it. If you can read Markdown, you can read a RuneFile. If you can write YAML, you can author one. There is zero proprietary syntax, zero vendor dependency, and zero ambiguity about what this prompt does, what it expects, and what it produces.
Why this design
We made deliberate choices about what RuneFile is and isn't, and those choices reflect hard lessons from actual projects.
Markdown because it's already universal. We considered JSON, YAML-only, TOML, and custom DSLs. We rejected all of them because they create a barrier to adoption. Markdown has a unique property: non-technical people can read and edit it comfortably, while technical people can parse it programmatically. A policy officer in a government ministry can open a .rune.md file and understand what the prompt does. A developer can integrate it into a CI/CD pipeline. Same file serves both audiences.
Typed inputs because runtime failures are expensive. When a prompt expects an industry name and receives a boolean, the model doesn't crash, it produces confidently wrong output. By declaring input types upfront, runners can validate before execution. This is especially important in contexts where AI outputs inform decisions (procurement scoring, policy recommendations, citizen communications) and silent failures have real consequences.
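Concretely, this is what typed inputs buy you: a pre-flight check before any API call. A minimal Python sketch of what a runner might do (the helper name and error strings are illustrative, not part of the spec):

```python
def validate_inputs(declared: list[dict], provided: dict) -> list[str]:
    """Check provided values against declared inputs before any API call."""
    primitive = {"string": str, "text": str, "number": (int, float), "boolean": bool}
    errors = []
    for spec in declared:
        name = spec["name"]
        if name not in provided:
            # Required inputs with no default must be supplied by the caller.
            if spec.get("required") and "default" not in spec:
                errors.append(f"missing required input: {name}")
            continue
        value, kind = provided[name], spec["type"]
        if kind == "enum":
            if value not in spec.get("options", []):
                errors.append(f"{name}: {value!r} not one of {spec['options']}")
        elif kind == "array":
            if not isinstance(value, list):
                errors.append(f"{name}: expected an array")
        elif kind in primitive and not isinstance(value, primitive[kind]):
            errors.append(f"{name}: expected {kind}")
    return errors
```

Running this against the policy summarizer's declarations catches the missing document and the invalid audience before a single token is spent.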
Sections as message sequencing because multi-turn matters. Most interesting prompts aren't single messages. They include system instructions, few-shot examples (alternating user/assistant messages), injected context, and a final parameterized query. RuneFile represents this naturally: sections appear in document order, and repeated section types create additional messages. A translator prompt with three ## User / ## Assistant example pairs followed by a ## Prompt renders to exactly the message array you'd expect.
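The section-to-message mapping is small enough to sketch (assuming a Python runner; `parse_sections` and its role table are illustrative, not spec'd names):

```python
import re

# How section headings map to API message roles; ## Prompt is a user message.
ROLE_FOR_SECTION = {"system": "system", "user": "user",
                    "assistant": "assistant", "prompt": "user"}

def parse_sections(body: str) -> list[dict]:
    """Turn '## Heading' sections into an API message array, in document order."""
    # Split on level-2 headings, keeping each heading name via the capture group.
    parts = re.split(r"^## +(\w+)\s*$", body, flags=re.MULTILINE)
    messages = []
    for heading, content in zip(parts[1::2], parts[2::2]):
        role = ROLE_FOR_SECTION.get(heading.lower())
        if role:
            messages.append({"role": role, "content": content.strip()})
    return messages
```

Because the split preserves document order, repeated `## User` / `## Assistant` pairs fall out as few-shot exchanges with no special casing.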
Output schemas because structured output is the whole point. When you're using AI to produce data that feeds into other systems, like JSON that populates a dashboard or structured analysis that feeds a report, you need to validate the output. RuneFile supports JSON Schema declarations. Runners validate responses automatically and return both the output and any validation errors.
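A runner might walk the declared schema recursively, collecting errors rather than raising, so the output can be returned alongside its problems. A minimal sketch covering only the object/array/string subset used in the example above (a real runner would likely lean on a full JSON Schema validator):

```python
def check_output(schema: dict, data) -> list[str]:
    """Walk a JSON-Schema-style declaration, collecting errors instead of
    raising, so an imperfect response can still be returned to the caller."""
    errors = []
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(data, dict):
            return ["expected object"]
        for key in schema.get("required", []):
            if key not in data:
                errors.append(f"missing required property: {key}")
        for key, sub in schema.get("properties", {}).items():
            if key in data:
                errors.extend(f"{key}: {e}" for e in check_output(sub, data[key]))
    elif kind == "array":
        if not isinstance(data, list):
            return ["expected array"]
        for i, item in enumerate(data):
            errors.extend(f"[{i}] {e}"
                          for e in check_output(schema.get("items", {}), item))
    elif kind == "string" and not isinstance(data, str):
        errors.append("expected string")
    return errors
```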
Composition because prompts are like code. Real prompt libraries share common patterns: the same system instructions across a team, the same output format for a family of tasks, the same few-shot examples for a domain. RuneFile's include directive lets you compose prompts from reusable partials, with main-file sections overriding included ones. This means a team can maintain a shared system-analyst.rune.md partial and include it across dozens of task-specific prompts.
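Depth-first resolution with override semantics fits in a few lines (illustrative Python; the `files` mapping stands in for reading partials from disk):

```python
def resolve(name: str, files: dict, seen: frozenset = frozenset()) -> dict:
    """Depth-first include resolution with circular-reference detection.
    `files` maps a file name to {"includes": [...], "sections": {...}}."""
    if name in seen:
        raise ValueError(f"circular include: {name}")
    seen = seen | {name}
    sections: dict = {}
    for partial in files[name].get("includes", []):
        sections.update(resolve(partial, files, seen))
    # Main-file sections override anything pulled in from partials.
    sections.update(files[name]["sections"])
    return sections
```

The update order is the whole trick: partials are merged first, so the main file's sections always win.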
Provider-agnostic because lock-in is the wrong default. The model field in frontmatter is informational, not binding. A runner can override it at execution time. The same .rune.md file can run against Claude, GPT, Mistral, Llama, or any future model. Your prompt library is yours, not your provider's.
Security: the thing nobody talks about
Prompts that accept user input are vulnerable to template injection, the prompt equivalent of SQL injection. If a variable value contains {{ and }} sequences, it can hijack the template. If it contains phrases like "ignore previous instructions," it can attempt to override the system prompt.
RuneFile's spec addresses this explicitly. Runners must strip or escape template syntax in variable values. They must validate enum values against declared options. They must reject inputs exceeding declared length limits. And they should support a --strict mode that warns on instruction-like patterns in input.
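Those requirements translate into a small amount of code. A hedged sketch (the `max_length` constraint name and the instruction-pattern regex are assumptions for illustration, not names from the spec):

```python
import re

# One example of an instruction-like pattern; a real --strict mode
# would maintain a broader list.
INSTRUCTION_PATTERN = re.compile(
    r"ignore (all |any )?previous instructions", re.IGNORECASE)

def sanitize(value: str, spec: dict) -> str:
    """Neutralize template syntax and enforce declared constraints."""
    # Strip template delimiters so a value can't hijack interpolation.
    value = value.replace("{{", "").replace("}}", "")
    if spec.get("type") == "enum" and value not in spec.get("options", []):
        raise ValueError(f"value not in declared options: {value!r}")
    max_len = spec.get("max_length")  # assumed constraint name
    if max_len is not None and len(value) > max_len:
        raise ValueError(f"input exceeds declared limit of {max_len}")
    return value

def strict_warnings(value: str) -> list[str]:
    """--strict mode: flag instruction-like patterns without rejecting them."""
    if INSTRUCTION_PATTERN.search(value):
        return ["input contains instruction-like text"]
    return []
```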
We also specify that prompt files must never contain secrets, API keys, or tokens. Runners may support environment variable references ({{$ENV.API_KEY}}), but those references must never be logged or included in execution traces.
This isn't theoretical. In public sector contexts, where prompts process citizen data and produce official communications, input sanitization is a compliance requirement. We built it into the format because bolting it on later never works.
The execution lifecycle
A RuneFile goes through six stages from file to output:
1. Parse reads the YAML frontmatter and identifies Markdown sections.
2. Resolve merges any included partials, depth-first, detecting circular references.
3. Validate checks that all required inputs are provided, types match declarations, and constraints are satisfied.
4. Render interpolates variables into section content, producing final message strings.
5. Execute sends the rendered message array to the AI provider.
6. Output returns the raw response or, if a schema is declared, validates the response and returns both the output and any validation errors.
Every stage can fail gracefully. Missing frontmatter is fatal. A type mismatch is fatal. An unknown variable in a template is a warning, logged but not blocking. Output schema validation failure returns the output alongside the errors, because a structurally imperfect response is often still useful.
This lifecycle is designed for both interactive use (a developer running a prompt from the terminal) and programmatic use (an application calling a runner library).
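Collapsed into one function, the lifecycle and its failure policy look roughly like this (a deliberately simplified sketch: includes, typed validation, and multi-section rendering are elided, and `send` stands in for the provider call):

```python
import re

def run(source: str, inputs: dict, values: dict, send) -> dict:
    """Simplified lifecycle: parse -> validate -> render -> execute -> output.
    The whole body is sent as one user message; `send` is any callable
    taking a message list."""
    warnings = []
    # Parse: missing frontmatter is fatal.
    if not source.startswith("---"):
        raise ValueError("missing frontmatter")
    _, _frontmatter, body = source.split("---", 2)
    # Validate: a required input with no default must be provided.
    for name, spec in inputs.items():
        if spec.get("required") and name not in values and "default" not in spec:
            raise ValueError(f"missing required input: {name}")
    # Render: unknown variables warn but do not block.
    def sub(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        if name in inputs and "default" in inputs[name]:
            return str(inputs[name]["default"])
        warnings.append(f"unresolved variable: {name}")
        return match.group(0)
    rendered = re.sub(r"\{\{(\w+)\}\}", sub, body).strip()
    # Execute + Output: return the response alongside any warnings.
    return {"output": send([{"role": "user", "content": rendered}]),
            "warnings": warnings}
```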
What you can build today
The repo includes six complete example files that cover the major patterns.
Competitive analysis (analysis.rune.md) demonstrates typed inputs with enums and arrays, JSON output with a full schema, few-shot examples, and context injection. It's the kind of prompt an analyst would run daily against different companies.
Email writer (email.rune.md) shows how a single prompt file can produce dramatically different outputs based on a tone parameter. Same template, friendly or assertive or formal, the runner just swaps the variable.
Code reviewer (review.rune.md) accepts source code via text input (pipe-friendly for CLI usage), focuses the review on a specific area (security, performance, readability), and produces structured JSON with line-level issues and severity ratings.
Translator (translate.rune.md) uses multi-turn sections, ## User / ## Assistant pairs, to establish few-shot translation examples before the parameterized prompt. This is the pattern that makes LLMs dramatically better at consistent translation.
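The shape of that pattern, in miniature (an illustrative sketch, not the actual contents of translate.rune.md):

```markdown
## User
Translate to French: "The meeting is at noon."

## Assistant
La réunion est à midi.

## User
Translate to French: "Please review the attached report."

## Assistant
Veuillez examiner le rapport ci-joint.

## Prompt
Translate to {{target_language}}: "{{text}}"
```

Each `## User` / `## Assistant` pair becomes one example exchange in the rendered message array; the final `## Prompt` is the live, parameterized request.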
Policy summarizer (policy-summarizer.rune.md) adapts the same document summary for three different audiences: executives get strategic implications, technical teams get implementation details, citizens get plain language. This came directly from our public sector work.
Citizen response drafter (citizen-response.rune.md) generates empathetic, clear, actionable responses to citizen inquiries in multiple languages, on behalf of a named department. It includes strict behavioral guidelines in the system prompt and few-shot examples that demonstrate the expected tone.
Each of these files is self-contained, documented, and ready to run.
CLI: coming next
We're building the rune CLI, a lightweight command-line runner that parses .rune.md files and executes them against multiple providers out of the box.
```shell
rune run analysis.rune.md \
  --var company="Acme" \
  --var industry="climatetech"

rune render analysis.rune.md \
  --var company="Test" --var industry="saas"

rune validate analysis.rune.md
rune inspect analysis.rune.md
```
run executes the prompt. render previews the rendered messages without sending anything to a model, which is essential for debugging. validate checks the file without executing. inspect lists the declared inputs, their types, defaults, and descriptions.
Multi-provider support (Anthropic, OpenAI, Mistral, and others) is baked in from day one. Model overrides at runtime mean you can test the same prompt against different providers with a single flag.
What we're not saying
We're not saying RuneFile is the final answer to prompt management. We're not claiming this will become a universal standard. We're not building a platform, a marketplace, or a SaaS product around this.
We're saying that prompts have become important enough to deserve a format, and we're contributing one. It's open source, it's designed to be forked and extended, and it's shaped by real problems we've encountered helping real teams do real work.
If you're building with AI and your prompts live in documents or chat threads, give it a look. If you see something missing or broken, open an issue. If you want to build a runner in your language of choice, the spec is there.
The way we write prompts today won't survive the next two years. The improvisation phase is ending. We'd rather help build what comes next than wait for someone else to do it.
RuneFile is open source under MIT license.
Star it. Fork it. Make it better.