Create content, then test it on a real audience — from inside your agent

We're launching the neuroflash API and MCP server as first-class developer products. Here's what's inside, why we built it MCP-native, and how to start.

Donald Vlahovic

21 Jun 2026 — 5 min read

Most AI tooling stops at generation. You prompt, you get copy, and then you ship it on a hunch. The expensive part — finding out whether your audience actually responds to it — happens later, in production, after the budget is already spent.

neuroflash was built to close that loop: generate content and validate it against a real audience before it goes live. Today we're making that loop programmable. The neuroflash API and MCP server are now standalone products — not features buried inside the app, but a developer surface you can build on, with their own docs, onboarding, and pricing.

This post covers why we shipped them as products, what's actually inside, how the MCP server is architected, and how to connect in a few minutes.

Why an API and an MCP server — as products

Two shifts drove this.

First, integration is now table stakes. The 2025 state-of-the-API data is blunt about it: the overwhelming majority of organizations are API-first, and a large and growing share treat their APIs as revenue-generating products in their own right. A capability that can't be called from your own stack increasingly doesn't count. So the neuroflash API is a first-class product: documented, versioned, and built to be integrated, not just demoed.

Second, MCP is becoming the default way agents reach tools. The Model Context Protocol gives any MCP-aware client — Claude, Cursor, and a growing list of others — a uniform way to discover and call external capabilities. Shipping an MCP server means neuroflash shows up where developers already work, with no bespoke glue code. We didn't want that to be an afterthought wrapper around the API; we wanted it to be a product with its own design decisions. That's what the rest of this post is about.

What's inside

The MCP server spans seven API domains, four headline capabilities plus three supporting ones:

Digital Twins — query AI representations of real audience segments.
Brand Voices — create, import, and apply on-brand tone.
Content generation — text in your brand voice, for any channel.
Image generation — on-brand visuals in-workflow.
Target audiences, usage/quota, and workspaces — the supporting surface.

The differentiator is the Digital Twins. They aren't LLM personas improvised from a prompt — they're built on over one million real survey profiles collected since 2017, with up to 255 data points per person, and in our testing they reproduce real human responses with 85–98% accuracy. That means an agent can do something genuinely new: draft a headline, an email subject, or an ad, then ask the target segment what they think — in seconds, without leaving the conversation.

In practice the flow looks like this: generate_text with a brand_voice_id to produce on-brand copy, then chat_with_twin (or chat_with_twin_group) to get segment-specific feedback, then iterate. Create and validate, in one place.

The loop that sets neuroflash apart: generation and validation in a single workflow.
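
On the wire, the generate-then-validate flow is two MCP `tools/call` requests. The sketch below is illustrative only, not the server's documented schema. The tool names and `brand_voice_id` come from the text above, and the JSON-RPC envelope follows the MCP specification. The other argument names (`prompt`, `twin_id`, `message`) and the placeholder IDs are assumptions to check against the tool reference.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: produce on-brand copy. `prompt` is an assumed argument name.
generate = tool_call("generate_text", {
    "brand_voice_id": "bv_example",  # placeholder id
    "prompt": "Subject line for our spring sale email",
})

# Step 2: ask the segment. `twin_id` and `message` are assumed argument names.
validate = tool_call("chat_with_twin", {
    "twin_id": "twin_example",  # placeholder id
    "message": "How does this subject line land with you? <draft goes here>",
})

print(json.dumps([generate, validate], indent=2))
```

In an MCP client you never write these envelopes by hand; the model issues them. Seeing them spelled out mainly helps when debugging or logging a session.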

How the MCP server is built

A few deliberate architecture choices.

Remote, not local. It runs as a remote HTTP server using Streamable HTTP transport — there's nothing to install or run. You point your client at one URL:

https://app.neuroflash.com/api/mcp-server/v1/mcp
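
To make "one URL" concrete, here is a rough sketch of the first request a Streamable HTTP client sends: a JSON-RPC `initialize` call POSTed to that endpoint. The `Accept` header and envelope follow the MCP specification. The protocol version string, the client name, and the token value are assumptions, and the request is only constructed here, never sent.

```python
import json
import urllib.request

MCP_URL = "https://app.neuroflash.com/api/mcp-server/v1/mcp"


def build_initialize_request(access_token: str) -> urllib.request.Request:
    """Construct (but do not send) the MCP `initialize` request."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed version; the server may negotiate a different one.
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients must accept both plain JSON and SSE.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {access_token}",
        },
    )


req = build_initialize_request("token-from-oauth-flow")  # placeholder token
print(req.get_method(), req.full_url)
```

Claude, Cursor, and other MCP clients do this for you, so the sketch is useful mostly for understanding what a proxy or firewall will see.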

OAuth 2.0 with PKCE. Authentication is a browser-based login with your neuroflash account — no API keys to paste into config files, no long-lived secrets on disk. Your client handles the flow; the server only ever sees a scoped token.

Figure: An MCP client such as Claude or Cursor authenticates via OAuth 2.0 with PKCE, then reaches the seven neuroflash service domains through the MCP server. One URL, one OAuth login: the server fans out to every neuroflash domain.

Three interaction modes. This is the part we're most proud of. The same server can present its surface three different ways, and the LLM picks the best approach for the question:

Traditional — one tool per endpoint. Best for single, specific operations like "list my brand voices."
Plan — instead of chatting one call at a time, the LLM submits a single typed JSON plan that the server executes server-side: sequential and parallel calls, transforms, branching, bounded loops. No per-step round-trips, no sandbox boot. For multi-step workflows it uses roughly 80% fewer tokens than calling tools directly.
Exploratory — a discover → query → compare flow for open-ended questions, where the model progressively walks the API surface.

Figure: The three interaction modes side by side. Traditional calls one tool per endpoint; Plan submits a single typed JSON plan executed server-side; Exploratory walks the API with discover, query, and compare. Same server, three surfaces: the model picks the most efficient one for the task.

A minimal Plan-mode payload — fetch every quota for a workspace in one round-trip — looks like this:

{
  "version": "1",
  "steps": [
    {
      "id": "all",
      "call": {
        "method": "GET",
        "path": "/api/usage-service/v1/workspaces/{workspace_id}/quotas"
      },
      "args": { "workspace_id": "$ctx.workspace_id" }
    }
  ],
  "return": { "all": "$all" }
}

The $ctx.workspace_id resolves automatically from the authenticated session. The point isn't the JSON — it's that complex, multi-call workflows stop costing you a chatty back-and-forth and a pile of tokens.
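
To show how the references in that payload might be read, here is a toy interpreter for the plan above. It is a sketch of the apparent semantics, not the server's implementation. The resolution rules (`$ctx.<key>` reads from session context, `$<step_id>` reads an earlier step's result, `{name}` in the path is filled from `args`) are inferred from the example, and `fake_execute` stands in for the real HTTP call so the code runs offline.

```python
plan = {
    "version": "1",
    "steps": [
        {
            "id": "all",
            "call": {
                "method": "GET",
                "path": "/api/usage-service/v1/workspaces/{workspace_id}/quotas",
            },
            "args": {"workspace_id": "$ctx.workspace_id"},
        }
    ],
    "return": {"all": "$all"},
}


def resolve(value, ctx: dict, results: dict):
    """Replace `$ctx.<key>` and `$<step_id>` references; pass other values through."""
    if isinstance(value, str) and value.startswith("$ctx."):
        return ctx[value[len("$ctx."):]]
    if isinstance(value, str) and value.startswith("$"):
        return results[value[1:]]
    return value


def run_plan(plan: dict, ctx: dict, execute) -> dict:
    """Run steps in order; `execute(method, path)` performs one call."""
    results = {}
    for step in plan["steps"]:
        args = {k: resolve(v, ctx, results) for k, v in step.get("args", {}).items()}
        path = step["call"]["path"].format(**args)  # fill `{workspace_id}`
        results[step["id"]] = execute(step["call"]["method"], path)
    return {k: resolve(v, ctx, results) for k, v in plan["return"].items()}


calls = []


def fake_execute(method: str, path: str) -> dict:
    """Hypothetical executor: records the call and echoes it back."""
    calls.append((method, path))
    return {"echo": f"{method} {path}"}


out = run_plan(plan, {"workspace_id": "ws_123"}, fake_execute)  # placeholder id
print(out)
```

The real Plan mode also covers parallel calls, transforms, branching, and bounded loops, none of which this sketch attempts.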

Getting started

In Claude Desktop:

Settings → Connectors → Add custom connector.
Paste the server URL: https://app.neuroflash.com/api/mcp-server/v1/mcp
A browser window opens — log in with your neuroflash account to authorize.
Under Tool Permissions, set tools to Allow All so calls run without interruption.

In Cursor (which needs a loopback OAuth bridge), add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "neuroflash": {
      "command": "npx",
      "args": ["mcp-remote", "https://app.neuroflash.com/api/mcp-server/v1/mcp"]
    }
  }
}
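
If you already have other servers in that file, pasting the snippet over it would drop them. The helper below is a small sketch that merges the neuroflash entry into an existing `mcp.json` instead. The function name is made up for illustration, and the demo writes to a temporary directory rather than your real config.

```python
import json
import tempfile
from pathlib import Path

NEUROFLASH_ENTRY = {
    "command": "npx",
    "args": ["mcp-remote", "https://app.neuroflash.com/api/mcp-server/v1/mcp"],
}


def add_neuroflash(config_path: Path) -> dict:
    """Merge the neuroflash entry into mcp.json, keeping other servers intact."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})["neuroflash"] = NEUROFLASH_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway directory with one pre-existing (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".cursor" / "mcp.json"
    demo_path.parent.mkdir(parents=True)
    demo_path.write_text(
        json.dumps({"mcpServers": {"other": {"command": "other-server"}}}),
        encoding="utf-8",
    )
    merged = add_neuroflash(demo_path)
    print(sorted(merged["mcpServers"]))
```

To apply it for real, call `add_neuroflash(Path.home() / ".cursor" / "mcp.json")` and restart Cursor.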

Any MCP client that speaks Streamable HTTP and uses an HTTPS or loopback redirect works the same way. Full setup, the tool reference, and mode deep-dives are in the docs: https://neuro-flash.github.io/mcp-server/index.html

One note worth knowing up front: API and MCP usage draws from the same workspace quota as the app. There's no separate allowance and no surprise meter — what you'd spend in the UI is what you spend through the API. And because it's neuroflash, it's GDPR-compliant, hosted in Germany, and ISO 27001 certified.

Why this matters

The interesting thing about putting a validation loop behind an API isn't the tooling — it's what agents can now do with it. An autonomous workflow can generate a dozen subject-line variants, run them past a real audience segment, rank them by predicted resonance, and only then send. The "test it on real people" step, which used to mean weeks of market research, becomes one more call in the chain.

That's the bet we're making with the API and the MCP server: that the next generation of content workflows won't just generate — they'll generate, validate, and decide. And they'll do it from inside the tools you already use.

The neuroflash API and MCP server are available now. If you build with them, we'd love your feedback. Your first thread with a Digital Twin tends to be the moment it clicks.

Get started: MCP overview · MCP server docs · API overview · API docs