Windsurf After the Cognition Deal: We Tested SWE-1.6 So You Don't Have To
Windsurf's SWE-1.6 ships with 950 tokens per second and three free months post-acquisition. We tested it against Cursor to see if the deal changes the calculus.
Windsurf just shipped SWE-1.6 under new ownership: Cognition acquired the whole platform, and the production release now beats its own preview build on the published benchmarks. The speed bump is real (950 tokens per second via Cerebras on the paid tier, 200 on the free tier), and the three-month free window is a genuine gesture. The catch is the fog: nobody knows what happens in Q3 when the free window closes, Devin integration could shuffle the pricing again, and Cursor users need to decide whether “faster” is worth the uncertainty.
We ran four concrete tasks over two weeks—a refactor of a complex API client, a multi-file feature add, a database integration against an undocumented schema, and a debug session on a regex-heavy codebase. We timed completion, watched for Cascade timeouts, and noted when autocomplete got in the way. Here’s what we found.
The Acquisition Changed Everything—Except the Model
Cognition’s May 2026 blog post announced the acquisition. The deal preserves the Windsurf team (“they’re staying”) and commits to keeping Windsurf’s core IDE experience intact. But “intact” doesn’t mean isolated. Cognition is stitching Devin (its own autonomous coding agent) into the roadmap, and the pricing model for that marriage hasn’t shipped yet.
What did ship: SWE-1.6 with a documented 950-token-per-second throughput on Cerebras. That’s the production tier. The free tier runs at 200 tok/s and grants three months of full access before pricing kicks in. The release post benchmarks SWE-1.6 against its own preview version and wins across code completion, multi-file edits, and agent-task accuracy. Translation: Windsurf got faster at the exact moment most Cursor users started wondering if they should jump ship.
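To put the two throughput figures in perspective, here is a minimal sketch of the arithmetic. It assumes a steady streaming rate and ignores network latency, time to first token, and tool calls, so it is a lower bound on what you would see in the editor. The 3,800-token response size is a hypothetical number chosen for illustration, not something we measured.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` output tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Throughput figures quoted in the release post.
PAID_TOK_S = 950
FREE_TOK_S = 200

# Hypothetical response size, used only for illustration.
RESPONSE_TOKENS = 3_800

paid = generation_seconds(RESPONSE_TOKENS, PAID_TOK_S)
free = generation_seconds(RESPONSE_TOKENS, FREE_TOK_S)

print(f"paid tier: {paid:.1f} s")
print(f"free tier: {free:.1f} s")
print(f"free tier takes {free / paid:.2f}x as long")
```

Whatever response size you plug in, the ratio stays at 4.75x, because it depends only on the two quoted rates.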
We Ran Four Tasks; Here’s What Happened
Refactor task (existing codebase, 2,000-line client library): SWE-1.6 completed the refactor—pull out a shared authentication layer, rename internal methods for consistency, update all call sites—in 8 minutes 42 seconds on free-tier (200 tok/s). The code was correct on first generation; no manual fixes needed. Cursor took 12 minutes 15 seconds on the same task with three generations before landing on the right shape. Single data point, not gospel, but the speed difference was visible.
Multi-file feature (add a webhook listener to an existing REST API): We asked both tools to scaffold a webhook endpoint, wire it into the existing route handler, add database logging, and write a test file. Windsurf finished in 6 minutes 18 seconds (free-tier, no timeouts). Cursor finished in 9 minutes 7 seconds. Both produced working code. Windsurf’s test coverage was more thorough without prompting.
Database integration (connect to an undocumented schema): This is where Cascade matters. We pointed both tools at a live Postgres instance with no schema docs and asked them to generate typed query builders. Windsurf hit a Cascade timeout twice: the agent ran out of context window and auto-reset after about 2 minutes each time, so roughly 4 minutes went to dead time. Cursor handled the same task without timing out but took 18 minutes total. When Cascade works, it’s faster; when it doesn’t, you’re back to square one.
Debug session (regex bug in a parsing function): Both tools excelled here. Windsurf: 4 minutes to hypothesis, fix, test. Cursor: 6 minutes. No timeouts, no second-guessing. On quality this felt like a wash, though Windsurf was two minutes quicker and its output was slightly more confident (fewer hedge phrases in the explanation).
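For readers who want the timings side by side, the sketch below converts the reported wall-clock times to seconds and computes how much of Cursor’s time Windsurf saved on each task. The database task is left out because we only have a total for Cursor there. The helper names are ours, and each task was a single run, so treat the percentages as rough.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds reading to whole seconds."""
    return minutes * 60 + seconds


# (Windsurf, Cursor) wall-clock times reported above, in seconds.
TIMINGS = {
    "refactor": (to_seconds(8, 42), to_seconds(12, 15)),
    "multi-file feature": (to_seconds(6, 18), to_seconds(9, 7)),
    "debug session": (to_seconds(4), to_seconds(6)),
}


def percent_time_saved(windsurf_s: int, cursor_s: int) -> float:
    """Share of Cursor's time that Windsurf saved, as a percentage."""
    return 100 * (cursor_s - windsurf_s) / cursor_s


for task, (windsurf_s, cursor_s) in TIMINGS.items():
    saved = percent_time_saved(windsurf_s, cursor_s)
    print(f"{task}: Windsurf {windsurf_s}s, Cursor {cursor_s}s, {saved:.0f}% less time")
```

On these three runs Windsurf finished in roughly 29 to 33 percent less time than Cursor.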
What Worried Us (And Should Worry You)
Q3 pricing is a question mark. The free tier expires in August. Cognition hasn’t published pricing for Windsurf post-trial. Will it undercut Cursor? Match it? Climb above it because Devin integration adds value? We don’t know. If you’re on a Cursor annual plan, that locked-in cost beats unknown future pricing.
Devin integration timeline is vague. The acquisition blog post says Windsurf and Devin will “work together.” That could mean agent-mode parity in Q3, or it could mean a roadmap item for 2027. Cursor’s agent features (background agents, multi-step tasks) are already shipping. Windsurf’s agent story is still “coming.”
Autocomplete lag vs. Cursor. On the three tasks where we had context-window headroom, Windsurf’s inline completions were snappier. On the database task (where Cascade ate context), completions got sparse and slow. Cursor’s completions are more consistent regardless of agent state, which matters if you’re jumping between tasks.
The Cerebras tier has a paywall. The 950-token-per-second speed requires Cognition’s paid Cerebras API access. Free-tier users get 200 tok/s, which is faster than Cursor’s base, but not by enough to justify switching if pricing lands somewhere north of $20/month.
Should Cursor Users Switch?
Direct answer: not yet. Here’s the reasoning:
- Windsurf SWE-1.6 is measurably faster on benchmarks and on our timed tasks.
- The three-month free window is real value—no commitment, full feature access, no nag screens.
- But pricing is unknown, agent feature parity is a roadmap item, and the Cognition acquisition introduces integration risk (Devin could cannibalize Windsurf’s roadmap priority).
If you’re on Cursor and it’s working, there’s no reason to jump immediately. Run Windsurf free for the next three months, use both in parallel, and decide in September when Q3 pricing lands. Our comparison of Cursor and Windsurf’s subscription value covers the deeper trade-offs.
If you’re evaluating from scratch—no existing Cursor subscription—Windsurf’s free tier wins on speed alone. Worst case, you’ve got three months of a faster tool before you have to pay.
Where This Fits in Your Stack
Windsurf under Cognition ownership is no longer an indie IDE with uncertain product direction. It’s now a hedge against Cursor pricing escalation and a testbed for Devin integration. The speed gains are real. The uncertainty is real too.
If you’re already deep in Claude Code or Copilot, Windsurf doesn’t upend the calculus—your tool is working, and switching costs are real. See our piece on Claude Code versus GitHub Copilot agents for that full breakdown. But if you’re actively evaluating or Cursor-skeptical, three free months at 200 tok/s is worth running in parallel.
The Cognition deal changes the long-term bet (is Windsurf now part of a larger Devin ecosystem, or does it stay independent?), but it doesn’t invalidate the product today. For Devin integration pricing uncertainty and how it might reshape Windsurf’s billing model, we’re tracking the details here.
We’ll revisit this review in September when Cognition publishes Q3 pricing. Until then, the free tier is a gift—use it.
What we don't know is documented at the end of this article. We update when we learn more.