How Replit Secures AI-Generated Code [white paper]

Dawei Feng

AI-generated code is changing how software is built, but securing that code raises new challenges. This research explores whether AI-driven security scans are sufficient for vibe coding platforms, or whether they risk asking models to audit their own output.

Through controlled experiments on React applications with realistic vulnerability variants, we compare AI-only security scans with Replit’s hybrid approaches that combine deterministic static analysis and dependency scanning with LLM-based reasoning. Along the way, we examine how prompt sensitivity, nondeterminism, and ecosystem awareness affect real-world security outcomes.

We show that functionally equivalent code can receive different security assessments depending on syntactic form or prompt phrasing. Issues like hardcoded secrets may be detected in one representation and missed in another. More critically, dependency-level vulnerabilities and supply-chain risks remain largely invisible without traditional scanning infrastructure.
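
To make that failure mode concrete, here is a minimal illustrative pair in TypeScript (the variable names and key value are placeholders, not variants from the paper's test corpus):

```typescript
// Variant A: secret as a plain string literal. Both rule-based
// scanners and LLM reviewers tend to flag this form.
const apiKeyA = "sk_live_EXAMPLE_ONLY_1234567890";

// Variant B: the same secret assembled at runtime. Functionally
// equivalent, yet an AI-only scan may judge the two variants
// differently, or judge the same variant differently across runs.
const keyParts = ["sk_live_", "EXAMPLE_ONLY_", "1234567890"];
const apiKeyB = keyParts.join("");
```

Whether any particular scanner catches variant B depends on its rules, but a deterministic tool at least returns the same verdict on every run; an AI-only scan can flip with sampling or prompt phrasing.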

The takeaway is not that LLMs are ineffective, but that they are best used alongside deterministic tools. While LLMs can reason about business logic and intent-level issues, static analysis and dependency scanning are essential for establishing a reliable security baseline.
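
The sketch below shows what that division of labor can look like: a deterministic rule pass establishes the baseline, and an LLM pass reviews intent-level issues with that baseline as context. The regex rules and the `reviewWithLlm` stub are illustrative assumptions, not Replit's actual implementation.

```typescript
interface Finding {
  rule: string;
  line: number;
  excerpt: string;
}

// Deterministic pass: pattern rules fire identically on every run,
// regardless of variable naming or how the scan was requested.
const RULES: { name: string; pattern: RegExp }[] = [
  { name: "hardcoded-secret", pattern: /(api[_-]?key|secret|token)\s*[:=]\s*["'][^"']{16,}["']/i },
  { name: "dangerous-html", pattern: /dangerouslySetInnerHTML/ },
];

function staticScan(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    for (const { name, pattern } of RULES) {
      if (pattern.test(text)) {
        findings.push({ rule: name, line: i + 1, excerpt: text.trim() });
      }
    }
  });
  return findings;
}

// LLM pass (stubbed): receives the deterministic findings as structured
// context, so the model reasons about business logic rather than being
// solely responsible for the baseline.
async function reviewWithLlm(source: string, baseline: Finding[]): Promise<string> {
  return `LLM review requested with ${baseline.length} baseline finding(s).`;
}

async function hybridScan(source: string): Promise<void> {
  const baseline = staticScan(source);
  console.log("Deterministic findings:", baseline);
  console.log(await reviewWithLlm(source, baseline));
}

void hybridScan(`const API_KEY = "sk_live_EXAMPLE_ONLY_123456";`);
```

The ordering matters: because the rule pass is repeatable, the overall scan never does worse than its deterministic baseline, whatever the LLM pass adds or misses on a given run.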

Key Findings:

  • AI-only security scans are nondeterministic: Identical vulnerabilities receive different classifications based on minor syntactic changes or variable naming
  • Prompt sensitivity limits coverage: Detection depends on what security issues are explicitly mentioned, shifting responsibility from tool to user
  • Dependency vulnerabilities go undetected: Without continuous vulnerability feeds, AI cannot reliably identify version-specific CVEs (see the sketch after this list)
  • Static analysis provides consistency: Rule-based scanners deliver deterministic, repeatable detection across all code variations
  • Hybrid architecture is essential: Combining deterministic baseline security with LLM-powered reasoning provides comprehensive protection
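
On the dependency point above, a short sketch of why a live advisory feed is irreplaceable. The advisory entry and package name below are placeholders, not real CVE data; a production scanner would pull advisories from a continuously updated database, such as the one behind `npm audit`.

```typescript
interface Advisory {
  pkg: string;
  fixedIn: string; // versions below this are vulnerable
  id: string;
}

// Placeholder feed entry; real advisories arrive continuously.
const ADVISORIES: Advisory[] = [
  { pkg: "example-widget", fixedIn: "1.3.0", id: "EXAMPLE-2024-0001" },
];

// Numeric comparison of dotted versions (sufficient for this sketch;
// real tools parse full semver ranges).
function lessThan(a: string, b: string): boolean {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x < y;
  }
  return false;
}

function auditDependencies(installed: Record<string, string>): Advisory[] {
  return ADVISORIES.filter(
    (adv) => installed[adv.pkg] !== undefined && lessThan(installed[adv.pkg], adv.fixedIn)
  );
}

console.log(auditDependencies({ "example-widget": "1.2.9" }));
// Flags EXAMPLE-2024-0001. A model without access to the feed cannot
// produce this result for advisories published after its training cutoff.
```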

If you’re interested in the methodology, experiments, and detailed analysis behind these findings, read the full white paper.
