How to Generate Production Code with GPT-4o

GPT-4o represents a significant advancement in AI-assisted code generation, offering multimodal capabilities and improved reasoning over previous models. As of June 2026, this model has become a standard tool in production engineering workflows, enabling developers to generate, refactor, and debug complex codebases. In this tutorial, we'll build a production-grade code generation pipeline that leverages GPT-4o's API [1] to create validated, testable Python modules from natural language specifications.

Why GPT-4o Changes Production Code Generation

Traditional code generation approaches often produce syntactically correct but semantically flawed code that fails in edge cases or doesn't integrate well with existing systems. GPT-4o addresses these limitations through several key improvements:

Enhanced reasoning capabilities: The model can maintain context across longer code sequences and understand complex architectural patterns
Multimodal understanding: You can provide diagrams, screenshots, or handwritten notes as input alongside text prompts
Improved instruction following: GPT-4o better adheres to specific coding standards, style guides, and framework conventions

According to OpenAI's documentation [8], GPT-4o processes tokens at approximately 2x the speed of GPT-4 Turbo while maintaining comparable output quality for code generation tasks. This makes it suitable for real-time code completion in IDEs and CI/CD pipelines.

Real-World Use Case and Architecture

Consider a scenario where your team needs to generate data validation modules for a microservices architecture. Each service requires consistent validation logic but with slightly different business rules. Manual implementation is error-prone and time-consuming. A GPT-4o-powered code generation pipeline can:

Accept natural language specifications for validation rules
Generate Python modules with proper type hints and docstrings
Automatically create corresponding unit tests
Validate the generated code against your project's linting and type-checking standards

The architecture we'll build consists of three layers:

Orchestration Layer: Manages API calls to GPT-4o, handles rate limiting, and implements retry logic
Validation Layer: Parses generated code, checks syntax, runs linters, and verifies type correctness
Integration Layer: Outputs validated code as importable modules with proper project structure

Prerequisites and Environment Setup

Before diving into implementation, ensure you have the following:

Python 3.11+ installed (the commands below use Python 3.12)
An OpenAI API key with access to GPT-4o (verify access via the API dashboard)
pip version 24.0 or later

Create a virtual environment and install the required packages:

python3.12 -m venv codegen_env
source codegen_env/bin/activate  # On Windows: codegen_env\Scripts\activate

pip install openai==1.35.0 \
    pylint==3.2.0 \
    mypy==1.10.0 \
    pydantic==2.7.0 \
    black==24.4.0 \
    pytest==8.2.0 \
    httpx==0.27.0 \
    tenacity==8.3.0

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY="sk-your-key-here"  # Replace with your actual key

For production environments, use a secrets manager like HashiCorp Vault or AWS Secrets Manager instead of environment variables.

Core Implementation: Building the Code Generation Pipeline

Step 1: Define the Code Specification Schema

We'll use Pydantic to create a structured schema for code generation requests. This ensures type safety and provides clear documentation for the API contract.

# schemas.py
from pydantic import BaseModel, Field, field_validator
from typing import Optional, List
from enum import Enum

class CodeLanguage(str, Enum):
    PYTHON = "python"
    TYPESCRIPT = "typescript"
    RUST = "rust"

class OutputFormat(str, Enum):
    MODULE = "module"  # Single file with multiple functions/classes
    PACKAGE = "package"  # Multiple files with __init__.py
    SCRIPT = "script"  # Runnable script with if __name__ == "__main__"

class CodeGenerationRequest(BaseModel):
    """Structured request for GPT-4o code generation."""

    specification: str = Field(
        ...,
        min_length=10,
        max_length=5000,
        description="Natural language description of the code to generate"
    )
    language: CodeLanguage = Field(
        default=CodeLanguage.PYTHON,
        description="Target programming language"
    )
    output_format: OutputFormat = Field(
        default=OutputFormat.MODULE,
        description="How to structure the generated code"
    )
    include_tests: bool = Field(
        default=True,
        description="Generate pytest unit tests alongside the code"
    )
    max_tokens: int = Field(
        default=4096,
        ge=512,
        le=8192,
        description="Maximum tokens for the generated response"
    )
    temperature: float = Field(
        default=0.2,
        ge=0.0,
        le=1.0,
        description="Controls randomness in generation (lower = more deterministic)"
    )

    @field_validator('specification')
    @classmethod
    def specification_must_be_detailed(cls, v: str) -> str:
        """Ensure the specification has enough detail for meaningful code generation."""
        word_count = len(v.split())
        if word_count < 20:
            raise ValueError(
                f"Specification must be at least 20 words, got {word_count}. "
                "Provide more detail about the expected behavior and edge cases."
            )
        return v

class CodeGenerationResponse(BaseModel):
    """Response from the code generation pipeline."""

    code: str = Field(..., description="Generated source code")
    tests: Optional[str] = Field(None, description="Generated test code if requested")
    validation_results: dict = Field(
        default_factory=dict,
        description="Results from linting and type checking"
    )
    metadata: dict = Field(
        default_factory=dict,
        description="Generation metadata (tokens used, model, latency)"
    )

Step 2: Implement the GPT-4o Client with Production-Grade Error Handling

The core client handles API communication with retry logic, rate limiting, and proper error classification.

# client.py
import os
import json
import time
from typing import Optional
from openai import OpenAI, APIError, RateLimitError, APITimeoutError
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
    before_sleep_log
)
import logging

logger = logging.getLogger(__name__)

class GPT4oClient:
    """Production client for GPT-4o with retry and rate limiting."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        model: str = "gpt-4o",
        max_retries: int = 3,
        base_delay: float = 1.0
    ):
        self.api_key = api_key or os.getenv("OPENAI_API_KEY")
        if not self.api_key:
            raise ValueError(
                "OpenAI API key required. Set OPENAI_API_KEY environment variable "
                "or pass api_key parameter."
            )

        self.client = OpenAI(api_key=self.api_key)
        self.model = model
        self.max_retries = max_retries
        self.base_delay = base_delay

        # Rate limiting state
        self._last_request_time = 0.0
        self._min_request_interval = 0.5  # 500ms between requests

    def _rate_limit_wait(self):
        """Ensure we don't exceed API rate limits."""
        elapsed = time.time() - self._last_request_time
        if elapsed < self._min_request_interval:
            wait_time = self._min_request_interval - elapsed
            logger.debug(f"Rate limiting: waiting {wait_time:.2f}s")
            time.sleep(wait_time)
        self._last_request_time = time.time()

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=1, max=10),
        retry=(
            retry_if_exception_type(RateLimitError) |
            retry_if_exception_type(APITimeoutError)
        ),
        before_sleep=before_sleep_log(logger, logging.WARNING)
    )
    def generate_code(
        self,
        system_prompt: str,
        user_prompt: str,
        max_tokens: int = 4096,
        temperature: float = 0.2
    ) -> tuple[str, dict]:
        """
        Generate code using GPT-4o.

        Args:
            system_prompt: System-level instructions for the model
            user_prompt: The specific code generation request
            max_tokens: Maximum tokens in the response
            temperature: Generation temperature (0.0-1.0)

        Returns:
            Tuple of (generated_text, metadata_dict)

        Raises:
            APIError: For non-retryable API errors
            ValueError: For invalid input parameters
        """
        self._rate_limit_wait()
        start = time.perf_counter()

        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                max_tokens=max_tokens,
                temperature=temperature,
                response_format={"type": "text"}
            )

            # Measure latency locally; the response object carries no timing field
            latency_ms = round((time.perf_counter() - start) * 1000)

            # content can be None (for example on a refusal), so fall back to ""
            generated_text = response.choices[0].message.content or ""

            metadata = {
                "model": response.model,
                "prompt_tokens": response.usage.prompt_tokens,
                "completion_tokens": response.usage.completion_tokens,
                "total_tokens": response.usage.total_tokens,
                "latency_ms": latency_ms
            }

            logger.info(
                f"Code generation successful: {metadata['total_tokens']} tokens "
                f"in {metadata.get('latency_ms', 'N/A')}ms"
            )

            return generated_text, metadata

        except RateLimitError as e:
            logger.warning(f"Rate limit hit: {e}. Retrying...")
            raise
        except APITimeoutError as e:
            logger.warning(f"Request timeout: {e}. Retrying...")
            raise
        except APIError as e:
            logger.error(f"Non-retryable API error: {e}")
            raise
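
To make the retry timing concrete, here is a small standalone sketch of capped exponential backoff. It illustrates the general technique only: `backoff_delays` is a hypothetical helper, and the exact delays tenacity computes for `wait_exponential(multiplier=1, min=1, max=10)` may differ between library versions.

```python
def backoff_delays(max_attempts: int = 3, base: float = 1.0, cap: float = 10.0) -> list[float]:
    """Delays slept between attempts: base * 2**n, capped at `cap`."""
    # With N attempts there are at most N - 1 sleeps.
    return [min(cap, base * (2 ** n)) for n in range(max_attempts - 1)]


print(backoff_delays())   # [1.0, 2.0]
print(backoff_delays(6))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

With three attempts the worst case adds only a few seconds of waiting, which is why a low attempt count suits interactive use while batch jobs can afford more.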

Step 3: Build the Code Validation Pipeline

Generated code must pass syntax checks, linting, and type checking before being considered production-ready.

# validator.py
import ast
import subprocess
import tempfile
from pathlib import Path
from typing import Optional
import black
import logging

logger = logging.getLogger(__name__)

class CodeValidator:
    """Validates generated code for syntax, style, and type correctness."""

    def __init__(self, project_root: Optional[Path] = None):
        self.project_root = project_root or Path.cwd()

    def validate_syntax(self, code: str) -> tuple[bool, Optional[str]]:
        """
        Check if the generated code has valid Python syntax.

        Returns:
            Tuple of (is_valid, error_message)
        """
        try:
            ast.parse(code)
            return True, None
        except SyntaxError as e:
            error_msg = f"Syntax error at line {e.lineno}, column {e.offset}: {e.msg}"
            logger.error(error_msg)
            return False, error_msg

    def format_code(self, code: str) -> tuple[str, bool]:
        """
        Format code using Black's default mode.

        Project-specific Black settings in pyproject.toml are not loaded here.

        Returns:
            Tuple of (formatted_code, was_modified)
        """
        try:
            # format_str returns the source unchanged when it is already formatted
            formatted = black.format_str(code, mode=black.Mode())
            return formatted, formatted != code
        except black.InvalidInput as e:
            logger.warning(f"Formatting issue: {e}")
            return code, False

    def run_pylint(self, code: str) -> dict:
        """
        Run pylint on the generated code and return results.

        Returns:
            Dictionary with linting results including score and issues
        """
        with tempfile.NamedTemporaryFile(
            mode='w', suffix='.py', delete=False, dir=self.project_root
        ) as f:
            f.write(code)
            temp_path = f.name

        try:
            result = subprocess.run(
                ["pylint", "--output-format=json", temp_path],
                capture_output=True,
                text=True,
                timeout=30
            )

            if result.returncode == 0:
                return {"score": 10.0, "issues": []}

            import json
            issues = json.loads(result.stdout) if result.stdout else []

            # Approximate a score from issue severities (this is not pylint's own rating)
            score = 10.0
            for issue in issues:
                if issue.get("type") == "convention":
                    score -= 0.1
                elif issue.get("type") == "warning":
                    score -= 0.5
                elif issue.get("type") == "error":
                    score -= 1.0

            return {
                "score": max(0.0, score),
                "issues": [
                    {
                        "line": issue.get("line"),
                        "column": issue.get("column"),
                        "message": issue.get("message"),
                        "type": issue.get("type")
                    }
                    for issue in issues
                ]
            }
        except subprocess.TimeoutExpired:
            logger.error("Pylint timed out after 30 seconds")
            return {"score": 0.0, "issues": [{"message": "Linting timed out"}]}
        except FileNotFoundError:
            logger.error("Pylint not found. Ensure it's installed.")
            return {"score": 0.0, "issues": [{"message": "Pylint not available"}]}
        finally:
            Path(temp_path).unlink(missing_ok=True)

    def run_mypy(self, code: str) -> dict:
        """
        Run mypy type checker on the generated code.

        Returns:
            Dictionary with type checking results
        """
        with tempfile.NamedTemporaryFile(
            mode='w', suffix='.py', delete=False, dir=self.project_root
        ) as f:
            f.write(code)
            temp_path = f.name

        try:
            result = subprocess.run(
                ["mypy", "--strict", temp_path],
                capture_output=True,
                text=True,
                timeout=30
            )

            errors = []
            for line in result.stdout.split('\n'):
                if 'error:' in line:
                    errors.append(line.strip())

            return {
                "passed": result.returncode == 0,
                "errors": errors,
                "output": result.stdout
            }
        except subprocess.TimeoutExpired:
            logger.error("Mypy timed out after 30 seconds")
            return {"passed": False, "errors": ["Type checking timed out"]}
        except FileNotFoundError:
            logger.error("Mypy not found. Ensure it's installed.")
            return {"passed": False, "errors": ["Mypy not available"]}
        finally:
            Path(temp_path).unlink(missing_ok=True)

    def validate_all(self, code: str) -> dict:
        """
        Run all validation checks on the generated code.

        Returns:
            Dictionary with comprehensive validation results
        """
        results = {}

        # Syntax validation
        syntax_valid, syntax_error = self.validate_syntax(code)
        results["syntax"] = {
            "valid": syntax_valid,
            "error": syntax_error
        }

        if not syntax_valid:
            results["overall_pass"] = False
            return results

        # Formatting
        formatted_code, was_modified = self.format_code(code)
        results["formatting"] = {
            "was_modified": was_modified,
            "formatted_code": formatted_code
        }

        # Linting
        lint_results = self.run_pylint(formatted_code)
        results["linting"] = lint_results

        # Type checking
        type_results = self.run_mypy(formatted_code)
        results["type_checking"] = type_results

        # Overall assessment
        results["overall_pass"] = (
            lint_results.get("score", 0) >= 7.0 and
            type_results.get("passed", False)
        )

        return results

Step 4: Create the Orchestration Pipeline

This is the main entry point that ties everything together with proper logging and error handling.

# pipeline.py
import logging
from pathlib import Path
from typing import Optional
from datetime import datetime

from schemas import CodeGenerationRequest, CodeGenerationResponse
from client import GPT4oClient
from validator import CodeValidator

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

class CodeGenerationPipeline:
    """
    Production pipeline for generating validated code using GPT-4o.

    This pipeline handles the complete workflow from specification to
    validated, formatted code ready for integration.
    """

    def __init__(
        self,
        api_key: Optional[str] = None,
        project_root: Optional[Path] = None,
        output_dir: Optional[Path] = None
    ):
        self.client = GPT4oClient(api_key=api_key)
        self.validator = CodeValidator(project_root=project_root)
        self.project_root = project_root or Path.cwd()
        self.output_dir = output_dir or (self.project_root / "generated_code")
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def _build_system_prompt(self, request: CodeGenerationRequest) -> str:
        """Construct the system prompt based on the request parameters."""
        prompts = {
            "python": (
                "You are an expert Python developer. Generate production-ready code "
                "with the following requirements:\n"
                "- Use Python 3.12+ features where appropriate\n"
                "- Include comprehensive type hints for all functions and methods\n"
                "- Write Google-style docstrings for all public APIs\n"
                "- Handle edge cases and input validation\n"
                "- Use modern Python patterns (context managers, dataclasses, etc.)\n"
                "- Follow PEP 8 style guidelines\n"
                "- Include logging for debugging purposes\n"
                "- Do NOT include any markdown formatting or code fences in your response"
            ),
            "typescript": (
                "You are an expert TypeScript developer. Generate production-ready code "
                "with the following requirements:\n"
                "- Use TypeScript 5.x features where appropriate\n"
                "- Include comprehensive type definitions\n"
                "- Write JSDoc comments for all public APIs\n"
                "- Handle edge cases and input validation\n"
                "- Follow modern TypeScript patterns\n"
                "- Include error handling and logging\n"
                "- Do NOT include any markdown formatting or code fences in your response"
            )
        }

        # Languages without a dedicated prompt (currently rust) fall back to the Python prompt
        base_prompt = prompts.get(request.language.value, prompts["python"])

        if request.include_tests:
            base_prompt += (
                "\n\nAdditionally, generate comprehensive pytest unit tests for all "
                "functions and classes. Include tests for:\n"
                "- Normal operation with valid inputs\n"
                "- Edge cases and boundary conditions\n"
                "- Error handling and exception cases\n"
                "- Test fixtures and parametrized tests where appropriate\n"
                "Separate the test code from the main code with the marker: ###TESTS###"
            )

        return base_prompt

    def _parse_response(self, response_text: str) -> tuple[str, Optional[str]]:
        """
        Parse the GPT-4o response to extract main code and test code.

        Returns:
            Tuple of (main_code, test_code_or_None)
        """
        if "###TESTS###" in response_text:
            parts = response_text.split("###TESTS###", 1)
            main_code = parts[0].strip()
            test_code = parts[1].strip()
            return main_code, test_code
        else:
            return response_text.strip(), None

    def _save_generated_code(
        self,
        code: str,
        tests: Optional[str],
        request: CodeGenerationRequest,
        metadata: dict
    ) -> Path:
        """Save generated code to the output directory with proper naming."""
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

        # Create a safe filename from the first 50 chars of the specification
        safe_name = "".join(
            c if c.isalnum() or c in ('_', '-') else '_'
            for c in request.specification[:50]
        ).strip('_').lower()

        if request.output_format.value == "package":
            # Create package structure
            package_dir = self.output_dir / f"{safe_name}_{timestamp}"
            package_dir.mkdir(parents=True, exist_ok=True)
            (package_dir / "__init__.py").write_text(
                f"# Auto-generated package: {safe_name}\n"
                f"# Generated: {datetime.now().isoformat()}\n"
            )
            main_file = package_dir / "main.py"
            main_file.write_text(code)

            if tests:
                test_dir = package_dir / "tests"
                test_dir.mkdir(exist_ok=True)
                (test_dir / f"test_{safe_name}.py").write_text(tests)

            return package_dir
        else:
            # Single file output
            output_file = self.output_dir / f"{safe_name}_{timestamp}.py"
            output_file.write_text(code)

            if tests:
                test_file = self.output_dir / f"test_{safe_name}_{timestamp}.py"
                test_file.write_text(tests)

            return output_file

    def generate(self, request: CodeGenerationRequest) -> CodeGenerationResponse:
        """
        Execute the full code generation pipeline.

        Args:
            request: Structured code generation request

        Returns:
            CodeGenerationResponse with generated code and validation results
        """
        logger.info(f"Starting code generation for: {request.specification[:100]}...")

        # Build prompts
        system_prompt = self._build_system_prompt(request)
        user_prompt = (
            f"Generate code for the following specification:\n\n"
            f"{request.specification}\n\n"
            f"Output format: {request.output_format.value}\n"
            f"Include tests: {request.include_tests}"
        )

        # Generate code
        try:
            response_text, metadata = self.client.generate_code(
                system_prompt=system_prompt,
                user_prompt=user_prompt,
                max_tokens=request.max_tokens,
                temperature=request.temperature
            )
        except Exception as e:
            logger.error(f"Code generation failed: {e}")
            return CodeGenerationResponse(
                code="",
                validation_results={"error": str(e)},
                metadata={"status": "failed"}
            )

        # Parse response
        main_code, test_code = self._parse_response(response_text)

        # Validate generated code
        validation_results = self.validator.validate_all(main_code)

        # Save generated code
        output_path = self._save_generated_code(
            main_code, test_code, request, metadata
        )

        logger.info(
            f"Code generation complete. Output saved to: {output_path}\n"
            f"Validation passed: {validation_results.get('overall_pass', False)}"
        )

        return CodeGenerationResponse(
            code=main_code,
            tests=test_code,
            validation_results=validation_results,
            metadata={
                **metadata,
                "output_path": str(output_path),
                "timestamp": datetime.now().isoformat()
            }
        )
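
Even with the "no code fences" instruction in the system prompt, models sometimes wrap their output in markdown fences, which would fail the syntax check. One possible guard, sketched here under the assumption that it would run on each part before validation, is a small helper that strips a single leading and trailing fence line. `strip_code_fences` is a hypothetical name and is not part of the pipeline above.

```python
FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def strip_code_fences(text: str) -> str:
    """Remove one leading and one trailing markdown fence line, if present."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines).strip()


raw = f"{FENCE}python\ndef add(a: int, b: int) -> int:\n    return a + b\n{FENCE}"
print(strip_code_fences(raw))
```

Input without fences passes through unchanged apart from surrounding whitespace, so the helper is safe to apply unconditionally.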

Step 5: Usage Example and Edge Case Handling

Here's how to use the pipeline in production, including handling common edge cases:

# example_usage.py
from schemas import CodeGenerationRequest
from pipeline import CodeGenerationPipeline

def main():
    """Example of using the code generation pipeline."""

    pipeline = CodeGenerationPipeline()

    # Example 1: Generate a data validation module
    request = CodeGenerationRequest(
        specification=(
            "Create a Python module for validating user registration data. "
            "Include functions to validate email addresses (check format and domain), "
            "passwords (minimum 8 characters, must contain uppercase, lowercase, "
            "digit, and special character), and phone numbers (support US and "
            "international formats). Use pydantic for data models. Include proper "
            "error messages for each validation failure. Handle edge cases like "
            "empty strings, None values, and Unicode characters in email addresses."
        ),
        language="python",
        output_format="module",
        include_tests=True,
        max_tokens=4096,
        temperature=0.2
    )

    response = pipeline.generate(request)

    print(f"Validation passed: {response.validation_results.get('overall_pass', False)}")
    print(f"Linting score: {response.validation_results.get('linting', {}).get('score', 'N/A')}")
    print(f"Tokens used: {response.metadata.get('total_tokens', 'N/A')}")

    # Example 2: Handle edge case - specification that is too short
    try:
        bad_request = CodeGenerationRequest(
            specification="Short spec",  # This will fail validation
            language="python"
        )
    except ValueError as e:
        print(f"Caught expected error: {e}")

    # Example 3: Handle API rate limiting
    # The pipeline automatically retries with exponential backoff

    # Example 4: Handle syntax errors in generated code
    # The validator catches these and reports them in validation_results

if __name__ == "__main__":
    main()

Production Considerations and Edge Cases

Rate Limiting and Cost Management

GPT-4o API calls incur costs based on token usage. At the time of writing, OpenAI's published pricing listed GPT-4o at $5.00 per million input tokens and $15.00 per million output tokens. A request that produces 4,000 output tokens therefore costs about $0.06 (4,000 × $15.00 / 1,000,000), plus a smaller charge for the prompt. Implement these strategies to manage costs:

Cache frequent requests: Use a hash of the specification as a cache key
Implement request queuing: Batch similar requests to reduce API calls
Monitor token usage: Log all token counts and set budget alerts
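
The cost arithmetic and the cache-key idea can be sketched in a few lines. The rates below are the figures quoted above and should be treated as assumptions to verify against current pricing. `estimate_cost` and `cache_key` are hypothetical helper names.

```python
import hashlib

# Assumed rates in USD per million tokens, taken from the figures quoted above.
INPUT_RATE = 5.00
OUTPUT_RATE = 15.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    return (prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE) / 1_000_000


def cache_key(specification: str, model: str = "gpt-4o") -> str:
    """Derive a stable cache key from the model name and the specification."""
    # Collapse runs of whitespace so trivially different inputs share a key.
    normalized = " ".join(specification.split())
    return hashlib.sha256(f"{model}:{normalized}".encode("utf-8")).hexdigest()


print(f"${estimate_cost(1_000, 3_000):.3f}")
print(cache_key("Validate  email addresses") == cache_key("Validate email addresses"))
```

Including the model name in the key avoids serving a cached result that was generated by a different model. Temperature and other parameters could be added the same way if they vary between requests.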

Handling API Failures

The pipeline implements retry logic with exponential backoff for transient failures. However, you should also handle these scenarios:

Authentication errors: Check API key validity before making requests
Model unavailability: GPT-4o may occasionally be overloaded; implement fallback to GPT-4 Turbo
Response truncation: If the generated code is cut off, detect incomplete code blocks and request continuation

Security Considerations

Generated code should never be executed directly without human review. Implement these security measures:

Sandboxed execution: Run validation in isolated containers
Dependency scanning: Check generated imports against known vulnerability databases
Code review workflow: Require human approval before merging generated code
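
As a lightweight first step before real dependency scanning, generated imports can be compared against an allowlist. Note that this is a simpler technique than checking a vulnerability database, which requires a dedicated tool. The allowlist contents and the `disallowed_imports` helper are hypothetical.

```python
import ast

# Hypothetical allowlist; a real project would derive this from its lockfile.
ALLOWED_MODULES = frozenset({"re", "logging", "typing", "dataclasses", "pydantic"})


def disallowed_imports(code: str, allowed: frozenset[str] = ALLOWED_MODULES) -> set[str]:
    """Return top-level module names imported by `code` that are not allowed."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped
            found.add(node.module.split(".")[0])
    return found - allowed


sample = "import os.path\nfrom pydantic import BaseModel\nimport re\n"
print(disallowed_imports(sample))  # {'os'}
```

Because the check works on the parsed syntax tree, the generated code is never executed. It will not catch dynamic imports such as `importlib.import_module`, so it complements rather than replaces sandboxing and human review.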

What's Next

This pipeline provides a foundation for integrating GPT-4o into your development workflow. Consider these enhancements:

Multi-file generation: Extend the pipeline to generate entire project structures with proper imports
Continuous integration: Add a GitHub Action that triggers code generation from issue descriptions
Feedback loop: Implement a system where validation failures are fed back to the model for iterative improvement
Custom fine-tuning: For domain-specific code generation, consider fine-tuning GPT-4o on your codebase

For more advanced patterns, explore our guides on building AI-powered developer tools and production ML pipelines. The techniques demonstrated here—structured prompting, validation pipelines, and error handling—apply broadly to any AI-assisted development workflow.

Remember that while GPT-4o significantly accelerates code generation, it should augment rather than replace human expertise. Always review generated code for correctness, security, and alignment with your project's architecture before deployment.

References

1. Wikipedia - Rag. Wikipedia.

2. Wikipedia - OpenAI. Wikipedia.

3. Wikipedia - GPT. Wikipedia.

4. GitHub - Shubhamsaboo/awesome-llm-apps. GitHub.

5. GitHub - openai/openai-python. GitHub.

6. GitHub - Significant-Gravitas/AutoGPT. GitHub.

7. GitHub - hiyouga/LlamaFactory. GitHub.

8. OpenAI Pricing. OpenAI.
