# OpenFPGA

> FPGA-accelerated AI inference. OpenAI-compatible API. Drop-in replacement — just change the base URL.

## API

OpenFPGA provides an OpenAI-compatible chat completions API backed by FPGA hardware acceleration. No proprietary SDK required — use the standard OpenAI client libraries.

- Base URL: `https://api.openfpga.ai/v1`
- Auth: Bearer token (API key prefix: `ofpga_sk_live_`)
- Model: `llama-3.1-8b-fpga` (Llama 3.1 8B Instruct on FPGA)

## Docs

- [API Reference](https://app.openfpga.ai/#/docs)
- [OpenAPI Spec](https://app.openfpga.ai/.well-known/openapi.yaml)
- [Full Context (llms-full.txt)](https://app.openfpga.ai/llms-full.txt)
- [Quickstart](https://app.openfpga.ai/#/quickstart)

## API Endpoints

- [Chat Completions](https://docs.openfpga.ai/api/chat-completions): POST /v1/chat/completions — chat inference
- [Embeddings](https://docs.openfpga.ai/api/embeddings): POST /v1/embeddings — text embeddings
- [Models](https://docs.openfpga.ai/api/models): GET /v1/models — list available models

## Quick Start

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openfpga.ai/v1",
    api_key="ofpga_sk_live_...",
)

response = client.chat.completions.create(
    model="llama-3.1-8b-fpga",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
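Because the API is OpenAI-compatible, no client library is strictly required: a plain HTTP POST with the standard chat completions payload works as well. A minimal standard-library sketch (the payload shape follows the OpenAI format; the key below is a placeholder):

```python
import json
import urllib.request

API_KEY = "ofpga_sk_live_..."  # placeholder; substitute your real key

# Standard OpenAI-style chat completions payload.
payload = {
    "model": "llama-3.1-8b-fpga",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    "https://api.openfpga.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending the request needs a valid key; the response follows the
# OpenAI schema, so the reply text is at choices[0].message.content:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```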