Llama 4 Scout
35k+ Runs
A model with 17B active parameters and 16 experts, designed to fit on a single NVIDIA H100 GPU while offering an industry-leading 10M-token context window.

Llama API Usage

POST /v1/chat/completions
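import requests

# Chat completions endpoint
url = "https://api.akashml.com/v1/chat/completions"

# Request body: model name, conversation so far, and sampling settings
payload = {
    "model": "llama-4-scout",
    "messages": [
        {
            "role": "user",
            "content": "Hello, how are you?"
        }
    ],
    "max_tokens": 150,    # upper bound on generated tokens
    "temperature": 0.7    # sampling temperature
}

# Replace YOUR_API_KEY with your own key
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
}

# Send the request; fail loudly on HTTP errors instead of printing an error body
response = requests.post(url, json=payload, headers=headers, timeout=30)
response.raise_for_status()
print(response.json())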
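The request above prints the raw JSON. As a minimal sketch, the two helpers below build the same request body and pull the assistant's text out of a reply. The response layout (`choices[0].message.content`) and the `error` key are assumptions based on the common OpenAI-style chat completions schema, which this page does not confirm; `build_payload` and `extract_reply` are hypothetical names introduced for illustration.

```python
from typing import Any


def build_payload(prompt: str, max_tokens: int = 150,
                  temperature: float = 0.7) -> dict[str, Any]:
    """Build a single-turn chat request body with the fields shown above."""
    return {
        "model": "llama-4-scout",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def extract_reply(data: dict[str, Any]) -> str:
    """Return the assistant text from a decoded response body.

    Assumption: the body follows the OpenAI-style schema,
    i.e. data["choices"][0]["message"]["content"].
    """
    # Assumption: failures are reported under an "error" key.
    if "error" in data:
        raise RuntimeError(f"API returned an error: {data['error']}")
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]
```

If the schema assumption holds, `extract_reply(response.json())` would replace the final `print(response.json())` call, and `build_payload("Hello, how are you?")` reproduces the request body shown above.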
Pricing
Price (per 1M Tokens)
Model            Input    Output
Llama 4 Scout    NA       NA
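Because prices are quoted per 1M tokens, separately for input and output, the cost of a request is simple arithmetic once rates are published. The sketch below shows that calculation; the rates in the usage example are placeholders, not AkashML prices (the page lists both as NA), and `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate request cost from token counts and per-1M-token prices.

    The price arguments are in the same currency as the result and must be
    supplied by the caller; no real prices are assumed here.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Placeholder rates for illustration only (NOT actual pricing).
example_cost = estimate_cost(2_000_000, 500_000,
                             input_price_per_m=0.10, output_price_per_m=0.40)
```

With those placeholder rates, 2M input tokens and 500k output tokens each contribute 0.20, for a total of about 0.40.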
Model Details
Provider: Meta
Type: Chat
Parameters: 17B active (109B total, 16 experts)
Context Length: 128k
Copyright 2025 © akashml.com