LPU inference  ·  llama-3.3-70b-versatile  ·  sub-100ms  ·  OpenAI-compatible

Slopshop for
Groq (Fast Inference)

Groq's LPU delivers sub-100ms first tokens. Slopshop delivers real tool executions across 82 categories. Combined: the fastest AI agent loop available — millisecond inference, real compute execution, verified results.

422
Real Tools
<100
ms First Token
LPU
Hardware
100%
Verified Results

Speed Profile

The fastest agent loop stack

Groq's LPU handles the LLM reasoning step in milliseconds. Slopshop's tool execution adds minimal latency. Multi-step agent loops that take seconds elsewhere complete in under a second.

Approximate latency per agentic step

Groq inference
<100ms
Slopshop tool exec
~200ms
Other LLM providers
~1-3s

Groq latency depends on model and prompt size. Tool latency depends on tool type (DNS ~50ms, HTTP ~200ms).


Quickstart

Three steps. Groq has tools.

Groq's API is OpenAI-compatible. Set base_url="https://api.groq.com/openai/v1" — the rest of the integration is identical to standard OpenAI function_calling.
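The two setups are interchangeable; a minimal sketch (imports are deferred inside the function just so either SDK can be installed on its own):

```python
import os

# Groq exposes an OpenAI-compatible endpoint; both clients below behave the same.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

def make_client(use_openai_sdk: bool = False):
    """Return a client pointed at Groq. Set GROQ_API_KEY before calling."""
    api_key = os.environ.get("GROQ_API_KEY", "")
    if use_openai_sdk:
        from openai import OpenAI  # pip install openai
        return OpenAI(api_key=api_key, base_url=GROQ_BASE_URL)
    from groq import Groq  # pip install groq
    return Groq(api_key=api_key)
```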

Step 01 — Install

pip install groq requests

Use Groq's official Python SDK; it mirrors the openai SDK's interface. Or use the openai SDK itself with base_url set to Groq's endpoint. The requests package is used below to call Slopshop tools.
Step 02 — Fetch Tools

Pull Slopshop schemas

Fetch tool definitions from /v1/openapi.json. They arrive in OpenAI function_calling format — Groq accepts them directly.

GET /v1/openapi.json → tools[] (OpenAI format)
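The conversion is pure dictionary reshaping. A sketch, assuming each tool's POST operation carries a summary and a JSON request-body schema (the full code example below does the same work inside get_tools):

```python
def extract_tools(schema: dict, slugs: list[str]) -> list[dict]:
    """Convert selected Slopshop OpenAPI paths into OpenAI-format
    function definitions that Groq accepts as-is."""
    tools = []
    for slug in slugs:
        op = schema.get("paths", {}).get(f"/v1/{slug}", {}).get("post")
        if not op:
            continue  # skip slugs the schema doesn't define
        tools.append({
            "type": "function",
            "function": {
                "name": slug,
                "description": op.get("summary", ""),
                "parameters": op["requestBody"]["content"]["application/json"]["schema"],
            },
        })
    return tools
```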
Step 03 — Run Fast

Sub-second agent loops

Groq returns tool calls in <100ms. Slopshop executes them in ~50-200ms. Multi-step agent tasks complete in under a second total.

82 categories of tools · <300ms per step · verified

Code Example

function_calling with llama-3.3-70b-versatile on Groq

Uses the official groq SDK. The Groq client class is a drop-in for the OpenAI client — identical method signatures, just faster.

groq_agent.py python
import os, json, time, requests
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

SLOP_KEY = os.environ["SLOPSHOP_API_KEY"]
SLOP_URL = "https://slopshop.gg"

# Models with strong function_calling on Groq:
# "llama-3.3-70b-versatile"    — best all-round
# "llama-3.1-70b-versatile"    — fast tool calls
# "mixtral-8x7b-32768"         — large context
MODEL = "llama-3.3-70b-versatile"

def get_tools(slugs: list[str]) -> list[dict]:
    """Fetch Slopshop's OpenAPI schema and convert the selected
    tool paths into OpenAI-format function definitions."""
    schema = requests.get(
        f"{SLOP_URL}/v1/openapi.json",
        headers={"Authorization": f"Bearer {SLOP_KEY}"},
        timeout=10
    ).json()
    tools = []
    for slug in slugs:
        path = schema["paths"].get(f"/v1/{slug}", {}).get("post")
        if not path:
            continue
        tools.append({
            "type": "function",
            "function": {
                "name": slug,
                "description": path.get("summary", ""),
                "parameters": path["requestBody"]["content"]["application/json"]["schema"]
            }
        })
    return tools

def call_tool(name: str, args: dict) -> str:
    """Execute one Slopshop tool and return its JSON result as a string."""
    res = requests.post(
        f"{SLOP_URL}/v1/{name}", json=args,
        headers={"Authorization": f"Bearer {SLOP_KEY}"},
        timeout=30
    )
    return json.dumps(res.json())

def run_agent(user_message: str) -> str:
    tools = get_tools(["dns-lookup", "ssl-check", "http-request",
                       "whois", "hash", "memory-set", "memory-get"])
    messages = [{"role": "user", "content": user_message}]
    step = 0
    start = time.perf_counter()

    while True:
        t0 = time.perf_counter()
        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        llm_ms = (time.perf_counter() - t0) * 1000
        step += 1
        msg = response.choices[0].message

        print(f"Step {step}: Groq LLM {llm_ms:.0f}ms")

        if response.choices[0].finish_reason == "stop":
            total = (time.perf_counter() - start) * 1000
            print(f"Done in {step} steps, {total:.0f}ms total")
            return msg.content

        # Echo the assistant turn (with its tool_calls) back into history,
        # then execute each requested tool and append its result.
        messages.append(msg)
        for tc in (msg.tool_calls or []):
            t1 = time.perf_counter()
            result = call_tool(tc.function.name, json.loads(tc.function.arguments))
            tool_ms = (time.perf_counter() - t1) * 1000
            print(f"  Tool {tc.function.name}: {tool_ms:.0f}ms")
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": result
            })

answer = run_agent("Audit slopshop.gg: DNS, SSL, HTTP headers, WHOIS, save results to memory")
print(answer)

What You Get

The fastest agent stack available

Groq's LPU makes LLM reasoning near-instant. Slopshop's tools add the execution layer. The bottleneck is now network I/O, not inference time.

Sub-100ms LLM

Groq's LPU delivers inference at speeds that make multi-step agent loops feel synchronous. Stop waiting for LLM responses.

🀅

82 Categories of Real Tools

DNS, SSL, HTTP, crypto, hashing, code execution, data transforms. All real. Every result verified with _engine: real.

🧠

Free Persistent Memory

Groq agents can write to and read from Slopshop's key-value store at zero credit cost. State persists across calls.
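The memory tools are called like any other tool endpoint. A sketch; the key/value body shape here is an assumption, so verify it against the memory-set and memory-get entries in /v1/openapi.json:

```python
SLOP_URL = "https://slopshop.gg"

def memory_body(key: str, value=None) -> dict:
    """Body for memory-set (key + value) or memory-get (key only).
    Field names are an assumption; check /v1/openapi.json."""
    return {"key": key, "value": value} if value is not None else {"key": key}

def memory_set(api_key: str, key: str, value) -> dict:
    """Persist a value across agent runs via the free memory-set tool."""
    import requests  # pip install requests
    return requests.post(f"{SLOP_URL}/v1/memory-set",
                         json=memory_body(key, value),
                         headers={"Authorization": f"Bearer {api_key}"},
                         timeout=10).json()
```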

👀

Full Observability

Audit logs per tool call. Latency breakdown per step. Know exactly where time is spent — LLM vs. tool execution.

🏠

Self-Hostable

Run Slopshop on your own infrastructure for zero-latency tool execution alongside Groq's fast inference.


Groq speed + real tools across 82 categories

Sub-100ms LLM inference. Real tool execution. Free memory.
The fastest complete agent stack available.

Get Started Free · OpenAPI Schema · Browse All Tools
→ Full Docs → Browse Tools → Agent Templates → Savings Calculator