Groq's LPU delivers sub-100ms time to first token. Slopshop delivers real tool execution across 82 categories. Combined: the fastest AI agent loop available, with millisecond inference, real compute execution, and verified results.
Groq's LPU handles the LLM reasoning step in milliseconds. Slopshop's tool execution adds minimal latency. Multi-step agent loops that take seconds elsewhere complete in under a second.
Groq's API is OpenAI-compatible. Set base_url="https://api.groq.com/openai/v1" — the rest of the integration is identical to standard OpenAI function_calling.
Use Groq's official Python SDK, which mirrors the openai SDK's API exactly, or use the openai SDK itself with base_url set to Groq's endpoint.
Fetch tool definitions from /v1/openapi.json. They arrive in OpenAI function_calling format — Groq accepts them directly.
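For reference, this is the shape of one tool after conversion; the parameter schema here is an illustrative assumption, since real schemas come from /v1/openapi.json.

```python
# Illustrative OpenAI function_calling tool definition for a Slopshop slug.
# The "parameters" schema below is made up for the example; the actual one
# is read from the endpoint's requestBody schema in /v1/openapi.json.
tool = {
    "type": "function",
    "function": {
        "name": "dns-lookup",
        "description": "Resolve DNS records for a domain",
        "parameters": {
            "type": "object",
            "properties": {
                "domain": {"type": "string"},
                "record_type": {"type": "string"},  # assumed parameter name
            },
            "required": ["domain"],
        },
    },
}
```

Groq accepts a list of these objects directly in the `tools` parameter of `chat.completions.create`.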
Groq returns tool calls in <100ms. Slopshop executes them in ~50-200ms. Multi-step agent tasks complete in under a second total.
Uses the official groq SDK. The Groq client class is a drop-in for the OpenAI client — identical method signatures, just faster.
```python
import os, json, time, requests
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])
SLOP_KEY = os.environ["SLOPSHOP_API_KEY"]
SLOP_URL = "https://slopshop.gg"

# Models with strong function_calling on Groq:
#   "llama-3.3-70b-versatile" - best all-round
#   "llama-3.1-70b-versatile" - fast tool calls
#   "mixtral-8x7b-32768"      - large context
MODEL = "llama-3.3-70b-versatile"

def get_tools(slugs: list[str]) -> list[dict]:
    schema = requests.get(
        f"{SLOP_URL}/v1/openapi.json",
        headers={"Authorization": f"Bearer {SLOP_KEY}"},
    ).json()
    tools = []
    for slug in slugs:
        path = schema["paths"].get(f"/v1/{slug}", {}).get("post", {})
        if path:
            tools.append({
                "type": "function",
                "function": {
                    "name": slug,
                    "description": path.get("summary", ""),
                    "parameters": path["requestBody"]["content"]["application/json"]["schema"],
                },
            })
    return tools

def call_tool(name: str, args: dict) -> str:
    res = requests.post(
        f"{SLOP_URL}/v1/{name}",
        json=args,
        headers={"Authorization": f"Bearer {SLOP_KEY}"},
    )
    return json.dumps(res.json())

def run_agent(user_message: str) -> str:
    tools = get_tools([
        "dns-lookup", "ssl-check", "http-request",
        "whois", "hash", "memory-set", "memory-get",
    ])
    messages = [{"role": "user", "content": user_message}]
    step = 0
    start = time.perf_counter()
    while True:
        t0 = time.perf_counter()
        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )
        llm_ms = (time.perf_counter() - t0) * 1000
        step += 1
        msg = response.choices[0].message
        print(f"Step {step}: Groq LLM {llm_ms:.0f}ms")

        if response.choices[0].finish_reason == "stop":
            total = (time.perf_counter() - start) * 1000
            print(f"Done in {step} steps, {total:.0f}ms total")
            return msg.content

        messages.append(msg)
        for tc in (msg.tool_calls or []):
            t1 = time.perf_counter()
            result = call_tool(tc.function.name, json.loads(tc.function.arguments))
            tool_ms = (time.perf_counter() - t1) * 1000
            print(f"  Tool {tc.function.name}: {tool_ms:.0f}ms")
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": result,
            })

answer = run_agent("Audit slopshop.gg: DNS, SSL, HTTP headers, WHOIS, save results to memory")
print(answer)
```
Groq's LPU makes LLM reasoning near-instant. Slopshop's tools add the execution layer. The bottleneck is now network I/O, not inference time.
Groq's LPU delivers inference at speeds that make multi-step agent loops feel synchronous. Stop waiting for LLM responses.
DNS, SSL, HTTP, crypto, hashing, code execution, data transforms. All real. Every result verified with _engine: real.
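A minimal sketch of how an agent might guard on that verification flag; the `assert_real` helper name is ours, and the sample JSON is illustrative, not a real API response.

```python
import json

def assert_real(result_json: str) -> dict:
    """Reject any tool result not marked _engine: real."""
    data = json.loads(result_json)
    if data.get("_engine") != "real":
        raise ValueError(f"unverified result: _engine={data.get('_engine')!r}")
    return data

# Illustrative payload, not an actual Slopshop response.
payload = assert_real('{"_engine": "real", "sha256": "abc123"}')
print(payload["sha256"])  # abc123
```

Dropping this check into `call_tool` would make the agent fail loudly if a result ever comes back unverified.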
Groq agents can write to and read from Slopshop's key-value store at zero credit cost. State persists across calls.
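A sketch of persisting agent state directly through those endpoints. The memory-set and memory-get slugs appear in the tool list above; the `{"key": ..., "value": ...}` body shape is an assumption for illustration.

```python
import os, requests

SLOP_URL = "https://slopshop.gg"
HEADERS = {"Authorization": f"Bearer {os.environ.get('SLOPSHOP_API_KEY', '')}"}

def memory_set(key: str, value: dict) -> None:
    # Body shape assumed; consult /v1/openapi.json for the real schema.
    requests.post(f"{SLOP_URL}/v1/memory-set",
                  json={"key": key, "value": value}, headers=HEADERS)

def memory_get(key: str) -> dict:
    res = requests.post(f"{SLOP_URL}/v1/memory-get",
                        json={"key": key}, headers=HEADERS)
    return res.json()
```

Because writes cost zero credits, an agent can checkpoint intermediate results on every step without affecting its budget.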
Audit logs per tool call. Latency breakdown per step. Know exactly where time is spent — LLM vs. tool execution.
Run Slopshop on your own infrastructure for zero-latency tool execution alongside Groq's fast inference.