# Hosting

## Become a Hoster
Share your GPU and earn credits. Every request your server handles earns you credits you can spend on other models in the network.
## Requirements
- A GPU server with at least 4GB VRAM (or a CPU with 8GB+ RAM)
- A publicly reachable endpoint (not localhost or private IPs)
- An OpenAI-compatible server (setup guide)
- An invite code — register to request one
## One-command setup
This is the fastest way to get started. The installer downloads llama.cpp, selects a model sized for your GPU, registers your endpoint with NeuralGate, and starts earning credits:

```bash
curl -fsSL https://api.computeshare.servequake.com/install.sh | bash
```
The installer:
- Detects your GPU and available VRAM
- Downloads the right model (Gemma 4 for 3GB+, Qwen 3.5 for 8GB+)
- Sets up llama.cpp as a systemd service
- Registers your endpoint with NeuralGate
- Runs verification and goes live immediately
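The installer's VRAM-to-model choice described above can be sketched as a simple threshold function. This is illustrative only: `pick_model` and the model IDs `qwen-3.5` / `gemma-4` are our placeholders, not the installer's actual identifiers.

```python
def pick_model(vram_gb: float) -> str:
    """Choose a default model for the detected VRAM.

    Thresholds mirror the documented behaviour: Qwen 3.5 for
    8 GB+, Gemma 4 for 3 GB+ (model IDs are hypothetical).
    """
    if vram_gb >= 8:
        return "qwen-3.5"
    if vram_gb >= 3:
        return "gemma-4"
    raise ValueError("Not enough VRAM for any supported model")
```

The real installer also probes the GPU (e.g. via nvidia-smi) to obtain `vram_gb` before making this choice.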
## Manual registration
If you already have a server running, register it directly:

```bash
curl -X POST https://api.computeshare.servequake.com/hosters/register \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My GPU Server",
    "email": "me@example.com",
    "endpoint_url": "https://my-server.example.com",
    "api_key": "my-bearer-token",
    "invite_code": "BETA2026",
    "models": [
      {
        "model_id": "llama-3-8b",
        "model_alias": "Llama 3 8B",
        "price_per_input_token": 100,
        "price_per_output_token": 300,
        "context_window": 8192,
        "max_tokens": 2048
      }
    ]
  }'
```
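The same registration call can be made from Python with only the standard library. A sketch: the endpoint URL and JSON fields match the curl example above, while the helper names are ours.

```python
import json
import urllib.request

REGISTER_URL = "https://api.computeshare.servequake.com/hosters/register"

def build_registration_payload(name, email, endpoint_url, api_key,
                               invite_code, models):
    """Assemble the same JSON body as the curl example."""
    return {
        "name": name,
        "email": email,
        "endpoint_url": endpoint_url,
        "api_key": api_key,
        "invite_code": invite_code,
        "models": models,
    }

def register(payload):
    """POST the payload to NeuralGate; returns the raw response body."""
    req = urllib.request.Request(
        REGISTER_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```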
## Supported server software
| Software | Compatible | Notes |
|---|---|---|
| llama.cpp server | ✅ | Recommended. Supports GGUF models. |
| vLLM | ✅ | Best for large HuggingFace models. |
| Ollama | ✅ (with bridge) | Requires LiteLLM bridge for OpenAI compat. |
| Any OpenAI-compatible server | ✅ | Must expose /health and /v1/chat/completions. |
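For reference, here is a minimal stub that exposes the two required routes using only the Python standard library. This is a sketch, not a real inference server: the canned completion text is a placeholder, and the response shape is only the subset of the OpenAI format shown here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle(path, body=None):
    """Route the two endpoints NeuralGate requires.

    Returns (status, response_dict). A real server would run
    actual inference instead of returning a canned reply.
    """
    if path == "/health":
        return 200, {"status": "ok"}
    if path == "/v1/chat/completions":
        return 200, {
            "object": "chat.completion",
            "choices": [{"index": 0,
                         "message": {"role": "assistant",
                                     "content": "Hello from the stub."}}],
        }
    return 404, {"error": "not found"}

class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._reply(*handle(self.path))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        self._reply(*handle(self.path, body))

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```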
## Blocked endpoints
The following cannot be registered (ToS violation):
- Localhost or private IPs (127.x, 10.x, 192.168.x) — must be publicly reachable
- Commercial API providers (OpenAI, Anthropic, Google, etc.)
- Commercial model IDs (gpt-4, claude-*, gemini-*)
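A sketch of how checks like these might be implemented with Python's `ipaddress` module. The function names are ours and NeuralGate's actual validator is not published, so treat this as a local pre-flight check only.

```python
import ipaddress
from urllib.parse import urlparse

# Commercial model-ID prefixes rejected per the list above
BLOCKED_MODEL_PREFIXES = ("gpt-4", "claude-", "gemini-")

def endpoint_allowed(url):
    """Reject localhost and private/loopback IP literals.

    Plain DNS hostnames pass this check; a real validator would
    also resolve them and re-check the resulting address.
    """
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name, not an IP literal
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)

def model_id_allowed(model_id):
    """Reject commercial model IDs (gpt-4, claude-*, gemini-*)."""
    return not model_id.lower().startswith(BLOCKED_MODEL_PREFIXES)
```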
## Verification
After registration, NeuralGate runs two checks:
- Health check — GET /health must return HTTP 200
- Inference test — sends a test prompt, expects a valid response
Once both pass, your server goes live and starts receiving traffic.
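If you want to dry-run both checks against your own server before registering, the logic can be sketched like this. The `fetch` callable and function name are hypothetical: inject any HTTP client wrapper that returns `(status, json_dict)`.

```python
def passes_verification(fetch):
    """Approximate NeuralGate's two checks against a server.

    `fetch(method, path, body)` must return (status, json_dict),
    e.g. a thin wrapper around urllib or requests.
    """
    # Check 1: health endpoint must return HTTP 200
    status, _ = fetch("GET", "/health", None)
    if status != 200:
        return False
    # Check 2: a test prompt must yield a non-empty completion
    status, resp = fetch("POST", "/v1/chat/completions",
                         {"messages": [{"role": "user", "content": "ping"}]})
    if status != 200:
        return False
    choices = resp.get("choices", [])
    return bool(choices and choices[0].get("message", {}).get("content"))
```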