No description
Find a file
guochao 6143b51825 feat: add file-based quota cache with stale fallback
Cache provider balance data to ${XDG_RUNTIME_DIR:-~/.cache}/claude-status/${provider}.json
with a 60s TTL (mtime-based). On cache hit, skip the API call entirely. When the
API call fails but stale cache exists, serve it with a `*` prefix on the summary.
Add `--no-cache` flag to bypass cache reads and writes.
2026-05-11 11:08:10 +08:00
.gitignore Initial claude-statusline (Python, PEP 723) 2026-05-03 01:18:44 +08:00
claude-statusline.py feat: add file-based quota cache with stale fallback 2026-05-11 11:08:10 +08:00
config.example.yaml Initial claude-statusline (Python, PEP 723) 2026-05-03 01:18:44 +08:00
README.md docs: clarify Volcengine is unsupported 2026-05-07 04:27:25 +08:00

claude-statusline

A custom statusline command for Claude Code that:

  • detects which third-party provider ANTHROPIC_BASE_URL points at (Kimi / MiniMax / Volcengine / GLM / DeepSeek / OpenRouter) and shows the remaining quota,
  • renders model, cwd, git branch, context-window %, etc. via a user-editable Jinja2 template,
  • supports user-defined extensions: any external command's stdout becomes a template variable.

If no recognized provider is detected (e.g. plain api.anthropic.com), the balance section is silently omitted.

Install

The script is a single PEP 723 file. With uv installed, no extra setup is needed — the shebang handles dependency provisioning on first run.

git clone <this-repo> ~/src/claude-statusline
chmod +x ~/src/claude-statusline/claude-statusline.py

Wire it into Claude Code via ~/.claude/settings.json:

{
  "statusLine": {
    "type": "command",
    "command": "/home/you/src/claude-statusline/claude-statusline.py",
    "refreshInterval": 30
  }
}

(Or use uv run /path/to/claude-statusline.py if you prefer not to use the shebang.)

Configuration

Copy config.example.yaml to ~/.config/claude-statusline/config.yaml and edit. Search order:

  1. $CLAUDE_STATUSLINE_CONFIG (full path)
  2. $XDG_CONFIG_HOME/claude-statusline/config.yaml
  3. ~/.config/claude-statusline/config.yaml

If no config is found, a sensible default template is used.

Template variables

The template runs in Jinja2 with ChainableUndefined, so missing fields render as empty rather than crashing.

Variable Description
model.id, model.display_name The active model.
cwd, cwd_short Working directory (full and ~-collapsed).
session_id, transcript_path, version, output_style From Claude Code stdin.
cost.usd, cost.duration_ms, cost.lines_added, cost.lines_removed Session cost stats.
context.used_pct, context.remaining_pct, context.size, context.total_input_tokens, context.total_output_tokens Context-window usage.
rate_limits.five_hour_pct, rate_limits.seven_day_pct When the API exposes them.
git.branch, git.dirty Computed via git -C $cwd rev-parse / status (200 ms timeout).
worktree.branch, worktree.path When inside a Claude Code worktree.
agent.name When running as a subagent.
provider Detected provider name (kimi / minimax / volcengine / glm / deepseek / openrouter) or None.
balance {summary, raw} for the detected provider, or None.
extensions List of {key, value} from configured commands.

Provider detection

Match against ANTHROPIC_BASE_URL:

Host substring Provider
moonshot.cn, kimi.com kimi
minimax.io, minimaxi.com minimax
volces.com, volcengineapi.com volcengine (not yet wired up)
z.ai, bigmodel.cn glm
deepseek.com deepseek
openrouter.ai openrouter

Bearer token: ANTHROPIC_AUTH_TOKEN first, falling back to ANTHROPIC_API_KEY.

Kimi

Kimi returns a limits array of quota windows plus a top-level usage object for the billing cycle. The script computes each window's actual duration and renders it as a smart label (e.g. 4h, week, 1.5d, mon). The top-level usage is always shown as week.

Example response (trimmed):

{
  "user": {
    "userId": "co78evecp7fcde32j12g",
    "region": "REGION_CN",
    "membership": { "level": "LEVEL_BASIC" },
    "businessId": ""
  },
  "usage": {
    "limit": "100",
    "used": "25",
    "remaining": "75",
    "resetTime": "2026-05-11T09:02:51.662697Z"
  },
  "limits": [
    {
      "window": { "duration": 300, "timeUnit": "TIME_UNIT_MINUTE" },
      "detail": {
        "limit": "100",
        "remaining": "100",
        "resetTime": "2026-05-05T13:02:51.662697Z"
      }
    }
  ],
  "parallel": { "limit": "10" },
  "totalQuota": { "limit": "100", "remaining": "99" },
  "authentication": {
    "method": "METHOD_API_KEY",
    "scope": "FEATURE_CODING"
  },
  "subType": "TYPE_PURCHASE"
}

With this response balance.summary becomes: 5h(in 2.1h) 0% week(in 6.0d) 25% (when reset time is available), or 5h 0% week 25% (when not).

MiniMax requires a group_id

MiniMax's coding_plan/remains endpoint takes a GroupId query parameter. Either set providers.minimax.group_id in the config, or export MINIMAX_GROUP_ID in the shell that launches claude.

Example response (trimmed):

{
  "model_remains": [
    {
      "model_name": "MiniMax-M*",
      "start_time": 1777982400000,
      "end_time": 1777996800000,
      "current_interval_total_count": 1500,
      "current_interval_usage_count": 1492,
      "current_weekly_total_count": 0,
      "current_weekly_usage_count": 0
    }
  ],
  "base_resp": { "status_code": 0, "status_msg": "success" }
}

Notes:

  • model_name is literally "MiniMax-M*" (with an asterisk).
  • The "5h" plan window is actually 4 hours (e.g. 20:0000:00). The script computes the real duration from start_time / end_time, so the label is 4h rather than 5h.
  • current_interval_usage_count is remaining quota, not consumed.

With this response balance.summary becomes: 4h(in 3.2h) 8/1500 (when reset time is available), or 4h 8/1500 (when not). The script picks MiniMax-M* and formats used/total per slot.

GLM (智谱)

GLM uses the same token (ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY) but sends it as a plain Authorization header (no Bearer prefix). No extra config is needed.

Endpoint: {ANTHROPIC_BASE_URL scheme+host}/api/monitor/usage/quota/limit

Example response:

{
  "code": 200,
  "msg": "Operation successful",
  "data": {
    "limits": [
      {
        "type": "TOKENS_LIMIT",
        "unit": 3,
        "number": 5,
        "percentage": 4,
        "nextResetTime": 1777991637355
      },
      {
        "type": "TOKENS_LIMIT",
        "unit": 6,
        "number": 1,
        "percentage": 1,
        "nextResetTime": 1778577831989
      },
      {
        "type": "TIME_LIMIT",
        "unit": 5,
        "number": 1,
        "usage": 50,
        "currentValue": 0,
        "remaining": 50,
        "percentage": 0,
        "nextResetTime": 1780651431995,
        "usageDetails": [
          { "modelCode": "search-prime", "usage": 0 },
          { "modelCode": "web-reader", "usage": 0 },
          { "modelCode": "zread", "usage": 0 }
        ]
      }
    ],
    "level": "lite"
  },
  "success": true
}

unit field meaning:

unit Duration Description
3 × 3600 s (1 hour) Short rolling window (e.g. 5 h).
5 × 30 days Monthly window.
6 × 7 days Weekly window.

The actual window length is number × unit_seconds. The script only renders TOKENS_LIMIT entries, using percentage as the used percentage. TIME_LIMIT entries are ignored.

With this response balance.summary becomes: 5h(in 1.3h) 4% week(in 6.8d) 1% (when reset time is available), or 5h 4% week 1% (when not).

DeepSeek

DeepSeek uses the same Bearer token (ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY). No extra config is needed.

Endpoint: https://api.deepseek.com/user/balance

Example response:

{
  "is_available": true,
  "balance_infos": [
    {
      "currency": "CNY",
      "total_balance": "53.26",
      "granted_balance": "0.00",
      "topped_up_balance": "53.26"
    }
  ]
}

The script reads the first entry in balance_infos, then picks a currency symbol from currency (CNY, USD$) and formats total_balance to a compact decimal (trailing zeros and decimal point stripped).

With this response balance.summary becomes: ¥53.26.

OpenRouter

OpenRouter uses the same Bearer token (ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY). No extra config is needed.

Endpoint: https://openrouter.ai/api/v1/credits

Example response:

{
  "data": {
    "total_credits": 200,
    "total_usage": 187.03
  }
}

The script computes remaining = total_credits - total_usage and renders it in USD. Formatting strips trailing zeros and decimal point from the remaining amount.

With this response balance.summary becomes: $12.97.

Volcengine

Not supported as of 2026-05-07. Volcengine usage currently has no reliable non-web path here; manual checking in the web console is the only supported option.

Extensions

Each entry is {key, command, timeout_ms?}. The command is run via /bin/sh -c; stdout (stripped) becomes the value. On non-zero exit or timeout, the value is empty. Extensions run sequentially.

extensions:
  - key: gpu
    command: "nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | head -n1"
    timeout_ms: 500

Then in the template: {% for ext in extensions %} | {{ ext.key }}: {{ ext.value }}{% endfor %}.

Smoke test

echo '{"model":{"id":"claude-opus-4-7","display_name":"Opus"},"cwd":"'"$PWD"'","context_window":{"used_percentage":15}}' \
  | ./claude-statusline.py

Status

v1. Caching is not implemented — every statusline render hits the provider API. This is fine because Claude Code only fires the statusline on assistant message complete (not every keystroke). If you hit rate limits, a short-TTL file cache would be the natural next step.