Skip to content

Comments

fix/hf inference#23475

Closed
Josephrp wants to merge 3 commits intoopenclaw:mainfrom
Josephrp:bugfix/hf-inference
Closed

fix/hf inference#23475
Josephrp wants to merge 3 commits intoopenclaw:mainfrom
Josephrp:bugfix/hf-inference

Conversation

@Josephrp
Copy link
Contributor

@Josephrp Josephrp commented Feb 22, 2026

Summary

  • Problem: Using a model ref with a huggingface/ prefix (or “huggingface” as preface) caused errors, and the HF Inference client was given the full string including the prefix even though the API expects only a Hub-style model id and optional tags.
  • Why it matters: Users choosing Hugging Face inference hit failures or wrong API behavior; the client must receive only the Hub id and tags (e.g. :cheapest, :fastest).
  • What changed: Added modelIdForHfInferenceClient() to strip a leading huggingface/ before calling the HF client; Pi embedded runner uses it when resolving Hugging Face models so the resolved Model id never includes the prefix. Removed duplicate buildHuggingfaceProvider in models-config.providers.ts. Added tests for the new helper.
  • What did NOT change (scope boundary): Other providers, auth, onboarding, and tag parsing (:cheapest, :fastest, :provider, etc.) are unchanged; only the string sent to the HF inference client is normalized.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Before: Prefixed refs (e.g. huggingface/org/model) could error or send the prefix to the HF API.
  • After: Same refs work; the client receives only the Hub-style id and tags (e.g. org/model:cheapest). No change to non-HF providers or to how users write refs in config/CLI.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: (e.g. macOS 15.4 / Ubuntu 24.04 / Windows 11)
  • Runtime/container: Node 22+ / pnpm dev
  • Model/provider: Hugging Face Inference (HF_TOKEN or HUGGINGFACE_HUB_TOKEN)
  • Integration/channel (if any): N/A (inference/model resolution)
  • Relevant config (redacted): models.providers.huggingface and/or onboarding with Hugging Face selected

Steps

  1. Set HF token (e.g. HF_TOKEN or HUGGINGFACE_HUB_TOKEN).
  2. Use a model ref that includes the huggingface/ prefix (e.g. huggingface/mistralai/Mistral-7B-Instruct-v0.3 or a path that previously sent the prefix).
  3. Run an agent/inference flow that resolves the model and calls the HF inference client.

Expected

  • No error from the prefix; HF API is called with only the Hub-style model id (and tags); inference works.

Actual (before fix)

  • Error when using the preface, or HF client received the full prefixed string and API failed or behaved incorrectly.

Evidence

Attach at least one:

  • Failing test/log before + passing after (unit tests for modelIdForHfInferenceClient; Pi embedded runner model resolution with HF)
  • Trace/log snippets
  • Screenshot/recording (to be added)
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: Unit tests for modelIdForHfInferenceClient (strip prefix, preserve tags, no change when no prefix). Model resolution path in Pi embedded runner uses normalizedModelId for Hugging Face so resolved Model id has no prefix.
  • Edge cases checked: Ref with prefix + tags; ref without prefix; duplicate provider definition removed so build/lint pass.
  • What you did not verify: some currently failing test, but i'll stay up-to-date from now on :-)

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert: Revert this PR; prefixed refs may fail again and the client may receive the prefix.
  • Files/config to restore: src/agents/huggingface-models.ts, src/agents/pi-embedded-runner/model.ts, src/agents/models-config.providers.ts, src/agents/huggingface-models.test.ts.
  • Known bad symptoms: HF inference errors or “invalid model id” from the API; regression in model resolution when provider is Hugging Face.

Risks and Mitigations

  • Risk: A caller elsewhere might rely on the resolved Model id still containing huggingface/ for display or routing.
    Mitigation: Normalization is applied only for the Hugging Face provider in the Pi embedded runner resolution path; display/CLI can still show the user’s original ref; HF API contract requires no prefix.
  • Risk: None others identified for this change.

Greptile Summary

This PR fixes HuggingFace Inference API integration by stripping the huggingface/ prefix before sending model IDs to the HF client. The core fix is small and correct:

  • Added modelIdForHfInferenceClient() helper in src/agents/huggingface-models.ts:28 to normalize model IDs
  • Updated src/agents/pi-embedded-runner/model.ts:61 to use the helper when resolving HuggingFace models
  • Added comprehensive test coverage including mixed-case prefix handling

Key changes:

  • Model IDs like huggingface/deepseek-ai/DeepSeek-R1 are normalized to deepseek-ai/DeepSeek-R1 before calling HF API
  • Routing tags (:cheapest, :fastest, :provider) are preserved correctly
  • Mixed-case prefixes like HuggingFace/ or HUGGINGFACE/ are handled case-insensitively

Scope concern: This PR includes 72 files with extensive refactoring of extensions (bluebubbles, feishu, line, zalo, msteams) and test files. While these appear to be legitimate type safety improvements and refactoring (removing unused types, adding GroupPolicy type annotations), they should ideally be in separate commits or PRs. The PR description claims this is a focused bug fix, but the actual changes include substantial unrelated work.

Confidence Score: 4/5

  • Safe to merge with minor concerns about commit organization
  • The core HuggingFace fix is correct and well-tested. The previous review comments about case-insensitive bugs were incorrect - the implementation handles mixed-case prefixes correctly. The vi.resetModules() cleanup issue has been addressed. However, the PR mixes a focused bug fix with extensive refactoring across 72 files (type safety improvements, test refactoring, and code cleanup in unrelated extensions), which is poor commit hygiene but doesn't introduce functional issues. The score reflects confidence in the correctness of the changes, with a -1 penalty for scope creep and mixing unrelated changes.
  • No files require special attention - the HuggingFace changes are correct and the extension refactoring appears safe, though ideally the unrelated changes should have been separate commits

Last reviewed commit: 09637cb

@openclaw-barnacle openclaw-barnacle bot added docs Improvements or additions to documentation channel: discord Channel integration: discord cli CLI command changes commands Command implementations agents Agent runtime and tooling size: S and removed docs Improvements or additions to documentation labels Feb 22, 2026
@Josephrp

This comment was marked as outdated.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@Josephrp Josephrp marked this pull request as ready for review February 22, 2026 11:06
@Josephrp

This comment was marked as outdated.

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

@openclaw-barnacle openclaw-barnacle bot added gateway Gateway runtime channel: bluebubbles Channel integration: bluebubbles channel: line Channel integration: line channel: feishu Channel integration: feishu and removed gateway Gateway runtime labels Feb 22, 2026
@Josephrp Josephrp force-pushed the bugfix/hf-inference branch from d0f47be to b25576c Compare February 22, 2026 11:34
@openclaw-barnacle openclaw-barnacle bot added channel: mattermost Channel integration: mattermost channel: msteams Channel integration: msteams channel: voice-call Channel integration: voice-call channel: zalo Channel integration: zalo channel: zalouser Channel integration: zalouser app: web-ui App: web-ui gateway Gateway runtime size: XL and removed size: S labels Feb 22, 2026
@Josephrp
Copy link
Contributor Author

@greptileai , i have updated this pr with a squash commit , please update your review based on the actually submitted pr (current)

@Josephrp Josephrp closed this Feb 22, 2026
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

72 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

const prefix = "huggingface/";
const lowerS = s.toLowerCase();
if (lowerS.startsWith(prefix)) {
return s.slice(lowerS.indexOf(prefix) + prefix.length).trim() || s;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overly complex - lowerS.indexOf(prefix) always returns 0 when startsWith check passes

simplify to:

Suggested change
return s.slice(lowerS.indexOf(prefix) + prefix.length).trim() || s;
return s.slice(prefix.length).trim() || s;
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/huggingface-models.ts
Line: 33

Comment:
overly complex - `lowerS.indexOf(prefix)` always returns 0 when `startsWith` check passes

simplify to:
```suggestion
    return s.slice(prefix.length).trim() || s;
```

How can I resolve this? If you propose a fix, please make it concise.

This was referenced Feb 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: web-ui App: web-ui channel: bluebubbles Channel integration: bluebubbles channel: discord Channel integration: discord channel: feishu Channel integration: feishu channel: line Channel integration: line channel: mattermost Channel integration: mattermost channel: msteams Channel integration: msteams channel: voice-call Channel integration: voice-call channel: zalo Channel integration: zalo channel: zalouser Channel integration: zalouser cli CLI command changes commands Command implementations gateway Gateway runtime size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant