- I stopped when I got to this sentence and realized the article is written by one of the companies mentioned. (by apparent - 11 hours ago)
> GLM-4.5 and GLM-4.5-Air are our latest flagship models
Maybe it is great, but with a conflict of interest so obvious I can't exactly take their word for it.
- Available on OpenRouter as well for those who want to test it: https://openrouter.ai/z-ai/glm-4.5 (by stingraycharles - 11 hours ago)
I would be interested to know where the claim of the “killer combination” comes from. I would also like to know who the people behind Z.ai are — I haven’t heard of them before. Their plans seem crazy cheap compared to Anthropic, especially if their models actually perform better than Opus.
- Okay, I'm going to try it, but why didn't you link the information on how to integrate it with Claude Code: https://docs.z.ai/scenario-example/develop-tools/claude (by arjie - 11 hours ago)
Chinese software always has such a design language:
- prepaid and then use credit to subscribe
- strange serif font
- that slider thing for captcha
But I'm going to try it out now.
- Been using that for a while, first Chinese model that works REALLY well! (by steipete - 10 hours ago)
Also fascinating how they solved the issue that Claude Code expects a model with a 200k+ token context window while GLM 4.5 has 128k.
- I wonder how you justify this editorialized title, and if HN mods share your justification. The linked article doesn't contain the word "killer" at all. (by raincole - 10 hours ago)
I think this is why many people have concerns about AI. This group can't express ideas neutrally; they have to hype up even a simple official documentation page.
- Hmm, with the lower context length I wonder how it holds up on problems requiring slightly larger context, given that most models tend to degrade fairly quickly as context grows. (by Jcampuzano2 - 10 hours ago)
Maybe it's best for shorter tasks or condensed context?
I find it interesting how many models are latching onto Claude Code's harness. I'm still using Cursor for work and personal projects, but tried out OpenCode and Claude Code for a bit. I just miss having the checkpoints and whatnot.
- I've been using GLM 4.5 and GLM 4.5 Air for a while now. The Air model is light enough to run on a MacBook Pro and is useful for Cline. I can run the full GLM model on my Mac Studio, but the TPS is so slow that it's only useful for chatting. So I hooked it up to OpenRouter to try, but didn't have the same success: any of the open-weight models I try through OpenRouter give substandard results. I get better results from Qwen 3 Coder 30B A3B locally than I get from Qwen 3 Coder 480B through OpenRouter. (by chisleu - 10 hours ago)
I'm really concerned that some of the providers are using quantized versions of the models so they can run more models per card and larger batches of inference.
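If quantization is the worry, OpenRouter's provider-routing options can constrain which deployments a request is allowed to hit. A rough sketch below, assuming the `provider.quantizations` and `allow_fallbacks` fields as described in OpenRouter's provider-routing docs; treat the exact field names and accepted values as something to verify against their current documentation.

```python
import os
import requests

# Ask OpenRouter to route only to providers serving the model at an acceptable
# precision, and to fail rather than silently fall back to anything else.
# The provider-routing fields below should be checked against OpenRouter's docs.
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "z-ai/glm-4.5",
        "messages": [{"role": "user", "content": "Write a binary search in Python."}],
        "provider": {
            "quantizations": ["bf16", "fp8"],  # reject more aggressive quantizations
            "allow_fallbacks": False,          # error instead of silently rerouting
        },
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```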
- Used it to fix a couple of bugs just now in Elixir and it runs very fast, faster than Codex with GPT-5 on medium or high. (by sergiotapia - 9 hours ago)
This is quite nice. Will try it out a bit longer over the weekend. I tested it using Claude Code with env variable overrides.
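For anyone wondering what those overrides look like: Z.ai exposes an Anthropic-compatible endpoint, so Claude Code can be pointed at it purely through environment variables. A minimal sketch in Python; the variable names are the ones Claude Code reads, while the base URL and model id are assumptions taken from Z.ai's integration doc linked earlier and should be double-checked there.

```python
import os
import subprocess

# Point Claude Code at Z.ai's Anthropic-compatible endpoint via env vars.
# ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN / ANTHROPIC_MODEL are read by
# Claude Code; the URL and model id are assumptions from Z.ai's docs.
env = {
    **os.environ,
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": os.environ["ZAI_API_KEY"],  # your Z.ai key
    "ANTHROPIC_MODEL": "glm-4.5",
}
subprocess.run(["claude"], env=env)
```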
- So you can use Claude Code with other models? I had assumed that it was tied to your subscription and that was that. (by abrookewood - 8 hours ago)
- I was blown away by this model. It was definitely comparable to Sonnet 4. In some of my tests, it performed as well as Opus. I subscribed to their paid plan, and now the model seems dumb? I asked it to find and replace a string; it only made the change in one file. Codex worked fine. Can Z.ai confirm whether this is the same model we get through their API, or is it quantized for Claude Code use? (by sagarpatil - 8 hours ago)
- This is really cool and should work well with something like RooCode as well. Usually I keep going back to either Claude Sonnet or Gemini 2.5 Pro (also tried out GPT-5, was quite unimpressed), but both of those are relatively expensive. (by KronisLV - 6 hours ago)
I've tried using the more expensive model for planning and something a bit cheaper for doing the bulk of changes (the Plan / Ask and Code modes in RooCode), which works pretty nicely, but settling on just one model like GLM 4.5 would be lovely! The closest I've gotten to that so far has been the Qwen3 Coder model on OpenRouter.
I think I used about 40M tokens with Claude Sonnet last month, and more on Gemini and others; that's a bit expensive for my liking.
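For a rough sense of scale, a back-of-the-envelope comparison at list prices: Sonnet at $3/$15 per million input/output tokens versus the $0.6/$2.2 GLM 4.5 pricing quoted later in this thread. The 80/20 input/output split is an assumption; agentic coding tends to be input-heavy.

```python
# Back-of-the-envelope monthly cost for ~40M tokens, split 80/20 input/output.
# Prices per 1M tokens: Claude Sonnet $3 in / $15 out; GLM 4.5 $0.6 in / $2.2 out
# (the GLM figures are the ones Z.ai advertises, quoted elsewhere in the thread).
TOTAL_TOKENS = 40_000_000
input_tokens = 0.8 * TOTAL_TOKENS
output_tokens = 0.2 * TOTAL_TOKENS

def monthly_cost(in_per_m: float, out_per_m: float) -> float:
    return (input_tokens * in_per_m + output_tokens * out_per_m) / 1_000_000

print(f"Sonnet:  ${monthly_cost(3.0, 15.0):,.0f}")   # ~$216
print(f"GLM 4.5: ${monthly_cost(0.6, 2.2):,.0f}")    # ~$37
```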
- Not just Claude Code. Their $3 and $15 plans work even better with tools like Roo Code. (by jedisct1 - 5 hours ago)
After the Claude models recently got dumber, I switched to Qwen3-Coder (there's a very generous free tier) and GLM 4.5, and I'm not looking back.
- Anthropic can't compete with this on cost. They're probably bleeding money as it is. (by unsupp0rted - 5 hours ago)
But they can sort of compete on model quality, by no longer dumbing down their models. That'll be expensive too, but it's a lever they have.
- It's weird, but using Claude Code (CC) with GLM 4.5 from Z.ai, I spent $5 just starting CC and asking it to read the spec and guideline files. This is on an API that advertises $0.6 per million input tokens and $2.2 per million output tokens. (by prmph - 4 hours ago)
Then I noticed that after the original prompt, any further prompt I gave CC (other than approving its actions) had no effect at all. Yes, every prompt was totally ignored; I have to stop CC and restart it with the new prompt for it to take effect.
This is really strange to me. Does CC have some way of detecting when it is running against an API other than Anthropic's? The massive cost and/or token usage and the crippled agentic mode are clearly not due to GLM 4.5 being incapable. Either CC is crippled in some way when using a third-party API, or Z.ai's API is somehow not working properly.
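One way to narrow this down is to call Z.ai's Anthropic-compatible endpoint directly, bypassing Claude Code, and look at the token usage the API reports per request. A sketch with the anthropic Python SDK; the base URL and model id are assumptions taken from Z.ai's docs, so verify them there.

```python
import os
import anthropic

# Hit Z.ai's Anthropic-compatible endpoint directly and print the token usage
# it reports for one small request. Base URL and model id are assumptions from
# Z.ai's integration docs.
client = anthropic.Anthropic(
    base_url="https://api.z.ai/api/anthropic",
    api_key=os.environ["ZAI_API_KEY"],
)

resp = client.messages.create(
    model="glm-4.5",
    max_tokens=512,
    messages=[{"role": "user", "content": "Reply with one sentence: hello."}],
)

# If these numbers look sane, the startup cost is coming from how much context
# the client stuffs into its requests, not from the API's accounting.
print("input tokens: ", resp.usage.input_tokens)
print("output tokens:", resp.usage.output_tokens)
```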
- The graphs that some of these companies make are brutal: https://docs.z.ai/guides/llm/glm-4.5#higher-parameter-effici... (by conradev - 20 minutes ago)
"So, we only know the Y-axis for some models in the scatter plot. Let's make up an X-axis value on the bad side of the graph and include the data points anyway."
Visually disingenuous!