    How I code with AI on a budget/free (wuu73.org)
    615 points by indigodaddy - 1 day ago

  • Without tricks, Google AI Studio definitely has limits, though pretty high ones. gemini.google.com, on the other hand, gives you less than a handful of free 2.5 Pro messages.
    by CjHuber - 1 day ago
  • OpenAI offering 2.5M free tokens daily for small models and 250k for big ones (tier 1-2) is so useful for random projects. I use them to learn Japanese, for example, with a program that lists information about what the characters are saying: vocabulary, grammar points, nuances.
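
    For illustration, a minimal sketch of that kind of annotation helper, assuming the OpenAI Python SDK; the model name and prompt are placeholders, not the commenter's actual program:

        import os
        from openai import OpenAI

        # Assumes OPENAI_API_KEY is set in the environment.
        client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

        def annotate(sentence: str) -> str:
            """Ask a small model to break down a Japanese sentence for a learner."""
            resp = client.chat.completions.create(
                model="gpt-4.1-mini",  # placeholder: any small model covered by the free daily tokens
                messages=[{
                    "role": "user",
                    "content": f"Explain this Japanese sentence for a learner: "
                               f"vocabulary, grammar points, nuances.\n\n{sentence}",
                }],
            )
            return resp.choices[0].message.content

        print(annotate("猫が好きです。"))
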
    by GaggiX - 1 day ago
  • I wonder how much energy this is wasting.
    by cammikebrown - 1 day ago
  • For anyone else confused: the post has pages 2 and 3, which you need to access via the arrow at the bottom.
    by Havoc - 1 day ago
  • My experience lines up with the article. The agentic stuff only works with the biggest models. (Well, "works"... OpenAI Codex took 200 requests with o4-mini to change like 3 lines of code...)

    For simple changes I actually found smaller models better because they're so much faster. So I shifted my focus from "best model" to "stupidest I can get away with".

    I've been pushing that idea even further. If you give up on agentic, you can go surgical. At that point even 100x smaller models can handle it. Just tell it what to do and let it give you the diff.

    Also I found the "fumble around my filesystem" approach stupid for my scale, where I can mostly fit the whole codebase into the context. So I just dump src/ into the prompt. (Other people's projects are a lot more boilerplatey so I'm testing ultra cheap models like gpt-oss-20b for code search. For that, I think you can go even cheaper...)
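
    A minimal sketch of that "dump src/ into the prompt" approach; the paths, extensions, and trailing instruction are placeholders, not the commenter's actual script:

        from pathlib import Path

        EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs"}  # adjust to your project

        def dump_src(root: str = "src") -> str:
            """Concatenate every source file under `root` into one prompt-ready blob."""
            parts = []
            for path in sorted(Path(root).rglob("*")):
                if path.is_file() and path.suffix in EXTENSIONS:
                    parts.append(f"===== {path} =====\n{path.read_text(errors='ignore')}")
            return "\n\n".join(parts)

        if __name__ == "__main__":
            # Paste the output into a web chat, then apply the returned diff by hand or with `git apply`.
            print(dump_src() + "\n\nTASK: <describe the change>. Reply with a unified diff only.")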

    Patent pending.

    by andai - 1 day ago
  • To the OP: I highly recommend looking into Continue.dev and Ollama/LM Studio and running models on your own. Some of them are really good at autocomplete-style suggestions, while others (like gpt-oss) can reason and use tools.

    It's my go-to copilot.
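
    A quick way to sanity-check a locally served model, assuming Ollama's default local API on port 11434; the model tag is a placeholder:

        import json
        import urllib.request

        def ask_local(prompt: str, model: str = "gpt-oss:20b") -> str:
            """Send one non-streaming completion request to a local Ollama server."""
            payload = {"model": model, "prompt": prompt, "stream": False}
            req = urllib.request.Request(
                "http://localhost:11434/api/generate",
                data=json.dumps(payload).encode(),
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:
                return json.loads(resp.read())["response"]

        print(ask_local("Write a Python function that reverses a string."))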

    by reactordev - 1 day ago
  • If you're looking for free API access, Google offers access to Gemini for free, including for gemini-2.5-pro with thinking turned on. The limit is... quite high, as I'm running some benchmarking and haven't hit the limit yet.

    Open weight models like DeepSeek R1 and GPT-OSS are also made available with free API access from various inference providers and hardware manufacturers.
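
    A minimal sketch of calling the free Gemini API tier, assuming the google-generativeai Python package and a free AI Studio key; package and model names may have shifted since this was written:

        import os
        import google.generativeai as genai

        # Assumes GEMINI_API_KEY holds a free AI Studio key.
        genai.configure(api_key=os.environ["GEMINI_API_KEY"])

        model = genai.GenerativeModel("gemini-2.5-pro")  # placeholder model name
        resp = model.generate_content("Explain Python's GIL in two sentences.")
        print(resp.text)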

    by chromaton - 1 day ago
  • I'm the person who wrote that. Sorry about the font. It's a bit outdated already; AI stuff moves at high speed and more models have come out since, so I'll try to update it.

    Every month so many new models come out. My new favorite is GLM-4.5... Kimi K2 is also good, and Qwen3-Coder 480B or the 2507 Instruct are very good as well. All of them work really well in agentic environments and agent tools.

    I made a context helper app (https://wuu73.org/aicp), linked from there, which helps me jump back and forth between my IDE and all the different AI chat tabs I have open (which are almost always totally free, and I get the best output from them). The app tries to remove all the friction and annoyances of working with the native web chat interfaces for all the AIs. It's free and has been getting great feedback; criticism welcome.

    It smooths the trip from IDE <----> web chat tabs. I made it for myself to save time, and I prefer the UI (PySide6, so much lighter than a webview).

    It's got preset buttons to add text that you find yourself typing very often, and per-project saves of the app's window size and which files were used for context, so next time it opens in the same state.

    It auto-scans for code files, guesses which ones are likely needed, and has a prompt box that can put your text above and below the code context (which seems to make the output better). One of my buttons is set to: "Write a prompt for Cline, the AI coding agent, enclose the whole prompt in a single code tag for easy copy and pasting. Break the tasks into some smaller tasks with enough detail and explanations to guide Cline. Use search and replace blocks with plain language to help it find where to edit"

    What I do for problem solving and figuring out bugs: I'm usually in VS Code and I type aicp in the terminal to open the app. I fine-tune which files are checked, type what I'm trying to do or what problem I need to fix, click the Cline button, then click Generate Context!. I paste that into GLM-4.5, sometimes o3 or o4-mini, GPT-5, or Gemini 2.5 Pro; if it's a super hard thing I'll try 2 or 3 models. I look at which answer makes the most sense and just copy and paste it into Cline in VS Code, set to GPT-4.1, which is unlimited/free. 4.1 isn't super crazy smart or anything, but it follows orders... it will do whatever you ask, reliably. AND it will correct minor mistakes from the bigger model's output. The bigger, smarter models can figure out the details, and they'll write a prompt that is a task list with how-tos and whys, perfect for 4.1 to go and do in agent mode.

    You can code for free this way, unlimited, and it's the smartest the models will be. Any time you throw tools or MCPs at a model, you dumb it down... AND you waste money on API costs by having to use Claude 4 for everything.

    by radio879 - 1 day ago
  • Windsurf has a good free model. Good enough for autocomplete level work for sure (haven't tried it for more as I use Claude Code)
    by bravesoul2 - 1 day ago
  • I jump between Claude Sonnet 4 on GitHub Copilot Pro and now GPT-5 on ChatGPT. That seems to get me pretty far. I have gpt-oss:20b installed with ollama, but haven't found a need to use it yet, and it seems like it just takes too long on an M1 Max MacBook Pro 64GB.

    Claude Sonnet 4 is pretty exceptional. GPT-4.1 asks me too frequently whether it should move forward. Yes! Of course! Just do it! I'll reject your changes or do something else later. The former gets a whole task done.

    I wonder if anyone is getting better results, or comparable for cheaper or free. GitHub Copilot in Visual Studio Code is so good, I think it'd be pretty hard to beat, but I haven't tried other integrated editors.

    by andrewmcwatters - 1 day ago
  • > When you use AI in web chat's (the chat interfaces like AI Studio, ChatGPT, Openrouter, instead of thru an IDE or agent framework) are almost always better at solving problems, and coming up with solutions compared to the agents like Cline, Trae, Copilot.. Not always, but usually.

    I completely agree with this!

    While I understand that it looks a little awkward to copy and paste your code out of your IDE into a web chat interface, I generally get better results that way than with GitHub Copilot or Cursor.

    by joshdavham - 1 day ago
  • Just use Rovodev CLI. It gives you 20 million tokens for free per 24 hours, and you can switch between Sonnet 4 / GPT-5.
    by hgarg - 1 day ago
  • As of today, what is the best local model that can be run on a system with 32 GB of RAM and 24 GB of VRAM?
    by xvv - 1 day ago
  • I think there’s huge potential for a fully local “Cursor-like” stack — no cloud, no API keys, just everything running on your machine.

    The setup could be:

    • Cursor CLI for agentic/dev stuff (example: https://x.com/cursor_ai/status/1953559384531050724)
    • A local memory layer compatible with the CLI, something like LEANN (97% smaller index, zero cloud cost, full privacy, https://github.com/yichuan-w/LEANN) or Milvus (though Milvus often ends up cloud/token-based)
    • Your inference engine, e.g. Ollama, which is great for running OSS GPT models locally

    With this, you'd have an offline, private, and blazing-fast personal dev+AI environment. LEANN in particular is built exactly for this kind of setup: tiny footprint, semantic search over your entire local world, Claude Code/Cursor compatible out of the box, with Ollama for generation. I guess this solution is not only free but also doesn't need any API keys.

    I do agree that this needs some effort to set up, but maybe someone can make it easy and fully open source.

    by yichuan - 1 day ago
  • I bet it seems crazy to some people that others are okay with giving up so much of their data for free tiers. Like, yeah, it's better to self-host, but it takes so many resources to run a good enough LLM at home that I'd rather give up my code for some free usage; that code will eventually end up open source anyway.
    by qustrolabe - 1 day ago
  • I replicate spec-driven development (SDD) from Kiro; it works wonders when switching between models because I can just re-fetch from the specs folder.
    by tonyhart7 - 1 day ago
  • Wow, there's a lot here that I didn't know about. Just never drilled that far into the options presented. For a change, I'm happy that I read the article rather than only the comments on HN. ;)

    And lots of helpful comments here on HN as well. Good job everyone involved. ;)

    by gexla - 1 day ago
  • This all sounds a lot more complicated and time consuming than just writing the damn code yourself.
    by sublinear - 1 day ago
  • To stop tab switching, I built an extension that queries all the free models at once: https://llmcouncil.github.io/llmcouncil/
    by hoerzu - 1 day ago
  • As the post says, the problem with coding agents is that they send a lot of their own data plus almost your entire code base with each request; that's what makes them expensive. But when you use the models in a chat, the costs are so low as to be insignificant.

    I only use OpenRouter which gives access to almost all models.

    Sonnet was my favorite until I tried Gemini 2.5 Pro, which is almost always better. It can be quite slow though. So for basic questions / syntax reminders I just use Gemini Flash: super fast, and good for simple tasks.
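
    A minimal sketch of hitting an OpenRouter free model through its OpenAI-compatible API; the model slug is a placeholder, and free-tier slugs change over time:

        import os
        from openai import OpenAI

        # OpenRouter exposes an OpenAI-compatible endpoint; assumes OPENROUTER_API_KEY is set.
        client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=os.environ["OPENROUTER_API_KEY"],
        )

        resp = client.chat.completions.create(
            model="deepseek/deepseek-r1:free",  # placeholder free-tier slug
            messages=[{"role": "user", "content": "Remind me how str.partition works in Python."}],
        )
        print(resp.choices[0].message.content)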

    by bambax - 1 day ago
  • A lot of work went into evaluating these models. Thank you.
    by worik - 1 day ago
  • Slightly off topic: what are good open-weight models for coding that run well on a MacBook?
    by chvid - 1 day ago
  • Was the page done with AI? The scrolling is kinda laggy. Firefox / M3 Pro.
    by nottorp - 1 day ago
  • I'd love to see a thread that also takes advantage of student offers - for example, GitHub Copilot is free for university and college students
    by Weetile - 24 hours ago
  • I only use LLMs as a substitute for Stack Exchange, and sometimes to write boilerplate code. The free chat provided by DeepSeek works very well for me, and I've never encountered any usage limits. V3 / R1 are mostly sufficient. When I need something better (not very often), I use Claude's free tier.

    If you really need another model or a custom interface, it's better to use OpenRouter: deposit $10 and you get 1,000 free queries/day across all free models. That $10 will be good for a few months, at the very least.

    by precompute - 23 hours ago
  • Now all we need is a wrapper/UI/manager/aggregator for all these "free" AI tools/pages so that we can use them without going into the hassle of changing tabs ;-)
    by NKosmatos - 23 hours ago
  • Why are people still drawn to using pointless AI assistants for everything? What time do we save by making the code quality worse overall?
    by burgerone - 23 hours ago
  • The ChatGPT free tier doesn't seem to expire, unlike Claude or Mistral AI; you just get downgraded to a different model.
    by hoppp - 21 hours ago
  • Let's just be honest about what it is we actually do: The more people maximize what they can get for free, the more other people will have to shoulder the higher costs or limitations that follow. That's completely fine, not trying to pass judgement – but that's certainly not "free" unless you mean exactly "free for me, somebody else pays".
    by jstummbillig - 21 hours ago
  • These tricks are a little too much for me. I'd rather just write the code myself instead of opening 20 tabs, each with a different LLM chat.

    However, I'd like to mention a tool called repomix (https://repomix.com/), which will pack your code into a single file that can be fed to an LLM's web chat. I typically feed it to Qwen3 Coder or AI Studio with good results.

    by brokegrammer - 20 hours ago
  • OP must be a master of context switching! I can't imagine opening that number of tabs and still being able to focus.
    by Oras - 20 hours ago
  • Maybe it's optimistic, but reading posts like this makes me hopeful that AI-assisted coding will drive people to design more modular and sanely organized code, to reduce the amount of context required for each task. Sadly, pretty much all the code I have worked with has been a giant mess of everything being connected to everything else, making the entire project potential context for anything.
    by 3036e4 - 20 hours ago
  • Why is Mistral not mentioned? Is there any reason? I have the impression that they are often ignored by media, bloggers, and devs when it comes to comparing or showcasing LLM things. It comes with a free tier and the quality is quite good. (But I am not an AI power user.) https://chat.mistral.ai/chat
    by 5kyn3t - 20 hours ago
  • It's not free FREE, but if you deposit at least $10 on OpenRouter, you can use their free models without drawing down your credit. And those models are quite powerful, like DeepSeek R1. Sometimes they are rate-limited by the provider due to their popularity, but it works in a pinch.
    by jug - 20 hours ago
  • Nice write-up, especially the point about mixing different models for different stages of coding. I’ve been tracking which IDE/CLI tools give free or semi-free access to pro-grade LLMs (e.g., GPT-5, Claude code, Gemini 2.5 Pro) and how generous their quotas are. Ended up putting them side-by-side so it’s easier to compare hours, limits, and gotchas: https://github.com/inmve/free-ai-coding
    by codeclimber - 19 hours ago
  • AI Studio, at https://aistudio.google.com/, is unlimited.

    I also use Kiro, which I got access to completely free because I saw it early and actually tried it out thanks to Hacker News!

    Sometimes I use the Cerebras web UI to get insanely fast token generation for things like gpt-oss, Qwen 480B, or Qwen models in general.

    I want to thank Hacker News for Kiro! I mean, I am really grateful to this platform, y'know. Not just for the free stuff but in general too. Thanks :>

    by Imustaskforhelp - 19 hours ago
  • The Qwen Coder CLI gives you 1,000 free requests per day to the Qwen3-Coder model (480B). Probably the best free option right now.
    by scosman - 16 hours ago
  • Looks like somebody is a tad over-reliant on these tools, but other than that there is a lot of value in this article.
    by gkoos - 15 hours ago
  • You might find this repo helpful, it compares popular coding tools by hours with top-tier LLMs like Claude Sonnet: https://github.com/inmve/free-ai-coding
    by imasl42 - 14 hours ago
  • https://claude.ai https://chat.z.ai https://chatgpt.com https://chat.qwen.ai https://chat.mistral.ai https://chat.deepseek.com https://gemini.google.com https://dashboard.cohere.com https://copilot.microsoft.com
    by matrixhelix - 13 hours ago
  • This is nightmarish, whether or not you like LLMs.

    Just use Amazon Q Developer for free; it covers every single area you need, in every context you need (IDE, CLI, etc.).

    by iLoveOncall - 12 hours ago
  • Ha, I'm working on a similar tool: https://github.com/DrSiemer/codemerger

    Glad to see I'm not the only one who prefers to work like that. I don't need many different models, though; the free version of Gemini 2.5 Pro is usually enough for me. The 1,000,000-token context length in particular is really useful. I can just keep dumping full code merges in.

    I'll have a look at the alternatives mentioned though. Some questions just seem to throw certain models into logic loops.

    by DrSiemer - 12 hours ago
  • All the AI companies have a free model; that's enough to use them for free, no?
    by zwnow - 3 hours ago
  • It's another advertisement article
    by personjerry - 3 hours ago
