The biggest ones are: using tree-sitter to index code files as a tool, a code_execution tool that runs a workflow of tools inside a Python interpreter (monty), and not being a harness developed by the company profiting from selling you the shovels (and introducing "dynamic workflows", aka spawning 50 agents).
Author of the text here. I'll be honest about why I wrote it: rtk ai looks very odd to me as a software engineer, given the number of stars, no mention of accuracy, and how management is pushing that stuff to optimize costs. Now people are wrapping every possible command in rtk, trying to handle every major command and decide which output you should get.
Would sincerely love to hear your thoughts on https://www.github.com/jahala/tilth - it’s a different approach than RTK, benchmarked to reduce cost per correct answer by ~40%
That’s a fair point. The rtk promotional posts point to 60-90% token savings, and there is no mention of how they perform accuracy-wise. The commenter below did a great job pointing to a resource showing caveman and rtk saving just a couple of bucks on a $926 bill. Thanks, Llyoyd Christmas, for linking to a useful substack.
TL;DR: ~3-4% savings on actual API costs with rtk, caveman, and headroom combined, but nothing tangible on whether those cost reductions came at the cost of quality. By their calculations, rtk saved them $4.96 on a $926 bill.
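For scale, the rtk figure quoted above works out to well under 1% of the bill. A quick sketch of the arithmetic, assuming the $926 is the total bill the savings are measured against (the function name is made up for illustration):

```python
def savings_share(saved_usd: float, bill_usd: float) -> float:
    """Return savings as a percentage of the bill (assumed to be the baseline)."""
    if bill_usd <= 0:
        raise ValueError("bill_usd must be positive")
    return 100.0 * saved_usd / bill_usd


# Figures quoted in the thread: $4.96 saved by rtk on a $926 bill.
rtk_share = savings_share(4.96, 926.0)
print(f"rtk share of bill: {rtk_share:.2f}%")  # prints "rtk share of bill: 0.54%"
```

This says nothing about quality; it only puts the dollar figure next to the 60-90% token-savings claims.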
Fwiw, I just ran the steps to reproduce and got `Error: prettier produced no output` on rtk (0.42.2). Not saying this isn't valid for the user's environment, but I could not reproduce on Linux.
So how do you justify its usage if it's not saving much and the work feels similar? They have 664 open issues, and some of them are quite funny: tools are called and return success even though they aren't even installed.
My take is that handling so many versions and so many different tools shouldn't be the work of any single repo. The responsibility should be either on the coding agent to compress or, best case, on the people responsible for the CLI tool.
I'm not justifying its usage, and I don't have to.
I've been trying it out for a couple days and it seems kinda OK or whatever. If that upsets you, then that's your problem.
I might dump it later on if it doesn't provide much of a benefit. I typically try out new things, then cull whatever doesn't work. This tool seems pretty neutral for now, at least.
No, it doesn't upset me. I am open to discussion; there might be things I miss and don't understand. I am just trying to get why it's been pushed so hard lately and whether the benefits are really there. Sorry if I sounded upset to you, but I am trying to be really civil and am just genuinely curious.
Well, I'm sorry as well. I mistakenly assumed you were being confrontational.
There are a lot of people who have negative knee-jerk reactions to any AI stuff, new workflows (I'll agree there is a lot of garbage being shilled in this space), etc., and I jumped the gun by lumping you into that group.
Nope, I am doing my master’s thesis on fine-tuning LLMs at 36, so I am into this stuff, but it’s been very weird lately. I’ve been a self-taught dev and I definitely was missing computer science concepts, so I'm excited to fill the gaps, although the timing wasn’t perfect.
Good conversation! Great pushback against my arguments. That’s what I signed up to Hacker News for, and I've been missing that spirit recently.