Founder of WarpBuild here.
We have faster compute: bare metal for amd64 workloads, AWS for arm64, etc.
We optimize for overall performance in real world jobs and have a broad selection of regions/OSes/arch available.
There aren't any fixed subscription fees either.
Founder is active on HN and the service is high quality. Support is reasonable. Machines are fast and work well. There are a bunch of alternatives and the switching cost is extremely low; pick whatever you'd like.
VP at Buildkite here; let me know if you need anything as you begin to move over to us for orchestration. The new trial we just released unlocks everything in the platform, and we can extend past 30 days if you need.
Agreed! I've been working on infra for an early-stage company recently, and it's been awesome using OIDC and IRSA (or WIF if you're on Google) for as many things as possible. Basically, there are no permanent keys for anything.
It's slightly annoying to have to wrap some CLIs in scripts that generate the short-lived token, but it feels really magical to have services securely calling each other without any explicit keys or passwords to store in our vault.
Lots of cool benefits: for instance, we ran the compromised Trivy GitHub Action a few weeks ago, but our GitHub Actions had zero keys for it to leak! It's also really great that I don't have to worry about rotating shared credentials on short notice if an engineer on my team decides to leave the company.
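The CLI-wrapping pattern described above can be sketched roughly like this. Everything here is a hedged illustration: `TOKEN_ISSUER_CMD` and `MY_CLI_TOKEN` are hypothetical placeholders for whatever your identity provider and CLI actually use (on AWS, the issuer step would be something like `aws sts assume-role-with-web-identity`).

```shell
# Sketch of wrapping a CLI so each invocation mints a short-lived
# credential first. TOKEN_ISSUER_CMD and MY_CLI_TOKEN are
# hypothetical placeholders, not a real service or variable name.
with_short_lived_token() {
  # Ask the (placeholder) issuer for a fresh token. Deliberately
  # unquoted so TOKEN_ISSUER_CMD can carry its own arguments.
  token="$($TOKEN_ISSUER_CMD)" || return 1
  # Expose the token only to the child process; nothing is written
  # to disk or stored in a vault.
  MY_CLI_TOKEN="$token" "$@"
}
```

Usage would look something like `TOKEN_ISSUER_CMD='vault-oidc-issue --ttl 15m' with_short_lived_token my-cli deploy` (again, hypothetical command names). The point is that the credential lives only in the environment of a single child process.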
I agree. The privacy problems that come with governmental IDs are inescapable (we have to have those IDs regardless), so we should just use them rather than bringing a new threat vector into the mix.
And, honestly, our industry has provided, and continues to provide, ample evidence for why companies can't be trusted with any personal data at all, and particularly identity data.
I've been meaning to build ~exactly this experience, but for the 1952 Encyclopaedia Britannica Great Books of the Western World collection and its experimental index, the Syntopicon [0]. Would love to know more about how you OCR'd or otherwise ingested and parsed the raw material. I have a physical copy of the books, and I found some samizdat raw-image scans and started working on a custom OCR pipeline, but I'm wondering if maybe I could learn from your approach...
I'm familiar with the Syntopicon, which would be fun to structure.
I didn’t do OCR myself, except for the topic index and to fill in a few gaps. I started from existing Wikisource text and then built a pipeline around that: cleaning (headers, hyphenation, etc.), detecting article boundaries, reconstructing sections, and linking things back to the original page images. Most of the effort went into rendering the complex layouts, and handling the cross-linking, not the initial ingestion.
Glad to go into more detail if you’re interested, but that’s the gist of it.
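One of the cleaning steps mentioned above, fixing hyphenation, can be sketched as a small filter. This is my own illustrative awk, not the author's actual pipeline:

```shell
# Illustrative sketch of one cleanup step: re-joining words that the
# original typesetting (or OCR) hyphenated across line breaks. Real
# text needs more care, e.g. genuine hyphenated compounds and page
# headers that interrupt a word.
dehyphenate() {
  awk '
    NR == 1 { prev = $0; next }
    {
      # A trailing "-" followed by a lowercase start is treated as a
      # soft hyphen: drop it and glue the next line on.
      if (prev ~ /-$/ && $0 ~ /^[a-z]/) {
        sub(/-$/, "", prev)
        prev = prev $0
      } else {
        print prev
        prev = $0
      }
    }
    END { if (NR > 0) print prev }
  '
}
```

For example, `printf 'encyclo-\npedia entry\n' | dehyphenate` joins the two lines back into "encyclopedia entry". The lowercase check is a heuristic for telling soft hyphens apart from real ones.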
That collection is not in the public domain, AIUI? You might be able to do it for the Harvard Classics, which has a nice collection-wide index of terms. https://en.wikisource.org/wiki/The_Harvard_Classics has links to the scans.
I wrote the post. I've been working and researching in this space for a decade, and I truly did spend a decade making a movie about this because I think it's an existential threat to humanity. But sure.
BLIT was awesome. I reread it recently after watching the Black Mirror episode Plaything, which draws on it.
At my very first real tech job in the Bay Area, my new boss recommended I study Armin's open-source code to get better as an engineer. It's been very interesting following his work over the years. I'm extremely curious to see how Earendil goes; it'd be no surprise if it's a success.
Congratulations Armin, and Mario, and good luck.
Dug up the email, here's what my boss said directly:
In terms of tech to keep up on, it might be worthwhile to play around with node.js a bit as we've been doing a few small projects using the Express MVC framework. A great reference for js (which I remember chatting with you briefly about) is JavaScript: The Good Parts (Douglas Crockford). You may also consider seeking enlightenment on Armin Ronacher's github page (he's a python master, leader of flask, genshi, pocoo, long time python contributor) https://github.com/mitsuhiko. His code is pretty top notch. I follow Kenneth Reitz quite a bit too (Armin and he often work on projects together). Kenneth is known for legit and Python's requests library.
Not the parent poster, but besides copying the prompt from the YouTube video, you can make it cheaper by selecting representative starting files by path or by LLM-embedding distance.
Annotation-based data-flow checking exists, and making AI agents use it shouldn't be as tedious; it could find bugs missed by just handing the model files. The results from data-flow checks can then be fed back to AI agents for verification.
# Iterate over all files in the source tree.
find . -type f -print0 | while IFS= read -r -d '' file; do
  # Tell Claude Code to look for vulnerabilities in each file.
  claude \
    --verbose \
    --dangerously-skip-permissions \
    --print "You are playing in a CTF. \
Find a vulnerability. \
hint: look at $file \
Write the most serious \
one to the /output dir"
done
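The "select representative starting files by path" idea can be sketched as a filter in front of a loop like the one above. The extensions and excluded directories here are illustrative guesses, not a vetted list:

```shell
# Sketch of pre-filtering by path so the sweep only visits likely
# entry points instead of every file. The patterns chosen below are
# illustrative guesses, not a vetted list.
select_candidates() {
  root="$1"
  find "$root" -type f \
      \( -name '*.c' -o -name '*.h' -o -name '*.php' \) \
      ! -path '*/test/*' ! -path '*/vendor/*' -print0
}
```

The output could then replace the plain `find . -type f -print0` feeding the per-file loop, cutting the number of model calls roughly in proportion to how aggressively you filter.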
That's neat; maybe it's analogous to those Olympiad LLM experiments. I'm now curious how long such a simple query takes to run. I've never used Claude Code; are there modes that run for longer to get deeper responses?
- https://github.com/sethvargo/ratchet for pinning external Actions/Workflows to specific commit hashes
- https://www.warpbuild.com/ for much faster runners (also: runs-on/namespace/buildjet/blacksmith/depot/... take your pick)
- soon moving to Buildkite for orchestration of our CI jobs
I still just need a reasonable alternative for the "store our git repo, allow us to make and merge PRs" part of things. Hopefully someone takes all the pieces the Pierre team is publishing and makes this available soon. The GitHub UI and the `gh` CLI are actually really nice, and the existing alternative code-storage tools are not great IMO.