
> Infrastructure as Code, not infrastructure as YAML.

Right on.

It's amazing to me that we've spent decades with programming languages and environments which can accurately guess what you're about to type next, which have enormous expressiveness while maintaining cogency, which are intuitive and well understood by humans, which have endless libraries and an infinity of ways of connecting with the world.

And what do we use to configure the most sophisticated infrastructure to run such code? Yet another mark-up language!


Many domains are better served by a more limited programming language, so you can analyze a program and/or make guarantees about it.

Real regexes (actually regular ones) are infinitely better than Python code matching the same string (when they suffice) - you can compute their intersection, union, and complement; you can check whether they can match anything at all (and generate an example match automatically).
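To make the analyzability point concrete, here's a minimal sketch (the dict-based DFA encoding and function name are made up for illustration, not any library's API) of deciding whether the intersection of two regular languages is empty via the product automaton:

```python
# Minimal sketch: true regular languages, encoded as DFAs, are closed
# under intersection, and emptiness of the intersection is decidable.
# The DFA encoding here is hypothetical, purely for illustration.

def product_accepts_nothing(dfa_a, dfa_b):
    """Search the product automaton; True if no word is accepted by both."""
    start = (dfa_a["start"], dfa_b["start"])
    seen, frontier = {start}, [start]
    while frontier:
        sa, sb = frontier.pop()
        if sa in dfa_a["accept"] and sb in dfa_b["accept"]:
            return False  # reachable state pair accepting in both languages
        for sym in dfa_a["alphabet"] & dfa_b["alphabet"]:
            nxt = (dfa_a["delta"][(sa, sym)], dfa_b["delta"][(sb, sym)])
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True

# Words over {a, b} with an even number of 'a's...
even_a = {
    "alphabet": {"a", "b"}, "start": 0, "accept": {0},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
}
# ...and words containing at least one 'a'.
has_a = {
    "alphabet": {"a", "b"}, "start": 0, "accept": {1},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1},
}

print(product_accepts_nothing(even_a, has_a))  # False: "aa" is in both
```

Try asking any of those questions about an arbitrary Python matcher function: you can't, which is exactly the trade being described.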

For software builds, Bazel and others use Starlark, which is a restricted Python subset, so builds can be guaranteed finite and can be reasoned about.

Ansible may or may not offer benefits in return for its limits (I am not an Ansible guru), but in general most tasks do not need a Turing-complete configuration/specification language - and then it is better NOT to have Turing completeness.


The "you don't want a full programming language" trope gets repeated a lot, but I think far more people end up wishing for a Turing-complete language than wishing their language _wasn't_ Turing complete.

They do, until a configuration endless loop brings down their production system.

This is not really different from C vs Rust, or even Perl regular expressions (unbounded execution time) vs real regular expressions. With great power comes great ability to shoot yourself in the foot.

The power/guarantee balance is delicate, and you can’t hold the stick at both ends. People will always complain.


This is exactly what the Starlark language was developed to solve, initially for Bazel but now used in other places too. It's a "full scripting language" that intentionally doesn't (in its default configuration) support recursion or unbounded loops, so it is deterministic with bounded execution time. I really wish more projects would reach for it as a configuration language.

https://github.com/bazelbuild/starlark
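For a flavor of the restriction, here's a small Starlark fragment in the Bazel style (the file name, macro name, and targets are hypothetical, not from any real project):

```starlark
# lib/variants.bzl -- hypothetical example, for illustration only.
# Starlark deliberately has no `while` loop and forbids recursion,
# so evaluating this file is guaranteed to terminate.

def make_variant_libs(name, variants):
    """Instantiate one py_library per variant in a fixed, finite list."""
    for variant in variants:  # `for` over a finite list is allowed
        native.py_library(
            name = name + "_" + variant,
            srcs = [name + "_" + variant + ".py"],
        )

# `while True: ...`, or make_variant_libs calling itself, would be
# rejected at load time -- exactly the guarantee Bazel wants.
```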


I have such mixed feelings about Starlark and Bazel macros. When I write Bazel macros, they're great, the perfect tool for the job. When I encounter macros written by someone else, they are awful, a mistake and the bane of my existence.

A lot of this is a matter of taste and judgement.

In the same way that it's possible to have an xml/json/yaml/toml config that creates despair in those who have to maintain it, a Python or bash script can grow into a monster in the basement.

Or, it could be a cogent script that makes its intent and operation obvious. I prefer that when possible.


The environment around the language can put in limits (on time, number of operations, etc.)

Convex does this well, replacing SQL (somewhat yaml-like sucky old declarative language) with JS/TS but in a well-locked-down environment with limits to ensure one mutation or query doesn’t take down the whole DB.
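As a sketch of that approach in Python (illustrative only; a real sandbox needs far more than this, and the names here are made up), the host environment can impose a hard step budget on arbitrary configuration code via a trace hook, so an endless loop gets cut off instead of hanging the system:

```python
# Sketch: limits imposed by the ENVIRONMENT rather than the language.
# We count trace events with sys.settrace and abort past a budget.
import sys

class BudgetExceeded(Exception):
    pass

def run_with_budget(code, max_steps=10_000):
    """Execute untrusted config code under a hard step budget."""
    steps = 0
    def tracer(frame, event, arg):
        nonlocal steps
        steps += 1
        if steps > max_steps:
            raise BudgetExceeded("exceeded %d steps" % max_steps)
        return tracer  # keep tracing line-by-line
    sys.settrace(tracer)
    try:
        exec(code, {"__builtins__": {}})  # no builtins for the guest
    finally:
        sys.settrace(None)

# A config loop that never terminates is stopped by the host.
try:
    run_with_budget("while True:\n    pass")
except BudgetExceeded as e:
    print("stopped:", e)
```

The same idea scales up to real sandboxes: wall-clock deadlines, memory caps, and instruction counters, all enforced outside the guest language.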


The number of times I've seen a configuration endless loop bring down anything is vanishingly small compared to the time wasted on DSLs and on bending over backwards to do things a first-class programming language can do simply. Same with pathological PCRE runtime: I've seen that maybe... once.

> It's amazing to me that we've spent decades with programming languages and environments which can accurately guess what you're about to type next, which have enormous expressiveness

You've almost guessed the problem. Too much expressiveness is a bad thing. This is a problem I encounter a lot more often than I'd like. It is very often much easier to build something more generic than what the user actually needs, and then testing it becomes a nightmare.

To make this more concrete, here's a case I'm working on right now. Our company provides customers with a tool to manage large amounts of compute resources (in HPC domain). It's possible to run the product on-prem, or in different clouds, or a combination of both. Typically, the management component comes with a PXE boot and unfolds from there. A customer wanted integration with a particular cloud provider that doesn't support this management style, nor can it provide a spare disk to be used for management, nor any other way our management component was prepared to boot.

The solution was to use netboot that would pre-partition the disk and use the first N partitions to store the management component as well as the boot, ESP / bios_grub partition etc. It had to be incorporated into the existing solution that encompasses partitioning and mounting all the resources available to a VM, including managing RAIDs, LVM, DM and so on.

The developers implemented it as a GPT partition name with a pre-defined value that would instruct our code to ignore the partitions found prior to the "special" partition and allow the user to carry on as usual, pretending that the first fraction of the disk simply didn't exist (used by netboot + the management component).

This solved the immediate problem for the user who wanted this ability, but created thousands of problems for QA: what happens if there's a RAID that uses the "hidden" partitions? What happens if the user accidentally creates a second /boot partition? What happens if the user wants whole-disk encryption? And so on. It would've been so much better if these questions didn't exist in the first place, than to try to answer them, given the "simple" solution the developers came up with.

If you have programmed for even just a year, I'm sure you've been in this situation at least a few times already. This is exceedingly common.

* * *

There's enormous value in being able to restrict the possible ways a program can run. Most GUI projects? -- They don't need infinite loops! Allowing them just makes programs unnecessarily hard to verify. And it's "easy" to offer a single loop element in the language that can be made unbounded if genuinely necessary. Configuration languages exclude whole classes of errors simply by making them impossible to express.

However, I have to agree that, specifically, YAML is a piss-poor configuration language. It has way too many problems that overshadow the benefits it offers. We, collectively, decided to use it because everyone else decided to use it, making it popular... and languages are "natural monopolies". So, one could certainly do better ditching YAML, if they can afford to go unpopular. But ditching the idea of a configuration language is throwing the baby out with the bathwater.


> ...sneaker company that pivoted to data centers set the 'weird' bar pretty high...

"Weird" is the wrong word for Allbirds. "Fraud" is far more fitting. They obviously have no intention of running an AI-datacenter business and are doing it for the stock-price rush. A small number of people will be laughing all the way to the bank, and everyone will forget Allbirds in short order.

Ebay has a history of being legit, though they have had a long list of uncanny acquisitions themselves (including Skype, which they later sold for a stiff loss). It's a pity they couldn't just execute on their core business and are now being acquired themselves by an entity using sketchy financial shenanigans.

Who's going to stop a few rich people with a pile of money and a stated intent of doing something they have no intention of doing? No one, I guess. I mean, there are plenty of examples. Supermicro is still listed on NASDAQ even though one of their founders was caught smuggling export-controlled GPUs in Supermicro servers, to the tune of 2.5 billion dollars, a couple of months ago.


eBay currently allows (or at least tolerates) sales of items that are not in the seller's possession and are effectively lottery tickets. Lotteries are illegal in my state (NV), but eBay does not restrict me from bidding. That's low-hanging fruit right there.

I think USB-C is certainly a step in the right direction.

The remaining problem is the lack of CLEAR, easy-to-understand markings on the cable that indicate whether it's intended as a power-delivery cable, a 10Gbps data cable, a Thunderbolt-capable cable, or any of many combinations in between. This should not be limited to physical markings on the cable itself but should also take the form of electronic self-identification, so that you could plug in a cable and have the OS tell you exactly what cable you plugged in. Why not? We already have power-delivery protocols; cable self-ID would be a trivial addition.

I suspect the vendors of these, and perhaps the designers of the spec too, have deliberately made this confusion an integral part of the standard. It creates churn and consumers buying more cables than they need.


Apparently, you can use it in RPN mode!

RIP. He was an amazing human. I worked for a time at JCVI when it was in Rockville, shortly after he had left Celera Genomics. He led a team that did something which was considered intractably difficult-- sequencing whole genomes. Then he did it again with global ocean sampling and synthetic genomics and other things. That is not to say that "he did it single-handedly", Venter was a hybrid of scientific and organizational talent that was able to make this stuff happen by coordinating stuff that's super hard to coordinate.

It's "complex enough" to be notable for its complexity and thus a good example for considering the character and economics of complex machinery.

It's kind of pointless to fret about whether it's "the most complex" like there's an objective 1-dimensional ranking that even has utility.


When practitioners say "PCR" they don't (usually) just mean amplifying DNA for use as part of the input to another process.

What they usually mean is PCR with chemistry that selectively amplifies some specific sequence of DNA. This chemistry has dyes in it which fluoresce when illuminated at some specific wavelength. The point of all this is to answer a "yes/no" question for the presence of some DNA sequence in the sample. This is done at scale with multiple chemistries looking for different DNA sequences. This is also known as "real-time PCR".

It's sort of like the biological-assay version of the kid's game "20-questions". If you do it right, it's an enormously powerful detection technique for medical purposes. It gives you your "answer" in a reasonable amount of time on your desk while you wait.

That said, there are biological assays that don't need the thermocycling anymore. These newer assays use more sophisticated chemistry that amplifies at a constant temperature. In the simplest terms, the instruments are just heaters combined with a fluorometer. It's potentially MUCH faster than real-time PCR.

In any case, the only really serious money-making business for these instruments is in-vitro diagnostics. That requires FDA approval, and that means a ~$10K minimum for the instrument, tens of dollars for the consumables containing the assays, and definitely a pricey service agreement for the instrument (eg Bio-Rad instruments).

A distant second money-making business would be research-use-only instruments, but these are not going to be inexpensive little devices.


> When practitioners say "PCR" they don't (usually) just mean amplifying DNA for use as part of the input to another process.

I definitely do.

> What they usually mean is PCR with chemistry that selectively amplifies some specific sequence of DNA. This chemistry has dyes in it which fluoresce when illuminated at some specific wavelength

This is called qPCR (and qRT-PCR, and RT-PCR, and "Taqman assay"... but it's not called PCR, because it's not just PCR). It has uses outside of diagnostics (which is what it seems you're most familiar with).

Either way, the article is not about qPCR.


In normal biology labs, real-time PCR is used much less than normal PCR; I'd guess 5% of PCRs across labs are run on real-time machines.

This is well said and a good illustration of why optimality is a fragile concept. High-impact improvements often involve reframing the goal.

That font, and how it's integrated with the math, looks amazing. KaTeX for the math?


Seems like KaTeX, from the scripts getting loaded. I love the design too, kinda medieval-chic.


Looks more early modern to me. :)


> I'm not saying that readability can't be a consideration when making documentation. I am saying that if you discard accuracy in the process, you've fucked up quite badly.

You're right to elevate accuracy to a high level of importance, but that is NOT ENOUGH if the thing has poor readability. The audience has to be able to understand the document for it to be usable.

There's only a certain amount of effort anyone can put into producing a document. But if the author can't deliver readability, they need to follow up the document with a lot of support and/or get some help to make it usable.


I've struggled through some absolutely awful documentation over the years. I'll put up with incredibly broken English and other problems as long as the accuracy is there. Just last week I encountered a pinout diagram that used emojis to indicate which pins related to which data channel. Not a choice I would have made, and I found it made the diagram harder to read. But it was accurate - I wired it up per the diagram and everything worked as intended.

Documentation lacking accuracy is useless. It can be the most readable thing ever produced, but if it describes a different thing than what was intended to be documented, it's trash. Documentation that is hard to read but is accurate still has value.

Regarding "follow up the document with a lot of support" - did you catch the part of the anecdote where the author is having to deal with support requests because of the inaccuracies?


  > The documentation was complete, correct, and relatively terse. Less than a page.
No, that's YOUR IMPRESSION of your own writing.

There are many reasons why others might not find what you wrote sufficient to understand it. Your boss ran it through AI for a reason, and that reason was most likely that the document was hard to understand or confusing.

Did the document have usage examples? Did it explain context and background? Did it use "precise" jargon that not everyone knows? Did you follow up the documentation writing with a meeting with stakeholders/users to see if they had questions?

It sounds like you just "threw it over the wall", like you were done with it, and left your boss to figure out how to get others to use it. If you find yourself in a "near constant" struggle to communicate, there is a strong possibility that the problem is yours and not everyone else's.


How can you be so critical of a stranger's work given that you haven't even seen it?

"that reason was most likely because" -> Bear in mind you do not actually know the given situation.


None of us knows the exact situation, but the fact that the person said his documentation was "complete, correct, and relatively terse" is a red flag. It seems to me like smug overconfidence.

If the document really was so clear and error-free, then why would the boss try to "fix it"?


Assuming you're correct that the commenter is unaware of their communication deficiencies, just as much of your confident criticism should be directed at a manager who would silently change a spec sheet for some reason rather than coach the employee on why that was needed.

If it was truly a manager, where the main role of their job is to manage the performance of their employees, then they failed here.


The boss also tried to fix it in the lowest effort manner possible, without even checking the results.


People try to fix things that are perfectly fine all the time.

People often apply nonsensical standards to things.


Who knows why? That's my point: it's not us.


You are making a bunch of claims about a situation you know nothing about.


GP made a claim about the precision of his language that is incompatible with natural language.

This is already known: GP is wildly overconfident in their communication skills.


The OP just told us all what it was about. You don't know any more or less than I do.

I simply am skeptical of their smug take on it.


> There are many reasons why others might not find what you wrote sufficient to understand it. Your boss ran it through AI for a reason, and that reason was most likely that the document was hard to understand or confusing.

It could also be because their manager is less technical. It's not unusual in my life for a PM to try to "rephrase" or restate things I've written in order to make them "easier to understand" in a way that in fact falsifies them or makes them more difficult to understand for the people who will actually have to work on/with it.


PM: "X party needs to know about Y thing"

"Tell them [very specific answer targeted at X party]"

PM: "They are still asking about Y, see their response with the follow up question"

Then, in the original send of [specific thing], the PM has transformed it into [something else]. X party has followed up with a question that was already answered by [specific thing]. Yes, PM, you might have been confused, but you weren't the target.

This cycle happens very often.

