Hacker News | coalteddy's comments

Do I understand correctly that this allows models to generate two contradicting tokens (Contemplating Stream: y op z = 3, Thinking Stream: y op z = 5) in separate streams at any given point? What would happen in such cases? Sounds like an interesting problem or quirk of this architecture.


Yep, in general, the interesting question is how you merge multiple streams that have been worked on in parallel without imposing a particular order or prioritizing one over another. (We do know how to do this in principle, that's exactly what a vector sum of encoded representations does. But it's not clear how to train the model so that it can recognize the outcome of that vector sum operation as meaningful.) Working on multiple streams at the same time is just what subagents do naturally, so the merge is the interesting part.
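A minimal sketch of the order-independence point, assuming each stream has already been encoded as a fixed-length vector. The toy vectors and the `merge_streams` name are made up for illustration and are not taken from the architecture under discussion:

```python
from itertools import permutations

def merge_streams(streams):
    """Merge stream encodings by element-wise sum (hypothetical helper)."""
    return [sum(components) for components in zip(*streams)]

# Toy encodings for three parallel streams (illustrative values only).
streams = [[1, 0, 2], [0, 3, 1], [4, 1, 0]]

# Collect the merged vector for every possible ordering of the streams.
merged = {tuple(merge_streams(list(p))) for p in permutations(streams)}
```

Because addition is commutative, `merged` contains a single vector no matter how the streams are ordered. The sketch says nothing about the hard part raised above, which is training the model to treat that sum as meaningful.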


Does anyone know how they make their map so performant? Showing all those pins is mind-blowing to me, coming from Leaflet maps. MarineTraffic is another map that blows me away every time I see all the icons and how smooth and fast the loading is when zooming in. Would love to make a similar map at some point for my hobby, but Leaflet just does not cut it when you want to render 10 million plus pins on a global map.

Tech blogs or pointers would be great.


Points are rendered server-side, backed by Elasticsearch, and served as PNG tiles for each zoom level. Individual markers are only rendered for small sets. Some of the relevant source code:

https://github.com/inaturalist/inaturalist/blob/main/app/ass...

https://github.com/inaturalist/inaturalist/blob/main/app/ass...

https://github.com/inaturalist/inaturalist/blob/main/app/ass...


Did not realize that they publish their code. Very cool. Thanks!


Looked at their source code out of curiosity. They use Elasticsearch as a geo backend, plus a tile server that renders PNG images server-side for each map tile request.

- low zoom -> server-aggregated grid

- high zoom -> switches to point tiles

The browser just displays images; there is no work to do on the client side.

I generated a visual schematic of how these kinds of systems work: https://vectree.io/c/server-side-geo-tile-rendering-elastics...
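A rough sketch of the tile logic described above, not iNaturalist's actual code. `lonlat_to_tile` is the standard Web Mercator XYZ tile formula. The `GRID_MAX_ZOOM` cutoff, `tile_mode`, and `grid_counts` are assumptions for illustration, and a real server would aggregate into much finer cells inside Elasticsearch rather than in Python:

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Standard Web Mercator (XYZ / slippy map) tile indices for a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lon=180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

GRID_MAX_ZOOM = 8  # assumed cutoff; the real threshold is not stated in the thread

def tile_mode(zoom):
    """Low zoom -> aggregated grid, high zoom -> individual point tiles."""
    return "grid" if zoom <= GRID_MAX_ZOOM else "points"

def grid_counts(points, zoom):
    """Count (lon, lat) points per tile cell, a stand-in for server-side aggregation."""
    counts = {}
    for lon, lat in points:
        key = lonlat_to_tile(lon, lat, zoom)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The idea is that the cost of drawing a tile depends on the number of cells or points in that tile, not on the total number of observations, and the browser only ever receives an image.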


You may want to look into the PMTiles format and tippecanoe. It efficiently produces pyramidal XYZ tile overviews of vector data. Sometimes this is also done server-side via the PostGIS ST_AsMVT function, or Martin.

For client-side rendering, deck.gl is quite good, as is a newer library called lonboard from Development Seed.


Thanks for those pointers! Very helpful.


Very cool. Love this. Was the training more heavily weighted towards Swiss languages, and how does the model perform on Swiss languages compared to others?

Are there any plans for further models after this one?


The pretraining (so 99% of training) is fully global, in over 1000 languages without special weighting. The post-training (see Section 4 of the paper) also covered as many languages as we could get, and did upweight some languages. The post-training can easily be customized to any other target languages.


I have a friend who works on physically based renderers in the film industry and has also done research in the area. Always love hearing stories and explanations about how things get done in this industry.

What companies are hiring such talent at the moment? Have the AI companies also been hiring rendering engineers for creating training environments?

If you are looking to hire a rendering engineer with research and industry experience, I am happy to connect you, since my friend is not on social media but has been putting out feelers.


Have him ping me. Username at Gmail.


Thanks a lot for this eval!

One question I have regarding evals is: what sampling temperature and/or method do you use? As far as I understand, temperature and sampling method can impact model output a lot. Would love to hear your thoughts on how these different settings of the same model can impact output, and how to go about evaluating models when it's not clear how to use them to their fullest.


Generally, we'll use the API provider's defaults.

For models we run ourselves from the weights, at the moment we'd use vLLM's defaults, but this may warrant more thought and adjustment. Other things being equal, I prefer to use an AI lab's API, with settings as vanilla as possible, so that we essentially defer to them on these judgments. For example, this is why we ran this Mistral model from Mistral's API instead of from the weights.

I believe the `temperature` parameter, for example, has different implementations across architectures/models, so it's not as simple as picking a single temperature number for all models.

However, I'm curious if you have further thoughts on how we should approach this.

By the way, in the log viewer UI, for any model call, you can click on the "API" button to see the payloads that were sent. In this case, you can see that we do not send any values to Mistral for `top_p`, `temperature`, etc.


Any blogs or other writing about this topic you can recommend? I worked with Gurobi in the past but haven't been keeping up with the trends and performance gains.

Love this field of CS!


How do I get access to this feature? I cannot find it in the normal ChatGPT interface.


It's a staged rollout. You'll probably have it by tomorrow morning.


I believe you wait until your number comes up :/


It's under the model list on the web interface.


Wow, this is the first time I've heard of such a method. Anywhere I can read up on how the temperature multiplier works and what the implications/effects are? Is it just changing the temperature based on how many tokens have already been processed (i.e. the temperature is variable over the course of a completion spanning many tokens)?


Just a fixed multiplier (say, 0.5) that makes you use half of the range. As I said I'm just speculating. But Sonnet 3.5's temperature definitely feels like it doesn't affect much. The model is overfit and that could be the cause.
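To make the speculation concrete, here is a small sketch of standard temperature-scaled softmax with a fixed multiplier applied to the user-supplied temperature. The `MULTIPLIER` value and the `effective_temperature` helper are hypothetical; nothing here reflects how any provider actually implements `temperature`:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

MULTIPLIER = 0.5  # speculative fixed multiplier, as guessed in the comment above

def effective_temperature(user_temperature, multiplier=MULTIPLIER):
    """Hypothetical: the API silently scales the requested temperature."""
    return user_temperature * multiplier

logits = [2.0, 1.0, 0.0]  # toy next-token scores
plain = softmax_with_temperature(logits, 1.0)
damped = softmax_with_temperature(logits, effective_temperature(1.0))
```

With a multiplier below 1, `damped` puts more probability on the top token than `plain` does for the same requested temperature, which would look from the outside like a temperature setting that "doesn't affect much".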


This is not true. I have no idea about the US but Canada, specifically BC, still has large amounts of old growth forest that is being cut. Really sad to see and read about.


US cities love their cars. Not even in city centers do they prioritize pedestrians over cars. That has nothing to do with apples or oranges. It's a matter of priorities, not costs. There is no reason to need cars in city centers. They make cities ugly, loud and dangerous compared to Europe or Asia.

