I know a C code base that treats all socket errors the same and just retries for a limited time. However, some errors make no sense to retry, such as an invalid socket or a socket that is not connected. It is necessary to know which socket error occurred. I like how the POSIX API defines errno and documents its values. Of course, this depends on accurate documentation.
This is an IDE/documentation problem in a lot of cases though. No one writes code badly intentionally, but we are time-constrained - tracking down every type of error that can happen and what it means is time-consuming, and you're likely to get it wrong.
Whereas going with "I probably want to retry a few times" is betting that most of your problems are the common case, while not being entirely sure the platform you're on will emit the non-common cases with sane semantics.
I think a file layout describes the exact arrangement of bytes in a file. A schema is higher level. It describes what is stored, not how it is stored. A database could be one file, or a file per table, or a file per column. Data could be stored across multiple drives.
Does it make any sense to have specialized models, which could possibly be a lot smaller? Say a model that just translates between English and Spanish, or maybe a model that just understands Unix utilities and bash. I don’t know if limiting the training content affects the ultimate output quality or model size.
> NVIDIA researchers customized LLaMA by training it on 24 billion tokens derived from internal documents, code, and other textual data related to chip design. This advanced “pretraining” tuned the model to understand the nuances of hardware engineering. The team then “fine-tuned” ChipNeMo on over 1,000 real-world examples of potential assistance applications collected from NVIDIA’s designers.
> Our results show that these ___domain adaptation techniques enable significant LLM performance improvements over general-purpose base models across the three evaluated applications, enabling up to 5x model size reduction with similar or better performance on a range of design tasks.
> Domain-adaptive pretraining (DAPT) of large language models (LLMs) is an important step towards building ___domain-specific models. These models demonstrate greater capabilities in ___domain-specific tasks compared to their off-the-shelf open or commercial counterparts.
I actually prefer text as #fff and background as #000. Lower contrast just seems harder to read. I suppose it depends on brightness setting.
Also from an article in Nature.
“Impact of text contrast polarity on the retinal activity in myopes and emmetropes using modified pattern ERG” 09 July 2023.
“Recently, reading standard black-on-white text was found to activate the retinal OFF pathway and induce choroidal thinning, which is associated with myopia onset. Contrarily, reading white-on-black text led to thicker choroids, being protective against myopia. Respective effects on retinal processing are yet unknown.”
In CA we have many intersections where one wants to turn left but there is no dedicated left-turn signal. When the light turns green, you pull out into the intersection. Ideally you pull out far enough that the car behind you can also get into the intersection. On busy roads you may not be able to complete the left turn until the signal goes red. If nobody pulled out, nobody would ever be able to turn left. I believe CA passed a law some 20-plus years ago that you must be able to clear the intersection before the light turns red, which conflicts with what is sometimes necessary. There are situations, though, where the direction you are headed is backed up, such that if you pulled out you could end up stuck in the middle of the intersection long after the red light. I believe the law was intended for that situation. So don’t pull out if your direction of travel is blocked.