More

gpjt · 2025-03-05T23:28:24 1741217304

OP here: my posts are kind of reading notes for the book, so I don't normally copy the code from there -- so you wouldn't have seen the packages used. So far there's been tiktoken for tokenization (Raschka shows how to write a simple tokenizer and explains the workings of the byte-pair tokenization that he recommends, though) and PyTorch for CUDA-acceleratable matrix maths, automated differentiation for gradient descent, and so on.

gpjt · 2025-03-05T23:22:17 1741216937

I do see your point!

But at the end of the day, it depends on where you want to spend your time. "Build an LLM from scratch" is over 300 pages -- and they are very dense pages. My blog post covers fewer than 10 of them (though TBF they are the hardest pages). Adding on tokenizers in depth from scratch would add on 100 or so more. Adding on efficient-enough matrix multiplication to do anything would add on a few hundred more, and doing it in CUDA would probably be a couple of thousand. Now add on automated differentiation to work out the gradients for training -- a few thousand more? Optimizers for the training -- even more than that, perhaps.

You have to draw the line somewhere, as otherwise (as you suggest) the "from scratch" book has to start "go out and get some really clean sand" so that you can start fabbing your own chips. I think that tiktoken and PyTorch are a solid choice for that line, as it means that the book is manageable in size and gives you enough of an overview of the underlying stuff to be able to work out what you want to dig into next.

gpjt · 2025-03-05T23:08:52 1741216132

OP here: that is an excellent point! Of the eight read-throughs, four were on Friday night, then I had dreams involving some kind of TRON-like vector spaces, and at brunch on Saturday things seemed to start gelling. The four extra read-throughs were to crystallise that intuition to a level that I felt I could start writing it up. I'm 100% sure the sleep was what built some kind of intuition at a pre-linguistic stage that I could build on.

gpjt · 2025-02-25T02:44:51 1740451491

Proofreading is one place that AI can actually be a friend rather than a foe. If you give Claude your draft and tell it explicitly to call out misspellings and grammatical errors only, it does a really good job.

zabzonk · 2025-02-25T02:47:37 1740451657

Much like (say) MS Word does? In real time.

dc96 · 2025-02-25T18:11:32 1740507092

AI is a lot more powerful in this aspect.

MS Word would find no qualms in this sentence:

"I started the car and went for a drive on the highway. There were many other cats on the road but it was nevertheless agitating."

Given the correct prompt (that avoids changing your literary style altogether), AI can quickly suggest cats -> cars and agitating -> peaceful, since it's much better at contextualizing.

arvinsim · 2025-02-25T06:22:38 1740464558

Proofreading is easily done with an editor. I think AI is much more useful for giving critic and advice on how you write your sentences. Setting the tone, refining the main idea, pointing out redundancies are some of the things that I find very useful.

gpjt · 2025-02-25T22:06:08 1740521168

Sometimes, but you have to be careful. IME Claude and (to my surprise) Grok 3 are really good at understanding your style and adapting their suggestions to match. ChatGPT, by contrast, tries to change everything into some kind of corporate drone.

arvinsim · 2025-02-26T03:37:04 1740541024

Another thing is to know when to stop. AI can endlessly refine your sentences if you let it.

gpjt · 2025-02-25T02:42:17 1740451337

Is your job super-safe? If so, that's awesome :-) The whole marketing thing only becomes important if you have to get a new one, and then it can become important very quickly.

0xEF · 2025-02-25T09:34:29 1740476069

That's where it turned for me. Originally, I had started a small tech-topic blog with the idea that it would be my portfolio because I really wanted to write for a tech publication, most because I thought I had the chops for it and I want a job where I can travel and work.

Things started off okay, me writing about my projects, etc on a small self-hosted site with zero analytics, keeping things small and manageable in my free time. But the lack of feedback sort of left me I limbo. Was I writing in an engaging way? Were my subjects interesting to more than just me? I had no idea. Eventually, that iteration of the blog got deleted.

And I made another. And another. And so on.

Til I landed on the current version, which is basically me just faffing about with a editorials about tech for fun since I have little time for actual projects anymore, let alone the accompanying writeup.

I still want that writing job, but I realize how much of a pipe dream it is, now. Tech bloggers were already a dime a dozen before I showed up and genAI only saturated that market even further. That, and I still have no interest in working for or hosting a site that is hostile to my reader by being a bloated sludge of scripts and sloppy use of frameworks, which limits my market for a writing career in disappointing and obvious ways.

When I see discussions like this pop up about writing online in today's landscape, it seems to always come down to "write what you find interesting or fun, but keep your eyes expectations near zero" which seems so self-defeating considering how much work it often takes to maintain a blog while you also have to tend to real life. As much as I loath places like Medium or Substack for asking for money up front, I do understand why those writers choose to go there instead of walking my lonely path.

gpjt · 2025-02-25T01:27:09 1740446829

Me too! These last two posts blogging about blogging are unusual for me. I'm working through a book (Sebastian Raschka's "Build an LLM from scratch") and posting about that at the moment. It's likely not a coincidence that I'm procrastination-posting before going through the trickiest bit...

rednafi · 2025-02-25T02:30:11 1740450611

I love reading meta-writings at times, as long as there’s a real human behind the keyboard. This was a fun, quick read.

gpjt · 2025-02-25T01:24:47 1740446687

The jokey last paragraph was probably based on a half-memory of having read that, now you mention it...

gpjt · 2025-02-25T01:23:43 1740446623

Good point. "Blog as much as you can, as otherwise the training set will be Twitter!"

gpjt · 2025-02-25T01:22:40 1740446560

Thanks! And that's a good counter-example... which made me realise, there's also https://waitbutwhy.com/

gpjt · 2025-02-25T01:20:37 1740446437

That's what I was trying to cover with the "make your newly-acquired knowledge concrete" bit, and was my focus in the previous post. This time around I wanted to look into the aspects that might be impacted by AI (and why I didn't think they would be).