otaviogood's comments

My team and I used these yellow tracking dots to reconstruct shredded documents for the DARPA Shredder Challenge over a decade ago. You can see our program highlight the dots as we reconstruct the shredded docs. https://www.youtube.com/watch?v=uzZDhyrjdVo Thanks to that, we were able to win by a large margin. :)


Oh wow, I remember hearing about this challenge on Daily Planet when I was still in elementary school. It's super cool seeing a follow-up; it brought back a hidden memory.

Super cool demo btw


and here I thought the library shredder/scanner in Vinge's Rainbows End was just sci-fi loosely based on gene sequencing...

(I mean it is, but seeing this almost real-world implementation is fun!)


What was the process of getting each of the shredded pieces scanned for your program to use? I'm guessing that process could merit a write-up just as much as the solver. There's definitely a personality type that can handle that type of mess.


DARPA scanned the shreds. The funny thing is, they didn't want to shred the original paper, so first they photocopied the paper in a high quality color copier, shredded it, and scanned it. And that's where the little yellow dots came from. :D


Interesting. Now my brain is churning on why they would not want the originals shredded. What does that say about the value they placed on the originals? Why would they open up a contest with documents of such perceived value as the content? Being DARPA, I'm sure there's a reason though.


You might be reading into it too much. I think the originals were just random pieces of different kinds of paper. Graph paper, yellow lined paper, blank white paper... I don't remember exactly, but I think the copies could be special paper with a colored backside so they would know which way was up really easily for the scanning process.


No, it was a deliberate "what if" meant in jest that probably could have been kept to myself.


Did they scan both sides of the shreds?


This is much more interesting if you see the animations. https://x.com/jaschasd/status/1756930242965606582


Fractal zoom videos are worth infinite words.


> infinite

I see you


So what exactly are we looking at here? Did the authors only use two hyperparameters for the purpose of this visualization?


It's explained in the post:

> Have you ever done a dense grid search over neural network hyperparameters? Like a really dense grid search? It looks like this (!!). Blueish colors correspond to hyperparameters for which training converges, redish colors to hyperparameters for which training diverges.


I saw this, but am still not clear on what the axes represent. I assume two hyperparameters, or possibly two orthogonal principal components. I guess my point is it’s not clear how/which parameters are mapped onto the image.


Your point is valid, but the paper explains it clearly: they are NOT dimensionally reduced hyperparameters. The hyperparameters are learning rates, that's it. X axis: learning rate for the input layer (there is 1 hidden layer). Y axis: learning rate for the output layer.

So what this is saying is that for certain ill-chosen learning rates, model convergence is, for lack of a better word, chaotic and unstable.


Just to add to this, only the two learning rates are changed, everything else including initialization and data is fixed. From the paper:

> Training consists of 500 (sometimes 1000) iterations of full batch steepest gradient descent. Training is performed for a 2d grid of η0 and η1 hyperparameter values, with all other hyperparameters held fixed (including network initialization and training data).
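
To make that concrete, here's a rough toy sketch of such a sweep (my own illustration, not the paper's code): a 1-hidden-layer network trained with full-batch gradient descent for every pair of per-layer learning rates on a grid, recording which pairs diverge.

  # Toy sketch (not the paper's code): sweep two per-layer learning rates for a
  # 1-hidden-layer network under full-batch gradient descent and record which
  # (eta0, eta1) pairs diverge. Data and initialization stay fixed across the grid.
  import numpy as np
  np.seterr(over="ignore", invalid="ignore")     # divergence makes values overflow
  rng = np.random.default_rng(0)
  X = rng.standard_normal((16, 8))               # fixed training inputs
  Y = rng.standard_normal((16, 1))               # fixed training targets
  W0_init = 0.3 * rng.standard_normal((8, 16))   # input -> hidden weights
  W1_init = 0.3 * rng.standard_normal((16, 1))   # hidden -> output weights
  def diverges(eta0, eta1, steps=500):
      W0, W1 = W0_init.copy(), W1_init.copy()
      for _ in range(steps):
          H = np.tanh(X @ W0)                    # hidden activations
          err = H @ W1 - Y                       # output residual
          gW1 = H.T @ err                        # gradient w.r.t. output weights
          gW0 = X.T @ ((err @ W1.T) * (1 - H**2))  # gradient w.r.t. input weights
          W0 -= eta0 * gW0                       # each layer gets its own rate
          W1 -= eta1 * gW1
          if not (np.isfinite(W0).all() and np.isfinite(W1).all()):
              return True
      return float(np.mean(err**2)) > 1e3        # exploding loss counts as divergence
  etas = np.logspace(-3, 0, 32)                  # small grid to keep the sketch quick
  grid = np.array([[diverges(e0, e1) for e0 in etas] for e1 in etas])
  print(grid.astype(int))                        # 1 = diverged, 0 = converged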


‘1984 is 1 token. 1884 is 2 tokens.’

I would be surprised if they still use this tokenization, as it's not math friendly.


They do use this tokenization, and that's why these models sometimes struggle with tasks like "how many twos does this long number contain" and questions like "is 50100 greater than 50200": the model ends up comparing "501"/"00" with "50"/"200" while knowing that "501" is greater than "50".

The models aren't optimized to be math friendly. They could be, but the major big generic ones weren't.
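
If you want to check how a given tokenizer splits numbers yourself, one quick way (assuming OpenAI's tiktoken package; other models' tokenizers will differ) is:

  # Quick check of how a BPE tokenizer splits numbers. Assumes the `tiktoken`
  # package is installed; the exact splits depend on the model's vocabulary.
  import tiktoken
  enc = tiktoken.get_encoding("cl100k_base")
  for s in ["1984", "1884", "50100", "50200"]:
      ids = enc.encode(s)
      pieces = [enc.decode([i]) for i in ids]
      print(s, "->", len(ids), "token(s):", pieces)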


It's a language model, not a mathematical model


Where do you see the room for improvement that gets you to much faster speeds? Also I love the single file to do everything. :)


Hey hey heya! Sorry for the long wait! I've wanted to reply to every comment thoughtfully, and your comment on the single file bit was the first feedback I've heard about it. I've sort of fretted about that decision (though it's paid off well for me), and I wanted to say I can't express how much anxiety that one small compliment has relieved for me. Not that everyone will like it, but that it's indeed a good idea.

As far as speed goes, I've filled out a few good comments about a "what's next" here with Bram in my comment to them. There's also a corresponding Reddit thread that I've been frantically corresponding on, but most of that is more historical-and-hyperparameter focused! (https://www.reddit.com/r/MachineLearning/comments/10op6va/r_...)


This article talks about a pretty full-featured ray tracer. If you just want to play around with some ideas and have fun, you can make different tradeoffs and still get very nice and fast renderings. Here's a ray tracer that I made that's GPU-accelerated and can make some nice looking images quickly in 1000 lines of code. https://www.shadertoy.com/view/4ddcRn The tradeoff is that it's all procedural graphics. So there are no triangle meshes and it would also be a bit tricky to implement a bidirectional ray tracer like this.


Whoaaa. If you look from the bottom up through the glass floor, it looks crazy awesome.

I think that's the most impressive little demo I've seen in ... quite some time. I can think of more impressive ones, but they're all extraordinarily complicated in comparison.

How'd you do the glass if there are no meshes? I'm about to pass out, else I'd dig in.

I'm especially curious how you're getting the refraction right. I forget my basic physics, but is it as simple as deflecting the ray slightly? But then how does it work on a curved surface? I guess it happens per pixel, so it shouldn't be too surprising, but... still, I was expecting a lot more aliasing than this: https://i.imgur.com/fdDVcCT.jpg

I guess I've never seen a real time raytracer that happened to trace through ripply glass, which is why it feels so neat.


The sibling comment explains it too, but here's a simple description: The object still has a geometric definition, it just isn't a mesh. It has equations describing the curves. Yes, per pixel, using the curve geometry, you calculate where the ray intersects that curve, and at what angle to figure out the deflection from there.

Think of a simplification in 2D. Sunlight comes straight down from above onto a sine wave. For every x-coordinate where you cast a sun ray, you just need the slope of the sine wave at that x-coordinate to calculate the angle for what will happen at the intersection.

A mesh isn't any fundamental unit of physics or geometry, it's just our customary way of approaching the rendering to get good-enough fast-enough results from our current hardware.
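
As a toy version of that 2D picture (my own sketch, not the shader's code): take a ray coming straight down onto y = sin(x), build the normal from the slope at the hit point, and apply Snell's law in vector form.

  # Toy 2D illustration: a vertical ray hits the curve y = sin(x) and is
  # refracted using the surface normal derived from the slope at that point.
  import math
  def refract_at_sine(x, n1=1.0, n2=1.5):
      slope = math.cos(x)                       # tangent of y = sin(x) is (1, cos x)
      nx, ny = -slope, 1.0                      # unnormalized upward normal
      norm = math.hypot(nx, ny)
      nx, ny = nx / norm, ny / norm
      dx, dy = 0.0, -1.0                        # incoming ray: straight down
      cos_i = -(dx * nx + dy * ny)              # Snell's law, vector form
      eta = n1 / n2
      k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
      if k < 0.0:
          return None                           # total internal reflection
      tx = eta * dx + (eta * cos_i - math.sqrt(k)) * nx
      ty = eta * dy + (eta * cos_i - math.sqrt(k)) * ny
      return tx, ty                             # refracted direction (unit length)
  print(refract_at_sine(0.7))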


Not the person you're replying to, but in general refraction is pretty easy to do in a ray tracer. When the ray hits the object, you use the normal vector, the incident ray direction, and the index of refraction of the material (or materials, if you're transitioning between substances with different indices of refraction) to calculate the new ray direction and continue from there. Usually there's some amount of reflection too (the amount varying based on the angle of the incoming ray with respect to the surface), so you might spawn two rays: one continuing through the object, and the other bouncing off.

What's much harder to do is simulating the light that's refracted or reflected off of objects (called "caustics"), like the light on the bottom of a swimming pool. To do that in a physically correct way generally requires falling back on some kind of global illumination technique like path tracing or photon mapping.
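
Here's a minimal sketch of that two-ray split at a hit point, using the standard reflection formula and Schlick's approximation for the reflected fraction (an illustration, not any particular renderer's code):

  # Sketch of the "spawn two rays" split at a hit point: reflected direction
  # plus Schlick's approximation for how much energy goes to each ray.
  import numpy as np
  def reflect(d, n):
      # d: unit incident direction, n: unit surface normal facing the ray.
      return d - 2.0 * np.dot(d, n) * n
  def schlick_reflectance(cos_i, n1, n2):
      r0 = ((n1 - n2) / (n1 + n2)) ** 2
      return r0 + (1.0 - r0) * (1.0 - cos_i) ** 5
  d = np.array([0.0, -1.0, 0.0])          # incoming ray, straight down
  n = np.array([0.0, 1.0, 0.0])           # surface normal, straight up
  cos_i = -np.dot(d, n)
  fr = schlick_reflectance(cos_i, 1.0, 1.5)
  print("reflected dir:", reflect(d, n))
  print("fraction reflected:", fr, "fraction refracted:", 1.0 - fr)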


The geometry in my raytracer can either be ray-traced primitives, like spheres or boxes, or it can be "signed distance functions" (SDFs), which let you define all kinds of crazy shapes. Inigo Quilez does a good job of explaining SDFs here: https://www.iquilezles.org/www/articles/distfunctions/distfu... The refraction math in my code is around line 799 or 809 depending on what you're looking for. There are a few errors in the refraction code in this version. :/ But caustics are handled well by my renderer.
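
For anyone curious what the SDF approach looks like in miniature, here's a tiny sphere-tracing sketch in Python (a generic illustration of the technique, not code from that shader):

  # Tiny sphere-tracing sketch: march a ray along a signed distance function
  # (here, a unit sphere at the origin) until we hit the surface or give up.
  import math
  def sdf_sphere(p, radius=1.0):
      return math.sqrt(p[0]**2 + p[1]**2 + p[2]**2) - radius
  def ray_march(origin, direction, max_steps=128, eps=1e-4, max_dist=100.0):
      t = 0.0
      for _ in range(max_steps):
          p = [origin[i] + t * direction[i] for i in range(3)]
          d = sdf_sphere(p)                 # distance to the nearest surface
          if d < eps:
              return t                      # hit: distance along the ray
          t += d                            # it's safe to step this far
          if t > max_dist:
              break
      return None                           # miss
  print(ray_march([0.0, 0.0, -3.0], [0.0, 0.0, 1.0]))   # expect about 2.0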


If you think that shader is impressive this one will blow your mind: https://www.shadertoy.com/view/3lsSzf


Pity I can only vote once, this is amazing work, thanks for sharing.


I thought it would be nice to listen to these with my podcast player on iPhone. Here's my too-complicated process that worked:

- in Chrome inspector, look at the network tab when you click play on a Feynman lecture. Right click the mp4 and do "copy as cURL".

- Go to command line (unix style) and paste. Then append to that command line something like "--output flp1.mp4". That will download the file locally with that file name.

- Put the file on Dropbox or something that will get it to your phone.

- From Dropbox on iPhone, share and export the file, then choose your podcast app. The podcast app that worked for me is "Pocket Casts".

- Now in Pocket Casts -> Profile -> Files, you should be able to play the mp4s with nice podcast-style controls and learn physics and be happy!


There's a wget one-liner here to download everything: https://news.ycombinator.com/item?id=27323235


Thanks.

Required 'brew install wget' first for me. ;-)


Nice, I tried doing the same, but by writing the curl command myself. It failed; it seemed to require ~the cookies~ because I was getting 403s. Thanks for the better idea!

Edit: Looks like all it needed was the Referer header.
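
For reference, a minimal Python sketch of that fix (the URL and referer below are placeholders, and it assumes the requests package): just send a Referer header with the download.

  # Minimal sketch of the 403 fix: send a Referer header with the download.
  # The URL and referer below are placeholders, not the real lecture links.
  import requests
  url = "https://example.com/path/to/lecture.mp4"      # placeholder
  headers = {"Referer": "https://example.com/"}         # the header that mattered
  resp = requests.get(url, headers=headers, timeout=60)
  resp.raise_for_status()
  with open("flp1.mp4", "wb") as f:
      f.write(resp.content)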


25-person Zoom calls aren't fun. So my friends and I made a virtual gathering space for people to host online events, meetups, social hours, etc.

Tech: Frontend uses Svelte / Snowpack, which is great. The game view uses the DOM, which is questionable, but my WebGL implementation wasn't so hot either.

Backend is Firebase for general stuff, server written in Golang for the realtime game stuff, and another Golang server for video.

Main technical lesson learned so far: WebRTC sure is a pain to get right across everyone's devices, browsers, and connections.


Cool!

The good news is that the browser/device/connections issues have gotten so, so much better! All the browsers now mostly agree on aiming towards full compliance with the WebRTC 1.0 spec. The likelihood of something important breaking in a browser release is trending downwards. And the encoding -> network -> decoding pipelines in libwebrtc are pretty robust and performant these days.

But, as someone who has been doing WebRTC stuff since 2014 or so, and started a company that's in part a bet on the WebRTC ecosystem, I have a lot of scars from how long it took to get here. :-)

I'm biased, but I would say that there are still three classes of non-trivial difficulty that add up to "you shouldn't build everything yourself" being the right answer if you're building WebRTC-reliant features that you want to deploy to production:

  1. cross-device issues are still painful and still a moving target

  2. network bandwidth and track/encoding configurations for any use cases more complex than 1:1 calls are a steep learning curve plus lots of corner cases

  3. devops as you scale usage is a lot of work because there aren't any off the shelf cloud provider things you can just turn on and expect to work
Platforms that will take care of (parts of) the above for you include Agora, Vonage, Twilio, and (my company) Daily.co.


Maybe we should have tried Daily. :) We tried Twilio Video and unfortunately had a really hard time tracking down bugs. Wasted tons of time on that and then went back to doing our own stuff. :( I still don't know where the bug was. Could have been our code or Twilio's. But when we got rid of Twilio and used our own stuff, the bugs went away. It was a very frustrating process. Also, the way Twilio Video charges seems to assume lots of n-squared, high-res connections, which we don't have, so for an application like ours, rolling our own can cost less than a tenth as much.


I think this eventually turned into Andrej Karpathy's class at Stanford, CS231n. The class notes are here: http://cs231n.github.io/ The class is on youtube. If you like this hacker's guide, I think you'll definitely like the class and the notes. edit: A lot of the compute graph and backprop type stuff that is in the hacker's guide is covered in this specific class, starting about at this time: https://www.youtube.com/watch?v=i94OvYb6noo&t=207s


People say you have to "do" math to learn it. Usually they make it sound like you need to do the exercises in the books. I think that doing just that can be boring and demotivating.

I would suggest finding projects that can motivate you and help you exercise your math. Some suggestions of mathy things I regularly work on for fun:

1. Make a video game. If it's a 3d game, you'll have to do your matrices, dot products, trigonometry, etc.

2. shadertoy.com - This is a community site where people just program cool looking graphics for fun. All the code is open, so you can learn from it. Similar to game programming but without the mathless overhead. :)

3. Machine learning projects - I love writing various machine learning things, but the project that has been a great ML playground has been my self driving toy car. It gives me plenty of opportunities to explore many aspects of machine learning and that helps drive my math knowledge. My car repo is here: https://github.com/otaviogood/carputer but a much easier project is donkeycar.com. ML will touch on linear algebra, calculus, probabilities/statistics, etc.

The most important thing for learning is to be inspired and have fun with what you're learning. :)


Books might not be the best resource for Shadertoy-type stuff. Almost all of Shadertoy's 3D shaders use a technique called ray marching with signed distance functions. If you Google it, you should find good resources. Also, someone on Shadertoy made a very good tutorial using Shadertoy, which I think is kind of amazing... https://www.shadertoy.com/view/4dSfRc There are other tutorial shaders on Shadertoy and I always try to make mine readable and heavily commented... https://www.shadertoy.com/user/otaviogood


wow, a shader tutorial and it's written in shaders!

