
jdeaton · 2025-04-29T21:28:45 1745962125

It's called recursion

> Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?

No it does not
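For reference, a minimal sketch of the recursive approach being defended here. The `Node` class and the `walk`, `tree_sum`, and `tree_depth` helpers are hypothetical names made up for illustration; they are not from the linked proposal. A single recursive generator covers plain iteration, and the per-operation recursive functions stay short:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node used only for this illustration.
    value: int
    children: list = field(default_factory=list)


def walk(node):
    """Yield every node in pre-order (parent first, then each subtree)."""
    yield node
    for child in node.children:
        yield from walk(child)


def tree_sum(node):
    """Sum of all values: this node plus the sums of its subtrees."""
    return node.value + sum(tree_sum(c) for c in node.children)


def tree_depth(node):
    """Number of levels; a leaf has depth 1."""
    return 1 + max((tree_depth(c) for c in node.children), default=0)


tree = Node(1, [Node(2, [Node(4)]), Node(3)])
values = [n.value for n in walk(tree)]
total = tree_sum(tree)
depth = tree_depth(tree)
```

Once `walk` exists, anything that only needs to visit nodes is an ordinary `for` loop over it.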

jdeaton · 2025-04-22T20:51:28 1745355088

Yeah, I vote it should be rebranded “why”

alexykn · 2025-04-22T20:53:07 1745355187

command: why install htop, -> install start: because you wan' it -> done: you got it

Something like that you mean?

Asraelite · 2025-04-22T21:35:17 1745357717

I would love if the way to respond "yes" to the CLI is "why not"

Ringz · 2025-04-22T22:06:14 1745359574

Why not.

Rodmine · 2025-04-23T09:11:58 1745399518

Obvious joke is obvious... but it seems the "motivation" is to write a project in Rust.

chucksmash · 2025-04-22T21:26:51 1745357211

Perhaps _why?

jdeaton · 2025-04-21T15:38:13 1745249893

0.061 standard deviations? That's like almost nothing?
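A rough back-of-the-envelope check, assuming (and this is only an assumption) that the outcome is roughly normally distributed: a shift of 0.061 standard deviations moves someone who started at the median to a little above the 52nd percentile. The `normal_cdf` helper below is an illustrative name, built on the standard library's `math.erf`:

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


effect_sd = 0.061  # effect size in standard deviations, from the comment above

# Where a person at the median (50th percentile) ends up after the shift,
# ASSUMING a normal distribution of outcomes.
new_percentile = 100.0 * normal_cdf(effect_sd)
print(f"median moves to roughly the {new_percentile:.1f}th percentile")
```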

jdeaton · 2025-04-05T00:54:29 1743814469

PyTorch???

jdeaton · 2025-02-17T21:49:04 1739828944

Maybe she's wearing those noise-canceling headphones because of her auditory processing condition and not the other way around??

This seems like a basic speculative attribution error; no research here.

izzydata · 2025-02-17T22:03:26 1739829806

This would be my guess, as it sounds similar to what I experience and have been experiencing for decades, long before any kind of noise-cancelling headphones. I find it very difficult to process noises when there are too many occurring at once, and adding voices into the mix will make it look like I am lagging when trying to respond. I have no idea what the underlying condition is, but I consider this problem to be called "misophonia". I think it is about not being able to filter sounds well.

Thankfully I'm not regularly in an environment that forces me to wear noise-cancelling headphones.

Sometimes I believe this was caused by my childhood bedroom being soundproofed, but I still went to a noisy public school, so who knows.

jdeaton · 2025-02-05T01:46:13 1738719973

Something nice about this guide is that it generally transfers to GPU directly thanks to JAX/XLA.

jdeaton · 2025-02-05T01:32:57 1738719177

If you're using TPU, why are you using PyTorch?

hustwindmaple1 · 2025-02-06T01:22:34 1738804954

There is limited TPU support in PyTorch via torch_xla

jdeaton · 2025-02-06T17:28:08 1738862888

Sounds limited

jdeaton · 2025-02-05T01:23:40 1738718620

The interesting thing about this comment is that JAX is generally even higher-level than PyTorch. Since everything is compiled, you just express a logical program and let the compiler (XLA) worry about the rest.

Are you suggesting that XLA would be where this "lower level" approach would reside since it can do more automatic optimization?

Scene_Cast2 · 2025-02-05T01:36:21 1738719381

I'm curious, what does paradigmatic JAX look like? Is there an equivalent of picoGPT [1] for JAX?

[1] https://github.com/jaymody/picoGPT/blob/main/gpt2.py

jdeaton · 2025-02-05T01:52:44 1738720364

Yeah, it looks exactly like that file, but replace "import numpy as np" with "import jax.numpy as np" :)
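A minimal sketch of that swap, assuming the code is written as pure functions with no in-place array mutation (JAX arrays are immutable, so NumPy code that assigns into arrays would need changes). The `softmax` below is an illustrative helper written for this example, not copied from picoGPT, and the fallback to NumPy is only there so the snippet runs without JAX installed:

```python
# The only line that changes between the two libraries is the import.
try:
    import jax.numpy as np
except ImportError:  # fallback so the same function body runs on plain NumPy
    import numpy as np


def softmax(x):
    # Subtract the row max for numerical stability; nothing is mutated in place.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)


logits = np.array([1.0, 2.0, 3.0])
probs = softmax(logits)
```

With JAX installed, the same `softmax` can also be wrapped in `jax.jit` so that XLA compiles it.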

jdeaton · 2024-09-24T16:47:32 1727196452

JAX has a sub-system called Pallas[1] with a Triton-like programming model and an example implementation of Flash Attention [2]. It is quite fast. On TPUs I've heard that the XLA compiler already emits a flash-attention-like computation graph for a regular JAX implementation of attention so there's no need to have some specialized kernel in that case.

1. https://jax.readthedocs.io/en/latest/pallas/index.html

2. https://github.com/jax-ml/jax/blob/main/jax/experimental/pal...

jdeaton · 2024-09-05T04:22:33 1725510153

Why is the study specifically of Turkish students?

eesmith · 2024-09-05T04:39:34 1725511174

The paper at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486 says:

> One of the co-authors, Özge Kabakcı, a high school math teacher and former department chair of the math department at our partner Turkish high school, led the development of all session materials.

elashri · 2024-09-05T04:39:37 1725511177

I guess it would be hard to find a high school with the sample size you need (a thousand) that will agree to collaborate. And in the US every county will have different rules, and they don't teach math in a standard way.

But why Turkish and not British or any other place is going to be a question no matter the location. Do you really think the results would be significantly different if it were done on, let's say, Vietnamese students?