S2S is where we're investing the most effort on audio. Sorry it's been slow, but we are working hard on it.
Top priorities at the moment:
1) Better function calling performance
2) Improved perception accuracy (not mishearing)
3) More reliable instruction following
4) Bug fixes (cutoffs, run-ons, modality steering)