Discussion about this post

Sharad

Hi, thanks for this very comprehensive write-up. Some really interesting insights. I just wish there were more citations. One I'm looking for in particular is for the claim that sub-optimal tokenization impacts accuracy. Do you have a reference for this? Or are you working on this yourself? Evaluating the correlation between model performance and tokenization (for Nepali, for example) sounds like something worth exploring.
