
I run ML algorithms like boosted trees (e.g. XGBoost) on datasets with 30k-1M rows and 200-2k columns. Sorting is the bottleneck; it's what the algorithm does. I doubt I'm special, and I'm sure these sizes are common.
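
To make the sorting point concrete, here is a rough sketch of exact greedy split finding on a single feature, assuming a squared-error objective. The function name `best_split` and the toy data are hypothetical, and this is not XGBoost's actual implementation (which also offers histogram-based methods); it only shows why a tree learner of this kind sorts every column it considers.

```python
def best_split(values, targets):
    """Return (threshold, gain) for the best squared-error split on one feature.

    Illustrative only: `values` is one feature column, `targets` the labels.
    """
    n = len(values)
    # Sorting the rows by feature value is the O(n log n) step; it is
    # repeated for every feature considered.
    order = sorted(range(n), key=values.__getitem__)

    total_sum = sum(targets)
    base = total_sum * total_sum / n  # score of the unsplit node
    best_gain, best_threshold = 0.0, None
    left_sum = 0.0

    # Linear scan over the sorted rows, trying a cut between each neighbor pair.
    for rank in range(n - 1):
        i, j = order[rank], order[rank + 1]
        left_sum += targets[i]
        if values[i] == values[j]:
            continue  # no threshold separates equal feature values
        n_left = rank + 1
        n_right = n - n_left
        right_sum = total_sum - left_sum
        gain = left_sum ** 2 / n_left + right_sum ** 2 / n_right - base
        if gain > best_gain:
            best_gain = gain
            best_threshold = (values[i] + values[j]) / 2
    return best_threshold, best_gain


# Toy column: the best cut separates the two low targets from the two high ones.
threshold, gain = best_split([3.0, 1.0, 2.0, 4.0], [10.0, 0.0, 0.0, 10.0])
```

With 200-2k columns, that sort runs once per column (or once up front if the sorted order is cached), which is where the time goes in a scheme like this.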


IIRC the average qsort length is less than 20, according to Debian Code Search.



