[cuda backend] int4/8 matvec: vectorized activation load #36098
| Job | Run time |
|---|---|
| | 20m 36s |
| | 10m 36s |
| | 8m 28s |
| | 7m 50s |
| | 7m 50s |
| | 7m 37s |
| | 6m 59s |
| | 8m 21s |
| | 8m 32s |
| | 7m 8s |
| | 8m 17s |
| | 8m 4s |
| | 8m 40s |
| | 8m 17s |
| | 18m 26s |
| Total | 2h 25m 41s |