Inter-SM scripts + L2 cache results #4
if sort_lats:
    # plot the largest latency first, to ensure all bars are visible
    for i in range(num_kernels):
        l = [(lats[trace][i], trace) for trace in traces if trace in lats]
        sorted_l = sorted(l, key=lambda x: x[0], reverse=True)
        if bar:
            for lat, lo in sorted_l:
                ax.bar(i, lat, color=colors[lo])
    ax.legend(
        handles=[plt.Rectangle((0, 0), 1, 1, color=colors[trace]) for trace in traces if trace in lats],
        labels=[f"{round(int(trace)*4/(1024*1024), 2)} MB" if trace != "iso" else "iso" for trace in traces if trace in lats],
    )
else:
I added this as a non-default mode, and it is open for discussion. Currently the bars are plotted in a fixed order, on the assumption that the tallest bar is always plotted first (in the background) and the shorter ones are drawn on top of it. That assumption does not always hold.
The code above sorts the latencies for each kernel so that the tallest bar is always plotted first. To be honest, the result is a bit harder to read, because the orange bar is sometimes in front and sometimes in the background, and so on.
The current alternative uses slightly transparent bars (by setting alpha), which is the version I like most at this point, though it is not ideal. Let me know if you have a preference.
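To make the two options easier to compare side by side, here is a minimal sketch of both. It assumes `lats` maps each trace name to a list of per-kernel latencies and `colors` maps each trace name to a matplotlib color. The helper names `draw_order` and `plot_overlaid`, and the default `alpha=0.6`, are hypothetical and not taken from the PR.

```python
from typing import Dict, List, Sequence, Tuple


def draw_order(
    lats: Dict[str, Sequence[float]], traces: Sequence[str], i: int
) -> List[Tuple[float, str]]:
    """Return (latency, trace) pairs for kernel i, tallest first."""
    pairs = [(lats[trace][i], trace) for trace in traces if trace in lats]
    return sorted(pairs, key=lambda p: p[0], reverse=True)


def plot_overlaid(ax, lats, traces, colors, num_kernels, sort_lats=True, alpha=0.6):
    """Overlay one bar per trace at each kernel index.

    sort_lats=True:  opaque bars, tallest drawn first so shorter ones stay visible.
    sort_lats=False: fixed trace order, semi-transparent bars (alpha is an assumed default).
    """
    for i in range(num_kernels):
        if sort_lats:
            # Option 1: per-kernel sort, so the z-order of colors changes between kernels.
            for lat, trace in draw_order(lats, traces, i):
                ax.bar(i, lat, color=colors[trace])
        else:
            # Option 2: stable color order; transparency keeps hidden bars visible.
            for trace in traces:
                if trace in lats:
                    ax.bar(i, lats[trace][i], color=colors[trace], alpha=alpha)
```

With an `Axes` from `plt.subplots()`, calling `plot_overlaid(ax, lats, traces, colors, num_kernels, sort_lats=False)` gives the transparent variant. The sorted variant keeps every bar fully visible, but the color in front changes from kernel to kernel, which is the readability cost mentioned above.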