Skip to content

feat: implement basic semantic tokens#288

Open
AneesPatel wants to merge 6 commits into
hudson-trading:mainfrom
AneesPatel:feat-semantic-tokens-final
Open

feat: implement basic semantic tokens#288
AneesPatel wants to merge 6 commits into
hudson-trading:mainfrom
AneesPatel:feat-semantic-tokens-final

Conversation

@AneesPatel

@AneesPatel AneesPatel commented Apr 1, 2026

Copy link
Copy Markdown

Summary

  • Implements basic Semantic Tokens support (fixes Implement Semantic Tokens #268). This maps core Slang tokens to the standard LSP legend using a high-performance, single-pass delta encoder.

Testing & Quality

  • Verified mapping logic with 10 core unit tests covering keywords, literals, macros, and operators.
  • Formatted via clang-format to match project style guidelines.

Comment thread src/document/SemanticTokens.cpp
Comment thread src/document/SemanticTokens.cpp Outdated
return SemanticTokenType::Macro;

default: {
auto text = parsing::LexerFacts::getTokenKindText(kind);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use SyntaxFacts::getUnaryPrefixExpression and SyntaxFacts::getBinaryExpression rather than making a string.

Comment thread src/document/SemanticTokens.cpp Outdated
using namespace slang;
using TK = parsing::TokenKind;

lsp::SemanticTokensLegend getSemanticTokensLegend() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say to use SemanticTokenTypes in LspTypes.h, but that's generated as an rfl::Literal. Let's use slang's SLANG_ENUM to have the toString's generated and not have to maintain these two lists. We can also make this constexpr.

@AneesPatel

Copy link
Copy Markdown
Author

Thanks for the detailed review! I'm applying the fixes and refactoring now.

Quick question about the SyntaxFacts optimization: since getUnaryPrefixExpression and getBinaryExpression don’t catch basic punctuation like a semicolon, it falls through to std::nullopt. Should I update the test cases to expect std::nullopt, or add an explicit case to keep mapping semicolons to Operator? Also, just realized I need to fix a formatting error.

Comment thread src/document/SemanticTokens.cpp
@AndrewNolte

Copy link
Copy Markdown
Collaborator

I'd recommend installing the pre-commit to avoid formatting errors.

And I think it's fine to omit punctuation for now; the client side highlighter (textmate on vscode) should handle those ok. The primary purpose of semantic tokens is disambiguating identifiers using the AST. You could add some clarifying token labels by using the Syntax tree, although you'd need to look at syntax and not just tokens. The AST ones are a bit trickier, and involve performing lookups on some of the ids that reference others.

@AneesPatel

AneesPatel commented Apr 7, 2026

Copy link
Copy Markdown
Author

I've updated reserve() to scale proportionally and the SLANG_ENUM is fully integrated, and I added a quick pass to lowercase the strings so they don't break LSP compliance. Punctuation now falls through to nullopt unless SyntaxFacts identifies it as an operator.

For the constexpr legend, since C++20 can’t constexpr build the std::vector in SemanticTokensLegend, I added getSemanticTokensLegendTypes() as a constexpr std::arraystd::string_view, this should be the same in practice.

CI tests work on my fork. Thanks again for the review and help on this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement Semantic Tokens

2 participants