Skip to content

vnc_worker: improve VNC server compatibility and performance#3197

Open
bitranox wants to merge 17 commits intomicrosoft:mainfrom
bitranox:vnc-novnc-compat
Open

vnc_worker: improve VNC server compatibility and performance#3197
bitranox wants to merge 17 commits intomicrosoft:mainfrom
bitranox:vnc-novnc-compat

Conversation

@bitranox
Copy link
Copy Markdown

@bitranox bitranox commented Apr 5, 2026

Summary

Upgrade the VNC server to improve compatibility with modern clients and reduce bandwidth usage.

Protocol upgrade

  • RFB 3.3/3.7/3.8 support: Server now offers RFB 3.8 and falls back to 3.7/3.3. Modern clients (noVNC, TigerVNC) prefer 3.8 which includes proper security type negotiation and SecurityResult.

Performance

  • Tile-based dirty detection (64x64 tiles): Only changed screen regions are sent instead of the full framebuffer on every update, significantly reducing bandwidth for typical desktop usage.
  • Zlib compression (encoding type 6): When the client advertises zlib support, tile pixel data is compressed using a persistent zlib stream with Sync flush per the RFB spec.
  • Previous framebuffer tracking: Maintains a copy of the last sent framebuffer for incremental change detection.

Cursor

  • Cursor pseudo-encoding (-239): Sends a default arrow cursor (white fill, black outline) so the pointer is visible in all VNC clients without requiring server-side cursor rendering.

Compatibility

  • DesktopSize as soft capability: No longer a hard requirement, clients that do not advertise it still work.
  • Client reconnection: New VNC clients can take over by disconnecting the previous session.

Testing

Tested with TigerVNC, RealVNC and noVNC on Windows 11 guest running under openvmm/KVM:

  • Display, keyboard, mouse all working
  • Cursor visible on light and dark backgrounds
  • Zlib compression active when client supports it
  • Fallback to raw encoding works
  • RFB 3.3 clients still connect (downgrade handshake)
  • Desktop resize notification works
  • Client reconnection works (new client takes over)

@bitranox bitranox requested a review from a team as a code owner April 5, 2026 21:47
@bitranox
Copy link
Copy Markdown
Author

bitranox commented Apr 5, 2026

@microsoft-github-policy-service agree

@bitranox bitranox force-pushed the vnc-novnc-compat branch 2 times, most recently from f0803dc to 1fea435 Compare April 5, 2026 23:14
@bitranox bitranox changed the title vnc_worker: improve noVNC compatibility and allow client reconnection vnc_worker: improve VNC server compatibility and performance Apr 5, 2026
Upgrade VNC server from RFB 3.3-only to RFB 3.3/3.7/3.8, fixing
compatibility with modern clients like noVNC that expect the 3.8
security handshake. Make DesktopSize pseudo-encoding a soft capability
instead of a hard requirement. Allow new VNC clients to take over by
disconnecting the previous session.

New features:
- Tile-based dirty detection (64x64 tiles): only changed regions are
  sent instead of the full framebuffer, significantly reducing bandwidth
- Zlib compression (encoding type 6): when the client advertises zlib
  support, tile pixel data is compressed using a persistent zlib stream
  with Sync flush per the RFB spec
- Cursor pseudo-encoding (-239): sends a default arrow cursor with
  white fill and black outline so the pointer is visible in all clients
  including noVNC, without requiring server-side cursor rendering
- Previous framebuffer tracking for incremental updates

Protocol changes:
- Server now offers RFB 3.8 and falls back to 3.7/3.3
- RFB 3.8 security handshake with SecurityResult
- Cursor sent as part of the first FramebufferUpdate (not during
  SetEncodings) to avoid stream corruption with pipelined clients
@benhillis benhillis added the enhancement New feature or request label Apr 6, 2026
@smalis-msft
Copy link
Copy Markdown
Contributor

Thanks for your contribution! As someone who's very not familiar with the VNC protocols this looks fine to me, but it is failing our format check currently.

@bitranox
Copy link
Copy Markdown
Author

bitranox commented Apr 6, 2026

Thanks for your contribution! As someone who's very not familiar with the VNC protocols this looks fine to me, but it is failing our format check currently.

Anything I need to do ? Any linter I can test myself to pass ?

@smalis-msft
Copy link
Copy Markdown
Contributor

Well right now our CI is busted haha, but we'll fix that. Locally you'll need to pass both clippy (the standard rust linter) and our custom lints, which can be run with cargo xtask fmt, or cargo xtask fmt --fix to auto-fix some of them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bitranox
Copy link
Copy Markdown
Author

bitranox commented Apr 6, 2026

Well right now our CI is busted haha, but we'll fix that. Locally you'll need to pass both clippy (the standard rust linter) and our custom lints, which can be run with cargo xtask fmt, or cargo xtask fmt --fix to auto-fix some of them.

i fixed it. would be nice to see that merged, need it to run openvmm on proxmox with the noVMC viewer ...
also the original VNC server is really really minimal and very slow ...

@smalis-msft
Copy link
Copy Markdown
Contributor

#3201 will fix our CI, this should just need a rebase after that gets merged.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades the in-tree VNC server to better interoperate with modern VNC clients and to reduce bandwidth/CPU by sending incremental screen updates with optional zlib compression. It also improves UX/compatibility via a default cursor pseudo-encoding and relaxes the DesktopSize requirement, while adding support for new clients taking over an existing session.

Changes:

  • Upgrade handshake to offer RFB 3.8 (with fallback to 3.7/3.3) and negotiate security types.
  • Implement tile-based dirty detection (64x64), previous-framebuffer tracking, and optional zlib encoding for changed tiles.
  • Add cursor pseudo-encoding support and allow client reconnection to preempt an existing session.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
workers/vnc_worker/vnc/src/rfb.rs Adds the Cursor pseudo-encoding constant used during framebuffer updates.
workers/vnc_worker/vnc/src/lib.rs Implements protocol negotiation updates, incremental/tiled updates, zlib encoding, cursor sending, and makes DesktopSize optional.
workers/vnc_worker/vnc/Cargo.toml Adds flate2 dependency for zlib compression support.
workers/vnc_worker/src/lib.rs Updates server state machine to allow a new client to disconnect/take over from an existing connection.
Cargo.lock Records the new flate2 dependency for the vnc crate.

Comment on lines +83 to +90
let dest_depth = fmt.bits_per_pixel as usize / 8;
let shift_r = 24 - fmt.red_max.get().count_ones();
let shift_g = 16 - fmt.green_max.get().count_ones();
let shift_b = 8 - fmt.red_max.get().count_ones();
let no_convert = dest_depth == 4
&& shift_r == fmt.red_shift as u32
&& shift_g == fmt.green_shift as u32
&& shift_b == fmt.blue_shift as u32;
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shift_b is computed from red_max instead of blue_max, which will produce incorrect blue-channel conversion for non-8-bit blue formats and can also make no_convert detection wrong. Compute shift_b from fmt.blue_max (and ensure the fast-path checks reflect the correct source channel layout).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

@bitranox bitranox Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is intentional from the original code. If we change that to blue_max, grey windows decorators get black in RealVNC, but correct in noVNC

# this is intentional !
let shift_b = 8 - fmt.red_max.get().count_ones();

Copy link
Copy Markdown
Author

@bitranox bitranox Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In virtually all real pixel formats (RGB888, RGB565, RGB555), red and blue have the same number of bits. So red_max.count_ones() == blue_max.count_ones() and it doesn't matter which one you use.

Why changing to blue_max breaks colors:

A specific VNC client (RealVNC) is sending an incorrect or non-standard blue_max value. If a client sends blue_max = 0 or a value with fewer bits set than expected, then shift_b = 8 - 0 = 8, and b >> 8 = 0 for all pixels - zeroing out the blue channel entirely. - or it uses for instance RGB332: red=7, green=7, blue=3 (see below, count_ones() issue)

noVNC is sending an correct blue_max value --> it works
RealVNC is sending an incorrect or non-standard blue_max value --> grey turns into black

So using fmt.red_max.get() is the workaround at the moment.

I investigated this thoroughly again :

Additional issue: count_ones() is the wrong function here entirely. It counts set bits, not bit width. For spec-conforming values (2^N - 1 like 255, 31, 63) it happens to give the right answer, but for any non-conforming value it diverges. The correct approach is 16 - leading_zeros() which gives the actual bit width regardless of bit pattern.

The RFB spec says: max values "should be 2^N - 1 where N is the number of bits used for that colour" - using "should" not "must", but every real implementation assumes this form.

Plan:

  1. Fix shift_b to use blue_max (correct per spec)
  2. Switch all three channels from count_ones() to 16 - leading_zeros() (robust)
  3. Add validation in SetPixelFormat that max values are non-zero and of the form 2^N - 1, rejecting or ignoring non-conforming formats early

Key Points:

  • The spec says "should" not "must" - but every real implementation assumes 2^N - 1
  • They are independent per channel (red and blue can differ, e.g. RGB332: red=7, green=7, blue=3)
  • Sum of bit widths should equal depth
  • Shifts + widths must not overlap and must fit within bits_per_pixel
  let red_bits = 16 - fmt.red_max.get().leading_zeros();  
  let green_bits = 16 - fmt.green_max.get().leading_zeros();  
  let blue_bits = 16 - fmt.blue_max.get().leading_zeros();  
  let shift_r = 24 - red_bits;  
  let shift_g = 16 - green_bits;  
  let shift_b = 8 - blue_bits;  

This way we fix the root cause properly rather than relying on red_max as a workaround for buggy client values.
Since this needs a lot of visual testing with different clients, I would tackle that later.

Comment on lines +102 to +107
match dest_depth {
1 => out.push(p2 as u8),
2 => out.extend_from_slice(&(p2 as u16).to_ne_bytes()),
4 => out.extend_from_slice(&p2.to_ne_bytes()),
_ => unreachable!(),
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert_pixels will panic on client-selected pixel formats where bits_per_pixel/8 is not 1, 2, or 4 (e.g., 24bpp is common) due to unreachable!(). Since the client controls SetPixelFormat, this is a remotely-triggerable crash; validate/normalize the pixel format and return an error or support 24bpp instead of panicking.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is pre-existing behavior from the upstream code, not something introduced by this PR.

The RFB protocol specification only defines bits_per_pixel values of 8, 16, and 32. A value of 24 would be non-conforming, no standard VNC client sends it. However, since the pixel format is client-controlled via SetPixelFormat, a malicious or buggy client could theoretically trigger the panic.

Fixed in 9f79ef7: we now validate bits_per_pixel at SetPixelFormat time and reject anything other than 8/16/32 with a proper error, so the unreachable!() in convert_pixels is now genuinely unreachable.

Comment on lines +188 to 198
if version.0 == rfb::PROTOCOL_VERSION_38 {
// RFB 3.8: server sends SecurityResult after negotiation.
socket
.write_all(
rfb::SecurityResult {
status: rfb::SECURITY_RESULT_STATUS_OK.into(),
}
.as_bytes(),
)
.await?;
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFB 3.7/3.8 security negotiation: the client’s selected security type is read but never validated/acted on, and SecurityResult is only sent for protocol 3.8. 3.7 clients typically expect a SecurityResult too, and unsupported selections should result in a failure response/connection close.

Suggested change
if version.0 == rfb::PROTOCOL_VERSION_38 {
// RFB 3.8: server sends SecurityResult after negotiation.
socket
.write_all(
rfb::SecurityResult {
status: rfb::SECURITY_RESULT_STATUS_OK.into(),
}
.as_bytes(),
)
.await?;
}
if chosen_type != rfb::SECURITY_TYPE_NONE {
socket
.write_all(
rfb::SecurityResult {
status: rfb::SECURITY_RESULT_STATUS_FAILED.into(),
}
.as_bytes(),
)
.await?;
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
"client selected unsupported RFB security type",
)
.into());
}
// Send SecurityResult after successful negotiation.
// Although the requirement is explicit in RFB 3.8, 3.7 clients
// commonly expect the same response before ClientInit.
socket
.write_all(
rfb::SecurityResult {
status: rfb::SECURITY_RESULT_STATUS_OK.into(),
}
.as_bytes(),
)
.await?;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation part is legit. fixed in bae9522. We now reject the connection if the client selects a security type we didn't offer.

However, sending SecurityResult for RFB 3.7 as suggested would actually break the protocol. Per the RFB spec, SecurityResult for security type None was only introduced in 3.8. In 3.7, the server proceeds directly to ClientInit after the security type exchange, sending an unexpected 4-byte message would desync the stream. So we keep SecurityResult gated on 3.8 only.

[(zlib_stream.total_out() - before_out) as usize..],
FlushCompress::Sync,
)
.unwrap();
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zlib compression path uses .unwrap() on flate2::Compress::compress(...), which can panic and take down the VNC worker. Since compression is client-driven (encoding negotiation), handle errors explicitly and map them into Error/I/O failure rather than panicking.

Suggested change
.unwrap();
.map_err(|e| {
std::io::Error::other(format!(
"zlib compression failed: {e}"
))
})?;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 9c514c6. The .unwrap() is replaced with proper error propagation via std::io::Error, so a compression failure now cleanly disconnects the client instead of panicking the worker

Comment on lines 405 to 427
let cw: u16 = 18;
let ch: u16 = 18;
let bpp = fmt.bits_per_pixel as usize / 8;
let mask_stride = ((cw as usize) + 7) / 8;
let mut pixels = Vec::with_capacity(cw as usize * ch as usize * bpp);
for y in 0..ch as usize {
for x in 0..cw as usize {
let byte_i = x / 8;
let bit = 7 - (x % 8);
let in_mask =
byte_i < mask_stride && (MASK[y][byte_i] >> bit) & 1 == 1;
let in_fill =
byte_i < mask_stride && (FILL[y][byte_i] >> bit) & 1 == 1;
if in_mask && in_fill {
// White fill
pixels.extend_from_slice(&[0xFF, 0xFF, 0xFF, 0x00][..bpp]);
} else if in_mask {
// Black outline
pixels.extend_from_slice(&[0x00, 0x00, 0x00, 0x00][..bpp]);
} else {
// Transparent
pixels.extend_from_slice(&[0x00; 4][..bpp]);
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor pseudo-encoding pixel data is written as raw [R,G,B,(0)] bytes sliced by bpp, but it isn’t converted according to the negotiated PixelFormat (channel shifts/maxes and endianness). This will render incorrect colors/garbage for common formats like RGB565 or big-endian; generate cursor pixels using the same conversion logic as framebuffer tiles.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

@bitranox bitranox Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 452f08d. The cursor pixel data now goes through the same convert_pixels path as framebuffer tiles, so it correctly respects the client's negotiated pixel format (channel shifts, maxes, bit depth, endianness). Previously the raw hardcoded bytes would
have rendered as garbage on anything other than 32bpp little-endian XRGB.

Comment on lines +91 to +106

if no_convert {
out.extend_from_slice(src.as_bytes());
return;
}

for &p in src {
let (r, g, b) = (p & 0xff0000, p & 0xff00, p & 0xff);
let p2 = r >> shift_r << fmt.red_shift
| g >> shift_g << fmt.green_shift
| b >> shift_b << fmt.blue_shift;
match dest_depth {
1 => out.push(p2 as u8),
2 => out.extend_from_slice(&(p2 as u16).to_ne_bytes()),
4 => out.extend_from_slice(&p2.to_ne_bytes()),
_ => unreachable!(),
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pixel byte order is emitted using to_ne_bytes() / src.as_bytes() regardless of fmt.big_endian_flag. If a client requests big-endian pixel format, the server will send incorrectly ordered bytes; use to_be_bytes/to_le_bytes (and adjust the fast-path) based on big_endian_flag.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in a67b239. convert_pixels now uses to_be_bytes/to_le_bytes based on big_endian_flag instead of unconditionally using to_ne_bytes. The zero-copy fast-path is also gated on !big_endian since src.as_bytes() is native-endian.

@mattkur
Copy link
Copy Markdown
Contributor

mattkur commented Apr 6, 2026

Ditto to what @smalis-msft said. Thank you so much for this contribution! I don't think we have true VNC expertise amongst the maintainers/reviewers. That makes your contribution especially valuable, but a bit tricky to build confidence in from code (at least for me). I took a look over this and it also looks reasonable, but I think we'll want a few more eyes on it too.

Please review the feedback left by copilot and respond to it. If you don't agree with the feedback, that's fine, but please help us by leaving a brief statement why not.

bitranox and others added 5 commits April 6, 2026 18:51
Validate bits_per_pixel in SetPixelFormat to only accept 8, 16, and 32
as defined by the RFB spec. Previously a non-conforming client could
trigger an unreachable!() panic by sending e.g. 24bpp.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reject the connection if the client selects a security type other than
None, since that is the only type we offer. On RFB 3.8, send a
SecurityResult failure before disconnecting. On 3.7, SecurityResult is
not sent per spec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace .unwrap() on flate2 compress with proper error propagation
via std::io::Error, since compression is triggered by client-negotiated
encoding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cursor pseudo-encoding pixel data was written as raw hardcoded bytes,
ignoring the client's negotiated PixelFormat (channel shifts, maxes,
endianness). This would render incorrect colors on formats like RGB565
or big-endian. Now uses the same convert_pixels path as framebuffer
tiles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pixel bytes were emitted using to_ne_bytes() regardless of the client's
big_endian_flag, producing incorrect byte order for big-endian clients.
Now uses to_be_bytes/to_le_bytes based on the flag, and the fast-path
is disabled for big-endian formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@smalis-msft
Copy link
Copy Markdown
Contributor

Ok, should be good to take one more merge/rebase with main to unblock CI

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

Comment on lines 288 to +318
let (new_width, new_height) = self.fb.resolution();
if new_width != width || new_height != height {
// Send the new desktop size.
width = new_width;
height = new_height;
socket
.write_all(
rfb::FramebufferUpdate {
message_type: rfb::SC_MESSAGE_TYPE_FRAMEBUFFER_UPDATE,
padding: 0,
rectangle_count: 1.into(),
}
.as_bytes(),
)
.await?;
socket
.write_all(
rfb::Rectangle {
x: 0.into(),
y: 0.into(),
width: width.into(),
height: height.into(),
encoding_type: rfb::ENCODING_TYPE_DESKTOP_SIZE.into(),
force_full_update = true;
if self.supports_desktop_resize {
// Notify the client of the new desktop size.
socket
.write_all(
rfb::FramebufferUpdate {
message_type: rfb::SC_MESSAGE_TYPE_FRAMEBUFFER_UPDATE,
padding: 0,
rectangle_count: 1.into(),
}
.as_bytes(),
)
.await?;
socket
.write_all(
rfb::Rectangle {
x: 0.into(),
y: 0.into(),
width: width.into(),
height: height.into(),
encoding_type: rfb::ENCODING_TYPE_DESKTOP_SIZE.into(),
}
.as_bytes(),
)
.await?;
}
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the framebuffer resolution changes, width/height are updated even if supports_desktop_resize is false. That means subsequent rectangles are computed in the new coordinate space without informing the client, which can desync protocol state or cause out-of-bounds updates for clients that don't implement the DesktopSize pseudo-encoding. Consider either (a) keeping width/height fixed unless DesktopSize is supported (and cropping/letterboxing), or (b) disconnecting with a clear error when a resize occurs without DesktopSize support.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 7871e02. If the framebuffer resolution changes and the client did not advertise DesktopSize pseudo-encoding support, we now cleanly disconnect with an error instead of silently sending frames in the new coordinate space. Clients that support DesktopSize continue to receive the resize notification as before.
Though I havent seen a client that doesnt support that.

Comment on lines +566 to +569
match input.pixel_format.bits_per_pixel {
8 | 16 | 32 => {}
bpp => return Err(Error::UnsupportedPixelFormat(bpp)),
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetPixelFormat validation only checks bits_per_pixel, but convert_pixels performs shifts using client-provided red_shift/green_shift/blue_shift and can panic (or behave unpredictably in release) if a malicious client supplies shift values >= 32 or inconsistent max/shift combinations. Add validation/clamping for shift ranges (e.g., < bits_per_pixel and < 32) and consider rejecting non-true-color formats to avoid remotely-triggerable crashes.

Suggested change
match input.pixel_format.bits_per_pixel {
8 | 16 | 32 => {}
bpp => return Err(Error::UnsupportedPixelFormat(bpp)),
}
let bpp = input.pixel_format.bits_per_pixel;
match bpp {
8 | 16 | 32 => {}
_ => return Err(Error::UnsupportedPixelFormat(bpp)),
}
// This implementation only supports true-color formats, and the
// color component shifts and masks must fit within the advertised
// pixel width to avoid panics or invalid conversions on untrusted
// client input.
if input.pixel_format.true_color_flag == 0 {
return Err(Error::UnsupportedPixelFormat(bpp));
}
let bpp_u32 = u32::from(bpp);
let pixel_mask = (1u64 << bpp_u32) - 1;
let red_shift = u32::from(input.pixel_format.red_shift);
let green_shift = u32::from(input.pixel_format.green_shift);
let blue_shift = u32::from(input.pixel_format.blue_shift);
let red_max = u64::from(input.pixel_format.red_max.get());
let green_max = u64::from(input.pixel_format.green_max.get());
let blue_max = u64::from(input.pixel_format.blue_max.get());
let red_mask = match red_max.checked_shl(red_shift) {
Some(mask) if red_shift < bpp_u32 && red_shift < 32 && mask <= pixel_mask => mask,
_ => return Err(Error::UnsupportedPixelFormat(bpp)),
};
let green_mask = match green_max.checked_shl(green_shift) {
Some(mask)
if green_shift < bpp_u32
&& green_shift < 32
&& mask <= pixel_mask =>
{
mask
}
_ => return Err(Error::UnsupportedPixelFormat(bpp)),
};
let blue_mask = match blue_max.checked_shl(blue_shift) {
Some(mask) if blue_shift < bpp_u32 && blue_shift < 32 && mask <= pixel_mask => mask,
_ => return Err(Error::UnsupportedPixelFormat(bpp)),
};
if (red_mask & green_mask) != 0
|| (red_mask & blue_mask) != 0
|| (green_mask & blue_mask) != 0
{
return Err(Error::UnsupportedPixelFormat(bpp));
}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in f589cb2. We now reject non-true-color formats and any shift >= 32 to prevent panics from out-of-range bit shifts in convert_pixels.The full mask overlap validation in the suggestion is overkill for this case though - overlapping masks would just produce odd colors, not crashes or security issues, and real VNC clients always send sensible formats.

Comment on lines 163 to +188
socket
.write_all(rfb::ProtocolVersion(rfb::PROTOCOL_VERSION_33).as_bytes())
.write_all(rfb::ProtocolVersion(rfb::PROTOCOL_VERSION_38).as_bytes())
.await?;

let mut version = rfb::ProtocolVersion::new_zeroed();
socket.read_exact(version.as_mut_bytes()).await?;

if version.0 != rfb::PROTOCOL_VERSION_33 {
return Err(Error::UnsupportedVersion(version));
}
match version.0 {
rfb::PROTOCOL_VERSION_33 => {
// RFB 3.3: server dictates security type as a u32.
socket
.write_all(
rfb::Security33 {
padding: [0; 3],
security_type: rfb::SECURITY_TYPE_NONE,
}
.as_bytes(),
)
.await?;
}
rfb::PROTOCOL_VERSION_37 | rfb::PROTOCOL_VERSION_38 => {
// RFB 3.7/3.8: server sends a list of supported security types.
socket
.write_all(rfb::Security37 { type_count: 1 }.as_bytes())
.await?;
socket.write_all(&[rfb::SECURITY_TYPE_NONE]).await?;
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR changes user-visible VNC behavior/capabilities (RFB 3.8 negotiation, zlib encoding, cursor pseudo-encoding, reconnection behavior). The OpenVMM Guide has a Graphical Console/VNC page (Guide/src/reference/openvmm/graphical_console.md); consider updating it (or adding a brief note) so the documented client compatibility/features stay accurate.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in 7213e79. The VNC guide page now documents the protocol versions (3.3/3.7/3.8), zlib compression, cursor pseudo-encoding, DesktopSize handling (including the disconnect behavior for clients without support), QEMU extended key events, and client
reconnection behavior. Also added noVNC to the tested clients list.

bitranox and others added 3 commits April 6, 2026 20:03
If the framebuffer resolution changes but the client did not advertise
DesktopSize pseudo-encoding, disconnect with an error instead of
silently sending frames in the new coordinate space, which would desync
the protocol state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reject non-true-color formats and shift values >= 32 in SetPixelFormat
to prevent panics in convert_pixels from out-of-range bit shifts on
malicious client input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document RFB 3.3/3.7/3.8 protocol support, zlib compression, cursor
pseudo-encoding, DesktopSize handling, QEMU extended key events, client
reconnection behavior, and add noVNC to the tested clients list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the Guide label Apr 6, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
let mut vncserver = vnc::Server::new("OpenVMM VM".into(), socket, view, input);
let mut timer = PolledTimer::new(driver);
let (abort_send, abort_recv) = mesh::oneshot();
let connection = Box::pin(async move {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this stuff is duplicated with the code that handles the initial connection. Can you factor this out?

Copy link
Copy Markdown
Author

@bitranox bitranox Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factored out in c2fa3c2. The connection setup (vnc::Server, timer, abort channel, async task) now lives in a single start_connection helper that both the initial-connect and reconnect paths call. Removes ~20 lines of duplication.

)
.await?;
}
return Err(std::io::Error::new(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a new error enum variant here instead of this synthetic IO error?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 0db65e4. Replaced with a proper Error::UnsupportedSecurityType(u8) variant that also captures the rejected type value.

// Send the new desktop size.
if !self.supports_desktop_resize {
return Err(std::io::Error::new(
std::io::ErrorKind::Unsupported,
Copy link
Copy Markdown
Member

@jstarks jstarks Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No "synthetic" IO errors please.
  2. Is this the appropriate thing to do, to terminate the connection?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synthetic IO error is already replaced with Error::ResizeUnsupported in 0db65e4.

As for disconnecting, yes, this is the standard approach. Both QEMU and TigerVNC disconnect clients that don't support DesktopSize when the resolution changes. The RFB spec doesn't define fallback behavior for this case, and continuing to send framebuffer updates with dimensions the client doesn't expect risks protocol desync or garbled display. Disconnecting is clean, the client can simply reconnect and pick up the new resolution.

The alternative would be cropping/padding to the old resolution, but that adds complexity for a worse user experience.


let full_update = force_full_update || self.prev_fb.len() != fb_size;

// Find dirty tiles.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK for now if you find it improves performance.

We should actually be able to get dirty tiles from the device itself with a little more work (Windows guests send them, UEFI does, not sure about Linux guests... but we can also enable dirty bit tracking and scan that periodically)...

But we can tackle this separately.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, getting dirty regions from the device would be the ideal path since it avoids the per-frame comparison entirely. The tile-based diffing here is a reasonable stopgap that already cuts bandwidth significantly vs full-frame sends, especially for mostly-static screens.

Copy link
Copy Markdown
Author

@bitranox bitranox Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What already exists:

  • The synthetic video protocol (vm/devices/uidevices/src/video/protocol.rs) has full support for dirty regions: DirtMessage, Rectangle, up to 255 dirty rects per message, and a FeatureChangeMessage with is_dirt_needed flag
  • The device parses incoming dirty messages and creates Request::Dirt(Vec) in vm/devices/uidevices/src/video/mod.rs

The gap, a single TODO:

  • Line 493 of video/mod.rs: // TODO: Support dirt requests, the dirty rectangles are parsed but never forwarded anywhere

What would need to happen to wire it up:

  1. Forward dirty rects from the synthetic video device through a channel
  2. Add a dirty region receiver to the VNC worker (alongside or replacing the current Updater signal)
  3. VNC worker uses device-provided rects instead of full-frame diffing, with diffing as fallback for guests that don't send dirty regions (like some Linux guests).

I will tackle that as soon that PR got merged in

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correction : Linux guests do use the synthetic video device.

  • hyperv_fb (older fbdev driver) does NOT send dirty rects. It just sets the framebuffer location and relies on the host to scan/poll for changes. So tile diffing would still be needed as fallback here.
  • hyperv_drm (newer DRM driver) does send SYNTHVID_DIRT messages via drm_framebuffer_funcs.dirty. So dirty tracking from the device would work with modern Linux guests using this driver.
  • Cursor neither Linux driver sends SYNTHVID_POINTER_SHAPE. Cursor is software-rendered guest-side in both cases.

So for dirty tracking: Windows guests and Linux guests with hyperv_drm would provide device-level dirty rects. Linux guests with the older hyperv_fb would still need tile diffing as fallback.

I will tackle that as soon that PR got merged in


if send_cursor {
send_cursor = false;
// 18x18 arrow cursor with white fill and 2px black outline.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is OK for now to make things more usable. But really we should get the actual cursor from the device.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Getting the real cursor from the device would be the right long-term solution. It would give correct cursor shapes for resize handles, text I-beams, wait spinners, etc. The hardcoded arrow is a usability improvement over no cursor at all, which is what clients got before when the server didn't send cursor pseudo-encoding.

Copy link
Copy Markdown
Author

@bitranox bitranox Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synthetic video protocol (vm/devices/uidevices/src/video/protocol.rs) has complete cursor support:

  • MESSAGE_POINTER_SHAPE (type 8) , sends cursor image: ARGB 32bpp or monochrome, up to 96x96, with hotspot coordinates, chunked transmission support
  • MESSAGE_POINTER_POSITION (type 7) , sends cursor position + visibility
  • FeatureChangeMessage.is_pointer_shape_updates_needed , feature negotiation flag

The gap: In vm/devices/uidevices/src/video/mod.rs (lines ~488-491), the messages are received but intentionally ignored:
Request::PointerPosition { is_visible, x, y } => {
let _ = (is_visible, x, y); // Ignored
}
Request::PointerShape => {} // Not parsed

What it would take:

  1. Parse MESSAGE_POINTER_SHAPE fully in the video device (accumulate chunks)
  2. Extend FramebufferControl trait (video_core/src/lib.rs) with cursor methods
  3. Add cursor storage to the framebuffer crate
  4. Extend vnc::Framebuffer trait to expose cursor data
  5. VNC worker polls for cursor changes and sends RFB cursor encoding updates

This is straightforward plumbing, the protocol, message structures, and channel infrastructure all exist already.

Linux Guests: No path currently, neither Linux driver sends SYNTHVID_POINTER_SHAPE.
Cursor is software-rendered guest-side in both cases.

UEFI: No cursor data

GOP is output-only, doesn't provide cursor shapes.

I will tackle that as soon that PR got merged in

let pf = &input.pixel_format;
match pf.bits_per_pixel {
8 | 16 | 32 => {}
bpp => return Err(Error::UnsupportedPixelFormat(bpp)),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is terminating the connection the right thing to do?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, disconnecting is the standard approach here. The RFB protocol defines no response message for SetPixelFormat, so there's no way to signal rejection to the client. The only options are: use the invalid format (risking panics or UB in pixel conversion), silently ignore it (protocol desync since the client believes the new format is active), or disconnect.
QEMU, TigerVNC, and libvncserver all disconnect on invalid pixel formats for the same reason.

Copy link
Copy Markdown
Member

@jstarks jstarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some feedback.

bitranox added 2 commits April 7, 2026 01:32
Add UnsupportedSecurityType and ResizeUnsupported variants to the Error
enum instead of constructing std::io::Error::new with string messages.
@bitranox bitranox requested a review from jstarks April 6, 2026 23:43
@bitranox
Copy link
Copy Markdown
Author

bitranox commented Apr 7, 2026

Left some feedback.

Whow, we’re moving from “barely works” to “fully blown, with all the whistles and bells.”
I like it. It’s better to implement all the cursor, tiling, and related pieces properly once and for all.
Appreciate the feedback !

should I put inline comments for that issues for future reference ?

If You prefer that I implement the dirty & cursor parts already now, let me know. I would prefer to merge the current stage and then apply that additional changes on top of that - otherwise the PR will get quite big ...

greetings from good old Vienna
Robert

bitranox added 2 commits April 7, 2026 13:49
…ariant

Add Error::ZlibCompression variant instead of constructing
std::io::Error::other with a string message.
The start_connection helper uses PolledSocket<socket2::Socket> in its
explicit type signature, which requires socket2 as a direct dependency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request Guide

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants