Skip to content

Add Basic DOM Functionality test codes#23515

Open
DavidHuang666 wants to merge 10 commits into
sonic-net:masterfrom
DavidHuang666:transceiver_dom_test
Open

Add Basic DOM Functionality test codes#23515
DavidHuang666 wants to merge 10 commits into
sonic-net:masterfrom
DavidHuang666:transceiver_dom_test

Conversation

@DavidHuang666

@DavidHuang666 DavidHuang666 commented Apr 1, 2026

Copy link
Copy Markdown

Description of PR

  1. Update dom_test_plan.md for DOM data consistency verification.
  2. Add Basic DOM Functionality test codes according to the dom_test_plan.md.

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@linux-foundation-easycla

linux-foundation-easycla Bot commented Apr 1, 2026

Copy link
Copy Markdown

CLA Not Signed

@github-actions github-actions Bot requested review from mihirpat1, r12f and wangxin April 1, 2026 07:31
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@mihirpat1 mihirpat1 requested a review from Copilot April 8, 2026 03:49
@mihirpat1

Copy link
Copy Markdown
Contributor

@DavidHuang666 Can you please update the branch to resolve merge conflict?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new transceiver DOM (Digital Optical Monitoring) test plan and a set of pytest-based DOM validation tests under tests/transceiver/dom/, intended to verify availability, operational ranges, EEPROM thresholds, and polling consistency using configuration-driven attributes.

Changes:

  • Introduces DOM pytest fixtures/helpers for reading/parsing TRANSCEIVER_DOM_SENSOR / TRANSCEIVER_DOM_THRESHOLD from STATE_DB.
  • Adds 4 basic DOM functionality test cases (availability, operational range, thresholds, consistency).
  • Adds a new DOM test plan markdown document describing attributes and validation algorithms.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/transceiver/dom/conftest.py Adds DOM-specific fixtures and STATE_DB read/parse helpers; builds per-port DOM context from attribute config.
tests/transceiver/dom/test_dom_availability.py Validates DOM sensor table presence, expected fields, and freshness.
tests/transceiver/dom/test_dom_operational_range.py Validates sensor readings fall within configured operational ranges (including lane expansion).
tests/transceiver/dom/test_dom_threshold.py Validates threshold completeness/hierarchy and compares configured vs STATE_DB thresholds.
tests/transceiver/dom/test_dom_consistency.py Polls DOM data across cycles and checks timestamp progression + bounded variation.
tests/transceiver/dom/__init__.py Declares the DOM test package.
docs/testplan/transceiver/dom_test_plan.md Adds DOM test plan and attribute reference/algorithm documentation.

Comment on lines +250 to +265
@pytest.fixture(scope="module")
def dom_port_context(port_attributes_dict):
context = {}
for port, attrs in port_attributes_dict.items():
dom_attrs = attrs.get(DOM_CATEGORY_KEY, {})
if not isinstance(dom_attrs, dict) or not dom_attrs:
continue
context[port] = {
"base": attrs.get("BASE_ATTRIBUTES", {}),
"dom": dom_attrs,
}

if not context:
pytest.skip("No ports with non-empty DOM_ATTRIBUTES found in port_attributes_dict")

return context

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dom_port_context depends on a port_attributes_dict fixture, but there is no definition of port_attributes_dict anywhere else in the repo (search only finds this reference). As-is, pytest will error with "fixture 'port_attributes_dict' not found" and none of these DOM tests will run. Please add/provide the session-scoped port_attributes_dict fixture (as described in docs/testplan/transceiver/test_plan.md) or refactor these fixtures to consume an existing inventory/config fixture.

Copilot uses AI. Check for mistakes.
Comment thread tests/transceiver/dom/conftest.py Outdated
Comment on lines +323 to +337
@pytest.fixture(scope="module")
def dom_sensor_by_port(duthost, dom_ports):
return {
port: _read_state_db_hash(duthost, STATE_DB_SENSOR_KEY_TEMPLATE.format(port))
for port in dom_ports
}


@pytest.fixture(scope="module")
def dom_threshold_by_port(duthost, dom_ports):
return {
port: _read_state_db_hash(duthost, STATE_DB_THRESHOLD_KEY_TEMPLATE.format(port))
for port in dom_ports
}

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dom_sensor_by_port / dom_threshold_by_port are module-scoped and take a one-time snapshot of STATE_DB before the tests run. This can make data_max_age_min freshness checks fail (age keeps increasing after fixture creation) and can also be flaky if xcvrd hasn’t populated tables yet. Consider reading STATE_DB at assertion time (use dom_db_reader) or making these fixtures function-scoped / retrying until data is present.

Copilot uses AI. Check for mistakes.
Comment on lines +91 to +96
# Step 3/4: Poll repeatedly, validate timestamp progression and reasonable sensor behavior.
for poll_idx in range(1, poll_count):
time.sleep(poll_interval_sec)
current = read_sensor(port)
if not current:
field_failures.append("DOM sensor read failed during consistency polling")

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consistency polling loop uses time.sleep(poll_interval_sec) where poll_interval_sec comes from max_update_time_sec (default in the test plan is 60). With consistency_check_poll_count >= 3 this can add minutes of hard sleep per port, which can significantly slow CI and makes failures expensive. Prefer using the repo’s wait_until pattern to wait for last_update_time to advance (or cap the sleep interval / poll only until the update happens) instead of always sleeping the full interval.

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +67
| Attribute Name | Type | Default Value | Mandatory | Override Levels | Description |
|----------------|------|---------------|-----------|-----------------|-------------|
| temperature_operational_range | dict | {"min": 20.0, "max": 70.0} | O | transceivers | Realistic operational temperature range in Celsius during normal operation (typical: room temp to moderate heat) |
| temperature_threshold_range | dict | (format) {"lowalarm": <float>, "lowwarning": <float>, "highwarning": <float>, "highalarm": <float>} | O | transceivers | Absolute threshold temperature range in Celsius (must define all four keys; no implicit defaults) |
| voltage_operational_range | dict | {"min": 3.20, "max": 3.40} | O | transceivers | Realistic operational voltage range in volts during normal operation (typical: 3.3V ±3%) |
| voltage_threshold_range | dict | (format) {"lowalarm": <float>, "lowwarning": <float>, "highwarning": <float>, "highalarm": <float>} | O | transceivers | Absolute threshold voltage range in volts (provide EEPROM alarm/warn limits; skip to disable voltage threshold validation) |
| laser_temperature_operational_range | dict | {"min": 20.0, "max": 70.0} | O | transceivers | Realistic operational laser temperature range in Celsius during normal operation |

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attributes table uses || at the start of each row, which breaks standard Markdown table rendering. Please convert these to normal |-delimited rows (single leading/trailing pipe) so the table renders correctly in GitHub markdown.

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +37

if max_age_min is not None:

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a duplicated if max_age_min is not None: block: one sets has_configured_checks = True (lines 33-36) and the next immediately re-checks max_age_min is not None to run the freshness validation. This duplication is unnecessary and makes the flow harder to read/maintain; please collapse into a single conditional.

Suggested change
if max_age_min is not None:

Copilot uses AI. Check for mistakes.
Comment on lines +145 to +152
| Attribute Name | Base Field | Lane Expansion | Expected STATE_DB Fields |
|----------------|------------|----------------|-------------------------|
| `temperature_operational_range` | `temperature` | No | `temperature` |
| `txLANE_NUMbias_operational_range` | `txLANE_NUMbias` | Yes | `tx1bias`, `tx2bias`, `tx3bias`, `tx4bias` (for 4-lane) |
| `rxLANE_NUMpower_operational_range` | `rxLANE_NUMpower` | Yes | `rx1power`, `rx2power`, `rx3power`, `rx4power` (for 4-lane) |
| `voltage_threshold_range` | `voltage` | No | `vcchighalarm`, `vcclowalarm`, `vcchighwarning`, `vcclowwarning` |
| `tx_power_threshold_range` | `tx_power` | No | `txpowerhighalarm`, `txpowerlowalarm`, `txpowerhighwarning`, `txpowerlowwarning` |

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Example Mappings" table also uses || row prefixes, which will not render as a proper Markdown table. Please update it to standard | ... | formatting for GitHub rendering.

Copilot uses AI. Check for mistakes.
Signed-off-by: david <david.huang@eoptolink.com>
@DavidHuang666 DavidHuang666 force-pushed the transceiver_dom_test branch from 5fc6485 to b565d2f Compare April 9, 2026 07:31
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@DavidHuang666

Copy link
Copy Markdown
Author

@mihirpat1 I have resolved merge conflict based on the latest sonic-net/sonic-mgmt/master.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.

pytest.mark.topology("ptp-256", "m0"),
]


Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These DOM tests don't skip the virtual switch (VS) environment, but other transceiver tests (e.g., EEPROM verification) explicitly skip VS because there are no physical transceivers/DOM data. Consider adding the same VS skip (ideally as an autouse fixture in this package's conftest) so TC2/TC3/TC4 don't fail on VS testbeds.

Suggested change
@pytest.fixture(autouse=True)
def skip_dom_consistency_on_vs(request, tbinfo):
"""Skip DOM consistency tests on VS testbeds without physical transceivers."""
testbed_type = str(request.config.getoption("--testbed_type", default="") or "").lower()
topo = tbinfo.get("topo", {}) if isinstance(tbinfo, dict) else {}
topo_name = str(topo.get("name", "") or "").lower()
topo_type = str(topo.get("type", "") or "").lower()
if testbed_type == "vs" or topo_type == "vs" or topo_name == "vs":
pytest.skip("DOM consistency tests are not supported on VS testbeds")

Copilot uses AI. Check for mistakes.
Comment on lines +51 to +60
if max_age_min is not None:
parsed_time = parse_dom_update_time(sensor_data.get("last_update_time"))
if parsed_time is None:
field_failures.append("last_update_time missing or unparsable while data_max_age_min is configured")
else:
age_minutes = (now_utc - parsed_time).total_seconds() / 60.0
if age_minutes > float(max_age_min):
field_failures.append(
"last_update_time too old (age_min={:.2f}, limit={})".format(age_minutes, max_age_min)
)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float(max_age_min) can raise ValueError if data_max_age_min is present but non-numeric. Prefer parsing/validating with parse_dom_numeric(max_age_min) and treat invalid values as a test failure with a clear message rather than crashing the test.

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +105
# Step 4: Compare configured threshold values against STATE_DB values.
for logical_key in EXPECTED_THRESHOLD_KEYS:
expected = float(attr_value[logical_key])
actual = parsed_actual[logical_key]
if abs(actual - expected) > VALUE_TOLERANCE:
field_failures.append(
"{} expected {}={}, got {}".format(attr_name, logical_key, expected, actual)
)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected = float(attr_value[logical_key]) will raise if the configured threshold value is non-numeric (e.g., "N/A"), causing the test to error rather than produce a readable validation failure. Use parse_dom_numeric (or a try/except) when converting configured threshold values and report which key is invalid.

Copilot uses AI. Check for mistakes.
Comment on lines +124 to +145
op_min = operational_range.get("min")
op_max = operational_range.get("max")
if op_min is None or op_max is None:
field_failures.append(
"{} present but missing min/max in DOM_ATTRIBUTES".format(operational_attr)
)
continue

if not (
parsed_actual["lowwarning"] < float(op_min)
and float(op_max) < parsed_actual["highwarning"]
):
field_failures.append(
"{} operational range [{}, {}] is not within warning bounds ({}, {})".format(
operational_attr,
op_min,
op_max,
parsed_actual["lowwarning"],
parsed_actual["highwarning"],
)
)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float(op_min) / float(op_max) can raise when operational range min/max are present but non-numeric. Parse/validate these with parse_dom_numeric (or guard with try/except) so the test reports a config validation failure instead of crashing.

Copilot uses AI. Check for mistakes.
Comment on lines +77 to +82
| tx_power_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed Tx power variation between two consecutive TC4 polling reads |
| rx_power_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed Rx power variation between two consecutive TC4 polling reads |
| tx_bias_consistency_variation_threshold | float | 10.0 | O | transceivers or platform | Maximum allowed Tx bias variation percentage between two consecutive TC4 polling reads |
| laser_temperature_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed laser temperature variation between two consecutive TC4 polling reads |
| temperature_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed module temperature variation between two consecutive TC4 polling reads |
| voltage_consistency_variation_threshold | float | 0.1 | O | transceivers or platform | Maximum allowed module voltage variation between two consecutive TC4 polling reads |

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new consistency variation threshold attributes are marked as optional ("O") with default values in this table, but the TC4 implementation currently treats missing threshold attrs as configuration errors and fails the test. Align the testplan with the implementation (mark required) or update the implementation to actually support the documented defaults/optionality.

Suggested change
| tx_power_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed Tx power variation between two consecutive TC4 polling reads |
| rx_power_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed Rx power variation between two consecutive TC4 polling reads |
| tx_bias_consistency_variation_threshold | float | 10.0 | O | transceivers or platform | Maximum allowed Tx bias variation percentage between two consecutive TC4 polling reads |
| laser_temperature_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed laser temperature variation between two consecutive TC4 polling reads |
| temperature_consistency_variation_threshold | float | 3.0 | O | transceivers or platform | Maximum allowed module temperature variation between two consecutive TC4 polling reads |
| voltage_consistency_variation_threshold | float | 0.1 | O | transceivers or platform | Maximum allowed module voltage variation between two consecutive TC4 polling reads |
| tx_power_consistency_variation_threshold | float | 3.0 | M | transceivers or platform | Maximum allowed Tx power variation between two consecutive TC4 polling reads; required for TC4 polling consistency validation |
| rx_power_consistency_variation_threshold | float | 3.0 | M | transceivers or platform | Maximum allowed Rx power variation between two consecutive TC4 polling reads; required for TC4 polling consistency validation |
| tx_bias_consistency_variation_threshold | float | 10.0 | M | transceivers or platform | Maximum allowed Tx bias variation percentage between two consecutive TC4 polling reads; required for TC4 polling consistency validation |
| laser_temperature_consistency_variation_threshold | float | 3.0 | M | transceivers or platform | Maximum allowed laser temperature variation between two consecutive TC4 polling reads; required for TC4 polling consistency validation |
| temperature_consistency_variation_threshold | float | 3.0 | M | transceivers or platform | Maximum allowed module temperature variation between two consecutive TC4 polling reads; required for TC4 polling consistency validation |
| voltage_consistency_variation_threshold | float | 0.1 | M | transceivers or platform | Maximum allowed module voltage variation between two consecutive TC4 polling reads; required for TC4 polling consistency validation |

Copilot uses AI. Check for mistakes.
Comment on lines +227 to +236
for attr_name in CONSISTENCY_VARIATION_THRESHOLD_ATTRS:
raw_value = dom_attrs.get(attr_name)
if raw_value is None:
errors.append("missing required DOM attribute {} for consistency variation validation".format(attr_name))
continue

numeric = _parse_numeric(raw_value)
if numeric is None:
errors.append("{} is non-numeric in DOM_ATTRIBUTES (raw={!r})".format(attr_name, raw_value))
continue

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consistency variation thresholds are documented as optional with defaults, but _parse_consistency_variation_thresholds treats missing attributes as errors ("missing required DOM attribute ...") which will fail TC4 on any port that doesn't explicitly set all thresholds. Either apply documented defaults when the attrs are absent, or only enforce/validate thresholds that are present and skip the corresponding variation checks when they are not configured.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +51
max_age_min = dom_attrs.get("data_max_age_min")
if max_age_min is not None:
has_configured_checks = True

if max_age_min is not None:
if not sensor_data:
field_failures.append("missing TRANSCEIVER_DOM_SENSOR data for freshness check")
else:
parsed_time = parse_dom_update_time(sensor_data.get("last_update_time"))
if parsed_time is None:
field_failures.append(
"last_update_time missing or unparsable while data_max_age_min is configured"
)
else:
age_minutes = (now_utc - parsed_time).total_seconds() / 60.0
if age_minutes > float(max_age_min):
field_failures.append(
"last_update_time too old (age_min={:.2f}, limit={})".format(age_minutes, max_age_min)
)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float(max_age_min) can raise ValueError if data_max_age_min is present but non-numeric (e.g., string/"N/A"), turning this into an error instead of a clear test failure. Use parse_dom_numeric(max_age_min) (and validate it is not None and > 0) before comparing ages, and append a failure message when the config value is invalid.

Copilot uses AI. Check for mistakes.
Comment on lines +62 to +93
min_value = attr_value.get("min")
max_value = attr_value.get("max")
if min_value is None or max_value is None:
field_failures.append("{} missing required min/max in DOM_ATTRIBUTES".format(attr_name))
continue

field_base = attr_name[: -len(OPERATIONAL_SUFFIX)]
if "LANE_NUM" in field_base:
if lane_count <= 0:
field_failures.append("{} requires lane count but lane_count <= 0".format(attr_name))
continue
fields = [field_base.replace("LANE_NUM", str(lane)) for lane in range(1, lane_count + 1)]
else:
fields = [field_base]

# Step 4: Validate each derived field is numeric and inside configured operational range.
for field in fields:
if not sensor_data:
field_failures.append("{} DOM sensor table missing".format(field))
continue
raw_value = sensor_data.get(field)
numeric_value = parse_dom_numeric(raw_value)
if numeric_value is None:
field_failures.append(
"{} missing/non-numeric operational value in STATE_DB (raw={!r})".format(field, raw_value)
)
continue

if not float(min_value) <= numeric_value <= float(max_value):
field_failures.append(
"{} value {} out of range [{}, {}]".format(field, numeric_value, min_value, max_value)
)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float(min_value) / float(max_value) can raise if the configured min/max values are non-numeric. Parse min/max with parse_dom_numeric first and report an actionable validation failure when they are missing/non-numeric, so the test doesn't crash on bad config data.

Copilot uses AI. Check for mistakes.
pytestmark = [
pytest.mark.topology("ptp-256", "m0"),
]

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These DOM tests don't skip the virtual switch (VS) environment, but other transceiver tests (e.g., EEPROM verification) explicitly skip VS because there are no physical transceivers/DOM data. Consider adding the same VS skip (ideally as an autouse fixture in this package's conftest) so TC2/TC3/TC4 don't fail on VS testbeds.

Suggested change
@pytest.fixture(autouse=True)
def skip_dom_tests_on_vs(tbinfo):
topo_info = tbinfo.get("topo", {})
topo_name = str(topo_info.get("name", "")).lower()
topo_type = str(topo_info.get("type", "")).lower()
testbed_type = str(tbinfo.get("testbed_type", "")).lower()
if "vs" in {topo_name, topo_type, testbed_type}:
pytest.skip("DOM operational range tests are not supported on VS testbeds")

Copilot uses AI. Check for mistakes.
pytestmark = [
pytest.mark.topology("ptp-256", "m0"),
]

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These DOM tests don't skip the virtual switch (VS) environment, but other transceiver tests (e.g., EEPROM verification) explicitly skip VS because there are no physical transceivers/DOM data. Consider adding the same VS skip (ideally as an autouse fixture in this package's conftest) so TC2/TC3/TC4 don't fail on VS testbeds.

Suggested change
@pytest.fixture(autouse=True)
def skip_dom_threshold_on_vs(duthosts, rand_one_dut_hostname):
duthost = duthosts[rand_one_dut_hostname]
facts = getattr(duthost, "facts", {}) or {}
if facts.get("asic_type") == "vs" or facts.get("platform") == "x86_64-kvm_x86_64-r0":
pytest.skip("DOM threshold validation is not supported on VS testbeds")

Copilot uses AI. Check for mistakes.
import pytest

pytestmark = [
pytest.mark.topology("ptp-256", "m0"),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavidHuang666 Please remove the topology markers.
This is not required anymore since we have the below fix checked in
#23497

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your remind. I have removed the topology markers for TC1 to TC4.

Comment thread tests/transceiver/dom/conftest.py Outdated
STATE_DB_THRESHOLD_KEY_TEMPLATE = "TRANSCEIVER_DOM_THRESHOLD|{}"

OPERATIONAL_SUFFIX = "_operational_range"
THRESHOLD_SUFFIX = "_threshold_range"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavidHuang666 Can you redefine these repeated constants to a common utils folder so that all the tests can reuse it?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I’ll move them into common utils for reuse.

@@ -0,0 +1,372 @@
import logging

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavidHuang666 The conftest.py is currently loaded with many functions. Can you please restructure it per the below?

tests/transceiver/dom/
├── init.py
├── conftest.py # fixtures only
├── utils/
│ ├── init.py
│ ├── dom_constants.py # THRESHOLD_SUFFIX, THRESHOLD_PREFIX_OVERRIDES,
│ │ # CONSISTENCY_VARIATION_RULES, etc.
│ ├── dom_field_mapper.py # _expand_operational_fields,
│ │ # _build_operational_field_range_map,
│ │ # _threshold_field_map,
│ │ # _build_threshold_field_map (deduplicated from test file)
│ ├── dom_state_db_reader.py # _parse_hgetall_output, _read_state_db_hash,
│ │ # _parse_numeric, _parse_update_time
│ └── dom_health_check.py # xcvrd/core-file/log guards (TC999 equivalent)
├── test_dom_availability.py
├── test_dom_consistency.py
├── test_dom_operational_range.py
└── test_dom_threshold.py

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I’ll restructure conftest.py that way and move the helpers/ constants into utils/.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I have restructured tests/transceiver/dom/conftest.py to keep fixtures only,
and moved shared helpers into tests/transceiver/dom/utils/:

  • dom_constants.py
  • dom_field_mapper.py
  • dom_state_db_reader.py
  • dom_health_check.py

The DOM test files now import shared constants/mappers/readers from utils
instead of duplicating helper logic.

def _parse_consistency_variation_thresholds(dom_attrs):
thresholds = {}
errors = []
for attr_name in CONSISTENCY_VARIATION_THRESHOLD_ATTRS:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavidHuang666 Please ensure that we do not throw this error for optional attributes. Can you please review the HLD and update this accordingly?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. I’ll review the HLD and update the logic so missing optional attributes are skipped instead of being reported as errors.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I reviewed the DOM HLD/test plan and updated the TC4 consistency threshold parsing so missing optional variation threshold attributes are skipped instead of treated as errors. Configured but invalid threshold values still fail validation.

Signed-off-by: david <david.huang@eoptolink.com>
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@github-actions github-actions Bot requested a review from mihirpat1 April 16, 2026 09:08
@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

…contains pytest fixtures and shared logic lives under tests/transceiver/dom/utils/.2.Add shared DOM utility modules for constants, field mapping, STATE_DB parsing/reading, and DOM health checks. Deduplicate operational-range and threshold field mapping logic across DOM tests.3.Add explicit dom_health_guard usage to TC1-TC4 so basic DOM tests run pre-test health checks and post-test cleanup validation. The guard covers critical services, xcvrd status, core-file detection, DOM/syslog and I2C log scanning, transceiver presence, STATE_DB TRANSCEIVER_INFO, link state, LLDP when enabled, DOM polling state, sensor freshness, and post-test reporting.4.Update TC4 consistency validation so missing optional consistency variation thresholds are skipped instead of treated as failures, while configured invalid threshold values still fail.

Signed-off-by: david <david.huang@eoptolink.com>
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

}

for port in ports:
if port not in stdout:

Copilot AI Apr 21, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLDP neighbor detection uses substring matching (if port not in stdout), which can yield false positives (e.g., Ethernet1 matches the Ethernet10 row). This can incorrectly pass validation when a port is missing neighbors. Prefer parsing show lldp table output (e.g., duthost.show_and_parse) or at least matching the local port column with a regex/word-boundary anchored to the line start.

Suggested change
if port not in stdout:
port_pattern = r"(?m)^\s*{}(?:\s+|$)".format(re.escape(port))
if not re.search(port_pattern, stdout):

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +105 to +107
for poll_idx in range(1, poll_count):
time.sleep(poll_interval_sec)
current = read_sensor(port)

Copilot AI Apr 21, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consistency test sleeps poll_interval_sec inside the per-port loop, so total runtime scales as O(num_ports * poll_count * poll_interval_sec). With typical defaults (e.g., 3 polls at 60s) and dozens of ports, this can add tens of minutes or more to a run. Consider polling all ports per interval (baseline read for all ports, then sleep once per poll, then read/validate all ports) to keep total sleep time proportional to poll_count rather than port count.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavidHuang666 Can you please address all the open comments from Github Copilot?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I’ll take care of them.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Comment on lines +52 to +75
def read_state_db_hash(duthost, key):
"""Read one STATE_DB hash, trying sonic-db-cli first and redis-cli as fallback.

Args:
duthost: DUT host fixture used to execute database commands.
key: STATE_DB hash key to read.

Returns:
dict: Parsed hash field/value pairs, or an empty dict if the key cannot be read.
"""
commands = [
'sonic-db-cli STATE_DB HGETALL "{}"'.format(key),
'redis-cli --raw -n 6 HGETALL "{}"'.format(key),
]

for cmd in commands:
result = duthost.command(cmd, module_ignore_errors=True)
if result.get("rc", 1) != 0:
continue
parsed = parse_hgetall_output(result.get("stdout_lines", []))
if parsed:
return parsed

return {}

Copilot AI Apr 21, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_state_db_hash() only queries the default namespace (sonic-db-cli STATE_DB ... / redis-cli -n 6 ...). On multi-ASIC systems, STATE_DB keys can live under per-ASIC namespaces, and other tests handle this by iterating duthost.frontend_asics and trying sonic-db-cli -n <namespace> ... until data is found (e.g., redis_hgetall() in tests/snmp/test_snmp_phy_entity.py). Without namespace-aware lookup, these DOM tests can consistently return empty hashes and fail on multi-ASIC platforms. Consider adding namespace iteration (or accepting a namespace/asic parameter and wiring it through fixtures) before falling back to the non-namespaced query.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@DavidHuang666 DavidHuang666 deleted the transceiver_dom_test branch May 26, 2026 12:41
@DavidHuang666 DavidHuang666 restored the transceiver_dom_test branch May 26, 2026 12:42
@DavidHuang666 DavidHuang666 reopened this May 26, 2026
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Can you please accommodate the changes merged via #24184

Sure, I will accommodate the changes merged via #24184 as requested.

Comment thread tests/transceiver/dom/utils/dom_health_check.py Fixed
@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Can you please accommodate the changes merged via #24184

@mihirpat1 I will try to update My PR based on PR #24184 next Friday.

@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Can you please accommodate the changes merged via #24184

Hi @mihirpat1, I noticed that file_organization.md in the sonic-net/sonic-mgmt master branch is different from the version in PR #24184. Should I follow the current sonic-net/sonic-mgmt master branch when implementing the DOM test code?

@mihirpat1

Copy link
Copy Markdown
Contributor

@DavidHuang666 Please follow the latest version from master since it contains 1 additional commit.

@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Please follow the latest version from master since it contains 1 additional commit.

OK. After completing the verification, we will try to update PR #23515 based on the latest version from master within this week.

@mihirpat1

Copy link
Copy Markdown
Contributor

MSFT ADO - 38346195

Signed-off-by: david <david.huang@eoptolink.com>
Signed-off-by: david <david.huang@eoptolink.com>
@mihirpat1 mihirpat1 added the Transceiver Test Dev 🔌 Action item for transceiver test development label Jun 10, 2026
Signed-off-by: david <david.huang@eoptolink.com>
Keep docs/testplan/transceiver and tests/transceiver from transceiver_dom_test while aligning all other paths with upstream/master.

Signed-off-by: david <david.huang@eoptolink.com>
Keep docs/testplan/transceiver and tests/transceiver from transceiver_dom_test while aligning all other paths with latest upstream/master.

Signed-off-by: david <david.huang@eoptolink.com>
@DavidHuang666 DavidHuang666 force-pushed the transceiver_dom_test branch from 780f283 to 786a014 Compare June 12, 2026 03:34
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: david <david.huang@eoptolink.com>
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Please follow the latest version from master since it contains 1 additional commit.
Hi @mihirpat1 , I have updated the PR based on the latest master. Please help to review it.

@mihirpat1

Copy link
Copy Markdown
Contributor

@DavidHuang666 Please attach the output of ./run_tests.sh -c <test_path> -n <dut_name> -i ../ansible/lab,../ansible/veos -u -e " --skip_sanity --html=.html --self-contained-html"

@mihirpat1

Copy link
Copy Markdown
Contributor

@DavidHuang666 Please fix the PR check failures.

all_failures.append("{}:\n missing consistency validation plan".format(port))
continue

dom_attrs = plan["dom_attrs"]
@DavidHuang666

Copy link
Copy Markdown
Author

@DavidHuang666 Please attach the output of ./run_tests.sh -c <test_path> -n <dut_name> -i ../ansible/lab,../ansible/veos -u -e " --skip_sanity --html=.html --self-contained-html"

ok,I will attach the output next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Transceiver Test Dev 🔌 Action item for transceiver test development

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants