[pull] master from tensorflow:master by pull[bot] · Pull Request #1687 · GesuBackups/tensorflow

pull · 2026-04-02T13:29:15Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


Corrected minor grammatical issues and formatting in installation instructions.

In CustomCallThunk, when `cpu_target_machine_options` is not provided, pass a pointer to a default-constructed `xla::cpu::TargetMachineOptions` instead of `nullptr`. The default-constructed options autodetect the current host triple and CPU name. PiperOrigin-RevId: 893340005
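The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the XLA code: `TargetMachineOptionsSketch`, `DescribeTarget`, `InstantiateOld` and `InstantiateNew` are hypothetical names, and the placeholder strings stand in for the autodetection that the real `xla::cpu::TargetMachineOptions` is said to perform.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for xla::cpu::TargetMachineOptions. The real class
// autodetects the host; here the defaults are fixed placeholder strings.
struct TargetMachineOptionsSketch {
  std::string triple = "host-triple-placeholder";
  std::string cpu = "host-cpu-placeholder";
};

// Hypothetical consumer that receives the options through a pointer.
std::string DescribeTarget(const TargetMachineOptionsSketch* options) {
  if (options == nullptr) return "<no target info>";
  return options->triple + "/" + options->cpu;
}

// Before: a missing value was forwarded as nullptr.
std::string InstantiateOld(
    const std::optional<TargetMachineOptionsSketch>& provided) {
  return DescribeTarget(provided.has_value() ? &*provided : nullptr);
}

// After: a missing value falls back to default-constructed options, so the
// consumer always sees a valid pointer.
std::string InstantiateNew(
    const std::optional<TargetMachineOptionsSketch>& provided) {
  TargetMachineOptionsSketch defaults;
  return DescribeTarget(provided.has_value() ? &*provided : &defaults);
}
```

With this shape, a handler on the receiving side no longer needs a null check to learn something about the target.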

…de-registration until the module execution is completed. Also added delayed test-cases for both symmetric memory and peer parameters cases. PiperOrigin-RevId: 893340921

… MIOpen autotuning
Imported from GitHub PR openxla/xla#39622
📝 Summary of Changes
Pass down device_allocator from gpu_compiler into the MIOpen backend instead of constructing one when getting algorithms.
🎯 Justification
Allows the MIOpen backend to allocate larger scratch buffers, because it is no longer limited to the free memory that is not reserved by BFCAllocator.
🚀 Kind of Contribution
🐛 Bug Fix
📊 Benchmark (for Performance Improvements)
N/A
🧪 Unit Tests: N/A
🧪 Execution Tests: N/A
Copybara import of the project:
-- 0113da6439b85f745d8114d9377c25ade4cea2e1 by Dragan Mladjenovic <Dragan.Mladjenovic@amd.com>: [ROCm] Use BFCAllocator for scratch allocations needed for MIOpen autotuning
Merging this change closes #39622
PiperOrigin-RevId: 893361718

Imported from GitHub PR openxla/xla#39843
Bumps [jwalton/gh-find-current-pr](https://github.com/jwalton/gh-find-current-pr) from 1.3.3 to 1.3.5.
Release notes (sourced from [jwalton/gh-find-current-pr's releases](https://github.com/jwalton/gh-find-current-pr/releases)):
- v1.3.5 (2026-03-15), Bug Fixes: Bump to node24. ([db0c647](https://github.com/jwalton/gh-find-current-pr/commit/db0c647679ec9fd2ff8950ac8b66be6d579d17d1))
- v1.3.4, What's Changed: Update action to use Node.js 22 by `@larshp` in jwalton/gh-find-current-pr#120. New contributor: `@larshp`.
- Full Changelog: https://github.com/jwalton/gh-find-current-pr/compare/v1...v1.3.4
Commits:
- `f3d61b4` chore(release): 1.3.5 [skip ci]
- `db0c647` fix: Bump to node24.
- `6aa9317` Merge pull request #120 from larshp/patch-1
- `e2e9ed4` Update action to use Node.js 22
- See full diff in the [compare view](https://github.com/jwalton/gh-find-current-pr/compare/89ee5799558265a1e0e31fab792ebb4ee91c016b...f3d61b485d2801773f7a07b2aaa3306bd8f8e653)
Copybara import of the project:
-- c730fd8205aa0ffb3c2d8b16daf71090d1e1c685 by dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>: Bump jwalton/gh-find-current-pr from 1.3.3 to 1.3.5
Signed-off-by: dependabot[bot] <support@github.com>
Merging this change closes #39843
PiperOrigin-RevId: 893377229

Here we are trying to make it as similar as possible to AffineExpr to minimize the number of tests that we need to modify in the migration. PiperOrigin-RevId: 893379056

Imported from GitHub PR openxla/xla#39951 Add missing collective op. Copybara import of the project: -- 8a92831cebf93b08e2cec11bb5dcbe6bb5e6a755 by Eugene Zhulenev <ezhulenev@openxla.org>: [xla:gpu] Add missing scheduling for all-gather-start Merging this change closes #39951 PiperOrigin-RevId: 893383177

Failing tests were fixed in cl/892275092. Tests are passing now.
Original message: Partial migration of IndexingMap::GetAffineMap. Some users still remain to be migrated, but I do not want to make a giant CL. The changes include:
- Updating call sites in XLA GPU backend emitters (reduction, scatter, transpose).
- Updating HWIR's hlo_expansion.cc to use SymbolicMap.
- Updating various files in /emitters...
Reverts 00af54d
PiperOrigin-RevId: 893384506

…t of splitk) into one when rewriting dot to cuBLAS. The CanCublasHandleGemm() already returned true for such dots even though they are not supported. Also remove normalization logic from the SplitK rewriter as Triton emitter handles it fine. PiperOrigin-RevId: 893406725

PiperOrigin-RevId: 893411151

Partial migration of the deprecated IndexingMap::GetAffineMap and IndexingMap constructors. This CL replaces the MLIR AffineMap and AffineExpr types with the custom XLA types SymbolicMap and SymbolicExpr within the XLA codegen attrs, ops and transformation passes. PiperOrigin-RevId: 893411269

When a user calls into the compiler, there are 2 ways to provide the CPU target architecture: either through a field in `Compiler::CompileOptions` or through a field in `GpuTopology` (which is the new way). The current logic has only been looking into `Compiler::CompileOptions`, so this change unifies all usages under `GpuTopology`, the new way.
1. Adjusts InferGpuTopology to also take the CPU target config from CompileOptions into account.
2. Makes the legacy AOT flow use InferGpuTopology.
3. Hands down GpuTopology into CompileBackendResult so that the target config can be accessed from there.
4. Gives the target config to the host offloading XLA:CPU compilation call.
5. Adds an integration test that ensures the target compile options reach the CustomCall FFI Instantiate handler.
PiperOrigin-RevId: 893417109
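The idea of folding the legacy field into the topology can be sketched as below. All names here (`CompileOptionsSketch`, `GpuTopologySketch`, `InferGpuTopologySketch`, `cpu_target`) are hypothetical, and the precedence rule (a value already on the topology wins) is an assumption made for illustration; the commit message does not state which source takes priority.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins; the real Compiler::CompileOptions and GpuTopology
// carry far more state than a single optional string.
struct CompileOptionsSketch {
  std::optional<std::string> cpu_target;  // legacy location
};

struct GpuTopologySketch {
  std::optional<std::string> cpu_target;  // new location
};

// Folds the legacy field into the topology so later stages read one place.
// Assumption: a value already present on the topology wins over the legacy one.
GpuTopologySketch InferGpuTopologySketch(const CompileOptionsSketch& options,
                                         GpuTopologySketch topology) {
  if (!topology.cpu_target.has_value()) {
    topology.cpu_target = options.cpu_target;
  }
  return topology;
}
```

Downstream code in this sketch would then consult only the returned topology, which is the kind of single source of truth the change aims for.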

These are transitional/forwarding headers that should be automatically removed when code is refactored. PiperOrigin-RevId: 893421594

Imported from GitHub PR openxla/xla#40252 Reduce the number of global thread pools in JAX/XLA by using global hang watchdog. Copybara import of the project: -- 93244f9fad533f5f97f484f88730580a8f04c440 by Eugene Zhulenev <ezhulenev@openxla.org>: [xla:gpu] Use Global HangWatchdog in se_gpu_pjrt_client Merging this change closes #40252 PiperOrigin-RevId: 893431466

HloVerifier cannot assume anything about the provided HLO and should validate it instead of crashing. Fix the same kind of bug for ReduceScatter. PiperOrigin-RevId: 893432458
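The "validate instead of crashing" principle can be illustrated with a toy check. This is only a sketch: `DimsSketch` and `VerifyAllGatherSketch` are hypothetical, the rules checked are simplified examples rather than the verifier's actual rules, and the real code reports problems through XLA's status types rather than `std::optional<std::string>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy shape: just a list of dimension sizes (not xla::Shape).
using DimsSketch = std::vector<int64_t>;

// Returns std::nullopt when the toy all-gather is well formed, otherwise a
// description of the first problem found. Malformed input never aborts.
std::optional<std::string> VerifyAllGatherSketch(const DimsSketch& operand,
                                                 const DimsSketch& result,
                                                 int64_t gather_dim) {
  if (operand.size() != result.size()) {
    return std::string("operand and result ranks differ");
  }
  // Check the index before using it, instead of assuming it is in range.
  if (gather_dim < 0 || gather_dim >= static_cast<int64_t>(operand.size())) {
    return std::string("all-gather dimension is out of range");
  }
  for (std::size_t i = 0; i < operand.size(); ++i) {
    if (static_cast<int64_t>(i) != gather_dim && operand[i] != result[i]) {
      return std::string("non-gathered dimension ") + std::to_string(i) +
             " changed size";
    }
  }
  return std::nullopt;
}
```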

PiperOrigin-RevId: 893433224

…uring tiling. During emission we need to map the affine symbols that correspond to the RT vars to the TiledHloInstructions and then to the emitted TensorValues for them. We can accumulate the tiled HLOs for the RT vars when assembling the TiledHloComputation. PiperOrigin-RevId: 893441229

It is not needed; the test passes well within 5 minutes on all backends. PiperOrigin-RevId: 893442970

The `CompileOptions` parameter was not used in the implementations of `CompileTargetBinary` in `AMDGPUCompiler` and `NVPTXCompiler`. This change removes the parameter from the method signature in `GpuCompiler` and its subclasses. PiperOrigin-RevId: 893446781

This change refactors indexing_analysis and related emitters to use xla::SymbolicExpr and xla::SymbolicMap instead of mlir::AffineExpr and mlir::AffineMap. Migrating indexing_analysis required cascading updates across multiple files; this is the minimal change required to completely remove all Affine references from indexing_analysis.
Key changes:
- Symbolic Representation: Replaced AffineMap and AffineExpr with SymbolicMap and SymbolicExpr across CPU and GPU emitters (including fusion, scatter, transpose, and reduction), tiling schedules, and stablehlo indexing analysis.
- API Refactoring: Migrated MLIR factory calls (e.g., getAffineDimExpr, getAffineConstantExpr) to XLA's symbolic factory functions (e.g., CreateDimExpr, CreateSymbolicConstant, CreateSymbolExpr, SymbolicMap::Get). I had many headaches migrating CreateSymbolExpr.
- Operation Syntax: Replaced explicit .floorDiv() calls with the overloaded / operator for SymbolicExpr evaluation, and updated dimension replacements to use ReplaceDims (more headaches).
Note: This change causes several tests to fail, but they were already addressed in previously reviewed CLs (cl/884921505, cl/884981099, cl/885025521 and many more).
Benchmark: Everything seems flat towards positive (gpaste/5804159988269056):
- Device Time (denoised): 1.00x (very close to 1.01x - 1.004x)
- Device Time for XLA Codegened / Library Kernels (denoised): 1.01x
- Total Compile Time: 1.00x (mostly over 1.00 - 1.002x)
PiperOrigin-RevId: 893452848
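The "Operation Syntax" point, where `/` stands in for an explicit floor division, can be illustrated with a toy expression type. This is a sketch under stated assumptions: `ExprSketch`, `DimSketch`, `ConstantSketch` and `FloorDivSketch` are hypothetical and unrelated to the actual `xla::SymbolicExpr` implementation, which builds a symbolic tree rather than closures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Floor division: rounds toward negative infinity, unlike C++ integer '/'.
int64_t FloorDivSketch(int64_t lhs, int64_t rhs) {
  assert(rhs != 0 && "division by zero");
  int64_t quotient = lhs / rhs;
  if (lhs % rhs != 0 && ((lhs < 0) != (rhs < 0))) --quotient;
  return quotient;
}

// Toy expression: a closure evaluated against concrete dimension values.
struct ExprSketch {
  std::function<int64_t(const std::vector<int64_t>&)> eval;
};

ExprSketch DimSketch(std::size_t index) {
  return {[index](const std::vector<int64_t>& dims) { return dims.at(index); }};
}

ExprSketch ConstantSketch(int64_t value) {
  return {[value](const std::vector<int64_t>&) { return value; }};
}

ExprSketch operator+(ExprSketch a, ExprSketch b) {
  return {[a, b](const std::vector<int64_t>& dims) {
    return a.eval(dims) + b.eval(dims);
  }};
}

ExprSketch operator*(ExprSketch a, ExprSketch b) {
  return {[a, b](const std::vector<int64_t>& dims) {
    return a.eval(dims) * b.eval(dims);
  }};
}

// The overloaded '/' means floor division, mirroring the .floorDiv() it replaces.
ExprSketch operator/(ExprSketch a, ExprSketch b) {
  return {[a, b](const std::vector<int64_t>& dims) {
    return FloorDivSketch(a.eval(dims), b.eval(dims));
  }};
}
```

In this toy, an indexing expression such as `(DimSketch(0) * ConstantSketch(4) + DimSketch(1)) / ConstantSketch(8)` reads like ordinary arithmetic while still rounding toward negative infinity for negative operands.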

… HLO verifier
Imported from GitHub PR openxla/xla#40232
## Summary
Fixes #40191
`VerifyAsynchronousInstructionPairs` checks that async Start/Done instructions are properly connected for all async op types except `kAllGatherStart/kAllGatherDone`. This adds the missing verification, following the same pattern used for `kAllReduceStart/kAllReduceDone`.
## Changes
- `hlo_verifier.cc`: Add `kAllGatherStart` and `kAllGatherDone` cases to the async pair verification switch
- `hlo_verifier_test.cc`: Add tests for valid AllGather Start/Done pair and invalid multiple-Done case
## Test plan
- [ ] CI verification
- [x] Tests follow existing AllReduce verifier test patterns
- [x] Follow Google C++ style
Copybara import of the project:
-- 30b3bd3e40e9fbd25f5336febefe6f06fa6e0346 by Manish Reddy <kreddy.manish@gmail.com>: Add missing kAllGatherStart/kAllGatherDone verification to HLO verifier. Fixes #40191
-- 8a46e9c7312dad8f8f47e52c9b1548456cee934f by Manish Reddy <kreddy.manish@gmail.com>: Use ParseAndReturnVerifiedModule for valid AllGather test case.
Merging this change closes #40232
PiperOrigin-RevId: 893456297
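A Start/Done pairing check of the kind described here might look like the toy below. It is a sketch only: `OpcodeSketch`, `InstrSketch` and `VerifyAllGatherPairsSketch` are hypothetical, instructions are modeled as a flat list with a single operand index, and the "exactly one Done per Start" rule is an assumption inferred from the "invalid multiple-Done case" mentioned in the test list.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

enum class OpcodeSketch { kAllGatherStart, kAllGatherDone, kOther };

struct InstrSketch {
  OpcodeSketch opcode;
  int operand;  // index of the producing instruction, or -1 for none
};

// Returns std::nullopt when every Done consumes a Start and every Start has
// exactly one Done user (assumed rule), otherwise an error description.
std::optional<std::string> VerifyAllGatherPairsSketch(
    const std::vector<InstrSketch>& instrs) {
  std::vector<int> done_users(instrs.size(), 0);
  for (const InstrSketch& instr : instrs) {
    if (instr.opcode != OpcodeSketch::kAllGatherDone) continue;
    const bool in_range =
        instr.operand >= 0 &&
        static_cast<std::size_t>(instr.operand) < instrs.size();
    if (!in_range ||
        instrs[instr.operand].opcode != OpcodeSketch::kAllGatherStart) {
      return std::string("all-gather-done operand is not an all-gather-start");
    }
    ++done_users[instr.operand];
  }
  for (std::size_t i = 0; i < instrs.size(); ++i) {
    if (instrs[i].opcode == OpcodeSketch::kAllGatherStart &&
        done_users[i] != 1) {
      return std::string(
          "all-gather-start must have exactly one all-gather-done user");
    }
  }
  return std::nullopt;
}
```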

PiperOrigin-RevId: 893465860

…ipblaslt
Imported from GitHub PR openxla/xla#39373
📝 Summary of Changes
Prevent usage of hipblaslt for gemms with bf16 or f16 with f32.
🎯 Justification
The hipblaslt custom call was generated although it is not supported on mi200.
🚀 Kind of Contribution
Please remove what does not apply: 🐛 Bug Fix, 🧪 Tests
📊 Benchmark (for Performance Improvements)
Please measure and include speedups for one of the public HLOs in `compiler/xla/tools/benchmarks/hlo/`.
🧪 Unit Tests: Added CheckCustomCallHipblasLtBF16 test into gemm_rewriter_test
🧪 Execution Tests: What execution tests were added? For example, a new optimization should be tested with an end-to-end execution test triggering the optimization and asserting correctness. Please provide test cases running with at most 2 GPUs.
Copybara import of the project:
-- 7884a994ece5c761e62b1363a8be73680502a29a by Zoran Jovanovic <zjovanov@amd.com>: [ROCm] Fix issue with unsupported types combinations for hipblaslt
-- 8c6ec922e141ac1ee0e46b6cf65ecdcb06c97011 by Zoran Jovanovic <zjovanov@amd.com>: Skip CheckCustomCallHipblasLtBF16 test for non ROCm archs.
Merging this change closes #39373
PiperOrigin-RevId: 893469347

vamshikiran065-jpg and others added 23 commits March 13, 2026 13:27

Fix grammar and formatting in README installation section

76f9f18

Corrected minor grammatical issues and formatting in installation instructions.

[XLA:GPU] Add symmetric memory to collective memory cache to prevent …

01aa15c

…de-registration until the module execution is completed. Also added delayed test-cases for both symmetric memory and peer parameters cases. PiperOrigin-RevId: 893340921

Improve parentheses in SymbolicExpr serialization

386b3e5

Here we are trying to make it as similar as possible to AffineExpr to minimize the number of tests that we need to modify in the migration. PiperOrigin-RevId: 893379056

Merge pull request #112288 from vamshikiran065-jpg:master

07be721

PiperOrigin-RevId: 893411151

[NFC] Annotate forwarding headers with INLINER_FORWARD_TO.

dfbfe3e

These are transitional/forwarding headers that should be automatically removed when code is refactored. PiperOrigin-RevId: 893421594

Avoid crashing in HloVerifier for unexpected shapes in AllGather.

e2c9f36

HloVerifier cannot assume anything about the provided HLO and should validate it instead of crashing. Fix the same kind of bug for ReduceScatter. PiperOrigin-RevId: 893432458

[XLA:GPU] Introduce EmitterContext to group the params.

86788d3

PiperOrigin-RevId: 893433224

Remove test timeout long from prng_test.

69b74f6

It is not needed; the test passes well within 5 minutes on all backends. PiperOrigin-RevId: 893442970

Automated Code Change

6185942

PiperOrigin-RevId: 893465860

pull Bot locked and limited conversation to collaborators Apr 2, 2026

pull Bot added the ⤵️ pull label Apr 2, 2026

pull Bot merged commit 45f557d into GesuBackups:master Apr 2, 2026

pull[bot] merged 23 commits into GesuBackups:master from tensorflow:master

13 participants
