[rocm-libraries] ROCm/rocm-libraries#7518 (commit 2d260f3)

evedovelli · assistant-librarian[bot] · commit bbebfcd16125 · 2026-05-19T15:20:17.000Z
[hiptensor] Add hipTensor support on gfx1250

## Motivation

Add hipTensor support on gfx1250.

## Technical Details

NA

## Test Plan

NA

## Test Result

NA

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

AIHIPTENS-2
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,12 @@
 
 Full documentation for hipTensor is available at [rocm.docs.amd.com/projects/hiptensor](https://rocm.docs.amd.com/projects/hipTensor/en/latest/index.html).
 
+## Since last release ROCm 7.13
+
+### Added
+
+* Added support for new GPU target gfx1250.
+
 ## Since last release ROCm 7.12
 
 ### Added
@@ -11,13 +17,13 @@ Full documentation for hipTensor is available at [rocm.docs.amd.com/projects/hip
   * gfx11: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153.
   * gfx12: gfx1200, gfx1201.
 * Added unary element-wise operators to contraction, including the new `BilinearUnary` class, dedicated instances, samples, and tests.
-* Added Dockerfiles (prebuilt and full build) and documentation to streamline hipTensor build environment setup. 
+* Added Dockerfiles (prebuilt and full build) and documentation to streamline hipTensor build environment setup.
 * Added the `CREATE_TEST_APP_LOCAL_DEPLOY` CMake option to stage required ROCm DLLs on Windows, and updated the Windows build documentation accordingly.
 
 ### Changed
 * Replaced numeric UID-based actor-critic kernel lookup with platform-stable string-based kernel name comparison to enable cross-platform compatibility.
 * Adopted FNV-1a string hashing in place of `std::hash` to ensure plan cache files are portable across platforms.
-* Switched to ROCm-provided CMake install functions (`rocm_export_targets`) for consistency with other ROCm libraries. 
+* Switched to ROCm-provided CMake install functions (`rocm_export_targets`) for consistency with other ROCm libraries.
 * Adapted hipTensor to CK namespace changes for `host_tensor` functions.
 * Cleaned up `rtest` script formatting and removed invalid run commands from `rtest.xml`.
 
diff --git a/README.md b/README.md
@@ -9,11 +9,11 @@ Welcome! hiptensor is AMD's C++ library for accelerating tensor primitives using
 
 hipTensor currently supports the following AMDGPU architectures:
 
-* CDNA class GPU featuring matrix core support: gfx908, gfx90a, gfx942, gfx950 as 'gfx9'.
-* RDNA class GPU featuring matrix core support: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200 and gfx1201.
+* CDNA class GPU featuring matrix core support: gfx908, gfx90a, gfx942, gfx950 as 'gfx9'
+* RDNA class GPU featuring matrix core support: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200, gfx1201 and gfx1250.
 
 > [!NOTE]
-> Double precision FP64 datatype support requires gfx90a, gfx942 or gfx950.
+> Double precision FP64 datatype support requires gfx90a, gfx942, gfx950 or gfx1250.
 
 Dependencies:
 
diff --git a/cmake/Functions/hiptensorSupportedArchitectures.cmake b/cmake/Functions/hiptensorSupportedArchitectures.cmake
@@ -14,6 +14,7 @@ set(SUPPORTED_ARCHITECTURES
     gfx11-generic
     gfx1200
     gfx1201
+    gfx1250
     gfx12-generic
 )
 
diff --git a/docs/api-reference/api-reference.rst b/docs/api-reference/api-reference.rst
@@ -38,10 +38,11 @@ List of supported RDNA architectures:
 * gfx1153
 * gfx1200
 * gfx1201
+* gfx1250
 
 .. note::
     gfx11 = gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153
-    gfx12 = gfx1200, gfx1201
+    gfx12 = gfx1200, gfx1201, gfx1250
 
 .. _hiptensor-supported-data-types:
 
@@ -82,7 +83,7 @@ Data Types **<Ti / To / Tc>** = <Input type / Output Type / Compute Type>, where
 |                     +------------------------------+                     |                     |
 |                     |     f32 / f32 / bf16         |                     | 4m4n4k (Rank8)      |
 |                     +------------------------------+---------------------+                     |
-|                     |     f16 / f16 / f32          |  gfx9               | 5m5n5k (Rank10)     |
+|                     |     f16 / f16 / f32          |  gfx9 gfx1250       | 5m5n5k (Rank10)     |
 |                     +------------------------------+                     |                     |
 |                     |     bf16 / bf16 / f32        |                     | 6m6n6k (Rank12)     |
 |                     +------------------------------+                     |                     |
diff --git a/docs/install/installation.rst b/docs/install/installation.rst
@@ -73,7 +73,7 @@ including the gfx908, gfx90a, gfx942, and gfx950 GPUs (collectively labeled as g
 
 Additionally, hipTensor is supported on AMD RDNA GPUs:
  - gfx11-generic: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152 and gfx1153.
- - gfx12-generic; gfx1200 and gfx1201.
+ - gfx12-generic: gfx1200, gfx1201 and gfx1250.
 
 .. note::
 
@@ -180,7 +180,7 @@ Here are the available options to build the hipTensor library, with or without c
         -   **Default value**
     *   -   ``GPU_TARGETS``
         -   Build the code for specific GPU target(s)
-        -   ``gfx908``; ``gfx90a``; ``gfx942``; ``gfx950``; ``gfx1100``; ``gfx1101``; ``gfx1102``; ``gfx1103``; ``gfx1150``; ``gfx1151``; ``gfx1152``; ``gfx1153``; ``gfx11-generic``; ``gfx1200``; ``gfx1201``; ``gfx12-generic``
+        -   ``gfx908``; ``gfx90a``; ``gfx942``; ``gfx950``; ``gfx1100``; ``gfx1101``; ``gfx1102``; ``gfx1103``; ``gfx1150``; ``gfx1151``; ``gfx1152``; ``gfx1153``; ``gfx11-generic``; ``gfx1200``; ``gfx1201``; ``gfx1250``; ``gfx12-generic``
     *   -   ``HIPTENSOR_BUILD_TESTS``
         -   Build the tests
         -   ``ON``
diff --git a/library/include/hiptensor/internal/config.hpp b/library/include/hiptensor/internal/config.hpp
@@ -45,6 +45,7 @@ namespace hiptensor
 /// HIPTENSOR_ARCH_GFX1153
 /// HIPTENSOR_ARCH_GFX1200
 /// HIPTENSOR_ARCH_GFX1201
+/// HIPTENSOR_ARCH_GFX1250
 #if defined(__gfx908__)
 #define HIPTENSOR_ARCH_GFX908 __gfx908__
 #elif defined(__gfx90a__)
@@ -73,6 +74,8 @@ namespace hiptensor
 #define HIPTENSOR_ARCH_GFX1200 __gfx1200__
 #elif defined(__gfx1201__)
 #define HIPTENSOR_ARCH_GFX1201 __gfx1201__
+#elif defined(__gfx1250__)
+#define HIPTENSOR_ARCH_GFX1250 __gfx1250__
 #else
 #define HIPTENSOR_ARCH_HOST 1
 #endif
@@ -119,6 +122,9 @@ namespace hiptensor
 #if !defined(HIPTENSOR_ARCH_GFX1201)
 #define HIPTENSOR_ARCH_GFX1201 0
 #endif
+#if !defined(HIPTENSOR_ARCH_GFX1250)
+#define HIPTENSOR_ARCH_GFX1250 0
+#endif
 #if !defined(HIPTENSOR_ARCH_HOST)
 #define HIPTENSOR_ARCH_HOST 0
 #endif
diff --git a/library/src/hip_device.cpp b/library/src/hip_device.cpp
@@ -102,6 +102,10 @@ namespace hiptensor
         {
             mGcnArch = hipGcnArch_t::GFX1201;
         }
+        else if(deviceName.find("gfx1250") != std::string::npos)
+        {
+            mGcnArch = hipGcnArch_t::GFX1250;
+        }
 
         switch(mProps.warpSize)
         {
@@ -135,7 +139,8 @@ namespace hiptensor
     {
         return (mGcnArch == HipDevice::hipGcnArch_t::GFX90A
                 || mGcnArch == HipDevice::hipGcnArch_t::GFX942
-                || mGcnArch == HipDevice::hipGcnArch_t::GFX950);
+                || mGcnArch == HipDevice::hipGcnArch_t::GFX950
+                || mGcnArch == HipDevice::hipGcnArch_t::GFX1250);
     }
 
     bool HipDevice::matrixCoreSupport(hiptensorComputeDescriptor_t typeCompute) const
@@ -152,7 +157,8 @@ namespace hiptensor
             return (mGcnArch == HipDevice::hipGcnArch_t::GFX908
                     || mGcnArch == HipDevice::hipGcnArch_t::GFX90A
                     || mGcnArch == HipDevice::hipGcnArch_t::GFX942
-                    || mGcnArch == HipDevice::hipGcnArch_t::GFX950);
+                    || mGcnArch == HipDevice::hipGcnArch_t::GFX950
+                    || mGcnArch == HipDevice::hipGcnArch_t::GFX1250);
         case HIPTENSOR_COMPUTE_DESC_16F:
         case HIPTENSOR_COMPUTE_DESC_16BF:
             return (mGcnArch == HipDevice::hipGcnArch_t::GFX908
@@ -168,7 +174,8 @@ namespace hiptensor
                     || mGcnArch == HipDevice::hipGcnArch_t::GFX1152
                     || mGcnArch == HipDevice::hipGcnArch_t::GFX1153
                     || mGcnArch == HipDevice::hipGcnArch_t::GFX1200
-                    || mGcnArch == HipDevice::hipGcnArch_t::GFX1201);
+                    || mGcnArch == HipDevice::hipGcnArch_t::GFX1201
+                    || mGcnArch == HipDevice::hipGcnArch_t::GFX1250);
         default:
             return false;
         }
diff --git a/library/src/include/hip_device.hpp b/library/src/include/hip_device.hpp
@@ -51,6 +51,7 @@ namespace hiptensor
             GFX1153          = 0x1153,
             GFX1200          = 0x1200,
             GFX1201          = 0x1201,
+            GFX1250          = 0x1250,
             UNSUPPORTED_ARCH = 0x0,
         };
 
diff --git a/test/utils.hpp b/test/utils.hpp
@@ -135,7 +135,8 @@ inline bool isF16Supported()
            || (deviceName.find("gfx1152") != std::string::npos)
            || (deviceName.find("gfx1153") != std::string::npos)
            || (deviceName.find("gfx1200") != std::string::npos)
-           || (deviceName.find("gfx1201") != std::string::npos);
+           || (deviceName.find("gfx1201") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF32Supported()
@@ -161,7 +162,8 @@ inline bool isF32Supported()
            || (deviceName.find("gfx1152") != std::string::npos)
            || (deviceName.find("gfx1153") != std::string::npos)
            || (deviceName.find("gfx1200") != std::string::npos)
-           || (deviceName.find("gfx1201") != std::string::npos);
+           || (deviceName.find("gfx1201") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF64Supported()
@@ -176,7 +178,8 @@ inline bool isF64Supported()
 
     return (deviceName.find("gfx90a") != std::string::npos)
            || (deviceName.find("gfx942") != std::string::npos)
-           || (deviceName.find("gfx950") != std::string::npos);
+           || (deviceName.find("gfx950") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF16F16MatrixCoreSupported()
@@ -198,7 +201,8 @@ inline bool isF16F16MatrixCoreSupported()
            || (deviceName.find("gfx1152") != std::string::npos)
            || (deviceName.find("gfx1153") != std::string::npos)
            || (deviceName.find("gfx1200") != std::string::npos)
-           || (deviceName.find("gfx1201") != std::string::npos);
+           || (deviceName.find("gfx1201") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF32F16MatrixCoreSupported()
@@ -224,7 +228,8 @@ inline bool isF32F16MatrixCoreSupported()
            || (deviceName.find("gfx1152") != std::string::npos)
            || (deviceName.find("gfx1153") != std::string::npos)
            || (deviceName.find("gfx1200") != std::string::npos)
-           || (deviceName.find("gfx1201") != std::string::npos);
+           || (deviceName.find("gfx1201") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF32F32MatrixCoreSupported()
@@ -240,7 +245,8 @@ inline bool isF32F32MatrixCoreSupported()
     return (deviceName.find("gfx908") != std::string::npos)
            || (deviceName.find("gfx90a") != std::string::npos)
            || (deviceName.find("gfx942") != std::string::npos)
-           || (deviceName.find("gfx950") != std::string::npos);
+           || (deviceName.find("gfx950") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isF16F32MatrixCoreSupported()
@@ -275,7 +281,8 @@ inline bool isF64F32MatrixCoreSupported()
 
     return (deviceName.find("gfx90a") != std::string::npos)
            || (deviceName.find("gfx942") != std::string::npos)
-           || (deviceName.find("gfx950") != std::string::npos);
+           || (deviceName.find("gfx950") != std::string::npos)
+           || (deviceName.find("gfx1250") != std::string::npos);
 }
 
 inline bool isDataType16Bits(hiptensorDataType_t dataType)
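
Review note: each helper touched in `test/utils.hpp` extends a hand-written chain of `deviceName.find(...)` checks, so adding a target means editing every chain. As a minimal sketch only, the same substring test could be driven from a per-capability list. The names `sketch`, `kF64Archs`, `matchesAnyArch` and `isF64SupportedSketch` are hypothetical and not part of hipTensor, and the device name is passed in as a parameter here rather than queried from the HIP runtime. The list contents mirror the FP64 targets this diff names (gfx90a, gfx942, gfx950, gfx1250).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string_view>

namespace sketch // hypothetical namespace, for illustration only
{
    // FP64-capable targets as listed in this diff (README note, isF64Supported()).
    inline constexpr std::array<std::string_view, 4> kF64Archs
        = {"gfx90a", "gfx942", "gfx950", "gfx1250"};

    // Returns true if any listed architecture name occurs as a substring of
    // deviceName, the same test the existing find() chains perform.
    template <std::size_t N>
    bool matchesAnyArch(std::string_view                          deviceName,
                        const std::array<std::string_view, N>& archs)
    {
        return std::any_of(archs.begin(), archs.end(), [&](std::string_view arch) {
            return deviceName.find(arch) != std::string_view::npos;
        });
    }

    // Assumption: the caller supplies the device name string; the real helper
    // obtains it from the device properties instead.
    inline bool isF64SupportedSketch(std::string_view deviceName)
    {
        return matchesAnyArch(deviceName, kF64Archs);
    }
} // namespace sketch
```

With this shape, supporting a further target would be a one-line change per capability list. Whether that trade-off suits the test utilities is a judgment call for the maintainers.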
