2198 commits
b6ee3c5
fix: update dyn_cast to dyn_cast_or_null to avoid assertion
hongziqi Jan 31, 2026
a506b6b
!12 merge release/3.3.x-upgrade-candy-dev into release/3.3.x-upgrade
hongziqi Jan 31, 2026
93e6316
fix(launch): adapting the contextr variable of the signature
hongziqi Feb 3, 2026
61985d9
fix: resolve TypeError when converting tuple to tensor
hongziqi Feb 10, 2026
c461218
fix: remove debug print message
hongziqi Feb 10, 2026
487d1f3
!15 merge release/3.3.x-upgrade-candy-dev into release/3.3.x-upgrade
hongziqi Feb 10, 2026
46eb346
Merge Triton-Ascend 577a2d2 into release/3.5.x first.
Feb 25, 2026
e626215
fix: delete duplicated variable install_requires, update applyPattern…
Feb 25, 2026
aa34a7a
Merge Triton 5389ed7 into release/3.4.x first.
Feb 27, 2026
887cafd
fix: update interface match to matchAndRewrite, toMemrefOp to toBuffe…
Feb 27, 2026
7093f5a
Merge Triton cfc0a9d into release/3.5.x first.
Feb 28, 2026
9805d68
fix:fix TA compile error caused by to_tensor and ConstantIntOp.
Mar 2, 2026
509a062
change npuir commit id
Mar 2, 2026
9ba6a2b
fix: remove builder in indrectLoad
hongziqi Mar 3, 2026
f3e24fc
fix: RuntimeError concrete subclasses of DriverBase
hongziqi Mar 3, 2026
f4db337
fix: change _constexpr_to_value to _unwrap_if_constexpr
hongziqi Mar 3, 2026
ff0fd37
fix: temporarily disable extension interface pending semantic refactor
hongziqi Mar 3, 2026
a3ed01a
fix: sync JIT files with community version and fix JITFunction Attrib…
hongziqi Mar 3, 2026
4a67442
fix: restore cdiv in standard but not init cdiv
hongziqi Mar 3, 2026
dc4363f
fix: remove deprecated functions in 3.4, fix make_ir compile error
hongziqi Mar 3, 2026
1a922bb
fix: sync community logic to fix unexpected '_builder' TypeError and …
hongziqi Mar 4, 2026
f2cf3f2
feat: support triton-mlir-opt and compile with bytecode file
hongziqi Mar 4, 2026
2757ce8
fix: remove outdated language/_utils.py and update root _utils.py
hongziqi Mar 5, 2026
0e16103
fix: modify the extension op's parameter from _builder to _semantic a…
Mar 5, 2026
8db52f0
feat: set LLVM_MAJOR_VERSION_22_COMPATIBLE in CmakeLists.txt
hongziqi Mar 6, 2026
719cab2
fix(test): temporarily disable hang-inducing test cases
hongziqi Mar 6, 2026
f735f82
fix: replace _builder with _semantic in libdevice, modify the cast, t…
Mar 8, 2026
64b7971
fix: update MAP_SIGTYPE_TO_INT for i4 and u1
hongziqi Mar 8, 2026
1440ef8
!24 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 8, 2026
2dbcab8
fix: resolve TypeError in cast() caused by duplicate '_semantic' argumen
hongziqi Mar 8, 2026
0f1d44f
!25 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 8, 2026
f928de5
fix: update _builder to _semantic, replace tuple with tl.tuple, and c…
Mar 8, 2026
6312144
fix: update argument in math.cdiv, update autotune, add clear_cache i…
hongziqi Mar 8, 2026
8c22984
!27 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 8, 2026
b51effd
fix: convert pi and half_pi to constexpr in math_ops.
Mar 8, 2026
7bf67c8
!28 merge release/3.5.x-upgrade-jeshd-dev-merge-5389ed7-cfc0a9d-compi…
Mar 8, 2026
eceddbe
fix: use builder.get_string_attr instead of f get_str_attr
hongziqi Mar 8, 2026
4a9ae38
!29 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 8, 2026
b83c678
fix: add core.tuple support for compile_hint.
Mar 8, 2026
8aff1e4
fix(test): temporarily disable hang-inducing test cases
hongziqi Mar 8, 2026
31fae21
fix: update _builder param to _semantic in extension.core and extension
hongziqi Mar 8, 2026
2c01a76
!33 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 8, 2026
b875d9d
feat: support running ascend passes with triton-opt
hongziqi Mar 9, 2026
19e1ea5
!35 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 9, 2026
0e6fd78
feat: update AscendNPU-IR commit to 9a2c50f
hongziqi Mar 9, 2026
da9a077
fix: use tl.constexpr() for global constant in Triton kernel
hongziqi Mar 9, 2026
552b795
fix(test): unpack broadcast return value as tuple (lhs, rhs)
hongziqi Mar 9, 2026
58246fb
!39 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 9, 2026
27c12c8
fix(triton_ascend.cc): fix create_extract_slice and create_insert_sli…
Mar 10, 2026
db1cde5
fix: avoid scalar full-slice in complex mask indexing.
Mar 10, 2026
a503b50
fix test_to_buff case to adapt to ASTSource changes
Mar 10, 2026
2c5aa74
fix: replace _builder with _semantic, add tl.tuple and add builder se…
Mar 10, 2026
1da46f5
fix: create reinterpret_cast with dynamic and static parameters.
hongziqi Mar 10, 2026
974ad44
optimize: merge two similar loops and extract common logic
hongziqi Mar 10, 2026
33a4590
!42 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 10, 2026
2addadc
fix(unlimit loop): delete hasFolder attribute of AddPtrOp to avoid ul…
Mar 10, 2026
9f626ab
fix(test): test: re-enable previously disabled hang-inducing test cases
hongziqi Mar 10, 2026
bbf8dc2
fix(test): unpack tl.randint4x return values.
hongziqi Mar 10, 2026
8716fb8
!44 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 10, 2026
5adecc1
feat: update _get_bishengir_opt_path func
hongziqi Mar 11, 2026
88b9566
!45 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 11, 2026
f024db7
feat: reduce triton-mlir-opt in wheel
hongziqi Mar 11, 2026
3202478
!46 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 11, 2026
0d80f19
feat: improve compilation error output
hongziqi Mar 12, 2026
c42acbf
!47 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 12, 2026
4b77755
fix: remove override/final keywords for LLVM compatibility
hongziqi Mar 12, 2026
b5cbd6d
!48 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 13, 2026
b2c385e
fix: add get_home_dir func in ascend/testing
hongziqi Mar 13, 2026
3aae812
!49 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 14, 2026
0cbbd70
feat: add UnsplatConverter to rewrite tt.unsplat to tensor.extract
hongziqi Mar 14, 2026
e01fa77
!50 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 14, 2026
99cbb9d
chore(option): add max num of fusion ops option
Mar 21, 2026
1dad112
fix: deleted mark hivm.parallel_loop in discrete store
LH-123L Mar 21, 2026
5289e3a
fix(driver): declare ret with RT_ERROR_NONE
Mar 21, 2026
e087c31
fix: Skip i1 tensor optimization in select analysis
WuTYSFG Mar 21, 2026
df5a785
feat(libdevice): add support for 'pow' and 'tanh' operations on A5
Levein7 Mar 21, 2026
c53ca1c
fix(syncPos): wait should up
CodeWorker-bot Mar 21, 2026
75e3ad5
fix: update NPU-IR submodule commitid
LH-123L Mar 22, 2026
35ca173
fix: decompose AND masks to preserve boundary safety in discrete mask…
hongziqi Mar 23, 2026
bf38055
fix(load): add 'was_bool_to_int8' attr to load op
OiseauL Mar 23, 2026
b613cc7
feat: a5 inject-block-all
Mar 23, 2026
181f717
[fix]fix_kernel_name_error_in_tritonparse
Mar 23, 2026
6c7e026
fix(dagsync): change default pos && add crossblock condition
CodeWorker-bot Mar 24, 2026
e22b286
fix: Move makeTensorPtrOp cache after redundantOp create
WuTYSFG Mar 24, 2026
cbaaa1a
feat(Cannonicalizer):add if converter to extract add expr from if bod…
OiseauL Mar 25, 2026
b436f63
fix: Move the two extract operations in the if judgment condition and…
WuTYSFG Mar 25, 2026
cc1dc95
feat(compiler): add enable vf fusion option for 910_95
KanuaK Mar 25, 2026
ddd59f9
fix: Erase mstate inserted ops when select converter return failure
WuTYSFG Mar 25, 2026
1bec974
!1449 merge main4 into main
WuTYSFG Mar 25, 2026
e3c8d32
fix(test): prevent device mismatch in boundary_check test assertions
hongziqi Mar 25, 2026
9969c6b
fix(ssbuffer): issues occurring when the cube core is greater than 3
XMU-qcj Mar 25, 2026
33f8b99
test: move mindspore cases to generalization
hongziqi Mar 25, 2026
6d04b94
fix(make_block_ptr): transform boundary_check when make_block_ptr's o…
Mar 26, 2026
d83f382
fix: adjust types for dot to matmul
Mar 26, 2026
3d039b3
fix(useAnalysis): add use analysis for reduce op;add cse/canonicalize…
OiseauL Mar 26, 2026
62e90c5
fix (ssbuffer): copy load ptr calculation while load ptr is a result …
zhuxinguang33 Mar 26, 2026
6594f2b
fix: Add forward sync for mem ops for ssbuffer
Mar 26, 2026
d613727
fix(reduce):Fix the reduce Converter did not fill initial values when…
Mar 26, 2026
205a6cc
Refactor: fix control flow
cxtverygood Mar 26, 2026
8d1bb2b
fix(ssbuffer): Fixed issue with incorrect insertion position of set s…
XMU-qcj Mar 27, 2026
3da36af
fix(dagsync): mark reshape dim
Mar 27, 2026
efe5b7d
docs(safe) : add public IP address statement
gymgit1 Mar 28, 2026
1f9ecf4
fix(load): load make_block_ptr tensor default fill 0 when out of bound
Mar 28, 2026
eaf772d
fix:Fix the error caused by missing cache_dir attribute & prioritize …
Mar 30, 2026
89859c7
feat(inline_asm): support nD tensor case for inline asm
KanuaK Mar 30, 2026
33a98d2
docs(install): modify installation instruction docs
melo882 Mar 30, 2026
0669d53
(Fix Ssbuffer): Remove redundant dependent values
cxtverygood Mar 30, 2026
3c27208
feat(performance):Update the performance charts in README.md
Mar 30, 2026
48cb9fe
fix:Fix the infinite loop in isUsedWithCondition caused by nested for…
Mar 30, 2026
4296d2c
fix(TritonToUnstructure): fix map[a]=map[b] scenario with two lines i…
Mar 30, 2026
3da1a72
!1488 merge 0330_main_performance_svg into main
Mar 30, 2026
d99be31
fix(DiscreteMask): distribute broadcast over andi in collectAndLeaves
hongziqi Mar 31, 2026
425236d
fix: Force protecting side-effect op
WuTYSFG Mar 31, 2026
4ec4ceb
fix(Linalg): fix base offset update impl in rewriteAddPtrToUnstrucMemAcc
KanuaK Mar 31, 2026
78d9842
docs(install): modify installation instruction docs
melo882 Mar 31, 2026
3b10d99
(Fix Ssbuffer): cast condition type in selectop to avoid unsupported …
cxtverygood Mar 31, 2026
8232cb8
feat(autotune): support fixed grid dimensions and axis-program_id map…
Hins-xp Mar 31, 2026
a1391cf
Merge Triton-Ascend 425236de into release/3.5.x
Mar 31, 2026
7a5e8d9
docs(install): modify installation instruction docs
melo882 Mar 31, 2026
3ff3ede
!54 merge release/3.5.x-upgrade-jeshd-dev-merge-main-425236de into re…
Mar 31, 2026
10f93e1
fix(libentry): TypeError for function arguments
hongziqi Mar 31, 2026
5718a8c
fix(delete_option): delete enable-ubuf-saving, tile-mix-vector-loop, …
Mar 31, 2026
922832e
!1507 merge docs_readme3 into main
melo882 Mar 31, 2026
5b75ccd
fix(libentry): update typing imports to include additional utilities
hongziqi Mar 31, 2026
ee4e125
!56 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Mar 31, 2026
d90cf8d
fix(AutoBlockify): disable dotOp in AutoBlockify
Apr 1, 2026
b0ecce7
CustomOp indexing_map and extra_buffers
Apr 1, 2026
b56b39e
fix buffer_num when multi %alloc occurs after waitOp
m-everglow Apr 1, 2026
55fd961
feat(autotune): expand tiling configurations with small programs numb…
Hins-xp Apr 1, 2026
faef318
bug: atomicrmw op supports block pointer analysis
LH-123L Apr 1, 2026
9610544
CustomOp align_dim on arg
Apr 2, 2026
4e8a31b
fix(ssbuffer): fix some clean code issues.
XMU-qcj Apr 2, 2026
489b1bf
feat(simt): Remove temporary api
Apr 2, 2026
22a825f
CustomOp IteratorTypes
Apr 2, 2026
f4fe830
fix(ssbuffer): Reduce the number of idle for loops
XMU-qcj Apr 2, 2026
92574a7
fix: improve compilation error output
hongziqi Apr 2, 2026
4384824
fix : resolve CleanCode warnings
gymgit1 Apr 2, 2026
fc0d3a1
feat: Remove tanh op to align with the upstream, and add bf16 fallbac…
Apr 2, 2026
bf4ca19
build(safe): Add safe compile options for libtriton.so libentryC.so
Apr 3, 2026
4086027
fix: add hfusion_enable_multiple_consumer_fusion compile option to ad…
Apr 3, 2026
1ee4766
!57 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Apr 3, 2026
0fcf6ff
fix(MaskAnalysis): when got a negative dim value in mask analysis, we…
OiseauL Apr 3, 2026
9d39041
feat(libdevice): add support for 'fast_dividef' and 'fast_expf' opera…
Levein7 Apr 3, 2026
f887959
fix(sum): comment the type promotion logic of sum op
Apr 3, 2026
172c4bf
fix: Fixing the Extension API in tutorials
WuTYSFG Apr 3, 2026
ca7aa30
fix : resolve CleanCode warnings
gymgit1 Apr 7, 2026
6765b03
feat: update llvm hash and upload fad3272.patch
hongziqi Apr 7, 2026
79713fb
fix(BlockPtrAnalysis):add dimension information to scalar blockdata
CHNJZ Apr 7, 2026
6ad2c24
fix: add disable_tightly_coupled_buffer_reuse compile option to adapt…
Apr 7, 2026
26013a7
ci(llvm):update llvm and apply patch
Apr 7, 2026
b77d79b
ci(update):Skip some test cases not supported after the NPUIR is upda…
Apr 7, 2026
cdb0325
fix(autotune): harden autotuner parsing and fp8 dtype handling
Hins-xp Apr 7, 2026
62eb951
feat: TTIR Graph Analysis Framework
huchengbei Apr 7, 2026
936eca2
Merge Triton-Ascend 62eb951f into release/3.5.x
Apr 7, 2026
fa27c6c
!63 merge release/3.5.x-upgrade-jeshd-dev-merge-main-62eb951f into re…
Apr 7, 2026
52988e2
fix: remove 'cd python' when build triton
hongziqi Apr 8, 2026
9059eaa
fix: add nanobind>=2.4 in requirements_dev.txt
hongziqi Apr 8, 2026
9de8262
fix: no wheel found in python/dist
hongziqi Apr 8, 2026
75a6108
fix: update rename-wheel in makefile
hongziqi Apr 8, 2026
dff3662
doc: update installation_guide
hongziqi Apr 8, 2026
2ec7d16
fix(buffer): restore buffer_type correctly in for/if subregions
hongziqi Apr 9, 2026
e44b54e
!68 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Apr 9, 2026
e3e32e7
feat: update AscendNPU-IR to 8c903bbf
hongziqi Apr 10, 2026
1fdd5ee
!69 merge release/3.5.x-upgrade-candy-dev into release/3.5.x-upgrade
hongziqi Apr 10, 2026
8c2fdd4
fix: fix test_custom compilation error
Apr 10, 2026
6744f5f
!1551 merge release/3.5.x-upgrade into main
hongziqi Apr 13, 2026
e8e6738
fix(Simt Template): when mask is discrete and ptr is continuous, we s…
OiseauL Apr 8, 2026
4509fb0
fix(interpret): fix interpret implementation of compile_hint op
Apr 14, 2026
7bc3159
test(if_advance): add test if-advance case
Apr 14, 2026
45ee4c6
fix : delete useless code in compile
gymgit1 Apr 14, 2026
6b140bd
feat(TritonToStructured) : add ut for coverage rate
gymgit1 Apr 14, 2026
fcc7b09
feat(pytest): add use cases to improve the coverage
LH-123L Apr 9, 2026
927b3a7
!1565 merge simt_template into main
OiseauL Apr 14, 2026
0e89a25
fix: add sync_block_lock/unlock around the RMW sequence(store->load+s…
hongziqi Mar 23, 2026
7669903
feat(libdevice):Replace the operations implementation in libdevice.10…
Levein7 Apr 14, 2026
52199a3
!1405 merge main/fix-mask into main
hongziqi Apr 15, 2026
9998a46
!1607 merge mian_pytest into main
LH-123L Apr 15, 2026
f1db373
!1616 merge libdevice_simt_414_main into main
Levein7 Apr 15, 2026
ef52caf
fix(docs_ci): fix docs-ci-pipeline-failed
Apr 15, 2026
7cfb524
!1623 merge fix_docs_ci_for_main into main
Apr 15, 2026
e84ca38
fix doc-tools low-level errors
Mar 25, 2026
cf64ee9
fix doctools problem
Apr 15, 2026
c092cf6
fix link
Apr 15, 2026
c8d49ec
!1613 merge ut_release3.5 into main
gymgit1 Apr 15, 2026
c87da1d
doc: update llvm_patch link in Installation Guide
hongziqi Apr 14, 2026
f8e451e
!1609 merge inte_1 into main
Apr 15, 2026
431cf34
feat: ta support python3.12 and python 3.13; add third-party pybind11…
LH-123L Apr 14, 2026
800095a
!1614 merge tritontostructured into main
gymgit1 Apr 15, 2026
ba0d777
!1608 merge main into main
LH-123L Apr 15, 2026
4a2526a
!1615 merge main/doc-fix into main
hongziqi Apr 15, 2026
c8c990c
fix(autotuner): fix compatibility of do_bench and respect user-define…
Hins-xp Apr 4, 2026
7e4213c
!1610 merge add_if_advance_main into main
Apr 16, 2026
58a9ae6
fix(docs): add mitigation instructions for installation overwrite iss…
Apr 8, 2026
2c45574
fix(packaging): avoid triton overwriting triton-ascend after installa…
Apr 3, 2026
27fba49
feat(filecheck): add mlir filecheck and delete failed test cases temp…
gymgit1 Apr 15, 2026
3e036cc
!1463 merge main into main
Apr 16, 2026
ee50b50
docs(install): modify architecture design docs
melo882 Apr 16, 2026
32256b2
!1555 merge codex/fix-autotune-compatibility into main
Hins-xp Apr 16, 2026
786b3f7
!1620 merge fix_triton_install_for_main into main
Apr 16, 2026
255adaf
!1635 merge docs_readme4 into main
melo882 Apr 16, 2026
d0cca35
!1621 merge fix_triton_install_for_main_docs into main
Apr 16, 2026
1041e0d
update AscendNPU-IR commit id
Apr 16, 2026
3de13dc
!1626 merge test-mlir into main
gymgit1 Apr 17, 2026
c6c0b1b
!1636 merge main into main
Apr 17, 2026
62b7646
build: update LLVM prebuilt package generation for AlmaLinux and impr…
Apr 17, 2026
efa8f70
fix(docs)Update CANN_TYPE to uniformly use a2 and a3; fix some expres…
Apr 17, 2026
bd563d9
feat: modify SIMT num_warps
Apr 17, 2026
33eff2d
[fix] (tritonparse):Adapt to the upstream community tritonparse and e…
Apr 17, 2026
8880b4c
[Ascend][CI] Add NPU integration test pipeline and sync CI infrastruc…
xuedinge233 Apr 17, 2026
eeb030e
fix(ssbuf): implement backward synchronization and fix ssbuf for pipe…
licf497 Apr 18, 2026
255373c
fix: handle zero value correctly for vf_merge_level
Apr 20, 2026
3707aaf
fix: raise error for unsupported allow_tf32 in dot_op
Apr 20, 2026
003d3fa
Merge commit '3707aaf06ae50ccd4cc926cc248ff4313822ca83'
hipudding Apr 20, 2026
0c9ce37
ci: fix LLVM build workflow and enhance CI test coverage
hipudding Apr 20, 2026
161ee34
ci: optimize llvm-build workflow (#25)
hipudding Apr 22, 2026
109f249
ci: improve docker build workflow and simplify Dockerfiles (#28)
hipudding Apr 23, 2026
cc46780
ci: optimize integration test (#30)
hipudding Apr 23, 2026
1a9890b
ci: pre-install runtime dependencies in Docker images (#34)
hipudding Apr 24, 2026
8455457
ci: clean up integration-tests-ascend workflow (#35)
hipudding Apr 25, 2026
9543627
ci: use pre-built CANN images as Docker base (#36)
hipudding Apr 25, 2026
10121d6
ci: set clang-15 as default compiler in Docker image (#38)
hipudding Apr 27, 2026
a84b1eb
ci: use CANN images in integration tests (#37)
hipudding Apr 27, 2026
0cdb6c8
Add documentation
xuedinge233 Apr 29, 2026
0645792
update
xuedinge233 Apr 29, 2026
aab3951
update
xuedinge233 Apr 29, 2026
3fe6409
Add blue yellow pipeline
xuedinge233 Apr 30, 2026
f37905b
Add documentation
xuedinge233 Apr 29, 2026
1d5e044
Merge branch 'main' into test-pr
xuedinge233 Apr 30, 2026
d448b60
Merge pull request #2 from xuedinge233/test-pr
xuedinge233 Apr 30, 2026
67e294e
Add blue yellow pipeline
xuedinge233 Apr 30, 2026
1d20830
Add blue yellow pipeline
xuedinge233 May 6, 2026
efb0dea
update file name
xuedinge233 May 6, 2026
2e24e4a
update
xuedinge233 May 6, 2026
83ec34c
update
xuedinge233 May 6, 2026
7a2db1d
Merge branch 'main' into test-yb-pipeline
xuedinge233 May 6, 2026
1f143a4
update
xuedinge233 May 6, 2026
6c96787
update
xuedinge233 May 6, 2026
8669caf
update
xuedinge233 May 6, 2026
2733a8e
update
xuedinge233 May 7, 2026
7b0d5e6
update
xuedinge233 May 7, 2026
9837ec5
update
xuedinge233 May 7, 2026
4d392a4
update
xuedinge233 May 7, 2026
e63bec9
update
xuedinge233 May 7, 2026
0905922
update
xuedinge233 May 7, 2026
22c6fb9
update
xuedinge233 May 7, 2026
2b44a51
update
xuedinge233 May 7, 2026
55b6f0a
update
xuedinge233 May 7, 2026
fc28c28
Merge remote-tracking branch 'origin/main' into test-yb-pipeline
xuedinge233 May 7, 2026
fb4d36b
update
xuedinge233 May 7, 2026
738e1f5
update
xuedinge233 May 7, 2026
5 changes: 0 additions & 5 deletions .flake8

This file was deleted.

17 changes: 17 additions & 0 deletions .github/CODEOWNERS
@@ -46,3 +46,20 @@ lib/Dialect/TritonGPU/Transforms/TritonGPUConversion.cpp @ptillet
# third_party
# -----------
third_party/amd/ @antiagainst @zhanglx13
third_party/proton/ @Jokeren @crobeck @fywkevin

# -----------
# gluon
# -----------
python/triton/experimental/gluon/ @peterbell10
python/src/gluon_ir.cc @peterbell10
python/test/gluon @peterbell10
test/Gluon @peterbell10
include/triton/Dialect/Gluon @peterbell10
lib/Dialect/Gluon @peterbell10

# -----------
# Linear Layouts
# -----------
lib/Tools/ @lezcano
lib/Dialect/TritonGPU/IR/LinearLayoutConversions.cpp @lezcano
48 changes: 48 additions & 0 deletions .github/ISSUE_TEMPLATE/bug.yml
@@ -0,0 +1,48 @@
name: Report a bug
description: Report triton failing to compile a kernel, or giving incorrect results
labels: ["bug"]

body:
- type: markdown
attributes:
value: |
#### Disclaimer
The core triton team is small and has very limited capacity. We may not have time to look into your report.
For the best results, please:
- Avoid submitting duplicates. Search through [the existing and past issues](https://github.com/triton-lang/triton/issues?q=is%3Aissue+sort%3Acreated-desc+) first to see if it's been reported previously.
- Check if the issue persists with a build from the latest source.
- Provide all relevant information in the initial report, to prevent unnecessary back and forth discussion.
- If you can, try to diagnose and/or fix the issue yourself. We welcome high quality contributions.
- type: textarea
attributes:
label: Describe the bug
description: |
Please provide a clear and concise description of what the bug is.

If relevant, add a [minimal complete example](https://stackoverflow.com/help/minimal-reproducible-example) that reproduces the bug. It is very important for the snippet to be as simple as possible, so please take time to trim down any irrelevant code to help us debug efficiently. We are going to copy-paste your code and we expect to get the same result as you did, so include both the kernel and launching code as well as any relevant imports.

If the code is too long (hopefully, it isn't), feel free to put it in a public gist and link it in the issue: https://gist.github.com.

Please also paste or describe the results you observe instead of the expected results. If you observe an error, please paste the error message including the **full** traceback of the exception. It may be relevant to wrap error messages in ```` ```triple quotes blocks``` ````.
placeholder: |
A clear and concise description of what the bug is.

```python
# Sample code to reproduce the problem
```

```
The error message you got, with the full traceback.
```
validations:
required: true
- type: textarea
attributes:
label: Environment details
description: |
Please include any relevant context about how you're running the reproducer e.g. which version of triton, and what GPU you are using.
placeholder: |
Triton: ...
GPU: ...
validations:
required: true
5 changes: 5 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,5 @@
blank_issues_enabled: true
contact_links:
- name: Community help
url: https://discord.gg/gpumode
about: GPU-mode discord community has a triton channel which is a great resource for help writing/learning triton
44 changes: 44 additions & 0 deletions .github/ISSUE_TEMPLATE/performance.yml
@@ -0,0 +1,44 @@
name: Report a performance issue
description: Report cases where triton is generating sub-optimal (but functionally correct) PTX/LLVM IR
labels: ["performance"]

body:
- type: markdown
attributes:
value: |
#### Disclaimer
The core triton team is small and has very limited capacity. We may not have time to look into your report.
For the best results, please:
- Avoid submitting duplicates. Search through [the existing and past issues](https://github.com/triton-lang/triton/issues?q=is%3Aissue+sort%3Acreated-desc+) first to see if it's been reported previously.
- Check if the issue persists with a build from the latest source.
- Provide all relevant information in the initial report, to prevent unnecessary back and forth discussion.
- If you can, try to diagnose and/or fix the issue yourself. We welcome high quality contributions.
- type: textarea
attributes:
label: Describe the issue
description: |
Please provide a clear and concise description of the issue.

Include a [minimal complete example](https://stackoverflow.com/help/minimal-reproducible-example) that reproduces the issue. It is very important for the snippet to be as simple as possible, so please take time to trim down any irrelevant code to help us debug efficiently. We are going to copy-paste your code and we expect to get the same result as you did.

A reproducer could be a python program that runs a triton kernel and prints out the relevant suboptimal IR, or an IR file with an accompanying triton-opt command.

If the code is too long (hopefully, it isn't), feel free to put it in a public gist and link it in the issue: https://gist.github.com.
placeholder: |
A clear and concise description of the issue.

```python
# Sample code to reproduce the problem
```
validations:
required: true
- type: textarea
attributes:
label: Environment details
description: |
Please include any relevant context about how you're running the reproducer e.g. which version of triton, and what GPU you are using.
placeholder: |
Triton: ...
GPU: ...
validations:
required: true
3 changes: 3 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,11 +1,14 @@
<!---
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, **if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.**
Complete the following tasks before sending your PR, and replace `[ ]` with
`[x]` to indicate you have done them.
-->

# New contributor declaration
- [ ] I am not making a trivial change, such as fixing a typo in a comment.

- [ ] I have written a PR description following these
118 changes: 52 additions & 66 deletions .github/workflows/build-docker-image.yml
@@ -1,94 +1,80 @@
name: Release Triton Ascend Images

on:
pull_request:
types: [opened, reopened, synchronize]
branches: [main]
push:
branches: [main]
paths:
- 'docker/Dockerfile'
- 'Makefile'
- 'requirements.txt'
- 'requirements_dev.txt'
- '.github/workflows/build-docker-image.yml'
pull_request:
branches: [main]
paths:
- 'docker/Dockerfile'
- 'Makefile'
- 'requirements.txt'
- 'requirements_dev.txt'
- '.github/workflows/build-docker-image.yml'

permissions:
contents: read
actions: write

jobs:
build-release-images:
name: Build and Push Docker Images (${{ matrix.npu_type }})
build-images:
name: Build ${{ matrix.cann_version }}-${{ matrix.chip_type }}-${{ matrix.os }}-py${{ matrix.python_version }}
runs-on: ubuntu-22.04

strategy:
fail-fast: false
matrix:
npu_type: ['910b', 'A3']
env:
QUAY_USER: ${{ secrets.QUAY_USER }}
QUAY_PASSWD: ${{ secrets.QUAY_PASSWD }}
HEAD_COMMIT: ${{ github.sha }}
cann_version: [8.5.0, 9.0.0-beta.2]
chip_type: [910b, a3]
os: [ubuntu22.04]
python_version: ["3.10", "3.11"]

steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0

- name: Set up QEMU
uses: docker/setup-qemu-action@v3

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Buildx info
run: |
echo "Checking Docker Buildx version and builders..."
docker buildx version
docker buildx ls
- name: Login to Quay.io
if: github.event_name == 'push'
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.QUAY_USER }}
password: ${{ secrets.QUAY_PASSWD }}

- name: Build and push release images
shell: bash
- name: Generate tags
id: tags
run: |
if [ -z "${QUAY_USER}" ] || [ -z "${QUAY_PASSWD}" ]; then
echo "Please set QUAY_USER and QUAY_PASSWD secrets before building images."
exit 1
fi

if [ -n "${HEAD_COMMIT}" ]; then
GIT_COMMIT_SHORT="$(git rev-parse --short "${HEAD_COMMIT}")"
else
GIT_COMMIT_SHORT="$(git rev-parse --short HEAD)"
fi

echo "${QUAY_PASSWD}" | docker login -u "${QUAY_USER}" --password-stdin quay.io

BUILD_IMAGE=0
if [ -z "${BASE_COMMIT}" ]; then
echo "BASE_COMMIT not set. Forcing Docker image build..."
BUILD_IMAGE=1
elif [ -n "${filenames}" ] && echo "${filenames}" | grep -Eq '\b(docker/Dockerfile_build|Makefile|requirements(_dev)?\.txt)\b'; then
echo "Relevant files changed. Building Docker image..."
BUILD_IMAGE=1
else
echo "No relevant changes since BASE_COMMIT. Skipping Docker image build."
fi
CANN_TAG="${{ matrix.cann_version }}-${{ matrix.chip_type }}-${{ matrix.os }}-py${{ matrix.python_version }}"
GIT_COMMIT_SHORT="$(git rev-parse --short HEAD)"
TAG_COMMIT="${CANN_TAG}-${GIT_COMMIT_SHORT}"
TAG_LATEST="${CANN_TAG}-latest"
echo "cann_tag=${CANN_TAG}" >> $GITHUB_OUTPUT
echo "tag_commit=${TAG_COMMIT}" >> $GITHUB_OUTPUT
echo "tag_latest=${TAG_LATEST}" >> $GITHUB_OUTPUT

if [ "$BUILD_IMAGE" -eq 1 ]; then
docker buildx build --platform "linux/amd64,linux/arm64" \
--build-arg NPU_TYPE=${{ matrix.npu_type }} \
-f docker/Dockerfile_build --push \
-t "quay.io/ascend/triton:dev-build-${GIT_COMMIT_SHORT}-${{ matrix.npu_type }}" .
fi

BUILD_IMAGE=0
if [ -z "${BASE_COMMIT}" ]; then
echo "BASE_COMMIT not set. Forcing Docker image build..."
BUILD_IMAGE=1
elif [ -n "${filenames}" ] && echo "${filenames}" | grep -Eq '\b(docker/Dockerfile|Makefile|requirements(_dev)?\.txt)\b'; then
echo "Relevant files changed. Building Docker image..."
BUILD_IMAGE=1
else
echo "No relevant changes since BASE_COMMIT. Skipping Docker image build."
fi

if [ "$BUILD_IMAGE" -eq 1 ]; then
docker buildx build --platform "linux/amd64,linux/arm64" \
--build-arg NPU_TYPE=${{ matrix.npu_type }} \
-f docker/Dockerfile --push \
-t "quay.io/ascend/triton:dev-${GIT_COMMIT_SHORT}-${{ matrix.npu_type }}" .
fi
- name: Build and push
uses: docker/build-push-action@v6
with:
context: .
file: docker/Dockerfile
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name == 'push' }}
tags: |
quay.io/ascend/triton:${{ steps.tags.outputs.tag_commit }}
quay.io/ascend/triton:${{ steps.tags.outputs.tag_latest }}
build-args: |
CANN_BASE_IMAGE=quay.io/ascend/cann:${{ steps.tags.outputs.cann_tag }}
cache-from: type=gha
cache-to: type=gha,mode=max
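
The "Generate tags" step above composes one tag prefix per matrix entry and derives a commit tag and a `latest` tag from it. A minimal Python sketch of that expansion follows, assuming the matrix values shown in this diff are the final ones; the function name `image_tags` and the sample short SHA `abc1234` are hypothetical and only serve the illustration.

```python
from itertools import product

# Matrix values copied from the strategy.matrix block in the diff above.
MATRIX = {
    "cann_version": ["8.5.0", "9.0.0-beta.2"],
    "chip_type": ["910b", "a3"],
    "os": ["ubuntu22.04"],
    "python_version": ["3.10", "3.11"],
}


def image_tags(short_sha: str) -> list[dict[str, str]]:
    """Return the three tag strings per matrix combination.

    short_sha stands in for the output of `git rev-parse --short HEAD`.
    """
    tags = []
    for cann, chip, os_name, py in product(*MATRIX.values()):
        cann_tag = f"{cann}-{chip}-{os_name}-py{py}"
        tags.append(
            {
                "cann_tag": cann_tag,
                "tag_commit": f"{cann_tag}-{short_sha}",
                "tag_latest": f"{cann_tag}-latest",
            }
        )
    return tags


if __name__ == "__main__":
    # "abc1234" is a placeholder SHA, not a real commit.
    for entry in image_tags("abc1234"):
        print(f"quay.io/ascend/triton:{entry['tag_commit']}")
```

With two CANN versions, two chip types, one OS, and two Python versions, the matrix expands to eight jobs, each pushing two tags on `push` events.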
39 changes: 35 additions & 4 deletions .github/workflows/ci.yml
@@ -2,7 +2,26 @@ name: Integration Tests
on:
workflow_dispatch:
pull_request:
branches-ignore: ['llvm-**']
paths-ignore:
- 'docs/**'
- '**/*.md'
- 'LICENSE'
- 'MANIFEST.in'
- '.gitignore'
- '.github/CODEOWNERS'
- '.github/dependabot.yml'
- 'cmake/llvm-hash.txt'
- 'third_party/ascend/llvm_patch/**'
- 'docker/**'
- 'requirements.txt'
- 'requirements_dev.txt'
- '.github/workflows/build-docker-image.yml'
- '.github/workflows/create_release.yml'
- '.github/workflows/documentation.yml'
- '.github/workflows/llvm-build.yml'
- '.github/workflows/llvm-build/**'
- '.github/workflows/pre-commit.yml'
- '.github/workflows/runner-preparation.yml'
merge_group:
branches: [main, 'dev-**']
types: [checks_requested]
@@ -18,12 +37,24 @@ jobs:
runner-preparation:
uses: ./.github/workflows/runner-preparation.yml

pre-commit:
uses: ./.github/workflows/pre-commit.yml
# pre-commit:
# uses: ./.github/workflows/pre-commit.yml

integration-tests-ascend:
build-wheels:
needs: runner-preparation
if: needs.runner-preparation.outputs.matrix-ASCEND != ''
uses: ./.github/workflows/wheels.yml
with:
ci_only: true
secrets: inherit
permissions:
id-token: write
contents: read

integration-tests-ascend:
needs: [runner-preparation, build-wheels]
if: needs.runner-preparation.outputs.matrix-ASCEND != ''
uses: ./.github/workflows/integration-tests-ascend.yml
with:
matrix: ${{ needs.runner-preparation.outputs.matrix-ASCEND }}
use_prebuilt_wheel: true
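
The `needs:` entries in this hunk chain the jobs so that wheels are built before the Ascend integration tests run, and both Ascend jobs are skipped when the runner-preparation matrix output is empty. A rough Python sketch of that ordering follows, assuming only the three jobs visible in this hunk (the collapsed parts of the file may define more); `run_order` and `runs_ascend_jobs` are hypothetical helper names.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it lists under `needs:` in the hunk above.
# The commented-out pre-commit job is omitted.
NEEDS = {
    "runner-preparation": set(),
    "build-wheels": {"runner-preparation"},
    "integration-tests-ascend": {"runner-preparation", "build-wheels"},
}


def run_order(needs: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order that respects every `needs:` edge."""
    return list(TopologicalSorter(needs).static_order())


def runs_ascend_jobs(matrix_ascend: str) -> bool:
    """Mirror `if: needs.runner-preparation.outputs.matrix-ASCEND != ''`."""
    return matrix_ascend != ""


if __name__ == "__main__":
    print(" -> ".join(run_order(NEEDS)))
```

Because `integration-tests-ascend` now depends on `build-wheels`, it can be called with `use_prebuilt_wheel: true` instead of building Triton itself.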
18 changes: 7 additions & 11 deletions .github/workflows/create_release.yml
@@ -26,7 +26,7 @@ jobs:
outputs:
release_name: "${{ steps.release_name.outputs.name }}"
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v4
with:
show-progress: false
submodules: 'recursive'
@@ -47,32 +47,28 @@ jobs:
# strip trailing v from tag name
tag_or_branch="${tag_or_branch#v}"
# important: version must be fixed in setup.py
sed -i -e "s:^TRITON_VERSION = .*:TRITON_VERSION = '${tag_or_branch}':" python/setup.py || exit 1
sed -i -e "s:^TRITON_VERSION = .*:TRITON_VERSION = '${tag_or_branch}':" setup.py || exit 1
fi
echo "RELEASE_NAME=triton-ascend$tag_or_branch" >> "$GITHUB_ENV"

echo "RELEASE_NAME=triton-ascend-$tag_or_branch" >> "$GITHUB_ENV"
- name: Create source distribution
working-directory: python
run: |
python -m pip install --upgrade pip -qq || exit 1
python -m pip install "setuptools<70" wheel build -qq || exit 1
python -m pip install --upgrade pip || exit 1
python -m pip install "setuptools<70" wheel build || exit 1
python -m build -s || exit 1
cd dist || exit 1
release_file=( *.tar.gz )
echo "RELEASE_FILE=${release_file}" >> "$GITHUB_ENV"

- name: Upload source distribution for release
if: ${{ github.event_name == 'release' }}
uses: softprops/action-gh-release@v2
with:
files: python/dist/${{ env.RELEASE_FILE }}

files: dist/${{ env.RELEASE_FILE }}
- name: Upload source distribution to GHA artifacts for release tags
if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && contains(github.ref, 'rc') }}
uses: actions/upload-artifact@v4.4.0
with:
name: ${{ env.RELEASE_FILE }}
path: python/dist/${{ env.RELEASE_FILE }}
path: dist/${{ env.RELEASE_FILE }}
- name: Set output
id: release_name
run: echo "name=release_name::${{ env.RELEASE_NAME }}.tar.gz" >> "${GITHUB_OUTPUT}"
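
The release-name logic in this hunk can be sketched in Python as follows. The sketch makes three assumptions: the hyphenated `triton-ascend-$tag_or_branch` line is the new version, the input is the already-extracted tag or branch string (the code that derives it is collapsed in this view), and the surrounding `if` is ignored. Note that bash `${tag_or_branch#v}` removes a leading `v` prefix. The function names are hypothetical, and the lower-casing approximates the case-insensitive behaviour of GitHub's `startsWith` and `contains` expression functions.

```python
def release_name(tag_or_branch: str) -> str:
    """Build the RELEASE_NAME value from a tag or branch name."""
    # Equivalent of bash ${tag_or_branch#v}: drop one leading "v" if present.
    if tag_or_branch.startswith("v"):
        tag_or_branch = tag_or_branch[1:]
    return f"triton-ascend-{tag_or_branch}"


def uploads_rc_artifact(event_name: str, ref: str) -> bool:
    """Approximate the `if:` guard on the GHA artifact upload step."""
    ref_lower = ref.lower()
    return (
        event_name.lower() == "push"
        and ref_lower.startswith("refs/tags/v")
        and "rc" in ref_lower
    )


if __name__ == "__main__":
    # Example inputs are illustrative, not taken from the repository.
    print(release_name("v3.5.0rc1"))
    print(uploads_rc_artifact("push", "refs/tags/v3.5.0rc1"))
```

Under these assumptions, only pushes of `v`-prefixed tags containing `rc` upload the source distribution as a workflow artifact, while `release` events attach it to the GitHub release instead.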