enhancement: optimize the rearrange operator #8

Merged
merged 1 commit into from
May 7, 2025

@pwhMass pwhMass commented Jan 20, 2025

No description provided.

pwhMass commented Jan 20, 2025

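The code is not usable yet, and the transpose optimization currently covers only the 2-D case. I am still testing which cases give better performance.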

@YdrMaster YdrMaster self-assigned this Jan 20, 2025
@YdrMaster YdrMaster marked this pull request as draft January 20, 2025 09:37
@pwhMass pwhMass force-pushed the rearrange branch 2 times, most recently from cfd19d9 to b6d2b42 Compare January 24, 2025 13:43
@pwhMass pwhMass force-pushed the rearrange branch 2 times, most recently from de841a0 to 8d4cd02 Compare February 20, 2025 16:23
@YdrMaster YdrMaster marked this pull request as ready for review February 24, 2025 03:47
// indices of the src strides, sorted in descending order
let src_strides_desc_idx = (0..scheme_update.ndim())
.zip(src_strides)
.sorted_by(|a, b| b.1.cmp(&a.1))

Check warning

Code scanning / clippy

this expression creates a reference which is immediately dereferenced by the compiler
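
For reference, the same ordering can be computed with the standard library alone. This is a standalone sketch, not the code in this PR: `strides_desc_idx` is a hypothetical helper name, and the strides are assumed to be `isize`. Because the slice is indexed by value, `cmp` receives exactly one borrow, which avoids the extra reference that the warning above points at.

```rust
/// Returns the dimension indices ordered by stride, largest stride first.
/// Hypothetical helper for illustration; not part of the PR.
fn strides_desc_idx(strides: &[isize]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..strides.len()).collect();
    // `sort_by` is stable, so dimensions with equal strides keep their
    // original relative order.
    idx.sort_by(|&a, &b| strides[b].cmp(&strides[a]));
    idx
}
```

For strides `[1, 12, 4]` this yields `[1, 2, 0]`: dimension 1 has the largest stride, then dimension 2, then dimension 0.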

let dst_cs = dst_cs / unit;
let src_rs = src_rs / unit;
let src_cs = src_cs / unit;
let unit = unit as usize;

Check warning

Code scanning / clippy

casting to the same type is unnecessary (`usize` -> `usize`)
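
The flagged lines rescale strides by `unit`. A minimal sketch of that step follows, under stated assumptions: strides are given in bytes as `isize`, `unit` is the element size in bytes, and `strides_in_units` is a hypothetical name rather than a function from this PR. Here the `usize` to `isize` cast changes the type, so it would not trigger the same-type warning.

```rust
/// Converts byte strides into strides counted in `unit`-byte elements.
/// Returns `None` if `unit` is zero or a stride is not a multiple of `unit`.
/// Hypothetical helper for illustration; not part of the PR.
fn strides_in_units(byte_strides: &[isize], unit: usize) -> Option<Vec<isize>> {
    if unit == 0 {
        return None;
    }
    // A real conversion (usize -> isize), unlike `unit as usize` on a usize.
    let unit = unit as isize;
    byte_strides
        .iter()
        .map(|&s| if s % unit == 0 { Some(s / unit) } else { None })
        .collect()
}
```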

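Note that ARRAY_SIZE is currently 5. This constant is tied to the number of tensor dimensions that can be accepted, but making it too large increases the amount of computation in the kernel.
Operator should use max_warps_block and warp_size to assist its calculations; at present they are not used.
block_size is currently fixed at 256 and could be optimized further.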
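
The notes above can be illustrated with a small sketch. It is not the PR's implementation: `to_fixed` and `pick_block_size` are hypothetical names, the padding scheme is an assumption about how a fixed `ARRAY_SIZE` would bound the accepted tensor rank, and the block-size rule (round the requested thread count up to whole warps, capped at `max_warps_block` warps) is just one plausible way to use `max_warps_block` and `warp_size` in place of a hard-coded 256.

```rust
/// Assumed to mirror the constant mentioned in the notes above.
const ARRAY_SIZE: usize = 5;

/// Pads per-dimension values (shape or strides) into a fixed-size array.
/// Returns `None` when the tensor has more than `ARRAY_SIZE` dimensions.
/// Hypothetical helper for illustration; not part of the PR.
fn to_fixed(values: &[isize], fill: isize) -> Option<[isize; ARRAY_SIZE]> {
    if values.len() > ARRAY_SIZE {
        return None;
    }
    let mut out = [fill; ARRAY_SIZE];
    out[..values.len()].copy_from_slice(values);
    Some(out)
}

/// Chooses a block size as a whole number of warps: enough warps to cover
/// `wanted` threads, at least one, and at most `max_warps_block`.
/// Assumes `warp_size > 0`. Hypothetical helper; not part of the PR.
fn pick_block_size(max_warps_block: usize, warp_size: usize, wanted: usize) -> usize {
    let warps = (wanted + warp_size - 1) / warp_size; // round up to whole warps
    warps.max(1).min(max_warps_block.max(1)) * warp_size
}
```

With `max_warps_block = 32` and `warp_size = 32`, a request for 256 threads stays at 256 and a request for 100 threads rounds up to 128. A kernel that loops over all `ARRAY_SIZE` slots per thread would do more work as the constant grows, which is consistent with the trade-off described in the first note.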
@pwhMass pwhMass changed the base branch from main to dev May 7, 2025 13:07
@YdrMaster YdrMaster merged commit 972e357 into YdrMaster:dev May 7, 2025

2 participants