Google Summer of Code/2022/Ideas/Refine and complete ROCm: eclass, more packages and downstream softwares
The ROCm™ open software platform is an open source software stack for HPC/Hyperscale-class GPU computing developed by AMD. It currently supports various AMD GPUs (and Nvidia GPUs, by wrapping CUDA), and may include more hardware such as FPGAs in the future. Its packages can be classified into 4 categories: low-level drivers and runtime libraries, developer toolkit, high-level libs and frameworks. Thanks to the contributors to the ROCm overlay, Gentoo has packaged the most important ones.
However, there is still a lot to be done:
1. Enable ROCm for packages like tensorflow, jax, and cupy.
2. Write a rocm.eclass to make ROCm-related packages more maintainable, and consider USE flags for different GPU architectures.
3. Enable more testing.
4. Hold a discussion about open source, heterogeneous computing platforms, and GNU/Linux distros. Due to its open-source nature, ROCm packages can be carefully treated to meet the FHS standard. But they contain binary kernels for GPUs, which is not well considered, and testing GPU libraries requires specific hardware. Those are the challenges we must face if distros package heterogeneous compute packages.
5. Add more packages missing in ::gentoo, such as ROCgdb, rocWMMA, etc.
6. Current ebuild maintenance, including bug fixes and stabilization.
7. Wiki page for ROCm usage and development.
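To make the USE flag idea from item 2 more concrete, here is a minimal sketch of how per-architecture USE flags might be turned into a build argument. It is written in Python purely for illustration; a real rocm.eclass would be written in bash. The `amdgpu_targets_` flag prefix, the list of architectures, and both function names are assumptions made for this example and are not taken from any existing eclass. `AMDGPU_TARGETS` is assumed to be the CMake variable the package's build system reads.

```python
# Hypothetical sketch: the flag prefix, target list, and helper names below
# are illustrative assumptions, not part of any existing eclass.
USE_PREFIX = "amdgpu_targets_"

# Example GPU architecture names; which ones a package supports is an assumption.
KNOWN_TARGETS = ("gfx803", "gfx900", "gfx906", "gfx908", "gfx90a", "gfx1030")


def enabled_targets(use_flags):
    """Return the architectures selected via USE flags, in KNOWN_TARGETS order."""
    return [t for t in KNOWN_TARGETS if USE_PREFIX + t in use_flags]


def cmake_targets_arg(use_flags):
    """Build a CMake argument listing the selected architectures.

    Raises ValueError when no architecture is enabled, since a GPU library
    built without any target would contain no usable kernels.
    """
    targets = enabled_targets(use_flags)
    if not targets:
        raise ValueError("at least one GPU architecture must be enabled")
    return "-DAMDGPU_TARGETS=" + ";".join(targets)


if __name__ == "__main__":
    # Unrelated flags such as "test" are ignored.
    flags = {"test", "amdgpu_targets_gfx906", "amdgpu_targets_gfx1030"}
    print(cmake_targets_arg(flags))
```

Keeping the mapping in one shared place is the point of the eclass: each ebuild would only declare which architectures it supports, instead of repeating the flag handling.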
https://rocmdocs.amd.com/en/latest/index.html
https://github.com/justxi/rocm
https://bugs.gentoo.org/810619
https://bugs.gentoo.org/795825
https://bugs.gentoo.org/817440
|Expected Project Size||Expected Outcomes|
|200 to 300 hours, depending on the actual plan.||