Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS #5222
Hi @martin-frbg, for a non-Apple CPU, the check should enter this part of get_coretype() (verified on QEMU). When TARGET is set to ARMV8, gotoblas_ARMV9SME is NULL, whereas when TARGET is set to ARMV9SME, gotoblas_ARMV9SME is not NULL and the architecture initialization succeeds. Please note that for compilation I am using the following command:
Also, although the test is on QEMU, the SME sgemmdirect kernel will eventually have to run on a Qualcomm device as well. So I think we need to add a support_sme1() check for implementer ID 0x51 here, similar to the one you added for Apple M4.
The way this is supposed to work is that for Linux, it checks a variety of implementer and CPU IDs, and if none of them matches, it runs support_sme1() to see if it should return ARMV9SME.
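The lookup order described above can be modelled in a few lines. This is only a sketch under stated assumptions, not the code in OpenBLAS: `pick_coretype` and its `has_sme` parameter (standing in for the result of support_sme1()) are hypothetical names, and a single Arm part ID stands in for the real table of implementer and CPU IDs.

```c
#include <string.h>

/* Hypothetical model of the detection order: known IDs first,
 * then the SME capability check, then a generic fallback.
 * `has_sme` stands in for the result of support_sme1(). */
const char *pick_coretype(unsigned implementer, unsigned part, int has_sme) {
    if (implementer == 0x41) {          /* Arm Ltd. */
        switch (part) {
        case 0xd0c:                     /* Neoverse N1 */
            return "NEOVERSEN1";
        default:
            break;                      /* unknown part: keep looking */
        }
    }
    /* No implementer/part pair matched: ask about SME. */
    if (has_sme)
        return "ARMV9SME";
    /* Generic fallback (an assumption of this sketch). */
    return "ARMV8";
}
```

In this model an unlisted implementer such as 0x51 reaches the SME check only because it has no branch of its own. If an implementer has a dedicated branch that returns early, the SME check would have to be repeated inside that branch.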
On QEMU, support_sme1() returns true, which I verified using debug prints. I think the issue is somewhere around gotoblas->init ending up NULL.
Moreover, the check for (gotoblas && gotoblas->init) is true when the library is compiled with TARGET=ARMV9SME DYNAMIC_ARCH=1. It fails when TARGET=ARMV8 or ARMV8SVE, DYNAMIC_ARCH=1. I believe the init function maps to init_parameter() taken from the generated file setparam-ARMV9SME.c. This object (setparam-ARMV9SME.o) is generated in both cases (ARMV8 and ARMV9SME). Not sure if I am missing something here .. :(
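For readers unfamiliar with the check being discussed, here is a simplified model of it. It is a sketch, not OpenBLAS source: `kernel_table_t` is a hypothetical stand-in for the real dispatch structure, `init_parameter_stub` for the generated init_parameter(), and the fallback to a second table is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-target dispatch table. */
typedef struct {
    const char *name;
    void (*init)(void);
} kernel_table_t;

int init_calls = 0;
void init_parameter_stub(void) { init_calls++; }

kernel_table_t table_armv8    = { "ARMV8",    init_parameter_stub };
kernel_table_t table_armv9sme = { "ARMV9SME", init_parameter_stub };

/* Mirrors the (table && table->init) test: returns 1 and runs init
 * on success, 0 if the table or its init pointer is missing. */
int try_init(kernel_table_t *t) {
    if (t && t->init) {
        t->init();
        return 1;
    }
    return 0;
}

/* Try the detected table, then the fallback; NULL means the
 * initialization failed for both. */
const char *select_table(kernel_table_t *detected, kernel_table_t *fallback) {
    if (try_init(detected))
        return detected->name;
    if (try_init(fallback))
        return fallback->name;
    return NULL;
}
```

The model shows why a NULL table pointer for the detected target fails the check even when the detection itself succeeded: the object file with the init function can exist without the table pointer being wired to it.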
Hi @martin-frbg, were you able to check on this issue? I tried to fix it, but without any luck. Please let me know if you figure out a solution.
Unfortunately I'm still at the stage of building a kernel with SME support in a Debian VM under QEMU (which is a lot slower than anticipated even on a fast x86_64). I wanted to try Arm FVP instead but did not quite figure out how to make that work.
This is sufficient to enable the SME version of the "small matrix SGEMM" kernel on Apple M4.
Also added is commented-out code for recognizing the M4 as ARMV9SME. This is not yet useful except for testing, as none of the ARMV8SVE kernels that the V9SME target builds upon support streaming SVE.
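As an illustration of what an SME query can look like on each platform, here is a minimal sketch. It is not the code added by this PR: `query_sme` is a hypothetical helper name, the macOS sysctl key `hw.optional.arm.FEAT_SME` and the Linux `HWCAP2_SME` bit are assumptions about the respective platforms, and any other system simply reports no SME.

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#elif defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#ifndef HWCAP2_SME
#define HWCAP2_SME (1UL << 23)   /* fallback for older headers */
#endif
#endif

/* Hypothetical helper: returns 1 if the OS reports SME, else 0. */
int query_sme(void) {
#if defined(__APPLE__)
    int value = 0;
    size_t len = sizeof(value);
    /* Assumed key name; a missing key is treated as "no SME". */
    if (sysctlbyname("hw.optional.arm.FEAT_SME", &value, &len, NULL, 0) != 0)
        return 0;
    return value != 0;
#elif defined(__linux__) && defined(__aarch64__)
    /* Assumed capability bit in the second hwcap word. */
    return (getauxval(AT_HWCAP2) & HWCAP2_SME) != 0;
#else
    return 0;                    /* unsupported platform */
#endif
}
```

Treating a failed query as "no SME" keeps the caller on the non-SME kernels whenever the OS cannot confirm the feature.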