Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve support for RISC-V architecture on Linux #148

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

GlassOfWhiskey
Copy link
Contributor

No description provided.

@facebook-github-bot
Copy link
Contributor

Hi @GlassOfWhiskey!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@GlassOfWhiskey
Copy link
Contributor Author

Hi,
Any comment on this?

@appujee
Copy link

appujee commented Aug 10, 2023

ping. any update on when this will land?

cc: @Maratyszcza

include/cpuinfo.h Outdated Show resolved Hide resolved
processor->features |= CPUINFO_RISCV_LINUX_FEATURE_F;
#endif
} else if(*feature == 'i') {
#if CPUINFO_ARCH_RISCV
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this line not have CPUINFO_ARCH_RISCV64?

src/riscv/linux/hwcap.c Outdated Show resolved Hide resolved
src/riscv/linux/riscv-isa.c Outdated Show resolved Hide resolved
tools/cpu-info.c Outdated Show resolved Hide resolved
tools/cpu-info.c Outdated Show resolved Hide resolved
@appujee
Copy link

appujee commented Aug 10, 2023

torvalds/linux@16252e0 maybe helpful in revising this patch.

Copy link
Contributor

@Maratyszcza Maratyszcza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reformat consistently with the rest of cpuinfo. Use tabs for indentation.

@GlassOfWhiskey GlassOfWhiskey changed the title Added support for RISC-V architecture on Linux DO NOT MERGE: Added support for RISC-V architecture on Linux Nov 19, 2023
@GlassOfWhiskey GlassOfWhiskey marked this pull request as draft November 19, 2023 23:56
@GlassOfWhiskey GlassOfWhiskey changed the title DO NOT MERGE: Added support for RISC-V architecture on Linux Added support for RISC-V architecture on Linux Nov 20, 2023
@GlassOfWhiskey GlassOfWhiskey marked this pull request as ready for review November 20, 2023 00:35
@GlassOfWhiskey
Copy link
Contributor Author

@malfet I've seen that a recent PR (#190) on RISC-V support has been merged into the main branch. However, I think this PR can still be useful for some reasons:

  • It adds support for some additional features, like caches and uarchs;
  • It allows to retrieve CPU information by parsing the /proc/cpuinfo output, as a baseline method;
  • It disables the hwprobe path for kernels older than 6.4.0, which prevents the actual version of cpuinfo to compile on almost all current RISC-V systems on the market.

It is still not clear to me why @prashanthswami chose to open a sibling PR instead of converging the efforts on a single one. However, I tried to manually merge #190 into this PR, applying some minor modifications when deemed necessary.
Please let me know what you think.

@malfet
Copy link
Contributor

malfet commented Nov 20, 2023

@GlassOfWhiskey can you please amend the title and expand description to nightlight what this PR is trying to accomplish.

Copy link
Contributor

@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please stick to the same formatting rules
  • Refactor cpuinfo_smallfile_callback into the separate PR

tools/isa-info.c Outdated Show resolved Hide resolved
tools/isa-info.c Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@@ -88,7 +88,7 @@ inline static bool is_whitespace(char c) {
static const uint32_t default_max_processors_count = CPU_SETSIZE;
#endif

static bool uint32_parser(const char* text_start, const char* text_end, void* context) {
static bool uint32_parser(const char* filename, const char* text_start, const char* text_end, void* context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, but do you mind submitting a separate PR to accomplish that?

Copy link
Contributor Author

@GlassOfWhiskey GlassOfWhiskey Dec 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #211 with these improvements. I'll rebase as soon as it will be merged.

@prashanthswami
Copy link
Collaborator

@GlassOfWhiskey - I sincerely apologize, this was poor GitHub practice on my part. I had written my PR a few months ago on an internal fork and needed it up streamed somewhat quickly, so I pushed it without analyzing the open PRs. When I looked at this issue last, it hadn't been active for a few months, so I was worried it would be inappropriate to ping here and ask for an accelerated landing, just to satisfy my own project timelines. In hindsight, I see that it would have been courteous to message here and coordinate, or CC you on the other PR. I'd like to make sure I don't do this again - should I CC you on any open PRs I have for RISC-V contributions to this repo?

include/cpuinfo.h Outdated Show resolved Hide resolved
#else
return false;
#endif
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this removed because it's not something folks need to switch on? If so, I'm also curious if we want to take this opportunity to remove everything that's not optional. I was a bit over-zealous when adding all the extensions, but when I was chatting about this with some folks they pointed out that the likelihood of anyone showing up with a chipset that doesn't support IMAFD is basically zero. Should we just limit this whole structure to 'c' and 'v' for now?

I'm not too strongly opinionated about this, but more surface area = more maintenance in the long-term.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this because it was not used in any place, and in particular it was never captured by the cpuinfo_riscv_linux_decode_isa_from_hwcap method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see everything removed that is not immediately useful. Attempting to run without 'base isa' is not useful.

src/riscv/api.h Outdated Show resolved Hide resolved
src/riscv/cache.c Show resolved Hide resolved
src/riscv/cache.c Outdated Show resolved Hide resolved
Comment on lines 670 to 694
/* Populate cache information structures in l2 */
uint32_t l2_index = UINT32_MAX;
for (uint32_t processor = 0; processor < valid_processors_count; processor++) {
struct cpuinfo_cache dummy_l1i, dummy_l1d, temp_l2 = { 0 };
cpuinfo_riscv_decode_cache(
riscv_linux_processors[processor].core.uarch,
&dummy_l1i, &dummy_l1d, &temp_l2);

if (temp_l2.size != 0) {
if (riscv_linux_processors[processor].package_leader_id == riscv_linux_processors[processor].processor.linux_id) {
l2_index += 1;
l2[l2_index] = (struct cpuinfo_cache) {
.size = temp_l2.size,
.associativity = temp_l2.associativity,
.sets = temp_l2.sets,
.partitions = temp_l2.partitions,
.line_size = temp_l2.line_size,
.flags = temp_l2.flags,
.processor_start = processor,
.processor_count = riscv_linux_processors[processor].package.processor_count,
};
}
processors[processor].cache.l2 = l2 + l2_index;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we do this in the loop above, right where we commit the L1 information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because we don't know the value of l2_count. THe program first computes how many l2 caches are present, then allocates the l2 data structure and populates it. We can change the logic but it will behave diffferently from what it does now.

src/riscv/linux/init.c Outdated Show resolved Hide resolved
src/riscv/linux/init.c Outdated Show resolved Hide resolved
Comment on lines -19 to -22
/* In case of failure, report unknown. */
*vendor = cpuinfo_vendor_unknown;
*uarch = cpuinfo_uarch_unknown;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to remove this, can we change this function to report a bool success or failure? The caller should know whether or not to trust the information in the structure.

#include <riscv/api.h>
#include <riscv/linux/api.h>

#include <linux/version.h>

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So . . . .philosophical question to @malfet @GlassOfWhiskey . I know current hardware is not all at 6.4 (I guess, to be more fair, most of it isn't), but hwprobe is supposed to be the "proper" way of querying this information. I think it's fair to say that no one likes the prospect of trying to string-parse a proc-file to infer data.

This actually goes back to my comment on the other file about whether we should take this opportunity to just remove all the features that aren't useful to detect (e.g. why do we want to detect 'm', literally everyone will have 'm'). If we do it that way, then the question becomes "if you're only here to detect, say "V", well V isn't available till Linux 6.5, so you must have hwprobe". What do you both think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This #ifdef does not fix the RISC-V build on machines with a 6.4 kernel or greater. When compiling this branch on a RISC-V 64 Ubuntu 23.10 VM, which has a 6.5 kernel, I get the following errors

cpuinfo/src/riscv/linux/riscv-hw.c:11:10: fatal error: sys/hwprobe.h: No such file or directory
   11 | #include <sys/hwprobe.h>
      |          ^~~~~~~~~~~~~~~
compilation terminated.

It's not clear to me where sys/hwprobe.h is supposed to come from. @prashanthswami I think you added this originally. Were you testing with your own build of glibc?

Also, IIUC, with this compile time check on the kernel version, hwprobe will only be used if both the build machine and the system running the binary have a 6.4 kernel or greater. If we build on Ubuntu 22.04 for example but run on 23.10 the hwprobe path would not be taken here even though 23.10 has a 6.5 kernel with hwprobe support.

Other projects that are using hwprobe right now get around this problem by making the syscall directly. OpenJDK duplicates the constants from the kernel headers needed for the sys call and then does a direct syscall. It doesn't include the kernel header files at all. A nicer solution would be to first check to see if the <asm/hwprobe.h> exists either using a build system check that defines a macro if a program that includes <asm/hwprobe.h> builds or perhaps using the trick the glibc patch uses

#ifdef __has_include
# if __has_include (<asm/hwprobe.h>)
#  include <asm/hwprobe.h>
# endif
#endif

Regardless of the method used, if the include file is not found, manually defined constants for the hwprobe syscall number and the various bitmasks can be defined, so the syscall can still be issued, regardless of the kernel version on the build machine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My primary use-case was Android on RISC-V, so I likely overcorrected here by testing primarily against the Android NDK (note, there's no official NDK with RISC-V support yet, but the public canary builds work). This header is available in bionic, so building against the NDK does not complain.

The proposal to check for the header explicitly seems reasonable enough to me and there's precedent for it.

As a side note - we should indeed have a CI bot for this. I can't add one for Android just yet, because the common Github workflows don't support canary Android NDKs, but I've filed #206 to do so when it is. Separately, it would probably be valuable to run a linux-riscv64 build to catch stuff like this, so I'll figure out that out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that explains it. It looks like you're pulling in this header which contains a definition of __riscv_hwprobe that appears to be compatible with the pending glibc patch. Ultimately, the thing to do would be to check for <sys/hwprobe.h> and, if it's not present, fall back to code that makes the syscall directly. However, perhaps until the glibc patch is merged, <sys/hwprobe.h> should only be included when building for Android. A manual riscv_hwprobe syscall could then be used for other Linux distributions for the time being.

Copy link
Contributor

@markdryan markdryan Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@malfet, @GlassOfWhiskey , @prashanthswami The build issues on mainline Linux distributions are still not addressed by this patch. It hasn't been possible to build pytorch/cpuinfo on a RISC-V device running Ubuntu 23.10 for over 3 weeks and this patch in its current form won't help. Would there by any objections from anyone if I submit a PR to fix the build issues? The PR would only address the build issues to avoid introducing too many conflicts. Other issues with hwprobe, like the fact we're calling it far too often in most cases, could be addressed in a subsequent PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdryan no problem from my side. Also because probably things will go a bit longer in approving this PR as many aspects are involved.
I missed what was the preferred solution among the ones proposed so I didn't do anything on that direction for now. Feel free to proceed as you prefer.
The __has_include path is probably the most elegant one. It is not fully C standard, but it is now part of the C++ standard since C++17 so I'd vote for that one.

@GlassOfWhiskey GlassOfWhiskey changed the title Added support for RISC-V architecture on Linux Improve support for RISC-V architecture on Linux Nov 20, 2023
@GlassOfWhiskey
Copy link
Contributor Author

@GlassOfWhiskey - I sincerely apologize, this was poor GitHub practice on my part. I had written my PR a few months ago on an internal fork and needed it up streamed somewhat quickly, so I pushed it without analyzing the open PRs. When I looked at this issue last, it hadn't been active for a few months, so I was worried it would be inappropriate to ping here and ask for an accelerated landing, just to satisfy my own project timelines. In hindsight, I see that it would have been courteous to message here and coordinate, or CC you on the other PR. I'd like to make sure I don't do this again - should I CC you on any open PRs I have for RISC-V contributions to this repo?

@prashanthswami no problem, it was just a bit of work to merge all the things, but the important thing is to provide community with RISC-V support, no matter how or who :)
I'll start addressing your comments. Since there is a lot of stuff, I'll probably do it incrementally, so that we can track progresses. Then we can squash all commits at the end.

As a general comment @prashanthswami @malfet: if you care so much about the code style and coherence, why don't you or someone in the maintainers team put a .clang-format file or something equivalent in the repo? So that contributors can automatically format the code before submitting and you can integrate style adherence directly in the CI pipeline?

@prashanthswami
Copy link
Collaborator

In regards to style-adherence, I'll defer to @malfet here, I'm only along for the ride :) my format comments were largely just parroting earlier comments Marat / Nikita gave me. Supportive though of the idea, anything that lets my IDE auto-switch to the right format is desirable. Does PyTorch have top-level guidance about what is considered proper code format?

#include <riscv/api.h>
#include <riscv/linux/api.h>

#include <linux/version.h>

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,4,0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This #ifdef does not fix the RISC-V build on machines with a 6.4 kernel or greater. When compiling this branch on a RISC-V 64 Ubuntu 23.10 VM, which has a 6.5 kernel, I get the following errors

cpuinfo/src/riscv/linux/riscv-hw.c:11:10: fatal error: sys/hwprobe.h: No such file or directory
   11 | #include <sys/hwprobe.h>
      |          ^~~~~~~~~~~~~~~
compilation terminated.

It's not clear to me where sys/hwprobe.h is supposed to come from. @prashanthswami I think you added this originally. Were you testing with your own build of glibc?

Also, IIUC, with this compile time check on the kernel version, hwprobe will only be used if both the build machine and the system running the binary have a 6.4 kernel or greater. If we build on Ubuntu 22.04 for example but run on 23.10 the hwprobe path would not be taken here even though 23.10 has a 6.5 kernel with hwprobe support.

Other projects that are using hwprobe right now get around this problem by making the syscall directly. OpenJDK duplicates the constants from the kernel headers needed for the sys call and then does a direct syscall. It doesn't include the kernel header files at all. A nicer solution would be to first check to see if the <asm/hwprobe.h> exists either using a build system check that defines a macro if a program that includes <asm/hwprobe.h> builds or perhaps using the trick the glibc patch uses

#ifdef __has_include
# if __has_include (<asm/hwprobe.h>)
#  include <asm/hwprobe.h>
# endif
#endif

Regardless of the method used, if the include file is not found, manually defined constants for the hwprobe syscall number and the various bitmasks can be defined, so the syscall can still be issued, regardless of the kernel version on the build machine.

src/riscv/linux/riscv-isa.c Show resolved Hide resolved
Copy link
Contributor

@markdryan markdryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more comments about the current use of riscv_hwprobe. The use of hwprobe I'm commenting on here was introduced in an earlier PR but as we're having the conversation about hwprobe here I thought it worth mentioning. Happy to copy these comments to an issue if this is preferred.

  1. We're calling hwprobe once for each processor to retrieve the vendor id and uarch. If we have lots of processors this will result in lots of syscalls (and lots of failures on kernels older than 6.4). There's a good chance that all the processors will have the same vendor and uarch, so it's probably worth trying to retrieve the vendor and uarch for all the processors we're interested in a single syscall, outside of a loop. If -1 is returned for either value, different processors do have different values for uarch and vendor id and so they would need to be queried on a processor by processor basis as is currently done.
  2. We don't use hwprobe to detect the ISA extensions. We're only using hwcap, which cannot detect multi-letter extensions. It would be better to check for extensions with hwprobe first, falling back to hwcap if hwprobe is not available. This could be done in the same call used to retrieve the vendor and uarch for all processors as recommend in point 1 above. Using riscv_hwprobe in this way would allow us to detect Zba, Zbb and Zbs (and more extensions in the future).

src/riscv/linux/init.c Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants