anyint is not any

Introduction

This is an exploration log. When writing an LLVM IR intrinsic, it is common to use llvm_anyint_ty to accept a range of types rather than one single type like llvm_i8_ty. A quick search shows that it is defined at include/llvm/IR/Intrinsics.td:L235 as LLVMType<iAny>, with iAny defined at include/llvm/IR/CodeGen.td:L217. Combined with the nearby comment, // Pseudo valuetype to represent "integer of any bit width", this makes one think that for an intrinsic def int_foo_bar : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]>;, the following intrinsics all exist:

declare i1 @llvm.foo.bar.i1(i1);
declare i8 @llvm.foo.bar.i8(i8);
declare i16 @llvm.foo.bar.i16(i16);
declare i32 @llvm.foo.bar.i32(i32);
declare i64 @llvm.foo.bar.i64(i64);
declare i128 @llvm.foo.bar.i128(i128);

That is, for every integer ValueType in ValueTypes.td, there is an overload. The author of LLVM revision D66479 seems to have had the same confusion.

This problem came up while I was implementing the LLVM IR intrinsics of the K (crypto) extension, so the description below uses that as the example.

In Markku's intrinsic proposal, there is an intrinsic _rv_sha256sig0 that supports parameter types i32/i64, while other parameter types apparently need to be cast. So, when implementing it, the anyint below seems to make the implementation mismatch the proposal: types like i1, i8, and i128 are not what we want.

class ScalarCryptoGprIntrinsicAny : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]>;

def int_riscv_sha256sig0 : ScalarCryptoGprIntrinsicAny;

def : PatGpr<int_riscv_sha256sig0, SHA256SIG0>;
// Record SHA256SIG0 defined in the MC Layer, demonstrating an instruction

Have a try!

But does it work as its name suggests? Let's look at the log below for the record above.

$ cat test.ll
declare i32 @llvm.riscv.sha256sig0(i32);

define i32 @sha256sig0(i32 %a) nounwind {
; RV32IK-LABEL: sha256sig0
; RV32IK: # %bb.0:
; RV32IK-NEXT: sha256sig0 a{{[0-9]+}}, a{{[0-9]+}}
; RV32IK-NEXT: ret
    %val = call i32 @llvm.riscv.sha256sig0(i32 %a)
    ret i32 %val
}

$ ./bin/llc -mtriple=riscv32 -mattr=+experimental-k -verify-machineinstrs test.ll
# Try running $ cat test.s here!
$ sed "s/i32/i8/" test.ll -i
$ cat test.ll
declare i8 @llvm.riscv.sha256sig0(i32);

define i8 @sha256sig0(i32 %a) nounwind {
; RV32IK-LABEL: sha256sig0
; RV32IK: # %bb.0:
; RV32IK-NEXT: sha256sig0 a{{[0-9]+}}, a{{[0-9]+}}
; RV32IK-NEXT: ret
    %val = call i8 @llvm.riscv.sha256sig0(i32 %a)
    ret i8 %val
}

$ ./bin/llc -mtriple=riscv32 -mattr=+experimental-k -verify-machineinstrs test.ll
Intrinsic has incorrect argument type!
i8 (i32)* @llvm.riscv.sha256sig0
llc: error: 'test.ll': input module cannot be verified

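Type i8 is an integer type, and we specified anyint, right? However, it seems that not every integer type is accepted. (Note that sed replaced only the first i32 on each line, so the declaration above mixes i8 and i32, which by itself conflicts with LLVMMatchType<0>; this error alone is therefore not conclusive.) So what is accepted? Let's look at what TableGen generates from the .td files: the .inc files. To be specific, we want to see the file RISCVGenGlobalISel.inc.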

Searching for sha256sig0, the few results quickly lead us to a specific region that provides the answer we are looking for. For context, the records immediately before and after those of sha256sig0 are pasted as well.

      // Label 510: @16451
      GIM_Try, /*On fail goto*//*Label 511*/ 16491, // Rule ID 42126 //
        GIM_CheckFeatures, GIFBS_HasStdExtZknd_IsRV64,
        GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_aes64im,
        GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
        GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
        // (intrinsic_wo_chain:{ *:[i64] } 6691:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1)  =>  (AES64IM:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
        GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::AES64IM,
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
        GIR_EraseFromParent, /*InsnID*/0,
        GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
        // GIR_Coverage, 42126,
        GIR_Done,
      // Label 511: @16491
      GIM_Try, /*On fail goto*//*Label 512*/ 16531, // Rule ID 42132 //
        GIM_CheckFeatures, GIFBS_HasStdExtZknh,
        GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig0,
        GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
        GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
        // (intrinsic_wo_chain:{ *:[i64] } 6723:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1)  =>  (SHA256SIG0:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
        GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG0,
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
        GIR_EraseFromParent, /*InsnID*/0,
        GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
        // GIR_Coverage, 42132,
        GIR_Done,
      // Label 512: @16531
      GIM_Try, /*On fail goto*//*Label 513*/ 16571, // Rule ID 42133 //
        GIM_CheckFeatures, GIFBS_HasStdExtZknh,
        GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig0,
        GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
        GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
        // (intrinsic_wo_chain:{ *:[i32] } 6723:{ *:[iPTR] }, GPR:{ *:[i32] }:$rs1)  =>  (SHA256SIG0:{ *:[i32] } GPR:{ *:[i32] }:$rs1)
        GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG0,
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
        GIR_EraseFromParent, /*InsnID*/0,
        GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
        // GIR_Coverage, 42133,
        GIR_Done,
      // Label 513: @16571
      GIM_Try, /*On fail goto*//*Label 514*/ 16611, // Rule ID 42134 //
        GIM_CheckFeatures, GIFBS_HasStdExtZknh,
        GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig1,
        GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
        GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
        GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
        // (intrinsic_wo_chain:{ *:[i64] } 6724:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1)  =>  (SHA256SIG1:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
        GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG1,
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
        GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
        GIR_EraseFromParent, /*InsnID*/0,
        GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
        // GIR_Coverage, 42134,
        GIR_Done,

Looking at the GIM_CheckType lines, only i32/i64 (GILLT_s32/GILLT_s64) appear there, which is exactly what we need. But what happened to the other types? How were those two types selected?

Exploring for the Answer

As a start, it is natural to follow the one clue we have, iAny. A simple way is to search for it globally, which leads us to the class MVT, and to its SimpleValueType enumeration to be exact.

Looking at the enumeration, the entry that attracts us most is iAny, and we are curious where it is used. Searching for references to it gives a few interesting results, shown below.

The last one looks like the most valuable one to check, since the name Legal suggests it might be what limits the types available to an overload.

Jumping to the definition of the count method, we find it a bit magical-looking: return (Words[T.SimpleTy / WordWidth] >> (T.SimpleTy % WordWidth)) & 1;

To see what this does, let's actually run it and look at all those values!

Playing Around with Debugger

Before running, we should know what role GlobalISel plays in the whole build process. As described above, the file RISCVGenGlobalISel.inc is generated by TableGen. This file is included in lib/Target/RISCV/RISCVInstructionSelector.cpp, and the file name clearly shows the work it does. Thus, to find the answer, we need to run TableGen to generate the GlobalISel-related file for RISCV, which gives the command to debug: bin/llvm-tblgen --gen-global-isel -I ~/llvm-project/llvm/include, with a breakpoint set at the line Legal.count(T) mentioned above.

And it stopped at that point. Printing the values, we get:

Looking at that table, and considering that (3680)_10 = (111001100000)_2 and that we only got the i32 and i64 types, it becomes clear how they were selected.

What about that really huge number in Legal.Words[1]? Converting it, we get (18446726481523507200)_10 = (FFFFF00000000000)_16. Note that for the count method to use Legal.Words[1], the condition WordWidth <= T.SimpleTy < 2*WordWidth must be satisfied. When T.SimpleTy == WordWidth, T.SimpleTy represents the type v64i64, which is unsupported; it is the nxv*i* types that are supported.
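
Both conversions above can be double-checked mechanically. The following is a minimal sketch in standalone C++; popCount, lowestSetBit, and testBit are hypothetical helper names, not LLVM functions. The type names in the comments assume the SimpleValueType numbering i32 = 5, i64 = 6, f16 = 9, f32 = 10, f64 = 11. That numbering changes between LLVM versions, so it should be checked against the enumeration at the commit in use.

```cpp
#include <cstdint>

// Number of set bits in a 64-bit word.
constexpr unsigned popCount(std::uint64_t Word) {
  unsigned N = 0;
  for (; Word != 0; Word >>= 1)
    N += static_cast<unsigned>(Word & 1);
  return N;
}

// Position of the lowest set bit; Word must be non-zero.
constexpr unsigned lowestSetBit(std::uint64_t Word) {
  unsigned Pos = 0;
  while (((Word >> Pos) & 1) == 0)
    ++Pos;
  return Pos;
}

// True if bit Pos of Word is set.
constexpr bool testBit(std::uint64_t Word, unsigned Pos) {
  return ((Word >> Pos) & 1) != 0;
}

// The two values printed by the debugger.
constexpr std::uint64_t Word0 = 3680;
constexpr std::uint64_t Word1 = 18446726481523507200ULL;

// Words[0]: exactly bits 5, 6, 9, 10, 11 are set
// (i32, i64, f16, f32, f64 under the assumed numbering).
static_assert(Word0 == 0xE60, "3680 is 0b111001100000");
static_assert(popCount(Word0) == 5, "five types in the first word");
static_assert(testBit(Word0, 5) && testBit(Word0, 6), "bits 5 and 6");
static_assert(testBit(Word0, 9) && testBit(Word0, 10) && testBit(Word0, 11),
              "bits 9, 10, and 11");

// Words[1]: the top 20 bits are set, starting at bit 44.
static_assert(Word1 == 0xFFFFF00000000000ULL, "hex form of Words[1]");
static_assert(lowestSetBit(Word1) == 44 && popCount(Word1) == 20,
              "bits 44..63");
```

Every check is a static_assert, so compiling the file is enough to confirm the arithmetic. Bit 44 of Words[1] corresponds to SimpleTy 64 + 44 = 108 when WordWidth is 64.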

As for that magic expression: there are many types, and each one needs a single bit to record whether it is "legal", so each Words[i] stores the bits for a run of consecutive types in the enumeration. The division picks the word the type falls in, the modulus and right shift move its bit to the lowest position, and & 1 extracts the yes-or-no verdict.
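
The division, modulus, and shift described above can be sketched as a tiny bit set. This is a simplified stand-in, not LLVM's actual class: TinyTypeSet, its capacity of four words, and makeExample are assumptions made for illustration, and plain unsigned IDs stand in for MVT values.

```cpp
#include <cstdint>

// Simplified stand-in for the bit set behind Legal (hypothetical type).
struct TinyTypeSet {
  static constexpr unsigned WordWidth = 64;
  static constexpr unsigned NumWords = 4; // assumed capacity: 256 type IDs
  std::uint64_t Words[NumWords] = {};

  // Set the bit for Id: division selects the word, modulus the bit.
  constexpr void insert(unsigned Id) {
    Words[Id / WordWidth] |= std::uint64_t(1) << (Id % WordWidth);
  }

  // Shift the wanted bit down to position 0 and mask it out.
  constexpr bool count(unsigned Id) const {
    return ((Words[Id / WordWidth] >> (Id % WordWidth)) & 1) != 0;
  }
};

// Build a set holding IDs 5, 6, 9, 10, 11 and one ID in the second word.
constexpr TinyTypeSet makeExample() {
  TinyTypeSet S{};
  S.insert(5);
  S.insert(6);
  S.insert(9);
  S.insert(10);
  S.insert(11);
  S.insert(108); // 108 = 1 * 64 + 44, so word 1, bit 44
  return S;
}

constexpr TinyTypeSet Example = makeExample();

static_assert(Example.Words[0] == 3680, "same value as seen in the debugger");
static_assert(Example.Words[1] == (std::uint64_t(1) << 44), "ID 108 in word 1");
static_assert(Example.count(5) && Example.count(108), "inserted IDs are found");
static_assert(!Example.count(3), "ID 3 was never inserted");
```

Inserting the first five IDs reproduces the 3680 seen in Words[0], which is a quick way to convince oneself that the one-bit-per-type reading is right.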

But where was this set written? How did it get those magic numbers? Let's look at the stack trace to see where we are now.

#0  llvm::TypeInfer::expandOverloads (this=0x7fffffffd058, Out=..., Legal=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:792
#1  0x00005555555ff90b in llvm::TypeInfer::expandOverloads (this=0x7fffffffd058, VTS=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:770
#2  0x0000555555621ced in llvm::TreePatternNode::UpdateNodeType (this=0x55555ed070a0, ResNo=0, InTy=-3, TP=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.h:991
#3  0x000055555560a3ac in llvm::TreePatternNode::ApplyTypeConstraints (this=0x55555ed06f10, TP=..., NotRegisters=false)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2448
#4  0x000055555560da38 in llvm::TreePattern::InferAllTypes (this=0x7fffffffcfd0, InNamedTypes=0x7fffffffcfe8)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2979
#5  0x0000555555614b88 in llvm::CodeGenDAGPatterns::ParseOnePattern (this=0x7fffffffd338, TheDef=0x55555bb16f30, Pattern=..., Result=..., 
    InstImpResults=std::vector of length 0, capacity 0) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:4176
#6  0x00005555556155aa in llvm::CodeGenDAGPatterns::ParsePatterns (this=0x7fffffffd338)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:4284
#7  0x000055555560e530 in llvm::CodeGenDAGPatterns::CodeGenDAGPatterns(llvm::RecordKeeper&, std::function<void (llvm::TreePattern*)>) (
    this=0x7fffffffd338, R=..., PatternRewriter=...) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:3084
#8  0x00005555557dd23d in (anonymous namespace)::GlobalISelEmitter::GlobalISelEmitter (this=0x7fffffffd330, RK=...)
    at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:3722
#9  0x00005555557ee2ba in llvm::EmitGlobalISel (RK=..., OS=...)
    at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:6165
#10 0x0000555555908029 in (anonymous namespace)::LLVMTableGenMain (OS=..., Records=...)
    at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:249
#11 0x0000555555a443f8 in llvm::TableGenMain (argv0=0x7fffffffe3b0 "~/llvm-project/llvm/build/bin/llvm-tblgen", 
    MainFn=0x5555559078f9 <(anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&)>)
    at ~/llvm-project/llvm/lib/TableGen/Main.cpp:120
#12 0x00005555559081ea in main (argc=5, argv=0x7fffffffe128) at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:285

Among these frames, ApplyTypeConstraints sounds relevant, so let's return all the way up to it and see what is happening.

#0  0x000055555560a3ac in llvm::TreePatternNode::ApplyTypeConstraints (this=0x55555ed06f10, TP=..., NotRegisters=false)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2448
(gdb) l
2443        for (unsigned i = 0, e = getNumChildren()-1; i != e; ++i) {
2444          MadeChange |= getChild(i+1)->ApplyTypeConstraints(TP, NotRegisters);
2445
2446          MVT::SimpleValueType OpVT = Int->IS.ParamVTs[i];
2447          assert(getChild(i+1)->getNumTypes() == 1 && "Unhandled case");
2448          MadeChange |= getChild(i+1)->UpdateNodeType(0, OpVT, TP);
2449        }
2450        return MadeChange;
2451      }

However, it does not seem very helpful. Let's keep returning until we reach a constructor, where we see what we might want:

CodeGenDAGPatterns::CodeGenDAGPatterns(RecordKeeper &R,
                                       PatternRewriterFn PatternRewriter)
    : Records(R), Target(R), LegalVTS(Target.getLegalValueTypes()),
      PatternRewriter(PatternRewriter) {...}

The Target here and its legal value types are what we really want, so let's jump into that!

ArrayRef<ValueTypeByHwMode> getLegalValueTypes() const {
  if (LegalValueTypes.empty())
    ReadLegalValueTypes();
  return LegalValueTypes;
}
void CodeGenTarget::ReadLegalValueTypes() const {
  for (const auto &RC : getRegBank().getRegClasses())
    llvm::append_range(LegalValueTypes, RC.VTs);
  ...
}

The methods getRegBank and getRegClasses give us a clue to look in the RISCV target directory, bringing us to the file ~/llvm-project/llvm/lib/Target/RISCV/RISCVRegisterBankInfo.cpp, with the statement #include "RISCVGenRegisterBank.inc". Unfortunately, that has no strong connection with what we are after. So how about another file with a similar name, RISCVGenRegisterInfo.inc? Opening it, we find the answer we were looking for.

static const MVT::SimpleValueType VTLists[] = {
  /* 0 */ MVT::i32, MVT::Other,
  /* 2 */ MVT::i64, MVT::Other,
  /* 4 */ MVT::f16, MVT::Other,
  /* 6 */ MVT::f32, MVT::Other,
  /* 8 */ MVT::f64, MVT::Other,
  /* 10 */ MVT::nxv64i1, MVT::nxv32i1, MVT::nxv16i1, MVT::nxv8i1, MVT::nxv4i1, MVT::nxv2i1, MVT::nxv1i1, MVT::Other,
  /* 18 */ MVT::nxv8i8, MVT::nxv4i16, MVT::nxv2i32, MVT::nxv1i64, MVT::nxv4f16, MVT::nxv2f32, MVT::nxv1f64, MVT::nxv4i8, MVT::nxv2i8, MVT::nxv1i8, MVT::nxv2i16, MVT::nxv1i16, MVT::nxv1i32, MVT::nxv1f16, MVT::nxv2f16, MVT::nxv1f32, MVT::nxv1i1, MVT::nxv2i1, MVT::nxv4i1, MVT::nxv8i1, MVT::nxv16i1, MVT::nxv32i1, MVT::nxv64i1, MVT::Other,
  /* 42 */ MVT::nxv16i8, MVT::nxv8i16, MVT::nxv4i32, MVT::nxv2i64, MVT::nxv8f16, MVT::nxv4f32, MVT::nxv2f64, MVT::Other,
  /* 50 */ MVT::nxv32i8, MVT::nxv16i16, MVT::nxv8i32, MVT::nxv4i64, MVT::nxv16f16, MVT::nxv8f32, MVT::nxv4f64, MVT::Other,
  /* 58 */ MVT::nxv64i8, MVT::nxv32i16, MVT::nxv16i32, MVT::nxv8i64, MVT::nxv32f16, MVT::nxv16f32, MVT::nxv8f64, MVT::Other,
  /* 66 */ MVT::Untyped, MVT::Other,
};

So how are those types inserted into the set? Right above the count method is the insert method. Let's break there and take a quick look.

(gdb) where
#0  llvm::MachineValueTypeSet::insert (this=0x55555d151f58, T=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.h:91
#1  0x00005555555f9b87 in llvm::TypeSetByHwMode::insert (this=0x7fffffffd748, VVT=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:119
#2  0x00005555555f9499 in llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd748, VTList=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:72
#3  0x000055555560e462 in llvm::CodeGenDAGPatterns::CodeGenDAGPatterns(llvm::RecordKeeper&, std::function<void (llvm::TreePattern*)>) (
    this=0x7fffffffd2f8, R=..., PatternRewriter=...) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:3074
#4  0x00005555557dd23d in (anonymous namespace)::GlobalISelEmitter::GlobalISelEmitter (this=0x7fffffffd2f0, RK=...)
    at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:3722
#5  0x00005555557ee2ba in llvm::EmitGlobalISel (RK=..., OS=...)
    at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:6165
#6  0x0000555555908029 in (anonymous namespace)::LLVMTableGenMain (OS=..., Records=...)
    at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:249
#7  0x0000555555a443f8 in llvm::TableGenMain (argv0=0x7fffffffe382 "~/llvm-project/llvm/build/bin/llvm-tblgen", 
    MainFn=0x5555559078f9 <(anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&)>)
    at ~/llvm-project/llvm/lib/TableGen/Main.cpp:120
#8  0x00005555559081ea in main (argc=5, argv=0x7fffffffe0e8) at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:285

The insert itself is not surprising. Let's return.

/*
#0  llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd748, VTList=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:73
*/
bool TypeSetByHwMode::insert(const ValueTypeByHwMode &VVT) {
  bool Changed = false;
  bool ContainsDefault = false;
  MVT DT = MVT::Other;

  SmallDenseSet<unsigned, 4> Modes;
  for (const auto &P : VVT) {
    unsigned M = P.first;
    Modes.insert(M);
  //...

Continuing to return, we reach the boundary, which is the constructor below. Let's look at the code here:

/*
#0  llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd788, VTList=...)
    at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:72
*/
TypeSetByHwMode::TypeSetByHwMode(ArrayRef<ValueTypeByHwMode> VTList) {
  for (const ValueTypeByHwMode &VVT : VTList) {
    insert(VVT);
    AddrSpaces.push_back(VVT.PtrAddrSpace);
  }
}

And there is a VTList; printing out its values, we get:

(gdb) p VTList
$1 = {Data = 0x55555d19f5a0, Length = 49}
(gdb) p *VTList.Data
$2 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 2 elements = {[0] = {SimpleTy = llvm::MVT::i32}, [1] = {
        SimpleTy = llvm::MVT::i64}}}, PtrAddrSpace = 4294967295}
(gdb) p *(VTList.Data+1)
$3 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 1 element = {[0] = {SimpleTy = llvm::MVT::f16}}}, PtrAddrSpace = 4294967295}
(gdb) p *(VTList.Data+20)
$4 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 1 element = {[0] = {SimpleTy = llvm::MVT::nxv4i16}}}, 
  PtrAddrSpace = 4294967295}

It seems the information comes from here. (I took a few days off, and when I got back I had forgotten what happened in between 😥) Digging in further, we find that the constructor of CodeGenDAGPatterns has an initializer list entry LegalVTS(Target.getLegalValueTypes()). This function simply builds and saves the list if it has not been built yet, and otherwise returns the saved one:

ArrayRef<ValueTypeByHwMode> getLegalValueTypes() const {
  if (LegalValueTypes.empty())
    ReadLegalValueTypes();
  return LegalValueTypes;
}

How does ReadLegalValueTypes() work? The code is as below:

void CodeGenTarget::ReadLegalValueTypes() const {
  for (const auto &RC : getRegBank().getRegClasses())
    llvm::append_range(LegalValueTypes, RC.VTs);
  ...
}

Here, LegalValueTypes is a SmallVector being written into. Since the information is taken from each register class that ReadLegalValueTypes iterates over, how about printing those out? Unfortunately, that turns out to be a huge, messy dump. So let's just see how getRegBank() and getRegClasses() work. Again, the object is constructed and saved if it does not exist yet, and the saved one is returned otherwise; the construction is the statement RegBank = std::make_unique<CodeGenRegBank>(Records, getHwModes());.
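
The construct-on-first-use pattern described above can be sketched as follows. ToyTarget and its members are hypothetical stand-ins, not LLVM's CodeGenTarget, and the list of type names is hard-coded purely for illustration. The sketch assumes the cache is declared mutable so that a const accessor is allowed to fill it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a target description with a lazily built cache.
class ToyTarget {
  // `mutable` lets const member functions fill the cache on first use.
  mutable std::vector<std::string> LegalValueTypes;
  mutable unsigned ReadCount = 0;

  // Stands in for the expensive scan over all register classes.
  void readLegalValueTypes() const {
    ++ReadCount;
    LegalValueTypes = {"i32", "i64", "f16", "f32", "f64"};
  }

public:
  // Build the list only if it is still empty, then return the saved copy.
  const std::vector<std::string> &getLegalValueTypes() const {
    if (LegalValueTypes.empty())
      readLegalValueTypes();
    return LegalValueTypes;
  }

  unsigned timesRead() const { return ReadCount; }
};

// Calling the accessor repeatedly triggers only one read.
inline void demoLazyCache() {
  ToyTarget T;
  assert(T.timesRead() == 0);
  assert(T.getLegalValueTypes().size() == 5);
  assert(T.getLegalValueTypes().front() == "i32");
  assert(T.timesRead() == 1);
}
```

One consequence of using empty() as the "not built yet" marker is that a legitimately empty result would be recomputed on every call; that is harmless here because a target without any legal type is not a useful one.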

Jumping into the constructors, this one seems well worth paying attention to:

CodeGenRegisterClass::CodeGenRegisterClass(CodeGenRegBank &RegBank, Record *R)
    : TheDef(R), Name(std::string(R->getName())),
      TopoSigs(RegBank.getNumTopoSigs()), EnumValue(-1) {
  GeneratePressureSet = R->getValueAsBit("GeneratePressureSet");
  std::vector<Record*> TypeList = R->getValueAsListOfDefs("RegTypes");
  if (TypeList.empty())
    PrintFatalError(R->getLoc(), "RegTypes list must not be empty!");
  for (unsigned i = 0, e = TypeList.size(); i != e; ++i) {
    Record *Type = TypeList[i];
    if (!Type->isSubClassOf("ValueType"))
      PrintFatalError(R->getLoc(),
                      "RegTypes list member '" + Type->getName() +
                          "' does not derive from the ValueType class!");
    VTs.push_back(getValueTypeByHwMode(Type, RegBank.getHwModes()));
  }
  /// This method looks up the specified field and
  /// returns its value as a vector of records, throwing an exception if the
  /// field does not exist or if the value is not the right type.
  std::vector<Record*> getValueAsListOfDefs(StringRef FieldName) const;

And it is clear that it reads the defs listed in the RegTypes field of each register class. Now let's see what the records look like:

$ bin/llvm-tblgen ../lib/Target/RISCV/RISCV.td --print-records -I /home/xueqixing/llvm-project/llvm/include -I ../lib/Target/RISCV > record.td
$ grep "RegTypes" record.td
  list<ValueType> RegTypes = RegisterClass:regTypes;
  list<ValueType> RegTypes = VReg:regTypes;
  list<ValueType> RegTypes = [f16];
  list<ValueType> RegTypes = [f32];
  list<ValueType> RegTypes = [f32];
  list<ValueType> RegTypes = [f64];
  list<ValueType> RegTypes = [f64];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [XLenVT];
  list<ValueType> RegTypes = [nxv64i1, nxv32i1, nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1];
  list<ValueType> RegTypes = [nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];
  list<ValueType> RegTypes = [nxv8i8, nxv4i16, nxv2i32, nxv1i64, nxv4f16, nxv2f32, nxv1f64, nxv4i8, nxv2i8, nxv1i8, nxv2i16, nxv1i16, nxv1i32, nxv1f16, nxv2f16, nxv1f32, nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];
  list<ValueType> RegTypes = [nxv16i8, nxv8i16, nxv4i32, nxv2i64, nxv8f16, nxv4f32, nxv2f64];
  list<ValueType> RegTypes = [nxv16i8, nxv8i16, nxv4i32, nxv2i64, nxv8f16, nxv4f32, nxv2f64];
  list<ValueType> RegTypes = [nxv32i8, nxv16i16, nxv8i32, nxv4i64, nxv16f16, nxv8f32, nxv4f64];
  list<ValueType> RegTypes = [nxv32i8, nxv16i16, nxv8i32, nxv4i64, nxv16f16, nxv8f32, nxv4f64];
  list<ValueType> RegTypes = [nxv64i8, nxv32i16, nxv16i32, nxv8i64, nxv32f16, nxv16f32, nxv8f64];
  list<ValueType> RegTypes = [nxv64i8, nxv32i16, nxv16i32, nxv8i64, nxv32f16, nxv16f32, nxv8f64];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [untyped];
  list<ValueType> RegTypes = [nxv8i8, nxv4i16, nxv2i32, nxv1i64, nxv4f16, nxv2f32, nxv1f64, nxv4i8, nxv2i8, nxv1i8, nxv2i16, nxv1i16, nxv1i32, nxv1f16, nxv2f16, nxv1f32, nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];

With the -U20 parameter, we can see some context around those lines and find them nested in register classes.

def GPR {	// DAGOperand RegisterClass
  string OperandNamespace = "MCOI";
  string DecoderMethod = "";
  string Namespace = "RISCV";
  RegInfoByHwMode RegInfos = XLenRI;
  list<ValueType> RegTypes = [XLenVT];
  int Size = 0;
  int Alignment = 32;
  int CopyCost = 1;
  dag MemberList = (add (sequence "X%u", 10, 17), (sequence "X%u", 5, 7), (sequence "X%u", 28, 31), (sequence "X%u", 8, 9), (sequence "X%u", 18, 27), (sequence "X%u", 0, 4));
  RegAltNameIndex altNameIndex = NoRegAltName;
  bit isAllocatable = 1;
  list<dag> AltOrders = [];
  code AltOrderSelect = [{}];
  int AllocationPriority = 0;
  bit GeneratePressureSet = 1;
  int Weight = ?;
  string DiagnosticType = "";
  string DiagnosticString = "";
}

Among them, there is an interesting thing called XLenVT, so let's see what it is:

def XLenVT {    // HwModeSelect ValueType ValueTypeByHwMode
  list<HwMode> Modes = [DefaultMode, RV64];
  string Namespace = "MVT";
  int Size = 0;
  int Value = 0;
  list<ValueType> Objects = [i32, i64];
}

It was originally defined as def XLenVT : ValueTypeByHwMode<[RV32, RV64], [i32, i64]>;. Thus, with this configuration, only the integer types i32 and i64 are selected as legal types.
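
Putting the pieces together, the path from register classes to the two sha256sig0 patterns can be modeled in a few lines. This is a sketch under simplifying assumptions, not TableGen's actual code: types are bits in a single 64-bit mask, only the GPR and scalar floating-point register classes are modeled, the enum values are assumed to mirror the SimpleValueType numbering near this commit, and all names (TypeId, Mode32, maskOf, expandAnyInt, and so on) are hypothetical. It also assumes the legal set is the union over all hardware modes, which is what the value 3680 seen in the debugger suggests, since it contains both i32 and i64.

```cpp
#include <cstdint>

// Type IDs (assumed numbering; check against the enumeration in use).
enum TypeId : unsigned {
  I1 = 2, I8 = 3, I16 = 4, I32 = 5, I64 = 6, I128 = 7,
  F16 = 9, F32 = 10, F64 = 11
};

// Hypothetical hardware modes standing in for the RV32/RV64 modes.
enum HwMode { Mode32, Mode64 };

// One bit per type ID.
constexpr std::uint64_t maskOf(TypeId T) { return std::uint64_t(1) << T; }

// Every scalar integer type that "any integer" could stand for in principle.
constexpr std::uint64_t AllScalarInts =
    maskOf(I1) | maskOf(I8) | maskOf(I16) | maskOf(I32) | maskOf(I64) |
    maskOf(I128);

// XLenVT resolves to a different type depending on the hardware mode.
constexpr TypeId xlenVT(HwMode M) { return M == Mode32 ? I32 : I64; }

// Legal types in one mode: the union of the RegTypes of the modeled classes.
constexpr std::uint64_t legalTypes(HwMode M) {
  return maskOf(xlenVT(M))                      // GPR: [XLenVT]
         | maskOf(F16) | maskOf(F32) | maskOf(F64); // FP register classes
}

// Union over all modes (assumption, see the note above).
constexpr std::uint64_t AllLegal = legalTypes(Mode32) | legalTypes(Mode64);

// The overloaded integer type expands only to integer types that are legal.
constexpr std::uint64_t expandAnyInt(std::uint64_t Legal) {
  return Legal & AllScalarInts;
}

static_assert(AllLegal == 3680, "matches Words[0] from the debugger");
static_assert(expandAnyInt(AllLegal) == (maskOf(I32) | maskOf(I64)),
              "anyint expands to i32 and i64 only");
```

Under these assumptions i1, i8, i16, and i128 never show up in the expansion simply because no modeled register class lists them in RegTypes, which matches the two GIM_CheckType variants (s32 and s64) generated for sha256sig0.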

Within this Process...


Copyright (c) ksyx 2021, licensed under CC-BY-SA. The code segments are licensed in accordance to LLVM, with HEAD at commit f7294ac
