anyint is not any
This is an exploration log. When writing an LLVM IR intrinsic, it is common to use llvm_anyint_ty to cover a range of types rather than a single type such as llvm_i8_ty. A quick search shows that it is defined at include/llvm/IR/Intrinsics.td:L235 as LLVMType<iAny>, which in turn is defined at include/llvm/IR/CodeGen.td:L217. Together with the nearby comment // Pseudo valuetype to represent "integer of any bit width", this suggests that for an intrinsic def int_foo_bar : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]>; the following intrinsics all exist:
declare i1 @llvm.foo.bar.i1(i1);
declare i8 @llvm.foo.bar.i8(i8);
declare i16 @llvm.foo.bar.i16(i16);
declare i32 @llvm.foo.bar.i32(i32);
declare i64 @llvm.foo.bar.i64(i64);
declare i128 @llvm.foo.bar.i128(i128);
That is, there would be one overload for every integer ValueType in ValueTypes.td. The author of LLVM revision D66479 seems to have had the same confusion.
The question came up while I was implementing the LLVM IR intrinsics for the K (crypto) extension, so the description below uses that as the example.
In Markku's intrinsic proposal there is an intrinsic _rv_sha256sig0 that takes i32/i64 parameters; other parameter types apparently need to be cast. The anyint in the implementation below therefore looked like a mismatch with the proposal, since types like i1, i8 and i128 are not what we want.
class ScalarCryptoGprIntrinsicAny : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>], [IntrNoMem, IntrSpeculatable]>;
def int_riscv_sha256sig0 : ScalarCryptoGprIntrinsicAny;
def : PatGpr<int_riscv_sha256sig0, SHA256SIG0>;
// Record SHA256SIG0 defined in the MC Layer, demonstrating an instruction
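As a side note on the record above, LLVMMatchType<0> ties the operand to whatever type the overloaded slot 0 (here the result) resolved to. The following is only a minimal sketch of that constraint; the enum Ty and both functions are made up for illustration and are not LLVM's verifier.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for this sketch; not LLVM's real types.
enum class Ty { i1, i8, i16, i32, i64, i128, f32 };

bool isInteger(Ty T) { return T != Ty::f32; }

// Checks a signature against Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>]>:
// the result may be any integer type, and the single operand must repeat it.
bool matchesAnyIntWithMatch0(Ty Result, const std::vector<Ty> &Params) {
  if (!isInteger(Result))
    return false; // overloaded slot 0 must be an integer type
  if (Params.size() != 1)
    return false; // the record declares exactly one operand
  return Params[0] == Result; // LLVMMatchType<0>: same type as slot 0
}
```

Under this reading, a declaration whose result is i8 while its operand stays i32 fails the check no matter which integer widths are allowed.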
But does it work as its name suggests? Let's look at the log below for the record above.
$ cat test.ll
declare i32 @llvm.riscv.sha256sig0(i32);
define i32 @sha256sig0(i32 %a) nounwind {
; RV32IK-LABEL: sha256sig0
; RV32IK: # %bb.0:
; RV32IK-NEXT: sha256sig0 a{{[0-9]+}}, a{{[0-9]+}}
; RV32IK-NEXT: ret
%val = call i32 @llvm.riscv.sha256sig0(i32 %a)
ret i32 %val
}
$ ./bin/llc -mtriple=riscv32 -mattr=+experimental-k -verify-machineinstrs test.ll
# Try running $ cat test.s here!
$ sed "s/i32/i8/" test.ll -i
$ cat test.ll
declare i8 @llvm.riscv.sha256sig0(i32);
define i8 @sha256sig0(i32 %a) nounwind {
; RV32IK-LABEL: sha256sig0
; RV32IK: # %bb.0:
; RV32IK-NEXT: sha256sig0 a{{[0-9]+}}, a{{[0-9]+}}
; RV32IK-NEXT: ret
%val = call i8 @llvm.riscv.sha256sig0(i32 %a)
ret i8 %val
}
$ ./bin/llc -mtriple=riscv32 -mattr=+experimental-k -verify-machineinstrs test.ll
Intrinsic has incorrect argument type!
i8 (i32)* @llvm.riscv.sha256sig0
llc: error: 'test.ll': input module cannot be verified
Type i8 is an integer type, and we specified anyint, right? Yet it apparently does not accept every integer type. Then what does it accept? Let's look at what TableGen generates from the .td files: the .inc files. Specifically, we want RISCVGenGlobalISel.inc.
Searching for sha256sig0 gives only a few hits, which quickly lead to the region with the answer we are after. For context, the record just before and the record just after the sha256sig0 records are included.
// Label 510: @16451
GIM_Try, /*On fail goto*//*Label 511*/ 16491, // Rule ID 42126 //
GIM_CheckFeatures, GIFBS_HasStdExtZknd_IsRV64,
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_aes64im,
GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
// (intrinsic_wo_chain:{ *:[i64] } 6691:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1) => (AES64IM:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::AES64IM,
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
GIR_EraseFromParent, /*InsnID*/0,
GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
// GIR_Coverage, 42126,
GIR_Done,
// Label 511: @16491
GIM_Try, /*On fail goto*//*Label 512*/ 16531, // Rule ID 42132 //
GIM_CheckFeatures, GIFBS_HasStdExtZknh,
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig0,
GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
// (intrinsic_wo_chain:{ *:[i64] } 6723:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1) => (SHA256SIG0:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG0,
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
GIR_EraseFromParent, /*InsnID*/0,
GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
// GIR_Coverage, 42132,
GIR_Done,
// Label 512: @16531
GIM_Try, /*On fail goto*//*Label 513*/ 16571, // Rule ID 42133 //
GIM_CheckFeatures, GIFBS_HasStdExtZknh,
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig0,
GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32,
GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
// (intrinsic_wo_chain:{ *:[i32] } 6723:{ *:[iPTR] }, GPR:{ *:[i32] }:$rs1) => (SHA256SIG0:{ *:[i32] } GPR:{ *:[i32] }:$rs1)
GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG0,
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
GIR_EraseFromParent, /*InsnID*/0,
GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
// GIR_Coverage, 42133,
GIR_Done,
// Label 513: @16571
GIM_Try, /*On fail goto*//*Label 514*/ 16611, // Rule ID 42134 //
GIM_CheckFeatures, GIFBS_HasStdExtZknh,
GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, Intrinsic::riscv_sha256sig1,
GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s64,
GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s64,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/RISCV::GPRRegClassID,
GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/RISCV::GPRRegClassID,
// (intrinsic_wo_chain:{ *:[i64] } 6724:{ *:[iPTR] }, GPR:{ *:[i64] }:$rs1) => (SHA256SIG1:{ *:[i64] } GPR:{ *:[i64] }:$rs1)
GIR_BuildMI, /*InsnID*/0, /*Opcode*/RISCV::SHA256SIG1,
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // rd
GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/2, // rs1
GIR_EraseFromParent, /*InsnID*/0,
GIR_ConstrainSelectedInstOperands, /*InsnID*/0,
// GIR_Coverage, 42134,
GIR_Done,
Looking at the GIM_CheckType lines, only i32/i64 appear, which is exactly what we need. But what happened to the other types? How were these two selected?
The natural starting point is the clue we already have, iAny. Searching for it globally leads to the class MVT, and more precisely to its SimpleValueType enumeration.
In that enumeration the entry that stands out is iAny, so where is it used? Searching for references to it gives a few interesting results, shown below.
// utils/TableGen/IntrinsicEmitter.cpp : EncodeFixedType
switch (VT) {
default: break;
case MVT::iPTRAny: ++Tmp; LLVM_FALLTHROUGH;
case MVT::vAny: ++Tmp; LLVM_FALLTHROUGH;
case MVT::fAny: ++Tmp; LLVM_FALLTHROUGH;
case MVT::iAny: ++Tmp; LLVM_FALLTHROUGH;
case MVT::Any: {
// If this is an "any" valuetype, then the type is the type of the next
// type in the list specified to getIntrinsic().
Sig.push_back(IIT_ARG);
// Figure out what arg # this is consuming, and remember what kind it was.
assert(NextArgCode < ArgCodes.size() && ArgCodes[NextArgCode] == Tmp &&
"Invalid or no ArgCode associated with overloaded VT!");
unsigned ArgNo = NextArgCode++;
// Encode what sort of argument it must be in the low 3 bits of the ArgNo.
return Sig.push_back((ArgNo << 3) | Tmp);
}
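The last statement of the EncodeFixedType excerpt packs two numbers into one value: the argument index in the high bits and the "kind" counted by the fallthrough chain in the low 3 bits. Here is a small sketch of that packing. It assumes Tmp starts at 0 (the excerpt does not show its initialization), and the enum and function names are made up for illustration.

```cpp
#include <cassert>

// Kind codes implied by the fallthrough chain, assuming Tmp starts at 0.
// These names are hypothetical and only used in this sketch.
enum ArgKind : unsigned {
  KindAny = 0,        // MVT::Any
  KindAnyInt = 1,     // MVT::iAny
  KindAnyFloat = 2,   // MVT::fAny
  KindAnyVector = 3,  // MVT::vAny
  KindAnyPointer = 4  // MVT::iPTRAny
};

// Same expression as in the excerpt: (ArgNo << 3) | Tmp.
unsigned packArg(unsigned ArgNo, ArgKind Kind) {
  assert(Kind < 8 && "kind must fit in the low 3 bits");
  return (ArgNo << 3) | Kind;
}

// Inverse operations: recover the argument index and the kind.
unsigned unpackArgNo(unsigned Packed) { return Packed >> 3; }
ArgKind unpackKind(unsigned Packed) {
  return static_cast<ArgKind>(Packed & 7);
}
```

For example, overloaded argument 0 of kind "any integer" packs to 1, and both fields can be recovered with a shift and a mask.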
// utils/TableGen/IntrinsicEmitter.cpp : UpdateArgCodes(..)
switch (getValueType(R->getValueAsDef("VT"))) {
default: break;
case MVT::iPTR:
UpdateArgCodes(R->getValueAsDef("ElTy"), ArgCodes, NumInserted, Mapping);
break;
case MVT::iPTRAny:
++Tmp;
LLVM_FALLTHROUGH;
case MVT::vAny:
++Tmp;
LLVM_FALLTHROUGH;
case MVT::fAny:
++Tmp;
LLVM_FALLTHROUGH;
case MVT::iAny:
++Tmp;
LLVM_FALLTHROUGH;
case MVT::Any:
unsigned OriginalIdx = ArgCodes.size() - NumInserted;
assert(OriginalIdx >= Mapping.size());
Mapping.resize(OriginalIdx+1);
Mapping[OriginalIdx] = ArgCodes.size();
ArgCodes.push_back(Tmp);
break;
}
// utils/TableGen/CodeGenDAGPatterns.cpp
case MVT::iAny:
for (MVT T : MVT::integer_valuetypes())
if (Legal.count(T))
Out.insert(T);
The last one looks like the most promising, since the name Legal suggests it is what limits the types available for an overload.
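The loop in the CodeGenDAGPatterns.cpp excerpt is a plain filter: walk all integer types and keep those found in Legal. Here is a standalone sketch of the same idea; the enum VT, integerValueTypes() and expandAnyInt() are stand-ins invented for this example, and the legal set used to exercise it is an assumption.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, heavily shortened type enumeration for this sketch.
enum class VT { i1, i8, i16, i32, i64, i128, f32, f64 };

// Plays the role of MVT::integer_valuetypes(): all integer types in order.
const std::vector<VT> &integerValueTypes() {
  static const std::vector<VT> Types = {VT::i1,  VT::i8,  VT::i16,
                                        VT::i32, VT::i64, VT::i128};
  return Types;
}

// Expands "any integer" into the integer types that are also legal.
std::set<VT> expandAnyInt(const std::set<VT> &Legal) {
  std::set<VT> Out;
  for (VT T : integerValueTypes())
    if (Legal.count(T))
      Out.insert(T);
  return Out;
}
```

If the legal set were {i32, i64, f32}, the expansion would keep only i32 and i64: f32 is never visited, and the other integer widths are not legal.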
Jumping to the definition of the method count, it looks a bit magical: return (Words[T.SimpleTy / WordWidth] >> (T.SimpleTy % WordWidth)) & 1;
To see what it does, let's run it and look at the actual values.
Before running, we should know what role GlobalISel plays in the build. As described above, RISCVGenGlobalISel.inc is generated by TableGen. It is included in lib/Target/RISCV/RISCVInstructionSelector.cpp, and that file name says what it is for. So to find the answer we run TableGen to generate the GlobalISel file for RISCV, which gives the command used for debugging: bin/llvm-tblgen --gen-global-isel -I ~/llvm-project/llvm/include, with a breakpoint set at the Legal.count(T) line above.
And it stopped at that point. Printing the values, we get:
(gdb) p (int)T.SimpleTy
$1 = 3
(gdb) p Legal.WordWidth
$2 = 64
(gdb) p Legal.Words
$3 = {_M_elems = {3680, 18446726481523507200, 19323417567, 0}}
And those variable values are enough for us to understand what that magic is doing:
Type | Other | i1 | i8 | i16 | i32 | i64 | i128 | bf16 |
---|---|---|---|---|---|---|---|---|
EnumVal i.e. (int) T.SimpleTy | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
(Words[0] >> EnumVal) & 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
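Given the table, the fact that (3680)_10 = (111001100000)_2, and that only i32 and i64 came out, it is now clear how the selection works: among the integer types, only bits 5 and 6 of Words[0] are set. The other set bits (9 to 11) belong to enum values past the last column of the table.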
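To double-check the binary expansion of 3680 without doing it by hand, a small helper can list the positions of the set bits. This is only an illustration; setBits is a made-up helper, not something from LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the positions of the set bits in Word, lowest position first.
std::vector<unsigned> setBits(std::uint64_t Word) {
  std::vector<unsigned> Bits;
  for (unsigned I = 0; I < 64; ++I)
    if ((Word >> I) & 1)
      Bits.push_back(I);
  return Bits;
}
```

Calling setBits(3680) yields positions 5, 6, 9, 10 and 11. Positions 5 and 6 are the i32 and i64 columns of the table above.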
What about the really huge number in Legal.Words[1]? Converted, (18446726481523507200)_10 = (FFFFF00000000000)_16. For count to consult Legal.Words[1], the condition WordWidth <= T.SimpleTy < 2*WordWidth must be satisfied. When T.SimpleTy == WordWidth, T.SimpleTy represents the type v64i64, which is not legal here; the legal types in this word are the nxv*i* types.
Back to the magic expression: there are many types, and each needs one bit to record whether it is legal, so each Words[i] stores the bits for one run of consecutive types in the enumeration. The division picks the word, the modulus and right shift move the wanted bit to the lowest position, and &1 extracts the yes-or-no verdict.
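The division-and-modulus scheme can be tried out in isolation. Below is a simplified stand-in for the set behind Legal, not a copy of it: the class name SmallTypeSet and the word() accessor are invented here, while the four 64-bit words match what gdb printed and count uses the expression quoted earlier.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A fixed-capacity bit set indexed by enum value (hypothetical sketch).
class SmallTypeSet {
public:
  static constexpr unsigned WordWidth = 64;
  static constexpr unsigned NumWords = 4;
  static constexpr unsigned Capacity = WordWidth * NumWords;

  // Sets the bit for Ty: the division picks the word, the modulus the bit.
  void insert(unsigned Ty) {
    assert(Ty < Capacity && "type index out of range");
    Words[Ty / WordWidth] |= std::uint64_t(1) << (Ty % WordWidth);
  }

  // Same expression as the count method quoted in the text.
  bool count(unsigned Ty) const {
    assert(Ty < Capacity && "type index out of range");
    return (Words[Ty / WordWidth] >> (Ty % WordWidth)) & 1;
  }

  // Exposes one raw word so it can be compared with the gdb output.
  std::uint64_t word(unsigned I) const { return Words.at(I); }

private:
  std::array<std::uint64_t, NumWords> Words{};
};
```

Inserting the enum values 5, 6, 9, 10 and 11 makes word 0 equal to 3680, the first value gdb printed. By the same arithmetic, the value FFFFF00000000000 in word 1 corresponds to enum values 108 through 127 (64 + 44 up to 64 + 63).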
But where is this set filled in? How does it know those magic numbers? Let's look at the stack trace to see where we are.
#0 llvm::TypeInfer::expandOverloads (this=0x7fffffffd058, Out=..., Legal=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:792
#1 0x00005555555ff90b in llvm::TypeInfer::expandOverloads (this=0x7fffffffd058, VTS=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:770
#2 0x0000555555621ced in llvm::TreePatternNode::UpdateNodeType (this=0x55555ed070a0, ResNo=0, InTy=-3, TP=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.h:991
#3 0x000055555560a3ac in llvm::TreePatternNode::ApplyTypeConstraints (this=0x55555ed06f10, TP=..., NotRegisters=false)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2448
#4 0x000055555560da38 in llvm::TreePattern::InferAllTypes (this=0x7fffffffcfd0, InNamedTypes=0x7fffffffcfe8)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2979
#5 0x0000555555614b88 in llvm::CodeGenDAGPatterns::ParseOnePattern (this=0x7fffffffd338, TheDef=0x55555bb16f30, Pattern=..., Result=...,
InstImpResults=std::vector of length 0, capacity 0) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:4176
#6 0x00005555556155aa in llvm::CodeGenDAGPatterns::ParsePatterns (this=0x7fffffffd338)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:4284
#7 0x000055555560e530 in llvm::CodeGenDAGPatterns::CodeGenDAGPatterns(llvm::RecordKeeper&, std::function<void (llvm::TreePattern*)>) (
this=0x7fffffffd338, R=..., PatternRewriter=...) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:3084
#8 0x00005555557dd23d in (anonymous namespace)::GlobalISelEmitter::GlobalISelEmitter (this=0x7fffffffd330, RK=...)
at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:3722
#9 0x00005555557ee2ba in llvm::EmitGlobalISel (RK=..., OS=...)
at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:6165
#10 0x0000555555908029 in (anonymous namespace)::LLVMTableGenMain (OS=..., Records=...)
at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:249
#11 0x0000555555a443f8 in llvm::TableGenMain (argv0=0x7fffffffe3b0 "~/llvm-project/llvm/build/bin/llvm-tblgen",
MainFn=0x5555559078f9 <(anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&)>)
at ~/llvm-project/llvm/lib/TableGen/Main.cpp:120
#12 0x00005555559081ea in main (argc=5, argv=0x7fffffffe128) at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:285
Among these frames, ApplyTypeConstraints sounds relevant, so let's go back up to that frame and see what it does.
#0 0x000055555560a3ac in llvm::TreePatternNode::ApplyTypeConstraints (this=0x55555ed06f10, TP=..., NotRegisters=false)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:2448
(gdb) l
2443 for (unsigned i = 0, e = getNumChildren()-1; i != e; ++i) {
2444 MadeChange |= getChild(i+1)->ApplyTypeConstraints(TP, NotRegisters);
2445
2446 MVT::SimpleValueType OpVT = Int->IS.ParamVTs[i];
2447 assert(getChild(i+1)->getNumTypes() == 1 && "Unhandled case");
2448 MadeChange |= getChild(i+1)->UpdateNodeType(0, OpVT, TP);
2449 }
2450 return MadeChange;
2451 }
That is not very helpful, so let's keep going up. At a constructor we find what we might want:
CodeGenDAGPatterns::CodeGenDAGPatterns(RecordKeeper &R,
PatternRewriterFn PatternRewriter)
: Records(R), Target(R), LegalVTS(Target.getLegalValueTypes()),
PatternRewriter(PatternRewriter) {...}
The Target here and its LegalValueTypes are what we really want, so let's jump into them!
ArrayRef<ValueTypeByHwMode> getLegalValueTypes() const {
if (LegalValueTypes.empty())
ReadLegalValueTypes();
return LegalValueTypes;
}
void CodeGenTarget::ReadLegalValueTypes() const {
for (const auto &RC : getRegBank().getRegClasses())
llvm::append_range(LegalValueTypes, RC.VTs);
The methods getRegBank and getRegClasses give a clue for where to look in the RISCV target directory, which leads to ~/llvm-project/llvm/lib/Target/RISCV/RISCVRegisterBankInfo.cpp and its #include "RISCVGenRegisterBank.inc". Unfortunately that file has no strong connection to what we want. How about a file with a similar name, RISCVGenRegisterInfo.inc? Opening it, we find the answer we are after.
static const MVT::SimpleValueType VTLists[] = {
/* 0 */ MVT::i32, MVT::Other,
/* 2 */ MVT::i64, MVT::Other,
/* 4 */ MVT::f16, MVT::Other,
/* 6 */ MVT::f32, MVT::Other,
/* 8 */ MVT::f64, MVT::Other,
/* 10 */ MVT::nxv64i1, MVT::nxv32i1, MVT::nxv16i1, MVT::nxv8i1, MVT::nxv4i1, MVT::nxv2i1, MVT::nxv1i1, MVT::Other,
/* 18 */ MVT::nxv8i8, MVT::nxv4i16, MVT::nxv2i32, MVT::nxv1i64, MVT::nxv4f16, MVT::nxv2f32, MVT::nxv1f64, MVT::nxv4i8, MVT::nxv2i8, MVT::nxv1i8, MVT::nxv2i16, MVT::nxv1i16, MVT::nxv1i32, MVT::nxv1f16, MVT::nxv2f16, MVT::nxv1f32, MVT::nxv1i1, MVT::nxv2i1, MVT::nxv4i1, MVT::nxv8i1, MVT::nxv16i1, MVT::nxv32i1, MVT::nxv64i1, MVT::Other,
/* 42 */ MVT::nxv16i8, MVT::nxv8i16, MVT::nxv4i32, MVT::nxv2i64, MVT::nxv8f16, MVT::nxv4f32, MVT::nxv2f64, MVT::Other,
/* 50 */ MVT::nxv32i8, MVT::nxv16i16, MVT::nxv8i32, MVT::nxv4i64, MVT::nxv16f16, MVT::nxv8f32, MVT::nxv4f64, MVT::Other,
/* 58 */ MVT::nxv64i8, MVT::nxv32i16, MVT::nxv16i32, MVT::nxv8i64, MVT::nxv32f16, MVT::nxv16f32, MVT::nxv8f64, MVT::Other,
/* 66 */ MVT::Untyped, MVT::Other,
};
So how do these types get inserted into the set? Right above the count method is the insert method. Let's break there and take a quick look.
(gdb) where
#0 llvm::MachineValueTypeSet::insert (this=0x55555d151f58, T=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.h:91
#1 0x00005555555f9b87 in llvm::TypeSetByHwMode::insert (this=0x7fffffffd748, VVT=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:119
#2 0x00005555555f9499 in llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd748, VTList=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:72
#3 0x000055555560e462 in llvm::CodeGenDAGPatterns::CodeGenDAGPatterns(llvm::RecordKeeper&, std::function<void (llvm::TreePattern*)>) (
this=0x7fffffffd2f8, R=..., PatternRewriter=...) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:3074
#4 0x00005555557dd23d in (anonymous namespace)::GlobalISelEmitter::GlobalISelEmitter (this=0x7fffffffd2f0, RK=...)
at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:3722
#5 0x00005555557ee2ba in llvm::EmitGlobalISel (RK=..., OS=...)
at ~/llvm-project/llvm/utils/TableGen/GlobalISelEmitter.cpp:6165
#6 0x0000555555908029 in (anonymous namespace)::LLVMTableGenMain (OS=..., Records=...)
at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:249
#7 0x0000555555a443f8 in llvm::TableGenMain (argv0=0x7fffffffe382 "~/llvm-project/llvm/build/bin/llvm-tblgen",
MainFn=0x5555559078f9 <(anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&)>)
at ~/llvm-project/llvm/lib/TableGen/Main.cpp:120
#8 0x00005555559081ea in main (argc=5, argv=0x7fffffffe0e8) at ~/llvm-project/llvm/utils/TableGen/TableGen.cpp:285
The insert
itself is not surprising. Let's return.
/*
#0 llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd748, VTList=...) at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:73
*/
bool TypeSetByHwMode::insert(const ValueTypeByHwMode &VVT) {
bool Changed = false;
bool ContainsDefault = false;
MVT DT = MVT::Other;
SmallDenseSet<unsigned, 4> Modes;
for (const auto &P : VVT) {
unsigned M = P.first;
Modes.insert(M);
//...
Going up once more we reach the boundary, the constructor below. Let's look at the code here:
/*
#0 llvm::TypeSetByHwMode::TypeSetByHwMode (this=0x7fffffffd788, VTList=...)
at ~/llvm-project/llvm/utils/TableGen/CodeGenDAGPatterns.cpp:72
*/
TypeSetByHwMode::TypeSetByHwMode(ArrayRef<ValueTypeByHwMode> VTList) {
for (const ValueTypeByHwMode &VVT : VTList) {
insert(VVT);
AddrSpaces.push_back(VVT.PtrAddrSpace);
}
}
There is a VTList here; printing its values, we get:
(gdb) p VTList
$1 = {Data = 0x55555d19f5a0, Length = 49}
(gdb) p *VTList.Data
$2 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 2 elements = {[0] = {SimpleTy = llvm::MVT::i32}, [1] = {
SimpleTy = llvm::MVT::i64}}}, PtrAddrSpace = 4294967295}
(gdb) p *(VTList.Data+1)
$3 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 1 element = {[0] = {SimpleTy = llvm::MVT::f16}}}, PtrAddrSpace = 4294967295}
(gdb) p *(VTList.Data+20)
$4 = {<llvm::InfoByHwMode<llvm::MVT>> = {Map = std::map with 1 element = {[0] = {SimpleTy = llvm::MVT::nxv4i16}}},
PtrAddrSpace = 4294967295}
It seems the information comes from here.
(I took a few days off, and when I got back I had forgotten what happened in between 😥)
Digging further, the constructor of CodeGenDAGPatterns has the initializer list entry LegalVTS(Target.getLegalValueTypes()). That function simply builds the list and saves it if it has not been built yet, then returns the saved one:
ArrayRef<ValueTypeByHwMode> getLegalValueTypes() const {
if (LegalValueTypes.empty())
ReadLegalValueTypes();
return LegalValueTypes;
}
What does ReadLegalValueTypes() do? The code is as below:
void CodeGenTarget::ReadLegalValueTypes() const {
for (const auto &RC : getRegBank().getRegClasses())
llvm::append_range(LegalValueTypes, RC.VTs);
...
}
Here, LegalValueTypes is a SmallVector being written into. Since the information is taken from each entry iterated in ReadLegalValueTypes, how about printing those out? Unfortunately, that produces a huge bulk of messy output. So let's just see how getRegBank() and getRegClasses() work.
Again, it constructs and saves the object if it does not exist yet and returns the saved one; the construction is done by the statement RegBank = std::make_unique<CodeGenRegBank>(Records, getHwModes());.
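The "construct and save on first use, then return the saved one" pattern used by both getters can be sketched on its own. The classes below are toy stand-ins invented for this example (the construction counter exists only to make the behaviour observable) and do not mirror the real CodeGenTarget.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy replacement for the object that is expensive to build.
struct ToyRegBank {
  explicit ToyRegBank(int &Counter) { ++Counter; } // record each construction
  std::vector<int> Classes{1, 2, 3};
};

class ToyTarget {
public:
  // Builds the bank on first use and returns the cached object afterwards.
  ToyRegBank &getRegBank() const {
    if (!Bank)
      Bank = std::make_unique<ToyRegBank>(Constructions);
    return *Bank;
  }

  int constructions() const { return Constructions; }

private:
  // mutable so that a const getter is allowed to fill the cache.
  mutable std::unique_ptr<ToyRegBank> Bank;
  mutable int Constructions = 0;
};
```

Calling getRegBank() twice on the same ToyTarget returns the same object, and the counter shows that it was constructed only once.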
Following the construction further, this part seems well worth paying attention to:
CodeGenRegisterClass::CodeGenRegisterClass(CodeGenRegBank &RegBank, Record *R)
: TheDef(R), Name(std::string(R->getName())),
TopoSigs(RegBank.getNumTopoSigs()), EnumValue(-1) {
GeneratePressureSet = R->getValueAsBit("GeneratePressureSet");
std::vector<Record*> TypeList = R->getValueAsListOfDefs("RegTypes");
if (TypeList.empty())
PrintFatalError(R->getLoc(), "RegTypes list must not be empty!");
for (unsigned i = 0, e = TypeList.size(); i != e; ++i) {
Record *Type = TypeList[i];
if (!Type->isSubClassOf("ValueType"))
PrintFatalError(R->getLoc(),
"RegTypes list member '" + Type->getName() +
"' does not derive from the ValueType class!");
VTs.push_back(getValueTypeByHwMode(Type, RegBank.getHwModes()));
}
/// This method looks up the specified field and
/// returns its value as a vector of records, throwing an exception if the
/// field does not exist or if the value is not the right type.
std::vector<Record*> getValueAsListOfDefs(StringRef FieldName) const;
It is clear that it reads the RegTypes field of each record. Now let's see what the records look like:
$ bin/llvm-tblgen ../lib/Target/RISCV/RISCV.td --print-records -I /home/xueqixing/llvm-project/llvm/include -I ../lib/Target/RISCV > record.td
$ grep "RegTypes" record.td
list<ValueType> RegTypes = RegisterClass:regTypes;
list<ValueType> RegTypes = VReg:regTypes;
list<ValueType> RegTypes = [f16];
list<ValueType> RegTypes = [f32];
list<ValueType> RegTypes = [f32];
list<ValueType> RegTypes = [f64];
list<ValueType> RegTypes = [f64];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [XLenVT];
list<ValueType> RegTypes = [nxv64i1, nxv32i1, nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1];
list<ValueType> RegTypes = [nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];
list<ValueType> RegTypes = [nxv8i8, nxv4i16, nxv2i32, nxv1i64, nxv4f16, nxv2f32, nxv1f64, nxv4i8, nxv2i8, nxv1i8, nxv2i16, nxv1i16, nxv1i32, nxv1f16, nxv2f16, nxv1f32, nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];
list<ValueType> RegTypes = [nxv16i8, nxv8i16, nxv4i32, nxv2i64, nxv8f16, nxv4f32, nxv2f64];
list<ValueType> RegTypes = [nxv16i8, nxv8i16, nxv4i32, nxv2i64, nxv8f16, nxv4f32, nxv2f64];
list<ValueType> RegTypes = [nxv32i8, nxv16i16, nxv8i32, nxv4i64, nxv16f16, nxv8f32, nxv4f64];
list<ValueType> RegTypes = [nxv32i8, nxv16i16, nxv8i32, nxv4i64, nxv16f16, nxv8f32, nxv4f64];
list<ValueType> RegTypes = [nxv64i8, nxv32i16, nxv16i32, nxv8i64, nxv32f16, nxv16f32, nxv8f64];
list<ValueType> RegTypes = [nxv64i8, nxv32i16, nxv16i32, nxv8i64, nxv32f16, nxv16f32, nxv8f64];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [untyped];
list<ValueType> RegTypes = [nxv8i8, nxv4i16, nxv2i32, nxv1i64, nxv4f16, nxv2f32, nxv1f64, nxv4i8, nxv2i8, nxv1i8, nxv2i16, nxv1i16, nxv1i32, nxv1f16, nxv2f16, nxv1f32, nxv1i1, nxv2i1, nxv4i1, nxv8i1, nxv16i1, nxv32i1, nxv64i1];
With the -U20 parameter we can see some context around those lines, and find them nested in register classes.
def GPR { // DAGOperand RegisterClass
string OperandNamespace = "MCOI";
string DecoderMethod = "";
string Namespace = "RISCV";
RegInfoByHwMode RegInfos = XLenRI;
list<ValueType> RegTypes = [XLenVT];
int Size = 0;
int Alignment = 32;
int CopyCost = 1;
dag MemberList = (add (sequence "X%u", 10, 17), (sequence "X%u", 5, 7), (sequence "X%u", 28, 31), (sequence "X%u", 8, 9), (sequence "X%u", 18, 27), (sequence "X%u", 0, 4));
RegAltNameIndex altNameIndex = NoRegAltName;
bit isAllocatable = 1;
list<dag> AltOrders = [];
code AltOrderSelect = [{}];
int AllocationPriority = 0;
bit GeneratePressureSet = 1;
int Weight = ?;
string DiagnosticType = "";
string DiagnosticString = "";
}
Among them there is an interesting thing called XLenVT, so let's see what it is:
def XLenVT { // HwModeSelect ValueType ValueTypeByHwMode
list<HwMode> Modes = [DefaultMode, RV64];
string Namespace = "MVT";
int Size = 0;
int Value = 0;
list<ValueType> Objects = [i32, i64];
}
It is originally defined as def XLenVT : ValueTypeByHwMode<[RV32, RV64], [i32, i64]>;. With this configuration, only the integer types i32 and i64 end up as legal types.
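Putting the pieces together, the legal integer types are whatever the register classes list in RegTypes, taken over every hardware mode. The sketch below models that with toy types; Mode, VT, TypeByMode, RegClass and legalIntegerTypes are all names invented for this example, and the two register classes used to exercise it are a simplified assumption, not the full RISCV list.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy enumerations for this sketch only.
enum class Mode { Default, RV64 };
enum class VT { i32, i64, f32, f64 };

// A value type that may differ per hardware mode, in the spirit of XLenVT.
using TypeByMode = std::map<Mode, VT>;

struct RegClass {
  std::string Name;
  std::vector<TypeByMode> RegTypes;
};

bool isScalarInteger(VT T) { return T == VT::i32 || T == VT::i64; }

// Collects every integer type that some register class holds in some mode.
std::set<VT> legalIntegerTypes(const std::vector<RegClass> &Classes) {
  std::set<VT> Out;
  for (const RegClass &RC : Classes)
    for (const TypeByMode &Types : RC.RegTypes)
      for (const auto &Entry : Types)
        if (isScalarInteger(Entry.second))
          Out.insert(Entry.second);
  return Out;
}
```

With a GPR-like class whose only type maps the default mode to i32 and RV64 to i64, plus a float class, the result is exactly {i32, i64}, the two types seen in the generated matcher.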
The i32/i64 selection is target-dependent, so the search should focus on the lib/Target/RISCV directory. I did try this, but the definition is inconspicuous and one would not expect it to be tied to registers, so I overlooked it.
I also looked into GlobalISelEmitter, but found that, although it makes many deep calls and checks, it only takes information from RecordKeeper. That made me step back to the parsing part, where some simple debugger tracing and interpretation brought me the answer.
Copyright (c) ksyx 2021, licensed under CC-BY-SA. The code segments are licensed in accordance to LLVM, with HEAD at commit f7294ac