In this post, we show how the assembly development environment tool (asmde) can ease assembly program development for the RISC-V ISA.
As a running example, we will develop a basic floating-point vector add routine.
Introducing ASMDE
The ASseMbly Development Environment (asmde, https://github.com/nibrunie/asmde) is an open-source set of Python utilities to help the assembly developer. The main, eponymous utility, asmde, is a register assignment script. It consumes a templatized assembly source file and fills in variable names with legal registers, removing the burden of register allocation from the developer.
Recently, alpha support for RV32 (the 32-bit version of RISC-V) was added to asmde. This post demonstrates how to use it.
Vector-Add testbench
/** Basic single-precision vector add
 * @param dst destination array
 * @param lhs left-hand side operand array
 * @param rhs right-hand side operand array
 * @param n vector sizes
 */
void my_vadd(float* dst, float* lhs, float* rhs, unsigned n);
The program is split into two files:
- a test bench main.c
- an asmde template file vec_add.template.S
Review of the assembly template
// testing for basic RISC-V RV32I program
// void vector_add(float* dst, float* src0, float* src1, unsigned n)
//#PREDEFINED(a0, a1, a2, a3)
    .option nopic
    .attribute arch, "rv32i2p0_m2p0_a2p0_f2p0_d2p0"
    .attribute unaligned_access, 0
    .attribute stack_align, 16
    .text
    .align 1
    .globl my_vadd
    .type my_vadd, @function
my_vadd:
    // check for early exit condition n == 0
    beq a3, x0, end
loop:
    // load inputs
    flw F(LHS), 0(a1)
    flw F(RHS), 0(a2)
    // operation
    fadd.s F(ACC), F(LHS), F(RHS)
    // store result
    fsw F(ACC), 0(a0)
    // update addresses
    addi a1, a1, 4
    addi a2, a2, 4
    addi a0, a0, 4
    // update loop count
    addi a3, a3, -1
    // branch if not finished
    bne x0, a3, loop
end:
    ret
    .size my_vadd, .-my_vadd
    .section .rodata.str1.8,"aMS",@progbits,1
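To make the control flow of the template easier to follow, here is a small Python model of it. It is only a sketch: the names `vadd_model` and `MASK32` are made up for this illustration, list indices stand in for the byte addresses held in a0, a1 and a2, and Python floats are double precision, so the rounding of `fadd.s` is not modelled. The point is the bottom-tested loop and the reason for the early-exit check on n == 0.

```python
MASK32 = 0xFFFFFFFF  # RV32 integer registers are 32 bits wide


def vadd_model(dst, lhs, rhs, n):
    """Mirror the template's control flow on Python lists.

    Returns the number of loop iterations that were executed.
    """
    a0 = a1 = a2 = 0      # element indices standing in for the addresses in a0/a1/a2
    a3 = n & MASK32       # loop counter, as held in a3
    if a3 == 0:           # beq a3, x0, end
        return 0
    iterations = 0
    while True:
        acc = lhs[a1] + rhs[a2]   # flw, flw, fadd.s
        dst[a0] = acc             # fsw
        a1 += 1                   # addi a1, a1, 4
        a2 += 1                   # addi a2, a2, 4
        a0 += 1                   # addi a0, a0, 4
        a3 = (a3 - 1) & MASK32    # addi a3, a3, -1
        iterations += 1
        if a3 == 0:               # bne x0, a3, loop is not taken
            break
    return iterations


dst = [0.0] * 4
count = vadd_model(dst, [1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], 4)
print(count, dst)  # 4 [5.0, 5.0, 5.0, 5.0]

# The loop tests its counter only after decrementing it, so without the
# early exit a call with n == 0 would wrap around on the first decrement:
print(hex((0 - 1) & MASK32))  # 0xffffffff
```

Because the loop condition is checked at the bottom, the body always runs at least once; the `beq a3, x0, end` at the top is what keeps a call with n == 0 from touching memory.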
ASMDE Macro
ASMDE Variable
flw F(LHS), 0(a1)
flw F(RHS), 0(a2)
// operation
fadd.s F(ACC), F(LHS), F(RHS)
// store result
fsw F(ACC), 0(a0)
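In this snippet, F(LHS), F(RHS) and F(ACC) sit where a floating-point register is expected, and the tool replaces each of them with an actual register. The toy Python script below illustrates that idea only; it is not asmde's algorithm. The function `assign_registers`, the pool `FP_POOL` (the RISC-V temporaries ft0 to ft7) and the regular expression are all assumptions made for this sketch, and each variable simply gets the next free register in order of first appearance. A real allocator also has to track when a variable is no longer needed so that its register can be reused.

```python
import re

# Hypothetical pool of floating-point temporaries (RISC-V ABI names).
# The pool actually used by asmde may differ.
FP_POOL = ["ft0", "ft1", "ft2", "ft3", "ft4", "ft5", "ft6", "ft7"]

# Matches placeholders such as F(LHS) and captures the variable name.
VAR_PATTERN = re.compile(r"F\((\w+)\)")


def assign_registers(template, pool=FP_POOL):
    """Replace every F(NAME) with a register, one register per variable."""
    mapping = {}

    def substitute(match):
        name = match.group(1)
        if name not in mapping:
            if len(mapping) >= len(pool):
                raise RuntimeError("not enough registers for " + name)
            mapping[name] = pool[len(mapping)]
        return mapping[name]

    return VAR_PATTERN.sub(substitute, template), mapping


TEMPLATE = """\
flw F(LHS), 0(a1)
flw F(RHS), 0(a2)
fadd.s F(ACC), F(LHS), F(RHS)
fsw F(ACC), 0(a0)
"""

assembly, mapping = assign_registers(TEMPLATE)
print(assembly)
print(mapping)
```

With this naive first-come numbering, LHS, RHS and ACC map to ft0, ft1 and ft2, so the add becomes `fadd.s ft2, ft0, ft1`. The registers picked by asmde for the same template may well be different.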
Assembly template translation
python3 asmde.py -S --arch rv32 \
examples/riscv/test_rv32_vadd.S \
--output vadd.S
Building and executing the test program
#include <stdio.h>

#ifdef LOCAL_IMPLEMENTATION
void my_vadd(float* dst, float* lhs, float* rhs, unsigned n){
    unsigned i;
    for (i = 0; i < n; ++i) dst[i] = lhs[i] + rhs[i];
}
#else
void my_vadd(float* dst, float* lhs, float* rhs, unsigned n);
#endif

int main() {
    float dst[4];
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {4.0f, 3.0f, 2.0f, 1.0f};
    my_vadd(dst, a, b, 4);
    int i;
    for (i = 0; i < 4; ++i) {
        if (dst[i] != 5.0f) {
            printf("failure\n");
            return -1;
        }
    }
    printf("success\n");
    return 0;
}
(requires an RV32 GNU toolchain and a 32-bit proxy kernel, pk)
# building test program
$ riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -o test_vadd vadd.S test_vadd.c
# executing binary
$ spike --isa=RV32gc riscv32-unknown-elf/bin/pk ./test_vadd
Conclusion
References:
- asmde github page: https://github.com/nibrunie/asmde