Add strlen builtin #187

Merged
merged 1 commit into from
Dec 2, 2023

MineRobber9000
Contributor

According to the docs:

If the string's length is needed, we can use a bit of arithmetic to derive it:

helloworld:
    #d "Hello, world!\0"

helloworldLen = $ - helloworld

But sometimes we need the length of the string in order to output the string itself (most notably for Pascal-style, length-prefixed strings). I personally came across this issue while trying to write a Lua bytecode ruledef (yes, I know I'm a weirdo). Lua strings are prefixed with a variable-length integer holding the size, so a 126-character string (excessive, but entirely possible to have in your program) would be encoded as 0xff followed by 126 characters, while a 127-character string would be encoded as 0x01 0x80 followed by 127 characters.
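To make the two examples above concrete, here is a minimal Python sketch of a size encoding that reproduces them. It assumes the stored size is the string length plus one, split into 7-bit groups with the most significant group first and the high bit set on the final byte; that layout is inferred from the 0xff and 0x01 0x80 examples and is not taken from the Lua source. The names `encode_size` and `encode_string` are hypothetical.

```python
def encode_size(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, most significant first.

    Assumption: the final byte is marked by setting its high bit (0x80).
    """
    if value < 0:
        raise ValueError("size must be non-negative")
    groups = []
    while True:
        groups.append(value & 0x7F)  # take the low 7 bits
        value >>= 7
        if value == 0:
            break
    groups.reverse()      # most significant group goes first
    groups[-1] |= 0x80    # mark the last byte
    return bytes(groups)


def encode_string(data: bytes) -> bytes:
    """Prefix data with its encoded size (assumed to be length + 1)."""
    return encode_size(len(data) + 1) + data
```

Under these assumptions, a 126-byte string gets the one-byte prefix 0xff and a 127-byte string gets the two-byte prefix 0x01 0x80, which is why a single `size` rule can only cover the shorter strings.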

This use case can't use "a bit of arithmetic", since by the time customasm has emitted the string it is already too late to do anything with that length. Using an asm block and relying on "you can refer to a variable before it exists" doesn't work either, since customasm fails because it cannot find the variable's value. You can use the "bit of arithmetic" outside of a ruledef, but then it looks sloppy and is more difficult to use (see below):

#ruledef {
        size {num} => {
                assert(num>=0)
                assert(num<=0x7e) ; num+1 must fit in 7 bits
                0b1 @ (num+1)`7
        }
        ; presumably other definitions of size {num} for larger sizes
}

; you have to do this every time you want to emit a string
; (you'd have to do something similar to emit a string containing binary data,
; since strings are stored as String on the backend and not OsString but that's for another time)
size len ; 87
old = $
#d "=stdin"
len = $-old

The solution I came up with is to add a builtin strlen function, which returns the string's length (in bytes) as an integer. This solves the previous use case, as I can simply do the following (compare the above code block):

#ruledef
{
        size {num} => {
                assert(num>0)
                assert(num<=0x7e)
                0b1 @ (num+1)`7
        }
        ; presumably other definitions of size {num} for larger sizes
        str {x} => asm {
                size strlen({x})
        } @ x
}

str "=stdin" ; 87 "=stdin"
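Since strlen returns a length in bytes rather than in characters, the two can differ for non-ASCII text. The Python sketch below illustrates the distinction and the resulting prefix byte for the example above. It assumes the string is encoded as UTF-8, which is an assumption about the encoding and not something stated in this PR; `strlen_bytes` and `size_prefix` are hypothetical helper names, not part of customasm.

```python
def strlen_bytes(text: str, encoding: str = "utf-8") -> int:
    """Return the encoded length of text in bytes (UTF-8 assumed)."""
    return len(text.encode(encoding))


def size_prefix(num: int) -> int:
    """Mirror the single-byte size rule: a 1 bit followed by (num + 1) in 7 bits."""
    if not 0 <= num <= 0x7E:
        raise ValueError("length does not fit the single-byte form")
    return 0x80 | (num + 1)


# "=stdin" is 6 bytes, so the prefix is 0x87, matching the comment above.
prefix = size_prefix(strlen_bytes("=stdin"))
```

For plain ASCII strings the byte count and the character count agree, so the difference only matters once a string contains multi-byte characters.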

@hlorenzi
Owner

hlorenzi commented Dec 2, 2023

This looks good! In the future, I'd like to go even further and add some functionality to get the bit-size of any kind of value (#95), or the data pointed to by a label (as in #167).

@hlorenzi hlorenzi merged commit a793015 into hlorenzi:main Dec 2, 2023
1 check passed