Writing A Compiler In Go を読んでいく Chapter 2 make bytecode.

Writing A Compiler In Go (English Edition)

作者:Ball, Thorsten

Amazon

さて、今回からはいよいよ『Writing A Compiler In Go』です。興奮してきますね。

Chapter 2が終わったので、忘備録として書いていきたいと思います。

今回はbytecode.について書いていきます。

Chapter 2でやること

Chapter 2のゴールは、1 + 2 をコンパイルすることです。

書いてみるとシンプルですが、これが長旅で面白かったです。

1 + 2 をコンパイルする流れ

まずは、流れについてざっとみていきます。

※本書から掲載

図の通り

Lexerによって字句解析を行い、Tokenに分割する
Parserを経て、ASTを生成する
ASTをCompilerに投げて、ByteCodeに変換する
ByteCodeをVMが処理する

という流れです。

Lexer と 2. Parserについては、前書で実装済みなので、今回はCompilerとVMを実装します。

Bytecode

Bytecodeというワードが初登場なので、軽く触れておきます。

※本書から掲載 バイトコードは一連の命令（OpcodeとOperandの組み合わせ）で構成されています。

これらの命令は、メモリ上に連続して配置され、仮想マシン（VM）によって順番に解釈・実行されます。

Opcodeは、特定の操作を指示するためのコードで、バイトコードの命令を識別するために使用されます。

図の例では、PUSH、ADDがOpcodeです。

Operandは、Opcodeが実行する操作の対象となる値または変数を指します。

図の例では、505, 205 がOperandです。

ここらへんは、コンピュータシステムの理論と実装で完全に理解してるので、スルッと進めました。

コンピュータシステムの理論と実装 ―モダンなコンピュータの作り方

Amazon

make bytecode.

では、まずはbytecodeがどのように生成されるかについてみていきましょう。

前書同様、TDDで進んでいくため、まずはテストコードから。

// code/code_test.go
package code

import (
    "testing"
)

func TestMake(t *testing.T) {
    tests := []struct {
        op       Opcode
        operands []int
        expected []byte
    }{
        {OpConstant, []int{65534}, []byte{byte(OpConstant), 255, 254}},
        {OpAdd, []int{}, []byte{byte(OpAdd)}},
    }

    for _, tt := range tests {
        instruction := Make(tt.op, tt.operands...)

        if len(instruction) != len(tt.expected) {
            t.Errorf("instruction has wrong length. want=%d, got=%d",
                len(tt.expected), len(instruction))
        }

        for i, b := range tt.expected {
            if instruction[i] != tt.expected[i] {
                t.Errorf("wrong byte at pos %d. want=%d, got=%d",
                    i, b, instruction[i])
            }
        }
    }
}

テストケースは、

OpConstant 65534
OpAdd

です。

これがMake関数によってバイトコードに変換された結果が正しいものであるかをテストしています。

では、Make関数をみていきましょう

// code/code.go

type Definition struct {
    Name          string
    OperandWidths []int
}

var definitions = map[Opcode]*Definition{
    OpConstant: {"OpConstant", []int{2}},
    OpAdd:      {"OpAdd", []int{}},
}


func Make(op Opcode, operands ...int) []byte {
    def, ok := definitions[op]

    if !ok {
        return []byte{}
    }

    instructionLen := 1
    for _, w := range def.OperandWidths {
        instructionLen += w
    }

    instruction := make([]byte, instructionLen)
    instruction[0] = byte(op)

    offset := 1
    for i, o := range operands {
        width := def.OperandWidths[i]
        switch width {
        case 2:
            binary.BigEndian.PutUint16(instruction[offset:], uint16(o))
        }

        offset += width
    }

    return instruction
}

順にみていきましょう。

定義されたOpcodeかどうかをチェック

def, ok := definitions[op]

まずは、Opcodeが定義されたものであるかどうかをチェックします。

命令のバイト数を取得

次に、命令のバイト数を取得します。

instructionLen := 1
for _, w := range def.OperandWidths {
    instructionLen += w
}

Opcode自体が1バイトなので、OperandWidthsの長さを取得して足していってます。

バイトスライスを生成して、Opcodeを書き込む

instructionLenを元に、バイトスライスを生成し、最初のバイトにOpcodeを入れます。

instruction := make([]byte, instructionLen)
instruction[0] = byte(op)

バイトスライスにOperandを書き込む

で、最後はinstructionにOperandをビッグエンディアンのバイト表現に変換して、書き込むという流れです。

binary.BigEndian.PutUint16(instruction[offset:], uint16(o))

これにて、make bytecode.の完了です。

まとめ

今回は make bytecode編ということでバイトコードの生成についてみていきました。次回はコンパイラからVMあたりまで触れていきたいと思います

僕から以上。あったかくして寝ろよ

全体のコードはこちら

github.com

んだ日記

ndaDayoの技術日記です