Manipulating pointers in golang

No problem, it’s about pointers!

A previous article mentioned the issues to be aware of when using slices in Go language functions. Normally, Go language does not support pointer arithmetic, but if you’ve also read the article about using C language code in Go, you would discover that there is a package called unsafe in Go, which includes a type called Pointer, essentially behaving much like the void* pointer in C language! Since we have the void* pointer available, and the package name is also unsafe, doesn’t it imply that programmers can ~~fully exploit~~ this package to do potentially unsafe things? So, the idea is this: since programmers have actively imported the unsafe package, it theoretically means they know what they are doing. And from the name unsafe, it’s clear that we are not in the so-called normal situation, so the Go compiler should not interfere too much! From now on, let me use the pointers like a pro!

In the following sections, some examples will be given to explain how to use the functions provided by the unsafe package.

Go Language’s unsafe.Pointer and uintptr

Regarding pointers in Go language, you can refer to the previously written section Go Language’s Pointers. Their basic behavior matches pointers in C language, except that you cannot perform pointer arithmetic on them. The Pointer type located in the unsafe package is for ~~careless coders to create more problems~~ cautious programmers to fill this gap. Let’s take a look at the type definition of Pointer:

type Pointer *ArbitraryType

Go language documentation describes Pointer as a pointer to an arbitrary type, which is what the ArbitraryType in its declaration stands for. Continuing to the definition of ArbitraryType:

type ArbitraryType int

ArbitraryType is declared as int, but the documentation notes that it exists for documentation purposes only and stands for an arbitrary Go type; it does not mean that a Pointer points to an int. What matters is that an unsafe.Pointer has the size of a machine address, so it can hold a pointer to any type, which is why it can be used like void* in C language. In Go language, int is either 32 or 64 bits wide depending on the platform, so on common platforms it happens to have the same size as a pointer. The following code prints the sizes of int, int64, and unsafe.Pointer (the execution environment of the program is a 64-bit machine):

var int_auto int
var int_64 int64
var unsafe_ptr unsafe.Pointer
fmt.Println("  int_auto =", unsafe.Sizeof(int_auto))  //   int_auto = 8
fmt.Println("    int_64 =", unsafe.Sizeof(int_64))    //     int_64 = 8
fmt.Println("unsafe_ptr =", unsafe.Sizeof(unsafe_ptr)) // unsafe_ptr = 8

Okay. Let’s start from the simplest pointer operation. The example used is the first example in the section Go Language’s Pointers, where we hope to change the value of c[1] from 2 to 5 through the new pointer address after pointer arithmetic.

c := [3]int{1, 2, 3}  // c is an array of size 3
d := &c               // d is a pointer to c
*(d+1) = 5            // This line will cause a compilation error!

The third line will cause a compilation error because you cannot directly perform pointer arithmetic on a regular pointer. But now we have the unsafe package, and the code can be modified as follows:

c := [3]int{1, 2, 3}
one := unsafe.Sizeof(c[0])
d := uintptr(unsafe.Pointer(&c[0]))
*(*int)(unsafe.Pointer(d + one)) = 5
fmt.Println("c =", c)

The program execution result is as follows:

c = [1 5 3]  // Successfully changed the value of c[1]

Before delving deeper into the code, it is recommended that readers first understand what the slice type is and how it differs from an array (c in this example is an array). Assuming that’s clear, we can now start explaining the code.

First is the calculation of one:

one := unsafe.Sizeof(c[0])

Here, the Sizeof() function from the unsafe package is used to obtain the size in memory of a single element of the array c, which is then stored in the variable one. This variable will be used later for pointer arithmetic.

The Sizeof() function returns a value of type uintptr, which is very special. Strictly speaking, uintptr is not a pointer type but an unsigned integer type large enough to hold an address, and that is exactly why arithmetic works on it. Apart from unsafe.Pointer, no pointer type can be converted directly to uintptr! Therefore, converting a regular pointer to a uintptr must go through unsafe.Pointer:

// p is a pointer to any type; arithmetic can be performed on the result
uintptr(unsafe.Pointer(p))

The rules for converting between pointers are also listed in the unsafe package documentation:

A pointer value of any type can be converted to a Pointer.
A Pointer can be converted to a pointer value of any type.
A uintptr can be converted to a Pointer.
A Pointer can be converted to a uintptr.

A diagram makes it clearer:

[Figure: Pointer conversion in golang]

From this diagram, it’s not hard to understand why Go language’s unsafe.Pointer is like C language’s void*.
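To make the four conversion rules concrete, here is a minimal sketch that walks a pointer through each of them. The function names roundTrip and float64Bits are made up for this illustration; the second one mirrors the way the unsafe documentation describes reinterpreting a float64 as a uint64.

```go
package main

import (
	"fmt"
	"unsafe"
)

// roundTrip sends a *int through every conversion in the list:
// *int -> unsafe.Pointer -> uintptr -> unsafe.Pointer -> *int.
// Everything happens in one expression, with no arithmetic in between.
func roundTrip(p *int) *int {
	return (*int)(unsafe.Pointer(uintptr(unsafe.Pointer(p))))
}

// float64Bits reinterprets the bytes of f as a uint64 without changing them.
// This uses only the first two rules: *float64 -> unsafe.Pointer -> *uint64.
func float64Bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func main() {
	x := 42
	q := roundTrip(&x)
	*q = 43 // q still points at x, so this changes x

	fmt.Println(x, q == &x)               // 43 true
	fmt.Printf("%#x\n", float64Bits(1.0)) // 0x3ff0000000000000
}
```

Just like casting through void* in C, nothing checks that the target type makes sense; that responsibility is entirely yours.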

Next, let’s continue explaining the declaration of d in the code:

d := uintptr(unsafe.Pointer(&c[0]))

In short, this line is the process of converting from *int to uintptr through unsafe.Pointer, which can be broken down into equivalent code as follows:

intPtr := &c[0]                     // *int
voidPtr := unsafe.Pointer(intPtr)   // unsafe.Pointer
uintPtr := uintptr(voidPtr)         // uintptr

&c[0] is the address of the first element of the array c. For an array, &c would give the same address (with type *[3]int), but &c[0] also keeps working if c is later changed to a slice, where &c would be the address of the slice header rather than of the elements; this is where understanding the slice type helps. Next, the pointer intPtr, which points to c[0] and is of type *int, is converted to unsafe.Pointer. As mentioned before, unsafe.Pointer in Go language is like void* in C language, so a pointer has to pass through this type before being converted to anything else. The outermost layer of the expression does the last conversion, from unsafe.Pointer to uintptr.

The next code to explain is the key expression for pointer arithmetic:

*(*int)(unsafe.Pointer(d + one)) = 5

The pointer conversion process of this expression is actually the reverse operation of the previous line, from uintptr back to *int through unsafe.Pointer. This expression can be broken down into equivalent code as follows:

d = d + one                         // uintptr
voidPtr := unsafe.Pointer(d)        // unsafe.Pointer
intPtr := (*int)(voidPtr)           // *int
*intPtr = 5

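The variables d and one are both of type uintptr, which, to emphasize again, is the only type in this chain that supports arithmetic, so adding them moves the address forward by one element. Note that the unsafe package documentation asks for the conversion to uintptr, the arithmetic, and the conversion back to unsafe.Pointer to appear in a single expression, because the garbage collector does not treat a uintptr as a reference; the step-by-step form here is for explanation only.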

The second line converts uintptr to unsafe.Pointer.

The third line converts it to *int.

The final line’s variable intPtr is already a *int pointing to the memory address of c[1], so changing the value stored there is simply *intPtr = 5.

You might ask, why not directly perform the following operation on the variable d of type uintptr:

d = d + one
*d = 5        // Already got the pointer, why not directly set the value to 5 after dereferencing?

Or perform the following operation on the variable voidPtr of type unsafe.Pointer:

voidPtr := unsafe.Pointer(d)
*voidPtr = 5  // Already got the pointer, why not directly set the value to 5 after dereferencing?

The reason for converting back to *int before setting the value is simple. Go language is a strongly typed language and requires an explicit conversion to the right type before you perform an operation on a value. A uintptr is just an integer and an unsafe.Pointer carries no element type, so neither can be dereferenced, and both snippets above fail to compile. Since you intend to change the value of c[1] from 2 to 5, and c[1] is an int, you need to convert the pointer back to *int first:

intPtr := (*int)(voidPtr)
*intPtr = 5   // This is a legal pointer value setting operation of the same type!

So far, the rules for manipulating pointers in Go language can be summarized as:

Only uintptr values support arithmetic on addresses.
Only typed pointers such as *int can be dereferenced to read or write memory.
Converting between the two relies on unsafe.Pointer.
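One caveat worth flagging: the unsafe package documentation says a uintptr should not be stored in a variable and converted back to unsafe.Pointer later, because the garbage collector does not see a uintptr as a reference. Below is a sketch of the same c[1] = 5 example written in two forms that stay within that rule. The function names are invented for this illustration, and unsafe.Add requires Go 1.17 or newer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// setSecondSingleExpr writes v into c[1]. The conversion to uintptr, the
// addition, and the conversion back to unsafe.Pointer all sit in one expression.
func setSecondSingleExpr(c *[3]int, v int) {
	p := unsafe.Pointer(uintptr(unsafe.Pointer(&c[0])) + unsafe.Sizeof(c[0]))
	*(*int)(p) = v
}

// setSecondAdd does the same with unsafe.Add (Go 1.17+), which offsets an
// unsafe.Pointer directly and never exposes the address as a plain integer.
func setSecondAdd(c *[3]int, v int) {
	p := unsafe.Add(unsafe.Pointer(&c[0]), unsafe.Sizeof(c[0]))
	*(*int)(p) = v
}

func main() {
	c := [3]int{1, 2, 3}

	setSecondSingleExpr(&c, 5)
	fmt.Println("c =", c) // c = [1 5 3]

	setSecondAdd(&c, 7)
	fmt.Println("c =", c) // c = [1 7 3]
}
```

Both forms follow the three rules above; they only differ in how long the address lives as an integer.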

Let’s look at another example, adding 1 to every element in slice a through unsafe.Pointer and uintptr:

// Declare a slice a of length 5
a := []int{0, 1, 2, 3, 4}
// Calculate the size of one pointer shift
ptrSize := unsafe.Sizeof(a[0])
// Declare ptr as the address of the first element of slice a, held as a uintptr
ptr := uintptr(unsafe.Pointer(&a[0]))

// Print the value of slice a before pointer manipulation
fmt.Println("Before pointer manipulation a =", a)

// Use pointer manipulation to add one to the value of every element in slice a
for i := 0; i < len(a); i++ {
    // Add 1 to the value at the pointer address
    *(*int)(unsafe.Pointer(ptr)) += 1
    // Move the pointer to the next address
    ptr += ptrSize
}
// Print the value of slice a
fmt.Println("After pointer manipulation a =", a)

The execution result of the program is:

Before pointer manipulation a = [0 1 2 3 4]
After pointer manipulation a = [1 2 3 4 5]  // Successfully added 1 to the contents of the slice through pointer manipulation
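One thing the loop above quietly assumes is that the slice is not empty, since &a[0] panics on an empty slice. Here is a hedged sketch of the same idea wrapped in a function with that guard; incrementAll is a name chosen for this example, and unsafe.Add (Go 1.17+) stands in for the manual uintptr bookkeeping.

```go
package main

import (
	"fmt"
	"unsafe"
)

// incrementAll adds 1 to every element of a by walking the backing array
// with explicit byte offsets instead of ordinary indexing.
func incrementAll(a []int) {
	if len(a) == 0 {
		return // &a[0] would panic on an empty slice
	}
	base := unsafe.Pointer(&a[0])
	step := unsafe.Sizeof(a[0])
	for i := 0; i < len(a); i++ {
		// Element i lives i*step bytes past the first element.
		p := (*int)(unsafe.Add(base, uintptr(i)*step))
		*p++
	}
}

func main() {
	a := []int{0, 1, 2, 3, 4}
	fmt.Println("Before pointer manipulation a =", a) // [0 1 2 3 4]
	incrementAll(a)
	fmt.Println("After pointer manipulation a =", a) // [1 2 3 4 5]
}
```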

The logic of the slice example is exactly the same as before; try to work through it yourself and it will click. The one thing to note is that arithmetic on a uintptr is not scaled by the element size: you must compute the size of each step yourself before adding it. Compare this with the following C language code:

char charAry[] = {0, 1, 2};
char *pCharAry = charAry;
pCharAry++;
printf("*pCharAry = %d\n", *pCharAry);

int intAry[] = {0, 1, 2};
int *pIntAry = intAry;
pIntAry++;
printf(" *pIntAry = %d\n", *pIntAry);

The result printed by this code is:

*pCharAry = 1
*pIntAry = 1

In C language, when a pointer is incremented with ++, the compiler moves it by the size of the type it points to. So pCharAry++ moves by sizeof(char) and pIntAry++ moves by sizeof(int). If you perform a similar operation on a uintptr in Go language, the results may surprise you:

charAry := []byte{0, 1, 2}
pCharAry := uintptr(unsafe.Pointer(&charAry[0]))
pCharAry++
fmt.Printf("*pCharAry = %d\n", *(*byte)(unsafe.Pointer(pCharAry)))

intAry := []int{0, 1, 2}
pIntAry := uintptr(unsafe.Pointer(&intAry[0]))
pIntAry++
fmt.Printf(" *pIntAry = %d\n", *(*int)(unsafe.Pointer(pIntAry)))

The result of this code is:

*pCharAry = 1
*pIntAry = 72057594037927936
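That odd number is not random: it is exactly 1 << 56. The following sketch reproduces it with safe code, assuming the original ran on a little-endian 64-bit machine where int is 8 bytes. The function name misalignedRead is made up for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// misalignedRead lays out the values 0 and 1 as two little-endian 8-byte
// integers, then decodes 8 bytes starting one byte past the beginning.
// That window covers bytes 1..7 of the first element (all zero) plus the
// lowest byte of the second element (0x01), which becomes the top byte.
func misalignedRead() uint64 {
	buf := make([]byte, 16)
	binary.LittleEndian.PutUint64(buf[0:8], 0)  // stands in for intAry[0]
	binary.LittleEndian.PutUint64(buf[8:16], 1) // stands in for intAry[1]
	return binary.LittleEndian.Uint64(buf[1:9])
}

func main() {
	v := misalignedRead()
	fmt.Println(v)          // 72057594037927936
	fmt.Println(v == 1<<56) // true
}
```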

In Go language, a uintptr is a plain integer, so ++ simply adds 1 to the address, moving it forward by one byte. For pCharAry, whose elements are one byte each, the result is still correct. For pIntAry, the address now points into the middle of intAry[0], so the 8 bytes read from there straddle two elements and produce a meaningless value. The code can be modified as follows:

charAry := []byte{0, 1, 2}
pCharAry := uintptr(unsafe.Pointer(&charAry[0]))
pCharAry += unsafe.Sizeof(charAry[0])  // Move by the memory size of one byte
fmt.Printf("*pCharAry = %d\n", *(*byte)(unsafe.Pointer(pCharAry)))

intAry := []int{0, 1, 2}
pIntAry := uintptr(unsafe.Pointer(&intAry[0]))
pIntAry += unsafe.Sizeof(intAry[0])    // Move by the memory size of one int
fmt.Printf("*pIntAry = %d\n", *(*int)(unsafe.Pointer(pIntAry)))

This modification yields the correct result:

*pCharAry = 1
*pIntAry = 1

Using Pointers with Composite Types

Suppose there is a structure named Student as follows:

// Define a Student structure
type Student struct {
    name    string
    id      int
    grade   float32
    friends []string
}
// Declare an empty student variable to represent this structure
student := Student{}

This Student structure contains 4 members: the student’s name, ID, grade, and a list of friends. Next, let’s see how to use functions provided in the unsafe package to manipulate this structure.

First is the frequently mentioned Sizeof() function:

fmt.Println("Size of Student.name      =", unsafe.Sizeof(student.name))
fmt.Println("Size of Student.id        =", unsafe.Sizeof(student.id))
fmt.Println("Size of Student.grade     =", unsafe.Sizeof(student.grade))
fmt.Println("Size of Student.friends   =", unsafe.Sizeof(student.friends))

The execution result of this code is as follows:

Size of Student.name      = 16
Size of Student.id        = 8
Size of Student.grade     = 4
Size of Student.friends   = 24

Sizeof() reports the number of bytes occupied by the value passed to it, and the returned value is of type uintptr, which can be used directly in address arithmetic. Among these results, the string and []string sizes might raise questions. Let’s start with the latter. friends is a slice with string elements. If you remember the section about slices, you’ll know that a slice is a composite value with three parts: a pointer to the first element of the underlying array, an int for the length of the slice, and an int for its capacity. Sizeof() only counts this header, not the elements it refers to. Therefore, regardless of what is stored inside, the size of a slice is fixed for a given architecture, as shown in the examples below:

slice_byte := []byte{}
slice_int := []int{}
slice_string := []string{}
slice_complicate := [][][][]string{}
slice_struct := []Student{}

fmt.Println(unsafe.Sizeof(slice_byte))        // 24 
fmt.Println(unsafe.Sizeof(slice_int))         // 24
fmt.Println(unsafe.Sizeof(slice_string))      // 24
fmt.Println(unsafe.Sizeof(slice_complicate))  // 24
fmt.Println(unsafe.Sizeof(slice_struct))      // 24

Simply put, on my 64-bit machine, a slice is composed of three fields, each occupying 8 bytes, so the size of a slice is 8 * 3 = 24. Understanding slices makes it easy to see why the size of a string is 16: a string in Go language is also a composite value, containing a pointer to the first byte of the string and an int for its length, making the size of a string naturally 8 * 2 = 16.
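To see those fields with your own eyes, here is a sketch that overlays hand-written mirror structs on a slice and a string. sliceHeader and stringHeader are hypothetical types defined only for this example; they assume the layout described above, which is an implementation detail of the standard Go compiler and not something the language promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader mirrors the assumed in-memory layout of a slice value.
type sliceHeader struct {
	data unsafe.Pointer // address of the first element
	len  int
	cap  int
}

// stringHeader mirrors the assumed in-memory layout of a string value.
type stringHeader struct {
	data unsafe.Pointer // address of the first byte
	len  int
}

// sliceLenCap reads the length and capacity straight out of the header.
func sliceLenCap(s []string) (int, int) {
	h := (*sliceHeader)(unsafe.Pointer(&s))
	return h.len, h.cap
}

// stringLen reads the length straight out of the string header.
func stringLen(s string) int {
	return (*stringHeader)(unsafe.Pointer(&s)).len
}

// headerSizesMatch reports whether the mirror structs are exactly as large
// as the real slice and string values.
func headerSizesMatch() bool {
	return unsafe.Sizeof(sliceHeader{}) == unsafe.Sizeof([]string(nil)) &&
		unsafe.Sizeof(stringHeader{}) == unsafe.Sizeof("")
}

func main() {
	l, c := sliceLenCap(make([]string, 2, 5))
	fmt.Println(l, c)                // 2 5
	fmt.Println(stringLen("hello"))  // 5
	fmt.Println(headerSizesMatch())  // true
}
```

In real code, len() and cap() are of course the right tools; the overlay is only meant to show where the 24 and 16 bytes come from.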

Next, let’s look at another function from the unsafe package, Offsetof():

fmt.Println("Offset of Student.name    =", unsafe.Offsetof(student.name))
fmt.Println("Offset of Student.id      =", unsafe.Offsetof(student.id))
fmt.Println("Offset of Student.grade   =", unsafe.Offsetof(student.grade))
fmt.Println("Offset of Student.friends =", unsafe.Offsetof(student.friends))

The execution result of this code is as follows:

Offset of Student.name    = 0
Offset of Student.id      = 16
Offset of Student.grade   = 24
Offset of Student.friends = 32

This function helps you find the offset of a structure member relative to the structure’s initial memory address. Since the return type is also uintptr, it is clearly designed to facilitate pointer arithmetic by programmers.

The last function in the unsafe package we’ll look at is Alignof():

fmt.Println("Align of Student.name     =", unsafe.Alignof(student.name))
fmt.Println("Align of Student.id       =", unsafe.Alignof(student.id))
fmt.Println("Align of Student.grade    =", unsafe.Alignof(student.grade))
fmt.Println("Align of Student.friends  =", unsafe.Alignof(student.friends))

The execution result of this code is as follows:

Align of Student.name     = 8
Align of Student.id       = 8
Align of Student.grade    = 4
Align of Student.friends  = 8

If you have read the previous section I wrote, the alignment of structs should be familiar. Although the previous article was explained in the context of C language, the purpose and concept of struct alignment are completely the same in these two languages.
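The three functions fit together: each field starts at the first offset that is a multiple of its alignment and lies past the end of the previous field. As a rough sketch, the program below recomputes the Student layout from Sizeof() and Alignof() alone and compares it with what Offsetof() reports. The helper names alignUp, computedLayout, and actualLayout are invented here, and the in-order layout rule is how the standard Go compiler behaves rather than a language guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Student struct {
	name    string
	id      int
	grade   float32
	friends []string
}

// alignUp rounds off up to the next multiple of align (a power of two).
func alignUp(off, align uintptr) uintptr {
	return (off + align - 1) &^ (align - 1)
}

// computedLayout derives every field offset and the total struct size
// from the field sizes and alignments only.
func computedLayout() (offsets [4]uintptr, size uintptr) {
	var s Student
	sizes := [4]uintptr{unsafe.Sizeof(s.name), unsafe.Sizeof(s.id),
		unsafe.Sizeof(s.grade), unsafe.Sizeof(s.friends)}
	aligns := [4]uintptr{unsafe.Alignof(s.name), unsafe.Alignof(s.id),
		unsafe.Alignof(s.grade), unsafe.Alignof(s.friends)}

	var off uintptr
	for i := range sizes {
		off = alignUp(off, aligns[i]) // insert padding if needed
		offsets[i] = off
		off += sizes[i]
	}
	// The struct itself is padded to a multiple of its own alignment.
	return offsets, alignUp(off, unsafe.Alignof(s))
}

// actualLayout asks the compiler for the real offsets and size.
func actualLayout() ([4]uintptr, uintptr) {
	var s Student
	return [4]uintptr{unsafe.Offsetof(s.name), unsafe.Offsetof(s.id),
		unsafe.Offsetof(s.grade), unsafe.Offsetof(s.friends)}, unsafe.Sizeof(s)
}

func main() {
	offsets, size := computedLayout()
	realOffsets, realSize := actualLayout()
	fmt.Println("computed:", offsets, size)
	fmt.Println("actual:  ", realOffsets, realSize)
}
```

On a 64-bit machine this explains why friends sits at offset 32: grade ends at byte 28, and the next multiple of 8 is 32.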

Now, having introduced these three functions of the unsafe package, let’s practice how to directly operate on structs using them:

// Declare a pointer of type uintptr, pointing to the address of the student variable
pStudent := uintptr(unsafe.Pointer(&student))

// Set the name
offset := unsafe.Offsetof(student.name)
*(*string)(unsafe.Pointer(pStudent + offset)) = "Tom"

// Set the ID
offset = unsafe.Offsetof(student.id)
*(*int)(unsafe.Pointer(pStudent + offset)) = 3

// Set the grade
offset = unsafe.Offsetof(student.grade)
*(*float32)(unsafe.Pointer(pStudent + offset)) = 97.5

// Set the list of friends
offset = unsafe.Offsetof(student.friends)
*(*[]string)(unsafe.Pointer(pStudent + offset)) = []string{"John", "Mary", "Dean", "Sam"}

After executing this code, printing student yields:

student = {Tom 3 97.5 [John Mary Dean Sam]}

The principle of pointer operation is the same throughout. Here, only the part about setting the list of friends is broken down. The two expressions for setting the list of friends can be decomposed into equivalent code as follows:

offset = unsafe.Offsetof(student.friends)            // Calculate the pointer offset
pStudent = pStudent + offset                         // uintptr, move the pointer to the memory address where the friends variable is located
voidPtr := unsafe.Pointer(pStudent)                  // unsafe.Pointer
slicePtr := (*[]string)(voidPtr)                     // *[]string
*slicePtr = []string{"John", "Mary", "Dean", "Sam"}  // Set the value at the pointed-to address using the * operator

This should be quite clear now.
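Notice that the breakdown overwrites pStudent, so it could not be reused for the next field without resetting it. Here is a sketch that keeps the base address fixed and fills in all four fields; fillStudent is a name made up for this example, and it uses unsafe.Add (Go 1.17+) so the address never has to be parked in a uintptr variable.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Student struct {
	name    string
	id      int
	grade   float32
	friends []string
}

// fillStudent sets every field of s through its offset from the struct's
// base address instead of through ordinary field assignment.
func fillStudent(s *Student) {
	base := unsafe.Pointer(s) // never modified, so every offset starts from the same place

	*(*string)(unsafe.Add(base, unsafe.Offsetof(s.name))) = "Tom"
	*(*int)(unsafe.Add(base, unsafe.Offsetof(s.id))) = 3
	*(*float32)(unsafe.Add(base, unsafe.Offsetof(s.grade))) = 97.5
	*(*[]string)(unsafe.Add(base, unsafe.Offsetof(s.friends))) = []string{"John", "Mary", "Dean", "Sam"}
}

func main() {
	student := Student{}
	fillStudent(&student)
	fmt.Printf("student = %v\n", student) // student = {Tom 3 97.5 [John Mary Dean Sam]}
}
```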

Finally, organizing these pointer operations into a template expression looks like this:

// Assume the variable is as follows
var variableName Type

// When the pointer points to an array
offset := unsafe.Sizeof(arrayVariableName[0])
uintPtr := uintptr(unsafe.Pointer(&arrayVariableName[0]))

// When the pointer points to a struct
offset := unsafe.Offsetof(structVariableName.AccessMember)
uintPtr := uintptr(unsafe.Pointer(&structVariableName))

// Set the new value for the memory address pointed to by the pointer after offset
*(*Type)(unsafe.Pointer(uintPtr + offset)) = newValue

Compared to C language, pointer arithmetic in Go language is a bit more cumbersome, but once you understand the rules, you’ll see that the underlying concepts carry over directly from C and are not as complex as the syntax makes them look.

Regardless, learning and clarifying the ~~quirks and techniques~~ of a programming language is always a delightful process!
