How to manipulate shared memory between golang and C

One Shared Memory Space, Two Different Programming Languages

This article builds on Using C in golang and relies on pointer arithmetic in Go. Reading that background first is strongly recommended.

Whether the original code is written in C or Go, the two languages may represent data differently. This article shows how both can operate on the same data through one block of shared memory. First, a review of how to allocate memory from Go by calling a C function:

// Allocate a 128-byte memory space
pAry := C.malloc(128)
// Free the memory space
C.free(unsafe.Pointer(pAry))

Suppose a Go program randomly generates several student records, each containing the student’s name, ID, and scores for four subjects. The records are then handed to C, which calculates each student’s total score, updates the total field, and sorts the records by total. Finally, Go prints the updated records. Illustrated graphically, the process is:

[Figure: Golang share memory with C]

The diagram shows which parts must be done in Go. First, decide how many student records to generate by declaring a constant _STUDENTS_COUNT of 5:

const _STUDENTS_COUNT = 5

Next is the Student structure for storing student data:

type Student struct {
    name string
    id int
    math, english, chemistry, physics, total float32  // 5 float32 type score attributes
}

After defining the structure, the next step is to generate the student data with random numbers. In Go, random number functions are available by importing the math/rand package:

import "math/rand"

The prototypes for the two random number generation functions used in this program are:

func Intn(n int) int
func Float32() float32

The first function, Intn(), will randomly generate an integer in the range [0, n), and the second function, Float32(), will randomly generate a float32 type floating-point number in the range [0.0, 1.0). The implementation function for generating student data is as follows:

func generateStudents(students []Student) []Student {
    nameList := [...]string{
        "Kilroaam", "Ockgored", "Caedmas", "Aaron", "Bematen",
        "Chloe", "Manaeas", "Aallyah", "Emma", "Julie",
        "Uyfallian", "Thilnaeana", "Lymnysa", "Mynaaeleas", "Kyleigh",
        "Aldusa", "Christine", "Rubella", "Victor", "Xavier",
    }
    // Use the current system's nanosecond count as the random seed
    rand.Seed(time.Now().UTC().UnixNano())
    // _STUDENTS_COUNT = 5, for a total of 5 student data entries
    for i := 0; i < _STUDENTS_COUNT; i++ {
        students = append(
            students,
            Student{
                name:      nameList[rand.Intn(len(nameList))],
                id:        i + 1,
                math:      75 + rand.Float32()*25,
                english:   75 + rand.Float32()*25,
                chemistry: 75 + rand.Float32()*25,
                physics:   75 + rand.Float32()*25,
            },
        )
    }
    return students
}

The code is straightforward. It randomly generates 5 student records, appends them to a slice named students, and returns the slice. The arithmetic on the four subject scores only ensures that each score falls between 75 and 100.
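The score range can be checked with a short sketch. The helper names randomScore and allScoresInRange are hypothetical, introduced only for illustration; randomScore wraps the same expression used in generateStudents.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomScore is a hypothetical helper wrapping the expression used in
// generateStudents. rand.Float32() lies in [0.0, 1.0), so the result is
// at least 75 and never exceeds 100.
func randomScore() float32 {
	return 75 + rand.Float32()*25
}

// allScoresInRange draws n scores and reports whether every one of them
// stays between 75 and 100.
func allScoresInRange(n int) bool {
	for i := 0; i < n; i++ {
		s := randomScore()
		if s < 75 || s > 100 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allScoresInRange(1000))
}
```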

The only thing to note is that the code seeds the generator with the system time in nanoseconds at the moment the program runs. Go’s random numbers are pseudorandom, so a different seed on each run yields different student data each time. (Since Go 1.20 the global generator is seeded automatically and rand.Seed is deprecated, so this call is only needed on older versions.) The nanosecond value comes from Go’s standard time package. As mentioned in a previous section, a function from an imported package is called like this:

{ package_name }.{      function_name     }
{     time     }.{ Now().UTC().UnixNano() }  // Applied in this function

The call that returns the current system time in nanoseconds therefore belongs to the time package, which must be imported at the beginning:

import "time"

After preparing the student data, the next step is to dynamically allocate memory based on the size of the data occupied in memory:

oneOffset := unsafe.Sizeof(students[0])
// oneOffset * _STUDENTS_COUNT = The total memory capacity needed for all student data
pSharedMem := C.malloc(C.size_t(oneOffset * _STUDENTS_COUNT))
uintPtr := uintptr(unsafe.Pointer(pSharedMem))

This code uses several functions and types from the unsafe package. If they are unfamiliar, refer to Methods of Manipulating Pointers in golang. Simply put, this code does the following:

  1. Calculate how much memory a single student structure occupies.
  2. Dynamically allocate a memory block based on the total size of the student data using the C language’s malloc function.
  3. Declare a uintptr that can perform pointer arithmetic, pointing to this dynamically allocated memory block.

After preparing the necessary memory, the generated student data can be copied over. This is done by calling the C function memcpy, which standard C declares in <string.h>. Since the C program introduced later already includes a header that declares it, there is no need to include it again here. First, let’s look at the prototype of memcpy:

void * memcpy ( void * destination, const void * source, size_t num );

This C function takes three parameters. The first two are void* pointers: the first is the address to copy to and the second is the address to copy from. The last parameter is the number of bytes to copy. As mentioned previously, unsafe.Pointer in Go plays the role of void* in C, so memcpy is called from Go like this:

// Not just void*, size_t also needs to be converted
C.memcpy(unsafe.Pointer(pSharedMem), unsafe.Pointer(&students[0]), C.size_t(oneOffset*_STUDENTS_COUNT))

A common mistake C programmers might make here is to write the second argument, which should point to the student data, in either of the following two ways:

unsafe.Pointer(&students)  // Compilable, but produces incorrect results
unsafe.Pointer(students)   // Uncompilable, because the students slice is not a pointer!

Why a slice is not a pointer, and why the address of the slice variable students is not the address of its elements, is detailed in Go Language’s slice. After copying, the original students slice is no longer needed and can be released:

students = nil

Go’s garbage collector automatically reclaims memory that is no longer referenced by any variable.
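To see why &students compiles but gives incorrect results, the address of a slice variable can be compared with the address of its first element. This is a minimal sketch with a hypothetical helper, headerAndData, and an int32 slice in place of the student data.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerAndData returns the address of the slice variable itself and the
// address of its first element. Only the second is where the elements live.
func headerAndData(s []int32) (header, data unsafe.Pointer) {
	return unsafe.Pointer(&s), unsafe.Pointer(&s[0])
}

func main() {
	scores := []int32{90, 80, 70}
	header, data := headerAndData(scores)

	// The two addresses differ: one is the slice header, the other the data.
	fmt.Println(header == data)
	// Reading through the data pointer reaches the first element.
	fmt.Println(*(*int32)(data))
	// Sizeof reports the size of the slice header, not of the elements.
	fmt.Println(unsafe.Sizeof(scores))
}
```

Passing the header address to memcpy would copy the slice’s internal pointer, length, and capacity rather than the records, which is why &students[0] is the correct source address.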

Next, let’s verify if the data that has been copied into memory is correct:

for i := 0; i < _STUDENTS_COUNT; i++ {
    // Obtain the address of the student in memory through Go language pointer arithmetic
    student := (*Student)(unsafe.Pointer(uintPtr + oneOffset*uintptr(i)))
    // Print the student data
    fmt.Println(student)
}

For an explanation of pointer arithmetic in Go language, refer to Manipulating Pointers in golang. The printed student data is organized as follows:

      NAME  ID    MATH   ENGLISH   CHEMISTRY   PHYSICS    TOTAL
  Caedmas,  1,  78.45,    89.21,      88.10,    92.66,     0.00
Uyfallian,  2,  97.50,    94.94,      83.92,    92.59,     0.00
   Xavier,  3,  82.72,    88.65,      88.66,    86.40,     0.00
     Emma,  4,  78.43,    91.59,      95.85,    76.66,     0.00
   Xavier,  5,  90.88,    80.01,      87.98,    80.35,     0.00

The total column shows 0.00 because the total attribute was not set when the student data was generated, so it holds Go’s zero value, which is 0.0 for the float32 type.

After verifying the data in memory is correct, it’s time to call the previously written C function for processing. The prototype of the C function we plan to call is as follows:

void orderStudents(void* mem, const int _NumberOfStudent, const int _GradeOffset, const int _StudentSize);

This function requires four parameters: the first is the starting address of the student data placed in memory, the second is the total number of students, the third is the relative offset of the score attribute in the structure, and the last parameter is the size of a student structure. So far, except for the third parameter, everything else is ready, so we need to calculate this relative offset first:

gradeOffset := unsafe.Offsetof(students[0].math)

The reason for choosing the math attribute is that it is the first among all the score attributes. For a deeper explanation of the unsafe.Offsetof function, see Manipulating Pointers in golang.

After having all parameters, we can call our C function:

C.orderStudents(unsafe.Pointer(pSharedMem), _STUDENTS_COUNT, C.int(gradeOffset), C.int(oneOffset))  // _STUDENTS_COUNT is the constant 5

The only thing to note here is the conversion between the types of the two languages. _STUDENTS_COUNT needs no conversion because it is declared as an untyped constant: the compiler substitutes its value and converts it implicitly to the parameter type. Typed variables such as gradeOffset and oneOffset (both uintptr) must be converted to C.int explicitly.
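The difference between constants and typed variables can be seen without cgo. In this sketch, takesInt32 is a hypothetical function whose int32 parameter stands in for a C int parameter, and the value 48 is an arbitrary example.

```go
package main

import "fmt"

const studentsCount = 5 // an untyped constant, like _STUDENTS_COUNT

// takesInt32 stands in for a C function with an int parameter. int32 is
// used here only so that the sketch compiles without cgo.
func takesInt32(n int32) int32 { return n }

func main() {
	var size uintptr = 48 // a typed variable, like oneOffset (value is arbitrary)

	// An untyped constant takes the parameter's type implicitly.
	fmt.Println(takesInt32(studentsCount))
	// A typed variable needs an explicit conversion.
	fmt.Println(takesInt32(int32(size)))
	// takesInt32(size) would not compile: size is a uintptr, not an int32.
}
```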

Below is the C function implementation called by Go language:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>  // memcpy is declared here in standard C

// Can't use printf to print so using fprintf instead
#define printf(...) fprintf (stderr, __VA_ARGS__)

// Grade structure declaration, interested only in students' scores when sorting
struct Grade {
    float math, english, chemistry, physics, total;
};

// The C function called by Go language
void orderStudents(void* mem, const int _NumberOfStudent, const int _GradeOffset, const int _StudentSize) {

    int i = 0, j = 0;

    for (i = 0; i < _NumberOfStudent - 1; i++ ) {
        for (j = 0; j < _NumberOfStudent - 1 - i; j++ ) {
            // pGrade = starting address of student structure + size of one student structure * which student + _GradeOffset
            struct Grade *pGrade     = (struct Grade*)((unsigned char *)mem + _StudentSize *    j    + _GradeOffset);
            struct Grade *pGradeNext = (struct Grade*)((unsigned char *)mem + _StudentSize * (j + 1) + _GradeOffset);

            // Calculate total score
            if (pGrade->total == 0.0)
                pGrade->total = pGrade->math + pGrade->english + pGrade->chemistry + pGrade->physics;
            if (pGradeNext->total == 0.0)
                pGradeNext->total = pGradeNext->math + pGradeNext->english + pGradeNext->chemistry + pGradeNext->physics;

            // Sort based on total score
            if (pGrade->total > pGradeNext->total) {
                // pGrade - _GradeOffset = starting address of student structure
                char tmpStudent[_StudentSize];
                memcpy(                               tmpStudent,      (unsigned char*)pGrade - _GradeOffset,  _StudentSize);
                memcpy(    (unsigned char*)pGrade - _GradeOffset,  (unsigned char*)pGradeNext - _GradeOffset,  _StudentSize);
                memcpy((unsigned char*)pGradeNext - _GradeOffset,                                 tmpStudent,  _StudentSize);
            }
        }
    }
}

This C function is straightforward and follows these steps:

  1. Through pointer operations, convert each student’s scores in the passed memory into Grade structure pointers.
  2. Prepare a second Grade structure pointer, pointing to the next student’s score data.
  3. Calculate the total score for these two Grade structure pointers and update the total variable.
  4. Sort based on the total score of each pair of students.

Because every operation goes through pointers, each write lands directly in the shared memory, so the data is already updated in place once sorting finishes.

Finally, print out the data in memory to see the results:

      NAME  ID    MATH   ENGLISH   CHEMISTRY   PHYSICS    TOTAL
   Xavier,  5,  90.88,    80.01,      87.98,    80.35,   339.21
     Emma,  4,  78.43,    91.59,      95.85,    76.66,   342.53
   Xavier,  3,  82.72,    88.65,      88.66,    86.40,   346.43
  Caedmas,  1,  78.45,    89.21,      88.10,    92.66,   348.42
Uyfallian,  2,  97.50,    94.94,      83.92,    92.59,   368.96

After processing, the student data not only has total scores but is also sorted. Lastly, don’t forget to free the previously allocated memory:

C.free(unsafe.Pointer(pSharedMem))

This concludes the method of sharing data between Go and C languages through the same piece of memory.

Other Methods of Copying Data into Memory

Finally, let’s talk about a few methods of memory copying.

The memory copying method used in this article is the C function memcpy. Even from the perspective of Go, a higher-level language, it is well designed: clear in purpose, simple to use, and fast. Two addresses and a size are all it needs. It reflects the character of C itself: the toolbox may lack some components, so you sometimes have to assemble things yourself, but each tool is as direct and efficient as a wrench or a hammer.

This section offers a few other ways to copy the student data into memory using Go. There are certainly more; these are only meant to spark ideas. The first method follows the same approach as the printing loop: obtain the address of each record through pointer arithmetic and then assign the value:

for i := 0; i < _STUDENTS_COUNT; i++ {
    // Use pointer arithmetic to obtain the memory address of each student
    pStudent := (*Student)(unsafe.Pointer(uintPtr + oneOffset*uintptr(i)))
    // Use the * operator to set the value at the address pointed to by the pointer
    *pStudent = students[i]
}

This method copies one student structure at a time, _STUDENTS_COUNT times in total, which is likely less efficient than a single memcpy. So consider a second method:

// Convert the uintptr, through unsafe.Pointer, to a pointer to a [_STUDENTS_COUNT]Student array
allStudents := (*[_STUDENTS_COUNT]Student)(unsafe.Pointer(uintPtr))
// Copy the elements of the students slice into the array that allStudents points to
copy(allStudents[:], students)

Both lines have points to note, explained as follows:

The first line uses unsafe.Pointer as the universal conversion type to turn the uintptr into a pointer to a Student array of size _STUDENTS_COUNT. Note that a plain type conversion cannot turn the address into a slice of unspecified size; the target must be a pointer to an array of a specified size. The reason is explained in detail in the section slice in golang and in the section on manipulating compound types with pointers, regarding unsafe.Sizeof. Simply put, a slice value, whatever its element type, does not contain the elements themselves. It is a fixed-size header (24 bytes on typical 64-bit platforms) that holds a pointer to the actual data together with a length and a capacity.

The second line uses Go’s built-in copy() function. As mentioned in the section Useful Tips of Slice Tricks in Go, copy() accepts only two slices, and it does not matter if they differ in length: the number of elements copied is always the smaller of the two lengths. The allStudents array therefore has to be turned into a slice before copying. The : operator, explained in the section extracting a part of a slice into a new slice, does this. The syntax [i:j] selects every element x with i <= x < j, and [:] selects all elements. Since the allStudents array is explicitly sized to _STUDENTS_COUNT, which matches the length of the students slice, the copy transfers every record as required.

Further Reading