Background

I have written only one article about Go’s Context so far, Printing All Context Values in Go, and that was quite a while ago. Having used Go continuously since then, I understand Context better and want to expand on the topic. Reworking the old article didn’t seem appropriate, so I decided to write a new, more systematic one.

Common Uses of Context

First, Context was not part of Go from the beginning; the context package was added to the standard library in Go 1.7. It was introduced to solve problems of cross-goroutine coordination and information passing. This is reflected in the package’s API, which includes the following functions:

    func WithValue(parent Context, key, val any) Context
    func WithCancel(parent Context) (ctx Context, cancel CancelFunc)
    func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)
    func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)

These functions highlight the primary uses of Context. Below, I provide examples to illustrate their application.

Passing Information

Here is a simple example of passing information through Context:

    [root@liqiang.io]# cat example0.go
    package main

    import (
        "context"
        "fmt"
    )

    func main() {
        // favContextKey is a distinct type, so it never equals a plain string key.
        type favContextKey string

        f := func(ctx context.Context, k interface{}) {
            if v := ctx.Value(k); v != nil {
                fmt.Println("found value:", v)
                return
            }
            fmt.Println("key not found:", k)
        }

        k := favContextKey("language")
        ctx := context.WithValue(context.Background(), k, "Go")
        f(ctx, k)          // key of type favContextKey
        f(ctx, "language") // key of type string
    }

The output of this example is:

    [root@liqiang.io]# go run example0.go
    found value: Go
    key not found: language

From the output, you can see that although both lookups use the text “language”, only one finds the value. A Context key matches only when both its type and its value are equal: favContextKey("language") and the plain string "language" have different types, so they are different keys. I will discuss this topic further in the future, but if you’re interested, you can check out this document: Go Comparison Operators.

To ensure consistency when storing and retrieving values from Context, it is common practice to define keys as constants, often provided through an SDK. A simplified approach looks like this:

    [root@liqiang.io]# cat example1.go
    func businessCode(ctx context.Context) {
        // sdk.CtxKeyHello is a key constant exported by the SDK package.
        // The comma-ok assertion avoids a panic if the value is not a string.
        val, ok := ctx.Value(sdk.CtxKeyHello).(string)
        if !ok {
            fmt.Println("value not found in context")
            return
        }
        fmt.Println("value found in context", val)
    }

Some developers prefer encapsulating Context operations within an interface, such as:

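    [root@liqiang.io]# cat example2.go
    import "context"

    type ContextOperator interface {
        // WithHello returns a child context that carries value.
        WithHello(ctx context.Context, value string) context.Context
        // GetHello returns the stored value, or nil if it is absent.
        GetHello(ctx context.Context) *string
    }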

This hides implementation details from users and keeps them from handling Context keys directly. The pattern is most common in SDK implementations.

Time Control

In concurrent programming, a common problem is coordinating goroutines. Suppose we start multiple goroutines to perform a task; if any of them fails, we may want the others to terminate early. Context can help us manage this, as shown in the following example:

    [root@liqiang.io]# cat main.go
    package main

    import (
        "context"
        "sync"
        "time"
    )

    func main() {
        ctx, cancel := context.WithCancel(context.Background())
        wg := sync.WaitGroup{}
        wg.Add(3)
        go func() {
            defer wg.Done()
            println("goroutine 1 start")
            <-ctx.Done() // blocks until the context is canceled
            println("goroutine 1 done")
        }()
        go func() {
            defer wg.Done()
            println("goroutine 2 start")
            <-ctx.Done()
            println("goroutine 2 done")
        }()
        go func() {
            defer wg.Done()
            println("goroutine 3 start")
            time.Sleep(1 * time.Second)
            cancel() // simulate a failure: notify the other goroutines
            println("goroutine 3 fail, canceled")
        }()
        wg.Wait()
        println("main done")
    }

Here, three goroutines are started. The third one calls cancel() to cancel the context, notifying the first two to terminate early.

Structure of Context

You might have some questions about how Context works internally. For example:

  • If multiple Cancel Contexts are nested, does canceling one affect all?
  • If I add a new key/value pair to a Context, does it overwrite an existing key?

To answer these questions, we need to examine the source code.

cancelCtx

    [root@liqiang.io]# cat context.go
    func withCancel(parent Context) *cancelCtx {
        if parent == nil {
            panic("cannot create context from nil parent")
        }
        c := &cancelCtx{}
        c.propagateCancel(parent, c)
        return c
    }

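propagateCancel registers the new context as a child of its nearest cancelable ancestor, so cancelCtx values form a tree. Canceling a node cancels all of its descendants, but not its ancestors or siblings: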

Diagram: Context Tree Structure
Image source: Understanding Context Package in Golang

timerCtx

    [root@liqiang.io]# cat context.go
    type timerCtx struct {
        cancelCtx
        timer    *time.Timer
        deadline time.Time
    }

valueCtx

    [root@liqiang.io]# cat context.go
    type valueCtx struct {
        Context
        key, val any
    }

    // Simplified: the real function also validates parent and key.
    func WithValue(parent Context, key, val any) Context {
        return &valueCtx{parent, key, val}
    }

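The valueCtx implementation shows that each WithValue call wraps its parent in a new node, so an existing key is never overwritten: a new pair with the same key only shadows the old one for that context and its descendants. It also shows that retrieving a value walks the chain of parents like a linked list, so lookup cost grows with the nesting depth. Therefore, the number of key-value pairs stored in a Context should be kept small.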

Cross-Process Context Transmission

Contexts are useful within a single process, but what if we need to transmit them across processes, such as in RPC calls? We need to:

  1. Determine which key-value pairs to pass.
  2. Attach them to the inter-process message.
  3. Extract them from the message and restore the Context.

The following sections explore approaches to implementing cross-process Context transmission in Go, including cancellation, timeout handling, and best practices.

Identifying Which Key-Value Pairs to Transmit

There are two common approaches:

  • Full Transmission: All key-value pairs are transmitted without filtering (rarely used).
  • Selective Transmission: Only specific key-value pairs are transmitted (more common).

Selective transmission can be implemented in various ways. One method is using Getter and Setter functions to control which key-value pairs are passed, ensuring consistency and preventing arbitrary modifications. However, this approach requires version updates when new keys need to be supported.

In the open-source RPC framework Kitex, a special key is used to store and pass key-value pairs. This simplifies transmission but can cause issues with service governance, such as key conflicts and length restrictions.

Transmission Protocol

After deciding which key-value pairs to transmit, we need to consider how to transmit them. The implementation varies depending on the protocol:

  • HTTP: Headers are a natural choice but have limitations (e.g., ASCII-only support, length restrictions).
  • RPC Protocols: Some implementations embed metadata in the request/response body, which complicates service governance.

A common approach is to include metadata in an RPC metadata structure. For example, Kitex uses TTHeader for this purpose. However, different companies implement it in different ways:

  • Some teams assign each key to a separate field in the metadata structure.
  • Some teams package all key-value pairs into a single metadata field.
  • Some teams use a hybrid approach, storing specific keys in separate fields while placing the remaining key-value pairs into a metadata field.

Parsing Key-Value Pairs

When receiving a cross-process message, we must parse the message and extract the transmitted key-value pairs. The method used for parsing is closely tied to how the key-value pairs were transmitted, so the choice of transmission approach directly determines the parsing method. This step does not offer much flexibility and is largely dictated by the first two steps.

A Simple Implementation

With the concepts covered, let’s take an example using the Context Operator and assume that we are passing context between two HTTP services. Here is a simple implementation. First, my Context Operator looks like this:

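    [root@liqiang.io]# cat example2.go
    type ContextOperator interface {
        WithHello(ctx context.Context, value string) context.Context
        WithWorld(ctx context.Context, value string) context.Context
        GetHello(ctx context.Context) *string
        GetWorld(ctx context.Context) *string
        // Marshal serializes the transmittable values found in ctx.
        Marshal(ctx context.Context) ([]byte, error)
        // Unmarshal rebuilds a context from a serialized payload.
        Unmarshal(content []byte) (context.Context, error)
    }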

Sending a Request

When the Client Service is ready to send a request, it needs a middleware to perform context marshaling. The code might look like this:

    [root@liqiang.io]# cat example2.go
    const contextHeader = "liqiang-io-context"

    type Transport struct {
        rt http.RoundTripper
        co ContextOperator
    }

    func (t *Transport) RoundTrip(r *http.Request) (*http.Response, error) {
        payload, err := t.co.Marshal(r.Context())
        if err == nil && len(payload) > 0 {
            // A RoundTripper must not modify the caller's request, so clone it first.
            r = r.Clone(r.Context())
            r.Header.Set(contextHeader, string(payload))
        }
        // Delegate to the wrapped RoundTripper rather than http.DefaultTransport.
        return t.rt.RoundTrip(r)
    }

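Here, the Context values are placed in an HTTP request header and sent out. The Marshal implementation can be very simple, just string concatenation. (Note that a NUL byte is not a valid character in an HTTP header value, so a real implementation would need a header-safe separator or an encoding such as base64.)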

    [root@liqiang.io]# cat example2.go
    func (o *ctxOp) Marshal(ctx context.Context) ([]byte, error) {
        var rtns []string
        if hello := o.GetHello(ctx); hello != nil {
            rtns = append(rtns, "hello="+*hello)
        }
        if world := o.GetWorld(ctx); world != nil {
            rtns = append(rtns, "world="+*world)
        }
        // Entries are joined with a NUL byte as the separator.
        return []byte(strings.Join(rtns, "\x00")), nil
    }

Receiving a Request

On the server side, after receiving the HTTP request, we perform the opposite operation by wrapping the HTTP Handler:

    [root@liqiang.io]# cat example2.go
    type wrapperHandler struct {
        h  http.Handler
        co ContextOperator
    }

    func (h *wrapperHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
        ctxVal := req.Header.Get(contextHeader)
        ctx, err := h.co.Unmarshal([]byte(ctxVal))
        if err == nil {
            req = req.WithContext(ctx)
        }
        h.h.ServeHTTP(resp, req)
    }

    // NewWrapperHandler wraps h; co must be non-nil because ServeHTTP calls it.
    func NewWrapperHandler(h http.Handler, co ContextOperator) http.Handler {
        return &wrapperHandler{h: h, co: co}
    }

Similarly, the Unmarshal operation is simply string decomposition:

    [root@liqiang.io]# cat example2.go
    func (o *ctxOp) Unmarshal(content []byte) (context.Context, error) {
        ctx := context.Background()
        for _, pair := range strings.Split(string(content), "\x00") {
            // Each entry has the form "key=value"; split it before matching the key.
            key, val, ok := strings.Cut(pair, "=")
            if !ok {
                continue
            }
            switch key {
            case "hello":
                ctx = o.WithHello(ctx, val)
            case "world":
                ctx = o.WithWorld(ctx, val)
            }
        }
        return ctx, nil
    }

Passing Cancel and Timeout

At first glance, the above example seems to have implemented context passing. However, in reality, it does not cover the full capabilities of Go’s context, especially its most commonly used features: WithTimeout and WithCancel.

Cancel

The WithCancel function might seem straightforward: it cancels a cross-process request. In practice it is more complex than expected, because the Read and Write methods of Go’s standard network connections do not accept a context.

Let’s look at the net.Conn interface:

    [root@liqiang.io]# cat net/net.go
    type Conn interface {
        Read(b []byte) (n int, err error)
        Write(b []byte) (n int, err error)
        Close() error
        // ... address and deadline methods (LocalAddr, SetReadDeadline, etc.) omitted
    }

As we can see, there’s no way to control read or write using context. So when passing a Cancel across processes, it’s not feasible with the default networking interface. Because of this limitation, developers have found some workarounds. Below are two common patterns:

Goroutine Pattern

Since goroutines are lightweight in Go, one common approach is:

    [root@liqiang.io]# cat example3.go
    import "context"

    func passCancel1(ctx context.Context) {
        // The channel must be created with make: a nil channel blocks forever.
        // It is buffered so the goroutine can send and exit even after
        // the caller has stopped receiving.
        networkDone := make(chan struct{}, 1)
        go func() {
            // network operation 1
            networkDone <- struct{}{}
        }()
        select {
        case <-ctx.Done():
            // canceled
        case <-networkDone:
            // completed normally
        }
    }

However, the goroutine keeps running until the network operation itself returns, so this approach risks goroutine leaks and requires careful handling.

Busy-Wait Pattern

Another approach is to set a very short timeout for network operations. If the operation completes before cancellation, it’s normal; otherwise, it’s considered canceled.

    [root@liqiang.io]# cat example4.go
    func networkProcess(ctx context.Context, conn net.Conn, buffer []byte) (int, error) {
        for {
            conn.SetReadDeadline(time.Now().Add(time.Microsecond * 20))
            readBytes, err := conn.Read(buffer)
            if readBytes > 0 {
                return readBytes, nil
            }
            // A deadline error only means "nothing yet"; anything else is fatal.
            if err != nil && !errors.Is(err, os.ErrDeadlineExceeded) {
                return 0, err
            }
            select {
            case <-ctx.Done():
                return 0, ctx.Err() // canceled: stop polling
            default:
                continue
            }
        }
    }

This method has an obvious drawback—it wastes a lot of CPU resources. The fundamental issue is that network operations don’t recognize Cancel Context. Therefore, in practice, passing cancelable contexts between processes is rare, while timeout usage is more common.

Timeout

Passing a Timeout Context is similar to passing context values, but with a key difference: when process A passes the timeout to process B, process B must be able to apply the timeout correctly.

Within a process, WithTimeout is just a wrapper around WithDeadline (the deadline is the current time plus the timeout), so a timeout context is essentially a deadline context. Let’s look at how WithDeadline is implemented:

    [root@liqiang.io]# cat context/context.go
    func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
        ... ...
        dur := time.Until(d)
        ... ...
        if c.err == nil {
            c.timer = time.AfterFunc(dur, func() {
                c.cancel(true, DeadlineExceeded)
            })
        }
        return c, func() { c.cancel(true, Canceled) }
    }

We can see that Deadline is implemented using a timer that automatically cancels the context when it expires.

At first glance, it seems simple—just pass the deadline value to the remote process, like this:

    [root@liqiang.io]# cat example2.go
    func (o *ctxOp) Marshal(ctx context.Context) ([]byte, error) {
        var rtns []string
        if hello := o.GetHello(ctx); hello != nil {
            rtns = append(rtns, "hello="+*hello)
        }
        if world := o.GetWorld(ctx); world != nil {
            rtns = append(rtns, "world="+*world)
        }
        if dl, ok := ctx.Deadline(); ok {
            // The deadline is sent as a UNIX timestamp in milliseconds.
            rtns = append(rtns, "deadline="+strconv.FormatInt(dl.UnixMilli(), 10))
        }
        return []byte(strings.Join(rtns, "\x00")), nil
    }

    func (o *ctxOp) Unmarshal(content []byte) (context.Context, context.CancelFunc, error) {
        ctx := context.Background()
        // Default to a no-op so callers can always call cancel safely.
        var cancel context.CancelFunc = func() {}
        for _, pair := range strings.Split(string(content), "\x00") {
            key, val, ok := strings.Cut(pair, "=")
            if !ok {
                continue
            }
            switch key {
            case "hello":
                ctx = o.WithHello(ctx, val)
            case "world":
                ctx = o.WithWorld(ctx, val)
            case "deadline":
                i, err := strconv.ParseInt(val, 10, 64)
                if err == nil {
                    // Must mirror Marshal, which wrote milliseconds.
                    ctx, cancel = context.WithDeadline(ctx, time.UnixMilli(i))
                }
            }
        }
        return ctx, cancel, nil
    }

However, this approach has a major issue in a distributed environment: it sends an absolute UNIX timestamp, so it is only correct if the clocks of both machines are closely synchronized. With clock skew, the timeout might trigger too early or too late.

A better approach is to pass relative time. For example, if process A sets an initial timeout of 5000ms and has already used 500ms, it should pass the remaining 4500ms to process B, which applies it against its own clock. To account for network latency, process A might wait slightly longer than the budget it passes on (e.g., 4550ms instead of 4500ms).
