Wednesday, December 18, 2013

Learning Go: Slices

Slices are one of the stranger pieces of Go. They're like lists or vectors in other languages, but have some peculiar behaviors; particularly when multiple slices share the same backing array. I suspect that a lot of bugs will come from slices that suddenly stop sharing.

To explain, let's start with a simple slice example: creating two slices backed by an explicit array (you can run these examples in the Go Playground):

package main

import "fmt"

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

When you run this program, you get the following output (the slice operator is inclusive of its first parameter, exclusive of its second):

[1 2 3 4 5]
[2 3 4]
[2 3]

As I said, these slices share a backing array. A change to s2 will be reflected in s1 and a:

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2[0] = 99
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}
[1 99 3 4 5]
[99 3 4]
[99 3]

If you're used to slices from, say, Python, this is a little strange: Python slices are separate objects. A Go slice is more like a Java sub-list, sharing the same backing array. But wait, there's more, you can add items to the end of a slice:

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

Since s2 shares backing store with s1 and a, when you append a value to the former, the latter are updated as well:

[1 2 3 101 5]
[2 3 101]
[2 3 101]

But now what happens if we append a bunch of values to s2?

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    s2 = append(s2, 102)
    s2 = append(s2, 103)
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}
[1 2 3 101 102]
[2 3 101]
[2 3 101 102 103]

Did you see that one coming? Here's one more piece of code to ponder:

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    s2 = append(s2, 102)
    s2 = append(s2, 103)

    a[3] = 17
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}
[1 2 3 17 102]
[2 3 17]
[2 3 101 102 103]

As you can see, s2 no longer uses a as its backing array. Not only does a not reflect the last element added to the slice, but changing an element of a does not propagate to s2.

This behavior is hinted in the slice internals documentation, which says that append() “grows the slice if a greater capacity is needed.” Since Go requires you to pay attention to return values, whatever code appended to the slice will always see the correct values. But if you have multiple slices that you think refer to the same backing array, the others won't.

I can understand why you would want slices to share a backing array: they represent a view on that array. And I can understand the desire for expanding the slice via append(): it's the behavior that other languages provide in a list. But this blend seems to be the worst of both worlds, in that you never know whether a changing a given slice will mutate other slices/arrays, or not. I recommend treading carefully.

No comments: