Monday, December 23, 2013

Learning Go: Imports

I'm going to finish this series of posts with an interesting but, in my opinion, unfinished feature: third-party imports. Unless you like to reinvent wheels, your projects will depend on libraries written by other people. To help you do this, Go's import declaration allows you to reference packages stored in open source repositories.

For example, if you want to use the Go project's example square root implementation, you could write the following:

package main

import "fmt"
import "code.google.com/p/go.example/newmath"

func main() {
    fmt.Println("The square root of 2 is ", newmath.Sqrt(2))
}

If you just try to build this code, you'll get an error message indicating that the compiler can't find the package. Before you can use it, you have to retrieve it:

go get code.google.com/p/go.example/newmath

This is, to say the least, inconvenient: if you depend on multiple libraries, you have to manually retrieve each of them before you can build your own project. Fortunately, go get resolves transitive dependencies, and if your project is stored in one of the “standard” repositories you can leverage this feature to retrieve your project and its dependencies in one step. However, at least for GitHub projects, this technique doesn't clone the repository. If you want to make changes — or get updates from your dependencies — you need to manually clone.

A bigger problem is that there's no mechanism for versioning: you always get the trunk revision. If all of your libraries are backwards compatible, that may not be a problem. My experience suggests that's a bad thing to rely upon. In practice, it seems that most developers retrieve the libraries that they want to use, then check those libraries into their own version control system: creating a “locked” revision that they know supports their code.

As I said, I think the feature is unfinished. Versioning is a necessary first step, and should not be too difficult to add. But it's not enough. If you rely on retrieving packages from an open-source repository, you also rely on the package owner; one fit of pique, and your dependency could disappear. Along with versioning, Google needs to add its own repository for versioned artifacts, much like Maven Central, CPAN, or RubyGems.org. This could work transparently to the user: remote imports of public projects would be proxied through Google's server, and it would keep a copy of anything that had a version number.

Thursday, December 19, 2013

Learning Go: Goroutines and Multiple Cores

Here's a program that I've been using to explore Go concurrency. If you want to play along at home, you'll need to compile and run it locally; the Playground won't suffice.

package main

import (
    "fmt"
    "runtime"
    "syscall"
)

type Response struct {
    Received int
    Calculated int
    Handler int
    Tid int
}

func Listener(me int, in chan int, out chan Response, done chan int) {
    for val := range in {
        out <- Response{val, runCalc(me), me, syscall.Gettid()}
    }
    done <- me
}

func runCalc(num int) int {
    zz := 1
    for ii := 0 ; ii < 100000000 ; ii++ {
        zz += ii % num
    }
    return zz
}

func main() {
    fmt.Println("Main running on thread ", syscall.Gettid(), " numCPU = ", runtime.NumCPU())

    chSend := make(chan int, 100)
    chRecv := make(chan Response)
    chDone := make(chan int)

    listenerCount := 8
    for ii := 1 ; ii <= listenerCount ; ii++ {
        go Listener(ii, chSend, chRecv, chDone)
    }

    messageCount := 100
    for ii := 0 ; ii < messageCount ; ii++ {
        chSend <- ii
    }
    close(chSend)

    for listenerCount > 0 {
        select {
            case data := <- chRecv :
                fmt.Println("Received ", data.Received, ",", data.Calculated, " from ", data.Handler, " on thread ", data.Tid)
                messageCount--
            case lnum := <- chDone :
                fmt.Println("Received DONE from ", lnum)
                listenerCount--
        }
    }

    fmt.Println("Main done, outstanding messages = ", messageCount)
}

The short description of this program is that it kicks off a bunch of goroutines, then sends them CPU-intensive work. My goal in writing it was to explore thread affinity and communication patterns as the amount of work increased. Imagine my surprise when I saw the following output from my 4 core CPU:

go run multi_listener.go 
Main running on thread  27638 , numCPU =  8
Received  0 , 1  from  1  on thread  27640
Received  1 , 1  from  1  on thread  27640
Received  2 , 50000001  from  2  on thread  27640
Received  3 , 100000000  from  3  on thread  27640
Received  4 , 150000001  from  4  on thread  27640
Received  5 , 200000001  from  5  on thread  27640
Received  6 , 249999997  from  6  on thread  27640
Received  7 , 299999996  from  7  on thread  27640
Received  8 , 350000001  from  8  on thread  27640
Received  9 , 1  from  1  on thread  27640
Received  10 , 1  from  1  on thread  27640
Received  11 , 50000001  from  2  on thread  27640
Received  12 , 100000000  from  3  on thread  27640
Received  13 , 150000001  from  4  on thread  27640
Received  14 , 200000001  from  5  on thread  27640

The thread ID is always the same! And top confirmed that this wasn't a lie: one core was consuming 100% of the CPU, while the others were idle. It took some Googling to discover the GOMAXPROCS environment variable:

experiments, 505> export GOMAXPROCS=4
experiments, 506> go run multi_listener.go 
Main running on thread  27674 , numCPU =  8
Received  2 , 350000001  from  8  on thread  27677
Received  0 , 299999996  from  7  on thread  27678
Received  1 , 1  from  1  on thread  27674
Received  3 , 350000001  from  8  on thread  27677
Received  4 , 50000001  from  2  on thread  27679
Received  5 , 299999996  from  7  on thread  27678
Received  6 , 200000001  from  5  on thread  27674
Received  7 , 249999997  from  6  on thread  27677
Received  8 , 100000000  from  3  on thread  27679
Received  9 , 1  from  1  on thread  27678
Received  10 , 200000001  from  5  on thread  27674
Received  11 , 150000001  from  4  on thread  27677

This variable is documented in the runtime package docs, and also in the (28 page) FAQ. It's not mentioned in the Go Tour or tutorial.

I'm a bit taken aback that it's even necessary. However, the comment that goes along with the associated runtime method gives a hint: “This call will go away when the scheduler improves.” As of Go 1.2 the behavior remains: one of the quirks of using a young framework.
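For reference, the runtime method in question is runtime.GOMAXPROCS, so the same setting can be made from inside the program instead of the environment. Here's a minimal sketch, assuming that you simply want one thread per reported CPU; the useAllCores helper is my own name, not part of the runtime package:

```go
package main

import (
	"fmt"
	"runtime"
)

// useAllCores is a hypothetical helper: it asks the scheduler to run
// goroutines on as many OS threads as there are logical CPUs, and
// returns the setting that is now in effect.
func useAllCores() int {
	runtime.GOMAXPROCS(runtime.NumCPU()) // returns the previous setting, ignored here
	return runtime.GOMAXPROCS(0)         // an argument below 1 only queries the value
}

func main() {
	fmt.Println("GOMAXPROCS is now", useAllCores())
}
```

Calling this at the top of main() should have the same effect as exporting the variable before running the program.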

Wednesday, December 18, 2013

Learning Go: Slices

Slices are one of the stranger pieces of Go. They're like lists or vectors in other languages, but have some peculiar behaviors, particularly when multiple slices share the same backing array. I suspect that a lot of bugs will come from slices that suddenly stop sharing.

To explain, let's start with a simple slice example: creating two slices backed by an explicit array (you can run these examples in the Go Playground):

package main

import "fmt"

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

When you run this program, you get the following output (the slice operator is inclusive of its first parameter, exclusive of its second):

[1 2 3 4 5]
[2 3 4]
[2 3]
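One detail that matters for everything that follows: a slice has a capacity as well as a length, and the capacity runs from the start of the slice to the end of the backing array. A small sketch that prints both for the slices above (describe is just a helper I made up for the example):

```go
package main

import "fmt"

// describe reports the length and capacity of a slice as text.
func describe(s []int) string {
	return fmt.Sprintf("len=%d cap=%d", len(s), cap(s))
}

func main() {
	a := []int{1, 2, 3, 4, 5}
	s1 := a[1:4]
	s2 := s1[0:2]

	fmt.Println(describe(a))  // len=5 cap=5
	fmt.Println(describe(s1)) // len=3 cap=4
	fmt.Println(describe(s2)) // len=2 cap=4
}
```

Note that s2 holds only two elements but has room for four, because it starts at the second element of a five-element array.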

As I said, these slices share a backing array. A change to s2 will be reflected in s1 and a:

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2[0] = 99
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

[1 99 3 4 5]
[99 3 4]
[99 3]

If you're used to slices from, say, Python, this is a little strange: Python slices are separate objects. A Go slice is more like a Java sub-list, sharing the same backing array. But wait, there's more: you can add items to the end of a slice.

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

Since s2 shares backing store with s1 and a, when you append a value to the former, the latter are updated as well:

[1 2 3 101 5]
[2 3 101]
[2 3 101]

But now what happens if we append a bunch of values to s2?

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    s2 = append(s2, 102)
    s2 = append(s2, 103)
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

[1 2 3 101 102]
[2 3 101]
[2 3 101 102 103]

Did you see that one coming? Here's one more piece of code to ponder:

func main() {
    a := []int{1, 2, 3, 4, 5}
    s1 := a[1:4]
    s2 := s1[0:2]
    
    s2 = append(s2, 101)
    s2 = append(s2, 102)
    s2 = append(s2, 103)

    a[3] = 17
    
    fmt.Println(a)
    fmt.Println(s1)
    fmt.Println(s2)
}

[1 2 3 17 102]
[2 3 17]
[2 3 101 102 103]

As you can see, s2 no longer uses a as its backing array. Not only does a not reflect the last element added to the slice, but changing an element of a does not propagate to s2.

This behavior is hinted at in the slice internals documentation, which says that append() “grows the slice if a greater capacity is needed.” Since Go requires you to pay attention to return values (the compiler rejects an append() call whose result is discarded), whatever code appended to the slice will always see the correct values. But if you have multiple slices that you think refer to the same backing array, the others won't.
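To see exactly when the sharing stops, you can compare length against capacity before each append. This is a sketch of the three appends from the example above; willReallocate is a name I invented, and it assumes only the documented rule that append() allocates a new array when the existing capacity is too small:

```go
package main

import "fmt"

// willReallocate reports whether appending n elements to s needs more
// room than the current backing array provides.
func willReallocate(s []int, n int) bool {
	return len(s)+n > cap(s)
}

func main() {
	a := []int{1, 2, 3, 4, 5}
	s2 := a[1:3] // same elements as s1[0:2] in the examples above

	for _, v := range []int{101, 102, 103} {
		fmt.Println(len(s2), cap(s2), willReallocate(s2, 1))
		s2 = append(s2, v)
	}
	// prints:
	// 2 4 false
	// 3 4 false
	// 4 4 true

	// After the third append, s2 no longer starts at a[1].
	fmt.Println(&a[1] == &s2[0]) // false
}
```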

I can understand why you would want slices to share a backing array: they represent a view on that array. And I can understand the desire for expanding the slice via append(): it's the behavior that other languages provide in a list. But this blend seems to be the worst of both worlds, in that you never know whether changing a given slice will mutate other slices/arrays, or not. I recommend treading carefully.
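Two ways of treading carefully, sketched under the assumption that you want a slice that can never write into someone else's array. The first copies the elements into a fresh array (detach is my own helper name); the second uses the three-index slice form added in Go 1.2, which limits the capacity so that the first append() has to reallocate:

```go
package main

import "fmt"

// detach returns a copy of s that has its own backing array.
func detach(s []int) []int {
	out := make([]int, len(s))
	copy(out, s)
	return out
}

func main() {
	a := []int{1, 2, 3, 4, 5}

	// Independent copy: changes to d are invisible to a.
	d := detach(a[1:3])
	d[0] = 99

	// Three-index slice: length 2, capacity 2, so append cannot reuse a.
	s := a[1:3:3]
	s = append(s, 101)

	fmt.Println(a) // [1 2 3 4 5]
	fmt.Println(d) // [99 3]
	fmt.Println(s) // [2 3 101]
}
```

Neither approach is free: both end up copying the elements, which is the price of knowing who owns the array.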

Tuesday, December 17, 2013

Learning Go: Syntax Quirks

Whenever you approach a new language you'll stumble over things that are almost, but not quite, the same as the language you're currently using. Coming from a C and Java background, I've had several such experiences with Go. This post is about two of the syntactical oddities: things that result in compiler errors and not bugs. Future posts will be about things that show up at runtime.

Short-form Declarations

There are two ways to declare and initialize a variable in Go:

var x int = 10
y := 10

The first form is readily understandable to Java or C programmers, even though the syntax is strange: it defines a variable of type int, and assigns the value 10 to it. The second form relies on type inference to do almost the same thing.

I say “almost” because the literal 10 is a numeric constant, which is permitted to have arbitrary precision. And that means that you have to know what type the compiler will pick (int for an integer literal like this one, float64 for a floating-point literal). Not really an issue, as short-form declarations are primarily used for function returns, and the compiler will quickly correct you if you assume incorrectly.
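If you're curious what the compiler picks, the %T formatting verb will tell you. A quick sketch (typeName is a throwaway helper of mine):

```go
package main

import "fmt"

// typeName returns the type of its argument as text.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	i := 10           // integer constant
	f := 10.0         // floating-point constant
	r := 'x'          // character constant
	var b int8 = 10   // the same constant, explicitly typed

	fmt.Println(typeName(i)) // int
	fmt.Println(typeName(f)) // float64
	fmt.Println(typeName(r)) // int32
	fmt.Println(typeName(b)) // int8
}
```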

A somewhat more annoying quirk is re-assignment, as shown below:

x := 10
x, y := 20, 30      // this compiles
x := 30             // this doesn't

You're allowed to re-assign a variable that was formerly assigned using a short-form declaration only if you also assign a newly-declared variable at the same time. I suppose that the rationale is to prevent accidental overwrites of a variable, while still allowing a single “error” variable for function calls, but I think it just adds to the mental effort needed to understand a program. In some cases you can use “:=”, in others you must use a plain “=”; I prefer consistency.
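Here is the “error” variable case in practice: a sketch using strconv.Atoi from the standard library, with addStrings being a function I wrote purely for illustration. The second “:=” compiles only because b is new; err is simply re-assigned.

```go
package main

import (
	"fmt"
	"strconv"
)

// addStrings parses two decimal strings and returns their sum.
func addStrings(s1, s2 string) (int, error) {
	a, err := strconv.Atoi(s1) // declares both a and err
	if err != nil {
		return 0, err
	}
	b, err := strconv.Atoi(s2) // declares b, re-assigns the existing err
	if err != nil {
		return 0, err
	}
	return a + b, nil
}

func main() {
	sum, err := addStrings("10", "20")
	fmt.Println(sum, err) // 30 <nil>
}
```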

Braces or Parentheses?

Like most C-family languages, Go uses braces to delimit blocks, and parentheses to delimit function arguments. What's confusing is that some things that you expect to be blocks or functions aren't. For example, the import declaration:

import (
    "fmt"
)

That certainly looks like it should take a block: it's a declaration, not a function call. But those are parentheses, not braces, and I have no idea why.
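For what it's worth, the same parenthesized form works with the other declaration keywords, so the parentheses seem to mark a group of declarations rather than a block of statements. A small sketch; all of the names are invented for the example:

```go
package main

import (
	"fmt"
)

const (
	width  = 3
	height = 4
)

var (
	label = "area"
	area  = width * height
)

type (
	Point struct{ X, Y int }
	Path  []Point
)

func main() {
	var p Path
	fmt.Println(label, area, len(p)) // area 12 0
}
```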

And going the other way, we have the initialization of a struct (this example copied from the Go Tour):

type Vertex struct {
    X int
    Y int
}

func main() {
    v := Vertex{1, 2}
    // ...
}

I'm always tripping over this one: I think it's a constructor function, but there aren't any constructors in Go. It's actually a composite literal, much like a Java array initializer.
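The same brace syntax covers field names, pointers, slices, and maps. A sketch of the variations, along with the kind of plain function that stands in for a constructor (newVertex is my own name, not something from the Tour):

```go
package main

import "fmt"

type Vertex struct {
	X int
	Y int
}

// newVertex is an ordinary function that returns a composite literal.
func newVertex(x, y int) Vertex {
	return Vertex{X: x, Y: y}
}

func main() {
	v1 := Vertex{1, 2}       // fields in declaration order
	v2 := Vertex{Y: 2}       // named fields; X gets its zero value
	p := &Vertex{X: 3, Y: 4} // pointer to a newly created struct
	primes := []int{2, 3, 5} // slice literal
	ages := map[string]int{"go": 4}

	fmt.Println(v1, v2, *p)       // {1 2} {0 2} {3 4}
	fmt.Println(primes, ages)     // [2 3 5] map[go:4]
	fmt.Println(newVertex(5, 6))  // {5 6}
}
```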

Monday, December 16, 2013

Learning Go

A few months ago I decided that Go was a language that I wanted to learn. Step one was to find a local Go user group: there's no substitute for discussing the quirks of a new language with other people. The group I joined has a mix of people that are using Go as part of their daily work, along with those like me who are just trying to pick up the language.

Step two was to download the distribution. It comes with a webserver that appears to have the entire golang.org website, or at least all of the documentation and wiki pages. That's useful, as most of my time for learning is (was) while riding the train. I'm not entirely certain that I like running a server in order to browse documentation — no, actually I'm certain that I don't — but it seems to be the way the world is moving.

Step three was to download the Go Tour (also available online). This is, without question, the best tutorial that I've ever used. It reduces the language to about 60 bite-size pieces, with an example program for each, and includes a service that lets you run those programs from within the page. It's not quite a REPL, but it's close (the online Go Playground is the program runner without the tutorial).

As a teaching tool, the Tour is nice in that you can easily page up and down between lessons. This is particularly useful when doing one of the exercises, where you're presented with an empty function and have to remember syntax. The only problem with the Tour is that it doesn't save the programs in local storage: when you close your browser window, your work is gone.

The next few posts will give my initial impressions of Go, including some of the places that I think bugs will lurk. I'm also thinking of a somewhat more substantial program to give me a real sense of how the language works.