Tuesday 29 May 2012

Straightforward, Powerful, Standard Library


Whilst testing a Go application to destruction I was manufacturing various scenarios and needed to discover exactly where each goroutine was in its processing. After a quick scan of the documentation and five minutes of keyboard bashing, I was able to add a couple of simple functions to a utility package, enabling applications to dump their current stack when signalled.



import (
    "log"
    "os"
    "os/signal"
    "runtime"
    "syscall"
)

func ListenForPrintStackSignal() {
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGUSR1)
    go func() {
        for range sigCh {
            log.Print("Request to print stack")
            PrintStack(true)
        }
    }()
}

func PrintStack(all bool) {
    // arbitrarily sized buffer; grow it if your stacks are truncated
    buffer := make([]byte, 10240)
    numB := runtime.Stack(buffer, all)
    log.Printf("%v current goroutines", runtime.NumGoroutine())
    log.Print(string(buffer[:numB]))
}


Go's standard library is by far the most powerful and straightforward set of built-ins I have used. Everything seems to be in place (or planned for) to quickly plug functionality together without having to jump through hoops or battle oddly named methods with bizarre signatures. The productivity kick that comes with using this standard library is one of the reasons I expect uptake of the Go language to grow rapidly in the coming months and years.


Thanks to @rogpeppe for the golfing tips.

Tuesday 22 May 2012

Simple Synchronous and Asynchronous Logging

Almost all services need to write to a log for auditing purposes. When these services require high throughput this write to disk can be a real bottleneck.

So, I wrote a simple log package that implements asynchronous logging to standard output. I only ever write service logs to standard out; redirecting to a file from the command line has proven flexible enough over the years that I have never felt the need for named log files.

Here is the ad/log package:

/*
Package log implements standard logging functions for logging to standard out
*/
package log

import (
    "log"
    "os"
    "strings"
    "time"
)

var myLog *log.Logger = log.New(os.Stdout, "", log.Ldate|log.Lmicroseconds)

var (
    AsyncBuffer  = 1000
    asyncLogging = false
    asyncChannel chan string
)

// Print writes the message to standard out prefixed with date and time.
// In asynchronous mode the submission time is recorded in brackets, since
// the logger prefix shows the (later) time the message was written.
func Print(s string) {
    if asyncLogging {
        s = "(" + strings.Fields(time.Now().Format(time.StampMicro))[2] + ") " + s
        asyncChannel <- s
    } else {
        myLog.Print(s)
    }
}

// EnableAsync enables or disables asynchronous logging.
func EnableAsync(enable bool) {

    if enable == asyncLogging {
        return
    }

    if enable {
        asyncChannel = make(chan string, AsyncBuffer)
        go func() {
            for message := range asyncChannel {
                myLog.Print(message)
            }
            asyncLogging = false
        }()
    } else {
        close(asyncChannel)
    }

    asyncLogging = enable

}

// Async returns the current asynchronous logging value.
func Async() bool {
    return asyncLogging
}

Here is a simple example:

package main

import (
    "fmt"
    "time"

    "ad/log"
    "ad/utl"
)

var i = 0

func main() {

    utl.UseAllCPU()

    incAndPrint()
    log.EnableAsync(true)

    for {
        incAndPrint()
        incAndPrint()
        fmt.Printf("Doing my own thing\n")
        incAndPrint()
        time.Sleep(time.Second)
    }

}

func incAndPrint() {
    i += 1
    log.Print(fmt.Sprintf("%v", i))
}

Output looks like this:

~/go/src/ad/misc$ go run logT.go 
2012/05/22 15:39:26.229273 Running log 1.2.1
2012/05/22 15:39:26.229771 Running utl 1.0.0
2012/05/22 15:39:26.229812 Setting UseCPU from 1 to 4
2012/05/22 15:39:26.229874 1
Doing my own thing
2012/05/22 15:39:26.230098 (15:39:26.229989) 2
2012/05/22 15:39:26.230151 (15:39:26.230035) 3
2012/05/22 15:39:26.230163 (15:39:26.230092) 4
Doing my own thing
2012/05/22 15:39:27.230701 (15:39:27.230555) 5
2012/05/22 15:39:27.230761 (15:39:27.230612) 6
2012/05/22 15:39:27.230775 (15:39:27.230696) 7
Doing my own thing
2012/05/22 15:39:28.231295 (15:39:28.231143) 8
2012/05/22 15:39:28.231356 (15:39:28.231203) 9
2012/05/22 15:39:28.231369 (15:39:28.231262) 10

For me the power of this painfully simple package comes when log.Print is used throughout and the asynchronous flag is taken from a command-line switch. This lets you code in a standard fashion but start individual service instances with a dedicated logging goroutine when entering a busy period. Alternatively, the service could flip the flag on the fly; say, when demand is high.

Saturday 12 May 2012

System monitoring


I hate having to configure monitoring. Especially in a complex interconnected system with 100s of servers and 1000s of processes. I have used Nagios in the past; this might be why I'm bald.
    
When monitoring a system all I really want to know is what should be up, what is up, what is it connected to, what are the inputs / outputs, and how busy is it.
                                                                                                                
A while ago I wrote a Python script to take a list of hosts, ssh to them, issue a ps and an lsof, parse the outputs and connect all the dots. This worked well, but it took a long time to run and produced a massive file giving a lot of data (network connections, stdin/out, executables, command-line arguments, CPU/memory usage) but no real useful information. Apart from the lengthy execution and the lack of information, it worked a treat. So, having failed to achieve anything useful, I naturally put the Python code into my back-burner directory and got on with something else.
        
A few months ago I read somewhere that the Processing environment was now available in the browser via processing.js. This was good news: I now had the tool to display the data as meaningful information. I just had to get the server scan down to a sensible time so that the information displayed was up to date. I had also noticed that using Python to monitor my system's performance was itself causing system performance issues!
           
Enter Go.                                                                                                       
    
I ported the python code to Go. Not a straight port; I utilised goroutines to issue the ssh commands and channels to gather / process the output and push the information to browser clients.
    
On a loop, ssh commands are issued and the output processed into meaningful results, which are then stored in a map. Process signatures are parsed to give sensible names and display arguments. This map is then compared with the previous loop's map to calculate each process's status. Is the process alive? Has the PID changed? Have the socket connections changed, or dropped? How much CPU and memory are being used? That sort of thing. The results are then pushed as JSON to any connected browsers via websockets.
    
In the browser, the processing.js environment loops at a given frame rate, drawing the nodes and connections detailed in the provided JSON. The screen gives lots of visual information: status, type of process, resource usage, connections made, etc. Processing and jQuery allowed me to code user interactions with the nodes; the user can position, filter, detail, and maintain them. As the websocket is bidirectional and a node details the process's standard output, I was able to let users hit a key to tail a process's log file in the browser. I intend to expand this further to interact with the underlying process (start, stop, etc.), but I haven't added any authentication or authorisation yet.

Once the user has positioned the nodes to their liking, they can save the screen by sending the current display context, again as JSON, back to the server over the websocket. This lets me save a known good state of the system and use it as the default process map when starting the monitoring process, which solves the problem of spotting missing processes.

Processing enabled me to present the system information in a palatable way and deserves a lot of credit.

Go, though, deserves unending credit. All the heavy lifting of issuing thousands of system calls, parsing the output and collating the results is handled with ease. The job that took my Python script four to five minutes to complete is done in under a second; I even restricted the Go process to a single core to make the comparison with Python fair. The Go garbage collector is clearly doing its job, as the memory footprint holds steady throughout the day. Stats from sar across all machines show no adverse effect from this scatter/gather approach.

Go, Processing (along with jQuery for the dialog inputs) and the good old shell utils gave me the ability to write my own bespoke monitoring software and ditch Nagios.

My one complaint, though, is that no matter how much I rub the source code on my head, my hair still won't grow back.