The code used for this test can be found here.

TL;DR: when looking at overall performance and usability, json-iter is the clear winner. It gives roughly a 4x improvement over encoding/json and a 1.2x improvement over the second most performant option. It is also extremely easy to use: you simply import it, define a global variable (which you can name json to make the switch even easier) and use it as you would use encoding/json.

Services that serve OpenRTB requests are typically under heavy load and have very strict latency constraints. In this scenario a fast JSON parser is highly desirable. In the Go ecosystem there are many JSON parsing libraries that claim better performance than encoding/json. These are the libraries I considered for this test:

  • encoding/json (the standard library baseline)
  • json-iter
  • easyjson
  • jsonparser
  • simdjson-go

Performance

Let's go directly to the performance results. Unmarshaling a standard OpenRTB request with a single imp object yields the following numbers:

goos: darwin
goarch: amd64
pkg: github.com/matipan/openrtb
BenchmarkRequest_UnmarshalJSON/json-iter-12        	  255331	      4730 ns/op	    1328 B/op	      50 allocs/op
BenchmarkRequest_UnmarshalJSON/easyjson-12         	  197110	      5695 ns/op	    1432 B/op	      28 allocs/op
BenchmarkRequest_UnmarshalJSON/jsonparser-12       	  128836	      9360 ns/op	    5528 B/op	      94 allocs/op
BenchmarkRequest_UnmarshalJSON/encoding/json-12    	   70920	     17180 ns/op	    1752 B/op	      49 allocs/op
BenchmarkRequest_UnmarshalJSON/simdjson-go-12      	   46688	     26658 ns/op	  115388 B/op	      48 allocs/op
PASS
ok  	github.com/matipan/openrtb	7.176s

This shows that json-iter is the fastest option of all: it is about 1.2x faster than the second-fastest, easyjson, 2x faster than jsonparser and roughly 4x faster than encoding/json. Surprisingly, simdjson-go falls way behind, coming in almost 6x slower than json-iter. I have not done an in-depth analysis of why simdjson-go is this slow; if I do, I will post the results here.

Performance-wise json-iter is the clear winner.
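
For reference, each library was measured with a standard Go sub-benchmark along these lines (a minimal sketch, not the actual harness; sampleRequest is a hypothetical helper that returns the OpenRTB payload with a single imp):

import "testing"

func BenchmarkRequest_UnmarshalJSON(b *testing.B) {
	raw := sampleRequest() // hypothetical helper returning the test payload
	b.Run("json-iter", func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			var req Request
			// json is the jsoniter.ConfigFastest global shown in the json-iter section below
			if err := json.Unmarshal(raw, &req); err != nil {
				b.Fatal(err)
			}
		}
	})
	// the remaining libraries get analogous sub-benchmarks
}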

Usability

In terms of usability each library requires something different compared to encoding/json:

  • easyjson: requires code generation. I’m ok with this to be honest, but it adds yet another command that all developers need to be aware of when modifying the structures.
  • jsonparser: requires a lot of manual parsing to get something working.
  • simdjson-go: requires even more manual parsing than jsonparser.
  • json-iter: requires importing a new library and, if you want to use the fastest option, adding a global variable that has to be used for every unmarshal/marshal call.
  • encoding/json: the familiar encode/decode and marshal/unmarshal API.

Below is an in-depth explanation of how to use each of these libraries. But in my opinion, usability-wise, json-iter wins again.

Using json-iter

To use json-iter you can simply import the library and define a global variable using the fastest option:

import jsoniter "github.com/json-iterator/go"

var json = jsoniter.ConfigFastest

Once you’ve done that you can use the familiar Marshal/Unmarshal and Encode/Decode API.
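
For example, a minimal program using the global variable above (with simplified Request and Imp types for illustration; the real OpenRTB structures are much larger) looks like this:

package main

import (
	"fmt"

	jsoniter "github.com/json-iterator/go"
)

// ConfigFastest trades a bit of strictness (e.g. floats are marshaled with
// 6-digit precision) for speed, and is API-compatible with encoding/json.
var json = jsoniter.ConfigFastest

// Simplified types for illustration only.
type Imp struct {
	ID string `json:"id"`
}

type Request struct {
	ID   string `json:"id"`
	Imps []Imp  `json:"imp"`
}

func main() {
	var req Request
	if err := json.Unmarshal([]byte(`{"id":"1","imp":[{"id":"imp-1"}]}`), &req); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", req)
}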

Using easyjson

To use easyjson you need to first install the CLI they provide:

go get -u github.com/mailru/easyjson/...

With this CLI you can generate code that converts the structure to and from JSON. First, add the following directive to your structure:

//easyjson:json
type Request struct {
	...
}

With that in place, you can run the following command to generate functions that satisfy encoding/json.Marshaler and encoding/json.Unmarshaler:

easyjson -byte -pkg
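
Once the code is generated, the structure satisfies both interfaces, so the fast path is used even through the familiar API (a small sketch, assuming the generated file lives in the same package):

// UnmarshalJSON is generated by easyjson and satisfies encoding/json.Unmarshaler.
var req Request
if err := req.UnmarshalJSON([]byte(`{"id":"1"}`)); err != nil {
	// handle the error
}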

Using jsonparser

There are many ways to parse JSON using the jsonparser library. I went with an approach that has a lot of code, but most of it is boilerplate, which means that adding new fields is relatively straightforward and does not require many modifications. We start by defining all the fields that the OpenRTB request will have and mapping their IDs to the paths where they can be found:

type fieldIdx int

const (
	fieldDevice fieldIdx = iota
	fieldImp
	fieldApp
	fieldId
	fieldAt
	fieldBcat
	fieldBadv
	fieldBapp
	fieldRegs
)

type rtbFieldDef struct {
	idx  fieldIdx
	path []string
}

var (
	reqFields = []rtbFieldDef{
		{fieldDevice, []string{"device"}},
		{fieldImp, []string{"imp"}},
		{fieldApp, []string{"app"}},
		{fieldId, []string{"id"}},
		{fieldAt, []string{"at"}},
		{fieldBcat, []string{"bcat"}},
		{fieldBadv, []string{"badv"}},
		{fieldBapp, []string{"bapp"}},
		{fieldRegs, []string{"regs"}},
	}

	reqPaths = rtbBuildPaths(reqFields)
)

func rtbBuildPaths(fields []rtbFieldDef) [][]string {
	ret := make([][]string, 0, len(fields))
	for _, f := range fields {
		ret = append(ret, f.path)
	}
	return ret
}

Once we’ve defined the fields for this top-level object we can write the parsing function. This function iterates over the JSON one key at a time. For each key it finds, it tries to match it against one of the paths we defined in the reqPaths variable. If it finds a match, it sets the value on the corresponding field of the structure:

func (r *Request) UnmarshalJSONReq(b []byte) error {
	jsonparser.EachKey(b, func(idx int, value []byte, vt jsonparser.ValueType, err error) {
		r.setField(idx, value, vt, err)
	}, reqPaths...)

	return nil
}

func (data *Request) setField(idx int, value []byte, _ jsonparser.ValueType, _ error) {
	switch fieldIdx(idx) {
	case fieldDevice:
		data.Device = &Device{}
		data.Device.UnmarshalJSONReq(value)
	case fieldImp:
		data.Imps = []*Imp{}
		jsonparser.ArrayEach(value, func(arrdata []byte, dataType jsonparser.ValueType, offset int, err error) {
			// arrdata holds the current array element, not the whole array
			imp := &Imp{}
			if err := imp.UnmarshalJSONReq(arrdata); err != nil {
				return
			}
			data.Imps = append(data.Imps, imp)
		})
	case fieldApp:
		data.App = &App{}
		data.App.UnmarshalJSONReq(value)
	case fieldId:
		data.ID = string(value)
	case fieldAt:
		data.At, _ = strconv.ParseInt(string(value), 10, 64)
	case fieldBcat:
		data.BCat = []string{}
		jsonparser.ArrayEach(value, func(arrdata []byte, dataType jsonparser.ValueType, offset int, err error) {
			data.BCat = append(data.BCat, string(arrdata))
		})
	case fieldBadv:
		data.BAdv = []string{}
		jsonparser.ArrayEach(value, func(arrdata []byte, dataType jsonparser.ValueType, offset int, err error) {
			data.BAdv = append(data.BAdv, string(arrdata))
		})
	case fieldBapp:
		data.BApp = []string{}
		jsonparser.ArrayEach(value, func(arrdata []byte, dataType jsonparser.ValueType, offset int, err error) {
			data.BApp = append(data.BApp, string(arrdata))
		})
	case fieldRegs:
		data.Regs = &Regs{}
		data.Regs.UnmarshalJSONReq(value)
	}
}
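
With the field table and setField in place, parsing a request is a single call (a quick sketch with an illustrative payload):

raw := []byte(`{"id":"req-1","at":2,"badv":["example.com"]}`)
req := &Request{}
if err := req.UnmarshalJSONReq(raw); err != nil {
	// handle the error
}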

If you look at the fieldApp case, however, you can see that we call an UnmarshalJSONReq function on the App object. That structure does essentially the same thing as the top-level one: it first defines the list of fields we care about with their corresponding paths, and then implements the function that iterates over each key, setting the corresponding values:

const (
	fieldAppName fieldIdx = iota
	fieldAppPubId
	fieldAppBundle
	fieldAppLanguage
	fieldAppId
	fieldExtDevUserId
)

var (
	appFields = []rtbFieldDef{
		{fieldAppName, []string{"name"}},
		{fieldAppPubId, []string{"publisher", "id"}},
		{fieldAppBundle, []string{"bundle"}},
		{fieldAppLanguage, []string{"content", "language"}},
		{fieldAppId, []string{"id"}},
		{fieldExtDevUserId, []string{"ext", "devuserid"}},
	}

	appPaths = rtbBuildPaths(appFields)
)

func (a *App) setField(idx int, value []byte, _ jsonparser.ValueType, _ error) {
	switch fieldIdx(idx) {
	case fieldAppName:
		a.Name = string(value)
	case fieldAppPubId:
		a.Publisher.ID = string(value)
	case fieldAppBundle:
		a.Bundle = string(value)
	case fieldAppLanguage:
		a.Content.Language = string(value)
	case fieldAppId:
		a.ID = string(value)
	case fieldExtDevUserId:
		a.Ext.Devuserid = string(value)
	}
}

func (a *App) UnmarshalJSONReq(b []byte) error {
	a.Publisher = &Publisher{}
	a.Content = &AppContent{}
	a.Ext = &AppExt{}
	jsonparser.EachKey(b, func(idx int, value []byte, vt jsonparser.ValueType, err error) {
		a.setField(idx, value, vt, err)
	}, appPaths...)

	return nil
}

If you want to add a new object to the structure you need to implement all of this boilerplate. However, if you just want to add a new field to an existing object, the change is relatively straightforward. This is why, in terms of usability, jsonparser is worse than json-iter but still better than simdjson-go.

Using simdjson-go

There is a lot of code involved in parsing JSON with simdjson-go. Here I went with an approach that requires adding more parsing code every time you add a new field. We could implement this with reflection, like encoding/json does, but that would hurt performance even more.

Implementing a parser with simdjson-go requires iterating over the tape the library generates and mapping each field according to its type. We start by parsing the top-level object and identifying it as a simdjson.TypeObject:

func (r *Request) UnmarshalJSONSimd(b []byte) error {
	parsed, err := simdjson.Parse(b, nil)
	if err != nil {
		return err
	}

	var (
		iter = parsed.Iter()
		obj  = &simdjson.Object{}
		tmp  = &simdjson.Iter{}
	)

	for {
		typ := iter.Advance()

		switch typ {
		case simdjson.TypeRoot:
			if typ, tmp, err = iter.Root(tmp); err != nil {
				return err
			}

			switch typ {
			case simdjson.TypeObject:
				if obj, err = tmp.Object(obj); err != nil {
					return err
				}
				return r.parse(tmp, obj)
			}
		default:
			return nil
		}
	}
}

Within the simdjson.TypeObject case we can start parsing the OpenRTB request. The request has many internal objects, and each of them can have a different type, so each type needs to be parsed separately. To keep this example brief we will only parse two types, Object and Array:

func (r *Request) parse(tmp *simdjson.Iter, obj *simdjson.Object) error {
	arr := &simdjson.Array{}
	for {
		name, t, err := obj.NextElementBytes(tmp)
		if err != nil {
			return err
		}

		if t == simdjson.TypeNone {
			break
		}

		switch t {
		case simdjson.TypeObject:
			if err := r.parseObject(name, tmp); err != nil {
				return err
			}
		case simdjson.TypeArray:
			if _, err := tmp.Array(arr); err != nil {
				return err
			}
			if err := r.parseArray(name, tmp, arr); err != nil {
				return err
			}
		}
	}

	return nil
}

Let's zoom in on parsing an Object. Within an OpenRTB request we have many different objects, such as device and app. Each object requires its own parsing function that maps each available key to the corresponding field of the structure. For example, parsing the device object looks something like this:

func (d *Device) parse(iter *simdjson.Iter, obj *simdjson.Object) error {
	for {
		name, t, err := obj.NextElementBytes(iter)
		if err != nil {
			return err
		}

		if t == simdjson.TypeNone {
			return nil
		}

		switch t {
		case simdjson.TypeInt:
			n, err := iter.Int()
			if err != nil {
				return err
			}
			switch {
			case bytes.Equal(name, hKey):
				d.H = n
			case bytes.Equal(name, wKey):
				d.W = n
			case bytes.Equal(name, dtKey):
				d.DeviceType = n
			case bytes.Equal(name, ctKey):
				d.ConnectionType = n
			}
		case simdjson.TypeString:
			b, err := iter.StringBytes()
			if err != nil {
				return err
			}
			switch {
			case bytes.Equal(name, ipKey):
				d.IP = string(b)
			case bytes.Equal(name, uaKey):
				d.UA = string(b)
			case bytes.Equal(name, osKey):
				d.OS = string(b)
			case bytes.Equal(name, osvKey):
				d.OSV = string(b)
			case bytes.Equal(name, ifaKey):
				d.IFA = string(b)
			case bytes.Equal(name, hwvKey):
				d.HWV = string(b)
			case bytes.Equal(name, modelKey):
				d.Model = string(b)
			case bytes.Equal(name, dntKey):
				d.DNT = string(b)
			case bytes.Equal(name, langKey):
				d.Language = string(b)
			}
		}
	}
}
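
The hKey, wKey and similar values referenced above are assumed to be package-level byte slices matching the OpenRTB field names, defined once so the comparisons do not allocate:

// Assumed key definitions for the switch above, named after OpenRTB 2.x device attributes.
var (
	hKey     = []byte("h")
	wKey     = []byte("w")
	dtKey    = []byte("devicetype")
	ctKey    = []byte("connectiontype")
	ipKey    = []byte("ip")
	uaKey    = []byte("ua")
	osKey    = []byte("os")
	osvKey   = []byte("osv")
	ifaKey   = []byte("ifa")
	hwvKey   = []byte("hwv")
	modelKey = []byte("model")
	dntKey   = []byte("dnt")
	langKey  = []byte("language")
)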

This example shows how tedious it would be to add a new field or a new top-level object, which is why, from a usability point of view, simdjson-go is way behind.

Conclusion

When looking at overall performance and usability, json-iter is the clear winner. It gives roughly a 4x improvement over encoding/json and a 1.2x improvement over the second most performant option, easyjson. It is also extremely easy to use: you simply import it, define a global variable (which you can name json to make the switch even easier) and use it as you would use encoding/json. On top of that, the community around json-iter seems to be very active, and the project has implementations for other languages as well.