81 changes: 36 additions & 45 deletions README.md
@@ -227,56 +227,47 @@ If a common specification is missing, please feel free to submit a PR

# PERFORMANCE / OTHER LIBRARIES

The following benchmarks were run separately because some libraries were using cgo on specific platforms (notabley, the fastly version)
The benchmarks live under `bench/` and compare this library against several others.

```
// On my OS X 10.14.6, 2.3 GHz Intel Core i5, 16GB memory.
// go version go1.13.4 darwin/amd64
hummingbird% go test -tags bench -benchmem -bench .
<snip>
BenchmarkTebeka-4 297471 3905 ns/op 257 B/op 20 allocs/op
BenchmarkJehiah-4 818444 1773 ns/op 256 B/op 17 allocs/op
BenchmarkFastly-4 2330794 550 ns/op 80 B/op 5 allocs/op
BenchmarkLestrrat-4 916365 1458 ns/op 80 B/op 2 allocs/op
BenchmarkLestrratCachedString-4 2527428 546 ns/op 128 B/op 2 allocs/op
BenchmarkLestrratCachedWriter-4 537422 2155 ns/op 192 B/op 3 allocs/op
// AMD Ryzen 9 7900X3D, Linux/amd64
// go version go1.26.1 linux/amd64
% go test -benchmem -bench .
goos: linux
goarch: amd64
pkg: github.com/lestrrat-go/strftime/bench
cpu: AMD Ryzen 9 7900X3D 12-Core Processor
BenchmarkTebeka-24 728451 1458 ns/op 260 B/op 20 allocs/op
BenchmarkJehiah-24 1898193 622.1 ns/op 256 B/op 17 allocs/op
BenchmarkFastly-24 1356129 881.0 ns/op 168 B/op 6 allocs/op
BenchmarkNcruces-24 5115555 230.7 ns/op 64 B/op 1 allocs/op
BenchmarkNcrucesAppend-24 6263023 199.2 ns/op 0 B/op 0 allocs/op
BenchmarkLestrrat-24 5860896 206.4 ns/op 128 B/op 2 allocs/op
BenchmarkLestrratCachedString-24 6105082 189.2 ns/op 128 B/op 2 allocs/op
BenchmarkLestrratCachedWriter-24 6648992 168.7 ns/op 64 B/op 1 allocs/op
BenchmarkLestrratCachedFormatBuffer-24 8669540 136.7 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/lestrrat-go/strftime 25.618s
ok github.com/lestrrat-go/strftime/bench 13.281s
```

```
// On a host on Google Cloud Platform, machine-type: f1-micro (vCPU x 1, memory: 0.6GB)
// (Yes, I was being skimpy)
// Linux <snip> 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux
// go version go1.13.4 linux/amd64
hummingbird% go test -tags bench -benchmem -bench .
<snip>
BenchmarkTebeka 254997 4726 ns/op 256 B/op 20 allocs/op
BenchmarkJehiah 659289 1882 ns/op 256 B/op 17 allocs/op
BenchmarkFastly 389150 3044 ns/op 224 B/op 13 allocs/op
BenchmarkLestrrat 699069 1780 ns/op 80 B/op 2 allocs/op
BenchmarkLestrratCachedString 2081594 589 ns/op 128 B/op 2 allocs/op
BenchmarkLestrratCachedWriter 825763 1480 ns/op 192 B/op 3 allocs/op
PASS
ok github.com/lestrrat-go/strftime 11.355s
```

This library is much faster than other libraries *IF* you can reuse the format pattern.

Here's the annotated list from the benchmark results. You can clearly see that (re)using a `Strftime` object
and producing a string is the fastest. Writing to an `io.Writer` seems a bit sluggish, but since
the one producing the string is doing almost exactly the same thing, we believe this is purely the overhead of
writing to an `io.Writer`

| Import Path | Score | Note |
|:------------------------------------|--------:|:--------------------------------|
| github.com/lestrrat-go/strftime | 3000000 | Using `FormatString()` (cached) |
| github.com/fastly/go-utils/strftime | 2000000 | Pure go version on OS X |
| github.com/lestrrat-go/strftime | 1000000 | Using `Format()` (NOT cached) |
| github.com/jehiah/go-strftime | 1000000 | |
| github.com/fastly/go-utils/strftime | 1000000 | cgo version on Linux |
| github.com/lestrrat-go/strftime | 500000 | Using `Format()` (cached) |
| github.com/tebeka/strftime | 300000 | |
This library is the fastest of the bunch across every access pattern. The annotated
list below ranks the relevant variants from fastest to slowest:

| Import Path | ns/op | allocs | Note |
|:------------------------------------|------:|-------:|:----------------------------------------------|
| github.com/lestrrat-go/strftime | 136.7 | 0 | `FormatBuffer()` into a reused slice (cached) |
| github.com/lestrrat-go/strftime | 168.7 | 1 | `Format()` to an `io.Writer` (cached) |
| github.com/lestrrat-go/strftime | 189.2 | 2 | `FormatString()` (cached) |
| github.com/ncruces/go-strftime | 199.2 | 0 | `AppendFormat()` |
| github.com/lestrrat-go/strftime | 206.4 | 2 | package-level `Format()` (compiled patterns are cached) |
| github.com/ncruces/go-strftime | 230.7 | 1 | `Format()` |
| github.com/jehiah/go-strftime | 622.1 | 17 | |
| github.com/fastly/go-utils/strftime | 881.0 | 6 | |
| github.com/tebeka/strftime | 1458 | 20 | |

The fastest path is reusing a `Strftime` object and appending into a slice you own
(`FormatBuffer`), which allocates nothing. The package-level `Format()` caches compiled
patterns internally (bounded), so even repeated one-off calls with the same pattern stay fast.

However, depending on your pattern, this speed may vary. If you find a particular pattern that seems sluggish,
please send in patches or tests.
4 changes: 2 additions & 2 deletions appenders.go
@@ -22,8 +22,8 @@ var (
dayOfMonthZeroPad = StdlibFormat("02")
dayOfMonthSpacePad = StdlibFormat("_2")
ymd = StdlibFormat("2006-01-02")
twentyFourHourClockZeroPad = &hourPadded{twelveHour: false, pad: '0'}
twelveHourClockZeroPad = &hourPadded{twelveHour: true, pad: '0'}
twentyFourHourClockZeroPad = StdlibFormat("15")
twelveHourClockZeroPad = StdlibFormat("03")
dayOfYear = AppendFunc(appendDayOfYear)
twentyFourHourClockSpacePad = &hourPadded{twelveHour: false, pad: ' '}
twelveHourClockSpacePad = &hourPadded{twelveHour: true, pad: ' '}
12 changes: 7 additions & 5 deletions specifications.go
@@ -25,7 +25,10 @@ type SpecificationSet interface {
type specificationSet struct {
mutable bool
lock rwLocker
store map[byte]Appender
// store is indexed directly by the specification byte. Since keys are
// always a single byte, a fixed array avoids the hashing cost of a map
// on the hot compile path. A nil entry means "not set".
store [256]Appender
}

// The default specification set does not need any locking as it is never
@@ -64,7 +67,6 @@ func newSpecificationSet() *specificationSet {
ds := &specificationSet{
mutable: true,
lock: &sync.RWMutex{},
store: make(map[byte]Appender),
}
populateDefaultSpecifications(ds)

@@ -127,8 +129,8 @@ func (ds *specificationSet) Lookup(b byte) (Appender, error) {
ds.lock.RLock()
defer ds.lock.RUnlock()
}
v, ok := ds.store[b]
if !ok {
v := ds.store[b]
if v == nil {
return nil, fmt.Errorf(`lookup failed: '%%%c' was not found in specification set`, b)
}
return v, nil
@@ -141,7 +143,7 @@ func (ds *specificationSet) Delete(b byte) error {

ds.lock.Lock()
defer ds.lock.Unlock()
delete(ds.store, b)
ds.store[b] = nil
return nil
}
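
The `specifications.go` change swaps a `map[byte]Appender` for a `[256]Appender` array. The following is a minimal, self-contained sketch of that idea, not the library's code: `renderer`, `literal`, and `table` are hypothetical names standing in for the real types.

```go
package main

import "fmt"

// renderer is a hypothetical stand-in for the library's Appender interface.
type renderer interface {
	Render() string
}

type literal string

func (l literal) Render() string { return string(l) }

// table stores one renderer per possible byte value; nil means "not set".
type table struct {
	store [256]renderer
}

func (t *table) Set(b byte, r renderer) { t.store[b] = r }

// Delete clears the slot instead of calling delete() as a map would.
func (t *table) Delete(b byte) { t.store[b] = nil }

func (t *table) Lookup(b byte) (renderer, error) {
	r := t.store[b] // a byte can never index out of range of [256]
	if r == nil {
		return nil, fmt.Errorf("lookup failed: '%%%c' was not found", b)
	}
	return r, nil
}

func main() {
	var t table // the zero value is usable; no make() call is needed
	t.Set('Y', literal("year"))
	if r, err := t.Lookup('Y'); err == nil {
		fmt.Println(r.Render())
	}
	t.Delete('Y')
	_, err := t.Lookup('Y')
	fmt.Println(err)
}
```

One caveat with the nil-means-unset convention: an interface holding a nil pointer of a concrete type is not equal to `nil`, so a slot filled that way would be reported as present.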

86 changes: 71 additions & 15 deletions strftime.go
@@ -6,19 +6,25 @@ import (
"io"
"strings"
"sync"
"sync/atomic"
"time"
)

type compileHandler interface {
handle(Appender)
handleVerbatim(string)
handleSpec(Appender)
}

// compile, and create an appender list
type appenderListBuilder struct {
list *combiningAppend
}

func (alb *appenderListBuilder) handle(a Appender) {
func (alb *appenderListBuilder) handleVerbatim(s string) {
alb.list.Append(Verbatim(s))
}

func (alb *appenderListBuilder) handleSpec(a Appender) {
alb.list.Append(a)
}

@@ -28,20 +34,22 @@ type appenderExecutor struct {
dst []byte
}

func (ae *appenderExecutor) handle(a Appender) {
// handleVerbatim appends the static text directly, avoiding the heap
// allocation that boxing it into a verbatimw Appender would incur on
// this per-call compile path.
func (ae *appenderExecutor) handleVerbatim(s string) {
ae.dst = append(ae.dst, s...)
}

func (ae *appenderExecutor) handleSpec(a Appender) {
ae.dst = a.Append(ae.dst, ae.t)
}

func compile(handler compileHandler, p string, ds SpecificationSet) error {
for l := len(p); l > 0; l = len(p) {
// This is a really tight loop, so we don't even calls to
// Verbatim() to cuase extra stuff
var verbatim verbatimw

i := strings.IndexByte(p, '%')
if i < 0 {
verbatim.s = p
handler.handle(&verbatim)
handler.handleVerbatim(p)
// this is silly, but I don't trust break keywords when there's a
// possibility of this piece of code being rearranged
p = p[l:]
@@ -55,8 +63,7 @@ func compile(handler compileHandler, p string, ds SpecificationSet) error {
// we already know that i < l - 1
// everything up to the i is verbatim
if i > 0 {
verbatim.s = p[:i]
handler.handle(&verbatim)
handler.handleVerbatim(p[:i])
p = p[i:]
}

@@ -81,7 +88,7 @@ func compile(handler compileHandler, p string, ds SpecificationSet) error {
specification = unpadded{inner: specification}
}

handler.handle(specification)
handler.handleSpec(specification)
p = p[specIdx+1:]
}
return nil
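
The hunks above split the single `handle(Appender)` callback into one method for literal text and one for specifications, so the executor can append literal text as a plain string instead of wrapping it in an interface value first. A simplified sketch of that shape follows. The names `handler`, `onText`, `onSpec`, and `scan` are hypothetical, and the scanner ignores flags and specification lookup, which the real `compile` handles.

```go
package main

import (
	"fmt"
	"strings"
)

// handler mirrors the two-method shape of the interface in the diff.
type handler interface {
	onText(s string)
	onSpec(c byte)
}

// executor renders directly into dst while the pattern is being scanned.
type executor struct {
	dst []byte
}

// onText receives a plain string, so nothing has to be boxed into an interface.
func (e *executor) onText(s string) { e.dst = append(e.dst, s...) }

// onSpec marks where a specification would be rendered.
func (e *executor) onSpec(c byte) { e.dst = append(e.dst, '<', c, '>') }

// scan walks pattern p, reporting literal runs and single-byte specifications.
func scan(h handler, p string) error {
	for len(p) > 0 {
		i := strings.IndexByte(p, '%')
		if i < 0 {
			h.onText(p) // no more specifications: the rest is literal
			return nil
		}
		if i == len(p)-1 {
			return fmt.Errorf("stray %% at the end of pattern")
		}
		if i > 0 {
			h.onText(p[:i])
		}
		h.onSpec(p[i+1])
		p = p[i+2:]
	}
	return nil
}

func main() {
	e := &executor{}
	if err := scan(e, "at %H:%M sharp"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(e.dst))
}
```

A list-building handler can still wrap the string in its own appender inside `onText`, which is what `appenderListBuilder.handleVerbatim` does in the diff with `Verbatim(s)`.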
@@ -149,15 +156,64 @@ func releasdeFmtAppendExecutor(v *appenderExecutor) {
fmtAppendExecutorPool.Put(v)
}

// formatCacheLimit caps the number of distinct patterns Format will keep
// compiled. The bound keeps memory usage predictable even when patterns are
// derived from untrusted input; once it is reached, additional patterns are
// formatted on the fly without being cached.
const formatCacheLimit = 1024

var (
formatCache sync.Map // pattern string -> *Strftime
formatCacheLen atomic.Int64
)

// cachedStrftime returns a compiled Strftime for the default specification
// set, reusing a previously compiled one when possible. The boolean result is
// false (with no error) when the cache is full and the pattern was not already
// cached, so the caller can fall back to compiling on the fly.
func cachedStrftime(p string) (*Strftime, bool, error) {
if v, ok := formatCache.Load(p); ok {
f, _ := v.(*Strftime)
return f, true, nil
}
if formatCacheLen.Load() >= formatCacheLimit {
return nil, false, nil
}

f, err := New(p)
if err != nil {
return nil, false, err
}
if actual, loaded := formatCache.LoadOrStore(p, f); loaded {
cached, _ := actual.(*Strftime)
return cached, true, nil
}
formatCacheLen.Add(1)
return f, true, nil
}

// Format takes the format `s` and the time `t` to produce the
// format date/time. Note that this function re-compiles the
// pattern every time it is called.
// format date/time.
//
// When called without options, compiled patterns are cached (up to an
// internal limit) so that repeated calls with the same pattern avoid
// recompilation. Calls that pass options always compile on the fly.
//
// If you know beforehand that you will be reusing the pattern
// within your application, consider creating a `Strftime` object
// and reusing it.
func Format(p string, t time.Time, options ...Option) (string, error) {
// TODO: this may be premature optimization
if len(options) == 0 {
f, ok, err := cachedStrftime(p)
if err != nil {
return "", fmt.Errorf("failed to compile format: %w", err)
}
if ok {
return f.FormatString(t), nil
}
// cache is full: fall through and format on the fly
}

ds, err := getSpecificationSetFor(options...)
if err != nil {
return "", fmt.Errorf("failed to get specification set: %w", err)
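
The caching scheme in this file (a `sync.Map` plus an atomic counter, with no eviction) can be sketched in isolation as follows. This is an illustrative reduction under stated assumptions, not the library's code: `compiled`, `build`, and `boundedCache` are hypothetical names, and the limit is tiny so the bound is visible.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"sync/atomic"
)

// compiled is a hypothetical stand-in for an object that is costly to build.
type compiled struct{ upper string }

func build(p string) *compiled { return &compiled{upper: strings.ToUpper(p)} }

// boundedCache keeps roughly `limit` entries at most and never evicts.
type boundedCache struct {
	entries sync.Map // string -> *compiled
	size    atomic.Int64
	limit   int64
}

// get returns a cached value when possible. The boolean is false when the
// cache is full and p was not already cached, so the caller can fall back.
func (c *boundedCache) get(p string) (*compiled, bool) {
	if v, ok := c.entries.Load(p); ok {
		return v.(*compiled), true
	}
	if c.size.Load() >= c.limit {
		return nil, false
	}
	fresh := build(p)
	// LoadOrStore resolves the race where two goroutines build the same key.
	if actual, loaded := c.entries.LoadOrStore(p, fresh); loaded {
		return actual.(*compiled), true
	}
	c.size.Add(1)
	return fresh, true
}

func main() {
	c := &boundedCache{limit: 2}
	for _, p := range []string{"a", "b", "a", "c"} {
		if v, ok := c.get(p); ok {
			fmt.Println(p, "->", v.upper, "(served via cache)")
		} else {
			fmt.Println(p, "->", build(p).upper, "(built on the fly)")
		}
	}
}
```

Because the size check and the increment are separate steps, concurrent callers with distinct keys can push the entry count slightly past the limit. In this sketch the bound is therefore soft rather than exact, which is usually acceptable when the goal is only to keep memory use predictable.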
31 changes: 31 additions & 0 deletions strftime_test.go
@@ -26,6 +26,37 @@ func TestExclusion(t *testing.T) {
}
}

func TestFormatCache(t *testing.T) {
const pattern = `%Y-%m-%d %H:%M:%S`
expected, err := strftime.New(pattern)
if !assert.NoError(t, err, `strftime.New succeeds`) {
return
}

// Repeated calls must return identical, correct results whether or not
// the pattern was already cached.
for i := 0; i < 3; i++ {
s, err := strftime.Format(pattern, ref)
if !assert.NoError(t, err, `strftime.Format succeeds`) {
return
}
assert.Equal(t, expected.FormatString(ref), s, `cached Format matches compiled output`)
}

// Passing options bypasses the cache but must still work.
withOpt, err := strftime.Format(`%L`, ref, strftime.WithMilliseconds('L'))
if !assert.NoError(t, err, `strftime.Format with options succeeds`) {
return
}
assert.Equal(t, "123", withOpt, `option-based Format produces milliseconds`)

// Invalid patterns must return an error and must not be cached.
for i := 0; i < 2; i++ {
_, err := strftime.Format(`%`, ref)
assert.Error(t, err, `invalid pattern returns error`)
}
}

func TestInvalid(t *testing.T) {
_, err := strftime.New("%")
if !assert.Error(t, err, `strftime.New should return error`) {