tl;dr
Surge currently provides a separate implementation of each function for Float and for Double. This makes Surge essentially incompatible with Swift's T: FloatingPoint generics. By introducing a little bit of internal runtime dynamism, we aim to migrate the existing function pairs to their generic equivalents over T: FloatingPoint.
What?
With the recent refactors, we have managed to reduce each computation to a function set consisting of a single internal core implementation, acting as the single source of truth, and a bunch of thin public convenience wrapper functions.
Scalar division ([Scalar] / Scalar), for example, is implemented like this:

```swift
public func / <L>(lhs: L, rhs: Float) -> [Float] where L: UnsafeMemoryAccessible, L.Element == Float {
    return div(lhs, rhs)
}

public func div<L>(_ lhs: L, _ rhs: Float) -> [Float] where L: UnsafeMemoryAccessible, L.Element == Float {
    return withArray(from: lhs) { divInPlace(&$0, rhs) }
}

func divInPlace<L>(_ lhs: inout L, _ rhs: Float) where L: UnsafeMutableMemoryAccessible, L.Element == Float {
    lhs.withUnsafeMutableMemory { lm in
        var scalar = rhs
        vDSP_vsdiv(lm.pointer, numericCast(lm.stride), &scalar, lm.pointer, numericCast(lm.stride), numericCast(lm.count))
    }
}
```
… with an almost identical copy of each of these functions existing for Double instead of Float.
Why?
While the project's current state is quite an improvement over its previous one, it has a couple of remaining deficits:
- We have literally everything in two near-identical flavors: Float and Double.
- One cannot currently use Surge in contexts that are generic over T: FloatingPoint rather than tied to a concrete Float/Double.
So this got me thinking: what if we migrated Surge from using Float/Double to an API based on T: FloatingPoint, and then internally made use of some dynamic language features to roll our own polymorphism over the closed set of Float and Double, with a fatalError(…) on type mismatch?
The aforementioned dynamism would add a certain amount of runtime overhead to Surge. It is important to note, however, that this overhead is constant (O(1), not O(N)): a single call of Surge.divInPlace(_:_:) over a pair of 10_000-element arrays adds only a single branch per execution, not 10_000 branches in a loop, as would be the case for a naïve non-parallel looping implementation.
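The dispatch pattern can be sketched as follows. This is a simplified, self-contained sketch over plain arrays, not Surge's actual implementation; in Surge each branch would forward the whole buffer to the corresponding vDSP call instead of the stand-in loop:

```swift
// Simplified sketch of the proposed closed-set dispatch: the switch on
// T.self runs once per call, regardless of the element count.
func divInPlace<T: FloatingPoint>(_ lhs: inout [T], _ rhs: T) {
    switch T.self {
    case is Float.Type, is Double.Type:
        // Stand-in for the single vectorized vDSP_vsdiv / vDSP_vsdivD call.
        for index in lhs.indices {
            lhs[index] /= rhs
        }
    default:
        // The set of supported types is closed: anything else is a
        // programmer error, not a recoverable condition.
        fatalError("Only Float and Double are supported by Accelerate")
    }
}

var values: [Float] = [2.0, 4.0, 6.0]
divInPlace(&values, 2.0)
// values == [1.0, 2.0, 3.0]
```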
How?
So what would this look like? What would we need to change?
- We would replace every existing pair of thin public wrapper functions for Float/Double with a single equivalent function that is generic over T: FloatingPoint.
- We would merge every existing pair of internal …InPlace(…) core-implementation functions for Float/Double into a single equivalent function that is generic over T: FloatingPoint on the outside and performs a switch on T.self on the inside.
- We would add func withMemoryRebound(to:_:) to UnsafeMemory<T> and UnsafeMutableMemory<T>, so that we can efficiently cast from UnsafeMemory<T: FloatingPoint> to UnsafeMemory<Double> without having to copy/cast any individual values.
- We would add func withUnsafeMemory(as:…) convenience functions for retrieving type-cast variants of UnsafeMemory<T> from instances of UnsafeMemoryAccessible/UnsafeMutableMemoryAccessible.
- We would refactor the func …InPlace(…) implementations into something like this:
```swift
func divInPlace<L, T>(_ lhs: inout L, _ rhs: T) where L: UnsafeMutableMemoryAccessible, L.Element == T, T: FloatingPoint & ExpressibleByFloatLiteral {
    let rhs = CollectionOfOne(rhs)
    withUnsafeMemory(
        &lhs,
        rhs,
        float: { lhs, rhs in
            vDSP_vsdiv(lhs.pointer, numericCast(lhs.stride), rhs.pointer, lhs.pointer, numericCast(lhs.stride), numericCast(lhs.count))
        },
        double: { lhs, rhs in
            vDSP_vsdivD(lhs.pointer, numericCast(lhs.stride), rhs.pointer, lhs.pointer, numericCast(lhs.stride), numericCast(lhs.count))
        }
    )
}
```
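This leans on a dispatching withUnsafeMemory(_:_:float:double:) overload that does not exist yet. One possible shape for it, sketched against the proposed UnsafeMemory additions (an illustration of the idea, not Surge's actual code):

```swift
// Hypothetical dispatching helper: switches on T.self once per call,
// rebinds the memory to the matching concrete element type, and invokes
// the corresponding vDSP-backed closure.
func withUnsafeMemory<L, R, T>(
    _ lhs: inout L,
    _ rhs: R,
    float: (UnsafeMutableMemory<Float>, UnsafeMemory<Float>) -> Void,
    double: (UnsafeMutableMemory<Double>, UnsafeMemory<Double>) -> Void
) where L: UnsafeMutableMemoryAccessible, L.Element == T, R: UnsafeMemoryAccessible, R.Element == T, T: FloatingPoint {
    lhs.withUnsafeMutableMemory { lm in
        rhs.withUnsafeMemory { rm in
            switch T.self {
            case is Float.Type:
                // Relies on the proposed withMemoryRebound(to:_:) additions.
                lm.withMemoryRebound(to: Float.self) { lf in
                    rm.withMemoryRebound(to: Float.self) { rf in
                        float(lf, rf)
                    }
                }
            case is Double.Type:
                lm.withMemoryRebound(to: Double.self) { ld in
                    rm.withMemoryRebound(to: Double.self) { rd in
                        double(ld, rd)
                    }
                }
            default:
                fatalError("Only Float and Double are supported by Accelerate")
            }
        }
    }
}
```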
So far I have not been able to measure any noticeable performance regressions introduced by this change.
There should also be very little breakage from these changes, as T: FloatingPoint is for the most part a strict superset of either Float or Double.
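Once the public API is generic, downstream code can itself stay generic over T: FloatingPoint instead of being written twice. A hypothetical caller-side sketch (div here is a plain stand-in for the migrated Surge function, not its actual implementation):

```swift
// Stand-in for the migrated generic Surge.div(_:_:) (assumption, not Surge's code).
func div<T: FloatingPoint>(_ lhs: [T], _ rhs: T) -> [T] {
    return lhs.map { $0 / rhs }
}

// A downstream algorithm that works for Float and Double alike,
// written once instead of as a pair of overloads.
func normalized<T: FloatingPoint>(_ values: [T]) -> [T] {
    guard let maxValue = values.max(), maxValue != 0 else { return values }
    return div(values, maxValue)
}

let floats: [Float] = [1, 2, 4]
let doubles: [Double] = [1, 2, 4]
// normalized(floats) == [0.25, 0.5, 1.0]
// normalized(doubles) == [0.25, 0.5, 1.0]
```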
(I already have a proof-of-concept for this on a local branch and will push it as a PR at some point.)