Add baby CAS unit conversion #89

Open
jessealama wants to merge 14 commits into main from baby-cas-unit-conversion

@jessealama
Collaborator

Adds a lightweight unit conversion system ("baby CAS") to the Amount proposal. The spec text introduces a static conversion factor table covering length, mass, volume, temperature, area, speed, concentration, and digital units, along with the ConvertUnitValue abstract operation that performs exact rational arithmetic over these factors. A TypeScript generator script (scripts/generate-conversion-table.ts) derives the ecmarkup table rows from CLDR unit data, ensuring the spec table stays in sync with upstream definitions.

@github-actions

github-actions bot commented Mar 23, 2026

PR Preview Action v1.8.1

🚀 View preview at
https://tc39.github.io/proposal-amount/pr-preview/pr-89/

Built to branch gh-pages at 2026-04-01 09:56 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@jessealama jessealama force-pushed the baby-cas-unit-conversion branch 4 times, most recently from 0404a2e to 42ca8a2 Compare March 27, 2026 09:36
Conversion factors are stored as integer Numerator/Denominator pairs.
ConvertUnitValue combines source and target factors over a common
denominator and delegates to ApplyUnitConversion, which computes
(value × numerator + offset) / denominator using Number arithmetic.
@jessealama jessealama force-pushed the baby-cas-unit-conversion branch from 42ca8a2 to 4fc61a6 Compare March 27, 2026 09:48
Split the offset argument into offsetNumerator/offsetDenominator and
reduce conversionNumerator/conversionDenominator by their GCD before
the floating-point arithmetic. ConvertUnitValue now passes the scale
factor and offset as independent rationals.
Member

@gibson042 gibson042 left a comment

As I mentioned in the ECMA-402 meeting, I think this is doing far too much. If we pursue such an approach, I would instead expect it to take the form of expression-aware consumption of units.xml, from which point the specification could either require mathematical value calculations with a final conversion to Number or an up-front elementary arithmetic symbolic reduction followed by Number calculations. And in neither case would I expect to duplicate CLDR contents into large tables in ECMA specifications.

For a maximally complex example (i.e., involving both a non-unit factor and a non-zero offset), consider converting 100°C to Fahrenheit, which relies upon the following definitions in reverse order:

<convertUnit source='fahrenheit' baseUnit='kelvin' factor='5/9' offset='2298.35/9' systems="ussystem uksystem"/>
<convertUnit source='celsius' baseUnit='kelvin' offset='273.15' systems="si metric"/>
  • A naïve Number calculation gives the wrong answer: ((celsiusInput * (celsius.factor ?? 1) + (celsius.offset ?? 0)) - (fahrenheit.offset ?? 0)) / (fahrenheit.factor ?? 1) = ((100 * (undefined ?? 1) + (273.15 ?? 0)) - (2298.35/9 ?? 0)) / (5/9 ?? 1) = ((100 * 1 + 273.15) - 2298.35/9) / (5/9) = 211.99999999999997 = 211.99999999999997𝔽.
  • A mathematical value calculation gives the right answer: ((100 × 1 + 273.15) − 2298.35÷9) ÷ (5÷9) = ℝ(212) → 212𝔽.
  • Number calculation of the elementary arithmetic reduction of the expression also gives the right answer: ((celsiusInput × 1 + 273.15) - 2298.35÷9) ÷ (5÷9) = ((celsiusInput × 9 + 273.15 × 9) - 2298.35) ÷ 5 = (celsiusInput × 9 + 160) ÷ 5 = celsiusInput × 1.8 + 32 → 100 * 1.8 + 32 = 212 = 212𝔽.

Further, because arithmetic reduction in the mathematical value domain is guaranteed to be result-preserving, it is equally applicable to both sensible approaches—the above mathematical value calculation is exactly equivalent to 100 × 1.8 + 32 = ℝ(212) → 212𝔽. And since only linear conversions are in scope, that means conversion requires at most one multiplication and one addition—and even further still, given the current contents of units.xml, addition is necessary only for temperature conversions involving Celsius and/or Fahrenheit (every other non-special conversion is possible with just a single multiplication).
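
To make the reduction concrete, here is a minimal sketch in JavaScript that uses BigInt-backed exact fractions as a stand-in for mathematical values. The helper names (`frac`, `add`, `sub`, `mul`, `div`, `reduce`, `toNumber`) and the `units` object are hypothetical and exist only for illustration; the factor and offset values are taken from the two `<convertUnit>` elements quoted above.

```javascript
// Exact fractions over BigInt; every helper here is a hypothetical name.
const gcd = (a, b) => {
  a = a < 0n ? -a : a;
  b = b < 0n ? -b : b;
  while (b !== 0n) { [a, b] = [b, a % b]; }
  return a;
};
// Build a fraction in lowest terms with a positive denominator.
const frac = (n, d = 1n) => {
  if (d < 0n) { n = -n; d = -d; }
  const g = gcd(n, d);
  return { n: n / g, d: d / g };
};
const add = (x, y) => frac(x.n * y.d + y.n * x.d, x.d * y.d);
const sub = (x, y) => frac(x.n * y.d - y.n * x.d, x.d * y.d);
const mul = (x, y) => frac(x.n * y.n, x.d * y.d);
const div = (x, y) => frac(x.n * y.d, x.d * y.n);

// factor/offset from the two <convertUnit> elements quoted above.
const units = {
  celsius: { factor: frac(1n), offset: frac(27315n, 100n) },          // offset='273.15'
  fahrenheit: { factor: frac(5n, 9n), offset: frac(229835n, 900n) },  // factor='5/9' offset='2298.35/9'
};

// Reduce source -> base -> target to one linear map: x * slope + intercept.
function reduce(source, target) {
  const slope = div(source.factor, target.factor);
  const intercept = div(sub(source.offset, target.offset), target.factor);
  return { slope, intercept };
}

const { slope, intercept } = reduce(units.celsius, units.fahrenheit);

// Mathematical-value style: stay exact until the very end.
const exact = add(mul(frac(100n), slope), intercept);

// Number style: round slope and intercept once, then multiply and add.
// Number(n) / Number(d) is correctly rounded here only because both parts
// are small integers that binary64 represents exactly.
const toNumber = (x) => Number(x.n) / Number(x.d);
const viaNumbers = 100 * toNumber(slope) + toNumber(intercept);

console.log(slope, intercept, exact, viaNumbers);
```

For this input both styles agree on 212; the sketch does not attempt to reproduce inputs where the two approaches round differently.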

However, note that the mathematical value calculation and Number calculation of the elementary arithmetic reduction approaches are not definitionally equivalent—converting 80063993375475600°C with the former would produce 144115188075856128𝔽 (ℝ(144115188075856112) being snapped per the Number value for x from exactly halfway between two Number values with a mutual separation of 32 to that [higher] one with even significand), while the latter would produce 144115188075856096𝔽 (80063993375475600 * 1.8 first snapping midpoint ℝ(144115188075856080) down to even-significand 144115188075856064𝔽 in Number::multiply and then the subsequent + 32 effecting no further rounding).

Note also that even mathematical perfection will not eliminate surprises with rounding modes that are not based on "nearest" behavior (because e.g. 7 inches converts to 0.5833333333333333𝔽 feet, which converts to 6.999999999999999𝔽 inches). But I don't think such repeated conversions are in scope, which means both approaches are equally viable, especially given that only temperature conversions would be subject to intermediate rounding in the Number-calculations one, and even then the error would be like any other binary64 operation. What we're left with seems like immaterial end-user differences and a tradeoff between easy-to-specify mathematical value operations and easy-to-implement Number operations.
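
The round trip mentioned above can be sketched in a few lines of JavaScript. The variable names are hypothetical, and the sketch assumes the inch-to-foot factor is the Number nearest to 1/12 and the foot-to-inch factor is exactly 12.

```javascript
// inches -> feet multiplies by the Number nearest to 1/12;
// feet -> inches multiplies by 12, which binary64 represents exactly.
const inchesToFeet = 1 / 12;
const feetToInches = 12;

const feet = 7 * inchesToFeet;
const inchesAgain = feet * feetToInches;

// A nearest-based rounding recovers the original value, while a
// directional mode such as floor exposes the tiny shortfall.
const nearest = Math.round(inchesAgain);
const floored = Math.floor(inchesAgain);

console.log(feet, inchesAgain, nearest, floored);
```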

@jessealama
Collaborator Author

An earlier version of the spec (since squashed out of existence) was literally nothing more than: Return 𝔽(ℝ(_value_) × _conversionFactor_ + _offset_). That's the heart of the matter. It's very short, and if we can get away with citing CLDR without including any units or conversion factors in the spec, even better. But I thought that implementors would need more. That's why I've included the data tables from CLDR and spelled out how to perform the conversion with integers, even (potentially) performing common-factor cancellation of ratios to reduce the chance of an intermediate calculation overflowing. (The understanding, of course, is that this can't guard against all possible overflows or inevitable rounding. It's better than the naive approach, but still just best-effort.) If editors and implementors are happy with the more compact approach, I'm OK with that. The spelled-out approach has the advantage of establishing a common reading of 𝔽(ℝ(_value_) × _conversionFactor_ + _offset_) and so reducing the chance of divergent interpretations, but I may be overthinking things there, and I'm OK trimming this down.

Remove embedded conversion factor tables from spec.emu and intl.emu,
reference CLDR units.xml instead. Add CreateFormatterObject AO and
resolve FormatNumericToString TODOs. Remove 402-specific convertTo.
@jessealama
Collaborator Author

@gibson042 Thanks for the review. I've reworked this quite a bit.

The CLDR data tables have been removed entirely. In their place, the spec now normatively references CLDR's units.xml <convertUnit> elements and specifies that implementations must extract exact mathematical values from the factor and offset expressions. I've added a note showing worked examples for inch, celsius, and fahrenheit to illustrate the extraction.
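
As a rough illustration of what extracting exact values from these attributes could look like, here is a sketch in JavaScript. It assumes a deliberately narrow grammar: plain decimal literals joined by `*` and `/`, evaluated left to right. Named constants and any other syntax that the real CLDR data may use are not handled, and the function names `parseDecimal` and `parseExpression` are hypothetical.

```javascript
// Greatest common divisor for non-negative BigInts.
const gcd = (a, b) => {
  while (b !== 0n) { [a, b] = [b, a % b]; }
  return a;
};

// Parse a decimal literal such as "273.15" into an exact fraction { n, d }.
function parseDecimal(text) {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(String(text).trim());
  if (!match) throw new SyntaxError(`unsupported operand: ${text}`);
  const fractionDigits = match[2] ?? "";
  return {
    n: BigInt(match[1] + fractionDigits),
    d: 10n ** BigInt(fractionDigits.length),
  };
}

// Parse decimal literals joined by "*" and "/" (assumed grammar only),
// returning the exact value in lowest terms.
function parseExpression(text) {
  const tokens = text.split(/([*\/])/);
  let { n, d } = parseDecimal(tokens[0]);
  for (let i = 1; i < tokens.length; i += 2) {
    const operand = parseDecimal(tokens[i + 1]);
    if (tokens[i] === "*") {
      n *= operand.n;
      d *= operand.d;
    } else {
      if (operand.n === 0n) throw new RangeError("division by zero");
      n *= operand.d;
      d *= operand.n;
    }
  }
  const g = gcd(n, d);
  return { n: n / g, d: d / g };
}

// Missing attributes fall back to factor 1 and offset 0.
const fahrenheit = {
  factor: parseExpression("5/9"),
  offset: parseExpression("2298.35/9"),
};
const celsius = {
  factor: parseExpression("1"),
  offset: parseExpression("273.15"),
};

console.log(fahrenheit, celsius);
```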

The approach is your option (B): reduction in Number-land. ApplyUnitConversion takes integer pairs (non-reified rational numbers) coming from the CLDR, does GCD reduction to minimize intermediate magnitudes (though maybe we could simplify the spec even further and make GCD reduction an implementation note?), and performs a single multiply-then-divide (or, for offset conversions, puts both terms over a common denominator for a single division point). This is done with Numbers, not mathematical values.
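
For readers following along, the shape described here might look roughly like the JavaScript sketch below. The function name mirrors the abstract operation, but the parameter order and the body are assumptions made for illustration and are not the spec text. The inch and foot figures assume 0.0254 m and 0.3048 m respectively.

```javascript
// Greatest common divisor of two positive safe integers.
function gcd(a, b) {
  while (b !== 0) { [a, b] = [b, a % b]; }
  return a;
}

// Hypothetical Number-based sketch: scale = num/den, offset = offNum/offDen.
function applyUnitConversion(value, num, den, offNum = 0, offDen = 1) {
  // Cancel common factors first to keep intermediate magnitudes small.
  const g = gcd(num, den);
  num /= g;
  den /= g;
  if (offNum === 0) {
    // Single multiply-then-divide.
    return (value * num) / den;
  }
  // Put both terms over a common denominator for a single division.
  return (value * num * offDen + offNum * den) / (den * offDen);
}

// inch -> foot: 0.0254 / 0.3048 = 254/3048, which reduces to 1/12.
const feet = applyUnitConversion(18, 254, 3048);
// celsius -> fahrenheit: scale 9/5, offset 32/1.
const fahrenheit = applyUnitConversion(100, 9, 5, 32, 1);

console.log(feet, fahrenheit);
```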

Agreed that multi-step conversions might still lose some precision. In particular, we make no round-trip guarantees. But this is better than naive calculation with JS floats.

I also took the opportunity to clean things up on the 402 side, as discussed in the call. The .convertTo override and ResolveUnitPreference have been removed for now, keeping unit conversion fully specified in 262. There's not really that much in the 402 part anymore.

@jessealama jessealama marked this pull request as ready for review March 31, 2026 14:56
@jessealama jessealama requested review from eemeli and gibson042 and removed request for gibson042 March 31, 2026 14:56
Member

@gibson042 gibson042 left a comment

I second all of @eemeli's comments. References to rational numbers and/or fraction reduction are a distracting nuisance; ECMA-262 already has infinite precision mathematical values that can further simplify this spec text.

jessealama and others added 7 commits April 1, 2026 11:23
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Member

Looks like this file needs to be removed, and package.json and package-lock.json reverted.

<emu-clause id="sec-amount-unit-conversion-data">
<h1>Unit Conversion Data</h1>
<p>Unit conversion data is derived from CLDR file <a href="https://github.com/unicode-org/cldr/blob/main/common/supplemental/units.xml"><code>units.xml</code></a>. As described in <a href="https://unicode.org/reports/tr35/tr35-info.html#conversion-data">Unicode Technical Standard #35 Part 6 Supplemental, Conversion Data</a>, each <code>&lt;convertUnit&gt;</code> element defines how to convert a <code>source</code> unit into a compatible <code>baseUnit</code>. An ECMAScript implementation must ignore all <code>special</code> conversions and support all conversions based on <code>factor</code> and/or <code>offset</code>, interpreting the value for each as an arithmetic expression with mathematical value operands (noting the respective defaults of 1 and 0 and the implicit presence of an identity mapping for each unit identified as the value of a <code>baseUnit</code>).</p>
<p>Two units are in the same <dfn id="dfn-unit-category">unit category</dfn> if and only if they share the same <code>baseUnit</code> value in CLDR. A <dfn id="dfn-base-unit">base unit</dfn> is any unit that appears as a <code>baseUnit</code> value; its conversion factor is 1 and its offset is 0.</p>
Member

I'm not going to push too hard on this here, but I don't think I see sufficient value in defining "unit category" and "base unit" terms.

1. If the CLDR unit conversion data specifies a conversion offset for _unit_, let _offset_ be the Number value closest to that rational offset; otherwise, let _offset_ be *+0*<sub>𝔽</sub>.
1. If _unit_ is the <code>source</code> of a <code>&lt;convertUnit&gt;</code> element in the <emu-xref href="#sec-amount-unit-conversion-data">unit conversion data</emu-xref>, then
1. Let _element_ be that <code>&lt;convertUnit&gt;</code> element.
1. Let _category_ be the <code>baseUnit</code> of _element_.
Member

Seeing this in practice, I think we should use baseUnit values directly rather than introducing a "category" concept.

Member

...but looking ahead, we will want the actual category for ECMA-402 unit conversion. That should come from a reference to UTS #35 Part 6 Supplemental, Compute the category.

1. Let _factor_ be the Number value closest to the rational conversion factor for _unit_ as specified by the CLDR unit conversion data.
1. If the CLDR unit conversion data specifies a conversion offset for _unit_, let _offset_ be the Number value closest to that rational offset; otherwise, let _offset_ be *+0*<sub>𝔽</sub>.
1. If _unit_ is the <code>source</code> of a <code>&lt;convertUnit&gt;</code> element in the <emu-xref href="#sec-amount-unit-conversion-data">unit conversion data</emu-xref>, then
1. Let _element_ be that <code>&lt;convertUnit&gt;</code> element.
Member

@gibson042 gibson042 Apr 1, 2026

Suggested change
1. Let _element_ be that <code>&lt;convertUnit&gt;</code> element.
1. Let _element_ be that <code>&lt;convertUnit&gt;</code> element.
1. If _element_ has an attribute <code>special</code>, throw a *TypeError* exception.

1. Let _factor_ be 1.
1. Let _offset_ be 0.
1. Else,
1. Throw a *RangeError* exception.
Member

I think TypeError is a better fit for attempts to convert unknown units.

Suggested change
1. Throw a *RangeError* exception.
1. Throw a *TypeError* exception.

1. Let _sourceConv_ be ? GetUnitConversionFactor(_sourceUnit_).
1. Let _targetConv_ be ? GetUnitConversionFactor(_targetUnit_).
1. If SameValue(_sourceConv_.[[Category]], _targetConv_.[[Category]]) is *false*, throw a *RangeError* exception.
1. If _sourceConv_.[[Category]] is not _targetConv_.[[Category]], throw a *RangeError* exception.
Member

Here as well.

Suggested change
1. If _sourceConv_.[[Category]] is not _targetConv_.[[Category]], throw a *RangeError* exception.
1. If _sourceConv_.[[Category]] is not _targetConv_.[[Category]], throw a *TypeError* exception.

Comment on lines +203 to +210
1. If _sourceOffset_ = 0 and _targetOffset_ = 0, then
1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, return _value_.
1. Else,
1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, set _value_ to *+0*<sub>𝔽</sub>.
1. Return _value_ × 𝔽(_sourceFactor_ / _targetFactor_) + 𝔽((_sourceOffset_ - _targetOffset_) / _targetFactor_).
</emu-alg>
<emu-note>
<p>The conversion from _sourceUnit_ to _targetUnit_ through the base unit is: _result_ = _value_ × _sourceFactor_ / _targetFactor_ + (_sourceOffset_ − _targetOffset_) / _targetFactor_. The expressions _sourceFactor_ / _targetFactor_ and (_sourceOffset_ − _targetOffset_) / _targetFactor_ are computed as mathematical values, and only the multiplication and addition are performed as Number operations. For non-offset conversions (the vast majority), the second term is 𝔽(0) = *+0*<sub>𝔽</sub> and the addition is a no-op.</p>
Member

Multiplication should preserve negative zero, but there's no need for this much special-casing.

Suggested change
1. If _sourceOffset_ = 0 and _targetOffset_ = 0, then
1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, return _value_.
1. Else,
1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, set _value_ to *+0*<sub>𝔽</sub>.
1. Return _value_ × 𝔽(_sourceFactor_ / _targetFactor_) + 𝔽((_sourceOffset_ - _targetOffset_) / _targetFactor_).
</emu-alg>
<emu-note>
<p>The conversion from _sourceUnit_ to _targetUnit_ through the base unit is: _result_ = _value_ × _sourceFactor_ / _targetFactor_ + (_sourceOffset_ − _targetOffset_) / _targetFactor_. The expressions _sourceFactor_ / _targetFactor_ and (_sourceOffset_ − _targetOffset_) / _targetFactor_ are computed as mathematical values, and only the multiplication and addition are performed as Number operations. For non-offset conversions (the vast majority), the second term is 𝔽(0) = *+0*<sub>𝔽</sub> and the addition is a no-op.</p>
1. If _sourceOffset_ is _targetOffset_, then
1. NOTE: This preserves a _value_ of *-0*<sub>𝔽</sub>.
1. Return _value_ × 𝔽(_sourceFactor_ / _targetFactor_).
1. Return _value_ × 𝔽(_sourceFactor_ / _targetFactor_) + 𝔽((_sourceOffset_ - _targetOffset_) / _targetFactor_).
</emu-alg>
<emu-note>
<p>The conversion from _sourceUnit_ to _targetUnit_ through the base unit is: _result_ = _value_ × (_sourceFactor_ / _targetFactor_) + (_sourceOffset_ − _targetOffset_) / _targetFactor_. The subexpressions _sourceFactor_ / _targetFactor_ and (_sourceOffset_ − _targetOffset_) / _targetFactor_ are computed as mathematical values, and then converted to Number values for the multiplication and addition. For non-offset conversions (the vast majority), the offset term is 0 and addition is skipped in order to preserve an input _value_ of *-0*<sub>𝔽</sub>.</p>
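
The negative-zero behavior that motivates the suggested steps can be checked with a small JavaScript sketch. The function `convertUnitValue` and its parameters are hypothetical: `scale` and `shift` stand for the already-rounded Numbers 𝔽(_sourceFactor_ / _targetFactor_) and 𝔽((_sourceOffset_ − _targetOffset_) / _targetFactor_).

```javascript
// Hypothetical sketch of the suggested steps, operating on Numbers that
// have already been rounded from the exact scale and shift.
function convertUnitValue(value, scale, shift, offsetsMatch) {
  if (offsetsMatch) {
    // Multiplication alone keeps the sign of zero.
    return value * scale;
  }
  return value * scale + shift;
}

const inchToFoot = 1 / 12;

// With matching offsets, an input of -0 comes back as -0.
const preserved = Object.is(convertUnitValue(-0, inchToFoot, 0, true), -0);

// Adding +0 to -0 yields +0, which is why the addition is skipped above.
const lostSign = Object.is(-0 * inchToFoot + 0, -0);

console.log(preserved, lostSign);
```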

1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, return _value_.
1. Else,
1. If _value_ is *+0*<sub>𝔽</sub> or _value_ is *-0*<sub>𝔽</sub>, set _value_ to *+0*<sub>𝔽</sub>.
1. Return _value_ × 𝔽(_sourceFactor_ / _targetFactor_) + 𝔽((_sourceOffset_ - _targetOffset_) / _targetFactor_).
Contributor

@sffc sffc Apr 7, 2026

Question: should this be (A)

𝔽(_sourceFactor_ / _targetFactor_)

or should it be (B)

(𝔽(_sourceFactor_) / 𝔽(_targetFactor_))

(A) means that engines need to perform MV steps before they perform the float steps, but this is for a fixed number of unit pairs so it could be theoretically cached.

(B) means that we are exposing a greater degree of floating point arithmetic madness, but it is probably easier to implement, since every unit can have a single pre-computed 64-bit number value.

Note: here is an example where (A) and (B) are different:

1e5 / 0.3
// 333333.3333333334
333333.333333333333
// 333333.3333333333

I haven't yet found an example using actual CLDR conversion data but I haven't done an exhaustive search.

Member

It should be (A). With (B), 18 inches would convert to 1.5000000000000002 feet.

Member

@gibson042 gibson042 Apr 7, 2026

It most certainly should be (A); that's the very purpose here. Converting 5 inches to feet should behave like 5 * (1/12) (0.41666666666666663), not like 5 * ((0.3048/12) / 0.3048) (0.41666666666666674).

Implementations are free to optimize performance by caching binary64 values for conversion pairs and even to compile in pre-computed values since there are so few (I count 155 <convertUnit> elements, split into 39 shared-baseUnit groups, most of which include just a single element and the largest of which includes 31, for a total of no more than 2276 conversion factors [accounting for forward and reverse permutations] but in practice more like 2028 because of the many unit factors—and even fewer if complex and/or trivial special cases are separated).
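
A minimal JavaScript sketch of the difference between (A) and (B), together with the kind of per-pair caching mentioned above. The `factors` table, the cache key format, and the function names are all hypothetical; the inch and foot values assume 0.0254 m and 0.3048 m.

```javascript
// Exact per-unit factors as BigInt fractions of the base unit (meter).
const factors = {
  inch: { n: 254n, d: 10000n },
  foot: { n: 3048n, d: 10000n },
};

const gcd = (a, b) => {
  while (b !== 0n) { [a, b] = [b, a % b]; }
  return a;
};

const cache = new Map();
const LIMIT = BigInt(Number.MAX_SAFE_INTEGER);

// Option (A): divide exactly first, then round once to a Number.
function pairFactorA(source, target) {
  const key = `${source}>${target}`;
  if (!cache.has(key)) {
    const s = factors[source];
    const t = factors[target];
    let n = s.n * t.d;
    let d = s.d * t.n;
    const g = gcd(n, d);
    n /= g;
    d /= g;
    // Number(n) / Number(d) is correctly rounded only while both parts
    // are exactly representable, so this sketch refuses anything larger.
    if (n > LIMIT || d > LIMIT) throw new RangeError("fraction too large for this sketch");
    cache.set(key, Number(n) / Number(d));
  }
  return cache.get(key);
}

// Option (B): round each unit's factor to a Number, then divide.
function pairFactorB(source, target) {
  const toNumber = (f) => Number(f.n) / Number(f.d);
  return toNumber(factors[source]) / toNumber(factors[target]);
}

const a = pairFactorA("inch", "foot");
const b = pairFactorB("inch", "foot");
console.log(5 * a, 5 * b);
```

With (A) the cached value is the Number nearest to exactly 1/12, so the per-pair table can be computed ahead of time; (B) divides two already-rounded Numbers and may differ from it in the last bits.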

Contributor

It's a good point to avoid repeated factors like that.

The ICU implementation, I believe, cancels out identical factors and then performs floating point arithmetic on the remaining factors, although it might have the option of using a decimal library instead. Based on the direction of the proposal being Number-centric, it would be best to avoid requiring an engine to figure out how to do this math in MV space.

2276 factors to hard-code is a lot, and it will grow quadratically when we add more units.

Contributor

Thought: could we somehow specify that the factor computation uses something like Math.mulPrecise? (we have Math.sumPrecise since https://github.com/tc39/proposal-math-sum)

Contributor

Observation: we only have a problem when there are more than 2 non-identical factors. Also, factors of 10 are fairly easy for an engine to special-case.

Do we have an idea of how many convertible unit pairs involve 3 or more non-equal, non-power-of-10 factors?

4 participants