Skip to content

Mirror ranking algorithm overweights update time vs bandwidth differences #25

Description

@Patrick-Ze

I've been using apt-smart and noticed that the mirror ranking algorithm may not optimally reflect real-world performance differences. The current implementation appears to give disproportionate weight to small differences in update time compared to massive differences in bandwidth.

def sort_key(self):
"""
A tuple that can be used to sort the mirror by its availability/performance metrics.
The tuple created by this property contains four numbers in the following order:
1. The number 1 when :attr:`is_available` is :data:`True` or
the number 0 when :attr:`is_available` is :data:`False`
(because most importantly a mirror must be available).
2. The number 0 when :attr:`is_updating` is :data:`True` or
the number 1 when :attr:`is_updating` is :data:`False`
(because being updated at this very moment is *bad*).
3. The negated value of :attr:`last_updated` (because the
lower :attr:`last_updated` is, the better). If :attr:`last_updated`
is :data:`None` then :data:`LAST_UPDATED_DEFAULT` is used instead.
4. The value of :attr:`bandwidth` (because the higher
:attr:`bandwidth` is, the better).
By sorting :class:`CandidateMirror` objects on these tuples in
ascending order, the last mirror in the sorted results will be the
"most suitable mirror" (given the available information).
"""
return (int(self.is_available),
int(not self.is_updating),
-(self.last_updated if self.last_updated is not None else LAST_UPDATED_DEFAULT),
self.bandwidth or 0)

The sort_key method in CandidateMirror uses 4 factors with equal weighting, this can lead to situations where:

------------------------------------------------------------------------------------------------------
| Rank | Mirror URL                         | Available? | Updating? | Last updated    | Bandwidth   |
------------------------------------------------------------------------------------------------------
|    1 | http://mirror.nyist.edu.cn/ubun... | Yes        | No        | 2 hours behind  | 257.03 KB/s |
|    2 | http://ports.ubuntu.com            | Yes        | No        | 2 hours behind  | 75.25 KB/s  |
|    3 | https://mirror.nyist.edu.cn/ubu... | Yes        | No        | 2 hours behind  | 32.02 KB/s  |
|    4 | http://mirrors.aliyun.com/ubuntu   | Yes        | No        | 3 hours behind  | 3.84 MB/s   |
|    5 | https://mirrors.sdu.edu.cn/ubuntu  | Yes        | No        | 3 hours behind  | 514.34 KB/s |
|    6 | http://mirrors.sdu.edu.cn/ubuntu   | Yes        | No        | 3 hours behind  | 391.18 KB/s |

A mirror with 32 KB/s bandwidth and 2-hour delay ranks higher than
A mirror with 3.84 MB/s bandwidth and 3-hour delay

The 1-hour difference in update time creates a larger sorting impact than the 2-order-of-magnitude difference in bandwidth. A mirror source with 32KB/s can hardly be considered very usable.

Assigning different weights to various factors could potentially serve as a solution for apt-smart

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions