ENH: Ridge Regression CPU SPMD support#3506
Conversation
|
/intelci: run |
| } | ||
|
|
||
| TEMPLATE_LIST_TEST_M(lr_spmd_test, "RR common flow", "[rr][spmd]", lr_types) { | ||
| SKIP_IF(this->not_float64_friendly()); |
There was a problem hiding this comment.
Why would it need to skip this one if it's on CPU?
There was a problem hiding this comment.
This test is common for CPU and GPU and this check is necessary just for GPU without native float64 support
There was a problem hiding this comment.
@DDJHB @Alexandr-Solovev it would be better to add if_gpu condition then.
|
/intelci: run |
There was a problem hiding this comment.
Pull request overview
Adds CPU distributed (SPMD) Ridge Regression training support to the oneAPI flow, mirroring existing distributed Linear Regression patterns and extending the SPMD test suite accordingly.
Changes:
- Added new distributed Ridge Regression C++ samples for MPI and CCL backends.
- Extended
linear_regressionSPMD tests with a Ridge Regression (alpha-regularized) common-flow test.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| samples/oneapi/cpp/mpi/sources/ridge_regression_distr_mpi.cpp | New MPI-based distributed Ridge Regression sample using SPMD preview::train / preview::infer with alpha regularization. |
| samples/oneapi/cpp/ccl/sources/ridge_regression_distr_ccl.cpp | New CCL-based distributed Ridge Regression sample using SPMD communicator and alpha regularization. |
| cpp/oneapi/dal/algo/linear_regression/test/spmd.cpp | Adds an RR-tagged SPMD test invoking run_and_check_ridge() to validate distributed ridge flow. |
| #include <iomanip> | ||
| #include <iostream> | ||
|
|
||
| #include "oneapi/dal/algo/linear_regression.hpp" | ||
| #include "oneapi/dal/io/csv.hpp" | ||
| #include "oneapi/dal/spmd/mpi/communicator.hpp" | ||
|
|
There was a problem hiding this comment.
std::runtime_error is used in main() but <stdexcept> is not included. Relying on transitive includes is non-portable and may break builds with stricter standard library headers. Include <stdexcept> directly in this sample.
There was a problem hiding this comment.
Valid, but this should be changed in all the samples - a task for another PR.
| int main(int argc, char const *argv[]) { | ||
| int status = MPI_Init(nullptr, nullptr); | ||
| if (status != MPI_SUCCESS) { | ||
| throw std::runtime_error{ "Problem occurred during MPI init" }; | ||
| } | ||
|
|
||
| run(); | ||
|
|
||
| status = MPI_Finalize(); | ||
| if (status != MPI_SUCCESS) { | ||
| throw std::runtime_error{ "Problem occurred during MPI finalize" }; | ||
| } |
There was a problem hiding this comment.
If run() throws (e.g., CSV read / train / infer exceptions), MPI_Finalize() will be skipped, which can leave MPI in an undefined state and cause hangs/aborts on some runtimes. Consider wrapping run() in a try/catch that always calls MPI_Finalize() (or introduce an RAII guard that finalizes in its destructor).
There was a problem hiding this comment.
This code is aligned with other samples. Need to make those changes in all samples - a task for another PR.
| #include <iomanip> | ||
| #include <iostream> | ||
|
|
||
| #include "oneapi/dal/algo/linear_regression.hpp" | ||
| #include "oneapi/dal/io/csv.hpp" | ||
| #include "oneapi/dal/spmd/ccl/communicator.hpp" | ||
|
|
There was a problem hiding this comment.
std::runtime_error is used in main() but <stdexcept> is not included. Relying on transitive includes is non-portable and may break builds with stricter standard library headers. Include <stdexcept> directly in this sample.
| int main(int argc, char const *argv[]) { | ||
| ccl::init(); | ||
| int status = MPI_Init(nullptr, nullptr); | ||
| if (status != MPI_SUCCESS) { | ||
| throw std::runtime_error{ "Problem occurred during MPI init" }; | ||
| } | ||
|
|
||
| run(); | ||
|
|
||
| status = MPI_Finalize(); | ||
| if (status != MPI_SUCCESS) { | ||
| throw std::runtime_error{ "Problem occurred during MPI finalize" }; | ||
| } |
There was a problem hiding this comment.
If run() throws (e.g., CSV read / train / infer exceptions), MPI_Finalize() will be skipped, which can leave MPI in an undefined state and cause hangs/aborts on some runtimes. Consider wrapping run() in a try/catch that always calls MPI_Finalize() (or introduce an RAII guard that finalizes in its destructor).
|
/intelci: run |
|
These samples (oneapi/cpp) do not appear to be running anywhere in our private CI - is this expected? Or if they are, please point to where |
Description
This PR adds distributed (SPMD) training support for Ridge Regression on CPU, following Linear Regression changes.
Checklist:
Completeness and readability
Testing
No new failures.
Performance
Not applicable