Merged
3 changes: 3 additions & 0 deletions tensorflow/core/kernels/image/BUILD
@@ -264,6 +264,8 @@ tf_kernel_library(
prefix = "extract_image_patches_op",
deps = [
"//tensorflow/core/kernels:ops_util",
"@com_google_absl//absl/status",
"@com_google_absl//absl/strings",
] + IMAGE_DEPS,
)

@@ -309,6 +311,7 @@ tf_kernel_library(
deps = IMAGE_DEPS + [
":sampling_kernels",
"@com_google_absl//absl/status",
"@com_google_absl//absl/strings",
"@com_google_absl//absl/strings:str_format",
],
)
4 changes: 4 additions & 0 deletions tensorflow/core/kernels/image/extract_image_patches_op.cc
@@ -16,6 +16,10 @@ limitations under the License.
// See docs in ../ops/image_ops.cc.

#include <cstdint>
#include <string>

#include "absl/status/status.h"
#include "absl/strings/str_cat.h"
#define USE_EIGEN_TENSOR
#define EIGEN_USE_THREADS

1 change: 1 addition & 0 deletions tensorflow/core/kernels/image/scale_and_translate_op.cc
@@ -22,6 +22,7 @@ limitations under the License.
#include <vector>

#include "absl/status/status.h"
#include "absl/strings/str_cat.h"
#include "absl/strings/str_format.h"
#define EIGEN_USE_THREADS

47 changes: 47 additions & 0 deletions tensorflow/lite/delegates/ynnpack/BUILD
@@ -0,0 +1,47 @@
load("@rules_cc//cc:cc_library.bzl", "cc_library")
load("//tensorflow:tensorflow.default.bzl", "get_compatible_with_portable")
load("//tensorflow/lite:build_def.bzl", "tflite_copts")

package(
# copybara:uncomment default_applicable_licenses = ["//tensorflow:LICENSE"],
default_visibility = ["//visibility:public"],
licenses = ["notice"],
)

cc_library(
name = "ynnpack_delegate",
srcs = [
"copy.cc",
"copy.h",
"dot.cc",
"dot.h",
"elementwise.cc",
"elementwise.h",
"pooling.cc",
"pooling.h",
"reduction.cc",
"reduction.h",
"softmax.cc",
"softmax.h",
"utils.cc",
"utils.h",
"ynnpack_delegate.cc",
],
hdrs = ["ynnpack_delegate.h"],
compatible_with = get_compatible_with_portable(),
copts = tflite_copts(),
linkstatic = True,
deps = [
"//tensorflow/lite:builtin_ops",
"//tensorflow/lite:minimal_logging",
"//tensorflow/lite/core/c:common",
"//tensorflow/lite/delegates/utils:simple_delegate",
"//tensorflow/lite/kernels:kernel_util",
"//tensorflow/lite/types:half",
"@XNNPACK//ynnpack",
"@XNNPACK//ynnpack/composites",
"@com_google_absl//absl/container:flat_hash_map",
"@com_google_absl//absl/types:span",
"@slinky//slinky/base:thread_pool_impl",
],
)
49 changes: 49 additions & 0 deletions tensorflow/lite/delegates/ynnpack/README.md
@@ -0,0 +1,49 @@
# YNNPACK Delegate for LiteRT

> [!WARNING]
> The YNNPACK delegate is **experimental** and under active development. Expect
> bugs and performance issues when using it.

The YNNPACK delegate allows LiteRT (formerly TensorFlow Lite) to offload
supported operators to YNNPACK.

YNNPACK aims to provide great flexibility with good performance.

## Delegate Provider Options

When using LiteRT tooling (e.g., benchmarks, evaluation tools) that link the
`ynnpack_delegate_provider`, the following command-line flags are exposed to
configure the YNNPACK delegate:

### Core Options

* **`--use_ynnpack=true|false`** (default: `false`):
Explicitly apply the YNNPACK delegate to the model.

* **`--num_threads=N`** (default: `0` or `1` depending on tool):
The number of threads to use for execution. Note that YNNPACK will only use
a thread pool for `num_threads > 1`. A value of `0` or `1` disables the
thread pool (single-threaded execution).
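
The thread-pool rule described for `--num_threads` can be restated as a short sketch. `UsesThreadPool` is a hypothetical helper written only for illustration. It is not part of the delegate's API, and it assumes nothing beyond the `num_threads > 1` condition stated above.

```cpp
// Hypothetical helper (NOT part of the YNNPACK delegate API). It only
// restates the documented rule: a thread pool is used when num_threads > 1,
// and values of 0 or 1 mean single-threaded execution.
bool UsesThreadPool(int num_threads) { return num_threads > 1; }
```

Under this rule, `--num_threads=4` would run with a thread pool, while `--num_threads=0` and `--num_threads=1` would both run single-threaded.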

### YNNPACK Specific Options

* **`--ynnpack_static_shape=true|false`** (default: `false`):
Make input shapes static instead of dynamic. Enabling this may improve
execution (`Invoke`) performance by allowing YNNPACK to optimize for fixed
shapes, but it makes model reshaping (`ResizeInputTensor`) much more
expensive.

* **`--ynnpack_fast_math=true|false`** (default: `false`):
Enable `YNN_FLAG_FAST_MATH`. This allows YNNPACK to use faster but
potentially less precise mathematical approximations.

* **`--ynnpack_consistent_arithmetic=true|false`** (default: `false`):
Enable `YNN_FLAG_CONSISTENT_ARITHMETIC`. YNNPACK will attempt to produce
numerically consistent results across all hardware that the **same build** of
YNNPACK runs on. It does not guarantee consistency across different builds,
and therefore not across different platforms, which necessarily use
different builds.

* **`--ynnpack_no_excess_precision=true|false`** (default: `false`):
Enable `YNN_FLAG_NO_EXCESS_PRECISION`. YNNPACK will not promote tensors to
higher precision as a performance optimization.
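
As a rough illustration of how the `--name=true|false` flags above map onto boolean settings, the following sketch parses them into a plain struct. `YnnpackOptionsSketch` and `ApplyFlag` are hypothetical names invented for this example; the actual delegate provider's parsing code and option types are not shown in this README and may differ (for instance, in which boolean spellings are accepted).

```cpp
#include <cstddef>
#include <string>

// Hypothetical options struct (an assumption for illustration). Field names
// mirror the flags documented above; every field defaults to false, matching
// the documented defaults.
struct YnnpackOptionsSketch {
  bool static_shape = false;
  bool fast_math = false;
  bool consistent_arithmetic = false;
  bool no_excess_precision = false;
};

// Hypothetical parser for a single "--name=true|false" argument.
// Returns true if the flag name and value were recognized and applied,
// false otherwise (opts is left unchanged in that case).
bool ApplyFlag(const std::string& arg, YnnpackOptionsSketch& opts) {
  const std::size_t eq = arg.find('=');
  if (eq == std::string::npos) return false;

  const std::string name = arg.substr(0, eq);
  const std::string value = arg.substr(eq + 1);
  if (value != "true" && value != "false") return false;
  const bool enabled = (value == "true");

  if (name == "--ynnpack_static_shape") {
    opts.static_shape = enabled;
  } else if (name == "--ynnpack_fast_math") {
    opts.fast_math = enabled;
  } else if (name == "--ynnpack_consistent_arithmetic") {
    opts.consistent_arithmetic = enabled;
  } else if (name == "--ynnpack_no_excess_precision") {
    opts.no_excess_precision = enabled;
  } else {
    return false;
  }
  return true;
}
```

With this sketch, calling `ApplyFlag("--ynnpack_fast_math=true", opts)` sets only `opts.fast_math`, and any flag that is not passed keeps its default of `false`.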