Suboptimal (slow) "ugrep" behavior when used behind a pipe? #538

@jschleus

On an Apache web server, I sometimes search for specific patterns near the end of a large log file (roughly 3,000,000 lines). To speed up the search, I use the tac command (which prints a file's lines in reverse order) together with the option -m 1, which stops reading after the first matching line.

Here's an example searching explicitly for the last entry of the log file (previously determined for this example):

  1. Using "grep":
time grep "04/Apr/2026:23:59:57" access_log > /dev/null
real    0m0.204s
user    0m0.164s
sys     0m0.040s

time tac access_log | grep -m 1 "04/Apr/2026:23:59:57" > /dev/null
real    0m0.002s
user    0m0.001s
sys     0m0.001s

  2. Using "ugrep":
time ugrep "04/Apr/2026:23:59:57" access_log > /dev/null
real    0m0.092s
user    0m0.036s
sys     0m0.056s

time tac access_log | ugrep -m 1 "04/Apr/2026:23:59:57" > /dev/null
real    0m0.641s
user    0m0.387s
sys     0m0.527s

In this example, ugrep is roughly twice as fast as grep in the standard case (0.092s vs. 0.204s), but behind the pipe the whole command takes far longer (0.641s vs. 0.002s with grep), so the hoped-for speedup from tac does not materialize (quite the opposite).
Using the long option --max-count=1 instead of -m 1 shows the same behavior.

For the sake of completeness, here is the raw time consumed by the tac command alone:

time tac fresh-access_log.260404 > /dev/null
real    0m0.526s
user    0m0.361s
sys     0m0.164s

As a layman, I'd venture a somewhat bold guess from the data above: grep starts reading from the pipe immediately and exits after the first match, which also ends the process feeding the pipe (here tac), while ugrep first reads all (!) of the data and only then starts the search. That would fit the timings, since the ugrep pipeline (0.641s) takes about as long as tac alone (0.526s).
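
One way to check this guess independently of the log file might be a small probe with an endless producer. This is only a sketch under assumptions: probe_early_exit is a hypothetical helper name, "needle" is a made-up test string, and it relies on yes and timeout from GNU coreutils. Since yes never stops writing on its own, the pipeline can only finish if the consumer exits after its first match.

```shell
#!/bin/sh
# Probe (hypothetical helper): does a grep-like command stop reading
# its pipe after the first match when called with -m 1?
# $1 = name of the command to test, e.g. grep
probe_early_exit() {
  # "yes" writes the same line forever, so the pipeline ends only if
  # the consumer exits early and the producer then hits a closed pipe.
  # "timeout" aborts the whole pipeline after 5 seconds (exit status 124)
  # in case the consumer keeps reading instead.
  timeout 5 sh -c 'yes "needle" | "$0" -m 1 "needle"' "$1"
}

result=$(probe_early_exit grep)
status=$?
echo "output=$result status=$status"
```

With GNU grep this should return right away with status 0. A status of 124 would mean the timeout fired, i.e. the consumer was still reading. Passing ugrep instead of grep as the argument would run the same probe against ugrep; I have not included a result for that here.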

A similar behavior occurs when using cat, except that the times taken aren't quite as dramatically different, since cat is faster than tac.

But maybe I've overlooked something fundamental.

Labels: patch (a patch to fix an issue), problem (something isn't working due to a (minor) problem)