Add composite indexes on #__tasks for task runner performance #995

Open
vinespie wants to merge 1 commit into akeeba:main from vinespie:perf/tasks-table-indexes

@vinespie

Without indexes on (enabled, next_execution, priority) and (enabled, last_exit_code, next_execution), the task runner query performs a full table scan on every CRON tick. On deployments with ~1500 sites (~3400+ tasks), this caused 9ms full table scans (3402 rows examined) instead of sub-millisecond index lookups (20 rows).

Measured improvement: 30x faster query, 170x fewer rows examined.

Includes migration actions for existing installations via database:update.
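
The scan-versus-lookup difference described above can be reproduced in miniature. The following is only a sketch: it uses Python's built-in SQLite rather than MySQL, and the table layout, the index name `idx_tasks_due`, and the query itself are assumptions made for illustration, not the actual Panopticon schema or task runner query. Only the indexed column names come from this pull request.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for #__tasks; only the column names come from the PR.
conn.execute(
    """CREATE TABLE tasks (
           id INTEGER PRIMARY KEY,
           enabled INTEGER NOT NULL,
           last_exit_code INTEGER NOT NULL,
           next_execution TEXT,
           priority INTEGER NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO tasks (enabled, last_exit_code, next_execution, priority) "
    "VALUES (?, ?, ?, ?)",
    [(i % 2, 0, f"2026-06-{(i % 28) + 1:02d} 00:00:00", i % 10) for i in range(3400)],
)

# Assumed shape of a "which tasks are due?" query, not the real one.
PICK_DUE = (
    "SELECT id FROM tasks "
    "WHERE enabled = 1 AND next_execution <= ? "
    "ORDER BY priority LIMIT 20"
)
params = ("2026-06-10 13:49:00",)

plan_before = query_plan(conn, PICK_DUE, params)
conn.execute(
    "CREATE INDEX idx_tasks_due ON tasks (enabled, next_execution, priority)"
)
plan_after = query_plan(conn, PICK_DUE, params)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan reports a scan of the whole table; afterwards it reports a search through `idx_tasks_due`. A separate sort step for `ORDER BY priority` may still appear, because a range condition on `next_execution` prevents the index from delivering rows in priority order. On MySQL the equivalent check would be running `EXPLAIN` on the real query.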

@vinespie vinespie force-pushed the perf/tasks-table-indexes branch from c0f4074 to 3e6d0ac Compare June 10, 2026 13:49
Signed-off-by: Vincent Espié <v.espie@c3rb.fr>
@vinespie vinespie force-pushed the perf/tasks-table-indexes branch from 3e6d0ac to e29ef83 Compare June 10, 2026 13:57
@nikosdion

Member

I am not sure this is a net positive change. It is very counter-intuitive, so please let me explain. I am not trying to be pedantic, I am trying to be pragmatic – and explain what I saw during the design phase, and why I decided against indices on the table.

Yes, adding indices does improve read performance by a dramatic percentage (I would argue that you could do away with a single compound and two single-column indices by using WHERE EXISTS subclauses if the use case warranted the optimization – which, to be clear, it does not). However, the absolute value of the performance gain is very small and possibly close to statistical insignificance. We would need to know the sample size and standard deviation to figure out whether the observed ~0.008 seconds per read improvement is real or if it's a statistical outlier. Even if it's real, over the course of an entire day you gain at most 2 minutes of CPU time. That's a bucket of water in the ocean, as my physics professor would say.
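
To make the scale of that estimate concrete, here is a back-of-the-envelope sketch. The 0.008 second saving per read and the 2 minute daily ceiling are the figures quoted above; the number of reads per day is not stated in this thread, so the code only derives what read volume those two figures imply.

```python
SAVED_PER_READ_S = 0.008   # per-read improvement quoted in the thread
DAILY_CEILING_S = 2 * 60   # "at most 2 minutes of CPU time" per day
SECONDS_PER_DAY = 24 * 60 * 60


def daily_saving_seconds(reads_per_day, saved_per_read_s=SAVED_PER_READ_S):
    """Total time saved per day for a given number of indexed reads."""
    return reads_per_day * saved_per_read_s


def reads_implied_by(ceiling_s=DAILY_CEILING_S, saved_per_read_s=SAVED_PER_READ_S):
    """Number of reads per day at which the saving reaches the ceiling."""
    return ceiling_s / saved_per_read_s


implied_reads = reads_implied_by()
share_of_day = DAILY_CEILING_S / SECONDS_PER_DAY

print(f"reads per day implied by the 2 minute figure: {implied_reads:.0f}")
print(f"share of one core's day: {share_of_day:.2%}")
```

Under these figures, the 2 minute ceiling corresponds to roughly 15,000 indexed reads per day, and the saving amounts to about 0.14% of a single core's day.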

Moreover, this ignores the fact that this table is write-heavy. Adding two compound keys on a write-heavy table has a 10% to 30% impact on write performance. This includes INSERT, UPDATE, and DELETE. Remember how many times the tasks table gets updated? A task gets updated at least twice: when it's fetched from the queue, and when it finishes. Many longer-running tasks are updated far more times over their lifetime, e.g. all core and extension updates, backups, PHP file change scanner, core sums.

Indices are great when you have a read-heavy table, as is the case for most CMS. You create, update, and edit content orders of magnitude fewer times than you read it. Using extensive indices makes a lot of sense in this use case. A tasks queue table is perverse in that the exact opposite is true! You write to it a heck of a lot more than you read from it (always talking about reads that can benefit from an index; reading a single record is always done against its primary key which only has an index update impact on record creation and deletion, never on update, as the indexed field doesn't change on update).

You may want to monitor a few things on your server. First, monitor your iowait: if it went up after implementing this change, you are experiencing I/O contention due to the write-heavy nature of the tasks table. Run SHOW ENGINE INNODB STATUS and also check the performance_schema.table_io_waits_summary_by_index_usage table to see which indices cost you on writes versus earning their keep on reads. I believe you'll see that the two indices you added are killing your performance, unless you happen to have a fairly massive buffer pool and/or filesystem cache, i.e. if your server is over-provisioned for the task at hand. However, that would not be the typical environment we target for use with Panopticon. We usually see installations in resource-constrained environments, be it a virtual / shared server, or a container.
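
The performance_schema check mentioned above could be scripted along these lines. This is a hedged sketch: the helper name `index_io_query` and the example schema and table names are made up for illustration (the real table name depends on the installation's prefix, which replaces `#__`), and the `%s` placeholders assume a DB-API driver such as PyMySQL or mysqlclient. The column names are those of MySQL's `table_io_waits_summary_by_index_usage` table, whose timers are reported in picoseconds.

```python
def index_io_query(schema, table):
    """Build a per-index I/O summary query and its parameters.

    Returns (sql, params), ready for cursor.execute(sql, params) on a
    DB-API cursor that uses the %s parameter style.
    """
    sql = (
        "SELECT INDEX_NAME, COUNT_READ, COUNT_WRITE, "
        # performance_schema timers are in picoseconds; 1e9 ps = 1 ms.
        "SUM_TIMER_READ / 1e9 AS read_ms, "
        "SUM_TIMER_WRITE / 1e9 AS write_ms "
        "FROM performance_schema.table_io_waits_summary_by_index_usage "
        "WHERE OBJECT_SCHEMA = %s AND OBJECT_NAME = %s "
        "ORDER BY SUM_TIMER_WRITE DESC"
    )
    return sql, (schema, table)


# Hypothetical names; substitute the real database and prefixed table name.
sql, params = index_io_query("panopticon", "pnptc_tasks")
print(sql)
print(params)
```

Rows with a NULL `INDEX_NAME` represent table I/O that was not attributed to any index. Comparing the counters before and after adding the indexes, alongside iowait, gives a rough picture of whether they pay for themselves on a given server.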
