Determine Long-Running Background Jobs
Audience: Data Users
Content Summary: This page shows users how to identify background jobs that take too long to run.
Query Overview
Below is a SQL query, which will help determine background jobs that take too long to run.
Note: This query unions the job and archive table. Adjust the top interval appropriately for your needs.
Instructions
Run the following SQL query to determine background jobs that take too long to run.
WITH
--FOR INTERVAL USAGE ===> NOW() - INTERVAL '4 hours' AS start_dt
--FOR SPECIFIC DATETIME USAGE ===> TO_TIMESTAMP('2021-11-23 12:05:00', 'YYYY-MM-DD hh24:mi:ss')::timestamp without time zone at time zone 'EDT' AS start_dt
vars AS (
SELECT
NOW() AS end_dt,
NOW() - INTERVAL '7 days' AS start_dt
),
all_jobs_raw AS (
SELECT name, state, startedon, createdon, completedon, data FROM pgboss.job
UNION SELECT name, state, startedon, createdon, completedon, data FROM pgboss.archive -- COMMENT OUT THIS TABLE IF LESS THAN 12 HOURS AND RUNNING SLOW
),
all_jobs AS (
SELECT
CASE
WHEN name = 'automaticSubscription' AND data->>'profileId' IS NULL THEN 'autoSub - ByDataSource'
WHEN name = 'automaticSubscription' AND data->>'profileId' IS NOT NULL THEN 'autoSub - ByUser'
WHEN name ILIKE 'nativeSqlProfileRefresh_Snowflake%' THEN 'nativeSqlProfileRefresh_Snowflake%'
WHEN name ILIKE 'nativeSqlProfileRefresh_Redshift%' THEN 'nativeSqlProfileRefresh_Redshift%'
ELSE name
END AS name, state, startedon, completedon
FROM all_jobs_raw INNER JOIN vars ON TRUE
WHERE all_jobs_raw.createdon > vars.start_dt and all_jobs_raw.createdon < vars.end_dt
),
batches AS (
SELECT DISTINCT name, COUNT(*) AS job_count, AVG(completedon - startedon) AS average_time
FROM all_jobs GROUP BY name)
SELECT *, job_count * average_time AS total_time, end_dt - start_dt AS all_time
FROM batches
JOIN vars ON TRUE
ORDER BY total_time DESC;