AI Workforce Exposure Dashboard: Methodology
Purpose
This document describes the data sources, transformations, assumptions, and limitations of the AI Workforce Exposure Dashboard. It is written for critical evaluation, not promotion. The dashboard estimates how exposed Australian occupations are to AI based on observed Claude usage patterns in the United States, projected onto Australian task descriptions via semantic matching.
The core question this dashboard attempts to answer: For each Australian occupation, what share of its tasks are currently being performed with AI assistance, weighted by how much time workers spend on each task? We do not measure this directly. We approximate it by semantically matching Australian task descriptions to US-derived O*NET tasks that have AI penetration scores from Claude usage data.
1. Data Sources
1.1 Australian Task Classifications (Primary Task Framework)
Source: ABS occupation task classifications (ANZSCO-based).
Contains 2,031 unique Australian tasks across 628 ANZSCO 6-digit occupations, forming 12,374 task-occupation pairs. Each task has a time allocation percentage representing the share of the working day spent on that task within its occupation. This is the primary structural framework for the dashboard — all exposure scores are built on top of these Australian tasks and their time weights.
1.2 Task Penetration (O*NET)
Source: Anthropic Economic Index (Massenkoff & McCrory, 2026).
Contains 17,998 O*NET tasks, each with a penetration score between 0 and 1.0. "Penetration" means the share of observed Claude usage that involved a given task, derived from approximately 2 million Claude conversations recorded between August and November 2025. A penetration of 1.0 does not mean all workers performing that task use AI; it means that every time this task appeared in the Claude usage sample, AI was involved. The sample is Claude users, who are by definition already using AI.
1.3 Automation vs. Augmentation by Task
Source: Anthropic Economic Index.
Contains task-level breakdowns into five usage categories:
Automation: directive (AI executes with minimal human input) and feedback loop (human iterates but AI does the core work)
Augmentation: task iteration (human does the core work, AI assists), validation (human checks AI output), and learning (human uses AI to build understanding)
These are fractions that sum to approximately 1.0 per task, with a residual category for conversations that could not be classified. Of the O*NET tasks matched to Australian tasks that have AI penetration > 0, approximately 61% have automation/augmentation breakdown data available.
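Collapsing the five categories into a single automation/augmentation split requires renormalising over the classified portion (excluding the unclassified residual). A minimal sketch in Python; the field names are illustrative, not the dataset's actual schema:

```python
def auto_aug_shares(fractions):
    """Collapse the five usage categories into automation vs. augmentation
    shares, renormalising over the classified portion only.

    `fractions` maps category name -> fraction of conversations; the values
    sum to ~1.0 with any unclassified residual omitted. Category names here
    are illustrative, not the dataset's actual field names.
    """
    automation = fractions.get("directive", 0) + fractions.get("feedback_loop", 0)
    augmentation = (fractions.get("task_iteration", 0)
                    + fractions.get("validation", 0)
                    + fractions.get("learning", 0))
    classified = automation + augmentation
    if classified == 0:
        return None  # no auto/aug data for this task
    return automation / classified, augmentation / classified

auto, aug = auto_aug_shares({
    "directive": 0.20, "feedback_loop": 0.15,
    "task_iteration": 0.40, "validation": 0.15, "learning": 0.05,
})
# automation share ≈ 0.368, augmentation share ≈ 0.632 (residual 0.05 excluded)
```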
1.4 Australian Occupation Profiles
Source: Australian Bureau of Statistics, Labour Force Survey, November 2025.
Contains 1,236 ANZSCO codes (358 at the 4-digit level, 878 at the 6-digit level). For each occupation:
Employment count, part-time percentage, female percentage
Median weekly earnings, median age, employment growth
State/territory distribution (percentage in each of NSW, VIC, QLD, SA, WA, TAS, NT, ACT)
Age profile across 8 age bands
Education profile across 7 levels (Postgraduate through Year 10 and below)
Top industries (ranked)
Some 6-digit occupations have suppressed values ("N/A" or "<50") due to small cell sizes in the Labour Force Survey.
1.5 Theoretical Automation Potential
Source: Eloundou et al. (2023), "GPTs are GPTs," distributed via the Anthropic Economic Index.
Contains a ChanceAuto field: a theoretical automation potential score on a 0-100 scale per SOC code, based on expert and GPT-4 assessments of which tasks could be automated by large language models. This data is from 2023 and reflects GPT-4-era capability assessments. It serves as a comparison point against observed (2025) exposure, not as a primary metric.
2. Data Pipeline
┌────────────────────────────────┐    ┌────────────────────────────────┐
│ Australian Tasks (ABS/ANZSCO)  │    │ O*NET Tasks (17,998)           │
│ 2,031 unique tasks             │    │ with AI penetration scores     │
│ 628 ANZSCO 6-digit occupations │    │ from Anthropic Economic Index  │
│ 12,374 task-occupation pairs   │    │                                │
│ with time allocation %         │    │                                │
└───────────────┬────────────────┘    └───────────────┬────────────────┘
                |                                     |
                v                                     v
┌─────────────────────────────────────────────────┐
│ STAGE 1: TF-IDF Word Similarity │
│ For each AU task, find top-10 O*NET candidates │
│ from the full 17,998 task list │
└─────────────────────┬───────────────────────────┘
|
v
┌─────────────────────────────────────────────────┐
│ STAGE 2: LLM Semantic Matching (GPT-5.4-mini) │
│ Pick best match from top-10 candidates │
│ Assign confidence level │
└─────────────────────┬───────────────────────────┘
|
v
┌─────────────────────────────────────────────────┐
│ Matched Task Pairs with Confidence │
│ exact: 87 | high: 1,728 | medium: 182 │
│ low: 10 | none/no-match: 24 │
│ 89% high or exact confidence │
└─────────────────────┬───────────────────────────┘
|
v
┌─────────────────────────────────────────────────┐
│ ANZSCO 6-digit Exposure Score │
│ = Σ(task_penetration × time_allocation │
│ × confidence_weight) │
│ │
│ Confidence weights: │
│ exact/high = 1.0 │
│ medium = 0.5 │
│ low = 0.25 │
│ none = 0 │
└─────────────────────┬───────────────────────────┘
|
┌───────────────┼───────────────┐
v v v
┌───────────┐ ┌────────────┐ ┌────────────────┐
│ Auto/Aug │ │ 4-digit │ │ Demographic │
│ from │ │ scores │ │ breakdowns │
│ matched │ │ (avg of │ │ (state,gender, │
│ O*NET │ │ 6-digit │ │ education) │
│ task │ │ children) │ │ at 4-digit │
└───────────┘ └────────────┘ └────────────────┘
3. Transformations
3.1 Semantic Task Matching (Australian → O*NET)
This is the core methodological step. Rather than chaining occupational classification crosswalks (SOC → ISCO → OSCA → ANZSCO), we match at the task level: each of the 2,031 unique Australian task descriptions is semantically matched to the closest O*NET task description.
Stage 1 — TF-IDF candidate retrieval. For each Australian task, we compute TF-IDF word similarity against the full set of 17,998 O*NET tasks and retrieve the top-10 candidates. This is a cheap, fast retrieval step to narrow the search space.
Stage 2 — LLM semantic matching. GPT-5.4-mini evaluates the top-10 candidates and selects the best semantic match, assigning a confidence level to the match quality.
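The two-stage pipeline can be sketched as follows. This is an illustrative reconstruction, not the repository's actual code: Stage 1 is shown as a small self-contained TF-IDF retrieval (the production pipeline may use a library implementation), and Stage 2 (GPT-5.4-mini selection) is only indicated in comments because it depends on API access.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build L2-normalised TF-IDF vectors (term -> weight) for a list of documents."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(docs)
    # Document frequency: number of documents each term appears in.
    df = Counter(t for toks in tokenised for t in set(toks))
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vec = {t: count * math.log(n / df[t]) for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def top_candidates(au_task, onet_tasks, k=10):
    """Stage 1: rank O*NET task descriptions by TF-IDF cosine similarity to
    one Australian task and return the indices of the top-k candidates.
    Stage 2 (not shown) would pass these k descriptions to the LLM and ask it
    to pick the best semantic match and assign a confidence level."""
    vecs = tfidf_vectors([au_task] + onet_tasks)
    query, candidates = vecs[0], vecs[1:]
    scores = [sum(query.get(t, 0.0) * w for t, w in v.items()) for v in candidates]
    return sorted(range(len(onet_tasks)), key=lambda i: -scores[i])[:k]
```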
Match quality distribution:
Confidence        Count    Share
Exact                87     4.3%
High              1,728    85.1%
Medium              182     9.0%
Low                  10     0.5%
None / no match      24     1.2%
89% of Australian tasks are matched at high or exact confidence. 454 of the 2,031 tasks (22%) matched to O*NET tasks that have AI penetration greater than zero.
3.2 Occupation Exposure Scores
For each ANZSCO 6-digit occupation, the exposure score is:
exposure = Σ (task_penetration × time_allocation × confidence_weight)
Where the sum is over all tasks assigned to that occupation, and:
task_penetration is the AI penetration score of the matched O*NET task (0 to 1.0)
time_allocation is the Australian time allocation for that task (fraction of the working day)
confidence_weight discounts low-quality matches: exact / high = 1.0, medium = 0.5, low = 0.25, none = 0
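In code, the score for one occupation is a straightforward weighted sum. A sketch with illustrative field names (the actual data schema may differ):

```python
CONFIDENCE_WEIGHTS = {"exact": 1.0, "high": 1.0, "medium": 0.5, "low": 0.25, "none": 0.0}

def occupation_exposure(tasks):
    """Exposure score for one ANZSCO 6-digit occupation.

    `tasks` is a list of dicts with keys `penetration` (0-1, from the matched
    O*NET task), `time_allocation` (fraction of the working day), and
    `confidence` (match confidence level). Field names are illustrative.
    """
    return sum(
        t["penetration"] * t["time_allocation"] * CONFIDENCE_WEIGHTS[t["confidence"]]
        for t in tasks
    )

occupation_exposure([
    {"penetration": 0.8, "time_allocation": 0.30, "confidence": "high"},    # 0.24
    {"penetration": 0.4, "time_allocation": 0.50, "confidence": "medium"},  # 0.10
    {"penetration": 0.9, "time_allocation": 0.20, "confidence": "none"},    # 0.00
])  # ≈ 0.34
```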
This is genuinely time-weighted using Australian time allocation data, unlike the previous crosswalk method which averaged SOC-level scores with no task-level weighting. The workforce-weighted average exposure across the Australian workforce is approximately 25.8%, compared to approximately 6% under the old crosswalk method. The increase is primarily because the task-level matching avoids the dilution effect of averaging across loosely-related SOC occupations.
3.3 Automation vs. Augmentation Ratios
For each ANZSCO occupation, the automation/augmentation breakdown flows directly from the matched O*NET task's auto/aug data.
Of the 471 distinct O*NET tasks matched to Australian tasks with AI penetration > 0, 286 (61%) have automation/augmentation breakdown data. Occupations where all AI-active tasks have auto/aug data show a split bar chart; those where some or all tasks lack this data show a plain bar indicating the data limitation.
3.4 ANZSCO 4-Digit Scores
ANZSCO 4-digit exposure scores are computed as the simple average of their 6-digit children's scores. These 4-digit scores are used for all demographic breakdowns to avoid double-counting (since 6-digit codes are children of 4-digit parents, summing both would overcount workers).
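A sketch of the roll-up, assuming 6-digit scores keyed by ANZSCO code strings; the parent code is the first four digits, and the example codes below are illustrative:

```python
from collections import defaultdict

def four_digit_scores(six_digit):
    """Average 6-digit exposure scores up to their 4-digit parents.

    `six_digit` maps a 6-digit ANZSCO code (string) to its exposure score;
    the parent is the first four digits of the code.
    """
    children = defaultdict(list)
    for code, score in six_digit.items():
        children[code[:4]].append(score)
    return {parent: sum(s) / len(s) for parent, s in children.items()}

# Illustrative codes: 261311 and 261312 roll up to parent 2613.
four_digit_scores({"261311": 0.40, "261312": 0.50, "254111": 0.10})
```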
3.5 Demographic Breakdowns (State, Gender, Education)
Computed using ANZSCO 4-digit codes only.
For each ANZSCO 4-digit code:
1. Assign one exposure score (average of its 6-digit children).
2. Distribute workers by demographic attribute. Example for gender: female_workers = employed × female_pct / 100.
3. Compute employment-weighted average exposure: sum(exposure × workers) / sum(workers).
4. For the automation/augmentation ratio within each demographic bucket: sum(auto_ratio × exposure × workers) / sum((auto_ratio + aug_ratio) × exposure × workers).
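Steps 1 to 3 for the gender breakdown can be sketched as follows (field names are illustrative; step 4's auto/aug ratio is omitted for brevity):

```python
def weighted_exposure(rows):
    """Employment-weighted average exposure over a set of 4-digit codes.

    Each row carries `employed`, a demographic percentage (here `female_pct`),
    and the 4-digit `exposure` score. Field names are illustrative.
    """
    workers = [r["employed"] * r["female_pct"] / 100 for r in rows]
    total = sum(workers)
    if total == 0:
        return None  # no workers in this demographic bucket
    return sum(w * r["exposure"] for w, r in zip(workers, rows)) / total

weighted_exposure([
    {"employed": 10_000, "female_pct": 40, "exposure": 0.30},  # 4,000 workers
    {"employed": 5_000, "female_pct": 80, "exposure": 0.10},   # 4,000 workers
])  # ≈ 0.20 (equal weights, midpoint of 0.30 and 0.10)
```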
The dashboard covers 13.3 million workers across 1,024 ANZSCO 6-digit occupations. The total Australian workforce from ABS Labour Force data (at the 4-digit level) is 14.4 million. The 1.1 million worker gap (7.7%) represents occupations that exist at the 4-digit level in ABS profiles but could not be matched to 6-digit task data. These unmapped workers are excluded from all demographic breakdowns and maps.
3.6 Theoretical Automation Potential
For comparison purposes, theoretical automation potential scores from Eloundou et al. (2023) are mapped to ANZSCO occupations via the SOC-to-ANZSCO crosswalk chain. These are displayed alongside observed exposure for context.
4. Limitations
4.1 Fundamental
Claude-only measurement. Exposure scores are based solely on Claude (Anthropic) usage. They capture nothing from ChatGPT, GitHub Copilot, Midjourney, Gemini, domain-specific AI tools (medical imaging, legal research platforms, CAD tools), robotic process automation, or physical robotics. For occupations where competing AI tools dominate (e.g., Copilot for software engineering, Midjourney for visual design), the true AI exposure is likely substantially higher than these scores suggest.
US AI usage patterns applied to Australian task structures. The AI penetration scores reflect how US-based Claude users interact with the tool. We are matching Australian task descriptions to these US-derived scores. Even where the task match is semantically correct, the rate of AI adoption for that task in Australia may differ from the US due to different work practices, regulatory environments, technology adoption rates, and industry composition.
Point-in-time snapshot. The Claude usage data covers August through November 2025. AI capabilities and adoption patterns are changing rapidly. These scores are already a historical measurement.
4.2 Semantic Matching
LLM matching introduces a new source of error. The GPT-5.4-mini model picks the "best" O*NET match for each Australian task, but an imperfect match can inflate or deflate individual task scores. An Australian task matched to an O*NET task with high AI penetration, when the true closest task has low penetration (or vice versa), distorts the occupation's overall score. This is a qualitatively different kind of error from the crosswalk averaging: it operates at the individual task level rather than the occupation level.
Confidence weighting is an arbitrary choice. The decision to weight medium-confidence matches at 0.5x and low-confidence matches at 0.25x is a judgement call, not empirically derived. Different weighting schemes would produce different scores. The choice of cutoffs (why 0.5 and not 0.6 or 0.4?) is not grounded in any calibration exercise.
11% of tasks have medium, low, or no confidence. While 89% of matches are high or exact confidence, the remaining 11% (216 tasks) include matches that may be substantively wrong. These tasks still contribute to occupation scores (at reduced weight for medium and low, zero for none), and the reduced weight may not accurately reflect the actual match quality.
4.3 Coverage
Automation/augmentation data only covers 61% of AI-active tasks. Of the O*NET tasks matched to Australian tasks that have AI penetration > 0, only 286 of 471 have automation/augmentation breakdowns. Occupations relying on the remaining 39% of tasks cannot show a meaningful auto/aug split.
Penetration is not a deployment rate. A task penetration score of 1.0 means AI was involved every time that task appeared in the Claude conversation sample. It does not mean 100% of workers performing that task use AI. The sample is Claude users, who are by definition already using AI.
Only 22% of Australian tasks match to AI-active O*NET tasks. 454 of 2,031 Australian tasks matched to O*NET tasks with penetration > 0. The remaining 78% contribute zero to exposure scores. This could mean these tasks genuinely have no AI involvement (via Claude), or it could mean the matching failed to find the right O*NET equivalent.
4.4 Data Quality
ABS suppression. Employment, earnings, and growth figures for some 6-digit ANZSCO occupations are suppressed ("N/A" or "<50") due to small cell sizes in the Labour Force Survey. These occupations are included where exposure data exists but lack complete demographic characterisation.
Theoretical scores are dated. The Eloundou et al. (2023) theoretical automation scores are based on GPT-4-era assessments. AI capabilities have advanced materially since then.
5. What the Dashboard Does Not Do
It does not predict job losses or job creation.
It does not measure productivity gains from AI adoption.
It does not assess the quality or accuracy of AI-assisted task completion.
It does not distinguish between casual/experimental AI use and deep workflow integration.
It does not account for organisational, regulatory, or infrastructure barriers to AI adoption in Australia.
It does not incorporate any AI tools other than Claude.
6. What Would Improve This Analysis
1. Direct Australian AI usage data rather than US-derived scores mapped onto Australian tasks.
2. Multi-tool coverage — incorporating usage data from ChatGPT, Copilot, and domain-specific AI tools.
3. Calibration of confidence weights — empirically testing whether 0.5x for medium matches is too generous or too conservative.
4. Human validation of semantic matches — spot-checking a random sample of task matches, particularly the medium and low confidence ones, to estimate the real error rate.
5. Longitudinal tracking — comparing exposure over time as AI capabilities and adoption evolve.
6. Employer survey validation — ground-truthing these estimates against what Australian employers and workers actually report about AI use in their roles.
7. Full automation/augmentation coverage — obtaining auto/aug data for the 39% of AI-active tasks currently missing it.
7. Reproduction
All transformation scripts are in the dashboard repository. The pipeline runs as follows:
1. match_tasks.py — Performs the two-stage semantic matching (TF-IDF retrieval + GPT-5.4-mini selection) between Australian tasks and O*NET tasks.
2. build_exposure.py — Computes ANZSCO 6-digit exposure scores using matched task penetration, time allocation, and confidence weights.
3. compute_all_demographics.py — Computes state, gender, education, and age breakdowns using ABS occupation profile data at the 4-digit ANZSCO level.
4. compute_state_scores.py — Computes state-level aggregate exposure metrics.
Input data files are listed in Section 1. The semantic matching step depends on GPT-5.4-mini API access and is not fully deterministic due to LLM sampling.
8. Citations
Massenkoff, M. & McCrory, P. (2026). "Labor Market Impacts of AI: A New Measure and Early Evidence." Anthropic Research. https://www.anthropic.com/research/labor-market-impacts
Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). "GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models." arXiv:2303.10130.
Australian Bureau of Statistics (2025). "Occupation profiles data — November 2025 (Revised)." ABS Labour Force Survey.
Australian Bureau of Statistics. ANZSCO — Australian and New Zealand Standard Classification of Occupations. Occupation task classifications.
Anthropic Economic Index. HuggingFace dataset. https://huggingface.co/datasets/Anthropic/EconomicIndex