
VRscores: Bulk downloads

Use is subject to the VRscores Terms of Use: noncommercial only; no governmental/quasi-governmental or political activity.

This beta release contains bulk VRscores panels at the employer, occupation, industry, and metro (MSA) levels, published on Harvard Dataverse. Panels are constructed from the 2024 ingest and cover 2012–2024. Read our methodology for field definitions and sourcing.

Workforce scale

24.5M workers

Unique employees tracked over a decade.

Employer coverage

534K+ employers

Unique employers (VRIDs; VRscores employer identifiers).

Industry depth

1,000+ industries

Six-digit coverage with NAICS2 rollups supplied in exports.

Time span

2012–2024

Annual, unbalanced panel.

Available panels

Download ready-to-use Parquet files or preview them on Harvard Dataverse.

VRID (employer) × year

Employer-Year Panel

Beta

2012-2024 employer (VRID) panel. The full dataset contains 6.26M employer-year observations, measuring the partisanship of 24.5M unique workers and more than 534K unique employers.

Years: 2012-2024
Rows: 6.26M employer-years (≈534K unique employers)

SOC (occupation) × year

Occupation-Year Panel

Beta

2012-2024 occupation aggregates spanning 256 6-digit SOC groups (4,979 rows).

Years: 2012-2024
Rows: 4,979 occupation-years

NAICS × year

Industry-Year Panel

Beta

2012-2024 NAICS (6-digit) panel with 13,131 rows.

Years: 2012-2024
Rows: 13,131 NAICS-years (≈1,010/year)

Metropolitan statistical area × year

MSA-Year Panel

Beta

2012-2024 metro (MSA) panel across 366 MSAs.

Years: 2012-2024
Rows: 4,758 metro-years (366/year)

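As a quick sanity check, the row counts above can be related to the 13 annual waves (2012–2024). The sketch below only re-derives averages from the rounded figures quoted on this page; the variable names are illustrative rather than fields in the files, and the per-employer average is an upper bound because the page reports "534K+" employers.

```python
# Rounded figures quoted on this page; names are illustrative, not file fields.
YEARS = 2024 - 2012 + 1  # 13 annual waves

panels = {
    "employer": {"rows": 6_260_000, "units": 534_000},
    "industry": {"rows": 13_131, "units": None},
    "msa": {"rows": 4_758, "units": 366},
}

# Average number of rows contributed by each calendar year.
rows_per_year = {name: p["rows"] / YEARS for name, p in panels.items()}

# Average number of years in which an employer appears (unbalanced panel).
avg_years_per_employer = panels["employer"]["rows"] / panels["employer"]["units"]

print(round(rows_per_year["industry"]))   # 1010
print(rows_per_year["msa"])               # 366.0
print(round(avg_years_per_employer, 1))   # 11.7
```

An average of roughly 11.7 out of 13 years per employer is what an unbalanced panel looks like in practice: many employers enter or leave the data partway through the period.
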
Quick start: preview files before downloading

Panel files are sizeable (employer-year Parquet slices approach 200 MB each). Inspect the headers before streaming the full payload:

# Check size & content type
curl -I https://dataverse.harvard.edu/api/access/datafile/FILE_ID

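# Peek at the first lines (readable text only for CSV/tab files; Parquet is binary)
# -s silences the progress meter, -L follows redirects
curl -sL "https://dataverse.harvard.edu/api/access/datafile/FILE_ID" | head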

Replace FILE_ID with the numeric identifier listed on Dataverse. The same ID works for Parquet (format=default) and CSV (format=original) downloads.
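If you script several downloads, the URL pattern above can be wrapped in a small helper. This is a minimal sketch: `datafile_url` and the ID `1234567` are made up for illustration, and only the `format=original` parameter shown elsewhere on this page is used.

```python
from urllib.parse import urlencode

BASE = "https://dataverse.harvard.edu/api/access/datafile"


def datafile_url(file_id: int, original: bool = False) -> str:
    """Build an access URL; original=True appends ?format=original (the uploaded CSV)."""
    url = f"{BASE}/{file_id}"
    if original:
        url += "?" + urlencode({"format": "original"})
    return url


# 1234567 is a placeholder, not a real VRscores file ID.
parquet_url = datafile_url(1234567)
csv_url = datafile_url(1234567, original=True)
print(parquet_url)
print(csv_url)
```
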

DuckDB (Parquet)

duckdb <<'SQL'
INSTALL httpfs;
LOAD httpfs;
SET enable_progress_bar = true;
-- Inspect a sample without downloading the full file
SELECT * FROM read_parquet('https://dataverse.harvard.edu/api/access/datafile/FILE_ID', hive_partitioning = 0) LIMIT 10;
-- Persist locally if you need everything (state the output format explicitly)
COPY (SELECT * FROM read_parquet('https://dataverse.harvard.edu/api/access/datafile/FILE_ID')) TO 'employer_year.parquet' (FORMAT PARQUET);
SQL

Python + Polars (Parquet)

import polars as pl

scan = pl.scan_parquet("https://dataverse.harvard.edu/api/access/datafile/FILE_ID")
print(scan.limit(10).collect())  # preview

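# Write selected columns locally (column names as listed in the employer-year codebook)
subset = scan.select([
    "vrid",
    "year",
    "company_name",
    "democrat_pct_imp",
    "republican_pct_imp",
])
# sink_parquet streams the result to disk instead of collecting it in memory first
subset.sink_parquet("employer_year_subset.parquet")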

CSV access for tabular ingests

# Dataverse stores CSV uploads as .tab; append ?format=original
curl -L 'https://dataverse.harvard.edu/api/access/datafile/FILE_ID?format=original' -o msa_year.csv

Dataverse exports use the same guardrails as the explorer (employers with fewer than 25 matched workers are excluded unless noted otherwise).

Codebooks

Preview the column dictionaries for each dataset before downloading. Each panel's fields and definitions are listed below.

Employer-Year Panel

Full column dictionary for the VRID-year aggregate (2012-2024).

| Column | Type | Description |
| --- | --- | --- |
| `vrid` | string | VRscores employer identifier (VRID). |
| `year` | integer | Calendar year. |
| `company_name` | string | Name of employer. |
| `employee_count` | integer | Number of unique workers in the company-year (minimum 5). |
| `avg_match_quality` | float | Average match quality across all matched workers (higher values indicate a more probable match). |
| `dem_workers_raw` | float | Workers with Democratic affiliation based solely on L2 voter registrations (raw counts). |
| `rep_workers_raw` | float | Workers with Republican affiliation based solely on L2 voter registrations (raw counts). |
| `other_workers_raw` | float | Matched workers whose L2 record is neither Democratic nor Republican (raw counts). |
| `party_known_workers_raw` | float | Workers with any L2 party assignment (Democratic, Republican, or other). |
| `dem_workers_imp` | float | Imputed Democratic worker count (we are able to impute most, but not all, independents/unknown). |
| `rep_workers_imp` | float | Imputed Republican worker count (we are able to impute most, but not all, independents/unknown). |
| `other_workers_imp` | float | Workers who are not registered Democrat or Republican and for whom we could not confidently impute partisanship (`max(employee_count - dem_workers_imp - rep_workers_imp, 0)`). |
| `democrat_pct_raw` | float | Democratic share using raw L2 counts (`dem_workers_raw / employee_count`). |
| `republican_pct_raw` | float | Republican share using raw L2 counts. |
| `nonpartisan_pct_raw` | float | Nonpartisan share using raw L2 counts. |
| `democrat_pct_two_party_raw` | float | Democratic share among two-party workers from raw L2 counts (`dem_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `republican_pct_two_party_raw` | float | Republican share among two-party workers from raw L2 counts (`rep_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `two_party_margin_raw` | float | Republican minus Democratic share among two-party workers (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / (dem_workers_raw + rep_workers_raw)`). |
| `overall_margin_raw` | float | Republican minus Democratic share of the full workforce (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / employee_count`). |
| `democrat_pct_imp` | float | Democratic share after imputing independents/unknown (`dem_workers_imp / employee_count`; registered partisans remain unchanged). |
| `republican_pct_imp` | float | Republican share after imputing independents/unknown (`rep_workers_imp / employee_count`; registered partisans remain unchanged). |
| `nonpartisan_pct_imp` | float | Nonpartisan share after imputing independents/unknown (`other_workers_imp / employee_count`). |
| `democrat_pct_two_party_imp` | float | Democratic share among imputed two-party workers (`dem_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `republican_pct_two_party_imp` | float | Republican share among imputed two-party workers (`rep_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `two_party_margin_imp` | float | Imputed two-party margin (`(rep_workers_imp - dem_workers_imp) / (dem_workers_imp + rep_workers_imp)`). |
| `overall_margin_imp` | float | Imputed overall margin (`(rep_workers_imp - dem_workers_imp) / employee_count`). |
| `political_diversity_raw` | float | One minus the sum of squared raw partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `political_diversity_imp` | float | One minus the sum of squared imputed partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `effective_parties_raw` | float | Effective number of parties based on raw shares (inverse Herfindahl index). |
| `effective_parties_imp` | float | Effective number of parties based on imputed shares (inverse Herfindahl index). |
| `latest_processed_at` | string | ISO 8601 timestamp indicating when the company-year was last processed. |

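To make the derived columns concrete, the sketch below recomputes the imputed measures for one made-up employer-year from `employee_count`, `dem_workers_imp`, and `rep_workers_imp`, following the formulas in the codebook. Two things are assumptions rather than documented facts: that the diversity index runs over exactly three shares (Democratic, Republican, nonpartisan), and that shares are stored as fractions between 0 and 1. The function name `imputed_measures` is illustrative.

```python
def imputed_measures(employee_count, dem_workers_imp, rep_workers_imp):
    """Recompute the *_imp columns from three inputs, following the codebook formulas."""
    other_workers_imp = max(employee_count - dem_workers_imp - rep_workers_imp, 0)
    two_party = dem_workers_imp + rep_workers_imp
    # Assumption: the diversity index uses these three shares.
    shares = [x / employee_count
              for x in (dem_workers_imp, rep_workers_imp, other_workers_imp)]
    herfindahl = sum(p * p for p in shares)
    return {
        "other_workers_imp": other_workers_imp,
        "democrat_pct_imp": shares[0],
        "republican_pct_imp": shares[1],
        "nonpartisan_pct_imp": shares[2],
        # Two-party measures are undefined when there are no two-party workers.
        "democrat_pct_two_party_imp": dem_workers_imp / two_party if two_party else None,
        "two_party_margin_imp": (rep_workers_imp - dem_workers_imp) / two_party if two_party else None,
        "overall_margin_imp": (rep_workers_imp - dem_workers_imp) / employee_count,
        "political_diversity_imp": 1 - herfindahl,
        "effective_parties_imp": 1 / herfindahl,
    }


# Made-up example: 200 workers, 100 imputed Democratic, 60 imputed Republican.
row = imputed_measures(employee_count=200, dem_workers_imp=100, rep_workers_imp=60)
print(row["two_party_margin_imp"])                # -0.25
print(round(row["political_diversity_imp"], 2))   # 0.62
print(round(row["effective_parties_imp"], 2))     # 2.63
```

Negative margins indicate a Democratic lean, since both margin columns are defined as Republican minus Democratic.
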
Occupation-Year Panel

SOC-level column dictionary for the occupation-year panel (2012-2024).

| Column | Type | Description |
| --- | --- | --- |
| `onet_code` | string | O*NET SOC occupation code at the occupation-year aggregation level. |
| `onet_title` | string | Occupation title corresponding to the O*NET SOC code. |
| `employee_count` | integer | Number of unique workers in the occupation-year (minimum 5). |
| `rep_workers_raw` | float | Workers with Republican affiliation based solely on L2 voter registrations (raw counts). |
| `other_workers_raw` | float | Matched workers whose L2 record is neither Democratic nor Republican (raw counts). |
| `party_known_workers_raw` | float | Workers with any L2 party assignment (Democratic, Republican, or other). |
| `dem_workers_imp` | float | Imputed Democratic worker count (we are able to impute most, but not all, independents/unknown). |
| `rep_workers_imp` | float | Imputed Republican worker count (we are able to impute most, but not all, independents/unknown). |
| `other_workers_imp` | float | Workers who are not registered Democrat or Republican and for whom we could not confidently impute partisanship (`max(employee_count - dem_workers_imp - rep_workers_imp, 0)`). |
| `avg_match_quality` | float | Average match quality across all matched workers (higher values indicate a more probable match). |
| `democrat_pct_raw` | float | Democratic share using raw L2 counts (`dem_workers_raw / employee_count`). |
| `republican_pct_raw` | float | Republican share using raw L2 counts. |
| `nonpartisan_pct_raw` | float | Nonpartisan share using raw L2 counts. |
| `democrat_pct_two_party_raw` | float | Democratic share among two-party workers from raw L2 counts (`dem_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `republican_pct_two_party_raw` | float | Republican share among two-party workers from raw L2 counts (`rep_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `two_party_margin_raw` | float | Republican minus Democratic share among two-party workers (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / (dem_workers_raw + rep_workers_raw)`). |
| `overall_margin_raw` | float | Republican minus Democratic share of the full workforce (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / employee_count`). |
| `democrat_pct_imp` | float | Democratic share after imputing independents/unknown (`dem_workers_imp / employee_count`; registered partisans remain unchanged). |
| `republican_pct_imp` | float | Republican share after imputing independents/unknown (`rep_workers_imp / employee_count`; registered partisans remain unchanged). |
| `nonpartisan_pct_imp` | float | Nonpartisan share after imputing independents/unknown (`other_workers_imp / employee_count`). |
| `democrat_pct_two_party_imp` | float | Democratic share among imputed two-party workers (`dem_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `republican_pct_two_party_imp` | float | Republican share among imputed two-party workers (`rep_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `two_party_margin_imp` | float | Imputed two-party margin (`(rep_workers_imp - dem_workers_imp) / (dem_workers_imp + rep_workers_imp)`). |
| `overall_margin_imp` | float | Imputed overall margin (`(rep_workers_imp - dem_workers_imp) / employee_count`). |
| `political_diversity_raw` | float | One minus the sum of squared raw partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `political_diversity_imp` | float | One minus the sum of squared imputed partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `effective_parties_raw` | float | Effective number of parties based on raw shares (inverse Herfindahl index). |
| `effective_parties_imp` | float | Effective number of parties based on imputed shares (inverse Herfindahl index). |
| `latest_processed_at` | string | ISO 8601 timestamp indicating when the occupation-year was last processed. |

Industry-Year Panel

NAICS-level column dictionary for the industry-year panel (2012-2024).

| Column | Type | Description |
| --- | --- | --- |
| `naics_code` | string | Primary NAICS code. |
| `naics_desc` | string | Text description for the NAICS code. |
| `employee_count` | integer | Number of unique workers in the NAICS-year group (minimum 5). |
| `dem_workers_raw` | float | Workers with Democratic affiliation based solely on L2 voter registrations (raw counts). |
| `rep_workers_raw` | float | Workers with Republican affiliation based solely on L2 voter registrations (raw counts). |
| `other_workers_raw` | float | Matched workers whose L2 record is neither Democratic nor Republican (raw counts). |
| `party_known_workers_raw` | float | Workers with any L2 party assignment (Democratic, Republican, or other). |
| `dem_workers_imp` | float | Imputed Democratic worker count (we are able to impute most, but not all, independents/unknown). |
| `rep_workers_imp` | float | Imputed Republican worker count (we are able to impute most, but not all, independents/unknown). |
| `other_workers_imp` | float | Workers who are not registered Democrat or Republican and for whom we could not confidently impute partisanship (`max(employee_count - dem_workers_imp - rep_workers_imp, 0)`). |
| `avg_match_quality` | float | Average match quality across all matched workers (higher values indicate a more probable match). |
| `democrat_pct_raw` | float | Democratic share using raw L2 counts (`dem_workers_raw / employee_count`). |
| `republican_pct_raw` | float | Republican share using raw L2 counts. |
| `nonpartisan_pct_raw` | float | Nonpartisan share using raw L2 counts. |
| `democrat_pct_two_party_raw` | float | Democratic share among two-party workers from raw L2 counts (`dem_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `republican_pct_two_party_raw` | float | Republican share among two-party workers from raw L2 counts (`rep_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `two_party_margin_raw` | float | Republican minus Democratic share among two-party workers (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / (dem_workers_raw + rep_workers_raw)`). |
| `overall_margin_raw` | float | Republican minus Democratic share of the full workforce (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / employee_count`). |
| `democrat_pct_imp` | float | Democratic share after imputing independents/unknown (`dem_workers_imp / employee_count`; registered partisans remain unchanged). |
| `republican_pct_imp` | float | Republican share after imputing independents/unknown (`rep_workers_imp / employee_count`; registered partisans remain unchanged). |
| `nonpartisan_pct_imp` | float | Nonpartisan share after imputing independents/unknown (`other_workers_imp / employee_count`). |
| `democrat_pct_two_party_imp` | float | Democratic share among imputed two-party workers (`dem_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `republican_pct_two_party_imp` | float | Republican share among imputed two-party workers (`rep_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `two_party_margin_imp` | float | Imputed two-party margin (`(rep_workers_imp - dem_workers_imp) / (dem_workers_imp + rep_workers_imp)`). |
| `overall_margin_imp` | float | Imputed overall margin (`(rep_workers_imp - dem_workers_imp) / employee_count`). |
| `political_diversity_raw` | float | One minus the sum of squared raw partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `political_diversity_imp` | float | One minus the sum of squared imputed partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `effective_parties_raw` | float | Effective number of parties based on raw shares (inverse Herfindahl index). |
| `effective_parties_imp` | float | Effective number of parties based on imputed shares (inverse Herfindahl index). |
| `latest_processed_at` | string | ISO 8601 timestamp indicating when the NAICS-year was last processed. |

MSA-Year Panel

Metro-level column dictionary for the MSA panel (2012-2024).

| Column | Type | Description |
| --- | --- | --- |
| `year` | integer | Calendar year. |
| `msa` | string | Combined CBSA/MSA identifier derived from position-level data. |
| `employee_count` | integer | Number of unique workers in the MSA-year (minimum 5). |
| `dem_workers_raw` | float | Workers with Democratic affiliation based solely on L2 voter registrations (raw counts). |
| `rep_workers_raw` | float | Workers with Republican affiliation based solely on L2 voter registrations (raw counts). |
| `other_workers_raw` | float | Matched workers whose L2 record is neither Democratic nor Republican (raw counts). |
| `party_known_workers_raw` | float | Workers with any L2 party assignment (Democratic, Republican, or other). |
| `dem_workers_imp` | float | Imputed Democratic worker count (we are able to impute most, but not all, independents/unknown). |
| `rep_workers_imp` | float | Imputed Republican worker count (we are able to impute most, but not all, independents/unknown). |
| `other_workers_imp` | float | Workers who are not registered Democrat or Republican and for whom we could not confidently impute partisanship (`max(employee_count - dem_workers_imp - rep_workers_imp, 0)`). |
| `avg_match_quality` | float | Average match quality across all matched workers (higher values indicate a more probable match). |
| `democrat_pct_raw` | float | Democratic share using raw L2 counts (`dem_workers_raw / employee_count`). |
| `republican_pct_raw` | float | Republican share using raw L2 counts. |
| `nonpartisan_pct_raw` | float | Nonpartisan share using raw L2 counts. |
| `democrat_pct_two_party_raw` | float | Democratic share among two-party workers from raw L2 counts (`dem_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `republican_pct_two_party_raw` | float | Republican share among two-party workers from raw L2 counts (`rep_workers_raw / (dem_workers_raw + rep_workers_raw)`). |
| `two_party_margin_raw` | float | Republican minus Democratic share among two-party workers (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / (dem_workers_raw + rep_workers_raw)`). |
| `overall_margin_raw` | float | Republican minus Democratic share of the full workforce (raw L2 counts; `(rep_workers_raw - dem_workers_raw) / employee_count`). |
| `democrat_pct_imp` | float | Democratic share after imputing independents/unknown (`dem_workers_imp / employee_count`; registered partisans remain unchanged). |
| `republican_pct_imp` | float | Republican share after imputing independents/unknown (`rep_workers_imp / employee_count`; registered partisans remain unchanged). |
| `nonpartisan_pct_imp` | float | Nonpartisan share after imputing independents/unknown (`other_workers_imp / employee_count`). |
| `democrat_pct_two_party_imp` | float | Democratic share among imputed two-party workers (`dem_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `republican_pct_two_party_imp` | float | Republican share among imputed two-party workers (`rep_workers_imp / (dem_workers_imp + rep_workers_imp)`; imputations only affect independents/unknown). |
| `two_party_margin_imp` | float | Imputed two-party margin (`(rep_workers_imp - dem_workers_imp) / (dem_workers_imp + rep_workers_imp)`). |
| `overall_margin_imp` | float | Imputed overall margin (`(rep_workers_imp - dem_workers_imp) / employee_count`). |
| `political_diversity_raw` | float | One minus the sum of squared raw partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `political_diversity_imp` | float | One minus the sum of squared imputed partisan shares (1 − ∑pᵢ²); higher values indicate more partisan diversity. |
| `effective_parties_raw` | float | Effective number of parties based on raw shares (inverse Herfindahl index). |
| `effective_parties_imp` | float | Effective number of parties based on imputed shares (inverse Herfindahl index). |
| `latest_position_enddate` | timestamp | Most recent position end timestamp among workers contributing to the MSA-year. |
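
Several of the derived columns are algebraically tied to one another, which is useful when validating a download. Writing $D$ and $R$ for the Democratic and Republican worker counts (raw or imputed) and $p_i$ for the partisan shares, and assuming the diversity and effective-parties columns are built from the same set of shares (the codebook does not enumerate them), the definitions above imply:

```latex
\begin{align}
\text{two\_party\_margin} &= \frac{R - D}{D + R}
  = \frac{R}{D + R} - \frac{D}{D + R}
  = 2\,\frac{R}{D + R} - 1, \\
\text{political\_diversity} &= 1 - \sum_i p_i^2, \\
\text{effective\_parties} &= \frac{1}{\sum_i p_i^2}
  = \frac{1}{1 - \text{political\_diversity}}.
\end{align}
```

For example, a Republican two-party share of 0.375 corresponds to a two-party margin of 2 × 0.375 − 1 = −0.25.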