R Runner File Specifications

Contents

    General Information

    • This document outlines the standardized file formats for insights+ outcome products generated by parameterized R scripts (R Runner).
    • Currently, R Runner produces tables for the Explain, Forecast, and Predict stages.
    • The field names and types of the Employment Gap tables are discussed separately in this document because those tables do not follow the standard table format.

    File Specifications

    • Generated files for each stage are stored in a separate subfolder within each outcome folder.
      • e.g., college_going/explain
    • If a custom population ID is supplied, an additional subfolder is created within the stage folder.
      • e.g., college_going/explain/custom_population_id
    • For outcome_name, the name of the outcome being analyzed is used.
    • For cohort_name, the name of the base cohort being analyzed is used.
    Stage | Table | File Name
    Explain | Impact table | (outcome_name)_(cohort_name)_explain_variable_impact.csv
    Predict | Influence-impact table | (outcome_name)_(cohort_name)_predict_influence_impact.csv
    Predict | Individual-level predicted probability table | (outcome_name)_(cohort_name)_predict_individual_probability.csv
    Predict | Predicted/observed outcome table | (outcome_name)_(cohort_name)_predict_model_performance.csv
    Forecast | Historical count/rate table | (outcome_name)_(cohort_name)_forecast_historical_data.csv
    Forecast | Forecasted count/rate table | (outcome_name)_(cohort_name)_forecast_forecasted_data.csv
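
The folder and file naming rules above can be sketched as a small path builder. This is an illustrative Python sketch for a downstream consumer, not part of R Runner itself: the function name `output_path`, the `TABLE_SUFFIXES` mapping, and the cohort name `hs_graduates` are hypothetical, and the sketch assumes that files for a custom population are written inside the custom population subfolder.

```python
from pathlib import PurePosixPath

# Stage -> table suffixes, copied from the file-name table in this document.
TABLE_SUFFIXES = {
    "explain": ["variable_impact"],
    "predict": ["influence_impact", "individual_probability", "model_performance"],
    "forecast": ["historical_data", "forecasted_data"],
}


def output_path(outcome_name, cohort_name, stage, table, custom_population_id=None):
    """Build the relative path of one R Runner output file (hypothetical helper)."""
    if table not in TABLE_SUFFIXES.get(stage, []):
        raise ValueError(f"unknown stage/table combination: {stage}/{table}")
    # Each stage has its own subfolder within the outcome folder.
    folder = PurePosixPath(outcome_name) / stage
    # Assumption: custom-population files live in the extra subfolder.
    if custom_population_id is not None:
        folder = folder / custom_population_id
    file_name = f"{outcome_name}_{cohort_name}_{stage}_{table}.csv"
    return str(folder / file_name)


print(output_path("college_going", "hs_graduates", "explain", "variable_impact"))
# college_going/explain/college_going_hs_graduates_explain_variable_impact.csv
```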

    Field Specifications

    • IMPORTANT NOTE. The order of fields, as indicated by field number, must remain unchanged for each file. Any additions or reordering must be discussed and coordinated with the Dev team.
    • Explain – Impact table
      • This table is used for the Top Impacts panel.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | year | Character | This field includes cohort years.
    2 | variable | Character | This field includes the names of the variables included in the analysis.
    3 | D | Numeric (0-1) | This field represents the Impact measure in the panel.
    4 | cohort | Character | This field includes cohort names.
    5 | industry | Integer | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
    • Predict – Influence-impact table
      • This table is used for the Things That Matter panel.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | variable | Character | This field includes the names of the variables included in the analysis.
    2 | importance | Numeric (0-100) | This field represents the Impact measure in the panel.
    3 | control | Numeric (0-100) | This field represents the Influence measure in the panel.
    4 | cohort | Character | This field includes cohort names.
    5 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
    • Predict – Individual-level predicted probability table
      • This table is used for the Where Things Matter panel and Model Performance panel.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | nswers_id | Character | This field includes student identifiers.
    2 | year | Character | This field includes cohort years.
    3 | (variable_name) | Character or Numeric | (variable_name) represents a bundle of variable fields included in the analysis, e.g., gender, race, in_state. IMPORTANT NOTE: The order of variables inserted into dat_tbl in the R Runner script must remain unchanged. Any additions or changes to the variable order must be discussed and coordinated with the Dev team.
    4 | prediction_prob | Numeric (0-100) | This field includes individual-level predicted probabilities (0-100) to build parallel box plots and histogram plots.
    5 | cohort | Character | This field includes cohort names.
    6 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
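
Because field 3 is a bundle of variable fields rather than a single column, a consumer of this file has to locate the bundle by position. The following Python sketch shows one way to do that, assuming the CSV has a header row and that the bundle sits between `year` and `prediction_prob`, as the field numbers above suggest. The helper name `split_variable_fields` and the sample header are hypothetical.

```python
import csv
import io

# Fixed fields before and after the variable bundle, per the table above.
LEADING = ["nswers_id", "year"]
TRAILING = ["prediction_prob", "cohort", "industry"]


def split_variable_fields(header):
    """Return the variable fields found between the fixed leading and trailing fields."""
    n_lead, n_trail = len(LEADING), len(TRAILING)
    if header[:n_lead] != LEADING or header[-n_trail:] != TRAILING:
        raise ValueError(f"unexpected field order: {header}")
    return header[n_lead:len(header) - n_trail]


# Made-up header using the example variables named in the table.
sample = "nswers_id,year,gender,race,in_state,prediction_prob,cohort,industry\n"
header = next(csv.reader(io.StringIO(sample)))
variable_fields = split_variable_fields(header)
print(variable_fields)  # ['gender', 'race', 'in_state']
```

The order of the returned list mirrors the order in which variables were written, which is why the order of variables in dat_tbl matters to downstream code.
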
    • Predict – model performance table
      • This table is used for the Model Performance panel.
      • The data in this table are used to compute model performance (accuracy, sensitivity, specificity) and to build histogram plots.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | prediction_prob | Numeric (0-100) | This field includes individual-level predicted probabilities (0-100) to build parallel box plots and histogram plots.
    2 | prediction | Integer (0,1) | This field includes individual-level predicted outcomes (True, False).
    3 | observed | Integer (0,1) | This field includes individual-level observed outcomes (True, False).
    4 | cohort | Character | This field includes cohort names.
    5 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
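
As an illustration of how accuracy, sensitivity, and specificity can be derived from the `prediction` and `observed` fields, here is a minimal Python sketch using the standard confusion-matrix definitions. It assumes that 1 marks a positive (True) outcome and that the CSV has a header row. The exact computation used by the Model Performance panel is not specified here, and the sample rows are made up.

```python
import csv
import io


def model_performance(rows):
    """Compute accuracy, sensitivity, and specificity from 0/1 prediction and observed fields."""
    tp = fp = tn = fn = 0
    for row in rows:
        predicted, observed = int(row["prediction"]), int(row["observed"])
        if predicted == 1 and observed == 1:
            tp += 1
        elif predicted == 1 and observed == 0:
            fp += 1
        elif predicted == 0 and observed == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # None signals that the metric is undefined for this data.
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # share of observed positives predicted positive
        "specificity": ratio(tn, tn + fp),  # share of observed negatives predicted negative
    }


# Made-up sample rows in the field order of the table above.
sample = """prediction_prob,prediction,observed,cohort,industry
82.5,1,1,demo,NA
64.0,1,0,demo,NA
31.2,0,0,demo,NA
12.9,0,0,demo,NA
45.0,0,1,demo,NA
"""
metrics = model_performance(csv.DictReader(io.StringIO(sample)))
print(metrics)
```
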
    • Forecast – Historical count/rate table
      • This table is used for the What to Expect plot and Forecasted Count/Rate table.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | year | Integer | This field includes cohort years.
    2 | count | Integer (0-Inf) | This field includes the total number of students who achieved the outcome in the given year.
    3 | percent | Numeric (0-100) | This field includes the percentage of students who achieved the outcome in the given year.
    4 | cohort | Character | This field includes cohort names.
    5 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
    • Forecast – Forecasted count/rate table
      • This table is used for the What to Expect plot and Forecasted Count/Rate table.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | year | Integer | This field includes cohort years.
    2 | model | Character | This field includes the names of the forecasting models used in the analysis.
    3 | count_predicted | Integer (0-Inf) | This field includes the predicted total number of students who achieved the outcome in the given year.
    4 | count_lower | Integer (0-Inf) | This field includes the lower bound of the prediction interval for the total number of students who achieved the outcome in the given year.
    5 | count_upper | Integer (0-Inf) | This field includes the upper bound of the prediction interval for the total number of students who achieved the outcome in the given year.
    6 | percent_predicted | Numeric (0-100) | This field includes the predicted percentage of students who achieved the outcome in the given year.
    7 | percent_lower | Numeric (0-100) | This field includes the lower bound of the prediction interval for the percentage of students who achieved the outcome in the given year.
    8 | percent_upper | Numeric (0-100) | This field includes the upper bound of the prediction interval for the percentage of students who achieved the outcome in the given year.
    9 | cohort | Character | This field includes cohort names.
    10 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
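
Since the field order of each file must remain unchanged, a consumer may want to check a file's header before reading it. The Python sketch below is one possible check, assuming each CSV starts with a header row of field names. The `EXPECTED_FIELDS` keys and the function name `field_order_mismatches` are hypothetical, and the individual-level probability table is left out because its variable bundle has no fixed length.

```python
import csv
import io
from itertools import zip_longest

# Field names in field-number order, copied from the tables in this section
# (percent_predicted is written without an embedded space).
EXPECTED_FIELDS = {
    "explain_variable_impact": ["year", "variable", "D", "cohort", "industry"],
    "predict_influence_impact": ["variable", "importance", "control", "cohort", "industry"],
    "predict_model_performance": ["prediction_prob", "prediction", "observed", "cohort", "industry"],
    "forecast_historical_data": ["year", "count", "percent", "cohort", "industry"],
    "forecast_forecasted_data": [
        "year", "model", "count_predicted", "count_lower", "count_upper",
        "percent_predicted", "percent_lower", "percent_upper", "cohort", "industry",
    ],
}


def field_order_mismatches(table_key, csv_text):
    """Return (field_number, expected_name, found_name) for every header position that differs."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    expected = EXPECTED_FIELDS[table_key]
    # zip_longest pads with None, so missing or extra fields are reported too.
    return [
        (number, want, got)
        for number, (want, got) in enumerate(zip_longest(expected, header), start=1)
        if want != got
    ]


good = "year,count,percent,cohort,industry\n2024,10,50.0,demo,NA\n"
swapped = "count,year,percent,cohort,industry\n10,2024,50.0,demo,NA\n"
print(field_order_mismatches("forecast_historical_data", good))
print(field_order_mismatches("forecast_historical_data", swapped))
```

An empty result means the header matches the specification; any tuple in the result points at the field number to raise with the Dev team.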

    Employment Gap Table Field Specifications

    • Forecast – Historical count/rate table
      • This table is used for the What to Expect plot and Forecasted Count/Rate table.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | year | Integer | This field includes cohort years.
    2 | count | Integer (0-Inf) | This field includes the total number of students who achieved the outcome in the given year.
    3 | demand | Integer (0-Inf) | This field includes the total job opening estimates for each occupation in the given year.
    4 | cohort | Character | This field includes cohort names.
    5 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
    • Forecast – Forecasted count/rate table
      • This table is used for the What to Expect plot and Forecasted Count/Rate table.
    Field Number | Field Name | Field Type (Range) | Field Description
    1 | year | Integer | This field includes cohort years.
    2 | model | Character | This field includes the names of the forecasting models used in the analysis.
    3 | count_predicted | Integer (0-Inf) | This field includes the predicted total number of students who achieved the outcome in the given year.
    4 | count_lower | Integer (0-Inf) | This field includes the lower bound of the prediction interval for the total number of students who achieved the outcome in the given year.
    5 | count_upper | Integer (0-Inf) | This field includes the upper bound of the prediction interval for the total number of students who achieved the outcome in the given year.
    6 | demand | Numeric (0-100) | This field includes the total job opening estimates for each occupation in the latest available year from the historical count/rate table.
    7 | cohort | Character | This field includes cohort names.
    8 | industry | Character | For Industry Placement, this field represents two-digit North American Industry Classification System (NAICS) codes. For other outcomes, this field is left blank (NA).
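
The forecasted tables carry a point forecast together with prediction-interval bounds, so one simple sanity check is that each row satisfies 0 <= count_lower <= count_predicted <= count_upper. The Python sketch below illustrates that check for the Employment Gap forecasted table. The ordering rule is an assumption about how the bounds relate to the point forecast rather than a stated requirement, and the function name, the model name `arima`, and the sample values are made up.

```python
import csv
import io


def interval_violations(csv_text):
    """Return (line_number, year, model) for rows whose count bounds are out of order."""
    violations = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        # float() tolerates counts written as either "120" or "120.0".
        lower = float(row["count_lower"])
        predicted = float(row["count_predicted"])
        upper = float(row["count_upper"])
        if not (0 <= lower <= predicted <= upper):
            violations.append((line_number, row["year"], row["model"]))
    return violations


# Made-up rows in the field order of the Employment Gap forecasted table.
sample = """year,model,count_predicted,count_lower,count_upper,demand,cohort,industry
2027,arima,120,100,140,150,demo,NA
2028,arima,125,130,150,150,demo,NA
"""
print(interval_violations(sample))  # [(3, '2028', 'arima')]
```
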
    Updated on April 8, 2026
