AI Won’t Replace You. But “Prompting for Code” Might.
Published: October 2025
By Amy Humke, Ph.D.
Founder, Critical Influence

If you’re a data scientist or analyst, you have heard the drumbeat: get in front of AI or get replaced by it. Most people stop at brainstorming, starter code, and debugging. That is the floor, not the ceiling.
The durable edge comes from wiring AI across the evidence lifecycle so that work moves faster from signal to decision to action. This article shows where AI actually buys leverage, separates what is accessible now from what is still aspirational, and closes with a start-to-finish walkthrough that keeps human judgment where it belongs.
What Actually Slows Us Down
Using AI only for code shifts the bottleneck. Projects stall in scoping, data quality, feature relevance, explainability, handoff, monitoring, and adoption. The model is rarely the slow part. The slow part is turning context into decisions people will act on.
That is where AI is beneficial. It structures messy inputs, removes glue work, standardizes artifacts, and keeps a clean trail of what changed and why.
Value: broader coverage, fewer blind spots, faster time to a credible decision.
What “Beyond Prompting” Looks Like
Scoping and Alignment
- With AI: Surface gaps, frame options, run sanity math, draft a cited one-pager with placeholders.
- Your Job: Set targets and owners, pick the option, wire data results to action, sign off.
- Value: Faster greenlights with less rework.
Context Capture You Will Use
- With AI: Mine briefs, tickets, change notes, and emails into a dated timeline of what changed, who touched it, and expected direction.
- Your Job: Mark credible covariates versus noise.
- Value: Faster "why" answers, fewer dashboard debates.
Data Assembly: Dictionaries and Checks
- With AI: Propose a shortlist of candidate fields, justify each, draft dictionary entries for the few that matter, and emit validation SQL.
- Your Job: Keep the 8 to 15 fields that drive the decision, wire the checks, and note exclusions and why.
- Value: A defendable subset with runnable checks, so modeling starts sooner.
Feature Ideation with a Brake Pedal
- With AI: Propose features tied to the context timeline and dictionary, emit Python stubs, and note expected direction.
- Your Job: Run leakage, missingness, and correlation screens, fit a baseline, and keep only what lifts a held-out metric.
- Value: More ideas with less noise and clear acceptance criteria.
Modeling: Classic Stack, Modern Packaging
- With AI: Generate leakage test stubs, propose monotonic constraints from business rules, build threshold exploration views, and draft a model card template.
- Your Job: Choose the objective from the decision contract, approve constraints, run tests, and complete the model card with real limits.
- Value: Safer models that are easier to review and approve.
Explainability: Consistent Narratives at Scale
- With AI: Convert SHAP results to a three-bullet rationale plus a counselor line. Flag likely interactions and suggest quick diagnostics.
- Your Job: Validate advice, confirm top interactions, add features or rules if needed, lock prompts, and test edge cases.
- Value: Fewer "are we sure" meetings and faster adoption.
Handoff, Monitoring, and Rhythm
- With AI: Draft a plain README and simple runbook, assemble a handoff bundle, and generate weekly snapshots for drift and rule-fire rates.
- Your Job: Sanity check facts, keep scope minimal, verify a simple pause or rollback, review weekly adoption and drift.
- Value: You are not shipping infrastructure; you are shipping clarity and a lightweight rhythm of accountability.
Closing the Loop: From “We Know” to “We Do”
- With AI: Compile a monthly decision review that links what changed, what fired, and what moved, and proposes two to three small tests with owners and start or stop criteria.
- Your Job: Run the review like operations, fix missed rule-fires first, then adjust thresholds, approve the next tests, and assign owners and dates.
- Value: You leave with decisions and owners, not another interesting dashboard.
Example: One Project, End-to-End
Scenario: You are asked to help Enrollment prioritize outreach so more applicants become day-one students. The target is simple: increase new starts, not just conversations. You will build a propensity score plus pre-commit rules that translate scores into outreach moves.
We will walk through the lifecycle and show where AI is the co-pilot and where your judgment runs the show.
1) Scope: from request to pre-commit contract
With AI: Feed the LLM the intake notes, known KPI/owner info, and a short description of the enrollment workflow. Instruct it to:
- Surface gaps, not guess: Return a concise checklist of missing facts grouped by Decision, Metric, Data, Capacity, Policy, Risk.
- Draft options, not answers: Produce two or three viable ways to run prioritization (e.g., score-only, tiered rules), each with pros, cons, and data needs.
- Do sanity math, not goal-setting: Convert baselines and staffing limits into feasible contact coverage to size thresholds.
- Assemble a brief skeleton: Compile a one-page Decision Brief with placeholders and citations back to the intake.
Ask it to draft the brief with:
- Decision: who gets prioritized for counselor outreach within 48 hours post-application.
- Target metric: new-start rate at 90 days.
- Pre-commit rules (template): Tier 1 (high score) to human call; Tier 2 (mid score) to SMS/email with human call only if slack capacity; Tier 3 (low score) to nurture only. (A minimal code sketch of these tiers appears at the end of this step.)
- Tuning plan: Thresholds chosen to maximize lift under current daily call capacity.
- Risks and guardrails: Fairness checks, no automated declines, human override.
- Placeholders clearly marked: owner names, threshold values, pilot target range, review date.
Your job: Use the gap checklist to get missing facts. Choose the option, set a pilot target, assign owners and timelines, finalize thresholds, and secure sign-off before modeling.
Value: fewer false starts, an approvable document, and a testable plan.
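To make the pre-commit contract concrete, here is a minimal sketch of the tiered rules in Python. The thresholds, capacity number, and field names are illustrative placeholders, not values from a real brief; the actual numbers get set at sign-off.

```python
# Minimal sketch of the pre-commit tier rules. Thresholds, capacity, and field
# names are placeholders; the real values come from the signed-off Decision Brief.
from dataclasses import dataclass


@dataclass
class TierRules:
    tier1_cutoff: float = 0.70        # placeholder high-score threshold
    tier2_cutoff: float = 0.40        # placeholder mid-score threshold
    daily_call_capacity: int = 120    # placeholder counselor capacity


def assign_tier(score: float, rules: TierRules) -> str:
    """Map a propensity score to the pre-committed outreach tier."""
    if score >= rules.tier1_cutoff:
        return "tier1_human_call"
    if score >= rules.tier2_cutoff:
        return "tier2_sms_email"
    return "tier3_nurture_only"


def plan_outreach(scored_applicants: list[dict], rules: TierRules) -> list[dict]:
    """Apply the tiers, capping human calls at daily capacity (highest scores first)."""
    ranked = sorted(scored_applicants, key=lambda a: a["score"], reverse=True)
    calls_booked = 0
    plan = []
    for applicant in ranked:
        tier = assign_tier(applicant["score"], rules)
        if tier == "tier1_human_call":
            if calls_booked < rules.daily_call_capacity:
                calls_booked += 1
            else:
                tier = "tier2_sms_email"  # overflow falls back to the next tier
        plan.append({**applicant, "tier": tier})
    return plan
```

Writing the rules down this way keeps the thresholds visible, versioned, and easy to revisit at the review date.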
2) Context capture: the “why” fuel
With AI: Point the model at your initiative log, tickets, change notes, and key emails. Ask it to produce a six-month timeline of changes to application forms, marketing channels, etc. Request a compact table with date, change, segment affected, expected direction, and a link to evidence.
Your job: Mark which events are credible covariates versus noise.
Value: You can link performance shifts to real-world causes instead of arguing by anecdote.
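For illustration, the compact table can be as simple as a list of records; the events, ticket reference, and flags below are invented to show the shape, not drawn from a real log.

```python
# Illustrative shape for the context timeline. Every event here is made up;
# the credible_covariate flag is the analyst's call, not the LLM's.
timeline = [
    {
        "date": "2025-03-04",
        "change": "Shortened the application form from four pages to two",
        "segment_affected": "all applicants",
        "expected_direction": "higher completion, possibly lower intent",
        "evidence": "ticket FORM-1287",   # hypothetical reference
        "credible_covariate": True,
    },
    {
        "date": "2025-04-18",
        "change": "Paused a paid social campaign",
        "segment_affected": "out-of-state leads",
        "expected_direction": "lower volume, higher average intent",
        "evidence": "marketing ops email thread",
        "credible_covariate": False,      # judged to be noise for this decision
    },
]

# Keep only the events you are willing to treat as covariates in the analysis.
covariates = [event for event in timeline if event["credible_covariate"]]
```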
3) Data assembly: dictionaries and checks, fast
With AI: Provide schemas, sample rows, and a one-liner on the decision. Ask the LLM to:
- Propose a shortlist, not everything: 12 to 20 candidate fields tied to the decision path (e.g., behavioral signals, timing).
- Score and justify each field: Rationale, source of truth, and a risk tag (leakage, fairness, PII).
- Draft dictionaries for the shortlist: Name, definition, business meaning, calculation logic, allowed values, freshness, and owner.
- Attach checks with code: Suggest minimal validation (null rules, uniqueness, allowed categories) with SQL snippets; a small generator sketch appears at the end of this step.
Your job: Cut to the 8 to 15 fields that drive the decision today. Add 3 to 5 pilot fields. Approve checks, wire them into your pipeline, and document exclusions.
Value: turns schemas into a shippable subset with shared definitions and runnable checks.
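As a sketch of what "checks with code" can mean here, the snippet below turns a few hypothetical dictionary entries into validation SQL. The table and field names are assumptions for illustration; each query returns rows only when a check fails.

```python
# Turn dictionary entries into validation SQL. Table and field names are
# hypothetical; swap in the approved dictionary before wiring into a pipeline.
checks = [
    {"table": "applications", "field": "applicant_id", "rule": "unique"},
    {"table": "applications", "field": "application_date", "rule": "not_null"},
    {"table": "applications", "field": "intended_start_term",
     "rule": "allowed_values", "values": ["FALL_2025", "SPRING_2026"]},
]


def to_sql(check: dict) -> str:
    """Emit a query that returns rows only when the check fails."""
    t, f = check["table"], check["field"]
    if check["rule"] == "not_null":
        return f"SELECT COUNT(*) AS null_rows FROM {t} WHERE {f} IS NULL;"
    if check["rule"] == "unique":
        return (f"SELECT {f}, COUNT(*) AS n FROM {t} "
                f"GROUP BY {f} HAVING COUNT(*) > 1;")
    if check["rule"] == "allowed_values":
        allowed = ", ".join(f"'{v}'" for v in check["values"])
        return f"SELECT DISTINCT {f} FROM {t} WHERE {f} NOT IN ({allowed});"
    raise ValueError(f"Unknown rule: {check['rule']}")


for check in checks:
    print(to_sql(check))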
4) Feature ideation: broaden the surface, narrow the keepers
With AI: Provide the context timeline, field dictionary, and target definition. Ask for 20 to 40 candidate features (behavioral, timing, etc.). For each feature, ask for a one-line "why this matters" tied to the context timeline and a Python stub to compute it.
Your job: Enforce the brake pedal. Run leakage checks, missingness, and correlation scans. Fit a baseline model and keep only features that improve a held-out metric you care about.
Value: more ideas, less noise, and rigorous acceptance criteria.
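A minimal version of that brake pedal, assuming a pandas DataFrame `df` with numeric candidate columns and a binary `new_start_90d` target (both names are assumptions), might look like this: screen out mostly-missing and near-duplicate features, then fit a plain logistic baseline whose held-out AUC becomes the bar every candidate feature has to beat.

```python
# Screening sketch: assumes numeric candidate columns and a binary target named
# new_start_90d; both names are illustrative, not the project's real schema.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

TARGET = "new_start_90d"


def screen_features(df: pd.DataFrame, candidates: list[str],
                    max_missing: float = 0.30, max_corr: float = 0.90) -> list[str]:
    """Drop features that are mostly missing or nearly duplicate another feature."""
    keep = [c for c in candidates if df[c].isna().mean() <= max_missing]
    corr = df[keep].corr().abs()
    dropped: set[str] = set()
    for i, a in enumerate(keep):
        for b in keep[i + 1:]:
            if b not in dropped and corr.loc[a, b] > max_corr:
                dropped.add(b)  # keep the first of a highly correlated pair
    return [c for c in keep if c not in dropped]


def baseline_auc(df: pd.DataFrame, features: list[str]) -> float:
    """Held-out AUC from a plain logistic regression: the bar new features must beat."""
    X = df[features].fillna(df[features].median())
    y = df[TARGET]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=42, stratify=y)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```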
5) Modeling: classic stack, modern packaging
With AI: Generate scaffolding to make the model safer and easier to review:
- Leakage test plan and code stubs: Unit tests for temporal leakage, entity overlap, etc.
- Monotonicity assumptions to constraints: Propose features that should only move the prediction in one direction based on business rules; a constraint sketch appears at the end of this step.
- Threshold exploration notebook: Plots for confusion matrix, cost curves, and capacity-constrained operating points.
- Model card template: Structured document with purpose, data, objective, constraints, fairness checks, and limits.
Your job: Choose the objective from the decision contract. Approve constraints, run leakage tests, and fill the model card with real numbers and limits.
Value: safer models that stakeholders can approve without guesswork.
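One way to express "monotonicity assumptions to constraints" and a capacity-constrained operating point, sketched here with LightGBM on synthetic data. The feature names, constraint directions, and capacity number are assumptions; the real directions come from approved business rules, not from the model.

```python
# Sketch: monotonic constraints plus a capacity-aware cutoff, using LightGBM on
# synthetic data. Feature names, directions, and capacity are illustrative only.
import numpy as np
import lightgbm as lgb

feature_names = ["days_since_application", "portal_logins_7d", "prior_inquiries"]
# +1: prediction may only rise with the feature; -1: only fall; 0: unconstrained.
constraints = [-1, +1, 0]

rng = np.random.default_rng(0)
X = rng.random((500, 3))                                   # synthetic stand-in data
y = (X[:, 1] + rng.normal(0, 0.3, 500) > 0.8).astype(int)  # synthetic target

model = lgb.LGBMClassifier(monotone_constraints=constraints, n_estimators=200)
model.fit(X, y)


def capacity_threshold(scores: np.ndarray, daily_capacity: int) -> float:
    """Score cutoff that fills, but does not exceed, daily call capacity."""
    if daily_capacity >= len(scores):
        return 0.0
    return float(np.sort(scores)[::-1][daily_capacity - 1])


scores = model.predict_proba(X)[:, 1]
tier1_cutoff = capacity_threshold(scores, daily_capacity=120)
```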
6) Explainability: consistent narratives at scale
With AI: Pipe SHAP outputs into a prompt that returns a three-bullet rationale plus a counselor line (e.g., "Signals of readiness present... use Script B."). Ask it to flag suspected interactions by scanning the SHAP summary. Have it list interaction hypotheses with quick diagnostics.
Your job: Verify that attribution maps to advice. Run a targeted diagnostic on top interaction hypotheses. If confirmed, engineer a feature, add a rule, or re-check narratives. Then lock the prompt and test edge cases.
Value: consistent explanations that travel well across counseling teams.
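As a sketch of the narrative step, the function below formats one applicant's SHAP contributions (passed in as a plain dict) into three bullets plus a counselor line. The contribution values, feature names, and the Script A/Script B mapping are illustrative assumptions; the real wording gets locked with the counseling team.

```python
# Turn per-applicant SHAP contributions into a fixed-format rationale.
# The example values and the Script A/B mapping are illustrative assumptions.
def rationale(contributions: dict[str, float], top_n: int = 3) -> str:
    """Top drivers as three bullets, then a counselor line keyed to the net signal."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]
    bullets = [
        f"- {name} is {'raising' if value > 0 else 'lowering'} this applicant's score"
        for name, value in ranked
    ]
    net = sum(contributions.values())
    counselor_line = (
        "Counselor: signals of readiness present, use Script B."
        if net > 0 else
        "Counselor: readiness unclear, use Script A and confirm intent."
    )
    return "\n".join(bullets + [counselor_line])


example = {"portal_logins_7d": 0.21, "days_since_application": -0.08, "prior_inquiries": 0.03}
print(rationale(example))
```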
7) Handoff, monitoring, and rhythm that people will read
With AI: Draft a README and overview in plain language. Define a simple runbook (what to watch, what alerts mean, what to do). Assemble a handoff bundle with the job summary, drift checks, and logs. Generate weekly snapshots for drift, score distribution, and rule-fire rates.
Your job: Sanity check facts, keep scope minimal, verify pause or rollback, and review weekly adoption and drift.
Value: You are shipping clarity and a lightweight rhythm of accountability.
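A weekly snapshot can stay this small. The sketch below computes a population stability index (PSI) for score drift plus a Tier 1 rule-fire rate; the column names and the 0.2 alert threshold are assumptions (0.2 is a common rule of thumb, not policy).

```python
# Weekly snapshot sketch: PSI for score drift plus rule-fire rates. Column names
# and the 0.2 alert threshold are assumptions; the runbook defines the real ones.
import numpy as np
import pandas as pd


def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population stability index between baseline and current score distributions."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf        # catch values outside the baseline range
    base_pct = np.histogram(baseline, edges)[0] / len(baseline)
    curr_pct = np.histogram(current, edges)[0] / len(current)
    base_pct = np.clip(base_pct, 1e-6, None)     # avoid log(0) and divide-by-zero
    curr_pct = np.clip(curr_pct, 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))


def weekly_snapshot(baseline_scores: np.ndarray, this_week: pd.DataFrame) -> dict:
    """One row for the monitoring log: drift, volume, and Tier 1 rule-fire rate."""
    drift = psi(baseline_scores, this_week["score"].to_numpy())
    return {
        "psi": drift,
        "n_scored": len(this_week),
        "tier1_fire_rate": float((this_week["tier"] == "tier1_human_call").mean()),
        "drift_alert": drift > 0.2,   # rule-of-thumb threshold, not policy
    }
```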
8) Closing the loop: from “we know” to “we do”
With AI: Compile a monthly Decision Review that auto-pulls the last 30 days of logs and outcomes. Summarize what changed, what fired, and what moved. Propose two to three bite-sized tests or process fixes with owner names and start/stop criteria.
Your job: Run the review like operations. Fix missed rule-fires first. Then adjust thresholds. Approve the next tests, assign owners and dates, and lock the plan.
Value: you leave with decisions, owners, and start or stop rules, not another interesting dashboard.
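A sketch of the monthly pull, assuming a hypothetical outreach log with date, tier, applicant_id, contacted, and started columns (none of these names come from a real system):

```python
# Monthly review sketch: summarize the last 30 days of an outreach log.
# The log schema (date, tier, applicant_id, contacted, started) is hypothetical.
import pandas as pd


def decision_review(log: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Per tier over the last 30 days: how often it fired, was acted on, and converted."""
    window = log[(log["date"] > as_of - pd.Timedelta(days=30)) & (log["date"] <= as_of)]
    return (
        window.groupby("tier")
              .agg(fired=("applicant_id", "count"),
                   contacted_rate=("contacted", "mean"),
                   start_rate=("started", "mean"))
              .reset_index()
    )

# Missed rule-fires (Tier 1 applicants who were never contacted) get fixed before
# any threshold tuning or new tests.
```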
Conclusion: Experiment, Validate, Repeat
- Start small. Pick one step that excites you and try it this week. Ask the LLM to surface what you do not know.
- Break work into chunks. Do not ask for everything at once. Draft a README section by section.
- Treat the model like a teaching assistant. Ask for examples, edge cases, and reasons. Make it show its work.
- Validate constantly—sanity checks, leakage tests, simple holdouts, spot reads of outputs. If something feels too clean, dig.
- Keep the human in the loop. Targets, thresholds, owners, and policy live with you. The LLM drafts and computes. You decide.
- Save what works. When a prompt or pattern helps, keep it in your toolkit.
- Explore beyond this list. These are starting points, not limits.
Curiosity first. Guardrails always. Ship in small slices. The point is not to automate everything. The point is to move faster from “I think” to “I know” to “we acted.”