Modeling Team Performance After Leadership Change: A Classroom Project Using Crystal Palace as a Case Study

2026-03-09

Turn Crystal Palace's managerial change into a curriculum-ready data project teaching time-series and causal inference with practical steps for 2026 classrooms.

Hook: Turn a real football upheaval into a classroom lab for statistics and causal thinking

Students and teachers often struggle to find classroom projects that are both data-rich and curriculum-aligned. They want real-world relevance, clear learning outcomes, and reproducible steps that teach modern statistical methods. This project does exactly that: use historical match and event data around Oliver Glasner's tenure at Crystal Palace in 2025–26 to model how managerial change affects team performance, teaching time-series analysis and causal inference in a practical, classroom-ready way.

Executive summary: What this lesson delivers

This unit gives learners a hands-on path from raw sports data to causal claims. In one curriculum module students will:

  • Collect and clean match-level and team-level football data (results, xG, lineup changes, transfers, injuries).
  • Apply descriptive time-series techniques and visualisations to detect trends and disruptions.
  • Use three causal approaches: Interrupted Time Series, Difference-in-Differences, and Synthetic Control, and learn how to test assumptions.
  • Report findings with transparent code notebooks and reproducible figures suitable for assessment.

Why Crystal Palace and Oliver Glasner in 2026 are a timely case study

In January 2026 the Guardian reported that Oliver Glasner confirmed he will leave Crystal Palace at the end of the season, having told the chair in October that he wanted 'a new challenge' and with captain Marc Guéhi reportedly close to a transfer to Manchester City. These public events create natural treatment dates teachers can use in causal designs: the announcement, subsequent media coverage, and the official departure date all offer different intervention points to study.

Beyond the headline, 2025–26 has been a year when sports analytics adopted more rigorous causal methods. Clubs increasingly combine event and tracking data with causal ML tools to evaluate coach impact, and open-source packages for causal inference matured through 2024–2026. That makes this unit both current and forward-looking for students preparing for higher study or data-focused careers.

Curriculum alignment and learning objectives

This project aligns to UK and international curricula commonly taught to older secondary and early undergraduate students. Suggested alignments:

  • GCSE/IGCSE/KS4: basic descriptive statistics, interpreting graphs, and simple rate comparisons.
  • A-level statistics / IB SL/HL: regression, time-series basics, hypothesis testing, and data cleaning.
  • Undergraduate introductory econometrics / data science: causal inference methods, panel data, and model validation.

Key learning objectives:

  • Explain and visualise time-dependent patterns in team performance metrics.
  • Estimate and interpret the effect of managerial announcements/changes using causal designs.
  • Evaluate threats to identification and perform sensitivity checks including placebo tests.
  • Produce a reproducible report and presentation that communicates uncertainty clearly.

Data needed and where to get it

For classrooms with limited budgets we recommend building the project on freely accessible data and documented open sources. Key recommended data items and sources:

  • Match results and dates: league websites, FBref, and official club pages.
  • Expected goals (xG) and shot data: Understat and FBref often publish xG for recent seasons.
  • Team lineups and minutes: match reports available publicly; some sites archive lineups.
  • Transfer and contract events: Transfermarkt for dates and fees, club announcements for timing.
  • Injury reports and absences: public club reports and press archives for high-level indicators.
  • Control teams and league context: choose similar clubs in the same league for comparison.

For schools with access to commercial feeds (Opta, StatsBomb, Wyscout) you can extend the unit to use event- or tracking-level microdata for advanced analysis.

Designing the treatment and outcomes

Crucial design decisions include defining the treatment date and primary outcomes. Use one or more of these treatment definitions:

  • Announcement date when Glasner tells the club he seeks a new challenge (October 2025). This may capture morale or media effects.
  • Public confirmation (e.g. the 16 January 2026 Guardian report). Useful for replication with public evidence.
  • Official departure date at season end. This isolates the structural effect of an actual managerial change.

Suggested primary outcomes:

  • Points per match (PPP), averaged over rolling windows.
  • Goal difference per match and expected goal (xG) difference per match.
  • Probability of win/draw/loss using logistic regression.
  • Secondary outcomes: clean sheets, shots on target, possession share, and player-level metrics.
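The win-probability outcome above can be sketched in a few lines of statsmodels. The data here are simulated, and the column names (`xg_diff`, `home`, `win`) are illustrative assumptions rather than a real feed:

```python
# Sketch: win probability via logistic regression on simulated data.
# Column names (xg_diff, home, win) are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "xg_diff": rng.normal(0, 1, n),    # xG for minus xG against
    "home": rng.integers(0, 2, n),     # 1 = home fixture
})
# Simulate wins with a known relationship so the fit is meaningful
logit_p = 0.8 * df["xg_diff"] + 0.4 * df["home"] - 0.2
df["win"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

model = smf.logit("win ~ xg_diff + home", data=df).fit(disp=False)
print(model.params)  # xg_diff coefficient should be positive
```

In class, students can replace the simulated columns with real match-level xG differences and interpret the fitted coefficients as log-odds effects.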

Three causal approaches, explained and classroom-ready

1. Interrupted Time Series (ITS)

What it is: ITS models a continuous outcome over time and estimates the change in level and slope after an intervention. It's an accessible first causal step when you have enough pre- and post-intervention observations.

Implementation: Fit a regression of outcome on time, an indicator for post-intervention period, and an interaction term time × post. Include covariates to control for schedule strength and home/away effects.

Classroom tips:

  • Use 7- or 10-match rolling averages to reduce match-to-match noise.
  • Discuss autocorrelation and show how to inspect residuals and use Newey-West standard errors or ARIMA corrections.
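Both tips can be demonstrated together: a pandas rolling mean to smooth points per match, then an OLS trend fit with Newey-West (HAC) standard errors to allow for autocorrelation. The points series below is simulated for illustration:

```python
# Sketch: smooth points-per-match with a 7-match rolling mean, then
# fit a simple trend with Newey-West (HAC) standard errors.
# The points series is simulated, not real Crystal Palace data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
points = rng.choice([0, 1, 3], size=38, p=[0.3, 0.3, 0.4])  # one season
df = pd.DataFrame({"match": np.arange(38), "points": points})
df["ppp_roll7"] = df["points"].rolling(window=7, min_periods=1).mean()

fit = smf.ols("ppp_roll7 ~ match", data=df).fit(
    cov_type="HAC", cov_kwds={"maxlags": 6})  # Newey-West with 6 lags
print(fit.bse)  # HAC standard errors, robust to autocorrelation
```

Students can compare the HAC standard errors with the default ones to see how autocorrelation inflates apparent precision.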

2. Difference-in-Differences (DiD)

What it is: DiD compares Crystal Palace to a set of control teams before and after a treatment date. It estimates the average treatment effect under the parallel trends assumption.

Implementation: Build a panel of teams and match-days, include team and time fixed effects, and interact the treatment indicator with the treated team dummy. Modern guidance (as of 2024–2026) emphasises using debiased DiD estimators for staggered or heterogeneous adoption.

Classroom tips:

  • Pick control teams with similar pre-intervention trends in PPP or xG.
  • Perform a pre-trend test and plot event-study coefficients to visualise dynamics.
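A minimal version of the pre-trend check above can be run on a simulated panel. The team names, event window, and the built-in 0.3 effect are all invented for illustration:

```python
# Sketch of an event-study pre-trend check: regress the outcome on
# event-time dummies interacted with the treated dummy, then inspect
# the pre-period interaction coefficients. Simulated data only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for team in ["palace", "ctrl1", "ctrl2"]:
    for t in range(-5, 6):  # event time; 0 = treatment date
        y = 1.5 + 0.3 * (team == "palace" and t >= 0) + rng.normal(0, 0.2)
        rows.append({"team": team, "t": t,
                     "treated": int(team == "palace"), "y": y})
df = pd.DataFrame(rows)

# Event study: treated x event-time dummies, t = -1 as reference period
fit = smf.ols("y ~ treated * C(t, Treatment(reference=-1))", data=df).fit()
pre = [k for k in fit.params.index if "treated:" in k and "T.-" in k]
print(fit.params[pre])  # pre-period interactions should sit near zero
```

Plotting these coefficients with confidence intervals gives the event-study graph recommended above.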

3. Synthetic Control

What it is: Synthetic control constructs a weighted combination of control teams to create a counterfactual Crystal Palace that matches pre-treatment trends, then compares post-treatment divergence.

Implementation: Use the Synth package in R or Python equivalents. Include pre-treatment match-by-match outcomes and predictors like squad market value, prior season points, and average xG.

Classroom tips:

  • Run placebo tests by applying synthetic control to control teams.
  • Show how weights are chosen and why the pre-treatment fit matters for credibility.

Step-by-step classroom project plan (6 weeks)

Here is a compact, practical schedule teachers can adapt.

  1. Week 1: Project briefing and data collection. Students identify treatment date and collect match-level data for Crystal Palace and at least five control teams.
  2. Week 2: Data cleaning and exploratory analysis. Compute rolling averages, plot timelines, and summarise pre-treatment behaviour.
  3. Week 3: Interrupted Time Series modelling. Fit ITS, check residuals, present initial findings.
  4. Week 4: Difference-in-Differences. Construct panel, run DiD with fixed effects, and test parallel trends.
  5. Week 5: Synthetic Control and robustness checks. Run placebo tests and alternative specifications.
  6. Week 6: Reporting and presentations. Students produce a reproducible notebook, a one-page memo, and a 10-minute class presentation.

Practical implementation notes: software, code snippets and packages

Recommended software:

  • Python: pandas, statsmodels, linearmodels, DoWhy, EconML, causalimpact, synthetic_control packages.
  • R: tidyverse, plm, synth, CausalImpact, fixest.
  • Notebooks: Jupyter or RStudio for reproducible work.

Simple ITS regression formula (pseudocode):

    outcome ~ time + post_indicator + time:post_indicator + home_away + opponent_strength
  

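A runnable statsmodels version of the ITS formula above, on simulated data. The covariate values and the intervention at match 20 are invented for illustration:

```python
# Sketch: the ITS pseudocode as a statsmodels formula on simulated
# data. Variable names follow the pseudocode; opponent_strength is
# an invented covariate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 38
df = pd.DataFrame({
    "time": np.arange(n),
    "post_indicator": (np.arange(n) >= 20).astype(int),  # change at match 20
    "home_away": rng.integers(0, 2, n),
    "opponent_strength": rng.normal(0, 1, n),
})
df["outcome"] = (1.4 + 0.01 * df["time"] - 0.4 * df["post_indicator"]
                 + 0.2 * df["home_away"] - 0.3 * df["opponent_strength"]
                 + rng.normal(0, 0.3, n))

its = smf.ols(
    "outcome ~ time + post_indicator + time:post_indicator"
    " + home_away + opponent_strength", data=df).fit()
print(its.params)  # post_indicator = level change; interaction = slope change
```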
DiD regression formula (panel):

    outcome ~ treated_team*post_period + team_fixed_effects + time_fixed_effects + covariates
  

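The panel formula can be run with C() dummies for the fixed effects. Because the main effects of treated_team and post_period are absorbed by the team and time fixed effects, only the interaction enters the regression. The data and the built-in 0.5-point effect are simulated:

```python
# Sketch: two-way fixed-effects DiD on a simulated panel.
# Team names and the 0.5 treatment effect are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
teams, periods = ["palace", "ctrl1", "ctrl2", "ctrl3"], range(10)
rows = [{"team": tm, "period": t,
         "treated_team": int(tm == "palace"),
         "post_period": int(t >= 5)} for tm in teams for t in periods]
df = pd.DataFrame(rows)
df["outcome"] = (1.0 + 0.5 * df["treated_team"] * df["post_period"]
                 + rng.normal(0, 0.1, len(df)))

# Main effects are collinear with the fixed effects, so include
# only the DiD interaction.
did = smf.ols("outcome ~ treated_team:post_period + C(team) + C(period)",
              data=df).fit()
print(did.params["treated_team:post_period"])  # near 0.5 by construction
```

For larger panels, linearmodels' PanelOLS or R's fixest handle the fixed effects more efficiently than explicit dummies.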
Synthetic control steps:

  1. Select donor pool of similar teams.
  2. Choose predictor variables and pre-treatment outcomes.
  3. Run synth to compute weights and plot treated vs synthetic outcome.
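The three steps can be sketched as a plain constrained least-squares fit in scipy. A real study should use a dedicated package such as Synth; this only illustrates how nonnegative weights summing to one are chosen to match the pre-treatment series:

```python
# Bare-bones sketch of synthetic control weight selection: choose
# nonnegative donor weights summing to one that minimise
# pre-treatment fit error. Simulated data; real work should use a
# dedicated synth package.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
pre_periods, n_donors = 20, 5
donors = rng.normal(1.5, 0.3, size=(pre_periods, n_donors))  # donor outcomes
true_w = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
treated = donors @ true_w + rng.normal(0, 0.05, pre_periods)  # treated unit

def loss(w):
    """Pre-treatment squared error between treated and synthetic unit."""
    return np.sum((treated - donors @ w) ** 2)

res = minimize(loss, x0=np.full(n_donors, 1 / n_donors),
               bounds=[(0, 1)] * n_donors,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
weights = res.x
print(np.round(weights, 2))  # should concentrate on the first two donors
```

Showing students this optimisation directly makes the later discussion of why pre-treatment fit matters much more concrete.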

Diagnostics, sensitivity and common pitfalls

Good science depends on good checks. Students should be taught to run these diagnostics as standard practice:

  • Pre-trend checks for DiD: visual event study and statistical tests for differential trends.
  • Autocorrelation and unit root tests for time-series; correct standard errors accordingly.
  • Placebo tests for synthetic control and DiD: apply the same method to control teams and compare effect sizes.
  • Robustness to outcome definition: PPP vs xG-based outcomes may tell different stories; report both.
  • Small sample and season breaks: use rolling-window analyses and transparently discuss limitations.
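The placebo-test idea can be sketched with a naive before/after comparison applied to every team in turn; on simulated data, only the "treated" team should show a sizeable effect. Team names and effect sizes here are invented:

```python
# Sketch of a placebo test: apply the same before/after comparison
# to each control team and compare effect sizes. Simulated data only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
teams = {"palace": 0.4, "ctrl1": 0.0, "ctrl2": 0.0, "ctrl3": 0.0}
df = pd.DataFrame([
    {"team": tm, "match": m,
     "ppp": 1.5 + eff * (m >= 19) + rng.normal(0, 0.2)}
    for tm, eff in teams.items() for m in range(38)])

def naive_effect(d):
    """Post-minus-pre mean points per match around match 19."""
    return (d.loc[d["match"] >= 19, "ppp"].mean()
            - d.loc[d["match"] < 19, "ppp"].mean())

effects = {tm: naive_effect(df[df["team"] == tm]) for tm in teams}
print(effects)  # control-team (placebo) effects should sit near zero
```

If a placebo effect on a control team is as large as the estimated effect on Crystal Palace, the finding is likely spurious.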

Concrete example: choosing your treatment date for Glasner

Suppose students choose the October 2025 internal announcement as treatment. They will:

  1. Collect match outcomes from August 2025 through the end of the season to build pre- and post-series.
  2. Visualise PPP and rolling xG before and after October to see any immediate level or slope change.
  3. Run ITS as a first-pass to estimate immediate effects, then run DiD using matched control teams that had similar August–October trends.

Discuss how public confirmation on 16 January 2026 may be an additional shock. Students can explore multiple treatment windows to compare effects and discuss the plausibility of each causal interpretation.

Advanced extensions for 2026-ready classrooms

If you have access to richer data or advanced students, extend the unit by:

  • Using tracking or event-level data to build player-level outcomes (pressing intensity, expected assists) and perform micro-level causal analysis.
  • Applying causal forests or doubly-robust machine learning (EconML, DoubleML) to estimate heterogeneous effects across match types and opponent strength.
  • Incorporating Bayesian Structural Time Series (CausalImpact) to quantify uncertainty and model complex seasonality.
  • Teaching reproducibility and version control: require students to publish code in a Git repository and document data provenance.

Ethics, data privacy and presenting uncertainty

Sports analytics can easily overclaim. Teach students to:

  • Be transparent about data sources and limitations.
  • Avoid attributing player-level outcomes to managers without robust evidence.
  • Respect privacy when working with detailed player tracking or personal data; adhere to school policies and data-use licenses.
  • Present confidence intervals not just point estimates, and explain what a non-significant result means in practical terms.

Assessment and rubric ideas

Assessments should reward both technical skill and critical thinking. Consider a balanced rubric:

  • Data preparation and documentation: 20 percent.
  • Correct application of at least two causal methods and diagnostics: 40 percent.
  • Interpretation, limitations and ethical discussion: 20 percent.
  • Reproducible code notebook and presentation: 20 percent.

Sample classroom deliverables

  • Jupyter notebook or RMarkdown file with cleaned data, code, and figures.
  • Short memo answering: 'Did Oliver Glasner's announcement/departure have a measurable effect on Crystal Palace performance?'
  • 10-minute presentation focusing on method, results, and limitations.

Practical tips for teachers

  • Start with a single outcome and method before layering complexity.
  • Provide template notebooks and cleaned datasets for Week 1 to reduce setup friction.
  • Encourage group work: one subgroup handles ITS, another DiD, another synthetic control, then compare findings.
  • Use visual assessments heavily: time-series plots, event-study graphs, and synthetic control fits are more intuitive than coefficients alone.

Why this matters in 2026 and beyond

By 2026, sports organisations increasingly rely on causal analytics to inform hiring and tactical decisions. Teaching students how to move from correlation to careful causal claims equips them with transferable skills used across environmental and space science, economics, and data science. This Crystal Palace case study is not merely about football: it is a microcosm for teaching rigorous empirical reasoning in a context that motivates students.

Use this project to teach not only how to estimate effects, but how to think critically about whether effect estimates are credible.

Actionable takeaways for immediate use

  1. Download match results and xG for Crystal Palace and five control teams for the 2025–26 season and the previous season.
  2. Pick a treatment date and visualise rolling PPP and xG around that date.
  3. Run a basic ITS and a DiD with team and time fixed effects; present both results and explain any discrepancies.
  4. Perform at least one placebo test (e.g. assign the treatment date to a control team) to check for spurious findings.
  5. Document all steps in a shared notebook and discuss limitations in your presentation.

Closing: turn curiosity into classroom assessment

This Crystal Palace project gives students an authentic opportunity to apply modern statistical tools to a high-interest real-world issue. It trains them to collect messy data, think causally, and communicate uncertainty – essential skills for the next generation of scientists and analysts. Teachers can adapt the scope to their class resources and depth of curriculum.

To get started this week: assemble your dataset, decide your treatment date, and run a rolling-average plot. If you would like a starter dataset and template notebooks tailored to A-level or IB, request them from your school data lead or your local data science community.

Call to action

Ready to bring this project into your classroom? Download free templates, sample datasets, and an assessment rubric from the NaturalScience UK teaching resources page, or contact us to request a customised lesson pack for your year group. Share student project highlights with the community to inspire future classroom investigations.
