Analysis: 20250811

%load_ext autoreload
%autoreload 2

from cdk.analysis.cytosol import platereader as pr
import warnings

# Ignore warnings
warnings.filterwarnings('ignore')

# Initialize plotting
pr.plot_setup()


All the data is in the same file, so I updated the platemap to include everything, added some extra labels we can slice the data by, and simplified the names so that the legends are easier to interpret. The original spreadsheet is on Google Drive.

platemap_path = "../Platemap (acjs)/20250811-acjs-PPK-platemap.tsv"
data_path = "../data/20250729-104410-cytation5-pure-timecourse-gfp--biotek-cdk.txt"

data, platemap = pr.load_platereader_data(data_path, platemap_path)

Plots

Make some timeseries plots. We can add extra parameters to slice the data different ways (like hue and col below). The plot_curves function takes the same parameters as relplot in Seaborn.

Make a plot per crowding molecule (col="Crowder"), coloring each line by the kind of metabolism (hue="Metabolism").

pr.plot_curves(data, hue="Metabolism", col="Crowder");
<Figure size 1614.25x500 with 3 Axes>

Swap them around, so we’re looking at the difference between crowders, split out by the kind of metabolism. It’s interesting that CP is really similar for both PEG and no crowder, while PEG solidly beats no crowder in the NEB positive.

pr.plot_curves(data, hue="Crowder", col="Metabolism");
<Figure size 2110.88x500 with 4 Axes>

Replicates

Let’s take a look at the individual traces, to see where the wide errors on the traces are coming from.

pr.plot_curves(data, hue="Metabolism", col="Crowder", units="Well", estimator=None, facet_kws={"sharey":False});
<Figure size 1614.25x500 with 3 Axes>

Looks like the variance is, in general, coming from either a very high or very low replicate within a set. In the Optiprep set, we see two nice high lines for the NEB positive, plus one weirdly low one. In the no-crowder set, we see a nice high curve for CP+PPK, and two lower curves, though they’re all kind of spread out. The three NEB positive lines are also spread out, though not so much.

The PEG lines are all nicely clustered, and we see the weird step change happen to all of them (at different times). It’s unclear what this is, but it’s worth digging into. Given the wells were in different parts of the plate, it’s also worth considering whether the higher variance somehow comes from their location on the plate.

Let’s plot them with a shared Y axis, just to see the variance, etc, in relation to the overall scale of the experiment as a whole.

pr.plot_curves(data, hue="Metabolism", col="Crowder", units="Well", estimator=None);
<Figure size 1614.25x500 with 3 Axes>

We can also crop back the time to ~steady state, so that we can better see differences in the early kinetic curve.

pr.plot_curves(data[data["Time"] < "5:00:00"], hue="Metabolism", col="Crowder", units="Well", estimator=None);
<Figure size 1614.25x500 with 3 Axes>
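The crop above compares the Time column against a string. A slightly more explicit variant is sketched here on made-up data. It assumes Time is stored as a pandas timedelta column (the notebook does not confirm this) and builds the cutoff as a pd.Timedelta, so the comparison does not depend on string parsing.

```python
import pandas as pd

# Made-up long-format data: one well read every 2 hours (not the real experiment).
toy = pd.DataFrame({
    "Well": ["B2"] * 5,
    "Time": pd.to_timedelta(["0:00:00", "2:00:00", "4:00:00", "6:00:00", "8:00:00"]),
    "Data": [10, 400, 900, 1000, 1005],
})

# Explicit cutoff: keep only readings taken before the 5 hour mark.
cutoff = pd.Timedelta(hours=5)
early = toy[toy["Time"] < cutoff]
print(early)
```

The same mask could be passed straight to plot_curves in place of the string comparison.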

To really split it out, we could look at only the replicates for each Crowder-Metabolism pair, so we’re just looking at three lines per plot.

pr.plot_curves(data[data["Time"] < "5:00:00"], hue="Metabolism", row="Metabolism", col="Crowder", units="Well", estimator=None);
<Figure size 1614.25x2000 with 12 Axes>

Steady state

Same as we did for the timeseries, this time putting the kind of metabolism on the x axis (x="Metabolism") and using crowder as the color (hue="Crowder").

p = pr.plot_steadystate(data, x="Metabolism", hue="Crowder");
<Figure size 710.875x400 with 1 Axes>

Swap them around like we did before, in case this is easier to read.

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism");
<Figure size 714.25x400 with 1 Axes>

Make a plot like we had above, with one panel per crowder.

p = pr.plot_steadystate(data, x="Metabolism", hue="Metabolism", col="Crowder", col_wrap=3);
<Figure size 1800x400 with 3 Axes>

Error Bars

The error bars are noticeably smaller than what we’d guess by eye from the curves--why is this? By default, the error bars plot the 95% confidence interval--the range within which we expect the true mean to fall. There are other ways to visualize the error. First, let’s look at what all the steady states look like (per well):

p = pr.plot_steadystate(data, x="Well", hue="Metabolism", col="Crowder", col_wrap=3);
<Figure size 1914.25x400 with 3 Axes>

This matches what we see in the curves more closely. The variance in some of the experiments looks a lot less bad, when scaled to the max of the best experiment (this might actually be kind of misleading though).

Let’s plot one of the earlier graphs, but with some different estimates of error/variance. Check out the seaborn documentation on uncertainty for an explanation of what these all are.

First, standard deviation:

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism", errorbar="sd");
<Figure size 714.25x400 with 1 Axes>

This looks a lot more like we’re expecting. What about the percentile interval?

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism", errorbar="pi");
<Figure size 714.25x400 with 1 Axes>

Also pretty solid. We can try the standard error:

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism", errorbar="se");
<Figure size 714.25x400 with 1 Axes>

This is unsurprisingly similar to the 95% confidence interval, given our low number of samples. Overall, we should probably change the default errorbar estimator to the standard deviation, since it’s less surprising and we’re unlikely to have a large population in any sample (large number of replicates).
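To make the difference between these estimators concrete, here is a minimal sketch that computes each of them by hand for one made-up set of five replicates. The values are invented. The bootstrap is a plain percentile bootstrap written for illustration, not seaborn's own implementation, although seaborn's "ci" option is also bootstrap-based.

```python
import random
import statistics

# Made-up steady-state values for five replicate wells (not real data).
replicates = [700.0, 900.0, 1000.0, 1100.0, 1300.0]
n = len(replicates)

mean = statistics.mean(replicates)
sd = statistics.stdev(replicates)  # sample standard deviation (n - 1)
se = sd / n ** 0.5                 # standard error of the mean

# Percentile bootstrap of the mean: resample with replacement, take the mean,
# then read off the 2.5th and 97.5th percentiles of those means.
rng = random.Random(0)
boot_means = sorted(
    statistics.mean(rng.choices(replicates, k=n)) for _ in range(10_000)
)
ci_low = boot_means[int(0.025 * len(boot_means))]
ci_high = boot_means[int(0.975 * len(boot_means)) - 1]

print(f"mean={mean:.0f}  sd={sd:.0f}  se={se:.0f}")
print(f"bootstrap 95% CI of the mean: ({ci_low:.0f}, {ci_high:.0f})")
print(f"range: ({min(replicates):.0f}, {max(replicates):.0f})")
```

The SD and the range describe how spread out the replicates are, while the SE and the CI describe how well the mean is pinned down, which is why they answer different questions about the same wells.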

We can also provide a custom estimator--let’s use one that plots the range:

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism", errorbar=lambda x: (x.min(), x.max()));
<Figure size 714.25x400 with 1 Axes>

Perhaps this is even better than the SD, since we know we’re likely to have 2-3, and probably at most 5, replicates of a given sample.

New feature inspired--let’s plot the raw data points (per well) on top of the bars:

p = pr.plot_steadystate(data, x="Crowder", hue="Metabolism", show_points=True);
<Figure size 714.25x400 with 1 Axes>

Mg Concentration vs Optiprep

p = pr.plot_steadystate(data, x="Crowder", hue="[Mg2+]", show_points=True)
/opt/anaconda3/envs/bnext-cdk/lib/python3.12/site-packages/seaborn/axisgrid.py:854: UserWarning: The palette list has more values (10) than needed (2), which may not be intended.
  func(*plot_args, **plot_kwargs)
<Figure size 667.75x400 with 1 Axes>
p = pr.plot_steadystate(data, hue="[Mg2+]", col="Crowder", x="Type", sharey=False, show_points=True)
/opt/anaconda3/envs/bnext-cdk/lib/python3.12/site-packages/seaborn/axisgrid.py:854: UserWarning: The palette list has more values (10) than needed (2), which may not be intended.
  func(*plot_args, **plot_kwargs)
<Figure size 1267.75x800 with 3 Axes>

Kinetics Analysis

Plot the overall kinetics. This plot is going to be huge, so we set the number of columns we’re willing to make (col_wrap=3).

pr.plot_kinetics(data, col_wrap=3)
<Figure size 1800x1600 with 12 Axes>
pr.plot_kinetics(data, col_wrap=3, sharey=False)
<Figure size 1800x1600 with 12 Axes>

We don’t have specific functions yet, but we can also use the kinetic analysis table to drill into some specifics (like how long each well takes to get to steady state).

import seaborn as sns

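# Difference consecutive readings within each well; a frame-wide diff() would
# also subtract across well boundaries (assumes rows are time-ordered per well).
data["Velocity"] = data.groupby("Well")["Data"].diff()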
g = sns.relplot(
    data=data[(data["Time"] > "00:00:30") & (data["Time"] <= "5:00:00")], 
    x="Time", 
    y="Velocity", 
    hue="Metabolism", 
    col="Crowder", 
    kind="line")
pr._plot_timedelta(g)
<Figure size 1614.25x500 with 3 Axes>
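When several wells are stacked in one long table, a velocity computed with diff() has to respect well boundaries. The sketch below uses made-up data with two wells to compare a frame-wide diff against a per-well diff. The column names mirror the ones used in this notebook, but the values are invented.

```python
import pandas as pd

# Made-up long-format data: two wells stacked one after the other.
toy = pd.DataFrame({
    "Well": ["B2", "B2", "B2", "C2", "C2", "C2"],
    "Time": pd.to_timedelta([0, 10, 20, 0, 10, 20], unit="min"),
    "Data": [0, 100, 150, 0, 40, 60],
})

# Frame-wide diff: the first C2 row is compared against the last B2 row.
toy["Velocity_naive"] = toy["Data"].diff()

# Per-well diff: each well's first reading has no predecessor, so it is NaN.
toy = toy.sort_values(["Well", "Time"])
toy["Velocity"] = toy.groupby("Well")["Data"].diff()

print(toy)
```

In the naive column, the first C2 row shows a large negative jump that is purely an artifact of the row order. The per-well column leaves it empty instead.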
kinetics = pr.kinetic_analysis(data)
kinetics
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(data=kinetics["Steady State"], y="Time", x="Name", kind="bar", orient="v", aspect=2)

# The names are still kind of long, so we need to rotate them in order to read them properly.
g.set_xticklabels(rotation=90)
<seaborn.axisgrid.FacetGrid at 0x712eb093d0a0>
<Figure size 1011.11x500 with 1 Axes>
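How pr.kinetic_analysis finds the steady-state time is not shown in this notebook. As a rough intuition only, one simple heuristic is to take the first time point at which a trace reaches some fraction of its maximum. The function name, the 95% threshold, and the trace below are all hypothetical.

```python
# Made-up fluorescence trace for one well, read every 30 minutes.
times_h = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
signal = [5, 120, 410, 720, 900, 960, 985, 995, 1000]


def time_to_fraction_of_max(times, values, fraction=0.95):
    """Return the first time at which values reach `fraction` of their maximum."""
    threshold = fraction * max(values)
    for t, v in zip(times, values):
        if v >= threshold:
            return t
    return None  # no reading reached the threshold


t_ss = time_to_fraction_of_max(times_h, signal)
print(t_ss)
```

A threshold like this is sensitive to noise and to late step changes in the signal, so it is a sanity check on the fitted values rather than a replacement for them.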

Quick experiment to see if there is some spatial effect to the variance on the plate. It doesn’t particularly look like it, though I could believe that it’s getting kind of worse as we go down from row B, to C, to D.

ss = pr.find_steady_state(data, group_by="Well").reset_index()
ss["Row"] = ss["Well"].str[:1]
ss["Col"] = pd.to_numeric(ss["Well"].str[1:])
sns.heatmap(ss.pivot(index="Row", columns="Col", values="Data_steadystate"))
<Axes: xlabel='Col', ylabel='Row'>
<Figure size 640x480 with 2 Axes>
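The row-by-row impression from the heatmap can also be put into numbers. This sketch uses made-up steady-state values, splits the well IDs into row and column with a regular expression (which also handles two-digit columns such as B10), and summarizes the spread within each plate row. The Data_steadystate column name follows the cell above; everything else is invented.

```python
import pandas as pd

# Made-up per-well steady-state values (not the real plate).
ss_toy = pd.DataFrame({
    "Well": ["B2", "B3", "B4", "C2", "C3", "C4", "D2", "D3", "D4"],
    "Data_steadystate": [1000, 1020, 980, 1000, 1100, 900, 1000, 1300, 700],
})

# Split "B10" style well IDs into a row letter and a column number.
parts = ss_toy["Well"].str.extract(r"^([A-Z]+)(\d+)$")
ss_toy["Row"] = parts[0]
ss_toy["Col"] = parts[1].astype(int)

# Mean and sample standard deviation of the steady state within each plate row.
row_spread = ss_toy.groupby("Row")["Data_steadystate"].agg(["mean", "std"])
print(row_spread)
```

On the real plate, different conditions share a row, so a fair comparison would compute the spread within each condition first and then look at how it varies by row.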

Summary

Overall summary of experiment

pr.plot_summary(data)
/home/acjs/.conda/envs/bnext-cdk-acjs/lib/python3.12/site-packages/pandas/core/arraylike.py:399: RuntimeWarning: overflow encountered in exp
  result = getattr(ufunc, method)(*inputs, **kwargs)
<Figure size 1200x1200 with 9 Axes>
df = data[data["Experiment"] == "PPK"]
kinetics = pr.kinetic_analysis(df, ["Name"])
kinetics
pr.plot_kinetics_by_well(
    data=df,
    kinetics=kinetics,
    group_by=["Name"],
    show_mean=True,
    show_fit=True,
    annotate=False,
    hue="Name"
)
<Figure size 640x480 with 1 Axes>
import matplotlib.pyplot as plt
import pandas as pd

fig, ax = plt.subplots()
ax.axis('off')
pd.plotting.table(ax, pr.kinetic_analysis(df))
<Figure size 640x480 with 1 Axes>