SARS-CoV-2 JN.1

COVID-19 is still with us and its underlying virus, SARS-CoV-2, is still evolving. The latest variant of concern is JN.1. As of February 2, 2024, the CDC estimates that JN.1 is the viral source of about 93% of all US COVID cases.

JN.1 evolved from BA.2.86. BA.2.86 has a large number of changes in the spike protein when compared to the original Wuhan sequence. The mutations gave BA.2.86 improved ability to evade the human immune system.

In order to compare JN.1 changes to BA.2.86, on January 22 I downloaded metadata for 16,460,375 sequences from GISAID. I also downloaded the full collection of SARS-CoV-2 FASTA sequences.

The analysis that follows is similar to this post. GISAID FASTA headers changed, so I had to update some of the code. I also wanted to add argument types to the Python code.

I used R to extract the BA.2.86 and JN.1 sequences from the metadata file.

filter_chunk <- function(pango_lineage) {
    function(df, pos) {
      df <- suppressMessages(as_tibble(df, .name_repair = 'universal'))
      df <- df %>% 
        filter(Pango.lineage == pango_lineage) %>%
        mutate(Collection.date = 
                 as.Date(Collection.date, format = '%Y-%m-%d')) %>%
        mutate(Submission.date = 
                 as.Date(Submission.date, format = '%Y-%m-%d'))
    }
}  


system.time(df_BA_2_86 <- read_tsv_chunked('data/metadata.tsv.gz',
                                           DataFrameCallback$new(filter_chunk(pango_lineage = 'BA.2.86')),
                                           chunk_size = 1000000))

system.time(df_JN_1 <- read_tsv_chunked('data/metadata.tsv.gz',
                                        DataFrameCallback$new(filter_chunk(pango_lineage = 'JN.1')),
                                        chunk_size = 1000000))

write_tsv(df_BA_2_86, 'data/BA_2_86.tsv')
write_tsv(df_JN_1, 'data/JN_1.tsv')

I plotted the variant counts.

system.time(df_variants_world <- get_selected_variants_ch('data/metadata.tsv.gz', selected_variants = c('BA.2.86', 'JN.1'), start_date = as.Date('2023-01-01'), country = NULL, title = 'World'))
system.time(df_variants_usa <- get_selected_variants_ch('data/metadata.tsv.gz', selected_variants = c('BA.2.86', 'JN.1'), start_date = as.Date('2023-01-01'), title = 'USA'))




As you can see, BA.2.86 didn't get much traction. JN.1 quickly took over.

Some of the FASTA sequences were shorter than full length and some had a large number of Ns. The Ns represent areas where the sequence identity wasn't determined.

ggplot(df_BA_2_86, aes(x = N.Content)) + geom_histogram(color = 'black', fill = 'white') + ggtitle('BA_2_86')
ggplot(df_JN_1, aes(x = N.Content)) + geom_histogram(color = 'black', fill = 'white') + ggtitle('JN_1')



I selected the BA.2.86 and JN.1 sequences from the FASTA file.

time python select_pango_sequences.py -i data/sequences.fasta.gz -m data/BA_2_86.tsv -o data/BA_2_86.fasta
time python select_pango_sequences.py -i data/sequences.fasta.gz -m data/JN_1.tsv -o data/JN_1.fasta

Next, the short sequences and the sequences with a large fraction of Ns were removed.

time python gisaid_remove_seqs.py -i data/BA_2_86.fasta -m data/BA_2_86.tsv -o data/BA_2_86_processed.fasta -n 0.05 -l 28000
time python gisaid_remove_seqs.py -i data/JN_1.fasta -m data/JN_1.tsv -o data/JN_1_processed.fasta -n 0.05 -l 28000
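
The filtering script isn't shown here, so the following is only a minimal sketch of the kind of check involved, not the code in gisaid_remove_seqs.py. It assumes that -n is the maximum allowed fraction of Ns and -l is the minimum sequence length. The function names are made up for illustration.

```python
from typing import Iterator


def read_fasta(lines: list[str]) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n_fraction(seq: str) -> float:
    """Fraction of bases in seq that are the ambiguity code N."""
    return seq.upper().count("N") / len(seq) if seq else 1.0


def keep_sequence(seq: str, min_length: int = 28000, max_n: float = 0.05) -> bool:
    """True if seq is long enough and has at most max_n fraction of Ns.

    The defaults mirror the -l 28000 and -n 0.05 arguments used above;
    whether the real script treats the limits as inclusive is an assumption.
    """
    return len(seq) >= min_length and n_fraction(seq) <= max_n


# Toy example with tiny thresholds so the behavior is easy to see.
toy = [">good", "ACGTACGT", ">too_many_n", "ACGT", "NNNN", ">short", "ACGT"]
kept = [h for h, s in read_fasta(toy) if keep_sequence(s, min_length=8)]
print(kept)
```

With the toy input, only the record named good passes both checks.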

Each set of sequences was aligned with MAFFT using the original Wuhan sequence as a reference.

time mafft --auto --preservecase --thread -1 --addfragments data/BA_2_86_processed.fasta data/EPI_ISL_402124.fasta > data/BA_2_86_aln.fasta
time mafft --auto --preservecase --thread -1 --addfragments data/JN_1_processed.fasta data/EPI_ISL_402124.fasta > data/JN_1_aln.fasta

Once the sequences were aligned, I counted the mutations at the positions that differed from the Wuhan sequence.

time python gisaid_mutations.py -i data/BA_2_86_aln.fasta -o output/BA_2_86_mutations.csv -g data/NC_045512.2.gb
time python gisaid_mutations.py -i data/JN_1_aln.fasta -o output/JN_1_mutations.csv -g data/NC_045512.2.gb
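
As a rough illustration of what comparing an aligned sequence to the reference involves, here is a small sketch. It is not the code in gisaid_mutations.py: it only reports nucleotide substitutions, while the real script also uses the GenBank file to name protein-level changes. The function name and the decision to skip gaps and Ns are assumptions.

```python
def nucleotide_differences(ref_aln: str, seq_aln: str) -> list[tuple[int, str, str]]:
    """Return (reference position, ref base, sample base) for each substitution.

    Positions are 1-based and count only non-gap reference columns.
    Gaps and Ns in the sample are skipped as undetermined.
    """
    if len(ref_aln) != len(seq_aln):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    ref_pos = 0
    for r, s in zip(ref_aln.upper(), seq_aln.upper()):
        if r == "-":
            continue  # insertion relative to the reference
        ref_pos += 1
        if s in "-N":
            continue  # deletion or undetermined base in the sample
        if r != s:
            diffs.append((ref_pos, r, s))
    return diffs


# Toy alignment: one substitution, one insertion, and one N in the sample.
print(nucleotide_differences("ACG-TAC", "ATGATNC"))
```

Counting how often each difference appears across all sequences in a lineage then gives a table of candidate mutations.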

The full mutation tables can be found on GitHub.

Finally, we can read the mutations in R and see the differences.

> BA_2_86_mutations <- read_csv('output/BA_2_86_mutations.csv')
> JN_1_mutations <- read_csv('output/JN_1_mutations.csv')
> JN_1_mutations %>% filter(! (mutation %in% BA_2_86_mutations$mutation)) %>% select(mutation, product)

  mutation product             
  <chr>    <chr>               
1 NA       NA                  
2 G282G    nsp3                
3 K1155R   nsp3                
4 R252K    nsp6                
5 L44L     nsp9                
6 C285C    3'-to-5' exonuclease
7 L455S    surface glycoprotein
8 F19L     ORF7b  

There is only one difference in the spike protein (surface glycoprotein) but it's enough to increase the transmissibility of JN.1 over BA.2.86.

The code for this analysis is similar to the code for this post, but has been updated to include argument types and the updated GISAID FASTA headers. The code can be found on GitHub.


Thanks wsltty folks!


I use this blog to post small programming projects that implement some aspect of a subject that I'm interested in. This post is different.

The computer that I use for development uses Windows 11. For most of my programming work, I use WSL2. Last week, Windows update KB5032190 caused havoc with several apps that I use regularly: VSCode, wsltty, Task Scheduler, and Windows itself.

I use VSCode as my IDE. It operates with WSL via an exposed port. VSCode is actually running in Windows, but connects to WSL so it can run Linux compilers, apps, etc. After the update, it refused to connect to WSL. I fixed this problem the way many Windows app problems are solved: I deleted VSCode and reinstalled it. After that it worked. I have no idea why it failed initially.

I run a backup program every night at 1:00 am. It is launched by the Windows Task Scheduler. The program itself is ancient: it was written in Perl about ten years ago and has run flawlessly on different systems ever since. Last week, Task Scheduler began to randomly skip launching the task. No errors or other events were reported, and other tasks were launched, but the backup task was simply ignored. I submitted feedback to Microsoft on this issue. I haven't received a reply yet.

After the update, icons mysteriously disappeared from my taskbar. Clicking on the blank space where the icon should be will bring up the window. This has supposedly been fixed in the Windows Insider channel. I'm hesitant to join the Insider program because I worry that I'll be facing broken apps on a regular basis.

wsltty is my go-to terminal for Linux. Its roots are in mintty, a terminal emulator for Cygwin. It's a simple terminal window that behaves like a Linux terminal. When wsltty began to fail, I found several viable workarounds, but I still preferred wsltty. I submitted an issue on the wsltty GitHub page. It quickly became apparent that many others suffered the same problem. Some people dove into the problem and located where it arose. Within a week, the good folks who support wsltty provided an updated version. This is the sort of thing that makes open source so great!

THANK YOU WSLTTY FOLKS!!!




Global Temperature Anomaly November 2023

Given the current OpenAI turmoil, it's hard to say how ChatGPT will develop in the near term. I, for one, welcome our new Microsoft OpenAI overlords, for today at least. I have no hopes for a transparent ChatGPT.



Predicting how temperature will rise over the next few years is difficult. It seems that the world keeps being surprised that things are worse than we suspected. There may be reasons other than anthropogenic climate effects, but humanity is the main culprit. We're well on our way to 2023 being the hottest year on record.

On 2023-11-14, I downloaded the global monthly temperature anomaly data from the Met Office Hadley Centre observations datasets. The CSV data file contains monthly estimates of the global temperature anomaly from January 1850 to September 2023. The temperature anomalies (deg C) are relative to 1961-1990.

> library(tidyverse)
> df_temp_hadcrut <- read_csv('ChatGPT/data/HadCRUT.5.0.1.0.analysis.summary_series.global.monthly.csv', name_repair = 'universal')
> names(df_temp_hadcrut)
[1] "Time"                           "Anomaly..deg.C."                "Lower.confidence.limit..2.5.." 
[4] "Upper.confidence.limit..97.5.."
> df_temp_hadcrut <- df_temp_hadcrut %>% rename(`Anomaly (deg C)` = `Anomaly..deg.C.` )
> df_temp_hadcrut <- df_temp_hadcrut %>% rename(`Lower confidence limit (2.5%)` = `Lower.confidence.limit..2.5..` ) %>% rename(`Upper confidence limit (97.5%)` = `Upper.confidence.limit..97.5..`)
> write_csv(df_temp_hadcrut, 'ChatGPT/data/HadCRUT.5.0.1.0.analysis.summary_series.global.monthly.csv')
> ggplot(df_temp_hadcrut, aes(x = Time, y = `Anomaly (deg C)`)) + ylab('Temperature Anomaly \u00B0C') + ggtitle('Temperature Anomaly') + scale_x_discrete(breaks = c(df_temp_hadcrut$Time[seq(1, nrow(df_temp_hadcrut), by = 120)], df_temp_hadcrut$Time[nrow(df_temp_hadcrut)])) + geom_point() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
> 

You can see that 2023 is shaping up to be very warm.

ChatGPT helps out


I wanted to try some sort of deep learning to fit the data and make predictions. I decided to see if ChatGPT could give me a start.

Query:
write Pytorch code to perform a smooth fit of the temperature anomaly data from 1850 to 2020 and predict temperature anomaly values for 2021 and 2022

This is a pretty terrible query, but surprisingly ChatGPT came up with reasonable starting code.


To perform a smooth fit of the temperature anomaly data from 1850 to 2020, you can use a smoothing algorithm such as **moving average smoothing** or **exponential smoothing**. Here is an example of how to perform moving average smoothing on the temperature anomaly data using Python:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load the data
df = pd.read_csv('global_temperature_anomaly_1850_2023.csv')
df = df[df['Year'] >= 1850]
df = df[df['Year'] <= 2020]

# Compute the moving average
window_size = 12  # Use a window size of 12 months
df['MA'] = df['J-D'].rolling(window_size).mean()

# Plot the data
plt.plot(df['Year'], df['J-D'], label='Temperature Anomaly')
plt.plot(df['Year'], df['MA'], label=f'{window_size}-Month Moving Average')
plt.title('Global Temperature Anomaly (1850-2020)')
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly (°C)')
plt.legend()
plt.show()
```

This code reads the CSV file `global_temperature_anomaly_1850_2023.csv` into a `pandas` DataFrame and computes the 12-month moving average of the `J-D` column (which represents the annual temperature anomaly) for the years 1850 to 2020. It then plots the original data and the moving average on the same graph using `matplotlib`.

To predict temperature anomaly values for 2021 and 2022, you can use a machine learning algorithm such as **LSTM**. Here is an example of how to use PyTorch to train an LSTM model on the temperature anomaly data and predict future values:

```python
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# Define the LSTM model
class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, input):
        output, _ = self.lstm(input.view(len(input), 1, -1))
        output = self.linear(output.view(len(input), -1))
        return output[-1]

# Load the data
df = pd.read_csv('global_temperature_anomaly_1850_2023.csv')
df = df[df['Year'] >= 1850]
df = df[df['Year'] <= 2020]
data = torch.tensor(df['J-D'].values, dtype=torch.float32)

# Define the dataset and dataloader
class TemperatureDataset(Dataset):
    def __init__(self, data, seq_len):
        self.data = data
        self.seq_len = seq_len

    def __len__(self):
        return len(self.data) - self.seq_len

    def __getitem__(self, idx):
        return self.data[idx:idx+self.seq_len], self.data[idx+self.seq_len]

seq_len = 12  # Use a sequence length of 12 months
dataset = TemperatureDataset(data, seq_len)
dataloader = DataLoader(dataset, batch_size=1)

# Train the LSTM model
input_size = seq_len
hidden_size = 32
output_size = 1
lr = 0.001
num_epochs = 100

model = LSTM(input_size, hidden_size, output_size)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=lr)

for epoch in range(num_epochs):
    for X, y in dataloader:
        optimizer.zero_grad()
        y_pred = model(X)
        loss = criterion(y_pred, y)
        loss.backward()
        optimizer.step()

# Predict future values
future = 24  # Predict 24 months into the future
predictions = []
with torch.no_grad():
    model.eval()
    for i in range(future):
        X = data[-seq_len:].view(1, -1)
        y_pred = model(X)
        predictions.append(y_pred.item())
        data = torch.cat([data, y_pred.view(-1)])

# Plot the data and predictions
plt.plot(df['Year'], df['J-D'], label='Temperature Anomaly')
plt.plot(range(2021, 2023+1), predictions, label='Predictions')
plt.title('Global Temperature Anomaly (1850-2023)')
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly (°C)')
plt.legend()
plt.show()
```

This code reads the CSV file `global_temperature_anomaly_1850_2023.csv` into a `pandas` DataFrame and trains an LSTM model on the `J-D` column (which represents the annual temperature anomaly) for

Source: Conversation with Bing, 11/14/2023
(1) Time Series Anomaly Detection using LSTM Autoencoders. https://github.com/curiousily/Getting-Things-Done-with-Pytorch/blob/master/manuscript/06.time-series-anomaly-ecg.md.
(2) Automatic differentiation package - torch.autograd — PyTorch 2.1 .... https://pytorch.org/docs/stable/autograd.html.
(3) Learning PyTorch with Examples. https://pytorch.org/tutorials/beginner/pytorch_with_examples.html.
(4) Performance Tuning Guide — PyTorch Tutorials 2.1.0+cu121 documentation. https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html.
(5) PyTorch: is there a definitive training loop similar to Keras' fit()?. https://stackoverflow.com/questions/59584457/pytorch-is-there-a-definitive-training-loop-similar-to-keras-fit.
(6) PyTorch: Predicting future values with LSTM - Stack Overflow. https://stackoverflow.com/questions/66048406/pytorch-predicting-future-values-with-lstm.
(7) Predicting future values with LSTM - PyTorch Forums. https://discuss.pytorch.org/t/predicting-future-values-with-lstm/110916.
(8) python - PyTorch : predict single example - Stack Overflow. https://stackoverflow.com/questions/51041128/pytorch-predict-single-example.
(9) Weather Forecasting | Papers With Code. https://paperswithcode.com/task/weather-forecasting.
(10) undefined. https://avatars.githubusercontent.com/u/150327?v=4.
(11) undefined. https://github.com/curiousily/Getting-Things-Done-with-Pytorch/blob/master/manuscript/06.time-series-anomaly-ecg.md?raw=true.
(12) undefined. https://desktop.github.com.
(13) undefined. https://docs.github.com/articles/about-issue-and-pull-request-templates.
(14) undefined. https://github.com/curiousily/Getting-Things-Done-with-Pytorch/raw/master/manuscript/06.time-series-anomaly-ecg.md.

{ending was cut off}

I used this code as a start. It's a very simple NN using an LSTM layer followed by a linear layer. I kept the LSTM model and the data loader as well as much of the main code. The code isn't fast because I didn't use a GPU.
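
The two ideas I kept from the generated code are the sliding-window dataset and the roll-forward prediction loop. Here is a minimal sketch of both in plain Python, without PyTorch. The predict_next argument is a hypothetical stand-in for the trained model, and the function names are for illustration only.

```python
from typing import Callable


def make_windows(series: list[float], seq_len: int) -> list[tuple[list[float], float]]:
    """Pair each run of seq_len consecutive values with the value that follows it."""
    return [
        (series[i : i + seq_len], series[i + seq_len])
        for i in range(len(series) - seq_len)
    ]


def forecast(predict_next: Callable[[list[float]], float],
             history: list[float], seq_len: int, steps: int) -> list[float]:
    """Roll a one-step predictor forward, feeding each prediction back in."""
    values = list(history)
    out = []
    for _ in range(steps):
        nxt = predict_next(values[-seq_len:])
        out.append(nxt)
        values.append(nxt)
    return out


# Windows of length 3 over a short toy series.
print(make_windows([1.0, 2.0, 3.0, 4.0, 5.0], seq_len=3))

# A placeholder "model" that predicts the mean of its window.
print(forecast(lambda w: sum(w) / len(w), [1.0, 3.0], seq_len=2, steps=2))
```

Because each prediction is fed back in as input, errors accumulate the further out the forecast goes, which is one reason long-range predictions from this kind of model drift.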

You can find the full code on GitHub.

I used the data up until the end of 2020 as training and predicted values for 2021 to September 2023. Here's what the fit looks like. 


The predictions aren't particularly accurate. The rapid temperature rise over the last few years confounds the training data. In general, this sort of noisy data with shifting trends is difficult to predict.





Programming With ChatGPT Part 2

 

Image generated by OpenAI

In a previous post, I described my experiments using Bing/ChatGPT for programming. This time, I want to try some simpler tasks to see how ChatGPT handles them.

In the previous post, ChatGPT generated the following Python code. I wrapped the generated code in a function.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def main():
    out_file = '/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.csv'

    # Define the number of data points
    n = 1000

    # Define the means and variances for each region
    means = [1, 5, 2]
    variances = [1, 1, 1]

    # Generate the time series
    ts = np.zeros(n)
    for i in range(n):
        if i < n//3:
            ts[i] = np.random.normal(means[0], variances[0])
        elif i < 2*n//3:
            ts[i] = np.random.normal(means[1], variances[1])
        else:
            ts[i] = np.random.normal(means[2], variances[2])

    # Plot the time series
    plt.plot(ts)
    plt.title('Time Series with 3 Regions')
    plt.xlabel('Time')
    plt.ylabel('Value')
    plt.show()

    df = pd.DataFrame({'x': list(range(len(ts))), 'y': ts})
    df.to_csv(out_file, index = False)

I asked ChatGPT to translate this code to Julia.

convert the following python code to Julia

Here's the response.

Sure! Here's the Julia code equivalent to your Python code:

```julia
using Random, CSV, DataFrames, Plots

function main()
    out_file = "/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.csv"

    # Define the number of data points
    n = 1000

    # Define the means and variances for each region
    means = [1, 5, 2]
    variances = [1, 1, 1]

    # Generate the time series
    ts = zeros(n)
    for i in 1:n
        if i < n ÷ 3
            ts[i] = rand(Normal(means[1], variances[1]))
        elseif i < 2n ÷ 3
            ts[i] = rand(Normal(means[2], variances[2]))
        else
            ts[i] = rand(Normal(means[3], variances[3]))
        end
    end

    # Plot the time series
    plot(ts, title = "Time Series with 3 Regions", xlabel = "Time", ylabel = "Value")
    savefig("/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.png")

    df = DataFrame(x = 1:length(ts), y = ts)
    CSV.write(out_file, df)
end
```

The code above uses the following Julia packages:
- `Random` for generating random numbers.
- `CSV` and `DataFrames` for handling data.
- `Plots` for creating plots.

Please note that the file path `/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.csv` and `/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.png` should be adjusted according to your system.

Let me know if there's anything else I can help you with!

Source: Conversation with Bing, 9/26/2023
(1) CodeConvert AI - Convert code with a click of a button. https://www.codeconvert.ai/python-to-julia-converter.
(2) Python to Julia conversion - General Usage - Julia Programming Language. https://discourse.julialang.org/t/python-to-julia-conversion/35280.
(3) How to convert a Python pandas into a Julia DataFrame (using PyJulia .... https://stackoverflow.com/questions/63731550/how-to-convert-a-python-pandas-into-a-julia-dataframe-using-pyjulia-and-back-t.
(4) undefined. https://github.com/JuliaPy/PyCall.jl.

When I ran it, I received the following error.

julia> include("src/main.jl")
main (generic function with 1 method)

julia> main()
ERROR: UndefVarError: Normal not defined
Stacktrace:
 [1] main()
   @ Main ./REPL[8]:15
 [2] top-level scope
   @ REPL[12]:1

julia>

I gave the error message to ChatGPT:

ERROR: UndefVarError: Normal not defined
Normal is not defined

The response was:

I apologize for the error. It seems that the `Normal` function is not defined in Julia. To fix this issue, you need to import the `Distributions` package, which provides the `Normal` distribution.

Please add the following line at the beginning of your code:

```julia
using Distributions
```

This will import the necessary functions and allow you to use `Normal` to generate random numbers from a normal distribution.

Let me know if you need any further assistance!

I added the additional line, but still had a problem.

julia> using Distributions

julia> main()
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""
"/mnt/d/Documents/analytic_garden/cp_pytorch/data/ts_chatgpt.csv"

That wasn't a ChatGPT error. My system didn't have the proper library installed. I asked how to fix the problem.

I received this error
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""

ChatGPT responded:

I apologize for the inconvenience. The error message you encountered indicates that the **Qt platform plugin "wayland"** is missing or not properly configured on your system. This error can occur when running applications that rely on the Qt framework, such as graphical user interfaces (GUIs) created with Qt.

To resolve this issue, you can try the following steps:

1. **Install the required Qt dependencies**: Ensure that you have all the necessary dependencies installed for the Qt framework. You may need to consult the documentation or support resources for the specific application or library you are using to determine the required dependencies.

2. **Check your environment variables**: Verify that your environment variables are correctly set up to locate the necessary Qt libraries and plugins. Specifically, check if the `QT_PLUGIN_PATH` variable is properly configured to point to the directory containing the required plugins.

3. **Reinstall or update Qt**: If you have already installed Qt, consider reinstalling or updating it to ensure that all required components are present and up to date. You can download the latest version of Qt from the official Qt website.

4. **Consult application-specific resources**: If you encountered this error while running a specific application, consult the application's documentation, support forums, or other resources for guidance on resolving this issue. The developers or community members may have encountered similar problems and can provide specific troubleshooting steps.

Please note that these steps are general recommendations, and the specific solution may vary depending on your system configuration and the application you are using. If you continue to experience difficulties, I recommend reaching out to the application's support channels for further assistance.

Let me know if there's anything else I can help you with!

Source: Conversation with Bing, 9/26/2023
(1) qt.qpa.plugin: Could not find the Qt platform plugin "wayland". https://stackoverflow.com/questions/69994530/qt-qpa-plugin-could-not-find-the-qt-platform-plugin-wayland.
(2) Error: Could not find the Qt platform plugin "wayland" in. https://discussion.fedoraproject.org/t/error-could-not-find-the-qt-platform-plugin-wayland-in/82029.
(3) qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in .... https://bbs.archlinux.org/viewtopic.php?id=284214.
(4) qt.qpa.plugin: Could not load the Qt platform plugin "wayland" in .... https://stackoverflow.com/questions/69269471/qt-qpa-plugin-could-not-load-the-qt-platform-plugin-wayland-in-even-though.

That didn't really solve my problem. I asked how to install the plugin. I already knew that the StackOverflow reference contained the correct answer, but I wanted to see if ChatGPT could figure it out.

how do i install Qt platform plugin wayland on ubuntu 22.04

I received this response.

To install the **Qt platform plugin "wayland"** on Ubuntu 22.04, you can follow these steps:

1. Open a terminal window.

2. Run the following command to install the required package:

   ```shell
   sudo apt install qtwayland5
   ```

   This command will install the `qtwayland5` package, which provides the Qt platform plugin for Wayland¹.

3. After the installation is complete, you should be able to use the Qt platform plugin "wayland" in your applications without encountering the error message you mentioned.

Please note that you may need to restart your system or log out and log back in for the changes to take effect.

Let me know if there's anything else I can help you with!

Source: Conversation with Bing, 9/26/2023
(1) ubuntu - qt.qpa.plugin: Could not find the Qt platform plugin "wayland .... https://stackoverflow.com/questions/69994530/qt-qpa-plugin-could-not-find-the-qt-platform-plugin-wayland.
(2) How To Install qt6-wayland-dev-tools on Ubuntu 22.04. https://installati.one/install-qt6-wayland-dev-tools-ubuntu-22-04/.
(3) Qt platform plugin "wayland" not found while running Python in VScode .... https://stackoverflow.com/questions/74939728/qt-platform-plugin-wayland-not-found-while-running-python-in-vscode-on-ubuntu.
(4) 22.04 - How do you install qt on ubuntu22.04 - Ask Ubuntu. https://askubuntu.com/questions/1404263/how-do-you-install-qt-on-ubuntu22-04.

After installing qtwayland5, the code worked, but not perfectly. It produced the CSV file and saved the figure to a file, but didn't display the image. Saving the image to the file was added by ChatGPT. It wasn't in the original Python code. I deleted that line and the image was displayed.





Programming With ChatGPT


You may have read stories like this on Reddit or other sites: "Two weeks ago I couldn't program at all, now, thanks to ChatGPT, I'm a senior software engineer at Google." That is a bit of an exaggeration, but sites like r/ChatGPT often contain quite a bit of AI hype.

I was recently working on using PyTorch to detect change points in time series sequences.  I have written about change point detection before: here, here, here, and here. There are numerous methods for detecting regions of change in series data depending on the characteristics of the data. I was interested to see what ChatGPT could contribute.

Change point detection algorithms are traditionally classified as online or offline. Offline methods have the entire data sequence available. Online algorithms process each data point as it arrives. I'm interested in offline change point detection. This paper gives a good overview of some of the common detection methods.
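
To make the offline setting concrete, here is a toy sketch that finds a single change in the mean when the whole series is available. It tries every split point and keeps the one with the smallest total squared error. This is a brute-force illustration, not one of the methods from the paper, and the function name is made up.

```python
def best_split(series: list[float]) -> int:
    """Index that splits series into two segments with the smallest total
    squared deviation from each segment's own mean."""

    def sse(xs: list[float]) -> float:
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs)

    if len(series) < 2:
        raise ValueError("need at least two points")
    # Every candidate split k leaves at least one point on each side.
    return min(
        range(1, len(series)),
        key=lambda k: sse(series[:k]) + sse(series[k:]),
    )


# The mean jumps from about 1 to about 5 at index 3.
print(best_split([1.0, 1.2, 0.9, 5.1, 4.8, 5.0, 5.2]))
```

An online method can't do this, because it has to decide as each point arrives without seeing what comes later. The brute-force search is quadratic in the series length, so it only suits short series.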

Generating Data

My first step was to generate some data. I used Bing/ChatGPT. Bing chat uses ChatGPT 4 and also provides search results. 

I tried the following prompt to ask ChatGPT to provide some code for generating time series data.

write python code to generate a time series with 3 regions, each region having a different mean and variance

Here's ChatGPT's response:

Here is a Python code that generates a time series with 3 regions, each region having a different mean and variance:

```python
import numpy as np
import matplotlib.pyplot as plt

# Define the number of data points
n = 1000

# Define the means and variances for each region
means = [0, 5, 10]
variances = [1, 2, 3]

# Generate the time series
ts = np.zeros(n)
for i in range(n):
    if i < n//3:
        ts[i] = np.random.normal(means[0], variances[0])
    elif i < 2*n//3:
        ts[i] = np.random.normal(means[1], variances[1])
    else:
        ts[i] = np.random.normal(means[2], variances[2])

# Plot the time series
plt.plot(ts)
plt.title('Time Series with 3 Regions')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()
```

In this code, we first define the number of data points `n`, and then define the means and variances for each of the three regions. We then generate the time series by sampling from normal distributions with the specified means and variances for each region. Finally, we plot the time series using `matplotlib`.

I hope this helps!

Source: Conversation with Bing, 9/20/2023
(1) Time Series Analysis in Python - A Comprehensive Guide with Examples. https://www.machinelearningplus.com/time-series/time-series-analysis-python/.
(2) Time series / date functionality - pandas 2.1.0 documentation. https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html.
(3) A Guide to Time Series Visualization with Python 3. https://www.digitalocean.com/community/tutorials/a-guide-to-time-series-visualization-with-python-3.
(4) A Python Library to Generate a Synthetic Time Series Data. https://towardsdatascience.com/tsbngen-a-python-library-to-generate-time-series-data-from-an-arbitrary-dynamic-bayesian-network-4b46e178cd9f.
(5) Complete Guide on Time Series Analysis in Python | Kaggle. https://www.kaggle.com/code/prashant111/complete-guide-on-time-series-analysis-in-python.
(6) undefined. https://raw.githubusercontent.com/selva86/datasets/master/a10.csv.

Not Bad! It works well. I might have used a list comprehension rather than the for loop, but the code is functional and easily understandable and modifiable.
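
For comparison, here is a sketch of the list comprehension version, written with the standard library instead of NumPy. One detail to watch: the second argument of np.random.normal is the standard deviation (scale), not the variance, so the generated code's variances list is really being used as a list of standard deviations. The sketch takes the square root so the values are true variances. The function name and the seed parameter are my own additions.

```python
import math
import random


def three_region_series(n: int, means: list[float], variances: list[float],
                        seed: int = 0) -> list[float]:
    """Draw n points, switching mean and variance at n//3 and 2*n//3."""
    rng = random.Random(seed)
    bounds = [n // 3, 2 * n // 3]

    def region(i: int) -> int:
        # Number of boundaries at or below i: 0, 1, or 2.
        return sum(i >= b for b in bounds)

    # random.gauss takes a standard deviation, so convert each variance.
    return [
        rng.gauss(means[region(i)], math.sqrt(variances[region(i)]))
        for i in range(n)
    ]


ts = three_region_series(1000, means=[0, 5, 10], variances=[1, 2, 3])
print(len(ts), ts[:3])
```

Fixing the seed makes the series reproducible, which helps when comparing change point methods on the same data.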

The code produces this plot.


PyTorch for Change Point Detection

After generating some fake data, I next tried to get ChatGPT to write some code using PyTorch to detect change points.

Here was my prompt:

write pytorch code to find change points in a time series

and the result:

Sure! Here's an example of PyTorch code to find change points in a time series:

```python
import torch
import numpy as np
from ruptures.costs import LeastAbsoluteDeviation
from ruptures.model import Binseg

# Generate a random time series
np.random.seed(0)
n_samples, n_features = 500, 1
signal = np.random.randn(n_samples, n_features)

# Define the cost function and the model
cost = LeastAbsoluteDeviation().fit(signal)
model = Binseg(custom_cost=cost).fit(signal)

# Get the change points
change_points = model.predict(n_bkps=2)

print("Change points:", change_points)
```

This code uses the `ruptures` library, which is a Python library for performing offline change point detection. In this example, we generate a random time series and use the `LeastAbsoluteDeviation` cost function with the `Binseg` model to detect change points. The `n_bkps` parameter specifies the number of change points to detect.

Please note that this is just one approach to change point detection in time series data. There are other algorithms and libraries available, depending on your specific requirements and preferences. Feel free to explore other options and adapt the code to your needs.

Let me know if you need any further assistance!
```python
import torch
import numpy as np
from ruptures.costs import LeastAbsoluteDeviation
from ruptures.model import Binseg

# Generate a random time series
np.random.seed(0)
n_samples, n_features = 500, 1
signal = np.random.randn(n_samples, n_features)

# Define the cost function and the model
cost = LeastAbsoluteDeviation().fit(signal)
model = Binseg(custom_cost=cost).fit(signal)

# Get the change points
change_points = model.predict(n_bkps=2)

print("Change points:", change_points)
```

This code uses the `ruptures` library, which is a Python library for performing offline change point detection. In this example, we generate a random time series and use the `LeastAbsoluteDeviation` cost function with the `Binseg` model to detect change points. The `n_bkps` parameter specifies the number of change points to detect.

Please note that this is just one approach to change point detection in time series data. There are other algorithms and libraries available, depending on your specific requirements and preferences. Feel free to explore other options and adapt the code to your needs.

Let me know if you need any further assistance!

Source: Conversation with Bing, 9/16/2023
(1) A Brief Introduction to Change Point Detection using Python. https://techrando.com/2019/08/14/a-brief-introduction-to-change-point-detection-using-python/.
(2) python - Time series change point detection - Stack Overflow. https://stackoverflow.com/questions/69010927/time-series-change-point-detection.
(3) Calculate the accuracy every epoch in PyTorch - Stack Overflow. https://stackoverflow.com/questions/51503851/calculate-the-accuracy-every-epoch-in-pytorch.
(4) GitHub - LEE-SEON-WOO/TSCP2_pytorch: Time Series Change Point Detection .... https://github.com/lee-seon-woo/tscp2_pytorch.
(5) GitHub - timeseriesAI/tsai: Time series Timeseries Deep Learning .... https://github.com/timeseriesAI/tsai.
(6) undefined. https://github.com/deepcharles/ruptures.
(7) changepoint-detection · GitHub Topics · GitHub. https://github.com/topics/changepoint-detection.
(8) GitHub - OctoberChang/klcpd_code: Kernel Change-point Detection with .... https://github.com/OctoberChang/klcpd_code.
(9) GitHub - HolyBayes/TIRE_pytorch: Pytorch implementation of TIRE for .... https://github.com/HolyBayes/TIRE_pytorch.

The code was presented twice. It uses ruptures, a Python library for off-line change point detection. Ruptures is a great library, but it doesn't use PyTorch. The code imports PyTorch but never uses it.

A more serious problem arose when I tried to run the code. I received this error:

ImportError                               Traceback (most recent call last)
Cell In[1], line 3
      1 import torch
      2 import numpy as np
----> 3 from ruptures.costs import LeastAbsoluteDeviation
      4 from ruptures.model import Binseg
      6 # Generate a random time series

ImportError: cannot import name 'LeastAbsoluteDeviation' from 'ruptures.costs' (/home/bill/mambaforge/envs/nn/lib/python3.11/site-packages/ruptures/costs/__init__.py)

I'm not sure where the line from ruptures.costs import LeastAbsoluteDeviation is coming from. Searching the web for it finds no results. I guess ChatGPT made it up.

Also, this line will create an error:

from ruptures.model import Binseg


ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 1
----> 1 from ruptures.model import Binseg

ModuleNotFoundError: No module named 'ruptures.model'
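Both failures are import problems, and they can be checked before running a whole script. The following is a minimal sketch using only the standard library; the helper name `check_import` is made up for illustration, and it is demonstrated on `math` so that it runs without ruptures installed.

```python
import importlib


def check_import(module_name, names):
    """Report whether a module can be imported and which of `names` it lacks."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The module itself does not exist (or one of its dependencies is missing)
        return {"module_found": False, "missing": list(names)}
    missing = [name for name in names if not hasattr(module, name)]
    return {"module_found": True, "missing": missing}


# Demonstrated on the standard library so it runs anywhere
print(check_import("math", ["sqrt", "LeastAbsoluteDeviation"]))
print(check_import("no_such_module_xyz", ["Binseg"]))
```

With ruptures installed, the same helper could be pointed at `"ruptures.costs"` or `"ruptures.model"` with the names from the generated code.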

I told ChatGPT.

This code has an error

It replied:

I'm sorry to hear that. Could you please provide more details about the error? What is the error message that you are seeing? This will help me understand the issue better and provide you with a more accurate solution.

In the meantime, here are some general tips that might help you troubleshoot the issue:

- Check if all the required libraries and dependencies are installed and up-to-date.
- Make sure that the input data is in the correct format and shape.
- Double-check the code for any syntax errors or typos.
- Try running the code on a smaller dataset or with simpler parameters to isolate the issue.

If you can provide more information about the error, I'll be happy to help you further. Let me know!

Source: Conversation with Bing, 9/16/2023
(1) TorchVision Object Detection Finetuning Tutorial  PyTorch Tutorials 2. .... https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html.
(2) Anomaly detection points to LayerNormalization operation for in-place error. https://discuss.pytorch.org/t/anomaly-detection-points-to-layernormalization-operation-for-in-place-error/180372.
(3) Object detection fine tuning model initialisation error - PyTorch Forums. https://discuss.pytorch.org/t/object-detection-fine-tuning-model-initialisation-error/159940.
(4) torch.utils.checkpoint  PyTorch 2.0 documentation. https://pytorch.org/docs/stable/checkpoint.html.
(5) undefined. https://github.com/gautamchitnis/cocoapi.git@cocodataset-master.

I prompted ChatGPT with this query.

There is no function called LeastAbsoluteDeviation in ruptures.cost

Here's what ChatGPT had to say:

I apologize for the confusion earlier. It seems that the `LeastAbsoluteDeviation` function is not available in the `ruptures.costs` module. Instead, you can try using the `Pelt` model with the `l1` cost function to detect change points in your time series data. Here's an example of how you can modify the code:

```python
import torch
import numpy as np
from ruptures.costs import cost_l1
from ruptures.model import Pelt

# Generate a random time series
np.random.seed(0)
n_samples, n_features = 500, 1
signal = np.random.randn(n_samples, n_features)

# Define the cost function and the model
model = Pelt(model="l1").fit(signal)

# Get the change points
change_points = model.predict(pen=10)

print("Change points:", change_points)
```

In this example, we use the `cost_l1` function from the `ruptures.costs` module and the `Pelt` model to detect change points in your time series data. The `pen` parameter specifies the penalty value for the number of change points.

Please note that this is just one approach to change point detection in time series data. There are other algorithms and libraries available, depending on your specific requirements and preferences. Feel free to explore other options and adapt the code to your needs.

Let me know if you need any further assistance!

This code doesn't work either.

ImportError                               Traceback (most recent call last)
Cell In[9], line 3
      1 import torch
      2 import numpy as np
----> 3 from ruptures.costs import cost_l1
      4 from ruptures.model import Pelt
      6 # Generate a random time series

ImportError: cannot import name 'cost_l1' from 'ruptures.costs' (/home/bill/mambaforge/envs/nn/lib/python3.11/site-packages/ruptures/costs/__init__.py)

I fed that error message to ChatGPT and received this reply:

I’m sorry to hear that you’re encountering an error. Based on the error message you provided, it seems that there is an issue with the ruptures library. Specifically, it appears that the cost_l1 function cannot be imported from the ruptures.costs module.

Here are some general tips that might help you troubleshoot the issue:

- Make sure that the ruptures library is installed and up-to-date. You can check this by running !pip show ruptures in your Jupyter notebook.
- Check if there are any typos or syntax errors in your code. Double-check the spelling of the function name and make sure that it is imported correctly.
- Try importing the function directly using from ruptures.costs.least_absolute_deviation import LeastAbsoluteDeviation instead of from ruptures.costs import LeastAbsoluteDeviation.
- If none of these solutions work, try uninstalling and reinstalling the ruptures library.

If you can provide more information about the error or your specific use case, I’ll be happy to help you further. Let me know!

I told it:

there is no cost_l1 function in ruptures.cose. The correct function is CostL1


I apologize for the confusion earlier. You are correct, the correct function name in the `ruptures.costs` module is `CostL1`. Here's an updated version of the code:

```python
import torch
import numpy as np
from ruptures.costs import CostL1
from ruptures.model import Pelt

# Generate a random time series
np.random.seed(0)
n_samples, n_features = 500, 1
signal = np.random.randn(n_samples, n_features)

# Define the cost function and the model
cost = CostL1().fit(signal)
model = Pelt(custom_cost=cost).fit(signal)

# Get the change points
change_points = model.predict(pen=10)

print("Change points:", change_points)
```

In this updated code, we use the `CostL1` function from the `ruptures.costs` module and the `Pelt` model to detect change points in your time series data. The `pen` parameter specifies the penalty value for the number of change points.

Please note that this is just one approach to change point detection in time series data. There are other algorithms and libraries available, depending on your specific requirements and preferences. Feel free to explore other options and adapt the code to your needs.

Let me know if you need any further assistance!

Source: Conversation with Bing, 9/16/2023
(1) TorchVision Object Detection Finetuning Tutorial  PyTorch Tutorials 2. .... https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html.
(2) Object detection fine tuning model initialisation error. https://discuss.pytorch.org/t/object-detection-fine-tuning-model-initialisation-error/159940.
(3) Anomaly detection points to LayerNormalization operation for in-place error. https://discuss.pytorch.org/t/anomaly-detection-points-to-layernormalization-operation-for-in-place-error/180372.
(4) torch.utils.checkpoint  PyTorch 2.0 documentation. https://pytorch.org/docs/stable/checkpoint.html.
(5) pytorch_lightning.ModelCheckpoint not saving model. https://stackoverflow.com/questions/77092260/pytorch-lightning-modelcheckpoint-not-saving-model.
(6) undefined. https://github.com/gautamchitnis/cocoapi.git@cocodataset-master.
(7) unable to import pytorch-lightning - Stack Overflow. https://stackoverflow.com/questions/66505335/unable-to-import-pytorch-lightning.
(8) python - Cannot import Pytorch [WinError 126] The specified module .... https://stackoverflow.com/questions/61488902/cannot-import-pytorch-winerror-126-the-specified-module-could-not-be-found.
(9) Unable to import pytorch_lightning on google colab. https://stackoverflow.com/questions/66538407/unable-to-import-pytorch-lightning-on-google-colab.
(10) python - Import error when importing torch - Stack Overflow. https://stackoverflow.com/questions/49624501/import-error-when-importing-torch.
(11) undefined. https://github.com/PyTorchLightning/pytorch-lightning.
(12) undefined. https://pytorch.org.

This doesn't work.

ModuleNotFoundError                       Traceback (most recent call last)
Cell In[11], line 4
      2 import numpy as np
      3 from ruptures.costs import CostL1
----> 4 from ruptures.model import Pelt
      6 # Generate a random time series
      7 np.random.seed(0)

ModuleNotFoundError: No module named 'ruptures.model'

I gave that error message as a prompt and received another apology.

I’m sorry to hear that you’re encountering an error. Based on the error message you provided, it seems that there is an issue with the ruptures library. Specifically, it appears that the Pelt function cannot be imported from the ruptures.model module.

Here are some general tips that might help you troubleshoot the issue:

- Make sure that the ruptures library is installed and up-to-date. You can check this by running !pip show ruptures in your Jupyter notebook.
- Check if there are any typos or syntax errors in your code. Double-check the spelling of the function name and make sure that it is imported correctly.
- Try importing the function directly using from ruptures.model.pelt import Pelt instead of from ruptures.model import Pelt.
- If none of these solutions work, try uninstalling and reinstalling the ruptures library.

If you can provide more information about the error or your specific use case, I’ll be happy to help you further. Let me know!

I used this prompt:

There is no module called ruptures.model. It should be from ruptures import Pelt.

I received this:

An error has occurred. Let's start over.

Everything was lost

That was disappointing. This problem seemed to be chronic with Bing Chat. It kept losing the connection or dying mid-answer.

It looks like ChatGPT was resurrecting some old ruptures code from a previous version. In the end, I couldn't get it to come up with anything workable, even ignoring the fact that it wasn't using PyTorch like I intended.

A Second Attempt


Maybe my prompts were too vague. I tried to be more specific.

write python code to read a csv file containing x and y columns, plot the y variable, use the klcpd library to detect change points in the y variable

klcpd is a module for change point detection in time series.

Here is ChatGPT's response.

Sure! Here's an example Python code that reads a CSV file containing `x` and `y` columns, plots the `y` variable, and uses the **klcpd** library to detect change points in the `y` variable:

```python
import pandas as pd
import matplotlib.pyplot as plt
from klcpd import KL_CPD

# Read the CSV file
df = pd.read_csv('your_file.csv')

# Extract the 'x' and 'y' columns
x = df['x']
y = df['y']

# Plot the 'y' variable
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of y variable')

# Use KL-CPD to detect change points in 'y'
model = KL_CPD()
change_points = model.detect_change_points(y)

# Print the detected change points
print('Change points:', change_points)
```

Please make sure to replace `'your_file.csv'` with the actual path to your CSV file. Also, ensure that you have installed the **klcpd** library before running this code. You can find more information about the library and its usage in the [official GitHub repository](https://github.com/HolyBayes/klcpd).

Let me know if you need any further assistance!

Source: Conversation with Bing, 9/21/2023
(1) . https://bing.com/search?q=klcpd+library.
(2) Valatie Free Library. https://valatielibrary.net/.
(3) GitHub - HolyBayes/klcpd: PyTorch package for KL-CPD algorithm for .... https://github.com/HolyBayes/klcpd.
(4) King County Library System. https://kcls.org/.
(5) undefined. https://www.linkedin.com/in/artem-ryzhikov-2b6308103/.
(6) undefined. https://www.hse.ru/org/persons/190912317.
(7) Read specific columns from a csv file with csv module?. https://stackoverflow.com/questions/16503560/read-specific-columns-from-a-csv-file-with-csv-module.
(8) coordinates - Python: How to read a text file containing co-ordinates .... https://stackoverflow.com/questions/20735922/python-how-to-read-a-text-file-containing-co-ordinates-in-row-column-format-int.
(9) Visualize data from CSV file in Python - GeeksforGeeks. https://www.geeksforgeeks.org/visualize-data-from-csv-file-in-python/.
(10) Accessing column data from a CSV file in Python. https://stackoverflow.com/questions/17186893/accessing-column-data-from-a-csv-file-in-python.
(11) csv - How to read specific columns and Rows in Python? - Stack Overflow. https://stackoverflow.com/questions/42920792/how-to-read-specific-columns-and-rows-in-python.

This doesn't work, even if the correct file name is supplied.

TypeError                                 Traceback (most recent call last)
Cell In[2], line 19
     16 plt.title('Plot of y variable')
     18 # Use KL-CPD to detect change points in 'y'
---> 19 model = KL_CPD()
     20 change_points = model.detect_change_points(y)
     22 # Print the detected change points

TypeError: KL_CPD.__init__() missing 1 required positional argument: 'D'

I gave ChatGPT the error message as a prompt. It apologized, but it didn't give me much help.

I apologize for the error you encountered. It seems that there was a **TypeError** when initializing the **KL_CPD** model. This error occurs when the required arguments for the model are not provided correctly.

To resolve this issue, please make sure that you have installed the **klcpd** library correctly. You can install it using the following command:

```python
pip install git+https://github.com/HolyBayes/klcpd
```

After installing the library, you need to pass the appropriate arguments to the **KL_CPD** constructor. The required arguments depend on the specific implementation of the **KL_CPD** class you are using. Please refer to the documentation or examples provided by the library to understand how to correctly initialize the model.

If you are still facing issues after following these steps, please provide more details about the error message and any additional information that might be helpful in diagnosing the problem.

I hope this helps! Let me know if you have any further questions.

Source: Conversation with Bing, 9/21/2023
(1) GitHub - HolyBayes/klcpd: PyTorch package for KL-CPD algorithm for .... https://github.com/HolyBayes/klcpd.
(2) Getting errors in running the code with python3 #3 - GitHub. https://github.com/OctoberChang/klcpd_code/issues/3.
(3) OctoberChang/klcpd_code - GitHub. https://github.com/OctoberChang/klcpd_code.
(4) undefined. https://www.linkedin.com/in/artem-ryzhikov-2b6308103/.
(5) undefined. https://www.hse.ru/org/persons/190912317.

At this point, I was the one who was doing the debugging.
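One way to find out what a constructor requires, without guessing, is `inspect.signature` from the standard library. The sketch below uses a stand-in class, since the point is the technique rather than the library: `StandInDetector`, its `window` parameter, and the helper `required_parameters` are all hypothetical and are not the real `KL_CPD` signature.

```python
import inspect


class StandInDetector:
    """Hypothetical stand-in for a class like KL_CPD; not the real implementation."""

    def __init__(self, D, window=25):
        self.D = D
        self.window = window


def required_parameters(cls):
    """Return the constructor parameters of `cls` that have no default value."""
    signature = inspect.signature(cls)  # `self` is already excluded
    return [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
        and param.kind in (param.POSITIONAL_ONLY,
                           param.POSITIONAL_OR_KEYWORD,
                           param.KEYWORD_ONLY)
    ]


print(required_parameters(StandInDetector))
```

Calling the same helper on the imported class would list the arguments that must be supplied, which is the information the TypeError above was pointing at.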

My prompt:

the proper method to call KL_CPD is model = KL_CPD(D = 1)

ChatGPT came back with this. It's still incorrect.

I apologize for the confusion. You are correct! The proper method to call the **KL_CPD** model is by passing the required arguments to the constructor. In this case, you need to provide the value of `D` as `1`. Here's the corrected code:

```python
import pandas as pd
import matplotlib.pyplot as plt
from klcpd import KL_CPD

# Read the CSV file
df = pd.read_csv('your_file.csv')

# Extract the 'x' and 'y' columns
x = df['x']
y = df['y']

# Plot the 'y' variable
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of y variable')

# Use KL-CPD to detect change points in 'y'
model = KL_CPD(D=1)
change_points = model.detect_change_points(y)

# Print the detected change points
print('Change points:', change_points)
```

Please make sure to replace `'your_file.csv'` with the actual path to your CSV file. If you have any further questions or need additional assistance, feel free to ask!

This also produced an error.

AttributeError                            Traceback (most recent call last)
Cell In[6], line 20
     18 # Use KL-CPD to detect change points in 'y'
     19 model = KL_CPD(D = 1)
---> 20 change_points = model.detect_change_points(y)
     22 # Print the detected change points
     23 print('Change points:', change_points)

File ~/mambaforge/envs/nn/lib/python3.11/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
   1612     if name in modules:
   1613         return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
   1615     type(self).__name__, name))

AttributeError: 'KL_CPD' object has no attribute 'detect_change_points'

I tried to give ChatGPT the proper syntax. klcpd.model doesn't have a function called detect_change_points. ChatGPT apparently made that one up.

That doesn't work either. The proper line is model.fit(y); change_points = model.predict(y). Plot y and the change points

I got this back.

I apologize for the confusion. You are correct! The proper method to detect change points using the **KL_CPD** model is by calling the `fit` method followed by the `predict` method. Here's the corrected code:

```python
import pandas as pd
import matplotlib.pyplot as plt
from klcpd import KL_CPD

# Read the CSV file
df = pd.read_csv('your_file.csv')

# Extract the 'x' and 'y' columns
x = df['x']
y = df['y']

# Plot the 'y' variable
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of y variable')

# Use KL-CPD to detect change points in 'y'
model = KL_CPD()
model.fit(y)
change_points = model.predict(y)

# Plot 'y' and the detected change points
plt.plot(x, y)
plt.plot(x[change_points], y[change_points], 'ro')
plt.show()

```

Please make sure to replace `'your_file.csv'` with the actual path to your CSV file. If you have any further questions or need additional assistance, feel free to ask!

Source: Conversation with Bing, 9/21/2023
(1) GitHub - HolyBayes/klcpd: PyTorch package for KL-CPD algorithm for .... https://github.com/HolyBayes/klcpd.
(2) Getting errors in running the code with python3 #3 - GitHub. https://github.com/OctoberChang/klcpd_code/issues/3.
(3) OctoberChang/klcpd_code - GitHub. https://github.com/OctoberChang/klcpd_code.
(4) undefined. https://www.linkedin.com/in/artem-ryzhikov-2b6308103/.
(5) undefined. https://www.hse.ru/org/persons/190912317.

We're now going in circles. The error model = KL_CPD() is back. There is at least one other error. plt.plot(x[change_points], y[change_points], 'ro') will fail because change_points is not an integer and thus not a proper index into x.
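To use detector output as an index, it first has to be turned into integer positions. The following is a minimal sketch, assuming the detector returns one floating-point score per sample; the `scores` array, the threshold, and the helper `scores_to_indices` are made up for illustration and are not part of klcpd.

```python
import numpy as np


def scores_to_indices(scores, threshold):
    """Return integer positions of local maxima in `scores` above `threshold`."""
    s = np.asarray(scores, dtype=float)
    # An interior point is a peak if it rises above its left neighbour,
    # is at least as large as its right neighbour, and exceeds the threshold.
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:]) & (s[1:-1] > threshold)
    return np.flatnonzero(is_peak) + 1  # +1 because the slice starts at index 1


x = np.arange(10)
y = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0, 0.0, 0.0])
# Made-up scores standing in for a detector's per-sample output
scores = np.array([0.0, 0.1, 0.2, 0.9, 0.3, 0.1, 0.2, 0.8, 0.2, 0.0])

idx = scores_to_indices(scores, threshold=0.5)
print(idx)             # integer positions, safe to use as an index
print(x[idx], y[idx])  # the points that would be marked on the plot
```

With pandas Series, as in the code above, the equivalent lookup would be `x.iloc[idx]` and `y.iloc[idx]`.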

At this point, I gave up. Debugging ChatGPT's code was much harder than simply writing it myself would have been, with a little help from searching the web for the correct syntax.

Here's the code I debugged myself, starting from ChatGPT's version.

import pandas as pd
import matplotlib.pyplot as plt
from klcpd import KL_CPD
import torch

# Read the CSV file
df = pd.read_csv('data/ts_chatgpt.csv')

# Extract the 'x' and 'y' columns
x = df['x']
y = df['y']

# Plot the 'y' variable
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of y variable')

# Use KL-CPD to detect change points in 'y'
model = KL_CPD(D = 1).to(torch.device('cuda'))
model.fit(y)
change_points = model.predict(y)

# Plot 'y' and the detected change points
plt.plot(x, y)
plt.plot(change_points, 'ro')
plt.show()

# Print the detected change points
#print('Change points:', change_points)

It produces this plot.

You can see it finds the change points, so it's possible to find change points with klcpd.

Programming with ChatGPT seems like pair programming with a very obtuse junior developer. Maybe I'm just not good at prompts. ChatGPT did OK with a simple task like generating some fake data. It didn't help at all with the problem of change point detection.

I think I'll give GitHub Copilot a try and see if it is any easier to use.