TimeGPT-1

I didn't think this would work. I was right. TimeGPT-1 from Nixtla is a pretrained transformer model for time series forecasting and anomaly detection. It was trained on thousands of time series. The idea is that, just like text, time series have recognizable patterns, and a GPT can leverage these patterns to predict the next value in the series just as an LLM predicts the next token in a sequence of tokens.

I was interested in seeing how it performs on recent temperature anomaly data. A temperature anomaly is the difference between a measured temperature and a reference value; for example, a June that runs 0.5 deg C warmer than the reference June average has an anomaly of +0.5 deg C. Here's an example plot.

The monthly temperature anomaly data for the plot was downloaded from https://www.metoffice.gov.uk/hadobs/hadcrut5/data/HadCRUT.5.0.2.0/download.html. The plot shows the temperature anomaly for June of each year. The values are temperature anomalies (deg C) relative to the average global temperature over 1961-1990.

Notice the jump in the temperature anomaly at the last two points (2023 and 2024). 

The reason I thought that TimeGPT-1 would not work well on this data is that global temperature is driven by complex forcing functions. This isn't a flaw in TimeGPT: global temperature can't easily be analyzed from the statistical properties of the series alone. Making accurate predictions requires a fairly complex physical model.

Forecasting

Nixtla provides an API for accessing TimeGPT-1. The first step is to get an API key from https://dashboard.nixtla.io/sign_in. If the API key is stored in the OS environment, the client will read it; otherwise you can pass it to Nixtla via the API.

Most of the examples on the Nixtla site are in Python, but there is also an R package, nixtlar. I decided to try the R version because I haven't written any R code in a while.
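Registering the key from R looks something like this (a sketch based on the nixtlar documentation; NIXTLA_API_KEY is, as far as I can tell, the environment variable the package checks):

library(nixtlar)

# Option 1: put the key where nixtlar can find it
Sys.setenv(NIXTLA_API_KEY = "your_api_key")

# Option 2: register it explicitly for the session
nixtla_set_api_key(api_key = "your_api_key")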

TimeGPT-1 expects at least three columns in the data frame passed to the API calls: ds for time values, y for the time series values, and unique_id to indicate the data group (useful when forecasting several series at once). In our case, we will use a single identifier for the whole set. If your columns have different names, they can be mapped in the API calls, as sketched below.
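For example, something like the following should work; note that the parameter names here (id_col, time_col, target_col) follow Nixtla's Python SDK and are my assumption for the R client, and the data frame column names are hypothetical:

# map differently named columns to what TimeGPT expects (hypothetical columns)
fcst <- nixtla_client_forecast(df, h = 2,
                               id_col = "series_name",
                               time_col = "date",
                               target_col = "anomaly")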

First, we read the HadCRUT data into a data frame.

> library(tidyverse)
> df_temp_anomaly <- read_csv("data/HadCRUT.5.0.2.0.analysis.summary_series.global.monthly.csv", name_repair = "universal")
> head(df_temp_anomaly)
# A tibble: 6 × 4
  Time    Anomaly..deg.C. Lower.confidence.limit..2.5.. Upper.confidence.limit..97.5..
  <chr>             <dbl>                         <dbl>                          <dbl>
1 1850-01          -0.675                        -0.982                        -0.367 
2 1850-02          -0.333                        -0.701                         0.0341
3 1850-03          -0.591                        -0.934                        -0.249 
4 1850-04          -0.589                        -0.898                        -0.279 
5 1850-05          -0.509                        -0.762                        -0.256 
6 1850-06          -0.344                        -0.609                        -0.0790
> 

Next, we reformat the data into a form TimeGPT-1 accepts: keep only the June rows, add a unique_id column, rename Time and Anomaly..deg.C. to ds and y, and convert ds to a timestamp.

> df3 <- df_temp_anomaly %>% 
      select(c(Time, Anomaly..deg.C.)) %>% 
      filter(grepl("-06", Time)) %>% 
      add_column(unique_id = "1") %>% 
      rename(ds = Time) %>% 
      mutate(ds = paste(ds, "-15 12:00:00", sep = '')) %>% 
      rename(y = Anomaly..deg.C.) %>% 
      mutate(ds = as.POSIXct(ds))
> head(df3)
# A tibble: 6 × 3
  ds                        y unique_id
  <dttm>                <dbl> <chr>    
1 1850-06-15 12:00:00 -0.344  1        
2 1851-06-15 12:00:00 -0.137  1        
3 1852-06-15 12:00:00 -0.0837 1        
4 1853-06-15 12:00:00 -0.142  1        
5 1854-06-15 12:00:00 -0.299  1        
6 1855-06-15 12:00:00 -0.333  1        
> 

Finally, call the API functions to forecast the temperature anomaly for 2023 and 2024. h is the number of steps to forecast; the first 173 rows cover 1850 through 2022, so the last two observations (2023 and 2024) are held out.
  
> library(nixtlar)
> nixtla_client_fcst <- nixtla_client_forecast(df3[1:173, ], h = 2, level = c(80,95), freq ="Y")
> nixtla_client_plot(df3, nixtla_client_fcst, max_insample_length = 200)
> 

nixtla_client_fcst contains the predictions.

> head(nixtla_client_fcst)
  unique_id                  ds   TimeGPT TimeGPT-lo-95 TimeGPT-lo-80 TimeGPT-hi-80 TimeGPT-hi-95
1         1 2023-06-15 12:00:00 0.8353620     0.5658436     0.6342132      1.036511      1.104880
2         1 2024-06-15 12:00:00 0.8588773     0.5448156     0.5791281      1.138626      1.172939
> 

The plot reveals how far off the predictions were.


The jump during the 2023-2024 period couldn't be accurately predicted. That's not surprising. I don't think any algorithm that relies only on past data could forecast how rapidly the global temperature is increasing.
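To put a number on the miss, we can compare the held-out observations with the point forecasts, using the df3 and nixtla_client_fcst objects from above:

# rows 174 and 175 of df3 hold the actual June 2023 and 2024 anomalies
actual <- df3$y[174:175]
predicted <- nixtla_client_fcst$TimeGPT
actual - predicted  # positive values mean TimeGPT forecast too low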

We can also see how the forecasts fit the past. Here are the results for forecasting the six years after 2000 (the first 151 rows cover 1850 through 2000).
 
> nixtla_client_fcst <- nixtla_client_forecast(df3[1:151, ], h = 6, level = c(80,95), freq ="Y")
> nixtla_client_plot(df3, nixtla_client_fcst, max_insample_length = 200)


This is a bit better, but still can't keep up with the global climate forcing.

Despite not doing well on the HadCRUT data, I think this approach has potential for series that aren't driven by complex dynamic forces.

Privacy

When you use the TimeGPT-1 API, your data is sent to Nixtla. If you are concerned about third parties using your proprietary data, read the Nixtla privacy notice carefully. In particular,

As noted above, we may use Content you provide us to improve our Services, for example to train the models that power TimeGPT. Fill this form to opt out of our use of your Content to train our models.

Pricing

Nixtla has an interesting pricing structure for TimeGPT. There is an initial 30-day free trial that allows a generous 10,000 API calls. After the trial period, users on the free tier are allowed 50 API calls. Beyond that, the price is $99 per 10,000 API calls (roughly a cent per call), with higher-priced plans for more API calls per month.

Test Driving DeepSeek-R1 Part 3

 


I have trouble with PyMC. PyMC is a probabilistic programming library for Python for building Bayesian models. Whenever I try to use it, I end up searching the web or asking an LLM for help with current syntax. There are two generations in circulation, PyMC3 and PyMC5, and their APIs differ. PyMC5 is still under development. My trouble is that LLMs confuse the two and mix up the syntax of the versions. Searching the web leads to more confusion because many posts don't indicate which version they are referring to.
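One way to keep the generations straight is to check what is actually installed. The package names here are real: the older line ships as pymc3, the current line as pymc.

import pymc as pm
print(pm.__version__)  # e.g. "5.x" for the current generation
# The legacy generation is a separate package, imported as `import pymc3 as pm`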

ChatGPT o3-mini had just been released, and I wanted to see how DeepSeek-R1 stacked up against it on a simple programming task using PyMC. I gave both the same simple prompt.

generate python code to locate change points in a time series using pymc5

ChatGPT o3-mini did a reasonable job. The code worked without modification, and it generated some nice graphs using ArviZ to illustrate the solution.

import numpy as np
import matplotlib.pyplot as plt
import pymc as pm
import arviz as az

# ---------------------------
# Generate synthetic time series data
# ---------------------------
np.random.seed(42)

n = 100  # total number of time points
true_cp = 60  # true change point position (index)
x = np.arange(n)

# Create data: first segment with mean 0, second segment with mean 3
y = np.concatenate([
    np.random.normal(loc=0, scale=1, size=true_cp),
    np.random.normal(loc=3, scale=1, size=n - true_cp)
])

# ---------------------------
# Define the PyMC model
# ---------------------------
with pm.Model() as model:
    # Change point: a discrete random variable between 0 and n
    cp = pm.DiscreteUniform("cp", lower=0, upper=n)

    # Parameters for the two segments:
    mu1 = pm.Normal("mu1", mu=0, sigma=10)
    mu2 = pm.Normal("mu2", mu=0, sigma=10)
    sigma1 = pm.HalfNormal("sigma1", sigma=1)
    sigma2 = pm.HalfNormal("sigma2", sigma=1)
    
    # Use pm.math.switch to select parameters based on whether the time index is
    # before or after the change point. Note: the comparison (x < cp) works because
    # PyMC will broadcast x (a NumPy array) to the appropriate tensor.
    mu = pm.math.switch(x < cp, mu1, mu2)
    sigma = pm.math.switch(x < cp, sigma1, sigma2)
    
    # Likelihood: assume normally distributed observations with piecewise mean/sigma.
    y_obs = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    
    # Sample from the posterior
    trace = pm.sample(2000, tune=1000, return_inferencedata=True, target_accept=0.95)

# ---------------------------
# Examine and visualize the results
# ---------------------------
# Print a summary for the change point
print(az.summary(trace, var_names=["cp"]))

# Plot trace and posterior distributions for key parameters
az.plot_trace(trace, var_names=["cp", "mu1", "mu2", "sigma1", "sigma2"])
plt.tight_layout()
plt.show()

# Plot the observed data along with the inferred change point
cp_posterior = trace.posterior["cp"].values.flatten()
# Use the median of the posterior for the change point estimate
cp_estimate = int(np.median(cp_posterior))
print(f"Estimated change point: {cp_estimate}")

plt.figure(figsize=(10, 6))
plt.plot(x, y, "o-", label="Data")
plt.axvline(cp_estimate, color="red", linestyle="--", label=f"Estimated CP (index {cp_estimate})")
plt.axvline(true_cp, color="green", linestyle=":", label=f"True CP (index {true_cp})")
plt.xlabel("Time Index")
plt.ylabel("Observed Value")
plt.legend()
plt.title("Change Point Detection using PyMC 5")
plt.show()

DeepSeek also generated working code. It didn't supply the plots that ChatGPT produced, but the code was adequate and the plots could have been added easily.

import numpy as np
import pymc as pm
import arviz as az
import matplotlib.pyplot as plt

# Generate synthetic time series data with a change point
np.random.seed(42)
n_points = 100
change_point = 50
time = np.arange(n_points)

# Create a time series with a change in mean
mu1, mu2 = 0, 5  # Means before and after the change point
sigma = 1  # Standard deviation
data = np.concatenate([
    np.random.normal(mu1, sigma, change_point),
    np.random.normal(mu2, sigma, n_points - change_point)
])

# Plot the time series
plt.figure(figsize=(10, 4))
plt.plot(time, data, label="Time Series Data")
plt.axvline(change_point, color="red", linestyle="--", label="True Change Point")
plt.title("Time Series with Change Point")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.show()

# Bayesian model for change point detection
with pm.Model() as model:
    # Prior for the change point (uniform over the time range)
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_points - 1)

    # Priors for the means before and after the change point
    mu1 = pm.Normal("mu1", mu=0, sigma=10)
    mu2 = pm.Normal("mu2", mu=0, sigma=10)

    # Standard deviation (assumed constant)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # Mean of the time series (changes at the change point)
    mean = pm.math.switch(time >= tau, mu2, mu1)

    # Likelihood (observed data)
    likelihood = pm.Normal("likelihood", mu=mean, sigma=sigma, observed=data)

    # Sample from the posterior
    trace = pm.sample(2000, tune=1000, chains=2, return_inferencedata=True)

# Plot the posterior distribution of the change point
az.plot_posterior(trace, var_names=["tau"], hdi_prob=0.95)
plt.axvline(change_point, color="red", linestyle="--", label="True Change Point")
plt.title("Posterior Distribution of Change Point")
plt.xlabel("Time")
plt.legend()
plt.show()

# Summary of the posterior distribution
summary = az.summary(trace, var_names=["tau", "mu1", "mu2", "sigma"])
print(summary)


This was a very simple test, but DeepSeek continues to amaze.

Test Driving DeepSeek-R1 Part 2

In a previous post, I looked at DeepSeek-R1's code-generating capabilities. In this post, I will examine its ability to answer a more general question.

Willy Vlautin


Willy Vlautin
Photo Larry D. Moore, CC BY 4.0, Wikimedia Commons.

Willy Vlautin is my favorite contemporary American author. Vlautin is the author of several novels including The Motel Life, Northline, Lean on Pete, The Free, Don't Skip Out On Me, The Night Always Comes, and The Horse. He is also a songwriter and musician, recording and writing with the bands Richmond Fontaine and The Delines.

I have read all of Vlautin's novels. I decided to ask ChatGPT and DeepSeek for authors similar to Willy Vlautin.

recommend some authors similar to Willy Vlautin

First, I tried ChatGPT o1-mini. Here was the response.


If you appreciate Willy Vlautin's gritty, character-driven narratives that delve into the struggles of everyday life, you might enjoy the following authors who explore similar themes and styles:

1. **Richard Ford**
   - *Notable Works:* *The Sportswriter*, *Independence Day*
   - *Why Similar:* Ford's novels often focus on the complexities of middle-class American life with deep psychological insight, much like Vlautin's exploration of working-class struggles.

2. **Denis Johnson**
   - *Notable Works:* *Tree of Smoke*, *Jesus' Son*
   - *Why Similar:* Johnson's writing combines lyrical prose with raw, gritty subject matter, capturing the depths of human experience in a manner reminiscent of Vlautin.

3. **Charles Bukowski**
   - *Notable Works:* *Post Office*, *Factotum*
   - *Why Similar:* Bukowski's semi-autobiographical novels depict the bleak and often harsh realities of life, echoing Vlautin's unflinching portrayal of his characters' lives.

4. **Raymond Carver**
   - *Notable Works:* *What We Talk About When We Talk About Love*, *Cathedral*
   - *Why Similar:* Carver's minimalist style and focus on ordinary, struggling individuals parallel Vlautin's narrative approach and themes.

5. **Hubert Selby Jr.**
   - *Notable Works:* *Last Exit to Brooklyn*, *Requiem for a Dream*
   - *Why Similar:* Selby Jr.'s intense, raw storytelling and exploration of marginalized lives align closely with Vlautin's thematic interests.

6. **John Steinbeck**
   - *Notable Works:* *The Grapes of Wrath*, *East of Eden*
   - *Why Similar:* Steinbeck's profound empathy for the working class and his ability to depict societal struggles resonate with Vlautin's narratives.

7. **Jonathan Tropper**
   - *Notable Works:* *This Book Will Save Your Life*, *The Married Man*
   - *Why Similar:* Tropper blends humor with poignant character studies, exploring personal and socioeconomic challenges in a way that complements Vlautin's storytelling.

8. **Cormac McCarthy**
   - *Notable Works:* *The Road*, *No Country for Old Men*
   - *Why Similar:* McCarthy's stark, powerful prose and focus on survival and morality mirror the intense, character-focused narratives found in Vlautin's work.

9. **Steve Jones**
   - *Notable Works:* *Fool's Gold*, *Women’s Prison*
   - *Why Similar:* Jones writes about marginalized individuals and subcultures with authenticity and depth, similar to Vlautin's focus on the fringes of society.

10. **Emma Donoghue**
    - *Notable Works:* *Room*, *Frog Music*
    - *Why Similar:* Donoghue's ability to delve deep into her characters' psyches and present their struggles with empathy and realism aligns well with Vlautin's literary style.

These authors offer a range of perspectives and styles that complement Willy Vlautin's focus on the human condition, societal challenges, and richly drawn characters. Exploring their works should provide a satisfying continuation of the themes and emotional depth you enjoy in Vlautin's novels.

It's a good selection of authors and they do have similar styles and themes.

Here's what the DeepSeek-R1 web version returned for the same prompt. 

Willy Vlautin is known for his gritty, heartfelt novels that often explore themes of working-class struggles, loneliness, and resilience. If you enjoy his writing, you might appreciate authors with similar styles or themes. Here are a few recommendations:
1. **Larry Brown** - Known for his raw and unflinching portrayals of rural life and working-class characters, Brown's works like *Joe* and *Fay* resonate with the same emotional depth as Vlautin's.
2. **Daniel Woodrell** - Often called the "poet of the Ozarks," Woodrell's novels, such as *Winter's Bone* and *The Outlaw Album*, delve into the lives of marginalized people in rural America, much like Vlautin's stories.
3. **Kent Haruf** - Haruf's novels, including *Plainsong* and *Eventide*, focus on small-town life and the quiet struggles of ordinary people, offering a similar sense of empathy and humanity.
4. **Raymond Carver** - A master of minimalist fiction, Carver's short stories, like those in *What We Talk About When We Talk About Love*, capture the quiet desperation and resilience of everyday people.
5. **Denis Johnson** - Johnson's works, such as *Jesus' Son* and *Train Dreams*, often explore themes of addiction, redemption, and the lives of the downtrodden, with a poetic and gritty edge.
6. **Richard Ford** - Known for his *Bascombe* series and *Canada*, Ford's writing often examines the complexities of ordinary lives and the landscapes of the American West.
7. **Annie Proulx** - Proulx's novels, like *The Shipping News* and *Brokeback Mountain*, focus on the struggles of individuals in harsh environments, with a strong sense of place and character.
8. **Donald Ray Pollock** - His works, such as *The Devil All the Time* and *Knockemstiff*, are dark, gritty, and deeply rooted in the lives of working-class characters in rural America.
These authors share a commitment to exploring the lives of ordinary people with compassion and authenticity, much like Willy Vlautin. If you enjoy his storytelling, you'll likely find something to appreciate in their works as well.

This seems like a good answer, similar to ChatGPT's.

I tried the same prompt on the 7B version of DeepSeek-R1.


# Default 7B model (4.7GB - ideal for consumer GPUs)
$ ollama run deepseek-r1



>>> recommend some authors similar to Willy Vlautin
<think>
Okay, so I need to find some authors similar to Willy Vlautin. Let me start by understanding what
makes him unique or popular. From the previous response, it seems he's known for his dark fantasy
and gothic themes with strong female leads. His writing has emotional depth and world-building that
feels expansive.

First, I'll think about other authors in the same genre—dark fantasy and gothic fiction. Authors
like Brandon Sanderson come to mind because he's well-known for his dark fantasy, but maybe not as
much for the emotional aspects. Then there's Anne Bishop; she writes urban fantasy with strong
female characters, which is similar.

I should also consider authors who blend fantasy with mystery or psychological elements since those
could be related to Willy Vlautin's style. Maybe someone like Joe Abercrombie? He's known for dark
fantasy and has strong, complex female protagonists, so that might fit.

Looking at the previous suggestions, I see names like Alafarea Baumeister and TamsynIIDalglan. Those
seem to be in similar sub-genres within gothic or urban fantasy. I should check if these authors are
frequently compared with Vlautin.

I wonder about authors who focus on dark themes but have lighter elements, so perhaps someone like
Sarah Reeswig? She's written some steampunk and has strong female leads, which might be comparable.

Also, there's the author "V," who wrote "The House of Hounds." That sounds like it fits into a
similar speculative fiction sub-genre with strong characters and world-building.

I should also think about newer authors or those who have gained popularity recently in these
genres. Maybe someone like C.J. Box if they're into psychological suspense, though he's more on the
thriller side.

Overall, I need to compile a list of authors known for dark fantasy, gothic fiction, strong female
characters, emotional depth, and well-crafted worlds. Making sure each author has elements that
match Willy Vlautin's style will be key.
</think>

Here is a curated list of authors who share similarities with Willy Vlautin, focusing on dark
fantasy, gothic fiction, strong female leads, emotional depth, and expansive world-building:

1. **Brandon Sanderson** - Renowned for his dark fantasy, though more action-oriented, his intricate
worlds resonate similarly.

2. **Anne Bishop** - Known for urban fantasy with compelling female protagonists, her themes align
with Vlautin's dark and atmospheric style.

3. **Joe Abercrombie** - Celebrated for dark fantasy and complex female characters, his work shares
emotional and world-building depth.

4. **Alafarea Baumeister** - Specializes in gothic horror with strong narratives that evoke the same
rich, mysterious atmospheres.

5. **TamsynIIDalglan** - Focuses on gothic fantasy with dark elements and complex female
protagonists, fitting Willy Vlautin's style.

6. **Sarah Reeswig** - Combines steampunk with dark themes and strong female leads, offering a
unique blend within speculative fiction.

7. **V** (Author of "The House of Hounds") - Known for speculative fiction with intricate worlds and
compelling characters.

8. **C.J. Box** - While primarily in psychological suspense, his work delves into thriller elements
suitable for Vlautin's genre overlap.

Each author offers a unique perspective within their respective genres, contributing to a rich
tapestry of storytelling that resonates with fans of Willy Vlautin's style.

Not even close. I would never describe Vlautin's work as dark fantasy or gothic!

I tried the same prompt with the 32B parameter model. It was slow, but the results weren't bad.
 
$ ollama run deepseek-r1:32b
>>> recommend authors similar to Willy Vlautin
<think>
Okay, so I need to find authors similar to Willy Vlautin. Let me think about what makes his writing
unique. He's known for gritty settings and flawed characters, often dealing with themes of isolation
and the American experience. His stories have a raw, emotional intensity.

First, I should consider other authors who write in a similar genre—maybe literary fiction or
hardboiled crime. People like Denis Johnson come to mind because he also writes about gritty,
morally complex characters. Then there's Jim Thompson; his noir style might align well with
Vlautin's work.

I should think about themes as well. Vlautin often explores addiction and the struggles of
working-class people. Authors like Tom DePastino or George Pelecanos focus on similar issues. Oh,
and Donald Ray Pollock has that gritty rural setting which could be a good match.

The pacing is another aspect—Vlautin's stories move quickly with a lot of tension. Bret Easton Ellis
writes in a way that's fast-paced and intense too. Maybe Raymond Chandler? He's more classic noir
but still captures that edgy vibe.

I should also consider the emotional depth. Richard Price has that ability to dig deep into
characters' psyches, which is something Vlautin does well. Denis Johnson again fits here with his
focus on damaged individuals.

Some newer authors might be similar too. Jason Mott or C.E. Morgan come to mind because they deal
with tough subjects and have a strong narrative voice.

I need to make sure I cover a range of recommendations, not just the obvious ones. Maybe include
some international authors as well, like Jo Nesbø, who has that dark, emotional storytelling style.

Wait, did I miss any? Let me see—Ellison Cobb writes about addiction and isolation, which fits. And
Megan Abbott's noir with psychological depth could resonate with Vlautin fans too.

I think that covers a good range of authors across different sub-genres but all share elements like
gritty settings, complex characters, and emotional intensity.
</think>

To find authors similar to Willy Vlautin, consider the following recommendations, each capturing
elements of his unique style:

1. **Denis Johnson** - Known for gritty, morally complex characters and themes of isolation.

2. **Jim Thompson** - A master of noir with a focus on flawed characters in intense situations.

3. **Tom DePastino** - Explores addiction and the struggles of working-class individuals.

4. **George Pelecanos** - Focuses on urban settings and moral ambiguity, blending crime with social
commentary.

5. **Donald Ray Pollock** - Offers a gritty rural perspective with dark humor and complex
characters.

6. **Bret Easton Ellis** - Known for fast-paced, intense narratives exploring societal issues.

7. **Raymond Chandler** - Classic noir style with a focus on edgy, atmospheric storytelling.

8. **Richard Price** - Delves deep into characters' psyches, capturing their struggles and
complexities.

9. **Jason Mott** - Deals with tough subjects like addiction and poverty with emotional depth.

10. **C.E. Morgan** - Combines lyrical prose with exploration of social issues in rural settings.

11. **Jo Nesbø** (Norway) - Offers dark, psychological crime stories with a focus on family
dynamics.

12. **Ellison Cobb** - Writes about addiction and isolation, capturing the rawness of life's
struggles.

13. **Megan Abbott** - Known for noir fiction with psychological depth and strong narrative tension.

These authors share Vlautin's themes of gritty settings, flawed characters, emotional intensity, and
exploration of complex social issues.

Not a bad list. I don't know if I would have included Jim Thompson; his novels tend to lack the sympathy for characters that Vlautin has. I'm not familiar with some of the authors, so this response gives me some new ones to check out.

Test Driving DeepSeek-R1

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. 
~ John von Neumann
No telling what you can do with 671B parameters.




DeepSeek-R1 has been all over the tech news during the last few weeks. It scared Wall Street and forced the big guys to rethink strategy. DeepSeek-R1 is an open-source AI language model developed by the Chinese startup DeepSeek. There are lots of LLMs available; the big deal with DeepSeek-R1 is that it is open source, meaning the code and parameters are available for download. It was also developed on a shoestring budget compared to the amounts outfits like OpenAI have spent developing their LLMs.

I made a few simple tests to compare DeepSeek-R1's performance against OpenAI's o1-mini. (o3-mini was not yet available when I ran the tests.)

Change Point Detection Code Generation

I gave ChatGPT o1-mini and DeepSeek-R1 (using the web app) the same prompt.

write python code to detect change points in a time series

Here's what ChatGPT came up with.

Change point detection involves identifying points in a time series where the statistical properties (such as mean, variance, or distribution) change. Detecting these points is crucial in various applications like finance, quality control, and environmental monitoring.

Below is a Python example using the `ruptures` library, which is a powerful tool for change point detection. This example demonstrates how to detect change points in a synthetic time series and visualize the results.

### Step 1: Install Required Libraries

First, ensure you have the necessary libraries installed. You can install them using `pip`:

```bash
pip install numpy matplotlib ruptures
```

### Step 2: Python Code for Change Point Detection

```python
import numpy as np
import matplotlib.pyplot as plt
import ruptures as rpt

# Generate synthetic data with change points
def generate_data(n_samples=500, n_bkps=3, noise=1.0, seed=42):
    np.random.seed(seed)
    # Generate random signal
    signal = np.random.normal(loc=0, scale=1, size=n_samples)
    # Define change points
    bkps = sorted(np.random.choice(range(50, n_samples - 50), n_bkps, replace=False))
    for bp in bkps:
        signal[bp:] += np.random.normal(loc=5, scale=1, size=n_samples - bp)
    return signal, bkps

# Generate data
signal, true_bkps = generate_data()

# Display the generated signal
plt.figure(figsize=(10, 6))
plt.plot(signal, label='Signal')
for bp in true_bkps:
    plt.axvline(x=bp, color='red', linestyle='--', label='True Change Point' if bp == true_bkps[0] else "")
plt.title("Synthetic Time Series with Change Points")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.show()

# Change Point Detection using Pelt search method with a linear model
model = "l2"  # "l2" corresponds to the least squares model
algo = rpt.Pelt(model=model).fit(signal)
result = algo.predict(pen=10)  # The penalty value controls the sensitivity

# Display the detected change points
plt.figure(figsize=(10, 6))
plt.plot(signal, label='Signal')
for cp in result:
    plt.axvline(x=cp, color='green', linestyle='--', label='Detected Change Point' if cp == result[0] else "")
plt.title("Detected Change Points in the Time Series")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.show()

# Print detected change points
print("True Change Points:", true_bkps)
print("Detected Change Points:", result)
```

### Step 3: Understanding the Code

1. **Data Generation**:
    - The `generate_data` function creates a synthetic time series with a specified number of change points (`n_bkps`). It adds a shift to the mean of the signal at each change point.
    - This synthetic data helps demonstrate how the change point detection works.

2. **Visualization of Synthetic Data**:
    - The first plot shows the generated time series with the true change points marked by red dashed lines.

3. **Change Point Detection**:
    - **Model Selection**: The `"l2"` model is used, which is suitable for detecting changes in the mean of the signal.
    - **Algorithm**: The PELT (Pruned Exact Linear Time) algorithm is employed for efficient change point detection.
    - **Penalty Parameter**: The `pen` parameter controls the sensitivity. A higher penalty means fewer change points are detected. You might need to adjust this value based on your specific data.

4. **Visualization of Detected Change Points**:
    - The second plot displays the same time series with detected change points marked by green dashed lines.

5. **Output**:
    - The true and detected change points are printed to the console for comparison.

### Step 4: Customizing for Your Data

To apply change point detection to your own time series data:

1. **Load Your Data**:
    Replace the synthetic data generation with loading your data, for example, using `pandas`:

    ```python
    import pandas as pd

    # Example: Loading data from a CSV file
    data = pd.read_csv('your_time_series.csv')
    signal = data['value_column'].values
    ```

2. **Adjust the Model and Parameters**:
    Depending on the nature of your data and the type of changes you expect (mean, variance, etc.), you might need to choose a different model or adjust parameters like `pen`.

    ```python
    # Example: Using a linear model if detecting changes in slope
    model = "linear"
    algo = rpt.Pelt(model=model).fit(signal)
    result = algo.predict(pen=10)
    ```

3. **Experiment with Different Algorithms**:
    `ruptures` offers various algorithms and models. Explore them to find the best fit for your application:

    - **Dynamic Programming (`rpt.Dynp`)**:
        ```python
        algo = rpt.Dynp(model=model).fit(signal)
        result = algo.predict(n_bkps=3)  # Specify the number of change points
        ```

    - **Binary Segmentation (`rpt.Binseg`)**:
        ```python
        algo = rpt.Binseg(model=model).fit(signal)
        result = algo.predict(n_bkps=3)
        ```

    - **Window-based (`rpt.Window`)**:
        ```python
        algo = rpt.Window(model=model, window=50).fit(signal)
        result = algo.predict(pen=10)
        ```

4. **Selecting the Penalty Parameter**:
    Use methods like cross-validation or information criteria (e.g., Bayesian Information Criterion) to choose an appropriate penalty.

    ```python
    # Example: Using grid search to find the optimal penalty
    penalties = np.linspace(1, 100, 100)
    for pen in penalties:
        result = algo.predict(pen=pen)
        # Evaluate the results based on your criteria
    ```

### Additional Resources

- **Ruptures Documentation**: [https://centre-borelli.github.io/ruptures-docs/](https://centre-borelli.github.io/ruptures-docs/)
- **Change Point Detection Survey**: [A Comprehensive Survey on Change Point Detection](https://arxiv.org/abs/1802.05535)

By leveraging the `ruptures` library and customizing the parameters and models, you can effectively detect change points in various types of time series data.

I tried the code, and at least on this simple test it worked without modification. I liked the suggestion of other algorithms to try.



Here's the response from DeepSeek-R1.


Okay, I need to write Python code to detect change points in a time series. Let me think about the steps involved. First, what's a change point? It's a point in the time series where the statistical properties change, like mean, variance, or trend. So, the goal is to find where these changes occur.

I remember there are several libraries in Python for this. One common approach is using the ruptures library, which has methods for change point detection. Another option might be using Bayesian methods with something like Prophet, but I think ruptures is more straightforward for this task.

Alright, so the steps would be: install the necessary library, load or generate some sample data, apply the change point detection algorithm, and then output or visualize the results.

Wait, the user asked for code, so I should make sure to include all necessary parts. Let's start with installing ruptures. They might not have it installed, so I should mention that. Then, import the required modules. Let me check the documentation for ruptures to recall the syntax.

Ruptures supports different algorithms, like Pelt, BinSeg, and Dynp. Pelt is efficient for detecting multiple changes. I'll probably use that. The process involves specifying a cost function (like a change in mean) and then running the detection.

Sample data: Maybe generate a synthetic time series with some known change points. For example, create a series where the mean shifts at certain points. That way, the code can demonstrate the detection effectively.

Visualization is important, so using matplotlib to plot the original series and the detected change points would be helpful. Let's outline the code structure:

1. Install ruptures if not already installed.

2. Import numpy, matplotlib, and ruptures.

3. Generate sample data with change points.

4. Apply the change point detection algorithm.

5. Retrieve the change points indices.

6. Plot the results showing the data and vertical lines at change points.

Possible issues: The choice of the cost function and penalty parameter in ruptures. The penalty affects the number of detected changes. Maybe set a reasonable penalty value or use a default. For the example, using the 'l2' cost (for changes in mean) makes sense.

Another thing: ruptures returns the indices of the change points, which are the end of each segment. So the actual change occurs after that index. Wait, no: ruptures gives the indexes where the change occurs. Need to check. For example, if a time series has a change at index 100, then the segments are from 0 to 100, then 100 to end. So the change points list would include 100 as the first change.

Let me test a small example. Suppose data is [0]*100 + [1]*100. The change is at index 100. So the algorithm should detect 100 as a change point.

In code: using ruptures, create the algorithm instance, fit the data, then predict with a penalty. The penalty parameter 'pen' in pelt needs to be set. How to choose that? Maybe start with pen=3, and adjust based on results. Alternatively, use the 'Dynp' method which requires specifying the number of changes, but Pelt is more automatic.

Putting it all together:

Installation command: !pip install ruptures (if using Jupyter), but in code comments.

Generate sample data with numpy:

import numpy as np

np.random.seed(0)

n_points = 200

data = np.concatenate(

[np.random.normal(0, 0.5, 50),

np.random.normal(2, 0.5, 50),

np.random.normal(0, 0.5, 50),

np.random.normal(-2, 0.5, 50)]

)

This creates a series with changes at 50, 100, 150.

Then, using ruptures:

import ruptures as rpt

algo = rpt.Pelt(model="l2").fit(data)

result = algo.predict(pen=10)

print("Change points detected at indices:", result)

Wait, the 'predict' method returns the list of change points. Let me check the documentation. Yes, the change points are the indices where the changes occur. So for the sample data, the expected result would be [50, 100, 150].

Then, plotting with matplotlib:

import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))

plt.plot(data)

for cp in result:

plt.axvline(x=cp, color='r', linestyle='--')

plt.title("Change Points Detected")

plt.show()

But wait, the change points are at 50, 100, 150. The data starts at 0, so those indices are correct.

But need to note that the last change point is the end of the last segment. So the code will correctly mark the transitions.

Potential adjustments: The pen parameter. If the penalty is too low, more change points are detected. If too high, fewer. The user might need to adjust this based on their data. Maybe include a comment about tuning the penalty.

Alternatively, use a different model. For example, if the variance changes, use 'normal' or 'rbf' model. But for the example, 'l2' (which is least squared deviation, suitable for changes in mean) is appropriate.

Also, handling real-world data: the code assumes the data is a 1D array. So the user should input their time series as such.

Another thing: the sample data is generated with four segments, each 50 points. So the code should detect three change points.

Testing the code with the sample data should show vertical lines at the correct positions.

So the final code would include all these steps, with comments explaining each part.

Possible edge cases: data with no change points, or very noisy data. But the example is straightforward.

I think that's a solid approach. Let me write the code step-by-step as described.

To detect change points in a time series using Python, you can use the `ruptures` library, which provides efficient methods for change point detection. Here's a step-by-step example:

### Step 1: Install the Required Library
```bash
pip install ruptures numpy matplotlib
```

### Step 2: Python Code for Change Point Detection
```python
import numpy as np
import matplotlib.pyplot as plt
import ruptures as rpt

# Generate synthetic time series with change points
np.random.seed(0)  # for reproducibility
n_points = 200
# Create data with shifts in mean at 50, 100, 150
data = np.concatenate([
    np.random.normal(0, 0.5, 50),
    np.random.normal(2, 0.5, 50),
    np.random.normal(0, 0.5, 50),
    np.random.normal(-2, 0.5, 50)
])

# Detect change points using the Pelt algorithm (for changes in mean)
model = "l2"  # "l2" for least squared deviation (changes in mean)
algo = rpt.Pelt(model=model).fit(data)
change_points = algo.predict(pen=10)  # penalty parameter (adjust based on data)

# Remove the last index (end of the signal) if included
change_points = [cp for cp in change_points if cp < len(data)]

# Plot the results
plt.figure(figsize=(10, 6))
plt.plot(data, label='Time Series')
for cp in change_points:
    plt.axvline(x=cp, color='red', linestyle='--', label='Change Point' if cp == change_points[0] else "")
plt.title("Change Points Detected in Time Series")
plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.show()

print(f"Detected change points at indices: {change_points}")
```

### Explanation:
1. **Synthetic Data**: A time series with four segments (changes in mean at indices 50, 100, and 150).
2. **Algorithm**: The `Pelt` algorithm from `ruptures` detects multiple change points efficiently. The `model="l2"` assumes changes in the mean.
3. **Penalty (`pen`)**: Controls sensitivity to changes. Increase `pen` to reduce the number of detected changes.
4. **Visualization**: The detected change points are marked with red dashed lines.

### Output:
- The plot shows the time series with vertical lines at detected change points.
- The console prints the indices of detected changes (e.g., `[50, 100, 150]`).

### Adjustments for Your Data:
- **Tune `pen`**: Experiment with different penalty values.
- **Model Type**: Use `model="rbf"` for variance changes or `model="linear"` for trend changes.
- **Real Data**: Replace the synthetic `data` with your time series (1D array).

For more details, refer to the [ruptures documentation](https://centre-borelli.github.io/ruptures-docs/).

DeepSeek-R1 was a bit more verbose and didn't suggest alternatives. However, DeepSeek did a good job of providing a readable description of its reasoning process. The code ran without modification.

I also tried the DeepSeek-R1 7B model, running it via ollama. This model has 7 billion parameters and will run on a laptop.

$ ollama run deepseek-r1

Here's the output.


<think>
Okay, so the user is asking for Python code to detect change points in a time series. Hmm, I need to
figure out what exactly they mean by that. Change point detection usually refers to identifying
points where the statistical properties of the data change abruptly.

First, I should consider different methods available for this task. There's binary segmentation,
which recursively splits the data into segments and looks for changes. Then there's the Pelt
algorithm, which is more efficient for detecting multiple changes. Also, using libraries like
ruptures could be helpful since it has various algorithms implemented.

I think the user might not be familiar with all these methods, so providing a simple example would
help them get started. Maybe start with a basic setup where they import necessary libraries and
generate some synthetic data to test on.

Next, I need to outline how each method works briefly. For instance, binary segmentation splits the
time series into two parts at each step until no more changes are found. The Pelt algorithm is great
for penalizing change points, making it suitable when you have an idea of how many changes you
expect. Using a library like ruptures makes things easier because it has built-in functions to
detect multiple change points.

I should include code examples that show how to simulate data with and without change points. Then
apply each method to detect the change points. After detecting them, plotting the results would be
helpful for visualization. That way, the user can see where the changes were identified.

Wait, I should also mention that the number of change points isn't always known in advance. The
methods will determine this based on some criteria like p-values or penalties. Including a note
about tuning parameters and testing different algorithms might help them choose the best approach
for their specific data.

Oh, right, the user didn't specify which method they prefer, so providing multiple options is good.
They can then decide which one suits their needs better. I should structure the code with comments
explaining each part so it's clear how to use it.

Finally, adding a conclusion that summarizes when to use each method would be useful for the user.
This way, they know which algorithm to apply based on their situation.
</think>

To detect change points in a time series, you can use various methods depending on the type of
changes (mean shift, variance change, etc.) and the number of expected change points. Below is an
example using Python code that demonstrates how to detect change points in a synthetic time series
with known change points.

We will use three different approaches:

1. **Binary Segmentation**
2. **Pelt Algorithm** (for multiple change points)
3. **Ruptures Library**

### Step 1: Import Necessary Libraries
```python
import numpy as np
import matplotlib.pyplot as plt

from ruptures import rpt, rptl
from pelt import Pelt
```

### Step 2: Generate Synthetic Time Series Data with Change Points

Here's an example of generating a synthetic time series with two change points:

```python
n_samples = 100
n_features = 1

# First part of the time series (mean = 5, variance = 1)
X = np.random.normal(5, 1, n_samples).reshape(-1, n_features)

# Second part of the time series (mean = 8, variance = 1.5)
Y = np.random.normal(7, 1.5, n_samples).reshape(-1, n_features)

# Third part of the time series (mean = 2, variance = 0.8)
Z = np.random.normal(4, 0.8, n_samples).reshape(-1, n_features)

# Concatenate all parts
data = np.concatenate([X, Y, Z])
```

### Step 3: Add Change Points

In this example, we have three segments with different means and variances.

### Step 4: Detect Change Points Using Binary Segmentation

Binary segmentation is a recursive algorithm that splits the data into two parts at each step until
no more changes are detected. Here's how to implement it:

```python
# Define the cost function (e.g., Normal distribution)
cost = "normal"

# Initialize the change point detection model using binary segmentation
algo = rpt.Binseg(cost=cost, min_length=10).fit(data)

# Predict the change points
change_points = algo.predict(pen=0.05)

print("Change points detected at indices:", change_points)
```

### Step 5: Detect Change Points Using Pelt Algorithm

The Pelt algorithm is efficient for detecting multiple change points, especially when you have an
idea of how many changes to expect.

```python
# Define the cost function (e.g., Normal distribution with penalty)
pen = 0.1
algo_pelt = Pelt(pen=pen).fit(data)

change_points_pelt = algo_pelt.predict()

print("Change points detected at indices:", change_points_pelt)
```

### Step 6: Detect Change Points Using Ruptures Library

The ruptures library provides a variety of algorithms for change point detection. Here's an example
using the `Binseg` model from ruptures:

```python
fromruptures import BinSeg

# Initialize the change point detection model
model = BinSeg(pen=0.05)

# Detect the change points
change_points_ruptures = model.fit_predict(data, n_bkps=2)  # Assuming 2 change points

print("Change points detected at indices:", change_points_ruptures)
```

### Step 7: Visualize the Results
```python
plt.figure(figsize=(10, 6))

# Plot the original data
plt.plot(data[:, 0], label='Original Data')

# Plot the detected change points as vertical lines
for cp in change_points:
    plt.axvline(x=cp, color='red', linestyle='--', linewidth=2, alpha=0.5)

for cp_pelt in change_points_pelt:
    plt.axvline(x=cp_pelt, color='green', linestyle='--', linewidth=2, alpha=0.5)

for cp_ruptures in change_points_ruptures:
    plt.axvline(x=cp_ruptures, color='blue', linestyle='--', linewidth=2, alpha=0.5)

plt.legend(['Original Data', 'Change Points (Binary Seg)',
           'Change Points (Pelt)', 'Change Points (Ruptures)'])
plt.xlabel('Time')
plt.ylabel('Value')
plt.title('Change Point Detection in Time Series')
plt.show()
```

### Conclusion

- **Binary Segmentation**: This method recursively splits the data into segments and looks for
changes within each segment. It works well when you have a good idea of where to expect changes.
- **Pelt Algorithm**: This is an optimal algorithm that uses dynamic programming with pruning
(sliding window) to detect change points efficiently, especially useful for large datasets.
- **Ruptures Library**: This library provides multiple algorithms and models for detecting change
points. It's highly flexible and allows you to specify the number of expected changes.

You can choose the method based on your specific needs, such as the size of the dataset, the
expected number of change points, and whether you have prior knowledge about where the changes might
occur.

>>>

The 7B results were less impressive. The reasoning section seemed good, but the code had numerous bugs and didn't run initially. I didn't spend much time debugging it, only enough to get it to run and do something. The code then marked every fifth position in the series as a change point.

I guess it takes more than 7 billion parameters to write good code.