Drift Detection Methods -> Learned Kernel -> Dataset format incompatible #798

Open
righelcpm opened this issue May 24, 2023 · 4 comments

@righelcpm

I am facing a format incompatibility issue. I have tried to follow the structure in the documentation (https://docs.seldon.io/projects/alibi-detect/en/stable/cd/methods/learnedkerneldrift.html),
but I could not work out the input data format that the detector requires.
Could someone help me, please?

My code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats
import os
import tensorflow as tf

%pip install alibi-detect

from alibi_detect.cd import LearnedKernelDrift
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input
from alibi_detect.utils.tensorflow import DeepKernel
sea_noise = pd.read_csv("/content/drive/MyDrive/Dataset/sea_0123_gradual_noise_0.2_1000.csv")

X_ref_2 = np.transpose(np.vstack([sea_noise.X1.values]))
X_test_2 = np.transpose(np.vstack([sea_noise.X3.values]))
# Learned Kernel Drift

# To define the projection phi
proj = tf.keras.Sequential(
  [
      Input(shape=(32, 32, 3)),
      Conv2D(8, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(16, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(32, 4, strides=2, padding='same', activation=tf.nn.relu),
      Flatten(),
  ]
  
)
# Attached dataset: [sea_0123_gradual_noise_0.2_1000.csv](https://github.com/SeldonIO/alibi-detect/files/11554875/sea_0123_gradual_noise_0.2_1000.csv)

# To define the kernel
kernel = DeepKernel(proj, eps=0.01)

# To instantiate the detector
cd_7 = LearnedKernelDrift(X_ref_2, kernel, backend='tensorflow', p_val=.05, epochs=10, batch_size=32)

preds_7 = cd_7.predict(X_test_2, return_p_val=True, return_distance=True)
@mauicv
Collaborator

mauicv commented May 25, 2023

Hey @righelcpm,
It's hard to help without more details as I'm not sure exactly what the error is. Can you include the full error message? What is the dataset exactly?

@righelcpm
Author

righelcpm commented May 25, 2023

I forgot to include the most important part, the error message:

ValueError                                Traceback (most recent call last)
<ipython-input-18-059926eedbfd> in <cell line: 21>()
     19 cd_7 = LearnedKernelDrift(X_ref_2, kernel, backend='tensorflow', p_val=.05, epochs=10, batch_size=32)
     20 
---> 21 preds_7 = cd_7.predict(X_test_2, return_p_val=True, return_distance=True)

6 frames
/usr/local/lib/python3.10/dist-packages/alibi_detect/utils/tensorflow/kernels.py in call(self, x, y)
    169 
    170     def call(self, x: tf.Tensor, y: tf.Tensor) -> tf.Tensor:
--> 171         similarity = self.kernel_a(self.proj(x), self.proj(y))  # type: ignore[operator]
    172         if self.kernel_b is not None:
    173             similarity = (1-self.eps)*similarity + self.eps*self.kernel_b(x, y)  # type: ignore[operator]

ValueError: Exception encountered when calling layer 'sequential' (type Sequential).

Input 0 of layer "conv2d" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (32, 1)

Call arguments received by layer 'sequential' (type Sequential):
  • inputs=tf.Tensor(shape=(32, 1), dtype=float32)
  • training=None
  • mask=None

@mauicv
Collaborator

mauicv commented May 25, 2023

It looks like your input shape is wrong. If you define the projection as:

proj = tf.keras.Sequential(
  [
      Input(shape=(32, 32, 3)),
      Conv2D(8, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(16, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(32, 4, strides=2, padding='same', activation=tf.nn.relu),
      Flatten(),
  ]
)

then it expects input data of shape (None, 32, 32, 3), but the dataset you've attached seems to have shape (None, 3). Assuming this is the issue, you'll have to change the model to match the data. You'll also probably want to use Dense layers instead of the Conv2D layers.
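
As a rough illustration of the Dense-layer suggestion, here is a minimal sketch of a projection for tabular data. It assumes the data has three numeric feature columns; `n_features`, the layer widths, and the random stand-in batch are all hypothetical choices for illustration, not values taken from the attached dataset or the alibi-detect docs.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input

# Assumption: three numeric feature columns per row (e.g. X1, X2, X3).
n_features = 3

# Projection for 2-D tabular input of shape (batch, n_features).
# The layer widths (16 and 8) are arbitrary illustrative choices.
proj = tf.keras.Sequential(
    [
        Input(shape=(n_features,)),
        Dense(16, activation=tf.nn.relu),
        Dense(8, activation=tf.nn.relu),
    ]
)

# Random stand-in for one batch of 32 rows, used only to check shapes.
x_batch = np.random.default_rng(0).normal(size=(32, n_features)).astype(np.float32)
embedding = proj(x_batch)
print(embedding.shape)  # (32, 8)
```

A projection like this accepts 2-D batches, so it should be usable in place of the convolutional one in `DeepKernel(proj, eps=0.01)`, provided the reference and test arrays both have shape (n_samples, n_features).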

@mauicv
Collaborator

mauicv commented May 25, 2023

That being said, I'm a little confused about what you're doing here:

sea_noise = pd.read_csv("/content/drive/MyDrive/Dataset/sea_0123_gradual_noise_0.2_1000.csv")

X_ref_2 = np.transpose(np.vstack([sea_noise.X1.values]))
X_test_2 = np.transpose(np.vstack([sea_noise.X3.values]))

It looks like you are fitting on one feature and testing on another. Is this what you mean to do? If you can explain what you're trying to do and what the data is, I might be able to give a better answer.
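
For comparison, a common arrangement is to keep the same feature columns in both sets and split by rows instead. The sketch below assumes that is the intent, which may not be the case here. It uses a synthetic array as a stand-in for the CSV; the 1000 x 3 size and the halfway split point are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the CSV contents: 1000 rows, 3 feature columns.
# Assumption: with the real file this would be something like
# sea_noise[["X1", "X2", "X3"]].to_numpy(dtype=np.float32).
data = rng.uniform(0.0, 10.0, size=(1000, 3)).astype(np.float32)

# Split by rows, not by columns: both sets keep all features.
split = len(data) // 2
X_ref = data[:split]    # earlier rows serve as the reference set
X_test = data[split:]   # later rows are tested for drift

print(X_ref.shape, X_test.shape)  # (500, 3) (500, 3)
```

With this layout both arrays have the same number of columns, so the detector compares the same features across two time windows.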
