zoo pod stuck in Pending #241

Closed
alexfrieden opened this issue Jan 14, 2019 · 8 comments

@alexfrieden

Hi folks,
I followed the instructions for standing up the k8s cluster, applying the manifests in the following order:

kubectl -n kafka apply -f configure/aws-storageclass-zookeeper-gp2.yml
kubectl -n kafka apply -f configure/aws-storageclass-broker-gp2.yml
kubectl -n kafka apply -f 00-namespace.yml
kubectl -n kafka apply -f rbac-namespace-default/
kubectl -n kafka apply -f zookeeper/
kubectl -n kafka apply -f kafka/

The logs seem fine, but when I list the pods, zoo-0 and zoo-1 are stuck in Pending:

kubectl -n kafka get all
NAME          READY   STATUS    RESTARTS   AGE
pod/kafka-0   1/1     Running   0          4m
pod/kafka-1   1/1     Running   0          4m
pod/kafka-2   1/1     Running   0          4m
pod/pzoo-0    1/1     Running   0          6m
pod/pzoo-1    1/1     Running   0          6m
pod/pzoo-2    1/1     Running   0          6m
pod/zoo-0     0/1     Pending   0          6m
pod/zoo-1     0/1     Pending   0          6m

NAME                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/bootstrap   ClusterIP   100.68.139.100   <none>        9092/TCP            4m
service/broker      ClusterIP   None             <none>        9092/TCP            4m
service/pzoo        ClusterIP   None             <none>        2888/TCP,3888/TCP   6m
service/zoo         ClusterIP   None             <none>        2888/TCP,3888/TCP   6m
service/zookeeper   ClusterIP   100.67.78.247    <none>        2181/TCP            6m

NAME                     DESIRED   CURRENT   AGE
statefulset.apps/kafka   3         3         4m
statefulset.apps/pzoo    3         3         6m
statefulset.apps/zoo     2         2         6m

Any thoughts on what is going on here? Any help is appreciated.

@Jacobh2

Jacobh2 commented Jan 15, 2019

What is the output of kubectl -n kafka describe pod zoo-0? My initial thought is that you have run out of resources on your nodes.

@alexfrieden
Author

@Jacobh2 it looks like it can't find a volume, even though the storage class exists:

kubectl -n kafka describe pod zoo-0
Name:           zoo-0
Namespace:      kafka
Node:           <none>
Labels:         app=zookeeper
                controller-revision-hash=zoo-7c5447d489
                statefulset.kubernetes.io/pod-name=zoo-0
                storage=persistent-regional
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  StatefulSet/zoo
Init Containers:
  init-config:
    Image:      solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/bash
      /etc/kafka-configmap/init.sh
    Environment:
      ID_OFFSET:  4
    Mounts:
      /etc/kafka from config (rw)
      /etc/kafka-configmap from configmap (rw)
      /var/lib/zookeeper from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4fpbx (ro)
Containers:
  zookeeper:
    Image:       solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1
    Ports:       2181/TCP, 2888/TCP, 3888/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Command:
      ./bin/zookeeper-server-start.sh
      /etc/kafka/zookeeper.properties
    Limits:
      memory:  120Mi
    Requests:
      cpu:      10m
      memory:   100Mi
    Readiness:  exec [/bin/sh -c [ "imok" = "$(echo ruok | nc -w 1 -q 1 127.0.0.1 2181)" ]] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      KAFKA_LOG4J_OPTS:  -Dlog4j.configuration=file:/etc/kafka/log4j.properties
    Mounts:
      /etc/kafka from config (rw)
      /var/lib/zookeeper from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4fpbx (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-zoo-0
    ReadOnly:   false
  configmap:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      zookeeper-config
    Optional:  false
  config:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-4fpbx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4fpbx
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  2m4s (x3914 over 19h)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)
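
The FailedScheduling event says the claim is unbound, so the next place to look is the claim itself rather than the pod. A minimal sketch, assuming the claim name data-zoo-0 shown in the output above; inspect_claim is a hypothetical helper name, not something from the repository:

```shell
# Hypothetical helper: gather the usual evidence for an unbound PVC.
# Arguments: namespace, claim name.
inspect_claim() {
  ns="$1"
  claim="$2"
  # Which storage class does the claim ask for?
  kubectl -n "$ns" get pvc "$claim" -o jsonpath='{.spec.storageClassName}'
  echo
  # The events at the bottom usually say why the claim is still Pending.
  kubectl -n "$ns" describe pvc "$claim"
  # Storage classes are cluster-scoped; check that the requested name exists.
  kubectl get storageclass
}

# Usage (requires access to the cluster):
#   inspect_claim kafka data-zoo-0
```

If the class name printed by the first command is missing from the list printed by the last one, nothing can provision a volume for the claim and the pod stays Pending.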

@hexene

hexene commented Jan 16, 2019

I have the same problem. You'll notice that the zoo-0 and zoo-1 pods claim the kafka-zookeeper-regional storage class, but there is no AWS configuration file for it. I'm not sure what the recommendation is here, though.
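
One way to satisfy such a claim is to create a storage class with the name the claim asks for. The following is only a sketch: storageclass_manifest is a hypothetical helper, and the gp2 type is an assumption borrowed from the file names in the original post, not the contents of any file in the repository.

```shell
# Hypothetical helper: print a StorageClass manifest for EBS gp2 volumes.
# Argument: the class name that the PVCs reference.
storageclass_manifest() {
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: $1
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
EOF
}

# Usage (requires access to the cluster):
#   storageclass_manifest kafka-zookeeper-regional | kubectl apply -f -
```

Note that an EBS volume lives in a single availability zone, so on AWS a class created this way matches the claim by name only; it does not make the volume regional.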

@solsson
Contributor

solsson commented Jan 16, 2019

Storage classes were always meant to be custom. The stuff in https://github.com/Yolean/kubernetes-kafka/tree/master/configure is basically just examples. With regional volumes GKE clusters will have to adapt the examples as well.

It is of course optional to have zoo PVs span multiple availability zones. For example, your cluster might be in a single zone, or you might be fine with restricting each zookeeper pod to the zone of its volume.

@alexfrieden
Author

@solsson what is the recommendation? To add availability zones?

@solsson
Contributor

solsson commented Jan 16, 2019

There is no recommendation :) There are only examples. You need to make the trade-offs, cost/availability etc. The zookeeper readme refers to some background, but I see now that it's from before #191.

@alexfrieden
Author

Thanks @solsson. I guess I am still a little unclear about what "pod has unbound PersistentVolumeClaims" refers to. I have changed the storage class names to "kafka-zookeeper".

@alexfrieden
Author

@solsson weird, I deleted everything in the namespace and all the storage classes, and it seems to be running now (at least the containers have started and everything is in the Running state).
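
One explanation consistent with the thread, though nobody confirmed it: a StatefulSet's claims survive deletion of the StatefulSet, and a claim's storageClassName cannot be changed after creation, so claims created before the rename would keep asking for the old class. A narrower cleanup than deleting the whole namespace might look like the sketch below; reset_statefulset_claims is a hypothetical helper, and the data- prefix is assumed from the claim name data-zoo-0 above.

```shell
# Hypothetical helper: remove a StatefulSet and the claims it left behind.
# Arguments: namespace, StatefulSet name, replica count.
# Caution: deleting a claim can delete its data, depending on the reclaim policy.
reset_statefulset_claims() {
  ns="$1"
  name="$2"
  replicas="$3"
  kubectl -n "$ns" delete statefulset "$name"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Claims from a volume claim template named "data" are called data-<pod name>.
    kubectl -n "$ns" delete pvc "data-$name-$i"
    i=$((i + 1))
  done
}

# Usage (requires access to the cluster), followed by re-applying the manifests:
#   reset_statefulset_claims kafka zoo 2
#   kubectl -n kafka apply -f zookeeper/
```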
