Status & Troubleshooting
This guide covers checking deployment status, viewing logs, and troubleshooting common issues.
Checking Deployment Status
Using the CLI
Get comprehensive deployment status:
rulebricks status
This displays:
- Infrastructure: Cluster endpoint, node status, resource usage
- Kubernetes: Node count, pod distribution, health
- Database: Availability, endpoints, connection status
- Application: Deployment status, replicas, versions
- Services: Endpoints, versions, health
- Certificates: Validity, expiration dates
Status Output Example
Infrastructure:
  Cluster: my-rulebricks-cluster
  Endpoint: https://abc123.region.eks.amazonaws.com
  Nodes: 3/3 healthy
Kubernetes:
  Nodes: 3
  CPU Usage: 45%
  Memory Usage: 62%
Database:
  Type: self-hosted
  Status: Running
  Endpoint: postgresql.my-namespace.svc.cluster.local:5432
Application:
  Status: Running
  Replicas: 2/2
  Version: 1.2.3
Services:
  App: https://app.example.com
  Grafana: https://grafana.example.com
  Supabase: https://supabase.example.com
Certificates:
  app.example.com: Valid (expires in 89 days)
Viewing Logs
Component Logs
View logs from specific components:
# View app logs
rulebricks logs app
# Follow logs in real-time
rulebricks logs app -f
# View last N lines
rulebricks logs app --tail 500
# View all components
rulebricks logs all -f
Available Components
- app: Main Rulebricks application
- hps: HPS service (rule processing)
- workers: Worker pods
- database: PostgreSQL database
- supabase: All Supabase services
- traefik: Ingress controller
- prometheus: Metrics collection
- grafana: Monitoring dashboards
- all: Combined logs from all components
Using kubectl
You can also use kubectl directly:
# List all pods
kubectl get pods --all-namespaces
# View logs from a pod
kubectl logs <pod-name> -n <namespace> -f
# View logs from all pods in a deployment
kubectl logs -l app=rulebricks-app -n <namespace> -f
# View previous container logs (if pod restarted)
kubectl logs <pod-name> -n <namespace> --previous
Common Issues and Solutions
Infrastructure Issues
Cluster Creation Fails
Symptoms:
- Terraform errors during deployment
- Timeout errors
- Resource quota errors
Solutions:
- Check cloud provider quotas:
  - AWS: Check service quotas in AWS Console
  - GCP: Verify billing is enabled and quotas are sufficient
  - Azure: Check subscription quotas
- Verify credentials:
  # AWS
  aws sts get-caller-identity
  # GCP
  gcloud auth list
  # Azure
  az account show
- Check for existing resources:
  - Verify no conflicting resource names
  - Check for existing clusters with the same name
  - Review Terraform state
- Review Terraform logs:
  rulebricks deploy --verbose
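Before retrying a failed deployment, it can help to confirm that the command-line tools used in the checks above are installed. The following is a minimal sketch, not part of the Rulebricks CLI: the function name `missing_tools` is hypothetical, and the list of tools you pass depends on your cloud provider.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example for an AWS deployment:
#   missing_tools kubectl aws || echo "Install the tools listed above first"
```

This only checks that the binaries exist. It does not verify that the credentials they use are valid, which is what the provider-specific commands above are for.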
Cluster Not Accessible
Symptoms:
- kubectl commands fail
- Cannot connect to cluster
Solutions:
- Verify kubectl context:
  kubectl config current-context
  kubectl config get-contexts
- Update kubeconfig:
  # AWS
  aws eks update-kubeconfig --name <cluster-name> --region <region>
  # GCP
  gcloud container clusters get-credentials <cluster-name> --region <region>
  # Azure
  az aks get-credentials --resource-group <rg> --name <cluster-name>
- Check network connectivity:
  - Verify firewall rules
  - Check security groups
  - Test network connectivity
Application Issues
Pods Not Starting
Symptoms:
- Pods in Pending or CrashLoopBackOff state
- Pods not reaching Running state
Solutions:
- Check pod status:
  kubectl get pods --all-namespaces
  kubectl describe pod <pod-name> -n <namespace>
- Review pod events:
  kubectl get events --all-namespaces --sort-by='.lastTimestamp'
- Check resource constraints:
  kubectl top nodes
  kubectl describe nodes
- Review pod logs:
  kubectl logs <pod-name> -n <namespace>
  kubectl logs <pod-name> -n <namespace> --previous
- Common causes:
  - Insufficient node resources
  - Image pull errors
  - Configuration errors
  - Resource quota limits
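On a busy cluster, the full pod listing can hide the few pods that need attention. The sketch below filters the listing down to pods that are not healthy or that have restarted. The function name `unhealthy_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods --all-namespaces --no-headers` (namespace, name, ready, status, restarts, age).

```shell
# Hypothetical helper: reads `kubectl get pods --all-namespaces --no-headers`
# output on stdin and prints namespace/name, status, and restart count for
# pods that are neither Running nor Completed, or that restarted at least once.
unhealthy_pods() {
  awk '($4 != "Running" && $4 != "Completed") || $5 + 0 > 0 {
    printf "%s/%s %s restarts=%d\n", $1, $2, $4, $5
  }'
}

# Example:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

A pod that is Running but not yet Ready (for example `0/1` in the READY column) is not flagged by this filter, so `kubectl describe pod` remains the authoritative check for an individual pod.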
Application Not Responding
Symptoms:
- 502/503 errors
- Timeout errors
- Service unavailable
Solutions:
- Check service status:
  kubectl get svc --all-namespaces
  kubectl describe svc <service-name> -n <namespace>
- Verify ingress:
  kubectl get ingress --all-namespaces
  kubectl describe ingress <ingress-name> -n <namespace>
- Check pod health:
  kubectl get pods -n <namespace>
  rulebricks status
- Review application logs:
  rulebricks logs app -f
- Test connectivity:
  # Forward the service endpoint to a local port (runs in the foreground)
  kubectl port-forward svc/<service-name> 8080:80 -n <namespace>
  # Then, in a second terminal, test the endpoint
  curl http://localhost:8080
Database Issues
Database Connection Failures
Symptoms:
- Application cannot connect to database
- Database connection errors in logs
Solutions:
- Check database status:
  rulebricks status
  kubectl get pods -n <database-namespace>
- Verify database is running:
  kubectl logs <db-pod> -n <database-namespace>
- Test database connectivity:
  kubectl exec -it <db-pod> -n <database-namespace> -- psql -U postgres -c "SELECT version();"
- Check database credentials:
  kubectl get secret <db-secret> -n <namespace> -o yaml
- Review connection string:
  - Verify database host/port
  - Check credentials
  - Verify network policies
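Values under `data` in a Kubernetes Secret are base64-encoded, so the YAML output of the credentials check above is not directly readable. The sketch below decodes a single key. The function name `secret_value` is hypothetical, and the key name you pass (for example `password`) is an assumption that depends on how the secret was created.

```shell
# Hypothetical helper: print the decoded value of one key in a Secret.
# Usage: secret_value <secret-name> <namespace> <key>
secret_value() {
  kubectl get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 --decode
}

# Example (the key name "password" is an assumption):
#   secret_value <db-secret> <namespace> password
```

Key names that contain dots need escaping in the JSONPath expression. The decoded value is printed to the terminal in plain text, so avoid running this in a shared session or pasting the output into a support request.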
Database Migration Failures
Symptoms:
- Migration errors in logs
- Database schema not updated
Solutions:
- Review migration logs:
  rulebricks logs database
- Check migration status:
  kubectl logs <app-pod> -n <namespace> | grep -i migration
- Manual migration (if needed):
  - Connect to database
  - Review migration files
  - Run migrations manually if necessary
Certificate Issues
TLS Certificate Not Generated
Symptoms:
- Certificate not issued
- HTTPS not working
- Certificate errors in browser
Solutions:
- Check cert-manager:
  kubectl get pods -n cert-manager
  kubectl logs -n cert-manager -l app=cert-manager
- Verify certificate requests:
  kubectl get certificaterequests --all-namespaces
  kubectl describe certificaterequest <name> -n <namespace>
- Check DNS configuration:
  dig your-domain.com
  nslookup your-domain.com
- Verify domain ownership:
  - Ensure DNS points to load balancer
  - Check DNS propagation (can take 5-30 minutes)
  - Verify ports 80 and 443 are accessible
- Review ACME challenges:
  kubectl get challenges --all-namespaces
  kubectl describe challenge <name> -n <namespace>
Certificate Expired
Symptoms:
- Certificate expiration warnings
- HTTPS errors
Solutions:
- Check certificate status:
  kubectl get certificates --all-namespaces
  kubectl describe certificate <name> -n <namespace>
- Force renewal (if needed):
  kubectl delete certificaterequest <name> -n <namespace>
  # cert-manager will automatically create a new request
- Verify automatic renewal:
  - cert-manager automatically renews certificates
  - Check renewal schedule in certificate spec
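To catch a stalled renewal before the certificate expires, the Certificates section of the status output can be scanned for short remaining lifetimes. The sketch below assumes lines shaped like the example earlier in this guide (`app.example.com: Valid (expires in 89 days)`). The real output format may differ between CLI versions, and the function name `expiring_certs` is hypothetical.

```shell
# Hypothetical helper: reads `rulebricks status` output on stdin and prints
# certificate lines whose remaining validity is below the threshold
# (in days, default 30). Lines without "expires in N days" are ignored.
expiring_certs() {
  awk -v limit="${1:-30}" '
    match($0, /expires in [0-9]+ days?/) {
      split(substr($0, RSTART, RLENGTH), parts, " ")
      if (parts[3] + 0 < limit + 0) print
    }'
}

# Example:
#   rulebricks status | expiring_certs 30
```

If the function prints a line, check the certificate with `kubectl describe certificate` as described above to see why renewal has not happened yet.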
Resource Issues
Out of Resources
Symptoms:
- Pods in Pending state
- InsufficientCPU or InsufficientMemory events
- Node resource exhaustion
Solutions:
- Check node resources:
  kubectl top nodes
  kubectl describe nodes
- Check pod resources:
  kubectl top pods --all-namespaces
- Scale up cluster:
  - Increase node_count in configuration
  - Enable autoscaling
  - Deploy with updated configuration
- Optimize resource requests:
  - Review resource requests in configuration
  - Adjust based on actual usage
  - Consider using larger instance types
High Resource Usage
Symptoms:
- High CPU/memory usage
- Performance degradation
- Pod evictions
Solutions:
- Identify resource consumers:
  kubectl top pods --all-namespaces --sort-by=cpu
  kubectl top pods --all-namespaces --sort-by=memory
- Review resource limits:
  kubectl describe pod <pod-name> -n <namespace>
- Scale services:
  - Increase replicas for stateless services
  - Enable autoscaling
  - Adjust resource limits
- Optimize configuration:
  - Review performance settings
  - Adjust volume level
  - Tune Kafka and worker settings
Debug Mode
Enable verbose logging for detailed debugging:
rulebricks deploy --verbose
rulebricks status --verbose
rulebricks logs app -v
Getting Help
Collecting Debug Information
Before seeking help, collect:
- Configuration:
  cat rulebricks.yaml
- Status:
  rulebricks status > status.txt
- Logs:
  rulebricks logs all > logs.txt
- Kubernetes state:
  kubectl get all --all-namespaces > k8s-state.txt
  kubectl describe nodes > nodes.txt
- Events:
  kubectl get events --all-namespaces > events.txt
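Because the configuration file should be shared without secrets, it can be passed through a simple redaction filter before it is attached to a support request. The sketch below is a heuristic: it masks the value of any key whose name contains password, secret, token, or key. The function name `redact_config` and the key-name patterns are assumptions, not a list of the actual fields in rulebricks.yaml, so review the result by hand before sending it.

```shell
# Hypothetical helper: mask the value of any "key: value" line whose key name
# contains password, secret, token, or key. Reads stdin, writes stdout.
# Keys with no inline value (for example a nested mapping) are left unchanged.
redact_config() {
  sed -E 's/^([[:space:]]*[A-Za-z0-9_.-]*([Pp]assword|[Ss]ecret|[Tt]oken|[Kk]ey)[A-Za-z0-9_.-]*:[[:space:]]*).+$/\1<redacted>/'
}

# Example: redact the config, then bundle it with the files collected above.
#   redact_config < rulebricks.yaml > config-redacted.yaml
#   tar -czf rulebricks-debug.tar.gz config-redacted.yaml status.txt logs.txt \
#     k8s-state.txt nodes.txt events.txt
```

The filter only looks at key names, so secrets stored under other names, in list items, or inside multi-line values pass through unchanged.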
Support Resources
- Documentation: This guide and other documentation pages
- GitHub Issues: GitHub Issues
- Email Support: support@rulebricks.com
When contacting support, include:
- Configuration file (without secrets)
- Status output
- Relevant logs
- Error messages
- Steps to reproduce
Best Practices
- Monitor regularly: Check status and logs regularly
- Set up alerts: Configure monitoring alerts
- Keep backups: Regular database and configuration backups
- Test changes: Test in development before production
- Document customizations: Keep notes on manual changes
- Review logs: Regularly review logs for issues
- Update regularly: Keep CLI and deployments updated
Next Steps
- Set up monitoring: See Monitoring & Logging
- Configure backups: Set up automated backups
- Plan upgrades: See Upgrades & Maintenance
- Optimize performance: Review performance configuration