Setup Playbooks

Production-oriented setup guidance for reverse proxies, Kubernetes collector filtering, and full 360 telemetry.

Edge and Ingress

Use NGINX, Traefik, or a cloud L7 load balancer to terminate TLS and route OTLP traffic to your collector.

Collector Processing

Apply filter, transform, and batch processors in collector pipelines to control cost and cardinality.

360 Telemetry

Collect app, host, and cluster telemetry with Prometheus and OTEL so you can correlate all layers.

Reference Topology

SDKs / Agents -> Reverse Proxy (TLS) -> OTEL Collector (filter + batch) -> SOBS OTLP

Prometheus exporters (kube-state-metrics, node-exporter, cAdvisor) -> Collector Prometheus receiver -> SOBS

For Kubernetes, run the collector as a Deployment for cluster-level data and, optionally, as a DaemonSet for node-level metrics and logs.

Reverse Proxy and Ingress (NGINX)

Route OTLP HTTP traffic through your edge while preserving protocol expectations and request body sizes.

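server {
  listen 443 ssl;
  server_name otlp.example.com;

  # "listen ... ssl" also requires ssl_certificate and ssl_certificate_key;
  # set both for your environment.

  # Allow large OTLP export batches through the proxy.
  client_max_body_size 32m;

  # OTLP/HTTP paths: /v1/traces, /v1/metrics, /v1/logs.
  location /v1/ {
    proxy_pass http://otel-collector.monitoring.svc.cluster.local:4318;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}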

Use OTLP HTTP on port 4318 at the edge for simpler L7 routing.
If using OTLP gRPC, configure HTTP/2 pass-through end-to-end and avoid protocol downgrades.
Apply auth at ingress (basic auth, mTLS, or token headers) and document trust boundaries.
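
On the collector side, the proxy_pass target in the NGINX example implies an OTLP receiver listening for HTTP on port 4318. A minimal sketch of that receiver follows, assuming the standard `otlp` receiver. The `max_request_body_size` value is an assumption chosen to mirror the 32m limit at the proxy; confirm that the setting exists in your collector version before relying on it.

```yaml
receivers:
  otlp:
    protocols:
      http:
        # Listen on all interfaces so the in-cluster Service can reach the pod.
        endpoint: 0.0.0.0:4318
        # Assumed value: 32 MiB (32 * 1024 * 1024), matching client_max_body_size.
        max_request_body_size: 33554432
      grpc:
        # Only needed if OTLP gRPC is also exposed through an HTTP/2-capable path.
        endpoint: 0.0.0.0:4317
```

If the proxy accepts larger bodies than the collector does, oversized exports pass the edge and are then rejected by the receiver, so keeping the two limits aligned makes failures easier to diagnose.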

Kubernetes Collector Filtering and Routing

Use processors to reduce noisy metrics and keep only namespaces/services needed for alerting and dashboards.

processors:
  # Keep only kube-state-metrics (kube_*) and node-exporter (node_*) series.
  filter/metrics_allowlist:
    metrics:
      include:
        match_type: regexp
        metric_names:
          - ^kube_.*
          - ^node_.*
  # Drop data points whose resource belongs to the kube-system namespace.
  filter/drop_kube_system:
    metrics:
      datapoint:
        - resource.attributes["k8s.namespace.name"] == "kube-system"
  batch:
    timeout: 2s
    send_batch_size: 2048

service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      processors: [filter/metrics_allowlist, filter/drop_kube_system, batch]
      exporters: [otlphttp]

Start with allow-lists to control ingestion growth, then widen selectively.
Use separate pipelines for traces, metrics, and logs if retention or routing differs.
Add tail-sampling for traces in high-volume clusters.
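
The last two points can be sketched as separate trace and log pipelines, with tail sampling applied to traces only. This is a hedged example rather than a recommended policy: it assumes a collector distribution that includes the `tail_sampling` processor (it ships in the contrib distribution), the policy names and the 10% rate are illustrative, and `batch` and `otlphttp` are assumed to be defined elsewhere in the same config.

```yaml
processors:
  tail_sampling:
    # How long to buffer a trace's spans before making a sampling decision.
    decision_wait: 10s
    policies:
      # Always keep traces that contain at least one error span.
      - name: keep-errors        # illustrative name
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Keep an illustrative 10% of all traces regardless of status.
      - name: sample-baseline    # illustrative name
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

A trace is kept if any policy matches it. Tail sampling only works when all spans of a trace reach the same collector instance, so a multi-replica deployment needs trace-aware routing in front of this pipeline.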

Prometheus and Cluster Telemetry (360 View)

Signal Layer	Recommended Sources	Collector Receiver
Application	OTLP SDKs from services, jobs, and workers	`otlp`
Host / Node	node-exporter, hostmetrics, kubelet cAdvisor	`prometheus`, `hostmetrics`
Cluster / Control Plane	kube-state-metrics, kube events, API object state	`prometheus`, `k8s_cluster`
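
The receivers named in the table could be declared roughly as follows. This is a sketch under assumptions: the kube-state-metrics target address and the 30s intervals are hypothetical, and `hostmetrics` and `k8s_cluster` require a collector distribution that bundles them (such as contrib) plus suitable RBAC for the collector's service account.

```yaml
receivers:
  # Application layer: OTLP from SDKs.
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

  # Host and cluster layers: scrape Prometheus exporters.
  prometheus:
    config:
      scrape_configs:
        - job_name: kube-state-metrics
          scrape_interval: 30s
          static_configs:
            # Hypothetical in-cluster Service address; replace with your own.
            - targets: ["kube-state-metrics.monitoring.svc.cluster.local:8080"]

  # Host layer: metrics read directly from the node the collector runs on.
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      memory: {}
      filesystem: {}
      network: {}

  # Cluster layer: object state read from the Kubernetes API server.
  k8s_cluster:
    collection_interval: 30s
```

Because `hostmetrics` reports on the local machine, it belongs in the DaemonSet collector, while `k8s_cluster` should run in a single Deployment replica to avoid duplicate cluster metrics.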

Production Checklist

Enable TLS at ingress and restrict OTLP endpoints to trusted networks.
Set collector memory limits and queue/retry settings before high-volume rollout.
Define allow-list metric filters to avoid unbounded cardinality.
Validate namespace coverage for business-critical workloads.
Test dashboard and anomaly rules with app plus infra telemetry present.
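
The memory-limit and queue/retry item can be sketched as below. All numbers are illustrative starting points rather than tuned values, and the exporter endpoint is a placeholder, since the real SOBS OTLP URL is not given here.

```yaml
processors:
  # Refuse new data before the process hits its container memory limit.
  memory_limiter:
    check_interval: 1s
    limit_mib: 1500        # illustrative; keep below the pod's memory limit
    spike_limit_mib: 300   # illustrative headroom for bursts
  batch:
    timeout: 2s
    send_batch_size: 2048

exporters:
  otlphttp:
    endpoint: https://sobs.example.com   # placeholder, not a real SOBS address
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    sending_queue:
      enabled: true
      queue_size: 5000     # illustrative; larger queues use more memory

service:
  pipelines:
    metrics:
      receivers: [otlp]
      # memory_limiter goes first so it can push back before other work is done.
      processors: [memory_limiter, batch]
      exporters: [otlphttp]
```

Queue size and memory limit interact: a deep queue absorbs longer backend outages but holds more data in memory, so size both against the same pod limit.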

Troubleshooting Commands

kubectl get pods -n monitoring | grep -i otel
kubectl logs deployment/otel-collector -n monitoring --tail=200
kubectl get servicemonitors,podmonitors -A
curl -sv http://localhost:44317/v1/metrics

If metrics are sparse, inspect collector logs for dropped points and confirm receivers are scraping expected targets.