## Key Design Decisions
- Why FastAPI for API Layer
- Triton Inference Server
- PostgreSQL for Metadata
- MinIO/S3 for Object Storage
Why not use a NoSQL database like MongoDB for metadata?
I considered MongoDB, but chose PostgreSQL because:
- Structured relationships: Images → Objects is a clear relational model
- ACID transactions: When creating an image record and its objects, we need atomicity
- Query patterns: Most queries are simple lookups by ID or filtering by image_id (indexed)
- Maturity: PostgreSQL is battle-tested for production workloads
- Team familiarity: Easier for other engineers to maintain
However, if we needed to store unstructured detection metadata or had very high write volumes, MongoDB could be a better fit.
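To make the relational argument concrete, here is a minimal sketch of how the Images → Objects model and an atomic insert could look, assuming SQLAlchemy; the table and column names are illustrative, not the project's actual schema.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Image(Base):
    __tablename__ = "images"
    id = Column(Integer, primary_key=True)
    s3_key = Column(String, nullable=False)      # pointer into the object store
    status = Column(String, default="UPLOADED")
    objects = relationship("DetectedObject", back_populates="image")

class DetectedObject(Base):
    __tablename__ = "objects"
    id = Column(Integer, primary_key=True)
    image_id = Column(Integer, ForeignKey("images.id"), index=True)  # indexed lookup
    label = Column(String)
    confidence = Column(Float)
    image = relationship("Image", back_populates="objects")

engine = create_engine("postgresql://user:pass@localhost/db")  # placeholder DSN

# ACID in practice: the image row and its objects commit (or roll back) together.
with Session(engine) as session, session.begin():
    img = Image(s3_key="uploads/abc.jpg")
    img.objects = [DetectedObject(label="circle", confidence=0.91)]
    session.add(img)
```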
Why did you choose YOLOv8-Segmentation over other approaches?
Although the system architecturally uses the YOLOv8 Segmentation model, it effectively operates as a standard object detection tool, because the training dataset contains no mask labels and the code path never processes the segmentation output (output1) returned by the model. The results presented as "circle detection" therefore do not come from the model recognizing the object's shape; they rest on a mathematical approximation that averages the width and height of the detected bounding boxes. The visualized outputs are artificial, perfectly round circles drawn after the fact at the computed coordinates, not the object's true form.
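In code terms, the approximation described above reduces to something like the following sketch (function and variable names are illustrative): the circle center is the box center, and the radius is half the mean of the box's width and height.

```python
import cv2

def draw_circle_from_bbox(image, x1, y1, x2, y2):
    """Synthesize a 'detected circle' purely from bounding-box geometry."""
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w // 2, y1 + h // 2   # box center becomes the circle center
    radius = int((w + h) / 4)           # mean of width and height, halved
    cv2.circle(image, (cx, cy), radius, color=(0, 255, 0), thickness=2)
    return image
```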
So the data has no segmentation labels; how else would you have designed the system?
Level 1: Hybrid Approach (YOLO + OpenCV Hough Circles; a sketch follows this list)
Level 2: YOLO + GrabCut (Foreground Extraction)
Level 3: YOLO + Pre-trained Background Removal (U2Net / Rembg)
Level 4: YOLO + SAM (Segment Anything Model) - Prompting
Level 5: Auto-Labeling & Model Distillation (Most Advanced Architecture) 👑
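As an illustration of Level 1, a minimal sketch: YOLO localizes each object, then OpenCV's Hough transform recovers the actual circular contour inside the box. The weights path and Hough parameters are assumptions, not tuned values.

```python
import cv2
import numpy as np
from ultralytics import YOLO  # assumed dependency

model = YOLO("yolov8n.pt")  # placeholder weights

def detect_real_circles(image):
    circles_found = []
    for x1, y1, x2, y2 in model(image)[0].boxes.xyxy.cpu().numpy().astype(int):
        crop = cv2.cvtColor(image[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY)
        crop = cv2.medianBlur(crop, 5)
        # Hough transform finds the true circular contour inside the bbox
        circles = cv2.HoughCircles(crop, cv2.HOUGH_GRADIENT, dp=1.2,
                                   minDist=20, param1=100, param2=30)
        if circles is not None:
            for cx, cy, r in np.round(circles[0]).astype(int):
                circles_found.append((x1 + cx, y1 + cy, r))  # back to image coords
    return circles_found
```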
S3/MinIO vs Local Filesystem
The local filesystem looks simple at first glance, but it immediately blocks horizontal scaling. With stateless API servers you run multiple instances, yet each instance can only see the images stored on its own disk. Once a shared-disk solution (NFS, EFS) enters the picture, latency rises, throughput drops, and reliability weakens. S3/MinIO instead behaves like a single global object store, guaranteeing that every instance accesses the same assets.
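For instance, every stateless API instance can reach the same bucket through the S3 API; a sketch using boto3, where the MinIO endpoint and credentials are placeholder assumptions:

```python
import boto3

# Any instance, on any host, resolves the same object via the same bucket/key.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",        # MinIO speaks the S3 API
    aws_access_key_id="minioadmin",           # placeholder credentials
    aws_secret_access_key="minioadmin",
)
s3.upload_file("/tmp/upload.jpg", "images", "uploads/abc.jpg")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "images", "Key": "uploads/abc.jpg"},
    ExpiresIn=3600,
)
```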
If you had the time, how would you build the storage system?
A hybrid architecture is a strong option for systems that want a middle ground between performance and durability. Files are first written to local disk very quickly, so the API request is answered with low latency. A background worker (e.g. Celery, a Kafka consumer, or an SQS/Lambda pipeline) then uploads the file to S3, and the metadata is updated once the upload succeeds. This model is ideal for hiding S3 write latency from users, especially under high QPS.
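A minimal sketch of that flow, assuming Celery as the worker; broker URL, paths, and task names are illustrative:

```python
import shutil
import boto3
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")  # assumed broker

def handle_upload(tmp_path: str, image_id: int) -> None:
    # 1. Fast local write: the API request returns as soon as this completes.
    local_path = f"/data/staging/{image_id}.jpg"
    shutil.copy(tmp_path, local_path)
    # 2. Durable upload happens off the request path.
    upload_to_s3.delay(local_path, image_id)

@app.task(bind=True, max_retries=5)
def upload_to_s3(self, local_path: str, image_id: int):
    try:
        boto3.client("s3").upload_file(local_path, "images", f"uploads/{image_id}.jpg")
        # On success: update the metadata row here (e.g. status -> SYNCED).
    except Exception as exc:
        # Exponentially spaced retries before surfacing the failure.
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)
```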
### Q4: How does your solution handle high traffic? What are the bottlenecks?
1. API Layer (FastAPI):
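A minimal sketch of what the async API layer might look like, assuming a FastAPI upload endpoint (route and field names are illustrative): async handlers release the event loop during I/O, so a single worker can multiplex many concurrent requests.

```python
from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/images")  # illustrative endpoint, not the project's actual route
async def upload_image(file: UploadFile):
    data = await file.read()  # non-blocking read of the request body
    # Hand the bytes to storage / inference without blocking the event loop.
    return {"filename": file.filename, "size": len(data)}
```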
Bottlenecks & Solutions:
| Bottleneck | Limit | Solution |
|---|---|---|
| Triton GPU | ~30-50 req/s per GPU | Add more GPU instances, use TensorRT |
| PostgreSQL Writes | ~1000 writes/s | Connection pooling, write batching |
| Image Upload | Network bandwidth | Async uploads, compression |
| Storage I/O | S3 API rate limits | Local cache layer, CDN |
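As a sketch of the PostgreSQL row's mitigation, assuming asyncpg (DSN and table names are placeholders): a shared pool caps connection churn, and `executemany` batches the per-object inserts into one round trip.

```python
import asyncpg

async def save_detections(dsn: str, rows: list[tuple[int, str, float]]):
    # In production the pool would be created once at startup, not per call.
    pool = await asyncpg.create_pool(dsn, min_size=5, max_size=20)
    async with pool.acquire() as conn:
        # One round trip for many rows instead of one INSERT per object.
        await conn.executemany(
            "INSERT INTO objects (image_id, label, confidence) VALUES ($1, $2, $3)",
            rows,
        )
```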
Scaling Strategy:
Follow-up: What metrics would you monitor to detect bottlenecks?
- API Metrics
- Triton Metrics
- Database Metrics
- Storage Metrics
- Business Metrics
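A sketch of how the API-side metrics could be exposed, assuming prometheus_client; the metric names are illustrative, not existing instrumentation.

```python
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative metric names; Prometheus scrapes these from /metrics.
REQUEST_LATENCY = Histogram("api_request_latency_seconds", "Request latency", ["endpoint"])
REQUEST_ERRORS = Counter("api_request_errors_total", "Request errors", ["endpoint"])

start_http_server(9100)  # exposes the /metrics endpoint for the scraper

# Inside a request handler: time the call, count failures.
with REQUEST_LATENCY.labels(endpoint="/images").time():
    pass  # handle the request; on failure: REQUEST_ERRORS.labels(endpoint="/images").inc()
```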
How do you ensure reliability and fault tolerance?
Health Checks
- Kubernetes/Docker can restart unhealthy containers
- Readiness probe checks Triton, PostgreSQL, S3 availability
Error Handling
- Graceful degradation: If Triton is down, return 503 (Service Unavailable)
- Retry logic with exponential backoff for transient failures
- Circuit breaker pattern for external services (can be added)
Data Consistency
- Database transactions for atomic operations
- Idempotent endpoints (can retry safely)
- Status tracking (UPLOADED → PROCESSING → COMPLETED/FAILED)
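To illustrate the retry-with-exponential-backoff item above, a minimal sketch in plain Python; the delays and attempt counts are illustrative defaults.

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=0.5):
    """Retry a call prone to transient failures, with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the error (and alert)
            # 0.5s, 1s, 2s, ... plus jitter to avoid thundering herds
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```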
Fault Scenarios & Handling
| Scenario | Detection | Handling |
|---|---|---|
| Triton Down | Health check fails | Return 503, log error, alert |
| PostgreSQL Down | Connection error | Retry with backoff, alert |
| S3 Unavailable | API error | Fallback to local storage, alert |
| GPU OOM | Triton error | Reduce batch size, alert |
| Disk Full | Write error | Alert, cleanup old files |
How would you handle a partial failure (e.g., Triton works but S3 is down)?
I’d implement a graceful degradation strategy:
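A sketch of that strategy, assuming boto3 and an illustrative local staging directory: the write falls back to local disk and the record is marked for later synchronization, matching the "fallback to local storage" row in the table above.

```python
import shutil
import boto3
from botocore.exceptions import BotoCoreError, ClientError

def store_image(tmp_path: str, key: str) -> str:
    """Try S3 first; degrade to local disk and mark the record for later sync."""
    try:
        boto3.client("s3").upload_file(tmp_path, "images", key)
        return "COMPLETED"
    except (BotoCoreError, ClientError):
        # S3 is down: keep serving requests from local storage and set a
        # PENDING_SYNC status so a background job can re-upload later.
        shutil.copy(tmp_path, f"/data/fallback/{key.replace('/', '_')}")
        return "PENDING_SYNC"
```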
Walk me through your model evaluation strategy.