# Basketball Object Detection — Production Handover

This folder is the result of a multi-day R&D session optimizing YOLO-family models for basketball + basketball_hoop detection on Android (Samsung Galaxy A36 5G as reference device). It contains the production-ready tflite models, the Android-side requirements for consuming them, and a protocol for verifying quality in the kima app before shipping.

## ⚠️ Library version: bump to `posedetection-compose:4.11.0`

The R&D work shipped a new **`posedetection-compose:4.11.0`** library release containing the letterbox-aware detector + camera config + delegate logging that the rect models depend on. The kima app currently uses an older version (probably 4.10.0). **Bump the dependency before integrating any of these models.**

The new version is available in two places:

### Option A — Maven Local (immediate, no remote publish needed)

The library is already published to maven local on the dev machine:
`~/.m2/repository/com/performancecoachlab/posedetection/posedetection-compose/4.11.0/`

To pull it from kima, the kima app's root `build.gradle.kts` (or `settings.gradle.kts`'s `dependencyResolutionManagement`) must include `mavenLocal()` in its repositories list:

```kotlin
repositories {
    mavenLocal() // ← add this line if not already present
    google()
    mavenCentral()
}
```

Then bump the dependency in whichever module consumes posedetection (typically the kima Android app's `build.gradle.kts`):

```kotlin
implementation("com.performancecoachlab.posedetection:posedetection-compose:4.11.0")
```

This will resolve from maven local without any network calls.

### Option B — Remote (after publishing)

The 4.11.0 library code lives on the **`release-4.11.0`** branch in the PoseDetection repo. Once it's merged + published to maven central via the existing `vanniktech.maven.publish` flow, kima can drop the `mavenLocal()` line and just consume it from central.
Until then, use Option A.

### What's actually in 4.11.0

The full commit message on `release-4.11.0` lists everything. The headline:
- **`ImageDetector.android.kt`** (NEW class) — letterbox-aware detector with full un-letterbox of output bboxes. The detector all the rect models in `models/` were tested against.
- **`CameraView.android.kt`** — pins `ImageAnalysis` to `AspectRatio.RATIO_4_3` so the camera frame distribution matches the rect models' expected geometry.
- **`CustomObjectModel.android.kt`** — explicit GPU/NNAPI/CPU delegate logging at INFO level. Surfaces silent CPU fallbacks via `adb logcat -s TFLite:I`.
- **No breaking API changes** — only additive. Existing kima usage of the library should keep working unchanged after the version bump.

## TL;DR — Use this model

**`models/yolo26n_v11_rect_512x384.tflite`** is the recommended production default.

| Property | Value |
|---|---|
| Architecture | YOLO26n (2026 generation, ~5.2 GFLOPs, ~2.4M params after fusion) |
| Training | 50 epochs, AdamW, `imgsz=512`, `rect=True`, Roboflow `dataset_v11` (basketball court footage, letterboxed preprocessing) |
| Input tensor | `[1, 384, 512, 3]` float32 (NHWC), pixel values normalized to `[0, 1]` |
| Output tensor | `[1, 300, 6]` float32 = `[x1, y1, x2, y2, conf, cls]` per row, **NMS already applied**, coords normalized to `[0, 1]` over the input tensor |
| Classes | `0 = basketball`, `1 = basketball_hoop` |
| File size | 9.3 MB (fp32) |
| Measured speed | ~6.7 Hz on the A36 5G via TFLite GPU delegate |
| Measured quality | mean confidence ~0.78–0.83 across the 60s basketball test clip |
| val mAP50 | 0.7632 |

This model is **rectangular** (4:3 landscape) and was trained with `rect=True`, which means it learned features tuned to a 384×512 tensor. It expects the camera to deliver landscape 4:3 frames, which the kima app must enforce — see `ANDROID_INTEGRATION.md`.

## Why fp32 not fp16

Tested both.
fp16 saves ~50% file size but on the A36 5G's TFLite GPU delegate they ran at the same speed and fp16 cost ~0.02 mean confidence. Since 4 MB of model file is negligible vs the production app's overall size, **stick with fp32** — same speed, better quality.

## What's in this folder

```
handover/
├── README.md                                ← you are here
├── MODELS.md                                ← detailed comparison of all 6 candidates
├── ANDROID_INTEGRATION.md                   ← what the kima app's Android side needs
├── TESTING_PROTOCOL.md                      ← how to A/B test models in the kima app
├── LESSONS_LEARNED.md                       ← key insights from the R&D session
├── models/
│   ├── yolo26n_v11_rect_512x384.tflite      ← RECOMMENDED PRODUCTION DEFAULT (rect=True)
│   ├── yolo26n_v11_rect_384x288.tflite      ← speed-critical alternative (>10 Hz)
│   ├── yolo26n_v11_square_512.tflite        ← square 512², slightly higher quality at lower speed
│   ├── yolo26n_v11_square_416.tflite        ← square 416² baseline reference
│   ├── yolo11n_dataset640.tflite            ← historical quality ceiling, KNOWN LOOSE BBOXES
│   └── yolo11n_noise_floor_baseline.tflite  ← stable model for noise-floor sanity checks
└── reference_code/
    ├── ImageDetector.android.kt             ← working letterbox-aware detector
    ├── CustomObjectModel.android.kt         ← TFLite interpreter setup with delegate logging
    ├── CameraView_snippet.kt                ← CameraX 4:3 + landscape pin
    ├── AndroidManifest_snippet.xml          ← orientation lock
    └── compare_logs.py                      ← portable Python report generator for A/B tests
```

## Quick start

1. **Read `MODELS.md`** to understand the candidates and pick a starting model. The recommended default is `yolo26n_v11_rect_512x384.tflite`.
2. **Read `ANDROID_INTEGRATION.md`** to see what camera config and detector code the kima app needs. **Critical**: the model expects a 4:3 landscape camera frame; the kima app must pin CameraX accordingly.
3. **Drop the chosen tflite** into the kima app's assets folder.
4. **Implement the camera + detector** per `ANDROID_INTEGRATION.md`.
   The reference code in `reference_code/` is the working version from this project.
5. **Verify with `TESTING_PROTOCOL.md`** before shipping. Even a single 60-second comparison run against a known video can catch integration mistakes early.

## Source notes

- All models are trained on the `kima-rbjnn/dataset-3k5b6` Roboflow project, version 11 (preprocessing = "Fit (black padding)"; the v8 stretched dataset was the source of weeks of accuracy issues — **never train on stretched data again**, see `LESSONS_LEARNED.md`).
- The pipeline is closed-loop: edit `collab/Model Training.ipynb` cell 5, run training in Colab, drive-sync the tflite to the Mac, sync to assets via `tools/sync_models.py`, build, install, run a 60s back-to-back comparison via `tools/run_auto_experiment.sh`, generate the report via `tools/compare_logs.py`. The full source pipeline lives in the parent project repo.
- **iOS is not covered here.** This handover is Android-only. iOS uses CoreML exports of the same `.pt` weights, and the user has confirmed the iOS pipeline already works at 10+ Hz with high accuracy because CoreML bakes preprocessing into the model graph. If kima ships iOS too, the iOS side likely doesn't need any of this — just the same Roboflow dataset versions and a separate CoreML export.
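
For anyone sanity-checking logs offline, the output layout from the TL;DR table (`[1, 300, 6]` rows of `[x1, y1, x2, y2, conf, cls]`, normalized over the 512×384 input tensor) can be mapped back to camera-frame pixels roughly as sketched below. This is a minimal illustration in Python, not the code in `ImageDetector.android.kt`. It assumes a uniform-scale letterbox with the padding split evenly on both sides, and the names `decode_detections`, `Detection`, and the `conf_threshold` default of 0.25 are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence

# Class ids as listed in the TL;DR table.
CLASS_NAMES = {0: "basketball", 1: "basketball_hoop"}


@dataclass
class Detection:
    label: str
    confidence: float
    x1: float  # left edge, camera-frame pixels
    y1: float  # top edge
    x2: float  # right edge
    y2: float  # bottom edge


def decode_detections(
    rows: Iterable[Sequence[float]],
    frame_w: int,
    frame_h: int,
    input_w: int = 512,
    input_h: int = 384,
    conf_threshold: float = 0.25,  # hypothetical cutoff, tune per app
) -> List[Detection]:
    """Map post-NMS rows from input-tensor space back to frame pixels."""
    # Assumed letterbox geometry: one uniform scale, centered padding.
    scale = min(input_w / frame_w, input_h / frame_h)
    pad_x = (input_w - frame_w * scale) / 2.0
    pad_y = (input_h - frame_h * scale) / 2.0

    def to_frame(v_norm: float, input_size: int, pad: float, frame_size: int) -> float:
        # Normalized -> tensor pixels -> remove padding -> undo the scale,
        # then clamp so boxes never leave the frame.
        pixel = (v_norm * input_size - pad) / scale
        return min(max(pixel, 0.0), float(frame_size))

    detections: List[Detection] = []
    for x1, y1, x2, y2, conf, cls in rows:
        label = CLASS_NAMES.get(int(round(cls)))
        if conf < conf_threshold or label is None:
            continue  # skip low-confidence rows and unknown class ids
        detections.append(
            Detection(
                label=label,
                confidence=float(conf),
                x1=to_frame(x1, input_w, pad_x, frame_w),
                y1=to_frame(y1, input_h, pad_y, frame_h),
                x2=to_frame(x2, input_w, pad_x, frame_w),
                y2=to_frame(y2, input_h, pad_y, frame_h),
            )
        )
    return detections
```

With a 4:3 frame such as 1280×960 the padding terms are zero and the mapping reduces to a plain rescale. With a 16:9 frame the same function removes the vertical bars before rescaling, which is the case the 4:3 camera pin is meant to avoid.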