From 6da1b8dd13dfe6729204173c818871ea50679ec9 Mon Sep 17 00:00:00 2001
From: Carlos Santos <4a.santos@gmail.com>
Date: Mon, 31 Mar 2025 12:59:45 +0200
Subject: [PATCH] backend: Update README to clarify recording lock behavior and garbage collection process

---
 backend/README.md | 48 +++++++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/backend/README.md b/backend/README.md
index 156be29..83a1835 100644
--- a/backend/README.md
+++ b/backend/README.md
@@ -31,22 +31,8 @@ The recording feature is based on the following key concepts:
    Each room can only have one active recording at a time. When a new recording starts, a lock is acquired to mark that room as actively recording. Any attempt to start another recording for the same room while the lock is active will be rejected.
 
 2. **Lock lifetime**:
-   The lock has a default lifetime of six hours. It will be automatically released either when the recording is manually stopped and an `egress_ended` webhook is received, or when the room meeting ends.
+   The lock has no lifetime. It is not automatically released after a certain period. Instead, it remains active until the recording is manually stopped and an `egress_ended` webhook is received, or until the room meeting ends. This design choice allows for flexibility in managing recordings, as the lock can be held for an extended duration if needed. However, it also means that care must be taken to ensure that the lock is released appropriately to avoid blocking future recording attempts (see **Failure handling** below).
 
-3. **Distributed lock storage**:
-   The lock is stored in Redis, enabling all instances to share and access the same lock state. This ensures that any instance — not just the one that initiated the recording — can release the lock, helping to prevent desynchronization issues and orphaned locks.
-
-4. **Failure handling**:
-   If an OpenVidu instance crashes while an OpenVidu Meet recording is active, the lock remains in place until its lifetime expires. This scenario can block subsequent recording attempts if the lock is not released promptly. To mitigate this issue, a lock garbage collector is implemented to periodically clean up orphaned locks.
-
-   The garbage collector runs every XX minutes and performs the following checks for each lock:
-   - **Room Existence:** Verifies if the room associated with the lock still exists.
-   - **Recording Status:** Uses LiveKit SDK to determine if there is an active recording for that room.
-
-   If the room does not exist or no active recording is detected, the lock is considered orphaned and is immediately released. This strategy helps ensure that stale locks do not prevent new recordings and maintains overall system reliability.
-
-### Recording Compose process
-The recording compose process is initiated by sending a `startRecording` request to the OpenVidu instance. The backend then waits for the `egress_started` event from LiveKit, which indicates that the recording has started successfully. If this event is not received within 30 seconds, the backend attempts to stop the recording.
 
 ```mermaid
 flowchart TD
@@ -67,6 +53,37 @@ flowchart TD
     K --> L{"Stop recording result"}
     L -- "Success (recording stopped)" --> N["Reject Request"] --> H
     L -- "Error (recording not found, already stopped,\nor unknown error)" --> O["Reject Request"] --> J
+```
+
+4. **Failure handling**:
+   If an OpenVidu instance crashes while a recording is active, the lock remains in place. This scenario can block subsequent recording attempts if the lock is not released promptly. To mitigate this issue, a lock garbage collector is implemented to periodically clean up orphaned locks.
+
+   The garbage collector runs when the OpenVidu deployment starts, and then every 30 minutes.
+
+```mermaid
+graph TD;
+    A[Initiate cleanup process] --> C[Search for recording locks]
+    C -->|Error| D[Log and exit]
+    C -->|No locks found| D
+    C -->|Locks found| E[Iterate over each lockId]
+
+    E --> Z[Check if lock still exists]
+    Z -->|Lock not found| M[Proceed to next roomId]
+    Z -->|Lock exists| Y[Check lock age]
+    Y -->|Lock too recent| M
+    Y -->|Lock old enough| H[Retrieve room information]
+
+    H -->|Room has no publishers| W[Check for in-progress recordings]
+    W -->|Active recordings| L[Keep lock]
+    W -->|No active recordings| I[Release lock]
+
+    H -->|Room found with publishers| W[Check for in-progress recordings]
+    H -->|Room not found| W[Check for in-progress recordings]
+
+    I --> M
+    L --> M
+    M -->|More rooms| E
+    M -->|No more rooms| N[Process completed]
 ```
 
 
@@ -76,7 +93,6 @@ Recordings are stored in the `openvidu-meet/recordings` s3 directory, inside of
 
 Each recording is stored in a directory named after the room where it was recorded. Inside this directory, there is a `.metadata` directory that contains the metadata of the recording, and a directory for each egressId that contains the recording files.
 
-
 ```plaintext
 openvidu/
 ├── openvidu-meet/