feat(metadata): add get_video_metadata & get_video_metadata_summary; docs(api); tests(metadata)
parent 9d14f6bc01
commit b19dbb67a5
100
CLAUDE.md
Normal file
@ -0,0 +1,100 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Build and Prepare
```bash
npm run prepare  # Compile TypeScript and make binary executable
```

### Testing
```bash
npm test  # Run Jest tests with ESM support
```

### Manual Testing
```bash
npx @kevinwatt/yt-dlp-mcp  # Start MCP server manually
```

## Code Architecture

### MCP Server Implementation
This is an MCP (Model Context Protocol) server that integrates with `yt-dlp` for video/audio downloading. The server:

- **Entry point**: `src/index.mts` - Main MCP server implementation with tool handlers
- **Modular design**: Each feature lives in `src/modules/` (video.ts, audio.ts, subtitle.ts, search.ts, metadata.ts)
- **Configuration**: `src/config.ts` - Centralized config with environment variable support and validation
- **Utility functions**: `src/modules/utils.ts` - Shared spawn and cleanup utilities

### Tool Architecture
The server exposes 8 MCP tools:
1. `search_videos` - YouTube video search
2. `list_subtitle_languages` - List available subtitles
3. `download_video_subtitles` - Download subtitle files
4. `download_video` - Download videos with resolution/trimming options
5. `download_audio` - Extract and download audio
6. `download_transcript` - Generate clean text transcripts
7. `get_video_metadata` - Extract comprehensive video metadata (JSON format)
8. `get_video_metadata_summary` - Get human-readable metadata summary
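
For orientation, an MCP client would typically invoke one of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `get_video_metadata`. The helper `buildToolCall` and the `ToolCallRequest` type are illustrative names that do not exist in this repository, and the framing assumes the standard MCP request shape.

```typescript
// Sketch only: the shape of a JSON-RPC 2.0 request that calls an MCP tool.
// `ToolCallRequest` and `buildToolCall` are hypothetical names for illustration.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in the request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Ask for three metadata fields of one video.
const toolCall = buildToolCall(1, "get_video_metadata", {
  url: "https://www.youtube.com/watch?v=jNQXAC9IVRw",
  fields: ["id", "title", "channel"],
});
console.log(JSON.stringify(toolCall, null, 2));
```

Only `url` is required by the tool's input schema; omitting `fields` requests all available metadata.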

### Key Patterns
- **Unified error handling**: `handleToolExecution()` wrapper for consistent error responses
- **Spawn management**: All external tool calls go through `_spawnPromise()` with cleanup
- **Configuration-driven**: All defaults and behavior configurable via environment variables
- **ESM modules**: Uses `.mts` extension and ESM imports throughout
- **Filename sanitization**: Cross-platform safe filename handling with length limits
- **Metadata extraction**: Uses `yt-dlp --dump-json` for comprehensive video information without downloading content
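
To make the unified error handling pattern concrete, here is a minimal sketch of what a wrapper in the style of `handleToolExecution()` might look like. The `ToolResponse` shape, the `isError` flag, and the name `handleToolExecutionSketch` are assumptions for illustration; the real wrapper lives in `src/index.mts` and may differ.

```typescript
// Hypothetical response shape, modeled on the text content returned by the tool handlers.
interface ToolResponse {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Sketch only: run an async operation and turn any failure into a
// consistent error response prefixed with a caller-supplied message.
async function handleToolExecutionSketch(
  operation: () => Promise<string>,
  errorPrefix: string
): Promise<ToolResponse> {
  try {
    const text = await operation();
    return { content: [{ type: "text", text }] };
  } catch (error) {
    const detail = error instanceof Error ? error.message : String(error);
    return {
      content: [{ type: "text", text: `${errorPrefix}: ${detail}` }],
      isError: true,
    };
  }
}
```

With a wrapper like this, each tool branch only supplies the operation and a short prefix such as "Error extracting video metadata".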

### Dependencies
- **Required external**: `yt-dlp` must be installed and in PATH
- **Core MCP**: `@modelcontextprotocol/sdk` for server implementation
- **Process management**: `spawn-rx` for async process spawning
- **File operations**: `rimraf` for cleanup

### Configuration System
`CONFIG` object loaded from `config.ts` supports:
- Download directory customization (defaults to ~/Downloads)
- Resolution/format preferences
- Filename sanitization rules
- Temporary directory management
- Environment variable overrides (YTDLP_* prefix)
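
As a rough illustration of environment-driven overrides, the sketch below resolves a setting from a `YTDLP_`-prefixed variable and falls back to a default. The helper names and the variable names `YTDLP_DOWNLOADS_DIR` and `YTDLP_MAX_LENGTH` are hypothetical; check `src/config.ts` for the variables the server actually reads.

```typescript
// Sketch only: a minimal stand-in for environment-driven configuration.
type Env = Record<string, string | undefined>;

// Return the YTDLP_-prefixed variable if it is set and non-empty, else the default.
function readEnvString(env: Env, name: string, fallback: string): string {
  const value = env["YTDLP_" + name];
  return value !== undefined && value.trim() !== "" ? value : fallback;
}

// Same idea for numeric settings; unparsable values fall back to the default.
function readEnvNumber(env: Env, name: string, fallback: number): number {
  const parsed = parseInt(env["YTDLP_" + name] ?? "", 10);
  return isNaN(parsed) ? fallback : parsed;
}

// Hypothetical variable names, used here only to exercise the helpers.
const exampleEnv: Env = { YTDLP_DOWNLOADS_DIR: "/tmp/videos", YTDLP_MAX_LENGTH: "abc" };
console.log(readEnvString(exampleEnv, "DOWNLOADS_DIR", "~/Downloads")); // /tmp/videos
console.log(readEnvNumber(exampleEnv, "MAX_LENGTH", 50)); // 50
```

In a real process the first argument would be `process.env`; passing the environment in explicitly keeps the helpers easy to test.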

### Testing Setup
- **Jest with ESM**: Custom config for TypeScript + ESM support
- **Test isolation**: Tests run in separate environment with mocked dependencies
- **Coverage**: Tests for each module in `src/__tests__/`

### TypeScript Configuration
- **Strict mode**: All strict TypeScript checks enabled
- **ES2020 target**: Modern JavaScript features
- **Declaration generation**: Types exported to `lib/` for consumption
- **Source maps**: Enabled for debugging

### Build Output
- **Compiled code**: `lib/` directory with .js, .d.ts, and .map files
- **Executable**: `lib/index.mjs` with shebang for direct execution
- **Module structure**: Preserves source module organization

## Metadata Module Details

### VideoMetadata Interface
The `metadata.ts` module exports a comprehensive `VideoMetadata` interface containing fields like:
- Basic info: `id`, `title`, `description`, `duration`, `upload_date`
- Channel info: `channel`, `channel_id`, `channel_url`, `uploader`
- Analytics: `view_count`, `like_count`, `comment_count`
- Technical: `formats`, `thumbnails`, `subtitles`
- Content: `tags`, `categories`, `series`, `episode` data

### Key Functions
- `getVideoMetadata(url, fields?, config?)` - Extract full or filtered metadata as JSON
- `getVideoMetadataSummary(url, config?)` - Generate human-readable summary
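
The field filtering performed by `getVideoMetadata` can be sketched as a small pure function. This is an illustration of the documented behavior (no `fields` returns everything, an empty array yields `{}`, unknown fields are skipped), not the module's exact code; `pickFields` is a hypothetical name.

```typescript
// Sketch only: keep just the requested keys that are actually present.
function pickFields(
  metadata: Record<string, unknown>,
  fields?: string[]
): Record<string, unknown> {
  if (fields === undefined) return metadata; // no filter requested: return everything
  const picked: Record<string, unknown> = {};
  for (const field of fields) {
    // Skip fields that the extractor did not provide.
    if (Object.prototype.hasOwnProperty.call(metadata, field)) {
      picked[field] = metadata[field];
    }
  }
  return picked;
}

// A hand-written stand-in for parsed `yt-dlp --dump-json` output.
const sampleMetadata = { id: "abc123", title: "Example", view_count: 42 };
console.log(JSON.stringify(pickFields(sampleMetadata, ["id", "missing"]), null, 2));
```

Using `Object.prototype.hasOwnProperty.call` avoids surprises if the parsed JSON happens to contain a key named `hasOwnProperty`.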

### Testing
Comprehensive test suite in `src/__tests__/metadata.test.ts` covers:
- Field filtering and extraction
- Error handling for invalid URLs
- Format validation
- Real-world integration with YouTube videos
19
README.md
@ -4,9 +4,11 @@ An MCP server implementation that integrates with yt-dlp, providing video and au

## Features

* **Video Metadata**: Extract comprehensive video information without downloading content
* **Subtitles**: Download subtitles in SRT format for LLMs to read
* **Video Download**: Save videos to your Downloads folder with resolution control
* **Audio Download**: Save audio to your Downloads folder
* **Video Search**: Search for videos on YouTube using keywords
* **Privacy-Focused**: Direct download without tracking
* **MCP Integration**: Works with Dive and other MCP-compatible LLMs

@ -85,6 +87,19 @@ pip install yt-dlp
    * `url` (string, required): URL of the video
    * `language` (string, optional): Language code (e.g., 'en', 'zh-Hant', 'ja'). Defaults to 'en'

* **get_video_metadata**
  * Extract comprehensive video metadata without downloading the content
  * Returns detailed information including title, description, channel, timestamps, view counts, and more
  * Inputs:
    * `url` (string, required): URL of the video
    * `fields` (array, optional): Specific metadata fields to extract (e.g., ['id', 'title', 'description', 'channel']). If not provided, returns all available metadata

* **get_video_metadata_summary**
  * Get a human-readable summary of key video metadata
  * Returns formatted text with title, channel, duration, views, upload date, and description preview
  * Inputs:
    * `url` (string, required): URL of the video

## Usage Examples

Ask your LLM to:
@ -99,6 +114,10 @@ Ask your LLM to:
"Download audio from this YouTube video: https://youtube.com/watch?v=..."
"Get a clean transcript of this video: https://youtube.com/watch?v=..."
"Download Spanish transcript from this video: https://youtube.com/watch?v=..."
"Get metadata for this video: https://youtube.com/watch?v=..."
"Show me the title, description, and channel info for this video: https://youtube.com/watch?v=..."
"Get a summary of this video's metadata: https://youtube.com/watch?v=..."
"Extract just the id, title, and view count from this video: https://youtube.com/watch?v=..."
```

## Manual Start

47
docs/api.md
@ -116,6 +116,53 @@ const subtitles = await downloadSubtitles(
console.log(subtitles);
```

## Metadata Operations

### getVideoMetadata(url: string, fields?: string[]): Promise<string>

Extract comprehensive video metadata using yt-dlp without downloading the content.

**Parameters:**
- `url`: The URL of the video to extract metadata from
- `fields`: (Optional) Specific metadata fields to extract (e.g., `['id', 'title', 'description', 'channel']`). If omitted, returns all available metadata. If provided as an empty array `[]`, returns `{}`.

**Returns:**
- Promise resolving to a JSON string of metadata (pretty-printed)

**Example:**
```javascript
import { getVideoMetadata } from '@kevinwatt/yt-dlp-mcp';

// Get all metadata
const all = await getVideoMetadata('https://www.youtube.com/watch?v=jNQXAC9IVRw');
console.log(all);

// Get specific fields only
const subset = await getVideoMetadata(
  'https://www.youtube.com/watch?v=jNQXAC9IVRw',
  ['id', 'title', 'description', 'channel']
);
console.log(subset);
```
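
Because `getVideoMetadata` resolves to a JSON string rather than an object, callers usually parse it before reading fields. The sketch below shows one defensive way to do that. `MetadataSubset` and `describeMetadata` are illustrative names, and the input is a hand-written stand-in for real output.

```typescript
// Sketch only: parse the JSON string and read a few fields defensively,
// since any field may be absent (or null) for a given video.
interface MetadataSubset {
  id?: string;
  title?: string;
  view_count?: number | null;
}

function describeMetadata(json: string): string {
  const data = JSON.parse(json) as MetadataSubset;
  const title = data.title ?? "(untitled)";
  const views = typeof data.view_count === "number" ? String(data.view_count) : "unknown";
  return `${data.id ?? "?"}: ${title} (${views} views)`;
}

// A hand-written stand-in for the string the function would resolve to.
const sampleJson = JSON.stringify({ id: "abc123", title: "Example" }, null, 2);
console.log(describeMetadata(sampleJson)); // abc123: Example (unknown views)
```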

### getVideoMetadataSummary(url: string): Promise<string>

Get a human-readable summary of key video metadata fields.

**Parameters:**
- `url`: The URL of the video

**Returns:**
- Promise resolving to a formatted text summary (title, channel, duration, views, upload date, description preview, etc.)

**Example:**
```javascript
import { getVideoMetadataSummary } from '@kevinwatt/yt-dlp-mcp';

const summary = await getVideoMetadataSummary('https://www.youtube.com/watch?v=jNQXAC9IVRw');
console.log(summary);
```
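
The duration and upload-date lines of the summary could be produced along the following lines. This is a sketch of the formatting rules only (seconds to `H:MM:SS` or `M:SS`, and `YYYYMMDD` to `YYYY-MM-DD`); `formatDuration`, `formatUploadDate`, and `pad2` are hypothetical helper names, not exports of the package.

```typescript
// Left-pad a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Sketch only: format a duration in seconds as H:MM:SS, or M:SS when under an hour.
function formatDuration(totalSeconds: number): string {
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  return hours > 0
    ? `${hours}:${pad2(minutes)}:${pad2(seconds)}`
    : `${minutes}:${pad2(seconds)}`;
}

// Sketch only: turn a YYYYMMDD string into YYYY-MM-DD, leaving other input untouched.
function formatUploadDate(raw: string): string {
  return /^\d{8}$/.test(raw)
    ? `${raw.slice(0, 4)}-${raw.slice(4, 6)}-${raw.slice(6, 8)}`
    : raw;
}

console.log(formatDuration(630)); // 10:30
console.log(formatDuration(3725)); // 1:02:05
console.log(formatUploadDate("20231201")); // 2023-12-01
```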

## Configuration

### Config Interface

192
src/__tests__/metadata.test.ts
Normal file
@ -0,0 +1,192 @@
// @ts-nocheck
// @jest-environment node
import { describe, test, expect, beforeAll } from '@jest/globals';
import { getVideoMetadata, getVideoMetadataSummary } from '../modules/metadata.js';
import type { VideoMetadata } from '../modules/metadata.js';
import { CONFIG } from '../config.js';

// Set up the Python environment
process.env.PYTHONPATH = '';
process.env.PYTHONHOME = '';

describe('Video Metadata Extraction', () => {
  const testUrl = 'https://www.youtube.com/watch?v=jNQXAC9IVRw';

  describe('getVideoMetadata', () => {
    test('should extract basic metadata from YouTube video', async () => {
      const metadataJson = await getVideoMetadata(testUrl);
      const metadata: VideoMetadata = JSON.parse(metadataJson);

      // Verify that the basic fields exist
      expect(metadata).toHaveProperty('id');
      expect(metadata).toHaveProperty('title');
      expect(metadata).toHaveProperty('uploader');
      expect(metadata).toHaveProperty('duration');
      expect(metadata.id).toBe('jNQXAC9IVRw');
      expect(typeof metadata.title).toBe('string');
      expect(typeof metadata.uploader).toBe('string');
      expect(typeof metadata.duration).toBe('number');
    });

    test('should extract specific fields when requested', async () => {
      const fields = ['id', 'title', 'description', 'channel', 'timestamp'];
      const metadataJson = await getVideoMetadata(testUrl, fields);
      const metadata = JSON.parse(metadataJson);

      // Should contain only the requested fields
      expect(Object.keys(metadata)).toEqual(expect.arrayContaining(fields.filter(f => metadata[f] !== undefined)));

      // Should not contain other fields (even if they exist in the raw data)
      expect(metadata).not.toHaveProperty('formats');
      expect(metadata).not.toHaveProperty('thumbnails');
    });

    test('should handle empty fields array gracefully', async () => {
      const metadataJson = await getVideoMetadata(testUrl, []);
      const metadata = JSON.parse(metadataJson);

      // An empty array should return an empty object
      expect(metadata).toEqual({});
    });

    test('should handle non-existent fields gracefully', async () => {
      const fields = ['id', 'title', 'non_existent_field', 'another_fake_field'];
      const metadataJson = await getVideoMetadata(testUrl, fields);
      const metadata = JSON.parse(metadataJson);

      // Should contain the fields that exist
      expect(metadata).toHaveProperty('id');
      expect(metadata).toHaveProperty('title');

      // Should not contain fields that do not exist
      expect(metadata).not.toHaveProperty('non_existent_field');
      expect(metadata).not.toHaveProperty('another_fake_field');
    });

    test('should throw error for invalid URL', async () => {
      await expect(getVideoMetadata('invalid-url')).rejects.toThrow();
      await expect(getVideoMetadata('https://invalid-domain.com/video')).rejects.toThrow();
    });

    test('should include requested metadata fields from issue #16', async () => {
      const fields = ['id', 'title', 'description', 'creators', 'timestamp', 'channel', 'channel_id', 'channel_url'];
      const metadataJson = await getVideoMetadata(testUrl, fields);
      const metadata = JSON.parse(metadataJson);

      // Verify the fields requested in issue #16
      expect(metadata).toHaveProperty('id');
      expect(metadata).toHaveProperty('title');
      expect(metadata.id).toBe('jNQXAC9IVRw');
      expect(typeof metadata.title).toBe('string');

      // These fields may or may not be present, depending on the video
      if (metadata.description !== undefined) {
        expect(typeof metadata.description).toBe('string');
      }
      if (metadata.creators !== undefined) {
        expect(Array.isArray(metadata.creators)).toBe(true);
      }
      if (metadata.timestamp !== undefined) {
        expect(typeof metadata.timestamp).toBe('number');
      }
      if (metadata.channel !== undefined) {
        expect(typeof metadata.channel).toBe('string');
      }
      if (metadata.channel_id !== undefined) {
        expect(typeof metadata.channel_id).toBe('string');
      }
      if (metadata.channel_url !== undefined) {
        expect(typeof metadata.channel_url).toBe('string');
      }
    });
  });

  describe('getVideoMetadataSummary', () => {
    test('should generate human-readable summary', async () => {
      const summary = await getVideoMetadataSummary(testUrl);

      expect(typeof summary).toBe('string');
      expect(summary.length).toBeGreaterThan(0);

      // Should contain the basic information
      expect(summary).toMatch(/Title:/);

      // Other fields that may be included
      const commonFields = ['Channel:', 'Duration:', 'Views:', 'Upload Date:'];
      const hasAtLeastOneField = commonFields.some(field => summary.includes(field));
      expect(hasAtLeastOneField).toBe(true);
    });

    test('should handle videos with different metadata availability', async () => {
      const summary = await getVideoMetadataSummary(testUrl);

      // The summary should be a valid string
      expect(typeof summary).toBe('string');
      expect(summary.trim().length).toBeGreaterThan(0);

      // Each line should have a meaningful format (field: value), though some titles may contain special characters
      const lines = summary.split('\n').filter(line => line.trim());
      expect(lines.length).toBeGreaterThan(0);

      // At least one line should contain a colon (formatted as "field: value")
      const hasFormattedLines = lines.some(line => line.includes(':'));
      expect(hasFormattedLines).toBe(true);
    }, 30000);

    test('should throw error for invalid URL', async () => {
      await expect(getVideoMetadataSummary('invalid-url')).rejects.toThrow();
    }, 30000);
  });

  describe('Error Handling', () => {
    test('should provide helpful error message for unavailable video', async () => {
      const unavailableUrl = 'https://www.youtube.com/watch?v=invalid_video_id_123456789';

      await expect(getVideoMetadata(unavailableUrl)).rejects.toThrow(/unavailable|private|not available/i);
    });

    test('should handle network errors gracefully', async () => {
      // Use a URL that should cause a network error
      const badNetworkUrl = 'https://httpstat.us/500';

      await expect(getVideoMetadata(badNetworkUrl)).rejects.toThrow();
    });

    test('should handle unsupported URLs', async () => {
      const unsupportedUrl = 'https://example.com/not-a-video';

      await expect(getVideoMetadata(unsupportedUrl)).rejects.toThrow();
    }, 10000);
  });

  describe('Real-world Integration', () => {
    test('should work with different video platforms supported by yt-dlp', async () => {
      // Test only YouTube, because the availability of other platforms may change
      const youtubeUrl = 'https://www.youtube.com/watch?v=jNQXAC9IVRw';

      const metadataJson = await getVideoMetadata(youtubeUrl, ['id', 'title', 'extractor']);
      const metadata = JSON.parse(metadataJson);

      expect(metadata.extractor).toMatch(/youtube/i);
      expect(metadata.id).toBe('jNQXAC9IVRw');
    });

    test('should extract metadata that matches issue #16 requirements', async () => {
      const requiredFields = ['id', 'title', 'description', 'creators', 'timestamp', 'channel', 'channel_id', 'channel_url'];
      const metadataJson = await getVideoMetadata(testUrl, requiredFields);
      const metadata = JSON.parse(metadataJson);

      // Verify that at least the basic fields are present
      expect(metadata).toHaveProperty('id');
      expect(metadata).toHaveProperty('title');

      // Log the fields actually returned, for debugging
      console.log('Available metadata fields for issue #16:', Object.keys(metadata));

      // Check that each requested field exists or has a reasonable substitute
      const availableFields = Object.keys(metadata);
      const hasRequiredBasics = availableFields.includes('id') && availableFields.includes('title');
      expect(hasRequiredBasics).toBe(true);
    });
  });
});
@ -17,8 +17,9 @@ import { downloadVideo } from "./modules/video.js";
import { downloadAudio } from "./modules/audio.js";
import { listSubtitles, downloadSubtitles, downloadTranscript } from "./modules/subtitle.js";
import { searchVideos } from "./modules/search.js";
import { getVideoMetadata, getVideoMetadataSummary } from "./modules/metadata.js";

const VERSION = '0.6.27';
const VERSION = '0.6.28';

/**
 * Validate system configuration
@ -186,6 +187,33 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
          required: ["url"],
        },
      },
      {
        name: "get_video_metadata",
        description: "Extract comprehensive video metadata without downloading the content. Returns detailed information including title, description, channel, timestamps, view counts, and more.",
        inputSchema: {
          type: "object",
          properties: {
            url: { type: "string", description: "URL of the video" },
            fields: {
              type: "array",
              items: { type: "string" },
              description: "Optional: Specific metadata fields to extract (e.g., ['id', 'title', 'description', 'channel']). If not provided, returns all available metadata."
            },
          },
          required: ["url"],
        },
      },
      {
        name: "get_video_metadata_summary",
        description: "Get a human-readable summary of key video metadata including title, channel, duration, views, upload date, and description preview.",
        inputSchema: {
          type: "object",
          properties: {
            url: { type: "string", description: "URL of the video" },
          },
          required: ["url"],
        },
      },
    ],
  };
});
@ -231,6 +259,7 @@ server.setRequestHandler(
      endTime?: string;
      query?: string;
      maxResults?: number;
      fields?: string[];
    };

    if (toolName === "search_videos") {
@ -269,6 +298,16 @@ server.setRequestHandler(
        () => downloadTranscript(args.url, args.language || CONFIG.download.defaultSubtitleLanguage, CONFIG),
        "Error downloading transcript"
      );
    } else if (toolName === "get_video_metadata") {
      return handleToolExecution(
        () => getVideoMetadata(args.url, args.fields, CONFIG),
        "Error extracting video metadata"
      );
    } else if (toolName === "get_video_metadata_summary") {
      return handleToolExecution(
        () => getVideoMetadataSummary(args.url, CONFIG),
        "Error generating video metadata summary"
      );
    } else {
      return {
        content: [{ type: "text", text: `Unknown tool: ${toolName}` }],

295
src/modules/metadata.ts
Normal file
@ -0,0 +1,295 @@
import type { Config } from "../config.js";
import {
  _spawnPromise,
  validateUrl
} from "./utils.js";

/**
 * Video metadata interface containing all fields that can be extracted
 */
export interface VideoMetadata {
  // Basic video information
  id?: string;
  title?: string;
  fulltitle?: string;
  description?: string;
  alt_title?: string;
  display_id?: string;

  // Creator/uploader information
  uploader?: string;
  uploader_id?: string;
  uploader_url?: string;
  creators?: string[];
  creator?: string;

  // Channel information
  channel?: string;
  channel_id?: string;
  channel_url?: string;
  channel_follower_count?: number;
  channel_is_verified?: boolean;

  // Timestamps and dates
  timestamp?: number;
  upload_date?: string;
  release_timestamp?: number;
  release_date?: string;
  release_year?: number;
  modified_timestamp?: number;
  modified_date?: string;

  // Video properties
  duration?: number;
  duration_string?: string;
  view_count?: number;
  concurrent_view_count?: number;
  like_count?: number;
  dislike_count?: number;
  repost_count?: number;
  average_rating?: number;
  comment_count?: number;
  age_limit?: number;

  // Content classification
  live_status?: string;
  is_live?: boolean;
  was_live?: boolean;
  playable_in_embed?: string;
  availability?: string;
  media_type?: string;

  // Playlist information
  playlist_id?: string;
  playlist_title?: string;
  playlist?: string;
  playlist_count?: number;
  playlist_index?: number;
  playlist_autonumber?: number;
  playlist_uploader?: string;
  playlist_uploader_id?: string;
  playlist_channel?: string;
  playlist_channel_id?: string;

  // URLs and technical info
  webpage_url?: string;
  webpage_url_domain?: string;
  webpage_url_basename?: string;
  original_url?: string;
  filename?: string;
  ext?: string;

  // Content metadata
  categories?: string[];
  tags?: string[];
  cast?: string[];
  location?: string;
  license?: string;

  // Series/episode information
  series?: string;
  series_id?: string;
  season?: string;
  season_number?: number;
  season_id?: string;
  episode?: string;
  episode_number?: number;
  episode_id?: string;

  // Music/track information
  track?: string;
  track_number?: number;
  track_id?: string;
  artists?: string[];
  artist?: string;
  genres?: string[];
  genre?: string;
  composers?: string[];
  composer?: string;
  album?: string;
  album_type?: string;
  album_artists?: string[];
  album_artist?: string;
  disc_number?: number;

  // Technical metadata
  extractor?: string;
  epoch?: number;

  // Additional fields that might be present
  [key: string]: unknown;
}

/**
 * Extract video metadata without downloading the actual video content.
 * Uses yt-dlp's --dump-json flag to get comprehensive metadata.
 *
 * @param url - The URL of the video to extract metadata from
 * @param fields - Optional array of specific fields to extract. If not provided, returns all available metadata
 * @param _config - Configuration object (currently unused but kept for consistency)
 * @returns Promise resolving to a pretty-printed JSON string of the metadata
 * @throws {Error} When URL is invalid or metadata extraction fails
 *
 * @example
 * ```typescript
 * // Get all metadata
 * const metadata = await getVideoMetadata('https://youtube.com/watch?v=...');
 * console.log(metadata);
 *
 * // Get specific fields only
 * const specificData = await getVideoMetadata(
 *   'https://youtube.com/watch?v=...',
 *   ['id', 'title', 'description', 'channel']
 * );
 * console.log(specificData);
 * ```
 */
export async function getVideoMetadata(
  url: string,
  fields?: string[],
  _config?: Config
): Promise<string> {
  // Validate the URL
  validateUrl(url);

  const args = [
    "--dump-json",
    "--no-warnings",
    "--no-check-certificate",
    url
  ];

  try {
    // Execute yt-dlp to get metadata
    const output = await _spawnPromise("yt-dlp", args);

    // Parse the JSON output
    const metadata: VideoMetadata = JSON.parse(output);

    // If specific fields are requested, filter the metadata.
    // An empty array is a valid request and yields an empty object.
    if (fields !== undefined) {
      const filteredMetadata: Partial<VideoMetadata> = {};

      for (const field of fields) {
        // Use the prototype method so a key named "hasOwnProperty" in the data cannot shadow it
        if (Object.prototype.hasOwnProperty.call(metadata, field)) {
          filteredMetadata[field as keyof VideoMetadata] = metadata[field as keyof VideoMetadata];
        }
      }

      return JSON.stringify(filteredMetadata, null, 2);
    }

    // Return formatted JSON string with all metadata
    return JSON.stringify(metadata, null, 2);

  } catch (error) {
    if (error instanceof Error) {
      // Handle common yt-dlp errors
      if (error.message.includes("Video unavailable")) {
        throw new Error(`Video is unavailable or private: ${url}`);
      } else if (error.message.includes("Unsupported URL")) {
        throw new Error(`Unsupported URL or extractor not found: ${url}`);
      } else if (error.message.includes("network")) {
        throw new Error(`Network error while extracting metadata: ${error.message}`);
      } else {
        throw new Error(`Failed to extract video metadata: ${error.message}`);
      }
    }
    throw new Error(`Failed to extract video metadata from ${url}`);
  }
}

/**
 * Get a human-readable summary of key video metadata fields.
 * This is useful for quick overview without overwhelming JSON output.
 *
 * @param url - The URL of the video to extract metadata from
 * @param _config - Configuration object (currently unused but kept for consistency)
 * @returns Promise resolving to a formatted summary string
 * @throws {Error} When URL is invalid or metadata extraction fails
 *
 * @example
 * ```typescript
 * const summary = await getVideoMetadataSummary('https://youtube.com/watch?v=...');
 * console.log(summary);
 * // Output:
 * // Title: Example Video Title
 * // Channel: Example Channel
 * // Duration: 10:30
 * // Views: 1,234,567
 * // Upload Date: 2023-12-01
 * // Description: This is an example video...
 * ```
 */
export async function getVideoMetadataSummary(
  url: string,
  _config?: Config
): Promise<string> {
  // Get the full metadata first
  const metadataJson = await getVideoMetadata(url, undefined, _config);
  const metadata: VideoMetadata = JSON.parse(metadataJson);

  // Format key fields into a readable summary
  const lines: string[] = [];

  if (metadata.title) {
    lines.push(`Title: ${metadata.title}`);
  }

  if (metadata.channel) {
    lines.push(`Channel: ${metadata.channel}`);
  }

  if (metadata.uploader && metadata.uploader !== metadata.channel) {
    lines.push(`Uploader: ${metadata.uploader}`);
  }

  if (metadata.duration_string) {
    lines.push(`Duration: ${metadata.duration_string}`);
  } else if (metadata.duration) {
    const hours = Math.floor(metadata.duration / 3600);
    const minutes = Math.floor((metadata.duration % 3600) / 60);
    // Floor the seconds so fractional durations do not leak into the output
    const seconds = Math.floor(metadata.duration % 60);
    const durationStr = hours > 0
      ? `${hours}:${minutes.toString().padStart(2, '0')}:${seconds.toString().padStart(2, '0')}`
      : `${minutes}:${seconds.toString().padStart(2, '0')}`;
    lines.push(`Duration: ${durationStr}`);
  }

  // The parsed JSON may contain null for missing counts, so check the type rather than undefined
  if (typeof metadata.view_count === 'number') {
    lines.push(`Views: ${metadata.view_count.toLocaleString()}`);
  }

  if (typeof metadata.like_count === 'number') {
    lines.push(`Likes: ${metadata.like_count.toLocaleString()}`);
  }

  if (metadata.upload_date) {
    // Format YYYYMMDD to YYYY-MM-DD
    const dateStr = metadata.upload_date;
    if (dateStr.length === 8) {
      const formatted = `${dateStr.substring(0, 4)}-${dateStr.substring(4, 6)}-${dateStr.substring(6, 8)}`;
      lines.push(`Upload Date: ${formatted}`);
    } else {
      lines.push(`Upload Date: ${dateStr}`);
    }
  }

  if (metadata.live_status && metadata.live_status !== 'not_live') {
    // Replace every underscore, not just the first
    lines.push(`Status: ${metadata.live_status.replace(/_/g, ' ')}`);
  }

  if (metadata.tags && metadata.tags.length > 0) {
    lines.push(`Tags: ${metadata.tags.slice(0, 5).join(', ')}${metadata.tags.length > 5 ? '...' : ''}`);
  }

  if (metadata.description) {
    // Truncate description to first 200 characters
    const desc = metadata.description.length > 200
      ? metadata.description.substring(0, 200) + '...'
      : metadata.description;
    lines.push(`Description: ${desc}`);
  }

  return lines.join('\n');
}