Dataset Overview
The project utilizes the KArSL-502 (King Saud University Arabic Sign Language) dataset, a large-scale word-level sign language dataset.
Dataset Characteristics
- Size: 502 classes (Arabic sign words).
- Subjects: 3 distinct signers (01, 02, 03).
- Structure:
- Raw Video: Folder structure
signer/split/word/video.mp4. - Keypoints: Extracted MediaPipe landmarks stored as
.npzfiles.
- Raw Video: Folder structure
Directory Structure
The processed dataset is expected to follow this hierarchy:
data/
├── 01/ # Signer 01
│ ├── train/
│ │ ├── 0001/ # Word ID 1
│ │ └── ...
│ └── test/
├── 02/ # Signer 02
└── 03/ # Signer 03Keypoint Data Format
Each video is converted into a sequence of keypoints with the following shape:
(SEQ_LEN, TOTAL_FEATURES)
Where TOTAL_FEATURES is the flattened vector of all landmarks:
- Pose: 33 points × 3 coords (x, y, z)
- Face: Selected subset points × 3 coords
- Hands: 21 points × 3 coords × 2 hands