# SignIT Dataset
This README accompanies the following paper:

Alessia Micieli, Giovanni Maria Farinella, Francesco Ragusa (2026). SignIT: A Comprehensive Dataset and Multimodal Analysis for Italian Sign Language Recognition. In International Conference on Computer Vision Theory and Applications (VISAPP).

The sections below describe how to use the videos together with the metadata in the CSV files and the 2D keypoints.


## Structure

```text
.
├── Videos/
│   ├── animali_animals/
│   ├── cibo_food/
│   ├── colori_colors/
│   ├── emozioni_emotions/
│   └── famiglia_family/
├── Videos.csv
├── 2D_Keypoints_Body/2D_Keypoints_Body.csv
├── 2D_Keypoints_Hand/2D_Keypoints_Hand.csv
├── 2D_keypoints_Face/2D_keypoints_Face.csv
├── CSV Macro/CSV Macro/Macro.csv
└── CSV Micro/CSV Micro/
    ├── Animals.csv
    ├── Colors.csv
    ├── Emotions.csv
    ├── Family.csv
    └── Food.csv
```

Folders under `Videos/` use the `italian_english` naming format, for example `animali_animals`. Class subfolders follow the same convention, for example `Videos/cibo_food/acqua_water`.
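As a quick sanity check of this layout, the folder tree can be walked with the standard library. This is a minimal sketch: `index_videos` is a hypothetical helper, not part of the dataset, and it assumes only the `Videos/<macro>/<class>/<file>` nesting described above. The demo builds a tiny temporary tree with a placeholder file name (`example.mp4`) instead of touching the real data.

```python
import tempfile
from pathlib import Path


def index_videos(root):
    """Return sorted (macro_dir, class_dir, filename) tuples found under root."""
    root = Path(root)
    entries = []
    # Assumed nesting: root / macro folder / class folder / video file.
    for video in root.glob("*/*/*"):
        if video.is_file():
            entries.append((video.parent.parent.name, video.parent.name, video.name))
    return sorted(entries)


# Demo on a synthetic tree; "example.mp4" is a placeholder, not a real file name.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "Videos"
    class_dir = demo_root / "cibo_food" / "acqua_water"
    class_dir.mkdir(parents=True)
    (class_dir / "example.mp4").touch()
    demo_index = index_videos(demo_root)
```

On the real dataset, `index_videos("Videos")` should list every file with its macro and class folder, which makes unexpected folder names easy to spot.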

## Main CSV

### `Videos.csv`

Main index of the cropped videos. Each row describes one video and contains:

- `name_video`: video filename, including class name, sequence, and frame interval.
- `it_macro`, `eng_macro`: macro-category in Italian and English.
- `it_label`, `eng_label`: class or gloss in Italian and English.
- `frame_start_csv`, `frame_end_csv`: frame interval in the source video.
- `split`: recommended dataset split, for example `train` or `test`.

Use this CSV when working at video level.
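Before training, it can help to see how videos are distributed across macro-categories and splits. The sketch below assumes only the `eng_macro` and `split` columns listed above; `split_summary` is a hypothetical helper, and the demo rows are synthetic stand-ins for `Videos.csv` with illustrative labels.

```python
import pandas as pd


def split_summary(videos):
    """Count videos per macro-category (rows) and split (columns)."""
    counts = videos.groupby(["eng_macro", "split"]).size()
    # Missing macro/split combinations become 0 instead of NaN.
    return counts.unstack(fill_value=0)


# Synthetic rows standing in for Videos.csv; values are illustrative only.
demo_videos = pd.DataFrame(
    {
        "eng_macro": ["animals", "animals", "food"],
        "eng_label": ["cat", "dog", "water"],
        "split": ["train", "test", "train"],
    }
)
demo_summary = split_summary(demo_videos)
```

With the real index, the same table could be obtained with `split_summary(pd.read_csv("Videos.csv"))`.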

## 2D Keypoint CSVs

The following three CSV files contain one row per frame and share the same descriptive columns:

- `it_label`, `eng_label`: sign class in Italian and English.
- `new_name`: frame filename, for example `01_00565.jpg`.
- `it_macro`, `eng_macro`: macro-category.
- `split`: dataset split.

The final column contains the keypoint vector:

- `2D_Keypoints_Body/2D_Keypoints_Body.csv`: `Vector_body` column.
- `2D_Keypoints_Hand/2D_Keypoints_Hand.csv`: `Vector_hand` column.
- `2D_keypoints_Face/2D_keypoints_Face.csv`: `Vector_face` column.

Vectors are stored as Python lists serialized as strings. To use them as numeric lists:

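```python
import ast
import pandas as pd

body = pd.read_csv("2D_Keypoints_Body/2D_Keypoints_Body.csv")
# Each cell is a string holding a Python list; parse it into a real list.
body["Vector_body"] = body["Vector_body"].apply(ast.literal_eval)

# Keypoint vector of the first frame, now a list of numbers.
first_vector = body.loc[0, "Vector_body"]
```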
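Most numeric libraries expect an array rather than a column of lists. The sketch below stacks the parsed vectors into a single NumPy matrix with one row per frame. It assumes all vectors in a column have the same length and raises an error otherwise. Since this README does not describe how coordinates are ordered inside a vector, the values are kept flat rather than reshaped into points. `stack_vectors` is a hypothetical helper and the demo strings are synthetic.

```python
import ast

import numpy as np
import pandas as pd


def stack_vectors(series):
    """Parse string-encoded vectors and stack them into a (frames, values) array."""
    parsed = [ast.literal_eval(v) if isinstance(v, str) else v for v in series]
    lengths = {len(v) for v in parsed}
    # Assumption: every frame has a vector of the same length.
    if len(lengths) != 1:
        raise ValueError(f"vectors have inconsistent lengths: {sorted(lengths)}")
    return np.asarray(parsed, dtype=np.float32)


# Synthetic cells in the same string format as the Vector_* columns.
demo_cells = pd.Series(["[0.1, 0.2, 0.3, 0.4]", "[0.5, 0.6, 0.7, 0.8]"])
demo_matrix = stack_vectors(demo_cells)
```

On the real data this would be called as `stack_vectors(body["Vector_body"])`, before or after applying `ast.literal_eval`.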

To join body, hand, and face keypoints for the same frame, use the shared columns:

```python
import ast
import pandas as pd

body = pd.read_csv("2D_Keypoints_Body/2D_Keypoints_Body.csv")
hand = pd.read_csv("2D_Keypoints_Hand/2D_Keypoints_Hand.csv")
face = pd.read_csv("2D_keypoints_Face/2D_keypoints_Face.csv")

# Descriptive columns shared by the three files; together they identify a frame.
keys = ["it_label", "eng_label", "new_name", "it_macro", "eng_macro", "split"]

# merge() defaults to an inner join: only frames present in all three files are kept.
df = body.merge(hand, on=keys).merge(face, on=keys)

# Parse the string-encoded vectors into Python lists.
for col in ["Vector_body", "Vector_hand", "Vector_face"]:
    df[col] = df[col].apply(ast.literal_eval)
```
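Because an inner join silently drops frames that are missing from one file, it may be worth checking coverage before merging. The sketch below uses the `how="outer"` and `indicator=True` options of `DataFrame.merge` to count frames found in both tables or in only one of them. `modality_coverage` and `_demo_rows` are hypothetical helpers, and the demo rows (including the `gatto`/`cat` label) are synthetic.

```python
import pandas as pd

KEYS = ["it_label", "eng_label", "new_name", "it_macro", "eng_macro", "split"]


def modality_coverage(left, right, keys=KEYS):
    """Count frames present in both tables, only in left, or only in right."""
    merged = left[keys].merge(right[keys], on=keys, how="outer", indicator=True)
    # The "_merge" column takes the values "both", "left_only", "right_only".
    return merged["_merge"].value_counts().to_dict()


def _demo_rows(names):
    # Synthetic metadata; only the frame names differ between rows.
    return pd.DataFrame(
        {
            "it_label": "gatto",
            "eng_label": "cat",
            "new_name": names,
            "it_macro": "animali",
            "eng_macro": "animals",
            "split": "train",
        }
    )


demo_body = _demo_rows(["01_00001.jpg", "01_00002.jpg"])
demo_hand = _demo_rows(["01_00002.jpg", "01_00003.jpg"])
demo_counts = modality_coverage(demo_body, demo_hand)
```

If `left_only` or `right_only` is non-zero on the real files, the inner join above will return fewer rows than the individual CSVs.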

## Macro and Micro CSVs

`CSV Macro/CSV Macro/Macro.csv` contains all frames aggregated at macro-category level.

Files in `CSV Micro/CSV Micro/` contain the same frame-level metadata, split by macro-category:

- `Animals.csv`
- `Colors.csv`
- `Emotions.csv`
- `Family.csv`
- `Food.csv`

These files are useful when training or evaluating a model on a single macro-category, or when preparing hierarchical macro/micro experiments.
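For hierarchical experiments, a macro-to-class mapping is usually the first thing needed. The sketch below derives it from a frame-level table. It assumes the table exposes `eng_macro` and `eng_label` columns like the keypoint CSVs; the exact column names in `Macro.csv` and the Micro files should be checked before use. `label_hierarchy` is a hypothetical helper and the demo rows are synthetic.

```python
import pandas as pd


def label_hierarchy(frames):
    """Map each macro-category to the sorted list of classes seen under it."""
    grouped = frames.groupby("eng_macro")["eng_label"].unique()
    return {macro: sorted(labels) for macro, labels in grouped.items()}


# Synthetic frame-level rows; labels are illustrative only.
demo_frames = pd.DataFrame(
    {
        "eng_macro": ["animals", "animals", "animals", "food"],
        "eng_label": ["dog", "cat", "dog", "water"],
    }
)
demo_hierarchy = label_hierarchy(demo_frames)
```

The resulting dictionary can drive a two-stage setup: one classifier over the keys (macro-categories) and one per key over its list of classes.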

## Recommended Usage

For video-level experiments:

1. Read `Videos.csv`.
2. Filter by `split`, `it_macro`/`eng_macro`, or `it_label`/`eng_label`.
3. Build the video path as `Videos/{it_macro}_{eng_macro}/{it_label}_{normalized_eng_label}/{name_video}`.

Example:

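```python
from pathlib import Path
import pandas as pd

# Step 1: read the video index.
videos = pd.read_csv("Videos.csv")
# Step 2: filter by split and macro-category.
train = videos[videos["split"] == "train"]
animals = train[train["eng_macro"] == "animals"]

def normalize_label(value):
    # Trim, lowercase, and replace spaces and hyphens with underscores.
    return str(value).strip().lower().replace(" ", "_").replace("-", "_")

def video_path(row):
    # Step 3: Videos/{it_macro}_{eng_macro}/{it_label}_{normalized_eng_label}/{name_video}
    macro_dir = f"{row.it_macro}_{row.eng_macro}"
    class_dir = f"{row.it_label.strip()}_{normalize_label(row.eng_label)}"
    return Path("Videos") / macro_dir / class_dir / row.name_video

# Relative path of the first video in the index.
first_path = video_path(videos.iloc[0])
```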
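Since the path is rebuilt from metadata, it is worth verifying that every expected file actually exists before starting a long run. The sketch below reuses the same path rule and returns the rows whose file is missing. `find_missing` and the `root` parameter are hypothetical additions, and the demo uses a temporary folder with placeholder file names.

```python
import tempfile
from pathlib import Path

import pandas as pd


def normalize_label(value):
    return str(value).strip().lower().replace(" ", "_").replace("-", "_")


def video_path(row, root="Videos"):
    macro_dir = f"{row.it_macro}_{row.eng_macro}"
    class_dir = f"{str(row.it_label).strip()}_{normalize_label(row.eng_label)}"
    return Path(root) / macro_dir / class_dir / row.name_video


def find_missing(videos, root="Videos"):
    """Return the rows of `videos` whose expected file is not on disk."""
    exists = [video_path(row, root).is_file() for row in videos.itertuples(index=False)]
    mask = pd.Series(exists, index=videos.index, dtype=bool)
    return videos[~mask]


# Demo: one file is created, the other is deliberately absent.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "Videos"
    present_dir = demo_root / "cibo_food" / "acqua_water"
    present_dir.mkdir(parents=True)
    (present_dir / "present.mp4").touch()
    demo_videos = pd.DataFrame(
        {
            "name_video": ["present.mp4", "absent.mp4"],
            "it_macro": ["cibo", "cibo"],
            "eng_macro": ["food", "food"],
            "it_label": ["acqua", "acqua"],
            "eng_label": ["Water", "Water"],
        }
    )
    demo_missing = find_missing(demo_videos, root=demo_root)
```

An empty result from `find_missing(pd.read_csv("Videos.csv"))` would indicate that the path rule matches the folder layout for every row.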

Note on reconstructed videos: some videos could not be extracted directly from the original source because the source video was no longer available. They were instead reconstructed from the frames processed for the related paper, so they may be less smooth than videos extracted directly from the source.

For frame/keypoint-level experiments:

1. Read one or more CSV files from `2D_Keypoints_*`.
2. Convert the `Vector_*` columns with `ast.literal_eval`.
3. Use `split` to separate training and test data.
4. Use `it_label`/`eng_label` as classification targets.
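The four steps above can be sketched as a single helper that returns feature and target arrays per split. This is an illustration under stated assumptions, not the pipeline used in the paper: `make_xy` is a hypothetical helper, class indices are assigned by sorting the label names, and all vectors in the chosen column are assumed to have the same length. The demo rows are synthetic.

```python
import ast

import numpy as np
import pandas as pd


def make_xy(frames, vector_col, label_col="eng_label"):
    """Return ({split: (X, y)}, classes) from a frame-level keypoint table."""
    # Build one label index over the whole table so all splits share it.
    classes = sorted(frames[label_col].unique())
    class_to_index = {name: i for i, name in enumerate(classes)}
    out = {}
    for split, part in frames.groupby("split"):
        vectors = [
            ast.literal_eval(v) if isinstance(v, str) else v
            for v in part[vector_col]
        ]
        X = np.asarray(vectors, dtype=np.float32)
        y = part[label_col].map(class_to_index).to_numpy(dtype=np.int64)
        out[split] = (X, y)
    return out, classes


# Synthetic rows in the same format as a 2D_Keypoints_* CSV.
demo_frames = pd.DataFrame(
    {
        "eng_label": ["cat", "dog", "cat"],
        "split": ["train", "train", "test"],
        "Vector_body": ["[0.1, 0.2]", "[0.3, 0.4]", "[0.5, 0.6]"],
    }
)
demo_sets, demo_classes = make_xy(demo_frames, "Vector_body")
X_train, y_train = demo_sets["train"]
```

The returned `classes` list maps each integer in `y` back to its label, so predictions can be reported with the original class names.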

## Naming Notes

Some CSV files preserve historical names when needed for compatibility with previous processing pipelines. Video folders have been normalized to the bilingual `italian_english` format.

## Cite us


If you find our work useful, cite our paper:

    @inproceedings{micieli2026signit,
      title={SignIT: A Comprehensive Dataset and Multimodal Analysis for Italian Sign Language Recognition},
      author={Alessia Micieli and Giovanni Maria Farinella and Francesco Ragusa},
      booktitle={Proceedings of the 21st International Conference on Computer Vision Theory and Applications (VISAPP 2026)},
      volume={2},
      pages={461-468},
      year={2026},
      doi={10.5220/0014480100004084},
      isbn={978-989-758-804-4}
    }