google-research-datasets

Objectron google-research-datasets/Objectron Jupyter Notebook

Objectron is a dataset of short, object-centric video clips. The videos also carry AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object, capturing it from different views. Each object is annotated with a 3D bounding box that describes its position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
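To make the "position, orientation, and dimensions" description concrete, here is a minimal sketch of how an oriented 3D box can be expanded into its eight corner points. It is not taken from the Objectron code or annotation format: the function name `box_corners`, the argument layout (center, row-major 3x3 rotation matrix, full-extent size), and the corner ordering are all assumptions made for illustration.

```python
from itertools import product


def box_corners(center, rotation, size):
    """Return the 8 corners of an oriented 3D box (hypothetical helper).

    center:   (x, y, z) position of the box center
    rotation: 3x3 rotation matrix as nested lists, row-major (orientation)
    size:     (width, height, depth) full extents along the box's own axes
    """
    half = [s / 2.0 for s in size]
    corners = []
    # Enumerate every sign combination (-, -, -) ... (+, +, +).
    for signs in product((-1.0, 1.0), repeat=3):
        # Corner in the box's local frame.
        local = [sign * h for sign, h in zip(signs, half)]
        # Rotate into the world frame, then translate by the center.
        world = tuple(
            center[i] + sum(rotation[i][j] * local[j] for j in range(3))
            for i in range(3)
        )
        corners.append(world)
    return corners


# Axis-aligned box centered at the origin.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corners = box_corners((0.0, 0.0, 0.0), IDENTITY, (2.0, 4.0, 6.0))

# Same box rotated 90 degrees about the z axis and shifted along x.
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
rotated = box_corners((10.0, 0.0, 0.0), ROT_Z_90, (2.0, 4.0, 6.0))
```

The three inputs map one-to-one onto the three properties named in the description; the dataset's actual files may encode them differently, so check the repository's documentation before relying on this layout.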

2.3k 265

wit google-research-datasets/wit

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

1.1k 46

wiki-split google-research-datasets/wiki-split

One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits.

101 4

query-wellformedness google-research-datasets/query-wellformedness

25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.

85 11

Repositories (4)