ref_coco

Description:

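A collection of 3 referring expression datasets based on images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets were collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.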

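RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance-based descriptions, which was enforced by preventing raters from using location-based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016 and has richer object descriptions than RefCoco because of differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression, while RefCoco has 3.5 words per expression.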

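Each dataset has different split allocations, all of which are typically reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people, respectively. Images are partitioned into the various splits. In the "google" partition, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation splits, but the objects referred to in that image differ between the two sets. In contrast, the "unc" and "umd" partitions divide images between the train, validation, and test splits. In RefCocoG, the "google" partition does not have a canonical test set, and the validation set is typically reported in papers as "val*".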
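
The difference between object-level and image-level partitioning can be illustrated with a small sketch. The toy data, the function names, and the "hold out every fifth item" rule are all hypothetical; they are not how the actual partitions were built and only show why an object-level split lets one image appear on both sides.

```python
def split_by_object(objects, every=5):
    """'google'-style sketch: objects are assigned regardless of their image."""
    train = [o for k, o in enumerate(objects) if k % every != 0]
    held_out = [o for k, o in enumerate(objects) if k % every == 0]
    return train, held_out


def split_by_image(objects, every=5):
    """'unc'/'umd'-style sketch: all objects of an image share one split."""
    train = [o for o in objects if o["image_id"] % every != 0]
    held_out = [o for o in objects if o["image_id"] % every == 0]
    return train, held_out


def shared_images(a, b):
    """Image ids that occur in both object lists."""
    return {o["image_id"] for o in a} & {o["image_id"] for o in b}


# Hypothetical toy data: 10 images with 4 annotated objects each.
objects = [
    {"image_id": i, "object_id": i * 10 + j}
    for i in range(10)
    for j in range(4)
]

obj_train, obj_held = split_by_object(objects)
img_train, img_held = split_by_image(objects)

print("images shared, object-level split:", sorted(shared_images(obj_train, obj_held)))
print("images shared, image-level split:", sorted(shared_images(img_train, img_held)))
```

In both cases the referred objects themselves never overlap between the two sides; only the object-level split shares images.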

Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):

dataset	partition	split	refs	images
refcoco	google	train	40000	19213
refcoco	google	val	5000	4559
refcoco	google	test	5000	4527
refcoco	unc	train	42404	16994
refcoco	unc	val	3811	1500
refcoco	unc	testA	1975	750
refcoco	unc	testB	1810	750
refcoco+	unc	train	42278	16992
refcoco+	unc	val	3805	1500
refcoco+	unc	testA	1975	750
refcoco+	unc	testB	1798	750
refcocog	google	train	44822	24698
refcocog	google	val	5000	4650
refcocog	umd	train	42226	21899
refcocog	umd	val	2573	1300
refcocog	umd	test	5023	2600
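
As a quick consistency check, the table can be loaded as plain data and summarized. This is only a sketch: the `STATS` list and the helper names are hypothetical, and the numbers are copied from the table above.

```python
from collections import defaultdict

# Rows copied from the table above: (dataset, partition, split, refs, images).
STATS = [
    ("refcoco", "google", "train", 40000, 19213),
    ("refcoco", "google", "val", 5000, 4559),
    ("refcoco", "google", "test", 5000, 4527),
    ("refcoco", "unc", "train", 42404, 16994),
    ("refcoco", "unc", "val", 3811, 1500),
    ("refcoco", "unc", "testA", 1975, 750),
    ("refcoco", "unc", "testB", 1810, 750),
    ("refcoco+", "unc", "train", 42278, 16992),
    ("refcoco+", "unc", "val", 3805, 1500),
    ("refcoco+", "unc", "testA", 1975, 750),
    ("refcoco+", "unc", "testB", 1798, 750),
    ("refcocog", "google", "train", 44822, 24698),
    ("refcocog", "google", "val", 5000, 4650),
    ("refcocog", "umd", "train", 42226, 21899),
    ("refcocog", "umd", "val", 2573, 1300),
    ("refcocog", "umd", "test", 5023, 2600),
]


def refs_per_partition(rows):
    """Total number of refs for each (dataset, partition) pair."""
    totals = defaultdict(int)
    for dataset, partition, _split, refs, _images in rows:
        totals[(dataset, partition)] += refs
    return dict(totals)


def refs_per_image(rows):
    """Average number of refs per image for each table row."""
    return {(d, p, s): refs / images for d, p, s, refs, images in rows}


totals = refs_per_partition(STATS)
ratios = refs_per_image(STATS)
for key in sorted(totals):
    print(key, totals[key])
```

Only the "refs" column is summed here. Image counts are not added across splits, because in the "google" partition the same image can appear in more than one split.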

Additional documentation: Explore on Papers With Code
Homepage: https://github.com/lichengunc/refer
Source code: tfds.datasets.ref_coco.Builder
Versions:
- 1.0.0: Initial release.
- 1.1.0 (default): Added masks.
Download size: Unknown size
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
1. Follow the instructions in https://github.com/lichengunc/refer and download the annotations and the images, matching the data/ directory specified in the repo.
2. Follow the PythonAPI instructions in https://github.com/cocodataset/cocoapi to get pycocotools, and get the instances_train2014 annotations file from https://cocodataset.org/#download.
3. Add both refer.py from (1) and pycocotools from (2) to your PYTHONPATH.
4. Run manual_download_process.py to generate refcoco.json, replacing ref_data_root, coco_annotations_file, and out_file with the values corresponding to where you have downloaded, or want to save, these files. manual_download_process.py can be found in the TFDS repository.
5. Download the COCO training set from https://cocodataset.org/#download and put it into a folder called coco_train2014/. Move refcoco.json to the same level as coco_train2014.
6. Follow the standard manual download instructions.
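
Before building the dataset, it can help to check that the files from the steps above are where they are expected. The following is a minimal sketch, assuming that refcoco.json and coco_train2014/ sit directly inside the manual directory. The helper names `check_manual_dir` and `load_ref_coco` are hypothetical, and the loader uses the default config of `ref_coco` without selecting a specific dataset or partition.

```python
from pathlib import Path


def check_manual_dir(manual_dir):
    """Return a list of problems found in the manual download directory."""
    root = Path(manual_dir).expanduser()
    problems = []
    if not (root / "refcoco.json").is_file():
        problems.append("missing file: refcoco.json")
    if not (root / "coco_train2014").is_dir():
        problems.append("missing folder: coco_train2014/")
    return problems


def load_ref_coco(manual_dir, split="train"):
    """Load ref_coco from manually downloaded data (requires TFDS)."""
    # Imported lazily so the layout check above works without TFDS installed.
    import tensorflow_datasets as tfds

    config = tfds.download.DownloadConfig(
        manual_dir=str(Path(manual_dir).expanduser())
    )
    return tfds.load(
        "ref_coco",
        split=split,
        download_and_prepare_kwargs={"download_config": config},
    )


if __name__ == "__main__":
    for problem in check_manual_dir("~/tensorflow_datasets/downloads/manual/"):
        print(problem)
```

An empty list from `check_manual_dir` means both items were found; it does not verify that their contents are complete.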

Auto-cached (documentation): No
Feature structure:

FeaturesDict({
    'coco_annotations': Sequence({
        'area': int64,
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'id': int64,
        'label': int64,
    }),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/id': int64,
    'objects': Sequence({
        'area': int64,
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'gt_box_index': int64,
        'id': int64,
        'label': int64,
        'mask': Image(shape=(None, None, 3), dtype=uint8),
        'refexp': Sequence({
            'raw': Text(shape=(), dtype=string),
            'refexp_id': int64,
        }),
    }),
})
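
The nested structure above can be walked to pair each referring expression with the box of the object it describes. The sketch below makes two assumptions: a `Sequence` of a dict is exposed as a dict of per-object lists, and `BBoxFeature` values are normalized `[ymin, xmin, ymax, xmax]`, following the usual TFDS conventions. The example record and the helper names are hypothetical stand-ins for a real decoded example.

```python
def bbox_to_pixels(bbox, height, width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates."""
    ymin, xmin, ymax, xmax = bbox
    return (
        round(ymin * height),
        round(xmin * width),
        round(ymax * height),
        round(xmax * width),
    )


def iter_expressions(example, height, width):
    """Yield (expression text, pixel box) for every referring expression."""
    objects = example["objects"]
    for i, bbox in enumerate(objects["bbox"]):
        box = bbox_to_pixels(bbox, height, width)
        for raw in objects["refexp"]["raw"][i]:
            # Text features are typically returned as bytes; decode if needed.
            text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
            yield text, box


# Hypothetical record mimicking the 'objects' part of the feature structure.
example = {
    "image/id": 1,
    "objects": {
        "bbox": [[0.1, 0.2, 0.5, 0.6], [0.0, 0.5, 1.0, 1.0]],
        "refexp": {
            "raw": [[b"man in red", b"person on bench"], [b"blue car"]],
            "refexp_id": [[0, 1], [2]],
        },
    },
}

pairs = list(iter_expressions(example, height=100, width=200))
for text, box in pairs:
    print(text, box)
```

With a real example, `height` and `width` would come from the shape of the decoded `image` feature.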

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
coco_annotations	Sequence
coco_annotations/area	Tensor		int64
coco_annotations/bbox	BBoxFeature	(4,)	float32
coco_annotations/id	Tensor		int64
coco_annotations/label	Tensor		int64
image	Image	(None, None, 3)	uint8
image/id	Tensor		int64
objects	Sequence
objects/area	Tensor		int64
objects/bbox	BBoxFeature	(4,)	float32
objects/gt_box_index	Tensor		int64
objects/id	Tensor		int64
objects/label	Tensor		int64
objects/mask	Image	(None, None, 3)	uint8
objects/refexp	Sequence
objects/refexp/raw	Text		string
objects/refexp/refexp_id	Tensor		int64

Supervised keys (see as_supervised doc): None
Citation:

@inproceedings{kazemzadeh2014referitgame,
  title={Referitgame: Referring to objects in photographs of natural scenes},
  author={Kazemzadeh, Sahar and Ordonez, Vicente and Matten, Mark and Berg, Tamara},
  booktitle={Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)},
  pages={787--798},
  year={2014}
}
@inproceedings{yu2016modeling,
  title={Modeling context in referring expressions},
  author={Yu, Licheng and Poirson, Patrick and Yang, Shan and Berg, Alexander C and Berg, Tamara L},
  booktitle={European Conference on Computer Vision},
  pages={69--85},
  year={2016},
  organization={Springer}
}
@inproceedings{mao2016generation,
  title={Generation and Comprehension of Unambiguous Object Descriptions},
  author={Mao, Junhua and Huang, Jonathan and Toshev, Alexander and Camburu, Oana and Yuille, Alan and Murphy, Kevin},
  booktitle={CVPR},
  year={2016}
}
@inproceedings{nagaraja2016modeling,
  title={Modeling context between objects for referring expression understanding},
  author={Nagaraja, Varun K and Morariu, Vlad I and Davis, Larry S},
  booktitle={European Conference on Computer Vision},
  pages={792--807},
  year={2016},
  organization={Springer}
}