laion400m

Description:

The LAION-400M dataset is completely open and freely accessible.

See https://laion.ai/laion-400-open-dataset/ for the full description of this dataset.

All images and texts in the LAION-400M dataset have been filtered with OpenAI's CLIP: the cosine similarity between the text embedding and the image embedding was computed, and pairs with a similarity below 0.3 were dropped. The threshold of 0.3 was determined through human evaluation and appeared to be a good heuristic for estimating semantic image-text content matching.
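The filtering rule described above can be sketched in a few lines of plain Python. This is only an illustration of the cosine-similarity threshold, not the LAION pipeline itself: the function names `cosine_similarity` and `keep_pair` are hypothetical, and the toy two-dimensional vectors stand in for real CLIP embeddings.

```python
import math
from typing import Sequence

# Threshold quoted in the dataset description.
CLIP_SIMILARITY_THRESHOLD = 0.3


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def keep_pair(
    image_embedding: Sequence[float],
    text_embedding: Sequence[float],
    threshold: float = CLIP_SIMILARITY_THRESHOLD,
) -> bool:
    """Keep a pair unless its similarity falls below the threshold."""
    return cosine_similarity(image_embedding, text_embedding) >= threshold


if __name__ == "__main__":
    # Toy vectors (assumption): real CLIP embeddings are much longer.
    print(keep_pair([1.0, 0.0], [1.0, 1.0]))  # similarity about 0.707
    print(keep_pair([1.0, 0.0], [0.0, 1.0]))  # similarity 0.0
```

Pairs whose score equals the threshold are kept here, since the description only says that pairs below 0.3 were dropped.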

The image-text pairs were extracted from the Common Crawl web data dump and come from random web pages crawled between 2014 and 2021.

Additional Documentation: Explore on Papers With Code
Homepage: https://laion.ai/blog/laion-400-open-dataset/
Source code: tfds.vision_language.laion400m.Laion400m
Versions:
- 1.0.0 (default): Initial release.
Download size: Unknown size
Dataset size: Unknown size
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
Refer to the "Download Information" section on https://laion.ai/blog/laion-400-open-dataset/
Auto-cached (documentation): Unknown
Splits:

Split	Examples

Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe): Missing.
Citation:

@article{DBLP:journals/corr/abs-2111-02114,
  author    = {Christoph Schuhmann and
               Richard Vencu and
               Romain Beaumont and
               Robert Kaczmarczyk and
               Clayton Mullis and
               Aarush Katta and
               Theo Coombes and
               Jenia Jitsev and
               Aran Komatsuzaki},
  title     = {{LAION-400M:} Open Dataset of CLIP-Filtered 400 Million Image-Text
               Pairs},
  journal   = {CoRR},
  volume    = {abs/2111.02114},
  year      = {2021},
  url       = {https://arxiv.org/abs/2111.02114},
  eprinttype = {arXiv},
  eprint    = {2111.02114},
  timestamp = {Fri, 05 Nov 2021 15:25:54 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2111-02114.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
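
Because the source data must be downloaded manually, preparing the dataset needs a download configuration that points at the manual directory. The following is a minimal sketch, assuming TensorFlow Datasets is installed and the files are already in place. The helper names `default_manual_dir` and `prepare_laion400m` are hypothetical, and the config names `images` and `embeddings` are taken from the section headings of this page.

```python
import os


def default_manual_dir() -> str:
    """Return the default manual download directory named in the instructions."""
    return os.path.expanduser("~/tensorflow_datasets/downloads/manual/")


def prepare_laion400m(config: str = "images", manual_dir: str = ""):
    """Prepare laion400m/<config> from manually downloaded files.

    Returns whatever `builder.as_dataset()` yields when no split is given,
    so no split name has to be assumed here.
    """
    # Imported lazily so this module can be imported without TFDS installed.
    import tensorflow_datasets as tfds

    download_config = tfds.download.DownloadConfig(
        manual_dir=manual_dir or default_manual_dir()
    )
    builder = tfds.builder(f"laion400m/{config}")
    builder.download_and_prepare(download_config=download_config)
    return builder.as_dataset()


if __name__ == "__main__":
    print("Manual directory:", default_manual_dir())
    # Uncomment once the source files have been downloaded by hand:
    # splits = prepare_laion400m("embeddings")
```

Calling `as_dataset()` without a `split` argument is a deliberate choice: this page lists no split names, so the sketch avoids guessing one.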

laion400m/images (default config)

Feature structure:

FeaturesDict({
    'caption': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8, description=image),
    'license': Text(shape=(), dtype=string),
    'nsfw': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'original_height': Scalar(shape=(), dtype=int32, description=original height of the image),
    'original_width': Scalar(shape=(), dtype=int32, description=original width of the image),
    'similarity': Scalar(shape=(), dtype=float64, description=cosine similarity score between the text and image embedding. Missing values default to -1.0),
    'url': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description	Value range
	FeaturesDict
caption	Text		string	HTML alt-text attribute
image	Image	(None, None, 3)	uint8	image
license	Text		string	type of Creative Commons license (if applicable)
nsfw	ClassLabel		int64	NSFW tag (detected with CLIP). Inconsistent and missing tags are replaced with UNTAGGED
original_height	Scalar		int32	original height of the image
original_width	Scalar		int32	original width of the image
similarity	Scalar		float64	cosine similarity score between the text and image embedding. Missing values default to -1.0	[0.0, 1.0]
url	Text		string	image URL
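
The `similarity` feature has a documented value range of [0.0, 1.0] but uses -1.0 as a placeholder for missing values, so a plain "greater than threshold" check is fine while an average or histogram over the raw column would be skewed. The sketch below shows one way to handle this on decoded examples represented as plain dictionaries. The helper names and the sample rows are made up for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def has_similarity(example: Dict[str, Any]) -> bool:
    """True if the example carries a real score, not the -1.0 placeholder."""
    # Valid scores are documented as lying in [0.0, 1.0]; negatives mean "missing".
    return example["similarity"] >= 0.0


def filter_by_similarity(
    examples: Iterable[Dict[str, Any]], min_similarity: float
) -> Iterator[Dict[str, Any]]:
    """Yield examples whose known similarity is at least `min_similarity`."""
    for example in examples:
        if has_similarity(example) and example["similarity"] >= min_similarity:
            yield example


def mean_similarity(examples: Iterable[Dict[str, Any]]) -> float:
    """Average the similarity over examples that actually have a score."""
    scores = [e["similarity"] for e in examples if has_similarity(e)]
    if not scores:
        raise ValueError("no example has a similarity score")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical rows; only the keys mirror the feature structure above.
    rows = [
        {"url": "https://example.com/a.jpg", "similarity": 0.35},
        {"url": "https://example.com/b.jpg", "similarity": -1.0},
        {"url": "https://example.com/c.jpg", "similarity": 0.31},
    ]
    print([r["url"] for r in filter_by_similarity(rows, 0.33)])
    print(mean_similarity(rows))
```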

laion400m/embeddings

Feature structure:

FeaturesDict({
    'caption': Text(shape=(), dtype=string),
    'image_embedding': Tensor(shape=(512,), dtype=float16, description=CLIP image embedding),
    'license': Text(shape=(), dtype=string),
    'nsfw': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'original_height': Scalar(shape=(), dtype=int32, description=original height of the image),
    'original_width': Scalar(shape=(), dtype=int32, description=original width of the image),
    'similarity': Scalar(shape=(), dtype=float64, description=cosine similarity score between the text and image embedding. Missing values default to -1.0),
    'text_embedding': Tensor(shape=(512,), dtype=float16, description=CLIP text embedding),
    'url': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description	Value range
	FeaturesDict
caption	Text		string	HTML alt-text attribute
image_embedding	Tensor	(512,)	float16	CLIP image embedding
license	Text		string	type of Creative Commons license (if applicable)
nsfw	ClassLabel		int64	NSFW tag (detected with CLIP). Inconsistent and missing tags are replaced with UNTAGGED
original_height	Scalar		int32	original height of the image
original_width	Scalar		int32	original width of the image
similarity	Scalar		float64	cosine similarity score between the text and image embedding. Missing values default to -1.0	[0.0, 1.0]
text_embedding	Tensor	(512,)	float16	CLIP text embedding
url	Text		string	image URL