laion400m

Description:

The LAION-400M dataset is completely open and freely accessible.

See https://laion.ai/laion-400-open-dataset/ for the full description of this dataset.

All images and texts in the LAION-400M dataset have been filtered with OpenAI's CLIP by calculating the cosine similarity between the text and image embeddings and dropping the pairs with a similarity below 0.3. The threshold of 0.3 was determined through human evaluations and appears to be a good heuristic for estimating semantic image-text-content matching.
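The filtering rule described above can be sketched in a few lines. This is a minimal illustration only, not the LAION pipeline itself: the function names are hypothetical, the embeddings are assumed to be plain sequences of floats, and treating a zero vector as "no similarity" is an assumption made here for safety.

```python
import math

SIMILARITY_THRESHOLD = 0.3  # threshold stated in the dataset description


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def keep_pair(image_emb, text_emb, threshold=SIMILARITY_THRESHOLD):
    """Return True if the pair survives the filter (similarity not below threshold)."""
    return cosine_similarity(image_emb, text_emb) >= threshold
```

For example, `keep_pair([1.0, 0.0], [0.0, 1.0])` rejects the pair because orthogonal embeddings have a similarity of 0, which is below 0.3.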

The image-text pairs were extracted from the Common Crawl web data dump and come from random web pages crawled between 2014 and 2021.

Additional documentation: Explore on Papers With Code
Homepage: https://laion.ai/blog/laion-400-open-dataset/
Source code: tfds.vision_language.laion400m.Laion400m
Versions:
- 1.0.0 (default): Initial release.
Download size: Unknown size
Dataset size: Unknown size
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
Refer to the "Download Information" section at https://laion.ai/blog/laion-400-open-dataset/
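Once the source files are in place, preparing the dataset could look roughly like the sketch below. The helper names resolve_manual_dir and build_laion400m are hypothetical, and the code assumes the tensorflow-datasets package is installed and the manually downloaded files are laid out as the builder expects. No split name is passed because this page does not list the splits, so all available splits are returned.

```python
import os

# Default location quoted in the manual download instructions.
DEFAULT_MANUAL_DIR = "~/tensorflow_datasets/downloads/manual/"


def resolve_manual_dir(manual_dir=None):
    """Expand the manual download directory, falling back to the default."""
    return os.path.expanduser(manual_dir or DEFAULT_MANUAL_DIR)


def build_laion400m(config="laion400m/images", manual_dir=None):
    """Prepare the dataset from manually downloaded files and return all splits."""
    # Imported lazily so this module can be imported without TFDS installed.
    import tensorflow_datasets as tfds

    builder = tfds.builder(config)
    builder.download_and_prepare(
        download_config=tfds.download.DownloadConfig(
            manual_dir=resolve_manual_dir(manual_dir)
        )
    )
    # With no split argument, as_dataset() returns a dict of split name to dataset.
    return builder.as_dataset()
```

Calling `build_laion400m("laion400m/embeddings", manual_dir="/data/laion")` would use the embeddings config with a custom directory; the path shown is only a placeholder.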
Auto-cached (documentation): Unknown
Splits:

Split	Examples

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe): Missing.
Citation:

@article{DBLP:journals/corr/abs-2111-02114,
  author    = {Christoph Schuhmann and
               Richard Vencu and
               Romain Beaumont and
               Robert Kaczmarczyk and
               Clayton Mullis and
               Aarush Katta and
               Theo Coombes and
               Jenia Jitsev and
               Aran Komatsuzaki},
  title     = {{LAION-400M:} Open Dataset of CLIP-Filtered 400 Million Image-Text
               Pairs},
  journal   = {CoRR},
  volume    = {abs/2111.02114},
  year      = {2021},
  url       = {https://arxiv.org/abs/2111.02114},
  eprinttype = {arXiv},
  eprint    = {2111.02114},
  timestamp = {Fri, 05 Nov 2021 15:25:54 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2111-02114.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

laion400m/images (default config)

Feature structure:

FeaturesDict({
    'caption': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8, description=image),
    'license': Text(shape=(), dtype=string),
    'nsfw': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'original_height': Scalar(shape=(), dtype=int32, description=original height of the image),
    'original_width': Scalar(shape=(), dtype=int32, description=original width of the image),
    'similarity': Scalar(shape=(), dtype=float64, description=cosine similarity score between the text and image embedding. Missing values default to -1.0),
    'url': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description	Value range
	FeaturesDict
caption	Text		string	HTML alt-text attribute
image	Image	(None, None, 3)	uint8	image
license	Text		string	type of Creative Commons license (if applicable)
nsfw	ClassLabel		int64	NSFW tag (detected with CLIP). Incohesive and missing tags are replaced with UNTAGGED
original_height	Scalar		int32	original height of the image
original_width	Scalar		int32	original width of the image
similarity	Scalar		float64	cosine similarity score between the text and image embedding. Missing values default to -1.0	[0.0, 1.0]
url	Text		string	image URL
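Because missing similarity scores are stored as -1.0, which lies outside the documented [0.0, 1.0] range, aggregate statistics should skip that sentinel. The following is a small sketch under the assumption that each example has already been converted to a plain Python dict (for instance via tfds.as_numpy); the helper names are hypothetical.

```python
MISSING_SIMILARITY = -1.0  # sentinel documented for the similarity feature


def similarity_or_none(value):
    """Map the -1.0 sentinel to None; return any other score as a float."""
    return None if value == MISSING_SIMILARITY else float(value)


def mean_similarity(records):
    """Average the similarity over records, ignoring missing scores."""
    scores = [
        score
        for score in (similarity_or_none(r["similarity"]) for r in records)
        if score is not None
    ]
    # Return None rather than dividing by zero when every score is missing.
    return sum(scores) / len(scores) if scores else None
```

Without this step, a single missing value would pull an average toward a negative number that no real pair can have.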

laion400m/embeddings

Feature structure:

FeaturesDict({
    'caption': Text(shape=(), dtype=string),
    'image_embedding': Tensor(shape=(512,), dtype=float16, description=CLIP image embedding),
    'license': Text(shape=(), dtype=string),
    'nsfw': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'original_height': Scalar(shape=(), dtype=int32, description=original height of the image),
    'original_width': Scalar(shape=(), dtype=int32, description=original width of the image),
    'similarity': Scalar(shape=(), dtype=float64, description=cosine similarity score between the text and image embedding. Missing values default to -1.0),
    'text_embedding': Tensor(shape=(512,), dtype=float16, description=CLIP text embedding),
    'url': Text(shape=(), dtype=string),
})

Feature documentation:

Feature	Class	Shape	Dtype	Description	Value range
	FeaturesDict
caption	Text		string	HTML alt-text attribute
image_embedding	Tensor	(512,)	float16	CLIP image embedding
license	Text		string	type of Creative Commons license (if applicable)
nsfw	ClassLabel		int64	NSFW tag (detected with CLIP). Incohesive and missing tags are replaced with UNTAGGED
original_height	Scalar		int32	original height of the image
original_width	Scalar		int32	original width of the image
similarity	Scalar		float64	cosine similarity score between the text and image embedding. Missing values default to -1.0	[0.0, 1.0]
text_embedding	Tensor	(512,)	float16	CLIP text embedding
url	Text		string	image URL
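One possible use of the precomputed embeddings is a nearest-neighbour lookup, for example ranking image embeddings against a text embedding. The sketch below is illustrative and not part of TFDS: top_k_by_cosine is a hypothetical helper, NumPy is assumed to be available, and every vector is assumed to be non-zero. The float16 values are upcast to float32 before normalising to limit rounding error.

```python
import numpy as np


def top_k_by_cosine(query, candidates, k=3):
    """Return (indices, scores) of the k candidates most similar to query."""
    # Upcast from float16: normalisation and dot products are more
    # accurate in float32.
    q = np.asarray(query, dtype=np.float32)
    c = np.asarray(candidates, dtype=np.float32)
    # Assumption: no zero vectors, otherwise these divisions yield NaN.
    q = q / np.linalg.norm(q)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    scores = c @ q  # cosine similarity of each candidate row with the query
    order = np.argsort(-scores)[:k]  # highest score first
    return order.tolist(), scores[order].tolist()
```

With real data, `query` would be a single text_embedding of shape (512,) and `candidates` a stacked array of image_embedding rows of shape (N, 512).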
