يدعم TFDS الآن تنسيق الكرواسون 🥐 ! اقرأ الوثائق لمعرفة المزيد.

تمت ترجمة هذه الصفحة بواسطة Cloud Translation API‏.

wiki_dpr

مراجع:

psgs_w100.nq.exact

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.exact')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.nq.compressed

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.compressed')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.nq.no_index

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.no_index')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.exact

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.exact')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.compressed

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.compressed')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.no_index

استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.no_index')

وصف :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

الترخيص : لا يوجد ترخيص معروف
الإصدار : 0.0.0
الإنشقاقات :

ينقسم	أمثلة
`'train'`	21015300

سمات :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}