unnatural_instructions

Description:

Dataset described in the paper: Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor (2022). Contains sets of natural-language instructions, with optional constraints and LLM-generated reformulations.

Homepage: https://github.com/orhonovich/unnatural-instructions
Source code: tfds.text.unnatural_instructions.UnnaturalInstructions
Versions:
- 0.0.1 (default): Initial release. Omits instruction / inputs, as they require additional processing to be used. instruction_with_input and reformulations contain the instructions and contexts.
Download size: 17.48 MiB
Dataset size: 154.71 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	66,010
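
A minimal sketch of loading the split listed above, assuming the `tensorflow_datasets` package is installed and the data can be downloaded. The helper names `load_train_split` and `check_split_size` are hypothetical; `tfds.load` with its `split`, `shuffle_files`, and `with_info` arguments is the standard TFDS entry point. The import is deferred so that defining the helpers does not itself require TFDS.

```python
EXPECTED_TRAIN_EXAMPLES = 66_010  # count listed in the splits table


def load_train_split(version="0.0.1", shuffle_files=False):
    """Return (dataset, info) for the 'train' split.

    Hypothetical helper. shuffle_files=False matches the condition under
    which this page says the split is auto-cached.
    """
    import tensorflow_datasets as tfds  # deferred: only needed when called

    return tfds.load(
        f"unnatural_instructions:{version}",
        split="train",
        shuffle_files=shuffle_files,
        with_info=True,
    )


def check_split_size(info):
    """Compare the split size reported by TFDS with the catalog value."""
    return info.splits["train"].num_examples == EXPECTED_TRAIN_EXAMPLES
```

Calling `ds, info = load_train_split()` followed by `check_split_size(info)` is one way to confirm that the prepared data matches the table.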

Feature structure:

FeaturesDict({
    'id': Text(shape=(), dtype=string),
    'instances': Sequence({
        'constraints': Text(shape=(), dtype=string),
        'input': Text(shape=(), dtype=string),
        'instruction_with_input': Text(shape=(), dtype=string),
        'output': Text(shape=(), dtype=string),
    }),
    'instruction': Text(shape=(), dtype=string),
    'reformulations': Sequence({
        'input': Text(shape=(), dtype=string),
        'instruction': Text(shape=(), dtype=string),
        'instruction_with_input': Text(shape=(), dtype=string),
        'output': Text(shape=(), dtype=string),
    }),
})
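
In TFDS, a `Sequence` wrapping a feature dict is exposed as a dict of equal-length arrays rather than a list of dicts, so `example['instances']['input'][i]` pairs with `example['instances']['output'][i]`. The sketch below regroups such columns into per-item records. It works on plain Python or NumPy-style values (for instance, what `tfds.as_numpy` yields). The helper name `unpack_sequence` and the sample record are hypothetical; the sample values are invented for illustration and are not taken from the dataset.

```python
def _to_str(value):
    """Decode bytes (as yielded by tfds.as_numpy) to str; pass str through."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def unpack_sequence(columns):
    """Turn {'a': [a0, a1], 'b': [b0, b1]} into [{'a': a0, 'b': b0}, ...]."""
    keys = sorted(columns)
    lengths = {len(columns[k]) for k in keys}
    if len(lengths) > 1:
        raise ValueError(f"columns have unequal lengths: {lengths}")
    return [
        {k: _to_str(v) for k, v in zip(keys, row)}
        for row in zip(*(columns[k] for k in keys))
    ]


# Hypothetical record shaped like the FeaturesDict above (values invented).
example = {
    "id": b"sample-0",
    "instruction": b"Sort the given words alphabetically.",
    "instances": {
        "constraints": [b"The output should be a comma-separated list."],
        "input": [b"pear, apple, fig"],
        "instruction_with_input": [
            b"Sort the given words alphabetically: pear, apple, fig"
        ],
        "output": [b"apple, fig, pear"],
    },
    "reformulations": {
        "input": [],
        "instruction": [],
        "instruction_with_input": [],
        "output": [],
    },
}

records = unpack_sequence(example["instances"])
```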

Feature documentation:

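Feature	Class	Dtype	Description
	FeaturesDict
id	Text	string	Unique identifier for the example.
instances	Sequence
instances/constraints	Text	string	Task-specific constraints.
instances/input	Text	string	Input to be fed into the placeholders of the given instruction.
instances/instruction_with_input	Text	string	Instruction with the inputs supplied to its placeholders.
instances/output	Text	string	Target output for the given task.
instruction	Text	string	Instruction with placeholders for inputs.
reformulations	Sequence
reformulations/input	Text	string	Input to be fed into the placeholders of the given instruction.
reformulations/instruction	Text	string	Instruction with placeholders for inputs.
reformulations/instruction_with_input	Text	string	Instruction with the inputs supplied to its placeholders.
reformulations/output	Text	string	Target output for the given task.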
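
One way to read the table: `instruction_with_input` is the filled-in prompt and `output` is its target, for both `instances` and `reformulations`. The sketch below flattens one example into prompt/target rows, assuming already-decoded string values in the dict-of-lists layout that TFDS uses for `Sequence` features. The function name `to_training_rows`, the `source` tag, and the optional `append_constraints` switch are illustrative assumptions; the dataset does not prescribe how `constraints` should be combined with the prompt.

```python
def to_training_rows(example, include_reformulations=True, append_constraints=False):
    """Flatten one example into a list of {'id', 'source', 'prompt', 'target'} dicts."""
    rows = []

    inst = example["instances"]
    for prompt, constraints, target in zip(
        inst["instruction_with_input"], inst["constraints"], inst["output"]
    ):
        # Assumption: constraints are appended on a new line only when requested.
        if append_constraints and constraints:
            prompt = f"{prompt}\n{constraints}"
        rows.append(
            {"id": example["id"], "source": "instance", "prompt": prompt, "target": target}
        )

    if include_reformulations:
        ref = example["reformulations"]
        for prompt, target in zip(ref["instruction_with_input"], ref["output"]):
            rows.append(
                {"id": example["id"], "source": "reformulation", "prompt": prompt, "target": target}
            )

    return rows


# Hypothetical decoded example (values invented for illustration).
sample = {
    "id": "sample-0",
    "instruction": "Sort the given words alphabetically.",
    "instances": {
        "constraints": ["Answer with a comma-separated list."],
        "input": ["pear, apple"],
        "instruction_with_input": ["Sort the given words alphabetically: pear, apple"],
        "output": ["apple, pear"],
    },
    "reformulations": {
        "input": ["pear, apple"],
        "instruction": ["Put these words in alphabetical order."],
        "instruction_with_input": ["Put these words in alphabetical order: pear, apple"],
        "output": ["apple, pear"],
    },
}

rows = to_training_rows(sample)
```

Keeping the `source` tag makes it easy to train on the original instances only, or to compare them against the reformulations.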

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
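
The rendered preview is not reproduced here. A minimal sketch of producing one locally, assuming `tensorflow_datasets` and `pandas` are installed and the dataset can be downloaded. `preview_examples` is a hypothetical helper name, while `tfds.load`, `Dataset.take`, and `tfds.as_dataframe` are existing TFDS / tf.data calls.

```python
def preview_examples(n=4):
    """Return a pandas DataFrame holding the first n 'train' examples."""
    import tensorflow_datasets as tfds  # deferred: only needed when called

    ds, info = tfds.load("unnatural_instructions", split="train", with_info=True)
    # Passing info lets TFDS use the feature metadata when building the frame.
    return tfds.as_dataframe(ds.take(n), info)
```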

Citation:

@misc{honovich2022unnatural,
      title = {Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor},
      author = {Honovich, Or and Scialom, Thomas and Levy, Omer and Schick, Timo},
      url = {https://arxiv.org/abs/2212.09689},
      publisher = {arXiv},
      year={2022}
}
