lbpp
Less Basic Python Programming (LBPP) is a collection of 161 programming problems
with accompanying unit tests. The problems were created to be fresh (not leaked
at the time of creation) and more difficult than those in similar datasets
(e.g., HumanEval and MBPP). Because LBPP is structured in an equivalent way, it
can serve as a drop-in replacement for, or an enrichment of, those datasets.
FeaturesDict({
    'categories': Sequence(Text(shape=(), dtype=string)),
    'completion': Text(shape=(), dtype=string),
    'instruction': Text(shape=(), dtype=string),
    'language': Text(shape=(), dtype=string),
    'signature': Text(shape=(), dtype=string),
    'task_id': Text(shape=(), dtype=string),
    'test_file': Text(shape=(), dtype=string),
    'test_list': Sequence(Text(shape=(), dtype=string)),
    'test_setup': Text(shape=(), dtype=string),
    'title': Text(shape=(), dtype=string),
})

| Feature     | Class          | Shape   | Dtype  | Description |
|-------------|----------------|---------|--------|-------------|
|             | FeaturesDict   |         |        |             |
| categories  | Sequence(Text) | (None,) | string |             |
| completion  | Text           |         | string |             |
| instruction | Text           |         | string |             |
| language    | Text           |         | string |             |
| signature   | Text           |         | string |             |
| task_id     | Text           |         | string |             |
| test_file   | Text           |         | string |             |
| test_list   | Sequence(Text) | (None,) | string |             |
| test_setup  | Text           |         | string |             |
| title       | Text           |         | string |             |

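
The feature list above gives only names and types, so a small sketch may help show how one record could be used for evaluation. The example below is an illustration under stated assumptions, not the dataset's official harness: it assumes (MBPP-style) that `test_setup` is plain Python run once before the tests and that each `test_list` entry is a standalone statement such as an `assert`. The names `LbppRecord` and `passes_tests` and the sample record are hypothetical and were invented for this sketch; real records may need adapting (for example, if `test_setup` imports the solution from a module).

```python
from typing import List, TypedDict


class LbppRecord(TypedDict):
    """One record, mirroring the feature names listed above."""
    categories: List[str]
    completion: str
    instruction: str
    language: str
    signature: str
    task_id: str
    test_file: str
    test_list: List[str]
    test_setup: str
    title: str


def passes_tests(record: LbppRecord, candidate: str) -> bool:
    """Return True if `candidate` passes every test in the record.

    Assumption: `test_setup` is plain Python and each `test_list`
    entry is a standalone statement (typically an assert).
    Only run code you trust: exec() provides no sandboxing.
    """
    namespace: dict = {}
    try:
        exec(record["test_setup"], namespace)
        exec(candidate, namespace)
        for test in record["test_list"]:
            exec(test, namespace)
    except Exception:
        return False
    return True


# Hypothetical record, invented for illustration; not taken from LBPP.
example: LbppRecord = {
    "categories": ["math"],
    "completion": "def add(a: int, b: int) -> int:\n    return a + b\n",
    "instruction": "Write a function that adds two integers.",
    "language": "python",
    "signature": "def add(a: int, b: int) -> int",
    "task_id": "example/0",
    "test_file": "",
    "test_list": ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"],
    "test_setup": "import math",
    "title": "Add two integers",
}

print(passes_tests(example, example["completion"]))  # True
print(passes_tests(example, "def add(a, b):\n    return a - b\n"))  # False
```

Running each candidate in a fresh namespace keeps one attempt from leaking definitions into the next; a real evaluation would more likely run the tests in a separate, sandboxed process.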
@inproceedings{matton-etal-2024-leakage,
title = "On Leakage of Code Generation Evaluation Datasets",
author = "Matton, Alexandre and
Sherborne, Tom and
Aumiller, Dennis and
Tommasone, Elena and
Alizadeh, Milad and
He, Jingyi and
Ma, Raymond and
Voisin, Maxime and
Gilsenan-McMahon, Ellen and
Gall{\'e}, Matthias",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.findings-emnlp.772/",
doi = "10.18653/v1/2024.findings-emnlp.772",
pages = "13215--13223",
}

| Config                    | Split    | Examples |
|---------------------------|----------|----------|
| lbpp/all (default config) | `'test'` | 944      |
| lbpp/multilingual         | `'test'` | 944      |
| lbpp/default              | `'test'` | 162      |
| lbpp/python               | `'test'` | 162      |
| lbpp/cpp                  | `'test'` | 161      |
| lbpp/go                   | `'test'` | 161      |
| lbpp/java                 | `'test'` | 158      |
| lbpp/js                   | `'test'` | 153      |
| lbpp/javascript           | `'test'` | 153      |
| lbpp/rust                 | `'test'` | 149      |

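
The split counts above can be cross-checked with a little arithmetic. The sketch below copies the counts from this page and confirms that the six per-language configs add up to the 944 examples of the multilingual config. Treating `all`, `default`, and `js` as aliases of `multilingual`, `python`, and `javascript` is an inference from their identical counts, not something the page states. The helper names `per_language_counts` and `load_lbpp` are hypothetical; `load_lbpp` assumes the `tensorflow_datasets` package is installed and is defined here but not called.

```python
# 'test' split sizes, copied from the config listing above.
TEST_EXAMPLES = {
    "all": 944,
    "multilingual": 944,
    "default": 162,
    "python": 162,
    "cpp": 161,
    "go": 161,
    "java": 158,
    "js": 153,
    "javascript": 153,
    "rust": 149,
}

# Assumption: configs with identical counts are aliases of each other.
ALIASES = {"all": "multilingual", "default": "python", "js": "javascript"}
AGGREGATE = "multilingual"


def per_language_counts() -> dict:
    """Counts for single-language configs, with aliases and the aggregate removed."""
    return {
        name: count
        for name, count in TEST_EXAMPLES.items()
        if name not in ALIASES and name != AGGREGATE
    }


def load_lbpp(config: str = "all"):
    """Load one config's 'test' split (requires tensorflow_datasets)."""
    import tensorflow_datasets as tfds  # imported lazily; optional dependency

    return tfds.load(f"lbpp/{config}", split="test")


total = sum(per_language_counts().values())
print(total)  # 162 + 161 + 161 + 158 + 153 + 149 = 944
print(total == TEST_EXAMPLES[AGGREGATE])  # True
```

Note that the default config is `all`, while the config named `default` has the same 162 examples as `python`, so it is safer to pass the config name explicitly.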
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-06-03 UTC.
- **Homepage**: <https://aclanthology.org/2024.findings-emnlp.772/>
- **Source code**: [`tfds.datasets.lbpp.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/lbpp/lbpp_dataset_builder.py)
- **Versions**: **`2.0.0`** (default): No release notes.
- **Auto-cached** ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)): Yes
- **Supervised keys** (see [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)): `None`
- **Figure** ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)): Not supported.

| Config                    | Config description | Download size | Dataset size |
|---------------------------|--------------------|---------------|--------------|
| lbpp/all (default config) | Multilingual LBPP  | 1.78 MiB      | 4.30 MiB     |
| lbpp/multilingual         | Multilingual LBPP  | 1.78 MiB      | 4.30 MiB     |
| lbpp/default              | Python LBPP        | 279.90 KiB    | 627.04 KiB   |
| lbpp/python               | Python LBPP        | 279.90 KiB    | 627.04 KiB   |
| lbpp/cpp                  | C++ LBPP           | 314.45 KiB    | 761.87 KiB   |
| lbpp/go                   | Go LBPP            | 317.09 KiB    | 687.23 KiB   |
| lbpp/java                 | Java LBPP          | 337.90 KiB    | 887.40 KiB   |
| lbpp/js                   | JavaScript LBPP    | 303.40 KiB    | 756.69 KiB   |
| lbpp/javascript           | JavaScript LBPP    | 303.40 KiB    | 756.69 KiB   |
| lbpp/rust                 | Rust LBPP          | 272.61 KiB    | 684.31 KiB   |