penguins

  • Description:

Measurements for three penguin species observed in the Palmer Archipelago, Antarctica.

These data were collected from 2007 to 2009 by Dr. Kristen Gorman with the Palmer Station Long Term Ecological Research Program, part of the US Long Term Ecological Research Network. The data were originally imported from the Environmental Data Initiative (EDI) Data Portal and are available under a CC0 license ("No Rights Reserved") in accordance with the Palmer Station Data Policy. This copy was imported from Allison Horst's GitHub repository.

@Manual{,
  title = {palmerpenguins: Palmer Archipelago (Antarctica) penguin data},
  author = {Allison Marie Horst and Alison Presmanes Hill and Kristen B Gorman},
  year = {2020},
  note = {R package version 0.1.0},
  doi = {10.5281/zenodo.3960218},
  url = {https://allisonhorst.github.io/palmerpenguins/},
}

penguins/processed (default config)

  • Config description: penguins/processed is a drop-in replacement for the iris dataset. It contains 4 normalised numerical features presented as a single tensor, has no missing values, and presents the class label (species) as an integer (n = 334).

  • Download size: 25.05 KiB

  • Dataset size: 17.61 KiB

  • Splits:

Split Examples
'train' 334
  • Feature structure:
FeaturesDict({
    'features': Tensor(shape=(4,), dtype=float32),
    'species': ClassLabel(shape=(), dtype=int64, num_classes=3),
})
  • Feature documentation:
Feature   Class         Shape  Dtype    Description
          FeaturesDict
features  Tensor        (4,)   float32
species   ClassLabel           int64
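
As a rough sketch of how this config might be consumed, the snippet below stacks examples laid out like the feature structure above into NumPy arrays. `to_arrays` and `load_processed` are hypothetical helper names, not part of TFDS, and the toy values are made up. `load_processed` assumes that `tensorflow_datasets` is installed and that the data can be downloaded.

```python
import numpy as np


def to_arrays(examples):
    """Stack {'features': (4,), 'species': int} examples into X and y arrays."""
    xs, ys = [], []
    for ex in examples:
        xs.append(np.asarray(ex["features"], dtype=np.float32))
        ys.append(int(ex["species"]))
    return np.stack(xs), np.asarray(ys, dtype=np.int64)


def load_processed():
    """Load penguins/processed; needs tensorflow_datasets and network access."""
    # Imported lazily so that to_arrays stays usable without the package.
    import tensorflow_datasets as tfds

    ds = tfds.load("penguins/processed", split="train")
    return to_arrays(tfds.as_numpy(ds))


# Two made-up examples in the same layout as the feature structure.
toy = [
    {"features": [0.1, 0.2, 0.3, 0.4], "species": 0},
    {"features": [0.5, 0.6, 0.7, 0.8], "species": 2},
]
X, y = to_arrays(toy)
print(X.shape, y.tolist())
```

Keeping the stacking logic separate from the download step means it can be checked on toy data before any real data is fetched.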

penguins/simple

  • Config description: penguins/simple has been processed from the raw dataset: class labels are simplified and derived from text fields, missing values are marked as NaN/NA, and only 7 significant features are retained (n = 344).

  • Download size: 13.20 KiB

  • Dataset size: 56.10 KiB

  • Splits:

Split Examples
'train' 344
  • Feature structure:
FeaturesDict({
    'body_mass_g': float32,
    'culmen_depth_mm': float32,
    'culmen_length_mm': float32,
    'flipper_length_mm': float32,
    'island': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'sex': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'species': ClassLabel(shape=(), dtype=int64, num_classes=3),
})
  • Feature documentation:
Feature            Class         Shape  Dtype    Description
                   FeaturesDict
body_mass_g        Tensor               float32
culmen_depth_mm    Tensor               float32
culmen_length_mm   Tensor               float32
flipper_length_mm  Tensor               float32
island             ClassLabel           int64
sex                ClassLabel           int64
species            ClassLabel           int64
  • Supervised keys (See as_supervised doc): ({'body_mass_g': 'body_mass_g', 'culmen_depth_mm': 'culmen_depth_mm', 'culmen_length_mm': 'culmen_length_mm', 'flipper_length_mm': 'flipper_length_mm', 'island': 'island', 'sex': 'sex', 'species': 'species'}, 'species')
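
Because missing values in this config are marked as NaN, downstream code usually has to filter or impute them. The sketch below drops every row that has a NaN in any of the four float32 measurements. The helper name `drop_missing` and the toy values are illustrative assumptions, not part of the dataset or of TFDS, and the sketch does not address how missing categorical values are encoded.

```python
import numpy as np

NUMERIC_KEYS = (
    "body_mass_g",
    "culmen_depth_mm",
    "culmen_length_mm",
    "flipper_length_mm",
)


def drop_missing(columns):
    """Keep only rows whose four numeric measurements are all non-NaN.

    `columns` maps feature name -> 1-D sequence, one entry per penguin.
    """
    numeric = np.stack(
        [np.asarray(columns[k], dtype=np.float32) for k in NUMERIC_KEYS], axis=1
    )
    keep = ~np.isnan(numeric).any(axis=1)
    return {name: np.asarray(values)[keep] for name, values in columns.items()}


# Made-up rows; the second one has a missing body mass.
toy = {
    "body_mass_g": [3750.0, np.nan, 5000.0],
    "culmen_depth_mm": [18.7, 17.4, 15.0],
    "culmen_length_mm": [39.1, 39.5, 47.5],
    "flipper_length_mm": [181.0, 186.0, 215.0],
    "species": [0, 0, 2],
}
clean = drop_missing(toy)
print(clean["species"].tolist())
```

Dropping rows is the simplest option; imputation may be preferable when the number of examples matters.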

penguins/raw

  • Config description: penguins/raw is the original, unprocessed copy from @allisonhorst, containing all 17 features, presented either as numeric types or as raw text (n = 344).

  • Download size: 49.72 KiB

  • Dataset size: 164.51 KiB

  • Splits:

Split Examples
'train' 344
  • Feature structure:
FeaturesDict({
    'Body Mass (g)': float32,
    'Clutch Completion': Text(shape=(), dtype=string),
    'Comments': Text(shape=(), dtype=string),
    'Culmen Depth (mm)': float32,
    'Culmen Length (mm)': float32,
    'Date Egg': Text(shape=(), dtype=string),
    'Delta 13 C (o/oo)': float32,
    'Delta 15 N (o/oo)': float32,
    'Flipper Length (mm)': float32,
    'Individual ID': Text(shape=(), dtype=string),
    'Island': Text(shape=(), dtype=string),
    'Region': Text(shape=(), dtype=string),
    'Sample Number': int32,
    'Sex': Text(shape=(), dtype=string),
    'Species': Text(shape=(), dtype=string),
    'Stage': Text(shape=(), dtype=string),
    'studyName': Text(shape=(), dtype=string),
})
  • Feature documentation:
Feature              Class         Shape  Dtype    Description
                     FeaturesDict
Body Mass (g)        Tensor               float32
Clutch Completion    Text                 string
Comments             Text                 string
Culmen Depth (mm)    Tensor               float32
Culmen Length (mm)   Tensor               float32
Date Egg             Text                 string
Delta 13 C (o/oo)    Tensor               float32
Delta 15 N (o/oo)    Tensor               float32
Flipper Length (mm)  Tensor               float32
Individual ID        Text                 string
Island               Text                 string
Region               Text                 string
Sample Number        Tensor               int32
Sex                  Text                 string
Species              Text                 string
Stage                Text                 string
studyName            Text                 string
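
The raw config stores categorical fields such as Species as free text, whereas the simple config exposes them as integer class labels. One way such a mapping could be derived is sketched below. The example strings, the class order, and the helper `species_to_label` are assumptions for illustration only; this is not the conversion that TFDS actually performs.

```python
# Assumed class order; check the dataset's own label names before relying on it.
SPECIES_NAMES = ("Adelie", "Chinstrap", "Gentoo")


def species_to_label(raw_species):
    """Map a raw Species string to an integer label via its first word."""
    # Text features may arrive as bytes when read through NumPy iterators.
    if isinstance(raw_species, bytes):
        text = raw_species.decode("utf-8")
    else:
        text = raw_species
    words = text.split()
    if not words:
        raise ValueError("empty Species value")
    first_word = words[0].capitalize()
    if first_word not in SPECIES_NAMES:
        raise ValueError(f"unrecognised species: {raw_species!r}")
    return SPECIES_NAMES.index(first_word)


# Illustrative strings in the style of the raw text field (assumed format).
print(species_to_label("Adelie Penguin (Pygoscelis adeliae)"))
print(species_to_label(b"Gentoo penguin (Pygoscelis papua)"))
```

Raising on unrecognised values, rather than silently assigning a default class, makes unexpected text in the raw field visible early.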