41 datasets found

Licenses: Creative Commons CCZero (CC0-1.0); Open Data Commons Public Domain Dedication and Licence (PDDL-1.0)

  • Phishing Websites

    One of the challenges faced by our research was the unavailability of reliable training datasets. In fact, this challenge faces any researcher in the field. However, although...
  • Higgs Boson detection data

    Higgs Boson detection data. The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle...
  • Speed Dating

    This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every...
  • MiceProtein

    The data set consists of the expression levels of 77 proteins/protein modifications that produced detectable signals in the nuclear fraction of cortex. There are 38 control mice...
  • Car Evaluation Database

    The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons,...
  • Quickbird imagery

    High-resolution Remote Sensing data set (Quickbird). Small number of training samples of diseased trees, large number for other land cover. Testing data set from stratified...
  • Fashion-MNIST

    Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale...
  • SensIT Vehicle

    Vehicle classification in distributed sensor networks.
  • Isolated Letter Speech Recognition

    ISOLET (Isolated Letter Speech Recognition) dataset was generated as follows: 150 subjects spoke the name of each letter of the alphabet twice. Hence, there are 52 training...
  • Satellite data

    The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each...
  • Poker-Hand stream mining

    Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets.
  • Waveform Database Generator

    A generator that produces 3 classes of waves. Each class is generated from a combination of 2 of 3 "base" waves.
  • Tic-Tac-Toe Endgame database

    This database encodes the complete set of possible board configurations at the end of tic-tac-toe games, where "x" is assumed to have played first. The target concept is "win...
  • Glass Identification Database

    The study of classification of types of glass was motivated by criminological investigation. At the scene of the crime, the glass left can be used as evidence.
  • Image Segmentation Data Set

    The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region.
  • This dataset describes mushrooms in terms of their physical characteristics....

    This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible.
  • Breast cancer data

    This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia.
  • 1985 Auto Imports Database

    This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its...
  • BUPA liver disorders

    BUPA liver disorders. The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each...
  • Cardiac Arrhythmia Database

    The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal.
You can also access this registry using the API (see API Docs).