-
Semeion
Semeion Handwritten Digit Data Set, in which 1593 handwritten digits from around 80 persons were scanned and documented. Each of the 256 variables V1 - V256 describes one of the... -
Micro Mass
MicroMass (pure spectra version) is a dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. -
Phishing Websites
One of the challenges faced by our research was the unavailability of reliable training datasets. In fact this challenge faces any researcher in the field. However, although... -
Higgs Boson detection data
Higgs Boson detection data. The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle... -
Quickbird imagery
High-resolution Remote Sensing data set (Quickbird). Small number of training samples of diseased trees, large number for other land cover. Testing data set from stratified... -
Datacenter Monitoring Data
Data collected from monitored supercomputers hosted at CINECA -
Synthetic Control Chart Time Series
This dataset contains 600 examples of control charts, synthetically generated by the process in Alcock and Manolopoulos -
Isolated Letter Speech Recognition
The ISOLET (Isolated Letter Speech Recognition) dataset was generated as follows: 150 subjects spoke the name of each letter of the alphabet twice. Hence, there are 52 training... -
Satellite data
The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each... -
Poker-Hand stream mining
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. -
Waveform Database Generator
Generates 3 classes of waves. Each class is generated from a combination of 2 of 3 "base" waves. -
Large Soybean Database
This is the large soybean database from the UCI repository, with its training and test database combined into a single file. -
Glass Identification Database
The study of classification of types of glass was motivated by criminological investigation. At the scene of the crime, the glass left can be used as evidence -
Image Segmentation Data Set
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region -
Letter Image Recognition Data
The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. -
Chess (King-Rook vs. King-Pawn)
Chess Endgame Database. The format for instances in this database is a sequence of 37 attribute values. Each instance is a board description for this chess endgame. The first... -
JRC Data Catalogue
The Joint Research Centre (JRC) is the European Commission's in-house science service which employs scientists to carry out research in order to provide independent scientific... -
Data.gov.ie
The Open Data listed in data.gov.ie is published by Government Departments and Public Bodies -
Open data catalogue with all types of information regarding the Catalan Region
Open data from the Catalan Government regarding Culture and leisure, Demographics, Economy, Education, Employment, Energy, Environment, Finance, Health, Housing, Industry,... -
IDESCAT
Open data from the Statistical Institute of Catalonia