An OMLDataSet consists of an OMLDataSetDescription, a
data.frame containing the data set, the old and new column names and,
finally, the target features.
colnames.old contains the original names, i.e., the column names that were
uploaded to the server, while
colnames.new contains the names that you will see when
working with the data in R.
Most of the time, old and new column names are identical. The new names differ
from the old ones only if the original names are not valid.
target.features contains the column name(s) from the
OMLDataSet that refer to the target feature(s).
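
The relationship between old and new column names can be sketched with base R alone. The snippet below is only an illustration: the column names are made up, and using make.names() for the conversion is an assumption, not necessarily how the package derives colnames.new.

```r
# Hypothetical column names as they might appear in an uploaded data set.
colnames.old = c("sepal length", "2nd.measure", "class")

# make.names() is base R. That the package relies on it is an assumption;
# it is used here only to show how invalid names can end up differing.
colnames.new = make.names(colnames.old, unique = TRUE)

# Names that were already syntactically valid stay unchanged.
changed = colnames.old != colnames.new
print(data.frame(colnames.old, colnames.new, changed))
```

In this sketch only the first two names change, because they contain a space or start with a digit; "class" is kept as it is.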
makeOMLDataSet(desc, data, colnames.old = colnames(data), colnames.new = colnames(data), target.features = NULL)
data("airquality")
dsc = "Daily air quality measurements in New York, May to September 1973. This data is taken from R."
cit = "Chambers, J. M., Cleveland, W. S., Kleiner, B. and Tukey, P. A. (1983) Graphical Methods for Data Analysis. Belmont, CA: Wadsworth."
desc_airquality = makeOMLDataSetDescription(
  name = "airquality",
  description = dsc,
  creator = "New York State Department of Conservation (ozone data) and the National Weather Service (meteorological data)",
  collection.date = "May 1, 1973 to September 30, 1973",
  language = "English",
  licence = "GPL-2",
  url = "https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html",
  default.target.attribute = "Ozone",
  citation = cit,
  tags = "R"
)
airquality_oml = makeOMLDataSet(
  desc = desc_airquality,
  data = airquality,
  colnames.old = colnames(airquality),
  colnames.new = colnames(airquality),
  target.features = "Ozone"
)