A Bayesian Approach For Supervised Discretization
Author(s)
M. Boullé
Abstract
In supervised machine learning, some algorithms are restricted to discrete data
and thus need to discretize continuous attributes. In this paper, we present a new
discretization method called MODL, based on a Bayesian approach. The MODL
method relies on a model space of discretizations and on a prior distribution
defined on this model space. This makes it possible to define an evaluation criterion
for discretizations, which is minimal for the most probable discretization given the
data, i.e. the Bayes optimal discretization. We compare this approach with the
MDL approach and statistical approaches used in other discretization methods,
from a theoretical and experimental point of view. Extensive experiments show
that the MODL method builds high quality discretizations.
Keywords: supervised learning, data preparation, discretization, Bayesianism.
1 Introduction
While real data often comes in mixed format, discrete and continuous, many
induction algorithms rely on discrete attributes and need to discretize continuous
attributes, i.e. to slice their domain into a finite number of intervals. More
generally, using discretization to preprocess continuous attributes often provides
many advantages. Discrete values are generally more understandable than
continuous values both for users and experts. Many classification algorithms are
more accurate and run faster when discretization is used.
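As a minimal illustration of what slicing a domain into a finite number of intervals means, the following Python sketch maps continuous values to interval indices. It is not the MODL method: the cut points are given by hand rather than learned from data, and the function name discretize, the sample values, and the half-open interval convention are assumptions made for this example only.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each continuous value to the index of its interval.

    With cut points c_1 < ... < c_k, the intervals are
    (-inf, c_1), [c_1, c_2), ..., [c_k, +inf), indexed from 0 to k.
    This half-open convention is an assumption of this sketch.
    """
    cuts = sorted(cut_points)
    # bisect_right counts how many cut points are <= v,
    # which is exactly the index of the interval containing v.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical attribute values and hand-picked cut points.
values = [4.9, 5.8, 6.1, 7.0]
cut_points = [5.5, 6.5]
intervals = discretize(values, cut_points)
print(intervals)
```

A supervised discretization method differs from this sketch in that it chooses the number and position of the cut points from the class labels of the training data.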
Discretization of continuous attributes is a problem that has been studied
extensively in the past [6, 7, 9, 12, 16]. For example, decision tree algorithms
exploit a discretization method to handle continuous attributes. C4.5 [13] uses
the information gain based on Shannon entropy. CART [5] applies the Gini