Assessing cocoa bean quality from spectral information offers a non-invasive and objective alternative to traditional methods, which are often subjective and destructive. However, progress has been limited by the lack of comprehensive datasets spanning multiple spectral resolutions. This work presents a new dataset that captures the spectral properties of cocoa beans at different spatiospectral resolutions, enabling non-invasive quality assessment and scalable evaluation methodologies. It comprises 19 scenes acquired with four imaging devices under both open (invasive) and closed (non-invasive) conditions, along with the corresponding physicochemical measurements. Data collection follows the Colombian standard NTC 1252:2021, which labels beans as well, partially, or poorly fermented. Global physicochemical properties (moisture, polyphenols, and cadmium) were measured using gravimetric analysis, UV-visible spectroscopy, and atomic absorption spectroscopy with microwave digestion, respectively. The hyperspectral images were acquired with the four devices, whose spectral coverage extends up to the 350–1000 nm range. Statistical analysis shows that the dataset distinguishes between cocoa quality levels under both open and closed conditions, supporting the development of automated classification methods.